Sinks Method Paper Presentation @ Duke Political Networks Conference 2010

Distance Measures for Dynamic Citation Networks. M. Bommarito, D. Katz, J. Zelner, J. Fowler. May 21, 2010.


Page 1

Distance Measures for Dynamic Citation Networks

M. Bommarito D. Katz J. Zelner J. Fowler

May 21, 2010

Page 2

Outline

1 Goals

Supreme Court Citation Network

2 Citation Dynamics and Sinks

3 Distance Measures for Dynamic Citation Networks

4 How does the “sink” method perform?

Simulation Results

United States Supreme Court

5 Conclusion and Future Directions

Page 3

Goals Supreme Court Citation Network

Goals & Data

Goal: Can we uncover various mesoscopic patterns within the jurisprudence of the United States Supreme Court?

1 |V| ≈ 36k, |E| ≈ 280k

2 1791-2005

Page 4

Standard Solution

Standard Solution: Obtain vertex community membership by applying an out-of-the-box community detection method.

Methods:

1 Edge-Betweenness (Girvan & Newman 2002)

2 Fast-Greedy (Clauset et al. 2004)

3 Leading (or more) Eigenvector (Newman 2006, Richardson et al. 2009)

4 Walktrap (Pons & Latapy 2006)

Page 5

Expectations

Expectation: Dyadic relationships should be fairly stable.

If two vertices are in the same community m at t, they should be in the same community n (not necessarily identical to m) at t + 1.

Formally, this can be written as “pairwise stability” σ:

$\sigma = P(C_i^{t+1} = C_j^{t+1} \mid C_i^t = C_j^t)$

$C_i^t$: community membership of vertex i at time t

This conception of stability avoids many issues with community tracking.
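
As a rough illustration, pairwise stability can be estimated from two labelings by counting vertex pairs. The sketch below is not the authors' code; the function name, the dictionary layout, and the choice to compare only vertices present at both time steps are assumptions made for the example.

```python
from itertools import combinations


def pairwise_stability(membership_t, membership_t1):
    """Estimate sigma = P(C_i^{t+1} = C_j^{t+1} | C_i^t = C_j^t).

    Both arguments map vertex ID -> community label. Only vertices present
    at both time steps are compared (an assumption of this sketch).
    Returns None when no pair shares a community at time t.
    """
    shared = sorted(set(membership_t) & set(membership_t1))
    together_at_t = 0
    still_together = 0
    for i, j in combinations(shared, 2):
        if membership_t[i] == membership_t[j]:
            together_at_t += 1
            if membership_t1[i] == membership_t1[j]:
                still_together += 1
    if together_at_t == 0:
        return None
    return still_together / together_at_t


# Hypothetical labels: community "m" at t splits in two at t + 1.
at_t = {"a": "m", "b": "m", "c": "m", "d": "k"}
at_t1 = {"a": "n", "b": "n", "c": "q", "d": "q", "e": "q"}
sigma = pairwise_stability(at_t, at_t1)
```

Because only co-membership of pairs is compared, the community labels at t and t + 1 never need to be matched to one another, which is how this measure sidesteps community tracking.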

Page 6

Results

[Figure: two panels, Fast-Greedy and Eigenvector]

The results of these approaches do not match our expectation.

Page 7

Research Source

Title: On the Stability of Community Detection Algorithms on Longitudinal Citation Data.

Michael J. Bommarito II, Daniel M. Katz, Jonathan L. Zelner. Forthcoming in Proceedings of ASNA 2009 (ETH-Zurich).

Goal: Compare out-of-the-box community detection methods under different parameters of a citation model w.r.t.:

1 Average number of resulting communities across all time steps

2 Average pairwise stability of all vertex pairs across all time steps

Page 8

Results

Page 9

Implications

Citation networks are different.

1 Patterns within citation networks are not well-revealed by these methods.

2 Qualitative conclusions may vary dramatically based on the chosen method.

3 The “appropriateness” of each method may depend on parameters of the generating process.

Page 10

Citation Dynamics and Sinks

Citation Dynamics

What are the basic growth rules of a citation network?

1 Documents and their citations are introduced into the network in sequence.

2 Documents cannot create new outbound citations after introduction.

These rules guarantee that any resulting network is an acyclic digraph.

The simplest topological ordering is just the order of vertex introduction.
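
A minimal sketch of the two growth rules, assuming that a document may cite only documents already in the network (our reading of "introduced in sequence"). The function names and the dictionary representation are illustrative, not taken from the talk.

```python
def add_document(citations, doc_id, cited_ids):
    """Introduce one document together with all of its outbound citations.

    `citations` maps each document ID to the tuple of documents it cites,
    in order of introduction (dicts preserve insertion order).
    """
    if doc_id in citations:
        # Rule 2: no new outbound citations after introduction.
        raise ValueError(f"{doc_id!r} is already in the network")
    missing = [c for c in cited_ids if c not in citations]
    if missing:
        # Rule 1 (as assumed here): only already-introduced documents can be cited.
        raise ValueError(f"{doc_id!r} cites unknown documents: {missing}")
    citations[doc_id] = tuple(cited_ids)


def respects_introduction_order(citations):
    """Check that every arc points from a later document to an earlier one."""
    position = {doc: k for k, doc in enumerate(citations)}
    return all(
        position[cited] < position[doc]
        for doc, cited_ids in citations.items()
        for cited in cited_ids
    )


network = {}
add_document(network, "A", [])
add_document(network, "B", ["A"])
add_document(network, "C", ["A", "B"])
```

Since every arc points strictly backward in introduction order, following arcs can never return to the starting document, so no cycle can form.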

Page 11

Dynamic Acyclic Digraphs

What properties do we have?

1 Each component has at least one “sink” and one “source.”

2 Sinks are vertices with zero out-degree. The first vertex in a topological ordering must be a sink.

3 Sources are vertices with zero in-degree. The last vertex in a topological ordering must be a source.
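
These properties are easy to check on a small example. The sketch below assumes the network is stored as a mapping from each document to the documents it cites, with every cited document also present as a key; the five-document network is hypothetical.

```python
def sinks_and_sources(citations):
    """Return (sinks, sources) for a citation network.

    `citations` maps each document ID to the documents it cites, so an arc
    runs from the citing document to the cited one.
    """
    cited_at_least_once = {c for cited in citations.values() for c in cited}
    # Sinks cite nothing (out-degree 0).
    sinks = {doc for doc, cited in citations.items() if not cited}
    # Sources are cited by nothing (in-degree 0).
    sources = set(citations) - cited_at_least_once
    return sinks, sources


# Hypothetical five-document network, listed in order of introduction.
network = {
    "A": [],
    "B": [],
    "C": ["A"],
    "D": ["A", "B"],
    "E": ["C", "D"],
}
sinks, sources = sinks_and_sources(network)
```

Here the first document introduced, A, is a sink and the last, E, is a source, matching the ordering argument above.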

Page 12

Sinks

If sinks have zero out-degree, they must represent the point at which at least one idea is introduced into the network.

Either the document “invents” the idea or the head of the citation arc was not sampled in the dataset.

Weak vs. Strong - Dimensional Data can help identify Weak Sinks

Page 13

Six Degrees of Marbury v. Madison

Page 14

Distance Measures for Dynamic Citation Networks

Basic Idea of the Distance Measure

If two vertices share more “ideas,” they should be more similar.

Alternative Example: Articles in Political Science

1 American Politics

2 Congress

3 Committee Assignments

4 Formal Theory

We want to be able to use clustering methods, so we then construct a distance measure from this basic premise.

Page 15

A Simple Distance Measure

Simplest Distance Measure: Proportion of Possibly Shared Ideas

$D_{i,j} = 1 - \frac{|S_i \cap S_j|}{|S_i \cup S_j|}$

$S_i$: the set of sink vertex IDs for vertex i

Note that this is only one way to translate from similarity to distance.

Also note that the distance between vertices i and j does not change over time.
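
A sketch of the simple measure on a toy network. It assumes that the sink set of a vertex is the set of sinks reachable from it along citation arcs, and that a sink's own set contains just itself; both conventions, the function names, and the example network are assumptions of this illustration.

```python
def reachable_sinks(citations, start):
    """Return the set of sinks reachable from `start` along citation arcs."""
    seen = {start}
    stack = [start]
    sinks = set()
    while stack:
        doc = stack.pop()
        cited = citations[doc]
        if not cited:
            sinks.add(doc)  # zero out-degree: this document is a sink
        for c in cited:
            if c not in seen:
                seen.add(c)
                stack.append(c)
    return sinks


def sink_distance(citations, i, j):
    """D_{i,j} = 1 - |S_i intersect S_j| / |S_i union S_j|."""
    s_i = reachable_sinks(citations, i)
    s_j = reachable_sinks(citations, j)
    return 1 - len(s_i & s_j) / len(s_i | s_j)


# Hypothetical network: A and B are sinks.
network = {
    "A": [],
    "B": [],
    "C": ["A"],
    "D": ["A", "B"],
    "E": ["C", "D"],
}
d_cd = sink_distance(network, "C", "D")  # S_C = {A}, S_D = {A, B}
d_de = sink_distance(network, "D", "E")  # S_D = S_E = {A, B}
```

Under these assumptions the sink set of a document is fixed once the document is introduced, because later documents can only add arcs that point into it, never out of it. That is why the distance between two existing vertices stays constant as the network grows.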

Page 16

Flexible Framework for More Detailed Specifications

What if the story is more complicated?

1 Minimum path length to a sink

2 Number of paths to a sink

3 Total number of shared ancestors

4 Total elapsed time along path

Example with arbitrary f for path length and number of shared ancestors:

$D_{i,j} = 1 - \frac{\sum_{s \in S_i \cap S_j} f(A_{i,s}, P_{i,s}, A_{j,s}, P_{j,s})}{\sum_{s \in S_i \cup S_j} f(A_{i,s}, P_{i,s}, A_{j,s}, P_{j,s})}$
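
One possible instance of this framework, sketched under several assumptions: only the path-length terms P are used (the shared-ancestor terms A are omitted), P is taken to be the minimum path length to the sink, and the weighting function `closeness_weight` is hypothetical. A sink that only one of the two vertices reaches contributes to the denominator alone.

```python
from collections import deque


def min_path_lengths_to_sinks(citations, start):
    """Breadth-first search: map each reachable sink to its minimum path length."""
    dist = {start: 0}
    queue = deque([start])
    to_sinks = {}
    while queue:
        doc = queue.popleft()
        if not citations[doc]:
            to_sinks[doc] = dist[doc]
        for cited in citations[doc]:
            if cited not in dist:
                dist[cited] = dist[doc] + 1
                queue.append(cited)
    return to_sinks


def closeness_weight(p_i, p_j):
    """Hypothetical f: nearer sinks weigh more; None means 'not reachable'."""
    return sum(1 / (1 + p) for p in (p_i, p_j) if p is not None)


def weighted_sink_distance(citations, i, j, f=closeness_weight):
    """1 - (sum of f over shared sinks) / (sum of f over sinks of i or j)."""
    p_i = min_path_lengths_to_sinks(citations, i)
    p_j = min_path_lengths_to_sinks(citations, j)
    shared = sum(f(p_i[s], p_j[s]) for s in p_i.keys() & p_j.keys())
    total = sum(f(p_i.get(s), p_j.get(s)) for s in p_i.keys() | p_j.keys())
    return 1 - shared / total


# Hypothetical network: A and B are sinks.
network = {
    "A": [],
    "B": [],
    "C": ["A"],
    "D": ["A", "B"],
    "E": ["C", "D"],
}
d_weighted = weighted_sink_distance(network, "C", "D")
d_constant = weighted_sink_distance(network, "C", "D", f=lambda p_i, p_j: 1)
```

With a constant f the expression reduces to the simple proportion-of-shared-sinks distance, which gives a quick consistency check on any richer choice of f.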

Page 17

How does the “sink” method perform? Simulation Results

Simulation

1 Directed

2 Two vertex types

3 Asymmetric vertex connection probabilities

4 Preferential attachment mechanism (Two-Dimensional)
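
The slide lists only the ingredients of the simulation, so the following generator is a loose sketch rather than the authors' model. The parameters `n`, `m`, and `affinity`, the in-degree-plus-one attachment weight, and the uniform choice of vertex type are all assumptions, and the "two-dimensional" aspect of the preferential attachment is not modeled.

```python
import random


def simulate_citation_network(n, m, affinity, seed=0):
    """Grow a directed citation network with two vertex types.

    n: number of documents; m: maximum citations per new document;
    affinity[a][b]: relative weight for a type-a document citing a type-b
    one (asymmetric when affinity[a][b] != affinity[b][a]).
    """
    rng = random.Random(seed)
    types, in_degree, citations = [], [], {}
    for new in range(n):
        new_type = rng.randrange(2)
        candidates = list(range(new))
        # Preferential attachment: weight grows with the target's in-degree.
        weights = [
            (in_degree[v] + 1) * affinity[new_type][types[v]]
            for v in candidates
        ]
        cited = set()
        while candidates and len(cited) < m and sum(weights) > 0:
            k = rng.choices(range(len(candidates)), weights=weights)[0]
            cited.add(candidates.pop(k))  # sample without replacement
            weights.pop(k)
        for v in cited:
            in_degree[v] += 1
        types.append(new_type)
        in_degree.append(0)
        citations[new] = sorted(cited)
    return types, citations


# Hypothetical parameters: type 0 rarely cites type 1, type 1 cites type 0 more often.
types, network = simulate_citation_network(200, 3, [[1.0, 0.1], [0.4, 1.0]], seed=1)
```

Each new document cites only earlier ones, so the output obeys the growth rules for citation networks and document 0 is always a sink.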

Page 18

Simulation Results

Page 19

How does the “sink” method perform? United States Supreme Court

United States Supreme Court

The Early Years of the United States Supreme Court

Movie available at computationallegalstudies.com

Page 20

Supreme Court Results Using the Sink Method

Page 21

Conclusion and Future Directions

Conclusion

1 There are issues with existing community detection methods in dynamic citation networks.

2 Our sink-based method provides more reasonable qualitative results than other methods we’ve tried.

3 Application to a larger segment of the SCOTUS data together with a qualitative strategy designed to evaluate the outputs
