The Very Small World of the Well-connected (19 June 2008)

Lada Adamic, School of Information, University of Michigan, Ann Arbor, MI 48109-1107
Anna C. Gilbert, Department of Mathematics, University of Michigan, Ann Arbor, MI 48109-1043
Xiaolin Shi, Department of EECS, University of Michigan, Ann Arbor, MI 48109-2121
Matthew Bonner, Department of EECS, University of Michigan, Ann Arbor, MI 48109-2121


Page 1: The Very Small World of the Well-connected (19 June 2008). Lada Adamic, School of Information, University of Michigan, Ann Arbor, MI 48109-1107, ladamic@umich.edu


Page 2: Introduction

PRELIMINARIES
• Importance measures
• Description of network dataset properties

IMPORTANT VERTICES
• Network properties and important vertices
• Original vs. subgraph properties

Summary

Page 3: Importance measures

PRELIMINARIES

Let the graph G(V, E) have |V| = n vertices.

1. Degree D(v_i): the number of edges incident to v_i. Degree reflects a local property of the vertices in the graph.

Page 4: Importance measures

2. Betweenness B(v_i): a measure of how many pairs of vertices go through v_i in order to connect through shortest paths in G.
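The betweenness definition can be illustrated with a short Python sketch. It assumes the common formulation, which is not spelled out in the transcript: for every pair (s, t), add the fraction of shortest s-t paths that pass through v_i. The adjacency-dict format and the function name `betweenness` are illustrative choices, and the traversal follows Brandes' algorithm for unweighted, undirected graphs.

```python
from collections import deque

def betweenness(adj):
    """Unnormalized betweenness for an unweighted, undirected graph.

    adj maps each vertex to a list of its neighbors (assumed input format).
    """
    score = {v: 0.0 for v in adj}
    for s in adj:
        # BFS from s: count shortest paths (sigma) and record predecessors.
        dist = {s: 0}
        sigma = {v: 0 for v in adj}
        sigma[s] = 1
        preds = {v: [] for v in adj}
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest vertices, accumulating dependencies.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair {s, t} was counted once from each endpoint.
    return {v: b / 2.0 for v, b in score.items()}

# Hypothetical example: a path a - b - c - d.
path = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
scores = betweenness(path)
```

On the path example, the two inner vertices each lie on two shortest paths between other pairs, while the endpoints lie on none.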

Page 5: Importance measures

3. Closeness C(v_i): a measure of the distances from all other vertices in G to vertex v_i. Under closeness, vertices that are in the "middle" of the network are important.
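A minimal Python sketch of closeness follows. The slides do not give the exact formula, so this assumes one common variant: the number of reachable vertices divided by the sum of shortest-path distances to them. The names `bfs_distances` and `closeness` and the adjacency-dict format are illustrative.

```python
from collections import deque

def bfs_distances(adj, source):
    """Shortest-path lengths (in hops) from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist

def closeness(adj):
    """Closeness as (reachable vertices) / (sum of distances to them).

    This normalization is an assumption; other definitions exist.
    """
    result = {}
    for v in adj:
        dist = bfs_distances(adj, v)
        total = sum(dist.values())
        reached = len(dist) - 1  # exclude v itself
        result[v] = reached / total if total > 0 else 0.0
    return result

# Hypothetical example: a path a - b - c, where b sits in the "middle".
path = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
scores = closeness(path)
```

In an undirected graph the distance from v_i to a vertex equals the distance from that vertex to v_i, so a single breadth-first search per vertex is enough.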

Page 6: Importance measures

4. PageRank: a variant of the eigenvector centrality measure that assigns greater importance to vertices that are themselves neighbors of important vertices.
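The idea that a vertex inherits importance from its neighbors can be sketched as a power iteration in Python. The damping factor of 0.85 and the fixed iteration count are conventional assumptions, not values taken from the slides, and each undirected edge is treated as a link in both directions.

```python
def pagerank(adj, damping=0.85, iterations=100):
    """Power-iteration PageRank on an adjacency dict (vertex -> neighbor list).

    damping and iterations are assumed defaults, not values from the slides.
    """
    n = len(adj)
    rank = {v: 1.0 / n for v in adj}
    for _ in range(iterations):
        # Vertices without neighbors spread their rank uniformly.
        dangling = sum(rank[v] for v in adj if not adj[v])
        new_rank = {v: (1.0 - damping) / n + damping * dangling / n for v in adj}
        for v in adj:
            for w in adj[v]:
                # v passes an equal share of its rank to each neighbor.
                new_rank[w] += damping * rank[v] / len(adj[v])
        rank = new_rank
    return rank

# Hypothetical example: a star with center c and three leaves.
star = {"c": ["x", "y", "z"], "x": ["c"], "y": ["c"], "z": ["c"]}
ranks = pagerank(star)
```

The ranks always sum to one, and in the star example the center receives the largest share because every leaf passes its whole rank to it.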

Page 7: Description of network dataset properties

• The data sets are representative of the web.
• The data sets are representative of online social networks.
• The data sets are used to examine the properties of important vertices and their graph synopses.

Page 8: Description of network dataset properties

1. Erdős-Rényi random graph (the prototypical random graph): each pair of vertices has an equal probability p of being joined by an edge. |V| = 10000; p = 0.001; average degree d = p × |V| = 10.
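The relation d = p × |V| can be checked with a small Python sketch of the random graph model. To keep the pure-Python double loop fast, the example uses a smaller graph (n = 2000, p = 0.005) with the same expected degree of about 10; these values, the seed, and the function names are assumptions made for illustration.

```python
import random

def erdos_renyi(n, p, seed=None):
    """G(n, p): join each pair of vertices independently with probability p."""
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].add(v)
                adj[v].add(u)
    return adj

def average_degree(adj):
    """Mean number of neighbors per vertex."""
    return sum(len(nbrs) for nbrs in adj.values()) / len(adj)

# Smaller than the slide's |V| = 10000, p = 0.001, but same expected degree.
g = erdos_renyi(2000, 0.005, seed=1)
d = average_degree(g)  # expected value is p * (n - 1), roughly 10
```

Strictly, the expected degree is p × (|V| − 1), which is indistinguishable from p × |V| at this scale.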

Page 9: Description of network dataset properties

2. BuddyZoo dataset: the first real-world network, an undirected graph built from AOL Instant Messenger (AIM) data.
• Users >> nodes
• Contact lists >> edges

Page 10: Description of network dataset properties

3. TREC (Text REtrieval Conference) dataset: the second real-world graph, a network of blog connections.
• It is a crawl of 100,649 RSS and Atom feeds.
• It contains hyperlinks, comments, trackbacks, etc.
• Some feeds, including feeds without a homepage or permalinks, are removed.
• Over 300 Technorati tags are also removed: they are in fact automatically generated and are not true indicators of social linking.

Page 11: Description of network dataset properties

4. Web graph dataset:
• 259,794 websites
• 50 million pages
• Collected in 1998

Page 12: Description of network dataset properties

Similarities between the recent blog datasets and the decade-old website-level data set should also be applicable to larger, more current web crawls.

Page 13: Network properties and important vertices

IMPORTANT VERTICES

1. Degree distributions: the degree distributions of online networks.
• Type: social networks
• Due to the limitation of the data sampling

Page 14: Network properties and important vertices

2. Correlation of importance values of different measures.
• Relationships of importance measures in different networks.
• Analysis of correlation: higher for degree, betweenness and PageRank; lower for closeness.
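Comparing two importance measures amounts to correlating their per-vertex scores. The Python sketch below does this for degree and closeness on a toy graph. The slides do not say which correlation coefficient was used, so the Pearson coefficient here is an assumption, as are the function names and the closeness normalization.

```python
from collections import deque
from math import sqrt

def degree(adj):
    """Number of neighbors of each vertex."""
    return {v: len(adj[v]) for v in adj}

def closeness(adj):
    """(reachable vertices) / (sum of BFS distances); an assumed variant."""
    result = {}
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        total = sum(dist.values())
        result[s] = (len(dist) - 1) / total if total > 0 else 0.0
    return result

def pearson(xs, ys):
    """Pearson correlation; undefined if either list is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / sqrt(vx * vy)

def correlate(adj, measure_a, measure_b):
    """Correlate two importance measures over the same vertex order."""
    a, b = measure_a(adj), measure_b(adj)
    vertices = list(adj)
    return pearson([a[v] for v in vertices], [b[v] for v in vertices])

# Hypothetical example: a path of five vertices.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
r = correlate(path, degree, closeness)
```

A rank-based coefficient could be swapped in for `pearson` without changing `correlate`.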

Page 15: Network properties and important vertices

3. Assortativity: assortativity, or assortative mixing, is the preference of the vertices in a network to have edges with others that are similar.
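One way to quantify assortative mixing is the correlation between a vertex attribute at the two ends of every edge. The Python sketch below uses degree as that attribute, which is an assumption; the same computation would apply to any of the importance measures. The function name is illustrative.

```python
from math import sqrt

def degree_assortativity(adj):
    """Pearson correlation of the degrees at the two ends of each edge.

    Positive values mean similar-degree vertices tend to be linked;
    negative values mean high-degree vertices link to low-degree ones.
    """
    xs, ys = [], []
    for u in adj:
        for v in adj[u]:
            # Each undirected edge is visited once per direction,
            # which keeps the two samples symmetric.
            xs.append(len(adj[u]))
            ys.append(len(adj[v]))
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        raise ValueError("undefined when all edge endpoints share one degree")
    return cov / sqrt(vx * vy)

# Hypothetical example: a star, where the hub links only to leaves.
star = {"c": ["x", "y", "z"], "x": ["c"], "y": ["c"], "z": ["c"]}
r = degree_assortativity(star)
```

The star is the extreme disassortative case: every edge joins the highest-degree vertex to a lowest-degree one.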

Page 16: Assortativity

Page 17: Important vertices in their subgraphs

Page 18: Connectivity

Page 19: Density

Page 20: Network properties and important vertices

1. Degree distributions.
2. Correlation of importance values of different measures.
3. Assortativity.
4. Important vertices in their subgraphs:
• Connectivity
• Density
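The subgraph analysis can be sketched in Python as: pick the top-k vertices by some importance score, take the subgraph they induce, and measure its density and connectivity. The value of k, the use of degree as the score, and the use of the largest-component fraction as the connectivity measure are all assumptions made for illustration.

```python
def top_k(scores, k):
    """The k vertices with the highest scores."""
    return set(sorted(scores, key=scores.get, reverse=True)[:k])

def induced_subgraph(adj, keep):
    """Keep only the chosen vertices and the edges among them."""
    return {v: [w for w in adj[v] if w in keep] for v in keep}

def density(adj):
    """Fraction of possible undirected edges that are present."""
    n = len(adj)
    if n < 2:
        return 0.0
    m = sum(len(nbrs) for nbrs in adj.values()) / 2
    return m / (n * (n - 1) / 2)

def largest_component_fraction(adj):
    """Share of vertices in the largest connected component."""
    seen, best = set(), 0
    for s in adj:
        if s in seen:
            continue
        seen.add(s)
        stack, size = [s], 0
        while stack:
            v = stack.pop()
            size += 1
            for w in adj[v]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        best = max(best, size)
    return best / len(adj) if adj else 0.0

# Hypothetical example: two linked hubs, each with three leaves.
g = {
    "h1": ["h2", "a", "b", "c"], "h2": ["h1", "d", "e", "f"],
    "a": ["h1"], "b": ["h1"], "c": ["h1"],
    "d": ["h2"], "e": ["h2"], "f": ["h2"],
}
degree_scores = {v: len(nbrs) for v, nbrs in g.items()}
sub = induced_subgraph(g, top_k(degree_scores, 2))
```

In this toy graph the two hubs form a subgraph that is fully connected and far denser than the original, which is the kind of contrast the connectivity and density slides examine.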

Page 21: Original vs. subgraph properties

1. Density.
2. Distance.
3. Relative importance.

Page 22: Density

Page 23: Distance

Page 24: Relative importance

Page 25: Original vs. subgraph properties

1. Density.
2. Distance.
3. Relative importance.
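The distance comparison can be sketched in Python by measuring the mean shortest-path length between important vertices twice: once with paths allowed through the whole original graph, and once with paths confined to the induced subgraph. The set of important vertices is supplied by hand here, and the function names are hypothetical; the slides do not specify how the comparison was computed.

```python
from collections import deque

def bfs_distances(adj, source):
    """Shortest-path lengths (in hops) from source to reachable vertices."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist

def induced_subgraph(adj, keep):
    """Keep only the chosen vertices and the edges among them."""
    return {v: [w for w in adj[v] if w in keep] for v in keep}

def mean_distance(adj, vertices):
    """Mean shortest-path length over connected pairs of `vertices`."""
    vs = list(vertices)
    lengths = []
    for i, s in enumerate(vs):
        dist = bfs_distances(adj, s)
        lengths.extend(dist[t] for t in vs[i + 1:] if t in dist)
    return sum(lengths) / len(lengths) if lengths else float("inf")

# Hypothetical example: important vertices a, b, c, d form a path,
# and an unimportant vertex x provides a shortcut between a and d.
g = {
    "a": ["b", "x"], "b": ["a", "c"], "c": ["b", "d"],
    "d": ["c", "x"], "x": ["a", "d"],
}
important = ["a", "b", "c", "d"]
sub = induced_subgraph(g, set(important))
in_original = mean_distance(g, important)
in_subgraph = mean_distance(sub, important)
```

Distances inside the subgraph can only be equal or longer, because removing unimportant vertices removes shortcuts; how much longer they get indicates how well the subgraph preserves the original relationships.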

Page 26: Summary and conclusion

Two overall observations about the four networks:

1. Different importance measures yield subgraphs of varying density and topology.
2. In spite of these differences, "important vertices" in the online networks have some properties that agree with each other.

Thus, in the real online networks, in contrast to the random graph model, the subgraphs of important vertices tend to preserve information about the relationships among important vertices, so we can use the subgraphs to study the properties of important vertices in the original graphs.