1
On the Eigenvalue Power Law
Milena Mihail, Georgia Tech
& Christos Papadimitriou, U.C. Berkeley
2
Internet Measurement and Models
Network and application studies need properties and models of Internet graphs and Internet traffic.
Shift of networking paradigm: open, decentralized, dynamic.
Intense measurement efforts. Intense modeling efforts.
Networks shown: Routers, WWW, P2P
3
Internet & WWW Graphs
[Figure: Web pages (http://www.XXX.net, http://www.YYY.com, http://www.ZZZ.edu, http://www.XXX.com, http://www.etc) joined by hyperlinks]
Routers exchanging traffic. Web pages and hyperlinks.
10K – 300K nodes
Average degree ~ 3
4
Real Internet Graphs
CAIDA http://www.caida.org
Average Degree = Constant
A Few Degrees VERY LARGE
Degrees not sharply concentrated around their mean.
5
Degree-Frequency Power Law
[Plot: frequency vs. degree]
WWW measurement: Kumar et al 99
Internet measurement: Faloutsos et al 99
E[d] = const., but no sharp concentration
6
Degree-Frequency Power Law
[Plot: frequency vs. degree]
E[d] = const., but no sharp concentration
Erdős-Rényi: sharp concentration
Models by Kumar et al 00, Bollobás et al 01, Fabrikant et al 02
7
Rank-Degree Power Law
[Plot: degree vs. rank; labeled points: UUNET, Sprint, C&W USA, AT&T, BBN]
Internet measurement: Faloutsos et al 99
8
Eigenvalue Power Law
[Plot: eigenvalue vs. rank]
Internet measurement: Faloutsos et al 99
9
This Paper: Large Degrees & Eigenvalues
[Plots: eigenvalues vs. rank and degrees vs. rank; labeled points: UUNET, Sprint, C&W USA, AT&T, BBN]
10
This Paper: Large Degrees & Eigenvalues
11
Principal Eigenvector of a Star
[Figure: a star whose center has degree d]
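The figure on this slide did not survive, so the following is a minimal sketch of the standard fact it appears to illustrate: for a star whose center has degree d, the vector with sqrt(d) at the center and 1 at every leaf is an eigenvector of the adjacency matrix with eigenvalue sqrt(d). Because the vector is positive, it is the principal one. The helper names and the choice d = 9 are illustrative assumptions, not taken from the slides.

```python
import math

def star_adjacency(d):
    """Adjacency matrix of a star: vertex 0 is the center, 1..d are leaves."""
    n = d + 1
    A = [[0] * n for _ in range(n)]
    for leaf in range(1, n):
        A[0][leaf] = 1
        A[leaf][0] = 1
    return A

def matvec(A, v):
    """Plain matrix-vector product."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

d = 9                                   # illustrative degree (assumption)
A = star_adjacency(d)
v = [math.sqrt(d)] + [1.0] * d          # candidate principal eigenvector
lam = math.sqrt(d)                      # candidate principal eigenvalue
Av = matvec(A, v)

# Largest entrywise deviation between A v and lam * v.
residual = max(abs(y - lam * x) for x, y in zip(v, Av))
print("eigenvalue:", lam, "residual:", residual)
```

The center receives the sum of d leaf entries (d = sqrt(d) * sqrt(d)), and each leaf receives the center entry (sqrt(d) = sqrt(d) * 1), which is why the residual vanishes.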
12
Large Degrees
13
Large Eigenvalues
14
Main Result of the Paper
The largest eigenvalues of the adjacency matrix of a graph whose large degrees are power-law distributed (Zipf) are also power-law distributed.
Explains Internet measurements.
Negative implications for the spectral filtering method in information retrieval.
15
Random Graph Model
let
Connectivity analyzed by Chung & Lu '01
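The model's definition is missing from this transcript. As a rough sketch only, the code below samples from an expected-degree random graph of the kind usually associated with Chung & Lu: each vertex i gets a weight w_i, and edge {i, j} appears independently with probability min(1, w_i * w_j / sum(w)). That edge rule, the Zipf-like weights, and every parameter value here are assumptions made for illustration, not details recovered from the slides.

```python
import random

def zipf_weights(n, top, alpha=1.0):
    """Hypothetical target degrees: the i-th largest is top / i**alpha."""
    return [top / (i ** alpha) for i in range(1, n + 1)]

def sample_expected_degree_graph(weights, seed=0):
    """Sample an undirected simple graph with independent edges.

    ASSUMPTION: edge {i, j} is present with probability
    min(1, w_i * w_j / sum(w)); the slide's own definition is lost.
    Returns a list of neighbor sets.
    """
    rng = random.Random(seed)
    total = sum(weights)
    n = len(weights)
    adj = [set() for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):        # j > i: no self-loops, each pair once
            p = min(1.0, weights[i] * weights[j] / total)
            if rng.random() < p:
                adj[i].add(j)
                adj[j].add(i)
    return adj

weights = zipf_weights(n=400, top=12.0)  # illustrative sizes (assumption)
graph = sample_expected_degree_graph(weights, seed=1)
degrees = [len(nbrs) for nbrs in graph]
top5 = sorted(degrees, reverse=True)[:5]
print("five largest degrees:", top5)
```

With these weights no edge probability needs the cap at 1, so each vertex's expected degree stays close to its weight; the quadratic pair loop is kept for clarity rather than speed.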
16
Random Graph Model
17
Random Graph Model
18
Theorem:
For large enough ... with probability at least ...
19
Proof : Step 1. Decomposition
Parts: LL, RR, LR
LR-extra = LR - Vertex Disjoint Stars
20
Proof: Step 2: Vertex Disjoint Stars
Degree of each of the Vertex Disjoint Stars sharply concentrated around its mean d_i.
Hence principal eigenvalue sharply concentrated around sqrt(d_i).
21
Proof: Step 3: LL, RR, LR-extra
LR-extra has max degree
LL has edges
RR has max degree
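The specific bounds on this slide were lost in extraction. For orientation, two standard facts connect the quantities it tracks (maximum degree and edge count) to the largest adjacency eigenvalue of any graph G with adjacency matrix A_G; presumably this is why those quantities appear here, though the slide itself does not say so.

```latex
\lambda_{\max}(G) \;\le\; \Delta(G)
\qquad\text{and}\qquad
\lambda_{\max}(G)^2 \;\le\; \sum_i \lambda_i(G)^2
  \;=\; \operatorname{tr}\!\left(A_G^2\right) \;=\; 2\,|E(G)| .
```

Here \Delta(G) is the maximum degree and |E(G)| the number of edges, so a part with small maximum degree or few edges can only contribute a small eigenvalue.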
22
Proof: Step 3: LL, RR, LR-extra
LR-extra has max degree
RR has max degree
LL has edges
23
Proof: Step 4: Matrix Perturbation Theory
Vertex Disjoint Stars have principal eigenvalues
All other parts have max eigenvalue
QED
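The slide does not name the perturbation result it uses. One standard statement is Weyl's inequality: for symmetric matrices A and E, each eigenvalue of A + E lies within the spectral norm of E of the corresponding eigenvalue of A. The sketch below checks this numerically on a toy instance: three vertex-disjoint stars, plus a matching between some of their leaves standing in for "all other parts". The star sizes, the matching, and the power-iteration helper are illustrative assumptions.

```python
import math

def add_edge(adj, u, v):
    adj[u].add(v)
    adj[v].add(u)

def build_stars(star_degrees):
    """Vertex-disjoint stars; returns (neighbor sets, leaf list per star)."""
    adj, leaves = [], []
    for d in star_degrees:
        center = len(adj)
        adj.extend(set() for _ in range(d + 1))
        star_leaves = list(range(center + 1, center + 1 + d))
        for leaf in star_leaves:
            add_edge(adj, center, leaf)
        leaves.append(star_leaves)
    return adj, leaves

def top_eigenvalue(adj, iters=200):
    """Power iteration on A^2, which avoids the +/- sign flip on bipartite graphs."""
    n = len(adj)
    v = [1.0 / math.sqrt(n)] * n                 # unit start vector
    lam_sq = 0.0
    for _ in range(iters):
        w = [sum(v[j] for j in adj[i]) for i in range(n)]   # w = A v
        w = [sum(w[j] for j in adj[i]) for i in range(n)]   # w = A^2 v
        lam_sq = math.sqrt(sum(x * x for x in w))           # ||A^2 v|| -> lambda_1^2
        v = [x / lam_sq for x in w]
    return math.sqrt(lam_sq)

# Stars alone: the top eigenvalue is sqrt of the largest degree.
stars, _ = build_stars([100, 50, 25])
lam_stars = top_eigenvalue(stars)

# Perturbation: a matching (max degree 1, so spectral norm 1) between leaves.
perturbed, leaves = build_stars([100, 50, 25])
for k in range(10):
    add_edge(perturbed, leaves[0][k], leaves[1][k])
lam_perturbed = top_eigenvalue(perturbed)

print("stars only:", lam_stars, "with extra edges:", lam_perturbed)
```

The extra edges form a matching, whose adjacency matrix has spectral norm 1, so Weyl's inequality confines the perturbed top eigenvalue to the interval from sqrt(100) to sqrt(100) + 1. Adding edges cannot lower the top eigenvalue of a nonnegative matrix, which gives the lower end.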
24
Implication for Info Retrieval
Term-Norm Distribution Problem:
Spectral filtering, without preprocessing, reveals only the large degrees.
25
Implication for Info Retrieval
Term-Norm Distribution Problem:
Spectral filtering, without preprocessing, reveals only the large degrees.
Local information.
No “latent semantics”.
26
Implication for Information Retrieval
Term-Norm Distribution Problem:
Application-specific preprocessing (normalization of degrees) reveals clusters:
WWW: related to searching, Kleinberg 97
IR, collaborative filtering, …
Internet: related to congestion, Gkantsidis et al 02
Open: Formalize “preprocessing”.
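The slides leave "normalization of degrees" unspecified. One common choice, used here purely as an assumption, is the symmetric normalization D^{-1/2} A D^{-1/2}, where D is the diagonal matrix of degrees. For any graph without isolated vertices its largest eigenvalue is 1, attained by the vector of square-rooted degrees, so the largest degree no longer sets the top of the spectrum. The sketch verifies this on a star; the function name and the example graph are illustrative.

```python
import math

def normalized_matvec(adj, x):
    """Compute (D^{-1/2} A D^{-1/2}) x for neighbor sets with no isolated vertex."""
    deg = [len(nbrs) for nbrs in adj]
    return [sum(x[j] / math.sqrt(deg[i] * deg[j]) for j in adj[i])
            for i in range(len(adj))]

# Example graph: a star with 9 leaves (vertex 0 is the center).
d = 9
star = [set(range(1, d + 1))] + [{0} for _ in range(d)]

# Candidate eigenvector: square roots of the degrees.
u = [math.sqrt(len(nbrs)) for nbrs in star]
Nu = normalized_matvec(star, u)

# If u is an eigenvector with eigenvalue 1, then N u equals u.
residual_norm = max(abs(a - b) for a, b in zip(Nu, u))
print("residual for eigenvalue 1:", residual_norm)
```

Without normalization the same star has top eigenvalue sqrt(9) = 3, driven entirely by the center's degree; after normalization the top eigenvalue is 1 whatever the degree.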