subgraph frequencies - stanford universitystanford.edu/~jugander/papers/ · fitting λ to subgraph...
TRANSCRIPT
![Page 1: Subgraph Frequencies - Stanford Universitystanford.edu/~jugander/papers/ · Fitting λ to subgraph data How well can we fit λ? Subgraph frequencies are modeled very well by triadic](https://reader033.vdocuments.net/reader033/viewer/2022053023/60537031182ea009e770d2d8/html5/thumbnails/1.jpg)
Johan Ugander, Lars Backstrom, Jon KleinbergWorld Wide Web ConferenceMay !", #$!%
Subgraph Frequencies:The Empirical and Extremal Geographyof Large Graph Collections
![Page 2: Subgraph Frequencies - Stanford Universitystanford.edu/~jugander/papers/ · Fitting λ to subgraph data How well can we fit λ? Subgraph frequencies are modeled very well by triadic](https://reader033.vdocuments.net/reader033/viewer/2022053023/60537031182ea009e770d2d8/html5/thumbnails/2.jpg)
▪ Neighborhoods: graph induced by friends of a single ego, excluding ego
Graph collections
All Friends
One-way Communication Mutual Communication
Maintained Relationships
![Page 3: Subgraph Frequencies - Stanford Universitystanford.edu/~jugander/papers/ · Fitting λ to subgraph data How well can we fit λ? Subgraph frequencies are modeled very well by triadic](https://reader033.vdocuments.net/reader033/viewer/2022053023/60537031182ea009e770d2d8/html5/thumbnails/3.jpg)
▪ Neighborhoods: graph induced by friends of a single ego, excluding ego
▪ Groups: graph induced by members of a Facebook ‘group‘
▪ Events: graph induced by ‘Yes’ respondents to a Facebook ‘event’
Graph collections
All Friends
One-way Communication Mutual Communication
Maintained Relationships
![Page 4: Subgraph Frequencies - Stanford Universitystanford.edu/~jugander/papers/ · Fitting λ to subgraph data How well can we fit λ? Subgraph frequencies are modeled very well by triadic](https://reader033.vdocuments.net/reader033/viewer/2022053023/60537031182ea009e770d2d8/html5/thumbnails/4.jpg)
▪ Neighborhoods: graph induced by friends of a single ego, excluding ego
▪ Groups: graph induced by members of a Facebook ‘group‘
▪ Events: graph induced by ‘Yes’ respondents to a Facebook ‘event’
Graph collections
All Friends
One-way Communication Mutual Communication
Maintained Relationships
Seeking a ‘coordinate system’ on these graphs
![Page 5: Subgraph Frequencies - Stanford Universitystanford.edu/~jugander/papers/ · Fitting λ to subgraph data How well can we fit λ? Subgraph frequencies are modeled very well by triadic](https://reader033.vdocuments.net/reader033/viewer/2022053023/60537031182ea009e770d2d8/html5/thumbnails/5.jpg)
SubgraphsAll Friends
One-way Communication Mutual Communication
Maintained Relationships
![Page 6: Subgraph Frequencies - Stanford Universitystanford.edu/~jugander/papers/ · Fitting λ to subgraph data How well can we fit λ? Subgraph frequencies are modeled very well by triadic](https://reader033.vdocuments.net/reader033/viewer/2022053023/60537031182ea009e770d2d8/html5/thumbnails/6.jpg)
SubgraphsAll Friends
One-way Communication Mutual Communication
Maintained Relationships
![Page 7: Subgraph Frequencies - Stanford Universitystanford.edu/~jugander/papers/ · Fitting λ to subgraph data How well can we fit λ? Subgraph frequencies are modeled very well by triadic](https://reader033.vdocuments.net/reader033/viewer/2022053023/60537031182ea009e770d2d8/html5/thumbnails/7.jpg)
SubgraphsAll Friends
One-way Communication Mutual Communication
Maintained Relationships
![Page 8: Subgraph Frequencies - Stanford Universitystanford.edu/~jugander/papers/ · Fitting λ to subgraph data How well can we fit λ? Subgraph frequencies are modeled very well by triadic](https://reader033.vdocuments.net/reader033/viewer/2022053023/60537031182ea009e770d2d8/html5/thumbnails/8.jpg)
SubgraphsAll Friends
One-way Communication Mutual Communication
Maintained Relationships
Compute frequencies
![Page 9: Subgraph Frequencies - Stanford Universitystanford.edu/~jugander/papers/ · Fitting λ to subgraph data How well can we fit λ? Subgraph frequencies are modeled very well by triadic](https://reader033.vdocuments.net/reader033/viewer/2022053023/60537031182ea009e770d2d8/html5/thumbnails/9.jpg)
Subgraph Frequencies▪ Definition: The subgraph frequency s(F,G) of a k-node subgraph F in a graph G
is the fraction of k-tuples of nodes in G that induce a copy of F.
Motifs/Frequent subgraphs: Inokuchi et al. #$$$, Milo et al. #$$#, Yan-Han #$$#, Kuramochi-Karypis #$$&Triad census: Davis-Leinhardt !'(!, Wasserman-Faust !''&
![Page 10: Subgraph Frequencies - Stanford Universitystanford.edu/~jugander/papers/ · Fitting λ to subgraph data How well can we fit λ? Subgraph frequencies are modeled very well by triadic](https://reader033.vdocuments.net/reader033/viewer/2022053023/60537031182ea009e770d2d8/html5/thumbnails/10.jpg)
Subgraph Frequencies▪ Definition: The subgraph frequency s(F,G) of a k-node subgraph F in a graph G
is the fraction of k-tuples of nodes in G that induce a copy of F.
▪ Subgraph frequency vectors:
s(·, G) = (y1, y2, y3, y4, y5, y6, y7, y8, y9, y10, y11)
s(·, G) = (x1, x2, x3, x4)
Motifs/Frequent subgraphs: Inokuchi et al. #$$$, Milo et al. #$$#, Yan-Han #$$#, Kuramochi-Karypis #$$&Triad census: Davis-Leinhardt !'(!, Wasserman-Faust !''&
= ($.!), $.%(, $.!&, $.%!)
![Page 11: Subgraph Frequencies - Stanford Universitystanford.edu/~jugander/papers/ · Fitting λ to subgraph data How well can we fit λ? Subgraph frequencies are modeled very well by triadic](https://reader033.vdocuments.net/reader033/viewer/2022053023/60537031182ea009e770d2d8/html5/thumbnails/11.jpg)
Subgraph Frequencies▪ Definition: The subgraph frequency s(F,G) of a k-node subgraph F in a graph G
is the fraction of k-tuples of nodes in G that induce a copy of F.
▪ Subgraph frequency vectors:
Motifs/Frequent subgraphs: Inokuchi et al. #$$$, Milo et al. #$$#, Yan-Han #$$#, Kuramochi-Karypis #$$&Triad census: Davis-Leinhardt !'(!, Wasserman-Faust !''&
= ($.!), $.%(, $.!&, $.%!)
s(·, G) = (y1, y2, y3, y4, y5, y6, y7, y8, y9, y10, y11)
s(·, G) = (x1, x2, x3, x4)
![Page 12: Subgraph Frequencies - Stanford Universitystanford.edu/~jugander/papers/ · Fitting λ to subgraph data How well can we fit λ? Subgraph frequencies are modeled very well by triadic](https://reader033.vdocuments.net/reader033/viewer/2022053023/60537031182ea009e770d2d8/html5/thumbnails/12.jpg)
Empirical/Extremal Questions
▪ Consider the subgraph frequencies as a ‘coordinate system’
▪ Empirical Geography:
▪ What subgraph frequencies do social graphs exhibit?
▪ Is there a good model?
▪ Extremal Geography:
▪ How much of this space is even feasible, combinatorially?
▪ Do empirical graphs fill the feasible space?
![Page 13: Subgraph Frequencies - Stanford Universitystanford.edu/~jugander/papers/ · Fitting λ to subgraph data How well can we fit λ? Subgraph frequencies are modeled very well by triadic](https://reader033.vdocuments.net/reader033/viewer/2022053023/60537031182ea009e770d2d8/html5/thumbnails/13.jpg)
Empirical/Extremal Questions
▪ What’s a property of graphs and what’s a property of people?
▪ Consider the subgraph frequencies as a ‘coordinate system’
▪ Empirical Geography:
▪ What subgraph frequencies do social graphs exhibit?
▪ Is there a good model?
▪ Extremal Geography:
▪ How much of this space is even feasible, combinatorially?
▪ Do empirical graphs fill the feasible space?
![Page 14: Subgraph Frequencies - Stanford Universitystanford.edu/~jugander/papers/ · Fitting λ to subgraph data How well can we fit λ? Subgraph frequencies are modeled very well by triadic](https://reader033.vdocuments.net/reader033/viewer/2022053023/60537031182ea009e770d2d8/html5/thumbnails/14.jpg)
What do we expect?
![Page 15: Subgraph Frequencies - Stanford Universitystanford.edu/~jugander/papers/ · Fitting λ to subgraph data How well can we fit λ? Subgraph frequencies are modeled very well by triadic](https://reader033.vdocuments.net/reader033/viewer/2022053023/60537031182ea009e770d2d8/html5/thumbnails/15.jpg)
tria
dic clo
sure
triadic closure
triadic closure
What do we expect?
![Page 16: Subgraph Frequencies - Stanford Universitystanford.edu/~jugander/papers/ · Fitting λ to subgraph data How well can we fit λ? Subgraph frequencies are modeled very well by triadic](https://reader033.vdocuments.net/reader033/viewer/2022053023/60537031182ea009e770d2d8/html5/thumbnails/16.jpg)
What do we expect?
We expect few wedges, many triangles for social networks.
![Page 17: Subgraph Frequencies - Stanford Universitystanford.edu/~jugander/papers/ · Fitting λ to subgraph data How well can we fit λ? Subgraph frequencies are modeled very well by triadic](https://reader033.vdocuments.net/reader033/viewer/2022053023/60537031182ea009e770d2d8/html5/thumbnails/17.jpg)
The triad space
![Page 18: Subgraph Frequencies - Stanford Universitystanford.edu/~jugander/papers/ · Fitting λ to subgraph data How well can we fit λ? Subgraph frequencies are modeled very well by triadic](https://reader033.vdocuments.net/reader033/viewer/2022053023/60537031182ea009e770d2d8/html5/thumbnails/18.jpg)
The triad space
You are here
*$ node graphsOrange - Neighborhoods Green - GroupsLavender - Events
![Page 19: Subgraph Frequencies - Stanford Universitystanford.edu/~jugander/papers/ · Fitting λ to subgraph data How well can we fit λ? Subgraph frequencies are modeled very well by triadic](https://reader033.vdocuments.net/reader033/viewer/2022053023/60537031182ea009e770d2d8/html5/thumbnails/19.jpg)
You are here
Gn,p
The triad space
*$ node graphsOrange - Neighborhoods Green - GroupsLavender - Events
![Page 20: Subgraph Frequencies - Stanford Universitystanford.edu/~jugander/papers/ · Fitting λ to subgraph data How well can we fit λ? Subgraph frequencies are modeled very well by triadic](https://reader033.vdocuments.net/reader033/viewer/2022053023/60537031182ea009e770d2d8/html5/thumbnails/20.jpg)
Subgraph frequency of
*$ node graphsOrange - Neighborhoods Green - GroupsLavender - Events
![Page 21: Subgraph Frequencies - Stanford Universitystanford.edu/~jugander/papers/ · Fitting λ to subgraph data How well can we fit λ? Subgraph frequencies are modeled very well by triadic](https://reader033.vdocuments.net/reader033/viewer/2022053023/60537031182ea009e770d2d8/html5/thumbnails/21.jpg)
Subgraph frequency of
*$ node graphsOrange - Neighborhoods Green - GroupsLavender - Events
![Page 22: Subgraph Frequencies - Stanford Universitystanford.edu/~jugander/papers/ · Fitting λ to subgraph data How well can we fit λ? Subgraph frequencies are modeled very well by triadic](https://reader033.vdocuments.net/reader033/viewer/2022053023/60537031182ea009e770d2d8/html5/thumbnails/22.jpg)
Subgraph frequency of
Gn,p
*$ node graphsOrange - Neighborhoods Green - GroupsLavender - Events
![Page 23: Subgraph Frequencies - Stanford Universitystanford.edu/~jugander/papers/ · Fitting λ to subgraph data How well can we fit λ? Subgraph frequencies are modeled very well by triadic](https://reader033.vdocuments.net/reader033/viewer/2022053023/60537031182ea009e770d2d8/html5/thumbnails/23.jpg)
Subgraph frequency of
*$ node graphsOrange - Neighborhoods Green - GroupsLavender - Events
ExtremalGraph Theory
![Page 24: Subgraph Frequencies - Stanford Universitystanford.edu/~jugander/papers/ · Fitting λ to subgraph data How well can we fit λ? Subgraph frequencies are modeled very well by triadic](https://reader033.vdocuments.net/reader033/viewer/2022053023/60537031182ea009e770d2d8/html5/thumbnails/24.jpg)
Subgraph frequency ofFrequency of the ‘forbidden triad’ is bounded at ! "/#.
Sharp for Kn/#,n/# (bipartite graph) when n is even.
*$ node graphsOrange - Neighborhoods Green - GroupsLavender - Events
![Page 25: Subgraph Frequencies - Stanford Universitystanford.edu/~jugander/papers/ · Fitting λ to subgraph data How well can we fit λ? Subgraph frequencies are modeled very well by triadic](https://reader033.vdocuments.net/reader033/viewer/2022053023/60537031182ea009e770d2d8/html5/thumbnails/25.jpg)
Subgraph frequencies
![Page 26: Subgraph Frequencies - Stanford Universitystanford.edu/~jugander/papers/ · Fitting λ to subgraph data How well can we fit λ? Subgraph frequencies are modeled very well by triadic](https://reader033.vdocuments.net/reader033/viewer/2022053023/60537031182ea009e770d2d8/html5/thumbnails/26.jpg)
‘Crowd-sourced’ inner bounds
Consider all social graphs and the complements of all graphs, anti-social graphs (which are also graphs!)
![Page 27: Subgraph Frequencies - Stanford Universitystanford.edu/~jugander/papers/ · Fitting λ to subgraph data How well can we fit λ? Subgraph frequencies are modeled very well by triadic](https://reader033.vdocuments.net/reader033/viewer/2022053023/60537031182ea009e770d2d8/html5/thumbnails/27.jpg)
What graphs are missing?
![Page 28: Subgraph Frequencies - Stanford Universitystanford.edu/~jugander/papers/ · Fitting λ to subgraph data How well can we fit λ? Subgraph frequencies are modeled very well by triadic](https://reader033.vdocuments.net/reader033/viewer/2022053023/60537031182ea009e770d2d8/html5/thumbnails/28.jpg)
▪ Square unlikely to form:
Triadic Closure and Squares
![Page 29: Subgraph Frequencies - Stanford Universitystanford.edu/~jugander/papers/ · Fitting λ to subgraph data How well can we fit λ? Subgraph frequencies are modeled very well by triadic](https://reader033.vdocuments.net/reader033/viewer/2022053023/60537031182ea009e770d2d8/html5/thumbnails/29.jpg)
▪ Square unlikely to form:
Triadic Closure and Squares
![Page 30: Subgraph Frequencies - Stanford Universitystanford.edu/~jugander/papers/ · Fitting λ to subgraph data How well can we fit λ? Subgraph frequencies are modeled very well by triadic](https://reader033.vdocuments.net/reader033/viewer/2022053023/60537031182ea009e770d2d8/html5/thumbnails/30.jpg)
▪ Square unlikely to form:
▪ Square has very short ‘half-life’:
Triadic Closure and Squares
![Page 31: Subgraph Frequencies - Stanford Universitystanford.edu/~jugander/papers/ · Fitting λ to subgraph data How well can we fit λ? Subgraph frequencies are modeled very well by triadic](https://reader033.vdocuments.net/reader033/viewer/2022053023/60537031182ea009e770d2d8/html5/thumbnails/31.jpg)
Continuous Time Markov Chain Model
![Page 32: Subgraph Frequencies - Stanford Universitystanford.edu/~jugander/papers/ · Fitting λ to subgraph data How well can we fit λ? Subgraph frequencies are modeled very well by triadic](https://reader033.vdocuments.net/reader033/viewer/2022053023/60537031182ea009e770d2d8/html5/thumbnails/32.jpg)
tria
dic clo
sure
triadic closure
triadic closure
Continuous Time Markov Chain Model
![Page 33: Subgraph Frequencies - Stanford Universitystanford.edu/~jugander/papers/ · Fitting λ to subgraph data How well can we fit λ? Subgraph frequencies are modeled very well by triadic](https://reader033.vdocuments.net/reader033/viewer/2022053023/60537031182ea009e770d2d8/html5/thumbnails/33.jpg)
Edge Formation Random Walk (EFRW)▪ Continuous-time Markov chain▪ Transitions between unlabeled, undirected graphs based in edge formation.
▪ Independent Poisson processes for all node pairs:▪ Arbitrary formation: rate ɣ > $ ▪ Arbitrary deletion: rate δ > $
▪ Triadic closure formation for each wedge: rate λ + $
![Page 34: Subgraph Frequencies - Stanford Universitystanford.edu/~jugander/papers/ · Fitting λ to subgraph data How well can we fit λ? Subgraph frequencies are modeled very well by triadic](https://reader033.vdocuments.net/reader033/viewer/2022053023/60537031182ea009e770d2d8/html5/thumbnails/34.jpg)
Edge Formation Random Walk (EFRW)▪ Continuous-time Markov chain▪ Transitions between unlabeled, undirected graphs based in edge formation.
▪ Independent Poisson processes for all node pairs:▪ Arbitrary formation: rate ɣ > $ ▪ Arbitrary deletion: rate δ > $
▪ Triadic closure formation for each wedge: rate λ + $
▪ For &-node graphs, succinct Markov chain state transition diagram:
6�
�4�
�
2�
2�
3�
3�
2�
�4�
�
�+�
2�
3(�+�)
6�3�
�
4�2(�+�)
4�2(�+
�)
2(�+2�)�+�
�
�
2��
![Page 35: Subgraph Frequencies - Stanford Universitystanford.edu/~jugander/papers/ · Fitting λ to subgraph data How well can we fit λ? Subgraph frequencies are modeled very well by triadic](https://reader033.vdocuments.net/reader033/viewer/2022053023/60537031182ea009e770d2d8/html5/thumbnails/35.jpg)
Edge Formation Random Walk (EFRW)▪ Continuous-time Markov chain▪ Transitions between unlabeled, undirected graphs based in edge formation.
▪ Independent Poisson processes for all node pairs:▪ Arbitrary formation: rate ɣ > $ ▪ Arbitrary deletion: rate δ > $
▪ Triadic closure formation for each wedge: rate λ + $
▪ For &-node graphs, succinct Markov chain state transition diagram:
6�
�4�
�
2�
2�
3�
3�
2�
�4�
�
�+�
2�
3(�+�)
6�3�
�
4�2(�+�)
4�2(�+
�)
2(�+2�)�+�
�
�
2��
![Page 36: Subgraph Frequencies - Stanford Universitystanford.edu/~jugander/papers/ · Fitting λ to subgraph data How well can we fit λ? Subgraph frequencies are modeled very well by triadic](https://reader033.vdocuments.net/reader033/viewer/2022053023/60537031182ea009e770d2d8/html5/thumbnails/36.jpg)
Fitting λ to subgraph data▪ How well can we fit λ?
▪ Subgraph frequencies are modeled very well by triadic closure.
frequ
ency
0.00
10.
010
0.10
01.
000
Neighborhoods, n=50
●●
●
●
●
●
●
●
●
●
●
●
●
Neighborhoods data, meanFit model, λ ν = 19.37
frequ
ency
0.00
10.
010
0.10
01.
000
Groups, n=50
●
●
Groups data, meanFit model, λ ν = 7.02
●●
●
● ●
●
●
●
●
● ●
frequ
ency
0.00
10.
010
0.10
01.
000
Events, n=50●
●
●
● ●
●
●
●
●
● ●
●
●
Events data, meanFit model, λ ν = 7.38
(log-scale y-axis)
![Page 37: Subgraph Frequencies - Stanford Universitystanford.edu/~jugander/papers/ · Fitting λ to subgraph data How well can we fit λ? Subgraph frequencies are modeled very well by triadic](https://reader033.vdocuments.net/reader033/viewer/2022053023/60537031182ea009e770d2d8/html5/thumbnails/37.jpg)
Extremal graph theory▪ Subgraph frequencies s(F,G) closely related to homomorphism density t(F,G).
▪ Frequency of cliques, lower bounds: Moon-Moser !'"#, Razborov #$$)▪ Frequency of cliques, upper bounds: Kruskal-Katona Theorem▪ Frequency of trees: Sidorenko Conjecture (‘Theorem for trees’)▪ Also linear relationships across sizes.▪ => Linear Program!
[Borgs et al. $%%&, Lovasz $%%']
![Page 38: Subgraph Frequencies - Stanford Universitystanford.edu/~jugander/papers/ · Fitting λ to subgraph data How well can we fit λ? Subgraph frequencies are modeled very well by triadic](https://reader033.vdocuments.net/reader033/viewer/2022053023/60537031182ea009e770d2d8/html5/thumbnails/38.jpg)
▪ A proposition for all subgraphs:
Proposition. For every k, there exist constants ✏ and n0 such that the following
holds. If F is a k-node subgraph that is not a clique and not empty, and G is
any graph on n � n0 nodes, then s(F,G) < 1� ✏.
Extremal graph theory
![Page 39: Subgraph Frequencies - Stanford Universitystanford.edu/~jugander/papers/ · Fitting λ to subgraph data How well can we fit λ? Subgraph frequencies are modeled very well by triadic](https://reader033.vdocuments.net/reader033/viewer/2022053023/60537031182ea009e770d2d8/html5/thumbnails/39.jpg)
▪ How do different audience graphs differ?
Audience graph classification
20 50 100 200 500 1000
0.05
0.10
0.20
0.50
1.00
size
Aver
age
edge
den
sity
NeighborhoodsNeighborhoods + egoGroupsEvents
40075
![Page 40: Subgraph Frequencies - Stanford Universitystanford.edu/~jugander/papers/ · Fitting λ to subgraph data How well can we fit λ? Subgraph frequencies are modeled very well by triadic](https://reader033.vdocuments.net/reader033/viewer/2022053023/60537031182ea009e770d2d8/html5/thumbnails/40.jpg)
▪ How do different audience graphs differ?
▪ Classification challenges A) (*-node neigh. vs. (*-node events B) &$$-node neigh. vs. &$$-node groups
Audience graph classification
20 50 100 200 500 1000
0.05
0.10
0.20
0.50
1.00
size
Aver
age
edge
den
sity
NeighborhoodsNeighborhoods + egoGroupsEvents
40075
![Page 41: Subgraph Frequencies - Stanford Universitystanford.edu/~jugander/papers/ · Fitting λ to subgraph data How well can we fit λ? Subgraph frequencies are modeled very well by triadic](https://reader033.vdocuments.net/reader033/viewer/2022053023/60537031182ea009e770d2d8/html5/thumbnails/41.jpg)
▪ How do different audience graphs differ?
▪ Classification challenges A) (*-node neigh. vs. (*-node events B) &$$-node neigh. vs. &$$-node groups
▪ Features: Quad frequencies : (", / (", accuracy Global features: "', / (", accuracy Quad frequencies + Global features: )!, / )#, accuracy
Audience graph classification
20 50 100 200 500 1000
0.05
0.10
0.20
0.50
1.00
size
Aver
age
edge
den
sity
NeighborhoodsNeighborhoods + egoGroupsEvents
40075
![Page 42: Subgraph Frequencies - Stanford Universitystanford.edu/~jugander/papers/ · Fitting λ to subgraph data How well can we fit λ? Subgraph frequencies are modeled very well by triadic](https://reader033.vdocuments.net/reader033/viewer/2022053023/60537031182ea009e770d2d8/html5/thumbnails/42.jpg)
▪ Subgraph frequencies usefully characterize social graphs, have extremal limits!
▪ Edge Formation Random Walk model of dense social graphs:
▪ Homomorphism density bounds yield subgraph density bounds:
Conclusions
6�
�4�
�
2�
2�
3�
3�
2�
�4�
�
�+�
2�
3(�+�)
6�3�
�
4�2(�+�)
4�2(�+
�)
2(�+2�)�+�
�
�
2��