statistical properties of network community structure
DESCRIPTION
Jure Leskovec, CMU Kevin Lang, Anirban Dasgupta and Michael Mahoney Yahoo! Research. Statistical properties of network community structure. Network communities. Communities: Sets of nodes with lots of connections inside and few to outside (the rest of the network) Assumption: - PowerPoint PPT PresentationTRANSCRIPT
![Page 1: Statistical properties of network community structure](https://reader036.vdocuments.net/reader036/viewer/2022062520/56816307550346895dd3841e/html5/thumbnails/1.jpg)
Statistical properties of network community structureJure Leskovec, CMUKevin Lang, Anirban Dasgupta and Michael MahoneyYahoo! Research
![Page 2: Statistical properties of network community structure](https://reader036.vdocuments.net/reader036/viewer/2022062520/56816307550346895dd3841e/html5/thumbnails/2.jpg)
Network communities
Communities: Sets of nodes with lots
of connections inside and few to outside (the rest of the network)
Assumption: Networks are
(hierarchically) composed of communities (modules)
Communities, clusters, groups,
modules
![Page 3: Statistical properties of network community structure](https://reader036.vdocuments.net/reader036/viewer/2022062520/56816307550346895dd3841e/html5/thumbnails/3.jpg)
Community score (quality) How community like is a
set of nodes? Want a measure that
corresponds to intuition Conductance
(normalized cut):Φ(S) = # edges cut / # edges inside
Small Φ(S) corresponds to more community-like sets of nodes
![Page 4: Statistical properties of network community structure](https://reader036.vdocuments.net/reader036/viewer/2022062520/56816307550346895dd3841e/html5/thumbnails/4.jpg)
Community score (quality)
Score: Φ(S) = # edges cut / # edges inside
![Page 5: Statistical properties of network community structure](https://reader036.vdocuments.net/reader036/viewer/2022062520/56816307550346895dd3841e/html5/thumbnails/5.jpg)
Community score (quality)
Score: Φ(S) = # edges cut / # edges inside
Bad communit
yΦ=5/7 = 0.7
![Page 6: Statistical properties of network community structure](https://reader036.vdocuments.net/reader036/viewer/2022062520/56816307550346895dd3841e/html5/thumbnails/6.jpg)
Community score (quality)
Score: Φ(S) = # edges cut / # edges inside
Better communit
y
Φ=5/7 = 0.7
Bad communit
y
Φ=2/5 = 0.4
![Page 7: Statistical properties of network community structure](https://reader036.vdocuments.net/reader036/viewer/2022062520/56816307550346895dd3841e/html5/thumbnails/7.jpg)
Community score (quality)
Score: Φ(S) = # edges cut / # edges inside
Better communit
y
Φ=5/7 = 0.7
Bad communit
y
Φ=2/5 = 0.4
Best communit
yΦ=2/8 = 0.25
![Page 8: Statistical properties of network community structure](https://reader036.vdocuments.net/reader036/viewer/2022062520/56816307550346895dd3841e/html5/thumbnails/8.jpg)
Network Community Profile Plot We define:
Network community profile (NCP) plotPlot the score of best community of size k
Search over all subsets of size k and find best: Φ(k=5) = 0.25
NCP plot is intractable to compute Use approximation algorithm
![Page 9: Statistical properties of network community structure](https://reader036.vdocuments.net/reader036/viewer/2022062520/56816307550346895dd3841e/html5/thumbnails/9.jpg)
NCP plot: Small Social Network Dolphin social network
Two communities of dolphins
NCP plotNetwork
![Page 10: Statistical properties of network community structure](https://reader036.vdocuments.net/reader036/viewer/2022062520/56816307550346895dd3841e/html5/thumbnails/10.jpg)
NCP plot: Zachary’s karate club
Zachary’s university karate club social network During the study club split into 2 The split (squares vs. circles) corresponds to cut B
NCP plotNetwork
![Page 11: Statistical properties of network community structure](https://reader036.vdocuments.net/reader036/viewer/2022062520/56816307550346895dd3841e/html5/thumbnails/11.jpg)
NCP plot: Network Science Collaborations between scientists in Networks
NCP plotNetwork
![Page 12: Statistical properties of network community structure](https://reader036.vdocuments.net/reader036/viewer/2022062520/56816307550346895dd3841e/html5/thumbnails/12.jpg)
Geometric and Hierarchical graphs
Hierarchical network
Geometric (grid-like) network – Small social
networks– Geometric and– Hierarchical networkhave downward NCP plot
![Page 13: Statistical properties of network community structure](https://reader036.vdocuments.net/reader036/viewer/2022062520/56816307550346895dd3841e/html5/thumbnails/13.jpg)
Our work: Large networks Previously researchers examined community
structure of small networks (~100 nodes) We examined more than 70 different large
social and information networks
Large real-world networks look
completely different!
![Page 14: Statistical properties of network community structure](https://reader036.vdocuments.net/reader036/viewer/2022062520/56816307550346895dd3841e/html5/thumbnails/14.jpg)
Example of our findings Typical example:
General relativity collaboration network (4,158 nodes, 13,422 edges)
![Page 15: Statistical properties of network community structure](https://reader036.vdocuments.net/reader036/viewer/2022062520/56816307550346895dd3841e/html5/thumbnails/15.jpg)
NCP: LiveJournal (N=5M, E=42M)
Better and better
communities
Worse and worse
communities
Best community
has 100 nodes
Com
mun
ity sc
ore
Community size
![Page 16: Statistical properties of network community structure](https://reader036.vdocuments.net/reader036/viewer/2022062520/56816307550346895dd3841e/html5/thumbnails/16.jpg)
Explanation: Downward part Whiskers are responsible for
downward slope of NCP plot
Whisker is a set of nodes connected to the network by a single
edge
NCP plot
Largest whisker
![Page 17: Statistical properties of network community structure](https://reader036.vdocuments.net/reader036/viewer/2022062520/56816307550346895dd3841e/html5/thumbnails/17.jpg)
Explanation: Upward part Each new edge inside the
community costs moreNCP plot
Φ=2/1 = 2Φ=8/3 = 2.6
Φ=64/11 = 5.8
Each node has twice as many
children
![Page 18: Statistical properties of network community structure](https://reader036.vdocuments.net/reader036/viewer/2022062520/56816307550346895dd3841e/html5/thumbnails/18.jpg)
Suggested Network Structure
Network structure: Core-
periphery, jellyfish, octopus
Whiskers are responsible for
good communities
Denser and denser core
of the network
Core contains 60%
node and 80% edges
![Page 19: Statistical properties of network community structure](https://reader036.vdocuments.net/reader036/viewer/2022062520/56816307550346895dd3841e/html5/thumbnails/19.jpg)
Caveat: Bag of whiskersWhat if we allow
cuts that give disconnected communities?
Cut all whiskers and compose
communities out of them
![Page 20: Statistical properties of network community structure](https://reader036.vdocuments.net/reader036/viewer/2022062520/56816307550346895dd3841e/html5/thumbnails/20.jpg)
Caveat: Bag of whiskersCo
mm
unity
scor
e
Community size
We get better community scores when
composing disconnected sets of
whiskersConnected communitie
sBag of whiskers
![Page 21: Statistical properties of network community structure](https://reader036.vdocuments.net/reader036/viewer/2022062520/56816307550346895dd3841e/html5/thumbnails/21.jpg)
Comparison to a rewired network
Rewired network: random
network with same degree distribution
![Page 22: Statistical properties of network community structure](https://reader036.vdocuments.net/reader036/viewer/2022062520/56816307550346895dd3841e/html5/thumbnails/22.jpg)
What is a good model?
What is a good model that explains such network structure?
None of the existing models work
Pref. attachment Small World Geometric Pref. Attachment
FlatDown and Flat
Flat and Down
![Page 23: Statistical properties of network community structure](https://reader036.vdocuments.net/reader036/viewer/2022062520/56816307550346895dd3841e/html5/thumbnails/23.jpg)
Forest Fire model works Forest Fire: connections spread like a fire
New node joins the network Selects a seed node Connects to some of its neighbors Continue recursively
As community grows it
blends into the core of
the network
![Page 24: Statistical properties of network community structure](https://reader036.vdocuments.net/reader036/viewer/2022062520/56816307550346895dd3841e/html5/thumbnails/24.jpg)
Forest Fire NCP plot
rewired
network
Bag of whisk
ers
![Page 25: Statistical properties of network community structure](https://reader036.vdocuments.net/reader036/viewer/2022062520/56816307550346895dd3841e/html5/thumbnails/25.jpg)
Conclusion and connections Whiskers:
Largest whisker has ~100 nodes Independent of network size Dunbar number: a person can maintain social
relationship to 150 people Bond vs. identity communites
Core: Newman et al. analyzed 400k node product network▪ Largest community has 50% nodes▪ Community was labeled “miscelaneous”
![Page 26: Statistical properties of network community structure](https://reader036.vdocuments.net/reader036/viewer/2022062520/56816307550346895dd3841e/html5/thumbnails/26.jpg)
Conclusion and connections NCP plot is a way to analyze network
community structure Our results agree with previous work on small
networks But large networks are fundamentally
different Large networks have core-periphery structure
Small well isolated communities blend into the core of the networks as they grow