8/18/2019 7270 Community Detection
Community Detection and
Graph-based Clustering
Chapter 3
Of Lei Tang and Huan Liu's Book
Slides prepared by
Qiang Yang, UST, Hong Kong
(Chapter 3, Community Detection and Mining in Social Media. Lei Tang and Huan Liu, Morgan & Claypool, September, 2010.)
Community
• Community: It is formed by individuals such that those within a group interact with each other more frequently than with those outside the group
 – a.k.a. group, cluster, cohesive subgroup, module in different contexts
• Community detection: discovering groups in a network where individuals' group memberships are not explicitly given
• Why communities in social media?
 – Human beings are social
 – Easy-to-use social media allows people to extend their social life in unprecedented ways
 – Difficult to meet friends in the physical world, but much easier to find friends online with similar interests
 – Interactions between nodes can help determine communities
Communities in Social Media
• Two types of groups in social media
 – Explicit Groups: formed by user subscriptions
 – Implicit Groups: implicitly formed by social interactions
• Some social media sites allow people to join groups; is it necessary to extract groups based on network topology?
COMMUNITY DETECTION
Subjectivity of Community Definition
Each component is a community
A densely-knit community
Definition of a community can be subjective.
(unsupervised learning)
Taxonomy of Community Criteria
• Criteria vary depending on the tasks
• Roughly, community detection methods can be divided into 4 categories (not exclusive):
• Partition the whole network into several disjoint sets
• Hierarchy-Centric Community
 – Construct a hierarchical structure of communities
Complete Mutuality: Cliques
• Clique: a maximum complete subgraph in which all nodes are adjacent to each other
• NP-hard to find the maximum clique in a network
• Straightforward implementation to find cliques is very expensive in time complexity
Finding the Maximum Clique
• In a clique of size k, each node maintains degree >= k-1
Maximum Clique Example
• Suppose we sample a sub-network with nodes {1-9} and find a clique {1, 2, 3} of size 3
• In order to find a clique >3, remove all nodes with degree <= 3-1 = 2
 – Remove nodes 2 and 9
 – Remove nodes 1 and 3
 – Remove node 4
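The pruning passes can be sketched in a few lines of Python. The figure with the toy network did not survive in this text, so the edge list `EDGES` below is an assumption: a 9-node, 14-edge graph reconstructed to agree with the numbers quoted on the slides. The helper names are illustrative only.

```python
# Assumed toy network: reconstructed edge list, not given explicitly in the slides.
EDGES = [(1, 2), (1, 3), (1, 4), (2, 3), (3, 4), (4, 5), (4, 6),
         (5, 6), (5, 7), (5, 8), (6, 7), (6, 8), (7, 8), (7, 9)]

def build_adjacency(edges):
    """Undirected adjacency sets from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def prune_for_larger_clique(adj, k):
    """Repeatedly drop nodes with degree <= k - 1.

    Such nodes cannot belong to a clique of size > k, and removing them
    may push further nodes below the threshold.
    """
    adj = {u: set(vs) for u, vs in adj.items()}  # work on a copy
    rounds = []
    while True:
        doomed = sorted(u for u, vs in adj.items() if len(vs) <= k - 1)
        if not doomed:
            return adj, rounds
        rounds.append(doomed)
        for u in doomed:
            for v in adj.pop(u):
                if v in adj:
                    adj[v].discard(u)

def is_clique(adj, nodes):
    """True if every pair of the given nodes is adjacent."""
    return all(v in adj[u] for u in nodes for v in nodes if u != v)

remaining, rounds = prune_for_larger_clique(build_adjacency(EDGES), k=3)
print(rounds)             # nodes removed in each pruning pass
print(sorted(remaining))  # surviving candidates for a clique larger than 3
```

On this assumed graph the passes remove {2, 9}, then {1, 3}, then {4}, and the four surviving nodes form a clique of size 4.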
Clique Percolation Method (CPM)
• Clique is a very strict definition, unstable
• CPM is a method to find overlapping communities
 – Input
  • A parameter k, and a network
 – Procedure
  • Find out all cliques of size k in a given network
  • Construct a clique graph. Two cliques are adjacent if they share k-1 nodes
  • Each connected component in the clique graph forms a community
CPM Example
Cliques of size 3: {1, 2, 3}, {1, 3, 4}, {4, 5, 6}, {5, 6, 7}, {5, 6, 8}, {5, 7, 8}, {6, 7, 8}
Communities:
{1, 2, 3, 4}
{4, 5, 6, 7, 8}
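A minimal sketch of the CPM procedure follows. It enumerates k-cliques by brute force, which is only reasonable for tiny graphs, and it runs on an assumed edge list (`EDGES`) reconstructed to match the cliques listed in the example; the figure itself is not part of this text. Function names are illustrative.

```python
from itertools import combinations

# Assumed toy network: reconstructed edge list, not given explicitly in the slides.
EDGES = [(1, 2), (1, 3), (1, 4), (2, 3), (3, 4), (4, 5), (4, 6),
         (5, 6), (5, 7), (5, 8), (6, 7), (6, 8), (7, 8), (7, 9)]

def build_adjacency(edges):
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def k_cliques(adj, k):
    """All node sets of size k whose members are pairwise adjacent (brute force)."""
    return [frozenset(c) for c in combinations(sorted(adj), k)
            if all(v in adj[u] for u, v in combinations(c, 2))]

def clique_percolation(adj, k):
    """Communities = unions of k-cliques connected through shared (k-1)-node overlaps."""
    cliques = k_cliques(adj, k)
    parent = list(range(len(cliques)))  # union-find over clique indices

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Two cliques are adjacent in the clique graph if they share k-1 nodes.
    for i, j in combinations(range(len(cliques)), 2):
        if len(cliques[i] & cliques[j]) == k - 1:
            parent[find(i)] = find(j)

    # Each connected component of the clique graph yields one community.
    groups = {}
    for i, clique in enumerate(cliques):
        groups.setdefault(find(i), set()).update(clique)
    return sorted(sorted(g) for g in groups.values())

adj = build_adjacency(EDGES)
cliques3 = k_cliques(adj, 3)
communities = clique_percolation(adj, 3)
print(len(cliques3), communities)
```

Node 4 ends up in both communities, which is the overlap CPM is meant to expose; node 9 belongs to no 3-clique and stays unassigned.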
Reachability: k-clique, k-club
• Any node in a group should be reachable in k hops
• k-clique: a maximal subgraph in which the largest geodesic distance between any two nodes <= k
• k-club: a substructure of diameter <= k
• A k-clique might have diameter larger than k in the subgraph
 – E.g. {1, 2, 3, 4, 5}
• Commonly used in traditional SNA
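The difference between the two distance criteria can be checked with breadth-first search. The graph behind the slide's example is not available here, so this sketch uses a hypothetical 5-cycle instead: the group {1, 2, 3, 4} satisfies the k-clique distance condition for k = 2 when paths may leave the group, but its induced subgraph has diameter 3, so it is not a 2-club. The sketch checks only the distance conditions, not maximality.

```python
from collections import deque

# Hypothetical example graph (a 5-cycle), NOT the figure from the slides.
CYCLE = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 1)]

def build_adjacency(edges):
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def bfs_distances(adj, source, allowed=None):
    """Hop distances from source; if `allowed` is given, paths must stay inside it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist and (allowed is None or v in allowed):
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def max_pairwise_distance(adj, group, inside_only):
    """Largest geodesic distance between group members.

    inside_only=False: distances in the whole network (k-clique condition).
    inside_only=True:  distances in the induced subgraph (k-club diameter).
    """
    allowed = set(group) if inside_only else None
    worst = 0
    for u in group:
        dist = bfs_distances(adj, u, allowed)
        for v in group:
            if v not in dist:
                return float("inf")
            worst = max(worst, dist[v])
    return worst

adj = build_adjacency(CYCLE)
group = [1, 2, 3, 4]
network_distance = max_pairwise_distance(adj, group, inside_only=False)
subgraph_diameter = max_pairwise_distance(adj, group, inside_only=True)
print(network_distance, subgraph_diameter)
```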
Group-Centric Community Detection: Density-Based Groups
• The group-centric criterion requires the whole group to satisfy a certain condition
 – E.g., the group density >= a given threshold
• A subgraph $G_s(V_s, E_s)$ is a $\gamma$-dense quasi-clique if $\frac{2|E_s|}{|V_s|(|V_s|-1)} \ge \gamma$, where the denominator is the maximum possible total degree.
• A similar strategy to that of cliques can be used
 – Sample a subgraph, and find a maximal quasi-clique (say, of size $|V_s|$)
 – Remove nodes with degree less than the average degree
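As a small illustration of the density criterion, the sketch below computes the density of a node group as twice its internal edge count divided by |V_s|(|V_s| - 1), the standard quasi-clique density. The edge list is an assumed reconstruction of the deck's 9-node toy network, and the threshold values are arbitrary.

```python
# Assumed toy network: reconstructed edge list, not given explicitly in the slides.
EDGES = [(1, 2), (1, 3), (1, 4), (2, 3), (3, 4), (4, 5), (4, 6),
         (5, 6), (5, 7), (5, 8), (6, 7), (6, 8), (7, 8), (7, 9)]

def build_adjacency(edges):
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def density(adj, nodes):
    """2|E_s| / (|V_s| (|V_s| - 1)) for the subgraph induced by `nodes`."""
    nodes = set(nodes)
    n = len(nodes)
    if n < 2:
        return 0.0  # convention chosen here for degenerate groups
    # Each internal edge is seen from both endpoints, so this sum equals 2|E_s|.
    twice_internal = sum(1 for u in nodes for v in adj[u] if v in nodes)
    return twice_internal / (n * (n - 1))

def is_quasi_clique(adj, nodes, gamma):
    """True if the induced subgraph is at least gamma-dense."""
    return density(adj, nodes) >= gamma

adj = build_adjacency(EDGES)
print(density(adj, [4, 5, 6, 7, 8]))  # 8 of 10 possible edges
print(density(adj, [5, 6, 7, 8]))     # a full clique
print(is_quasi_clique(adj, range(1, 10), gamma=0.8))
```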
Clustering based on Vertex Similarity
• Apply k-means or similarity-based clustering to nodes
• Vertex similarity is defined in terms of the similarity of their neighborhood
• Structural equivalence: two nodes are structurally equivalent iff they are connecting to the same set of actors
• Structural equivalence is too strict for practical use.
Vertex Similarity
• Jaccard Similarity
• Cosine similarity
(1) Clustering based on vertex similarity
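The formulas for the two measures did not survive in this text. The sketch below assumes the usual neighborhood-based definitions, with N_i the neighbor set of node i: Jaccard(i, j) = |N_i ∩ N_j| / |N_i ∪ N_j| and Cosine(i, j) = |N_i ∩ N_j| / sqrt(|N_i| |N_j|). The edge list is an assumed reconstruction of the deck's toy network.

```python
import math

# Assumed toy network: reconstructed edge list, not given explicitly in the slides.
EDGES = [(1, 2), (1, 3), (1, 4), (2, 3), (3, 4), (4, 5), (4, 6),
         (5, 6), (5, 7), (5, 8), (6, 7), (6, 8), (7, 8), (7, 9)]

def build_adjacency(edges):
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def jaccard(adj, i, j):
    """Shared neighbors divided by all neighbors of either node."""
    union = adj[i] | adj[j]
    return len(adj[i] & adj[j]) / len(union) if union else 0.0

def cosine(adj, i, j):
    """Shared neighbors divided by the geometric mean of the two degrees."""
    denom = math.sqrt(len(adj[i]) * len(adj[j]))
    return len(adj[i] & adj[j]) / denom if denom else 0.0

adj = build_adjacency(EDGES)
print(jaccard(adj, 4, 6))  # one shared neighbor out of seven distinct ones
print(cosine(adj, 4, 6))   # one shared neighbor, both nodes have degree 4
```

On the assumed graph, nodes 4 and 6 share only node 5, giving a Jaccard similarity of 1/7 and a cosine similarity of 1/4. A pairwise similarity matrix built this way could then be handed to k-means or another similarity-based clustering routine.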
Cut
• Most interactions are within group whereas interactions between groups are few
• Community detection → minimum cut problem
• Cut: A partition of vertices of a graph into two disjoint sets
• Minimum cut problem: find a graph partition such that the number of edges between the two sets is minimized
(4) Spectral clustering
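To make the minimum cut objective concrete, the sketch below counts the edges crossing a two-way partition and finds the minimum cut by exhaustive search. Exhaustive search is an illustration only (it is exponential in the number of nodes), and the edge list is an assumed reconstruction of the deck's 9-node toy network.

```python
from itertools import combinations

# Assumed toy network: reconstructed edge list, not given explicitly in the slides.
EDGES = [(1, 2), (1, 3), (1, 4), (2, 3), (3, 4), (4, 5), (4, 6),
         (5, 6), (5, 7), (5, 8), (6, 7), (6, 8), (7, 8), (7, 9)]

def cut_size(edges, side):
    """Number of edges with exactly one endpoint in `side`."""
    side = set(side)
    return sum(1 for u, v in edges if (u in side) != (v in side))

def brute_force_min_cut(edges):
    """Try every two-way split; feasible only for very small graphs."""
    nodes = sorted({v for e in edges for v in e})
    anchor, rest = nodes[0], nodes[1:]
    best = None
    # Fix `anchor` on one side to skip mirrored splits; leave at least one node out.
    for r in range(len(rest)):
        for extra in combinations(rest, r):
            side = {anchor, *extra}
            size = cut_size(edges, side)
            if best is None or size < best[0]:
                best = (size, side)
    return best

min_size, side = brute_force_min_cut(EDGES)
other_side = {v for e in EDGES for v in e} - side
print(min_size, sorted(other_side))
print(cut_size(EDGES, {1, 2, 3, 4}))
```

On this assumed graph the minimum cut has size 1 and merely isolates node 9, while the more balanced split {1, 2, 3, 4} versus the rest cuts 2 edges. Plain minimum cut tends to favor such trivial splits.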
Ratio Cut & Normalized Cut
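The body of this slide did not survive, so the following is a sketch under an assumed, commonly used formulation: for a partition into k sets C_1..C_k, ratio cut averages cut(C_i, rest) / |C_i| and normalized cut averages cut(C_i, rest) / vol(C_i), where vol is the sum of degrees in C_i. The edge list is an assumed reconstruction of the deck's toy network.

```python
# Assumed toy network: reconstructed edge list, not given explicitly in the slides.
EDGES = [(1, 2), (1, 3), (1, 4), (2, 3), (3, 4), (4, 5), (4, 6),
         (5, 6), (5, 7), (5, 8), (6, 7), (6, 8), (7, 8), (7, 9)]

def cut_size(edges, side):
    """Number of edges with exactly one endpoint in `side`."""
    side = set(side)
    return sum(1 for u, v in edges if (u in side) != (v in side))

def degrees(edges):
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    return deg

def ratio_cut(edges, partition):
    """Average of cut(C_i, rest) / |C_i| (assumed definition)."""
    return sum(cut_size(edges, c) / len(c) for c in partition) / len(partition)

def normalized_cut(edges, partition):
    """Average of cut(C_i, rest) / vol(C_i) (assumed definition)."""
    deg = degrees(edges)
    return sum(cut_size(edges, c) / sum(deg[v] for v in c)
               for c in partition) / len(partition)

trivial = [{9}, {1, 2, 3, 4, 5, 6, 7, 8}]
balanced = [{1, 2, 3, 4}, {5, 6, 7, 8, 9}]
for name, part in (("trivial", trivial), ("balanced", balanced)):
    print(name, round(ratio_cut(EDGES, part), 4), round(normalized_cut(EDGES, part), 4))
```

Under these definitions the balanced split scores lower than the split that isolates node 9 on both measures, even though it cuts more edges.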
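Modularity Maximization
• Modularity measures the strength of a community partition by taking into account the degree distribution
• Given a network with m edges, the expected number of edges between two nodes with degrees $d_i$ and $d_j$ is $d_i d_j / 2m$
• Strength of a community: $\sum_{i \in C, j \in C} \left( A_{ij} - d_i d_j / 2m \right)$, where $A$ is the adjacency matrix
• Modularity: $Q = \frac{1}{2m} \sum_{\ell=1}^{k} \sum_{i \in C_\ell, j \in C_\ell} \left( A_{ij} - d_i d_j / 2m \right)$
The expected number of edges between nodes 1 and 2 is 3*2/(2*14) = 3/14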
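The modularity computation can be sketched as follows, using the standard definition: the strength of a community C is the sum over ordered pairs i, j in C of A_ij - d_i d_j / 2m, and modularity is the sum of strengths divided by 2m. The edge list is an assumed reconstruction of the 14-edge toy network, chosen so that nodes 1 and 2 have degrees 3 and 2 as in the worked number above.

```python
# Assumed toy network: reconstructed edge list, not given explicitly in the slides.
EDGES = [(1, 2), (1, 3), (1, 4), (2, 3), (3, 4), (4, 5), (4, 6),
         (5, 6), (5, 7), (5, 8), (6, 7), (6, 8), (7, 8), (7, 9)]

def degrees(edges):
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    return deg

def expected_edges(deg, m, i, j):
    """Expected number of edges between i and j given the degree distribution."""
    return deg[i] * deg[j] / (2 * m)

def community_strength(edges, deg, m, community):
    """Sum over ordered pairs (i, j) in C of A_ij - d_i d_j / 2m."""
    members = set(community)
    internal = sum(1 for u, v in edges if u in members and v in members)
    volume = sum(deg[v] for v in members)
    # A_ij sums to 2 * internal; d_i d_j / 2m sums to volume^2 / 2m.
    return 2 * internal - volume * volume / (2 * m)

def modularity(edges, partition):
    m = len(edges)
    deg = degrees(edges)
    return sum(community_strength(edges, deg, m, c) for c in partition) / (2 * m)

deg = degrees(EDGES)
m = len(EDGES)
print(expected_edges(deg, m, 1, 2))                        # 3*2 / (2*14)
print(modularity(EDGES, [{1, 2, 3, 4}, {5, 6, 7, 8, 9}]))  # two-community split
print(modularity(EDGES, [set(range(1, 10))]))              # everything in one group
```

Placing every node in a single community gives modularity 0, so a positive value indicates more within-group edges than the degree distribution alone would predict.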
(5) Modularity maximization
Given the degree distribution
Hierarchy-Centric Community Detection
• Goal: build a hierarchical structure of communities based on network topology
• Allow the analysis of a network at different resolutions
• Representative approaches:
 – Divisive Hierarchical Clustering (top-down)
 – Agglomerative Hierarchical Clustering (bottom-up)
Divisive Hierarchical Clustering
• Divisive clustering
 – Partition nodes into several sets
 – Each set is further divided into smaller ones
Edge Betweenness
• The strength of a tie can be measured by edge betweenness
• Edge betweenness: the number of shortest paths that pass along the edge
• The edge with higher betweenness tends to be the bridge between two communities.
The edge betweenness of e(1, 2) is 4 (= 6/2 + 1), as all the shortest paths from 2 to {4, 5, 6, 7, 8, 9} have to either pass e(1, 2) or e(2, 3), and e(1, 2) is the shortest path between 1 and 2
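Edge betweenness can be computed with a breadth-first variant of Brandes' accumulation scheme, sketched below. When several shortest paths connect a pair, each carries an equal fraction, which is where the 6/2 in the example comes from. The edge list is an assumed reconstruction of the toy network, and the function names are illustrative.

```python
from collections import deque

# Assumed toy network: reconstructed edge list, not given explicitly in the slides.
EDGES = [(1, 2), (1, 3), (1, 4), (2, 3), (3, 4), (4, 5), (4, 6),
         (5, 6), (5, 7), (5, 8), (6, 7), (6, 8), (7, 8), (7, 9)]

def build_adjacency(edges):
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def edge_betweenness(adj):
    """Shortest-path edge betweenness for an unweighted, undirected graph."""
    bet = {(u, v): 0.0 for u in adj for v in adj[u] if u < v}
    for s in adj:
        # Forward pass: BFS that counts shortest paths (sigma) from s.
        dist, sigma, preds, order = {s: 0}, {s: 1.0}, {}, []
        queue = deque([s])
        while queue:
            u = queue.popleft()
            order.append(u)
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    sigma[v] = 0.0
                    preds[v] = []
                    queue.append(v)
                if dist[v] == dist[u] + 1:
                    sigma[v] += sigma[u]
                    preds[v].append(u)
        # Backward pass: push path fractions from the farthest nodes toward s.
        delta = {v: 0.0 for v in order}
        for w in reversed(order):
            for u in preds.get(w, []):
                share = sigma[u] / sigma[w] * (1.0 + delta[w])
                bet[(min(u, w), max(u, w))] += share
                delta[u] += share
    # Every unordered pair was counted once from each endpoint.
    return {e: b / 2.0 for e, b in bet.items()}

bet = edge_betweenness(build_adjacency(EDGES))
print(bet[(1, 2)])
print(max(bet.values()))
```

On the assumed graph this gives 4 for e(1, 2), in line with the slide, and the largest value, 10, is shared by e(4, 5) and e(4, 6).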
Divisive clustering based on edge betweenness
Initial betweenness value
After removing e(4,5), the betweenness of e(4,6) becomes 20, which is the highest
After removing e(4,6), the edge e(7,9) has the highest betweenness value 4, and should be removed.
Idea: progressively removing edges with the highest betweenness
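The idea of progressively removing the highest-betweenness edge can be sketched as a loop that recomputes betweenness after every removal and stops once a target number of connected components is reached. The stopping rule, the tie-breaking (smallest edge tuple first) and the edge list are assumptions made for this illustration.

```python
from collections import deque

# Assumed toy network: reconstructed edge list, not given explicitly in the slides.
EDGES = [(1, 2), (1, 3), (1, 4), (2, 3), (3, 4), (4, 5), (4, 6),
         (5, 6), (5, 7), (5, 8), (6, 7), (6, 8), (7, 8), (7, 9)]

def adjacency(nodes, edges):
    adj = {v: set() for v in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj

def edge_betweenness(adj):
    """Brandes-style shortest-path edge betweenness (unweighted, undirected)."""
    bet = {(u, v): 0.0 for u in adj for v in adj[u] if u < v}
    for s in adj:
        dist, sigma, preds, order = {s: 0}, {s: 1.0}, {}, []
        queue = deque([s])
        while queue:
            u = queue.popleft()
            order.append(u)
            for v in adj[u]:
                if v not in dist:
                    dist[v], sigma[v], preds[v] = dist[u] + 1, 0.0, []
                    queue.append(v)
                if dist[v] == dist[u] + 1:
                    sigma[v] += sigma[u]
                    preds[v].append(u)
        delta = {v: 0.0 for v in order}
        for w in reversed(order):
            for u in preds.get(w, []):
                share = sigma[u] / sigma[w] * (1.0 + delta[w])
                bet[(min(u, w), max(u, w))] += share
                delta[u] += share
    return {e: b / 2.0 for e, b in bet.items()}

def components(adj):
    """Connected components as sorted node lists, ordered by smallest node."""
    seen, comps = set(), []
    for s in sorted(adj):
        if s in seen:
            continue
        seen.add(s)
        comp, stack = [], [s]
        while stack:
            u = stack.pop()
            comp.append(u)
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        comps.append(sorted(comp))
    return comps

def divisive_clustering(nodes, edges, target):
    """Remove the highest-betweenness edge until `target` components exist."""
    adj = adjacency(nodes, edges)
    removed = []
    while len(components(adj)) < target:
        bet = edge_betweenness(adj)
        if not bet:
            break  # no edges left to remove
        u, v = max(sorted(bet), key=bet.get)  # ties: smallest edge tuple wins
        adj[u].discard(v)
        adj[v].discard(u)
        removed.append((u, v))
    return components(adj), removed

two, removed_two = divisive_clustering(range(1, 10), EDGES, target=2)
three, removed_three = divisive_clustering(range(1, 10), EDGES, target=3)
print(two, removed_two)
print(three, removed_three)
```

Recording the component structure after each removal, rather than stopping at a fixed count, would yield the full top-down hierarchy.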
Agglomerative Hierarchical Clustering
• Initialize each node as a community
• Merge communities successively into larger communities following a certain criterion
 – E.g., based on modularity increase
Dendrogram according to Agglomerative Clustering based on Modularity
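A naive greedy version of modularity-based agglomeration is sketched below. It starts from singleton communities and repeatedly merges the pair with the largest modularity gain, which for communities a and b equals E_ab / m - vol(a) vol(b) / (2 m^2), with E_ab the number of edges between them. Stopping when no merge increases modularity is an assumption of this sketch, as is the reconstructed edge list; the recorded merge order plays the role of the dendrogram.

```python
from itertools import combinations

# Assumed toy network: reconstructed edge list, not given explicitly in the slides.
EDGES = [(1, 2), (1, 3), (1, 4), (2, 3), (3, 4), (4, 5), (4, 6),
         (5, 6), (5, 7), (5, 8), (6, 7), (6, 8), (7, 8), (7, 9)]

def agglomerate(edges):
    """Greedy bottom-up merging by modularity gain (naive, for small graphs)."""
    m = len(edges)
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1

    # Start with every node in its own community.
    communities = [frozenset([v]) for v in sorted(degree)]
    merges = []
    while len(communities) > 1:
        best_gain, best_pair = 0.0, None
        for a, b in combinations(communities, 2):
            links = sum(1 for u, v in edges
                        if (u in a and v in b) or (u in b and v in a))
            vol_a = sum(degree[v] for v in a)
            vol_b = sum(degree[v] for v in b)
            gain = links / m - vol_a * vol_b / (2 * m * m)
            if gain > best_gain:
                best_gain, best_pair = gain, (a, b)
        if best_pair is None:
            break  # no merge would increase modularity
        a, b = best_pair
        communities = [c for c in communities if c not in (a, b)] + [a | b]
        merges.append((sorted(a), sorted(b)))
    return sorted(sorted(c) for c in communities), merges

final, merges = agglomerate(EDGES)
print(final)
for left, right in merges:
    print("merge", left, "+", right)
```

On the assumed graph the first merge joins nodes 7 and 9, and the procedure stops with the two communities {1, 2, 3, 4} and {5, 6, 7, 8, 9}.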
Summary of Community Detection