spectral clusteringwcohen/10-605/unsup-on-graphs.pdf · spectral clustering: graph = matrix w*v 1 =...
TRANSCRIPT
![Page 1: Spectral Clusteringwcohen/10-605/unsup-on-graphs.pdf · Spectral Clustering: Graph = Matrix W*v 1 = v 2 “propogatesweights from neighbors” W × v = lv: v is an eigenvecto r with](https://reader033.vdocuments.net/reader033/viewer/2022050412/5f892bfee07a644d6834bc06/html5/thumbnails/1.jpg)
1
![Page 2: Spectral Clusteringwcohen/10-605/unsup-on-graphs.pdf · Spectral Clustering: Graph = Matrix W*v 1 = v 2 “propogatesweights from neighbors” W × v = lv: v is an eigenvecto r with](https://reader033.vdocuments.net/reader033/viewer/2022050412/5f892bfee07a644d6834bc06/html5/thumbnails/2.jpg)
Spectral Clustering
2
![Page 3: Spectral Clusteringwcohen/10-605/unsup-on-graphs.pdf · Spectral Clustering: Graph = Matrix W*v 1 = v 2 “propogatesweights from neighbors” W × v = lv: v is an eigenvecto r with](https://reader033.vdocuments.net/reader033/viewer/2022050412/5f892bfee07a644d6834bc06/html5/thumbnails/3.jpg)
spectral k-means
after transformation
3
![Page 4: Spectral Clusteringwcohen/10-605/unsup-on-graphs.pdf · Spectral Clustering: Graph = Matrix W*v 1 = v 2 “propogatesweights from neighbors” W × v = lv: v is an eigenvecto r with](https://reader033.vdocuments.net/reader033/viewer/2022050412/5f892bfee07a644d6834bc06/html5/thumbnails/4.jpg)
4
![Page 5: Spectral Clusteringwcohen/10-605/unsup-on-graphs.pdf · Spectral Clustering: Graph = Matrix W*v 1 = v 2 “propogatesweights from neighbors” W × v = lv: v is an eigenvecto r with](https://reader033.vdocuments.net/reader033/viewer/2022050412/5f892bfee07a644d6834bc06/html5/thumbnails/5.jpg)
SpectralClustering:Graph=Matrix
AB
C
FD
E
GI
HJ
A B C D E F G H I JA 1 1 1B 1 1C 1D 1 1E 1F 1 1 1G 1H 1 1 1I 1 1 1J 1 1
5
![Page 6: Spectral Clusteringwcohen/10-605/unsup-on-graphs.pdf · Spectral Clustering: Graph = Matrix W*v 1 = v 2 “propogatesweights from neighbors” W × v = lv: v is an eigenvecto r with](https://reader033.vdocuments.net/reader033/viewer/2022050412/5f892bfee07a644d6834bc06/html5/thumbnails/6.jpg)
SpectralClustering:Graph=MatrixTransitivelyClosedComponents=“Blocks”
AB
C
FD
E
GI
HJ
A B C D E F G H I JA _ 1 1 1B 1 _ 1C 1 1 _D _ 1 1E 1 _ 1F 1 1 1 _G _ 1 1H _ 1 1I 1 1 _ 1J 1 1 1 _
Of course we can’t see the “blocks” unless the nodesare sorted by cluster…
sometimes called a block-stochastic matrix:• each node has a latent “block”• fixed probability qi for links between
elements of block i• fixed probability q0 for links between
elements of different blocks
6
![Page 7: Spectral Clusteringwcohen/10-605/unsup-on-graphs.pdf · Spectral Clustering: Graph = Matrix W*v 1 = v 2 “propogatesweights from neighbors” W × v = lv: v is an eigenvecto r with](https://reader033.vdocuments.net/reader033/viewer/2022050412/5f892bfee07a644d6834bc06/html5/thumbnails/7.jpg)
SpectralClustering:Graph=MatrixVector=NodeàWeight
H
A B C D E F G H I JA _ 1 1 1B 1 _ 1C 1 1 _D _ 1 1E 1 _ 1F 1 1 1 _G _ 1 1H _ 1 1I 1 1 _ 1J 1 1 1 _
AB
C
FD
E
GI
J
AA 3B 2C 3DEFGHIJ
M v
7
![Page 8: Spectral Clusteringwcohen/10-605/unsup-on-graphs.pdf · Spectral Clustering: Graph = Matrix W*v 1 = v 2 “propogatesweights from neighbors” W × v = lv: v is an eigenvecto r with](https://reader033.vdocuments.net/reader033/viewer/2022050412/5f892bfee07a644d6834bc06/html5/thumbnails/8.jpg)
SpectralClustering:Graph=MatrixM*v1 =v2“propogates weightsfromneighbors”
A B C D E F G H I JA _ 1 1 1B 1 _ 1C 1 1 _D _ 1 1E 1 _ 1F 1 1 _G _ 1 1H _ 1 1I 1 1 _ 1J 1 1 1 _
AB
C
FD
E
I
A 3B 2C 3DEFGHIJ
M
M v1
A 2*1+3*1+0*1
B 3*1+3*1
C 3*1+2*1
DEFGHIJ
v2* =
8
![Page 9: Spectral Clusteringwcohen/10-605/unsup-on-graphs.pdf · Spectral Clustering: Graph = Matrix W*v 1 = v 2 “propogatesweights from neighbors” W × v = lv: v is an eigenvecto r with](https://reader033.vdocuments.net/reader033/viewer/2022050412/5f892bfee07a644d6834bc06/html5/thumbnails/9.jpg)
SpectralClustering:Graph=MatrixW*v1 =v2“propogates weightsfromneighbors”
A B C D E F G H I J
A _ .5 .5 .3
B .3 _ .5
C .3 .5 _
D _ .5 .3
E .5 _ .3
F .3 .5 .5 _
G _ .3 .3
H _ .3 .3
I .5 .5 _ .3
J .5 .5 .3 _
AB
C
FD
E
I
A 3
B 2
C 3
D
E
F
G
H
I
J
W v1
A 2*.5+3*.5+0*.3
B 3*.3+3*.5
C 3*.33+2*.5
D
E
F
G
H
I
J
v2* =W: normalized so columns sum to 1
9
![Page 10: Spectral Clusteringwcohen/10-605/unsup-on-graphs.pdf · Spectral Clustering: Graph = Matrix W*v 1 = v 2 “propogatesweights from neighbors” W × v = lv: v is an eigenvecto r with](https://reader033.vdocuments.net/reader033/viewer/2022050412/5f892bfee07a644d6834bc06/html5/thumbnails/10.jpg)
SpectralClustering:Graph=MatrixW*v1 =v2“propogates weightsfromneighbors”
ll eigenvaluer with eigenvectoan is : vvvW =×
Q: How do I pick v to be an eigenvector
for a block-stochastic matrix?
10
![Page 11: Spectral Clusteringwcohen/10-605/unsup-on-graphs.pdf · Spectral Clustering: Graph = Matrix W*v 1 = v 2 “propogatesweights from neighbors” W × v = lv: v is an eigenvecto r with](https://reader033.vdocuments.net/reader033/viewer/2022050412/5f892bfee07a644d6834bc06/html5/thumbnails/11.jpg)
SpectralClustering:Graph=MatrixW*v1 =v2“propogates weightsfromneighbors”
ll eigenvaluer with eigenvectoan is : vvvW =×
How do I pick v to be an eigenvector
for a block-stochastic matrix?
11
![Page 12: Spectral Clusteringwcohen/10-605/unsup-on-graphs.pdf · Spectral Clustering: Graph = Matrix W*v 1 = v 2 “propogatesweights from neighbors” W × v = lv: v is an eigenvecto r with](https://reader033.vdocuments.net/reader033/viewer/2022050412/5f892bfee07a644d6834bc06/html5/thumbnails/12.jpg)
SpectralClustering:Graph=MatrixW*v1 =v2“propogates weightsfromneighbors”
ll eigenvaluer with eigenvectoan is : vvvW =×
[Shi & Meila, 2002]
λ2
λ3
λ4
λ5,6,7,…
.
λ1e1
e2
e3“eigengap”
12
![Page 13: Spectral Clusteringwcohen/10-605/unsup-on-graphs.pdf · Spectral Clustering: Graph = Matrix W*v 1 = v 2 “propogatesweights from neighbors” W × v = lv: v is an eigenvecto r with](https://reader033.vdocuments.net/reader033/viewer/2022050412/5f892bfee07a644d6834bc06/html5/thumbnails/13.jpg)
SpectralClustering:Graph=MatrixW*v1 =v2“propogates weightsfromneighbors”
ll eigenvaluer with eigenvectoan is : vvvW =×
[Shi & Meila, 2002]
e2
e3
-0.4 -0.2 0 0.2
-0.4
-0.2
0.0
0.2
0.4
xx x xx x
yyyy
y
xx xxx x
zzzzz zzzzz z e1
e2
13
![Page 14: Spectral Clusteringwcohen/10-605/unsup-on-graphs.pdf · Spectral Clustering: Graph = Matrix W*v 1 = v 2 “propogatesweights from neighbors” W × v = lv: v is an eigenvecto r with](https://reader033.vdocuments.net/reader033/viewer/2022050412/5f892bfee07a644d6834bc06/html5/thumbnails/14.jpg)
SpectralClustering:Graph=MatrixW*v1 =v2“propogates weightsfromneighbors”
ll eigenvaluer with eigenvectoan is : vvvW =×If W is connected but roughly block diagonal with k blocks then• the top eigenvector is a constant vector • the next k eigenvectors are roughly piecewise constant with “pieces” corresponding to blocks
14
![Page 15: Spectral Clusteringwcohen/10-605/unsup-on-graphs.pdf · Spectral Clustering: Graph = Matrix W*v 1 = v 2 “propogatesweights from neighbors” W × v = lv: v is an eigenvecto r with](https://reader033.vdocuments.net/reader033/viewer/2022050412/5f892bfee07a644d6834bc06/html5/thumbnails/15.jpg)
SpectralClustering:Graph=MatrixW*v1 =v2“propogates weightsfromneighbors”
ll eigenvaluer with eigenvectoan is : vvvW =×If W is connected but roughly block diagonal with k blocks then• the “top” eigenvector is a constant vector • the next k eigenvectors are roughly piecewise constant with “pieces” corresponding to blocks
Spectral clustering:• Find the top k+1 eigenvectors v1,…,vk+1• Discard the “top” one• Replace every node a with k-dimensional vector xa = <v2(a),…,vk+1 (a) >• Cluster with k-means
15
![Page 16: Spectral Clusteringwcohen/10-605/unsup-on-graphs.pdf · Spectral Clustering: Graph = Matrix W*v 1 = v 2 “propogatesweights from neighbors” W × v = lv: v is an eigenvecto r with](https://reader033.vdocuments.net/reader033/viewer/2022050412/5f892bfee07a644d6834bc06/html5/thumbnails/16.jpg)
Backtothe2-Dexamples
• Create a graph.• Each 2D point is a vertex• Add edges connecting all points in the 2-D initial space to all other points• Weight of edge is distance
b
b
b
b
b
g g
g
g
g
g g
g gr r r r
r r r…
16
![Page 17: Spectral Clusteringwcohen/10-605/unsup-on-graphs.pdf · Spectral Clustering: Graph = Matrix W*v 1 = v 2 “propogatesweights from neighbors” W × v = lv: v is an eigenvecto r with](https://reader033.vdocuments.net/reader033/viewer/2022050412/5f892bfee07a644d6834bc06/html5/thumbnails/17.jpg)
SpectralClustering:ProsandCons• Elegant,andwell-foundedmathematically• Tendstoavoidlocalminima
– Optimalsolutiontorelaxedversionofmincut problem(Normalizedcut,akaNCut)
• Worksquitewellwhenrelationsareapproximatelytransitive(likesimilarity,socialconnections)
• Expensiveforverylargedatasets– Computingeigenvectorsisthebottleneck– Approximateeigenvectorcomputationnotalwaysuseful
• Noisydatasetssometimescauseproblems– Pickingnumberofeigenvectorsandkistricky– “Informative” eigenvectorsneednotbeintopfew– Performancecandropsuddenlyfromgoodtoterrible
17
![Page 18: Spectral Clusteringwcohen/10-605/unsup-on-graphs.pdf · Spectral Clustering: Graph = Matrix W*v 1 = v 2 “propogatesweights from neighbors” W × v = lv: v is an eigenvecto r with](https://reader033.vdocuments.net/reader033/viewer/2022050412/5f892bfee07a644d6834bc06/html5/thumbnails/18.jpg)
Experimentalresults:best-caseassignmentofclasslabelstoclusters
18
![Page 19: Spectral Clusteringwcohen/10-605/unsup-on-graphs.pdf · Spectral Clustering: Graph = Matrix W*v 1 = v 2 “propogatesweights from neighbors” W × v = lv: v is an eigenvecto r with](https://reader033.vdocuments.net/reader033/viewer/2022050412/5f892bfee07a644d6834bc06/html5/thumbnails/19.jpg)
Spectral clustering as Optimization
19
![Page 20: Spectral Clusteringwcohen/10-605/unsup-on-graphs.pdf · Spectral Clustering: Graph = Matrix W*v 1 = v 2 “propogatesweights from neighbors” W × v = lv: v is an eigenvecto r with](https://reader033.vdocuments.net/reader033/viewer/2022050412/5f892bfee07a644d6834bc06/html5/thumbnails/20.jpg)
Spectral Clustering: Graph = MatrixW*v1 = v2 “propogates weights from neighbors”
ll eigenvaluer with eigenvectoan is : vvvW =ו A = adjacency matrix• D = diagonal normalizer for A (# outlinks)• W = A D-1
• smallest eigenvecs of D-A are largest eigenvecs of A• smallest eigenvecs of I-W are largest eigenvecs of W
Suppose each y(i)=+1 or -1:• Then y is a cluster indicator that splits the nodes into two • what is yT(D-A)y ?
20
![Page 21: Spectral Clusteringwcohen/10-605/unsup-on-graphs.pdf · Spectral Clustering: Graph = Matrix W*v 1 = v 2 “propogatesweights from neighbors” W × v = lv: v is an eigenvecto r with](https://reader033.vdocuments.net/reader033/viewer/2022050412/5f892bfee07a644d6834bc06/html5/thumbnails/21.jpg)
úû
ùêë
é-=
úû
ùêë
é-+=
úúû
ù
êêë
é-÷
ø
öçè
æ+÷÷
ø
öççè
æ=
úû
ùêë
é-=
-=-=-
å
ååå
åå åå å
åå
åå
jijiji
jijiji
jijij
jiiij
jijiji
jj
iij
ii
jij
jijiji
iii
jijiji
iii
TTT
yya
yyayaya
yyayaya
yyayd
yyaydADAD
,
2,
,,
,
2
,
2
,,
22
,,
2
,,
2
)(21
221
221
2221
)( yyyyyy
= size of CUT(y)
)NCUT( of size)( yyy =-WIT
NCUT: roughly minimize ratio of transitions between classes vs transitions within classes
Same as smoothness criterion used in SSL but now there are no seeds
21
![Page 22: Spectral Clusteringwcohen/10-605/unsup-on-graphs.pdf · Spectral Clustering: Graph = Matrix W*v 1 = v 2 “propogatesweights from neighbors” W × v = lv: v is an eigenvecto r with](https://reader033.vdocuments.net/reader033/viewer/2022050412/5f892bfee07a644d6834bc06/html5/thumbnails/22.jpg)
Spectral Clustering: Graph = MatrixW*v1 = v2 “propogates weights from neighbors”
ll eigenvaluer with eigenvectoan is : vvvW =ו smallest eigenvecs of D-A are largest eigenvecs of A• smallest eigenvecs of I-W are largest eigenvecs of WSuppose each y(i)=+1 or -1:• Then y is a cluster indicator that cuts the nodes into two • what is yT(D-A)y ? The cost of the graph cut defined by y• what is yT(I-W)y ? Also a cost of a graph cut defined by y• How to minimize it?
• Turns out: to minimize yT X y / (yTy) find smallest eigenvector of X• But: this will not be +1/-1, so it’s a “relaxed” solution
22
![Page 23: Spectral Clusteringwcohen/10-605/unsup-on-graphs.pdf · Spectral Clustering: Graph = Matrix W*v 1 = v 2 “propogatesweights from neighbors” W × v = lv: v is an eigenvecto r with](https://reader033.vdocuments.net/reader033/viewer/2022050412/5f892bfee07a644d6834bc06/html5/thumbnails/23.jpg)
SpectralClustering:Graph=MatrixW*v1 =v2“propogates weightsfromneighbors”
ll eigenvaluer with eigenvectoan is : vvvW =×
[Shi & Meila, 2002]
λ2
λ3
λ4
λ5,6,7,…
.
λ1e1
e2
e3“eigengap”
23
![Page 24: Spectral Clusteringwcohen/10-605/unsup-on-graphs.pdf · Spectral Clustering: Graph = Matrix W*v 1 = v 2 “propogatesweights from neighbors” W × v = lv: v is an eigenvecto r with](https://reader033.vdocuments.net/reader033/viewer/2022050412/5f892bfee07a644d6834bc06/html5/thumbnails/24.jpg)
Another way to think about spectral clustering….
• Mostnormalpeoplethinkaboutspectralclusteringlikethat- asrelaxedoptimization
• …me,notsomuch• Iliketheconnectionto“averaging”…because….itleadstoanicefastvariationofspectralclusteringcalledpoweriterationclustering
24
![Page 25: Spectral Clusteringwcohen/10-605/unsup-on-graphs.pdf · Spectral Clustering: Graph = Matrix W*v 1 = v 2 “propogatesweights from neighbors” W × v = lv: v is an eigenvecto r with](https://reader033.vdocuments.net/reader033/viewer/2022050412/5f892bfee07a644d6834bc06/html5/thumbnails/25.jpg)
Unsupervised Learning on Graphs with Power Iteration
Clustering
25
![Page 26: Spectral Clusteringwcohen/10-605/unsup-on-graphs.pdf · Spectral Clustering: Graph = Matrix W*v 1 = v 2 “propogatesweights from neighbors” W × v = lv: v is an eigenvecto r with](https://reader033.vdocuments.net/reader033/viewer/2022050412/5f892bfee07a644d6834bc06/html5/thumbnails/26.jpg)
Spectral Clustering: Graph = Matrix
AB
C
FD
E
GI
HJ
A B C D E F G H I JA 1 1 1B 1 1C 1D 1 1E 1F 1 1 1G 1H 1 1 1I 1 1 1J 1 1
26
![Page 27: Spectral Clusteringwcohen/10-605/unsup-on-graphs.pdf · Spectral Clustering: Graph = Matrix W*v 1 = v 2 “propogatesweights from neighbors” W × v = lv: v is an eigenvecto r with](https://reader033.vdocuments.net/reader033/viewer/2022050412/5f892bfee07a644d6834bc06/html5/thumbnails/27.jpg)
Spectral Clustering: Graph = MatrixTransitively Closed Components = “Blocks”
AB
C
FD
E
GI
HJ
A B C D E F G H I JA _ 1 1 1B 1 _ 1C 1 1 _D _ 1 1E 1 _ 1F 1 1 1 _G _ 1 1H _ 1 1I 1 1 _ 1J 1 1 1 _
Of course we can’t see the “blocks” unless the nodesare sorted by cluster…
sometimes called a block-stochastic matrix:• each node has a latent “block”• fixed probability qi for links between
elements of block i• fixed probability q0 for links between
elements of different blocks
27
![Page 28: Spectral Clusteringwcohen/10-605/unsup-on-graphs.pdf · Spectral Clustering: Graph = Matrix W*v 1 = v 2 “propogatesweights from neighbors” W × v = lv: v is an eigenvecto r with](https://reader033.vdocuments.net/reader033/viewer/2022050412/5f892bfee07a644d6834bc06/html5/thumbnails/28.jpg)
Spectral Clustering: Graph = MatrixVector = Node à Weight
H
A B C D E F G H I JA _ 1 1 1B 1 _ 1C 1 1 _D _ 1 1E 1 _ 1F 1 1 1 _G _ 1 1H _ 1 1I 1 1 _ 1J 1 1 1 _
AB
C
FD
E
GI
J
AA 3B 2C 3DEFGHIJ
M
M v
28
![Page 29: Spectral Clusteringwcohen/10-605/unsup-on-graphs.pdf · Spectral Clustering: Graph = Matrix W*v 1 = v 2 “propogatesweights from neighbors” W × v = lv: v is an eigenvecto r with](https://reader033.vdocuments.net/reader033/viewer/2022050412/5f892bfee07a644d6834bc06/html5/thumbnails/29.jpg)
SpectralClustering:Graph=MatrixM*v1 =v2“propogates weightsfromneighbors”
A B C D E F G H I JA _ 1 1 1B 1 _ 1C 1 1 _D _ 1 1E 1 _ 1F 1 1 _G _ 1 1H _ 1 1I 1 1 _ 1J 1 1 1 _
AB
C
FD
E
I
A 3B 2C 3DEFGHIJ
M v1
A 2*1+3*1+0*1
B 3*1+3*1
C 3*1+2*1
D
E
F
G
H
I
J
v2* =
29
![Page 30: Spectral Clusteringwcohen/10-605/unsup-on-graphs.pdf · Spectral Clustering: Graph = Matrix W*v 1 = v 2 “propogatesweights from neighbors” W × v = lv: v is an eigenvecto r with](https://reader033.vdocuments.net/reader033/viewer/2022050412/5f892bfee07a644d6834bc06/html5/thumbnails/30.jpg)
SpectralClustering:Graph=MatrixW*v1 =v2“propogates weightsfromneighbors”
A B C D E F G H I J
A _ .5 .5 .3
B .3 _ .5
C .3 .5 _
D _ .5 .3
E .5 _ .3
F .3 .5 .5 _
G _ .3 .3
H _ .3 .3
I .5 .5 _ .3
J .5 .5 .3 _
AB
C
FD
E
I
A 3
B 2
C 3
D
E
F
G
H
I
J
W v1
A 2*.5+3*.5+0*.3
B 3*.3+3*.5
C 3*.33+2*.5
D
E
F
G
H
I
J
v2* =W: normalized so columns sum to 1
30
![Page 31: Spectral Clusteringwcohen/10-605/unsup-on-graphs.pdf · Spectral Clustering: Graph = Matrix W*v 1 = v 2 “propogatesweights from neighbors” W × v = lv: v is an eigenvecto r with](https://reader033.vdocuments.net/reader033/viewer/2022050412/5f892bfee07a644d6834bc06/html5/thumbnails/31.jpg)
Repeatedaveragingwithneighborsasaclusteringmethod
• Pickavectorv0(maybeatrandom)• Computev1 =Wv0
– i.e.,replacev0[x]withweightedaverage ofv0[y]fortheneighborsyofx• Plotv1[x]foreachx• Repeatforv2,v3,…
• Variantswidelyusedforsemi-supervised learning– HF/CoEM/wvRN - average+clampingoflabelsfornodeswithknownlabels
• Withoutclamping,willconvergetoconstantvt
• Whatarethedynamics ofthisprocess?
31
![Page 32: Spectral Clusteringwcohen/10-605/unsup-on-graphs.pdf · Spectral Clustering: Graph = Matrix W*v 1 = v 2 “propogatesweights from neighbors” W × v = lv: v is an eigenvecto r with](https://reader033.vdocuments.net/reader033/viewer/2022050412/5f892bfee07a644d6834bc06/html5/thumbnails/32.jpg)
Repeatedaveragingwithneighborsonasampleproblem…
• Create a graph, connecting all points in the 2-D initial space to all other points
• Weighted by distance• Run power iteration (averaging) for 10 steps• Plot node id x vs v10(x)
• nodes are ordered and colored by actual cluster number
b
b
b
b
b
g g
g
g
g
g g
g gr r r r
r r r…
blue green ___red___
32
![Page 33: Spectral Clusteringwcohen/10-605/unsup-on-graphs.pdf · Spectral Clustering: Graph = Matrix W*v 1 = v 2 “propogatesweights from neighbors” W × v = lv: v is an eigenvecto r with](https://reader033.vdocuments.net/reader033/viewer/2022050412/5f892bfee07a644d6834bc06/html5/thumbnails/33.jpg)
Repeatedaveragingwithneighborsonasampleproblem…
blue green ___red___blue green ___red___ blue green ___red___
smal
ler
larg
er
33
![Page 34: Spectral Clusteringwcohen/10-605/unsup-on-graphs.pdf · Spectral Clustering: Graph = Matrix W*v 1 = v 2 “propogatesweights from neighbors” W × v = lv: v is an eigenvecto r with](https://reader033.vdocuments.net/reader033/viewer/2022050412/5f892bfee07a644d6834bc06/html5/thumbnails/34.jpg)
Repeatedaveragingwithneighborsonasampleproblem…
blue green ___red___ blue green ___red___blue green ___red___
blue green ___red___blue green ___red___
34
![Page 35: Spectral Clusteringwcohen/10-605/unsup-on-graphs.pdf · Spectral Clustering: Graph = Matrix W*v 1 = v 2 “propogatesweights from neighbors” W × v = lv: v is an eigenvecto r with](https://reader033.vdocuments.net/reader033/viewer/2022050412/5f892bfee07a644d6834bc06/html5/thumbnails/35.jpg)
Repeatedaveragingwithneighborsonasampleproblem…
very
smal
l
35
![Page 36: Spectral Clusteringwcohen/10-605/unsup-on-graphs.pdf · Spectral Clustering: Graph = Matrix W*v 1 = v 2 “propogatesweights from neighbors” W × v = lv: v is an eigenvecto r with](https://reader033.vdocuments.net/reader033/viewer/2022050412/5f892bfee07a644d6834bc06/html5/thumbnails/36.jpg)
PIC:PowerIterationClusteringrunpoweriteration(repeatedaveragingw/neighbors)with
earlystopping
– V0:randomstart,or“degreematrix” D,or…– Easytoimplementandefficient– Veryeasilyparallelized
– Experimentally,oftenbetterthan traditionalspectralmethods
– Surprisingsincetheembeddedspaceis1-dimensional!
36
![Page 37: Spectral Clusteringwcohen/10-605/unsup-on-graphs.pdf · Spectral Clustering: Graph = Matrix W*v 1 = v 2 “propogatesweights from neighbors” W × v = lv: v is an eigenvecto r with](https://reader033.vdocuments.net/reader033/viewer/2022050412/5f892bfee07a644d6834bc06/html5/thumbnails/37.jpg)
Experiments
• “Network” problems:naturalgraphstructure– PolBooks:105politicalbooks,3classes,linkedbycopurchaser– UMBCBlog:404politicalblogs,2classes,blogroll links– AGBlog:1222politicalblogs,2classes,blogroll links
• “Manifold” problems:cosinedistancebetweenclassificationinstances– Iris:150flowers,3classes– PenDigits01,17:200handwrittendigits,2classes(0-1or1-7)– 20ngA:200docs,misc.forsale vs soc.religion.christian– 20ngB:400docs,misc.forsale vs soc.religion.christian– 20ngC:20ngB+200docsfromtalk.politics.guns– 20ngD:20ngC+200docsfromrec.sport.baseball
37
![Page 38: Spectral Clusteringwcohen/10-605/unsup-on-graphs.pdf · Spectral Clustering: Graph = Matrix W*v 1 = v 2 “propogatesweights from neighbors” W × v = lv: v is an eigenvecto r with](https://reader033.vdocuments.net/reader033/viewer/2022050412/5f892bfee07a644d6834bc06/html5/thumbnails/38.jpg)
Experimentalresults:best-caseassignmentofclasslabelstoclusters
38
![Page 39: Spectral Clusteringwcohen/10-605/unsup-on-graphs.pdf · Spectral Clustering: Graph = Matrix W*v 1 = v 2 “propogatesweights from neighbors” W × v = lv: v is an eigenvecto r with](https://reader033.vdocuments.net/reader033/viewer/2022050412/5f892bfee07a644d6834bc06/html5/thumbnails/39.jpg)
39
![Page 40: Spectral Clusteringwcohen/10-605/unsup-on-graphs.pdf · Spectral Clustering: Graph = Matrix W*v 1 = v 2 “propogatesweights from neighbors” W × v = lv: v is an eigenvecto r with](https://reader033.vdocuments.net/reader033/viewer/2022050412/5f892bfee07a644d6834bc06/html5/thumbnails/40.jpg)
Experiments:runtimeandscalability
Time in millisec
40
![Page 41: Spectral Clusteringwcohen/10-605/unsup-on-graphs.pdf · Spectral Clustering: Graph = Matrix W*v 1 = v 2 “propogatesweights from neighbors” W × v = lv: v is an eigenvecto r with](https://reader033.vdocuments.net/reader033/viewer/2022050412/5f892bfee07a644d6834bc06/html5/thumbnails/41.jpg)
More experiments
VaryingthenumberofclustersforPICandPIC4(startswithrandom4-dpointratherthanarandom1-dpoint).
41
![Page 42: Spectral Clusteringwcohen/10-605/unsup-on-graphs.pdf · Spectral Clustering: Graph = Matrix W*v 1 = v 2 “propogatesweights from neighbors” W × v = lv: v is an eigenvecto r with](https://reader033.vdocuments.net/reader033/viewer/2022050412/5f892bfee07a644d6834bc06/html5/thumbnails/42.jpg)
More experiments
More“real”networkdatasetsfromvariousdomains
42
![Page 43: Spectral Clusteringwcohen/10-605/unsup-on-graphs.pdf · Spectral Clustering: Graph = Matrix W*v 1 = v 2 “propogatesweights from neighbors” W × v = lv: v is an eigenvecto r with](https://reader033.vdocuments.net/reader033/viewer/2022050412/5f892bfee07a644d6834bc06/html5/thumbnails/43.jpg)
More experiments
43
![Page 44: Spectral Clusteringwcohen/10-605/unsup-on-graphs.pdf · Spectral Clustering: Graph = Matrix W*v 1 = v 2 “propogatesweights from neighbors” W × v = lv: v is an eigenvecto r with](https://reader033.vdocuments.net/reader033/viewer/2022050412/5f892bfee07a644d6834bc06/html5/thumbnails/44.jpg)
44
![Page 45: Spectral Clusteringwcohen/10-605/unsup-on-graphs.pdf · Spectral Clustering: Graph = Matrix W*v 1 = v 2 “propogatesweights from neighbors” W × v = lv: v is an eigenvecto r with](https://reader033.vdocuments.net/reader033/viewer/2022050412/5f892bfee07a644d6834bc06/html5/thumbnails/45.jpg)
LEARNING ON GRAPHS FOR NON-GRAPH DATASETS
45
![Page 46: Spectral Clusteringwcohen/10-605/unsup-on-graphs.pdf · Spectral Clustering: Graph = Matrix W*v 1 = v 2 “propogatesweights from neighbors” W × v = lv: v is an eigenvecto r with](https://reader033.vdocuments.net/reader033/viewer/2022050412/5f892bfee07a644d6834bc06/html5/thumbnails/46.jpg)
Why I’m talking about graphs• Lotsoflargedataisgraphs
– Facebook,Twitter,citationdata,andothersocialnetworks
– Theweb,theblogosphere,thesemanticweb,Freebase,Wikipedia,Twitter,andotherinformation networks
– Textcorpora(likeRCV1),largedatasetswithdiscretefeaturevalues,andotherbipartitenetworks
• nodes=documentsorwords• linksconnectdocumentàwordorwordà document
– Computernetworks,biologicalnetworks(proteins,ecosystems,brains,…),…
– Heterogeneousnetworkswithmultipletypesofnodes• people,groups,documents
46
![Page 47: Spectral Clusteringwcohen/10-605/unsup-on-graphs.pdf · Spectral Clustering: Graph = Matrix W*v 1 = v 2 “propogatesweights from neighbors” W × v = lv: v is an eigenvecto r with](https://reader033.vdocuments.net/reader033/viewer/2022050412/5f892bfee07a644d6834bc06/html5/thumbnails/47.jpg)
proposal
CMU
CALO
graph
William
Simplest Case: Bi-partite Graphs
47
![Page 48: Spectral Clusteringwcohen/10-605/unsup-on-graphs.pdf · Spectral Clustering: Graph = Matrix W*v 1 = v 2 “propogatesweights from neighbors” W × v = lv: v is an eigenvecto r with](https://reader033.vdocuments.net/reader033/viewer/2022050412/5f892bfee07a644d6834bc06/html5/thumbnails/48.jpg)
Motivation:ExperimentalDatasets`UsedforPICare…
• “Network” problems:naturalgraphstructure– PolBooks:105politicalbooks,3classes,linkedbycopurchaser– UMBCBlog:404politicalblogs,2classes,blogroll links– AGBlog:1222politicalblogs,2classes,blogroll links– Also:Zachary’skarateclub,citationnetworks,...
• “Manifold” problems:cosinedistancebetweenallpairsofclassificationinstances– Iris:150flowers,3classes– PenDigits01,17:200handwrittendigits,2classes(0-1or1-7)– 20ngA:200docs,misc.forsale vs soc.religion.christian– 20ngB:400docs,misc.forsale vs soc.religion.christian– …
Gets expensive fast
48
![Page 49: Spectral Clusteringwcohen/10-605/unsup-on-graphs.pdf · Spectral Clustering: Graph = Matrix W*v 1 = v 2 “propogatesweights from neighbors” W × v = lv: v is an eigenvecto r with](https://reader033.vdocuments.net/reader033/viewer/2022050412/5f892bfee07a644d6834bc06/html5/thumbnails/49.jpg)
SpectralClustering:Graph=MatrixA*v1 =v2“propogates weightsfromneighbors”
A B C D E F G H I JA _ 1 1 1B 1 _ 1C 1 1 _D _ 1 1E 1 _ 1F 1 1 _G _ 1 1H _ 1 1I 1 1 _ 1J 1 1 1 _
AB
C
FD
E
I
A 3B 2C 3DEFGHIJ
M
A v1
A 2*1+3*1+0*1
B 3*1+3*1
C 3*1+2*1
D
E
F
G
H
I
J
v2* =
49
![Page 50: Spectral Clusteringwcohen/10-605/unsup-on-graphs.pdf · Spectral Clustering: Graph = Matrix W*v 1 = v 2 “propogatesweights from neighbors” W × v = lv: v is an eigenvecto r with](https://reader033.vdocuments.net/reader033/viewer/2022050412/5f892bfee07a644d6834bc06/html5/thumbnails/50.jpg)
SpectralClustering:Graph=MatrixW*v1 =v2“propogates weightsfromneighbors”
A B C D E F G H I J
A _ .5 .5 .3
B .3 _ .5
C .3 .5 _
D _ .5 .3
E .5 _ .3
F .3 .5 .5 _
G _ .3 .3
H _ .3 .3
I .5 .5 _ .3
J .5 .5 .3 _
AB
C
FD
E
I
A 3
B 2
C 3
D
E
F
G
H
I
J
W v1
A 2*.5+3*.5+0*.3
B 3*.3+3*.5
C 3*.33+2*.5
D
E
F
G
H
I
J
v2* =W: normalized so columns sum to 1
W = D-1*A D[i,i]=1/degree(i)50
![Page 51: Spectral Clusteringwcohen/10-605/unsup-on-graphs.pdf · Spectral Clustering: Graph = Matrix W*v 1 = v 2 “propogatesweights from neighbors” W × v = lv: v is an eigenvecto r with](https://reader033.vdocuments.net/reader033/viewer/2022050412/5f892bfee07a644d6834bc06/html5/thumbnails/51.jpg)
Lazycomputationofdistancesandnormalizers
• RecallPIC’supdateis– vt =W*vt-1==D-1A*vt-1
– …whereDisthe[diagonal]degreematrix:D=A*1• Myfavoritedistancemetricfortextislength-normalizedTFIDF:– Def’n:A(i,j)=<vi,vj>/||vi||*||vj||– LetN(i,i)=||vi||…andN(i,j)=0fori!=j– LetF(i,k)=TFIDFweightofwordwk indocumentvi–Then:A=N-1FTFN-1
<u,v>=inner product||u|| is L2-norm
1 is a column vector of 1’s
51
![Page 52: Spectral Clusteringwcohen/10-605/unsup-on-graphs.pdf · Spectral Clustering: Graph = Matrix W*v 1 = v 2 “propogatesweights from neighbors” W × v = lv: v is an eigenvecto r with](https://reader033.vdocuments.net/reader033/viewer/2022050412/5f892bfee07a644d6834bc06/html5/thumbnails/52.jpg)
Lazycomputationofdistancesandnormalizers
• RecallPIC’supdateis– vt =W*vt-1==D-1A*vt-1
– …whereDisthe[diagonal]degreematrix:D=A*1– LetF(i,k)=TFIDFweightofwordwk indocumentvi– ComputeN(i,i)=||vi||…andN(i,j)=0fori!=j– Don’t computeA=N-1FTFN-1
– LetD(i,i)=N-1FTFN-1*1where1isanall-1’svector• ComputedasD=N-1(FT(F(N-1*1)))forefficiency
–Newupdate:• vt =D-1A*vt-1 =D-1 N-1FTFN-1*vt-1
Equivalent to using TFIDF/cosine on all pairs of examples but requires only
sparse matrices
52
![Page 53: Spectral Clusteringwcohen/10-605/unsup-on-graphs.pdf · Spectral Clustering: Graph = Matrix W*v 1 = v 2 “propogatesweights from neighbors” W × v = lv: v is an eigenvecto r with](https://reader033.vdocuments.net/reader033/viewer/2022050412/5f892bfee07a644d6834bc06/html5/thumbnails/53.jpg)
Experimentalresults
• RCV1textclassificationdataset– 800k+newswirestories– Categorylabelsfromindustry vocabulary– Tooksingle-labeldocumentsandcategorieswithatleast500instances
– Result:193,844documents,103categories• Generated100randomcategorypairs
– Eachisalldocumentsfromtwocategories– Rangeinsizeanddifficulty– Pickcategory1,withm1 examples– Pickcategory2suchthat0.5m1<m2<2m1
53
![Page 54: Spectral Clusteringwcohen/10-605/unsup-on-graphs.pdf · Spectral Clustering: Graph = Matrix W*v 1 = v 2 “propogatesweights from neighbors” W × v = lv: v is an eigenvecto r with](https://reader033.vdocuments.net/reader033/viewer/2022050412/5f892bfee07a644d6834bc06/html5/thumbnails/54.jpg)
Results
•NCUTevd: Ncut with exact eigenvectors•NCUTiram: Implicit restarted Arnoldi method•No stat. signif. diffs between NCUTevd and PIC
54
![Page 55: Spectral Clusteringwcohen/10-605/unsup-on-graphs.pdf · Spectral Clustering: Graph = Matrix W*v 1 = v 2 “propogatesweights from neighbors” W × v = lv: v is an eigenvecto r with](https://reader033.vdocuments.net/reader033/viewer/2022050412/5f892bfee07a644d6834bc06/html5/thumbnails/55.jpg)
Results
55
![Page 56: Spectral Clusteringwcohen/10-605/unsup-on-graphs.pdf · Spectral Clustering: Graph = Matrix W*v 1 = v 2 “propogatesweights from neighbors” W × v = lv: v is an eigenvecto r with](https://reader033.vdocuments.net/reader033/viewer/2022050412/5f892bfee07a644d6834bc06/html5/thumbnails/56.jpg)
Results
56
![Page 57: Spectral Clusteringwcohen/10-605/unsup-on-graphs.pdf · Spectral Clustering: Graph = Matrix W*v 1 = v 2 “propogatesweights from neighbors” W × v = lv: v is an eigenvecto r with](https://reader033.vdocuments.net/reader033/viewer/2022050412/5f892bfee07a644d6834bc06/html5/thumbnails/57.jpg)
Results
57
![Page 58: Spectral Clusteringwcohen/10-605/unsup-on-graphs.pdf · Spectral Clustering: Graph = Matrix W*v 1 = v 2 “propogatesweights from neighbors” W × v = lv: v is an eigenvecto r with](https://reader033.vdocuments.net/reader033/viewer/2022050412/5f892bfee07a644d6834bc06/html5/thumbnails/58.jpg)
Results• Linearrun-timeimpliesconstantnumberofiterations
• Numberofiterationsto“acceleration-convergence” ishardtoanalyze:–Fasterthanasinglecompleterunofpoweriterationtoconvergence
–Onourdatasets• 10-20iterationsistypical• 30-35isexceptional
58
![Page 59: Spectral Clusteringwcohen/10-605/unsup-on-graphs.pdf · Spectral Clustering: Graph = Matrix W*v 1 = v 2 “propogatesweights from neighbors” W × v = lv: v is an eigenvecto r with](https://reader033.vdocuments.net/reader033/viewer/2022050412/5f892bfee07a644d6834bc06/html5/thumbnails/59.jpg)
59
![Page 60: Spectral Clusteringwcohen/10-605/unsup-on-graphs.pdf · Spectral Clustering: Graph = Matrix W*v 1 = v 2 “propogatesweights from neighbors” W × v = lv: v is an eigenvecto r with](https://reader033.vdocuments.net/reader033/viewer/2022050412/5f892bfee07a644d6834bc06/html5/thumbnails/60.jpg)
PIC AND SSL
60
![Page 61: Spectral Clusteringwcohen/10-605/unsup-on-graphs.pdf · Spectral Clustering: Graph = Matrix W*v 1 = v 2 “propogatesweights from neighbors” W × v = lv: v is an eigenvecto r with](https://reader033.vdocuments.net/reader033/viewer/2022050412/5f892bfee07a644d6834bc06/html5/thumbnails/61.jpg)
PIC:PowerIterationClusteringrunpoweriteration(repeatedaveragingw/neighbors)with
earlystopping
61
![Page 62: Spectral Clusteringwcohen/10-605/unsup-on-graphs.pdf · Spectral Clustering: Graph = Matrix W*v 1 = v 2 “propogatesweights from neighbors” W × v = lv: v is an eigenvecto r with](https://reader033.vdocuments.net/reader033/viewer/2022050412/5f892bfee07a644d6834bc06/html5/thumbnails/62.jpg)
HarmonicFunctions/CoEM/wvRN
thenreplacevt+1(i)withseedvalues+1/-1forlabeleddata
for5-10iterations
Classifydatausingfinalvaluesfromv
62
![Page 63: Spectral Clusteringwcohen/10-605/unsup-on-graphs.pdf · Spectral Clustering: Graph = Matrix W*v 1 = v 2 “propogatesweights from neighbors” W × v = lv: v is an eigenvecto r with](https://reader033.vdocuments.net/reader033/viewer/2022050412/5f892bfee07a644d6834bc06/html5/thumbnails/63.jpg)
ImplicitManifoldsontheNELLdatasets
Paris
live in arg1
San FranciscoAustin
traits such as arg1
anxiety
mayor of arg1
Pittsburgh
Seattle
denial
arg1 is home of
selfishness
Nodes “near” seeds Nodes “far from” seeds
arrogancetraits such as arg1
denialselfishness
63
![Page 64: Spectral Clusteringwcohen/10-605/unsup-on-graphs.pdf · Spectral Clustering: Graph = Matrix W*v 1 = v 2 “propogatesweights from neighbors” W × v = lv: v is an eigenvecto r with](https://reader033.vdocuments.net/reader033/viewer/2022050412/5f892bfee07a644d6834bc06/html5/thumbnails/64.jpg)
Using the Manifold Trick for SSL
64
![Page 65: Spectral Clusteringwcohen/10-605/unsup-on-graphs.pdf · Spectral Clustering: Graph = Matrix W*v 1 = v 2 “propogatesweights from neighbors” W × v = lv: v is an eigenvecto r with](https://reader033.vdocuments.net/reader033/viewer/2022050412/5f892bfee07a644d6834bc06/html5/thumbnails/65.jpg)
Using the Manifold Trick for SSL
65
![Page 66: Spectral Clusteringwcohen/10-605/unsup-on-graphs.pdf · Spectral Clustering: Graph = Matrix W*v 1 = v 2 “propogatesweights from neighbors” W × v = lv: v is an eigenvecto r with](https://reader033.vdocuments.net/reader033/viewer/2022050412/5f892bfee07a644d6834bc06/html5/thumbnails/66.jpg)
Using the Manifold Trick for SSL
66
![Page 67: Spectral Clusteringwcohen/10-605/unsup-on-graphs.pdf · Spectral Clustering: Graph = Matrix W*v 1 = v 2 “propogatesweights from neighbors” W × v = lv: v is an eigenvecto r with](https://reader033.vdocuments.net/reader033/viewer/2022050412/5f892bfee07a644d6834bc06/html5/thumbnails/67.jpg)
Using the Manifold Trick for SSL
67
![Page 68: Spectral Clusteringwcohen/10-605/unsup-on-graphs.pdf · Spectral Clustering: Graph = Matrix W*v 1 = v 2 “propogatesweights from neighbors” W × v = lv: v is an eigenvecto r with](https://reader033.vdocuments.net/reader033/viewer/2022050412/5f892bfee07a644d6834bc06/html5/thumbnails/68.jpg)
Using the Manifold Trick for SSL
68
![Page 69: Spectral Clusteringwcohen/10-605/unsup-on-graphs.pdf · Spectral Clustering: Graph = Matrix W*v 1 = v 2 “propogatesweights from neighbors” W × v = lv: v is an eigenvecto r with](https://reader033.vdocuments.net/reader033/viewer/2022050412/5f892bfee07a644d6834bc06/html5/thumbnails/69.jpg)
Using the Manifold Trick for SSLA smoothing trick:
69
![Page 70: Spectral Clusteringwcohen/10-605/unsup-on-graphs.pdf · Spectral Clustering: Graph = Matrix W*v 1 = v 2 “propogatesweights from neighbors” W × v = lv: v is an eigenvecto r with](https://reader033.vdocuments.net/reader033/viewer/2022050412/5f892bfee07a644d6834bc06/html5/thumbnails/70.jpg)
Using the Manifold Trick for SSL
70