Session 05: Clustering Applications
TRANSCRIPT
Overview of Clustering
Rong Jin
Outline
K-means for clustering
Expectation-Maximization algorithm for clustering
Spectral clustering (if time permits)
Clustering
(Scatter plot: income ($) vs. age)
Find the underlying structure of the given data points
Application (I): Search Result Clustering
Application (II): Navigation
Application (III): Google News
Application (IV): Visualization
Islands of Music (Pampalk et al., KDD '03)
Application (V): Image Compression
http://www.ece.neu.edu/groups/rpl/kmeans/
How to Find a Good Clustering?

Minimize the sum of squared distances within the clusters $C_1, \ldots, C_K$:

$$\arg\min_{C,\, m} \sum_{j=1}^{K} \sum_{i=1}^{n} m_{i,j}\, \| x_i - C_j \|^2$$

where

$$m_{i,j} = \begin{cases} 1 & x_i \in \text{the } j\text{-th cluster} \\ 0 & x_i \notin \text{the } j\text{-th cluster} \end{cases} \qquad \sum_{j=1}^{K} m_{i,j} = 1 \quad (\text{any } x_i \text{ belongs to a single cluster})$$
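The objective above can be evaluated directly as a sanity check. This is a minimal sketch; the small data matrix, centers, and membership matrix below are illustrative assumptions, not part of the slides.

```python
# Evaluate the k-means objective: sum_j sum_i m[i, j] * ||x_i - C_j||^2.
import numpy as np

def kmeans_objective(X, centers, m):
    # Squared distance from every point to every center: shape (n, K).
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    # Memberships select exactly one squared distance per point.
    return (m * d2).sum()

X = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0]])
centers = np.array([[0.0, 0.5], [10.0, 10.0]])
m = np.array([[1, 0], [1, 0], [0, 1]])  # each row sums to 1: one cluster per point
print(kmeans_objective(X, centers, m))  # 0.25 + 0.25 + 0.0 = 0.5
```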
How to Efficiently Cluster Data?

$$\arg\min_{C,\, m} \sum_{j=1}^{K} \sum_{i=1}^{n} m_{i,j}\, \| x_i - C_j \|^2$$

The memberships $m_{i,j}$ and the centers $C_j$ are correlated, so optimize them alternately.

Given the memberships $m_{i,j}$:

$$C_j = \frac{\sum_{i=1}^{n} m_{i,j}\, x_i}{\sum_{i=1}^{n} m_{i,j}}$$

Given the centers $\{C_j\}$:

$$m_{i,j} = \begin{cases} 1 & j = \arg\min_k \| x_i - C_k \|^2 \\ 0 & \text{otherwise} \end{cases}$$
K-means for Clustering
Start with a random guess of cluster centers
Determine the membership of each data point
Adjust the cluster centers
K-means
1. Ask the user how many clusters they'd like (e.g., k = 5).
2. Randomly guess k cluster-center locations.
3. Each data point finds out which center it's closest to (thus each center "owns" a set of data points).
4. Each center finds the centroid of the points it owns.

Any computational problem? Each iteration computes the distance from every point to every center, i.e., O(Nk) distance computations, where N is the number of points.
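The four steps above can be sketched as Lloyd's algorithm in NumPy. The toy data, the choice of k, and the convergence test are illustrative assumptions.

```python
# A NumPy sketch of the four steps (Lloyd's algorithm); data, k, and the
# convergence test are illustrative assumptions.
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]  # step 2: random centers
    labels = np.zeros(len(X), dtype=int)
    for _ in range(n_iter):
        # Step 3: each data point finds the center it is closest to.
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Step 4: each center moves to the centroid of the points it owns.
        new_centers = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                                else centers[j] for j in range(k)])
        converged = np.allclose(new_centers, centers)
        centers = new_centers
        if converged:
            break
    return centers, labels

# Two well-separated blobs of identical points.
X = np.vstack([np.zeros((5, 2)), 10.0 + np.zeros((5, 2))])
centers, labels = kmeans(X, k=2)
```

Keeping an empty cluster's old center (the `else centers[j]` branch) is one common way to avoid a division by zero when a center owns no points.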
Improve K-means
Group points by region: KD-tree, SR-tree
Key difference:
Find the closest center for each rectangle
Assign all the points within a rectangle to one cluster
A Gaussian Mixture Model for Clustering
Assume the data are generated from a mixture of Gaussian distributions
For each Gaussian distribution: center $\mu_j$, variance $\sigma_j^2$ (ignored here)
For each data point, determine its membership: $z_{ij} = 1$ if $x_i$ belongs to the $j$-th cluster
Learning a Gaussian Mixture (with known covariance)

Probability that the mixture generates $x = x_i$:

$$p(x = x_i) = \sum_j p(x = x_i, \theta_j) = \sum_j p(\theta_j)\, p(x = x_i \mid \theta_j) = \sum_j p(\theta_j)\, \frac{1}{(2\pi\sigma^2)^{d/2}} \exp\!\left( -\frac{\| x_i - \mu_j \|^2}{2\sigma^2} \right)$$
Log-likelihood of the data:

$$\log p(\mathcal{D}) = \sum_i \log p(x = x_i) = \sum_i \log \sum_j p(\theta_j)\, \frac{1}{(2\pi\sigma^2)^{d/2}} \exp\!\left( -\frac{\| x_i - \mu_j \|^2}{2\sigma^2} \right)$$

Apply MLE to find the optimal parameters $\{ p(\theta_j), \mu_j \}$.
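The log-likelihood above can be evaluated directly for the spherical, shared-variance case the slides assume; the single-component check at the end is an illustrative assumption.

```python
# Evaluate log p(D) = sum_i log sum_j p(theta_j) (2 pi sigma^2)^(-d/2)
#                                 * exp(-||x_i - mu_j||^2 / (2 sigma^2)).
import numpy as np

def gmm_log_likelihood(X, mu, prior, sigma2):
    n, d = X.shape
    d2 = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(axis=2)  # (n, K)
    comp = prior * (2 * np.pi * sigma2) ** (-d / 2) * np.exp(-d2 / (2 * sigma2))
    return np.log(comp.sum(axis=1)).sum()

# One standard-normal component: log p(x = 0) = -0.5 * log(2 * pi).
val = gmm_log_likelihood(np.array([[0.0]]), np.array([[0.0]]), np.array([1.0]), 1.0)
```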
E-step: compute the expected memberships

$$E[z_{ij}] = p(\theta_j \mid x_i) = \frac{p(x_i \mid \theta_j)\, p(\theta_j)}{\sum_{n=1}^{K} p(x_i \mid \theta_n)\, p(\theta_n)} = \frac{\exp\!\left( -\frac{\| x_i - \mu_j \|^2}{2\sigma^2} \right) p(\theta_j)}{\sum_{n=1}^{K} \exp\!\left( -\frac{\| x_i - \mu_n \|^2}{2\sigma^2} \right) p(\theta_n)}$$
M-step: re-estimate the parameters

$$\mu_j = \frac{\sum_{i=1}^{m} E[z_{ij}]\, x_i}{\sum_{i=1}^{m} E[z_{ij}]} \qquad p(\theta_j) = \frac{1}{m} \sum_{i=1}^{m} E[z_{ij}]$$
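One combined E-step/M-step update can be sketched as follows for a spherical mixture with known, shared variance sigma^2; the two-blob toy data are an illustrative assumption.

```python
# One E-step + M-step for a spherical Gaussian mixture with known, shared
# variance sigma^2; the toy data below are an assumption.
import numpy as np

def em_step(X, mu, prior, sigma2):
    # E-step: E[z_ij] proportional to p(theta_j) * exp(-||x_i - mu_j||^2 / (2 sigma^2)).
    d2 = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(axis=2)
    resp = prior * np.exp(-d2 / (2 * sigma2))
    resp /= resp.sum(axis=1, keepdims=True)
    # M-step: mu_j = sum_i E[z_ij] x_i / sum_i E[z_ij];  p(theta_j) = (1/m) sum_i E[z_ij].
    weights = resp.sum(axis=0)
    mu_new = (resp.T @ X) / weights[:, None]
    prior_new = weights / len(X)
    return mu_new, prior_new, resp

X = np.vstack([np.zeros((20, 2)), 10.0 + np.zeros((20, 2))])
mu, prior, resp = em_step(X, np.array([[1.0, 1.0], [9.0, 9.0]]),
                          np.array([0.5, 0.5]), sigma2=1.0)
```

The shared Gaussian normalizer cancels in the E-step ratio, which is why it can be dropped from `resp`.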
Gaussian Mixture Example: Start
After First Iteration
After 2nd Iteration
After 3rd Iteration
After 4th Iteration
After 5th Iteration
After 6th Iteration
After 20th Iteration
Mixture Model for Doc Clustering

A set of language models $\theta_1, \theta_2, \ldots, \theta_K$, where $\theta_i = \{ p(w_1 \mid \theta_i), p(w_2 \mid \theta_i), \ldots, p(w_V \mid \theta_i) \}$.

Probability that the mixture generates document $d_i$:

$$p(d = d_i) = \sum_j p(d = d_i, \theta_j) = \sum_j p(\theta_j)\, p(d = d_i \mid \theta_j) = \sum_j p(\theta_j) \prod_{k=1}^{V} p(w_k \mid \theta_j)^{tf(w_k, d_i)}$$

Introduce a hidden variable $z_{ij}$: $z_{ij} = 1$ if document $d_i$ is generated by the $j$-th language model $\theta_j$.
Learning a Mixture Model

E-step ($K$: number of language models):

$$E[z_{ij}] = p(\theta_j \mid d_i) = \frac{p(d_i \mid \theta_j)\, p(\theta_j)}{\sum_{n=1}^{K} p(d_i \mid \theta_n)\, p(\theta_n)} = \frac{\left[ \prod_{m=1}^{V} p(w_m \mid \theta_j)^{tf(w_m, d_i)} \right] p(\theta_j)}{\sum_{n=1}^{K} \left[ \prod_{m=1}^{V} p(w_m \mid \theta_n)^{tf(w_m, d_i)} \right] p(\theta_n)}$$
M-step ($N$: number of documents):

$$p(\theta_j) = \frac{1}{N} \sum_{i=1}^{N} E[z_{ij}] \qquad p(w_k \mid \theta_j) = \frac{\sum_{i=1}^{N} E[z_{ij}]\, tf(w_k, d_i)}{\sum_{i=1}^{N} E[z_{ij}]\, |d_i|}$$
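The E-step and M-step above can be sketched for the multinomial (language-model) mixture. The tiny term-frequency matrix and initial parameters are illustrative assumptions; the E-step works in log space for numerical stability.

```python
# One E-step + M-step for the multinomial (language-model) mixture; the toy
# term-frequency matrix and starting parameters are assumptions.
import numpy as np

def em_step(tf, p_w, prior):
    """tf: (N, V) term frequencies; p_w: (K, V) word probabilities; prior: (K,)."""
    # E-step in log space: log p(d_i | theta_j) = sum_k tf[i, k] * log p(w_k | theta_j).
    log_post = tf @ np.log(p_w).T + np.log(prior)       # (N, K), unnormalized
    log_post -= log_post.max(axis=1, keepdims=True)     # stabilize before exp
    resp = np.exp(log_post)
    resp /= resp.sum(axis=1, keepdims=True)             # E[z_ij]
    # M-step: reweighted word counts, normalized over the vocabulary.
    counts = resp.T @ tf                                # (K, V)
    p_w_new = counts / counts.sum(axis=1, keepdims=True)
    prior_new = resp.mean(axis=0)                       # p(theta_j) = (1/N) sum_i E[z_ij]
    return p_w_new, prior_new, resp

# Two groups of documents over a 4-word vocabulary with disjoint usage.
tf = np.array([[5., 5., 0., 0.], [4., 6., 0., 0.], [0., 0., 5., 5.], [0., 0., 6., 4.]])
p_w0 = np.array([[0.4, 0.4, 0.1, 0.1], [0.1, 0.1, 0.4, 0.4]])
p_w, prior, resp = em_step(tf, p_w0, np.array([0.5, 0.5]))
```

Note that the M-step denominator here, the row sum of `counts`, equals the slide's sum of E[z_ij] times document length |d_i|, since the row sum of `tf` is |d_i|.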
Examples of Mixture Models
Other Mixture Models
Probabilistic Latent Semantic Indexing (PLSI)
Latent Dirichlet Allocation (LDA)
Problems (I)
Both k-means and mixture models need to compute cluster centers and an explicit distance measurement
Given a strange distance measurement, the centers of the clusters can be hard to compute, e.g.,

$$\| x - x' \| = \max\left( | x_1 - x_1' |, | x_2 - x_2' |, \ldots, | x_n - x_n' | \right)$$

(Figure: points x, y, z compared under this distance.)
Problems (II)
Both k-means and mixture models look for compact clustering structures
In some cases, connected clustering structures are more desirable
Graph Partition
MinCut: bipartition the graph with the minimal number of cut edges
(Example: CutSize = 2)
2-way Spectral Graph Partitioning

Weight matrix $W$: $w_{i,j}$ is the weight between vertices $i$ and $j$.

Membership vector $q$:

$$q_i = \begin{cases} +1 & i \in \text{cluster } A \\ -1 & i \in \text{cluster } B \end{cases}$$

$$\arg\min_{q \in \{-1, 1\}^n} J = \text{CutSize} = \frac{1}{4} \sum_{i,j} w_{i,j}\, (q_i - q_j)^2$$
Solving the Optimization Problem

$$\arg\min_{q \in \{-1, 1\}^n} \frac{1}{4} \sum_{i,j} w_{i,j}\, (q_i - q_j)^2$$

Directly solving the above problem requires a combinatorial search of exponential complexity. How can the computational complexity be reduced?
Relaxation Approach

Key difficulty: each $q_i$ has to be either $-1$ or $+1$. Relax $q_i$ to be any real number, and impose the constraint $\sum_{i=1}^{n} q_i^2 = n$.

$$J = \frac{1}{4} \sum_{i,j} w_{i,j}\, (q_i - q_j)^2 = \frac{1}{4} \sum_{i,j} w_{i,j}\, (q_i^2 + q_j^2 - 2 q_i q_j) = \frac{1}{2} \sum_i d_i\, q_i^2 - \frac{1}{2} \sum_{i,j} w_{i,j}\, q_i q_j$$

where $d_i = \sum_j w_{i,j}$ and $D = \mathrm{diag}(d_1, \ldots, d_n)$, so

$$J(q) = \frac{1}{2}\, q^T (D - W)\, q$$
Relaxation Approach

$$q^* = \arg\min_q J = \arg\min_q\, q^T (D - W)\, q \quad \text{subject to} \quad \sum_k q_k^2 = n$$
Solution: the eigenvector of $D - W$ with the second-smallest eigenvalue,

$$(D - W)\, q = \lambda_2\, q$$
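A minimal sketch of this relaxed solution: build L = D - W, take the eigenvector of the second-smallest eigenvalue (the Fiedler vector), and split on its sign. The example graph, two triangles joined by a single edge, is an illustrative assumption.

```python
# Bipartition via the second minimum eigenvector of L = D - W.
import numpy as np

W = np.zeros((6, 6))
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]  # (2, 3) is the bridge
for i, j in edges:
    W[i, j] = W[j, i] = 1.0

D = np.diag(W.sum(axis=1))
eigvals, eigvecs = np.linalg.eigh(D - W)  # eigenvalues in ascending order
q = eigvecs[:, 1]                         # second minimum eigenvector (Fiedler vector)
A = {i for i in range(6) if q[i] >= 0}
B = set(range(6)) - A
```

Splitting on the sign of q recovers the two triangles, with the single bridge edge (2, 3) as the cut.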
Graph Laplacian

$$w_{i,j} \ge 0, \qquad d_i = \sum_j w_{i,j}, \qquad L = D - W$$

$L$ is a positive semi-definite matrix: for any $x$, $x^T L x = \frac{1}{2} \sum_{i,j} w_{i,j} (x_i - x_j)^2 \ge 0$.
The minimum eigenvalue is $\lambda_1 = 0$ (its eigenvector is the all-ones vector).
The second minimum eigenvalue $\lambda_2$ gives the best bipartition of the graph.

$$0 = \lambda_1 \le \lambda_2 \le \lambda_3 \le \ldots \le \lambda_k$$
Recovering Partitions

Due to the relaxation, $q$ can contain any real numbers (not just $-1$ and $+1$). How do we construct a partition from the eigenvector? A simple strategy:

$$A = \{\, i \mid q_i \ge 0 \,\}, \qquad B = \{\, i \mid q_i < 0 \,\}$$
Spectral Clustering
The minimum cut does not balance the sizes of the two parts of the bipartition
Normalized Cut (Shi & Malik, 1997)

Minimize the similarity between clusters and meanwhile maximize the similarity within clusters:

$$s(A, B) = \sum_{i \in A,\, j \in B} w_{i,j}, \qquad d_A = \sum_{i \in A} d_i, \qquad d_B = \sum_{i \in B} d_i, \qquad d = \sum_j d_j$$

$$J = \frac{s(A, B)}{d_A} + \frac{s(A, B)}{d_B} = \frac{1}{d} \sum_{i \in A,\, j \in B} w_{i,j} \left( \sqrt{d_B / d_A} + \sqrt{d_A / d_B} \right)^2$$

With the membership vector

$$q_i = \begin{cases} +\sqrt{d_B / (d\, d_A)} & \text{if } i \in A \\ -\sqrt{d_A / (d\, d_B)} & \text{if } i \in B \end{cases}$$

this becomes $J = \frac{1}{2} \sum_{i,j} w_{i,j}\, (q_i - q_j)^2$.
Normalized Cut

$$J = \frac{1}{2} \sum_{i,j} w_{i,j}\, (q_i - q_j)^2 = q^T (D - W)\, q, \qquad q_i = \begin{cases} +\sqrt{d_B / (d\, d_A)} & \text{if } i \in A \\ -\sqrt{d_A / (d\, d_B)} & \text{if } i \in B \end{cases}$$
Normalized Cut

Relax $q$ to real values under the constraints

$$q^T D\, q = 1, \qquad q^T D\, e = 0$$

where $e$ is the all-ones vector. Solution: the generalized eigenvalue problem

$$(D - W)\, q = \lambda\, D\, q$$
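A minimal sketch of this generalized eigenproblem: with q = D^(-1/2) v it becomes the symmetric problem D^(-1/2) (D - W) D^(-1/2) v = lambda v, which a standard eigensolver handles. The weighted graph, two triangles joined by a weak bridge, is an illustrative assumption.

```python
# Solve (D - W) q = lambda D q by symmetrizing with q = D^{-1/2} v.
import numpy as np

W = np.zeros((6, 6))
for i, j, w in [(0, 1, 1.0), (0, 2, 1.0), (1, 2, 1.0),
                (3, 4, 1.0), (3, 5, 1.0), (4, 5, 1.0), (2, 3, 0.2)]:
    W[i, j] = W[j, i] = w

d = W.sum(axis=1)
L = np.diag(d) - W
L_sym = L / np.sqrt(np.outer(d, d))        # D^{-1/2} (D - W) D^{-1/2}
eigvals, eigvecs = np.linalg.eigh(L_sym)   # ascending eigenvalues
q = eigvecs[:, 1] / np.sqrt(d)             # map back: q = D^{-1/2} v
A = {i for i in range(6) if q[i] >= 0}
B = set(range(6)) - A
```

Because v for the second-smallest eigenvalue is orthogonal to D^(1/2) e, the recovered q automatically satisfies the constraint q^T D e = 0 above.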
Image Segmentation
Non-negative Matrix Factorization