The UNIVERSITY of NORTH CAROLINA at CHAPEL HILL
Idea of Co-Clustering
• Co-clustering: combine the row and column clustering of a co-occurrence matrix so that the two bootstrap each other.
• Simultaneously cluster the rows X and the columns Y of the co-occurrence matrix.
Hierarchical Co-Clustering Based on Entropy Splitting
• View the (scaled) co-occurrence matrix as a joint probability distribution between the row and column random variables.
• Objective: find a hierarchical co-clustering with a given number of clusters while preserving as much mutual information between row and column clusters as possible.
p(x, y) = #co-occurrence(x, y) / Σ_{x,y} #co-occurrence(x, y)
Example joint distribution p(X, Y):

         c1   c2   c3   c4
    r1   0.1  0    0.2  0
    r2   0    0.1  0.1  0
    r3   0.2  0.1  0.1  0
    r4   0    0    0    0.1
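The scaling in the formula above can be sketched in a few lines of Python (an illustration, not the authors' code; the integer count matrix is chosen so that it reproduces the probability table, with a total count of 10):

```python
def joint_distribution(counts):
    """p(x, y) = #co-occurrence(x, y) / total #co-occurrences."""
    total = sum(sum(row) for row in counts)
    return [[c / total for c in row] for row in counts]

# Integer co-occurrence counts consistent with the example table
# (each probability is a count divided by the total of 10).
counts = [
    [1, 0, 2, 0],
    [0, 1, 1, 0],
    [2, 1, 1, 0],
    [0, 0, 0, 1],
]
p = joint_distribution(counts)  # p[0][0] == 0.1, p[0][2] == 0.2, ...
```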
Hierarchical Co-Clustering Based on Entropy Splitting
Co-occurrence matrices:

Original joint probability distribution p(X, Y) over rows X and columns Y:

    0.1  0    0.2  0
    0    0.1  0.1  0
    0.2  0.1  0.1  0
    0    0    0    0.1

Underlying co-occurrence counts:

    1  0  2  0
    0  1  1  0
    2  1  1  0
    0  0  0  1

Joint probability distribution between row & column cluster random variables
(rows merged into {r1}, {r2, r3}, {r4}; columns into {c1}, {c2, c3}, {c4}):

    0.1  0.2  0
    0.2  0.4  0
    0    0    0.1

Mutual information at the three granularities: 0 (everything in one cluster), 0.4691 (the 3×3 cluster-level distribution), 0.7751 (the full matrix).
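The mutual-information objective can be made concrete with a small sketch (illustrative only, not the authors' code): computing I(X; Y) for the full joint distribution and for the cluster-level distribution shows how much information the merging preserves. The computed values come out close to the 0.7751 and 0.4691 shown on the slide.

```python
import math

def mutual_information(p):
    """I(X; Y) in bits for a joint distribution given as a 2-D list."""
    px = [sum(row) for row in p]
    py = [sum(col) for col in zip(*p)]
    return sum(
        v * math.log2(v / (px[i] * py[j]))
        for i, row in enumerate(p)
        for j, v in enumerate(row)
        if v > 0
    )

# Full joint distribution (4 rows x 4 columns) from the slide.
full = [
    [0.1, 0.0, 0.2, 0.0],
    [0.0, 0.1, 0.1, 0.0],
    [0.2, 0.1, 0.1, 0.0],
    [0.0, 0.0, 0.0, 0.1],
]
# Cluster-level distribution: rows {r1}, {r2, r3}, {r4};
# columns {c1}, {c2, c3}, {c4}.
clustered = [
    [0.1, 0.2, 0.0],
    [0.2, 0.4, 0.0],
    [0.0, 0.0, 0.1],
]
i_full = mutual_information(full)            # ~0.771 bits
i_clustered = mutual_information(clustered)  # ~0.469 bits
```

Merging clusters can only lose mutual information, which is why the algorithm tries to split so that as little as possible is lost.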
Hierarchical Co-Clustering Based on Entropy Splitting
Pipeline (recursive splitting):

    While the termination condition is not met:
        Find the optimal row/column cluster split that maximizes I(X̂, Ŷ)
        Update cluster indicators

Termination condition: the ratio I(X̂, Ŷ) / I(X, Y) reaches a given threshold, or |R̂| = max_r, or |Ĉ| = max_c.
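As a baseline, this pipeline can be sketched with an exhaustive best-split search (a toy illustration, not the paper's method: trying every binary split is exponential in the cluster size, which is exactly the cost the entropy-based splitting heuristic avoids; ties also make the first split arbitrary, so this greedy search only reaches a local optimum):

```python
import itertools
import math

def mutual_information(p):
    """I(X; Y) in bits for a joint distribution given as a 2-D list."""
    px = [sum(row) for row in p]
    py = [sum(col) for col in zip(*p)]
    return sum(
        v * math.log2(v / (px[i] * py[j]))
        for i, row in enumerate(p)
        for j, v in enumerate(row)
        if v > 0
    )

def merge(p, row_parts, col_parts):
    """Joint distribution of the cluster variables (X-hat, Y-hat)."""
    return [[sum(p[i][j] for i in rp for j in cp) for cp in col_parts]
            for rp in row_parts]

def binary_splits(part):
    """All 2^(|part|-1) - 1 ways to split one cluster into two non-empty halves."""
    first, rest = part[0], part[1:]
    for r in range(len(rest) + 1):
        for keep in itertools.combinations(rest, r):
            s2 = [e for e in rest if e not in keep]
            if s2:
                yield [first] + list(keep), s2

def co_cluster(p, n_row, n_col):
    """Greedy recursive splitting: repeatedly apply whichever single
    row- or column-cluster split keeps I(X-hat, Y-hat) maximal.
    Assumes n_row/n_col do not exceed the matrix dimensions."""
    row_parts = [list(range(len(p)))]
    col_parts = [list(range(len(p[0])))]
    while len(row_parts) < n_row or len(col_parts) < n_col:
        best = None
        if len(row_parts) < n_row:
            for k, part in enumerate(row_parts):
                for s1, s2 in binary_splits(part):
                    cand = row_parts[:k] + [s1, s2] + row_parts[k + 1:]
                    score = mutual_information(merge(p, cand, col_parts))
                    if best is None or score > best[0]:
                        best = (score, cand, col_parts)
        if len(col_parts) < n_col:
            for k, part in enumerate(col_parts):
                for s1, s2 in binary_splits(part):
                    cand = col_parts[:k] + [s1, s2] + col_parts[k + 1:]
                    score = mutual_information(merge(p, row_parts, cand))
                    if best is None or score > best[0]:
                        best = (score, row_parts, cand)
        _, row_parts, col_parts = best
    return row_parts, col_parts

# Run on the 4x4 example distribution, asking for 3 row and 3 column clusters.
p = [
    [0.1, 0.0, 0.2, 0.0],
    [0.0, 0.1, 0.1, 0.0],
    [0.2, 0.1, 0.1, 0.0],
    [0.0, 0.0, 0.0, 0.1],
]
row_parts, col_parts = co_cluster(p, 3, 3)
```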
Hierarchical Co-Clustering Based on Entropy Splitting
How to find an optimal split at each step?

An entropy-based splitting algorithm:

    Input: cluster S
    Randomly split cluster S into S1 and S2
    Until convergence:
        For each element x in S, re-assign it to the cluster S_j (j ∈ {1, 2}) that minimizes
            D( p(Ŷ | x) || p(Ŷ | S_j) )
        Update cluster indicators and probability values

The algorithm converges to a local optimum.
Hierarchical Co-Clustering Based on Entropy Splitting
• Example

         Y1   Y2   Y3   Y4
    X1   0.1  0    0    0
    X2   0    0.2  0.2  0
    X3   0    0.2  0.2  0
    X4   0.1  0    0    0

    S = {X1, X2, X3, X4}
    Randomly split: S1 = {X1}, S2 = {X2, X3, X4}
    Re-assign X4 to S1: S1 = {X1, X4}, S2 = {X2, X3}

A naïve method would need to try all 7 possible splits, and the number of splits grows exponentially with the size of S.
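The splitting loop and the example above can be sketched as follows (illustrative Python, not the authors' code; the small smoothing constant is an assumption, since the slides do not say how zeros in the KL divergence are handled):

```python
import math

EPS = 1e-12  # smoothing so D(p || q) stays finite when q has zeros

def kl(p, q):
    """KL divergence D(p || q) in bits."""
    return sum(pi * math.log2(pi / (qi + EPS)) for pi, qi in zip(p, q) if pi > 0)

def row_conditional(row):
    """p(Y | x) for a single row of the joint distribution."""
    t = sum(row)
    return [v / t for v in row]

def cluster_conditional(p, members):
    """p(Y | S) for a set of row indices S."""
    sums = [sum(p[i][j] for i in members) for j in range(len(p[0]))]
    t = sum(sums)
    return [v / t for v in sums]

def entropy_split(p, s1, s2):
    """Iteratively re-assign each row x to the half S_j minimising
    D(p(Y | x) || p(Y | S_j)) until the assignment stabilises.
    (Assumes neither half ever becomes empty, as in this example.)"""
    while True:
        q1 = cluster_conditional(p, s1)
        q2 = cluster_conditional(p, s2)
        new1, new2 = [], []
        for x in sorted(s1 + s2):
            px = row_conditional(p[x])
            (new1 if kl(px, q1) <= kl(px, q2) else new2).append(x)
        if (new1, new2) == (sorted(s1), sorted(s2)):
            return new1, new2
        s1, s2 = new1, new2

# The example above: start from the "random" split S1={X1}, S2={X2,X3,X4}.
joint = [
    [0.1, 0.0, 0.0, 0.0],  # X1
    [0.0, 0.2, 0.2, 0.0],  # X2
    [0.0, 0.2, 0.2, 0.0],  # X3
    [0.1, 0.0, 0.0, 0.0],  # X4
]
s1, s2 = entropy_split(joint, [0], [1, 2, 3])
# X4 moves next to X1: s1 == [0, 3] ({X1, X4}), s2 == [1, 2] ({X2, X3})
```

One pass of re-assignment recovers the split shown on the slide, instead of trying all 7 candidate splits.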
Experiments
• Data sets:
  - Synthetic data
  - 20 Newsgroups data: 20 classes, 20,000 documents
Results: Synthetic Data

[Figure: (a) a 1000×1000 matrix; (b) (a) with noise added by flipping values with probability 0.3; (c) (b) with rows and columns randomly permuted; (d) the clustering result, recovering the hierarchical structure]
Results: 20 Newsgroups Data

Compare with baselines:

    Dataset          HICC              NVBD              ICC               HCC
                     m-pre  #clusters  m-pre  #clusters  m-pre  #clusters  m-pre  #clusters
    Multi5subject    0.95   5          0.93   5          0.89   5          0.72   5
    Multi5           0.93   5          N/A               0.87   5          0.71   5
    Multi10subject   0.69   10         0.67   10         0.54   10         0.44   10
    Multi10          0.67   10         N/A               0.56   10         0.61   10
    HICC(merged)      Single-Link       UPGMA             WPGMA             Complete-Link
    m-pre  #clusters  m-pre  #clusters  m-pre  #clusters  m-pre  #clusters  m-pre  #clusters
    0.96   30         0.27   30         0.73   30         0.65   30         0.89   30
    0.96   30         0.29   30         0.59   30         0.71   30         0.85   30
    0.74   60         0.24   60         0.60   60         0.58   60         0.67   60
    0.74   60         0.24   60         0.61   60         0.62   60         0.60   60
Micro-averaged precision: m-pre = M/N, where M is the number of documents correctly clustered and N is the total number of documents.
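A sketch of this metric (the majority-class mapping is an assumption; the slides only give M/N, not how a cluster is matched to a class):

```python
from collections import Counter

def micro_averaged_precision(clusters, labels):
    """m-pre = M / N under a common convention: each (non-empty) cluster
    is credited with its majority class, M = documents whose label matches
    their cluster's majority class, N = total documents."""
    n = sum(len(c) for c in clusters)
    m = sum(Counter(labels[d] for d in c).most_common(1)[0][1]
            for c in clusters)
    return m / n

# Toy usage (hypothetical data): 6 documents, true labels, 2 clusters.
labels = ["sport", "sport", "sport", "politics", "politics", "sport"]
clusters = [[0, 1, 2], [3, 4, 5]]
# Cluster 1: majority "sport", 3 correct; cluster 2: majority "politics",
# 2 correct -> m-pre = 5/6.
mpre = micro_averaged_precision(clusters, labels)
```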
Thank You !
Questions?