Presented By Wanchen Lu 2/25/2013 Multi-view Clustering via Canonical Correlation Analysis Kamalika Chaudhuri et al. ICML 2009.


Page 1:

Presented by Wanchen Lu, 2/25/2013

Multi-view Clustering via Canonical Correlation Analysis

Kamalika Chaudhuri et al. ICML 2009.

Page 2:

INTRODUCTION

Page 3:

ASSUMPTION IN MULTI-VIEW PROBLEMS

• The input variable (a real vector) can be partitioned into two different views, and either view of the input is assumed to be sufficient to make accurate predictions. This is essentially the co-training assumption.

• Examples:
  • Identity recognition with one view being a video stream and the other an audio stream;
  • Web page classification where one view is the text and the other is the hyperlink structure;
  • Object recognition with pictures from different camera angles;
  • A bilingual parallel corpus, with each view presented in one language.

Page 4:

INTUITION IN MULTI-VIEW PROBLEMS

• Many multi-view learning algorithms force agreement between the predictors based on either view (usually by forcing the predictor on view 1 to equal the predictor on view 2).
• The complexity of the learning problem is reduced by eliminating hypotheses from each view that do not agree with each other.

Page 5:

BACKGROUND

Page 6:

CANONICAL CORRELATION ANALYSIS

• CCA is a way of measuring the linear relationship between two multidimensional variables.
• Find two basis vectors, one for x and one for y, such that the correlation between the projections of the variables onto these basis vectors is maximized.
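A minimal NumPy sketch of this idea, not the presenter's code. It assumes samples are stored as rows, whitens each view, and takes the SVD of the whitened cross-covariance; the singular values are then the canonical correlations. The function name `canonical_correlations` and the small ridge term `reg` are illustrative assumptions.

```python
import numpy as np

def canonical_correlations(X, Y, reg=1e-8):
    """Canonical correlations and basis vectors for X (n, p) and Y (n, q)."""
    X = X - X.mean(axis=0)          # CCA is defined on zero-mean variables
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]
    # Within-set and between-sets covariance estimates (reg is an assumed ridge term).
    Cxx = X.T @ X / (n - 1) + reg * np.eye(X.shape[1])
    Cyy = Y.T @ Y / (n - 1) + reg * np.eye(Y.shape[1])
    Cxy = X.T @ Y / (n - 1)

    def inv_sqrt(C):
        # Inverse matrix square root of a symmetric positive definite matrix.
        vals, vecs = np.linalg.eigh(C)
        return vecs @ np.diag(vals ** -0.5) @ vecs.T

    Wx, Wy = inv_sqrt(Cxx), inv_sqrt(Cyy)
    # Singular values of the whitened cross-covariance are the canonical correlations.
    U, rho, Vt = np.linalg.svd(Wx @ Cxy @ Wy, full_matrices=False)
    # Map the singular vectors back to the original coordinates.
    return rho, Wx @ U, Wy @ Vt.T
```

Projecting X onto the first returned x-vector and Y onto the first returned y-vector gives two scalar series whose sample correlation equals the first value in `rho`, up to the ridge term.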

Page 7:

CALCULATING CANONICAL CORRELATIONS

• Consider the total covariance matrix of random variables x and y with zero mean:
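The slide's formula is not carried in the transcript. In the standard formulation, assumed here, the total covariance matrix is a block matrix whose diagonal blocks C_xx and C_yy are the within-set covariances and whose off-diagonal blocks C_xy = C_yx^T are the between-sets covariances:

```latex
C =
\begin{bmatrix}
  C_{xx} & C_{xy} \\
  C_{yx} & C_{yy}
\end{bmatrix}
= E\left[
  \begin{pmatrix} x \\ y \end{pmatrix}
  \begin{pmatrix} x \\ y \end{pmatrix}^{\top}
\right]
```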

• The canonical correlations between x and y can be found by solving the eigenvalue equations
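The equations themselves are not in the transcript. A standard form, assumed here, uses the within-set covariances C_xx and C_yy and the between-sets covariance C_xy = C_yx^T. The eigenvalues ρ² are the squared canonical correlations, and the eigenvectors w_x and w_y are the basis vectors for x and y:

```latex
\begin{aligned}
C_{xx}^{-1} C_{xy} C_{yy}^{-1} C_{yx} \, w_x &= \rho^{2} \, w_x \\
C_{yy}^{-1} C_{yx} C_{xx}^{-1} C_{xy} \, w_y &= \rho^{2} \, w_y
\end{aligned}
```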

Page 8:

RELATION TO OTHER LINEAR SUBSPACE METHODS

• Formulate the problems in one single eigenvalue equation
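The slide's equation is not in the transcript. One common unified form, assumed here, is a generalized eigenvalue problem in which only the matrices A and B change between methods (C_xx, C_yy are within-set covariances, C_xy = C_yx^T is the between-sets covariance, I is the identity):

```latex
A w = \lambda B w,
\qquad
\begin{array}{lll}
\text{PCA:} & A = C_{xx}, & B = I \\[4pt]
\text{PLS:} & A = \begin{bmatrix} 0 & C_{xy} \\ C_{yx} & 0 \end{bmatrix}, & B = I \\[10pt]
\text{CCA:} & A = \begin{bmatrix} 0 & C_{xy} \\ C_{yx} & 0 \end{bmatrix}, &
              B = \begin{bmatrix} C_{xx} & 0 \\ 0 & C_{yy} \end{bmatrix}
\end{array}
```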

Page 9:

PRINCIPAL COMPONENT ANALYSIS

• The principal components are the eigenvectors of the covariance matrix.
• The projection of data onto the principal components is an orthogonal transformation that diagonalizes the covariance matrix.
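A short NumPy sketch of both bullets, assuming samples are rows. The helper name `pca` is illustrative. It takes the eigenvectors of the sample covariance matrix and projects onto them; the covariance of the projected data then comes out diagonal, with the eigenvalues on the diagonal.

```python
import numpy as np

def pca(X, d):
    """Project X (n, p) onto its top-d principal components."""
    Xc = X - X.mean(axis=0)                  # center the data
    C = np.cov(Xc, rowvar=False)             # p x p sample covariance matrix
    vals, vecs = np.linalg.eigh(C)           # eigenvalues in ascending order
    order = np.argsort(vals)[::-1][:d]       # indices of the d largest eigenvalues
    W = vecs[:, order]                       # principal components as columns
    return Xc @ W, W, vals[order]
```

Because the columns of `W` are orthonormal, the projection is an orthogonal transformation when d equals the input dimension.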

Page 10:

PARTIAL LEAST SQUARES

• PLS is basically the singular value decomposition (SVD) of a between-sets covariance matrix.
• In PLS regression, the principal vectors corresponding to the largest principal values are used as a basis. A regression of y onto x is then performed in this basis.
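A minimal NumPy sketch that follows the slide's description literally: one SVD of the between-sets covariance, then ordinary least squares in the reduced x-basis. It is not the iterative (deflation-based) PLS found in many libraries, and the names `pls_fit` and `pls_predict` are illustrative assumptions.

```python
import numpy as np

def pls_fit(X, Y, d):
    """SVD of the between-sets covariance, then regress Y on the projected X."""
    mx, my = X.mean(axis=0), Y.mean(axis=0)
    Xc, Yc = X - mx, Y - my
    Cxy = Xc.T @ Yc / (len(X) - 1)                  # between-sets covariance
    U, s, Vt = np.linalg.svd(Cxy, full_matrices=False)
    Wx = U[:, :d]                                   # top-d left singular vectors
    T = Xc @ Wx                                     # x expressed in the new basis
    coef, *_ = np.linalg.lstsq(T, Yc, rcond=None)   # least-squares regression
    return {"mx": mx, "my": my, "Wx": Wx, "Wy": Vt[:d].T, "s": s[:d], "coef": coef}

def pls_predict(model, X):
    """Predict y for new x using the fitted basis and coefficients."""
    return (X - model["mx"]) @ model["Wx"] @ model["coef"] + model["my"]
```

The first singular value equals the sample covariance between the projections of x and y onto the first pair of singular vectors, which is the quantity this form of PLS maximizes.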

Page 11:

ALGORITHM

Page 12:

THE BASIC IDEA

• Use CCA to project the data down to the subspace spanned by the means to get an easier clustering problem, then apply standard clustering algorithms in this space.
• When the data in at least one of the views is well separated, this algorithm clusters correctly with high probability.

Page 13:

ALGORITHM

• Input: a set of samples S, the number of clusters k

1. Randomly partition S into two subsets A and B of equal size.

2. Let C_12(A) be the covariance matrix between views 1 and 2, computed from the set A. Compute the top k-1 left singular vectors of C_12(A), and project the samples in B on the subspace spanned by these vectors.

3. Apply a clustering algorithm (e.g., single linkage clustering or K-means) to the projected examples in view 1.
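The three steps above can be sketched in NumPy as follows. This is an illustration under stated assumptions, not the authors' implementation: the two views arrive as row-aligned arrays `X1` and `X2`, no whitening is applied beforehand, and step 3 uses a small hand-written Lloyd's k-means with random initialization. The names `kmeans` and `multiview_cluster` are hypothetical.

```python
import numpy as np

def kmeans(Z, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means with random initial centers (illustrative helper)."""
    rng = np.random.default_rng(seed)
    centers = Z[rng.choice(len(Z), size=k, replace=False)]
    for _ in range(n_iter):
        d2 = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)               # assign each point to nearest center
        for j in range(k):
            if np.any(labels == j):              # skip empty clusters
                centers[j] = Z[labels == j].mean(axis=0)
    return labels

def multiview_cluster(X1, X2, k, seed=0):
    """Return indices of subset B and cluster labels for its view-1 projections."""
    rng = np.random.default_rng(seed)
    n = X1.shape[0]
    # Step 1: random partition of the sample indices into halves A and B.
    perm = rng.permutation(n)
    A, B = perm[: n // 2], perm[n // 2:]
    # Step 2: cross-covariance between the views, estimated on A only.
    mean1 = X1[A].mean(axis=0)
    X1a = X1[A] - mean1
    X2a = X2[A] - X2[A].mean(axis=0)
    C12 = X1a.T @ X2a / len(A)
    U, _, _ = np.linalg.svd(C12, full_matrices=False)
    basis = U[:, : k - 1]                        # top k-1 left singular vectors
    Z = (X1[B] - mean1) @ basis                  # project view 1 of B
    # Step 3: cluster the projected view-1 samples.
    return B, kmeans(Z, k, seed=seed)
```

Estimating the subspace on A and clustering only B keeps the projection independent of the points being clustered, which is why the input is split in step 1.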

Page 14:

EXPERIMENTS

Page 15:

SPEAKER IDENTIFICATION

• Dataset
  • 41 speakers, speaking 10 sentences each
  • Audio features: 1584 dimensions
  • Video features: 2394 dimensions

• Method 1: use PCA to project into 40 dimensions
• Method 2: use CCA (after PCA into 100 dimensions for images and 1000 dimensions for audio)
• Cluster into 82 clusters (2 per speaker) using K-means

Page 16:

SPEAKER IDENTIFICATION

• Evaluation
  • Conditional perplexity = the mean number of speakers corresponding to each cluster
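A small sketch of this metric, assuming the usual definition 2^H(speaker | cluster), where H is the conditional entropy in bits estimated from counts. The function name `conditional_perplexity` is illustrative.

```python
import math
from collections import Counter

def conditional_perplexity(speakers, clusters):
    """2 ** H(speaker | cluster), with the entropy estimated from label counts."""
    n = len(speakers)
    joint = Counter(zip(clusters, speakers))   # counts of (cluster, speaker) pairs
    cluster_sizes = Counter(clusters)          # counts per cluster
    h = 0.0
    for (c, s), count in joint.items():
        p_joint = count / n                    # P(cluster = c, speaker = s)
        p_s_given_c = count / cluster_sizes[c] # P(speaker = s | cluster = c)
        h -= p_joint * math.log2(p_s_given_c)
    return 2.0 ** h
```

A value of 1 means every cluster contains a single speaker; a value of 2 means each cluster is, on average, as ambiguous as a fair choice between two speakers.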

Page 17:

CLUSTERING WIKIPEDIA ARTICLES

• Dataset
  • 128 K Wikipedia articles, evaluated on 73 K articles that belong to the 500 most frequent categories.
  • Link structure feature L is a concatenation of "to" and "from" vectors. L(i) is the number of times the current article links to/from article i.
  • Text feature is a bag-of-words vector.

• Methods: compared PCA and CCA
  • Used a hierarchical clustering procedure: iteratively pick the largest cluster, reduce the dimensionality using PCA or CCA, and use k-means to break the cluster into smaller ones, until reaching the total desired number of clusters.
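A rough NumPy sketch of the hierarchical procedure, under several assumptions that are not in the slides: only the PCA variant is shown, each step splits the largest cluster in two, and the 2-means helper uses a deterministic farthest-point initialization. The names `two_means` and `split_largest` are hypothetical.

```python
import numpy as np

def two_means(Z, n_iter=30):
    """2-means with farthest-point initialization (an assumed, deterministic choice)."""
    first = Z[((Z - Z.mean(axis=0)) ** 2).sum(axis=1).argmax()]
    second = Z[((Z - first) ** 2).sum(axis=1).argmax()]
    centers = np.stack([first, second]).astype(float)
    for _ in range(n_iter):
        d2 = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        part = d2.argmin(axis=1)                 # 0 or 1 for each point
        for j in (0, 1):
            if np.any(part == j):
                centers[j] = Z[part == j].mean(axis=0)
    return part

def split_largest(X, n_clusters, d=2):
    """Repeatedly split the largest cluster until n_clusters labels exist."""
    labels = np.zeros(len(X), dtype=int)
    for new_label in range(1, n_clusters):
        sizes = np.bincount(labels)
        big = np.flatnonzero(labels == sizes.argmax())   # members of largest cluster
        Xc = X[big] - X[big].mean(axis=0)
        # PCA via SVD: rows of Vt are the principal directions of this cluster.
        _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
        Z = Xc @ Vt[:d].T                                # reduce to at most d dimensions
        part = two_means(Z)
        labels[big[part == 1]] = new_label               # one half gets a new label
    return labels
```

Recomputing the projection inside each cluster, rather than once globally, lets the reduced space adapt to the structure that remains after earlier splits.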

Page 18:

RESULTS

Page 19:

THANK YOU

Page 20:

APPENDIX: A NOTE ON CORRELATION

• Correlation between x_i and x_j is the covariance normalized by the geometric mean of the variances of x_i and x_j
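Written out for two components x_i and x_j with means μ_i and μ_j, and with c_ij denoting their covariance (so c_ii and c_jj are the variances), this is the usual correlation coefficient:

```latex
\rho_{ij} = \frac{c_{ij}}{\sqrt{c_{ii}\, c_{jj}}},
\qquad
c_{ij} = E\left[(x_i - \mu_i)(x_j - \mu_j)\right]
```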

Page 21:

AFFINE TRANSFORMATIONS

• An affine transformation is a map
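The slide's formula is not in the transcript. The standard definition, assumed here, is a linear map followed by a translation, with A a matrix and q a fixed vector (the symbols F, A, p, q are chosen for illustration); the map is invertible exactly when A is non-singular:

```latex
F : \mathbb{R}^{n} \to \mathbb{R}^{n},
\qquad
F(p) = A p + q
```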