A Tutorial on Spectral Clustering, by Ulrike von Luxburg, 2007, Max Planck Institute for Biological Cybernetics. Presented by Samy Coulombe, Shenyang (Andy) Huang, 2020.3.24


Page 1:

A Tutorial on Spectral Clustering

By Ulrike von Luxburg, 2007, Max Planck Institute for Biological Cybernetics

Presented by Samy Coulombe, Shenyang (Andy) Huang, 2020.3.24

Page 2:

Agenda

1. Motivation for the paper
2. Intro to similarity graphs and notation
3. Graph Laplacians and their basic properties
4. Spectral clustering algorithm
5. Connection to perturbation theory


Page 3:

Motivation and background (2007)

Aim: a self-contained introduction to spectral clustering.

● Graph Laplacian and the spectral clustering algorithm

● Spectral Clustering and its connections to various areas


Comparison between results of k-means clustering and spectral clustering (Colour indicates cluster assignment) [2]

Page 4:

Clustering Setting

● Given a set of data points x_1, …, x_n and a similarity measure* s(x_i, x_j) ≥ 0 between all pairs of data points

● Construct a similarity graph G = (V, E)

Goal: 1) high within-group similarity, 2) low between-group similarity

*domain-dependent choice

Page 5:

Types of similarity graphs


Legend

A) Data points

B) ε-neighborhood graph: edge (v_i, v_j) exists if dist(x_i, x_j) ≤ ε

C) kNN graph: edge (v_i, v_j) exists if v_j is among v_i's k nearest nodes

D) Mutual kNN graph: edge (v_i, v_j) exists if v_j is among v_i's k nearest nodes and vice versa.

[1]
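As a concrete illustration of constructions C and D above, here is a minimal NumPy sketch of (mutual) kNN graph construction. `knn_graph` is a name chosen for this example, not from the paper:

```python
import numpy as np

def knn_graph(X, k, mutual=False):
    """Boolean adjacency of a (mutual) kNN graph on points X (n x d).

    Illustrative helper; k excludes the point itself.
    """
    n = len(X)
    # Pairwise Euclidean distances
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(D, np.inf)            # a point is not its own neighbour
    nn = np.argsort(D, axis=1)[:, :k]      # indices of each point's k nearest
    A = np.zeros((n, n), dtype=bool)
    A[np.repeat(np.arange(n), k), nn.ravel()] = True   # A[i, j]: j in kNN(i)
    if mutual:
        return A & A.T    # D) edge only if the relation holds both ways
    return A | A.T        # C) symmetrise: edge if either direction holds
```

The mutual variant is always a subgraph of the plain kNN graph, which is why it tends to produce sparser, more conservative connectivity.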

Page 6:

Similarity Graphs and the Graph Laplacian

From the similarity graph G = (V, E), we can define the following:
● G's weighted adjacency matrix W, with entries w_ij ≥ 0

● The degree of node i, d_i = Σ_j w_ij, and G's diagonal degree matrix D with D_ii = d_i


Graph Laplacian (L) of G: L = D − W

● D is the degree matrix
● W is the weighted adjacency matrix

Page 7:

Similarity Graphs and the Graph Laplacian

From the similarity graph G,

● Degree matrix D, Adjacency matrix W, Graph Laplacian L


D − W = L (worked matrix example shown on the slide)
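The identity L = D − W can be checked numerically. A small sketch with an illustrative 4-node graph (the weights are made up for the example):

```python
import numpy as np

# Toy undirected graph: a triangle on nodes 0-2 plus an isolated node 3
W = np.array([[0., 1., 1., 0.],
              [1., 0., 1., 0.],
              [1., 1., 0., 0.],
              [0., 0., 0., 0.]])
D = np.diag(W.sum(axis=1))   # degree matrix: D_ii = d_i = sum_j w_ij
L = D - W                    # unnormalised graph Laplacian

# Row sums of L are zero, so L maps the all-ones vector to zero
assert np.allclose(L @ np.ones(4), 0)
```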

Page 8:

Properties of the Graph Laplacian

L satisfies the following properties:
● For every vector f ∈ R^n: f^T L f = (1/2) Σ_{i,j} w_ij (f_i − f_j)^2

● L is symmetric and positive semi-definite

● L has n non-negative, real-valued eigenvalues 0 = λ_1 ≤ λ_2 ≤ … ≤ λ_n
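The quadratic-form property above can be verified numerically. A quick sketch on a random symmetric weight matrix (values are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
# Random symmetric non-negative weight matrix with zero diagonal
W = rng.random((5, 5))
W = (W + W.T) / 2
np.fill_diagonal(W, 0)
L = np.diag(W.sum(1)) - W
f = rng.standard_normal(5)

# f^T L f equals 1/2 * sum_ij w_ij (f_i - f_j)^2, which is non-negative
quad = f @ L @ f
expected = 0.5 * np.sum(W * (f[:, None] - f[None, :]) ** 2)
assert np.isclose(quad, expected) and quad >= 0
```

Since the quadratic form is a sum of non-negative terms, positive semi-definiteness follows directly.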


Page 9:

0 eigenvalues and the connected components

Proposition (number of connected components): Let G be an undirected graph with non-negative weight matrix W. Then,

● The multiplicity k of the eigenvalue 0 of L equals the number of connected components A_1, …, A_k in the graph.

● The eigenspace of eigenvalue 0 is spanned by the k indicator vectors 1_{A_1}, …, 1_{A_k} of those components.


[Figure: a graph with 2 connected components vs. a graph with 1 connected component]
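The proposition can be checked on a small example. A sketch with two disjoint triangles, so k = 2 components:

```python
import numpy as np

# Block-diagonal W: two disjoint triangles -> two connected components
blk = np.ones((3, 3)) - np.eye(3)
W = np.block([[blk, np.zeros((3, 3))],
              [np.zeros((3, 3)), blk]])
L = np.diag(W.sum(1)) - W
evals = np.linalg.eigvalsh(L)

# Multiplicity of eigenvalue 0 equals the number of connected components
n_zero = int(np.sum(np.isclose(evals, 0.0)))
assert n_zero == 2
```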

Page 10:

Proof

Base case: k = 1 (connected graph). Assume f is an eigenvector with eigenvalue 0. By the definition of eigenvalue and eigenvector, Lf = 0, and hence f^T L f = 0.

We know from the properties of the Laplacian that:

0 = f^T L f = (1/2) Σ_{i,j} w_ij (f_i − f_j)^2


Page 11:

Proof

Base case: k = 1 (connected graph). Assume f is an eigenvector with eigenvalue 0. From the property of the Laplacian, each term w_ij (f_i − f_j)^2 in the sum is non-negative, so the sum vanishes only if every term vanishes.

● If vertices v_i and v_j are connected (w_ij > 0), then f_i = f_j

● Because all nodes are connected by a path, we must have f_i = f_j for all i, j

● Therefore f is a constant vector: the (normalized) all-ones indicator vector of the connected component


Page 12:

Proof continued

Case: k connected components
● Assume the vertices are ordered according to the connected components they belong to.
● The weighted adjacency matrix W and the Laplacian L then have a block diagonal form: L = diag(L_1, …, L_k).

● Each block L_i is the Laplacian of the i-th connected component


Page 13:

Proof continued

Case: k connected components
● Both the adjacency matrix W and the Laplacian L have a block diagonal form.

● The spectrum of the block diagonal matrix L is the union of the spectra of the blocks L_1, …, L_k

● From the base case, every L_i has eigenvalue 0 with multiplicity 1, and the corresponding eigenvector is the indicator vector of the i-th connected component. Hence 0 is an eigenvalue of L with multiplicity k.


Page 14:

Spectral Clustering algorithm

Input: similarity matrix S ∈ R^{n×n}, number of clusters k

1. Compute the Laplacian matrix L

2. Compute the eigenvectors u_1, …, u_k corresponding to the k smallest eigenvalues of L

3. Let U ∈ R^{n×k} be the matrix with u_1, …, u_k as columns; its rows y_1, …, y_n ∈ R^k are the data points to cluster

4. Cluster the points y_1, …, y_n with the k-means algorithm into clusters C_1, …, C_k

Output: clusters A_1, …, A_k where A_i = {j | y_j ∈ C_i}
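Steps 1–4 above can be sketched in NumPy. This is an illustrative implementation, not the paper's reference code: a minimal Lloyd iteration stands in for a library k-means, and `spectral_clustering` is a name chosen here:

```python
import numpy as np

def spectral_clustering(S, k, n_iter=100, seed=0):
    """Unnormalised spectral clustering sketch.

    Assumes S is a symmetric non-negative similarity matrix.
    """
    D = np.diag(S.sum(axis=1))
    L = D - S                           # step 1: graph Laplacian
    _, evecs = np.linalg.eigh(L)        # eigenvalues in ascending order
    U = evecs[:, :k]                    # steps 2-3: first k eigenvectors; rows are y_i

    # Step 4: minimal k-means (Lloyd iteration) on the rows of U
    rng = np.random.default_rng(seed)
    centers = U[rng.choice(len(U), k, replace=False)]
    for _ in range(n_iter):
        labels = np.argmin(((U[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        new = np.array([U[labels == j].mean(0) if (labels == j).any() else centers[j]
                        for j in range(k)])
        if np.allclose(new, centers):
            break
        centers = new
    return labels
```

On a similarity matrix that is nearly block diagonal, the rows of U collapse into k tight groups, which is exactly what makes the Euclidean k-means step appropriate.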


Page 15:

Spectral Clustering algorithm

Core idea of Spectral Clustering:

Project the dataset into a low-dimensional embedding space, then cluster.

Questions you might have:

1. Why use the k smallest eigenvectors of the Laplacian matrix?
a. See upcoming slides

2. Why use k-means for clustering?
a. Actually any other clustering algorithm can be used
b. But the Euclidean distance assumption of k-means is actually motivated here


Page 16:

Spectral Clustering Algorithm in Action

Toy example: 200 random points drawn from a mixture of 4 Gaussians

Similarity function: Gaussian similarity s(x_i, x_j) = exp(−‖x_i − x_j‖² / (2σ²)) (non-Euclidean)
Construct a k-nearest-neighbor graph
Spectral clustering using k-means detected the correct clusterings


Page 17:

Spectral Clustering Algorithm in Action: eigenvalues and eigenvectors of the Laplacian


Page 18:

Perturbation theory point of view

● ideally: k disconnected components

● reality: a connected graph with small between-cluster connectivity

● k smallest eigenvectors still similar to ideal case

Consider a perturbed symmetric matrix A′ = A + H, where H is a small symmetric perturbation.
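The stability claim can be illustrated numerically: perturb the block-diagonal ideal case with small symmetric noise H and compare the leading eigenspaces via their orthogonal projectors (a simple proxy for the eigenspace distance used on the next slide):

```python
import numpy as np

rng = np.random.default_rng(0)
# Ideal case: two disconnected triangles -> block-diagonal L
blk = np.ones((3, 3)) - np.eye(3)
W = np.block([[blk, np.zeros((3, 3))],
              [np.zeros((3, 3)), blk]])
L = np.diag(W.sum(1)) - W

# Small symmetric perturbation H (perturbed matrix is L + H)
H = 0.01 * rng.standard_normal((6, 6))
H = (H + H.T) / 2
_, V = np.linalg.eigh(L)
_, Vp = np.linalg.eigh(L + H)

# Compare the 2-dim leading eigenspaces via their projectors
P = V[:, :2] @ V[:, :2].T
Pp = Vp[:, :2] @ Vp[:, :2].T
assert np.linalg.norm(P - Pp) < 0.1   # eigenspace barely moves
```

The eigenspace stays close because the spectral gap between the two zero eigenvalues and the rest of the spectrum (here 3) is large relative to ‖H‖.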


Page 19:

Davis-Kahan Theorem
● Distance between two eigenspaces V and V′ (of A and of the perturbed A′ = A + H): d(V, V′) = ‖sin Θ(V, V′)‖

● The spectral gap δ is the distance between the eigenvalues of interest and the rest of the spectrum

● The Davis-Kahan theorem tells us that: ‖sin Θ(V, V′)‖ ≤ ‖H‖ / δ

○ The norm ‖·‖ is the Frobenius norm or the two-norm


Page 20:

Comments on perturbation theory

For spectral clustering to work properly:

● There should be an ideal case where A is block diagonal

● A should be a symmetric matrix

● The entries of the eigenvectors should be bounded away from 0.


Page 21:

More covered in the paper

1. Connection to Graph Cut problems

2. Symmetric and Random-Walk Normalized Graph Laplacians (normalize by degree)
3. Connection to Random Walk Matrices
4. Many practical tips for spectral clustering


Page 22:

Thank you for listening :)


Page 23:

References

1. von Luxburg, U. A Tutorial on Spectral Clustering. 2007.
2. Image taken from https://towardsdatascience.com/spectral-clustering-aba2640c0d5b
