prml 9.1-9.2: k-means clustering & mixtures of gaussians

Post on 16-Aug-2015

TRANSCRIPT

PRML 9.1-9.2

K-means Clustering & Mixtures of Gaussians

July 16, 2014

by Shinichi TAMURA

Mixtures of Gaussians K-means Clustering

Today's topics

1. K-means Clustering
   1. Clustering Problem
   2. K-means Clustering
   3. Application for Image Compression

2. Mixtures of Gaussians
   1. Introduction of latent variables
   2. Problem of ML estimates
   3. EM-algorithm for Mixture of Gaussians

July 16, 2014 PRML 9.1-9.2 Shinichi TAMURA

Clustering Problem

An unsupervised machine learning problem: divide the data into groups (= clusters) such that

✓ similar data → same group

✓ dissimilar data → different groups

Minimize

$$\sum_{n=1}^{N} \| x_n - \mu_{k(n)} \|^2$$

where $\mu_{k(n)}$ is the center of the cluster to which $x_n$ is assigned.

Given a data set $X = \{x_1, \dots, x_N\}$ and the number of clusters $K$, let $\mu_k$ be the cluster representatives and $r_{nk}$ the assignment indicators ($r_{nk} = 1$ if $x_n \in C_k$, and $0$ otherwise).

Minimize

$$J = \sum_{n=1}^{N} \sum_{k=1}^{K} r_{nk} \| x_n - \mu_k \|^2$$

Here, $J$ is called the "distortion measure".
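
The distortion measure can be evaluated directly from its definition. The following is a minimal sketch, assuming points are stored as tuples and the indicator $r_{nk}$ is stored compactly as one cluster index per point; the function name `distortion` and the sample data are hypothetical.

```python
def distortion(points, centers, assignment):
    """Distortion measure J: sum of squared distances to the assigned centers.

    points     -- list of N vectors (tuples of floats)
    centers    -- list of K vectors mu_k
    assignment -- list of N indices; assignment[n] == k means r_nk = 1
    """
    total = 0.0
    for x, k in zip(points, assignment):
        total += sum((xi - mi) ** 2 for xi, mi in zip(x, centers[k]))
    return total


# Hypothetical toy data: two points near (0, 1), one point at (10, 0).
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0)]
centers = [(0.0, 1.0), (10.0, 0.0)]

good = distortion(points, centers, [0, 0, 1])  # sensible assignment
bad = distortion(points, centers, [0, 1, 1])   # second point sent to the far center
print(good, bad)  # 2.0 105.0
```

A worse assignment gives a larger J, which is what the minimization exploits.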

K-means Clustering

How to solve that?

$\mu_k$ and $r_{nk}$ depend on each other → no closed-form solution.

Use an iterative algorithm!

Strategy: $\mu_k$ and $r_{nk}$ cannot be updated simultaneously → update them one at a time.

Update of $r_{nk}$ (assignment)

Since each $x_n$ can be assigned independently, $J$ is minimized when every $x_n$ is assigned to the nearest $\mu_k$. Therefore,

$$r_{nk} = \begin{cases} 1 & \text{if } k = \arg\min_j \| x_n - \mu_j \|^2, \\ 0 & \text{otherwise.} \end{cases}$$
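
The assignment rule can be sketched in a few lines. This is only an illustration, assuming tuples for vectors and storing $r_{nk}$ as one index per point; the helper names and data are hypothetical, and ties are broken in favor of the lowest index (a choice the slides do not specify).

```python
def squared_distance(x, mu):
    """Squared Euclidean distance ||x - mu||^2."""
    return sum((xi - mi) ** 2 for xi, mi in zip(x, mu))


def assign_to_nearest(points, centers):
    """For each x_n, return the index k of the nearest center (r_nk = 1)."""
    return [
        min(range(len(centers)), key=lambda k: squared_distance(x, centers[k]))
        for x in points
    ]


# Hypothetical toy data on a line.
points = [(0.0, 0.0), (1.0, 0.0), (9.0, 0.0)]
centers = [(0.0, 0.0), (10.0, 0.0)]
labels = assign_to_nearest(points, centers)
print(labels)  # [0, 0, 1]
```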

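Update of $\mu_k$ (parameter estimation)

The optimal $\mu_k$ is obtained by setting the derivative to zero:

$$\frac{\partial}{\partial \mu_k} \sum_{n=1}^{N} \sum_{k'=1}^{K} r_{nk'} \| x_n - \mu_{k'} \|^2 = 0 \iff 2 \sum_{n=1}^{N} r_{nk} (x_n - \mu_k) = 0$$

$$\therefore \mu_k = \frac{\sum_{n=1}^{N} r_{nk} x_n}{\sum_{n=1}^{N} r_{nk}} = \frac{1}{N_k} \sum_{x_n \in C_k} x_n$$

The right-hand side is the mean of the cluster, where $N_k$ is the number of points in $C_k$.

Since $\mu_k$ is the mean of the cluster, the cost function $J$ is the sum of squared deviations within each cluster, i.e. the within-cluster variances weighted by $N_k$.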
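
The mean update can be sketched as follows. This is an illustration only: the function name `update_means` and the data are hypothetical, and keeping the old center when a cluster receives no points is an assumption, since the slides do not cover that case.

```python
def update_means(points, labels, old_centers):
    """Move each center to the mean of the points currently assigned to it."""
    dim = len(points[0])
    new_centers = []
    for k, old in enumerate(old_centers):
        members = [x for x, label in zip(points, labels) if label == k]
        if not members:
            # Assumption: an empty cluster keeps its previous center.
            new_centers.append(old)
            continue
        new_centers.append(
            tuple(sum(x[d] for x in members) / len(members) for d in range(dim))
        )
    return new_centers


# Hypothetical toy data: cluster 2 receives no points.
points = [(0.0, 0.0), (2.0, 0.0), (10.0, 4.0)]
labels = [0, 0, 1]
old = [(5.0, 5.0), (6.0, 6.0), (7.0, 7.0)]
print(update_means(points, labels, old))
# [(1.0, 0.0), (10.0, 4.0), (7.0, 7.0)]
```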

K-means algorithm

1. Initialize $\mu_k$, $r_{nk}$.

2. Repeat the following two steps until convergence:

   i) E step: assign each $x_n$ to the closest $\mu_k$.

   ii) M step: update each $\mu_k$ to the mean of its cluster.
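
Putting the two steps together gives a compact loop. The sketch below is not the presenter's implementation: taking the first K points as initial centers, stopping when the assignments no longer change, and leaving an empty cluster's center untouched are all assumptions made to keep the example deterministic.

```python
def kmeans(points, k, max_iter=100):
    """Plain K-means on a list of tuples; returns (centers, labels)."""
    centers = [tuple(p) for p in points[:k]]  # assumption: first k points as init
    labels = None
    for _ in range(max_iter):
        # E step: assign every point to its nearest center.
        new_labels = [
            min(range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(x, centers[j])))
            for x in points
        ]
        if new_labels == labels:  # assignments unchanged, so J cannot decrease further
            break
        labels = new_labels
        # M step: move every center to the mean of its members.
        for j in range(k):
            members = [x for x, label in zip(points, labels) if label == j]
            if members:
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centers, labels


# Hypothetical toy data: two well-separated pairs of points.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
centers, labels = kmeans(points, 2)
print(centers, labels)
# [(0.0, 0.5), (10.0, 10.5)] [0, 0, 1, 1]
```

With this initialization the first E step puts three points in one cluster; the following M and E steps correct that, and the loop stops once the labels repeat.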

Convergence property

Neither step ever increases $J$, so the result never gets worse from one iteration to the next.

Since the number of possible assignments $r_{nk}$ is finite, the algorithm converges after a finite number of iterations (to a local, not necessarily global, minimum of $J$).

Demo of algorithm (figures not reproduced): the slides alternate between E step and M step.

Calculation performance

E step: compares every data point $x_n$ with every cluster mean $\mu_k$ → $O(KN)$. This is not good; it can be improved with a kd-tree, the triangle inequality, etc.

M step: calculates the mean of every cluster → $O(N)$.

Here, two variations will be introduced:

1. On-line version

2. General dissimilarity

[Variation] 1. On-line version

The case where data points are observed one at a time.

→ Apply the Robbins-Monro algorithm:

$$\mu_k^{\text{new}} = \mu_k^{\text{old}} + \eta_n (x_n - \mu_k^{\text{old}})$$

where $\eta_n$ is the learning rate, which decreases with the iterations.
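
A sequential update of this form can be sketched as below. The learning-rate schedule is an assumption: $\eta_n = 1/(\text{number of points assigned to that center so far})$ is one common decreasing choice, which makes each center the running mean of its points. The function name and data are hypothetical.

```python
def online_kmeans(stream, centers):
    """Sequential K-means: one center update per observed point."""
    centers = [list(c) for c in centers]
    counts = [0] * len(centers)
    for x in stream:
        # Find the nearest center for the new point.
        k = min(range(len(centers)),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(x, centers[j])))
        counts[k] += 1
        eta = 1.0 / counts[k]  # assumption: learning rate decreasing as 1/count
        # mu_new = mu_old + eta * (x - mu_old)
        centers[k] = [m + eta * (a - m) for a, m in zip(x, centers[k])]
    return centers


# Hypothetical 1-D stream with two groups of points.
result = online_kmeans([(1.0,), (3.0,), (10.0,), (12.0,)], [(0.0,), (9.0,)])
print(result)  # [[2.0], [11.0]]
```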

[Variation] 2. General dissimilarity

Euclidean distance is not

✓ appropriate for categorical data, etc.

✓ robust to outliers.

→ Use a general dissimilarity measure $V(x, x')$.

E step: no difference.

M step: $J$ is not guaranteed to be easy to minimize.

To make the M step easy, restrict $\mu_k$ to vectors chosen from $\{x_n\}$.

→ A solution can be obtained with a finite number of comparisons:

$$\mu_k = \arg\min_{x_n} \sum_{x_{n'} \in C_k} V(x_n, x_{n'})$$
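
The restricted M step amounts to an exhaustive search over candidate points. In the sketch below the candidates are the members of the cluster itself, which is an assumption (the slide only says "chosen from $\{x_n\}$"); the Manhattan distance stands in for a general $V$, and all names and data are hypothetical.

```python
def medoid(cluster, dissimilarity):
    """Return the cluster member with the smallest total dissimilarity to the cluster."""
    return min(cluster,
               key=lambda c: sum(dissimilarity(c, other) for other in cluster))


def manhattan(a, b):
    """L1 distance, used here as an example of a general V(x, x')."""
    return sum(abs(ai - bi) for ai, bi in zip(a, b))


# Hypothetical cluster with one outlier at x = 100.
cluster = [(0, 0), (1, 0), (2, 0), (3, 0), (100, 0)]
center = medoid(cluster, manhattan)
print(center)  # (2, 0)
```

The arithmetic mean of these points would be (21.2, 0), pulled far toward the outlier, while the chosen representative stays inside the bulk of the data. The search costs a number of evaluations of $V$ that is quadratic in the cluster size.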

Application for Image Compression

The K-means algorithm can be applied to image compression and segmentation.

Basic idea: treat similar pixels as the same one (so-called "vector quantization").

Each pixel of the original data is replaced by its cluster center (palette / code-book vector).

Demo (figures not reproduced)

Compression rate

Original image: $24N$ bits ($N$ = number of pixels)

Compressed image: $24K + N \log_2 K$ bits ($K$ = number of palette colors)

About 16.7% of the original size if $N \approx 1\text{M}$ and $K = 10$, with $\log_2 K$ rounded up to 4 bits per pixel.
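
The quoted figure can be checked with a few lines of arithmetic. The sketch assumes that each pixel index is stored in a whole number of bits, i.e. $\lceil \log_2 K \rceil$; without that rounding the ratio for $K = 10$ would come out near 13.8% instead. The function name is hypothetical.

```python
import math


def compression_ratio(n_pixels, k, bits_per_color=24):
    """Compressed size divided by original size for a K-color palette."""
    index_bits = math.ceil(math.log2(k))  # assumption: whole bits per pixel index
    compressed = bits_per_color * k + n_pixels * index_bits
    original = bits_per_color * n_pixels
    return compressed / original


ratio = compression_ratio(1_000_000, 10)
print(round(ratio, 4))  # 0.1667
```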

Introduction of Latent Variable

In K-means, every assignment is treated the same: "all or nothing".

Is this "hard" assignment appropriate?

→ We want to introduce a "soft" (probabilistic) assignment.

Introduce a random variable $z$ with a 1-of-K representation.

→ It controls the unobserved "state".

Once the state is determined, $x$ is drawn from the Gaussian of that state:

$$p(x \mid z_k = 1) = \mathcal{N}(x \mid \mu_k, \Sigma_k).$$

Graphical representation: $z \to x$.

Here the distribution over $x$ is

$$p(x) = \sum_{z} p(z)\, p(x \mid z) = \sum_{k=1}^{K} p(z_k = 1)\, p(x \mid z_k = 1) = \sum_{k=1}^{K} \pi_k\, \mathcal{N}(x \mid \mu_k, \Sigma_k).$$

The second equality holds because $z$ has a 1-of-K representation.

→ Gaussian mixtures!

Estimate (or "explain") which state $x$ came from:

$$\gamma(z_k) \equiv p(z_k = 1 \mid x) = \frac{p(z_k = 1)\, p(x \mid z_k = 1)}{\sum_j p(z_j = 1)\, p(x \mid z_j = 1)} = \frac{\pi_k\, \mathcal{N}(x \mid \mu_k, \Sigma_k)}{\sum_j \pi_j\, \mathcal{N}(x \mid \mu_j, \Sigma_j)}.$$

Here $\gamma(z_k)$ is the posterior, $\pi_k$ is the prior, and $\mathcal{N}(x \mid \mu_k, \Sigma_k)$ is the likelihood.

This value is also called the "responsibility".
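
For a one-dimensional mixture the responsibilities are a short computation. The following is a sketch under that simplifying assumption (scalar variances instead of covariance matrices); the function names and parameter values are hypothetical.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian N(x | mean, var)."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def responsibilities(x, weights, means, variances):
    """Posterior p(z_k = 1 | x) for every component k (Bayes' theorem)."""
    joint = [w * gaussian_pdf(x, m, v)
             for w, m, v in zip(weights, means, variances)]
    total = sum(joint)  # this is the mixture density p(x)
    return [j / total for j in joint]


# Two equally weighted unit-variance components centered at -1 and +1.
gamma_mid = responsibilities(0.0, [0.5, 0.5], [-1.0, 1.0], [1.0, 1.0])
gamma_right = responsibilities(1.0, [0.5, 0.5], [-1.0, 1.0], [1.0, 1.0])
print(gamma_mid)    # equal responsibilities at the midpoint
print(gamma_right)  # the component centered at +1 takes most of the responsibility
```

Unlike the hard assignment of K-means, a point halfway between the two components is shared equally between them.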

Example of Gaussian Mixtures (figure not reproduced; panels (a) to (c), both axes from 0 to 1): samples with no state info, coloured by true state, and coloured by responsibility.

Problems of ML estimates

ML estimates of mixtures of Gaussians have two problems:

i. Presence of singularities

ii. Identifiability

i) Presence of Singularities

What if a mean collides with a data point, i.e. $\exists j, m:\ \mu_j = x_m$?

The likelihood can then be made arbitrarily large by letting $\sigma_j \to 0$:

$$L \propto \left( \frac{1}{\sigma_j} + \sum_{k \neq j} p_{k,m} \right) \prod_{n \neq m} \left( \frac{1}{\sigma_j} \exp\left( -\frac{(x_n - \mu_j)^2}{2\sigma_j^2} \right) + \sum_{k \neq j} p_{k,n} \right) \to \infty.$$

Here $p_{k,n}$ denotes the contribution of component $k$ at data point $x_n$. The first factor diverges ($\to \infty$). In every remaining factor the term from component $j$ vanishes ($\to 0$), but the contribution of the other components stays positive ($> 0$), so the product as a whole diverges.

i) Presence of Singularities

It does not occur with a single Gaussian:

$$L \propto \frac{1}{\sigma_j^N} \prod_{n \neq m} \exp\left( -\frac{(x_n - \mu_j)^2}{2\sigma_j^2} \right) \to 0.$$

The prefactor $1/\sigma_j^N$ diverges ($\to \infty$), but the exponential factors go to zero faster ($\to 0$).

It does not occur in a Bayesian approach either.
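
The contrast between a mixture and a single Gaussian can be observed numerically. The sketch below is only an illustration on hypothetical 1-D data: the mixture has two equally weighted components, one fixed at $\mathcal{N}(0, 1)$ and one centered on the first data point with a shrinking standard deviation. Log-likelihoods are used, and the log-sum-exp trick is an implementation choice to avoid underflow.

```python
import math


def log_gauss(x, mean, sigma):
    """Log density of a 1-D Gaussian with standard deviation sigma."""
    return (-0.5 * math.log(2 * math.pi) - math.log(sigma)
            - (x - mean) ** 2 / (2 * sigma ** 2))


def mixture_loglik(data, sigma_j):
    """Two components: a fixed N(0, 1) and one collapsing onto data[0]."""
    total = 0.0
    for x in data:
        a = math.log(0.5) + log_gauss(x, 0.0, 1.0)
        b = math.log(0.5) + log_gauss(x, data[0], sigma_j)
        m = max(a, b)
        total += m + math.log(math.exp(a - m) + math.exp(b - m))  # log-sum-exp
    return total


def single_loglik(data, sigma):
    """A single Gaussian centered on data[0]."""
    return sum(log_gauss(x, data[0], sigma) for x in data)


data = [0.5, -1.0, 0.0, 2.0]  # hypothetical sample
for s in (1.0, 1e-2, 1e-4):
    print(s, mixture_loglik(data, s), single_loglik(data, s))
```

As the standard deviation shrinks, the mixture log-likelihood keeps increasing while the single-Gaussian log-likelihood falls sharply, matching the two limits above.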

ii) Identifiability

Optimal solutions are not unique: if we have a solution, there are $(K! - 1)$ other equivalent solutions, obtained by permuting the component labels.

This matters when interpreting the parameters, but not when the goal is only to model the density.
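
The label symmetry is easy to confirm numerically. The sketch below uses a hypothetical 1-D mixture with $K = 3$ components and evaluates the density under every relabeling of the components; all names and parameter values are assumptions for illustration.

```python
import math
from itertools import permutations


def mixture_density(x, components):
    """Density of a 1-D Gaussian mixture given (weight, mean, variance) triples."""
    return sum(
        w * math.exp(-(x - m) ** 2 / (2 * v)) / math.sqrt(2 * math.pi * v)
        for w, m, v in components
    )


components = [(0.2, -1.0, 1.0), (0.3, 0.0, 0.5), (0.5, 2.0, 2.0)]
relabelings = list(permutations(components))  # K! orderings of the same components
densities = [mixture_density(0.7, list(p)) for p in relabelings]

print(len(relabelings))                   # 6
print(max(densities) - min(densities))    # zero up to rounding error
```

All $3! = 6$ parameter settings describe the same density, so the likelihood cannot distinguish between them.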

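EM-algorithm for Gaussian Mixtures

The conditions for ML are obtained from

$$\frac{\partial}{\partial \mu_k} L = 0, \qquad \frac{\partial}{\partial \Sigma_k} L = 0, \qquad \frac{\partial}{\partial \pi_k} \left\{ L + \lambda \left( \sum_j \pi_j - 1 \right) \right\} = 0,$$

where

$$L(\pi, \mu, \Sigma) = \sum_{n=1}^{N} \ln \left\{ \sum_{k=1}^{K} \pi_k\, \mathcal{N}(x_n \mid \mu_k, \Sigma_k) \right\}.$$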

The conditions for ML are

$$\mu_k = \frac{1}{N_k} \sum_{n=1}^{N} \gamma_n(z_k)\, x_n,$$

$$\Sigma_k = \frac{1}{N_k} \sum_{n=1}^{N} \gamma_n(z_k) (x_n - \mu_k)(x_n - \mu_k)^{\mathrm{T}},$$

$$\pi_k = \frac{N_k}{N},$$

where $N_k = \sum_{n=1}^{N} \gamma_n(z_k)$.

Note that $\gamma_n(z_k)$ appears in every condition.

Recall that

$$\gamma_n(z_k) = \frac{\pi_k\, \mathcal{N}(x_n \mid \mu_k, \Sigma_k)}{\sum_j \pi_j\, \mathcal{N}(x_n \mid \mu_j, \Sigma_j)}.$$

The parameters appear on the right-hand side = no closed-form solution.

Again, use an iterative algorithm!

EM algorithm for Gaussian Mixtures

1. Initialize the parameters.

2. Repeat the following two steps until convergence:

   i) E step: calculate $\gamma_n(z_k) = \dfrac{\pi_k\, \mathcal{N}(x_n \mid \mu_k, \Sigma_k)}{\sum_j \pi_j\, \mathcal{N}(x_n \mid \mu_j, \Sigma_j)}$.

   ii) M step: update the parameters.
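
The two steps can be sketched for a one-dimensional mixture. This is an illustration rather than the presenter's code: it assumes scalar variances, a fixed number of iterations instead of a convergence test, and a small variance floor (`1e-6`) as a crude guard against the singularities discussed earlier. The function names and data are hypothetical.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian N(x | mean, var)."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def em_gmm_1d(data, means, variances, weights, n_iter=50):
    """EM for a 1-D Gaussian mixture; returns (means, variances, weights)."""
    K, N = len(means), len(data)
    means, variances, weights = list(means), list(variances), list(weights)
    for _ in range(n_iter):
        # E step: responsibilities gamma[n][k] under the current parameters.
        gamma = []
        for x in data:
            joint = [weights[k] * gaussian_pdf(x, means[k], variances[k])
                     for k in range(K)]
            total = sum(joint)
            gamma.append([j / total for j in joint])
        # M step: re-estimate the parameters from the responsibilities.
        for k in range(K):
            Nk = sum(g[k] for g in gamma)
            means[k] = sum(g[k] * x for g, x in zip(gamma, data)) / Nk
            var = sum(g[k] * (x - means[k]) ** 2 for g, x in zip(gamma, data)) / Nk
            variances[k] = max(var, 1e-6)  # assumption: floor against collapse
            weights[k] = Nk / N
    return means, variances, weights


# Hypothetical data: five points around -2 and five points around 3.
data = [-2.1, -1.9, -2.0, -1.8, -2.2, 3.0, 3.2, 2.8, 3.1, 2.9]
means, variances, weights = em_gmm_1d(data, [-1.0, 1.0], [1.0, 1.0], [0.5, 0.5])
print(means, weights)  # means close to -2 and 3, weights close to 0.5 each
```

The M step uses the freshly updated mean when computing the variance, matching the order of the ML conditions above.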

Demo of algorithm (figures not reproduced)

Comparison with K-means (figures not reproduced): EM for Gaussian Mixtures vs. K-means Clustering
