1
Machine Learning, Unit 5, Introduction to Artificial Intelligence, Stanford online course
Made by: Maor Levy, Temple University 2012
5
Choose a fixed number of clusters
Choose cluster centers and point-cluster allocations to minimize error
◦ Can't do this by exhaustive search, because there are too many possible allocations.
Algorithm:
◦ fix cluster centers; allocate points to closest cluster
◦ fix allocation; compute best cluster centers
x could be any set of features for which we can compute a distance (careful about scaling)
K-Means
$$\sum_{i \in \text{clusters}} \;\; \sum_{j \in \text{elements of } i\text{'th cluster}} \left\| x_j - \mu_i \right\|^2$$
* From Marc Pollefeys COMP 256 2003
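The two alternating steps can be sketched in a few lines of NumPy (an illustrative sketch, not code from the course; the simple first-k-points initialization is an assumption, and a real implementation would use k-means++ or several random restarts):

```python
import numpy as np

def kmeans(X, k, n_iter=50):
    """Minimal Lloyd's algorithm: alternate the two fixes from the slide."""
    # Naive deterministic init (first k points); a real implementation
    # would use k-means++ or random restarts.
    centers = X[:k].astype(float).copy()
    labels = np.zeros(len(X), dtype=int)
    for _ in range(n_iter):
        # Fix cluster centers: allocate points to the closest cluster.
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Fix allocation: compute the best (mean) cluster centers.
        centers = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                            else centers[j] for j in range(k)])
    return centers, labels
```

Each pass can only lower the error above, which is why the alternation converges, though only to a local minimum.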
8
K-means clustering using intensity alone and color alone
[Figure: input image; clusters on intensity; clusters on color]
* From Marc Pollefeys COMP 256 2003
Results of K-Means Clustering:
9
Is an approximation to EM
◦ Model (hypothesis space): mixture of N Gaussians
◦ Latent variables: correspondence of data and Gaussians
We notice:
◦ Given the mixture model, it's easy to calculate the correspondence
◦ Given the correspondence, it's easy to estimate the mixture models
K-Means
10
Data generated from mixture of Gaussians
Latent variables: Correspondence between Data Items and Gaussians
Expectation Maximization: Idea
14
Learning a Gaussian Mixture (with known covariance)

E-Step: estimate the expected value of each latent variable $z_{ij}$ (the probability that point $x_i$ was generated by Gaussian $j$):

$$E[z_{ij}] = \frac{p(x = x_i \mid \mu = \mu_j)}{\sum_{n=1}^{k} p(x = x_i \mid \mu = \mu_n)} = \frac{e^{-\frac{1}{2\sigma^2}(x_i - \mu_j)^2}}{\sum_{n=1}^{k} e^{-\frac{1}{2\sigma^2}(x_i - \mu_n)^2}}$$

M-Step: re-estimate each mean as the $E[z_{ij}]$-weighted average of the data:

$$\mu_j \leftarrow \frac{\sum_{i=1}^{m} E[z_{ij}]\, x_i}{\sum_{i=1}^{m} E[z_{ij}]}$$
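A minimal NumPy sketch of one E/M iteration for a one-dimensional mixture with known, shared variance and equal mixing weights (both simplifying assumptions of this illustration, not code from the course):

```python
import numpy as np

def em_step(x, mu, sigma2):
    """One EM iteration for a 1-D Gaussian mixture with known variance
    and equal mixing weights."""
    # E-step: E[z_ij] is proportional to exp(-(x_i - mu_j)^2 / (2 sigma^2)),
    # normalized over the k Gaussians.
    w = np.exp(-(x[:, None] - mu[None, :]) ** 2 / (2 * sigma2))
    Ez = w / w.sum(axis=1, keepdims=True)
    # M-step: each mean becomes the responsibility-weighted average.
    mu_new = (Ez * x[:, None]).sum(axis=0) / Ez.sum(axis=0)
    return mu_new, Ez
```

Iterating `em_step` until the means stop moving realizes the alternation described above.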
15
Converges! Proof [Neal/Hinton, McLachlan/Krishnan]:
◦ E/M step does not decrease data likelihood
◦ Converges at local minimum or saddle point
But subject to local minima
Expectation Maximization
17
Number of clusters unknown
Suffers (badly) from local minima
Algorithm:
◦ Start a new cluster center if many points are "unexplained"
◦ Kill a cluster center that doesn't contribute
◦ (Use AIC/BIC criterion for all this, if you want to be formal)
Practical EM
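The AIC/BIC idea can be sketched for choosing the number of clusters in 1-D data (a rough illustration with several assumptions not in the slides: hard assignments, a shared variance estimate, quantile initialization, and roughly 2k free parameters per model):

```python
import numpy as np

def fit_means(x, k, n_iter=100):
    """Hard-assignment clustering of 1-D data, quantile-initialized."""
    mu = np.quantile(x, np.linspace(0.1, 0.9, k))
    for _ in range(n_iter):
        labels = np.abs(x[:, None] - mu[None, :]).argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                mu[j] = x[labels == j].mean()
    return mu, labels

def bic(x, k):
    """Rough BIC: spherical Gaussian per cluster with a shared variance;
    ~2k parameters (k means, k-1 weights, 1 variance) penalize extra centers."""
    mu, labels = fit_means(x, k)
    m = len(x)
    sigma2 = max(np.var(x - mu[labels]), 1e-9)     # guard against zero variance
    loglik = -0.5 * m * (np.log(2 * np.pi * sigma2) + 1)
    return 2 * k * np.log(m) - 2 * loglik
```

Picking the k with the lowest BIC keeps a new center only if it improves the likelihood by more than the penalty, which is the formal version of "kill a center that doesn't contribute."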
20
Spectral Clustering: Overview
Data Similarities Block-Detection
* Slides from Dan Klein, Sep Kamvar, Chris Manning, Natural Language Group Stanford University
21
Block matrices have block eigenvectors:
Near-block matrices have near-block eigenvectors: [Ng et al., NIPS 02]
Eigenvectors and Blocks
    1 1 0 0
    1 1 0 0      → eigensolver →   e1 = (.71, .71, 0, 0)    λ1 = 2, λ2 = 2
    0 0 1 1                        e2 = (0, 0, .71, .71)    λ3 = 0, λ4 = 0
    0 0 1 1

    1   1  .2   0
    1   1   0 -.2   → eigensolver →   e1 = (.71, .69, .14, 0)    λ1 = 2.02, λ2 = 2.02
    .2  0   1   1                     e2 = (0, -.14, .69, .71)   λ3 = -0.02, λ4 = -0.02
    0 -.2   1   1
* Slides from Dan Klein, Sep Kamvar, Chris Manning, Natural Language Group Stanford University
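The block example above can be checked numerically (a small NumPy illustration, not from the slides; `eigh` is the symmetric eigensolver):

```python
import numpy as np

# The block matrix from the slide.
A = np.array([[1., 1., 0., 0.],
              [1., 1., 0., 0.],
              [0., 0., 1., 1.],
              [0., 0., 1., 1.]])

# eigh handles symmetric matrices; eigenvalues come back in ascending order.
vals, vecs = np.linalg.eigh(A)
# vals ~ [0, 0, 2, 2]; the eigenvalue-2 eigenspace is spanned by one unit
# vector per block, e.g. (.71, .71, 0, 0) and (0, 0, .71, .71).
```

Because the eigenvalue 2 is repeated, the solver may return any orthonormal basis of that eigenspace, but the eigenspace itself (its projector) is still block-structured.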
22
Can put items into blocks by eigenvectors:
Resulting clusters independent of row ordering:
Spectral Space
    1   1  .2   0      e1 = (.71, .69, .14, 0)
    1   1   0 -.2      e2 = (0, -.14, .69, .71)
    .2  0   1   1
    0 -.2   1   1

Same matrix with rows and columns reordered:

    1  .2   1   0      e1 = (.71, .14, .69, 0)
    .2  1   0   1      e2 = (0, .69, -.14, .71)
    1   0   1 -.2
    0   1 -.2   1
* Slides from Dan Klein, Sep Kamvar, Chris Manning, Natural Language Group Stanford University
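Both claims can be sketched by using row i of the top eigenvectors as the embedding of item i (NumPy illustration; the matrix entries are taken from the slide):

```python
import numpy as np

# Near-block affinity matrix from the slide.
A = np.array([[1., 1., .2, 0.],
              [1., 1., 0., -.2],
              [.2, 0., 1., 1.],
              [0., -.2, 1., 1.]])

def spectral_embedding(M, dims=2):
    """Embed item i as row i of the top-`dims` eigenvectors of M."""
    vals, vecs = np.linalg.eigh(M)   # eigenvalues in ascending order
    return vecs[:, -dims:]

E = spectral_embedding(A)
# Rows 0 and 1 (and rows 2 and 3) sit close together in spectral space.
```

Distances between embedding rows do not depend on how the eigensolver orients a degenerate eigenspace, and permuting the items just permutes the rows, which is why the resulting clusters are independent of row ordering.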
23
The key advantage of spectral clustering is the spectral space representation.
The Spectral Advantage
* Slides from Dan Klein, Sep Kamvar, Chris Manning, Natural Language Group Stanford University
24
Measuring Affinity
Intensity: $\mathrm{aff}(x, y) = \exp\left(-\frac{1}{2\sigma_i^2}\,\left\| I(x) - I(y) \right\|^2\right)$

Distance: $\mathrm{aff}(x, y) = \exp\left(-\frac{1}{2\sigma_d^2}\,\left\| x - y \right\|^2\right)$

Texture: $\mathrm{aff}(x, y) = \exp\left(-\frac{1}{2\sigma_t^2}\,\left\| c(x) - c(y) \right\|^2\right)$
* From Marc Pollefeys COMP 256 2003
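All three affinities share one exponential template, which can be written once over any feature (a NumPy sketch; the σ values are free parameters chosen by the user, not given in the slides):

```python
import numpy as np

def affinity(fx, fy, sigma):
    """exp(-||f(x) - f(y)||^2 / (2 sigma^2)) for any feature vectors
    f(x), f(y): intensity I, position, or texture filter outputs c."""
    fx, fy = np.asarray(fx, float), np.asarray(fy, float)
    return np.exp(-np.sum((fx - fy) ** 2) / (2 * sigma ** 2))
```

Identical features give affinity 1, and the affinity decays smoothly toward 0 as the features move apart, with σ setting the decay scale.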
29
Fit multivariate Gaussian
Compute eigenvectors of Covariance
Project onto eigenvectors with largest eigenvalues
Linear: Principal Components
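The three steps above map directly onto NumPy (an illustrative sketch, not code from the course; retaining 2 components is an assumption):

```python
import numpy as np

def pca(X, n_components=2):
    """PCA: fit a Gaussian (mean + covariance), eigendecompose, project."""
    Xc = X - X.mean(axis=0)                  # fit the mean: center the data
    C = np.cov(Xc, rowvar=False)             # sample covariance matrix
    vals, vecs = np.linalg.eigh(C)           # eigenvalues in ascending order
    top = vecs[:, ::-1][:, :n_components]    # eigenvectors w/ largest eigenvalues
    return Xc @ top                          # project onto them
```

The retained eigenvectors are the directions of greatest variance, so the projection is the best linear low-dimensional summary of the data in the least-squares sense.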
30
Other examples of unsupervised learning
Mean face (after alignment)
Slide credit: Santiago Serrano
32
Isomap, Locally Linear Embedding
Non-Linear Techniques
Isomap, Science, M. Balasubramanian and E. Schwartz
34
References:
◦ Sebastian Thrun and Peter Norvig, Artificial Intelligence, Stanford University: http://www.stanford.edu/class/cs221/notes/cs221-lecture6-fall11.pdf