
Machine Learning - Oregon State University (web.engr.oregonstate.edu/~huanlian/teaching/machine-learning/2013spring/lec-11...)

Machine Learning
CUNY Graduate Center, Spring 2013

http://acl.cs.qc.edu/~lhuang/teaching/machine-learning

Professor Liang Huang ([email protected])

Lectures 11-12: Unsupervised Learning 1
(Clustering: k-means, EM, mixture models)


Roadmap

• so far: (large-margin) supervised learning

• binary, multiclass, and structured classifications

• online learning: avg perceptron/MIRA, convergence proof

• kernels and kernelized perceptron in dual

• SVMs: formulation, KKT, dual, convex optimization, slacks

• structured perceptron, HMM, and Viterbi algorithm

• what we left out: many classical algorithms

• nearest neighbors (instance-based), decision trees, logistic regression...

• next up: unsupervised learning

• clustering: k-means, EM, mixture models, hierarchical

• dimensionality reduction: linear (PCA/ICA, MDS), nonlinear (isomap)

2

[figures: binary classification with labels y = -1, y = +1; structured example: “the man bit the dog” tagged DT NN VBD DT NN]


Sup => Unsup: Nearest Neighbor => k-means

• let’s look at a supervised learning method: nearest neighbor

• SVM, perceptron (in dual) and NN are all instance-based learning

• instance-based learning: store a subset of examples for classification

• compression rate: SVM: very high, perceptron: medium high, NN: 0

3
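As a concrete sketch of instance-based classification, here is a minimal k-NN in numpy (the function name and toy data are mine, not from the slides):

```python
import numpy as np

def knn_predict(X_train, y_train, x, k=3):
    """Classify x by majority vote among its k nearest training examples."""
    d = np.linalg.norm(X_train - x, axis=1)     # Euclidean distance to every stored example
    nearest = np.argsort(d)[:k]                 # indices of the k closest examples
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    return labels[np.argmax(counts)]            # majority label

# toy data: two clusters labeled -1 and +1
X = np.array([[0., 0.], [0., 1.], [1., 0.], [5., 5.], [5., 6.], [6., 5.]])
y = np.array([-1, -1, -1, +1, +1, +1])
knn_predict(X, y, np.array([0.5, 0.5]))   # -> -1
knn_predict(X, y, np.array([5.5, 5.5]))   # -> +1
```

Note that the classifier stores every training example, which matches the "compression rate 0" point above.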


k-Nearest Neighbor

• voting over k neighbors: one way to prevent overfitting => more stable results

4


NN Voronoi in 2D and 3D

5


Voronoi for Euclidean and Manhattan distances

6


Unsupervised Learning

• cost of supervised learning

• labeled data: expensive to annotate!

• but there exist huge amounts of data w/o labels

• unsupervised learning

• can only hallucinate the labels

• infer some “internal structures” of data

• still the “compression” view of learning

• too much data => reduce it!

• clustering: reduce # of examples

• dimensionality reduction: reduce # of dimensions

7

[figure (a): unlabeled 2-D data points]


Challenges in Unsupervised Learning

• how to evaluate the results?

• there is no gold standard data!

• internal metric?

• how to interpret the results?

• how to “name” the clusters?

• how to initialize the model/guess?

• a bad initial guess can lead to very bad results

• unsup is very sensitive to initialization (unlike supervised)

• how to do optimization => in general no longer convex!

8



k-means

• (randomly) pick k points to be initial centroids

• repeat the two steps until convergence

• assignment to centroids: voronoi, like NN

• recomputation of centroids based on the new assignment

9


[slides 10-17 repeat the same bullets over frames (b)-(i) of the k-means run, alternating assignment and recomputation]

k-means

• (randomly) pick k points to be initial centroids

• repeat the two steps until convergence

• assignment to centroids: voronoi, like NN

• recomputation of centroids based on the new assignment

• how to define convergence?

• after a fixed number of iterations, or

• assignments do not change, or

• centroids do not change (equivalent?) or

• change in objective function falls below threshold

18
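The assignment/recomputation loop can be sketched in numpy (a minimal illustration; the function name, toy data, and the "assignments do not change" stopping rule are my choices, not from the slides):

```python
import numpy as np

def kmeans(X, k, iters=100, seed=0):
    """Lloyd's algorithm: alternate Voronoi assignment and centroid recomputation."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    assign = None
    for _ in range(iters):
        # assignment step: each point goes to its nearest centroid (Voronoi regions)
        d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        new_assign = d.argmin(axis=1)
        if assign is not None and np.array_equal(new_assign, assign):
            break                                  # assignments unchanged => converged
        assign = new_assign
        # recomputation step: each centroid becomes the mean of its assigned points
        for j in range(k):
            if np.any(assign == j):
                centroids[j] = X[assign == j].mean(axis=0)
    return centroids, assign

X = np.array([[0., 0.], [0., 1.], [10., 10.], [10., 11.]])
centroids, assign = kmeans(X, k=2)   # two well-separated clusters recovered
```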


k-means objective function

• residual sum of squares (RSS)

• sum of squared distances from points to their centroids

• guaranteed to decrease monotonically

• convergence proof: decrease + finite # of clusterings

19

[figure: the objective J decreasing monotonically over successive assignment/recomputation steps]
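Written out (in Bishop-style notation; the indicator r_{nk} and symbols N, K are my additions, not on the slide), the objective the two steps minimize is:

```latex
J \;=\; \sum_{n=1}^{N}\sum_{k=1}^{K} r_{nk}\,\lVert x_n - \mu_k \rVert^2 ,
\qquad
r_{nk} = \begin{cases} 1 & \text{if } x_n \text{ is assigned to centroid } \mu_k \\ 0 & \text{otherwise.} \end{cases}
```

The assignment step minimizes J over the r_{nk} with centroids fixed; the recomputation step minimizes J over each μ_k (the mean of its assigned points), so J can only decrease.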


k-means for image segmentation

20


Problems with k-means

• problem: sensitive to initialization

• the objective function is non-convex: many local minima

• why?

• k-means works well if

• clusters are spherical

• clusters are well separated

• clusters of similar volumes

• clusters have similar # of examples

21



Better (“soft”) k-means?

• random restarts -- definitely helps

• soft clusters => EM with Gaussian Mixture Model

22

[figures: k-means result (i); 1-D mixture density p(x) with component weights 0.5, 0.3, 0.2; panels (a) and (b)]


k-means

• randomize k initial centroids

• repeat the two steps until convergence

• E-step: assign each example to its nearest centroid (Voronoi)

• M-step: recompute centroids based on the new assignment

23



EM for Gaussian Mixtures

• randomize k means, covariances, mixing coefficients

• repeat the two steps until convergence

• E-step: evaluate the responsibilities using current parameters

• M-step: reestimate parameters using current responsibilities

24

[figure (a): initial mixture components over the 2-D data]
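The E-step/M-step loop can be sketched for a 1-D mixture (a minimal illustration; `em_gmm`, the toy data, and the optional `mu0` initializer are my choices, not from the slides):

```python
import numpy as np

def em_gmm(X, k, iters=50, mu0=None, seed=0):
    """EM for a 1-D Gaussian mixture model."""
    rng = np.random.default_rng(seed)
    mu = np.array(mu0, dtype=float) if mu0 is not None \
         else rng.choice(X, size=k, replace=False).astype(float)
    var = np.full(k, X.var())      # start with broad, equal variances
    w = np.full(k, 1.0 / k)        # mixing coefficients
    for _ in range(iters):
        # E-step: responsibilities r[i, j] proportional to w_j * N(x_i | mu_j, var_j)
        r = w * np.exp(-0.5 * (X[:, None] - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)
        r /= r.sum(axis=1, keepdims=True)
        # M-step: re-estimate means, variances, and mixing weights
        nk = r.sum(axis=0)
        mu = (r * X[:, None]).sum(axis=0) / nk
        var = (r * (X[:, None] - mu) ** 2).sum(axis=0) / nk
        w = nk / len(X)
    return mu, var, w

X = np.array([0.0, 0.5, -0.5, 10.0, 10.5, 9.5])
mu, var, w = em_gmm(X, k=2, mu0=[1.0, 9.0])   # means converge near 0 and 10
```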

[slides 25-30: the same bullets repeated over frames (b)-(f): “fractional assignments”, then the fit after L = 1, 2, 5, 20 iterations]

EM for Gaussian Mixtures

31

[figure: panels (b) and (c): fractional assignments and the fit after L = 1]


Convergence

• EM converges much slower than k-means

• can’t use “assignment doesn’t change” for convergence

• use log likelihood of the data

• stop if increase in log likelihood smaller than threshold

• or a maximum # of iterations has been reached

32

$$L \;=\; \log P(\text{data}) \;=\; \log \prod_j P(x_j) \;=\; \sum_j \log P(x_j) \;=\; \sum_j \log \sum_i P(c_i)\,P(x_j \mid c_i)$$


EM: pros and cons (vs. k-means)

• EM: pros

• doesn’t need the clusters to be spherical

• doesn’t need the clusters to be well-separated

• doesn’t need the clusters to be of similar sizes/volumes

• EM: cons

• converges much slower than k-means

• per-iteration computation also slower

• (to speed up EM): use k-means as burn-in

• (same as k-means) gets stuck in local minima!

33


k-means is a special case of EM

• k-means is “hard” EM

• covariance matrices are spherical (equal diagonal entries)

• with the variances approaching 0

34

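This limit can be made explicit (a standard derivation, e.g. Bishop's; the symbols r_{nk}, π_k, ε are not on the slide): with spherical covariances Σ_k = εI, the E-step responsibilities are

```latex
r_{nk} \;=\; \frac{\pi_k \exp\!\left(-\lVert x_n - \mu_k\rVert^2 / 2\varepsilon\right)}
                  {\sum_j \pi_j \exp\!\left(-\lVert x_n - \mu_j\rVert^2 / 2\varepsilon\right)}
\;\xrightarrow{\;\varepsilon \to 0\;}\;
\begin{cases} 1 & k = \arg\min_j \lVert x_n - \mu_j\rVert^2 \\ 0 & \text{otherwise,} \end{cases}
```

i.e., exactly the hard Voronoi assignment of k-means.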


CS 562 - EM

Why does EM increase p(data) iteratively?

35


Why does EM increase p(data) iteratively?

36

[figure: a convex auxiliary function lower-bounds log p(data); the gap between them is a KL divergence; the iterations converge to a local maximum]

How to maximize the auxiliary?

37

[figure: the bound L(q, θ) touches ln p(X|θ) at θ_old and its maximum gives θ_new]


Machine Learning
A Geometric Approach

CUNY Graduate Center, Spring 2013

http://acl.cs.qc.edu/~lhuang/teaching/machine-learning

Professor Liang Huang ([email protected])

Lecture 13: Unsupervised Learning 2
(dimensionality reduction: linear: PCA, ICA, CCA, LDA1, LDA2; non-linear: kernel PCA, isomap, LLE, SDE)



Dimensionality Reduction

39


Dimensionality Reduction

41

[figure: hand images arranged by “wrist rotation” and “fingers extension”, from the Isomap paper (Tenenbaum, de Silva, Langford, Science 2000)]

Algorithms

• linear methods

• PCA - principal component analysis

• ICA - independent component analysis

• CCA - canonical correlation analysis

• MDS - multidimensional scaling

• LEM - Laplacian eigenmaps

• LDA1 - linear discriminant analysis

• LDA2 - latent Dirichlet allocation

42

• non-linear methods

• kernelized PCA

• isomap

• LLE - locally linear embedding

• SDE - semidefinite embedding

all are spectral methods! -- i.e., using eigenvalues

PCA

• greedily find d orthogonal axes onto which the variance under projection is maximal

• the “max variance subspace” formulation

• 1st PC: direction of greatest variability in data

• 2nd PC: the next uncorrelated max-var direction

• remove all variance in 1st PC, redo max-var

• another equivalent formulation: “minimum reconstruction error”

• find orthogonal vectors onto which the projection yields min MSE reconstruction

43


PCA optimization: max-var proj.

• first translate data to zero mean

• compute covariance matrix

• find top d eigenvalues and eigenvectors of the covariance matrix

• project data onto those eigenvectors

44
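The four steps above in numpy (a minimal sketch; the function name and toy data are mine, not from the slides):

```python
import numpy as np

def pca(X, d):
    """Project X onto the top-d principal components (max-variance directions)."""
    Xc = X - X.mean(axis=0)               # translate data to zero mean
    C = np.cov(Xc, rowvar=False)          # covariance matrix
    evals, evecs = np.linalg.eigh(C)      # eigh returns ascending eigenvalues
    order = np.argsort(evals)[::-1][:d]   # pick the top-d eigenvalues/eigenvectors
    W = evecs[:, order]
    return Xc @ W, evals[order], W        # projected data, variances, directions

# toy data lying on the line y = 2x: the 1st PC should point along (1, 2)
X = np.array([[1., 2.], [2., 4.], [3., 6.], [4., 8.]])
Z, evals, W = pca(X, 1)
```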

PCA for k-means and whitening

• rescaling to zero mean and unit variance as preprocessing

• we did that in perceptron HW1/2 also!

• but PCA can do more: whitening (sphering)

45
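Whitening can be sketched the same way (assumption: plain PCA whitening, i.e. rotate onto the principal axes and rescale each axis to unit variance; the `eps` guard against zero eigenvalues is my addition):

```python
import numpy as np

def whiten(X, eps=1e-8):
    """PCA whitening: decorrelate the data, then give every axis unit variance."""
    Xc = X - X.mean(axis=0)                         # zero mean
    evals, evecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
    return (Xc @ evecs) / np.sqrt(evals + eps)      # rotated and rescaled coordinates

# correlated toy data: after whitening, its covariance is (near) the identity
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))
X[:, 1] += X[:, 0]
Z = whiten(X)
```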


Eigendigits

46


Eigenfaces

47



LDA: Fisher’s linear discriminant analysis

• PCA finds a small representation of the data

• LDA performs dimensionality reduction while preserving as much of the class-discrimination information as possible

• it’s a linear classification method (like perceptron)

• find a scalar projection that maximizes separability

• max separation b/w projected means

• min variance within each projected class

48
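For two classes this is commonly written as Fisher's criterion (standard formulation; the symbols S_B, S_W, m_1, m_2 are my notation, not on the slide):

```latex
J(w) \;=\; \frac{w^{\top} S_B\, w}{w^{\top} S_W\, w},
\qquad
w^{*} \;\propto\; S_W^{-1}\,(m_2 - m_1),
```

where m_1, m_2 are the class means, S_B = (m_2 - m_1)(m_2 - m_1)^T is the between-class scatter, and S_W is the sum of within-class scatters: maximizing J pushes the projected means apart while keeping each projected class tight.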


PCA vs LDA

49


Linear vs. non-Linear

50

LLE or isomap

[slides 51-55: further figures contrasting PCA (linear) with LLE (non-linear) embeddings of the same data]


PCA vs Kernel PCA

56


Brain does non-linear dim. redux

57