Transcript
Page 1: Recent developments in nonlinear dimensionality reduction

Recent developments in nonlinear dimensionality

reduction

Josh Tenenbaum

MIT

Page 2

Collaborators

• Vin de Silva

• John Langford

• Mira Bernstein

• Mark Steyvers

• Eric Berger

Page 3

Outline

• The problem of nonlinear dimensionality reduction

• The Isomap algorithm

• Development #1: Curved manifolds

• Development #2: Sparse approximations

Page 4

Learning an appearance map

• Given input: . . .

• Desired output:
  – Intrinsic dimensionality: 3
  – Low-dimensional representation:

Page 5

Linear dimensionality reduction: PCA, MDS

• PCA dimensionality of faces:

• First two PCs:

Page 6

• Linear manifold: PCA

• Nonlinear manifold: ?

Page 7

Previous approaches to nonlinear dimensionality reduction

• Local methods seek a set of low-dimensional models, each valid over a limited range of data:
  – Local PCA
  – Mixture of factor analyzers

• Global methods seek a single low-dimensional model valid over the whole data set:
  – Autoencoder neural networks
  – Self-organizing map
  – Elastic net
  – Principal curves & surfaces
  – Generative topographic mapping

Page 8

A generative model

• Latent space Y ⊂ R^d

• Latent data {yi} ⊂ Y generated from p(Y)

• Mapping f: Y → R^N for some N > d

• Observed data {xi = f(yi)} ⊂ R^N

Goal: given {xi}, recover f and {yi}.
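The generative model above can be made concrete; a minimal sketch, where the 2-D latent domain and the "swiss roll" map are illustrative choices (the slide does not fix a specific f or p(Y)):

```python
import numpy as np

rng = np.random.default_rng(0)

# Latent data {y_i} drawn from p(Y) on a 2-D domain Y (d = 2 here).
n = 1000
y = rng.uniform(low=[3.0, 0.0], high=[9.0, 15.0], size=(n, 2))

def f(y):
    """A nonlinear mapping f: Y -> R^N (N = 3): the classic 'swiss roll'."""
    t, h = y[:, 0], y[:, 1]
    return np.column_stack([t * np.cos(t), h, t * np.sin(t)])

x = f(y)          # observed data {x_i = f(y_i)} in R^N
print(x.shape)    # (1000, 3)
```

The learner sees only `x`; the goal is to recover `y` (up to isomorphism) and an approximation of `f`.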

Page 9

Chicken-and-egg problem

• We know {xi} . . .

• . . . and if we knew {yi}, could estimate f.

• . . . or if we knew f, could estimate {yi}.

• So use EM, right? Wrong.

Page 10

The problem of local minima

[Figures: GTM, SOM]

• Global nonlinear dimensionality reduction + local optimization = severe local minima

Page 11

A different approach

• Attempt to infer {yi} directly from {xi}, without explicit reference to f.

• Closed-form, non-iterative, globally optimal solution for {yi}.

• Then can approximate f with a suitable interpolation algorithm (RBFs, local linear, ...).

• In other words, finding f becomes a supervised learning problem on pairs {yi ,xi}.
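Once the {yi} are in hand, fitting f is ordinary supervised regression. A minimal sketch using RBF interpolation (the data and the smooth map used to fake the pairs are invented for illustration):

```python
import numpy as np
from scipy.interpolate import RBFInterpolator

rng = np.random.default_rng(0)

# Suppose a previous step recovered latent coordinates {y_i} for observed {x_i}.
# Here we fabricate such pairs with a known smooth map, purely for illustration.
y = rng.uniform(-1, 1, size=(200, 2))                       # recovered latent data
x = np.column_stack([np.sin(y[:, 0]), y[:, 1], y.sum(1)])   # observed data

# Fitting f is now supervised learning on the pairs {(y_i, x_i)}.
f_hat = RBFInterpolator(y, x, kernel="thin_plate_spline")

y_new = np.array([[0.2, -0.3]])
x_new = f_hat(y_new)        # approximate f at a new latent point
print(x_new.shape)          # (1, 3)
```

Local linear regression would serve equally well here; the slide leaves the choice of interpolator open.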

Page 12

When does this work?

• Only given some assumptions on the nature of f and the distribution of the {yi}.

• The trick: exploit some invariant of f, a property of the {yi} that is preserved in the {xi}, and that allows the {yi} to be read off uniquely*.

* up to some isomorphism (e.g., rotation).

Page 13

The assumptions behind three algorithms

No free lunch: weaker assumptions on f ⇒ stronger assumptions on p(Y).

     Distribution: p(Y)         Mapping: f        Algorithm
i)   arbitrary                  linear isometric  Classical MDS
ii)  convex, dense              isometric         Isomap
iii) convex, uniformly dense    conformal         C-Isomap

[Illustrations of cases i), ii), iii)]

Page 14

The assumptions behind three algorithms

     Distribution: p(Y)         Mapping: f        Algorithm
i)   arbitrary                  linear isometric  Classical MDS
ii)  convex, dense              isometric         Isomap
iii) convex, uniformly dense    conformal         C-Isomap

[Illustration of case i)]

Page 15

Classical MDS

• Invariant: Euclidean distance

• Algorithm:
  – Calculate Euclidean distance matrix D.
  – Convert D to canonical inner product matrix B by "double centering":

    b_ij = -1/2 ( d_ij² - (1/n) Σ_i d_ij² - (1/n) Σ_j d_ij² + (1/n²) Σ_{i,j} d_ij² )

  – Compute {yi} from eigenvectors of B.
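The three steps of classical MDS can be sketched directly in code (a minimal illustration, not an optimized implementation):

```python
import numpy as np

def classical_mds(D, d):
    """Classical MDS: embed points given a Euclidean distance matrix D."""
    n = D.shape[0]
    # Double centering: B = -1/2 J D^2 J with J = I - (1/n) 11^T,
    # equivalent to subtracting row/column means and adding the grand mean.
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (D ** 2) @ J
    # The top-d eigenvectors of B, scaled by sqrt(eigenvalue), give {y_i}.
    w, V = np.linalg.eigh(B)
    idx = np.argsort(w)[::-1][:d]
    return V[:, idx] * np.sqrt(np.maximum(w[idx], 0.0))

# Sanity check: a 2-D point cloud is recovered up to rotation/translation,
# so all pairwise distances are preserved.
rng = np.random.default_rng(0)
pts = rng.normal(size=(50, 2))
D = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
Y = classical_mds(D, 2)
D_rec = np.linalg.norm(Y[:, None, :] - Y[None, :, :], axis=-1)
print(np.allclose(D, D_rec, atol=1e-8))  # True: distances preserved
```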

Page 16

The assumptions behind three algorithms

     Distribution: p(Y)         Mapping: f        Algorithm
i)   arbitrary                  linear isometric  Classical MDS
ii)  convex, dense              isometric         Isomap
iii) convex, uniformly dense    conformal         C-Isomap

[Illustration of case ii)]

Page 17

Isomap

• Invariant: geodesic distance

Page 18

The Isomap algorithm

• Construct neighborhood graph G.
  – ε method
  – K method

• Compute shortest paths in G, with edge ij weighted by the Euclidean distance |xi - xj|.
  – Floyd
  – Dijkstra (+ Fibonacci heaps)

• Reconstruct low-dimensional latent data {yi}.
  – Classical MDS on graph distances
  – Sparse MDS with landmarks
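A minimal sketch of the three steps, using the K method and Dijkstra via SciPy (the landmark variant is omitted; this is an illustration, not a tuned implementation):

```python
import numpy as np
from scipy.sparse.csgraph import shortest_path
from scipy.spatial.distance import pdist, squareform

def isomap(X, n_neighbors, d):
    """Sketch of Isomap: K-neighbor graph -> graph shortest paths -> MDS."""
    D = squareform(pdist(X))                 # Euclidean distances |x_i - x_j|
    n = D.shape[0]
    # K method: keep edges from each point to its K nearest neighbors.
    W = np.full((n, n), np.inf)              # inf = no edge for csgraph
    for i in range(n):
        nbrs = np.argsort(D[i])[1:n_neighbors + 1]
        W[i, nbrs] = D[i, nbrs]
    W = np.minimum(W, W.T)                   # symmetrize the graph
    G = shortest_path(W, method="D")         # Dijkstra; "FW" for Floyd
    # Classical MDS on the geodesic (graph) distance matrix.
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (G ** 2) @ J
    w, V = np.linalg.eigh(B)
    idx = np.argsort(w)[::-1][:d]
    return V[:, idx] * np.sqrt(np.maximum(w[idx], 0.0))

# Example: unroll a 3-D helix to (roughly) its 1-D arc-length parameter.
t = np.linspace(0, 4 * np.pi, 100)
X = np.column_stack([np.cos(t), np.sin(t), 0.3 * t])
Y = isomap(X, n_neighbors=8, d=1)
print(Y.shape)  # (100, 1)
```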

Page 19

Illustration on swiss roll

Page 20

Discovering the dimensionality

• Measure residual variance in geodesic distances . . .

• . . . and find the elbow.

[Plots: residual variance vs. dimensionality for MDS / PCA and for Isomap]
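The elbow diagnostic can be sketched as follows, assuming the usual definition of residual variance as 1 − R² between the geodesic distance matrix and the pairwise distances in the d-dimensional embedding:

```python
import numpy as np

def residual_variance(D_geo, Y):
    """1 - R^2 between geodesic distances and embedding distances.

    D_geo : (n, n) geodesic (graph) distance matrix
    Y     : (n, d) candidate embedding
    """
    D_emb = np.linalg.norm(Y[:, None, :] - Y[None, :, :], axis=-1)
    r = np.corrcoef(D_geo.ravel(), D_emb.ravel())[0, 1]
    return 1.0 - r ** 2

# A perfect 1-D embedding of points on a line leaves ~zero residual variance.
p = np.linspace(0, 1, 30)[:, None]
D = np.abs(p - p.T)
print(residual_variance(D, p))
```

Plotting this quantity for d = 1, 2, 3, ... and looking for the elbow gives the dimensionality estimate described on the slide.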

Page 21

Theoretical analysis of asymptotic convergence

• Conditions for PAC-style asymptotic convergence
  – Geometric:
    • Mapping f is isometric to a subset of Euclidean space (i.e., zero intrinsic curvature).
  – Statistical:
    • Latent data {yi} are a "representative" sample* from a convex domain.

* Minimum distance from any point on the manifold to a sample point < ε (e.g., a variable-density Poisson process).

Page 22

Theoretical results on the rate of convergence

• Upper bound on the number of data points required.

• Rate of convergence depends on several geometric parameters of the manifold:
  – Intrinsic:
    • dimensionality
  – Embedding-dependent:
    • minimal radius of curvature
    • minimal branch separation

Page 23

Face under varying pose and illumination

• Dimensionality

• [Figures: MDS / PCA vs. Isomap results]

Page 24

Hand under nonrigid articulation

• Dimensionality

• [Figures: MDS / PCA vs. Isomap results]

Page 25

Apparent motion

Page 38

Digits

• Dimensionality

• [Figures: MDS / PCA vs. Isomap results]

Page 50

Summary of Isomap

A framework for global nonlinear dimensionality reduction that preserves the crucial features of PCA and classical MDS:

• A noniterative, polynomial-time algorithm.
• Guaranteed to construct a globally optimal Euclidean embedding.
• Guaranteed to converge asymptotically for an important class of nonlinear manifolds.

Plus, good results on real and nontrivial synthetic data sets.

Page 51

Outline

• The problem of nonlinear dimensionality reduction

• The Isomap algorithm

• Development #1: Curved manifolds

• Development #2: Sparse approximations

Page 52

Locally Linear Embedding (LLE)

• Roweis and Saul (2000)

Page 53

Comparing LLE and Isomap

• Both start with only local metric information.

• Isomap first estimates global metric structure, then finds an embedding that optimally preserves global structure.

• LLE finds an embedding that optimally preserves only local structure.

• LLE may be more efficient, but may also introduce unpredictable global distortions.

• No asymptotic convergence results for LLE.

Page 54

[Figures: LLE vs. Isomap embeddings]

Page 55

Outline

• The problem of nonlinear dimensionality reduction

• The Isomap algorithm

• Development #1: Curved manifolds

• Development #2: Sparse approximations

Page 56

The assumptions behind three algorithms

     Distribution: p(Y)         Mapping: f        Algorithm
i)   arbitrary                  linear isometric  Classical MDS
ii)  convex, dense              isometric         Isomap
iii) convex, uniformly dense    conformal         C-Isomap

[Illustration of case iii)]

Page 57

Isometric vs. conformal mapping

• Isometric map: preserves the Euclidean metric at each point y.

• Conformal map: preserves the Euclidean metric at each point y, up to an arbitrary scale factor s(y) > 0.

• Properties of conformal maps:
  – Angle-preserving.
  – Any subset topologically equivalent to a disk can be conformally mapped onto a disk.

Page 58

C-Isomap

• Invariant: the mean distance from each point to its neighbors,

    M_X(i) = mean_j |xi - xj|   (in the observed space X)
    M_Y(i) = mean_j |yi - yj|   (in the latent space Y)

  For a conformal map f, M_X(i) = s(yi) M_Y(i), with M_Y(i) independent of i (uniform density).

Page 59

The Isomap algorithm

• Construct neighborhood graph G.
  – ε method
  – K method

• Compute shortest paths in G, with edge ij weighted by the Euclidean distance |xi - xj|.
  – Floyd
  – Dijkstra (+ Fibonacci heaps)

• Reconstruct low-dimensional latent data {yi}.
  – Classical MDS on graph distances
  – Sparse MDS with landmarks

Page 60

The C-Isomap algorithm

• Construct neighborhood graph G.
  – ε method
  – K method

• Compute shortest paths in G, with edge ij weighted by the rescaled distance
  |xi - xj| / sqrt( M_X(i) M_X(j) ).
  – Floyd
  – Dijkstra (+ Fibonacci heaps)

• Reconstruct low-dimensional latent data {yi}.
  – Classical MDS on graph distances
  – Sparse MDS with landmarks
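The rescaling is the only change from plain Isomap; a sketch of computing the rescaled edge weights (K method assumed, with M_X(i) estimated as the mean distance to the K nearest neighbors):

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

def c_isomap_weights(X, n_neighbors):
    """C-Isomap edge weights: |x_i - x_j| / sqrt(M(i) M(j)),
    where M(i) is the mean distance from x_i to its K nearest neighbors."""
    D = squareform(pdist(X))
    n = D.shape[0]
    order = np.argsort(D, axis=1)            # column 0 is the point itself
    M = np.array([D[i, order[i, 1:n_neighbors + 1]].mean() for i in range(n)])
    W = np.full((n, n), np.inf)              # inf = no edge
    for i in range(n):
        nbrs = order[i, 1:n_neighbors + 1]
        W[i, nbrs] = D[i, nbrs] / np.sqrt(M[i] * M[nbrs])
    return np.minimum(W, W.T)                # symmetrize the graph

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 3))
W = c_isomap_weights(X, 5)
print(W.shape)  # (30, 30)
```

Feeding `W` into the same shortest-path + MDS steps as Isomap completes the algorithm; dividing by sqrt(M(i) M(j)) discounts distances in regions that f has magnified.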

Page 61

Conformal fishbowl

[Figures: Data, MDS, Isomap, C-Isomap, LLE, GTM]

Page 62

Uniform fishbowl

[Figures: Data, MDS, Isomap, C-Isomap, LLE, GTM]

Page 63

Conformal fishbowl, Gaussian density

[Figures: Latent data, C-Isomap, LLE]

Page 64

Conformal fishbowl, offset Gaussian density

[Figures: Latent data, C-Isomap, LLE]

Page 65

Wavelet

[Figures: Data, MDS, Isomap, C-Isomap, LLE, GTM]

Page 66

Images of Tom’s face

• Two intrinsic degrees of freedom:
  – Translation: left/right
  – Zoom: in/out

• Scale variables (e.g., zoom) introduce conformal distortion.

. . .

Page 67

Face under translation and zoom

[Figures: Data, MDS, Isomap, C-Isomap, LLE, GTM]

Page 68

Curvature in LLE vs. Isomap

• LLE:
  +/- Approach: look only at local structure, ignoring global structure.
  -   Asymptotics: unknown.
  +   Nonconformal maps: good for some, but not all.

• Isomap:
  +/- Approach: explicitly estimate, and factor out, local metric distortion (assuming uniform density).
  +   Asymptotics: succeeds for all conformal mappings.
  +   Nonconformal maps: good for some, but not all.
