dimensionality reduction part 2: nonlinear methods

The UNIVERSITY of NORTH CAROLINA at CHAPEL HILL Dimensionality Reduction Part 2: Nonlinear Methods Comp 790-090 Spring 2007

Upload: dirk

Post on 15-Jan-2016



TRANSCRIPT

Page 1: Dimensionality Reduction Part 2: Nonlinear Methods

The UNIVERSITY of NORTH CAROLINA at CHAPEL HILL

Dimensionality Reduction Part 2: Nonlinear Methods

Comp 790-090, Spring 2007

Page 2: Dimensionality Reduction Part 2: Nonlinear Methods

Previously…

Linear Methods for Dimensionality Reduction

PCA: rotate the data so that the principal axes lie along the directions of maximum variance
MDS: find coordinates that best preserve pairwise distances

[Figure label: PCA]

Page 3: Dimensionality Reduction Part 2: Nonlinear Methods

Motivation

Linear dimensionality reduction doesn’t always work
Data violates the underlying “linear” assumptions
• Data is not accurately modeled by “affine” combinations of measurements
• Structure of the data, while apparent, is not simple
In the end, linear methods do nothing more than “globally transform” (rotate, translate, and scale) all of the data; sometimes what’s needed is to “unwrap” the data first

Page 4: Dimensionality Reduction Part 2: Nonlinear Methods

Stopgap Remedies

Local PCA
• Compute PCA models for small, overlapping item neighborhoods
• Requires a clustering preprocess
• Fast and simple, but results in no global parameterization

Neural Networks
• Assumes a solution of a given dimension
• Uses relaxation methods to deform the given solution to find a better fit
• Relaxation step is modeled as “layers” in a network, where properties of future iterations are computed based on information from the current structure
• Many successes, but a bit of an art

Page 5: Dimensionality Reduction Part 2: Nonlinear Methods

Why Linear Modeling Fails

Suppose that your sample data lies on some low-dimensional surface embedded within the high-dimensional measurement space.
• Linear models allow ALL affine combinations
• Often, certain combinations are atypical of the actual data
• Recognizing this is harder as dimensionality increases

Page 6: Dimensionality Reduction Part 2: Nonlinear Methods

What does PCA Really Model?

Principal Component Analysis assumptions
Mean-centered distribution
• What if the mean itself is atypical?
Eigenvectors of covariance
• Basis vectors aligned with successive directions of greatest variance
Classic 1st-order statistical model
• Distribution is characterized by its mean and variance (Gaussian hyperspheres)
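
As a rough illustration of the model described above, the following sketch computes PCA from the eigenvectors of the covariance matrix. The function name `pca`, the row-per-item data layout, and the unit-circle test data are assumptions made for this example and are not taken from the slides. The circle is chosen because its mean (the origin) is not a point on the circle, which is one way the mean can be "atypical".

```python
import numpy as np

def pca(X, d):
    """PCA sketch. X is (N, M) with one item per row (an assumed layout).

    Returns the mean, the top-d basis vectors (M, d), the projected
    coordinates (N, d), and the variance along each kept direction.
    """
    mean = X.mean(axis=0)
    Xc = X - mean                          # mean-center the data
    cov = Xc.T @ Xc / (X.shape[0] - 1)     # sample covariance matrix
    evals, evecs = np.linalg.eigh(cov)     # eigh returns ascending eigenvalues
    order = np.argsort(evals)[::-1][:d]    # indices of the d largest
    basis = evecs[:, order]
    return mean, basis, Xc @ basis, evals[order]

# Illustrative data: 200 evenly spaced points on the unit circle.
t = np.linspace(0.0, 2.0 * np.pi, 200, endpoint=False)
circle = np.column_stack([np.cos(t), np.sin(t)])
mean, basis, coords, variances = pca(circle, 1)

# Every sample has norm 1, but the mean sits at the origin, off the manifold.
print("norm of mean:", np.linalg.norm(mean))
print("variance along first axis:", variances[0])
```

For this data both eigenvalues are equal, so a single linear axis cannot describe the circle, even though the circle is intrinsically one-dimensional.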

Page 7: Dimensionality Reduction Part 2: Nonlinear Methods

Non-Linear Dimensionality Reduction

Non-linear manifold learning
• Instead of preserving global pairwise distances, non-linear dimensionality reduction tries to preserve only the geometric properties of local neighborhoods
Discover a lower-dimensional “embedding” manifold
Find a parameterization over that manifold
• Linear parameter space
• Projection mapping from the original M-D space to the d-D embedding space

[Figure labels: Linear Embedding Space; “projection”; “reprojection, elevating, or lifting”]

Page 8: Dimensionality Reduction Part 2: Nonlinear Methods

Nonlinear DimRedux Steps

Discover a low-dimensional embedding manifold
Find a parameterization over the manifold
Project data into parameter space
Analyze, interpolate, and compress in embedding space
• Orient (by linear transformation) the parameter space to align axes with salient features
• Linear (affine) combinations are valid here
In the case of interpolation and compression, use “lifting” to estimate the M-D original data

Page 9: Dimensionality Reduction Part 2: Nonlinear Methods

Nonlinear Methods

Local Linear Embeddings [Roweis 2000]
Isomaps [Tenenbaum 2000]
These two papers ignited the field
• Principled approach (asymptotically, as the amount of data goes to infinity, they have been proven to find the “real” manifold)
• Widely applied
• Hotly contested

Page 10: Dimensionality Reduction Part 2: Nonlinear Methods

Local Linear Embeddings

First Insight
• Locally, at a fine enough scale, everything looks linear

Page 11: Dimensionality Reduction Part 2: Nonlinear Methods

Local Linear Embeddings

First Insight
• Find an affine combination of the “neighborhood” about a point that best approximates it

Page 12: Dimensionality Reduction Part 2: Nonlinear Methods

Finding a Good Neighborhood

This is the remaining “art” aspect of nonlinear methods
Common choices
ε-ball: find all items that lie within an epsilon ball of the target item as measured under some metric
• Best if the density of items is high and every point has a sufficient number of neighbors
K-nearest neighbors: find the k closest neighbors to a point under some metric
• Guarantees all items are similarly represented, limits dimension to K-1
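
The two neighborhood choices above can be sketched as follows. This is a minimal brute-force version that assumes a Euclidean metric and a row-per-item layout; the function names `epsilon_ball` and `k_nearest` and the four sample points are hypothetical and chosen only to show the difference in behavior for an isolated item.

```python
import numpy as np

def pairwise_distances(X):
    """Euclidean distance between every pair of rows of X (N, M)."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def epsilon_ball(X, eps):
    """For each item, indices of all other items within distance eps."""
    D = pairwise_distances(X)
    idx = np.arange(len(X))
    return [np.flatnonzero((D[i] <= eps) & (idx != i)) for i in range(len(X))]

def k_nearest(X, k):
    """For each item, indices of its k closest other items."""
    D = pairwise_distances(X)
    np.fill_diagonal(D, np.inf)            # an item is not its own neighbor
    return np.argsort(D, axis=1)[:, :k]

# Three clustered items and one isolated item on a line.
pts = np.array([[0.0], [1.0], [2.0], [10.0]])
ball = epsilon_ball(pts, 1.5)
knn = k_nearest(pts, 2)

print([list(b) for b in ball])   # the isolated item gets no neighbors
print(knn[3])                    # with k-nearest it always gets exactly k
```

The isolated item illustrates the trade-off noted on the slide: the ε-ball can leave a point with too few neighbors, while k-nearest neighbors gives every item the same number regardless of how far away they are.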

Page 13: Dimensionality Reduction Part 2: Nonlinear Methods

Affine “Neighbor” Combinations

Within locally linear neighborhoods, each point can be considered as an affine combination of its neighbors

$$x_i = \sum_{j \in \mathrm{neighbor}(x_i)} w_{ij}\, x_j, \qquad \sum_j w_{ij} = 1$$

Weights should still be valid in the lower-dimensional embedding space

Imagine cutting out patches from the manifold and placing them in a lower-dimensional space so that the angles between points are preserved.

Page 14: Dimensionality Reduction Part 2: Nonlinear Methods

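Find Weights

Rewriting as a matrix for all x

$$\underbrace{\begin{bmatrix} \tilde{x}_1 & \tilde{x}_2 & \tilde{x}_3 & \cdots & \tilde{x}_N \end{bmatrix}}_{M \times N} = \underbrace{\begin{bmatrix} x_1 & x_2 & x_3 & \cdots & x_N \end{bmatrix}}_{M \times N}\; \underbrace{W}_{N \times N}$$

W is the “unknown W matrix”: it is sparse, column i holds the weights $w_{ij}$ used to reconstruct $x_i$, and its entries are zero on the diagonal and for every item outside the neighborhood of $x_i$

Reorganizing

$$\varepsilon_i = \Big\| x_i - \sum_{j \in \mathrm{Neighbor}(i)} w_{ij}\, x_j \Big\|^2$$

Want to find W that minimizes $\varepsilon_i$ and satisfies the “sum-to-one” constraint
Ends up as a constrained “least-squares” problem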
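
One common way to solve this constrained least-squares problem is sketched below: shift each neighborhood so that the target point is the origin, solve a small linear system against a vector of ones, and rescale the result to sum to one. The function name `lle_weights`, the regularization constant `reg`, and the row-per-item layout (so `W[i, j]` holds $w_{ij}$, the transpose of the column layout used on the slide) are assumptions of this example.

```python
import numpy as np

def lle_weights(X, neighbors, reg=1e-3):
    """Reconstruction weights, one row per item.

    X is (N, M) with one item per row; neighbors is an (N, k) integer
    array of neighbor indices. Returns W of shape (N, N) where row i
    sums to one and is zero outside the neighborhood of item i.
    """
    N, k = neighbors.shape
    W = np.zeros((N, N))
    for i in range(N):
        Z = X[neighbors[i]] - X[i]          # neighbors relative to x_i
        C = Z @ Z.T                         # local k x k covariance matrix
        # Add a small multiple of the identity so C is well conditioned.
        C = C + reg * (np.trace(C) or 1.0) * np.eye(k)
        w = np.linalg.solve(C, np.ones(k))
        W[i, neighbors[i]] = w / w.sum()    # enforce the sum-to-one constraint
    return W

# Four collinear points; each interior point is the midpoint of its neighbors.
X = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [3.0, 3.0]])
neighbors = np.array([[1, 2], [0, 2], [1, 3], [2, 1]])
W = lle_weights(X, neighbors)

print(W[1])            # interior point: equal weights on its two neighbors
print(W.sum(axis=1))   # every row sums to one
```

Without the regularization term the local covariance matrix in this example would be singular, because the two neighbors of each point are collinear with it.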

Page 15: Dimensionality Reduction Part 2: Nonlinear Methods

Find Linear Embedding Space

Now that we have the weight matrix W, find the linear vector that satisfies the following

$$X = XW$$

where W is N x N and X is M x N
This can be found by finding the null space of A, where

$$0 = X(I - W) = XA$$

Classic problem: run SVD on A and find the orthogonal vectors associated with the d smallest non-zero singular values (the smallest singular value overall will be zero and represents the system’s invariance to translation)
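
The null-space computation can be sketched with an SVD as follows. Two choices here are assumptions of this example rather than the slide's procedure. First, weights are stored one row per item (`W[i, j]` is $w_{ij}$), so the matrix is applied from the left and the right singular vectors are used. Second, instead of discarding the zero singular vector afterwards, the constant (translation) direction is removed up front by restricting the problem to vectors orthogonal to the all-ones vector. This keeps the constant vector from mixing with the embedding when the null space is degenerate. The name `lle_embedding` is hypothetical.

```python
import numpy as np

def lle_embedding(W, d):
    """Embedding coordinates from a row-stochastic weight matrix W (N, N).

    Returns Y of shape (N, d) and the d smallest singular values found
    after the constant direction has been removed.
    """
    N = W.shape[0]
    A = np.eye(N) - W
    # Orthonormal basis Q for the subspace orthogonal to the ones vector.
    seed = np.column_stack([np.ones(N), np.eye(N)[:, 1:]])
    Q_full, _ = np.linalg.qr(seed)
    Q = Q_full[:, 1:]
    # Singular values come back in descending order, so reverse them.
    _, s, Vt = np.linalg.svd(A @ Q, full_matrices=False)
    Y = Q @ Vt[::-1][:d].T
    return Y, s[::-1][:d]

# Weights for 8 evenly spaced points on a line: interior points are the
# average of their two neighbors, end points are extrapolated.
N = 8
W = np.zeros((N, N))
for i in range(1, N - 1):
    W[i, i - 1] = W[i, i + 1] = 0.5
W[0, 1], W[0, 2] = 2.0, -1.0
W[-1, -2], W[-1, -3] = 2.0, -1.0

Y, sing = lle_embedding(W, 1)
print(np.round(Y[:, 0], 3))   # evenly spaced values, up to sign and scale
```

In this example any evenly spaced sequence satisfies the weight equations exactly, so the recovered coordinate is a centered, scaled copy of the original positions along the line.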

Page 16: Dimensionality Reduction Part 2: Nonlinear Methods

Numerical Issues

Numerical problems can arise in computing LLEs
The least-squares covariance matrix that arises in the computation of the weighting matrix, W, can be ill-conditioned
• Regularization (rescale the measurements by adding a small multiple of the identity to the covariance matrix)
Finding small singular (eigen) values is not as well conditioned as finding large ones. The small ones are subject to numerical precision errors, and can get mixed with one another
• Good (but slow) solvers exist; you have to use them

Page 17: Dimensionality Reduction Part 2: Nonlinear Methods

Results

The resulting parameter vector, $y_i$, gives the coordinates associated with the item $x_i$

The dth embedding coordinate is formed from the orthogonal vector associated with the (d+1)st smallest singular value of A.

Page 18: Dimensionality Reduction Part 2: Nonlinear Methods

Reprojection

Often, for data analysis, a parameterization is enough
For interpolation and compression we might want to map points from the parameter space back to the “original” space
No perfect solution, but a few approximations
• Delaunay triangulate the points in the embedding space, find the triangle that the desired parameter setting falls into, compute the barycentric coordinates of the setting within that triangle, and use them as weights
• Interpolate by using a radially symmetric kernel centered about the desired parameter setting
Works, but mappings might not be one-to-one
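
The kernel-based approximation can be sketched as a weighted average of the original items, with weights that depend only on distance in the embedding space. The Gaussian kernel, the bandwidth `sigma`, the function name `lift`, and the half-circle test data are all assumptions of this example; the slide only specifies a radially symmetric kernel.

```python
import numpy as np

def lift(y_query, Y, X, sigma):
    """Estimate an M-D point for the embedding coordinates y_query.

    Y is (N, d) embedding coordinates, X is (N, M) original items.
    Uses a normalized Gaussian kernel centered on y_query.
    """
    d2 = ((Y - y_query) ** 2).sum(axis=1)     # squared distances in embedding
    k = np.exp(-d2 / (2.0 * sigma ** 2))      # radially symmetric kernel
    w = k / k.sum()                           # normalize so weights sum to one
    return w @ X

# Illustrative data: a half circle parameterized by its angle.
t = np.linspace(0.0, np.pi, 50)
Y = t[:, None]                                # 1-D "embedding" coordinates
X = np.column_stack([np.cos(t), np.sin(t)])   # 2-D "original" items

estimate = lift(np.array([1.0]), Y, X, sigma=0.05)
print(estimate, (np.cos(1.0), np.sin(1.0)))
```

Because the result is an average of nearby samples, it falls slightly inside the curve; a narrower kernel reduces that bias but needs denser sampling.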

Page 19: Dimensionality Reduction Part 2: Nonlinear Methods

LLE Example

3-D S-curve manifold with points color-coded
Compute a 2-D embedding
• The local affine structure is well maintained
• The metric structure is okay locally, but can drift slowly over the domain (this causes the manifold to taper)

Page 20: Dimensionality Reduction Part 2: Nonlinear Methods

More LLE Examples

Page 22: Dimensionality Reduction Part 2: Nonlinear Methods

LLE Failures

Does not work on closed manifolds
Cannot recognize topology

Page 23: Dimensionality Reduction Part 2: Nonlinear Methods

Isomap

An alternative non-linear dimensionality reduction method that extends MDS
Key observation:
• On a manifold, distances are measured using geodesic distances rather than Euclidean distances

[Figure labels: small Euclidean distance; large geodesic distance]

Page 24: Dimensionality Reduction Part 2: Nonlinear Methods

Problem: How to Get Geodesics

Without knowledge of the manifold it is difficult to compute the geodesic distance between points
It is difficult even if you know the manifold
Solution
• Use a discrete geodesic approximation
• Apply a graph algorithm to approximate the geodesic distances

Page 25: Dimensionality Reduction Part 2: Nonlinear Methods

Dijkstra’s Algorithm

Efficient solution to the all-points shortest-path problem (shortest paths from one source to every other point)
Greedy breadth-first algorithm
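
A minimal priority-queue version of Dijkstra's algorithm is sketched below. The adjacency-dictionary graph format, the function name `dijkstra`, and the small four-node graph are assumptions of this example. Running it once per source gives the full table of pairwise shortest-path distances.

```python
import heapq

def dijkstra(graph, source):
    """Shortest-path distance from source to every reachable node.

    graph maps each node to a dict of {neighbor: non-negative edge weight}.
    """
    dist = {source: 0.0}
    heap = [(0.0, source)]
    done = set()
    while heap:
        d, u = heapq.heappop(heap)      # greedy step: closest unfinished node
        if u in done:
            continue                    # stale queue entry, already finalized
        done.add(u)
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

# Small undirected example graph.
graph = {
    "a": {"b": 1.0, "c": 4.0},
    "b": {"a": 1.0, "c": 2.0, "d": 6.0},
    "c": {"a": 4.0, "b": 2.0, "d": 3.0},
    "d": {"b": 6.0, "c": 3.0},
}
from_a = dijkstra(graph, "a")
all_pairs = {s: dijkstra(graph, s) for s in graph}
print(from_a)
```

In this graph the direct edge from a to c costs 4, but the path through b costs 3, which is the kind of detour along the graph that stands in for a geodesic.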

Page 29: Dimensionality Reduction Part 2: Nonlinear Methods

Isomap algorithm

Compute fully-connected neighborhood of points for each item
• Can be k nearest neighbors or ε-ball
• Neighborhoods must be symmetric
• Test that resulting graph is fully-connected; if not, increase either K or ε
Calculate pairwise Euclidean distances within each neighborhood
Use Dijkstra’s Algorithm to compute shortest path from each point to non-neighboring points
Run MDS on resulting distance matrix
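
The steps above can be sketched end to end as follows. This is a small brute-force version under stated assumptions: k-nearest neighborhoods made symmetric by adding each edge in both directions, a heap-based Dijkstra run from every item, and classical MDS by eigendecomposition of the double-centered squared-distance matrix. The function name `isomap` and the half-circle test data are hypothetical.

```python
import heapq
import numpy as np

def isomap(X, k, d):
    """Isomap sketch. X is (N, M); returns (N, d) coordinates and the
    (N, N) matrix of approximate geodesic distances."""
    N = len(X)
    D = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1))

    # Step 1: symmetric k-nearest-neighbor graph with Euclidean edge lengths.
    graph = [dict() for _ in range(N)]
    for i in range(N):
        for j in np.argsort(D[i])[1:k + 1]:   # position 0 is the item itself
            graph[i][int(j)] = D[i, j]
            graph[int(j)][i] = D[i, j]

    # Step 2: shortest paths from every item (Dijkstra with a heap).
    G = np.full((N, N), np.inf)
    for s in range(N):
        G[s, s] = 0.0
        heap = [(0.0, s)]
        while heap:
            dist, u = heapq.heappop(heap)
            if dist > G[s, u]:
                continue
            for v, w in graph[u].items():
                if dist + w < G[s, v]:
                    G[s, v] = dist + w
                    heapq.heappush(heap, (dist + w, v))
    if np.isinf(G).any():
        raise ValueError("neighborhood graph is not connected; increase k")

    # Step 3: classical MDS on the geodesic distance matrix.
    J = np.eye(N) - np.ones((N, N)) / N
    B = -0.5 * J @ (G ** 2) @ J
    evals, evecs = np.linalg.eigh(B)
    top = np.argsort(evals)[::-1][:d]
    Y = evecs[:, top] * np.sqrt(np.maximum(evals[top], 0.0))
    return Y, G

# Half circle: the end points are 2 apart in a straight line, about pi along the arc.
t = np.linspace(0.0, np.pi, 40)
arc = np.column_stack([np.cos(t), np.sin(t)])
Y, G = isomap(arc, k=2, d=1)
print("Euclidean:", np.linalg.norm(arc[0] - arc[-1]), "geodesic:", G[0, -1])
```

The recovered one-dimensional coordinate follows the arc length along the curve, up to sign and a shift, which is the "unwrapping" that a linear method cannot perform on this data.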

Page 30: Dimensionality Reduction Part 2: Nonlinear Methods

Isomap Results

Find a 2D embedding of the 3D S-curve (also shown for LLE)
Isomap does a good job of preserving metric structure (not surprising)
The affine structure is also well preserved

Page 31: Dimensionality Reduction Part 2: Nonlinear Methods

Residual Fitting Error

Page 32: Dimensionality Reduction Part 2: Nonlinear Methods

Neighborhood Graph

Page 33: Dimensionality Reduction Part 2: Nonlinear Methods

More Isomap Results

Page 35: Dimensionality Reduction Part 2: Nonlinear Methods

Isomap Failures

Isomap also has problems on closed manifolds of arbitrary topology

Page 36: Dimensionality Reduction Part 2: Nonlinear Methods

Non-Linear Example

A Data-Driven Reflectance Model (Matusik et al., SIGGRAPH 2003)
Bidirectional Reflectance Distribution Functions (BRDF)
• Define the ratio of the reflected radiance in a particular direction to the incident irradiance from an incident direction.
Isotropic BRDF

$$f(\theta_i, \phi_i, \theta_r, \phi_r) = f(\theta_i, \theta_r, \phi_r - \phi_i)$$

Page 37: Dimensionality Reduction Part 2: Nonlinear Methods

Modeling Bidirectional Reflectance Distribution Functions (BRDFs)

Measurement

Page 38: Dimensionality Reduction Part 2: Nonlinear Methods

A “fast” BRDF measurement device inspired by Marschner [1998]

Measurement

Page 39: Dimensionality Reduction Part 2: Nonlinear Methods

Measurement

20-80 million reflectance measurements per material
Each tabulated BRDF entails 90x90x180x3 = 4,374,000 measurement bins

Page 41: Dimensionality Reduction Part 2: Nonlinear Methods

Rendering from Tabulated BRDFs

Even without further analysis, our BRDFs are immediately useful
Renderings made with Henrik Wann Jensen’s Dali renderer

[Figure panels: Nickel, Hematite, Gold Paint, Pink Felt]

Page 42: Dimensionality Reduction Part 2: Nonlinear Methods

BRDFs as Vectors in High-Dimensional Space

Each tabulated BRDF is a vector in 90x90x180x3 = 4,374,000 dimensional space

[Figure: a 90 x 90 x 180 table is unrolled into a vector of 4,374,000 entries]
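
The unrolling step amounts to flattening a four-dimensional table into one long vector. The sketch below uses a zero-filled placeholder array rather than measured data, and it treats the final factor of 3 as a per-bin triple of values, which is an assumption of this example. The variable names are hypothetical.

```python
import numpy as np

# Placeholder table with the bin counts quoted above (not measured data).
brdf_table = np.zeros((90, 90, 180, 3), dtype=np.float32)

# Unroll the table into a single vector, one entry per measurement bin.
brdf_vector = brdf_table.reshape(-1)
print(brdf_vector.shape)

# Several materials stacked as rows form the data matrix for PCA, LLE, or Isomap.
data_matrix = np.stack([brdf_vector, brdf_vector])

# The unrolling is lossless: reshaping restores the original table layout.
restored = brdf_vector.reshape(90, 90, 180, 3)
```

Each material then becomes a single point in a 4,374,000-dimensional space, which is the form the dimensionality reduction methods in this lecture expect.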

Page 43: Dimensionality Reduction Part 2: Nonlinear Methods

Linear Analysis (PCA)

Find optimal “linear basis” for our data set
45 components needed to reduce residue to under measurement error

[Figure: eigenvalue magnitude versus dimension, axis from 0 to 120]
[Figure panels: mean, 5, 10, 20, 30, 45, 60, all]

Page 44: Dimensionality Reduction Part 2: Nonlinear Methods

Problems with Linear Subspace Modeling

Large number of basis vectors (45)
Some linear combinations yield invalid or unlikely BRDFs (outside convex hull)

Page 45: Dimensionality Reduction Part 2: Nonlinear Methods

Problems with Linear Subspace Modeling

Large number of basis vectors (45)
Some linear combinations yield invalid or unlikely BRDFs (inside convex hull)

Page 46: Dimensionality Reduction Part 2: Nonlinear Methods

Results of Non-Linear Manifold Learning

At 15 dimensions, reconstruction error is less than 1%
Parameter count similar to analytical models

[Figure: error versus dimensionality, ticks at 5, 10, 15]

Page 47: Dimensionality Reduction Part 2: Nonlinear Methods

Non-Linear Advantages

15-dimensional parameter space
More robust than linear model
• More extrapolations are plausible

[Figure panels: Non-linear Model Extrapolation; Linear Model Extrapolation]

Page 48: Dimensionality Reduction Part 2: Nonlinear Methods

Non-Linear Model Results

Page 51: Dimensionality Reduction Part 2: Nonlinear Methods

Representing Physical Processes

Steel Oxidation

Page 52: Dimensionality Reduction Part 2: Nonlinear Methods

Summary

Non-Linear Dimensionality Reduction Methods
• These methods are considerably more powerful and temperamental than linear methods
• Applications of these methods are a hot area of research
Comparisons
• LLE is generally faster, but more brittle than Isomap
• Isomap tends to work better on smaller data sets (i.e., less dense sampling)
• Isomap tends to be less sensitive to noise (perturbation of the input vectors)
Issues
• Neither method handles closed manifolds and topological variations well