dimensionality reduction part 2: nonlinear methods

Post on 15-Jan-2016


The UNIVERSITY of NORTH CAROLINA at CHAPEL HILL

Dimensionality Reduction Part 2: Nonlinear Methods

Comp 790-090, Spring 2007

Previously…

Linear Methods for Dimensionality Reduction

PCA: rotate the data so that the principal axes lie in the directions of maximum variance
MDS: find coordinates that best preserve pairwise distances

[Figure: PCA]

Motivation

Linear dimensionality reduction doesn’t always work
Data violates the underlying “linear” assumptions
  • Data is not accurately modeled by “affine” combinations of measurements
  • Structure of the data, while apparent, is not simple
In the end, linear methods do nothing more than “globally transform” (rotate, translate, and scale) all of the data; sometimes what’s needed is to “unwrap” the data first

Stopgap Remedies

Local PCA
  • Compute PCA models for small overlapping item neighborhoods
  • Requires a clustering preprocess
  • Fast and simple, but results in no global parameterization

Neural Networks
  • Assume a solution of a given dimension
  • Use relaxation methods to deform the given solution to find a better fit
  • The relaxation step is modeled as “layers” in a network, where properties of future iterations are computed from information in the current structure
  • Many successes, but a bit of an art

Why Linear Modeling Fails

Suppose that your sample data lies on some low-dimensional surface embedded within the high-dimensional measurement space.
Linear models allow ALL affine combinations
Often, certain combinations are atypical of the actual data
Recognizing this is harder as dimensionality increases

What does PCA Really Model?

Principal Component Analysis assumptions
Mean-centered distribution
  • What if the mean itself is atypical?
Eigenvectors of covariance
  • Basis vectors aligned with successive directions of greatest variance
Classic 1st-order statistical model
  • Distribution is characterized by its mean and variance (Gaussian hyperspheres)
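The PCA model described on this slide (mean-center, then take eigenvectors of the covariance) can be sketched in a few lines. This is a minimal illustration using NumPy, not code from the course; the function name `pca`, the rows-as-samples layout, and the toy data are assumptions made for the example.

```python
import numpy as np

def pca(X, d):
    """Project the rows of X (N samples x M measurements) onto the top-d principal axes."""
    mean = X.mean(axis=0)
    Xc = X - mean                            # mean-center the data
    cov = Xc.T @ Xc / (X.shape[0] - 1)       # M x M sample covariance
    evals, evecs = np.linalg.eigh(cov)       # eigh returns eigenvalues in ascending order
    order = np.argsort(evals)[::-1][:d]      # indices of the d largest eigenvalues
    axes = evecs[:, order]                   # M x d basis of greatest-variance directions
    return Xc @ axes, axes, evals[order], mean

# Toy data (an assumption for illustration): points scattered along the line y = 2x.
rng = np.random.default_rng(0)
t = rng.normal(size=200)
X = np.column_stack([t, 2.0 * t + 0.01 * rng.normal(size=200)])
Y, axes, variances, mean = pca(X, 1)
```

On data like this the single recovered axis should be close to the direction (1, 2), up to sign; nothing in the model can express a curved structure, which is the limitation the following slides address.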

Non-Linear Dimensionality Reduction

Non-linear Manifold Learning
Instead of preserving global pairwise distances, non-linear dimensionality reduction tries to preserve only the geometric properties of local neighborhoods
Discover a lower-dimensional “embedding” manifold
Find a parameterization over that manifold
Linear parameter space
Projection mapping from the original M-D space to the d-D embedding space

[Figure: mapping between the original space and the linear embedding space, labeled “projection” in one direction and “reprojection, elevating, or lifting” in the other]

Nonlinear DimRedux Steps

Discover a low-dimensional embedding manifold
Find a parameterization over the manifold
Project data into parameter space
Analyze, interpolate, and compress in embedding space
  • Orient (by linear transformation) the parameter space to align axes with salient features
  • Linear (affine) combinations are valid here
In the case of interpolation and compression, use “lifting” to estimate the M-D original data

Nonlinear Methods

Local Linear Embeddings [Roweis 2000]
Isomap [Tenenbaum 2000]
These two papers ignited the field
Principled approach (asymptotically, as the amount of data goes to infinity, they have been proven to find the “real” manifold)
Widely applied
Hotly contested

Local Linear Embeddings

First Insight
Locally, at a fine enough scale, everything looks linear
Find an affine combination of the “neighborhood” about a point that best approximates it

Finding a Good Neighborhood

This is the remaining “art” aspect of nonlinear methods
Common choices
ε-ball: find all items that lie within an epsilon ball of the target item, as measured under some metric
  • Best if the density of items is high and every point has a sufficient number of neighbors
K-nearest neighbors: find the K closest neighbors to a point under some metric
  • Guarantees all items are similarly represented; limits dimension to K-1
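The two neighborhood choices can be sketched as follows. This is an illustrative NumPy version under stated assumptions: the metric is Euclidean, points are stored as rows, and the helper names (`pairwise_distances`, `knn_neighbors`, `epsilon_ball_neighbors`) are made up for the example.

```python
import numpy as np

def pairwise_distances(X):
    """Euclidean distance between every pair of rows of X (N x M)."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def knn_neighbors(X, k):
    """Indices of the k closest other points for each row of X (N x k array)."""
    D = pairwise_distances(X)
    np.fill_diagonal(D, np.inf)              # a point is not its own neighbor
    return np.argsort(D, axis=1)[:, :k]

def epsilon_ball_neighbors(X, eps):
    """For each row of X, the indices of all other points within distance eps."""
    D = pairwise_distances(X)
    np.fill_diagonal(D, np.inf)
    return [np.flatnonzero(row <= eps) for row in D]

# Toy data (an assumption): three clustered points and one outlier.
points = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [10.0, 0.0]])
knn = knn_neighbors(points, 2)
balls = epsilon_ball_neighbors(points, 1.5)
```

The outlier shows the trade-off from the slide: K-nearest neighbors still gives it two neighbors, however distant, while the ε-ball leaves it with none.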

Affine “Neighbor” Combinations

Within locally linear neighborhoods, each point can be considered as an affine combination of its neighbors

$x_i = \sum_{j \in \mathrm{neighbor}(x_i)} w_{ij} x_j, \qquad \sum_j w_{ij} = 1$

Weights should still be valid in the lower-dimensional embedding space

Imagine cutting out patches from the manifold and placing them in a lower-dimensional space so that the angles between points are preserved.

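Find Weights

Rewriting as a matrix for all x

$\begin{bmatrix} \tilde{x}_1 & \tilde{x}_2 & \tilde{x}_3 & \cdots & \tilde{x}_N \end{bmatrix} = \begin{bmatrix} x_1 & x_2 & x_3 & \cdots & x_N \end{bmatrix} W$

$W = \begin{bmatrix} 0 & w_{21} & w_{31} & \cdots & w_{N1} \\ w_{12} & 0 & w_{32} & \cdots & w_{N2} \\ w_{13} & w_{23} & 0 & \cdots & w_{N3} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ w_{1N} & w_{2N} & w_{3N} & \cdots & 0 \end{bmatrix}$

The two item matrices are M x N (one column per item); the “unknown W matrix” is N x N, with column i holding the weights $w_{ij}$ for item $x_i$ (zero for non-neighbors)

Reorganizing

Want to find W that minimizes $\sum_i \left\| x_i - \sum_{j \in \mathrm{Neighbor}(i)} w_{ij} x_j \right\|^2$ and satisfies the “sum-to-one” constraint
Ends up as a constrained “least-squares” problem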
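One common way to solve this constrained least-squares problem is point by point: shift each neighborhood so the target point is at the origin, form the small local Gram matrix, and solve a linear system whose solution is rescaled to sum to one. The sketch below assumes that approach; it is not taken from the slides. Note that it stores points as rows and puts the weights for point i in row i of W, which is the transpose of the column layout used on the slide. The function name `lle_weights` and the regularization constant `reg` are assumptions.

```python
import numpy as np

def lle_weights(X, neighbors, reg=1e-3):
    """Sum-to-one reconstruction weights.

    X: N x M array, one point per row (assumed layout).
    neighbors: N x k integer array of neighbor indices for each point.
    Returns an N x N matrix whose row i holds the weights for point i.
    """
    N, k = neighbors.shape
    W = np.zeros((N, N))
    for i in range(N):
        Z = X[neighbors[i]] - X[i]           # neighbors with x_i moved to the origin
        C = Z @ Z.T                          # k x k local Gram ("covariance") matrix
        # Add a small multiple of the identity so the solve is well conditioned.
        C = C + np.eye(k) * reg * max(np.trace(C), 1e-12)
        w = np.linalg.solve(C, np.ones(k))   # minimizer up to scale
        W[i, neighbors[i]] = w / w.sum()     # enforce the sum-to-one constraint
    return W

# Toy data (an assumption): three collinear points, each using the other two.
X = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])
neighbors = np.array([[1, 2], [0, 2], [0, 1]])
W = lle_weights(X, neighbors)
```

For the middle point the two neighbors are symmetric about it, so its weights come out as one half each.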

Find Linear Embedding Space

Now that we have the weight matrix W, find the linear vector that satisfies the following

$X W = X$

where W is N x N and X is M x N
This can be found by finding the null space of A

$X (W - I) = X A = 0$

Classic problem: run SVD on A and find the orthogonal vectors associated with the smallest d singular values (the smallest singular value will be zero and represents the system’s invariance to translation)
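The null-space step can be sketched with a dense SVD. This is a minimal illustration, not a production solver: it assumes the weights for point i are stored in row i of W (the transpose of the column layout above), so the embedding coordinates are right singular vectors of A = W - I. The function name `lle_embedding` and the small hand-written W are assumptions for the example.

```python
import numpy as np

def lle_embedding(W, d):
    """Embedding coordinates from an N x N weight matrix whose rows sum to one.

    Returns (Y, s): Y is N x d, s holds the singular values of A = W - I
    in decreasing order.
    """
    N = W.shape[0]
    A = W - np.eye(N)
    # Rows of Vt are right singular vectors, ordered by decreasing singular value.
    _, s, Vt = np.linalg.svd(A)
    # The last row belongs to the zero singular value (the constant vector,
    # i.e. translation invariance), so skip it and take the next d rows.
    Y = Vt[-(d + 1):-1][::-1].T
    return Y, s

# Hand-written weights (an assumption): a chain of 5 points, each row sums to one.
W = np.array([
    [0.0, 0.5, 0.5, 0.0, 0.0],
    [0.5, 0.0, 0.5, 0.0, 0.0],
    [0.0, 0.5, 0.0, 0.5, 0.0],
    [0.0, 0.0, 0.5, 0.0, 0.5],
    [0.0, 0.0, 0.5, 0.5, 0.0],
])
Y, s = lle_embedding(W, 1)
```

A dense SVD is only practical for small N; for larger problems a sparse eigensolver aimed at the smallest eigenvalues would be the usual substitute.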

Numerical Issues

Numerical problems can arise in computing LLEs
The least-squares covariance matrix that arises in computing the weight matrix, W, can be ill-conditioned
  • Regularization (rescale the measurements by adding a small multiple of the identity to the covariance matrix)
Finding small singular (eigen) values is not as well conditioned as finding large ones. The small ones are subject to numerical precision errors and tend to get mixed
  • Good (but slow) solvers exist; you have to use them

Results

The resulting parameter vector, $y_i$, gives the coordinates associated with the item $x_i$
The dth embedding coordinate is formed from the orthogonal vector associated with the (d+1)st smallest singular value of A.

Reprojection

Often, for data analysis, a parameterization is enough
For interpolation and compression, we might want to map points from the parameter space back to the “original” space
No perfect solution, but a few approximations
  • Delaunay triangulate the points in the embedding space, find the triangle that the desired parameter setting falls into, compute the barycentric coordinates of that setting within the triangle, and use them as weights
  • Interpolate by using a radially symmetric kernel centered about the desired parameter setting
Works, but mappings might not be one-to-one
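The second approximation, interpolating with a radially symmetric kernel, can be sketched as a weighted average of the original points. The Gaussian kernel, the `bandwidth` parameter, and the function name `lift` are assumptions chosen for illustration; the slide does not name a specific kernel.

```python
import numpy as np

def lift(y_query, Y, X, bandwidth):
    """Estimate the original-space point for an embedding-space location.

    Y: N x d embedding coordinates; X: N x M original points (same row order).
    Uses a Gaussian kernel centered on y_query; bandwidth is a free parameter.
    """
    d2 = ((Y - y_query) ** 2).sum(axis=1)        # squared distances in embedding space
    w = np.exp(-d2 / (2.0 * bandwidth ** 2))     # radially symmetric weights
    w = w / w.sum()                              # normalize to an affine combination
    return w @ X

# Toy data (an assumption): a 1-D parameterization of three 2-D points.
Y = np.array([[0.0], [1.0], [2.0]])
X = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 4.0]])
x_hat = lift(np.array([1.0]), Y, X, bandwidth=0.1)
```

With a narrow bandwidth the estimate at a sample location reproduces that sample almost exactly; a query far from all samples would make every weight underflow to zero, so a real implementation would need to guard the normalization.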

LLE Example

3-D S-curve manifold with points color-coded
Compute a 2-D embedding
  • The local affine structure is well maintained
  • The metric structure is okay locally, but can drift slowly over the domain (this causes the manifold to taper)

More LLE Examples

LLE Failures

Does not work on closed manifolds
Cannot recognize topology

Isomap

An alternative non-linear dimensionality reduction method that extends MDS
Key observation:
  • On a manifold, distances are measured using geodesic distances rather than Euclidean distances

[Figure: a pair of points with a small Euclidean distance but a large geodesic distance]

Problem: How to Get Geodesics

Without knowledge of the manifold it is difficult to compute the geodesic distance between points
It is difficult even if you know the manifold
Solution
  • Use a discrete geodesic approximation
  • Apply a graph algorithm to approximate the geodesic distances

Dijkstra’s Algorithm

Efficient solution to the all-points shortest-path problem (shortest paths from one source point to all other points)
Greedy breadth-first algorithm
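A compact version of Dijkstra’s algorithm using a binary heap is sketched below. The adjacency-list format (a dict mapping each node to a list of `(neighbor, weight)` pairs), the function name, and the example graph are assumptions for illustration; edge weights must be non-negative.

```python
import heapq

def dijkstra(graph, source):
    """Shortest distances from source to every reachable node.

    graph: dict mapping node -> list of (neighbor, non-negative weight).
    """
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)           # greedy step: closest unsettled node
        if d > dist.get(u, float("inf")):
            continue                         # stale heap entry, already improved
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

# Small undirected example graph (an assumption), stored with both edge directions.
graph = {
    "a": [("b", 1.0), ("c", 4.0)],
    "b": [("a", 1.0), ("c", 2.0)],
    "c": [("a", 4.0), ("b", 2.0), ("d", 1.0)],
    "d": [("c", 1.0)],
}
# Running the search from every node yields all pairwise graph distances.
all_pairs = {s: dijkstra(graph, s) for s in graph}
```

Here the direct edge from a to c costs 4, but the path through b costs 3, so the graph distance from a to d is 4 rather than 5.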


Isomap Algorithm

Compute a fully-connected neighborhood of points for each item
  • Can be K nearest neighbors or ε-ball
  • Neighborhoods must be symmetric
  • Test that the resulting graph is fully connected; if not, increase either K or ε
Calculate pairwise Euclidean distances within each neighborhood
Use Dijkstra’s Algorithm to compute the shortest path from each point to non-neighboring points
Run MDS on the resulting distance matrix
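The steps above can be sketched end to end in NumPy. This is an illustration under stated assumptions, not the reference implementation: points are rows, neighborhoods are symmetrized K-nearest neighbors, and the all-pairs shortest paths are computed with a vectorized Floyd-Warshall pass swapped in for Dijkstra’s algorithm purely to keep the example short. The function name `isomap` and the semicircle test data are made up for the example.

```python
import numpy as np

def isomap(X, k, d):
    """Return (Y, G): N x d embedding and N x N approximate geodesic distances."""
    N = X.shape[0]
    diff = X[:, None, :] - X[None, :, :]
    D = np.sqrt((diff ** 2).sum(axis=-1))        # pairwise Euclidean distances

    # Symmetric K-nearest-neighbor graph; inf marks "no edge".
    Dn = D.copy()
    np.fill_diagonal(Dn, np.inf)
    idx = np.argsort(Dn, axis=1)[:, :k]
    G = np.full((N, N), np.inf)
    np.fill_diagonal(G, 0.0)
    for i in range(N):
        G[i, idx[i]] = D[i, idx[i]]
        G[idx[i], i] = D[i, idx[i]]              # symmetrize the neighborhoods

    # All-pairs shortest paths (Floyd-Warshall, used here instead of Dijkstra).
    for m in range(N):
        G = np.minimum(G, G[:, m:m + 1] + G[m:m + 1, :])
    if np.isinf(G).any():
        raise ValueError("neighborhood graph is not connected; increase k")

    # Classical MDS on the geodesic distance matrix.
    J = np.eye(N) - np.ones((N, N)) / N          # centering matrix
    B = -0.5 * J @ (G ** 2) @ J
    evals, evecs = np.linalg.eigh(B)
    top = np.argsort(evals)[::-1][:d]
    Y = evecs[:, top] * np.sqrt(np.maximum(evals[top], 0.0))
    return Y, G

# Toy data (an assumption): 30 points on a unit semicircle, a 1-D manifold in 2-D.
theta = np.linspace(0.0, np.pi, 30)
X = np.column_stack([np.cos(theta), np.sin(theta)])
Y, G = isomap(X, k=2, d=1)
```

The endpoints of the semicircle are 2 apart in Euclidean distance, while the graph distance between them approximates the arc length π, and the 1-D embedding spreads the points out accordingly.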

Isomap Results

Find a 2D embedding of the 3D S-curve (also shown for LLE)
Isomap does a good job of preserving metric structure (not surprising)
The affine structure is also well preserved

Residual Fitting Error

Neighborhood Graph

More Isomap Results

Isomap Failures

Isomap also has problems on closed manifolds of arbitrary topology

Non-Linear Example

A Data-Driven Reflectance Model (Matusik et al., SIGGRAPH 2003)
Bidirectional Reflectance Distribution Functions (BRDF)
  • Define the ratio of the reflected radiance in a particular direction to the incident irradiance from the incident direction
Isotropic BRDF

$f(\theta_i, \phi_i, \theta_r, \phi_r) = f(\theta_i, \theta_r, \phi_r - \phi_i)$

Modeling Bidirectional Reflectance Distribution Functions (BRDFs)

Measurement

A “fast” BRDF measurement device inspired by Marschner [1998]
20-80 million reflectance measurements per material
Each tabulated BRDF entails 90 x 90 x 180 x 3 = 4,374,000 measurement bins

Rendering from Tabulated BRDFs

Even without further analysis, our BRDFs are immediately useful
Renderings made with Henrik Wann Jensen’s Dali renderer

[Figure panels: Nickel, Hematite, Gold Paint, Pink Felt]

BRDFs as Vectors in High-Dimensional Space

Each tabulated BRDF is a vector in 90 x 90 x 180 x 3 = 4,374,000-dimensional space

[Figure: a 90 x 90 x 180 table is unrolled into a vector of length 4,374,000]

Linear Analysis (PCA)

Find optimal “linear basis” for our data set
45 components needed to reduce the residue to under the measurement error

[Figure: plot of eigenvalue magnitude against dimension (axis ticks 0 to 120); panels labeled mean, 5, 10, 20, 30, 45, 60, all]

Problems with Linear Subspace Modeling

Large number of basis vectors (45)
Some linear combinations yield invalid or unlikely BRDFs
  • Outside the convex hull
  • Inside the convex hull

Results of Non-Linear Manifold Learning

At 15 dimensions reconstruction error is less than 1%
Parameter count similar to analytical models

[Figure: plot of error against dimensionality (axis ticks 5, 10, 15)]

Non-Linear Advantages

15-dimensional parameter space
More robust than linear model
  • More extrapolations are plausible

[Figure panels: Non-linear Model Extrapolation, Linear Model Extrapolation]

Non-Linear Model Results

Representing Physical Processes

Steel Oxidation

Summary

Non-Linear Dimensionality Reduction Methods
  • These methods are considerably more powerful, and more temperamental, than linear methods
  • Applications of these methods are a hot area of research
Comparisons
  • LLE is generally faster, but more brittle, than Isomap
  • Isomap tends to work better on smaller data sets (i.e. less dense sampling)
  • Isomap tends to be less sensitive to noise (perturbation of the input vectors)
Issues
  • Neither method handles closed manifolds and topological variations well
