The UNIVERSITY of NORTH CAROLINA at CHAPEL HILL
Dimensionality Reduction Part 2: Nonlinear Methods
Comp 790-090, Spring 2007
Previously…
Linear Methods for Dimensionality Reduction
PCA: rotate data so that principal axes lie in the direction of maximum variance
MDS: find coordinates that best preserve pairwise distances
[Figure: PCA]
Motivation
Linear dimensionality reduction doesn’t always work
Data violates underlying “linear” assumptions
• Data is not accurately modeled by “affine” combinations of measurements
• Structure of data, while apparent, is not simple
In the end, linear methods do nothing more than “globally transform” (rotate, translate, and scale) all of the data; sometimes what’s needed is to “unwrap” the data first
Stopgap Remedies
Local PCA
• Compute PCA models for small overlapping item neighborhoods
• Requires a clustering preprocess
• Fast and simple, but results in no global parameterization
Neural Networks
• Assumes a solution of a given dimension
• Uses relaxation methods to deform the given solution to find a better fit
• The relaxation step is modeled as “layers” in a network, where properties of future iterations are computed from information in the current structure
• Many successes, but a bit of an art
Why Linear Modeling Fails
Suppose that your sample data lies on some low-dimensional surface embedded within the high-dimensional measurement space.
Linear models allow ALL affine combinations
Often, certain combinations are atypical of the actual data
Recognizing this is harder as dimensionality increases
What does PCA Really Model?
Principal Component Analysis assumptions
Mean-centered distribution
• What if the mean itself is atypical?
Eigenvectors of covariance
• Basis vectors aligned with successive directions of greatest variance
Classic 1st-order statistical model
• Distribution is characterized by its mean and variance (Gaussian hyperspheres)
Non-Linear Dimensionality Reduction
Non-linear Manifold Learning
Instead of preserving global pairwise distances, non-linear dimensionality reduction tries to preserve only the geometric properties of local neighborhoods
Discover a lower-dimensional “embedding” manifold
Find a parameterization over that manifold
• Linear parameter space
• Projection mapping from original M-D space to d-D embedding space
[Figure: linear embedding space, with mappings labeled “projection” and “reprojection, elevating, or lifting”]
Nonlinear DimRedux Steps
Discover a low-dimensional embedding manifold
Find a parameterization over the manifold
Project data into parameter space
Analyze, interpolate, and compress in embedding space
• Orient (by linear transformation) the parameter space to align axes with salient features
• Linear (affine) combinations are valid here
In the case of interpolation and compression, use “lifting” to estimate the M-D original data
Nonlinear Methods
Local Linear Embeddings [Roweis 2000]
Isomaps [Tenenbaum 2000]
These two papers ignited the field
Principled approach (asymptotically, as the amount of data goes to infinity, they have been proven to find the “real” manifold)
Widely applied
Hotly contested
Local Linear Embeddings
First Insight
Locally, at a fine enough scale, everything looks linear
Find an affine combination of the “neighborhood” about a point that best approximates it
Finding a Good Neighborhood
This is the remaining “art” aspect of nonlinear methods
Common choices
ε-ball: find all items that lie within an epsilon ball of the target item as measured under some metric
• Best if density of items is high and every point has a sufficient number of neighbors
K-nearest neighbors: find the k closest neighbors to a point under some metric
• Guarantees all items are similarly represented, limits dimension to K-1
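The two neighborhood rules can be sketched in a few lines of Python. This is only an illustration under assumptions: the function names `epsilon_ball` and `k_nearest`, the Euclidean metric, and the sample points are hypothetical, and the brute-force search is not meant for large data sets.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length coordinate tuples."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def epsilon_ball(points, i, eps, metric=euclidean):
    """Indices of all points within distance eps of points[i] (excluding i)."""
    return [j for j, p in enumerate(points)
            if j != i and metric(points[i], p) <= eps]


def k_nearest(points, i, k, metric=euclidean):
    """Indices of the k points closest to points[i] (excluding i)."""
    others = [j for j in range(len(points)) if j != i]
    others.sort(key=lambda j: metric(points[i], points[j]))
    return others[:k]


# Hypothetical sample: three nearby points and one isolated point.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (5.0, 5.0)]

ball_of_0 = epsilon_ball(points, 0, 2.5)   # neighbors of point 0
ball_of_3 = epsilon_ball(points, 3, 2.5)   # the isolated point gets none
knn_of_3 = k_nearest(points, 3, 2)         # k-NN always returns k neighbors
print(ball_of_0, ball_of_3, knn_of_3)
```

The isolated point shows the trade-off: the ε-ball can leave a sparsely sampled point with no neighbors at all, while k-nearest neighbors always returns k of them, however far away they are.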
Affine “Neighbor” Combinations
Within locally linear neighborhoods, each point can be considered as an affine combination of its neighbors
$$x_i = \sum_{j \in \mathrm{neighbor}(x_i)} w_{ij}\, x_j, \qquad \sum_j w_{ij} = 1$$
Weights should still be valid in lower-dimensional embedding space
Imagine cutting out patches from the manifold and placing them in lower-dim so that angles between points are preserved.
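Find Weights
Rewriting as a matrix for all x
$$\begin{bmatrix} \tilde{x}_1 & \tilde{x}_2 & \tilde{x}_3 & \cdots & \tilde{x}_N \end{bmatrix} = \begin{bmatrix} x_1 & x_2 & x_3 & \cdots & x_N \end{bmatrix} \begin{bmatrix} 0 & w_{21} & w_{31} & \cdots \\ w_{12} & 0 & w_{32} & \cdots \\ w_{13} & w_{23} & 0 & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{bmatrix}$$
(both point matrices are M x N; the “unknown W matrix” is N x N, and its column i holds the weights $w_{ij}$ of point $x_i$)
Reorganizing
$$\varepsilon_i = \Big\| x_i - \sum_{j \in \mathrm{Neighbor}(i)} w_{ij}\, x_j \Big\|^2$$
Want to find W that minimizes $\varepsilon_i$, and satisfies the “sum-to-one” constraint
Ends up as a constrained “least-squares” problem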
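A minimal sketch of the constrained least-squares step for a single point, assuming the usual closed form: build the local Gram matrix of the differences between the point and its neighbors, solve it against a vector of ones, and rescale so the weights sum to one. The names `solve_linear` and `lle_weights`, the `reg` parameter, and the trace-scaled regularizer are illustrative choices, not taken from the slides.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[r]] for r, row in enumerate(a)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-15:
            raise ValueError("singular system; try a larger regularizer")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def lle_weights(x, neighbors, reg=1e-3):
    """Sum-to-one weights that best reconstruct x from its neighbors."""
    k = len(neighbors)
    diffs = [[xi - ni for xi, ni in zip(x, nb)] for nb in neighbors]
    # Local Gram ("covariance") matrix: gram[j][l] = (x - n_j) . (x - n_l)
    gram = [[sum(a * b for a, b in zip(diffs[j], diffs[l]))
             for l in range(k)] for j in range(k)]
    # Assumption: regularize with a small multiple of the identity,
    # scaled by the trace so the amount is relative to the data.
    trace = sum(gram[j][j] for j in range(k))
    eps = reg * trace / k if trace > 0 else reg
    for j in range(k):
        gram[j][j] += eps
    w = solve_linear(gram, [1.0] * k)
    total = sum(w)
    return [wj / total for wj in w]  # enforce the sum-to-one constraint


# Hypothetical example: x is the midpoint of the first two neighbors.
x = [1.0, 1.0]
neighbors = [[0.0, 1.0], [2.0, 1.0], [1.0, 3.0]]
w = lle_weights(x, neighbors)
recon = [sum(wj * nb[c] for wj, nb in zip(w, neighbors))
         for c in range(len(x))]
print(w, recon)
```

In this example the unregularized Gram matrix is singular (three neighbors in two dimensions), so the small diagonal term is what makes the solve possible. The resulting weights are close to one half for each of the first two neighbors and close to zero for the third.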
Find Linear Embedding Space
Now that we have the weight matrix W, find the linear vector that satisfies the following
$$X = XW$$
where W is N x N and X is M x N
This can be found by finding the null space of A, where
$$X(I - W) = XA = 0, \qquad A = I - W$$
Classic problem: run SVD on A and find the orthogonal vectors associated with the d smallest nonzero singular values (the smallest singular value overall will be zero and represents the system’s invariance to translation)
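A short derivation of why the smallest singular value is zero, offered as a sketch under stated assumptions. Here Y is a hypothetical d x N matrix whose columns $y_i$ are the embedded coordinates, and W is assumed to be laid out so that column i holds the weights $w_{ij}$ of point $x_i$ (so each column sums to one). The symbols $\Phi$, $\mathbf{1}$ (the all-ones vector of length N), and $c$ are introduced only for this illustration.

```latex
\begin{align*}
\Phi(Y) &= \sum_{i=1}^{N} \Big\| y_i - \sum_{j} w_{ij}\, y_j \Big\|^2
         = \| Y (I - W) \|_F^2
         = \| Y A \|_F^2
         = \operatorname{tr}\!\left( Y A A^{\top} Y^{\top} \right), \\
\mathbf{1}^{\top} W &= \mathbf{1}^{\top}
  \;\Longrightarrow\;
  \mathbf{1}^{\top} A = \mathbf{1}^{\top} (I - W) = 0, \\
(Y + c\,\mathbf{1}^{\top}) A &= Y A
  \quad \text{for any } c \in \mathbb{R}^{d}.
\end{align*}
```

The second line says the constant vector lies in the left null space of A, which gives the zero singular value. The third line says that shifting every embedded point by the same offset c leaves the cost unchanged, which is the translation invariance. The informative coordinates therefore come from the next-smallest singular values.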
Numerical Issues
Numerical problems can arise in computing LLEs
The least-squares covariance matrix that arises in the computation of the weighting matrix, W, can be ill-conditioned
• Regularization (rescale the measurements by adding a small multiple of the identity to the covariance matrix)
Finding small singular (eigen) values is not as well conditioned as finding large ones. The small ones are subject to numerical precision errors, and tend to get mixed up
• Good (but slow) solvers exist; you have to use them
Results
The resulting parameter vector, y_i, gives the coordinates associated with the item x_i
The dth embedding coordinate is formed from the orthogonal vector associated with the (d+1)st smallest singular value of A.
Reprojection
Often, for data analysis, a parameterization is enough
For interpolation and compression we might want to map points from the parameter space back to the “original” space
No perfect solution, but a few approximations
• Delaunay triangulate the points in the embedding space, find the triangle that the desired parameter setting falls into, compute the barycentric coordinates of that setting within the triangle, and use them as weights
• Interpolate by using a radially symmetric kernel centered about the desired parameter setting
• Works, but mappings might not be one-to-one
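The kernel-based approximation can be sketched as a weighted average of the original points, with weights that fall off with distance in the embedding space. This is an illustration only: the slides say just “radially symmetric kernel”, so the Gaussian kernel, the `bandwidth` parameter, the function name `lift`, and the sample data are all assumptions.

```python
import math


def lift(y_query, embedded, originals, bandwidth=1.0):
    """Estimate an original-space point for the parameter setting y_query.

    embedded[i] is the low-dimensional coordinate of originals[i].
    Weights come from a Gaussian kernel centered at y_query (an assumed
    choice of radially symmetric kernel).
    """
    weights = []
    for y in embedded:
        d2 = sum((a - b) ** 2 for a, b in zip(y_query, y))
        weights.append(math.exp(-d2 / (2.0 * bandwidth ** 2)))
    total = sum(weights)
    if total == 0.0:
        raise ValueError("query is too far from all samples for this bandwidth")
    dim = len(originals[0])
    return [sum(w * x[c] for w, x in zip(weights, originals)) / total
            for c in range(dim)]


# Hypothetical data: a 1-D parameterization of three points in 3-D.
embedded = [(0.0,), (1.0,), (2.0,)]
originals = [(0.0, 0.0, 0.0), (1.0, 1.0, 0.0), (2.0, 0.0, 0.0)]

at_sample = lift((1.0,), embedded, originals, bandwidth=0.1)
halfway = lift((0.5,), embedded, originals, bandwidth=0.1)
print(at_sample, halfway)
```

With a narrow bandwidth, a query placed on a sample essentially returns that sample, and a query halfway between two samples returns roughly their average. A wider bandwidth smooths more heavily and blends in more distant points.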
LLE Example
3-D S-curve manifold with points color-coded
Compute a 2-D embedding
• The local affine structure is well maintained
• The metric structure is okay locally, but can drift slowly over the domain (this causes the manifold to taper)
More LLE Examples
LLE Failures
Does not work on closed manifolds
Cannot recognize topology
Isomap
An alternative non-linear dimensionality reduction method that extends MDS
Key Observation:
On a manifold, distances are measured using geodesic distances rather than Euclidean distances
[Figure labels: small Euclidean distance, large geodesic distance]
Problem: How to Get Geodesics
Without knowledge of the manifold it is difficult to compute the geodesic distance between points
It is even difficult if you know the manifold
Solution
• Use a discrete geodesic approximation
• Apply a graph algorithm to approximate the geodesic distances
Dijkstra’s Algorithm
Efficient solution to the shortest path problem from one point to all other points
Greedy breadth-first algorithm
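A minimal sketch of Dijkstra’s algorithm using Python’s `heapq` as the priority queue. The adjacency-dictionary layout and the example graph are assumptions made for illustration. The function computes distances from one source, so it would be called once per point to fill a full distance matrix.

```python
import heapq


def dijkstra(graph, source):
    """Shortest path lengths from source to every node.

    graph maps each node to a dict {neighbor: non-negative edge length}.
    Unreachable nodes keep a distance of infinity.
    """
    dist = {node: float("inf") for node in graph}
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)  # greedy step: closest unsettled node
        if d > dist[u]:
            continue                # stale queue entry, already improved
        for v, length in graph[u].items():
            nd = d + length
            if nd < dist[v]:
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


# Hypothetical symmetric neighborhood graph.
graph = {
    "a": {"b": 1.0, "c": 4.0},
    "b": {"a": 1.0, "c": 2.0, "d": 5.0},
    "c": {"a": 4.0, "b": 2.0, "d": 1.0},
    "d": {"b": 5.0, "c": 1.0},
}
from_a = dijkstra(graph, "a")
print(from_a)
```

Here the direct edge from a to c has length 4, but the path through b has length 3, so the graph distance is the shorter one. This is the same mechanism that lets a chain of short hops along the manifold stand in for the geodesic distance.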
Isomap algorithm
Compute fully-connected neighborhood of points for each item
• Can be k nearest neighbors or ε-ball
• Neighborhoods must be symmetric
• Test that resulting graph is fully-connected; if not, increase either K or ε
Calculate pairwise Euclidean distances within each neighborhood
Use Dijkstra’s Algorithm to compute shortest path from each point to non-neighboring points
Run MDS on resulting distance matrix
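The first step of the algorithm (build a symmetric neighborhood graph, test connectivity, and grow K if needed) can be sketched as follows. The function names, the choice of k-nearest neighbors over an ε-ball, and the sample points are assumptions for illustration. The shortest-path and MDS steps are not shown here.

```python
import math
from collections import deque


def knn_graph(points, k):
    """Symmetric k-nearest-neighbor graph with Euclidean edge lengths."""
    n = len(points)
    graph = {i: {} for i in range(n)}
    for i in range(n):
        dists = sorted((math.dist(points[i], points[j]), j)
                       for j in range(n) if j != i)
        for d, j in dists[:k]:
            graph[i][j] = d
            graph[j][i] = d  # symmetrize: j also lists i as a neighbor
    return graph


def is_connected(graph):
    """Breadth-first search from one node; True if it reaches every node."""
    if not graph:
        return True
    start = next(iter(graph))
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return len(seen) == len(graph)


def connected_knn_graph(points, k=1):
    """Increase k until the neighborhood graph is connected."""
    while k < len(points):
        graph = knn_graph(points, k)
        if is_connected(graph):
            return graph, k
        k += 1
    raise ValueError("need at least two points")


# Hypothetical sample: two well-separated pairs of points.
pts = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (11.0, 0.0)]
graph, k_used = connected_knn_graph(pts)
print(k_used, graph)
```

With K = 1 each pair only links to itself, so the graph splits in two and shortest paths between the pairs would be undefined. Raising K to 2 adds the bridging edges, which is the “increase K” remedy in the list above.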
Isomap Results
Find a 2D embedding of the 3D S-curve (also shown for LLE)
Isomap does a good job of preserving metric structure (not surprising)
The affine structure is also well preserved
Residual Fitting Error
Neighborhood Graph
More Isomap Results
Isomap Failures
Isomap also has problems on closed manifolds of arbitrary topology
Non-Linear Example
A Data-Driven Reflectance Model (Matusik et al., SIGGRAPH 2003)
Bidirectional Reflectance Distribution Functions (BRDF)
• Defines the ratio of the reflected radiance in a particular outgoing direction to the incident irradiance from a given incoming direction
Isotropic BRDF
$$f(\theta_i, \phi_i, \theta_r, \phi_r) = f(\theta_i, \theta_r, \phi_r - \phi_i)$$
Modeling Bidirectional Reflectance Distribution Functions (BRDFs)
Measurement
A “fast” BRDF measurement device inspired by Marschner [1998]
20-80 million reflectance measurements per material
Each tabulated BRDF entails 90 x 90 x 180 x 3 = 4,374,000 measurement bins
Rendering from Tabulated BRDFs
Even without further analysis, our BRDFs are immediately useful
Renderings made with Henrik Wann Jensen’s Dali renderer
[Figure: renderings labeled Nickel, Hematite, Gold Paint, Pink Felt]
BRDFs as Vectors in High-Dimensional Space
Each tabulated BRDF is a vector in a 90 x 90 x 180 x 3 = 4,374,000 dimensional space
[Figure: the 90 x 90 x 180 table unrolled into a 4,374,000-element vector]
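The “unrolling” of a tabulated BRDF into one long vector can be illustrated with a flat-index calculation. This is a sketch under assumptions: the row-major ordering, the meaning and order of the four axes, and the names `SHAPE`, `total_bins`, and `flat_index` are hypothetical, since the slides only give the table dimensions.

```python
SHAPE = (90, 90, 180, 3)  # table dimensions as given on the slide


def total_bins(shape=SHAPE):
    """Number of entries in the unrolled vector."""
    total = 1
    for n in shape:
        total *= n
    return total


def flat_index(idx, shape=SHAPE):
    """Position of table entry idx in the unrolled vector (row-major)."""
    flat = 0
    for i, n in zip(idx, shape):
        if not 0 <= i < n:
            raise IndexError("index out of range for table shape")
        flat = flat * n + i
    return flat


n_bins = total_bins()
first = flat_index((0, 0, 0, 0))
step = flat_index((0, 0, 1, 0))
last = flat_index((89, 89, 179, 2))
print(n_bins, first, step, last)
```

Any fixed ordering works, as long as every BRDF in the data set is unrolled the same way, so that the same vector component always refers to the same measurement bin.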
Linear Analysis (PCA)
Find optimal “linear basis” for our data set
45 components needed to reduce residue to under measurement error
[Figure: eigenvalue magnitude versus dimension, axis from 0 to 120]
[Figure panels: mean, 5, 10, 20, 30, 45, 60, all]
Problems with Linear Subspace Modeling
Large number of basis vectors (45)
Some linear combinations yield invalid or unlikely BRDFs (outside convex hull)
Some linear combinations yield invalid or unlikely BRDFs (inside convex hull)
Results of Non-Linear Manifold Learning
At 15 dimensions reconstruction error is less than 1%
Parameter count similar to analytical models
[Figure: error versus dimensionality, with ticks at 5, 10, and 15]
Non-Linear Advantages
15-dimensional parameter space
More robust than linear model
• More extrapolations are plausible
[Figure: non-linear model extrapolation versus linear model extrapolation]
Non-Linear Model Results
Representing Physical Processes
Steel Oxidation
Summary
Non-Linear Dimensionality Reduction Methods
• These methods are considerably more powerful and temperamental than linear methods
• Applications of these methods are a hot area of research
Comparisons
• LLE is generally faster, but more brittle than Isomap
• Isomap tends to work better on smaller data sets (i.e. less dense sampling)
• Isomap tends to be less sensitive to noise (perturbation of the input vectors)
Issues
• Neither method handles closed manifolds and topological variations well