Approximate Nearest Subspace Search with Applications to Pattern Recognition
Ronen Basri, Tal Hassner, Lihi Zelnik-Manor
presented by Andrew Guillory and Ian Simon
The Problem
• Given n linear subspaces S_i, each represented by a matrix Z_i whose orthonormal columns span its orthogonal complement, so that x ∈ S_i iff $Z_i^T x = 0$.
• And a query point q.
• Find the subspace S_i that minimizes dist(S_i, q).
Approach
• Solve by reduction to nearest neighbor search: map subspaces and queries to points, so that point-to-point distances reflect point-to-subspace distances.
• The mapped points live in a higher-dimensional space. (Not an exact reduction: distances are preserved only up to a scale factor and an additive constant.)
Point-Subspace Distance
• Use the squared distance.
• The squared point-subspace distance can be represented as a dot product. Let $h(A)$ flatten a symmetric $d \times d$ matrix $A = (a_{ij})$ into a vector of its $d(d+1)/2$ independent entries $(a_{11}, \dots, a_{dd}, \sqrt{2}\,a_{12}, \dots, \sqrt{2}\,a_{d-1,d})$, so that $h(A)^T h(B) = \mathrm{Tr}(AB)$ for symmetric $A$ and $B$. Then

$$\mathrm{dist}^2(x, S) = \|Z^T x\|^2 = x^T Z Z^T x = \mathrm{Tr}(Z Z^T x x^T) = h(Z Z^T)^T h(x x^T).$$
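A minimal numpy sketch of this identity; the helper name h and the dimensions are illustrative choices, not from the paper:

```python
import numpy as np

def h(A):
    """Flatten a symmetric matrix into its d(d+1)/2 independent entries,
    scaling off-diagonals by sqrt(2) so that h(A) @ h(B) == trace(A @ B)."""
    iu = np.triu_indices(A.shape[0], k=1)
    return np.concatenate([np.diag(A), np.sqrt(2) * A[iu]])

rng = np.random.default_rng(0)
d, k = 6, 2
# Z: d-by-(d-k) orthonormal basis of the orthogonal complement of S.
Z, _ = np.linalg.qr(rng.normal(size=(d, d - k)))
x = rng.normal(size=d)

dist2 = np.linalg.norm(Z.T @ x) ** 2  # squared point-subspace distance
assert np.isclose(dist2, h(Z @ Z.T) @ h(np.outer(x, x)))
```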
The Reduction
• Let $u = h(Z Z^T)$ and $v = -h(x x^T)$.
• Then $\mathrm{dist}^2(u, v) = \|u - v\|^2 = \|u\|^2 + \|v\|^2 + 2\,\mathrm{dist}^2(x, S)$.
Remember: $\mathrm{dist}^2(x, S) = h(x x^T)^T h(Z Z^T)$.
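A numeric check of this identity, together with the two norms evaluated on the next slides (a sketch; h is the same illustrative helper, repeated so the block runs on its own):

```python
import numpy as np

def h(A):
    # symmetric-matrix flattening with sqrt(2)-scaled off-diagonals,
    # so that h(A) @ h(B) == trace(A @ B)
    iu = np.triu_indices(A.shape[0], k=1)
    return np.concatenate([np.diag(A), np.sqrt(2) * A[iu]])

rng = np.random.default_rng(1)
d, k = 6, 2
Z, _ = np.linalg.qr(rng.normal(size=(d, d - k)))  # orthonormal columns
q = rng.normal(size=d)

u = h(Z @ Z.T)           # database point representing the subspace S
v = -h(np.outer(q, q))   # query point
dist2_sub = np.linalg.norm(Z.T @ q) ** 2

# ||u||^2 = d - k and ||v||^2 = ||q||^4, so the additive term is fixed
# once the query is fixed.
assert np.isclose(u @ u, d - k)
assert np.isclose(v @ v, np.linalg.norm(q) ** 4)
assert np.isclose(np.linalg.norm(u - v) ** 2,
                  2 * dist2_sub + (d - k) + np.linalg.norm(q) ** 4)
```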
The Reduction
• Since $Z^T Z = I$:

$$\|u\|^2 = \|h(Z Z^T)\|^2 = \mathrm{Tr}(Z Z^T Z Z^T) = \mathrm{Tr}(Z^T Z\, Z^T Z) = \mathrm{Tr}(I_{d-k}) = d - k.$$

Remember: $u = h(Z Z^T)$, $v = -h(x x^T)$; Z is d-by-(d-k), columns orthonormal.
$\mathrm{dist}^2(u, v) = 2\,\mathrm{dist}^2(x, S) + \|u\|^2 + \|v\|^2$, where the added term is constant over the database once the query is fixed.
The Reduction
• For a query point q:

$$\mathrm{dist}^2(u, v) = 2\,\mathrm{dist}^2(S, q) + (d - k) + \|q\|^4.$$

• Can we decrease the additive constant?
Observation 1
• All data points lie on a hyperplane: $\mathrm{Tr}(Z Z^T) = d - k$ for every subspace.
• Let:

$$u = h\!\left(Z Z^T - \tfrac{d-k}{d} I\right), \qquad v = -h\!\left(q q^T - \tfrac{\|q\|^2}{d} I\right).$$

• Now the hyperplane contains the origin.
Observation 2
• After the hyperplane projection:

$$\|u\|^2 = \left\|h\!\left(Z Z^T - \tfrac{d-k}{d} I\right)\right\|^2 = \frac{k(d-k)}{d}.$$

• All data points lie on a hypersphere.
• Let:

$$v = -\frac{1}{\|q\|^2}\sqrt{\frac{k(d-k)}{d-1}}\; h\!\left(q q^T - \tfrac{\|q\|^2}{d} I\right).$$

• Now the query point lies on the hypersphere.
Finally
• Additive constant depends only on dimension of points and subspaces.
• This applies to linear subspaces, all of the same dimension.
$$\mathrm{dist}^2(u, v) = \frac{2}{\|q\|^2}\sqrt{\frac{k(d-k)}{d-1}}\,\mathrm{dist}^2(q, S) + \frac{2(d-k)}{d}\left(k - \sqrt{\frac{k(d-k)}{d-1}}\right).$$
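A sketch checking the centered and rescaled map end to end (same illustrative h as above; the constant in the assert is the one derived on this slide):

```python
import numpy as np

def h(A):
    # symmetric-matrix flattening with sqrt(2)-scaled off-diagonals,
    # so that h(A) @ h(B) == trace(A @ B)
    iu = np.triu_indices(A.shape[0], k=1)
    return np.concatenate([np.diag(A), np.sqrt(2) * A[iu]])

rng = np.random.default_rng(2)
d, k = 8, 3
Z, _ = np.linalg.qr(rng.normal(size=(d, d - k)))  # orthonormal columns
q = rng.normal(size=d)
qn2 = np.linalg.norm(q) ** 2

# Observation 1: center so the hyperplane passes through the origin.
u = h(Z @ Z.T - (d - k) / d * np.eye(d))
# Observation 2: rescale the query so it lands on the same hypersphere.
beta = np.sqrt(k * (d - k) / (d - 1)) / qn2
v = -beta * h(np.outer(q, q) - qn2 / d * np.eye(d))

r2 = k * (d - k) / d  # squared radius of the common hypersphere
assert np.isclose(u @ u, r2) and np.isclose(v @ v, r2)

# Final relation: the additive constant depends only on d and k.
dist2_sub = np.linalg.norm(Z.T @ q) ** 2
const = 2 * (d - k) / d * (k - np.sqrt(k * (d - k) / (d - 1)))
assert np.isclose(np.linalg.norm(u - v) ** 2, 2 * beta * dist2_sub + const)
```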
Extensions
• Subspaces of different dimension (e.g., lines and planes)
– Not all data points have the same norm.
– Add an extra dimension to fix this (see the sketch below).
• Affine subspaces: $Z_i^T x = b$
– Again, not all data points have the same norm.
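One standard way to realize the extra-dimension trick, sketched here as an illustration rather than the paper's exact construction: pad every database point to a common norm R and give the query a zero in the new coordinate, so padded distances differ from the original inner products only by a per-query constant.

```python
import numpy as np

def pad_to_norm(points, R):
    # Append one coordinate so every database point has norm R
    # (requires R >= the largest norm). Hypothetical helper.
    extra = np.sqrt(R ** 2 - np.sum(points ** 2, axis=1))
    return np.hstack([points, extra[:, None]])

rng = np.random.default_rng(3)
U = rng.normal(size=(5, 4)) * rng.uniform(0.5, 2.0, size=(5, 1))  # mixed norms
v = rng.normal(size=4)                                            # query

R = np.linalg.norm(U, axis=1).max()
U_pad = pad_to_norm(U, R)
v_pad = np.append(v, 0.0)

# Padded squared distances equal a per-query constant minus 2 * (u . v),
# so nearest-neighbor search is unaffected by the original norm differences.
d2 = np.sum((U_pad - v_pad) ** 2, axis=1)
assert np.allclose(d2, R ** 2 + v @ v - 2 * U @ v)
```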
Approximate Nearest Neighbor Search
• Find a point x with dist(x, q) ≤ (1 + ε) min_i dist(x_i, q).
• Tree-based approaches: KD-trees, metric / ball trees, cover trees.
• Locality-sensitive hashing.
• This paper uses multiple KD-trees with (different) random projections; a single-tree query is sketched below.
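A minimal sketch of a (1 + ε)-approximate query with a single KD-tree via scipy; the paper's multi-tree, random-projection setup is not reproduced here:

```python
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(4)
points = rng.normal(size=(10_000, 16))  # database of mapped points
q = rng.normal(size=16)                 # mapped query

tree = cKDTree(points)
# eps > 0 relaxes the search: the returned distance is guaranteed to be
# within a factor (1 + eps) of the true nearest-neighbor distance.
dist, idx = tree.query(q, k=1, eps=0.5)
print(idx, dist)
```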
Random Projections
• Multiply the data by a random matrix R with R(i, j) drawn from N(0, 1), as in the sketch below.
• Several different justifications:
– Johnson-Lindenstrauss (data set that is small compared to the dimensionality)
– Compressed sensing (data set that is sparse in some linear basis)
– RP-trees (data set that has small doubling dimension)
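A quick sketch of a Gaussian random projection and the distance distortion it induces; the sizes and the 1/sqrt(m) scaling are illustrative choices:

```python
import numpy as np
from scipy.spatial.distance import pdist

rng = np.random.default_rng(5)
n, d, m = 100, 2000, 512  # few points, high ambient dim, target dim
X = rng.normal(size=(n, d))

R = rng.normal(size=(d, m)) / np.sqrt(m)  # scaling preserves expected norms
Y = X @ R

# Johnson-Lindenstrauss: pairwise distance ratios concentrate around 1.
ratios = pdist(Y) / pdist(X)
print(ratios.min(), ratios.max())
```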
Results
• Two goals:
– Show their method is fast.
– Show nearest subspace is useful.
• Four experiments:
– Synthetic experiments
– Image approximation
– Yale Faces
– Yale Patches
Questions / Issues
• Should random projections be applied before or after the reduction?
• Why does the effective distance error go down with the ambient dimensionality?
• The reduction tends to make query points far away from the points in the database. Are there better approximate nearest neighbor algorithms in this case?