last time: pcaalumni.media.mit.edu/.../07_face_recognition_mpca_09.pdf · 2009. 2. 24. · 4 data...
TRANSCRIPT
Statistical Multilinear Models: TensorFaces

Last Time: PCA
• PCA identifies an m-dimensional explanation of n-dimensional data, where m < n.
• Originated as a statistical analysis technique.
• PCA attempts to minimize the reconstruction error under the following restrictions:
  – Linear reconstruction
  – Orthogonal factors
• Equivalently, PCA attempts to maximize variance.
PART I: 2D Vision
• Appearance-Based Methods
• Statistical Linear Models:
  – PCA
  – ICA, FLD
  – Non-negative Matrix Factorization, Sparse Matrix Factorization
• Statistical Tensor Models:
  – Multilinear PCA
  – Multilinear ICA
• Person and Activity Recognition
Today
The Problem with Linear (PCA) Appearance-Based Recognition Methods
• Eigenimages work best for recognition when only a single factor – e.g., object identity – is allowed to vary.
• Natural images are the composite consequences of multiple factors (or modes) related to scene structure, illumination and imaging.
Image Manifold

[figure: eigenvectors spanning the image manifold]
Unsupervised vs Supervised
• Unsupervised – there are no labels for the data:
  – PCA
  – ICA
• Supervised – each image has label information that describes the factors that constructed the observed image:
  – FLD
  – Modular PCA (View-Based PCA)
-
2
Modular PCA: View-Based PCA

Linear Representations
• PCA and ICA:
  – One linear model
  – Each image has a unique representation
  – Face (object) recognition:
    • Each image has a unique representation
    • Every person has multiple representations
• FLD:
  – One linear model
  – Goal: one representation per person
  – Practice:
    • One representation per image
    • Every person (object) has multiple representations
• Modular PCA:
  – Assumption: the data lies on a curved space that can be modeled locally by a linear model
  – A linear model for every cluster
• Images are multivariate functions that respond to changes in viewpoint, illumination and geometry
• Goal: Multilinear Representation of Image Ensembles for Recognition and Compression
TensorFaces: Linear vs Multilinear Manifolds

Previous Efforts to Improve Linear Models
• Improved PCA Methods:
  – Moghaddam & Pentland 1994, 1995, "View-Based and Modular Eigenspaces for Face Recognition"
  – Nayar, Murase & Nene 1996, "Parametric Appearance Representation"
• Fisher Linear Discriminant:
  – Belhumeur, Hespanha & Kriegman, "Eigenfaces vs. Fisherfaces: Recognition Using Class Specific Linear Projection"
• Bilinear Models:
  – Marimont & Wandell 1992, "Linear Models of Surface and Illuminant Spectra"
  – Freeman & Tenenbaum 1997, "Learning Bilinear Models for Two-Factor Problems in Vision"
Eigenfaces Experience Difficulties When More Than a Single Factor Varies
• Eigenfaces capture the variability across images without disentangling identity variability from variability in other factors such as lighting, viewpoint or expressions.
Multilinear Model Approach
• Non-linear appearance-based technique
• Appearance-based model that explicitly accounts for each of the multiple factors inherent in image formation
• Multilinear algebra: the algebra of higher-order tensors
• Applied to facial images, we call our tensor technique "TensorFaces" [Vasilescu & Terzopoulos, ECCV 02, ICPR 02]
TensorFaces vs Eigenfaces (PCA)

PIE Recognition Experiment
• Training: 23 people, 5 viewpoints (0, +17, -17, +34, -34), 3 illuminations
  Testing: 23 people, 5 viewpoints (0, +17, -17, +34, -34), 4th illumination
  PCA: 27%   TensorFaces: 88%
• Training: 23 people, 3 viewpoints (0, +34, -34), 4 illuminations
  Testing: 23 people, 2 viewpoints (+17, -17), 4 illuminations (center, left, right, left+right)
  PCA: 61%   TensorFaces: 80%
PIE Database (Weizmann)

Data Organization
• Linear / PCA: Data Matrix
  – R^(pixels x images)
  – a matrix of image vectors
• Multilinear: Data Tensor
  – R^(people x views x illums x express x pixels)
  – an N-dimensional array of image vectors i_(p, v, il, ex)
  – 28 people, 45 images/person
  – 5 views, 3 illuminations, 3 expressions per person

[figure: data tensor D with modes people, views, illuminations, expressions; data matrix D with modes pixels x images]
Tensor Decomposition

Complete Dataset } Learning Stage

[figure: data tensor D with modes views, illums., people]
Matrix Decomposition – SVD
• A matrix has a column and a row space
• SVD orthogonalizes these spaces and decomposes D ∈ R^(I1 x I2):
    D = U1 Σ U2^T
  (U1 contains the eigenfaces)
• Rewritten in terms of mode-n products:
    D = Σ x1 U1 x2 U2

[figure: data matrix D, pixels x images]
Tensor Decomposition
• D is an N-dimensional array comprising N spaces
• N-mode SVD is the natural generalization of SVD
• N-mode SVD orthogonalizes these spaces and decomposes D as the mode-n product of N orthogonal spaces:
    D = Z x1 U1 x2 U2 x3 U3 x4 U4 ... xN UN
• Z, the core tensor, governs the interaction between the mode matrices
• Un, the mode-n matrix, spans the column space of D(n)
Tensor Decomposition
• D = Z x1 U1 x2 U2 x3 U3
Multilinear (Tensor) Decomposition

    D = Z x1 U1 x2 U2 x3 U3

      = Σ_{r1=1..R1} Σ_{r2=1..R2} Σ_{r3=1..R3} σ_{r1 r2 r3} u_{r1} ∘ u_{r2} ∘ u_{r3}

• Equivalently, in vectorized form:

    vec(D) = (U3 ⊗ U2 ⊗ U1) vec(Z)

[figure: tensor D written as core tensor Z multiplied by mode matrices U1, U2, U3]
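The equivalence between the mode-product form D = Z x1 U1 x2 U2 x3 U3 and the vectorized Kronecker form can be checked numerically. This is a sketch: `mode_n_product` is a helper written for it, not part of the lecture, and vec() is taken column-major.

```python
import numpy as np

def mode_n_product(A, M, n):
    # B = A x_n M: contract mode n of A with the second axis of M.
    return np.moveaxis(np.tensordot(M, A, axes=(1, n)), 0, n)

rng = np.random.default_rng(0)
Z = rng.standard_normal((2, 3, 4))
U1, U2, U3 = (rng.standard_normal((s, s)) for s in Z.shape)

# D = Z x1 U1 x2 U2 x3 U3
D = mode_n_product(mode_n_product(mode_n_product(Z, U1, 0), U2, 1), U3, 2)

# Kronecker form of the same decomposition, with column-major vec().
lhs = D.flatten(order="F")
rhs = np.kron(np.kron(U3, U2), U1) @ Z.flatten(order="F")
print(np.allclose(lhs, rhs))  # True
```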
Facial Data Tensor Decomposition

    D = Z x1 Upeople x2 Uviews x3 Uillums. x4 Uexpress x5 Upixels

Decomposition from Pixel Space to Factor Spaces

[figure: data tensor D with modes people, views, illuminations, expressions]
N-Mode SVD Algorithm
1. For n = 1, ..., N, compute matrix Un by computing the SVD of the flattened matrix D(n) and setting Un to be the left matrix of the SVD.
2. Solve for the core tensor as follows:
    Z = D x1 U1^T x2 U2^T x3 U3^T x4 U4^T x5 U5^T
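The two-step algorithm above can be sketched in a few lines of NumPy; the helper names (`unfold`, `mode_n_product`, `n_mode_svd`) and the cyclic flattening order are this sketch's choices, not the lecture's.

```python
import numpy as np

def unfold(T, n):
    # Mode-n flattening: mode n indexes the rows, remaining modes the columns.
    order = [n] + [(n + k) % T.ndim for k in range(1, T.ndim)]
    return np.transpose(T, order).reshape(T.shape[n], -1)

def mode_n_product(T, M, n):
    # B = T x_n M: contract mode n of T with the second axis of M.
    return np.moveaxis(np.tensordot(M, T, axes=(1, n)), 0, n)

def n_mode_svd(D):
    # Step 1: U_n = left singular matrix of the mode-n flattening D_(n).
    Us = [np.linalg.svd(unfold(D, n), full_matrices=False)[0]
          for n in range(D.ndim)]
    # Step 2: Z = D x1 U1^T x2 U2^T ... xN UN^T.
    Z = D
    for n, U in enumerate(Us):
        Z = mode_n_product(Z, U.T, n)
    return Z, Us

D = np.random.default_rng(0).standard_normal((4, 5, 6))
Z, Us = n_mode_svd(D)

# Reconstruct: D = Z x1 U1 x2 U2 x3 U3
D_hat = Z
for n, U in enumerate(Us):
    D_hat = mode_n_product(D_hat, U, n)
print(np.allclose(D, D_hat))  # True
```

Because each Un here is square and orthogonal, multiplying the core tensor back by the mode matrices recovers D exactly; truncating the columns of the Un would give the compressed multilinear model.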
Multilinear (Tensor) Algebra
• Mode-n tensor flattening ("matricize"):
    A ∈ R^(I1 x I2 x ... x IN)
    A(n) ∈ R^(In x (I(n+1) ... IN I1 ... I(n-1)))
• Example: N = 3
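The flattening above translates directly to NumPy. The cyclic column ordering (I(n+1) ... IN I1 ... I(n-1)) matches the definition on the slide; the function name `unfold` is just this sketch's choice.

```python
import numpy as np

def unfold(A, n):
    # Mode-n flattening ("matricize"): mode n becomes the rows; the
    # remaining modes, cycled as n+1, ..., N, 1, ..., n-1, form the columns.
    order = [n] + [(n + k) % A.ndim for k in range(1, A.ndim)]
    return np.transpose(A, order).reshape(A.shape[n], -1)

# N = 3 example: a 2 x 3 x 4 tensor has three flattenings.
A = np.arange(24).reshape(2, 3, 4)
print(unfold(A, 0).shape, unfold(A, 1).shape, unfold(A, 2).shape)
# (2, 12) (3, 8) (4, 6)
```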
Computing Uillums
• D(illums.) – flatten D along the illumination dimension
• Uillums – orthogonalizes the column space of D(illums.)

[figure: tensor D flattened into D(illums.), an illuminations x images matrix]

Computing Uviews
• D(views) – flatten D along the viewpoint dimension
• Uviews – orthogonalizes the column space of D(views)

[figure: tensor D flattened into D(views), a views x images matrix]
Computing Upixels
• D(pixels) – flatten D along the pixel dimension
• Upixels – orthogonalizes D(pixels)
  – eigenimages

[figure: tensor D flattened into D(pixels), a pixels x images matrix]
Mode-n Product
• The mode-n product is a generalization of the product of two matrices
• It is the product of a tensor with a matrix
• Mode-n product of A ∈ R^(I1 x ... x In x ... x IN) and M ∈ R^(Jn x In):
    B = A xn M,   B ∈ R^(I1 x ... x I(n-1) x Jn x I(n+1) x ... x IN)
• Entrywise:
    b_(i1 ... i(n-1) jn i(n+1) ... iN) = Σ_(in) a_(i1 ... in ... iN) m_(jn in)

[figure: mode-2 product replacing I2 by J2, and mode-3 product replacing I3 by J3]
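The mode-n product defined above has a direct NumPy translation via `tensordot`; the helper name is this sketch's, and the shapes mirror the mode-2 example in the figure.

```python
import numpy as np

def mode_n_product(A, M, n):
    # B = A x_n M: contract mode n of A (size I_n) against the columns
    # of M (J_n x I_n), so mode n of B has size J_n.
    return np.moveaxis(np.tensordot(M, A, axes=(1, n)), 0, n)

A = np.random.default_rng(1).standard_normal((2, 3, 4))  # I1 x I2 x I3
M = np.random.default_rng(2).standard_normal((5, 3))     # J2 x I2
B = mode_n_product(A, M, 1)
print(B.shape)  # (2, 5, 4) -- I2 = 3 replaced by J2 = 5
```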
Multilinear (Tensor) Algebra
• N-th order tensor: A ∈ R^(I1 x I2 x ... x IN)
• Matrix (2nd-order tensor): M ∈ R^(Jn x In)
• Mode-n product: B = A xn M, where B(n) = M A(n)
• Flattening the decomposition along the pixel mode:
    D(pixels) = Upixels Z(pixels) (Uexpress ⊗ Uillums. ⊗ Uviews ⊗ Upeople)^T
    (data matrix = basis matrix x coefficient matrix)
Eigenfaces vs TensorFaces
• Multilinear Analysis / TensorFaces:
    D = Z x1 Upeople x2 Uviews x3 Uillums. x4 Uexpress x5 Upixels
• Linear Analysis / Eigenfaces:
    D(pixels) = Upixels Z(pixels) (Uexpress ⊗ Uillums. ⊗ Uviews ⊗ Upeople)^T
    (data matrix = basis matrix x coefficient matrix)
• TensorFaces subsumes Eigenfaces
TensorFaces

    B = Z x5 Upixels

• TensorFaces explicitly represent covariance across factors

[figure: PCA basis vs TensorFaces basis across views, illums., people]
Strategic Data Compression = Perceptual Quality

• Original: 176 basis vectors
• TensorFaces: 3 illum. + 11 people parameters, 33 basis vectors, mean sq. err. = 409.15
• PCA: 33 parameters, 33 basis vectors, mean sq. err. = 85.75
• TensorFaces: 6 illum. + 11 people parameters, 66 basis vectors

• TensorFaces data reduction in illumination space primarily degrades illumination effects (cast shadows, highlights)
• PCA has lower mean square error but higher perceptual error