last time: pcaalumni.media.mit.edu/.../07_face_recognition_mpca_09.pdf · 2009. 2. 24. · 4 data...
TRANSCRIPT
Statistical Multilinear Models: TensorFaces

Last Time: PCA
• PCA identifies an m-dimensional explanation of n-dimensional data, where m < n.
• Originated as a statistical analysis technique.
• PCA attempts to minimize the reconstruction error under the following restrictions:
  – Linear reconstruction
  – Orthogonal factors
• Equivalently, PCA attempts to maximize variance.
PART I: 2D Vision
• Appearance-Based Methods
• Statistical Linear Models:
  – PCA
  – ICA, FLD
  – Non-negative Matrix Factorization, Sparse Matrix Factorization
• Statistical Tensor Models:
  – Multilinear PCA
  – Multilinear ICA
• Person and Activity Recognition
Today
The Problem with Linear (PCA) Appearance-Based Recognition Methods
• Eigenimages work best for recognition when only a single factor – e.g., object identity – is allowed to vary.
• Natural images are the composite consequences of multiple factors (or modes) related to scene structure, illumination and imaging.
Image Manifold

[figure: eigenvectors spanning the image manifold]
Unsupervised vs Supervised
• Unsupervised – there are no labels for the data:
  – PCA
  – ICA
• Supervised – each image has label information that describes the factors that constructed the observed image:
  – FLD
  – Modular PCA (View-Based PCA)
-
2
Modular PCA: View-Based PCA

Linear Representations
• PCA and ICA:
  – One linear model
  – Each image has a unique representation
  – Face (object) recognition:
    • Each image has a unique representation
    • Every person has multiple representations
• FLD:
  – One linear model
  – Goal: one representation per person
  – Practice:
    • One representation per image
    • Every person (object) has multiple representations
• Modular PCA:
  – Assumption: the data lies on a curved space that can be modeled locally by a linear model
  – A linear model for every cluster
• Images are multivariate functions that respond to changes in viewpoint, illumination and geometry
• Goal: Multilinear Representation of Image Ensembles for Recognition and Compression
TensorFaces: Linear vs Multilinear Manifolds

Previous Efforts to Improve Linear Models
• Improved PCA Methods:
  – Moghaddam & Pentland 1994, 1995, "View-Based and Modular Eigenspaces for Face Recognition"
  – Nayar, Murase & Nene 1996, "Parametric Appearance Representation"
• Fisher Linear Discriminant:
  – Belhumeur, Hespanha & Kriegman, "Eigenfaces vs. Fisherfaces: Recognition Using Class Specific Linear Projection"
• Bilinear Models:
  – Marimont & Wandell 1992, "Linear Models of Surface and Illuminant Spectra"
  – Freeman & Tenenbaum 1997, "Learning Bilinear Models for Two-Factor Problems in Vision"
Eigenfaces Experience Difficulties When More Than a Single Factor Varies
• Eigenfaces capture the variability across images without disentangling identity variability from variability in other factors such as lighting, viewpoint or expressions.
Multilinear Model Approach
• Non-linear appearance-based technique
• Appearance-based model that explicitly accounts for each of the multiple factors inherent in image formation
• Multilinear algebra: the algebra of higher-order tensors
• Applied to facial images, we call our tensor technique "TensorFaces" [Vasilescu & Terzopoulos, ECCV 02, ICPR 02]
TensorFaces vs Eigenfaces (PCA)

PIE Recognition Experiment
• Training: 23 people, 5 viewpoints (0, +17, -17, +34, -34), 3 illuminations
  Testing: 23 people, 5 viewpoints (0, +17, -17, +34, -34), 4th illumination
  PCA: 27%   TensorFaces: 88%
• Training: 23 people, 3 viewpoints (0, +34, -34), 4 illuminations
  Testing: 23 people, 2 viewpoints (+17, -17), 4 illuminations (center, left, right, left+right)
  PCA: 61%   TensorFaces: 80%
PIE Database (Weizmann)

Data Organization
• Linear / PCA: Data Matrix
  – R^(pixels x images)
  – a matrix of image vectors
• Multilinear: Data Tensor
  – R^(people x views x illums x express x pixels)
  – an N-dimensional array of image vectors i_(p, v, il, ex)
  – 28 people, 45 images/person
  – 5 views, 3 illuminations, 3 expressions per person

[figure: data tensor D with modes people, views, illuminations, expressions; data matrix D with modes pixels x images]
Tensor Decomposition

Complete Dataset } Learning Stage

[figure: data tensor D with modes views, illums., people]
Matrix Decomposition – SVD
• A matrix has a column and a row space
• SVD orthogonalizes these spaces and decomposes D ∈ R^(I1 x I2):
    D = U1 Σ U2^T
  (U1 contains the eigenfaces)
• Rewritten in terms of mode-n products:
    D = Σ x1 U1 x2 U2

[figure: data matrix D, pixels x images]
Tensor Decomposition
• D is an N-dimensional array comprising N spaces
• N-mode SVD is the natural generalization of SVD
• N-mode SVD orthogonalizes these spaces and decomposes D as the mode-n product of N orthogonal spaces:
    D = Z x1 U1 x2 U2 x3 U3 x4 U4 ... xN UN
• Z, the core tensor, governs the interaction between the mode matrices
• Un, the mode-n matrix, spans the column space of D(n)
Tensor Decomposition
• D = Z x1 U1 x2 U2 x3 U3
Multilinear (Tensor) Decomposition

    D = Z x1 U1 x2 U2 x3 U3

      = Σ_{r1=1..R1} Σ_{r2=1..R2} Σ_{r3=1..R3} σ_{r1 r2 r3} u_{r1} ∘ u_{r2} ∘ u_{r3}

• Equivalently, in vectorized form:

    vec(D) = (U3 ⊗ U2 ⊗ U1) vec(Z)

[figure: tensor D written as core tensor Z multiplied by mode matrices U1, U2, U3]
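The equivalence between the mode-product form D = Z x1 U1 x2 U2 x3 U3 and the vectorized Kronecker form can be checked numerically. This is a sketch: `mode_n_product` is a helper written for it, not part of the lecture, and vec() is taken column-major.

```python
import numpy as np

def mode_n_product(A, M, n):
    # B = A x_n M: contract mode n of A with the second axis of M.
    return np.moveaxis(np.tensordot(M, A, axes=(1, n)), 0, n)

rng = np.random.default_rng(0)
Z = rng.standard_normal((2, 3, 4))
U1, U2, U3 = (rng.standard_normal((s, s)) for s in Z.shape)

# D = Z x1 U1 x2 U2 x3 U3
D = mode_n_product(mode_n_product(mode_n_product(Z, U1, 0), U2, 1), U3, 2)

# Kronecker form of the same decomposition, with column-major vec().
lhs = D.flatten(order="F")
rhs = np.kron(np.kron(U3, U2), U1) @ Z.flatten(order="F")
print(np.allclose(lhs, rhs))  # True
```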
Facial Data Tensor Decomposition

    D = Z x1 Upeople x2 Uviews x3 Uillums. x4 Uexpress x5 Upixels

Decomposition from Pixel Space to Factor Spaces

[figure: data tensor D with modes people, views, illuminations, expressions]
N-Mode SVD Algorithm
1. For n = 1, ..., N, compute matrix Un by computing the SVD of the flattened matrix D(n) and setting Un to be the left matrix of the SVD.
2. Solve for the core tensor as follows:
    Z = D x1 U1^T x2 U2^T x3 U3^T x4 U4^T x5 U5^T
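The two-step algorithm above can be sketched in a few lines of NumPy; the helper names (`unfold`, `mode_n_product`, `n_mode_svd`) and the cyclic flattening order are this sketch's choices, not the lecture's.

```python
import numpy as np

def unfold(T, n):
    # Mode-n flattening: mode n indexes the rows, remaining modes the columns.
    order = [n] + [(n + k) % T.ndim for k in range(1, T.ndim)]
    return np.transpose(T, order).reshape(T.shape[n], -1)

def mode_n_product(T, M, n):
    # B = T x_n M: contract mode n of T with the second axis of M.
    return np.moveaxis(np.tensordot(M, T, axes=(1, n)), 0, n)

def n_mode_svd(D):
    # Step 1: U_n = left singular matrix of the mode-n flattening D_(n).
    Us = [np.linalg.svd(unfold(D, n), full_matrices=False)[0]
          for n in range(D.ndim)]
    # Step 2: Z = D x1 U1^T x2 U2^T ... xN UN^T.
    Z = D
    for n, U in enumerate(Us):
        Z = mode_n_product(Z, U.T, n)
    return Z, Us

D = np.random.default_rng(0).standard_normal((4, 5, 6))
Z, Us = n_mode_svd(D)

# Reconstruct: D = Z x1 U1 x2 U2 x3 U3
D_hat = Z
for n, U in enumerate(Us):
    D_hat = mode_n_product(D_hat, U, n)
print(np.allclose(D, D_hat))  # True
```

Because each Un here is square and orthogonal, multiplying the core tensor back by the mode matrices recovers D exactly; truncating the columns of the Un would give the compressed multilinear model.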
Multilinear (Tensor) Algebra
• Mode-n tensor flattening ("matricize"):
    A ∈ R^(I1 x I2 x ... x IN)
    A(n) ∈ R^(In x (I(n+1) ... IN I1 ... I(n-1)))
• Example: N = 3
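The flattening above translates directly to NumPy. The cyclic column ordering (I(n+1) ... IN I1 ... I(n-1)) matches the definition on the slide; the function name `unfold` is just this sketch's choice.

```python
import numpy as np

def unfold(A, n):
    # Mode-n flattening ("matricize"): mode n becomes the rows; the
    # remaining modes, cycled as n+1, ..., N, 1, ..., n-1, form the columns.
    order = [n] + [(n + k) % A.ndim for k in range(1, A.ndim)]
    return np.transpose(A, order).reshape(A.shape[n], -1)

# N = 3 example: a 2 x 3 x 4 tensor has three flattenings.
A = np.arange(24).reshape(2, 3, 4)
print(unfold(A, 0).shape, unfold(A, 1).shape, unfold(A, 2).shape)
# (2, 12) (3, 8) (4, 6)
```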
Computing Uillums
• D(illums.) – flatten D along the illumination dimension
• Uillums – orthogonalizes the column space of D(illums.)

[figure: tensor D flattened into D(illums.), an illuminations x images matrix]

Computing Uviews
• D(views) – flatten D along the viewpoint dimension
• Uviews – orthogonalizes the column space of D(views)

[figure: tensor D flattened into D(views), a views x images matrix]
Computing Upixels
• D(pixels) – flatten D along the pixel dimension
• Upixels – orthogonalizes D(pixels)
  – eigenimages

[figure: tensor D flattened into D(pixels), a pixels x images matrix]
Mode-n Product
• The mode-n product is a generalization of the product of two matrices
• It is the product of a tensor with a matrix
• Mode-n product of A ∈ R^(I1 x ... x In x ... x IN) and M ∈ R^(Jn x In):
    B = A xn M,   B ∈ R^(I1 x ... x I(n-1) x Jn x I(n+1) x ... x IN)
• Entrywise:
    b_(i1 ... i(n-1) jn i(n+1) ... iN) = Σ_(in) a_(i1 ... in ... iN) m_(jn in)

[figure: mode-2 product replacing I2 by J2, and mode-3 product replacing I3 by J3]
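The mode-n product defined above has a direct NumPy translation via `tensordot`; the helper name is this sketch's, and the shapes mirror the mode-2 example in the figure.

```python
import numpy as np

def mode_n_product(A, M, n):
    # B = A x_n M: contract mode n of A (size I_n) against the columns
    # of M (J_n x I_n), so mode n of B has size J_n.
    return np.moveaxis(np.tensordot(M, A, axes=(1, n)), 0, n)

A = np.random.default_rng(1).standard_normal((2, 3, 4))  # I1 x I2 x I3
M = np.random.default_rng(2).standard_normal((5, 3))     # J2 x I2
B = mode_n_product(A, M, 1)
print(B.shape)  # (2, 5, 4) -- I2 = 3 replaced by J2 = 5
```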
Multilinear (Tensor) Algebra
• N-th order tensor: A ∈ R^(I1 x I2 x ... x IN)
• Matrix (2nd-order tensor): M ∈ R^(Jn x In)
• Mode-n product: B = A xn M, where B(n) = M A(n)
• Flattening the decomposition along the pixel mode:
    D(pixels) = Upixels Z(pixels) (Uexpress ⊗ Uillums. ⊗ Uviews ⊗ Upeople)^T
    (data matrix = basis matrix x coefficient matrix)
Eigenfaces vs TensorFaces
• Multilinear Analysis / TensorFaces:
    D = Z x1 Upeople x2 Uviews x3 Uillums. x4 Uexpress x5 Upixels
• Linear Analysis / Eigenfaces:
    D(pixels) = Upixels Z(pixels) (Uexpress ⊗ Uillums. ⊗ Uviews ⊗ Upeople)^T
    (data matrix = basis matrix x coefficient matrix)
• TensorFaces subsumes Eigenfaces
TensorFaces

    B = Z x5 Upixels

• TensorFaces explicitly represent covariance across factors

[figure: PCA basis vs TensorFaces basis across views, illums., people]
Strategic Data Compression = Perceptual Quality

• Original: 176 basis vectors
• TensorFaces: 3 illum. + 11 people parameters, 33 basis vectors, mean sq. err. = 409.15
• PCA: 33 parameters, 33 basis vectors, mean sq. err. = 85.75
• TensorFaces: 6 illum. + 11 people parameters, 66 basis vectors

• TensorFaces data reduction in illumination space primarily degrades illumination effects (cast shadows, highlights)
• PCA has lower mean square error but higher perceptual error