Principal Component Analysis (PCA)
J.-S. Roger Jang (張智星)
http://mirlab.org/jang
MIR Lab, CSIE Dept
National Taiwan University
-2-
Introduction to PCA
PCA (Principal Component Analysis): an effective method for reducing dataset dimensions while keeping spatial characteristics as much as possible.
Characteristics:
- For unlabeled data
- A linear transform with a solid mathematical foundation
Applications:
- Line/plane fitting
- Face recognition
-3-
Problem Definition
Input: a dataset of n d-dim points which are zero justified:
$X = \{x_1, x_2, \ldots, x_n\}, \quad \sum_{i=1}^{n} x_i = 0$
Output: a unit vector u such that the square sum of the dataset's projections onto u is maximized.
-4-
Projection
Angle θ between two vectors x and u:
$\cos\theta = \dfrac{x^T u}{\|x\|\,\|u\|}$
Projection (scalar) of x onto u:
$\|x\|\cos\theta = x^T u \quad \text{if } \|u\| = 1$
Projection (vector) of x onto u:
$\hat{x} = (x^T u)\, u$
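As a quick numerical check of these formulas (a minimal NumPy sketch; the vectors are made up for illustration):

```python
import numpy as np

x = np.array([3.0, 4.0])
u = np.array([1.0, 0.0])   # already a unit vector

cos_theta = (x @ u) / (np.linalg.norm(x) * np.linalg.norm(u))
scalar_proj = x @ u        # equals ||x|| cos(theta), since ||u|| = 1
vector_proj = (x @ u) * u  # the projection of x onto u as a vector

print(cos_theta, scalar_proj, vector_proj)   # 0.6 3.0 [3. 0.]
```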
-5-
Mathematical Formulation
Dataset representation: X is d by n, with n > d:
$X = \begin{bmatrix} | & | & & | \\ x_1 & x_2 & \cdots & x_n \\ | & | & & | \end{bmatrix}$
Projection of each column of X onto u:
$p = X^T u = \begin{bmatrix} x_1^T u & x_2^T u & \cdots & x_n^T u \end{bmatrix}^T$
Square sum:
$J(u) = \|p\|^2 = p^T p = (X^T u)^T (X^T u) = u^T X X^T u$
Objective function with a constraint on u:
$\max_u\; J(u) = u^T X X^T u, \quad \text{s.t. } u^T u = 1$
Equivalently, with a Lagrange multiplier $\lambda$:
$\tilde{J}(u, \lambda) = u^T X X^T u - \lambda\,(u^T u - 1)$
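A small numerical check of the identity $p^T p = u^T X X^T u$ (a sketch with random zero-justified data; nothing here is from the slides beyond the formulas):

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 2, 100
X = rng.standard_normal((d, n))
X = X - X.mean(axis=1, keepdims=True)   # zero-justify the columns

u = rng.standard_normal(d)
u = u / np.linalg.norm(u)               # unit vector

p = X.T @ u                             # projections of all columns onto u
print(p @ p, u @ (X @ X.T) @ u)         # the two values agree
```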
-6-
Optimization of the Obj. Function
Set the gradient to zero:
$\dfrac{\partial \tilde{J}(u, \lambda)}{\partial u} = 2 X X^T u - 2 \lambda u = 0 \;\Rightarrow\; X X^T u = \lambda u$
That is, u is an eigenvector of $X X^T$ while $\lambda$ is the corresponding eigenvalue.
When u is the eigenvector:
$J(u) = u^T X X^T u = \lambda\, u^T u = \lambda$
If we arrange the eigenvalues such that $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_d$:
- Max of $J(u)$ is $\lambda_1$, which occurs at $u = u_1$
- Min of $J(u)$ is $\lambda_d$, which occurs at $u = u_d$
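This result can be verified numerically (a sketch; note that np.linalg.eigh returns eigenvalues in ascending order, so the last column corresponds to $\lambda_1$):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((3, 200))
X = X - X.mean(axis=1, keepdims=True)

# Eigen-decomposition of the symmetric matrix X X^T
eigvals, eigvecs = np.linalg.eigh(X @ X.T)   # ascending eigenvalues
u1 = eigvecs[:, -1]                          # eigenvector of the largest eigenvalue

J = lambda u: u @ (X @ X.T) @ u
print(J(u1), eigvals[-1])                    # J attains its max, lambda_1, at u1
```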
-7-
Facts about Symmetric Matrices
A square symmetric matrix has orthogonal eigenvectors corresponding to different eigenvalues.
Proof: Let $A x_1 = \lambda_1 x_1$ and $A x_2 = \lambda_2 x_2$ with $\lambda_1 \ne \lambda_2$. Since $A^T = A$:
$\lambda_1 x_2^T x_1 = x_2^T (A x_1) = (A x_2)^T x_1 = \lambda_2 x_2^T x_1$
$\Rightarrow (\lambda_1 - \lambda_2)\, x_2^T x_1 = 0 \;\Rightarrow\; x_2^T x_1 = 0.$
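A quick numerical confirmation of this fact (a sketch with a random symmetric matrix; np.linalg.eigh in fact returns an orthonormal eigenvector set even when eigenvalues repeat):

```python
import numpy as np

rng = np.random.default_rng(2)
B = rng.standard_normal((4, 4))
A = B + B.T                         # make a symmetric matrix

eigvals, eigvecs = np.linalg.eigh(A)
# Eigenvectors of distinct eigenvalues are orthogonal: V^T V = I
print(np.allclose(eigvecs.T @ eigvecs, np.eye(4)))   # True
```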
-8-
Conversion
Conversion between orthonormal bases:
$u_i^T u_j = \begin{cases} 1, & \text{if } i = j \\ 0, & \text{otherwise} \end{cases}$
$U = \begin{bmatrix} | & | & & | \\ u_1 & u_2 & \cdots & u_d \\ | & | & & | \end{bmatrix} \;\Rightarrow\; U U^T = U^T U = I$
$x = y_1 u_1 + y_2 u_2 + \cdots + y_d u_d = U y \;\Rightarrow\; y = U^{-1} x = U^T x$
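In code (a minimal sketch; the orthonormal basis is obtained from a QR decomposition purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(3)
U, _ = np.linalg.qr(rng.standard_normal((3, 3)))  # columns: an orthonormal basis

x = np.array([1.0, 2.0, 3.0])
y = U.T @ x                     # coordinates of x in the basis {u_1, ..., u_d}
x_back = U @ y                  # reconstruct x from its coordinates

print(np.allclose(x, x_back))   # True, since U U^T = I
```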
-9-
Steps for PCA
1. Find the sample mean: $\mu = \frac{1}{n} \sum_{i=1}^{n} x_i$
2. Compute the covariance matrix: $C = \frac{1}{n} \sum_{i=1}^{n} (x_i - \mu)(x_i - \mu)^T = \frac{1}{n} X X^T$ (with the columns of X zero justified)
3. Find the eigenvalues of C and arrange them into descending order, $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_d$, with the corresponding eigenvectors $\{u_1, u_2, \ldots, u_d\}$
4. The transformation is $y = U^T x$, with $U = \begin{bmatrix} | & | & & | \\ u_1 & u_2 & \cdots & u_d \\ | & | & & | \end{bmatrix}$
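These four steps translate directly into NumPy (a minimal sketch; the function and variable names are mine, not from the slides):

```python
import numpy as np

def pca(X):
    """PCA of a d-by-n data matrix X (columns are data points).
    Returns eigenvalues in descending order, the eigenvector
    matrix U (columns u_1..u_d), and the sample mean."""
    mu = X.mean(axis=1, keepdims=True)        # step 1: sample mean
    Xc = X - mu                               # zero justification
    C = (Xc @ Xc.T) / X.shape[1]              # step 2: covariance matrix
    eigvals, U = np.linalg.eigh(C)            # symmetric eigen-decomposition
    order = np.argsort(eigvals)[::-1]         # step 3: descending order
    return eigvals[order], U[:, order], mu

# Step 4: transform the (centered) data
rng = np.random.default_rng(4)
X = rng.standard_normal((2, 50))
eigvals, U, mu = pca(X)
Y = U.T @ (X - mu)                            # y = U^T x for every column
```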
-10-
PCA for TLS
Problem with ordinary LS (least squares): it is not robust if the fitting line has a large slope. PCA can be used for TLS (total least squares).
PCA for TLS of lines in 2D (see the sketch after this list):
- Zero adjustment (prove that the TLS line goes through the mean of the dataset)
- Find u1 & u2; use u2 as the normal vector
- Can be extended to surfaces in 3D
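A self-contained sketch of the 2D recipe (the synthetic steep-line data is my own example, chosen because ordinary LS handles it poorly):

```python
import numpy as np

# Noisy points along a steep line
rng = np.random.default_rng(5)
t = rng.uniform(-1.0, 1.0, 100)
pts = np.vstack([0.1 * t, 5.0 * t]) + 0.05 * rng.standard_normal((2, 100))

mu = pts.mean(axis=1, keepdims=True)      # the TLS line passes through the mean
eigvals, U = np.linalg.eigh((pts - mu) @ (pts - mu).T)
u1, u2 = U[:, 1], U[:, 0]                 # eigh is ascending: u1 = largest

# Line: all points p with u2^T (p - mu) = 0; u1 is its direction, u2 its normal
print("point on line:", mu.ravel(), "direction:", u1, "normal:", u2)
```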
-11-
Tidbits
1. PCA is designed for unlabeled data. For classification problems, we usually resort to LDA (linear discriminant analysis) for dimension reduction.
2. If d >> n, then we need a workaround for computing the eigenvectors: if $v$ is an eigenvector of the n-by-n matrix $X^T X$ with eigenvalue $\lambda$, then $X v$ is an eigenvector of $X X^T$ with the same eigenvalue, as sketched below.
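A sketch of that workaround, the gram-matrix trick (variable names and sizes are mine):

```python
import numpy as np

rng = np.random.default_rng(6)
d, n = 10_000, 20                  # d >> n, e.g. face images as columns
X = rng.standard_normal((d, n))
X = X - X.mean(axis=1, keepdims=True)

# Eigen-decompose the small n-by-n matrix X^T X instead of the d-by-d X X^T
eigvals, V = np.linalg.eigh(X.T @ X)        # ascending eigenvalues
k = 5                                       # keep the top k components
U = X @ V[:, -k:]                           # columns are eigenvectors of X X^T
U /= np.linalg.norm(U, axis=0)              # renormalize to unit length

# Check: X X^T u = lambda u for the top eigenvector
u, lam = U[:, -1], eigvals[-1]
print(np.allclose(X @ (X.T @ u), lam * u))  # True
```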
-12-
Example of PCA
(Figure: IRIS dataset projection.)
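The figure can be reproduced along these lines (a sketch assuming scikit-learn and matplotlib are available; projecting onto the top two principal components is my assumption about what the slide plots):

```python
import numpy as np
from sklearn.datasets import load_iris
import matplotlib.pyplot as plt

iris = load_iris()
X = iris.data.T                           # d-by-n, as in the slides (d=4, n=150)
Xc = X - X.mean(axis=1, keepdims=True)    # zero justification

eigvals, U = np.linalg.eigh(Xc @ Xc.T / X.shape[1])
U = U[:, ::-1]                            # descending eigenvalue order
Y = U[:, :2].T @ Xc                       # project onto the top 2 components

plt.scatter(Y[0], Y[1], c=iris.target)
plt.xlabel("PC 1"); plt.ylabel("PC 2")
plt.show()
```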
-13-
Weakness of PCA
Not designed for classification problems (with labeled training data).
(Figure: two panels contrasting an ideal situation and an adversary situation for PCA.)
-14-
Linear Discriminant Analysis
LDA projects onto directions that can best separate data of different classes.
(Figure: two panels showing an adversary situation for PCA versus an ideal situation for LDA.)