SINGULAR VALUE DECOMPOSITION (SVD)/ PRINCIPAL COMPONENTS ANALYSIS (PCA)
SVD - EXAMPLE
U, S, VT = numpy.linalg.svd(img)
SVD - EXAMPLE
(Figure: reconstructions of the image at full rank and at k = 600, 300, 100, 50, 20, 10, computed as U[:, :k] @ np.diag(S[:k]) @ VT[:k, :].)
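The truncated reconstruction above can be sketched as follows. This is a minimal example with a synthetic array standing in for the image (a real image would be loaded separately, e.g. with `matplotlib.image.imread`):

```python
import numpy as np

# Synthetic stand-in for the image; any 2-D array works.
rng = np.random.default_rng(0)
img = rng.standard_normal((600, 800))

# Thin SVD: img = U @ diag(S) @ VT, with singular values sorted descending.
U, S, VT = np.linalg.svd(img, full_matrices=False)

def rank_k_approx(U, S, VT, k):
    """Best rank-k approximation of the original matrix (Eckart-Young)."""
    # U[:, :k] * S[:k] scales each kept column of U by its singular value.
    return U[:, :k] * S[:k] @ VT[:k, :]

for k in (600, 300, 100, 50, 20, 10):
    approx = rank_k_approx(U, S, VT, k)
    err = np.linalg.norm(img - approx) / np.linalg.norm(img)
    print(f"k={k:4d}  relative error {err:.3f}")
```

With k equal to the full rank the reconstruction is exact; the error grows as k shrinks, which is what the sequence of images on the slide illustrates.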
PCA - INTRODUCTION
X = [ 1  2  4
      2  1  5
      3  4 10
      4  3 11 ]
PRINCIPAL COMPONENT ANALYSIS
• A technique to find the directions along which the points (a set of tuples) in high-dimensional data line up best.
• Treat the set of tuples as a matrix M and find the eigenvectors of MMᵀ or MᵀM.
• The matrix of these eigenvectors can be thought of as a rigid rotation in a high-dimensional space.
• When this transformation is applied to the original data, the axis corresponding to the principal eigenvector is the one along which the points are most “spread out”.
PRINCIPAL COMPONENT ANALYSIS
• This axis, corresponding to the principal eigenvector, is the one along which the variance of the data is maximized.
• Points can best be viewed as lying along this axis with small deviations from it.
• Likewise, the axis corresponding to the second eigenvector is the one along which the variance of distances from the first axis is greatest, and so on.
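The "rigid rotation" view can be sketched numerically. In this hedged example the data cloud is synthetic, and the eigenvectors of MᵀM are computed with `numpy.linalg.eigh` (valid because MᵀM is symmetric):

```python
import numpy as np

# Synthetic centered cloud, deliberately stretched along one direction.
rng = np.random.default_rng(1)
M = rng.standard_normal((200, 2)) @ np.array([[3.0, 0.0], [0.0, 0.5]])
M -= M.mean(axis=0)  # center so M.T @ M is proportional to the covariance

# Eigenvectors of M^T M, sorted by decreasing eigenvalue.
vals, vecs = np.linalg.eigh(M.T @ M)
order = np.argsort(vals)[::-1]
vals, vecs = vals[order], vecs[:, order]

# The eigenvector matrix is orthogonal, so applying it is a rigid rotation.
rotated = M @ vecs
print(rotated.var(axis=0))  # spread is largest along the first new axis
```

After the rotation, the first coordinate carries the largest variance, matching the claim about the principal eigenvector.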
PRINCIPAL COMPONENT ANALYSIS
• Principal Component Analysis (PCA) is a dimensionality reduction method.
• The goal is to embed data from a high-dimensional space onto a small number of dimensions.
• Its most frequent use is in exploratory data analysis and visualization.
• It can also be helpful in regression (linear or logistic) where we can transform input variables into a smaller number of predictors for modeling.
PRINCIPAL COMPONENT ANALYSIS
• Mathematically:
  Given: a data set {x1, x2, …, xn}, where xi is the vector of p variable values for the i-th observation.
  Return: a matrix of linear transformations [ϕ1, ϕ2, …, ϕp] that retain maximal variance.
• You can think of the first vector ϕ1 as a linear transformation that embeds the observations into 1 dimension:
  Z1 = ϕ11 X1 + ϕ21 X2 + … + ϕp1 Xp
PRINCIPAL COMPONENT ANALYSIS
• You can think of the first vector ϕ1 as a linear transformation that embeds the observations into 1 dimension:
  Z1 = ϕ11 X1 + ϕ21 X2 + … + ϕp1 Xp
  where ϕ1 is selected so that the resulting dataset {z1, …, zn} has maximum variance.
• In order for this to make sense mathematically:
  • The data has to be centered: each Xi has zero mean.
  • The transformation vector ϕ1 has to be normalized, i.e., ∑_{j=1}^p ϕ_{j1}² = 1.
PRINCIPAL COMPONENT ANALYSIS
• In order for this to make sense mathematically, the data has to be centered (each Xi has zero mean) and the transformation vector ϕ1 has to be normalized, i.e., ∑_{j=1}^p ϕ_{j1}² = 1.
• We can find ϕ1 by solving an optimization problem: maximize variance subject to the normalization constraint:

  max_{ϕ11, ϕ21, …, ϕp1}  (1/n) ∑_{i=1}^n ( ∑_{j=1}^p ϕ_{j1} x_{ij} )²   s.t.  ∑_{j=1}^p ϕ_{j1}² = 1
PRINCIPAL COMPONENT ANALYSIS
• We can find ϕ1 by solving an optimization problem: maximize variance subject to the normalization constraint:

  max_{ϕ11, ϕ21, …, ϕp1}  (1/n) ∑_{i=1}^n ( ∑_{j=1}^p ϕ_{j1} x_{ij} )²   s.t.  ∑_{j=1}^p ϕ_{j1}² = 1

• The second transformation ϕ2 is obtained similarly, with the added constraint that ϕ2 is orthogonal to ϕ1.
• Taken together, [ϕ1, ϕ2] define a pair of linear transformations of the data into 2-dimensional space:

  Z_{n×2} = X_{n×p} [ϕ1, ϕ2]_{p×2}
PRINCIPAL COMPONENT ANALYSIS
• Taken together, [ϕ1, ϕ2] define a pair of linear transformations of the data into 2-dimensional space:

  Z_{n×2} = X_{n×p} [ϕ1, ϕ2]_{p×2}

• Each column of the Z matrix is called a principal component.
• The units of the PCs are meaningless.
• In practice we may also scale each Xj to have unit variance.
• In general, if the variables Xj are measured in different units (e.g., miles vs. liters vs. dollars), they should be scaled to have unit variance.
SPECTRAL THEOREM
• Start from the eigenvector equation (XᵀX)ϕ = λϕ. Multiplying both sides on the left by X gives XXᵀ(Xϕ) = λ(Xϕ).
• The matrices XXᵀ and XᵀX therefore share the same nonzero eigenvalues.
• Conclusion: to get an eigenvector of XXᵀ from an eigenvector ϕ of XᵀX, multiply ϕ on the left by X.
• This is very powerful, particularly if the number of observations, m, and the number of predictors, n, are drastically different in size.
• For PCA: Cov(X, X) = XXᵀ (for centered data, up to scaling).
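A quick numerical sketch of the shared-eigenvalue claim, using an arbitrary tall matrix (the sizes here are illustrative, not from the slides):

```python
import numpy as np

# Many observations, few predictors: X^T X is tiny, X X^T is huge.
rng = np.random.default_rng(2)
X = rng.standard_normal((500, 3))

small = X.T @ X   # 3 x 3   -- cheap to eigendecompose
big = X @ X.T     # 500 x 500

vals_small, vecs_small = np.linalg.eigh(small)
vals_big = np.linalg.eigvalsh(big)

# The 3 nonzero eigenvalues agree; big just has 497 extra (near-)zeros.
print(np.sort(vals_small)[::-1])
print(np.sort(vals_big)[::-1][:3])

# An eigenvector of X X^T obtained by multiplying phi on the left by X:
phi = vecs_small[:, -1]                 # eigenvector of the largest eigenvalue
v = X @ phi
residual = big @ v - vals_small[-1] * v
print(np.abs(residual).max())           # essentially zero
```

So we can always eigendecompose whichever of the two matrices is smaller and map the eigenvectors across, which is the practical payoff of the theorem.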
EXAMPLE - PCA
X = [ 1 2
      2 1
      3 4
      4 3 ]

Eigenvalues and eigenvectors?
EXAMPLE - PCA

X = [ 1 2
      2 1
      3 4
      4 3 ]

From the spectral theorem: (XᵀX)ϕ = λϕ

XᵀX = [ 1 2 3 4 ] [ 1 2     = [ 30 28
        2 1 4 3 ]   2 1         28 30 ]
                    3 4
                    4 3 ]
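The product can be checked directly in numpy, using the matrix from the slide:

```python
import numpy as np

# The 4 x 2 data matrix from the worked example.
X = np.array([[1, 2], [2, 1], [3, 4], [4, 3]])

# X^T X is the 2 x 2 matrix whose eigenvectors we need.
XtX = X.T @ X
print(XtX)   # [[30 28], [28 30]]
```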
EXAMPLE - PCA

X = [ 1 2
      2 1
      3 4
      4 3 ]

From the spectral theorem:

(XᵀX)ϕ = λϕ ⟹ ((XᵀX) − λI)ϕ = 0

det [ 30 − λ   28
      28       30 − λ ] = 0 ⟹ λ = 58 and λ = 2
EXAMPLE - PCA

X = [ 1 2
      2 1
      3 4
      4 3 ]

From the spectral theorem: (XᵀX)ϕ = λϕ

[ 30 28 ] [ ϕ11 ]      [ ϕ11 ]            [ 1/√2 ]
[ 28 30 ] [ ϕ12 ] = 58 [ ϕ12 ]  ⟹  ϕ1 = [ 1/√2 ]
EXAMPLE - PCA

X = [ 1 2
      2 1
      3 4
      4 3 ]

From the spectral theorem: (XᵀX)ϕ = λϕ

[ 30 28 ] [ ϕ11 ]      [ ϕ11 ]            [ 1/√2 ]
[ 28 30 ] [ ϕ12 ] = 58 [ ϕ12 ]  ⟹  ϕ1 = [ 1/√2 ]

[ 30 28 ] [ ϕ21 ]     [ ϕ21 ]            [ −1/√2 ]
[ 28 30 ] [ ϕ22 ] = 2 [ ϕ22 ]  ⟹  ϕ2 = [  1/√2 ]
EXAMPLE - PCA

X = [ 1 2
      2 1
      3 4
      4 3 ]

From the spectral theorem: (XᵀX)ϕ = λϕ

ϕ1 = [ 1/√2 ]    ϕ2 = [ −1/√2 ]    λ1 = 58,  λ2 = 2
     [ 1/√2 ]         [  1/√2 ]

ϕ = [ 1/√2  −1/√2
      1/√2   1/√2 ]
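The hand computation can be reproduced with `numpy.linalg.eigh` (note that eigenvector signs are only determined up to ±1, so numpy's columns may differ from the slide by a sign flip):

```python
import numpy as np

XtX = np.array([[30, 28], [28, 30]])

# eigh returns eigenvalues in ascending order for a symmetric matrix.
vals, vecs = np.linalg.eigh(XtX)

# Reorder so the dominant eigenpair comes first, matching phi_1, phi_2.
order = np.argsort(vals)[::-1]
vals, vecs = vals[order], vecs[:, order]

print(vals)   # [58., 2.]
print(vecs)   # columns are [1,1]/sqrt(2) and [-1,1]/sqrt(2), up to sign
```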
EXAMPLE - PCA

X = [ 1 2
      2 1
      3 4
      4 3 ]

ϕ1 = [ 1/√2, 1/√2 ]ᵀ,  ϕ2 = [ −1/√2, 1/√2 ]ᵀ,  λ1 = 58,  λ2 = 2

Z = Xϕ = [ 1 2    [ 1/√2  −1/√2 ]   [ 3/√2   1/√2
           2 1      1/√2   1/√2 ] =   3/√2  −1/√2
           3 4                        7/√2   1/√2
           4 3 ]                      7/√2  −1/√2 ]
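The projection Z = Xϕ above can be verified numerically with the same numbers:

```python
import numpy as np

X = np.array([[1, 2], [2, 1], [3, 4], [4, 3]])
s = 1 / np.sqrt(2)
phi = np.array([[s, -s],
                [s,  s]])   # columns are phi_1 and phi_2 from the example

Z = X @ phi
# Scale by sqrt(2) so the entries show up as small integers.
print(Z * np.sqrt(2))   # rows: [3, 1], [3, -1], [7, 1], [7, -1]
```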
EXAMPLE - PCA

X = [ 1 2
      2 1
      3 4
      4 3 ]

Z = [ 3/√2   1/√2
      3/√2  −1/√2
      7/√2   1/√2
      7/√2  −1/√2 ]

(Figure: the original points (1,2), (2,1), (3,4), (4,3) plotted with the pair midpoints (1.5,1.5) and (3.5,3.5), and the transformed coordinates (3/√2, ±1/√2) and (7/√2, ±1/√2) shown along the principal axes.)
PCA STEPS - STEP 1 MEAN SUBTRACTION
PCA STEPS - STEP 2 COVARIANCE MATRIX
(Figure: top 5 rows of the mean-centered data, and the resulting covariance matrix.)
PCA STEPS - STEP 3 EIGENVALUES & EIGENVECTORS OF THE COVARIANCE MATRIX
PCA STEPS - STEP 4 - PRINCIPAL COMPONENTS
Multiply each eigenvector by (typically the square root of) its corresponding eigenvalue.
Plot them on top of the data.
PCA STEPS - STEP 5 - PROJECT DATA ALONG DOMINANT PC
newData = PC1 × oldData (each observation projected onto the dominant eigenvector)
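The five steps can be sketched end to end in numpy. This is a minimal outline on synthetic data (the dataset and sizes here are illustrative assumptions, not the one from the slides):

```python
import numpy as np

# Hypothetical data: n observations x p variables.
rng = np.random.default_rng(3)
data = rng.standard_normal((100, 2)) @ np.array([[2.0, 1.2], [0.0, 0.6]])

# Step 1: mean subtraction.
centered = data - data.mean(axis=0)

# Step 2: covariance matrix (p x p).
cov = np.cov(centered, rowvar=False)

# Step 3: eigenvalues and eigenvectors of the covariance matrix,
# sorted so the dominant eigenpair comes first.
vals, vecs = np.linalg.eigh(cov)
order = np.argsort(vals)[::-1]
vals, vecs = vals[order], vecs[:, order]

# Step 4: scale each eigenvector by the square root of its eigenvalue
# (useful for drawing the PC arrows on top of the data).
arrows = vecs * np.sqrt(vals)

# Step 5: project the data along the dominant PC.
pc1 = vecs[:, 0]
new_data = centered @ pc1   # one coordinate per observation
print(new_data.shape)       # (100,)
```

The variance of the projected coordinates equals the largest eigenvalue of the covariance matrix, which is exactly the "maximum variance" property from earlier slides.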
HOW MANY PRINCIPAL COMPONENTS ?
• How many PCs should we consider in post-hoc analysis?
• One result of PCA is a measure of the variance attributed to each PC relative to the total variance of the dataset.
• We can calculate the percentage of variance explained for the m-th PC:
PVE_m = ( ∑_{i=1}^n z_{im}² ) / ( ∑_{j=1}^p ∑_{i=1}^n x_{ij}² )
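The PVE formula can be computed directly; here is a sketch on synthetic centered data (the formula assumes zero-mean columns):

```python
import numpy as np

# Hypothetical centered data: n = 50 observations, p = 3 variables.
rng = np.random.default_rng(4)
X = rng.standard_normal((50, 3))
X -= X.mean(axis=0)

# Principal components via the eigenvectors of X^T X, dominant first.
vals, vecs = np.linalg.eigh(X.T @ X)
order = np.argsort(vals)[::-1]
vecs = vecs[:, order]
Z = X @ vecs

# PVE_m: sum of squared scores of PC m over the total sum of squares.
pve = (Z ** 2).sum(axis=0) / (X ** 2).sum()
print(pve)         # fractions, in decreasing order
print(pve.sum())   # 1.0: the PCs together explain all the variance
```

Because the eigenvector matrix is orthogonal, the PVEs always sum to 1, and a scree plot of `pve` is the usual tool for choosing how many PCs to keep.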