-
2. Principal Component Analysis
Techniques and Applications of Multivariate Analysis
-
Lecture 2. Principal Component Analysis (PCA)
Lecture 2-1
2.1 Introduction of PCA
2.2 Concepts of PCs
Lecture 2-2
2.3 Sample PCs
2.4 Graphing the PCs: PC scores
Appendix:
- Orthogonal Projection
- Difference between PCs for the two scales
-
2.1 Introduction of PCA
Definition of Principal Components (PCs):
Algebraic Def.: Particular linear combinations of the original p random variables.
Geometric Def.: Selection of a new coordinate system obtained by rotating the original system with x1, ..., xp as the coordinate axes.
PCA: a technique concerned with explaining the variance-covariance structure through the PCs.
Objectives:
Data (dimension) reduction: the p(p+1)/2 distinct variances and covariances -> p PCs
Interpretation
Checking normality and detecting outliers
PC scores can be used as new data
-
2.1 Introduction of PCA: History
Karl Pearson (1857-1936), 1901: best-fitting subspace based on the orthogonal projection of a two-dimensional vector onto a one-dimensional subspace
Harold Hotelling (1895-1973), 1933: approach for finding the PCs by maximizing the variance
http://upload.wikimedia.org/wikipedia/commons/2/21/Karl_Pearson_2.jpg
-
2.2 Concepts of PCs
Concepts:
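Random vector $\mathbf{x} = (x_1, \dots, x_p)'$ of p variables with mean vector $\boldsymbol{\mu}$ and covariance matrix $\boldsymbol{\Sigma}$
Eigenvalues and eigenvectors of $\boldsymbol{\Sigma}$: $(\lambda_i, \mathbf{e}_i)$, $i = 1, \dots, p$, with $\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_p \ge 0$
ith PC: $y_i = \mathbf{e}_i'\mathbf{x} = e_{i1}x_1 + \dots + e_{ip}x_p$
Variance and covariance of PCs: $\mathrm{Var}(y_i) = \lambda_i$, $\mathrm{Cov}(y_i, y_k) = 0$ for $i \ne k$
Properties of PCs:
ith PC maximizes $\mathrm{Var}(\mathbf{a}'\mathbf{x})$ over unit vectors ($\mathbf{a}'\mathbf{a} = 1$), subject to $\mathrm{Cov}(\mathbf{a}'\mathbf{x}, y_k) = 0$ for $k < i$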
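The concepts on this slide can be checked numerically. The following is a minimal sketch, assuming a hypothetical 3x3 covariance matrix `Sigma` (the values are illustrative and not from the lecture): it takes the eigenvalues and eigenvectors of `Sigma`, confirms that the PCs are uncorrelated with variances equal to the eigenvalues, and spot-checks the maximization property against random unit vectors.

```python
import numpy as np

# Hypothetical covariance matrix Sigma for p = 3 variables
# (illustrative values only, not taken from the lecture).
Sigma = np.array([[4.0, 2.0, 0.0],
                  [2.0, 3.0, 1.0],
                  [0.0, 1.0, 2.0]])

# eigh handles symmetric matrices and returns eigenvalues in ascending
# order, so reorder to get lambda_1 >= ... >= lambda_p.
eigvals, eigvecs = np.linalg.eigh(Sigma)
order = np.argsort(eigvals)[::-1]
lam = eigvals[order]
E = eigvecs[:, order]          # column i holds the eigenvector e_i

# The PCs are y = E' x, so their covariance matrix is E' Sigma E.
# It should be diagonal with the eigenvalues on the diagonal.
cov_y = E.T @ Sigma @ E

# Spot-check the maximization property: among random unit vectors a,
# none gives a variance a' Sigma a larger than lambda_1.
rng = np.random.default_rng(0)
A = rng.normal(size=(1000, 3))
A /= np.linalg.norm(A, axis=1, keepdims=True)
max_random_var = np.max(np.einsum("ij,jk,ik->i", A, Sigma, A))

print(np.round(lam, 4))
print(np.round(cov_y, 4))
print(max_random_var <= lam[0])
```

The random search is only a sanity check of the maximization property, not a proof of it.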
-
2.2 Concepts of PCs
Figure 1 gives a plot of 50 observations on two highly correlated variables x1 and x2. If we transform to PCs y1 and y2, we obtain the plot given in Figure 2. (Jolliffe (2002). Principal Component Analysis, Springer-Verlag, New York)
Figure 1: Plot of 50 observations on x1 and x2
Considering the variation in both x1 and x2: there is rather more variation in the direction of x1 than x2. After transforming to PCs, there is clearly greater variation in the direction of y1 but very little variation in the direction of y2.
Figure 2: Plot of 50 observations on PCs y1 and y2
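The transformation from (x1, x2) to (y1, y2) can be imitated with simulated data. This is a minimal sketch, not Jolliffe's data: it assumes 50 draws from a bivariate normal distribution with a hypothetical correlation of 0.9, and it uses the sample covariance matrix in place of the population one.

```python
import numpy as np

# Simulated stand-in for the 50 observations (assumption: bivariate
# normal with correlation 0.9; these are not the data behind Figure 1).
rng = np.random.default_rng(0)
n = 50
true_cov = np.array([[1.0, 0.9],
                     [0.9, 1.0]])
X = rng.multivariate_normal(mean=[0.0, 0.0], cov=true_cov, size=n)

# Eigen-decomposition of the sample covariance matrix, largest first.
S = np.cov(X, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(S)
order = np.argsort(eigvals)[::-1]
lam = eigvals[order]
E = eigvecs[:, order]

# PC scores: center the data, then project onto the eigenvectors.
scores = (X - X.mean(axis=0)) @ E

var_x = X.var(axis=0, ddof=1)        # variances of x1, x2
var_y = scores.var(axis=0, ddof=1)   # variances of y1, y2

print("Var(x1), Var(x2):", np.round(var_x, 3))
print("Var(y1), Var(y2):", np.round(var_y, 3))
```

With strongly correlated inputs, nearly all of the variance moves into y1 and little remains in y2, which is the pattern described for Figure 2.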
-
2.2 Concepts of PCs : Examples
Contours of constant probability based on
Difference between PCs for the two scales of measurement in x1:
- Most of the variation is in the direction of x1
- Both variables have the same degree of variation
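The dependence of the PCs on the scale of measurement can be illustrated with a small calculation. This is a minimal sketch under stated assumptions: a hypothetical 2x2 covariance matrix with equal variances and covariance 0.8, and a hypothetical change of units that multiplies x1 by 10. The helper `first_pc` is introduced here for illustration only.

```python
import numpy as np

def first_pc(Sigma):
    """Return the largest eigenvalue of a symmetric matrix and its unit eigenvector."""
    eigvals, eigvecs = np.linalg.eigh(Sigma)   # ascending order
    return eigvals[-1], eigvecs[:, -1]

# Case 1 (hypothetical): both variables have the same degree of variation.
Sigma = np.array([[1.0, 0.8],
                  [0.8, 1.0]])

# Case 2 (hypothetical): x1 is re-expressed in units 10 times smaller,
# so its values, and hence its standard deviation, are multiplied by 10.
D = np.diag([10.0, 1.0])
Sigma_rescaled = D @ Sigma @ D

lam_a, e_a = first_pc(Sigma)
lam_b, e_b = first_pc(Sigma_rescaled)

print("same scale:  e1 =", np.round(e_a, 3),
      " share =", round(lam_a / np.trace(Sigma), 3))
print("x1 rescaled: e1 =", np.round(e_b, 3),
      " share =", round(lam_b / np.trace(Sigma_rescaled), 3))
```

In the equal-variance case the first PC weights x1 and x2 equally, while after rescaling it is almost entirely x1. This is one common reason for running PCA on standardized variables when the units are not comparable.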
-
2.2 Concepts of PCs
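[Result 1] Total population variance and the sum of the variances of the PCs are the same: $\sigma_{11} + \dots + \sigma_{pp} = \sum_{i=1}^{p} \mathrm{Var}(x_i) = \lambda_1 + \dots + \lambda_p = \sum_{i=1}^{p} \mathrm{Var}(y_i)$, where $\sigma_{kk} = \mathrm{Var}(x_k)$ and $\lambda_i = \mathrm{Var}(y_i)$ is the ith largest eigenvalue of the covariance matrix
[Result 2] Proportion of total population variance explained by the ith PC: $\lambda_i / (\lambda_1 + \dots + \lambda_p)$, $i = 1, \dots, p$
[Result 3] Correlation coefficient between the ith PC and the kth variable: $\rho_{y_i, x_k} = e_{ik}\sqrt{\lambda_i} / \sqrt{\sigma_{kk}}$, $i, k = 1, \dots, p$, where $e_{ik}$ is the kth element of the ith eigenvector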
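The three results can be verified numerically in their standard form: total variance equals the sum of the eigenvalues, the ith PC explains the share lambda_i / (lambda_1 + ... + lambda_p), and the correlation between the ith PC and the kth variable is e_ik * sqrt(lambda_i) / sqrt(sigma_kk). This is a minimal sketch, assuming a hypothetical 3x3 covariance matrix whose values are illustrative and not from the lecture.

```python
import numpy as np

# Hypothetical covariance matrix (illustrative values only).
Sigma = np.array([[4.0, 2.0, 0.0],
                  [2.0, 3.0, 1.0],
                  [0.0, 1.0, 2.0]])

eigvals, eigvecs = np.linalg.eigh(Sigma)
order = np.argsort(eigvals)[::-1]
lam = eigvals[order]
E = eigvecs[:, order]               # column i is e_i, so E[k, i] = e_ik

# Result 1: total variance = trace(Sigma) = sum of eigenvalues.
total_var = np.trace(Sigma)

# Result 2: proportion of total variance explained by each PC.
prop = lam / lam.sum()

# Result 3: corr[i, k] = e_ik * sqrt(lambda_i) / sqrt(sigma_kk).
sd_x = np.sqrt(np.diag(Sigma))
corr = E.T * np.sqrt(lam)[:, None] / sd_x[None, :]

# Cross-check Result 3 from first principles:
# Cov(y_i, x_k) is the (i, k) entry of E' Sigma.
cov_yx = E.T @ Sigma
sd_y = np.sqrt(np.diag(E.T @ Sigma @ E))
corr_direct = cov_yx / np.outer(sd_y, sd_x)

print("total variance:", total_var, " sum of eigenvalues:", round(lam.sum(), 6))
print("proportions:", np.round(prop, 4))
print("correlations (rows = PCs, columns = variables):")
print(np.round(corr, 4))
```

As a further consistency check, the squared correlations in each column sum to 1, because the p PCs together account for all of the variance of each variable.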
-
2.2 Concepts of PCs : Procedures for PCA