Principal Component Analysis Principles and Application

Post on 22-Dec-2015


Page 1: Principal Component Analysis Principles and Application

Principal Component Analysis: Principles and Application

Page 2: Principal Component Analysis Principles and Application

Fast Computers, Multi-Sensor Instruments, Large Data Sets

Examples:
• Satellite Data
• Digital Camera, Video Data
• Tomography
• Particle Imaging Velocimetry (PIV)
• Ultrasound Velocimetry (UVP)

Page 3: Principal Component Analysis Principles and Application

Large Data Sets

Low resolution image:

\[
\begin{pmatrix}
p(x_1, y_1) & p(x_1, y_2) & \cdots & p(x_1, y_{600}) \\
p(x_2, y_1) & p(x_2, y_2) & \cdots & p(x_2, y_{600}) \\
\vdots & \vdots & & \vdots \\
p(x_{400}, y_1) & p(x_{400}, y_2) & \cdots & p(x_{400}, y_{600})
\end{pmatrix}
\]

• There are 400 x 600 = 240,000 pieces of information.

• Not all of this information is independent => information compression (data compression)

Page 4: Principal Component Analysis Principles and Application

Experiment:

• Consider the flow past a cylinder, and suppose we position a cross-wire probe downstream of the cylinder.

• With a cross-wire probe we can measure two components of the velocity at successive time intervals and store the results in a computer.

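\[
\begin{pmatrix} u_1 \\ v_1 \end{pmatrix},
\begin{pmatrix} u_2 \\ v_2 \end{pmatrix},
\ldots,
\begin{pmatrix} u_j \\ v_j \end{pmatrix},
\ldots,
\begin{pmatrix} u_m \\ v_m \end{pmatrix}
\qquad (\text{time} \rightarrow)
\]

Example 1. Two-Component Velocity Measurement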

Page 5: Principal Component Analysis Principles and Application

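• As the previous slide suggests, the pair of velocities can be represented as a column vector:

\[
\mathbf{u}_j = \begin{pmatrix} u_j \\ v_j \end{pmatrix}
\]

• $\mathbf{u}$ is a vector at position $\mathbf{x}$ in physical space.

(Figure: the vector $\mathbf{u}$ drawn at position $\mathbf{x}$ in the $x$-$y$ plane.)

• The magnitude and angle of the vector change with time.

Mathematical Representation of Data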

Page 6: Principal Component Analysis Principles and Application

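Basic Statistics

• Mean velocity:

\[
\bar{\mathbf{u}} = \begin{pmatrix} \bar{u} \\ \bar{v} \end{pmatrix},
\quad \text{where the bar means} \quad
\bar{u} = \frac{1}{m} \sum_{j=1}^{m} u_j
\]

• Variance:

\[
\mathrm{Var}(u) = \sigma_u^2 = \frac{1}{m} \sum_{j=1}^{m} (u_j - \bar{u})^2,
\qquad
\mathrm{Var}(v) = \sigma_v^2 = \frac{1}{m} \sum_{j=1}^{m} (v_j - \bar{v})^2
\]

• Covariance:

\[
\mathrm{cov}(u, v) = \frac{1}{m} \sum_{j=1}^{m} (u_j - \bar{u})(v_j - \bar{v}),
\qquad
\mathrm{cov}(v, u) = \mathrm{cov}(u, v)
\]

• Correlation:

\[
\rho_{uv} = \frac{\mathrm{cov}(u, v)}{\sigma_u \sigma_v},
\qquad
-1 \le \rho_{uv} \le 1
\]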
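The basic statistics on this slide can be checked numerically. The following is a minimal sketch in Python with NumPy, assuming synthetic samples in place of real cross-wire data; the arrays `u` and `v` and the numbers used to generate them are illustrative assumptions, not values from the experiment. It uses the 1/m normalisation throughout.

```python
import numpy as np

# Synthetic samples standing in for m cross-wire measurements (assumed values).
rng = np.random.default_rng(0)
m = 1000
u = 2.0 + rng.normal(size=m)
v = 0.5 + 0.8 * (u - 2.0) + 0.3 * rng.normal(size=m)

# Mean: u_bar = (1/m) * sum_j u_j, and likewise for v.
u_bar = u.mean()
v_bar = v.mean()

# Variance with the 1/m normalisation.
var_u = np.mean((u - u_bar) ** 2)
var_v = np.mean((v - v_bar) ** 2)

# Covariance; swapping u and v gives the same number.
cov_uv = np.mean((u - u_bar) * (v - v_bar))

# Correlation coefficient, always between -1 and 1.
rho_uv = cov_uv / np.sqrt(var_u * var_v)

print(f"mean        = ({u_bar:.3f}, {v_bar:.3f})")
print(f"variances   = ({var_u:.3f}, {var_v:.3f})")
print(f"covariance  = {cov_uv:.3f}")
print(f"correlation = {rho_uv:.3f}")
```

Note that `np.cov` divides by m - 1 unless `bias=True` is passed, so it matches these formulas only with that flag.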

Page 7: Principal Component Analysis Principles and Application

Plot u vs v

(Figure: scatter plot of the samples $(u_1, v_1), \ldots, (u_j, v_j), \ldots, (u_m, v_m)$ in the $(u, v)$ plane.)

The data look correlated

Page 8: Principal Component Analysis Principles and Application

Examine the Statistics

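Move to a data-centered coordinate system:

\[
u' = u - \bar{u}, \qquad v' = v - \bar{v}
\]

(Figure: scatter of the data in the $(u, v)$ plane, with new axes $u'$ and $v'$ through the mean $(\bar{u}, \bar{v})$.)

Calculate the covariance matrix:

\[
\begin{pmatrix}
\dfrac{1}{m} \sum_{i=1}^{m} u_i'^2 & \dfrac{1}{m} \sum_{i=1}^{m} u_i' v_i' \\[2ex]
\dfrac{1}{m} \sum_{i=1}^{m} v_i' u_i' & \dfrac{1}{m} \sum_{i=1}^{m} v_i'^2
\end{pmatrix}
\]

Diagonal terms are the variances in the u' and v' directions.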

Page 9: Principal Component Analysis Principles and Application

Examine the Statistics

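Move to a data-centered coordinate system and calculate the covariance matrix:

\[
\begin{pmatrix}
\dfrac{1}{m} \sum_{i=1}^{m} u_i'^2 & \dfrac{1}{m} \sum_{i=1}^{m} u_i' v_i' \\[2ex]
\dfrac{1}{m} \sum_{i=1}^{m} v_i' u_i' & \dfrac{1}{m} \sum_{i=1}^{m} v_i'^2
\end{pmatrix}
\]

Off-diagonal terms are the covariance or cross-correlation.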
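As a small illustration of these two steps, the sketch below centres a set of two-component samples and forms the 2 x 2 covariance matrix as a matrix product. The data are synthetic and the names `D` and `C` are assumptions introduced for this example only.

```python
import numpy as np

# Synthetic, correlated two-component samples (assumed values).
rng = np.random.default_rng(1)
m = 500
u = rng.normal(size=m)
v = 0.6 * u + 0.4 * rng.normal(size=m)

# Data-centred coordinates: subtract the mean of each component.
# D has shape (2, m): row 0 holds u', row 1 holds v'.
D = np.vstack([u - u.mean(), v - v.mean()])

# Covariance matrix with 1/m normalisation.
# Diagonal: variances of u' and v'. Off-diagonal: their covariance.
C = D @ D.T / m

print(C)
```

Writing the sums as `D @ D.T` is the same construction that scales up to n-dimensional profiles, where the matrix becomes n x n.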

Page 10: Principal Component Analysis Principles and Application

Rotate coordinates to remove the correlations

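(Figure: the data in the $(u, v)$ plane with rotated axes $u''$ and $v''$ aligned with the principal axes of the scatter.)

\[
\frac{1}{m}
\begin{pmatrix}
\sum_{i=1}^{m} u_i''^2 & 0 \\
0 & \sum_{i=1}^{m} v_i''^2
\end{pmatrix}
\]

Covariance matrix in the (u”,v”) coordinate system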
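For two components the decorrelating rotation can be written in closed form. The sketch below is one way to do it, assuming synthetic data: it uses the standard principal-axis angle $\theta = \tfrac{1}{2}\,\mathrm{atan2}(2C_{uv},\, C_{uu} - C_{vv})$, rotates the centred data, and checks that the off-diagonal term vanishes. The names `theta`, `Q`, and `C_rot` are assumptions for this example.

```python
import numpy as np

# Synthetic, correlated two-component samples (assumed values).
rng = np.random.default_rng(2)
m = 2000
u = rng.normal(size=m)
v = 0.7 * u + 0.5 * rng.normal(size=m)

# Centred data and covariance matrix (1/m normalisation).
D = np.vstack([u - u.mean(), v - v.mean()])
C = D @ D.T / m

# Principal-axis angle for a symmetric 2 x 2 matrix.
theta = 0.5 * np.arctan2(2.0 * C[0, 1], C[0, 0] - C[1, 1])

# Rows of Q are the new axes u'' and v'' expressed in (u', v') coordinates.
c, s = np.cos(theta), np.sin(theta)
Q = np.array([[c, s],
              [-s, c]])

# Rotate the data and recompute the covariance matrix.
D_rot = Q @ D
C_rot = D_rot @ D_rot.T / m

print("angle (degrees):", np.degrees(theta))
print(C_rot)   # off-diagonal entries are zero up to round-off
```

With this choice of angle the first rotated axis carries the larger variance, and the two diagonal entries equal the eigenvalues of `C`.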

Page 11: Principal Component Analysis Principles and Application

We have just carried out a Principal Axis Transformation.

This is the first step in a Principal Component Analysis (PCA).

Page 12: Principal Component Analysis Principles and Application

Principal Component Analysis

A procedure for transforming a set of correlated variables into a new set of uncorrelated variables.

How do we do it??

Page 13: Principal Component Analysis Principles and Application

Construction of the PCA coordinate system

The PCA coordinate system is one that maximizes the mean squared projection of the data. In this sense it is an “optimal” orthogonal coordinate system. Its popularity is primarily due to its dimension-reducing properties.

The basic algorithm for constructing the PCA eigenvectors is:

• Find the best direction (line) in the space, $\boldsymbol{\phi}_1$.

• Find the best direction (line) $\boldsymbol{\phi}_2$ with the restriction that it must be orthogonal to $\boldsymbol{\phi}_1$.

• Find the best direction (line) $\boldsymbol{\phi}_i$ with the restriction that $\boldsymbol{\phi}_i$ is orthogonal to $\boldsymbol{\phi}_j$ for all $j < i$.
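The one-direction-at-a-time construction can be imitated numerically. The sketch below is one possible realisation, not necessarily how the author computed the directions: it uses power iteration to find the direction of largest mean squared projection, then deflation to restrict the next search to orthogonal directions. The function names and the synthetic test data are assumptions for this example.

```python
import numpy as np

def leading_direction(R, n_iter=500, seed=0):
    """Power iteration: approximate unit vector phi maximising phi^T R phi."""
    rng = np.random.default_rng(seed)
    phi = rng.normal(size=R.shape[0])
    phi /= np.linalg.norm(phi)
    for _ in range(n_iter):
        w = R @ phi
        norm = np.linalg.norm(w)
        if norm == 0.0:          # R is zero: no direction is preferred
            break
        phi = w / norm
    return phi

def greedy_directions(X, n_modes):
    """Find n_modes best directions, one at a time, for centred data X (n x m)."""
    m = X.shape[1]
    R = X @ X.T / m              # covariance matrix
    modes, variances = [], []
    for _ in range(n_modes):
        phi = leading_direction(R)
        lam = phi @ R @ phi      # mean squared projection onto phi
        modes.append(phi)
        variances.append(lam)
        # Deflation: remove the variance already explained, so the next
        # search is confined to directions orthogonal to phi.
        R = R - lam * np.outer(phi, phi)
    return np.column_stack(modes), np.array(variances)

# Synthetic 4-dimensional data with distinct variances along hidden axes.
rng = np.random.default_rng(1)
Q, _ = np.linalg.qr(rng.normal(size=(4, 4)))
Z = np.array([3.0, 2.0, 1.0, 0.5])[:, None] * rng.normal(size=(4, 5000))
X = Q @ Z
X = X - X.mean(axis=1, keepdims=True)

Phi, lam = greedy_directions(X, n_modes=3)

# Reference: the three largest eigenvalues of the covariance matrix.
lam_ref = np.linalg.eigvalsh(X @ X.T / X.shape[1])[::-1][:3]
print("greedy variances:   ", lam)
print("largest eigenvalues:", lam_ref)
```

The greedy variances agree with the largest eigenvalues of the covariance matrix, which is why in practice a single eigendecomposition replaces the step-by-step search.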

Page 14: Principal Component Analysis Principles and Application

How do we find this nice coordinate system??

Calculate the eigenvalues and eigenvectors of the Covariance Matrix
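A short derivation shows why the eigenvectors of the covariance matrix give the "best" directions. It is a sketch under stated assumptions: $\mathbf{x}_j$ denotes the $j$-th centred data vector, $\mathbf{R}$ the covariance matrix with 1/m normalisation, $\boldsymbol{\phi}$ a unit direction, and $\lambda$ a Lagrange multiplier; these symbols are introduced here for illustration.

```latex
% Mean squared projection of the centred data onto a unit direction phi
\[
J(\boldsymbol{\phi})
  = \frac{1}{m} \sum_{j=1}^{m} \left( \boldsymbol{\phi}^{T} \mathbf{x}_j \right)^2
  = \boldsymbol{\phi}^{T} \mathbf{R}\, \boldsymbol{\phi},
\qquad
\mathbf{R} = \frac{1}{m} \sum_{j=1}^{m} \mathbf{x}_j \mathbf{x}_j^{T}
\]

% Maximise J subject to the unit-length constraint
\[
L(\boldsymbol{\phi}, \lambda)
  = \boldsymbol{\phi}^{T} \mathbf{R}\, \boldsymbol{\phi}
  - \lambda \left( \boldsymbol{\phi}^{T} \boldsymbol{\phi} - 1 \right)
\]

% Stationarity gives the eigenvalue problem
\[
\frac{\partial L}{\partial \boldsymbol{\phi}}
  = 2 \mathbf{R} \boldsymbol{\phi} - 2 \lambda \boldsymbol{\phi} = \mathbf{0}
\quad \Longrightarrow \quad
\mathbf{R} \boldsymbol{\phi} = \lambda \boldsymbol{\phi}
\]

% At a stationary point the objective equals the eigenvalue
\[
J(\boldsymbol{\phi})
  = \boldsymbol{\phi}^{T} \mathbf{R}\, \boldsymbol{\phi}
  = \lambda\, \boldsymbol{\phi}^{T} \boldsymbol{\phi}
  = \lambda
\]
```

The maximum is therefore attained by the eigenvector with the largest eigenvalue. Repeating the argument with the added constraint of orthogonality to the directions already found yields the remaining eigenvectors in order of decreasing eigenvalue.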

Page 15: Principal Component Analysis Principles and Application

Experiment:

• Pipe Flow -- measurement of velocity profile.

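Example 2. Velocity Profile Measurement

(Figure: velocity profile $u(z)$ plotted against position $z$.)

\[
\mathbf{u} = \begin{pmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{pmatrix},
\quad \text{where } u_k = u(z_k)
\]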

Page 16: Principal Component Analysis Principles and Application

• As before we represent the velocities in the form of a column vector, but this time the vector is not in physical space.

• The space in which our vector lives is one we shall call profile space or pattern space.

• Profile space has n dimensions. In this example, the position $z_k$ defines a direction in profile space.

• As time evolves, we measure a sequence of velocity profiles:

\[
\begin{pmatrix} u(z_1, t_1) \\ u(z_2, t_1) \\ \vdots \\ u(z_n, t_1) \end{pmatrix},
\begin{pmatrix} u(z_1, t_2) \\ u(z_2, t_2) \\ \vdots \\ u(z_n, t_2) \end{pmatrix},
\ldots,
\begin{pmatrix} u(z_1, t_j) \\ u(z_2, t_j) \\ \vdots \\ u(z_n, t_j) \end{pmatrix},
\ldots,
\begin{pmatrix} u(z_1, t_m) \\ u(z_2, t_m) \\ \vdots \\ u(z_n, t_m) \end{pmatrix}
\qquad (\text{time} \rightarrow)
\]

Vectors in Profile Space

Page 17: Principal Component Analysis Principles and Application
Page 18: Principal Component Analysis Principles and Application

The Preliminary Calculations

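1. UVP Data Matrix (n x m = 128 x 1024)

\[
\mathbf{U} =
\begin{pmatrix}
u(z_1, t_1) & u(z_1, t_2) & \cdots & u(z_1, t_m) \\
u(z_2, t_1) & u(z_2, t_2) & \cdots & u(z_2, t_m) \\
\vdots & \vdots & & \vdots \\
u(z_n, t_1) & u(z_n, t_2) & \cdots & u(z_n, t_m)
\end{pmatrix}
\]

2. Mean Profile Matrix (n x m)

\[
\bar{\mathbf{U}} =
\begin{pmatrix}
\bar{u}(z_1) & \bar{u}(z_1) & \cdots & \bar{u}(z_1) \\
\bar{u}(z_2) & \bar{u}(z_2) & \cdots & \bar{u}(z_2) \\
\vdots & \vdots & & \vdots \\
\bar{u}(z_n) & \bar{u}(z_n) & \cdots & \bar{u}(z_n)
\end{pmatrix},
\qquad
\bar{u}(z_i) = \frac{1}{m} \sum_{k=1}^{m} u(z_i, t_k)
\]

3. Centered Data Matrix (n x m)

\[
\mathbf{X} = \frac{1}{\sqrt{m}} \left( \mathbf{U} - \bar{\mathbf{U}} \right)
\]

4. Covariance Matrix (n x n = 128 x 128)

\[
\mathbf{R}
= \underbrace{\frac{1}{\sqrt{m}} \left( \mathbf{U} - \bar{\mathbf{U}} \right)}_{\mathbf{X}}
  \underbrace{\frac{1}{\sqrt{m}} \left( \mathbf{U} - \bar{\mathbf{U}} \right)^{T}}_{\mathbf{X}^{T}}
= \mathbf{X} \mathbf{X}^{T}
\]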
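The four preliminary steps map directly onto array operations. The sketch below assumes a synthetic data matrix of the quoted size (128 positions by 1024 times) in place of real UVP measurements, and assumes the centred matrix carries a $1/\sqrt{m}$ factor so that the covariance matrix is simply $\mathbf{X}\mathbf{X}^{T}$. The profile shapes and noise level are made up for illustration.

```python
import numpy as np

n, m = 128, 1024                       # positions x times, as quoted on the slide
rng = np.random.default_rng(3)
z = np.linspace(0.0, 1.0, n)
t = np.linspace(0.0, 10.0, m)

# 1. Data matrix U (n x m): each column is one measured profile.
#    Synthetic stand-in: steady profile + one oscillating pattern + noise.
U = (
    (1.0 - z[:, None] ** 2)
    + 0.2 * np.sin(2.0 * np.pi * z)[:, None] * np.cos(3.0 * t)[None, :]
    + 0.05 * rng.normal(size=(n, m))
)

# 2. Mean profile matrix (n x m): the time-mean profile repeated in every column.
u_bar = U.mean(axis=1, keepdims=True)  # shape (n, 1)
U_bar = np.tile(u_bar, (1, m))         # shape (n, m)

# 3. Centred data matrix, scaled by 1/sqrt(m).
X = (U - U_bar) / np.sqrt(m)

# 4. Covariance matrix (n x n).
R = X @ X.T

print(U.shape, U_bar.shape, X.shape, R.shape)
```

In practice the explicit `U_bar` matrix is unnecessary, since NumPy broadcasting lets `U - u_bar` subtract the mean profile from every column; it is built here only to mirror the four steps.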

Page 19: Principal Component Analysis Principles and Application

The Diagonalization

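Eigenvalue Equation

\[
\left( \mathbf{R} - \lambda_k \mathbf{I} \right) \boldsymbol{\phi}_k = \mathbf{0},
\quad k = 1, \ldots, n
\qquad \text{(equivalently } \mathbf{R} \boldsymbol{\Phi} = \boldsymbol{\Phi} \boldsymbol{\lambda} \text{)}
\]

Eigenvalues

\[
\boldsymbol{\lambda} =
\begin{pmatrix}
\lambda_1 & 0 & \cdots & 0 \\
0 & \lambda_2 & & \vdots \\
\vdots & & \ddots & 0 \\
0 & \cdots & 0 & \lambda_n
\end{pmatrix}
\]

Eigenvectors (eigenprofiles)

\[
\boldsymbol{\Phi} = \left( \boldsymbol{\phi}_1, \ldots, \boldsymbol{\phi}_n \right) =
\begin{pmatrix}
\phi_{11} & \cdots & \phi_{1n} \\
\vdots & & \vdots \\
\phi_{n1} & \cdots & \phi_{nn}
\end{pmatrix},
\qquad
\boldsymbol{\phi}_i^{T} \boldsymbol{\phi}_k =
\begin{cases}
1, & i = k \\
0, & i \neq k
\end{cases}
\]

Note: $\lambda_k = \sigma_k^2$ is the variance of the data in the direction $\boldsymbol{\phi}_k$.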
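The diagonalization can be carried out with a symmetric eigensolver. The sketch below uses `np.linalg.eigh` on a small synthetic covariance matrix and checks the properties stated on this slide. Sorting the eigenvalues in descending order is a convention assumed here, and the names `lam`, `Phi`, and `var_along` are illustrative.

```python
import numpy as np

# Small synthetic data set: n = 6 positions, m = 400 times (assumed values).
rng = np.random.default_rng(4)
n, m = 6, 400
U = rng.normal(size=(n, n)) @ rng.normal(size=(n, m))

# Centred data with the 1/sqrt(m) factor, so that R = X X^T.
X = (U - U.mean(axis=1, keepdims=True)) / np.sqrt(m)
R = X @ X.T

# eigh is meant for symmetric matrices and returns eigenvalues in ascending order.
lam, Phi = np.linalg.eigh(R)

# Reorder so that mode 1 carries the largest variance.
order = np.argsort(lam)[::-1]
lam, Phi = lam[order], Phi[:, order]

# Variance of the data along each eigenvector: project, then sum squares.
# (The 1/m of the variance is already contained in X.)
A = Phi.T @ X
var_along = np.sum(A ** 2, axis=1)

print("eigenvalues:      ", lam)
print("variance along phi:", var_along)
```

The columns of `Phi` are orthonormal, `R @ Phi` equals `Phi` with each column scaled by its eigenvalue, and the variance along each eigenvector equals the corresponding eigenvalue.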

Page 20: Principal Component Analysis Principles and Application

Example 3. Taylor-Couette Flow

Page 21: Principal Component Analysis Principles and Application

UVP Example

(Figure: UVP data plotted in space versus time.)

(Figure: Covariance Matrix, plotted in space versus space, before and after diagonalisation.)

compression!!

Page 22: Principal Component Analysis Principles and Application

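The Eigenvalue Spectrum
(Signal) Energy Spectrum

Energy Fraction

\[
E_k = \frac{\lambda_k}{\sum_{j=1}^{n} \lambda_j},
\qquad
\sum_{k=1}^{n} E_k = 1
\]

(Figure: $E_k$ versus Mode Number, 1 to 128, on a vertical axis from 0 to 1.)

(Figure: cumulative sum of $E_k$ versus Mode Number, 1 to 20, on a vertical axis from 0 to 1.)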
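The energy fractions and their cumulative sum are simple to compute once the eigenvalues are known. The sketch below assumes a made-up list of eigenvalues; the helper names `energy_fractions` and `modes_needed`, and the thresholds, are hypothetical choices for illustration.

```python
import numpy as np

def energy_fractions(lam):
    """Return E_k = lam_k / sum(lam), sorted descending, and its cumulative sum."""
    lam = np.sort(np.asarray(lam, dtype=float))[::-1]
    E = lam / lam.sum()
    return E, np.cumsum(E)

def modes_needed(lam, threshold):
    """Smallest number of leading modes whose cumulative energy reaches threshold."""
    _, cumulative = energy_fractions(lam)
    # searchsorted finds the first index where cumulative >= threshold.
    count = int(np.searchsorted(cumulative, threshold)) + 1
    return min(count, cumulative.size)

# Made-up eigenvalues, for illustration only.
lam = [4.0, 3.0, 2.0, 1.0]
E, cumulative = energy_fractions(lam)

print("E_k:           ", E)
print("cumulative sum:", cumulative)
print("modes for 50% of the energy:", modes_needed(lam, 0.5))
print("modes for 80% of the energy:", modes_needed(lam, 0.8))
```

A rapidly saturating cumulative sum indicates that a few modes carry most of the variance, which is the basis for the compression and filtering that follow.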

Page 23: Principal Component Analysis Principles and Application

Filtering and Reconstruction

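• Decompose X into signal-dominated and noise-dominated components (subspaces):

\[
\mathbf{X} = \mathbf{X}_F + \mathbf{X}_{\text{Noise}}
\]

where $\mathbf{X}_F$ is the Filtered data and $\mathbf{X}_{\text{Noise}}$ is the Residual.

• Reconstruct filtered UVP velocity:

\[
\mathbf{U}_F = \mathbf{X}_F + \bar{\mathbf{U}}
\]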
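One common way to realise this split is to project the centred data onto the leading eigenvectors and treat the remainder as the residual. The sketch below assumes that approach, which the slides do not spell out. It takes X as the centred data without any $1/\sqrt{m}$ factor, so that the mean profile can be added back directly; if X carries that factor it must be undone first. The function name `pca_filter`, the number of retained modes `q`, and the synthetic data are assumptions.

```python
import numpy as np

def pca_filter(U, q):
    """Split U (n x m) into a q-mode filtered field U_F and a residual X_noise."""
    m = U.shape[1]
    U_bar = U.mean(axis=1, keepdims=True)     # mean profile, shape (n, 1)
    Xc = U - U_bar                            # centred data (no 1/sqrt(m) factor)
    R = Xc @ Xc.T / m                         # covariance matrix
    lam, Phi = np.linalg.eigh(R)
    keep = np.argsort(lam)[::-1][:q]          # indices of the q largest eigenvalues
    Phi_q = Phi[:, keep]
    X_F = Phi_q @ (Phi_q.T @ Xc)              # projection onto the signal subspace
    X_noise = Xc - X_F                        # residual
    U_F = X_F + U_bar                         # add the mean profile back
    return U_F, X_noise

# Synthetic test: one coherent pattern plus white noise (assumed values).
rng = np.random.default_rng(5)
n, m = 32, 400
z = np.linspace(0.0, 1.0, n)
t = np.linspace(0.0, 20.0 * np.pi, m)
clean = 1.0 + np.outer(np.sin(np.pi * z), np.cos(t))
U = clean + 0.1 * rng.normal(size=(n, m))

U_F, X_noise = pca_filter(U, q=1)

print("error of raw data:     ", np.linalg.norm(U - clean))
print("error of filtered data:", np.linalg.norm(U_F - clean))
```

The choice of `q` is the judgment call in this procedure; it is usually read off the eigenvalue spectrum, keeping the modes that stand clear of the noise floor.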

Page 24: Principal Component Analysis Principles and Application

(Figure panels: $\mathbf{U}$, $\mathbf{U}_F$, and $\mathbf{X}_{\text{Noise}} = \mathbf{U} - \mathbf{U}_F$.)

Page 25: Principal Component Analysis Principles and Application

Eigenvalue Spectrum

Page 26: Principal Component Analysis Principles and Application

Filtered Time Series (Channel 70)

Raw data

Filtered data

Residual

Page 27: Principal Component Analysis Principles and Application

Power Spectra (Integrated over all channels)

Page 28: Principal Component Analysis Principles and Application

Superimpose the Spectra

Page 29: Principal Component Analysis Principles and Application

Generalizations

Generalise

• Response to a stimulus

• Comparison of multiple data sets obtained by varying a parameter to study a transition.

\[
\mathbf{X} = \frac{1}{\sqrt{m}} \left( \mathbf{U} - \mathbf{U}_{\text{ref}} \right)
\]

\[
\mathbf{U}_{\text{ref}} = \mathbf{0}
\]
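The generalisation amounts to subtracting a chosen reference state instead of the mean profile before forming the covariance matrix. The sketch below is a minimal illustration under stated assumptions: the function name `covariance_about_reference` is hypothetical, the $1/\sqrt{m}$ scaling is assumed, and a zero reference is shown only as one possible choice alongside the default mean profile.

```python
import numpy as np

def covariance_about_reference(U, U_ref=None):
    """Return R = X X^T with X = (U - U_ref) / sqrt(m), for U of shape (n, m).

    U_ref may be None (use the time-mean profile), a scalar, a length-n
    profile, or a full (n, m) array.
    """
    U = np.asarray(U, dtype=float)
    m = U.shape[1]
    if U_ref is None:
        ref = U.mean(axis=1, keepdims=True)
    else:
        ref = np.asarray(U_ref, dtype=float)
        if ref.ndim == 1:
            ref = ref[:, None]               # treat as one profile per column
    X = (U - ref) / np.sqrt(m)
    return X @ X.T

# Synthetic data with a non-zero mean (assumed values).
rng = np.random.default_rng(6)
U = 1.5 + rng.normal(size=(5, 300))

R_mean = covariance_about_reference(U)             # reference = mean profile
R_zero = covariance_about_reference(U, U_ref=0.0)  # reference = zero state

u_bar = U.mean(axis=1)
print(np.allclose(R_zero, R_mean + np.outer(u_bar, u_bar)))
```

With a zero reference the matrix also contains the outer product of the mean profile with itself, so the leading mode tends to resemble the mean rather than the fluctuations.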