INDEPENDENT COMPONENTS ANALYSIS WITH THE JADE ALGORITHM

Douglas N. Rutledge, Delphine Jouan-Rimbaud Bouveresse

douglas.rutledge@agroparistech.fr

Multivariate Data Analysis

$$
X = \begin{bmatrix}
x_{11} & \cdots & x_{1j} & \cdots & x_{1m} \\
\vdots &        & \vdots &        & \vdots \\
x_{i1} & \cdots & x_{ij} & \cdots & x_{im} \\
\vdots &        & \vdots &        & \vdots \\
x_{n1} & \cdots & x_{nj} & \cdots & x_{nm}
\end{bmatrix}
$$

Columns: variables. Rows: samples (objects, individuals).

Line vectors: $\mathbf{x}_1 = [\, x_{11} \;\cdots\; x_{1j} \;\cdots\; x_{1m} \,]$

Column vectors: $\mathbf{x}_1 = [\, x_{11} \;\cdots\; x_{i1} \;\cdots\; x_{n1} \,]^{T}$

A matrix as points in space

X = -1.00 0.00 0.50 1.00 -1.00 -1.00 -0.50 1.00 -1.00 1.00 1.00 -1.00 2.00 -0.50 1.00

[Figure: the values of X plotted as points in a three-dimensional space]


Principal Components Analysis

Pearson, 1901

Principal Components Analysis

Samples projected onto the space defined by the variables:
- if NO information → spherical distribution → NO preferred axes

Principal Components Analysis

Samples projected onto the space defined by the variables:
- if information → non-spherical distribution → preferred axes (PCA)

“Blind Source Separation”

$$
X = \begin{bmatrix}
x_{11} & \cdots & x_{1j} & \cdots & x_{1m} \\
\vdots &        & \vdots &        & \vdots \\
x_{i1} & \cdots & x_{ij} & \cdots & x_{im} \\
\vdots &        & \vdots &        & \vdots \\
x_{n1} & \cdots & x_{nj} & \cdots & x_{nm}
\end{bmatrix}
$$

Columns: variables. Rows: observed sensor signals.

Row vectors:

$$
X = \begin{bmatrix}
\mathbf{x}_1^{T} \\ \vdots \\ \mathbf{x}_i^{T} \\ \vdots \\ \mathbf{x}_n^{T}
\end{bmatrix}
$$

Principal Components Analysis (PCA)

data matrix → points in the multivariate space

Independent Components Analysis (ICA)

data matrix → a set of observed signals, where

- each observed sensor signal is the weighted sum of pure source signals

- the weighting coefficients are proportions of the source signals

Independent Components Analysis

x1 = a11*s1 + a12*s2

x2 = a21*s1 + a22*s2

…

xn = an1*s1 + an2*s2

In matrix notation: X = A*S
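The mixing model X = A*S can be made concrete with a small NumPy sketch. The two source shapes, the number of points and the values in A are all invented for illustration; only the relation between X, A and S comes from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two hypothetical pure source signals, one per row of S (2 sources x 500 points).
t = np.linspace(0.0, 1.0, 500)
S = np.vstack([np.sin(2 * np.pi * 5 * t),         # s1: sine wave
               rng.uniform(-1.0, 1.0, t.size)])   # s2: uniform noise

# Hypothetical mixing matrix: row i holds the proportions (a_i1, a_i2)
# of the two sources in observed signal x_i (3 observed signals here).
A = np.array([[0.8, 0.2],
              [0.5, 0.5],
              [0.1, 0.9]])

# Matrix form of x_i = a_i1*s1 + a_i2*s2 for every i.
X = A @ S
```

Each row of X is one observed sensor signal, so `X[0]` equals `0.8 * S[0] + 0.2 * S[1]`.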

Example

Two independent sources: s1, s2

Two mixtures:

$$
\begin{aligned}
x_1 &= a_{11} s_1 + a_{12} s_2 \\
x_2 &= a_{21} s_1 + a_{22} s_2
\end{aligned}
$$

$a_{ij}$: proportion of source signal $s_j$ in observed signal $x_i$

The objective of ICA is to find "physically significant" vectors.

Hypotheses:

1) There is no reason for the variations in one pure signal to depend in any way on the variations in another pure signal. Pure signal sources should therefore be “independent”.

2) The measured signals, being combinations of several independent sources, should be more Gaussian than the sources (Central Limit Theorem).

“Nongaussian is independent”: Central Limit Theorem

The measured signals, being combinations of several independent sources, should be more Gaussian than the sources.

The idea of ICA is to search for sources that are as non-Gaussian as possible.

This requires a criterion to quantify the “gaussianity” of a signal:
- Kurtosis
- Negentropy
- Mutual information
- Complexity
- …

ICA Principle (Non-Gaussian is Independent)

• Key to estimating A is non-gaussianity

• The distribution of a sum of independent random variables tends toward a Gaussian distribution (by CLT)

[Figure: densities f(s1), f(s2) and f(x1) = f(s1 + s2)]

• Consider

$$ y = \mathbf{w}^{T}\mathbf{x} = \mathbf{w}^{T} A\, \mathbf{s} = \mathbf{z}^{T}\mathbf{s} $$

where w is one of the rows of matrix W

• y is a linear combination of the $s_i$, with weights given by $z_i$

• Since a sum of two independent r.v. is more gaussian than the individual r.v., $\mathbf{z}^{T}\mathbf{s}$ is more gaussian than either $s_i$

• $\mathbf{z}^{T}\mathbf{s}$ becomes least gaussian when it is equal to one of the $s_i$

• So we take w as a vector which maximizes the non-gaussianity of $\mathbf{w}^{T}\mathbf{x}$
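The Central Limit Theorem argument can be checked numerically. The sketch below assumes two uniform sources (a choice made here, not taken from the slides) and uses kurtosis as the gaussianity measure: the kurtosis of the sum should lie closer to the Gaussian value of 0 than that of either source.

```python
import numpy as np

def kurt(y):
    """Kurtosis E{y^4} - 3 (E{y^2})^2 of the standardized signal (0 for a Gaussian)."""
    y = (y - y.mean()) / y.std()
    return np.mean(y**4) - 3.0 * np.mean(y**2) ** 2

rng = np.random.default_rng(1)
s1 = rng.uniform(-1.0, 1.0, 100_000)   # hypothetical source 1
s2 = rng.uniform(-1.0, 1.0, 100_000)   # hypothetical source 2
x1 = s1 + s2                           # mixture of the two sources

k_s1, k_s2, k_x1 = kurt(s1), kurt(s2), kurt(x1)
```

In theory a uniform variable has kurtosis -1.2 and the sum of two independent uniform variables has kurtosis -0.6, so the sample estimates should reproduce that ordering.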

Measures of Non-Gaussianity

Quantitative measure of non-gaussianity for ICA estimation

• Kurtosis: gauss = 0 (sensitive to outliers)

$$ \mathrm{kurt}(y) = E\{y^4\} - 3\,(E\{y^2\})^2 $$

• Entropy: gauss = largest

$$ H(y) = -\int f(y)\,\log f(y)\,dy $$

• Neg-entropy: gauss = 0 (difficult to estimate)

$$ J(y) = H(y_{gauss}) - H(y) $$

• Approximations

$$ J(y) \approx \frac{1}{12}\,E\{y^3\}^2 + \frac{1}{48}\,\mathrm{kurt}(y)^2 $$

$$ J(y) \propto \left[ E\{G(y)\} - E\{G(v)\} \right]^2 $$

where v is a standard gaussian random variable and:

$$ G(y) = \frac{1}{a}\,\log\cosh(a\,y) \qquad \text{or} \qquad G(y) = -\exp(-a\,y^2/2) $$
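As a rough illustration of the negentropy approximations, the following sketch evaluates the moment-based form and the log cosh form on a Gaussian sample and on a uniform sample. The function names, the sample sizes and the choice a = 1 are assumptions made for this example, and the proportionality constant of the second form is left out.

```python
import numpy as np

def standardize(y):
    """Centre the signal and scale it to unit variance."""
    return (y - y.mean()) / y.std()

def negentropy_moments(y):
    """Moment-based approximation: E{y^3}^2 / 12 + kurt(y)^2 / 48."""
    y = standardize(y)
    kurt = np.mean(y**4) - 3.0
    return np.mean(y**3) ** 2 / 12.0 + kurt**2 / 48.0

def negentropy_logcosh(y, v, a=1.0):
    """[E{G(y)} - E{G(v)}]^2 with G(u) = log(cosh(a u)) / a.

    v is a sample drawn from a standard Gaussian.
    """
    def G(u):
        return np.log(np.cosh(a * u)) / a
    return (np.mean(G(standardize(y))) - np.mean(G(v))) ** 2

rng = np.random.default_rng(2)
v = rng.standard_normal(200_000)       # reference Gaussian sample
gauss = rng.standard_normal(200_000)   # Gaussian test signal
unif = rng.uniform(-1.0, 1.0, 200_000) # non-Gaussian test signal

j_gauss = (negentropy_moments(gauss), negentropy_logcosh(gauss, v))
j_unif = (negentropy_moments(unif), negentropy_logcosh(unif, v))
```

Both approximations should be close to zero for the Gaussian signal and clearly larger for the uniform one.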

ICA:
- decomposition of a set of vectors into linear components that are “as independent as possible”

“Independence”:
- goes beyond (second-order) decorrelation
- involves the non-Gaussianity of the data

“Independent” versus “non-correlated”

[Figure: scatter plot of y (0 to 120) against X (-15 to 15) with the fitted straight line y = 36.667, R2 = 0]
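The fitted line y = 36.667 with R2 = 0 is what one obtains for y = x² sampled at the integers from -10 to 10; that the figure shows exactly this data set is an assumption made here. Under that assumption, a short sketch shows that y is completely determined by x and yet uncorrelated with it:

```python
import numpy as np

x = np.arange(-10, 11, dtype=float)      # assumed abscissa: integers -10..10
y = x**2                                 # y depends entirely on x

r = np.corrcoef(x, y)[0, 1]              # Pearson correlation coefficient
slope, intercept = np.polyfit(x, y, 1)   # least-squares straight line
```

The correlation and the slope vanish and the intercept is the mean of y, 770/21 ≈ 36.667, although the two variables are far from independent.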


ICA attempts to recover the original signals by estimating a linear transformation, using a criterion that measures statistical independence among the sources

This may be achieved by the use of higher-order information that can be extracted from the densities of the data

PCA does not really look for (and usually does not find) components with physical reality

Why not?
- PCA finds the “direction of greatest dispersion of the samples”
- this is the wrong direction for “signal separation”!

ICA demo (mixtures)

ICA demo (whitened)

ICA demo (step 1)

ICA demo (step 2)

ICA demo (step 3)

ICA demo (step 4)

ICA demo (step 5 - end)

A simulated example

[Figure: pure source signals and observed sensor signals (mixtures + Gaussian noise), each plotted over 800 points]

[Figure: histograms of the source signals and histograms of the observed sensor signals]

Initial data matrix

[Figure: matrix rows and their histograms]

PCA applied to the example

[Figure: PCA loadings and their histograms]

PCA applied to the example

[Figure: scatter plot of the PCA loadings with their marginal histograms]

ICA applied to the example

[Figure: source signals and their histograms]

ICA applied to the example

[Figure: scatter plot of the ICA loadings with their marginal histograms]

Procedure

ICA calculates a demixing matrix, W

W approximates A⁻¹, the inverse of the mixing matrix

The pure component signals are recovered from the measured mixed signals by:

S = W*X

The trick is to calculate W when we only have X!
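To see what the demixing matrix does, here is a toy case in which the mixing matrix is known and square, so that W can simply be taken as its inverse. All values are invented for illustration. In a real analysis A is unknown and W has to be estimated from X alone, which is possible only up to the order, sign and scale of the sources.

```python
import numpy as np

rng = np.random.default_rng(3)
S = rng.uniform(-1.0, 1.0, size=(2, 1000))   # two hypothetical sources
A = np.array([[0.7, 0.3],
              [0.4, 0.6]])                   # hypothetical square mixing matrix
X = A @ S                                    # observed mixtures

W = np.linalg.inv(A)   # ideal demixing matrix (not available in practice)
S_hat = W @ X          # recovered sources
```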

Calculating the demixing matrix

Many algorithms available:

- FastICA: numerical gradient search
- Projection Pursuit: interestingly structured vectors
- Complexity Pursuit: minimise common information
- JADE: based on “cumulants”
- …

JADE (Joint Approximate Diagonalization of Eigenmatrices)

JADE was developed by Cardoso and Souloumiac in 1993 [1].

It is a blind source separation method to extract independent non-Gaussian sources from signal mixtures with Gaussian noise.

It is based on the construction of a fourth-order cumulant array from the data.

It is freely downloadable from http://perso.telecom-paristech.fr/~cardoso/Algo/Jade/jadeR.m

[1] Cardoso and Souloumiac, IEE Proceedings-F 140 (6) (1993), 362-370

Cumulants

For a set of centred (zero-mean) values x:

2nd-order auto-cumulant:

$$ \sigma^2(x) = \mathrm{Cum}_2\{x, x\} = E\{x^2\} $$

4th-order auto-cumulant:

$$ \kappa(x) = \mathrm{Cum}_4\{x, x, x, x\} = E\{x^4\} - 3\,E^2\{x^2\} $$
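The two auto-cumulants are simple averages, as this minimal sketch shows. The helper names `cum2` and `cum4` and the toy signal are chosen here for illustration; the signal is centred first because the formulas are the cumulants only for zero-mean data.

```python
import numpy as np

def cum2(x):
    """Second-order auto-cumulant E{x^2} of the centred signal."""
    x = x - x.mean()
    return np.mean(x**2)

def cum4(x):
    """Fourth-order auto-cumulant E{x^4} - 3 E^2{x^2} of the centred signal."""
    x = x - x.mean()
    return np.mean(x**4) - 3.0 * np.mean(x**2) ** 2

x = np.array([-1.0, 1.0, -1.0, 1.0])   # toy binary signal with zero mean
c2, c4 = cum2(x), cum4(x)
```

For this toy signal E{x²} = 1 and E{x⁴} = 1, so the second-order cumulant is 1 and the fourth-order cumulant is 1 - 3 = -2.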

Cumulants

Cumulants can be expressed in tensor form. A tensor can be seen as a generalization of the notion of a matrix.

Cumulant tensors are a generalization of covariance matrices.

The main-diagonal elements of a cumulant tensor are auto-cumulants and the off-diagonal elements are cross-cumulants.

Statistically mutually independent vectors (components) give:
- null cross-cumulants
- maximum auto-cumulants

Cumulants

In PCA: the eigenvalue decomposition of a covariance matrix results in its diagonalization, i.e. the cancellation of its non-diagonal terms.

In JADE: a generalization of this methodology was proposed by Cardoso and Souloumiac to diagonalise fourth-order cumulant tensors.

Cumulants

In the simple case of 3 observed signal vectors x1, x2, x3 and 2 pure source signals s1, s2, where:

$$
\begin{aligned}
x_1 &= a_{11} s_1 + a_{12} s_2 \\
x_2 &= a_{21} s_1 + a_{22} s_2 \\
x_3 &= a_{31} s_1 + a_{32} s_2
\end{aligned}
$$

the fourth-order auto-cumulant of x1 is:

$$ \kappa(x_1) = \mathrm{Cum}_4\{x_1, x_1, x_1, x_1\} = E\{x_1^4\} - 3\,E^2\{x_1^2\} $$

Because s1 and s2 are independent, the cross terms between them cancel and only the auto-cumulant terms remain:

$$ \kappa(x_1) = a_{11}^4 \left[ E\{s_1^4\} - 3\,E^2\{s_1^2\} \right] + a_{12}^4 \left[ E\{s_2^4\} - 3\,E^2\{s_2^2\} \right] $$

$$ \kappa(x_1) = a_{11}^4\, \kappa(s_1) + a_{12}^4\, \kappa(s_2) $$
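The relation between the auto-cumulant of a mixture and those of its sources can be verified numerically. The sketch below builds two zero-mean sources whose samples are exactly independent by pairing every value of one with every value of the other; the value sets and the proportions a11, a12 are invented for illustration.

```python
import numpy as np

def cum4(x):
    """E{x^4} - 3 E^2{x^2}, the 4th-order auto-cumulant of a zero-mean signal."""
    return np.mean(x**4) - 3.0 * np.mean(x**2) ** 2

# Zero-mean value sets for two hypothetical sources.
u = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
v = np.array([-1.0, 1.0])

# Pairing every value of s1 with every value of s2 makes the two
# samples exactly independent (their joint distribution factorises).
s1, s2 = (g.ravel() for g in np.meshgrid(u, v))

a11, a12 = 0.8, 0.3                  # hypothetical proportions
x1 = a11 * s1 + a12 * s2             # observed mixture

lhs = cum4(x1)
rhs = a11**4 * cum4(s1) + a12**4 * cum4(s2)
```

With exactly independent, zero-mean sources the two sides agree to rounding error. With ordinary random samples they would agree only approximately.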

4th-order Cumulants

For the 3 observed signal vectors x1, x2, x3:

Create a four-way array with dimensions [3, 3, 3, 3]

Each of the 81 elements is calculated from means of products of the vectors.

For $v_i, v_j, v_k, v_l \in \{x_1, x_2, x_3\}$:

$$
\mathrm{Cum}_4\{v_i, v_j, v_k, v_l\} = E\{v_i v_j v_k v_l\} - E\{v_i v_j\}\,E\{v_k v_l\} - E\{v_i v_k\}\,E\{v_j v_l\} - E\{v_i v_l\}\,E\{v_j v_k\}
$$
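A direct way to build the [3, 3, 3, 3] array is to translate the cumulant formula term by term with `numpy.einsum`. This is only an illustrative sketch with random stand-in signals; the function name `cumulant_tensor` is hypothetical and the code is not taken from jadeR.m, which uses a more economical computation.

```python
import numpy as np

def cumulant_tensor(X):
    """Fourth-order cumulant array of the centred rows of X (signals x points)."""
    X = X - X.mean(axis=1, keepdims=True)
    n = X.shape[1]
    C = X @ X.T / n                                       # E{v_i v_j}
    M4 = np.einsum('it,jt,kt,lt->ijkl', X, X, X, X) / n   # E{v_i v_j v_k v_l}
    return (M4
            - np.einsum('ij,kl->ijkl', C, C)    # E{v_i v_j} E{v_k v_l}
            - np.einsum('ik,jl->ijkl', C, C)    # E{v_i v_k} E{v_j v_l}
            - np.einsum('il,jk->ijkl', C, C))   # E{v_i v_l} E{v_j v_k}

rng = np.random.default_rng(4)
X = rng.uniform(-1.0, 1.0, size=(3, 2000))   # three hypothetical signals
T = cumulant_tensor(X)
```

The element `T[0, 0, 0, 0]` is the auto-cumulant of the first signal, and the array is unchanged by any permutation of its four indices.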

3 auto-cumulants, 3 different values

For $v_i = v_j = v_k = v_l = x_p$ (p = 1:3):

$$
\begin{aligned}
\mathrm{Cum}_4\{p, p, p, p\} &= E\{x_p^4\} - E\{x_p^2\}\,E\{x_p^2\} - E\{x_p^2\}\,E\{x_p^2\} - E\{x_p^2\}\,E\{x_p^2\} \\
&= \mathrm{mean}(x_p^4) - \mathrm{mean}(x_p^2)\,\mathrm{mean}(x_p^2) - \mathrm{mean}(x_p^2)\,\mathrm{mean}(x_p^2) - \mathrm{mean}(x_p^2)\,\mathrm{mean}(x_p^2)
\end{aligned}
$$

For whitened data, $\mathrm{mean}(x_p^2) = 1$, so:

$$ \mathrm{Cum}_4\{p, p, p, p\} = \mathrm{mean}(x_p^4) - 3 $$

18 even cross-cumulants, 3 different values

For $v_i = v_j = x_p$ and $v_k = v_l = x_q$ (p = 1:3, q = 1:3, p ≠ q):

$$
\begin{aligned}
\mathrm{Cum}_4\{p, p, q, q\} &= E\{x_p^2 x_q^2\} - E\{x_p^2\}\,E\{x_q^2\} - E\{x_p x_q\}\,E\{x_p x_q\} - E\{x_p x_q\}\,E\{x_p x_q\} \\
&= \mathrm{mean}(x_p^2 x_q^2) - \mathrm{mean}(x_p^2)\,\mathrm{mean}(x_q^2) - \mathrm{mean}(x_p x_q)\,\mathrm{mean}(x_p x_q) - \mathrm{mean}(x_p x_q)\,\mathrm{mean}(x_p x_q)
\end{aligned}
$$

For whitened data, $\mathrm{mean}(x_p^2) = \mathrm{mean}(x_q^2) = 1$ and $\mathrm{mean}(x_p x_q) = 0$, so:

$$ \mathrm{Cum}_4\{p, p, q, q\} = \mathrm{mean}(x_p^2 x_q^2) - 1 $$

60 odd cross-cumulants, 9 different values

For $v_i = x_p$ and $v_j = v_k = v_l = x_q$ (p = 1:3, q = 1:3, p ≠ q):

$$
\begin{aligned}
\mathrm{Cum}_4\{p, q, q, q\} &= E\{x_p x_q^3\} - E\{x_p x_q\}\,E\{x_q^2\} - E\{x_p x_q\}\,E\{x_q^2\} - E\{x_p x_q\}\,E\{x_q^2\} \\
&= \mathrm{mean}(x_p x_q^3) - \mathrm{mean}(x_p x_q)\,\mathrm{mean}(x_q^2) - \mathrm{mean}(x_p x_q)\,\mathrm{mean}(x_q^2) - \mathrm{mean}(x_p x_q)\,\mathrm{mean}(x_q^2)
\end{aligned}
$$

For whitened data, $\mathrm{mean}(x_p x_q) = 0$, so:

$$ \mathrm{Cum}_4\{p, q, q, q\} = \mathrm{mean}(x_p x_q^3) - 0 $$
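The counts of 3, 18 and 60 elements (81 in total) can be reproduced by enumerating all index quadruples of the [3, 3, 3, 3] array. In this sketch the class names and the helper `kind` are chosen for illustration, and the "odd" class is taken to be every quadruple that is neither an auto-cumulant nor of the form {p, p, q, q}.

```python
from collections import Counter
from itertools import product

def kind(idx):
    """Classify an index quadruple (i, j, k, l) of the cumulant array."""
    counts = sorted(Counter(idx).values())
    if counts == [4]:
        return "auto"          # Cum4{p, p, p, p}
    if counts == [2, 2]:
        return "even cross"    # Cum4{p, p, q, q} and its permutations
    return "odd cross"         # all other quadruples, e.g. Cum4{p, q, q, q}

quadruples = list(product(range(3), repeat=4))
tally = Counter(kind(idx) for idx in quadruples)

# A cumulant does not depend on the order of its arguments, so the number
# of different values is the number of distinct index multisets per class.
multisets = {tuple(sorted(idx)) for idx in quadruples}
distinct = Counter(kind(m) for m in multisets)
```

The enumeration gives 3, 18 and 60 elements with 3, 3 and 9 different values respectively.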

[Figure: whitening of the example data. Panels: Initial X matrix; Scores of first PCs of X; Scaled scores of X (first nICs scaled PCs of X); Reduced & whitened X matrix (smaller, whitened X matrix)]

4-way tensor of cumulants

[Figure: the nine 3 × 3 slices Matcum(i,j,:,:) of the cumulant tensor, for i = 1:3 and j = 1:3]

Initial orthogonal matrices to project cumulants

[Figure: six 3 × 3 matrices CM(:,:,1) to CM(:,:,6)]

Eigenmatrices of cumulants

[Figure: six 3 × 3 matrices CM(:,:,1) to CM(:,:,6)]

JADE

Joint diagonalization using the Jacobi algorithm, based on Givens rotation matrices:

Minimize the sum of squares of the “off-diagonal” 4th-order cumulants (i.e. the cumulants between different signals)
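The criterion minimised by the joint diagonalization can be illustrated on a toy set of matrices that share the same eigenvectors. The sketch below only evaluates the off-diagonal sum of squares before and after applying a single, known Givens rotation to every matrix. It does not implement the Jacobi sweeps that search for the rotation angles, and the matrices and the angle of 0.6 rad are invented for illustration.

```python
import numpy as np

def givens(n, p, q, theta):
    """n x n Givens rotation acting in the (p, q) plane."""
    G = np.eye(n)
    c, s = np.cos(theta), np.sin(theta)
    G[p, p], G[q, q] = c, c
    G[p, q], G[q, p] = -s, s
    return G

def off(mats):
    """Sum of squared off-diagonal elements over a set of matrices."""
    return sum(np.sum(M**2) - np.sum(np.diag(M) ** 2) for M in mats)

# Hypothetical set of matrices sharing the same eigenvectors:
# M_k = R D_k R^T, with R a rotation by 0.6 rad in the (0, 1) plane.
R = givens(3, 0, 1, 0.6)
diagonals = ([3.0, 1.0, 2.0], [1.0, -2.0, 0.5], [0.2, 4.0, 1.0])
mats = [R @ np.diag(d) @ R.T for d in diagonals]

before = off(mats)
rotated = [R.T @ M @ R for M in mats]   # the same rotation for every matrix
after = off(rotated)
```

Here one rotation diagonalises all the matrices exactly. Estimated cumulant matrices can in general only be diagonalised approximately, hence "approximate" in the name of the algorithm.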

Diagonalised matrices

[Figure: the six 3 × 3 matrices CM(:,:,1) to CM(:,:,6) after joint diagonalization]


The JADE algorithm:

a multi-step procedure


Whitening (sphering) of the original data matrix X

- Reduce the data dimensionality

- Orthogonalize the data

- Normalise the data on all dimensions

If X is very wide or tall, it can be replaced by the loadings obtained from Kernel-PCA, or by using (segmented-) PCT
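The three whitening operations (dimension reduction, orthogonalization and normalisation) can be sketched with a singular value decomposition. The orientation of X (signals in rows), the function name `whiten`, the simulated data and the number of retained components are assumptions made for this example.

```python
import numpy as np

def whiten(X, n_components):
    """Reduce X (signals x points) to n_components uncorrelated rows
    with unit variance, using the SVD of the centred data."""
    Xc = X - X.mean(axis=1, keepdims=True)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    n_points = X.shape[1]
    # Whitening matrix: project onto the first left singular vectors and
    # divide by the singular values, scaled so that each row has variance 1.
    B = np.sqrt(n_points) * (U[:, :n_components] / s[:n_components]).T
    return B @ Xc, B

rng = np.random.default_rng(5)
S = rng.uniform(-1.0, 1.0, size=(2, 800))       # hypothetical sources
A = rng.uniform(0.0, 1.0, size=(10, 2))         # hypothetical proportions
X = A @ S + 0.01 * rng.standard_normal((10, 800))
Z, B = whiten(X, n_components=2)
```

After this step the covariance matrix of `Z` is the identity, so only a rotation remains to be found.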

Cumulants computation

Cumulants: generalization of moments such as the mean (1st-order auto-cumulant) and the variance (2nd-order auto-cumulant) to statistics of order > 2.

Independent signal vectors have null 4th-order cross-cumulants and maximal auto-cumulants.

Find rotations of Pw so that the loadings vectors become independent.

Decomposition of the cumulant tensor

Generalization to higher-order statistics of the eigenvalue decomposition of the variance-covariance matrix:

- Decompose the 4th-order tensor into a set of n(n+1)/2 orthogonal eigenmatrices Mi

- Project the cumulant tensor onto these planes


Joint Diagonalization of eigenmatrices

(based on the Jacobi algorithm)


Calculate the vector of proportions

Summary of JADE

- SVD: X → U, S, V
- Scale (× S⁻¹): → B
- B × X
- Calculate cumulants
- Decompose tensor: → M
- Diagonalise (rotate): M → M*, R

JADE

- Rotate scores: R × B → W
- Calculate signals: W × X → S
- Calculate proportions: X × S' × (S × S')⁻¹ → A
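One plausible reading of the "calculate proportions" step is an ordinary least-squares fit of the mixing model X = A*S once the signals S are known, i.e. A = X S' (S S')⁻¹. The sketch below checks this on noise-free simulated data; the matrices are invented for illustration, and whether jadeR.m performs exactly this calculation is not confirmed by the slides.

```python
import numpy as np

rng = np.random.default_rng(6)
S = rng.uniform(-1.0, 1.0, size=(2, 500))   # hypothetical source signals
A_true = np.array([[0.9, 0.1],
                   [0.5, 0.5],
                   [0.2, 0.8]])             # hypothetical proportions
X = A_true @ S                              # noise-free mixtures

# Least-squares proportions: A = X S' (S S')^-1
A = X @ S.T @ np.linalg.inv(S @ S.T)
```

On noise-free data the estimated proportions coincide with the true ones. With noisy data they are the least-squares estimates.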

Advantages of ICA

Extracted vectors have a physico-chemical meaning

ICA can be used to eliminate artefacts

ICA can be applied to all kinds of signals, including multi-way signals:

- Unfold the signals

- Calculate the ICA model on the matrix of one-dimensional signals

- Fold the ICs back
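The unfold / fold steps for multi-way signals amount to reshaping the data array. In this sketch the array sizes are invented and the ICA calculation itself is replaced by a stand-in (two rows of the unfolded matrix), since only the reshaping is being illustrated.

```python
import numpy as np

rng = np.random.default_rng(7)
# Hypothetical three-way data: 6 samples, each a 4 x 5 two-dimensional signal.
cube = rng.random((6, 4, 5))

# Unfold: each sample becomes one row of 4*5 = 20 values.
X = cube.reshape(6, -1)

# An ICA model would be calculated on X here. As a stand-in, take two
# rows of X as if they were the extracted ICs (2 x 20).
ics = X[:2]

# Fold the ICs back to the original 4 x 5 layout.
ics_folded = ics.reshape(-1, 4, 5)
```

Each folded IC has the same two-dimensional layout as the original signals and can be displayed in the same way.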

Difficulties with ICA

Different results may be obtained using different algorithms

Different results may be produced as a function of the number of Independent Components (ICs) extracted

Determination of the number of ICs

Links 1

• Feature extraction (Images, Video)
  – http://hlab.phys.rug.nl/demos/ica/
• Aapo Hyvärinen: ICA (1999)
  – http://www.cis.hut.fi/aapo/papers/NCS99web/node11.html
• ICA demo step-by-step
  – http://www.cis.hut.fi/projects/ica/icademo/
• Lots of links
  – http://sound.media.mit.edu/~paris/ica.html

Links 2

• Object-based audio capture demos
  – http://www.media.mit.edu/~westner/sepdemo.html
• Demo for BSS with “CoBliSS” (wav-files)
  – http://www.esp.ele.tue.nl/onderzoek/daniels/BSS.html
• Tomas Zeman's page on BSS research
  – http://ica.fun-thom.misto.cz/page3.html

Links 3

• An efficient batch algorithm: JADE
  – http://www-sig.enst.fr/~cardoso/guidesepsou.html
• Dr JV Stone: ICA and Temporal Predictability
  – http://www.shef.ac.uk/~pc1jvs/

Links 4

• Information for scientists, engineers and industrialists
  – http://www.cnl.salk.edu/~tewon/ica_cnl.html
• FastICA package for matlab
  – http://www.cis.hut.fi/projects/ica/fastica/fp.shtml
• Aapo Hyvärinen
  – http://www.cis.hut.fi/~aapo/
• Erkki Oja
  – http://www.cis.hut.fi/~oja/

Conclusion

PCA does not look for (and usually does not find) components with direct physical meaning

ICA tries to recover the original signals by estimating a linear transformation, using a criterion that measures statistical independence among the sources

This is done using higher-order information that can be extracted from the densities of the data

ICA can be applied to all types of data, including multi-way data

D.N. Rutledge, D. Jouan-Rimbaud Bouveresse, Independent Components Analysis with the JADE algorithm, Trends in Analytical Chemistry, 50 (2013) 22–32