data analysis and manifold learning lecture 12: some examples of manifold learning...

22
Data Analysis and Manifold Learning Lecture 12: Some Examples of Manifold Learning Applications Radu Horaud INRIA Grenoble Rhone-Alpes, France [email protected] http://perception.inrialpes.fr/ Radu Horaud Data Analysis and Manifold Learning; Lecture 12

Upload: others

Post on 17-Jul-2020

7 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Data Analysis and Manifold Learning Lecture 12: Some Examples of Manifold Learning …perception.inrialpes.fr/~Horaud/Courses/pdf/Horaud-DAML... · 2011-05-16 · Radu Horaud Data

Data Analysis and Manifold LearningLecture 12: Some Examples of Manifold Learning

Applications

Radu HoraudINRIA Grenoble Rhone-Alpes, France

[email protected]://perception.inrialpes.fr/

Radu Horaud Data Analysis and Manifold Learning; Lecture 12

Page 2: Data Analysis and Manifold Learning Lecture 12: Some Examples of Manifold Learning …perception.inrialpes.fr/~Horaud/Courses/pdf/Horaud-DAML... · 2011-05-16 · Radu Horaud Data

Outline of Lecture 12

Spectral embedding of a graph applied to fMRI data analysis

The Google PageRank algorithm

Radu Horaud Data Analysis and Manifold Learning; Lecture 12

Page 3: Data Analysis and Manifold Learning Lecture 12: Some Examples of Manifold Learning …perception.inrialpes.fr/~Horaud/Courses/pdf/Horaud-DAML... · 2011-05-16 · Radu Horaud Data

Material for this lecture

X. Shen and F. Meyer. Low-Dimensional Embedding of fMRIDatasets. NeuroImage 41 (2008).http://ecee.colorado.edu/~fmeyer/Pub/neuroimage08.pdf

T. Hastie, R. Tibshirani, and J. Friedman. The Elements ofStatistical Learning. Springer (2009). The Google PageRankalgorithm is briefly described 576–578.

Lawrence Page, Sergey Brin, Rajeev Motwani, and TerryWinograd. The PageRank Citation Ranking: Bringing Orderto the Web. Technical Report. Stanford University (1999).http://ilpubs.stanford.edu:8090/422/

Radu Horaud Data Analysis and Manifold Learning; Lecture 12

Page 4: Data Analysis and Manifold Learning Lecture 12: Some Examples of Manifold Learning …perception.inrialpes.fr/~Horaud/Courses/pdf/Horaud-DAML... · 2011-05-16 · Radu Horaud Data

Low-dimensional embedding of fMRI

fMRI provides a large scale measurement of neuronal activity

Radu Horaud Data Analysis and Manifold Learning; Lecture 12

Page 5: Data Analysis and Manifold Learning Lecture 12: Some Examples of Manifold Learning …perception.inrialpes.fr/~Horaud/Courses/pdf/Horaud-DAML... · 2011-05-16 · Radu Horaud Data

fMRI data

Each voxel vi generates a time series

xi =(xi(1) . . . xi(T )

)> ∈ RT

Radu Horaud Data Analysis and Manifold Learning; Lecture 12

Page 6: Data Analysis and Manifold Learning Lecture 12: Some Examples of Manifold Learning …perception.inrialpes.fr/~Horaud/Courses/pdf/Horaud-DAML... · 2011-05-16 · Radu Horaud Data

A network of functionally correlated voxels

A connectivity graph is formed with the standard approach:

Wij ={

exp(−‖xi − xj‖2/σ2

)if vi ∼ vj

0 otherwise

The Euclidean distance between time series:

‖xi − xj‖2 =T∑

t=1

(xi(t)− xj(t))2

Choice for σ:σ = 2 min

i<j‖xi − xj‖

Choice for nn (nearest neighbor): user defined and varies from7 to ... 100.

Radu Horaud Data Analysis and Manifold Learning; Lecture 12

Page 7: Data Analysis and Manifold Learning Lecture 12: Some Examples of Manifold Learning …perception.inrialpes.fr/~Horaud/Courses/pdf/Horaud-DAML... · 2011-05-16 · Radu Horaud Data

Matrices

Diagonal degree matrix: D(i, i) =∑

j Wi,j

Transition matrix: P = D−1W

Transition probabilities:

Pi,j = P(i, j) =Wi,j∑j Wi,j

It is a row-stochastic matrix:∑

j Pi,j = 1

Radu Horaud Data Analysis and Manifold Learning; Lecture 12

Page 8: Data Analysis and Manifold Learning Lecture 12: Some Examples of Manifold Learning …perception.inrialpes.fr/~Horaud/Courses/pdf/Horaud-DAML... · 2011-05-16 · Radu Horaud Data

Random walk and commute-time

Consider a random walk on the graph denoted by Zn: if thewalk is at vi it jumps to one of its neighbors vj withprobability Pi,j .

if vi and vj are in the same functional area and vi and vk arein different functional areas, we expect that Pi,j � Pi,k.

The average hitting time measures the number of steps that ittakes for a random walk starting in vi to hit vj for the firsttime:

H(vi, vj) = Ei[Tj ] with Tj = min{n ≥ 0;Zn = j}

The hitting time is not symmetric, use the commute timeinstead:

κ(vi, vj) = H(vi, vj) +H(vj , vi) = Ei[Tj ] + Ej [Ti]

Radu Horaud Data Analysis and Manifold Learning; Lecture 12

Page 9: Data Analysis and Manifold Learning Lecture 12: Some Examples of Manifold Learning …perception.inrialpes.fr/~Horaud/Courses/pdf/Horaud-DAML... · 2011-05-16 · Radu Horaud Data

The commute time distance in the spectral domain

Let (λk,φk)Nk=1 be the eigenvalue-eigenvector pairs of matrix

P that can be easily computed because D1/2PD−1/2 is a realsymmetric matrix. Moreover (see Lecture #3) we have:

−1 ≤ λN ≤ . . . ≤ λk ≤ . . . ≤ λ1 = 1

The commute time distance is:

κ(xi,xj)2 =K+1∑k=2

11− λk

(φk(i)√πi− φk(j)√πj

)

π = (π1 . . . πi . . . πN )> is the eigenvector P>π = π with:

πi =di∑

i,j Wi,j

Radu Horaud Data Analysis and Manifold Learning; Lecture 12

Page 10: Data Analysis and Manifold Learning Lecture 12: Some Examples of Manifold Learning …perception.inrialpes.fr/~Horaud/Courses/pdf/Horaud-DAML... · 2011-05-16 · Radu Horaud Data

Embedding

The initial time series can now be embedded using themapping: xi −→ Ψ(xi), i.e., RT → RK :

Ψ(xi) =1√πi

(φk(i)√1− λ2

. . .φk(i)√1− λk

. . .φk(i)√

1− λK+1

)>This is strictly equivalent to the commute-time embeddingbased on the normalized graph Laplacian (see Lecture #3).

Radu Horaud Data Analysis and Manifold Learning; Lecture 12

Page 11: Data Analysis and Manifold Learning Lecture 12: Some Examples of Manifold Learning …perception.inrialpes.fr/~Horaud/Courses/pdf/Horaud-DAML... · 2011-05-16 · Radu Horaud Data

Choosing the dimension

Remind that each voxel in the brain corresponds to a graphnode and there is a time series at each voxel:

X =

x1(1) . . . x1(t) . . . x1(T )

......

...xi(1) . . . xi(t) . . . xi(T )

......

...xN (1) . . . xN (t) . . . xN (T )

Each column x(t) in this matrix is a scalar function definedover the graphs’ vertices. It can be decomposed using theeigenvectors:

x(t) =K+1∑k=2

〈x(t),φk〉φk + r(t)

Radu Horaud Data Analysis and Manifold Learning; Lecture 12

Page 12: Data Analysis and Manifold Learning Lecture 12: Some Examples of Manifold Learning …perception.inrialpes.fr/~Horaud/Courses/pdf/Horaud-DAML... · 2011-05-16 · Radu Horaud Data

Choosing the dimension

This can be written as:

x̂(t) =K+1∑k=2

〈x(t),φk〉φk =K+1∑k=2

xk(t)φk

Therefore, each entry i (at each brain location or graphvertex) of this approximated vector is:

x̂i(t) =K+1∑k=2

xk(t)φk(i)

The discrepancy between the initial observations and theirapproximate representation is:

εi(K) =∑T

t=1(xi(t)− x̂i(t))2∑Tt=1 x

2i (t)

Radu Horaud Data Analysis and Manifold Learning; Lecture 12

Page 13: Data Analysis and Manifold Learning Lecture 12: Some Examples of Manifold Learning …perception.inrialpes.fr/~Horaud/Courses/pdf/Horaud-DAML... · 2011-05-16 · Radu Horaud Data

Choosing the dimension

The authors suggest to average εi(K) over the voxels lying ina ”functional” region, and to take the max over all theseaverage values.

It is not clear what is a functional region, since this is what itis searched for and why the average is selected.

Radu Horaud Data Analysis and Manifold Learning; Lecture 12

Page 14: Data Analysis and Manifold Learning Lecture 12: Some Examples of Manifold Learning …perception.inrialpes.fr/~Horaud/Courses/pdf/Horaud-DAML... · 2011-05-16 · Radu Horaud Data

Segmentation using K-means

The authors notice that the embedding has a star-like shape.

Radu Horaud Data Analysis and Manifold Learning; Lecture 12

Page 15: Data Analysis and Manifold Learning Lecture 12: Some Examples of Manifold Learning …perception.inrialpes.fr/~Horaud/Courses/pdf/Horaud-DAML... · 2011-05-16 · Radu Horaud Data

Segmentation using K-means

The ”arms” correspond to:

activated time series orstrong physiological artifacts

The center blob corresponds to ”background activity”

The embedded data are projected on a hyper-sphere ofdimension K.

The background is spread over the sphere.

The K-means algorithm is applied to the spherical data

Radu Horaud Data Analysis and Manifold Learning; Lecture 12

Page 16: Data Analysis and Manifold Learning Lecture 12: Some Examples of Manifold Learning …perception.inrialpes.fr/~Horaud/Courses/pdf/Horaud-DAML... · 2011-05-16 · Radu Horaud Data

Event related dataset

Study of age-related changes in functional anatomy

2050 time series.

Radu Horaud Data Analysis and Manifold Learning; Lecture 12

Page 17: Data Analysis and Manifold Learning Lecture 12: Some Examples of Manifold Learning …perception.inrialpes.fr/~Horaud/Courses/pdf/Horaud-DAML... · 2011-05-16 · Radu Horaud Data

Spectral embedding/clustering results

Radu Horaud Data Analysis and Manifold Learning; Lecture 12

Page 18: Data Analysis and Manifold Learning Lecture 12: Some Examples of Manifold Learning …perception.inrialpes.fr/~Horaud/Courses/pdf/Horaud-DAML... · 2011-05-16 · Radu Horaud Data

Results obtained with PCA and ISOMAP

Radu Horaud Data Analysis and Manifold Learning; Lecture 12

Page 19: Data Analysis and Manifold Learning Lecture 12: Some Examples of Manifold Learning …perception.inrialpes.fr/~Horaud/Courses/pdf/Horaud-DAML... · 2011-05-16 · Radu Horaud Data

Activation maps

Radu Horaud Data Analysis and Manifold Learning; Lecture 12

Page 20: Data Analysis and Manifold Learning Lecture 12: Some Examples of Manifold Learning …perception.inrialpes.fr/~Horaud/Courses/pdf/Horaud-DAML... · 2011-05-16 · Radu Horaud Data

The Google PageRank Algorithm

A web page is important if many other pages point to it.

Lij = 1 if page j points to page i and zero otherwise

cj =∑

i Lij number of pages pointing to by page j (numberof outlinks)

PageRanks pi are defined by the recursive relationship:

pi = (1− d) + d∑

j

(Lij

cj

)pj

d = 0.85

Radu Horaud Data Analysis and Manifold Learning; Lecture 12

Page 21: Data Analysis and Manifold Learning Lecture 12: Some Examples of Manifold Learning …perception.inrialpes.fr/~Horaud/Courses/pdf/Horaud-DAML... · 2011-05-16 · Radu Horaud Data

In matrix form

A =1− dN

ee> + dLD−1c

A has positive entries and each column sums to one - acolumn stochastic matrix.

The algorithm becomes:

p = Ap

e = (1 . . . 1)>, Dc = Diag(ci)

Radu Horaud Data Analysis and Manifold Learning; Lecture 12

Page 22: Data Analysis and Manifold Learning Lecture 12: Some Examples of Manifold Learning …perception.inrialpes.fr/~Horaud/Courses/pdf/Horaud-DAML... · 2011-05-16 · Radu Horaud Data

Markov chain intepretation

PageRank as a model of user behavior: a random web surferclicks on links randomly without regard of the content.

The surfer does a randome walk on the web, choosing amongavailable outgoing links at random

The maximal (equal to 1) eigenvalue-eigenvector pair of Acorresponds to the stationary distribution of an irreducibleaperiod Markov chain over the N web pages.

Radu Horaud Data Analysis and Manifold Learning; Lecture 12