multivariate models and machine learning for fmri - tnu · multivariate models and machine learning...

Post on 07-May-2018

248 Views

Category:

Documents

1 Downloads

Preview:

Click to see full reader

TRANSCRIPT

Multivariate models and machine learning for fMRI

Methods and Models in fMRI, 15.11.2016

Jakob Heinzleheinzle@biomed.ee.ethz.chTranslational Neuromodeling Unit (TNU) Institute for Biomedical Engineering (IBT)University and ETH Zürich

Many thanks to Sudhir Raman and Kay Brodersenfor material

Translational Neuromodeling Unit

1

Overview

fMRI Analysis and Classifcation 2

Motivation

Learning from data

Multivariate Bayes in SPM

Generative Embedding

Modelling Terminology

Why multivariate?

Univariate approaches are excellent for localizing activations in individual voxels.

v1 v2 v1 v2

reward no reward

*

n.s.

Why multivariate?

Multivariate approaches can be used to examine responses that are jointly encoded in multiple voxels.

v1 v2 v1 v2

n.s.

orange juice apple juice v1

v2

n.s.

A bit of history – Multidymensional scaling

fMRI Analysis and Classifcation 5

Edelman et al, Psychobiology, 1998

Psychophysical rating fMRI

Two-dimensional projection of similarity measure for bothpsychophysical rating and fMRI response.

A bit of history – Classification Studies

fMRI Analysis and Classifcation 6

Haxby et al, Science, 2001

A bit of history – Classification Studies

fMRI Analysis and Classifcation 7

Kamitani and Tong, Nat Neurosci, 2005

Representational similarity analysis

fMRI Analysis and Classifcation 8

Idea: Compare the similarity of representations (correlation betweenactivation patterns) between different stimuli. Allows for a comparison between monkey(neural firing pattern) and human (fMRI activation patterns).

Kriegeskorte et al, Neuron, 2008

Overview

fMRI Analysis and Classifcation 9

Motivation

Learning from data

Multivariate Bayes in SPM

Generative Embedding

Modelling Terminology

Analysis steps

Feature Extraction

Modelling

Classification

Clustering

Regression

Prediction

Model Selection

Cross validation

Performance

Inference

Feature space

F1 F2 . . . FP

S1 1 0.5

S2 0 5.7

. 1 4

. 1 5.3SN 1 6.6

• Discrete• Continuous

Data Points

Features

Feature selection for fMRImultivariate analysis

fMRI Analysis and Classifcation 12

Different features answer different questions.Reducing the dimensionality might reduce noise,but could also reduce relevant information.

Model parametersMean valuesRaw data

Model Parameters,e.g. DCM

Correlationsbetweenregions

Model selection - Generalizability

fMRI Analysis and Classifcation 13

Model Fit

Model Complexity

Bishop (2006), Pitt & Miyung (2002), TICS

Encoding and decoding models

fMRI Analysis and Classifcation 14

context (cause or consequence)𝑋𝑋𝑡𝑡 ∈ ℝ𝑑𝑑

BOLD signal𝑌𝑌𝑡𝑡 ∈ ℝ𝑣𝑣

conditionstimulus

responseprediction error

encoding model

decoding model

𝑔𝑔:𝑋𝑋𝑡𝑡 → 𝑌𝑌𝑡𝑡

ℎ:𝑌𝑌𝑡𝑡 → 𝑋𝑋𝑡𝑡

Modelling goals

• Prediction

hY X

Predictive Density

Modelling goals

• Model Selection

Sparse Coding Distributed Coding

Model Evidence

Overview

Motivation

Learning From Data

Multivariate Bayes in SPM

Generative Embedding

Modelling Concepts

Learning from data

Reinforcement Learning

Semi-supervised Learning

SupervisedLearning

Unsupervised Learning

Labels for trainingdata are known!

Labels for trainingdata are NOT known!

Supervised learning

Function - f

Independent variablesX

dependent variableY CategoricalContinuous

Classification

Support Vector Machines

• Kernel Function – K 𝒙𝒙𝒊𝒊,𝒙𝒙𝒋𝒋 = 𝝓𝝓 𝒙𝒙𝒊𝒊 .𝝓𝝓 𝒙𝒙𝒋𝒋

𝝓𝝓

Function - fX Y

Kernel Methods

Kernel methods for pattern analysis, Taylor , Cristianini, 2004

Other popular classifiers• Gaussian Processes

• Deep Belief networks

G.E. Hinton, S. Osindero, and Y. Teh, “A fast learning algorithm for deep belief nets”, Neural Computation, vol 18, 2006

http://deeplearning.net/tutorial/DBN.html

C. E. Rasmussen & C. K. I. Williams, Gaussian Processes for Machine Learning, the MIT Press, 2006,

Generative and Discriminative classifiers

fMRI Analysis and Classifcation 22

• Generative classifiers• Learn the parameters for the functions p(Y) and

p(X|Y), e.g. Naïve Bayes Classifier• Discriminative classifiers

• Learn the parameters for p(Y|X), e.g. logistic regression, SVM

Cross-validation

The generalization ability of a classifier can be estimated using a resampling procedure known as cross-validation. One example is 2-fold cross-validation:

examples123

99100

?training exampletest examples

folds??

1

...

???

2

...performance evaluation

fMRI Analysis and Classifcation 23

• Model Selection• Performance evaluation

• Balanced Accuracy• F1 Score

Cross-validation

Another commonly used variant is leave-one-out cross-validation.

examples123

99100

?training exampletest example

?...98

?

...99

?

...

100

...

folds?1

...

?

2

...performance evaluation

In fMRI often leave one-run-out

fMRI Analysis and Classifcation 24

Performance – Single Subject

fMRI Analysis and Classifcation 25

𝑝𝑝 = 𝑃𝑃 𝑋𝑋 ≥ 𝑘𝑘 𝐻𝐻0 = 1 − 𝐵𝐵 𝑘𝑘|𝑛𝑛,𝜋𝜋0

Brodersen et al. 2013, NeuroImage

Binomial Test

k=30

!!! Cross-validated data are not necessarilybinomially distributed Permutation tests are better!!!

Performance – Mulitple subjects

fMRI Analysis and Classifcation 26

Brodersen et al. 2013, NeuroImage

Fixed effects

Random effects

http://www.translationalneuromodeling.org/tapas/

Confounds – GLM vs. MVPA

fMRI Analysis and Classifcation 27

Todd et al. 2013, NeuroImage

Second level t-tests for accuracies?

fMRI Analysis and Classifcation 28

True β-Values are normallydistributed.

True accuracies are not normal and truncated at chance.

A possible solution is givenby Allefeld et al.

Allefeld et al. Neuroimage, 2016

Statistical testing with classification

• Within subjects:– Permutation statistics

– Parametric tests ar not valid (assumptions not met), e.g. Biomial-or t-test (c.f. Schreiber and Krekelberg, 2013).

• Across subjects:– Assumptions for t-tests are not met

– Full Bayesian model (Bordersen et al. 2013, but assumptions arenot met for CV)

– Use prevalence statistic proposed in Allefeld et al., 2016

fMRI Analysis and Classifcation 29

Research questions for classification

Temporal evolution of discriminability Model-based classificationaccuracy

50 %

100 %

within-trial time

Accuracy rises above chance

Participant indicates decision

Overall classification accuracy Spatial deployment of discriminative regions

80%

55%

accuracy

50 %

100 %

classification task

Truthor

lie?

Left or right

button?

Healthy or ill?

Pereira et al. (2009) NeuroImage, Brodersen et al. (2009) The New Collection

{ group 1, group 2 }

fMRI Analysis and Classifcation 30

Decoding «hidden» intentions –searchlight approach

fMRI Analysis and Classifcation 31

Haynes et al., Current Biology, 2007

Decoding of free decisions

fMRI Analysis and Classifcation 32

Soon et al., Nat Neurosci, 2008

Decoding of fingerpresses (red line). Participants freely choose timingand hand.

Earliest information about left-rightlong before execution – free will?

Decoding task preparation –connectitivy based decoding

fMRI Analysis and Classifcation 33

Heinzle et al., J Neurosci, 2012

SV-Classifier on connectivity graph (correlation)

Discriminative maps

Unsupervised learning

fMRI Analysis and Classifcation 34

Building a representation of data

Dimensionality Reduction Time seriesClustering

K-means Mixture models

K-means clustering

fMRI Analysis and Classifcation 35

• Cost function

• Algorithm1. Initialize2. Estimate assignments3. Estimate cluster centroids4. Repeat 2,3 until

convergence

Bishop PRML (2006)

Clustering – Mixture of Gaussians

fMRI Analysis and Classifcation 36

Bishop PRML (2006)

Interpretation

fMRI Analysis and Classifcation 37

• Cluster parameters

• Internal Criterion – Model Evidence• External Criterion - Purity

Inferred Labels

External Labels

Subjects

Cluster 1 Cluster 2

fMRI Analysis and Classifcation 38

Motivation

Learning from Data

Multivariate Bayes in SPM

Generative EmbeddingModelling

Encoding vs. Decoding models

fMRI Analysis and Classifcation 39

Encoding vs. Decoding models

fMRI Analysis and Classifcation 40

Coding Hypotheses

fMRI Analysis and Classifcation 41

Spatial vectors Smooth vectors

Sparse vectors

Singular vectors of data Support vectors

Distributed vectors

𝑈𝑈 = 𝑅𝑅𝑌𝑌𝑇𝑇𝑈𝑈𝑈𝑈𝑉𝑉𝑇𝑇 = 𝑅𝑅𝑌𝑌𝑇𝑇

Coding Hypotheses

fMRI Analysis and Classifcation 42

Friston et al. 2008 NeuroImage

Solved with variational Bayes

fMRI Analysis and Classifcation 43

Friston et al. 2008 NeuroImage

Example – Decoding of motion.

fMRI Analysis and Classifcation 44

Experimental factors:1. Photic2. Motion3. Attention

Attention to motion dataset - Büchel & Friston 1999 Cerebral Cortex

Friston et al. 2008 NeuroImage

fMRI Analysis and Classifcation 45

Friston et al. 2008 NeuroImage

Results

fMRI Analysis and Classifcation 46

Friston et al. 2008 NeuroImage

Multivariate Bayes in SPM

fMRI Analysis and Classifcation 47

1 2 3 4 5-20

0

20

40

60

partitions

log-evidencemaximum p = 100.00%

-0.04 -0.02 0 0.02 0.040

100

200

300

400

500

voxel-weight

frequ

ency

distribution of weights

Posterior probabilities at maxima ________________________________p(|w| > 0) location (x,y,z) weight (w)________________________________p = 0.993 -39.0,-90.0,-3.0mm q = 0.0254;p = 0.983 -33.0,-99.0,-3.0mm q = -0.0216;p = 0.983 -30.0,-99.0,3.0mm q = 0.0211;p = 0.982 -42.0,-90.0,9.0mm q = 0.0201;p = 0.980 -45.0,-75.0,-3.0mm q = 0.0168;p = 0.979 -30.0,-84.0,6.0mm q = -0.0187;p = 0.977 -39.0,-87.0,3.0mm q = -0.0196;p = 0.973 -30.0,-84.0,-6.0mm q = -0.0204;p = 0.972 -39.0,-81.0,-15.0mm q = 0.0166;p = 0.946 -36.0,-84.0,12.0mm q = -0.0144;p = 0.933 -48.0,-84.0,-3.0mm q = -0.0119;p = 0.929 -39.0,-75.0,3.0mm q = -0.0160;________________________________506 voxels; 360 scans

PPM: MVB_Motion (Motion)

0 100 200 300 400-1

-0.5

0

0.5

scans

adjus

ted re

spon

se

MVB_Motion (prior: sparse)

targetprediction

-1 -0.5 0 0.5-0.4

-0.2

0

0.2

0.4

contrast

pred

iction

observed and predicted contrastSNR (variance) 0.64

SPM

mip

[-36

, -87

, -3]

<

< <

SPM{T338}

Motion

SPMresults: .\SPM-practical\attention\GLMHeight threshold T = 4.874226 {p<0.05 (FWE)}

50

100

150

200

250

300

contrast(s)

3

Laminar activity related to novelty andepisodic memory

fMRI Analysis and Classifcation 48

Maas et al. 2014 Nature Communications

fMRI Analysis and Classifcation 49

Motivation

Learning from Data

Multivariate Bayes in SPM

Generative Embedding

Modelling Principles

Classifying Groups of Subjects

fMRI Analysis and Classifcation 50

Subject 1

Subject 2

Subject N

Voxel activity

Subject 1

Subject 2

Subject N

Connectivity

Dynamic causal model (DCM)

ClassificationClustering

Group 1 Group 2

......

• High dimensionality• Unusual cluster distributions• Lack of interpretation

Generative Embedding

fMRI Analysis and Classifcation 51

Brodersen et al. PLOS computation biology 2011.

DCM for speech processing

fMRI Analysis and Classifcation 52

Working memory in Schizophrenia

fMRI Analysis and Classifcation 53

• 41 Schizophrenia patients (DSM IV,ICD 10), 42 controls

• Visual numeric n-back working memory task

Deserno et al (2012) The Journal of Neuroscience

1 5

4 29 8

9

3 5900ms

500ms

Model based clustering

fMRI Analysis and Classifcation 54

Brodersen et al 2014 Neuroimage

Results healthy vs. schizophrenia patients

fMRI Analysis and Classifcation 55

Brodersen et al 2014 Neuroimage

Within patients clustering

fMRI Analysis and Classifcation 56

Brodersen et al 2014 Neuroimage

Be aware

• Interpretation of decoding or classificationresults is difficult.

• The decoded information must be in thedata, but in what features exactly is oftenhard to find out …

fMRI Analysis and Classifcation 57

Summary

fMRI Analysis and Classifcation 58

Summary

Learning from Data

Multivariate Bayes in SPM

Generative Embedding

Modelling Principles

Acknowledgments

fMRI Analysis and Classifcation 59

Many thanks to K.E. Stephan, Sudhir S. Raman and K. Brodersen for sharing their teaching material.

top related