General Linear Model & Classical Inference
Guillaume Flandin
Wellcome Trust Centre for Neuroimaging
University College London
SPM M/EEG Course
London, May 2013
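Pre-processings
General Linear Model: $y = X\beta + \varepsilon$
$\hat{\beta} = (X^T X)^{-1} X^T y$
$\hat{\sigma}^2 = \frac{\hat{\varepsilon}^T \hat{\varepsilon}}{N - \mathrm{rank}(X)}$
Contrast c
Statistical Inference: Random Field Theory
Statistical Parametric Maps: $SPM\{T, F\}$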
[Figure: data types analysed as statistical parametric maps: 1D time; 2D time-frequency (time × frequency); 2D+t scalp-time (mm × mm × time); 3D M/EEG source reconstruction, fMRI, VBM (mm × mm × mm)]
ERP example
Random presentation of ‘faces’ and ‘scrambled faces’
70 trials of each type
128 EEG channels
Question: is there a difference between the ERP of ‘faces’ and ‘scrambled faces’?
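ERP example: channel B9
Focus on N170
The t-statistic compares the size of the effect to its error standard deviation (f: faces, s: scrambled):
$t = \frac{\mu_f - \mu_s}{\hat{\sigma} \sqrt{\frac{1}{n_f} + \frac{1}{n_s}}}$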
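As a rough illustration of this ratio, the sketch below computes a two-sample t-statistic on simulated single-channel amplitudes. The simulated values, the group means, and the pooled-variance form of the error estimate are assumptions made for illustration; they are not taken from the study.

```python
import numpy as np

def two_sample_t(a, b):
    """Pooled-variance two-sample t-statistic: mean difference over its standard error."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    n_a, n_b = a.size, b.size
    # Pooled estimate of the error variance (assumes equal variance in both groups).
    pooled_var = ((n_a - 1) * a.var(ddof=1) + (n_b - 1) * b.var(ddof=1)) / (n_a + n_b - 2)
    return (a.mean() - b.mean()) / np.sqrt(pooled_var * (1 / n_a + 1 / n_b))

rng = np.random.default_rng(0)
# Hypothetical N170 amplitudes (microvolts) at one channel, 70 trials per condition.
faces = rng.normal(-6.0, 3.0, size=70)
scrambled = rng.normal(-4.0, 3.0, size=70)
t_value = two_sample_t(faces, scrambled)
print(f"t = {t_value:.2f}")
```

A large magnitude of t means the difference between condition means is large relative to its estimated standard error.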
Data modelling
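Data = β1 × Faces + β2 × Scrambled + Error
$Y = \beta_1 X_1 + \beta_2 X_2 + \varepsilon$
Design matrix: $Y = X\beta + \varepsilon$, with $X = [X_1\ X_2]$ and $\beta = [\beta_1\ \beta_2]^T$
Y: data vector; X: design matrix; β: parameter vector; ε: error vector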
General Linear Model
$y = X\beta + \varepsilon$
Dimensions: y is N × 1, X is N × p, β is p × 1, ε is N × 1
N: number of scans
p: number of regressors
Model is specified by
1. Design matrix X
2. Assumptions about ε
The design matrix embodies all available knowledge about experimentally controlled factors and potential
confounds.
$\varepsilon \sim N(0, \sigma^2 I)$
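To make the roles of X and the error assumption concrete, here is a minimal NumPy sketch that builds a two-column indicator design matrix for a faces/scrambled design and simulates data from the model. The parameter values, the noise level, and the helper name `make_design` are illustrative assumptions.

```python
import numpy as np

def make_design(n_faces, n_scrambled):
    """Indicator design matrix: column 0 marks 'faces' trials, column 1 'scrambled' trials."""
    n = n_faces + n_scrambled
    X = np.zeros((n, 2))
    X[:n_faces, 0] = 1.0
    X[n_faces:, 1] = 1.0
    return X

rng = np.random.default_rng(1)
X = make_design(70, 70)                         # N = 140 observations, p = 2 regressors
beta = np.array([-6.0, -4.0])                   # hypothetical true parameters
sigma = 3.0                                     # hypothetical error standard deviation
eps = rng.normal(0.0, sigma, size=X.shape[0])   # i.i.d. errors, eps ~ N(0, sigma^2 I)
y = X @ beta + eps                              # y = X beta + eps
print(X.shape, y.shape)
```

Additional columns (covariates or confounds) would simply be appended to X; the model equation stays the same.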
GLM: a flexible framework for parametric analyses
• one sample t-test
• two sample t-test
• paired t-test
• Analysis of Variance (ANOVA)
• Analysis of Covariance (ANCoVA)
• correlation
• linear regression
• multiple regression
Parameter estimation
$y = X\beta + \varepsilon$, with $\beta = [\beta_1\ \beta_2]^T$
Objective: estimate parameters to minimize $\sum_{t=1}^{N} \varepsilon_t^2$
Ordinary least squares estimation (OLS) (assuming i.i.d. error):
$\hat{\beta} = (X^T X)^{-1} X^T y$
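The OLS estimate can be sketched in a few lines of NumPy. The example below solves the normal equations for a hypothetical design with an intercept and one covariate; the data and the function name `ols` are assumptions for illustration, and the design is assumed to have full column rank.

```python
import numpy as np

def ols(X, y):
    """Return the OLS estimate and the residuals for y = X beta + e."""
    # Solve the normal equations (X^T X) beta = X^T y; assumes X has full column rank.
    beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
    residuals = y - X @ beta_hat
    return beta_hat, residuals

rng = np.random.default_rng(2)
N = 100
X = np.column_stack([np.ones(N), rng.normal(size=N)])          # intercept + one covariate
y = X @ np.array([1.0, 2.0]) + rng.normal(scale=0.5, size=N)   # hypothetical data

beta_hat, res = ols(X, y)
print("beta_hat:", beta_hat)
print("sum of squared errors:", res @ res)
```

Solving the normal equations with `np.linalg.solve` avoids forming the explicit inverse; `np.linalg.lstsq` gives the same estimate and also handles rank-deficient designs.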
Mass-univariate analysis: voxel-wise GLM
1. Transform data for all subjects and conditions
2. SPM: Analyse data at each voxel
[Figure: sensor to voxel transform, producing an evoked response image with time as one dimension]
Hypothesis Testing
Null Hypothesis H0
Typically what we want to disprove (no effect).
The Alternative Hypothesis HA expresses the outcome of interest.
To test a hypothesis, we construct “test statistics”.
Test Statistic T
The test statistic summarises evidence about H0.
Typically, the test statistic is small in magnitude when the hypothesis H0 is true and large when it is false.
We need to know the distribution of T under the null hypothesis.
[Figure: null distribution of T]
Hypothesis Testing
p-value:
A p-value summarises evidence against H0.
This is the chance of observing a value more extreme than t under the null hypothesis: $p(T > t \mid H_0)$
Significance level α:
Acceptable false positive rate α.
Threshold $u_\alpha$ controls the false positive rate: $\alpha = p(T > u_\alpha \mid H_0)$
Conclusion about the hypothesis:
We reject the null hypothesis in favour of the alternative hypothesis if $t > u_\alpha$
[Figure: null distribution of T, marking the observed value t, the threshold $u_\alpha$, and the p-value as the tail area beyond t]
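The threshold and the p-value can be approximated by simulating the null distribution. The sketch below uses a one-sample t-statistic and Monte Carlo draws under H0; the sample size, effect size, and number of simulations are arbitrary assumptions, and in practice the analytic Student t distribution would normally be used instead of simulation.

```python
import numpy as np

def one_sample_t(x):
    """One-sample t-statistic for H0: mean = 0, computed along the last axis."""
    x = np.asarray(x, float)
    n = x.shape[-1]
    return x.mean(axis=-1) / (x.std(axis=-1, ddof=1) / np.sqrt(n))

rng = np.random.default_rng(3)
n, alpha = 19, 0.05

# Null distribution of T: simulate many data sets for which H0 (no effect) is true.
t_null = one_sample_t(rng.normal(0.0, 1.0, size=(20000, n)))

# Threshold u_alpha: the value exceeded with probability alpha under H0.
u_alpha = np.quantile(t_null, 1 - alpha)

# Hypothetical observed data with a true effect of 0.8 standard deviations.
t_obs = one_sample_t(rng.normal(0.8, 1.0, size=n))
p_value = np.mean(t_null > t_obs)               # Monte Carlo estimate of p(T > t | H0)

print(f"u_alpha = {u_alpha:.2f}, t = {t_obs:.2f}, p = {p_value:.4f}")
print("reject H0" if t_obs > u_alpha else "do not reject H0")
```

With 18 degrees of freedom the simulated threshold should land close to the tabulated one-sided 5% critical value of the t distribution (about 1.73).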
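Contrast & t-test
Contrast: specifies a linear combination of the parameter vector: $c^T \beta$
ERP: faces < scrambled?
$\hat{\beta}_1 < \hat{\beta}_2$? Equivalently, $-1 \times \hat{\beta}_1 + 1 \times \hat{\beta}_2 > 0$? That is, $c^T \hat{\beta} > 0$? with $c^T = [-1\ +1]$
Test H0: $c^T \beta = 0$
t = contrast of estimated parameters / √(variance estimate)
$T = \frac{c^T \hat{\beta}}{\sqrt{\hat{\sigma}^2 c^T (X^T X)^{-1} c}} \sim t_{N-p}$
SPM-t over time & space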
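A minimal sketch of the t-contrast computation follows, assuming OLS with i.i.d. errors. The simulated data, the indicator design, and the function name `t_contrast` are illustrative assumptions; the pseudo-inverse is used only so that the sketch also tolerates a rank-deficient design.

```python
import numpy as np

def t_contrast(X, y, c):
    """T-statistic for the contrast c^T beta under OLS with i.i.d. errors."""
    N = X.shape[0]
    p = np.linalg.matrix_rank(X)
    XtX_inv = np.linalg.pinv(X.T @ X)
    beta_hat = XtX_inv @ X.T @ y
    res = y - X @ beta_hat
    sigma2_hat = res @ res / (N - p)                  # error variance estimate
    t = (c @ beta_hat) / np.sqrt(sigma2_hat * (c @ XtX_inv @ c))
    return t, N - p                                   # statistic and degrees of freedom

rng = np.random.default_rng(4)
# Hypothetical data: 70 'faces' and 70 'scrambled' trials at one channel and time point.
X = np.kron(np.eye(2), np.ones((70, 1)))              # two indicator columns, shape (140, 2)
y = X @ np.array([-6.0, -4.0]) + rng.normal(scale=3.0, size=140)

c = np.array([-1.0, 1.0])                             # faces < scrambled ?
t, dof = t_contrast(X, y, c)
print(f"t = {t:.2f} with {dof} degrees of freedom")
```

Multiplying `c` by any positive constant leaves `t` unchanged, because the numerator and the standard error scale together.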
T-test: summary
T-test is a signal-to-noise measure (ratio of estimate to standard deviation of estimate).
T-contrasts are simple combinations of the betas; the T-statistic does not depend on the scaling of the regressors or the scaling of the contrast.
Alternative hypothesis: $H_0: c^T \beta = 0$ vs $H_A: c^T \beta > 0$
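Extra-sum-of-squares & F-test
Model comparison: Full vs. Reduced model?
Null Hypothesis H0: True model is X0 (reduced model)
Full model? $X = [X_1\ X_0]$, with residual sum of squares RSS (error variance $\hat{\sigma}^2_{full}$)
Or reduced model? $X_0$, with residual sum of squares RSS0 (error variance $\hat{\sigma}^2_{reduced}$)
Test statistic: ratio of explained and unexplained variability (error)
$F \propto \frac{RSS_0 - RSS}{RSS}$
$F = \frac{ESS / \nu_1}{RSS / \nu_2} \sim F_{\nu_1, \nu_2}$, with $ESS = RSS_0 - RSS$
$\nu_1 = \mathrm{rank}(X) - \mathrm{rank}(X_0)$
$\nu_2 = N - \mathrm{rank}(X)$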
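The extra-sum-of-squares comparison can be sketched as follows: fit the full and the reduced model, then compare the drop in residual sum of squares to the residual error of the full model. The designs, parameter values, and function names are assumptions for illustration.

```python
import numpy as np

def rss(X, y):
    """Residual sum of squares after an OLS fit of y on X."""
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ beta_hat
    return r @ r

def extra_ss_f(X_full, X_reduced, y):
    """F-statistic comparing a full model with a nested reduced model."""
    N = len(y)
    nu1 = np.linalg.matrix_rank(X_full) - np.linalg.matrix_rank(X_reduced)
    nu2 = N - np.linalg.matrix_rank(X_full)
    rss_full, rss_red = rss(X_full, y), rss(X_reduced, y)
    F = ((rss_red - rss_full) / nu1) / (rss_full / nu2)
    return F, nu1, nu2

rng = np.random.default_rng(5)
N = 120
X0 = np.column_stack([np.ones(N), rng.normal(size=N)])   # reduced model (hypothetical)
X1 = rng.normal(size=(N, 2))                              # extra regressors of interest
X = np.column_stack([X0, X1])                             # full model
y = X @ np.array([1.0, 0.5, 0.8, -0.6]) + rng.normal(size=N)

F, nu1, nu2 = extra_ss_f(X, X0, y)
print(f"F = {F:.2f} on ({nu1}, {nu2}) degrees of freedom")
```

Because the reduced model is nested in the full one, its residual sum of squares can never be smaller, so F is non-negative.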
F-test & multidimensional contrasts
Tests multiple linear hypotheses:
H0: True model is X0
Full or reduced model? Full model: X0 together with X1 (β3, β4); reduced model: X0 only
$c^T = \begin{bmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$
H0: β3 = β4 = 0, i.e. test H0: $c^T \beta = 0$?
F-test: summary
F-tests can be viewed as testing for the additional variance explained by a larger model with respect to a simpler (nested) model, i.e. model comparison.
Example of a multidimensional contrast:
$c^T = \begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$
When testing a one-dimensional contrast with an F-test, for example β1 − β2, the result will be the same as testing β2 − β1. The F-statistic is exactly the square of the t-statistic, so it tests for both positive and negative effects.
F tests a weighted sum of squares of one or several combinations of the regression coefficients β.
In practice, we don’t have to explicitly separate X into [X1 X2] thanks to multidimensional contrasts.
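The relation between F and t for a one-dimensional contrast can be checked numerically. The sketch below computes the F-statistic directly from a contrast matrix, without splitting the design, and compares it with the squared t-statistic. The design, the data, and the function names are hypothetical, and the rows of the contrast matrix are assumed to be linearly independent.

```python
import numpy as np

def glm_fit(X, y):
    """OLS fit returning the estimate, the error variance, and (X^T X)^-1."""
    XtX_inv = np.linalg.inv(X.T @ X)                  # assumes full column rank
    beta_hat = XtX_inv @ X.T @ y
    res = y - X @ beta_hat
    sigma2_hat = res @ res / (X.shape[0] - X.shape[1])
    return beta_hat, sigma2_hat, XtX_inv

def t_stat(X, y, c):
    """T-statistic for a single contrast vector c."""
    beta_hat, s2, XtX_inv = glm_fit(X, y)
    return (c @ beta_hat) / np.sqrt(s2 * (c @ XtX_inv @ c))

def f_stat(X, y, C):
    """F-statistic for H0: C beta = 0, where each row of C is one contrast."""
    C = np.atleast_2d(C)
    beta_hat, s2, XtX_inv = glm_fit(X, y)
    cb = C @ beta_hat
    nu1 = np.linalg.matrix_rank(C)
    # Weighted sum of squares of the contrast estimates, scaled by the error variance.
    return cb @ np.linalg.solve(C @ XtX_inv @ C.T, cb) / (nu1 * s2)

rng = np.random.default_rng(6)
X = rng.normal(size=(60, 3))                          # hypothetical design, p = 3
y = X @ np.array([1.0, 0.3, 0.5]) + rng.normal(size=60)

c = np.array([1.0, -1.0, 0.0])                        # beta_1 - beta_2
print("t^2    :", t_stat(X, y, c) ** 2)
print("F(c)   :", f_stat(X, y, c))
print("F(-c)  :", f_stat(X, y, -c))
```

The same `f_stat` function accepts a contrast matrix with several rows, which corresponds to comparing the full model against the nested model in which those combinations are set to zero.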
Hypotheses:
Null Hypothesis $H_0: \beta_1 = \beta_2 = \beta_3 = 0$
Alternative Hypothesis $H_A$: at least one $\beta_k \neq 0$
Orthogonal regressors
[Figure: Venn diagram of the variability in Y and the variability described by each regressor, with a test for each regressor]
Correlated regressors
[Figure series: Venn diagrams of the variability in Y and the variability described by each regressor, where the two described portions overlap (shared variance); successive panels illustrate testing for one regressor, testing for the other, and testing for one and/or the other]
Summary
Mass-univariate GLM:
– Fit GLMs with design matrix, X, to data at different points in space to estimate local effect sizes
– GLM is a very general approach (one-sample, two-sample, paired t-tests, ANCOVAs, …)
Hypothesis testing framework
– Contrasts
– t-tests
– F-tests
Multiple covariance components
enhanced noise model at voxel i: $\varepsilon_i \sim N(0, C_i)$, $C_i = \sigma_i^2 V$
error covariance components Q and hyperparameters λ: $V = \sum_j \lambda_j Q_j$
$V = \lambda_1 Q_1 + \lambda_2 Q_2$
Estimation of hyperparameters λ with ReML (Restricted Maximum Likelihood).
Weighted Least Squares (WLS)
$\hat{\beta} = (X^T V^{-1} X)^{-1} X^T V^{-1} y$
Let $W^T W = V^{-1}$
Then
$\hat{\beta} = (X^T W^T W X)^{-1} X^T W^T W y$
$\hat{\beta} = (X_s^T X_s)^{-1} X_s^T y_s$
where $X_s = WX$, $y_s = Wy$
WLS equivalent to OLS on whitened data and design
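The equivalence can be verified numerically. In the sketch below the error correlation matrix V is a made-up AR(1)-like structure (in the framework described here it would come from estimated covariance components), and the whitening matrix is taken from a Cholesky factor, which is one of several valid choices satisfying $W^T W = V^{-1}$.

```python
import numpy as np

rng = np.random.default_rng(7)
N = 50
X = np.column_stack([np.ones(N), np.linspace(-1.0, 1.0, N)])   # hypothetical design

# Hypothetical error correlation V with AR(1)-like structure: V[i, j] = 0.5 ** |i - j|.
idx = np.arange(N)
V = 0.5 ** np.abs(idx[:, None] - idx[None, :])

# Simulate correlated errors with covariance V and build the data.
L = np.linalg.cholesky(V)                       # V = L L^T
y = X @ np.array([1.0, 2.0]) + L @ rng.normal(size=N)

# Weighted least squares: beta = (X^T V^-1 X)^-1 X^T V^-1 y.
V_inv = np.linalg.inv(V)
beta_wls = np.linalg.solve(X.T @ V_inv @ X, X.T @ V_inv @ y)

# Whitening matrix W with W^T W = V^-1; W = L^-1 is one valid choice.
W = np.linalg.inv(L)
Xs, ys = W @ X, W @ y
beta_ols_whitened = np.linalg.solve(Xs.T @ Xs, Xs.T @ ys)

print(beta_wls, beta_ols_whitened)
```

Both estimates agree up to numerical precision, so after whitening the ordinary OLS machinery (including t- and F-tests) can be reused unchanged.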
Modelling the measured data
Why? Make inferences about effects of interest
How?
1. Decompose data into effects and error
2. Form statistic using estimates of effects and error
[Figure: the stimulus function and the data enter a linear model, which yields an effects estimate and an error estimate; these are combined into a statistic]