Power Calculations for Group fMRI Studies Accounting for Arbitrary Design and Temporal Autocorrelation

Jeanette Mumford & Thomas E Nichols
Department of Biostatistics, University of Michigan

Introduction

Statistical power calculations for group functional magnetic resonance imaging (fMRI) studies are an important study planning tool. Power estimation requires prior knowledge of the expected effect size and its variability. Many factors contribute to these values, including study design, form and magnitude of the temporal autocorrelation, length of study, and between-subject variability. Recently, Desmond & Glover [1] proposed a method that uses a paired t-test to analyze percent signal change across subjects, accounting for both between- and within-subject variability components. Although this approach is useful for planning experiments using paired t-tests, it cannot complete power calculations for more complicated noise and signal models, including second-level F-tests.

We introduce a group model power calculation method that admits a wider variety of study designs. Our method is based on the general two-stage summary statistics model, making it easy to adapt to models used by current fMRI software packages. We illustrate the need for such flexibility by showing the consequences of using the wrong signal and/or noise model in a power calculation. We also show how our power calculations can aid in efficiently planning a future fMRI study and determining its cost.

Methods

2-Stage Summary Statistics Model

To define our power method, we first review the 2-stage summary statistics model.

Level 1: $Y_k = X_k \beta_k + \epsilon_k$
– $Y_k$: $N_k$-vector of fMRI response data (subject $k$)
– $X_k$: $N_k \times p$ design matrix
– $\beta_k$: $p$-vector of parameters
– $\epsilon_k$: $N_k$-vector, $\epsilon_k \sim N(0, \sigma_k^2 V_k)$

Level 2: $\hat\beta_{cont} = X_G \beta_G + \epsilon_G$
– $\hat\beta_{cont} = (c\hat\beta_1, \ldots, c\hat\beta_N)'$, where $c$ is a vector contrast ($N$ subjects)
– $X_G$: $N \times p_G$ design matrix
– $\epsilon_G \sim N(0, V_G)$, $V_G = \mathrm{diag}\{\sigma_k^2\, c (X_k' V_k^{-1} X_k)^{-1} c'\} + \sigma_B^2 I_N$

Hypothesis Test: $H_0: c_G \beta_G = 0$ vs. $H_A: c_G \beta_G = \delta$
– $c_G$: $r \times p_G$ matrix of contrasts
– $\delta$: $r$-vector

Test Statistic: $F = (c_G \hat\beta_G)' \left[ c_G (X_G' V_G^{-1} X_G)^{-1} c_G' \right]^{-1} (c_G \hat\beta_G)$
– Under $H_0$, $F \sim F_{r,\,N-p_G}$: central F with $r$ numerator and $N - p_G$ denominator degrees of freedom
– Under $H_A$, $F \sim F_{r,\,N-p_G,\,ncp}$: noncentral F with $r$ numerator and $N - p_G$ denominator degrees of freedom and noncentrality parameter $ncp = \delta' \left[ c_G (X_G' V_G^{-1} X_G)^{-1} c_G' \right]^{-1} \delta$

Power

Power is the probability of rejecting the null under the alternative distribution,

$\mathrm{Power} = P\left(F_{r,\,N-p_G,\,ncp} \geq F_{1-\alpha,\,r,\,N-p_G}\right)$

where $F_{1-\alpha,\,r,\,N-p_G}$ is the threshold corresponding to a false positive rate of $\alpha$. The following flow chart shows the relationship of the pieces of information that are necessary for carrying out a power calculation, where items already known are in green and items to be calculated are in red: between-subject variance ($\sigma_B^2$), within-subject variance ($\sigma_k^2$), and temporal correlation ($V_k$).

Figure 1: Necessary information required for power calculation. Green indicates known information and red indicates information to be calculated.
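The power definition above can be sketched numerically for the simplest second-level model, a one-sample group mean ($r = 1$, $p_G = 1$). This is only an illustration under stated assumptions: the function name `group_power`, its arguments, and the simplification that every subject shares the same within-subject contrast variance are hypothetical and not part of the original method description. SciPy's `ncf` distribution supplies the noncentral F probabilities.

```python
import numpy as np
from scipy import stats


def group_power(delta, var_within, var_between, n_subjects, alpha=0.05):
    """Power of a one-sample group test (r = 1, p_G = 1).

    Assumption (not from the poster): all subjects share the same
    within-subject contrast variance
    var_within = sigma_k^2 * c (X_k' V_k^{-1} X_k)^{-1} c'.
    """
    # Second-level design: a single column of ones (group mean).
    X_g = np.ones((n_subjects, 1))
    c_g = np.array([[1.0]])

    # V_G: within-subject plus between-subject variance on the diagonal.
    V_g = (var_within + var_between) * np.eye(n_subjects)

    # c_G (X_G' V_G^{-1} X_G)^{-1} c_G'
    cov_beta = np.linalg.inv(X_g.T @ np.linalg.solve(V_g, X_g))
    var_contrast = c_g @ cov_beta @ c_g.T

    # Noncentrality parameter: delta' [var_contrast]^{-1} delta
    d = np.atleast_1d(delta).astype(float)
    ncp = float(d @ np.linalg.solve(var_contrast, d))

    r = c_g.shape[0]                   # numerator degrees of freedom
    dfd = n_subjects - X_g.shape[1]    # denominator degrees of freedom
    f_crit = stats.f.ppf(1.0 - alpha, r, dfd)

    # Power = P(noncentral F >= threshold under the null)
    return float(stats.ncf.sf(f_crit, r, dfd, ncp))


if __name__ == "__main__":
    # Illustrative inputs only; these are not the poster's values.
    for n in (12, 16, 20, 24):
        print(n, round(group_power(0.5, 0.1, 0.3, n), 3))
```

For unequal within-subject variances or more than one second-level regressor, `V_g`, `X_g` and `c_g` would be replaced by the corresponding matrices; the remaining steps are unchanged.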

Estimating Variance Parameters Using FSL

Since our model is based on the two-level summary statistics group model, it can easily be adapted to calculate power for a future analysis carried out using, for example, the FMRIB Software Library (FSL) [3]. Since FSL uses a voxelwise unstructured autocovariance function (ACF) estimate to model temporal covariance, our first step is to summarize this estimate using the 3 parameters of an AR(1) + WN covariance structure defined by

$\mathrm{Cov}(\epsilon_t, \epsilon_{t-l}) = \begin{cases} \dfrac{\sigma^2_{AR}}{1-\rho^2}\,\rho^{|l|}, & l \neq 0 \\ \dfrac{\sigma^2_{AR}}{1-\rho^2} + \sigma^2_{WN}, & l = 0 \end{cases}$

A single-subject FSL analysis stores the voxelwise ACF estimate as well as the variance estimates. We estimate the parameters of the AR(1) + WN covariance in voxels within regions of interest (ROIs) determined by a thresholded T-statistic image, then base the final AR(1) + WN covariance on the mean values of $\rho$, $\sigma^2_{AR}$, and $\sigma^2_{WN}$. Note that for the AR(1) + WN model to fit, the FSL analysis must have been run without highpass filtering. Here are the steps for acquiring the AR(1) + WN parameters:

– $\hat\sigma^2_{WN}$: Construct a periodogram using FSL's ACF and use the average height at high frequencies to estimate $\sigma^2_{WN}$.
– $\hat\sigma^2_{AR}$ and $\hat\rho$: Remove $\hat\sigma^2_{WN}$ from the original covariance and use Yule-Walker to estimate the parameters of the remaining AR(1) covariance.
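The second estimation step can be sketched as follows, taking the white-noise variance as already estimated (the periodogram step is not reproduced here). The function names and the test values are hypothetical. The sketch assumes the parameterization in which $\sigma^2_{AR}$ is the AR(1) innovation variance, so the AR(1) part has lag-0 variance $\sigma^2_{AR}/(1-\rho^2)$.

```python
import numpy as np


def ar1_wn_autocov(rho, sigma2_ar, sigma2_wn, n_lags):
    """Autocovariance of an AR(1) + WN process at lags 0..n_lags-1."""
    lags = np.arange(n_lags)
    acov = sigma2_ar / (1.0 - rho**2) * rho**lags
    acov[0] += sigma2_wn  # white noise contributes only at lag 0
    return acov


def fit_ar1_given_wn(acov, sigma2_wn):
    """Recover (rho, sigma2_ar) once the white-noise variance is known."""
    ar_acov = np.array(acov, dtype=float)
    ar_acov[0] -= sigma2_wn              # remove white noise from lag 0
    rho = ar_acov[1] / ar_acov[0]        # Yule-Walker equation for AR(1)
    sigma2_ar = ar_acov[0] * (1.0 - rho**2)
    return rho, sigma2_ar


def toeplitz_cov(acov):
    """Full covariance matrix with entry (i, j) equal to acov[|i - j|]."""
    idx = np.arange(len(acov))
    return np.asarray(acov)[np.abs(idx[:, None] - idx[None, :])]


if __name__ == "__main__":
    # Round trip on illustrative values (not the poster's estimates).
    acov = ar1_wn_autocov(rho=0.4, sigma2_ar=1.0, sigma2_wn=0.5, n_lags=5)
    print(fit_ar1_given_wn(acov, sigma2_wn=0.5))
```

With an ACF estimated from real data the lag-1 ratio is noisy, which is one reason for averaging the parameter estimates over ROI voxels as described above.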

A single group FSL analysis is required to estimate the between-subject variability, $\sigma^2_B$. This value is estimated for every voxel, so all that is necessary is to average $\hat\sigma^2_B$ within ROIs.

Data

We used the FIAC single-subject block design data for subjects 0, 1, 2, 3, 4, and 6 [2]. Single-subject variance parameter estimates were based on subject 0's data, and the group variance parameters were obtained from a group model using all 6 subjects. In all cases the contrast of interest corresponded to same sentence, same speaker.

Results

Parameter Estimation

The left panel of Figure 2 illustrates how well the 3 parameter AR(1) + WN estimate matches FSL's unstructured covariance estimate by comparing T values using each method's covariance estimate. The T values are similar, indicating that the 3 parameter summary of FSL's unstructured covariance works well.

[Figure 2 panels: left, "Comparison of T Statistics" ($T_{FSL}$ against $T_{AR(1)+WN}$); middle, "Within Subject Variance" (frequency histogram of $\widehat{\mathrm{Var}}(c\hat\beta)$, %); right, "Between Subject Variance" (frequency histogram of $\hat\sigma^2_B$, %).]

Figure 2: Comparison of T statistics based on FSL's unstructured ACF and the 3 parameter AR(1) + WN summary of FSL's covariance estimate. The middle and right panels show histograms of the within- and between-subject variances within ROIs based on FSL analyses. The red line in the middle figure indicates the variance based on the AR(1) + WN covariance using the average values of $\rho$, $\sigma^2_{AR}$, and $\sigma^2_{WN}$, and the red line in the right figure indicates the mean of the distribution.

Figure 2 also shows histograms of the within- (middle) and between- (right) subject variance based on FSL's variance estimates within ROIs. The red line indicates the value used in our power calculation, where the within-subject variance is based on the AR(1) + WN model using the mean values of $\rho$, $\sigma^2_{AR}$, and $\sigma^2_{WN}$, and the between-subject variance is simply the mean of the distribution.

For the power calculations we used values of $\delta$, $\sigma^2_{WN}$, $\sigma^2_{AR}$, $\rho$, $\sigma^2_B$, and $\alpha$ based on the FSL analysis described above. A block design study with 15s of stimulus followed by 15s of rest was used.

Desmond and Glover do not model autocorrelation and use a simple T-test model to estimate power, which is most likely not the same model that will be used for the data. Therefore, to understand the impact of using the wrong signal or noise model when estimating power, we have illustrated statistical power for different model possibilities: right/wrong design (with/without HRF convolution) and right/wrong noise model (correlated/uncorrelated). Figure 3 shows that when the wrong design (no HRF convolution) and wrong noise (independent) are assumed, power is overestimated by up to 10%.

[Figure 3 plot: power (%) against $\rho$ for four models: Wrong Noise and Design, Wrong Design, Wrong Noise, Correct.]

Figure 3: Illustration of the consequences in statistical power that will result if the model used to estimate power is not the same as the model you use to analyze your data. If you assume the boxcar regressor is not convolved with an HRF and the data are independent (wrong noise and design), power is overestimated by up to 10% compared to the correct model with HRF convolution and accounting for autocorrelation.
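The noise-model half of this comparison can be illustrated with a small sketch. It computes the within-subject contrast variance $c (X' V^{-1} X)^{-1} c'$ for a 15s/15s boxcar design under an AR(1) + WN covariance and under an independence assumption with the same marginal variance. Everything specific here is an assumption for illustration: the repetition time of 2.5s (so 6 scans per half cycle), the parameter values, and the function names. HRF convolution is left out, so this does not reproduce Figure 3.

```python
import numpy as np


def boxcar_design(n_cycles, scans_on=6, scans_off=6):
    """Boxcar task regressor plus intercept (no HRF convolution)."""
    cycle = np.r_[np.ones(scans_on), np.zeros(scans_off)]
    x = np.tile(cycle, n_cycles)
    return np.column_stack([x, np.ones_like(x)])


def ar1_wn_cov(n, rho, sigma2_ar, sigma2_wn):
    """n x n AR(1) + WN covariance matrix."""
    idx = np.arange(n)
    lags = np.abs(idx[:, None] - idx[None, :])
    return sigma2_ar / (1.0 - rho**2) * rho**lags + sigma2_wn * np.eye(n)


def contrast_variance(X, V, c):
    """c (X' V^{-1} X)^{-1} c': variance of the contrast estimate."""
    return float(c @ np.linalg.inv(X.T @ np.linalg.solve(V, X)) @ c)


if __name__ == "__main__":
    X = boxcar_design(n_cycles=10)
    c = np.array([1.0, 0.0])  # contrast on the task regressor

    # Hypothetical noise parameters, not the poster's estimates.
    V_corr = ar1_wn_cov(len(X), rho=0.5, sigma2_ar=1.0, sigma2_wn=0.5)
    # "Wrong noise" model: same marginal variance, no correlation.
    V_indep = np.diag(np.diag(V_corr))

    print("correlated :", contrast_variance(X, V_corr, c))
    print("independent:", contrast_variance(X, V_indep, c))
```

For these inputs the independence assumption gives the smaller variance, which is the direction that leads to an overestimate of power.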

Figure 4 illustrates how our power calculations can be used to design a study. The left panel shows power for different sample sizes when the overall functional scanner time is at most 60 minutes. The top x-axis represents time in minutes and the bottom x-axis represents the number of 30s on/off cycles of the block design. Notice how the power curves show diminishing returns; for each sample size there is a point where collecting additional cycles has little impact on power. The right panels illustrate the cost and number of cycles necessary to achieve 80% power for different sample sizes, where cost is based on a fee of $10/minute. Interestingly, a smaller sample size is more expensive than a larger sample size, because each subject needs additional scanner time.

[Figure 4 panels: left, "Power", total functional scanner time ≤ 1 hour: power (%) against number of on/off (15s/15s) cycles (bottom axis) and time in minutes (top axis), with one curve per sample size from 12 to 24 subjects; right, "Fixed Power=80%": number of cycles against number of subjects, and functional cost ($) against number of subjects.]

Figure 4: Power and cost for a block design fMRI study. The left panel illustrates statistical power when the maximum functional scanner time is 60 minutes. The two figures on the right illustrate cost and number of cycles for different sample sizes in order to obtain 80% power. Cost is based on a fee of $10 per minute.
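The cost arithmetic behind the right panels can be written out directly: functional cost is the number of subjects times the scanner minutes per subject times the per-minute fee. The sketch below uses the 30s cycle length and the $10/minute fee stated in the text; the function name and the example inputs are hypothetical and are not read off the figure.

```python
def functional_cost(n_subjects, n_cycles, cycle_seconds=30.0, fee_per_minute=10.0):
    """Total functional scanner cost in dollars for a block design study."""
    minutes_per_subject = n_cycles * cycle_seconds / 60.0
    return n_subjects * minutes_per_subject * fee_per_minute


if __name__ == "__main__":
    # Illustrative inputs: 24 subjects scanned for 3 cycles each.
    print(functional_cost(24, 3))  # 360.0
```

Pairing this with a power function, one would find for each sample size the smallest number of cycles that reaches the target power and then compare the resulting costs.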

Conclusions

We have introduced a flexible power estimation technique that will admit any first- and second-level study designs, can estimate power for T or F tests, and accounts for temporal autocorrelation. Since it is based on the two-level summary statistics model, it is easily adapted to the models of fMRI software packages such as SPM2 and FSL.

The necessity of a flexible power estimation model was illustrated in Figure 3, which showed that using a simple model to estimate power, such as the model used by Desmond and Glover, may lead to an overestimate of up to 10% in power. To obtain a reliable power estimate, it is necessary to match the power model as closely as possible to the model that will be used to analyze the data.

We also illustrated how our power model can be used to help design a study in a way that maximizes power and minimizes the cost of the study.

References
[1] Desmond & Glover. 2002, J Neurosci Meth, 118:115-128.
[2] Dehaene-Lambertz et al. 2006, Hum Brain Mapp, 27:360-371.
[3] FSL, http://www.fmrib.ox.ac.uk/fsl/