statistical modeling of positron emission tomography images in wavelet space

Statistical Modeling of Positron Emission Tomography Imagesin Wavelet Space

Federico E. Turkheimer, *Matthew Brett, John A. D. Aston, Alexander P. Leff, Peter A. Sargent,Richard J. S. Wise, Paul M. Grasby, and Vincent J. Cunningham

MRC Cyclotron Unit, Hammersmith Hospital, London, and *MRC Cognition and Brain Sciences Unit, Cambridge, U.K.

Summary: A new method is introduced for the analysis ofmultiple studies measured with emission tomography. Tradi-tional models of statistical analysis (ANOVA, ANCOVA andother linear models) are applied not directly on images but ontheir correspondent wavelet transforms. Maps of model effectsestimated from these models are filtered using a thresholdingprocedure based on a simple Bonferroni correction and thenreconstructed. This procedure inherently represents a completemodeling approach and therefore obtains estimates of the ef-fects of interest (condition effect, difference between condi-tions, covariate of interest, and so on) under the specified sta-

tistical risk. By performing the statistical modeling step inwavelet space, the procedure allows the direct estimation of theerror for each wavelet coefficient; hence, the local noise char-acteristics are accounted for in the subsequent filtering. Themethod was validated by use of a null dataset and then appliedto typical examples of neuroimaging studies to highlight con-ceptual and practical differences from existing statisticalparametric mapping approaches. Key Words: Wavelets—Statistical analysis—Statistical parametric map—Estimation—Positron emission tomography—Single photon emission com-puted tomography.

This article concludes a trilogy of papers devoted tothe use of the wavelet transform (WT) for the spatialmodeling of images obtained with emission tomography(ET). The first of the series (Turkheimer et al., 1999)described a unified framework for the analysis of singleemission images and derived statistical maps in station-ary noise conditions (homogeneous variance). The sec-ond article (Turkheimer et al., 2000) considered the spa-tio-temporal modeling of dynamic ET studies and devel-oped a combined application of kinetic modeling andWT that allowed the optimal estimation, in the leastsquare sense, of parametric maps in nonstationary noiseconditions (heterogeneous variance).

This article builds upon this bulk of work and devel-ops a framework for the statistical analysis of multipleET images in nonstationary noise conditions. As it willbe clear in the next sections, the theoretical basis of such

a method already resides in the previous two manu-scripts. Therefore, the purpose of the following sectionsis to clarify the use of the WT for this particular appli-cation and highlight the conceptual differences betweencurrent approaches in image space and WT methods.This task is pursued by theoretical means and applica-tions to real datasets.

Statistical modeling in image spaceLet Y(s,i) be a set of ET scans indexed by i, where s:

s � (sx,sy,sz)∈�3 spans the pixels of the digitized im-ages. In general, i: i � (k,h,j) ∈�3 as Y(s,i) may bemeasured in different conditions (index j) in differentsubjects (index h) in repeated occasions (index k). Y(s,i)are acquired from emission tomographs and recon-structed from sinograms or planar projections at theirhighest resolution using back-projection with a ramp fil-ter. It is useful to consider Y(s,i) as projected in the sameanatomic space so that each spatial coordinate s willcorrespond to the same anatomic location for all imagesin the set.

The measurement of the set Y(s,i) was aimed to obtaininferences on the spatial distribution P(s) of a certainparameter P that is a function of the conditions underwhich the data-sequence was acquired. For example, incerebral blood flow (CBF) studies of brain function, thepattern of interest P(s) is the difference in CBF betweenbaseline and an activation condition. This difference can

Received May 25, 2000; final revision received July 27, 2000; ac-cepted August 14, 2000.

Address correspondence and reprint requests to Federico E.Turkheimer, MRC Cyclotron Unit, Hammersmith Hospital, DuCaneRoad, London W12 0NN, U.K.

Abbreviations used: CBF, cerebral blood flow; DWT, dyadic wave-let transform; ET, emission tomography; GLM, general linear model;MSE, mean squared error; PET, positron emission tomography; PCA,principal component analysis; SPECT, single photon emission com-puted tomography; SPM, statistical parametric map; WT, wavelettransform.

Journal of Cerebral Blood Flow and Metabolism20:1610–1618 © 2000 The International Society for Cerebral Blood Flow and MetabolismPublished by Lippincott Williams & Wilkins, Inc., Philadelphia

1610

be estimated by use of a general linear model (GLM) inwhich the effect of each condition is inscribed as anadditive term and other effects of no interest (subjecteffects, temporal effects, scans global counts, and so on)are also included as additive terms or covariates (Fristonet al., 1995).

The mathematic relation between the data and the ef-fect of interest will be condensed here into a functionlabeled �(). �() in ET is usually derived from linearmodels (ANOVA or ANCOVA) where the set Y(s,i) ismodeled in terms of conditions and covariates related toI � (k,h,j) (Friston et al., 1995).

Alternatively, the model for the sequence Y(s,i) can beextracted from the data themselves in the form of, say, aset of eigenvectors extracted from a Principal Compo-nent Analysis (PCA; Barber, 1980; Moeller et al., 1987;Friston et al., 1993); in this case, the pattern P(s) ofinterest is the regional distribution of the contribution ofeach eigenvector to the pixel variance.

The application of �() models the dependency of thedata from the experimental setting and accommodatesthe dimension of Y(s,i) indexed by i. However, a com-plete modeling approach aims to the estimation of theparameter of interest, in this instance the pattern P(s); todo this, all dimensions of the data must be considered.Therefore it is also necessary to model the spatial dimen-sion indexed by s.

Figure 1 sketches a common approach to solving sucha problem in neuroimaging. In this scheme, spatial mod-eling is approached by two separate steps, filtering andstatistical thresholding. Filtering commonly consists ofthe application of monoresolution filters (for example,gaussian filters) to produce the filtered set YF(s,i). How-ever, such a simple modeling approach does not allowany kind of control over the statistical properties, biasand variance, of the derived map P̂(s) � �(YF(s)). There-fore, such a map is usually accommodated in the form ofa statistical parametric map (SPM) in which the para-metric pattern P̂(s) is normalized by correspondent esti-mates of its standard error �̂(s) to obtain a map of sta-tistical scores (for example, t scores, z scores, and so on)(Friston et al., 1991; Worsley et al., 1992). This is pre-liminary to the subsequent thresholding procedure thatconsists of feeding the statistical scores into the appro-priate cumulative distribution function be �() that allowsthe control of the statistical risk of false positives in theentire SPM. The correspondent P value map is then de-rived as P̂p(s) � 1 - �(P̂(s),�̂(s)). Those coordinates swhere P̂p(s) > � (� is an operator selected threshold,usually 0.1, 0.05, and so on) are then identified as thoselocations where parameter P has a statistically significantvalue. To capture varying patterns of P(s), a numberof �() have been proposed that are sensitive to spatialdistributions of the signal that range from focal andsparse to nonfocal and distributed (Friston et al., 1994;

FIG. 1. Scheme describes a common approach for the analysisof neuroimages. Y(s,i) (represented as bidimensional set of im-ages indexed by i) represents a set of multiple images acquiredin various conditions and all normalized to the same anatomicreference. Frames Y(s,i) are smoothed with a monoresolutionfilter to produce the filtered set YF(s,i). A functional �() is thenapplied to produce the parametric pattern P̂(s) and correspondentstandard error �̂(s). It is common to obtain P̂(s) from the pixel-wise applications of ANOVA or ANCOVA linear models and toexpress it as a statistical map where each value of the parameterof interest is normalized by correspondent values of �̂(s). Thesestatistical scores are then inserted in an appropriate cumulativeprobability distribution function that allows the control of the rateof false positive in the volume of interest. The output is a prob-ability map P̂p(s). Those locations where P̂p(s) is under a pre-defined threshold are then defined as those anatomic coordi-nates where pattern P̂(s) has a statistically significant value.

STATISTICAL MODELING OF PET IMAGES IN WAVELET SPACE 1611

J Cereb Blood Flow Metab, Vol. 20, No. 11, 2000

Worsley et al., 1995). Alternatively, such sensitivity maybe achieved by iterating the filtering and thresholdingsteps at different resolutions where �() is derived toobtain control of statistical risk over the searched scales(Poline and Mazoyer, 1994; Worsley et al., 1996).

Hypothesis testing versus estimationThe scheme of Fig. 1 was designed to obtain an esti-

mate of the pattern P(s) with a controlled statistical risk.However, the final output is a probability map and notthe pattern itself. Such a map may be used to assess thelocation of statistically significant values of P(s), but notto obtain a numeric estimate of its spatial distribution(Worsley, 1997; Turkheimer et al., 1999). The reason forthis lies in the incomplete modeling of the data: whereasthe dimension i of Y(s,i) is modeled by �(), no model isused for the spatial dimension s. The use of a monoreso-lution filter is suboptimal as it cannot adapt to the spa-tially varying features of the signal nor to the local sig-nal-to-noise ratios.

Statistical modeling in wavelet spaceOne general approach to the modeling of images con-

sists of the use of a mathematical transform that projects

the data onto an appropriate functional space. Previouswork (Unser et al., 1995; Ruttiman et al., 1996, Turkhei-mer et al., 1999) showed that bases of smooth waveletsare able to model efficiently the signal of images mea-sured with positron emission tomography (PET) andsingle photon emission computed tomography (SPECT).Besides, the projection onto wavelet space is obtainedthrough application of the dyadic wavelet transform(DWT; Mallat, 1989) that is a linear and orthogonal op-erator. Such properties therefore can be used for the spa-tial modeling of the dataset Y(s,i) by using the methodshown in Fig. 2. In this scheme, every single frame of thedataset is transformed by application of the wavelettransform to obtain W(w,i) � WT(Y(s,i)). The applica-tion of the statistical model through the operator �() pro-duces the parametric image in wavelet space P̂wt(w) andcorrespondent estimates of the standard error �̂wt(w).Both P̂wt(w) and �̂wt(w) then are passed through a wave-let filter h() that thresholds each wavelet coefficientP̂wt(w) proportionally to its error �̂wt(w) and accordinglyto a certain statistical risk (Turkheimer et al., 1999). Byapplication of the inverse WT one can obtain a final esti-mate of the parametric pattern P̂F(s) � WT−1(P̂F

wt(w)).

FIG. 2. Solution to the problem of es-timation of P̂(s) with a controlled sta-tistical risk. All frames Y(s,i) are pro-jected in wavelet space through thedyadic wavelet transform. A patternP̂wt(w) and correspondent error esti-mate �̂wt(w) are obtained by applica-tion of an appropriate linear statisticalmodel in wavelet space. A wavelet fil-ter h() is then applied to P̂wt(w) thatshrinks each wavelet coefficient pro-portionally to its error by use of aspecified statistical risk. The filteredmap P̂F

wt(w). is then inverted in imagespace to produce the final estimateP̂F(s).

F. E. TURKHEIMER ET AL.1612


This sequence is identical in all aspects to that usedpreviously for the estimation of parametric maps fromdynamic ET scans (Turkheimer et al., 2000). In fact, theapplication of a statistical model to a series of ET studiesis in no way different from the application of any othermodel (for example, a kinetic one) to a sequence of to-mographic images. In both cases, the final aim is theestimation of the pattern of the parameters of interest.

The kinetic modeling of a dynamic ET acquisition isused to recover the spatial distribution of physiologicindexes such as blood flow, metabolism, receptor den-sity, and so on (Phelps et al., 1986). The statistical mod-eling of a series of PET images is used to investigate thepattern either of variables (factors and covariates) relat-ing to the conditions under which the scans were made,or of linear combinations (contrasts) of such variables.

The only difference between the analysis of dynamicscans and the analysis of multiple scans is the statisticalrisk that characterizes the choice of filter h(). If P(s) isthe pattern of a physiologic parameter one may be moreinterested in the mean squared error (MSE; the meansquared error is equal to the sum of the variance andsquared bias) properties of the final estimate (Turkhei-mer et al., 2000). Instead, when P(s) is the pattern of acertain measurement related to a certain experimentalcondition then one is usually more interested in the con-trol of the variance of P̂ F(s).

A number of wavelet filters exist in the statistical lit-erature for each risk of choice. For example, the SUREfilter (Donoho and Johnstone, 1995) has excellent MSEproperties. However, variance control can be exerted byminmax filters (Donoho and Johnstone, 1994; 1995) orby thresholding approaches that adopt multiple testingcorrections like the Bonferroni (Turkheimer et el., 1999).

The scheme in Fig. 2 already has been discussed indetail (Turkheimer et el., 2000). The previous manu-scripts (Turkheimer et al., 1999; 2000) also contain back-ground material on wavelet bases, DWT, and filteringprocedures in wavelet space. However, for the context ofthis article, it is relevant to review the following aspectsof the methodology.

First, the scheme illustrated in Fig. 2 is a comprehen-sive modeling approach that uses appropriate functionalsupport for dimensions s and i. In fact, the outcome P̂ F(s)is a parametric map and not a map of probabilities. Sec-ond, if the functional �() is linear then, given that theDWT is a linear and orthogonal operator, a solutionP̂ F

wt(w) that is optimal under a certain statistical risk inwavelet space will produce an inverse P̂ F(s) that is alsooptimal under the same risk in image space (Turkheimeret al., 2000). Therefore, such procedures may have ex-tensive use in the analysis of ET images where linearmodels, such as GLM or PCA, are common. Third, noiseprocesses in wavelet space are spatially uncorrelatedgiven the well-known whitening properties of the DWT

(Turkheimer et al., 1999; 2000). In this particular con-text, this means that to control the variance of the solu-tion any standard multiple hypothesis testing approachcan be efficiently used (Ruttiman et al., 1996). Notably,in image space, the control of error rates is more difficultbecause the spatial correlation of the original images, orthe one introduced by the filter, has to be taken intoaccount; consequently, the methods rely on Gaussianrandom fields theory (Adler, 1981) or randomizationprocedures (Poline and Mazoyer, 1994; Holmes et al.,1996).

Finally, the scheme of Fig. 2 allows the point-wiseestimation of the noise process in wavelet space �̂wt(w)and is therefore able to handle the computational processin nonstationary gaussian noise conditions typical of thistype of study (Turkheimer et al., 2000).

Filters that control the false positive rates were se-lected for the application section of this article to mirrorcommon thresholding strategies in image space. The twoproperties of risk equivalence between image and wave-let space and of noncorrelation of statistical scores indi-cate the use of a simple filter in the form of hard-thresholding operator defined, for a wavelet coefficient dwith error �̂, as:

hhard�d� = I��d��̂��, (1)

where I() is the indicator function. The threshold � isderived from the Bonferroni adjustment for multiple testsas (Turkheimer et al., 1999):

� = �−1�1 − ��d�*2��, (2)

where � is the desired probability of at least one falsepositive in the set (for example, 0.5, 0.01, and so on), #()is the cardinality of the set (number of non-null coeffi-cients). Given the assumption of normal noise, the in-verse cumulative distribution function �-1 can havethe simple form of an inverse Student’s t distributionwith degrees of freedom equal to those of the estimatederror �̂.

MATERIALS AND METHODS

The Results section of this article is devoted to validationand practical illustration of the method. Although extensivework has been performed for validation of WT methods intomography (Turkheimer et al., 1999, 2000), it is of interest toverify the assumptions behind the filters of Eqs. 1 and 2 that arethe ones of choice for the present context. In particular, the useof standard Student’s t distributions in Eq. 2 and relative Bon-ferroni adjustment are appropriate only if the noise processes ofthe set W(w,i) strictly match the independent normal noise con-ditions in wavelet space described above. By analogy with theapproach of Worsley et al. (1993), such correspondence wasverified by comparing the randomization distribution of stu-dentized wavelet coefficients generated from a group of realimages with the expected probability distribution function.



Practical application of the method considered two PETdatasets in which the statistical model consisted of a GLM thatis a common choice for data analysis in neuroimaging. For anexample on the use of PCA refer to the previous article(Turkheimer et al., 2000).

The first dataset is a parametric study of CBF response toword recognition. In this case, the pattern of interest P(s) is themap of the rate of change of CBF toward the specified rate ofwords. The second study considers the measurement of [car-bonyl-11C]WAY-100635 binding in brain in a group of normalcontrols and in a group of depressed subjects. In this case, it isof interest to estimate the distribution of the difference in se-rotonin receptors density between the two populations.

Details on the modeling of the experimental conditions aregiven in the following sections. The software used was writtenin MATLAB (Mathworks, Nadick, MA, U.S.A.) and run on anUltra-Sparc 10 Sun Workstation (SunSystems, Mountain View,CA, U.S.A.). The DWT was implemented as detailed inTurkheimer et al. (1999) using Battle-Lemarie wavelets (Battle,1987; Lemarie, 1988). The three-dimensional DWT was usedin all studies. Because DWT requires a data size that has to bea power of 2, all original images were imbedded in a 128 × 128× 128 volume with 2 × 2 × 2 mm pixel size. The translationinvariant DWT used previously (Turkheimer et al., 1999, 2000)could not be extended to the third dimension because such anapproach requires storage space and computational power thatis not provided by the hardware available. However, the use ofinformation from all three axes compensated for the translationinvariant approach as the amount of artifacts in the parametricmaps was negligible (see Results section). In all studies, thehard-threshold filter of Eq. 1 was used and the threshold wasderived from Eq. 2 by using a standard t-distribution with ap-propriate degrees of freedom derived from the GLM used. Theoverall error rate in Eq. 2 � � 0.05 for both studies. Waveletcoefficients outside the brain were suppressed.

Randomization studyIt is of interest to verify the ability of the wavelet filter of Eq.

2 to control error rate in wavelet space. This means that underthe null hypothesis of all frames Y(s,i) being acquired in thesame condition, the Bonferroni adjusted threshold of Eq. 2must be able to allow a proportion of studentized wavelet co-efficients d/�̂equal to the desired error rate �. Such ability ofthe procedure may be verified by constructing from the nulldataset Y(s,i) a number of simulated datasets consisting in twogroups of scans made by random assignment of the originalones (Worsley et al., 1992). Statistical analysis is then per-formed on each simulated dataset, and in each occasion theoccurrence of at least one wavelet coefficient over the thresholddefined in Eq. 2 is recorded as a positive event. After all pos-sible permutations of Y(s,i) in the two groups have been per-formed, the number of positive events divided by the totalnumber of simulations performed is compared with the errorrate �.

The null dataset was composed by the scans in activationstate of a previously published protocol (Brett et al., 1998).Such protocol consisted in two conditions: activation and rest.In all conditions, subjects looked at a computer monitor dis-playing a series of stimuli, consisting of abstract designs andvideo clips of four abstract hand gestures. In the activationcondition, subjects performed one of the four abstract handgestures with their right hand. In the rest condition, they weretold to observe the stimuli without moving, and to let theirminds go blank. Subjects were scanned with an ECAT 953BPET camera (CTI/Siemens, Knoxville, TN, U.S.A.) operatingin three-dimensional mode. Approximately 9.2 mCi H2O15

were injected by the bolus method for each scan. For eachsubject, reconstructed images were realigned to the first scan inthe session, and then coregistered to the Montreal NeurologicalInstitute standard brain (Evans et al., 1993) using SPM99 soft-ware (Functional Imaging Laboratory, London, U.K.). The nullset Y(s,i) consisted of data from 8 subjects, 5 frames each.

The statistical model consisted of a GLM with 8 subjectsfactors plus 8 subject specific covariates to normalize each scanto its global counts. Single frames from all subjects were ran-domly assigned to 2 groups of 20 scans each, then the statisticalanalysis in wavelet space was performed. Error estimates wereobtained by pooling residuals from all 40 scans. Because the setof all possible permutations was too large (40!/(20!20!) � 1.3e+ 11), analysis was repeated for a subset of 1000 random as-signments, and the number of times that studentized waveletcoefficients were over prefixed thresholds (0.1, 0.05, 0.01) wasrecorded.

Parametric study of cerebral blood flow response toword recognition

This example illustrates the application of the WT method toa typical parametric study of CBF response to an experimentalcondition. By “parametric” it is usually meant that the CBF isrelated to stimulus intensity by a linear relation and is thereforeof interest to estimate the pattern of such linear rate determinedby the experimental setting. In this particular instance, data areselected from a previously published study that considered aword recognition task (Leff et al., 2000). Five right-handed,normal subjects participated in the study. All subjects claimedEnglish as their first language. Subjects viewed single words,presented centrally, at different rates: 5, 20, 40, 60, and 80 perminute. Stimuli were displayed for 500 milliseconds and wereproceeded by a central cross-hair. The cross-hair remained forthe entire interstimulus interval. There was also a “baseline”condition—a single word appeared just before the scan startedand was then replaced by a cross for the rest of the scan—toensure that subjects attended to the screen at all times duringthe scan. All stimuli were single syllable, English words (nounsand verbs). Subjects were not given any explicit task related tothe stimuli; instead, they were instructed to “read and under-stand” the words. Each participant underwent 8 scans, the orderof which was randomized within and across subjects. Imageswere acquired using an ECAT 953B camera (CTI, Knoxville,TN, U.S.A.) operated in three-dimensional mode. Arterialsamples of the input function were not available, therefore datarepresent normalized tissue concentrations after injection ofH2

15O. For each subject, reconstructed images were realignedand then coregistered to the MNI standard brain (Evans et al.,1993) using SPM99 software.

Scans were entered into a design matrix for a GLM withstimulus rate as the covariate of interest. Subjects effects andsubject-specific covariates for global normalization completedthe matrix.

Measuring the effect of depression on brainserotonin receptors with [11C]WAY-100635

[carbonyl-11C]WAY-100635 is a selective 5-HT1A receptorantagonist that can be used in conjunction with PET to assess5-HT1A receptor binding in vivo (Pike et al., 1996). In thissection the authors reanalyze a previously published dataset(Sargent et al., 2000) in which this tracer was used to measurechanges in 5-HT1A receptors in depression. [11C]WAY-100635scans were performed with an ECAT 953B camera (CTI) on 15patients with major depressive disorder (unmedicated) and 18healthy volunteer subjects. Binding potential images were



generated from the dynamic scans using a reference tissuemodel (Gunn et al., 1998). Images were spatially normalizedwith SPM-96 software (Functional Imaging Laboratory, Lon-don, U.K.) into Montreal Neurological Institute stereotacticspace using a [11C]WAY-100635 template as previously de-scribed (Meyer et al., 1999). Statistical modeling was per-formed as in the original analysis to simplify the comparisonbetween the results of the WT method and those previouslyobtained with statistical parametric mapping. The GLM con-tained only one factor (group) in which error was estimated bypooling residuals estimates from both populations.

RESULTS

Randomization studyProportions of false positives at each of the � � 0.1,

0.05, 0.01 are shown in Table 1. All proportions are closeto nominal levels, which suggests that the underlyingnoise processes in wavelet space are reasonably wellapproximated by normal, independent distributions.Table 1 also reports the error of simulations. This can beestimated by noting that the rejection of the null hypoth-esis is a binomial event. Using the normal approximationto the binomial distribution, the approximate 90% con-fidence interval surrounding each rejection frequency Fmay be easily computed as F ± 1.645[(F)(1−F)/N]1/2

where N is the number of simulated datasets (Noreen,1989).

Empirically, one can appreciate the concordance of theempirical probability distribution function of wavelet co-efficients with the expected one by looking at Fig. 3where such distribution from one of the simulateddatasets is compared with the Student’s t-distributionwith 23 degrees of freedom.

Parametric cerebral blood flow studyThe estimated pattern of CBF response to stimulus

rate is shown in Fig. 4 superimposed to the correspon-dent MNI space. The main area of activity was in theposterior part of primary visual cortex (BA17 or V1), andprestriate cortex (BA18 and 19), bilaterally. From pres-triate cortex, activity ran anteriorly, medially, and ven-trally through the occipito-temporal junction and into thefusiform gyrus (BA 37). This effect was greater on theleft. The other two main brain regions, which also re-sponded bilaterally, were in the prefrontal cortex: thesupplementary motor area (SMA) and premotor cortex,and medial and lateral BA6, respectively.

Previous processing in SPM99 (Leff et al., 2000) iden-tified the same areas with smaller clusters. Bilateral pri-mary and secondary visual cortex and SMA were acti-

vated, but only the left premotor (lateral BA6) survivedcorrection at the overall error rate � � 0.05 (one reso-lution only analyzed using a gaussian filter with 10 mmfull width at half maximum); activity on the right waspresent but only at a lower threshold (P � 0.001 uncor-rected). This was also true for the bilateral fusiform area.

Measuring the effect of depression on brainserotonin receptors with [11C]WAY-100635

Results are presented in Fig. 5 as binding potentialvalues superimposed on the MNI brain. In the depressedpopulation, binding potential values were reduced acrossfrontal, temporal, and limbic cortex (maximum differ-ence 0.6 ∼ 10% difference from normal values). Ana-tomic location of such differences matches the one re-ported previously using SPM96 (Sargent et al., 2000).However, such a pattern could not be obviously esti-mated by SPM and the detection of the areas of signifi-cant change of the parameter pattern required a muchlower threshold (P < 0.01 uncorrected, images smoothedwith a gaussian filter with 8 mm full width at half maxi-mum) with equivalent increase in the error rate �. Figure5 also shows a greater reduction of binding in the rightorbitofrontal cortex compared with the left side thatcould not be revealed in the previous analysis by SPMbut only by region of interest analysis (Sargent et al.,2000).

DISCUSSION

A new method for the statistical analysis of sets ofPET images was introduced. The method adopts currentstatistical models for the description of the set but isnovel in the use of a mathematical transform, the wavelettransform, to model the spatial dimension of images. Use

TABLE 1. Specificity of wavelet filter with Bonferronicorrection (± 90% confidence interval)

Nominal Level � � 0.1 � � 0.05 � � 0.01Estimates From

Randomization 0.087 (±0.016) 0.045 (±0.011) 0.012 (±0.005)

FIG. 3. Distribution of wavelet scores computed from 1 of thenull-datasets obtained by random assignment of 40 scans ac-quired in the same experimental conditions in 2 simulated experi-mental groups of 20. Figure displays the empirical distribution ofstudentized wavelet coefficients superimposed with the expecteddistribution, a standard Student’s t-distribution with 23 degrees offreedom.



FIG. 4. Results of the wavelet procedure when applied to the analysis of a water activation study. The activation study was designed toactivate areas that respond to word recognition. The filtered activation image is displayed in color overlaid on a gray scale magneticresonance imaging of the MNI standard brain. Values represent the pattern of the change of blood flow toward the rate of words estimatedwith an error rate � = 0.05 over all slices. The left side of the brain is displayed at the left of the slices. Slices are labeled with their distancein millimeters from the transverse plane containing the anterior and posterior commissures. Units are in normalized counts for unit wordrate.

FIG. 5. Results of the wavelet procedure for the [11C]WAY-100635. The colored pattern, overlaid on the MNI standard brain, representsthe difference in binding potential (BP units are pure numbers) between the controls and the depressed group. Overall error rate for theestimation was � = 0.05. The left side of the brain is displayed at the left of the slices. Labels indicate the distance in millimeters from thetransverse plane containing the anterior and posterior commissures.



of a transform allows a complete modeling approach thatproduces spatial estimates of the parameters of interestunder the desired statistical risk.

This property distinguishes this approach from currentmethods of analysis. These methods combine mono-resolution filters with thresholding procedures that con-trol the rate of false positives; their output are probabilitymaps and no proper estimation of effects is allowed.

In short, whereas SPM methods are meant to obtainstatistical significance, the wavelet approach targets theestimation problem and this difference is substantial, asillustrated by the words of S.C. Pearce (1993): “Also,there should be no suggestion that statistical analyzesexist to find significant differences. Experiments are con-ducted in order to find answers to the questions beingasked. Possibly no one doubts that a difference exists. Ifso, the task is to estimate its size and not to test for itsexistence. Further, a difference of means may be signifi-cant but not important or vice versa. . . .”

The technique was built upon previous work with at-tention to the noise model; in particular, the applicationof the statistical model in wavelet space allowed themethod to adapt to the local noise characteristics of to-mographic images. Such an ability combines with theenergy preserving and whitening properties of thetransform to yield a theoretical setting that controls ef-fectively the chosen risk by use of standard statisticalprocedures. In the framework of multiple hypothesis test-ing, which is common in neuroimaging, the control offalse positives may be obtained by a simple Bonferronicorrection.

Theoretical expectations were validated in the appli-cation section of the paper. In the first instance, the cor-respondence between the empirical distributions of dataand the predicted ones for a null dataset (all frames in thesame condition) was shown. Note that the verification ofassumptions on a single dataset, however randomly se-lected and representative of the general experimentalconditions that data set may be, does not imply that theseassumptions would hold on any other set. Deviationsfrom independence and normality can be either predictedor easily detected using routine diagnostics. In this case,one may divert from the use of parametric approachesand turn toward nonparametric approaches. In this par-ticular instance, permutation methods applied to waveletcoefficients may be appropriate. Permutation methodsfor multiple testing are reviewed by Westfall and Young(1993). Also see Holmes et al., (1996) for an applicationto neuroimaging.

The same considerations regarding the robustness ofparametric assumptions in neuroimaging also apply tocurrent statistical parametric methods with the importantdifference that the methods in wavelet space do not re-quire the additional constraints imposed by gaussian ran-dom field theory such as large number of degrees of

freedom and upper and inferior bounds on the smooth-ness of the field (Holmes et al., 1996).

Finally, the method was also applied to two otherdatasets to clarify its use in typical instances of analysisof neuroimages. Comparison of these results with thoseobtained previously with statistical parametric mappingshowed that the use of wavelet bases, which are optimalor near-optimal for PET signals (Turkheimer et al.,1999), combined with an adequate noise model, not onlyestimates correctly the parametric patterns but also pro-duces a valuable increase in the detection power.

REFERENCES

Adler RJ (1981) The geometry of random fields. New York: WileyBarber DC (1980) The use of principal components in the quantitative

analysis of gamma camera dynamic studies. Phys Med Biol25:283–292

Battle G (1987) A block spin construction of ondelettes. Part I: Lemariefunctions. Commun Math Phys 110:601–615

Brett M, Stein JF, Brooks DJ (1998) The role of the premotor cortex inimitated and conditional praxis. Neuroimage 7(4):S978

Donoho DL, Johnstone IM (1994) Ideal spatial adaptation by waveletshrinkage. Biometrika 81:425–455

Donoho DL, Johnstone IM (1995) Adapting to unknown smoothnessvia wavelet shrinkage. J Am Statist Ass 90:1200–1224

Evans AC, Collins DL, Mills SR, Brown ED, Kelly RL, Peters TM. 3Dstatistical neuroanatomical models from 305 MRI volumes. In:Proc IEEE Nuclear Science Symposium and Medical ImagingConference 1993;1813–1817.

Friston KJ, Frith CD, Liddle PF, Frackowiak RSJ (1991) Comparingfunctional (PET) images: the assessment of significant change. JCereb Blood Flow Metab 11:458–466

Friston KJ, Firth CD, Liddle PF, Frackowiack RSJ (1993) Functionalconnectivity: the principal component analysis of large (PET) datasets. J Cereb Blood Flow Metab 13:5–14

Friston KJ, Worsley KJ, Frackowiack RSJ, Mazziotta JC, Evans AC(1994) Assessing the significance of focal activations using theirspatial extent. Human Brain Mapping 1:210–220

Friston KJ, Holmes AP, Worsley KJ, Poline JP, Frith CD, FrackowiackRSJ (1995) Statistical parametric maps in functional imaging: ageneral linear approach. Human Brain Mapping 2:189–210

Gunn RN, Sargent PA, Bench CJ, Rabiner EA, Osman S, Pike VW,Hume SP, Grasby PM, Lammertsma AA (1998) Tracer kineticmodeling of the 5-HT1A receptor ligand [carbonyl-11C]WAY-100635 for PET. Neuroimage 8:426–440

Holmes AP, Blair RC, Watson JDG, Ford I (1996) Nonparametricanalysis of statistic images from functional mapping experiments.J Cereb Blood Flow Metab 16:7–22

Leff AP, Scott SK, Crewes H, Hodgson TL, Cowey A, Howard D,Wise RJS (2000) Impaired reading in patients with right hemiano-pia. Ann Neurol 47:171–178

Lemarie PG (1988) Ondelettes a localization exponentielles. J MathPures et Appl 67:227–236

Mallat SG (1989) A theory of multiresolution signal decomposition:the wavelet representation. IEEE Trans Pattern Anal Machine In-tell 11:673–693

Meyer JH, Gunn RN, Myers R, Grasby PM (1999) Assessment ofspatial normalization of PET ligand images using ligand specifictemplates. Neuroimage 9:545–553

Moeller JR, Strother SC, Sidtis JJ, Rottenberg DA (1987) Scaled sub-profile model: a statistical approach to the analysis of functionalpatterns in positron emission tomographic data. J Cereb BloodFlow Metab 7:649–658

Noreen EW (1989) Computer-intensive Methods for Testing Hypoth-eses: an Introduction. New York: Wiley

Pearce SC (1993) Introduction to Fisher (1925) Statistical Methods forResearch Workers. In: Breakthroughs in Statistics, Volume II(Kotz S, Johnson NL, eds), New York: Springer Verlag, p 60



Phelps M, Mazziotta JC, Schelbert H (1986) Positron Emission To-mography and Autoradiography; Principles and Applications forthe Brain and the Heart. New York: Raven Press

Pike VW, McCarron JA, Lammertsma AA, Osman S, Hume SP, Sar-gent PA, Bench CJ, Cliffe IA, Fletcher A, Grasby PM (1996)Exquisite delineation of 5-HT1A receptors in human brain withPET and [carbonyl-11C]WAY-100635. Eur J Pharmacol301:R5–R7

Poline JB, Mazoyer BM (1994) Analysis of individual brain activationmaps using hierarchical description and multiscale detection. IEEETrans Med Imag 13:702–710

Ruttiman UE, Unser M, Thevenaz P, Lee C, Rio D, Hommer DW(1996) Statistical analysis of image differences by wavelet decom-position. In: Wavelets in Medicine and Biology (Aldroubi A, UnserM, eds), Boca Raton, FL: CRC Press, pp 115–144

Sargent PA, Kjaer KH, Bench CJ, Rabiner EA, Messa C, Meyer J,Gunn RN, Grasby PM, Cowen PJ (2000) Brain serotonin1A recep-tor binding measured by PET with [11C]WAY-100635. Arch GenPsychiatry 57:171–178

Turkheimer FE, Brett M, Visvikis D, Cunningham VJ (1999) Mul-

tiresolution analysis of emission tomography images in the waveletdomain. J Cereb Blood Flow Metab 19:1189–1208

Turkheimer FE, Banati RB, Visvikis D, Aston JAD, Gunn RN, Cun-ningham VJ (2000) Modeling dynamic PET-SPECT studies in thewavelet domain. J Cereb Blood Flow Metab 20:879–893

Unser M, Thevenaz P, Lee C, Ruttiman UE (1995) Registration andstatistical analysis of PET images using the wavelet transform.IEEE Eng Med Biol Mag 14:603–611

Westfall PH, Young SS (1993) Resampling-Based Multiple Testing.New York: Wiley

Worsley KJ, Evans AC, Marrett S, Neelin P (1992) A three-dimensional analysis for CBF activation studies in human brain. JCereb Blood Flow Metab 12:900–918

Worsley KJ, Poline JB, Vandal AC, Friston KJ (1995) Tests for dis-tributed, nonfocal brain activations. Neuroimage 2:183–194

Worsley KJ, Marret S, Neelin P, Evans AC (1996) Searching scalespace for activation in PET images. Human Brain Mapping4:74–90

Worsley KJ (1997) Discussion of the paper by Lange and Zeger. ApplStatist 46:25



statistical modeling of positron emission tomography images in wavelet space

Documents