
A Bayesian Multiscale Framework for SPECT

R. D. Nowak*, E. Kolaczyk†, D. Lalush‡, and B. Tsui‡

*Electrical and Computer Engineering, Rice University, Houston, TX
†Mathematics and Statistics, Boston University, Boston, MA
‡Biomedical Engineering, University of North Carolina, Chapel Hill, NC

Abstract

This paper describes a new Bayesian modeling and analysis method for emission computed tomography based on a novel multiscale framework. The class of multiscale priors has the interesting feature that the “non-informative” member yields the traditional maximum likelihood solution; other choices are made to reflect prior belief as to the smoothness of the unknown intensity. Remarkably, this Bayesian multiscale framework admits a novel maximum a posteriori (MAP) reconstruction procedure using an expectation-maximization (EM) algorithm, in which the EM update equations have simple, closed-form expressions. The potential of this new framework is assessed using the Zubal brain phantom and simulated SPECT studies.

I. INTRODUCTION

The objective in emission computed tomography (ECT) is to recover an object (intensity) from indirect Poisson data (counts); that is, Poisson data are collected whose underlying intensity function is indirectly related to an object of interest through a linear system of equations determined by the physics and geometry of the tomographic imaging system. It is well known that maximum likelihood estimation (based on the EM algorithm) produces highly variable ("noisy") reconstructions unless appropriate stopping rules are applied. Another remedy is to employ formal regularization methods (Bayesian or otherwise), but these techniques are often computationally prohibitive. To overcome these shortcomings, in this paper we propose a new approach based on a multiscale random field prior that admits a computationally efficient EM algorithm.

Wavelet and multiscale regularization methods have recently received considerable attention in the signal processing and statistics literatures, e.g., [1, 2]. Most of the techniques developed to date are based on Gaussian noise models, which are not directly applicable to Poisson inverse problems. In the low-SNR cases of greatest practical interest, the data are not well modeled by standard Gaussian approximations to the Poisson likelihood, and hence many existing multiscale regularization methods are simply inappropriate.

*This work was supported by the National Science Foundation, grant no. MIP-9701692, the Office of Naval Research, award no. N00014-99-1-0219, and the Army Research Office, grant no. DAAD19-99-1-0349.

Nowak and Kolaczyk recently introduced a novel Bayesian multiscale framework specifically designed for Poisson inverse problems [3, 4], extending their earlier work on directly observed Poisson data [5, 6]. This framework admits a remarkably simple Bayesian multiscale analysis tool for Poisson data that is analogous to the wavelet-based counterparts used in Gaussian denoising/estimation problems [1, 2]. In addition, it has several other desirable features that are germane to the specific context of ECT problems. First, this framework admits a simple EM algorithm for computing MAP reconstructions. The EM algorithm involves closed-form (analytic) steps at each iteration, making it computationally attractive. Second, under mild regularity assumptions on the multiscale prior density, it can be proven that the EM algorithm converges to a unique, global MAP estimate [4]. Third, the effects of the multiscale prior density (and hyperparameter settings) are easily interpreted, which is important from a user's perspective in applications.

The paper is organized as follows. In Section 2, we discuss the basic ECT problem and introduce some notation. In Section 3, we discuss a multiscale approach to analyzing Poisson data. In Section 4, we introduce a new multiscale intensity prior suitable for ECT. In Section 5, we describe an EM algorithm for computing MAP reconstructions. In Section 6, we apply the new framework to a simulated SPECT study and make concluding remarks.



II. THE ECT PROBLEM

We adopt the standard statistical model for ECT. We observe Poisson distributed data (counts)

$$y_n \sim \mathcal{P}(\mu_n), \quad n = 0, \ldots, N-1, \qquad (1)$$

where $\mathcal{P}(\mu)$ denotes a Poisson distribution with intensity parameter $\mu$. The (unknown) intensities $\mu = \{\mu_n\}_{n=0}^{N-1}$ are related to other (unknown) intensities $\lambda = \{\lambda_m\}_{m=0}^{M-1}$, of primary interest, via $\mu = \mathbf{P}\lambda$, where $\mathbf{P} = \{p_{n,m}\}$ is an $N \times M$ system matrix of known non-negative weights. Photons are emitted (from the emission space) according to the intensity $\lambda$. Those photons emitted from location $m$ are detected (in the detection space) at position $n$ with transition probability $p_{n,m}$. The problem is to reconstruct (or estimate) $\lambda$ from the observed data $y = \{y_n\}_{n=0}^{N-1}$. Throughout this paper, we will assume $M = 2^J$ for some integer $J > 0$, while $N$ can be an arbitrary integer. Since $M$ typically is chosen by the user, while $N$ normally is predetermined by instrumental design constraints, this common condition should present no difficulties.

Foreshadowing our later usage of an EM algorithm, it is natural to introduce the so-called "unobservable" data: the number of photons emitted at location $m$ and detected at location $n$, denoted $z(n,m)$, in which case

$$z(n,m) \sim \mathcal{P}(\lambda_m p_{n,m}). \qquad (2)$$

Hence the indirectly observed (and therefore "incomplete") data $y$ in (1) are given by $y_n = \sum_m z(n,m)$. Additionally, were we able to observe them, the direct emission data for each location $m$ would be given by sums of the form $x_m = \sum_n z(n,m)$, from which it follows that $x_m \sim \mathcal{P}(\lambda_m)$. Given such direct data, we could avoid the inverse problem altogether and simply deal with the issue of estimating a Poisson intensity from direct observations. Of course, this device is precisely what the well-known EM algorithm exploits in producing estimates of $\lambda$ from the indirect data $y$, a fact that will be fundamental to our own approach.

III. MULTISCALE DATA ANALYSIS

In the context of indirectly observed data, through the EM algorithm one is led to consider the complete-data likelihood. As we show below, the complete-data likelihood can be re-parameterized in terms of a multiscale data analysis, yielding a factorized (product-form) likelihood. Let us begin by recalling that the direct (unobserved) emission data are given by $x_m = \sum_n z(n,m)$, $m = 0, \ldots, M-1$ (where $M = 2^J$). Define the multiscale analysis of these data according to:

$$x_{J,m} \equiv x_m, \quad m = 0, \ldots, 2^J - 1,$$
$$x_{j,m} = x_{j+1,2m} + x_{j+1,2m+1}, \quad m = 0, \ldots, 2^j - 1, \; 0 \le j \le J-1. \qquad (3)$$

The index $j$ refers to the resolution of the analysis, $2^j$; $j = J$ is the index for the highest resolution (finest scale), and $j = 0$ corresponds to the lowest resolution (coarsest scale). The multiscale data $\{x_{j,m}\}$ are the (unnormalized) Haar scaling coefficients of $x$. Haar multiscale analysis is especially well-suited to Poisson data for the following reasons. First, the Haar analysis is essentially just summation, and summation is a transformation under which the Poisson distribution reproduces (i.e., the unweighted sum of independent Poisson variates is itself Poisson distributed). Second, analyses with general wavelets result in arbitrary linear combinations of Poisson random variables, for which such nice distributional characteristics do not result.

The multiscale analysis of the (unknown) intensity $\lambda$ is analogous to that defined for $x$:

$$\lambda_{J,m} \equiv \lambda_m, \quad m = 0, \ldots, 2^J - 1,$$
$$\lambda_{j,m} = \lambda_{j+1,2m} + \lambda_{j+1,2m+1}, \quad m = 0, \ldots, 2^j - 1, \; 0 \le j \le J-1. \qquad (4)$$

The parameters $\{\lambda_{j,m}\}$ are the (unnormalized) Haar scaling coefficients of $\lambda$. With these definitions, it can be shown [3-6] that the complete-data likelihood has a simple factorized form:

$$p(z \mid \lambda) \propto P(x_{0,0} \mid \lambda_{0,0}) \prod_{j=0}^{J-1} \prod_{m=0}^{2^j - 1} B(x_{j+1,2m} \mid x_{j,m}, \rho_{j,m}), \qquad (5)$$



where $P(x_{0,0} \mid \lambda_{0,0})$ denotes a Poisson probability mass function in $x_{0,0}$ with intensity $\lambda_{0,0}$, and $B(z \mid n, p) = \binom{n}{z} p^{z} (1-p)^{n-z}$ denotes the binomial distribution with parameters $n$ and $p$. The split parameters $\rho_{j,m} \equiv \lambda_{j+1,2m} / \lambda_{j,m}$ give the fraction of the intensity at scale $j$ and position $m$ attributed to its left child at the next finer scale.
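To make the multiscale analysis concrete, here is a small illustrative sketch (ours, not the authors' code; all function names are hypothetical) that computes the unnormalized Haar scaling coefficients of a count vector by successive pairwise summation, as in (3), together with the empirical split fractions $x_{j+1,2m}/x_{j,m}$ that enter the binomial factors in (5).

```python
import numpy as np

def haar_scaling_coefficients(x):
    """Unnormalized Haar scaling coefficients of a length-2^J count vector.

    Returns a list [x_0, x_1, ..., x_J], where x_j has length 2^j and
    x_j[m] = x_{j+1}[2m] + x_{j+1}[2m+1]  (pairwise summation, eq. (3))."""
    levels = [np.asarray(x, dtype=float)]          # finest scale x_J = x
    while levels[0].size > 1:
        finer = levels[0]
        coarser = finer[0::2] + finer[1::2]        # sum adjacent pairs
        levels.insert(0, coarser)                  # prepend coarser scale
    return levels                                  # levels[j] holds {x_{j,m}}

def empirical_splits(levels):
    """Empirical split fractions x_{j+1,2m} / x_{j,m} for each scale j."""
    splits = []
    for j in range(len(levels) - 1):
        parent, child = levels[j], levels[j + 1]
        with np.errstate(invalid="ignore", divide="ignore"):
            splits.append(np.where(parent > 0, child[0::2] / parent, 0.5))
    return splits

# Example: J = 3, i.e. M = 8 bins of Poisson counts.
counts = np.array([3, 5, 2, 0, 7, 6, 4, 1])
levels = haar_scaling_coefficients(counts)
print(levels[0])                      # coarsest scale: total count x_{0,0} = 28
print(empirical_splits(levels)[0])    # fraction of the total in the left half
```

Because each coefficient is a sum of counts, every $x_{j,m}$ is itself Poisson distributed, which is exactly the reproducing property exploited above.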

IV. MULTISCALE INTENSITY PRIORS

The crucial ingredient in any Bayesian procedure is the selection of a suitable prior. Ideally, the prior reflects known or assumed attributes of the intensity in question and is well matched, in functional form, to the Poisson (or Poisson-binomial) likelihood. Parametric conjugate priors are advantageous for computational reasons, since the posterior distribution is obtained by simply "updating" the parameters of the prior based on the observations; see [3-6] for further information. Moreover, we will see that, in the current setting, conjugate priors can provide very plausible models for the multiscale parameters. The family of gamma densities is conjugate to the Poisson likelihood and the beta family is conjugate to the binomial, and we adopt these priors here.

Begin by placing a gamma density prior on the total intensity parameter:

$$G(\lambda_{0,0} \mid \gamma, \delta) = \frac{\delta^{\gamma}}{\Gamma(\gamma)}\, \lambda_{0,0}^{\gamma - 1}\, e^{-\delta \lambda_{0,0}}, \quad \lambda_{0,0} \ge 0,$$

with $\gamma > 0$ and $\delta > 0$. Next, we model each multiscale split parameter as an independent beta distributed random variable,

$$D(\rho_{j,m} \mid \alpha_j, \beta_j) = \frac{\rho_{j,m}^{\alpha_j - 1} (1 - \rho_{j,m})^{\beta_j - 1}}{B(\alpha_j, \beta_j)}, \quad 0 \le \rho_{j,m} \le 1,$$

where $B(\alpha, \beta)$ denotes the standard beta function. In this paper, we will only use symmetric beta priors of mean 1/2, characterized by $\alpha_j = \beta_j$. Here, as in most related approaches, we do not have the parameters depend on the location $m$, since location-dependent signal characteristics are usually not known a priori.

The prior density for the unknown parameters $\lambda_{0,0}$ and $\rho$ is therefore

$$p(\lambda_{0,0}, \rho) = G(\lambda_{0,0} \mid \gamma, \delta) \prod_{j=0}^{J-1} \prod_{k=0}^{2^j - 1} D(\rho_{j,k} \mid \alpha_j, \beta_j). \qquad (6)$$


The gamma prior on $\lambda_{0,0}$ can be tailored to reflect knowledge of the total intensity of the process under consideration. However, since the total count $x_{0,0}$ is typically quite large, with reasonable settings for the hyperparameters $\gamma$ and $\delta$ the effect of the gamma prior is negligible. More important are the beta priors placed on the splits. Here, the hyperparameters $\{\alpha_j\}$ reflect our belief or prior knowledge regarding the regularity of the intensity. To illustrate briefly, setting $\alpha_j = 1$, $j = 0, \ldots, J-1$, we have uniform (constant) prior densities on the splits, expressing absolute ignorance about the multiscale refinement of the intensity. In this case, the MAP estimates of the splits $\{\rho_{j,m}\}$ coincide with the MLEs [4]. With $\alpha_j \gg 1$, $j = 0, \ldots, J-1$, the beta prior densities are peaked about the point 1/2, favoring more uniform reconstructions. These settings tend to stabilize the estimates in low-count situations, pushing each MAP estimate (of $\rho_{j,m}$) away from the MLE and closer to 1/2 (an even split, indicative of smoothness or regularity in the intensity at that scale and position). Larger settings of $\alpha_j$ tend to produce more smoothing. Hence, our new multiscale priors have very simple and interpretable hyperparameters that can be easily tuned to prior knowledge of smoothness and regularity [4]. In contrast, many Bayesian reconstruction methods based on classical Markov random field models can be much more difficult to interpret [7].
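As a hedged numerical illustration of the role of $\alpha_j$ (our own example, not taken from the paper): with the conjugate beta prior, the posterior mode of a single split has the closed form $(x_{j+1,2m} + \alpha_j - 1)/(x_{j,m} + 2\alpha_j - 2)$, which equals the MLE $x_{j+1,2m}/x_{j,m}$ when $\alpha_j = 1$ and is pulled toward 1/2 as $\alpha_j$ grows.

```python
def map_split(x_left, x_parent, alpha):
    """Posterior mode of a split rho ~ Beta(alpha, alpha), given that x_left
    of the x_parent parent counts fell in the left child (binomial factor (5))."""
    num = x_left + alpha - 1.0
    den = x_parent + 2.0 * alpha - 2.0
    return 0.5 if den <= 0 else num / den

# 7 of 10 counts fall in the left child: the MLE split is 0.7.
for alpha in (1.0, 10.0, 100.0):
    print(alpha, map_split(7, 10, alpha))
# alpha = 1   -> 0.700  (non-informative prior: MAP = MLE)
# alpha = 10  -> 0.571  (pulled toward the even split 1/2)
# alpha = 100 -> 0.510  (strong smoothing)
```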

To finish our discussion, we note that the log complete-data posterior distribution is proportional to

$$\mathcal{L}(\lambda) \propto \log p(z \mid \lambda) + \log p(\lambda), \qquad (7)$$

where $p(\lambda)$ is a prior (e.g., the intensity prior induced by the multiscale prior (6)). Alternatively, in the multiscale re-parameterization the log complete-data posterior distribution can be written as

$$\mathcal{L}(\lambda_{0,0}, \rho) = \log P(x_{0,0} \mid \lambda_{0,0}) + \log G(\lambda_{0,0} \mid \gamma, \delta) + \sum_{j=0}^{J-1} \sum_{m=0}^{2^j - 1} \Big[ \log B(x_{j+1,2m} \mid x_{j,m}, \rho_{j,m}) + \log D(\rho_{j,m} \mid \alpha_j, \beta_j) \Big] + C, \qquad (8)$$


where $C$ is a constant that does not depend on the parameters $(\lambda_{0,0}, \rho)$. So, we have two equivalent expressions for the complete-data log-posterior: (7) in the spatial domain and (8) in the multiscale parameterization. Due to its simple form, maximizing (8) with respect to the splits and total intensity is trivial; one simply differentiates the expression to obtain the MAP estimates [4]; in other words, we have a closed-form M-Step. Given these MAP estimates, a MAP estimate of the intensity $\lambda$ is reconstructed according to the multiscale synthesis equations

$$\hat{\lambda}_{j+1,2m} = \hat{\rho}_{j,m}\, \hat{\lambda}_{j,m}, \qquad \hat{\lambda}_{j+1,2m+1} = (1 - \hat{\rho}_{j,m})\, \hat{\lambda}_{j,m},$$
$$m = 0, \ldots, 2^j - 1, \quad 0 \le j \le J-1. \qquad (9)$$
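The synthesis (9) simply reverses the pairwise-summation analysis: each parent intensity is divided between its two children according to the corresponding split. A minimal sketch (our own naming, with the splits stored per scale, coarsest first):

```python
import numpy as np

def multiscale_synthesis(lambda_00, splits):
    """Reconstruct the fine-scale intensity from the total intensity and the
    per-scale splits, via eq. (9):
        lambda_{j+1,2m}   = rho_{j,m} * lambda_{j,m}
        lambda_{j+1,2m+1} = (1 - rho_{j,m}) * lambda_{j,m}
    `splits[j]` is an array of length 2^j holding {rho_{j,m}}."""
    lam = np.array([lambda_00], dtype=float)
    for rho in splits:
        children = np.empty(2 * lam.size)
        children[0::2] = rho * lam                 # left children
        children[1::2] = (1.0 - rho) * lam         # right children
        lam = children
    return lam                                      # length-2^J intensity

# A 4-bin example: total intensity 100, split 0.6/0.4 at the coarse scale,
# then 0.5/0.5 and 0.25/0.75 at the finer scale.
print(multiscale_synthesis(100.0, [np.array([0.6]),
                                   np.array([0.5, 0.25])]))
# -> [30. 30. 10. 30.]
```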

V. MAP ESTIMATION VIA THE EM ALGORITHM

The EM algorithm proceeds in the following manner. Let $\lambda^{(k)}$ denote the intensity estimate returned on the $k$-th iteration and define

$$z^{(k)}(n,m) = E_{\lambda^{(k)}}\!\left[\, z(n,m) \mid y \,\right]. \qquad (10)$$

E-Step: Compute

$$Q(\lambda, \lambda^{(k)}) = E_{\lambda^{(k)}}\!\left[\, \log p(z \mid \lambda) \mid y \,\right]. \qquad (11)$$

Note that this is just the classical E-Step [8]:

$$z^{(k)}(n,m) = \frac{\lambda_m^{(k)}\, p_{n,m}}{\sum_{m'} p_{n,m'}\, \lambda_{m'}^{(k)}}\; y_n.$$

And so, the functional form of the expected complete-data log-posterior is $Q(\lambda, \lambda^{(k)}) = \log p(z^{(k)} \mid \lambda)$.

M-Step: Maximize the expected complete-data log-posterior, (7), after transforming into the multiscale representation (8). This reduces to a two-step process:

(i.) Generate $x^{(k)}$ from $z^{(k)}$.
(ii.) Calculate $(\hat{\rho}^{(k+1)}, \hat{\lambda}_{0,0}^{(k+1)})$ by maximizing (8), with $x^{(k)}$ in place of $x$. Then $\lambda^{(k+1)}$ is obtained via (9).

This algorithm has several desirable properties. First, as a standard property of the EM algorithm, the posterior probability is non-decreasing as we iterate. Second, it is easily verified that, by construction, the resulting estimate is non-negative. Third, if we set $\alpha_j = 1$, $j = 0, \ldots, J-1$, in which case the beta densities for the $\{\rho_{j,m}\}$ coincide with the uniform density on $[0,1]$ (a non-informative case of our split prior), and set $\gamma = 1$, $\delta = 0$ in the gamma prior on $\lambda_{0,0}$, a limiting form of the gamma density, then we recover the classical MLE method [8].

A remarkable feature of our EM algorithm is its computational simplicity. It is no more demanding than the classical EM algorithm. Most other MAP criteria proposed for this problem do not admit such a simple EM algorithm; usually the maximization step does not have a closed-form expression and must be computed numerically.
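To summarize the recursion in code, the sketch below strings together one full iteration for a 1-D problem: the classical E-step, Haar aggregation of the expected direct data, a closed-form conjugate M-step for the total intensity and the splits, and the synthesis (9). It is our own illustrative implementation under the assumptions stated above (a Gamma(γ, δ) prior on the total, symmetric Beta(α_j, α_j) priors on the splits, and a system matrix whose columns sum to one so that $x_m \sim \mathcal{P}(\lambda_m)$); it reuses the helper functions from the earlier sketches and is not the authors' code.

```python
import numpy as np
# Assumes haar_scaling_coefficients, map_split and multiscale_synthesis
# from the sketches above are in scope.

def em_map_iteration(y, P, lam, gamma_, delta, alphas):
    """One EM iteration of the multiscale MAP reconstruction (1-D sketch).

    y      : observed counts, length N
    P      : N x M system matrix {p_{n,m}} with columns summing to 1
    lam    : current intensity estimate lambda^(k), length M = 2^J
    gamma_, delta : Gamma hyperparameters on the total intensity
    alphas : per-scale Beta hyperparameters [alpha_0, ..., alpha_{J-1}]"""
    # E-step: expected unobservable counts, aggregated into direct data x_m.
    mu = np.maximum(P @ lam, 1e-12)                  # forward projection
    x = lam * (P.T @ (y / mu))                       # x_m = sum_n E[z(n,m)|y]

    # Multiscale analysis of the expected direct data.
    levels = haar_scaling_coefficients(x)            # levels[j] = {x_{j,m}}

    # Closed-form M-step: conjugate posterior modes.
    x00 = levels[0][0]
    lam00 = max(x00 + gamma_ - 1.0, 0.0) / (1.0 + delta)   # Gamma-Poisson mode
    splits = []
    for j, alpha in enumerate(alphas):
        parent, child = levels[j], levels[j + 1]
        splits.append(np.array([map_split(child[2 * m], parent[m], alpha)
                                for m in range(parent.size)]))

    # Synthesis (9): back to the voxel-domain intensity estimate.
    return multiscale_synthesis(lam00, splits)

# Tiny usage example: M = 4 unknowns, N = 6 detector bins.
rng = np.random.default_rng(0)
P = rng.uniform(0.1, 1.0, size=(6, 4))
P = P / P.sum(axis=0, keepdims=True)                 # normalize columns to 1
true_lam = np.array([20.0, 5.0, 5.0, 40.0])
y = rng.poisson(P @ true_lam)

lam = np.full(4, y.sum() / 4.0)                      # flat initial estimate
for _ in range(50):
    lam = em_map_iteration(y, P, lam, gamma_=1.0, delta=0.0, alphas=[2.0, 2.0])
print(lam)
```

With the non-informative settings ($\alpha_j = 1$, $\gamma = 1$, $\delta = 0$) each iteration reduces to the familiar multiplicative ML-EM update, consistent with the remark above.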

VI. BRAIN SPECT EXAMPLE

In this section, we present an example of the application of the multiscale prior to brain SPECT imaging. The dataset used is a realistic simulation of a Tc-99m HMPAO brain perfusion SPECT study using the Zubal brain phantom [9]. Parallel-beam projection data were simulated using an accurate volume-weighted projection model which includes the effects of nonuniform attenuation and 3D detector response for a low-energy, high-resolution collimator. We simulated 128 views over 360°, with a 3 mm bin size and a radius of rotation of 15 cm. Poisson noise was simulated to emulate the count level in a typical patient study, minus scattered photons, yielding approximately 4.6 million counts.

Reconstructions were performed by incorporating the multiscale prior into the RBI-MAP algorithm [10], an iterative maximization routine that uses the ordered-subsets principle [11] to speed up convergence. Fifty iterations of the algorithm were applied. Nonuniform attenuation and 3D detector response were modeled using a rotation-based projector, different from the simulation projector. The 3D multiscale analysis was computed by successively aggregating neighboring voxels along the x-, then y-, and then z-directions. The multiscale analysis was only



carried to the scale at which each direction had been collapsed three times, generating macro-voxels of size 8 × 8 × 8, because coarser scales were found to have little effect on the final result. In implementing the prior term in the RBI-MAP algorithm, we require the derivative of the log prior in terms of the voxel intensities, rather than the splits. As shown in [4], the log prior in this form, ignoring the constraint on total intensity, is:

$$\log h(\lambda) = \sum_{j=0}^{J-1} \sum_{m=0}^{2^j - 1} (\alpha_j - 1) \left[ \log \lambda_{j+1,2m} + \log \lambda_{j+1,2m+1} - 2 \log \lambda_{j,m} \right] + \text{const.}$$

The derivative with respect to the voxel intensity $\lambda_{J,i}$ at the finest (full resolution) scale $J$ is then:

$$\frac{\partial}{\partial \lambda_{J,i}} \log h(\lambda) = \sum_{j=0}^{J-1} (\alpha_j - 1) \left[ \frac{1}{\lambda_{j+1,\, m_{j+1,i}}} - \frac{2}{\lambda_{j,\, m_{j,i}}} \right],$$

where $m_{j,i}$ denotes the index of the ancestor of voxel $i$ at scale $j$ (with the 0-based indexing used here, $m_{j,i} = \lfloor i/2^{J-j} \rfloor$). We then reparameterized the prior, replacing the hyperparameters $(\alpha_j - 1)$ with the product of a global prior weighting parameter, $W$, and an individual weight applied at each scale, $w_j$. In our experiment, all $w_j$ were equal to one and the global weighting parameter $W$ was varied.
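For the 3-D case, the successive aggregation along the x-, y-, and z-directions described above amounts to simple reshape-and-sum bookkeeping. The following sketch (ours, not the RBI-MAP implementation) illustrates it for a hypothetical 64³ volume carried to 8 × 8 × 8 macro-voxels.

```python
import numpy as np

def aggregate_xyz(volume, levels=3):
    """Successively aggregate neighboring voxels along the x-, then y-,
    then z-direction, cycling through the axes `levels` times.  Each
    aggregation halves one dimension; three full cycles turn unit voxels
    into 8 x 8 x 8 macro-voxels.  Returns every intermediate scale,
    finest first."""
    scales = [np.asarray(volume, dtype=float)]
    for _ in range(levels):
        for axis in (0, 1, 2):                       # x, then y, then z
            v = scales[-1]
            shape = list(v.shape)
            shape[axis] //= 2
            shape.insert(axis + 1, 2)                # split the axis into adjacent pairs
            scales.append(v.reshape(shape).sum(axis=axis + 1))
    return scales

vol = np.ones((64, 64, 64))
scales = aggregate_xyz(vol, levels=3)
print(scales[0].shape, scales[1].shape, scales[-1].shape)
# (64, 64, 64) (32, 64, 64) (8, 8, 8)
print(scales[-1][0, 0, 0])   # 512.0: each macro-voxel sums 8*8*8 unit voxels
```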

Figures 1(a) and (b) present transverse slices from the phantom for two different levels of global weighting in the multiscale MAP approach, in comparison with the true image and the reconstruction with no prior (W = 0). The MAP images demonstrate significant noise reduction compared to the unsmoothed reconstructions. Ongoing efforts are focused on comparing the performance of the new multiscale framework with classical reconstruction methods and on investigating joint reconstruction and segmentation methods for attenuation correction/compensation.

Figure 1: Two transverse slices, (a) and (b), of the brain phantom for various reconstructions of simulated SPECT data. The first columns of (a) and (b) include the phantom and the unsmoothed case (W = 0). The second columns of (a) and (b) show multiscale reconstruction results at different levels of global weighting.

VII. REFERENCES

[1] D. Donoho and I. Johnstone, "Ideal spatial adaptation by wavelet shrinkage," Biometrika, vol. 81, pp. 425-455, 1994.

[2] E. Kolaczyk, "A wavelet shrinkage approach to tomographic image reconstruction," J. Amer. Statist. Assoc., vol. 91, pp. 1079-1090, 1996.

[3] R. Nowak and E. Kolaczyk, "A multiscale MAP estimation method for Poisson inverse problems," in Proc. 32nd Asilomar Conf. Signals, Systems, and Computers, Pacific Grove, CA, pp. 1682-1686, 1998.

[4] R. Nowak and E. Kolaczyk, "A Bayesian multiscale framework for Poisson inverse problems," submitted to IEEE Trans. Info. Theory, 1999.

[5] K. Timmermann and R. Nowak, "Multiscale modeling and estimation of Poisson processes with application to photon-limited imaging," IEEE Trans. Info. Theory, vol. 45, no. 3, pp. 846-862, April 1999.

[6] E. Kolaczyk, "Bayesian multi-scale models for Poisson processes," J. Amer. Statist. Assoc., vol. 94, pp. 920-933, 1999.

[7] D. S. Lalush and B. M. W. Tsui, "Simulation evaluation of Gibbs prior distributions for use in maximum a posteriori SPECT reconstructions," IEEE Trans. Med. Imag., pp. 261-275, 1992.

[8] Y. Vardi, L. A. Shepp, and L. Kaufman, "A statistical model for positron emission tomography," J. Amer. Statist. Assoc., vol. 80, pp. 8-37, 1985.

[9] I. G. Zubal, C. R. Harrell, E. O. Smith, Z. Rattner, G. R. Gindi, and P. B. Hoffer, "Computerized three-dimensional segmented human anatomy," Med. Phys., vol. 21, pp. 299-302, 1994.

[10] D. S. Lalush and B. M. W. Tsui, "Block-iterative techniques for fast 4D reconstruction using a priori motion models in gated cardiac SPECT," Phys. Med. Biol., vol. 43, pp. 875-887, 1998.

[11] H. M. Hudson and R. S. Larkin, "Accelerated image reconstruction using ordered subsets of projection data," IEEE Trans. Med. Imag., pp. 601-609, 1994.
