
Fast Reconstruction in Magnetic Particle Imaging

J Lampe 1, C Bassoy 2, J Rahmer 3, J Weizenecker 4, H Voss 1, B Gleich 3 and J Borgert 3

1 Institute of Numerical Simulation, Hamburg University of Technology, D-21071 Hamburg, Germany
2 Institute of Computer Technology, Hamburg University of Technology, D-21071 Hamburg, Germany
3 Philips Research Laboratories, D-22335 Hamburg, Germany
4 University of Applied Sciences, D-76133 Karlsruhe, Germany

E-mail: [email protected], [email protected]

    Abstract. Magnetic particle imaging (MPI) is a new tomographic imaging method

    which is able to capture fast dynamic behavior of magnetic tracer material.

    From measured induced signals, the unknown magnetic particle concentration is

    reconstructed using a system function, which has been determined beforehand

    and describes the relation between particle position and signal response. After

    discretization, the system function is represented by a matrix, whose size can prohibit

    the use of direct solvers for matrix inversion to reconstruct the image. In this

    paper, we present a new reconstruction approach, which combines efficient compression

    techniques and iterative reconstruction solvers. The data compression is based on

    orthogonal transforms, which extract the most relevant information from the system

    function matrix by thresholding, such that any iterative solver is strongly accelerated.

    The effect of the compression with respect to memory requirements, computational

    complexity and image quality is investigated. With the proposed method, it is possible

    to achieve real-time reconstruction with almost no loss in image quality using measured

    4D MPI data.

    Keywords : Medical imaging, system matrix, data compression, inverse problem

    PACS numbers: 87.85.Rs, 87.57.-s, 07.05.Kf, 02.30.Zz, 02.60.Dc

    Submitted to: Phys. Med. Biol.


    1. Introduction

    Magnetic particle imaging (MPI) is a rather new tomographic technique for imaging

    the distribution and concentration of magnetic tracer material with a high spatial and

    temporal resolution (Gleich and Weizenecker 2005). For the generation of a signal, the

    method exploits the non-linear magnetization response of particles in the presence of

    an oscillating magnetic drive-field. The magnetization response induces a voltage in

    the receive coils, i.e., the MPI signal. The spectrum of the signal typically contains a

    large number of higher harmonics of the drive frequency. For 2D or 3D MPI, orthogonal

    drive fields are applied at slightly different frequencies, which produces not only (pure)

    higher harmonics, but also signal at a wealth of beat frequencies. In order to localize the

    particles, a static inhomogeneous magnetic selection field is superimposed. This limits

    the particle response to a small region, denoted as the field free point (FFP). With the

    help of the drive fields, the FFP can be driven through a 2D or 3D domain of interest,

    denoted as the field of view (FOV), see (Gleich and Weizenecker 2005, Gleich et al 2008,

    Weizenecker et al 2009) for a detailed description.

For the image reconstruction in MPI, a system function is needed that maps a given

    particle distribution in the FOV to the measured frequency response, cf. (Weizenecker et

    al 2007, Rahmer et al 2009). Assuming a linear relation between the induced signal and

    the concentration of the magnetic particles, the image reconstruction turns into solving

    an inverse problem. After suitable discretization, the reconstruction process results in

    a least squares problem with a large system matrix, cf. (Knopp et al 2010).

    The system function can either be described by a mathematical model or it can

    be obtained experimentally from measured data. An approach to find a suitable model

    function can be found in (Knopp et al 2010a). But although a mathematical model

is highly desirable, the model-based reconstruction faces difficulties for 2D and 3D

    problems, mainly due to the lack of a realistic particle model. In this paper we consider

    a measured system matrix which is obtained by placing small, point-like samples of the

    magnetic tracer material at a grid of points in space within the FOV. When performing

    the measurement procedure with the FFP traveling on the predefined trajectory (which

    typically is a Lissajous curve) (Weizenecker et al 2009), the corresponding column of

    the system matrix is obtained by the FFT of the recorded signal.

    In this paper, we remain with the classical choice of a Lissajous trajectory. A

    Lissajous trajectory for a 2D FOV is obtained by two orthogonal drive fields with

    slightly different frequencies. The ratio of the frequencies (assumed to be a rational

fraction) determines the time for one closed Lissajous curve of the FFP, as well as how finely the trajectory covers the FOV. The trajectory can be extended

    straightforwardly to cover a 3D FOV by using three orthogonal drive fields.

    Different trajectories for the FFP have been investigated in (Knopp et al 2009). A

    sensitivity analysis has been accomplished in (Weizenecker et al 2007) and the use of a

    field free line instead of a FFP was proposed in (Weizenecker et al 2008). Experimental

    results on rapid 2D MPI can be found in (Gleich et al 2008), 3D real-time in vivo


    measurements have been reported in (Weizenecker et al 2009). Some results on the

    solution of the linear system by iterative methods can be found in (Knopp et al 2010).

    In this paper, we present a new method which combines iterative reconstruction

    methods of low computational complexity (i.e., of the order of matrix-vector

    multiplications) with efficient data compression techniques. Note that a factorization

of the system matrix is much more expensive and should be avoided in any case. The

    compression is based on orthogonal transformations and is used for extracting the most

    relevant parts from the large system matrix. The effect of the compression with respect

    to computational complexity and image quality is investigated. Using the presented

method on current MPI data, a real-time reconstruction with no significant change in image quality has been achieved.

    This paper is organized as follows. In section 2, we give a short derivation of

    the MPI reconstruction problem and introduce some direct and iterative solvers. In

    section 3, the properties of the rows of the system matrix are investigated more closely.

    The theory for the data compression of the rows is presented, as well as how to use a

    condensed system matrix when solving the reconstruction problem by iterative methods.

    The efficiency of the new method is demonstrated on two large 3D MPI examples in

    section 4. Conclusions can be found in section 5.

    2. System Matrix and Reconstruction Problem

The MPI imaging (or reconstruction) problem is an inverse problem. Let us denote x_n as the particle concentration at the spatial position r_n within the FOV, n = 1, ..., N, and let b_m denote the measured signal at the frequency f_m, m = 1, ..., M̃. Thus the following linear mapping is obtained:

Ãx = b,  with x := (x_n(r_n))_{n=1}^{N} and b := (b_m(f_m))_{m=1}^{M̃},    (2.1)

where the system matrix is defined as

Ã := (s(f_m, r_n))_{m=1:M̃, n=1:N}

with the system function s(f_m, r_n) introduced in (Gleich et al 2005). The number of columns of Ã is equal to the number of points we are interested in, i.e. N = n_p × n_q × n_r, where n_p, n_q, n_r are the numbers of voxels along the three dimensions of the image. The closer these points are chosen, the higher the resolution of the reconstructed image. Thus, the dimension of the system matrix Ã is determined by the spatial resolution and the size of the FOV (determines the number of columns N), and by the recording time and sampling rate of the induced measured signal of a single recording coil (determines the number of rows M̃). For 3D imaging, 3 orthogonal receive coils are used, each yielding a full system function matrix. The matrices can be combined into a single matrix with three times the number of frequency components.

    To be more specific, the MPI data is obtained by performing a one-dimensional fast

    Fourier transform (FFT) of the recorded time signal. The transformed vector still has


length M̃ but is now complex. To keep the computations in real arithmetic, real and imaginary part of the transformed vectors are stored separately:‡

Ã_x^FFT, Ã_y^FFT, Ã_z^FFT ∈ C^{M̃×N}  −−(separate ℜ and ℑ)−→  A_x^FFT, A_y^FFT, A_z^FFT ∈ R^{2M̃×N}.

By stacking the responses of the three recording coils, the columns of A now have the length M = 3 · 2M̃:

A = [A_x^FFT; A_y^FFT; A_z^FFT] ∈ R^{M×N}.    (2.2)
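As a concrete illustration of this separation and stacking, here is a minimal numpy sketch; the matrix sizes and random entries are placeholders, not the dimensions or data of a real scanner:

```python
import numpy as np

# Hypothetical complex FFT responses of the three receive coils, each of size M~ x N
M_tilde, N = 500, 200
rng = np.random.default_rng(0)
coils = [rng.standard_normal((M_tilde, N)) + 1j * rng.standard_normal((M_tilde, N))
         for _ in range(3)]  # x, y, z coil

def realify(A_complex):
    """Separate real and imaginary parts: C^(M~ x N) -> R^(2M~ x N)."""
    return np.vstack([A_complex.real, A_complex.imag])

# Stack the three coil responses as in (2.2): A in R^(M x N) with M = 3 * 2 * M~
A = np.vstack([realify(Ac) for Ac in coils])
print(A.shape)  # (3000, 200)
```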

Usually it holds that M > N, which yields the overdetermined linear system Ax ≈ b. The right-hand side b is obtained by the measurement (i.e., the Lissajous trajectory

    of the FFP) of an unknown object, which is similar to acquiring a column of A. The

    columns of A have been obtained by performing the same measurement procedure, after

    placing a small point-like sample at predefined grid positions (which are usually reached

    by using a robot). However, the difference is that the columns of A only have to be

determined once for a fixed system setup, i.e., when using the same particles and system

    parameters. So the system matrix can be acquired far in advance, independently of the

    measurement of an unknown object for obtaining the right-hand side b.

Since the underlying problem A_true x = b_true is a discrete ill-posed problem (see (Hansen 1998) for details on ill-posed problems), the condition number of A_true is large. Hence we are in need of some regularization technique to solve the least squares problem

min_x ‖Ax − b‖_2^2.    (2.3)

    In subsection 2.1, we briefly describe a direct method and compare the computational

    complexity to iterative methods that are discussed in subsection 2.2. Remarks on the

    effect of measurement noise can be found in subsection 2.3.

    2.1. Reconstruction via direct methods

By using the standard form Tikhonov regularization (Phillips 1962, Tikhonov 1963), the following minimization problem is obtained:

min_x ‖Ax − b‖_2^2 + α‖x‖_2^2.

Let us assume we are given a suitable value of the regularization parameter α > 0; then the solution can be obtained from the normal equations

(A^T A + αI) x = A^T b.

‡ Notice that by considering real and imaginary part of all frequencies separately, the meaning of the system matrix slightly changes, which results in slightly different reconstructed images.

A straightforward approach is to compute the Cholesky factorization C^T C = A^T A + αI, with an upper triangular matrix C = chol(A^T A + αI). Once the factorization has been accomplished, the solution is easily obtained by solving the triangular linear systems C^T y = A^T b for y and then Cx = y for the solution x. Solving a triangular system only has a computational complexity of O(N^2). However, the Cholesky factorization is of cubic computational complexity with

F_Chol = N^2 M + (1/3) N^3    (2.4)

floating point operations (flops), i.e., N^2 M flops for setting up the normal equations (considering the symmetry of A^T A) and (1/3)N^3 flops for computing C. This is highly expensive for large system matrices. A cubic complexity holds true for all direct methods, which makes direct inversion of the system matrix highly expensive. However, the inversion would only have to be computed once; only if system parameters or particle properties change would it have to be recomputed. Thus in normal operation, the inverted matrix could be reused for all measurements. But it turned out that for the investigated examples even a precomputed factorization of A is not sufficient to reach real-time reconstruction.
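For reference, the direct Tikhonov solve via the normal equations and a Cholesky factorization sketched above fits in a few lines of numpy/scipy; the matrix sizes and the value of α below are arbitrary illustration choices, not taken from the paper:

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

def tikhonov_cholesky(A, b, alpha):
    """Solve (A^T A + alpha*I) x = A^T b via a Cholesky factorization."""
    N = A.shape[1]
    G = A.T @ A + alpha * np.eye(N)      # ~N^2*M flops for the normal equations
    factor = cho_factor(G)               # ~N^3/3 flops; reusable across measurements
    return cho_solve(factor, A.T @ b)    # two triangular solves, O(N^2)

rng = np.random.default_rng(1)
A = rng.standard_normal((200, 50))
x_true = rng.standard_normal(50)
b = A @ x_true + 0.01 * rng.standard_normal(200)
x = tikhonov_cholesky(A, b, alpha=1e-6)
```

With the small α used here the result is close to the ordinary least squares solution; increasing α trades data fit for a smaller ‖x‖_2.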

    2.2. Reconstruction via iterative methods

    In this subsection, we briefly introduce two well known iterative methods: The Algebraic

    Reconstruction Technique (ART) and the Conjugate Gradient Least Squares (CGLS)

    approach. The main advantage of iterative methods is that they rely on matrix-vector

multiplications, thus only have quadratic complexity and avoid the costly O(N^3) operations of direct methods. In both iterative methods, the number of iterations serves

    as a regularization parameter.

    ART:

The ART method is also known as the Kaczmarz method (Kaczmarz 1937). In one iteration step, a sweep through the rows of A is performed to update the approximate solution vector, starting from x_0:

x_i = x_{i−1} + ((b(i) − a_i^T x_{i−1}) / ‖a_i‖_2^2) · a_i,  for i = 1, ..., M,    (2.5)

where the final vector x_M is the new iterate. b(i) is the ith component of the right-hand side b and a_i is the ith row vector of A turned into a column vector, i.e., a_i = A(i, :)^T. One inner iteration within one sweep can be interpreted as the orthogonal projection of x_{i−1} onto the hyperplane H_i = {x | a_i^T x = b_i}, which then gives x_i. The method has a low complexity: The preparation costs before the first sweep are 2MN flops for computing all M row norms ‖a_i‖_2. The cost of one sweep is 3MN flops (note that one matrix-vector multiplication with A is 2MN flops).

    The ART method is known for its application in computed tomography, where

    experience shows that it typically converges very quickly in the first iterations.

    Nevertheless, the analytical speed of convergence can be arbitrarily slow, see (Elsner

    et al 1991, Hanke and Niethammer 1990) for results on convergence properties.


Furthermore, the classical Kaczmarz method (2.5) is not able to solve the least squares problem (2.3) if the underlying linear system (2.1) is not consistent. As shown in (Saunders 1995), a remedy is to use a modified version, which is denoted as regularized ART (RART) and is based on the extended system

[A  αI] [x; 0] = b.    (2.6)

The idea is now to apply the Kaczmarz method analogously to the extended system, but through the parameter α > 0 a certain amount of regularization can be imposed. Notice that a regularizing effect is also present just by the number of iterations that are performed.

The regularized ART has a computational complexity of

F_RART = 2M · N + 3k · M · N    (2.7)

floating point operations, with k as the number of sweeps.

The (R)ART method furthermore allows the inclusion of local regularization and inequality constraints in the iteration (2.5), cf. (Dax 1993). An additional non-negativity constraint can be useful for reconstructing particle concentrations in MPI.
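A plain-numpy sketch of the classical Kaczmarz sweep (2.5) — without the regularization extension (2.6) — might look as follows; the matrix sizes and number of sweeps are illustrative only:

```python
import numpy as np

def art_sweeps(A, b, num_sweeps):
    """Classical Kaczmarz / ART: sweep over all rows, projecting onto each hyperplane."""
    M, N = A.shape
    x = np.zeros(N)
    row_norms_sq = np.einsum('ij,ij->i', A, A)   # all ||a_i||_2^2; 2MN flops, done once
    for _ in range(num_sweeps):
        for i in range(M):                        # one sweep costs ~3MN flops
            x += (b[i] - A[i] @ x) / row_norms_sq[i] * A[i]
    return x

rng = np.random.default_rng(2)
A = rng.standard_normal((80, 40))
b = A @ rng.standard_normal(40)                   # consistent toy system
x = art_sweeps(A, b, num_sweeps=300)
```

On this consistent, well-conditioned toy system the sweeps drive the residual ‖Ax − b‖_2 toward zero; on inconsistent data the regularized variant (2.6) would be needed instead.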

    CGLS and LSQR:

If the system matrix A is symmetric positive definite, the Conjugate Gradient method (CG) can be applied directly to the consistent linear system Ax = b, i.e., it holds that b − A x_CG = 0 with x_CG = A^{−1} b. For solving the least squares problem (2.3) with a non-square matrix A ∈ R^{M×N}, the Conjugate Gradient method can be applied to the normal equations

A^T A x = A^T b.

This method is denoted as the CGNR method and was introduced in (Hestenes and Stiefel 1952). The 'R' indicates that the residual r = b − Ax is minimized on an ascending sequence of Krylov subspaces. A Krylov subspace K_k(B, c) is defined as

K_k(B, c) = span{c, Bc, B^2 c, ..., B^{k−1} c}

with a square matrix B ∈ R^{N×N} and a vector c ∈ R^N. For the iterates x_k of CGNR it holds that

x_k ∈ S_k := x_0 + K_k(A^T A, A^T r_0).

With the typical choice for the initial approximation x_0 = 0 it holds that

x_k ∈ K_k(A^T A, A^T b) = span{A^T b, (A^T A) A^T b, (A^T A)^2 A^T b, ..., (A^T A)^{k−1} A^T b}.

In each iteration step the following problem is solved:

min_x ‖Ax − b‖_2^2  subject to x ∈ K_k,

hence the residual r_k = b − A x_k is monotonically decreasing, i.e.,

‖A x_k − b‖_2^2 ≤ ‖A x_{k−1} − b‖_2^2  for k = 1, 2, ....

Well-known implementations of the CGNR method are the CGLS and LSQR algorithms (Paige and Saunders 1982), which are identical in exact arithmetic. The cost of one iteration is 2 matrix-vector multiplications and a few vector-vector operations, i.e. roughly 4MN flops. Like in the ART method, only a few iterations (e.g. 4-10 iterations) are needed to obtain good approximate solutions. Thus the LSQR and CGLS methods have a computational complexity of

F_CGLS,LSQR = 4k · M · N    (2.8)

floating point operations, with k as the number of iterations. The speed of convergence

    depends on the spectrum of A (i.e., the set of its eigenvalues) and can be accelerated

    by suitable preconditioning methods: Splitting methods like Gauß-Seidel or Symmetric

    Successive Over Relaxation (SSOR), incomplete LU or incomplete LQ decompositions,

    cf. (Saad 2003) for details.
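The monotone residual decrease and the role of the iteration count as regularization parameter can be observed directly with SciPy's LSQR implementation; sizes and noise level below are arbitrary toy choices:

```python
import numpy as np
from scipy.sparse.linalg import lsqr

rng = np.random.default_rng(3)
A = rng.standard_normal((120, 60))
b = A @ rng.standard_normal(60) + 0.01 * rng.standard_normal(120)

# each LSQR iteration costs ~2 matrix-vector products, i.e. roughly 4MN flops
residuals = []
for k in (2, 4, 8):
    x_k = lsqr(A, b, iter_lim=k)[0]           # stop after at most k iterations
    residuals.append(np.linalg.norm(A @ x_k - b))

# the residual norm is monotonically non-increasing in the iteration count
assert residuals[0] >= residuals[1] >= residuals[2]
```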

    2.3. Considering Measurement Noise

A short note on the measurement noise. The underlying linear system A_true x = b_true from (2.1) is assumed to hold with equality. But due to measurement noise E and r, we face a perturbed linear system

Ax ≈ b

with the measured system matrix A = A_true + E and the object measurement b = b_true + r. Let us assume the noise of a row of A to be similar to Gaussian white noise, i.e. to be independent and identically distributed, with zero mean. This means that the covariance matrix of the errors of all rows of E is equal to a multiple of the identity matrix. This noise structure does not change under orthogonal transformations. And this is also roughly true for the more realistic signal-correlated noise e_i = e · diag(a_i), with e_i = E(i, :), a_i = A_true(i, :) and the white noise row vector e. Both properties are shown in the book of Hansen (Hansen 2010).

    Since the right-hand side b is obtained by the same measurements as the columns

    of A (i.e. the Lissajous trajectory of the FFP), it can be assumed that for the enlarged

    matrix (A, b) = (Atrue, btrue) + (E, r) the noise in the rows is again similar to Gaussian

    white noise. This means that the covariance matrix of the errors of all rows of (E, r)

    is also equal to a multiple of the identity matrix. A method taking into account the

    error of the system matrix A as well as the error of the right-hand side b, is the method

of Total Least Squares. Two fundamental books about this topic were written by Van

    Huffel (Van Huffel and Vandewalle 1991, Van Huffel and Lemmerling 2002). When

    furthermore regularization is necessary for ill-conditioned problems, efficient methods

    proposed in (Lampe and Voss 2008, Lampe and Voss 2008a) should be applied.


    In this paper we are considering measurement errors in the right-hand side b only,

    since the corresponding algorithms are much simpler.

    3. Data Compression of the System Matrix

    In this section it is explained in detail how the data of the system matrix can be

    compressed such that any iterative solver (e.g. from subsection 2.2) is accelerated

    significantly and the MPI reconstruction problem can be solved very fast.

    In subsection 3.1 the structure of the rows of A is analyzed. This insight is used

    in subsection 3.2 to apply a suitable orthogonal transformation. It is also shown how

    the efficient computation of this transformation can be carried out. In subsection 3.3

    the data compression is introduced, and in subsection 3.4, the effect of the compression

    with respect to the computational complexity of iterative reconstruction methods is

    investigated.

    3.1. Properties of the System Matrix

    The system function matrix relates the spatial response pattern of a particle distribution

    to the frequency components. A row of the system matrix can be visualized as a

    grayscale image with the dimensions of the corresponding FOV: For a 2D FOV this

    is just a 2D image, whereas for a 3D FOV this is a grayscale cuboid.

    In (Rahmer et al 2009) a model function for the MPI system function has been

    investigated with the aim to analytically describe the spatial variations occurring at

    different harmonics of the excitation frequency. For one-dimensional excitation of

    Langevin particles, the spatial patterns can be described by Chebyshev polynomials

of the second kind§. For 2D or 3D Lissajous trajectory excitation, the spatial patterns can be approximated by tensor products of the Chebyshev polynomials, but no exact

    two- or three-dimensional analytical description of the measured system function has

    been found. Thus, in this paper, we reconstruct images using a measured system

    function matrix. However, we will consider these polynomials as suitable orthogonal

    basis functions for compressing the data, which has also been used in audio and video

    compression in (Ishwar and Meher 2008). Another possible approximation of the system

function is given by cosine or wavelet polynomials, which are very extensively used in different coding standards such as JPEG and JPEG 2000, cf. (Kuesters 1995).

In Figure 1, a row of the system matrix from the example in subsection 4.2 is visualized. The size of the system matrix A is 11386 × 10816, with the number of columns corresponding to a 3D FOV of n_p · n_q · n_r = 10816 voxels with n_p = 26, n_q = 16, n_r = 26. Figure 1 left illustrates the 3D spatial pattern of the row entries shown in Figure 1 right, of the frequency f = a · f_x + b · f_y + c · f_z ≈ 205 kHz, where a = −1, b = 6, c = 3 and with f_x, f_y, f_z as the respective drive field frequencies f_x = 2.5 MHz/99 ≈ 25.25 kHz, f_y = 2.5 MHz/96 ≈ 26.04 kHz and f_z = 2.5 MHz/102 ≈ 24.51 kHz.

§ These polynomials describe the system function as a continuous function in space, s_m(r) := s(f_m, r), at a fixed frequency value f_m.

Figure 1. Left: 3D spatial isosurface corresponding to row number 3430. Right: 1D visualization of the row (entries of row no. 3430 plotted against the column index). The row represents the frequency f ≈ 205 kHz.

Hence the frequency f belongs to the interval f ∈ 8 · [f_z, f_y], which can be interpreted as the frequency band of the 8th higher harmonic. This frequency band consists of multiples of the drive field frequencies together with all beat frequencies that fulfill a + b + c = 8. Note that the 3D visualization is an isosurface plot, where a threshold value of about half of the maximal value of the row entries has been chosen, such that only the surface corresponding to this value is shown.

This grayscale cuboid contains only very few "blobs", and its corresponding one-dimensional plot, Figure 1 right, seems to consist mainly of a sum of a few trigonometric basis functions. When leaving the interval bands around higher harmonics, i.e., typically by choosing coefficients a, b, c ∈ Z of large absolute values, one finds an increasing number of oscillations, which is visualized in three dimensions as many more "blobs". The further away the corresponding frequency is from the current higher harmonic band, the more high-frequency information can be encountered in the grayscale cuboid. In between two harmonic frequency bands, the corresponding row has a very noisy behavior, which indicates little information content.

Similar to lossy compression techniques in image or video coding, we approximate

    the original data by neglecting frequencies of the transformed data which have only little

    contribution to the signal. This has the effect of smoothing the image or even filtering

    out noisy data and is discussed in the following subsections.

    3.2. Orthogonal transformation

By applying an orthogonal transformation to all rows of the linear system (2.1), it is transformed into another domain. The transformation can be expressed by (3.1), in which DT denotes any kind of orthogonal discrete transformation:

Ax = b  −−DT−→  A^DT x^DT = b^DT = b.    (3.1)


The equality b^DT = b holds because the transformation is applied row-wise to the column vector b, i.e., it is applied to each element of the vector b:

b(m)  −−DT−→  b^DT(m),  for m = 1, ..., M.

And since an orthogonal transformation of a scalar value is just the same scalar value, it holds that b^DT(m) = b(m) for all m, and thus b^DT = b. Note that the solution x^DT of the linear system (3.1) is not directly the solution of (2.1), but has to be recovered by

x^DT  −−iDT−→  x,

where iDT denotes the inverse of the orthogonal transformation DT.
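To make the recovery step concrete: if each row of A is transformed with an orthonormal DCT, solving the transformed system and applying the inverse DCT to x^DT returns the original solution. A small numpy/scipy sketch with random toy data, where a dense least squares solve stands in for any iterative solver:

```python
import numpy as np
from scipy.fft import dct, idct

rng = np.random.default_rng(4)
M, N = 60, 40
A = rng.standard_normal((M, N))
x_true = rng.standard_normal(N)
b = A @ x_true                                  # consistent toy system

# apply an orthonormal DCT to every row of A; b stays unchanged (scalars per row)
A_dt = dct(A, axis=1, norm='ortho')

# solve A^DT x^DT = b, then recover x via the inverse transform
x_dt = np.linalg.lstsq(A_dt, b, rcond=None)[0]
x = idct(x_dt, norm='ortho')
assert np.allclose(x, x_true)
```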

A general expression for a one-dimensional discrete transformation is given by

X̂(u) = α(u) Σ_{n=0}^{N−1} c(u, n) X(n),    (3.2)

with 0 ≤ u ≤ U − 1 and α(u) being scaling factors, i.e., N values of the original space are transformed into U values of the transformed space. Thereby c(u, n) denotes the kernel of the transformation and, together with the scaling factors, determines the transformation type. Here X(n) denotes a row of the system matrix and X̂(u) denotes the transformed row. Let us consider the kernel functions for some well-known transformations:

• The kernel of a discrete one-dimensional cosine transform (DCT) is given by c_cos(u, n) = cos[π(2n + 1)u/(2N)], with the scaling factors α(u) = 1/√N for u = 0 and α(u) = √(2/N) for u > 0.

• The kernel of a discrete one-dimensional Fourier transform (DFT) is given by c_exp(u, n) = exp(2πi·nu/N), with the scaling factors α(u) = 1/√N.

• When considering the (normalized) discrete one-dimensional Chebyshev transform (DTT), the kernel can be expressed recursively by

c_cheb(u, n) = (a_1 n + a_2) c_cheb(u−1, n) + a_3 c_cheb(u−2, n)

with

c_cheb(0, n) = 1/√N  and  c_cheb(1, n) = (2n + 1 − N) √(3 / (N(N^2 − 1))),

where

a_1 = (2/u) √((4u^2 − 1)/(N^2 − u^2)),
a_2 = ((1 − N)/u) √((4u^2 − 1)/(N^2 − u^2)),
a_3 = ((1 − u)/u) · ((2u + 1)/(2u − 3)) · √((N^2 − (u − 1)^2)/(N^2 − u^2)),

and the scaling factors α(u) = 1. The resulting polynomials are just the orthogonal Chebyshev polynomials of the second kind, additionally normalized to one. Thus, the polynomials c_cheb(u, n), u = 0, 1, ..., are orthonormal with respect to the inner product ⟨f, g⟩ := ∫_{−1}^{1} f(n) · g(n) √(1 − n^2) dn.
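The DCT kernel, for instance, can be assembled into a kernel matrix C and checked against the defining orthogonality property; the size N = 32 below is an arbitrary choice:

```python
import numpy as np

def dct_kernel(N):
    """C[u, n] = alpha(u) * cos(pi * (2n + 1) * u / (2N)) -- the DCT kernel from the text."""
    u = np.arange(N)[:, None]
    n = np.arange(N)[None, :]
    alpha = np.full(N, np.sqrt(2.0 / N))
    alpha[0] = 1.0 / np.sqrt(N)
    return alpha[:, None] * np.cos(np.pi * (2 * n + 1) * u / (2 * N))

C = dct_kernel(32)
assert np.allclose(C @ C.T, np.eye(32))   # orthogonality: the inverse transform is C^T

x = np.random.default_rng(5).standard_normal(32)
assert np.allclose(C.T @ (C @ x), x)      # forward then inverse recovers the row
```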


We can express (3.2) with the help of Einstein notation:‖

‖ If an index occurs twice in an expression, the expression is summed over all values of that index. If an index occurs only once in an expression, the equation holds for all values of that index.

X̂_u = A_uu C̃_un X_n,

with

C̃_un = [ c(0, 0)       c(0, 1)       ···  c(0, N−1)
          c(1, 0)       c(1, 1)       ···  c(1, N−1)
          ...           ...           ...  ...
          c(U−1, 0)     c(U−1, 1)     ···  c(U−1, N−1) ],

and X̂_u ∈ R^U, X_n ∈ R^N and A_uu = diag(α(0), ..., α(U−1)) ∈ R^(U,U). If we choose scaling factors such that C_un = A_uu C̃_un, C_un C_un^T = E and U = N, then (3.2) becomes an orthogonal transformation and can be expressed as follows:

X̂_u = C_un X_n.

The inverse transform can then be written as

X_n = C_un^T X̂_u.

A two-dimensional transform and its corresponding inverse are obtained when each dimension is transformed by a separate kernel:

X̂_uv = C_un X_nm C_vm^T,
X_nm = C_un^T X̂_uv C_vm,

with X_nm ∈ R^(N,M), X̂_uv ∈ R^(U,V) and N = U, M = V. Here N does not denote the number of rows of the system matrix, but corresponds to the number of voxels in the first spatial dimension. M is the number of voxels in the second spatial dimension. Thus in 2D, the number of elements within a row equals N · M = n_p · n_q. Analogously, a three-dimensional transform can be expressed by using 2 × P two-dimensional transforms

X̂_uvi = C_un X_nmi C_vm^T,  with i ∈ P,    (3.3)

and N × M one-dimensional transforms

X̂_jkw = C_wo X_jko,  with j ∈ N and k ∈ M,

with X̂_uvw ∈ R^(U,V,W) and X_nmp ∈ R^(N,M,P). Similar to the two-dimensional case, it holds that n_p = N, n_q = M and n_r = P.

A naive implementation of a three-dimensional transform for all M rows of the system matrix requires

F_trafo = M · F_3D-DT
        = M · n_p · n_q · n_r · ((2n_p − 1) + (2n_q − 1) + (2n_r − 1))
        = M · N · ((2n_p − 1) + (2n_q − 1) + (2n_r − 1))


floating point operations. An even more efficient implementation is to use faster transformation algorithms such as the FFT, which is capable of transforming the complete system matrix with just

F_3D-FFT = M · n_p · n_q · n_r · log_2(n_p · n_q · n_r) = M · N · log_2(N)

    floating point operations. In both cases, an orthogonal transformation requires much

    less floating point operations than a factorization of A for direct solvers, cf. (2.4). In

    this work we have used the naive implementation which simplifies the application and

    analysis of different transformation types, as only the kernel matrices C have to be

    changed.
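In practice, the separable 3D transform of every row can also be done with library routines; a sketch using SciPy's dctn on a toy system matrix, where all sizes are invented for illustration:

```python
import numpy as np
from scipy.fft import dctn, idctn

M, n_p, n_q, n_r = 100, 8, 6, 8                      # toy sizes, not a real scanner
rng = np.random.default_rng(6)
A = rng.standard_normal((M, n_p * n_q * n_r))

# transform each row, viewed as an (n_p, n_q, n_r) grayscale cuboid, with a 3D DCT
A_dt = np.stack([dctn(row.reshape(n_p, n_q, n_r), norm='ortho').ravel() for row in A])

# the orthonormal transform is lossless: idctn recovers the original rows exactly
A_back = np.stack([idctn(row.reshape(n_p, n_q, n_r), norm='ortho').ravel() for row in A_dt])
assert np.allclose(A_back, A)
```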

    3.3. Reduction of the System Matrix

    Before solving the reconstruction problem, the transformed system matrix A^DT is compressed to a size which speeds up the computation. The reduction is accomplished by storing only entries of large magnitude. By introducing a threshold value τ ∈ R, τ > 0, we are able to control the degree of reduction, A^DT → Â^DT_τ, with the matrix Â^DT_τ defined element-wise by

    Â^DT_τ(m,n) = A^DT(m,n) for |A^DT(m,n)| > τ,
    Â^DT_τ(m,n) = 0 for |A^DT(m,n)| ≤ τ,

    for m = 1, ..., M and n = 1, ..., N. The matrix Â^DT_τ approximates the full transformed matrix A^DT. The larger the value of τ, the fewer elements are kept. For values τ ≥ τ_max := max_{m,n}(|A^DT(m,n)|) it holds that Â^DT_τ = 0, and for τ = 0 it simply holds that Â^DT_τ = A^DT. By choosing a value of τ close to τ_max the matrix Â^DT_τ becomes very sparse, i.e., it contains mainly zeros. Let us denote the number of non-zero elements by nnz. The memory requirement can be reduced drastically by storing only the non-zero elements, which can be accomplished by using a predefined storage format such as the compressed sparse row (CSR) or column (CSC) format. An introduction to sparse storage formats can be found in (Saad 2003). The memory complexity M is then reduced from M_{A^DT} = M · N floating point numbers for a full matrix to

    M_{Â^DT_τ} = O(nnz)

    for a sparse matrix with nnz ≪ M · N elements. The density ρ of the matrix is related to the number of non-zero elements by

    ρ(τ) := nnz(Â^DT_τ)/(M · N). (3.4)

    The function ρ(τ) maps τ ∈ [0, τ_max] → ρ ∈ [0, 1]. For the density at τ = 0 it holds that ρ(τ = 0) ≤ 1, with equality if and only if A^DT does not contain any zero. This is due to the denominator M · N in (3.4) (instead of nnz(A^DT)), which keeps the meaning of 1/ρ as a compression factor when using the sparse format.
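The thresholding and density computation above can be sketched in C as follows; only the counting step is shown here (a full implementation would write the surviving entries straight into CSR arrays), and the function names are illustrative.

```c
/* Hard-threshold a dense matrix in place and count the surviving
 * entries, giving the density rho(tau) = nnz / (M*N) from (3.4). */
#include <math.h>
#include <stddef.h>

/* zero all entries with |a| <= tau, return the number of non-zeros kept */
size_t threshold_matrix(float *A, size_t M, size_t N, float tau) {
    size_t nnz = 0;
    for (size_t i = 0; i < M * N; i++) {
        if (fabsf(A[i]) > tau)
            nnz++;          /* entry survives the threshold */
        else
            A[i] = 0.0f;    /* entry is discarded */
    }
    return nnz;
}

/* density rho = nnz / (M*N), the inverse of the compression factor */
double density(size_t nnz, size_t M, size_t N) {
    return (double)nnz / (double)(M * N);
}
```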


    When no suitable value for τ is at hand, it is also possible to choose a desired density ρ for the reduced matrix. In order to determine the corresponding threshold value τ, a fast sorting algorithm can be applied to sort the entries of the transformed matrix by magnitude. The reduction process in this case has the computational complexity of the sorting algorithm, i.e., the worst case for efficient sorting algorithms needs

    F_{Â^DT_τ} = O(M · N · log2(M · N))

    flops, with M · N being the number of elements in the system matrix. This is still much less than a factorization of the system matrix.
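A minimal sketch of this sort-based threshold selection, using the C library qsort on the entry magnitudes; how ties and rounding of ρ·(M·N) are handled here is an illustrative choice, not taken from the paper.

```c
/* Determine a threshold tau that yields a desired density rho by
 * sorting the entry magnitudes in descending order; the sort costs
 * O(M*N*log2(M*N)) and dominates the reduction step. */
#include <math.h>
#include <stdlib.h>

static int cmp_desc(const void *a, const void *b) {
    float x = *(const float *)a, y = *(const float *)b;
    return (x < y) - (x > y);   /* descending order */
}

/* returns tau such that keeping entries with |a| > tau leaves
 * about rho*(M*N) non-zeros */
float threshold_for_density(const float *A, size_t M, size_t N, double rho) {
    size_t total = M * N;
    size_t keep = (size_t)(rho * (double)total);
    if (keep >= total) return 0.0f;       /* rho = 1: keep everything */
    float *mag = malloc(total * sizeof(float));
    for (size_t i = 0; i < total; i++) mag[i] = fabsf(A[i]);
    qsort(mag, total, sizeof(float), cmp_desc);
    /* tau is the (keep+1)-th largest magnitude, so exactly the
     * 'keep' strictly larger entries survive |a| > tau */
    float tau = (keep == 0) ? mag[0] : mag[keep];
    free(mag);
    return tau;
}
```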

    Note that the proposed compression technique differs in spirit from JPEG compression. First, the 2D or 3D grayscale images (i.e., the visualizations of the rows of A) are not divided into smaller parts by imposing a fixed grid. Second, the same threshold value is used for all rows, which is quite different from a fixed compression factor for each individual row. Approximating all grayscale images by a fixed number of coefficients leads to poor compression factors, i.e., the image quality deteriorates drastically even for moderate compression factors like 2–10. The effect of the global threshold τ is that many coefficients from important frequencies close to the higher harmonics are kept, while very few or no coefficients remain from frequencies that mainly consist of noise. Figure 2 displays the positions of the non-zero entries of a compressed system matrix Â^DT_τ from the example of subsection 4.2. The full matrix consists of two stacked system matrices corresponding to the recording coils in y- and z-direction, i.e.,

    A = (A^FFT_y; A^FFT_z) ∈ R^(11386×10816),

    cf. also (2.2). The compression has been computed by the 3D discrete Chebyshev transform described in subsection 3.2, and the threshold value τ has been determined such that the compression factor is 1/ρ = (M · N)/nnz(Â^DT_τ) ≈ 1000. Here the number of non-zero elements is nnz(Â^DT_τ) = 123150. The columns of the matrix in Figure 2 represent the Chebyshev parameters and the rows correspond to the frequencies. Note that the rows up to approximately number 6000 correspond to the recording coil in y-direction and the subsequent rows correspond to the recording coil in z-direction. Figure 2 nicely shows that with the proposed threshold strategy the elements of rows which correspond to higher harmonic frequencies are kept much more likely than elements of rows between higher harmonics. This is due to the dominant role that the higher harmonic frequency bands play in the transformed space.

    3.4. Sparse Reconstruction

    All methods that access the system matrix only through matrix-vector and vector-vector operations make full use of the sparse structure. Thus all iterative algorithms are

    accelerated by a factor that corresponds to the compression factor 1/ρ. Once the

    system matrix is reduced to a desired size, the reconstruction algorithms mentioned


    [Figure: sparsity pattern of Â^DT_τ, compression factor 1000; axes: column number vs. row number.]

    Figure 2. Sparsity pattern of compressed system matrix, using 3D-DTT

    in subsection 2.2 can also be used for sparse matrices stored in a compressed format. In this case the number of floating point operations for the CGLS and LSQR methods (cf. (2.8)) reduces to

    F_sCGLS,sLSQR = O(nnz) + O(M + N) (3.5)

    and for the RART (cf. (2.7)) to

    F_sRART = O(nnz) + O(M). (3.6)

    Let us compare these approaches to the direct method from subsection 2.1. The Cholesky factor of the sparse matrix, C = chol((Â^DT_τ)^T (Â^DT_τ) + αI), is a full matrix again, hence the time needed for reconstruction is of the order

    O(N · N) ≫ O(nnz),

    due to solving systems with the Cholesky factors C and C^T, respectively. The only remedy would be a special sparse factorization of Â^DT_τ, such that direct solvers could make use of the sparsity and might yield comparable reconstruction times. However, we have not investigated any matrix factorization in this paper.
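The O(nnz) costs in (3.5) and (3.6) come from the two kernels through which sparse CGLS/LSQR touch the system matrix: y = Ax and y = A^T x in the three-array CSR format. The sketch below shows both; the struct layout and names are illustrative, not the paper's MKL-based code.

```c
/* CSR sparse matrix-vector products y = A*x and y = A^T*x -- each
 * costing O(nnz) flops, in line with (3.5)/(3.6).
 * Three-array CSR variant: val/col of length nnz, ptr of length M+1. */
#include <stddef.h>

typedef struct {
    size_t M, N;        /* matrix dimensions */
    const float  *val;  /* the nnz stored entries */
    const size_t *col;  /* column index of each stored entry */
    const size_t *ptr;  /* row start offsets, ptr[M] == nnz */
} csr_t;

/* y = A*x, one pass over the stored entries */
void csr_mv(const csr_t *A, const float *x, float *y) {
    for (size_t i = 0; i < A->M; i++) {
        float acc = 0.0f;
        for (size_t k = A->ptr[i]; k < A->ptr[i + 1]; k++)
            acc += A->val[k] * x[A->col[k]];
        y[i] = acc;
    }
}

/* y = A^T*x, scattering each row's entries into y */
void csr_mtv(const csr_t *A, const float *x, float *y) {
    for (size_t j = 0; j < A->N; j++) y[j] = 0.0f;
    for (size_t i = 0; i < A->M; i++)
        for (size_t k = A->ptr[i]; k < A->ptr[i + 1]; k++)
            y[A->col[k]] += A->val[k] * x[i];
}
```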

    An almost linear computational complexity for the sparse iterative reconstruction algorithms can be achieved if the matrix density ρ(τ) is chosen close to 1/N or 1/M, or equivalently if the compression factor is chosen as M or N, respectively. It should be noted that such a drastic reduction is only possible if the measured data correspond to an object with a certain degree of smoothness.

    A short note on the regularization parameter α after the transformation and reduction have been performed: this value might have a very different regularizing effect in the original and in the transformed space. There is no simple way to obtain a value of α that is suitable in both spaces. If α is not determined heuristically, the L-curve method or generalized cross-validation can be applied; see (Hansen 1998) for a detailed description. In the numerical examples in section 4 we use a heuristic approach.

    4. Numerical Examples

    In order to evaluate our new method we have compared the image quality when using different compression factors for the system matrices of two 3D examples. In both examples measured system matrices are available. Two different kinds of orthogonal transformations have been investigated: the three-dimensional discrete cosine transform (3D-DCT) and the three-dimensional discrete Chebyshev transform (3D-DTT), which are both described in subsection 3.2. Since we used a transformation algorithm based on (3.3), both orthogonal transforms had identical computational times. The computation time for the compression process has not been taken into account, since the compression can be carried out once the system matrix is available. It takes roughly as long as one reconstruction with the original (uncompressed) system matrix using one of the iterative solvers. The image quality is measured by comparing the reconstructed images corresponding to different densities ρ(τ) with a reference solution. The reference solution is obtained by using the full system matrix and the same iterative method that is used for the compressed data. All iterative methods presented in subsection 2.2 have been tested for both examples. The metric we have used is the mean squared error (MSE). ¶
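The MSE between a reconstruction and its reference is straightforward; the small helper below accumulates the error in double precision even for single-precision data, mirroring how the error values reported later are computed (the function name is ours).

```c
/* Mean squared error between a reconstruction x and the reference
 * solution ref, accumulated in double precision. */
#include <stddef.h>

double mse(const float *x, const float *ref, size_t n) {
    double acc = 0.0;
    for (size_t i = 0; i < n; i++) {
        double d = (double)x[i] - (double)ref[i];
        acc += d * d;   /* squared pointwise error */
    }
    return acc / (double)n;
}
```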

    The three reconstruction algorithms, LSQR, CGLS and regularized ART (RART), are implemented in C and compiled with GCC (version 4.4.5) and the Intel Math Kernel Library (version 10.2.6.038). The Intel Math Kernel Library (MKL) provides efficient implementations of the Basic Linear Algebra Subprograms (BLAS) and sparse BLAS for multi-core Intel CPUs. The runtime evaluations were carried out on two Intel Xeon X5560 processors, each with four cores and shared memory, at an overall processor load of 80% in all test cases. For visualization and verification of the reconstructed data we used Matlab. All data are processed and stored in single precision; doing the computations in double precision has no visible effect on the image quality.

    An important issue for reaching high performance is the storage format of the

    ¶ However, a quantification of the error by any metric does not replace the visual judgement of the image quality.


    system matrix, in order to have contiguous memory access. Because the ART method and the orthogonal transformation of the system matrix work on single rows, we have used the row storage format, in which the elements of a row are stored successively in memory. Had we instead used the column storage format, the ART method would slow down by a factor of 4. For the CGLS and LSQR methods the reconstruction times are independent of the storage format.

    For the sparse computations the compressed sparse row (CSR) format, see (Saad 2003) for sparse formats, was applied in a three-array variant that could be integrated directly into the sparse BLAS. Some modifications have to be made to adapt the code to sparse data: within the LSQR and CGLS methods the BLAS2 (matrix-vector) operations have to be changed into sparse BLAS2; ART only requires vector-vector operations, thus the BLAS1 operations are changed into sparse BLAS1.

    For the two large numerical examples that we have investigated, it turned out that four iterations of the CGLS and LSQR methods are sufficient to yield satisfactorily reconstructed images (by visual inspection). The image quality does not change much over the next few iterations. Note, however, that after too many iterations the image quality deteriorates, because we then approach the (unregularized) least squares solution. For reasons of comparison we set the number of iterations for the ART method to four as well.
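A row-action sweep of the kind ART performs can be sketched as follows. The exact RART update (2.6) is not reproduced in this section, so the step below, a standard regularized Kaczmarz update with damping α in the denominator, is an assumption used for illustration only; the relaxation handling and function name are ours.

```c
/* One sweep of a Kaczmarz-type row-action method with Tikhonov-style
 * damping alpha in the denominator -- an illustrative stand-in for the
 * RART update (2.6), which is not reproduced here. */
#include <stddef.h>

/* A is dense, row-major M x N; one pass over all M rows */
void rart_sweep(const float *A, const float *b, float *x,
                size_t M, size_t N, float alpha) {
    for (size_t i = 0; i < M; i++) {
        const float *ai = A + i * N;          /* i-th row of A */
        float dot = 0.0f, nrm2 = 0.0f;
        for (size_t j = 0; j < N; j++) {
            dot  += ai[j] * x[j];             /* a_i . x  */
            nrm2 += ai[j] * ai[j];            /* ||a_i||^2 */
        }
        /* project x toward the hyperplane a_i . x = b_i, damped by alpha */
        float step = (b[i] - dot) / (nrm2 + alpha);
        for (size_t j = 0; j < N; j++)
            x[j] += step * ai[j];
    }
}
```

Only vector-vector (BLAS1) operations appear in the inner loops, which is why ART cannot benefit from the faster BLAS2 implementations available to LSQR and CGLS.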

    4.1. MPI Data: Beating Mouse Heart

    The first example investigates the beating heart of a mouse. The time for one complete closed Lissajous trajectory is 21.5 ms, corresponding to encoding 46.4 volumes per second, and covers a 3D FOV of 20.4 × 12 × 16.8 mm³. The system function has been acquired on a 34 × 20 × 28 grid, with a voxel size of 0.6 mm in each dimension. Hence the resulting system function matrix has N = n_p · n_q · n_r = 19040 columns, where n_p = 34, n_q = 20 and n_r = 28. The number of frequency components, i.e., the number of rows of A, has been chosen as M = 64626. These are selected frequency components from all three recording coils, where a lot of very noisy frequency components have already been discarded. There exist 1800 right-hand sides, i.e., this is a video sequence of the beating heart. A single right-hand side from this 4D series has been chosen at a moment where the bolus of magnetic particles passes through the heart, i.e., the signal from the heart is large. The data set is similar to the published data of the in vivo 3D experiments on the beating heart of a mouse in (Weizenecker et al 2009). The upper part of Figure 3 shows the 3D surface plots of the computed reference solutions using LSQR and RART respectively, together with 2D contour plots of the corresponding slice at z = 15 in the lower part of Figure 3. For the RART algorithm (2.6) the additional regularization parameter in the transformed 3D-DCT space has been chosen as α = 5. This value has been obtained heuristically by choosing the most promising of several tested values for α. No additional non-negativity constraint has been used within the algorithms; only in the images have the negative particle concentrations been set to zero. The RART


    [Figure: 2D contour plots (x-axis vs. y-axis); panels: "Slice at z=15, no compression, LSQR" and "Slice at z=15, no compression, RART".]

    Figure 3. Reference solutions reconstructed by LSQR and RART

    solution slightly differs from the result achieved by the LSQR method, which is simply due to the different algorithms. Note that the solution obtained by the CGLS method is visually indistinguishable from the LSQR solution. In this example we are not able to judge which of the reconstructed solutions is the better approximation of the true solution. Thus both are treated as true solutions, i.e., as reference solutions for the corresponding sparse solutions using the same algorithm.

    Figure 4 illustrates reconstructed solutions for decreasing matrix densities ρ(τ) ∈ {0.1, 0.001, 0.00001}. The compression has been performed by applying the 3D discrete cosine transform, using the corresponding threshold values τ to obtain the chosen densities. The image at the bottom of Figure 4 has been reconstructed with only 12238 non-zero elements of the system matrix A, which corresponds to only about 0.001% of the original data. With such a drastic compression factor of 10^5 ≈ 5N we reach linear reconstruction time. The smoothing effect of the compression process mentioned in section 3.1 can be observed in the 2D contour plots on the right of Figure 4.

    In Figure 5, error plots for seven right-hand sides are displayed. The reconstructed solutions with a compressed sparse system matrix Â^DT_τ are compared to the


    [Figure: 2D contour plots (x-axis vs. y-axis); panels: "Slice at z=15, compression factor 10", "Slice at z=15, compression factor 1000" and "Slice at z=15, compression factor 100000".]

    Figure 4. Solutions reconstructed by LSQR, using 3D-DCT compression

  • Fast Reconstruction in MPI 19

    corresponding reference solution.+ The error curves clearly show that a decreasing

    [Figure: "Matrix density vs. MSE for different right-hand sides, using 3D-DCT"; system matrix density [%] vs. mean squared error (MSE), log-log; curves: LSQR with b(250), b(280), b(310), b(340), b(370), b(400), b(430).]

    Figure 5. Error plots for different densities, using 3D-DCT and LSQR

    density of the system matrix corresponds to a growing mean squared error (MSE).∗ However, it can also be observed that the error curves behave similarly for the different measured right-hand sides. Thus, choosing a fixed matrix density corresponds to a small interval of MSE values.

    Figure 6 shows the reconstruction times of the LSQR and RART methods. Although both methods have a comparable computational complexity of O(M · N) for the full matrix A or O(nnz) for a compressed matrix Â^DT_τ, the LSQR (or CGLS) method is able to compute the solution much faster than ART. This is due to the vector-vector operations used in ART, which only require BLAS1, while the LSQR method can make use of fast implementations of BLAS2. The horizontal line indicates the time below which real-time reconstruction can be accomplished, i.e., t_recon ≤ 50 ms. Hence, in this example the RART method is just barely able to provide real-time reconstruction for compression factors 1/ρ ≥ 10000, while LSQR is able to reconstruct in real time for matrix densities smaller than 0.4%, which corresponds to a compression factor of 250. A reconstructed solution with LSQR corresponding to a matrix density of 0.1% is depicted in the second row of Figure 4. The peak at a matrix density a little below 100% is due to the change to the sparse format: only for a density of 100% is the standard full format used, while for all other densities the sparse format is used. The use of the sparse format is only advantageous for densities below 20%, which is no surprise, since both the corresponding entry of Â^DT_τ and its position have to be stored. But typically the compression factors are much higher than five, such that the sparse

    + The right-hand sides belong to seven different volumes extracted from the 3D video sequence.
    ∗ Although the data have been stored in single precision, the mean squared error is computed in double precision.


    [Figure: "Matrix density vs. reconstruction time, using 3D-DCT"; system matrix density [%] vs. reconstruction time [ms], log-log; curves: LSQR and RART.]

    Figure 6. Reconstruction times for different densities, LSQR and RART

    format pays off.

    The reconstructed images for the different densities do not change significantly if we use the 3D-DTT instead of the 3D-DCT. The error plots and reconstruction times (i.e., Figures 5 and 6) are very similar as well when using the DTT compression. The visual image quality is also very much the same when LSQR is replaced by RART, cf. Figure 3.

    4.2. MPI Data: Flow Phantom

    The second example is obtained from a flow phantom, consisting of seven tubes through which water flows constantly. The tracer material is injected into the water flow during the object measurement. The system setup is similar to the example from subsection 4.1. Here the system function has been acquired on a 26 × 16 × 26 grid, hence the resulting system function matrix has N = n_p · n_q · n_r = 10816 columns, with n_p = 26, n_q = 16, n_r = 26. The number of frequency components (number of rows of A) has been chosen as M = 11386. These are selected frequency components from two recording coils, i.e., from the y- and z-direction. Here again a lot of noisy frequency components have already been discarded. There exist 800 right-hand sides, i.e., this is a video sequence of the water flowing through the tubes. A single right-hand side from this 4D series has been chosen at a moment where all tubes are filled with water (with some air bubbles). Figure 7 illustrates different reconstructed solutions with RART for decreasing matrix densities ρ(τ) ∈ {1, 0.1, 0.001}. A suitable value of the regularization parameter in the transformed space, i.e., here the 3D-DTT space, has again been obtained heuristically as α = 5. The left of Figure 7 shows the 3D surface plots of the computed solutions, together with 2D contour plots of the corresponding slice at


    z = 15 on the right of Figure 7. In the top row of Figure 7 the reference solution is

    [Figure: 2D contour plots (x-axis vs. y-axis); panels: "Slice at z=15, no compression", "Slice at z=15, compression factor 10" and "Slice at z=15, compression factor 1000".]

    Figure 7. Solutions reconstructed by RART, using 3D-DTT compression

    displayed, i.e., at ρ = 1. As in the first example, the smoothing effect can be observed as the density of the system matrix decreases. For very high compression factors (i.e., 1/ρ > 1000) the resulting image gets distorted and no longer captures the main structure of the reference solution.

    The error curves of the flow phantom in Figure 8 behave similarly to the error curves of the mouse example, cf. Figure 5. Figure 9 contains the

    [Figure: "Matrix density vs. MSE for different right-hand sides, using 3D-DTT"; system matrix density [%] vs. mean squared error (MSE), log-log; curves: RART with b(180), b(200), b(220), b(240), b(260), b(280).]

    Figure 8. Error plots for different densities, using 3D-DTT and RART

    reconstruction times for the flow phantom in relation to the compression factor. It can be observed that the runtime does not decrease for matrix densities smaller than 0.01%, which corresponds to 12248 non-zero elements of Â^DT_τ, i.e., almost equal to the number of rows of A. This behavior can be explained as follows: choosing matrix densities smaller than 1/M or 1/N is not reasonable, since the reconstruction time is then determined mainly by the O(M) term for the sparse RART method (cf. (3.6)) and by the O(M + N) term in case of the sparse LSQR method (cf. (3.5)). This term is not affected by the compression. The horizontal line again indicates the threshold for real-time reconstruction, i.e., t_recon ≤ 50 ms. Hence, in this example the RART method requires a compression factor of at least 50 (which gives a density of 2%), while LSQR only requires a compression factor of 25 (which corresponds to a density of 4%) for real-time reconstruction.

    The reconstructed images for the different densities do not change significantly if we replace the 3D-DTT by the 3D-DCT. Again, the error plots and reconstruction times (Figures 8 and 9) for the DCT compression are very similar.

    It should be noted that for these two examples it was possible to use relatively high compression factors of about N without substantially worsening the image quality. For MPI data of higher spatial resolution, e.g. when measuring non-smooth objects that contain high-frequency components, such compression factors might no longer be achievable.


    [Figure: "Matrix density vs. reconstruction time, using 3D-DTT"; system matrix density [%] vs. reconstruction time [ms], log-log; curves: LSQR and RART.]

    Figure 9. Reconstruction times for different densities, LSQR and RART

    5. Conclusions

    A new method for fast image reconstruction in MPI has been presented. It is mainly based on an efficient data compression technique using orthogonal transformations. Here we investigated the discrete cosine and Chebyshev transforms, both of which can extract the relevant parts of the system matrix. The storage requirements can be reduced by a factor of 10^3–10^5. Furthermore, the impact of the data compression on the image reconstruction has been shown. Three iterative methods, ART, LSQR and CGLS, have been investigated with respect to image quality and computational time: a speed-up by a factor of 50–500 is achieved with visually comparable image quality, allowing real-time reconstruction for current MPI data.

    In future work it would be interesting to investigate the inclusion of non-negativity constraints after the sparsity transform.

    Acknowledgments

    We thank Dr. Tiemann (University Medical Center Hamburg-Eppendorf) for animal handling and acknowledge funding by the German Federal Ministry of Education and Research (BMBF) under grant number FKZ 13N9079.

    References

    Dax A 1993 On Row Relaxation Methods for Large Constrained Least Squares Problems SIAM Journal

    on Scientific Computing 14 570–84

  • Fast Reconstruction in MPI 24

    Elsner L, Koltracht I and Lancaster P 1991 Convergence properties of ART and SOR algorithms

    Numerische Math. 59 91–106

    Gleich B and Weizenecker J 2005 Tomographic imaging using the nonlinear response of magnetic

    particles Nature 435 1214–17

    Gleich B, Weizenecker J and Borgert J 2008 Experimental results on fast 2D-encoded magnetic particle

    imaging Phys. Med. Biol. 53 N81–N84

    Hanke M and Niethammer W 1990 On the Acceleration of Kaczmarz’s Method for Inconsistent Linear

    Systems Lin. Alg. Appl. 130 83–98

    Hansen P C 1998 Rank-Deficient and Discrete Ill-Posed Problems: Numerical Aspects of Linear

    Inversion SIAM, Philadelphia

    Hansen P C 2010 Discrete Inverse Problems: Insight and Algorithms SIAM, Philadelphia

    Hestenes M R and Stiefel E 1952 Methods of Conjugate Gradients for Solving Linear Systems Journal

    of Research of the National Bureau of Standards 49 409–36

    Ishwar S and Meher P K 2008 Discrete Tchebicheff transform-A fast 4x4 algorithm and its application

    in image/video compression Proc. ISCAS 260–63

    Kaczmarz S 1937 Angenäherte Auflösung von Systemen linearer Gleichungen Bulletin International de

    l’Academie Polonaise des Sciences et des Lettres 35 355–57

    Knopp T, Biederer S, Sattel T, Rahmer J, Weizenecker J, Gleich B, Borgert J and Buzug T 2009

    Trajectory Analysis for Magnetic Particle Imaging Phys. Med. Biol. 54 385–97

    Knopp T, Rahmer J, Sattel T, Biederer S, Weizenecker J, Gleich B, Borgert J and Buzug T 2010

    Weighted iterative reconstruction for magnetic particle imaging Phys. Med. Biol. 55 1577–89

    Knopp T, Sattel T, Biederer S, Rahmer J, Weizenecker J, Gleich B, Borgert J and Buzug T 2010a

    Model-based reconstruction for magnetic particle imaging IEEE Trans. Med. Imag. 29 12–18

    Küsters H 1995 Bilddatenkomprimierung mit JPEG und MPEG Franzis, Poing, ISBN 3-7723-7281-3

    Lampe J and Voss H 2008 Global convergence of RTLSQEP: a solver of regularized total least squares

    problems via quadratic eigenproblems Math. Model. Anal. 13 55–66

    Lampe J and Voss H 2008a A fast algorithm for solving regularized total least squares problems Electron.

    Trans. Numer. Anal. 31 12–24

    Paige C C and Saunders M A 1982 LSQR: An Algorithm for Sparse Linear Equations and Sparse Least

    Squares ACM Trans. Math. Software 8/1 43–71

    Phillips D L 1962 A Technique for the Numerical Solution of Certain Integral Equations of the First

    Kind ACM J. Assoc. Comput. Mach. 9/1 84–97

    Rahmer J, Weizenecker J, Gleich B and Borgert J 2009 Signal encoding in magnetic particle imaging:

    properties of the system function BMC Medical Imaging 9 4

    Saad Y 2003 Iterative Methods for Sparse Linear Systems SIAM, Philadelphia

    Saunders M A 1995 Solution of sparse rectangular systems using LSQR and Craig BIT Numerical

    Math. 35 588–604

    Tikhonov A N 1963 Solution of incorrectly formulated problems and the regularization method Soviet

    Math. Dokl. 4 1035–38

    Van Huffel S and Vandewalle J 1991 The Total Least Squares Problems: Computational Aspects and

    Analysis SIAM, Philadelphia, Frontiers in Applied Mathematics

    Van Huffel S and Lemmerling P 2002 Total Least Squares and Errors-in-Variables Modeling: Analysis,

    Algorithms and Applications Kluwer, Dordrecht

    Weizenecker J, Borgert J and Gleich B 2007 A simulation study on the resolution and sensitivity of

    magnetic particle imaging Phys. Med. Biol. 52 6363–74

    Weizenecker J, Gleich B and Borgert J 2008 Magnetic particle imaging using a field free line J. Phys.

    D: Appl. Phys. 41 105009

    Weizenecker J, Gleich B, Rahmer J, Dahnke H and Borgert J 2009 Three-dimensional real-time in vivo

    magnetic particle imaging Phys. Med. Biol. 54 L1–L10