4 Image Reconstruction Algorithms in PET* Michel Defrise, Paul E Kinahan and Christian J Michel

and the reconstruction problem is defined. The third Introduction section reviews the classical analytic reconstruction of 2D tomographic data and describes the FBP method, This chapter describes the 2D and 3D image recon- which remains a workhorse of tomography. Iterative struction algorithms used in PET and the most impor- reconstruction is presented in the following section, tant evolutions in the last ten years: the introduction of where the accent is set on the key concepts and on their 3D acquisition and reconstruction and the increasing practical implications. Owing to the wide variety of role of iterative algorithms. As will be seen, iterative al- iterative methods, only the popular ML-EM and OSEM gorithms improve image quality by allowing more ac- methods are described in detail, though this does not curate modeling of the data acquisition. This model entail any claim that these algorithms are optimal. The includes the detection, the photon transport in the last sections concern the reconstruction of data tissues, and the statistical distribution of the acquired acquired in 3D mode. Three-dimensional FBP is data, i.e. the noise properties. The popularity of itera- described, as well as fast rebinning algorithms, which tive methods dates back to the seminal paper of Shepp reduce the redundant 3D data set to synthetic 2D data and Vardi on the maximum-likelihood (ML) estima- that can be processed by analytic or iterative 2D algo- tion of the tracer distribution. Practical implementa- rithms. Hybrid algorithms combining rebinning with a tion of this algorithm has long been hindered by the 2D iterative algorithm are introduced, and the chapter size of the collected data, which has increased more concludes with a discussion of the practical aspects of rapidly than the speed of computers. Thanks to the in- fully 3D iterative reconstruction. troduction of fast iterative algorithms in the nineties, Presented here as a separate chapter, image recon- such as the popular Ordered Subset Expectation struction cannot be understood independently of the Maximization (OSEM) algorithm, iterative reconstruc- other steps of the data-processing chain, including data tion has become practical. Reconstruction time with it- acquisition, data corrections (described in chapters 2, erative methods nevertheless remains an issue for very 3, 5), as well as the quantitative or qualitative analysis large 3D data sets, especially when multiple data sets of the reconstructed images. The variety of algorithms are acquired in whole-body or dynamic studies. Speed, for PET reconstruction arises from the fact that there is however, is not the only reason why filtered-back- no such thing as an optimal reconstruction algorithm. projection (FBP) remains important: analytic algo- Different algorithms may be preferred depending on rithms are linear and thereby allow an easier control of factors such as the signal-to-noise ratio (number of the spatial resolution and noise correlations in the re- collected coincident events in the emission and trans- construction, a control which is mandatory for quanti- mission scans), the static or dynamic character of the tative data analysis. tracer distribution, the practical constraints on the The chapter is organized as follows. First, the organi- processing time, and, most importantly, the specific zation of the data acquired in 2D mode is described, clinical task for which the image is reconstructed. It is

* Figures 4.1–4.11 are reproduced from Valk PE, Bailey DL, Townsend DW, Maisey MN. Positron Emission Tomography: Basic Science and Clinical Practice. Springer-Verlag London Ltd 2003, 91–114. 64 Positron Emission Tomography important to keep this observation in mind when dis- Analytic reconstruction algorithms assume that the cussing reconstruction as an isolated topic. data have been pre-corrected for various effects such as randoms, scatter and attenuation. In addition, these algorithms model each tube of response as a mathematical line joining the center of the front face of the 2D Data Organization two crystals(1). This means that the sensitivity function ψ ∈ L da,db (r) is zero except when r da,db . With this approximation, the data are modeled as line integrals of Line of Responses the tracer distribution: rr 〈〉= pdrfrdd, ∫ () ()2 A PET scanner counts coincident events between pairs ab L of detectors. The straight line connecting the centers of da ,db two detectors is called a line of response (LOR). Unscattered photon pairs recorded for a specific LOR arise from annihilation events located within a thin Sinogram Data and Sampling volume centered around the LOR. This volume typically has the shape of an elongated parallelipiped and The natural parameterization of PET data uses the is referred to as a tube of response. indices (da,db) of the two detectors in coincidence, as To each pair of detectors da,db is associated an LOR in Eq. (1). However, there are several reasons to modify L ψ this parameterization: da,db and a sensitivity function da,db (r = (x, y, z)) such that the number of coincident events detected is a • The natural parameterization is often poorly Poisson variable with a mean value adapted to analytic algorithms. This is why raw data τ ψ are usually interpolated into an alternative sinogram < pda,db > = FOV drf(r) da,db (r) (1) parameterization described below. where τ is the acquisition time and f(r) denotes the • The number of recorded coincidences N in a tracer concentration. We assume that the tracer events given scan may be too small to take full advantage of concentration is stationary and that f(r) = 0 when the nominal spatial resolution of the scanner. In such √(x2 + y2) > R ,where R denotes the radius of the F F a case, undersampling by grouping neighboring field-of-view (FOV). The reconstruction problem con- LORs reduces the data storage requirements and the sists of recovering f(r) from the acquired data p , da,db reconstruction time without significantly affecting {d ,d} = 1 · · · ,N ,where N , the number of detec- a b LOR LOR the reconstructed spatial resolution, which is primar- tor pairs in coincidence, can exceed 109 with modern ily limited by the low count density. scanners. The model defined by Eq. (1) is linear and hence Another approach to reduce data storage and process- implies that nonlinear effects due to random coinci- ing time when NLOR » Nevents consists of recording the dences and dead time be pre-corrected. In the absence coordinates (da,db) of each coincident event in a se- of photon scattering in the tissues, the sensitivity func- quential data stream called a list-mode data set. tion vanishes outside the tube of response centered on Additional information such as the time or the energy the LOR. In such a case, the accuracy of the spatial lo- of each detected photon can also be stored. In contrast calization of the annihilation events is determined by to undersampling, list-mode acquisition does not com- the size of the tube of response, which in turn depends promise the accuracy of the spatial localization of each on the geometrical size of the detectors and on other event. But the fact remains that the number of mea- factors such as the photon scattering in the detectors, sured coincidences may be too low to exploit the full or the variable depth of interaction of the gamma rays resolution of the scanner. within the crystal (parallax error, figure 2.26). Let us define the standard parameterization of 2D We have so far considered a scanner comprising PET data into sinograms. Consider a transaxial section multiple small detectors. Scanners based on large-area, z = z0 measured using a ring of detectors. Figure 4.1 position-sensitive detectors such as Anger cameras can defines the variables s and φ used to parameterize a be described similarly if viewed as consisting of a large straight line (an LOR) with respect to a Cartesian coor- number of very small virtual detectors. dinate system (x, y) in the plane. The radial variable s is

1 When the depth of interaction is accounted for, LORs are deﬁned by connecting photon interaction points projected on the long axis of the crystals [1]. Image Reconstruction Algorithms in PET 65

i.e., p p(s, φ), where the parameters (s, φ) corre- y da,db d a spond to the radial position and angle of da,db. Thus, the geometrical arrangement of discrete detectors in a scanner determines a set of samples (s, φ) in sinogram space. The most common arrangement is a ring s scanner: an even number Nd of detectors uniformly (3) φ spaced along a circle of radius Rd > RF . Each detector, x in coincidence with an arc of detectors on the opposite d side of the ring, defines a fan of LORs (figure 3.6), and b the corresponding sampling of the sinogram is: sR==−cos((201 kj – )π /)Ν k ,,K N jk, d d d ()4 φπ==−jjN/,,Ν 01K jd d where the pair of indices j, k corresponds to the coinci- Figure 4.1. Schematic representation of a ring scanner. A tube of response dences between the two detectors with indices da = between two detectors da and db is represented in grey with the corresponding LOR, which connects the center of the front face of the two detectors. j – k and db = k. Due to the curvature of the ring, each The sinogram variables s and φ define the location and orientation of the parallel projection j is sampled non-uniformly in the π LOR. radial variable, with a sampling distance Δs 2 Rd/Nd near the center of the FOV (i.e. for s 0). The radial samples of two adjacent parallel projections j and j +1 the signed distance between the LOR and the center of are shifted by approximately Δs/2, as can be seen by the coordinate system (usually the center of the detec- shifting only one end of a LOR (Fig. 4.2). tor ring). The angular variable φ specifies the orien- For practical and historical reasons, it is customary tation of the LOR. Line integrals of the tracer in PET to reorganize the data on a rectangular distribution are then defined as sampling grid φφφ==−∞ sksk==Δ − N,,K N ps(, , z )∫−∞ dtfx ( s cos t sin , kss (5) 0 φφ==−Δ K =+φφ = j jjN01, φ ys sin t cos ,) zz0 (3) where t, the integration variable, is the coordinate along the line. In the presentation of the 2D reconstruction problem below, we will omit the z arguments φ in the functions p and f. The next section describes how a function f(x, y) can π be reconstructed from its line integrals measured for |s| φ π < RF and 0 ≤ < . The mathematical operator mapping a function f(x, y) onto its line integrals p(s, φ) is called the x-ray transform(2), and this operator will be denoted X,so that p(s, φ) = (Xf)(s, φ). The function p(s, φ) is referred to as a sinogram, and the variables (s, φ) are called sinogram variables. This name was coined in 1975 by the Swedish scientist Paul Edholm because the set of LORs containing a fixed point (x0,y0) are 0 φ φ s located along a sinusoid s = x0 cos + y0 sin in the (s, 0 φ) plane, as can be seen from Eq. (3). For a fixed angle φ Figure 4.2. Representation of the sinogram sampling for a ring scanner φ φ = 0, the set of parallel line integrals p(s, 0) is a 1D with 20 detectors. The interleaved pattern provided by the LORs connect- parallel projection of f. ing detector pairs is shown by +’s. Note the decrease of the radial sampling distance at large values of s, which is exaggerated here because the At the line integral approximation, and after data plot extends to 90% of the ring radius. PET acquisition systems reorganize pre-correction, the PET data provide estimates of the these data into the rectangular sampling pattern (see equation (5)) shown x-ray transform for all LORs connecting two detectors, by ×’s.

2 In 2D, the x-ray transform coincides with the Radon transform, see [2]. 3 If the depth of interaction is not measured, an effective value of Rd is used that accounts for the mean penetration of the 511 keV gamma rays into the crystal. 66 Positron Emission Tomography

φ π with Δ = 2 /Nd, Nφ = Nd/2, and a uniform radial sam- LORs connecting such detector pairs are not trans- π pling interval Δs = Rd /Nd equal to half the spacing axial, but the maximum ring difference d2D,max is chosen between adjacent detectors in the ring. The parallel- to be small enough (typically 5) that the angle between beam sampling defined by Eq. (5) will be used in the rest these oblique LORs and the transaxial planes δθ (5) of the chapter. In this scheme the line defined by a ( d2D,max Δz/(2Rd)) can be neglected . sample (j, k) no longer coincides with a measured LOR Consider first the LORs between detectors in adja- connecting two detectors. The reorganization into paral- cent rings r and r + 1. These data are assembled in a 2D lel-beam data therefore requires an interpolation sinogram p(s, φ,z = (r +1/2)Δz) and used to recon- (usually linear interpolation) to redistribute the counts struct a transaxial slice that is approximated as lying on the rectangular sampling grid (Eq. (5)). This interpo- midway, axially, between the two detector rings. Each lation entails a loss of resolution, which is usually negli- sample in this cross-plane sinogram is the average of gible owing to the relatively low SNR in PET(4).In two LORs: on the one hand the LOR connecting a de- addition, the geometry of some scanners is not circular, tector da in ring r to a detector db in ring r + 1, and on but hexagonal or octagonal. Resampling is then needed the other hand the LOR connecting detectors da in ring anyway if standard analytic algorithms are to be used. r + 1 and db in ring r. Indeed, these two LORs coincide When the average number of detected coincidences if we neglect the small angles ±δθ they form with the per sinogram sample is small, undersampling is often transaxial plane. One effect of the introduction of the applied to reduce the storage and computing require- cross-plane sinograms is to increase the sampling rate ments. Angular undersampling (increasing Δφ) is in the axial direction so that instead of reconstructing called transaxial mashing in the PET jargon. The Nr image planes of thickness Δz, we end up with 2Nr – 1 φ π mashing factor defined by m = Δ Nd/(2 ) is usually an image planes separated by Δz/2. integer so that undersampling simply amounts to More generally, the LORs between rings r – j and summing groups of m consecutive rows (j’s) in the r + j,with j = 0, 1, 2,..≤ d2D,max/2 are added to form the sinogram. Angular undersampling results in a loss of direct sinogram of slice z = rΔz, and the LORs between resolution, which is smallest at the center of the FOV rings r – j + 1 and r + j,with j = 0, 1, 2,..≤(d2D,max + and maximum at its edge. Therefore, the maximum 1)/2 are added to form the cross-plane sinogram of allowed mashing factor depends not only on the SNR slice z = (r + 1/2)Δz. There are an odd number of ring but also on the radius RF of the reconstructed FOV: for pairs contributing to the direct plane sinograms and a fixed SNR, we can allow more mashing for a brain an even number of ring pairs contributing to the cross- scan than for a whole-body study. Radial undersam- plane sinograms (Fig. 4.3). For small values of d2D,max, pling (increasing Δs) tends to generate more severe artifacts, and is rarely used. A rule of thumb to match the φ radial and angular sampling is the relation Δ Δs/RF, which is derived using Shannon’s sampling theory [2]. z

7 7 Multi-slice 2D Data 6 6 5 5 So far we have discussed data sampling for a single 4 4 3 3 ring scanner located in the plane z = z0. Multi-ring scanners are stacks of N rings of detectors spaced 2 2 R 1 1 axially by Δz and indexed as r = 0, · · ·, N – 1 [3]. The R 0 Δz 0 coincidences between two detectors belonging to the same ring r are organized in a direct sinogram p(s, φ,z = rΔz) as described in the previous section. This is the Figure 4.3. Longitudinal view of a multi-ring scanner with Nr = 8 rings, op- sinogram of the function f(x, y, z = rΔz) (Fig. 4.3). erated in 2D mode, illustrating the formation of sinograms for two transaxial Multi-ring scanners also collect coincidences between slices (in grey), with d2D,max = 2. The sinogram for the cross slice at detectors located in a few adjacent rings, i.e. between z = 13 z/2 (top) is obtained by averaging the coincidences between two rings pairs (ra, rb) = (6, 7) and (7, 6). The sinogram for the direct slice at z = one detector in some ring r and another detector in 8z/2 (bottom) is obtained by averaging the coincidences between three one of the rings r + d,with d = –d2D,max, · · · , d2D,max.The rings pairs (ra, rb) = (3, 5), (5, 3) and (4, 4).

4 Parallel-beam resampling is used by some CT scanners despite more severe requirements in terms of spatial resolution. 5 For d = ±1 this approximation is of the same order as when resampling the sinogram to parallel beam. Image Reconstruction Algorithms in PET 67 there is thus a signiﬁcant difference in the number of pleted by estimating the missing LOR data. When the LORs contributing to the different types of sinograms, gaps in the sinogram are not too wide, simple interpo- and therefore also a difference in the corresponding lation can be used, but more sophisticated techniques

SNRs (see also figure 3.9). As d2D,max increases, the SNR have been proposed [11, 12]. An alternative is to apply of both types of sinograms increases and the differ- iterative reconstruction techniques, which are less sen- ences diminish; however, there is a degradation in the sitive to the specific geometry. We note, however, that image resolution as we will see later in the single-slice the use of iterative methods does not provide a solu- rebinning algorithm. In practice, the value of d2D,max is tion for the missing data problem. Rather it simplifies chosen to balance these trade-offs, with typical values the introduction of prior knowledge which can par- ranging from 3 to 11. tially compensate for the missing data.

The Cornerstone of Tomographic Analytic 2D Reconstruction Reconstruction:The Central Section Theorem

Properties of the X-ray Transform Tomographic reconstruction relies on Fourier analysis. Recall that the Fourier transform of a function f(x, y) is In this section, we solve the inverse 2D x-ray transform. defined by A closed-form solution of the integral equation, Eq. (3) F νν= νν is first derived assuming a continuous sampling of the ()(,)fFxy (,) xy φ ∈ × π =−+∫ πν ν sinogram variables over (s, ) [–RF,RF ] [0, ]. An dx dy f(, x y )exp( 2 i ( xxy y )) (6) approximation to this exact solution will then be written 2 in terms of the discrete data samples (defined by and is inverted by changing the sign of the argument of Eq. (5)), leading to the standard filtered-backprojection the complex exponential algorithm (FBP). We refer for this section to the com- −1 prehensive books by Natterer [2, 4], Kak and Slaney [5], ()(,)(,)F Fxy= fxy =+νν νν πν ν Barrett and Swindell [6], and Barrett and Myers [7]. ∫ ddFxy(,) xyexp(2 ixy ( x y )) (7) 2 First, two properties of Eq. (3) should be stressed: We use ν and ν to denote the frequencies associated • The problem is invariant for translations in the sense x y to x and y respectively, and denote the Fourier trans- that the x-ray transform of a translated image ft(x, y) φ φ form of a function, e.g., f, by the corresponding upper = f(x – tx, y – ty) is (Xft)(s, ) = (Xf)(s – tx cos – ty sin φ, φ). Translating the image simply shifts each sino- case character, e.g., F. These definitions are extended in gram row. the obvious way to N dimensions. A key property of the Fourier transform is the convo- • The problem is invariant for rotations in the sense lution theorem, which states that the Fourier transform that the x-ray transform of a rotated image fθ(x, y) = of the convolution of two functions f and h, f(x cos θ – y sin θ,x sin θ + y cos θ) is (Xfθ)(s, φ) = (Xf)(s, φ + θ). (*)(,)f h x y= ∫ dx′′ dy f (, x ′′ y )( h x− x′ , y− y′ ) ()8 2 These two invariances, and also the algorithms described in the next sections, are valid only when the is the product of their Fourier transforms: scanner measures all line integrals crossing the FFFνν=⋅ νν νν ( (fh * ))(xy , ) ( f )( xy , ) ( h )( xy , ) (9 ) support of the image (the disc of radius RF), so that the sinogram is sampled over the complete range (s, φ) ∈ In signal- or image-processing terms, convolving f with × π [– RF,RF] [0, ]. When this condition is not satisfied, h amounts to filtering f with a shift-invariant (i.e. in- the problem is called an incomplete data problem variant for translations) point spread function h.The (among many references, see [2] Ch. VI, [4, 8, 9]). This convolution theorem simplifies convolution by reduc- happens in particular with hexagonal or octagonal ing it to a product in frequency space. In general, the scanners such as the Siemens/CPS HHRT, where the Fourier transform is useful for all problems that are in- gaps between adjacent flat panel detectors cause un- variant for translation, and therefore also for tomo- measured diagonal bands in the sinogram [10]. Before graphic reconstruction as will now be shown. applying the FBP algorithm presented below, the in- The central section theorem, also called the projection completely measured sinograms must first be com- slice theorem, states that the 1D Fourier transform of the 68 Positron Emission Tomography

ν ν ∈ 2 x-ray transform Xf with respect to the radial variable s is of F on a square grid ( x = kΔ, y = lΔ), (k, l) Z , related to the 2D Fourier transform of the image f by which does not coincide with the polar grid of samples P(νφ , )==F( ν ν cos φν , = ν sin φ ) ( 10) provided by the data (see the right hand side of Eq. 10). x y Direct Fourier reconstruction therefore involves a 2D where interpolation to map the polar grid onto the square grid. This interpolation is often based on gridding P(,)νφ== (F p )(,) νφ∫ ds p (,) s φexp( − 2 π is ν ) ( 11 ) techniques similar to those used for magnetic reso- nance imaging [13, 14, 15]. and ν is the frequency associated to the radial variable s. This theorem is easily proven by replacing the x-ray transform p(s, φ) = (Xf)(s, φ) in the right hand side of The Filtered Backprojection Algorithm Eq. (11) by its definition (Eq. (3)) as a line integral of f. Thus the 1D Fourier transform of a parallel projection The FBP algorithm is the standard algorithm of tomog- of an image f at an angle φ determines the 2D Fourier raphy. It is equivalent to the direct Fourier reconstruc- transform of that image along the radial line in fre- tion in the limit of continuous sampling, but its ν ν φ ν discrete implementation differs. quency plane ( x, y) that forms an angle with the x axis. The implication for reconstruction is the follow- The FBP inversion explicitly combines Eqs. (11), (10) ing: if we measure all projections φ ∈ [0, π], the radial and (7). Straight-forward manipulations involving ν ν ν φ line sweeps over the whole frequency plane and changing from Cartesian ( x, y) to polar ( , ) coordi- ν ν nates lead to a two-step inversion formula (Fig. 4.4): thereby allows the recovery of F( x, y) for all frequen- ν ν ∈ 2 cies ( x, y) . The image f can then be recon- fxy(, )== ( Xp* F )(, xy ) structed by inverse 2D Fourier transform (Eq. (7)). π ()12 F The discrete implementation of the inversion ∫ dpφφφφ(,) s=+ x cos y sin formula combining Eqs. (11), (10) and (7) is referred to 0 as the direct Fourier reconstruction. This algorithm is where the filtered projections are numerically efficient because the discretized 2D R F Fourier transform (Eq. (7)) can be calculated with the pF (, sφφ )= ∫ ds′′ p ( s , )( h s− s′ ) (13) − FFT algorithm. The 2D FFT requires as input the values R F

f(x,y) p(s,φ)

X +

(x0, y0)

X* X* *h(s)

Figure 4.4. Illustration of 2D filtered backprojection. The x cosφ + y sinφ =s top row shows a brain section and its sinogram p = Xf. The 0 0 backprojection X*p of the sinogram (bottom right) is the 2D convolution of f with the point spread function 1/√(x2 + y2) and illustrates the blurring effect of line integration. The filtered sinogram pF obtained by 1D convolution with the ramp filter kernel has enhanced high frequencies, and F φ when backprojected, yields the original image f, up to noise p (s, ) and discretization errors. Image Reconstruction Algorithms in PET 69 and the ramp filter kernel is defined as (iii) The approximation of the backprojection by a dis- ∞ crete quadrature = ∫ νν πν hs() d | | exp(2 is ) ( 14 ) Ν φ −1 −∞ Δφφφφ∑ F =+ fxy(, ) p ( s x cos jjj y sin , ) (17) = Three remarks are in order. j 0 for a set of image points (x, y) (usually a square (i) The operator X* mapping pF onto f in Eq. (12) is pixel grid)(6). called the backprojection and is the dual of the x- (iv) The estimation of pF (s = x cos φ + y sin φ , φ ) in ray transform. Geometrically, (X*pF )(x, y) is the j j j Eq. (17) from the available samples pF (kΔs, φ ). sum of the filtered data pF for all lines that contain j This is usually done using linear interpolation: the point (x, y). s (ii) The convolution (Eq. (13)) can be expressed using psF (,φφ ) ( k+−1 )pksF (Δ , ) + the convolution theorem as PF (ν, φ) = |ν| P(ν, φ). j Δs j (18) (iii) The integral (Eq. (14)) defines the kernel h as the s −+F Δ φ (kp ) (( k1 ) s ,j ) inverse 1D Fourier transform of the ramp filter Δs function |ν|. This integral does not converge in the where k is the integer index such that kΔs ≤ s < (k usual sense, and h is only defined as a generalized + 1)Δs. Instead of linear interpolation some imple- function (see chapter 2 in [7]). mentations apply a faster nearest-neighbor interpolation to filtered projections which have first been linearly interpolated on a finer grid (typically Discrete Implementation of the FBP sampled at a rate Δs/4). Remarkably, most FBP implementations only use The discrete implementation of Eqs. (12) and (13) simple tools of numerical analysis, such as linear in- φ using the measured samples of p(s, ) described in the terpolation and trapezoidal quadrature, despite many section on sinogram data and sampling, above attempts to demonstrate the benefits of more sophisti- (Eq. (5)), involves four approximations: cated techniques. (i) The approximation of the kernel h(s) by an apodized kernel The Ill-posedness of the Inverse X-ray ∞ = ∫ νν ν πν Transform hsw () d | | w ( ) exp(2 is ) (15) −∞ where w(ν) is a low-pass filter which suppresses Like many problems in applied physics, the inversion the high spatial frequencies, and will be discussed of the x-ray transform is an ill-posed problem: the solution f defined by Eqs. (11), (10) and (7) does not later in the section on the ill-posedness of the φ inverse X-ray transform. depend continuously on the data p(s, ). Concretely, (ii) The approximation of the convolution integral by this means that an arbitrarily small perturbation of p a discrete quadrature. Usually standard trape- due to measurement noise can cause an arbitrarily zoidal quadrature is used: large error on the reconstructed image f. We refer to Bertero and Boccacci [20] and Barrett and Myers [7] N s F ΔΔφφ ∑ ′ Δ− ′ Δ pks(,)jjw s pks ( ,)(()) h kk s for an introduction to the concept of ill-posedness and ′=− kNs (16) its implication in tomography. Intuitively, ill-posedness =− K k NNss,, can be understood by noting that the ramp filter |ν| The calculation of this discrete convolution can be amplifies the high frequencies during the filtering step accelerated using the discrete Fourier transform P(ν, φ) → PF (ν, φ) = |ν|P(ν, φ). The power spectrum of (FFT) (see [16] section 13.1). In this case, some a typical image decreases rapidly with increasing fre- care is needed when defining the discrete filter: to quencies, whereas the noise power spectrum decreases avoid bias, this filter must be calculated as the FFT in general slowly(7). Consider a hypothetical perturba- φ → φ πν ν of the sampled convolution kernel hw(kΔs), k = 0, tion of the data p(s, ) p(s, ) + cos(2 0s)/√ 0 for ν ±1, ±2, …, and not by simply sampling the contin- some o > 0. This perturbation becomes arbitrarily ν uous filter function |ν|w(ν). small when 0 tends to ∞, but the corresponding

6 Alternative and faster implementations of the backprojection have been proposed [17, 18, 19]. 7 In the so-called white noise limit, the noise power spectrum is constant. 70 Positron Emission Tomography perturbation of the filtered projection is easily seen to kernel hw(s). The choice of the cut-off frequency must F φ → F φ ν πν be p (s, ) p (s, ) + √ 0 cos(2 0s) and is arbitrar- take two factors into account: ily large for large ν . This artificial example illustrates 0 • Given the radial sampling distance Δs in the sino- the fact that the ill-posedness of the inverse x-ray gram, Shannon’s sampling theory states that the transform (and of most inverse problems) arises from maximum frequency that can be recovered without high-frequency perturbations. aliasing is 1/2Δs. The cut-off frequency is therefore This discussion suggests that the reconstruction can constrained by ν ≤1/2Δs. be stabilized by filtering out the high frequencies. This c • As we have seen, stabilization requires suppressing is achieved by introducing a low-pass apodizing high frequencies. Therefore, lower values of ν are window w(ν) as in Eq. (15). A window frequently used c selected when the signal-to-noise ratio (i.e. the in tomography is the Hamming window number of detected coincidences) is low. νπνννν=+ < wham() (12cos( / c ))/ | | c The stability of the discrete FBP can be analyzed as- =≥νν (19) 0 | | c suming a Poisson distribution for the measurement ν noise. Consider the reconstruction of a disc of radius R where c is some cut-off frequency. The rectangular window containing a uniform tracer distribution, from 2D PET data comprising Nevents coincident events. Neglecting ννν=< wrec()1 | | c attenuation, scatter and random, the relative variance =≥νν (20) 0 || c of the reconstructed image at the center of the disc can be shown [21] to be results in a better spatial resolution, but introduces π 33(/RsΔ ) ringing artifacts near sharp boundaries. Figure 4.5 il- variance fx(,==00 y ) (21) lustrates the apodized window and the convolution 6 Nevents

hrec(s)

0 s

hham(s)

0 s

Figure 4.5. The convolution kernels corresponding to the rectangular window in equation (20) (top), and to the Hamming window (19) (bottom) are shown with arbitrary vertical scales. The smaller width of the central lobe of hrec(s) results in higher spatial resolution in the reconstruction, while the larger side lobes, compared to hham(s) indicate a higher sensitivity to noise. A transaxial slice of an FDG brain scan reconstructed using FBP with these two windows is shown on the right. Image Reconstruction Algorithms in PET 71 where Δs is the radial sampling and a rectangular ν The General Ingredients of Iterative window with c = 1/2Δs has been used. This result means that the number of detected events, hence the Reconstruction Algorithms: Data Model scanner sensitivity, should be multiplied by a factor The data are represented using Eq. (1). To simplify no- of 8 when the spatial resolution Δs is halved. This is tations, a single index j is used to denote the detector to be compared with the factor 4 increase that pair (d ,d), and the mean number of events detected suffices in the absence of tomographic reconstruc- a b for one LOR is then rewritten as tion, e.g. if perfect time-of-flight information is available, or if f is obtained from planar scintigraphy as in r rr r 〈〉=〈 〉=τ Ψ = K single photon imaging. The supplementary factor of pp{j∫ drfrrjN ( ) j ( ),1 , , LOR } (22) 2 reflects the ill-posed character of the inverse x-ray FOV transform. For a multi-slice 2D reconstruction, an ad- Any linear physical effect can be modeled in the sensi- ditional factor of 2 must be included if the axial reso- Ψ tivity function j : attenuation and scatter (assuming a lution is also halved, leading to a 16 fold increase of known density map), gaps in the detectors, non-uniform the number of counts when the isotropic resolution is resolution of the detectors, etc. The accuracy of the halved. When an improvement in detector resolution physical model ultimately determines the accuracy of is not matched by an increase in sensitivity, a cut-off the reconstruction. Nevertheless, approximate models ν frequency c smaller than the Nyquist frequency are often used to limit the computational burden, and 1/2Δs must be used to limit noise. In such a case, the these approximations are justified for low-count studies improvement in detector resolution is not fully trans- where image quality is primarily limited by noise. Many lated in the reconstructed image resolution. The im- approaches can be found, ranging from a simple line in- provement nevertheless remains beneficial because tegral model (as for FBP) up to a highly accurate model the modulation transfer function is enlarged at the required for high SNR studies with small-animal scan- ν ν lower frequencies | | ≤ c, allowing better recovery ners. A clever exploitation of the symmetries of the coefficients for small structures. scanner and the use of lookup tables, as described in Qi et al. [24], allows the computational costs of such a complex modeling to be kept to a reasonable level. Eq. (22) represents the mean value of the data. The

statistical distribution of each LOR data pj around its mean value < pj > must also be modeled. An inaccu- Iterative Reconstruction rate statistical model results not only in a sub-optimal

variance, but also in a bias. Usually, the “raw data” pj are This section introduces the major concepts of the it- counted numbers of detected photon pairs and are dis- erative reconstruction algorithms, which play an in- tributed as independent Poisson variables. The likeli- creasingly important role in clinical PET. These hood function then has the form algorithms rely on a discrete representation of both r N LOR =−〈〉〈∏ 〉 pj the data and the reconstructed image, in contrast Pr{pf|} exp( pjj ) p / pj ! (23) with the analytic algorithms, which are derived as- j=1 suming a continuous data sampling and introduce Due to the various forms of data pre-processing, the the discrete character of the data a posteriori. We actual distribution of the data presented to the algo- begin this section with a general discussion of the in- rithm often deviates from the Poisson model. If the gredients of an iterative algorithm: the data model, number of counts per bin is high enough, the distribu- the image model, the objective function, and the opti- tion is approximately Gaussian mization algorithm. We refer to [22, 23] for more details. The various possible choices for each of these −〈 〉 2 r N LOR 1 ()pp ingredients explains the wide variety of iterative al- Pr{pf|}=−∏ exp( jj) = σ 2 (24) gorithms in the literature. One speciﬁc algorithm will j 1 2πσ 2 j j be described in detail in the section on ML-EM and σ 2 OSEM (below). and the variance j of each LOR can be estimated One of the strengths of iterative algorithms is that knowing the data pre-processing steps. A more general they are largely independent of the acquisition geome- Gaussian model with a non-diagonal covariance try. Therefore, the concepts presented below apply matrix may be needed if the pre-processing introduces equally to 2D and to 3D PET data. correlations between LORs. 72 Positron Emission Tomography

Another variant has been proposed to model data The pixel basis function is not band-limited: its (8) F ν ν pre-corrected for random coincidences . When this is Fourier transform ( bi)( x, y) decreases slowly at done by subtracting delayed coincidences, the sub- large frequencies due to the discontinuity at the tracted data (the “prompts” minus the “delayed”) are boundary of the pixel. This property is at odds with the ν no longer Poisson variables. An approximate model, fact that the frequencies larger than c = 1/2Δs cannot the shifted Poisson model, for the distribution of such be recovered from sampled data (see The Ill-posedness pre-corrected data has been proposed in [25, 26]. of the Inverse X-ray Transform, above). An alternative We conclude this section on data modeling with a proposed by Lewitt [32] consists of using smooth few words on the reconstruction of transmission data basis functions which are essentially band-limited. acquired with monoenergetic photons of energy E. Significant improvements in image quality have been Typically, E = 511 keV if a positron source such as demonstrated using truncated Kaiser–Bessel functions, 68Ge/68Ga is used or E = 662 keV for a 137Cs single dubbed blobs [33]. These radially symmetrical basis photon source. These data are also distributed as functions have a compact support, but they do overlap, Poisson variables but with mean values which increases the processing time unless the spacing r rr r and size of the basis functions are carefully chosen. At 〈〉=〈 〉=0 − μ Ψtr ppp{(,)())jj exp( ∫ drrEr j the time of writing, most iterative algorithms are still FOV (25) jN= 1,,K } based on discontinuous basis functions, but at least one LOR clinical scanner implements blobs. 0 instead of Eq. (22). Here, pj is the mean number of co- In principle, the choice of the basis functions deter- Ψtr incident events in the reference (blank) scan, j is the mines the image model and reduces image reconstruc- μ sensitivity function for the transmission data, and (r, tion to the estimation of a vector {fi,i = 1, · · · , P}, E) is the attenuation coefficient to be reconstructed. usually with the constraint fi ≥ 0. The constraint im- The difference between this model and Eq. (22) shows plicit in this discrete representation(9) helps to stabilize that specific iterative algorithms are needed for trans- the reconstruction, but may be insufficient. In such a mission data [27, 28, 29, 30]. An alternative is to apply case, a small perturbation of the data vector p still algorithms developed for emission data to the loga- causes an unacceptably large perturbation of the re- 0 rithm log(pj / pj), but this approach introduces construction f . The set of admissible images must then significant biases because the logarithm of the data is be further restricted. Several techniques can be used not a Poisson variable any more. Examples of these for this purpose, we focus here on the popular biases are given in [31]. Bayesian scheme (see, for example, [34, 35, 7]). In the Bayesian scheme, regularization is achieved by considering the image as a random vector with a The Image Model: Basis Functions and prescribed probability distribution Pr(f). This distrib- Prior Distribution ution is called the prior distribution (or simply “the prior”). Typically, the prior enforces smoothness by as- Iterative algorithms model the image as a linear combi- signing a low probability to images having large dif- nation of basis functions ferences |fix, iy – fix ± 1, iy ±1| between neighboring pixels. P One says that large differences between neighboring ∑ f (,xy ) fbxyii (, ) (26 ) pixels are penalized by the prior. In practice, priors are i=1 defined empirically because the clinically relevant Most algorithms use contiguous and non-overlapping prior information is usually too complex to be ex- pixel basis functions, which partition the field of view: pressed mathematically. We will see in the section on the cost function (below) how the prior is incorporated bxy(, )=−<12 | x x |ΔΔ x / and | y −< y | x / 2 ii iin the reconstruction. =−≥02||/xxΔΔ x or | yy −≥ |/ x 2 iiPriors based on a Gaussian distribution with a ()27 uniform (i.e. shift-invariant) covariance are in essence th with i = (ix,iy) and the center of the i pixel is equivalent to the linear smoothness constraint intro- (xi = ixΔx, yi = iyΔx). The pixel size is Δx = Δs/Z,where duced in the FBP algorithm by low pass windows w(ν) Z is the zoom factor. discussed earlier. More sophisticated priors can

8 In contrast with the randoms, the contribution of scattered coincidences is linearly related to true coincidences and hence can in Ψ principle be included in the model j(r).However, the scatter background is more often subtracted from the data prior to reconstruction for the sake of numerical efficiency. See chapter 5. 9 Mathematically we constrain f(x, y) to belong to the P-th dimensional space of functions spanned by the bi. Image Reconstruction Algorithms in PET 73 improve image quality, especially sharpness, in specific matrices, each of which models a specific aspect of the situations and for specific tasks, but may introduce data acquisition [24]. subtle nonlinear biases and noise correlations. A direct inversion of the linear system (Eq. (28))

An attractive class of prior distributions exploits a with the < pj > replaced by the measured data pj is im- registered anatomic, MR or CT, image of the patient [36, practical for two reasons: 37, 38, 39, 40]. This image defines likely boundaries • The discrete system is ill-conditioned: the condition between regions in which uniform tracer concentration number of a is large(10). Consequently, the solution(11) is expected. These boundaries can be incorporated in a of Eq. (28) is unstable for small perturbations prior that enforces smoothness only between pixels be- p – < p > of the data. Ill-conditioning is the discrete longing to the same anatomical region. Despite promis- j j equivalent of the ill-posedness of the inverse x-ray ing results, that approach still needs further validation transform discussed in the section on the ill- and comparison with the alternative approach in which posedness of the inverse X-ray transform (above). the MR or CT prior information is exploited visually • Numerically, the inversion of matrix a is hindered by using, for example, image fusion techniques. its very large size (typically P = 106 unknowns and Let us finally stress that the 2D or 3D nature of the N 106 up to N 109 in 3D PET). image model is independent of the fact that the data are LOR LOR acquired in 2D or 3D mode. Indeed, true 3D image The first problem is solved by incorporating prior models based, for example, on 3D blobs and on 3D knowledge in a cost function. The second, numerical smoothness constraints are useful even when the data problem, is solved by optimizing the cost function by are collected independently for each slice (or rebinned, successive approximations. see section on 3D analytic reconstruction by rebinning (below)) [41]. For dynamic or gated PET studies, mixed basis functions depending on both the time and the The Cost Function spatial coordinates can be defined to model the expected behavior of the tracer kinetics [42, 43]. The key ingredient of an iterative algorithm is a cost function Q(f = (f1, · · ·, fp), p), which depends on the unknown image coefficients and on the measured data. The System Matrix Q(f , p) is also called the objective function. The reconstructed image estimate f* is defined as one that maximizes Q: We can now summarize the assumptions in the two rr previous sections. Putting the image model (Eq. (26)) r fQfp* = arg max (,) (30) into the data model (Eq. (22)) reduces the problem to a r f set of linear equations:

P with usually the constraint fj ≥ 0. The role of the cost 〈〉=∑ = K pafjji, i j1,, N LOR (28) function is to enforce (i) a good fit with the data, i.e., i=1 Eq. (28) should be approximately satisfied, (ii) the prior where the elements of the system matrix are conditions on the image model. rr r In the Bayesian framework, the cost function is the = τ ∫ Ψ ==KK adrbrrji, i() j () jNiP11,,LOR ; ,, posterior probability distribution FOV ()29 r rr r r Pr{pf|} Pr{ f } Pr{ fp|}= r (31) A line integral model including only attenuation cor- Pr{p} rection and normalization generates a sparse system matrix a with elements simple enough to be calculated The first factor in the numerator of the right hand side on the fly. More accurate models that include scatter is the data likelihood (given, for example, by the lead to densely populated matrices, which are complex Poisson model (Eq. (23)), and the second factor is the to calculate. A practical algorithm then requires a com- prior probability discussed above in the section on promise between accuracy, required storage, and the image model. The denominator is independent of f speed. A useful approach is to factor a as a product of and can be dropped. Maximizing the posterior proba-

10 The condition number of a matrix is the ratio between its largest and smallest singular values. 11 Or the generalized Moore–Penrose solution if a is singular or p ∉ range a, see [20]. 74 Positron Emission Tomography bility is equivalent to maximizing its logarithm, and When the cost function is differentiable and a non- the cost function becomes negative solution is required, the solution f* must r rrrr satisfy the Karush–Kuhn–Tucker conditions: =+ Qf(|) plog Pr{ p |} f log Pr{ f } () 32 r r ((,))∇=>=Qf** p001 f , j ,,K P The first term penalizes images which do not well fit jj ()35 ≤=00f * the data, whereas the second one stabilizes the inver- j sion by penalizing images which are deemed a priori “unlikely”. An image maximizing Q(f, p) is called a where the gradient of the cost function is the vector maximum a posteriori (MAP) estimator. When the with components r r log-likelihood is Gaussian (Eq. (24)), the first term in r r ∂ * Qf(,) p rr Eq. (32) is a quadratic function and the algorithm max- ∇=Qf(,)) p |()* 36 j ∂f ff= imizing Q(f , p) is called a penalized weighted least- j square method [44, 45]. Ideally, the maximum of Q should be unique. When positivity is not enforced, the Karush– Uniqueness is guaranteed when the cost function is Kuhn–Tucker condition reduces to the first line of Eq. convex, i.e., when the Hessian matrix (35). If in addition the cost function is quadratic (e.g., r r with a Gaussian log-likelihood), optimization reduces ∂2Qf(,) p to a set of P linear equations in P unknowns. With a H = ij,,,= 133K P () ij, ∂∂ff Gaussian likelihood without prior, these equations are ij the so-called normal equations corresponding to Eq. 28 is negative definite for all feasible f. Non-convex cost [7, 20]. functions may still have an unique global maximum, There is a considerable literature on optimization, but they can also have local maxima, which complicate and even within the field of tomography a wide the optimization. variety of methods have been proposed. A detailed overview (see [7, 16, 22, 46]) is beyond the scope of this chapter, but it may be useful to briefly list a few basic tools that can be used to develop iterative Optimization Algorithms methods. The major difficulty is that the system of equations (Eq. (35)) is large, strongly coupled, and The cost function (assuming it has an unique global often non-linear. Many algorithms are based on the * maximum) defines the looked-for estimate f of the replacement at each iteration of the original * tracer distribution. To actually calculate f , an opti- optimization problem (Eq. (30)) by an alternative mization algorithm is needed. Such an algorithm is a problem which is easier to solve because prescription to produce a sequence of image estimates fn, n = 0, 1, 2,· · ·, which should converge asymptoti- • it has a much smaller dimensionality, and/or cally to the solution: • the modified cost function is quadratic in its un- rr knowns, or even better separable in the sense that its lim ffn = * ()34 gradient is a sum of functions each depending on a −∞ n single unknown parameter fj . Asymptotic convergence is not the only requirement: Standard examples include: the optimization algorithm should be stable, efficient numerically, and ensure fast convergence indepen- (i) Gradient-based methods. The prototype is the dently of the choice of the starting image f0.A further steepest-ascent method, which reduces the problem to a one-dimensional optimization along property is that of monotonic convergence, which th guarantees that Q(fn+1, p) ≥ Q(fn, p) at each iteration. the direction defined by the gradient. The n itera- Though not strictly needed, monotonic convergence is tion is defined by: useful in practice and is often the key property used to rr rr nn+1 =+∇α n prove asymptotic convergence. ffn Qfp(,)rrrr αα=+∇nn (37) In principle, the choice of the optimization algorithm n arg maxQf((,),) Qf p p α should not influence the solution, which is defined by α Eq. (30). In practice, however, the image that will be The step length n maximizes the cost function used is produced by a necessarily finite number of iter- along the gradient direction, taking into account ations and thereby does depend on the algorithm. possible constraints such as positivity. Image Reconstruction Algorithms in PET 75

(ii) Methods using subsets of the image vector.Only a mization) algorithm and its accelerated version subset of the components of the unknown image OSEM (Ordered Subset EM). The ML-EM method vector (i.e., a subset of voxels) is allowed to vary at was introduced by Dempster et al in 1977 [51] and each iteration, while the value of the other compo- first applied to PET by Shepp and Vardi [52] and nents is kept constant. A different subset of voxels Lange and Carsson [27]. The algorithm is akin to the is allowed to vary at each iteration. In the coordi- Richardson–Lucy algorithm developed for image nate ascent algorithm, a single voxel is varied at restoration in astronomy (see, for example, [20]). The each iteration, according to: OSEM variation of the ML-EM algorithm, proposed ffn+1 =≠n jJn() in 1994 by Hudson and Larkin was the first iterative j j r algorithm sufficiently fast for clinical applications. ==n KKn n n arg maxQf((111 , , fj−+ , ffjj , , , fp ), p ) j Jn ( ) The cost function in the ML-EM and OSEM algo- f j rithms is the Poisson likelihood (Eq. (23)). Putting ()38 Eq. (28) into Eq. (23), taking the logarithm, and drop-

where J(n) defines the order in which voxels are ac- ping the terms that do not depend on the unknowns fi, cessed in successive iterations, e.g., J(n) = n mod P. we get (iii) Methods based on surrogate cost functions.The r original cost function Q(f , p) is replaced at each r N LOR P P =−∑ ∑ + ∑ step by a modified objective function Q˜ (f, fn, p) Qf(,) p { aji,, fij p log( a ji fi )} (40 ) j==11i=1 i that satisfies the following conditions [47]: Q˜ (f, fn, p) can easily be maximized with • If the matrix a is non-singular, this cost function is respect to f, e.g., it is quadratic or separable, convex and defines a unique image. ˜ • Q(f , f , p) = Q(f , p) The EM iteration is a mapping of the current image n • Q˜ (f , f , p) ≤ Q(f , p) estimate fn onto the next estimate fn+1 : The two last conditions ensure that the next image r N estimate n+1 1 LOR p ff= n ∑ a j iP= 1,,K rrrr i i N ji, P n + ∑ LOR j=1 ∑ n 1 = ˜ n j′=1 a ji′, i ′=1afji, ′′ i fQffparg maxr (, ,) (39) f ()41

monotonically increases the value of the cost func- 1 n+1 n Usually, the first estimate is a uniform distribution fi = 1, tion: Q(f , p) ≥ Q(f , p). The ML-EM algorithm ′ (next section), the least-square ISRA algorithm i = 1, …, P. The sum over i in the denominator of the [48, 49], and Bayesian variants [50] can be derived second factor in the right hand side is a forward projection and corresponds to Eq. (28): therefore the de- using surrogate functions. n nominator is the average value that would be (iv) Block-iterative methods use at each iteration only a n subset of the data. They are called row-action measured if f was the true image. The sum over j in methods when a single datum is used at each itera- the numerator is a multiplication with the transposed tion as in the ART algorithm. The OSEM method system matrix and represents the backprojection of the (see next section) and its variants are also block- ratio between the measured and estimated data. iterative methods. While allowing significant accel- Finally, the denominator in the first factor is equal to eration of the optimization, these methods do not the sensitivity of the scanner for pixel i. guarantee a monotonic increase of the cost func- The ML-EM iteration has several remarkable prop- tion. In addition, the iterated image estimates tend erties: asymptotically to cycle between S slightly different • The cost function increases monotonically at each it- solutions, where S is the number of subsets. eration, Q(fn+1, p) ≥ Q(fn, p), n * Appropriate under-relaxation can be used to alle- • The iterates f converge for n → ∞ to an image f viate the problem. that maximizes the loglikelihood, • All image estimates are non-negative if the first one is, • The algorithm can easily be implemented with list- ML-EM and OSEM mode data [53, 54, 55, 56] because the only LORs that contribute to the backprojection sum over j in The most widely used iterative algorithms in PET are Eq. 41 are those for which at least one event has been the ML-EM (maximum-likelihood expectation maxi- detected (pj ≥ 1). 76 Positron Emission Tomography

Figure 4.6. 2D reconstruction of a mathematical phantom with the ML-EM algorithm (Nr = 128,Nφ = 256,Nx = Ny = 256). The Poisson log-likelihood (left scale) and the square reconstruction error with regard to the reference image (right scale) are plotted versus the iteration number. The left plot is for ideal noise-free data. For the right plot, pseudo-random Poisson noise has been added for a total of 400,000 coincidences. The cost function increases monotonically in contrast with the error, which reaches a minimum around 10 iterations. The 13th and 64th image estimates obtained from noisy data are shown.

What about stability? The ML-EM cost function does • filter the data before applying the ML-EM algorithm, nmax not include any prior. The algorithm converges there- • stop the algorithm after nmax steps, and use f as fore to the image that “best” fits the data (“best” in the solution estimate. Methods to automatically estimate sense defined by the Poisson likelihood). But fitting too an appropriate number of iterations have been pro- closely the noisy data of an ill-conditioned problem posed [60, 61] though all clinical implementations induces instabilities (Fig. 4.6). In practice, this instabil- determine nmax empirically. ity corrupts the image estimates fn by high-frequency “checkerboard-like” artifacts when the number of iter- The Ordered Subset Expectation Maximization algo- ations exceeds some threshold [57, 58].Various methods rithm [62] is based on a simple modification of can remedy this problem: Eq. (41), which has a significant impact on clinical PET • Introduce a Bayesian prior term into the cost func- imaging by making iterative reconstruction practical. tion (see [59] and references therein), The LOR data are partitioned in S disjoint subsets ʚ • apply a post-reconstruction filter, typically a 3D J1, · · · , JS [1, · · ·,NLOR]. For 2D sinogram data (see Gaussian filter with a FWHM related to the spatial Eq. (5)), one usually assigns the 1D parallel projections resolution that is deemed achievable given the SNR, m,m + S,m + 2S, · · · ,≤ Nφ to the subset Jm+1. Image Reconstruction Algorithms in PET 77

Figure 4.7. Comparison between the FBP and the OSEM reconstruction of a 2D FDG whole-body study, showing a frontal section. The algorithms used for the reconstruction of the transmission scan and of the emission scan are FBP-FBP (left), OSEM-FBP (center), OSEM-OSEM (right).

The ML-EM iteration (Eq. (41)) is then applied in- Poisson distribution are closely related to OSEM, but corporating the data from one subset only. Each guarantee asymptotic convergence under certain subset is processed in a well-deﬁned order, usually in conditions. a periodic pattern where subset Jn mod S is used at Compared to FBP reconstructions, some qualitative iteration n(12): characteristics of images reconstructed from Poisson data using ML-EM or OSEM are: 1 p ffn+1 = n ∑ a j i i ji, P n ∑ ′∈ a ′ jJ∈ ∑ Reduced streak artifacts jJnSmod ji, nSmod i′=1afji, ′′ i • iP= 142,,K () • A better SNR in regions of low tracer uptake, resulting in particular in a better visibility of the contours Empirically, the convergence is accelerated by a factor of the body S with respect to ML-EM. But the asymptotic con- • Some non-isotropy and non-uniformity of the vergence to the maximum-likelihood estimator is no spatial resolution, especially when the range of longer guaranteed. In fact, OSEM tends to cycle values of the attenuation correction factor is large, as between S slightly different image estimates. To mini- e.g. in the chest mize the adverse effects of this behavior it is recom- • A slower convergence for regions of low tracer mended to keep the number of 1D parallel uptake than for regions of high tracer uptake. projections in each subset equal to at least 4. In addition, several authors suggest progressively decreasing Figure 4.7 illustrates some of these properties. the number of subsets during iteration. Finally, we Finally, some comments are in order about data cor- only mention here the row action maximum likeli- rections prior to reconstruction with ML-EM or OSEM. hood (RAMLA) algorithm [63] and the rescaled Physical effects such as detector efﬁciency variations, block-iterative ML-EM algorithm [64]. These two al- attenuation, scattered and random coincidences, etc., gorithms for maximum-likelihood estimation with a must be accounted for to obtain quantitatively correct

12 In the OSEM jargon, such an iteration is called a sub-iteration, and an iteration denotes a set of S consecutive sub-iterations, corresponding to one pass through the whole data set. 78 Positron Emission Tomography images. With analytical algorithms such as FBP the them using the ML-EM algorithm [30]. This yields the data are corrected before reconstruction to comply following iteration: with the line integral model. With the ML-EM algo- s 1 N LOR p rithm, on the contrary, pre-correction must be avoided ffn+1 = n ∑ a j i i N ji, P n ∑ LOR β j=1 ∑ because it would destroy the Poisson character of the j′=1 jji′′a , i′=1afji, ′′ i data and thereby could bias the reconstruction. This iP= 144,,K () means that the ML-EM algorithm should be applied to c α the raw data, and that all physical effects should be in- In the case of the attenuation correction, pj = jpj , β α , s cluded in the system matrix as described in the section and one easily checks that j = 1/ j and pj = pj,so that on the system matrix (above). A full modeling, Eqs. (44) and (43) coincide. however, can be impractical when the system matrix is When the data are acquired in true mode as the dif- too large to be pre-computed. A faster, approximate, ference between the prompt and delayed coincidences, procedure consists of including only the most the shifted Poisson model (described earlier) leads to significant effect – attenuation – in the system matrix. the following modified ML-EM algorithm [26], Corrections for scattered and random coincidences can ++ 1 N LOR prs2 be of the order of 50%, but are often less. Attenuation f n+1 = f n ∑ a jjj i i N ji, P n ∑ LOR α j=1 ∑ ++α correction, however, involves multiplication by factors j′=1 a ji′′, / j i′=1afji, ′′ i jjj()2 r s ranging from 5 to more than 100 ! The attenuation cor- i = 145,,K P () rection is multiplicative and can easily be incorporated where the pj are the data corrected for random and in the ML-EM iteration as shown by Hebert and Leahy α [65], scatter (but not attenuation), j and aj,i are as in equation (43), and ¯rj and ¯sj are low-variance estimates of the 1 N LOR p random and scatter background in LOR j. The mean ffn+1 = n ∑ a j i i N ji, P n random ¯rj is generally estimated using variance reduc- ∑ LOR a /α j=1 ∑ ′= af′′ j′=1 ji′′, j i 1 ji, i tion techniques or from the single photon data (see = K iP143,, () sections Randoms Variance Reduction and Estimation from Single Rates in next chapter). The mean scatter ¯ where the p are the data corrected for all effects except j ¯s is estimated using a model based scatter model (see attenuation, α is the pre-computed attenuation correc- j j section Simulation-based Scatter Correction in next tion factor(13) for LOR j, and the system matrix a does chapter). not include the effect of attenuation. This attenuation- weighting (AW) of the ML-EM algorithm is easily extended to the attenuation-weighted OSEM algorithm (AW-OSEM). The AW-OSEM approach has been shown to perform almost as well as algorithms that model all Variance and resolution with physical effects, with only modest increases in computation time over OSEM applied to pre-corrected sino- non-linear reconstruction gram data [66]. algorithms The previous approach can also be applied to other multiplicative corrections such as the normalization for detector efficiency variations. For more complex, Predicting and controling the statistical properties and e.g., non-linear, relations between the raw data pj and the resolution of reconstructed PET images is of para- c the corrected data pj , an approximate statistical mod- mount importance for quantitative applications of PET eling can be achieved by applying the ML-EM algo- and for task oriented performance studies using nu- s β c β c c rithm to scaled data pj = jpj,where j = /var(pj) is merical observers. For clinical PET, a good awareness a low-variance (smoothed) estimate of the ratio of these properties helps minimizing the probability of between the mean and the variance of the corrected erroneous image interpretations. β data. With this choice of j the scaled data satisfy the Denote the “true” image by f , the measured data s s same relation var(pj) as data obeying Poisson vector by p, and the mean data by < p >= Af ,where A statistics, and it is therefore reasonable to reconstruct is the system matrix (see Eq. (28)). Consider any

13 The ratio between the blank and transmission scans. Image Reconstruction Algorithms in PET 79 speciﬁc algorithm denoted by T (for example 100 ML- also some additional point source Δf located in voxel δ EM iterations with a uniform initial image estimate). j0:Δfj = j,j0, and denote the corresponding mean con- The reconstruction is then f* = T (p), and the recon- tribution to the data by Δp = AΔf. Then the non-lin- struction error is earity of the algorithm T means that, in general,

f* – f = (T (p) – < T (p) >) + (< T (p) > – f) (46) T (p + Δp) ≠ T (p) + T (Δp) (48) where < T (p) > denotes the mean value of the recon- A concrete consequence of this non-linearity can be ob- structed image, which could be estimated by averaging served when the ML-EM algorithm is used with the a large number of images reconstructed by applying the small number of iterations typical of clinical practice: algorithm T to statistically independent realizations of the reconstruction of a unit “point source” sitting on top the random data vector p. The first term in the RHS of of a uniform background broadens when the strength of Eq. (46) is the statistical error due to the fluctuations of the background is increased. Similarly, the anisotropy of data p around its mean value < p >. The statistical error the attenuation correction factors for an elongated is characterized by the covariance matrix object such as the chest at the level of the shoulders is translated by ML-EM into an anisotropy of the point re- T T T T Vj,j′ = < ( (p)j – < (p)j >) ( (p)j′ – < (p)j′ >) > sponse: if the point source is located in an ellipsoidal at- j, j′ = 1, … , P (47) tenuating medium with long axis along the x-axis, the point response takes an ellipsoidal shape with long axis the diagonal elements of which give the variance of along the y-axis. One should therefore interpret with each reconstructed pixel value. The second term in the care results on the “reconstructed resolution of ML-EM” RHS of Eq. (46) is the systematic error or bias:even the obtained for isolated point or line sources. Similar ob- mean value of the reconstructed image is not exact servations hold for MAP or other non-linear algorithms. because of sampling, apodization, finite number of A local infinitesimal point response function, de- iterations, etc. pending both on the data p and on the position at voxel j , can be defined as the image For a linear reconstruction algorithm the image co- 0 r r variance can easily be determined once we know the 1 rrr∂T ()Af δ r =+((TTpppε )–()) =∈P (49) statistical (e.g. Poisson) properties of the data. In addi- pj, 0 lim ε→0 ε ∂f tion, with a linear algorithm, the systematic error is j0 fully characterized by the point response defined as the Approximate expressions and efficient numerical tech- reconstruction of the mean data of a point source niques have been developed [67] to calculate this point ∈ located in a voxel j0 [1, … , P]. While the point re- response, as well as methods to design a penalty term sponse depends in general on the position of the voxel log Pr{f} in Eq. (32)14 that guarantee homogeneous res- j0 relative to the scanner, it does not depend on the olution [68]. An alternative approach to improve the strength of the point source, or on whether that source homogeneity of the resolution consists in pursuing the is sitting or not over some background. This allows an ML-EM iteration beyond the point where the image is unambiguous definition of the resolution, using para- deemed acceptable, and in post-filtering this image meters such as the FWHM of the point response. For with an appropriate filter [69]. the FBP algorithm, in particular, the statistical error The image covariance (Eq. (47)) can be estimated and the bias are determined by the apodized ramp numerically by reconstructing a large number of filter, and the trade-off between these two errors is well data sets simulated with statistically independent understood (see the section Ill-posedness of the pseudorandom noise realizations. An alternative for Inverse X-ray Transform). maximum-likelihood algorithms is to calculate the For non-linear algorithms such as ML-EM, the de- Fisher information matrix, the inverse of which is rivation of analytical expressions for the covariance related by the Cramer-Rao theorem to the covariance of matrix is complex. More importantly, the point re- the ML estimator (see e.g. [7]). An approximate expres- sponse becomes object dependent. To understand this sion of the covariance, for the more relevant case where important point, consider any data set p, measured e.g. ML-EM iteration is stopped well before convergence, as a “normal” whole-body tracer distribution. Consider was derived in [57], and validated numerically in [58].

14 This penalty depends on the data and can no longer be interpreted as a real Bayesian prior.The algorithm is then better referred to as a penalized likelihood method. 80 Positron Emission Tomography

The major conclusion is that the variance of the ML- of the transaxial FOV. However, for each θ ≠0,not all EM reconstruction is roughly proportional to the LORs parallel to n and crossing the FOV of the scanner image itself, i.e. are measured. That is, the parallel projection p(s, n) is measured only for some subset of LORs s ∈ M(n) ʚ n⊥. 2 Vj,j C < fj > j = 1, … , P (50) One says that this projection is truncated. Two important properties of the 3D data can already for some constant C depending on the object and on be stressed: the number of iterations. Thus, the ML-EM reconstructions have lower variance in regions of low tracer (i) 3D data are redundant since four variables are re- uptake, thereby allowing good detectability in these quired to parameterize p(s, n) (two for the orienta- regions. This is in contrast with FBP reconstructions, tion n and two for the vector s) whereas the image in which the noise arising from the high uptake regions only depends on three variables (x, y, z). spreads more uniformly over the whole FOV, resulting (ii) 3D data are not invariant for translation as in the in particular in the well-known streak artefacts. 2D case because the cylindrical detector has a finite length and the measured projections are truncated. The vector s can be defined by its components (s, u) on two orthonormal basis vectors in n⊥. 3D Data Organization r ss=+−(,,)(cos φφsin 0 u sin θφθφθ sin , sin cos ,) cos ()53 Two-dimensional Parallel Projections The variable s coincides with the 2D radial sinogram φ We have seen in the previous section that 2D data ac- variable of Eq. (3). We will thus write p(s, n) = p(s, u, , θ φ quired with a ring scanner can be stored in a sinogram ). The subset p(s, u, , 0) is the 2D sinogram of the p(s, φ). If the data are modeled as line integrals, as for slice z = u. analytic algorithms, the sinogram is a set of 1D parallel The LORs measured by a PET scanner do not uni- φ θ projections of f(x, y) for a set of orientations φ ∈ [0, π]. formly sample the variables (s, u, , ), and therefore in- Similarly, the LORs measured by a volume PET scanner terpolation is needed to reorganize the raw data into can be grouped into sets of lines parallel to a direction parallel projections. This holds both for multi-ring θ φ scanners and for scanners based on flat panel detectors. specified by a unit vector n = (nx, ny, nz) = (–cos sin , cos θ cos φ, sin θ) ∈ S2 where S2 denotes the unit sphere. The angle θ is the angle between the LOR and the transaxial plane, so that the data acquired in a 2D Oblique Sinograms acquisition therefore correspond to θ = 0. The set of Some analytic algorithms use an alternative parame- line integrals parallel to n is a 2D parallel projection of terization of the parallel projections, where the vari- the tracer distribution: rr r r psn(,)=+∫ dtf ( s tn ) (51 ) y ra where the position of the line is specified by the vector da s ∈ n⊥, which belongs to the projection plane n⊥ orthogonal to n. θ

Consider a cylindrical scanner with Nr rings of radius Rd, extending axially over 0 ≤ z ≤ L,where L = s NrΔz. Assuming continuous sampling, this scanner φ=0 x ζ z measures all LORs such that the line deﬁned by (s, n) has two intersections with the lateral surface of the cylinder (these intersections are the positions of the two detectors in coincidence). The set of measured ori- d entations is b rb r Ω(){(,)|[,),[θφθφπθθθ==n ∈0 ∈−+ , ]} max max max Figure 4.8. A transverse and a longitudinal view of a multi-ring scanner. ()52 An LOR connecting a detector da in ring ra to a detector db in ring rb is shown, with the four variables (s, φ, ζ, θ) used for the oblique sinogram parameteri- θ =−22 φ with tan max LRR/2 dF,where RF is the radius zation. The particular LOR represented has = 0. Image Reconstruction Algorithms in PET 81 able u in Eq. (53) is replaced by the axial coordinate If the radius of the FOV is small, θ in Eq. (55) is ap- = u/ cos θ, the average of the axial coordinates of the proximately independent of s. With this approximation, two detectors in coincidence. One deﬁnes weighted the coincidences between two rings ra and rb can be θ parallel projections used to build an oblique sinogram ps(., ., , ) with = (r + r )Δz/2 and tan θ = (r –r )Δz/(2R ). ps(,φζθ , , )= ps (, ζ cos θφθ , , ) cos θ a b b a d s To save storage and computation, some volume scan- = ∫ dt′ f(, s cos φφφ− t′ sin s sin + ners use axial angular undersampling by averaging sets ′ φζ+ ′ θ θ ttcos,) tan ()54 of sinograms with adjacent values of . The degree of φ ∈[0 π undersampling is characterized by an odd integer pa- The domain of the variables is |s| ≤ RF, , ), rameter S, called the span. The resulting sampling is ⎛ ⎞ ⎡ non-interleaved: θ ≤−⎜ 22⎟ ζθ∈−||tan Rs22 , || arctan ⎝LRs /2 d ⎠ ,and ⎣⎢ d θ ==−+Δ K ⎤ tan iSθθ z/ (256 Rd ) i imax , , imax ( ) LRs−−||tanθ 22 θ ζ =≤≤−−Δ d ⎦⎥ (Fig. 4.8). For each pair , the izzzr/()()222 ziminθθ i N zi min θ function ps(., ., , ) is called an oblique sinogram by analogy with Eq. (3). The similarity with the 2D format where zmin(iθ) = max(0,|iθ|S – S/2). Each sample makes this oblique sinogram format suited to the ana- (iθ,iz) is obtained by averaging data from all pairs of lytic rebinning algorithms, which reduce the 3D data to rings such that 2D data. −≤−≤+ iSSθθ//22 rriSSba Consider now the discrete sampling of the oblique =+ irrzba ()57 sinograms. The measured LORs connecting detector da in ring ra to detector db in ring rb corresponds to para- The sampling scheme is often illustrated on a 2D meters (s, φ, , θ) in Eq. (48), where s and φ are deter- diagram, the “Michelogram”, in which each grid point mined as in the 2D case (Eq. (4)), and the axial represents one ring pair and each sampled oblique variables are determined by sinogram (iθ,iz) is represented by a line segment con- ⎛ ⎞ necting the contributing pairs r ,r (Fig. 4.9). Just as for tan θ =−()/rrzΔ 2 R22 − s a b ba⎝ d ⎠ the azimuthal undersampling (“mashing”, see end of ζ =+Δ ()/rrzab 255 ()sinogram data and sampling section, above), a good

rb 15 14 13

12 iz 11 0 1 89

iθ 7 Figure 4.9. A Michelogram for a 16-ring scanner, illustrating the axial sampling with 6 a span S = 5 and a maximum ring difference 5 2 defined by imax = 2. Each grid point corresponds to a ring pair, and each diagonal line segment links the ring pairs (2 1 or 3 except at the edge of the FOV) that are

1234 averaged to form one oblique sinogram. ra The samples located outside the square 0 1234 56 78910 11 12 13 14 15 (dots) are the unmeasured oblique sinograms needed to obtain a shift- invariant response. In the 3DRP algorithm, -1 these missing sinograms are estimated by forward-projecting an initial 2D -2 reconstruction of the direct segment iθ = 0. 82 Positron Emission Tomography choice of the span S depends both on the SNR and on of non-truncated 2D projections with orientations n ∈ the radius of the FOV.Values between 3 (for high-sta- Ω,where Ω is a subset of the unit sphere that satisfies tistics brain studies) and 9 (for low-count, whole-body Orlov’s condition. The reconstructed image is a 3D studies) are standard. backprojection When the radius of the FOV is large, more accurate rrrrrrrr interpolation is needed to reorganize the raw data into fr()= ∫ dnpF ( s=− r ⋅ ( rnnn ), ) (60 ) parallel projections according to Eq. (55), but the sam- Ω pling pattern (Eq. (56)) can be kept. which, as in 2D, is the sum of the filtered projections pF for all lines containing the point r. The filtered projections are given by rr r r r r r r F = ∫ ′′ − ′ 3D Analytic Reconstruction by p(,) snr dspsnhs ( ,)C ( sn ,) (61 ) n ⊥ Filtered-backprojection In this equation, the 2D convolution kernel hC(s) is the 2D inverse Fourier transform of the filter function due The Central Section Theorem to Colsher [72]: r rr r r r rr = ′δ ⋅ ′ −⊥1 = ||v ∈ The central section theorem (Eq. 10) can be general- HvnC (,) {∫ dnvn ( )} r vn ized to 3D, and states that Ω LvΩ () rr r r r ()62 Pvn(,)=∈ Fv () v n⊥ (58 ) where δ is the Dirac delta function, and LΩ(ν) is the arc where length of the intersection between Ω and the great rr r rr r r circle normal to ν (Fig. 4.10). Orlov’s condition ensures =⋅π Pvn(,)∫ dspsn (,)exp(–2 isv ) (59 ) that LΩ(ν) > 0. An expression of this filter in terms of n ⊥ ν ν φ θ the variables s, u, , can be found in [72]. Like the is the 2D Fourier transform of a parallel projection and ramp filter, Colsher’s filter is proportional to the F is the 3D Fourier transform of the image. Note that as modulus of the frequency. In contrast to the 2D case, the integral in Eq. (59) is over the whole projection however, the filter depends on the angular part of the plane n , the central section theorem is only valid for non-truncated parallel projections. Geometrically, this theorem means that a projection z of direction n allows the recovery of the Fourier transform of the image on the central plane orthogonal to n ν in 3D frequency space. A corollary is that the image can be reconstructed in a stable way from a set of non- truncated projections n ∈Ω ʚ S2 if and only if the set Ω has an intersection with any equatorial circle on the unit sphere S2. This condition is due to Orlov [70]. The Ω θ (ν equatorial band ( max) in Eq. (52) satisfies Orlov’s LΩ ) condition for any θ > 0. max Ω(θ The direct 3D Fourier reconstruction algorithm is a max) direct implementation of Eq. (58) [71]. This technique involves a complex interpolation in frequency space, and has not so far been used in practice. However, Matej [15] recently demonstrated a significant gain of r reconstruction time compared to the standard FBP. 2 Figure 4.10. Each vectorr r n on the unit sphere S is the direction of one 2D parallel projection p( s , n ). The set of directions Ω(θmax) measured by a cylindrical scannerr (equation (52)) is shown as a grey subset. The Fourier 3D Filtered Backprojection transform F(ν ) can be recovered from any projection along the measuredr (thick line) segment of the great circle orthogonal to ν . The

Following the same lines as for the 2D FBP inversion, reciprocal I/LΩ of the length of this segment is the angular part of the Eq. (58) leads to a two-step inversion formula for a set reconstruction filter. Image Reconstruction Algorithms in PET 83 frequency. Another specificity of 3D reconstruction, backprojected. With a 24-ring scanner, using dmax = 19 due to the redundancy of the 3D data, is that the recon- instead of the maximum value Nr – 1 = 23 still incorpo- struction filter is not unique [73]. Colsher’s filter, rates 95% of the data. however, yields the reconstructed image with the Images reconstructed with the 3DRP algorithm minimal variance under fairly general assumption on share many features with 2D FBP reconstructions, in- the data statistics [74]. cluding linearity (the reconstructed FWHM in a given The discretization of the 3D FBP algorithm is based point is the same for a cold and for a hot spot) and the as in 2D on replacing integrals by trapezoidal quadra- prevalence of streak artifacts in low-count studies. One tures and on linear interpolation in s for the 3D difference with 2D reconstructions is the axial depen- backprojection. The 3D backprojection is the most dence of the spatial resolution, due to the increasing time-consuming step in the algorithm and various contribution of the estimated data near the edges of techniques have been proposed to accelerate this pro- the axial FOV (see Fig. 4.9). This property of 3DRP cedure (see [17] and references therein). The 2D convo- reflects the non-uniform sensitivity of the volume PET lution is implemented in frequency space as: scanner. Clearly, any analytic or iterative algorithm has to somehow reflect this property in the reconstruction. rr r rr r rr rr F = . With the rebinning algorithms described below, the psn(,)r⊥ dvhvnwvPvn C (,) () (,) exp2 ( isv)() 63 n lower sensitivity in the edge slices is translated in an increased variance rather than in a degraded spatial where P(ν, n) is the 2D Fourier transform of the non- resolution. truncated projection and w(ν) is an apodizing window, which plays the same stabilizing role as in 2D (see the remark below Eq. (16) and reference [75] for details on the discrete implementation using the 2D FFT). 3D Analytic Reconstruction by Rebinning The Reprojection Algorithm

The 3D FBP algorithm is valid only for non-truncated The high sensitivity of a PET scanner operated in 3D parallel projections. In almost all PET studies, the mode is directly related to the large number of tracer distribution extends axially over the whole FOV sampled LORs, which is much larger than the number of the scanner, and the only non-truncated parallel of reconstructed pixels: NLOR >> P (by a factor propor- projections are those with θ = 0. For sampled data, the tional to Nr). We have already mentioned in the previ- θ θ θ ous section that this data redundancy results in the equality = 0 is replaced by < 0 for some small θ non-uniqueness of the reconstruction filter. From the maximum oblicity angle 0, which corresponds typi- practical point of view, redundancy increases the data cally to the maximum ring difference d2D,max incorporated in a 2D acquisition. storage requirements and the computational load for The standard analytic reconstruction algorithm for reconstruction and data correction. volume PET scanners is the 3D reprojection algorithm This observation has motivated the development of (3DRP) [76], which consists of four steps: rebinning algorithms. A rebinning algorithm is an algorithm that estimates the ordinary sinogram (Eq. (3)) (i) Reconstruct a first image estimate f2D(r) by apply- of each sampled transaxial section z ∈ [0,L], i.e. ing the 2D FBP algorithm to the non-truncated ∞ θ θ φφφφ==−= data subset < 0. pszreb ( , , )∫ dtfxs ( cos t sin , ys sin (ii) Forward project f (r) to estimate the unmeasured −∞ 2D + φ parts p( s ∉ M(n), n) of a set of 2D parallel projec- tz cos,) ( 64 ) tions n ∈Ω(θ ), φ θ max from the measured oblique sinograms ps(s, , , ) (iii) Merge the measured and estimated data to form defined by Eq. (54). Each rebinned sinogram is then re- non-truncated projections, constructed separately using a 2D reconstruction algo- (iv) Reconstruct these merged data with the 3D FBP al- rithm. This procedure is illustrated in Fig. 4.11. gorithm described in the previous section. Rebinning would be trivial for noise-free data because one easily checks by comparing Eqs. (54) and In general, a value of θ smaller than the scanner max (3) that maximum axial acceptance angle is used to limit the φφζθ=== amount of missing data, which must be estimated and pszpsreb(, , ) s (, , z ,065 ) ( ) 84 Positron Emission Tomography

2Nr -1 transaxial slices

slice by slice 3D acquisition 2D reconstruction + corrections

2N r-1 rebinned sinograms 2 Nr oblique sinograms

φ φ

Rebinning

s s

Figure 4.11. Schematic representation of the principle of a rebinning algorithm for 3D PET data.

In the presence of noise, however, an efficient rebin- is the maximum value of the variable t′. Using this ap- ning method should optimize the SNR by exploiting proximation, Eq. (65) can be extended to the whole set of oblique sinograms to estimate p . reb psz(,φφζθ , ) ps (, ,== z ,0 ) (66) Several approximate [77, 78, 79, 80, 81] and exact [82, reb s 83] rebinning methods have been published. We only and by averaging all available estimates, SSRB defines summarize the two algorithms that have been most the rebinned sinograms by used in practice. θ max (sz,) φ 1 θφζθ= pdpszssrb (,s, z) = ∫ s (, , , ) θ −θ 2s,z)max ( max (sz,) The Single-slice Rebinning Algorithm ()67 (SSRB) θ ⎛ − 22− ⎞ where max (s, z) = arctan ⎝min[]zL , z / Rd s ⎠ is This approximate algorithm [77] is based on the as- the maximum axial aperture for an LOR at a distance s sumption that each measured oblique LOR only tra- from the axis in slice z. The algorithm is exact for verses a single transaxial section within the support of tracer distributions which are linear in z, of the type the tracer distribution. Referring to the third argument f(x, y, z) = a(x, y) + zb(x, y). For realistic distributions, of f in Eq. (54), this assumption amounts to neglecting the accuracy of the approximation will decrease with θ θ the product RF tan ,where RF , the radius of the FOV, increasing RF and max. Axial blurring and transaxial Image Reconstruction Algorithms in PET 85 distortions increasing with the distance from the axis Like all analytic algorithms, FORE assumes that the φ θ of the scanner are the main symptoms of the SSRB ap- data ps (s, , , ) are line integrals of the tracer dis- proximation. tribution and that each oblique sinogram is sampled φ ∈ π The discrete implementation of the SSRB algorithm over the whole range (s, ) [–RF,RF ] x [0, ]. is simply the extension of the technique described in Therefore, the raw data must be corrected for all the multi-slice 2D data section (above) to build 2D data effects including detector efficiency variations, atten- with a multi-ring scanner operated in 2D mode, with uation, and scattered and random coincidences, d2Dmax replaced by a larger value dmax. The choice of dmax before applying FORE. Also, when the data are in- entails a compromise between the systematic errors complete due to gaps in the detector assembly, the

(which increase with dmax) and the reconstructed image sinograms must be filled as discussed in the section variance (which increases with decreasing dmax). on properties of the inverse 2D radon transform (above). Refer to [81] for a detailed description and for the derivation of FORE. In practice, FORE is sufficiently accurate when the The Fourier Rebinning Algorithm (FORE) θ axial aperture max is smaller than about 20°, though The approximate Fourier rebinning algorithm [81] is the limit depends on the radius of the FOV and on the more accurate than the SSRB algorithm and extends type of image. Beyond 20°, artifacts similar to those ob- the range of 3D PET studies that can be processed served with SSRB (at lower apertures) appear [84]: de- using rebinning algorithms. The main characteristics graded image quality at increasing distance from the of FORE is that it proceeds via the 2D Fourier trans- axis. Two variations of FORE, the FOREJ and FOREX form of each oblique sinogram, defined as rebinning algorithms [82, 83], are exact in the limit of continuous sampling, and have been shown to over- 2π come this loss of axial resolution when reconstructing ζθ=− φ φ π × Ps (,,,) v k∫ d exp( ik )∫ ds exp(-2 isv ) high statistics data acquired with a large aperture 0 scanner [85]. However, the current implementation of ps(,,,)φζθ k∈∈ Zv , (68 ) s the FOREJ algorithm [82] is more sensitive to noise than FORE since the correction term involves a second where k is the azimuthal Fourier index. Rebinning is derivative of the data with respect to the axial coordi- based on the following relation between the Fourier nate , and the application to low statistics data transforms of oblique and direct sinograms: remains questionable. ζθπθ=+ Pvkzss( , , ,0269 ) Pvk ( , , z k tan / ( v ), ) ( )

For each θ such that the oblique sinogram , θ is measured (see Eq. (54)), the RHS yields an indepen- Hybrid Reconstruction Algorithms dent estimate of the direct data θ = 0. FORE then av- erages all these estimates to optimize the SNR. The for 3D PET accuracy of the approximation (Eq. (69)) breaks down at low frequencies ν. Therefore, for all frequen- The future evolution of image reconstruction in PET cies below some small threshold, the Fourier trans- will most probably lead to the generalized utilization form of the rebinned data is estimated using the of iterative algorithms, both for 2D and for 3D data. As SSRB approximation. shown in the next section, it is straightforward to The main steps of the FORE algorithm are: extend iterative methods, such as OSEM, to fully 3D (i) Initialize a stack of Fourier transformed sino- scanning. These algorithms have the potential to model ν grams Pfore( ,k,z), accurately the data acquisition, the measurement noise, (ii) For each oblique sinogram , θ and also the prior information on the tracer distribu- ν θ a. Calculate the 2D Fourier transform Ps( ,k, , ), tion. In contrast, analytic algorithms are bound to the b. For each frequency component (ν,k), increment line integral representation of the data. Even though ν θ πν ν θ Pfore( ,k, –k tan /(2 )) by Ps( ,k, , ), some physical effects can be incorporated in pre- or ν (iii) Normalize Pfore( ,k,z) for the varying number of post-processing steps, an accurate modeling of the contributions it has received, Poisson statistics of the data is difficult with analytic (iv) Take the 2D inverse Fourier transform to get the methods. To date, however, the computational burden φ rebinned data pfore(s, ,z). of fully 3D iterative algorithms remains a major issue 86 Positron Emission Tomography for some applications involving multiple acquisitions, as the FORE+OSEM(AW) algorithm. Note that this al- or for research scanners such as the HRRT which gorithm is still approximate: even in the absence of at- sample a very large number of LORs. The current prac- tenuation and scatter, the rebinned sinograms are not tice of undersampling these data (see above) to acceler- independent Poisson variables because of the complex ate reconstruction is contradictory with the aim of linear combination of the 3D data during FORE rebin- accurate modeling claimed by iterative methods. ning. Strictly speaking, it is inappropriate to recon- This limitation has led to the application of hybrid struct the rebinned data using the OSEM algorithm algorithms for 3D PET data [41, 66, 91]. These algo- derived for independent Poisson data, and it is prefer- rithms first rebin the 3D data into a multi-slice set of able to use a weighted least-square method [87] or the ordinary sinogram data, using e.g. the SSRB method, NEC scaling technique [30] (Eq. (44)). In each case, one or, more often, FORE. Each rebinned sinogram is then needs to estimate the variance of the rebinned data reconstructed using some 2D iterative algorithm. This [88] and also, ideally, the covariance [89]. hybrid approach provides a significant time gain with Finally, modeling the shift-variant detector response respect to fully 3D iterative reconstruction. (e.g. due to crystal penetration) has not yet been The two components of hybrid algorithms, rebinning attempted with hybrid methods. One approach would and iterative methods, have been discussed in previous be to apply sinogram restoration prior to rebinning. sections. In this section, we briefly discuss the interplay A related problem occurs with scanners such as the between these two elements, the main difficulty being Siemens/CPS HRRT [10], which has gaps between adja- to model the rebinned data that are presented to the 2D cent flat panel detector heads. Since Fourier rebinning iterative algorithm. We focus on the application of requires complete sinogram data, these gaps must be FORE followed by a 2D OSEM reconstruction but the filled before rebinning. Gap filling techniques may same problems would arise with other combinations, range in complexity from linear interpolation to such as SSRB followed by an iterative minimization of forward projection of an image reconstructed from the a 2D penalized weighted least-square (PWLS) cost 2D segment by using a system matrix which accounts function [86]. for the missing data [12]. In general, however, a 3D iter- One of the major benefits of iterative reconstruction ative reconstruction is preferable to an hybrid one arises from a correct modeling of the data statistics, because the gap filling procedure followed by the re- which allows to weight each LOR according to its vari- binning is sensitive to noise propagated from regions ance. This is the reason why improved image quality is with high attenuation. obtained by reconstructing the raw, uncorrected data Despite these difficulties, fast hybrid algorithms such with a system matrix incorporating the effects of atten- as FORE+OSEM(AW) have been applied to whole- uation, normalisation and scatter, rather than by recon- body FDG scans, and shown to provide for these structing pre-corrected data with a system matrix studies an image quality comparable to fully 3D itera- modeling only the detector’s geometric response. tive reconstruction (see [90, 92] and the example in Ideally, therefore, we would like to develop a hybrid al- Fig. 4.13 below). gorithm in which un-corrected rebinned data are reconstructed by means of a 2D iterative algorithm including the effects of attenuation, etc. This approach is impossible because the FORE Eq. (69) must be Fully 3D Iterative Reconstruction applied to fully pre-corrected data as discussed at the end of the previous section. The rebinned data must then be reconstructed with a 2D iterative algorithm Axial and transaxial undersampling techniques were which does not model the pre-corrected physical developed to reduce the data to a manageable size effects. while hybrid algorithms were developed to achieve One solution to improve the statistical model is to fast reconstruction for clinical PET scanners with de-correct the data for the physical effects after the re- limited computer resources. With sufficient CPU binning. This de-correction restores Poisson-like sta- power and disk capacity these early approaches are tistics to the rebinned data, and the physical effects can not needed. The application of fully 3D iterative re- then be reintroduced in the system matrix. If we hy- construction methods then allows to overcome the pothesize that the most important effect is that of at- limitations of the hybrid algorithms discussed in the tenuation, we can decorrect for attenuation only and previous section. then reconstruct the de-corrected rebinned data with We have seen that iterative reconstruction methods AW-OSEM (see Eq. (43)). This approach is referred to are conceptually independent of the 2-D or 3-D nature Image Reconstruction Algorithms in PET 87 of the data. Several implementations have been de- The benefit expected from fully 3D iterative recon- scribed for 3-D data, based on the Space Alternating struction is easily demonstrated for scanners with Generalized EM [93], on ML-EM and OSEM [94], on large polar aperture, particularly in the presence of Bayesian estimation [26, 95], and on the row-action gaps. Fig. 4.12, for example, shows a high resolution maximum likelihood [90]. All algorithms of the phantom measured with the HHRT brain scanner. The ML-EM type described above, for instance, can be bottom image was reconstructed with FORE+OSEM readily generalized to 3D PET by replacing the system (AW), while the top one was reconstructed with matrix aj,i describing the acquisition geometry (equa- OSEM3D(ANW), where “ANW” indicates that both the tion (29)) by its 3D equivalent, which takes into normalization and attenuation corrections are incor- account the axial coordinates of the LORs. For block- porated in the system matrix. The horizontal streak ar- iterative methods such as OSEM (see Eq. (42)), the set tifacts in the coronal section of the FORE+OSEM(AW) of LORs parameterized by the two transaxial sinogram image are attributed to the gap filling step prior to φ indices sk, j in Eq. (5) and by the two axial coordinates FORE. Blurring can also be observed on the 3 bright- iθ, iz in Eq. (56) must be divided into subsets. Most est rods at the edge of the cylinder. When the polar implementations simply subdivide the azimuthal index angle is smaller, as with many clinical scanners, the φ j, exactly as in the 2D case. Each subset then contains bias introduced by FORE is small, and the benefit of all axial samples iθ, iz. 3D reconstruction is harder to visualize, especially at

Figure 4.12. High resolution phantom data acquired on the HRRT: comparison of a fully 3D iterative reconstruction using OSEM3D(ANW) (top) and of a hybrid reconstruction with FORE+OSEM(AW) (bottom). Both images were reconstructed with 4 iterations and 16 subsets. The phantom is oriented vertically in the FOV of the scanner and the vertical axis on the coronal section is parallel to the axis of the scanner (courtesy K. Wienhard, Köln). 88 Positron Emission Tomography low count statistics and when regularization is reduced using a factorized model and by taking achieved by post-reconstruction smoothing. This is advantage of symmetries, the computation time is illustrated by Fig. 4.13, which shows whole-body necessarily longer than with simplified system patient data processed with FORE+OSEM(AW) and models. This lead us to considerations of the poten- OSEM3D(ANW). Note the similarity between the two tial for parallel-processing of image reconstructions reconstructions, even in regions with high attenuation on processor arrays. (shoulder and neck for this patient with arms up). In contrast with the algorithms illustrated above, the fully-3D image reconstruction developed by Parallel Implementation of Iterative Leahy et al. [24, 26, 95, 96] is based on an extensive Reconstruction system model. The algorithm incorporates a shifted Poisson model that includes the statistics of true, The need for parallel implementation of the ML-EM al- scattered and random coincidences, as well as gorithm was already recognized in the mid-eighties. positron range, annihilation photon acolinearity, at- Pioneering work proposed the use of a cluster of com- tenuation, sinogram sampling, detector dead-time modity PCs [97] or dedicated hardware [98]. But as and efficiency, block detector effects, and the spatially soon as commercial parallel systems became available, varying detector resolution due to parallax (depth of dedicated algorithms were developed on high-end interaction) and Compton scatter in the scintillators computers such as transputers [99, 100], hypercubes (Chapter 2). Although the size of the system matrix is [101], meshes [102], rings [103], fine-grain message-

Figure 4.13. Whole-body FDG scan on an HR+ tomograph, reconstructed using FORE+OSEM(AW) (top) and OSEM3D(ANW) (bottom), in both cases with 4 iterations and 16 subsets. A 3D gaussian filter with FWHM 4 mm was applied after reconstruction. The orthogonal views are passing through the cursor (small circle in the neck area). Image Reconstruction Algorithms in PET 89 passing machines [104], linear arrays of DSPs [105] to cite a few examples. Recent efforts concentrated on Acknowledgment using clusters of multi-processor PCs, sometimes called component off the shelf (COS), and combine This work has been supported in part by NIH Grant both shared and distributed memory approaches. This CA-74135 and by the grant G.0174.03 of the FWO choice is dictated by the cost/performance ratio of the (Belgium). hardware, by its flexibility and by the possibility to upgrade the system with faster and cheaper hardware in this very competitive market. One key problem in distributed computing is to optimize the balance References between computation and communication amongst the nodes. The ultimate goal is to keep individual processors busy all the time by interleaving I/O and 1. Virador PRG, Moses WW, Huesman RH. Reconstruction in PET cameras with irregular sampling and depth of interaction capabil- computation. A good measure of the performance of a ity. IEEE Trans Nucl Sc 1998;NS-45:1225–30. parallel algorithm is how well the speed-up factor 2. Natterer F. The mathematics of computerized tomography. New scales linearly with the number of nodes. In their work, York: Wiley, 1986. 3. Townsend DW, Isoardi RA, Bendriem B. Volume imaging Shattuck et al [106] describe a parallel implementation tomographs. In: Bendriem B, Townsend DW (eds) The theory and of the MAP-PCG reconstruction [26] using a master- practice of 3D PET, Dordrecht: Kluwer Academic, pp 111–32, 1998. slave model with 9 dual PC nodes. The work of Vollmar 4. Natterer F,Wuebbeling F. Mathematical methods in image reconstruction. SIAM Monographs on Mathematical Modeling and [107] describes a parallel extension of the OSEM3D re- Computation, 2001. construction [92] and is also using a masterslave 5. Kak AC, Slaney M. Principles of computerized tomographic imaging. New York: IEEE Press, 1988. model with 7 quad PC nodes. By calculating the system 6. Barrett HH, Swindell W. Radiological Imaging. New York: matrix on the fly and neglecting the physics of the de- Academic Press, 1981. tection system these authors could handle very large 7. Barrett HH, Myers KJ, Foundations of Image Science. New York: Wiley, 2004. reconstruction problems on the HRRT. The HRRT 8. Ramm AG, Katsevich AI. The radon transform and local tomogra- scanner acquires generally data in span 3(9) with a phy. New York: CRC Press, 1996. 9. Quinto ET. Exterior and limited-angle tomography in non- maximum ring difference of 67, which generates 3D destructive evaluation. Inverse Problems 1998;14:339–53. data of 983 MB (326MB). The work of Jones et al [108] 10. Wienhard K et al. The ECAT HRRT: Performance and first clinical is another parallel extension of the OSEM3D recon- application of a new high-resolution research tomograph. In: Records of the 2000 IEEE Medical Imaging Symposium, Lyon, struction [92]. It uses a single program multiple-data France. (SPMD) rather than a master-slave model. These 11. Karp JS, Muehllehner G, Lewitt RM. Constrained Fourier space method for compensation of missing data in emission-computed authors have shown that image space decomposition tomography. IEEE Trans Med Imag 1988;MI-7:21–5. (ISD) and projection space decomposition (PSD) were 12. de Jong, H, Boellaard R, Knoess C, Lenox M, Michel C, Casey M, roughly equivalent since the communication burden Lammertsma A. Correction methods for missing data in sinograms of the HRRT PET scanner. IEEE Trans Nucl Sc 2003;NS- was large at forward projection when using ISD but 50:1452–1456. was also large at backprojection when using PSD. 13. O’Sullivan JD. A fast sinc function gridding algorithm for Fourier However, by developing an efficient I/O subsystem and inversion in computer tomography. IEEE Trans Med Imag 1985;MI-4:200–7. reorganizing the data, these authors finally favored the 14. Schomberg P, Timmer J. The gridding method for image recon- PSD model [109]. The performance of this parallel struction by Fourier transforms. IEEE Trans Med Imag 1995;MI- 14:596–607. implementation of OSEM3D was shown to scale rela- 15. Matej S, Lewitt RM. 3D FRP: Direct Fourier reconstruction with tively well up to 16 nodes (32 processors). A commer- Fourier reprojection for fully 3D PET. IEEE Trans Nucl Sc cial implementation of this computing cluster uses 2001;NS-48:1378–85. 16. Press WH, Teukolsky SA, Vetterling WT, Flannery BP. Numerical 8 nodes of dual Pentium 4 Xeon at 3.0 Ghz, and per- recipes in C, Cambridge University Press 1992. forms one iteration of OSEM3D in about 20 min for a 17. Egger ML, Joseph C, Morel VC. Incremental beamwise backprojection using geometrical symmetries for 3D PET reconstruction 3D sinogram set of 983 MB and an image size of in a cylindrical scanner geometry. Phys Med Biol 1998; 43:3009–24. 256×256×207 (27 MB). Finally, the PARAPET initiative, 18. Brady ML. A fast discrete approximation algorithm for the Radon currently known as the STIR project [110], has devel- transform. SIAM J Comp 1998;27:107–19. 19. Brandt A, Mann J, Brodski M, Galun M. A fast and accurate multi- oped a generic, multi-platform and multi-scanner, im- level inversion of the Radon transform. SIAM J Appl Math plementation of OSEM3D using an object-oriented 1999;60:437–62. 20. Bertero M, Boccacci P. Introduction to inverse problems in library [111, 112]. The parallel implementation uses a imaging. Bristol: IOP Publishing, 1998. master-slave model and a PSD scheme. On a 12-node 21. Huesman RH. The effects of a finite number of projection angles Parsytec CC system it provides a factor 7 speed-up and finite lateral sampling of projections on the propagation of statistical errors in transverse section reconstruction. Phys Med compared to serial mode. Biol 1977;22:511–21. 90 Positron Emission Tomography

22. J Fessler, 2001 NSS/MIC statistical image reconstruction short 45. Lalush D, Tsui B. A fast and stable maximum a posteriori course notes, www.eecs.umich.edu/ fessler/papers/talk.html conjugate gradient reconstruction algorithm. Med Phys 1995; 23. Leahy R, Qi J. Statistical approaches in quantitative positron emis- 22:1273–1284. sion tomography. Statistics and Computing 2000;10:147–65. 46. Leahy R, Byrne C. Recent developments in iterative image 24. Qi J, Leahy RM, Cherry SR, Chatziioannou, Farquhar TH. High- reconstruction for PET and SPECT. IEEE Trans Med Imag resolution 3D Bayesian image reconstruction using the micro-PET 2000;19:257–60. small animal scanner. Phys Med Biol 1998;43:1001–13. 47. Lange K, Hunter D, Yang I. Optimization Transfer algorithms 25. Yavuz M, Fessler J. New statistical models for random pre- using surrogate objective functions. J Comp Graph Stat 2000; corrected PET scans. In: Duncan J, Gindi G (eds), Information 9:1–59. Processing in Medical Imaging, 15th International Conference, 48. Daube-Witherspoon ME, Muehllehner G. An iterative image space Berlin: Springer, pp 190–203, 1998. reconstruction algorithm for volume ECT. IEEE Trans Med Imag 26. Qi J, Leahy R, Hsu C, Farquhar T, Cherry S. Fully 3D Bayesian 1986;MI-5:61–6. image reconstruction for ECAT EXACT HR+. IEEE Trans Nucl Sci 49. De Pierro A. On the relation between the ISRA and the EM algo- 1998;NS-45:1096–1103. rithm for positron emission tomography. IEEE Trans Med Imag 27. Lange K, Carson R. EM reconstruction algorithms for emission 1993;MI-12:328–33. and transmission tomography. J Comp Ass Tomo 1984;8:306–16. 50. Levitan E, Herman GT. A maximum a posteriori expectation 28. Mumcuoglu EU, Leahy R, Cherry SR, Zhou Z. Fast gradient-based maximization algorithm for image reconstruction in emission to- methods for Bayesian reconstruction of transmission and emis- mography. IEEE Trans Med Imag 1987;MI-6:185–92. sion PET images. IEEE Trans Med Im 1994;MI-13:687–701. 51. Dempster A, Laird N, Rubin D. Maximum likelihood from incom- 29. Erdogan H, Fessler JA. Accelerated monotonic algorithms for plete data via the EM algorithm. Journal of the Royal Statistical transmission tomography. In Proc IEEE Intl Conf on Image Society 1977;39:1–38. Processing 1998;2:6804. 52. Shepp L A, Vardi Y. Maximum likelihood reconstruction for 30. Nuyts J, Michel C, Dupont P. Maximum-likelihood expectation emission tomography. IEEE Trans Med Imag 1982;MI-1:113–22. maximization reconstruction of sinograms with arbitrary noise 53. Barrett HH, White T, Parra L. List-mode likelihood. J Opt Soc Am distribution using NEC transformations. IEEE Trans Med Imag 1997;14:2914–23. 2001;MI-20:365–75. 54. Reader A, Erlandsson K, Flower M, Ott R. Fast, accurate, iterative 31. Bai C, Kinahan P, Brasse D, Comtat C, Townsend D. Postinjection reconstruction for low-statistics positron volume imaging. Phys single photon transmission tomography with ordered-subset al- Med Biol 1998;43:835–46. gorithms for whole-body PET. IEEE Trans Nucl Sci 2002;NS- 55. Levkovitz R, Falikman D, Zibulevsky M, Ben-Tal A, Nemirovski 49:74–81. A. The design and implementation of COSEM, an iterative 32. Lewitt RM. Alternatives to voxels for image representation in iter- algorithm for fully 3D list-mode data. IEEE Trans Med Imag ative reconstruction algorithms. Phys Med Biol 1992;37:705–16. 2001;MI-20:633–42. 33. Matej S, Herman GT, Narayanan TK, Furie SS, Lewitt RM, 56. Huesman RM, Klein G, Moses W, Qi J, Reutter B, Virador P. List- Kinahan PE. Evaluation of task-oriented performance of several mode maximum-likelihood reconstruction applied to positron fully 3D PET reconstruction algorithms. Phys Med Biol 1994; emission mammography (PEM) with irregular sampling. IEEE 39:355–67. Trans Med Imag 2000;MI-19:532–7. 34. Geman S, Geman D. Stochastic relaxation, Gibbs distributions, 57. Barrett HH, Wilson DW, Tsui BMW. Noise properties of the EM and the Bayesian restoration of images. IEEE Trans Patt Anal algorithm: I. Theory. Phys Med Biol 1994;39:833–46. Mach Intel 1984;PAMI-6:721–41. 58. Wilson DW, Tsui BMW, Barrett HH. Noise properties of the EM 35. Mumcuoglu EU et al. Bayesian reconstruction of PET images, algorithm: II. Monte Carlo simulations. Phys Med Biol 1994; methodology and performance analysis. Phys Med Biol 1996; 39:847–71. 41:1777–807. 59. De Pierro AR, Yamagishi MEB. Fast EM-like methods for 36. Leahy RM, Yan X. Incorporation of anatomical MR data for im- maximum “a posteriori” estimates in emission tomography. IEEE proved functional imaging with PET. In: Information Processing Trans Med Imag 2001;MI-20:280–8. in Medical Imaging: 12th International Conference, Berlin: 60. Veklerov E, Llacer J. Stopping rule for the MLE algorithm based Springer-Verlag, pp 105–20, 1991. on statistical hypothesis testing. IEEE Trans Med Imag 1987;MI- 37. Gindi G, Lee M, Rangarajan A, Zubal G. Bayesian reconstruction 6:313–19. of functional images using anatomical information as priors. IEEE 61. Selivanov V, Lapointe D, Bentourkia M, Lecomte R. Cross-valida- Trans Med Imag 1993;12:670–80. tion stopping rule for ML-EM reconstruction of dynamic PET 38. Bowsher JE, Johnson VE, Turkington TG, Jaszczak RJ, Floyd Jr series: effect on image quality and quantitative accuracy. IEEE CE, Coleman RE. Bayesian reconstruction and use of anatomical a Trans Nucl Sc 2001;NS-48:883–9. priori information for emission tomography. IEEE Trans Med 62. Hudson H, Larkin R. Accelerated image reconstruction using Imag 1996;MI-15:673–86. ordered subsets of projection data. IEEE Trans Med Imag 39. Comtat C, Kinahan P, Fessler F, Beyer T, Townsend D, Defrise M, 1994;MI-13:601–9. Michel C. Clinically feasible reconstruction of whole-body 63. Browne J, De Pierro AR. A row action alternative to the EM algo- PET/CT data using blurred anatomical labels. Phys Med Biol rithm for maximizing likelihoods in emission tomography. IEEE 2002;47:1–20. Trans Med Imag 1996;MI-15:687–99. 40. Baete K, Nuyts J, Van Paesschen W, Suetens P, Dupont P. 64. Byrne CL. Convergent block-iterative algorithms for image Anatomical based FDG-PET reconstruction for the detection of reconstruction from inconsistent data. IEEE Trans Imag Proc hypo-metabolic regions in epilepsy. To appear in IEEE Trans Med 1997;IP-6:1296–304. Imag 2004. 65. Hebert T, Leahy RM. Fast methods for including attenuation in 41. Obi T, Matej S, Lewitt RM, Herman GT. 2.5D simultaneous multi- the EM algorithm. IEEE Trans Nucl Sc 1990;NS-37:754–8. slice reconstruction by iterative algorithms from Fourier- 66. Comtat C, Kinahan PE, Defrise M, Michel C, Townsend DW. Fast rebinned PET data. IEEE Trans Med Imag 2000;MI-19:474–84. reconstruction of 3D PET with accurate statistical modeling. IEEE 42. Nichols TE, Qi J, Leahy RM. Continuous time dynamic pet Trans Nucl Sc 1998;NS-45:1083–9. imaging using list mode data. In: Colchester A et al. (eds), 67. Stayman J W, Fessler J A. Fast methods for approximating the res- Information Processing in Medical Imaging, 12th International olution and covariance for SPECT, in IEEE Nuclear Science and Conference, Berlin: Springer, pp 98–111, 1999. Medical Imaging Conference; 2002, Norfolk, VA, paper M3-146. 43. Asma E, Nichols TE, Qi J, Leahy RM. 4D PET image reconstruc- 68. Stayman JW, Fessler JA. Regularization for uniform spatial reso- tion from list-mode data. Records of the 2000 IEEE Medical lution properties in penalized-likelihood image reconstruction. Imaging Symposium, Lyon (France), paper 15–57. IEEE Trans Med Imag 2000;MI-19:601–615. 44. Fessler J. Penalized weighted least-square image reconstruction 69. Nuyts J, Fessler JA. A penalized-likelihood image reconstruction for PET. IEEE Trans Med Imag 1994;MI-13:290–300. method for emission tomography, compared to post-smoothed Image Reconstruction Algorithms in PET 91

maximum-likelihood with matched spatial resolution. IEEE Trans 92. Liu X, Comtat C, Michel C, Kinahan PE, Defrise M, Townsend Med Imag 2003;MI-22:1042–1052. DW. Comparison of 3D reconstruction with 3D OSEM and with 70. Orlov SS. Theory of three-dimensional reconstruction. 1. FORE+OSEM for PET. IEEE Trans Med Imag 2001;MI-20:804–14. Conditions of a complete set of projections. Sov Phys Crystallo- 93. Ollinger J. Maximum likelihood reconstruction in fully 3D PET graphy 1976;20:429–33. via the SAGE algorithm. In Proc. 1996 IEEE Nucl. Sci. Symp. 71. Stearns CW, Chesler DA, Brownell GL. Accelerated image recon- Medical Imaging Conf., Anaheim, CA, 1594–1598. struction for a cylindrical positron tomograph using Fourier 94. Johnson C, Seidel S, Carson R, Gandler W, Sofer A, Green M, domain methods. IEEE Trans Nucl Sci 1990;NS-37:773–7. Daube-Witherspoon M. Evaluation of 3-D reconstruction algo- 72. Colsher JG. Fully three-dimensional PET. Phys Med Biol 1980; rithms for a small animal PET camera. IEEE Trans Nucl Sci 25:103–115. 1997;NS-44:1303–1308. 73. Defrise M, Townsend DW, Clack R. Image reconstruction from 95. Qi J, Leahy RM. Resolution and noise properties of MAP recon- truncated two-dimensional projections. Inverse Problems struction for fully 3D PET. IEEE Trans Med Imag MI- 1995;11:287–313. 19:493–506. 74. Defrise M, Townsend DW, Deconinck F. Statistical noise in 96. Bai B, Li Q, Holdsworth C, Asma E, Tai Y, Chatziioannou A, three-dimensional positron tomography. Phys Med Biol 1990; Leahy R. Modelbased normalization for iterative 3D PET image 35:131–138. reconstruction. Phys Med Biol 2002;47:2773–2784. 75. Stearns CW, Crawford CR, Hu H. Oversampled filters for 97. Llacer J and Meng J. Matrix-based image reconstruction quantitative volumetric PET reconstruction. Phys Med Biol methods for tomography. IEEE Trans Nucl Sci 1985; 1994;39:381–8. NS-32:855–864. 76. Kinahan PE, Rogers JG. Analytic three-dimensional image 98. Jones W, Byars L, Casey M. Design of a super fast three- reconstruction using all detected events. IEEE Trans Nucl Sci dimensional projection system for positron emission tomogra- 1990;NS-36:964–8. phy. IEEE Trans Nucl Sci 1990;NS-37:800–804. 77. Daube-Witherspoon ME, Muehllehner G. Treatment of axial data 99. Barresi S, Bollini D, Del Guerra A. Use of a transputer system in three-dimensional PET. J Nucl Med 1987;28:1717–24. for fast 3-D image reconstruction in 3-D PET. IEEE Trans Nucl 78. Erlandsson K, Esser PD, Strand S-E, van Heertum RL. 3D recon- Sci 1990;NS-37:812–816. struction for a multi-ring PET scanner by single-slice rebinning 100. Atkins S, Murray D, Harrop R. Use of transputers in a 3-D and axial deconvolution. Phys Med Biol 1994;39:619–29. positron emission tomograph. IEEE Trans Med Imag 1991; 79. Tanaka E, Amo Y. A Fourier rebinning algorithm incorporating MI-10:276–283. spectral transfer efficiency for 3D PET. Phys Med Biol 1998; 101. Chen C.-M., Lee S.-Y., Cho Z. Parallelization of the EM algo- 43:739–46. rithm for 3-D PET image reconstruction. IEEE Trans Med Imag 80. Lewitt RM, Muehllehner G, Karp JS. Three-dimensional recon- 1991;MI-10:513–522. struction for PET by multi-slice rebinning and axial image 102. Rajan K, Patnaik L, Ramakrishna J. High-speed computation of filtering. Phys Med Biol 1994;39:321–40. the EM algorithm for PET image reconstruction. IEEE Trans 81. Defrise M, Kinahan PE, Townsend DW, Michel C, Sibomana M, Nucl Sci 1994;NS-41:1721–1728. Newport DF. Exact and approximate rebinning algorithms for 3D 103. Johnson C, Yan Y, Carson R, Martino R, Daube-Witherspoon M. PET data. IEEE Trans Med Imag 1997;MI-16:145–58. A system for the 3-D reconstruction of retracted-septa PET 82. Defrise M, Liu X. A fast rebinning algorithm for 3D PET using data using the EM algorithm. IEEE Trans Nucl Sci 1995; John’s equation, Inverse Problems 1999;15:1047–65. NS-42:1223–1227. 83. Liu X, Defrise M, Michel C, Sibomana M, Comtat C, Kinahan PE, 104. Cruz-Rivera J, DiBella E, Wills D, Gaylord T, Glytsis E. Townsend DW. Exact rebinning methods for three-dimensional Parallelized formulation of the maximum likelihood expecta- positron tomography. IEEE Trans Med Imag 1999;MI-18:657–64. tion maximization algorithm for fine-grain message-passing 84. Matej S, Karp JS, Lewitt RM, Becher AJ. Performance of the architectures. IEEE Trans Med Imag 1995;MI-14:758–762. Fourier rebinning algorithm for 3D PET with large acceptance 105. Rajan K, Patnaik L, Ramakrishna J. Linear array implementa- angles. Phys Med Biol 1998;43:787–97. tion of the EM reconstruction algorithm for PET image recon- 85. Michel C, Hamill J, Panin V, Conti M, Jones J, Kehren F, Casey M, struction. IEEE Trans Nucl Sci 1995;NS-42:1439–1444. Bendriem B, Byars L, Defrise M. FORE(J)+OSEM2D versus 106. Shattuck D, Rapela J, Asma E, Chatzioannou A, Qi J, Leahy R. OSEM3D reconstruction for Large Aperture Rotating LSO, In: Internet2-based 3-D PET image reconstruction using a PC IEEE Nuclear Science and Medical Imaging Conference; 2002, cluster. Phys Med Biol 2002;47:2785–2795. Norfolk, VA, paper M7-93. 107. Vollmar S, Michel C, Treffert J, Newport D, Casey M, Knoss C, 86. Kinahan PE, Fessler JA, Karp JS. Statistical image reconstruction Wienhard K, Liu X, Defrise M, Heiss W-D. HeinzelCluster: ac- in PET with compensation for missing data. IEEE Trans Nucl Sc celerated reconstruction for FORE and OSEM3D. Phys Med Biol 1997;NS-44:1552–7. 2002;47:2651–2658. 87. Stearns CW, Fessler JA. 3D PET reconstruction with FORE and 108. Jones J, Jones W, Kehren F, Newport D, Reed J, Lenox M, Baker K, WLS-OS-EM. In: IEEE Nuclear Science and Medical Imaging Byars L, Michel C, Casey M. SPMD cluster-based parallel 3D Conference; 2002, Norfolk, VA, paper M5-5. OSEM. IEEE Trans Nucl Sci 2003;NS-50:1498–1502. 88. Janeiro L, Comtat C, Lartizien C, Kinahan P, Defrise M, Michel C, 109. Jones J, Jones W, Kehren F, Burbar Z, Reed J, Lenox M, Baker K, Trébossen R, Almeida P. NEC-scaling applied to FORE+OSEM. In: Byars L, Michel C, Casey M. Clinical Time OSEM3D: Infra- IEEE Nuclear Science and Medical Imaging Conference; 2002, structure Issues. In: IEEE Nuclear Science and Medical Imaging Norfolk, VA, paper M3-71. Conference; 2003, Portland, OR, paper M10-244. 89. Alessio A, Sauer K, Bouman C. PET statistical reconstruction with 110. The current home page of the PARAPET/STIR project is modeling of axial effects of FORE. In: IEEE Nuclear Science and http://stir.sourceforge.net/homepage.shtm Medical Imaging Conference; 2003, Portland, OR, paper M11-194. 111. Bettinardi V, Pagani E, Gilardi M-C, Alenius S, Thielemans K, 90. Daube-Witherspoon ME, Matej S, Karp JS, Lewitt RM. Application Teras M, Fazio F. Implementation and evaluation of a 3D One of the Row Action Maximum Likelihood Algorithm with spherical Step Late reconstruction algorithm for 3D Positron Emission basis functions to clinical PET imaging. IEEE Trans Nucl Sc Tomography studies using Median Root Prior. Eur J Nucl Med 2001;NS-48:24–30. 2002;29:7–18. 91. Kinahan PE, Michel C, Defrise M, Townsend DW, Sibomana M, 112. Jacobson M, Levkovitz R, Ben Tal A, Thielemans K, Spinks T, Lonneux M, Newport DF, Luketich JD. Fast iterative image recon- Belluzzo D, Pagani E, Bettinardi V, Gilardi MC, Zverovich A, struction of 3D PET data. In: IEEE Nuclear Science and Medical Mitra G. Enhanced 3D PET OSEM reconstruction using inter- Imaging Conference; 1996, Anaheim, CA, 1918–22. update Metz filtering. Phys Med Biol 2000;45:2417–2439.