DUG-RECON: A Framework for Direct Image Reconstruction Using Convolutional Generative Networks

V.S.S. Kandarpa∗, Alexandre Bousse, Didier Benoit and Dimitris Visvikis

All the authors are affiliated to LaTIM, INSERM, UMR 1101, Université de Bretagne Occidentale.
∗ (email: [email protected])

Abstract—This paper explores convolutional generative networks as an alternative to iterative reconstruction in medical image reconstruction. The task of medical image reconstruction involves mapping projection domain data collected from the detector to the image domain. This mapping is typically done through iterative reconstruction algorithms, which are time consuming and computationally expensive. Trained deep learning networks provide faster outputs, as proven in various tasks across computer vision. In this work we propose a direct reconstruction framework built exclusively with deep learning architectures. The proposed framework consists of three segments, namely denoising, reconstruction and super resolution. The denoising and the super resolution segments act as processing steps. The reconstruction segment consists of a novel double U-Net generator (DUG) which learns the sinogram-to-image transformation. This entire network was trained on positron emission tomography (PET) and computed tomography (CT) images. The reconstruction framework approximates two-dimensional (2-D) mapping from the projection domain to the image domain. The architecture proposed in this proof-of-concept work is a novel approach to direct image reconstruction; further improvement is required to implement it in a clinical setting.

Index Terms—Medical Image Reconstruction, Deep Learning, Generative Adversarial Networks

I. INTRODUCTION

The use of deep learning in medical imaging has been on the rise over the last few years. It has widely been used in various tasks across medical imaging such as image segmentation [1]–[5], image denoising [6]–[9] and image analysis [10]–[12]. The utilization of deep learning for image reconstruction is a more challenging task. Image reconstruction using deep learning corresponds to the task of mapping raw projection data retrieved from the detector to image domain data. One can broadly identify three different categories of approaches for the implementation of deep learning within the framework of medical image reconstruction:

(i) methods that use deep learning as an image processing step that improves the quality of the raw data and/or the reconstructed image [13], [14];
(ii) methods that embed deep-learning image processing techniques in the iterative reconstruction framework to accelerate convergence or to improve image quality [15]–[17];
(iii) direct reconstruction with deep learning alone, without any use of traditional reconstruction methods [18]–[20].

The use of deep learning for the development of either data corrections or post-reconstruction image based approaches (i) has shown potential to improve the quality of reconstructed images. An example of data corrections that improve the raw data through scatter correction is proposed in [14]. In that work a modified U-Net is used to estimate scatter and correct the raw data in order to improve computed tomography (CT) images. Denoising the reconstructed positron emission tomography (PET) images with a deep convolutional network was done in [13]. The authors used perceptual loss along with mean squared error (MSE) to preserve the qualitative and quantitative accuracy of the reconstructed images. The network was initially trained on simulated data and then fine-tuned on real patient data. Despite resulting in an improvement of the reconstructed output, the above mentioned methods do not directly intervene in the reconstruction process. This can be done using the two distinct frameworks (ii) and (iii).

The first one involves the incorporation of a deep neural network into an unrolled iterative algorithm, where a trained neural network accelerates the convergence by improving the intermediate estimates in the iterations [15]–[17]. The paper by Gong et al. [17] used a modified U-Net to represent images within the iterative reconstruction framework for PET images. The deep learning architecture was trained on low-dose reconstructed images as input and high-dose reconstructed images as the output. The work by Xie et al. [15] further extended this work by replacing the U-Net with a generative adversarial network (GAN) for image representation within the iterative framework. Kim et al. [16] incorporated a trained denoising convolutional neural network (DnCNN) along with a novel local linear fitting function into the iterative algorithm. The DnCNN, which is trained on data with multiple noise levels, improves the image estimate at each iteration. They used simulated and real patient data in their study. The second framework, referred to as direct reconstruction, is based on performing the whole reconstruction process (replacing the classical framework) with a deep learning algorithm, using raw data as input and reconstructed images as output [18]–[20]. In contrast to (i) and (ii), which have been extensively investigated, direct reconstruction (iii) using deep learning has been much less explored.

There have been to date three particular approaches that are relevant to this strategy. The deep learning architecture proposed by Zhu et al. [19], called AUTOMAP, uses fully connected (FC) layers (which encode the raw data information) followed by convolutional layers. The first three layers in this architecture are FC layers with dimensions 2n², n² and n² respectively, where n × n is the dimension of the input image. AUTOMAP therefore requires the estimation of a huge number of parameters, which makes it computationally intensive. Although initially developed for magnetic resonance imaging (MRI), AUTOMAP has been claimed to work on other imaging modalities too. Brain images encoded with sensor-domain sampling strategies and varying levels of additive white Gaussian noise were reconstructed with AUTOMAP. Within the same concept of FC-layer architectures, a three-stage image reconstruction pipeline called DirectPET has been proposed to reduce the associated computational issues [18]. The first stage down-samples the sinogram data, following which a unique Radon inversion layer encodes the transformation from sinogram to image space. Finally the estimated image is improved using a super resolution block. This work was applied to full body PET images and remains the only approach that can reconstruct multiple slices simultaneously (up to 16 images). DeepPET is another approach, implemented on simulated images, using an encoder-decoder architecture based on the neural network proposed by the Visual Geometry Group [20]. Using realistic simulated data, the authors demonstrated a network that could reconstruct images faster, and with an image quality (in terms of root mean squared error) comparable to that of conventional iterative reconstruction techniques.

In our work we explore the use of U-Net based deep learning architectures [1] to perform a direct reconstruction from the sinogram to the image domain using real patient datasets. Our aim is to reduce the number of trainable parameters while exploring a novel strategy for direct image reconstruction using generative networks. More specifically, our approach consists of a three-stage deep-learning pipeline comprising denoising, image reconstruction and super resolution segments. Our experiments included training the deep learning pipeline on PET and CT sinogram-image pairs. A single pass through the trained network transforms the noisy sinograms into reconstructed images. The reconstruction of both PET and CT datasets was considered and is presented in the following sections. At this stage the work presented is a proof of concept and needs further improvement before being applied in a clinical setting.

Fig. 1: Deep Learning in Medical Image Reconstruction. The figure outlines where deep learning can intervene: data corrections in the projection domain, iterative, unrolled iterative or fully deep learning reconstruction algorithms, and post-processing in the image domain.

II. MATERIALS AND METHODS

A. Image Reconstruction Model

In medical imaging, image reconstruction corresponds to the task of reconstructing an image x ∈ R^m from a scanner measurement y ∈ R^n, where m is the number of voxels defining the image and n is the number of detectors in the scanner. In CT, the image x = µ corresponds to X-ray attenuation, measured by the proportion of X-rays scattered or absorbed as they pass through the object. In PET, x = λ is the distribution of a radiotracer delivered to the patient by injection, and is measured through the detection of pairs of γ-rays emitted in opposite directions (indirectly from the positron-emitting radiotracer).

The measurement y is a random vector modeling the number of detections (photon counting) at each of the n detector bins, and follows a Poisson distribution with independent entries:

    y ∼ Poisson(ȳ(x))                                    (1)

where ȳ(x) ∈ R^n is the expected number of counts (noiseless), which is a function of the image x. In a simplified setting, the expected number of counts in CT is

    ȳ(µ) = exp(−Lµ)                                       (2)

where L ∈ R^{n×m} is a system matrix such that each entry [L]_{i,j} represents the contribution of the j-th image voxel to the i-th detector. In PET, the expected number of counts is (also in a simplified setting)

    ȳ(λ) = Pλ                                             (3)

where P ∈ R^{n×m} is a system matrix such that each entry [P]_{i,j} represents the probability that a photon pair emitted from voxel j is detected at the i-th detector. For simplification we assume P = L, which is a reasonable assumption in a non time-of-flight setting. Image reconstruction is achieved by finding a suitable image x̂ = µ̂ or λ̂ that approximately solves

    y = ȳ(x).                                             (4)
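To make the simplified forward models (1)-(3) concrete, the sketch below simulates noisy CT and PET measurements from a 2-D slice. It assumes that the action of the system matrices L and P can be approximated by the 2-D Radon transform of scikit-image (the data preparation in Section II-C uses a Python Radon transform in the same spirit); the function names, count levels and number of projection angles are illustrative and not the values used in this work.

```python
# Minimal sketch of the simplified forward models (1)-(3). The action of the system
# matrices L and P is approximated here by the 2-D Radon transform of scikit-image;
# function names, count levels and the number of projection angles are illustrative.
import numpy as np
from skimage.transform import radon

def forward_project(image, angles):
    """Approximate y_bar = P @ x (eq. 3) / the line integrals L @ mu (eq. 2)."""
    return radon(image, theta=angles, circle=False)

def simulate_pet_counts(lam, angles, total_counts=1e6, rng=None):
    """PET measurement: y ~ Poisson(P @ lambda), eqs. (1) and (3)."""
    rng = np.random.default_rng(rng)
    y_bar = forward_project(lam, angles)
    y_bar *= total_counts / y_bar.sum()        # set the expected count level (noise level)
    return rng.poisson(y_bar)

def simulate_ct_counts(mu, angles, blank_counts=1e4, rng=None):
    """CT measurement: y ~ Poisson(b * exp(-L @ mu)), eqs. (1) and (2);
    the blank-scan factor b is an illustrative addition for a meaningful count level."""
    rng = np.random.default_rng(rng)
    y_bar = blank_counts * np.exp(-forward_project(mu, angles))
    return rng.poisson(y_bar)

angles = np.linspace(0.0, 180.0, 180, endpoint=False)   # projection angles in degrees
```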
Filtered-backprojection (FBP) techniques (see [21] for a review) can efficiently solve (4), but they are vulnerable to noise as the system matrix P is ill-conditioned. Since the 80's, model-based iterative reconstruction (MBIR) techniques [22], [23] became the standard approach. They consist in iteratively approximating a solution x̂ such that ȳ(x̂) maximizes the likelihood of the measurement y. As they model the stochasticity of the system, they are more robust to noise as compared with FBP, and can be completed with a penalty term for additional control over the noise [24]. However, they are computationally expensive, as each iteration requires the computation of a projection and a backprojection.
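As an illustration of that per-iteration cost, a minimal maximum-likelihood expectation-maximization (MLEM) loop in the spirit of [22] is sketched below, reusing the Radon-based projector from the previous sketch. The use of skimage's iradon with filter_name=None as an (unnormalized) backprojector, and the exact argument names, are assumptions that may need adjusting to the scikit-image version; the point is only that every update performs one projection and one backprojection.

```python
# Minimal MLEM sketch in the spirit of [22], reusing forward_project() from the sketch
# above. skimage's iradon with filter_name=None is used as an approximate, unnormalized
# backprojector; this is purely illustrative of the per-iteration projection/backprojection cost.
import numpy as np
from skimage.transform import iradon

def backproject(sino, angles, size):
    return iradon(sino, theta=angles, filter_name=None, circle=False, output_size=size)

def mlem(y, angles, size, n_iter=50, eps=1e-9):
    lam = np.ones((size, size))                                       # initial estimate
    sens = backproject(np.ones_like(y, dtype=float), angles, size)    # sensitivity image
    for _ in range(n_iter):                     # one projection + one backprojection per pass
        y_bar = forward_project(lam, angles)                          # P @ lambda
        correction = backproject(y / np.maximum(y_bar, eps), angles, size)
        lam *= correction / np.maximum(sens, eps)
    return lam
```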
Image reconstruction with deep learning, however, is a data-driven approach in which there is a training phase and a prediction phase. Given a set of training data, consisting of a subset of the raw data (y) and its corresponding images (x), a deep learning architecture learns the mapping from raw data to image and improves this mapping through the training process. During the prediction phase, a subset of raw data different from the training data serves as the input to the trained deep learning architecture. The output is a reconstructed image obtained in a single forward pass through the network, hence making the reconstruction process through deep learning instantaneous as opposed to an iterative process. This makes direct reconstruction with deep learning faster and less computationally expensive than iterative algorithms.

B. Deep Learning Architectures

As shown in Figure 2, we propose a three-stage deep learning pipeline for the task of tomographic reconstruction. In the first step the raw data (projection space) are denoised. Next, the denoised sinograms are transformed to the image domain in the image reconstruction segment. The third and final segment operates in the image domain to improve the image produced after domain transformation. The following sections discuss these segments in detail.

Fig. 2: Proposed deep learning pipeline for direct image reconstruction (Denoising → Image Reconstruction → Super Resolution).

1) Denoising: We used a modified U-Net architecture to denoise the Poisson sampled sinograms, based on the work previously carried out for ultrasound denoising [25]. The U-Net is an encoder-decoder network which was initially implemented for segmentation, but over the years its applications have broadened. As shown in Figure 3, an increasing number of convolutions along with max pooling arrive at an encoding of the input, and convolutions followed by upsampling then arrive at an output with an identical dimension as the input. The important modification in the architecture of [25] with respect to the U-Net was the residual connection from the input to the final output. Perdios et al. trained the denoising architecture on simulated ultrasound images so as to enhance ultrafast ultrasound imaging. This denoising architecture corresponds to the first segment in our proposed framework. It was trained on raw data pairs, i.e., low-count and high-count sinograms, considering multiple noise levels. The detailed architecture is represented in Figure 3.

Fig. 3: Representation of the denoising network (successive blocks of 3 convolutional layers with 32, 64, 128, 256 and 512 filters of size 3×3, with concatenation skip connections). The inputs to the network were two-dimensional (2-D) grayscale slices with resolution 128×128 and the outputs were denoised sinograms.

We defined the loss function between a true sinogram y* = [y*_1, ..., y*_n]^T ∈ R^n and a prediction ŷ = [ŷ_1, ..., ŷ_n]^T ∈ R^n as the MSE:

    MSE(y*, ŷ) = (1/n) Σ_{i=1}^{n} (y*_i − ŷ_i)²          (5)

where n is the number of pixels in the sinogram, corresponding to the number of detectors in the scanner.
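A compact Keras sketch of a residual encoder-decoder denoiser in the spirit of this segment is given below; the depth, filter counts and training call are illustrative and do not reproduce the exact configuration of Figure 3.

```python
# Compact Keras sketch of a residual encoder-decoder sinogram denoiser in the spirit of
# Fig. 3 and [25]; depth and filter counts are illustrative, not the exact configuration.
import tensorflow as tf
from tensorflow.keras import layers, Model

def conv_block(x, filters):
    for _ in range(2):
        x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    return x

def build_denoiser(shape=(128, 128, 1)):
    inp = layers.Input(shape=shape)
    e1 = conv_block(inp, 32)
    e2 = conv_block(layers.MaxPooling2D()(e1), 64)
    b = conv_block(layers.MaxPooling2D()(e2), 128)
    d2 = conv_block(layers.Concatenate()([layers.UpSampling2D()(b), e2]), 64)
    d1 = conv_block(layers.Concatenate()([layers.UpSampling2D()(d2), e1]), 32)
    out = layers.Conv2D(1, 1, padding="same")(d1)
    out = layers.Add()([inp, out])              # residual connection from input to output
    return Model(inp, out, name="denoiser")

denoiser = build_denoiser()
denoiser.compile(optimizer="adam", loss="mse")  # MSE loss of eq. (5)
# denoiser.fit(low_count_sinograms, high_count_sinograms, epochs=100, batch_size=16)
```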

2) Image Reconstruction: The novelty in this work is the proposed U-Net based network, in contrast to previous works in direct image reconstruction that use FC-layer architectures. The design of the network draws its inspiration from the conditional GAN for image-to-image translation called Pix2Pix [26]. The proposed network, namely the double U-Net generator (DUG), consists of two cascaded U-Nets. The first U-Net transforms the raw data to an image, while the second U-Net takes as input the generated image and transforms it back to the raw data. The second U-Net assesses the reconstructed image output, reiterating the relation between the sinogram and the image. This architecture differs from Pix2Pix, which consists of a generator (a U-Net like network) and a discriminator (a classification convolutional network). While the generator in both architectures serves the purpose of transforming images from one domain to the other, the discriminator in Pix2Pix classifies inputs as real/fake. The objective function for this architecture can be written as:

    L_total = L_G1 + L_G2 + L_{G1+G2}                      (6)

where

    L_G1 = (1/n) Σ_{i=1}^{n} |x*_i − x̂_i|                  (7)

    L_G2 = (1/n) Σ_{i=1}^{n} |y*_i − ŷ_i|                  (8)

G1 and G2 are Generator 1 and Generator 2, which predict the image and the sinogram respectively; x*, x̂ are the true and predicted image, and y*, ŷ the true and predicted sinogram. L_{G1+G2} is defined similarly to L_G2 with the combined architecture G1 + G2, keeping the weights of G2 fixed. The architecture is represented in detail in Figure 4. The training for this architecture is summarized in Algorithm 1. A comparison of the trainable parameters for the various segments used to perform the domain mapping from sinogram to image is provided in Table I, considering AUTOMAP from [19], the Radon inversion layer from [18] and the proposed architecture DUG along with the denoising and super resolution segments.

Fig. 4: Representation of the DUG, the image reconstruction block. This network was trained on denoised sinograms, which were the outputs of the previous segment.

Algorithm 1: Training the DUG
    M = number of epochs
    N = total training data (images/sinograms)
    for i = 1, 2, ..., M do
        for j = 1, 2, ..., N do
            Train G1: minimizing L_G1
        end
        for j = 1, 2, ..., N do
            Train G2: minimizing L_G2
        end
        for j = 1, 2, ..., N do
            Train the combined architecture, freezing the weights of G2: minimizing L_{G1+G2}
        end
    end
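The three-step schedule of Algorithm 1 can be sketched with Keras as below. Here g1 (sinogram to image) and g2 (image to sinogram) are assumed to be U-Net style models built along the lines of the denoiser above, mean absolute error stands in for eqs. (7) and (8), and G2 is kept fixed during the cascade step by toggling its trainable flag and recompiling; this is a hedged sketch of the training logic, not the exact implementation used in this work.

```python
# Hedged sketch of Algorithm 1: alternating training of the double U-Net generator (DUG).
# g1 maps denoised sinograms to images, g2 maps images back to sinograms; MAE stands in
# for the losses of eqs. (7)-(8). Recompiling after toggling `trainable` is what makes the
# freeze of G2 take effect (and, for simplicity here, resets the optimizer state).
import tensorflow as tf
from tensorflow.keras import layers, Model

def build_cascade(g1, g2, input_shape=(128, 128, 1)):
    g1.compile(optimizer="adam", loss="mae")                   # L_G1, eq. (7)
    sino_in = layers.Input(shape=input_shape)
    return Model(sino_in, g2(g1(sino_in)), name="g1_plus_g2")  # cascade for L_{G1+G2}

def train_dug(g1, g2, cascade, sino_denoised, images_true, sino_true, epochs=50, batch=16):
    for _ in range(epochs):
        # Step 1: G1 learns denoised sinogram -> image (minimise L_G1).
        g1.fit(sino_denoised, images_true, batch_size=batch, epochs=1, verbose=0)
        # Step 2: G2 learns predicted image -> true sinogram (minimise L_G2).
        g2.trainable = True
        g2.compile(optimizer="adam", loss="mae")
        x_hat = g1.predict(sino_denoised, verbose=0)
        g2.fit(x_hat, sino_true, batch_size=batch, epochs=1, verbose=0)
        # Step 3: train the cascade G2(G1(.)) with the weights of G2 frozen (minimise L_{G1+G2}).
        g2.trainable = False
        cascade.compile(optimizer="adam", loss="mae")
        cascade.fit(sino_denoised, sino_true, batch_size=batch, epochs=1, verbose=0)
```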
3) Super Resolution: The function of the super resolution (SR) block is to improve the estimate produced by the image reconstruction network. Several works already exist concerning single image super resolution [27], [28]. In this work we employed a basic super-resolution residual network architecture to improve the reconstruction. It consists of convolutional blocks followed by batch normalization with parametric rectified linear unit (PReLU) activation. There were a total of 8 residual blocks in the network, as represented in Figure 5.

Fig. 5: Representation of the super resolution block. It consists of 8 residual blocks with convolution (32 filters of size 3×3), batch normalization and PReLU.

The loss function used in this architecture was the perceptual loss:

    PerceptualLoss = |VGG16(x*) − VGG16(x̂)|                (9)

where VGG16(x*) and VGG16(x̂) are the features extracted with the VGG16 convolutional neural network [29] for the true and predicted image. The features are extracted from the 10th layer of the VGG architecture, i.e., only the first three convolutional blocks are considered. We observed that extracting deeper features led to the network hallucinating features in the reconstructed images.
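A hedged Keras sketch of the perceptual loss of eq. (9) is given below, comparing VGG16 feature maps of the prediction and the ground truth. The cut after roughly the 10th layer follows the text, while the grayscale-to-RGB tiling and the exact layer indexing of tf.keras.applications.VGG16 are assumptions to verify.

```python
# Hedged sketch of the perceptual loss of eq. (9): an L1 distance between VGG16 features
# of the true and predicted images, cut after the first three convolutional blocks.
import tensorflow as tf
from tensorflow.keras.applications import VGG16

_vgg = VGG16(include_top=False, weights="imagenet", input_shape=(128, 128, 3))
_features = tf.keras.Model(_vgg.input, _vgg.layers[10].output)   # ~10th layer of VGG16
_features.trainable = False

def perceptual_loss(x_true, x_pred):
    # Grayscale slices are tiled to 3 channels to match the VGG16 input.
    f_true = _features(tf.image.grayscale_to_rgb(x_true))
    f_pred = _features(tf.image.grayscale_to_rgb(x_pred))
    return tf.reduce_mean(tf.abs(f_true - f_pred))

# sr_model.compile(optimizer="adam", loss=perceptual_loss)   # when training the SR block
```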

TABLE I: Trainable parameters comparison

    Architecture                              Input Size   Output Size   Trainable Parameters
    AUTOMAP                                   200 × 168    200 × 200     6,545,920,000
    Radon Inversion Layer (40 × 40 patches)   200 × 168    200 × 200     382,259,200
    DeepPET                                   128 × 128    128 × 128     62,821,473
    DUG-RECON                                 128 × 128    128 × 128     17,444,140

C. Dataset Description

We applied our methodology on fluorothymidine (FLT) PET/CT images from the American College of Radiology Imaging Network (ACRIN) FLT Breast PET/CT database [30]. The details of the dataset are given in Table II. The sinograms were initially generated by projecting 2-D PET and CT image slices with the Python SKLEARN Radon transform, following the models (2) and (3) for CT and PET respectively, with Poisson noise added. The methodology represented in Figure 6 was used for data preparation for the PET and CT modalities. Sample pairs from the PET and CT datasets are shown in Figures 7 and 8. The CT images were downsized from 512 × 512, and the reconstruction was implemented for 2-D 128 × 128 images.

TABLE II: Dataset description

    Modalities                        CT, PET
    Number of patients                83
    Number of PET 2-D image slices    76,000
    Number of CT 2-D image slices     21,104
    PET matrix size                   128
    CT matrix size                    512
    Scanner                           GE Discovery ST

Fig. 6: Data preparation. The PET and CT image volumes are split into 2-D image slices, projected and Poisson resampled into sinograms, and paired with the initial and final reconstructed images.

D. Training

TensorFlow [31] and Keras [32] were used for the realization of the architectures described in the section above. These architectures were implemented on a single Nvidia GeForce GTX 2080Ti GPU. A collection of images {x*_k}_{k=1}^N was used to generate a corresponding collection of noiseless sinograms {y*_k}_{k=1}^N following models (2) and (3); low-count and high-count sinograms, {y_k^LC}_{k=1}^N and {y_k^HC}_{k=1}^N, were generated by adding Poisson noise, where the expected number of counts was adjusted by rescaling the intensity. The denoising segment was trained with the collection {(y_k^LC, y_k^HC)}_{k=1}^N, using the high-count sinograms as ground truth, with a total number of training samples N = 120000 and a number of epochs M = 100. For the training of the second segment, we used a collection of denoised sinograms, namely {ŷ_k}_{k=1}^N, and their corresponding ground truth images {x*_k}_{k=1}^N. The image reconstruction segment was trained according to Algorithm 1, that is to say by alternating between training G1 with {(ŷ_k, x*_k)}_{k=1}^N and training G2 with {(x̂_k, y*_k)}_{k=1}^N, where x̂_k is a prediction from G1 with ŷ_k as input. This segment was trained with N = 40000 and M = 50. The CT data were augmented by rotating the data by 90 degrees to generate the required training data. Owing to the larger PET dataset it was not necessary to perform data augmentation.

Finally, the SR segment was trained on the images predicted by the DUG and the ground truth (GT) images, {(x̂_k, x*_k)}_{k=1}^N. The SR block was trained with N = 20000 for 100 epochs. For the testing, a set of 2000 sinogram-image pairs was used.
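As an illustration of the paired data generation described above, the sketch below produces one low-count/high-count sinogram pair from a noiseless sinogram by rescaling the expected counts before Poisson sampling; the count levels are illustrative, not the values used in this study.

```python
# Sketch of low-count / high-count sinogram pair generation from a noiseless sinogram
# y_bar by rescaling the expected counts before Poisson sampling (illustrative levels).
import numpy as np

def make_count_pair(y_bar, low_counts=1e5, high_counts=1e7, rng=None):
    rng = np.random.default_rng(rng)
    y_low = rng.poisson(y_bar * (low_counts / y_bar.sum()))    # noisy, low-count input
    y_high = rng.poisson(y_bar * (high_counts / y_bar.sum()))  # high-count ground truth
    return y_low, y_high
```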

E. Quantitative Analysis

Testing for the aforementioned architectures was done on samples that were not a part of the training data. The metrics used for this analysis are the root mean squared error (RMSE) and the structural similarity index (SSIM). They are defined below:

    RMSE(x*, x̂) = sqrt( (1/m) Σ_{j=1}^{m} (x*_j − x̂_j)² )   (10)

where m is the number of pixels, x* is the GT and x̂ the predicted output.

Fig. 7: Example PET sinogram-image pairs from the dataset.

    SSIM(x*, x̂) = [(2 µ_{x*} µ_{x̂} + c1)(2 σ_{x*x̂} + c2)] / [(µ_{x*}² + µ_{x̂}² + c1)(σ_{x*}² + σ_{x̂}² + c2)]   (11)

where µ_{x*} and µ_{x̂} are the averages of x* and x̂ respectively, σ_{x*}² and σ_{x̂}² are the variances of x* and x̂, σ_{x*x̂} is the covariance between x̂ and x*, and c1 = (k1 L)² and c2 = (k2 L)² with k1 = 0.01 and k2 = 0.03 by default (L being the dynamic range of the pixel values).
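The two metrics can be computed as in the short sketch below; scikit-image's structural_similarity uses the same default constants k1 = 0.01 and k2 = 0.03, and the choice of data_range here is an assumption.

```python
# Sketch of the evaluation metrics of eqs. (10)-(11).
import numpy as np
from skimage.metrics import structural_similarity

def rmse(x_true, x_pred):
    return np.sqrt(np.mean((x_true - x_pred) ** 2))

def ssim(x_true, x_pred):
    return structural_similarity(x_true, x_pred,
                                 data_range=x_true.max() - x_true.min())
```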

Fig. 8: Example CT sinogram-image pairs from the dataset.

F. Region of Interest Analysis

The signal-to-noise ratio (SNR) and contrast-to-noise ratio (CNR) were studied for four regions of interest identified within the patient body. The SNR and CNR were evaluated by treating a region as foreground and the other three regions as background:

    SNR = (µ_r − µ_b) / σ_b                                 (12)

    CNR = |µ_r − µ_b| / sqrt(σ_r² + σ_b²)                   (13)

where µ_r and µ_b, σ_r and σ_b correspond to the mean and standard deviation in the region of interest (ROI) and the background respectively. In this study we compared the initial reconstructed output of the DUG, the final reconstruction along with SR, and the original GT, which was reconstructed on the GE Discovery ST scanner using an ordered-subsets expectation-maximisation (OSEM) algorithm.
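Given boolean masks for a foreground region and for the background (here, the union of the other three regions), the two figures of merit of eqs. (12)-(13) reduce to a few lines:

```python
# Sketch of the ROI figures of merit of eqs. (12)-(13).
import numpy as np

def snr_cnr(image, roi_mask, background_mask):
    mu_r, mu_b = image[roi_mask].mean(), image[background_mask].mean()
    sigma_r, sigma_b = image[roi_mask].std(), image[background_mask].std()
    snr = (mu_r - mu_b) / sigma_b                                   # eq. (12)
    cnr = abs(mu_r - mu_b) / np.sqrt(sigma_r ** 2 + sigma_b ** 2)   # eq. (13)
    return snr, cnr
```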

G. Comparison with DeepPET

We implemented the DeepPET architecture [20] and compared its predictions with our proposed approach for the reconstruction of PET images. DeepPET was trained on {(ŷ_k, x*_k)}_{k=1}^N, with notation similar to the training section above, with N = 120000, exclusively on PET data. It is worth noting that the input and output dimensions in our study are identical, while this was not originally the case for DeepPET. The architecture of DeepPET is summarized in Figure 9. This architecture was trained for 100 epochs with an Adam optimiser.

Fig. 9: Representation of DeepPET (an encoder-decoder of 3×3 convolutions with stride 1 or 2, batch normalization, ReLU and upsampling, for 128×128 inputs). The number of filters in each convolutional layer is labeled on top of each block.

III. RESULTS

The predictions from the architectures along with the GT and the sinogram are shown in Figure 10 for PET images. The results are displayed for four test image slices across the columns. Each column shows the predicted output by the proposed DUG-RECON architecture and the DeepPET architecture, as well as the GT. With regards to the proposed architecture, it is observed that the initial reconstructed image, i.e., the output of the DUG, looks blurred, while the final reconstructed output from the super resolution block has noticeably improved details. The predictions by DeepPET are also visibly blurred compared to the final reconstructed output of the proposed architecture and the GT. These observations are further ascertained in Table III, where the quantitative metrics are tabulated. Figure 11 provides a comparison of the intensity profiles for the predictions by DUG, DUG+SR and DeepPET with respect to the GT for PET images. These intensity values are observed along the line marked in yellow in these figures. As this figure shows, the intensity profile of the final reconstructed image of the proposed architecture is closest to the GT. The predictions by DUG and DeepPET are smoother compared to the predictions by DUG+SR and the GT. The ROI analysis is tabulated in Table V for the four regions marked in Figure 12. This analysis was carried out for the final predictions by the proposed architecture and maximum likelihood expectation-maximization (MLEM). Looking closely at Table V we notice that the mean values of the deep learning predicted image and the MLEM reconstructed image are comparable.

The results for CT images are displayed in Figure 13. This figure provides a comparison between the reconstructed image predictions of the proposed architecture and DeepPET with respect to the GT. The high-resolution nature of the CT images and a smaller dataset presented challenges during the training of the proposed architecture. The predictions by DeepPET appear to be better resolved than those by the proposed architecture. However, the tissue and the bone structures are not clearly seen in the predictions by the deep learning architectures, thereby requiring further work to improve the reconstruction. The intensity plots are compared for two different images in Figure 14. The ROI analysis was carried out for 4 regions marked in the images reconstructed with deep learning and FBP. The image reconstructed with FBP has better SNR and CNR compared to the image reconstructed with the proposed architecture.

TABLE III: The SSIM and RMSE for the PET images are evaluated for 4 different 2-D slices. These metrics are compared amongst predictions by DUG, DUG+SR and DeepPET.

    Image   Architecture   RMSE    SSIM
    1       DUG            0.059   0.74
            DUG+SR         0.038   0.84
            DeepPET        0.047   0.80
    2       DUG            0.043   0.76
            DUG+SR         0.046   0.86
            DeepPET        0.054   0.85
    3       DUG            0.050   0.76
            DUG+SR         0.038   0.85
            DeepPET        0.043   0.83
    4       DUG            0.061   0.70
            DUG+SR         0.045   0.82
            DeepPET        0.048   0.79

TABLE IV: The SSIM and RMSE for the CT images are evaluated for different 2-D slices; predictions by DUG, DUG+SR and DeepPET are compared.

    Image   Architecture   RMSE     SSIM
    1       DUG            0.0083   0.90
            DUG+SR         0.0015   0.98
            DeepPET        0.0012   0.99
    2       DUG            0.0081   0.90
            DUG+SR         0.0015   0.99
            DeepPET        0.0014   0.99
    3       DUG            0.0015   0.91
            DUG+SR         0.0018   0.98
            DeepPET        0.0013   0.99
TABLE V: ROI analysis: the mean, standard deviation (SD), SNR and CNR for the 4 regions of interest marked in Figure 12 (PET).

    Region   Image     Mean    SD      SNR     CNR
    1        DUG+SR    0.706   0.024   7.15    5.71
             MLEM      0.676   0.035   5.72    4.55
    2        DUG+SR    0.713   0.091   4.81    3.42
             MLEM      0.648   0.11    4.38    3.26
    3        DUG+SR    0.744   0.071   2.73    1.65
             MLEM      0.547   0.154   3.23    1.22
    4        DUG+SR    0.117   0.008   14.96   10.64
             MLEM      0.057   0.010   8.01    4.8

TABLE VI: ROI analysis: the mean, SD, SNR and CNR for the 4 regions of interest marked in Figure 15 (CT).

    Region   Image     Mean    SD         SNR     CNR
    1        DUG+SR    0.011   2.91e-4    7.61    6.66
             FBP       0.011   3.45e-4    15.86   10.69
    2        DUG+SR    0.004   1.53e-4    9.34    9.02
             FBP       0.004   1.84e-4    15.36   13.74
    3        DUG+SR    0.005   5.90e-4    9.40    5.49
             FBP       0.005   4.88e-4    16.19   7.78
    4        DUG+SR    0.012   8.319e-4   15.47   5.92
             FBP       0.012   2.736e-4   14.74   11.47

Fig. 10: Image predictions by DUG+SR, DeepPET and GT for four PET images from different parts of the patient volume.

IV. DISCUSSION

Deep learning has been applied to different fields of medical imaging. The vast majority of developments concern primarily image processing and analysis/classification tasks. The few works devoted to image reconstruction have largely concentrated on the use of deep learning within classical tomographic reconstruction algorithms. The main objectives of these works have been an improvement in the speed of convergence and in the quality of the successive image estimates within the iterative reconstruction process. The alternative approach, involving direct image reconstruction through the use of deep learning to estimate images directly from the raw data (sinograms or projections), has been much less explored, both for PET and CT.

Most implementations in direct image reconstruction concern the use of fully connected layers which encode the raw data, followed by convolutional layers. In most of the proposed implementations a large number of parameters needs to be optimised, which increases the computational burden and reduces the overall robustness. In this work we have proposed an original direct image reconstruction deep learning framework based on an architecture inspired by the convolutional generative adversarial networks used in image-to-image translation. The implementation is based on the use of a double U-Net generator (DUG) consisting of two cascaded U-Nets. While the first network transforms the raw data to an image, the second one assesses the reconstructed image output of the first network by reiterating the relationship between the reconstructed image and the raw data. Two additional blocks were added, namely a network denoising the raw data prior to their input to the DUG network, and a super-resolution block operating on the DUG output image in order to improve its overall quality. The proposed network was directly trained on clinical datasets for both PET and CT image reconstruction and its performance was assessed qualitatively and quantitatively.

Deep neural networks usually result in blurred output. This fact is clear in the predictions made by the DUG network. Both the qualitative analysis using the profiles through the reconstructed images and the quantitative metrics, SSIM and MSE, demonstrate the improvement of the reconstructed images resulting from the incorporation of the SR block. The qualitative analysis also clearly demonstrates the superiority of the proposed algorithm for direct PET image reconstruction in comparison to alternative approaches such as DeepPET. Finally, in the ROI analysis we observed that the SNR and CNR are higher with the deep learning approach for the PET images, while they are lower than the traditional methods for CT images. This is consistent with the observations in the qualitative analysis, where the proposed approach was not able to sufficiently resolve different tissue classes in the resulting reconstructed CT images in comparison with the ground truth. One of the potential reasons for the worse performance of DUG-RECON for CT reconstruction, relative to the superior performance observed for PET image reconstruction, may be the lower number of available CT images in the training process. This limitation will be addressed as part of future work. Despite the lower performance of the proposed architecture for CT images, it still presents comparable predictions and opens up avenues for deep learning architectures in tomographic reconstruction. In general, a limitation of deep learning based reconstruction is its adaptability to new data which is very different from the training sample space. Once a practical methodology is identified, one could have a deep learning pipeline with an ensemble of networks trained on different datasets to perform the reconstruction task.

Fig. 11: Intensity profile across the image (highlighted by a yellow line) for a PET image prediction by DUG, DUG+SR and DeepPET compared with the GT.

V. CONCLUSION

We have demonstrated the use of generative convolutional networks for the tomographic image reconstruction task. More specifically, we have proposed a new architecture for direct reconstruction that approximates the 2-D reconstruction process. We have also significantly reduced the parameters required for the domain transform task in image reconstruction. The three-step training pipeline, based exclusively on deep learning, decentralises the various tasks involved in image reconstruction into denoising, domain transform and super resolution. Various super resolution strategies are currently being explored to improve the reconstructed image. Our proposed strategy for tomographic reconstruction will eventually lead to a network based reconstruction as we continue to improve the framework. Currently it does not perform better than traditional methods in terms of utility metrics, but it still has the advantage of instantaneous reconstruction and an effective denoising strategy. We plan to extend the work to realistic detector data generated through Monte Carlo simulations, in addition to sinograms obtained through the Radon transform. We are also working on adapting the architecture to raw detector data. Another important aspect of the data based deep learning approach is that the predictions are limited by the quality of the dataset. It becomes essential to have realistic datasets, without compromising on the image quality, to improve the training of the neural networks.

Fig. 12: SNR and CNR comparison amongst DUG+SR and OSEM for a PET image along 4 regions of interest.

Fig. 13: Image predictions by DUG+SR, DeepPET and GT displayed for 3 CT images along different parts of the patient volume.

Fig. 14: Intensity profile for two CT images (highlighted by a yellow line) predicted by DUG+SR and DeepPET compared with the GT.

Fig. 15: SNR and CNR comparison amongst DUG+SR and FBP for a CT image along 4 regions of interest.

REFERENCES

[1] O. Ronneberger, P. Fischer, and T. Brox, "U-Net: Convolutional networks for biomedical image segmentation," in International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, 2015, pp. 234–241.
[2] Z. Guo, X. Li, H. Huang, N. Guo, and Q. Li, "Deep learning-based image segmentation on multimodal medical imaging," IEEE Transactions on Radiation and Plasma Medical Sciences, vol. 3, no. 2, pp. 162–169, 2019.
[3] A. Sinha and J. Dolz, "Multi-scale guided attention for medical image segmentation," arXiv preprint arXiv:1906.02849, 2019.

[4] J. Dolz, K. Gopinath, J. Yuan, H. Lombaert, C. Desrosiers, and I. B. Ayed, "HyperDense-Net: a hyper-densely connected CNN for multi-modal image segmentation," IEEE Transactions on Medical Imaging, vol. 38, no. 5, pp. 1116–1126, 2018.
[5] M. Hatt, B. Laurent, A. Ouahabi, H. Fayad, S. Tan, L. Li, W. Lu, V. Jaouen, C. Tauber, J. Czakon et al., "The first MICCAI challenge on PET tumor segmentation," Medical Image Analysis, vol. 44, pp. 177–195, 2018.
[6] V. S. Kadimesetty, S. Gutta, S. Ganapathy, and P. K. Yalavarthy, "Convolutional neural network-based robust denoising of low-dose computed tomography perfusion maps," IEEE Transactions on Radiation and Plasma Medical Sciences, vol. 3, no. 2, pp. 137–152, 2018.
[7] M. Li, W. Hsu, X. Xie, J. Cong, and W. Gao, "SACNN: Self-attention convolutional neural network for low-dose CT denoising with self-supervised perceptual loss network," IEEE Transactions on Medical Imaging, 2020.
[8] H. Chen, Y. Zhang, W. Zhang, P. Liao, K. Li, J. Zhou, and G. Wang, "Low-dose CT denoising with convolutional neural network," in 2017 IEEE 14th International Symposium on Biomedical Imaging (ISBI 2017). IEEE, 2017, pp. 143–146.
[9] Q. Yang, P. Yan, Y. Zhang, H. Yu, Y. Shi, X. Mou, M. K. Kalra, Y. Zhang, L. Sun, and G. Wang, "Low-dose CT image denoising using a generative adversarial network with Wasserstein distance and perceptual loss," IEEE Transactions on Medical Imaging, vol. 37, no. 6, pp. 1348–1357, 2018.
[10] G. Litjens, T. Kooi, B. E. Bejnordi, A. A. A. Setio, F. Ciompi, M. Ghafoorian, J. A. Van Der Laak, B. Van Ginneken, and C. I. Sánchez, "A survey on deep learning in medical image analysis," Medical Image Analysis, vol. 42, pp. 60–88, 2017.
[11] A. Amyar, S. Ruan, I. Gardin, C. Chatelain, P. Decazes, and R. Modzelewski, "3-D RPET-NET: development of a 3-D PET imaging convolutional neural network for radiomics analysis and outcome prediction," IEEE Transactions on Radiation and Plasma Medical Sciences, vol. 3, no. 2, pp. 225–231, 2019.
[12] S. Cui, Y. Luo, H.-H. Tseng, R. K. Ten Haken, and I. El Naqa, "Artificial neural network with composite architectures for prediction of local control in radiotherapy," IEEE Transactions on Radiation and Plasma Medical Sciences, vol. 3, no. 2, pp. 242–249, 2018.
[13] K. Gong, J. Guan, C.-C. Liu, and J. Qi, "PET image denoising using a deep neural network through fine tuning," IEEE Transactions on Radiation and Plasma Medical Sciences, vol. 3, no. 2, pp. 153–161, 2018.
[14] J. Maier, S. Sawall, M. Knaup, and M. Kachelrieß, "Deep scatter estimation (DSE): Accurate real-time scatter estimation for X-ray CT using a deep convolutional neural network," Journal of Nondestructive Evaluation, vol. 37, no. 3, p. 57, 2018.
[15] Z. Xie, R. Baikejiang, K. Gong, X. Zhang, and J. Qi, "Generative adversarial networks based regularized image reconstruction for PET," in 15th International Meeting on Fully Three-Dimensional Image Reconstruction in Radiology and Nuclear Medicine, vol. 11072. International Society for Optics and Photonics, 2019, p. 110720P.
[16] K. Kim, D. Wu, K. Gong, J. Dutta, J. H. Kim, Y. D. Son, H. K. Kim, G. El Fakhri, and Q. Li, "Penalized PET reconstruction using deep learning prior and local linear fitting," IEEE Transactions on Medical Imaging, vol. 37, no. 6, pp. 1478–1487, 2018.
[17] K. Gong, J. Guan, K. Kim, X. Zhang, J. Yang, Y. Seo, G. El Fakhri, J. Qi, and Q. Li, "Iterative PET image reconstruction using convolutional neural network representation," IEEE Transactions on Medical Imaging, vol. 38, no. 3, pp. 675–685, 2019.
[18] W. Whiteley and J. Gregor, "Direct image reconstruction from raw measurement data using an encoding transform refinement-and-scaling neural network," in 15th International Meeting on Fully Three-Dimensional Image Reconstruction in Radiology and Nuclear Medicine, vol. 11072. International Society for Optics and Photonics, 2019, p. 1107225.
[19] B. Zhu, J. Z. Liu, S. F. Cauley, B. R. Rosen, and M. S. Rosen, "Image reconstruction by domain-transform manifold learning," Nature, vol. 555, no. 7697, p. 487, 2018.
[20] I. Haeggstroem, C. R. Schmidtlein, G. Campanella, and T. J. Fuchs, "DeepRec: A deep encoder-decoder network for directly solving the PET reconstruction inverse problem," arXiv preprint arXiv:1804.07851, 2018.
[21] F. Natterer, The Mathematics of Computerized Tomography, ser. Classics in Applied Mathematics. SIAM, 2001.
[22] L. A. Shepp and Y. Vardi, "Maximum likelihood reconstruction for emission tomography," IEEE Transactions on Medical Imaging, vol. 1, no. 2, pp. 113–122, 1982.
[23] J. A. Fessler, M. Sonka, and J. M. Fitzpatrick, "Statistical image reconstruction methods for transmission tomography," Handbook of Medical Imaging, vol. 2, pp. 1–70, 2000.

[24] A. R. De Pierro, "A modified expectation maximization algorithm for penalized likelihood estimation in emission tomography," IEEE Transactions on Medical Imaging, vol. 14, no. 1, pp. 132–137, 1995.
[25] D. Perdios, M. Vonlanthen, A. Besson, F. Martinez, M. Arditi, and J.-P. Thiran, "Deep convolutional neural network for ultrasound image enhancement," in 2018 IEEE International Ultrasonics Symposium (IUS). IEEE, 2018, pp. 1–4.
[26] P. Isola, J.-Y. Zhu, T. Zhou, and A. A. Efros, "Image-to-image translation with conditional adversarial networks," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp. 1125–1134.
[27] C. Ledig, L. Theis, F. Huszár, J. Caballero, A. Cunningham, A. Acosta, A. Aitken, A. Tejani, J. Totz, Z. Wang et al., "Photo-realistic single image super-resolution using a generative adversarial network," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp. 4681–4690.
[28] B. Lim, S. Son, H. Kim, S. Nah, and K. Mu Lee, "Enhanced deep residual networks for single image super-resolution," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2017, pp. 136–144.
[29] K. Simonyan and A. Zisserman, "Very deep convolutional networks for large-scale image recognition," arXiv preprint arXiv:1409.1556, 2014.
[30] L. Kostakoglu, F. Duan, M. O. Idowu, P. R. Jolles, H. D. Bear, M. Muzi, J. Cormack, J. P. Muzi, D. A. Pryma, J. M. Specht et al., "A phase II study of 3'-deoxy-3'-18F-fluorothymidine PET in the assessment of early response of breast cancer to neoadjuvant chemotherapy: results from ACRIN 6688," Journal of Nuclear Medicine, vol. 56, no. 11, pp. 1681–1689, 2015.
[31] M. Abadi, P. Barham, J. Chen, Z. Chen, A. Davis, J. Dean, M. Devin, S. Ghemawat, G. Irving, M. Isard et al., "TensorFlow: A system for large-scale machine learning," in 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16), 2016, pp. 265–283.
[32] F. Chollet et al., "Keras," https://github.com/fchollet/keras, 2015.
