Model-Based Bayesian Deep Learning Architecture for Linear Inverse Problems in Computational Imaging
Canberk Ekmekci 1, Mujdat Cetin 1,2
1 Department of Electrical and Computer Engineering, University of Rochester, Rochester, NY, USA.
2 Goergen Institute for Data Science, University of Rochester, Rochester, NY, USA.

https://doi.org/10.2352/ISSN.2470-1173.2021.15.COIMG-201
© 2021, Society for Imaging Science and Technology
IS&T International Symposium on Electronic Imaging 2021, Computational Imaging XIX, 201-1

This work was partially supported by the National Science Foundation (NSF) under Grant CCF-1934962.

Abstract
We propose a neural network architecture, combined with specific training and inference procedures, for linear inverse problems arising in computational imaging, to reconstruct the underlying image and to represent the uncertainty about the reconstruction. The proposed architecture is built from the model-based reconstruction perspective, which enforces data consistency and eliminates artifacts in an alternating manner. The training and inference procedures are based on performing approximate Bayesian analysis on the weights of the proposed network using a variational inference method. The proposed architecture with the associated inference procedure is capable of characterizing uncertainty while performing reconstruction with a model-based approach. We tested the proposed method on a simulated magnetic resonance imaging experiment. We showed that the proposed method achieved an adequate reconstruction capability and provided reliable uncertainty estimates, in the sense that the regions flagged as highly uncertain by the proposed method are likely to be the regions where reconstruction errors occur.

Introduction
The problem of reconstructing the underlying image can be cast as solving a linear inverse problem for several imaging modalities, such as magnetic resonance imaging [1] and computed tomography [2]. Thus, linear inverse problems are at the foundation of computational imaging.

Recently, neural network-based methods have become increasingly popular for solving the linear inverse problems arising in computational imaging (for a review, see [3]). While one class of methods, such as [4], tries to invert the forward model with a deep neural network to reconstruct the underlying image from the measured data, another class of methods, such as [5], takes a slightly more conservative approach and aims to recover the latent image by formulating an optimization problem and replacing some part of the iterative reconstruction algorithm with a neural network.

Although these methods achieve state-of-the-art results in several computational imaging applications, the resulting image might exhibit unexpected instabilities [6]. The lack of uncertainty information about the reconstructed image severely limits the applicability of neural network-based methods in practice, especially in safety-critical applications. If a neural network-based reconstruction algorithm is able to provide uncertainty information about the reconstructed image, that information can be leveraged to assess the quality of the reconstruction or to warn the practitioner who uses a fully automated pipeline performing reconstruction and analysis tasks simultaneously. Thus, being able to represent the uncertainty in neural network-based reconstruction methods is crucial for computational imaging problems.

To obtain uncertainty information from a Bayesian perspective, we need to define probability distributions on the weights of a neural network and to obtain the posterior distribution of the weights. These types of probabilistic models, called Bayesian neural networks [7], have attracted significant attention recently. Unfortunately, obtaining the posterior distribution of the weights is not an easy task because of the large number of parameters and the complexity of neural network models. Hence, different approximation methods (see [8] and the references therein) have been used in the literature to perform approximate Bayesian analysis for neural networks.

In this article, we introduce a model-based Bayesian deep learning architecture for computational imaging, which is built by using a model-based reconstruction approach. The training and inference procedures utilize Monte Carlo (MC) Dropout [9] to perform variational inference. Combined with these specific training and inference procedures, the proposed architecture performs reconstruction and provides uncertainty information about the reconstructed image. The proposed architecture and the associated training and inference procedures are easy to implement in deep learning frameworks.

Recently, [10] proposed a U-Net [11] based methodology combined with MC Dropout to obtain uncertainty information about the reconstruction for phase imaging. The differences between our approach and the one in [10] are that our architecture is built by taking a model-based reconstruction perspective, which can be perceived as a slightly more conservative approach compared to [10], and that we focus on a broad class of computational imaging problems, namely, all computational imaging problems that can be written as a system of linear equations.

The rest of the paper is organized as follows: In the "Preliminaries" section, we review the basic material related to the formulation of the reconstruction problem, the generative model of the data, and the Bayesian approach to uncertainty estimation. In the "Proposed Method" section, we describe the proposed architecture with its training and inference procedures in detail. In the "Experiments" section, we present the experimental results we have obtained in a simulated magnetic resonance imaging scenario. Finally, the "Conclusion" section concludes the paper.

Preliminaries
In this section, we review some background on the classical statistical reconstruction approach, the generative model of the data, and the Bayesian approach to uncertainty estimation for neural networks.

MAP Reconstruction
Consider the following setup

\[
m = As + n, \tag{1}
\]

where $m \in \mathbb{C}^S$ is the vector containing the measurements, $A \in \mathbb{C}^{S \times D}$ is the operator that represents the transformation applied by the imaging system, $s \in \mathbb{C}^D$ is the vectorized image, and $n \in \mathbb{C}^S$ is the measurement noise in the system, which is assumed to be circularly-symmetric complex Gaussian noise with variance $\sigma_{\mathrm{noise}}^2$.

For an underdetermined system ($S < D$), recovering the image, $s$, from the measurements, $m$, becomes an ill-posed problem. One way to restrict the solution space and regularize the task is to use prior knowledge about the image, $s$. Then, the maximum a posteriori (MAP) estimate of the image, $\hat{s}$, can be found by solving the following optimization problem

\[
\hat{s} = \operatorname*{argmin}_{s \in \mathbb{C}^D} \left\{ \| As - m \|_2^2 + \beta \psi(s) \right\}, \tag{2}
\]

where the function $\psi : \mathbb{C}^D \to \mathbb{R}$ is the regularizer coming from the image prior, and the parameter $\beta$ controls the balance between the data fidelity term and the regularizer. To find an equivalent problem involving only real vectors and matrices, we introduce two operators $h_n : \mathbb{C}^n \to \mathbb{R}^{2n}$ and $k_{n,m} : \mathbb{C}^{n \times m} \to \mathbb{R}^{2n \times 2m}$ such that

\[
h_n(x) = \begin{bmatrix} \Re(x^\top) & \Im(x^\top) \end{bmatrix}^\top, \qquad
k_{n,m}(X) = \begin{bmatrix} \Re(X) & -\Im(X) \\ \Im(X) & \Re(X) \end{bmatrix}, \tag{3}
\]

where $\Re$ and $\Im$ compute the element-wise real and imaginary parts of a given vector or matrix, respectively. Then, using these two operators, the optimization problem in Eqn. (2) can be written as

\[
\hat{s} = h_D^{-1}\!\left( \operatorname*{argmin}_{\tilde{s} \in \mathbb{R}^{2D}} \left\{ \| \tilde{A}\tilde{s} - \tilde{m} \|_2^2 + \beta \tilde{\psi}(\tilde{s}) \right\} \right), \tag{4}
\]

where $\tilde{A} := k_{S,D}(A)$, $\tilde{s} := h_D(s)$, $\tilde{m} := h_S(m)$, and $\tilde{\psi}$ is the modified regularizer such that $\psi(s) = \tilde{\psi}(\tilde{s})$ for all $(s, \tilde{s})$ pairs. Therefore, finding the solution of Eqn. (2) boils down to solving the real-valued optimization problem inside $h_D^{-1}(\cdot)$.

Generative Model of the Data
We assume that the training dataset, $\mathcal{D}_{\mathrm{tr}}$, consists of i.i.d. pairs of measurements and target images, $(\tilde{m}^{(i)}, \tilde{s}^{(i)})$, drawn from $p(\tilde{m})\, p(\tilde{s} \mid \tilde{m})$, where both $p(\tilde{m})$ and $p(\tilde{s} \mid \tilde{m})$ are unknown distributions and can only be accessed empirically through $\mathcal{D}_{\mathrm{tr}}$. This assumption is justifiable for computational imaging problems. Each measurement in the training dataset is obtained by using the operator $A$ and the noise variance $\sigma_{\mathrm{noise}}^2$, which can be thought of as generating a sample from $p(\tilde{m})$. Ideally, we want to use ground truth images as target images; however, we do not have access to ground truth images in practice. Thus, the target image for a measurement is obtained by imaging the same object using a suppressed noise level and without data reduction, so that the resulting image, often referred to as the reference image, is close to the ground truth image. This process can be thought of as generating a sample from $p(\tilde{s} \mid \tilde{m})$.

We also assume that the test dataset,

\[
\mathcal{D}_{\mathrm{te}} = \left\{ \left(\tilde{m}_*^{(i)}, \tilde{s}_*^{(i)}\right) \;\middle|\; \tilde{m}_*^{(i)} = h_S\!\left(m_*^{(i)}\right),\; \tilde{s}_*^{(i)} = h_D\!\left(s_*^{(i)}\right),\; i \in [M] \right\}, \tag{6}
\]

consists of i.i.d. samples of measurements and target images drawn from the same distribution that the training measurements and target images are drawn from.

Bayesian Approach to Uncertainty Estimation
Suppose we have a training dataset, $\mathcal{D}_{\mathrm{tr}}$, and a test measurement, $\tilde{m}_*$, from $\mathcal{D}_{\mathrm{te}}$. For a regression model with a set of parameters $\gamma$, the predictive distribution is

\[
p(\tilde{s}_* \mid \tilde{m}_*, \mathcal{D}_{\mathrm{tr}}) = \int p(\tilde{s}_* \mid \tilde{m}_*, \gamma)\, p(\gamma \mid \mathcal{D}_{\mathrm{tr}})\, d\gamma. \tag{7}
\]

Throughout this paper, we refer to $p(\tilde{s}_* \mid \tilde{m}_*, \gamma)$ as the likelihood and to $p(\gamma \mid \mathcal{D}_{\mathrm{tr}})$ as the posterior distribution.

Unfortunately, obtaining the posterior distribution is not an easy task because of the large number of parameters and the complexity of neural network architectures. Hence, approximation techniques such as variational inference methods [9] or Markov chain Monte Carlo-based methods [7] are necessary. After obtaining an approximation of the posterior distribution and defining the form of the likelihood, we can approximate the integral in Eqn. (7) using Monte Carlo integration to estimate the predictive distribution. Finally, we obtain the mean and the variance of the estimated predictive distribution with moment matching. The resulting predictive variance can be used to characterize the uncertainty in the prediction.
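The MAP formulation above can be made concrete with a small simulation. The sketch below is illustrative only (not the authors' implementation, and the sizes, noise level, and regularization weight are arbitrary choices): it generates measurements under the forward model with circularly-symmetric complex Gaussian noise and computes the MAP estimate for the particular choice of a Tikhonov regularizer, for which the problem has the closed-form solution $\hat{s} = (A^H A + \beta I)^{-1} A^H m$.

```python
import numpy as np

rng = np.random.default_rng(1)
S, D = 8, 16                      # underdetermined system: S < D
sigma_noise = 0.05

# Forward model m = A s + n, with all quantities complex-valued.
A = (rng.standard_normal((S, D)) + 1j * rng.standard_normal((S, D))) / np.sqrt(2 * D)
s_true = rng.standard_normal(D) + 1j * rng.standard_normal(D)

# Circularly-symmetric complex Gaussian noise with variance sigma_noise**2.
n = sigma_noise / np.sqrt(2) * (rng.standard_normal(S) + 1j * rng.standard_normal(S))
m = A @ s_true + n

# MAP estimate for the Tikhonov regularizer psi(s) = ||s||_2^2:
#   s_hat = argmin_s ||A s - m||_2^2 + beta ||s||_2^2
#         = (A^H A + beta I)^(-1) A^H m   (closed form for this regularizer)
beta = 1e-2
s_hat = np.linalg.solve(A.conj().T @ A + beta * np.eye(D), A.conj().T @ m)

# Sanity check: the minimizer fits the data at least as well as the zero image.
assert np.linalg.norm(A @ s_hat - m) <= np.linalg.norm(m)
```

For a general regularizer there is no closed form, and an iterative solver would replace the direct solve; the closed-form Tikhonov case is used here only to keep the sketch short.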
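The real-valued reformulation can also be sketched directly. The following NumPy snippet (an illustrative sketch, with hypothetical function names) implements the operators $h_n$ and $k_{n,m}$ and checks the identity $k_{S,D}(A)\, h_D(s) = h_S(As)$, which is what makes the real-valued problem equivalent to the complex one.

```python
import numpy as np

def h(x):
    """h_n: stack real and imaginary parts of a complex vector, C^n -> R^(2n)."""
    return np.concatenate([x.real, x.imag])

def h_inv(x_tilde):
    """Inverse of h_n: map R^(2n) back to C^n."""
    n = x_tilde.size // 2
    return x_tilde[:n] + 1j * x_tilde[n:]

def k(X):
    """k_{n,m}: real block representation of a complex matrix, C^(n x m) -> R^(2n x 2m)."""
    return np.block([[X.real, -X.imag],
                     [X.imag,  X.real]])

rng = np.random.default_rng(0)
S, D = 4, 6
A = rng.standard_normal((S, D)) + 1j * rng.standard_normal((S, D))
s = rng.standard_normal(D) + 1j * rng.standard_normal(D)

# The real-valued system reproduces the complex matrix-vector product:
assert np.allclose(k(A) @ h(s), h(A @ s))
# h is invertible, so the complex image is recovered from its real representation:
assert np.allclose(h_inv(h(s)), s)
```

The block structure of $k_{n,m}$ encodes complex multiplication: the top rows compute $\Re(A)\Re(s) - \Im(A)\Im(s) = \Re(As)$ and the bottom rows compute $\Im(A)\Re(s) + \Re(A)\Im(s) = \Im(As)$.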
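The Monte Carlo approximation of the predictive distribution under MC Dropout [9] amounts to keeping dropout active at test time, running several stochastic forward passes, and moment-matching the resulting samples. Below is a minimal NumPy sketch with a toy two-layer regression network; the weights, layer sizes, and dropout rate are arbitrary stand-ins, not the proposed architecture.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy two-layer regression network with fixed, stand-in "trained" weights.
W1 = 0.1 * rng.standard_normal((32, 16))
W2 = 0.1 * rng.standard_normal((1, 32))

def forward(x, rng, p_drop=0.2):
    """One stochastic forward pass with dropout kept active at inference."""
    hidden = np.maximum(W1 @ x, 0.0)              # ReLU hidden layer
    mask = rng.random(hidden.shape) >= p_drop     # Bernoulli dropout mask
    hidden = hidden * mask / (1.0 - p_drop)       # inverted-dropout scaling
    return (W2 @ hidden).item()

x_test = rng.standard_normal(16)

# Monte Carlo integration of the predictive distribution: each stochastic
# pass corresponds to sampling weights from the dropout variational posterior.
T = 500
samples = np.array([forward(x_test, rng) for _ in range(T)])

# Moment matching: predictive mean and predictive variance of the samples.
pred_mean = samples.mean()
pred_var = samples.var()      # used to characterize uncertainty
```

In an imaging setting the scalar output would be a vector of pixel values, and the per-pixel predictive variance would form the uncertainty map.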