https://doi.org/10.2352/ISSN.2470-1173.2021.15.COIMG-201 © 2021, Society for Imaging Science and Technology

Model-Based Bayesian Deep Learning Architecture for Linear Inverse Problems in Computational Imaging

Canberk Ekmekci¹, Mujdat Cetin¹,²
¹Department of Electrical and Computer Engineering, University of Rochester, Rochester, NY, USA
²Goergen Institute for Data Science, University of Rochester, Rochester, NY, USA

Abstract
We propose a neural network architecture, combined with specific training and inference procedures, for linear inverse problems arising in computational imaging, to reconstruct the underlying image and to represent the uncertainty about the reconstruction. The proposed architecture is built from the model-based reconstruction perspective, which enforces data consistency and eliminates artifacts in an alternating manner. The training and inference procedures are based on performing approximate Bayesian analysis on the weights of the proposed network using a variational inference method. The proposed architecture, with the associated inference procedure, is capable of characterizing uncertainty while performing reconstruction with a model-based approach. We tested the proposed method on a simulated magnetic resonance imaging experiment. We showed that the proposed method achieved adequate reconstruction capability and provided reliable uncertainty estimates, in the sense that the regions assigned high uncertainty by the proposed method are likely to be the regions where reconstruction errors occur.

Introduction
For several imaging modalities, such as magnetic resonance imaging [1] and computed tomography [2], the problem of reconstructing the underlying image can be cast as solving a linear inverse problem. Thus, linear inverse problems are at the foundation of computational imaging.

Recently, neural network-based methods have become increasingly popular for solving the linear inverse problems arising in computational imaging (for a review, see [3]). While one class of methods, such as [4], tries to invert the forward model with a deep neural network to reconstruct the underlying image from the measured data, another class of methods, such as [5], takes a slightly more conservative approach and aims to recover the latent image by formulating an optimization problem and replacing some part of the optimization algorithm with a neural network.

Although these methods achieve state-of-the-art results in several computational imaging applications, the resulting image might exhibit unexpected instabilities [6]. The lack of uncertainty information about the reconstructed image severely limits the applicability of neural network-based methods in practice, especially in safety-critical applications. If a neural network-based reconstruction algorithm is able to provide uncertainty information about the reconstructed image, that information can be leveraged to assess the quality of the reconstruction or to warn a practitioner who uses a fully automated pipeline performing reconstruction and analysis tasks simultaneously. Thus, being able to represent the uncertainty in neural network-based reconstruction methods is crucial for computational imaging problems.

To obtain uncertainty information from a Bayesian perspective, we need to define probability distributions on the weights of a neural network and to obtain the posterior distribution of the weights. These types of probabilistic models, called Bayesian neural networks [7], have attracted significant attention recently. Unfortunately, obtaining the posterior distribution of the weights is not an easy task because of the large number of parameters and the complexity of neural network models. Hence, different approximation methods (see [8] and the references therein) have been used in the literature to perform approximate Bayesian analysis for neural networks.

In this article, we introduce a model-based Bayesian deep learning architecture for computational imaging, which is built by using a model-based reconstruction approach. The training and inference procedures utilize Monte Carlo (MC) Dropout [9] to perform variational inference. Combined with these specific training and inference procedures, the proposed architecture performs reconstruction and provides uncertainty information about the reconstructed image. The proposed architecture and the associated training and inference procedures are easy to implement in deep learning frameworks.

Recently, [10] proposed a U-Net [11] based methodology combined with MC Dropout to obtain uncertainty information about the reconstruction for phase imaging. The differences between our approach and the one in [10] lie in that our architecture is built by taking a model-based reconstruction perspective, which can be perceived as a slightly more conservative approach compared to [10], and that we focus on a broad class of computational imaging problems, i.e., all computational imaging problems that can be written as a system of linear equations.

The rest of the paper is organized as follows: In the "Preliminaries" section, we review the basic material related to the formulation of the reconstruction problem, the generative model of the data, and the Bayesian approach to uncertainty estimation. In the "Proposed Method" section, we describe the proposed architecture with its training and inference procedures in detail. In the "Experiments" section, we present the experimental results we have obtained in a simulated magnetic resonance imaging scenario. Finally, the "Conclusion" section concludes the paper.

This work was partially supported by the National Science Foundation (NSF) under Grant CCF-1934962.

Preliminaries
In this section, we review some background on the classical statistical reconstruction approach, the generative model of the data, and the Bayesian approach to uncertainty estimation for neural networks.

MAP Reconstruction
Consider the following setup:

\[
m = As + n, \tag{1}
\]

where m ∈ C^S is the vector containing the measurements, A ∈ C^{S×D} is the operator that represents the transformation applied by the imaging system, s ∈ C^D is the vectorized image, and n ∈ C^S is the measurement noise in the system, which is assumed to be circularly-symmetric complex Gaussian noise with variance σ_noise².

For an underdetermined system (S < D), recovering the image, s, from the measurements, m, becomes an ill-posed problem. One way to restrict the solution space and regularize the task is to use prior knowledge about the image, s. Then, the maximum a posteriori (MAP) estimate of the image, ŝ, can be found by solving the following optimization problem:

\[
\hat{s} = \underset{s \in \mathbb{C}^{D}}{\arg\min} \left\{ \|As - m\|_2^2 + \beta \psi(s) \right\}, \tag{2}
\]

where the function ψ : C^D → R is the regularizer coming from the image prior, and the parameter β controls the balance between the data fidelity term and the regularizer. To find an equivalent problem involving only real vectors and matrices, we introduce two operators η_n : C^n → R^{2n} and κ_{n,m} : C^{n×m} → R^{2n×2m} such that

\[
\eta_n(x) = \begin{bmatrix} \Re(x)^\top & \Im(x)^\top \end{bmatrix}^\top, \qquad
\kappa_{n,m}(X) = \begin{bmatrix} \Re(X) & -\Im(X) \\ \Im(X) & \Re(X) \end{bmatrix}, \tag{3}
\]

where ℜ and ℑ compute the element-wise real and imaginary parts of a given vector or matrix, respectively. Then, using these two operators, the optimization problem in Eqn. (2) can be written as

\[
\hat{s} = \eta_D^{-1}\!\left( \underset{\tilde{s} \in \mathbb{R}^{2D}}{\arg\min} \left\{ \|\tilde{A}\tilde{s} - \tilde{m}\|_2^2 + \beta \tilde{\psi}(\tilde{s}) \right\} \right), \tag{4}
\]

where Ã := κ_{S,D}(A), s̃ := η_D(s), m̃ := η_S(m), and ψ̃ is the modified regularizer such that ψ(s) = ψ̃(s̃) for all (s, s̃) pairs. Therefore, finding the solution of Eqn. (2) boils down to finding the solution of the optimization problem located inside the η_D^{-1} operator in Eqn. (4). Assuming that the modified regularizer, ψ̃, is a convex, closed, and proper function, we can use different splitting methods, such as the proximal gradient method [12] or the alternating direction method of multipliers (ADMM) [13], to solve the problem inside the η_D^{-1} operator in Eqn. (4) efficiently.
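To make the complexification in Eqn. (3) concrete, the following is a minimal NumPy sketch of the η and κ operators (the function names are ours); the final check verifies the identity κ(A)η(s) = η(As), which is what makes the real-valued problem in Eqn. (4) equivalent to the complex one in Eqn. (2).

```python
import numpy as np

def eta(x):
    """eta_n: C^n -> R^(2n), stacks real and imaginary parts (Eqn. (3))."""
    return np.concatenate([x.real, x.imag])

def eta_inv(x_tilde):
    """Inverse map R^(2n) -> C^n used in Eqn. (4)."""
    n = x_tilde.shape[0] // 2
    return x_tilde[:n] + 1j * x_tilde[n:]

def kappa(X):
    """kappa_{n,m}: C^(n x m) -> R^(2n x 2m), the real block lifting (Eqn. (3))."""
    return np.block([[X.real, -X.imag],
                     [X.imag,  X.real]])

# Check the identity kappa(A) @ eta(s) == eta(A @ s).
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 6)) + 1j * rng.standard_normal((4, 6))
s = rng.standard_normal(6) + 1j * rng.standard_normal(6)
assert np.allclose(kappa(A) @ eta(s), eta(A @ s))
```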
Generative Model of the Data
Suppose that, for a fixed operator A and a noise variance σ_noise², we have a training dataset, D_tr, consisting of a collection of N measurement-target image pairs, i.e.,

\[
D_{tr} = \left\{ (\tilde{m}^{(i)}, \tilde{s}^{(i)}) \;\middle|\; \tilde{m}^{(i)} = \eta_S(m^{(i)}),\ \tilde{s}^{(i)} = \eta_D(s^{(i)}),\ i \in [N] \right\}. \tag{5}
\]

We assume that the training dataset, D_tr, consists of i.i.d. samples of measurements and target images drawn from the distribution

\[
(\tilde{m}^{(i)}, \tilde{s}^{(i)}) \sim p(\tilde{m})\, p(\tilde{s} \mid \tilde{m}), \qquad \forall i \in \{1, \ldots, N\},
\]

where both p(m̃) and p(s̃ | m̃) are unknown distributions and can only be accessed empirically through D_tr. This assumption is justifiable for computational imaging problems. Each measurement in the training dataset is obtained by using the operator A and the noise variance σ_noise², which can be thought of as generating a sample from p(m̃). Ideally, we want to use ground truth images as target images; however, we do not have access to ground truth images in practice. Thus, the target image for a measurement is obtained by imaging the same object using a suppressed noise level and without data reduction, so that the resulting image, often referred to as the reference image, is close to the ground truth image. This process can be thought of as generating a sample from p(s̃ | m̃).

We also assume that the test dataset,

\[
D_{te} = \left\{ (\tilde{m}_*^{(i)}, \tilde{s}_*^{(i)}) \;\middle|\; \tilde{m}_*^{(i)} = \eta_S(m_*^{(i)}),\ \tilde{s}_*^{(i)} = \eta_D(s_*^{(i)}),\ i \in [M] \right\}, \tag{6}
\]

consists of i.i.d. samples of measurements and target images drawn from the same distribution that the training measurements and target images are drawn from.

Bayesian Approach to Uncertainty Estimation
Suppose we have a training dataset, D_tr, and a test measurement, m̃_*, from D_te. For a regression model with a set of parameters γ, the predictive distribution is

\[
p(\tilde{s}_* \mid \tilde{m}_*, D_{tr}) = \int p(\tilde{s}_* \mid \tilde{m}_*, \gamma)\, p(\gamma \mid D_{tr})\, d\gamma. \tag{7}
\]

Throughout this paper, we refer to p(s̃_* | m̃_*, γ) as the likelihood, and to p(γ | D_tr) as the posterior distribution.

Unfortunately, obtaining the posterior distribution is not an easy task because of the large number of parameters and the complexity of neural network architectures. Hence, approximation techniques such as variational inference methods [9] or Markov chain Monte Carlo-based methods [7] are necessary.

After obtaining an approximation of the posterior distribution and defining the form of the likelihood, we can approximate the integral in Eqn. (7) using Monte Carlo integration to estimate the predictive distribution. Finally, we obtain the mean and the variance of the estimated predictive distribution with moment matching. The resulting predictive variance can be used to characterize the uncertainty in the prediction.

Proposed Method
We define the following Gaussian likelihood for the reconstruction problem:

\[
p(\tilde{s} \mid \tilde{m}, \gamma) = \mathcal{N}\!\left( \tilde{s} \;\middle|\; f_\omega(\tilde{m}),\ \mathrm{diag}\!\left(\sigma_\phi(\tilde{m})^2\right) \right), \tag{8}
\]

where f_ω : R^{2S} → R^{2D} and σ_φ : R^{2S} → R^{2D} are neural networks parametrized by the sets of parameters ω and φ, respectively, and γ = ω ∪ φ.

The architecture of the neural network f_ω is motivated by the model-based reconstruction approach. Assuming that the modified regularizer is a closed, proper, convex function, the update equation of the proximal gradient method for solving the optimization problem inside the η_D^{-1} operator in Eqn. (4) is

\[
\tilde{s}^{[k]} = \mathrm{prox}_{\lambda\beta\tilde{\psi}}\!\left\{ \left( I - 2\lambda \tilde{A}^\top \tilde{A} \right) \tilde{s}^{[k-1]} + 2\lambda \tilde{A}^\top \tilde{m} \right\}, \tag{9}
\]

where λ is the step size, and s̃^[k] is the reconstructed image at the kth iteration.
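Before the proximal operator is replaced with a learned network below, a minimal NumPy sketch of the update in Eqn. (9) may be helpful. Soft-thresholding is used here purely as an illustrative choice of the prox (it corresponds to ψ̃ being the ℓ1 norm); it is not the operator used in the proposed method.

```python
import numpy as np

def soft_threshold(x, tau):
    """Proximal operator of tau * ||.||_1 (illustrative regularizer only)."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def proximal_gradient(A_tilde, m_tilde, lam, beta, num_iters=100):
    """Iterates Eqn. (9): a gradient step on the data-fidelity term,
    followed by the proximal operator of the modified regularizer."""
    s = np.zeros(A_tilde.shape[1])
    for _ in range(num_iters):
        # (I - 2*lam*A^T A) s + 2*lam*A^T m, written as a gradient step
        z = s - 2.0 * lam * (A_tilde.T @ (A_tilde @ s - m_tilde))
        s = soft_threshold(z, lam * beta)
    return s
```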

[Figure 1 diagram: the unrolled network alternates a data-consistency step h and the neural network block R, producing z̃^[1], s̃^[1], z̃^[2], s̃^[2], ..., z̃^[K], and finally s̃^[K] = f_ω(m̃^(i)), starting from the input m̃^(i) and the initialization s̃^[0].]

Figure 1. Proposed model-based Bayesian deep learning architecture.

[Figure 2 diagram: the block R maps z̃^[k] to s̃^[k] through Residual Block 1, ..., Residual Block P; each residual block contains Module 1 through Module C, where each module is Conv → Dropout → ReLU, except the last module of a block, which is Conv → Dropout.]

Figure 2. Neural network block R used in the proposed model-based Bayesian deep learning architecture.

The benefit of using the proximal gradient method over methods such as ADMM is that it does not require any matrix inversion in the update equation.

Similar to the idea in [5], we replace the proximal operator prox_{λβψ̃} with a neural network, and the update equation becomes

\[
\begin{aligned}
\tilde{z}^{[k]} &= \left( I - 2\lambda \tilde{A}^\top \tilde{A} \right) \tilde{s}^{[k-1]} + 2\lambda \tilde{A}^\top \tilde{m}, \\
\tilde{s}^{[k]} &= R\!\left\{ \tilde{z}^{[k]}, \omega \right\},
\end{aligned} \tag{10}
\]

where R is the neural network, and ω is the set of parameters of the neural network R. For a fixed number of iterations K, the series of updates corresponds to a deep neural network, which is the desired function f_ω. Figure 1 illustrates the structure of f_ω in detail. Figure 2 depicts the components of the neural network R, which contains P residual blocks consisting of a skip connection and C modules. Each module includes a convolutional layer followed by dropout, and each module except the last one includes a rectified linear unit (ReLU) activation function. Note that we use the same neural network R at every iteration, so the resulting neural network f_ω is consistent with the model-based reconstruction approach.
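As a sketch, the unrolled update in Eqn. (10) can be implemented in PyTorch as follows, with a single block R shared across all K iterations. The block below is a simplified stand-in for the residual architecture of Figure 2, and the layer sizes are illustrative assumptions, not the configuration used in the experiments.

```python
import torch
import torch.nn as nn

class UnrolledNet(nn.Module):
    """Unrolls Eqn. (10) for K iterations with one shared block R."""
    def __init__(self, A_tilde, img_shape, lam=0.1, K=10, p_drop=0.1, ch=32):
        super().__init__()
        self.register_buffer("A", A_tilde)      # lifted real operator (2S x 2D)
        self.lam, self.K, self.img_shape = lam, K, img_shape
        # Simplified stand-in for R (Figure 2): Conv -> Dropout -> ReLU modules,
        # with the last module omitting the ReLU.
        self.R = nn.Sequential(
            nn.Conv2d(2, ch, 3, padding=1), nn.Dropout2d(p_drop), nn.ReLU(),
            nn.Conv2d(ch, ch, 3, padding=1), nn.Dropout2d(p_drop), nn.ReLU(),
            nn.Conv2d(ch, 2, 3, padding=1), nn.Dropout2d(p_drop),
        )

    def forward(self, m_tilde):
        s = torch.zeros(self.A.shape[1], device=m_tilde.device)   # s^[0]
        for _ in range(self.K):                  # same R at every iteration
            z = s - 2 * self.lam * (self.A.t() @ (self.A @ s - m_tilde))
            z_img = z.view(1, 2, *self.img_shape)   # channels: real, imaginary
            s = (z_img + self.R(z_img)).reshape(-1) # skip connection around R
        return s
```

Because R is reused at every iteration, the number of parameters does not grow with K, which is what keeps the unrolled network consistent with the model-based iteration it mimics.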
For computational imaging problems, the neural network σ_φ captures the inherent noise in the target images, which is sometimes referred to as the aleatoric uncertainty [14]. Thus, the choice of σ_φ depends on the requirements of the application. If the true conditional distribution, p(s̃ | m̃), is not invariant across the training and test datasets, we can simply use an arbitrary neural network that provides a non-negative output to explicitly model the data-dependent uncertainty. On the other hand, if the true conditional distribution is invariant across the training and test datasets, we can focus on quantifying the uncertainty on the model parameters and set σ_φ to be a constant function, σ_φ = σ_model, where σ_model is a fixed model parameter and φ = ∅. In the rest of this section, we assume that σ_φ is a neural network parametrized by a set of parameters φ to preserve generality.

Specifying the form of the neural networks f_ω and σ_φ completely specifies the form of the likelihood function. Next, we need to find an approximation of the posterior distribution, p(γ | D_tr), which represents the uncertainty in the parameters of the regression model, γ. The uncertainty in the parameters of the regression model, which is sometimes referred to as the epistemic uncertainty [14], reflects the uncertainty caused by the lack of training examples around the test measurement m̃_*.

Monte Carlo Dropout [9] is a popular and scalable variational inference method because it has a simple implementation in deep learning frameworks, requires minimal changes to the classical neural network architecture and to the training and inference procedures, and provides reliable uncertainty estimates in several applications, such as depth completion [8] and semantic segmentation [14]. In this work, we employ Monte Carlo Dropout to obtain an approximation of the true posterior distribution p(γ | D_tr) by minimizing the Kullback-Leibler divergence between the true posterior distribution and a distribution q_α(γ) parametrized by the set of parameters α.

If the likelihood is defined as in Eqn. (8), and the distribution that we use to approximate the true posterior distribution, q_α(γ), is a Bernoulli variational distribution [15], i.e.,

\[
q_\alpha(\gamma) = \prod_{i=1}^{L} \left[ p(z_i = 1)\, \mathcal{N}(\gamma_i \mid \alpha_i, \sigma^2) + p(z_i = 0)\, \mathcal{N}(\gamma_i \mid 0, \sigma^2) \right], \tag{11}
\]

where σ is a sufficiently small constant, α = {α_i}_{i=1}^{L}, γ = {γ_i}_{i=1}^{L}, and p(z_i = 1) = p, then the optimal set of parameters, α*, can be approximated by solving the following optimization problem [14]:

\[
\alpha^* = \underset{\alpha}{\arg\min} \left\{ \frac{1}{N} \sum_{i=1}^{N} \sum_{k=1}^{2D} \left( \log\!\left[ \sigma_{\tilde{\phi}^{(i)}}(\tilde{m}^{(i)}) \right]_k + \frac{\left( [\tilde{s}^{(i)}]_k - [f_{\tilde{\omega}^{(i)}}(\tilde{m}^{(i)})]_k \right)^2}{2 \left[ \sigma_{\tilde{\phi}^{(i)}}(\tilde{m}^{(i)}) \right]_k^2} \right) + \frac{p}{2N} \sum_{i=1}^{L} \alpha_i^2 \right\}, \tag{12}
\]

where [·]_k denotes the kth element of a given vector, and γ̃^(i) = ω̃^(i) ∪ φ̃^(i) is a sample from the Bernoulli variational distribution, q_α(γ).

Interestingly, generating a sample from the Bernoulli variational distribution requires sampling a set of Bernoulli random variables {z_i}_{i=1}^{L} and multiplying them with the parameters of the Bernoulli variational distribution, {α_i}_{i=1}^{L}. This procedure resembles the dropout operation in the deep learning literature. Hence, solving the optimization problem in Eqn. (12) boils down to training the two neural networks f_ω and σ_φ using the first term of Eqn. (12) as a loss function with a weight decay parameter of p/(2N), while dropout is applied after every weight layer of the neural networks f_ω and σ_φ with a dropout rate of 1 − p. The resulting weights of the neural networks after the training stage are the optimal parameters of the Bernoulli variational distribution, α*.
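In other words, the variational objective in Eqn. (12) is a heteroscedastic Gaussian negative log-likelihood plus an L2 penalty. A minimal PyTorch sketch of the loss follows; we assume a `model` that returns the pair (f_ω(m̃), σ_φ(m̃)), and all names are ours.

```python
import torch

def mc_dropout_loss(model, m_batch, s_batch, p_drop, N):
    """Minibatch estimate of the objective in Eqn. (12). Dropout must be
    active (model.train()) so that each forward pass draws one sample
    from the Bernoulli variational distribution."""
    f_out, sigma_out = model(m_batch)            # f_omega(m), sigma_phi(m)
    nll = (torch.log(sigma_out)
           + (s_batch - f_out) ** 2 / (2.0 * sigma_out ** 2)).sum(dim=1).mean()
    # Weight decay term (p / 2N) * sum_i alpha_i^2 from Eqn. (12).
    l2 = sum((w ** 2).sum() for w in model.parameters())
    return nll + p_drop / (2.0 * N) * l2
```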

After obtaining an approximation of the posterior distribution, q_α(γ) ≈ p(γ | D_tr), we can find approximations of the mean and the variance of the predictive distribution by moment matching with T samples:

\[
\mathbb{E}[\tilde{s}_* \mid \tilde{m}_*, D_{tr}] \approx \frac{1}{T} \sum_{t=1}^{T} f_{\tilde{\omega}^{(t)}}(\tilde{m}_*), \tag{13}
\]

\[
\mathrm{Var}[[\tilde{s}_*]_k \mid \tilde{m}_*, D_{tr}] \approx \frac{1}{T} \sum_{t=1}^{T} \left[ \sigma_{\tilde{\phi}^{(t)}}(\tilde{m}_*) \right]_k^2 + \frac{1}{T} \sum_{t=1}^{T} \left[ f_{\tilde{\omega}^{(t)}}(\tilde{m}_*) \right]_k^2 - \left( \frac{1}{T} \sum_{t=1}^{T} \left[ f_{\tilde{\omega}^{(t)}}(\tilde{m}_*) \right]_k \right)^2. \tag{14}
\]

Because the original formulation of the problem involves complex target images, we can find the mean and the variance of the complex version of the predictive distribution as

\[
\mathbb{E}[s_* \mid m_*, D_{tr}] = \eta_D^{-1}\!\left( \mathbb{E}[\tilde{s}_* \mid \tilde{m}_*, D_{tr}] \right), \tag{15}
\]

\[
\mathrm{Var}[[s_*]_k \mid m_*, D_{tr}] = \mathrm{Var}[[\tilde{s}_*]_k \mid \tilde{m}_*, D_{tr}] + \mathrm{Var}\!\left[ [\tilde{s}_*]_{k+D} \mid \tilde{m}_*, D_{tr} \right], \tag{16}
\]

for all k ∈ {1, 2, ..., D}. Remarkably, the inference procedure also requires samples from the Bernoulli variational distribution. Thus, finding the predictive mean and variance boils down to feeding the measurement, m̃_*, into the neural networks f_ω and σ_φ T times while dropout is enabled, and then performing the updates in Eqn. (15) and Eqn. (16).
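A sketch of this inference procedure in PyTorch: dropout is kept active at test time, T stochastic forward passes are collected, and the moments in Eqns. (13)-(16) are computed from the samples. As before, a `model` returning the pair (f_ω(m̃), σ_φ(m̃)) is our assumption.

```python
import torch

@torch.no_grad()
def mc_dropout_predict(model, m_tilde, T=100):
    """Predictive mean and variance via Eqns. (13)-(16)."""
    model.train()                        # keep dropout enabled at test time
    f_samples, sig2_samples = [], []
    for _ in range(T):
        f_out, sigma_out = model(m_tilde)
        f_samples.append(f_out)
        sig2_samples.append(sigma_out ** 2)
    F = torch.stack(f_samples)           # (T, 2D)
    mean = F.mean(dim=0)                                   # Eqn. (13)
    var = (torch.stack(sig2_samples).mean(dim=0)
           + (F ** 2).mean(dim=0) - mean ** 2)             # Eqn. (14)
    D = mean.shape[0] // 2
    mean_c = mean[:D] + 1j * mean[D:]                      # Eqn. (15)
    var_c = var[:D] + var[D:]                              # Eqn. (16)
    return mean_c, var_c
```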
Experiments and Results
The main advantage of the proposed method is that we can both perform reconstruction and obtain uncertainty information about the reconstruction. In this section, we evaluate the proposed method on a simulated magnetic resonance imaging experiment and focus on addressing the following questions:

• Can we use the uncertainty information to locate the potentially erroneous regions in the reconstructed image?
• Is the uncertainty information reliable in the sense that the region described by the predictive mean and the predictive variance contains the target pixel intensity values?
• How does the size of the training dataset, N, affect the uncertainty?

Setup
In magnetic resonance imaging, the measurement vector, m, contains the Fourier coefficients of the ground truth image. In the experiments, we used a single-coil model with a constant sensitivity map; therefore, the forward operator A is simply a subsampled Fourier transform operator. The subsampling rate of the Fourier coefficients in k-space determines the structure of the A matrix and the acceleration rate of the data acquisition step. For example, if 20% of the Fourier coefficients are collected, the acceleration rate of the acquisition becomes 5×.

For different acceleration rates (3.33×, 5×, and 10×) and noise standard deviations (0.01, 0.05, 0.1, and 0.2), we constructed 12 different training dataset-test dataset pairs. To obtain the target images of the training and test datasets, we extracted N = 7300 256×256 MRI images from the MRI data of 558 patients in the IXI Dataset (https://brain-development.org/ixi-dataset/) and M = 268 256×256 MRI images from the MRI data of 20 patients in the IXI Dataset, respectively. The target images were normalized such that the pixel intensity values lie between 0 and 1. For each acceleration rate-noise standard deviation pair, i.e., for each setup configuration, the measurements of the training and test datasets were generated using the linear model given in Eqn. (1).

For the function that maps measurements to images, f_ω, we use the neural network architecture introduced in the "Proposed Method" section with K = 10 iterations, P = 1 residual block, and C = 8 modules. We used 32 filters in the first and second modules, 64 filters in the third and fourth modules, 128 filters in the fifth, sixth, and seventh modules, and 2 filters in the final module. The kernel size of the filters was set to 3, and the stride and the padding size were set to 1. Under the assumption that the true conditional distribution is invariant between the training and test datasets, here we concentrate on quantifying the uncertainty on the parameters of the regression model. Thus, we set σ_φ to be a constant function with a fixed model parameter σ_model = 0.001, i.e., σ_φ = σ_model and φ = ∅. In the case where the target images contain noise, or the true conditional distribution is not invariant across the training and test datasets, representing the aleatoric uncertainty becomes an important task for characterizing the overall uncertainty in the prediction. We leave the investigation of these cases for future work.
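For concreteness, the following sketch simulates the single-coil acquisition described above: a binary k-space mask retains a fraction of the Fourier coefficients, and measurements follow Eqn. (1) with circularly-symmetric complex Gaussian noise. The uniformly random mask is our illustrative assumption; the paper does not specify the sampling pattern.

```python
import numpy as np

def simulate_measurement(image, mask, noise_std, rng):
    """m = A s + n with A a subsampled 2-D Fourier transform (single coil)."""
    k_space = np.fft.fft2(image, norm="ortho")
    m = k_space[mask]                        # keep only the sampled coefficients
    # Circularly-symmetric complex Gaussian noise with variance noise_std**2.
    n = (rng.standard_normal(m.shape)
         + 1j * rng.standard_normal(m.shape)) * (noise_std / np.sqrt(2.0))
    return m + n

rng = np.random.default_rng(0)
image = rng.random((256, 256))               # stand-in for a normalized MRI slice
mask = rng.random((256, 256)) < 0.2          # 20% of coefficients -> 5x acceleration
m = simulate_measurement(image, mask, noise_std=0.05, rng=rng)
```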

[Figure 3 panels: Target Image, Reconstruction, Absolute Error, and 3×Predictive STD.]

Figure 4. The percentage of the number of pixels whose target intensity values lie within three predictive standard deviations from the predictive mean for each reconstructed complex image in the test dataset, for each setup configuration (acceleration rate/noise std, from 3.33/0.01 to 10.0/0.2). The mean percentage indicated in the plot is 91.9%.

Figure 3. The target image, the reconstructed image, the reconstruction error, and the uncertainty map for a single test measurement.

Uncertainty Map and Reconstruction Error
To investigate the relationship between the uncertainty map obtained by the proposed method and the reconstruction error, we tested the proposed model on test measurements and computed the absolute reconstruction errors and the uncertainty maps. Figure 3 illustrates the reconstruction error and the uncertainty map for a single test measurement.

As can be seen in Figure 3, the similarity between the uncertainty map and the absolute reconstruction error is remarkable. The uncertainty provided by the proposed method is high in the regions where the reconstruction error occurs, such as the regions around the edges and small details. On the other hand, it is relatively low in the regions where the reconstruction error is negligibly small, such as piecewise constant and smooth regions. We observed the same behavior for different setup configurations as well. Thus, we deduce that we can leverage the uncertainty information provided by the proposed method to locate the erroneous regions in a reconstructed image without the need for a reference image.

Reliability of the Uncertainty Information
To evaluate the reliability of the uncertainty information obtained by the proposed method, we trained the proposed model for each setup configuration using the corresponding training dataset. We performed T = 100 stochastic forward passes on each measurement in the test datasets to obtain the reconstructed complex image and the corresponding uncertainty map. Then, we counted the number of pixels whose target intensity values lie within three predictive standard deviations from the predictive mean (the reconstruction). Figure 4 shows the percentage of the number of pixels satisfying this condition for each reconstructed complex image in the test dataset, for each setup configuration.

The results indicate that a significant fraction of the target pixel intensity values lie within the region determined by the predictive mean and three predictive standard deviations from the predictive mean. One can argue that these percentages can be made arbitrarily high if the predictive variance is high across the entire image; however, this is not the case because, in the previous subsection, we experimentally showed that the predictive variance is high in the erroneous regions and low in the regions where the reconstruction error is negligible or not present. Thus, we can claim that the uncertainty information obtained by the proposed method is reliable in the sense that it correctly localizes the erroneous and non-erroneous regions, and the target pixel intensity values lie within the region determined by the uncertainty information.
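The reliability criterion above amounts to a per-image coverage computation. A short sketch, assuming the complex predictive mean and variance returned by the inference procedure, and reading "within three predictive standard deviations" as a bound on the magnitude of the complex deviation:

```python
import numpy as np

def three_sigma_coverage(target_c, mean_c, var_c):
    """Percentage of pixels whose target values lie within three predictive
    standard deviations of the predictive mean (our reading of the criterion)."""
    inside = np.abs(target_c - mean_c) <= 3.0 * np.sqrt(var_c)
    return 100.0 * inside.mean()
```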
Uncertainty Map and Training Dataset Size
To assess the effect of the number of training examples in the dataset on the uncertainty, we trained four different versions of the proposed model with N = 10, N = 50, N = 100, and N = 7000 training examples. We performed T = 100 stochastic forward passes on each test example to obtain the reconstructed images and the corresponding uncertainty maps. Figure 5 shows the reconstructed images and the corresponding uncertainty maps obtained by the aforementioned models.

For the model that was trained on N = 10 examples, the model uncertainty obtained by the proposed method is severely high, and it suggests that the number of training examples must be increased to obtain a more confident reconstruction. For the second and third models, trained with N = 50 and N = 100 examples, the overall uncertainty in the prediction continued to decrease. For the models trained with N = 100 and N = 7000 examples, the overall decrease in the uncertainty levels was not as drastic as in the first two cases, but the uncertainty levels kept decreasing in some of the smooth regions. Hence, the model uncertainty maps in Figure 5 suggest that the model would become more confident as we increase the size of the training set.

One can confirm this by investigating the differences between the reconstructions obtained by the four models in Figure 5. From left to right, the values of the SSIM index are 0.899, 0.922, 0.929, and 0.932. Considering the contrast, the artifacts, and the details, the best and most confident reconstruction was obtained with the model trained with N = 7000 examples. Therefore, our results show that the proposed model is successful at capturing the uncertainty information depending on the size of the dataset.

Conclusion
We proposed a model-based Bayesian deep learning architecture for computational imaging together with its specific training and inference procedures. The proposed architecture complies with the model-based reconstruction approach, and approximate Bayesian inference is performed using the MC Dropout technique. The proposed method is simple to implement in different deep learning frameworks and has a fast inference procedure. We tested the proposed method in a magnetic resonance imaging simulation under a variety of setup configurations. We experimentally showed that the uncertainty information provided by the proposed method is reliable, and that we can leverage it to detect the erroneous regions in the reconstructed image without the need for a reference image.

Figure 5. Effect of the number of training examples N in the training dataset on the uncertainty information. The first row shows the reconstructed images. The second row shows the corresponding predictive variance obtained by the proposed method for each reconstruction, in log scale. The results in the first, second, third, and fourth columns were obtained by using N = 10, N = 50, N = 100, and N = 7000 training examples, respectively.

References
[1] Jeffrey A. Fessler, Optimization Methods for Magnetic Resonance Image Reconstruction: Key Models and Optimization Algorithms, IEEE Signal Processing Magazine, vol. 37, no. 1, pp. 33–40, 2020.
[2] Alireza Entezari, Masih Nilchian, and Michael Unser, A Box Spline Calculus for the Discretization of Computed Tomography Reconstruction Problems, IEEE Transactions on Medical Imaging, vol. 31, no. 8, pp. 1532–1541, 2012.
[3] George Barbastathis, Aydogan Ozcan, and Guohai Situ, On the use of deep learning for computational imaging, Optica, vol. 6, no. 8, pp. 921–943, 2019.
[4] Kyong Hwan Jin, Michael T. McCann, Emmanuel Froustey, and Michael Unser, Deep Convolutional Neural Network for Inverse Problems in Imaging, IEEE Transactions on Image Processing, vol. 26, no. 9, pp. 4509–4522, 2017.
[5] Morteza Mardani, Qingyun Sun, Shreyas Vasanawala, Vardan Papyan, Hatef Monajemi, John Pauly, and David Donoho, Neural Proximal Gradient Descent for Compressive Imaging, Proceedings of the 32nd International Conference on Neural Information Processing Systems, pp. 9596–9606, 2018.
[6] Vegard Antun, Francesco Renna, Clarice Poon, Ben Adcock, and Anders C. Hansen, On instabilities of deep learning in image reconstruction and the potential costs of AI, Proceedings of the National Academy of Sciences, 2020.
[7] Radford M. Neal, Bayesian learning for neural networks, Ph.D. thesis, University of Toronto, 1995.
[8] Fredrik K. Gustafsson, Martin Danelljan, and Thomas B. Schon, Evaluating scalable Bayesian deep learning methods for robust computer vision, Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, pp. 318–319, 2020.
[9] Yarin Gal and Zoubin Ghahramani, Dropout as a Bayesian approximation: Representing model uncertainty in deep learning, International Conference on Machine Learning, pp. 1050–1059, 2016.
[10] Yujia Xue, Shiyi Cheng, Yunzhe Li, and Lei Tian, Reliable deep-learning-based phase imaging with uncertainty quantification, Optica, vol. 6, pp. 618–629, 2019.
[11] Olaf Ronneberger, Philipp Fischer, and Thomas Brox, U-Net: Convolutional networks for biomedical image segmentation, International Conference on Medical Image Computing and Computer-Assisted Intervention, pp. 234–241, 2015.
[12] Neal Parikh and Stephen Boyd, Proximal algorithms, Foundations and Trends in Optimization, vol. 1, no. 3, pp. 127–239, 2014.
[13] Stephen Boyd, Neal Parikh, Eric Chu, Borja Peleato, and Jonathan Eckstein, Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers, Foundations and Trends in Machine Learning, vol. 3, no. 1, pp. 1–122, 2010.
[14] Alex Kendall and Yarin Gal, What uncertainties do we need in Bayesian deep learning for computer vision?, Advances in Neural Information Processing Systems, pp. 5574–5584, 2017.
[15] Yarin Gal and Zoubin Ghahramani, Bayesian Convolutional Neural Networks with Bernoulli Approximate Variational Inference, 4th International Conference on Learning Representations (ICLR) workshop track, 2016.

Author Biography
Canberk Ekmekci is currently pursuing the Ph.D. degree at the University of Rochester, Rochester, NY, under the supervision of Prof. Mujdat Cetin. His research interests include machine learning, convex optimization, and linear inverse problems arising in computational imaging.

Mujdat Cetin is a Professor of Electrical and Computer Engineering and the Director of the Goergen Institute for Data Science at the University of Rochester. His research interests include computational imaging, bioimage analysis, and brain-computer/machine interfaces. Prof. Cetin is a Fellow of the IEEE and has served as a member of the IEEE Signal Processing Society Technical Directions Board and as the Chair of the IEEE Computational Imaging Technical Committee. He is currently a Senior Area Editor for the IEEE Transactions on Computational Imaging and the IEEE Transactions on Image Processing. Prof. Cetin has received several awards, including the IEEE Signal Processing Society Best Paper Award; the EURASIP/Elsevier Signal Processing Best Paper Award; the IET Radar, Sonar and Navigation Premium Award; and the Turkish Academy of Sciences Distinguished Young Scientist Award. He was the Technical Program Co-chair for the IEEE Image, Video, and Multidimensional Signal Processing (IVMSP) Workshop in 2016; for the International Conference on Information Fusion in 2016 and 2013; for the International Conference on Pattern Recognition (ICPR) in 2010; and for the IEEE Conference on Signal Processing, Communications, and their Applications in 2006.
