Image Restoration Using Gaussian Scale Mixtures in the Wavelet Domain
Total Page:16
File Type:pdf, Size:1020Kb
IMAGE RESTORATION USING GAUSSIAN SCALE MIXTURES IN THE WAVELET DOMAIN Javier Portilla ∗ Eero Simoncelli y Visual Information Processing Group Center for Neural Science, and Dept. of Comp. Science and Artif. Intell. Courant Inst. of Mathematical Sciences Universidad de Granada New York University, NY 10003 [email protected] [email protected] We describe a statistical model for images decomposed 2. IMAGE MODELING in an overcomplete wavelet pyramid. Each neighborhood of pyramid coefficients is modeled as the product of a Gaus- Modeling the statistics of natural images is a difficult task, sian vector of known covariance, and an independent hid- because of the high dimensionality of the signal and the den positive scalar random variable. We propose an effi- higher-order statistical dependencies that are prevalent. Sim- cient Bayesian estimator for the pyramid coefficients of an plifying assumptions, such as homogeneity and locality are image degraded by linear distortion (e.g., blur) and additive usually applied. In the past decade it has become standard Gaussian noise. We demonstrate the quality of our results to begin by decomposing the image with a set of multi-scale in simulations over a wide range of blur and noise levels. band-pass oriented filters. This kind of representation is adapted to the approximate scale invariance of natural im- 1. INTRODUCTION ages, and it has been shown to decouple some high-order statistics. Another technique for simplifying the descrip- Natural images have distinct features that allow the human tion of high-order statistics is to use a low-order local model visual system to detect the presence of distortion, and to (e.g., Gaussian) whose parameters (e.g., variance) are gov- extract remaining information from the observation. Im- erned by a local hidden random variable. In this work we age restoration aims to construct an approximation sharing have combined these ideas into a Bayesian framework. the relevant features still present in the corrupted image, but with the artifacts suppressed. In order to distinguish the ar- tifacts from the signal, a good image model is essential. In 2.1. Image representation this work we use a Bayesian framework for image restora- tion, basing on a prior statistical model for natural images. As a preprocessing stage for image modeling, we use a fixed We assume Gaussian noise of known covariance has been multi-scale multi-orientation linear decomposition. It is now added after having convolved the image with a known blur- generally agreed that the use of overcomplete representa- ring kernel. Such model for the distortion provides a reason- tions is advantageous for restoration, in order to avoid alias- able approximation to real-world corruption sources, such ing artifacts that plague critically-sampled representations as de-focus or photon and electron noise in cameras. In such as orthogonal wavelets [9]. A widely followed ap- previous works, we have described a model for wavelet co- proach is to use orthogonal or biorthogonal basis functions, efficients using scale mixtures of Gaussians [1, 2]. Related but without decimating the subbands [e.g., 10]. However, models have been developed by several groups [3, 4, 5, 6]. after the critical sampling constraint has been removed, sig- We have applied this model to estimate images in the pres- nificant improvement comes from using representations with ence of independent additive Gaussian noise of known co- higher redundancy and orientation selectivity [9, 11, 12]. variance, following a suboptimal empirical Bayes strat- For the current paper, we have used a particular variant of an egy [2, 7]. Recently, we developed a direct Bayesian Least overcomplete tight frame representation known as a “steer- Square denoising procedure [8]. In this paper, we extend able pyramid” [9] (see [8] for details). In this paper we have this approach to include the full restoration problem. used 5 scales and 4 orientations for the decomposition, for a total number of 25 subbands. This number includes 4 5 ∗JP is supported by a ”Ramon´ y Cajal” grant (Science and Technology × Spanish Ministry) band-pass subbands plus 4 oriented high-pass subbands and yEPS is supported by NSF CAREER grant MIP-9796040. a non-oriented low-pass residual. 2.2. Gaussian scale mixtures in the wavelet domain prior, which for a Gaussian variable with a scale parameter pz yields [16]: For each coefficient in the pyramid representation, we con- 1 sider a neighborhood of coefficients, referring to the center pz(z) : (3) / z coefficient as the reference coefficient of the neighborhood. To avoid computational problems at the estimation stage, The neighborhood may include coefficients from other sub- we have set the prior to zero in the interval , where bands (i.e., corresponding to basis functions at nearby scales [0; zmin) is a small positive number.1 This solution is simple and and orientations), as well as from the same subband. For zmin efficient to implement. Surprisingly, it also performs better this work we have used a 3 3 neighborhood around the than maximum likelihood fitted non-parametric priors for reference coefficient, plus the×parent coefficient (same ori- the neighborhoods of each subband.2 entation and position, next coarser scale), whenever it ex- ists. Due to sampling differences at different scales, the coarser subband must be resampled at double rate in both 3. RESTORATION dimensions for obtaining the parent of each coefficient of the subband. We model the observed image as: We use a Gaussian scale mixture (GSM) to model the y(n; m) = x(n; m) h(n; m) + w(n; m); (4) coefficients within each local neighborhood with reference ∗ coefficients belonging to every given subband. A random where ”*” denotes convolution, x is the original image, h vector x is a Gaussian scale mixture [13] if it can be ex- is a known linear kernel and w is Gaussian noise of known pressed as the product of a zero-mean Gaussian vector u power spectral density (PSD) P (u; v). and an independent positive scalar random variable pz: w x = pzu: (1) 3.1. GSM with blur and noise The variable z is the multiplier. Vector x is an infinite mix- We follow the usual procedure for image restoration in the ture of Gaussian vectors, whose density is determined by the wavelet domain: (1) decompose the image y into pyramid subbands; (2) restore each subband, except for the low-pass covariance matrix Cuu of u and the mixing density pz(z): residual; and (3) invert the pyramid transform. Translating (4) into our local GSM model, a vector y corresponding to a px(x) = p(x z) pz(z) dz Z j neighborhood of N observed coefficients around a reference T −1 coefficient for each subband of the pyramid representation exp x (zCuu) x=2 − (2) = N=2 1=2 pz(z) dz; can be expressed as: Z (2 π) zCuu j j where N is the dimensionality of x and u (the size of the y = Hx + w = pzHu + w; (5) neighborhood). Without loss of generality one can assume where u and w are zero-mean Gaussian vectors, and H is an E z = 1, which implies C = C . uu xx N M matrix expressing the convolution of N coefficients f GSMg densities are symmetric and zero-mean, and they with× a kernel h of size N (N >> N, typically).3 The have leptokurtotic marginal densities (i.e., heavier tails than h h density of the observed neighborhood vector conditioned on a Gaussian). Another key property of the GSM model is that z is zero-mean Gaussian, with covariance C = zC 0 0 + the density of x is Gaussian when conditioned on z. Also, yjz u u C : the normalized vector x=pz is Gaussian. They also present ww interesting joint statistics: the variance of a vector element T −1 exp y (zCu0u0 + Cww) y=2 conditioned on a neighbor scales roughly linearly with the p(y z) = − ; (6) N j (2π) zCu0u0 + Cww square of the neighbor’s value. These marginal and joint j j statistics of GSM distributions are qualitatively similar to p where C 0 0 = HC HT is the N N covariance ma- those of neighbor coefficients responding to natural images u u uu trix of u0 = Hu. As the noise PSD ×P (u; v) is assumed in multi-scale and multi-orientation representation [14, 1]. w known, Cww is easily estimated by computing, for each subband, the sample covariance for the pyramid coefficients 2.3. Prior density for multiplier 9 1We have chosen zmin = 10− in our implementation. Results remain 17 5 Our model requires a specification of the density of the mul- almost unaffected with zmin 2 [10− ; 10− ]. tiplier, z. An alternative to estimate adaptive priors for the 2Note that, when estimating parameters under an overcomplete repre- hidden multiplier (as in [e.g., 5, 7]) is to use a noninfor- sentation, the set of least squares optimal solutions for the subbands is not least square optimal for the whole image. mative prior [15], which does not require the fitting of any 3The kernel h must be cropped in the frequency domain according to parameters to the observation. We have applied Jeffrey’s the spatial resolution of the pyramid subband. −T T −1 of the inverse Fourier transform of Pw(u; v) (a delta for where M = Cxx0 S Q, and v = Q S y. Finally, we the case of white noise). Given Cwwpand taking the expec- restrict the estimate to the reference coefficient: C C 0 0 tation of yjz over z, the filtered signal covariance u u N can be computed from the observation covariance matrix zmcnvn E xc y; z = ; (11) Cyy: Cyy = E z Cu0u0 + Cww.