
arXiv:0810.4807v6 [stat.ME] 7 Jul 2009

A SURE Approach for Digital Signal/Image Deconvolution Problems

Jean-Christophe Pesquet, Amel Benazza-Benyahia, and Caroline Chaux

Abstract—In this paper, we are interested in the classical problem of restoring data degraded by a convolution and the addition of a white Gaussian noise. The originality of the proposed approach is two-fold. Firstly, we formulate the restoration problem as a nonlinear estimation problem leading to the minimization of a criterion derived from Stein's unbiased quadratic risk estimate. Secondly, the deconvolution procedure is performed using any analysis and synthesis frames that can be overcomplete or not. New theoretical results concerning the calculation of the variance of the Stein's risk estimate are also provided in this work. Simulations carried out on natural images show the good performance of our method w.r.t. conventional wavelet-based restoration methods.

J.-C. Pesquet and C. Chaux are with the Universite Paris-Est, Laboratoire d'Informatique Gaspard Monge - CNRS, 77454 Marne la Vallee Cedex 2, France. A. Benazza-Benyahia is with URISA, SUP'COM, Cite Technologique des Communications, 2083 Ariana, Tunisia. This work was supported by the Agence Nationale de la Recherche under grant ANR-05-MMSA-0014-01. Part of this work was presented at the 3rd IEEE International Symposium on Communications, Control and Signal Processing [1].

I. INTRODUCTION

It is well-known that, in many practical situations, one may consider that there are two main sources of signal/image degradation: a convolution related to the bandlimited nature of the acquisition system, and a contamination by an additive Gaussian noise which may be due to the recording electronics and to the transmission processes. For instance, the limited aperture of satellite cameras, the inherent aberrations of optical systems and mechanical vibrations in remote sensing create a blur effect [2]. A restoration task is usually required to reduce these artifacts before any further data processing. Many works have been dedicated to the deconvolution of noisy signals [3], [4], [5], [6]. Designing suitable deconvolution methods is a challenging task, as inverse problems of practical interest are often ill-posed. Indeed, the convolution operator is usually non-invertible or ill-conditioned, and thus very sensitive to noise.

To cope with these ill-posed problems, deconvolution methods often operate in a transform domain, the problem being expected to be easier to solve in the transform domain. In pioneering works, the deconvolution was dealt with in the frequency domain, as the Fourier transform provides a simple representation of filtering operations [7]. However, the Fourier representation has a main shortcoming: it does not provide a sparse representation of local features (edges for images) and sharp transitions in a signal. This has motivated the use of the Wavelet Transform (WT) [8], [9] and its various extensions. Thanks to the good energy compaction and decorrelation properties of the WT, simple shrinkage operations applied to the noisy wavelet coefficients can successfully discard the noise. To take advantage of both the wavelet and frequency domains [10], it has been suggested to combine wavelet-based denoising methods with frequency-based deconvolution approaches, giving birth to a new class of restoration methods. The wavelet-vaguelette method proposed in [8] is based on an inverse filtering of the degraded data. To avoid the amplification of the resulting colored noise component, a shrinkage of the filtered wavelet (vaguelette) coefficients is performed. The wavelet basis has been refined in [11] by adapting the basis to the frequency response of the degradation filter. However, the method is not appropriate for recovering signals degraded by arbitrary convolutive operators. An alternative to the wavelet-vaguelette decomposition was presented by Abramovich and Silverman [9]. Similar in spirit to the wavelet-vaguelette deconvolution, a more competitive hybrid approach called Fourier-Wavelet Regularized Deconvolution (ForWaRD) was developed by Neelamani et al. [12]: a two-stage shrinkage procedure successively operates in the Fourier and the WT domains, which is applicable to any invertible or non-invertible degradation kernel. The optimal balance between the amount of Fourier and wavelet regularization is derived by optimizing an approximate version of the mean-squared error metric. A two-step procedure was also presented by Banham and Katsaggelos [13], which employs a multiscale Kalman filter. By following Meyer's approach to estimate degraded band-limited signals, a wavelet restoration method called WaveD [14], [15] has been derived through elegant minimax arguments. In [16], we have proposed an extension of the WaveD method to the multichannel case.

Wavelet-based variational approaches for image restoration have also been investigated by several authors. For instance, a deconvolution method was derived under the expectation-maximization framework in [17]. In [18], the complementarity of the wavelet and curvelet transforms has been exploited in a scheme involving a regularization function including the total variation or a wavelet coefficient regularization. In [19], a related mixed regularization has been considered and a projection-based objective was derived to compute a solution. More recently, the work in [20] has been extended by proposing a flexible convex variational framework for solving inverse problems in which a priori information (e.g., a probability distribution) is available about the sparse representation of the target solution in a frame [21]. Iterative wavelet-based shrinkage/thresholding methods have also been developed: a new class of such algorithms was proposed in [22], whose novelty relies on the fact that the update equation depends on the two previous iterated values. In [23], a fast variational deconvolution algorithm was introduced. It consists of minimizing a quadratic data term subject to a regularization on the l1-norm of the coefficients of the solution in a Shannon wavelet basis. Recently, in [24], a two-step decoupling scheme was presented for image deblurring. It starts from a global linear blur compensation by a generalized Wiener filter. Then, a nonlinear denoising is carried out by computing the Bayes least squares Gaussian scale mixtures estimate. Note also that an advanced restoration method was developed in [25], which does not operate in the wavelet domain.

At the same time, much attention was paid to Stein's principle [26] in order to derive estimates of the Mean-Square Error (MSE) in statistical problems involving an additive Gaussian noise. The key advantage of Stein's Unbiased Risk Estimate (SURE) is that it does not require a priori knowledge about the statistics of the unknown data, while yielding an expression of the MSE only depending on the statistics of the observed data. Hence, it avoids the difficult problem of the estimation of the hyperparameters of some prior distribution, which classically needs to be addressed in Bayesian approaches.(1) Consequently, a SURE approach can be applied by directly parameterizing the estimator and finding the optimal parameters that minimize the MSE estimate. The first efforts in this direction were performed in the context of denoising applications with the SUREShrink technique [10], [27] and the SUREVect estimate [28] in the case of multichannel images. More recently, in addition to the estimation of the MSE, Luisier et al. have proposed a very appealing structure of the denoising function consisting of a linear combination of nonlinear elementary functions (the SURE-Linear Expansion of Threshold or SURE-LET) [29]. Notice that this idea was also present in some earlier works [30]. In this way, the optimization of the MSE estimate reduces to solving a set of linear equations. Several variations of the basic SURE-LET method were investigated: an improvement of the denoising performance has been achieved by accounting for the interscale information [31] and the case of color images has also been addressed [32]. Another advantage of this method is that it remains valid when redundant multiscale representations of the observations are considered, as the minimization of the SURE-LET estimator can be easily carried out in the time/space domain. A similar approach has also been adopted by Raphan and Simoncelli [33] for denoising in redundant multiresolution representations. Overcomplete representations have also been successfully used for multivariate shrinkage estimators optimized with a SURE approach operating in the transform domain [34]. An alternative use of Stein's principle was made in [35] for building convex constraints in image denoising problems.

In [36], Eldar generalized Stein's principle to derive an MSE estimate when the noise has an exponential distribution (see also [37]). In addition, she investigated the problem of the nonlinear estimation of deterministic parameters from a linear observation model in the presence of additive noise. In the context of deconvolution, the derived SURE was employed to evaluate the MSE performance of solutions to regularized objective functions. Another work in this research direction is [38], where the risk estimate is minimized by a Monte Carlo technique for denoising applications. A very recent work [39] also proposes a recursive estimation of the risk when a thresholded Landweber algorithm is employed to restore data.

In this paper, we adopt a viewpoint similar to that in [36], [39] in the sense that, by using Stein's principle, we obtain an estimate of the MSE for a given class of estimators operating in deconvolution problems. The main contribution of our work is the derivation of the variance of the proposed quadratic risk estimate. These results allow us to propose a novel SURE-LET approach for data restoration which can exploit any discrete frame representation.

The paper is organized as follows. In Section II-A, the required background is presented and some notations are introduced. The generic form of the estimator we consider for restoration purposes is presented in Section II-B. In Section III, we provide extensions of Stein's identity which will be useful throughout the paper. In Section IV-A, we show how Stein's principle can be employed in a restoration framework when the degradation system is invertible. The case of a non-invertible system is addressed in Section IV-B. The expression of the variance of the empirical estimate of the quadratic risk is then derived in Section V. In Section VI, two scenarios are discussed where the determination of the parameters minimizing the risk estimate takes a simplified form. The structure of the proposed SURE-LET deconvolution method is subsequently described in Section VII and examples of its application to wavelet-based image restoration are shown in Section VIII. Some concluding remarks are given in Section IX. The notations used in the paper are summarized in Table I.

II. PROBLEM STATEMENT

A. Background

We consider an unknown real-valued field whose value at location x in D is s(x), where D = {0,...,D1-1} x ... x {0,...,Dd-1} with (D1,...,Dd) in (N*)^d, N* denoting the set of positive integers. Here, s is a d-dimensional digital random field of finite size D = D1...Dd with finite variance. Of practical interest are the cases when d = 1 (temporal signals), d = 2 (images), d = 3 (volumetric data or video sequences) and d = 4 (3D+t data).

The field is degraded by the acquisition system with (deterministic) impulse response h, and it is also corrupted by an additive noise n, which is assumed to be independent of the random process s. The noise n corresponds to a random field which is assumed to be Gaussian with zero-mean and covariance field: for all (x, y) in D^2, E[n(x)n(y)] = gamma delta_{x-y}, where (delta_x)_{x in Z^d} is the Kronecker sequence and gamma > 0. In other words, the noise is white.

(1) This does not mean that SURE approaches are superior to Bayesian approaches, which are quite versatile.

TABLE I NOTATIONS.

Variable : Definition
D : set of spatial (or frequency) indices
s : original signal
h : impulse response of the degradation filter
n : additive white Gaussian noise
r : observed signal
S : Fourier transform of s
H : Fourier transform of h
R : Fourier transform of r
E(.) : mean square value of the signal in argument
E[.] : mathematical expectation
(phi_l)_{1<=l<=L} : family of analysis vectors
(phitilde_l)_{1<=l<=L} : family of synthesis vectors
(s_l)_{1<=l<=L} : coefficients of the decomposition of s onto (phi_l)_{1<=l<=L}
Phi_l : Fourier transform of phi_l
Phitilde_l : Fourier transform of phitilde_l
Theta_l : estimating function applied to r_l
shat : estimate of s
P : set of frequency indices for which H is considered equal to 0
Q : set of frequency indices for which H takes significant values
chi : threshold value used in the definition of P
stilde : projection of s onto the subspace whose Fourier coefficients vanish on P
rtilde : inverse Fourier transform of the projection of R/H onto the subspace whose Fourier coefficients vanish on P
K_m : index subset for the m-th subband
lambda : constant used in the Wiener-like filter

Thus, the observation model can be expressed as follows:

for all x in D,  r(x) = (htilde * s)(x) + n(x) = sum_{y in D} htilde(x - y) s(y) + n(x)   (1)

where (htilde(x))_{x in Z^d} is the periodic extension of (h(x))_{x in D}. It must be pointed out that (1) corresponds to a periodic approximation of the discrete convolution (this problem can be alleviated by making use of zero-padding techniques [40], [41]).

A restoration method aims at estimating s based on the observed data r. In this paper, a supervised approach is adopted by assuming that both the degradation kernel h and the noise variance gamma are known.

B. Considered nonlinear estimator

The proposed estimation procedure consists of first transforming the observed data to some other domain (through some analysis vectors), performing a nonlinear operation on the so-obtained coefficients (based on an estimating function) with parameters that must be estimated, and finally reconstructing the estimated signal (through some synthesis vectors). More precisely, the discrete Fourier coefficients (R(p))_{p in D} of r are given by:

for all p in D,  R(p) = sum_{x in D} r(x) exp(-2 pi i x^T D^{-1} p)   (2)

where D = Diag(D1,...,Dd). In the frequency domain, (1) becomes:

R(p) = U(p) + N(p),  where U(p) = H(p) S(p)   (3)

and the coefficients S(p) and N(p) are obtained by expressions similar to (2).

Let (phi_l)_{1<=l<=L} be a family of L in N* analysis vectors of R^{D1 x ... x Dd}. Thus, every signal r of R^{D1 x ... x Dd} can be decomposed as:

for all l in {1,...,L},  r_l = <r, phi_l> = sum_{x in D} r(x) phi_l(x),   (4)

the operator <.,.> designating the Euclidean inner product of R^{D1 x ... x Dd}. According to Plancherel's formula, the coefficients of the decomposition of r onto this family are given by

for all l in {1,...,L},  r_l = (1/D) sum_{p in D} R(p) (Phi_l(p))*,   (5)

where Phi_l(p) is a discrete Fourier coefficient of phi_l and (.)* denotes the complex conjugation. Let us now define, for every l in {1,...,L}, an estimating function Theta_l : R -> R (the choice of this function will be discussed in Section VII-B), so that

shat_l = Theta_l(r_l).   (6)

We will use as an estimate of s(x):

shat(x) = sum_{l=1}^{L} shat_l phitilde_l(x)   (7)

where (phitilde_l)_{1<=l<=L} is a family of synthesis vectors of R^{D1 x ... x Dd}. Equivalently, the estimate of S is given by:

Shat(p) = sum_{l=1}^{L} shat_l Phitilde_l(p)   (8)
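The equivalence between the spatial inner product (4) and its Plancherel form (5) can be checked numerically. The signal and the single analysis vector below are arbitrary placeholders, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(1)
D1, D2 = 16, 16
D = D1 * D2
r = rng.standard_normal((D1, D2))
phi = rng.standard_normal((D1, D2))   # one hypothetical analysis vector

# Spatial inner product <r, phi> of (4)
r_ell_space = np.sum(r * phi)

# Plancherel form (5): (1/D) * sum_p R(p) * conj(Phi(p))
R = np.fft.fft2(r)
Phi = np.fft.fft2(phi)
r_ell_freq = np.real(np.sum(R * np.conj(Phi)) / D)
```

Both quantities agree up to floating-point precision, reflecting the 1/D normalization of the unnormalized DFT convention used in (2).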

where Phitilde_l(p) is a discrete Fourier coefficient of phitilde_l. It must be pointed out that our formulation is quite general: different analysis/synthesis families can be used, and these families may be overcomplete (which implies that L > D) or not.

III. STEIN-LIKE IDENTITIES

Stein's principle will play a central role in the evaluation of the mean square estimation error of the proposed estimator. We first recall the standard form of Stein's principle:

Proposition 1 ([26]). Let Theta : R -> R be a continuous, almost everywhere differentiable function. Let eta be a real-valued zero-mean Gaussian random variable with variance sigma^2 and let upsilon be a real-valued random variable which is independent of eta. Let rho = upsilon + eta and assume that
- for all tau in R, lim_{|zeta| -> infinity} Theta(tau + zeta) exp(-zeta^2/(2 sigma^2)) = 0,
- E[(Theta(rho))^2] < infinity and E[|Theta'(rho)|] < infinity, where Theta' is the derivative of Theta.
Then,

E[Theta(rho) eta] = sigma^2 E[Theta'(rho)].   (9)

We now derive extended forms of the above formula (see Appendix A) which will be useful in the remainder of this paper:

Proposition 2. Let Theta_i : R -> R with i in {1,2} be continuous, almost everywhere differentiable functions. Let (eta_1, eta_2, etatilde_1, etatilde_2) be a real-valued zero-mean Gaussian vector and let (upsilon_1, upsilon_2) be a real-valued random vector which is independent of (eta_1, eta_2, etatilde_1, etatilde_2). Let rho_i = upsilon_i + eta_i, i in {1,2}, and assume that
(i) for all alpha in R*, for all tau in R, lim_{|zeta| -> infinity} Theta_i(tau + zeta) zeta^2 exp(-zeta^2/(2 alpha^2)) = 0,
(ii) E[|Theta_i(rho_i)|^3] < infinity,
(iii) E[|Theta_i'(rho_i)|^3] < infinity, where Theta_i' is the derivative of Theta_i.
Then,

E[Theta_1(rho_1) etatilde_1] = E[Theta_1'(rho_1)] E[eta_1 etatilde_1]   (10)

E[Theta_1(rho_1) etatilde_1 etatilde_2] = E[Theta_1'(rho_1) etatilde_2] E[eta_1 etatilde_1] + E[Theta_1(rho_1)] E[etatilde_1 etatilde_2]   (11)

E[Theta_1(rho_1)^2 etatilde_1 etatilde_2] = 2 E[Theta_1(rho_1) Theta_1'(rho_1) etatilde_2] E[eta_1 etatilde_1] + E[Theta_1(rho_1)^2] E[etatilde_1 etatilde_2]   (12)

E[Theta_1(rho_1) Theta_2(rho_2) etatilde_1 etatilde_2] = E[Theta_1(rho_1) Theta_2(rho_2)] E[etatilde_1 etatilde_2] + E[Theta_1'(rho_1) Theta_2(rho_2) etatilde_2] E[eta_1 etatilde_1] + E[Theta_1(rho_1) Theta_2'(rho_2) etatilde_1] E[eta_2 etatilde_2] + E[Theta_1'(rho_1) Theta_2'(rho_2)] (E[eta_1 etatilde_2] E[eta_2 etatilde_1] - E[eta_1 etatilde_1] E[eta_2 etatilde_2]).   (13)

Note that Proposition 2 obviously remains applicable when (upsilon_1, upsilon_2) is deterministic.

IV. USE OF STEIN'S PRINCIPLE

A. Case of invertible degradation systems

In this section, we come back to the deconvolution problem and develop an unbiased estimate of the quadratic risk:

E(s - shat) = (1/D) sum_{x in D} (s(x) - shat(x))^2   (14)

which will be useful to optimize a parametric form of the estimator from the observed data. For this purpose, the following assumption is made:

Assumption 1.
(i) The degradation filter is such that, for every p in D, H(p) != 0.
(ii) For every l in {1,...,L}, Theta_l is a continuous, almost everywhere differentiable function such that
 (a) for all alpha in R*, for all tau in R, lim_{|zeta| -> infinity} Theta_l(tau + zeta) zeta^2 exp(-zeta^2/(2 alpha^2)) = 0,
 (b) E[|Theta_l(r_l)|^3] < infinity and E[|Theta_l'(r_l)|^3] < infinity, where Theta_l' is the derivative of Theta_l.

Under this assumption, the degradation model can be re-expressed as s(x) = rtilde(x) - ntilde(x), where rtilde and ntilde are the fields whose discrete Fourier coefficients are

Rtilde(p) = R(p)/H(p),  Ntilde(p) = N(p)/H(p).   (15)

Thus, since the noise has been assumed spatially white, it is easy to show that

for all (p, p') in D^2,  E[Ntilde(p) (N(p'))*] = (gamma D / H(p)) delta_{p-p'}   (16)

E[Ntilde(p) (Ntilde(p'))*] = (gamma D / |H(p)|^2) delta_{p-p'}   (17)

and E[Ntilde(p) (S(p'))*] = 0. The latter relation shows that ntilde and s are uncorrelated fields. We are now able to state the following result (see Appendix B):

Proposition 3. The mean square error on each frequency component is such that, for every p in D,

E[|S(p) - Shat(p)|^2] = E[|Shat(p) - Rtilde(p)|^2] - gamma D / |H(p)|^2 + 2 gamma sum_{l=1}^{L} E[Theta_l'(r_l)] Re{(Phi_l(p))* Phitilde_l(p) / H(p)}   (18)

and the global mean square estimation error can be expressed as

E[E(s - shat)] = E[E(shat - rtilde)] + Delta   (19)

Delta = (gamma/D) (2 sum_{l=1}^{L} E[Theta_l'(r_l)] gamma_l - sum_{p in D} |H(p)|^{-2})   (20)

where (gamma_l)_{1<=l<=L} is the real-valued cross-correlation sequence defined by: for all l in {1,...,L},

gamma_l = (1/D) sum_{p in D} (Phi_l(p))* Phitilde_l(p) / H(p).   (21)

B. Case of non-invertible degradation systems

Assumption 1(i) expresses the fact that the degradation filter is invertible. Let us now examine how this assumption can be relaxed.

We denote by P the set of indices for which the frequency response H vanishes:

P = {p in D | H(p) = 0}.   (22)

It is then clear that the components S(p) with p in P are unobservable. The observable part of the signal s thus corresponds to the projection stilde = Pi(s) of s onto the subspace of R^{D1 x ... x Dd} of the fields whose discrete Fourier coefficients vanish on P. In the Fourier domain, the projector Pi is therefore defined by

for all p in D,  Stilde(p) = S(p) if p not in P,  Stilde(p) = 0 if p in P.   (23)

In this context, it is judicious to restrict the summation in (5) to Q = D \ P so as to limit the influence of the noise present in the unobservable part of s. This leads to the following modified expression of the coefficients r_l:

r_l = (1/D) sum_{p in Q} Rtilde(p) (Phi_l(p))* = <rtilde, phi_l>   (24)

where rtilde is the field whose discrete Fourier coefficients are equal to R(p)/H(p) on Q and to 0 on P (see (28)). The second step in the estimation procedure (Eq. (6)) is kept unchanged. For the last step, we impose the following structure to the estimator:

shat(x) = Pi(sum_{l=1}^{L} shat_l phitilde_l)(x) = sum_{l=1}^{L} shat_l phibar_l(x)   (25)

where phibar_l = Pi(phitilde_l). We will also replace Assumption 1(i) by the following less restrictive one:

Assumption 2. The set Q is nonempty.

Under this condition and Assumption 1(ii), an extended form of Proposition 3 is the following:

Proposition 4. The mean square error on each frequency component is given, for every p in Q, by (18). The global mean square estimation error can be expressed as

E[E(s - shat)] = E[E(s - stilde)] + E[E(shat - rtilde)] + Delta   (26)

Delta = (gamma/D) (2 sum_{l=1}^{L} E[Theta_l'(r_l)] gamma_l - sum_{p in Q} |H(p)|^{-2})   (27)

where rtilde denotes the field with Fourier coefficients

Rtilde(p) = R(p)/H(p) if p in Q,  Rtilde(p) = 0 otherwise,   (28)

and the real-valued cross-correlation sequence (gamma_l)_{1<=l<=L} becomes:

gamma_l = (1/D) sum_{p in Q} (Phi_l(p))* Phibar_l(p) / H(p).   (29)

Proof: The proof that (18) holds for every p in Q is identical to that in Proposition 3. The global MSE can be decomposed as the sum of the errors on its unobservable and observable parts, respectively. Using the orthogonality property of the projection operator Pi, the corresponding quadratic risk is given by:

E(s - shat) = E(s - stilde) + E(stilde - shat).

It remains now to express the mean square estimation error E[E(stilde - shat)] on the observable part. This is done quite similarly to the end of the proof of Proposition 3.

Remark 1.
(i) Assume that the functions s and h share the same frequency band in the sense that, for all p not in Q, S(p) = 0. Let us also assume that, for every p in Q, H(p) = 1. This typically corresponds to a denoising problem for a signal with frequency band Q. Then, since stilde = s and rtilde = r, (26) becomes

E[E(s - shat)] = E[E(shat - r)] + (gamma/D) (2 sum_{l=1}^{L} E[Theta_l'(r_l)] <phi_l, phitilde_l> - card(Q))   (30)

where card(Q) denotes the cardinality of Q. In the case when d = 2 (images) and Q = D, the resulting expression is identical to the one which has been derived in [29] for denoising problems.
(ii) Proposition 4 remains valid for more general choices of the set P than (22). In particular, (26) and (27) are unchanged if

P = {p in D | |H(p)| <= chi}   (31)

where chi >= 0, provided that the complementary set Q satisfies Assumption 2.
(iii) It is possible to give an alternative proof of (26)-(27) by applying Proposition 1 in [36].

V. EMPIRICAL ESTIMATION OF THE RISK

Under the assumptions of the previous section, we are now interested in the estimation of the "observable" part of the risk in (26), that is E_o = E(stilde - shat), from the observed field r. As shown by Proposition 4, an unbiased estimator of E_o is

Ehat_o = E(shat - rtilde) + Deltahat   (32)

where

Deltahat = (gamma/D) (2 sum_{l=1}^{L} Theta_l'(r_l) gamma_l - sum_{p in Q} |H(p)|^{-2}).   (33)

We will study in more detail the statistical behaviour of this estimator by considering the difference:

E_o - Ehat_o = (2/D) sum_{x in D} (shat(x) - stilde(x)) ntilde(x) - E(ntilde) - Deltahat.   (34)

More precisely, by making use of Proposition 2, the variance of this term can be derived (see Appendix C).

Proposition 5. The variance of the estimate of the observable part of the quadratic risk is given by

Var[Ehat_o - E_o] = (4 gamma / D) E[E(shat_H - rtilde_H)] + (4 gamma^2 / D^2) sum_{l=1}^{L} sum_{i=1}^{L} E[Theta_l'(r_l) Theta_i'(r_i)] gamma_{l,i} gamma_{i,l} - (2 gamma^2 / D^2) sum_{p in Q} |H(p)|^{-4}   (35)

where rtilde_H is the field with discrete Fourier coefficients given by

Rtilde_H(p) = Rtilde(p)/H(p) if p in Q,  Rtilde_H(p) = 0 otherwise,   (36)

shat_H is similarly defined from shat and

for all (l, i) in {1,...,L}^2,  gamma_{l,i} = (1/D) sum_{p in Q} (Phi_l(p))* Phibar_i(p) / H(p).   (37)

Remark 2.
(i) Eq. (35) suggests that caution should be taken in relying on the unbiased risk estimate when |H(p)| takes small values. Indeed, the terms in the expression of the variance involve divisions by H(p) and may therefore become of high magnitude in this case.
(ii) An alternative statement of Proposition 5 is to say that

(4 gamma / D) E(shat_H - rtilde_H) + (4 gamma^2 / D^2) sum_{l=1}^{L} sum_{i=1}^{L} Theta_l'(r_l) Theta_i'(r_i) gamma_{l,i} gamma_{i,l} - (2 gamma^2 / D^2) sum_{p in Q} |H(p)|^{-4}

is an unbiased estimate of Var[Ehat_o - E_o].

VI. CASE STUDY

It is important to emphasize that the proposed restoration framework presents various degrees of freedom. Firstly, it is possible to choose redundant or non-redundant analysis/synthesis families. In Section VI-A, we will show that in the case of orthonormal synthesis families, the estimator design can be split into several simpler optimization procedures. Secondly, virtually any structure of the estimator can be considered. Of particular interest are restoration methods involving Linear Expansion of Threshold (LET) functions, which are investigated in Section VI-B. As already mentioned, the latter estimators have been successfully used in denoising problems [29].

A. Use of orthonormal synthesis families

We now examine the case when (phibar_l)_{1<=l<=L} is an orthonormal basis of Pi(R^{D1 x ... x Dd}) (thus, L = card(Q)). This arises, in particular, when (phitilde_l)_{1<=l<=L} is an orthonormal basis of R^{D1 x ... x Dd} and the degradation system is invertible (Q = D). Then, due to the orthogonality of the functions (phibar_l)_{1<=l<=L}, the unbiased estimate of the risk in (26) can be rewritten as E(s - stilde) + Ehat_o = E(s - stilde) + D^{-1} sum_{l=1}^{L} (shat_l - r_l)^2 + Deltahat, where r_l = <rtilde, phi_l>. Thanks to (33), the observable part of the risk estimate can be expressed as

Ehat_o = (1/D) sum_{l=1}^{L} (shat_l - r_l)^2 + (2 gamma / D) sum_{l=1}^{L} Theta_l'(r_l) gamma_l - (gamma / D) sum_{p in Q} |H(p)|^{-2}   (38)

where (gamma_l)_{1<=l<=L} is given by (29).

Let us now assume that the coefficients (r_l)_{1<=l<=L} are classified according to M in N* distinct nonempty index subsets K_m, m in {1,...,M}. We have then L = sum_{m=1}^{M} card(K_m). For instance, for a wavelet decomposition, these subsets may correspond to the subbands associated with the different resolution levels, orientations, etc. In addition, consider that, for every m in {1,...,M}, the estimating functions (Theta_l)_{l in K_m} belong to a given class of parametric functions and are characterized by a vector parameter a_m. The same estimating function is thus employed for a given subset K_m of indices. Then, it can be noticed that the criterion to be minimized in (38) is the sum of M partial MSEs corresponding to each subset K_m. Consequently, we can separately adjust the vector a_m, for every m in {1,...,M}, so as to minimize

sum_{l in K_m} (Theta_l(r_l) - r_l)^2 + 2 gamma sum_{l in K_m} Theta_l'(r_l) gamma_l.   (39)

B. Example of LET functions

As in the previous section, we assume that the coefficients (r_l)_{1<=l<=L} as defined in (24) are classified according to M in N* distinct index subsets K_m, m in {1,...,M}. Within each class K_m, a LET estimating function is built from a linear combination of I_m in N* given functions f_{m,i} : R -> R applied to r_l. So, for every m in {1,...,M} and l in K_m, the estimator takes the form:

Theta_l(r_l) = sum_{i=1}^{I_m} a_{m,i} f_{m,i}(r_l)   (40)

where (a_{m,i})_{1<=i<=I_m} are scalar real-valued weighting factors. We deduce from (25) that the estimate can be expressed as

shat(x) = sum_{m=1}^{M} sum_{i=1}^{I_m} a_{m,i} beta_{m,i}(x)   (41)

where

beta_{m,i}(x) = sum_{l in K_m} f_{m,i}(r_l) phibar_l(x).   (42)

Then, the problem of optimizing the estimator boils down to the determination of the weights a_{m,i} which minimize the unbiased risk estimate. According to (32) and (33), this is equivalent to minimizing E(shat - rtilde) + (2 gamma / D) sum_{m=1}^{M} sum_{l in K_m} Theta_l'(r_l) gamma_l, where (gamma_l)_{1<=l<=L} is given by (29). From (41), it can be deduced that this amounts to minimizing:

sum_{m0=1}^{M} sum_{i0=1}^{I_m0} sum_{m=1}^{M} sum_{i=1}^{I_m} a_{m0,i0} a_{m,i} <beta_{m0,i0}, beta_{m,i}> - 2 sum_{m0=1}^{M} sum_{i0=1}^{I_m0} a_{m0,i0} <beta_{m0,i0}, rtilde> + 2 gamma sum_{m0=1}^{M} sum_{l in K_m0} sum_{i0=1}^{I_m0} a_{m0,i0} f'_{m0,i0}(r_l) gamma_l.
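The procedure of this section can be sketched numerically under strong simplifying assumptions (not the paper's experimental setting): denoising case H = 1, identity orthonormal basis, a single subband (M = 1) with I_1 = 2, f_1 the identity and f_2 an exponential LET function of the type used later in Section VII-B. The sketch solves the small linear system for the weights and compares the empirical risk estimate with the true risk:

```python
import numpy as np

rng = np.random.default_rng(4)
D = 8192
gamma = 0.25                         # known noise variance
sigma = np.sqrt(gamma)
omega = 3.0                          # hypothetical LET parameter

s = rng.laplace(scale=1.0, size=D)   # hypothetical sparse original signal
r = s + sigma * rng.standard_normal(D)   # denoising case: H = 1

# LET basis: f1 = identity, f2(rho) = (1 - exp(-(rho/(omega*sigma))^8)) * rho,
# together with the (a.e.) derivatives needed by the SURE term.
u = (r / (omega * sigma)) ** 8
f = np.stack([r, (1.0 - np.exp(-u)) * r])
df = np.stack([np.ones(D), 1.0 - np.exp(-u) + 8.0 * u * np.exp(-u)])

# Normal equations for the weights (denoising specialization, gamma_l = 1):
# sum_i <f_i0, f_i> a_i = <f_i0, r> - gamma * sum_l f'_i0(r_l)
A = f @ f.T
b = f @ r - gamma * df.sum(axis=1)
a = np.linalg.solve(A, b)
s_hat = a @ f                        # SURE-LET estimate

# Empirical risk estimate vs. the true (oracle) risk
risk_est = np.mean((s_hat - r) ** 2) + 2.0 * gamma * np.mean(a @ df) - gamma
true_risk = np.mean((s_hat - s) ** 2)
```

With a reasonably large D, the empirical risk estimate tracks the oracle risk closely even though it never uses s, which is the whole point of optimizing the weights on the SURE criterion.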

This minimization can be easily shown to yield the following set of linear equations:

for all m0 in {1,...,M}, for all i0 in {1,...,I_m0},
sum_{m=1}^{M} sum_{i=1}^{I_m} <beta_{m0,i0}, beta_{m,i}> a_{m,i} = <beta_{m0,i0}, rtilde> - gamma sum_{l in K_m0} f'_{m0,i0}(r_l) gamma_l.   (43)

VII. PARAMETER CHOICE

A. Choice of analysis/synthesis functions

Using the same notations as in Section VI, let {K_m, 1 <= m <= M} be a partition of {1,...,L}. Consider now a frame of R^{D1 x ... x Dd}, ((psi_{m,k_l})_{l in K_m})_{1<=m<=M}, where, for every m in {1,...,M}, psi_{m,0} is some field in R^{D1 x ... x Dd} and, for every l in K_m, psi_{m,k_l} denotes its k_l-periodically shifted version, k_l being some shift value in D. Notice that, by appropriately choosing the sets (K_m)_{1<=m<=M}, any frame of R^{D1 x ... x Dd} can be written under this form, but it is mostly useful to describe periodic wavelet bases, wavelet packets [43], mirror wavelet bases [11], redundant/undecimated wavelet representations as well as related frames [44], [45], [46], [47]. For example, for a classical 1D periodic wavelet basis, M - 1 represents the number of resolution levels and, for every l in a subband K_m at resolution level m in {1,...,M-1}, the shift parameter k_l is a multiple of 2^m (K_M being here the index subset related to the approximation subband).

A possible choice for the analysis family (phi_l)_{1<=l<=L} is then obtained by setting

for all m in {1,...,M}, for all l in K_m, for all p in D,
Phi_l(p) = G(p) Psi_{m,k_l}(p) = exp(-2 pi i k_l^T D^{-1} p) G(p) Psi_{m,0}(p)   (44)

where G(p) typically corresponds to the frequency response of an "inverse" of the degradation filter. It can be noticed that a similar choice is made in the WaveD estimator [14] by setting, for every p in Q, G(p) = 1/H(p) (starting from a dyadic Meyer wavelet basis). By analogy with Wiener filtering techniques, a more general form for the frequency response of this filter can be chosen:

G(p) = H(p) / (|H(p)|^2 + lambda)   (45)

where lambda >= 0. Note that, due to (44), the computation of the coefficients (r_l)_{1<=l<=L} amounts to the computation of the frame coefficients:

for all m in {1,...,M}, for all l in K_m,  r_l = <rcheck, psi_{m,k_l}>   (46)

where rcheck is the field with discrete Fourier coefficients

Rcheck(p) = (G(p))* R(p) if p in Q,  Rcheck(p) = 0 otherwise.   (47)

Concerning the associated synthesis family (phitilde_l)_{1<=l<=L}, we simply choose the dual synthesis frame of ((psi_{m,k_l})_{l in K_m})_{1<=m<=M} which, with a slight abuse of notation, will be assumed to be of the form ((phitilde_{m,k_l})_{l in K_m})_{1<=m<=M} where, for every l in K_m, phitilde_{m,k_l} denotes the k_l-periodically shifted version of phitilde_{m,0}. So, basically, the restoration method can be summarized by Fig. 1.

With these choices, it can be deduced from (29) that

for all m in {1,...,M}, for all l in K_m,  gamma_l = (1/D) sum_{p in Q} (Psi_{m,0}(p))* Phibar_{m,0}(p) / (|H(p)|^2 + lambda).   (48)

This shows that only M values of gamma_l need to be computed (instead of L). Similarly, simplified forms of the constants (gamma_{l,i})_{1<=l,i<=L} and (kappa_l)_{1<=l<=L} as defined by (37) and (124) can be easily obtained.

B. Choice of estimating functions

We will employ LET estimating functions due to the simplicity of their optimization, as explained in Section VI-B. More precisely, the following two possible forms will be investigated in this work:
- nonlinear estimating function in [29]: we set I_m = 2, take for f_{m,1} the identity function and choose

for all rho in R,  f_{m,2}(rho) = (1 - exp(-rho^8 / (omega sigma_m)^8)) rho   (49)

where omega in ]0, infinity[ and sigma_m is the standard deviation of (n_l)_{l in K_m}. According to (120) and (44), we have, for any l in K_m, sigma_m^2 = gamma D^{-1} sum_{p in Q} |Phi_l(p)|^2 = gamma D^{-1} sum_{p in Q} |G(p)|^2 |Psi_{m,0}(p)|^2.
- nonlinear estimating function in [30]: again, we set I_m = 2 and take for f_{m,1} the identity function, but we choose:

for all rho in R,  f_{m,2}(rho) = (tanh((rho + xi sigma_m)/(omega' sigma_m)) - tanh((rho - xi sigma_m)/(omega' sigma_m))) rho   (50)

where (xi, omega') in ]0, infinity[^2 and sigma_m is defined as for the previous estimating function.

VIII. EXPERIMENTAL RESULTS

A. Simulation context

In our experiments, the test data set contains six 8-bit images of size 512 x 512 which are displayed in Fig. 2. Different blurs have been applied: (i) 5 x 5 and 7 x 7 uniform blurs, (ii) a Gaussian blur with standard deviation sigma_h equal to 2, (iii) a cosine blur defined by: for all (p1, p2) in {0,...,D1-1} x {0,...,D2-1}, H(p1, p2) = H1(p1) H2(p2) where, for all i in {1, 2},

Hi(pi) = 1 if 0 <= pi <= Fc Di,
Hi(pi) = cos(pi_const (pi - Fc Di) / ((1 - 2 Fc) Di)) if Fc Di <= pi <= Di/2,
Hi(pi) = Hi(Di - pi) otherwise,   (51)


[Fig. 1 block diagram: the degraded data pass through a Wiener-like filter G* and a frame decomposition (ψ̃_{m,k_ℓ})_{ℓ,m}; after extraction of the observable part Π, the wavelet coefficients (r_ℓ)_ℓ of the filtered data ř undergo a pointwise nonlinear estimation (Θ̂_ℓ)_ℓ, and the estimated wavelet coefficients are mapped to the estimated image through the frame reconstruction operator (ϕ_ℓ)_ℓ.]

Fig. 1. Restoration method.
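In code, the processing chain of Fig. 1 can be sketched roughly as follows. This is a 1D toy version under stated assumptions: a one-level orthonormal Haar decomposition stands in for the paper's wavelet frame, the shrinkage rule is passed in as a callable, and all names are illustrative, not the paper's implementation:

```python
import numpy as np

def restore(r, H, lam, shrink):
    """Sketch of the restoration chain of Fig. 1 (1D toy version):
    Wiener-like filtering -> wavelet analysis -> pointwise nonlinear
    estimation -> wavelet synthesis. `H` is the DFT of the blur,
    `shrink(d, sigma)` is a pointwise estimating function."""
    # Wiener-like filter G = conj(H)/(|H|^2 + lam), cf. Eq. (45)
    R = np.fft.fft(r)
    G = np.conj(H) / (np.abs(H) ** 2 + lam)
    r_check = np.real(np.fft.ifft(G * R))
    # One-level orthonormal Haar analysis (stand-in for the frame)
    a = (r_check[0::2] + r_check[1::2]) / np.sqrt(2)
    d = (r_check[0::2] - r_check[1::2]) / np.sqrt(2)
    # Pointwise nonlinear estimation on the detail coefficients
    sigma = np.median(np.abs(d)) / 0.6745  # crude noise scale
    d = shrink(d, max(sigma, 1e-12))
    # Haar synthesis back to the signal domain
    s_hat = np.empty_like(r_check)
    s_hat[0::2] = (a + d) / np.sqrt(2)
    s_hat[1::2] = (a - d) / np.sqrt(2)
    return s_hat
```

With an identity blur, λ = 0 and an identity shrinkage, the chain reduces to perfect reconstruction, which is a convenient sanity check of the analysis/synthesis pair.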

with F_c ∈ [0, 1/2), and (iv) Dirac (the restoration problem then reduces to a denoising problem). Realizations of a zero-mean white Gaussian noise have been added to the blurred images. The noise variance γ is chosen so that the averaged blurred signal to noise ratio BSNR reaches a given target value, where BSNR ≜ 10 log₁₀(‖h ∗ s‖²/(Dγ)). The performance of a restoration method is measured by the averaged Signal to Noise Ratio SNR ≜ 10 log₁₀(Ê[s²]/Ê[(s − ŝ)²]), where Ê denotes the spatial average operator.

In our simulations, we have chosen the set P̂ as given by (31), where the threshold value χ is automatically adjusted so as to secure a reliable estimation of the risk while maximizing the size of the set Q. In practice, χ has been set, through a dichotomic search, to the smallest positive value such that Ê_o > 10√V_max, where V_max is an upper bound of Var[Ê_o − E_o]. This bound has been derived from (35) under some simplifying assumptions aiming at facilitating its computation. In an empirical manner, the parameter λ in (45) has been chosen proportional to the ratio of the noise variance to the variance of the blurred image, by taking λ = 3γ/(Ê[r²] − (Ê[r])² − γ). The other parameters of the method have been set to ω = 3 in (49) and (ξ, ω′) = (3.5, 2.25) in (50).

To validate our approach, we have made comparisons with state-of-the-art wavelet-based restoration methods and some other restoration approaches. For all these methods, symlet-8 wavelet decompositions performed over 4 resolution levels have been used [42]. The first approach is the ForWaRD method², which employs a translation invariant wavelet representation [48], [49]. The ForWaRD estimator has been applied with an optimized value of the regularization parameter. The same translation invariant wavelet decomposition is used for the proposed SURE-based method. The second method we have tested is the TwIST³ algorithm [22] considering a total variation penalization term. The third approach is the variational method in [21, Section 6] (which extends the method in [20]), where we use a tight wavelet frame consisting of the union of four shifted orthonormal wavelet decompositions. The shift parameters are (0, 0), (1, 0), (0, 1) and (1, 1). We have also included in our comparisons the results obtained with the classical Wiener filter and with a least squares optimization approach using a Laplacian regularization operator.⁴

B. Numerical results

Table II provides the values of the SNR achieved by the different considered techniques for several values of the BSNR and a given form of blur (uniform 5 × 5) on the six test images. All the provided quantitative results are median values computed over 10 noise realizations. It can be observed that, whatever the considered image is, SURE-based restoration methods generally lead to significant gains w.r.t. the other approaches, especially for low BSNRs. Furthermore, the two kinds of nonlinear estimating function which have been evaluated lead to almost identical results. It can also be noticed that the ForWaRD and TwIST methods perform quite well in terms of MSE for high BSNR. However, by examining the restored images more carefully, it can be seen that these methods may better recover uniform areas, at the expense of a loss of some detail information which is better preserved by the considered SURE-based method. This behaviour is visible in Fig. 3, where the proposed approach allows us to better recover Barbara's striped trousers.

Table III provides the SNRs obtained with the different techniques for several values of the BSNR and various blurs on the Tunis image (see Fig. 2(e)). The reported results confirm the good performance of SURE-based methods. The lower performance of the wavelet-based variational approach may be related to the fact that it requires the estimation of the hyperparameters of the prior distribution of the wavelet coefficients. This estimation has been performed by a maximum likelihood approach which is suboptimal in terms of mean square restoration error. The results at the bottom-right of Table III are in agreement with those in [29], [33],

² A Matlab toolbox can be downloaded from http://www.dsp.rice.edu/software/ward.shtml.
³ A Matlab toolbox can be downloaded from http://www.lx.it.pt/∼bioucas/code.htm.
⁴ We use the implementations of these methods provided in the Matlab Image Processing Toolbox, assuming that the noise level is known.
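The noise-level calibration described above can be sketched numerically. In this sketch (our own helper names, and an assumption that the BSNR is taken in its usual centered form), the noise variance is solved from the target BSNR and λ follows the empirical rule λ = 3γ/(Ê[r²] − (Ê[r])² − γ):

```python
import numpy as np

def noise_var_for_bsnr(blurred, bsnr_db):
    """Noise variance gamma achieving a target BSNR, using the common
    convention BSNR = 10 log10(||h*s - mean(h*s)||^2 / (D * gamma))."""
    centered = blurred - blurred.mean()
    return np.sum(centered ** 2) / (blurred.size * 10.0 ** (bsnr_db / 10.0))

def empirical_lambda(r, gamma):
    """Empirical choice of lambda in Eq. (45):
    lambda = 3*gamma / (E[r^2] - (E[r])^2 - gamma), i.e. three times the
    ratio of the noise variance to the estimated blurred-image variance."""
    var_blurred = np.mean(r ** 2) - np.mean(r) ** 2 - gamma
    return 3.0 * gamma / var_blurred
```

Note that `empirical_lambda` subtracts γ from the observed variance so that the denominator estimates the variance of the noiseless blurred image.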


Fig. 2. Original images: (a) Lena, (b) Barbara, (c) Marseille, (d) Boat, (e) Tunis and (f) Tiffany.

showing the outperformance of LET estimators for denoising problems. The poorer results obtained with ForWaRD in this case indicate that this method is tailored to deconvolution problems.

In the previous experiments, for all the considered methods, the noise variance γ was assumed to be known. Table IV gives the SNR values obtained with the different techniques for several noise levels and various blurs on the Tunis image, when the noise variance is estimated via the classical median absolute deviation (MAD) wavelet estimator [50, p. 447]. One can observe that the results are close to the case when the noise variance is known, except when the problem reduces to a denoising problem associated with a high BSNR. In this case indeed, the MAD estimator does not provide a precise estimation of the noise variance. However, the restoration results are still satisfactory.

IX. CONCLUSIONS

In this paper, we have addressed the problem of recovering data degraded by a convolution and the addition of a white Gaussian noise. We have adopted a hybrid approach that combines frequency and multiscale analyses. By formulating the underlying deconvolution problem as a nonlinear estimation problem, we have shown that the criterion to be optimized can be deduced from Stein's unbiased quadratic risk estimate. In this context, attention must be paid to the variance of the risk estimate. The expression of this variance has been derived in this paper.

The flexibility of the proposed recovery approach must be emphasized. Redundant or non-redundant data representations can be employed, as well as various combinations of linear/nonlinear estimates. Based on a specific choice of the wavelet representation and particular forms of the estimator structure, experiments have been conducted on a set of images, illustrating the good performance of the proposed approach. In our future work, we plan to further improve this restoration method by considering more sophisticated forms of the estimator, for example by taking into account multiscale or spatial dependencies as proposed in [32], [34] for denoising problems. Furthermore, it seems interesting to extend this work to the case of multicomponent data by accounting for the cross-channel correlations.

APPENDIX A
PROOF OF PROPOSITION 2

We first notice that Assumption (ii) is a sufficient condition for the existence of the left hand-side terms of (10)-(13) since, by Hölder's inequality,

$$\mathrm{E}[|\Theta_1(\rho_1)\tilde\eta_1|] \le \mathrm{E}[|\Theta_1(\rho_1)|^3]^{1/3}\, \mathrm{E}[|\tilde\eta_1|^{3/2}]^{2/3} \tag{52}$$
$$\mathrm{E}[|\Theta_1(\rho_1)\tilde\eta_1\tilde\eta_2|] \le \mathrm{E}[|\Theta_1(\rho_1)|^3]^{1/3}\, \mathrm{E}[|\tilde\eta_1\tilde\eta_2|^{3/2}]^{2/3} \tag{53}$$
$$\mathrm{E}[|\Theta_1(\rho_1)\tilde\eta_1\tilde\eta_2^2|] \le \mathrm{E}[|\Theta_1(\rho_1)|^3]^{1/3}\, \mathrm{E}[|\tilde\eta_1|^{3/2}|\tilde\eta_2|^3]^{2/3} \tag{54}$$
$$\mathrm{E}[|\Theta_1(\rho_1)\Theta_2(\rho_2)\tilde\eta_1\tilde\eta_2|] \le \mathrm{E}[|\Theta_1(\rho_1)|^3]^{1/3}\, \mathrm{E}[|\Theta_2(\rho_2)|^3]^{1/3}\, \mathrm{E}[|\tilde\eta_1\tilde\eta_2|^3]^{1/3}. \tag{55}$$

We can decompose η̃_i, with i ∈ {1, 2}, as follows:

$$\tilde\eta_i = a_i\,\eta_1 + \check\eta_i \tag{56}$$

where a_i is the mean-square prediction coefficient given by

$$a_i\,\sigma^2 = \mathrm{E}[\eta_1\,\tilde\eta_i] \tag{57}$$

with σ² = E[η₁²] and, η̌_i is the associated zero-mean prediction error, which is independent of υ₁ and η₁.⁵ We deduce that

$$\mathrm{E}[\Theta_1(\rho_1)\tilde\eta_1] = a_1\,\mathrm{E}[\Theta_1(\rho_1)\eta_1]. \tag{58}$$

We can invoke Stein's principle to express E[Θ₁(ρ₁)η₁], provided that the assumptions in Proposition 1 are satisfied. To check these assumptions, we remark that, for every τ ∈ ℝ, when |ζ| is large enough, |Θ₁(τ+ζ)| e^{−ζ²/(2σ²)} ≤ |Θ₁(τ+ζ)| ζ² e^{−ζ²/(2σ²)}, which, owing to Assumption (i), implies that lim_{|ζ|→∞} Θ₁(τ+ζ) e^{−ζ²/(2σ²)} = 0. In addition, from Jensen's inequality and Assumption (ii), E[|Θ′₁(ρ₁)|] ≤ E[|Θ′₁(ρ₁)|³]^{1/3} < ∞. Consequently, (9) combined with (57) can be applied to simplify (58), so allowing us to obtain (10).

Let us next prove (11). From (56), we get:

$$\begin{aligned}
\mathrm{E}[\Theta_1(\rho_1)\tilde\eta_1\tilde\eta_2] &= a_1 a_2\, \mathrm{E}[\Theta_1(\rho_1)\eta_1^2] + a_1\,\mathrm{E}[\Theta_1(\rho_1)\eta_1]\,\mathrm{E}[\check\eta_2] + a_2\,\mathrm{E}[\Theta_1(\rho_1)\eta_1]\,\mathrm{E}[\check\eta_1] + \mathrm{E}[\Theta_1(\rho_1)]\,\mathrm{E}[\check\eta_1\check\eta_2] \\
&= a_1 a_2\, \mathrm{E}[\Theta_1(\rho_1)\eta_1^2] + \mathrm{E}[\Theta_1(\rho_1)]\,\mathrm{E}[\check\eta_1\check\eta_2]
\end{aligned} \tag{59}$$

where we have used in the first equality the fact that (η̌₁, η̌₂) is independent of (η₁, υ₁) and, in the second one, that it is zero-mean. Then, by making use of the orthogonality relation

$$\mathrm{E}[\tilde\eta_1\tilde\eta_2] = a_1 a_2\,\sigma^2 + \mathrm{E}[\check\eta_1\check\eta_2] \tag{60}$$

we have

$$\mathrm{E}[\Theta_1(\rho_1)\tilde\eta_1\tilde\eta_2] = a_1 a_2\big(\mathrm{E}[\Theta_1(\rho_1)\eta_1^2] - \sigma^2\,\mathrm{E}[\Theta_1(\rho_1)]\big) + \mathrm{E}[\Theta_1(\rho_1)]\,\mathrm{E}[\tilde\eta_1\tilde\eta_2]. \tag{61}$$

⁵ Recall that (η₁, η̃₁, η̃₂) is zero-mean Gaussian.
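The MAD rule used for Table IV can be sketched as follows. This is an illustrative 1D Haar version under our own naming; the paper applies the estimator of [50] to the finest-scale wavelet coefficients of the degraded image:

```python
import numpy as np

def mad_noise_std(x):
    """Classical MAD estimator of the noise standard deviation:
    robust scale of the finest-level detail coefficients (here a
    one-level Haar transform), median(|d|) / 0.6745."""
    d = (x[0::2] - x[1::2]) / np.sqrt(2)  # finest-scale Haar details
    return np.median(np.abs(d)) / 0.6745
```

The robustness of the median explains the behaviour reported above: for pure denoising at high BSNR the signal leaks into the finest details and the estimate degrades, while for blurred data it remains accurate.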

TABLE II
RESTORATION RESULTS FOR A 5 × 5 UNIFORM BLUR: INITIAL SNR (SNR_i) AND SNR OBTAINED WITH OUR APPROACH USING THE NONLINEAR FUNCTION IN (49) (SNR_b), OUR APPROACH USING THE NONLINEAR FUNCTION IN (50) (SNR_s), FORWARD (SNR_f), TWIST (SNR_t), THE WAVELET-BASED VARIATIONAL APPROACH (SNR_v), THE WIENER FILTER (SNR_w) AND THE REGULARIZED QUADRATIC METHOD (SNR_r).

Lena
  BSNR     10     15     20     25     30
  SNR_i   9.796  14.26  17.89  20.21  21.31
  SNR_b  19.98  21.36  22.43  23.48  24.62
  SNR_s  19.93  21.29  22.39  23.45  24.58
  SNR_f  18.27  20.04  21.36  23.35  24.63
  SNR_t  19.52  21.17  22.79  23.90  24.91
  SNR_v  17.91  20.01  21.34  22.41  23.42
  SNR_w  15.82  19.69  21.54  22.23  22.46
  SNR_r  18.70  20.06  21.25  22.18  22.43

Barbara
  SNR_i   9.366  13.10  15.54  16.72  17.17
  SNR_b  17.02  17.54  18.05  18.79  19.75
  SNR_s  16.99  17.52  18.06  18.77  19.63
  SNR_f  16.14  17.04  17.62  18.56  19.49
  SNR_t  16.74  17.45  17.95  18.38  19.07
  SNR_v  16.41  17.26  17.76  18.32  18.94
  SNR_w  14.53  16.92  17.78  18.02  18.10
  SNR_r  16.53  17.11  17.58  17.92  18.10

Marseille
  SNR_i   8.926  11.93  13.57  14.25  14.49
  SNR_b  13.92  15.12  16.21  17.27  18.46
  SNR_s  13.93  15.12  16.21  17.28  18.48
  SNR_f  13.10  14.42  15.75  17.00  18.32
  SNR_t  12.74  14.19  15.52  16.87  18.36
  SNR_v  13.57  14.99  16.12  16.92  17.66
  SNR_w  12.60  14.70  15.53  15.81  15.90
  SNR_r  13.25  14.43  15.45  15.78  15.90

Boat
  SNR_i   9.713  14.03  17.39  19.39  20.28
  SNR_b  19.09  20.27  21.31  22.37  23.59
  SNR_s  19.05  20.22  21.27  22.36  23.57
  SNR_f  16.75  19.05  20.68  22.14  23.49
  SNR_t  18.29  19.67  21.09  22.35  23.80
  SNR_v  14.37  17.46  19.31  20.54  21.88
  SNR_w  15.80  19.19  20.56  21.02  21.17
  SNR_r  18.18  19.31  20.35  20.96  21.15

Tunis
  SNR_i   9.713  14.03  17.37  19.36  20.24
  SNR_b  18.63  19.73  20.75  21.72  22.66
  SNR_s  18.62  19.73  20.73  21.70  22.66
  SNR_f  16.57  18.54  19.99  21.20  22.29
  SNR_t  18.03  18.94  20.35  21.50  22.56
  SNR_v  17.45  18.73  19.60  20.54  21.56
  SNR_w  15.80  19.07  20.37  20.79  20.92
  SNR_r  18.13  19.20  20.17  20.76  20.91

Tiffany
  SNR_i   9.923  14.73  19.15  22.72  24.95
  SNR_b  24.18  25.28  26.13  26.92  28.09
  SNR_s  24.16  25.24  26.11  26.92  28.09
  SNR_f  17.93  21.72  23.67  26.53  28.12
  SNR_t  23.08  25.17  26.23  27.20  27.85
  SNR_v  21.81  24.47  25.63  26.44  27.46
  SNR_w  18.01  23.13  25.48  26.28  26.53
  SNR_r  23.65  24.60  25.42  26.12  26.44


Fig. 3. Zooms on Barbara image, BSNR = 25 dB; (a) Original, (b) Degraded, (c) Restored with ForWaRD, (d) Restored with TwIST and (e) Restored with the proposed method using (50).

In addition, by integration by parts, the conditional expectation w.r.t. υ₁, given for every τ by

$$\mathrm{E}[\Theta_1(\rho_1)\eta_1^2 \mid \upsilon_1 = \tau] = \frac{1}{\sqrt{2\pi}\,\sigma}\int_{-\infty}^{\infty} \Theta_1(\tau+\zeta)\,\zeta^2\, e^{-\zeta^2/(2\sigma^2)}\, d\zeta \tag{62}$$

can be reexpressed as

$$\mathrm{E}[\Theta_1(\rho_1)\eta_1^2 \mid \upsilon_1 = \tau] = \frac{\sigma}{\sqrt{2\pi}}\bigg(\lim_{\zeta\to-\infty}\Theta_1(\tau+\zeta)\,\zeta\, e^{-\zeta^2/(2\sigma^2)} - \lim_{\zeta\to\infty}\Theta_1(\tau+\zeta)\,\zeta\, e^{-\zeta^2/(2\sigma^2)} + \int_{-\infty}^{\infty}\big(\Theta_1(\tau+\zeta)+\Theta_1'(\tau+\zeta)\,\zeta\big)\, e^{-\zeta^2/(2\sigma^2)}\, d\zeta\bigg). \tag{63}$$
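The Stein-type identity that (62)-(63) lead to, namely E[Θ(ρ)η²] = σ²(E[Θ(ρ)] + E[Θ′(ρ)η]) with ρ = υ + η and η ∼ N(0, σ²), can be checked by a quick Monte Carlo experiment. This sketch (our own, not from the paper) uses an arbitrary smooth test function Θ = tanh:

```python
import numpy as np

rng = np.random.default_rng(1)
sigma = 0.7
upsilon = 0.3                        # a fixed value of upsilon_1
eta = rng.normal(0.0, sigma, 2_000_000)
rho = upsilon + eta

theta = np.tanh(rho)                 # smooth test function Theta_1
theta_p = 1.0 - np.tanh(rho) ** 2    # its derivative

lhs = np.mean(theta * eta ** 2)
rhs = sigma ** 2 * (np.mean(theta) + np.mean(theta_p * eta))
# lhs and rhs agree up to Monte Carlo error.
```

Because both sides are estimated from the same samples, their Monte Carlo errors are strongly correlated and the agreement is tight even for moderate sample sizes.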

TABLE III TUNIS IMAGE RESTORATION FOR VARIOUS BLURS.

Gaussian blur (σ_h = 2)
  BSNR     10     15     20     25     30
  SNR_i   9.662  13.86  17.01  18.80  19.56
  SNR_b  18.16  19.11  20.00  20.83  21.57
  SNR_s  18.16  19.10  19.99  20.81  21.56
  SNR_f  17.52  18.57  19.61  20.57  21.43
  SNR_t  17.12  18.32  19.47  20.40  21.33
  SNR_v  17.39  18.98  19.84  20.63  21.45
  SNR_w  16.24  18.70  19.49  19.57  19.62
  SNR_r  17.73  18.63  19.35  19.57  19.62

Uniform 7 × 7
  SNR_i   9.577  13.64  16.56  18.13  18.77
  SNR_b  18.06  18.87  19.57  20.30  21.14
  SNR_s  18.05  18.86  19.57  20.29  21.14
  SNR_f  16.54  18.02  19.15  20.10  20.97
  SNR_t  17.39  18.21  19.22  20.04  20.94
  SNR_v  17.39  18.81  19.53  20.27  21.13
  SNR_w  16.06  18.45  19.11  19.28  19.33
  SNR_r  17.64  18.50  19.09  19.26  19.33

Cosine (F_c = 3/32)
  SNR_i   9.971  14.87  19.57  23.74  26.83
  SNR_b  19.82  21.87  24.05  26.16  27.99
  SNR_s  19.82  21.85  24.02  26.13  27.98
  SNR_f  17.94  19.85  22.48  24.79  26.82
  SNR_t  18.52  21.18  23.81  25.94  27.78
  SNR_v  18.30  21.17  23.45  25.61  27.41
  SNR_w  12.63  17.50  21.98  25.61  27.89
  SNR_r  19.23  21.03  23.00  25.03  27.04

Dirac
  SNR_i  10.00  15.00  20.00  25.00  30.00
  SNR_b  19.97  22.30  25.05  28.25  31.97
  SNR_s  19.96  22.27  25.02  28.20  31.91
  SNR_f   5.701  11.68  19.10  25.96  31.41
  SNR_t  19.82  22.03  24.27  27.18  31.00
  SNR_v  18.10  21.02  23.44  26.58  30.18
  SNR_w  10.42  15.13  20.04  25.01  30.00
  SNR_r  19.39  21.37  23.76  26.58  29.91

TABLE IV TUNIS IMAGE RESTORATION FOR VARIOUS BLURS WHEN THE NOISE VARIANCE IS ESTIMATED WITH THE MAD ESTIMATOR.

Gaussian blur (σ_h = 2)
  BSNR     10     15     20     25     30
  SNR_i   9.662  13.86  17.01  18.80  19.56
  SNR_b  18.16  19.12  20.00  20.82  21.57
  SNR_s  18.15  19.11  19.99  20.81  21.56

Uniform 7 × 7
  SNR_i   9.577  13.64  16.56  18.13  18.77
  SNR_b  18.05  18.87  19.57  20.30  21.15
  SNR_s  18.05  18.86  19.57  20.30  21.15

Cosine (F_c = 3/32)
  SNR_i   9.971  14.87  19.57  23.74  26.83
  SNR_b  19.82  21.87  24.05  26.15  27.99
  SNR_s  19.81  21.85  24.02  26.12  27.97

Dirac
  SNR_i  10.00  15.00  20.00  25.00  30.00
  SNR_b  19.97  22.28  24.96  27.87  30.79
  SNR_s  19.96  22.26  24.92  27.83  30.77

The existence of the latter integral is secured for almost every value τ that can be taken by υ₁, thanks to Assumptions (ii) and (iii) and the fact that, if µ denotes the probability measure of υ₁,

$$\frac{1}{\sqrt{2\pi}\,\sigma}\iint_{\mathbb{R}^2} \big|\Theta_1(\tau+\zeta)+\Theta_1'(\tau+\zeta)\,\zeta\big|\, e^{-\zeta^2/(2\sigma^2)}\, d\zeta\, d\mu(\tau) = \mathrm{E}[|\Theta_1(\rho_1)+\Theta_1'(\rho_1)\eta_1|] \le \mathrm{E}[|\Theta_1(\rho_1)|] + \mathrm{E}[|\Theta_1'(\rho_1)\eta_1|] \le \mathrm{E}[|\Theta_1(\rho_1)|^3]^{1/3} + \mathrm{E}[|\Theta_1'(\rho_1)|^3]^{1/3}\,\mathrm{E}[|\eta_1|^{3/2}]^{2/3} < \infty. \tag{64}$$

Since, for every τ ∈ ℝ, when |ζ| is large enough, |Θ₁(τ+ζ) ζ| e^{−ζ²/(2σ²)} ≤ |Θ₁(τ+ζ)| ζ² e^{−ζ²/(2σ²)}, Assumption (i) implies that lim_{|ζ|→∞} Θ₁(τ+ζ) ζ e^{−ζ²/(2σ²)} = 0. By using this property, we deduce from (63) that E[Θ₁(ρ₁)η₁² | υ₁] = σ²(E[Θ₁(υ₁+η₁) | υ₁] + E[Θ′₁(υ₁+η₁)η₁ | υ₁]), which yields

$$\mathrm{E}[\Theta_1(\rho_1)\eta_1^2] = \sigma^2\big(\mathrm{E}[\Theta_1(\rho_1)] + \mathrm{E}[\Theta_1'(\rho_1)\eta_1]\big). \tag{65}$$

By inserting this equation in (61), we find that

$$\mathrm{E}[\Theta_1(\rho_1)\tilde\eta_1\tilde\eta_2] = a_1 a_2\,\sigma^2\,\mathrm{E}[\Theta_1'(\rho_1)\eta_1] + \mathrm{E}[\Theta_1(\rho_1)]\,\mathrm{E}[\tilde\eta_1\tilde\eta_2].$$

Formula (11) straightforwardly follows by noticing that, according to (56) and (57), E[Θ′₁(ρ₁)η̃₂] E[η₁η̃₁] = a₁a₂σ² E[Θ′₁(ρ₁)η₁].

Consider now (12). By using (56) and the independence between (η̌₁, η̌₂) and (η₁, υ₁), we can write

$$\begin{aligned}
\mathrm{E}[\Theta_1(\rho_1)\tilde\eta_1\tilde\eta_2^2] &= a_1 a_2^2\,\mathrm{E}[\Theta_1(\rho_1)\eta_1^3] + 2 a_1 a_2\,\mathrm{E}[\Theta_1(\rho_1)\eta_1^2]\,\mathrm{E}[\check\eta_2] + a_1\,\mathrm{E}[\Theta_1(\rho_1)\eta_1]\,\mathrm{E}[\check\eta_2^2] \\
&\quad + a_2^2\,\mathrm{E}[\Theta_1(\rho_1)\eta_1^2]\,\mathrm{E}[\check\eta_1] + 2 a_2\,\mathrm{E}[\Theta_1(\rho_1)\eta_1]\,\mathrm{E}[\check\eta_1\check\eta_2] + \mathrm{E}[\Theta_1(\rho_1)]\,\mathrm{E}[\check\eta_1\check\eta_2^2] \\
&= a_1 a_2^2\,\mathrm{E}[\Theta_1(\rho_1)\eta_1^3] + \mathrm{E}[\Theta_1(\rho_1)\eta_1]\big(a_1\,\mathrm{E}[\check\eta_2^2] + 2 a_2\,\mathrm{E}[\check\eta_1\check\eta_2]\big)
\end{aligned} \tag{66}$$

where the latter equality stems from the symmetry of the probability distribution of (η̌₁, η̌₂). Taking into account the relation E[η̃₂²] = a₂²σ² + E[η̌₂²] and (60), we get

$$\mathrm{E}[\Theta_1(\rho_1)\tilde\eta_1\tilde\eta_2^2] = a_1 a_2^2\big(\mathrm{E}[\Theta_1(\rho_1)\eta_1^3] - 3\sigma^2\,\mathrm{E}[\Theta_1(\rho_1)\eta_1]\big) + \mathrm{E}[\Theta_1(\rho_1)\eta_1]\big(a_1\,\mathrm{E}[\tilde\eta_2^2] + 2 a_2\,\mathrm{E}[\tilde\eta_1\tilde\eta_2]\big). \tag{67}$$

Let us now focus our attention on E[Θ₁(ρ₁)η₁³]. Since Assumptions (ii) and (iii) imply that

$$\mathrm{E}[|2\Theta_1(\rho_1)\eta_1 + \Theta_1'(\rho_1)\eta_1^2|] \le 2\,\mathrm{E}[|\Theta_1(\rho_1)|^3]^{1/3}\,\mathrm{E}[|\eta_1|^{3/2}]^{2/3} + \mathrm{E}[|\Theta_1'(\rho_1)|^3]^{1/3}\,\mathrm{E}[|\eta_1|^3]^{2/3} < \infty \tag{68}$$

and Assumption (i) holds, we can proceed by integration by parts, similarly to the proof of (65), to show that

$$\mathrm{E}[\Theta_1(\rho_1)\eta_1^3] = \sigma^2\big(2\,\mathrm{E}[\Theta_1(\rho_1)\eta_1] + \mathrm{E}[\Theta_1'(\rho_1)\eta_1^2]\big). \tag{69}$$

Thus, (66) reads

$$\mathrm{E}[\Theta_1(\rho_1)\tilde\eta_1\tilde\eta_2^2] = a_1 a_2^2\,\sigma^2\,\mathrm{E}[\Theta_1'(\rho_1)\eta_1^2] + \mathrm{E}[\Theta_1(\rho_1)\eta_1]\big(a_1\,\mathrm{E}[\tilde\eta_2^2] - a_1 a_2^2\,\sigma^2 + 2 a_2\,\mathrm{E}[\tilde\eta_1\tilde\eta_2]\big) \tag{70}$$

which, by using (9), can also be reexpressed as

$$\mathrm{E}[\Theta_1(\rho_1)\tilde\eta_1\tilde\eta_2^2] = a_1 a_2^2\,\sigma^2\,\mathrm{E}[\Theta_1'(\rho_1)\eta_1^2] + \sigma^2\,\mathrm{E}[\Theta_1'(\rho_1)]\big(a_1\,\mathrm{E}[\tilde\eta_2^2] - a_1 a_2^2\,\sigma^2 + 2 a_2\,\mathrm{E}[\tilde\eta_1\tilde\eta_2]\big). \tag{71}$$

In turn, we have

$$\begin{aligned}
\mathrm{E}[\Theta_1'(\rho_1)\tilde\eta_2^2] &= a_2^2\,\mathrm{E}[\Theta_1'(\rho_1)\eta_1^2] + 2 a_2\,\mathrm{E}[\Theta_1'(\rho_1)\eta_1]\,\mathrm{E}[\check\eta_2] + \mathrm{E}[\Theta_1'(\rho_1)]\,\mathrm{E}[\check\eta_2^2] \\
&= a_2^2\,\mathrm{E}[\Theta_1'(\rho_1)\eta_1^2] + \mathrm{E}[\Theta_1'(\rho_1)]\big(\mathrm{E}[\tilde\eta_2^2] - a_2^2\,\sigma^2\big)
\end{aligned} \tag{72}$$

which, by using (57), leads to

$$\mathrm{E}[\Theta_1'(\rho_1)\tilde\eta_2^2]\,\mathrm{E}[\eta_1\tilde\eta_1] = a_1 a_2^2\,\sigma^2\,\mathrm{E}[\Theta_1'(\rho_1)\eta_1^2] + \sigma^2\,\mathrm{E}[\Theta_1'(\rho_1)]\,a_1\big(\mathrm{E}[\tilde\eta_2^2] - a_2^2\,\sigma^2\big). \tag{73}$$

From the difference of (71) and (73), we derive that

$$\mathrm{E}[\Theta_1(\rho_1)\tilde\eta_1\tilde\eta_2^2] = \mathrm{E}[\Theta_1'(\rho_1)\tilde\eta_2^2]\,\mathrm{E}[\eta_1\tilde\eta_1] + 2 a_2\,\sigma^2\,\mathrm{E}[\Theta_1'(\rho_1)]\,\mathrm{E}[\tilde\eta_1\tilde\eta_2] \tag{74}$$

which, by using again (57), yields (12).

Finally, we will prove Formula (13). We decompose η₁ as follows:

$$\eta_1 = b\,\tilde\eta_2 + \tilde\eta_1^{\perp} \quad\text{where}\quad b\,\tilde\sigma^2 = \mathrm{E}[\eta_1\tilde\eta_2], \tag{75}$$

σ̃² = E[η̃₂²] and, η̃₁⊥ is independent of (η̃₂, υ₁, υ₂). This allows us to write

$$\mathrm{E}[\Theta_1(\rho_1)\Theta_2(\rho_2)\eta_1\tilde\eta_2] = b\,\mathrm{E}[\Theta_1(\rho_1)\Theta_2(\rho_2)\tilde\eta_2^2] + \mathrm{E}[\Theta_1(\rho_1)\Theta_2(\rho_2)\tilde\eta_1^{\perp}\tilde\eta_2]. \tag{76}$$

Let us first calculate E[Θ₁(ρ₁)Θ₂(ρ₂)η̃₂²]. For i ∈ {1, 2}, consider the decomposition:

$$\eta_i = c_i\,\tilde\eta_2 + \eta_i^{\perp} \tag{77}$$

where c_i σ̃² = E[η_i η̃₂] and, η̃₂, (η₁⊥, η₂⊥) and (υ₁, υ₂) are independent. We have then

$$\mathrm{E}[\Theta_1(\rho_1)\Theta_2(\rho_2)\tilde\eta_2^2 \mid \eta_1^{\perp}, \eta_2^{\perp}, \upsilon_1, \upsilon_2] = \frac{1}{\sqrt{2\pi}\,\tilde\sigma}\int_{-\infty}^{\infty} \Theta_1(\upsilon_1 + c_1\zeta + \eta_1^{\perp})\,\Theta_2(\upsilon_2 + c_2\zeta + \eta_2^{\perp})\,\zeta^2\, e^{-\zeta^2/(2\tilde\sigma^2)}\, d\zeta. \tag{78}$$

It can be noticed that

$$\mathrm{E}\big[\big|\Theta_1(\rho_1)\Theta_2(\rho_2) + \big(c_1\,\Theta_1'(\rho_1)\Theta_2(\rho_2) + c_2\,\Theta_1(\rho_1)\Theta_2'(\rho_2)\big)\tilde\eta_2\big|\big] \le \mathrm{E}[|\Theta_1(\rho_1)|^3]^{1/3}\,\mathrm{E}[|\Theta_2(\rho_2)|^3]^{1/3} + \big(|c_1|\,\mathrm{E}[|\Theta_1'(\rho_1)|^3]^{1/3}\,\mathrm{E}[|\Theta_2(\rho_2)|^3]^{1/3} + |c_2|\,\mathrm{E}[|\Theta_1(\rho_1)|^3]^{1/3}\,\mathrm{E}[|\Theta_2'(\rho_2)|^3]^{1/3}\big)\,\mathrm{E}[|\tilde\eta_2|^3]^{1/3} < \infty \tag{79}$$

and, for every (τ₁, τ₂) ∈ ℝ², lim_{|ζ|→∞} Θ₁(τ₁ + c₁ζ) Θ₂(τ₂ + c₂ζ) ζ e^{−ζ²/(2σ̃²)} = 0 since, for |ζ| large enough,

$$c_1^2\, c_2^2\, \big|\Theta_1(\tau_1 + c_1\zeta)\,\Theta_2(\tau_2 + c_2\zeta)\,\zeta\big|\, e^{-\zeta^2/(2\tilde\sigma^2)} \le |\Theta_1(\tau_1 + c_1\zeta)|\,(c_1\zeta)^2\, e^{-\zeta^2/(4\tilde\sigma^2)} \times |\Theta_2(\tau_2 + c_2\zeta)|\,(c_2\zeta)^2\, e^{-\zeta^2/(4\tilde\sigma^2)} \tag{80}$$

and Assumption (i) holds. We can therefore deduce, by integrating by parts in (78) and taking the expectation w.r.t. (η₁⊥, η₂⊥, υ₁, υ₂), that

$$\begin{aligned}
\mathrm{E}[\Theta_1(\rho_1)\Theta_2(\rho_2)\tilde\eta_2^2] &= \tilde\sigma^2\big(\mathrm{E}[\Theta_1(\rho_1)\Theta_2(\rho_2)] + c_1\,\mathrm{E}[\Theta_1'(\rho_1)\Theta_2(\rho_2)\tilde\eta_2] + c_2\,\mathrm{E}[\Theta_1(\rho_1)\Theta_2'(\rho_2)\tilde\eta_2]\big) \\
&= \tilde\sigma^2\,\mathrm{E}[\Theta_1(\rho_1)\Theta_2(\rho_2)] + \mathrm{E}[\Theta_1'(\rho_1)\Theta_2(\rho_2)\tilde\eta_2]\,\mathrm{E}[\eta_1\tilde\eta_2] + \mathrm{E}[\Theta_1(\rho_1)\Theta_2'(\rho_2)\tilde\eta_2]\,\mathrm{E}[\eta_2\tilde\eta_2].
\end{aligned} \tag{81}$$

Let us now calculate E[Θ₁(ρ₁)Θ₂(ρ₂)η̃₁⊥η̃₂]. We have, for i ∈ {1, 2},

$$\eta_i = \check c_i\,\tilde\eta_1^{\perp} + \check\eta_i^{\perp}, \quad\text{where}\quad \check c_i\,\mathrm{E}[(\tilde\eta_1^{\perp})^2] = \mathrm{E}[\eta_i\,\tilde\eta_1^{\perp}] \tag{82}$$

and, η̃₁⊥ is independent of (η̃₂, η̌₁⊥, η̌₂⊥, υ₁, υ₂). By proceeding similarly to the proof of (81), we get

$$\mathrm{E}[\Theta_1(\rho_1)\Theta_2(\rho_2)\,\tilde\eta_1^{\perp} \mid \tilde\eta_2, \check\eta_1^{\perp}, \check\eta_2^{\perp}] = \mathrm{E}[(\tilde\eta_1^{\perp})^2]\,\big(\check c_1\,\mathrm{E}[\Theta_1'(\rho_1)\Theta_2(\rho_2) \mid \tilde\eta_2, \check\eta_1^{\perp}, \check\eta_2^{\perp}] + \check c_2\,\mathrm{E}[\Theta_1(\rho_1)\Theta_2'(\rho_2) \mid \tilde\eta_2, \check\eta_1^{\perp}, \check\eta_2^{\perp}]\big) \tag{83}$$

which, owing to (82), allows us to write

$$\mathrm{E}[\Theta_1(\rho_1)\Theta_2(\rho_2)\tilde\eta_1^{\perp}\tilde\eta_2] = \mathrm{E}[\Theta_1'(\rho_1)\Theta_2(\rho_2)\tilde\eta_2]\,\mathrm{E}[\eta_1\tilde\eta_1^{\perp}] + \mathrm{E}[\Theta_1(\rho_1)\Theta_2'(\rho_2)\tilde\eta_2]\,\mathrm{E}[\eta_2\tilde\eta_1^{\perp}]. \tag{84}$$

On the other hand, from (75), we deduce that, for i ∈ {1, 2}, E[η_i η̃₁⊥] = E[η_i η₁] − b E[η_i η̃₂], so yielding

$$\mathrm{E}[\Theta_1(\rho_1)\Theta_2(\rho_2)\tilde\eta_1^{\perp}\tilde\eta_2] = \mathrm{E}[\Theta_1'(\rho_1)\Theta_2(\rho_2)\tilde\eta_2]\,\mathrm{E}[\eta_1\eta_1] + \mathrm{E}[\Theta_1(\rho_1)\Theta_2'(\rho_2)\tilde\eta_2]\,\mathrm{E}[\eta_2\eta_1] - b\,\big(\mathrm{E}[\Theta_1'(\rho_1)\Theta_2(\rho_2)\tilde\eta_2]\,\mathrm{E}[\eta_1\tilde\eta_2] + \mathrm{E}[\Theta_1(\rho_1)\Theta_2'(\rho_2)\tilde\eta_2]\,\mathrm{E}[\eta_2\tilde\eta_2]\big). \tag{85}$$

Altogether, (76), (75), (81) and (85) lead to

$$\mathrm{E}[\Theta_1(\rho_1)\Theta_2(\rho_2)\eta_1\tilde\eta_2] = \mathrm{E}[\Theta_1(\rho_1)\Theta_2(\rho_2)]\,\mathrm{E}[\eta_1\tilde\eta_2] + \mathrm{E}[\Theta_1'(\rho_1)\Theta_2(\rho_2)\tilde\eta_2]\,\mathrm{E}[\eta_1\eta_1] + \mathrm{E}[\Theta_1(\rho_1)\Theta_2'(\rho_2)\tilde\eta_2]\,\mathrm{E}[\eta_2\eta_1]. \tag{86}$$

In order to obtain a more symmetric expression, let us now look at the difference E[Θ₁(ρ₁)Θ′₂(ρ₂)η̃₂] E[η̃₂η₁] − E[Θ₁(ρ₁)Θ′₂(ρ₂)η₁] E[η̃₂²] = E[Θ₁(ρ₁)Θ′₂(ρ₂)η̃₁₂], where η̃₁₂ = E[η̃₂η₁] η̃₂ − E[η̃₂η̃₂] η₁. Since η̃₁₂ is a linear combination of η₁ and η̃₂ and, E[η̃₂η̃₁₂] = 0, η̃₁₂ is independent of

(η̃₂, υ₁, υ₂). Similarly to the derivation of Formula (10), it can be deduced that

$$\mathrm{E}[\Theta_1(\rho_1)\Theta_2'(\rho_2)\tilde\eta_{12} \mid \tilde\eta_2, \upsilon_1, \upsilon_2] = \mathrm{E}[\Theta_1(\rho_1)\,\Theta_2'(\upsilon_2+\tilde\eta_2)\,\tilde\eta_{12} \mid \tilde\eta_2, \upsilon_1, \upsilon_2] = \mathrm{E}[\Theta_1(\rho_1)\,\tilde\eta_{12} \mid \tilde\eta_2, \upsilon_1, \upsilon_2]\,\Theta_2'(\upsilon_2+\tilde\eta_2) = \mathrm{E}[\Theta_1'(\rho_1) \mid \tilde\eta_2, \upsilon_1, \upsilon_2]\,\mathrm{E}[\eta_1\tilde\eta_{12}]\,\Theta_2'(\rho_2) \tag{87}$$

which leads to

$$\mathrm{E}[\Theta_1(\rho_1)\Theta_2'(\rho_2)\tilde\eta_2]\,\mathrm{E}[\tilde\eta_2\eta_1] - \mathrm{E}[\Theta_1(\rho_1)\Theta_2'(\rho_2)\eta_1]\,\mathrm{E}[\tilde\eta_2^2] = \mathrm{E}[\Theta_1'(\rho_1)\Theta_2'(\rho_2)]\,\mathrm{E}[\eta_1\tilde\eta_{12}] = \mathrm{E}[\Theta_1'(\rho_1)\Theta_2'(\rho_2)]\big(\mathrm{E}[\eta_1\tilde\eta_2]\,\mathrm{E}[\tilde\eta_2\eta_1] - \mathrm{E}[\eta_1\eta_1]\,\mathrm{E}[\tilde\eta_2\tilde\eta_2]\big). \tag{88}$$

Eq. (13) is then derived by combining (86) with (88).

APPENDIX B
PROOF OF PROPOSITION 3

We have, for every p ∈ 𝔻,

$$\mathrm{E}[|R(\mathbf{p})|^2] = \mathrm{E}[|S(\mathbf{p})|^2] + \mathrm{E}[|\widetilde N(\mathbf{p})|^2] + 2\,\mathrm{Re}\big\{\mathrm{E}[\widetilde N(\mathbf{p})\,S(\mathbf{p})^*]\big\}. \tag{89}$$

Since ñ and s are uncorrelated, this yields E[|S(p)|²] = E[|R(p)|²] − E[|Ñ(p)|²]. In addition, we have E[S(p) Ŝ(p)*] = E[R(p) Ŝ(p)*] − E[Ñ(p) Ŝ(p)*]. The previous two equations show that

$$\forall \mathbf{p} \in \mathbb{D},\quad \mathrm{E}[|S(\mathbf{p}) - \widehat S(\mathbf{p})|^2] = \mathrm{E}[|\widehat S(\mathbf{p}) - R(\mathbf{p})|^2] - \mathrm{E}[|\widetilde N(\mathbf{p})|^2] + 2\,\mathrm{Re}\big\{\mathrm{E}[\widetilde N(\mathbf{p})\,\widehat S(\mathbf{p})^*]\big\}. \tag{90}$$

Moreover, using (17), the second term in the right-hand side of (90) is

$$\mathrm{E}[|\widetilde N(\mathbf{p})|^2] = \frac{\gamma D}{|H(\mathbf{p})|^2}. \tag{91}$$

On the other hand, according to (8), the last term in the right-hand side of (90) is such that

$$\mathrm{E}[\widetilde N(\mathbf{p})\,\widehat S(\mathbf{p})^*] = \sum_{\ell=1}^{L}\sum_{\mathbf{x}\in\mathbb{D}} \mathrm{E}[\widehat s_\ell\,\widetilde n(\mathbf{x})]\, \exp(-2\pi\imath\,\mathbf{x}^\top \mathbf{D}^{-1}\mathbf{p})\,\Phi_\ell(\mathbf{p})^*. \tag{92}$$

Furthermore, we know from (6) that ŝℓ = Θℓ(uℓ + nℓ), where

$$n_\ell = \langle n, \widetilde\varphi_\ell \rangle, \qquad u_\ell = \langle u, \widetilde\varphi_\ell \rangle \tag{93}$$

and u is the field in ℝ^{D₁×···×D_d} whose discrete Fourier coefficients are given by (3). From (93) as well as the assumptions made on the noise ñ corrupting the data, it is clear that (nℓ, ñ(x)) is a zero-mean Gaussian vector which is independent of uℓ. Thus, by using (10) in Proposition 2, we obtain:

$$\mathrm{E}[\widehat s_\ell\,\widetilde n(\mathbf{x})] = \mathrm{E}[\Theta_\ell'(r_\ell)]\,\mathrm{E}[n_\ell\,\widetilde n(\mathbf{x})]. \tag{94}$$

Let us now calculate E[nℓ ñ(x)]. Using (93) and (16), we get

$$\mathrm{E}[n_\ell\,\widetilde n(\mathbf{x})] = \sum_{\mathbf{y}\in\mathbb{D}} \mathrm{E}[\widetilde n(\mathbf{x})\,n(\mathbf{y})]\,\widetilde\varphi_\ell(\mathbf{y}) = \frac{1}{D^2}\sum_{(\mathbf{p}',\mathbf{p}'')\in\mathbb{D}^2} \mathrm{E}\big[\widetilde N(\mathbf{p}')\,N(\mathbf{p}'')^*\big]\,\widetilde\Phi_\ell(\mathbf{p}'')\, \exp(2\pi\imath\,\mathbf{x}^\top\mathbf{D}^{-1}\mathbf{p}') = \frac{\gamma}{D}\sum_{\mathbf{p}'\in\mathbb{D}} \frac{\widetilde\Phi_\ell(\mathbf{p}')}{H(\mathbf{p}')}\, \exp(2\pi\imath\,\mathbf{x}^\top\mathbf{D}^{-1}\mathbf{p}'). \tag{95}$$

Combining this equation with (92) and (94) yields

$$\mathrm{E}[\widetilde N(\mathbf{p})\,\widehat S(\mathbf{p})^*] = \gamma \sum_{\ell=1}^{L} \mathrm{E}[\Theta_\ell'(r_\ell)]\, \frac{\widetilde\Phi_\ell(\mathbf{p})\,\Phi_\ell(\mathbf{p})^*}{H(\mathbf{p})}. \tag{96}$$

Gathering now (90), (91) and (96), (18) is obtained.

From Parseval's formula, the global MSE can be expressed as

$$\mathrm{E}[\Delta(s,\widehat s)] = \frac{1}{D}\sum_{\mathbf{p}\in\mathbb{D}} \mathrm{E}[|S(\mathbf{p}) - \widehat S(\mathbf{p})|^2]. \tag{97}$$

The above equation together with (18) shows that (19) holds with

$$\Delta = \frac{\gamma}{D}\Big(2\sum_{\ell=1}^{L} \mathrm{E}[\Theta_\ell'(r_\ell)]\,\mathrm{Re}\{\gamma_\ell\} - \gamma\sum_{\mathbf{p}\in\mathbb{D}} |H(\mathbf{p})|^{-2}\Big). \tag{98}$$

Furthermore, by defining

$$\forall \ell \in \{1,\dots,L\},\qquad \widetilde n_\ell = \langle \widetilde n, \varphi_\ell \rangle \tag{99}$$

and using (16), it can be noticed that

$$\mathrm{E}[n_\ell\,\widetilde n_\ell] = \sum_{(\mathbf{x},\mathbf{y})\in\mathbb{D}^2} \mathrm{E}[\widetilde n(\mathbf{y})\,n(\mathbf{x})]\,\varphi_\ell(\mathbf{y})\,\widetilde\varphi_\ell(\mathbf{x}) = \frac{\gamma}{D}\sum_{\mathbf{p}\in\mathbb{D}} \frac{\widetilde\Phi_\ell(\mathbf{p})\,\Phi_\ell(\mathbf{p})^*}{H(\mathbf{p})} = \gamma\gamma_\ell. \tag{100}$$

Hence, as claimed in the last part of our statements, (γℓ)_{1≤ℓ≤L} is real-valued since it is the cross-correlation of a real-valued sequence.

APPENDIX C
PROOF OF PROPOSITION 5

By using (25), we have Σ_{x∈𝔻} ŝ(x) ñ(x) = Σ_{ℓ=1}^{L} ŝℓ ñℓ, where ñℓ has been here redefined as

$$\widetilde n_\ell = \langle \widetilde n, \varphi_\ell \rangle. \tag{101}$$

In addition, Ê(ñ) = D⁻² Σ_{p∈Q} |N(p)|²/|H(p)|². This allows us to rewrite (34) as Êₒ − Eₒ = −2A − B + 2C, where

$$A = \frac{1}{D}\sum_{\mathbf{x}\in\mathbb{D}} s(\mathbf{x})\,\widetilde n(\mathbf{x}) \tag{102}$$
$$B = \frac{1}{D}\sum_{\mathbf{x}\in\mathbb{D}} \widetilde n(\mathbf{x})^2 - \frac{\gamma}{D}\sum_{\mathbf{p}\in\mathbb{Q}} |H(\mathbf{p})|^{-2} = \frac{1}{D}\sum_{\mathbf{p}\in\mathbb{Q}} \frac{|N(\mathbf{p})|^2/D - \gamma}{|H(\mathbf{p})|^2} \tag{103}$$
$$C = \frac{1}{D}\sum_{\ell=1}^{L}\big(\widehat s_\ell\,\widetilde n_\ell - \Theta_\ell'(r_\ell)\,\gamma\gamma_\ell\big). \tag{104}$$

The variance of the error in the estimation of the risk is thus given by

$$\mathrm{Var}[\widehat{\mathcal{E}}_o - \mathcal{E}_o] = \mathrm{E}[(\widehat{\mathcal{E}}_o - \mathcal{E}_o)^2] = 4\,\mathrm{E}[A^2] + 4\,\mathrm{E}[AB] - 8\,\mathrm{E}[AC] + \mathrm{E}[B^2] - 4\,\mathrm{E}[BC] + 4\,\mathrm{E}[C^2]. \tag{105}$$

We will now calculate each of the terms in the right hand-side of the above expression to determine the variance.

• Due to the independence of s and ñ, the first term to calculate is equal to

$$\mathrm{E}[A^2] = \frac{1}{D^4}\sum_{(\mathbf{p},\mathbf{p}')\in\mathbb{Q}^2} \mathrm{E}[S(\mathbf{p})\,S(\mathbf{p}')^*]\;\mathrm{E}[\widetilde N(\mathbf{p})\,\widetilde N(\mathbf{p}')^*]. \tag{106}$$

By using (17), this expression simplifies as

$$\mathrm{E}[A^2] = \frac{\gamma}{D^3}\sum_{\mathbf{p}\in\mathbb{Q}} \frac{\mathrm{E}[|S(\mathbf{p})|^2]}{|H(\mathbf{p})|^2}. \tag{107}$$

• The second term cancels. Indeed, since n and hence ñ are zero-mean,

$$\mathrm{E}[AB] = \frac{1}{D^2}\sum_{(\mathbf{x},\mathbf{x}')\in\mathbb{D}^2} \mathrm{E}[s(\mathbf{x})]\;\mathrm{E}[\widetilde n(\mathbf{x})\,\widetilde n(\mathbf{x}')^2] \tag{108}$$

and, since (ñ(x), ñ(x′)) is zero-mean Gaussian, it has a symmetric distribution and E[ñ(x) ñ(x′)²] = 0.

• The calculation of the third term is a bit more involved. We have

$$\mathrm{E}[AC] = \frac{1}{D^2}\sum_{\mathbf{x}\in\mathbb{D}}\sum_{\ell=1}^{L}\Big(\mathrm{E}[s(\mathbf{x})\,\widehat s_\ell\,\widetilde n_\ell\,\widetilde n(\mathbf{x})] - \mathrm{E}[s(\mathbf{x})\,\Theta_\ell'(r_\ell)\,\widetilde n(\mathbf{x})]\,\gamma\gamma_\ell\Big). \tag{109}$$

In order to find a tractable expression of E[s(x) ŝℓ ñℓ ñ(x)] with ℓ ∈ {1, . . . , L}, we will first consider the following conditional expectation w.r.t. s: E[s(x) ŝℓ ñℓ ñ(x) | s] = s(x) E[Θℓ(rℓ) ñℓ ñ(x) | s].⁶ According to Formula (11) in Proposition 2,

$$\mathrm{E}[\Theta_\ell(r_\ell)\,\widetilde n_\ell\,\widetilde n(\mathbf{x}) \mid s] = \mathrm{E}[\Theta_\ell'(r_\ell)\,\widetilde n(\mathbf{x}) \mid s]\;\mathrm{E}[n_\ell\,\widetilde n_\ell] + \mathrm{E}[\Theta_\ell(r_\ell) \mid s]\;\mathrm{E}[\widetilde n_\ell\,\widetilde n(\mathbf{x})] \tag{110}$$

which, by using (100), allows us to deduce that

$$\mathrm{E}[s(\mathbf{x})\,\Theta_\ell(r_\ell)\,\widetilde n_\ell\,\widetilde n(\mathbf{x})] = \mathrm{E}[s(\mathbf{x})\,\Theta_\ell'(r_\ell)\,\widetilde n(\mathbf{x})]\,\gamma\gamma_\ell + \mathrm{E}[s(\mathbf{x})\,\Theta_\ell(r_\ell)]\;\mathrm{E}[\widetilde n_\ell\,\widetilde n(\mathbf{x})]. \tag{111}$$

This shows that (109) can be simplified as follows:

$$\mathrm{E}[AC] = \frac{1}{D^2}\sum_{\mathbf{x}\in\mathbb{D}}\sum_{\ell=1}^{L} \mathrm{E}[s(\mathbf{x})\,\Theta_\ell(r_\ell)]\;\mathrm{E}[\widetilde n_\ell\,\widetilde n(\mathbf{x})]. \tag{112}$$

Furthermore, according to (17) and (101), we have

$$\mathrm{E}[\widetilde n_\ell\,\widetilde n(\mathbf{x})] = \frac{\gamma}{D}\sum_{\mathbf{p}\in\mathbb{Q}} \frac{\Phi_\ell(\mathbf{p})^*}{|H(\mathbf{p})|^2}\, \exp(-2\pi\imath\,\mathbf{x}^\top\mathbf{D}^{-1}\mathbf{p}). \tag{113}$$

This yields

$$\mathrm{E}[AC] = \frac{\gamma}{D^3}\sum_{\mathbf{p}\in\mathbb{Q}}\sum_{\ell=1}^{L} \mathrm{E}[S(\mathbf{p})\,\Theta_\ell(r_\ell)]\, \frac{\Phi_\ell(\mathbf{p})^*}{|H(\mathbf{p})|^2} = \frac{\gamma}{D^3}\sum_{\mathbf{p}\in\mathbb{Q}} \frac{\mathrm{E}[S(\mathbf{p})\,\widehat S(\mathbf{p})^*]}{|H(\mathbf{p})|^2}. \tag{114}$$

• The calculation of the fourth term is more classical since |N(p)|²/D is the p-th bin of the periodogram [51] of the Gaussian noise n. More precisely, since |N(p)|²/D is an unbiased estimate of γ,

$$\mathrm{E}[B^2] = \frac{1}{D^4}\sum_{(\mathbf{p},\mathbf{p}')\in\mathbb{Q}^2} \frac{\mathrm{Cov}\big(|N(\mathbf{p})|^2, |N(\mathbf{p}')|^2\big)}{|H(\mathbf{p})|^2\,|H(\mathbf{p}')|^2}. \tag{115}$$

In the above summation, we know that, if p ≠ p′ and p ≠ D1 − p′ with 1 = (1, . . . , 1)⊤ ∈ ℝ^d, N(p) and N(p′) are independent and thus, Cov(|N(p)|², |N(p′)|²) = 0. On the other hand, if p = p′ or p = D1 − p′, then Cov(|N(p)|², |N(p′)|²) = E[|N(p)|⁴] − γ²D². Let

$$\mathbb{S} = \big\{\mathbf{p} = (p_1,\dots,p_d)^\top \in \mathbb{D} \;\big|\; \forall i \in \{1,\dots,d\},\; p_i \in \{0, D_i/2\}\big\}. \tag{116}$$

If p ∈ 𝕊, then N(p) is a zero-mean Gaussian real random variable and E[|N(p)|⁴] = 3 E[|N(p)|²]² = 3γ²D². Otherwise, N(p) is a zero-mean Gaussian circular complex random variable and E[|N(p)|⁴] = 2 E[|N(p)|²]² = 2γ²D². It can be deduced that

$$\begin{aligned}
\mathrm{E}[B^2] &= \frac{1}{D^4}\bigg(\sum_{\mathbf{p}\in\mathbb{Q}\cap\mathbb{S}} \frac{\mathrm{Var}[|N(\mathbf{p})|^2]}{|H(\mathbf{p})|^4} + \sum_{\mathbf{p}\in\mathbb{Q}\cap(\mathbb{D}\setminus\mathbb{S})} \Big(\frac{\mathrm{Var}[|N(\mathbf{p})|^2]}{|H(\mathbf{p})|^4} + \frac{\mathrm{Cov}\big(|N(\mathbf{p})|^2, |N(\mathbf{D1}-\mathbf{p})|^2\big)}{|H(\mathbf{p})|^2\,|H(\mathbf{D1}-\mathbf{p})|^2}\Big)\bigg) \\
&= \frac{1}{D^4}\bigg(\sum_{\mathbf{p}\in\mathbb{Q}\cap\mathbb{S}} \frac{2\gamma^2 D^2}{|H(\mathbf{p})|^4} + \sum_{\mathbf{p}\in\mathbb{Q}\cap(\mathbb{D}\setminus\mathbb{S})} \Big(\frac{\gamma^2 D^2}{|H(\mathbf{p})|^4} + \frac{\gamma^2 D^2}{|H(\mathbf{p})|^4}\Big)\bigg) = \frac{2\gamma^2}{D^2}\sum_{\mathbf{p}\in\mathbb{Q}} \frac{1}{|H(\mathbf{p})|^4}. \tag{117}
\end{aligned}$$

• Let us now turn our attention to the fifth term. According to (10) and the definition of γℓ in (100), for every ℓ ∈ {1, . . . , L}, ŝℓ ñℓ − Θ′ℓ(rℓ)γγℓ is zero-mean and we have then

$$\mathrm{E}[BC] = \frac{1}{D^2}\sum_{\mathbf{x}\in\mathbb{D}}\sum_{\ell=1}^{L}\Big(\mathrm{E}[\widehat s_\ell\,\widetilde n_\ell\,\widetilde n(\mathbf{x})^2] - \mathrm{E}[\Theta_\ell'(r_\ell)\,\widetilde n(\mathbf{x})^2]\,\gamma\gamma_\ell\Big). \tag{118}$$

⁶ Proposition 2 is applicable to the calculation of the conditional expectation since conditioning w.r.t. s amounts to fixing uℓ (see the remark at the end of Section III).
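The second- and fourth-order moment facts invoked in (115)-(117) — that |N(p)|²/D is unbiased for γ, and that E[|N(p)|⁴] equals 3(γD)² at real bins and 2(γD)² at circular complex bins — can be checked with a quick 1D Monte Carlo experiment (our own sketch, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(2)
D, gamma, trials = 64, 1.5, 40_000
# White Gaussian noise fields of variance gamma, one per row
n = rng.normal(0.0, np.sqrt(gamma), size=(trials, D))
N = np.fft.fft(n, axis=1)

p_real = 0   # a "real" bin (p in S): N(p) is a real Gaussian variable
p_cplx = 5   # a generic bin: N(p) is circular complex Gaussian

m2 = np.mean(np.abs(N) ** 2, axis=0)          # ~ gamma*D at every bin
m4_real = np.mean(np.abs(N[:, p_real]) ** 4)  # ~ 3 (gamma*D)^2
m4_cplx = np.mean(np.abs(N[:, p_cplx]) ** 4)  # ~ 2 (gamma*D)^2
```

The same dichotomy (real bins at p_i ∈ {0, D_i/2}, circular bins elsewhere, plus the Hermitian mirror pairs) is what produces the final factor 2γ²/D² in (117).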

By applying now Formula (12) in Proposition 2, we have, for every ℓ ∈ {1, . . . , L},

$$\begin{aligned}
\mathrm{E}[\widehat s_\ell\,\widetilde n_\ell\,\widetilde n(\mathbf{x})^2] - \mathrm{E}[\Theta_\ell'(r_\ell)\,\widetilde n(\mathbf{x})^2]\,\gamma\gamma_\ell &= \mathrm{E}[\Theta_\ell(r_\ell)\,\widetilde n_\ell\,\widetilde n(\mathbf{x})^2] - \mathrm{E}[\Theta_\ell'(r_\ell)\,\widetilde n(\mathbf{x})^2]\;\mathrm{E}[n_\ell\,\widetilde n_\ell] \\
&= 2\,\mathrm{E}[\Theta_\ell'(r_\ell)]\;\mathrm{E}[n_\ell\,\widetilde n(\mathbf{x})]\;\mathrm{E}[\widetilde n(\mathbf{x})\,\widetilde n_\ell]
\end{aligned} \tag{119}$$

where, in compliance with (24), nℓ is now given by

$$n_\ell = \langle n, \widetilde\varphi_\ell \rangle. \tag{120}$$

Furthermore, similarly to (95), we have

$$\mathrm{E}[\widetilde n(\mathbf{x})\,\widetilde n_\ell] = \frac{\gamma}{D}\sum_{\mathbf{p}'\in\mathbb{Q}} \frac{\Phi_\ell(\mathbf{p}')}{|H(\mathbf{p}')|^2}\, \exp(2\pi\imath\,\mathbf{x}^\top\mathbf{D}^{-1}\mathbf{p}'). \tag{121}$$

Altogether, (113), (119) and (121) yield

$$\mathrm{E}[\widehat s_\ell\,\widetilde n_\ell\,\widetilde n(\mathbf{x})^2] - \mathrm{E}[\Theta_\ell'(r_\ell)\,\widetilde n(\mathbf{x})^2]\,\gamma\gamma_\ell = \frac{2\gamma^2}{D^2}\,\mathrm{E}[\Theta_\ell'(r_\ell)]\sum_{(\mathbf{p},\mathbf{p}')\in\mathbb{Q}^2} \frac{\widetilde\Phi_\ell(\mathbf{p}')\,\Phi_\ell(\mathbf{p})^*}{H(\mathbf{p}')\,|H(\mathbf{p})|^2}\, \exp\big(2\pi\imath\,\mathbf{x}^\top\mathbf{D}^{-1}(\mathbf{p}'-\mathbf{p})\big). \tag{122}$$

Hence, (118) can be reexpressed as

$$\mathrm{E}[BC] = \frac{2\gamma^2}{D^2}\sum_{\ell=1}^{L} \mathrm{E}[\Theta_\ell'(r_\ell)]\,\kappa_\ell \tag{123}$$

where

$$\kappa_\ell = \frac{1}{D}\sum_{\mathbf{p}\in\mathbb{Q}} \frac{\widetilde\Phi_\ell(\mathbf{p})\,\Phi_\ell(\mathbf{p})^*}{H(\mathbf{p})\,|H(\mathbf{p})|^2}. \tag{124}$$

• Let us now consider the last term

$$\mathrm{E}[C^2] = \frac{1}{D^2}\sum_{\ell=1}^{L}\sum_{i=1}^{L}\Big(\mathrm{E}[\widehat s_\ell\,\widehat s_i\,\widetilde n_\ell\,\widetilde n_i] - \mathrm{E}[\widehat s_i\,\Theta_\ell'(r_\ell)\,\widetilde n_i]\,\gamma\gamma_\ell - \mathrm{E}[\widehat s_\ell\,\Theta_i'(r_i)\,\widetilde n_\ell]\,\gamma\gamma_i + \mathrm{E}[\Theta_\ell'(r_\ell)\,\Theta_i'(r_i)]\,\gamma^2\gamma_\ell\gamma_i\Big). \tag{125}$$

Appealing to Formula (13) in Proposition 2 and (100), we have

$$\begin{aligned}
\mathrm{E}[\Theta_\ell(r_\ell)\,\Theta_i(r_i)\,\widetilde n_\ell\,\widetilde n_i] &= \mathrm{E}[\Theta_\ell(r_\ell)\,\Theta_i(r_i)]\;\mathrm{E}[\widetilde n_\ell\,\widetilde n_i] + \mathrm{E}[\Theta_\ell'(r_\ell)\,\Theta_i(r_i)\,\widetilde n_i]\,\gamma\gamma_\ell + \mathrm{E}[\Theta_\ell(r_\ell)\,\Theta_i'(r_i)\,\widetilde n_\ell]\,\gamma\gamma_i \\
&\quad + \mathrm{E}[\Theta_\ell'(r_\ell)\,\Theta_i'(r_i)]\big(\mathrm{E}[n_i\,\widetilde n_\ell]\;\mathrm{E}[n_\ell\,\widetilde n_i] - \gamma^2\gamma_\ell\gamma_i\big). \tag{126}
\end{aligned}$$

This allows us to simplify (125) as follows:

$$\mathrm{E}[C^2] = \frac{1}{D^2}\sum_{\ell=1}^{L}\sum_{i=1}^{L}\Big(\mathrm{E}[\widehat s_\ell\,\widehat s_i]\;\mathrm{E}[\widetilde n_\ell\,\widetilde n_i] + \mathrm{E}[\Theta_\ell'(r_\ell)\,\Theta_i'(r_i)]\;\mathrm{E}[n_i\,\widetilde n_\ell]\;\mathrm{E}[n_\ell\,\widetilde n_i]\Big). \tag{127}$$

Furthermore, according to (17), (101), (16) and (120), we have

$$\mathrm{E}[\widetilde n_\ell\,\widetilde n_i] = \frac{1}{D^2}\sum_{(\mathbf{p},\mathbf{p}')\in\mathbb{Q}^2}\mathrm{E}[\widetilde N(\mathbf{p})\,\widetilde N(\mathbf{p}')^*]\,\Phi_\ell(\mathbf{p}')^*\,\Phi_i(\mathbf{p}) = \frac{\gamma}{D}\sum_{\mathbf{p}\in\mathbb{Q}} \frac{\Phi_\ell(\mathbf{p})^*\,\Phi_i(\mathbf{p})}{|H(\mathbf{p})|^2} \tag{128}$$

and

$$\mathrm{E}[n_\ell\,\widetilde n_i] = \frac{1}{D^2}\sum_{(\mathbf{p},\mathbf{p}')\in\mathbb{Q}^2}\mathrm{E}[N(\mathbf{p})\,\widetilde N(\mathbf{p}')^*]\,\widetilde\Phi_\ell(\mathbf{p})\,\Phi_i(\mathbf{p}')^* = \gamma\gamma_{\ell,i} \tag{129}$$

where the expression of γ_{ℓ,i} is given by (37). Hence, by using (128)-(129), (127) can be rewritten as

$$\mathrm{E}[C^2] = \frac{\gamma}{D^3}\sum_{\mathbf{p}\in\mathbb{Q}} \frac{\mathrm{E}[|\widehat S(\mathbf{p})|^2]}{|H(\mathbf{p})|^2} + \frac{\gamma^2}{D^2}\sum_{\ell=1}^{L}\sum_{i=1}^{L} \mathrm{E}[\Theta_\ell'(r_\ell)\,\Theta_i'(r_i)]\,\gamma_{\ell,i}\,\gamma_{i,\ell}. \tag{130}$$

• In conclusion, we deduce from (105), (107), (114), (117), (123) and (130) that

$$\mathrm{Var}[\widehat{\mathcal{E}}_o - \mathcal{E}_o] = \frac{4\gamma}{D^3}\sum_{\mathbf{p}\in\mathbb{Q}} \frac{\mathrm{E}[|S(\mathbf{p})-\widehat S(\mathbf{p})|^2]}{|H(\mathbf{p})|^2} + \frac{4\gamma^2}{D^2}\sum_{\ell=1}^{L}\sum_{i=1}^{L} \mathrm{E}[\Theta_\ell'(r_\ell)\,\Theta_i'(r_i)]\,\gamma_{\ell,i}\,\gamma_{i,\ell} - \frac{8\gamma^2}{D^2}\sum_{\ell=1}^{L} \mathrm{E}[\Theta_\ell'(r_\ell)]\,\kappa_\ell + \frac{2\gamma^2}{D^2}\sum_{\mathbf{p}\in\mathbb{Q}} \frac{1}{|H(\mathbf{p})|^4}. \tag{131}$$

By exploiting now (18) (see Proposition 4) and noticing that (κℓ)_{1≤ℓ≤L} is real-valued, this expression can be simplified as follows:

$$\mathrm{Var}[\widehat{\mathcal{E}}_o - \mathcal{E}_o] = \frac{4\gamma}{D^3}\sum_{\mathbf{p}\in\mathbb{Q}} \frac{\mathrm{E}[|\widehat S(\mathbf{p})-R(\mathbf{p})|^2]}{|H(\mathbf{p})|^2} + \frac{4\gamma^2}{D^2}\sum_{\ell=1}^{L}\sum_{i=1}^{L} \mathrm{E}[\Theta_\ell'(r_\ell)\,\Theta_i'(r_i)]\,\gamma_{\ell,i}\,\gamma_{i,\ell} - \frac{2\gamma^2}{D^2}\sum_{\mathbf{p}\in\mathbb{Q}} \frac{1}{|H(\mathbf{p})|^4}. \tag{132}$$

Eq. (35) follows by using Parseval's formula.

REFERENCES

[1] A. Benazza-Benyahia and J.-C. Pesquet, "A SURE approach for image deconvolution in an orthonormal wavelet basis," International Symposium on Communications, Control and Signal Processing, ISCCSP'08, pp. 1536-1541, St Julians, Malta, 12-14 Mar., 2008.
[2] A. K. Jain, Fundamentals of digital image processing, Englewood Cliffs, NJ: Prentice Hall, 1989.
[3] G. Demoment, "Image reconstruction and restoration: overview of common estimation structure and problems," IEEE Trans. on Acoustics, Speech and Signal Processing, vol. 37, pp. 2024-2036, Dec. 1989.
[4] A. K. Katsaggelos, Digital image restoration, New-York: Springer-Verlag, 1991.
[5] P. L. Combettes, "The foundations of set theoretic estimation," Proc. IEEE, vol. 81, pp. 182-208, Feb. 1993.

[6] M. Bertero and P. Boccacci, Introduction to inverse problems in imaging, Bristol, UK: IOP Publishing, 1998.
[7] A. D. Hillery and R. T. Chin, "Iterative Wiener filters for image restoration," IEEE Trans. Signal Processing, vol. 39, pp. 1892-1899, Aug. 1991.
[8] D. L. Donoho, "Nonlinear solution of linear inverse problems by wavelet-vaguelette decomposition," Appl. Comput. Harmon. Anal., vol. 2, pp. 101-126, 1995.
[9] F. Abramovich and B. W. Silverman, "Wavelet decomposition approaches to statistical inverse problems," Biometrika, vol. 85, pp. 115-129, 1998.
[10] D. L. Donoho and I. M. Johnstone, "Adapting to unknown smoothness via wavelet shrinkage," J. of the Amer. Stat. Ass., vol. 90, pp. 1200-1224, 1995.
[11] J. Kalifa, S. Mallat, and B. Rougé, "Deconvolution by thresholding in mirror wavelet bases," IEEE Trans. Image Processing, vol. 12, pp. 446-457, Apr. 2003.
[12] R. Neelamani, H. Choi, and R. Baraniuk, "ForWaRD: Fourier-wavelet regularized deconvolution for ill-conditioned systems," IEEE Trans. on Signal Processing, vol. 52, pp. 418-433, Feb. 2004.
[13] M. R. Banham and A. K. Katsaggelos, "Spatially adaptive wavelet-based multiscale image restoration," IEEE Trans. Image Processing, vol. 5, pp. 619-634, Apr. 1996.
[14] I. M. Johnstone, G. Kerkyacharian, D. Picard, and M. Raimondo, "Wavelet deconvolution in a periodic setting," J. R. Statist. Soc. B, vol. 66, part 3, pp. 547-573, 2004.
[15] L. Qu, P. S. Routh, and K. Ko, "Wavelet deconvolution in a periodic setting using cross-validation," IEEE Signal Processing Letters, vol. 13, pp. 232-235, Apr. 2006.
[16] A. Benazza-Benyahia and J.-C. Pesquet, "Multichannel image deconvolution in the wavelet transform domain," European Signal and Image Processing Conference, EUSIPCO'06, 5 p., Firenze, Italy, 4-8 Sept., 2006.
[17] M. A. T. Figueiredo and R. D. Nowak, "An EM algorithm for wavelet-based image restoration," IEEE Trans. on Image Processing, vol. 12, no. 8, pp. 906-916, 2003.
[18] J.-L. Starck, M. Nguyen, and F. Murtagh, "Wavelets and curvelets for image deconvolution: a combined approach," Signal Processing, vol. 83, pp. 2279-2283, 2003.
[19] J. Bect, L. Blanc-Féraud, G. Aubert, and A. Chambolle, "A ℓ1-unified variational framework for image restoration," European Conf. Computer Vision (ECCV), T. Pajdla and J. Matas (Eds.): LNCS 3024, pp. 1-13, Springer-Verlag Berlin Heidelberg, 2004.
[20] I. Daubechies, M. Defrise, and C. De Mol, "An iterative thresholding algorithm for linear inverse problems with a sparsity constraint," Comm. Pure Appl. Math., vol. 57, no. 11, pp. 1413-1457, 2004.
[21] C. Chaux, P. L. Combettes, J.-C. Pesquet, and V. R. Wajs, "A variational formulation for frame based inverse problems," Inverse Problems, vol. 23, pp. 1495-1518, Jun. 2007.
[22] J. M. Bioucas-Dias and M. A. Figueiredo, "A new TwIST: two-step iterative shrinkage/thresholding algorithms for image restoration," IEEE Trans. on Image Processing.
[31] F. Luisier, T. Blu, and M. Unser, "A new SURE approach to image denoising: interscale orthonormal wavelet thresholding," IEEE Trans. on Image Processing, vol. 16, pp. 593-606, Mar. 2007.
[32] F. Luisier and T. Blu, "SURE-LET multichannel image denoising: interscale orthonormal wavelet thresholding," IEEE Trans. on Image Processing, vol. 17, pp. 482-492, Apr. 2008.
[33] M. Raphan and E. P. Simoncelli, "Optimal denoising in redundant representations," IEEE Trans. Image Processing, vol. 17, pp. 1342-1352, Aug. 2008.
[34] C. Chaux, L. Duval, A. Benazza-Benyahia, and J.-C. Pesquet, "A nonlinear Stein based estimator for multichannel image denoising," IEEE Trans. on Signal Processing, vol. 56, pp. 3855-3870, Aug. 2008.
[35] P. L. Combettes and J.-C. Pesquet, "Wavelet-constrained image restoration," International Journal on Wavelets, Multiresolution and Information Processing, vol. 2, no. 4, pp. 371-389, Dec. 2004.
[36] Y. Eldar, "Generalized SURE for exponential families: applications to regularization," to appear in IEEE Trans. on Signal Processing.
[37] R. Averkamp and C. Houdré, "Stein estimation for infinitely divisible laws," ESAIM: Probability and Statistics, vol. 10, pp. 269-276, 2006.
[38] S. Ramani, T. Blu, and M. Unser, "Monte-Carlo SURE: a black-box optimization of regularization parameters for general denoising algorithms," IEEE Trans. on Image Processing, vol. 17, pp. 1540-1554, Sept. 2008.
[39] C. Vonesh, S. Ramani, and M. Unser, "Recursive risk estimation for non-linear image deconvolution with a wavelet-domain sparsity constraint," IEEE International Conference on Image Processing, ICIP'08, pp. 665-668, San Diego, CA, USA, 12-14 Oct., 2008.
[40] E. O. Brigham, The fast Fourier transform, Englewood Cliffs, N.J.: Prentice-Hall, 1974.
[41] R. C. Gonzalez and R. E. Woods, Digital image processing, Upper Saddle River, NJ: Addison-Wesley, 1992.
[42] I. Daubechies, Ten Lectures on Wavelets, Philadelphia, PA: SIAM, 1992.
[43] R. R. Coifman and M. V. Wickerhauser, "Entropy-based algorithms for best basis selection," IEEE Trans. on Information Theory, vol. 38, pp. 713-718, Mar. 1992.
[44] I. W. Selesnick, R. G. Baraniuk, and N. C. Kingsbury, "The dual tree complex wavelet transform," IEEE Signal Processing Magazine, vol. 22, pp. 123-151, Nov. 2005.
[45] C. Chaux, L. Duval, and J.-C. Pesquet, "Image analysis using a dual-tree M-band wavelet transform," IEEE Trans. on Image Processing, vol. 15, pp. 2397-2412, Aug. 2006.
[46] M. N. Do and M. Vetterli, "The contourlet transform: an efficient directional multiresolution image representation," IEEE Trans. Image Processing, vol. 14, pp. 2091-2106, Dec. 2005.
[47] S. Mallat, "Geometrical grouplets," Applied and Computational Harmonic Analysis, vol. 26, no. 2, pp. 161-180, Mar. 2009.
[48] G. T. Nason and B. W. Silverman, "The stationary wavelet transform and some statistical applications," in Wavelets and statistics, A. Antoniadis and G. Oppenheim editors, Lecture notes in statistics, Springer Verlag, pp. 281-299, 1995.
[49] J.-C. Pesquet, H. Krim, and H.
Carfantan, “Time-invariant orthonormal Trans. Image Process., vol. 16, pp. 2992–3004, Dec. 2007. wavelet representations,” IEEE Trans. Signal Processing, vol. 44, pp. [23] C. Vonesch and M. Unser, “A fast thresholded Landweber algorithm 1964–1970, Aug. 1996. for wavelet-regularized multidimensional deconvolution,” IEEE Trans. [50] S. Mallat, A wavelet tour of signal processing, San Diego, USA: on Image Processing, vol. 17, pp. 539-549, Apr. 2008. Academic Press, 1998. [24] J. A. Guerrero-Col´on, L. Mancera, and J. Portilla, “Image restoration us- [51] G. M. Jenkins and D. G. Watts, Spectral analysis and its applications, ing space-invariant Gaussian scale mixtures in overcomplete pyramids,” San Francisco, CA: Holden-Day, Inc., 1968. IEEE Trans. on Image Processing, vol. 17, no. 1, pp. 27-41, January 2008. [25] K. Dabov, A. Foi, V. Katkovnik, and K. Egiazarian, “Image restoration by sparse 3D transform-domain collaborative filtering,” SPIE Electronic Imaging, vol. 6812, no. 6812-1D, San Jose, CA, USA, January 2008. [26] C. Stein, “Estimation of the mean of a multivariate normal distribution,” Annals of Statistics, vol. 9, no. 6, pp. 1135-1151, 1981. [27] H. Krim, D. Tucker, S. Mallat, and D. L. Donoho, “On denoising and best signal representation,” IEEE Trans. on Information Theory, vol. 5, pp. 2225-2238, Nov. 1999. [28] A. Benazza-Benyahia and J. C. Pesquet, “Building robust wavelet estimators for multicomponent images using Stein’s principle, ” IEEE Image Processing Trans., vol. 14, pp. 1814-1830, Nov. 2005. [29] T. Blu and F. Luisier, “The SURE-LET approach to image denoising,” IEEE Trans. on Image Processing, vol. 16, no. 11, pp. 2778-2786, Nov. 2007. [30] J.-C. Pesquet and D. Leporini, “A new wavelet estimator for image denoising,” IEE Sixth International Conference on Image Processing and its Applications, vol. 1, pp. 249-253, Dublin, Ireland, Jul. 14-17, 1997.
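As an aside, expansions such as (126) rest on the Gaussian integration-by-parts (Stein) identity, which can be verified numerically. The sketch below is illustrative only: the soft-thresholding estimator g, the threshold lam, and the noise level sigma are placeholder choices, not the Θℓ or parameters of this paper. It checks E[g(X)X] = σ² E[g′(X)] for X ∼ N(0, σ²) by Monte-Carlo simulation.

```python
import numpy as np

# Monte-Carlo check of Stein's identity E[g(X) X] = sigma^2 * E[g'(X)]
# for zero-mean Gaussian X. Parameters below are illustrative placeholders.
rng = np.random.default_rng(0)
sigma = 1.5
x = rng.normal(0.0, sigma, size=1_000_000)

lam = 1.0  # hypothetical threshold

def g(t):
    # soft-thresholding, a typical wavelet-domain shrinkage nonlinearity
    return np.sign(t) * np.maximum(np.abs(t) - lam, 0.0)

def g_prime(t):
    # almost-everywhere derivative of soft-thresholding
    return (np.abs(t) > lam).astype(float)

lhs = np.mean(g(x) * x)            # empirical E[g(X) X]
rhs = sigma**2 * np.mean(g_prime(x))  # sigma^2 * empirical E[g'(X)]
print(lhs, rhs)  # the two estimates agree up to Monte-Carlo error
```

It is precisely this substitution of an expectation involving the unknown signal by one involving only derivatives of the estimator that makes the SURE strategy implementable.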