Calculus of variations in image processing

Jean-François Aujol
CMLA, ENS Cachan, CNRS, UniverSud, 61 Avenue du Président Wilson, F-94230 Cachan, FRANCE
Email: [email protected]
http://www.cmla.ens-cachan.fr/membres/aujol.html

18 September 2008

Note: This document is a working and incomplete version, subject to errors and changes. Readers are invited to point out mistakes by email to the author. This document is intended as course notes for second year master students. The author encourages the reader to check the references given in this manuscript. In particular, it is recommended to look at [9], which is closely related to the subject of this course.

Contents

1. Inverse problems in image processing
   1.1 Introduction
   1.2 Examples
   1.3 Ill-posed problems
   1.4 An illustrative example
   1.5 Modelization and estimator
       1.5.1 Maximum Likelihood estimator
       1.5.2 MAP estimator
   1.6 Energy method and regularization
2. Mathematical tools and modelization
   2.1 Minimizing in a Banach space
   2.2 Banach spaces
       2.2.1 Preliminaries
       2.2.2 Topologies in Banach spaces
       2.2.3 Convexity and lower semicontinuity
       2.2.4 Convex analysis
   2.3 Subgradient of a convex function
   2.4 Legendre-Fenchel transform
   2.5 The space of functions with bounded variation
       2.5.1 Introduction
       2.5.2 Definition
       2.5.3 Properties
       2.5.4 Decomposability of BV(Ω)
       2.5.5 SBV
       2.5.6 Sets of finite perimeter
       2.5.7 Coarea formula and applications
3. Energy methods
   3.1 Introduction
   3.2 Tychonov regularization
       3.2.1 Introduction
       3.2.2 Sketch of the proof (to fix some ideas)
       3.2.3 General case
       3.2.4 Minimization algorithms
   3.3 Rudin-Osher-Fatemi model
       3.3.1 Introduction
       3.3.2 Interpretation as a projection
       3.3.3 Euler-Lagrange equation for (3.67)
       3.3.4 Other regularization choices
       3.3.5 Half quadratic minimization
   3.4 Wavelets
       3.4.1 Besov spaces
       3.4.2 Wavelet shrinkage
       3.4.3 Variational interpretation
4. Advanced topics: Image decomposition
   4.1 Introduction
   4.2 A space for modeling oscillating patterns in bounded domains
       4.2.1 Definition and properties
       4.2.2 Study of Meyer problem
   4.3 Decomposition models and algorithms
       4.3.1 Which space to use?
       4.3.2 Parameter tuning
       4.3.3 TV − G algorithms
       4.3.4 TV − L1
       4.3.5 TV − H^{-1}
       4.3.6 TV-Hilbert
       4.3.7 Using negative Besov space
       4.3.8 Using Meyer's E space
       4.3.9 Applications of image decomposition
A. Discretization

1. Inverse problems in image processing

For this section, we refer the interested reader to [71]. We encourage the reader not familiar with matrices to look at [29].

1.1 Introduction

In many problems in image processing, the goal is to recover an ideal image u from an observation f. u is a perfect original image describing a real scene; f is an observed image, which is a degraded version of u. The degradation can be due to:
• Signal transmission: there can be some noise (random perturbation).
• Defects of the imaging system: there can be some blur (deterministic perturbation).

The simplest modelization is the following:

f = Au + v    (1.1)

where v is the noise and A is the blur, a linear operator (for example a convolution). The following assumptions are classical:
• A is known (but often not invertible).
• Only some statistics (mean, variance, ...) of v are known.

1.2 Examples

Image restoration (Figure 1):

f = u + v    (1.2)

with v a white Gaussian noise with standard deviation σ.

Radar image restoration (Figure 2):

f = uv    (1.3)

with v a gamma noise with mean one (a Poisson distribution is used for tomography).

Image decomposition (Figure 3):

f = u + v    (1.4)

where u is the geometrical component of the original image f (u can be seen as a sketch of f), and v is the textured component of the original image f (v contains all the details of f).

Image deconvolution (Figure 4):

f = Au + v    (1.5)

Image inpainting [56] (Figure 5).

Zoom [54] (Figure 6).

Figure 1: Denoising (original image, noisy image f (σ = 30), restoration u).
Figure 2: Denoising of a synthetic image with gamma noise: f has been corrupted by some multiplicative noise with gamma law of mean one (noise-free image, speckled image f, restored image u).
Figure 3: Image decomposition: original image f = BV component u + oscillatory component v.
Figure 4: Example of TV deconvolution (original image, degraded image, restored image).
Figure 5: Image inpainting (degraded image, inpainted image).
Figure 6: Top left: ideal image; top right: zoom with total variation minimization; bottom left: zoom by pixel duplication; bottom right: zoom with cubic splines.

1.3 Ill-posed problems

Let X and Y be two Hilbert spaces, and let A : X → Y be a continuous linear application (in short, an operator). Consider the following problem:

Given f ∈ Y, find u ∈ X such that f = Au.    (1.6)

The problem is said to be well-posed if
(i) for all f ∈ Y there exists a unique u ∈ X such that f = Au;
(ii) the solution u depends continuously on f.

In other words, the problem is well-posed if A is invertible and its inverse A^{-1} : Y → X is continuous. Conditions (i) and (ii) are referred to as the Hadamard conditions. A problem that is not well-posed is said to be ill-posed.

Notice that a mathematically well-posed problem may be ill-posed in practice: the solution may exist, be unique, and depend continuously on the data, but still be very sensitive to small perturbations of it. An error δf produces the error δu = A^{-1}δf, which may have dramatic consequences on the interpretation of the solution. In particular, if ||A^{-1}|| is very large, errors may be strongly amplified by the action of A^{-1}. There can also be some computational time issues.

1.4 An illustrative example

See [29] for the definition of the norm ||·||_2 of a vector, of a matrix, of a positive symmetric matrix, of an orthogonal matrix, ...

We consider the following problem:

f = Au + v    (1.7)

where ||v|| is the amount of noise. We assume that A is a real symmetric positive matrix with some small eigenvalues, so that ||A^{-1}|| is very large. We want to compute a solution by filtering; the numerical sketch below shows why the naive inverse is not an option.
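The following minimal sketch (not part of the original notes; matrix size, eigenvalue range and noise level are arbitrary illustrative choices) builds a synthetic symmetric positive definite A with small eigenvalues and measures how badly the naive reconstruction A^{-1}f behaves.

```python
import numpy as np

# Illustration of (1.7): symmetric positive definite A with small eigenvalues,
# observation f = A u + v, and the naively inverted solution A^{-1} f.
rng = np.random.default_rng(0)
n = 100
P, _ = np.linalg.qr(rng.standard_normal((n, n)))   # random orthogonal matrix
eigenvalues = np.logspace(0, -6, n)                # from 1 down to 1e-6
A = P @ np.diag(eigenvalues) @ P.T                 # A = P D P^T

u = rng.standard_normal(n)                         # "ideal" signal
v = 1e-3 * rng.standard_normal(n)                  # small additive noise
f = A @ u + v

u_naive = np.linalg.solve(A, f)                    # = u + A^{-1} v

print("noise level ||v||              :", np.linalg.norm(v))
print("naive inverse error ||u_naive - u|| :", np.linalg.norm(u_naive - u))
# The error is several orders of magnitude larger than ||v||: components of
# the data error along small eigenvalues are amplified by up to 1/1e-6.
```

Even though A is invertible, a perturbation of order 10^{-3} is enough to make A^{-1}f meaningless; this is exactly the amplification by ||A^{-1}|| discussed above.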
Since A is symmetric, there exists an orthogonal matrix P (i.e. P^{-1} = P^T) such that:

A = P D P^T    (1.8)

with D = diag(λ_i) a diagonal matrix, and λ_i > 0 for all i. We have:

A^{-1}f = u + A^{-1}v = u + P D^{-1} P^T v    (1.9)

with D^{-1} = diag(λ_i^{-1}). It is easy to see that instabilities arise from small eigenvalues λ_i.

Regularization by filtering: One way to overcome this problem consists in modifying the λ_i^{-1}: we multiply them by w_α(λ_i^2), where w_α is chosen such that:

w_α(λ^2) λ^{-1} → 0 when λ → 0.    (1.10)

This filters out singular components from A^{-1}f and leads to an approximation of u by u_α, defined by:

u_α = P D_α^{-1} P^T f    (1.11)

where D_α^{-1} = diag(w_α(λ_i^2) λ_i^{-1}). To obtain some accuracy, one must retain the singular components corresponding to large singular values. This is done by choosing w_α(λ^2) ≈ 1 for large values of λ.

For w_α, we may choose (truncated SVD):

w_α(λ^2) = 1 if λ^2 > α,   w_α(λ^2) = 0 if λ^2 ≤ α.    (1.12)

We may also choose a smoother function (Tychonov filter function):

w_α(λ^2) = λ^2 / (λ^2 + α).    (1.13)

An obvious question arises: can the regularization parameter α be selected to guarantee convergence as the error level ||v|| goes to zero?

Error analysis: We denote by R_α the regularization operator:

R_α = P D_α^{-1} P^T    (1.14)

so that u_α = R_α f. The regularization error is given by:

e_α = u_α − u = (R_α A u − u) + R_α v = e_α^{trunc} + e_α^{noise}    (1.15)

where:

e_α^{trunc} = R_α A u − u = P (D_α^{-1} D − Id) P^T u    (1.16)

and:

e_α^{noise} = R_α v = P D_α^{-1} P^T v.    (1.17)

e_α^{trunc} is the error due to the regularization (it quantifies the loss of information due to the regularizing filter). e_α^{noise} is called the noise amplification error.

We first deal with e_α^{trunc}. Since w_α(λ^2) → 1 as α → 0, we have D_α^{-1} → D^{-1} as α → 0, and thus:

e_α^{trunc} → 0 as α → 0.    (1.18)

To deal with the noise amplification error, we use the following inequality, valid for λ > 0:

w_α(λ^2) λ^{-1} ≤ 1/√α.    (1.19)

Recall that ||P|| = 1 since P is orthogonal. We thus deduce that:

||e_α^{noise}|| ≤ (1/√α) ||v||    (1.20)

where we recall that ||v|| is the amount of noise. To conclude, it suffices to choose α = ||v||^p with p < 2, and let ||v|| → 0: we get e_α^{noise} → 0. Now, if we choose α = ||v||^p with 0 < p < 2, we have:

e_α → 0 as ||v|| → 0.    (1.21)

For such a regularization parameter choice, we say that the method is convergent.

Rate of convergence: We assume a range condition:

u = Az, i.e. z = A^{-1}u.    (1.22)

Since A = P D P^T, we have:

e_α^{trunc} = P (D_α^{-1} D − Id) P^T u = P (D_α^{-1} D^2 − D) P^T z.    (1.23)

Hence:

||e_α^{trunc}||_2^2 ≤ ||D_α^{-1} D^2 − D||^2 ||z||^2.    (1.24)

Since D_α^{-1} D^2 − D = diag((w_α(λ_i^2) − 1) λ_i), and |(w_α(λ^2) − 1) λ| ≤ √α for both filter choices (1.12) and (1.13), we deduce that:

||e_α^{trunc}||_2^2 ≤ α ||z||^2.    (1.25)

We thus get:

||e_α|| ≤ √α ||z|| + (1/√α) ||v||.    (1.26)

The right-hand side is minimized by taking α = ||v|| / ||z||.
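To close this example, here is a small self-contained sketch (again not from the original notes; all sizes and parameter values are illustrative assumptions) of the two filter functions (1.12) and (1.13) applied to a synthetic problem. It shows the trade-off between e_α^{trunc} and e_α^{noise} as α varies, and evaluates the parameter choice α = ||v||/||z|| suggested by (1.26).

```python
import numpy as np

# Regularization by filtering, eqs. (1.11)-(1.13), on a synthetic
# symmetric positive definite problem A = P diag(lam) P^T.
rng = np.random.default_rng(0)
n = 100
P, _ = np.linalg.qr(rng.standard_normal((n, n)))   # orthogonal P
lam = np.logspace(0, -6, n)                        # eigenvalues of A
A = P @ np.diag(lam) @ P.T
u = rng.standard_normal(n)
v = 1e-3 * rng.standard_normal(n)
f = A @ u + v                                      # observation, eq. (1.7)

def filtered_solution(f, alpha, method="tychonov"):
    """u_alpha = P D_alpha^{-1} P^T f, with D_alpha^{-1} = diag(w_alpha(lam_i^2)/lam_i)."""
    if method == "tsvd":                           # truncated SVD, eq. (1.12)
        w = (lam**2 > alpha).astype(float)
    else:                                          # Tychonov filter, eq. (1.13)
        w = lam**2 / (lam**2 + alpha)
    return P @ ((w / lam) * (P.T @ f))

# The reconstruction error trades off noise amplification (small alpha)
# against loss of signal components (large alpha).
for alpha in [1e-8, 1e-6, 1e-4, 1e-2, 1.0]:
    err = np.linalg.norm(filtered_solution(f, alpha) - u)
    print(f"alpha = {alpha:8.1e}   ||u_alpha - u|| = {err:.3e}")

# Parameter choice suggested by the bound (1.26): alpha = ||v|| / ||z||, with u = A z.
z = np.linalg.solve(A, u)
alpha_star = np.linalg.norm(v) / np.linalg.norm(z)
err_star = np.linalg.norm(filtered_solution(f, alpha_star) - u)
print("alpha from (1.26):", alpha_star, "   error:", err_star)
```

The Tychonov filter is used by default; passing method="tsvd" switches to the truncated SVD filter. Either choice performs far better than the naive inverse of the previous sketch, and sweeping α makes the two error terms of (1.15) visible in practice.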