
MATH 3795 Lecture 10. Regularized Linear Least Squares.
Dmitriy Leykekhman, Fall 2008

Goals
- Understanding regularization.

Review.
- Consider the linear least squares problem

    \min_{x \in \mathbb{R}^n} \|Ax - b\|_2^2.

  From the last lecture:
- Let A = U \Sigma V^T be the singular value decomposition of A \in \mathbb{R}^{m \times n} with singular values

    \sigma_1 \ge \cdots \ge \sigma_r > \sigma_{r+1} = \cdots = \sigma_{\min\{m,n\}} = 0.

- The minimum norm solution is

    x^\dagger = \sum_{i=1}^{r} \frac{u_i^T b}{\sigma_i} v_i.

- If even one singular value \sigma_i is small, then small perturbations in b can lead to large errors in the solution.

Regularized Linear Least Squares Problems.
- If \sigma_1/\sigma_r \gg 1, then it might be useful to consider the regularized linear least squares problem (Tikhonov regularization)

    \min_{x \in \mathbb{R}^n} \frac{1}{2}\|Ax - b\|_2^2 + \frac{\lambda}{2}\|x\|_2^2.

  Here \lambda > 0 is the regularization parameter.
- The regularization parameter \lambda > 0 is not known a priori and has to be determined based on the problem data. See later.
- Observe that

    \min_x \frac{1}{2}\|Ax - b\|_2^2 + \frac{\lambda}{2}\|x\|_2^2
      = \min_x \frac{1}{2}\left\| \begin{pmatrix} A \\ \sqrt{\lambda}\, I \end{pmatrix} x - \begin{pmatrix} b \\ 0 \end{pmatrix} \right\|_2^2.   (1)

- For \lambda > 0 the matrix

    \begin{pmatrix} A \\ \sqrt{\lambda}\, I \end{pmatrix} \in \mathbb{R}^{(m+n) \times n}

  always has full rank n. Hence, for \lambda > 0, the regularized linear least squares problem (1) has a unique solution.
- The normal equations corresponding to (1) are given by

    \begin{pmatrix} A \\ \sqrt{\lambda}\, I \end{pmatrix}^T \begin{pmatrix} A \\ \sqrt{\lambda}\, I \end{pmatrix} x
      = (A^T A + \lambda I)\, x = A^T b
      = \begin{pmatrix} A \\ \sqrt{\lambda}\, I \end{pmatrix}^T \begin{pmatrix} b \\ 0 \end{pmatrix}.

- SVD: A = U \Sigma V^T, where U \in \mathbb{R}^{m \times m} and V \in \mathbb{R}^{n \times n} are orthogonal matrices and \Sigma \in \mathbb{R}^{m \times n} is a 'diagonal' matrix with diagonal entries \sigma_1 \ge \cdots \ge \sigma_r > \sigma_{r+1} = \cdots = \sigma_{\min\{m,n\}} = 0.
- Thus the normal equations corresponding to (1),

    (A^T A + \lambda I)\, x_\lambda = A^T b,

  can be written as

    (V \Sigma^T U^T U \Sigma V^T + \lambda V V^T)\, x_\lambda = V \Sigma^T U^T b,

  using U^T U = I and I = V V^T.
- Rearranging terms we obtain

    V (\Sigma^T \Sigma + \lambda I) V^T x_\lambda = V \Sigma^T U^T b;

  multiplying both sides by V^T from the left and setting z = V^T x_\lambda we get the diagonal system

    (\Sigma^T \Sigma + \lambda I)\, z = \Sigma^T U^T b.

- Its solution is

    z_i = \frac{\sigma_i (u_i^T b)}{\sigma_i^2 + \lambda}, \quad i = 1, \dots, r, \qquad z_i = 0, \quad i = r+1, \dots, n.

- Since x_\lambda = V z = \sum_{i=1}^{n} z_i v_i, the solution of the regularized linear least squares problem (1) is given by

    x_\lambda = \sum_{i=1}^{r} \frac{\sigma_i (u_i^T b)}{\sigma_i^2 + \lambda}\, v_i.
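- A minimal MATLAB sketch of this equivalence (the sizes m, n, the value of \lambda, and the random test data here are arbitrary choices for illustration): solve the stacked problem (1) with backslash and compare against the closed-form SVD expression for x_\lambda.

% Sketch: the stacked least squares problem (1) and the SVD formula
% give the same x_lambda (test sizes and data are arbitrary).
m = 20; n = 5; lambda = 1e-3;
A = randn(m,n); b = randn(m,1);
x1 = [A; sqrt(lambda)*eye(n)] \ [b; zeros(n,1)];  % solve (1) directly
[U,S,V] = svd(A,0);                               % economy-size SVD
sigma = diag(S);
x2 = V * (sigma.*(U'*b) ./ (sigma.^2 + lambda));  % closed-form solution
norm(x1 - x2)                                     % near machine precision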
Regularized Linear Least Squares Problems.
- Note that

    \lim_{\lambda \to 0} x_\lambda
      = \lim_{\lambda \to 0} \sum_{i=1}^{r} \frac{\sigma_i (u_i^T b)}{\sigma_i^2 + \lambda}\, v_i
      = \sum_{i=1}^{r} \frac{u_i^T b}{\sigma_i}\, v_i = x^\dagger,

  i.e., the solution of the regularized LLS problem (1) converges to the minimum norm solution of the LLS problem as \lambda goes to zero.
- The representation

    x_\lambda = \sum_{i=1}^{r} \frac{\sigma_i (u_i^T b)}{\sigma_i^2 + \lambda}\, v_i

  of the solution of the regularized LLS also reveals the regularizing property of adding the term \frac{\lambda}{2}\|x\|_2^2 to the (ordinary) least squares functional. We have that

    \frac{\sigma_i (u_i^T b)}{\sigma_i^2 + \lambda} \approx
    \begin{cases} 0, & \text{if } 0 \approx \sigma_i \ll \lambda, \\ \frac{u_i^T b}{\sigma_i}, & \text{if } \sigma_i \gg \lambda. \end{cases}

- Hence, adding \frac{\lambda}{2}\|x\|_2^2 to the (ordinary) least squares functional acts as a filter. Contributions from singular values which are large relative to the regularization parameter \lambda are left (almost) unchanged, whereas contributions from small singular values are (almost) eliminated. (A numerical illustration of these filter factors follows the MATLAB example below.)
- This raises an important question: how do we choose \lambda?

Regularized Linear Least Squares Problems.
- Suppose that the data are b = b_{ex} + \delta b. We want to compute the minimum norm solution of the (ordinary) LLS with unperturbed data b_{ex},

    x_{ex} = \sum_{i=1}^{r} \frac{u_i^T b_{ex}}{\sigma_i}\, v_i,

  but we can only compute with b = b_{ex} + \delta b; we do not know b_{ex}.
- The solution of the regularized least squares problem is

    x_\lambda = \sum_{i=1}^{r} \left( \frac{\sigma_i (u_i^T b_{ex})}{\sigma_i^2 + \lambda} + \frac{\sigma_i (u_i^T \delta b)}{\sigma_i^2 + \lambda} \right) v_i.

- We observed that

    \sum_{i=1}^{r} \frac{\sigma_i (u_i^T b_{ex})}{\sigma_i^2 + \lambda}\, v_i \to x_{ex} \quad \text{as } \lambda \to 0.

- On the other hand,

    \frac{\sigma_i (u_i^T \delta b)}{\sigma_i^2 + \lambda} \approx
    \begin{cases} 0, & \text{if } 0 \approx \sigma_i \ll \lambda, \\ \frac{u_i^T \delta b}{\sigma_i}, & \text{if } \sigma_i \gg \lambda, \end{cases}

  which suggests choosing \lambda sufficiently large to ensure that errors \delta b in the data are not magnified by small singular values.

Regularized Linear Least Squares Problems.
- MATLAB example:

% Compute A
t = 10.^(0:-1:-10)';
A = [ones(size(t)) t t.^2 t.^3 t.^4 t.^5];
% compute exact data
xex = ones(6,1);
bex = A*xex;
% data perturbation of 0.1%
deltab = 0.001*(0.5-rand(size(bex))).*bex;
b = bex+deltab;
% compute (economy-size) SVD of A
[U,S,V] = svd(A,0);
sigma = diag(S);
% solve regularized LLS for different lambda
for i = 0:7
    lambda(i+1) = 10^(-i);
    xlambda = V * (sigma.*(U'*b) ./ (sigma.^2 + lambda(i+1)));
    err(i+1) = norm(xlambda - xex);
end
loglog(lambda,err,'*');
ylabel('||x^{ex} - x_{\lambda}||_2');
xlabel('\lambda');
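- The filter interpretation can be made concrete via the filter factors f_i = \sigma_i^2/(\sigma_i^2 + \lambda), since x_\lambda = \sum_i f_i (u_i^T b/\sigma_i) v_i. A short sketch, reusing sigma from the experiment above and a sample value of \lambda:

% Sketch: filter factors f_i = sigma_i^2/(sigma_i^2 + lambda) multiply
% the SVD components u_i'*b/sigma_i of the minimum norm solution.
lambda = 1e-3;                        % sample value of the parameter
f = sigma.^2 ./ (sigma.^2 + lambda);  % filter factors, each in (0,1)
disp([sigma f])                       % f ~ 1 where sigma_i^2 >> lambda,
                                      % f ~ 0 where sigma_i^2 << lambda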
Regularized Linear Least Squares Problem.
- The error \|x_{ex} - x_\lambda\|_2 for different values of \lambda (loglog scale): [figure omitted]
- For this example \lambda \approx 10^{-3} is a good choice for the regularization parameter \lambda. However, we could only create this figure with knowledge of the desired solution x_{ex}.
- How can we determine a \lambda \ge 0 so that \|x_{ex} - x_\lambda\|_2 is small without knowledge of x_{ex}?
- One approach is the Morozov discrepancy principle.

Regularized Linear Least Squares Problem.
- Suppose b = b_{ex} + \delta b. We do not know the perturbation \delta b, but we assume that we know its size \|\delta b\|.
- Suppose the unknown desired solution x_{ex} satisfies A x_{ex} = b_{ex}.
- Hence,

    \|A x_{ex} - b\| = \|A x_{ex} - b_{ex} - \delta b\| = \|\delta b\|.

- Since the exact solution satisfies \|A x_{ex} - b\| = \|\delta b\|, we want to find a regularization parameter \lambda \ge 0 such that the solution x_\lambda of the regularized least squares problem satisfies

    \|A x_\lambda - b\| = \|\delta b\|.

  This is Morozov's discrepancy principle.
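- One possible numerical realization (a sketch, not prescribed by the slides): since \|A x_\lambda - b\|_2 increases monotonically with \lambda, the discrepancy equation can be solved by bisection on \log_{10}\lambda. The sketch below reuses A, b, and deltab from the experiment above and assumes the initial bracket contains the root:

% Sketch: choose lambda by Morozov's discrepancy principle, i.e. solve
% ||A*x_lambda - b||_2 = ||deltab||_2. The left-hand side is increasing
% in lambda, so bisection on log10(lambda) applies.
[U,S,V] = svd(A,0);
sigma = diag(S);
xreg = @(lam) V * (sigma.*(U'*b) ./ (sigma.^2 + lam)); % regularized solution
gap  = @(lam) norm(A*xreg(lam) - b) - norm(deltab);    % discrepancy gap
lo = -12; hi = 2;                  % bracket for log10(lambda), assumed valid
for k = 1:60
    mid = 0.5*(lo+hi);
    if gap(10^mid) > 0, hi = mid; else lo = mid; end
end
lambda_M = 10^(0.5*(lo+hi))        % discrepancy-principle choice of lambda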