Lecture 7, Numerical Computing
Krylov methods for Hermitian systems

Program
• The Arnoldi basis: a basis for the Krylov subspace
• Lanczos method
• Solution methods for linear systems
• Conjugate gradients
• MINRES, Conjugate Residuals
• SYMMLQ

Gerard Sleijpen and Martin van Gijzen November 8, 2017


Delft University of Technology

Projection methods

Last time you saw a general framework to solve Ax = b with a projection method. The main steps are:
• use recursion to compute a basis V_k for the search subspace;
• project the problem and compute its representation w.r.t. the basis (gives a low dimensional problem);
• solve the projected problem (gives a low dimensional solution vector ~y_k);
• compute the (high dimensional) solution x_k = V_k ~y_k.

Today we will see how symmetry (Hermitian problems) can be exploited to enhance efficiency.

The Krylov subspace

The subspace span{r_0, A r_0, A^2 r_0, ..., A^{k-1} r_0} is called the Krylov subspace of order k, generated by the matrix A and the initial vector r_0, and is denoted by

    K_k(A, r_0) = span{r_0, A r_0, A^2 r_0, ..., A^{k-1} r_0}.

Projection methods that search Krylov subspaces are called Krylov subspace methods. As we have seen in the previous lessons, a stable orthogonal basis for K_k(A, r_0) can be computed using Arnoldi's method.


Arnoldi's method

    Choose a starting vector v_1 with ||v_1||_2 = 1.
    For k = 1, 2, ..., do                      % iteration
        w = A v_k                              % expansion
        For i = 1, ..., k, do                  % orthogonalisation
            h_{i,k} = v_i^* w
            w = w - h_{i,k} v_i
        end for
        h_{k+1,k} = ||w||_2
        if h_{k+1,k} = 0, stop                 % invariant subspace spanned
        v_{k+1} = w / h_{k+1,k}                % new basis vector
    end for

The Arnoldi relation

The Arnoldi method can be summarised in a compact way. With the (k+1) x k upper Hessenberg matrix

    H̲_k ≡ [ h_{1,1}   h_{1,2}   ...       h_{1,k}   ]
          [ h_{2,1}   h_{2,2}   ...       h_{2,k}   ]
          [           ...       ...       ...       ]
          [                     h_{k,k-1} h_{k,k}   ]
          [    O                          h_{k+1,k} ]

and V_k ≡ [v_1, v_2, ..., v_k], we have the Arnoldi relation

    A V_k = V_{k+1} H̲_k.
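The loop above translates almost line by line into code. Below is a minimal NumPy sketch of the Arnoldi iteration (an illustration, not part of the slides; the function name `arnoldi` and the dense storage of H are assumptions):

```python
import numpy as np

def arnoldi(A, v1, m):
    """Minimal Arnoldi sketch: m steps with (modified) Gram-Schmidt.
    Returns V (n x (m+1)) with orthonormal columns and the (m+1) x m upper
    Hessenberg matrix H with A @ V[:, :m] = V @ H, unless an invariant
    subspace is found earlier (then truncated arrays are returned)."""
    n = len(v1)
    V = np.zeros((n, m + 1), dtype=complex)
    H = np.zeros((m + 1, m), dtype=complex)
    V[:, 0] = v1 / np.linalg.norm(v1)
    for k in range(m):
        w = A @ V[:, k]                      # expansion
        for i in range(k + 1):               # orthogonalisation
            H[i, k] = np.vdot(V[:, i], w)    # h_{i,k} = v_i^* w
            w = w - H[i, k] * V[:, i]
        H[k + 1, k] = np.linalg.norm(w)
        if H[k + 1, k] == 0:                 # invariant subspace spanned
            return V[:, :k + 1], H[:k + 1, :k + 1]
        V[:, k + 1] = w / H[k + 1, k]        # new basis vector
    return V, H
```

The work per step grows with k because of the full orthogonalisation loop; the next slides show how Hermitian A removes this cost.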


A is Hermitian

Let A be Hermitian. According to the Arnoldi relation we have

    V_k^* A V_k = V_k^* V_{k+1} H̲_k = H_k,

where H_k is the k x k upper block of H̲_k. Moreover, if A is Hermitian we have

    H_k^* = V_k^* A^* V_k = V_k^* A V_k = H_k.

So H_k is Hermitian and upper Hessenberg. This implies that H_k must be tridiagonal.


A is Hermitian (2)

So

    H_k = [ h_{1,1}   h_{1,2}              O         ]
          [ h_{2,1}   h_{2,2}   ...                  ]
          [           ...       ...       h_{k-1,k}  ]
          [    O                h_{k,k-1} h_{k,k}    ]

With α_k ≡ h_{k,k} = h̄_{k,k} and β_{k+1} ≡ h_{k+1,k} = h̄_{k,k+1}, the Arnoldi method simplifies to the (Hermitian) Lanczos method. With the Lanczos method it is possible to compute a new orthonormal basis vector using only the two previous basis vectors.

Lanczos' method

    Choose a starting vector v_1 with ||v_1||_2 = 1
    β_1 = 0, v_0 = 0                              % Initialization
    For k = 1, 2, ..., do                         % Iteration
        α_k = v_k^* A v_k
        w = A v_k - α_k v_k - β_k v_{k-1}         % New direction, orthogonal to the previous v's
        β_{k+1} = ||w||_2
        v_{k+1} = w / β_{k+1}                     % Normalization
    end for

Note that β_k > 0.
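For comparison with the pseudocode, here is a minimal NumPy sketch of the Lanczos recurrence (an illustration under stated assumptions, not taken from the slides; the function name and the choice to return the coefficients as arrays are mine). It is reused in some of the later sketches.

```python
import numpy as np

def lanczos(A, v1, m):
    """Minimal Lanczos sketch for Hermitian A: m steps of the three-term
    recurrence. Returns V (n x (m+1)), alpha (length m) and beta (length m+1),
    where beta[0] = 0 plays the role of beta_1 = 0 in the slides."""
    n = len(v1)
    V = np.zeros((n, m + 1), dtype=complex)
    alpha = np.zeros(m)
    beta = np.zeros(m + 1)
    V[:, 0] = v1 / np.linalg.norm(v1)
    for k in range(m):
        w = A @ V[:, k]
        alpha[k] = np.vdot(V[:, k], w).real      # v_k^* A v_k, real for Hermitian A
        w = w - alpha[k] * V[:, k]
        if k > 0:
            w = w - beta[k] * V[:, k - 1]        # subtract the previous basis vector only
        beta[k + 1] = np.linalg.norm(w)
        if beta[k + 1] == 0:                     # invariant subspace spanned
            return V[:, :k + 1], alpha[:k + 1], beta[:k + 2]
        V[:, k + 1] = w / beta[k + 1]            # normalisation
    return V, alpha, beta
```

In exact arithmetic the columns of V are orthonormal; in floating-point arithmetic this orthogonality is gradually lost, which is a well-known property of the Lanczos process.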


Lanczos' method (continued)

Let

    T̲_k = [ α_1   β_2               0       ]
          [ β_2   α_2   ...                 ]
          [       ...   ...         β_k     ]
          [  0          β_k         α_k     ]
          [  0                      β_{k+1} ]

and V_k = [v_1, v_2, ..., v_k]. Then the Lanczos relation reads

    A V_k = V_{k+1} T̲_k.

Eigenvalue methods

Arnoldi's and Lanczos' method were originally proposed as iterative methods to compute the eigenvalues of a matrix A:

    V_k^* A V_k = H_k ≡ T_k

is 'almost' a similarity transformation. The eigenvalues of H_k are called Ritz values (of A of order k), and

    H_k ~z = θ ~z   ⇔   A u - θ u ⊥ V_k,   where u ≡ V_k ~z.

u is a Ritz vector, (θ, u) is a Ritz pair.
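As a small numerical illustration (assumptions: the hypothetical `lanczos` sketch above, a random real symmetric test matrix, and the helper name `build_T_under` introduced here), one can assemble T̲_k from the computed coefficients, check the Lanczos relation, and obtain the Ritz values as the eigenvalues of the k x k upper block:

```python
import numpy as np

def build_T_under(alpha, beta):
    """Assemble the (m+1) x m tridiagonal matrix from Lanczos coefficients:
    alpha[0..m-1] on the diagonal, beta[1..m] on the sub/super-diagonals."""
    m = len(alpha)
    T = np.zeros((m + 1, m))
    for j in range(m):
        T[j, j] = alpha[j]
        if j > 0:
            T[j - 1, j] = beta[j]
            T[j, j - 1] = beta[j]
    T[m, m - 1] = beta[m]
    return T

rng = np.random.default_rng(0)
B = rng.standard_normal((60, 60))
A = B + B.T                                    # real symmetric test matrix
r0 = rng.standard_normal(60)

m = 12
V, alpha, beta = lanczos(A, r0, m)             # sketch from the Lanczos slide
Tu = build_T_under(alpha, beta)
print(np.linalg.norm(A @ V[:, :m] - V @ Tu))   # should be at rounding-error level
print(np.linalg.eigvalsh(Tu[:m, :]))           # Ritz values of A of order m
```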


Optimal approximations

The Lanczos method provides a cheap way to compute an orthogonal basis for the Krylov subspace K_k(A, r_0). Our approximations can be written as

    x_k = x_0 + V_k ~y_k,

where ~y_k is determined so that either the error

    ||x - x_k||_A = sqrt( (x - x_k)^* A (x - x_k) )

is minimised in A-norm (only meaningful if A is pos. def.), or that

    ||r_k||_2 = ||A(x - x_k)||_2 = sqrt( r_k^* r_k )

is minimised, i.e. the norm of the residual is minimised.

Optimal approximations (2)

We first look at the minimisation of the error in the A-norm:

    ||x - x_0 - V_k ~y_k||_A

is minimal iff e_k ≡ x - x_0 - V_k ~y_k ⊥_A V_k, or, equivalently, V_k ⊥ A e_k = r_0 - A V_k ~y_k. This yields

    V_k^* A V_k ~y_k = V_k^* r_0.

With T_k ≡ V_k^* A V_k, the k x k upper block of T̲_k, and r_0 = ||r_0||_2 v_1, we get

    T_k ~y_k = ||r_0||_2 e_1,

with e_1 the first canonical basis vector.
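As an illustration of this projected system (a sketch only; it reuses the hypothetical `lanczos` and `build_T_under` helpers from the earlier sketches and assumes A is HPD so that T_k is nonsingular), the A-norm-optimal iterate can be formed explicitly:

```python
import numpy as np

def lanczos_galerkin(A, b, x0, m):
    """Illustrative sketch of the A-norm-optimal (Galerkin) approximation from
    x0 + K_m(A, r0): solve the projected system T_m y = ||r0||_2 e1 explicitly.
    This stores all basis vectors; CG reaches the same iterate with short recurrences."""
    r0 = b - A @ x0
    V, alpha, beta = lanczos(A, r0, m)
    T = build_T_under(alpha, beta)[:m, :]        # k x k upper block T_m
    rhs = np.zeros(m)
    rhs[0] = np.linalg.norm(r0)                  # ||r0||_2 e1
    y = np.linalg.solve(T, rhs)                  # requires T_m nonsingular (A HPD)
    return x0 + (V[:, :m] @ y).real              # real part suffices for real symmetric A
```

The storage of all of V is exactly the problem the next slides remove.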


Optimal approximations (3)

In particular, we have that the residuals are orthogonal to the basis vectors:

    r_k = A e_k = r_0 - A V_k ~y_k ⊥ V_k.

Since the r_k's are orthogonal, each residual is just a multiple of the corresponding basis vector v_{k+1}: r_k and v_{k+1} are collinear. This also means that the residuals form an orthogonal Krylov basis for the Krylov subspace.

Towards a practical algorithm

The main problem in this approach is that all v_i (or r_{i-1}) have to be stored to compute x_k. This problem can be overcome by making an implicit LU-factorisation, T_k = L_k U_k, of T_k, and updating

    x_k = x_0 + V_k U_k^{-1} L_k^{-1} ( ||r_0||_2 e_1 )

in every iteration. (Recall that r_0 = ||r_0||_2 v_1.) Details can be found in the book of Van der Vorst.

With this technique we get the famous and very elegant Conjugate Gradient method.


The Conjugate Gradient method

    r_0 = b - A x_0,  u_{-1} = 0,  ρ_{-1} = 1     % Initialization
    For k = 0, 1, ..., do
        ρ_k = r_k^* r_k,  β_k = ρ_k / ρ_{k-1}
        u_k = r_k + β_k u_{k-1}                   % Update direction vector
        c_k = A u_k
        σ_k = u_k^* c_k,  α_k = ρ_k / σ_k
        x_{k+1} = x_k + α_k u_k                   % Update iterate
        r_{k+1} = r_k - α_k c_k                   % Update residual
    end for

The Conjugate Gradient method (implementation)

    x = x_0,  r = b - A x,  u = 0,  ρ = 1         % Initialization
    For k = 0, 1, ..., k_max do
        ρ' = ρ,  ρ = r^* r,  β = ρ / ρ'
        Stop if ρ ≤ tol
        u ← r + β u                               % Update direction vector
        c = A u
        σ = u^* c,  α = ρ / σ
        x ← x + α u                               % Update iterate
        r ← r - α c                               % Update residual
    end for

Note that, in our notation, the CG α_j and β_j are not the same as the Lanczos α_j and β_j.
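The second, index-free version is essentially what one implements. A minimal NumPy sketch of it (illustrative; the function name, the stopping test and the default parameters are assumptions):

```python
import numpy as np

def cg(A, b, x0, tol=1e-10, kmax=1000):
    """Minimal Conjugate Gradient sketch for Hermitian positive definite A."""
    x = x0.copy()
    r = b - A @ x                      # initial residual
    u = np.zeros_like(b)               # direction vector
    rho = 1.0
    for k in range(kmax):
        rho_old = rho
        rho = np.vdot(r, r).real       # rho = r^* r
        if np.sqrt(rho) <= tol:        # stop on the residual norm
            break
        beta = rho / rho_old           # the first beta is irrelevant since u = 0
        u = r + beta * u               # update direction vector
        c = A @ u
        sigma = np.vdot(u, c).real     # sigma = u^* A u
        alpha = rho / sigma
        x = x + alpha * u              # update iterate
        r = r - alpha * c              # update residual
    return x
```

Only a handful of vectors are kept, in line with the memory property on the next slide.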


Properties of CG

CG has several favourable properties:
• The method uses limited memory: only three vectors need to be stored;
• The method is optimal: the error is minimised in A-norm;
• The method is finite: the (n+1)st residual must be zero since all the residuals are orthogonal;
• The method is robust (if A is HPD): σ_k ≡ u_k^* A u_k = 0 and ρ_k ≡ r_k^* r_k = 0 both imply that the true solution has been found (that r_k = 0);
• u_i^* A u_j = 0 and r_i^* r_j = 0 for i ≠ j.

Lanczos matrix and CG

Since CG and Lanczos are mathematically equivalent, it should be possible to recover the Lanczos matrix T_k from the CG iteration parameters. This is indeed the case:

    T_k = [ 1/α_0              √β_1/α_0                                               ]
          [ √β_1/α_0           1/α_1 + β_1/α_0     √β_2/α_1                            ]
          [                    ...                 ...               ...               ]
          [                                        ...               √β_{k-1}/α_{k-2}  ]
          [                                        √β_{k-1}/α_{k-2}  1/α_{k-1} + β_{k-1}/α_{k-2} ]

Note that the CG β_0 is not in the matrix (β_0 is meaningless anyway since u_{-1} = 0).
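A sketch of this correspondence (illustrative; it assumes the α_j and β_j produced during a CG run have been recorded, which the cg sketch above would need a small modification to do):

```python
import numpy as np

def lanczos_matrix_from_cg(alphas, betas):
    """Assemble the k x k Lanczos tridiagonal from recorded CG coefficients
    alphas = [alpha_0, ..., alpha_{k-1}] and betas = [beta_1, ..., beta_{k-1}]
    (beta_0 is not used), following the formula on the slide."""
    k = len(alphas)
    T = np.zeros((k, k))
    T[0, 0] = 1.0 / alphas[0]
    for j in range(1, k):
        T[j, j] = 1.0 / alphas[j] + betas[j - 1] / alphas[j - 1]
        off = np.sqrt(betas[j - 1]) / alphas[j - 1]
        T[j, j - 1] = off
        T[j - 1, j] = off
    return T
```

The eigenvalues of this matrix are the Ritz values, so a CG run can double as a cheap eigenvalue estimator for A.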

Conjugate Gradient Error Analysis

Since e_k ≡ x - x_k = e_0 - V_k ~y_k with V_k ~y_k ∈ K_k(A, r_0), and r_0 = A e_0, we see that

    e_k = e_0 - γ_1 A e_0 - γ_2 A^2 e_0 - ... - γ_k A^k e_0 = p_k(A) e_0.

So, e_k is a degree k polynomial p_k in A times e_0, with p_k(0) = 1.

CG minimises ||e_k||_A = ||e_0 - V_k ~y_k||_A with V_k ~y_k ∈ K_k(A, r_0). Consequently,

    ||e_k||_A = ||p_k(A) e_0||_A ≤ ||p̃_k(A) e_0||_A

for all degree k polynomials p̃_k with p̃_k(0) = 1. Here we used that p̃_k(A) e_0 is also an error, i.e., of the form p̃_k(A) e_0 = e_0 - V_k ỹ_k.

A convergence bound for CG

The iterates x_k obtained from the CG algorithm satisfy the following inequality:

    ||x - x_k||_A / ||x - x_0||_A ≤ 2 ( (√C_2 - 1) / (√C_2 + 1) )^k ≤ 2 exp( -2k / √C_2 ).

C_2 is the 2-condition number of A, which for HPD matrices is

    C_2 ≡ λ_max(A) / λ_min(A).
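As a small worked illustration of this bound (the function name and the chosen numbers are illustrative, not from the slides):

```python
import numpy as np

def cg_bound(kappa2, k):
    """Upper bound 2 * ((sqrt(kappa2) - 1) / (sqrt(kappa2) + 1))**k on the
    relative A-norm error ||x - x_k||_A / ||x - x_0||_A after k CG steps."""
    s = np.sqrt(kappa2)
    return 2.0 * ((s - 1.0) / (s + 1.0)) ** k

# The bound 2*exp(-2k/sqrt(C_2)) suggests about sqrt(C_2)/2 * ln(2/eps) steps
# for a relative error eps; e.g. C_2 = 1e4 and eps = 1e-8 give roughly 1000 steps:
print(cg_bound(1e4, 1000))    # approximately 4e-9
```

As the next slides stress, this worst-case estimate is often far too pessimistic in practice.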


Proof (sketch)

The error e_k = x - x_k can be written as p_k(A) e_0 with p_k a polynomial such that p_k(0) = 1. Hence, since CG is optimal,

    ||e_k||_A = ||p_k(A) e_0||_A ≤ ||p̃_k(A) e_0||_A   for all p̃_k with p̃_k(0) = 1.

Since, in this Hermitian case, there is an orthonormal basis of eigenvectors of A, we have

    ||p̃_k(A) e_0||_A ≤ max_i |p̃_k(λ_i(A))| · ||e_0||_A.

The convergence bound can now be proved by taking for p̃_k a scaled Chebyshev polynomial that is transformed (shifted and scaled) to the interval [λ_min, λ_max].

Superlinear convergence

The upper bound on the CG error is in practice very pessimistic. Typically the rate of convergence increases during the process. This is called superlinear convergence.

[Figure: residual norm (log scale, 10^2 down to 10^{-12}) versus iteration number (0 to 120).]


Superlinear convergence (2)

Zeros of the CG polynomial p_k are Ritz values (of A of order k, i.e., the eigenvalues of T_k).

Proof. If p_k(θ) = 0, then

    p_k(λ) = (θ - λ) q(λ)   for some degree k-1 polynomial q.

That is,

    e_k = (θ I - A) q(A) e_0   ⇒   r_k = (θ I - A) q(A) r_0.   (*)

Since u ≡ q(A) r_0 = V_k ~z_k for some k-vector ~z_k, we have that

    (θ I - A) u = A e_k = r_k ⊥ V_k   and   u ∈ span(V_k) = K_k(A, r_0),

whence (θ, u) is a Ritz pair. A counting argument now shows that Ritz values are zeros of the CG polynomial.

Superlinear convergence (3)

Superlinear convergence occurs when Ritz values converge to extreme eigenvalues. Then the component in the direction of that eigenvector is found.

Explanation. If, in addition, θ ≈ λ_j, then (*) shows that the eigenvector w_j of A associated with λ_j, i.e., A w_j = λ_j w_j, is (≈) 'deflated' from e_k and from r_k = A e_k. Hence, both the error and the residual have (≈) zero component in the direction of that eigenvector (w_j).

Note. Assume 0 < λ_1 < λ_2 < ... < λ_n. When is θ sufficiently close to λ_1? It can be shown that superlinear convergence is noticeable already if θ < λ_2.


Superlinear convergence (3), continued

Convergence of CG from then on is determined by the reduced spectrum, from which converged eigenvalues have been removed.

Explanation.

    ||p_{k+ℓ}(A) e_0||_A ≤ ||p̃_ℓ(A) ẽ_k||_A = ||p̃_ℓ(Ã) ẽ_k||_Ã,

with the deflated matrix Ã ≡ (I - w_j w_j^* / (w_j^* w_j)) A; here we used that Ã ẽ_k = A ẽ_k.

Superlinear convergence (4)

Convergence towards extreme eigenvalues goes faster if they are isolated (and the other eigenvalues are clustered).

Explanation (Heuristics). Again, consider the case where 0 < λ_1 < λ_2 < ... < λ_n. The shifted power method with A - σ I converges faster towards the eigenvector with eigenvalue λ_1 if the spectral gap (λ_2 - λ_1)/(λ_n - λ_1) is larger, because then, with σ ≡ (λ_1 + λ_n)/2, the reduction factor (λ_2 - σ)/(λ_n - σ) of the shifted power method is smaller. Since K_k(A, r_0) = K_k(A - σ I, r_0), the Lanczos method will do better than the shifted power method: θ_1 will converge faster towards λ_1 if the gap between λ_1 and the other eigenvalues is larger.


CG convergence: Sparse Spectrum

[Figure: the CG polynomial p_10(λ) for λ in [0, 4] (vertical axis from -1 to 1), for a sparse spectrum.]

CG convergence: small eigenvalues

Note that the speed of convergence towards an extreme eigenvalue also depends on the size of the component of the associated eigenvector in the initial r_0.

In the example, b was obtained as b = A x with x = 1 (the vector of all ones) and A diagonal. In practice, x often has moderate components also in the direction of eigenvectors with small eigenvalues. The small eigenvalues lead to small components of r_0 = b (we took x_0 = 0) in the direction of the associated eigenvectors. This explains the relatively slow convergence, in this example, towards the small eigenvalue (as compared to the largest eigenvalue), and it is in line with the steepness of the polynomial near zero.


CG convergence: Dense Spectrum

[Figure: the CG polynomial p_10(λ) for λ in [0, 4] (vertical axis from -1 to 1), for a dense spectrum.]

Superlinear convergence (4), continued

Similar observations hold (with similar arguments) for optimal Krylov methods, such as GMRES, FOM and GCR, for general matrices (though, for general matrices, the situation can be obscured by very skew eigenvectors, i.e., an ill-conditioned basis of eigenvectors).

Minimising the residuals

CG minimises the A-norm of the error. As we have seen before, another way to construct optimal approximations x_k is to minimise the residual, i.e. minimise

    ||A(x - x_k)||_2 = sqrt( r_k^* r_k )

(as in GMRES) over all x_k ∈ x_0 + K_k(A, r_0). Before we solve this minimisation problem, recall the Lanczos relation.

MINimal RESiduals

The problem is: find x_k = x_0 + V_k ~y_k such that ||r_k||_2 is minimal. Now

    r_k = b - A x_k = r_0 - A V_k ~y_k = ||r_0||_2 v_1 - A V_k ~y_k.

Hence,

    ||r_k||_2 = || ||r_0||_2 v_1 - A V_k ~y_k ||_2
              = || ||r_0||_2 V_{k+1} e_1 - V_{k+1} T̲_k ~y_k ||_2
              = || V_{k+1} ( ||r_0||_2 e_1 - T̲_k ~y_k ) ||_2
              = || ||r_0||_2 e_1 - T̲_k ~y_k ||_2.
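The next slides solve this small least-squares problem with a QR factorisation and short recurrences. As a naive illustration of the same idea (a sketch only, reusing the hypothetical `lanczos` and `build_T_under` helpers from earlier; this is not how MINRES is actually implemented):

```python
import numpy as np

def minres_naive(A, b, x0, m):
    """Illustrative sketch: the minimal-residual iterate from x0 + K_m(A, r0),
    obtained by solving || ||r0||_2 e1 - T y ||_2 = min (T the (m+1) x m
    tridiagonal) with a dense least-squares solve."""
    r0 = b - A @ x0
    V, alpha, beta = lanczos(A, r0, m)
    Tu = build_T_under(alpha, beta)          # (m+1) x m
    rhs = np.zeros(m + 1)
    rhs[0] = np.linalg.norm(r0)              # ||r0||_2 e1
    y, *_ = np.linalg.lstsq(Tu, rhs, rcond=None)
    return x0 + (V[:, :m] @ y).real          # real part suffices for real symmetric A
```

A production implementation updates the QR factorisation of the tridiagonal with Givens rotations and keeps only a few vectors, as described next.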


MINRES (2)

Solving the small overdetermined system

    T̲_k ~y_k = ||r_0||_2 e_1

in the least-squares sense provides iterates

    x_k = x_0 + V_k ~y_k

that minimise the residual. The algorithm that results from solving the small system in least-squares sense using the QR-factorisation T̲_k = Q̲_k R_k is called MINRES:

    x_k = x_0 + ( V_k R_k^{-1} ) ( Q̲_k^* ( ||r_0||_2 e_1 ) ).

The placing of the brackets allows short recurrences.

MINRES and GMRES

CG can be viewed as the Hermitian variant of FOM, while MINRES is the Hermitian variant of GMRES. For symmetric (HPD) matrices, CG is mathematically equivalent to FOM (i.e., in exact arithmetic, they have the same residuals in the same steps); MINRES is mathematically equivalent to GMRES.

Advantage. Fast convergence (smallest residuals).

Exploiting symmetry allows implementations with short recurrences: Lanczos combined with (V_k U_k^{-1}) (L_k^{-1} e_1) rather than with V_k ( U_k^{-1} ( L_k^{-1} e_1 ) ).

Advantage. Highly efficient steps, low (fixed) demands on memory.

Disadvantage. More sensitive to perturbations.


Conjugate Residuals

CG has been designed for Hermitian positive definite matrices. In a previous lecture we saw GCR for general matrices. The Hermitian variant is CR; it leads to minimal residuals also if A is Hermitian indefinite.

As GCR is mathematically equivalent to GMRES, CR is mathematically equivalent to MINRES.

(G)CR can break down in the indefinite case, while MINRES is robust. MINRES is (slightly?) more expensive per step (six vector updates versus four for CR) and needs more memory.

Conjugate Residuals (2)

    r_0 = b - A x_0
    u_{-1} = 0,  c_{-1} = 0,  ρ_{-1} = 1          % Initialization
    for k = 0, 1, ..., do
        u_k = r_k,  c_k = A u_k
        ρ_k = u_k^* c_k,  β_k = ρ_k / ρ_{k-1}
        u_k ← u_k + β_k u_{k-1}                   % Update direction vector
        c_k ← c_k + β_k c_{k-1}                   % to avoid extra MVs
        σ_k = c_k^* c_k,  α_k = ρ_k / σ_k
        x_{k+1} = x_k + α_k u_k                   % Update iterate
        r_{k+1} = r_k - α_k c_k                   % Update residual
    end for
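A minimal NumPy sketch of this CR loop (illustrative; the function name, the stopping rule and the defaults are assumptions):

```python
import numpy as np

def cr(A, b, x0, tol=1e-10, kmax=1000):
    """Minimal Conjugate Residual sketch for Hermitian A (also indefinite,
    as long as rho = r^* A r stays nonzero)."""
    x = x0.copy()
    r = b - A @ x
    u = np.zeros_like(b)               # direction vector
    c = np.zeros_like(b)               # c = A u
    rho = 1.0
    for k in range(kmax):
        if np.linalg.norm(r) <= tol:
            break
        rho_old = rho
        w = A @ r                      # the only matrix-vector product per step
        rho = np.vdot(r, w).real       # rho = r^* A r
        beta = rho / rho_old
        u = r + beta * u               # update direction vector
        c = w + beta * c               # c = A u, without an extra MV
        sigma = np.vdot(c, c).real     # sigma = c^* c
        alpha = rho / sigma
        x = x + alpha * u              # update iterate
        r = r - alpha * c              # update residual
    return x
```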


Conjugate Residuals (3)

Like CG, CR has many favourable properties:
• The method uses limited memory: only four vectors need to be stored;
• The method is optimal: the residual is minimised;
• The method is finite: the n-th residual must be zero since it is optimal over the whole space;
• The method is robust if A is HPD; otherwise ρ_k = r_k^* A r_k may be zero for some nonzero r_k.

CR is less popular than CG since minimising the A-norm of the error is often more natural. CG is also slightly cheaper.

SYMMetric LQ

It is natural to try to minimise the true error e_k ≡ x - x_k. In the method SYMMLQ this is achieved by computing approximate solutions of the form

    x_k = x_0 + A V_k ~y_k = x_0 + V_{k+1} T̲_k ~y_k

(A V_k ~y_k rather than V_k ~y_k as before) and minimising

    ||x - x_k||_2 = ||x - x_0 - A V_k ~y_k||_2

with respect to ~y_k, or, equivalently, solving

    (A V_k)^* ( e_0 - A V_k ~y_k ) = 0   ⇔   (A V_k)^* A V_k ~y_k = V_k^* A e_0 = ||r_0||_2 e_1.


SYMMLQ (2)

Using the Lanczos relation we get

    (A V_k)^* A V_k ~y_k = (V_{k+1} T̲_k)^* (V_{k+1} T̲_k) ~y_k = ||r_0||_2 e_1,

or

    T̲_k^* ( T̲_k ~y_k ) = ||r_0||_2 e_1.

This problem is solved with the LQ-factorisation of T̲_k^* (which is obtained by taking the conjugate transpose of the QR-factorisation of T̲_k):

    x_k = x_0 + V_{k+1} T̲_k ~y_k = x_0 + V_{k+1} ( Q̲_k R_k^{-*} ( ||r_0||_2 e_1 ) ).

SYMMLQ is a stable method for solving symmetric indefinite linear systems.

SYMMLQ (3): the naming

If ~y_k solves T̲_k^* (T̲_k ~y_k) = ||r_0||_2 e_1, then ~z_{k+1} ≡ T̲_k ~y_k solves

    T̲_k^* ~z_{k+1} = ||r_0||_2 e_1

in the least-norm sense, i.e., among all solutions, T̲_k ~y_k is the one with smallest 2-norm. This problem is solved with the QR-factorisation of T̲_k, yielding

    x_k = x_0 + V_{k+1} ~z_{k+1}   with   ~z_{k+1} = Q̲_k ( R_k^{-*} ( ||r_0||_2 e_1 ) ).

This explains the naming for this method.

In MINRES, ~y_k is the least-squares solution of T̲_k ~y_k = ||r_0||_2 e_1:

    x_k = x_0 + V_k ~y_k   with   ~y_k = R_k^{-1} ( Q̲_k^* ( ||r_0||_2 e_1 ) ).
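As an illustration of these formulas (a sketch only, reusing the hypothetical `lanczos` and `build_T_under` helpers; it relies on the fact that numpy.linalg.lstsq returns the minimum-norm solution of an underdetermined system, and it is not the actual SYMMLQ recurrence):

```python
import numpy as np

def symmlq_naive(A, b, x0, m):
    """Illustrative sketch of the SYMMLQ construction: x_m = x_0 + V_{m+1} z with
    z the minimum-norm solution of T^* z = ||r0||_2 e1, T the (m+1) x m tridiagonal.
    The real method forms this via an LQ factorisation and short recurrences."""
    r0 = b - A @ x0
    V, alpha, beta = lanczos(A, r0, m)
    Tu = build_T_under(alpha, beta)                          # (m+1) x m
    rhs = np.zeros(m)
    rhs[0] = np.linalg.norm(r0)                              # ||r0||_2 e1 (length m)
    z, *_ = np.linalg.lstsq(Tu.T.conj(), rhs, rcond=None)    # minimum-norm solution
    return x0 + (V @ z).real                                 # real part suffices for real symmetric A
```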


Concluding remarks

Today we discussed Krylov methods for symmetric systems. These methods combine an optimal error reduction with short recurrences, and hence limited memory requirements.

Last week we discussed GMRES, which solves nonsymmetric problems while minimising the norm of the residual over the Krylov subspace. For this you need to store all the basis vectors.

In the next lessons we will investigate methods for nonsymmetric systems that only require a limited number of vectors, similar to the methods we discussed today. However, as we will see, these methods do not minimise an error norm.
