Improving iterative solvers: preconditioning, deflation, numerical software and parallelisation
Gerard Sleijpen and Martin van Gijzen
Delft University of Technology
November 29, 2017

Solving Ax = b, an overview
[Flowchart: decision tree for selecting a Krylov method (CG, MINRES, SYMMLQ, GCR, GMRES, Bi-CGSTAB, BiCGstab(ℓ), IDR, IDRstab), based on whether A = A^*, whether A > 0, whether the problem is ill-conditioned, whether a good (and flexible) preconditioner is available, whether A + A^* is strongly indefinite, and whether A has large imaginary eigenvalues.]


Introduction
We already saw that the performance of iterative methods can be improved by applying a preconditioner. Preconditioners (and deflation techniques) are a key to successful iterative methods. In general they are very problem dependent.
Today we will discuss some standard preconditioners and we will explain the idea behind deflation. We will also discuss some efforts to standardise numerical software. Finally we will discuss how to perform scientific computations on a parallel computer.

Program
• Preconditioning
• Diagonal scaling, Gauss-Seidel, SOR and SSOR
• Incomplete Choleski and Incomplete LU
• Deflation
• Numerical software
• Parallelisation
• Shared memory versus distributed memory
• Domain decomposition


Preconditioning
A preconditioned iterative solver solves the system

  M^{-1}Ax = M^{-1}b.

The matrix M is called the preconditioner.
The preconditioner should satisfy certain requirements:
• Convergence should be (much) faster (in time) for the preconditioned system than for the original system. Normally this means that M is constructed as an "easily invertible" approximation to A. Note that if M = A, any method converges in one iteration.
• Operations with M should be easy to perform ("cheap").
• M should be relatively easy to construct.

Why preconditioners?
[Figure: information propagation through the grid after 1 iteration.]
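As a quick illustration of the remark that M = A gives convergence in one iteration, here is a minimal MATLAB sketch (the Poisson test matrix, right-hand side and tolerance are arbitrary choices, not part of the slides):

  % With M = A the preconditioned matrix is the identity, so PCG converges
  % in a single iteration.
  A = gallery('poisson', 20);                         % SPD 2-D Laplacian, n = 400
  b = ones(size(A,1), 1);
  [x, flag, relres, iter] = pcg(A, b, 1e-10, 50, A);  % preconditioner M = A
  fprintf('iterations: %d, relative residual: %g\n', iter, relres);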


Why preconditioners?
[Figures: information propagation through the grid after 7 iterations and after 21 iterations.]

Why preconditioning? (2)
From the previous pictures it is clear that, in 2-D, we need O(1/h) = O(√n) iterations to move information from one end to the other end of the grid. So, at best it takes O(n^{3/2}) operations to compute the solution with an iterative method.
In order to improve this we need a preconditioner that enables fast propagation of information through the mesh.

Clustering the spectrum
In lecture 7 we saw that CG performs better when the spectrum of A is clustered. Therefore, a good preconditioner clusters the spectrum.
[Figure: eigenvalue distributions in the complex plane.]
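The clustering effect can be observed numerically by comparing the spectrum of A with that of a symmetrically preconditioned matrix. A minimal MATLAB sketch, assuming an incomplete Choleski factor C with A ≈ CC^* (this preconditioner is discussed later in these notes; the test matrix is an arbitrary choice):

  A = gallery('poisson', 10);      % small SPD test matrix, n = 100
  C = ichol(A);                    % incomplete Choleski factor (lower triangular)
  lamA = eig(full(A));             % original spectrum
  lamP = eig(full(C \ A / C'));    % spectrum of C^{-1} A C^{-*}: clustered near 1
  disp([min(lamA) max(lamA); min(lamP) max(lamP)])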


Incorporating a preconditioner
Normally the matrix M^{-1}A is not explicitly formed. The multiplication

  u = M^{-1}Av

can simply be carried out by the two operations

  t = Av               %% MatVec
  solve Mu = t for u   %% MSolve

Incorporating a preconditioner (2)
Preconditioners can be applied in different ways:
• from the left: M^{-1}Ax = M^{-1}b,
• centrally: M = LU; L^{-1}AU^{-1}y = L^{-1}b; x = U^{-1}y,
• or from the right: AM^{-1}y = b; x = M^{-1}y.
Some methods allow implicit preconditioning. Left, central and right preconditioning is explicit.
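In MATLAB the two operations look as follows (a small sketch; the diagonal preconditioner and the test matrix are arbitrary choices, and backslash stands in for MSolve):

  A = gallery('poisson', 10);  v = rand(size(A,1), 1);
  M = spdiags(diag(A), 0, size(A,1), size(A,1));   % e.g. a diagonal preconditioner
  t = A * v;                                       %% MatVec
  u = M \ t;                                       %% MSolve: solve Mu = t for u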


Incorporating a preconditioner (3)
All forms of preconditioning lead to the same spectrum. Yet there are differences:
• Left preconditioning is most natural: no extra step is required to compute x.
• Central preconditioning allows to preserve symmetry (if U = L^*).
• Right preconditioning does not affect the residual norm.
• Implicit preconditioning does not affect the residual norm, and no extra steps are required to compute x. Unfortunately, not all methods allow implicit preconditioning. If possible, adaptation of the code of the iterative solver is required.
• Explicit preconditioning allows optimisation of the combination of MSolve and MatVec, as in the Eisenstat trick.

CG with central preconditioning

  x = Ux0,  r = L^{-1}(b − Ax0),  u = 0,  ρ = 1   %% Initialization
  while ||r|| > tol do
     σ = ρ,  ρ = r^*r,  β = −ρ/σ
     u ← r − β u,   c = (L^{-1}AU^{-1})u
     σ = u^*c,  α = ρ/σ
     r ← r − α c                                   %% update residual
     x ← x + α u                                   %% update iterate
  end while
  x ← U^{-1}x

Here, M = LU with U = L^*. Note that M is positive definite.
If M ≡ (L̃ + D)D^{-1}(L̃ + D)^* (with L̃ the strictly lower triangular part of A and D = diag(A)), take L ≡ (L̃ + D)D^{-1/2}.


CG with implicit preconditioning

  x = x0,  r = b − Ax,  u = 0,  ρ = 1   %% Initialization
  while ||r|| > tol do
     c = M^{-1}r
     σ = ρ,  ρ = c^*r,  β = −ρ/σ
     u ← c − β u,   c = Au
     σ = u^*c,  α = ρ/σ
     r ← r − α c                        %% update residual
     x ← x + α u                        %% update iterate
  end while

M should be Hermitian. Matlab's PCG uses implicit preconditioning. Matlab takes M1, M2 (i.e., M = M1·M2); for instance, M1 = L̃D^{-1} + I and M2 = (L̃ + D)^*, with L̃ the strictly lower triangular part of A and D = diag(A).

Incorporating a preconditioner (4)
Explicit preconditioning requires
• preprocessing (to solve Mb′ = b for b′ in left preconditioning, and x0′ = Mx0 in right preconditioning),
• adjusting MatVec: adjustment of the code for the operation that does the matrix-vector multiplication (to produce, say, c = M^{-1}Au),
• postprocessing (to solve Mx = x′ for x in right preconditioning).
Implicit preconditioning requires adaptation of the code of the iterative solver.
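The remark on Matlab's PCG can be tried directly. A minimal sketch that builds M1 and M2 as above, so that M = M1·M2 = (L̃ + D)D^{-1}(L̃ + D)^* (the test matrix and tolerance are arbitrary choices):

  A  = gallery('poisson', 20);  b = ones(size(A,1), 1);  n = size(A,1);
  D  = spdiags(diag(A), 0, n, n);
  Lt = tril(A, -1);                        % strictly lower triangular part of A
  M1 = Lt / D + speye(n);                  % M1 = L~ D^{-1} + I
  M2 = (Lt + D)';                          % M2 = (L~ + D)^*
  [x, flag, relres, iter] = pcg(A, b, 1e-8, 200, M1, M2);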


Preconditioning in MATLAB (1)
Example. L̃ ≡ L + D, where A = L + D + L^* is p.d.
Ax = b,  M ≡ L̃D^{-1}L̃^*:  solve  (D^{1/2}L̃^{-1}AL̃^{-*}D^{1/2}) y = D^{1/2}L̃^{-1}b.

  function [PreMV,pre_b,PostProcess,pre_x0] = preprocess(A,b,x0)
    L = tril(A); D = diag(sqrt(diag(A))); L = L/D; U = L';
    function c = MyPreMV(u)
      c = L\(A*(U\u));
    end
    PreMV = @MyPreMV;
    pre_b = L\b; pre_x0 = U*x0;
    function x = MyPostProcess(y)
      x = U\y;
    end
    PostProcess = @MyPostProcess;
  end

Here, A is a sparse matrix. Note that L, U, D are then also sparse.

Preconditioning in MATLAB (2)
Example. L̃ ≡ L + D, where A = L + D + L^* is p.d.
Ax = b,  M ≡ L̃D^{-1}L̃^*:  solve  (D^{1/2}L̃^{-1}AL̃^{-*}D^{1/2}) y = D^{1/2}L̃^{-1}b.

  function x = CG(A,b,tol,kmax,x0)
    ...
    c = A*u;
    ...
  end

In Matlab's command window, replace
  > x = CG(A,b,10^(-8),100,x0);
by
  > [pMV,pb,postP,px0] = preprocess(A,b,x0);
  > y = CG(pMV,pb,10^(-8),100,px0);
  > x = postP(y);



Preconditioning in MATLAB (3)
Example. L̃ ≡ L + D, where A = L + D + L^* is p.d.
Ax = b,  M ≡ L̃D^{-1}L̃^*:  solve  (D^{1/2}L̃^{-1}AL̃^{-*}D^{1/2}) y = D^{1/2}L̃^{-1}b.

The CG code should take both functions and matrices:

  function x = CG(A,b,tol,kmax,x0)
    A_is_matrix = isnumeric(A);
    function c = MV(u)
      if A_is_matrix, c = A*u; else, c = A(u); end
    end
    ...
    c = MV(u);
    ...
  end

Preconditioning in MATLAB (4)
Example. L̃ ≡ L + D, where A = L + D + L^* is p.d.
Ax = b,  M ≡ L̃D^{-1}L̃^*, implicit preconditioning.

  function MSolve = SetPrecond(A)
    L = tril(A); U = L'; D = diag(diag(A));
    function c = MyMSolve(u)
      c = U\(D*(L\u));
    end
    MSolve = @MyMSolve;
  end

  function x = PCG(A,b,tol,kmax,MSolve,x0)
    ...
    c = MSolve(r);
    ...
  end


Diagonal scaling
Diagonal scaling or Jacobi preconditioning uses M = diag(A) as preconditioner. Clearly, this preconditioner does not enable fast propagation through a grid. On the other hand, operations with diag(A) are very easy to perform, and diagonal scaling can be useful as a first step, in combination with other techniques.
If the choice for M is restricted to diagonal matrices, then M = diag(A) minimizes the condition number C(M^{-1}A) in some sense.

Gauss-Seidel, SOR and SSOR
The Gauss-Seidel preconditioner is defined by

  M = L + D

with L the strictly lower-triangular part of A and D = diag(A). By introducing a relaxation parameter ω (using D/ω instead of D), we get the SOR preconditioner.
For symmetric problems it is wise to take a symmetric preconditioner. A symmetric variant of Gauss-Seidel is

  M = (L + D)D^{-1}(L + D)^*.

By introducing a relaxation parameter ω (i.e., by using D/ω instead of D) we get the so-called SSOR preconditioner.
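A small MATLAB sketch of these preconditioners in action (the test matrix is an arbitrary SPD example, ω = 1 is used, i.e. symmetric Gauss-Seidel, and the factors are passed to PCG as M1, M2 with M = M1·M2):

  A = gallery('poisson', 20);  b = ones(size(A,1), 1);  n = size(A,1);
  D = spdiags(diag(A), 0, n, n);
  L = tril(A, -1);                              % strictly lower triangular part of A

  % Jacobi / diagonal scaling: M = diag(A)
  [~, ~, ~, itJac] = pcg(A, b, 1e-8, 500, D);

  % Symmetric Gauss-Seidel (SSOR with omega = 1): M = (L+D) D^{-1} (L+D)'
  M1 = L + D;  M2 = D \ (L + D)';
  [~, ~, ~, itSGS] = pcg(A, b, 1e-8, 500, M1, M2);

  fprintf('iterations: Jacobi %d, symmetric Gauss-Seidel %d\n', itJac, itSGS);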



ILU-preconditioners
ILU-preconditioners are the most popular 'black box' preconditioners. They are constructed by making a standard LU-decomposition

  A = LU.

However, during the elimination process some nonzero entries in the factors are discarded (i.e., replaced by 0). This leads to M = LU with L and U the "incomplete" L- and U-factors.
Discarding nonzero entries can be done on the basis of two criteria:
• Sparsity pattern: e.g., an entry in a factor is only kept if it corresponds to a nonzero entry in A;
• Size: small entries in the decomposition are dropped.

Sparsity patterns, fill
[Figure: sparsity pattern of A; nz = 184.]


Sparsity patterns, fill
[Figure: sparsity pattern of the exact LU-factors; nz = 352.]

Sparsity patterns, ILU(0)
[Figure: fill pattern, fill of level 0; nz = 184.]



Sparsity patterns, ILU(1)
[Figure: fill pattern, fill of level 1; nz = 244.]

Sparsity patterns, ILU(2)
[Figure: fill pattern, fill of level 2; nz = 294.]


ILU-preconditioners (2)
Modified ILU-preconditioners are constructed by making an ILU-decomposition

  M = LU.

However, during the elimination process, the nonzero entries in the factors that are discarded in the ILU construction are now added to the diagonal entry of U. Then A·1 = M·1, with 1 the vector of all ones.
The idea here is that both A and M have approximately the same effect when acting on "smooth" functions (vectors).

ILU-preconditioners (3)
The number of nonzero entries that is maintained in the LU-factors is normally of the order of the number of nonzeros in A. This implies that operations with the ILU-preconditioner are approximately as costly as multiplications with A.
For A symmetric positive definite a special variant of ILU exists, called Incomplete Choleski. This preconditioner is based on the Choleski decomposition A = CC^*.
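MATLAB provides these incomplete factorisations directly; a minimal sketch of using ILU(0) with GMRES and Incomplete Choleski with PCG (the test matrix, tolerances and iteration limits are arbitrary choices):

  A = gallery('poisson', 20);  b = ones(size(A,1), 1);   % SPD test problem

  [L, U] = ilu(A, struct('type', 'nofill'));   % ILU(0): no fill outside the pattern of A
  x1 = gmres(A, b, [], 1e-8, 100, L, U);       % preconditioned GMRES

  C  = ichol(A);                               % Incomplete Choleski: A ~ C*C'
  x2 = pcg(A, b, 1e-8, 100, C, C');            % preconditioned CG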



Deflation
There are two popular ways to improve the convergence of Krylov solvers.
• Preconditioning: improve the distribution of the eigenvalues (cluster eigenvalues).
• Deflation: remove eigenvectors with eigenvalues that are small in absolute value.
Actually, small eigenvalues are mapped to zero, and approximate solutions are constructed in the orthogonal complement of the kernel of the deflated matrix.

Deflation (2)
U is an n × s matrix (typically with s small), V ≡ AU. Suppose the interaction matrix µ ≡ U^*AU = U^*V is non-singular. Consider the skew projections

  Π₁ ≡ I − Vµ^{-1}U^*   and   Π₀ ≡ I − Uµ^{-1}U^*A.

Note. Π₁y ⊥ U for all n-vectors y, Π₁AU = 0, Π₁A = AΠ₀.

Theorem. Put A′ ≡ Π₁A and b′ ≡ Π₁b. If x′ solves A′x′ = b′ for x′ ⊥ U, then x ≡ Π₀x′ + Uµ^{-1}U^*b solves Ax = b.

Exercise. If A is Hermitian, then A′ is Hermitian.
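A minimal MATLAB sketch of this construction. The choice of U (eigenvectors computed with eigs), the test matrix and the use of GMRES for the deflated system are illustrative assumptions, not prescribed by the slide:

  A = gallery('poisson', 20);  n = size(A,1);  b = ones(n, 1);
  s = 4;
  [U, ~] = eigs(A, s, 'smallestabs');   % columns: eigenvectors with smallest eigenvalues

  V  = A*U;  mu = U'*V;                 % interaction matrix, assumed non-singular
  P1 = @(y) y - V*(mu\(U'*y));          % Pi_1 = I - V mu^{-1} U^*
  P0 = @(y) y - U*(mu\(U'*(A*y)));      % Pi_0 = I - U mu^{-1} U^* A

  xp = gmres(@(y) P1(A*y), P1(b), [], 1e-10, 200);  % solve A'x' = b', x' orthogonal to U
  x  = P0(xp) + U*(mu\(U'*b));          % recover the solution of Ax = b (theorem above)
  norm(A*x - b)                         % check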


Deflation (3)
The deflated map A′ maps U^⊥ to U^⊥. In particular, Krylov subspaces generated by A′ and b′ are orthogonal to U, and Krylov subspace solvers find approximate solutions in U^⊥.
If the space spanned by the columns of U (approximately) contains the eigenvectors associated with the s absolute smallest eigenvalues, then the deflated map A′ has (approximately) the same eigenvalues as A, except for the s absolute smallest ones, which are replaced by 0.
When A′ is considered as a map from U^⊥ to U^⊥, then these 0 eigenvalues are removed (then A′ has only n − s eigenvalues).

Deflation, examples
• Multiple right-hand-side systems or block systems. U = [u₁, ..., u_s], where the u_j are the Ritz vectors with absolute smallest Ritz values obtained from solving Ax = b₁ with GMRES. Then U can be used to solve Ax = b₂, ....
• GCRO: Local Minimal Residual iteration where the update vector u_k in step k is obtained by solving A′u′ = r_k for u′ ⊥ U_{k−1} with ℓ steps of GMRES. In the next step U_k ≡ [U_{k−1}, u_k].
• Multi-grid: U^* is the matrix that represents the restriction to a coarse grid, U is the prolongation, and µ^{-1} is the coarse grid correction. s is not really small in this application (typically s = n/4).



Deflation, examples (2)
In many applications (typically those from CFD), eigenvectors with absolute small eigenvalues correspond to smooth (non- or mildly oscillating) functions on the computational grid. Then the columns of U are taken to be vectors corresponding to smooth functions, such as the function that is 1 everywhere, plus functions that are 1 on part of the grid and 0 elsewhere. The part of the grid where the function is 0 can be a part where the coefficients in the PDE are (small) constant.
The idea here is that the 'hampering' eigenvectors are not known, but they will be close to the space spanned by the U-vectors. Deflation of this type is typically applied in case of an isolated cluster of absolute small eigenvalues.

Numerical software
To promote the use of good quality software, several efforts have been made in the past. We mention:
• Eispack: Fortran 66 package for eigenvalue computations.
• Linpack: F77 package for dense or banded linear systems.
• BLAS: standardisation of basic linear algebra operations. Available in several languages.
• LAPACK: F77 package for dense eigenvalue problems and linear systems. Builds on BLAS and has replaced Eispack and Linpack. F90 and C++ versions also exist.
The above software can be downloaded from http://www.netlib.org.


Basic Linear Algebra Subroutines
The BLAS libraries provide a standard for basic linear algebra subroutines. The BLAS library is available in optimised form on many (super)computers. Three versions are available:
• BLAS 1: vector-vector operations;
• BLAS 2: matrix-vector operations;
• BLAS 3: matrix-matrix operations.

BLAS 1
Examples of BLAS 1 routines are:
• Vector update: daxpy
• Inner product: ddot
• Vector scaling: dscal



Cache memory
Computers normally have a small, fast memory close to the processor, the so-called cache memory. For optimal performance, data in the cache should be reused as much as possible.
Level 1 BLAS routines are for this reason not very efficient: for every number that is loaded into the cache, only one calculation is made.

BLAS 2 and 3
An example of a BLAS 2 routine is the matrix-vector multiplication routine dgemv. An example of a BLAS 3 routine is the matrix-matrix multiplication routine dgemm.
BLAS 2 and in particular BLAS 3 routines make much better use of the cache than BLAS 1 routines.


Parallel computing
Modern supercomputers may contain many thousands of processors. Another popular type of parallel computer is the cluster of workstations connected via a communication network.
Parallel computing poses special restrictions on the numerical algorithms. In particular, algorithms that rely on recursions are difficult to parallelise.

Shared versus distributed memory
An important distinction in computer architecture is the memory organisation.
Shared memory machines have a single address space: all processors read from and write to the same memory. This type of machine can be parallelised using fine-grain, loop-level parallelisation techniques.
On distributed memory computers every processor has its own local memory. Data is exchanged via a message passing mechanism. Parallelisation should be done using a coarse-grain approach. This is often achieved by making a domain decomposition.



Domain decomposition
A standard way to parallelise a grid-based computation is to split the domain into p subdomains, and to map each subdomain on a processor.
[Figure: simple mesh, split over subdomains.]

Domain decomposition (2)
The domain decomposition decomposes the system Ax = b into blocks. For two subdomains one obtains

  Ax = [ A11  A12 ] [ x1 ]  =  [ b1 ]
       [ A21  A22 ] [ x2 ]     [ b2 ],

where x1 and x2 represent the subdomain unknowns, A11 and A22 the subdomain discretization matrices, and A12 and A21 the coupling matrices between subdomains.

Parallel matrix-vector multiplication
For the matrix-vector multiplication (MV, MatVec) u = Av the operations

  u1 = A11·v1 + A12·v2   and   u2 = A22·v2 + A21·v1

can be performed in parallel. However, processor 1 has to send (part of) v1 to processor 2 before the computation, and processor 2 (part of) v2 to processor 1.
This kind of communication is local: only information from neighbouring subdomains is needed.

Parallel vector operations
Inner products (DOT) u^*v are calculated by computing the local inner products u1^*v1 and u2^*v2. The final result is obtained by adding all local inner products. This requires global communication.
Vector updates (AXPY) x ← x + αy can be performed locally, without any communication.
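The block form of the product can be mimicked in a few MATLAB lines (a sketch on a single machine; in an actual distributed code, processor 1 owns the rows in I1 and processor 2 the rows in I2, and v2 (resp. v1) must be communicated first):

  A = gallery('poisson', 10);  n = size(A,1);  v = rand(n, 1);
  I1 = 1:n/2;  I2 = n/2+1:n;                     % two contiguous subdomains
  u1 = A(I1,I1)*v(I1) + A(I1,I2)*v(I2);          % computed on processor 1
  u2 = A(I2,I2)*v(I2) + A(I2,I1)*v(I1);          % computed on processor 2
  norm([u1; u2] - A*v)                           % identical to the global product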



Domain decomposition preconditioners
It is a natural idea to solve a linear system Ax = b by solving the subdomain problems independently and to iterate to correct for the error.
This idea has given rise to the family of domain decomposition preconditioners. The theory for domain decomposition preconditioners is vast; here we only discuss some important ideas.

Block-Jacobi preconditioner
A simple domain decomposition preconditioner is defined by

  M = [ A11   0  ]
      [  0   A22 ]

(Block-Jacobi preconditioner) or, in general, by

  M = [ A11           ]
      [      ...      ]
      [           AMM ]

The subdomain systems can be solved in parallel.
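A minimal MATLAB sketch of a block-Jacobi preconditioner (contiguous index blocks and the Poisson test matrix are arbitrary choices; here PCG solves with M via backslash, i.e. the subdomain systems are solved exactly):

  A = gallery('poisson', 20);  n = size(A,1);  b = ones(n, 1);
  p = 4;  m = n/p;                         % p subdomains of equal size
  M = sparse(n, n);
  for j = 1:p
    I = (j-1)*m+1 : j*m;
    M(I,I) = A(I,I);                       % keep only the diagonal blocks of A
  end
  [x, flag, relres, iter] = pcg(A, b, 1e-8, 500, M);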


Solution of subdomain problems
There are several ways to solve the subdomain problems:
• Exact solves. This is in general (too) expensive.
• Inexact solves using an incomplete decomposition (block-ILU).
• Inexact solves using an iterative method to solve the subproblems. Since in this case the preconditioner is variable, the outer iteration should be flexible, for example GCR.

On the scalability of block-Jacobi
Without special techniques, the number of iterations increases with the number of subdomains. The method is not scalable.
To overcome this problem, techniques can be applied to enable the exchange of information between subdomains.



Improving the scalability
Two popular techniques to improve the flow of information are:
• Use an overlap between subdomains. Of course, one has to ensure that the value of unknowns in gridpoints that belong to multiple domains is unique.
• Use a coarse grid correction. The solution of the coarse grid problem is added to the subdomain solutions. This idea is closely related to multigrid, since the coarse-grid solution is the non-local, smooth part of the solution that cannot be represented on a single subdomain.

Concluding remarks
Preconditioners are the key to a successful iterative method. Today we saw some of the most important preconditioners. The 'best' preconditioner, however, depends completely on the problem.

