Iterative Methods for Linear Systems
Iterative Methods for Linear Systems
JASS 2009
Student: Rishi Patil
Advisor: Prof. Thomas Huckle

Outline
●Basics: matrices and their properties; eigenvalues, condition number
●Iterative methods: direct and indirect methods
●Krylov subspace methods
  Ritz-Galerkin approach: CG
  Minimum residual approach: GMRES/MINRES
  Petrov-Galerkin approach: BiCG, QMR, CGS

Basics
●Linear system of equations: Ax = b
●A Hermitian matrix (or self-adjoint matrix) is a square matrix with complex entries that is equal to its own conjugate transpose; that is, the element in the i-th row and j-th column equals the complex conjugate of the element in the j-th row and i-th column, for all indices i and j. For example:
  A = [ 3       2 + i ]
      [ 2 − i   1     ]
●A is symmetric if a_ij = a_ji.
●A is positive definite if, for every nonzero vector x, x^T A x > 0.
●Quadratic form:
  f(x) = (1/2) x^T A x − b^T x + c
●Gradient of the quadratic form:
  f'(x) = [ ∂f/∂x_1, …, ∂f/∂x_n ]^T = (1/2) A^T x + (1/2) A x − b,
  which for symmetric A reduces to f'(x) = Ax − b.

Various quadratic forms
(Figures: the quadratic form f(x) for a positive-definite matrix, a negative-definite matrix, a singular positive-indefinite matrix, and an indefinite matrix.)

Eigenvalues and Eigenvectors
For any n×n matrix A, a scalar λ and a nonzero vector v that satisfy Av = λv are said to be an eigenvalue and an eigenvector of A.
●If the matrix is symmetric, then the following properties hold:
  (a) the eigenvalues of A are real
  (b) eigenvectors associated with distinct eigenvalues are orthogonal
●The matrix A is positive definite (or positive semidefinite) if and only if all eigenvalues of A are positive (or nonnegative).

Eigenvalues and Eigenvectors
Why should we care about the eigenvalues? Iterative methods often depend on applying A to a vector over and over again:
  (a) If |λ| < 1, then A^i v = λ^i v vanishes as i approaches infinity.
  (b) If |λ| > 1, then A^i v = λ^i v grows to infinity.

Some more terms
●Spectral radius of a matrix: ρ(A) = max_i |λ_i|
●Condition number (ratio of largest to smallest eigenvalue, for symmetric positive-definite A): κ = λ_max / λ_min
●Error: e = x_exact − x_app
●Residual: r = b − A x_app

Preconditioning
Preconditioning is a technique for improving the condition number of a matrix. Suppose that M is a symmetric, positive-definite matrix that approximates A but is easier to invert. We can solve Ax = b indirectly by solving
  M^(-1) A x = M^(-1) b
Types of preconditioners:
●Perfect preconditioner M = A: condition number 1 and the solution in one iteration, but solving Mx = b is then as hard as the original problem, so it is not a useful preconditioner.
●Diagonal preconditioner: trivial to invert, but mediocre.
●Incomplete Cholesky: A ≈ LL^T; not always stable.
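To make the last two slides concrete, here is a minimal numpy sketch (not from the original slides) that computes the condition number κ = λ_max / λ_min of a small symmetric positive-definite matrix and shows the effect of a diagonal preconditioner; the example matrix and the helper name spd_condition_number are illustrative choices of mine.

    import numpy as np

    # Example SPD matrix (chosen only for illustration, not from the slides).
    A = np.array([[4.0, 1.0, 0.0],
                  [1.0, 3.0, 1.0],
                  [0.0, 1.0, 2.0]])

    def spd_condition_number(B):
        # kappa = lambda_max / lambda_min for a symmetric positive-definite B.
        eigvals = np.linalg.eigvalsh(B)   # eigenvalues in ascending order
        return eigvals[-1] / eigvals[0]

    print("kappa(A)      =", spd_condition_number(A))

    # Diagonal preconditioner M = diag(A): trivial to invert. To keep symmetry
    # we form M^(-1/2) A M^(-1/2), which has the same eigenvalues as M^(-1) A.
    M_inv_sqrt = np.diag(1.0 / np.sqrt(np.diag(A)))
    A_prec = M_inv_sqrt @ A @ M_inv_sqrt
    print("kappa(M^-1 A) =", spd_condition_number(A_prec))

For this particular matrix the improvement is modest, which is in line with the slide's verdict that the diagonal preconditioner is cheap but mediocre.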
Stationary and non-stationary methods
●Stationary methods for Ax = b:
  x^(k+1) = R x^(k) + c
  where neither R nor c depends upon the iteration counter k.
●Splitting of A: A = M − K with nonsingular M. Then
  Ax = Mx − Kx = b
  x = M^(-1) K x + M^(-1) b = R x + c
●Examples:
  ● Jacobi method
  ● Gauss-Seidel
  ● Successive over-relaxation (SOR)

Jacobi Method
●Splitting for the Jacobi method: M = D and K = L + U, so
  x^(k+1) = D^(-1) ((L + U) x^(k) + b)
●Solve for x_i from equation i, assuming the other entries are fixed. For the model problem (five-point stencil on a grid):
  for i = 1 to n
    for j = 1 to n
      u_{i,j}^(k+1) = (u_{i-1,j}^(k) + u_{i+1,j}^(k) + u_{i,j-1}^(k) + u_{i,j+1}^(k)) / 4

Gauss-Seidel Method and SOR (Successive Over-Relaxation)
●Splitting for the Gauss-Seidel method: M = D − L and K = U, so
  x^(k+1) = (D − L)^(-1) (U x^(k) + b)
●While looping over the equations, use the most recent values x_i:
  for i = 1 to n
    for j = 1 to n
      u_{i,j}^(k+1) = (u_{i-1,j}^(k+1) + u_{i+1,j}^(k) + u_{i,j-1}^(k+1) + u_{i,j+1}^(k)) / 4
●SOR relaxes the Gauss-Seidel update x̄_i^(k+1) with a parameter ω:
  x_i^(k+1) = ω x̄_i^(k+1) + (1 − ω) x_i^(k)
or, in matrix form,
  x^(k+1) = (D − ωL)^(-1) (ωU + (1 − ω) D) x^(k) + ω (D − ωL)^(-1) b

Stationary and non-stationary methods
●Non-stationary methods:
  ● The constants are computed by taking inner products of residuals or other vectors arising from the iterative method.
  ● Examples:
    ● Conjugate Gradient (CG)
    ● Minimum Residual (MINRES)
    ● Generalized Minimal Residual (GMRES)
    ● BiConjugate Gradient (BiCG)
    ● Quasi-Minimal Residual (QMR)
    ● Conjugate Gradient Squared (CGS)

Descent Algorithms
Fundamental underlying structure for almost all descent algorithms:
● Start with an initial point.
● Determine, according to a fixed rule, a direction of movement.
● Move in that direction to a relative minimum of the objective function.
● At the new point, determine a new direction and repeat the process.
● The difference between algorithms lies in the rule by which successive directions of movement are selected.

The Method of Steepest Descent
● In the method of steepest descent, one starts with an arbitrary point x(0) and takes a series of steps x(1), x(2), … until we are satisfied that we are close enough to the solution.
● When taking a step, one chooses the direction in which f decreases most quickly, i.e.
  −f'(x(i)) = b − A x(i)
● Definitions:
  error vector: e(i) = x(i) − x
  residual: r(i) = b − A x(i)
● From Ax = b, it follows that r(i) = −A e(i) = −f'(x(i)): the residual is the direction of steepest descent.

The Method of Steepest Descent
(Figure: starting at (−2, −2), take steps in the direction of steepest descent of f and find the point along that direction that minimizes f; the parabola is the intersection of the surface of f with the vertical plane through the search line, and the gradient at its bottommost point is orthogonal to the gradient of the previous step.)

The Method of Steepest Descent
● The algorithm (a Python sketch follows below):
  r(i) = b − A x(i)
  α(i) = (r(i)^T r(i)) / (r(i)^T A r(i))
  x(i+1) = x(i) + α(i) r(i)   ⇒   e(i+1) = e(i) + α(i) r(i)
  Two matrix-vector multiplications per iteration are required.
● To avoid one matrix-vector multiplication, one uses the recurrence
  r(i+1) = r(i) − α(i) A r(i)
  The disadvantage of this recurrence is that the residual sequence is determined without any feedback from the value of x(i), so round-off errors may cause x(i) to converge to some point near x.

Steepest Descent Problem
●The gradient at the minimum of a line search is orthogonal to the direction of that search ⇒ the steepest descent algorithm tends to make right-angle turns, taking many steps down a long, narrow potential well. Too many steps to reach a simple minimum.
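A minimal numpy sketch of the steepest descent iteration described above, using α(i) = r(i)^T r(i) / (r(i)^T A r(i)) and the residual recurrence r(i+1) = r(i) − α(i) A r(i). The tolerance, iteration cap, the periodic exact recomputation of r = b − Ax (to limit the round-off drift mentioned on the algorithm slide), and the small test system are illustrative choices of mine, not from the slides.

    import numpy as np

    def steepest_descent(A, b, x0=None, tol=1e-8, max_iter=10_000):
        # Solve Ax = b for symmetric positive-definite A by steepest descent.
        x = np.zeros_like(b, dtype=float) if x0 is None else x0.astype(float).copy()
        r = b - A @ x                      # residual = direction of steepest descent
        bnorm = np.linalg.norm(b)
        for i in range(max_iter):
            if np.linalg.norm(r) <= tol * max(bnorm, 1e-30):
                break
            Ar = A @ r                     # one matrix-vector product per step
            alpha = (r @ r) / (r @ Ar)     # alpha_(i) = r^T r / (r^T A r)
            x = x + alpha * r              # x_(i+1) = x_(i) + alpha_(i) r_(i)
            if (i + 1) % 50 == 0:
                r = b - A @ x              # recompute exactly now and then (round-off)
            else:
                r = r - alpha * Ar         # r_(i+1) = r_(i) - alpha_(i) A r_(i)
        return x

    # Small SPD example system (illustrative; exact solution is [2, -2]).
    A = np.array([[3.0, 2.0], [2.0, 6.0]])
    b = np.array([2.0, -8.0])
    print(steepest_descent(A, b))          # approaches [2, -2]

On ill-conditioned systems this loop exhibits exactly the zig-zag behaviour the "Steepest Descent Problem" slide describes: each step is orthogonal to the previous one, so many iterations are needed along a long, narrow valley.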
The Method of Conjugate Directions
Basic idea:
● Pick a set of orthogonal search directions d(0), d(1), …, d(n-1).
● Take exactly one step in each search direction to line up with x.
● The solution will be reached in n steps.
Mathematical formulation:
1. For each step we choose a point
   x(i+1) = x(i) + α(i) d(i)
2. To find α(i), we use the fact that e(i+1) is orthogonal to d(i).

The Method of Conjugate Directions
● To solve the problem of not knowing e(i), one makes the search directions A-orthogonal rather than orthogonal to each other, i.e.:
  d(i)^T A d(j) = 0

The Method of Conjugate Directions
● The new requirement is now that e(i+1) be A-orthogonal to d(i):
  d/dα f(x(i+1)) = f'(x(i+1))^T (d/dα x(i+1)) = 0
  ⇒ r(i+1)^T d(i) = 0
  ⇒ d(i)^T A e(i+1) = 0
  d(i)^T A (e(i) + α(i) d(i)) = 0
  α(i) = (d(i)^T r(i)) / (d(i)^T A d(i))
If the search vectors were the residuals, this formula would be identical to the method of steepest descent.

The Method of Conjugate Directions
● Calculation of the A-orthogonal search directions by a conjugate Gram-Schmidt process:
  1. Take a set of n linearly independent vectors u_0, u_1, …, u_{n-1}.
  2. Set d(0) = u_0.
  3. For i > 0, take u_i and subtract from it all components that are not A-orthogonal to the previous search directions:
     d(i) = u_i + Σ_{j=0}^{i-1} β_ij d(j),   where β_ij = − (u_i^T A d(j)) / (d(j)^T A d(j))

The Method of Conjugate Directions
● The method of Conjugate Gradients is simply the method of conjugate directions where the search directions are constructed by conjugation of the residuals, i.e. u_i = r(i).
● This allows us to simplify the calculation of the new search direction, because
  β_ij = (1/α(i-1)) · (r(i)^T r(i)) / (d(i-1)^T A d(i-1)) = (r(i)^T r(i)) / (r(i-1)^T r(i-1))   for i = j + 1,
  β_ij = 0   for i > j + 1.
● The new search direction is determined as a linear combination of the previous search direction and the new residual:
  d(i+1) = r(i+1) + β_i d(i)

The Method of Conjugate Directions
  x(0) = 0, r(0) = b, d(0) = r(0)
  for k = 1, 2, 3, … (see the sketch below)
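The listing above stops after the initialization. Below is a minimal numpy sketch of the standard conjugate gradient iteration that this initialization leads into, using α(k) = r(k)^T r(k) / (d(k)^T A d(k)), the residual update r(k+1) = r(k) − α(k) A d(k), β(k) = r(k+1)^T r(k+1) / (r(k)^T r(k)), and d(k+1) = r(k+1) + β(k) d(k). The tolerance, the iteration cap, and the small test system are illustrative choices, not taken from the slides.

    import numpy as np

    def conjugate_gradient(A, b, tol=1e-10, max_iter=None):
        # Solve Ax = b for symmetric positive-definite A, starting from x(0) = 0.
        n = b.shape[0]
        max_iter = n if max_iter is None else max_iter   # <= n steps in exact arithmetic
        x = np.zeros_like(b, dtype=float)
        r = b.astype(float).copy()        # r(0) = b - A x(0) = b
        d = r.copy()                      # d(0) = r(0)
        rs_old = r @ r
        bnorm = np.linalg.norm(b)
        for k in range(max_iter):
            Ad = A @ d
            alpha = rs_old / (d @ Ad)     # alpha_(k) = r^T r / (d^T A d)
            x = x + alpha * d
            r = r - alpha * Ad            # r_(k+1) = r_(k) - alpha_(k) A d_(k)
            rs_new = r @ r
            if np.sqrt(rs_new) <= tol * max(bnorm, 1e-30):
                break
            beta = rs_new / rs_old        # beta_(k) = r_(k+1)^T r_(k+1) / (r_(k)^T r_(k))
            d = r + beta * d              # d_(k+1) = r_(k+1) + beta_(k) d_(k)
            rs_old = rs_new
        return x

    # Same illustrative SPD system as before; CG converges in at most n = 2 steps here.
    A = np.array([[3.0, 2.0], [2.0, 6.0]])
    b = np.array([2.0, -8.0])
    print(conjugate_gradient(A, b))       # approaches [2, -2]

Compared with the steepest descent sketch, only the search direction changes: instead of always following the residual, each new direction is the residual made A-orthogonal to the previous direction via β(k), which is what gives convergence in at most n steps in exact arithmetic.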