Università di Pisa Dipartimento di Informatica
Technical Report
Krylov subspace methods for solving linear systems
G. M. Del Corso O. Menchi F. Romani
LICENSE: Creative Commons: Attribution-Noncommercial - No Derivative Works
ADDRESS: Largo B. Pontecorvo 3, 56127 Pisa, Italy. TEL: +39 050 2212700 FAX: +39 050 2212726
1 Introduction
With respect to their influence on the development and practice of science and engineering in the 20th century, Krylov subspace methods are considered one of the most important classes of numerical methods [9]. Given a nonsingular matrix A ∈ R^{N×N} and a vector b ∈ R^N, consider the system
Ax = b (1)
and denote by x^* the solution. When N is large and A does not enjoy particular structural properties, iterative methods are required to solve (1). Most iterative methods start from an initial guess x_0 and compute a sequence x_n which approximates x^*, moving in an affine subspace x_0 + K ⊂ R^N.
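As an illustrative sketch (not taken from the report), the simplest instance of this framework is Richardson iteration, x_{n+1} = x_n + ω r_n, whose iterates stay in the affine subspace x_0 + span{r_0, A r_0, A^2 r_0, ...}; the function name and the small test matrix below are ours:

```python
import numpy as np

def richardson(A, b, x0, omega, iters):
    """Stationary Richardson iteration: x_{n+1} = x_n + omega * r_n."""
    x = x0.copy()
    for _ in range(iters):
        r = b - A @ x          # residual at the current iterate
        x = x + omega * r      # step inside x0 + span of past residuals
    return x

# Small SPD example; omega is chosen below 2/lambda_max so the
# iteration matrix I - omega*A has spectral radius < 1.
A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
x = richardson(A, b, np.zeros(2), 0.2, 200)
print(np.allclose(A @ x, b))
```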
To identify suitable subspaces K, standard approaches can be followed. Denoting by r_n = b − Ax_n the residual vector at the nth iteration step, we can impose orthogonality conditions on r_n or minimize some norm of r_n. In any case, we are interested in procedures that construct K using only simple operations such as matrix-by-vector products. With operations of this kind we can construct Krylov subspaces, an ideal setting in which to develop iterative methods. Indeed, the iterative methods applied today for solving large-scale linear systems are mostly Krylov subspace solvers; classical iterative methods that do not belong to this class, like the successive overrelaxation (SOR) method, are no longer competitive.

The idea of Krylov subspace iteration was established in the early 1950s by C. Lanczos and W. Arnoldi. Lanczos' method, originally applied to the computation of eigenvalues, was based on two mutually orthogonal sequences of vectors and a three-term recurrence. In 1952 M. Hestenes and E. Stiefel gave their classical description of the conjugate gradient method (CG), presented as a direct rather than an iterative method. The method was related to Lanczos' method, but reduced the two mutually orthogonal sequences to just one and the three-term recurrence to a pair of two-term recurrences. At first CG did not receive much attention because of its intrinsic instability, but in 1971 J. Reid pointed out its effectiveness as an iterative method for symmetric positive definite systems. Since then, a considerable part of the research in numerical linear algebra has been devoted to generalizations of CG to nonsymmetric or indefinite systems.

The assumption we have made on the nonsingularity of A greatly simplifies the problem: if A is singular, Krylov subspace methods can fail, since even if a solution x^* of (1) exists, it may not belong to a Krylov subspace.
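The pair of two-term recurrences that distinguishes CG from the Lanczos process can be sketched as follows; this is an illustrative implementation for a symmetric positive definite A, with the function name and the small test system chosen by us:

```python
import numpy as np

def cg(A, b, x0, tol=1e-10, maxiter=100):
    """Hestenes-Stiefel conjugate gradient for SPD A."""
    x = x0.copy()
    r = b - A @ x          # initial residual
    p = r.copy()           # initial search direction
    rs = r @ r
    for _ in range(maxiter):
        if np.sqrt(rs) < tol:
            break
        Ap = A @ p
        alpha = rs / (p @ Ap)   # step length
        x = x + alpha * p       # two-term recurrence for the iterate
        r = r - alpha * Ap      # ... and for the residual
        rs_new = r @ r
        beta = rs_new / rs
        p = r + beta * p        # two-term recurrence for the direction
        rs = rs_new
    return x

A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
x = cg(A, b, np.zeros(2))
print(np.allclose(A @ x, b))
```

Note that only one vector sequence (the residuals) is kept orthogonal, and the only access to A is through the product A @ p, which is what makes the method attractive for large sparse systems.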
2 Krylov subspaces
Alexei Krylov was a Russian mathematician who published a paper on the subject in 1931 [20]. The basis for these subspaces can be found in the Cayley-Hamilton theorem, which implies that the inverse of a nonsingular matrix A can be expressed as a linear combination of powers of A. Krylov subspace methods share the feature that the matrix A need only be known as an operator (for example, through a subroutine) that returns the matrix-by-vector product Av for any N-vector v.
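This consequence of the Cayley-Hamilton theorem can be checked numerically: since p(A) = 0 for the characteristic polynomial p(t) = t^N + c_1 t^{N-1} + ... + c_N, we get A^{-1} = -(A^{N-1} + c_1 A^{N-2} + ... + c_{N-1} I)/c_N, a polynomial in A of degree N−1. The sketch below (function name and test matrix are ours) builds the inverse this way:

```python
import numpy as np

def inverse_via_cayley_hamilton(A):
    """Express A^{-1} as a degree N-1 polynomial in A (Cayley-Hamilton)."""
    N = A.shape[0]
    c = np.poly(A)            # [1, c_1, ..., c_N]: characteristic polynomial
    S = np.zeros_like(A, dtype=float)
    P = np.eye(N)             # successive powers of A
    # accumulate c_{N-1} I + c_{N-2} A + ... + c_0 A^{N-1}
    for j in range(N - 1, -1, -1):
        S += c[j] * P
        P = P @ A
    return -S / c[N]          # c[N] = p(0) = (-1)^N det(A), nonzero

A = np.array([[4.0, 1.0], [1.0, 3.0]])
print(np.allclose(A @ inverse_via_cayley_hamilton(A), np.eye(2)))
```

Forming explicit powers of A is of course impractical for large N; the point is only that A^{-1}b lies in the span of b, Ab, ..., A^{N-1}b, which is precisely the space Krylov methods search.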
Given a vector v ∈ R^N and an integer n ≤ N, a Krylov subspace is