SIAM J. MATRIX ANAL. APPL. © 2006 Society for Industrial and Applied Mathematics. Vol. 28, No. 1, pp. 264–284
MODIFIED GRAM–SCHMIDT (MGS), LEAST SQUARES, AND BACKWARD STABILITY OF MGS-GMRES∗
CHRISTOPHER C. PAIGE†, MIROSLAV ROZLOŽNÍK‡, AND ZDENĚK STRAKOŠ‡
Abstract. The generalized minimum residual method (GMRES) [Y. Saad and M. Schultz, SIAM J. Sci. Statist. Comput., 7 (1986), pp. 856–869] for solving linear systems Ax = b is implemented as a sequence of least squares problems involving Krylov subspaces of increasing dimensions. The most usual implementation is modified Gram–Schmidt GMRES (MGS-GMRES). Here we show that MGS-GMRES is backward stable. The result depends on a more general result on the backward stability of a variant of the MGS algorithm applied to solving a linear least squares problem, and uses other new results on MGS and its loss of orthogonality, together with an important but neglected condition number, and a relation between residual norms and certain singular values.
Key words. rounding error analysis, backward stability, linear equations, condition numbers, large sparse matrices, iterative solution, Krylov subspace methods, Arnoldi method, generalized minimum residual method, modified Gram–Schmidt, QR factorization, loss of orthogonality, least squares, singular values
AMS subject classifications. 65F10, 65F20, 65F25, 65F35, 65F50, 65G50, 15A12, 15A42
DOI. 10.1137/050630416
1. Introduction. Consider a system of linear algebraic equations $Ax = b$, where $A$ is a given $n \times n$ (unsymmetric) nonsingular matrix and $b$ a nonzero $n$-dimensional vector. Given an initial approximation $x_0$, one approach to finding $x$ is to first compute the initial residual $r_0 = b - Ax_0$. Using this, derive a sequence of Krylov subspaces $\mathcal{K}_k(A, r_0) \equiv \mathrm{span}\{r_0, Ar_0, \ldots, A^{k-1}r_0\}$, $k = 1, 2, \ldots$, in some way, and look for approximate solutions $x_k \in x_0 + \mathcal{K}_k(A, r_0)$. Various principles are used for constructing $x_k$, which determine various Krylov subspace methods for solving $Ax = b$. Similarly, Krylov subspaces for $A$ can be used to obtain eigenvalue approximations or to solve other problems involving $A$.

Krylov subspace methods are useful for solving problems involving very large sparse matrices, since these methods use these matrices only for multiplying vectors, and the resulting Krylov subspaces frequently exhibit good approximation properties. The Arnoldi method [2] is a Krylov subspace method designed for solving the eigenproblem of unsymmetric matrices. The generalized minimum residual method (GMRES) [20] uses the Arnoldi iteration and adapts it for solving the linear system $Ax = b$. GMRES can be computationally more expensive per step than some other methods; see, for example, Bi-CGSTAB [24] and QMR [9] for unsymmetric $A$, and LSQR [16] for unsymmetric or rectangular $A$.
∗Received by the editors May 2, 2005; accepted for publication (in revised form) by M. Benzi October 28, 2005; published electronically March 17, 2006. http://www.siam.org/journals/simax/28-1/63041.html
†School of Computer Science, McGill University, Montreal, Quebec, Canada, H3A 2A7 ([email protected]). This author's work was supported by NSERC of Canada grant OGP0009236.
‡Institute of Computer Science, Academy of Sciences of the Czech Republic, Pod Vodárenskou věží 2, 182 07 Praha 8, Czech Republic ([email protected], [email protected]). The work of these authors was supported by the project 1ET400300415 within the National Program of Research "Information Society" and by the Institutional Research Plan AV0Z10300504 "Computer Science for the Information Society: Models, Algorithms, Applications."
However, GMRES is widely used for solving linear systems arising from the discretization of partial differential equations, and, as we will show, it is backward stable and it does effectively minimize the 2-norm of the residual at each step.

The most usual way of applying the Arnoldi method for large, sparse unsymmetric $A$ is to use modified Gram–Schmidt orthogonalization (MGS). Unfortunately, in finite precision computations this leads to loss of orthogonality among the MGS Arnoldi vectors. If these vectors are used in GMRES we have MGS-GMRES. Fortunately, experience suggests that MGS-GMRES succeeds despite this loss of orthogonality; see [12]. For this reason we examine the MGS version of Arnoldi's algorithm and use it to show that the MGS-GMRES method does eventually produce a backward stable approximate solution when applied to any member of the following class of linear systems, with floating point arithmetic unit roundoff $\varepsilon$ ($\sigma$ denotes a singular value):
(1.1)   $Ax = b \ne 0$,   $A \in \mathbb{R}^{n \times n}$,   $b \in \mathbb{R}^{n}$,   $\sigma_{\min}(A) \gg n^2 \varepsilon \|A\|_F$;
see also the appendix. The restriction here is deliberately imprecise; see below. Moreover, we show that MGS-GMRES gives backward stable solutions for its least squares problems at all iteration steps, thus answering important open questions. The proofs depend on new results on the loss of orthogonality and backward stability of the MGS algorithm, as well as the application of the MGS algorithm to least squares problems, and a lot of this paper is devoted to first obtaining these results.

While the $k$th step of MGS produces the $k$th orthonormal vector $v_k$, it is usual to say $v_k$ is produced by step $k-1$ in the Arnoldi and MGS-GMRES algorithms. We will attempt to give a consistent development while avoiding this confusion. Thus step $k-1$ of MGS-GMRES is essentially the $k$th step of MGS applied to $[b, AV_{k-1}]$ to produce $v_k$ in $[b, AV_{k-1}] = V_kR_k$, where $V_k \equiv [v_1, \ldots, v_k]$ and $R_k$ is upper triangular. In practice, if we reach a solution at step $m-1$ of MGS-GMRES, then numerically $b$ must lie in the range of $AV_{m-1}$, so that $B_m \equiv [b, AV_{m-1}]$ is numerically rank deficient. But this means we have to show that our rounding error analysis of MGS holds for rank deficient $B_m$, and this requires an extension of some results in [5].

In section 2 we describe our notation and present some of the tools we need which may be of more general use. For example, we show the importance of the condition number $\tilde\kappa_F(A)$ in (2.1), prove the existence of a nearby vector in Lemma 2.3, and provide a variant of the singular value–residual norm relations of [17] in Theorem 2.4. In sections 3.1–3.2 we review MGS applied to $n \times m$ $B$ of rank $m$, and its numerical equivalence to the Householder QR reduction of $B$ augmented by an $m \times m$ matrix of zeros. In section 3.3 we show how the MGS rounding error results extend to the case of $m > n$, while in section 4 we show how these results apply to the Arnoldi algorithm. In section 5 we analyze the loss of orthogonality in MGS and the Arnoldi algorithm and how it is related to the near rank deficiency of the columns of $B$ or its Arnoldi equivalent, refining a nice result of Giraud and Langou [10] and Langou [14]. Section 6 introduces the key step used to prove convergence of these iterations. In section 7.1 we prove the backward stability of the MGS algorithm applied to solving linear least squares problems of the form required by the MGS-GMRES algorithm, and in section 7.2 we show how loss of orthogonality is directly related to new normwise relative backward errors of a sequence of different least squares problems, supporting a conjecture on the convergence of MGS-GMRES and its loss of orthogonality; see [18]. In section 8.1 we show that at every step MGS-GMRES computes a backward stable solution for that step's linear least squares problem, and in section 8.2 we show that one of these solutions is also a backward stable solution for (1.1) in at most $n+1$ MGS steps.
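To make the indexing concrete, the following minimal numpy sketch (ours, for illustration only; all variable names and dimensions are assumptions, not from the paper) builds the Arnoldi vectors with MGS and verifies that $[b, AV_m] = V_{m+1}R_{m+1}$ holds with $R_{m+1}$ upper triangular, to within rounding error:

```python
# Hypothetical illustration (not from the paper): step k-1 of MGS-GMRES
# performs the k-th MGS step on [b, A V_{k-1}], so that in exact
# arithmetic [b, A V_m] = V_{m+1} R_{m+1} with R_{m+1} upper triangular.
import numpy as np

rng = np.random.default_rng(0)
n, m = 8, 5                      # small example dimensions (assumed)
A = rng.standard_normal((n, n))
b = rng.standard_normal(n)

V = np.zeros((n, m + 1))         # Arnoldi vectors v_1, ..., v_{m+1}
R = np.zeros((m + 1, m + 1))     # upper triangular R_{m+1}
R[0, 0] = np.linalg.norm(b)
V[:, 0] = b / R[0, 0]
for k in range(m):               # MGS step k+1, applied to the column A v_k
    w = A @ V[:, k]
    for i in range(k + 1):       # orthogonalize against v_1, ..., v_{k+1}
        R[i, k + 1] = V[:, i] @ w
        w -= V[:, i] * R[i, k + 1]
    R[k + 1, k + 1] = np.linalg.norm(w)
    V[:, k + 1] = w / R[k + 1, k + 1]

B = np.column_stack([b, A @ V[:, :m]])   # B_{m+1} = [b, A V_m]
print(np.allclose(B, V @ R))             # True: V_{m+1} R_{m+1} matches B_{m+1}
```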
The restriction on $A$ in (1.1) is essentially a warning to be prepared for difficulties in using the basic MGS-GMRES method on singular systems; see, for example, [6, 23]. The imprecise nature of the condition (using $\gg$ instead of $>$ with some constant) was chosen to make the presentation easier. A constant could be provided (perhaps closer to 100 than 10), but since the long bounding sequence used was so loose, it would be meaningless. The appendix suggests that the form $n^2\varepsilon\|A\|_F$ might be optimal, but since for large $n$ rounding errors tend to combine in a semirandom fashion, it is reasonable to replace $n^2$ by $n$, and a more practical requirement than (1.1) might be
(1.2)   For large $n$,   $n\varepsilon\|A\|_F/\sigma_{\min}(A) \le 0.1$.

2. Notation and mathematical basics. We describe the notation we will use, together with some generally useful results. We use "≡" to mean "is defined as" in the first occurrence of an expression, but in any later occurrences of this expression it means "is equivalent to (by earlier definition)." A bar above a symbol will denote a computed quantity, so if $V_k$ is an ideal mathematical quantity, $\bar V_k$ will denote its actual computed value. The floating point arithmetic unit roundoff will be denoted by $\varepsilon$ (half the machine epsilon; see [13, pp. 37–38]), $I_n$ denotes the $n\times n$ unit matrix, $e_j$ will be the $j$th column of a unit matrix $I$, so $Be_j$ is the $j$th column of $B$, and $B_{i:j} \equiv [Be_i, \ldots, Be_j]$. We will denote the absolute value of a matrix $B$ by $|B|$, and its Moore–Penrose generalized inverse by $B^\dagger$; $\|\cdot\|_F$ will denote the Frobenius norm, $\sigma(\cdot)$ will denote a singular value, and $\kappa_2(B) \equiv \sigma_{\max}(B)/\sigma_{\min}(B)$; see (2.1) for $\tilde\kappa_F(\cdot)$. Matrices and vectors whose first symbol is $\Delta$, such as $\Delta V_k$, will denote rounding error terms.

For the rounding error analyses we will use Higham's notation [13, pp. 63–68]: $\tilde c$ will denote a small integer $\ge 1$ whose exact value is unimportant ($\tilde c$ might have a different value at each appearance), and $\gamma_n \equiv n\varepsilon/(1-n\varepsilon)$, $\tilde\gamma_n \equiv \tilde c n\varepsilon/(1-\tilde c n\varepsilon)$. Without mentioning it again, we will always assume the conditions are such that the denominators in objects like this (usually bounds) are positive; see, for example, [13, (19.6)]. We see $\tilde\gamma_n/(1-\tilde\gamma_n) = \tilde c n\varepsilon/(1-2\tilde c n\varepsilon)$, and might write $\tilde\gamma_n/(1-\tilde\gamma_n) = \tilde\gamma_n'$ for mathematical correctness, but will refer to the right-hand side as $\tilde\gamma_n$ thereafter. $E_m$, $\hat E_m$, $\tilde E_m$ will denote matrices of rounding errors (see just before Theorem 3.3), and a bound of the form $\|E_m e_j\|_2 \le \gamma\|Be_j\|_2$ implies this holds for $j = 1, \ldots, m$ unless otherwise stated.

Remark 2.1 (see also the appendix). An important idea used throughout this paper is that column bounds of the above form lead to several results which are independent of column scaling, and we take advantage of this by using the following condition number. Throughout the paper, $D$ will represent any positive definite diagonal matrix. The choice of norms is key to making error analyses readable, and fortunately there is a compact column-scaling-independent result with many uses.
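As a concrete illustration of the practical requirement (1.2), the following sketch (ours, not part of the paper; the test matrix is an arbitrary assumption) evaluates $n\varepsilon\|A\|_F/\sigma_{\min}(A)$ for a given $A$:

```python
# A small sketch (not from the paper) that evaluates the practical
# requirement (1.2), n*eps*||A||_F / sigma_min(A) <= 0.1, for a given A.
# Here eps is the unit roundoff, i.e., half the machine epsilon (section 2).
import numpy as np

def satisfies_1_2(A: np.ndarray) -> bool:
    n = A.shape[0]
    eps = np.finfo(A.dtype).eps / 2.0            # unit roundoff
    sigma_min = np.linalg.svd(A, compute_uv=False)[-1]
    return n * eps * np.linalg.norm(A, "fro") / sigma_min <= 0.1

A = np.diag(np.logspace(0, -16, 50))             # very ill conditioned test matrix
print(satisfies_1_2(A))   # False: sigma_min is of the order of eps*||A||_F
```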
Define

(2.1)   $\tilde\kappa_F(A) \equiv \min_{\mathrm{diagonal}\ D>0} \|AD\|_F/\sigma_{\min}(AD)$.

This condition number leads to some useful new results.

Lemma 2.1. If $E$ and $B$ have $m$ columns, then for any positive definite diagonal matrix $D$:

$\|Ee_j\|_2 \le \gamma\|Be_j\|_2$, $j = 1,\ldots,m$ $\;\Rightarrow\;$ $\|ED\|_F \le \gamma\|BD\|_F$;
$\|Ee_j\|_2 \le \gamma\|Be_j\|_2$ for $j = 1,\ldots,m$ and $\mathrm{rank}(B) = m$ $\;\Rightarrow\;$ $\|EB^\dagger\|_F \le \gamma\tilde\kappa_F(B)$.

With the QR factorization $B = Q_1R$, this leads to $\|ER^{-1}\|_F \le \gamma\tilde\kappa_F(B) = \gamma\tilde\kappa_F(R)$.

Proof. $\|Ee_j\|_2 \le \gamma\|Be_j\|_2$ implies $\|EDe_j\|_2 \le \gamma\|BDe_j\|_2$, so $\|ED\|_F \le \gamma\|BD\|_F$. For $B$ of rank $m$, $(BD)^\dagger = D^{-1}B^\dagger$, $\|(BD)^\dagger\|_2 = 1/\sigma_{\min}(BD)$, and so

$\|EB^\dagger\|_F = \|ED(BD)^\dagger\|_F \le \|ED\|_F\,\|(BD)^\dagger\|_2 \le \gamma\|BD\|_F/\sigma_{\min}(BD)$.

Since this is true for all such $D$, we can take the minimum, proving our results.
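Since $\tilde\kappa_F(B)$ in (2.1) minimizes over all positive diagonal $D$, any particular $D$ yields an upper bound. The following sketch (ours, not from the paper) uses the common column-equilibration choice, scaling each column of $B$ to unit 2-norm; by a van der Sluis-type argument this $D$ is typically close to the minimizer, but the value returned is only an upper bound on (2.1):

```python
# A hedged sketch (ours): an upper bound on kappa~_F(B) from (2.1), obtained
# by column equilibration, i.e., choosing D so BD has unit 2-norm columns.
import numpy as np

def kappa_tilde_F_upper(B: np.ndarray) -> float:
    col_norms = np.linalg.norm(B, axis=0)
    d = 1.0 / col_norms                        # diagonal of D, as a vector
    BD = B * d                                 # scale each column of B
    sigma_min = np.linalg.svd(BD, compute_uv=False)[-1]
    return np.linalg.norm(BD, "fro") / sigma_min

rng = np.random.default_rng(1)
B = rng.standard_normal((100, 5)) * np.array([1e6, 1, 1, 1, 1e-6])
print(kappa_tilde_F_upper(B))    # modest, although kappa_2(B) is ~1e12
```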
Lemma 2.2. If the $m\times m$ $\bar R$ is nonsingular and $P_1^TP_1 = I$ in $P_1\bar R = B + E$, and $\gamma\tilde\kappa_F(B) < 1$, then

$\|Ee_j\|_2 \le \gamma\|Be_j\|_2$, $j = 1,\ldots,m$ $\;\Rightarrow\;$ $\|E\bar R^{-1}\|_F \le \gamma\tilde\kappa_F(B)/(1 - \gamma\tilde\kappa_F(B))$.
Proof. For any $D$ in (2.1), $\|Ee_j\|_2 \le \gamma\|Be_j\|_2 \Rightarrow \|ED\|_F \le \gamma\|BD\|_F$, and then $\sigma_{\min}(\bar RD) \ge \sigma_{\min}(BD) - \gamma\|BD\|_F$, so $\|E\bar R^{-1}\|_F = \|ED(\bar RD)^{-1}\|_F$ is bounded by

$\|ED\|_F\,\|(\bar RD)^{-1}\|_2 \le \dfrac{\gamma\|BD\|_F}{\sigma_{\min}(BD) - \gamma\|BD\|_F} = \dfrac{\gamma\|BD\|_F/\sigma_{\min}(BD)}{1 - \gamma\|BD\|_F/\sigma_{\min}(BD)}$.

Taking the minimum over $D$ proves the result.

Suppose $\bar V_m \equiv [\bar v_1,\ldots,\bar v_m]$ is an $n\times m$ matrix whose columns have been computationally normalized to have 2-norms of 1, and so have norms in $[1-\tilde\gamma_n, 1+\tilde\gamma_n]$. Now define $\tilde V_m \equiv [\tilde v_1,\ldots,\tilde v_m]$, where $\tilde v_j$ is just the correctly normalized version of $\bar v_j$, so
(2.2)   $\bar V_m = \tilde V_m(I + \Delta_m)$,  $\Delta_m \equiv \mathrm{diag}(\nu_j)$, where $|\nu_j| \le \tilde\gamma_n$, $j = 1,\ldots,m$;
$\bar V_m^T\bar V_m = \tilde V_m^T\tilde V_m + \tilde V_m^T\tilde V_m\Delta_m + \Delta_m\tilde V_m^T\tilde V_m + \Delta_m\tilde V_m^T\tilde V_m\Delta_m$,
$\|\bar V_m^T\bar V_m - \tilde V_m^T\tilde V_m\|_F/\|\tilde V_m^T\tilde V_m\|_F \le \tilde\gamma_n(2+\tilde\gamma_n) \equiv \tilde\gamma_n$.

From now on we will not document the analogues of the last step $\tilde\gamma_n(2+\tilde\gamma_n) \equiv \tilde\gamma_n$, but finish with $\le \tilde\gamma_n$. In general it will be as effective to consider $\tilde V_m$ as $\bar V_m$, and we will develop our results in terms of $\tilde V_m$ rather than $\bar V_m$. The following will be useful here:

(2.3)   $\|[\tilde V_m, I_n]\|_2^2 = \|I_n + \tilde V_m\tilde V_m^H\|_2 = 1 + \|\tilde V_m\tilde V_m^H\|_2 = 1 + \|\tilde V_m\|_2^2 \le 1 + \|\tilde V_m\|_F^2 = 1 + m$.

Lemma 2.3 deals with the problem: Suppose we have $d \in \mathbb{R}^n$ and we know for some unknown perturbation $f \in \mathbb{R}^{m+n}$ that $\left\|\begin{pmatrix}0\\ d\end{pmatrix} + f\right\|_2 = \rho$. Is there a perturbation $g$ of the same dimension as $d$, and having a similar norm to that of $f$, such that $\|d + g\|_2 = \rho$ also? Here we show such a $g$ exists in the form $g = Nf$, $\|N\|_2 \le \sqrt 2$.

Lemma 2.3. For a given $d \in \mathbb{R}^n$ and unknown $f \in \mathbb{R}^{m+n}$, if

$\begin{pmatrix} f_1 \\ d + f_2 \end{pmatrix} \equiv \begin{pmatrix} 0 \\ d \end{pmatrix} + f = p\rho \equiv \begin{pmatrix} p_1 \\ p_2 \end{pmatrix}\rho$,  where $\|p\|_2 = 1$,
then there exists $0 \le \sigma \le 1$, $v \in \mathbb{R}^n$ with $\|v\|_2 = 1$, and $n\times(m+n)$ $N$ of the form

(2.4)   $N \equiv [v(1+\sigma)^{-1}p_1^T,\; I_n]$,

(2.5)   so that $d + Nf = v\rho$.

(2.6)   This gives $\left\|\begin{pmatrix}0\\ d\end{pmatrix} + f\right\|_2 = \|d + Nf\|_2 = \rho$,  $1 \le \|N\|_2 \le \sqrt 2$.

Proof. Define $\sigma \equiv \|p_2\|_2$. If $\sigma = 0$, take any $v \in \mathbb{R}^n$ with $\|v\|_2 = 1$. Otherwise define $v \equiv p_2/\sigma$, so $\|v\|_2 = 1$. In either case $p_2 = v\sigma$ and $p_1^Tp_1 = 1 - \sigma^2$. Now define $N$ as in (2.4), so

$d + Nf = d + v(1+\sigma)^{-1}\|p_1\|_2^2\rho + f_2 = p_2\rho + v(1-\sigma)\rho = v\rho$,
$NN^T = I + v(1+\sigma)^{-2}(1-\sigma^2)v^T$,
$1 \le \|N\|_2^2 = \|NN^T\|_2 = 1 + (1-\sigma)/(1+\sigma) \le 2$,

proving (2.5) and (2.6).
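A direct numerical check (ours, not from the paper; the random data are arbitrary assumptions) of Lemma 2.3: construct $N$ as in (2.4) and verify (2.5)–(2.6):

```python
# A numerical check (ours) of Lemma 2.3: given d and f with
# ||[0; d] + f||_2 = rho, the matrix N in (2.4) satisfies
# ||d + N f||_2 = rho and 1 <= ||N||_2 <= sqrt(2).
import numpy as np

rng = np.random.default_rng(2)
m, n = 3, 4
d = rng.standard_normal(n)
f = 1e-3 * rng.standard_normal(m + n)          # small perturbation

s = np.concatenate([np.zeros(m), d]) + f       # [0; d] + f = p * rho
rho = np.linalg.norm(s)
p1, p2 = s[:m] / rho, s[m:] / rho              # p = [p1; p2], ||p||_2 = 1
sigma = np.linalg.norm(p2)
v = p2 / sigma if sigma > 0 else np.eye(n)[:, 0]

N = np.hstack([np.outer(v, p1) / (1 + sigma), np.eye(n)])   # (2.4)
print(np.isclose(np.linalg.norm(d + N @ f), rho))           # (2.5)-(2.6): True
print(np.linalg.norm(N, 2) <= np.sqrt(2) + 1e-12)           # ||N||_2 bound: True
```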
This is a refinement of a special case of [5, Lem. 3.1]; see also [13, Ex. 19.12]. The fact that the perturbation $g$ in $d$ has the form of $N$ times the perturbation $f$ is important, as we shall see in section 7.1.

Finally we give a general result on the relation between least squares residual norms and singular values. The bounds below were given in [17, Thm. 4.1], but subject to the condition [17, (1.4)] that we cannot be sure will hold here. To prove that our results here hold subject to the different condition (1.1), we need to prove a related result. In order not to be too repetitive, we will prove a slightly more general result than we considered before, or need here, and make the theorem and proof brief.

Theorem 2.4. Let $B \in \mathbb{R}^{n\times k}$ have rank $s$ and singular values $\sigma_1 \ge \cdots \ge \sigma_s > 0$. For $0 \ne c \in \mathbb{R}^n$ and a scalar $\phi \ge 0$, define $\hat y \equiv B^\dagger c$, $\hat r \equiv c - B\hat y$, $\sigma(\phi) \equiv \sigma_{s+1}([c\phi, B])$, and $\delta(\phi) \equiv \sigma(\phi)/\sigma_s$. If $\hat r\phi \ne 0$, then $\sigma(\phi) > 0$. If $\phi_0 \equiv \sigma_s/\|c\|_2$, then for all $\phi \in [0, \phi_0)$ we have $0 \le \delta(\phi) < 1$. Finally, for all $\phi > 0$ such that $\delta(\phi) < 1$, we have

$\sigma^2(\phi)\left[\phi^{-2} + \|\hat y\|_2^2\right] \;\le\; \|\hat r\|_2^2 \;\le\; \sigma^2(\phi)\left[\phi^{-2} + \dfrac{\|\hat y\|_2^2}{1 - \delta^2(\phi)}\right]$.
Proof. $\hat r$ is the least squares residual for $By \approx c$, so $\hat r\phi \ne 0$ means $[c\phi, B]$ has rank $s+1$ and $\sigma(\phi) > 0$. If $0 \le \phi < \phi_0$, then $\|c\|_2\phi < \|c\|_2\phi_0 = \sigma_s$, so via Cauchy's interlacing theorem, $0 \le \sigma(\phi) \equiv \sigma_{s+1}([c\phi, B]) < \sigma_s$, giving $0 \le \delta(\phi) < 1$. Using the singular value decomposition $B = W\,\mathrm{diag}(\Sigma, 0)\,Z^T$, $W^T = W^{-1}$, $Z^T = Z^{-1}$, write

$W^T[c, BZ] = \begin{bmatrix} a_1 & \Sigma & 0 \\ a_2 & 0 & 0 \end{bmatrix}$,  $\Sigma \equiv \mathrm{diag}(\sigma_1, \ldots, \sigma_s)$,  $a_1 \equiv \begin{bmatrix} \alpha_1 \\ \vdots \\ \alpha_s \end{bmatrix}$,  $\hat y = Z\begin{bmatrix} \Sigma^{-1}a_1 \\ 0 \end{bmatrix}$.
Then it can be shown (see, for example, [26, (39.4)], [17, (2.6)], [15, pp. 1508–1510], et al.) that for all $\phi$ such that $\phi > 0$ and $\delta(\phi) < 1$, $\sigma(\phi)$ is the smallest root of
$\|\hat r\|_2^2 = \sigma(\phi)^2\left[\phi^{-2} + \displaystyle\sum_{i=1}^{s}\dfrac{\alpha_i^2/\sigma_i^2}{1 - \sigma(\phi)^2/\sigma_i^2}\right]$.

But  $\|\hat y\|_2^2 = \displaystyle\sum_{i=1}^{s}\dfrac{\alpha_i^2}{\sigma_i^2} \;\le\; \displaystyle\sum_{i=1}^{s}\dfrac{\alpha_i^2/\sigma_i^2}{1 - \sigma(\phi)^2/\sigma_i^2} \;\le\; \displaystyle\sum_{i=1}^{s}\dfrac{\alpha_i^2/\sigma_i^2}{1 - \sigma(\phi)^2/\sigma_s^2} = \dfrac{\|\hat y\|_2^2}{1 - \delta^2(\phi)}$
while $\delta(\phi) \equiv \sigma(\phi)/\sigma_s < 1$, and the result follows.

We introduced $\phi_0$ to show $\delta(\phi) < 1$ for some $\phi > 0$. For results related to Theorem 2.4 we refer to [15, pp. 1508–1510], which introduced this useful value $\phi_0$.
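The bounds of Theorem 2.4 are easy to check numerically. The following sketch (ours; the dimensions and the choice $\phi = \phi_0/2$ are arbitrary assumptions) verifies that $\|\hat r\|_2^2$ lies between the two bounds:

```python
# A numerical illustration (ours) of Theorem 2.4: the squared residual norm
# is bracketed by the bounds built from sigma(phi) = sigma_{s+1}([c*phi, B]).
import numpy as np

rng = np.random.default_rng(3)
n, k = 20, 6
B = rng.standard_normal((n, k))              # full column rank, so s = k
c = rng.standard_normal(n)

y_hat = np.linalg.lstsq(B, c, rcond=None)[0]
r_hat = c - B @ y_hat
sigma_s = np.linalg.svd(B, compute_uv=False)[-1]

phi = 0.5 * sigma_s / np.linalg.norm(c)      # phi < phi_0 ensures delta < 1
sig = np.linalg.svd(np.column_stack([c * phi, B]), compute_uv=False)[k]
delta = sig / sigma_s

lo = sig**2 * (phi**-2 + y_hat @ y_hat)
hi = sig**2 * (phi**-2 + (y_hat @ y_hat) / (1 - delta**2))
print(lo <= r_hat @ r_hat <= hi)             # True
```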
3. The modified Gram–Schmidt (MGS) algorithm. In order to understand the numerical behavior of the MGS-GMRES algorithm, we first need a very deep understanding of the MGS algorithm. Here this is obtained by a further study of the numerical equivalence between MGS and the Householder QR factorization of an augmented matrix; see [5] and also, for example, [13, section 19.8]. We do not give exact bounds but work with terms of the form $\tilde\gamma_n$ instead; see [13, pp. 63–68] and our section 2. The exact bounds will not even be approached for the large $n$ we are interested in, so there is little reason to include such fine detail. In sections 3.1–3.3 we will review the MGS-Householder equivalence and extend some of the analysis that was given in [5] and [13, section 19.8].

3.1. The basic MGS algorithm. Given a matrix $B \in \mathbb{R}^{n\times m}$ with rank $m \le n$, MGS in theory produces $V_m$ and nonsingular $R_m$ in the QR factorization
(3.1)   $B = V_mR_m$,  $V_m^TV_m = I_m$,  $R_m$ upper triangular,
where $V_m \equiv [v_1,\ldots,v_m]$ and the $m\times m$ $R_m \equiv (\rho_{ij})$. The version of the MGS algorithm which immediately updates all columns computes a sequence of matrices $B = B^{(1)}, B^{(2)}, \ldots, B^{(m+1)} = V_m \in \mathbb{R}^{n\times m}$, where $B^{(i)} = [v_1,\ldots,v_{i-1}, b_i^{(i)},\ldots,b_m^{(i)}]$. Here the first $i-1$ columns are final columns in $V_m$, and $b_i^{(i)},\ldots,b_m^{(i)}$ have been made orthogonal to $v_1,\ldots,v_{i-1}$. In the $i$th step we take
(3.2)   $\rho_{ii} := \|b_i^{(i)}\|_2 \ne 0$ since $\mathrm{rank}(B) = m$,   $v_i := b_i^{(i)}/\rho_{ii}$,
and orthogonalize $b_{i+1}^{(i)},\ldots,b_m^{(i)}$ against $v_i$ using the orthogonal projector $I - v_iv_i^T$:
(3.3)   $\rho_{ij} := v_i^Tb_j^{(i)}$,   $b_j^{(i+1)} := b_j^{(i)} - v_i\rho_{ij}$,   $j = i+1,\ldots,m$.
We see $B^{(i)} = B^{(i+1)}R^{(i)}$, where $R^{(i)}$ has the same $i$th row as $R_m$ but is the unit matrix otherwise. Note that in the $m$th step no computation is performed in (3.3), so that after $m$ steps we have obtained the factorization

(3.4)   $B = B^{(1)} = B^{(2)}R^{(1)} = B^{(3)}R^{(2)}R^{(1)} = \cdots = B^{(m+1)}R^{(m)}\cdots R^{(1)} = V_mR_m$,

where in exact arithmetic the columns of $V_m$ are orthonormal by construction.
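The updates (3.2)–(3.3) translate directly into code. Here is a minimal numpy sketch (ours, for illustration; names and dimensions are assumptions) of this column-updating MGS, which forms $R_m$ a row at a time:

```python
# A minimal sketch (ours) of the MGS variant (3.2)-(3.3): at step i the
# i-th column is normalized, and all remaining columns are immediately
# orthogonalized against v_i, so R is formed a row at a time.
import numpy as np

def mgs(B: np.ndarray):
    n, m = B.shape
    V = B.astype(float).copy()               # V holds B^(i), finally V_m
    R = np.zeros((m, m))
    for i in range(m):
        R[i, i] = np.linalg.norm(V[:, i])    # (3.2): rho_ii and v_i
        V[:, i] /= R[i, i]
        for j in range(i + 1, m):            # (3.3): project out v_i
            R[i, j] = V[:, i] @ V[:, j]
            V[:, j] -= V[:, i] * R[i, j]
    return V, R

B = np.random.default_rng(4).standard_normal((7, 4))
V, R = mgs(B)
print(np.allclose(B, V @ R))                 # True: the QR factorization (3.1)
```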
This formed $R_m$ a row at a time. If the $j$th column of $B$ is only available after $v_{j-1}$ is formed, as in MGS-GMRES, then we usually form $R_m$ a column at a time. This does not alter the numerical values if we produce $\rho_{1,j}, b_j^{(2)}$; $\rho_{2,j}, b_j^{(3)}$; etc. It was shown in [3] that for the computed $\bar R_m$ and $\bar V_m$ in MGS

(3.5)   $B + E = \bar V_m\bar R_m$,  $\|E\|_2 \le c_1(m,n)\varepsilon\|B\|_2$,  $\|I - \bar V_m^T\bar V_m\|_2 \le c_2(m,n)\varepsilon\kappa_2(B)$,

where $c_i(m,n)$ denoted a scalar depending on $m$, $n$, and the details of the arithmetic. We get a deeper understanding by examining the MGS-Householder QR relationship.
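The second bound in (3.5) is easy to observe in practice: the loss of orthogonality of the computed MGS vectors grows roughly in proportion to $\kappa_2(B)$. A small demonstration (ours, not from the paper; the test matrices with prescribed condition numbers are assumptions):

```python
# An illustration (ours) of the second bound in (3.5): the loss of
# orthogonality of the computed MGS vectors grows with kappa_2(B).
import numpy as np

def mgs_loss_of_orthogonality(B: np.ndarray) -> float:
    n, m = B.shape
    V = B.astype(float).copy()
    for i in range(m):
        V[:, i] /= np.linalg.norm(V[:, i])
        for j in range(i + 1, m):
            V[:, j] -= V[:, i] * (V[:, i] @ V[:, j])
    return np.linalg.norm(np.eye(m) - V.T @ V, 2)

rng = np.random.default_rng(5)
U, _ = np.linalg.qr(rng.standard_normal((50, 8)))
W, _ = np.linalg.qr(rng.standard_normal((8, 8)))
for kappa in [1e2, 1e6, 1e10]:               # prescribed condition numbers
    S = np.logspace(0, -np.log10(kappa), 8)  # singular values 1 .. 1/kappa
    B = U @ np.diag(S) @ W.T
    print(f"kappa_2 = {kappa:.0e}:  loss = {mgs_loss_of_orthogonality(B):.1e}")
```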
3.2. MGS as a Householder method. The MGS algorithm for the QR factorization of $B$ can be interpreted as an orthogonal transformation applied to the matrix $B$ augmented with a square matrix of zero elements on top. This is true in theory for any method of QR factorization, but for Householder's method it is true in the presence of rounding errors as well. This observation was made by Charles Sheffield and relayed to the authors of [5] by Gene Golub.

First we look at the theoretical result. Let $B \in \mathbb{R}^{n\times m}$ have rank $m$, and let $O_m \in \mathbb{R}^{m\times m}$ be a zero matrix. Consider the QR factorization