Chapter 6
Linear Iterative Methods
6.1 Motivation
In Chapter 3 we learned that, in general, solving the linear system of equations $Ax = b$ with $A \in \mathbb{C}^{n\times n}$ and $b \in \mathbb{C}^n$ requires $\mathcal{O}(n^3)$ operations. This is too expensive in practice. The high cost begs the following questions: Are there lower cost options? Is an approximation of $x$ good enough? How would such an approximation be generated? Often times we can find schemes that have a much lower cost of computing an approximate solution to $x$.

As an alternative to the direct methods that we studied in the previous chapters, in the present chapter we will describe so-called iterative methods for constructing sequences, $\{x_k\}_{k=1}^{\infty} \subset \mathbb{C}^n$, with the desire that $x_k \to x := A^{-1}b$ as $k \to \infty$. The idea is that, given some $\varepsilon > 0$, we look for a $k \in \mathbb{N}$ such that
\[
  \|x - x_k\| \le \varepsilon
\]
with respect to some norm. In this context, $\varepsilon$ is called the stopping tolerance. In other words, we want to make certain the error is small in norm.

But a word of caution. Usually, we do not have a direct way of approximating the error. The residual is more readily available. Suppose that $x_k$ is an approximation of $x = A^{-1}b$. The error is $e_k = x - x_k$ and the residual is $r_k = b - Ax_k = Ae_k$. Recall that
\[
  \frac{\|e_k\|}{\|x\|} \le \kappa(A)\,\frac{\|r_k\|}{\|b\|}.
\]
Thus, when $\kappa(A)$ is large, $\frac{\|r_k\|}{\|b\|}$, which is easily computable, may not be a good indicator of the size of the relative error $\frac{\|e_k\|}{\|x\|}$, which is not directly computable. One must be careful measuring the error.
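The caveat above is easy to see numerically. The following sketch (the matrix and the crude approximation are chosen purely for illustration) builds an ill-conditioned $A$ and shows that the relative residual can be tiny while the relative error is large, with the bound $\frac{\|e_k\|}{\|x\|} \le \kappa(A)\frac{\|r_k\|}{\|b\|}$ still holding.

```python
import numpy as np

# Illustrative ill-conditioned system: kappa_2(A) = 1e8.
A = np.array([[1.0, 0.0],
              [0.0, 1.0e-8]])
b = np.array([1.0, 1.0e-8])
x = np.linalg.solve(A, b)            # exact solution is (1, 1)

x_approx = np.array([1.0, 0.5])      # a crude approximation of x
r = b - A @ x_approx                 # residual r = b - A x_approx
e = x - x_approx                     # error e = x - x_approx

rel_residual = np.linalg.norm(r) / np.linalg.norm(b)
rel_error = np.linalg.norm(e) / np.linalg.norm(x)
kappa = np.linalg.cond(A)

# Tiny relative residual, yet the relative error is about 0.35:
# the bound rel_error <= kappa * rel_residual is sharp up to a factor.
print(rel_residual, rel_error, kappa)
```

Here the residual suggests roughly eight correct digits, while the approximation is wrong in its very first digit.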
6.2 Linear Iterative Schemes
Definition 6.2.1 (iterative scheme). Let $A \in \mathbb{C}^{n\times n}$ with $\det(A) \neq 0$ and $b \in \mathbb{C}^n$. An iterative scheme to find an approximate solution to $Ax = b$ is a process to generate a sequence of approximations $\{x_k\}_{k=1}^{\infty}$ via an iteration of the form
\[
  x_k = \varphi(A; b; x_{k-1}; \ldots; x_{k-r}),
\]
given the starting values $x_0, \ldots, x_{r-1} \in \mathbb{C}^n$, where
\[
  \varphi(\cdot\,; \cdot\,; \cdot\,; \ldots; \cdot) \colon \mathbb{C}^{n\times n} \times \mathbb{C}^n \times \mathbb{C}^n \times \cdots \times \mathbb{C}^n \to \mathbb{C}^n
\]
is called the iteration function. If $r = 1$, we say that the process is a two-layer scheme; otherwise we say it is a multilayer scheme.
Definition 6.2.2. Let $A \in \mathbb{C}^{n\times n}$ with $\det(A) \neq 0$ and $b \in \mathbb{C}^n$. Set $x = A^{-1}b$. The two-layer iterative scheme
\[
  x_k = \varphi(A; b; x_{k-1})
\]
is said to be consistent iff $x = \varphi(A; b; x)$, i.e., $x = A^{-1}b$ is a fixed point of $\varphi(A; b; \cdot)$. The scheme is linear iff
\[
  \varphi(A; \alpha b_1 + \beta b_2; \alpha x_1 + \beta x_2) = \alpha\,\varphi(A; b_1; x_1) + \beta\,\varphi(A; b_2; x_2)
\]
for all $\alpha, \beta \in \mathbb{C}$ and $x_1, x_2 \in \mathbb{C}^n$.

Proposition 6.2.3. Let $A \in \mathbb{C}^{n\times n}$ with $\det(A) \neq 0$ and $b \in \mathbb{C}^n$. Any two-layer, linear and consistent scheme can be written in the form
\[
  x_{k+1} = x_k + C r(x_k) = Cb + (I_n - CA)x_k, \tag{6.2.1}
\]
for some matrix $C \in \mathbb{C}^{n\times n}$, where $r(z) := b - Az$ is the residual vector.
Proof. A two-layer scheme is defined by an iteration function
\[
  \varphi(\cdot\,; \cdot\,; \cdot) \colon \mathbb{C}^{n\times n} \times \mathbb{C}^n \times \mathbb{C}^n \to \mathbb{C}^n.
\]
Given $\varphi$, define the operator
\[
  Cz := \varphi(A; z; 0).
\]
This is a linear operator, due to the assumed linearity of the iteration function. Consequently, $C$ can be identified with a square matrix. It follows from this definition, using the consistency and linearity of $\varphi$, that
\[
  (I_n - CA)w = w - \varphi(A; Aw; 0) = \varphi(A; Aw; w) - \varphi(A; Aw; 0) = \varphi(A; 0; w).
\]
Furthermore, by linearity, we can write
\begin{align*}
  x_{k+1} &= \varphi(A; b + 0; 0 + x_k) \\
          &= \varphi(A; b; 0) + \varphi(A; 0; x_k) \\
          &= Cb + (I_n - CA)x_k \\
          &= x_k + C(b - Ax_k),
\end{align*}
as we intended to show.
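The algebraic equivalence of the two forms in (6.2.1) is easy to sanity-check numerically. In this sketch the matrices and vectors are random illustrative data, not any particular scheme from the text.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
A = rng.standard_normal((n, n)) + n * np.eye(n)  # illustrative invertible A
b = rng.standard_normal(n)
C = rng.standard_normal((n, n))                  # any C defines such a scheme
x_k = rng.standard_normal(n)

# The two forms of the update in (6.2.1):
form_a = x_k + C @ (b - A @ x_k)                 # x_k + C r(x_k)
form_b = C @ b + (np.eye(n) - C @ A) @ x_k       # Cb + (I_n - CA) x_k

# Consistency: the exact solution x = A^{-1} b is a fixed point.
x = np.linalg.solve(A, b)
fixed_point_residual = np.linalg.norm((x + C @ (b - A @ x)) - x)
```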
If $C$ is invertible, we can, if we like, write
\[
  C^{-1}(x_{k+1} - x_k) + Ax_k = b.
\]
Motivated by this form, we have the following generalization.

Definition 6.2.4. Let $A \in \mathbb{C}^{n\times n}$ with $\det(A) \neq 0$ and $b \in \mathbb{C}^n$. A scheme of the form
\[
  B_{k+1}(x_{k+1} - x_k) + Ax_k = b,
\]
where $B_{k+1} \in \mathbb{C}^{n\times n}$ is invertible, is called an adaptive two-layer scheme. If $B_{k+1} = B$, where $B$ is invertible and independent of $k$, then the scheme is called a stationary two-layer scheme. If $B_{k+1} = \frac{1}{\alpha_{k+1}} I_n$, where $\alpha_{k+1} \in \mathbb{C}_\star$, then we say that the adaptive two-layer scheme is explicit.

Remark 6.2.5. Consider a stationary two-layer method and assume that $B$ is invertible. Then
\[
  x_{k+1} = x_k + B^{-1}(b - Ax_k). \tag{6.2.2}
\]
From this it follows that, if $\{x_k\}$ converges, then it must converge to $x = A^{-1}b$. Of course, this form is equivalent to (6.2.1) with $C = B^{-1}$. The matrix $B$ in the stationary two-layer method is called the iterator.

Remark 6.2.6. Let us now consider an extreme case, namely $B = A$. In this case we obtain
\[
  x_{k+1} = x_k + B^{-1}(b - Ax_k) = x_k + A^{-1}(b - Ax_k) = A^{-1}b,
\]
that is, we get the exact solution after one step.
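The stationary two-layer scheme (6.2.2) takes only a few lines to implement. The sketch below uses the diagonal of $A$ as the iterator $B$ (a Jacobi-type choice, used here purely as an example); the test matrix is strictly diagonally dominant, for which this choice is known to converge.

```python
import numpy as np

def stationary_iteration(A, b, B, x0, max_iter=200, tol=1e-10):
    """Run x_{k+1} = x_k + B^{-1}(b - A x_k) until the relative residual
    drops below tol.  We solve with B each step instead of forming B^{-1}."""
    x = x0.astype(float).copy()
    for _ in range(max_iter):
        r = b - A @ x                  # residual r(x_k) = b - A x_k
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            break
        x = x + np.linalg.solve(B, r)
    return x

# Strictly diagonally dominant example (values are illustrative).
A = np.array([[4.0, 1.0, 0.0],
              [1.0, 5.0, 2.0],
              [0.0, 2.0, 6.0]])
b = np.array([1.0, 2.0, 3.0])

B = np.diag(np.diag(A))                # Jacobi-type iterator: trivial to invert
x_iter = stationary_iteration(A, b, B, np.zeros(3))
x_exact = np.linalg.solve(A, b)
```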
The choice of an iterator comes with two conflicting requirements:

1. $B$ should be easy/cheap to invert.
2. $B$ should approximate the matrix $A$ well.

In essence, the art of iterative schemes is concerned with finding good iterators.
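The trade-off can be seen by comparing iteration counts for several classical choices of $B$ on one test problem. The matrix and tolerances below are illustrative, and the specific iterators (a scaled identity, the diagonal of $A$, and the lower triangle of $A$) are standard examples, not prescriptions from the text.

```python
import numpy as np

def iteration_count(A, b, B, tol=1e-10, max_iter=10_000):
    """Steps of x_{k+1} = x_k + B^{-1}(b - A x_k) until the relative
    residual ||b - A x_k|| <= tol * ||b||.  Illustrative helper."""
    x = np.zeros_like(b)
    for k in range(max_iter):
        r = b - A @ x
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            return k
        x = x + np.linalg.solve(B, r)
    return max_iter

# Symmetric positive definite, diagonally dominant test matrix.
A = np.array([[4.0, 1.0, 1.0],
              [1.0, 5.0, 2.0],
              [1.0, 2.0, 6.0]])
b = np.ones(3)

n_richardson = iteration_count(A, b, 5.0 * np.eye(3))   # B = 5I: cheapest, crudest
n_jacobi = iteration_count(A, b, np.diag(np.diag(A)))   # B = diag(A): cheap, better
n_gs = iteration_count(A, b, np.tril(A))                # B = lower triangle: costlier, better still
n_exact = iteration_count(A, b, A.copy())               # B = A: full cost, one step
```

The better $B$ approximates $A$, the fewer iterations are needed, at a higher cost per step; $B = A$ recovers Remark 6.2.6.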
Definition 6.2.7. Let $A \in \mathbb{C}^{n\times n}$ be invertible and $b \in \mathbb{C}^n$. Suppose that $x = A^{-1}b$ and consider the stationary two-layer scheme (6.2.2) defined by the invertible matrix $B \in \mathbb{C}^{n\times n}$. The matrix $T := I_n - B^{-1}A$ is called the error transfer matrix and satisfies
\[
  e_{k+1} = T e_k,
\]
where $e_k := x - x_k$ is the error at step $k$.

We need to find conditions on $T$ to guarantee that $\{e_k\}$ converges to zero.

6.3 Contraction Mapping Principle (Optional)
Let us first give a very general convergence theory based on the contraction mapping theorem. This section requires some familiarity with Cauchy sequences and Banach spaces.

Definition 6.3.1 (contraction). Let $X$ be a normed space. The map $T \colon X \to X$ is called a contraction iff there is a real number $q \in (0,1)$ such that
\[
  \|Tx - Ty\|_X \le q \|x - y\|_X \quad \forall x, y \in X.
\]

Definition 6.3.2 (fixed point). Let $T \colon X \to X$. The element $x \in X$ is called a fixed point of $T$ iff $Tx = x$.

Remark 6.3.3. Notice that if $\tilde{x}$ is a fixed point of a stationary two-layer scheme, then
\[
  \tilde{x} = \tilde{x} + B^{-1}(b - A\tilde{x}) \iff A\tilde{x} = b,
\]
i.e., $\tilde{x}$ is the solution to the model problem.

Theorem 6.3.4 (contraction mapping principle). Let $X$ be a complete normed linear space (a Banach space) and $T \colon X \to X$ a contraction. Then it has a unique fixed point $\bar{x}$, which can be found as the limit of the iteration
\[
  x_{k+1} = T x_k.
\]
Moreover, we have the following error estimates:
\[
  \frac{1}{1+q} \|x_{k+1} - x_k\|_X \le \|x_k - \bar{x}\|_X \le \frac{q}{1-q} \|x_k - x_{k-1}\|_X.
\]

Proof. We use the iterative formula given in the statement and note that
\[
  \|x_{k+1} - x_k\|_X = \|Tx_k - Tx_{k-1}\|_X \le q \|x_k - x_{k-1}\|_X \le \cdots \le q^k \|x_1 - x_0\|_X.
\]
This implies that, for $m \ge 2$,
\[
  \|x_{k+m} - x_k\|_X \le q^k \left(q^{m-1} + \cdots + q^2 + q + 1\right) \|x_1 - x_0\|_X.
\]
In addition, since $q \in (0,1)$ we have that
\[
  q^{m-1} + \cdots + q + 1 = \frac{1 - q^m}{1 - q} \le \frac{1}{1 - q},
\]
therefore
\[
  0 \le \|x_{k+m} - x_k\|_X \le \frac{q^k}{1-q} \|x_1 - x_0\|_X \to 0
\]
as $k \to \infty$. This shows that the sequence of iterates is Cauchy, and so it must converge since the space $X$ is complete. Since $T$ is continuous, the limit must be a fixed point.

Let us now show uniqueness. Assuming that $x$ and $y$ are two distinct fixed points, we get
\[
  \|x - y\|_X = \|Tx - Ty\|_X \le q \|x - y\|_X < \|x - y\|_X,
\]
which is a contradiction.

Finally, the error estimates are obtained by recalling that
\[
  \|x_{k+m} - x_k\|_X \le \frac{q^k}{1-q} \|x_1 - x_0\|_X
  \quad \text{and} \quad
  \|x_{k+m} - x_k\|_X \le \frac{q}{1-q} \|x_k - x_{k-1}\|_X.
\]
Passing to the limit $m \to \infty$, this yields
\[
  \|\bar{x} - x_k\|_X \le \frac{q}{1-q} \|x_k - x_{k-1}\|_X.
\]
Notice also that
\[
  \|x_k - x_{k-1}\|_X \le \|x_k - \bar{x}\|_X + \|\bar{x} - x_{k-1}\|_X \le (1 + q) \|\bar{x} - x_{k-1}\|_X,
\]
or
\[
  \frac{1}{1+q} \|x_k - x_{k-1}\|_X \le \|\bar{x} - x_{k-1}\|_X.
\]
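The a posteriori estimate $\|\bar{x} - x_k\|_X \le \frac{q}{1-q}\|x_k - x_{k-1}\|_X$ gives a practical stopping criterion: the unknown distance to the fixed point is controlled by the computable distance between consecutive iterates. A minimal sketch, using the classical scalar contraction $T(x) = \cos(x)$ on $[-1,1]$ (where $q = \sin 1 < 1$) rather than any example from the text:

```python
import math

def fixed_point(T, x0, q, tol=1e-12, max_iter=1000):
    """Iterate x_{k+1} = T(x_k); stop once the a posteriori bound
    q/(1-q) * |x_k - x_{k-1}| certifies |xbar - x_k| <= tol."""
    x_prev = x0
    x = x_prev
    for _ in range(max_iter):
        x = T(x_prev)
        if q / (1.0 - q) * abs(x - x_prev) <= tol:
            break
        x_prev = x
    return x

q = math.sin(1.0)                 # Lipschitz constant of cos on [-1, 1]
xbar = fixed_point(math.cos, 0.5, q)
```

At termination the bound guarantees that `xbar` is within `tol` of the true fixed point.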
An immediate corollary of this is the following.
Corollary 6.3.5 (convergence). If $T$ is the error transfer matrix of a linear, stationary, two-layer iterative scheme and $T$ is a contraction with respect to some vector norm, then the iterative scheme is convergent. In particular, if $\|T\| < 1$ with respect to some induced matrix norm $\|\cdot\| \colon \mathbb{C}^{n\times n} \to \mathbb{R}$, then $T$ is a contraction.

Proof. The first statement is simple. For the second, notice that, if $\|\cdot\| \colon \mathbb{C}^{n\times n} \to \mathbb{R}$ is an induced matrix norm with the property that $\|T\| < 1$, then by consistency of induced norms,
\[
  \|Tx - Ty\| = \|T(x - y)\| \le \|T\| \|x - y\| = q \|x - y\|,
\]
where $q := \|T\| < 1$.

What is the moral of the story? We need to guarantee that $\|T\| < 1$! But, in what norm?
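The question "in what norm?" is not idle. The following sketch (with a matrix chosen purely for illustration) exhibits a $T$ whose spectral radius is below $1$, so the iteration $e_{k+1} = T e_k$ eventually drives the error to zero, even though $\|T\| > 1$ in all of the familiar induced norms; no standard norm certifies the contraction property here, which is exactly what the next section addresses.

```python
import numpy as np

# rho(T) < 1, yet ||T||_1, ||T||_2, ||T||_inf all exceed 1.
T = np.array([[0.9, 0.5],
              [0.0, 0.9]])

rho = max(abs(np.linalg.eigvals(T)))          # spectral radius = 0.9
norm_1 = np.linalg.norm(T, 1)                 # max absolute column sum
norm_2 = np.linalg.norm(T, 2)                 # largest singular value
norm_inf = np.linalg.norm(T, np.inf)          # max absolute row sum

# The error may grow at first, but T^k -> 0 because rho(T) < 1.
e0 = np.array([1.0, 1.0])
e_late = np.linalg.matrix_power(T, 200) @ e0  # error after 200 steps
```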
6.4 Spectral Convergence Theory
In the current section, we come to the same conclusion as in the previous one, but we go a bit deeper. The theory in this section is based on the spectrum $\sigma(T)$ of the error transfer matrix. We start with a very technical result.
Theorem 6.4.1. For every matrix $A \in \mathbb{C}^{n\times n}$ and any $\varepsilon > 0$, there is a vector norm $\|\cdot\|_{A,\varepsilon} \colon \mathbb{C}^n \to \mathbb{R}$ such that
\[
  \|A\|_{A,\varepsilon} \le \rho(A) + \varepsilon,
\]
where, recall, for any $M \in \mathbb{C}^{n\times n}$,
\[
  \|M\|_{A,\varepsilon} = \sup_{x \in \mathbb{C}^n_\star} \frac{\|Mx\|_{A,\varepsilon}}{\|x\|_{A,\varepsilon}} = \sup_{\|x\|_{A,\varepsilon} = 1} \|Mx\|_{A,\varepsilon}
\]
is the induced matrix norm with respect to $\|\cdot\|_{A,\varepsilon}$.

Proof. Appealing to the Schur factorization, Lemma 1.4.11, there is a unitary matrix $P \in \mathbb{C}^{n\times n}$ and an upper triangular matrix $B \in \mathbb{C}^{n\times n}$ such that
\[
  A = P^{\mathsf{H}} B P.
\]
The diagonal elements of $B$ are the eigenvalues of $A$. Let us write
\[
  B = \Lambda + U,
\]
where $\Lambda = \operatorname{diag}(\lambda_1, \ldots, \lambda_n)$, with
\[
  \sigma(A) = \{\lambda_1, \ldots, \lambda_n\} = \sigma(B),
\]
and $U = [u_{i,j}] \in \mathbb{C}^{n\times n}$ is strictly upper triangular. Let $\delta > 0$ be arbitrary, and define
\[
  D_\delta = \operatorname{diag}\left(1, \delta^{-1}, \ldots, \delta^{1-n}\right).
\]
Next, define
\[
  C := D_\delta B D_\delta^{-1} = \Lambda + E,
\]
where
\[
  E = D_\delta U D_\delta^{-1}.
\]
Now, observe that $E$, like $U$, must be strictly upper triangular, and the elements of $E = [e_{i,j}]$ must satisfy
\[
  e_{i,j} =
  \begin{cases}
    0, & j \le i, \\
    \delta^{j-i} u_{i,j}, & j > i.
  \end{cases}
\]
Consequently, the elements of $E$ can be made arbitrarily small in modulus, depending on our choice of $\delta$. Now, notice that
\[
  A = P^{-1} D_\delta^{-1} C D_\delta P,
\]
and, since $D_\delta P$ is non-singular, the following defines a vector norm and an induced matrix norm: for any $x \in \mathbb{C}^n$,
\[
  \|x\|_\star := \|D_\delta P x\|_2,
\]
and, for any $M \in \mathbb{C}^{n\times n}$,
\[
  \|M\|_\star = \sup_{\|x\|_\star = 1} \|Mx\|_\star.
\]
Observe that
\[
  \|Ay\|_\star = \|D_\delta P A y\|_2 = \|C D_\delta P y\|_2.
\]
Define $z := D_\delta P y$. Then
\[
  \|Ay\|_\star = \|Cz\|_2 = \sqrt{z^{\mathsf{H}} C^{\mathsf{H}} C z}.
\]
But
\[
  C^{\mathsf{H}} C = \left(\Lambda^{\mathsf{H}} + E^{\mathsf{H}}\right)(\Lambda + E) = \Lambda^{\mathsf{H}} \Lambda + M(\delta),
\]
where
\[
  M(\delta) := E^{\mathsf{H}} \Lambda + \Lambda^{\mathsf{H}} E + E^{\mathsf{H}} E.
\]
As an exercise, the reader should prove that, for a given matrix $A$, there is a constant $K_1 > 0$ such that
\[
  \|M(\delta)\|_2 \le K_1 \delta
\]
for all $0 <