Chapter 2

Iterative Methods

2.1 Introduction

In this chapter, we will consider three different iterative methods for solving sets of linear equations. First, we consider a series of examples to illustrate iterative methods. To construct an iterative method, we try to re-arrange the system of equations such that we generate a sequence.

2.1.1 Simple Iteration Example

Example 2.1.1: Let us consider the equation

f(x) = x + e^{-x} - 2 = 0 .    (2.1)

[Figure: the curves y = e^{-x} and y = 2 - x intersect at the root x = α.]

When solving an equation such as (2.1) for α

where f(α) = 0 and 0 < α < 2, we can generate a sequence {x^{(k)}}_{k=0}^{∞} from some initial value (guess) x^{(0)} by re-writing the equation as

x = 2 - e^{-x} ,

i.e. by computing x^{(k+1)} = 2 - e^{-x^{(k)}} from some x^{(0)}. If the sequence converges, it will converge to the solution. For example, let us consider x^{(0)} = 1 and x^{(0)} = -1:

k     x^{(k)} (from x^{(0)} = 1)     x^{(k)} (from x^{(0)} = -1)
0     1.0                            -1.0
1     1.63212                        -0.71828
2     1.80449                        -0.05091
3     1.83544                         0.947776
4     1.84046                         1.61240
5     1.84126                         1.80059
6     1.84138                         1.83480
7     1.84140                         1.84124
8     1.84141                         1.84138
9     ...                             ...

In this example, both sequences appear to converge to a value close to the root α =1.84141 where 0 <α< 2. Hence, we have constructed a simple algorithm for solving an equation and it appears to be a robust iterative method.
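As a concrete illustration, here is a minimal Python sketch of this fixed-point iteration (the function name, tolerance and iteration cap are illustrative choices, not part of the original example):

    import math

    def fixed_point(x0, tol=1e-6, max_iter=100):
        """Iterate x_{k+1} = 2 - exp(-x_k) until successive iterates agree to tol."""
        x = x0
        for k in range(max_iter):
            x_new = 2.0 - math.exp(-x)
            if abs(x_new - x) < tol:
                return x_new, k + 1
            x = x_new
        return x, max_iter

    print(fixed_point(1.0))    # converges to approximately 1.84141
    print(fixed_point(-1.0))   # also converges to approximately 1.84141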

However, (2.1) has two solutions: a positive root at 1.84141 and a negative root at -1.14619. Why do we only find one root?

If f(x) = 0 has a solution x = α, then the iteration x^{(k+1)} = g(x^{(k)}) will converge to α, provided |g'(α)| < 1 and x^{(0)} is suitably chosen.

The condition |g'(α)| < 1 is a necessary condition. In the above example,

g(x) = 2 - e^{-x}   and   g'(x) = e^{-x} ,

and

|g'(x)| < 1   if   x > 0 .

So this method can be used to find the positive root of (2.1). However, since |g'(x)| > 1 for x < 0, it will never converge to the negative root. Hence, this kind of approach will not always converge to a solution.

2.1.2 Linear Systems

Let us adopt the same approach for a linear system.

Example 2.1.2:

Consider the following set of linear equations:

10x_1 +  x_2 = 12

 x_1 + 10x_2 = 21

Let us re-write these equations as

x_1 = (12 - x_2)/10
x_2 = (21 - x_1)/10 .

Thus, we can use the following:

x_1^{(k+1)} = 1.2 - x_2^{(k)}/10
x_2^{(k+1)} = 2.1 - x_1^{(k)}/10 ,

to generate a sequence of vectors x^{(k)} = (x_1^{(k)}, x_2^{(k)})^T from some starting vector x^{(0)}.

If

x^{(0)} = (0, 0)^T ,

then

x^{(0)} = (0, 0)^T ,   x^{(1)} = (1.2, 2.1)^T ,   x^{(2)} = (0.99, 1.98)^T ,   x^{(3)} = (1.002, 2.001)^T ,   ...

where

x^{(k)} → (1, 2)^T   as k → ∞ ,

which is indeed the correct answer. So we have generated a convergent sequence.
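A minimal Python sketch of this component-wise iteration (the simultaneous update mirrors the sequence above; the variable names are illustrative):

    # Iterate x1 <- 1.2 - x2/10, x2 <- 2.1 - x1/10 using only the previous values.
    x1, x2 = 0.0, 0.0
    for k in range(10):
        x1, x2 = 1.2 - x2 / 10.0, 2.1 - x1 / 10.0   # simultaneous update
        print(k + 1, x1, x2)
    # the iterates approach (1, 2)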

Let us consider the above set of linear equations again. Possibly the more obvious rearrangement was

x_1 = 21 - 10x_2
x_2 = 12 - 10x_1 .

Thus, we can generate a sequence using:

x_1^{(k+1)} = 21 - 10x_2^{(k)}
x_2^{(k+1)} = 12 - 10x_1^{(k)} .

If we again use

x^{(0)} = (0, 0)^T ,

then

x^{(0)} = (0, 0)^T ,   x^{(1)} = (21, 12)^T ,   x^{(2)} = (-99, -198)^T ,   x^{(3)} = (2001, 1002)^T ,   ...

Clearly, this sequence is not converging! Why?

Example 2.1.3:

Let us consider the above example (2.1.2) again. Can we find a method that allows the system to converge more quickly?

Let us look at the computation more carefully. In the first step, x_1^{(1)} is computed from x_2^{(0)}, and in the second step we compute x_2^{(1)} from x_1^{(0)}.

It seems more natural, from a computational point of view, to use x_1^{(1)} rather than x_1^{(0)} in the second step, i.e. to use the latest available value. In effect, we want to compute the following:

x_1^{(k+1)} = 1.2 - x_2^{(k)}/10
x_2^{(k+1)} = 2.1 - x_1^{(k+1)}/10 ,

which gives

x^{(0)} = (0, 0)^T ,   x^{(1)} = (1.2, 1.98)^T ,   x^{(2)} = (1.002, 1.9998)^T ,   ...   → (1, 2)^T ,

which converges to (1, 2)^T much more rapidly!

In the following sections, we will consider, in general terms, iterative methods for solving a system Ax = b. First, though, we introduce some important results about sequences of vectors.

2.2 Sequences of Vectors

2.2.1 The Limit of a Sequence

Let {x^{(k)}}_{k=0}^{∞} be a sequence in a Vector Space V. How do we know if this sequence has a limit?

First observe that ||x|| = ||y|| does not imply x = y, i.e. two distinct objects in a Vector Space can have the same size. However, from rule 1 for norms (1.1) we know that if ||x - y|| = 0, then x ≡ y.

So if

lim_{k→∞} ||x^{(k)} - x|| = 0 ,

then

lim_{k→∞} x^{(k)} = x .

The vector x is the limit of the sequence.

2.2.2 Convergence of a Sequence

Suppose the sequence {x^{(k)}}_{k=0}^{∞} converges to x, where

x^{(k+1)} = Bx^{(k)} + c .

If x^{(k)} → x as k → ∞, then x satisfies the equation

x = Bx + c ,

and so we have

x^{(k+1)} - x = B(x^{(k)} - x) ,

and thus, taking norms,

||x^{(k+1)} - x|| ≤ ||B|| ||x^{(k)} - x|| .

If ||B|| < 1, then

||x^{(k+1)} - x|| < ||x^{(k)} - x|| ,

i.e. we have a monotonically decreasing sequence; in other words, the error in the approximations decreases.

Say we start from an initial guess, so that x^{(1)} - x = B(x^{(0)} - x). Then

x^{(2)} - x = B(x^{(1)} - x) = B[B(x^{(0)} - x)] = B^2(x^{(0)} - x) ,

and so on, to give

x^{(k)} - x = B^k(x^{(0)} - x) .

Taking norms, and using rule 5 (1.9) for sub-ordinate matrix norms,

||x^{(k)} - x|| ≤ ||B^k|| ||x^{(0)} - x|| ≤ ||B^{k-1}|| ||B|| ||x^{(0)} - x|| ≤ ||B^{k-2}|| ||B||^2 ||x^{(0)} - x|| ≤ ... ≤ ||B||^k ||x^{(0)} - x|| .

If ||B|| < 1, then ||B||^k → 0 as k → ∞ and hence x^{(k)} → x as k → ∞.

Recall that ρ(B) ≤ ||B|| (§1.5), so a necessary condition for convergence is ρ(B) < 1. Furthermore, it is possible to show that

if ρ(B) < 1 , then ||B|| < 1 (in some sub-ordinate norm) ,

and

if ρ(B) > 1 , then ||B|| > 1 ,

although we do not prove these results in this course.

Hence, ρ(B) < 1 is not only a necessary condition, but it is also a sufficient condition.

2.2.3 Spectral radius and rate of convergence

In practice, to compare different methods for solving systems of equations, we are interested in determining the rate of convergence of the method. As we will see below, the spectral radius is a measure of the rate of convergence.

Consider the situation where B, an N × N matrix, has N linearly independent eigenvectors. As before, we have

x^{(k+1)} - x = B(x^{(k)} - x) ,

or, substituting v^{(k)} = x^{(k)} - x, we have

v^{(k+1)} = Bv^{(k)} .

Now write v^{(0)} = Σ_{i=1}^{N} α_i e_i , where the e_i are the eigenvectors (with associated eigenvalues λ_i) of B; then

v^{(1)} = B( Σ_{i=1}^{N} α_i e_i ) = Σ_{i=1}^{N} α_i B e_i = Σ_{i=1}^{N} α_i λ_i e_i ,

v^{(2)} = B( Σ_{i=1}^{N} α_i λ_i e_i ) = Σ_{i=1}^{N} α_i λ_i B e_i = Σ_{i=1}^{N} α_i λ_i^2 e_i ,

and continuing this sequence gives

v^{(k)} = Σ_{i=1}^{N} α_i λ_i^k e_i .

Now suppose |λ_1| > |λ_i| (i = 2, ..., N); then

v^{(k)} = α_1 λ_1^k e_1 + Σ_{i=2}^{N} α_i λ_i^k e_i
        = λ_1^k [ α_1 e_1 + Σ_{i=2}^{N} α_i (λ_i/λ_1)^k e_i ] .

Given that |λ_i/λ_1| < 1, for large k,

v^{(k)} ≃ α_1 λ_1^k e_1 .

Hence, the error associated with x^{(k)}, the kth vector in the sequence, is given by v^{(k)}, which varies as the kth power of the largest eigenvalue. In other words, it varies as the kth power of the spectral radius ρ(B) (= |λ_1|). So the spectral radius is a good indication of the rate of convergence.
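A small numerical check of this, sketched in Python with NumPy (the matrix B used here is the 2 × 2 iteration matrix that reappears in Example 2.3.1; nothing in the sketch is prescribed by the notes):

    import numpy as np

    B = np.array([[ 0.0, -0.1],
                  [-0.1,  0.0]])
    rho = max(abs(np.linalg.eigvals(B)))
    print("spectral radius:", rho)            # 0.1

    # The error v^(k) = B^k v^(0) shrinks by roughly a factor rho each step.
    v = np.array([1.0, 1.0])
    for k in range(5):
        v = B @ v
        print(k + 1, np.linalg.norm(v, np.inf))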

2.2.4 Gerschgorin’s Theorem

The above result means that if we know the magnitude of the largest eigenvalue of the iteration matrix, we can estimate the rate of convergence of a particular method for a system of equations. However, this

requires the magnitudes of all eigenvalues to be known, which would probably have to be determined numerically.

The Gerschgorin Theorem is a surprisingly simple result concerning eigenvalues that allows us to put bounds on the size of the eigenvalues of a matrix without actually finding the eigenvalues themselves.

The equation Ae = λe, where (λ, e) are an eigenvalue, eigenvector pair of the matrix A, can be written in component notation as

Σ_{j=1}^{N} a_{ij} e_j = a_{ii} e_i + Σ_{j=1, j≠i}^{N} a_{ij} e_j = λ e_i .

Rearranging implies

e_i (a_{ii} - λ) = - Σ_{j=1, j≠i}^{N} a_{ij} e_j ,

and thus,

|e_i| |a_{ii} - λ| ≤ Σ_{j=1, j≠i}^{N} |a_{ij}| |e_j| .

Suppose the component of the eigenvector e with the largest absolute value is e_l, such that |e_l| ≥ |e_j| for all j (note that e_l ≠ 0, since e ≠ 0). Then from above

|e_l| |a_{ll} - λ| ≤ Σ_{j=1, j≠l}^{N} |a_{lj}| |e_j| ≤ Σ_{j=1, j≠l}^{N} |a_{lj}| |e_l| ,

so, dividing by |e_l| gives

|a_{ll} - λ| ≤ Σ_{j=1, j≠l}^{N} |a_{lj}| .

Each eigenvalue lies inside a circle with centre a_{ll} and radius Σ_{j=1, j≠l}^{N} |a_{lj}|.

However, we don't know l without finding λ and e.

But we can say that the union of all such circles must contain all the eigenvalues. This is Gerschgorin’s Theorem.

Example 2.2.1: Determine the bounds on the eigenvalues for the matrix

2 1 0 0 −  1 2 1 0  A = − − .    0 1 2 1  − −     0 0 1 2   −    27 Gerschgorin’s Theorem implies that the union of all circles

|a_{ll} - λ| ≤ Σ_{j=1, j≠l}^{N} |a_{lj}|

must contain all the eigenvalues.

For l = 1 and 4 we get the relation |2 - λ| ≤ 1. For l = 2 and 3 we get |2 - λ| ≤ 2. The matrix is symmetric, so the eigenvalues are real, and Gerschgorin's Theorem implies

0 ≤ λ ≤ 4 .

The eigenvalues of A are

λ_1 = 3.618 ,   λ_2 = 2.618 ,   λ_3 = 1.382 ,   and   λ_4 = 0.382 ;

[Figure: the four eigenvalues plotted on the interval 0 ≤ λ ≤ 4.]

hence, the largest eigenvalue is indeed less than 4.
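A short Python sketch of the theorem applied to this matrix (a direct translation of the bound; the function name is an illustrative choice):

    import numpy as np

    def gerschgorin_discs(A):
        """Return (centre, radius) of each Gerschgorin disc of a square matrix A."""
        A = np.asarray(A, dtype=float)
        radii = np.sum(np.abs(A), axis=1) - np.abs(np.diag(A))
        return list(zip(np.diag(A), radii))

    A = np.array([[ 2, -1,  0,  0],
                  [-1,  2, -1,  0],
                  [ 0, -1,  2, -1],
                  [ 0,  0, -1,  2]])
    print(gerschgorin_discs(A))            # discs centred at 2 with radii 1, 2, 2, 1
    print(sorted(np.linalg.eigvalsh(A)))   # approx 0.382, 1.382, 2.618, 3.618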

2.3 The Jacobi Iterative Method

The Jacobi Iterative Method follows the iterative method shown in Example 2.1.2.

Consider the linear system

Ax = b ,   A = [a_{ij}] (an N × N matrix) ,   x = [x_i] ,   b = [b_i] .

Let us try to isolate x_i. The ith equation looks like

Σ_{j=1}^{N} a_{ij} x_j = b_i .

Assuming a_{ii} ≠ 0 for all i, we can re-write this as

a_{ii} x_i = b_i - Σ_{j=1, j≠i}^{N} a_{ij} x_j ,

so,

N 1 xi = bi aij xj  a − ii j=1  Xj=i   6  giving the recurrence relation  

N (k+1) 1 (k) x = bi aij x  , (2.2) i a − j ii j=1  Xj=i   6   28  for each xi (i =1,...,N). This is known as the Jacobi Iterative Method.

In matrix form, we have

A = D - L - U ,    (2.3)

where D is a diagonal matrix with elements a_{ii},

L is a strictly lower triangular matrix, L = [l_{ij}], such that

l_{ij} = -a_{ij} for i > j ,   and   l_{ij} = 0 for i ≤ j ,

and U is a strictly upper triangular matrix, U = [u_{ij}], such that

u_{ij} = -a_{ij} for i < j ,   and   u_{ij} = 0 for i ≥ j .

Dividing each equation by a_{ii} is equivalent to writing

x = D^{-1}(L + U)x + D^{-1}b ,

where the elements of D^{-1} are 1/a_{ii}, so we have pre-multiplied by the inverse of D. Hence, the matrix form of the iterative method (2.2), known as the Jacobi Iterative Method, is

x^{(k+1)} = D^{-1}(L + U)x^{(k)} + D^{-1}b .    (2.4)

The matrix B_J = D^{-1}(L + U) is called the iteration matrix for the Jacobi method.
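A minimal Python sketch of the Jacobi recurrence (2.2)/(2.4) (function name, tolerance and iteration cap are illustrative choices, not part of the notes):

    import numpy as np

    def jacobi(A, b, x0=None, tol=1e-10, max_iter=500):
        """Jacobi iteration x^(k+1) = D^{-1}(L + U)x^(k) + D^{-1}b."""
        A, b = np.asarray(A, float), np.asarray(b, float)
        D = np.diag(A)                 # diagonal entries a_ii
        R = A - np.diagflat(D)         # off-diagonal part of A, equal to -(L + U)
        x = np.zeros_like(b) if x0 is None else np.asarray(x0, float)
        for _ in range(max_iter):
            x_new = (b - R @ x) / D
            if np.linalg.norm(x_new - x, np.inf) < tol:
                return x_new
            x = x_new
        return x

    A = np.array([[10.0, 1.0], [1.0, 10.0]])
    b = np.array([12.0, 21.0])
    print(jacobi(A, b))                # approx [1. 2.]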

2.3.1 Convergence of the Jacobi Iteration Method

From §2.2.2, recall that an iterative method of the form x^{(k+1)} = Bx^{(k)} + c will converge provided ||B|| < 1, and that a necessary and sufficient condition for this to be true is ρ(B) < 1.

Thus, for the Jacobi method, we require ||B_J|| = ||D^{-1}(L + U)|| < 1 for convergence and, hence, ρ(B_J) < 1.

Example 2.3.1: Let us return once more to Example 2.1.2 and recast it in the form of the Jacobi iterative method. The linear system we wish to solve is

Ax = [ 10   1 ] [ x_1 ]  =  [ 12 ]  = b .
     [  1  10 ] [ x_2 ]     [ 21 ]

The first thing we need to do is find D and L + U, where A = D - L - U:

A = [ 10   1 ]   ⟹   D = [ 10   0 ]   and   L + U = [  0  -1 ]
    [  1  10 ]            [  0  10 ]                 [ -1   0 ] ,

hence,

B_J = D^{-1}(L + U) = [    0    -1/10 ]
                      [ -1/10      0  ] .

Now, choosing the matrix norm sub-ordinate to the infinity norm, we find

||B_J||_∞ = 1/10 < 1 .

Alternatively, we can consider the spectral radius of B_J. The eigenvalues of B_J are given by

λ^2 - 1/100 = 0 ,

and so

ρ(B_J) = 1/10 ,

which in this case is equal to ||B_J||.

So if x is the limit of our sequence, then

||x^{(k+1)} - x||_∞ ≤ (1/10) ||x^{(k)} - x||_∞ .

In Example 2.1.2 we had

x^{(0)} = (0, 0)^T   and   x = (1, 2)^T ,

so ||x^{(0)} - x||_∞ = 2 and

||x^{(1)} - x||_∞ ≤ (1/10) × 2 = 0.2 .

Remember,

x^{(1)} = (1.2, 2.1)^T ,

so,

x^{(1)} - x = (0.2, 0.1)^T ,

and indeed, ||x^{(1)} - x||_∞ ≤ 0.2 .

Since the size of ρ(B_J) is an indication of the rate of convergence, we see here that this system converges at a rate of ρ(B_J) = 0.1. The smaller the spectral radius, the more rapid the convergence. So is it possible to modify this method to make it faster?

2.4 The Gauss-Seidel Iterative Method

To produce a faster iterative method we amend the Jacobi Method to make use of the new values as they become available (e.g. as in Example 2.1.3).

Expanding out the Jacobi Method (2.4) we have

x^{(k+1)} = D^{-1}(L + U)x^{(k)} + D^{-1}b

= D^{-1}Lx^{(k)} + D^{-1}Ux^{(k)} + D^{-1}b .

Here D^{-1}L is a lower triangular matrix, so the ith row of D^{-1}Lx^{(k)} contains the values

x_1^{(k)} , x_2^{(k)} , x_3^{(k)} , ... , x_{i-1}^{(k)} .

(components up to, but not including the diagonal).

Likewise, D^{-1}U is an upper triangular matrix, so the ith row contains

x_{i+1}^{(k)} , x_{i+2}^{(k)} , ... , x_N^{(k)} .

If we compute the x_i^{(k+1)}'s in order of increasing i (i.e. from the top of the vector to the bottom), then when computing x_i^{(k+1)} we have available

x_1^{(k+1)} , x_2^{(k+1)} , ... , x_{i-1}^{(k+1)} .

Hence, a more efficient version of the Jacobi Method is to compute (in the order of increasing i)

x^{(k+1)} = D^{-1}Lx^{(k+1)} + D^{-1}Ux^{(k)} + D^{-1}b .

This is equivalent to finding x^{(k+1)} from

(I - D^{-1}L)x^{(k+1)} = D^{-1}Ux^{(k)} + D^{-1}b ,

or,

x^{(k+1)} = (I - D^{-1}L)^{-1}D^{-1}Ux^{(k)} + (I - D^{-1}L)^{-1}D^{-1}b .

This is known as the Gauss-Seidel Iterative Method.

The iteration matrix becomes

B_GS = (I - D^{-1}L)^{-1}D^{-1}U
     = [D(I - D^{-1}L)]^{-1}U
     = (D - L)^{-1}U .

The way of deriving the Gauss-Seidel method formally is as follows:

A = D - L - U ,

so Ax = b becomes

(D - L)x = Ux + b ,

and hence,

x = (D - L)^{-1}Ux + (D - L)^{-1}b ,

generating the recurrence relation

x^{(k+1)} = (D - L)^{-1}Ux^{(k)} + (D - L)^{-1}b .    (2.5)

The iteration matrix for the Gauss-Seidel method is given by B_GS = (D - L)^{-1}U. Thus, for convergence (from §2.2.2), we require that

||B_GS|| = ||(D - L)^{-1}U|| < 1 .
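A minimal Python sketch of the Gauss-Seidel sweep, written component-wise so that each x_i is overwritten with the latest available values (names and defaults are illustrative choices):

    import numpy as np

    def gauss_seidel(A, b, x0=None, tol=1e-10, max_iter=500):
        """Gauss-Seidel iteration, equivalent to (2.5)."""
        A, b = np.asarray(A, float), np.asarray(b, float)
        n = len(b)
        x = np.zeros(n) if x0 is None else np.asarray(x0, float).copy()
        for _ in range(max_iter):
            x_old = x.copy()
            for i in range(n):
                # x[:i] holds values already updated in this sweep, x[i+1:] the old ones
                x[i] = (b[i] - A[i, :i] @ x[:i] - A[i, i+1:] @ x[i+1:]) / A[i, i]
            if np.linalg.norm(x - x_old, np.inf) < tol:
                break
        return x

    A = np.array([[10.0, 1.0], [1.0, 10.0]])
    b = np.array([12.0, 21.0])
    print(gauss_seidel(A, b))          # approx [1. 2.]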

Example 2.4.1: Again we reconsider the linear system used in Examples (2.1.2, 2.1.3 & 2.3.1) and recast it in the form of the Gauss-Seidel Method:

10 1 A = ,  1 10   and since A = D L U, we have − − 10 0 0 1 D L = and U = − . −  1 10 0 0      Then

(D - L)^{-1} = [  1/10     0   ]
               [ -1/100   1/10 ] ,

so

(D - L)^{-1}U = [  1/10     0   ] [ 0  -1 ]
                [ -1/100   1/10 ] [ 0   0 ] ,

and thus the Gauss-Seidel iteration matrix is

B_GS = (D - L)^{-1}U = [ 0   -1/10 ]
                       [ 0   1/100 ] .

Clearly, the norm of the iteration matrix is

||B_GS||_∞ = ||(D - L)^{-1}U||_∞ = 1/10 < 1 ,

and hence, the method will converge for this example.

Let us look at the eigenvalues to get a feel for the rate of convergence. The eigenvalues are given by

det [ -λ       -1/10    ]  = 0 ,
    [  0    1/100 - λ   ]

or,

λ (λ - 1/100) = 0 ,

so we have

λ = 0   or   λ = 1/100 ,

and hence,

ρ(B_GS) = ρ((D - L)^{-1}U) = 1/100 .

Observe that in this example, even though ||B_GS|| = ||B_J||, we have ρ(B_GS) = [ρ(B_J)]^2 (cf. Example 2.3.1), implying that Gauss-Seidel converges twice as fast as Jacobi.

2.5 The Successive Over Relaxation Iterative Method

The third iterative method we will consider is a method which accelerates the Gauss-Seidel method.

Consider the system Ax = b, with A = D - L - U as before. When trying to solve Ax = b, we obtain an approximate solution x^{(k)} to the true solution x. The quantity r^{(k)} = b - Ax^{(k)} is called the residual, and it is a measure of the accuracy of x^{(k)}. Clearly, we would like the residual r^{(k)} to be as small as possible for each approximate solution x^{(k)}.

Now remember, when calculating the ith component of x^{(k+1)}, the components x_1^{(k+1)}, ..., x_{i-1}^{(k+1)} are already known. So, in the Gauss-Seidel iterative method, the residual vector for the most recent approximation is given by

r^{(k)} = b - Dx^{(k)} + Lx^{(k+1)} + Ux^{(k)} .

Ultimately, we wish to make x - x^{(k)} as small as possible. However, as we don't know x yet, we instead consider x^{(k+1)} - x^{(k)} as a measure of x - x^{(k)}. We now wish to calculate x^{(k+1)} such that

D(x^{(k+1)} - x^{(k)}) = ω(b - Dx^{(k)} + Lx^{(k+1)} + Ux^{(k)}) ,

where ω is called the relaxation parameter. Re-arranging, we get

(D - ωL)x^{(k+1)} = ((1 - ω)D + ωU)x^{(k)} + ωb ,

and hence, the recurrence relation is given by

x^{(k+1)} = (D - ωL)^{-1}((1 - ω)D + ωU)x^{(k)} + (D - ωL)^{-1}ωb .    (2.6)

The process of reducing residuals at each stage is called “Successive Relaxation”. If 0 < ω < 1, the iterative method is known as “Successive Under-Relaxation”; such schemes can be used to obtain convergence when the Gauss-Seidel scheme is not convergent. For choices of ω > 1 the scheme

is a “Successive Over-Relaxation” and is used to accelerate convergent Gauss-Seidel iterations. Note, ω = 1 is simply the Gauss-Seidel Iterative Method.

The iteration matrix for the S.O.R. method (Successive Over-Relaxation, with ω > 1) is given by

B_SOR = (D - ωL)^{-1}[(1 - ω)D + ωU] .

The iteration matrix B_SOR can be derived by splitting A in the following way:

A = D - L - U = (1/ω)D + (1 - 1/ω)D - L - U ,   ω > 0 .

Thus Ax = b can be written as

((1/ω)D - L)x = ((1/ω - 1)D + U)x + b ,

(D - ωL)x = ((1 - ω)D + ωU)x + ωb ,

so,

B_SOR = (D - ωL)^{-1}[(1 - ω)D + ωU] .

The aim is to choose ω such that the rate of convergence is maximised, that is the spectral radius,

ρ(B_SOR(ω)), is minimised. How do we find the value of ω that does this? There is no complete answer for general N × N systems, but it is known that if a_{ii} ≠ 0 for each 1 ≤ i ≤ N, then

ρ(B_SOR) ≥ |1 - ω| .

This means that for convergence we must have 0 < ω < 2.
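A minimal Python sketch of the SOR sweep (2.6) in component form, where each Gauss-Seidel update is relaxed by ω (names and defaults are illustrative choices):

    import numpy as np

    def sor(A, b, omega, x0=None, tol=1e-10, max_iter=500):
        """SOR iteration: each Gauss-Seidel update is relaxed by the factor omega."""
        A, b = np.asarray(A, float), np.asarray(b, float)
        n = len(b)
        x = np.zeros(n) if x0 is None else np.asarray(x0, float).copy()
        for _ in range(max_iter):
            x_old = x.copy()
            for i in range(n):
                gs = (b[i] - A[i, :i] @ x[:i] - A[i, i+1:] @ x[i+1:]) / A[i, i]
                x[i] = (1.0 - omega) * x[i] + omega * gs   # omega = 1 recovers Gauss-Seidel
            if np.linalg.norm(x - x_old, np.inf) < tol:
                break
        return x

    A = np.array([[10.0, 1.0], [1.0, 10.0]])
    b = np.array([12.0, 21.0])
    print(sor(A, b, omega=1.0025126))  # approx [1. 2.]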

Example 2.5.1: We return once more to the linear system considered throughout this chapter in Examples (2.1.2, 2.1.3, 2.3.1 & 2.4.1) and recast it here in terms of the SOR iterative method. Recall,

A = [ 10   1 ]
    [  1  10 ] ,

and A = D - L - U, such that

(1 - ω)D + ωU = (1 - ω)[ 10   0 ] + ω[ 0  -1 ]  =  [ 10(1 - ω)      -ω      ]
                       [  0  10 ]    [ 0   0 ]     [     0       10(1 - ω) ] ,

and

D - ωL = [ 10   0 ] - ω[  0   0 ]  =  [ 10    0 ]
         [  0  10 ]    [ -1   0 ]     [  ω   10 ] .

Now

(D - ωL)^{-1} = [   1/10      0   ]
                [ -ω/100    1/10  ] ,

thus the iteration matrix

B_SOR = (D - ωL)^{-1}[(1 - ω)D + ωU] = [     1 - ω             -ω/10         ]
                                       [ -ω(1 - ω)/10    ω^2/100 + 1 - ω     ] .

The eigenvalues of this matrix are given by

[(1 - ω) - λ][ω^2/100 + 1 - ω - λ] - ω^2(1 - ω)/100 = 0 ,

λ^2 - λ[(1 - ω) + ω^2/100 + 1 - ω] + (1 - ω)[ω^2/100 + 1 - ω] - ω^2(1 - ω)/100 = 0 ,

λ^2 - λ[2(1 - ω) + ω^2/100] + (1 - ω)^2 = 0 .

Solving this quadratic for λ gives

λ = (1/2){ 2(1 - ω) + ω^2/100 ± [ 4(1 - ω)^2 + 4(1 - ω)ω^2/100 + ω^4/10^4 - 4(1 - ω)^2 ]^{1/2} }

  = (1 - ω) + ω^2/200 ± (1/2)[ 4(1 - ω)ω^2/100 + ω^4/10^4 ]^{1/2}

  = (1 - ω) + ω^2/200 ± (ω/20)[ 4(1 - ω) + ω^2/100 ]^{1/2} .

When ω = 1 (the Gauss-Seidel Method), one root is 0 and the other is 1/100. Changing ω changes these roots. Suppose we select ω such that

4(1 - ω) + ω^2/100 = 0 ,

so that there are equal roots to the equation. Then this implies

ω^2/200 = -2(1 - ω)   and   λ = ω - 1 .

The smallest value of ω (ω > 1) producing equal roots is

ω = 1.002512579 ,

which is not very different (ω ≈ 1) to Gauss-Seidel!

However, the spectral radius of the SOR iteration matrix is just

ρ(B_SOR) = 0.002512579 , compared with ρ(B_GS) = 0.01.

ρ(B) is very sensitive to ω. If you can ‘hit’ the right value, the improvement in speed of convergence of the iteration method is significant.
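This sensitivity can be seen numerically with a short sketch that builds B_SOR(ω) directly from its definition and scans a few values of ω (purely illustrative; not part of the notes):

    import numpy as np

    A = np.array([[10.0, 1.0], [1.0, 10.0]])
    D = np.diagflat(np.diag(A))
    L = -np.tril(A, -1)               # A = D - L - U
    U = -np.triu(A, 1)

    def rho_sor(omega):
        """Spectral radius of the SOR iteration matrix for this A."""
        B = np.linalg.inv(D - omega * L) @ ((1 - omega) * D + omega * U)
        return max(abs(np.linalg.eigvals(B)))

    for omega in (1.0, 1.001, 1.0025126, 1.01, 1.1):
        print(omega, rho_sor(omega))
    # the minimum is near omega = 1.0025, where rho(B_SOR) drops to about 0.0025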

Although this example is only a 2 × 2 matrix, the comments apply in general. For a larger set of equations, convergence of Gauss-Seidel can be slow, and SOR with an optimum value of ω (if it can be found) can be a major improvement.

2.6 Convergence of the SOR Method for Consistently Ordered Matrices

In general, it is not easy to find an appropriate ω for the SOR method, and so an ω is usually chosen which lies in the range 1 < ω < 2 and leads to a spectral radius ρ(B_SOR) which is as small as reasonably possible. However, there is a set of matrices for which it is relatively easy to find the optimum ω.

Consider the linear system Ax = b and let A = D - L - U. If the eigenvalues of

αD^{-1}L + (1/α)D^{-1}U ,   α ≠ 0 ,

are independent of α, then the matrix is said to be Consistently Ordered, and the optimum ω for the SOR iterative method is

ω = 2 / (1 + √(1 - ρ^2(B_J))) .

Explanation

First, we note that for such a matrix, being consistently ordered (the eigenvalues are the same for all α) implies that the eigenvalues of

αD^{-1}L + (1/α)D^{-1}U

are the same as those of D^{-1}L + D^{-1}U = B_J , the Jacobi iteration matrix (i.e. put α = 1).

Now consider the eigenvalues of B_SOR. They satisfy the polynomial

det(B_SOR - λI) = 0 ,

or

det[ (D - ωL)^{-1}((1 - ω)D + ωU) - λI ] = 0 ,

and hence,

det[(D - ωL)^{-1}] det[ (1 - ω)D + ωU - λ(D - ωL) ] = 0 ,

where the first factor, det[(D - ωL)^{-1}], is non-zero, so the λ satisfy

det[ (1 - ω - λ)D + ωU + λωL ] = 0 .

Since ω ≠ 0, the non-zero eigenvalues satisfy

det[ ((1 - ω - λ)/(ω√λ))D + (1/√λ)U + √λ L ] = 0 ,

and thus,

det[ √λ D^{-1}L + (1/√λ)D^{-1}U - ((λ + ω - 1)/(ω√λ))I ] = 0 .

When consistently ordered, the eigenvalues of

√λ D^{-1}L + (1/√λ)D^{-1}U

are the same as those of D^{-1}(L + U) = B_J .

Let the eigenvalues of B_J be µ; then the non-zero eigenvalues of B_SOR satisfy

µ = (λ + ω - 1)/(ω√λ) .

If we put ω = 1 (i.e. recover Gauss-Seidel), then µ = λ/√λ = √λ, or λ = µ^2. (Recall Example 2.4.1, where this result was also found.)

For ω ≠ 0, we have

µ^2 ω^2 λ = λ^2 + 2λ(ω - 1) + (ω - 1)^2 ,

or,

λ^2 + λ(2ω - 2 - µ^2 ω^2) + (ω - 1)^2 = 0 .

The eigenvalues λ of B_SOR are then given by

λ = -(ω - 1) + µ^2 ω^2 / 2 ± (1/2)[ 4(ω - 1)^2 - 4(ω - 1)µ^2 ω^2 + µ^4 ω^4 - 4(ω - 1)^2 ]^{1/2}

  = 1 - ω + µ^2 ω^2 / 2 ± [ (1 - ω)µ^2 ω^2 + µ^4 ω^4 / 4 ]^{1/2}

  = 1 - ω + µ^2 ω^2 / 2 ± µω[ (1 - ω) + µ^2 ω^2 / 4 ]^{1/2} .

For each µ^2 there are 2 values of λ; these may be real or complex. If complex (note ω > 1), then

the two roots form a complex-conjugate pair with product λλ̄ = (ω - 1)^2, so |λ| = ω - 1 .

Hence, ρ(B_SOR) = ω - 1 .

For the fastest convergence we require ρ(B_SOR) to be as small as possible. It can be shown that the best outcome is to make the roots for B_SOR equal when µ = ρ(B_J), i.e. when µ is largest. This implies

µ^2 ω^2 / 4 - ω + 1 = 0 .

Solving for ω yields

ω = (1 ± √(1 - µ^2)) / (µ^2 / 2)
  = (2/µ^2) (1 - (1 - µ^2)) / (1 ∓ √(1 - µ^2))
  = 2 / (1 ∓ √(1 - µ^2)) .

We are looking for the smallest value of ω, and so we take the positive root of the above equation.

Hence, with µ = ρ(B_J), the best possible choice for ω is

ω = 2 / (1 + √(1 - (ρ(B_J))^2)) .

Example 2.6.1: We again return to the linear system of Examples (2.1.2, 2.1.3, 2.3.1 & 2.4.1), show that its matrix is consistently ordered, and determine the optimum ω, and hence the fastest rate of convergence, for the SOR method.

As before, we have

A = [ 10   1 ]
    [  1  10 ] ,

then

αD^{-1}L + (1/α)D^{-1}U = [    0      -1/(10α) ]
                          [ -α/10         0    ] ,

and the eigenvalues are given by

λ^2 - (-1/(10α))(-α/10) = 0 ,   so   λ^2 = 1/100 ,

and hence, the matrix is consistently ordered.

Then, by applying the above formula and recalling that ρ(B_J) = 1/10 (Example 2.3.1), we have

ω = 2 / (1 + √(1 - (ρ(B_J))^2))
  = 2 / (1 + √(1 - 1/100))
  = 1.0025126 .

This is essentially the same value as we found in Example 2.5.1.

Thus the fastest rate of convergence for this particular system is

ρ(B_SOR) = 0.0025126 ,

as shown in Example 2.5.1.
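A short sketch of this formula in Python, computing ρ(B_J) numerically and then the optimum ω (the names are illustrative choices, not part of the notes):

    import numpy as np

    A = np.array([[10.0, 1.0], [1.0, 10.0]])
    D = np.diagflat(np.diag(A))
    B_J = np.linalg.inv(D) @ (D - A)          # D^{-1}(L + U), since L + U = D - A
    rho_J = max(abs(np.linalg.eigvals(B_J)))  # 0.1 for this system

    omega = 2.0 / (1.0 + np.sqrt(1.0 - rho_J**2))
    print(rho_J, omega, omega - 1.0)          # 0.1, 1.0025126..., rho(B_SOR) = omega - 1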
