
Atkinson chapter 8. Numerical Methods in Matrix Computations (NMMC) by Björck is a valuable reference, as is Matrix Computations by Golub and Van Loan (GvL).

Sensitivity of Linear Systems (Atkinson §8.4, NMMC §1.2.7, & GvL §2.6)

1. Before analyzing any particular algorithm for solving linear systems, we will analyze sensitivity to errors in the inputs: the coefficient matrix and the RHS. These are forward error bounds. Later on we will see that Gaussian elimination (one particular algorithm) in finite-precision floating-point arithmetic finds the exact answer to a perturbed problem (a backward error bound). Those results combined with these will together give forward error bounds on the solution computed via Gaussian elimination in finite precision.

2. Sensitivity to the RHS (Atkinson §8.4). Let $Ax = b$ where $A$ is nonsingular. What happens if you perturb the RHS, e.g. due to roundoff, and then solve the perturbed system exactly?
\[
A x_\delta = b + \delta b, \qquad
x_\delta = A^{-1} b + A^{-1}(\delta b) = x + A^{-1}(\delta b), \qquad
\|x_\delta - x\| = \|A^{-1}(\delta b)\|.
\]
We want the relative error, so divide both sides by $\|x\|$:
\[
\frac{\|x_\delta - x\|}{\|x\|} = \frac{\|A\|\,\|A^{-1}(\delta b)\|}{\|A\|\,\|x\|} \le \kappa(A)\,\frac{\|\delta b\|}{\|b\|}.
\]
The condition number is $\kappa(A) = \|A\|\,\|A^{-1}\|$. Condition numbers are $\ge 1$; e.g. for the 2-norm the condition number is $\sigma_{\max}/\sigma_{\min}$. If the system is ill-conditioned, then a small perturbation to the RHS can lead to large changes in the solution. For example, with $0 < \epsilon \ll 1$,
\[
A = \begin{pmatrix} 1 & 0 \\ 0 & \epsilon \end{pmatrix}, \qquad b = (1, 0)^T, \qquad \delta b = (0, \delta)^T.
\]
The unperturbed solution is $(1, 0)^T$. If $\delta$ is small then the perturbation is small, but the perturbed solution is $(1, \delta/\epsilon)^T$, so if $\delta \gg \epsilon$ then the perturbed solution is very different.

3. Perturbing both: non-rigorous asymptotic analysis from GvL §2.6.2. Let $A$ be nonsingular and consider
\[
(A + \epsilon F)\, y(\epsilon) = b + \epsilon f.
\]
Statement without proof: $y(\epsilon)$ is a continuously differentiable function of $\epsilon$ on some nontrivial interval around $\epsilon = 0$. Now differentiate with respect to $\epsilon$ (`implicit differentiation'):
\[
A \dot y(\epsilon) + F y(\epsilon) + \epsilon F \dot y(\epsilon) = f.
\]
Now evaluate at $\epsilon = 0$, and define the notation $y(0) = x$:
\[
A \dot y(0) + F x = f \;\Rightarrow\; \dot y(0) = A^{-1}(f - Fx).
\]
Since $y(\epsilon)$ is continuously differentiable at $\epsilon = 0$ it has a Taylor expansion of the form
\[
y(\epsilon) = x + \epsilon\, \dot y(0) + o(\epsilon).
\]
(The `little-o' notation $y = o(\epsilon)$ means that $\lim_{\epsilon\to 0} y/\epsilon = 0$. The `big-O' notation $y = O(\epsilon)$ means that $\lim_{\epsilon\to 0} y/\epsilon$ exists and is neither infinite nor zero.)

So we have the following:
\[
\frac{\|y(\epsilon) - x\|}{\|x\|} = \epsilon\, \frac{\|A^{-1}(f - Fx)\|}{\|x\|} + o(\epsilon).
\]
We'll use the bound
\[
\frac{\|A^{-1}(f - Fx)\|}{\|x\|} \le \|A^{-1}\| \left( \frac{\|f\|}{\|x\|} + \|F\| \right).
\]
Note that
\[
Ax = b \;\Rightarrow\; \|A\|\,\|x\| \ge \|b\| \;\Rightarrow\; \frac{1}{\|x\|} \le \frac{\|A\|}{\|b\|}.
\]
Using this above we obtain
\[
\frac{\|y(\epsilon) - x\|}{\|x\|} \le \|A^{-1}\|\,\|A\|\, \epsilon \left( \frac{\|f\|}{\|b\|} + \frac{\|F\|}{\|A\|} \right) + o(\epsilon).
\]
This says that the relative error is proportional to the condition number times the relative size of the perturbations in the RHS and in the coefficient matrix. The above derivation (though correct) is not completely rigorous.

4. Theorem 8.4 in Atkinson is not very good. We will instead prove a bound similar to the above rigorously. To prove a rigorous bound we need a lemma (see also Atkinson Theorems 7.10 & 7.11). Note that $\|F\| < 1$ (in any operator norm) implies that all eigenvalues of $F$ are less than 1 in magnitude. If all eigenvalues of $F$ are less than 1 in magnitude, then $I - F$ is nonsingular (none of its eigenvalues are 0). Assuming $\|F\| < 1$, note the following:
\[
I - F^{N+1} = (I + F + \cdots + F^N) - F(I + F + \cdots + F^N) = \left( \sum_{k=0}^{N} F^k \right)(I - F).
\]
Multiply from the right by $(I - F)^{-1}$:
\[
(I - F^{N+1})(I - F)^{-1} = \sum_{k=0}^{N} F^k.
\]
Take the limit as $N \to \infty$ and recall that $\|F\| < 1$ implies $F^N \to 0$:
\[
(I - F)^{-1} = \sum_{k=0}^{\infty} F^k.
\]
Note the similarity to the Taylor series for $1/(1 - x)$.
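As a quick numerical sanity check of this identity (my own addition, not part of the notes; it assumes NumPy is available), the following sketch builds a matrix $F$ with $\|F\|_2 < 1$ and verifies that the partial sums $\sum_{k=0}^{N} F^k$ converge to $(I - F)^{-1}$:

```python
import numpy as np

# Sketch: check that ||F|| < 1 implies (I - F)^{-1} = sum_{k>=0} F^k.
# F is an arbitrary random example, rescaled so that ||F||_2 = 0.5 < 1.
rng = np.random.default_rng(0)
F = rng.standard_normal((5, 5))
F *= 0.5 / np.linalg.norm(F, 2)

n = F.shape[0]
inv = np.linalg.inv(np.eye(n) - F)   # (I - F)^{-1} computed directly

partial = np.zeros_like(F)           # partial sum  sum_{k=0}^{N} F^k
Fk = np.eye(n)                       # current power F^k, starting at F^0 = I
for N in range(60):
    partial += Fk
    Fk = Fk @ F

print(np.linalg.norm(inv - partial, 2))   # error ~ ||F||^{N+1}/(1-||F||), i.e. tiny
```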
Return to fixed $N$:
\[
\left\| (I - F^{N+1})(I - F)^{-1} \right\| \le \sum_{k=0}^{N} \|F\|^k.
\]
Now take limits and use the geometric series:
\[
\|(I - F)^{-1}\| \le \frac{1}{1 - \|F\|}.
\]
Finally, consider
\[
(I - F)^{-1} - I = \sum_{k=0}^{\infty} F^k - I = \sum_{k=1}^{\infty} F^k = F \sum_{k=0}^{\infty} F^k = F (I - F)^{-1}.
\]
This implies that
\[
\|(I - F)^{-1} - I\| = \|F (I - F)^{-1}\| \le \frac{\|F\|}{1 - \|F\|}.
\]

5. Now return to a rigorous proof. Write $A + E = (I - F)A$ where $F = -E A^{-1}$. Assume that $\|E A^{-1}\| < 1$ so that $A + E$ is nonsingular. With
\[
(A + E) y = b + \delta b, \qquad A x = b,
\]
we have
\[
(I - F) A y = b + \delta b
\quad\Rightarrow\quad
y = A^{-1} (I - F)^{-1} (b + \delta b),
\]
\[
y - x = A^{-1} \left[ \left( (I - F)^{-1} - I \right) b + (I - F)^{-1} \delta b \right],
\]
\[
\|y - x\| \le \|A^{-1}\| \left( \frac{\|F\|}{1 - \|F\|}\, \|b\| + \frac{1}{1 - \|F\|}\, \|\delta b\| \right).
\]
Divide by $\|x\|$ and multiply the RHS by $\|A\| / \|A\|$:
\[
\frac{\|y - x\|}{\|x\|} \le \kappa(A) \left( \frac{\|F\|}{1 - \|F\|}\, \frac{\|b\|}{\|A\|\,\|x\|} + \frac{1}{1 - \|F\|}\, \frac{\|\delta b\|}{\|A\|\,\|x\|} \right)
\le \kappa(A) \left( \frac{\|F\|}{1 - \|F\|} + \frac{1}{1 - \|F\|}\, \frac{\|\delta b\|}{\|b\|} \right).
\]
Now let $\epsilon = \max\{ \|F\|,\; \|\delta b\| / \|b\| \}$ (assumed $< 1$):
\[
\frac{\|y - x\|}{\|x\|} \le \frac{2\, \kappa(A)\, \epsilon}{1 - \epsilon}.
\]
This relies on the fact that $x/(1-x)$ and $1/(1-x)$ are increasing on $x \in (0, 1)$. The above result depends on $\|F\|$ while the asymptotic result depended on $\|E\| / \|A\|$. You can make the connection using $FA = -E \Rightarrow \|F\|\,\|A\| \ge \|E\| \Rightarrow \|F\| \ge \|E\| / \|A\|$.

In summary, if the system is ill-conditioned then small perturbations in the matrix or RHS can lead to large changes in the solution. A large perturbation in the matrix or RHS will lead to large changes in the solution even when the system is well-conditioned.

6. Preconditioning. Suppose that $A$, $M$, $M_l$, and $M_r$ are all invertible matrices. The following systems all have the same solution $x$:
• Standard: $Ax = b$
• Left-preconditioned: $MAx = Mb$
• Right-preconditioned: $AMy = b$ and $My = x$
• Split-preconditioned: $M_l A M_r y = M_l b$ and $M_r y = x$.
For the left- and right-preconditioned systems, if $M \approx A^{-1}$ then the preconditioned system has the same solution as the original system but a better condition number. For the split-preconditioned system, if $A = LU$ and $M_l \approx L^{-1}$ and $M_r \approx U^{-1}$ then the preconditioned system has a better condition number. We won't discuss methods for constructing preconditioners; take the numerical linear algebra course.

Gaussian Elimination & LU Factorization (Atkinson §8.1–8.4, NMMC §1.2, & GvL Chapter 3)

1. Gaussian elimination with partial pivoting is an algorithm that is guaranteed to compute the exact solution (in the absence of roundoff errors) to a nonsingular linear system in a finite number of steps. Any algorithm that computes an exact solution in a finite number of steps is a `direct method.'

Form the augmented matrix $[A \,|\, b]$. Then:

For $k = 1 : n-1$
• Swap row $k$ with the row that has the largest element (in absolute value) in the $k$th column on or below the diagonal.
• Add multiples of row $k$ to rows $k+1 : n$ to set all elements below the $k$th diagonal entry to 0.

The augmented matrix will now have the form $[U \,|\, c]$. Set $x_n = c_n / u_{n,n}$.

For $k = n-1 : -1 : 1$
• Set $x_k = \left( c_k - \sum_{i=k+1}^{n} u_{k,i}\, x_i \right) / u_{k,k}$.

The first loop is the forward solve, the second loop is the back solve. In the forward solve, if there's a zero on the diagonal you have to pivot, and you can pivot in any way that produces a nonzero on the diagonal. In finite-precision arithmetic the probability of getting a hard zero is small anyway, so that's not really why we pivot. Instead, the strategy that you use for pivoting has an impact on the magnitude of the roundoff error; we pivot to minimize roundoff errors. I have specified a particular pivoting strategy: pivoting the largest element (in absolute value) to the diagonal. We will later quote an error bound that depends on this particular strategy.
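To make the two loops concrete, here is a short NumPy sketch of the algorithm as stated above (my own illustration, not from the notes): forward elimination with partial pivoting on the augmented matrix, followed by the back solve.

```python
import numpy as np

def solve_gepp(A, b):
    """Gaussian elimination with partial pivoting on [A | b], then back solve."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    n = A.shape[0]
    aug = np.hstack([A, b.reshape(-1, 1)])       # augmented matrix [A | b]

    # Forward loop: pivot the largest |entry| on or below the diagonal
    # of column k up to the diagonal, then eliminate below it.
    for k in range(n - 1):
        p = k + np.argmax(np.abs(aug[k:, k]))    # pivot row
        aug[[k, p]] = aug[[p, k]]                # swap rows k and p
        for i in range(k + 1, n):
            m = aug[i, k] / aug[k, k]            # multiplier
            aug[i, k:] -= m * aug[k, k:]         # add multiple of row k to row i

    # Back solve on [U | c].
    U, c = aug[:, :n], aug[:, n]
    x = np.zeros(n)
    x[n - 1] = c[n - 1] / U[n - 1, n - 1]
    for k in range(n - 2, -1, -1):
        x[k] = (c[k] - U[k, k + 1:] @ x[k + 1:]) / U[k, k]
    return x

# Example: compare against NumPy's solver on a random system.
rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4))
b = rng.standard_normal(4)
print(np.max(np.abs(solve_gepp(A, b) - np.linalg.solve(A, b))))   # ~ 1e-16
```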
An even better strategy from the perspective of roundoff errors is to pivot the rows and columns so that the largest element in the trailing submatrix gets moved to the diagonal. This is called `complete' pivoting and leads to a better roundoff error bound, but it is also more costly.

`Gauss-Jordan' elimination is a variant where the second loop is like the first: you eliminate elements above the diagonal (without pivoting). After the second loop the augmented matrix has the form $[I \,|\, x]$. Gauss-Jordan is more expensive.

2. Gaussian elimination & LU factorization. If you solve $Ax = b$ multiple times with different right-hand sides, the `forward solve' part of the loop is the same every time. You will always arrive at an augmented matrix of the form $[U \,|\, c]$; the upper-triangular matrix $U$ will always be the same, and only the vector $c$ will depend on the right-hand side $b$. So it's wasteful to redo the whole forward loop each time. Instead, you should save some record of the operations (pivoting, multiplying and adding rows) that occur during the forward loop. If you have to solve the system again, you just re-apply the same sequence of operations to the new RHS vector $b$ to get the new $c$, then proceed to the back solve.

It turns out that storing the steps in the forward loop of Gaussian elimination shows that any nonsingular matrix has a permuted LU factorization $PA = LU$, where $P$ is a permutation matrix, $L$ is lower-triangular with 1's on the diagonal (`special lower triangular'), and $U$ is upper-triangular. Writing the elimination steps as matrix multiplications,
\[
L_{n-1}\, (P_{n-1} L_{n-2} P_{n-1})\, (P_{n-1} P_{n-2} L_{n-3} P_{n-2} P_{n-1}) \cdots (P_{n-1} \cdots P_2 L_1 P_2 \cdots P_{n-1})\, P_{n-1} \cdots P_1 A = U.
\]
The elimination matrix $L_i$ is an identity except for nonzeros below the diagonal of the $i$th column, and $P_i$ is the row-swap permutation applied at step $i$.
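For completeness, here is a small sketch of the reuse idea (my addition, using SciPy's factored form rather than the hand-rolled elimination above): factor once with scipy.linalg.lu_factor, which stores $U$, the multipliers, and the pivots, then solve for several right-hand sides with lu_solve so only the "apply recorded operations + back solve" work is repeated.

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

rng = np.random.default_rng(2)
A = rng.standard_normal((5, 5))

# Factor once: 'lu' packs L (strictly below the diagonal, unit diagonal
# implied) and U (on and above the diagonal); 'piv' records the row swaps.
lu, piv = lu_factor(A)

# Reuse the factorization for several right-hand sides.
for _ in range(3):
    b = rng.standard_normal(5)
    x = lu_solve((lu, piv), b)
    print(np.max(np.abs(A @ x - b)))   # residual ~ 1e-15 each time
```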