Condition Numbers

The following summarizes the main points of the class discussion of the condition number of A and its relation to the accuracy of a solution of a linear system computed in floating-point arithmetic using Gaussian elimination with pivoting. Suppose A ∈ ℝⁿˣⁿ and b ∈ ℝⁿ are given, with A nonsingular. Let ‖·‖ denote a norm of interest on ℝⁿ and also the matrix norm that it induces on ℝⁿˣⁿ.

Definition: The condition number of A is given by κ(A) ≡ ‖A‖‖A⁻¹‖.

Key points:

• κ(A) is norm-dependent, i.e., its value depends on the norm of interest. This isn’t a problem for our practical rules below, which are order-of-magnitude guidelines based on practical experience.

• κ(A) ≥ 1 always.

• κ(A) is a measure of the conditioning of the linear system Ax = b in that it functions as a “magnification factor” by which small changes in A and b are multiplied in changes in the solution (see the Perturbation Theorem and remarks below).

• The system is well-conditioned if κ(A) is not much greater than one and ill-conditioned otherwise.

Perturbation Theorem: Suppose Ax = b and (A + E)y = b + e. If ‖A⁻¹E‖ < 1, then

    ‖y − x‖/‖x‖ ≤ [κ(A)/(1 − ‖A⁻¹E‖)] (‖E‖/‖A‖ + ‖e‖/‖b‖).
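The bound can be checked numerically. The sketch below (not part of the notes; the matrix and perturbation sizes are chosen arbitrarily for illustration) uses the 2-norm and its induced matrix norm:

```python
import numpy as np

rng = np.random.default_rng(0)

# A small nonsingular test system (chosen arbitrarily for illustration).
A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
x = np.linalg.solve(A, b)

# Small perturbations E and e of the data.
E = 1e-6 * rng.standard_normal((2, 2))
e = 1e-6 * rng.standard_normal(2)
y = np.linalg.solve(A + E, b + e)

norm = lambda M: np.linalg.norm(M, 2)       # 2-norm and its induced matrix norm
kappa = norm(A) * norm(np.linalg.inv(A))    # kappa(A) = ||A|| * ||A^-1||
t = norm(np.linalg.inv(A) @ E)              # ||A^-1 E||, must be < 1 for the theorem

lhs = norm(y - x) / norm(x)                 # relative change in the solution
rhs = kappa / (1 - t) * (norm(E) / norm(A) + norm(e) / norm(b))
print(lhs, "<=", rhs)
```

Running this shows the relative change in the solution safely under the theorem’s bound.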

Remark. The assumption that ‖A⁻¹E‖ < 1 is sometimes replaced by the assumption that ‖A⁻¹‖‖E‖ < 1. Since ‖A⁻¹E‖ ≤ ‖A⁻¹‖‖E‖, this latter assumption is stronger and, consequently, results in a weaker theorem.

Remark. Since ‖E‖ = ‖AA⁻¹E‖ ≤ ‖A‖‖A⁻¹E‖, we have that ‖E‖/‖A‖ ≤ ‖A⁻¹E‖. Thus the assumption that ‖A⁻¹E‖ < 1 implies that E is small relative to A in the sense that ‖E‖/‖A‖ < 1.

Remark. If ‖A⁻¹E‖ ≪ 1, then the inequality in the theorem can be viewed as

    ‖y − x‖/‖x‖ ⪅ κ(A) (‖E‖/‖A‖ + ‖e‖/‖b‖),

where “⪅” means “approximately bounded by.” This inequality makes clear the role of κ(A) as a “magnification factor,” i.e., a factor by which small perturbations of A and b are magnified in changes in the solution of the system.

The above discussion has nothing to do with computed solutions of Ax = b but only with the sensitivity of solutions to small perturbations in A and b. However, it allows one to gain insight into errors in computed solutions through a backward error analysis. In this, one first shows that the solution xc of Ax = b computed using Gaussian elimination with pivoting exactly satisfies a “nearby” system (A + E)xc = b, with specified bounds on ‖A⁻¹E‖. (In the usual analysis, only A is perturbed; there is no need to perturb b.) Assuming that ‖A⁻¹E‖ ≪ 1, one then uses the Perturbation Theorem to obtain

    ‖xc − x‖/‖x‖ ⪅ κ(A) ‖E‖/‖A‖,

and the right-hand side can be bounded using the specified bounds on ‖A⁻¹E‖ (which are also bounds on ‖E‖/‖A‖).

The specified bounds on ‖A⁻¹E‖ are based on a “worst-case” analysis, in which roundoff errors are compounded to become as large as they can possibly be. For complete pivoting these bounds grow like a low-degree polynomial in n, the order of the system; consequently, Gaussian elimination with complete pivoting is regarded as stable. For partial pivoting, the bounds grow exponentially with n, and (as we’ve seen) there are contrived systems for which this growth is realized. Consequently, Gaussian elimination with partial pivoting cannot be shown to be always stable. However, partial pivoting is stable in practice and is almost always preferred over complete pivoting.

The following are two practical “rules of thumb” that usually provide good guidance for estimating the error in the computed solution for both partial and complete pivoting.
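The exponential worst case for partial pivoting is realized by a well-known contrived family: 1 on the diagonal, −1 below it, and 1 in the last column. The sketch below uses a small textbook elimination routine (written here for illustration, not library code) to track the element growth:

```python
import numpy as np

def growth_factor(A):
    """Largest |entry| produced during GE with partial pivoting, over max|A|."""
    U = A.astype(float).copy()
    n = len(U)
    gmax = np.abs(U).max()
    for k in range(n - 1):
        p = k + np.argmax(np.abs(U[k:, k]))   # partial pivoting: largest entry in column k
        U[[k, p]] = U[[p, k]]
        for i in range(k + 1, n):
            U[i, k:] -= (U[i, k] / U[k, k]) * U[k, k:]
        gmax = max(gmax, np.abs(U).max())
    return gmax / np.abs(A).max()

# The classic worst-case matrix: entries double in the last column at every step.
n = 10
A = np.tril(-np.ones((n, n)), -1) + np.eye(n)
A[:, -1] = 1.0
gf = growth_factor(A)
print(gf)   # 2^(n-1) = 512.0 for n = 10
```

For this family the growth factor is exactly 2ⁿ⁻¹, matching the exponential worst-case bound; for typical matrices the observed growth is far smaller.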

Practical Rule 1: The solution xc of Ax = b computed using Gaussian elimination with pivoting usually satisfies

    ‖xc − x‖/‖x‖ ≈ ε κ(A),    (1)

where ε is machine epsilon.

Practical Rule 2: If κ(A) ≈ 10ᵏ and ε ≈ 10⁻ᵗ, then usually

    ‖xc − x‖/‖x‖ ≈ 10⁻⁽ᵗ⁻ᵏ⁾.

In other words, about k decimal digits of accuracy are lost, and xc has about t − k correct decimal digits.
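Both rules can be tried on a familiar ill-conditioned family. The sketch below (illustrative; the right-hand side is constructed so the exact solution is known) uses the 8×8 Hilbert matrix:

```python
import numpy as np

n = 8
# Hilbert matrix H[i, j] = 1/(i + j + 1): a standard ill-conditioned example.
H = np.array([[1.0 / (i + j + 1) for j in range(n)] for i in range(n)])
x_exact = np.ones(n)
b = H @ x_exact                       # right-hand side with known exact solution
xc = np.linalg.solve(H, b)            # computed solution (LU with partial pivoting)

eps = np.finfo(float).eps             # machine epsilon, about 2.2e-16
kappa = np.linalg.cond(H)             # about 1.5e10 for n = 8
rel_err = np.linalg.norm(xc - x_exact) / np.linalg.norm(x_exact)

# Rule 1: rel_err should be roughly eps * kappa (order of magnitude only).
print(f"eps*kappa = {eps * kappa:.1e}, actual relative error = {rel_err:.1e}")
# Rule 2: with kappa ~ 10^10 and eps ~ 10^-16, expect roughly 6 correct digits.
```

Here κ(A) ≈ 10¹⁰, so roughly 10 of the 16 available decimal digits are lost, in line with Rule 2.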

Here are a few key points:

• Although κ(A) may be expensive to compute exactly, there are inexpensive ways of estimating it, and there is good software for doing this (e.g., MATLAB’s cond and condest functions). Thus these practical rules are genuinely useful in applications and should be used whenever possible.

• When A and b are entered into the computer, rounding their entries to floating-point numbers results in a slightly perturbed matrix A + E and right-hand side b + e, where ‖E‖ ≤ ε‖A‖ and ‖e‖ ≤ ε‖b‖. By the Perturbation Theorem, the solution of this perturbed system differs (relatively) from x satisfying Ax = b by about εκ(A). Thus errors in the computed solution resulting from Gaussian elimination with pivoting are no worse in magnitude than errors resulting from entering the system into the computer.

• If k ≥ t, i.e., κ(A) ≥ 1/ε, then one can expect no accurate digits in xc. In this case, A is said to be numerically singular.
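For instance (a sketch; the 13×13 Hilbert matrix is a standard example whose condition number exceeds 1/ε in IEEE double precision):

```python
import numpy as np

n = 13
# 13x13 Hilbert matrix: condition number far beyond 1/eps in double precision.
H = np.array([[1.0 / (i + j + 1) for j in range(n)] for i in range(n)])

eps = np.finfo(float).eps
kappa = np.linalg.cond(H)             # 2-norm condition number estimate

# kappa >= 1/eps: H is numerically singular, so a computed solution of
# Hx = b can be expected to have no correct digits.
print(f"kappa = {kappa:.1e}, 1/eps = {1 / eps:.1e}")
```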
