Chapter 2  Fundamentals

2.1 Backward error analysis

2.1.1 Forward and backward errors

Inevitably, numerical computation involves errors from several sources. The given problem may be derived from an imperfect model to begin with, and the problem data may be collected using imprecise measurements. The inherent sensitivity of the problem itself may magnify a seemingly negligible perturbation into a substantial deviation. During the computational process, numerical algorithms contribute unavoidable round-off errors into the mix and pass all those inaccuracies along into the computed results. Among all the factors contributing to the total error in a computed solution, the algorithm designer controls a significant portion. Not all numerical algorithms are created equal, and some are more accurate than others. Backward error analysis may help identify where the errors come from and, more importantly, help answer the fundamental question: What is a numerical algorithm really computing?

The basic tenet of backward error analysis may be summarized in one sentence: A stable numerical algorithm calculates the exact solution of a nearby problem.

What does it mean to have a numerical solution 0.33333333 to the division 1/3? The answer: it is the exact solution of the nearby problem 0.99999999/3. The error in the solution is
\[
\text{the forward error} \;=\; \Bigl|\, 0.33333333 - \frac{1}{3} \,\Bigr| \;<\; 0.000000004,
\]
and the nearby problem 0.99999999/3 does not appear to be too far from the original, since its data differ by only
\[
\text{the backward error} \;=\; \bigl|\, 1 - 0.99999999 \,\bigr| \;=\; 0.00000001.
\]

Using the root-finding problem for $x^2 - 2x + 1$ as an example, Fig. 2.1 illustrates the forward and backward errors as well as the meaning of a numerical solution whose accuracy is substantially less than commonly expected.

[Fig. 2.1 An illustration of forward and backward errors: exact computation takes the original problem $x^2 - 2x + 1 = 0$ to the exact roots $x = 1,\ 1$. Numerical computation in single precision produces the computed roots $x = 0.9999,\ 1.0001$, which are the exact roots of the perturbed problem $(x - 0.9999)(x - 1.0001) = 0$, with backward error $< 10^{-8}$ and forward error $> 10^{-4}$.]

The numerical solutions $x = 0.9999,\ 1.0001$ are the exact roots of the polynomial
\[
(x - 0.9999)(x - 1.0001) \;=\; x^2 - 2x + 0.99999999
\]
with a backward error
\[
\text{the backward error} \;=\;
\left\| \begin{bmatrix} 1 \\ -2 \\ 1 \end{bmatrix} - \begin{bmatrix} 1 \\ -2 \\ 0.99999999 \end{bmatrix} \right\|_\infty
\;=\; 0.00000001,
\]
which is as tiny as we can expect. We can arguably conclude that the underlying numerical algorithm is as accurate as it can be. The resulting
\[
\text{forward error} \;=\;
\left\| \begin{bmatrix} 1 \\ 1 \end{bmatrix} - \begin{bmatrix} 0.9999 \\ 1.0001 \end{bmatrix} \right\|_\infty
\;=\; 0.0001 \;=\; 10{,}000 \times [\text{backward error}]
\]
reflects a high sensitivity of 10,000 that is inherent in the root-finding problem itself.

Backward error analysis is important and often effective in identifying the real culprit behind an inaccurate solution. "If the answer is highly sensitive to perturbations, you have probably asked the wrong question [30]." Indeed, the problem

Problem 2.1. Find the two roots of $x^2 - 2x + 1$.

is highly sensitive. As shown in §1.4.1, however, the reformulated problem

Problem 2.2. Find the double root of the polynomial nearest to $x^2 - 2x + 1$.

is surprisingly not sensitive at all if we know the root is 2-fold.
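The sensitivity of Problem 2.1 is easy to reproduce numerically. The following MATLAB sketch (our own illustration, using the built-in roots function; the variable names are hypothetical) perturbs the constant coefficient by $10^{-8}$ and watches the computed roots move by $10^{-4}$, matching Fig. 2.1:

    % Perturb the constant term of x^2 - 2x + 1 by 1e-8 and compare the roots.
    p  = [1 -2 1];                  % x^2 - 2x + 1, with the double root x = 1
    pp = [1 -2 0.99999999];         % a nearby polynomial, as in Fig. 2.1

    backward_error = norm(p - pp, inf)    % 1e-08: the coefficient perturbation
    forward_error  = norm(sort(roots(p)) - sort(roots(pp)), inf)   % about 1e-04
    amplification  = forward_error / backward_error                % about 1e+04

A coefficient perturbation at the level of single precision round-off is amplified ten-thousand-fold in the computed roots, which is the inherent sensitivity discussed above, not a defect of the root-finder.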
We shall further eliminate the required prior knowledge of a "double root" from Problem 2.2. Knowing that an error of magnitude $10^{-8}$ is inevitable from round-off on single precision hardware, we can set a backward error tolerance of, say, $10^{-6}$:

Problem 2.3. Find the numerical roots of the polynomial $x^2 - 2x + 1$ and their multiplicities within $\varepsilon = 10^{-6}$, as defined in Definition 2.4.

The algorithm UVFACTOR implemented in APALAB [37] for this problem gives the accurate answer:

    >> f = [1 -2 1];
    >> [F,err,cond] = uvFactor(f,1e-6,1);

      THE CONDITION NUMBER:                0.214071
      THE BACKWARD ERROR:                  0.00e+000
      THE ESTIMATED FORWARD ROOT ERROR:    0.00e+000

      FACTORS

          ( x - 1.000000000000000 )^2

2.1.2 Backward error estimates and measurements

One of the common approaches to estimating backward errors is based on the model of floating point arithmetic
\[
\mathrm{fl}(x) = x(1+\delta), \qquad |\delta| \le u,
\tag{2.1}
\]
\[
\mathrm{fl}(x \circ y) = (x \circ y)(1+\eta), \qquad |\eta| \le u,
\]
where $\mathrm{fl}(\cdot)$ is the mapping from the exact value to its numerical value using floating point operations, the symbol $\circ$ represents any one of the binary operations $+$, $-$, $\times$ and $\div$, and $u$ is the machine epsilon, defined as
\[
u \;=\; \min_{\mathrm{fl}(1+\varepsilon)\,\neq\, 1} |\varepsilon|.
\tag{2.2}
\]
The method in the following simple example is typical for estimating backward errors.

Example 2.1 (Backward error of a sum and its improvement). There are at least two ways to calculate a sum $\sum_k a_k$. One way is obvious, and the other is clever and more accurate. Consider
\[
a_1 + a_2 + a_3 + a_4 \;=\; ((a_1 + a_2) + a_3) + a_4
\tag{2.3}
\]
\[
\phantom{a_1 + a_2 + a_3 + a_4} \;=\; (a_1 + a_2) + (a_3 + a_4).
\tag{2.4}
\]
The sequential sum (2.3) adds the numbers in their given order, and the pairwise sum (2.4) adds them in pairs. The two sums are theoretically equivalent but substantially different in practical numerical computation. The backward error of the sequential sum (2.3) can be estimated as follows:
\[
\mathrm{fl}\bigl( ((a_1 + a_2) + a_3) + a_4 \bigr)
= \bigl( ((a_1 + a_2)(1+\varepsilon_2) + a_3)(1+\varepsilon_3) + a_4 \bigr)(1+\varepsilon_4)
\]
\[
= a_1(1+\varepsilon_2)(1+\varepsilon_3)(1+\varepsilon_4)
+ a_2(1+\varepsilon_2)(1+\varepsilon_3)(1+\varepsilon_4)
+ a_3(1+\varepsilon_3)(1+\varepsilon_4)
+ a_4(1+\varepsilon_4)
\]
\[
= \tilde a_1 + \tilde a_2 + \tilde a_3 + \tilde a_4,
\]
where, by setting $\varepsilon_1 = 0$,
\[
\tilde a_k = a_k (1+\varepsilon_k)(1+\varepsilon_{k+1}) \cdots (1+\varepsilon_n) = a_k (1+\mu_k)
\quad\text{with}\quad
\mu_k = \varepsilon_k + \varepsilon_{k+1} + \cdots + \varepsilon_n + o(u), \qquad k = 1,2,3,4.
\]
As a result, the floating point sequential sum of $a_1, a_2, a_3, a_4$ is the exact sum of $\tilde a_1, \tilde a_2, \tilde a_3, \tilde a_4$ with backward error
\[
\bigl\| [a_1,a_2,a_3,a_4] - [\tilde a_1,\tilde a_2,\tilde a_3,\tilde a_4] \bigr\|_2
\;\le\; 3\,u\, \bigl\| [a_1,a_2,a_3,a_4] \bigr\|_2 + o(u).
\]
Using the same technique, one can similarly conclude that the floating point pairwise sum of $a_1, a_2, a_3, a_4$ is the exact sum of $\hat a_1, \hat a_2, \hat a_3, \hat a_4$ with backward error
\[
\bigl\| [a_1,a_2,a_3,a_4] - [\hat a_1,\hat a_2,\hat a_3,\hat a_4] \bigr\|_2
\;\le\; 2\,u\, \bigl\| [a_1,a_2,a_3,a_4] \bigr\|_2 + o(u).
\]
The difference between the two backward error bounds widens as the number $n$ of terms increases. It is easy to see that, for a vector $a \in \mathbb{R}^n$, the backward error bound for the floating point sequential sum $\sum_{k=1}^n a_k$ is
\[
\| a - \tilde a \|_2 \;\le\; (n-1)\,u\,\|a\|_2 + o(u),
\]
while the floating point pairwise sum enjoys a much lower bound
\[
\| a - \hat a \|_2 \;\le\; \lceil \log_2 n \rceil\,u\,\|a\|_2 + o(u),
\]
where $\lceil \log_2 n \rceil$ is the smallest integer not less than $\log_2 n$. For a very large $n$, say $n = 1{,}000{,}000$, this simple modification of the operation order reduces the coefficient in the backward error bound from about $0.01$ to about $0.0000002$ in single precision with $u \approx 10^{-8}$, improving the backward error bound by nearly five digits.

Using the alternating harmonic series as an example, we know
\[
\sum_{k=1}^{10^6} \frac{(-1)^{k-1}}{k} \;=\; 0.69314668\cdots \;\approx\; \ln 2.
\]
The single precision sequential sum is 0.6931373 with 4 correct digits, while the pairwise sum is 0.6931468 with 3 more correct digits. ⊓⊔
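This experiment is easy to reproduce. The following MATLAB sketch is our own illustration (the recursive pairing pads odd-length arrays with a zero, which does not change the sum); the values in the comments are those reported above:

    % Sequential vs. pairwise summation in single precision:
    % the alternating harmonic series with n = 10^6 terms.
    n = 1e6;
    a = single((-1).^(0:n-1) ./ (1:n));    % a(k) = (-1)^(k-1)/k

    % Sequential sum (2.3): add the terms in their given order.
    s = single(0);
    for k = 1:n
        s = s + a(k);
    end

    % Pairwise sum (2.4): add adjacent pairs, halving the length each pass.
    b = a;
    while numel(b) > 1
        if mod(numel(b), 2) == 1
            b(end+1) = single(0);          % pad to even length
        end
        b = b(1:2:end) + b(2:2:end);       % add adjacent pairs
    end

    fprintf('sequential: %.7f\n', s);      % about 0.6931373
    fprintf('pairwise:   %.7f\n', b);      % about 0.6931468

The sequential loop accumulates round-off over nearly $n$ dependent additions, while the pairwise reduction finishes in only $\lceil \log_2 n \rceil$ passes, in agreement with the two bounds above.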
Forward error cannot be measured directly unless the exact solution is known. Therein lies an advantage of analyzing backward error: it can often be measured or verified from the computed results, without knowing the exact solution.

Example 2.2 (Backward error of root-finding). For a given polynomial
\[
p(x) = p_0 x^n + p_1 x^{n-1} + \cdots + p_{n-1} x + p_n
\]
with $p_0 \neq 0$, the computed roots $z_1, \ldots, z_n$ are the exact roots of
\[
\tilde p(x) = p_0 (x - z_1) \cdots (x - z_n)
= p_0 x^n + \tilde p_1 x^{n-1} + \cdots + \tilde p_{n-1} x + \tilde p_n.
\]
The backward error is thus
\[
\bigl\| [p_1, p_2, \ldots, p_n] - [\tilde p_1, \tilde p_2, \ldots, \tilde p_n] \bigr\|_2
\;=\; \sqrt{ |p_1 - \tilde p_1|^2 + \cdots + |p_n - \tilde p_n|^2 }.
\]
If only a single root is computed as $z_*$ with residual $p(z_*) = \delta$, then $z_*$ is an exact root of
\[
\hat p(x) = p_0 x^n + p_1 x^{n-1} + \cdots + p_{n-1} x + p_n - \delta
\]
with backward error $\| p - \hat p \|_2 = |\delta|$. ⊓⊔

Example 2.3 (Backward error of matrix eigenvalues). Let $\lambda$ be a computed eigenvalue of an $n \times n$ matrix $A$. By an inverse iteration [7, §7.6.1, p. 362], we can compute an approximate eigenvector $x$ of unit length and obtain a residual
\[
A x - \lambda x = e.
\]
Then the identities $x^H x = 1$ and $A x - \lambda x = e\,(x^H x)$ lead to
\[
(A - e x^H)\, x = \lambda x.
\]
As a result, $\lambda$ is an exact eigenvalue of the perturbed matrix $A - e x^H$, and we have a version of the backward error
\[
\| A - (A - e x^H) \|_2 = \| e x^H \|_2 \le \| e \|_2.
\]
If the eigenvalue $\lambda$ is part of a computed Schur decomposition [7, p. 313]
\[
A Q = Q T + E,
\]
where $T$ is the (upper triangular) Schur form of $A$, $Q$ is a unitary matrix, and $E$ is the residual $A Q - Q T$, then we can similarly derive that $\lambda$ is an exact eigenvalue of $A - E Q^H$ and obtain a similar backward error $\| E Q^H \|_2 = \| E \|_2$. ⊓⊔

2.1.3 Condition numbers

Condition numbers are mostly defined case by case. As J. H. Wilkinson put it: "We have avoided framing a precise definition of condition numbers so that we may use the term freely [33, p. 29]." In general, we would like a condition number to serve as an indicator of the sensitivity of the given problem with respect to the backward error. A large condition number indicates that the problem is highly sensitive to data perturbations.
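The a posteriori measurement of Example 2.3 is simple enough to carry out in a few lines of MATLAB. The following sketch is our own illustration (the random 5-by-5 test matrix and the fixed seed are arbitrary choices); it certifies a computed eigenpair without knowing any exact eigenvalue of $A$:

    % Measure the backward error of a computed eigenpair a posteriori,
    % as in Example 2.3, using the residual e = A*x - lambda*x.
    rng(1);                            % fix the seed so the run is repeatable
    A = randn(5);                      % an arbitrary 5-by-5 test matrix

    [V, D]  = eig(A);                  % computed eigenpairs of A
    x       = V(:,1) / norm(V(:,1));   % approximate eigenvector of unit length
    lambda  = D(1,1);                  % the corresponding computed eigenvalue

    e = A*x - lambda*x;                % residual of the computed eigenpair
    backward_error = norm(e)           % lambda is an exact eigenvalue of A - e*x'

The residual alone certifies that $\lambda$ is an exact eigenvalue of the nearby matrix $A - e x^H$, within $\|e\|_2$ of $A$. Multiplying such a measured backward error by the condition number then yields a forward error estimate, in the spirit of the 10,000-fold amplification seen in the root-finding example of §2.1.1.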