
UNIVERSITY OF CAMBRIDGE
Floating Point Computation (slides 1–123)
A six-lecture course
D J Greaves (thanks to Alan Mycroft)
Computer Laboratory, University of Cambridge
http://www.cl.cam.ac.uk/teaching/current/FPComp
Lent 2010–11

A Few Cautionary Tales

The main enemy of this course is the simple phrase "the computer calculated it, so it must be right". We're happy to be wary for integer programs, e.g. having unit tests to check that sorting [5,1,3,2] gives [1,2,3,5], but then we suspend our disbelief for programs producing real-number values, especially if they implement a "mathematical formula".

Global Warming

Apocryphal story: Dr X has just produced a new climate modelling program.
Interviewer: what does it predict?
Dr X: Oh, the expected 2–4°C rise in average temperatures by 2100.
Interviewer: is your figure robust?
...

Global Warming (2)

Apocryphal story: Dr X has just produced a new climate modelling program.
Interviewer: what does it predict?
Dr X: Oh, the expected 2–4°C rise in average temperatures by 2100.
Interviewer: is your figure robust?
Dr X: Oh yes, indeed it gives results in the same range even if the input data is randomly permuted.
We laugh, but let's learn from this.

Global Warming (3)

What could cause this sort of error?
• the wrong mathematical model of reality (most subject areas lack models as precise and well-understood as Newtonian gravity)
• a parameterised model with parameters chosen to fit expected results ('over-fitting')
• the model being very sensitive to input or parameter values
• the discretisation of the continuous model for computation
• the build-up or propagation of inaccuracies caused by the finite precision of floating-point numbers
• plain old programming errors
We'll only look at the last four, but don't forget the first two.

Real world examples

Find Kees Vuik's web page "Computer Arithmetic Tragedies" for these and more:
• Patriot missile interceptor fails to intercept due to 0.1 second being the 'recurring decimal' 0.000110011001100...₂ in binary (1991)
• Ariane 5 $500M firework display caused by overflow in converting a 64-bit floating-point number to a 16-bit integer (1996)
• The Sleipner A offshore oil platform sank: the post-accident investigation traced the error to an inaccurate finite element approximation of the linear elastic model of the tricell (using the popular finite element program NASTRAN). The shear stresses were underestimated by 47%.
Learn from the mistakes of the past.

Overall motto: threat minimisation

• Algorithms involving floating point (float and double in Java and C, [misleadingly named] real in ML and Fortran) pose a significant threat to the programmer or user.
• Learn to distrust your own naïve coding of such algorithms and, even more so, get to distrust others'.
• Start to think of ways of sanity checking (by human or machine) any floating point value arising from a computation, library or package, unless its documentation suggests an attention to detail at least equal to that discussed here (and even then treat it with suspicion).
• Just because the "computer produces a numerical answer" doesn't mean this has any relationship to the 'correct' answer.
Here be dragons!
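The Patriot failure above is easy to reproduce in miniature: 0.1 has no finite binary representation, so a clock that advances by repeatedly adding 0.1 drifts away from true time. The C sketch below is my own illustration, not course code, and it uses IEEE single and double precision rather than the Patriot's 24-bit fixed-point registers, so the exact drift differs; the mechanism, a tiny representation error accumulated over millions of ticks, is the same.

```c
/* Sketch: 0.1 is a recurring binary fraction, so sums of 0.1 drift.
 * Illustrative only; it does not reproduce the Patriot's arithmetic. */
#include <stdio.h>

int main(void)
{
    /* Ten ticks of 0.1 should be exactly 1.0, but are not. */
    double tenth_sum = 0.0;
    for (int i = 0; i < 10; i++)
        tenth_sum += 0.1;
    printf("0.1 added 10 times    = %.17g\n", tenth_sum);

    /* 100 hours of 0.1-second ticks, accumulated in single precision. */
    float clock = 0.0f;
    for (long i = 0; i < 100L * 3600L * 10L; i++)   /* 3,600,000 ticks */
        clock += 0.1f;
    printf("accumulated clock (s) = %.1f\n", (double)clock);
    printf("true elapsed time (s) = %.1f\n", 360000.0);
    printf("drift (s)             = %.1f\n", (double)clock - 360000.0);
    return 0;
}
```

On a typical IEEE 754 machine the first sum falls just short of 1.0 and the single-precision clock ends up a long way from the true 360000 s: exactly the kind of value that deserves the sanity checking advocated above.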
What's this course about?

• How computers represent and calculate with 'real number' values.
• What problems occur due to the values only being finite (both range and precision).
• How these problems add up until you get silly answers.
• How you can stop your programs and yourself from looking silly (and some ideas on how to determine whether existing programs have been silly).
• Chaos and ill-conditionedness.
• Knowing when to call in an expert: remember there is 50+ years of knowledge on this and you only get 6 lectures from me.

Part 1

Introduction/reminding you what you already know

Reprise: signed and unsigned integers

An 8-bit value such as 10001011 can naturally be interpreted either as an unsigned number (2^7 + 2^3 + 2^1 + 2^0 = 139) or as a signed number (-2^7 + 2^3 + 2^1 + 2^0 = -117).
This places the decimal (binary!?!) point at the right-hand end. It could also be interpreted as a fixed-point number by imagining a decimal point elsewhere, e.g. in the middle to get 1000.1011; this would have value 2^3 + 2^-1 + 2^-3 + 2^-4 = 8 + 11/16 = 8.6875.
(The above is an unsigned fixed-point value for illustration; normally we use signed fixed-point values.)

Fixed point values and saturating arithmetic

Fixed-point values are often useful (e.g. in low-power/embedded devices) but they are prone to overflow. E.g. 2*10001011 = 00010110, so 2*8.6875 = 1.375!! One alternative is to make operations saturating, so that 2*10001011 = 11111111, which can be useful (e.g. in audio). Note 1111.1111₂ = 15.9375₁₀.
An alternative way to avoid this sort of overflow is to allow the position of the point to be determined at run-time, by another part of the value ("floating point"), instead of being fixed independently of the value as above ("fixed point"). Floating point is the subject of this course.

Back to school

Scientific notation (from Wikipedia, the free encyclopedia)
In scientific notation, numbers are written using powers of ten in the form a × 10^b, where b is an integer exponent and the coefficient a is any real number, called the significand or mantissa.
In normalised form, a is chosen such that 1 ≤ a < 10. It is implicitly assumed that scientific notation should always be normalised except during calculations or when an unnormalised form is desired.
What Wikipedia should say: zero is problematic; its exponent doesn't matter and it can't be put in normalised form.

Back to school (2)

Multiplication and division (from Wikipedia, with some changes)
Given two numbers in scientific notation,
    x0 = a0 × 10^b0    and    x1 = a1 × 10^b1
multiplication and division are:
    x0 * x1 = (a0 * a1) × 10^(b0+b1)
    x0 / x1 = (a0 / a1) × 10^(b0-b1)
Note that the result is not guaranteed to be normalised even if the inputs are: a0 * a1 may now be between 1 and 100, and a0 / a1 may be between 0.1 and 10 (both at most one out!). E.g.
    5.67 × 10^-5 * 2.34 × 10^2 ≈ 13.3 × 10^-3 = 1.33 × 10^-2
    2.34 × 10^2 / 5.67 × 10^-5 ≈ 0.413 × 10^7 = 4.13 × 10^6
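To make the renormalisation step concrete, here is a small C sketch of decimal scientific-notation multiply and divide. It is my own illustration: the struct sci type and the names normalise, sci_mul and sci_div are invented for the example and are not part of the course material.

```c
/* Decimal scientific notation held as (mantissa, exponent) pairs.
 * Positive mantissas assumed for brevity. */
#include <stdio.h>

struct sci { double mant; int exp; };   /* value = mant * 10^exp */

/* Bring the mantissa back into [1, 10).  After * or / at most one
 * shift is needed, since the raw mantissa lies in [1, 100) or (0.1, 10). */
static struct sci normalise(struct sci x)
{
    if (x.mant == 0.0) return x;        /* zero cannot be normalised */
    while (x.mant >= 10.0) { x.mant /= 10.0; x.exp++; }
    while (x.mant <   1.0) { x.mant *= 10.0; x.exp--; }
    return x;
}

static struct sci sci_mul(struct sci x, struct sci y)
{
    struct sci r = { x.mant * y.mant, x.exp + y.exp };  /* add exponents */
    return normalise(r);
}

static struct sci sci_div(struct sci x, struct sci y)
{
    struct sci r = { x.mant / y.mant, x.exp - y.exp };  /* subtract exponents */
    return normalise(r);
}

int main(void)
{
    struct sci a = { 5.67, -5 }, b = { 2.34, 2 };       /* the slide's examples */
    struct sci p = sci_mul(a, b), q = sci_div(b, a);
    printf("product:  %.3f x 10^%d\n", p.mant, p.exp);  /* about 1.327 x 10^-2 */
    printf("quotient: %.3f x 10^%d\n", q.mant, q.exp);  /* about 4.127 x 10^6  */
    return 0;
}
```

Note how only one corrective shift is ever needed after multiplication or division; addition and subtraction, next, are messier because the operands must first be brought to a common exponent.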
Back to school (3)

Addition and subtraction require the numbers to be represented using the same exponent, normally the bigger of b0 and b1.
W.l.o.g. b0 > b1, so write x1 = (a1 * 10^(b1-b0)) × 10^b0 (a shift!) and add/subtract the mantissas:
    x0 ± x1 = (a0 ± a1 * 10^(b1-b0)) × 10^b0
E.g.
    2.34 × 10^-5 + 5.67 × 10^-6 = 2.34 × 10^-5 + 0.567 × 10^-5 ≈ 2.91 × 10^-5
A cancellation problem we will see more of:
    2.34 × 10^-5 - 2.33 × 10^-5 = 0.01 × 10^-5 = 1.00 × 10^-7
When numbers reinforce (e.g. add with same-sign inputs) the new mantissa is in the range [1, 20); when they cancel it is in the range [0, 10). After cancellation we may require several shifts to normalise.

Significant figures can mislead

When using scientific form we often compute repeatedly, keeping the same number of digits in the mantissa. In science this is often the number of digits of accuracy in the original inputs, hence the term (decimal) significant figures (sig.figs. or sf).
This is risky for two reasons:
• As in the last example, there may be 3sf in the result of a computation but little accuracy left.
• 1.01 × 10^1 and 9.98 × 10^0 are quite close, and both have 3sf, but changing them by one ulp ('unit in last place') changes the value by nearly 1% (1 part in 101) in the former and about 0.1% (1 part in 998) in the latter. Later we'll prefer "relative error".

Significant figures can mislead (2)

[Figure: scientific-form numbers 0.1, 0.2, ..., 80, 100 marked on an axis, showing the unequal gaps between representable values; a log scale shows the spacing clearly.]
You might prefer to say sig.figs.(4.56) = -log₁₀(0.01/4.56), so that sf(1.01) and sf(101) are about 3, and sf(9.98) and sf(0.0000998) are nearly 4. (BTW, a good case can be made for 2 and 3 respectively instead.)
Exercise: with this more precise understanding of sig.figs., how do the elementary operations (+, -, *, /; operating on nominal 3sf arguments to give a nominal 3sf result) really behave?

Get your calculator out!

Calculators are just floating point computers.
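Before reaching for a physical calculator, the two effects above are easy to reproduce in double precision. The short C sketch below is my own illustration, not course code: it repeats the cancellation example from "Back to school (3)" and compares the relative effect of a one-ulp change on 1.01 × 10^1 and 9.98 × 10^0.

```c
/* Sketch: catastrophic cancellation, and why a one-ulp change means a
 * different relative error depending on the mantissa. */
#include <stdio.h>

int main(void)
{
    /* Both inputs carry 3 significant figures, but their difference
     * retains only the information from the last (least certain) digit. */
    double x = 2.34e-5, y = 2.33e-5;
    printf("x - y = %.3g\n", x - y);                  /* about 1e-07 */

    /* One ulp is 0.01 in the mantissa; its relative effect depends on
     * whether the mantissa is near 1 or near 10. */
    double a = 1.01e1, ulp_a = 0.01e1;   /* 1.01 x 10^1, ulp = 0.1  */
    double b = 9.98e0, ulp_b = 0.01e0;   /* 9.98 x 10^0, ulp = 0.01 */
    printf("one ulp on 1.01e1: %.4f%%\n", 100.0 * ulp_a / a);
    printf("one ulp on 9.98e0: %.4f%%\n", 100.0 * ulp_b / b);
    return 0;
}
```

It prints relative effects of roughly 0.99% and 0.10%, the "nearly 1%" and "about 0.1%" figures quoted earlier; this is why the course later prefers relative error to a count of significant figures.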