Video 1: Rounding Errors

A number system can be represented as $x = \pm 1.b_1 b_2 b_3 b_4 \times 2^m$ for $m \in [-6, 6]$ and $b_i \in \{0, 1\}$.

Let's say you want to represent the decimal number 19.625 using the binary number system above. Can you represent this number exactly?

$(19.625)_{10} = (10011.101)_2 = (1.0011101)_2 \times 2^4$

The number needs seven fractional bits but the system stores only four, so it cannot be represented exactly. It lies between two machine numbers:

$(1.0011)_2 \times 2^4 = 19$
$(1.0100)_2 \times 2^4 = 20$

Machine floating point number

• Not all real numbers can be exactly represented as a machine floating-point number.
• Consider a real number in the normalized floating-point form: $x = \pm 1.b_1 b_2 b_3 \dots b_n \dots \times 2^m$
• The real number $x$ will be approximated by either $x_-$ or $x_+$, the nearest two machine floating point numbers.

With $n$ fractional bits, $x_- = 1.b_1 b_2 \dots b_n \times 2^m$ and $x_+ = x_- + 0.00 \dots 01 \times 2^m = x_- + \epsilon_m \times 2^m$, where $\epsilon_m = 2^{-n}$ is machine epsilon. The gap is $x_+ - x_- = \epsilon_m \times 2^m$, so a larger exponent $m$ gives a larger gap.

Rounding

The process of replacing $x$ by a nearby machine number is called rounding, and the error involved is called roundoff error.

[Figure: number line marked $-\infty$, $x_+$, $x$, $x_-$, $0$, $x_-$, $x$, $x_+$, $+\infty$, with arrows for rounding towards $-\infty$, towards zero, and towards $+\infty$. On both sides of zero, $x_-$ is the neighbor with the smaller magnitude.]

Round by chopping:

• Round up (ceil): $fl(x) = x_+$ if $x$ is a positive number (rounding towards $+\infty$); $fl(x) = x_-$ if $x$ is a negative number (rounding towards zero).
• Round down (floor): $fl(x) = x_-$ if $x$ is a positive number (rounding towards zero); $fl(x) = x_+$ if $x$ is a negative number (rounding towards $-\infty$).

Round to nearest: either round up or round down, whichever is closer.

Rounding (roundoff) errors

Consider rounding by chopping, with $x = q \times 2^m$ and $1 \le q < 2$:

• Absolute error: $|fl(x) - x| \le \epsilon_m \times 2^m$
• Relative error: $\frac{|fl(x) - x|}{|x|} \le \frac{\epsilon_m \times 2^m}{q \times 2^m} \le \epsilon_m$

Rounding (roundoff) errors

$\frac{|fl(x) - x|}{|x|} \le \epsilon_m$

The relative error due to rounding (the process of representing a real number as a machine number) is always bounded by machine epsilon.
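The 19.625 example and the chopping bound can be checked numerically. The following is a minimal sketch, not part of the original slides: the helper name `neighbors` is hypothetical, exact rational arithmetic from the standard library stands in for the toy system, and the exponent limits $m \in [-6, 6]$ are not enforced.

```python
import math
import sys
from fractions import Fraction

def neighbors(x, frac_bits=4):
    """Return (x_minus, x_plus, gap) bracketing x > 0 in the toy system
    1.b1...bn x 2^m. Hypothetical helper; exponent limits are not checked."""
    _, e = math.frexp(x)                   # x = mant * 2**e with 0.5 <= mant < 1
    m = e - 1                              # exponent of the normalized form 1.f * 2**m
    gap = Fraction(2) ** (m - frac_bits)   # eps_m * 2**m with eps_m = 2**-n
    exact = Fraction(x)
    x_minus = (exact // gap) * gap         # chop the extra bits
    x_plus = x_minus if x_minus == exact else x_minus + gap
    return x_minus, x_plus, gap

x = Fraction(19.625)
x_minus, x_plus, gap = neighbors(19.625)

fl_chop = x_minus                                             # rounding by chopping
fl_nearest = x_minus if x - x_minus < x_plus - x else x_plus  # ties not handled here

eps_m = Fraction(1, 2 ** 4)                # machine epsilon of the toy system
rel_err_chop = (x - fl_chop) / x

print(x_minus, x_plus, gap)
print(float(rel_err_chop), "<=", float(eps_m))
# Python floats are IEEE doubles, whose machine epsilon is 2**-52.
print(sys.float_info.epsilon == 2.0 ** -52)
```

Here chopping gives 19 and round to nearest gives 20, and the relative error of chopping, 5/157, stays below $\epsilon_m = 2^{-4}$.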
IEEE Single Precision: $\frac{|fl(x) - x|}{|x|} \le 2^{-23} \approx 1.2 \times 10^{-7}$
IEEE Double Precision: $\frac{|fl(x) - x|}{|x|} \le 2^{-52} \approx 2.2 \times 10^{-16}$

Gap between two machine numbers

What is the smallest $\delta$ such that $fl(x + \delta) \ne fl(x)$? The answer is set by the gap between $fl(x)$ and the next machine number.

In practice (rule of thumb):

• Binary: for $x = q \times 2^m$, the gap is $\epsilon_m \times 2^m$.
• Decimal: for $x = q \times 10^m$, the gap is roughly $\epsilon_m \times 10^m$. For example, for $x = 4.5 \times 10^8$ in double precision the gap is on the order of $10^{-16} \times 10^8 = 10^{-8}$.
• If $\delta$ is sufficiently small compared with the gap, $fl(x + \delta) = fl(x)$; otherwise $fl(x + \delta)$ moves to the next machine number.

Show IPython notebook demos.

Video 2: Arithmetic with machine numbers

Mathematical properties of FP operations

Not necessarily associative: for some $x, y, z$ the result below is possible:
$(x + y) + z \ne x + (y + z)$

Not necessarily distributive: for some $x, y, z$ the result below is possible:
$z(x + y) \ne zx + zy$

Not necessarily cumulative: repeatedly adding a very small number to a large number may do nothing.

Floating point arithmetic (basic idea)

$x = (-1)^s \, 1.f \times 2^m$

• First compute the exact result.
• Then round the result to make it fit into the desired precision.
• $x + y$ is computed as $fl(x + y)$.
• $x \times y$ is computed as $fl(x \times y)$.

Floating point arithmetic

Consider a number system such that $x = \pm 1.b_1 b_2 b_3 \times 2^m$ for $m \in [-4, 4]$ and $b_i \in \{0, 1\}$.

Rough algorithm for addition and subtraction:
1. Bring both numbers onto a common exponent.
2. Do the "grade-school" operation.
3. Round the result.

• Example 1: No rounding needed
$a = (1.101)_2 \times 2^1$
$b = (1.001)_2 \times 2^1$
$c = a + b = (10.110)_2 \times 2^1 = (1.011)_2 \times 2^2$
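The properties listed under "Mathematical properties of FP operations" can be observed directly in IEEE double precision. This is a small sketch with illustrative values chosen for this note (they do not come from the slides); `math.ulp` requires Python 3.9 or later.

```python
import math

# Not necessarily associative: the grouping changes the rounded result.
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)

# Not necessarily distributive.
dist_left = 100 * (0.1 + 0.2)
dist_right = 100 * 0.1 + 100 * 0.2

# "Not cumulative": each 1e-16 is below half the gap at 1.0,
# so every single addition rounds back to 1.0.
total = 1.0
for _ in range(1000):
    total += 1e-16

# Adding the small terms together first lets them accumulate.
small_first = sum([1e-16] * 1000) + 1.0

# The gap between machine numbers grows with the exponent.
gap_at_1 = math.ulp(1.0)        # 2**-52
gap_at_1024 = math.ulp(1024.0)  # 2**-42

print(left == right, dist_left == dist_right)
print(total == 1.0, small_first > 1.0)
print(gap_at_1, gap_at_1024)
```

The loop illustrates why the order of summation matters: the same thousand small terms are lost when added one at a time to a large number, but survive when summed among themselves first.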
• Example 2: Require rounding (same system, three fractional bits)
$a = (1.101)_2 \times 2^4$
$b = (1.000)_2 \times 2^4$
$c = a + b = (10.101)_2 \times 2^4 = (1.0101)_2 \times 2^5$
The exact sum needs four fractional bits, so it must be rounded: $fl(a + b) = (1.010)_2 \times 2^5$ (by chopping, or by round to nearest with ties to even).

• Example 3:
$a = (1.100)_2 \times 2^1$
$b = (1.100)_2 \times 2^{-1}$
$c = a + b = (1.100)_2 \times 2^1 + (0.011)_2 \times 2^1 = (1.111)_2 \times 2^1$

Floating point arithmetic

Consider a number system such that $x = \pm 1.b_1 b_2 b_3 b_4 \times 2^m$ for $m \in [-4, 4]$ and $b_i \in \{0, 1\}$ (precision $p = 5$).

• Example 4:
$a = (1.1011)_2 \times 2^1$
$b = (1.1010)_2 \times 2^1$
$c = a - b = (0.0001)_2 \times 2^1$
Or after normalization: $c = (1.????)_2 \times 2^{-3}$

• There is no data to indicate what the missing digits should be.
• The machine fills them with its best guess, which is often not good (usually what is called spurious zeros).
• The number of significant digits in the result is reduced.
• This phenomenon is called Catastrophic Cancellation.

Loss of significance

Assume $a$ and $b$ are real numbers with $a \gg b$. For example:
$a = 1.a_1 a_2 a_3 a_4 a_5 a_6 \dots a_n \dots \times 2^0$
$b = 1.b_1 b_2 b_3 b_4 b_5 b_6 \dots b_n \dots \times 2^{-8}$

In single precision, compute $a + b$ on the common exponent $2^0$:
$1.a_1 a_2 a_3 a_4 a_5 a_6 a_7 a_8 a_9 \dots a_{22} a_{23} \times 2^0$
$0.00000001 \, b_1 b_2 \dots b_{15} \times 2^0$
Only $b_1, \dots, b_{15}$ fit into the 23 fractional bits of $fl(a + b)$; the remaining bits of $b$ are lost.

Cancellation

Assume $a$ and $b$ are real numbers with $a \approx b$:
$a = 1.a_1 a_2 a_3 a_4 a_5 a_6 \dots a_n \dots \times 2^m$
$b = 1.b_1 b_2 b_3 b_4 b_5 b_6 \dots b_n \dots \times 2^m$

In single precision (without loss of generality), consider this example:
$a = 1.a_1 a_2 a_3 a_4 a_5 a_6 \dots a_{20} a_{21} 1 0 \, a_{24} a_{25} a_{26} a_{27} \dots \times 2^m$
$b = 1.a_1 a_2 a_3 a_4 a_5 a_6 \dots a_{20} a_{21} 1 1 \, b_{24} b_{25} b_{26} b_{27} \dots \times 2^m$
$b - a = 0.0000 \dots 0001 \times 2^m = 1.??? \times 2^{-23} \times 2^m$
The bits after the leading 1 are not significant.

Examples:

1) $a$ and $b$ are real numbers with the same order of magnitude ($a \approx b$). They have the following representation in a decimal floating point system with 16 decimal digits of accuracy:
$a = 3004.45$
$b = 3004.46$
How many accurate digits does your answer have when you compute $b - a$?
Answer: 11 digits. The five leading digits (3004.4) cancel, leaving $16 - 5 = 11$.
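The "compute exactly, then round" model behind Examples 1 to 4 can be sketched with exact rational arithmetic. The helper `fl` below is hypothetical (it is not from the slides), it rounds to nearest with ties to even, and it ignores the exponent limits of the toy system. The real inputs 3.40 and 3.22 are illustrative assumptions, chosen so that they round to the $a$ and $b$ of Example 4.

```python
from fractions import Fraction

def fl(value, frac_bits):
    """Round an exact Fraction to the form 1.b1...bn x 2^m (n = frac_bits),
    round to nearest, ties to even. Exponent limits are not enforced."""
    if value == 0:
        return Fraction(0)
    sign = 1 if value > 0 else -1
    mag = abs(value)
    # Find m such that 2**m <= mag < 2**(m + 1).
    m = mag.numerator.bit_length() - mag.denominator.bit_length()
    if Fraction(2) ** m > mag:
        m -= 1
    gap = Fraction(2) ** (m - frac_bits)
    # round() on a Fraction returns an int and rounds halves to even.
    return sign * round(mag / gap) * gap

# Example 1: 1.101 x 2^1 + 1.001 x 2^1, no rounding needed.
ex1 = fl(Fraction(13, 4) + Fraction(9, 4), 3)
# Example 2: 1.101 x 2^4 + 1.000 x 2^4, exact sum 42 must be rounded.
ex2 = fl(Fraction(26) + Fraction(16), 3)
# Example 3: 1.100 x 2^1 + 1.100 x 2^-1, no rounding needed.
ex3 = fl(Fraction(3) + Fraction(3, 4), 3)

# Cancellation with four fractional bits (p = 5).
a_true, b_true = Fraction("3.40"), Fraction("3.22")    # illustrative real inputs
a, b = fl(a_true, 4), fl(b_true, 4)                    # 1.1011 x 2^1 and 1.1010 x 2^1
computed = fl(a - b, 4)                                # 0.0001 x 2^1 = 2^-3
true_diff = a_true - b_true
rel_err = abs(computed - true_diff) / true_diff

print(ex1, ex2, ex3)
print(a, b, computed, float(rel_err))
```

Each input carries a relative rounding error of at most a few percent, yet the subtraction leaves a result with a relative error of about 31%, which is the loss of significant digits described above.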
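Loss of Significance

How can we avoid this loss of significance? For example, consider the function

$f(x) = \sqrt{x^2 + 1} - 1$

If we want to evaluate the function for values of $x$ near zero, there is a potential loss of significance in the subtraction.

Assume you are performing this computation using a machine with 5 decimal accurate digits. Compute $f(10^{-3})$:

$f(10^{-3}) = \sqrt{10^{-6} + 1} - 1 = \sqrt{1.000001} - 1$

With 5 digits, $1.000001$ is stored as $1.0000$, so the computed value is $1 - 1 = 0$.

Loss of Significance

Re-write the function $f(x) = \sqrt{x^2 + 1} - 1$ to avoid the subtraction of two numbers with similar order of magnitude, using $(a - b)(a + b) = a^2 - b^2$:

$f(x) = \left(\sqrt{x^2 + 1} - 1\right) \frac{\sqrt{x^2 + 1} + 1}{\sqrt{x^2 + 1} + 1} = \frac{x^2}{\sqrt{x^2 + 1} + 1}$

On the same 5-digit machine:

$f(10^{-3}) = \frac{10^{-6}}{\sqrt{1.000001} + 1} \approx \frac{10^{-6}}{1 + 1} = 5 \times 10^{-7}$

Example: If $x = 0.3721448693$ and $y = 0.3720214371$, what is the relative error in the computation of $(x - y)$ in a computer with five decimal digits of accuracy?

The exact difference is $x - y = 0.0001234322$. With five digits, $fl(x) = 0.37214$ and $fl(y) = 0.37202$, so the computed difference is $0.00012$. The relative error is

$\frac{|0.0001234322 - 0.00012|}{0.0001234322} \approx 2.8 \times 10^{-2}$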
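Both the rewrite of $f(x)$ and the five-digit examples can be reproduced with the standard library. This is a sketch under stated assumptions: the names `f_naive` and `f_stable` are hypothetical, $x = 10^{-9}$ is an illustrative value for double precision, and the 5-digit machine is modelled with the `decimal` module at precision 5 (default rounding, half to even).

```python
import math
from decimal import Decimal, localcontext

def f_naive(x):
    # Direct formula: subtracts two nearly equal numbers when x is near zero.
    return math.sqrt(x * x + 1.0) - 1.0

def f_stable(x):
    # Rewritten formula: no subtraction of nearby numbers.
    return x * x / (math.sqrt(x * x + 1.0) + 1.0)

x = 1e-9
naive = f_naive(x)      # x*x + 1.0 rounds to 1.0 in double precision
stable = f_stable(x)    # close to x**2 / 2

with localcontext() as ctx:
    ctx.prec = 5                          # model of a 5-digit decimal machine
    xd = Decimal("1e-3")
    naive5 = (xd * xd + 1).sqrt() - 1
    stable5 = xd * xd / ((xd * xd + 1).sqrt() + 1)
    fx = +Decimal("0.3721448693")         # unary plus rounds to 5 digits
    fy = +Decimal("0.3720214371")
    diff5 = fx - fy

exact = Decimal("0.3721448693") - Decimal("0.3720214371")
rel_err = abs(diff5 - exact) / exact

print(naive, stable)
print(naive5, stable5)
print(diff5, exact, rel_err)
```

The naive form returns zero in both settings, while the rewritten form keeps the leading digits of $x^2 / 2$. The last line reproduces the relative error of roughly $2.8 \times 10^{-2}$ for $x - y$.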
