Floating-Point Numbers and Rounding Errors
FLOATING-POINT ARITHMETIC
Rounding errors and their analysis

Floating-Point Arithmetic on Computers
◦ Scientific notation: 5.1671 × 10⁻³
◦ Floating-point notation: 5.1671e-3
◦ Advantage: easy to tell the magnitude of the quantity and to fix the precision of approximations: √777777 ≈ 8.81916 × 10²
◦ Keywords: mantissa, exponent, base, precision

We need to compute on a large scale…
◦ Applications in physics and engineering require huge amounts of numerical computation
◦ Need to fix a particular format for writing down this huge quantity of numbers
◦ Solution: floating-point format with fixed precision
◦ Trade-off: storage and computation effort vs. accuracy of computations
[Images: human computers; digital computers. Source: NASA]

Floating-Point Arithmetic on Computers
◦ Floating-point format: scientific notation, but with the precision restricted in all computations
◦ Mainboards and graphics cards have dedicated floating-point units only for such computations
  ◦ Phones, tablets, laptops, supercomputers
◦ Many programming languages (C, C++, Java, …) have data types dedicated to specific floating-point formats
◦ Most popular floating-point formats:
  ◦ Single precision (4 bytes, 23-bit mantissa, 8-bit exponent)
  ◦ Double precision (8 bytes, 52-bit mantissa, 11-bit exponent)
◦ Individual rounding errors are typically negligible, but they accumulate over many calculations
◦ Floating-point units implement computation with certain floating-point formats
  ◦ Single and double precision formats, and possibly other formats such as quadruple precision
  ◦ Built-in capacity for addition, subtraction, multiplication, division, square root, and possibly trigonometric functions

APPROXIMATION OF NUMBERS
Absolute and Relative Errors, Unit roundoff, machine epsilon

Approximating numbers
Suppose that x is some number and that x̃ is an approximation to that number.
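The quantities named in the heading above can be illustrated with a short Python sketch. It is not part of the slides. It assumes the standard definitions, which the text is cut off before stating: the absolute error is the distance between a number and its approximation, and the relative error is that distance divided by the magnitude of the number (for a nonzero number). The helper names absolute_error, relative_error and round_to_single are hypothetical, and the unit roundoff is taken to be half the machine epsilon, which is one common convention among several.

```python
import math
import struct
import sys


def absolute_error(x, x_approx):
    """Absolute error |x - x_approx| (assumed standard definition)."""
    return abs(x - x_approx)


def relative_error(x, x_approx):
    """Relative error |x - x_approx| / |x|, defined only for x != 0."""
    return abs(x - x_approx) / abs(x)


def round_to_single(x):
    """Round a Python float (double precision) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


# Reference value in double precision and the six-digit value from the slides.
x = math.sqrt(777777)
x_approx = 8.81916e2

abs_err = absolute_error(x, x_approx)
rel_err = relative_error(x, x_approx)

# Machine epsilon of double precision: gap between 1.0 and the next larger double.
eps_double = sys.float_info.epsilon      # 2**-52
# Unit roundoff under the "half of machine epsilon" convention (an assumption).
u_double = eps_double / 2                # 2**-53
u_single = 2.0 ** -24                    # same convention, 23-bit mantissa

# Storing x in single precision introduces a relative error of at most u_single.
x_single = round_to_single(x)
rel_err_single = relative_error(x, x_single)

# Rounding errors accumulate: ten additions of 0.1 do not give exactly 1.0.
total = 0.0
for _ in range(10):
    total += 0.1

print("absolute error:", abs_err)
print("relative error:", rel_err)
print("relative error after rounding to single:", rel_err_single)
print("sum of ten 0.1 equals 1.0?", total == 1.0)
```

The relative error is usually the more informative of the two, because it does not depend on the magnitude of the number: the same six significant digits give roughly the same relative error whether the value is near 10² or near 10⁻³.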