Floating Point Numbers and Arithmetic

Overview Floating Point Numbers & • Floating Point Numbers Arithmetic • Motivation: Decimal Scientific Notation – Binary Scientific Notation • Floating Point Representation inside computer (binary) – Greater range, precision • Decimal to Floating Point conversion, and vice versa • Big Idea: Type is not associated with data • MIPS floating point instructions, registers CS 160 Ward 1 CS 160 Ward 2 Review of Numbers Other Numbers •What about other numbers? • Computers are made to deal with –Very large numbers? (seconds/century) numbers 9 3,155,760,00010 (3.1557610 x 10 ) • What can we represent in N bits? –Very small numbers? (atomic diameter) -8 – Unsigned integers: 0.0000000110 (1.010 x 10 ) 0to2N -1 –Rationals (repeating pattern) 2/3 (0.666666666. .) – Signed Integers (Two’s Complement) (N-1) (N-1) –Irrationals -2 to 2 -1 21/2 (1.414213562373. .) –Transcendentals e (2.718...), π (3.141...) •All represented in scientific notation CS 160 Ward 3 CS 160 Ward 4 Scientific Notation Review Scientific Notation for Binary Numbers mantissa exponent Mantissa exponent 23 -1 6.02 x 10 1.0two x 2 decimal point radix (base) “binary point” radix (base) •Computer arithmetic that supports it called •Normalized form: no leadings 0s floating point, because it represents (exactly one digit to left of decimal point) numbers where binary point is not fixed, as it is for integers •Alternatives to representing 1/1,000,000,000 –Declare such variable in C as float –Normalized: 1.0 x 10-9 –Not normalized: 0.1 x 10-8, 10.0 x 10-10 CS 160 Ward 5 CS 160 Ward 6 Floating Point Representation [1] Floating Point Representation [2] yyyy 38 • Normal format: +1.xxxxxxxxxxtwo*2 two • What if result too large? (> 2.0x10 ) • Multiple of Word Size (32 bits) – Overflow! – Overflow => Exponent larger than 31 30 23 22 0 represented in 8-bit Exponent field S Exponent Significand • What if result too small? (>0, < 2.0x10-38 ) 1 bit 8 bits 23 bits – Underflow! • S represents Sign of significand (number) – Underflow => Negative exponent larger than Exponent represents y’s represented in 8-bit Exponent field Significand represents x’s • How to reduce chances of overflow or underflow? • Represent numbers as small as 2.0 x 10-38 to as large as 2.0 x 1038 CS 160 Ward 7 CS 160 Ward 8 Double Precision Fl. Pt. IEEE 754 Fl. Pt. Standard [1] • Next Multiple of Word Size (64 bits) • Single Precision, DP similar 31 30 20 19 0 • Sign bit: 1 means negative S Exponent Significand 0 means positive 1 bit 11 bits 20 bits Significand (cont’d) • Significand: 32 bits – To pack more bits, leading 1 implicit for • Double Precision (vs. Single Precision) normalized numbers – 1 + 23 bits single, 1 + 52 bits double – C variable declared as double – always true: 0 < Significand < 1 – Represent numbers almost as small as 2.0 x 10-308 to almost as large as 2.0 x 10308 • Note: 0 has no leading 1, so reserve – But primary advantage is greater accuracy exponent value 0 just for number 0 due to larger significand CS 160 Ward 9 CS 160 Ward 10 IEEE 754 Fl. Pt. Standard [2] IEEE 754 Fl. Pt. Standard [3] •Negative Exponent? •Called Biased Notation, where bias is number –2’s comp? 1.0 x 2-1 v. 1.0 x2+1 (1/2 v. 2) subtract to get real number –IEEE 754 uses bias of 127 for single precision 1/2 0 1111 1111 000 0000 0000 0000 0000 0000 –Subtract 127 from Exponent field to get actual value for 2 0 0000 0001 000 0000 0000 0000 0000 0000 exponent –1023 is bias for double precision –This notation using integer compare of 1/2 v. 2 makes 1/2 > 2! • Summary (single precision): •Instead, pick notation 0000 0001 is most 3130 2322 0 negative, and 1111 1111 is most positive S Exponent Significand –1.0 x 2-1 v. 1.0 x2+1 (1/2 v. 2) 1 bit 8 bits 23 bits 1/2 0 0111 1110 000 0000 0000 0000 0000 0000 • (-1)S x (1 + Significand) x 2(Exponent-127) 2 0 1000 0000 000 0000 0000 0000 0000 0000 – Double precision identical, except with exponent bias of 1023 and significand of 52 bits CS 160 Ward 11 CS 160 Ward 12 Understanding the Significand [1] Understanding the Significand [2] • Method 1 (Fractions): • Method 2 (Place Values): – Convert from scientific notation – In decimal: 0.34010 => 34010/100010 0 -1 => 3410/10010 – In decimal: 1.6732 = (1x10 ) + (6x10 ) + (7x10-2) + (3x10-3) + (2x10-4) – In binary: 0.110 => 110 /1000 = 6 /8 2 2 2 10 10 0 -1 -2 => 11 /100 = 3 /4 – In binary: 1.1001 = (1x2 ) + (1x2 ) + (0x2 ) 2 2 10 10 + (0x2-3) + (1x2-4) – Advantage: less purely numerical, more thought – Interpretation of value in each position extends oriented; this method usually helps people beyond the decimal/binary point understand the meaning of the significand better – Advantage: good for quickly calculating significand value; use this method for translating FP numbers CS 160 Ward 13 CS 160 Ward 14 Ex: Converting Binary FP to Decimal Continuing Example: Binary to ??? 0 0110 1000 101 0101 0100 0011 0100 0010 0011 0100 0101 0101 0100 0011 0100 0010 • Convert 2’s Comp. Binary to Integer: • Sign: 0 => positive 229+228+226+222+220+218+216+214+29+28+26+ 21 • Exponent: = 878,003,010ten – 0110 1000two = 104ten • Convert Binary to Instruction: – Bias adjustment: 104 - 127 = -23 0011 0100 0101 0101 0100 0011 0100 0010 • Significand: 13 221 17218 – 1 + 1x2-1+ 0x2-2 + 1x2-3 + 0x2-4 + 1x2-5 +... =1+2-1+2-3 +2-5 +2-7 +2-9 +2-14 +2-15 +2-17 +2-22 ori $s5, $v0, 17218 = 1.0 + 0.666115 • Convert Binary to ASCII: -23 -7 • Represents: 1.666115ten*2 ~ 1.986*10 0011 0100 0101 0101 0100 0011 0100 0010 (about 2/10,000,000) 4UCB CS 160 Ward 15 CS 160 Ward 16 Big Idea: Type not associated with Data Converting Decimal to FP [1] 0011 0100 0101 0101 0100 0011 0100 0010 • Simple Case: If denominator is an exponent of •What does bit pattern mean: 2 (2, 4, 8, 16, etc.), then it’s easy. –1.986 *10-7? 878,003,010? “4UCB”? • Show IEEE 32-bit representation of -0.75 ori $s5, $v0, 17218? – -0.75 = -3/4 •Data can be anything; operation of instruction that accesses operand determines its type! – -11two/100two = -0.11two -1 –Side-effect of stored program concept: – Normalized to -1.1two x 2 instructions stored as numbers – (-1)S x (1 + Significand) x 2(Exponent-127) •Power/danger of unrestricted addresses/ pointers: use ASCII as Fl. Pt., instructions as data, integers – (-1)1 x (1 + .100 0000 ... 0000) x 2(126-127) as instructions, ... 1 0111 1110 100 0000 0000 0000 0000 0000 CS 160 Ward 17 CS 160 Ward 18 Converting Decimal to FP [2] Converting Decimal to FP [3] • Not So Simple Case: If denominator is not an • Fact: All rational numbers have a exponent of 2. repeating pattern when written out in – Then we can’t represent number precisely, but that’s decimal. why we have so many bits in significand: for • Fact: This still applies in binary. precision – Once we have significand, normalizing a number to • To finish conversion: get the exponent is easy. – Write out binary number with repeating – So how do we get the significand of a neverending pattern. number? – Cut it off after correct number of bits (different for single vs. double precision). – Derive Sign, Exponent and Significand fields. CS 160 Ward 19 CS 160 Ward 20 Hairy Example [1] Hairy Example [2] • How to represent 1/3 in IEEE 32-bit? • Sign: 0 • 1/3 • Exponent = -2 + 127 = 125 =01111101 = 0.33333…10 10 2 = 0.25 + 0.0625 + 0.015625 + 0.00390625 • Significand = 0101010101… + 0.0009765625 + … 0 0111 1101 0101 0101 0101 0101 0101 010 = 1/4 + 1/16 + 1/64 + 1/256 + 1/1024 + … = 2-2 + 2-4 + 2-6 + 2-8 + 2-10 + … 0 = 0.0101010101… 2 * 2 -2 = 1.0101010101… 2 * 2 CS 160 Ward 21 CS 160 Ward 22 Representation for +/- Infinity Representation for 0 • In FP, divide by zero should produce +/- • Represent 0? infinity, not overflow. – exponent all zeroes • Why? – significand all zeroes too – OK to do further computations with infinity – What about sign? e.g., X/0 > Y may be a valid comparison – +0: 0 00000000 00000000000000000000000 • IEEE 754 represents +/- infinity – -0: 1 00000000 00000000000000000000000 • Why two zeroes? – Most positive exponent reserved for infinity – Helps in some limit comparisons – Significands all zeroes CS 160 Ward 23 CS 160 Ward 24 Special Numbers FP Addition • What have we defined so far? • Much more difficult than with integers (Single Precision) • Can’t just add significands Exponent Significand Object • How do we do it? 00 0 – De-normalize to match exponents 0 nonzero Denormalized – Add significands to get resulting significand 1-254 anything +/- fl. pt. # – Keep the same exponent 255 0 +/- infinity – Normalize (possibly changing exponent) 255 nonzero ??? • Note: If signs differ, just perform a subtract instead. CS 160 Ward 25 CS 160 Ward 26 FP Subtraction FP Addition/Subtraction • Similar to addition • Problems in implementing FP add/sub: • How do we do it? – If signs differ for add (or same for sub), what will be the sign of the result? – De-normalize to match exponents – Subtract significands • Question: How do we integrate this into the integer arithmetic unit? – Keep the same exponent – Normalize (possibly changing exponent) • Answer: We don’t! CS 160 Ward 27 CS 160 Ward 28 Fl.

Floating Point Numbers and Arithmetic

Details

Download

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

Support