Floating Point Representations

c ENEE 350 C. B. Silio, Jan., 2000 FLOATING POINT REPRESENTATIONS It is assumed that the student is familiar with the discussion in App endix B of the text byA.Tanenbaum, Structured Computer Organization, 4th Ed., Prentice-Hall, 1999. Here we are concerned in particular with the discussion of oating-p ointnumb ers and normalization in pp. 643{651. In the notation used here oating p oint representations for real numb ers have the form: E M R ; where M is a signed mantissa, Ristheradix of normalization, and E is a signed exponent. It is the signed values M and E that are typically packed into computer \words" of varying length to enco de single, double, and extended precision short, long, and extended oating p ointnumb ers. The numb er of bits used in the representation of M determines the precision of the representation. The number of bits used to represent the signed value E determines the range of the representation. Only a nite number of values among those in the representable range can be enco ded in the n bits available, and this number is determined by the precision. To facilitate xed-p oint comparisons of oating-p oint data representations such as might o ccur in sorting applications three things are usually done. The rst is to place the sign of the mantissa M denoted S on the M left in the same lo cation as in the sign representation for xed p ointnumb ers. This provides easy detection of the sign of the oating-p oint data in the same manner as for xed-p oint data. The second thing is to place the signed exp onent representation b etween S and M just to the left of M as packed in the high M order word, and the third thing is to bias the k-bit representation of the signed exp onent so as to shift the represented range. In a k-bit eld the range of signed exp onents is: k 1 k 1 2 1 E +2 1: One of the more common ways of biasing the exp onent but not the only wayaswe shall see in the examples k 1 is to add to the signed exp onent a bias of 2 This bias is appropriately subtracted in unpacking for output conversion and other arithmetic op erations. The biased exp onent is called the characteristic E of the b representation. Biasing the number in this way alleviates the need to examine the sign of the exp onent explicitly in making xed p oint comparisons such as in sorting. The range of the characteristic or biased exp onent is then: k 1 k 1 k 1 k 1 k 1 2 +2 +1 E +2 2 +2 1 ; which is equivalentto k 1 E 2 1: b k 1 The term usually used is that the characteristic represents the exp onent excess 2 . Some computer manufacturers do not use a symmetric range and p ermit one more negative exp onent in the representation of the characteristic by allowing a zero characteristic to b e a valid representation. A oating p ointnumb er packed into a single word would then have the following form: S C H AR AC T E R I S T I C MANTISSA M 1 k j The purp ose of normalization is to preserve as many signi cant digits in the representation as p ossible in the j bits available to represent the normalized mantissa. The greatest number of signi cant bits are preserved j of them when the radix of normalization is binary R = 2. When R = 16, the most signi cant non-zero hexadecimal digit in the normalized mantissa mayhaveasmany as three leading zeros in the j -bit binary representation in a binary computer. Some representations assume a normalized fraction, in which case the radix p oint is assumed to lie at the b oundary b etween the characteristic and the mantissa. Other representations assume the mantissa is 1 normalized as an integer e.g., the CDC 6600, CYBER 70 and successor series machines so that the radix p oint is assumed to lie immediately to the right of the mantissa. In the case of binary normalization some representations assume the numb er to b e normalized only if the most signi cant binary digit is a one. If this digit is always a one, there is no uncertainty and therefore no need to devote a bit in the j bits available to represent this digit; it can thus b e represented implicitly as a hidden-bit so that the remaining lower order j binary digits in which there is some uncertainty from number to numb er can be represented. By using a hidden bit the representation p ermits a precision of j + 1 bits for the mantissa, but at the exp ense of p ossibly giving up an explicit representation for zero. For example, hidden bits are used in the DEC PDP-11 series oating p ointnumb er representations and in the IEEE Standard Floating Point representations. In other representations zero is usually handled as a sp ecial case by letting a word with all zeros represent the value zero or equivalently, in one's complement machines letting a vector of all one's represent minus zero. The p ortion of the real line represented by oating p oint formats app ears as follows: N eg ativ e E xpr essibl e N eg ativ e P ositiv e E xpr essibl e P ositiv e O v er f l ow N eg ativ e U nder f l ow U nder f l ow P ositiv e O v er f l ow R eg ion N umber s R eg ion R eg ion N umber s R eg ion 1 Zero ! +1 Using an n-bit word to represent a single precision oating-p ointword, we note that there are exactly n 2 values in the sets of Positive and Negative Expressible Numb ers that can b e represented including zero. Positive and negative oating p ointnumb ers are usually represented in one of twoways dep ending on the computer's choice of arithmetic circuitry and representation for xed p ointnumb ers. Machines that use one's complement representations for xed p ointnumb ers and have one's complement adders in their arithmetic units typically pack the p ositive oating p oint number into one or more words and then complement all bits in this these words to represent the negative oating p oint numb er. Examples of these one's complement representations include the UNIVAC 1100 series machines and the CDC 6600, CYBER 70 series and successor machines. Machines that represent xed p oint numb ers in two's complement form and have two's complement adders in their arithmetic units typically use a sign magnitude representation for p ositive and negative oating p ointnumb ers with S = 0 for p ositivenumb ers and S = 1 for negativenumb ers. The negative M M mantissas are typically converted to their two's complement representations when the oating p ointnumber is unpacked for oating p oint arithmetic op erations; the result's mantissa is then converted back to a sign- magnitude format when it is repacked into the oating p oint representation after an arithmetic op eration or up on input conversion. Machines that use this sign magnitude representation for oating p ointnumb ers include the IBM 360/370 and compatible instruction set architectures, the DEC PDP-11 series and upward compatible machines, the Cray-1 and successors and machines that use the IEEE Floating Point Standard representation. Wenow consider sp eci c examples to illustrate the concepts. Floating p oint formats used on machines of various manufacturers are considered; namely, the UNIVAC 1100 series, the IBM 360/370 series, the DEC PDP-11/VAX-11 series, the CDC 6600/CYBER 70 series, and the IEEE Floating Point Standard series such as the Intel 8087 series numeric data pro cessors.For each machine format considered we shall present oating- p oint representations of the numb ers +29:2 and 29:2 ; and of the numb ers +0:03125 and 0:03125 : 10 10 10 10 Recall that 29:2 =35:1463 =1D:3 = 11101:0011 : 10 8 16 2 Furthermore, recall that 1 =0:00001 =0:02 =0:08 : 0:03125 = 2 8 16 10 32 10 2 UNIVAC 1100 The UNIVAC 1100 series machines use 36-bit words to represent instructions and xed p oint data and a one's complement representation for negativenumb ers with one's complement arithmetic circuitry. A single precision oating p oint datum is packed into a single 36-bit word, and a double precision oating p oint datum is packed into two consecutive 36-bit words with the second 36-bit word representing a continuation of the low order bits in the mantissa. The mantissa is a binary normalized fraction of the form 0:1xxxxxx or is either 2 all zeros or all ones. A single precision datum has an 8-bit characteristic that represents a signed exp onent with bias excess 128 = 200 , and a double precision datum has an 11-bit characteristic that represents 10 8 the signed exp onent excess 1024 = 2000 . Negative oating-p oint numb ers are represented by packing 10 8 the p ositive representation into the single or double words and then taking the one's complement of this these words by logically complementing each of the 36 72 bit p ositions including the characteristic eld.

Load more