Floating Point Representations

© ENEE 350, C. B. Silio, Jan. 2000

It is assumed that the student is familiar with the discussion in Appendix B of the text by A. Tanenbaum, Structured Computer Organization, 4th Ed., Prentice-Hall, 1999. Here we are concerned in particular with the discussion of floating-point numbers and normalization in pp. 643-651. In the notation used here floating point representations for real numbers have the form:

$$M \times R^{E},$$

where M is a signed mantissa, R is the radix of normalization, and E is a signed exponent. It is the signed values M and E that are typically packed into computer "words" of varying length to encode single, double, and extended precision (short, long, and extended) floating point numbers. The number of bits used in the representation of M determines the precision of the representation. The number of bits used to represent the signed value E determines the range of the representation. Only a finite number of values among those in the representable range can be encoded in the n bits available, and this number is determined by the precision.

To facilitate fixed-point comparisons of floating-point data representations (such as might occur in sorting applications) three things are usually done. The first is to place the sign of the mantissa M (denoted $S_M$) on the left, in the same location as in the sign representation for fixed point numbers. This provides easy detection of the sign of the floating-point data in the same manner as for fixed-point data. The second is to place the signed exponent representation between $S_M$ and M (just to the left of M as packed in the high order word). The third is to bias the k-bit representation of the signed exponent so as to shift the represented range.
In a k-bit field the range of signed exponents is:

$$-(2^{k-1} - 1) \le E \le +(2^{k-1} - 1).$$

One of the more common ways of biasing the exponent (but not the only way, as we shall see in the examples) is to add to the signed exponent a bias of $2^{k-1}$. This bias is appropriately subtracted in unpacking for output conversion and other arithmetic operations. The biased exponent is called the characteristic $E_b$ of the representation. Biasing the number in this way alleviates the need to examine the sign of the exponent explicitly in making fixed point comparisons such as in sorting. The range of the characteristic or biased exponent is then:

$$-2^{k-1} + 2^{k-1} + 1 \le E + 2^{k-1} \le 2^{k-1} + 2^{k-1} - 1,$$

which is equivalent to

$$1 \le E_b \le 2^{k} - 1.$$

The term usually used is that the characteristic represents the exponent excess $2^{k-1}$. Some computer manufacturers do not use a symmetric range and permit one more negative exponent in the representation of the characteristic by allowing a zero characteristic to be a valid representation. A floating point number packed into a single word would then have the following form:

    +-----+----------------+------------+
    | S_M | CHARACTERISTIC | MANTISSA M |
    +-----+----------------+------------+
       1          k              j          (field widths in bits)

The purpose of normalization is to preserve as many significant digits in the representation as possible in the j bits available to represent the normalized mantissa. The greatest number of significant bits are preserved (j of them) when the radix of normalization is binary (R = 2). When R = 16, the most significant non-zero hexadecimal digit in the normalized mantissa may have as many as three leading zeros in the j-bit binary representation in a binary computer. Some representations assume a normalized fraction, in which case the radix point is assumed to lie at the boundary between the characteristic and the mantissa. Other representations assume the mantissa is normalized as an integer (e.g., the CDC 6600, CYBER 70 and successor series machines) so that the radix point is assumed to lie immediately to the right of the mantissa.
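The excess-$2^{k-1}$ biasing described above can be sketched in a few lines of Python. This is only an illustration of the symmetric-range scheme; the function names `exponent_range`, `to_characteristic`, and `from_characteristic` are hypothetical and do not come from any particular machine or library.

```python
def exponent_range(k):
    """Symmetric range of signed exponents for a k-bit field."""
    limit = 2 ** (k - 1) - 1
    return -limit, limit


def to_characteristic(e, k):
    """Bias a signed exponent e by adding 2**(k-1) (excess 2**(k-1))."""
    lo, hi = exponent_range(k)
    if not lo <= e <= hi:
        raise ValueError(f"exponent {e} outside [{lo}, {hi}] for k={k}")
    return e + 2 ** (k - 1)


def from_characteristic(c, k):
    """Remove the bias when unpacking."""
    return c - 2 ** (k - 1)


if __name__ == "__main__":
    k = 8
    lo, hi = exponent_range(k)
    # The characteristic runs from 1 to 2**k - 1 and is never negative,
    # so an unsigned (fixed-point) comparison orders exponents correctly.
    print(lo, hi, to_characteristic(lo, k), to_characteristic(hi, k))
```

Because the characteristic grows monotonically with the signed exponent, comparing characteristics as unsigned integers gives the same ordering as comparing the exponents themselves, which is the point of the bias.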
In the case of binary normalization some representations assume the number to be normalized only if the most significant binary digit is a one. If this digit is always a one, there is no uncertainty and therefore no need to devote a bit in the j bits available to represent this digit; it can thus be represented implicitly as a hidden bit so that the remaining lower order j binary digits (in which there is some uncertainty from number to number) can be represented. By using a hidden bit the representation permits a precision of j + 1 bits for the mantissa, but at the expense of possibly giving up an explicit representation for zero. For example, hidden bits are used in the DEC PDP-11 series floating point number representations and in the IEEE Standard Floating Point representations. In other representations zero is usually handled as a special case by letting a word with all zeros represent the value zero (or equivalently, in one's complement machines, letting a vector of all ones represent minus zero).

The portion of the real line represented by floating point formats appears as follows (left to right, from $-\infty$ through zero to $+\infty$):

    -∞ | Negative Overflow Region | Expressible Negative Numbers | Negative Underflow Region | Zero | Positive Underflow Region | Expressible Positive Numbers | Positive Overflow Region | +∞

Using an n-bit word to represent a single precision floating-point word, we note that there are exactly $2^n$ values in the sets of Positive and Negative Expressible Numbers that can be represented (including zero). Positive and negative floating point numbers are usually represented in one of two ways depending on the computer's choice of arithmetic circuitry and representation for fixed point numbers.
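The gain of one bit of precision from a hidden bit, as discussed above, can be seen in a small Python sketch. It assumes a binary normalized fraction of the form 0.1xxx (so 1/2 <= m < 1), a deliberately tiny field of j = 4 stored bits, and truncation rather than rounding; all of these, and the helper names `normalize`, `stored_bits`, and `decode`, are illustrative assumptions rather than any machine's actual format.

```python
from fractions import Fraction


def normalize(value):
    """Return (m, e) with value == m * 2**e and 1/2 <= m < 1 (positive values only)."""
    m, e = Fraction(value), 0
    if m <= 0:
        raise ValueError("this sketch handles positive values only")
    while m >= 1:
        m /= 2
        e += 1
    while m < Fraction(1, 2):
        m *= 2
        e -= 1
    return m, e


def stored_bits(value, j, hidden_bit):
    """Return (stored j-bit field, exponent), truncating the mantissa."""
    m, e = normalize(value)
    width = j + 1 if hidden_bit else j
    digits = int(m * 2 ** width)      # keep `width` fraction bits, truncate the rest
    if hidden_bit:
        digits -= 1 << j              # the leading 1 is implied, so do not store it
    return digits, e


def decode(digits, e, j, hidden_bit):
    """Rebuild the represented value from the stored field and exponent."""
    if hidden_bit:
        digits += 1 << j              # restore the implied leading 1
        return Fraction(digits, 2 ** (j + 1)) * Fraction(2) ** e
    return Fraction(digits, 2 ** j) * Fraction(2) ** e


if __name__ == "__main__":
    for hidden in (False, True):
        field, e = stored_bits("29.2", 4, hidden)
        print(hidden, format(field, "04b"), e, decode(field, e, 4, hidden))
```

With the same four stored bits, the hidden-bit variant keeps five significant bits of the mantissa and so lands closer to 29.2 than the explicit-bit variant does.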
Machines that use one's complement representations for fixed point numbers and have one's complement adders in their arithmetic units typically pack the positive floating point number into one or more words and then complement all bits in this (these) word(s) to represent the negative floating point number. Examples of these one's complement representations include the UNIVAC 1100 series machines and the CDC 6600, CYBER 70 series and successor machines.

Machines that represent fixed point numbers in two's complement form and have two's complement adders in their arithmetic units typically use a sign magnitude representation for positive and negative floating point numbers, with $S_M = 0$ for positive numbers and $S_M = 1$ for negative numbers. The negative mantissas are typically converted to their two's complement representations when the floating point number is unpacked for floating point arithmetic operations; the result's mantissa is then converted back to a sign-magnitude format when it is repacked into the floating point representation after an arithmetic operation or upon input conversion. Machines that use this sign magnitude representation for floating point numbers include the IBM 360/370 and compatible instruction set architectures, the DEC PDP-11 series and upward compatible machines, the Cray-1 and successors, and machines that use the IEEE Floating Point Standard representation.

We now consider specific examples to illustrate the concepts.
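The two conventions for negative numbers described above differ only in which bits of the packed word change. The following Python sketch contrasts them on a hypothetical 12-bit toy word; the word width and the function names are assumptions chosen for illustration and do not correspond to any of the machines named.

```python
def negate_ones_complement(word, n):
    """One's complement convention: complement every one of the n bits."""
    return word ^ ((1 << n) - 1)


def negate_sign_magnitude(word, n):
    """Sign-magnitude convention: flip only the leftmost (sign) bit S_M."""
    return word ^ (1 << (n - 1))


if __name__ == "__main__":
    n = 12
    positive = 0o2057  # arbitrary positive packed word with S_M = 0
    print(format(negate_ones_complement(positive, n), "04o"))
    print(format(negate_sign_magnitude(positive, n), "04o"))
```

In both conventions the leftmost bit ends up as 1 for a negative number, so the sign can be tested in the same position as for fixed point data.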
Floating point formats used on machines of various manufacturers are considered; namely, the UNIVAC 1100 series, the IBM 360/370 series, the DEC PDP-11/VAX-11 series, the CDC 6600/CYBER 70 series, and the IEEE Floating Point Standard series (such as the Intel 8087 series numeric data processors). For each machine format considered we shall present floating-point representations of the numbers $+29.2_{10}$ and $-29.2_{10}$, and of the numbers $+0.03125_{10}$ and $-0.03125_{10}$. Recall that

$$29.2_{10} = 35.\overline{1463}_{8} = 1D.\overline{3}_{16} = 11101.\overline{0011}_{2}.$$

Furthermore, recall that

$$0.03125_{10} = \left(\tfrac{1}{32}\right)_{10} = 0.00001_{2} = 0.02_{8} = 0.08_{16}.$$

UNIVAC 1100

The UNIVAC 1100 series machines use 36-bit words to represent instructions and fixed point data, and a one's complement representation for negative numbers with one's complement arithmetic circuitry. A single precision floating point datum is packed into a single 36-bit word, and a double precision floating point datum is packed into two consecutive 36-bit words, with the second 36-bit word representing a continuation of the low order bits in the mantissa. The mantissa is a binary normalized fraction of the form $0.1xxxxxx_{2}$ or is either all zeros or all ones. A single precision datum has an 8-bit characteristic that represents a signed exponent with bias excess $128_{10} = 200_{8}$, and a double precision datum has an 11-bit characteristic that represents the signed exponent excess $1024_{10} = 2000_{8}$. Negative floating-point numbers are represented by packing the positive representation into the single (or double) word(s) and then taking the one's complement of this (these) word(s) by logically complementing each of the 36 (72) bit positions, including the characteristic field.
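As a rough check on the single precision description above, the packing can be sketched in Python. Two points are assumptions rather than statements from the text: the mantissa width of 27 bits is inferred as 36 - 1 - 8, and low order mantissa bits are truncated rather than rounded. The function name `pack_univac_single` is hypothetical.

```python
from fractions import Fraction

WORD_BITS = 36
CHAR_BITS = 8
MANT_BITS = WORD_BITS - 1 - CHAR_BITS   # 27, inferred from the field widths
BIAS = 2 ** (CHAR_BITS - 1)             # excess 128 (200 octal)


def pack_univac_single(value):
    """Pack a decimal string or number into a 36-bit word (sketch only)."""
    x = Fraction(value)
    if x == 0:
        return 0                         # all-zeros word represents zero
    negative = x < 0
    m, e = abs(x), 0
    while m >= 1:                        # normalize to 0.1xxx binary
        m /= 2
        e += 1
    while m < Fraction(1, 2):
        m *= 2
        e -= 1
    characteristic = e + BIAS
    if not 0 <= characteristic < 2 ** CHAR_BITS:
        raise OverflowError("exponent does not fit in the characteristic")
    mantissa = int(m * 2 ** MANT_BITS)   # assumption: truncate, do not round
    word = (characteristic << MANT_BITS) | mantissa
    if negative:
        word ^= (1 << WORD_BITS) - 1     # one's complement of all 36 bits
    return word


if __name__ == "__main__":
    for text in ("29.2", "-29.2", "0.03125", "-0.03125"):
        print(text, format(pack_univac_single(text), "012o"))
```

Printing each word as twelve octal digits makes the fields easy to read: the first three digits hold the sign bit and the characteristic, and the remaining nine hold the mantissa.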
