
Slide Set 14 for ENCM 369 Winter 2014 Lecture Section 01

Steve Norman, PhD, PEng

Electrical & Computer Engineering, Schulich School of Engineering, University of Calgary

Winter Term, 2014 ENCM 369 W14 Section 01 Slide Set 14 slide 2/66 Contents

Introduction to Floating-Point Numbers

MIPS Formats for F-P Numbers

IEEE Floating-Point Standards

MIPS Floating-Point Registers

Coprocessor 1

Translating C F-P Code to MIPS A.L.

Quick Overview of F-P Algorithms and Hardware

Some Data and Remarks about Speed of Arithmetic ENCM 369 W14 Section 01 Slide Set 14 slide 3/66 Outline of Slide Set 14

Introduction to Floating-Point Numbers

MIPS Formats for F-P Numbers

IEEE Floating-Point Standards

MIPS Floating-Point Registers

Coprocessor 1

Translating C F-P Code to MIPS A.L.

Quick Overview of F-P Algorithms and Hardware

Some Data and Remarks about Speed of Arithmetic ENCM 369 W14 Section 01 Slide Set 14 slide 4/66 Introduction to floating-point numbers

We’ve finished ENCM 369 coverage of integer representations and arithmetic. We’re moving on to floating-point numbers and arithmetic. (Section 5.3.2 in the textbook for concepts; 6.7.4 for a very brief introduction to MIPS floating-point registers and instructions.) Floating-point is the generic name given to the kinds of numbers you’ve seen in C and C++ with types double and float. ENCM 369 W14 Section 01 Slide Set 14 slide 5/66 Scientific Notation

This is a format that engineering students should be very familiar with! Example: 6.02214179 × 10^23 mol^−1. Example: −1.60217656 × 10^−19 C. Floating-point representation has the same structure as scientific notation, but floating-point typically uses base two, not base ten. ENCM 369 W14 Section 01 Slide Set 14 slide 6/66 Introductory floating-point example

A programmer gives a value to a constant in some C code: const double electron_charge = -1.60217656e-19; The C compiler will use the base ten constant in the C code to create a base two constant a computer can work with. When the program runs, the number the computer uses is −1.0111101001001101101000010110101110011100011110101101 × two^−111111 (an exponent of −63ten), which is very close to but not exactly equal to −1.60217656 × 10^−19. ENCM 369 W14 Section 01 Slide Set 14 slide 7/66 Names for parts of a non-zero floating-point number

[Figure: the example number −1.01001100011010111010111 × two^00001011, with labels identifying the sign, the significand (all of the digits), the fraction (the digits to the right of the binary point), and the exponent.]

The significand includes digits from both sides of the binary point. Another name for significand is mantissa. (Note: This is not base ten, so we should not use the term decimal point!) The fraction is the part of the significand that is to the right of the binary point. So the fraction represents some number that is ≥ 0 but < 1. ENCM 369 W14 Section 01 Slide Set 14 slide 8/66 Normalized non-zero floating-point numbers

In normalized form, an f-p number must have a single 1 immediately to the left of the binary point, and no other 1 bits left of the binary point. Therefore, the significand of a normalized number must be ≥ 1.0 and must also be < 10.0two. (In English: greater than or equal to one, strictly less than two.) ENCM 369 W14 Section 01 Slide Set 14 slide 9/66 Normalized non-zero f-p numbers: examples

Which of the following are in normalized form?

A. −1.00000000 × two^00000101
B. +10.0000000 × two^00100101
C. +1.10001011 × two^00010111
D. −0.11101100 × two^00001100
E. +101.111011 × two^01001100

ENCM 369 W14 Section 01 Slide Set 14 slide 10/66 Example conversion from base ten to base-two floating-point

What is 9.375ten expressed as a normalized f-p number?

What are the sign, significand, fraction, and exponent of this normalized f-p number? ENCM 369 W14 Section 01 Slide Set 14 slide 11/66 Standard organizations for bits of floating-point numbers

For computer hardware to work with f-p numbers there must be precise rules about how to encode these numbers. The most common overall sizes for f-p numbers are 32 bits or 64 bits, but other sizes (e.g., 16, 80, or 128 bits) are possible. We need one bit for the sign and some number of bits for information about the exponent; the remaining bits can be used for information about the significand. ENCM 369 W14 Section 01 Slide Set 14 slide 12/66 Sign information for non-zero f-p numbers

This requires a single bit.

A sign bit of 0 is used for positive numbers.

A sign bit of 1 is used for negative numbers. ENCM 369 W14 Section 01 Slide Set 14 slide 13/66 Exponent information for a non-zero f-p number

Exponents in f-p numbers are signed integers! f-p numbers with small magnitudes will have negative exponents. So of course two’s complement is used for exponents, right . . . ? WRONG! In fact, an alternate system for signed integers, called biased notation, is used for exponents in f-p numbers. (This fact explains why many introductions to two’s-complement systems state that two’s complement is almost always used for signed integers in modern digital hardware.) ENCM 369 W14 Section 01 Slide Set 14 slide 14/66 How does biased notation work?

The biased exponent is equal to the actual exponent plus some number called a bias. The bias is chosen so that roughly half the allowable actual exponents are negative, and roughly half are positive.
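
A quick illustration (an editorial sketch, not part of the original slides): in C, encoding and decoding a biased exponent is just an add and a subtract. The bias of 127 matches the 8-bit example that follows; the actual exponent −5 is an arbitrary value chosen for illustration.

#include <stdio.h>

int main(void)
{
    int bias   = 127;             /* bias for an 8-bit exponent field */
    int actual = -5;              /* arbitrary example actual exponent */

    int biased  = actual + bias;  /* encode: add the bias; -5 + 127 = 122 = 0111_1010two */
    int decoded = biased - bias;  /* decode: subtract the bias to get -5 back */

    printf("actual %d -> biased %d\n", actual, biased);
    printf("biased %d -> actual %d\n", biased, decoded);
    return 0;
}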

Example: The bias for an 8-bit exponent is 127ten, or 0111_1111two. If the actual exponent is 3ten, what is the biased exponent in base ten and base two? ENCM 369 W14 Section 01 Slide Set 14 slide 15/66 Why is biased notation used for exponents in f-p numbers?

It turns out that biased notation helps with the design of relatively small, speedy circuits to decide whether one f-p number is less than another f-p number. (We won’t study the details of that in ENCM 369.) Also, it’s useful that the bit pattern for an actual exponent of zero is not a sequence of zero bits—then a sequence of zero bits can have a different, special meaning. ENCM 369 W14 Section 01 Slide Set 14 slide 16/66 Significand information for a non-zero, normalized f-p number

1.XXX ··· XXX

[Figure callouts: we know the bit to the left of the binary point will be a 1; any pattern of 1’s and 0’s is possible in the X positions.]

There is no need to encode the entire significand. Instead we can record only the bits of the fraction. Leaving out the 1 bit from the left of the binary point allows more precision in the fraction. ENCM 369 W14 Section 01 Slide Set 14 slide 17/66 Outline of Slide Set 14

Introduction to Floating-Point Numbers

MIPS Formats for F-P Numbers

IEEE Floating-Point Standards

MIPS Floating-Point Registers

Coprocessor 1

Translating C F-P Code to MIPS A.L.

Quick Overview of F-P Algorithms and Hardware

Some Data and Remarks about Speed of Arithmetic ENCM 369 W14 Section 01 Slide Set 14 slide 18/66 MIPS formats for 32-bit and 64-bit f-p numbers

32-bit format: bit 31 = sign bit; bits 30–23 = biased exponent; bits 22–0 = fraction

64-bit format: bit 63 = sign bit; bits 62–52 = biased exponent; bits 51–0 = fraction
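
As an aside (an editorial sketch, not from the original slides), the 64-bit layout above can be pulled apart in C with shifts and masks. The value −2.5 is an arbitrary test value, and 1023 is the 64-bit exponent bias given just below.

#include <stdio.h>
#include <stdint.h>
#include <string.h>

int main(void)
{
    double x = -2.5;                        /* arbitrary test value */
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);         /* reinterpret the 64 bits of x */

    uint64_t sign     = bits >> 63;                  /* bit 63     */
    uint64_t biased   = (bits >> 52) & 0x7FF;        /* bits 62-52 */
    uint64_t fraction = bits & 0xFFFFFFFFFFFFFULL;   /* bits 51-0  */

    /* For -2.5 = -1.01two x 2^1: sign is 1, biased exponent is 1 + 1023 = 1024,
       and the fraction is 01 followed by fifty 0's. */
    printf("sign            = %llu\n", (unsigned long long)sign);
    printf("biased exponent = %llu\n", (unsigned long long)biased);
    printf("fraction        = 0x%013llX\n", (unsigned long long)fraction);
    return 0;
}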

Exponent bias for 32-bit format: 127ten = 0111_1111two. Exponent bias for 64-bit format: 1023ten = 011_1111_1111two. ENCM 369 W14 Section 01 Slide Set 14 slide 19/66 MIPS formats for 32-bit and 64-bit f-p numbers

The 32-bit format is called single precision. The 64-bit format is called double precision. We’ll see later that MIPS instruction mnemonics for single-precision operations end in .s, as in mov.s, while the mnemonics for double-precision operations end in .d, as in add.d. ENCM 369 W14 Section 01 Slide Set 14 slide 20/66

Example: How is 9.375ten encoded in 32-bit and 64-bit formats?

From previous work: 9.375 = 9 + 1/4 + 1/8 = 1001.011two = 1.001011 × two^3 (normalized)
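
To check answers to questions like the one below, a short C program (an editorial sketch, not from the slides) can print the bit patterns directly. The %a conversion shows the normalized base-two form; the hex patterns can then be expanded to binary by hand.

#include <stdio.h>
#include <stdint.h>
#include <string.h>

int main(void)
{
    float  f = 9.375f;
    double d = 9.375;
    uint32_t fbits;
    uint64_t dbits;

    memcpy(&fbits, &f, sizeof fbits);      /* grab the 32-bit pattern */
    memcpy(&dbits, &d, sizeof dbits);      /* grab the 64-bit pattern */

    printf("normalized form: %a\n", d);    /* typically prints 0x1.2cp+3 */
    printf("32-bit pattern:  0x%08X\n", (unsigned)fbits);
    printf("64-bit pattern:  0x%016llX\n", (unsigned long long)dbits);
    return 0;
}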

For each of the 32-bit and 64-bit formats, what are the bit patterns for the biased exponents? What are the complete bit patterns for the f-p numbers? ENCM 369 W14 Section 01 Slide Set 14 slide 21/66 More examples

How would −9.375ten be encoded in the 32-bit format?

How would 0.125ten be encoded in the 32-bit format? What base ten number does the 32-bit pattern 1_0111_1110_11_[21 zeros] represent? ENCM 369 W14 Section 01 Slide Set 14 slide 22/66 How to represent zero in f-p formats

A special rule says that if all exponent and fraction bits are zero, the number being represented is 0.0. So, what are the representations of 0.0 in 32-bit and 64-bit formats? ENCM 369 W14 Section 01 Slide Set 14 slide 23/66 Outline of Slide Set 14

Introduction to Floating-Point Numbers

MIPS Formats for F-P Numbers

IEEE Floating-Point Standards

MIPS Floating-Point Registers

Coprocessor 1

Translating C F-P Code to MIPS A.L.

Quick Overview of F-P Algorithms and Hardware

Some Data and Remarks about Speed of Arithmetic ENCM 369 W14 Section 01 Slide Set 14 slide 24/66 IEEE standards for floating-point numbers and arithmetic (1)

IEEE: Institute of Electrical and Electronics Engineers “IEEE 754” and “IEEE floating-point” are informal names for both the original IEEE 754-1985 standard and the revised IEEE 754-2008 standard. Prior to the development of the IEEE 754-1985 standard, different companies produced a wide variety of incompatible schemes for floating-point numbers. ENCM 369 W14 Section 01 Slide Set 14 slide 25/66 IEEE standards for floating-point numbers and arithmetic (2)

Modern computer architectures (if they have f-p at all) typically implement part or all of IEEE standard f-p. MIPS follows the IEEE standard for 32-bit and 64-bit f-p types. The same is true for x86, x86-64, ARM and many other architectures. (So examples in earlier slides were not really MIPS-specific—they would also be correct for many other architectures.) In C and C++, float is typically 32-bit IEEE f-p, and double is typically 64-bit IEEE f-p. ENCM 369 W14 Section 01 Slide Set 14 slide 26/66 Scope of IEEE f-p standards

In addition to 32-bit and 64-bit formats, various other formats are specified, for example, 16-bit and 128-bit formats. There are detailed rules for arithmetic—comparison, addition, multiplication, and many other operations. There are detailed rules for rounding—choosing an approximate value when exact results can’t be represented. ENCM 369 W14 Section 01 Slide Set 14 slide 27/66 Special IEEE f-p bit patterns

exponent bits    fraction bits         meaning
all 0’s          all 0’s               number is 0.0, as seen already
all 0’s          at least one 1 bit    denormalized number
all 1’s          all 0’s               ±infinity, depending on sign bit
all 1’s          at least one 1 bit    NaN: “not a number”
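
The table translates directly into code. Here is a C sketch (an editorial addition, not from the slides) that classifies a 32-bit pattern by looking only at the exponent and fraction fields; the helper name classify and the test values are made up for illustration, and the last case is the normal one described in the next sentence.

#include <stdio.h>
#include <stdint.h>
#include <string.h>
#include <float.h>

static const char *classify(uint32_t bits)     /* classify is a made-up helper name */
{
    uint32_t exponent = (bits >> 23) & 0xFFu;  /* bits 30-23 */
    uint32_t fraction = bits & 0x7FFFFFu;      /* bits 22-0  */

    if (exponent == 0x00u)
        return (fraction == 0) ? "zero" : "denormalized number";
    if (exponent == 0xFFu)
        return (fraction == 0) ? "+/- infinity" : "NaN";
    return "normal non-zero number";           /* exponent has both 0's and 1's */
}

int main(void)
{
    float zero = 0.0f;
    float tests[] = { 0.0f, 9.375f, FLT_MIN / 2.0f, 1.0f / zero, zero / zero };
    for (int k = 0; k < (int)(sizeof tests / sizeof tests[0]); k++) {
        uint32_t bits;
        memcpy(&bits, &tests[k], sizeof bits);
        printf("0x%08X : %s\n", (unsigned)bits, classify(bits));
    }
    return 0;
}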

If the exponent field of an IEEE f-p bit pattern has at least one 0 bit and at least one 1 bit, the bit pattern represents a normal, non-zero f-p number. ENCM 369 W14 Section 01 Slide Set 14 slide 28/66 Denormalized numbers

These are non-zero numbers with magnitudes so tiny that they can’t be represented in the normal sign-exponent-fraction format. Example: 1.25 × 2^−128 in the 32-bit format. The range of biased exponents is 0000_0001two to 1111_1110two, that is, 1 to 254ten, which allows encoding of actual exponents from −126ten to +127ten. We will NOT study the details of the denormalized number format in ENCM 369. (However, if you’re curious, . . . 1.25 × 2^−128 is represented as 0 00000000 01010000000000000000000 in the 32-bit format.) ENCM 369 W14 Section 01 Slide Set 14 slide 29/66 Infinity

IEEE standard f-p arithmetic specifies many ways to generate ±infinity. Some common examples . . .

- x / 0.0 generates +∞ if x > 0.0.

- x / 0.0 generates −∞ if x < 0.0.

- If a and b are regular f-p numbers but the “everyday math” product a × b is too large in magnitude to be an f-p number, then a * b will be ±∞, depending on the signs of a and b. ENCM 369 W14 Section 01 Slide Set 14 slide 30/66 NaN: “not a number”

NaN is specified as the result for many computations where not even ±infinity makes sense as a result. Examples . . .

- 0.0 / 0.0

- infinity / infinity

- sqrt(x), where x < 0.0

- asin(x), where x > 1.0 or x < −1.0 (asin is the C library inverse sine function.)

- arithmetic operation with one or more NaNs as inputs, e.g., 1.0 + x, where x is NaN ENCM 369 W14 Section 01 Slide Set 14 slide 31/66 Demonstration of f-p infinity In “everyday math”, 1630^3 = 4,330,747,000 and (5.7 × 10^102)^3 = 1.85193 × 10^308.

#include <stdio.h>

int main(void)
{
    int i = 1630;
    double d1 = 1630.0, d2 = 5.7e102;
    printf("%d cubed is %d\n", i, i * i * i);
    printf("%.1f cubed is %.1f\n", d1, d1 * d1 * d1);
    printf("%g cubed is %g\n", d2, d2 * d2 * d2);
    return 0;
}

Program output . . .

1630 cubed is 35779704
1630.0 cubed is 4330747000.0
5.7e+102 cubed is inf ENCM 369 W14 Section 01 Slide Set 14 slide 32/66 Demonstration of Not a Number

#include <stdio.h>
#include <math.h>

int main(void)
{
    double a = 1.0, b = 2.0, c = 2.0;
    double sqrt_of_d, r1, r2;
    sqrt_of_d = sqrt(b * b - 4.0 * a * c);
    r1 = (-b + sqrt_of_d) / (2.0 * a);
    r2 = (-b - sqrt_of_d) / (2.0 * a);
    printf("r1 = %g, r2 = %g\n", r1, r2);
    return 0;
}

Program output . . .

r1 = -nan, r2 = -nan ENCM 369 W14 Section 01 Slide Set 14 slide 33/66 The usefulness of infinity and NaN

Recall that for integer addition, subtraction, and multiplication, C and C++ systems usually will NOT tell you that results are wrong because magnitudes of numbers got out of hand.

Results of ±infinity or NaN in floating-point computation clearly indicate that something has gone wrong. This is helpful! Of course, absence of ±infinity and NaN does NOT prove that your program’s results are correct! ENCM 369 W14 Section 01 Slide Set 14 slide 34/66 ENCM 369 Lecture Document: “Floating-Point Format Examples”

Please read this document carefully. Here are some brief notes on what you will see:

- π cannot be represented exactly in f-p format. (This is probably not a surprise.)

- 0.6 cannot be represented exactly in f-p format. (This might be surprising.)

- 32-bit and 64-bit f-p approximations are given for π and 0.6.

- f-p bit patterns for 1.0 are given; they are very different from integer bit patterns for 1. ENCM 369 W14 Section 01 Slide Set 14 slide 35/66 Outline of Slide Set 14

Introduction to Floating-Point Numbers

MIPS Formats for F-P Numbers

IEEE Floating-Point Standards

MIPS Floating-Point Registers

Coprocessor 1

Translating C F-P Code to MIPS A.L.

Quick Overview of F-P Algorithms and Hardware

Some Data and Remarks about Speed of Arithmetic ENCM 369 W14 Section 01 Slide Set 14 slide 36/66 Floating-point registers

Most processor architectures that have f-p instructions have a set of floating-point registers (FPRs) that is separate from the set of general-purpose registers (GPRs). Important: Most f-p instructions have only FPRs as sources and destination! But there have to be a few instructions for copying data between FPRs and GPRs, or between FPRs and memory. ENCM 369 W14 Section 01 Slide Set 14 slide 37/66 MIPS FPRs

There are 16 64-bit double-precision FPRs: $f0, $f2, $f4, . . . , $f28, $f30. (Note that odd numbers are not allowed for names of these double-precision registers.) There are 32 32-bit single-precision FPRs: $f0, $f1, $f2, . . . , $f30, $f31. Attention: Unlike the set of GPRs, where $zero has special behaviour, none of the FPRs hold a constant value of 0.0. Section 6.7.4 in the textbook suggests names such as $fv0, $fv1, $ft0–$ft3, and so on for the 64-bit FPRs. Those names do not work in MARS! ENCM 369 W14 Section 01 Slide Set 14 slide 38/66 MIPS FPR organization: Each 64-bit FPR shares bits with two 32-bit FPRs . . . purple: 64-bit double-precision FPRs green: 32-bit single-precision FPRs

[Figure: each 64-bit double-precision FPR overlays a pair of 32-bit single-precision FPRs: $f0 overlays $f0 and $f1, $f2 overlays $f2 and $f3, . . . , $f30 overlays $f30 and $f31.] ENCM 369 W14 Section 01 Slide Set 14 slide 39/66 Each 64-bit MIPS FPR shares bits with two 32-bit FPRs: Detailed example

[Figure: bit numbering within the 64-bit $f4. Bits 63–32 of the 64-bit $f4 are the same bits as the 32-bit $f5, and bits 31–0 of the 64-bit $f4 are the same bits as the 32-bit $f4.]

A program using the 64-bit $f4 for a double variable must not at the same time use the 32-bit $f4 or $f5 for a float variable! ENCM 369 W14 Section 01 Slide Set 14 slide 40/66 Outline of Slide Set 14

Introduction to Floating-Point Numbers

MIPS Formats for F-P Numbers

IEEE Floating-Point Standards

MIPS Floating-Point Registers

Coprocessor 1

Translating C F-P Code to MIPS A.L.

Quick Overview of F-P Algorithms and Hardware

Some Data and Remarks about Speed of Arithmetic ENCM 369 W14 Section 01 Slide Set 14 slide 41/66 Coprocessor 1: The MIPS term for floating-point unit

In the old days, when dinosaurs roamed, and processor chips only had hundreds of thousands of transistors (or less), f-p units were literally “coprocessors”. The main processor and the f-p unit were separate chips with separate sockets on a motherboard. Example: Intel 80386 (main processor) and 80387 (f-p unit). ENCM 369 W14 Section 01 Slide Set 14 slide 42/66 Coprocessor 1, continued

In 2014, a single chip (with hundreds of millions of transistors) can have 2 or 4 or 8 cores; each core has a main processor, its own floating-point unit, and a lot of other stuff. Students in ENCM 369 need to know that “coprocessor 1” means “floating-point unit”, because c1 shows up in the mnemonics for many MIPS f-p instructions, and because the “Coproc 1” tab in MARS is where you need to look to find values of FPRs. ENCM 369 W14 Section 01 Slide Set 14 slide 43/66 Some important MIPS c1 instructions

mnemonic   what mnemonic stands for / operation performed
mtc1       move to coprocessor 1 / copy 32 bits from GPR to 32-bit FPR
mfc1       move from coprocessor 1 / copy 32 bits from 32-bit FPR to GPR
lwc1       load word to coprocessor 1 / copy 32 bits from memory to 32-bit FPR
swc1       store word from coprocessor 1 / copy 32 bits from 32-bit FPR to memory
ldc1       load double to coprocessor 1 / copy 64 bits from memory to 64-bit FPR
sdc1       store double from coprocessor 1 / copy 64 bits from 64-bit FPR to memory

ENCM 369 W14 Section 01 Slide Set 14 slide 44/66 Outline of Slide Set 14

Introduction to Floating-Point Numbers

MIPS Formats for F-P Numbers

IEEE Floating-Point Standards

MIPS Floating-Point Registers

Coprocessor 1

Translating C F-P Code to MIPS A.L.

Quick Overview of F-P Algorithms and Hardware

Some Data and Remarks about Speed of Arithmetic ENCM 369 W14 Section 01 Slide Set 14 slide 45/66 Example translation #1 from C code to MIPS f-p instructions

x = i + y; x and y are of type double in $f2 and $f4, and i is of type int in $s0. WRONG ANSWER: addu $f2, $s0, $f4

Why is this a wrong answer? What would be correct code? Let’s trace the correct code assuming that $s0 contains 2 and $f4 contains 1.5. ENCM 369 W14 Section 01 Slide Set 14 slide 46/66 Example translation #2 from C code to MIPS f-p instructions

if (x < y) x = y; x and y are of type double in $f2 and $f4. WRONG ANSWER:

slt   $t0, $f2, $f4
bne   $t0, $zero, L1
add.d $f2, $f4, $zero
L1:

Why is this a wrong answer? What would be correct code? ENCM 369 W14 Section 01 Slide Set 14 slide 47/66 F-P comparisons in MIPS: How to compare?

type is d for double precision, s for single precision . . .

Instruction                  Test
c.lt.type  FPR1, FPR2        is FPR1 < FPR2 ?
c.le.type  FPR1, FPR2        is FPR1 ≤ FPR2 ?
c.eq.type  FPR1, FPR2        is FPR1 = FPR2 ?

slt puts its result in a GPR. Where does an f-p comparison put its result? ENCM 369 W14 Section 01 Slide Set 14 slide 48/66 F-P comparisons in MIPS: How to branch?

bc1t: branch if coprocessor 1 flag is true. bc1f: branch if coprocessor 1 flag is false. Messy detail: Actually, MIPS has eight separate coprocessor 1 flag bits, but by default c.lt.d, c.le.d, c.eq.d, c.lt.s, c.le.s, c.eq.s, bc1t and bc1f all access the same single flag bit. ENCM 369 W14 Section 01 Slide Set 14 slide 49/66 Key things to learn from examples #1 and #2

Do NOT assume that f-p instructions are organized just like integer instructions! Mixing types often works in C arithmetic expressions but usually DOESN’T work in assembly language arithmetic instructions. Before writing f-p MARS code in Lab 12, carefully study f-p instruction documentation provided along with the lab instructions. ENCM 369 W14 Section 01 Slide Set 14 slide 50/66 Register-use conventions and FPRs in ENCM 369

Students are expected to know conventions related to use of GPRs. Use of FPRs makes register-use conventions much more complicated. We’ll use simplified register-use conventions for FPRs; each lab and final-exam f-p programming problem will give a description of FPR-use conventions needed for that problem. ENCM 369 W14 Section 01 Slide Set 14 slide 51/66 Addresses live in GPRs, never in FPRs!

void foo(void)
{
    double d;      /* What kind of register should be used for d ? */
    double *p;     /* What kind of register should be used for p ? */
    /* more code */
}

ENCM 369 W14 Section 01 Slide Set 14 slide 52/66 More detail about MIPS f-p programming

There won’t be any more lecture time spent on details of MIPS floating-point instructions. You’ll learn about the most frequently-used f-p instructions by doing Lab 12. ENCM 369 W14 Section 01 Slide Set 14 slide 53/66 Outline of Slide Set 14

Introduction to Floating-Point Numbers

MIPS Formats for F-P Numbers

IEEE Floating-Point Standards

MIPS Floating-Point Registers

Coprocessor 1

Translating C F-P Code to MIPS A.L.

Quick Overview of F-P Algorithms and Hardware

Some Data and Remarks about Speed of Arithmetic ENCM 369 W14 Section 01 Slide Set 14 slide 54/66 The minifloat type (as introduced in Lab 12)

This is an 8-bit f-p type similar to the IEEE 754 types, but with tiny, tiny exponent and fraction fields . . .

bit 7: sign bit; bits 6-4: biased exponent; bits 3-0: fraction. Minifloat is useless for practical computation, but good for classroom examples and pencil-and-paper lab exercises.

The exponent bias is 3ten = 011two. ENCM 369 W14 Section 01 Slide Set 14 slide 55/66 Let’s try to understand f-p addition by adding two minifloats . . .

(Similar steps would be needed for 32- or 64-bit addition, but we would have to keep track of a lot more bits!) Bits of a are 01001111; bits of b are 00010110.
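
If you would like to check the two values in software first, the sketch below (an editorial addition, not from the slides) decodes a normalized minifloat using the bias of 3. The function name minifloat_value is made up, and zero, denormals, infinity and NaN are ignored to keep it short.

#include <stdio.h>
#include <math.h>

/* minifloat_value is a made-up helper: decode a normalized 8-bit minifloat
   (sign in bit 7, 3-bit biased exponent in bits 6-4 with bias 3,
   4-bit fraction in bits 3-0). */
static double minifloat_value(unsigned bits)
{
    int sign     = (bits >> 7) & 0x1;
    int biased   = (bits >> 4) & 0x7;
    int fraction = bits & 0xF;

    double magnitude = ldexp(1.0 + fraction / 16.0, biased - 3);
    return sign ? -magnitude : magnitude;
}

int main(void)
{
    printf("a = %g\n", minifloat_value(0x4F));   /* bits 0100 1111 */
    printf("b = %g\n", minifloat_value(0x16));   /* bits 0001 0110 */
    return 0;
}

Running it prints a = 3.875 and b = 0.34375, matching the values stated next.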

a represents 3.875ten; b represents 0.34375ten. (Check these values yourself!) So, how to compute the best possible minifloat result for a + b ? ENCM 369 W14 Section 01 Slide Set 14 slide 56/66 Rounding errors in f-p arithmetic

Both rounded results in the minifloat addition example are approximations to the exact sum, which is 4.21875ten. The same kind of rounding errors will occur in 32- and 64-bit f-p arithmetic operations. Relative sizes of rounding errors decrease as the number of fraction bits increases. ENCM 369 W14 Section 01 Slide Set 14 slide 57/66 Floating-point hardware example: Adder

(Unfortunately this year’s textbook doesn’t have any example circuits for f-p arithmetic.) An f-p adder would have to implement all the steps we’ve just seen in minifloat addition:

- comparing exponents of the two inputs

- shifting the input with the smaller exponent

- adding

- normalizing the sum

- rounding the sum

Note how much more complicated this is than a simple integer adder! ENCM 369 W14 Section 01 Slide Set 14 slide 58/66 Floating-point hardware concepts: Just a couple of remarks

f-p arithmetic circuits are significantly larger and more complex than integer arithmetic circuits. But modern f-p circuits are very fast, because f-p performance has been a key selling point for processors and other digital hardware . . .

- games and video processing for consumers

- high-speed number-crunching for science and industry ENCM 369 W14 Section 01 Slide Set 14 slide 59/66 Remember this: F-P math is usually approximate

/* Classic mistake: Counting using fractions ... */
double x;
for (x = 0.1; x <= 0.3; x += 0.1)
    printf("x is %f\n", x);

Expected output . . .        Actual output . . .

x is 0.100000                x is 0.100000
x is 0.200000                x is 0.200000
x is 0.300000
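
One common way to avoid this kind of problem (an editorial sketch, not part of the original slides) is to count with an exact integer and compute each f-p value from the counter, so rounding error cannot accumulate in the loop condition:

int i;
double x;
for (i = 1; i <= 3; i++) {
    x = i / 10.0;              /* 0.1, 0.2, 0.3 each computed from an exact integer */
    printf("x is %f\n", x);
}

This version prints all three expected lines.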

What went wrong here? ENCM 369 W14 Section 01 Slide Set 14 slide 60/66 Outline of Slide Set 14

Introduction to Floating-Point Numbers

MIPS Formats for F-P Numbers

IEEE Floating-Point Standards

MIPS Floating-Point Registers

Coprocessor 1

Translating C F-P Code to MIPS A.L.

Quick Overview of F-P Algorithms and Hardware

Some Data and Remarks about Speed of Arithmetic ENCM 369 W14 Section 01 Slide Set 14 slide 61/66 Some Data and Remarks about Speed of Arithmetic

See the lecture document called “Arithmetic Performance Examples”. Let’s make some notes about the copy_test and op_test functions.

Computer used: 2009 MacBook Pro with 2013 version of C compiler from Apple. ENCM 369 W14 Section 01 Slide Set 14 slide 62/66 Speed of Arithmetic: Observations (1)

copy_test runs at the same speed for int, double and float. This isn’t surprising—x86 and x86-64 D-caches are designed to allow reading or writing 64-bit data in a single access. Addition, multiplication and shifts are “almost free” for this particular arrangement of C code and C compiler—op_test never takes much more time than copy_test, except when OP_CHOICE asks for division. ENCM 369 W14 Section 01 Slide Set 14 slide 63/66 Speed of Arithmetic: Observations (2)

Multiplication is approximately as fast as addition for all three types. Using a shift to multiply ints by 512 = 2^9 was not significantly faster than using multiplication—performance of integer multipliers in modern hardware is very good. Division, for all three types, is terribly slow compared to addition and multiplication. ENCM 369 W14 Section 01 Slide Set 14 slide 64/66 Speed of Arithmetic: Observations (3)

f-p addition and multiplication are sometimes faster and never very much slower than corresponding integer operations. Except for division, double-precision math seems to be just as fast as single-precision math. ENCM 369 W14 Section 01 Slide Set 14 slide 65/66 Speed of Arithmetic: Programming Ideas (1)

With current processors, do not prefer integer arithmetic over f-p just to get a speed increase. (Many years ago, f-p arithmetic was usually much slower than integer arithmetic.) Try to avoid division in loops where your programs spend a lot of time. ENCM 369 W14 Section 01 Slide Set 14 slide 66/66 Speed of Arithmetic: Programming Ideas (2)

Relative performance for different types and different operations is highly processor-dependent. Technology can change a lot within just a few years! So don’t rely on today’s x86-64 results to guide your number-crunching designs on ARM (or some other hardware) in 2017! Try different data types and algorithms and measure performance.