32- ALU CSE 675.02: Introduction to ALU Control

A 32 Result 32-bit 32 ALU Zero Overflow Arithmetic / Logic Unit – ALU Carry out B 32 Design • Our ALU should be able to perform functions: – logical and function – logical or function Presentation F –arithmetic add function –arithmetic subtract function –arithmetic slt (set-less-then) function Slides by Gojko Babić – logical nor function • ALU control lines define a function to be performed on A and B. 07/19/2005 g. babic Presentation F 2

Functioning of 32-bit ALU Designing 32-bit ALU: Beginning 1. Let us start with and function ALU Control Operation = 0 Æ and ALU Control lines 4 2. Let us now add or function = 1 Æ or Function Ainvert Binvert Operation a0 and 0 0 00 A 32 0 Result0 Result b0 or 0 0 01 32-bit 32 1 add 0 0 10 ALU Zero Overflow subtract 0 1 10 a1 Carry out Result1 B 32 0 slt 0 1 11 b1 1 nor 1 1 00 a2 0 Result2 • Result lines provide result of the chosen function applied to values of b2 A and B 1 • Since this ALU operates on 32-bit operands, it is called 32-bit ALU •Zerooutput indicates if all Result lines have value 0 • Overflow indicates a sign integer overflow of add and subtract functions; for unsigned integers, this overflow indicator does not provide any useful a31 0 Result31 information b31 • Carry out indicates carry out and unsigned integer overflow 1 g. babic Presentation F 3 g. babic 4 Designing 32-bit ALU: Principles Designing

• Number of functions Operation = 0 Æ and • 32-bit adder is built out of 32 1-bit adders are performed inter- = 1 Æ or nally, but only one CarryIn a0 and result is chosen for 0 Result0 1-bit Adder the output of ALU b0 1-bit Adder or a 1 Input Output Sum a1 and Result1 a b Carry Sum Carry 0 b b1 Figure B.5.2 In Out or • 32-bit ALU is built 1 0 0 0 0 0 out of 32 identical CarryOut 0 0 1 1 0 1-bit ALU’s a2 and 0 Result2 b2 CarryIn 0 1 0 1 0 or 1 From the truth 0 1 1 0 1 table and after a 1 0 0 1 0 minimization, we 1 0 1 0 1 can have this 1 1 0 0 1 a31 and design for CarryOut b 0 Result31 1 1 1 1 1 b31 or 1 Figure B.5.5 CarryOut g. babic 5 g. babic Presentation F 6

32-bit Adder 32-bit ALU With 3 Functions

CarryIn =0 Operation “0” 1-bit ALU Operation a0 CarryIn a0 Cin Result0 sum0 CarryIn ALU0 + b0 b0 Cout CarryOut a This is a ripple carry adder. 0 a1 CarryIn a1 Cin Result1 sum1 ALU1 + 1 b1 b1 Result CarryOut Cout Operation = 00 Æ and The key to speeding up addition = 01 Æ or is determining carry out in the 2 a2 CarryIn = 10 Æ add b Result2 a2 Cin ALU2 sum2 higher order sooner. b2 + CarryOut b2 Cout Result: Carry look-ahead adder. CarryOut Figure B.5.6

a31 Cin sum31 a31 CarryIn + Result31 b31 ALU31 Cout b31 Figure B.5.7 + carry out CarryOut

Carry out g. babic Presentation F 7 g. babic Presentation F 8 32-bit 32-bit Adder / Subtractor

“0”“1” binvert “0”

a0 Cin A – B = A + (–B) Result0 a0 Cin + Result0 b0 0 + Cout = A + B + 1 b0 Cout 1

a1 Cin + Result1 a1 Cin b1 + Result1 Cout b1 0 Cout 1 a2 Cin Result2 a2 Cin Binvert = 0 Æ addition + Result2 b2 + = 1 Æ subtraction Cout b2 0 Cout 1

a31 Cin Result31 Cin + a31 Result31 b31 + Cout b31 0 Cout CarryOut 1 0 CarryOut

g. babic Presentation F 9 g. babic 1 10

32-bit ALU With 4 Functions 2’s Complement Overflow

1-bit ALU Binvert Operation Binvert Operation 1-bit ALU for the most significant bit CarryIn a0 CarryIn Binvert Operation b0 ALU0 Result0 a 2’s complement overflow happens: CarryIn 0 CarryOut • if sum of two positive numbers 1 Result results in a negative number a 0 • if sum of two negative numbers a1 CarryIn b 0 2 b1 ALU1 Result1 results in a positive number 1 1 CarryOut Figure B.5.8 Result b 0 2 CarryOut + a2 CarryIn 1 b2 ALU2 Result2

Less 3 Control lines CarryOut Binvert Operation Function Carry Out (1 line) (2 lines) Overflow CarryIn Overflow and 0 00 detection

or 0 01 a31 CarryIn Result31 b31 ALU31 add 0 10 subtract 1 10 Other 1-bit ALUs, i.e. non-most significant bit ALUs, are not affected. 0 Carry Out

g. babic Presentation F1 11 g. babic Presentation F 12 32-bit ALU With 4 Functions and Overflow Set Less Than (slt) Function

Binvert Operation •sltfunction is defined as: a0 CarryIn b0 ALU0 Result0 000 … 001 if A < B, i.e. if A – B < 0

CarryOut A slt B = Control lines 000 … 000 if A ≥ B, i.e. if A – B ≥ 0 a1 CarryIn Function Binvert Operation b1 ALU1 Result1 • Thus each 1-bit ALU should have an additional input (called (1 line) (2 lines) CarryOut and 0 00 “Less”), that will provide results for slt function. This input has

or 0 01 a2 CarryIn value 0 for all but 1-bit ALU for the least significant bit. b2 ALU2 Result2 add 0 10 CarryOut • For the least significant bit Less value should be sign of A – B subtract 1 10

CarryIn

a31 CarryIn Result31 b31 ALU31 Missing: slt & nor functions and Zero output Overflow

Carry Out g. babic Presentation F Add correction for CarryOut g. babic Presentation F 14

32-bit ALU With 5 Functions 32-bit ALU with 5 Functions and Zero

Binvert Operation Binvert Operation CarryIn BnegateBinvert Operation

a a0 CarryIn 1-bit ALU for non- 0 b0 ALU0 Result0 a0 CarryIn most significant Less Result0 b0 ALU0 1 bits CarryOut Less CarryOut Result b 0 2 Control lines a1 CarryIn 1 a1 b1 ALU1 Result1 Binvert Operation CarryIn Result1 Less 3 0 Less Function b1 ALU1 CarryOut (1 line) (2 lines) 0 Less CarryOut Zero 1-bit ALU for the CarryOut and 0 00 Binvert Operation most significant CarryIn a2 CarryIn a2 or 0 01 CarryIn Result2 bits b2 ALU2 Result2 b2 ALU2 0 Less a add 0 10 0 Less 0 CarryOut CarryOut subtract 1 10 1 slt 1 11 Result CarryIn b 0 + 2 1 Result31 a31 CarryIn a31 CarryIn Result31 Less 3 b31 ALU31 Set b31 ALU31 Set 0 Less Overflow Set 0 Less Overflow Carry Out

Overflow Overflow Carry Out detection Carry Out Add correction for CarryOut Operation = 3 and Binvert =1 for slt function Add correction for CarryOut g. babic Presentation F 32-bit ALU with 6 Functions 32-bit ALU Elaboration A nor B = A and B Binvert • We have now accounted for all but one of the arithmetic and logic functions for the core MIPS instruction set. 32-bit ALU with 6 functions omits support for shift instructions. • It would be possible to widen 1-bit ALU to include 1-bit shift left and/or 1-bit shift right. • Hardware designers created the circuit called a , which can shift from 1 to 31 bits in no more time than Figure B.5.10 (Top) it takes to add two 32-bit numbers. Thus, shifting is normally done outside the ALU. Function Ainvert Binvert Operation and 0 0 00 • We now consider integer multiplication (but not division). or 0 0 01 add 0 0 10 subtract 0 1 10 Carry Out slt 0 1 11 Add correction for CarryOut nor 1 1 00 Figure B.5.12 g. babic + Carry Out + Binvert 17 g. babic Presentation F 18

Multiplication Multiplication : Example • Multiplication is more complicated than addition: • The multiplication can be done with intermediate additions. – accomplished via shifting and addition • The same example: • More time and more area required multiplicand 10001 • Let's look at 3 versions based on elementary school multiplier × 10011 algorithm intermediate product 0000000000 • Example of unsigned multiplication: add since multiplier bit=1 10001 intermediate product 0000010001 5-bit multiplicand 10001 = 17 2 10 shift multiplicand and add since multiplier bit=1 10001 5-bit multiplier × 10011 = 19 2 10 intermediate product 0000110011 10001 shift multiplicand and no addition since multiplier bit=0 10001 shift multiplicand and no addition since multiplier bit=0 00000 shift multiplicand and add multiplier since bit=1 10001 00000 final result 0101000011 10001 .

1010000112 = 32310 • But, this algorithm is very impractical to implement in hardware g. babic Presentation F 19 g. babic Presentation F 20 Multiplication Hardware: 1st Version Multiplication Hardware: 2nd Version

Start Start

Multiplier0 = 1 1. Test Multiplier0 = 0 Multiplier0 = 1 1. Test Multiplier0 = 0 Multiplier0 Multiplier0 Multiplicand Multiplicand

Shift left 32 bits 1a. Add multiplicand to the left half of 1a. Add multiplicand to product and 64 bits theproductandplacetheresultin place the result in Product register the left half of the Product register

Multiplier

Multiplier 32-bit ALU Shift right 2. Shift the Multiplicand register left 1 bit 64-bit ALU Shift right 2. Shift the Product register right 1bit 32 bits 32 bits 3. Shift the Multiplier register right 1 bit 3. Shift the Multiplier register right 1 bit Shift right Product Product Control test Control test Write Write No: < 32 repetitions 32nd repetition? 64 bits No: < 32 repetitions 64 bits 32nd repetition?

Yes: 32 repetitions Yes: 32 repetitions

Done Figure 3.5 Done Figure 3.6 g. babic Presentation F 21 g. babic Presentation F 22

Multiplication Hardware: 3rd Version Multiplication of Signed Integers

• A simple algorithm: Start – Convert to positive integer any of operands (if needed) Multiplicand and remember original signs Product0 =1 Product0 =0 32 bits 1. Test Product0 – Perform multiplication of unsigned numbers using the existing algorithm and hardware 1a. Add multiplicand to the left half of theproductandplacetheresultin 32-bit ALU thelefthalfoftheProductregister – Negate product if original signs disagree • This algorithm is not simple to implement in hardware, since 2. Shift the Product register right 1 bit Shift right Control it has to: Product Write test

No: < 32 repetitions – account in advance about signs, 64 bits 32nd repetition?

Yes: 32 repetitions – if needed, convert from negative to positive numbers,

Done – if needed, convert back to negative integer at the end Figure 3.7 • Fast multiplication algorithms. g. babic Presentation F 23 g. babic Presentation F 24 Real Numbers Floating Point Number Formats • Conversion from real binary to real decimal • The term floating point number refers to representation of real – 1101.10112 = – 13.687510 3 2 0 binary numbers in computers. since: 11012 = 2 + 2 + 2 = 1310 and -1 -3 -4 • IEEE 754 standard defines standards for floating point 0.10112 = 2 + 2 + 2 = 0.5 + 0.125 + 0.0625 = 0.687510 • Conversion from real decimal to real binary: representations

+927.4510 = + 1110011111.01 1100 1100 1100 ….. • Single precision: 927/2 = 463 + ½ Å LSB 0.45 × 2 = 0.9 31 30 23 22 0 463/2 = 231 + ½ 0.9 × 2 = 1.8 s E Fraction 231/2 = 155 + ½ 0.8 × 2 = 1.6 155/2 = 57 + ½ 0.6 × 2 = 1.2 • Double precision: 57/2 = 28 + ½ 0.2 × 2 = 0.4

28/2 = 14 + 0 0.4 × 2 = 0.8 63 62 52 51 32 14/2 = 7 + 0 0.8 × 2 = 1.6 s E Fraction 7/2 = 3 + ½ 0.6 × 2 = 1.2 31 0 3/2 = 1 + ½ 0.2 × 2 = 0.4 Fraction 1/2 = 0 + ½ 0.4 × 2 = 0.8 ……

g. babic Presentation F 25 g. babic Presentation F 26

Converting to Floating Point Floating Point: Example 1 1. Normalize binary real number i.e. put it into the normalized • Find single and double precision of –13.6875 form: 10 Normalized form: (-1)1 × 1.1011011 × 23 (-1)s × 1.Fraction * 2Exp

1 3 – single precision: -1101.10112 = (-1) × 1.1011011 * 2 E = 112 + 011111112 = 100000102 +1110011111.011100 = (-1)0 × 1.110011111011100 * 29 |1|10000010|10110110000000000000000|

2. Load fields of single or double precision format with values – double precision from normalized form, but with the adjustment for E field. E = 112 + 011111111112 = 100000000102 |1|10000000010|10110110000000000000| E = Exp + 127 = Exp + 01111111 for single precision 10 2 |00000000000000000000000000000000| E = Exp + 102310 = Exp + 011111111112 for double precision

• E is called a biased exponent.

g. babic Presentation F 27 g. babic Presentation F 28 Floating Point: Example 2 Converting to Floating Point: Conclusion

• Find single and double precision of +927.4510 • Rules for biased exponents in single precision apply only for Normalized form: (-1)0 × 1.110011111011100 * 29 real exponents in the range [-126,127], thus we can have – single precision biased exponents only in the range [1,254].

E = 10012 + 011111112 = 100010002 • The number 0.0 is represented as S=0, E=0 and Fraction=0. |0|10001000|11001111101110011001100|1100... The infinite number is represented with E=255. There are truncation |0|10001000|11001111101110011001100| some additional rules that are outside our scope. rounding |0|10001000|11001111101110011001101| • Find the largest (non-infinite) real (by – double precision magnitude) which can be represented in a single precision.

E = 10012 + 011111111112 = 10000001000 – Floating point overflow |0|10000001000|11001111101110011001| • Find the smallest (non-zero) real binary number (by |10011001100110011001100110011001|1001100… magnitude) which can be represented in a single precision. – Floating point underflow truncation |10011001100110011001100110011001| rounding |10011001100110011001100110011010| g. babic Presentation F 29 g. babic Presentation F 30

Floating Point Addition Arithmetic Unit for Floating Point Addition

Figure 3.16 Figure 3.17 g. babic Presentation F 31 g. babic Presentation F 32 32-bit ALU with 6 Functions

1 2 0

32

1 0 10

1

g. babic 33