
Designing an Arithmetic and Logic Unit (ALU)

- Traditional circuit design: truth table approach
- Cost/Speed tradeoff:
  • n-bit adder: truth table with 2n inputs, 2^(2n) rows → fast circuit (theoretically), but very costly
  • n 1-bit adders: cheap but slow
  • Tradeoff: n 1-bit adders and additional circuitry to speed up computation
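To get a feel for why the single truth-table design is "very costly", the row count 2^(2n) can be tabulated for a few word sizes. This is only an illustrative sketch; the function name `truth_table_rows` is an assumption introduced here, not part of the slides.

```python
def truth_table_rows(n: int) -> int:
    """Rows in the truth table of an n-bit adder, which has 2n input bits."""
    return 2 ** (2 * n)

# The table size explodes well before realistic word sizes are reached.
for n in (1, 4, 8, 16, 32):
    print(f"n = {n:2d}: {truth_table_rows(n)} rows")
```

Each added bit multiplies the table size by four, which is why the slides fall back on chaining 1-bit adders.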

1-bit Adder

Half adder: S = x ⊕ y, R = x.y

Full adder (r = carry-in, S = sum, R = carry-out):

r x y | S R
0 0 0 | 0 0
0 0 1 | 1 0
0 1 0 | 1 0
0 1 1 | 0 1
1 0 0 | 1 0
1 0 1 | 0 1
1 1 0 | 0 1
1 1 1 | 1 1

R = x.y + r.(x + y)
  = x.y + r.(x ⊕ y)
S = r'.(x'.y + x.y') + r.(x.y + x'.y') = r ⊕ x ⊕ y
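The full-adder equations can be checked against the truth table with a few lines of Python. This is a minimal sketch: the function names `half_adder` and `full_adder` are chosen here for illustration, and bits are modelled as the integers 0 and 1.

```python
from itertools import product

def half_adder(x, y):
    """Return (S, R) for two input bits: S = x XOR y, R = x AND y."""
    return x ^ y, x & y

def full_adder(x, y, r):
    """Return (S, R) for input bits x, y and carry-in r."""
    s = r ^ x ^ y                      # S = r XOR x XOR y
    carry = (x & y) | (r & (x ^ y))    # R = x.y + r.(x XOR y)
    return s, carry

# Enumerate the 8 rows and check both forms of the carry equation.
for r, x, y in product((0, 1), repeat=3):
    s, carry = full_adder(x, y, r)
    assert 2 * carry + s == x + y + r            # arithmetic meaning of (R, S)
    assert carry == (x & y) | (r & (x | y))      # R = x.y + r.(x + y)
    print(r, x, y, "|", s, carry)
```

The two carry expressions agree because the only input where x + y and x ⊕ y differ is x = y = 1, and there x.y is already 1.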

n-bit Addition/Subtraction

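[Figure: 4-bit adder/subtractor. Four chained 1-bit adders take x3..x0 and y3..y0; each y_i is combined with the control bit c (y_i ⊕ c) before entering its adder, and c is also the carry-in of the least significant adder. Carries r3..r0, sums s3..s0.]

- c = 0: addition; c = 1: subtraction
- Subtraction:
  • X - Y = X + (-Y) = X + Y' + 1
  • c = 1 complements Y and supplies the +1 as carry-in → no additional adder
- Bottleneck: carry propagation

      r3 r2 r1 r0
  X =    x3 x2 x1 x0
+ Y =    y3 y2 y1 y0
  Z = r3 s3 s2 s1 s0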
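The adder/subtractor can be simulated bit by bit. The sketch below assumes that each y_i is XORed with the control bit c and that c is fed in as the initial carry, which is the usual way to realise X + Y' + 1; the helper names (`add_sub`, `to_bits`, `from_bits`) are illustrative only.

```python
def full_adder(x, y, r):
    """1-bit full adder: returns (sum bit, carry-out)."""
    return r ^ x ^ y, (x & y) | (r & (x ^ y))

def to_bits(value, n):
    """n-bit pattern of value, least significant bit first."""
    return [(value >> i) & 1 for i in range(n)]

def from_bits(bits):
    """Unsigned integer encoded by bits (least significant bit first)."""
    return sum(b << i for i, b in enumerate(bits))

def add_sub(x_bits, y_bits, c):
    """Ripple-carry adder/subtractor: c = 0 adds, c = 1 subtracts."""
    carry = c                      # c doubles as the carry-in (the "+ 1")
    s_bits = []
    for x, y in zip(x_bits, y_bits):
        s, carry = full_adder(x, y ^ c, carry)   # y ^ c complements y when c = 1
        s_bits.append(s)
    return s_bits, carry           # the carry ripples through every stage

s, carry_out = add_sub(to_bits(6, 4), to_bits(5, 4), 1)
print(from_bits(s))   # 6 - 5 on 4 bits -> 1
```

The loop makes the bottleneck visible: stage i cannot finish before stage i-1 has produced its carry.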

A Simple 1-bit ALU

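[Figure: 1-bit ALU. Inputs x, y and carry-in r feed the elementary operations; a multiplexor (MUX) driven by the control bits c1 c0 selects the output; the adder also produces the carry-out R.]

- Multiplexor (MUX): selects among several inputs

  c1 c0 | m
  0  0  | m0
  0  1  | m1
  1  0  | m2
  1  1  | m3

- Elementary operations:
  • ADD, AND, NOT
- ALU: Arithmetic and Logic Unit

  c1 c0 | ALU
  0  0  | ADD
  0  1  | AND
  1  0  | NOT
  1  1  | d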
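A 1-bit ALU of this kind can be sketched as three elementary operations feeding a 4-input multiplexor. Two points below are assumptions, not stated on the slide: NOT is applied to x, and the unused input m3 (control code 1 1, marked "d") is tied to 0. The names `mux4` and `alu_1bit` are illustrative.

```python
def mux4(c1, c0, m0, m1, m2, m3):
    """4-to-1 multiplexor: the control bits c1 c0 select one of m0..m3."""
    return (m0, m1, m2, m3)[2 * c1 + c0]

def alu_1bit(c1, c0, x, y, r):
    """1-bit ALU. Returns (output bit, carry-out); the carry only matters for ADD."""
    add_out = x ^ y ^ r                   # sum bit of the full adder
    carry = (x & y) | (r & (x ^ y))       # carry-out of the full adder
    and_out = x & y
    not_out = 1 - x                       # assumption: NOT acts on x
    # All operations are computed in parallel; the MUX picks one result.
    out = mux4(c1, c0, add_out, and_out, not_out, 0)  # assumption: m3 tied to 0
    return out, carry

print(alu_1bit(0, 0, 1, 1, 0))  # ADD: 1 + 1 -> sum 0, carry 1
print(alu_1bit(0, 1, 1, 1, 0))  # AND: output bit 1
```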

n-bit ALU

- n 1-bit ALUs
- Embryo of instruction set:

  c1 c0 | ALU
  0  0  | ADD
  0  1  | AND
  1  0  | NOT
  1  1  | d

[Figure: n-bit ALU with n-bit inputs X and Y, n-bit output Z, and control bits c1 c0.]
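Chaining n 1-bit slices gives the n-bit ALU, and the pair (c1, c0) then behaves like a tiny opcode. The sketch below is one possible reading: it assumes NOT applies to X, returns 0 for the code 1 1, and uses the hypothetical names `alu_slice` and `alu`.

```python
def alu_slice(c1, c0, x, y, r):
    """One 1-bit ALU slice: returns (output bit, carry-out)."""
    add_out = x ^ y ^ r
    carry = (x & y) | (r & (x ^ y))
    # Index 3 (code 1 1) is not defined on the slide; 0 is an assumption.
    out = (add_out, x & y, 1 - x, 0)[2 * c1 + c0]
    return out, carry

def alu(c1, c0, x, y, n=4):
    """n-bit ALU built from n slices; x, y, and the result are n-bit patterns."""
    z, carry = 0, 0
    for i in range(n):
        bit, carry = alu_slice(c1, c0, (x >> i) & 1, (y >> i) & 1, carry)
        z |= bit << i
    return z

print(format(alu(0, 0, 0b0110, 0b0101), "04b"))  # ADD -> 1011
print(format(alu(0, 1, 0b0110, 0b0101), "04b"))  # AND -> 0100
print(format(alu(1, 0, 0b0110, 0b0000), "04b"))  # NOT -> 1001
```

Only the ADD path uses the carry chain; AND and NOT are purely bitwise, so every slice computes them independently.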

Overflow

  carries     1 0 0            carries   1 0 0 0
     6        0 1 1 0              -4      1 1 0 0
  +  5      + 0 1 0 1           +  -7    + 1 0 0 1
  ----      ---------           -----    ---------
    -5        1 0 1 1               5    1 0 1 0 1

- How can overflow be detected in two's complement?

Overflow

- Overflow signal Soverflow (1 = overflow)
- Overflow in two's complement:
  • x ≤ 0, y ≥ 0 → no overflow possible
  • x ≥ 0, y ≥ 0:
    Overflow: Z = X + Y (two's complement)
    Z > 2^(n-1) - 1, and 0 ≤ X ≤ 2^(n-1) - 1, 0 ≤ Y ≤ 2^(n-1) - 1
    ⇒ 2^(n-1) - 1 < Z ≤ 2^n - 2
    ⇒ In two's complement, negative numbers are coded in [2^(n-1); 2^n - 1]
    ⇒ Z is negative
  • x ≤ 0, y ≤ 0: same; in case of overflow, Z is positive
  → overflow detection criterion:

Soverflow = x_{n-1}.y_{n-1}.z'_{n-1} + x'_{n-1}.y'_{n-1}.z_{n-1}

z_{n-1} x_{n-1} y_{n-1} | Soverflow
   0       0       0    |    0
   0       0       1    |    0
   0       1       0    |    0
   0       1       1    |    1
   1       0       0    |    1
   1       0       1    |    0
   1       1       0    |    0
   1       1       1    |    0
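The detection criterion only looks at the three sign bits. A minimal sketch, assuming operands are given as n-bit two's-complement patterns (non-negative Python integers below 2^n); the function name `add_with_overflow` is illustrative.

```python
def add_with_overflow(x, y, n=4):
    """Add two n-bit two's-complement patterns; return (z, Soverflow)."""
    z = (x + y) & ((1 << n) - 1)          # keep the n low-order bits
    xs = (x >> (n - 1)) & 1               # sign bit x_{n-1}
    ys = (y >> (n - 1)) & 1               # sign bit y_{n-1}
    zs = (z >> (n - 1)) & 1               # sign bit z_{n-1}
    # Overflow when both operands share a sign and the result has the other one.
    overflow = (xs & ys & (1 - zs)) | ((1 - xs) & (1 - ys) & zs)
    return z, overflow

print(add_with_overflow(0b0110, 0b0101))  # 6 + 5    -> (11, 1): pattern 1011 reads as -5
print(add_with_overflow(0b1100, 0b1001))  # -4 + -7  -> (5, 1): pattern 0101 reads as 5
print(add_with_overflow(0b0011, 0b1111))  # 3 + -1   -> (2, 0): no overflow
```

Operands of opposite sign never set the signal, which matches the first case on the slide.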

Overflow

- Action upon overflow?
- Several solutions:
  • Stop the program (TRAP)
    Example: MIPS R3000
  • Raise a signaling bit such as Soverflow
    Example: Intel, Sun SPARC
  • Do nothing
    Example: using C on a 32-bit Sun SPARC, range -2147483648 (-2^31) → 2147483647 (2^31 - 1)

    01110111001101011001010000000000    (2000000000)
  + 01110111001101011001010000000000    (2000000000)
  ----------------------------------
    11101110011010110010100000000000    (-294967296)
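The "do nothing" case can be reproduced by emulating 32-bit wraparound. Python integers do not overflow, so the sketch below masks the result to 32 bits and reinterprets it as two's complement; `wrap_int32` is a helper name introduced here for illustration.

```python
def wrap_int32(value):
    """Reinterpret the low 32 bits of value as a signed two's-complement integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value

a = 0b01110111001101011001010000000000   # 2000000000
total = wrap_int32(a + a)
print(a, total)                          # 2000000000 -294967296
print(format(total & 0xFFFFFFFF, "032b"))
```

The true sum, 4000000000, exceeds 2^31 - 1, so the sign bit of the 32-bit result is set and the value is read back as a negative number.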

Speeding Up Carry Propagation

[Figure: 8-bit adder with carry-in c. Ripple-carry chain: t_r7 = 16, t_r3 = 8. Carry look-ahead (CLA) over the signals p7g7 ... p0g0: t_r7 = 6, t_r3 = 4.]

CLA: Carry Look Ahead

r0 = (x0.y0) + c.(x0 ⊕ y0)
   = g0 + c.p0          (g: carry generation, p: carry propagation)
r1 = g1 + r0.p1
   = (g1 + g0.p1) + c.(p0.p1)
   = G1 + c.P1
r2 = G2 + c.P2
r3 = G3 + c.P3

Timing:
  x, y                                  (t = 0)
  p, g                                  (t = 2)
  r_{4k-1}                              (t = t_{r4k-1})
  r_{4k}, r_{4k+1}, r_{4k+2}, r_{4k+3}  (t = max(t_{r4k-1} + 2, 4))
  S                                     (t = t_r + 2)
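The look-ahead equations and the timing rule can both be checked numerically. The sketch below models one 4-bit CLA block with r_i = G_i + c.P_i, then applies t = max(t_prev + 2, 4) per block. The ripple delay of 2 time units per bit is an assumption inferred from the figure values (t_r3 = 8, t_r7 = 16), and all function names are illustrative.

```python
def cla4(x, y, c):
    """4-bit CLA block. x, y: bit lists, LSB first. Returns (sum bits, carries r0..r3)."""
    p = [xi ^ yi for xi, yi in zip(x, y)]    # propagate signals
    g = [xi & yi for xi, yi in zip(x, y)]    # generate signals
    G, P = [g[0]], [p[0]]
    for i in range(1, 4):
        # G_i = g_i + G_{i-1}.p_i and P_i = P_{i-1}.p_i; hardware flattens these
        # into two-level logic, the loop is only a compact way to write them.
        G.append(g[i] | (G[i - 1] & p[i]))
        P.append(P[i - 1] & p[i])
    r = [G[i] | (c & P[i]) for i in range(4)]   # every carry depends on c only
    carry_in = [c] + r[:3]
    s = [p[i] ^ carry_in[i] for i in range(4)]
    return s, r

def cla_carry_time(n_bits):
    """Time at which the last carry of an n-bit adder made of 4-bit CLA blocks is ready."""
    t = 0                                   # carry-in c is available at t = 0
    for _ in range(n_bits // 4):
        t = max(t + 2, 4)                   # p, g take 2; the block adds 2 more
    return t

def ripple_carry_time(n_bits):
    """Assumed ripple-carry delay: 2 time units per bit."""
    return 2 * n_bits

print(cla_carry_time(4), cla_carry_time(8))        # 4 6
print(ripple_carry_time(4), ripple_carry_time(8))  # 8 16
```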

Speeding Up Carry Propagation

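- n = number of bits
- CLA: O(n)
- Tree structure: O(log n)
- p_i = x_i ⊕ y_i
- g_i = x_i.y_i
- s_i = p_i ⊕ r_{i-1}

r0 = g0 + p0.c
P_{2..1} = p2.p1
G_{2..1} = g2 + p2.g1

PG operator: (a2, b2), (a1, b1) → (a1.a2, b2 + b1.a2)
Carry output: (a, b), r → b + a.r

[Figure: 4-bit tree adder. Inputs x3y3 ... x0y0 and carry-in c produce p3g3 ... p0g0; PG operators combine them into the carries r3 ... r0; outputs s3 ... s0.]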
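Because the PG operator is associative, the group signals for bits i..0 can be combined in a balanced tree of depth about log2(n). The sketch below uses a plain recursive split to show this; it is not the Brent-Kung or Han-Carlson wiring, and it recomputes each prefix independently rather than sharing operators as real prefix networks do. Names such as `pg_op`, `group_pg`, and `tree_add` are illustrative.

```python
def pg_op(hi, lo):
    """PG operator: (a2, b2), (a1, b1) -> (a1.a2, b2 + b1.a2), with a = P and b = G."""
    a2, b2 = hi
    a1, b1 = lo
    return a1 & a2, b2 | (b1 & a2)

def group_pg(pairs, lo, hi):
    """(P, G) of the bit range lo..hi, combined as a balanced tree."""
    if lo == hi:
        return pairs[lo]
    mid = (lo + hi) // 2
    return pg_op(group_pg(pairs, mid + 1, hi), group_pg(pairs, lo, mid))

def tree_add(x, y, c, n=4):
    """n-bit addition using group (P, G) signals; returns (sum, carry-out)."""
    pairs = []
    for i in range(n):
        xi, yi = (x >> i) & 1, (y >> i) & 1
        pairs.append((xi ^ yi, xi & yi))          # (p_i, g_i)
    carries = []
    for i in range(n):
        a, b = group_pg(pairs, 0, i)
        carries.append(b | (a & c))               # r_i = b + a.r with r = c
    carry_in = [c] + carries[:-1]
    s = sum((pairs[i][0] ^ carry_in[i]) << i for i in range(n))   # s_i = p_i XOR r_{i-1}
    return s, carries[-1]

print(tree_add(0b0110, 0b0101, 0))  # (11, 0)
```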

Speeding Up Carry Propagation

[Figure: 16-bit Brent-Kung tree, bit positions 15 ... 0, carry-in c.]

- n = 16
- Brent-Kung.
- CLA: 5 steps (pg and 4 CLA) to propagate the carry
- Tree: 7 steps (pg and 6 PG operators)
- From 32 bits onward, the tree (8) is faster than CLA (9)

Speeding Up Carry Propagation

[Figure: 16-bit Han-Carlson tree, bit positions 15 ... 0, carry-in c.]

- n = 16
- Han-Carlson.
- Cost/Performance tradeoff:
  • 64 bits,
  • 0.18 µm,
  • 482 ps for one addition
