
Arithmetic and Logic Unit (ALU)

Designing an Adder

Traditional circuit design: truth table approach.

Cost/speed tradeoff:
• n-bit adder: truth table with 2n inputs and 2^{2n} rows → fast circuit (theoretically), but very costly
• n 1-bit adders: cheap but slow
• Tradeoff: n 1-bit adders and additional circuitry to speed up computation

1-bit Adder

[Figure: half adder and full adder circuits; inputs x, y and carry-in r; outputs S (sum) and R (carry out)]

Full adder:

r x y | S R
0 0 0 | 0 0
0 0 1 | 1 0
0 1 0 | 1 0
0 1 1 | 0 1
1 0 0 | 1 0
1 0 1 | 0 1
1 1 0 | 0 1
1 1 1 | 1 1

Exclusive OR:

x y | x ⊕ y
0 0 | 0
0 1 | 1
1 0 | 1
1 1 | 0

R = x.y + r.(x + y) = x.y + r.(x ⊕ y)
S = r'.(x'.y + x.y') + r.(x.y + x'.y') = r ⊕ x ⊕ y

n-bit Addition/Subtraction

[Figure: 4-bit adder/subtractor; four chained 1-bit adders with inputs x_i, y_i and control c, carries r0..r3, outputs S0..S3]

• c = 0: addition
• c = 1: subtraction

Subtraction:
• X - Y = X + (-Y) = X + Y' + 1
• c = 1 → no additional adder (c itself provides the + 1)

    X          x3 x2 x1 x0
  + Y     +    y3 y2 y1 y0
    Z       r3 s3 s2 s1 s0

Bottleneck: carry propagation.

A Simple 1-bit ALU

[Figure: 1-bit ALU; inputs x, y, r; an adder and a MUX controlled by c1 c0; output R]

Multiplexor (MUX): selects among several inputs.

c1 c0 | m
0  0  | m0
0  1  | m1
1  0  | m2
1  1  | m3

Elementary operations:
• ADD, AND, NOT

c1 c0 | ALU
0  0  | ADD
0  1  | AND
1  0  | NOT
1  1  | d

ALU: Arithmetic and Logic Unit

n-bit ALU

n 1-bit ALUs.

[Figure: n-bit ALU; n-bit inputs X and Y, n-bit output Z, control c1 c0]

Embryo of instruction set: the c1 c0 codes of the table above (00 ADD, 01 AND, 10 NOT, 11 d).

Overflow

carries     1 0 0                   1 0 0 0
     6    0 1 1 0           -4      1 1 0 0
  +  5  + 0 1 0 1        + -7    +  1 0 0 1
    -5    1 0 1 1            5    1 0 1 0 1

Detect overflow in two's complement?
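The full-adder equations and the c-controlled addition/subtraction above can be sketched in Python as follows. This is a minimal illustration, not the slides' own circuit: the names `full_adder`, `add_sub`, `to_bits` and `from_bits` are hypothetical, bit lists are stored least significant bit first, and the overflow flag is an assumption that uses the usual sign-bit test (both operands fed to the adder share a sign that the result does not).

```python
def full_adder(x, y, r):
    """1-bit full adder: returns (S, R) for inputs x, y and carry-in r."""
    s = r ^ x ^ y                      # S = r XOR x XOR y
    r_out = (x & y) | (r & (x ^ y))    # R = x.y + r.(x XOR y)
    return s, r_out


def add_sub(x_bits, y_bits, c):
    """Ripple-carry adder/subtractor on bit lists (least significant bit first).

    c = 0 computes X + Y; c = 1 computes X - Y as X + Y' + 1.
    Returns (sum bits, carry out, overflow flag).
    """
    y_eff = [y ^ c for y in y_bits]    # c = 1 inverts every bit of Y
    r = c                              # c is also the carry into bit 0
    s_bits = []
    for x, y in zip(x_bits, y_eff):
        s, r = full_adder(x, y, r)     # the carry ripples from bit to bit
        s_bits.append(s)
    xs, ys, zs = x_bits[-1], y_eff[-1], s_bits[-1]   # sign bits
    overflow = (xs & ys & (zs ^ 1)) | ((xs ^ 1) & (ys ^ 1) & zs)
    return s_bits, r, overflow


def to_bits(value, n):
    """Integer -> n-bit two's complement, least significant bit first."""
    return [(value >> i) & 1 for i in range(n)]


def from_bits(bits):
    """n-bit two's complement (least significant bit first) -> signed integer."""
    raw = sum(b << i for i, b in enumerate(bits))
    return raw - (1 << len(bits)) if bits[-1] else raw


# The two 4-bit examples: 6 + 5 and (-4) + (-7) both leave the 4-bit range.
s, carry, ovf = add_sub(to_bits(6, 4), to_bits(5, 4), 0)
print(from_bits(s), carry, ovf)    # -5 0 1
s, carry, ovf = add_sub(to_bits(-4, 4), to_bits(-7, 4), 0)
print(from_bits(s), carry, ovf)    # 5 1 1
```

The loop makes the bottleneck visible: bit i cannot be computed before the carry out of bit i-1 is known.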
Overflow

Overflow signal S_overflow (1 = overflow):

z_{n-1} x_{n-1} y_{n-1} | S_overflow
   0       0       0    |     0
   0       0       1    |     0
   0       1       0    |     0
   0       1       1    |     1
   1       0       0    |     1
   1       0       1    |     0
   1       1       0    |     0
   1       1       1    |     0

Overflow in two's complement:
• x ≤ 0, y ≥ 0 → no overflow possible
• x ≥ 0, y ≥ 0: overflow means Z = X + Y (two's complement) with Z > 2^{n-1} - 1, and 0 ≤ X ≤ 2^{n-1} - 1, 0 ≤ Y ≤ 2^{n-1} - 1
  ⇒ 2^{n-1} - 1 < Z ≤ 2^n - 2
  ⇒ in two's complement, negative numbers are coded in [2^{n-1}; 2^n - 1]
  ⇒ Z is negative
• x ≤ 0, y ≤ 0: same; in case of overflow, Z is positive

→ Overflow detection criterion:

S_overflow = x_{n-1}.y_{n-1}.z'_{n-1} + x'_{n-1}.y'_{n-1}.z_{n-1}

Overflow

Action upon overflow? Several solutions:
• Stop the program (TRAP). Example: MIPS R3000
• Raise a signaling bit such as S_overflow. Example: Intel x86, Sun SPARC
• Do nothing. Example: using C on a Sun SPARC

32 bits: -2147483648 (-2^31) → 2147483647 (2^31 - 1)

    01110111001101011001010000000000
  + 01110111001101011001010000000000
  ----------------------------------
    11101110011010110010100000000000

Speeding Up Carry Propagation

[Figure: 8-bit ripple-carry chain, t_r7 = 16, t_r3 = 8; 8-bit adder with Carry Look Ahead (CLA) fed by p_i, g_i, t_r7 = 6, t_r3 = 4]

Carry generation and carry propagation:

r0 = (x0.y0) + c.(x0 ⊕ y0) = g0 + c.p0
r1 = g1 + r0.p1 = (g1 + g0.p1) + c.(p0.p1) = G1 + c.P1
r2 = G2 + c.P2
r3 = G3 + c.P3

Timing:
• x, y: t = 0
• p, g: t = 2
• r_{4k-1}: t = t_{r4k-1}
• r_{4k}, r_{4k+1}, r_{4k+2}, r_{4k+3}: t = max(t_{r4k-1} + 2, 4)
• S: t = t_r + 2

Speeding Up Carry Propagation

n = number of bits
• CLA: O(n)
• Tree structure: O(log n)

p_i = x_i ⊕ y_i
g_i = x_i.y_i
s_i = p_i ⊕ r_{i-1}
r0 = g0 + p0.c
P_{2..1} = p2.p1
G_{2..1} = g2 + p2.g1

Operators:
• (a2, b2), (a1, b1) → (a, b) = (a1.a2, b2 + b1.a2)
• (a, b), r → b + a.r

[Figure: 4-bit tree adder; inputs x3y3 .. x0y0 and c, signals p3g3 .. p0g0, carries r0..r3, outputs s0..s3]

Speeding Up Carry Propagation

n = 16, Brent-Kung.

[Figure: Brent-Kung tree over bits 15..0 and carry-in c]
• CLA: 5 steps (pg and 4 CLA) to propagate the carry
• Tree: 7 steps (pg and 6 PG operators)
• Starting with 32 bits, the tree (8) is faster than CLA (9)

Speeding Up Carry Propagation

n = 16, Han-Carlson.

[Figure: Han-Carlson tree over bits 15..0 and carry-in c]

Cost/Performance tradeoff

Itanium:
• 64 bits
• 0.18 µm
• 482 ps for one addition
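The generate/propagate equations and the (a, b) operator from the carry look-ahead slides can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the names `pg_operator`, `lookahead_carries`, `cla_add` and `to_bits` are hypothetical, bit lists are least significant bit first, and the loop chains the operator one bit at a time, so it only checks the equations. It does not reproduce the parallel evaluation order of a hardware CLA, Brent-Kung or Han-Carlson network.

```python
def to_bits(value, n):
    """Unsigned integer -> list of n bits, least significant bit first."""
    return [(value >> i) & 1 for i in range(n)]


def pg_operator(hi, lo):
    """Combine two (a, b) = (propagate, generate) pairs.

    hi covers the more significant bits, lo the less significant ones:
    (a2, b2), (a1, b1) -> (a1.a2, b2 + b1.a2)
    """
    a2, b2 = hi
    a1, b1 = lo
    return a1 & a2, b2 | (b1 & a2)


def lookahead_carries(x_bits, y_bits, c):
    """Return (p, carries) with carries[i] = r_i = G_i + c.P_i."""
    p = [x ^ y for x, y in zip(x_bits, y_bits)]   # p_i = x_i XOR y_i
    g = [x & y for x, y in zip(x_bits, y_bits)]   # g_i = x_i . y_i
    carries, group = [], None
    for pi, gi in zip(p, g):
        # group holds (P_i, G_i) for bits 0..i
        group = (pi, gi) if group is None else pg_operator((pi, gi), group)
        big_p, big_g = group
        carries.append(big_g | (c & big_p))       # r_i = G_i + c.P_i
    return p, carries


def cla_add(x_bits, y_bits, c=0):
    """Add two bit lists using the look-ahead carries; returns (sum bits, carry out)."""
    p, r = lookahead_carries(x_bits, y_bits, c)
    carry_in = [c] + r[:-1]                       # carry into bit i is r_{i-1}
    s_bits = [pi ^ ri for pi, ri in zip(p, carry_in)]   # s_i = p_i XOR r_{i-1}
    return s_bits, r[-1]


print(cla_add(to_bits(6, 4), to_bits(5, 4)))      # ([1, 1, 0, 1], 0)
```

Because every r_i depends only on c and on the (P_i, G_i) pairs, the pairs can be combined in any grouping that respects bit order, which is what lets a tree structure compute them in O(log n) levels.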