Concepts Introduced a 1-Bit Logical Unit 1-Bit Half Adder 1-Bit Half

Total Page:16

File Type:pdf, Size:1020Kb

Concepts Introduced a 1-Bit Logical Unit 1-Bit Half Adder 1-Bit Half Basic ALU Design Clocks Flip-Flops Regs RAM Basic ALU Design Clocks Flip-Flops Regs RAM Concepts Introduced A 1-Bit Logical Unit Below is a 1-bit logical unit that performs AND and OR operations. Both the AND and OR operations are always performed, but the output produced depends on the selector to the basic arithmetic/logic unit multiplexor. clocks A 32-bit logical unit for AND and OR operations would just be latches and ip-ops an array of these 1-bit logical units. registers SRAMs and DRAMs Operation a 0 Result 1 b Basic ALU Design Clocks Flip-Flops Regs RAM Basic ALU Design Clocks Flip-Flops Regs RAM 1-Bit Half Adder 1-Bit Half Adder (cont.) A half adder takes two inputs, a and b, and generates two The circuit for the carry can use an AND gate. outputs, the carry and the sum. The circuit for the sum can use an XOR gate. A half adder is called a (2,2) adder as it takes two inputs and produces two outputs. a a carry sum What type of gate can produce the carry output? b b What type of gate can produce the sum output? The gates can be combined together as one logic block. Inputs Outputs Comments abcarry sum a + = sum 00 0 000 002 b + = 01 0 101 012 10 0 11+ 0 = 01 2 carry + = 11 1 011 102 Basic ALU Design Clocks Flip-Flops Regs RAM Basic ALU Design Clocks Flip-Flops Regs RAM 1-Bit Full Adder 1-Bit Full Adder (cont.) Only the least signicant bit (LSB) result will use a half adder. Note the carryout and the sum outputs for the full adder dier The addition for the other bits need a third input, which is the from the half adder. CarryIn from the previous bit position. Sum = a·b·CarryIn + a·bCarryIn + a·b·CarryIn + a·b·CarryIn This type is called a full adder or a (3,2) adder since it takes CarryOut = three inputs and produces two outputs. a·b·CarryIn + a·b·CarryIn + a·b·CarryIn + a·b·CarryIn CarryIn stupnI stuptuO a b CarryIn CarryOut Sum Comments 0 0 0 0 0 0 + 0 + 0 = 00two a 0 0 1 0 1 0 + 0 + 1 = 01two 0 1 0 0 1 0 + 1 + 0 = 01two + Sum 0 1 1 1 0 0 + 1 + 1 = 10two b 1 0 0 0 1 1 + 0 + 0 = 01two 1 0 1 1 0 1 + 0 + 1 = 10two 1 1 0 1 0 1 + 1 + 0 = 10two 1 1 1 1 1 1 + 1 + 1 = 11 CarryOut two Basic ALU Design Clocks Flip-Flops Regs RAM Basic ALU Design Clocks Flip-Flops Regs RAM Sum from Full Adder CarryOut from Full Adder Sum = a·b·CarryIn + a·bCarryIn + a·b·CarryIn + a·b·CarryIn CarryIn CarryOut = a·b·CarryIn + a·b·CarryIn + a·b·CarryIn + a·b·CarryIn = b·CarryIn + a·CarryIn + a·b a ab CarryIn 00 01 11 10 b 0 0 0 1 0 Sum 1 0 11 1 Basic ALU Design Clocks Flip-Flops Regs RAM Basic ALU Design Clocks Flip-Flops Regs RAM CarryOut from Full Adder (cont.) Sum and CarryOut from Full Adder CarryIn CarryOut = b·CarryIn + a·CarryIn + a·b a CarryIn b Sum a b CarryOut CarryOut Basic ALU Design Clocks Flip-Flops Regs RAM Basic ALU Design Clocks Flip-Flops Regs RAM Hardware Delay 1-Bit ALU for AND, OR, and Addition This 1-bit ALU performs AND, OR, and addition on a and b. Operation CarryIn Delay is dened as the time from when the input is stable to a the time when the output is stable. 0 Will the full 1-bit adder delay be greater than the half 1-bit adder delay? 1 Result ϩ 2 b CarryOut Basic ALU Design Clocks Flip-Flops Regs RAM Basic ALU Design Clocks Flip-Flops Regs RAM 32-Bit ALU for AND, OR, and Addition Supporting Subtraction The 32-bit adder, also Operation called the ripple carry CarryIn adder, is created by linking the CarryOut of a0 CarryIn ALU0 Result0 b0 the previous 1-bit adder CarryOut Remember that a − b = a + −b = a + b + 1 in a two's to the CarryIn of the complement representation. current 1-bit adder. a1 CarryIn So to support a − b: ALU1 Result1 Faster adders are b1 CarryOut Invert every bit in b. possible. Add 1 to the result. Add with a. a2 CarryIn ALU2 Result2 b2 CarryOut . a31 CarryIn ALU31 Result31 b31 Basic ALU Design Clocks Flip-Flops Regs RAM Basic ALU Design Clocks Flip-Flops Regs RAM 1-Bit ALU for AND, OR, Addition, and Subtraction NOR Operation This 1-bit ALU performs AND, OR, and addition on (a and b) or (a and b). By selecting b and setting CarryIn to 1 in the LSB of the ALU, the ALU can perform subtraction of b from a. The NOR operation is equivalent to a' AND b'. Binvert Operation CarryIn aba' b' a AND b aNOR b a' AND b' a 00 1 10 1 1 0 01 1 00 0 0 1 Result 10 0 10 0 0 11 0 01 0 0 b 0 ϩ 2 1 CarryOut Basic ALU Design Clocks Flip-Flops Regs RAM Basic ALU Design Clocks Flip-Flops Regs RAM 1-Bit ALU for AND, OR, NOR, Addition, and Subtraction Supporting the SLT Instruction This 1-bit ALU performs AND, OR, and addition on (a or a) and (b or b). By selecting a and b,(a NOR b) is produced from (a AND b). Remember that the slt instruction produces a 1 if rs < rt (or Ainvert Operation a < b) and 0 otherwise. Binvert CarryIn So we need to set all but the least signicant bit (LSB) to 0. The LSB is set according to the result of the most signicant a 0 bit (MSB) after performing an a - b operation. 0 1 (a − b) < 0 ) ((a − b) + b) < (0 + b) 1 Result ) a < b b 0 ϩ 2 1 CarryOut Basic ALU Design Clocks Flip-Flops Regs RAM Basic ALU Design Clocks Flip-Flops Regs RAM Dealing with Overow Checking for Overow When the result of the subtraction is greater than the largest positive representable value or is smaller than the smallest negative representable value, then we have an overow. The MSB of a subtraction result cannot be used when there is The largest positive value is 011...1. an overow. Adding the two largest positive values will give a result of A+B 111...10, which has the MSB set, but the CarryOut is not set. Operand A Operand B Result Overflow? The smallest negative value is 100...0. ≥ 0 ≥ 0<0 Adding the two smallest negative values will give a result of ≥ Yes <0 <0 0 000...00, which has the MSB clear, but the CarryOut is set. ≥ 0 ≥ 0 ≥ 0 No <0 <0 <0 ≥ 0<0<0No ≥ 0<0 ≥ 0No <0 ≥ 0<0No <0 ≥ 0 ≥ 0No Basic ALU Design Clocks Flip-Flops Regs RAM Basic ALU Design Clocks Flip-Flops Regs RAM Checking for Overow (cont.) 1-Bit ALU for All Bits Except the Most Signicant Bit This 1-bit ALU has an extra input called Less that can be used We can detect overow by checking if the CarryIn and for the result of the slt instruction. This 1-bit ALU is used for CarryOut of the MSB dier. all bits except the MSB. MSB MSB Carry In Ainvert Operation MSB MSB Carry Carry MSB Over-XOR Binvert CarryIn AB InOut Result flow? Carry Out Notes a 0 00 0 00No 0 0 1 00 1 01Yes1carries differ 01 0 01No 0 |A| < |B| 1 01 1 10No 0 |A| > |B| Result b 0 10 0 01No 0 |A| > |B| ϩ 2 Operation 10 1Ainvert10No 0 |A| < |B| 1 Binvert CarryIn 11 0 10Yes1carries differ Less 3 a 0 11 1 11No 0 0 1 CarryOut 1 Result Ainvert Operation b 0 Binvert CarryIn ϩ 2 Basic ALU Design Clocks Flip-Flops Regs RAM Basic ALU Design Clocks Flip-Flops Regs RAM 1 a 0 0 Less 3 1-Bit ALU for the Most Signicant Bit 32-Bit ALU for AND,1 OR, NOR, Addition, Subtraction, SLT Binvert 1 Operation Ainvert This 1-bit ALU for the MSB hasCarryOut an output called Set for the The Set output is the CarryIn Result slt instruction from performing the subtraction. This 1-bit Less inputb for the LSB0 ALU also checks for overow. 1-bit ALU. ϩ 2 a0 CarryIn Result0 Ainvert Operation 1 A zero is input as the b0 ALU0 Binvert CarryIn Less Less signalLess for all CarryOut3 a 0 0 other 1-bit ALUs. Set 1 a1 CarryIn Result1 If Overow is set, Overflow b1 ALU1 Overflow detection 0 Less 1 then the Set result for CarryOut Result the MSB should be inverted. b 0 a2 CarryIn ϩ 2 Result2 b2 ALU2 1 0 Less CarryOut Less 3 . CarryIn Set a31 CarryIn Result31 Overflow Overflow b31 ALU31 Set detection 0 Less Overflow Basic ALU Design Clocks Flip-Flops Regs RAM Basic ALU Design Clocks Flip-Flops Regs RAM Supporting the BEQ Instruction Final 32-Bit ALU Bnegate Operation The Zero Ainvert result is To support a beq instruction, we can get the result of a0 CarryIn a − b Result0 used for b0 ALU0 and check if the result is zero. Less the beq CarryOut (a − b = 0) ) a = b instruction. a1 CarryIn Result1 b1 ALU1 0 Less .
Recommended publications
  • Comparison of Parallel and Pipelined CORDIC Algorithm Using RCA and CSA
    Comparison of Parallel and Pipelined CORDIC algorithm using RCA and CSA Diego Barragan´ Guerrero Lu´ıs Geraldo P. Meloni FEEC - UNICAMP FEEC - UNICAMP Campinas, Sao˜ Paulo, Brazil, 13083-852 Campinas, Sao˜ Paulo, Brazil, 13083-852 +5519 9308-9952 +5519 9778-1523 [email protected] [email protected] Abstract— This paper presents an implementation of the algorithm has two modes of operation: the rotational mode CORDIC algorithm in digital hardware using two types of (RM) where the vector (xi; yi) is rotated by an angle θ to algebraic adders: Ripple-Carry Adder (RCA) and Carry-Select obtain a new vector (x ; y ), and the vectoring mode (VM) Adder (CSA), both in parallel and pipelined architectures. Anal- N N ysis of time performance and resources utilization was carried in which the algorithm computes the modulus R and phase α out by changing the algorithm number of iterations. These results from the x-axis of the vector (x0; y0). The basic principle of demonstrate the efficiency in operating frequency of the pipelined the algorithm is shown in Figure 1. architecture with respect to the parallel architecture. Also it is shown that the use of CSA reduce the timing processing without significantly increasing the slice use. The code was synthesized us- ing FPGA development tools for the Xilinx Spartan-3E xc3s500e ' ' E N y family. N E N Index Terms— CORDIC, pipelined, parallel, RCA, CSA, y N trigonometrics functions. Rotação Pseudo-rotação R N I. INTRODUCTION E i In Digital Signal Processing with FPGA, trigonometric y i R i functions are used in many signal algorithms, for instance N synchronization and equalization [12].
    [Show full text]
  • Basics of Logic Design Arithmetic Logic Unit (ALU) Today's Lecture
    Basics of Logic Design Arithmetic Logic Unit (ALU) CPS 104 Lecture 9 Today’s Lecture • Homework #3 Assigned Due March 3 • Project Groups assigned & posted to blackboard. • Project Specification is on Web Due April 19 • Building the building blocks… Outline • Review • Digital building blocks • An Arithmetic Logic Unit (ALU) Reading Appendix B, Chapter 3 © Alvin R. Lebeck CPS 104 2 Review: Digital Design • Logic Design, Switching Circuits, Digital Logic Recall: Everything is built from transistors • A transistor is a switch • It is either on or off • On or off can represent True or False Given a bunch of bits (0 or 1)… • Is this instruction a lw or a beq? • What register do I read? • How do I add two numbers? • Need a method to reason about complex expressions © Alvin R. Lebeck CPS 104 3 Review: Boolean Functions • Boolean functions have arguments that take two values ({T,F} or {0,1}) and they return a single or a set of ({T,F} or {0,1}) value(s). • Boolean functions can always be represented by a table called a “Truth Table” • Example: F: {0,1}3 -> {0,1}2 a b c f1f2 0 0 0 0 1 0 0 1 1 1 0 1 0 1 0 0 1 1 0 0 1 0 0 1 0 1 1 0 0 1 1 1 1 1 1 © Alvin R. Lebeck CPS 104 4 Review: Boolean Functions and Expressions F(A, B, C) = (A * B) + (~A * C) ABCF 0000 0011 0100 0111 1000 1010 1101 1111 © Alvin R. Lebeck CPS 104 5 Review: Boolean Gates • Gates are electronics devices that implement simple Boolean functions Examples a a AND(a,b) OR(a,b) a NOT(a) b b a XOR(a,b) a NAND(a,b) b b a NOR(a,b) a XNOR(a,b) b b © Alvin R.
    [Show full text]
  • Implementation of Carry Tree Adders and Compare with RCA and CSLA
    International Journal of Emerging Engineering Research and Technology Volume 4, Issue 1, January 2016, PP 1-11 ISSN 2349-4395 (Print) & ISSN 2349-4409 (Online) Implementation of Carry Tree Adders and Compare with RCA and CSLA 1 2 G. Venkatanaga Kumar , C.H Pushpalatha Department of ECE, GONNA INSTITUTE OF TECHNOLOGY, Vishakhapatnam, India (PG Scholar) Department of ECE, GONNA INSTITUTE OF TECHNOLOGY, Vishakhapatnam, India (Associate Professor) ABSTRACT The binary adder is the critical element in most digital circuit designs including digital signal processors (DSP) and microprocessor data path units. As such, extensive research continues to be focused on improving the power delay performance of the adder. In VLSI implementations, parallel-prefix adders are known to have the best performance. Binary adders are one of the most essential logic elements within a digital system. In addition, binary adders are also helpful in units other than Arithmetic Logic Units (ALU), such as multipliers, dividers and memory addressing. Therefore, binary addition is essential that any improvement in binary addition can result in a performance boost for any computing system and, hence, help improve the performance of the entire system. Parallel-prefix adders (also known as carry-tree adders) are known to have the best performance in VLSI designs. This paper investigates three types of carry-tree adders (the Kogge- Stone, sparse Kogge-Stone, Ladner-Fischer and spanning tree adder) and compares them to the simple Ripple Carry Adder (RCA) and Carry Skip Adder (CSA). In this project Xilinx-ISE tool is used for simulation, logical verification, and further synthesizing. This algorithm is implemented in Xilinx 13.2 version and verified using Spartan 3e kit.
    [Show full text]
  • CS/EE 260 – Homework 5 Solutions Spring 2000
    CS/EE 260 – Homework 5 Solutions Spring 2000 1. (MK 3-23) Construct a 10-to-1 line multiplexer with three 4-to-1 line multiplexers. The multiplexers should be interconnected and inputs labeled so that the selection codes 0000 through 1001 can be directly applied to the multiplexer selections inputs without added logic. 10:1 mux d0 0 1 d1 1 X d9 9 S3S2S1S0 Implementation using 4:1 muxes. d0 0 d2 1 d4 2 d6 3 0 S 1 S2 1 d8 2 X d9 3 d 0 1 S S d3 1 3 0 d5 2 d7 3 S2 S1 1 2. (MK 3-27) Implement a binary full adder with a dual 4-to-1 line multiplexer and a single inverter. AB Ci S Co 00 0 0 0 C 0 00 1 1 i 0 01 0 1 0 C ´ C 01 1 0 i 1 i 10 0 1 0 10 1 0Ci´ 1 Ci 11 0 0 1 C 1 11 1 1 i 1 0 C 1 i 4:1 2 S 3 mux S1 S0 A B 0 0 1 4:1 Co 2 mux 1 3 S1S0 2 3. (MK 3-34) Design a combinational circuit that forms the 2-bit binary sum S1S0 of two 2-bit numbers A1A0 and B1B0 and has both input C0 and a carry output C2. Do not use half adders or full adders, but instead use a two-level circuit plus inverters for the input variables, as needed. Design the circuit by starting with the following equations for each of the two bits of the adder.
    [Show full text]
  • UNIT 8B a Full Adder
    UNIT 8B Computer Organization: Levels of Abstraction 15110 Principles of Computing, 1 Carnegie Mellon University - CORTINA A Full Adder C ABCin Cout S in 0 0 0 A 0 0 1 0 1 0 B 0 1 1 1 0 0 1 0 1 C S out 1 1 0 1 1 1 15110 Principles of Computing, 2 Carnegie Mellon University - CORTINA 1 A Full Adder C ABCin Cout S in 0 0 0 0 0 A 0 0 1 0 1 0 1 0 0 1 B 0 1 1 1 0 1 0 0 0 1 1 0 1 1 0 C S out 1 1 0 1 0 1 1 1 1 1 ⊕ ⊕ S = A B Cin ⊕ ∧ ∨ ∧ Cout = ((A B) C) (A B) 15110 Principles of Computing, 3 Carnegie Mellon University - CORTINA Full Adder (FA) AB 1-bit Cout Full Cin Adder S 15110 Principles of Computing, 4 Carnegie Mellon University - CORTINA 2 Another Full Adder (FA) http://students.cs.tamu.edu/wanglei/csce350/handout/lab6.html AB 1-bit Cout Full Cin Adder S 15110 Principles of Computing, 5 Carnegie Mellon University - CORTINA 8-bit Full Adder A7 B7 A2 B2 A1 B1 A0 B0 1-bit 1-bit 1-bit 1-bit ... Cout Full Full Full Full Cin Adder Adder Adder Adder S7 S2 S1 S0 AB 8 ⁄ ⁄ 8 C 8-bit C out FA in ⁄ 8 S 15110 Principles of Computing, 6 Carnegie Mellon University - CORTINA 3 Multiplexer (MUX) • A multiplexer chooses between a set of inputs. D1 D 2 MUX F D3 D ABF 4 0 0 D1 AB 0 1 D2 1 0 D3 1 1 D4 http://www.cise.ufl.edu/~mssz/CompOrg/CDAintro.html 15110 Principles of Computing, 7 Carnegie Mellon University - CORTINA Arithmetic Logic Unit (ALU) OP 1OP 0 Carry In & OP OP 0 OP 1 F 0 0 A ∧ B 0 1 A ∨ B 1 0 A 1 1 A + B http://cs-alb-pc3.massey.ac.nz/notes/59304/l4.html 15110 Principles of Computing, 8 Carnegie Mellon University - CORTINA 4 Flip Flop • A flip flop is a sequential circuit that is able to maintain (save) a state.
    [Show full text]
  • Arithmetic and Logical Unit Design for Area Optimization for Microcontroller Amrut Anilrao Purohit 1,2 , Mohammed Riyaz Ahmed 2 and R
    et International Journal on Emerging Technologies 11 (2): 668-673(2020) ISSN No. (Print): 0975-8364 ISSN No. (Online): 2249-3255 Arithmetic and Logical Unit Design for Area Optimization for Microcontroller Amrut Anilrao Purohit 1,2 , Mohammed Riyaz Ahmed 2 and R. Venkata Siva Reddy 2 1Research Scholar, VTU Belagavi (Karnataka), India. 2School of Electronics and Communication Engineering, REVA University Bengaluru, (Karnataka), India. (Corresponding author: Amrut Anilrao Purohit) (Received 04 January 2020, Revised 02 March 2020, Accepted 03 March 2020) (Published by Research Trend, Website: www.researchtrend.net) ABSTRACT: Arithmetic and Logic Unit (ALU) can be understood with basic knowledge of digital electronics and any engineer will go through the details only once. The advantage of knowing ALU in detail is two- folded: firstly, programming of the processing device can be efficient and secondly, can design a new ALU architecture as per the various constraints of the use cases. The miniaturization of digital circuits can be achieved by either reducing the size of transistor (Moore’s law) or by optimizing the gate count of the circuit. The first has been explored extensively while the latter has been ignored which deals with the application of Boolean rules and requires sound knowledge of logic design. The ultimate outcome is to have an area optimized architecture/approach that optimizes the circuit at gate level. The design of ALU is for various processing devices varies with the device/system requirements. The area optimization places a significant role in the chip design. Here in this work, we have attempted to design an ALU which is area efficient while being loaded with additional functionality necessary for microcontrollers.
    [Show full text]
  • Design of High Speed and Low Power Six Transistor Full Adder Using Two Transistor Xor Gate
    International Journal of Electronics, Communication & Instrumentation Engineering Research and Development (IJECIERD) ISSN 2249-684X Vol. 3, Issue 1, Mar 2013, 87-96 © TJPRC Pvt. Ltd. DESIGN OF HIGH SPEED AND LOW POWER SIX TRANSISTOR FULL ADDER USING TWO TRANSISTOR XOR GATE B. DILLI KUMAR, K. CHARAN KUMAR & T. NAVEEN KUMAR M. Tech (VLSI), Department of ECE, Sree Vidyanikethan Engineering College (Autonomous), Tirupati, India ABSTRACT Full adder is one of the major components in the design of many sophisticated hardware circuits. In this paper the full adder has been designed by using a new efficient design with less number of transistors. A 2 transistor XOR gate has been proposed with the help of two PMOS (Positive Metal Oxide Semiconductor) transistors. By using this 2T XOR gate the size of the full adder has been decreased to a large extent which can be implemented with only 6 transistors. The proposed full adder has a significant improvement in silicon area and power delay product when compared to the previous 8T full adder circuits. Further the proposed adder requires less area to perform a required logic function. Further, the proposed full adder has less power dissipation which makes it suitable for many of the low power applications and because of less area requirement the proposed design can be used in many of the portable applications also . KEYWORDS: Full Adder, XOR, Less Area, Speed, Low Power, Delay, Less Transistor Count, Low Power VLSI INTRODUCTION Full adder is one of the basic building blocks of many of the digital VLSI circuits. Several refinements has been made regarding its structure since its invention.
    [Show full text]
  • Unit 8 : Microprocessor Architecture
    Unit 8 : Microprocessor Architecture Lesson 1 : Microcomputer Structure 1.1. Learning Objectives On completion of this lesson you will be able to : ♦ draw the block diagram of a simple computer ♦ understand the function of different units of a microcomputer ♦ learn the basic operation of microcomputer bus system. 1.2. Digital Computer A digital computer is a multipurpose, programmable machine that reads A digital computer is a binary instructions from its memory, accepts binary data as input and multipurpose, programmable processes data according to those instructions, and provides results as machine. output. 1.3. Basic Computer System Organization Every computer contains five essential parts or units. They are Basic computer system organization. i. the arithmetic logic unit (ALU) ii. the control unit iii. the memory unit iv. the input unit v. the output unit. 1.3.1. The Arithmetic and Logic Unit (ALU) The arithmetic and logic unit (ALU) is that part of the computer that The arithmetic and logic actually performs arithmetic and logical operations on data. All other unit (ALU) is that part of elements of the computer system - control unit, register, memory, I/O - the computer that actually are there mainly to bring data into the ALU to process and then to take performs arithmetic and the results back out. logical operations on data. An arithmetic and logic unit and, indeed, all electronic components in the computer are based on the use of simple digital logic devices that can store binary digits and perform simple Boolean logic operations. Data are presented to the ALU in registers. These registers are temporary storage locations within the CPU that are connected by signal paths of the ALU.
    [Show full text]
  • Reverse Engineering X86 Processor Microcode
    Reverse Engineering x86 Processor Microcode Philipp Koppe, Benjamin Kollenda, Marc Fyrbiak, Christian Kison, Robert Gawlik, Christof Paar, and Thorsten Holz, Ruhr-University Bochum https://www.usenix.org/conference/usenixsecurity17/technical-sessions/presentation/koppe This paper is included in the Proceedings of the 26th USENIX Security Symposium August 16–18, 2017 • Vancouver, BC, Canada ISBN 978-1-931971-40-9 Open access to the Proceedings of the 26th USENIX Security Symposium is sponsored by USENIX Reverse Engineering x86 Processor Microcode Philipp Koppe, Benjamin Kollenda, Marc Fyrbiak, Christian Kison, Robert Gawlik, Christof Paar, and Thorsten Holz Ruhr-Universitat¨ Bochum Abstract hardware modifications [48]. Dedicated hardware units to counter bugs are imperfect [36, 49] and involve non- Microcode is an abstraction layer on top of the phys- negligible hardware costs [8]. The infamous Pentium fdiv ical components of a CPU and present in most general- bug [62] illustrated a clear economic need for field up- purpose CPUs today. In addition to facilitate complex and dates after deployment in order to turn off defective parts vast instruction sets, it also provides an update mechanism and patch erroneous behavior. Note that the implementa- that allows CPUs to be patched in-place without requiring tion of a modern processor involves millions of lines of any special hardware. While it is well-known that CPUs HDL code [55] and verification of functional correctness are regularly updated with this mechanism, very little is for such processors is still an unsolved problem [4, 29]. known about its inner workings given that microcode and the update mechanism are proprietary and have not been Since the 1970s, x86 processor manufacturers have throughly analyzed yet.
    [Show full text]
  • Reconfigurable Accelerators in the World of General-Purpose Computing
    Reconfigurable Accelerators in the World of General-Purpose Computing Dissertation A thesis submitted to the Faculty of Electrical Engineering, Computer Science and Mathematics of Paderborn University in partial fulfillment of the requirements for the degree of Dr. rer. nat. by Tobias Kenter Paderborn, Germany August 26, 2016 Acknowledgments First and foremost, I would like to thank Prof. Dr. Christian Plessl for the advice and support during my research. As particularly helpful, I perceived his ability to communicate suggestions depending on the situation, either through open questions that give room to explore and learn, or through concrete recommendations that help to achieve results more directly. Special thanks go also to Prof. Dr. Marco Platzner for his advice and support. I profited especially from his experience and ability to systematically identify the essence of challenges and solutions. Furthermore, I would like to thank: • Prof. Dr. João M. P. Cardoso, for serving as external reviewer for my dissertation. • Prof. Dr. Friedhelm Meyer auf der Heide and Dr. Matthias Fischer for serving on my oral examination committee. • All colleagues with whom I had the pleasure to work at the PC2 and the Computer Engineering Group, researchers, technical and administrative staff. In a variation to one of our coffee kitchen puns, I’d like to state that research without colleagues is possible, but pointless. However, I’m not sure about the first part. • My long-time office mates Lars Schäfers and Alexander Boschmann for particularly extensive discussions on our research and far beyond. • Gavin Vaz, Heinrich Riebler and Achim Lösch for intensive and productive collabo- ration on joint research interests.
    [Show full text]
  • Half Adder, Which Finds the Sum of Two Bits
    CMSC 313 COMPUTER ORGANIZATION & ASSEMBLY LANGUAGE PROGRAMMING LECTURE 21, SPRING 2013 TOPICS TODAY • Circuits for Addition • Standard Logic Components • Logisim Demo CIRCUITS FOR ADDITION 3.5 Combinational Circuits • Combinational logic circuits give us many useful devices. • One of the simplest is the half adder, which finds the sum of two bits. • We can gain some insight as to the construction of a half adder by looking at its truth table, shown at the right. 30 Half Adder • Inputs: A and B • Outputs: S = lower bit of A + B, cout = carry bit A B S cout 0 0 0 0 0 1 1 0 1 0 1 0 1 1 0 1 • Using Sum-of-Products: S = AB + AB, cout = AB. • Alternatively, we could use XOR: S = A ⊕ B. ! ! ! ! 1 3.5 Combinational Circuits • As we see, the sum can be found using the XOR operation and the carry using the AND operation. 31 3.5 Combinational Circuits • We can change our half adder into to a full adder by including gates for processing the carry bit. • The truth table for a full adder is shown at the right. 32 Full Adder • Inputs: A, B and cin • Outputs: S = lower bit of A + B, cout = carry bit A B cin S cout 0 0 0 0 0 0 0 1 1 0 0 1 0 1 0 0 1 1 0 1 1 0 0 1 0 1 0 1 0 1 1 1 0 0 1 1 1 1 1 1 • S = A BC + ABC + AB C + ABC = A ⊕ B ⊕ C.
    [Show full text]
  • Lecture 4 Adders
    Lecture 4 Adders Computer Systems Laboratory Stanford University [email protected] Copyright © 2006 Mark Horowitz Some figures from High-Performance Microprocessor Design © IEEE M Horowitz EE 371 Lecture 4 1 Overview • Readings • Today’s topics – Fast adders generally use a tree structure for parallelism – We will cover basic tree terminology and structures – Look at a few example adder architectures – Examples will spill into next lecture as well M Horowitz EE 371 Lecture 4 2 Adders • Task of an adder is conceptually simple – Sum[n:0]=A[n:0]+B[n:0]+C0 – Subtractors also very simple: -B = ~B+1, so invert B and set C0=1 • Per bit formulas –Sumi = Ai XOR Bi XOR Ci – Couti = Ci+1 = majority(Ai,Bi,Ci) • Fundamental problem is calculating the carry to the nth bit – All carry terms are dependent on all previous terms – So LSB input has a fanout of n • And an absolute minimum of log4n FO4 delays without any logic M Horowitz EE 371 Lecture 4 3 Single-Bit Adders • Adders are chock-full of XORs, which make them interesting – One of the few circuits where pass-gate logic is attractive – A complicated differential passgate logic (DPL) block from the text Lousy way to draw a pair of inverters M Horowitz EE 371 Lecture 4 4 G and P and K, Oh My! • Most fast adders “G”enerate, “P”ropagate, or “K”ill the carry – Usually only G and P are used; K only appears in some carry chains • When does a bit Generate a carry out? –Gi = Ai AND Bi – If Gi is true, then Couti = Ci+1 is forced to be true • When does a bit Propagate a carry in to the carry out? –Pi
    [Show full text]