Lecture 3 Processor: Datapath and Control

Lecture 3 Processor: Datapath and Control

Lecture 3 Processor: Datapath and Control 1 ALU . Arithmetic Logic Unit is the hardware that pp,,,erforms addition, subtraction, AND, OR … 2 §4.1 Int r Recap: Performance oduction CPU Time Instruction Count CPIClock Cycle Time . CPU performance fftactors . Instruction count • Determined by Instruction Set Architecture and compiler . CPI and Cycle time • Determined by imppplementation of the processor Chapter 4 —The Processor — 3 Components of a Computer Processor . Datapath . Control . Components of the . Component of the processor that processor that perform arithmetic commands the operations and holds datapath, memory, data I/O devices according to the instructions of the memory §4.3 Bui Building a Datapath lding a D a . DhDatapath tapath . Elements that process data and addresses in the CPU • Memories, registers, ALUs, … . We will build a MIPS datapath incrementally . considering only a subset of instructions . To start, we will look at 3 elements Chapter 4 —The Processor —6 . A memory unit to store instructions of a program and supply instructi ons given an address . Needs to ppyrovide only read access (once the program is loaded). No control signal is need. 7 . PC (Program Counter or Instruction address register) is a register that holds the address of the current instruction . A new value is written to it every clock cycle. No control signa l is require d to enable write 8 . Adder to increment the PC to the address of the next instruction . An ALU ppyermanently wired to do onl y addition. No extra control signal required 9 Datapath portion for Instruction Fetch Increment by 4 for next 32‐bit instruction register Chapter 4 —The Processor — 10 Types of Elements in the Datapath . State element: . A memoryy,, element, i.e., it contains a state . E.g., program counter, instruction memory . Combinational element: . Elements that operate on values . EgE.g. adder, ALU 11 . Now, we will look at datapath elements reqqyuired by the different classes of instructions . Arithmetic and logical instructions . Data transfer instructions . Branch instructions 12 R-Format ALU Instructions . E.g., add $t1, $t2, $t3 . Perform arithmetic/logical operation . Read two register operands and write register result Chapter 4 —The Processor — 13 R-Format ALU Instructions . Register file: A collection of the registers . Any register can be read or written by specifying thhe number of the register . Contains the register state of the computer Chapter 4 —The Processor — 14 . Read from register file . 2 input s to the regist er file specifyi ng the numbers • 5 bit wide inputs for the 32 registers . 2 outputs from the register file with the read values • 32 bit wide . For all instructions. No control required. Chapter 4 —The Processor — 15 . Write to register file . 1 input to the register file specifying the number • 5 bit wide inputs for the 32 registers . 1 input to the register file wiihth the value to be written • 32 bit wide . OlOnly for some ins truc tions. RWitRegWritecontltrol silignal. Chapter 4 —The Processor — 16 . ALU . Takes two 32 bit input and produces a 32 bit output . Also, sets one-bit signal if the results is 0 . The operation done by ALU is controlled by a 4 bit control signal input. This is set according to the instruction Chapter 4 —The Processor — 17 Data transfer instructions . lw $t1, offset_value($t2) . Load: Read memory and update register . sw $t1, offset_value($t2) . Store: Write reggyister value to memory 18 Data transfer instructions . Compute the memory address by adding the value in base register and the 16 bit offset . need the ALU . Calculate address using 16-bit offset • Use ALU, but sign-extend offset . Write to or read from register . need the register file 19 . Two additional units – data memory and sign unit extension . Data memory . State element with • input for address and data to be written • output for read result . Data memory . Separate control for read and write . Control for read is required because reading from inva lid address can lead to problems . Sign-extension unit takes a 16 bit input and extend it to a 32 bit output 22 Composing the Elements for R-type and data transfer instructions . A simple data path that does an instruction in one clock cycle . Each datapath element can only do one function at a time . Hence, we need separate instruction and data memories . Use multiplexers where alternate data sources are used for different instructions Chapter 4 —The Processor — 23 Multiplexors . An ALU might need input from . Two registers . Or one registers and one immediate field (or off)ffset) . To choose correctly from multiple sources, a har dware element calle d multipl exor is used with appropriate control signals 24 Multiplexors . The data written to registers may come from . Data memory . Or ALU . To choose correctly from multiple sources, a hardware element called multiplexor is used withith appropriat e conttlrol siilgnals 25 R-Type/Load/Store Datapath Chapter 4 —The Processor — 26 Branch Instructions . beq $t1, $t2, offset . Read two registers and compare them . Take the 16 bit offset and add it to the address of next instruction following the branch ins truc tion to obta in the branc h targe t address Chapter 4 —The Processor — 27 Branch Instructions . Read register operands . Compare operands . Use ALU, subtract and check Zero output . Calculate target address . Sign-extend the offset . Shift left 2 places (word displacement) . Add to PC + 4 • Already calculated by instruction fetch Chapter 4 —The Processor — 28 Branch Instructions Just re‐routes wires Sign‐bit wire replicated Chapter 4 —The Processor — 29 Composing all elements together . Instruction fetch datapath . Datapath for R-type and memory instructions . Datapath for branches . Need an additional multiplexor to select the sequential address after branch or the branch tttarget address to be written to the PC 30 Datapath portion for Instruction Fetch Increment by 4 for next 32‐bit instruction register Chapter 4 —The Processor — 31 Full Datapath Chapter 4 —The Processor — 32 Datapath With Control AND gate for branch Chapter 4 —The Processor — 33 A Recap: Combinational Elements . AND-gate Adder A + Y . Y = A & B Y = A + B B A Y B Arithmetic/Logic Unit Multiplexer Y = F((,A, B) Y = S ? I1 : I0 A I0 M u Y ALU Y I1 x B S F Chapter 4 —The Processor — 34 A Recap: State Elements . Registers . Data Memory . Instruction Memory . Clocks are needed to decide when an element that contains state should be updated 35 Recap from Lecture 1: CPU Clocking . Operation of digital hardware governed by a constant-rate clock Clock period: duration of a clock cycle Clock frequency (rate): cycles per second 36 Clocks . A clock is a signal with a fixed cycle time (period) . The clock frequency is the inverse of the cycle time 37 Clocks . The clock cycle time or clock period is divided into two portions: . when the clock is high . whhen the cllkock is low 38 Clocking Methodology . We study . Edge triggered method ol ogy • Because it is simple . Edge triggered methodology: . All state changes occur on a clock edge Chapter 4 —The Processor — 39 Clocking Methodology : State Elements . Register: stores data in a circuit . Uses a clock signal to determine when to update the stored value . Edge-triggered: update when Clk changes from 0 to 1 Clk DQ D Clk Q Chapter 4 —The Processor — 40 Clocking Methodology : Stat e Element s . Register with write control . Only updates on clock edge when write control input is 1 . Used when stored value is required later Clk DQWrite Write D Clk Q Chapter 4 —The Processor — 41 Clocking Methodology . Combinational logic transforms data during clock cycles . Between clock edges . Input from state elements, output to state element • The state elements, whose outputs change only after the clock edggpe, provide valid inp uts to the combinational logic block. Chapter 4 —The Processor — 42 Clocking Methodology . To ensure that the values written into the state elements on the active clock edge are valid, the clock must have a long enough period so that all the signals in the combinational logic block stabilize, then the clock edge samples those values for storage in the state elements. This constraint sets a lower bound on the length of the clock period, which must be long enough for all state element inputs to be valid. Longest delay determines clock period Chapter 4 —The Processor — 43 It is possible to have a state element that is used as both an input and output to the same combinational logic block Ensure that the clock period is long enough 44 Single Clock Cycle . We studied a simple implementation where a singgyqle clock cycle is required for every instruction. Every instruction begins on one clock edge and completes execution on the next 45 Performance Issues . Longest dldelay determines cllkock period . Critical path: load instruction . Instruction memory register file ALU data memory register file . Not feasible to vary period for different instruct io ns . The clock cycle must be extended to accommodate the longest instruction . Improve performance by pipelining Chapter 4 —The Processor — 46 Conclusion . ISA influences the design of datapath and control for a processor . We studied an implementation bbdased on single cycle 47.

View Full Text

Details

  • File Type
    pdf
  • Upload Time
    -
  • Content Languages
    English
  • Upload User
    Anonymous/Not logged-in
  • File Pages
    47 Page
  • File Size
    -

Download

Channel Download Status
Express Download Enable

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

  • Not to be reproduced or distributed without explicit permission.
  • Not used for commercial purposes outside of approved use cases.
  • Not used to infringe on the rights of the original creators.
  • If you believe any content infringes your copyright, please contact us immediately.

Support

For help with questions, suggestions, or problems, please contact us