Pipelining and Hazards Instructor: Steven Ho Great Idea #4: Parallelism Software Hardware • Parallel Requests Warehouse Smart Assigned to Computer Scale Phone E.G

Pipelining and Hazards Instructor: Steven Ho Great Idea #4: Parallelism Software Hardware • Parallel Requests Warehouse Smart Assigned to Computer Scale Phone E.G

Pipelining and Hazards Instructor: Steven Ho Great Idea #4: Parallelism Software Hardware • Parallel Requests Warehouse Smart Assigned to computer Scale Phone e.g. search “Garcia” Computer Leverage • Parallel Threads Parallelism & Assigned to core Achieve High e.g. lookup, ads Performance Computer • Parallel Instructions Core … Core > 1 instruction @ one time Memory e.g. 5 pipelined instructions Input/Output • Parallel Data Core > 1 data item @ one time Instruction Unit(s) Functional e.g. add of 4 pairs of words Unit(s) A +B A +B A +B A +B • Hardware descriptions 0 0 1 1 2 2 3 3 All gates functioning in Cache Memory Logic Gates parallel at same time 7/12/2017 CS61C Su18 - Lecture 13 2 Review of Last Lecture • Implementing controller for your datapath – Take decoded signals from instruction and generate control signals • Pipelining improves performance by exploiting Instruction Level Parallelism – 5-stage pipeline for RISC-V: IF, ID, EX, MEM, WB – Executes multiple instructions in parallel – Each instruction has the same latency – What can go wrong??? 7/12/2017 CS61C Su18 - Lecture 13 3 Agenda • RISC-V Pipeline • Hazards – Structural – Data • R-type instructions • Load – Control • Superscalar processors 7/12/2017 CS61C Su18 - Lecture 13 4 Recap: Pipelining with RISC-V tinstruction instruction sequence add t0, t1, t2 or t3, t4, t5 sll t6, t0, t3 tcycle Single Cycle Pipelining Timing tstep = 100 … 200 ps tcycle = 200 ps Register access only 100 ps All cycles same length Instruction time, tinstruction = tcycle = 800 ps 1000 ps Clock rate, fs 1/800 ps = 1.25 GHz 1/200 ps = 5 GHz Relative speed 1 x 4 x 7/11/2018 5 RISC-V Pipeline t = 1000 ps instruction Resource use in a particular time slot add t0, t1, t2 Resource use of or t3, t4, t5 instruction sequence instruction over time slt t6, t0, t3 sw t0, 4(t3) lw t0, 8(t3) tcycle = 200 ps addi t2, t2, 1 7/11/2018 CS61C Su18 - Lecture 13 6 Single-Cycle RISC-V RV32I Datapath pc+4 pc +4 Reg[] alu wb 1 DataD 1 Reg[rs1] ALU 2 alu 0 DMEM inst[11:7] 1 pc AddrD 0 IMEM Reg[rs2] Addr wb pc+4 Branch DataR 0 inst[19:15] AddrA DataA 0 Comp. DataW mem inst[24:20] AddrB DataB 1 inst[31:7] Imm. imm[31:0] Gen PCSel inst[31:0] ImmSel RegWEn BrUnBrEq BrLT BSel ASel ALUSel MemRW WBSel 7/11/2018 7 Pipelining RISC-V RV32I Datapath pc+4 pc +4 Reg[] alu wb 1 DataD 1 Reg[rs1] ALU 2 alu 0 DMEM inst[11:7] 1 pc AddrD 0 IMEM Reg[rs2] Addr wb pc+4 Branch DataR 0 inst[19:15] AddrA DataA 0 Comp. DataW mem inst[24:20] AddrB DataB 1 inst[31:7] Imm. imm[31:0] Gen Instruction Instruction Fetch ALU Execute Memory Access Write Back (F) Decode/Register Read (X) (M) (W) (D) 7/11/2018 8 Pipelined RISC-V RV32I Datapath Recalculate PC+4 in M stage to avoid sending both PC and PC+4 down pipeline pcM pcX pcF pcD +4 +4 Reg[] rs1X DataD 1 ALU aluX DMEM 0 AddrD Addr pc +4 Branch aluM DataR F IMEM DataA AddrA Comp. DataW DataB AddrB inst D rs2 rs2 X imm M Imm. X instX instM instW Must pipeline instruction along with data, so 7/11/2018 control operates correctly in each stage 9 Each stage operates on different instruction lw t0, 8(t3) sw t0, 4(t3) slt t6, t0, t3 or t3, t4, t5 add t0, t1, t2 pcM pcX pcF pcD +4 +4 Reg[] rs1X DataD 1 ALU aluX DMEM 0 AddrD Addr pc +4 Branch aluM DataR F IMEM DataA AddrA Comp. DataW DataB AddrB inst D rs2 rs2 X imm M Imm. X instX instM instW 7/11/2018 Pipeline registers separate stages, hold data for each instruction in flight 10 Pipelined Control • Control signals derived from instruction − As in single-cycle implementation − Information is stored in pipeline registers for use by later stages 7/11/2018 11 Graphical Pipeline Representation • RegFile: right half is read, left half is write Time (clock cycles) I ALU n I$ Reg D$ Reg s Load ALU t I$ Reg D$ Reg Add r ALU I$ Reg D$ Reg Store O ALU r I$ Reg D$ Reg Sub d ALU I$ Reg D$ Reg e Or r 7/12/2017 CS61C Su18 - Lecture 13 12 Question: Which of the following signals (buses or control signals) for RISC-V does NOT need to be passed into the EX pipeline stage for a beq instruction? beq t0 t1 Label (A) BrUn IF ID EX Mem WB (B) MemWr ALU I$ Reg D$ Reg (C) RegWr (D) WBSel 13 Hazards Agenda Ahead! • RISC-V Pipeline • Hazards – Structural – Data • R-type instructions • Load – Control • Superscalar processors 7/12/2017 CS61C Su18 - Lecture 13 14 Pipelining Hazards A hazard is a situation that prevents starting the next instruction in the next clock cycle 1) Structural hazard – A required resource is busy (e.g. needed in multiple stages) 2) Data hazard – Data dependency between instructions – Need to wait for previous instruction to complete its data write 3) Control hazard – Flow of execution depends on previous instruction 7/12/2017 CS61C Su18 - Lecture 13 15 Agenda • RISC-V Pipeline • Hazards – Structural – Data • R-type instructions • Load – Control • Superscalar processors 7/12/2017 CS61C Su18 - Lecture 13 16 Structural Hazard • Problem: Two or more instructions in the pipeline compete for access to a single physical resource • Solution 1: Instructions take it in turns to use resource, some instructions have to stall • Solution 2: Add more hardware to machine • Can always solve a structural hazard by adding more hardware 7/11/2018 CS61C Su18 - Lecture 13 17 1. Structural Hazards • Conflict for use of a resource Time (clock cycles) I ALU n I$ Reg D$ Reg s Load Can we read and t ALU I$ Reg D$ Reg write to registers r Instr 1 simultaneously? ALU I$ Reg D$ Reg O Instr 2 r ALU I$ Reg D$ Reg d Instr 3 e ALU I$ Reg D$ Reg r Instr 4 7/12/2017 CS61C Su18 - Lecture 13 18 Regfile Structural Hazards • Each instruction: − can read up to two operands in decode stage − can write one value in writeback stage • Avoid structural hazard by having separate “ports” − two independent read ports and one independent write port • Three accesses per cycle can happen simultaneously 7/11/2018 CS61C Su18 - Lecture 13 19 Regfile Structural Hazards • Two alternate solutions: 1) Build RegFile with independent read and write ports (what you will do in the project; good for single-stage) 2) Double Pumping: split RegFile access in two! Prepare to write during 1st half, write on falling edge, read during 2nd half of each clock cycle • Will save us a cycle later... • Possible because RegFile access is VERY fast (takes less than half the time of ALU stage) • Conclusion: Read and Write to registers during same clock cycle is okay 7/12/2017 CS61C Su18 - Lecture 13 20 Memory Structural Hazards • Conflict for use of a resource Time (clock cycles) I Trying to read ALU n (and maybe I$ Reg D$ Reg s Load write) same t ALU I$ Reg D$ Reg memory twice r Instr 1 in same clock ALU I$ Reg D$ Reg cycle O Instr 2 r ALU I$ Reg D$ Reg d Instr 3 e ALU I$ Reg D$ Reg r Instr 4 7/12/2017 CS61C Su18 - Lecture 13 21 Instruction and Data Caches Processor Memory (DRAM) Control Instruction Cache Program Datapath PC Bytes Data Registers Cache Arithmetic & Logic Unit Data (ALU) Caches: small and fast “buffer” memories 7/11/2018 CS61C Su18 - Lecture 13 22 Structural Hazards – Summary • Conflict for use of a resource • In RISC-V pipeline with a single memory − Load/store requires data access − Without separate memories, instruction fetch would have to stall for that cycle ▪ All other operations in pipeline would have to wait • Pipelined datapaths require separate instruction/data memories − Or separate instruction/data caches • RISC ISAs (including RISC-V) designed to avoid structural hazards − e.g. at most one memory access/instruction CS61C Su18 - Lecture 13 23 Administrivia • Proj2-2 due 7/13, HW3/4 due 7/16 – 2-2 autograder being run • Guerilla Session Tonight! 4-6pm • HW0-2 grades should now be accurate on glookup • Project 3 released tomorrow night! • Supplementary review sessions starting – First one this Sat. (7/14) 12-2p, Cory 540AB 7/12/2017 CS61C Su18 - Lecture 13 24 Agenda • RISC-V Pipeline • Hazards – Structural – Data • R-type instructions • Load – Control • Superscalar processors 7/12/2017 CS61C Su18 - Lecture 13 25 2. Data Hazards (1/2) • Consider the following sequence of instructions: add t0, t1, t2 sub t4, t0, t3 and t5, t0, t6 or t7, t0, t8 xor t9, t0, t10 Stored Read during WB during ID 7/12/2017 CS61C Su18 - Lecture 13 26 2. Data Hazards (2/2) • Data-flow backward in time are hazards Time (clock cycles) IF ID/RF EX MEM WB I ALU add t0, t1, t2 I$ Reg D$ Reg Hazard if no double pumping n s ALU sub t4, t0, t3 I$ Reg D$ Reg t r ALU I$ D$ Reg and t5, t0, t6 Reg O ALU I$ Reg D$ Reg r or t7, t0, t8 d ALU e I$ Reg D$ Reg xor t9, t0, t10 r 7/12/2017 CS61C Su18 - Lecture 13 27 Solution 1: Stalling • Problem: Instruction depends on result from previous instruction − add t0, t1, t2 sub t4, t0, t3 • Bubble: − effectively NOP: affected pipeline stages do “nothing” Stalls and Performance • Stalls reduce performance − But stalls are required to get correct results • Compiler can arrange code to avoid hazards and stalls − Requires knowledge of the pipeline structure 7/11/2018 29 Data Hazard Solution: Forwarding • Forward result as soon as it is available – OK that it’s not stored in RegFile yet Arithmetic result IF ID/RF EX MEM WB available in EX ALU add t0, t1, t2 I$ Reg D$ Reg ALU sub t4, t0, t3 I$ Reg D$ Reg ALU I$ D$ Reg and t5, t0, t6 Reg ALU I$ Reg D$ Reg or t7, t0, t8 ALU I$ Reg D$ Reg xor t9, t0, t10 Forwarding: grab operand from pipeline stage, 7/12/2017 rather than register file 30 Forwarding (aka Bypassing) • Use result when it is computed − Don’t wait for it to be stored in a register − Requires extra connections in the datapath 7/11/2018 CS61C Su18 - Lecture 13 31 Detect Need for Forwarding (example) Compare destination of older instructions in D X M W pipeline with sources of new instruction in decode stage.

View Full Text

Details

  • File Type
    pdf
  • Upload Time
    -
  • Content Languages
    English
  • Upload User
    Anonymous/Not logged-in
  • File Pages
    65 Page
  • File Size
    -

Download

Channel Download Status
Express Download Enable

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

  • Not to be reproduced or distributed without explicit permission.
  • Not used for commercial purposes outside of approved use cases.
  • Not used to infringe on the rights of the original creators.
  • If you believe any content infringes your copyright, please contact us immediately.

Support

For help with questions, suggestions, or problems, please contact us