Dynamic Scheduling II

CIS 501: Introduction to Computer Architecture
Unit 9: Dynamic Scheduling II
CIS 501 (Martin/Roth): Dynamic Scheduling II

This Unit: Dynamic Scheduling II

[Figure: system layers: Application, OS, Compiler, Firmware, CPU, I/O, Memory, Digital Circuits, Gates & Transistors]

• Previously: dynamic scheduling
  • Insn buffer + scheduling algorithms
  • Scoreboard: no register renaming
  • Tomasulo: register renaming
• Now: add speculation, precise state
  • Re-order buffer
  • PentiumPro vs. MIPS R10000
• Also: dynamic load scheduling
  • Out-of-order memory operations

Readings

• H+P
  • None (not happy with explanation of this topic)

Superscalar + Out-of-Order + Speculation

• Three great tastes that taste great together
• CPI ≥ 1?
  • Go superscalar
• Superscalar increases RAW hazards?
  • Go out-of-order (OoO)
• RAW hazards still a problem?
  • Build a larger window
• Branches a problem for filling large window?
  • Add control speculation

Speculation and Precise Interrupts

• Why are we discussing these together?
  • Sequential (vN) semantics for interrupts
    • All insns before interrupt should be complete
    • All insns after interrupt should look as if never started (abort)
  • Basically want same thing for mis-predicted branch
• What makes precise interrupts difficult?
  • OoO completion → must undo post-interrupt writebacks
  • Same thing for branches
    • In-order → branches complete before younger insns writeback
    • OoO → not necessarily
• Precise interrupts, mis-speculation recovery: same problem
  • Same problem → same solution

Precise State

• Speculative execution requires
  • (Ability to) abort & restart at every branch
  • Abort & restart at every load useful for load speculation (later)
  • And for shared memory multiprocessing (much later)
• Precise synchronous (program-internal) interrupts require
  • Abort & restart at every load, store, ??
• Precise asynchronous (external) interrupts require
  • Abort & restart at every ??
• Bite the bullet
  • Implement abort & restart at every insn
  • Called “precise state”

Precise State Options

• Imprecise state: ignore the problem!
  – Makes page faults (any restartable exceptions) difficult
  – Makes speculative execution almost impossible
  • IEEE standard strongly suggests precise state
  • Compromise: Alpha implemented precise state only for integer ops
• Force in-order completion (W): stall pipe if necessary
  – Slow
• Precise state in software: trap to recovery routine
  – Implementation dependent
  • Trap on every mis-predicted branch (you must be joking)
• Precise state in hardware
  + Everything is better in hardware (except policy)

The Problem with Precise State

[Figure: pipeline with I$, branch predictor (BP), insn buffer, regfile, D$; stages D and S]

• Problem: writeback combines two separate functions
  • Forwards values to younger insns: OK for this to be out-of-order
  • Write values to registers: would like this to be in-order
• Similar problem (decode) for OoO execution: solution?
  • Split decode (D) → in-order dispatch (D) + out-of-order issue (S)
  • Separate using insn buffer: scoreboard or reservation station

Re-Order Buffer (ROB)

[Figure: pipeline with I$, BP, reorder buffer (ROB), regfile, D$; stages W1 and W2]

• Insn buffer → re-order buffer (ROB)
  • Buffers completed results en route to register file
  • May be combined with RS or separate
    • Combined in picture: register-update unit RUU (Sohi’s method)
    • Separate (more common today): P6-style
• Split writeback (W) into two stages
  • Why is there no latch between W1 and W2?

Complete and Retire

[Figure: pipeline with I$, BP, reorder buffer (ROB), regfile, D$; stages C and R]

• Complete (C): first part of writeback (W)
  • Completed insns write results into ROB
  + Out-of-order: wait doesn’t back-propagate to younger insns
• Retire (R): aka commit, graduate
  • ROB writes results to register file
  • In order: stall back-propagates to younger insns

Load/Store Queue (LSQ)

• ROB makes register writes in-order, but what about stores?
• As usual, i.e., to D$ in X stage?
  • Not even close, imprecise memory worse than imprecise registers
• Load/store queue (LSQ)
  • Completed stores write to LSQ
  • When store retires, head of LSQ written to D$
  • When loads execute, access LSQ and D$ in parallel
    • Forward from LSQ if older store with matching address
  • More modern design: loads and stores in separate queues
    • More on this later

ROB + LSQ

[Figure: pipeline with I$, BP, ROB, regfile, load/store unit, LSQ, D$; stages C and R; paths for addr, store data, load data]

• Modulo gross simplifications, this picture is almost realistic!

P6

• P6: Start with Tomasulo’s algorithm… add ROB
  • Separate ROB and RS
• Simple-P6
  • Our old RS organization: 1 ALU, 1 load, 1 store, 2 3-cycle FP

P6 Data Structures

• Reservation Stations are same as before
• ROB
  • head, tail: pointers maintain sequential order
  • R: insn output register, V: insn output value
• Tags are different
  • Tomasulo: RS# → P6: ROB#
• Map Table is different
  • T+: tag + “ready-in-ROB” bit
  • T==0 → Value is ready in regfile
  • T!=0 → Value is not ready
  • T!=0+ → Value is ready in the ROB

[Figure: P6 datapath: Map Table (T+), Regfile (value), ROB (R, value; Head/Retire and Tail/Dispatch pointers), RS (op, T, T1, T2, V1, V2, with == tag comparators), FU, CDB (CDB.T, CDB.V)]

ROB
ht  #  Insn            R   V   S   X   C
    1  ldf X(r1),f1
    2  mulf f0,f1,f2
    3  stf f2,Z(r1)
    4  addi r1,4,r1
    5  ldf X(r1),f1
    6  mulf f0,f1,f2
    7  stf f2,Z(r1)

Map Table
Reg  T+
f0
f1
f2
r1

CDB
T  V

Reservation Stations
#  FU   busy  op  T  T1  T2  V1  V2
1  ALU  no
2  LD   no
3  ST   no
4  FP1  no
5  FP2  no

• Insn fields and status bits
• Tags
• Values

P6 Pipeline

• New pipeline structure: F, D, S, X, C, R
• D (dispatch)
  • Structural hazard (ROB/LSQ/RS)? Stall
  • Allocate ROB/LSQ/RS
  • Set RS tag to ROB#
  • Set Map Table entry to ROB# and clear “ready-in-ROB” bit
  • Read ready registers into RS (from either ROB or Regfile)
• X (execute)
  • Free RS entry
    • Used to be at W, can be earlier because RS# are not tags
• C (complete)
  • Structural hazard (CDB)? Wait
  • Write value into ROB entry indicated by RS tag
  • Mark ROB entry as complete
  • If not overwritten, mark Map Table entry “ready-in-ROB” bit (+)
• R (retire)
  • Insn at ROB head not complete? Stall
  • Handle any exceptions
  • Write ROB head value to register file
  • If store, write LSQ head to D$
  • Free ROB/LSQ entries

P6 Dispatch (D): Part I

[Figure: P6 datapath, as above]

• RS/ROB full? Stall
• Allocate RS/ROB entries, assign ROB# to RS output tag
• Set output register Map Table entry to ROB#, clear “ready-in-ROB”

P6 Dispatch (D): Part II

[Figure: P6 datapath, as above]

• Read tags for register inputs from Map Table
  • Tag==0 → copy value from Regfile (not shown)
  • Tag!=0 → copy Map Table tag to RS
  • Tag!=0+ → copy value from ROB

P6 Complete (C)

[Figure: P6 datapath, as above]

• Structural hazard (CDB)? Stall : broadcast <value,tag> on CDB
• Write result into ROB; if still valid, set Map Table “ready-in-ROB” bit
• Match tags, write CDB.V into RS slots of dependent insns

P6 Retire (R)

[Figure: P6 datapath, as above]

• ROB head not complete? Stall : free ROB entry
• Write ROB head result to Regfile
• If still valid, clear Map Table entry

P6: Cycle 1

ROB (h = head, t = tail)
ht  #  Insn            R   V   S   X   C
ht  1  ldf X(r1),f1    f1
    2  mulf f0,f1,f2
    3  stf f2,Z(r1)
    4  addi r1,4,r1
    5  ldf X(r1),f1
    6  mulf f0,f1,f2
    7  stf f2,Z(r1)

Map Table
Reg  T+
f0
f1   ROB#1
f2
r1

CDB: empty

Reservation Stations
#  FU   busy  op    T      inputs (tags T1/T2, values V1/V2)
1  ALU  no
2  LD   yes   ldf   ROB#1  value [r1]
3  ST   no
4  FP1  no
5  FP2  no

• Allocate: ldf gets RS 2 (LD) and ROB#1
• Set ROB# tag: Map Table entry for f1 becomes ROB#1

P6: Cycle 2

ROB
ht  #  Insn            R   V   S   X   C
h   1  ldf X(r1),f1    f1      c2
t   2  mulf f0,f1,f2   f2
    3  stf f2,Z(r1)
    4  addi r1,4,r1
    5  ldf X(r1),f1
    6  mulf f0,f1,f2
    7  stf f2,Z(r1)

Map Table
Reg  T+
f0
f1   ROB#1
f2   ROB#2
r1

CDB: empty

Reservation Stations
#  FU   busy  op    T      inputs (tags T1/T2, values V1/V2)
1  ALU  no
2  LD   yes   ldf   ROB#1  value [r1]
3  ST   no
4  FP1  yes   mulf  ROB#2  tag ROB#1 (for f1), value [f0]
5  FP2  no

• Allocate: mulf gets RS 4 (FP1) and ROB#2
• Set ROB# tag: Map Table entry for f2 becomes ROB#2

P6: Cycle 3

ROB
ht  #  Insn            R   V   S   X   C
h   1  ldf X(r1),f1    f1      c2  c3
    2  mulf f0,f1,f2   f2
t   3  stf f2,Z(r1)
    4  addi r1,4,r1
    5  ldf X(r1),f1
    6  mulf f0,f1,f2

Map Table
Reg  T+
f0
f1   ROB#1
f2   ROB#2
r1

P6: Cycle 4

ROB
ht  #  Insn            R   V     S   X   C
h   1  ldf X(r1),f1    f1  [f1]  c2  c3  c4
    2  mulf f0,f1,f2   f2        c4
    3  stf f2,Z(r1)
t   4  addi r1,4,r1    r1
    5  ldf X(r1),f1
    6  mulf f0,f1,f2

Map Table
Reg  T+
f0
f1   ROB#1+
f2   ROB#2
r1   ROB#4

CDB
T      V
ROB#1  [f1]

• ldf finished
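
The dispatch, complete, and retire rules from the P6 Pipeline slides can be sketched in a few lines of Python. This is a minimal model under stated assumptions, not the course's simulator: the class `SimpleP6`, its method names, and the integer values are hypothetical; reservation stations, the CDB, and the LSQ are left out; and ROB numbers grow without wrapping so that tag 0 can mean "value is ready in the regfile".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RobEntry:
    insn: str
    dest: Optional[str]          # R: output register (None for stores)
    value: Optional[int] = None  # V: output value, written at complete
    done: bool = False


@dataclass
class MapEntry:
    tag: int = 0         # T == 0: value is ready in the regfile
    ready: bool = False  # the "+" bit: value is ready in the ROB


class SimpleP6:
    """Hypothetical toy model of the Map Table, ROB, and regfile only."""

    def __init__(self, regs, rob_size=8):
        self.regfile = dict(regs)
        self.map = {r: MapEntry() for r in regs}
        self.rob = {}     # ROB# -> RobEntry for in-flight insns
        self.head = 1     # oldest in-flight ROB#
        self.tail = 1     # next ROB# to allocate
        self.rob_size = rob_size

    def read_operand(self, reg):
        """Dispatch-time lookup: ('value', v) if ready, else ('tag', ROB#)."""
        m = self.map[reg]
        if m.tag == 0:                       # T == 0: read the regfile
            return ("value", self.regfile[reg])
        if m.ready:                          # T != 0 with +: read the ROB
            return ("value", self.rob[m.tag].value)
        return ("tag", m.tag)                # T != 0: wait for this tag

    def dispatch(self, insn, dest, srcs):
        if len(self.rob) >= self.rob_size:
            return None                      # structural hazard: stall
        # Read inputs before renaming dest (matters for addi r1,4,r1).
        operands = [self.read_operand(s) for s in srcs]
        tag = self.tail
        self.tail += 1
        self.rob[tag] = RobEntry(insn, dest)
        if dest is not None:                 # set tag, clear "+" bit
            self.map[dest] = MapEntry(tag=tag, ready=False)
        return tag, operands

    def complete(self, tag, value):
        entry = self.rob[tag]
        entry.value, entry.done = value, True
        # Set "+" only if a younger insn has not overwritten the mapping.
        if entry.dest is not None and self.map[entry.dest].tag == tag:
            self.map[entry.dest].ready = True

    def retire(self):
        entry = self.rob.get(self.head)
        if entry is None or not entry.done:
            return None                      # head not complete: stall
        if entry.dest is not None:
            self.regfile[entry.dest] = entry.value
            if self.map[entry.dest].tag == self.head:
                self.map[entry.dest] = MapEntry()  # still valid: clear
        del self.rob[self.head]
        self.head += 1
        return entry.insn


cpu = SimpleP6({"f0": 2, "f1": 0, "r1": 100})
ld_tag, _ = cpu.dispatch("ldf X(r1),f1", "f1", ["r1"])
add_tag, _ = cpu.dispatch("addi r1,4,r1", "r1", ["r1"])
cpu.complete(add_tag, 104)   # younger insn completes first
stalled = cpu.retire()       # None: the ldf at the ROB head is not done
```

In the usage lines, the `addi` completes out of order, but `retire()` stalls and the regfile still holds the old `r1` until the older `ldf` completes. That separation between out-of-order complete and in-order retire is what keeps the register state precise in this sketch.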
