MIPS Processor Implementation

MIPS Processor Implementation

Design of Pipelined MIPS Processor Sept. 24 & 26, 1997 Topics • Instruction processing • Principles of pipelining • Inserting pipe registers • Data Hazards • Control Hazards • Exceptions MIPS architecture subset R-type instructions (add, sub, and, or, slt): rd <-- rs funct rt 0 rs rt rd shamt funct 31-26 25-21 20-16 15-11 10-6 5-0 RI-type instructions (addiu): rt <-- rs funct I16 0 rs rt I16 31-26 25-21 20-16 15-0 Load: rt <-- Mem[rs + I16] Store: Mem[rs + I16] <-- rt 35 or 43 rs rt I16 31-26 25-21 20-16 15-0 Branch equal: PC <-- (rs == rt) ? PC + 4 + I16 <<2 : PC + 4 4 rs rt I16 31-26 25-21 20-16 15-0 – 2 – CS 740 F’97 Datapath IF ID EX MEM WB instruction instruction decode/ execute/ memory write fetch register fetch address calculation access back MUX Add 4 shift Add left 2 Instr [25-21] Instruction read reg 1 read data 1 read addr memory Instr [20-16] read reg 2 ALU zero read data PC Read address Registers write reg ALU Data memory Instruction read data 2 result [31-0] Instr [15-11] MUX MUX write data write addr MUX write data Instr [15-0] 16 32 \ Sign \ extend ALU Instr [5-0] control – 3 – CS 740 F’97 R-type instructions IF: Instruction fetch • IR <-- IMemory[PC] • PC <-- PC + 4 ID: Instruction decode/register fetch • A <-- Register[IR[25:21]] • B <-- Register[IR[20:16]] Ex: Execute • ALUOutput <-- A op B MEM: Memory • nop WB: Write back • Register[IR[15:11]] <-- ALUOutput – 4 – CS 740 F’97 Load instruction IF: Instruction fetch • IR <-- IMemory[PC] • PC <-- PC + 4 ID: Instruction decode/register fetch • A <-- Register[IR[25:21]] • B <-- Register[IR[20:16]] Ex: Execute • ALUOutput <-- A + SignExtend(IR[15:0]) MEM: Memory • Mem-Data <-- DMemory[ALUOutput] WB: Write back • Register[IR[20:16]] <-- Mem-Data – 5 – CS 740 F’97 Store instruction IF: Instruction fetch • IR <-- IMemory[PC] • PC <-- PC + 4 ID: Instruction decode/register fetch • A <-- Register[IR[25:21]] • B <-- Register[IR[20:16]] Ex: Execute • ALUOutput <-- A + SignExtend(IR[15:0]) MEM: Memory • DMemory[ALUOutput] <-- B WB: Write back • nop – 6 – CS 740 F’97 Branch on equal IF: Instruction fetch • IR <-- IMemory[PC] • PC <-- PC + 4 ID: Instruction decode/register fetch • A <-- Register[IR[25:21]] • B <-- Register[IR[20:16]] Ex: Execute • Target <-- PC + SignExtend(IR[15:0]) << 2 • Z <-- (A - B == 0) MEM: Memory Is this a delayed • If (Z) PC <-- target branch? WB: Write back • nop – 7 – CS 740 F’97 Pipelining Basics Unpipelined 30ns 3ns System R Comb. E Delay = 33ns Logic G Throughput = 30MHz Clock Op1 Op2 Op3 • • • Time • One operation must complete before next can begin • Operations spaced 33ns apart – 8 – CS 740 F’97 3 Stage Pipelining 10ns 3ns 10ns 3ns 10ns 3ns R R R Comb. Comb. Comb. E E E Delay = 39ns Logic Logic Logic G G G Throughput = 77MHz Clock Op1 Op2 • Space operations Op3 13ns apart Op4 • 3 operations occur simultaneously Time • • • – 9 – CS 740 F’97 Limitation: Nonuniform Pipelining 5ns 3ns 15ns 3ns 10ns 3ns R R R Com. Comb. Comb. E E E Delay = 39ns Log. Logic Logic G G G Throughput = 55MHz Clock • Throughput limited by slowest stage • Must attempt to balance stages – 10 – CS 740 F’97 Limitation: Deep Pipelines 5ns 3ns 5ns 3ns 5ns 3ns 5ns 3ns 5ns 3ns 5ns 3ns R R R R R R Com. Com. Com. Com. Com. Com. E E E E E E Log. Log. Log. Log. Log. Log. G G G G G G Clock Delay = 48ns, Throughput = 128MHz • Diminishing returns as add more pipeline stages • Register delays become limiting factor – Increased latency – Small througput gains – 11 – CS 740 F’97 Limitation: Sequential Dependencies R R R Comb. Comb. Comb. E E E Logic Logic Logic G G G Clock Op1 Op2 • Op4 gets result Op3 from Op1 ! Op4 • Pipeline Hazard Time • • • – 12 – CS 740 F’97 Pipelined datapath MUX IF/ID ID/EX EX/MEM MEM/WB Add 4 shift Add left 2 PC Instr [25-21] read reg 1 Instruction read data 1 Instr [20-16] read addr mem read reg 2 ALU Read address Registers zero read data write reg ALU Data mem Instruction result [31-0] read data 2 write data MUX write addr MUX write data Instr [15-0] 16 32 \ Sign \ ext ALU Instr [5-0] ctl Instr [20-16] Instr [15-11] MUX – 13 – CS 740 F’97 Pipeline Structure Branch Flag & Target PC IF/ID ID/EX EX/MEM MEM/WB IF ID EX MEM Instr. Reg. Data Mem. File Mem. Next PC Write Back Reg. & Data Notes • Each stage consists of operate logic connecting pipe registers • WB logic merged into ID • Additional paths required for forwarding – 14 – CS 740 F’97 Pipe Register Next Current State State Operation • Current State stays constant while Next State being updated • Update involves transferring Next State to Current – 15 – CS 740 F’97 Pipeline Stage ID Reg. File Current Next State State Operation • Computes next state based on current – From/to one or more pipe registers • May have embedded memory elements – Low level timing signals control their operation during clock cycle – Writes based on current pipe register state – Reads supply values for Next state – 16 – CS 740 F’97 MIPS Simulator Features Run Controls • Based on MIPS subset Speed Control – Code generated by dis Mode – Hexadecimal instruction code Selection • Executable available Current – HOME/public/sim/solve_tk State Pipe Demo Programs Register • HOME/public/sim/solve_tk/demos Next State Register Values – 17 – CS 740 F’97 Reg-Reg Instructions R-type instructions (add, sub, and, or, slt): rd <-- rs funct rt 0 rs rt rd shamt funct 31-26 25-21 20-16 15-11 10-6 5-0 Add +4 IF/ID ID/EX EX/MEM MEM/WB ALU Op A A rs data data data instr Reg. Instr. rt B B Mem. Wdata dest ALU dest dest Waddr rd PC – 18 – CS 740 F’97 Simulator ALU Example 0x0: 24020003 li r2,3 demo1.O IF 0x4: 24030004 li r3,4 • Fetch instruction 0x8: 00000000 nop ID 0xc: 00000000 nop • Fetch operands 0x10:00432021 addu r4,r2,r3 # --> 7 0x14:00000000 nop EX 0x18:0000000d break 0 • Compute ALU 0x1c:00000000 nop result .set noreorder demo1.s MEM addiu $2, $0, 3 • Nothing addiu $3, $0, 4 WB nop nop • Store result in addu $4, $2, $3 destination reg. nop break 0 .set reorder – 19 – CS 740 F’97 Simulator Store/Load Examples IF demo2.O • Fetch instruction 0x0: 24020003 li r2,3 # Store/Load -->3 ID 0x4: 24030004 li r3,4 # --> 4 0x8: 00000000 nop • Get addr reg 0xc: 00000000 nop • Store: Get data 0x10:ac430005 sw r3,5(r2) # 4 at 8 EX 0x14:00000000 nop • Compute EA 0x18:00000000 nop MEM 0x1c:8c640004 lw r4,4(r3) # --> 4 • Load: Read 0x20:00000000 nop 0x24:0000000d break 0 • Store: Write 0x28:00000000 nop WB 0x2c:00000000 nop • Load: Update reg. – 20 – CS 740 F’97 Simulator Branch Examples demo3.O IF 0x0: 24020003 li r2,3 • Fetch instruction 0x4: 24030004 li r3,4 ID 0x8: 00000000 nop • Fetch operands 0xc: 00000000 nop EX 0x10:10430008 beq r2,r3,0x34 # Don't take 0x14:00000000 nop • Subtract operands 0x18:00000000 nop – test if 0 0x1c:00000000 nop • Compute target 0x20:14430004 bne r2,r3,0x34 # Take MEM 0x24:00000000 nop • Taken: Update PC 0x28:00000000 nop to target 0x2c:00000000 nop WB 0x30:00432021 addu r4,r2,r3 # skip • Nothing 0x34:00632021 addu r4,r3,r3 # Target ... – 21 – CS 740 F’97 Data Hazards in MIPS Pipeline Problem • Registers read in ID, and written in WB • Must resolve conflict between instructions competing for register array – Generally do write back in first half of cycle, read in second • But what about intervening instructions? • E.g., suppose initially $2 is zero: IF ID EX M WB addiu $2, $0, 63 $2 IF ID EX M WB addiu $3, $2, 0 $3 $4 IF ID EX M WB addiu $4, $2, 0 $5 IF ID EX M WB addiu $5, $2, 0 $6 IF ID EX M WB addiu $6, $2, 0 – 22 – CS 740 F’97 Simulator Data Hazard Example Operation • Read in ID demo4.O • Write in WB 0x0: 2402003f li r2,63 • Write-before-read 0x4: 00401821 move r3,r2 # --> 0x3F? register file 0x8: 00402021 move r4,r2 # --> 0x3f? 0xc: 00402821 move r5,r2 # --> 0x3f? 0x10:00403021 move r6,r2 # --> 0x3f? 0x14:00000000 nop 0x18:0000000d break 0 0x1c:00000000 nop – 23 – CS 740 F’97 Handling Hazards by Stalling Idea • Delay instruction until hazard eliminated • Put “bubble” into pipeline Bubble – Dynamically generated NOP Stall Pipe Register Operation • Normally transfer next state to current • “Stall” indicates that current state should not be changed • “Bubble” indicates that current state should be set to 0 – Stage logic designed so that 0 is like Next Current NOP State State – [Other conventions possible] – 24 – CS 740 F’97 Stall Control Stall Control IF ID EX MEM Instr. Reg. Data Mem. File Mem. Stall Logic • Determines which stages to stall or bubble on next update – 25 – CS 740 F’97 Observations on Stalling Good • Relatively simple hardware • Only penalizes performance when hazard exists Bad • As if placed NOP’s in code – Except that does not waste instruction memory Reality • Some problems can only be dealt with by stalling • E.g., instruction cache miss – Stall PC, bubble IF/ID • Data cache miss – Stall PC, IF/ID, ED/EX, EX/MEM, bubble MEM/WB – 26 – CS 740 F’97 Forwarding (Bypassing) Observation • ALU data generated at end of EX – Steps through pipe until WB • ALU data consumed at beginning of EX Idea • Expedite passing of previous instruction result to ALU • By adding extra data pathways and control – 27 – CS 740 F’97 Forwarding for ALU-ALU Hazard Bypass Add +4 IF/ID Control ALU Op ID/EX + EX/MEM $1 MEM/WB A $2 rs 5 instr data data data rt $1 5 12 Instr.

View Full Text

Details

  • File Type
    pdf
  • Upload Time
    -
  • Content Languages
    English
  • Upload User
    Anonymous/Not logged-in
  • File Pages
    56 Page
  • File Size
    -

Download

Channel Download Status
Express Download Enable

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

  • Not to be reproduced or distributed without explicit permission.
  • Not used for commercial purposes outside of approved use cases.
  • Not used to infringe on the rights of the original creators.
  • If you believe any content infringes your copyright, please contact us immediately.

Support

For help with questions, suggestions, or problems, please contact us