<<

Summary of Multicycle Datapath Differences from single cycle: EEC 170: • All intermediate results -> intermediate registers • Logical RTL -> more complex physical RTL • Possible reuse of hardware (as in book) Goal: Multicycle Control / • Not all instructions take same amount of time Microprogramming •Savings from shorter instructions • Costs: •More hardware (intermediate registers, control) •Imbalance between cycles

© John Owens / UCD 2003 This is only an intermediate representation!

Example Multicycle Datapath Our Control Model

State => set up control logic for Register Transfer State + Inputs => next state

RegDst State + Inputs => next state Equal RegWr MemToReg MemRd MemWr nPC_sel

E ALUctr ALUSrc ExtOp Transfer occurs upon exiting state (same clock edge)

Reg A S Reg. File inputs (conditions)

File Ext ALU IR PC B

Next PC M Next State

Mem State X

Access Logic Register Transfer Control Points Data Mem Fetch Fetch Control State Operand Instruction Result Store Depends on Input

Critical Path? Output Logic

outputs (control points)

Control Spec for multicycle proc Traditional FSM Controller

IR <= MEM[PC] “instruction fetch” next state op cond state control points

A <= R[rs] “decode / operand fetch” B <= R[rt]

LW R-type ORi SW BEQ Truth Table S <= A fun B S <= A or ZX S <= A + SX S <= A + SX PC <= next control points Next(PC,Equal) State 11 Execute Equal 6 M <= MEM[S] MEM[S] <= B PC <= PC + 4 Memory

R[rd] <= S R[rt] <= S R[rt] <= M 4 State PC <= PC + 4 PC <= PC + 4 PC <= PC + 4 op NextState = f(state, Write-back datapath State op, conditions)

1 datapath + state diagram ⇒⇒ control) Mapping Register Transfers to Control Points

IR <= MEM[PC] “instruction fetch” Translate RTs into control points imem_rd, IRen

A <= R[rs] Assign states B <= R[rt] “decode” Aen, Ben, Een

LW Then go build the controller R-type ORi SW BEQ S <= A fun B S <= A or ZX S <= A + SX S <= A + SX PC <= ALUfun, Sen Next(PC,Equal) Execute

M <= MEM[S] MEM[S] <= B PC <= PC + 4 R[rd] <= S

PC <= PC + 4 Memory RegDst, R[rt] <= S R[rt] <= M RegWr, PC <= PC + 4 PCen PC <= PC + 4 Write-back

Assigning States (Mostly) Detailed Control Specs (missing⇒0)

State Op field Eq Next IR PC Ops Exec Mem Write-Back IR <= MEM[PC] “instruction fetch” 0000 en sel A B E Ex Src ALU S R W M R-M Wr Dst 0000 ?????? ? 0001 1 A <= R[rs] “decode” 0001 BEQ x 0011 1 1 1 B <= R[rt] 0001 R-type x 0100 1 1 1 -all same in Moore machine 0001 0001 ORI x 0110 1 1 1 0001 LW x 1000 1 1 1 Setting control signals: not LW R-type ORi SW 0001 SW x 1011 1 1 1 much different than SC! BEQ

S <= A fun B S <= A or ZX S <= A + SX S <= A + SX PC <= Next(PC) BEQ: 0011 xxxxxx 0 0000 1 0 x 0 x 0100 0110 1000 1011 0011 0011 xxxxxx 1 0000 1 1 x 0 x Execute R: 0100 xxxxxx x 0101 0 1 fun 1 0101 xxxxxx x 0000 1 0 0 1 1 M <= MEM[S] MEM[S] <= B ORi: 0110 xxxxxx x 0111 0 0 or 1 PC <= PC + 4 1001 0111 xxxxxx x 0000 1 0 0 1 0 1100 Memory LW: 1000 xxxxxx x 1001 1 0 add 1 R[rd] <= S R[rt] <= S R[rt] <= M 1001 xxxxxx x 1010 1 0 1 PC <= PC + 4 PC <= PC + 4 PC <= PC + 4 1010 xxxxxx x 0000 1 0 1 1 0 0101 0111 1010 SW: 1011 xxxxxx x 1100 1 0 add 1 Write-back 1100 xxxxxx x 0000 1 0 0 1 0

Controller Design Example: Jump- The state diagrams that define the controller for an i i instruction set are highly structured zero0000 inc load

Use this structure to construct a simple i+1 “” Control reduces to programming this very simple device Map ROM None of above: Do nothing op-code • ⇒ microprogramming sequencer datapath control (for wait states) control • Goal: Simpler implementation! microinstruction

zero inc Counter micro-PC load sequencer

2 Using a Jump Counter Our Microsequencer

IR <= MEM[PC] “instruction fetch” 0000 inc

A <= R[rs] “decode” B <= R[rt] 0001 load taken

LW R-type ORi SW BEQ datapath control S <= A fun B S <= A or ZX S <= A + SX S <= A + SX PC <= Next(PC) Z I L 0100 0110 1000 1011 0011 Execute inc inc inc inc Micro-PC Zero zero M <= MEM[S] MEM[S] <= B Increment 1001 PC <= PC + 4 op-code

1100 Memory Load inc Map ROM R[rd] <= S R[rt] <= S R[rt] <= M PC <= PC + 4 PC <= PC + 4 Stores “Load” -> Simpler than next state PC <= PC + 4 zero 0101 0111 1010 zero zero Write-back zero

Microprogram Control Specification Adding the Dispatch ROM

µPC Taken NextIR PC Ops Exec Mem Write-Back Sequencer-based control en sel A B Ex Sr ALU S R W M M-R Wr Dst • Called “microPC” or “µPC” vs. state register 0000 ? inc 1 0001 x load 1 1 Control Value Effect 00 (Zero) Next µaddress = 0 0011 0 zero 1 0 BEQ 0011 0 zero 1 0 0011 1 zero 1 1 01 (Load) Next µaddress = 1 R: 0100 x inc 0 1 fun 1 dispatch ROM 0101 x zero 1 0 0 1 1 10 (Inc) Next µaddress = microPC ORi: 0110 x inc 0 0 or 1 µaddress + 1 0111 x zero 1 0 0 1 0 Mux 1000 x inc 1 0 add 1 LW: 2 1 0 1001 x inc 1 0 1 0 1010 x zero 1 0 1 1 0 ROM: R-type 000000 0100 1010 x zero 1 0 1 1 0 µAddress ROM SW: 1011 x inc 1 0 add 1 BEQ 000100 0011 Select 1100 x zero 1 0 0 1 0 ori 001101 0110 Logic LW 100011 1000 SW 101011 1011 Opcode

Example: Controlling handles non-ideal memory

This controller is powerful! IR <= MEM[PC] “instruction fetch” wait Example: What to do if memory is not ready? ~wait A <= R[rs] “decode / operand fetch” B <= R[rt]

PC taken datapath LW addr control R-type ORi SW BEQ InstMem_rd Instruction PC <= S <= A fun B S <= A or ZX S <= A + SX S <= A + SX Memory IM_wait Z I L Stall = !ZIL Next(PC)

data Execute

Inst. Reg IR_en Micro-PC M <= MEM[S] MEM[S] <= B

op-code ~wait wait Memory ~wait wait Map ROM R[rd] <= S R[rt] <= S R[rt] <= M PC <= PC + 4 PC <= PC + 4 PC <= PC + 4 PC <= PC + 4 Write-back

3 Multicycle Control Summary Announcements What’s the same as SC: • Midterm coming up next Wednesday (11/9) • Setting of control signals, generally • You can pick up your graded hw • HW #4 (chapter 5) is posted on class webpage What’s different from SC: • As usual, due Friday (11/11) at 5pm. • Must handle intermediate registers • Try to finish it before the mid-term. • New concept: states • I will be out of town next Monday (11/7) • Need way to get from one state to another • Christophe will be here to teach the class •Traditional FSM Controller • Mid-term review, problem solving, and questions •Jump table • Additional office hours before the test • • Wednesday (11/9) 11am-1pm

End of Midterm Material Microprogramming

sequencer datapath control For the midterm, you need to know up to this Inputs control slide µ-Code ROM • Know: microinstruction (µ) •Single cycle datapath/control •Multiple cycle datapath micro-PC µ-sequencer: Decode Decode fetch,dispatch, •Multiple cycle control, traditional FSM sequential To DataPath •Multiple cycle control, micro-sequencing Opcode Dispatch – This is actually in the CD (section 5.7 of the book) ROM • Reading: Up to and including Ch. 5 Microprogramming is a fundamental concept • implement an instruction set by building a very simple processor and interpreting the instructions • essential for very complex instructions and when few register transfers are possible • overkill when ISA matches datapath 1-1

Microprogramming “Macroinstruction” Interpretation

Microprogramming is a convenient method for implementing User program plus Data structured control state diagrams: Main ADD Memory Random logic replaced by microPC sequencer and ROM SUB this can change! • AND . • Each line of ROM called a µinstruction: . contains sequencer control + values for control points . one of these is • limited state transitions: (jump table) DATA mapped into one of these branch to zero, next sequential, execution branch to µinstruction address from dispatch ROM unit Horizontal µCode: one control bit in µInstruction for every CPU control AND microsequence control line in datapath (like what we’ve done before) memory e.g., Fetch Instr Vertical µCode: groups of control-lines coded together in Fetch Operand(s) Calculate OR µInstruction (e.g. possible ALU dest) (new!) Save Answer(s) Control design reduces to Microprogramming • Part of the design is to develop a “language” that describes control and is easy for humans to understand

4 Designing a Microinstruction Set Again: Alternative multicycle datapath (book) 1) Start with list of control signals Goal here: Horizontal -> Vertical Microcode 2) Group signals together that make sense (vs. PCWr PCWrCond PCSrc Zero

random): called “fields” IorD MemWr IRWr RegDst RegWr ALUSelA 1 Mux 32 32 3) Place fields in some logical order PC 0 0 32 Mux Zero (e.g., ALU operation & ALU operands first and Reg Instruction Rs 0 Ra 32 Mux RAdr 5 32 ALU Rt Out ALU microinstruction sequencing last) 32 Rb busA A 1 32 Ideal 5 32

e Data Reg Mem Reg File 1 0

Memory Rt Mux 4 0 4) To minimize the width, encode operations that Rw 32 WrAdr 32 B 32 Rd 1 32 Din Dout busW busB 32 will never be used at the same time - vertical 1 2 32 ALU 1 Mux 0 3 5) Create a symbolic legend for the microinstruction << 2 Control format, showing name of field values and how they set the control signals Extend Imm • Use computers to design computers 16 32 ALUOp ExtOp MemtoReg ALUSelB

1&2) Start with list of control signals, grouped into fields Compression of Fields Signal name Effect when deasserted Effect when asserted ALUSelA 1st ALU operand = PC 1st ALU operand = Reg[rs] Only look at writeback control signals: RegWrite None Reg. is written RegDst RegWr MemtoReg Reg. write data input = ALU Reg. write data input = memory • RegDst: Rt or Rd RegDst Reg. dest. no. = rt Reg. dest. no. = rd MemRead None Memory at address is read, • RegWr: Do we write? MDR <= Mem[addr] Ra MemWrite None Memory at address is written • MemToReg: WB from mem or ALU? Rb busA IorD Memory address = PC Memory address = S Reg File e Data Reg Mem 0 So this takes 3 bits! HORIZONTAL UCODE Rt Mux IRWrite None IR <= Memory • Rw PCWrite None PC <= PCSource

Single Bit Control Rd busW busB PCWriteCond None IF ALUzero then PC <= PCSource But what are our possibilities? 1 PCSource PCSource = ALU PCSource = ALUout 1 Mux 0 ExtOp Zero Extended Sign Extended • RegWr = 0, RegDst = MemToReg = X (00) Signal name Value Effect • RegWr = 1, RegDst = Rt, MemToReg = 1 (01) ALUOp 00 ALU adds From ALUOut 01 ALU subtracts • RegWr = 1, RegDst = Rt, MemToReg = 0 (10) 10 ALU does function code 11 ALU does logical OR • RegWr = 1, RegDst = Rd, MemToReg = 0 (11) MemtoReg ALUSelB 00 2nd ALU input = 4 01 2nd ALU input = Reg[rt] • Only 2 bits! VERTICAL UCODE 10 2nd ALU input = extended,shift left 2 11 2nd ALU input = extended Multiple Bit Control Bit Multiple

3&4) Microinstruction Format: unencoded vs. encoded fields 5) Legend of Fields and Symbolic Names

Field Name Width Control Signals Set Field Name Values for Field Function of Field with Specific Value ALU Add ALU adds wide narrow Subt. ALU subtracts Func code ALU does function code ALU Control 4 2 ALUOp Or ALU does logical OR SRC1 2 1 ALUSelA SRC1 PC 1st ALU input = PC rs 1st ALU input = Reg[rs] SRC2 5 3 ALUSelB, ExtOp SRC2 4 2nd ALU input = 4 Extend 2nd ALU input = sign ext. IR[15-0] ALU Destination 3 2 RegWrite, MemtoReg, RegDst Extend0 2nd ALU input = zero ext. IR[15-0] Memory 3 2 MemRead, MemWrite, IorD Extshft 2nd ALU input = sign ex., sl IR[15-0] rt 2nd ALU input = Reg[rt] Memory Register 1 1 IRWrite destination rd ALU Reg[rd] = ALUout rt ALU Reg[rt] = ALUout PCWrite Control 3 2 PCWrite, PCWriteCond, PCSource rt Mem Reg[rt] = Mem Sequencing 3 2 AddrCtl Memory Read PC Read memory using PC Read ALU Read memory using ALUout for addr Total width 24 15 bits Write ALU Write memory using ALUout for addr Memory register IR IR = Mem PC write ALU PC = ALU ALUoutCond IF ALU Zero then PC = ALUout Sequencing Seq Go to sequential µinstruction Fetch Go to the first microinstruction Dispatch Dispatch using ROM.

5 On your own time: what do these fieldnames mean? Horizontal vs. Vertical Microcode

Destination: Horizontal Vertical Code Name RegWrite MemToReg RegDest • Larger size • Smaller size 00 --- 0 X X 01 rd ALU 1 0 1 • More straightforward • Requires recoding 10 rt ALU 1 0 0 • Possibly more general • More restrictive 11 rt MEM 1 1 0

SRC2: Code Name ALUSelB ExtOp 000 --- X X 001 4 00 X 010 rt 01 X 011 ExtShft 10 1 100 Extend 11 1 111 Extend0 11 0

Specific Sequencer (from before) Legacy Software and Microprogramming

Sequencer-based from earlier IBM bet company on 360 Instruction Set Architecture (ISA): • Called “microPC” or “µPC” vs. state register single instruction set for many classes of machines 1 Code Name Effect • (8-bit to 64-bit) 00 fetch Next µaddress = 0 01 dispatch Next µaddress = dispatch ROM Adder microPC Stuart Tucker stuck with job of what to do about software 10 seq Next µaddress = µaddress + 1 compatibility Mux • If microprogramming could easily do same instruction set on many ROM: 2 1 0 ROM: 0 different , then why couldn’t multiple R-type 000000 0100 µAddress ROM microprograms do multiple instruction sets on the same BEQ 000100 0011 Select ? ori 001101 0110 Logic LW 100011 1000 • Coined term “emulation”: instruction set interpreter in microcode SW 101011 1011 Opcode for non-native instruction set • Very successful: in early years of IBM 360 it was hard to know whether old instruction set or new instruction set was more frequently used

Microprogramming in IBM 360 VLSI & Microprogramming

M30 M40 M50 M65 By late seventies • technology assumption about ROM & RAM speed became Datapath width (bits) 8 16 32 64 invalid µinst width (bits) 50 52 85 87 • micromachines became more complicated µcode size (K µinsts) 4 4 2.75 2.75 • to overcome slower ROM, micromachines were pipelined • complex instruction sets led to the need for subroutine and call µstore technology CCROS TCROS BCROS BCROS stacks in ucode µstore cycle (ns) 750 625 500 200 • need for fixing bugs in control programs was in conflict with read-only nature of uROM memory cycle (ns) 1500 2500 2000 750 • VAX instruction set had 400-500 kb of ! • introduction of caches and buffers, especially for Rental fee ($K/month) 4 7 15 35 instructions, made multiple-cycle execution of reg-reg instructions unattractive Only fastest models (75 and 95) were hardwired

6 Modern Usage Microprogramming Pros and Cons Microprogramming is far from extinct Ease of design Played a crucial role in micros of the Eighties Flexibility • Easy to adapt to changes in organization, timing, technology • Motorola 68K series • Can make changes late in design cycle, or even in the field • Intel 386 and 486 Can implement very powerful instructioninstruction sets (just more Microcode is present in most modern CISC micros in control memory) an assisting role (e.g. AMD Athlon, Intel Pentium-4) Generality • Most instructions are executed directly, i.e., with hard-wired • Can implement multiple instruction sets on same machine. control • Can tailor instruction set to application. • Infrequently-used and/or complicated instructions invoke the Compatibility microcode engine • Many organizations, same instruction set Patchable microcode common for post-fabrication Costly to implement bug fixes, e.g. Intel Pentiums load mcode patches bug fixes, e.g. Intel Pentiums load mcode patches Slow at bootup

Thought: Microprogramming one inspiration for RISC Overview of Control

If simple instruction could execute at very high Control may be designed using one of several initial representations. The … choice of sequence control, and how logic is represented, can then be determined independently; the control can then be implemented with If you could even write compilers to produce one of several methods using a structured logic technique. microinstructions… If most programs use simple instructions and addressing Initial Representation Finite State Diagram Microprogram modes… If microcode is kept in RAM instead of ROM so as to fix bugs … Sequencing Control Explicit Next State Microprogram counter If same memory used for control memory could be used Function + Dispatch ROMs instead as for “macroinstructions”… Then why not skip instruction interpretation by a Logic Representation Logic Equations Truth Tables microprogram and simply compile directly into lowest language of machine? (microprogramming is overkill when ISA matches datapath 1-1) Implementation PLA ROM Technique “hardwired control” “microprogrammed control”

Summary (1 of 3) Summary (cont’d) (2 of 3) Disadvantages of the Single Cycle Processor Control is specified by finite state diagram • Long cycle time Specialize state-diagrams easily captured by microsequencer • Cycle time is too long for all instructions except the • simple increment & “branch” fields Load • datapath control fields Multiple Cycle Processor: Control design reduces to Microprogramming • Divide the instructions into smaller steps Control is more complicated with: • complex instruction sets Execute each step (instead of the entire instruction) • • restricted datapaths (see the book) in one cycle Simple Instruction set and powerful datapath ⇒ simple Partition datapath into equal size chunks to control minimize cycle time • could try to reduce hardware (see the book) • ~10 levels of logic between latches • rather go for speed => many instructions at once!

7 Summary (3 of 3) Where to get more information? Microprogramming is a fundamental concept Multiple Cycle Controller: Appendix C of your • implement an instruction set by building a very simple processor and interpreting the instructions text book. • essential for very complex instructions and when few register transfers are possible Microprogramming: Section 5.5 of your text • Control design reduces to Microprogramming book. Design of a Microprogramming language D. Patterson, “Microprogramming”, Scientific • Start with list of control signals American, March 1983. • Group signals together that make sense (vs. random): called “fields” • Place fields in some logical order (e.g., ALU operation & ALU operands first D. Patterson and D. Ditzel, “The Case for the and microinstruction sequencing last) Reduced Instruction Set Computer,” • To minimize the width, encode operations that will never be used at the same time Computer Architecture News 8, 6 (October • Create a symbolic legend for the microinstruction format, showing name of 15, 1980) field values and how they set the control signals

8