<<

ELEC 5200-001/6200-001 and Design Fall 2013 Datapath and Control (Chapter 4)

Vishwani D. Agrawal & Victor P. Nelson Department of Electrical and Computer Engineering Auburn University, Auburn, AL 36849

Fall 2013, . . . ELEC 5200-001/6200-001 Lecture 5 1 Von Neumann Kitchen ALU Start

Control My choice Registers PC

Program

Data Input Memory Output

Fall 2013, . . . ELEC 5200-001/6200-001 Lecture 5 2 Where Does It All Begin?

In a register called program (PC). PC contains the memory address of the next instruction to be executed. In the beginning, PC contains the address of the memory location where the program begins.

Fall 2013, . . . ELEC 5200-001/6200-001 Lecture 5 3 Where is the Program?

Processor Memory

Program counter (register)

Machine code of program Start address

Fall 2013, . . . ELEC 5200-001/6200-001 Lecture 5 4 How Does It Run? Start PC has memory address where program begins

Fetch instruction word from memory address in PC and increment PC ← PC + 4 to point to next instruction

Decode instruction

Execute instruction

Save result in register or memory

No Yes Program STOP complete?

Fall 2013, . . . ELEC 5200-001/6200-001 Lecture 5 5 Datapath and Control Datapath: Memory, registers, adders, ALU, and communication buses. Each step (fetch, decode, execute, save result) requires communication (data transfer) paths between memory, registers and ALU. Control: Datapath for each step is set up by control signals that set up dataflow directions on communication buses and select ALU and memory functions. Control signals are generated by a consisting of one or more finite- state machines.

Fall 2013, . . . ELEC 5200-001/6200-001 Lecture 5 6 Datapath for Instruction Fetch

Add

4

Instruction Instruction word to PC Address Memory control unit and registers

Fall 2013, . . . ELEC 5200-001/6200-001 Lecture 5 7 : A Datapath Component

5 reg 1 32 Read reg 1 data registers 5 reg 2 32 Registers (reg. file) Write 5 register 32 reg 2 data Write data 32

RegWrite from control

Fall 2013, . . . ELEC 5200-001/6200-001 Lecture 5 8 Multi-Operation ALU

Operation select Operation select ALU function from control 000 AND 001 OR 3 010 Add zero 110 Subtract ALU 111 Set on less than result overflow

zero = 1, when all bits of result are 0

Fall 2013, . . . ELEC 5200-001/6200-001 Lecture 5 9 R-Type Instructions Also known as arithmetic-logical instructions add, sub, slt Example: add $t0, $s1, $s2 – Machine instruction word 000000 10001 10010 01000 00000 100000 opcode $s1 $s2 $t0 function – Read two registers – Write one register – Opcode and function code go to control unit that generates RegWrite and ALU operation code.

Fall 2013, . . . ELEC 5200-001/6200-001 Lecture 5 10 Datapath for R-Type Instruction 000000 10001 10010 01000 00000 100000 Operation opcode $s1 $s2 $t0 function (add) select from control (add) 5 Read $s1 10001 3 register 32 numbers 5 zero $s2 32 Registers ALU result 10010 (reg. file) 32 Write reg. 5 overflow $t0 number 01000 Write data 32 RegWrite from control activated

Fall 2013, . . . ELEC 5200-001/6200-001 Lecture 5 11 Load and Store Instructions I-type instructions lw $t0, 1200 ($t1) # incr. in bytes 100011 01001 01000 0000 0100 1011 0000 opcode $t1 $t0 1200

sw $t0, 1200 ($t1) # incr. in bytes 101011 01001 01000 0000 0100 1011 0000 opcode $t1 $t0 1200

Fall 2013, . . . ELEC 5200-001/6200-001 Lecture 5 12 Datapath for lw Instruction 100011 01001 01000 0000 0100 1011 0000 opcode $t1 $t0 1200 MemWrite Operation select 5 from control Read $t1 01001 (add) register 32 3 Read numbers data 5 zero 32 Registers ALU result (reg. file) Addr. Data overflow memory Write reg. 5 $t0 number 01000 Write Write data data 32 32 RegWrite from control MemRead activated Sign activated extend 16 mem. data to $t0 0000 0100 1011 0000 0000 01001011 Fall 2013, . . . ELEC 5200-001/6200-001 Lecture 5 13 Datapath for sw Instruction

101011 01001 01000 0000 0100 1011 0000 MemWrite opcode $t1 $t0 1200 activated Operation select 5 Read $t1 from control 01001 register 32 3 (add) Read numbers zero data 5 $t0 32 Registers ALU result 01000 (reg. file) Addr. Data Write reg. 5 overflow memory number 32 Write Write data $t0 data to mem. data 32 32

RegWrite MemRead from control Sign extend 16 0000 0100 1011 0000 0000 01001011 Fall 2013, . . . ELEC 5200-001/6200-001 Lecture 5 14 Branch Instruction (I-Type)

beq $s1, $s2, 25 # if $s1 = $s2, advance PC through 25 instructions

16-bits 000100 10001 10010 0000 0000 0001 1001 opcode $s1 $s2 25

Note: Can branch within ± 215 words from the current instruction address in PC.

Fall 2013, . . . ELEC 5200-001/6200-001 Lecture 5 15 Datapath for beq Instruction 16-bits 000100 10001 10010 0000 0000 0001 1001 opcode $s1 $s2 25 Operation 5 select Read $s1 10001 from control (subtract) register numbers 32 3 5 $s2 32 Registers To branch

10010 zero (reg. file) control ALU 5 result logic Write reg. 32 number overflow Write data

PC+4 Branch 32 RegWrite From instruction 32 target from control fetch datapath Add 16 Sign Shift 32 extend left 2 32 32 0000 0000 0001 1001 0000 0000

Fall 2013, . . . ELEC 5200-001/6200-001 Lecture 5 16 J-Type Instruction

j 2500 # jump to instruction 2,500 26-bits 000010 0000 0000 0000 0010 0111 0001 00 opcode 2,500

32-bit jump address 0000 0000 0000 0000 0010 0111 0001 0000 bits 28-31 from PC+4

Fall 2013, . . . ELEC 5200-001/6200-001 Lecture 5 17 Datapath for Jump Instruction

Branch Jump Branch 4 addr. 1 32 32 Add 32 mux 0 0 mux PC+4 1 4 28 Shift 32 left 2 26 opcode (bits 26-31) 32 Instruction PC to control Address Memory 32 6 Instruction word to control and

Fall 2013, . . . ELEC 5200-001/6200-001 Lecture 5 registers 18 Jump 0-25 Shift left 2

4 0 mux 1 0 mux

Add

Branch 0 1 mux opcode ALU MemtoReg 26-31 RegDst CONTROL

21-25 zero MemWrite Instr. MemRead PC mem. 16-20 ALU

Reg.File Data 1 mux 0 1 mux

11-15 1 0 mux Combined 0 1 mux mem. Datapaths ALU Sign Shift Cont. 0-15 ext. left 2 0-5

Fall 2013, . . . ELEC 5200-001/6200-001 Lecture 5 19 Control RegDst Jump Branch MemRead Instruction Control MemtoReg Logic bits 26-31 ALUOp opcode MemWrite ALUSrc RegWrite

2 Instruction ALU to ALU bits 0-5 Control funct.

Fall 2013, . . . ELEC 5200-001/6200-001 Lecture 5 20 Control Logic: Truth Table

Inputs: instr. opcode bits Outputs: control signals MemtoReg MemRead MemWrite RegWrite ALOOp1 ALUOp2 ALUSrc RegDst Instr Branch Jump type 31 30 29 28 27 26

R 0 0 0 0 0 0 1 0 0 0 1 0 0 0 1 0

lw 1 0 0 0 1 1 0 0 1 1 1 1 0 0 0 0

sw 1 0 1 0 1 1 X 0 1 X 0 0 1 0 0 0

beq 0 0 0 1 0 0 X 0 0 X 0 0 0 1 0 1

j 0 0 0 0 1 0 X 1 X X 0 X 0 X X X

Fall 2013, . . . ELEC 5200-001/6200-001 Lecture 5 21 How Long Does It Take? Assume control logic is fast and does not affect the critical timing. Major time delay components are ALU, memory read/write, and register read/write. Arithmetic-type (R-type) Fetch (memory read) 2ns Register read 1ns ALU operation 2ns Register write 1ns Total 6ns

Fall 2013, . . . ELEC 5200-001/6200-001 Lecture 5 22 Time for lw and sw (I-Types) ALU (R-type) 6ns Load word (I-type) – Fetch (memory read) 2ns – Register read 1ns – ALU operation 2ns – Get data (mem. Read) 2ns – Register write 1ns – Total 8ns Store word (no register write) 7ns

Fall 2013, . . . ELEC 5200-001/6200-001 Lecture 5 23 Time for beq (I-Type)

ALU (R-type) 6ns Load word (I-type) 8ns Store word (I-type) 7ns Branch on equal (I-type) – Fetch (memory read) 2ns – Register read 1ns – ALU operation 2ns – Total 5ns

Fall 2013, . . . ELEC 5200-001/6200-001 Lecture 5 24 Time for Jump (J-Type)

ALU (R-type) 6ns Load word (I-type) 8ns Store word (I-type) 7ns Branch on equal (I-type) 5ns Jump (J-type) – Fetch (memory read) 2ns – Total 2ns

Fall 2013, . . . ELEC 5200-001/6200-001 Lecture 5 25 How Fast Can Clock Run? If every instruction is executed in one clock cycle, then: – Clock period must be at least 8ns to perform the longest instruction, i.e., lw. – This is a single cycle machine. – It is slower because many instructions take less than 8ns but are still allowed that much time. Method of speeding up: Use multicycle datapath.

Fall 2013, . . . ELEC 5200-001/6200-001 Lecture 5 26 A Single Cycle Example a31 Delay of 1-bit full = 1ns Clock period ≥ 32ns . c32 . 1-b full adder . s31 a2 . a1 1-b full . a0 adder . s2 b31 1-b full s1 . adder s0 . 1-b full . adder b2 b1 Time of adding words ~ 32ns b0 Time of adding bytes ~ 32ns 0

Fall 2013, . . . ELEC 5200-001/6200-001 Lecture 5 27 A Multicycle Implementation

a31 Delay of 1-bit full adder = 1ns . Clock period ≥ 1ns . . Time of adding words ~ 32ns Shift a2 Time of adding bytes ~ 8ns a1 a0 1-b full adder b31 s31 . .

c32 . FF .

. . Shift Shift b2 Initialize s2 b1 to 0 s1 b0 s0

Fall 2013, . . . ELEC 5200-001/6200-001 Lecture 5 28 Jump 0-25 Shift left 2

4 0 mux 1 0 mux

Add

Branch 0 1 mux opcode ALU MemtoReg 26-31 RegDst CONTROL

21-25 zero MemWrite Instr. MemRead PC mem. 16-20 ALU

Reg.File Data 1 mux 0 1 mux

11-15 1 0 mux Single-cycle 0 1 mux mem. Datapaths ALU Sign Shift Cont. 0-15 ext. left 2 0-5

Fall 2013, . . . ELEC 5200-001/6200-001 Lecture 5 29 Multicycle Datapath

PC Addr. AReg. Instr. reg. (IR) Instr.

Data 4 ALU Registerfile Memory

ALUOut Reg. ALUOut BReg. Mem. (MDR) Data Mem.

One-cycle data transfer paths (need registers to hold data)

Fall 2013, . . . ELEC 5200-001/6200-001 Lecture 5 30 Multicycle Datapath Requirements

Only one ALU, since it can be reused. Single memory for instructions and data. Five registers added: – (IR) – Memory data register (MDR) – Three ALU registers, A and B for inputs and ALUOut for output

Fall 2013, . . . ELEC 5200-001/6200-001 Lecture 5 31 Multicycle Datapath PCSource

Shift 26-31 to etc. 0-25 left 2 Control RegWrite PCWrite 21-25 FSM 16-20 28-31

PC 15 AReg. - Addr. 11 Instr. reg. (IR) Instr. IorD ALUSrcA ALUSrcB Data ALU Registerfile Memory

out Reg. ALUOut RegDst BReg. MUX control IRWrite 4

in1 in2 (MDR) Data Mem.

Sign Shift MemRead MemtoReg 0-15 extend left 2 MemWrite 0-5 ALU ALUOp control

Fall 2013, . . . ELEC 5200-001/6200-001 Lecture 5 32 3 to 5 Cycles for an Instruction Step R-type Mem. Ref. Branch type J-type (4 cycles) (4 or 5 cycles) (3 cycles) (3 cycles) Instruction fetch IR ← Memory[PC]; PC ← PC+4 Instr. decode/ A ← Reg(IR[21-25]); B ← Reg(IR[16-20]) Reg. fetch ALUOut ← PC + (sign extend IR[0-15]) << 2 Execution, ALUOut ← ALUOut ← If (A= =B) PC←PC[28- addr. Comp., A op B A+sign extend then 31] branch & jump (IR[0-15]) PC←ALUOut || completion (IR[0-25]<<2) Mem. Access Reg(IR[11- MDR←M[ALUout] or R-type 15]) ← or M[ALUOut]←B completion ALUOut Memory read Reg(IR[16-20]) ← completion MDR

Fall 2013, . . . ELEC 5200-001/6200-001 Lecture 5 33 Cycle 1 of 5: Instruction Fetch (IF) Read instruction into IR, M[PC] → IR Control signals used: IorD = 0 select PC MemRead = 1 read memory IRWrite = 1 write IR Increment PC, PC + 4 → PC Control signals used: ALUSrcA = 0 select PC into ALU ALUSrcB = 01 select constant 4 ALUOp = 00 ALU adds PCSource = 00 select ALU output PCWrite = 1 write PC

Fall 2013, . . . ELEC 5200-001/6200-001 Lecture 5 34 Cycle 2 of 5: Instruction Decode (ID) 31-26 25-21 20-16 15-11 10-6 5-0 R opcode | reg 1 | reg 2 | reg 3 | shamt | fncode

I opcode | reg 1 | reg 2 | word address increment J opcode | word address jump Control unit decodes instruction Datapath prepares for execution R and I types, reg 1→ A reg, reg 2 → B reg No control signals needed Branch type, compute branch address in ALUOut ALUSrcA = 0 select PC into ALU ALUSrcB = 11 Instr. Bits 0-15 shift 2 into ALU ALUOp = 00 ALU adds

Fall 2013, . . . ELEC 5200-001/6200-001 Lecture 5 35 Cycle 3 of 5: Execute (EX)

R type: execute function on reg A and reg B, result in ALUOut Control signals used: ALUSrcA = 1 A reg into ALU ALUsrcB = 00 B reg into ALU ALUOp = 10 instr. Bits 0-5 control ALU I type, lw or sw: compute memory address in ALUOut ← A reg + sign extend IR[0-15] Control signals used: ALUSrcA = 1 A reg into ALU ALUSrcB = 10 Instr. Bits 0-15 into ALU ALUOp = 00 ALU adds

Fall 2013, . . . ELEC 5200-001/6200-001 Lecture 5 36 Cycle 3 of 5: Execute (EX) I type, beq: subtract reg A and reg B, write ALUOut to PC Control signals used: ALUSrcA = 1 A reg into ALU ALUsrcB = 00 B reg into ALU ALUOp = 01 ALU subtracts If zero = 1, PCSource = 01 ALUOut to PC If zero = 1, PCwriteCond =1 write PC Instruction complete, go to IF J type: write jump address to PC ← IR[0-25] shift 2 and four leading bits of PC Control signals used: PCSource = 10 PCWrite = 1 write PC Instruction complete, go to IF

Fall 2013, . . . ELEC 5200-001/6200-001 Lecture 5 37 Cycle 4 of 5: Reg Write/Memory R type, write destination register from ALUOut Control signals used: RegDst = 1 Instr. Bits 11-15 specify reg. MemtoReg = 0 ALUOut into reg. RegWrite = 1 write register Instruction complete, go to IF I type, lw: read M[ALUOut] into MDR Control signals used: IorD = 1 select ALUOut as mem adr. MemRead = 1 read memory to MDR I type, sw: write M[ALUOut] from B reg Control signals used: IorD = 1 select ALUOut as mem adr. MemWrite = 1 write memory Instruction complete, go to IF

Fall 2013, . . . ELEC 5200-001/6200-001 Lecture 5 38 Cycle 5 of 5: Reg Write

I type, lw: write MDR to reg[IR(16-20)] Control signals used: RegDst = 0 instr. Bits 16-20 are write reg MemtoReg = 1 MDR to reg file write input RegWrite = 1 read memory to MDR Instruction complete, go to IF

For an alternative method of designing datapath, see N. Tredennick, Logic Design, the Flowchart Method, Digital Press, 1987.

Fall 2013, . . . ELEC 5200-001/6200-001 Lecture 5 39 1-bit Control Signals Signal name Value = 0 Value =1 RegDst Write reg. # = bit 16-20 Write reg. # = bit 11-15 RegWrite No action Write reg. ← Write data ALUSrcA First ALU Operand ← PC First ALU Operand←Reg. A MemRead No action Mem.Data Output←M[Addr.] MemWrite No action M[Addr.]←Mem. Data Input MemtoReg Reg.File Write In←ALUOut Reg.File Write In←MDR IorD Mem. Addr. ← PC Mem. Addr. ← ALUOut IRWrite No action IR ← Mem.Data Output PCWrite No action PC is written PCWriteCond No action PC is written if zero(ALU)=1

zero(ALU) PCWriteCond PCWrite etc. PCWrite

Fall 2013, . . . ELEC 5200-001/6200-001 Lecture 5 40 2-bit Control Signals Signal name Value Action 00 ALU performs add ALUOp 01 ALU performs subtract 10 Funct. field (0-5 bits of IR ) determines ALU operation 00 Second input of ALU ← B reg. ALUSrcB 01 Second input of ALU ← 4 (constant) 10 Second input of ALU ← 0-15 bits of IR sign ext. to 32b 11 Second input of ALU ← 0-15 bits of IR sign ext. and left shift 2 bits 00 ALU output (PC +4) sent to PC PCSource 01 ALUOut (branch target addr.) sent to PC 10 Jump address IR[0-25] shifted left 2 bits, concatenated with PC+4[28-31], sent to PC

Fall 2013, . . . ELEC 5200-001/6200-001 Lecture 5 41 Control: Finite State Machine

Start

Clock State 0 cycle 1 Instruction fetch State 1 Clock cycle 2 Instruction decode and register fetch

FSM-M Clock Memory FSM-R FSM-B FSM-J cycles access R-type Branch Jump 3-5 instr. instr. instr. instr.

Fall 2013, . . . ELEC 5200-001/6200-001 Lecture 5 42 State 0: Instruction Fetch (CC1) PCSource=00

Shift

26-31 to 0-25 left 2 etc.=1 Control RegWrite PCWrite 21-25 FSM 16-20 28-31

PC AReg. Addr. Instr. reg. (IR) Instr. IorD=0 ALU ALU ALUSrcA=0 Data ALUSrcB=01 Registerfile Memory

out Reg. ALUOut RegDst BReg. MUX control IRWrite 4 Add =1

in1 in2 (MDR) Data Mem.

MemRead = 1 Sign Shift MemtoReg 0-15 extend left 2 MemWrite ALUOp 0-5 ALU control =00

Fall 2013, . . . ELEC 5200-001/6200-001 Lecture 5 43 State 0 Control FSM Outputs

Start State 1 Instruction decode/ State0 Register fetch/ Instruction Branch addr. fetch MemRead =1 ALUSrcA = 0 IorD = 0 IRWrite = 1 Outputs? ALUSrcB = 01 ALUOp = 00 PCWrite = 1 PCSource = 00

Fall 2013, . . . ELEC 5200-001/6200-001 Lecture 5 44 State 1: Instr. Decode/Reg. Fetch/ Branch Address (CC2) PCSource

Shift 26-31 to etc. 0-25 left 2 Control RegWrite PCWrite 21-25 FSM 16-20 28-31

PC AReg. Addr. Instr. reg. (IR) Instr. IorD ALUSrcA=0 ALU ALU

Data ALUSrcB=11 Registerfile Memory

out Reg. ALUOut RegDst BReg. MUX control IRWrite 4 Add

in1 in2 (MDR) Data Mem.

Sign Shift MemRead MemtoReg 0-15 extend left 2 MemWrite ALUOp 0-5 ALU control = 00

Fall 2013, . . . ELEC 5200-001/6200-001 Lecture 5 45 State 1 Control FSM Outputs Start State 1 Instruction decode (ID) / State0 Register fetch / Instruction Branch addr. fetch MemRead =1 (IF) ALUSrcA = 0 IorD = 0 ALUSrcA = 0 IRWrite = 1 ALUSrcB = 11 ALUSrcB = 01 ALUOp = 00 ALUOp = 00 PCWrite = 1 PCSource = 00

Opcode Opcode = J-type = BEQ

FSM-M FSM-R FSM-B FSM-J

Fall 2013, . . . ELEC 5200-001/6200-001 Lecture 5 46 State 1 (Opcode = lw) → FSM-M (CC3-5)

CC4 PCSource

Shift 26-31 to etc. 0-25 left 2 Control RegWrite=1 PCWrite 21-25 FSM 16-20 28-31

CC3

PC AReg. Addr. Instr. reg. (IR) Instr. IorD=1 ALUSrcA=1 ALU ALU

Data ALUSrcB=10 Registerfile Memory

out Reg. ALUOut

CC5 BReg. MUX control IRWrite 4 Add RegDst=0

in1 in2 (MDR) Data Mem.

Sign Shift MemRead=1 MemtoReg=1 0-15 extend left 2 MemWrite ALUOp 0-5 ALU control = 00

Fall 2013, . . . ELEC 5200-001/6200-001 Lecture 5 47 State 1 (Opcode= sw)→FSM-M (CC3-4)

CC4 PCSource

Shift 26-31 to etc. 0-25 left 2 Control RegWrite PCWrite 21-25 FSM 16-20 28-31

CC3

PC AReg. Addr. Instr. reg. (IR) Instr. IorD=1 ALUSrcA=1 ALU ALU

Data ALUSrcB=10 Registerfile Memory

out Reg. ALUOut RegDst=0 BReg. MUX control IRWrite 4 Add

in1 in2 CC4 (MDR) Data Mem.

Sign Shift MemRead MemtoReg 0-15 extend left 2 MemWrite=1 ALUOp 0-5 ALU control = 00

Fall 2013, . . . ELEC 5200-001/6200-001 Lecture 5 48 FSM-M (Memory Access)

From state 1 Opcode = “lw” or “sw”

Compute mem addrress ALUSrcA =1 Opcode ALUSrcB = 10 Opcode ALUOp = 00 Read = “lw” = “sw” Write Memory data memory

MemRead = 1 IorD = 1 MemWrite = 1 IorD = 1 Write register

RegWrite = 1 To state 0 MemtoReg = 1 (Instr. Fetch) RegDst = 0

Fall 2013, . . . ELEC 5200-001/6200-001 Lecture 5 49 State 1(Opcode=R-type)→FSM-R (CC3-4)

PCSource

Shift 26-31 to etc. 0-25 left 2 Control RegWrite PCWrite 21-25 FSM 16-20 28-31

PC AReg.

Addr. CC3 11-15 Instr. reg. (IR) Instr. IorD ALUSrcA=1 ALU ALU

Data ALUSrcB=00 Registerfile Memory

out Reg. ALUOut RegDst=0 BReg. MUX control IRWrite 4 CC4 “funct. code”

in1 in2 (MDR) Data Mem.

Sign Shift MemRead MemtoReg=0 0-15 extend left 2 MemWrite ALUOp 0-5 ALU control = 10

Fall 2013, . . . ELEC 5200-001/6200-001 Lecture 5 50 FSM-R (R-type Instruction)

From state 1 Opcode = R-type

ALU operation

ALUSrcA =1 ALUSrcB = 00 ALUOp = 10

Write register To state 0 RegWrite = 1 (Instr. Fetch) MemtoReg = 0 RegDst = 1

Fall 2013, . . . ELEC 5200-001/6200-001 Lecture 5 51 State 1 (Opcode = beq ) → FSM-B (CC3) If(zero) PCSource 01

Shift

26-31 to 0-25 left 2 etc.=1 Control 21-25 RegWrite PCWrite FSM 16-20 28-31

PC AReg.

Addr. CC3 11-15 Instr. reg. (IR) Instr. IorD

ALUSrcA=1 ALU ALU

Data ALUSrcB=00 Registerfile Memory zero

out Reg. ALUOut MUX control RegDst BReg. IRWrite 4 subtract

in1 in2 (MDR) Data Mem.

Sign Shift MemRead MemtoReg 0-15 extend left 2 MemWrite ALUOp 0-5 ALU control = 01

Fall 2013, . . . ELEC 5200-001/6200-001 Lecture 5 52 Write PC on “zero”

zero=1

PCWriteCond=1 PCWrite etc.=1 PCWrite

PCWrite = 1, unconditionally write PC Cycle 1, fetch Cycle 3, jump

Fall 2013, . . . ELEC 5200-001/6200-001 Lecture 5 53 FSM-B (Branch)

From state 1 Opcode = “beq”

Write PC on branch condition Branch condition: If A – B=0 ALUSrcA =1 zero = 1 ALUSrcB = 00 ALUOp = 01 PCWriteCond=1 PCSource=01

To state 0 (Instr. Fetch)

Fall 2013, . . . ELEC 5200-001/6200-001 Lecture 5 54 State 1 (Opcode = j) → FSM-J (CC3)

CC3 PCSource 10

Shift 26-31 to etc. 0-25 left 2 Control RegWrite PCWrite 21-25 FSM 16-20 28-31

PC

AReg. Addr. 11-15 Instr. reg. (IR) Instr. IorD

ALUSrcA ALU ALU

Data ALUSrcB Registerfile Memory zero

out Reg. ALUOut RegDst BReg. MUX control IRWrite 4

in1 in2 (MDR) Data Mem.

Sign Shift MemRead MemtoReg 0-15 extend left 2 MemWrite ALUOp 0-5 ALU control

Fall 2013, . . . ELEC 5200-001/6200-001 Lecture 5 55 Write PC

zero

PCWriteCond PCWrite etc.=1 PCWrite=1 PCWrite = 1, unconditionally write PC Cycle 1, fetch Cycle 3, jump

Fall 2013, . . . ELEC 5200-001/6200-001 Lecture 5 56 FSM-J (Jump)

From state 1 Opcode = “jump”

Write jump addr. In PC

PCWrite=1 PCSource=10

To state 0 (Instr. Fetch)

Fall 2013, . . . ELEC 5200-001/6200-001 Lecture 5 57 Control FSM Start State 0 1 Instr. Instr. fetch/ decode/reg. adv. PC fetch/branch addr. R J 3 2 B

Read lw Compute ALU Write PC Write memory memory 6 operation on branch jump addr. data addr. condition to PC 8 sw 9 4 5 7 Write Write Write register memory register data

Fall 2013, . . . ELEC 5200-001/6200-001 Lecture 5 58 Control FSM (Controller)

6 inputs (opcode) 16 control Combinational outputs logic

Present Next state state

Reset Clock FF

FF

FF

FF

Fall 2013, . . . ELEC 5200-001/6200-001 Lecture 5 59 Designing the Control FSM

Encode states; need 4 bits for 10 states, e.g., – State 0 is 0000, state 1 is 0001, and so on. Write a truth table for :

Opcode Present state Control signals Next state 000000 0000 0001000110000100 0001 ......

Synthesize a logic circuit from the truth table. Connect four flip- between the next state outputs and present state inputs.

Fall 2013, . . . ELEC 5200-001/6200-001 Lecture 5 60 Block Diagram of a Processor MemWrite

ALUOp MemRead Controller ALU (Control FSM) 2-bits control

bits bits bits bits IorD [0,5] zero 6 - 3 - 2 - funct. 2 - ALUOp Opcode RegDst IRWrite PCWrite ALUSrcA RegWrite Overflow ALUSrcB PCSource MemtoReg PCWriteCond Reset

Clock Datapath Mem. Addr. (PC, register file, registers, ALU)

Mem. write data

Mem. data out

Fall 2013, . . . ELEC 5200-001/6200-001 Lecture 5 61 Exceptions or Interrupts Conditions under which the processor may produce incorrect result or may “hang”. – Illegal or undefined opcode. – Arithmetic overflow, divide by zero, etc. – Out of bounds memory address. EPC: 32-bit register holds the affected instruction address. Cause: 32-bit register holds an encoded exception type. For example, – 0 for undefined instruction – 1 for arithmetic overflow

Fall 2013, . . . ELEC 5200-001/6200-001 Lecture 5 62 Implementing Exceptions PCSource 11 8000 0180(hex)

26-31 to

etc.=1 Control PCWrite FSM

EPCWrite=1 PC

EPC Instr. reg. (IR) Instr. ALUSrcA=0 CauseWrite=1 ALU ALUSrcB=01

Overflow to out 0 Control FSM MUX control 4 Cause 1 in1 in2 Subtract 32-bit register

ALU ALUOp control =01

Fall 2013, . . . ELEC 5200-001/6200-001 Lecture 5 63 How Long Does It Take? Again Assume control logic is fast and does not affect the critical timing. Major time components are ALU, memory read/write, and register read/write. Time for hardware operations, suppose Memory read or write 2ns Register read 1ns ALU operation 2ns Register write 1ns

Fall 2013, . . . ELEC 5200-001/6200-001 Lecture 5 64 Single-Cycle Datapath

R-type 6ns Load word (I-type) 8ns Store word (I-type) 7ns Branch on equal (I-type) 5ns Jump (J-type) 2ns Clock cycle time = 8ns Each instruction takes one cycle

Fall 2013, . . . ELEC 5200-001/6200-001 Lecture 5 65 Performance Parameters

Average (CPI) Cycle time (clock period, T) Execution time of a program For single-cycle datapath: CPI = 1 T = hardware time for lw instruction Execution time of a program = T × CPI × (Total instructions executed)

Fall 2013, . . . ELEC 5200-001/6200-001 Lecture 5 66 Multicycle Datapath

Clock cycle time is determined by the longest operation, ALU or memory: Clock cycle time = 2ns Cycles per instruction: lw 5 (10ns) sw 4 (8ns) R-type 4 (8ns) beq 3 (6ns) j 3 (6ns)

Fall 2013, . . . ELEC 5200-001/6200-001 Lecture 5 67 CPI of a Multicycle Computer

∑k (Instructions of type k) × CPIk CPI = ——————————————

∑k (instructions of type k)

where

CPIk = Cycles for instruction of type k

Note: CPI is dependent on the instruction mix of the program being run. Standard benchmark programs are used for specifying the performance of CPUs.

Fall 2013, . . . ELEC 5200-001/6200-001 Lecture 5 68 Example Consider a program containing: loads 25% stores 10% branches 11% jumps 2% Arithmetic 52% CPI = 0.25×5 + 0.10×4 + 0.11×3 + 0.02×3 + 0.52×4 = 4.12 for multicycle datapath CPI = 1.00 for single-cycle datapath

Fall 2013, . . . ELEC 5200-001/6200-001 Lecture 5 69 Multicycle vs. Single-Cycle

Performance ratio = Single cycle time / Multicycle time

(CPI × cycle time) for single-cycle = ——————————————— (CPI × cycle time) for multicycle

1.00 × 8ns = ————— = 0.97 4.12 × 2ns

Single cycle is faster in this case, but is it always so?

Fall 2013, . . . ELEC 5200-001/6200-001 Lecture 5 70 Consider Another Example For this program: loads 5% stores 5% branches 30% jumps 10% Arithmetic 50% CPI = 0.05×5 + 0.05×4 + 0.30×3 + 0.10×3 + 0.50×4 = 3.65 for multicycle datapath CPI = 1.00 for single-cycle datapath

Fall 2013, . . . ELEC 5200-001/6200-001 Lecture 5 71 Multicycle vs. Single-Cycle

Performance ratio = Single cycle time / Multicycle time

(CPI × cycle time) for single-cycle = ——————————————— (CPI × cycle time) for multicycle

1.00 × 8ns = ————— = 1.096 3.65 × 2ns

Multicycle is faster in this case, so the performance ratio depends on the instruction mix. Can we do better?

Fall 2013, . . . ELEC 5200-001/6200-001 Lecture 5 72 Next: Pipelining

Fall 2013, . . . ELEC 5200-001/6200-001 Lecture 5 73