Computer Organization

- Computer design as an application of digital logic design procedures
- Computer = processing unit + memory system
- Processing unit = control + datapath
- Control = finite state machine
  - Inputs = machine instruction, datapath conditions
  - Outputs = register transfer control signals, ALU operation codes
  - Instruction interpretation = instruction fetch, decode, execute
- Datapath = functional units + registers
  - Functional units = ALU, multipliers, dividers, etc.
  - Registers = program counter, shifters, storage registers

Structure of a Computer

- Block diagram view

[Figure: block diagram. The central processing unit (CPU) contains Control (instruction fetch and interpretation FSM) and Data Path (functional units and registers), linked by control signals and data conditions. The CPU connects to the Memory System through address, read/write, and data lines.]

CS 150 - Spring 2001 - Computer Organization

Registers

- Selectively loaded: EN or LD input
- Output enable: OE input
- Multiple registers: group 4 or 8 in parallel

[Figure: 8-bit register with data inputs D7..D0, outputs Q7..Q0, and LD, OE, and CLK inputs.]
- OE asserted causes FF state to be connected to output pins; otherwise they are left unconnected (high impedance)
- LD asserted during a lo-to-hi clock transition loads new data into FFs

Register Transfer

- Point-to-point connection
  - Dedicated wires
  - Muxes on inputs of each register
- Common input from multiplexer
  - Load enables for each register
  - Control signals for multiplexer
- Common bus with output enables
  - Output enables and load enables for each register

[Figure: three ways of connecting registers rs, rt, rd, and R4: one MUX per register input, a single shared MUX, and a shared BUS.]


Register Files

- Collections of registers in one package
  - Two-dimensional array of FFs
  - Address used as index to a particular word
  - Separate read and write addresses, so a read and a write can happen at the same time
- 4 by 4 register file
  - 16 D-FFs
  - Organized as four words of four bits each
  - Write-enable (load)
  - Read-enable (output enable)

[Figure: 4 by 4 register file with inputs RE, RB, RA, WE, WB, WA, D3..D0 and outputs Q3..Q0.]

Memories

- Larger collections of storage elements
  - Implemented not as FFs but as much more efficient latches
  - High-density memories use 1-5 transistors per bit
- Static RAM: 1024 words, each 4 bits wide
  - Once written, memory holds its contents as long as power is applied (not true for denser dynamic RAM)
  - Address lines to select word (10 lines for 1024 words)
  - Read enable
    - Same as output enable
    - Often called chip select
    - Permits connection of many chips into larger array
  - Write enable (same as load enable)
  - Bi-directional data lines
    - Output when reading, input when writing

[Figure: static RAM chip with RD and WR inputs, address lines A9..A0, and bi-directional data lines IO3..IO0.]

Instruction Sequencing

- Example: an instruction to add the contents of two registers (Rx and Ry) and place the result in a third register (Rz)
- Step 1: Get the ADD instruction from memory into an instruction register (IR)
- Step 2: Decode instruction
  - Instruction in IR has the code of an ADD instruction
  - Register indices used to generate output enables for registers Rx and Ry
  - Register index used to generate load signal for register Rz
- Step 3: Execute instruction
  - Enable Rx and Ry output and direct to ALU
  - Set up ALU to perform ADD operation
  - Direct result to Rz so that it can be loaded into register

Instruction Types

- Data Manipulation
  - Add, subtract
  - Increment, decrement
  - Multiply
  - Shift, rotate
  - Immediate operands
- Data Staging
  - Load/store data to/from memory
  - Register-to-register move
- Control
  - Conditional/unconditional branches in program flow
  - Subroutine call and return
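The register file and the execute step of the ADD sequence can be sketched in a few lines of Python. This is a behavioral model only, not the course's implementation: the class name `RegisterFile`, the helper `execute_add`, and the choice of two read ports and one write port are assumptions made for illustration.

```python
class RegisterFile:
    """Behavioral model of a small register file (hypothetical helper)."""

    def __init__(self, words=4, width=4):
        self.mask = (1 << width) - 1   # keep stored values within the word width
        self.regs = [0] * words

    def read(self, ra, rb):
        # Separate read addresses: both operands are available together.
        return self.regs[ra], self.regs[rb]

    def write(self, wa, data, we=True):
        # The addressed word is loaded only when write-enable is asserted.
        if we:
            self.regs[wa] = data & self.mask


def execute_add(rf, rx, ry, rz):
    """Step 3 of the ADD sequence: Rz gets Rx + Ry."""
    a, b = rf.read(rx, ry)    # enable Rx and Ry outputs, direct to ALU
    result = a + b            # ALU set up to perform ADD
    rf.write(rz, result)      # direct result to Rz and assert its load signal
    return rf.regs[rz]


rf = RegisterFile(words=4, width=4)
rf.write(0, 5)
rf.write(1, 6)
print(execute_add(rf, 0, 1, 2))   # 11
rf.write(3, 9, we=False)          # write-enable low: register 3 is unchanged
print(rf.regs[3])                 # 0
```

Because the model masks results to the word width, a sum that overflows four bits wraps around, much as a 4-bit register would drop the carry.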

Elements of the Control Unit (aka Instruction Unit)

- Standard FSM elements
  - State register
  - Next-state logic
  - Output logic (datapath/control signaling)
  - Moore or synchronous Mealy machine to avoid loops unbroken by FF
- Plus additional "control" registers
  - Instruction register (IR)
  - Program counter (PC)
- Inputs/outputs
  - Outputs control elements of data path
  - Inputs from data path used to alter flow of program (test if zero)

Instruction Execution

- Control state diagram (for each diagram)
  - Reset
  - Fetch instruction
  - Decode
  - Execute
- Instructions partitioned into three classes
  - Branch
  - Load/store
  - Register-to-register
- Different sequence through diagram for each instruction type

[Figure: control state diagram. Reset leads to Init (Initialize Machine), then Fetch Instr., then XEQ Instr., which splits into Branch, Load/Store, and Register-to-Register paths; the branch path splits into Branch Taken and Branch Not Taken, and the diagram includes an Incr. PC state.]


Data Path (Hierarchy)

- Arithmetic circuits constructed in hierarchical and iterative fashion
  - Each bit in datapath is functionally identical
  - 4-bit, 8-bit, 16-bit, 32-bit

[Figure: full adder (FA) with inputs Ain, Bin, Cin and outputs Sum, Cout, built from two half adders (HA).]

Data Path (ALU)

- ALU block diagram
  - Input: data and operation to perform
  - Output: result of operation and status information

[Figure: ALU with 16-bit inputs A and B, an Operation input, a 16-bit output S, and status outputs N and Z.]
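The hierarchical, iterative construction described above can be sketched in Python: a full adder built from two half adders, then repeated once per bit to form a 16-bit adder. The function names are illustrative, and the reading of N as the sign bit of the result and Z as the "result is zero" flag is an assumption, since the slides show the status outputs only as labels in a figure.

```python
def half_adder(a, b):
    """Return (sum, carry) for two 1-bit inputs."""
    return a ^ b, a & b


def full_adder(a, b, cin):
    """Full adder composed of two half adders plus an OR of their carries."""
    s1, c1 = half_adder(a, b)
    s, c2 = half_adder(s1, cin)
    return s, c1 | c2


def ripple_add(a, b, width=16, cin=0):
    """Iterate the same 1-bit cell `width` times, chaining the carry."""
    result, carry = 0, cin
    for i in range(width):
        s, carry = full_adder((a >> i) & 1, (b >> i) & 1, carry)
        result |= s << i
    n = (result >> (width - 1)) & 1   # assumed meaning: sign bit of result
    z = int(result == 0)              # assumed meaning: result is zero
    return result, carry, n, z


print(ripple_add(3, 4))        # (7, 0, 0, 0)
print(ripple_add(0xFFFF, 1))   # (0, 1, 0, 1)
```

Changing `width` to 4, 8, or 32 yields the other datapath sizes without touching the 1-bit cell, which is the point of the iterative structure.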

Data Path (ALU + Registers)

- Accumulator
  - Special register
  - One of the inputs to ALU
  - Output of ALU stored back in accumulator
- One-address instructions
  - Operation and address of one operand
  - Other operand and destination is accumulator register
  - AC <- AC op Mem[addr]
  - "Single address instructions" (AC implicit operand)
- Multiple registers
  - Part of instruction used to choose register operands

[Figure: 16-bit ALU with operation input OP and status outputs N and Z. One ALU input comes from a register REG and the other from the accumulator AC, and the ALU output is stored back into AC.]

Data Path (Bit-slice)

- Bit-slice concept: iterate to build n-bit wide datapaths

[Figure: a 1-bit-wide slice and a 2-bits-wide datapath. Each slice holds an ALU with carry-in (CI) and carry-out (CO), an AC bit, and register bits R0, rs, rt, rd, with data arriving from memory.]
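As a minimal sketch of the one-address form AC <- AC op Mem[addr], the Python below applies a single instruction to an accumulator value. The operation names in `OPS`, the function `step`, and the 16-bit default width are assumptions for illustration; the slides do not define an accumulator instruction set.

```python
import operator

# Hypothetical operation table; the source does not list accumulator opcodes.
OPS = {
    "add": operator.add,
    "sub": operator.sub,
    "and": operator.and_,
    "or": operator.or_,
}


def step(ac, mem, op, addr, width=16):
    """Execute one one-address instruction: AC <- AC op Mem[addr]."""
    mask = (1 << width) - 1
    return OPS[op](ac, mem[addr]) & mask


mem = [7, 3]
ac = 0
ac = step(ac, mem, "add", 0)
print(ac)   # 7
ac = step(ac, mem, "sub", 1)
print(ac)   # 4
```

The instruction names only one operand address; the accumulator is both the other source and the destination, so it never appears in the instruction itself.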


Instruction Path

- Program Counter
  - Keeps track of program execution
  - Address of next instruction to read from memory
  - May have auto-increment feature or use ALU
- Instruction Register
  - Current instruction
  - Includes ALU operation and address of operand
  - Also holds target of jump instruction
  - Immediate operands
- Relationship to Data Path
  - PC may be incremented through ALU
  - Contents of IR may also be required as input to ALU

Data Path (Memory Interface)

- Memory
  - Separate data and instruction memory (Harvard architecture)
    - Two address busses, two data busses
  - Single combined memory (Princeton architecture)
    - Single address bus, single data bus
- Separate memory
  - ALU output goes to data memory input
  - Register input from data memory output
  - Data memory address from instruction register
  - Instruction register from instruction memory output
  - Instruction memory address from program counter
- Single memory
  - Address from PC or IR
  - Memory output to instruction and data registers
  - Memory input from ALU output

Block Diagram of Processor (Princeton architecture)

- Register transfer view of Princeton architecture
  - Which register outputs are connected to which register inputs
  - Arrows represent data-flow, others are control signals from control FSM
  - MAR may be a simple multiplexer rather than separate register
  - MBR is split in two (REG and IR)
  - Load control for each register

[Figure: Princeton block diagram. REG and AC feed a 16-bit ALU (OP input, N and Z outputs), with a load path and a store path to Data Memory (16-bit words) and rd/wr controls. The Control FSM, IR, PC, and MAR supply the memory address.]

Block Diagram of Processor (Harvard architecture)

- Register transfer view of Harvard architecture
  - Which register outputs are connected to which register inputs
  - Arrows represent data-flow, others are control signals from control FSM
  - Two MARs (PC and IR)
  - Two MBRs (REG and IR)
  - Load control for each register

[Figure: Harvard block diagram. The same ALU, REG, and AC datapath is connected to Data Memory (16-bit words), with a separate Inst Memory (8-bit words) addressed by the PC and feeding the IR.]

A Simplified Processor Data-path and Memory

- Princeton architecture
- Register file
- Instruction register
- PC incremented through ALU
- Modeled after MIPS rt000 (used in 61C textbook by Patterson & Hennessy)
  - Really a 32 bit machine
  - We'll do a 16 bit version

[Figure: simplified processor datapath and memory. Annotation: memory has only 255 words, with a display on the last one.]

Processor Control

- Synchronous Mealy machine
- Multiple cycles per instruction

Processor Instructions

- Three principal types (16 bits in each instruction)

  type          field widths in bits
  R(egister)    op 3 | rs 3 | rt 3 | rd 3 | funct 4
  I(mmediate)   op 3 | rs 3 | rt 3 | offset 7
  J(ump)        op 3 | target address 13

- Some of the instructions

  type  instr  op  remaining fields   meaning
  R     add    0   rs rt rd 0         rd = rs + rt
  R     sub    0   rs rt rd 1         rd = rs - rt
  R     and    0   rs rt rd 2         rd = rs & rt
  R     or     0   rs rt rd 3         rd = rs | rt
  R     slt    0   rs rt rd 4         rd = (rs < rt)
  I     lw     1   rs rt offset       rt = mem[rs + offset]
  I     sw     2   rs rt offset       mem[rs + offset] = rt
  I     beq    3   rs rt offset       pc = pc + offset, if (rs == rt)
  I     addi   4   rs rt offset       rt = rs + offset
  J     j      5   target address     pc = target address
  J     halt   7   -                  stop execution until reset

Tracing an Instruction's Execution

- Instruction: r3 = r1 + r2

  R | op=0 | rs=r1 | rt=r2 | rd=r3 | funct=0

- 1. Instruction fetch
  - Move instruction address from PC to memory address bus
  - Assert memory read
  - Move data from memory data bus into IR
  - Configure ALU to add 1 to PC
  - Configure PC to store new value from ALUout
- 2. Instruction decode
  - Op-code bits of IR are input to control FSM
  - Rest of IR bits encode the operand addresses (rs and rt)
    - These go to register file
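The three instruction formats can be packed into and unpacked from a 16-bit word as in the Python sketch below. The bit positions are an assumption: fields are taken to run from the most significant bit downward in the order listed (op in bits 15:13, funct in bits 3:0), which matches the decode slide's branch on Inst[15:13] and Inst[3:0]. The function names are hypothetical, and the offset is left as a raw 7-bit field because the slides do not say whether it is sign-extended.

```python
I_OPS = {1: "lw", 2: "sw", 3: "beq", 4: "addi"}
J_OPS = {5: "j", 7: "halt"}


def encode_r(rs, rt, rd, funct, op=0):
    """Pack an R-type word: op(3) rs(3) rt(3) rd(3) funct(4)."""
    return (op << 13) | (rs << 10) | (rt << 7) | (rd << 4) | funct


def encode_i(op, rs, rt, offset):
    """Pack an I-type word: op(3) rs(3) rt(3) offset(7)."""
    return (op << 13) | (rs << 10) | (rt << 7) | (offset & 0x7F)


def decode(word):
    """Split a 16-bit instruction word into its fields."""
    op = (word >> 13) & 0x7
    rs, rt = (word >> 10) & 0x7, (word >> 7) & 0x7
    if op == 0:
        return {"type": "R", "op": op, "rs": rs, "rt": rt,
                "rd": (word >> 4) & 0x7, "funct": word & 0xF}
    if op in I_OPS:
        return {"type": "I", "op": op, "rs": rs, "rt": rt,
                "offset": word & 0x7F}   # raw field, sign handling not modeled
    if op in J_OPS:
        return {"type": "J", "op": op, "target": word & 0x1FFF}
    raise ValueError(f"op {op} is not in the instruction table")


add_word = encode_r(rs=1, rt=2, rd=3, funct=0)   # r3 = r1 + r2
print(hex(add_word))                              # 0x530
print(decode(add_word)["rd"])                     # 3
```

Under this layout the traced instruction r3 = r1 + r2 encodes as 0x0530.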


Tracing an Instruction's Execution (cont'd)

- Instruction: r3 = r1 + r2

  R | op=0 | rs=r1 | rt=r2 | rd=r3 | funct=0

- 3. Instruction execute
  - Set up ALU inputs
  - Configure ALU to perform ADD operation
  - Configure register file to store ALU result (rd)

Tracing an Instruction's Execution (cont'd)

- Step 1

[Figure: datapath diagram for step 1.]

Tracing an Instruction's Execution (cont'd)

- Step 2

Tracing an Instruction's Execution (cont'd)

- Step 3

[Figures: datapath diagrams for steps 2 and 3, including a path labeled "to controller".]

Register-Transfer-Level Description

- Control
  - Transfer data between registers by asserting appropriate control signals
- Register transfer notation: work from register to register
  - Instruction fetch:
      mabus <- PC;     move PC to memory address bus (PCmaEN, ALUmaEN)
      memory read;     assert memory read signal (mr, RegBmdEN)
      IR <- memory;    load IR from memory data bus (IRld)
      op <- add        send PC into A input, 1 into B input, add (srcA, srcB0, srcB1, op)
      PC <- ALUout     load result of incrementing in ALU into PC (PCld, PCsel)
  - Instruction decode:
      IR to controller
      values of A and B read from register file (rs, rt)
  - Instruction execution:
      op <- add        send regA into A input, regB into B input, add (srcA, srcB0, srcB1, op)
      rd <- ALUout     store result of add into destination register (regWrite, wrDataSel, wrRegSel)

Register-Transfer-Level Description (cont'd)

- How many states are needed to accomplish these transfers?
  - Data dependencies (where do values that are needed come from?)
  - Resource conflicts (ALU, busses, etc.)
- In our case, it takes three cycles
  - One for each step
  - All operations within a cycle occur between rising edges of the clock
- How do we set all of the control signals to be output by the state machine?
  - Depends on the type of machine (Mealy, Moore, synchronous Mealy)
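The three cycles of register transfers for the add instruction can be traced with a small Python model. It is a behavioral sketch under stated assumptions: `run_add_cycles` is a hypothetical name, the instruction fields are assumed to be packed as op[15:13], rs[12:10], rt[9:7], rd[6:4], funct[3:0], and memory and the register file are plain lists rather than clocked hardware.

```python
def run_add_cycles(mem, regs, pc):
    """Carry out fetch, decode, and execute for one add; return the new PC."""
    # Cycle 1, instruction fetch:
    mabus = pc                   # mabus <- PC
    ir = mem[mabus]              # memory read; IR <- memory
    pc = (pc + 1) & 0xFFFF       # op <- add (PC + 1); PC <- ALUout

    # Cycle 2, instruction decode: IR to controller, A and B read (rs, rt)
    op, funct = (ir >> 13) & 0x7, ir & 0xF
    rs, rt, rd = (ir >> 10) & 0x7, (ir >> 7) & 0x7, (ir >> 4) & 0x7
    a, b = regs[rs], regs[rt]

    # Cycle 3, instruction execution (only add is modeled in this sketch)
    if (op, funct) != (0, 0):
        raise ValueError("this sketch handles only the add instruction")
    alu_out = (a + b) & 0xFFFF   # op <- add
    regs[rd] = alu_out           # rd <- ALUout
    return pc


mem = [0x0530]                        # r3 = r1 + r2 under the assumed layout
regs = [0, 10, 20, 0, 0, 0, 0, 0]
pc = run_add_cycles(mem, regs, 0)
print(pc, regs[3])                    # 1 30
```

Each commented group corresponds to one clock cycle; the values read in cycle 2 are used in cycle 3, which is the data dependency that forces the steps into separate states.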


FSM Controller for CPU (skeletal Moore FSM)

- First pass at deriving the state diagram (Moore machine)
  - These will be further refined into sub-states

[Figure: state diagram. reset leads to instruction fetch, then instruction decode, then instruction execution, which branches by instruction (SW, J, LW, ADD).]

Review of FSM Timing

[Figure: timing diagram spanning the fetch, decode, and execute cycles:
  step 1 (fetch):   IR <- mem[PC]; PC <- PC + 1;
  step 2 (decode):  A <- rs; B <- rt
  step 3 (execute): rd <- A + B
Annotation: to configure the data-path to do this here, when do we set the control signals?]

FSM Controller for CPU (reset and inst. fetch)

- Assume Moore machine
  - Outputs associated with states rather than arcs
- Reset state and instruction fetch sequence
- On reset (go to Fetch state)
  - Start fetching instructions
  - PC will set itself to zero

[Figure: reset state leading to Fetch (instruction fetch): mabus <- PC; memory read; IR <- memory data bus; PC <- PC+1;]

FSM Controller for CPU (decode)

- Operation Decode State
  - Next state branch based on operation code in instruction
  - Read two operands out of register file
    - What if the instruction doesn't have two operands?

[Figure: Decode (instruction decode) state, which branches based on the value of Inst[15:13] and Inst[3:0]; one arc leads to add.]


FSM Controller for CPU (Instruction Execution)

- For add instruction
  - Configure ALU and store result in register: rd <- A + B
  - Other instructions may require multiple cycles

[Figure: add state (instruction execution) reached from instruction decode.]

FSM Controller for CPU (Add Instruction)

- Putting it all together and closing the loop
  - The famous instruction fetch, decode, execute cycle

[Figure: reset leads to Fetch (instruction fetch), then Decode (instruction decode), then add (instruction execution), which returns to Fetch.]


FSM Controller for CPU

- Now we need to repeat this for all the instructions of our processor
  - Fetch and decode states stay the same
  - Different execution states for each instruction
    - Some may require multiple states if available register transfer paths require sequencing of steps
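The fetch, decode, execute loop for the add instruction can be sketched as a Moore machine in Python. The state names, the `next_state` and `trace` helpers, and the grouping of signals per state are assumptions made for illustration: the signal names come from the register-transfer description, but the table lists only which signals are involved in each state, not their values, and the decode test assumes op in Inst[15:13] and funct in Inst[3:0].

```python
# Moore machine: outputs depend on the state alone (illustrative grouping).
SIGNALS_BY_STATE = {
    "RESET":  [],
    "FETCH":  ["PCmaEN", "mr", "IRld", "srcA", "srcB0", "srcB1", "op",
               "PCld", "PCsel"],
    "DECODE": [],
    "ADD":    ["srcA", "srcB0", "srcB1", "op",
               "regWrite", "wrDataSel", "wrRegSel"],
}


def next_state(state, inst):
    """Next-state logic; `inst` is the 16-bit word held in IR."""
    if state == "RESET":
        return "FETCH"
    if state == "FETCH":
        return "DECODE"
    if state == "DECODE":
        op, funct = (inst >> 13) & 0x7, inst & 0xF
        if (op, funct) == (0, 0):
            return "ADD"
        raise NotImplementedError("only the add path is sketched here")
    if state == "ADD":
        return "FETCH"           # close the loop
    raise ValueError(f"unknown state {state}")


def trace(inst, cycles):
    """List the states visited over `cycles` clock edges, starting at RESET."""
    state, visited = "RESET", []
    for _ in range(cycles):
        visited.append(state)
        state = next_state(state, inst)
    return visited


print(trace(0x0530, 5))
# ['RESET', 'FETCH', 'DECODE', 'ADD', 'FETCH']
```

Supporting another instruction means adding its execution state or states to the table and one more branch in the DECODE case; the FETCH and DECODE entries stay as they are.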
