The TOY Machine

Lecture 12: TOY Machine Architecture TOY machine. ! 256 16-bit words of memory.

! 16 16-bit registers.

! 1 8-bit program .

8 16 ! 16 instructions types.

pc for branch, addr for loads, result of arithmetic, logic, jump stores or addr for load addr Cond = 0 What we've done. Eval Registers Fetch 0 > 0 ! Written programs for the TOY machine. pc for jal 8 W Data load ! Software implementation of fetch-execute cycle. 2 Memory A Data A IR L – TOY simulator. PC Addr op U B Data d W Addr 5 R Execute A Addr Our goal today. Data s 16 ! + W Data t B Addr Hardware implementation of fetch-execute cycle. 1 W – TOY computer. W 0 addr 8 pc + 1 8 store data

COS126: General Computer Science • http://www.cs.Princeton.EDU/~cos126 2

Designing a Instruction Set Architecture

How to build a ? Instruction set architecture (ISA).

! 16-bit words, 256 words of memory, 16 registers.

! Develop instruction set architecture (ISA). ! Determine set of primitive instructions. – 16-bit words, 16 TOY machine instructions – too narrow ! cumbersome to program – too broad ! cumbersome to build hardware

! Determine major components. ! TOY machine: 16 instructions. – ALU, memory, registers, Instructions Instructions

! Determine requirements. 0: halt 8: load – "flow" of bits 1: add 9: store 2: subtract A: load indirect ! Establish clocking methodology. 3: and B: store indirect – 2-cycle design: fetch, execute 4: xor C: branch zero 5: shift left D: branch positive ! Analyze how to implement each instruction. – determine settings of control signals 6: shift right E: jump register 7: load address F: jump and link

3 4 Designing a Processor

How to build a microprocessor? TOY ALU.

! Big combinational circuit. technical hack ! Develop instruction set architecture (ISA). ! 16-bit .

– 16-bit words, 16 TOY machine instructions ! Add, subtract, and, xor, shift left, shift right, copy input 2.

! Determine major components. – ALU, memory, registers, program counter 16 Input 1 op 2 1 0 ! Determine datapath requirements. +, - 0 0 0 – "flow" of bits ALU 16 & 0 0 1 ^ 0 1 0 ! Establish clocking methodology. <<, >> 0 1 1 16 – 2-cycle design: fetch, execute input 2 1 0 0 Input 2 3 ALU select ! Analyze how to implement each instruction. – determine settings of control signals shift subtract direction

5 6

Arithmetic Logic Unit: Implementation Main Memory

16 TOY main memory: 256 x 16-bit . Input 1 + 000 16 00 10 20 30 40 50 60 70 80 90 A0 B0 C0 D0 E0 F0 Input 2 01 11 21 31 41 51 61 71 81 91 A1 B1 C1 D1 E1 F1 ~ carry in 02 12 22 32 42 52 62 72 82 92 A2 B2 C2 D2 E2 F2 03 13 23 33 43 53 63 73 83 93 A3 B3 C3 D3 E3 F3 04 14 24 34 44 54 64 74 84 94 A4 B4 C4 D4 E4 F4 & 001 05 15 25 35 45 55 65 75 85 95 A5 B5 C5 D5 E5 F5 06 16 26 36 46 56 66 76 86 96 A6 B6 C6 D6 E6 F6 16 16 16 07 17 27 37 47 57 67 77 87 97 A7 B7 C7 D7 E7 F7 MUX 08 18 28 38 48 58 68 78 88 98 A8 B8 C8 D8 E8 F8 Read ^ 010 Write 09 19 29 39 49 59 69 79 89 99 A9 B9 C9 D9 E9 F9 Data Data 0A 1A 2A 3A 4A 5A 6A 7A 8A 9A AA BA CA DA EA FA 0B 1B 2B 3B 4B 5B 6B 7B 8B 9B AB BB CB DB EB FB << 0C 1C 2C 3C 4C 5C 6C 7C 8C 9C AC BC CC DC EC FC 011 >> 0D 1D 2D 3D 4D 5D 6D 7D 8D 9D AD BD CD DD ED FD op 2 1 0 0E 1E 2E 3E 4E 5E 6E 7E 8E 9E AE BE CE DE EE FE +, - 0 0 0 100 0F 1F 2F 3F 4F 5F 6F 7F 8F 9F AF BF CF DF EF FF 3 & 0 0 1 ^ 0 1 0 <<, >> 0 1 1 8 input 2 1 0 0 subtract shift direction ALU control Cl Write Address 7 8 Registers Designing a Processor

TOY registers: fancy 16 x 16-bit register file. How to build a microprocessor?

! Want to be able to read two registers, and write to a third in the

same instructions: R1 " R2 + R3. ! Develop instruction set architecture (ISA).

! 3 address inputs, 1 data input, 2 data outputs. – 16-bit words, 16 TOY machine instructions

! Add decoders and muxes for additional ports.

! Determine major components. – ALU, memory, registers, program counter

16 R0 R8

Write Data R1 R9 ! Determine datapath requirements. 16 4 R2 RA – "flow" of bits A Data Write Address R3 RB 16 R4 RC ! Establish clocking methodology. 4 B Data A Address R5 RD – 2-cycle design: fetch, execute 4 R6 RE

B Address R7 RF ! Analyze how to implement each instruction. – determine settings of control signals

Cl Write 9 10

Datapath and Control Datapath and Control

Datapath. Datapath.

! Layout and interconnection of components. ! Layout and interconnection of components.

! Must accommodate all instruction types. ! Must accommodate all instruction types.

Control. Control.

! Choreographs the "flow" of information on the datapath. ! Choreographs the "flow" of information on the datapath.

! Depending on instruction, different control wires are turned on. ! Depending on instruction, different control wires are turned on.

8 datapath wires 8 datapath wires Result of jump 5E PC Result of jump 5E PC 1 8 1 8 or branch 5E or branch 11 MUX MUX 8 8 Result of adding 1 to 11 Result of adding 1 to 11 0 0 old PC old PC control wire control wire

1 0

11 12 The TOY Datapath The TOY Datapath: Add

8 16 pc for branch, addr for result of arithmetic, logic, jump loads, stores or addr for load addr Cond = 0 Eval Registers 0 > 0 8 pc for jal W Data load 201 ?1?23?4? 2 Memory A Data A Memory IR 20 IR L 20 1 PC Addr op U PC Addr op B Data 2 d W Addr 5 21 21 d R Data R Data 3 s A Addr 1234 s 16 21 4 + W Data t B Addr + W Data t 1 W 1 W 0 W 8 addr

pc + 1 8 store data

Before fetch: After fetch: pc = 20, mem[20] = 1234 pc = 21 IR = 1234: R[2] " R[3] + R[4] 15 16

The TOY Datapath: Add The TOY Datapath: Jump and Link

008C

Cond = 0 Cond = 0 Eval Eval Registers Registers 008C > 0 > 0 W Data W Data 21 1234 20 ???? 2 0028 2 Memory A Data A Memory A Data A IR 008C IR 1 L L PC Addr op 0064 U PC Addr op U 2 B Data B Data d W Addr 5 d W Addr 35 R Data 3 R Data s A Addr s A Addr 4 + W Data t B Addr + W Data t B Addr 1 W 1 W W W

Before execute: After execute: pc = 21 pc = 21 Before fetch: IR = 1234: R[2] " R[3] + R[4] R[2] = 008C pc = 20 R[3] = 0028, R[4] = 0064 mem[20] = FF30 17 18 The TOY Datapath: Jump and Link The TOY Datapath: Jump and Link

Cond = 0 Cond = 0 Eval Eval Registers Registers > 0 > 0 W Data W Data 201 ?F?F?30? 21 FF30 2 2 Memory A Data A Memory A Data A 20 IR IR 20 F L 21 F L PC Addr op U PC Addr op U F B Data F B Data 21 21 d W Addr 35 d W Addr 35 R Data 3 R Data 3 FF30 s A Addr s A Addr 21 0 0 + W Data t B Addr + W Data t B Addr 1 W 1 W W W 30

Before fetch: After fetch: Before execute: pc = 20 pc = 21 pc = 21 mem[20] = FF30 IR = FF30: R[F] " 21; pc " 30 IR = FF30: R[F] " 21; pc " 30 19 20

The TOY Datapath: Jump and Link Do Try This At Home

0030 8 16

pc for branch, addr for result of arithmetic, logic, jump loads, stores or addr for load addr Cond = 0 Cond = 0 Eval Eval Registers Registers > 0 0 > 0 8 21 21 pc for jal W Data W Data 2310 FF30 load 2 2 Memory A Data A Memory A Data A 30 IR IR 21 F L L PC Addr op U PC Addr op U F B Data B Data d W Addr 35 d W Addr 5 R Data 3 R Data s A Addr s A Addr 0 16 + W Data t B Addr + W Data t B Addr 1 W 1 W W W 0 8 addr 30 pc + 1 8 store data

Before execute: After execute: pc = 21 pc = 30 Trace the flow of some other instructions through the datapath picture. IR = FF30: R[F] " 21; pc " 30 R[F] = 21

21 22 Designing a Processor Clocking Methodology

How to build a microprocessor? Two cycle design (fetch and execute).

! Use 1-bit counter to distinguish between 2 cycles.

! Develop instruction set architecture (ISA). ! Use two cycles since fetch and execute phases each access memory – 16-bit words, 16 TOY machine instructions and alter program counter.

! Determine major components. 1-bit – ALU, memory, registers, program counter counter Execute Clock Q Cl Execute ! Determine datapath requirements. Fetch – "flow" of bits Fetch

! Establish clocking methodology. – 2-cycle design: fetch, execute Fetch

! Analyze how to implement each instruction. Execute – determine settings of control signals Clock

25 26

Clocking Methodology Designing a Processor

4 distinguishable events. Registers How to build a microprocessor?

! During fetch phase. W Data W Addr ! At very end of execute phase. ! Develop instruction set architecture (ISA).

! During execute phase. A Addr A Data – 16-bit words, 16 TOY machine instructions B Addr B Data ! At very end of fetch phase. W ! Determine major components. Ex: can only write at very end of execute phase. – ALU, memory, registers, program counter

! R1 " R1 + R1

! execute clock Determine datapath requirements. – "flow" of bits

! Establish clocking methodology. Fetch – 2-cycle design: fetch, execute

Execute ! Analyze how to implement each instruction. – determine settings of control signals Clock

27 28 Control Control

Control: controls components, enables connections. Control: controls components, enables connections.

! Input: opcode, clock, conditional evaluation. (green) ! Input: opcode, clock, conditional evaluation. (green)

! Output: control wires. (orange) ! Output: control wires. (orange)

Cond = 0 Cond = 0 Ev al Ev al Registers Registers > 0 > 0 W Data W Data 2 2 Memory A Data A Memory A Data A IR IR 4 L 4 L Addr op U Addr op U PC B Data PC B Data d W Addr 5 d W Addr 5 R Data R Data A Addr A Addr s s

+ W Data t B Addr + W Data t B Addr 1 W 1 W W W

1-bit 1-bit Opcode counter Execute Control counter Execute Control Fetch Fetch 29 Clock 30

Implementation of Control Implementation of Control: Store s s t t c c l Inputs l Inputs o o opcode b opcode b o o 4-BIT DECODER o 4-BIT DECODER o a a r b r b r r s j n s j n d d u u e e a a s h l c d s h l c d r r o o m m s j s j

h o h o a a n n i i e e a 1 a f f u u i i u i u i n p n p c c i i n n n n p p n n x x f t d f t d m m b b d o d o h h c c d d

d d t t e e

f 0 f s s s s t t + + h p h p

i i r r i i a a c c e e c c r r t z i t z i r r r r z z

h l h l

l i t l i t a a l l o o d d a a u u e e x e g o e x e g o e e e a r t a r t e e l l p p o o a a i 0 i i i a a d d t t h d r v h d r v n e c n e c c c f n f n o r o r c c o c c o r r c c l l e e d d e e e e d h d h d d t t t t t t r k o t t r k o r r t t s g t t s g o o k 1 k

WRITE MEM WRITE MEM

WRITE IR WRITE IR

ALU SELECT 0 ALU SELECT 0

ALU MUX ALU MUX

READ REG A MUX READ REG A MUX

plus a plus a few more few more

31 32 Control: Execute Phase of Store Stand-Alone Registers

IR jump 1 16 PC branch Cond 6 1 8 8 Ev al pc + 1 Registers 0

W Data

Memory A Data A IR execute 4 L Addr op U PC B Data d W Addr R Data fetch clock

s A Addr clock

+ W Data t B Addr Instruction Register 1 W W

jump fetch jump > 0 bpos link reg bzero 1-bit Opcode = 0 counter Execute Control Fetch Program Counter Clock

33 34

Layers of Abstraction Pipelining

Pipelining. Abstraction Built From Examples ! At any instant, processor is either fetching instructions or executing them (and so half of circuitry is idle). Abstract raw materials transistor ! Why not fetch next instruction while current instruction is Connector raw materials wire executing? Clock raw materials crystal oscillator – Analogy: washer / dryer. abstract , Logic Gates AND, OR, NOT connectors Issues. ! Jump and branch instructions change PC. decoder, gates, , – "Prefetch" next instruction. Circuit connectors , ALU ! Fetch and execute cycles may need to access same memory. logic gates, clock, – Solution: use two memory caches. Sequential Circuit flip-flop connector decoder, Result. registers, ALU, Components multiplexer, ! counter, control Better utilization of hardware. adder, flip-flop ! Can double speed of processor. Computer components TOY

35 36 History + Future Goodbye, TOY

8 16 Computer constructed by layering abstractions.

! Better implementation at low levels improves EVERYTHING. pc for branch, addr for result of arithmetic, logic, jump loads, stores or addr for load addr ! Ongoing search for better abstract switch! Cond = 0 Eval Registers 0 > 0 8 pc for jal History. W Data load 2 ! 1820s: mechanical switches (Babbage's difference engine). Memory A Data A IR L ! 1940s: relays, vacuum tubes. Addr op PC B Data U ! 1950s: transistor, core memory. d W Addr 5 R Data ! 1960s: . s A Addr 16 ! 1970s: microprocessor. + W Data t B Addr ! 1980s: VLSI. 1 W W 0 ! 1990s: integrated systems. 8 addr ! 2000s: web computer. pc + 1 8 store data ! Future: DNA, quantum, optical soliton, . . .

37 38