The TOY Machine
Lecture 12: TOY Machine Architecture TOY machine. ! 256 16-bit words of memory.
! 16 16-bit registers.
! 1 8-bit program counter.
8 16 ! 16 instructions types.
pc for branch, addr for loads, result of arithmetic, logic, jump stores or addr for load addr Cond = 0 What we've done. Eval Registers Fetch 0 > 0 ! Written programs for the TOY machine. pc for jal 8 W Data load ! Software implementation of fetch-execute cycle. 2 Memory A Data A IR L – TOY simulator. PC Addr op U B Data d W Addr 5 R Execute A Addr Our goal today. Data s 16 ! + W Data t B Addr Hardware implementation of fetch-execute cycle. 1 W – TOY computer. W 0 addr 8 pc + 1 8 store data
COS126: General Computer Science • http://www.cs.Princeton.EDU/~cos126 2
Designing a Processor Instruction Set Architecture
How to build a microprocessor? Instruction set architecture (ISA).
! 16-bit words, 256 words of memory, 16 registers.
! Develop instruction set architecture (ISA). ! Determine set of primitive instructions. – 16-bit words, 16 TOY machine instructions – too narrow ! cumbersome to program – too broad ! cumbersome to build hardware
! Determine major components. ! TOY machine: 16 instructions. – ALU, memory, registers, program counter Instructions Instructions
! Determine datapath requirements. 0: halt 8: load – "flow" of bits 1: add 9: store 2: subtract A: load indirect ! Establish clocking methodology. 3: and B: store indirect – 2-cycle design: fetch, execute 4: xor C: branch zero 5: shift left D: branch positive ! Analyze how to implement each instruction. – determine settings of control signals 6: shift right E: jump register 7: load address F: jump and link
3 4 Designing a Processor Arithmetic Logic Unit
How to build a microprocessor? TOY ALU.
! Big combinational circuit. technical hack ! Develop instruction set architecture (ISA). ! 16-bit bus.
– 16-bit words, 16 TOY machine instructions ! Add, subtract, and, xor, shift left, shift right, copy input 2.
! Determine major components. – ALU, memory, registers, program counter 16 Input 1 op 2 1 0 ! Determine datapath requirements. +, - 0 0 0 – "flow" of bits ALU 16 & 0 0 1 ^ 0 1 0 ! Establish clocking methodology. <<, >> 0 1 1 16 – 2-cycle design: fetch, execute input 2 1 0 0 Input 2 3 ALU select ! Analyze how to implement each instruction. – determine settings of control signals shift subtract direction
5 6
Arithmetic Logic Unit: Implementation Main Memory
16 TOY main memory: 256 x 16-bit register file. Input 1 + 000 16 00 10 20 30 40 50 60 70 80 90 A0 B0 C0 D0 E0 F0 Input 2 01 11 21 31 41 51 61 71 81 91 A1 B1 C1 D1 E1 F1 ~ carry in 02 12 22 32 42 52 62 72 82 92 A2 B2 C2 D2 E2 F2 03 13 23 33 43 53 63 73 83 93 A3 B3 C3 D3 E3 F3 04 14 24 34 44 54 64 74 84 94 A4 B4 C4 D4 E4 F4 & 001 05 15 25 35 45 55 65 75 85 95 A5 B5 C5 D5 E5 F5 06 16 26 36 46 56 66 76 86 96 A6 B6 C6 D6 E6 F6 16 16 16 07 17 27 37 47 57 67 77 87 97 A7 B7 C7 D7 E7 F7 MUX 08 18 28 38 48 58 68 78 88 98 A8 B8 C8 D8 E8 F8 Read ^ 010 Write 09 19 29 39 49 59 69 79 89 99 A9 B9 C9 D9 E9 F9 Data Data 0A 1A 2A 3A 4A 5A 6A 7A 8A 9A AA BA CA DA EA FA 0B 1B 2B 3B 4B 5B 6B 7B 8B 9B AB BB CB DB EB FB << 0C 1C 2C 3C 4C 5C 6C 7C 8C 9C AC BC CC DC EC FC 011 >> 0D 1D 2D 3D 4D 5D 6D 7D 8D 9D AD BD CD DD ED FD op 2 1 0 0E 1E 2E 3E 4E 5E 6E 7E 8E 9E AE BE CE DE EE FE +, - 0 0 0 100 0F 1F 2F 3F 4F 5F 6F 7F 8F 9F AF BF CF DF EF FF 3 & 0 0 1 ^ 0 1 0 <<, >> 0 1 1 8 input 2 1 0 0 subtract shift direction ALU control Cl Write Address 7 8 Registers Designing a Processor
TOY registers: fancy 16 x 16-bit register file. How to build a microprocessor?
! Want to be able to read two registers, and write to a third in the
same instructions: R1 " R2 + R3. ! Develop instruction set architecture (ISA).
! 3 address inputs, 1 data input, 2 data outputs. – 16-bit words, 16 TOY machine instructions
! Add decoders and muxes for additional ports.
! Determine major components. – ALU, memory, registers, program counter
16 R0 R8
Write Data R1 R9 ! Determine datapath requirements. 16 4 R2 RA – "flow" of bits A Data Write Address R3 RB 16 R4 RC ! Establish clocking methodology. 4 B Data A Address R5 RD – 2-cycle design: fetch, execute 4 R6 RE
B Address R7 RF ! Analyze how to implement each instruction. – determine settings of control signals
Cl Write 9 10
Datapath and Control Datapath and Control
Datapath. Datapath.
! Layout and interconnection of components. ! Layout and interconnection of components.
! Must accommodate all instruction types. ! Must accommodate all instruction types.
Control. Control.
! Choreographs the "flow" of information on the datapath. ! Choreographs the "flow" of information on the datapath.
! Depending on instruction, different control wires are turned on. ! Depending on instruction, different control wires are turned on.
8 datapath wires 8 datapath wires Result of jump 5E PC Result of jump 5E PC 1 8 1 8 or branch 5E or branch 11 MUX MUX 8 8 Result of adding 1 to 11 Result of adding 1 to 11 0 0 old PC old PC control wire control wire
1 0
11 12 The TOY Datapath The TOY Datapath: Add
8 16 pc for branch, addr for result of arithmetic, logic, jump loads, stores or addr for load addr Cond = 0 Eval Registers 0 > 0 8 pc for jal W Data load 201 ?1?23?4? 2 Memory A Data A Memory IR 20 IR L 20 1 PC Addr op U PC Addr op B Data 2 d W Addr 5 21 21 d R Data R Data 3 s A Addr 1234 s 16 21 4 + W Data t B Addr + W Data t 1 W 1 W 0 W 8 addr
pc + 1 8 store data
Before fetch: After fetch: pc = 20, mem[20] = 1234 pc = 21 IR = 1234: R[2] " R[3] + R[4] 15 16
The TOY Datapath: Add The TOY Datapath: Jump and Link
008C
Cond = 0 Cond = 0 Eval Eval Registers Registers 008C > 0 > 0 W Data W Data 21 1234 20 ???? 2 0028 2 Memory A Data A Memory A Data A IR 008C IR 1 L L PC Addr op 0064 U PC Addr op U 2 B Data B Data d W Addr 5 d W Addr 35 R Data 3 R Data s A Addr s A Addr 4 + W Data t B Addr + W Data t B Addr 1 W 1 W W W
Before execute: After execute: pc = 21 pc = 21 Before fetch: IR = 1234: R[2] " R[3] + R[4] R[2] = 008C pc = 20 R[3] = 0028, R[4] = 0064 mem[20] = FF30 17 18 The TOY Datapath: Jump and Link The TOY Datapath: Jump and Link
Cond = 0 Cond = 0 Eval Eval Registers Registers > 0 > 0 W Data W Data 201 ?F?F?30? 21 FF30 2 2 Memory A Data A Memory A Data A 20 IR IR 20 F L 21 F L PC Addr op U PC Addr op U F B Data F B Data 21 21 d W Addr 35 d W Addr 35 R Data 3 R Data 3 FF30 s A Addr s A Addr 21 0 0 + W Data t B Addr + W Data t B Addr 1 W 1 W W W 30
Before fetch: After fetch: Before execute: pc = 20 pc = 21 pc = 21 mem[20] = FF30 IR = FF30: R[F] " 21; pc " 30 IR = FF30: R[F] " 21; pc " 30 19 20
The TOY Datapath: Jump and Link Do Try This At Home
0030 8 16
pc for branch, addr for result of arithmetic, logic, jump loads, stores or addr for load addr Cond = 0 Cond = 0 Eval Eval Registers Registers > 0 0 > 0 8 21 21 pc for jal W Data W Data 2310 FF30 load 2 2 Memory A Data A Memory A Data A 30 IR IR 21 F L L PC Addr op U PC Addr op U F B Data B Data d W Addr 35 d W Addr 5 R Data 3 R Data s A Addr s A Addr 0 16 + W Data t B Addr + W Data t B Addr 1 W 1 W W W 0 8 addr 30 pc + 1 8 store data
Before execute: After execute: pc = 21 pc = 30 Trace the flow of some other instructions through the datapath picture. IR = FF30: R[F] " 21; pc " 30 R[F] = 21
21 22 Designing a Processor Clocking Methodology
How to build a microprocessor? Two cycle design (fetch and execute).
! Use 1-bit counter to distinguish between 2 cycles.
! Develop instruction set architecture (ISA). ! Use two cycles since fetch and execute phases each access memory – 16-bit words, 16 TOY machine instructions and alter program counter.
! Determine major components. 1-bit – ALU, memory, registers, program counter counter Execute Clock Q Cl Execute ! Determine datapath requirements. Fetch – "flow" of bits Fetch
! Establish clocking methodology. – 2-cycle design: fetch, execute Fetch
! Analyze how to implement each instruction. Execute – determine settings of control signals Clock
25 26
Clocking Methodology Designing a Processor
4 distinguishable events. Registers How to build a microprocessor?
! During fetch phase. W Data W Addr ! At very end of execute phase. ! Develop instruction set architecture (ISA).
! During execute phase. A Addr A Data – 16-bit words, 16 TOY machine instructions B Addr B Data ! At very end of fetch phase. W ! Determine major components. Ex: can only write at very end of execute phase. – ALU, memory, registers, program counter
! R1 " R1 + R1
! execute clock Determine datapath requirements. – "flow" of bits
! Establish clocking methodology. Fetch – 2-cycle design: fetch, execute
Execute ! Analyze how to implement each instruction. – determine settings of control signals Clock
27 28 Control Control
Control: controls components, enables connections. Control: controls components, enables connections.
! Input: opcode, clock, conditional evaluation. (green) ! Input: opcode, clock, conditional evaluation. (green)
! Output: control wires. (orange) ! Output: control wires. (orange)
Cond = 0 Cond = 0 Ev al Ev al Registers Registers > 0 > 0 W Data W Data 2 2 Memory A Data A Memory A Data A IR IR 4 L 4 L Addr op U Addr op U PC B Data PC B Data d W Addr 5 d W Addr 5 R Data R Data A Addr A Addr s s
+ W Data t B Addr + W Data t B Addr 1 W 1 W W W
1-bit 1-bit Opcode counter Execute Control counter Execute Control Fetch Fetch 29 Clock 30
Implementation of Control Implementation of Control: Store s s t t c c l Inputs l Inputs o o opcode b opcode b o o 4-BIT DECODER o 4-BIT DECODER o a a r b r b r r s j n s j n d d u u e e a a s h l c d s h l c d r r o o m m s j s j
h o h o a a n n i i e e a 1 a f f u u i i u i u i n p n p c c i i n n n n p p n n x x f t d f t d m m b b d o d o h h c c d d
d d t t e e
f 0 f s s s s t t + + h p h p
i i r r i i a a c c e e c c r r t z i t z i r r r r z z
h l h l
l i t l i t a a l l o o d d a a u u e e x e g o e x e g o e e e a r t a r t e e l l p p o o a a i 0 i i i a a d d t t h d r v h d r v n e c n e c c c f n f n o r o r c c o c c o r r c c l l e e d d e e e e d h d h d d t t t t t t r k o t t r k o r r t t s g t t s g o o k 1 k
WRITE MEM WRITE MEM
WRITE IR WRITE IR
ALU SELECT 0 ALU SELECT 0
ALU MUX ALU MUX
READ REG A MUX READ REG A MUX
plus a plus a few more few more
31 32 Control: Execute Phase of Store Stand-Alone Registers
IR jump 1 16 PC branch Cond 6 1 8 8 Ev al pc + 1 Registers 0
W Data
Memory A Data A IR execute 4 L Addr op U PC B Data d W Addr R Data fetch clock
s A Addr clock
+ W Data t B Addr Instruction Register 1 W W
jump fetch jump > 0 bpos link reg bzero 1-bit Opcode = 0 counter Execute Control Fetch Program Counter Clock
33 34
Layers of Abstraction Pipelining
Pipelining. Abstraction Built From Examples ! At any instant, processor is either fetching instructions or executing them (and so half of circuitry is idle). Abstract Switch raw materials transistor ! Why not fetch next instruction while current instruction is Connector raw materials wire executing? Clock raw materials crystal oscillator – Analogy: washer / dryer. abstract switches, Logic Gates AND, OR, NOT connectors Issues. ! Jump and branch instructions change PC. decoder, Combinational logic gates, multiplexer, – "Prefetch" next instruction. Circuit connectors adder, ALU ! Fetch and execute cycles may need to access same memory. logic gates, clock, – Solution: use two memory caches. Sequential Circuit flip-flop connector decoder, Result. registers, ALU, Components multiplexer, ! counter, control Better utilization of hardware. adder, flip-flop ! Can double speed of processor. Computer components TOY
35 36 History + Future Goodbye, TOY
8 16 Computer constructed by layering abstractions.
! Better implementation at low levels improves EVERYTHING. pc for branch, addr for result of arithmetic, logic, jump loads, stores or addr for load addr ! Ongoing search for better abstract switch! Cond = 0 Eval Registers 0 > 0 8 pc for jal History. W Data load 2 ! 1820s: mechanical switches (Babbage's difference engine). Memory A Data A IR L ! 1940s: relays, vacuum tubes. Addr op PC B Data U ! 1950s: transistor, core memory. d W Addr 5 R Data ! 1960s: integrated circuit. s A Addr 16 ! 1970s: microprocessor. + W Data t B Addr ! 1980s: VLSI. 1 W W 0 ! 1990s: integrated systems. 8 addr ! 2000s: web computer. pc + 1 8 store data ! Future: DNA, quantum, optical soliton, . . .
37 38