
CS 240 Spring 2020, Foundations of Computer Systems, Ben Wood

A Simple Processor

1. A simple Instruction Set Architecture
2. A simple microarchitecture (implementation): Data Path and Control Logic

https://cs.wellesley.edu/~cs240/s20/

Software
• Program, Application
• Programming Language
• Compiler/
• Operating System

Instruction Set Architecture

Hardware
• Microarchitecture
• Digital Logic
• Devices (transistors, etc.)
• Solid-State Physics

Instruction Set Architecture (HW/SW Interface)

Instructions
• Names, Encodings
• Effects
• Arguments, Results

Local storage (registers)
• Names, Size
• How many

Large storage (memory)
• Addresses, Locations

[Figure: a computer drawn as a processor (Instruction Logic, Registers) connected to memory (Encoded Instructions, Data).]

Microarchitecture (Implementation of ISA)

[Figure: a computer drawn as four blocks: Instruction Fetch and Decode, Registers, ALU, Memory.]

HW ISA (HW = Hardware or Hogwarts?)

An example made-up instruction set architecture

Word size = 16 bits
• Register size = 16 bits.
• ALU computes on 16-bit values.

Memory is byte-addressable, accesses full words (byte pairs).

16 registers: R0 - R15
• R0 always holds hardcoded 0
• R1 always holds hardcoded 1
• R2 – R15: general purpose

Instructions are 1 word in size.

Separate instruction memory.

Address   Contents
0         First instruction, low-order byte
1         First instruction, high-order byte
2         Second instruction, low-order byte
...       ...

Program counter (PC) register
• holds address of next instruction to execute.
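The byte-pair layout above (low-order byte at the lower address) can be sketched in a few lines of Python. This is only an illustration: the helper names `store_word` and `load_word` are hypothetical, and it assumes data memory uses the same byte order that the slide shows for instruction memory.

```python
def store_word(mem: bytearray, addr: int, value: int) -> None:
    """Store a 16-bit word as a byte pair, low-order byte first."""
    mem[addr] = value & 0xFF             # low-order byte at the lower address
    mem[addr + 1] = (value >> 8) & 0xFF  # high-order byte at the next address


def load_word(mem: bytearray, addr: int) -> int:
    """Load the 16-bit word whose low-order byte is at addr."""
    return mem[addr] | (mem[addr + 1] << 8)


mem = bytearray(8)                   # a tiny byte-addressable memory
store_word(mem, 2, 0x56A9)
print([hex(b) for b in mem[2:4]])    # ['0xa9', '0x56']
print(hex(load_word(mem, 2)))        # 0x56a9
```

Each access touches two adjacent bytes, which is why word addresses in the tables step by 2.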

HW ISA Abstract Machine

R: Register File
Reg   Contents    Reg   Contents
R0    0x0000      R8
R1    0x0001      R9
R2                R10
R3                R11
R4                R12
R5                R13
R6                R14
R7                R15

M: Data Memory
Address       Contents
0x0 – 0x1
0x2 – 0x3
0x4 – 0x5
0x6 – 0x7
0x8 – 0x9
0xA – 0xB
0xC – 0xD
…

Program Counter
PC

IM: Instruction Memory
Address       Contents
0x0 – 0x1
0x2 – 0x3
0x4 – 0x5
0x6 – 0x7
0x8 – 0x9
…

Processor Loop
1. ins ← IM[PC]
2. PC ← PC + 2
3. Do ins

HW ISA Instructions

16-bit Encoding, fields listed from MSB to LSB

Assembly Syntax       Meaning                                     Opcode  Rs  Rt  Rd
ADD Rs, Rt, Rd        R[d] ← R[s] + R[t]                          0010    s   t   d
SUB Rs, Rt, Rd        R[d] ← R[s] - R[t]                          0011    s   t   d
AND Rs, Rt, Rd        R[d] ← R[s] & R[t]                          0100    s   t   d
OR Rs, Rt, Rd         R[d] ← R[s] | R[t]                          0101    s   t   d
LW Rt, offset(Rs)     R[t] ← M[R[s] + offset]                     0000    s   t   offset
SW Rt, offset(Rs)     M[R[s] + offset] ← R[t]                     0001    s   t   offset
BEQ Rs, Rt, offset    If R[s] == R[t] then PC ← PC + offset*2     0111    s   t   offset
JMP offset            PC ← offset*2                               1000    offset (remaining 12 bits)
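The Meaning column and the processor loop can be read as a small interpreter. The sketch below is one way to write it in Python, not the course's implementation. Several things are assumptions made for illustration: instructions are given as already-decoded tuples rather than 16-bit words, data memory is little-endian, writes to R0 and R1 are ignored, results wrap to 16 bits, and `HALT` (which appears in the slides' example program but has no encoding in the table) simply stops the loop. The names `run`, `ALU_OPS`, and `write_reg` are hypothetical.

```python
ALU_OPS = {
    "ADD": lambda a, b: a + b,
    "SUB": lambda a, b: a - b,
    "AND": lambda a, b: a & b,
    "OR": lambda a, b: a | b,
}


def run(program, mem, regs=None, max_steps=10_000):
    """Interpret decoded HW ISA instructions; return the final register list.

    program: one tuple per instruction slot, written in assembly operand order:
             ("ADD", s, t, d), ("LW", t, offset, s), ("SW", t, offset, s),
             ("BEQ", s, t, offset), ("JMP", offset), ("HALT",)
    mem:     bytearray used as data memory (little-endian words, assumed)
    """
    R = [0] * 16 if regs is None else list(regs)
    R[0], R[1] = 0, 1                     # hardcoded constants

    def write_reg(index, value):
        if index >= 2:                    # assumed: writes to R0/R1 are ignored
            R[index] = value & 0xFFFF     # keep results to 16 bits

    pc = 0
    for _ in range(max_steps):
        if pc // 2 >= len(program):       # ran past the program (assumed to stop)
            break
        ins = program[pc // 2]            # 1. ins <- IM[PC]
        pc += 2                           # 2. PC <- PC + 2
        op = ins[0]                       # 3. Do ins
        if op == "HALT":
            break
        elif op in ALU_OPS:
            _, s, t, d = ins
            write_reg(d, ALU_OPS[op](R[s], R[t]))
        elif op == "LW":
            _, t, offset, s = ins
            addr = (R[s] + offset) & 0xFFFF
            write_reg(t, mem[addr] | (mem[addr + 1] << 8))
        elif op == "SW":
            _, t, offset, s = ins
            addr = (R[s] + offset) & 0xFFFF
            mem[addr] = R[t] & 0xFF
            mem[addr + 1] = R[t] >> 8
        elif op == "BEQ":
            _, s, t, offset = ins
            if R[s] == R[t]:
                pc = (pc + offset * 2) & 0xFFFF   # PC was already incremented
        elif op == "JMP":
            pc = ins[1] * 2
        else:
            raise ValueError(f"unknown instruction: {ins!r}")
    return R


# Load two words, AND them, store the result at address 4.
mem = bytearray(16)
mem[0:4] = bytes([0x46, 0x12, 0xA9, 0x56])
run([("LW", 3, 0, 0), ("LW", 4, 2, 0), ("AND", 3, 4, 5), ("SW", 5, 4, 0)], mem)
print(hex(mem[4]), hex(mem[5]))   # 0x0 0x12

# Repeated addition: R8 accumulates R10 once per count in R9.
regs = [0] * 16
regs[9], regs[10] = 2, 3
loop = [("SUB", 8, 8, 8), ("BEQ", 9, 0, 3), ("ADD", 10, 8, 8),
        ("SUB", 9, 1, 9), ("JMP", 1), ("HALT",)]
print(run(loop, bytearray(16), regs)[8])   # 6
```

Because step 2 runs before step 3, the BEQ offset is relative to the address of the following instruction, which is why `BEQ R9, R0, 3` at address 0x2 lands on address 0xA.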

(R = register file, M = memory)

HW ISA Abstract Machine (example 1)

R: Register File
R0 = 0x0000, R1 = 0x0001 (other registers blank)

M: Data Memory
Address       Contents
0x0 – 0x1     0x46 0x12
0x2 – 0x3     0xA9 0x56
0x4 – 0x5
0x6 – 0x7
…

IM: Instruction Memory
Address       Contents
0x0 – 0x1     LW R3, 0(R0)
0x2 – 0x3     LW R4, 2(R0)
0x4 – 0x5     AND R3, R4, R5
0x6 – 0x7     SW R5, 4(R0)
0x8 – 0x9
0xA – 0xB

Processor Loop
1. ins ← IM[PC]
2. PC ← PC + 2
3. Do ins

HW ISA Abstract Machine (example 2)

R: Register File
R0 = 0x0000, R1 = 0x0001, R9 = 2, R10 = 3 (other registers blank)

M: Data Memory
(blank)

IM: Instruction Memory
Address       Contents
0x0 – 0x1     SUB R8, R8, R8
0x2 – 0x3     BEQ R9, R0, 3
0x4 – 0x5     ADD R10, R8, R8
0x6 – 0x7     SUB R9, R1, R9
0x8 – 0x9     JMP 1
0xA – 0xB     HALT

HW arch microarchitecture

[Figure: the HW arch datapath in four numbered blocks: Instruction Fetch and Decode, Registers, ALU, Memory.]

One possible hardware implementation of the HW ISA

Instruction Fetch

Processor Loop
1. ins ← IM[PC]
2. PC ← PC + 2
3. Do ins

Fetch instruction from memory.

Increment program counter (PC) to point to the next instruction.

[Figure: the PC drives the Read Address input of Instruction Memory, which outputs the Instruction; an Add unit adds 2 to the PC.]

Arithmetic Instructions

Example: ADD R3, R6, R8
Opcode  Rs    Rt    Rd
0010    0011  0110  1000

16-bit Encoding
Instruction       Meaning               Opcode  Rs    Rt    Rd
ADD Rs, Rt, Rd    R[d] ← R[s] + R[t]    0010    0-15  0-15  0-15
SUB Rs, Rt, Rd    R[d] ← R[s] – R[t]    0011    0-15  0-15  0-15
AND Rs, Rt, Rd    R[d] ← R[s] & R[t]    0100    0-15  0-15  0-15
OR Rs, Rt, Rd     R[d] ← R[s] | R[t]    0101    0-15  0-15  0-15
...
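The four 4-bit fields pack into one 16-bit word with shifts and ORs, and unpack with shifts and masks. A minimal Python sketch, using the opcodes from the table; the function names `encode_arith` and `decode_fields` are hypothetical.

```python
OPCODES = {"ADD": 0b0010, "SUB": 0b0011, "AND": 0b0100, "OR": 0b0101}


def encode_arith(name: str, s: int, t: int, d: int) -> int:
    """Pack Opcode | Rs | Rt | Rd (MSB to LSB) into a 16-bit word."""
    for field in (s, t, d):
        if not 0 <= field <= 15:
            raise ValueError(f"register number out of range: {field}")
    return (OPCODES[name] << 12) | (s << 8) | (t << 4) | d


def decode_fields(word: int) -> tuple:
    """Split a 16-bit word into its four 4-bit fields."""
    return (word >> 12) & 0xF, (word >> 8) & 0xF, (word >> 4) & 0xF, word & 0xF


word = encode_arith("ADD", 3, 6, 8)
print(f"{word:016b}")        # 0010001101101000
print(decode_fields(word))   # (2, 3, 6, 8)
```

The decode side mirrors what the hardware does with wires: each 4-bit slice of the instruction is routed to a different input.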

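Instruction Decode, Register Access, ALU

[Figure: datapath for arithmetic instructions. The Instruction is split into 4-bit fields. Opcode (4 bits) sets the Reg Write (Write Enable) and ALU Op controls. Rs and Rt select Read Addr 1 and Read Addr 2 of the Register File; Rd selects Write Addr. The 16-bit Read Data 1 and Read Data 2 outputs feed the ALU, which produces a 16-bit ALU result (connected to Write Data) plus overflow and zero flags.]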

Memory Instructions

Example: SW R6, -8(R3) (the Rd field holds the offset)
Opcode  Rs    Rt    Rd
0001    0011  0110  1000

Instruction          Meaning                       Op    Rs    Rt    Rd
LW Rt, offset(Rs)    R[t] ← Mem[R[s] + offset]     0000  0-15  0-15  offset
SW Rt, offset(Rs)    Mem[R[s] + offset] ← R[t]     0001  0-15  0-15  offset
...
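In the SW example the offset field 1000 stands for -8, so the 4-bit field has to be sign-extended to 16 bits before it is added to R[s]. A rough Python sketch of that step; the helper names and the sample value R[3] = 0x0010 are made up for illustration.

```python
def sign_extend4(field: int) -> int:
    """Sign-extend a 4-bit field to a 16-bit two's-complement bit pattern."""
    field &= 0xF
    # If the top bit of the field is set, copy it into the upper 12 bits.
    return (field | 0xFFF0) if (field & 0x8) else field


def effective_address(rs_value: int, offset_field: int) -> int:
    """Compute R[s] + offset as a 16-bit sum, as the ALU would."""
    return (rs_value + sign_extend4(offset_field)) & 0xFFFF


print(hex(sign_extend4(0b1000)))               # 0xfff8, i.e. -8 in 16 bits
print(hex(effective_address(0x0010, 0b1000)))  # 0x8
```

Adding the extended pattern and discarding the carry gives the same result as subtracting 8, which is why one adder serves both positive and negative offsets.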

Memory access

How can we support arithmetic and memory instructions? What's shared?

[Figure: datapath extended with a Data Memory. A Control Unit takes the 4-bit Opcode and sets Reg Write, ALU Op, and Mem Store (the Data Memory's Write Enable). The Register File and ALU are as before; the ALU output drives the Data Memory Address, Read Data 2 drives its Write Data, and its Read Data returns to the Register File. The 4-bit Rd (offset) field passes through a Sign extend unit to become a 16-bit value.]

Choose with MUXs

[Figure: the same datapath with two-input MUXs (inputs labeled 0 and 1) added to choose between Rt and Rd for Write Addr, between Read Data 2 and the sign-extended offset for the ALU input, and between the ALU result and the Data Memory Read Data for Write Data. A Mem control signal is added alongside Reg Write, ALU Op, and Mem Store.]

Control-flow Instructions

Example: BEQ R1, R2, -2
Op    Rs    Rt    Rd
0111  0001  0010  1110

16-bit Encoding
Instruction           Meaning                                     Op    Rs    Rt    Rd
BEQ Rs, Rt, offset    If R[s] == R[t] then PC ← PC + offset*2     0111  0-15  0-15  offset
...
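The BEQ meaning can be worked through numerically. The sketch below assumes, following the processor loop (PC ← PC + 2 happens before the instruction executes), that the offset is applied to the already-incremented PC. The function names are hypothetical.

```python
def branch_target(pc_plus_2: int, offset_field: int) -> int:
    """Return PC + offset*2 for a 4-bit two's-complement offset field."""
    offset = offset_field & 0xF
    if offset & 0x8:           # top bit set: the field encodes a negative value
        offset -= 16
    return (pc_plus_2 + (offset << 1)) & 0xFFFF   # shift left by 1 == times 2


def next_pc(pc_plus_2: int, offset_field: int, rs_value: int, rt_value: int) -> int:
    """Pick the branch target if R[s] == R[t], else fall through."""
    if rs_value == rt_value:
        return branch_target(pc_plus_2, offset_field)
    return pc_plus_2


# BEQ with offset field 1110 (-2), fetched from address 0x8, so PC is 0xA.
print(hex(branch_target(0xA, 0b1110)))   # 0x6
print(hex(next_pc(0xA, 0b1110, 5, 5)))   # 0x6 (taken)
print(hex(next_pc(0xA, 0b1110, 5, 7)))   # 0xa (not taken)
```

Multiplying by 2 converts an offset counted in instructions into one counted in bytes, since every instruction occupies two bytes.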

Compute branch target

[Figure: the fetch and arithmetic datapath with branch-target logic added. One adder computes PC + 2. The 4-bit offset is sign-extended to 16 bits, shifted left by 1, and added to PC + 2 by a second adder. The Control Unit, Instruction Memory, Register File, and ALU (with overflow and zero outputs) are as before.]

Make branch decision

[Figure: the full datapath, including Data Memory. A MUX, controlled by a Branch? signal together with the ALU zero output, chooses between PC + 2 and the computed branch target as the next PC. The Control Unit outputs ALU Op, Reg Write, Mem Store, Mem, and Branch?.]

HW arch: not the only implementation

Single-cycle architecture
• Simple, "easily" fits on a slide (and in your head).
• One instruction takes one clock cycle.
• Slowest instruction determines minimum clock cycle.
• Inefficient.

Could it be better?
• Performance, energy, debugging, security, reconfigurability, …
• Pipelining
• OoO: out-of-order
• SIMD: single instruction multiple data
• Caching
• vs. direct hardware implementation
• … enormous, interesting design space of
