How to Speak Computer

High Level Language temp = v[k]; Program v[k] = v[k+1]; v[k+1] = temp; Compiler lw $15, 0($2) Instruction Set Architecture Assembly Language lw $16, 4($2) Program sw $16, 0($2) sw $15, 4($2) Assembler or 1000110001100010000000000000000 Machine Language 1000110011110010000000000000100 “How to talk to computers if Program 1010110011110010000000000000000 you aren’t on Star Trek” 1010110001100010000000000000100 Machine Interpretation

Control Signal Spec ALUOP[0:3] <= InstReg[9:11] & MASK CSE 240A Dean Tullsen CSE 240A Dean Tullsen

Crafting an ISA Where are the instructions?

• Designing an ISA is both an art and a science • • ISA design involves dealing in an extremely rare resource – instruction bits! inst & inst cpu data • Some things we want out of our ISA storage storage – completeness cpu data “stored-program” computer – orthogonality storage – regularity and simplicity – compactness

– ease of programming L1 – ease of implementation inst L2 cpu Mem cache L1 data cache CSE 240A Dean Tullsen CSE 240A Dean Tullsen Key ISA decisions Choice 1: Operand Location

destination operand operation • operations y = x + b • –how many? • Stack – which ones source operands Registers • operands • –how many? • Memory – location –types • We can classify most machines into 4 types: accumulator, – how to specify? how does the computer know what stack, register-memory (most operands can be registers or • instruction format 0001 0100 1101 1111 means? memory), load-store (arithmetic operations must have –size register operands). – how many formats?

CSE 240A Dean Tullsen CSE 240A Dean Tullsen

Choice 1B: How Many Operands? Alternative ISA’s Basic ISA Classes • A = X*Y - B*C

Accumulator: Stack Architecture Accumulator GPR GPR (Load-store) 1 address add A acc ← acc + mem[A]

Stack: 0 address add tos ← tos + next

General Purpose Register: 2 address add A B EA(A) ← EA(A) + EA(B) 3 address add A B C EA(A) ← EA(B) + EA(C) Stack Memory Accumulator Load/Store: A ? 3 address add Ra Rb Rc Ra ← Rb + Rc X 12 load Ra Rb Ra ← mem[Rb] R1 Y 3 B 4 store Ra Rb mem[Rb] ← Ra R2 C 5 R3 temp ? A load/store architecture has instructions that do either ALU CSE 240Aoperations or access memory, but never both. Dean Tullsen CSE 240A Dean Tullsen Choice 2: Addressing Modes Utilization how do we specify the operand we want?

• Register direct R3 R6 = R5 + R3 TeX 1% Memory indirect spice 6% • Immediate (literal) #25 R6 = R5 + 25 gcc 1% • Direct (absolute) M[10000] R6 = M[10000] TeX 0% Scaled spice 16% • Register indirect M[R3] R6 = M[R3] gcc 6%

(a.k.a register deferred) TeX 24% • Memory Indirect M[M[R3] ] Register deferred spice 3% gcc 11% • Displacement M[R3 + 10000] ... TeX 43% • Index M[R3 + R4] Immediate spice 17% 39% • Scaled M[R3 + R4*d + 10000] gcc TeX 32% • Autoincrement M[R3++] Displacement spice 55% • Autodecrement M[R3 - -] gcc 40% 0% 10% 20% 30% 40% 50% 60% Frequency of the addressing mode Conclusion? CSE 240A Dean Tullsen CSE 240A Dean Tullsen

Displacement Size Choice 3: Which Operations?

30% • arithmetic Integer average – add, subtract, multiply, divide 25% • logical Floating-point average 20% Percentage of – and, or, shift left, shift right displacement 15% • data transfer

10% – load word, store word • control flow 5%

0% 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 Does it make sense to have more complex instructions?

Value -e.g., square root, mult-add, matrix multiply, cross product ... • Conclusions?

CSE 240A Dean Tullsen CSE 240A Dean Tullsen Types of branches (control flow) Conditional branch

• conditional branch beq r1,r2, label • jump jump label • How do you specify the destination (target) of a • procedure call call label branch/jump? • procedure return return • How do we specify the condition of the branch?

CSE 240A Dean Tullsen CSE 240A Dean Tullsen

Branch distance Branch condition

Condition Codes status bits are set as a side-effect of arithmetic instructions or explicitly by compare or test instructions. ex: sub r1, r2, r3 bz label

Condition Register Ex: cmp r1, r2, r3 bgt r1, label

Compare and Branch Ex: bgt r1, r2, label • Conclusions?

CSE 240A Dean Tullsen CSE 240A Dean Tullsen Choice 4: Instruction Format The Customer is Always Right

Fixed (e.g., all RISC processors -- SPARC, MIPS, Alpha) • Compiler is primary customer of ISA opcode addr1 addr2 addr3 • Features the compiler doesn’t use are wasted Variable (VAX, ...) • Register allocation is a huge contributor to performance opcode+ spec1 addr1 spec2 addr2 ... specn addrn • Compiler-writer’s job is made easier when ISA has

Hybrid – regularity – primitives, not solutions – simple trade-offs • Summary -> simplicity over power • Tradeoffs? • Conclusions?

CSE 240A Dean Tullsen CSE 240A Dean Tullsen

Our desired ISA MIPS instruction set architecture

• Registers, Load-store • 32 32-bit general-purpose registers • Addressing modes – R0 always equals zero – immediate (8-16 bits) – 32 or 16 FP registers – displacement (12-16 bits) • 8-, 16-, and 32-bit integers, 32- and 64-bit fp data types – register deferred (register indirect) • immediate and displacement addressing modes • Support a reasonable number of operations – register deferred is a subset of displacement • Don’t use condition codes • 32-bit fixed-length instruction encoding • Fixed instruction encoding/length for performance • regularity (several general-purpose registers)

CSE 240A Dean Tullsen CSE 240A Dean Tullsen MIPS Instruction Format RISC vs CISC

• MIPS is a classic RISC architectures (as are SPARC, Alpha, PowerPC, …) • RISC stands for Reduced Instruction Set Computer. RISC architectures are load-store, few formats, minimal instruction sets. • They were in contrast to the 70s and 80s which proliferated CISC ISAs (VAX, Intel , various IBM), which were characterized by complex and comprehensive instruction sets, and complex instruction decoding. • RISC architectures thrived not because they supported fewer operations, but because they enabled parallelism.

CSE 240A Dean Tullsen CSE 240A Dean Tullsen

MIPS Operations and ISA ISA Key Points

• Read on your own! • Modern ISA’s typically sacrifice power and flexibility for • Get comfortable with MIPS instructions and formats regularity and simplicity; code density for parallelism and throughput. • instruction bits are extremely limited, particularly in a fixed-length instruction format. • Registers are critical to performance – we want lots of them, and few strings attached. • Displacement addressing mode handles the vast majority of memory reference needs.

CSE 240A Dean Tullsen CSE 240A Dean Tullsen