CHAPTER 2: INSTRUCTION SET PRINCIPLES

Prepared by Mdm Rohaya binti Abu Hassan

• Instruction Set Architecture
• Classification of ISA/Types of machine
• Primary advantages and disadvantages of each class of machine
• Classification of General Purpose Register Machines
• Addressing modes
• Aligning Addresses
• Interpreting Memory Addresses
• Addressing Modes for Desktops and Servers
• Addressing Mode Usage (VAX)
• DLX Instruction Set

Instruction Sets
• What is an instruction set?
• The set of all instructions understood by the CPU
• Each instruction is directly executed in hardware

Instruction Set Architecture (ISA)

• Definition: The instruction set architecture is, informally, the set of programmer-visible registers and address spaces, together with the set of instructions that can operate on them.

• The instruction set architecture of a machine fills the semantic gap between the user and the machine.
• The ISA specifies the size of main memory, the number of registers, and the number of bits per instruction.
• It also specifies exactly which instructions the machine is capable of performing and how each of the instruction bits is interpreted.

• It is all of the programmer-visible components and operations of the computer.
• The ISA provides all the information needed for someone to
  • write a program in machine language, or
  • translate from a high-level language to machine language.

• ISA serves as the starting point for the design of a new machine or modification of an existing one.

[Figure: the instruction set as the layer between software and hardware]

Classification of ISA
• The type of internal storage in the CPU is the most basic differentiation.
• The major choices are:
  • a stack: the operands are implicitly on top of the stack
  • an accumulator: one operand is implicitly the accumulator
  • a set of registers: all operands are explicit, either registers or memory locations

• Stack architecture: Operands are implicit; they are on the top of the stack. For example, a binary operation pops the top two elements of the stack, applies the operation, and pushes the result back on the stack.
• Accumulator architecture: One of the operands is implicitly the accumulator. Usually the accumulator is assumed to be both one source operand and the destination.
• General-purpose register architecture: Operands are explicit, either memory operands or register operands. Any instruction can access memory.
• Load/store architecture: Only load and store instructions can access memory; all other instructions use registers. Also referred to as register-register architecture.

Types of Machines

Code Sequence for C=A+B

Stack        Accumulator    Register-memory    Register-register
Push A       Load A         Load R1, A         Load R1, A
Push B       Add B          Add R1, B          Load R2, B
Add          Store C        Store C, R1        Add R3, R1, R2
Pop C                                          Store C, R3

Classification of ISA
• While most early machines used stack or accumulator-style architectures, machines designed after 1980 use a general-purpose register architecture.
• Stack architecture: early machines
• Accumulator architecture: early machines
• General-purpose register (GPR) architecture: machines after 1980
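The four code sequences for C = A + B can be traced with a toy interpreter. The sketch below is only illustrative: the function names, the tuple encoding of instructions, and the use of a Python dictionary as memory are assumptions, not part of any real machine.

```python
def run_stack(program, memory):
    """Stack machine: operands are implicitly on top of the stack."""
    stack = []
    for op, *args in program:
        if op == "Push":                      # Push addr
            stack.append(memory[args[0]])
        elif op == "Add":                     # pops two operands, pushes the sum
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif op == "Pop":                     # Pop addr
            memory[args[0]] = stack.pop()
        else:
            raise ValueError(op)
    return memory


def run_accumulator(program, memory):
    """Accumulator machine: one operand is implicitly the accumulator."""
    acc = 0
    for op, addr in program:
        if op == "Load":
            acc = memory[addr]
        elif op == "Add":
            acc += memory[addr]
        elif op == "Store":
            memory[addr] = acc
        else:
            raise ValueError(op)
    return memory


def run_gpr(program, memory):
    """GPR machine: handles both register-memory and register-register Add."""
    regs = {}
    for op, *args in program:
        if op == "Load":                      # Load Rd, addr
            regs[args[0]] = memory[args[1]]
        elif op == "Add" and len(args) == 2:  # Add Rd, addr (register-memory)
            regs[args[0]] += memory[args[1]]
        elif op == "Add":                     # Add Rd, Rs1, Rs2 (register-register)
            regs[args[0]] = regs[args[1]] + regs[args[2]]
        elif op == "Store":                   # Store addr, Rs
            memory[args[0]] = regs[args[1]]
        else:
            raise ValueError(op)
    return memory


PROGRAMS = {
    "stack": (run_stack, [("Push", "A"), ("Push", "B"), ("Add",), ("Pop", "C")]),
    "accumulator": (run_accumulator, [("Load", "A"), ("Add", "B"), ("Store", "C")]),
    "register-memory": (run_gpr, [("Load", "R1", "A"), ("Add", "R1", "B"),
                                  ("Store", "C", "R1")]),
    "register-register": (run_gpr, [("Load", "R1", "A"), ("Load", "R2", "B"),
                                    ("Add", "R3", "R1", "R2"), ("Store", "C", "R3")]),
}

# Run every sequence on the same (arbitrary) inputs A = 2, B = 3.
results = {name: runner(program, {"A": 2, "B": 3, "C": 0})["C"]
           for name, (runner, program) in PROGRAMS.items()}
```

All four machine types compute the same value for C; they differ only in how many operands each instruction names and where those operands live.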

Classification of ISA
Reasons for the emergence of general-purpose register (GPR) machines:
1. Registers are faster than memory.
2. Registers are easier for a compiler to use, and can be used more effectively.
   • Example: how is (A*B)-(C*D)-(E*F) evaluated on a stack machine? On a GPR machine?
3. Registers can be used to hold variables, which reduces memory traffic, improves code density, and speeds up the program.
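For the stack-machine half of the example, one possible answer can be sketched as follows. The code generates stack code for (A*B)-(C*D)-(E*F) by a post-order walk of the expression tree and then executes it. The `Mul` and `Sub` instruction names, the tuple encoding, and the sample values are assumptions made for illustration.

```python
import operator

# Hypothetical stack-machine opcodes for the two operators used in the example.
OPS = {"*": ("Mul", operator.mul), "-": ("Sub", operator.sub)}
BY_NAME = {name: fn for name, fn in OPS.values()}


def compile_to_stack(expr):
    """Post-order walk: code for the left operand, the right operand, then the operator."""
    if isinstance(expr, str):                 # a leaf is a variable name
        return [("Push", expr)]
    op, left, right = expr
    return compile_to_stack(left) + compile_to_stack(right) + [(OPS[op][0],)]


def run_stack(code, memory):
    """Execute the generated code; the result is left on top of the stack."""
    stack = []
    for instr in code:
        if instr[0] == "Push":
            stack.append(memory[instr[1]])
        else:
            right = stack.pop()
            left = stack.pop()
            stack.append(BY_NAME[instr[0]](left, right))
    return stack.pop()


# (A*B) - (C*D) - (E*F), with subtraction grouped left to right.
expr = ("-", ("-", ("*", "A", "B"), ("*", "C", "D")), ("*", "E", "F"))
memory = {"A": 7, "B": 6, "C": 2, "D": 5, "E": 3, "F": 4}

code = compile_to_stack(expr)
result = run_stack(code, memory)
```

The stack code fixes a single evaluation order, and an operand that is needed twice must be pushed from memory twice. A GPR machine could compute the three products in any order and keep the intermediate values in registers.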

Classification of General Purpose Register Machines

• Two major instruction set characteristics divide GPR architectures:

1. Whether an ALU instruction has two or three operands
2. How many of the operands may be memory addresses in an ALU instruction

Types of GPR Machines

Number of memory addresses   Max # of operands allowed   Examples
0                            3                           SPARC, MIPS, PA, PowerPC
1                            2                           Intel 80X86, Motorola 68000
2                            2                           VAX
3                            3                           VAX

Classification of General Purpose Register Machines

1. Whether an ALU instruction has two or three operands
• Example:
  ADD R3, R1, R2    R3 <- R1 + R2    (3 operands: R1, R2 and R3)
  or
  ADD R1, R2        R1 <- R1 + R2    (2 operands: R1 and R2)

2. How many of the operands may be memory addresses in an ALU instruction
• Register-register (load/store): ADD R3, R1, R2    (R3 <- R1 + R2)
• Register-memory: ADD R1, A    (R1 <- R1 + A)
• Memory-memory: ADD C, A, B    (C <- A + B)

Primary advantages and disadvantages of each class of machine

Stack
• Advantages: Simple model of expression evaluation. Good code density.
• Disadvantages: A stack can't be randomly accessed, which makes it difficult to generate efficient code.

Accumulator
• Advantages: Minimizes the internal state of the machine. Short instructions.
• Disadvantages: Since the accumulator is the only temporary storage, memory traffic is highest.

Register
• Advantages: Most general model for code generation.
• Disadvantages: All operands must be named, leading to longer instructions.

Addressing modes
• ISA design must define how memory addresses are interpreted and specified in the instructions.
• Addressing modes are the ways in which an architecture specifies the address of an object it wants to access.
• In GPR machines, an addressing mode can specify a constant, a register or a location in memory.
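How an addressing mode selects its operand can be sketched in a few lines of Python. The sketch covers the modes tabulated in this section and uses the M[...] notation from the Meaning column. The function name `fetch_operand`, the keyword names `r`, `x` and `const`, the dictionary model of memory, and the element size d = 4 are all assumptions for illustration.

```python
def fetch_operand(mode, regs, mem, d=4, **f):
    """Return the source operand selected by an addressing mode.

    regs maps register names to values, mem maps addresses to values,
    d is the assumed element size in bytes.
    """
    if mode == "register":             # Add R4, R3        -> R3
        return regs[f["r"]]
    if mode == "immediate":            # Add R4, #3        -> 3
        return f["const"]
    if mode == "displacement":         # Add R4, 100(R1)   -> M[100+R1]
        return mem[f["const"] + regs[f["r"]]]
    if mode == "register_deferred":    # Add R4, (R1)      -> M[R1]
        return mem[regs[f["r"]]]
    if mode == "indexed":              # Add R3, (R1+R2)   -> M[R1+R2]
        return mem[regs[f["r"]] + regs[f["x"]]]
    if mode == "direct":               # Add R1, (1001)    -> M[1001]
        return mem[f["const"]]
    if mode == "memory_deferred":      # Add R1, @(R3)     -> M[M[R3]]
        return mem[mem[regs[f["r"]]]]
    if mode == "autoincrement":        # Add R1, (R2)+     -> M[R2], then R2 += d
        value = mem[regs[f["r"]]]
        regs[f["r"]] += d
        return value
    if mode == "autodecrement":        # Add R1, -(R2)     -> R2 -= d, then M[R2]
        regs[f["r"]] -= d
        return mem[regs[f["r"]]]
    if mode == "scaled":               # Add R1, 100(R2)[R3] -> M[100+R2+R3*d]
        return mem[f["const"] + regs[f["r"]] + regs[f["x"]] * d]
    raise ValueError(f"unknown addressing mode: {mode}")


# Arbitrary register and memory contents chosen for the demonstration.
regs = {"R1": 100, "R2": 8, "R3": 2}
mem = {11: 55, 100: 11, 108: 22, 200: 33, 208: 66, 1001: 44}

values = {
    "register": fetch_operand("register", regs, mem, r="R3"),
    "immediate": fetch_operand("immediate", regs, mem, const=3),
    "displacement": fetch_operand("displacement", regs, mem, const=100, r="R1"),
    "register_deferred": fetch_operand("register_deferred", regs, mem, r="R1"),
    "indexed": fetch_operand("indexed", regs, mem, r="R1", x="R2"),
    "direct": fetch_operand("direct", regs, mem, const=1001),
    "memory_deferred": fetch_operand("memory_deferred", regs, mem, r="R1"),
    "scaled": fetch_operand("scaled", regs, mem, const=100, r="R1", x="R3"),
    "autoincrement": fetch_operand("autoincrement", regs, mem, r="R1"),
    "autodecrement": fetch_operand("autodecrement", regs, mem, r="R1"),
}
```

Only autoincrement and autodecrement modify a register as a side effect; every other mode just computes an address (or a value) from the fields of the instruction.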

Addressing mode     Example instruction     Meaning                           When used
Register            Add R4, R3              R4 <- R4 + R3                     When a value is in a register
Immediate           Add R4, #3              R4 <- R4 + 3                      For constants
Displacement        Add R4, 100(R1)         R4 <- R4 + M[100+R1]              Accessing local variables
Register deferred   Add R4, (R1)            R4 <- R4 + M[R1]                  Accessing using a pointer or a computed address
Indexed             Add R3, (R1 + R2)       R3 <- R3 + M[R1+R2]               Useful in array addressing: R1 - base of array, R2 - index amount
Direct              Add R1, (1001)          R1 <- R1 + M[1001]                Useful in accessing static data
Memory deferred     Add R1, @(R3)           R1 <- R1 + M[M[R3]]               If R3 is the address of a pointer p, then the mode yields *p
Autoincrement       Add R1, (R2)+           R1 <- R1 + M[R2]; R2 <- R2 + d    Useful for stepping through arrays in a loop: R2 - start of array, d - size of an element
Autodecrement       Add R1, -(R2)           R2 <- R2 - d; R1 <- R1 + M[R2]    Same use as autoincrement. Both can also be used to implement a stack as push and pop
Scaled              Add R1, 100(R2)[R3]     R1 <- R1 + M[100+R2+R3*d]         Used to index arrays. May be applied to any base addressing mode in some machines

Addressing modes - Notation
<-    assignment
M     the name for memory: M[R1] refers to the contents of the memory location whose address is given by the contents of R1

Interpreting Memory Addresses
• How is a memory address interpreted?
• Byte addressed: provides access to bytes (8 bits), halfwords (16 bits), words (32 bits), and doublewords (64 bits)
• Conventions for ordering the bytes within a word:
• Little Endian: puts the byte whose address is xxxx00 at the least significant (LSB) position.
  • Used by DEC and Intel

Word address   Data
0              3 2 1 0
4              7 6 5 4

• Big Endian: puts the byte whose address is xxxx00 at the most significant (MSB) position.
  • Used by IBM, Motorola and others

Word address   Data
0              0 1 2 3
4              4 5 6 7

Example
• To store a word in byte-addressable memory (i.e. where each element of memory is one byte), the 32-bit quantity has to be broken up into 4 bytes.
• Thus, the word 0x01ab23cd is broken up into 0x01, 0xab, 0x23, 0xcd.

• When operating within one machine, the byte order is often unnoticeable: only programs that access the same locations as both words and bytes can notice the difference.
• However, byte order is a problem when exchanging data among machines with different ordering.
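The byte ordering of the example word 0x01ab23cd can be checked with Python's standard `struct` module, where the format `"<I"` packs a 32-bit unsigned integer in little-endian order and `">I"` packs it in big-endian order. This is a small sketch; the variable names are illustrative.

```python
import struct

word = 0x01AB23CD

# Bytes as they would appear at increasing byte addresses 0, 1, 2, 3.
little = struct.pack("<I", word)   # little endian: least significant byte first
big = struct.pack(">I", word)      # big endian: most significant byte first

little_hex = [f"0x{b:02x}" for b in little]
big_hex = [f"0x{b:02x}" for b in big]

# Exchanging data between machines with different ordering:
# bytes written little endian but read back as big endian give a different word.
misread = struct.unpack(">I", little)[0]
```

Here `little_hex` lists the bytes as 0xcd, 0x23, 0xab, 0x01 and `big_hex` as 0x01, 0xab, 0x23, 0xcd, while `misread` is 0xcd23ab01 rather than the original word.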

Aligning Addresses
• In some machines, accesses to objects larger than a byte must be aligned. An access to an object of size s bytes at byte address A is aligned if A mod s = 0.

Object addressed   Aligned at byte offset   Misaligned at byte offset
byte               0,1,2,3,4,5,6,7          never
halfword           0,2,4,6                  1,3,5,7
word               0,4                      1,2,3,5,6,7
doubleword         0                        1,2,3,4,5,6,7

Quantity               Address divisible by   (Binary) address ends in
Byte                   1                      anything
Halfword (16 bits)     2                      0
Word (32 bits)         4                      00
Doubleword (64 bits)   8                      000

• Misalignment causes hardware complications, since the memory is typically aligned on a word boundary.
• A misaligned memory access will, therefore, take multiple aligned memory references.
• On many machines, a misaligned access instead raises an alignment fault that must be handled by the OS.
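The alignment rule A mod s = 0 and the cost of a misaligned access can be sketched directly. The function names below are illustrative, and the 4-byte memory width used to count aligned references is an assumption.

```python
SIZES = {"byte": 1, "halfword": 2, "word": 4, "doubleword": 8}


def is_aligned(address, size):
    """An access of `size` bytes at byte address `address` is aligned if A mod s = 0."""
    return address % size == 0


def is_aligned_bits(address, size):
    """Same test for power-of-two sizes: the low log2(size) address bits must be 0."""
    return address & (size - 1) == 0


def aligned_offsets(size, span=8):
    """Byte offsets within a `span`-byte block at which the access is aligned."""
    return [a for a in range(span) if is_aligned(a, size)]


def aligned_references(address, size, width=4):
    """Number of aligned `width`-byte memory references the access touches."""
    first = address // width
    last = (address + size - 1) // width
    return last - first + 1


# Rebuild the aligned/misaligned columns of the table for each object size.
table = {name: aligned_offsets(size) for name, size in SIZES.items()}
```

With the assumed 4-byte memory width, an aligned word access touches one memory word, while a word access at byte offset 1, 2 or 3 straddles two.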

Addressing Mode
• How do architectures specify the address of an object they will access?
• In a GPR machine, an addressing mode can specify
  • a constant,
  • a register,
  • a location in memory (the address computed by the mode is the effective address).
• Immediates or literals are usually considered memory addressing modes.
• Addressing modes that depend on the program counter are called PC-relative addressing.
• Addressing modes can significantly reduce instruction counts, but may add to the complexity of building a machine and increase the average CPI.

Addressing Modes for Desktops and Servers

Register            ADD R4, R3
Immediate           ADD R4, #3
Displacement        ADD R4, 100(R1)
Register Indirect   ADD R4, (R1)
Indexed             ADD R3, (R1+R2)
Direct (Absolute)   ADD R1, (1001)
Memory Indirect     ADD R1, @(R3)
Autoincrement       ADD R1, (R2)+
Autodecrement       ADD R1, -(R2)
Scaled              ADD R1, 100(R2)[R3]

Addressing Mode Usage (VAX)

Operations in the Instruction Set
• Data transfer instructions
• Arithmetic and logic instructions
• Instructions for control flow: conditional and unconditional branches, jumps, procedure calls and procedure returns
• System calls
• Floating point instructions
• Decimal instructions
• String instructions
• Graphics instructions

DLX
• The DLX (pronounced "Deluxe") is a RISC processor architecture designed by John L. Hennessy and David A. Patterson, the principal designers of the MIPS and the Berkeley RISC designs (respectively), the two benchmark examples of RISC design.
• The DLX is essentially a cleaned-up and simplified MIPS, with a simple 32-bit load/store architecture. Intended primarily for teaching purposes, the DLX design is widely used in university-level courses.

DLX Instruction Set
• The architecture of DLX was chosen based on observations about the most frequently used primitives in programs. DLX provides a good architectural model for study, not only because of the recent popularity of this type of machine, but also because it is easy to understand.
• Like most recent load/store machines, DLX emphasizes
  • A simple load/store instruction set
  • Design for pipelining efficiency
  • An easily decoded instruction set
  • Efficiency as a compiler target

DLX’s Operation
1. Load/Store
Any of the GPRs or FPRs may be loaded and stored, except that loading R0 has no effect.

2. ALU Operations
All ALU instructions are register-register instructions. The operations are:
- add, subtract, AND, OR, XOR, shifts
Compare instructions compare two registers (=, !=, <, >, <=, >=). If the condition is true, these instructions place a 1 in the destination register; otherwise they place a 0.

3. Branches/Jumps
All branches are conditional. The branch condition is specified by the instruction, which may test the register source for zero or nonzero.

4. Floating-Point Operations
- add
- subtract
- multiply
- divide

• There are four classes of instructions:
1. Load/Store
2. ALU Operations
3. Branches/Jumps
4. Floating-Point Operations
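Three of the behaviours described above (loading R0 has no effect, compare instructions write 1 or 0, branches test a register for zero or nonzero) can be sketched as a toy model. The class and function names are hypothetical, the mnemonics SEQ, SNE, SLT, SGT, SLE and SGE are the set-conditional forms S__ with the conditions listed in the instruction table, and 32-bit wraparound is deliberately not modelled.

```python
import operator


class DLXRegisters:
    """Toy register file with 32 GPRs; R0 always reads as 0 (writes are ignored)."""

    def __init__(self):
        self._r = [0] * 32

    def read(self, n):
        return self._r[n]

    def write(self, n, value):
        if n != 0:                    # loading R0 has no effect
            self._r[n] = value        # plain Python ints: no 32-bit wraparound


# Set-conditional mnemonics S__ for the six compare conditions.
COMPARE = {
    "SEQ": operator.eq, "SNE": operator.ne,
    "SLT": operator.lt, "SGT": operator.gt,
    "SLE": operator.le, "SGE": operator.ge,
}


def set_conditional(regs, op, rd, rs1, rs2):
    """Place 1 in rd if the comparison of rs1 and rs2 holds, otherwise 0."""
    holds = COMPARE[op](regs.read(rs1), regs.read(rs2))
    regs.write(rd, 1 if holds else 0)


def branch_taken(regs, op, rs):
    """BEQZ branches when the register is zero, BNEZ when it is nonzero."""
    value = regs.read(rs)
    return value == 0 if op == "BEQZ" else value != 0


regs = DLXRegisters()
regs.write(0, 99)                         # ignored: R0 stays 0
regs.write(1, 5)
regs.write(2, 9)
set_conditional(regs, "SLT", 3, 1, 2)     # R3 <- (R1 < R2)
set_conditional(regs, "SEQ", 4, 1, 2)     # R4 <- (R1 == R2)
taken_bnez = branch_taken(regs, "BNEZ", 3)
taken_beqz = branch_taken(regs, "BEQZ", 4)
```

Because compares deposit their result in an ordinary register, a conditional branch only ever needs to test one register against zero, which keeps the branch instructions simple.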

DLX Instruction Set

Instruction                         Meaning
Data transfers                      Move data between registers and memory, or between the integer and FP or special registers; the only memory addressing mode is 16-bit displacement + contents of a GPR
LB, LBU, SB                         Load byte, load byte unsigned, store byte
LH, LHU, SH                         Load halfword, load halfword unsigned, store halfword
LW, SW                              Load word, store word (to/from integer registers)
LF, LD, SF, SD                      Load SP float, load DP float, store SP float, store DP float (SP - single precision, DP - double precision)
MOVI2S, MOVS2I                      Move from/to GPR to/from a special register
MOVF, MOVD                          Copy one floating-point register or a DP pair to another register or pair
MOVFP2I, MOVI2FP                    Move 32 bits from/to FP register to/from integer registers

Arithmetic / Logical                Operations on integer or logical data in GPRs; signed arithmetic traps on overflow
ADD, ADDI, ADDU, ADDUI              Add, add immediate (all immediates are 16 bits); signed and unsigned
SUB, SUBI, SUBU, SUBUI              Subtract, subtract immediate; signed and unsigned
MULT, MULTU, DIV, DIVU              Multiply and divide, signed and unsigned; operands must be floating-point registers; all operations take and yield 32-bit values
AND, ANDI                           And, and immediate
OR, ORI, XOR, XORI                  Or, or immediate, exclusive or, exclusive or immediate
LHI                                 Load high immediate - loads upper half of register with immediate
SLL, SRL, SRA, SLLI, SRLI, SRAI     Shifts: both immediate (S__I) and variable form (S__); shifts are shift left logical, right logical, right arithmetic
S__, S__I                           Set conditional: "__" may be LT, GT, LE, GE, EQ, NE
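Because all immediates are 16 bits, a full 32-bit constant cannot be supplied by a single instruction. One common way to build it is LHI for the upper half followed by ORI for the lower half. The sketch below models that arithmetic under two stated assumptions: that LHI clears the lower 16 bits of the register, and that ORI zero-extends its immediate. The function names are illustrative, and `sign_extend16` shows the sign extension that a signed 16-bit immediate would need.

```python
MASK32 = 0xFFFFFFFF


def sign_extend16(imm):
    """Interpret the low 16 bits of imm as a signed two's-complement value."""
    imm &= 0xFFFF
    return imm - 0x10000 if imm & 0x8000 else imm


def lhi(imm):
    """LHI: place the 16-bit immediate in the upper half (lower half assumed zero)."""
    return (imm & 0xFFFF) << 16


def ori(value, imm):
    """ORI: OR the register value with the immediate (assumed zero-extended)."""
    return (value | (imm & 0xFFFF)) & MASK32


def load_constant(constant):
    """Build a 32-bit constant from two 16-bit immediates: LHI then ORI."""
    high = (constant >> 16) & 0xFFFF
    low = constant & 0xFFFF
    return ori(lhi(high), low)


upper = lhi(0x01AB)
built = load_constant(0x01AB23CD)
```

The two-instruction sequence is the price of fixed 16-bit immediate fields: small constants fit in one instruction, while large ones are assembled from halves.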

Instruction                         Meaning
Control                             Conditional branches and jumps; PC-relative or through register
BEQZ, BNEZ                          Branch GPR equal/not equal to zero; 16-bit offset from PC
BFPT, BFPF                          Test comparison bit in the FP status register and branch; 16-bit offset from PC
J, JR                               Jumps: 26-bit offset from PC (J) or target in register (JR)
JAL, JALR                           Jump and link: save PC+4 to R31; target is PC-relative (JAL) or a register (JALR)
TRAP                                Transfer to operating system at a vectored address
RFE                                 Return to user code from an exception; restore user mode

Floating point                      Floating-point operations on DP and SP formats
ADDD, ADDF                          Add DP, SP numbers
SUBD, SUBF                          Subtract DP, SP numbers
MULTD, MULTF                        Multiply DP, SP floating point
DIVD, DIVF                          Divide DP, SP floating point
CVTF2D, CVTF2I, CVTD2F,             Convert instructions: CVTx2y converts from type x to type y, where x and y are one of
CVTD2I, CVTI2F, CVTI2D              I (Integer), D (Double precision), or F (Single precision). Both operands are in the FP registers.
__D, __F                            DP and SP compares: "__" may be LT, GT, LE, GE, EQ, NE; set comparison bit in FP status register