Organization EECC 550 • Introduction: Modern Computer Design Levels, Components, Technology Trends, Register Transfer Week 1 Notation (RTN). [Chapters 1, 2]

• Instruction Set Architecture (ISA) Characteristics and Classifications: CISC Vs. RISC. [Chapter 2] Week 2 • MIPS: An Example RISC ISA. Syntax, Instruction Formats, Addressing Modes, Encoding & Examples. [Chapter 2]

• Central Processor Unit (CPU) & Computer System Performance Measures. [Chapter 1] 3rd Edition Ch. 4 Week 3 • CPU Organization: Datapath & Control Unit Design. [Chapter 4] 3rd Edition Ch. 5 Week 4 – MIPS Single Cycle Datapath & Control Unit Design. – MIPS Multicycle Datapath and Finite State Machine Control Unit Design. 3rd Edition Ch. 5 (not in 4th) Week 5 • Microprogrammed Control Unit Design. 3rd Edition Ch. 5 (not in 4th Edition) – Microprogramming Project Week 6 • Midterm Review and Midterm Exam Week 7 • CPU Pipelining. [Chapter 4] 3rd Edition Ch. 6 • The Memory Hierarchy: Cache Design & Performance. [Chapter 5] Week 8 3rd Edition Ch. 7 • The Memory Hierarchy: Main & Virtual Memory. [Chapter 5] Week 9 • Input/Output Organization & System Performance Evaluation. [Chapter 7] 3rd Edition Ch. 8

Week 10 • Computer Arithmetic & ALU Design. [Chapter 3] If time permits. Week 11 • Final Exam. EECC550 - Shaaban #1 Lec # 1 Winter 2011 11-29-2011 Computing System History/Trends + Instruction Set Architecture (ISA) Fundamentals

• Computing Element Choices: – Computing Element Programmability – Spatial vs. Temporal Computing – Main Processor Types/Applications • General Purpose Processor Generations • The Von Neumann Computer Model • CPU Organization (Design) • Recent Trends in Computer Design/performance • Hierarchy of Computer Architecture • Hardware Description: Register Transfer Notation (RTN) • Computer Architecture Vs. Computer Organization • Instruction Set Architecture (ISA): – Definition and purpose – ISA Specification Requirements – Main General Types of Instructions ISA CPU Design – ISA Types and characteristics – Typical ISA Addressing Modes – Instruction Set Encoding – Instruction Set Architecture Tradeoffs – Complex Instruction Set Computer (CISC) – Reduced Instruction Set Computer (RISC) – Evolution of Instruction Set Architectures

Chapters 1, 2 (both editions) EECC550 - Shaaban #2 Lec # 1 Winter 2011 11-29-2011 Computing Element Choices • General Purpose Processors (GPPs): Intended for general purpose computing (desktops, servers, clusters..) • Application-Specific Processors (ASPs): Processors with ISAs and architectural features tailored towards specific application domains – E.g Digital Signal Processors (DSPs), Network Processors (NPs), Media Processors, Graphics Processing Units (GPUs), Vector Processors??? ... • Co-Processors: A hardware (hardwired) implementation of specific algorithms with limited programming interface (augment GPPs or ASPs) • Configurable Hardware: – Field Programmable Gate Arrays (FPGAs) – Configurable array of simple processing elements • Application Specific Integrated Circuits (ASICs): A custom VLSI hardware solution for a specific computational task • The choice of one or more depends on a number of factors including: - Type and complexity of computational algorithm (general purpose vs. Specialized) - Desired level of flexibility/ - Performance requirements programmability - Development cost/time - System cost - Power requirements - Real-time constrains

The main goal of this course is the study of fundamental design EECC550 - Shaaban techniques for General Purpose Processors EECC550 - Shaaban #3 Lec # 1 Winter 2011 11-29-2011 Computing Element Choices The main goal of this course is the study

General Purpose of fundamental design techniques Processors for General Purpose Processors (GPPs): Processor : Programmable computing element that Flexibility runs programs written using a pre-defined set of Application-Specific instructions Processors (ASPs) ISA Requirements → Processor Design

Configurable Hardware

Programmability / Programmability Selection Factors: - Type and complexity of computational algorithms Co-Processors (general purpose vs. Specialized) Application Specific - Desired level of flexibility - Performance Integrated Circuits - Development cost - System cost (ASICs) - Power requirements - Real-time constrains

Specialization , Development cost/time Performance/Chip Area/Watt Performance (Computational Efficiency) EECC550 - Shaaban Software Hardware #4 Lec # 1 Winter 2011 11-29-2011 Computing Element Choices: ComputingComputing ElementElement ProgrammabilityProgrammability (Hardware) (Processor) Software Fixed Function: Programmable:

• Computes one function (e.g. • Computes “any” FP-multiply, divider, DCT) computable function (e.g. • Function defined at Processors) fabrication time • Function defined after • e.g hardware (ASICs) fabrication

Parameterizable Hardware: Performs limited “set” of functions e.g. Co-Processors

Processor = Programmable computing element that runs programs written using pre-defined instructions EECC550 - Shaaban #5 Lec # 1 Winter 2011 11-29-2011 Computing Element Choices: SpatialSpatial vs.vs. TemporalTemporal ComputingComputing Spatial Temporal

(using hardware) (using software/program running on a processor) Defined by fixed functionality and connectivity of hardware elements Time

Hardware Block Diagram Processor Instructions (Program)

Processor = Programmable computing element that runs programs written using a pre-defined set of instructions EECC550 - Shaaban #6 Lec # 1 Winter 2011 11-29-2011 MainMain ProcessorProcessor Types/ApplicationsTypes/Applications The main goal of this course is the study of fundamental design techniques for General Purpose Processors • General Purpose Computing & General Purpose Processors (GPPs) – High performance: In general, faster is always better. – RISC or CISC: P4, IBM Power4, SPARC, PowerPC, MIPS ... 64 bit – Used for general purpose software – End-user programmable – Real-time performance may not be fully predictable (due to dynamic arch. features) – Heavy weight, multi-tasking OS - Windows, UNIX – Normally, low cost and power not a requirement (changing)

– Servers, Workstations, Desktops (PC’s), Notebooks, Clusters … Increasing Cost/Complexity • Embedded Processing: Embedded processors and processor cores – Cost, power code-size and real-time requirements and constraints – Once real-time constraints are met, a faster processor may not be better – e.g: Intel XScale, ARM, 486SX, Hitachi SH7000, NEC V800... 16-32 bit – Often require Digital signal processing (DSP) support or other application-specific support (e.g network, media processing)

– Single or few specialized programs – known at system design time volume Increasing – Not end-user programmable – Real-time performance must be fully predictable (avoid dynamic arch. features) – Lightweight, often realtime OS or no OS – Examples: Cellular phones, consumer electronics .. … • – Extremely code size/cost/power sensitive 8 bit – Single program – Small word size - 8 bit common Processor = Programmable computing element – Usually no OS that runs programs written using pre-defined instructions – Highest volume processors by far – Examples: Control systems, Automobiles, industrial control, thermostats, ... Examples of Application-Specific Processors (ASPs) EECC550 - Shaaban #7 Lec # 1 Winter 2011 11-29-2011 TheThe ProcessorProcessor DesignDesign SpaceSpace

Application specific architectures for performance Embedded Real-time constraints processors GPPs Specialized applications Low power/cost constraints Performance is everything & Software rules Performance The main goal of this course is the Microcontrollers study of fundamental design techniques for General Purpose Processors Cost is everything Chip Area, Power Processor Cost complexity

Processor = Programmable computing element EECC550 - Shaaban that runs programs written using a pre-defined set of instructions #8 Lec # 1 Winter 2011 11-29-2011 General Purpose Processor/Computer System Generations

Classified according to implementation technology: • The First Generation, 1946-59: Vacuum Tubes, Relays, Mercury Delay Lines: – ENIAC (Electronic Numerical Integrator and Computer): First electronic computer, 18000 vacuum tubes, 1500 relays, 5000 additions/sec (1944). – First stored program computer: EDSAC (Electronic Delay Storage Automatic ), 1949. • The Second Generation, 1959-64: Discrete . – e.g. IBM Main frames • The Third Generation, 1964-75: Small and Medium-Scale Integrated (MSI) Circuits. – e.g Main frames (IBM 360) , mini (DEC PDP-8, PDP-11). • The Fourth Generation, 1975-Present: The . VLSI-based Microprocessors (single-chip processor) – First : Intel’s 4-bit 4004 (2300 transistors), 1970. – (PCs), laptops, PDAs, servers, clusters … – Reduced Instruction Set Computer (RISC) 1984

Common factor among all generations: All target the The Von Neumann Computer Model or paradigm EECC550 - Shaaban #9 Lec # 1 Winter 2011 11-29-2011 TheThe VonVon NeumannNeumann ComputerComputer ModelModel • Partitioning of the programmable computing engine into components: – (CPU): Control Unit (instruction decode , sequencing of operations), Datapath (registers, arithmetic and logic unit, connections, buses …). – Memory: Instruction (program) and operand (data) storage. AKA Program Counter – Input/Output (I/O) sub-system: I/O bus, interfaces, devices. (PC) Based Architecture – The stored program concept: Instructions from an instruction set are fetched from a common memory and executed one at a time The Program Counter (PC) points to next instruction to be processed

Control Input Memory - (instructions, data) Datapath registers Output ALU, buses

Computer System CPU I/O Devices

Major CPU Performance Limitation: The Von Neumann computing model implies sequential execution one instruction at a time

Another Performance Limitation: Separation of CPU and memory (The Von Neumann memory bottleneck) EECC550 - Shaaban #10 Lec # 1 Winter 2011 11-29-2011 GenericGeneric CPUCPU MachineMachine InstructionInstruction ProcessingProcessing StepsSteps (Implied by The Von Neumann Computer Model) Instruction Obtain instruction from program storage Fetch (memory) The Program Counter (PC) points to next instruction to be processed Instruction Determine required actions and instruction size Decode

Operand Locate and obtain operand data Fetch

Execute Compute result value or status

Result Deposit results in storage for later use Store

Next Determine successor or next instruction Instruction (i.e Update PC to fetch next instruction to be processed)

Major CPU Performance Limitation: The Von Neumann computing model EECC550 - Shaaban implies sequential execution one instruction at a time #11 Lec # 1 Winter 2011 11-29-2011 HardwareHardware ComponentsComponents ofof ComputerComputer SystemsSystems Five classic components of all computers: 1. Control Unit; 2. Datapath; 3. Memory; 4. Input; 5. Output

Processor I/O Central Processing Unit (CPU)

Computer Keyboard, Mouse, etc. Processor Memory Devices (active) (passive) Control Input Unit (where programs, data I/O Disk Datapath live when Output running) Display, Printer, etc. EECC550 - Shaaban #12 Lec # 1 Winter 2011 11-29-2011 AKA CPU CPUCPU OrganizationOrganization (Design)(Design) Microarchitecture

• Datapath Design: Components & their connections needed by ISA instructions – Capabilities & performance characteristics of principal Functional Units (FUs) needed by ISA instructions – (e.g., Registers, ALU, Shifters, Logic Units, ...) Components – Ways in which these components are interconnected (buses connections, multiplexors, etc.). Connections – How information flows between components. Control/sequencing of operations of datapath components • Control Unit Design: to realize ISA instructions – Logic and means by which such information flow is controlled. – Control and coordination of FUs operation to realize the targeted Instruction Set Architecture to be implemented (can either be implemented using a finite state machine or a microprogram). • Hardware description with a suitable language, possibly using Register Transfer Notation (RTN). ISA = Instruction Set Architecture The ISA forms an abstraction layer that sets the requirements for both EECC550 - Shaaban complier and CPU designers #13 Lec # 1 Winter 2011 11-29-2011 Control Branch Unit Control Data cache AA TypicalTypical MicroprocessorMicroprocessor Layout:Layout: TheThe IntelIntel PentiumPentium ClassicClassic 1993 - 1997 Bus Integer data- Floating- 60MHz - 233 MHz Instruction path point cache datapath Datapath

First Level of Memory (Cache) EECC550 - Shaaban #14 Lec # 1 Winter 2011 11-29-2011 Control Unit AA TypicalTypical MicroprocessorMicroprocessor Layout:Layout: TheThe IntelIntel PentiumPentium ClassicClassic 1993 - 1997 60MHz - 233 MHz

Datapath

First Level of Memory (Cache) EECC550 - Shaaban #15 Lec # 1 Winter 2011 11-29-2011 ComputerComputer SystemSystem ComponentsComponents CPU Core 1 GHz - 3.8 GHz Recently 1 or 2 or 4 processor cores per chip 4-way Superscaler All Non-blocking caches RISC or RISC-core (): Deep Instruction Pipelines L1 16-128K 1-2 way set associative (on chip), separate or unified Dynamic scheduling L1 L2 256K- 2M 4-32 way set associative (on chip) unified Multiple FP, integer FUs CPU L3 2-16M 8-32 way set associative (off or on chip) unified Dynamic branch prediction L2 Hardware speculation Examples: Alpha, AMD K7: EV6, 200-400 MHz L3 Intel PII, PIII: GTL+ 133 MHz SDRAM Caches Intel P4 800 MHz PC100/PC133 100-133MHZ Front Side Bus (FSB) 64-128 bits wide Off or On-chip 2-way inteleaved adapters ~ 900 MBYTES/SEC )64bit) Memory I/O Buses Current Standard Controller Example: PCI, 33-66MHz 32-64 bits wide Double Date 133-528 MBYTES/SEC Rate (DDR) SDRAM Memory Bus Controllers NICs PCI-X 133MHz 64 bit PC3200 1024 MBYTES/SEC 200 MHZ DDR 64-128 bits wide Memory 4-way interleaved Disks ~3.2 GBYTES/SEC Displays (one 64bit channel) Networks ~6.4 GBYTES/SEC Keyboards (two 64bit channels)

RAMbus DRAM (RDRAM) I/O Devices: North South 400MHZ DDR 16 bits wide (32 banks) Bridge Bridge I/O Subsystem ~ 1.6 GBYTES/SEC Chipset EECC550 - Shaaban #16 Lec # 1 Winter 2011 11-29-2011 MicroprocessorMicroprocessor PerformancePerformance IncreaseIncrease 19841984--20002000

SPEC CPU2000 Performance

> 100x performance increase in one decade EECC550 - Shaaban #17 Lec # 1 Winter 2011 11-29-2011 MicroprocessorMicroprocessor TransistorTransistor CountCount GrowthGrowth RateRate

Currently ~ 3 Billion

Moore’s Law: 2X transistors/Chip Every 1.5-2 years (circa 1970)

Intel 4004 Still holds today (2300 transistors)

~ 1,300,000x density increase in the last 40 years EECC550 - Shaaban #18 Lec # 1 Winter 2011 11-29-2011 A current Multi-core Microprocessor Example The increase in transistor chip density allows integrating more than one processor core per chip A benefit of Moore’s Law

„AMD Barcelona (Opteron X4) : 4 processor cores on one chip

EECC550 - Shaaban #19 Lec # 1 Winter 2011 11-29-2011 IncreaseIncrease ofof CapacityCapacity ofof VLSIVLSI DynamicDynamic RAMRAM (DRAM)(DRAM) MemoryMemory ChipsChips

1.55X/yr, or doubling every 1.6 years (Also follows Moore’s Law)

~ 17,000x DRAM chip capacity increase in 20 years

EECC550 - Shaaban #20 Lec # 1 Winter 2011 11-29-2011 ComputerComputer TechnologyTechnology Trends:Trends: EvolutionaryEvolutionary butbut RapidRapid ChangeChange • Processor: – 1.5-1.6 performance improvement every year; Over 100X performance in last decade. • Memory: – DRAM capacity: > 2x every 1.5 years; 1000X size in last decade. – Cost per bit: Improves about 25% or more per year. – Only 15-25% performance improvement per year.

• Disk: Performance gap compared – Capacity: > 2X in size every 1.5 years. to CPU performance causes – Cost per bit: Improves about 60% per year. system performance bottlenecks – 200X size in last decade. – Only 10% performance improvement per year, due to mechanical limitations. • Expected State-of-the-art PC Fourth Quarter 2011 : – Processor clock speed: ~ 3500 MegaHertz (3.5 Giga Hertz) With 2-8 processor cores – Memory capacity: ~ 8000 MegaByte (8 Giga ) on a single chip – Disk capacity: ~ 3000 GigaBytes (3 Tera Bytes)

EECC550 - Shaaban #21 Lec # 1 Winter 2011 11-29-2011 AA SimplifiedSimplified ViewView ofof TheThe Software/HardwareSoftware/Hardware HierarchicalHierarchical LayersLayers

ons s icati oftw pl ar Ap e s ems oftw st a y re S

Hardware

EECC550 - Shaaban #22 Lec # 1 Winter 2011 11-29-2011 HierarchyHierarchy ofof ComputerComputer ArchitectureArchitecture High-Level Language Programs

Assembly Language Software Application Programs Operating Machine Language System Program e.g. BIOS (Basic Input/Output System) Compiler Firmware Software/Hardware Instruction Set Architecture Boundary Instr. Set Proc. I/O system (ISA) The ISA forms an abstraction layer Datapath & Control that sets the requirements for both complier and CPU designers Digital Design Microprogram Hardware Circuit Design Layout

Register Transfer Logic Diagrams VLSI placement & routing Notation (RTN)

Circuit Diagrams EECC550 - Shaaban #23 Lec # 1 Winter 2011 11-29-2011 LevelsLevels ofof ProgramProgram RepresentationRepresentation temp = v[k]; High Level Language Program v[k] = v[k+1]; v[k+1] = temp; Compiler lw$15, 0($2) lw$16, 4($2) MIPS Program Assembly sw$16, 0($2) Code Assembler sw$15, 4($2)

0000 1001 1100 0110 1010 1111 0101 1000 Machine Language 1010 1111 0101 1000 0000 1001 1100 0110

Software Program 1100 0110 1010 1111 0101 1000 0000 1001 ISA 0101 1000 0000 1001 1100 0110 1010 1111 Machine Interpretation ISA Requirements → Processor Design

Control Signal ALUOP[0:3] <= InstReg[9:11] & MASK Specification Hardware Register Transfer Notation (RTN) ° Microprogram ° ISA = Instruction Set Architecture. The ISA forms an abstraction layer that sets the EECC550 - Shaaban requirements for both complier and CPU designers EECC550 - Shaaban #24 Lec # 1 Winter 2011 11-29-2011 AA HierarchyHierarchy ofof ComputerComputer DesignDesign Level Name Modules Primitives Descriptive Media

1 Electronics Gates, FF’s Transistors, Resistors, etc. Circuit Diagrams

2 Logic Registers, ALU’s ... Gates, FF’s …. Logic Diagrams

3 Organization Processors, Memories Registers, ALU’s … Register Transfer Notation (RTN) Low Level - Hardware

4 Microprogramming Assembly Language Microinstructions Microprogram Firmware

5 Assembly language OS Routines Assembly language Assembly Language programming Instructions Programs

6 Procedural Applications OS Routines High-level Language Programming Drivers .. High-level Languages Programs

7 Application Systems Procedural Constructs Problem-Oriented Programs High Level - Software EECC550 - Shaaban #25 Lec # 1 Winter 2011 11-29-2011 HardwareHardware DescriptionDescription • Hardware visualization: – Block diagrams (spatial visualization): Two-dimensional representations of functional units and their interconnections. – Timing charts (temporal visualization): Waveforms where events are displayed vs. time. • Register Transfer Notation (RTN): AKA Register Transfer Language (RTL) – A way to describe microoperations capable of being performed by the data flow (data registers, data buses, functional units) at the register transfer level of design (RT). – Also describes conditional information in the system which cause operations to come about. – A “shorthand” notation for microoperations. • Hardware Description Languages: – Examples: VHDL: VHSIC (Very High Speed Integrated Circuits) Hardware Description Language, Verilog. EECC550 - Shaaban #26 Lec # 1 Winter 2011 11-29-2011 RegisterRegister TransferTransfer NotationNotation (RTN)(RTN) • Independent RTN: – No predefined data flow is assumed (i.e No datapath design yet) – Describe actions on registers and memory locations without regard to nonexistence of direct paths or intermediate registers. – Useful to describe functionality of instructions of a given ISA. • Dependent RTN: – When RTN is used after the data flow (datapath design) is assumed to be frozen. – No data transfer can take place over a path that does not exist. – No RTN statement implies a function the data flow hardware is incapable of performing. • The general format of an RTN statement: Conditional information : Action1; Action2; … • The conditional statement is often an AND of literals (status and control signals) in the system (a p-term). The p-term is said to imply the action. • Possible actions include transfer of data to/from registers/memory data shifting, functional unit operations etc. EECC550 - Shaaban #27 Lec # 1 Winter 2011 11-29-2011 RTNRTN StatementStatement ExamplesExamples A ← B or R[A] ← R[B] where R[X] mean the content of register X – A copy of the data in entity B (typically a register) is placed in Register A – If the destination register has fewer bits than the source, the destination accepts only the lowest-order bits. – If the destination has more bits than the source, the value of the source is sign extended to the left.

CTL • T0: A = B – The contents of B are presented to the input of combinational circuit A – This action to the right of “:” takes place when control signal CTL is active and signal T0 is active. EECC550 - Shaaban #28 Lec # 1 Winter 2011 11-29-2011 RTNRTN StatementStatement ExamplesExamples MD ← M[MA] or MD ← Mem[MA] – Means the memory data (MD) register receives the contents of the main memory (M or Mem) as addressed from the Memory Address (MA) register. AC(0), AC(1), AC(2), AC(3) – Register fields are indicated by parenthesis. – The concatenation operation is indicated by a comma. – Bit AC(0) is bit 0 of the AC – The above expression means AC bits 0, 1, 2, 3 – More commonly represented by AC(0-3) E • T3: CLRWRITE – The control signal CLRWRITE is activated when the condition E • T3 is active.

EECC550 - Shaaban #29 Lec # 1 Winter 2011 11-29-2011 ComputerComputer ArchitectureArchitecture Vs.Vs. ComputerComputer OrganizationOrganization • The term Computer architecture is sometimes erroneously restricted to computer instruction set design, with other aspects of computer design called implementation. The ISA forms an abstraction layer that sets the • More accurate definitions: requirements for both complier and CPU designers – Instruction Set Architecture (ISA): The actual programmer- visible instruction set and serves as the boundary or interface between the software and hardware. – Implementation of a machine has two components:

CPU Micro- • Organization: includes the high-level aspects of a computer’s architecture design such as: The memory system, the bus structure, the (CPU design) internal CPU unit which includes implementations of arithmetic, logic, branching, and data transfer operations. • Hardware: Refers to the specifics of the machine such as detailed logic design and packaging technology. Hardware design and implementation • In general, Computer Architecture refers to the above three aspects: 1- Instruction set architecture 2- Organization. 3- Hardware. EECC550 - Shaaban #30 Lec # 1 Winter 2011 11-29-2011 Assembly Programmer Or Compiler InstructionInstruction SetSet ArchitectureArchitecture (ISA)(ISA) “... the attributes of a [computing] system as seen by the programmer, i.e. the conceptual structure and functional behavior, as distinct from the organization of the data flows and controls the logic design, and the physical implementation.” The ISA forms an abstraction layer that sets the – Amdahl, Blaaw, and Brooks, 1964. requirements for both complier and CPU designers The instruction set architecture is concerned with: • Organization of programmable storage (memory & registers): Includes the amount of addressable memory and number of available registers. • Data Types & Data Structures: Encodings & representations. • Instruction Set: What operations are specified. • Instruction formats and encoding. • Modes of addressing and accessing data items and instructions • Exceptional conditions. EECC550 - Shaaban #31 Lec # 1 Winter 2011 11-29-2011 EvolutionEvolution ofof InstructionInstruction SetSet ArchitecturesArchitectures Single Accumulator (EDSAC 1949) No ISA Accumulator + Index Registers (Manchester Mark I, IBM 700 series 1953)

Separation of Programming Model from Implementation

i.e. CPU High-level Language BasedDesign Concept of an ISA Family (B5000 1963) (IBM 360 1964)

General Purpose Register (GPR) Machines

Load/Store Architecture Complex Instruction Sets (CISC) (Vax, Motorola 68000, Intel x86 1977-80) (CDC 6600, Cray 1 1963-76)

Reduced Instruction Set Computer (RISC)( (MIPS, SPARC, HP-PA, PowerPC, . . . 1984..) EECC550 - Shaaban #32 Lec # 1 Winter 2011 11-29-2011 ComputerComputer InstructionInstruction SetsSets • Regardless of computer type, CPU structure, or hardware organization, every machine instruction must specify the following: – Opcode: Which operation to perform. Example: add, load, and branch. Opcode = Operation Code – Where to find the operand or operands, if any: Operands may be contained in CPU registers, main memory, or I/O ports. Operands location can be explicitly specified in the instruction or implied – Where to put the result, if there is a result: May be explicitly mentioned or implicit in the opcode. Destination – Where to find the next instruction: Without any explicit branches, the instruction to execute is the next instruction in the sequence or a specified address in case of jump or branch instructions. EECC550 - Shaaban #33 Lec # 1 Winter 2011 11-29-2011 InstructionInstruction SetSet ArchitectureArchitecture (ISA)(ISA) Instruction SpecificationSpecification RequirementsRequirements Fetch • Instruction Format or Encoding: Instruction – How is it decoded? Decode • Location of operands and result (addressing modes): Operand – Where other than memory? Fetch – How many explicit operands? – How are memory operands located? Execute – Which can or cannot be in memory? • Data type and Size. Result • Operations Store – What are supported • Successor instruction: Next – Jumps, conditions, branches. Instruction • Fetch-decode-execute is implicit. EECC550 - Shaaban #34 Lec # 1 Winter 2011 11-29-2011 MainMain GeneralGeneral TypesTypes ofof InstructionsInstructions 1. Data Movement Instructions, possible variations: – Memory-to-memory. – Memory-to-CPU register. – CPU-to-memory. – Constant-to-CPU register. – CPU-to-output. – etc. 2. Arithmetic Logic Unit (ALU) Instructions: – Logic instructions – Integer Arithmetic Instructions – Floating Point Arithmetic Instructions 3. Branch (Control) Instructions: – Unconditional jumps. – Conditional branches.

EECC550 - Shaaban #35 Lec # 1 Winter 2011 11-29-2011 ExamplesExamples ofof DataData MovementMovement InstructionsInstructions

Instruction Meaning Machine

MOV A,B Move 16-bit data from memory loc. A to loc. B VAX11

lwz R3,A Move 32-bit data from memory loc. A to register R3 PPC601

li $3,455 Load the 32-bit integer 455 into register $3 MIPS R3000

MOV AX,BX Move 16-bit data from register BX into register AX Intel X86

LEA.L (A0),A2 Load the address pointed to by A0 into A2 MC68000

EECC550 - Shaaban #36 Lec # 1 Winter 2011 11-29-2011 ExamplesExamples ofof ALUALU InstructionsInstructions

Instruction Meaning Machine

MULF A,B,C Multiply the 32-bit floating point values at mem. VAX11 locations A and B, and store result in loc. C nabs r3,r1 Store the negative absolute value of register r1 in r2 PPC601 ori $2,$1,255 Store the logical OR of register $1 with 255 into $2 MIPS R3000

SHL AX,4 Shift the 16-bit value in register AX left by 4 bits Intel X86

ADD.L D0,D1 Add the 32-bit values in registers D0, D1 and store MC68000 the result in register D0

EECC550 - Shaaban #37 Lec # 1 Winter 2011 11-29-2011 ExamplesExamples ofof BranchBranch InstructionsInstructions

Instruction Meaning Machine

BLBS A, Tgt Branch to address Tgt if the least significant bit VAX11 at location A is set. bun r2 Branch to location in r2 if the previous comparison PPC601 signaled that one or more values was not a number.

Beq $2,$1,32 Branch to location PC+4+32 if contents of $1 and $2 MIPS R3000 are equal.

JCXZ Addr Jump to Addr if contents of register CX = 0. Intel X86

BVS next Branch to next if overflow flag in CC is set. MC68000

EECC550 - Shaaban #38 Lec # 1 Winter 2011 11-29-2011 OperationOperation TypesTypes inin TheThe InstructionInstruction SetSet Operator Type Examples

Arithmetic and logical Integer arithmetic and logical operations: add, or

Data transfer 2 Loads-stores (move on machines with memory addressing) 1 Control 3 Branch, jump, procedure call, and return, traps. System Operating system call/return, virtual memory management instructions ... Floating point Floating point operations: add, multiply .... Decimal Decimal add, decimal multiply, decimal to character conversion String String move, string compare, string search Media The same operation performed on multiple data (e.g Intel MMX, SSE) EECC550 - Shaaban #39 Lec # 1 Winter 2011 11-29-2011 InstructionInstruction UsageUsage Example:Example: TopTop 1010 IntelIntel X86X86 InstructionsInstructions Rankinstruction Integer Average Percent total executed 1 load 22% 2 conditional branch 20% 3 compare 16% 4 store 12% 5 add 8% 6 and 6% 7 sub 5% 8 move register-register 4% 9 call 1% 10 return 1% Total 96% Observation: Simple instructions dominate instruction usage frequency.

CISC to RISC observation EECC550 - Shaaban #40 Lec # 1 Winter 2011 11-29-2011 TypesTypes ofof InstructionInstruction SetSet ArchitecturesArchitectures AccordingAccording ToTo OperandOperand MemoryMemory AddressingAddressing FieldsFields Memory-To-Memory Machines: – Operands obtained from memory and results stored back in memory by any instruction that requires operands. – No local CPU registers are used in the CPU datapath. – Include: • The 4 Address Machine. Machine = ISA or CPU targeting a specific ISA type • The 3-address Machine. • The 2-address Machine. The 1-address (Accumulator) Machine: – A single local CPU special-purpose register (accumulator) is used as the source of one operand and as the result destination. The 0-address or Stack Machine: – A push-down stack is used in the CPU. General Purpose Register (GPR) Machines: – The CPU datapath contains several local general-purpose registers which can be used as operand sources and as result destinations. – A large number of possible addressing modes. – Load-Store or Register-To-Register Machines: GPR machines where only data movement instructions (loads, stores) can obtain operands from memory and store results to memory.

CISC to RISC observation (load-store simplifies CPU design) EECC550 - Shaaban #41 Lec # 1 Winter 2011 11-29-2011 TypesTypes ofof InstructionInstruction SetSet ArchitecturesArchitectures MemoryMemory--ToTo--MemoryMemory Machines:Machines: TheThe 44--AddressAddress Machine/ISAMachine/ISA • No program counter (PC) or other CPU registers are used. • Instruction encoding has four address fields to specify: – Location of first operand. - Location of second operand. – Place to store the result. - Location of next instruction.

Memory Instruction: CPU add Res, Op1, Op2, Nexti Op1Addr: Op1 Meaning: Op2Addr: Op2 + Res ← Op1 + Op2

ResAddr: Res or more precise RTN: M[ResAddr] ← M[Op1Addr] + M[Op2Addr] : : Instruction Format (encoding) Bits: 8 24 24 24 24 NextiAddr: Nexti add ResAddr Op1Addr Op2Addr NextiAddr Instruction Opcode Where to find Size: Where to Where to find operands 13 bytes Which put result next instruction operation EECC550 - Shaaban Can address 224 bytes = 16 MBytes #42 Lec # 1 Winter 2011 11-29-2011 TypesTypes ofof InstructionInstruction SetSet ArchitecturesArchitectures MemoryMemory--ToTo--MemoryMemory Machines:Machines: TheThe 33--AddressAddress Machine/ISAMachine/ISA • A program counter (PC) is included within the CPU which points to the next instruction. • No CPU storage (general-purpose registers). Memory CPU Instruction: add Res, Op1, Op2 Op1Addr: Op1 Op2Addr: Op2 Meaning: + Instruction Res ← Op1 + Op2 Size: 10 bytes ResAddr: Res or more precise RTN: : M[ResAddr] ← M[Op1Addr] + M[Op2Addr] : PC ← PC + 10 Increment PC Where to find next instruction Instruction Format (encoding) Bits: 8 24 24 24 NextiAddr: Nexti Program 24 Counter (PC) add ResAddr Op1Addr Op2Addr Opcode Where to Where to find operands Which put result operation Can address 224 bytes = 16 MBytes EECC550 - Shaaban #43 Lec # 1 Winter 2011 11-29-2011 TypesTypes ofof InstructionInstruction SetSet ArchitecturesArchitectures MemoryMemory--ToTo--MemoryMemory Machines:Machines: TheThe 22--AddressAddress Machine/ISAMachine/ISA • The 2-address Machine: Result is stored in the memory address of one of the operands. Instruction: Memory CPU add Op2, Op1 Meaning: Op1Addr: Op1 Op2 ← Op1 + Op2 + or more precise RTN:

Op2Addr: Op2,Res M[Op2Addr] ← M[Op1Addr] + M[Op2Addr] PC ← PC + 7 Increment PC : : Instruction Format (encoding) Where to find next instruction Bits: 8 24 24 add Op2Addr Op1Addr NextiAddr: Nexti Program 24 Counter (PC) Opcode Which Where to find operands operation Instruction Size: Where to 7 bytes put result Can address 224 bytes = 16 MBytes EECC550 - Shaaban #44 Lec # 1 Winter 2011 11-29-2011 TypesTypes ofof InstructionInstruction SetSet ArchitecturesArchitectures TheThe 11--addressaddress (Accumulator)(Accumulator) Machine/ISAMachine/ISA • A single register (accumulator) in the CPU is used as the source of one operand and result destination. Instruction: Memory CPU add Op1 Op1Addr: Op1 Meaning: Acc ← Acc + Op1 + Where to find operand2, and or more precise RTN: where to put result Acc ← Acc + M[Op1Addr] Accumulator : PC ← PC + 4 : Increment PC Where to find Instruction Format (encoding) next instruction Bits: 8 24 NextiAddr: Nexti Program 24 Counter (PC) add Op1Addr Instruction Size: Opcode Where to find 4 bytes Which operand1 operation Can address 224 bytes = 16 MBytes EECC550 - Shaaban #45 Lec # 1 Winter 2011 11-29-2011 TypesTypes ofof InstructionInstruction SetSet ArchitecturesArchitectures TheThe 00--addressaddress (Stack)(Stack) Machine/ISAMachine/ISA

• A push-down stack is used in the CPU.

4 Bytes Instruction Format CPU Memory Bits: 8 24 Stack Instruction: push Op1Addr push push Op1 Opcode Where to find Op1Addr: Op1 pop add Meaning: operand Op2Addr: Op2 TOS Op2, Res TOS ← M[Op1Addr] + Op1 Instruction: 1 Instruction Format ResAddr: SOS Res add Bits: 8 etc. Meaning: add : TOS ← TOS + SOS : Opcode 8 4 Bytes Instruction Format Bits: 8 24 NextiAddr: Nexti Program 24 Counter (PC) Instruction: pop ResAddr pop Res Opcode Memory Meaning: Destination M[ResAddr] ← TOS TOS = Top Entry in Stack SOS = Second Entry in Stack EECC550 - Shaaban Can address 224 bytes = 16 MBytes #46 Lec # 1 Winter 2011 11-29-2011 TypesTypes ofof InstructionInstruction SetSet ArchitecturesArchitectures GeneralGeneral PurposePurpose RegisterRegister (GPR)(GPR) MachinesMachines • CPU contains several general-purpose registers which can be used as operand sources and result destination.

Eight general purpose Registers (GPRs) assumed here: R1-R8 Instruction Format Instruction: Bits: 8 3 24 CPU load R8, Op1 Memory Meaning: load R8 Op1Addr Registers R8 ← M[Op1Addr] Opcode Where to find load PC ← PC + 5 operand1 Op1Addr: Op1 R8 Size = 4.375 bytes rounded up to 5 bytes R7 Instruction: add R6 Instruction Format add R2, R4, R6 + R5 Meaning: Bits: 8 3 3 3 R4 R2 ← R4 + R6 add R2 R4 R6 R3 PC ← PC + 3 Opcode Des Operands : R2 store Size = 2.125 bytes rounded up to 3 bytes : R1

Instruction: Instruction Format NextiAddr: Nexti Program 24 store R2, Op2 Bits: 8 3 24 Counter (PC) Meaning: store R2 ResAddr M[Op2Addr] ← R2 Opcode Destination PC ← PC + 5 Size = 4.375 bytes rounded up to 5 bytes Here add instruction has three register specifier fields While load, store instructions have one register specifier field EECC550 - Shaaban and one memory address specifier field #47 Lec # 1 Winter 2011 11-29-2011 ExpressionExpression EvaluationEvaluation ExampleExample withwith 33--,, 22--,, 11--,, 00--Address,Address, AndAnd GPRGPR MachinesMachines For the expression A = (B + C) * D - E where A-E are in memory

1-Address 0-Address GPR 3-Address 2-Address Accumulator Stack Register-Memory Load-Store add A, B, C load A, B load B push B load R1, B load R1, B mul A, A, D add A, C add C push C add R1, C load R2, C sub A, A, E mul A, D mul D add mul R1, D add R3, R1, R2 sub A, E sub E push D sub R1, E load R1, D store A mul store A, R1 mul R3, R3, R1 push E load R1, E sub sub R3, R3, R1 pop A store A, R3

3 instructions 4 instructions 5 instructions 8 instructions 5 instructions 8 instructions Code size: Code size: Code size: Code size: Code size: Code size: 30 bytes 28 bytes 20 bytes 23 bytes 25 bytes 34 bytes 9 memory 11 memory 5 memory 5 memory 5 memory 5 memory accesses for accesses for accesses for accesses for accesses for accesses for data data data data data data EECC550 - Shaaban #48 Lec # 1 Winter 2011 11-29-2011 InstructionInstruction SetSet ArchitectureArchitecture TradeoffsTradeoffs • 3-address machine: shortest code sequence; a large number of bits per instruction; large number of memory accesses. • 0-address (stack) machine: Longest code sequence; shortest individual instructions; more complex to program. • General purpose register machine (GPR): Machine = CPU or ISA – Addressing modified by specifying among a small set of registers with using a short register address (all new ISAs since 1975). – Advantages of GPR: Why GPR? 1 • Low number of memory accesses. Faster, since register access is currently still much faster than memory access. 2 • Registers are easier for compilers to use. 3 • Shorter, simpler instructions. AKA Register-register GPR • Load-Store Machines: GPR machines where memory addresses are only included in data movement instructions (loads/stores) between memory and registers (all new ISAs designed after 1980).

CISC to RISC observation (load-store simplifies CPU design) EECC550 - Shaaban #49 Lec # 1 Winter 2011 11-29-2011 TypicalTypical GPRGPR ISAISA MemoryMemory AddressingAddressing ModesModes Addressing Sample Mode Instruction Meaning Register Add R4, R3 R4 ← R4 + R3 Immediate Add R4, #3 R4 ← R4 + 3 Displacement Add R4, 10 (R1) R4 ← R4 + Mem[10+ R1] Indirect Add R4, (R1) R4 ← R4 + Mem[R1] Indexed Add R3, (R1 + R2) R3 ← R3 +Mem[R1 + R2] Absolute Add R1, (1001) R1 ← R1 + Mem[1001] Memory indirect Add R1, @ (R3) R1 ← R1 + Mem[Mem[R3]] Autoincrement Add R1, (R2) + R1 ← R1 + Mem[R2] R2 ← R2 + d Autodecrement Add R1, - (R2) R2 ← R2 - d R1 ← R1 + Mem[R2] Scaled Add R1, 100 (R2) [R3] R1 ← R1+ Mem[100+ R2 + R3*d]

CISC to RISC observation (fewer addressing modes simplify CPU design) EECC550 - Shaaban #50 Lec # 1 Winter 2011 11-29-2011 AddressingAddressing ModesModes UsageUsage ExampleExample For 3 programs running on VAX ignoring direct register mode: Displacement 42% avg, 32% to 55% 75% Immediate: 33% avg, 17% to 43% 88%

Register deferred (indirect): 13% avg, 3% to 24%

Scaled: 7% avg, 0% to 16%

Memory indirect: 3% avg, 1% to 6%

Misc: 2% avg, 0% to 3%

75% displacement & immediate 88% displacement, immediate & register indirect. Observation: In addition Register direct, Displacement, Immediate, Register Indirect addressing modes are important. CISC to RISC observation (fewer addressing modes simplify CPU design) EECC550 - Shaaban #51 Lec # 1 Winter 2011 11-29-2011 DisplacementDisplacement AddressAddress SizeSize ExampleExample

Avg. of 5 SPECint92 programs v. avg. 5 SPECfp92 programs

Int. Avg. FP Avg.

30%

20%

10%

0% 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 Displacement Address Bits Needed For displacement addressing mode 1% of addresses > 16-bits 12 - 16 bits of displacement needed

CISC to RISC observation EECC550 - Shaaban #52 Lec # 1 Winter 2011 11-29-2011 InstructionInstruction SetSet EncodingEncoding Considerations affecting instruction set encoding: – The number of registers and addressing modes supported by ISA. – The impact of of the size of the register and addressing mode fields on the average instruction size and on the average program. – To encode instructions into lengths that will be easy to handle in the implementation. On a minimum to be a multiple of bytes. • Instruction Encoding Classification: 1. Fixed length encoding: Faster and easiest to implement in hardware. e.g. Simplifies design of pipelined CPUs 2. Variable length encoding: Produces smaller instructions. 3. Hybrid encoding. to reduce code size

CISC to RISC observation EECC550 - Shaaban #53 Lec # 1 Winter 2011 11-29-2011 ThreeThree ExamplesExamples ofof InstructionInstruction SetSet EncodingEncoding

Operations & Address Address no of operands Address Address specifier 1 field 1 specifier n field n

Variable Length Encoding: VAX (1-53 bytes)

Operation Address Address Address field 1 field 2 field3

Fixed Length Encoding: MIPS, PowerPC, SPARC (all instructions are 4 bytes each)

Operation Address Address Specifier field

Address Address Operation Address field Specifier 1 Specifier 2

Address Address Operation Specifier Address field 1 field 2

Hybrid Encoding: IBM 360/370, Intel 80x86 EECC550 - Shaaban #54 Lec # 1 Winter 2011 11-29-2011 ISAISA ExamplesExamples Machine Number of General Architecture year Purpose Registers EDSAC 1 accumulator 1949 IBM 701 1 accumulator 1953 CDC 6600 8 load-store 1963 IBM 360 16 register-memory 1964 DEC PDP-8 1 accumulator 1965 DEC PDP-11 8 register-memory 1970 Intel 8008 1 accumulator 1972 Motorola 6800 1 accumulator 1974 DEC VAX 16 register-memory 1977 memory-memory 1 extended accumulator 1978 Motorola 68000 16 register-memory 1980 Intel 80386 8 register-memory 1985 MIPS 32 load-store 1985 HP PA-RISC 32 load-store 1986 SPARC 32 load-store 1987 PowerPC 32 load-store 1992 DEC Alpha 32 load-store 1992 HP/Intel IA-64 128 load-store 2001 AMD64 (EMT64) 16 register-memory 2003 EECC550 - Shaaban #55 Lec # 1 Winter 2011 11-29-2011 ExamplesExamples ofof GPRGPR MachinesMachines (ISAs) For Arithmetic/Logic (ALU) Instructions

Max. number of Max. number memory addresses of operands allowed + destination SPARC, MIPS 03PowerPC, ALPHA

Intel 80386 12 Motorola 68000

2 or 3 2 or 3 VAX

EECC550 - Shaaban #56 Lec # 1 Winter 2011 11-29-2011 ComplexComplex InstructionInstruction SetSet ComputerComputer (CISC)(CISC) ISAs • Emphasizes doing more with each instruction: – Thus fewer instructions per program (more compact code). • Motivated by the high cost of memory and hard disk Why? capacity when original CISC architectures were proposed – When M6800 was introduced: 16K RAM = $500, 40M hard disk = $ 55, 000 – When MC68000 was introduced: 64K RAM = $200, 10M HD = $5,000 Circa 1980 • Original CISC architectures evolved with faster more complex CPU designs but backward instruction set compatibility had to be maintained (e.g X86). • Wide variety of addressing modes: • 14 in MC68000, 25 in MC68020 • A number instruction modes for the location and number of operands: • The VAX has 0- through 3-address instructions.

• Variable-length instruction encoding. To reduce code size EECC550 - Shaaban #57 Lec # 1 Winter 2011 11-29-2011 ExampleExample CISCCISC ISAsISAs MotorolaMotorola 680X0680X0 18 addressing modes: GPR ISA (Register-Memory) • Data register direct. Operand size: • Address register direct. • Immediate. • Range from 1 to 32 bits, 1, 2, 4, 8, • Absolute short. 10, or 16 bytes. • Absolute long. • Address register indirect. • Address register indirect with postincrement. Instruction Encoding: • Address register indirect with predecrement. • Address register indirect with displacement. • Instructions are stored in 16-bit • Address register indirect with index (8-bit). words. • Address register indirect with index (base). • the smallest instruction is 2- bytes • Memory inderect postindexed. (one word). • Memory indirect preindexed. • The longest instruction is 5 words • Program counter indirect with index (8-bit). (10 bytes) in length. • Program counter indirect with index (base). • Program counter indirect with displacement. 2 Bytes 10 Bytes • Program counter memory indirect postindexed. • Program counter memory indirect preindexed. EECC550 - Shaaban #58 Lec # 1 Winter 2011 11-29-2011 ExampleExample CISCCISC ISA:ISA: IntelIntel 80386 X86 or IA-32 GPR ISA (Register-Memory) 12 addressing modes: Operand sizes: • Register. • Can be 8, 16, 32, 48, 64, or 80 bits long. • Immediate. • Also supports string operations. • Direct. • Base. Instruction Encoding: • Base + Displacement. • Index + Displacement. • The smallest instruction is one byte. • Scaled Index + Displacement. • The longest instruction is 12 bytes long. • Based Index. • The first bytes generally contain the opcode, • Based Scaled Index. mode specifiers, and register fields. • Based Index + Displacement. • The remainder bytes are for address • Based Scaled Index + Displacement. displacement and immediate data. • Relative. One Byte 12 Bytes

EECC550 - Shaaban #59 Lec # 1 Winter 2011 11-29-2011 ReducedReduced InstructionInstruction SetSet ComputerComputer (RISC)(RISC) ~1984 ISAs • Focuses on reducing the number and complexity of instructions of the ISA. RISC: Simplify ISA Simplify CPU Design Better CPU Performance – Motivated by simplifying the ISA and its requirements to: RISC Goals • Reduce CPU design complexity • Improve CPU performance. – CPU Performance Goal: Reduced number of cycles needed per instruction. At least one instruction completed per clock cycle. • Simplified addressing modes supported. – Usually limited to immediate, register indirect, register displacement, indexed. • Load-Store GPR: Only load and store instructions access memory. – (Thus more instructions are usually executed than CISC) • Fixed-length instruction encoding. – (Designed with CPU instruction pipelining in mind). • Support of delayed branches. • Examples: MIPS, HP PA-RISC, SPARC, Alpha, POWER, PowerPC. EECC550 - Shaaban #60 Lec # 1 Winter 2011 11-29-2011 ExampleExample RISCRISC ISA:ISA: PowerPCPowerPC Load-Store GPR 8 addressing modes: Operand sizes: • Register direct. • Four operand sizes: 1, 2, 4 or 8 bytes. • Immediate. • Register indirect. • Register indirect with immediate Instruction Encoding: index (loads and stores). • Register indirect with register index • Instruction set has 15 different formats (loads and stores). with many minor variations. • • Absolute (jumps). • All are 32 bits (4 bytes) in length. • Link register indirect (calls). • Count register indirect (branches).

EECC550 - Shaaban #61 Lec # 1 Winter 2011 11-29-2011 ExampleExample RISCRISC ISA:ISA: HPHP PrecisionPrecision ArchitectureArchitecture

HPHP PAPA--RISCRISC Load-Store GPR 7 addressing modes: Operand sizes: • Register • Five operand sizes ranging in powers of • Immediate two from 1 to 16 bytes. • Base with displacement • Base with scaled index and displacement Instruction Encoding: • Predecrement • Instruction set has 12 different formats. • Postincrement • • All are 32 bits (4 bytes) in length. • PC-relative

EECC550 - Shaaban #62 Lec # 1 Winter 2011 11-29-2011 ExampleExample RISCRISC ISA:ISA:

SPARCSPARC Load-Store GPR

5 addressing modes: Operand sizes: • Register indirect with immediate • Four operand sizes: 1, 2, 4 or 8 bytes. displacement. • Register inderect indexed by another register. Instruction Encoding: • Register direct. • Instruction set has 3 basic instruction • Immediate. formats with 3 minor variations. • PC relative. • All are 32 bits (4 bytes) in length.

EECC550 - Shaaban #63 Lec # 1 Winter 2011 11-29-2011 ExampleExample RISCRISC ISA:ISA: DECDEC AlphaAlpha AXPAXP Load-Store GPR 4 addressing modes: Operand sizes: • Register direct. • Four operand sizes: 1, 2, 4 or 8 bytes. • Immediate. • Register indirect with displacement. • PC-relative. Instruction Encoding: • Instruction set has 7 different formats. • • All are 32 bits (4 bytes) in length.

EECC550 - Shaaban #64 Lec # 1 Winter 2011 11-29-2011 RISCRISC ISAISA Example:Example: AKA MIPS-I MIPSMIPS R3000R3000 (32(32--bit)bit) Instruction Categories: 5 Addressing Modes: Registers

• Load/Store. • Register direct (arithmetic). • Computational. • Immedate (arithmetic). R0 - R31 • Jump and Branch. • Base register + immediate offset • Floating Point (loads and stores). (using coprocessor). • PC relative (branches). • Memory Management. • Pseudodirect (jumps) • Special. PC Operand Sizes: HI Load-Store GPR • Memory accesses in any LO multiple between 1 and 4 bytes. Instruction Encoding: 3 Instruction Formats, all 32 bits (4 bytes) wide. R OP rs rt rd sa funct I OP rs rt immediate J OP jump target

MIPS is the target ISA for CPU design in this course EECC550 - Shaaban #65 Lec # 1 Winter 2011 11-29-2011