Instruction Set Architecture a Few Words on the Subject

Instruction Set Architecture A few words on the subject Grigory Rechistov IntelMIPT lab, MDSP project October 2010 1 Question: why caches are not included in ISA? They are not visible to programmer. ISA The part of the computer architecture related to programming. I Instructions I Registers I Addressing of memory I Native data types 2 They are not visible to programmer. ISA The part of the computer architecture related to programming. I Instructions I Registers I Addressing of memory I Native data types Question: why caches are not included in ISA? 2 ISA The part of the computer architecture related to programming. I Instructions I Registers I Addressing of memory I Native data types Question: why caches are not included in ISA? They are not visible to programmer. 2 A single instruction [prexes] <opcode> [operand1], [operand2], [operand3] I Total length of an instruction may be constant or variable depending I Prex changes behaviour of the following operation (e.g. locked, replayed, with dierent length of operands) I An operand denes what data is used and where it comes from (register, memory, constant, etc). I Amount of operands may be even more. I Sometimes an operand can be also placed in opcode value. I Sometimes opcodes can be inside operand values. 3 A single instruction [prexes] <opcode> [operand1], [operand2], [operand3] I Total length of an instruction may be constant or variable depending I Prex changes behaviour of the following operation (e.g. locked, replayed, with dierent length of operands) I An operand denes what data is used and where it comes from (register, memory, constant, etc). I Amount of operands may be even more. I Sometimes an operand can be also placed in opcode value. I Sometimes opcodes can be inside operand values. 3 Operations Common types: 1. Set a register to constant value or value of other register. 2. Loads (memory register ) & stores (register memory) 3. Read and write data from hardware devices (IO) 4. Arithmetic and Logic: 4.1 + =£... 4.2 And, Or, Xor, Not 4.3 Compare two values (result ags register) 5. Control ow 5.1 branch to another location (PC new value) 5.2 conditionally branch (if (condition) then PC new value) 5.3 save current location and jump to new location (Procedure call) 4 Registers I Program counter aka PC the immanent part of von Neumann machines. I Data registers: accumulators, counters. I Address registers. I General purpose registers (GPR): both data and address. I Status register (ags). I Stack pointer. Also can include I Segment registers. I Coprocessor device registers e.g. FPU. I Other types: control, debug, constant, model specic, vector registers, . 5 Amount of registers on misc. architectures Architecture Integer registers Double FP registers x86 8 8 x86-64 16 16 Itanium 128 128 UltraSPARC 32 32 POWER 32 32 Alpha 32 32 65021 3 0 AVR microcontroller 32 0 ARM 16 16 MMIX 256 N/A 1I need your clothes, boots and your motorcycle. Kill all humans! (Except Fry) 6 Memory addressing modes Eective memory address = f (mode; registers; constants) There are a vast amount of modes found in dierent architectures, just to name some common: I absolute I register indirect I PC relative I base register + oset I SIB: addr = base + scale £ index + oset 7 Questions: 1. What is a byte? The minimal addressable amount of data. 2. What is a machine word? The maximum amount of data a CPU can chew at a time. Data types I Integer values, signed and unsigned. I Memory adresses (mind segment:oset scheme in early x86) I Boolean types: ag register, predicate registers (64 in Itanium) I Floating point types: 64(80) bit in x87, 2 ¢ 64 or 4 ¢ 32 bit in SSE I Binary coded decimal (BCD): one byte codes 100 numbers from 0x00 to 0x99. I Tags for memory protection. Elbrus, IBM AS/400. 8 The minimal addressable amount of data. 2. What is a machine word? The maximum amount of data a CPU can chew at a time. Data types I Integer values, signed and unsigned. I Memory adresses (mind segment:oset scheme in early x86) I Boolean types: ag register, predicate registers (64 in Itanium) I Floating point types: 64(80) bit in x87, 2 ¢ 64 or 4 ¢ 32 bit in SSE I Binary coded decimal (BCD): one byte codes 100 numbers from 0x00 to 0x99. I Tags for memory protection. Elbrus, IBM AS/400. Questions: 1. What is a byte? 8 2. What is a machine word? The maximum amount of data a CPU can chew at a time. Data types I Integer values, signed and unsigned. I Memory adresses (mind segment:oset scheme in early x86) I Boolean types: ag register, predicate registers (64 in Itanium) I Floating point types: 64(80) bit in x87, 2 ¢ 64 or 4 ¢ 32 bit in SSE I Binary coded decimal (BCD): one byte codes 100 numbers from 0x00 to 0x99. I Tags for memory protection. Elbrus, IBM AS/400. Questions: 1. What is a byte? The minimal addressable amount of data. 8 The maximum amount of data a CPU can chew at a time. Data types I Integer values, signed and unsigned. I Memory adresses (mind segment:oset scheme in early x86) I Boolean types: ag register, predicate registers (64 in Itanium) I Floating point types: 64(80) bit in x87, 2 ¢ 64 or 4 ¢ 32 bit in SSE I Binary coded decimal (BCD): one byte codes 100 numbers from 0x00 to 0x99. I Tags for memory protection. Elbrus, IBM AS/400. Questions: 1. What is a byte? The minimal addressable amount of data. 2. What is a machine word? 8 Data types I Integer values, signed and unsigned. I Memory adresses (mind segment:oset scheme in early x86) I Boolean types: ag register, predicate registers (64 in Itanium) I Floating point types: 64(80) bit in x87, 2 ¢ 64 or 4 ¢ 32 bit in SSE I Binary coded decimal (BCD): one byte codes 100 numbers from 0x00 to 0x99. I Tags for memory protection. Elbrus, IBM AS/400. Questions: 1. What is a byte? The minimal addressable amount of data. 2. What is a machine word? The maximum amount of data a CPU can chew at a time. 8 Question: what architecture is IA-32? CISC, alas. One more time on RICS and CISC I CISC: more close to high-level languages, compact code, variable instruction length. I RISC: xed length, simplier decoder, easier to pipeline, overall faster. 9 CISC, alas. One more time on RICS and CISC I CISC: more close to high-level languages, compact code, variable instruction length. I RISC: xed length, simplier decoder, easier to pipeline, overall faster. Question: what architecture is IA-32? 9 One more time on RICS and CISC I CISC: more close to high-level languages, compact code, variable instruction length. I RISC: xed length, simplier decoder, easier to pipeline, overall faster. Question: what architecture is IA-32? CISC, alas. 9 Number of operands: 3 More common for CISC: IMUL EAX, DWORD [EBX + ECX*4 + OFFSET], 5 BLENDPD XMM1, XMM2, 255 10 Number of operands: 2 Commonly the rst operand is both the destination and source1, the second is source2: ADD EAX, EBX MOV [EBX], EBP Widely used in ISAs. 11 Number of operands: 1 So called accumulator machines: early computers and small microcontrollers. Most instructions specify a single explicit right operand. Implicit accumulator is used as both destination and left operand. Also, can be used with stack machines for loading/storing values from/to stack and memory: PUSH A POP RAX 12 0 ? WTF? So called stack machines. All arithmetic operations take place using the top one or two positions on the stack. Esoteric stu. Number of operands: Zero 13 So called stack machines. All arithmetic operations take place using the top one or two positions on the stack. Esoteric stu. Number of operands: Zero 0 ? WTF? 13 Number of operands: Zero 0 ? WTF? So called stack machines. All arithmetic operations take place using the top one or two positions on the stack. Esoteric stu. 13 A note on microarchitecture Microarchitecture 6= ISA I IBM TIMI (Technology-Independent Machine Interface): IBM AS/400 (48 bit CISC) POWER (RISC 64 bit) I Transmeta: x86-compatible ISA on VLIW arch. I Think about Intel vs AMD internal CPU designs. 14 Retain backward compatibility! Remember the lessons of Itanium. Instruction set extensions Are made when somebody decides it's better to have some new instructions. 1. 80286 16 bit; 2. 80386 32 bit; 3. coprocessor 8087; 4. MMX, SSEx, AVX extensions for SIMD operations; 5. AMD part: 3DNow! and AMD64 aka EMT-64 aka x86_64. 6. Virtualization support. 15 Instruction set extensions Are made when somebody decides it's better to have some new instructions. 1. 80286 16 bit; 2. 80386 32 bit; 3. coprocessor 8087; 4. MMX, SSEx, AVX extensions for SIMD operations; 5. AMD part: 3DNow! and AMD64 aka EMT-64 aka x86_64. 6. Virtualization support. Retain backward compatibility! Remember the lessons of Itanium. 15 I Alpha I ARM I IA-64 (Itanium) I IA-32, including EMT64 aka AMD64 I Loongson I MIPS I Motorola 68k I PowerPC I SPARC ISAs commonly implemented in software with hardware incarnations I Java virtual machine (ARM Jazelle) I FORTH I MMIX from The Art of Computer Programming Some examples of ISAs 16 I ARM I IA-64 (Itanium) I IA-32, including EMT64 aka AMD64 I Loongson I MIPS I Motorola 68k I PowerPC I SPARC ISAs commonly implemented in software with hardware incarnations I Java virtual machine (ARM Jazelle) I FORTH I MMIX from The Art of Computer Programming Some examples of ISAs I Alpha 16 I IA-64 (Itanium) I IA-32, including EMT64 aka AMD64 I Loongson I MIPS I Motorola 68k I PowerPC I SPARC ISAs commonly implemented in software with hardware incarnations I Java virtual machine (ARM Jazelle) I FORTH I MMIX from The Art of Computer Programming Some examples of ISAs I Alpha I ARM 16 I IA-32, including EMT64 aka AMD64 I Loongson I MIPS

Instruction Set Architecture a Few Words on the Subject

Intel® IA-64 Architecture Software Developer's Manual

Donald Knuth Fletcher Jones Professor of Computer Science, Emeritus Curriculum Vitae Available Online

Binary and Hexadecimal

SIMD Extensions

The Microarchitecture of the Pentium 4 Processor

Tug2007-Slides-2X2.Pdf

Typeset MMIX Programs with TEX Udo Wermuth Abstract a TEX Macro

Itanium-Based Solutions by Hp

Arduino and AVR

Exposing Native Device Apis to Web Apps

Cloud Native Trail Map 2

Chapter 1: Microprocessor Architecture