Intel X86 Architecture Intel Microprocessor History

Intel x86 Architecture Intel microprocessor history Comppgzuter Organization and Assembl yggy Languages Yung-Yu Chuang with slides by Kip Irvine Early Intel microprocessors The IBM-AT • Intel 8080 (1972) • Intel 80286 (1982) – 64K addressable RAM – 16 MB addressa ble RAM –8-bit registers – Protected memory – CP/M operating system – several times faster than 8086 – 5,6,8,10 MHz – introduced IDE bus architecture – 29K transistros – 80287 floating point unit • Intel 8086/8088 (1978) my first computer (1986) – Up to 20MHz – IBM-PC used 8088 – 134K transistors – 1 MB addressable RAM –16-bit registers – 16-bit data bus (8-bit for 8088) – separate floating-point unit (8087) – used in low-cost microcontrollers now 3 4 Intel IA-32 Family Intel P6 Family • Intel386 (1985) • Pentium Pro (1995) – 4 GB addressable RAM – advan ced optimiz ati on techni ques in micr ocode –32-bit registers – More pipeline stages – On-board L2 cache – paging (virtual memory) • Pentium II (1997) – Up to 33MHz – MMX (multimedia) instruction set • Intel486 (1989) – Up to 450MHz – instruction pipelining • Pentium III (1999) – Integrated FPU – SIMD (streaming extensions) instructions (SSE) – 8K cache – Up to 1+GHz •Pentium (()1993) • Pentium 4 (2000) – Superscalar (two parallel pipelines) – NetBurst micro-architecture, tuned for multimedia – 3.8+GHz • Penti um D (2005, Dua l core) 5 6 IA32 Processors • Totally Dominate Computer Market • EliEvolutionary DiDesign – Starting in 1978 with 8086 – Added more features as time goes on IA-32 Architecture – Still support old features, although obsolete • Complex Instruction Set Computer (CISC) –Manyyy different instructions with many different formats • But, only small subset encountered with Linux programs – Hard to match performance of Reduced Instruction Set Computers (RISC) – But, Inte l has done just th!hat! IA-32 architecture Modes of operation • Lots of architecture improvements, pipelining, •Protected mode superscalar, branch prediction, hyperthreading –native mode (Wind ows, Linux ), fu ll fea tures, and multi-core. separate memory • From programmer’s poitint of view, IA-32 has not • Virtual-8086 mode changed substantially except the introduction •hybrid of Protected of a set of hihhigh-performance iittinstructions • each program has its own 8086 computer • Real-address mode – native MS-DOS • System management mode – ppg,yy,gower management, system security, diagnostics 9 10 Addressable memory General-purpose registers •Protected mode 32-bit General-Purpose Registers –4 GB EAX EBP – 32-bit address EBX ESP • Real-address and Virtual-8086 modes ECX ESI –1 MB space EDX EDI – 20-bit address 16-bit Segggment Registers EFLAGS CS ES SS FS EIP DS GS 11 12 Accessing parts of registers Index and base registers • Use 8-bit name, 16-bit name, or 32-bit name • Some registers have only a 16-bit name for • App lies to EAX, EBX, ECX, and EDX their lower half (no 8-bit aliases). The 16-bit 8 8 registers are usually used only in real-address AH AL 8 bits + 8 bits mode. AX 16 bits EAX 32 bits 13 14 Some specialized register uses (1 of 2) Some specialized register uses (2 of 2) • General-Purpose •Segment – EAX – accumula tor (au toma tica lly used by div is ion –CS – code segment and multiplication) –DS – data segment – ECX – loop counter – SS – stack segment –ESP – stack pointer (should never be used for arithmetic or data transfer) – ES, FS, GS - additional segments – ESI, EDI – index registers (used for high-speed • EIP – instruction pointer memory transfer instructions) • EFLAGS – EBP – extddtended frame poitinter (tk)(stack) – status and control flags – each flag is a single binary bit (set or clear) • Some other system registers such as IDTR, GDTR, LDTR etc. 15 16 Status flags Floating-point, MMX, XMM registers • Carry • Eight 80-bit floating-point data ST(0) – unsigned arithmetic out of range registers ST(1) • Overflow – ST(0), ST(1), . , ST(7) ST(2) – signed arithmetic out of range ST(3) – arranged in a stack •Sign ST(4) – used for all floating-point – result is negative ST(5) arithmetic •Zero ST(6) – result is zero •Eight 64-bit MMX registers ST(7) • Auxiliary Carry • Eight 128-bit XMM registers for – carry from bit 3 to bit 4 single-instruction multiple-data (SIMD) operations •Parity – sum of 1 bits is an even number 17 18 Programmer’s model Programmer’s model 19 20 Real-address mode • 1 MB RAM maximum addressable (20-bit address) • Application programs can access any area of memory IA-32 Memory Management • Single tasking • Supported by MS-DOS operating system 22 Segmented memory Calculating linear addresses Segmented memory addressing: absolute (linear) address • Given a segment address, multiply it by 16 (add is a combination of a 16-bit segment value added to a 16- a hexadecimal zero), and add it to the offset bit offset • Example: convert 08F1:0100 to a linear address F0000 E0000 8000:FFFF D0000 Adjusted Segment value: 0 8 F 1 0 C0000 B0000 Add the offset: 0 1 0 0 A0000 one segment 90000 80000 (64K) Linear address: 0 9 0 1 0 70000 60000 8000:0250 50000 • A typical program has three segments: code, 40000 0250 30000 8000:0000 data and stack. Segment registers CS, DS and SS 20000 10000 are used to store them separately. seg ofs 00000 23 24 Example Protected mode (1 of 2) • 4 GB addressable RAM (32-bit address) What linear address corresponds to the segment/offset address 028F:0030? – (00000000 to FFFFFFFFh) • Each program assigned a memory partition whic h is protected from other programs 028F0 + 0030 = 02920 • Designed for multitasking • Supported by Linux & MS-Windows Always use hexadecimal notation for addresses. 25 26 Protected mode (2 of 2) Flat segmentation model • Segment descriptor tables • All segments are mapped to the entire 32-bit physical address space, at least two, one for data and one for • Program structure code – code, data, and stack areas • ggp()lobal descriptor table (GDT) – CS, DS, SS segment descriptors – global descriptor table (GDT) • MASM Programs use the Microsoft flat memory model 27 28 Multi-segment model Translating Addresses • Each program has a local descriptor table (LDT) • The IA-32 processor uses a one- or two-step – holds descriptor for each segment used by the program process to convert a variable' s logical address RAM into a unique memory location. •The first step combines a segment value wihith a variable’s offset to create a linear address. Local Descriptor Table • The second optional step, called page translation, converts a linear address to a 26000 base limit access physical address. 00026000 0010 00008000 000A 00003000 0002 8000 multiplied by 1000h 3000 29 Converting Logical to Linear Address Indexing into a Descriptor Table The segment Logical address Each segment descriptor indexes into the program's local selector points to a Selector Offset descriptor table (LDT). Each table entry is mapped to a segment descriptor, linear address: Descriptor table Linear address space which contains the base address of a (unused) memory segment. Segment Descriptor + Logical addresses The 32-bit offset LlDiLocal Descriptor TblTable DRAM from the logical SS ESP 0018 0000003A address is added to the segment’s base DS offset (index) 18 001A0000 address, generating GDTR/LDTR 0010 000001B6 10 0002A000 Linear address a 32-bit linear 08 0001A000 (contains base address of IP address. descriptor table) 00 00003000 0008 00002CD3 LDTR register Paging (1 of 2) Paging (2 of 2) • Virtual memory uses disk as part of the memory, • OS maintains page directory and page tables thus allowing sum of all programs can be larger • Page translation: CPU converts the linear than physical memory address into a physical address • On ly part of a program must be kep t in • Page fault: occurs when a needed page is not memory, while the remaining parts are kept on in memory, and the CPU interrup ts the dis k. program • The memory used by the program is divided • Virtual memory manager (VMM) – OS utility into small units called pages (4096-byte). that manages the loading and unloading of •As the ppgrogram runs, the processor selectivel y pages unloads inactive pages from memory and loads • OS copies the page into memory, program other ppgages that are immediatel y re quired. resumes execution Page Translation A linear address is Linear Address 10 10 12 divided into a page Directory Table Offset directory field, page Page Frame table field, and page Page Directory frame offset. The Page Table Physical Address CPU uses all three to calculate the Page-Table Entry ppyhysical address. Directory Entry CR3 32.

Intel X86 Architecture Intel Microprocessor History

The Von Neumann Computer Model 5/30/17, 10:03 PM

The Intel Microprocessors: Architecture, Programming and Interfacing Introduction to the Microprocessor and Computer

Lecture Notes

Unit 8 : Microprocessor Architecture

HOW FAST? the Current Intel® Core™ Processor Has 43,000,000% More Transistors Than the 4004 Processor

Trends in Processor Architecture

What Is a Microprocessor?

Chapter 1: Microprocessor Architecture

Theoretical Peak FLOPS Per Instruction Set on Modern Intel Cpus

COSC 6385 Computer Architecture - Multi-Processors (IV) Simultaneous Multi-Threading and Multi-Core Processors Edgar Gabriel Spring 2011

Reverse Engineering X86 Processor Microcode

Assembly Language: IA-X86