CPU CISC Example: Intel

Calcolatori Elettronici e Sistemi Operativi

x86: history

CPU          Year  Data bus  Max. mem.  Transistors   Clock (MHz)  Av. MIPS  Level-1 caches
8086         1978  16        1 MB       29K           5-10         0.8
80286        1982  16        16 MB      134K          8-12         2.7
80386        1985  32        4 GB       275K          16-33        6
80486        1989  32        4 GB       1.2M          25-100       20        8 KB
Pentium      1993  64        4 GB       3.1M          60-233       100       8K instr + 8K data
Pentium Pro  1995  64        64 GB      5.5M + 15.5M  150-200      440       8K + 8K + Level 2
Pentium II   1997  64        64 GB      7M            266-450      466-      16K + 16K + L2
Pentium III  1999  64        64 GB      8.2M          500-1000     1000-     16K + 16K + L2
Pentium 4    2001  64        64 GB      42M           1300-2000              8K + L2

Intel introduced microprocessors in 1971
  - 4-bit microprocessor: 4004 (1971)
  - 8-bit microprocessors: 8008 (1972), 8080 (1974), 8085 (1975)

16-bit processors
  - 8086, introduced in 1978: first x86 CPU
      - 20-bit address bus, 16-bit data bus
  - 8088 (1979)
      - A less expensive version of the 8086
      - Uses an 8-bit data bus
  - Can address up to 4 segments of 64 KB
  - Referred to as the real mode

80186 (1982)
  - A faster version of the 8086
  - 16-bit data bus and 20-bit address bus
  - Improved instruction set

80286 (1982)
  - 24-bit address bus: 16 MB address space
  - Enhanced with memory protection capabilities
  - Introduced protected mode
      - Segmentation in protected mode is different from the real mode
  - Backwards compatible

80386 (1985): first 32-bit CPU
  - 32-bit data bus and 32-bit address bus
  - 4 GB address space
  - Segmentation can be turned off (flat model)
  - Introduced paging

80486 (1989)
  - Improved version of the 386
  - Integrated the coprocessor functions for floating-point arithmetic
  - Added parallel execution capability to the instruction decode and execution units
      - Achieves scalar execution of 1 instruction/clock
  - Later versions introduced energy savings for laptops

Pentium (1993)
  - Similar to the 486 but with a 64-bit data bus
  - Wider internal datapaths: 128 and 256 bits wide
  - Added a second execution pipeline
      - Superscalar performance: two instructions/clock
  - Doubled on-chip L1 cache: 8 KB data, 8 KB instruction
  - Added branch prediction

Pentium Pro (1995)
  - Three-way superscalar: 3 instructions/clock
  - 36-bit address bus: 64 GB address space
  - Introduced dynamic execution
      - Out-of-order execution
      - Speculative execution
  - In addition to the L1 cache, has a 256 KB L2 cache

Pentium II (1997)
  - Added multimedia (MMX) instructions
  - Doubled on-chip L1 cache: 16 KB data, 16 KB instruction
  - Introduced comprehensive power management features: sleep, deep sleep
  - In addition to the L1 cache, has a 256 KB L2 cache

Pentium III, Pentium 4, ...

Pentium 4F (2005): first x86-64

IA-32: P6

  - 3-way superscalar, 12-stage pipeline
  - Branch prediction
  - Out-of-order execution
  - Speculative execution
  - Modes of operation
      - real mode (emulates an 8086)
      - protected mode (32-bit environment)
      - system management mode

Example

Core i7-3970X: Sandy Bridge-E (6 cores)
  - 32 nm (2.27 billion transistors)
  - Caches:
      - µ-ops cache: 1536 µ-ops per core
      - L1: 32 KB (I$) + 32 KB (D$) per core [8-way, line: 16 B]
      - L2: 256 KB per core [8-way, line: 64 B]
      - L3: 15 MB shared [16-way, line: 64 B]
  - 3.5 GHz (memory bus: 800 MHz), 150 W
  - Pipeline: 19 stages
      - On a µ-op cache hit, 5 stages are skipped
      - 4 instruction decoders (instruction to µ-ops translators)
  - SIMD instructions
      - MMX
      - SSE, SSE2, SSE3, SSE4
      - AES instructions
      - AVX: Advanced Vector Extensions
  - EM64T: Extended Memory 64 technology
  - NX / XD / Execute disable bit
  - HT: Hyper-Threading technology (hardware multithreading: factor 2)
  - Virtualization support
      - VT-x: Virtualization technology
      - VT-d: Virtualization for directed I/O
  - TBT: Turbo Boost technology
  - Enhanced SpeedStep technology

Operating modes

  - Real-address mode
      - Behaves as an 8086 (with a few extensions)
  - Protected mode
      - Native operating mode
  - System management mode
      - To handle power management and OEM variants
  - Virtual-8086 mode
      - To emulate an 8086 inside the protected mode

GP registers

Register  Special use
EAX       accumulator for operands and result data
EBX       pointer to data
ECX       counter for string and loop operations
EDX       I/O pointer
ESI       pointer to data; source pointer for string operations
EDI       pointer to data (ES segment); destination pointer for string operations
ESP       stack pointer
EBP       pointer to data on the stack

8086: registers

  - Eight 16-bit GP registers: AX (AH, AL), BX (BH, BL), CX (CH, CL), DX (DH, DL), DI, SI, BP, SP
  - Four 16-bit segment registers: CS, DS, ES, SS
  - 16-bit status-flags register: FLAGS
  - 16-bit instruction pointer: IP
  - 8087:
      - Eight 80-bit FP registers: R0 ... R7
      - FP control registers (16-bit): CR, SR, TR
      - IPR: 48 bits; DPR: 48 bits; OPR: 11 bits

  - I-fetch: MEM[CS<<4 + IP]
  - D-fetch: MEM[DS<<4 + address] (other segment selectors can be forced)
        mov AX, [BX+4]
        mov CX, CS:[SI+4]
  - Stack access: MEM[SS<<4 + SP]
        POP AX
        PUSH BX

IA-32: registers

  - Eight 32-bit GP registers: EAX (AX: AH, AL), EBX (BX: BH, BL), ECX (CX: CH, CL), EDX (DX: DH, DL), EDI (DI), ESI (SI), EBP (BP), ESP (SP)
  - Six 16-bit segment registers: CS, DS, ES, SS, FS, GS
  - 32-bit status-flags register: EFLAGS
  - 32-bit instruction pointer: EIP
  - Eight 80-bit FP registers: R0 ... R7
  - FP control registers (16-bit): CR, SR, TR
  - IPR: 48 bits; DPR: 48 bits; OPR: 11 bits
  - Eight 64-bit MMX registers: MM0 ... MM7
  - Eight 128-bit XMM registers: XMM0 ... XMM7

Status-flags Register (EFLAGS)

Bits   Field
31-22  0
21     ID
20     VIP
19     VIF
18     AC
17     VM
16     RF
15     0
14     NT
13-12  IOPL
11     OF
10     DF
9      IF
8      TF
7      SF
6      ZF
5      0
4      AF
3      0
2      PF
1      1
0      CF

User flags:
  - OF: Overflow Flag
  - DF: Direction Flag (set by software to control string operations: MOVS, CMPS, SCAS, LODS, STOS)
  - SF: Sign Flag
  - ZF: Zero Flag
  - AF: Auxiliary Carry Flag (carry generated from bit 3; used for BCD operations)
  - PF: Parity Flag (1: the least significant byte of the result has an even number of 1 bits)
  - CF: Carry Flag

System flags:
  - ID: ID Flag (if writable, the CPUID instruction is supported)
  - VIP: Virtual Interrupt Pending (records that a virtual interrupt is pending; only written by software)
  - VIF: Virtual Interrupt Flag (1: virtual interrupts enabled)
  - AC: Alignment Check (1: alignment check exceptions enabled)
  - VM: Virtual-8086 Mode (set to enable virtual-8086 mode)
  - RF: Resume Flag (1: debug exceptions disabled, to allow resuming after a breakpoint)
  - NT: Nested Task (1: a CALL, an interrupt, or an exception caused a task switch)
  - IOPL: I/O Privilege Level (the current privilege level must be numerically less than or equal to IOPL to access the I/O address space)
  - IF: Interrupt Enable Flag (1: interrupts enabled)
  - TF: Trap Flag (1: single-step mode for debugging)

IA-32: other registers

  - Control registers: CR0, CR1, CR2, CR3, CR4
      - CR0 also specifies whether protected mode is active
  - Memory management registers: GDTR, IDTR, LDTR
      - for protected mode memory management
  - Memory type range registers (MTRRs)
  - Debug registers: DR0, ..., DR7
  - Machine specific registers (MSRs)
  - Machine check registers
  - Performance monitoring counters

Memory model

  - Segmented and paged memory
  - Privilege levels 0-3 (0-2: privileged modes; 3: user mode)
  - Segment and offset: logical address
  - Logical address -> [Segment Descriptor Table] -> Linear address -> [Page Table] -> Physical address

8086 memory model

  - 16-bit processor
  - Only address translation, no protection
  - 20-bit address bus: max addressable memory 1 MB
  - 16-bit data bus (8 bit for the 8088)
  - 16-bit offset: 2^16 B = 64 KB per segment
  - 20-bit physical address = (16-bit segment register << 4) + 16-bit offset
      - the segment register shifted left by 4 gives the 20-bit segment base address
  - Address space: 0 to 2^20 - 1

IA-32: memory models

  - Segmented memory model
  - Flat memory model
      - 32-bit linear address space
  - Real-address memory model
      - for 8086 emulation

IA-32: segmented memory model

  - Logical address: segment_selector : offset
      - 16-bit segment selector (held in a segment register)
      - 32-bit offset
  - Bits 15:3 of the selector (13 bits) index the Segment Descriptor Table (Global or Local), located through GDTR or LDTR
  - The segment descriptor provides access information, limit, and a 32-bit base address
  - Linear address = base address + offset
  - [Figure: each 16-bit segment register (CS, DS, ES, FS, GS, SS) selects, through a 13-bit index, a descriptor (access, limit, base address) in the Segment Descriptor Table; the code segment and the data segments are distinct regions of the linear address space, 0 to 2^32 - 1]

IA-32: flat memory model

  - [Figure: CS and DS select descriptors (access, limit, base address) in the Segment Descriptor Table; a 32-bit address spans the whole linear address space, 0 to 2^32 - 1]

IA-32: segment registers

Segment selector (16 bits)

Bits  Field
15-3  Index
2     TI (0: GDT; 1: LDT)
1-0   RPL: requested privilege level

  - Bits 1:0 of the selector in CS give the current privilege level (CPL)
  - Each segment register (CS, DS, ES, FS, GS, SS) has two portions:
      - Visible portion: segment selector
      - Hidden portion (shadow): access information, limit, base address

IA-32: Segment Descriptors

Bits   Field
63-56  BASE 31:24
55     G
54     D/B
53     L
52     AVL
51-48  LIMIT 19:16
47     P
46-45  DPL
44     S
43-40  TYPE
39-32  BASE 23:16
31-16  BASE 15:0
15-0   LIMIT 15:0

  - L: 64-bit code segment
  - AVL: available for system software
  - BASE: base address
  - D/B:
      - code segment: default operation size (16 bit (0) or 32 bit (1))
      - data segment: address size for stack access (16 bit (0) or 32 bit (1, Big))
  - DPL: descriptor privilege level
  - G: granularity (byte (0) or page (1))
  - LIMIT: segment size (bytes if G=0, 4 KB pages if G=1)
  - P: present
  - S: type (0 = system; 1 = code or data)
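
The selector and descriptor bit layouts given here can be checked with a few shifts and masks. The following is only an illustrative Python sketch: the names `decode_selector` and `decode_descriptor` are hypothetical, and the `last_offset` value for G=1 (limit in 4 KB pages, so the low 12 bits are filled with ones) is an assumption based on the usual reading of the granularity bit, not something stated on the slides.

```python
from typing import NamedTuple


class Selector(NamedTuple):
    index: int  # bits 15:3, entry number in the descriptor table
    ti: int     # bit 2, 0 = GDT, 1 = LDT
    rpl: int    # bits 1:0, requested privilege level


def decode_selector(sel: int) -> Selector:
    """Split a 16-bit segment selector into Index, TI and RPL."""
    return Selector(index=(sel >> 3) & 0x1FFF, ti=(sel >> 2) & 1, rpl=sel & 3)


def decode_descriptor(desc: int) -> dict:
    """Extract the fields of a 64-bit code/data segment descriptor."""
    # LIMIT 15:0 sits in bits 15:0, LIMIT 19:16 in bits 51:48.
    limit = (desc & 0xFFFF) | (((desc >> 48) & 0xF) << 16)
    # BASE 23:0 is contiguous in bits 39:16, BASE 31:24 sits in bits 63:56.
    base = ((desc >> 16) & 0xFFFFFF) | (((desc >> 56) & 0xFF) << 24)
    g = (desc >> 55) & 1
    # Assumption: with G=1 the limit counts 4 KB pages.
    last_offset = ((limit << 12) | 0xFFF) if g else limit
    return {
        "base": base,
        "limit": limit,
        "last_offset": last_offset,
        "G": g,
        "D/B": (desc >> 54) & 1,
        "L": (desc >> 53) & 1,
        "AVL": (desc >> 52) & 1,
        "P": (desc >> 47) & 1,
        "DPL": (desc >> 45) & 3,
        "S": (desc >> 44) & 1,
        "TYPE": (desc >> 40) & 0xF,
    }


if __name__ == "__main__":
    print(decode_selector(0x002B))
    print(decode_descriptor(0x00CF9A000000FFFF))
```

The descriptor value `0x00CF9A000000FFFF` used in the demo is a commonly seen flat 32-bit code segment (base 0, 20-bit limit of all ones, page granularity); it is chosen here only as a convenient test value.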

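
The 8086 address computation described in this section (a 16-bit segment register shifted left by 4, plus a 16-bit offset, giving a 20-bit physical address) can be sketched as below. The function name `real_mode_address` is hypothetical, and the wrap-around to 20 bits is an assumption that follows from the 20-bit address bus.

```python
def real_mode_address(segment: int, offset: int) -> int:
    """Physical address on an 8086: (segment << 4) + offset, kept to 20 bits."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must be 16-bit values")
    # Assumption: only 20 address lines exist, so the sum wraps at 1 MB.
    return ((segment << 4) + offset) & 0xFFFFF


if __name__ == "__main__":
    # Instruction fetch with CS=0x1234, IP=0x0010.
    print(hex(real_mode_address(0x1234, 0x0010)))
    # A different segment:offset pair that names the same byte.
    print(hex(real_mode_address(0x1235, 0x0000)))
```

Because segments start every 16 bytes and each spans 64 KB, many segment:offset pairs map to the same physical address, as the two calls in the demo illustrate.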
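
The EFLAGS bit positions listed in this section can be turned into a small decoder. This is a minimal Python sketch; the table `EFLAGS_BITS` and the function `decode_eflags` are hypothetical names introduced only for illustration.

```python
# Single-bit flags and their positions in EFLAGS.
EFLAGS_BITS = {
    "CF": 0, "PF": 2, "AF": 4, "ZF": 6, "SF": 7, "TF": 8, "IF": 9,
    "DF": 10, "OF": 11, "NT": 14, "RF": 16, "VM": 17, "AC": 18,
    "VIF": 19, "VIP": 20, "ID": 21,
}


def decode_eflags(value: int) -> dict:
    """Return every flag as 0/1, plus the 2-bit IOPL field (bits 13:12)."""
    flags = {name: (value >> bit) & 1 for name, bit in EFLAGS_BITS.items()}
    flags["IOPL"] = (value >> 12) & 3
    return flags


if __name__ == "__main__":
    flags = decode_eflags(0x246)
    # List the single-bit flags that are set.
    print([name for name in EFLAGS_BITS if flags[name]])
```

For the sample value `0x246`, bits 2, 6 and 9 are set in addition to the always-one bit 1, so the decoder reports PF, ZF and IF.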