Intel® Processor Architecture

Total Page:16

File Type:pdf, Size:1020Kb

Intel® Processor Architecture Intel® Processor Architecture January 2013 Software & Services Group, Developer Products Division Copyright © 2013, Intel Corporation. All rights reserved. *Other brands and names are the property of their respective owners. Agenda •Overview Intel® processor architecture •Intel x86 ISA (instruction set architecture) •Micro-architecture of processor core •Uncore structure •Additional processor features – Hyper-threading – Turbo mode •Summary Software & Services Group, Developer Products Division Copyright © 2013, Intel Corporation. All rights reserved. *Other brands and names are the property of their respective owners. 2 Intel® Processor Segments Today Architecture Target ISA Specific Platforms Features Intel® phone, tablet, x86 up to optimized for ATOM™ netbook, low- SSSE-3, 32 low-power, in- Architecture power server and 64 bit order Intel® mainstream x86 up to flexible Core™ notebook, Intel® AVX, feature set Architecture desktop, server 32 and 64bit covering all needs Intel® high end server IA64, x86 by RAS, large Itanium® emulation address space Architecture Intel® MIC accelerator for x86 and +60 cores, Architecture HPC Intel® MIC optimized for Instruction Floating-Point Set performance Software & Services Group, Developer Products Division Copyright © 2013, Intel Corporation. All rights reserved. *Other brands and names are the property of their respective owners. 3 Itanium® 9500 (Poulson) New Itanium Processor Poulson Processor • Compatible with Itanium® 9300 processors (Tukwila) Core Core 32MB Core Shared Core • New micro-architecture with 8 Last Core Level Core Cores Cache Core Core • 54 MB on-die cache Memory Link • Improved RAS and power Controllers Controllers management capabilities • Doubles execution width from 6 to 12 instructions/cycle 4 Intel 4 Full + 2 Half Scalable Width Intel • 32nm process technology Memory QuickPath Interface Interconnect • Launched in November 2012 (SMI) Compatibility provides protection for today’s Itanium® investment Software & Services Group, Developer Products Division Copyright © 2013, Intel Corporation. All rights reserved. *Other brands and names are the property of their respective owners. 5 Intel® XEON™ Phi Former Code Name “Knights Corner” Intel® XEON™ Phi - The first product implementation of the Intel® Many Integrated Core Architecture (Intel® MIC) Software & Services Group, Developer Products Division Copyright © 2013, Intel Corporation. All rights reserved. *Other brands and names are the property of their respective owners. 6 Agenda •Overview Intel® processor architecture •Intel x86 ISA (instruction set architecture) •Micro-architecture of processor core •Uncore structure •Additional processor features – Hyper-threading – Turbo mode •Summary Software & Services Group, Developer Products Division Copyright © 2013, Intel Corporation. All rights reserved. *Other brands and names are the property of their respective owners. 7 X86: From Smartphones to … Motorola RAZR* i • Launched September 2012 • RAZR i is the first smartphone that can achieve speeds of 2.0 GHz Processor: Intel® ATOM™ Z2460 Software & Services Group, Developer Products Division Copyright © 2013, Intel Corporation. All rights reserved. *Other brands and names are the property of their respective owners. 8 X86: … to Supercomputers LRZ SuperMUC System • Installed summer 2012 – Most powerful x86-architecture based computer – #6 on Top500 list – More than 150000 cores • Processor: - Intel® Xeon® E5-2680 (“Sandy Bridge”) - Intel® Xeon® E7-4870 (“Westmere”) Software & Services Group, Developer Products Division Copyright © 2013, Intel Corporation. All rights reserved. *Other brands and names are the property of their respective owners. 9 Intel Tick-Tock Roadmap for Mainstream x86 Architecture since 2006 2nd 3nd ® ™ Generation Generation Intel Core Micro Architecture Intel® Core™ Intel® Core™ MicroArchitecture Codename “Nehalem” Micro Micro Architecture Architecture Merom Penryn Nehalem Westmere Sandy Bridge Ivy Bridge NEW NEW NEW NEW NEW NEW Micro architecture Process Technology Micro architecture Process Technology Micro architecture Process Technology 65nm 45nm 45nm 32nm 32nm 22nm TOCK TICK TOCK TICK TOCK TICK 2006 2007 2008 2009 2011 2012 SSSE-3 SSE4.1 SSE4.2 AES AVX 7 new instructions TICK + TOCK = SHRINK + INNOVATE Software & Services Group, Developer Products Division Copyright © 2013, Intel Corporation. All rights reserved. *Other brands and names are the property of their respective owners. 12 To be continued ... 4nth Generation Intel® Core™ TBD TBD TBD TBD TBD Micro Architecture Haswell Broadwell TBD TBD TBD TBD NEW NEW NEW NEW NEW NEW Micro architecture Process Technology Micro architecture Process Technology Micro architecture Process Technology 22nm 14nm 14nm 10nm 10nm 7nm TICK TOCK TICK TOCK TICK TOCK 2013 >= 2014 ??? ??? ??? ??? AVX-2 TICK + TOCK = SHRINK + INNOVATE Software & Services Group, Developer Products Division Copyright © 2013, Intel Corporation. All rights reserved. *Other brands and names are the property of their respective owners. 13 Registers State for Intel® Pentium® 3 Processor (1998) IA32-INT MMX Technology / SSE Registers Registers IA-FP Registers 80 32 64 128 eax st0 mm0 xmm0 … edi st7 mm7 xmm7 Fourteen 32-bit registers Eight 80/64-bit registers Eight 128-bit registers Scalar data & addresses Hold data only Hold data only: Direct access to regs Direct access to MM0..MM7 4 x single FP numbers No MMX™ Technology / FP 2 x double FP numbers interoperability 128-bit packed integers Software & Services Group, Developer Products Division Copyright © 2013, Intel Corporation. All rights reserved. *Other brands and names are the property of their respective owners. 15 SSE Vector Types 4x single precision Intel® SSE FP 2x double precision FP 16x 8 bit integer 8x 16 bit integer Intel® SSE2 4x 32 bit integer 2x 64 bit integer plain 128 bit Software & Services Group, Developer Products Division Copyright © 2013, Intel Corporation. All rights reserved. *Other brands and names are the property of their respective owners. 16 AVX Vector Types 8x single precision FP Intel® AVX 4x double precision FP 32x 8 bit integer 16x 16 bit integer Intel® AVX2 8x 32 bit integer (Future) 4x 64 bit integer plain 256 bit Software & Services Group, Developer Products Division Copyright © 2013, Intel Corporation. All rights reserved. *Other brands and names are the property of their respective owners. 19 X86 ISA: The Instruction Set • The instruction set for the x86 architecture has been extended numerous times since the set supported by the 8086 processor – See http://en.wikipedia.org/wiki/X86_instruction_listings for an excellent overview • Today, the “base” instructions set (“IA32 ISA”) is the one supported by the first 32bit processor - 80386 • Multiple, “smaller” extensions added then before SSE (1998 / Intel® Pentium® 3) like – MMX( 64 bit SIMD using the x87 FP registers) – Conditional move – Atomic exchange Software & Services Group, Developer Products Division Copyright © 2013, Intel Corporation. All rights reserved. *Other brands and names are the property of their respective owners. 20 New Instructions in Haswell (2013) Group Description Count * SIMD Integer Adding vector integer operations to 256-bit Instructions promoted to 256 bits Gather Load elements from vector of indices 170 / 124 vectorization enabler AVX2 Shuffling / Data Blend, element shift and permute instructions Rearrangement FMA Fused Multiply-Add operation forms ( FMA-3) 96 / 60 Bit Manipulation and Improving performance of bit stream manipulation and 15 / 15 Cryptography decode, large integer arithmetic and hashes TSX=RTM+HLE Transactional Memory 4/4 Others MOVBE: Load and Store of Big Endian forms 2 / 2 INVPCID: Invalidate processor context ID * Total instructions / different mnemonics Software & Services Group, Developer Products Division Copyright © 2013, Intel Corporation. All rights reserved. *Other brands and names are the property of their respective owners. 23 HSW Improvements for Threading Sample Code Computing PI by Windows Threads #include <windows.h> void main () #define NUM_THREADS 2 { HANDLE thread_handles[NUM_THREADS]; double pi; int i; CRITICAL_SECTION hUpdateMutex; DWORD threadID; static long num_steps = 100000; int threadArg[NUM_THREADS]; double step; double global_sum = 0.0; for(i=0; i<NUM_THREADS; i++) threadArg[i] = i+1; void Pi (void *arg) { InitializeCriticalSection(&hUpdateMutex); int i, start; double x, sum = 0.0; for (i=0; i<NUM_THREADS; i++){ thread_handles[i] = CreateThread(0, 0, (LPTHREAD_START_ROUTINE) Pi, start = *(int *) arg; &threadArg[i], 0, &threadID); step = 1.0/(double) num_steps; } for (i=start;i<= num_steps; WaitForMultipleObjects(NUM_THREADS, i=i+NUM_THREADS){ thread_handles, TRUE,INFINITE); x = (i-0.5)*step; sum = sum + 4.0/(1.0+x*x); pi = global_sum * step; } EnterCriticalSection(&hUpdateMutex); printf(" pi is %f \n",pi); global_sum += sum; } LeaveCriticalSection(&hUpdateMutex); } Locks can be key bottleneck – even in case there is no conflict Copyright© 2012, Intel Corporation. All rights reserved. Partially Intel Confidential Information. *Other brands and names are the property of their respective owners. Intel® Transactional Synchronization Extensions (Intel® TSX) Intel® TSX = HLE + RTM HLE (Hardware Lock Elision) is a hint inserted in front of a LOCK operation to indicate a region is a candidate for lock elision • XACQUIRE (0xF2) and XRELEASE (0xF3) prefixes • Don’t actually acquire lock, but execute region speculatively • Hardware buffers loads and stores, checkpoints registers • Hardware
Recommended publications
  • AMD Athlon™ Processor X86 Code Optimization Guide
    AMD AthlonTM Processor x86 Code Optimization Guide © 2000 Advanced Micro Devices, Inc. All rights reserved. The contents of this document are provided in connection with Advanced Micro Devices, Inc. (“AMD”) products. AMD makes no representations or warranties with respect to the accuracy or completeness of the contents of this publication and reserves the right to make changes to specifications and product descriptions at any time without notice. No license, whether express, implied, arising by estoppel or otherwise, to any intellectual property rights is granted by this publication. Except as set forth in AMD’s Standard Terms and Conditions of Sale, AMD assumes no liability whatsoever, and disclaims any express or implied warranty, relating to its products including, but not limited to, the implied warranty of merchantability, fitness for a particular purpose, or infringement of any intellectual property right. AMD’s products are not designed, intended, authorized or warranted for use as components in systems intended for surgical implant into the body, or in other applications intended to support or sustain life, or in any other applica- tion in which the failure of AMD’s product could create a situation where per- sonal injury, death, or severe property or environmental damage may occur. AMD reserves the right to discontinue or make changes to its products at any time without notice. Trademarks AMD, the AMD logo, AMD Athlon, K6, 3DNow!, and combinations thereof, AMD-751, K86, and Super7 are trademarks, and AMD-K6 is a registered trademark of Advanced Micro Devices, Inc. Microsoft, Windows, and Windows NT are registered trademarks of Microsoft Corporation.
    [Show full text]
  • Introduction to the Poulson (Intel 9500 Series) Processor Openvms Advanced Technical Boot Camp 2015 Keith Parris / September 29, 2015
    Introduction to the Poulson (Intel 9500 Series) Processor OpenVMS Advanced Technical Boot Camp 2015 Keith Parris / September 29, 2015 © Copyright 2015 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. Information on Poulson from Intel’s ISSCC Paper © Copyright 2015 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. Poulson information from Intel’s ISSCC Paper http://www.intel.com/content/dam/www/public/us/en/documents/white-papers/itanium-poulson-isscc-paper.pdf 3 © Copyright 2015 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. Poulson information from Intel’s ISSCC Paper http://www.intel.com/content/dam/www/public/us/en/documents/white-papers/itanium-poulson-isscc-paper.pdf 4 © Copyright 2015 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. Poulson information from Intel’s ISSCC Paper http://www.intel.com/content/dam/www/public/us/en/documents/white-papers/itanium-poulson-isscc-paper.pdf 5 © Copyright 2015 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. Poulson information from Intel’s ISSCC Paper • Intel presented a paper on Poulson at the International Solid-State Chips Conference (ISSCC) in July 2011. From this, we learned: • Poulson would be in a 32 nm process (2 process generations ahead from Tukwila, which was at 65 nm, skipping the 45 nm process) • The socket would be compatible with Tukwila • Poulson would have 8 cores, of a brand new core design • The front end (instruction fetch) would be decoupled from the back end (instruction execution) • Poulson could execute and retire as many as 12 instructions per cycle, double Tukwila’s 6 instructions http://www.intel.com/content/dam/www/public/us/en/documents/white-papers/itanium-poulson-isscc-paper.pdf 6 © Copyright 2015 Hewlett-Packard Development Company, L.P.
    [Show full text]
  • Inside Intel® Core™ Microarchitecture Setting New Standards for Energy-Efficient Performance
    White Paper Inside Intel® Core™ Microarchitecture Setting New Standards for Energy-Efficient Performance Ofri Wechsler Intel Fellow, Mobility Group Director, Mobility Microprocessor Architecture Intel Corporation White Paper Inside Intel®Core™ Microarchitecture Introduction Introduction 2 The Intel® Core™ microarchitecture is a new foundation for Intel®Core™ Microarchitecture Design Goals 3 Intel® architecture-based desktop, mobile, and mainstream server multi-core processors. This state-of-the-art multi-core optimized Delivering Energy-Efficient Performance 4 and power-efficient microarchitecture is designed to deliver Intel®Core™ Microarchitecture Innovations 5 increased performance and performance-per-watt—thus increasing Intel® Wide Dynamic Execution 6 overall energy efficiency. This new microarchitecture extends the energy efficient philosophy first delivered in Intel's mobile Intel® Intelligent Power Capability 8 microarchitecture found in the Intel® Pentium® M processor, and Intel® Advanced Smart Cache 8 greatly enhances it with many new and leading edge microar- Intel® Smart Memory Access 9 chitectural innovations as well as existing Intel NetBurst® microarchitecture features. What’s more, it incorporates many Intel® Advanced Digital Media Boost 10 new and significant innovations designed to optimize the Intel®Core™ Microarchitecture and Software 11 power, performance, and scalability of multi-core processors. Summary 12 The Intel Core microarchitecture shows Intel’s continued Learn More 12 innovation by delivering both greater energy efficiency Author Biographies 12 and compute capability required for the new workloads and usage models now making their way across computing. With its higher performance and low power, the new Intel Core microarchitecture will be the basis for many new solutions and form factors. In the home, these include higher performing, ultra-quiet, sleek and low-power computer designs, and new advances in more sophisticated, user-friendly entertainment systems.
    [Show full text]
  • Microcode Revision Guidance August 31, 2019 MCU Recommendations
    microcode revision guidance August 31, 2019 MCU Recommendations Section 1 – Planned microcode updates • Provides details on Intel microcode updates currently planned or available and corresponding to Intel-SA-00233 published June 18, 2019. • Changes from prior revision(s) will be highlighted in yellow. Section 2 – No planned microcode updates • Products for which Intel does not plan to release microcode updates. This includes products previously identified as such. LEGEND: Production Status: • Planned – Intel is planning on releasing a MCU at a future date. • Beta – Intel has released this production signed MCU under NDA for all customers to validate. • Production – Intel has completed all validation and is authorizing customers to use this MCU in a production environment.
    [Show full text]
  • Evolution of Microprocessor Performance
    EvolutionEvolution ofof MicroprocessorMicroprocessor PerformancePerformance So far we examined static & dynamic techniques to improve the performance of single-issue (scalar) pipelined CPU designs including: static & dynamic scheduling, static & dynamic branch predication. Even with these improvements, the restriction of issuing a single instruction per cycle still limits the ideal CPI = 1 Multiple Issue (CPI <1) Multi-cycle Pipelined T = I x CPI x C (single issue) Superscalar/VLIW/SMT Original (2002) Intel Predictions 1 GHz ? 15 GHz to ???? GHz IPC CPI > 10 1.1-10 0.5 - 1.1 .35 - .5 (?) Source: John P. Chen, Intel Labs We next examine the two approaches to achieve a CPI < 1 by issuing multiple instructions per cycle: 4th Edition: Chapter 2.6-2.8 (3rd Edition: Chapter 3.6, 3.7, 4.3 • Superscalar CPUs • Very Long Instruction Word (VLIW) CPUs. Single-issue Processor = Scalar Processor EECC551 - Shaaban Instructions Per Cycle (IPC) = 1/CPI EECC551 - Shaaban #1 lec # 6 Fall 2007 10-2-2007 ParallelismParallelism inin MicroprocessorMicroprocessor VLSIVLSI GenerationsGenerations Bit-level parallelism Instruction-level Thread-level (?) (TLP) 100,000,000 (ILP) Multiple micro-operations Superscalar /VLIW per cycle Simultaneous Single-issue CPI <1 u Multithreading SMT: (multi-cycle non-pipelined) Pipelined e.g. Intel’s Hyper-threading 10,000,000 CPI =1 u uuu u u Chip-Multiprocessors (CMPs) u Not Pipelined R10000 e.g IBM Power 4, 5 CPI >> 1 uuuuuuu u AMD Athlon64 X2 u uuuuu Intel Pentium D u uuuuuuuu u u 1,000,000 u uu uPentium u u uu i80386 u i80286
    [Show full text]
  • Multiprocessing Contents
    Multiprocessing Contents 1 Multiprocessing 1 1.1 Pre-history .............................................. 1 1.2 Key topics ............................................... 1 1.2.1 Processor symmetry ...................................... 1 1.2.2 Instruction and data streams ................................. 1 1.2.3 Processor coupling ...................................... 2 1.2.4 Multiprocessor Communication Architecture ......................... 2 1.3 Flynn’s taxonomy ........................................... 2 1.3.1 SISD multiprocessing ..................................... 2 1.3.2 SIMD multiprocessing .................................... 2 1.3.3 MISD multiprocessing .................................... 3 1.3.4 MIMD multiprocessing .................................... 3 1.4 See also ................................................ 3 1.5 References ............................................... 3 2 Computer multitasking 5 2.1 Multiprogramming .......................................... 5 2.2 Cooperative multitasking ....................................... 6 2.3 Preemptive multitasking ....................................... 6 2.4 Real time ............................................... 7 2.5 Multithreading ............................................ 7 2.6 Memory protection .......................................... 7 2.7 Memory swapping .......................................... 7 2.8 Programming ............................................. 7 2.9 See also ................................................ 8 2.10 References .............................................
    [Show full text]
  • Intel® Core™ Microarchitecture • Wrap Up
    EW N IntelIntel®® CoreCore™™ MicroarchitectureMicroarchitecture MarchMarch 8,8, 20062006 Stephen L. Smith Bob Valentine Vice President Architect Digital Enterprise Group Intel Architecture Group Agenda • Multi-core Update and New Microarchitecture Level Set • New Intel® Core™ Microarchitecture • Wrap Up 2 Intel Multi-core Roadmap – Updates since Fall IDF 3 Ramping Multi-core Everywhere 4 All products and dates are preliminary and subject to change without notice. Refresher: What is Multi-Core? Two or more independent execution cores in the same processor Specific implementations will vary over time - driven by product implementation and manufacturing efficiencies • Best mix of product architecture and volume mfg capabilities – Architecture: Shared Caches vs. Independent Caches – Mfg capabilities: volume packaging technology • Designed to deliver performance, OEM and end user experience Single die (Monolithic) based processor Multi-Chip Processor Example: 90nm Pentium® D Example: Intel Core™ Duo Example: 65nm Pentium D Processor (Smithfield) Processor (Yonah) Processor (Presler) Core0 Core1 Core0 Core1 Core0 Core1 Front Side Bus Front Side Bus Front Side Bus *Not representative of actual die photos or relative size 5 Intel® Core™ Micro-architecture *Not representative of actual die photo or relative size 6 Intel Multi-core Roadmap 7 Intel Multi-core Roadmap 8 Intel® Core™ Microarchitecture Based Platforms Platform 2006 20072007 Caneland Platform (2007) MP Servers Tigerton (QC) (2007) Bensley Platform (Q2’06)/ Glidewell Platform (Q2’06) ) DP Servers/ Woodcrest (Q3’06) DP Workstation Clovertown (QC) (Q1’07) Kaylo Platform (Q3’06)/ Wyloway Platform (Q3 ’06) UP Servers/ Conroe (Q3’06) UP Workstation Kentsfield (QC) (Q1’07) Bridge Creek Platform (Mid’06) Desktop -Home Conroe (Q3’06) Kentsfield (QC) (Q1’07) Desktop -Office Averill Platform (Mid’06) Conroe (Q3’06) Mobile Client Napa Platform (Q1’06) Merom (2H’06) All products and dates are preliminary 9 Note: only Intel® Core™ microarchitecture QC refers to Quad-Core and subject to change without notice.
    [Show full text]
  • Evaluation of the Intel Sandy Bridge-EP Server Processor
    Evaluation of the Intel Sandy Bridge-EP server processor Sverre Jarp, Alfio Lazzaro, Julien Leduc, Andrzej Nowak CERN openlab, March 2012 – version 2.2 Executive Summary In this paper we report on a set of benchmark results recently obtained by CERN openlab when comparing an 8-core “Sandy Bridge-EP” processor with Intel’s previous microarchitecture, the “Westmere-EP”. The Intel marketing names for these processors are “Xeon E5-2600 processor series” and “Xeon 5600 processor series”, respectively. Both processors are produced in a 32nm process, and both platforms are dual-socket servers. Multiple benchmarks were used to get a good understanding of the performance of the new processor. We used both industry-standard benchmarks, such as SPEC2006, and specific High Energy Physics benchmarks, representing both simulation of physics detectors and data analysis of physics events. Before summarizing the results we must stress the fact that benchmarking of modern processors is a very complex affair. One has to control (at least) the following features: processor frequency, overclocking via Turbo mode, the number of physical cores in use, the use of logical cores via Simultaneous Multi-Threading (SMT), the cache sizes available, the memory configuration installed, as well as the power configuration if throughput per watt is to be measured. Software must also be kept under control and we show that a change of operating system or compiler can lead to different results, as well. We have tried to do a good job of comparing like with like. In summary, we obtained a performance improvement of 9 – 20% per core and 46 – 60% improvement across all cores available.
    [Show full text]
  • Nt* and Rtl* INT 2Eh CALL Ntdll!Kifastsystemcall
    ȘFĢ: Fųřțįm’ș Đěřįvǻțįvě Bỳ Jǿșěpħ Ŀǻňđřỳ ǻňđ Ųđį Șħǻmįř Țħě Ŀǻbș țěǻm ǻț ȘěňțįňěŀǾňě řěčěňțŀỳ đįșčǿvěřěđ ǻ șǿpħįșțįčǻțěđ mǻŀẅǻřě čǻmpǻįģň șpěčįfįčǻŀŀỳ țǻřģěțįňģ ǻț ŀěǻșț ǿňě Ěųřǿpěǻň ěňěřģỳ čǿmpǻňỳ. Ųpǿň đįșčǿvěřỳ, țħě țěǻm řěvěřșě ěňģįňěěřěđ țħě čǿđě ǻňđ běŀįěvěș țħǻț bǻșěđ ǿň țħě ňǻțųřě, běħǻvįǿř ǻňđ șǿpħįșțįčǻțįǿň ǿf țħě mǻŀẅǻřě ǻňđ țħě ěxțřěmě měǻșųřěș įț țǻķěș țǿ ěvǻđě đěțěčțįǿň, įț ŀįķěŀỳ pǿįňțș țǿ ǻ ňǻțįǿň-șțǻțě șpǿňșǿřěđ įňįțįǻțįvě, pǿțěňțįǻŀŀỳ ǿřįģįňǻțįňģ įň Ěǻșțěřň Ěųřǿpě. Țħě mǻŀẅǻřě įș mǿșț ŀįķěŀỳ ǻ đřǿppěř țǿǿŀ běįňģ ųșěđ țǿ ģǻįň ǻččěșș țǿ čǻřěfųŀŀỳ țǻřģěțěđ ňěțẅǿřķ ųșěřș, ẅħįčħ įș țħěň ųșěđ ěįțħěř țǿ įňțřǿđųčě țħě pǻỳŀǿǻđ, ẅħįčħ čǿųŀđ ěįțħěř ẅǿřķ țǿ ěxțřǻčț đǻțǻ ǿř įňșěřț țħě mǻŀẅǻřě țǿ pǿțěňțįǻŀŀỳ șħųț đǿẅň ǻň ěňěřģỳ ģřįđ. Țħě ěxpŀǿįț ǻffěčțș ǻŀŀ věřșįǿňș ǿf Mįčřǿșǿfț Ẅįňđǿẅș ǻňđ ħǻș běěň đěvěŀǿpěđ țǿ bỳpǻșș țřǻđįțįǿňǻŀ ǻňțįvįřųș șǿŀųțįǿňș, ňěxț-ģěňěřǻțįǿň fįřěẅǻŀŀș, ǻňđ ěvěň mǿřě řěčěňț ěňđpǿįňț șǿŀųțįǿňș țħǻț ųșě șǻňđbǿxįňģ țěčħňįqųěș țǿ đěțěčț ǻđvǻňčěđ mǻŀẅǻřě. (bįǿměțřįč řěǻđěřș ǻřě ňǿň-řěŀěvǻňț țǿ țħě bỳpǻșș / đěțěčțįǿň țěčħňįqųěș, țħě mǻŀẅǻřě ẅįŀŀ șțǿp ěxěčųțįňģ įf įț đěțěčțș țħě přěșěňčě ǿf șpěčįfįč bįǿměțřįč věňđǿř șǿfțẅǻřě). Ẅě běŀįěvě țħě mǻŀẅǻřě ẅǻș řěŀěǻșěđ įň Mǻỳ ǿf țħįș ỳěǻř ǻňđ įș șțįŀŀ ǻčțįvě. İț ěxħįbįțș țřǻįțș șěěň įň přěvįǿųș ňǻțįǿň-șțǻțě Řǿǿțķįțș, ǻňđ ǻppěǻřș țǿ ħǻvě běěň đěșįģňěđ bỳ mųŀțįpŀě đěvěŀǿpěřș ẅįțħ ħįģħ-ŀěvěŀ șķįŀŀș ǻňđ ǻččěșș țǿ čǿňșįđěřǻbŀě řěșǿųřčěș. Ẅě vǻŀįđǻțěđ țħįș mǻŀẅǻřě čǻmpǻįģň ǻģǻįňșț ȘěňțįňěŀǾňě ǻňđ čǿňfįřměđ țħě șțěpș ǿųțŀįňěđ běŀǿẅ ẅěřě đěțěčțěđ bỳ ǿųř Đỳňǻmįč Běħǻvįǿř Țřǻčķįňģ (ĐBȚ) ěňģįňě. Mǻŀẅǻřě Șỳňǿpșįș Țħįș șǻmpŀě ẅǻș ẅřįțțěň įň ǻ mǻňňěř țǿ ěvǻđě șțǻțįč ǻňđ běħǻvįǿřǻŀ đěțěčțįǿň. Mǻňỳ ǻňțį-șǻňđbǿxįňģ țěčħňįqųěș ǻřě ųțįŀįżěđ.
    [Show full text]
  • The Intel X86 Microarchitectures Map Version 2.0
    The Intel x86 Microarchitectures Map Version 2.0 P6 (1995, 0.50 to 0.35 μm) 8086 (1978, 3 µm) 80386 (1985, 1.5 to 1 µm) P5 (1993, 0.80 to 0.35 μm) NetBurst (2000 , 180 to 130 nm) Skylake (2015, 14 nm) Alternative Names: i686 Series: Alternative Names: iAPX 386, 386, i386 Alternative Names: Pentium, 80586, 586, i586 Alternative Names: Pentium 4, Pentium IV, P4 Alternative Names: SKL (Desktop and Mobile), SKX (Server) Series: Pentium Pro (used in desktops and servers) • 16-bit data bus: 8086 (iAPX Series: Series: Series: Series: • Variant: Klamath (1997, 0.35 μm) 86) • Desktop/Server: i386DX Desktop/Server: P5, P54C • Desktop: Willamette (180 nm) • Desktop: Desktop 6th Generation Core i5 (Skylake-S and Skylake-H) • Alternative Names: Pentium II, PII • 8-bit data bus: 8088 (iAPX • Desktop lower-performance: i386SX Desktop/Server higher-performance: P54CQS, P54CS • Desktop higher-performance: Northwood Pentium 4 (130 nm), Northwood B Pentium 4 HT (130 nm), • Desktop higher-performance: Desktop 6th Generation Core i7 (Skylake-S and Skylake-H), Desktop 7th Generation Core i7 X (Skylake-X), • Series: Klamath (used in desktops) 88) • Mobile: i386SL, 80376, i386EX, Mobile: P54C, P54LM Northwood C Pentium 4 HT (130 nm), Gallatin (Pentium 4 Extreme Edition 130 nm) Desktop 7th Generation Core i9 X (Skylake-X), Desktop 9th Generation Core i7 X (Skylake-X), Desktop 9th Generation Core i9 X (Skylake-X) • Variant: Deschutes (1998, 0.25 to 0.18 μm) i386CXSA, i386SXSA, i386CXSB Compatibility: Pentium OverDrive • Desktop lower-performance: Willamette-128
    [Show full text]
  • M39 Sandy Bridge-PDF
    SANDY BRIDGE SPANS GENERATIONS Intel Focuses on Graphics, Multimedia in New Processor Design By Linley Gwennap {9/27/10-01} ................................................................................................................... Intel’s processor clock has tocked, delivering a next- periods. For notebook computers, these improvements can generation architecture for PCs and servers. At the recent significantly extend battery life by completing tasks more Intel Developer’s Forum (IDF), the company unveiled its quickly and allowing the system to revert to a sleep state. Sandy Bridge processor architecture, the next tock in its tick-tock roadmap. The new CPU is an evolutionary im- Integration Boosts Graphics Performance provement over its predecessor, Nehalem, tweaking the Intel had a false start with integrated graphics: the ill-fated branch predictor, register renaming, and instruction de- Timna project, which was canceled in 2000. More recently, coding. These changes will slightly improve performance Nehalem-class processors known as Arrandale and Clark- on traditional integer applications, but we may be reaching dale “integrated” graphics into the processor, but these the point where the CPU microarchitecture is so efficient, products actually used two chips in one package, as Figure few ways remain to improve performance. 1 shows. By contrast, Sandy Bridge includes the GPU on The big changes in Sandy Bridge target multimedia the processor chip, providing several benefits. The GPU is applications such as 3D graphics, image processing, and now built in the same leading-edge manufacturing process video processing. The chip is Intel’s first to integrate the as the CPU, rather than an older process, as in earlier graphics processing unit (GPU) on the processor itself.
    [Show full text]
  • A 65 Nm 2-Billion Transistor Quad-Core Itanium Processor
    18 IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 44, NO. 1, JANUARY 2009 A 65 nm 2-Billion Transistor Quad-Core Itanium Processor Blaine Stackhouse, Sal Bhimji, Chris Bostak, Dave Bradley, Brian Cherkauer, Jayen Desai, Erin Francom, Mike Gowan, Paul Gronowski, Dan Krueger, Charles Morganti, and Steve Troyer Abstract—This paper describes an Itanium processor imple- mented in 65 nm process with 8 layers of Cu interconnect. The 21.5 mm by 32.5 mm die has 2.05B transistors. The processor has four dual-threaded cores, 30 MB of cache, and a system interface that operates at 2.4 GHz at 105 C. High speed serial interconnects allow for peak processor-to-processor bandwidth of 96 GB/s and peak memory bandwidth of 34 GB/s. Index Terms—65-nm process technology, circuit design, clock distribution, computer architecture, microprocessor, on-die cache, voltage domains. I. OVERVIEW Fig. 1. Die photo. HE next generation in the Intel Itanium processor family T code named Tukwila is described. The 21.5 mm by 32.5 mm die contains 2.05 billion transistors, making it the first two billion transistor microprocessor ever reported. Tukwila combines four ported Itanium cores with a new system interface and high speed serial interconnects to deliver greater than 2X performance relative to the Montecito and Montvale family of processors [1], [2]. Tukwila is manufactured in a 65 nm process with 8 layers of copper interconnect as shown in the die photo in Fig. 1. The Tukwila die is enclosed in a 66 mm 66 mm FR4 laminate package with 1248 total landed pins as shown in Fig.
    [Show full text]