Multicore Processors
Multicore processors
Dezső, Sima
Copyright © 2013 Typotex Kiadó

Abstract

One of the characteristic trends in computing is the rapid development of processors and processor architectures. Starting in 2005-2006, multicore processors gained ground completely and have become the typical processor type of today's notebooks, desktops, and servers. In this processor category Intel plays the leading role with a worldwide market share of about 80%, while AMD is the second-ranked player in the field. In view of this, the course material focuses on the basic architectures of Intel and AMD, presenting the evolution of the instruction sets, microarchitectures, power dissipation management techniques, and system architectures of successive processor families, as well as the range of processors released. The primary aim of the material is to highlight the progress made in the field of multicore processors and to familiarize students with it.

Gacsal József, Intel Hungary, Business Development Director

Contents

Part I. Introduction
  1. Introduction
    1. Foreword
    2. The mobile boom and its consequences to computer architectures
    3. Consequences of the low power requirement of mobile devices for Intel and AMD
    4. Foreseeable market situation
    5. Intel’s response to the mobile challenge
    6. Evolution of Intel’s basic architectures [2]
    7. AMD’s response to the mobile challenge
    8. Evolution of AMD’s basic architectures
    9. Overview of Intel’s and AMD’s actual processor lines
    10. Scope of these slides
    11. Reasons for this decision
  2. References
Part II. Intel’s Core 2-based processor lines
  1. Introduction
    1. The evolution of Intel’s basic microarchitectures
    2. Intel’s Tick-Tock model
    3. Basic architectures and their related shrinks
  2. The Core 2 line
    1. 2.1 Introduction
    2. 2.2 Major innovations of the Core 2 line
      2.1. 2.2.1 Wide execution
        2.1.1. 4-wide core
        2.1.2. Enhanced execution resources
        2.1.3. Performance leadership changes between Intel and AMD
        2.1.4. Example 1: DP web-server performance comparison (2003)
        2.1.5. Example 2: Summary assessment of extensive benchmark tests contrasting dual Opterons vs dual Xeons (2003) [7]
        2.1.6. Example: DP web-server performance comparison (2006)
      2.2. 2.2.2 Smart L2 cache
        2.2.1. Shared L2 cache
        2.2.2. Benefits of shared caches
        2.2.3. Drawbacks of shared caches
      2.3. 2.2.3 Smart memory accesses
        2.3.1. Hardware prefetchers [9]
        2.3.2. Intensive use of hardware prefetchers [11]
        2.3.3. Hardware prefetchers within the Core 2 microarchitecture
      2.4. 2.2.4 Enhanced digital media support
        2.4.1. Widening the width of FP/SSE Execution units from 64-bit to 128-bit
        2.4.2. Overview of the x86 ISA extensions in Intel’s processor lines
        2.4.3. Achieved performance boost in Core 2 for gaming apps
      2.5. 2.2.5 Intelligent Power management
        2.5.1. Ultra fine grained power control
        2.5.2. Platform Thermal Control
        2.5.3. Possible solution for the Platform Thermal Control Manager [88]
    3. 2.3 Overview of Core 2 based processor lines
  3. The Penryn line
    1. 3.1 Introduction
      1.1. Penryn
    2. 3.2 Key enhancements of Penryn line
      2.1. 3.2.2 More advanced power management
        2.1.1. Deep Power Down technology (DPD)
        2.1.2. Enhanced Dynamic Acceleration Technology (EDAT) (for mobiles)
        2.1.3. Overall performance achievements with Penryn (1)
    3. 3.3 Overview of Penryn based processor lines
  4. The Nehalem line
    1. 4.1 Introduction to the 1st generation Nehalem line (Bloomfield)
      1.1. Die shot of the 1st generation Nehalem desktop processor (Bloomfield) [45]
    2. 4.2 Major innovations of the 1st generation Nehalem line [54]
      2.1. 4.2.1 Simultaneous Multithreading (SMT)
        2.1.1. Performance gains of SMT
      2.2. 4.2.2 New cache architecture
        2.2.1. Distinguished features of Nehalem’s cache architecture
      2.3. 4.2.3 Integrated memory controller
        2.3.1. Main features
        2.3.2. Benefit of integrated memory controllers
        2.3.3. Drawback of integrated memory controllers
        2.3.4. Non Uniform Memory Access (NUMA)
        2.3.5. Memory latency comparison: Nehalem vs Penryn
      2.4. 4.2.4 QuickPath Interconnect bus (QPI)
        2.4.1. Signals of the QuickPath Interconnect bus (QPI bus)
        2.4.2. QuickPath Interconnect bus (QPI)
        2.4.3. QPI based DP and MP server system architectures
        2.4.4. Comparison of the transfer rates of the QPI, FSB and HT buses
        2.4.5. The notion of “Uncore”
      2.5. 4.2.5 Enhanced power management
        2.5.1. Nehalem’s Turbo Mode
        2.5.2. ACPI states [26]
      2.6. 4.2.6 New socket
    3. 4.3 Major