Intel's Haswell CPU Microarchitecture
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
AMD Athlon™ Processor X86 Code Optimization Guide
AMD AthlonTM Processor x86 Code Optimization Guide © 2000 Advanced Micro Devices, Inc. All rights reserved. The contents of this document are provided in connection with Advanced Micro Devices, Inc. (“AMD”) products. AMD makes no representations or warranties with respect to the accuracy or completeness of the contents of this publication and reserves the right to make changes to specifications and product descriptions at any time without notice. No license, whether express, implied, arising by estoppel or otherwise, to any intellectual property rights is granted by this publication. Except as set forth in AMD’s Standard Terms and Conditions of Sale, AMD assumes no liability whatsoever, and disclaims any express or implied warranty, relating to its products including, but not limited to, the implied warranty of merchantability, fitness for a particular purpose, or infringement of any intellectual property right. AMD’s products are not designed, intended, authorized or warranted for use as components in systems intended for surgical implant into the body, or in other applications intended to support or sustain life, or in any other applica- tion in which the failure of AMD’s product could create a situation where per- sonal injury, death, or severe property or environmental damage may occur. AMD reserves the right to discontinue or make changes to its products at any time without notice. Trademarks AMD, the AMD logo, AMD Athlon, K6, 3DNow!, and combinations thereof, AMD-751, K86, and Super7 are trademarks, and AMD-K6 is a registered trademark of Advanced Micro Devices, Inc. Microsoft, Windows, and Windows NT are registered trademarks of Microsoft Corporation. -
Intel® Architecture Instruction Set Extensions and Future Features Programming Reference
Intel® Architecture Instruction Set Extensions and Future Features Programming Reference 319433-037 MAY 2019 Intel technologies features and benefits depend on system configuration and may require enabled hardware, software, or service activation. Learn more at intel.com, or from the OEM or retailer. No computer system can be absolutely secure. Intel does not assume any liability for lost or stolen data or systems or any damages resulting from such losses. You may not use or facilitate the use of this document in connection with any infringement or other legal analysis concerning Intel products described herein. You agree to grant Intel a non-exclusive, royalty-free license to any patent claim thereafter drafted which includes subject matter disclosed herein. No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this document. The products described may contain design defects or errors known as errata which may cause the product to deviate from published specifica- tions. Current characterized errata are available on request. This document contains information on products, services and/or processes in development. All information provided here is subject to change without notice. Intel does not guarantee the availability of these interfaces in any future product. Contact your Intel representative to obtain the latest Intel product specifications and roadmaps. Copies of documents which have an order number and are referenced in this document, or other Intel literature, may be obtained by calling 1- 800-548-4725, or by visiting http://www.intel.com/design/literature.htm. Intel, the Intel logo, Intel Deep Learning Boost, Intel DL Boost, Intel Atom, Intel Core, Intel SpeedStep, MMX, Pentium, VTune, and Xeon are trademarks of Intel Corporation in the U.S. -
Inside Intel® Core™ Microarchitecture Setting New Standards for Energy-Efficient Performance
White Paper Inside Intel® Core™ Microarchitecture Setting New Standards for Energy-Efficient Performance Ofri Wechsler Intel Fellow, Mobility Group Director, Mobility Microprocessor Architecture Intel Corporation White Paper Inside Intel®Core™ Microarchitecture Introduction Introduction 2 The Intel® Core™ microarchitecture is a new foundation for Intel®Core™ Microarchitecture Design Goals 3 Intel® architecture-based desktop, mobile, and mainstream server multi-core processors. This state-of-the-art multi-core optimized Delivering Energy-Efficient Performance 4 and power-efficient microarchitecture is designed to deliver Intel®Core™ Microarchitecture Innovations 5 increased performance and performance-per-watt—thus increasing Intel® Wide Dynamic Execution 6 overall energy efficiency. This new microarchitecture extends the energy efficient philosophy first delivered in Intel's mobile Intel® Intelligent Power Capability 8 microarchitecture found in the Intel® Pentium® M processor, and Intel® Advanced Smart Cache 8 greatly enhances it with many new and leading edge microar- Intel® Smart Memory Access 9 chitectural innovations as well as existing Intel NetBurst® microarchitecture features. What’s more, it incorporates many Intel® Advanced Digital Media Boost 10 new and significant innovations designed to optimize the Intel®Core™ Microarchitecture and Software 11 power, performance, and scalability of multi-core processors. Summary 12 The Intel Core microarchitecture shows Intel’s continued Learn More 12 innovation by delivering both greater energy efficiency Author Biographies 12 and compute capability required for the new workloads and usage models now making their way across computing. With its higher performance and low power, the new Intel Core microarchitecture will be the basis for many new solutions and form factors. In the home, these include higher performing, ultra-quiet, sleek and low-power computer designs, and new advances in more sophisticated, user-friendly entertainment systems. -
Microcode Revision Guidance August 31, 2019 MCU Recommendations
microcode revision guidance August 31, 2019 MCU Recommendations Section 1 – Planned microcode updates • Provides details on Intel microcode updates currently planned or available and corresponding to Intel-SA-00233 published June 18, 2019. • Changes from prior revision(s) will be highlighted in yellow. Section 2 – No planned microcode updates • Products for which Intel does not plan to release microcode updates. This includes products previously identified as such. LEGEND: Production Status: • Planned – Intel is planning on releasing a MCU at a future date. • Beta – Intel has released this production signed MCU under NDA for all customers to validate. • Production – Intel has completed all validation and is authorizing customers to use this MCU in a production environment. -
Data Sampling on MDS-Resistant 10Th Generation Intel Core (Ice Lake) 1
Data Sampling on MDS-resistant 10th Generation Intel Core (Ice Lake) Daniel Moghimi Worcester Polytechnic Institute, Worcester, MA, USA Abstract the store buffer. MSBDS has been shown in attacks on the kernel space as well as on SGX [1]. In addition to the leakage Microarchitectural Data Sampling (MDS) is a set of hardware component, MSBDS can also be turned around easily to tran- vulnerabilities in Intel CPUs that allows an attacker to leak siently inject values into a victim, an attack technique known bytes of data from memory loads and stores across various as Load Value Injection (LVI) [14]. security boundaries. On affected CPUs, some of these vulner- abilities were patched via microcode updates. Additionally, To mitigate these vulnerabilities, Intel released software Intel announced that the newest microarchitectures, namely workarounds and microcode updates for affected CPUs [4]. Cascade Lake and Ice Lake, were not affected by MDS. While One of the most recent microarchitectures, Ice Lake, is re- Cascade Lake turned out to be vulnerable to the ZombieLoad ported to be unaffected by MDS attacks [6,8]. Intel has also v2 MDS attack (also known as TAA), Ice Lake was not af- explicitly listed Ice Lake processors as not being vulnera- fected by this attack. ble to LVI-SB [14], which exploits MSBDS for Load Value Injection [7]. In this technical report, we show a variant of MSBDS (CVE- 2018-12126), an MDS attack, also known as Fallout, that To better analyze the MDS vulnerabilities, and potentially works on Ice Lake CPUs. This variant was automatically find new variants, Moghimi et al. -
Sgxometer: Open and Modular Benchmarking for Intel SGX
SGXoMeter: Open and Modular Benchmarking for Intel SGX Mohammad Mahhouk Nico Weichbrodt Rüdiger Kapitza TU Braunschweig, Germany TU Braunschweig, Germany TU Braunschweig, Germany ABSTRACT mobile devices like phones and tablets. Also, personal computers, Intel’s Software Guard Extensions (SGX) are currently the most laptops and servers can be secured using AMD Secure Encrypted wide-spread commodity trusted execution environment, which Virtualisation [22, 23], and Intel SGX [24, 27]. provides integrity and confidentiality of sensitive code and data. Intel SGX promises with its isolated memory regions, so called Thereby, it offers protection even against privileged attackers and enclaves, both confidentiality and integrity protection of the sen- various forms of physical attacks. As a technology that only be- sitive data and code running inside them against malicious and came available in late 2015, it has received massive interest and privileged software. It also provides local and remote attestation undergone a rapid evolution. Despite first ad-hoc attempts, there is mechanisms [21] to ensure the authenticity and integrity of the so far no standardised approach to benchmark the SGX hardware, running enclaves, adding protection against forging attempts. Thus, its associated environment, and techniques that were designed to the utilisation of SGX in cloud services can considerably reduce harden SGX-based applications. the customers’ reluctance of using them. Furthermore, Intel has re- In this paper, we present SGXoMeter, an open and modular leased a Software Development Kit (SDK)[18] to ease programming framework designed to benchmark different SGX-aware CPUs, with SGX. It introduces wrappers for low-level instructions and `code revisions, SDK versions and extensions to mitigate side- provides a high-level interface that offers multiple functionalities, channel attacks. -
POWER-AWARE MICROARCHITECTURE: Design and Modeling Challenges for Next-Generation Microprocessors
POWER-AWARE MICROARCHITECTURE: Design and Modeling Challenges for Next-Generation Microprocessors THE ABILITY TO ESTIMATE POWER CONSUMPTION DURING EARLY-STAGE DEFINITION AND TRADE-OFF STUDIES IS A KEY NEW METHODOLOGY ENHANCEMENT. OPPORTUNITIES FOR SAVING POWER CAN BE EXPOSED VIA MICROARCHITECTURE-LEVEL MODELING, PARTICULARLY THROUGH CLOCK- GATING AND DYNAMIC ADAPTATION. Power dissipation limits have Thus far, most of the work done in the area David M. Brooks emerged as a major constraint in the design of high-level power estimation has been focused of microprocessors. At the low end of the per- at the register-transfer-level (RTL) description Pradip Bose formance spectrum, namely in the world of in the processor design flow. Only recently have handheld and portable devices or systems, we seen a surge of interest in estimating power Stanley E. Schuster power has always dominated over perfor- at the microarchitecture definition stage, and mance (execution time) as the primary design specific work on power-efficient microarchi- Hans Jacobson issue. Battery life and system cost constraints tecture design has been reported.2-8 drive the design team to consider power over Here, we describe the approach of using Prabhakar N. Kudva performance in such a scenario. energy-enabled performance simulators in Increasingly, however, power is also a key early design. We examine some of the emerg- Alper Buyuktosunoglu design issue in the workstation and server mar- ing paradigms in processor design and com- kets (see Gowan et al.)1 In this high-end arena ment on their inherent power-performance John-David Wellman the increasing microarchitectural complexities, characteristics. clock frequencies, and die sizes push the chip- Victor Zyuban level—and hence the system-level—power Power-performance efficiency consumption to such levels that traditionally See the “Power-performance fundamentals” Manish Gupta air-cooled multiprocessor server boxes may box. -
A Superscalar Out-Of-Order X86 Soft Processor for FPGA
A Superscalar Out-of-Order x86 Soft Processor for FPGA Henry Wong University of Toronto, Intel [email protected] June 5, 2019 Stanford University EE380 1 Hi! ● CPU architect, Intel Hillsboro ● Ph.D., University of Toronto ● Today: x86 OoO processor for FPGA (Ph.D. work) – Motivation – High-level design and results – Microarchitecture details and some circuits 2 FPGA: Field-Programmable Gate Array ● Is a digital circuit (logic gates and wires) ● Is field-programmable (at power-on, not in the fab) ● Pre-fab everything you’ll ever need – 20x area, 20x delay cost – Circuit building blocks are somewhat bigger than logic gates 6-LUT6-LUT 6-LUT6-LUT 3 6-LUT 6-LUT FPGA: Field-Programmable Gate Array ● Is a digital circuit (logic gates and wires) ● Is field-programmable (at power-on, not in the fab) ● Pre-fab everything you’ll ever need – 20x area, 20x delay cost – Circuit building blocks are somewhat bigger than logic gates 6-LUT 6-LUT 6-LUT 6-LUT 4 6-LUT 6-LUT FPGA Soft Processors ● FPGA systems often have software components – Often running on a soft processor ● Need more performance? – Parallel code and hardware accelerators need effort – Less effort if soft processors got faster 5 FPGA Soft Processors ● FPGA systems often have software components – Often running on a soft processor ● Need more performance? – Parallel code and hardware accelerators need effort – Less effort if soft processors got faster 6 FPGA Soft Processors ● FPGA systems often have software components – Often running on a soft processor ● Need more performance? – Parallel -
8Th Generation Intel® Core™ Processors Product Brief
PRODUCT BRIEF 8TH GENERATION INTEL® CORE™ PROCESSORS Mobile U-Series: Peak Performance on the Go The new 8th generation Intel® Core™ Processor U-series—for sleek notebooks and 2 in 1s elevates your computing experience with an astounding 40 percent leap in productivity performance over 7th Gen PCs,1 brilliant 4K Ultra HD entertainment, and easier, more convenient ways to interact with your PC. With 10 hour battery life2 and robust I/O support, Intel’s first quad-core U-series processors enable portable, powerhouse thin and light PCs, so you can accomplish more on the go. Extraordinary Performance and Responsiveness With Intel’s latest power-efficient microarchitecture, advanced process technology, and silicon optimizations, the 8th generation Intel Core processor (U-series) is Intel’s fastest 15W processor3 with up to 40 percent greater productivity than 7th Gen processors and 2X more productivity vs. comparable 5-year-old processors.4 • Get fast and responsive web browsing with Intel® Speed Shift Technology. • Intel® Turbo Boost Technology 2.0 lets you work more productively by dynamically controlling power and speed—across cores and graphics— boosting performance precisely when it is needed, while saving energy when it counts. • With up to four cores, 8th generation Intel Core Processor U-series with Intel® Hyper-Threading Technology (Intel® HT Technology) supports up to eight threads, making every day content creation a compelling experience on 2 in 1s and ultra-thin clamshells. • For those on the go, PCs enabled with Microsoft Windows* Modern Standby wake instantly at the push of a button, so you don’t have to wait for your system to start up. -
6Th Gen Intel® Core™ Desktop Processor Product Brief
Product Brief 6th Gen Intel® Core™ Processors for Desktops: S-series LOOKING FOR AN AMAZING PROCESSOR for your next desktop PC? Look no further than 6th Gen Intel® Core™ processors. With amazing performance and stunning visuals, PCs powered by 6th Gen Intel Core processors will help take things to the next level and transform how you use a PC. The performance of 6th Gen Intel Core processors enable great user experiences today and in the future, including no passwords and more natural user interfaces. When paired with Intel® RealSense™ technology and Windows* 10, 6th Gen Intel Core processors can help remove the hassle of remembering and typing in passwords. Product Brief 6th Gen Intel® Core™ Processors for Desktops: S-series THAT RESPONSIVE PERFORMANCE EXTRA New architecture and design in 6th Gen Intel Core processors for desktops bring: BURST OF Support for DDR4 RAM memory technology in mainstream platforms, allowing systems to have up to 64GB of memory and higher transfer PERFORMANCE speeds at lower power when compared to DDR3 (DDR4 speed 2133 MT/s at 1.2V vs DDR3 speed 1600 MT/s at 1.5). 6th Gen Intel Core i7 and Core i5 processors come with Intel® Turbo boost 2.0 Technology which gives you that extra burst of performance for those jobs that require a bit more frequency.1 Intel ® Hyper-Threading Technology1 allows each processor core to work on two tasks at the same time, improving multitasking, speeding up the workfl ow, and accomplishing more in less time. With the Intel Core i7 processor you can have up to 8 threads running at the same time. -
Validating the Intel® Pentium® 4 Microprocessor
16.1 9DOLGDWLQJ WKH ,QWHO 3HQWLXP 0LFURSURFHVVRU %RE%HQWOH\ ,QWHO&RUSRUDWLRQ 1(WK$YHQXH +LOOVERUR2UHJRQ86$ EREEHQWOH\#LQWHOFRP ABSTRACT 2. OVERVIEW Developing a new leading edge IA-32 microprocessor is an The Pentium® 4 processor is Intel’s most advanced IA-32 immensely complicated undertaking. In the case of the Pentium® microprocessor, incorporating a host of new microarchitectural 4 processor, the microarchitecture is significantly more complex features including a 400-MHz system bus, hyper-pipelined than any previous IA-32 microprocessor and the implementation technology, advanced dynamic execution, rapid execution engine, borrows almost nothing from any previous implementation. This advanced transfer cache, execution trace cache, and Streaming paper describes how we went about the task of finding bugs in the SIMD (Single Instruction, Multiple Data) Extensions 2 (SSE2). Pentium® 4 processor design prior to initial silicon, and what we For the most part, we applied similar tools and methodologies to found along the way. validating the Pentium® 4 processor that we had used previously General Terms on the Pentium® Pro processor. However, we developed new Management, Verification. methodologies and tools in response to lessons learnt from previous projects, and to address some of the new challenges that the Pentium® 4 processor design presented from a validation 1. INTRODUCTION perspective. In particular, the use of Formal Verification, Cluster Validation case studies are relatively rare in the literature of Test Environments and focused Power Reduction Validation were computer architecture and design ([1] and [2] contain lists of either new or greatly extended from previous projects; each of some recent papers) and case studies of commercial these is discussed in more detail in a section of this paper microprocessors are even rarer. -
The Microarchitecture of the Pentium 4 Processor
The Microarchitecture of the Pentium 4 Processor Glenn Hinton, Desktop Platforms Group, Intel Corp. Dave Sager, Desktop Platforms Group, Intel Corp. Mike Upton, Desktop Platforms Group, Intel Corp. Darrell Boggs, Desktop Platforms Group, Intel Corp. Doug Carmean, Desktop Platforms Group, Intel Corp. Alan Kyker, Desktop Platforms Group, Intel Corp. Patrice Roussel, Desktop Platforms Group, Intel Corp. Index words: Pentium® 4 processor, NetBurst™ microarchitecture, Trace Cache, double-pumped ALU, deep pipelining provides an in-depth examination of the features and ABSTRACT functions of the Intel NetBurst microarchitecture. This paper describes the Intel® NetBurst™ ® The Pentium 4 processor is designed to deliver microarchitecture of Intel’s new flagship Pentium 4 performance across applications where end users can truly processor. This microarchitecture is the basis of a new appreciate and experience its performance. For example, family of processors from Intel starting with the Pentium it allows a much better user experience in areas such as 4 processor. The Pentium 4 processor provides a Internet audio and streaming video, image processing, substantial performance gain for many key application video content creation, speech recognition, 3D areas where the end user can truly appreciate the applications and games, multi-media, and multi-tasking difference. user environments. The Pentium 4 processor enables real- In this paper we describe the main features and functions time MPEG2 video encoding and near real-time MPEG4 of the NetBurst microarchitecture. We present the front- encoding, allowing efficient video editing and video end of the machine, including its new form of instruction conferencing. It delivers world-class performance on 3D cache called the Execution Trace Cache.