Core Specialization for AVX-512 Using Fault-And-Migrate

Total Page:16

File Type:pdf, Size:1020Kb

Core Specialization for AVX-512 Using Fault-And-Migrate Core Specialization for AVX-512 Using Fault-and-Migrate Masterarbeit von cand. inform. Peter Brantsch an der Fakultät für Informatik Erstgutachter: Prof. Dr. Frank Bellosa Zweitgutachter: Prof. Dr. Wolfgang Karl Betreuender Mitarbeiter: Mathias Gottschlag, M.Sc. Bearbeitungszeit: 07. Januar 2019 – 08. Juli 2019 KIT – Die Forschungsuniversität in der Helmholtz-Gemeinscha www.kit.edu Ich versichere wahrheitsgemäß, die Arbeit selbstständig verfasst, alle benutzten Hilfsmittel vollständig und genau angegeben und alles kenntlich gemacht zu haben, was aus Arbeiten anderer unverändert oder mit Abänderungen entnommen wurde sowie die Satzung des KIT zur Sicherung guter wissenschaftlicher Praxis in der jeweils gültigen Fassung beachtet zu haben. Karlsruhe, den 08. Juli 2019 Abstract The Advanced Vector Extensions 512 (AVX-512) are modern Single Instruction Multiple Data (SIMD) extensions to the x86 instruction set using 512-bit wide registers, enabling substantial acceleration of numeric workloads, for example processing eight sets of 64- bit operands in parallel. Because of the high power consumption of the corresponding functional units, a CPU core executing AVX-512 instructions has to temporarily reduce its clock frequency to maintain thermal and electrical limits. This clock frequency reduction can slow down the scalar part of mixed workloads because it persists substantially beyond the last AVX-512 instruction. To mitigate this performance impediment, core specialization can be used, which is the preferred use of certain cores for specific kinds of computation. By running AVX-512 and scalar code on disjoint sets of CPU cores, throttling of cores executing scalar code can be avoided. The Operating Systems Group at the Karlsruhe Institute of Technology has already demonstrated that core specialization can be effectively employed against the aforemen- tioned performance reduction by implementing it in Linux. A new system call is introduced to mark the beginning and end of AVX-512 phases of a task, such that the scheduler can mi- grate it to a specialized core. However, the existing implementation is neither transparent nor automatic, but instead requires the application to be modified. This thesis presents an extension of the existing core specialization implementation, making it transparent and automatic by efficiently virtualizing AVX-512 to intercept the instructions and subsequently trigger migration. Our extension determines the necessary number of AVX-512 cores at runtime based on CPU time consumed. Because there is no trivial way of detecting the end of an AVX-512 phase, we compare different heuristics for re-migration. We evaluate our prototype in a web server scenario with nginx and OpenSSL using ChaCha20-Poly1305 encryption and brotli compression, using AVX-512 to accelerate the combination of cipher and message authentication code. The benchmarks show that the performance degradation caused by AVX-512-induced frequency reductions can be almost completely mitigated, without having to modify the application. v Contents Abstractv Contents1 1 Introduction3 2 Background and Related Work5 2.1 AVX-512 . .5 2.2 Processor Clock Frequency Behavior . .6 2.3 Research in Mixed AVX-512 Workloads . .7 2.4 Core Specialization . .8 2.5 Scheduling for Heterogeneous Systems . .9 2.6 Staged Computation and Cohort Scheduling . 10 2.7 Reconfigurable Systems . 11 3 Analysis 13 3.1 Core Specialization . 13 3.2 Fault and Migrate . 14 4 Design and Implementation 15 4.1 Fault-and-Migrate . 15 4.1.1 Making AVX-512 Trap . 15 4.1.2 Triggering Migration . 17 4.1.3 Handling AVX-512 in The Kernel . 17 4.2 Core Specialization . 17 4.3 Determining the Number of AVX-512 Cores . 18 4.4 Re-Migration Heuristics . 19 4.5 Orthogonality of Approaches . 20 4.6 Debugging and Configuration . 20 5 Evaluation 23 5.1 Setup and Methods . 24 5.1.1 Patching CPU Feature Detection . 25 5.1.2 Repeatability . 25 5.1.3 Performance Counters . 25 5.1.4 The “Perf” Tools . 26 5.2 Mitigation of Frequency Reduction Effects . 26 5.2.1 Re-Migration Heuristics . 27 1 Contents 5.2.2 Selection of AVX-512 Cores . 28 5.2.3 Influence of the α Parameter . 31 5.2.4 Isolation of Throttling . 31 5.3 Theoretical Limit of Speedup . 32 5.4 Remaining Overhead . 34 5.4.1 Trap Handling . 34 5.4.2 Fault-and-Migrate . 34 5.5 Latency . 38 5.6 The Multi-Queue Skiplist Scheduler . 39 5.7 Discussion . 40 6 Conclusion 41 6.1 Future Work . 42 Bibliography 45 Glossary 49 2 1 Introduction Until around 2007, semiconductor scaling worked approximately like Dennard et al. [7] described it in 1974, with each new technology generation bringing smaller, faster tran- sistors at a constant power density. Then, Dennard scaling ended [2], as leakage currents through the ever thinner gate oxide increased, becoming a noticeable part of chip power consumption. The power density could no longer be kept constant and clock frequencies stopped increasing, though transistor miniaturization continued, and the number of tran- sistors per chip increased. To still increase the computing power of chips, even though single-thread performance scaling is hampered by stagnating clock frequencies, contem- porary designs employ parallelism of both code1 and data2. Because of the high power density and parallelism of processors, an increasing share of die area is under-utilized, either because running all of the chip at its maximum clock rate would exceed thermal or electrical limits, or because programs do not efficiently exploit the processor’s parallelism [8]. This share is called dark silicon. A recent example are the Advanced Vector Extensions 512 (AVX-512) [21, 22], because their functional units can raise the power consumption of a CPU core up to the point that it has to reduce its clock frequency, depending on the type3 and rate of Advanced Vec- tor Extensions 2 (AVX 2) and AVX-512 instructions, and the number of similarly loaded cores. However, these SIMD additions to the x86 instruction set operating on 512-bit wide registers can significantly accelerate computation such as cryptography (see the mi- crobenchmark in Section 5.3), offsetting the otherwise detrimental effect of the frequency reduction on performance. If this frequency reduction only affected the SIMD instructions, it would not pose a prob- lem, but before the core can return to higher clock frequencies, a timer of approximately 2ms first has to expire without any events again necessitating frequency changes [18, sec- tion 17.26]. Mixed workloads, in which only small parts of computation are accelerated, can therefore suffer an overall slowdown, because the reduced clock frequency persists beyond the last AVX-512 instruction, affecting the following scalar instructions. This slow- down was demonstrated by Krasnov [28] in a web server using AVX-512-accelerated cryp- tography functions. Chapter 2 will describe the frequency behavior of Intel CPUs in regard to AVX-512 in more detail. To avoid any possible frequency reduction because of wide SIMD instructions, vector- ization could simply be disabled, though only at the cost of all its advantages. No general decision whether to enable or disable vectorization for all mixed applications can be made, though, because the net benefit for even a single application depends on the scenario. 1Multiple Instruction streams, Multiple Data streams (MIMD), i.e. multi-core systems 2SIMD 3floating point or integer, multiplication or simpler operation 3 1 Introduction For the same application, vectorization can be advantageous in some circumstances, and detrimental in others. Instead, a method of maintaining predictable system performance and still reaping the benefits of vectorization is needed. Ideally, to save developers from laboriously instrumenting their applications, and users from unwelcome surprises in the form of library updates bringing performance regressions, such a method would be trans- parent and automatic. Gottschlag and Bellosa [13, 14] have shown that core specialization can successfully be used to mitigate the performance impediments faced by mixed workloads by restricting the execution of AVX-512 code to certain cores, such that only these cores are affected by frequency reductions. This thesis builds upon their previous work, and makes the following contributions: Eicient Virtualization of AVX-512 The CPU is configured to fault on AVX-512 instruc- tions, hence the name “fault-and-migrate”, such that we can intercept these instruc- tions and accordingly migrate the task. This is equivalent to efficient virtualization as per the criteria of Popek and Goldberg [36]. Transparency The existing implementation introduces a new system call to inform the scheduler about the beginning and end of AVX-512 use, requiring modification of the application code. Using the efficient virtualization described above, the scheduler can detect the beginning of AVX-512 use transparently. Automation To require the least amount of tuning by the user and easily adapt to a variety of mixed workloads, the system determines the number of AVX-512 cores automatically. Evaluation using a web server scenario derived from that of Krasnov [28] shows that core specialization using fault-and-migrate can mitigate the performance reduction of mixed workloads caused by AVX-512-induced frequency reductions. Our mechanism for automat- ically determining the number of AVX-512 cores is not yet optimal, tendentially choosing too many cores and causing greatly variable performance. However, using a static set of cores for AVX-512, automatic migration and re-migration are working completely satisfac- torily. The web server scenario, originally suffering from an 11.6% performance reduction with AVX-512, reaches 99% of its non-AVX-512 throughput using our prototype. The remainder of this thesis is structured as follows: Chapter 2 gives detailed back- ground information and an overview of related work, Chapter 3 analyses the problem and prerequisites and discusses possible approaches, followed by Chapter 4, which presents a solution and explains the design and implementation decisions. A comprehensive evalua- tion is performed in Chapter 5.
Recommended publications
  • Intel® Architecture Instruction Set Extensions and Future Features Programming Reference
    Intel® Architecture Instruction Set Extensions and Future Features Programming Reference 319433-037 MAY 2019 Intel technologies features and benefits depend on system configuration and may require enabled hardware, software, or service activation. Learn more at intel.com, or from the OEM or retailer. No computer system can be absolutely secure. Intel does not assume any liability for lost or stolen data or systems or any damages resulting from such losses. You may not use or facilitate the use of this document in connection with any infringement or other legal analysis concerning Intel products described herein. You agree to grant Intel a non-exclusive, royalty-free license to any patent claim thereafter drafted which includes subject matter disclosed herein. No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this document. The products described may contain design defects or errors known as errata which may cause the product to deviate from published specifica- tions. Current characterized errata are available on request. This document contains information on products, services and/or processes in development. All information provided here is subject to change without notice. Intel does not guarantee the availability of these interfaces in any future product. Contact your Intel representative to obtain the latest Intel product specifications and roadmaps. Copies of documents which have an order number and are referenced in this document, or other Intel literature, may be obtained by calling 1- 800-548-4725, or by visiting http://www.intel.com/design/literature.htm. Intel, the Intel logo, Intel Deep Learning Boost, Intel DL Boost, Intel Atom, Intel Core, Intel SpeedStep, MMX, Pentium, VTune, and Xeon are trademarks of Intel Corporation in the U.S.
    [Show full text]
  • Microcode Revision Guidance August 31, 2019 MCU Recommendations
    microcode revision guidance August 31, 2019 MCU Recommendations Section 1 – Planned microcode updates • Provides details on Intel microcode updates currently planned or available and corresponding to Intel-SA-00233 published June 18, 2019. • Changes from prior revision(s) will be highlighted in yellow. Section 2 – No planned microcode updates • Products for which Intel does not plan to release microcode updates. This includes products previously identified as such. LEGEND: Production Status: • Planned – Intel is planning on releasing a MCU at a future date. • Beta – Intel has released this production signed MCU under NDA for all customers to validate. • Production – Intel has completed all validation and is authorizing customers to use this MCU in a production environment.
    [Show full text]
  • Vxworks Architecture Supplement, 6.2
    VxWorks Architecture Supplement VxWorks® ARCHITECTURE SUPPLEMENT 6.2 Copyright © 2005 Wind River Systems, Inc. All rights reserved. No part of this publication may be reproduced or transmitted in any form or by any means without the prior written permission of Wind River Systems, Inc. Wind River, the Wind River logo, Tornado, and VxWorks are registered trademarks of Wind River Systems, Inc. Any third-party trademarks referenced are the property of their respective owners. For further information regarding Wind River trademarks, please see: http://www.windriver.com/company/terms/trademark.html This product may include software licensed to Wind River by third parties. Relevant notices (if any) are provided in your product installation at the following location: installDir/product_name/3rd_party_licensor_notice.pdf. Wind River may refer to third-party documentation by listing publications or providing links to third-party Web sites for informational purposes. Wind River accepts no responsibility for the information provided in such third-party documentation. Corporate Headquarters Wind River Systems, Inc. 500 Wind River Way Alameda, CA 94501-1153 U.S.A. toll free (U.S.): (800) 545-WIND telephone: (510) 748-4100 facsimile: (510) 749-2010 For additional contact information, please visit the Wind River URL: http://www.windriver.com For information on how to contact Customer Support, please visit the following URL: http://www.windriver.com/support VxWorks Architecture Supplement, 6.2 11 Oct 05 Part #: DOC-15660-ND-00 Contents 1 Introduction
    [Show full text]
  • CPUID Handling for Guests
    CPUID handling for guests Andrew Cooper Citrix XenServer Friday 26th August 2016 Andrew Cooper (Citrix XenServer) CPUID handling for guests Friday 26th August 2016 1 / 11 Return information about the processor I Identifying information (GenuineIntel, AuthenticAMD, etc) I Feature information (available instructions, MSRs, etc) I Topology information (sockets, cores, threads, caches, TLBs, etc) Unpriveleged I Useable by userspace I Doesn't trap to supervisor mode The CPUID instruction Introduced in 1993 I Takes input parameters in %eax and %ecx I Returns values in %eax, %ebx, %ecx and %edx Andrew Cooper (Citrix XenServer) CPUID handling for guests Friday 26th August 2016 2 / 11 Unpriveleged I Useable by userspace I Doesn't trap to supervisor mode The CPUID instruction Introduced in 1993 I Takes input parameters in %eax and %ecx I Returns values in %eax, %ebx, %ecx and %edx Return information about the processor I Identifying information (GenuineIntel, AuthenticAMD, etc) I Feature information (available instructions, MSRs, etc) I Topology information (sockets, cores, threads, caches, TLBs, etc) Andrew Cooper (Citrix XenServer) CPUID handling for guests Friday 26th August 2016 2 / 11 The CPUID instruction Introduced in 1993 I Takes input parameters in %eax and %ecx I Returns values in %eax, %ebx, %ecx and %edx Return information about the processor I Identifying information (GenuineIntel, AuthenticAMD, etc) I Feature information (available instructions, MSRs, etc) I Topology information (sockets, cores, threads, caches, TLBs, etc) Unpriveleged I Useable by userspace I Doesn't trap to supervisor mode Andrew Cooper (Citrix XenServer) CPUID handling for guests Friday 26th August 2016 2 / 11 Some kernels binary patch themselves (e.g.
    [Show full text]
  • SIMD Extensions
    SIMD Extensions PDF generated using the open source mwlib toolkit. See http://code.pediapress.com/ for more information. PDF generated at: Sat, 12 May 2012 17:14:46 UTC Contents Articles SIMD 1 MMX (instruction set) 6 3DNow! 8 Streaming SIMD Extensions 12 SSE2 16 SSE3 18 SSSE3 20 SSE4 22 SSE5 26 Advanced Vector Extensions 28 CVT16 instruction set 31 XOP instruction set 31 References Article Sources and Contributors 33 Image Sources, Licenses and Contributors 34 Article Licenses License 35 SIMD 1 SIMD Single instruction Multiple instruction Single data SISD MISD Multiple data SIMD MIMD Single instruction, multiple data (SIMD), is a class of parallel computers in Flynn's taxonomy. It describes computers with multiple processing elements that perform the same operation on multiple data simultaneously. Thus, such machines exploit data level parallelism. History The first use of SIMD instructions was in vector supercomputers of the early 1970s such as the CDC Star-100 and the Texas Instruments ASC, which could operate on a vector of data with a single instruction. Vector processing was especially popularized by Cray in the 1970s and 1980s. Vector-processing architectures are now considered separate from SIMD machines, based on the fact that vector machines processed the vectors one word at a time through pipelined processors (though still based on a single instruction), whereas modern SIMD machines process all elements of the vector simultaneously.[1] The first era of modern SIMD machines was characterized by massively parallel processing-style supercomputers such as the Thinking Machines CM-1 and CM-2. These machines had many limited-functionality processors that would work in parallel.
    [Show full text]
  • Multiprocessing Contents
    Multiprocessing Contents 1 Multiprocessing 1 1.1 Pre-history .............................................. 1 1.2 Key topics ............................................... 1 1.2.1 Processor symmetry ...................................... 1 1.2.2 Instruction and data streams ................................. 1 1.2.3 Processor coupling ...................................... 2 1.2.4 Multiprocessor Communication Architecture ......................... 2 1.3 Flynn’s taxonomy ........................................... 2 1.3.1 SISD multiprocessing ..................................... 2 1.3.2 SIMD multiprocessing .................................... 2 1.3.3 MISD multiprocessing .................................... 3 1.3.4 MIMD multiprocessing .................................... 3 1.4 See also ................................................ 3 1.5 References ............................................... 3 2 Computer multitasking 5 2.1 Multiprogramming .......................................... 5 2.2 Cooperative multitasking ....................................... 6 2.3 Preemptive multitasking ....................................... 6 2.4 Real time ............................................... 7 2.5 Multithreading ............................................ 7 2.6 Memory protection .......................................... 7 2.7 Memory swapping .......................................... 7 2.8 Programming ............................................. 7 2.9 See also ................................................ 8 2.10 References .............................................
    [Show full text]
  • Intermediate Intel X86: Assembly, Architecture, Applications, and Alliteration
    Intermediate Intel x86: Assembly, Architecture, Applications, and Alliteration Xeno Kovah – 2010 xkovah at gmail All materials are licensed under a Creative Commons “Share Alike” license. • http://creativecommons.org/licenses/by-sa/3.0/ 2 Credits Page • Your name here! Just tell me something I didn’t know which I incorporate into the slides! • Veronica Kovah for reviewing slides and suggesting examples & questions to answer • Murad Kahn for Google NaCl stuff 3 Answer me these questions three • What, is your name? • What, is your quest? • What…is your department? 4 Agenda • Part 1 - Segmentation • Part 2 - Paging • Part 3 - Interrupts • Part 4 – Debugging, I/O, Misc fun on a bun 5 Miss Alaineous • Questions: Ask ‘em if you got ‘em – If you fall behind and get lost and try to tough it out until you understand, it’s more likely that you will stay lost, so ask questions ASAP. • Browsing the web and/or checking email during class is a great way to get lost ;) • 2 hours, 10 min break, 2 hours, 1 hour lunch, 1 hour at a time with 5 minute breaks after lunch • Adjusted depending on whether I’m running fast or slow (or whether people are napping 6 after lunch :P) Class Scope • We’re going to only be talking about 32bit architecture (also called IA-32). No 16bit or 64bit. • Also not really going to deal with floating point assembly or registers since you don’t run into them much unless you’re analyzing math/3d code. • This class focuses on diving deeper into architectural features • Some new instructions will be covered, but mostly in support of understanding the architectural topics, only two are learning new instruction’s sake.
    [Show full text]
  • Intel® Xeon® E-2100 Processor Family Datasheet, Vol. 1
    Intel® Xeon® E-2100 Processor Family Datasheet, Volume 1 of 2 August 2018 Revision 001 Document Number: 338012-001 Legal Lines and Disclaimers Intel technologies’ features and benefits depend on system configuration and may require enabled hardware, software or service activation. Learn more at Intel.com, or from the OEM or retailer. No computer system can be absolutely secure. Intel does not assume any liability for lost or stolen data or systems or any damages resulting from such losses. You may not use or facilitate the use of this document in connection with any infringement or other legal analysis concerning Intel products described herein. You agree to grant Intel a non-exclusive, royalty-free license to any patent claim thereafter drafted which includes subject matter disclosed herein. No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this document. The products described may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request. This document contains information on products, services and/or processes in development. All information provided here is subject to change without notice. Contact your Intel representative to obtain the latest Intel product specifications and roadmaps. Intel disclaims all express and implied warranties, including without limitation, the implied warranties of merchantability, fitness for a particular purpose, and non-infringement, as well as any warranty arising from course of performance, course of dealing, or usage in trade. Intel® Turbo Boost Technology requires a PC with a processor with Intel Turbo Boost Technology capability.
    [Show full text]
  • Intel® Software Guard Extensions (Intel® SGX)
    Intel® Software Guard Extensions (Intel® SGX) Developer Guide Intel(R) Software Guard Extensions Developer Guide Legal Information No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this document. Intel disclaims all express and implied warranties, including without limitation, the implied war- ranties of merchantability, fitness for a particular purpose, and non-infringement, as well as any warranty arising from course of performance, course of dealing, or usage in trade. This document contains information on products, services and/or processes in development. All information provided here is subject to change without notice. Contact your Intel rep- resentative to obtain the latest forecast, schedule, specifications and roadmaps. The products and services described may contain defects or errors known as errata which may cause deviations from published specifications. Current characterized errata are available on request. Intel technologies features and benefits depend on system configuration and may require enabled hardware, software or service activation. Learn more at Intel.com, or from the OEM or retailer. Copies of documents which have an order number and are referenced in this document may be obtained by calling 1-800-548-4725 or by visiting www.intel.com/design/literature.htm. Intel, the Intel logo, Xeon, and Xeon Phi are trademarks of Intel Corporation in the U.S. and/or other countries. Optimization Notice Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations.
    [Show full text]
  • Alphaserver ES40 Service Guide
    AlphaServer ES40 Service Guide Order Number: EK–ES240–SV. A01 This guide is intended for service providers and self- maintenance customers responsible for Compaq AlphaServer ES40 systems. Compaq Computer Corporation First Printing, July 1999 The information in this publication is subject to change without notice. COMPAQ COMPUTER CORPORATION SHALL NOT BE LIABLE FOR TECHNICAL OR EDITORIAL ERRORS OR OMISSIONS CONTAINED HEREIN, NOR FOR INCIDENTAL OR CONSEQUENTIAL DAMAGES RESULTING FROM THE FURNISHING, PERFORMANCE, OR USE OF THIS MATERIAL. THIS INFORMATION IS PROVIDED “AS IS” AND COMPAQ COMPUTER CORPORATION DISCLAIMS ANY WARRANTIES, EXPRESS, IMPLIED OR STATUTORY AND EXPRESSLY DISCLAIMS THE IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR PARTICULAR PURPOSE, GOOD TITLE AND AGAINST INFRINGEMENT. This publication contains information protected by copyright. No part of this publication may be photocopied or reproduced in any form without prior written consent from Compaq Computer Corporation. © 1999 Digital Equipment Corporation. All rights reserved. Printed in the U.S.A. The software described in this guide is furnished under a license agreement or nondisclosure agreement. The software may be used or copied only in accordance with the terms of the agreement. COMPAQ and the Compaq logo are registered in United States Patent and Trademark Office. Tru64 is a trademark of Compaq Computer Corporation. AlphaServer and OpenVMS are trademarks of Digital Equipment Corporation. Prestoserve is a trademark of Legato Systems, Inc. UNIX is a registered trademark in the U.S. and other countries, licensed exclusively through X/Open Company Ltd. Microsoft, Windows, and Windows NT are registered trademarks of Microsoft Corporation. Other product names mentioned herein may be the trademarks of their respective companies.
    [Show full text]
  • Computer Architectures an Overview
    Computer Architectures An Overview PDF generated using the open source mwlib toolkit. See http://code.pediapress.com/ for more information. PDF generated at: Sat, 25 Feb 2012 22:35:32 UTC Contents Articles Microarchitecture 1 x86 7 PowerPC 23 IBM POWER 33 MIPS architecture 39 SPARC 57 ARM architecture 65 DEC Alpha 80 AlphaStation 92 AlphaServer 95 Very long instruction word 103 Instruction-level parallelism 107 Explicitly parallel instruction computing 108 References Article Sources and Contributors 111 Image Sources, Licenses and Contributors 113 Article Licenses License 114 Microarchitecture 1 Microarchitecture In computer engineering, microarchitecture (sometimes abbreviated to µarch or uarch), also called computer organization, is the way a given instruction set architecture (ISA) is implemented on a processor. A given ISA may be implemented with different microarchitectures.[1] Implementations might vary due to different goals of a given design or due to shifts in technology.[2] Computer architecture is the combination of microarchitecture and instruction set design. Relation to instruction set architecture The ISA is roughly the same as the programming model of a processor as seen by an assembly language programmer or compiler writer. The ISA includes the execution model, processor registers, address and data formats among other things. The Intel Core microarchitecture microarchitecture includes the constituent parts of the processor and how these interconnect and interoperate to implement the ISA. The microarchitecture of a machine is usually represented as (more or less detailed) diagrams that describe the interconnections of the various microarchitectural elements of the machine, which may be everything from single gates and registers, to complete arithmetic logic units (ALU)s and even larger elements.
    [Show full text]
  • Intel Processor Identification and the CPUID Instruction
    E AP-485 APPLICATION NOTE Intel Processor Identification and the CPUID Instruction April 1998 Order Number : 241618-009 3/23/98 10:27 AM 24161808.DOC INTEL CONFIDENTIAL (until publication date) Information in this document is provided in connection with Intel products. No license, express or implied, by estoppel or otherwise, to any intellectual property rights is granted by this document. Except as provided in Intel's Terms and Conditions of Sale for such products, Intel assumes no liability whatsoever, and Intel disclaims any express or implied warranty, relating to sale and/or use of Intel products including liability or warranties relating to fitness for a particular purpose, merchantability, or infringement of any patent, copyright or other intellectual property right. Intel products are not intended for use in medical, life saving, or life sustaining applications. Intel may make changes to specifications and product descriptions at any time, without notice. Designers must not rely on the absence or characteristics of any features or instructions marked "reserved" or "undefined." Intel reserves these for future definition and shall have no responsibility whatsoever for conflicts or incompatibilities arising from future changes to them. Intel’s Intel Architecture processors (e.g., Pentium® processor, Pentium processor with MMX™ technology, Pentium Pro processor, Pentium II processor and Celeron™ processor) may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request. Contact your local Intel sales office or your distributor to obtain the latest specifications and before placing your product order. Copies of documents which have an ordering number and are referenced in this document, or other Intel literature, may be obtained by calling 1-800-548-4725 or by visiting Intel’s website at http://www.intel.com Copyright © Intel Corporation 1996, 1997.
    [Show full text]