Intel® 64 Architecture Processor Topology Enumeration

Intel® 64 Architecture Processor Topology Enumeration January 2018 Reference Number: 337015-001, Revision: 1.1 Intel technologies' features and benefits depend on system configuration and may require enabled hardware, software, or service activation. Learn more at intel.com, or from the OEM or retailer. No computer system can be absolutely secure. Intel does not assume any liability for lost or stolen data or systems or any damages resulting from such losses. You may not use or facilitate the use of this document in connection with any infringement or other legal analysis concerning Intel products described herein. You agree to grant Intel a non-exclusive, royalty-free license to any patent claim thereafter drafted which includes subject matter disclosed herein. No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this document. The products described may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request. This document contains information on products, services and/or processes in development. All information provided here is subject to change without notice. Contact your Intel representative to obtain the latest Intel product specifications and roadmaps. Intel disclaims all express and implied warranties, including without limitation, the implied warranties of merchantability, fitness for a particular purpose, and non-infringement, as well as any warranty arising from course of performance, course of dealing, or usage in trade. Copies of documents which have an order number and are referenced in this document may be obtained by calling 1-800-548- 4725 or by visiting www.intel.com/design/literature.htm. Intel, the Intel logo, and Xeon are trademarks of Intel Corporation in the U.S. and/or other countries. *Other names and brands may be claimed as the property of others. Copyright © 2018, Intel Corporation. All Rights Reserved. 2 Reference Number: 337015-001, Revision: 1.1 Contents 1 Intel® 64 Architecture Processor Topology Enumeration...........................................5 1.1 Introduction .......................................................................................................5 1.1.1 Glossary .................................................................................................6 1.2 Unique APIC ID in a Multi-Processor System ...........................................................7 1.3 System Topology Enumeration Using CPUID Extended Topology Leaf .........................9 1.4 System Topology Enumeration Using CPUID Leaf 1 and Leaf 4................................. 10 1.4.1 Cache Topology Enumeration ................................................................... 11 1.5 Data Processing of Sub IDs of the Topology .......................................................... 12 1.5.1 Sub ID Extraction Parameters for x2APIC ID .............................................. 12 1.5.2 Considerations for Using Three Level Topology on a System with More Levels. 13 1.5.3 Sub ID Extraction Parameters for Initial APIC ID......................................... 14 1.5.4 Cache ID Extraction Parameters ............................................................... 14 1.6 Analyzing Topology Enumeration Result and Customization ..................................... 15 1.6.1 Dynamic Software Visibility of Topology Enumeration .................................. 16 A Examples................................................................................................................. 17 Figures 1-1 Example of Choosing CPUID Leaf Information for System Topology Enumeration .........8 1-2 Procedures to Extract Sub IDs from the x2APIC ID of Each Logical Processor...............9 1-3 Procedures to Extract Sub IDs from the Initial APIC ID of Each Logical Processor ....... 10 1-4 Procedures to Extract Cache_ID of the Target Cache Level of Each Logical Processor .. 11 1-5 Procedures to Extract Cache_ID of the Target Cache Level of Each Logical Processor .. 16 Reference Number: 337015-001, Revision: 1.1 3 Revision History Document Revision Description Date Number Number 329176-001 1.1 • Updates to original paper January 2018 § 4 Reference Number: 337015-001, Revision: 1.1 Intel® 64 Architecture Processor Topology Enumeration CHAPTER 1 INTEL® 64 ARCHITECTURE PROCESSOR TOPOLOGY ENUMERATION 1.1 INTRODUCTION Processor topology information is important for a number of processor-resource management prac- tices, ranging from task/thread scheduling, licensing policy enforcement, affinity control/migration, etc. Topology information of the cache hierarchy can be important to optimizing software performance. This white paper covers topology enumeration algorithm for single-socket to multiple-socket platforms using Intel 64 and IA-32 processors. The topology enumeration algorithms (both processor and cache) using initial APIC ID has been extended to use x2APIC ID, the latter mechanism is required for future platforms supporting more than 256 logical processors in a coherent domain. Hardware multithreading in microprocessors has proliferated in recent years. The majority of Intel® architecture processors shipping today provide one or more forms of hardware multi-threading support (multicore and/or simultaneous multithreading (SMT), the latter introduced as HyperThreading Tech- nology in 2002). From a processor hardware perspective, the physical package of an Intel 64 processor can support SMT and multi-core. Consequently, a physical processor is effectively a hierarchically ordered collection of logical processors with some forms of shared system resources (for example, memory, bus/system links, caches) From a platform hardware perspective, hardware multithreading that exists in a multi-processor system may consist of two or more physical processors organized in either uniform or non-uniform configuration with respect to the memory subsystem. Application programming using hardware multithreading features must follow the programming models and software constructs provided by the underlying operating system. For example, an OS scheduler generally assigns a software task from a queue using hardware resource at the granularity of a logical processor; an OS may define its own data structure and provide services to applications that allows them to customize the assignment between task and logical processor via an affinity construct for multithreaded applications The OS and the software stack underneath an application (the BIOS, the OS loader) also play significant roles in bringing up the hardware multi-threading features and configuring the software constructs defined by the OS. The CPUID instruction in Intel 64 architecture defines a rich set of information to assist BIOS, OS, and applications to query processor topology that are needed for efficient operation by each respective member of the software stack. Generally, the BIOS needs to gather topology information of a physical processor, determine how many physical processors are present in the system; prepare the necessary software constructs related to system topology, and pass along the system topology information to the next layer of the software stack that takes over control of the system. The OS and the application layers have a wide range of uses for topology information. This document covers several common software usages by OS and applications for using CPUID to analyze processor topology in a single- processor or multi-processor system. The primary software usage of processor topology enumeration deals with querying and identifying the hierarchical relationship of logical processor, processor cores, and physical packages in a single- processor or multi-processor system. We’ll refer to this usage as system topology enumeration. System topology enumeration may be needed by OS or certain applications to implement licensing policy based on physical processors. It is used by OS to implement efficient task-scheduling, minimize thread migration, configure application thread management interfaces, and configure memory allocation services appropriate to the processor/memory topology. Multithreaded applications need system topology information to determine optimal thread binding, manage memory allocation for optimal locality, and improve performance scaling in multi-processor systems. Reference Number: 337015-001, Revision: 1.1 5 Intel® 64 Architecture Processor Topology Enumeration Intel 64 processors fulfill system topology enumeration requirements: 1. Each logical processor in an Intel 64 or IA-32 platform supporting coherent memory is assigned a unique ID (APIC ID) within the coherent domain. A multi-node cluster installation may employ vendor-specific BIOS that preserve the APIC IDs assigned (during processor reset) within each coherent domain, extend with node IDs to form a superset of unique IDs within the clustered system. This document will only cover the CPUID interfaces providing unique IDs within a coherent domain. 2. The values of unique IDs assigned within a coherent Intel 64 or IA-32 platform conform to an algorithm based on bit-field decomposition of the APIC ID into three sub-fields . The three sets of sub-fields correspond to three hierarchical levels defined as “SMT”, “processor core” (or “core”), and “physical package” (or “package”). This allows each hierarchical level to be mapped to a sub- field (a sub ID) within the APIC ID. Conceptually, a topology enumeration algorithm

Intel® 64 Architecture Processor Topology Enumeration

SQL Server 2005 on the IBM Eserver Xseries 460 Enterprise Server

Support for NUMA Hardware in Helenos

A Novel Hard-Soft Processor Affinity Scheduling for Multicore Architecture Using Multiagents

Agent Based Load Balancing Scheme Using Affinity Processor Scheduling for Multicore Architectures

Optimizing Linux* for Dual-Core AMD Opteron™ Processors Optimizing Linux for Dual-Core AMD Opteron Processors

Analysis of Application Sensitivity to System Performance Variability in a Dynamic Task Based Runtime

Introduction to Openmp

White Paper Processor Affinity Multiple CPU Scheduling

Affinity Mask Recommended Value: 0 to Carry out Multitasking, Microsoft Windows 2000 and Windows Server 2003 Sometimes Move Process Threads Among Different Processors

Red Hat Enterprise Linux 7 Performance Tuning Guide

Nighttune User's Guide

Multi-Core Object-Oriented Real-Time Pipeline