Intel Itanium 2 Processor Reference Manual

Intel® Itanium® 2 Processor Reference Manual
For Software Development and Optimization

May 2004
Order Number: 251110-003

THIS DOCUMENT IS PROVIDED “AS IS” WITH NO WARRANTIES WHATSOEVER, INCLUDING ANY WARRANTY OF MERCHANTABILITY, FITNESS FOR ANY PARTICULAR PURPOSE, OR ANY WARRANTY OTHERWISE ARISING OUT OF ANY PROPOSAL, SPECIFICATION OR SAMPLE.

Information in this document is provided in connection with Intel® products. No license, express or implied, by estoppel or otherwise, to any intellectual property rights is granted by this document. Except as provided in Intel's Terms and Conditions of Sale for such products, Intel assumes no liability whatsoever, and Intel disclaims any express or implied warranty, relating to sale and/or use of Intel products including liability or warranties relating to fitness for a particular purpose, merchantability, or infringement of any patent, copyright or other intellectual property right. Intel products are not intended for use in medical, life saving, or life sustaining applications.

Intel may make changes to specifications and product descriptions at any time, without notice.

Designers must not rely on the absence or characteristics of any features or instructions marked “reserved” or “undefined.” Intel reserves these for future definition and shall have no responsibility whatsoever for conflicts or incompatibilities arising from future changes to them.

The Pentium, Itanium and IA-32 architecture processors may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request.

Contact your local Intel sales office or your distributor to obtain the latest specifications and before placing your product order.

Copies of documents which have an order number and are referenced in this document, or other Intel literature, may be obtained by calling 1-800-548-4725, or by visiting Intel's web site at http://www.intel.com.
Intel, Itanium, Pentium, and VTune are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries.

Copyright © 2002-2004, Intel Corporation. All rights reserved.

*Other names and brands may be claimed as the property of others.

Contents

1 About this Manual
    1.1 Overview
    1.2 Contents
    1.3 Terminology
    1.4 Related Documentation
2 Itanium® 2 Processor Enhancements
    2.1 Implemented Instructions
    2.2 Functional Units and Issue Rules
    2.3 Operation Latencies
    2.4 Data Operations
        2.4.1 Data Speculation and the ALAT
        2.4.2 Data Alignment
        2.4.3 Control Speculation
    2.5 Memory Hierarchy
    2.6 Branch Prediction
    2.7 Instruction Prefetching
    2.8 IA-32 Execution Layer
3 Functional Units and Issue Rules
    3.1 Execution Model
    3.2 Number and Types of Functional Units
    3.3 Instruction Slot to Functional Unit Mapping
        3.3.1 Execution Width
        3.3.2 Dispersal Rules
        3.3.3 Split Issue and Bundle Types
4 Latencies and Bypasses
    4.1 Control and Data Speculation Penalties
    4.2 Branch Related Latencies and Penalties
    4.3 Latencies for OS Related Instructions
5 Data Operations
    5.1 Data Speculation and the ALAT
        5.1.1 Allocation/Replacement Policy
        5.1.2 Rules and Special Cases
    5.2 Speculative and Predicated Loads/Stores
    5.3 Floating-Point Loads
    5.4 Data Cache Prefetching and Load Hints
        5.4.1 lfetch Implementation
        5.4.2 Load Temporal Locality Completers
    5.5 Data Alignment
    5.6 Write Coalescing
        5.6.1 WC Buffer Eviction Conditions
        5.6.2 WC Buffer Flushing Behavior
    5.7 Register Stack Engine
    5.8 FC Instructions
6 Memory Subsystem
    6.1 Translation Lookaside Buffers
        6.1.1 Instruction TLBs
        6.1.2 Data TLBs
    6.2 Hardware Page Walker
    6.3 Cache Summary
    6.4 First-Level Instruction Cache
    6.5 Instruction Stream Buffer
    6.6 First-Level Data Cache
        6.6.1 L1D Loads
        6.6.2 L1D Stores
        6.6.3 L1D Load and Store Considerations
        6.6.4 L1D Misses
    6.7 Second-Level Unified Cache
        6.7.1 L1D Requests to L2
        6.7.2 L2 OzQ
        6.7.3 L2 Cancels
        6.7.4 L2 Recirculate
        6.7.5 Memory Ordering
        6.7.6 L2 Instruction Prefetch FIFO
