Software Optimization Guide for AMD Family 15h Processors
Publication No. 47414, Revision 3.06, January 2012

Advanced Micro Devices

© 2012 Advanced Micro Devices, Inc. All rights reserved.

The contents of this document are provided in connection with Advanced Micro Devices, Inc. (“AMD”) products. AMD makes no representations or warranties with respect to the accuracy or completeness of the contents of this publication and reserves the right to make changes to specifications and product descriptions at any time without notice. The information contained herein may be of a preliminary or advance nature and is subject to change without notice. No license, whether express, implied, arising by estoppel or otherwise, to any intellectual property rights is granted by this publication. Except as set forth in AMD’s Standard Terms and Conditions of Sale, AMD assumes no liability whatsoever, and disclaims any express or implied warranty, relating to its products including, but not limited to, the implied warranty of merchantability, fitness for a particular purpose, or infringement of any intellectual property right.

AMD’s products are not designed, intended, authorized or warranted for use as components in systems intended for surgical implant into the body, or in other applications intended to support or sustain life, or in any other application in which the failure of AMD’s product could create a situation where personal injury, death, or severe property or environmental damage may occur. AMD reserves the right to discontinue or make changes to its products at any time without notice.

Trademarks

AMD, the AMD Arrow logo, and combinations thereof, AMD Athlon, AMD Opteron, 3DNow!, AMD Virtualization and AMD-V are trademarks of Advanced Micro Devices, Inc.
HyperTransport is a licensed trademark of the HyperTransport Technology Consortium.
Linux is a registered trademark of Linus Torvalds.
Microsoft and Windows are registered trademarks of Microsoft Corporation.
MMX is a trademark of Intel Corporation.
PCI-X and PCI Express are registered trademarks of the PCI-Special Interest Group (PCI-SIG).
Solaris is a registered trademark of Sun Microsystems, Inc.
Other product names used in this publication are for identification purposes only and may be trademarks of their respective companies.

Contents

Tables
Figures
Revision History

Chapter 1 Introduction
  1.1 Intended Audience
  1.2 Getting Started
  1.3 Using This Guide
    1.3.1 Special Information
    1.3.2 Numbering Systems
    1.3.3 Typographic Notation
  1.4 Important New Terms
    1.4.1 Multi-Core Processors
    1.4.2 Internal Instruction Formats
    1.4.3 Types of Instructions
  1.5 Key Optimizations
    1.5.1 Implementation Guideline
  1.6 What’s New on AMD Family 15h Processors
    1.6.1 AMD Instruction Set Enhancements
    1.6.2 Floating-Point Improvements
    1.6.3 Load-Execute Instructions for Unaligned Data
    1.6.4 Instruction Fetching Improvements
    1.6.5 Instruction Decode and Floating-Point Pipe Improvements
    1.6.6 Notable Performance Improvements
    1.6.7 AMD Virtualization™ Optimizations

Chapter 2 Microarchitecture of AMD Family 15h Processors
  2.1 Key Microarchitecture Features
  2.2 Microarchitecture of AMD Family 15h Processors
  2.3 Superscalar Processor
  2.4 Processor Block Diagram
  2.5 AMD Family 15h Processor Cache Operations
    2.5.1 L1 Instruction Cache
    2.5.2 L1 Data Cache
    2.5.3 L2 Cache
    2.5.4 L3 Cache
  2.6 Branch-Prediction
  2.7 Instruction Fetch and Decode
  2.8 Integer Execution
  2.9 Translation-Lookaside Buffer
    2.9.1 L1 Instruction TLB Specifications
    2.9.2 L1 Data TLB Specifications
    2.9.3 L2 Instruction TLB Specifications
    2.9.4 L2 Data TLB Specifications
  2.10 Integer Unit
    2.10.1 Integer Scheduler
    2.10.2 Integer Execution Unit
  2.11 Floating-Point Unit
  2.12 Load-Store Unit
  2.13 Write Combining
  2.14 Integrated Memory Controller
  2.15 HyperTransport™ Technology Interface
    2.15.1 HyperTransport Assist

Chapter 3 C and C++ Source-Level Optimizations
  3.1 Declarations of Floating-Point Values
  3.2 Using Arrays and Pointers
  3.3 Use of Function Prototypes
  3.4 Unrolling Small Loops
  3.5 Expression Order in Compound Branch Conditions
  3.6 Arrange Boolean Operands for Quick Expression Evaluation
  3.7 Long Logical Expressions in If Statements
  3.8 Pointer Alignment
  3.9 Unnecessary Store-to-Load Dependencies
  3.10 Matching Store and Load Size
  3.11 Use of const Type Qualifier
  3.12 Generic Loop Hoisting
  3.13 Local Static Functions
  3.14 Explicit Parallelism in Code
  3.15 Extracting Common Subexpressions
  3.16 Sorting and Padding C and C++ Structures
  3.17 Replacing Integer Division with Multiplication
  3.18 Frequently Dereferenced Pointer Arguments