Software Optimization Guide for the AMD Hammer Processor

Software Optimization Guide for AMD Athlon™ 64 and AMD Opteron™ Processors

Publication # 25112
Revision: 3.03
Issue Date: September 2003

© 2001 – 2003 Advanced Micro Devices, Inc. All rights reserved.

The contents of this document are provided in connection with Advanced Micro Devices, Inc. (“AMD”) products. AMD makes no representations or warranties with respect to the accuracy or completeness of the contents of this publication and reserves the right to make changes to specifications and product descriptions at any time without notice. No license, whether express, implied, arising by estoppel or otherwise, to any intellectual property rights is granted by this publication. Except as set forth in AMD’s Standard Terms and Conditions of Sale, AMD assumes no liability whatsoever, and disclaims any express or implied warranty, relating to its products including, but not limited to, the implied warranty of merchantability, fitness for a particular purpose, or infringement of any intellectual property right.

AMD’s products are not designed, intended, authorized or warranted for use as components in systems intended for surgical implant into the body, or in other applications intended to support or sustain life, or in any other application in which the failure of AMD’s product could create a situation where personal injury, death, or severe property or environmental damage may occur. AMD reserves the right to discontinue or make changes to its products at any time without notice.

Trademarks

AMD, the AMD Arrow logo, AMD Athlon, AMD Opteron, and combinations thereof, 3DNow! and AMD-8151 are trademarks of Advanced Micro Devices, Inc. HyperTransport is a licensed trademark of the HyperTransport Technology Consortium. Microsoft is a registered trademark of Microsoft Corporation. MMX is a trademark of Intel Corporation. Other product names used in this publication are for identification purposes only and may be trademarks of their respective companies.

Contents

Revision History

Chapter 1 Introduction
  1.1 Intended Audience
  1.2 Getting Started Quickly
  1.3 Using This Guide
  1.4 Important New Terms
  1.5 Key Optimizations

Chapter 2 C and C++ Source-Level Optimizations
  2.1 Declarations of Floating-Point Values
  2.2 Using Arrays and Pointers
  2.3 Unrolling Small Loops
  2.4 Expression Order in Compound Branch Conditions
  2.5 Long Logical Expressions in If Statements
  2.6 Arrange Boolean Operands for Quick Expression Evaluation
  2.7 Dynamic Memory Allocation Consideration
  2.8 Unnecessary Store-to-Load Dependencies
  2.9 Matching Store and Load Size
  2.10 SWITCH and Noncontiguous Case Expressions
  2.11 Arranging Cases by Probability of Occurrence
  2.12 Use of Function Prototypes
  2.13 Use of const Type Qualifier
  2.14 Generic Loop Hoisting
  2.15 Local Static Functions
  2.16 Explicit Parallelism in Code
  2.17 Extracting Common Subexpressions
  2.18 Sorting and Padding C and C++ Structures
  2.19 Sorting Local Variables
  2.20 Replacing Integer Division with Multiplication
  2.21 Frequently Dereferenced Pointer Arguments
  2.22 Using Signed Integers for 32-Bit Array Indices
  2.23 32-Bit Integral Data Types
  2.24 Sign of Integer Operands
  2.25 Accelerating Floating-Point Division and Square Root
  2.26 Fast Floating-Point-to-Integer Conversion
  2.27 Speeding Up Branches Based on Comparisons Between Floats

Chapter 3 General 64-Bit Optimizations
  3.1 64-Bit Registers and Integer Arithmetic
  3.2 64-Bit Arithmetic and Large-Integer Multiplication
  3.3 128-Bit Media Instructions and Floating-Point Operations
  3.4 32-Bit Legacy GPRs and Small Unsigned Integers

Chapter 4 Instruction-Decoding Optimizations
  4.1 DirectPath Instructions
  4.2 Load-Execute Instructions
    4.2.1 Load-Execute Integer Instructions
    4.2.2 Load-Execute Floating-Point Instructions with Floating-Point Operands
    4.2.3 Load-Execute Floating-Point Instructions with Integer Operands
  4.3 Branch Targets in Program Hot Spots
  4.4 32/64-Bit vs. 16-Bit Forms of the LEA Instruction
  4.5 Short Instruction Encodings
  4.6 Partial-Register Reads and Writes
  4.7 Using LEAVE for Function Epilogues
  4.8 Alternatives to SHLD Instruction
  4.9 8-Bit Sign-Extended Immediate Values
  4.10 8-Bit Sign-Extended Displacements
  4.11 Code Padding with Operand-Size Override and NOP

Chapter 5 Cache and Memory Optimizations
  5.1 Memory-Size Mismatches
  5.2 Natural Alignment of Data Objects
  5.3 Multiprocessor Considerations
  5.4 Store-to-Load Forwarding Restrictions
  5.5 Prefetch Instructions
  5.6 Write-combining
  5.7 L1 Data Cache Bank Conflicts
  5.8 Placing Code and Data in the Same 64-Byte Cache Line
  5.9 Sorting and Padding C and C++ Structures
  5.10 Sorting Local Variables
  5.11 Appropriate Memory Copying Routines
  5.12 Stack Considerations
  5.13 Interleave Loads and Stores

Chapter 6 Branch Optimizations
  6.1 Density of Branches
  6.2 Two-Byte Near-Return RET Instruction
  6.3 Branches That Depend on Random Data
  6.4 Pairing CALL and RETURN
  6.5 Recursive Functions
  6.6 Nonzero Code-Segment Base Values
  ...

Details

  • File Type
    pdf
  • Content Languages
    English
  • File Pages
    382 pages
