PGI Compilers
USER'S GUIDE FOR X86-64 CPUS
Version 2019

TABLE OF CONTENTS

Preface
  Audience Description
  Compatibility and Conformance to Standards
  Organization
  Hardware and Software Constraints
  Conventions
  Terms
  Related Publications

Chapter 1. Getting Started
  1.1. Overview
  1.2. Creating an Example
  1.3. Invoking the Command-level PGI Compilers
    1.3.1. Command-line Syntax
    1.3.2. Command-line Options
    1.3.3. Fortran Directives and C/C++ Pragmas
  1.4. Filename Conventions
    1.4.1. Input Files
    1.4.2. Output Files
  1.5. Fortran, C, and C++ Data Types
  1.6. Parallel Programming Using the PGI Compilers
    1.6.1. Run SMP Parallel Programs
  1.7. Platform-specific considerations
    1.7.1. Using the PGI Compilers on Linux
    1.7.2. Using the PGI Compilers on Windows
    1.7.3. PGI on the Windows Desktop
    1.7.4. Using the PGI Compilers on macOS
  1.8. Site-Specific Customization of the Compilers
    1.8.1. Use siterc Files
    1.8.2. Using User rc Files
  1.9. Common Development Tasks

Chapter 2. Use Command-line Options
  2.1. Command-line Option Overview
    2.1.1. Command-line Options Syntax
    2.1.2. Command-line Suboptions
    2.1.3. Command-line Conflicting Options
  2.2. Help with Command-line Options
  2.3. Getting Started with Performance
    2.3.1. Using -fast
    2.3.2. Other Performance-Related Options
  2.4. Targeting Multiple Systems – Using the -tp Option
  2.5. Frequently-used Options

Chapter 3. Optimizing and Parallelizing
  3.1. Overview of Optimization
    3.1.1. Local Optimization
    3.1.2. Global Optimization
    3.1.3. Loop Optimization: Unrolling, Vectorization and Parallelization
    3.1.4. Interprocedural Analysis (IPA) and Optimization
    3.1.5. Function Inlining
    3.1.6. Profile-Feedback Optimization (PFO)
  3.2. Getting Started with Optimization
    3.2.1. -help
    3.2.2. -Minfo
    3.2.3. -Mneginfo
    3.2.4. -dryrun
    3.2.5. -v
    3.2.6. PGI Profiler
  3.3. Common Compiler Feedback Format (CCFF)
  3.4. Local and Global Optimization
    3.4.1. -Msafeptr
    3.4.2. -O
  3.5. Loop Unrolling using -Munroll
  3.6. Vectorization using -Mvect
    3.6.1. Vectorization Sub-options
    3.6.2. Vectorization Example Using SIMD Instructions
  3.7. Auto-Parallelization using -Mconcur
    3.7.1. Auto-Parallelization Sub-options
    3.7.2. Loops That Fail to Parallelize
  3.8. Processor-Specific Optimization and the Unified Binary
  3.9. Interprocedural Analysis and Optimization using -Mipa
    3.9.1. Building a Program Without IPA – Single Step
    3.9.2. Building a Program Without IPA – Several Steps
    3.9.3. Building a Program Without IPA Using Make
    3.9.4. Building a Program with IPA
    3.9.5. Building a Program with IPA – Single Step
    3.9.6. Building a Program with IPA – Several Steps
    3.9.7. Building a Program with IPA Using Make
    3.9.8. Questions about IPA
  3.10. Profile-Feedback Optimization using -Mpfi/-Mpfo
  3.11. Default Optimization Levels
  3.12. Local Optimization Using Directives and Pragmas
  3.13. Execution Timing and Instruction Counting
  3.14. Portability of Multi-Threaded Programs on Linux
    3.14.1. libnuma

Chapter 4. Using Function Inlining
  4.1. Automatic function inlining in C/C++
  4.2. Invoking Function Inlining
  4.3. Using an Inline Library