PGI Compilers

USER'S GUIDE FOR X86-64 CPUS
Version 2019

TABLE OF CONTENTS

Preface
    Audience Description
    Compatibility and Conformance to Standards
    Organization
    Hardware and Software Constraints
    Conventions
    Terms
    Related Publications

Chapter 1. Getting Started
    1.1. Overview
    1.2. Creating an Example
    1.3. Invoking the Command-level PGI Compilers
        1.3.1. Command-line Syntax
        1.3.2. Command-line Options
        1.3.3. Fortran Directives and C/C++ Pragmas
    1.4. Filename Conventions
        1.4.1. Input Files
        1.4.2. Output Files
    1.5. Fortran, C, and C++ Data Types
    1.6. Parallel Programming Using the PGI Compilers
        1.6.1. Run SMP Parallel Programs
    1.7. Platform-specific considerations
        1.7.1. Using the PGI Compilers on Linux
        1.7.2. Using the PGI Compilers on Windows
        1.7.3. PGI on the Windows Desktop
        1.7.4. Using the PGI Compilers on macOS
    1.8. Site-Specific Customization of the Compilers
        1.8.1. Use siterc Files
        1.8.2. Using User rc Files
    1.9. Common Development Tasks

Chapter 2. Use Command-line Options
    2.1. Command-line Option Overview
        2.1.1. Command-line Options Syntax
        2.1.2. Command-line Suboptions
        2.1.3. Command-line Conflicting Options
    2.2. Help with Command-line Options
    2.3. Getting Started with Performance
        2.3.1. Using -fast
        2.3.2. Other Performance-Related Options
    2.4. Targeting Multiple Systems: Using the -tp Option
    2.5. Frequently-used Options

Chapter 3. Optimizing and Parallelizing
    3.1. Overview of Optimization
        3.1.1. Local Optimization
        3.1.2. Global Optimization
        3.1.3. Loop Optimization: Unrolling, Vectorization and Parallelization
        3.1.4. Interprocedural Analysis (IPA) and Optimization
        3.1.5. Function Inlining
        3.1.6. Profile-Feedback Optimization (PFO)
    3.2. Getting Started with Optimization
        3.2.1. -help
        3.2.2. -Minfo
        3.2.3. -Mneginfo
        3.2.4. -dryrun
        3.2.5. -v
        3.2.6. PGI Profiler
    3.3. Common Compiler Feedback Format (CCFF)
    3.4. Local and Global Optimization
        3.4.1. -Msafeptr
        3.4.2. -O
    3.5. Loop Unrolling using -Munroll
    3.6. Vectorization using -Mvect
        3.6.1. Vectorization Sub-options
        3.6.2. Vectorization Example Using SIMD Instructions
    3.7. Auto-Parallelization using -Mconcur
        3.7.1. Auto-Parallelization Sub-options
        3.7.2. Loops That Fail to Parallelize
    3.8. Processor-Specific Optimization and the Unified Binary
    3.9. Interprocedural Analysis and Optimization using -Mipa
        3.9.1. Building a Program Without IPA – Single Step
        3.9.2. Building a Program Without IPA – Several Steps
        3.9.3. Building a Program Without IPA Using Make
        3.9.4. Building a Program with IPA
        3.9.5. Building a Program with IPA – Single Step
        3.9.6. Building a Program with IPA – Several Steps
        3.9.7. Building a Program with IPA Using Make
        3.9.8. Questions about IPA
    3.10. Profile-Feedback Optimization using -Mpfi/-Mpfo
    3.11. Default Optimization Levels
    3.12. Local Optimization Using Directives and Pragmas
    3.13. Execution Timing and Instruction Counting
    3.14. Portability of Multi-Threaded Programs on Linux
        3.14.1. libnuma

Chapter 4. Using Function Inlining
    4.1. Automatic function inlining in C/C++
    4.2. Invoking Function Inlining
    4.3. Using an Inline Library
