Intel® oneAPI Programming Guide
Intel Corporation
www.intel.com

Contents

Notices and Disclaimers

Chapter 1: Introduction
    Intel oneAPI Programming Overview
    oneAPI Toolkit Distribution
    About This Guide
    Related Documentation

Chapter 2: oneAPI Programming Model
    Data Parallel C++ (DPC++)
    C/C++ or Fortran with OpenMP* Offload Programming Model
    Device Selection

Chapter 3: oneAPI Development Environment Setup
    Use the setvars Script with Windows*
        Use a Config file for setvars.bat on Windows
        Automate the setvars.bat Script with Microsoft Visual Studio*
    Use the setvars Script with Linux* or macOS*
        Use a Config file for setvars.sh on Linux or macOS
        Automate the setvars.sh Script with Eclipse*
    Use Modulefiles with Linux*
    Use CMake with oneAPI Applications

Chapter 4: Compile and Run oneAPI Programs
    Single Source Compilation
    Invoke the Compiler
    Standard Intel oneAPI DPC++/C++ Compiler Options
    Example Compilation
    Compilation Flow Overview
    CPU Flow
        Example CPU Commands
        Optimization Flags for CPU Architectures
        Host and Kernel Interaction on CPU
        Control Binary Execution on Multiple CPU Cores
    GPU Flow
        Example GPU Commands
        Offline Compilation for GPU
    FPGA Flow
        Why is FPGA Compilation Different?
        Types of DPC++ FPGA Compilation
        FPGA Compilation Flags
        Emulate Your Design
            Emulator Environment Variables
            Emulate Pipe Depth
            Emulate Applications with a Pipe That Reads or Writes to an I/O Pipe
            Compile and Emulate Your Design
            Limitations of the Emulator
            Discrepancies in Hardware and Emulator Results
            Emulator Known Issues
        Evaluate Your Kernel Through Simulation (Beta)
            Simulation Prerequisites
            Installing the Questa*-Intel FPGA Edition Software
            Set Up the Simulation Environment
            Compile a Kernel for Simulation (Beta)
            Simulate Your Kernel (Beta)
            Troubleshoot Simulator Issues
        Device Selectors for FPGA
        Fast Recompile for FPGA
        Generate Multiple FPGA Images (Linux only)
        FPGA BSPs and Boards
            FPGA Board Initialization
        Targeting Multiple Homogeneous FPGA Devices
        Targeting Multiple Platforms
        FPGA-CPU Interaction
        FPGA Performance Optimization
        Use of Static Library for FPGA
            Restrictions and Limitations in RTL Support
        Use DPC++ Shared Library With Third-Party Applications
        FPGA Workflows in IDEs

Chapter 5: API-based Programming
    Intel oneAPI DPC++ Library (oneDPL)
        oneDPL Library Usage
        oneDPL Code Sample
    Intel oneAPI Math Kernel Library (oneMKL)
        oneMKL Usage
        oneMKL Code Sample
    Intel oneAPI Threading Building Blocks (oneTBB)
        oneTBB Usage
        oneTBB Code Sample
    Intel oneAPI Data Analytics Library (oneDAL)
        oneDAL Usage
        oneDAL Code Sample
    Intel oneAPI Collective Communications Library (oneCCL)
        oneCCL Usage
        oneCCL Code Sample
    Intel oneAPI Deep Neural Network Library (oneDNN)
        oneDNN Usage
        oneDNN Code Sample
    Intel oneAPI Video Processing Library (oneVPL)
        oneVPL Usage
        oneVPL Code Sample
    Other Libraries

Chapter 6: Software Development Process
    Migrating Code to DPC++
        Migrating from C++ to SYCL/DPC++
        Migrating from CUDA* to DPC++
        Migrating from OpenCL Code to DPC++
        Migrating Between CPU, GPU, and FPGA
    Composability
        C/C++ OpenMP* and DPC++ Composability
        OpenCL™ Code Interoperability
    Debugging the DPC++ and OpenMP* Offload Process
        oneAPI Debug Tools
        Trace the Offload Process
        Debug the Offload Process
        Optimize Offload Performance
    Performance Tuning Cycle
        Establish Baseline
        Identify Kernels to Offload
        Offload Kernels