Intel® Advisor User Guide

Intel® Advisor User Guide Intel Corporation Intel® Advisor User Guide Contents Chapter 1: Intel® Advisor User Guide Introduction .............................................................................................7 What's New ......................................................................................8 Design and Optimization Methodology................................................ 13 Tutorials ........................................................................................ 17 Get Help and Support ...................................................................... 18 Install and Launch Intel® Advisor ............................................................... 19 Install Intel® Advisor........................................................................ 19 Set Up Environment Variables ........................................................... 20 Set Up System to Analyze GPU Kernels .............................................. 22 Set Up Environment to Offload DPC++, OpenMP* target, and OpenCL™ Applications to CPU ..................................................................... 24 Launch Intel® Advisor ...................................................................... 25 Get Started with Intel® Advisor GUI ................................................... 28 Set Up Project......................................................................................... 30 Configure Target Application ............................................................. 30 Limit the Number of Threads Used by Parallel Frameworks ........... 31 Choose a Small, Representative Data Set .................................. 31 Build Target Application.................................................................... 32 Create Project................................................................................. 37 Configure Project .................................................................... 38 Configure Binary/Symbol Search Directories ............................... 45 Configure Source Search Directory ............................................ 46 Binary/Symbol Search and Source Search Locations .................... 47 Analyze Vectorization Perspective .............................................................. 49 Run Vectorization and Code Insights Perspective from GUI.................... 50 Vectorization Accuracy Presets.................................................. 52 Customize Vectorization and Code Insights Perspective ................ 53 Run Vectorization and Code Insights Perspective from Command Line .... 58 Vectorization Accuracy Levels in Command Line ......................... 64 Explore Vectorization and Code Insights Results .................................. 65 Vectorization Report Overview .................................................. 65 Examine Not-Vectorized and Under-Vectorized Loops ................... 67 Analyze Loop Call Count .......................................................... 70 Investigate Memory Usage and Traffic........................................ 71 Find Data Dependencies........................................................... 74 Analyze CPU Roofline ............................................................................... 76 Run CPU / Memory Roofline Insights Perspective from GUI.................... 77 CPU Roofline Accuracy Presets .................................................. 79 Customize CPU / Memory Roofline Insights Perspective ................ 79 Run CPU / Memory Roofline Insights Perspective from Command Line .... 84 CPU Roofline Accuracy Levels in Command Line ......................... 90 Explore CPU/Memory Roofline Results ................................................ 91 CPU Roofline Report Overview .................................................. 91 Examine Bottlenecks on CPU Roofline Chart................................ 97 Examine Relationships Between Memory Levels ........................ 100 Compare CPU Roofline Results ................................................ 105 Model Threading Designs........................................................................ 107 2 Contents Run Threading Perspective from GUI................................................ 109 Customize Threading Perspective ............................................ 110 Run Threading Perspective from Command Line ................................ 116 Threading Accuracy Levels in Command Line ........................... 120 Annotate Code for Deeper Analysis .................................................. 121 Annotate Code to Model Parallelism ......................................... 123 Annotations ......................................................................... 133 Annotation Report................................................................. 166 Model Threading Parallelism............................................................ 172 Suitability Report Overview .................................................... 175 Choose Modeling Parameters in the Suitability Report ................ 179 Fix Annotation-related Errors Detected by the Suitability Tool...... 181 Advanced Modeling Options.................................................... 181 Reduce Parallel Overhead, Lock Contention, and Enable Chunking 182 Check for Dependencies Issues ....................................................... 184 Code Locations Pane ............................................................. 184 Filter Pane (Dependencies Report)........................................... 186 Problems and Messages Pane ................................................. 186 Dependencies Source Window ................................................ 187 Add Parallelism to Your Program...................................................... 192 Before You Add Parallelism: Choose a Parallel Framework ........... 192 Add the Parallel Framework to Your Build Environment............... 196 Annotation Report................................................................. 199 Replace Annotations with Intel® oneAPI Threading Building Blocks (oneTBB) Code ....................................................... 200 Replace Annotations with OpenMP* Code ................................. 205 Next Steps for the Parallel Program ......................................... 219 Model Offloading to a GPU ...................................................................... 220 Run Offload Modeling Perspective from GUI ...................................... 222 Offload Modeling Accuracy Presets .......................................... 224 Customize Offload Modeling Perspective................................... 225 Run Offload Modeling Perspective from Command Line ....................... 231 Offload Modeling Accuracy Levels in Command Line .................. 244 Run GPU-to-GPU Performance Modeling from Command Line ...... 247 Explore Offload Modeling Results..................................................... 250 Offload Modeling Report Overview ........................................... 251 Examine Regions Recommended for Offloading ......................... 253 Examine Data Transfers for Modeled Regions ............................ 255 Check for Dependency Issues ................................................. 257 Explore Performance Gain from GPU-to-GPU Modeling................ 258 Investigate Non-Offloaded Code Regions .................................. 260 Advanced Modeling Strategies ........................................................ 268 Manage Invocation Taxes ....................................................... 268 Enforce Offloading for Specific Loops ....................................... 270 Check How Assumed Dependencies Affect Modeling................... 271 Analyze GPU Roofline ............................................................................. 273 Run GPU Roofline Insights Perspective from GUI................................ 275 GPU Roofline Accuracy Presets ............................................... 276 Customize GPU Roofline Insights Perspective ............................ 277 Run GPU Roofline Insights Perspective from Command Line ................ 281 GPU Roofline Accuracy Levels in Command Line ....................... 287 Explore GPU Roofline Results .......................................................... 288 Examine GPU Roofline Summary ............................................. 289 Examine Bottlenecks on GPU Roofline Chart ............................. 290 Examine Kernel Details .......................................................... 296 3 Intel® Advisor User Guide Compare GPU Roofline Results................................................ 299 Design and Analyze Flow Graphs ............................................................. 301 Where to Find the Flow Graph Analyzer ............................................ 301 Launching the Flow Graph Analyzer ................................................. 302 Flow Graph Analyzer GUI Overview.................................................. 303 Menus ................................................................................. 304 Toolbars .............................................................................. 306 Tabs.................................................................................... 307 Main Canvas......................................................................... 310 Charts ................................................................................. 310 Flow Graph Analyzer Workflows......................................................

Intel® Advisor User Guide

Beyond BIOS Developing with the Unified Extensible Firmware Interface

Intel Advisor for Dgpu Intel® Advisor Workflows

Hands-On Intel® Software Development & Oneapi WORKSHOP

Evaluating Techniques for Parallelization Tuning in MPI, Ompss and MPI/Ompss

Introduction to Intel Performance Tools Part

Intel® Software Products Highlights and Best Practices

Intel® Offload Advisor

Intel Edge Computing Portfolio E-Booklet

Intel® Inspector XE 2013

Intel® Inspector for Systems

Intel® Parallel Studio XE 2020 Update 2 Release Notes

Code That Performs