OpenMP Application Programming Interface Examples
Version 5.0.0 – November 2019

Source codes for OpenMP 5.0.0 Examples can be downloaded from GitHub.

Copyright © 1997-2019 OpenMP Architecture Review Board. Permission to copy without fee all or part of this material is granted, provided the OpenMP Architecture Review Board copyright notice and the title of this document appear. Notice is given that copying is by permission of OpenMP Architecture Review Board.

Contents

Foreword
Introduction
Examples

1 Parallel Execution
  1.1 A Simple Parallel Loop
  1.2 The parallel Construct
  1.3 teams Construct on Host
  1.4 Controlling the Number of Threads on Multiple Nesting Levels
  1.5 Interaction Between the num_threads Clause and omp_set_dynamic
  1.6 Fortran Restrictions on the do Construct
  1.7 The nowait Clause
  1.8 The collapse Clause
  1.9 linear Clause in Loop Constructs
  1.10 The parallel sections Construct
  1.11 The firstprivate Clause and the sections Construct
  1.12 The single Construct
  1.13 The workshare Construct
  1.14 The master Construct
  1.15 The loop Construct
  1.16 Parallel Random Access Iterator Loop
  1.17 The omp_set_dynamic and omp_set_num_threads Routines
  1.18 The omp_get_num_threads Routine

2 OpenMP Affinity
  2.1 The proc_bind Clause
    2.1.1 Spread Affinity Policy
    2.1.2 Close Affinity Policy
    2.1.3 Master Affinity Policy
  2.2 Task Affinity
  2.3 Affinity Display
  2.4 Affinity Query Functions

3 Tasking
  3.1 The task and taskwait Constructs
  3.2 Task Priority
  3.3 Task Dependences
    3.3.1 Flow Dependence
    3.3.2 Anti-dependence
    3.3.3 Output Dependence
    3.3.4 Concurrent Execution with Dependences
    3.3.5 Matrix multiplication
    3.3.6 taskwait with Dependences
    3.3.7 Mutually Exclusive Execution with Dependences
    3.3.8 Multidependences Using Iterators
  3.4 The taskgroup Construct
  3.5 The taskyield Construct
  3.6 The taskloop Construct
  3.7 The parallel master taskloop Construct

4 Devices
  4.1 target Construct
    4.1.1 target Construct on parallel Construct
    4.1.2 target Construct with map Clause
    4.1.3 map Clause with to/from map-types
    4.1.4 map Clause with Array Sections
    4.1.5 target Construct with if Clause
    4.1.6 target Reverse Offload
  4.2 Pointer mapping
  4.3 Structure mapping
  4.4 Array Sections in Device Constructs
  4.5 Array Shaping
  4.6 declare mapper Construct
  4.7 target data Construct
    4.7.1 Simple target data Construct
    4.7.2 target data Region Enclosing Multiple target Regions
    4.7.3 target data Construct with Orphaned Call
    4.7.4 target data Construct with if Clause
  4.8 target enter data and target exit data Constructs
  4.9 target update Construct
    4.9.1 Simple target data and target update Constructs
    4.9.2 target update Construct with if Clause
  4.10 declare target Construct
    4.10.1 declare target and end declare target for a Function
    4.10.2 declare target Construct for Class Type
    4.10.3 declare target and end declare target for Variables
    4.10.4 declare target and end declare target with declare simd
    4.10.5 declare target Directive with link Clause
  4.11 teams Constructs
    4.11.1 target and teams Constructs with omp_get_num_teams and omp_get_team_num Routines
    4.11.2 target, teams, and distribute Constructs
    4.11.3 target teams, and Distribute Parallel Loop Constructs
    4.11.4 target teams and Distribute Parallel Loop Constructs with Scheduling Clauses
    4.11.5 target teams and distribute simd Constructs
    4.11.6 target teams and Distribute Parallel Loop SIMD Constructs
  4.12 Asynchronous target Execution and Dependences
    4.12.1 Asynchronous target with Tasks
    4.12.2 nowait Clause on target Construct
    4.12.3 Asynchronous target with nowait and depend Clauses
  4.13 Device Routines
    4.13.1 omp_is_initial_device Routine
    4.13.2 omp_get_num_devices Routine
    4.13.3 omp_set_default_device and omp_get_default_device Routines
    4.13.4 Target Memory and Device Pointers Routines

5 SIMD
  5.1 simd and declare simd Constructs
  5.2 inbranch and notinbranch Clauses
  5.3 Loop-Carried Lexical Forward Dependence

6 Synchronization
  6.1 The critical Construct
  6.2 Worksharing Constructs Inside a critical Construct
  6.3 Binding of barrier Regions
  6.4 The atomic Construct
  6.5 Restrictions on the atomic Construct
  6.6 The flush Construct without a List
  6.7 Synchronization Based on Acquire/Release Semantics
  6.8 The ordered Clause and the ordered Construct
  6.9 The depobj Construct
  6.10 Doacross Loop Nest
  6.11 Lock Routines
    6.11.1 The omp_init_lock Routine
    6.11.2 The omp_init_lock_with_hint Routine
    6.11.3 Ownership of Locks
    6.11.4 Simple Lock Routines
    6.11.5 Nestable Lock Routines

7 Data Environment
  7.1 The threadprivate Directive
  7.2 The default(none) Clause
  7.3 The private Clause
  7.4 Fortran Private Loop Iteration Variables
  7.5 Fortran Restrictions on shared and private Clauses with Common Blocks
  7.6 Fortran Restrictions on Storage Association with the private Clause
  7.7 C/C++ Arrays in a firstprivate Clause
  7.8 The lastprivate Clause
  7.9 Reduction
    7.9.1 The reduction Clause
    7.9.2 Task Reduction
    7.9.3 Taskloop Reduction
    7.9.4 User-Defined Reduction
  7.10 The copyin Clause