OpenMP ¿1?§Ä:
o¬¬ [email protected], [email protected]
¥IÆ)ÔU L§ïĤ ?O¥%
2009 c 12 OpenMP ?§ÔSN
1 ¿1O{0
2 OpenMP {0
3 OpenMP {
4 OpenMP S3Cþ£ICV¤
5 OpenMP -
6 OpenMP êâ¸
7 OpenMP $1¥f§S
8 OpenMP MPI ·Ü¿1?§
9 OpenMP §S3 Unix/Linux e?È$1
10 9Ï^óä
11 éXª o¬¬ (U¤¥%) OpenMP ¿1?§Ä: 2009 c 12 2 / 85 ¿1O{0
oæ^¿1Oº G1§SÝJ, ú ±\¯Ý ±\5
o¬¬ (U¤¥%) OpenMP ¿1?§Ä: 2009 c 12 3 / 85 ¿1OO©a
SµOpenMP õ?nìÓS ccNUMA(Cache-Coherent Non-Uniform Memory Access)! SMP(Symmetric MultiProcessing) ©ÙªSµMPI z?nìÑkgCS ?nìmÏLD4E&E Cluster!MPP(Massively Parallel Processing)
o¬¬ (U¤¥%) OpenMP ¿1?§Ä: 2009 c 12 4 / 85 n«O.
o¬¬ (U¤¥%) OpenMP ¿1?§Ä: 2009 c 12 5 / 85 ¿1z©){
?Ö©)µõ?Ö¿u1 õU©)µ©)1O «©)µ©)1êâ
o¬¬ (U¤¥%) OpenMP ¿1?§Ä: 2009 c 12 6 / 85 o´ OpenMP
OpenMP ´ÏL3 è¥V\ OpenMP ?È-!N^ OpenMP ¥¼ê5¢y3SXÚþ¿1$1«IO OpenMP JøS¿1§S «{ü(¹^umu¿1 A^.§¦§SQ±$13ªÅ§q±$13? OÅþ§¿äkûУ5Ú 5 m OpenMP ?ÈÀ?ÈìòÑ OpenMP ?È- ?ȤG11§S m OpenMP ?ÈÀòé OpenMP ?È-?1?n?Ȥ OpenMP ¿11§S $1¿1§ê±3§SéÄ|^¸Cþ Ä |± MPI ·Ü?§§3!:SÜ$1 OpenMP ¿1§!:m $1 MPI ¿1 OpenMP |± C/C++ Ú Fortran ó OpenMP §S±$13 Unix!Linux Ú Windodws öXÚþ
o¬¬ (U¤¥%) OpenMP ¿1?§Ä: 2009 c 12 7 / 85 OpenMP ëÌúi!ÅÚ< OpenMP Architecture Review Board(ARB) [Ȥ AMD (David Leibs) Cray (James Beyer) Fujitsu (Matthijs van Waveren) HP (Uriel Schafer) IBM (Kelvin Li) Intel (Sanjiv Shah) NEC (Kazuhiro Kusano) The Portland Group, Inc. (Michael Wolfe) SGI (Lori Gilbert) Sun Microsystems (Nawal Copty) Microsoft (-) ARB 9Ϥ ASC/LLNL (Bronis R. de Supinski) cOMPunity (Barbara Chapman) EPCC (Mark Bull) NASA (Henry Jin) RWTH Aachen University (Dieter an Mey) (Õ:µhttp://www.openmp.org o¬¬ (U¤¥%) OpenMP ¿1?§Ä: 2009 c 12 8 / 85 OpenMP uÙ{¤
Version 3.0 Complete Specifications - (May, 2008) Version 2.5 - (May 2005, combined C/C++ and Fortran) C/C++ version 2.0 - (March 2002) C/C++ version 2.0 with change bars reflecting changes from 1.0 - (March 2002) Fortran version 2.0 - (November 2000) Fortran version 2.0 with change bars reflecting changes from 1.1 (November 2000) C/C++ version 1.0 - (October 1998) Fortran version 1.1 - (November 1999 - incorporates April 1999 Interpretations and Errata) Fortran version 1.0 - (October 1997)
o¬¬ (U¤¥%) OpenMP ¿1?§Ä: 2009 c 12 9 / 85 |± OpenMP ?Èì±9éA?Èëê
úi½| ?Èì éA?Èëê GNU gcc(4.3.2) -fopenmp Intel C/C++/Fortran(10.1) Linux: -openmp, Win: /Qopenmp Portland Group C/C++/Fortran -mp IBM XL C/C++/Fortran -qsmp=omp HP C/C++/Fortran +Oopenmp Microsoft Visual Studio 2008 C++ /openmp Sun Microsystems C/C++/Fortran -xopenmp Absoft Pro FortranMP Fortran -openmp Lahey/Fujitsu Fortran95 C/C++/Fortran –openmp –threadstack –threadheap PathScale C/C++/Fortran -apo -mp
o¬¬ (U¤¥%) OpenMP ¿1?§Ä: 2009 c 12 10 / 85 ¥þ:¦G1è
ü¥þ:¦µsum = A · B § int main(argc,argv) int argc; char ∗argv[]; { double sum; double a[256], b[256]; int i,n; n = 256; for (i =0; i o¬¬ (U¤¥%) OpenMP ¿1?§Ä: 2009 c 12 11 / 85 ¥þ:¦ MPI ¿1è § #include "mpi.h" int main(argc,argv) int argc; char ∗argv[]; { double sum, sum_local; double a[256], b[256]; int i , n, numprocs, myid, my_first, my_last; n = 256; MPI_Init(&argc,&argv); MPI_Comm_size(MPI_COMM_WORLD,&numprocs); MPI_Comm_rank(MPI_COMM_WORLD,&myid); my_first = myid ∗ n/numprocs; my_last = (myid + 1) ∗ n/numprocs; for (i =0; i § int main(argc,argv) int argc; char ∗argv[]; { double sum; double a[256], b[256]; int status; int i ,n=256; for (i =0; i o¬¬ (U¤¥%) OpenMP ¿1?§Ä: 2009 c 12 13 / 85 Ä: OpenMP ´e¡8ܵ ?È- ¥¼ê ¸Cþ o¬¬ (U¤¥%) OpenMP ¿1?§Ä: 2009 c 12 14 / 85 OpenMP ªµC/C++ C/C++ ¥ OpenMP |^ pragma ý?n- OpenMP -§ {Uì±eªµ § #pragma omp directive−name [clause[ [,] clause]. . . ] new−line ¦ ¥ z-± #pragma omp m©§ -Ù{Ü©Uì C/C++ ?È-IO§¿«© # cÚ kx§ kÿ7L^x5© -¥©i #pragma omp ý?n8I3?Èò÷O z OpenMP 1-7LA^u 駿 7 L´(¬ z1Uk OpenMP - ëê Ãc ©§I±E o¬¬ (U¤¥%) OpenMP ¿1?§Ä: 2009 c 12 15 / 85 OpenMP ªµFortran Fortran e OpenMP -{Uì±eªµ § sentinel directive −name [clause[[,] clause]. . . ] ¦ ¥ OpenMP -I÷v±e^µ ¤k OpenMP ?È-7L±IÎm© ½ªÚgdª OpenMP I£ÎØÓ OpenMP -Ø«© OpenMP -ØUSi3ëY饧éØUSi3-¥ z1Uk OpenMP - ëê Ãc ©§I±E éu Fortran§8 ØAÏ`²§ÄK±gdªLã o¬¬ (U¤¥%) OpenMP ¿1?§Ä: 2009 c 12 16 / 85 OpenMP ªµFortran ½ª ½ª Fortran OpenMP -I£Î !$omp!c$omp ½ ∗$omp§I÷ v±e^µ I£Î7I31m©§¿ üc§mØkÙ§iÎ ½ª Fortran 1°!x!Y1Ú5Ké-1IŠЩ-17Lk½ 0 318§-Y17Lk½ 0 iÎ318 5º±-Ó1§318 ! L²5ºm© XJ-IÎ 1iδiΧ½ö-Y1´ ! §@od1Ñ § !$omp directive −name [clause[[,] clause]. . . ] new−line c$omp directive−name [clause[[,] clause]. . . ] new−line ∗$omp directive−name [clause[[,] clause]. . . ] new−line ¦ ¥ ±en«L«ª d § c23456789 L«Ò !$omp parallel do shared(a,b,c) c$omp parallel do c$omp+shared(a,b ,c) ∗$omp paralleldoshared (a,b,c) ¦ ¥ o¬¬ (U¤¥%) OpenMP ¿1?§Ä: 2009 c 12 17 / 85 OpenMP ªµFortran gdª gdª Fortran OpenMP -I£Î !$omp§I÷v±e^µ I£Î3?Û m©§ciÎ7LUx ØUÙ§iÎ 7LüÕüc§mØUkÙ§iÎ gdª Fortran 1°!x!Y1Ú5Ké-1IŠЩ-17L3I£Î k IY11 §7L &§ÙY1±3I£Î k &§ & c kx 5º±-Ó1§! L²5ºm© XJ-IÎ 1iδiΧ½-Y1=´ !§@od1Ñ ½õ½LÎ^u©C-¥' c§±e/´Àµ end critical!end do!end master!end ordered!end parallel!end sections!end single!end task!end workshare!parallel do!parallel sections!parallel workshare § !$omp directive−name [clause[[,] clause]. . . ] new−line ¦ ¥ § !23456789 L«Ò !$omp parallel do & !$omp shared(a,b,c) !$omp parallel & !$omp&do shared(a,b,c) !$omp paralleldo shared(a,b,c) ¦ ¥ o¬¬ (U¤¥%) OpenMP ¿1?§Ä: 2009 c 12 18 / 85 ^?È éu|±ý?n OpenMP ¢y§÷ _OPENMP ½Â yyyymm ª§Ù¥ yyyy Ú mm ©Od?Èì¦^ OpenMP IO uÙcÚ§Xd÷,d #define Ú #undef (²§@o?È1ò Ã{ý§Ïd¦þØgC½Âd÷" o¬¬ (U¤¥%) OpenMP ¿1?§Ä: 2009 c 12 19 / 85 ^?ȵFortran ½ª IÎ !$!∗$ ½ c$ §¿I÷v±e^µ IPÎ7Il1m©§¿ ´mvkxN IPÎOü §Ð©17L318k½ ö 0§¿ 311ÊmUx½êi IPÎOü §Y17L318kx½ ö 0 iΧ¿ 311ÊmUx XJþã÷v§@o?Èd1I£ÎòòOü§ÄK òØ?n o¬¬ (U¤¥%) OpenMP ¿1?§Ä: 2009 c 12 20 / 85 ^?ȵFortran ½ª § !23456789 L«Ò !$ interval=l∗omp_get_thread_num() / & !$ (omp_get_num_threads() −1) ! e1 !$ duc¡kiÎ ØL« OpenMP ^?ȧ ´5º Do i =1,100, !$ OMP_get_num_threads() ¦ ¥ ±eü«ª dµ § !23456789 L«Ò !$ 10 iam = omp_get_thread_num() + !$ & index #i f d e f _OPENMP 10 iam = omp_get_thread_num() + & index #endif ¦ ¥ o¬¬ (U¤¥%) OpenMP ¿1?§Ä: 2009 c 12 21 / 85 ^?ȵFortran gdª IÎ !$§¿I÷v±e^µ ± !$ ر !$omp m©IP§ c¡vk± Ù§iÎ IPδmvkxN Щ17L3I£Î k IY11 7L &§Y13I£Î k &§& c kx XJþã÷v§@o?Èd1I£ÎòòOü§ÄK òØ?n o¬¬ (U¤¥%) OpenMP ¿1?§Ä: 2009 c 12 22 / 85 ^?ȵFortran gdª § !$interval=l∗omp_get_thread_num() / & !$(omp_get_num_threads()−1) !e1 !$ duc¡kiÎ ØL« OpenMP ^?ȧ ´5º Do i=1,100, !$ OMP_get_num_threads() ¦ ¥ ±eü«ª d § c23456789 !$ iam = omp_get_thread_num() + & !$& index #ifdef _OPENMP iam = omp_get_thread_num() + & index #endif ¦ ¥ o¬¬ (U¤¥%) OpenMP ¿1?§Ä: 2009 c 12 23 / 85 S3Cþ£ICV¤ S3Cþ£ICV¤´S^u OpenMP §S1Cþ§ 'X;§ê!§Ò &E K¿1¬S3Cþµ dyn-varµK¿1«´Ä#NÄN§ê"z?Ö ICV B" nest-varµK¿1«´Ä#Ni@¿1"z?Ö ICV B" nthreads-varµK¿1«§ê"z?Ö ICV B" thread-limit-varµ§S#Në§ê"?Ö^ ICV B" max-active-levels-varµi@-¹¿1«ê"?Ö^ ICV B" KÌ«öS3Cþµ run-sched-varµÌ«¢NÝüÑ"z?Ö ICV B" def-sched-varµÌ«%@¢NÝüÑ"?Ö^ ICV B" K§S1S3Cþµ stacksize-varµ OpenMP ¢y)§æÒ£stack¤"?Ö^ ICV B" wait-policy-varµ §Ï"1"?Ö^ ICV B" o¬¬ (U¤¥%) OpenMP ¿1?§Ä: 2009 c 12 24 / 85 UÚ¼S3Cþ S3Cþ ^ Cþª ¼ Щ dyn-var ü?Ö OMP_DYNAMIC omp_get_dynamic() ëw5) omp_set_dynamic() nest-var ü?Ö OMP_NESTED omp_get_nested() false omp_set_nested() nthreads-var ü?Ö OMP_NUM_THREADS omp_get_max_threads() 1½Â omp_set_num_threads() run-sched-var ü?Ö OMP_SCHEDULE omp_get_schedule() 1½Â omp_set_schedule() def-sched-var Û (none) (none) 1½Â stacksize-var Û OMP_STACKSIZE (none) 1½Â wait-policy-var Û OMP_WAIT_POLICY (none) 1½Â thread-limit-var Û OMP_THREAD_LIMIT omp_get_thread_limit() 1½Â max-active Û OMP_MAX_ACTIVE_LEVELS omp_get_max_active_ ëw5) -levels-var omp_set_max_active_levels() levels() 5)µ X OpenMP 1|±ÄN§§dyn-var Щd1½Â§ÄK false max-active-levels-var Щ´1|±¿1?Oê8§[&EëÃþ" §Sm©$1§CþЩòD§¿¬3 OpenMP (½ API f§S¥1§^r OpenMP ¸Cþ Ö¿DéASÜCþ§ ?Û OpenMP ¸CþUCÑج2KSÜCþ"OpenMP (ëêØ ¬K?ÛSÜCþ" o¬¬ (U¤¥%) OpenMP ¿1?§Ä: 2009 c 12 25 / 85 ?ÖS3Cþóª ?Ö«äkЩCþ dyn-var!nest-var!nthreads-var Ú run-sched-var gCB"?Ö(½¿1(3?Ö¥ 1§Ù)f?ÖU«ù Cþ"N^ omp_set_num_threads()! omp_set_dynamic()!omp_set_nested() Ú omp_set_schedule() == ¬#Ù'?Ö«S3Cþ" 1½ schedule(runtime) Ì󫧤k|¤½ parallel «?Ö« run-sched-var 7LkÓ§ÄK1ò Ã{ý" o¬¬ (U¤¥%) OpenMP ¿1?§Ä: 2009 c 12 26 / 85 `k? (ëê API f§S ¸Cþ S3Cþ (none) omp_set_dynamic() OMP_DYNAMIC dyn-var (none) omp_set_nested() OMP_NESTED nest-var num_threads omp_set_num_threads() OMP_NUM_THREADS nthreads-var schedule omp_set_schedule() OMP_SCHEDULE run-sched-var schedule (none) (none) def-sched-var (none) (none) OMP_STACKSIZE stacksize-var (none) (none) OMP_WAIT_POLICY wait-policy-var (none) (none) OMP_THREAD_LIMIT thread-limit-var (none) omp_set_max_active_levels() OMP_MAX_ACTIVE_LEVELS max-active-levels-var >`k?pm> o¬¬ (U¤¥%) OpenMP ¿1?§Ä: 2009 c 12 27 / 85 ¿1(µparallel Construct ² «¥èò¿11§¿ ¿1´±i@ C/C++µ § #pragma omp parallel [clause[ [, ]clause] ...] new−line structured−block ¦ ¥ Fortranµ § !$omp parallel [clause [[,] clause ]...] structured−block !$omp end parallel ¦ ¥ o¬¬ (U¤¥%) OpenMP ¿1?§Ä: 2009 c 12 28 / 85 ¿1(ã« § !$omp parallel write(∗,∗) "Hello" !$omp end parallel ¦ ¥ o¬¬ (U¤¥%) OpenMP ¿1?§Ä: 2009 c 12 29 / 85 ¿1i@ã« § !$omp parallel write(∗,∗) "Hello" !$omp parallel write(∗,∗) 'Hi' !$omp end parallel !$omp end parallel ¦ ¥ o¬¬ (U¤¥%) OpenMP ¿1?§Ä: 2009 c 12 30 / 85 ó(µWorksharing Constructs ó(¬31d(§¥©'«§ 1"§¬13z31?Ö¹e۹ܩ«"3z ó(\?¿ØI?1ÓÚ§3ó«(å? Øk nowait ëê§ÄK%@¬?1ÓÚ"XJk nowait§@o1 d?ò¬Ñdó«ÓÚ§@1d(§¬ 1e¡- ØI Ù§?§1d§ØI?1M# £flush¤ö"OpenMP ½Â ±eó(µ Ì(µloop structure ©¬(µsections structure ü1(µsingle structure ó(µworkshare structure µ ó«7L¤k|S§Ñ1½ÑØ1 |Sz§1ó«ÚÓÚ«^S7L o¬¬ (U¤¥%) OpenMP ¿1?§Ä: 2009 c 12 31 / 85 Ì(µLoop Constructs Ì(²½õ'ÌSò|§1 C/C++µ § #pragma omp for [clause[[,] clause] ... ] new−line for−loops ¦ ¥ Fortranµ § !$omp do [clause [[,] clause] ... ] do−loops [ !$omp end do [nowait] ] ¦ ¥ o¬¬ (U¤¥%) OpenMP ¿1?§Ä: 2009 c 12 32 / 85 Ì(ã« § !$omp do,clause · · · do i = 1, 1000 . . . enddo !$omp end do ¦ ¥ o¬¬ (U¤¥%) OpenMP ¿1?§Ä: 2009 c 12 33 / 85 ©¬(µsections Construct ØÓ§1ØÓ«Sè C/C++µ § #pragma omp sections [clause[[,] clause] ...] new−line { [#pragma omp section new−line] structured−block [#pragma omp section new−line structured−block ] ... } ¦ ¥ Fortranµ § !$omp sections [clause[[,] clause] ...] [ !$omp section] structured−block [ !$omp section structured−block ] ... !$omp end sections [nowait] ¦ ¥ o¬¬ (U¤¥%) OpenMP ¿1?§Ä: 2009 c 12 34 / 85 ©¬(ã« § !$omp sections !$omp section write(∗,∗) "hello" !$omp section write(∗,∗) "hi" !$omp section write(∗,∗) "bye" !$omp end sections ¦ ¥ o¬¬ (U¤¥%) OpenMP ¿1?§Ä: 2009 c 12 35 / 85 ü1(µsingle structure ²'èUk§£¿Ø7L´Ì§¤1§k §1 C/C++µ § #pragma omp single [clause[[,] clause] ...] new−line structured−block ¦ ¥ Fortranµ § !$omp single [clause [[,] clause] ...] structured−block !$omp end single [end_clause[[,] end_clause] ...] ¦ ¥ o¬¬ (U¤¥%) OpenMP ¿1?§Ä: 2009 c 12 36 / 85 ü1(ã« § !$omp single clause1 clause2 · · · · · · !$omp end single end_clause ¦ ¥ o¬¬ (U¤¥%) OpenMP ¿1?§Ä: 2009 c 12 37 / 85 ó(µworkshare Construct ²'詤ØÓü±øØÓ§1§ØÓüU ,§1g C/C++µØ|±d OpenMP - Fortranµ § !$omp workshare structured−block !$omp end workshare [nowait] ¦ ¥ structured-block I÷v±e^µ ê|D IþD forall é forall ( where é where ( atomic ( critical ( parallel ( critical (¥µ\éÉd§parallel (¥µ\éØÉ d" o¬¬ (U¤¥%) OpenMP ¿1?§Ä: 2009 c 12 38 / 85 éÜ¿1ó( éÜ¿1(¢Sþ´3¿1(¥i@(¯$ ª§ù -òÓ²ó(äk¿15§ ØIÙ§ é,1²"éÜ¿1(#N ¿1(Úó(Ñ# NA½ëê"X§S¹ké¿1(½ó(äkØÓ1 ëê§@oÙ1òØýÿ"éÜ¿1(¹±e(µ ¿1Ì( ¿1©¬( ¿1ó( o¬¬ (U¤¥%) OpenMP ¿1?§Ä: 2009 c 12 39 / 85 ¿1Ì(µParallel Loop Constructs ¿1Ì(²½õ'ÌSò|§¿11 C/C++µ § #pragma omp parallel for [clause[[,] clause] ... ] new−line for−loops ¦ ¥ Fortranµ § !$omp parallel do [clause [[,] clause] ... ] do−loops [ !$omp parallel end do [nowait] ] ¦ ¥ o¬¬ (U¤¥%) OpenMP ¿1?§Ä: 2009 c 12 40 / 85 ¿1©¬(µparallel sections Construct ¿1©¬(²'ØÓ«èòØÓ§¿11 C/C++µ § #pragma omp parallel sections [clause[[,] clause] ...] new−line { [#pragma omp section new−line] structured−block [#pragma omp section new−line structured−block ] ... } ¦ ¥ Fortranµ § !$omp parallel sections [clause[[,] clause] ...] [ !$omp section] structured−block [ !$omp section structured−block ] ... !$omp end parallel sections [nowait] ¦ ¥ o¬¬ (U¤¥%) OpenMP ¿1?§Ä: 2009 c 12 41 / 85 ¿1ó(µparallel workshare Construct ¿1ó(²'詤ØÓü±øØÓ§¿ 11§ØÓüU,§1g C/C++µØ|±d OpenMP - Fortranµ § !$omp parallel workshare structured−block !$omp parallel end workshare [nowait] ¦ ¥ structured-block I÷v±e^µ ê|D IþD forall é forall ( where é where ( atomic ( critical ( parallel ( critical (¥µ\éÉd§parallel (¥µ\éØÉ d" o¬¬ (U¤¥%) OpenMP ¿1?§Ä: 2009 c 12 42 / 85 ?Ö(µtask Construct ?Ö(½Â²(?Ö C/C++µ § #pragma omp task [clause[[,] clause] ...] new−line structured−block ¦ ¥ Fortranµ § !$omp task [clause [[,] clause] ...] structured−block !$omp end task ¦ ¥ o¬¬ (U¤¥%) OpenMP ¿1?§Ä: 2009 c 12 43 / 85 Ì(ÚÓÚ( Ì(ÚÓÚ(̹±e( Ì(µmaster construct .(µcritical construct ¶æ(µbarrier construct ?Ö (µtaskwait construct f(µatomic construct M#(µflush construct kS(µordered construct o¬¬ (U¤¥%) OpenMP ¿1?§Ä: 2009 c 12 44 / 85 Ì(µmaster construct Ì(²(¬I̧£Ì§¤1 C/C++µ § #pragma omp master new−line structured−block ¦ ¥ Fortranµ § !$omp master structured−block !$omp end master ¦ ¥ o¬¬ (U¤¥%) OpenMP ¿1?§Ä: 2009 c 12 45 / 85 Ì(ã« § !$omp master write(∗,∗) "Hello" !$omp end master ¦ ¥ o¬¬ (U¤¥%) OpenMP ¿1?§Ä: 2009 c 12 46 / 85 .(µcritical construct .(²(¬3ÓmUk§1 C/C++µ § #pragma omp critical [(name)] new−line structured−block ¦ ¥ Fortranµ § !$omp critical [(name)] structured−block !$omp end critical [(name)] ¦ ¥ o¬¬ (U¤¥%) OpenMP ¿1?§Ä: 2009 c 12 47 / 85 .(ã« § !$omp critical write_file write(1,∗) data !$omp end critical write_file ¦ ¥ o¬¬ (U¤¥%) OpenMP ¿1?§Ä: 2009 c 12 48 / 85 ¶æ(µbarrier construct ¶æ(²3d?Iwª¶æ§=I|S¤k§Ñ$1d â1 Yè C/C++µ § #pragma omp barrier new−line ¦ ¥ Fortranµ § !$omp barrier ¦ ¥ o¬¬ (U¤¥%) OpenMP ¿1?§Ä: 2009 c 12 49 / 85 ¶æ(ã« o¬¬ (U¤¥%) OpenMP ¿1?§Ä: 2009 c 12 50 / 85 ¶æ(~ØÞ~ 5¿µbarrier 7L3¤k§Ó^S¥ ~ØÞ~µ ··· ··· !$omp single ··· !$omp critical ··· !$omp barrier ··· !$omp barrier ··· !$omp end single ··· !$omp end critical ··· ··· ··· !$omp sections ··· !$omp master ··· !$omp section ··· !$omp barrier ··· !$omp barrier ··· !$omp end master ··· !$omp end section ··· !$omp end sections ··· o¬¬ (U¤¥%) OpenMP ¿1?§Ä: 2009 c 12 51 / 85 ?Ö (µtaskwait construct ?Ö (²Igc?Öm©ÒI )f?Ö¤ C/C++µ § #pragma omp taskwait newline ¦ ¥ Fortranµ § !$omp taskwait ¦ ¥ o¬¬ (U¤¥%) OpenMP ¿1?§Ä: 2009 c 12 52 / 85 f(µatomic construct f(^u(²;«é,ö?1f#§ ئ٠³õÓÚ?1§§;Óõ§éù Cþ?1# C/C++µ § #pragma omp atomic new−line expression−stmt ¦ ¥ Fortranµ § !$omp atomic statement ¦ ¥ Þ~µ § !$omp do do i = 1, 1000 !$omp atomic + a=a+i enddo !$omp end do ¦ ¥ o¬¬ (U¤¥%) OpenMP ¿1?§Ä: 2009 c 12 53 / 85 M#(µflush construct M#(²é½Cþ1SM#ö±y§dwù Cþ3S¥ C/C++µ § #pragma omp flush [(list)] new−line ¦ ¥ Fortranµ § !$omp flush [( list )] ¦ ¥ o¬¬ (U¤¥%) OpenMP ¿1?§Ä: 2009 c 12 54 / 85 kS(µordered construct kS(²Ì(¥(¬1UìÌS^S§òüSd (¬¥1^S§¿#N(¬ 豿11 C/C++µ § #pragma omp ordered new−line structured−block ¦ ¥ Fortranµ § !$omp ordered structured−block !$omp end ordered ¦ ¥ o¬¬ (U¤¥%) OpenMP ¿1?§Ä: 2009 c 12 55 / 85 kS(ã« § !$omp do ordered do i = 1, 100 block1 !$omp ordered block2 !$omp end ordered block3 enddo !$omp end do ¦ ¥ o¬¬ (U¤¥%) OpenMP ¿1?§Ä: 2009 c 12 56 / 85 (SÚ^Cþêâá5{K (¥Ú^Cþêâá5ýkû½!w«û½½Ûªû ½"µ4(¥ firstprivate!lastprivate ½ reduction ëê( ²Cþ¬d(¥Cþæ^ÛªÚ^ª§¿Å±e5Kµ C/C++µ threadprivate -²Cþ´§hk (¥,S²äkgÄ;±ÏCþ´hk äkæ©;Cþ´ ·ê⤠´ 'é for ̽ parallel for Ì(¥ÌSCþ´hk Øäk´C¤ 3~êCþ´ (¥,S(²·Cþ´ Fortranµ threadprivate -²CþÚ common ¬´§hk 'é do ̽ parallel do Ì(¥ÌSCþ´hk parallel ½ task (¥^SÌÌSCþ3µ4dÌ SÜ(¥´hk Ûª do Ú forall ¢Ú´hk Cray U«Ù'é;êâá5 o¬¬ (U¤¥%) OpenMP ¿1?§Ä: 2009 c 12 57 / 85 êâá5{K Ø 3±e/¥§äký½Âêâá5CþNvk3êâ á5ëê¥Ñ§éù ~ ó§3êâá5ëê¥#NÑ ý½ÂCþ§¿ òù Cþýk½Âêâá5" C/C++µ for ̽ parallel for Ì(¥ÌSCþ±3 private ½ lastprivate ëê¥Ñ Fortranµ do ̽ parallel do Ì(¥ÌSCþ±3 private ½ lastprivate ëê¥Ñ parallel ½ task (¥^SÌÌSCþ±3 private! firstprivate!lastprivate!shared!½ reduction ëê¥Ñ§¿ 3 µ4(¥ÉÙ§ b½ê|±3 share ëê¥Ñ o¬¬ (U¤¥%) OpenMP ¿1?§Ä: 2009 c 12 58 / 85 Ûªêâ{K äkwªû½½Ûªû½êâá5Cþµ äkwªû½êâá5Cþ´3½(¥ Ú^¿ 3d(êâëê¥ÑCþ äkÛªû½êâá5Cþ´3½(¥ Ú^§vkýkû½êâá5§¿ vk3d(êâ á5ëê¥ÑCþ Ûªû½êâá55KXeµ 3 parallel ½ task (¥§XJ3 default ëê§@où Cþ êâá5d default ëêû½ 3 parallel (¥§XJvk default ëê§@où Cþ´ éuØ´ task (5`§XJvk default ëê§@où Cþ òldµ4þe©¥U«êâá5 3 task (¥§XJvk default ëê§@o3¤kµ4(¿ Sܵ4 parallel (¥Cþû½ 3 task (¥§XJvk default ëê êâá5vkþã 5Kû½§@oCþ´ firstprivate o¬¬ (U¤¥%) OpenMP ¿1?§Ä: 2009 c 12 59 / 85 « (¥êâá55K C/C++µ 3N^f§S¥(²·Cþ3d«¥´ Øäk´C¤ 3N^f§S¥(²3~êCþ´ Ø´3 threadprivate -¥Ñy§ÄK«¥3N^f§S¥ Ú^©½ö¶imCþ´ äkæ©;Cþ´ Ø´3 threadprivate -¥Ñy§ÄK·ê⤠´ «¥N^f§SÏLÚ^D4ªëêU«'é¢Së êêâêâá5 «¥N^f§S¥Ù§Cþ´hk Fortranµ «¥N^f§S¥(²/CþXk save á5§½êⴠЩz§ 3 threadprivate -¥ÑyCþ´ «¥N^f§S¥Ú^ common ¬½ module ¥Cþ´ §Ø´3 threadprivate -¥Ñy «¥N^f§S¥åëêXÏLÚ^D4§@oòU« 'é¢Sëêêâá5 Cray U«Ù'é;êâá5 «¥N^f§S¥Ûª do Ú forall ¢Ú9(²Ù§/C þ´hk o¬¬ (U¤¥%) OpenMP ¿1?§Ä: 2009 c 12 60 / 85 threadprivate - threadprivate -²Cþ´§hk C/C++µ § #pragma omp threadprivate(list) new−line ¦ ¥ Fortranµ § !$omp threadprivate( list ) ¦ ¥ o¬¬ (U¤¥%) OpenMP ¿1?§Ä: 2009 c 12 61 / 85 threadprivate -ã« § real(8), save :: A(100), B integer, save :: C ··· !$omp threadprivate(a, b, c) ¦ ¥ o¬¬ (U¤¥%) OpenMP ¿1?§Ä: 2009 c 12 62 / 85 êâá5ëê defaultµ%@ª C/C++µ § default(shared | none) ¦ ¥ Fortranµ § default(private | firstprivate | shared | none) ¦ ¥ sharedµ§ privateµ§hk firstprivateµ§hk§Ùm©d(l©CþU« lastprivateµ§hk§¿3d«(å#©Cþ reductionµ(²éCþ?15ö C/C++µ § reduction(operator: list ) ¦ ¥ Fortranµ § reduction({operator | intrinsic_procedure_name}:list) o¬¬ ( ¦U¤¥%) OpenMP ¿1?§Ä: 2009 c 12 63 / 85¥ êâEëê copyinµ31 parallel (§ò̧§hkCþE ¤kÙ§§ § copyin( list ) ¦ ¥ copyprivateµJø«Å±òÛª?Öêâ¸,hkC þ2Âd parallel (¥Ù§ê⸠§ copyprivate( list ) ¦ ¥ o¬¬ (U¤¥%) OpenMP ¿1?§Ä: 2009 c 12 64 / 85 §i@ §i@5KXe ó«vi@µ43ó!wª task!critical! ordered ½ master « barrier «vi@µ43ó!wª task!critical! ordered ½ master « master «vki@µ43ó½wª task « ordered «vi@µ43critical ½wª task « ordered «7Li@µ43äk ordered ëêÌ«½¿1 Ì«¥ critical «vi@£µ4½Ù§¤3äkÓ¶i critical «¥§d¿Ø¿©Uk£ o¬¬ (U¤¥%) OpenMP ¿1?§Ä: 2009 c 12 65 / 85 $1¥½Â C/C++µ 3Þ© omp.h ¥½Â§d©½Â µ ¤k OpenMP f§S. omp_lock_t a. omp_nest_lock_t a. omp_sched_t a. Fortranµ Jø F77 Þ© omp_lib.h Ú F90 module © omp_lib§½  µ ¤k OpenMP f§S integer parameter omp_lock_kind integer parameter omp_nest_lock_kind integer parameter omp_sched_kind integer parameter openmp_version o¬¬ (U¤¥%) OpenMP ¿1?§Ä: 2009 c 12 66 / 85 1¸f§S omp_set_num_threadsµ§ê omp_get_num_threadsµ¼§ê omp_get_max_threadsµ¼?§ê omp_get_thread_numµ¼?§Ò omp_get_num_procsµ¼?nìê omp_in_parallelµä´Ä3 parallel « omp_set_dynamicµ´Ä#NÄN§ê omp_get_dynamicµ¼´Ä#NÄN§ê omp_set_nestedµ´Ä#Ni@ omp_get_nestedµ¼´Ä#Ni@ omp_set_scheduleµNÝüÑ omp_get_scheduleµ¼NÝüÑ omp_get_thread_limitµ¼§S#N§ê omp_set_max_active_levelsµ#Ni@-¹ ê omp_get_max_active_levelsµ¼#Ni@-¹ ê omp_get_levelµ¼ci@ parallel « ? omp_get_ancestor_thread_numµ¼I§½öc§Ò omp_get_team_sizeµ¼,i@ ?I½c§#N|§ê omp_get_active_levelµ¼i@¹ parallel «Ò o¬¬ (U¤¥%) OpenMP ¿1?§Ä: 2009 c 12 67 / 85 £f§S OpenMP $1¥¹X;^å£f§S§±^uÓÚ "ù £f§Sö@ ^ OpenMP £CþL« OpenMP £"OpenMP £ Uù f§S§Ù{ª OpenMP £ªØ(@" OpenMP £k uninitialized!unlocked ½ locked G"X£3 unlocked G§?ÖT£ locked G"d£?Öò Pkd£§Pkd£?Öd£ unlocked G"?ÖE áu,?Ö£´Ã{y" OpenMP |±ü«a.£µ{ü£Úi@£"i@£Ó ?Ö3E cõg¶{ü£3áu?Ö}Á2gÿò Ø"{ü£Cþ{ü£'é§ UD4{ü£f§S" i@£Cþi@£'駿 UD4i@£f§S" z£f§S£GÚ¤köå3df§S¥£ã"XJù åvk1§ù f§S1´A½" OpenMP £f§S£CþÏL±eª?1µ§o´ÖÚ# £Cþc"£f§SV¹#£flush¤L§¶éù £Cþ ÖÚ#7LUìäkf5£atomic¤#ö5¢y§¿ØI OpenMP ¹w«#-5y£Cþ3ØÓ?Ö¥5" o¬¬ (U¤¥%) OpenMP ¿1?§Ä: 2009 c 12 68 / 85 {ü£f§S omp_init_lockµÐ©z{ü£ omp_destroy_lockµ¤{ü£ omp_set_lockµ{ü£ omp_unset_lockµE {ü£ omp_test_lockµÿÁ{ü£§XJ3K o¬¬ (U¤¥%) OpenMP ¿1?§Ä: 2009 c 12 69 / 85 i@£f§S omp_init_nest_lockµÐ©zi@£ omp_destroy_nest_lockµ¤i@£ omp_set_nest_lockµi@£ omp_unset_nest_lockµE i@£ omp_test_nest_lockµÿÁi@£§XJ3K o¬¬ (U¤¥%) OpenMP ¿1?§Ä: 2009 c 12 70 / 85 omp_init_lock Ú omp_init_nest_lock ©OЩz{ü£Úi@£ C/C++µ § void omp_init_lock(omp_lock_t ∗lock); void omp_init_nest_lock(omp_nest_lock_t ∗lock); ¦ ¥ Fortranµ § subroutine omp_init_lock(svar) integer (kind=omp_lock_kind) svar subroutine omp_init_nest_lock(nvar) integer (kind=omp_nest_lock_kind) nvar ¦ ¥ o¬¬ (U¤¥%) OpenMP ¿1?§Ä: 2009 c 12 71 / 85 omp_destroy_lock Ú omp_destroy_nest_lock ©O¤{ü£Úi@£ C/C++µ § void omp_destroy_lock(omp_lock_t ∗lock); void omp_destroy_nest_lock(omp_nest_lock_t ∗lock); ¦ ¥ Fortranµ § subroutine omp_destroy_lock(svar) integer (kind=omp_lock_kind) svar subroutine omp_destroy_nest_lock(nvar) integer (kind=omp_nest_lock_kind) nvar ¦ ¥ o¬¬ (U¤¥%) OpenMP ¿1?§Ä: 2009 c 12 72 / 85 omp_set_lock Ú omp_set_nest_lock ©O{ü£Úi@£ C/C++µ § void omp_set_lock(omp_lock_t ∗lock); void omp_set_nest_lock(omp_nest_lock_t ∗lock); ¦ ¥ Fortranµ § subroutine omp_set_lock(svar) integer (kind=omp_lock_kind) svar subroutine omp_set_nest_lock(nvar) integer (kind=omp_nest_lock_kind) nvar ¦ ¥ o¬¬ (U¤¥%) OpenMP ¿1?§Ä: 2009 c 12 73 / 85 omp_unset_lock Ú omp_unset_nest_lock ©OE {ü£Úi@£ C/C++µ § void omp_unset_lock(omp_lock_t ∗lock); void omp_unset_nest_lock(omp_nest_lock_t ∗lock); ¦ ¥ Fortranµ § subroutine omp_unset_lock(svar) integer (kind=omp_lock_kind) svar subroutine omp_unset_nest_lock(nvar) integer (kind=omp_nest_lock_kind) nvar ¦ ¥ o¬¬ (U¤¥%) OpenMP ¿1?§Ä: 2009 c 12 74 / 85 omp_test_lock Ú omp_test_nest_lock ©OÿÁ{ü£Úi@£§XJ3K C/C++µ § int omp_test_lock(omp_lock_t ∗lock); int omp_test_nest_lock(omp_nest_lock_t ∗lock); ¦ ¥ Fortranµ § logical function omp_test_lock(svar) integer (kind=omp_lock_kind) svar integer function omp_test_nest_lock(nvar) integer (kind=omp_nest_lock_kind) nvar ¦ ¥ o¬¬ (U¤¥%) OpenMP ¿1?§Ä: 2009 c 12 75 / 85 mf§S JøOì omp_get_wtimeµ¼±¦ü cm C/C++µ § double omp_get_wtime(void); ¦ ¥ Fortranµ § double precision function omp_get_wtime() ¦ ¥ omp_get_wtickµ¼Oì°Ý C/C++µ § double omp_get_wtick(void); ¦ ¥ Fortranµ § double precision function omp_get_wtick() ¦ ¥ o¬¬ (U¤¥%) OpenMP ¿1?§Ä: 2009 c 12 76 / 85 mf¼êÞ~ |^m¼ê±O§S¥,ãOs¤m C/C++µ § double start, end; start = omp_get_wtime(); ... work to be timed ... end = omp_get_wtime(); printf ("Work␣took␣%f␣seconds\n", end − start); ¦ ¥ Fortranµ § double precision start, end start = omp_get_wtime() ... work to be timed ... end = omp_get_wtime() print ∗, "Work␣took", end − start, "seconds" ¦ ¥ o¬¬ (U¤¥%) OpenMP ¿1?§Ä: 2009 c 12 77 / 85 ¸Cþ |^¸Cþ± OpenMP §S$1 OMP_SCHEDULEµ run-sched-var S3Cþ±$1 NÝa.Ú¿1¬§ static!dynamic!guided ½ auto OMP_NUM_THREADSµ nthreads-var S3Cþ± §ê OMP_DYNAMICµ dyn-var S3Cþ±´Ä#NÄ N§ OMP_NESTEDµ nest-var S3Cþ±´Ä#Ni@ OMP_STACKSIZEµ stacksize-var S3Cþ±æÒ OMP_WAIT_POLICYµ wait-policy-var S3Cþ± § üÑ OMP_MAX_ACTIVE_LEVELSµ max-active-levels-var S 3Cþ±#Ni@-¹¿1« OMP_THREAD_LIMITµ thread-limit-var S3Cþ± 3 OpenMP §S¥#N§©ê o¬¬ (U¤¥%) OpenMP ¿1?§Ä: 2009 c 12 78 / 85 ¸Cþ{ I3§S$1c§§S$1 2òØå^§±eA«~ shell e{µ cshµ setenv OMP_SCHEDULE dynamic sh!bash!kshµ export OMP_SCHEDULE=dynamic DOSµ set OMP_SCHEDULE=dynamic o¬¬ (U¤¥%) OpenMP ¿1?§Ä: 2009 c 12 79 / 85 OpenMP MPI ·Ü¿1?§ MPI ¿1?§Sܱ21 OpenMP ¿1§±éÜ?1¿1O §^u!:SÜS§!:m©ÙªSXÚ § program main use omp_lib use mpi implicit none integer, save:: iErr, MyID, NumNodes integer i, j call mpi_init(ierr) call mpi_comm_rank(mpi_comm_world, MyID ,iErr) call mpi_comm_size(mpi_comm_world, NumNodes, iErr) print∗,'MPI:␣My␣ID:␣', MyID, ',␣Number␣of␣Processes:␣', NumNodes !$omp parallel i=omp_get_num_threads() j=omp_get_thread_num() print∗, 'OMP:␣Thread␣Number:', j, ',␣Number␣of␣Threads:␣', i !$omp end parallel call mpi_finalize(ierr ) end ¦ ¥ o¬¬ (U¤¥%) OpenMP ¿1?§Ä: 2009 c 12 80 / 85 OpenMP §S3 Unix/Linux e?È 3 Unix/Linux ²e~ GCC!Intel!PGI ?Èì?Ȫ Cµ GCCµgcc -o prog-omp -fopenmp prog-omp.c Intelµicc -o prog-omp -openmp prog-omp.c PGIµpgcc -o prog-omp -mp prog-omp.c C++µ GCCµg++ -o prog-omp -fopenmp prog-omp.cpp Intelµicpc -o prog-omp -openmp prog-omp.cpp PGIµpgCC -o prog-omp -mp prog-omp.cpp Fortranµ GCCµgfortran -o prog-omp -fopenmp prog-omp.f90 Intelµifort -o prog-omp -openmp prog-omp.f90 PGIµpgf90 -o prog-omp -mp prog-omp.f90 ?Èì´l,m©|± OpenMP §X GCC l 4.3.2 m© é OpenMP MPI ·Ü?§§S§ò?È·-¤éA MPI ?È·-= o¬¬ (U¤¥%) OpenMP ¿1?§Ä: 2009 c 12 81 / 85 Unix/Linux e$1 $1µ csh!tcshµ setenv OMP_NUM_THREADS 8 prog-omp >log & bash!ksh!zshµ export OMP_NUM_THREADS=8 prog-omp >log & é OpenMP MPI ·Ü?§§S1§ OpenMP ¸Cþ ^ MPI §S$1ª$1= éõXÚæ^NÝXÚ+n^r§dIÏLN ÝXÚ5$1§IëwéANÝXÚÃþ o¬¬ (U¤¥%) OpenMP ¿1?§Ä: 2009 c 12 82 / 85 ¿1Ç 1 S = fpar/P + (1 − fpar) Amdahl ½Æ§Ù¥ P ´¿1Øê§fpar ¿1Ü©G1Ó^ Omz©' b fpar = 80%§@o 16 Ú 32 Ø¿1Ç´õº 16 ص4 32 ص4.4 ¦Uõ/¿1zè§AO´ÑÜ© o¬¬ (U¤¥%) OpenMP ¿1?§Ä: 2009 c 12 83 / 85 9Ï^óä §S5U©Ûµ Intel VTuneµ http://software.intel.com/en-us/intel-vtune/ AMD CodeAnalystµ http://developer.amd.com/cpu/CodeAnalyst/ §S è(©Ûµ Understandµ http://www.scitools.com/products/understand/ o¬¬ (U¤¥%) OpenMP ¿1?§Ä: 2009 c 12 84 / 85 éXª ¥I¥%µ ̵http://scc.ustc.edu.cn >{µ0551-3602248 &µ[email protected] U¤¥%µ c̵http://124.16.151.186 ò5¶µhttp://scc.qibebt.cas.cn >{µ0532-80662613 &µ[email protected] o¬¬µ ̵http://staff.ustc.edu.cn/~hmli/ >{µ0532-80662613 &µ[email protected][email protected] HÑØÚU?¿" o¬¬ (U¤¥%) OpenMP ¿1?§Ä: 2009 c 12 85 / 85