OpenMP OFFLOAD PROGRAMMING MODEL FOR NEC SX AURORA TSUBASA

in cooperation with [logo]

MOTIVATION / GOALS

• Enable OpenMP Offloading for NEC vector engines (VE)
• Convenient usage of VEs (transparent and intuitive)
• Standard compliance
• Performance portability

CONCEPT

• Integration into the LLVM infrastructure
• Source-to-source transformation for target regions with sotoc
• Usage of Vector Engine Offloading (VEO) [1] in libomptarget [2]

OpenMP OFFLOADING

    $ clang -fopenmp -fopenmp-targets=aurora--veort-unknown saxpy.c

[Figure: Host Device and Target Device. The host runs main(){ saxpy(); }, x and y are transferred between the two devices, and the target region of saxpy() runs on the Target Device.]

    void saxpy(){
      int n = 10240; float a = 42.0f; float b = 23.0f;
      float *x, *y;
      // Allocate and initialize x, y

      // Run SAXPY
      #pragma omp target map(to:x[0:n]) map(tofrom:y[0:n])
      #pragma omp parallel for
      for (int i = 0; i < n; ++i){
        y[i] = a*x[i] + y[i];
      }
    }

SOTOC

• Prepares code for cross compilation with NEC on VEs
• Outlining of target regions
• Resolving all dependencies
• Extracts knowledge based on the abstract syntax tree (AST) generated by the clang compiler (libtooling)

    $ sotoc saxpy.c -- -fopenmp

    void __omp_offloading_28_395672b_saxpy_l8(int *__sotoc_var_n, float * y,
                                              float *__sotoc_var_a, float * x) {
      int n = *__sotoc_var_n;
      float a = *__sotoc_var_a;
      #pragma omp parallel for
      for (int i = 0; i < n; ++i){
        y[i] = a*x[i] + y[i];
      }
      *__sotoc_var_n = n;
      *__sotoc_var_a = a;
    }

EXECUTION

• Fat binary: Code for VE is integrated into the x86 binary as a separate ELF
• Aurora plugin handles communication (over VEO)
• Two different OpenMP runtimes (Host: LLVM, Device: NEC)

[Figure: software stack. Host Device: OpenMP Application, LLVM OpenMP RTL, libomptarget (Offloading Information handling), Aurora Plugin, VEO. Target Device: VE ProcHandle, Process / thread, NEC OpenMP RTL, OpenMP Application (offloaded). Between the two: Transfer of code and offloading data.]

LIMITATIONS

• Prototype
• C++ support not implemented yet
• No […] offloading
• Limited macro support

CONCLUSION

• Generic approach (source-to-source)
• Plugin for each target device required
• Prototype for NEC SX Aurora TSUBASA implemented

CONTACT

Tim Cramer
IT Center, RWTH Aachen University
[email protected]

Erich Focht
NEC HPC Europe
[email protected]

REFERENCES

[1] https://github.com/SX-Aurora/veoffload
[2] Samuel F. Antao, Alexey Bataev, Arpith C. Jacob, Gheorghe-Teodor Bercea, Alexandre E. Eichenberger, Georgios Rokos, Matt Martineau, Tian Jin, Guray Ozen, Zehra Sura, Tong Chen, Hyojin Sung, Carlo Bertolli, and Kevin O’Brien. Offloading Support for OpenMP in Clang and LLVM. In Proceedings of the Third Workshop on LLVM Compiler Infrastructure in HPC (LLVM-HPC ‘16). IEEE Press, Piscataway, NJ, USA, 2016.
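
The SAXPY listing on the poster elides allocation and initialization. A minimal complete variant might look like the sketch below. The function name `saxpy_mismatches`, the fill values 1.0f and 2.0f, and the verification loop are assumptions made for illustration and do not appear on the poster. When compiled without offloading support, the pragmas are ignored and the loop simply runs on the host.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper (not from the poster): runs y = a*x + y on n elements
 * with x[i] = 1 and y[i] = 2, then counts entries that differ from a + 2.
 * Returns the mismatch count, or -1 if allocation fails. */
int saxpy_mismatches(int n, float a)
{
    assert(n > 0);

    float *x = malloc((size_t)n * sizeof *x);
    float *y = malloc((size_t)n * sizeof *y);
    if (x == NULL || y == NULL) {
        free(x);
        free(y);
        return -1;
    }

    /* Assumed initialization; the poster only says "allocate and initialize". */
    for (int i = 0; i < n; ++i) {
        x[i] = 1.0f;
        y[i] = 2.0f;
    }

    /* Same target region as on the poster: x is copied to the device,
     * y is copied to the device and back. */
    #pragma omp target map(to:x[0:n]) map(tofrom:y[0:n])
    #pragma omp parallel for
    for (int i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }

    /* With x = 1 and y = 2 the expected result is a + 2 in every element.
     * The exact comparison assumes small integer-valued a, for which the
     * result is exactly representable in float. */
    const float expected = a + 2.0f;
    int mismatches = 0;
    for (int i = 0; i < n; ++i) {
        if (y[i] != expected) {
            ++mismatches;
        }
    }

    free(x);
    free(y);
    return mismatches;
}
```

Calling `saxpy_mismatches(10240, 42.0f)` from `main` with the poster's values is expected to return 0 regardless of whether the region runs on the host or on a VE.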

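The outlined function shown in the SOTOC section takes scalars by pointer, copies them into locals, and writes them back at the end. The following host-only sketch mimics that calling convention so it can be tried without a VE. The names `saxpy_outlined` and `call_outlined_saxpy` are hypothetical, and the direct function call stands in for the real flow, in which the poster says the Aurora plugin handles communication over VEO.

```c
#include <assert.h>

/* Hypothetical stand-in for a sotoc-outlined target region, following the
 * parameter pattern on the poster: scalars arrive as pointers, arrays as
 * plain pointers. */
void saxpy_outlined(int *sotoc_var_n, float *y, float *sotoc_var_a, float *x)
{
    /* Copy the scalars into locals so the loop body is unchanged. */
    int n = *sotoc_var_n;
    float a = *sotoc_var_a;

    #pragma omp parallel for
    for (int i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }

    /* Write the scalars back, as the generated code on the poster does. */
    *sotoc_var_n = n;
    *sotoc_var_a = a;
}

/* Hypothetical host-side caller: fills small arrays with x0 and y0, invokes
 * the outlined function directly, and returns the last element of y. */
float call_outlined_saxpy(float a, float x0, float y0)
{
    enum { N = 8 };
    float x[N];
    float y[N];
    int n = N;

    for (int i = 0; i < N; ++i) {
        x[i] = x0;
        y[i] = y0;
    }

    saxpy_outlined(&n, y, &a, x);

    /* The region does not modify n, so the written-back value is unchanged. */
    assert(n == N);
    return y[N - 1];
}
```

Passing scalars by pointer keeps a single signature shape for every mapped variable, at the cost of an explicit copy-in and copy-out inside the outlined function.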
https://www.jara.org/de/forschung/jara-hpc