
hpxMP, An Implementation of OpenMP Using HPX
Phylanx Year 3 Meeting
Tianyi Zhang
LSU, Baton Rouge, LA, Nov. 7th. 2019

Contributors
[Figure: GitHub contributors graph for STEllAR-GROUP/hpxMP, 2014 to 2019, captured 10/28/2019]
https://github.com/STEllAR-GROUP/hpxMP/graphs/contributors

Outline
• An Overview of hpxMP
• Performance Comparison
  • Daxpy Benchmark, Barcelona OpenMP Task Suite
  • llvm-OpenMP, GOMP, hpxMP
• Progress

Why hpxMP is Needed
[Diagram labels: Phylanx, HPX, Blaze, #pragma omp parallel for, hpxMP, OpenMP]

Layers of hpxMP and OpenMP implementation
• User Layer: End User and Application
• OpenMP Program Layer: Directives, Compiler; OpenMP Library Function; Environment Variable
• Create HPX thread
• HPX: Support for user-level threading
• System Layer: OS/system support for shared memory and threading
• Schedule HPX thread
• Machine: Processor1, Processor2, ..., ProcessorN
Source: https://www.openmp.org/wp-content/uploads/Intro_To_OpenMP_Mattson.pdf

Use hpxMP underneath
Two steps:
• Compile your program as an OpenMP program:
    clang++/g++ -fopenmp MyCode.cpp -o MyEXE
• Run it with the hpxMP library preloaded:
    OMP_NUM_THREADS=N LD_PRELOAD=PATH…/libhpxmp.so ./MyEXE
[Diagram: an OS thread calls hpx::start; #pragma omp parallel and #pragma omp task are each mapped to HPX threads]

Examples And Implementation

Examples: #pragma omp task

    #pragma omp parallel
    {
        #pragma omp single
        {
            printf("A ");
            #pragma omp task
            { printf("race "); }
            #pragma omp task
            { printf("car "); }
            #pragma omp taskwait
            printf("is fun to watch ");
        } // end omp single
    } // end omp parallel

[Diagram labels: HPX Scheduler, hpx::applier::register_thread_nullary]
Possible outputs:
    A car race is fun to watch
OR
    A race car is fun to watch
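
The task example above prints directly to stdout, which makes its two possible outputs awkward to check in a test. A minimal sketch, assuming a hypothetical helper named race_car_words (not part of hpxMP or the slides), keeps the same task structure but collects the words in a vector. It uses only standard OpenMP pragmas, so it should behave the same under any conforming OpenMP runtime, and it runs serially when compiled without -fopenmp.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not from the slides): same parallel/single/task/taskwait
// structure as the example above, but each word is appended to a shared vector
// instead of being printed.
std::vector<std::string> race_car_words()
{
    std::vector<std::string> words;
#pragma omp parallel
    {
#pragma omp single
        {
            // Runs before either task is created, so "A" is always first.
            words.push_back("A");
#pragma omp task shared(words)
            {
                // The two tasks may run concurrently; serialize the appends.
#pragma omp critical
                words.push_back("race");
            }
#pragma omp task shared(words)
            {
#pragma omp critical
                words.push_back("car");
            }
            // Wait for both child tasks before appending the ending.
#pragma omp taskwait
            words.push_back("is fun to watch");
        }
    }
    return words;
}
```

The first and last entries are fixed by the program order and the taskwait; only the order of "race" and "car" depends on how the runtime schedules the two tasks.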

A Simple OpenMP Program
[Code listing not preserved in this text version]
Note: x and sum were initialized before the for loop.
Source: https://www.openmp.org/wp-content/uploads/Intro_To_OpenMP_Mattson.pdf

Performance?
hpxMP vs llvm-OpenMP vs GCC OpenMP
• v0.1.0: Nov. 2018
• v0.2.0: May 2019
• Current Version: Nov. 2019

Performance Comparison
Hardware and Software Configuration
http://www.hpc.lsu.edu/resources/hpc/system.php?system=QB2

Performance Comparison
Daxpy Benchmark
• #pragma omp parallel for
• vector<float> b = vector<float> a * float c + vector<float> b
• Vector Size: 10^3 to 10^6
• Thread Pool Employed in Current Version

Performance Comparison
SORT Benchmark
• #pragma omp task
• #pragma omp taskwait
• OPERATION: SORT ARRAY
• OBJECT: 10^7 32-bit numbers
• CUT OFF: 10 to 10^7. (The cut-off value determines when to switch to a serial quicksort instead of recursively dividing the array into 4 portions, each handled by a newly created task.)

Implementation
Thread and Task Synchronization
• The Latch class is a downward counter which can be used to synchronize threads.
• The value of the counter is initialized on creation.
• Threads may block on the latch until the counter is decremented to zero, or simply decrement the counter.

Implementation
#pragma omp task
[Figure not preserved in this text version]

OpenMP Performance Toolkit (OMPT)
OMPT is an application programming interface (API) for first-party performance tools. It makes it possible to construct powerful tools that support OpenMP implementations.
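
The latch behaviour described under Thread and Task Synchronization above can be sketched with standard C++ primitives. This is an illustrative stand-in and not the hpx::latch implementation that hpxMP relies on; the class name SimpleLatch and its member names are assumptions chosen for this sketch.

```cpp
#include <condition_variable>
#include <cstddef>
#include <mutex>

// Illustrative stand-in for a counting latch (hypothetical; not hpx::latch).
class SimpleLatch
{
public:
    // The counter value is fixed when the latch is created.
    explicit SimpleLatch(std::ptrdiff_t count) : count_(count) {}

    // Decrement the counter without blocking; wake all waiters at zero.
    // Calls made after the counter has reached zero are ignored.
    void count_down()
    {
        std::lock_guard<std::mutex> lock(mutex_);
        if (count_ > 0 && --count_ == 0)
            cv_.notify_all();
    }

    // Block the caller until the counter has reached zero.
    void wait()
    {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [this] { return count_ == 0; });
    }

    // Decrement, then block until every participant has arrived.
    void count_down_and_wait()
    {
        count_down();
        wait();
    }

    // Non-blocking check of whether the counter is zero.
    bool is_ready()
    {
        std::lock_guard<std::mutex> lock(mutex_);
        return count_ == 0;
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    std::ptrdiff_t count_;
};
```

In a team of N threads, constructing SimpleLatch with N and having each thread call count_down_and_wait() gives barrier-like behaviour, while a thread that only needs to signal completion can call count_down() and continue.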

Pragmas Implemented
Compared to v3.0
• #pragma omp parallel
• #pragma omp for
• #pragma omp sections
• #pragma omp single
• #pragma omp task
• #pragma omp master
• #pragma omp critical
• #pragma omp barrier
• #pragma omp taskwait
• #pragma omp atomic
• #pragma omp flush
• #pragma omp ordered
• #pragma omp threadprivate
OpenMP v4.0
• #pragma omp task depend
OpenMP v5.0
• #pragma omp task reduction

Runtime Library Implemented
Compared to v3.0
• omp_set_num_threads
• omp_get_num_threads
• omp_get_max_threads
• omp_get_thread_num
• omp_get_num_procs
• omp_in_parallel
• omp_set_dynamic
• omp_get_dynamic
• omp_set_nested
• omp_get_nested
• omp_get_wtime
• omp_get_wtick
• omp_set_schedule
• omp_get_schedule
• omp_get_thread_limit
• omp_set_max_active_levels
• omp_get_max_active_levels
• omp_get_level
• omp_get_ancestor_thread_num
• omp_get_team_size
• omp_get_active_level
• omp_init_lock
• omp_init_nest_lock
• omp_destroy_lock
• omp_destroy_nest_lock
• omp_set_lock
• omp_set_nest_lock
• omp_unset_lock
• omp_unset_nest_lock
• omp_test_lock
• omp_test_nest_lock

Progress
• Optimize performance by introducing hpx::latch, boost::intrusive_ptr
• Automate the collection and visualization of performance data
• Automate unit testing with CMake
• Enable GCC compiler support
• Implement OpenMP Performance Toolkit (OMPT)
• Implement the most recent OpenMP 5.0 features
• Update documentation
• Discover and resolve bugs

Publications and Talks
• Zhang, Tianyi, Shahrzad Shirzad, Patrick Diehl, R. Tohid, Weile Wei, and Hartmut Kaiser. "An Introduction to hpxMP: A Modern OpenMP Implementation Leveraging HPX, An Asynchronous Many-Task System." In Proceedings of the International Workshop on OpenCL, p. 13. ACM, 2019.
• Oct. 2018, Seminar, Introduction to hpxMP, LSU, Baton Rouge, LA. https://www.youtube.com/watch?v=ajDGWPDrcxU&list=PL7vEgTL3FalbVFwzkXLHpBRKlcJNULW1g&index=12
• Feb. 2019, Presentation, Scala, New Orleans, LA.
• May 2019, Lightning Talk, CppNow19, Aspen, CO. https://www.youtube.com/watch?v=SI0eyXydL3M&t=79s
• May 2019, Paper Presentation, IWOCL, Boston, MA.
• Sept. 2019, Lightning Talk, CppCon19, Aurora, CO.

Questions?