Intel Threading Building Blocks (TBB)

Intel Threading Building Blocks (TBB)

Intel Threading Building Blocks (TBB) Qingzhao Zhu UTK CS462 project What is TBB? • TBB is a library that supports scalable parallel programming using standard C++ code. ▫ Specify logical parallelism instead of threads ▫ Target threading for robust performance ▫ Emphasize on scalable, data-parallel programming ▫ Shared memory ▫ Portable and open source Overview Intel TBB parallel_for Generic Parallel Algorithms Memory Allocation Concurrent Task Scheduler Containers Synchronization Primitives Flow Graph Code Example Parallel code: #include "tbb/tbb.h" using namespace tbb; Sequential code: class ApplyFoo { float *const my_a; public: void SerialApplyFoo( float a[ ], size_t n) { void operator() ( const blocked_range<size_t>& range ) const { for( size_t i=0; i!=n; ++i) Body object float *a = my_a; Foo(a[i]); for( size_t i = range.begin(); i!= range.end(); ++i ) } Foo(a[i]); } ApplyFoo( float a[] ) : my_a(a) {} }; void ParallelApplyFoo( float a[], size_t n ) { parallel_for ( blocked_range<size_t>(0,n), ApplyFoo(a) ); } Sample node from: https://software.intel.com/en-us/node/506057 Dynamic Task Scheduler – Task Stealing Work Depth First Steal Breadth First (Kim and Voss, 2011) Is TBB right for you? Distributed MPI / OpenSHMEM Memory Parallel Direct Programming PThreads Programming Shared Memory Fortran / C OpenMP Abstraction Bounded Loops OpenMP Balanced tasks C++ Nested and Recursive Parallelism Unbalanced tasks Object-oriented / templated C++ code TBB Concurrent data structure C++ user-defined types TBB vs. OpenMP Performance Matrix Multiplication (8 cores) Fibonacci 47 (cutoff 2048) (Ali et al., 2012) (Tousimojarad and Vanderbauwhede, 2014) Fixed number of loop iterations Nested and Recursive Parallelism TBB Equally distributed tasks OpenMP Unbalanced tasks More information: • Tutorial: ▫ https://www.threadingbuildingblocks.org/docs/help/index.htm#reference/threads.html ▫ https://www.threadingbuildingblocks.org/intel-tbb-tutorial ▫ Reinders, James. Intel threading building blocks: outfitting C++ for multi-core processor parallelism. " O'Reilly Media, Inc.", 2007. • Concise introduction: ▫ Kim, Wooyoung, and Michael Voss. "Multicore desktop programming with intel threading building blocks." IEEE software 28.1 (2011): 23-31. ▫ http://www.cs.cmu.edu/afs/cs/academic/class/15499-s09/www/handouts/TBB-HPCC07.pdf • Performance test: ▫ Ali, Akhtar, Usman Dastgeer, and Christoph Kessler. "OpenCL for programming shared memory multicore CPUs." Proceedings of the 5th Workshop on MULTIPROG. 2012. ▫ Tousimojarad, Ashkan, and Wim Vanderbauwhede. "Comparison of Three Popular Parallel Programming Models on the Intel Xeon Phi." Euro-Par Workshops (2). 2014. • Task Scheduling optimization: ▫ Robison, Arch, Michael Voss, and Alexey Kukanov. "Optimization via reflection on work stealing in TBB." Parallel and Distributed Processing, 2008. IPDPS 2008. IEEE International Symposium on. IEEE, 2008. ▫ Contreras, Gilberto, and Margaret Martonosi. "Characterizing and improving the performance of intel threading building blocks." Workload Characterization, 2008. IISWC 2008. IEEE International Symposium on. IEEE, 2008..

View Full Text

Details

  • File Type
    pdf
  • Upload Time
    -
  • Content Languages
    English
  • Upload User
    Anonymous/Not logged-in
  • File Pages
    8 Page
  • File Size
    -

Download

Channel Download Status
Express Download Enable

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

  • Not to be reproduced or distributed without explicit permission.
  • Not used for commercial purposes outside of approved use cases.
  • Not used to infringe on the rights of the original creators.
  • If you believe any content infringes your copyright, please contact us immediately.

Support

For help with questions, suggestions, or problems, please contact us