Intel® Threading Building Blocks Tutorial

Intel® Threading Building Blocks Tutorial Document Number 319872-009US World Wide Web: http://www.intel.com Intel® Threading Building Blocks Legal Information INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL(R) PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. EXCEPT AS PROVIDED IN INTEL'S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER, AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO SALE AND/OR USE OF INTEL PRODUCTS INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT. UNLESS OTHERWISE AGREED IN WRITING BY INTEL, THE INTEL PRODUCTS ARE NOT DESIGNED NOR INTENDED FOR ANY APPLICATION IN WHICH THE FAILURE OF THE INTEL PRODUCT COULD CREATE A SITUATION WHERE PERSONAL INJURY OR DEATH MAY OCCUR. Intel may make changes to specifications and product descriptions at any time, without notice. Designers must not rely on the absence or characteristics of any features or instructions marked "reserved" or "undefined." Intel reserves these for future definition and shall have no responsibility whatsoever for conflicts or incompatibilities arising from future changes to them. The information here is subject to change without notice. Do not finalize a design with this information. The products described in this document may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request. Contact your local Intel sales office or your distributor to obtain the latest specifications and before placing your product order. Copies of documents which have an order number and are referenced in this document, or other Intel literature, may be obtained by calling 1-800-548-4725, or go to: http://www.intel.com/#/en_US_010H . Intel processor numbers are not a measure of performance. Processor numbers differentiate features within each processor family, not across different processor families. See http://www.intel.com/products/processor_number for details. BlueMoon, BunnyPeople, Celeron, Celeron Inside, Centrino, Centrino Inside, Cilk, Core Inside, E-GOLD, i960, Intel, the Intel logo, Intel AppUp, Intel Atom, Intel Atom Inside, Intel Core, Intel Inside, Intel Insider, the Intel Inside logo, Intel NetBurst, Intel NetMerge, Intel NetStructure, Intel SingleDriver, Intel SpeedStep, Intel Sponsors of Tomorrow., the Intel Sponsors of Tomorrow. logo, Intel StrataFlash, Intel vPro, Intel XScale, InTru, the InTru logo, the InTru Inside logo, InTru soundmark, Itanium, Itanium Inside, MCS, MMX, Moblin, Pentium, Pentium Inside, Puma, skoool, the skoool logo, SMARTi, Sound Mark, The Creators Project, The Journey Inside, Thunderbolt, Ultrabook, vPro Inside, VTune, Xeon, Xeon Inside, X-GOLD, XMM, X-PMU and XPOSYS are trademarks of Intel Corporation in the U.S. and/or other countries.* Other names and brands may be claimed as the property of others. Copyright (C) 2005 - 2011, Intel Corporation. All rights reserved. ii 319872-009US Introduction Optimization Notice Intel’s compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice. Notice revision #20110804 Tutorial iii Intel® Threading Building Blocks Revision History Version Version Information Date 1.21 Updated Optimization Notice 2011-Oct-27 1.20 Updated install paths. 2011-Aug-1 1.19 Fix mistake in upgrade_to_writer example. 2011-Mar-3 1.18 Add section about concurrent_vector idiom for waiting on an 2010-Sep-1 element. 1.17 Revise chunking discussion. Revise examples to eliminate parameters 2010-Apr-4 that are superfluous because task::spawn and task::destroy are now static methods. Now have different directory structure. Update pipeline section to use squaring example. Update pipeline example to use strongly-typed parallel_pipeline interface. 1.16 Remove section about lazy copying. 2009-Nov-23 1.15 Remove mention of task depth attribute. Revise chapter on tasks. Step 2009-Aug-28 parameter for parallel_for is now optional. 1.14 Type atomic<T> now allows T to be an enumeration type. Clarify zero- 2009-Jun-25 initialization of atomic<T>. Default partitioner changed from simple_partitioner to auto_partitioner. Instance of task_scheduler_init is optional. Discuss cancellation and exception handling. Describe tbb_hash_compare and tbb_hasher. iv 319872-009US Introduction Contents 11H Introduction .....................................................................................................181H 1.12H Document Structure ...............................................................................182H 1.23H Benefits ................................................................................................183H 24H Package Contents .............................................................................................384H 2.15H Debug Versus Release Libraries ................................................................385H 2.26H Scalable Memory Allocator .......................................................................486H 2.37H Windows* OS ........................................................................................487H 2.3.18H Microsoft Visual Studio* Code Examples .......................................688H 2.3.29H Integration Plug-In for Microsoft Visual Studio* Projects .................689H 2.410H Linux* OS .............................................................................................890H 2.511H Mac OS* X Systems................................................................................991H 2.612H Open Source Version ..............................................................................992H 313H Parallelizing Simple Loops ................................................................................1193H 3.114H Initializing and Terminating the Library....................................................1194H 3.215H parallel_for..........................................................................................1295H 3.2.116H Lambda Expressions ................................................................1396H 3.2.217H Automatic Chunking ................................................................1597H 3.2.318H Controlling Chunking ...............................................................1598H 3.2.419H Bandwidth and Cache Affinity....................................................1899H 3.2.520H Partitioner Summary................................................................20100H 3.321H parallel_reduce ....................................................................................21101H 3.3.122H Advanced Example ..................................................................24102H 3.423H Advanced Topic: Other Kinds of Iteration Spaces ......................................26103H 3.4.124H Code Samples.........................................................................26104H 425H Parallelizing Complex Loops..............................................................................28105H 4.126H Cook Until Done: parallel_do..................................................................28106H 4.1.127H Code Sample ..........................................................................29107H 4.228H Working on the Assembly Line: pipeline...................................................29108H 4.2.129H Using Circular Buffers ..............................................................34109H 4.2.230H Throughput of pipeline .............................................................35110H 4.2.331H Non-Linear Pipelines ................................................................35111H 4.332H Summary of Loops and Pipelines ............................................................36112H 533H Exceptions and Cancellation .............................................................................37113H 5.134H Cancellation Without An Exception ..........................................................38114H 5.235H Cancellation and Nested Parallelism ........................................................39115H 636H Containers .....................................................................................................41116H 6.137H concurrent_hash_map ..........................................................................41117H 6.1.138H More on HashCompare .............................................................43118H 6.239H concurrent_vector ................................................................................45119H 6.2.140H Clearing is Not Concurrency Safe...............................................46120H 6.2.241H Advanced Idiom: Waiting on an Element.....................................46121H 6.342H Concurrent Queue Classes .....................................................................47122H

Intel® Threading Building Blocks Tutorial

Other Apis What’S Wrong with Openmp?

Intel Advisor for Dgpu Intel® Advisor Workflows

Heterogeneous Task Scheduling for Accelerated Openmp

Getting Started with Oneapi

Michael Steyer Technical Consulting Engineer Intel Architecture, Graphics & Software Analysis Tools

Concurrent Cilk: Lazy Promotion from Tasks to Threads in C/C++

Parallel Programming

Intel® Software Products Highlights and Best Practices

Openmp API 5.1 Specification

Openmp Made Easy with INTEL® ADVISOR

Parallelism in Cilk Plus

Introduction to Openmp Paul Edmon ITC Research Computing Associate