Legal Disclaimer
INFORMATION IN THIS DOCUMENT IS PROVIDED “AS IS”. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. INTEL ASSUMES NO LIABILITY WHATSOEVER AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO THIS INFORMATION INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT.
Performance tests and ratings are measured using specific computer systems and/or components and reflect the approximate performance of Intel products as measured by those tests. Any difference in system hardware or software design or configuration may affect actual performance. Buyers should consult other sources of information to evaluate the performance of systems or components they are considering purchasing. For more information on performance tests and on the performance of Intel products, reference www.intel.com/software/products.
Intel, Intel Core and the Intel logo are trademarks of Intel Corporation in the U.S. and other countries. Ct: A New Paradigm for Data Parallel Computing *Other names and brands may be claimed as the property of others. Hans–Christian Hoppe Intel Visual Computing Institute, Intel Labs Copyright © 2009. Intel Corporation. using material from http://intel.com/software/products
Anwar Ghuloum, CJ Newburn, Michael McCool and Stefanus Du Toit Performance and Productivity Libraries, Developer Products Division, Software and Services Group Software & Services Group, Developer Products Division Software & Services Group, Developer Products Division
Copyright © 2009, Intel Corporation. All rights reserved. Copyright © 2009, Intel Corporation. All rights reserved. *Other brands and names are the property of their respective owners. *Other brands and names are the property of their respective owners. 2
Contents Challenge: Multiple Parallelism Mechanisms
Ct 101 Today’s parallel platforms have many kinds of – What is Ct, and what value does it provide? parallelism: – Basic language elements and examples • Pipelining • SIMD within a register (SWAR) vectorization Ct Next Steps – Where to go from here • Superscalar instruction issue or VLIW • Overlapping memory access with computation (prefetch) – Towards a parallel virtual machine • Simultaneous multithreading (hyperthreading) on one core •Multiple cores How to learn more • Multiple processors • Asynchronous host and accelerator execution
Software & Services Group, Developer Products Division Software & Services Group, Developer Products Division
Copyright © 2009, Intel Corporation. All rights reserved. Copyright © 2009, Intel Corporation. All rights reserved. *Other brands and names are the property of their respective owners. 3 *Other brands and names are the property of their respective owners. 4 Automatically Select the Right Mechanisms What is Ct Technology?
Solution: take a single abstract specification of latent parallelism and data • A generalized data parallel programming solution that locality and use automation to transform it into multiple implementations frees application developers from dependencies on that can exploit all these mechanisms. particular hardware architectures. User specifies: • A system that integrates with existing development tools to allow parallel algorithms to be specified at a high level. • A dynamic compiler and runtime that translates high-level Platform implements: specifications of computations into efficient parallel Example implementation uses: •Two cores implementations that can take advantage of both SIMD •Four-way vectorization and thread-level parallelism, as well as accelerators. • Memory latency hiding with streaming • A system that allows an application developer to combine performance, portability, and productivity. Actual distribution of work depends on hardware
Software & Services Group, Developer Products Division Software & Services Group, Developer Products Division
Copyright © 2009, Intel Corporation. All rights reserved. Copyright © 2009, Intel Corporation. All rights reserved. *Other brands and names are the property of their respective owners. 5 *Other brands and names are the property of their respective owners. 6
What value does it provide? The Ct Runtime C++ APIs Application Other Language Single source calling Ct APIs Single source Productivity •Intel Ct Technology Bindings ..IntegratesIntegrates withwith existingexisting toolstools offers a standards ..Applicable to many problem domains compliant C++ library… ..Safe by default: maintainable …backed by a runtime CPU with CPU Compute Performance Co-processor SSESSE 44 Co-processor ..Efficient and scalable •Runtime generates and LRBniLRBni ..Harnesses both vectors and threads manages threads and ..Eliminates modularity overhead of C++ vector code, via – Machine independent optimization AVXAVX Hybrid Portability ProcessorProcessor ..High-level abstraction – Offload management ..Hardware independent – Machine specific code ..ForwardForward scalingscaling generation and optimizations – Scalable threading runtime (based on TBB!) CPU Accelerator Future
Software & Services Group, Developer Products Division Software & Services Group, Developer Products Division
Copyright © 2009, Intel Corporation. All rights reserved. Copyright © 2009, Intel Corporation. All rights reserved. 7 8 *Other brands and names are the property of their respective owners. *Other brands and names are the property of their respective owners. 8 Ct Language Introduction What can it be used for?
Bioinformatics Visual computing • Genomics and sequence analysis • Digital content creation (DCC) • Molecular dynamics • Physics engines and advanced rendering Engineering design • Visualization • Finite element and finite difference simulation • Compression/decompression • Monte Carlo simulation Signal and image processing Financial analytics • Computer vision • Option and instrument pricing • Radar and sonar processing • Risk analysis • Microscopy and satellite image processing Oil and gas Science and research • Seismic reconstruction • Machine learning and artificial intelligence • Reservoir simulation • Climate and weather simulation • Planetary exploration and astrophysics Medical imaging Enterprise • Image and volume reconstruction Database search • Analysis and computer aided detection (CAD) • • Business information
Software & Services Group, Developer Products Division Software & Services Group, Developer Products Division
Copyright © 2009, Intel Corporation. All rights reserved. Copyright © 2009, Intel Corporation. All rights reserved. *Other brands and names are the property of their respective owners. 9 *Other brands and names are the property of their respective owners. 10
Data Spaces Ct Data Objects
The basic type in Ct is the vector, named as Vec C/C++ space Ct space – Vec-s are managed by the Ct runtime – Vec-s are single-assignment vectors – Vec-s are (opaquely) flat, multidimensional, sparse, or nested – Vec values are created & manipulated exclusively through Ct API
copyin Declared Vecs are simply references to immutable values C(++)
Vec
Software & Services Group, Developer Products Division Software & Services Group, Developer Products Division
Copyright © 2009, Intel Corporation. All rights reserved. Copyright © 2009, Intel Corporation. All rights reserved. *Other brands and names are the property of their respective owners. 11 *Other brands and names are the property of their respective owners. 12 Moving Data In and Out of Ct Ct Operators - Vector Element-wise operators
Bind data with Ct name using Vec constructors Unary Operators Vec
• Pass-by-reference – means both copying in and copying out Ternary Operators A = select( mask, B, C ); • Dynamic Compiler tries to recognize pure copying out A = select( mask, B, 0.f );
Software & Services Group, Developer Products Division Software & Services Group, Developer Products Division
Copyright © 2009, Intel Corporation. All rights reserved. Copyright © 2009, Intel Corporation. All rights reserved. *Other brands and names are the property of their respective owners. 13 *Other brands and names are the property of their respective owners. 14
Ct Operators - Vector Reduce / Scan Ct Operators - Vector Permutation Operators
Reductions (e.g. aggregation, collective communication) Shift Default Value // Sum all the element of B A = shift(B, 1); A = addReduce(B); A = shiftSticky(B, 1); // Some common cases for BOOLEAN Vec
// calculate the prefix sum of B A = B [ vIndex ]; A = addScan(B); A = scatter( B, vIndex, C );
Software & Services Group, Developer Products Division Software & Services Group, Developer Products Division
Copyright © 2009, Intel Corporation. All rights reserved. Copyright © 2009, Intel Corporation. All rights reserved. *Other brands and names are the property of their respective owners. 15 *Other brands and names are the property of their respective owners. 16 A Simple Example: Dot Product A More Complex Example: Porting Black-Scholes Black‐Scholes Using C Loops Black‐Scholes Using Ct 1 #include
Software & Services Group, Developer Products Division Software & Services Group, Developer Products Division
Copyright © 2009, Intel Corporation. All rights reserved. Copyright © 2009, Intel Corporation. All rights reserved. *Other brands and names are the property of their respective owners. 17 *Other brands and names are the property of their respective owners. 18
Functions Remote calls: Invoke Functions from C/C++ Space • A Ct Function is a C++ function that – takes one or more Vec, Elt (a Vec element), or scalars as arguments void BlackScholes(Vec
Software & Services Group, Developer Products Division Software & Services Group, Developer Products Division
Copyright © 2009, Intel Corporation. All rights reserved. Copyright © 2009, Intel Corporation. All rights reserved. *Other brands and names are the property of their respective owners. 21 *Other brands and names are the property of their respective owners. 22 Ct Dynamic Engine Execution Ct Dynamic Engine Execution
intint ar_a[1024], ar_a[1024], ar_b[1024] ar_b[1024] intint ar_a[1024], ar_a[1024], ar_b[1024] ar_b[1024]
Vec
MemoryMemory Manager Manager a b Ct Dynamic Ct Dynamic Engine Engine
Software & Services Group, Developer Products Division Software & Services Group, Developer Products Division
Copyright © 2009, Intel Corporation. All rights reserved. Copyright © 2009, Intel Corporation. All rights reserved. *Other brands and names are the property of their respective owners. 23 *Other brands and names are the property of their respective owners. 24
Ct Dynamic Engine Execution Ct Dynamic Engine Execution
int ar_a[1024], ar_b[1024] int ar_a[1024], ar_b[1024] int ar_a[1024], ar_b[1024] int ar_a[1024], ar_b[1024]
Vec
+ voidvoid work( work( Vec
MemoryMemory Manager Manager MemoryMemory Manager Manager a a b b Ct Dynamic Ct Dynamic Engine Engine
Software & Services Group, Developer Products Division Software & Services Group, Developer Products Division
Copyright © 2009, Intel Corporation. All rights reserved. Copyright © 2009, Intel Corporation. All rights reserved. *Other brands and names are the property of their respective owners. 25 *Other brands and names are the property of their respective owners. 26 Ct Dynamic Engine Execution Ct Dynamic Engine Execution
int ar_a[1024], ar_b[1024] TriggerTrigger JIT JIT int ar_a[1024], ar_b[1024] TriggerTrigger JIT JIT int ar_a[1024], ar_b[1024] IRIR Builder Builder JITJIT int ar_a[1024], ar_b[1024] IRIR Builder Builder JITJIT
Vec
void work( Vec
MemoryMemory Manager Manager MemoryMemory Manager Manager a a b b Ct Dynamic Ct Dynamic Engine Engine
Software & Services Group, Developer Products Division Software & Services Group, Developer Products Division
Copyright © 2009, Intel Corporation. All rights reserved. Copyright © 2009, Intel Corporation. All rights reserved. *Other brands and names are the property of their respective owners. 27 *Other brands and names are the property of their respective owners. 28
Ct Dynamic Engine Execution Ct Dynamic Engine Execution
int ar_a[1024], ar_b[1024] TriggerTrigger JIT JIT int ar_a[1024], ar_b[1024] TriggerTrigger JIT JIT int ar_a[1024], ar_b[1024] IRIR Builder Builder JITJIT int ar_a[1024], ar_b[1024] IRIR Builder Builder JITJIT
Vec
void work( Vec
MemoryMemory Manager Manager ParallelParallel Runtime Runtime MemoryMemory Manager Manager ParallelParallel Runtime Runtime a a Thread Scheduler b b Thread Scheduler Ct Dynamic Ct Dynamic Engine Engine DataData PartitionPartition
Software & Services Group, Developer Products Division Software & Services Group, Developer Products Division
Copyright © 2009, Intel Corporation. All rights reserved. Copyright © 2009, Intel Corporation. All rights reserved. *Other brands and names are the property of their respective owners. 29 *Other brands and names are the property of their respective owners. 30 Ct Dynamic Engine Execution Ct Dynamic Engine Execution
int ar_a[1024], ar_b[1024] TriggerTrigger JIT JIT int ar_a[1024], ar_b[1024] int ar_a[1024], ar_b[1024] IRIR Builder Builder JITJIT int ar_a[1024], ar_b[1024]
Vec
MemoryMemory Manager Manager ParallelParallel Runtime Runtime MemoryMemory Manager Manager ParallelParallel Runtime Runtime a Thread Scheduler a Thread Scheduler b Thread Scheduler b Thread Scheduler Ct Dynamic Ct Dynamic Engine DataData Engine PartitionPartition All Intel Platforms
Software & Services Group, Developer Products Division Software & Services Group, Developer Products Division
Copyright © 2009, Intel Corporation. All rights reserved. Copyright © 2009, Intel Corporation. All rights reserved. *Other brands and names are the property of their respective owners. 31 *Other brands and names are the property of their respective owners. 32
Ct Dynamic Engine Execution Towards a Parallel Virtual Machine
intint ar_a[1024], ar_a[1024], ar_b[1024] ar_b[1024]
Vec
void work( Vec
MemoryMemory Manager Manager ParallelParallel Runtime Runtime a Thread Scheduler b Thread Scheduler Ct Dynamic Engine DataData PartitionPartition All Intel Platforms
Software & Services Group, Developer Products Division Software & Services Group, Developer Products Division
Copyright © 2009, Intel Corporation. All rights reserved. Copyright © 2009, Intel Corporation. All rights reserved. *Other brands and names are the property of their respective owners. 33 *Other brands and names are the property of their respective owners. 34 Parallel Programming Abstractions Parallel VM Tasks
• Provide function definition, data management and C++ Java .NET Python … execution • Decouple programming languages from Ct TBB Emerging Parallel Languages … concurrency platforms – Allowing new frontends to flourish Industry gap in “parallel VM” • Be well-defined, offer C API and textual representation CUDA OpenCL … – Suitable for wide external adoption
pthreads ConcRT GCD PTX …
Software & Services Group, Developer Products Division Software & Services Group, Developer Products Division
Copyright © 2009, Intel Corporation. All rights reserved. Copyright © 2009, Intel Corporation. All rights reserved. *Other brands and names are the property of their respective owners. 35 *Other brands and names are the property of their respective owners. 36
Evolve Ct into Data-parallel Virtual Ct In–Depth Information and Product Machine Plans
• Converge Ct and RapidMind APIs into open, standard VM layer •Goals: – New frontends for other languages (e.g. .NET, Python, Java, etc.) – Enable domain specific languages – Leverage data-parallel execution engines from Intel – Provide interface specification for non-Intel implementations – Clearly specify semantics separately from syntax – Binary compatibility and insulation • Not generally aimed at application developers
• Collaboration welcome! – Email to [email protected]!
Software & Services Group, Developer Products Division Software & Services Group, Developer Products Division
Copyright © 2009, Intel Corporation. All rights reserved. Copyright © 2009, Intel Corporation. All rights reserved. *Other brands and names are the property of their respective owners. 37 *Other brands and names are the property of their respective owners. 38 Ct Going Forward How to Learn More about Ct
• Ct is being turned into an Intel software product • Read the material at http://www.intel.com/go/ct – public beta release planned for Q1/2010 • The product will contain • Browse the Intel Developer Forum website for Ct –Core API presentations – Libraries for Linear Algebra, FFT, Random Number Generation (powered by Intel® Math Kernel Library) – Lots of samples (Medical Imaging, Financial Analytics, • Bug your favorite Intel rep about getting into the Seismic Processing, …) private beta program • Initial release on Windows, followed by Linux – IA-32 and Intel® 64 instruction sets • Sign up for the public beta at – Works with Intel® C/C++ Compiler, Microsoft* Visual http://www.intel.com/go/ct C++*, and GCC* – Works with Intel® VTune™ Analyzer
Software & Services Group, Developer Products Division Software & Services Group, Developer Products Division
Copyright © 2009, Intel Corporation. All rights reserved. Copyright © 2009, Intel Corporation. All rights reserved. *Other brands and names are the property of their respective owners. 39 *Other brands and names are the property of their respective owners. 40