Parallel Program Execution and Architecture Models with Dataflow Origin the EARTH Experience

Parallel Program Execution and Architecture Models with Dataflow Origin the EARTH Experience

Department of Electrical and Computer Engineering Computer Architecture and Parallel Systems Laboratory - CAPSL Parallel Program Execution and Architecture Models with Dataflow Origin The EARTH Experience CPEG 852 - Spring 2014 Advanced Topics in Computing Systems Guang R. Gao ACM Fellow and IEEE Fellow Endowed Distinguished Professor Electrical & Computer Engineering University of Delaware Topic-B-Multithreading [email protected] Outline • Parallel program execution models • An evolution of dataflow architectures: experience with the argument-fetch dataflow model/architectures • Evolution of fine-grain multithreaded program execution models – The EARTH experience. • Memory and synchronization. models • From EARTH to Runnemede – A Journey to extreme-scale Topic-B-Multithreading 2 Outline • Part I: EARTH execution model • Part II: EARTH architecture model and platforms • Part III: EARTH programming models and compilation techniques • The percolation model and its applications • Summary Topic-B-Multithreading 3 Part I EARTH: An Efficient Architecture for Running THreads [PACT95, EURO-PAR95, ICS95, MASCOTS96, ISCA96, PACT96, PPoPP97, PACT97, SPAA97, DIPES98, SPAA98 and many others …) Topic-B-Multithreading 4 The EARTH Program Execution Model • What is a thread? • How the state of a thread is represented? • How a thread is enabled? Topic-B-Multithreading 5 What is a Thread? • A parallel function invocation (threaded function invocation) • A code sequence defined (by a user or a compiler) to be a thread (fiber) • Usually, a body of a threaded function may be partitioned into several threads Topic-B-Multithreading 6 How to Execute Fibonacci Function in Parallel? fib (4) fib (3) + fib (2) fib (2) fib (1) fib(1) fib (0) fib (1) fib (0) Topic-B-Multithreading 7 Parallel Function Invocation fib n-2 fib n-3 SYNC slots local vars fib n-1 fib n-2 caller’s Links between <fp,ip> frames fib n Tree of “Activation Frames” Topic-B-Multithreading 8 An Example int f(int *x, int i, int j) { int a, b, sum, prod, fact; b = x[j]; int r1, r2, r3; sum = a + b; a = x[i]; prod = a * b; fact = 1; fact = fact * a; r1 = g(sum); r2 = g(prod); r3 = g(fact); return(r1 + r2 + r3); } Topic-B-Multithreading 9 The Example is Partitioned into Four Fibers (Threads) Thread0: a = x[i]; fact = 1; Thread1: 1 fact = fact * a; b = x[j]; Thread2: 1 sum = a + b; prod = a * b; r1 = g(sum); r2 = g(prod); r3 = g(fact); Thread3: 3 return (r1 + r2 + r3); Topic-B-Multithreading 10 The State of a Fiber (Thread) • A Fiber shares its “enclosing frame” with other fibers within the same threaded function invocation. • The state of a fiber includes – its instruction pointer – its “temporary register set” • A fiber is “ultra-light weighted”: it does not need dynamic storage (frame) allocation. • Our focus: non-preemptive threads – called fibers Topic-B-Multithreading 11 The “EARTH” Execution Model 2 2 1 2 “signal token” a “thread” actor 1 2 2 4 Topic-B-Multithreading 12 The EARTH Fiber Firing Rule • A Fiber becomes enabled if it has received all input signals; • An enabled fiber may be selected for execution when the required hardware resource has been allocated; • When a fiber finishes its execution, a signal is sent to all destination threads to update the corresponding synchronization slots. Topic-B-Multithreading 13 Thread States Thread created Thread terminated DORMANT Synchronizations received Thread completed ENABLED ACTIVE CPU ready Topic-B-Multithreading 14 The EARTH Model of Computation Fiber within a frame Parallel function invocation Call a procedure SYNC ops Topic-B-Multithreading 15 The EARTH Multithreaded Execution Model Two Level of Fine-Grain Threads: fiber within a frame - threaded procedures Aync. function invocation - fibers A sync operation Invoke a threaded func 2 2 1 2 Signal Token Total # signals Arrived # signals 1 2 2 4 Topic-B-Multithreading 16 EARTH vs. CILK Fiber within a frame Parallel function invocation frames fork a procedure SYNC ops EARTH Model CILK Model Note: EARTH has it origin in static dataflow model Topic-B-Multithreading 17 The “Fiber” Execution Model 0 2 0 2 Signal Token Total # signals Arrived # signals 0 1 0 2 0 4 Topic-B-Multithreading 18 The “Fiber” Execution Model 1 2 0 2 Signal Token Total # signals Arrived # signals 0 1 0 2 0 4 Topic-B-Multithreading 19 The “Fiber” Execution Model 2 2 0 2 Signal Token Total # signals Arrived # signals 0 1 0 2 0 4 Topic-B-Multithreading 20 The “Fiber” Execution Model 2 2 0 2 Signal Token Total # signals Arrived # signals 1 1 0 2 0 4 Topic-B-Multithreading 21 The “Fiber” Execution Model 2 2 0 2 Signal Token Total # signals Arrived # signals 1 1 1 2 0 4 Topic-B-Multithreading 22 The “Fiber” Execution Model 2 2 1 2 Signal Token Total # signals Arrived # signals 1 1 1 2 0 4 Topic-B-Multithreading 23 The “Fiber” Execution Model 2 2 2 2 Signal Token Total # signals Arrived # signals 1 1 1 2 0 4 Topic-B-Multithreading 24 The “Fiber” Execution Model 2 2 2 2 Signal Token Total # signals Arrived # signals 1 1 2 2 0 4 Topic-B-Multithreading 25 The “Fiber” Execution Model 2 2 2 2 Signal Token Total # signals Arrived # signals 1 1 2 2 1 4 Topic-B-Multithreading 26 The “Fiber” Execution Model 2 2 2 2 Signal Token Total # signals Arrived # signals 1 1 2 2 2 4 Topic-B-Multithreading 27 The “Fiber” Execution Model 2 2 2 2 Signal Token Total # signals Arrived # signals 1 1 2 2 3 4 Topic-B-Multithreading 28 The “Fiber” Execution Model 2 2 2 2 Signal Token Total # signals Arrived # signals 1 1 2 2 4 4 Topic-B-Multithreading 29 Part II • The EARTH • Abstract Machine (Architecture) Model • and • EARTH Evaluation Platforms Topic-B-Multithreading 30 Programming Models Users Users Programming Environment Platforms Execution Model API Execution Model Execution Abstract Machine Execution ModelTopic-B- Multithreadingand Abstract Machines 31 The EARTH Abstract Architecture (Model) Local Memory Local Memory EU SU EU SU PE PE NETWORK Topic-B-Multithreading 32 How To Evaluate EARTH Execution and Abstract Machine Model ? Topic-B-Multithreading 33 EARTH Evaluation Platforms EARTH-MANNA EARTH-IBM-SP Implement EARTH on a bare- Plan to implement EARTH on a metal tightly-coupled off-the-shelf Commercial multiprocessor. Parallel Machine (IBM SP2/SP3) EARTH on Clusters EARTH on Beowulf Implement EARTH on a cluster of UltraSPARC SMP workstations connected by fast Ethernet NOTE: Benchmark code are all written with EARTH Threaded-C: The API for EARTH Execution and Abstract Machine Models Topic-B-Multithreading 34 EARTH-MANNA: An Implementation of The EARTH Architecture Model Topic-B-Multithreading 35 Open Issues • Can a multithreaded program execution model support high scalability for large-scale parallel computing while maintaining high processing efficiency? • If so, can this be achieved without exotic hardware support? • Can these open issues be addressed both qualitatively and quantitatively with performance studies of real-life benchmarks (both Class A & B)? Topic-B-Multithreading 36 The EARTH-MANNA Multiprocessor Testbed cluster cluster cluster cluster cluster cluster cluster Crossbar- cluster cluster Hierarchies cluster cluster cluster cluster cluster cluster cluster Cluster Node Node Node Node Node Node Crossbar CP CP i860XP i860XP 8 Network Interface I/O 8 4 32 Mbyte Memory Topic-B-Multithreading 37 Main Features of EARTH Multiprocessor • Fast thread context switching shelf - • Efficient parallel function invocation the - • Good support of fine-grain dynamic load balancing • Efficient support split-phase Using off microprocessors transaction • The concept of fibers and dataflow Topic-B-Multithreading 38 EARTH-C Compiler Environment EARTH-SIMPLE C EARTH-C Split-Phase Analysis Build DDG Program Dependence McCAT Analysis EARTH SIMPLE Compute Remote Level Fiber Partitioning EARTH-C Compiler Fiber generation Merge Statements Threaded-C Fiber Synchronization Threaded-C Compiler Fiber Scheduling Fiber Code Generation (a) EARTH Compilation Environment (b) EARTH-C Compiler Topic-B-Multithreading 39 Performance Study of EARTH • Overview • Performance of basic EARTH primitives (“Stress Test” via “micro-benchmarks”) • Performance of benchmark programs – Speedup – USE value – Latency Tolerance Capacity NOTE: It is important to design your own performance “features” or “parameters” that best distinguishes your models from your counterparts Topic-B-Multithreading 40 EARTH Benchmark Suite (EBS) Benchmark Name Problem Size Problem Domain Characteristics Ray Tracing 512 x 512 Image Processing Class A Wave-2D 150 x 150 Fluid Dynamic Problem Class A Tomcatv 257 Scientific Computation Class A 2D-SLT 80 x 80 Fluid Dynamic Problem Class A Matrix Multiply 480 x 480 Numerical Computation Class A Barnes-Hut 8192 bodies N-body Simulation Class B MP3D 18K paricles Fluid Flow Simulation Class B EM3D 20K nodes Electromagnetic Wav Simulation Class B Sampling Sorting 64K Sorting Problem Class B Gauss Elimination 720 x 720 Numerical Computation Class B Protein Folding 3 x 3 x 3 Cube Chemistry Class B Eigenvalue 2999 Numerical Computation Class B Vertex Enumeration 10 Pivot-Based Searching Class B TSP 10 Graph Searching Class B Paraffins 20 Chemistry Class B N-Queen 12 Graph Searching Class B Power 10000 Power System Optimization Class B Voronoi 64K Graph Paritioning Class B Heuristic-TSP 32K Searching Problem Class B Tree-Add 1M Graph Searching Class B Portable Threaded-C exists Topic-B-Multithreading 41 Main Experimental Results of EARTH-MANNA • Efficient multithreading support is possible with off-the-shelf processor nodes with overhead – context

View Full Text

Details

  • File Type
    pdf
  • Upload Time
    -
  • Content Languages
    English
  • Upload User
    Anonymous/Not logged-in
  • File Pages
    66 Page
  • File Size
    -

Download

Channel Download Status
Express Download Enable

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

  • Not to be reproduced or distributed without explicit permission.
  • Not used for commercial purposes outside of approved use cases.
  • Not used to infringe on the rights of the original creators.
  • If you believe any content infringes your copyright, please contact us immediately.

Support

For help with questions, suggestions, or problems, please contact us