Data Management and Control-Flow Aspects of an SIMD/SPMD Parallel Language/Compiler

222 lltt TRANSA('TI0NS ON PARALLEL AND DISTRIBUTED SYSTEMS. VOL. 4. NO. 2. FEBRUARY lYY3 Data Management and Control-Flow Aspects of an SIMD/SPMD Parallel Language/Compiler Mark A. Nichols, Member, IEEE, Howard Jay Siegel, Fellow, IEEE, and Henry G. Dietz, Member, IEEE Abstract-Features of an explicitly parallel programming lan- map more closely to different modes of parallelism [ 191, [23]. guage targeted for reconfigurable parallel processing systems, For instance, most parallel systems designed to exploit data where the machine's -1-processing elements (PE's) are capable of parallelism operate solely in the SlMD mode of parallelism. operating in both the SIMD and SPMD modes of parallelism, are described. The SPMD (Single Program-Multiple Data) mode of Because many data-parallel applications require a significant parallelism is a subset of the MIMD mode where all processors ex- number of data-dependent conditionals, SIMD mode is un- ecute the same program. By providing all aspects of the language necessarily restrictive. These types of applications are usually with an SIMD mode version and an SPMD mode version that are better served when using the SPMD mode of parallelism. syntactically and semantically equivalent, the language facilitates Several parallel machines have been built that are capable of experimentation with and exploitation of hybrid SlMDiSPMD machines. Language constructs (and their implementations) for operating in both the SIMD and SPMD modes of parallelism. data management, data-dependent control-flow, and PE-address Both PASM [9], [17], [51], [52] and TRAC [30], [46] are dependent control-flow are presented. These constructs are based hybrid SIMDiMIMD machines, and OPSILA [2], [3],[16] is on experience gained from programming a parallel machine a hybrid SIMDiSPMD machine. prototype, and are being incorporated into a compiler under The research presented here is applicable to SIMD, SPMD, development. Much of the research presented is applicable to general SIMD machines and MIMD machines. and combined SIMDiSPMD operation. Commercial SIMD machines like the Connection Machine [21], [54], DAP [22], Zndex Terms- Compilers, fault tolerance, languages, mixed- mode parallelism, parallel processing, PASM, reconfigurable ma- MasPar [8], and MPP [5], [6] have shown that developing chines, SIMD, SPMD. parallel languages for programming and studying the behavior of such machines is important. While commercial MIMD machines, such as the Butterfly [ 131, iPSC Hypercube [35], and I. INTRODUCTION NCube system [20], are not restricted to SPMD mode, SPMD WO approaches to parallel computation are the SIMD and mode has wide applicability and is simpler to use because only TMIMD modes of parallelism. In the synchronous SIMD one program and its interaction with copies of itself needs to be (Single Instruction-Multiple Data) [ 181 mode of parallelism, debugged (as opposed to having distinct interacting programs the processors are broadcast all of the instructions they are on different processors). For joint SIMD/SPMD machines such to execute and the processors operate in a lock-step fashion. as PASM and OPSILA, because there exist tradeoffs between In the asynchronous MIMD (Multiple Instruction-Multiple the two modes [7], [47], one can take advantage of both modes Data) [IS] mode of parallelism, each processor functions to attempt to improve performance. independently using its own program. In the SPMD (Single In addition to allowing the use of the more suitable mode Program-Multiple Data) [ 141, [IS] mode, all processors when executing a task, being able to utilize two modes of function independently using their own program counters and parallelism within parallel programs provides more flexibility all execute the same program. Because the processors will, in with respect to fault tolerance. Given a task that executes general, be operating independently on different data sets. they more effectively in SIMD mode than in SPMD mode, if a may be taking different control paths through the program. processor becomes faulty, the degraded system may be more On parallel systems capable of operating only in a single effective in SPMD mode rather than in SIMD. Reasons for this mode of parallelism, it is often impossible to fully exploit may include that the needed redistribution of work among the the parallelism inherent in many application programs. This processors will require irregular inter-processor data transfers is because different parts of an application program might and that the addressing requirements of the degraded system may no longer be uniform among the processors. Thus, being Manuscript received Octoher, 15, I9UO: revised Septcinher I. IYYI. This able to execute the same program both in SIMD mode or work was supported by the Oftice of Nakal Research under Grant NO0 Ol4-YO- 5-1483, by the National Science Foundation under Grant CDA-90 15h06. and SPMD mode is needed to facilitate such a fault tolerance hy the Naval Ocean Systems Center under the High Performance ('omputing approach. The language presented here will support this. Block, ONT. A preliminary version of portions ot thi\ material was presented The Explicit Language for Parallelism (ELP)is an explicitly at Frontiers '90: The Third Symposium on the Frontier\ of Ma\sively Parallel Computation. parallel language being designed for programming parallel M. A. Nichols is with NC'R, San Diego. CA 92127-180(1. machines where the user explicitly indicates the parallelism H. J. Siegel and H. G. Dieti are with the Parallel Processing I.ahoratory. to be employed. It will include constructs for both SIMD School of Electrical Engineering, Purdue University. West Lafayette. IN 47907- 1285. and MIMD parallelism. Thus, the full ELP language will be IEEE Log Number 9204618. adaptable to SIMD machines, MIMD machines, and machines NICHOLS c’f a/ SlMDiSPMD PARALLEL 1 ANGUAGE- COMPILER ZZZ capable of operating in both the SIMD and MIMD modes. An DAP Fortran [22], MasPar C and Fortran [ll], Parallel-C ELP application program is able to perform computations that [28], Parallel Pascal [42], extensions to Ada [12], C* [44], use the SlMD and MIMD parallelism modes in an interleaved and DAPL [43], [SI. fashion. Existing parallel languages capable of supporting mixed- The first target architecture for ELP is the PASM (Partition- mode operation include CSL, Hellena, and BLAZE. While able SIMDIMIMD) parallel processing system. ELP is cur- CSL and Hellena are targeted for the TRAC and OPSILA rently being developed on a small-scale 30-processor prototype machines, respectively, BLAZE is untargeted. of PASM (with 16 processors in the computational engine). CSL (Computation Structures Language) [lo] is a PASCAL- PASM is a design for a large-scale reconfigurable system with like explicitly parallel job control language, targeted to TRAC, distributed memory that supports mixed-mode parallelism (i.e., that supports the high-level specification and scheduling of its processing elements (PE’s) can dynamically switch between SIMD tasks (specified via an “EXECUTE” statement) and the SIMD and MIMD modes at instruction level granularity MIMD tasks (specified via a “COBEGIN . COEND” state- with negligible overhead). ELP uses explicit specification of ment). Dynamic switching between SIMD mode and MIMD parallelism so that programmers can conduct experiments to mode is performed at the task level. study the use of SIMD, MIMD, and mixed-mode parallelism. Hellena (31 is an explicitly parallel preprocessed version Thus, the goal of ELP is to facilitate the programming of and of PASCAL, targeted to OPSILA, that supports dynamic experimentation with mixed-mode parallel processing systems mode-switching at the instruction level. SIMD mode is the by supporting explicit control of such systems. default mode of execution, and the encapsulation “spmd . Many parallel algorithm studies have been performed on end-spmd” can be used to temporarily switch to SPMD the PASM prototype. Initially. these studies used assembly mode. language, e.g., [SI, [17]. Currently, as a temporary measure. Unlike the previous two languages, BLAZE (311 (targeted a combination of a C language compiler and AWK scripts for machines capable of SlMD and/or SPMD operation) han- (for pre- and post-processing) is used. Even though this dles parallelism in an implicit fashion. Specifically, BLAZE permits the high-level specification of parallel programs, it is supports language constructs that provide multiple levels of very inefficient and inflexible. These practical experiences in parallelism that permit a compiler to extract the types of par- mixed-mode parallel programming and discussions with mem- allelism appropriate for a given architecture, while discarding bers of the PASM user community provided the motivation for the remainder. the development of the ELP languageicompiler. A mixed-mode language is uriiform with respect to each This paper focuses on aspects of the joint SlMDiSPMD mode it supports if all of the language’s constructs, operations, portion of the ELP languageicompiler. Also, the compi- statements, etc., have interpretations within each of the modes lation/linking/loading process is described. In particular, a with no difference in syntax or semantics [40]. The first variety of inter-related data management, data-dependent parallel language proposed that supported this concept was control-flow, and PE-address dependent control-flow issues XPC. XPC (explicitly Parallel C) [40], [41] is a C-like lan- are examined. Their specification

Data Management and Control-Flow Aspects of an SIMD/SPMD Parallel Language/Compiler

2.5 Classification of Parallel Computers

CS650 Computer Architecture Lecture 10 Introduction to Multiprocessors

What Is SPMD? Messages

Exploiting Automatic Vectorization to Employ SPMD on SIMD Registers

Polynomial-Time Algorithms for Enforcing Sequential Consistency in SPMD Programs with Arrays

Regent: a High-Productivity Programming Language for Implicit Parallelism with Logical Regions

Message Passing Interface Part - I

2.1 Flynn's Taxonomy of Parallel Architectures-Twoslidesxpage

Introduction to Parallel Programming with MPI Lecture #1

Introduction to Matlab Parallel Computing Toolbox

Computer Architecture: SIMD and Gpus (Part III) (And Briefly VLIW, DAE, Systolic Arrays)

Matlab: Parallel Computing Toolbox