Porta-SIMD: An Optimally Portable SIMD Programming Language
Duke CS-1990-12, UNC CS TR90-021, May 1990
Russ Tuck

Duke University
Department of Computer Science
Durham, NC 27706

The University of North Carolina at Chapel Hill
Department of Computer Science
CB#3175, Sitterson Hall
Chapel Hill, NC 27599-3175

Text (without appendix) of a Ph.D. dissertation submitted to Duke University. The research was performed at UNC.

© 1990 Russell R. Tuck, III

UNC is an Equal Opportunity/Affirmative Action Institution.

PORTA-SIMD: AN OPTIMALLY PORTABLE SIMD PROGRAMMING LANGUAGE

by Russell Raymond Tuck, III

Department of Computer Science
Duke University

Dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in the Department of Computer Science in the Graduate School of Duke University

1990

Copyright © 1990 by Russell Raymond Tuck, III
All rights reserved

Abstract

Existing programming languages contain architectural assumptions which limit their portability. I submit optimal portability, a new concept which solves this language design problem. Optimal portability makes it possible to design languages which are portable across various sets of diverse architectures. SIMD (Single-Instruction stream, Multiple-Data stream) computers represent an important and very diverse set of architectures for which to demonstrate optimal portability. Porta-SIMD (pronounced "porta-simm'd") is the first optimally portable language for SIMD computers. It was designed and implemented to demonstrate that optimal portability is a useful and achievable standard for language design.

An optimally portable language allows each program to specify the architectural features it requires. The language then enables the compiled program to exploit exactly those features, and to run on all architectures that provide them. An architecture's features are those it can implement with a constant-bounded number of operations.
This definition of optimal portability ensures reasonable execution efficiency, and identifies architectural differences relevant to algorithm selection.

An optimally portable language for a set of architectures must accommodate all the features found in the members of that set. There was no suitable taxonomy to identify the features of SIMD architectures. Therefore, the taxonomy created and used in the design of Porta-SIMD is presented.

Porta-SIMD is an optimally portable, full-featured SIMD language. It provides dynamic allocation of parallel data with dynamically determined sizes. Generic subroutines which operate on any size of data may also be written in Porta-SIMD. Some important commercial SIMD languages do not provide these features.

A prototype implementation of Porta-SIMD has been developed as a set of #include files and libraries used with an ordinary C++ compiler. This approach has allowed more rapid prototyping and language experimentation than a custom compiler would have, but modestly constrained the language's syntax. The result is a very portable but only moderately efficient implementation. Porta-SIMD has been implemented for the Connection Machine 2, for Pixel-Planes 4 and 5, and for ordinary sequential machines.

Optimal portability is an important new concept for developing portable languages which can handle architectural diversity. Porta-SIMD demonstrates its usefulness with SIMD computers.

Acknowledgements

I am most thankful to God, who has made possible what I could not have done alone. He enabled me to finish work which looked like it could drag on forever, and to do it in time for an impossible-looking graduation deadline. He taught me a lot about faith, trust, prayer, and His peace in the process. He has given me a wonderful wife, Debbi, who loves me and makes life fun. Most of all, God gave His son Jesus to die for my failures and give me new life with meaning and hope.
I appreciate Debbi's unending patience, prayers, sacrifice, and encouragement. These have been especially important during my final push to finish, during which she has been busy finishing her own graduate degree. Our family, our church and some special friends have joined in with encouragement and prayers, for which I am very grateful. Dad's quick mid-day phone calls from California several times over the last few weeks were especially helpful.

Dr. Frederick P. Brooks, Jr. has been the consummate advisor. He has provided critically important insight, advice, and perspective. He has regularly and dependably scheduled time for our meetings, despite the many demands on his time, and that has been very important to my steady progress. I am grateful to Dr. Brooks for accepting me as his student and providing me with assured grant support, guidance, and freedom, even when all I knew about my research goals was that I wanted to improve the programming of SIMD computers. I appreciate his deep Christian faith, and have enjoyed participating in Wednesday lunch Bible studies with him.

I appreciate my committee for their encouragement and sound advice. Dr. Henry Fuchs has supported my work financially as part of the Pixel-Planes project since I began working on Porta-SIMD, and has been very supportive personally as well. Dr. Merrell Patrick also served on my M.S. committee, and has shown a personal interest in my success that was especially helpful during my transition into doctoral research. Dr. Jothy Rosenberg's ideas, interest, and personality have made him a very valuable source of encouraging advice and positive suggestions. Dr. John Board has provided sound comments and helpful references. I appreciate Dr. Jan F. Prins' service on my committee as an ex officio member, doing significant work without official recognition. He has often been the most accessible member of my committee, and has served as a valuable sounding board for ideas and dilemmas.
Greg Turk, a fellow graduate student and member of the Pixel-Planes team, has been valuable as a pioneering Porta-SIMD user and for his willingness to discuss and comment on how Porta-SIMD should work. I appreciate his willingness, and that of Tim Cullip, to let me use their programs as examples and include them in this dissertation.

I want to thank Michael Tiemann for writing G++ (the GNU C++ compiler), for making it freely available, and for providing timely fixes to compiler bugs as they were reported.

I appreciate the encouragement and friendship of many members of the Duke and UNC Computer Science Departments, including especially my long-time office mates at Duke, Jack Briner and Mark Jones, and the entire Pixel-Planes team at UNC. I appreciate the time Carlton Brown and Debbi Tuck spent patiently and carefully reading drafts of this document to point out and suggest corrections for my writing errors. I appreciate the time Herb Taylor of the David Sarnoff Research Center spent helping me understand details of the Princeton Engine, and also his friendship and encouragement.

Several institutions have been important to my research. The Departments of Computer Science at the University of North Carolina at Chapel Hill and Duke University have provided critically important computing resources and office space. My research at UNC has been supported by the Pixel-Planes Project, Drs. Henry Fuchs and John Poulton, P.I.s, and its grants: National Science Foundation grant #MIP-8601552, Defense Advanced Research Projects Agency order #6090, Office of Naval Research contract #N0014-86-K-0680; and by the GRIP Project, Frederick Brooks, P.I., under National Institutes of Health grant #RR 02170.
Access to a Connection Machine was provided first by the Advanced Computing Research Facility (ACRF) at Argonne National Laboratories, under grants NSF-ASC-8808327 and DOE-W-31-109-ENG-38, and more recently by the Connection Machine Network Server (CMNS) Pilot Facility at Thinking Machines Corporation under DARPA contract DACA76-88-C-0012.

Contents

Abstract
Acknowledgements
1 Introduction, Thesis, and Overview
2 Optimal Portability
3 A SIMD Taxonomy for Optimal Portability
  3.1 Definition of SIMD Architectures
  3.2 Taxonomy of SIMD Architectures
    3.2.1 Feature Names
    3.2.2 Communication
      3.2.2.1 Labeling (number, N)
      3.2.2.2 Communication (C)
      3.2.2.3 Collision Resolution, Write (W)
      3.2.2.4 Collision Resolution, Fetch (F)
      3.2.2.5 Piped Communication (P)
      3.2.2.6 Cut-Through Communication (T)
    3.2.3 Local Addressing
      3.2.3.1 Local Addressing (L)
    3.2.4 Reduce and Scan
      3.2.4.1 Reduce (R)
      3.2.4.2 Scan (S)
    3.2.5 Parallel I/O
      3.2.5.1 Input (I)
      3.2.5.2 Output (O)
    3.2.6 PE to Host I/O
      3.2.6.1 Get (G)
    3.2.7 Naming a Classification
  3.3 The Taxonomy in Use
    3.3.1 PxPl4 (Pixel-Planes 4)
    3.3.2 Oldfield (Oldfield et al.)
    3.3.3 PxPl5 (Pixel-Planes 5)
    3.3.4 AIS-5000 (Applied Intelligent Systems, Inc.)
    3.3.5 Centipede
    3.3.6 Princeton Engine
    3.3.7 ASP (Associative String Processor)
    3.3.8 Illiac IV
    3.3.9 SOLOMON (Simultaneous Operation Linked Ordinal MOdular Network)
    3.3.10 MPP (Massively Parallel Processor)
    3.3.11 DAP (Distributed Array Processor)
    3.3.12 BLITZEN
    3.3.13 YUPPIE (Yorktown Ultra Parallel Polymorphic Image Engine)
    3.3.14 Unger (S. H. Unger)
    3.3.15 GAM Pyramid (George Mason University, Adder pyramid, MPP circuits, Pyramidal topology)
    3.3.16 BVM (Boolean Vector Machine)
    3.3.17 GF11 (11-Gflop target performance)
    3.3.18 BSP (Burroughs Scientific Processor)
    3.3.19 MP-1 (MasPar Computer Corp.)