Communication Architectures for Parallel-Programming Systems

Raoul A.F. Bhoedjang

This work was carried out in the graduate school ASCI (Advanced School for Computing and Imaging). ASCI dissertation series number 52.

Copyright © 2000 Raoul A.F. Bhoedjang.

VRIJE UNIVERSITEIT

Communication Architectures for Parallel-Programming Systems

ACADEMIC DISSERTATION

to obtain the degree of doctor at the Vrije Universiteit te Amsterdam, by authority of the rector magnificus prof.dr. T. Sminia, to be defended in public before the doctoral committee of the Faculty of Sciences / Mathematics and Computer Science on Tuesday, 6 June 2000, at 15.45, in the main building of the university, De Boelelaan 1105

by

Raoul Angelo Félicien Bhoedjang

born in 's-Gravenhage

Promotoren: prof.dr.ir. H.E. Bal and prof.dr. A.S. Tanenbaum

Preface

Personal Acknowledgements

First and foremost, I am grateful to my thesis supervisors, Henri Bal and Andy Tanenbaum, for creating an outstanding research environment. I thank them both for their confidence in me and my work.

I am very grateful to all members, past and present, of Henri's Orca team: Saniya Ben Hassen, Rutger Hofman, Ceriel Jacobs, Thilo Kielmann, Koen Langendoen, Jason Maassen, Rob van Nieuwpoort, Aske Plaat, John Romein, Tim Rühl, Ronald Veldema, Kees Verstoep, and Greg Wilson. Henri does a great job running this team and working with him has been a great pleasure. His ability to get a job done, his punctuality, and his infinite reading capacity are greatly appreciated — by me, but by many others as well. Above all, Henri is a very nice person.

Tim Rühl has been a wonderful friend and colleague throughout my years at the VU and I owe him greatly. We have spent countless hours hacking away at various projects, meanwhile discussing research and personal matters. Only a few of our pet projects made it into papers, but the papers that we worked on together are the best that I have (co-)authored.

I would like to thank John Romein for his work on Multigame (see Chapter 8), for teaching me about parallel game-tree searching, for his expert advice on computer hardware, for keeping the DAS cluster in good shape, and for generously donating peanuts when I needed them.

Koen Langendoen is the living proof of the rather disturbing observation that there is no significant correlation between a programmer's coding style and the correctness of his code. Koen is a great hacker and taught me a thing or two about persistence. I want to thank him for his work on Orca, Panda, and a few other projects, and for his work as a member of my thesis committee.

The Orca group is blessed with the support of three highly competent research programmers: Rutger Hofman, Ceriel Jacobs, and Kees Verstoep. All three have contributed directly to the work described in this thesis. I wish to thank Rutger for his work on Panda 4.0, for reviewing papers, for many useful discussions about communication protocols, and for lots of good espresso.

I also apologize for all the boring microbenchmarks he performed so willingly.

There are computer programmers and there are artists. Ceriel, no doubt, belongs to the artists. I want to thank him for his fine work on Orca and Panda and for reviewing several of my papers.

Kees's work has been instrumental in completing the comparison of LFC implementations described in Chapter 8. During my typing diet, he performed a tremendous amount of work on these implementations. Not only did he write three of the five LFC implementations described in this thesis, but he also put many hours into measuring and analyzing the performance of various applications. Meanwhile, he debugged Myrinet hardware and kept the DAS cluster alive. Finally, Kees reviewed some of my papers and provided many useful comments on this thesis. It has been a great pleasure to work with him.

With a 128-node cluster of computers in your backyard, it is easy to forget about your desktop machine ... until you cannot reach your home directory. Fortunately, such moments have been scarce. Many thanks to the department's system administrators for keeping everything up and running. I am especially grateful to Ruud Wiggers, for quota of various kinds and great service.

I wish to thank Matty Huntjens for starting my graduate career by recommending me to Henri Bal. I greatly appreciate the way he works with the department's undergraduate students; the department is lucky to have such people.

I ate many dinners at the University's 'restaurant.' I will refrain from commenting on the food, but I did enjoy the company of Rutger Hofman, Philip Homburg, Ceriel Jacobs, and others.

My thesis committee — prof. dr. Zwaenepoel, prof. dr. Kaashoek, dr. Langendoen, dr. Epema, and dr. van Steen — went through some hard times; my move to Cornell rather reduced my pages-per-day rate. I would like to thank them all for reading my thesis and for their suggestions for improvement. I am particularly grateful to Frans Kaashoek, whose comments and suggestions considerably improved the quality of this thesis. I would also like to thank Frans for his hospitality during some of my visits to the US. I especially enjoyed my two-month visit to his research group at MIT. On that same occasion, Robbert van Renesse invited me to Cornell, my current employer. He too has been most hospitable on many occasions, and I would like to thank him for it.

The Netherlands Organization for Scientific Research (N.W.O.) supported part of my work financially through Henri Bal's Pionier grant.

Many of the important things I know, my parents taught me. I am sure my father would have loved to attend my thesis defense. My mother has been a great example of strength in difficult times.

Finally, I wish to thank Daphne for sharing her life with me. Most statements concerning two theses in one household are true. I am glad we survived.

Per-Chapter Contributions

The following paragraphs summarize the contributions made by others to the papers and software that each chapter of this thesis is based on. I would like to thank all for the time and effort they have invested.

Chapter 2's survey is based on an article co-authored by Tim Rühl and Henri Bal [18].

Tim Rühl and I designed and implemented the LFC communication system described in Chapters 3, 4, and 5. John Romein ported LFC to . LFC is described in two papers co-authored by Tim Rühl and Henri Bal; it is also one of the systems classified in the survey mentioned above.

Panda 4.0, described in Chapter 6, was designed by Rutger Hofman, Tim Rühl, and me. Rutger Hofman implemented Panda 4.0 on LFC from scratch, except for the threads package, OpenThreads, which was developed by Koen Langendoen. Earlier versions of Panda are described in two conference publications [19, 124]. The comparison of upcall models is based on an article by Koen Langendoen, me, and Henri Bal [88]. Panda's dynamic switching between polling and interrupts is described in a conference paper by Koen Langendoen, John Romein, me, and Henri Bal [89].

Chapter 7 discusses the implementation of four parallel-programming systems: Orca, Manta, CRL, and MPICH. Orca and Manta were both developed in the computer systems group at the Vrije Universiteit; CRL and MPICH were developed elsewhere.

Koen Langendoen and I implemented the continuation-based version of the Orca runtime system on top of Panda 2.0. The use of continuations in the runtime system is described in a paper co-authored by Koen Langendoen [14]. Ceriel Jacobs wrote the Orca compiler and ported the runtime system to Panda 3.0. Rutger Hofman ported the continuation-based runtime system to Panda 4.0. An extensive description and performance evaluation of the Orca system on Panda 3.0 are given in a journal paper by Henri Bal, me, Rutger Hofman, Ceriel Jacobs, Koen Langendoen, Tim Rühl, and Frans Kaashoek [9].

Manta was developed by Jason Maassen, Rob van Nieuwpoort, and Ronald Veldema. Their work is described in a conference paper [99]. Rob helped me with the Manta performance measurements presented in Section 7.2.

CRL was developed at MIT by Kirk Johnson [73, 74]; I ported CRL to LFC. It could not have been much easier, because CRL is a really nice piece of software.

MPICH [61] is being developed jointly by Argonne National Laboratories and Mississippi State University.

Rutger Hofman ported MPICH to Panda 4.0.

The comparison of different LFC implementations described in Chapter 8 was done in cooperation with Kees Verstoep, Tim Rühl, and Rutger Hofman. Tim and I implemented the NrxImc and HrxHmc versions of LFC. Kees implemented IrxImc, IrxHmc, and NrxHmc and performed many of the measurements presented in Chapter 8. Rutger helped out with various Panda-related issues.

The parallel applications analyzed in Chapter 8 were developed by many persons. The Puzzle application and the parallel-programming system that it runs on, Multigame [122], were written by John Romein. Awari was developed by Victor Allis, Henri Bal, Hans Staalman, and Elwin de Waal. The linear equation solver (LEQ) was developed by Koen Langendoen. The MPI applications were written by Rutger Hofman (QR and ASP) and Thilo Kielmann (SOR). The CRL applications Barnes-Hut (Barnes), Fast Fourier Transform (FFT), and radix sort (radix) are all based on shared-memory programs from the SPLASH-2 suite [153]. Barnes-Hut was adapted for CRL by Kirk Johnson, FFT by Aske Plaat, and radix by me.

Contents

1 Introduction
1.1 Parallel-Programming Systems
1.2 Communication architectures for parallel-programming systems
1.3 Problems
1.3.1 Data-Transfer Mismatches
1.3.2 Control-Transfer Mismatches
1.3.3 Design Alternatives for User-Level Communication Architectures
1.4 Contributions
1.5 Implementation
1.5.1 LFC Implementations
1.5.2 Panda
1.5.3 Parallel-Programming Systems
1.6 Experimental Environment
1.6.1 Cluster Position
1.6.2 Hardware Details
1.7 Thesis Outline

2 Network Interface Protocols
2.1 A Basic Network Interface Protocol
2.2 Data Transfers
2.2.1 From Host to Network Interface
2.2.2 From Network Interface to Network Interface
2.2.3 From Network Interface to Host
2.3 Address Translation
2.4 Protection
2.5 Control Transfers
2.6 Reliability
2.7 Multicast
2.8 Classification of User-Level Communication Systems
2.9 Summary


3 The LFC User-Level Communication System
3.1 Programming Interface
3.1.1 Addressing
3.1.2 Packets
3.1.3 Sending Packets
3.1.4 Receiving Packets
3.1.5 Synchronization
3.1.6 Statistics
3.2 Key Implementation Assumptions
3.3 Implementation Overview
3.3.1 Myrinet
3.3.2 Operating System Extensions
3.3.3 Library
3.3.4 NI Control Program
3.4 Packet Anatomy
3.5 Data Transfer
3.6 Host Receive Buffer Management
3.7 Message Detection and Handler Invocation
3.8 Fetch-and-Add
3.9 Limitations
3.9.1 Functional Limitations
3.9.2 Security Violations
3.10 Related Work
3.11 Summary

4 Core Protocols for Intelligent Network Interfaces
4.1 UCAST: Reliable Point-to-Point Communication
4.1.1 Protocol Data Structures
4.1.2 Sender-Side Protocol
4.1.3 Receiver-Side Protocol
4.2 MCAST: Reliable Multicast Communication
4.2.1 Multicast Forwarding
4.2.2 Multicast Tree Topology
4.2.3 Protocol Data Structures
4.2.4 Sender-Side Protocol
4.2.5 Receiver-Side Protocol
4.2.6 Deadlock
4.3 RECOV: Deadlock Detection and Recovery
4.3.1 Deadlock Detection
4.3.2 Deadlock Recovery
4.3.3 Protocol Data Structures

4.3.4 Sender-Side Protocol
4.3.5 Receiver-Side Protocol
4.4 INTR: Control Transfer
4.5 Related Work
4.5.1 Reliability
4.5.2 Interrupt Management
4.5.3 Multicast
4.5.4 FM/MC
4.6 Summary

5 The Performance of LFC
5.1 Unicast Performance
5.1.1 Latency and Throughput
5.1.2 LogGP Parameters
5.1.3 Interrupt Overhead
5.2 Fetch-and-Add Performance
5.3 Multicast Performance
5.3.1 Performance of the Basic Multicast Protocol
5.3.2 Impact of Deadlock Recovery and Multicast Tree Shape
5.4 Summary

6 Panda
6.1 Overview
6.1.1 Functionality
6.1.2 Structure
6.1.3 Panda's Main Abstractions
6.2 Integrating Multithreading and Communication
6.2.1 Multithread-Safe Access to LFC
6.2.2 Transparently Switching between Interrupts and Polling
6.2.3 An Efficient Upcall Implementation
6.3 Stream Messages
6.4 Totally-Ordered Broadcasting
6.5 Performance
6.5.1 Performance of the Message-Passing Module
6.5.2 Performance of the Broadcast Module
6.5.3 Other Performance Issues
6.6 Related Work
6.6.1 Portable Message-Passing Libraries
6.6.2 Polling and Interrupts
6.6.3 Upcall Models
6.6.4 Stream Messages

6.6.5 Totally-Ordered Broadcast
6.7 Summary

7 Parallel-Programming Systems
7.1 Orca
7.1.1 Programming Model
7.1.2 Implementation Overview
7.1.3 Efficient Operation Transfer
7.1.4 Efficient Implementation of Guarded Operations
7.1.5 Performance
7.2 Manta
7.2.1 Programming Model
7.2.2 Implementation
7.2.3 Performance
7.3 The C Region Library
7.3.1 Programming Model
7.3.2 Implementation
7.3.3 Performance
7.4 MPI — The Message Passing Interface
7.4.1 Implementation
7.4.2 Performance
7.5 Related Work
7.5.1 Operation Transfer
7.5.2 Operation Execution
7.6 Summary

8 Multilevel Performance Evaluation
8.1 Implementation Overview
8.1.1 The Implementations
8.1.2 Commonalities
8.2 Reliable Point-to-Point Communication
8.2.1 The No-Retransmission Protocol (Nrx)
8.2.2 The Retransmitting Protocols (Hrx and Irx)
8.2.3 Latency Measurements
8.2.4 Window Size and Receive Buffer Space
8.2.5 Send Buffer Space
8.2.6 Protocol Comparison
8.3 Reliable Multicast
8.3.1 Host-Level versus Interface-Level Packet Forwarding
8.3.2 Acknowledgement Schemes
8.3.3 Deadlock Issues

8.3.4 Latency and Throughput Measurements
8.4 Parallel-Programming Systems
8.4.1 Communication Style
8.4.2 Performance Issues
8.5 Application Performance
8.5.1 Performance Results
8.5.2 All-pairs Shortest Paths (ASP)
8.5.3 Awari
8.5.4 Barnes-Hut
8.5.5 Fast Fourier Transform (FFT)
8.5.6 The Linear Equation Solver (LEQ)
8.5.7 Puzzle
8.5.8 QR Factorization
8.5.9 Radix Sort
8.5.10 Successive Overrelaxation (SOR)
8.6 Classification
8.7 Related Work
8.7.1 Reliability
8.7.2 Multicast
8.7.3 NI Protocol Studies
8.8 Summary

9 Summary and Conclusions
9.1 LFC and NI Protocols
9.2 Panda
9.3 Parallel-Programming Systems
9.4 Performance Impact of Design Decisions
9.5 Conclusions
9.6 Limitations and Open Issues

A Deadlock Issues
A.1 Deadlock-Free Multicasting in LFC
A.2 Overlapping Multicast Groups

B Abbreviations

Samenvatting

Curriculum Vitae

List of Figures

1.1 Communication layers studied in this thesis.
1.2 Structure of this thesis.
1.3 Myrinet switch topology.
1.4 Switches and cluster nodes.
1.5 Cluster node architecture.

2.1 Operation of the basic protocol.
2.2 Host-to-NI throughput using different data transfer mechanisms.
2.3 Design decisions for reliability.
2.4 Repeated send versus a tree-based multicast.

3.1 LFC's packet format.
3.2 LFC's data transfer path.
3.3 LFC's send path and data structures.
3.4 Two ways to transfer the payload and status flag of a network packet.
3.5 LFC's receive data structures.

4.1 Protocol data structures for UCAST.
4.2 Sender side of the UCAST protocol.
4.3 Receiver side of the UCAST protocol.
4.4 Host-level and interface-level multicast forwarding.
4.5 Protocol data structures for MCAST.
4.6 Sender-side protocol for MCAST.
4.7 Receive procedure for the multicast protocol.
4.8 MCAST deadlock scenario.
4.9 A binary tree.
4.10 Subtree covered by the deadlock recovery algorithm.
4.11 Deadlock recovery data structures.
4.12 Multicast forwarding in the deadlock recovery protocol.
4.13 Packet transmission in the deadlock recovery protocol.
4.14 Receive procedure for the deadlock recovery protocol.
4.15 Packet release procedure for the deadlock recovery protocol.


4.16 Timeout handling in INTR.
4.17 LFC's and FM/MC's buffering strategies.

5.1 LFC's unicast latency.
5.2 LFC's unicast throughput.
5.3 Fetch-and-add latencies under contention.
5.4 LFC's broadcast latency.
5.5 LFC's broadcast throughput.
5.6 Tree used to determine multicast gap.
5.7 LFC's broadcast gap.
5.8 Impact of deadlock recovery on broadcast throughput.
5.9 Impact of tree topology on broadcast throughput.
5.10 Impact of deadlock recovery on all-to-all throughput.

6.1 Different broadcast delivery orders.
6.2 Panda modules: functionality and dependencies.
6.3 Two Panda header stacks.
6.4 Recursive invocation of an LFC routine.
6.5 RPC example.
6.6 Simple remote read with active messages.
6.7 Message handler with locking.
6.8 Three different upcall models.
6.9 Message handler with blocking.
6.10 Fast thread switch to an upcall thread.
6.11 Application-to-application streaming with Panda's stream messages.
6.12 Panda's message-header formats.
6.13 Receiver-side organization of Panda's stream messages.
6.14 Panda's message-passing latency.
6.15 Panda's message-passing throughput.
6.16 Panda's broadcast latency.
6.17 Panda's broadcast throughput.

7.1 Definition of an Orca object type.
7.2 Orca process creation.
7.3 The structure of the Orca shared object system.
7.4 Decision process for executing an Orca operation.
7.5 Orca definition of an array of records.
7.6 Specialized marshaling code generated by the Orca compiler.
7.7 Data transfer paths in Orca.
7.8 Orca with popup threads.
7.9 Orca with single-threaded upcalls.

7.10 The continuations interface.
7.11 Roundtrip latencies for Orca, Panda, and LFC.
7.12 Roundtrip throughput for Orca, Panda, and LFC.
7.13 Broadcast latency for Orca, Panda, and LFC.
7.14 Broadcast throughput for Orca, Panda, and LFC.
7.15 Roundtrip latencies for Manta, Orca, Panda, and LFC.
7.16 Roundtrip throughput for Manta, Orca, Panda, and LFC.
7.17 CRL read and write miss transactions.
7.18 Write miss latency.
7.19 Write miss throughput.
7.20 MPI unicast latency.
7.21 MPI unicast throughput.
7.22 MPI broadcast latency.
7.23 MPI broadcast throughput.

8.1 LFC unicast latency.
8.2 Peak throughput for different window sizes.
8.3 LFC unicast throughput.
8.4 Host-level and interface-level multicast forwarding.
8.5 LFC multicast latency.
8.6 LFC multicast throughput.
8.7 Application-level impact of efficient broadcast support.
8.8 Data and packet rates of NrxImc on 64 nodes.
8.9 Normalized application execution times.
8.10 Classification of communication systems for Myrinet.

A.1 A binomial spanning tree.
A.2 Multicast trees for overlapping multicast groups.

List of Tables

1.1 Classification of parallel-programming systems.
1.2 Communication characteristics of parallel-programming systems.

2.1 Throughputs on a Pentium Pro/Myrinet cluster.
2.2 Summary of design choices in existing communication systems.

3.1 LFC's user interface.
3.2 Resources used by the NI control program.
3.3 LFC's packet tags.

5.1 Values of the LogGP parameters for LFC.
5.2 Deadlock recovery statistics.

6.1 Send and receive interfaces of Panda's system module.
6.2 Classification of upcall models.

7.1 A comparison of Orca and Manta.
7.2 PPS performance summary.

8.1 Five versions of LFC's reliability and multicast protocols.
8.2 Division of work in the LFC implementations.
8.3 Parameters used by the protocols.
8.4 LogP parameter values (in microseconds) for a 16-byte message.
8.5 Fetch-and-add latencies.
8.6 Application characteristics and timings.
8.7 Classification of applications.


Chapter 1

Introduction

This thesis is about communication support for parallel-programming systems. A parallel-programming system (PPS) is a system that aids programmers who wish to exploit multiple processors to solve a single, computationally intensive problem. PPSs come in many flavors, depending on the programming paradigm they provide and the type of hardware they run on. This thesis concentrates on PPSs for clusters of computers. By a cluster we mean a collection of off-the-shelf computers interconnected by an off-the-shelf network. Clusters are easy to build, easy to extend, scale to large numbers of processors, and are relatively inexpensive.

The performance of a PPS depends critically on efficient communication mechanisms. At present, user-level communication systems offer the best communication performance. Such systems bypass the operating system on all critical communication paths so that user programs can benefit from the high performance of modern network technology. This thesis concentrates on the interactions between PPSs and user-level communication systems. We focus on three questions:

1. Which mechanisms should a user-level communication system provide?

2. How should parallel-programming systems use the mechanisms provided?

3. How should these mechanisms be implemented?

The relevance of these questions follows from the following observations. First, the functionality provided by user-level communication systems varies considerably (see Chapter 2). Systems differ in the types of data transfer primitives they offer (point-to-point message, multicast message, or remote-memory access), the way incoming data is detected (polling or interrupts), and the way incoming data is handled (explicit receive, upcall, or popup thread).


Second, most user-level communication systems provide low-level interfaces that often do not match the needs of the developers of a PPS. This is not a problem as long as higher-level abstractions can be layered efficiently on top of those interfaces. Unfortunately, many systems ignore this issue (see Section 1.3).

Third, where systems do provide similar functionality, they implement it in different ways (see Chapter 2). Many systems, for example, use a programmable network interface (NI) to execute part of the communication protocol. However, systems differ greatly in the type and amount of work that they off-load to the NI. Some systems minimize the amount of work performed on the NI —on the grounds that the NI processor is much slower than the host processor— while others run a complete reliability protocol on the NI.

This chapter proceeds as follows. Section 1.1 discusses PPSs in more detail and explains exactly which types of PPSs we address. Section 1.2 discusses user-level architectures and introduces the different layers of communication software we study. Section 1.3 introduces the specific problems that this thesis addresses. Section 1.4 states our contributions towards solving these problems. Section 1.5 discusses the software components in which we have implemented our ideas. Section 1.6 describes the environment in which we developed and evaluated our solutions. Finally, Section 1.7 describes the structure of the thesis.

1.1 Parallel-Programming Systems

The use of parallelism in and between computers is by now the rule rather than the exception. Within a modern processor, we find a pipelined CPU with multiple functional units, nonblocking caches, and write buffers. Surprisingly, this intraprocessor parallelism is largely hidden from the programmer by instruction set architectures that provide a more or less sequential programming model. Where parallelism does become visible (e.g., in VLIW architectures), compilers hide it once more from most application programmers. This way, application programmers have enjoyed advances in computer architecture without having to change the way they write their programs.

Programmers who want to exploit parallelism between processors have not been so fortunate. In specific application domains, or for specific types of programs, compilers can automatically detect and exploit parallelism of a sufficiently large grain to employ multiple computers efficiently. In most cases, however, programmers must guide compilers by means of annotations or write an explicitly parallel program. Unfortunately, writing efficient parallel programs is notoriously difficult. At the algorithmic level, programmers must find a compromise between locality and load balance. To achieve both, nontrivial techniques such as data replication, data migration, work stealing, and load balancing may have to be employed. At a lower level, programmers must deal with race conditions, message interrupts, message ordering, and memory consistency.

To simplify the programmer's task, many parallel-programming systems have been developed, each of which implements its own programming model. The most important programming models are:

• Shared memory

• Message passing

• Data-parallel programming

• Shared objects

In the shared-memory model, multiple threads share an address space and communicate by writing and reading shared-memory locations. Threads synchronize by means of atomic memory operations and higher-level primitives built upon these atomic operations (e.g., locks and condition variables). Multiprocessors directly support this programming model in hardware.

In the message-passing model, processes communicate by exchanging messages. Implementations of this model (e.g., PVM [137] and MPI [53]) usually provide several flavors of send and receive methods which differ, for example, in when the sender may continue and how the receiver selects the next message to receive.

In the shared-memory and message-passing models, the programmer has to deal with multiple threads or processes. A data-parallel program has only one thread of control and in many ways resembles a sequential program. For efficiency, however, the programmer provides annotations that indicate which loops can be executed in parallel and how data must be distributed.

In the shared-object model [7], processes can share data, provided that this data is encapsulated in an object. The data is accessed by invoking methods of the object's class. While this model is not as widely used in practice as the previous models, it has received a fair amount of attention in the parallel-programming research community.

Parallel-programming models used to be identified with a specific type of parallel hardware. Today, most vendors of parallel hardware support multiple programming models. The shared-memory model, for example, is a more natural match to a multiprocessor than the message-passing model, yet multiprocessor vendors also provide implementations of message-passing standards such as MPI.

At present, two types of hardware dominate both the commercial market place and parallel-programming research in academia: multiprocessors and clusters. In a multiprocessor, the hardware implements a single, shared memory that can be accessed by all processors through memory access instructions. In a workstation cluster, each processor has its own memory. Access to another processor's memory is not transparent and can only be achieved by some form of explicit message passing.

Programming model   Multiprocessor implementations   Cluster implementations
Shared memory       Pthreads [110]                   Shasta [125], Tempest [119], TreadMarks [78]
Message passing     MPI [53, 140]                    MPI [53, 61], PVM [137]
Data-parallel       OpenMP [42]                      HPF [81]
Shared objects      Java [60]                        Orca [7, 9], CRL [74], Java/RMI [99]

Table 1.1. Classification of parallel-programming systems.
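To make the message-passing entry of Table 1.1 concrete, the sketch below is a minimal MPI program in C in which process 1 sends an array of doubles to process 0. The ranks, tag value, and buffer contents are arbitrary illustrative choices; the send and receive flavors mentioned in the text differ precisely in when such calls may return and which message they select.

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank;
    double buf[4] = { 0.0, 1.0, 2.0, 3.0 };

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 1) {
        /* Blocking send: returns when the buffer may be reused. */
        MPI_Send(buf, 4, MPI_DOUBLE, 0, 42, MPI_COMM_WORLD);
    } else if (rank == 0) {
        /* Blocking receive: selects the next message by source and tag. */
        MPI_Recv(buf, 4, MPI_DOUBLE, 1, 42, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
        printf("received %f %f %f %f\n", buf[0], buf[1], buf[2], buf[3]);
    }

    MPI_Finalize();
    return 0;
}
```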

All four programming models mentioned above have been implemented on both types of hardware (see Table 1.1). Implementations on cluster hardware tend to be more complex than multiprocessor implementations because efficient state sharing is more difficult on a cluster than on a multiprocessor. This thesis addresses only communication support for clusters. On clusters, models such as the shared-object model can be implemented only by software that hides the hardware's distributed nature.

All PPSs discussed in this thesis (except Multigame [122], which we discuss only briefly) require the programmer to write a parallel program. With some systems, however, this is easier than with others. The Orca shared-object system [7, 9], for example, is language-based. Programmers therefore need not write any code to marshal objects or parameters of operations because this code is either generated by the compiler or hidden in the language runtime system. In a library-based system such as MPI, in contrast, programmers must either write their own marshaling code or create type descriptors.

At a higher level, consider locality and load balancing. Distributed shared memory systems like CRL [74] and Orca replicate data transparently to optimize read accesses, thus improving locality. Systems such as Cilk [55] and Multigame use a work-stealing runtime system that automatically balances the load among all processors. An MPI programmer who wishes to replicate a data item must do so explicitly, without any help from the system to keep the replicas consistent. Similarly, MPI programmers have to implement their own load balancing schemes.

The conveniences of high-level programming systems do not come for free. In a recent study [97], for example, Lu et al. show that parallel applications written for the TreadMarks [78] distributed shared-memory system (DSM) send more data and more messages than equivalent message-passing programs. The reasons are that TreadMarks cannot combine data transfer and synchronization in a single message transfer, cannot transfer data from multiple memory pages in a single message, and suffers from false sharing and diff accumulation. (Diff accumulation is specific to the consistency protocol used by TreadMarks. The result of diff accumulation is that processors communicate to obtain old versions of shared data that have already been overwritten by newer versions.)

Orca provides a second example. Although Orca's shared-object programming model is closer to message passing than TreadMarks's shared-memory model, Orca programs also send more messages than is strictly needed. The current implementation of Orca [9], for example, broadcasts operations on a replicated object to all processors, even if the object is replicated on only a few processors. Also, operations on nonreplicated objects are always executed by means of remote procedure calls. Each such operation results in a request and a reply message, even if no reply is needed.

These examples illustrate that, with high-level PPSs, application programmers no longer control the implementation of high-level functions and must rely on generic implementations. This results in increased message rates and increased message sizes. In addition, high-level PPSs add message-processing overheads in the form of marshaling and message-handler dispatch. Consequently, efficient communication support is of prime importance to these systems.

1.2 Communication architectures for parallel-programming systems

To achieve efficient communication, we need both efficient hardware and efficient software. While high-bandwidth, low-latency, and scalable network hardware is still expensive, such hardware is available and can be used readily outside supercomputer cabinets. Examples of such hardware include GigaNet [130], Memory Channel [57], and Myrinet [27]. On these systems, network bandwidth is measured in Gigabits per second and network latency in microseconds. Most of these networks have low bit-error rates. Finally, network interfaces have become more flexible and are sometimes programmable.

In many ways, software is the main obstacle on the road to efficient communication. Software overhead comes in many forms: system calls, copying, interrupts, thread switches, etc. Part of the software problem has been solved by the introduction of user-level communication architectures which give user processes direct access to the network. In most user-level communication systems, users must initialize communication segments and channels by means of system calls. After this initialization phase, however, kernel mediation is no longer needed, or needed only in exceptional cases.

Fig. 1.1. Communication layers studied in this thesis: parallel applications, PPS-specific software (compiler and/or runtime system), communication libraries, network interface protocols, and network hardware.

Communication support for a PPS can be divided into three layers of software (see Figure 1.1). At the bottom, above the network hardware, we find network interface protocols. These protocols perform two functions: they control the network device and implement a low-level communication abstraction that is used by all higher layers.

Since NI protocols are often low-level, most (but not all) PPSs use a communication library that implements message abstractions and higher-level communication primitives (e.g., remote procedure call).

The top layer implements a PPS's programming model. In general, this layer consists of a compiler and a runtime system. Not all PPSs are based on a programming language, however, so not all PPSs use a compiler. In principle, PPSs based on a programming language do not need a runtime system: a compiler can generate all code that needs to be executed. In practice, though, PPSs always use some runtime support.

Since communication events frequently cut through all these layers, application-level performance is determined by the way these layers cooperate. In particular, high performance at one layer is of no use if this layer offers an inconvenient interface to higher layers. Our goal in this thesis is to find mechanisms and interfaces that work well at all layers.

1.3 Problems

This thesis addresses three problems:

1. The data transfer methods provided by user-level communication systems often do not match the needs of PPSs.

2. The control transfer methods provided by user-level communication systems do not match the needs of PPSs.

3. Design alternatives for reliable point-to-point and multicast implementations for modern networks are poorly understood.

1.3.1 Data-Transfer Mismatches

Achieving efficient data transfer at all layers in a PPS is hard. A data transfer problem that has received much attention is the high cost of memory copies that take place as data travels from one layer to another. Since redundant memory copies decrease application-to-application throughput (see Chapter 2), much user-level communication work has focused on avoiding unnecessary copies [13, 24, 32, 49]. Most solutions involve the use of DMA transfers between the user's address space and the NI. Unfortunately, these solutions cannot always be used effectively by PPSs.

At the sending side, some systems (e.g., U-Net [147] and Hamlyn [32]) can transfer data efficiently (using DMA) when it is stored in a special send segment in the sender's host memory. For a specific application it may be feasible to store the data that needs to be communicated (in a certain period of time) in a send segment. A PPS, however, would have to do this for all applications. If this is not possible, senders must first copy their data into the send segment. With this extra copy, the advantage of a low-level high-speed data transfer mechanism is lost.

To improve performance at the receiving side, several systems (e.g., Hamlyn and SHRIMP [24]) offer an interface that allows a sender to specify a destination address in a receiver's address space. This allows the communication system to move incoming data directly from the NI to its destination address. A message-passing system, on the other hand, would first copy the data to some network buffer and would later, when the receiver specifies a destination address, copy the data to its final destination. This may appear inefficient, but in many cases the sender simply does not know the destination address of the data that needs to be sent. In such cases, higher layers (e.g., PPSs) are forced to negotiate a destination address before the actual data transfer takes place. Such a negotiation may introduce up to two extra messages per data transfer and is therefore expensive for small data transfers. (Alternatively, the sender can transfer the data to a default buffer with a known address and have the receiver copy the data from that buffer. Such a scheme can be extended so that the receiver can post a destination buffer that replaces the default buffer [49]. If the destination buffer is posted in time, no extra copy is needed.)

The need to avoid copying should be balanced against the cost of managing asynchronous data transfer mechanisms and negotiating destination addresses. Also, not all communication overhead results from a lack of bandwidth. In their study of Split-C applications [102], Martin et al. found that applications are sensitive to send and receive overhead, but tolerate lower bandwidths fairly well.

A data transfer problem that has received less attention, but that is important in many PPSs, is the efficient transfer of data from a single sender to multiple receivers. Most existing user-level communication systems focus on high-speed data transfers from a single sender to a single receiver and do not support multicast or support it poorly. This is unfortunate, because multicast plays an important role in many PPSs.

MPICH [61], a widely used implementation of MPI, for example, uses multicast in its implementation of collective communication operations (e.g., reduce, gather, scatter). Collective communication operations are used in many parallel algorithms. Efficient multicast primitives have also proved their value in the implementation of DSMs such as Orca [9, 75] and Brazos [131], which use multicast to update replicated data.

The lack of efficient multicast primitives in user-level communication systems forces PPSs (or underlying communication libraries) to implement their own multicast on top of point-to-point primitives. This is frequently inefficient. Naive implementations let the sender send a single message to each receiver, which turns the sender into a serial bottleneck. Smarter implementations are based on spanning trees. In these implementations, the sender transmits a message to a few nodes (its children), which then forward the message to their children, and so on until all nodes have been reached. This strategy allows the message to travel between different sender-receiver pairs in parallel.

Even a tree-based multicast can be inefficient if it is layered on point-to-point primitives. At the forwarding nodes, data travels up from the NI to the host. If the host does not poll, forwarding will either be delayed or the data will be delivered by means of an expensive interrupt. To forward the data, the host has to reinject the data into the network. This data transfer is unnecessary, because the data was already present on the NI.
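As a concrete illustration of spanning-tree forwarding, the sketch below computes, for a hypothetical binary multicast tree rooted at the sender, the nodes to which a given node must forward a packet. It is illustrative only; the tree shapes actually used by LFC are discussed in Chapters 4 and 8.

```c
/* Illustrative only: children of `me` in a binary multicast tree over
 * `nprocs` nodes rooted at `root`.  Ranks are renumbered so that the
 * root becomes 0; the children of renumbered node n are 2n+1 and 2n+2. */
static int
multicast_children(int me, int root, int nprocs, int children[2])
{
    int rel = (me - root + nprocs) % nprocs;   /* my rank relative to the root */
    int nchildren = 0;
    int i;

    for (i = 1; i <= 2; i++) {
        int child = 2 * rel + i;
        if (child < nprocs)
            children[nchildren++] = (child + root) % nprocs;
    }
    return nchildren;
}

/* A forwarding node transmits the packet to each node returned here.
 * Every node forwards at most twice, so no single sender becomes a
 * serial bottleneck, and transfers down different branches of the tree
 * proceed in parallel. */
```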

1.3.2 Control-Transfer Mismatches

Control transfer has two components: detecting incoming messages and executing handlers for incoming messages. Most user-level communication systems allow their clients to poll for incoming messages, because receiving a message through polling is much more efficient than receiving it through an interrupt. The problem for a PPS is to decide when to poll, especially if the processing of incoming messages is not always tied to an explicit receive call by the application. Again, for a specific application this need not be a hard problem, but getting it right for all applications is difficult.

In DSMs, for example, processes do not interact with each other directly, but only through shared data items. When a process accesses a shared data item that is stored remotely, it typically sends an access request to the remote processor.

The remote processor must respond to the request, even if none of its processes are accessing the data item. If the remote processor is not polling the network when the request arrives, the reply will be delayed. This can be solved by using interrupts, but these are expensive.

Once a message has been detected, it needs to be handled. An interesting problem is in which execution context messages should be handled. Allocating a thread to each message is conceptually clean, but potentially expensive, both in time and space. Moreover, dispatching messages to multiple threads can cause messages to be processed in a different order than they were received. Sometimes this is necessary; at other times, it should be avoided. Executing application code and message handlers in the same thread gives good performance, but can lead to deadlock when message handlers block.
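To make the polling side of this concrete, the fragment below sketches one common structure; it is purely illustrative, and the function and type names are invented for the example rather than taken from any of the systems discussed here. A compute loop periodically polls the network and dispatches each detected packet to a handler that runs in the context of the polling thread.

```c
/* Hypothetical polling-based dispatch loop; net_poll() and the handler
 * table are invented names for illustration. */
typedef void (*handler_fn)(void *data, unsigned len);

extern handler_fn handler_table[];          /* indexed by message type */
extern int net_poll(unsigned *type, void **data, unsigned *len);

static void compute_step(void);
static int work_left(void);

void compute_loop(void)
{
    while (work_left()) {
        compute_step();                     /* application work */

        /* Drain all pending packets; each handler runs in this thread,
         * so a handler that blocks would stall the computation. */
        unsigned type, len;
        void *data;
        while (net_poll(&type, &data, &len))
            handler_table[type](data, len);
    }
}
```

A process that leaves this loop, or whose compute step runs long, stops polling and delays replies to remote requests, which is exactly the dilemma described above; interrupts remove the delay but add cost to every message.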

1.3.3 Design Alternatives for User-Level Communication Architectures

The low-level communication protocols of user-level architectures differ substantially from traditional protocols such as TCP/IP. For example, many protocols now use the NI to implement reliability [37, 49], multicast [16, 56, 146], network mapping [100], performance monitoring [93, 103], and address translations and (remote) memory access [13, 24, 32, 51, 83, 126]. Current user-level communication systems make different design decisions in these areas. Specifically, different systems divide similar protocol tasks between the host and the NI in different ways. In some systems, for example, the NI is responsible for implementing reliable communication. In other systems, this task is left to the host processor.

The impact of different design choices on the performance of PPSs and applications has hardly been investigated. Such an investigation is difficult, because existing architectures implement different functionality and provide different programming interfaces. Moreover, many studies use only low-level microbenchmarks that ignore performance at higher layers [5].

1.4 Contributions

The main contributions of this thesis towards solving the problems presented in the previous section are as follows:

1. We show that a small set of simple, low-level communication mechanisms can be employed effectively to obtain efficient PPS implementations that do not suffer from the data and control-transfer mismatches described in the previous section. Some of these communication mechanisms rely on NI support (see below).

2. We have designed and implemented a new, efficient, and reliable multicast algorithm that uses the NI instead of the host to forward multicast traffic.

3. To gain insight into the tradeoffs involved in choosing between different NI protocol implementations, we have implemented a single user-level communication interface in multiple ways. These implementations differ in whether the host or the NI is responsible for reliability and multicast forwarding. Using these implementations, we have systematically investigated the impact of different design choices on the performance of higher-level systems (one communication library and multiple PPSs) and applications.

The communication mechanisms mentioned have been implemented in a new, user-level communication system called LFC. LFC and its mechanisms are summarized in Section 1.5 and described in detail in Chapters 3 to 5. On top of LFC we have implemented a communication library, Panda, that

• provides an efficient (stream) message abstraction

• transparently switches between using polling and interrupts to detect incoming messages

• implements an efficient totally-ordered broadcast primitive

Using LFC and Panda we have implemented and ported various PPSs (see Section 1.5). We describe how we modified some of these PPSs to benefit from LFC's and Panda's communication support.

1.5 Implementation

We have implemented our solutions in various communication systems (see Figure 1.2). These systems cover all communication layers shown in Figure 1.1: NI protocol, communication library, and PPS. The following section introduces the individual systems, layer by layer, bottom-up.

1.5.1 LFC Implementations

We have developed a new NI protocol called LFC [17, 18]. LFC provides reliable point-to-point and multicast communication and runs on Myrinet [27], a modern, switched, high-speed network.

Fig. 1.2. Structure of this thesis. (The figure shows the software layers: parallel applications (Ch. 8); the PPSs Orca (Ch. 7), MPI (Ch. 7), Parallel Java (Ch. 7), Multigame (Ch. 8), and CRL (Ch. 7); Panda (Ch. 6); LFC (Chs. 3-5); and Myrinet (Ch. 1).)

LFC has a low-level, packet-based interface. By exposing network packets, LFC allows its clients to avoid most redundant memory copies. Packets can be received through polling or interrupts and clients can dynamically switch between the two. LFC delivers incoming packets by means of upcalls, as in Active Messages [148].

We have implemented LFC's point-to-point and multicast primitives in five different ways. The implementations differ in their reliability assumptions and in how they divide protocol work between the NI and the host. Chapters 3 to 5 describe the most aggressive of these implementations in detail. This implementation exploits Myrinet's programmable NI to implement the following:

1. Reliable point-to-point communication.

2. An efficient, reliable multicast primitive.

3. A mechanism to reduce interrupt overhead (a polling watchdog).

4. A fetch-and-add primitive for global synchronization.

This implementation achieves reliable point-to-point communication by means of an NI-level flow control protocol that assumes that the network hardware is reliable. A simple extension of this flow control protocol allows us to implement an efficient, reliable, NI-level multicast. In this multicast implementation, packets need not travel to the host and back before they are forwarded. Instead, the NI recognizes multicast packets and forwards them to children in the multicast tree without host intervention.

4. A fetch-and-add primitive for global synchronization. This implementation achieves reliable point-to-point communication by means of an NI-level flow control protocol that assumes that the network hardware is reliable. A simple extension of this flow control protocol allows us to implement an efficient, reliable, NI-level multicast. In this multicast implementation, packets need not travel to the host and back before they are forwarded. Instead, the NI recognizes multicast packets and forwards them to children in the multicast tree without host intervention. The implementation described above performs much protocol work on the programmable NI and assumes that the network hardware does not drop or cor- rupt network packets. Our other implementations, described in Chapter 8, differ mainly in where and how they implement reliability and multicast forwarding. Our most conservative implementation, for example, assumes unreliable hardware 12 Introduction and implements both reliability (including retransmission) and multicast forward- ing on the host. LFC has been used to implement or port several systems, including CRL [74], Fast Sockets [120], Parallel Java [99], MPICH [61], Multigame [122], Orca [9], Panda [19, 124], and TreadMarks [78]. Several of these systems are described in this thesis; they are introduced in the sections below.

1.5.2 Panda

Panda is a portable communication library that provides threads, messages, reliable message passing, Remote Procedure Call (RPC), and totally-ordered group communication. Using Panda, we have implemented various PPSs (see below).

To implement its abstractions efficiently, Panda exploits LFC's packet-based interface and its efficient multicast, fetch-and-add, and interrupt-management primitives. Panda uses LFC's packet interface to implement an efficient message abstraction which allows end-to-end pipelining of message data and which allows applications to delay message processing without copying message data. All incoming messages are handled by a single, dedicated thread, which solves part of the blocking-upcall problem. To automatically switch between polling and interrupts, Panda integrates thread management and communication such that the network is automatically polled when all threads are idle. Finally, Panda uses LFC's NI-level multicast and synchronization primitives to implement an efficient totally-ordered multicast primitive.
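The idle-time polling idea can be pictured with a small sketch; this is hypothetical pseudo-C, not Panda's actual scheduler code, and the function names are invented. When the thread scheduler finds no runnable thread, it polls the network instead of blocking, and it falls back to interrupt-driven delivery only when it must truly sleep.

```c
/* Hypothetical idle loop of a user-level thread scheduler that integrates
 * communication, in the spirit of the scheme described above. */
extern int  runnable_threads(void);
extern void schedule_next_thread(void);
extern void net_poll(void);                /* deliver pending packets */
extern void net_enable_interrupts(void);
extern void net_disable_interrupts(void);
extern void block_until_interrupt(void);

void scheduler_idle(void)
{
    while (!runnable_threads()) {
        net_poll();                        /* cheap check while idle */
        if (!runnable_threads()) {
            /* Nothing arrived and nothing to run: fall back to
             * interrupts so a remote request is still served promptly. */
            net_enable_interrupts();
            block_until_interrupt();
            net_disable_interrupts();
        }
    }
    schedule_next_thread();
}
```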

1.5.3 Parallel-Programming Systems

In this thesis, we use five PPSs to test our ideas: Orca, CRL, MPI, Manta, and Multigame.

Orca is a DSM system based on the notion of user-defined shared objects. Jointly, the Orca compiler and runtime system automatically replicate and migrate shared objects to improve locality. To perform operations on shared objects, Orca uses Panda's threads, RPC, and totally-ordered group communication.

Like Orca, CRL is a DSM system. CRL processes share chunks of memory which they can map into their address space. Applications must bracket their accesses to these regions by library calls so that the CRL runtime system can keep the shared regions in a consistent state. Unlike Orca, which updates replicated shared objects, CRL uses invalidation to maintain region-level coherence.

MPI is a message-passing standard. Parallel applications are often developed directly on top of MPI, but MPI is also used as a compiler target. Our implementation of MPI is based on MPICH [61] and Panda. MPICH is a portable and widely used public-domain implementation of MPI that is being developed jointly by Argonne National Laboratories and Mississippi State University.

PPS         Communication patterns   Message detection      # contexts
Orca        Roundtrip + broadcast    Polling + interrupts   1
CRL         Roundtrip                Polling + interrupts   0
MPI         One-way + broadcast      Polling                0
Manta       Roundtrip                Polling + interrupts   ≥ 1
Multigame   One-way                  Polling                0

Table 1.2. Communication characteristics of parallel-programming systems.

Manta is a PPS that allows Java [60] threads to communicate by invoking methods on shared objects, as in Orca. Superficially, Manta has simpler communication requirements than Orca, because Manta does not replicate shared objects and therefore requires only RPC-style communication. In reality, however, several features of Java that are not present in Orca —specifically, garbage collection and condition synchronization at arbitrary program points— lead to more complex interactions with the communication system.

Multigame is a declarative parallel game-playing system. Given the rules of a board game and a board evaluation function, Multigame automatically searches for good moves, using one of several search strategies (e.g., IDA* or alpha-beta). During a search, processors push search jobs to each other (using one-way messages). A job consists of (recursively) evaluating a board position. To avoid re-searching positions, positions are cached in a distributed hash table.

Not only do these PPSs cover a range of programming models, they also have different communication requirements. Table 1.2 summarizes the main communication characteristics of all five PPSs. (A more detailed discussion of these characteristics appears in Chapters 7 and 8.) The second column lists the most common type of communication pattern used in each PPS. The third column shows how each PPS detects incoming messages. All DSMs (Orca, CRL, and Manta) use a combination of polling and interrupts. The fourth column shows how many independent message-handler contexts can be active in each PPS. In CRL, MPI, and Multigame, all incoming messages are processed by handlers that run in the same context as the main computation (i.e., as procedure calls), so there are no independent handler contexts. In Orca, all handlers are run by a dedicated thread. Finally, Manta creates a new thread for each incoming message.

1.6 Experimental Environment

For most of the experiments described in this thesis, we use a cluster that consists of 128 computers, which are interconnected by a Myrinet [27] network. Before describing the hardware in detail, we first position our cluster architecture.

1.6.1 Cluster Position

The cluster used in this thesis is a compromise between a traditional supercomputer and LAN technology. Unlike in a traditional supercomputer, the NI connects to the host's I/O bus and is therefore far away from the host processor. This is a typical organization for commodity hardware, but aggressive supercomputer architectures integrate the NI more tightly with the host system by placing it on the memory bus or integrating it with the cache controller. These architectures allow for very low network access latencies and simpler protection schemes [24, 85, 108]. In this thesis, however, we focus on architectures that use off-the-shelf hardware and rely on advanced software techniques to achieve efficient user-level network access. This type of architecture (i.e., with the NI residing on the host's I/O bus) is now also found in several commercial parallel machines (e.g., the IBM SP/2).

The cluster's network, Myrinet, is not as widely used as the ubiquitous Ethernet LAN. As a parallel-processing platform, however, Myrinet is used in many places, both in academia and industry. Myrinet's NI contains a programmable, custom RISC processor and fast SRAM memory. The network consists of high-bandwidth links and switches; network packets are cut-through-routed through these links and switches. These features make Myrinet (and similar products) much more expensive than Ethernet, the traditional bus-based LAN technology. While Ethernet can be switched to obtain high bandwidth, most Ethernet NIs are not programmable.

Our main reason for using Myrinet is that its programmable NI enables experimentation with different protocols, mechanisms, and interfaces. This type of experimentation is not specific to Myrinet and has also been performed with other programmable NIs [34, 147]. The main problem is often the vendors' reluctance to release information that allows other parties to program their NIs. Myricom, the company that develops and sells Myrinet, does provide the documentation and tools that enable customers to program the NI.

1.6.2 Hardware Details

Each cluster node contains a single Siemens-Nixdorf D983 motherboard, an Intel 440FX PCI chipset, and an Intel Pentium Pro [70] processor. The nodes are connected via 32 8-port Myrinet switches, which are organized in a three-dimensional grid topology. Figure 1.3 shows the switch topology. Figure 1.4 shows one vertical plane of the switch topology. Each switch connects to 4 cluster nodes and to neighboring switches. (Figure 1.4 shows only the connections to switches in the same vertical plane.) The cluster nodes can also communicate through a FastEthernet network (not shown in Figures 1.3 and 1.4). FastEthernet is used for process startup, file transfers, and terminal traffic.

Fig. 1.3. Myrinet switch topology.

Fig. 1.4. Switches and cluster nodes. Each node has its own network interface which connects to a crossbar switch. Switches connect to network interfaces and other switches.

The architecture of a single cluster node is illustrated in Figure 1.5. Each node contains a single Pentium Pro processor and 128 Mbyte of DRAM. The Pentium Pro is a 200 MHz, three-way superscalar processor. It has two on-chip first-level caches: an 8 Kbyte, 2-way set-associative data cache and an 8 Kbyte, 4-way set-associative instruction cache. In addition, a unified, 256 KByte, 4-way set-associative second-level cache is packaged along with the processor core. The cache line size is 32 bytes. The processor-memory bus is 64 bits wide and clocked at 66 MHz. The I/O bus is a 33 MHz, 32-bit PCI bus. The Myrinet NI is attached to the I/O bus.

Fig. 1.5. Cluster node architecture. (The figure shows the host CPU, cache, and host memory connected via the host bus and a bridge to the PCI bus, which holds the network interface with its NI memory and its host, send, and receive DMA engines.)

To obtain accurate timings in our performance measurements, we use the Pentium Pro's 64-bit timestamp counter [71]. The timestamp counter is a clock with clock cycle (5 ns) resolution. A simple pseudo device driver makes this clock available to unprivileged users. The counter can be read (from user space) using a single instruction, so the use of this fine-grain clock imposes little overhead (approximately 0.17 µs).

Myrinet is a high-speed, switched LAN technology [27]. Switched technologies scale well to a large number of hosts, because bandwidth increases as more hosts and switches are added to the network. Unlike Ethernet, Myrinet provides no hardware support for multicast. General-purpose multicasting for wormhole-routed networks is a complex problem and an area of active research [136, 135].

Unlike traditional NIs, Myrinet's NI is programmable; it contains a custom processor which can be programmed in C or assembly. The processor, a 33 MHz LANai4.1, is controlled by the LANai control program. The processor is effectively an order of magnitude slower than the 200 MHz, superscalar host processor. The NI is equipped with 1 Mbyte of SRAM memory which holds both the code and the data for the control program. The currently available Myrinet NIs require that all network packets be staged through this memory, both at the sending and the receiving side. SRAM is fast, but expensive, so the amount of memory is relatively small. Other networks allow data to be transferred directly between host memory and the network (e.g., using memory-mapped FIFOs or DMA).

The NI contains three DMA engines. The host DMA engine transfers data between host memory and NI memory. Host memory buffers that are accessed by this DMA engine should be pinned (i.e., marked unswappable) to prevent the operating system from paging these pages to disk during a DMA transfer. The send DMA engine transfers packets from NI memory to the outgoing network link; the receive DMA engine transfers incoming packets from the network link to NI memory. The DMA engines can run in parallel, but the NI's memory allows at most two memory accesses per cycle, one to or from the PCI bus and one to or from the NI processor, the send DMA engine, or the receive DMA engine.

Myrinet NIs connect to each other via 8-port crossbar switches and high-speed links (1.28 Gbit/s in each direction). The Myrinet hardware uses routing and switching techniques similar to those used in massively parallel processors (MPPs) such as the Thinking Machines CM-5, the Cray T3D and T3E, and the Intel Paragon. Network packets are cut-through-routed from a source to a destination network interface. Each packet starts with a sequence of routing bytes, one byte per switch on the path from source to destination. Each switch consumes one routing byte and uses the byte's value to decide on which output port the packet must be forwarded. Myrinet implements a hardware flow control protocol between pairs of communicating NIs which makes packet loss very unlikely. If all senders on the network agree on a deadlock-free routing scheme and the NIs' control programs remove incoming packets in a timely manner, then the network can be considered reliable (see also Section 3.9 and Chapter 8). While in transit, a packet may become blocked, either because part of the path it follows is occupied by another packet, or because the destination NI fails to drain the network.
In the first case, Myrinet will kill the packet if it remains blocked for more than 50 milliseconds. In the second case, the Myrinet hardware sends a reset signal to the destination NI after a timeout interval has elapsed. The length of this interval can be set in software. Both measures are needed to break deadlocks in the network. If the network did not do this, a malicious or faulty program could block another program’s packets.
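Returning briefly to the timing methodology above: the Pentium Pro’s timestamp counter is read with the RDTSC instruction. The following is a minimal measurement sketch, assuming GCC inline assembly on x86 and the 200 MHz clock of our nodes; it is an illustration, not code from the thesis.

#include <stdio.h>

/* Read the 64-bit timestamp counter (cycle count since reset). */
static inline unsigned long long read_tsc(void)
{
    unsigned int lo, hi;
    __asm__ __volatile__("rdtsc" : "=a"(lo), "=d"(hi));
    return ((unsigned long long)hi << 32) | lo;
}

int main(void)
{
    unsigned long long start = read_tsc();
    /* ... code under measurement ... */
    unsigned long long end = read_tsc();
    /* 200 cycles per microsecond at 200 MHz. */
    printf("elapsed: %.2f us\n", (end - start) / 200.0);
    return 0;
}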

1.7 Thesis Outline

This chapter introduced our area of research, communication support for PPSs. We sketched the problems in this area and our approach to solving them. Chapter 2 surveys the main design issues for user-level communication architectures and shows that existing systems resolve these issues in widely different ways, which illustrates that the tradeoffs are still unclear. Chapter 3 describes the design and implementation of our most optimistic LFC implementation. Chapter 4 gives a detailed description of the NI-level protocols employed by this LFC implementation. Chapter 5 evaluates the implementation’s performance. Chapter 6 describes Panda and its interactions with LFC. Chapter 7 describes the implementation of four PPSs: Orca, CRL, MPI, and Manta. We show how LFC and Panda enable efficient implementations of these systems and describe additional optimizations used within these systems. Chapter 8 compares the performance of five LFC implementations at multiple levels: the NI protocol level, the PPS level, and the application level. Finally, in Chapter 9, we draw conclusions.

Chapter 2

Network Interface Protocols

High-speed networks such as Myrinet offer great potential for communication-intensive applications. Unfortunately, traditional communication protocols such as TCP/IP are unable to realize this potential. In the common implementation of these protocols, all network access is through the operating system, which adds significant overheads to both the transmission path (typically a system call and a data copy) and the receive path (typically an interrupt and a data copy). In response to this performance problem, several user-level communication architectures have been developed that remove the operating system from the critical communication path [48, 147]. This chapter provides insight into the design issues for communication protocols for these architectures. We concentrate on issues that determine the performance and semantics of a communication system: data transfer, address translation, protection, control transfer, reliability, and multicast.

In this chapter, we use the following systems to illustrate the design issues: Active Messages II (AM-II) [37], BIP [118], Illinois Fast Messages (FM) [112, 113], FM/MC [10, 146], Hamlyn [32], PM [141, 142], U-Net [13, 147], VMMC [24], VMMC-2 [49, 50], and Trapeze [154]. All systems aim for high performance and all except Trapeze offer a user-level communication service. Interestingly, however, they differ significantly in how they resolve the different design issues. It is this variety that motivates our study.

This chapter is structured as follows. Section 2.1 explains the basic principles of NI protocols by describing a simple, unreliable, user-level protocol. Next, in Sections 2.2 to 2.7, we discuss the six protocol design issues that determine system performance and semantics. We illustrate these issues by giving performance data obtained on our Myrinet cluster.



Fig. 2.1. Operation of the basic protocol. The dashed arrows are buffer pointers. The numbered arrows represent the following steps. (1) Host copies user data into DMA area. (2) Host writes packet descriptor to send ring. (3) NI processor reads packet descriptor. (4) NI DMAs packet to NI memory. (5) Network transfer. (6) NI reads receive ring to find empty buffer in DMA area. (7) NI DMAs packet to DMA area. (8) Optional memory copy to user buffer by the host processor.

2.1 A Basic Network Interface Protocol

The goal of this section is to explain the basics of NI protocols and to introduce the most important design issues. To structure our discussion, we first describe the design of a simple, user-level NI protocol for Myrinet. The protocol ignores several important problems, which we address in subsequent sections.

To avoid the cost of kernel calls for each network access, the basic protocol maps all NI memory into user space. User processes write their send requests directly to NI memory, without operating system (OS) involvement. The basic protocol provides no protection, so the network device cannot be shared among multiple processes.

User processes invoke a simple send primitive to send a data packet. The basic protocol sends packets with a maximum payload of 256 bytes and requires that users fragment their data so that each fragment fits in a packet. The signature of the send primitive is as follows:

void send(int destination, void *data, unsigned size);

Send() performs two actions (see Figure 2.1). First, it copies the user data to a packet buffer in a special staging area in host memory (step 1). The NI will later fetch the packet from this DMA area by means of a DMA transfer. Unlike normal user pages, pages in the DMA area are never swapped to disk by the OS. By only DMAing to and from such pinned pages, the protocol avoids corruption of user memory and user messages due to paging activity of the OS.

Second, the host writes a send request into a descriptor in NI memory (step 2). These descriptors are stored in a circular buffer called the send ring. Send() stores the identifier of the destination machine, the size of the packet’s payload, and the packet’s offset in the DMA area into the next available descriptor in this send ring. To inform the NI of this event, send() also sets a DescriptorReady flag in the descriptor. This flag prevents races between the host and the NI: the NI will not read any other descriptor fields until this flag has been set. The descriptor is written using programmed I/O; since the descriptor is small, DMA would have a high overhead.

The NI repeatedly polls the DescriptorReady flag of the first descriptor in the send ring. As soon as this flag is set by the host, the NI reads the offset in the descriptor and adds it to the physical address of the start of the DMA area, resulting in the physical address of the packet (step 3). Next, the NI initiates a DMA transfer over the I/O bus to copy the packet’s payload from host memory to NI memory (step 4). Subsequently, it reads the destination machine in the descriptor and looks up the route for the packet in a routing table. The route and a packet header that contains the packet’s size are prepended to the packet. Finally, the NI starts a second DMA to transmit the packet (step 5).

When the sending NI detects that the network DMA for a given packet has completed, it sets a DescriptorFree flag to release the descriptor, and polls the next free descriptor in the ring. If the host wants to send a packet while no descriptor is available, it busy-waits by polling the DescriptorFree flag of the descriptor at the tail of the ring.

NIs use receive DMAs to store incoming packets in their memory. Each NI contains a receive ring with descriptors that point to free buffers in the host’s DMA area. The NI uses the receive ring’s descriptors to determine where (in host memory) to store incoming packets. After receiving a packet, the NI tries to acquire a descriptor from the receive ring (step 6). Each descriptor contains a flag bit that is used in a similar way as for the send ring. If no free host buffer is available, the packet is simply dropped. Otherwise, the NI starts a DMA to transfer the packet to host memory (step 7). Each host buffer also contains a flag that is set by the NI as the last part of its NI-to-host DMA transfer. The host can check if there is a packet available by polling the flag of the next unprocessed host receive buffer. Once the flag has been set, the receiving process can safely read the buffer and optionally copy its contents to a user buffer (step 8).

The data transfers in steps 1, 4, 5, 7, and 8 can all be performed concurrently. For example, if the host sends a long, multipacket message, one packet’s network DMA can be overlapped with the next packet’s host-to-NI DMA. Exploiting this concurrency is essential for achieving high throughput.
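As an illustration of the send path just described, the sketch below shows how the host side could look in C. The descriptor layout, the ring size, and all names are assumptions made for the example; only the steps themselves (copy into the DMA area, fill a descriptor with programmed I/O, set DescriptorReady last) follow the protocol described above.

#include <string.h>

#define MAX_PAYLOAD   256
#define RING_ENTRIES   32

/* One send-ring descriptor in NI memory, written with programmed I/O.
 * Field names mirror the flags described in the text; the exact layout
 * is an assumption. */
typedef struct {
    volatile unsigned ready;   /* DescriptorReady: written last by the host  */
    volatile unsigned free;    /* DescriptorFree: set by the NI after step 5 */
    unsigned          dest;    /* destination machine                        */
    unsigned          size;    /* payload size in bytes                      */
    unsigned          offset;  /* packet offset within the pinned DMA area   */
} send_desc;

extern send_desc *send_ring;   /* mapped NI memory (descriptors start free)  */
extern char      *dma_area;    /* pinned staging area in host memory         */
static unsigned   tail;        /* next descriptor to use                     */

void send(int destination, void *data, unsigned size)
{
    send_desc *d   = &send_ring[tail];
    unsigned   off = tail * MAX_PAYLOAD;

    while (!d->free)                         /* busy-wait for a free descriptor */
        ;
    memcpy(dma_area + off, data, size);      /* step 1: copy into the DMA area  */
    d->dest   = destination;                 /* step 2: fill the descriptor     */
    d->size   = size;
    d->offset = off;
    d->free   = 0;
    d->ready  = 1;                           /* written last; the NI polls this */
    tail = (tail + 1) % RING_ENTRIES;
}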

Since the delivery of network interrupts to user-level processes is expensive on current OSs, the basic protocol does not use interrupts, but requires users to poll for incoming messages. A successful call to poll() results in the invocation of a user function, handle_packet(), that handles an inbound packet:

void poll(void);
void handle_packet(void *data, unsigned size);

Our (unoptimized) implementation of the basic protocol achieves a one-way host-to-host latency of 11 µs and a throughput of 33 Mbyte/s (using 256-byte packets). For comparison, on the same hardware the highly optimized BIP system [118] achieves a minimum latency of 4 µs and can saturate the I/O bus (127 Mbyte/s). For large data transfers, BIP uses DMA, both at the sending and the receiving side. In contrast with the basic protocol, however, BIP does not stage data through DMA areas. Section 2.3 discusses different techniques to eliminate the use of DMA areas.

The basic protocol avoids all OS overhead, keeps the NI code simple, and uses little NI memory. It is clear, however, that the protocol has several shortcomings:

• All inbound and outbound network transfers are staged through a DMA area. For applications that need to send and receive from arbitrary locations, this introduces extra memory copies.

• The protocol provides no protection. If the basic protocol allowed multiple users to access the NI, these users could read and modify each other’s data in NI memory. Users can even modify the NI’s control program and use it to access any host memory location.

• The receiver-side control transfer mechanism, polling, is simple, but not always effective. For many applications it is difficult to determine a good polling rate. If the host polls too frequently, it will have a high overhead; if it polls too late, it will not reply quickly enough to incoming packets.

• The protocol is unreliable, even though the Myrinet hardware is highly reliable. If the senders send packets faster than the receiver can handle them, the receiving host will run out of buffer space and the NI will drop incoming packets.

• The protocol supports only point-to-point messages. Although multicast can be implemented on top of point-to-point messages, doing so may be inefficient. Multicast is an important service by itself and a fundamental component of collective communication operations such as those supported by the message-passing standard MPI.

Below, we will discuss these problems in more detail and look at better design alternatives.

2.2 Data Transfers

On Myrinet, at least three steps are needed to communicate a packet from one user process to another: the packet must be moved from the sender’s memory to its NI (host-NI transfer), from this NI to the receiver’s NI (NI-NI transfer), and then to the receiving process’s address space (NI-host transfer). Network technologies that do not require data to be staged through NI memory use only two steps: host-to-network and network-to-host. Below, we discuss the Myrinet case, but most issues (programmed I/O versus DMA, pinning, alignment, and maximum packet size) also apply to the two-step case.

The data transfers have a significant impact on the latency and throughput obtained by a protocol, so optimizing them is essential for obtaining high performance. As shown in Figure 2.1, the basic protocol uses five data transfers to communicate a packet, because it stages all packets through DMA areas. Below, we discuss alternative designs for implementing the host-NI, NI-NI, and NI-host transfers.

2.2.1 From Host to Network Interface

On Myrinet, the host-to-NI transfer can use either DMA or Programmed I/O (PIO). Steenkiste gives a detailed description of both mechanisms [133]. With PIO, the host processor reads the data from host memory and writes it into NI memory, typically one or two words at a time, which results in many bus transactions. DMA uses special hardware (a DMA engine) to transfer the entire packet in large bursts and asynchronously, so that the data transfer can proceed in parallel with host computations. One thus might expect DMA to always outperform PIO. The optimal choice, however, depends on the type of host CPU and on the packet size.

The Pentium Pro, for example, supports write combining buffers, which allow memory writes to the same cache line to be merged into a single 32-byte bus transaction. We can thus boost the performance of host-to-NI PIO transfers by applying write combining to NI memory. (This use of the Pentium Pro’s write combining facility was suggested to us by the Fast Messages group of the University of Illinois at Urbana-Champaign [31].)

Figure 2.2 shows the throughput obtained by PIO (with and without write combining) and cache-coherent DMA, for copying data from a Pentium Pro to a Myrinet NI card. PIO with write combining quickly outperforms PIO without write combining. For buffer sizes up to 1024 bytes, PIO with write combining even outperforms DMA (which suffers from a startup cost).

For small messages, PIO with write combining is slower than PIO without write combining. This is due to an expensive extra instruction (a serializing instruction) that our benchmark executes after each host-to-NI memory copy with write combining.

Fig. 2.2. Host-to-NI throughput using different data transfer mechanisms (DMA, DMA + on-demand pinning, PIO + write combining, DMA + memcpy, and PIO), plotted as throughput in Mbyte/s against buffer size in bytes.

We use this instruction to model the common situation in which a data copy to NI memory is followed by another write that signals the presence of the data. Clearly, the NI should not observe the latter write before the data copy has completed. With write combining, however, writes may be reordered and it is necessary to separate the data copy and the write that follows it by a serializing instruction.

In user-level communication systems, DMA transfers can be started either by a user process or by the network interface without any operating system involvement. Since DMA transfers are performed asynchronously, the operating system may decide to swap out the page that happens to be the source or destination of a running DMA transfer. If this happens, part of the destination of the transfer will be corrupted. To avoid this, operating systems allow applications to pin a limited number of pages in their address space. Pinned pages are never swapped out by the operating system. Unfortunately, pinning a page requires a system call and the amount of memory that can be pinned is limited by the available physical memory and by OS policies.

In the best case, all pages that an application transfers data to or from need to be pinned only once. In this case the cost of pinning the pages can be amortized over many data transfers. In the worst case, an application transfers data to or from more pages than can be pinned simultaneously. In this case, two system calls are needed for every data transfer: one to unpin a previously pinned page and one to pin the page involved in the next data transfer. SHRIMP provides special hardware that allows user processes to start DMA transfers without pinning [25]. This user-level DMA mechanism, however, works only for host-initiated transfers, not for NI-initiated transfers, so pinning is still required at the receiving side.
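The write-combining discipline described above (copy the data, then order it before the write that announces it) can be sketched as follows; the mapping names are assumptions, and on the Pentium Pro itself the ordering point was a serializing instruction rather than the SFENCE used here.

#include <string.h>

/* NI memory mapped into the user's address space; the packet buffer is
 * mapped write-combining, the flag is polled by the NI. Illustrative only. */
extern volatile char     *ni_packet_buf;
extern volatile unsigned *ni_ready_flag;

void pio_send(const void *data, unsigned size)
{
    memcpy((void *)ni_packet_buf, data, size);    /* stores merged into 32-byte bursts */
    __asm__ __volatile__("sfence" ::: "memory");  /* drain write-combining buffers     */
    *ni_ready_flag = 1;                           /* only now announce the packet      */
}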

NI protocols that use DMA often choose to copy the data into a reserved (and pinned) DMA area, which costs an extra memory copy and thus may decrease the throughput. Figure 2.2 shows that a DMA transfer preceded by a memory copy is consistently slower than PIO with write combining. On processors that do not support cache-coherent DMAs, the DMA area needs to be allocated in uncached memory, which also decreases performance. (Most modern CPUs, including the Pentium Pro, support cache-coherent DMA, however.) With PIO, pinning is not necessary. Even if the OS swapped out the page during the transfer, the host’s next memory reference would generate a page fault, causing the OS to swap the page back in. In practice, many protocols use DMA; other protocols (AM-II, Hamlyn, BIP) use PIO for small messages and DMA for large messages. FM uses PIO for all messages.

With both DMA and PIO, data transfers between unaligned data buffers can be much slower than between aligned buffers. This problem is aggravated when the NI’s DMA engines require that source and destination buffers be properly aligned. In that case, extra copying is needed to align unaligned buffers.

Another important design choice is the maximum packet size. Large packets yield better throughput, because per-packet overheads are incurred fewer times than with small packets. The throughput of our basic protocol, for example, increases from 33 Mbyte/s to 48 Mbyte/s by using 1024-byte instead of 256-byte packets. The choice of the maximum packet size is influenced by the system’s page size, memory space considerations, and hardware restrictions.

2.2.2 From Network Interface to Network Interface

The NI-to-NI transfer traverses the Myrinet links and switches. All Myrinet protocols use the NI’s DMA engine to send and receive network data. In theory, PIO could be used, but DMA transfers are always faster. To send or receive a data word by means of PIO, the processor must always move that data item through a processor register, which costs at least two cycles. A DMA engine can transfer one word per cycle. Moreover, using DMA frees the processor to do other work during data transfers.

To prevent network congestion, the receiving NI should extract incoming data from the network fast enough. On Myrinet, the hardware uses backpressure to stall the sending NI if the receiver does not extract data fast enough. To prevent deadlock, however, there is a time limit on the backpressure mechanism. If the receiver does not drain the network within a certain time period, the network hardware will reset the NI or truncate a blocked packet. Many Myrinet control programs deal with this real-time constraint by copying data fast enough to prevent resets. Other protocols avoid the problem by using a software flow control scheme, as we will discuss later.

2.2.3 From Network Interface to Host

The transfer from NI to host at the receiving side can again use either DMA or PIO. On Myrinet, however, only the host (not the NI) can use PIO, making DMA the method of choice for most protocols. Some systems (e.g., AM-II) use PIO on the host to receive small messages. For large messages, all protocols use DMA, because reads over the I/O bus are typically much slower than DMA transfers. Whether we use DMA or PIO, in both cases the bottleneck for the NI-to-host transfer is the I/O bus.

Using microbenchmarks, we measured the attainable throughput on the important data paths of our Pentium Pro/Myrinet cluster, using DMA and PIO. Table 2.1 summarizes the results and also gives the hardware bandwidth of the bottleneck component on each data path. For transfers between the host and the NI, the bottleneck is the 33.33 MHz PCI bus; both main memory and the memory bus allow higher throughputs. With DMA transfers, the microbenchmarks can almost saturate the I/O bus. Our throughput is slightly less than the I/O bus bandwidth because in our benchmarks the NI acknowledges every DMA transfer with a one-word NI-to-host DMA transfer. For network transfers from one NI’s memory to another NI’s memory, the bottleneck is the NIs’ memory. The send and receive DMA engines can access at most one word per I/O bus clock cycle (i.e., 127 Mbyte/s). The bandwidth of the network links is higher, 153 Mbyte/s. Finally, for local host memory copies, the bottleneck component is main memory.

An interesting observation is that a local memory copy on a single Pentium Pro obtains a lower throughput than a remote memory copy over Myrinet (52 Mbyte/s versus 127 Mbyte/s). This problem is largely due to the poor memory write performance of the Pentium Pro [29]; on a 450 MHz Pentium III platform, we measured a memory-to-memory copy throughput of 157 Mbyte/s. Nevertheless, memory copies have an important impact on performance. For comparison, recall that the basic protocol achieves a throughput of only 33 Mbyte/s. The reason is that the basic protocol uses fairly small packets (256 bytes) and performs memory copies to and from DMA areas, which interferes with DMA transfers. Several systems (e.g., BIP and VMMC-2) can saturate the I/O bus: the key issue is to avoid the copying to and from DMA areas. The next section describes techniques to achieve this.

2.3 Address Translation

The use of DMA transfers between host and NI memory introduces two problems. First, most systems require that every host memory page involved in a DMA transfer be pinned to prevent the operating system from replacing that page. Pinning, however, requires an expensive system call which should be kept off the critical path.

Source         Destination    Method    Hardware bandwidth    Measured throughput
Host memory    Host memory    PIO       170                    52
Host memory    NI memory      PIO       127                    25
                              PIO+WC    127                    84
                              DMA       127                   117
NI memory      Host memory    PIO       127                     7
                              DMA       127                   117
NI memory      NI memory      DMA       127                   127

Table 2.1. Bandwidths and measured throughputs (in Mbyte/s) on a Pentium Pro/Myrinet cluster. WC means write combining.

The second problem is that, on most architectures, the NI’s DMA engine needs to know the physical address of each page that it transfers data to or from. Operating systems, however, do not export virtual-to-physical mappings to user-level programs, so users normally cannot pass physical addresses to the NI. Even if they could, the NI would have to check those physical addresses, to ensure that users pass only addresses of pages that they have access to.

We consider three approaches to solve these problems. The first approach is to avoid all DMA transfers by using programmed I/O. Due to the high cost of I/O bus reads, however, this is only a realistic solution at the sending side.

The second approach, used by the basic protocol, requires that users copy their data into and out of special DMA areas (see Figure 2.1). This way, only the DMA areas need to be pinned. This is done once, when the application opens the device, and not during send and receive operations. The address translation problem is then solved as follows. The operating system allocates for each DMA area a contiguous chunk of physical memory and passes the area’s physical base address to the NI. Users specify send and receive buffers by means of an offset in their DMA area. The NI only needs to add this offset to the area’s base address to obtain the buffer’s physical address. Several systems (e.g., AM-II, Hamlyn) use this approach, accepting the extra copying costs. As shown in Figure 2.2, however, the extra copy reduces throughput significantly.

In the third approach, the copying to and from DMA areas is eliminated by dynamically pinning and unpinning user pages so that DMA transfers can be performed directly to those pages. Systems that use this approach (e.g., VMMC-2, PM, and BIP) can track the ’DMA’ curve in Figure 2.2. The main implementation problem is that the NI needs to know the current virtual-to-physical mappings of individual pages. Since NIs are usually equipped with only a small amount of memory and since their processors are slow, they do not store information for every single virtual page. Some systems (e.g., BIP) provide a simple kernel module that translates virtual addresses to physical addresses. Users are responsible for pinning their pages and obtaining the physical addresses of these pages from the kernel module. The disadvantage of this approach is that the NI cannot check if the physical addresses it receives are valid and if they refer to pinned pages.

An alternative approach is to let the kernel and NI cooperate such that the NI can keep track of valid address translations (either in hardware or in software). Systems like VMMC-2 and U-Net/MM [13] (an extension of U-Net) let the NI cache a limited number of valid address translations which refer to pinned pages (this invariant must be maintained cooperatively by the NI and the operating system). This caching works well for applications that exhibit locality in the pages they use for sending and receiving data. When the translation of a user-specified address is found in the cache, the NI can access that address using a DMA transfer. In the case of a miss, special action must be taken. In U-Net/MM, for example, the NI generates an interrupt when it cannot translate an address.
The kernel receives the interrupt, looks up the address in its page table, pins the page, and passes the translation to the NI.

In VMMC-2, address translations for user buffers are managed by a library. This user-level library maps users’ virtual addresses to references to address translations which users can pass to the NI. The library creates these references by invoking a kernel module. This module translates virtual addresses, pins the corresponding pages, and stores the translations in a User-managed TLB (UTLB) in kernel memory. To avoid invoking the operating system every time a reference is needed, the library maintains a user-level lookup data structure that keeps track of the addresses for which a valid UTLB entry exists. The library invokes the UTLB kernel module only when it cannot find the address in its lookup data structure. When the NI receives a reference, it can find the translation using a DMA transfer to the kernel module’s data structure. To avoid such DMA transfers on the critical path, the NI maintains its own cache of references.

The ’on-demand pinning’ curve in Figure 2.2 shows the throughput obtained by a benchmark that imitates the miss behavior of a UTLB. To simulate a miss in its lookup data structure, the host invokes, for each page that is transferred, a system call to pin that page. To simulate a miss in the NI cache, the NI fetches a single word from host memory before it transfers the data.
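As a rough illustration of the NI-side caching that VMMC-2 and U-Net/MM rely on, the sketch below shows a small, direct-mapped translation cache on the NI. The table size, the hashing, and the miss handling are assumptions; the real systems keep the cache consistent with the kernel in the cooperative way described above.

#define PAGE_SHIFT   12
#define TLB_ENTRIES  256

typedef struct {
    unsigned long vpage;   /* virtual page number                 */
    unsigned long ppage;   /* physical page number (pinned page)  */
    int           valid;
} tlb_entry;

static tlb_entry ni_tlb[TLB_ENTRIES];

/* Returns the physical address for vaddr, or 0 on a miss (a sketch-only
 * convention; a real NI would signal the miss to the host instead). */
unsigned long ni_translate(unsigned long vaddr)
{
    unsigned long vpage = vaddr >> PAGE_SHIFT;
    tlb_entry *e = &ni_tlb[vpage % TLB_ENTRIES];

    if (e->valid && e->vpage == vpage)
        return (e->ppage << PAGE_SHIFT) | (vaddr & ((1UL << PAGE_SHIFT) - 1));
    return 0;  /* miss: raise an interrupt or fetch from the kernel's UTLB */
}

/* Installed by the host (or fetched from kernel memory) after a miss;
 * the kernel has pinned the page before handing out the translation. */
void ni_tlb_insert(unsigned long vpage, unsigned long ppage)
{
    tlb_entry *e = &ni_tlb[vpage % TLB_ENTRIES];
    e->vpage = vpage;
    e->ppage = ppage;
    e->valid = 1;
}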

2.4 Protection

Since user-level architectures give users direct access to the NI, the OS can no longer check every network transfer. Multiplexing and demultiplexing must now be performed by the NI and special measures are needed to preserve the integrity of the OS and to isolate the data streams of different processes from one another.

In a protected system, no user process should be given unrestricted access to NI memory. With unrestricted access to NI memory, a user process can modify the NI control program and use it to read or write any location in host memory. NI memory access must also be restricted to implement protected multiplexing and demultiplexing. In the basic protocol, for example, user processes directly write to NI memory to initialize send descriptors. If multiple processes shared the NI, one process could corrupt another process’s send descriptors. Similarly, no process should be able to read incoming network packets that are destined for another process. The basic protocol prevents these problems by providing user-level network access to at most one user at a time, but this limitation is clearly undesirable in a multi-user and multiprogramming environment.

A straightforward solution to these problems is to use the virtual-memory system to give each user access to a different part of NI memory [48]. When the user opens the network device, the operating system maps such a part into the user’s address space. Once the mapping has been established, all user accesses outside the mapped area will be trapped by the virtual-memory hardware. Users write their commands and network data to their own pages in NI memory. It is the responsibility of the NI to check each user’s page for new requests and to process only legal requests.

Since NI memory is typically small, only a limited number of processes can be given direct access to the NI this way. To solve this problem, AM-II virtualizes network endpoints in the following way. Part of NI memory acts as a cache for active communication endpoints; inactive endpoints are stored in host memory. When an NI receives a message for an inactive endpoint or when a process tries to send a message via an inactive endpoint, the NI and the operating system cooperate to activate the endpoint. Activation consists of moving the endpoint’s state (send and receive buffers, protocol status) to NI memory, possibly replacing another endpoint which is then swapped out to host memory.

The sharing problem also exists on the host, for the DMA areas. To maintain protection, each user process needs its own DMA area. Since the use of a DMA area introduces an extra copy, some systems eliminate it and store address translations on the NI. VMMC-2 and U-Net/MM do this in a protected way, either by letting the kernel write the translations to the NI or by letting the NI fetch the translations from kernel memory. BIP, on the other hand, eliminates the DMA area, but does not maintain protection.

2.5 Control Transfers

The control transfer mechanism determines how a receiving host is notified of message arrivals. The options are to use interrupts, polling, or a combination of these.

Interrupts are notoriously expensive. With most operating systems, the time to deliver an interrupt (as a signal) to a user process even exceeds the network latency. On a 200 MHz Pentium Pro running Linux, dispatching an interrupt to a kernel interrupt handler costs approximately 8 µs. Dispatching to a user-level signal handler costs even more, approximately 17 µs. This exceeds the latency of our basic protocol (11 µs).

Given the high costs of interrupts, all user-level architectures support some form of polling. The goal of polling is to give the host a fast mechanism to check if a message has arrived. This check must be inexpensive, because it may be executed often. A simple approach is to let the NI set a flag in its memory and to let the host check this flag. This approach, however, is inefficient, since every poll now results in an I/O bus transfer. In addition, this polling traffic will slow down other I/O traffic, including network packet transfers between NI and host memory. A very efficient solution is to use a special device status register that is shared between the NI and the host [107]. Current hardware, however, does not provide these shared registers. On architectures with cache-coherent DMA, a practical solution is to let the NI write a flag in cached host memory (using DMA) when a message is available. This approach is used by our basic protocol. The host polls by reading its local memory; since polls are executed frequently, the flag will usually reside in the data cache, so failed polls are cheap and do not generate memory or I/O traffic. When the NI writes the flag, the host will incur a cache miss and read the flag’s new value from memory. On a 200 MHz Pentium Pro, the scheme just described costs 5 nanoseconds for a failed poll (i.e., a cache hit) and 74 nanoseconds for a successful poll (i.e., a cache miss). For comparison, each poll in the simple scheme (i.e., an I/O bus transfer) costs 467 nanoseconds.

Even if the polling mechanism is efficient, polling is a mixed blessing. Inserting polls manually is tedious and error-prone. Several systems therefore use a compiler or a binary rewriting tool to insert polls in loops and functions [114, 125]. The problem of finding the right polling frequency remains, though [89]. In multiprocessor architectures this problem can be solved by dedicating one of the processors to polling and message handling.

Several systems (AM-II, FM/MC, Hamlyn, Trapeze, U-Net, VMMC, VMMC-2) support both interrupts and polling. Interrupts usually can be enabled or disabled by the receiver; sometimes the sender can also set a flag in each packet that determines whether an interrupt is to be generated when the packet arrives.
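A sketch of the host side of this flag-in-cached-host-memory scheme is shown below; the buffer layout and all names are assumptions, not the thesis's code.

#define NUM_RX_BUFFERS 64

typedef struct {
    volatile unsigned flag;        /* written by the NI via DMA            */
    char              payload[1024];
} host_rx_buffer;

extern host_rx_buffer rx_ring[NUM_RX_BUFFERS];  /* pinned host memory      */
static unsigned next_rx;

/* Returns 1 and a pointer to the packet data if a packet is available. */
int poll_for_packet(void **data)
{
    host_rx_buffer *b = &rx_ring[next_rx];

    if (!b->flag)                  /* failed poll: usually a cache hit     */
        return 0;
    *data = b->payload;            /* successful poll: cache miss, packet  */
    b->flag = 0;                   /* hand the slot back (sketch only)     */
    next_rx = (next_rx + 1) % NUM_RX_BUFFERS;
    return 1;
}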


Fig. 2.3. Design decisions for reliability.

2.6 Reliability

Existing Myrinet protocols differ widely in the way they address reliability. Figure 2.3 shows the choices made by various systems. The most important choice is whether or not to assume that the network is reliable. Myrinet has a very low bit-error rate (less than 10^-15 on shielded cables up to 25 m long [27]). Consequently, the risk of a packet getting lost or corrupted is small enough to consider it a ’fatal event’ (much like a memory parity error or an OS crash). Such events can be handled by higher-level software (e.g., using checkpointing), or else cause the application to crash.

Many Myrinet protocols indeed assume that the hardware is reliable, so let us look at these protocols first. The advantage of this approach is efficiency, because no retransmission protocol or time-out mechanism is needed. Even if the network is fully reliable, however, the software protocol may still drop packets due to lack of buffer space. In fact, this is the most common cause of packet loss. Each protocol needs communication buffers on both the host and the NI, and both are a scarce resource. The basic protocol described in Section 2.1, for example, will drop packets when it runs out of receive buffers. This problem can be solved in one of two ways: either recover from buffer overflow or prevent overflow from happening.

The first idea (recovery) is used in PM. The receiver simply discards incoming packets if it has no room for them. It returns an acknowledgement (ACK or NACK) to the sender to indicate whether or not it accepted the packet. A NACK indicates a packet was discarded; the sender will later retransmit that packet. This process continues until an ACK is received, in which case the sender can release the buffer space for the message. This protocol is fairly simple; the main disadvantages are the extra acknowledgement messages and the increased network load when retransmissions occur.

A key property of the protocol is that it never drops acknowledgements. By assumption, Myrinet is reliable, so the network hardware always delivers acknowledgements to their destination. When an acknowledgement arrives, the receiver processes it completely before it accepts a new packet from the network, so at most one acknowledgement needs to be buffered. (Myrinet’s hardware flow control ensures that pending network packets are not dropped.)

Unlike an acknowledgement, a data packet cannot always be processed completely once it has been received. A data packet needs to be transferred from the NI to the host, which requires several resources: at least a free host buffer and a DMA engine. If the wait time for one of these resources (e.g., a free host buffer) is unbounded, then the NI cannot safely wait for that resource without receiving pending packets, because the network hardware may then kill blocked packets.

The second approach is to prevent buffer overflow by using a flow control scheme that blocks the sender if the receiver is running out of buffer space. For large messages, BIP requires the application to deal with flow control by means of a rendezvous mechanism: the receiver must post a receive request and provide a buffer before a message may be sent. That is, a large-message send never completes before a receive has been posted. FM and FM/MC implement flow control using a host-level credit scheme. Before a host can send a packet, it needs to have a credit for the receiver; the credit represents a packet buffer in the receiver’s memory.
Credits can be handed out in advance by pre-allocating buffers for specific senders, but if a sender runs out of credits it must block until the receiver returns new credits. A host-level credit scheme prevents overflow of host buffers, but not of NI buffers, which are usually even scarcer (because NI memory is smaller than host memory). With some protocols, the NI temporarily stops receiving messages if the NI buffers overflow. Such protocols rely on Myrinet’s hardware, link-level, flow control mechanism (backpressure) to stall the sender in such a case.

The protocols described so far implement a reliable interface by depending on the reliability of the hardware. Several other protocols do not assume the network to be reliable, and either present an unreliable programming interface or implement a retransmission protocol. U-Net and Trapeze present an unreliable interface and expect higher software layers (e.g., MPI, TCP) to retransmit lost messages. Other systems do provide a reliable interface, by implementing a timeout-retransmission mechanism, either on the host or the NI. The cost of setting timers and processing acknowledgements is modest, typically no more than a few microseconds.
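The host-level credit scheme used by FM and FM/MC can be sketched as follows; the credit count, the names, and the way credits are returned (often piggybacked on other packets) are assumptions made for the example.

#define MAX_PROCS        64
#define CREDITS_PER_PEER 16

extern void raw_send(int dest, void *pkt, unsigned size);  /* assumed unreliable send */
extern void poll_network(void);                            /* assumed: drains packets */

static int credits[MAX_PROCS];   /* remaining receive buffers at each peer */

void credit_init(void)
{
    for (int i = 0; i < MAX_PROCS; i++)
        credits[i] = CREDITS_PER_PEER;   /* buffers pre-allocated per sender */
}

void credit_send(int dest, void *pkt, unsigned size)
{
    while (credits[dest] == 0)   /* out of credits: block until some return */
        poll_network();
    credits[dest]--;
    raw_send(dest, pkt, size);
}

/* Called when 'src' reports that it has freed 'returned' receive buffers. */
void credit_return(int src, int returned)
{
    credits[src] += returned;
}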

Fig. 2.4. Repeated send versus a tree-based multicast.

2.7 Multicast

Multicasting occurs in many communication patterns, ranging from a straightforward broadcast to the more complicated all-to-all exchange [84]. Message-passing systems like MPI [45] directly support such patterns by means of collective communication services and thus rely on an efficient multicast implementation. In addition, various higher-level systems use multicasting to update replicated shared data [9]. Today’s wormhole-routed networks, including Myrinet, do not support reliable multicasting at the hardware level; multicast in wormhole-switched networks is a hard problem and the subject of ongoing research [56, 128, 136, 135].

The simplest way to implement a multicast in software is to let the sender send a point-to-point message to each multicast destination. This solution is inefficient, because the point-to-point startup cost is incurred for every multicast destination; this cost includes the data copy to NI memory (possibly preceded by a copy to a DMA area). With some NI support, the repeated copying can be avoided by passing all multicast destinations to the NI, which then repeatedly transmits the same packet to each destination. Such a ’multisend’ primitive is provided by PM. Although more efficient than a repeated send on the host, a multisend still leaves the network interface as a serial bottleneck.

A more efficient approach is to organize the sender and receivers into a multicast tree. The sender is the root of the tree and transmits each multicast packet to all its children (a subset of the receivers). These children, in turn, forward each packet to their children, and so on. Figure 2.4 contrasts the repeated-send and the tree-based multicast strategies. Tree-based protocols allow packets to travel in parallel along the branches of the tree and therefore usually have logarithmic rather than linear complexity.

Tree-based protocols can be implemented efficiently by performing the forwarding of multicast packets on the NI instead of on the host [56, 68, 80, 146]. Verstoep et al. implemented such an NI-level spanning tree multicast as an extension of the Illinois Fast Messages [113] substrate; the resulting system is called FM/MC (Fast Messages/MultiCast) [146].
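A sketch of the forwarding step in such a tree-based multicast is shown below, using a binary tree over the process identifiers. The tree shape, the fan-out, and the helper functions are assumptions; FM/MC performs comparable forwarding on the NI.

#define MAX_CHILDREN 2

extern void raw_send(int dest, void *pkt, unsigned size);  /* assumed point-to-point send */
extern void deliver(void *pkt, unsigned size);             /* assumed local delivery      */

/* Compute the children of 'self' in a binary tree rooted at 'root' over
 * processes 0..nprocs-1. Returns the number of children. */
int mcast_children(int self, int root, int nprocs, int children[MAX_CHILDREN])
{
    int rank = (self - root + nprocs) % nprocs;   /* position relative to the root */
    int n = 0;
    for (int i = 1; i <= MAX_CHILDREN; i++) {
        int child_rank = 2 * rank + i;
        if (child_rank < nprocs)
            children[n++] = (child_rank + root) % nprocs;
    }
    return n;
}

/* On arrival of a multicast packet: forward to the children, then deliver. */
void mcast_forward(int self, int root, int nprocs, void *pkt, unsigned size)
{
    int children[MAX_CHILDREN];
    int n = mcast_children(self, root, nprocs, children);
    for (int i = 0; i < n; i++)
        raw_send(children[i], pkt, size);
    deliver(pkt, size);
}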

An important issue in the design of a multicast protocol is flow control. Multicast flow control is more complex than point-to-point flow control, because the sender of a multicast packet needs to obtain buffer space on all receiving NIs and hosts. FM/MC uses a central buffer manager (implemented on one of the NIs) to keep track of the available buffer space at all receivers. This manager handles all requests for multicast buffers and allows senders to prefetch buffer space for future multicasts. An advantage of this scheme is that it avoids deadlock by acquiring buffer space in advance; a disadvantage is that it employs a central buffer manager. An alternative scheme is to acquire buffers on the fly: the sender of a multicast packet is responsible only for acquiring buffer space at its children in the multicast tree. A more detailed discussion of FM/MC is given in Section 4.5.4.

2.8 Classification of User-Level Communication Systems
Table 2.2 classifies ten user-level communication systems and shows how each system deals with the design issues discussed in this chapter. These systems all aim for high performance and all provide a lean, low-level, and more or less generic communication facility. All systems except VMMC and U-Net were developed first on a Myrinet platform.

Most systems implement a straightforward message-passing model. The systems differ mainly in their reliability and protection guarantees and their support for multicast. Several systems use active messages [148]. With active messages, the sender of a message specifies not only the message’s destination, but also a handler function, which is invoked at the destination when the message arrives. VMMC, VMMC-2, and Hamlyn provide virtual memory-mapped and sender-based communication instead of message passing. In both models the sender specifies where in the receiver’s address space the communicated data must be deposited. This model allows data to be moved directly to its destination without any unnecessary copying.

In addition to the research projects listed in Table 2.2, industry has recently created a draft standard for user-level communication in cluster environments [51, 149]. Implementations of this Virtual Interface (VI) architecture have been constructed by UC Berkeley, GigaNet, Intel, and Tandem. GigaNet sells a hardware VI implementation. The others implement VI in device driver software, in NI firmware, or both. Speight et al. discuss the performance of one hardware and one software implementation [130].

Table 2.2. Summary of design choices in existing communication systems (AM-II, FM, FM/MC, PM, VMMC, VMMC-2, Hamlyn, Trapeze, BIP, and U-Net), covering data transfer between host and NI, address translation, protection, control transfer, reliability, and multicast support.

2.9 Summary

This chapter has discussed several design issues and tradeoffs for NI protocols, in particular:

• how to transfer data between hosts and NIs (DMA or programmed I/O);

• how to avoid copying to and from DMA areas by means of address translation techniques;

• how to achieve protected, user-level network access in a multi-user environment;

• how to transfer control (interrupts, polling, or both);

• how and where to implement reliability (application, hosts, NIs);

• whether to implement additional functionality on the NIs, such as multicast.

An interesting observation is that many novel techniques exploit the programmable NI processor (e.g., for address translation). Programmable NIs are flexible, which compensates for the lack of hardware support present in the more advanced interfaces used by MPPs. Eventually, hardware implementations may be more efficient, but the availability of a programmable NI has enabled fast and easy experimentation with different protocols for commodity network interfaces. As a result, the performance of these protocols has substantially increased. In combination with the economic advantages of commodity networks, this makes such networks a key technology for parallel cluster computing.

An efficient network interface protocol, however, does not suffice; what counts is application performance. As discussed in Section 1.1, most application programmers use a parallel-programming system to develop their applications. There is a large gap between the high-level abstractions provided by these programming systems and the low-level interfaces of the communication architectures discussed in this chapter. It is not a priori clear that all types of programming systems can be implemented efficiently on all of these low-level systems.

Chapter 3

The LFC User-Level Communication System

This chapter describes LFC, a new user-level communication system. LFC distinguishes itself from other systems by the way it divides protocol tasks between the host processor and the network interface (NI). In contrast with communication systems that minimize the amount of protocol code executed on the NI, LFC exploits the NI to implement flow control, to forward multicast traffic, to reduce the overhead of network interrupts, and to perform synchronization operations. This chapter describes LFC’s user interface and gives an overview of LFC’s implementation on the computer cluster described in Section 1.6. LFC’s NI-level protocols are described separately in Chapter 4.

This chapter is structured as follows. Section 3.1 describes LFC’s programming interface. Section 3.2 states the key assumptions that we made in LFC’s implementation. Section 3.3 gives an overview of the main components of the implementation. Subsequent sections describe LFC’s packet format (Section 3.4), data transfer mechanisms (Section 3.5), host buffer management (Section 3.6), the implementation of message detection and dispatch (Section 3.7), and the implementation of a fetch-and-add primitive (Section 3.8). Section 3.9 discusses limitations of the implementation. Finally, Section 3.10 discusses related work.

3.1 Programming Interface

LFC aims to support the development of parallel runtime systems rather than the development of parallel applications. Therefore, LFC provides an efficient, low-level interface rather than a high-level, user-friendly interface. In particular, LFC does not fragment large messages and does not provide a demultiplexing mechanism. As a result, the user of LFC has to write code for fragmenting, reassembling, and demultiplexing messages. This interface requires extra programming effort, but at the same time offers high performance to all clients. The most important functions of the interface are listed in Table 3.1.

3.1.1 Addressing

LFC’s addressing scheme is simple. Each participating process is assigned a unique process identifier, a number in the range [0 ... P-1], where P is the number of participating processes. This number is determined at application startup time and remains fixed during the application’s lifetime. LFC maps process identifiers to host addresses and network routes.

3.1.2 Packets

LFC clients use packets to send and receive data. Packets have a maximum size (1 Kbyte by default); messages larger than this size must be fragmented by the client. LFC deliberately does not provide a higher-level abstraction that allows clients to construct messages of arbitrary sizes. Such abstractions impose overhead and often introduce a copy at the receiving side when data arrives in packet buffers that are not contiguous in memory. Clients that need only sequential access to message fragments can avoid this copy by using stream messages [90]. An efficient implementation of stream messages on top of LFC is described in Section 6.3. Even with stream messages, however, some overhead remains in the form of procedure calls, auxiliary data structures, and header fields that some clients simply do not need. Our implementation of CRL, for example, is constructed directly on LFC’s packet interface, without intermediate message abstractions.

LFC distinguishes between send packets and receive packets. Send packets reside in NI memory. Clients can allocate a send packet and store data into it using normal memory writes (i.e., using programmed I/O). At the receiving side, LFC uses the DMA area approach described in Section 2.1, so receive packets reside in pinned host memory.

Both send and receive packets form a scarce resource, for which only a limited amount of memory is available. Send packets are stored in NI memory, which is typically small: the memory sizes of current NIs range from a few hundred kilobytes to a few megabytes. Receive buffers are stored in pinned host memory. In a multiprogramming environment, where user processes compete for CPU cycles and memory, most operating systems do not allow a single user to pin large amounts of memory.
Table 3.1. LFC’s user interface. Only the essential functions are shown.

void *lfc_send_alloc(int upcalls_allowed)
    Allocates a send packet buffer in NI memory and returns a pointer to the packet’s data part.

void lfc_ucast_launch(unsigned dest, void *pkt, unsigned size, int upcalls_allowed)
    Transmits the first size bytes of send packet pkt to process dest and transfers ownership of the packet to LFC.

void lfc_bcast_launch(void *pkt, unsigned size, int upcalls_allowed)
    Broadcasts the first size bytes of send packet pkt to all processes except the sender and transfers ownership of the packet to LFC.

void lfc_mcast_launch(int gid, void *pkt, unsigned size, int upcalls_allowed)
    Multicasts the first size bytes of send packet pkt to all members of multicast group gid and transfers ownership of the packet to LFC.

int lfc_group_create(unsigned *members, unsigned nmembers)
    Creates a multicast group and returns a group identifier. The process identifiers of all group members are stored in members.

unsigned lfc_fetch_and_add(unsigned var, int upcalls_allowed)
    Atomically increments the F&A variable identified by var and returns the value of var before the increment.

void lfc_poll(void)
    Drains the network and invokes the client-supplied packet handler (lfc_upcall()) for each packet removed from the network.

void lfc_intr_disable(void)
void lfc_intr_enable(void)
    Disable and enable network interrupts. These functions must be called in pairs, with lfc_intr_disable() going first.

int lfc_upcall(unsigned src, void *pkt, unsigned size, lfc_..._t flags)
    Lfc_upcall() is the client-supplied function that LFC invokes once per incoming packet. Pkt points to the data just received (size bytes) and flags indicates whether or not pkt is a multicast packet. If lfc_upcall() returns nonzero, the client becomes the owner of pkt and must later release pkt using lfc_packet_free(); otherwise, LFC recycles pkt immediately.

void lfc_packet_free(void *pkt)
    Returns ownership of receive packet pkt to LFC.

The parameter upcalls_allowed has the same meaning in all functions: it indicates whether upcalls can be made when network draining occurs during execution of the function.

3.1.3 Sending Packets

LFC provides three send routines: lfc_ucast_launch() sends a point-to-point packet, lfc_bcast_launch() broadcasts a packet to all processes, and lfc_mcast_launch() sends a packet to all processes in a multicast group. Each multicast group contains a fixed subset of all processes. Multicast groups are created once and for all at initialization time, so processes cannot join or leave a multicast group dynamically. For reasons described in Appendix A, multicast groups may not overlap. To send data, a client must perform three steps:

1. Allocate a send packet with lfc_send_alloc(). Lfc_send_alloc() returns a pointer to a free send packet in NI memory and transfers ownership of the packet from LFC to the client.

2. Fill the send packet, usually with a client-specific header and data.

3. Launch the packet with lfc_ucast_launch(), lfc_mcast_launch(), or lfc_bcast_launch(). These functions transmit the packet to one, multiple, or all destinations, respectively. LFC transmits as many bytes as indicated by the client. Ownership of the packet is transferred back to LFC. This implies that the allocation step (step 1) must be performed for each packet that is to be transmitted. No packet can be transmitted multiple times.

Steps 1 and 3 may block due to the finite number of send packet buffers and the finite length of the NI’s command queue. To avoid deadlock, LFC continues to receive packets while it waits for resources (see Section 3.1.4).

The send procedure avoids intermediate copying: clients can gather data from various places (registers, different memory regions) and use programmed I/O to store that data directly into send packets. An alternative is to use DMA transfers, which consume fewer processor cycles and require fewer bus transfers. For message sizes up to 1 Kbyte, however, programmed I/O moves data from host memory to NI memory faster than the NI’s DMA engine (see Figure 2.2). Furthermore, programmed I/O can easily move data from any virtual address in the client’s address space to the NI, while DMA can be used only for pinned pages.

The send procedure separates packet allocation and packet transmission. This allows clients to hide the overhead of packet allocation. The CRL implementation on LFC, for example, allocates a new send packet immediately after it has sent a packet. Since the CRL runtime frequently waits for a reply message after sending a request message, it can often hide the cost of allocating a send packet.
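A usage sketch of the three-step send procedure, written against the primitives of Table 3.1, is shown below. The client header layout, the message type, and the "lfc.h" header name are assumptions; the caller is assumed to have fragmented its data so that it fits in one packet.

#include <string.h>
#include "lfc.h"                 /* assumed header declaring the LFC primitives */

enum { REQUEST = 1 };            /* assumed client-level message type */

struct client_hdr { unsigned type; unsigned len; };

void send_request(unsigned dest, const void *data, unsigned len)
{
    /* Step 1: allocate a send packet in NI memory. */
    char *pkt = lfc_send_alloc(1 /* upcalls allowed */);

    /* Step 2: fill the packet using programmed I/O. */
    struct client_hdr *hdr = (struct client_hdr *)pkt;
    hdr->type = REQUEST;
    hdr->len  = len;
    memcpy(pkt + sizeof(*hdr), data, len);

    /* Step 3: launch the packet; ownership returns to LFC. */
    lfc_ucast_launch(dest, pkt, sizeof(*hdr) + len, 1 /* upcalls allowed */);
}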

3.1.4 Receiving Packets

LFC copies incoming network packets from the NI to receive packets in host memory. LFC delivers receive packets to the receiving process by means of an upcall [38]. An upcall is a function that is defined in one software layer and that is invoked from lower-level software layers. Upcalls are used to perform application-specific processing at times that cannot be predicted by the application.

Since LFC does not know what a client aims to do with its incoming packets, the client must supply an upcall function. This function, named lfc_upcall(), is invoked by LFC each time a packet arrives; it transfers ownership of a receive packet from LFC to the client. Usually, this function either copies the packet or performs a computation using the packet’s contents. (The computation can be as simple as incrementing a counter.) The upcall’s return value determines whether or not ownership of the packet is returned to LFC. If the client keeps the packet, then the packet must later be released explicitly by calling lfc_packet_free(), otherwise LFC recycles the packet immediately. The main advantage of this interface is that it does not force the client to copy packets that cannot be processed immediately. Panda exploits this property in its implementation of stream messages (see Section 6.3).

Many systems based on active messages [148] invoke message handlers while draining the network. (Draining the network consists of moving packets from the network to host memory and is necessary to avoid congestion and deadlock.) The clients of such systems must be prepared to handle packets whenever draining can occur. Unfortunately, draining sometimes occurs at inconvenient times. Consider, for example, the case in which a message handler sends a message. If the system drains the network while sending this message, it may recursively activate the same message handler (for another incoming message). Since the system does not guarantee atomic handler execution, the programmer must protect global data against interleaved accesses by multiple handlers. Using a lock to turn the entire handler into a critical region leads to deadlock. The recursive invocation will block when it tries to enter the critical section occupied by the first invocation. The first invocation, however, cannot continue, because it is waiting for the recursive invocation to finish. Chapter 6 studies this upcall problem in more detail and compares several solutions.

LFC separates network draining and handler invocation [28]. Draining occurs whenever an LFC routine waits for resources (e.g., a free entry in the send queue), when the user polls, or when the NI generates an interrupt. During draining, however, LFC will invoke lfc_upcall() only if one of the following conditions is satisfied:

1. The client has not disabled network interrupts.

2. The client invokes lfc poll().

3. The client invokes an LFC primitive with its upcalls allowed parameter set to true (nonzero).

All LFC calls that potentially need to drain the network take an extra parameter named upcalls allowed. If the client invokes a routine with upcalls allowed set to false (zero), then LFC will drain the network if it needs to, but will not make any upcalls. If the parameter is true, then LFC will drain the network and make upcalls.

Separating draining and handler invocation is no panacea. If a client enables draining but disables upcalls, LFC must buffer incoming network packets. To prevent LFC from running out of receive packets, the client must implement flow control. This is crucial, because there is nothing LFC, or any communication system, can do against clients that send an unbounded amount of data and do not consume that data. The system will either run out of memory or deadlock because it stops draining the network. Since LFC's current clients do not implement flow control (at least, not for all their communication primitives), they cannot disable upcalls and must therefore always be prepared to handle incoming packets.

With the current interface, clients cannot specify that the network must be drained asynchronously, using interrupts, without LFC making upcalls. If a process enters a phase in which it cannot process upcalls and in which it does not invoke LFC primitives, then the network will not be drained. The MPI implementation described in Section 7.4, for example, does not use interrupts. Draining occurs only during the invocation of LFC primitives. Since these primitives are invoked only when the application invokes an MPI primitive, there is no guarantee that a process will frequently drain the network. If a process sends a large message to a process engaged in a long-running computation (i.e., a process that does not drain), then LFC's internal flow-control mechanism (see Section 4.1) will eventually stall the sender, even if the receiver has space to store incoming packets.
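The sketch below shows what a client-supplied upcall might look like. The parameter list and the return-value convention are assumptions for this illustration (the text above only states that the return value decides whether ownership goes back to LFC); the helper routines are placeholders for client code.

    /* Sketch of a client upcall; signature and return-value convention assumed. */
    int lfc_upcall(void *pkt, unsigned size, unsigned src)
    {
        if (is_control_message(pkt, size)) {      /* placeholder client test  */
            handle_control(pkt, size, src);       /* process in place         */
            return 0;                             /* return the packet to LFC */
        }
        /* Data packet that cannot be processed now: keep it instead of copying. */
        enqueue_for_later(pkt, size, src);        /* placeholder client queue  */
        return 1;                                 /* keep ownership; release it
                                                     later with lfc_packet_free() */
    }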

3.1.5 Synchronization

LFC provides an efficient fetch-and-add (F&A) primitive [127]. A fetch-and-add operation fetches the current value of a logically shared integer variable and then increments this variable. Since these actions are performed atomically, processes can use the F&A primitive to synchronize their actions (e.g., to obtain the next free slot in a shared queue).

LFC stores F&A variables in NI memory. The implementation provides one F&A variable per NI and initializes all F&A variables to zero. There is no function to set or reset F&A variables to a specific value, but such a function could easily be added.

Panda uses a single F&A variable to implement totally-ordered broadcasting (see Chapter 6). Totally-ordered broadcasting, in turn, is used by the Orca runtime system to update replicated objects in a consistent manner (see Section 7.1). Karamcheti et al. describe another interesting use of fetch-and-add in their paper on pull-based messaging [76]. In pull-based messaging senders do not push message data to receivers. Instead, a sender transmits only a message pointer to the receiver. The receiver pulls in the message data when it is ready to receive. Pull-based messaging reduces contention.
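As an illustration of the shared-queue example above, the sketch below uses lfc_fetch_and_add() to claim slots in a logically shared queue. Only the existence of the primitive, its fetch-then-increment semantics, and the var argument come from the text; the exact parameter list and the increment of one are assumptions.

    /* Sketch only: the parameter list of lfc_fetch_and_add() is assumed. */
    #define QUEUE_HOME   0        /* var identifying the NI that stores the counter */
    #define QUEUE_SLOTS  1024

    unsigned claim_next_slot(void)
    {
        /* Atomically fetch the old value of the shared counter and increment it. */
        unsigned ticket = lfc_fetch_and_add(QUEUE_HOME);
        return ticket % QUEUE_SLOTS;   /* slot index in the shared queue */
    }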

3.1.6 Statistics

Besides the functions listed in Table 3.1, LFC provides various statistics routines. LFC can be compiled such that it counts events such as packet transmissions, DMA transfers, etc. The statistics routines are used to reset, collect, and print statistics.

3.2 Key Implementation Assumptions

We have implemented LFC on the computer cluster described in Section 1.6. The implementation makes the following key assumptions:

1. The network hardware is reliable.

2. Multicast and broadcast are not supported in hardware.

3. Each host is equipped with an intelligent NI.

4. Interrupt processing is expensive.

Assumptions 1-3 pertain to network architecture. Among these three, the first assumption, reliable network hardware, is the most controversial. We assume that the network hardware neither drops nor corrupts network packets. With the exception of custom supercomputer networks (e.g., the CM-5 network [91]), however, network hardware is usually considered unreliable. This may be due to the network material (e.g., insufficient shielding from electrical interference) or to the network architecture (e.g., lack of flow control in the network switches).

The second assumption, no hardware multicast, is satisfied by most switched networks (e.g., ATM), but not by token rings and bus-based networks (e.g., Ethernet). We assume that multicast and broadcast services must be implemented in software. Efficient software multicast schemes use spanning tree protocols in which network nodes forward the messages they receive to their children in the spanning tree (see Section 2.7).

The third assumption, the presence of an intelligent NI, enables the execution of nontrivial protocol code on the NI. Intelligent NIs are used both in supercomputers (e.g., FLASH [85] and the IBM SP/2 [129]) and in cluster architectures (e.g., Myrinet and ATM clusters). NI-level protocols form an important part of LFC's implementation (see Chapter 4).

The fourth assumption, expensive interrupts, is related to host-processor architecture and operating system implementations. Modern processors have deep pipelines, which must be flushed when an interrupt is raised. In addition, commercial operating systems do not dispatch interrupts efficiently to user processes [143]. We therefore assume that interrupt processing is slow and that interrupts should be used only as a last resort.

3.3 Implementation Overview

LFC consists of a library that implements LFC’s interface and a control program that runs on the NI and takes care of packet transmission, receipt, forwarding, and flow control. The control program implements the NI-level protocols described in Chapter 4. In addition, several device drivers are used to obtain efficient access to the network device. The library, the NI control program, and the device drivers are all written in C.

3.3.1 Myrinet

Myrinet implements hardware flow control on the network links between communicating NIs and has a very low bit-error rate [27]. If all NIs use a single, deadlock-free routing algorithm and are always ready to process incoming packets, then Myrinet will not drop any packets. In a single administrative domain of modest size, such as our Myrinet cluster, these conditions can be satisfied. Consequently, LFC assumes that Myrinet neither drops nor corrupts packets. The limitations of this approach are discussed in more detail in Section 3.9.

3.3.2 Operating System Extensions

All cluster nodes run the Linux operating system (RedHat distribution 5.2, kernel version 2.0.36). To support user-level communication, we added several device drivers to the operating system.

Myricom's Myrinet device driver was modified so that at most one user at a time can open the Myrinet network device. When a user process opens the device, the driver maps the NI's memory contiguously into the user's address space, so the user process can read and write NI memory using normal load and store instructions (i.e., programmed I/O). The driver also processes all interrupts generated by the NI. Each time the driver receives an interrupt, it sends a SIGIO software signal to the process that opened the device.

At initialization time, LFC's library allocates a fixed number of receive packets from the client's heap. The pages on which these packets are stored are pinned by means of the mlock() system call. The client can specify the number of receive packets that the library is to allocate. To transfer data from its memory into the receive packets in host memory, the NI's control program needs the physical addresses of the receive packets. To obtain these addresses, we implemented a small pseudo device driver that translates a page's virtual address to the corresponding physical address. (A pseudo device driver is a kernel module with a device driver interface but without associated hardware.) LFC's library invokes this driver at initialization time to compute each packet's physical address.

Writes to device memory (e.g., NI memory) are normally not cached by the writing processor, because the data written to device memory is unlikely to be read back again by the processor. Moreover, in the case of write-back caching, the device will not observe the writes until they happen to be flushed from the cache. By disabling caching for device memory, however, performance is reduced, because a bus transaction must be set up for each word that is written to the device. Write-back caches, in contrast, write complete cache lines to memory, which is beneficial when a large contiguous chunk of data is written.

To speed up writes to NI memory, LFC uses the Pentium Pro's ability to set the caching properties of specific virtual memory ranges [71]. A small pseudo device driver enables write combining for the virtual memory range to which the NI's memory is mapped. The Pentium Pro combines all writes to a write-combining memory region in on-processor write buffers. Writes are delayed until the write buffer is full or until a serializing instruction is issued. By aggregating writes to NI memory, the processor can transfer these writes in larger bursts to the I/O bus, which reduces the number of I/O bus arbitrations. As discussed in Section 2.2, the use of write combining considerably speeds up programmed I/O data transfers from host memory to NI memory.

The main disadvantage of applying write combining to a memory region is that writes to that region may be reordered. Two successive writes are executed in program order only if they are separated by a serializing instruction. When necessary, we use the Pentium Pro's atomic increment instruction to order writes.
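The pinning and address-translation steps described above might look as follows. The ioctl command and its numeric encoding are hypothetical names invented for this sketch; only mlock() and the existence of an address-translation pseudo driver come from the text.

    /* Sketch only: LFC_VTOP and the pseudo driver's interface are hypothetical. */
    #include <stdlib.h>
    #include <sys/ioctl.h>
    #include <sys/mman.h>

    #define LFC_VTOP       _IOWR('L', 1, unsigned long)  /* hypothetical command */
    #define PKT_BUF_BYTES  2048                          /* assumed buffer size  */

    void *alloc_pinned_recv_buffer(int vtop_fd, unsigned long *phys)
    {
        void *buf;

        if (posix_memalign(&buf, 4096, PKT_BUF_BYTES) != 0)
            return NULL;
        if (mlock(buf, PKT_BUF_BYTES) != 0) {            /* pin the page(s) */
            free(buf);
            return NULL;
        }
        *phys = (unsigned long)buf;                      /* in: virtual address  */
        if (ioctl(vtop_fd, LFC_VTOP, phys) != 0) {       /* out: physical address */
            munlock(buf, PKT_BUF_BYTES);
            free(buf);
            return NULL;
        }
        return buf;
    }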

Type      Resource             Description
Hardware  Send DMA engine      Transfers packets from NI memory to the network.
          Receive DMA engine   Transfers packets from the network to NI memory.
          Host/NI DMA engine   Mainly used to transfer packets from NI memory to
                               host memory. Also used to retrieve and update
                               status information.
Software  Send credits         A send credit represents a data packet buffer in
                               some NI's memory. For each destination NI, there
                               is a separate resource queue.
          Host receive buffer  A receive buffer in host memory.

Table 3.2. Resources used by the NI control program.

3.3.3 Library

LFC's library implements all routines listed in Table 3.1. The library manages send and receive buffers, communicates with the NI control program, and delivers packets to clients. Host-NI communication takes place through shared variables in NI memory, through DMA transfers between host and NI memory, and through signals generated by the kernel in response to NI interrupts.

3.3.4 NI Control Program

The main task of the NI control program is to send and receive packets. Outgoing and incoming packets use various hardware and software resources as they travel through the NI. The control program acquires and activates these resources in the right order for each packet. If the control program cannot acquire the next resource that a packet needs, then it queues the packet on a resource-specific (software) resource queue. All resources listed in Table 3.2, except the host receive buffers, have an associated resource queue.

Three hardware resources are available for packet transfers: the send DMA engine, the receive DMA engine, and the host/NI DMA engine. These engines operate asynchronously: typically, the NI processor starts a DMA transfer and continues with other work, periodically checking if the DMA engine has finished its transfer. (Alternatively, a DMA engine can generate an interrupt when it has completed a transfer. LFC's control program, however, does not use interrupts.)

Software resources are used to implement flow control, both between communicating NIs and between an NI and its host. Send credits represent buffer space in some NI's memory and are used to implement NI-to-NI flow control (see Section 4.1). Before an NI can send a data packet to another NI, it must acquire a send credit for that NI. Similarly, before an NI can copy a data packet to host memory, it must obtain a host buffer. Host buffer management is described in Section 3.6.

To utilize all resources efficiently, LFC's control program is structured around resources rather than packets. The program's main loop polls several work queues and checks if resources are idle. When the control program finds a task on one of its work queues, it executes the task up to the point where the task needs a resource that is not available. The task is then appended to the appropriate resource queue. When the control program finds that a resource is idle, it checks the resource's queue for new work. If the queue is nonempty, a task is dequeued and the resource is put back to use again.

The resource-centric structure optimizes resource utilization. In particular, all DMA engines can be active simultaneously. By allowing the receive DMA of one packet to proceed concurrently with the host DMA of another packet, we pipeline packet transfers and obtain better throughput. A packet-centric program structure would guide a single packet completely through the NI before starting to work on another packet. With this approach, the control program uses at most one DMA engine at a time, leaving the other DMA engines idle.

The resource-centric approach increases latency, because packets are enqueued and dequeued each time they use some resource. To attack this problem, LFC's control program contains a fast path at the sending side, which skips all queueing operations if all resources that a packet needs are available. We also experimented with a receiver-side fast path, but found that the latency gains with this extra fast path were small. To avoid code duplication, we therefore removed this fast path.

The control program acts in response to three event types:

1. send requests

2. DMA transfer completions

3. timer expiration

The control program's main loop polls for these events and processes them. Send requests are created by the host library in response to a client invocation of one of the packet launch routines. Each send request is stored (using programmed I/O) in a queue in NI memory. The control program's main loop polls this queue for new requests.

DMA transfers are used to send and receive packets and to move received packets to host memory. DMA transfer completion is signaled by means of bits in the NI's interrupt status register. When a packet transfer completes, the control program moves the packet to its next processing stage and checks if it can put the DMA engine to work again. The control program always restarts the receive DMA engine so that it is always prepared for incoming packets.

Myrinet's NI contains a timer with a granularity of 0.5 µs. The control program uses this timer to delay interrupts (see Section 4.4). Timer expiration is signaled by means of certain bits in the NI's interrupt status register.

Fig. 3.1. LFC's packet format. Each packet consists of a 12-byte header (a 2-byte Myrinet tag, an LFC tag, and source, size, credits, tree, and deadlock channel fields) followed by a user data buffer of 1024 bytes.

3.4 Packet Anatomy

LFC uses several packet types to carry user and control data across the network. As shown in Figure 3.1, each packet consists of a 12-byte header and a 1024-byte user data buffer. The buffer stores the data that a client wishes to transfer. LFC does not interpret the contents of this buffer. When LFC allocates a send packet (in lfc send alloc()) or passes a receive packet to lfc upcall(), the client receives a pointer to the data buffer. Clients must neither read nor write the packet header, but this is not enforced.

The header's tag field consists of a 2-byte Myrinet tag. Every Myrinet communication system can obtain its own tag range from Myricom. This tag range is different from the tag ranges assigned to other registered Myrinet communication systems. NIs can use the Myrinet tag to recognize and discard misrouted packets from different communication systems. All LFC packets carry Myrinet tag 0x0450.

LFC uses an additional tag field to identify different types of LFC packets. Unicast and multicast packets, for example, use different tags, because the control program treats unicast and multicast packets differently; multicast packets are forwarded, whereas unicast packets are not. Table 3.3 lists the most important packet tags.

Packet class  LFC tag        Function
Control       CREDIT         Explicit credit update
              CHANNEL CLEAR  Deadlock recovery
              FA REPLY       Fetch-and-add reply
Data          UCAST          Unicast packet
              MCAST          Multicast packet
              FA REQUEST     Fetch-and-add request

Table 3.3. LFC's packet tags.

We distinguish between control packets and data packets. Control packets require only a small amount of NI-level processing. Unlike data packets, control packets are never queued; when the NI receives a control packet, it processes it immediately and then releases the buffer in which the packet arrived. Two packet buffers suffice to receive and process all control packets. A second buffer is needed because the control program always restarts the receive DMA engine (with a free packet buffer as the destination address) before it processes the packet just received.

Data packets go through more processing stages and use more resources. Each UNICAST packet, for example, needs to be copied to host memory, but this can be done only when a free host buffer and the DMA engine are available. When either resource is unavailable, the control program queues the packet and starts working on another packet. Since some resources (e.g., host buffers) may be unavailable for an unpredictable amount of time, the control program may have to buffer many data packets. The total number of packets that needs to be buffered is bounded by LFC's flow control protocol, which stalls senders when receiving NIs run out of free packet buffers.

The source field in the packet header identifies the sender of a packet. At LFC initialization time, the client passes to each LFC process the number of participating processes and assigns a unique process identifier to each process. Since LFC allows only one process per processor, this process identifier also identifies the processor and the NI. Each time an NI transmits a packet, LFC stores the NI's identifier in the source field.

Fig. 3.2. LFC's data transfer path: (1) programmed I/O from the sender's address space into a send packet in NI memory, (2) network send and receive DMA transfers between the NIs, and (3) a DMA transfer from the receiving NI into a pinned host receive packet in the receiver's address space.

The user data part of send packets has a maximum size of 1024 bytes, but clients need not fill the entire data part. The size field specifies how many bytes have been written into the user data part of a packet. These bytes are called the valid user bytes and they must be stored contiguously at the beginning of a packet. When LFC transfers a packet, it transfers the packet header and the valid user bytes, but not the unused bytes at the end of the packet. This yields low latency for small messages and high throughput for large messages. The remaining header fields are described in detail in Chapter 4. The credits field is used to piggyback flow control information to the packet’s destination. The tree field identifies multicast trees. The deadlock channel field is used during deadlock recovery. LFC’s outgoing packets are tagged (in hardware) with a CRC that is checked (in software) by LFC’s control program at the receiving side. CRC errors are treated as catastrophic and cause LFC to abort. Lost packets are not detected.
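For concreteness, the structure below is one plausible C rendering of the 12-byte header of Figure 3.1. Only the 2-byte Myrinet tag, the field names, and the 12-byte total come from the text; the widths of the remaining fields are assumptions made for this sketch.

    typedef unsigned char  Byte;
    typedef unsigned short Half;        /* 16 bits */

    typedef struct {
        Half myrinet_tag;               /* 0x0450 for all LFC packets        */
        Half lfc_tag;                   /* packet type (see Table 3.3)       */
        Half src;                       /* sending process/NI identifier     */
        Half size;                      /* number of valid user bytes        */
        Half credits;                   /* piggybacked flow-control credits  */
        Byte tree;                      /* multicast tree identifier         */
        Byte deadlock_channel;          /* used during deadlock recovery     */
    } LfcPacketHeader;                  /* 12 bytes in total (assumed widths) */

    typedef struct {
        LfcPacketHeader hdr;
        Byte            data[1024];     /* user data buffer */
    } LfcPacket;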

3.5 Data Transfer

LFC transfers data in packets which are allocated from three packet buffer pools. Each NI maintains a send buffer pool and a receive buffer pool. Hosts maintain only a receive buffer pool; all buffers in this pool are stored in pinned memory. The data transfer path in Figure 3.2 for unicast packets shows how these three pools are used.

At the sending side, the client allocates send packets from the send buffer pool in NI memory (transfer 1 in Figure 3.2). Programmed I/O is used to copy client data directly, without operating system intervention, into a send packet. The client calls one of LFC's launch routines, which writes a send descriptor into the NI's send queue (see Figure 3.3). The send descriptor contains the packet's destination and a reference to the packet. The NI, which polls the send queue, inspects the send descriptor. When credits are available for the specified destination, the descriptor is moved to the transmit queue; the packet will be transmitted when the descriptor reaches the head of this queue (transfer 2 in Figure 3.2). After the packet has been transmitted, it is returned to the NI's send packet pool. When no send credits are available, the descriptor is added to a blocked-sends queue associated with the packet's destination. When credits flow back from some destination, descriptors are moved from that destination's blocked-sends queue to the transmit queue. The details of the credit algorithm are described in Chapter 4.

Fig. 3.3. LFC's send path and data structures: client data is stored in a packet from the NI's send buffer pool, a send descriptor is written into the send queue, and descriptors move either to a per-destination blocked-sends queue or to the transmit queue on the network interface.

The destination NI receives the incoming packet into a free packet buffer allocated from its receive buffer pool. This pool is partitioned evenly among all NIs: each NI owns a fixed number of the buffers in the receive pool and can fill these buffers (and no more) by sending packets. Each buffer corresponds to one send credit and filling a buffer corresponds to consuming a send credit.

Each host is responsible for passing the addresses of free host receive buffers to its NI. These addresses are stored in a shared queue in NI memory (see Section 3.6). As soon as a free host receive buffer is available, the NI copies the packet to host memory (transfer 3 in Figure 3.2). After this copy, the NI receive buffer is returned to its pool. When no host receive buffers are available, the NI simply does not copy packets to the host and does not release any of its receive buffers. Eventually, senders will be stalled and remain stalled until the receiving host posts new buffers.

Each host receive buffer contains a status field that indicates whether the control program has copied data into the buffer. The host uses this field to detect incoming messages. Copying a packet from an NI receive buffer to a host receive buffer involves two actions: copying the packet's payload to host memory and setting the status field in the host buffer to FULL. For efficiency, LFC folds both actions into a single DMA transfer. This is illustrated in Figure 3.4, which also shows a naive implementation that uses two DMA transfers to copy the packet's payload and status field.

Fig. 3.4. Two ways to transfer the payload and status flag of a network packet from an NI packet buffer to a packet buffer in host memory. The dashed arrows illustrate a simple, but expensive manner (separate data and status DMA transfers); the solid arrow indicates the optimized, combined data and status DMA transfer.

To avoid transferring the unused part of the receive packet's data field, the status field must be stored right after the packet's payload. (If it were stored in front of the data, the host could believe it received a packet before all of that packet's data had arrived.) The position of the FULL word therefore depends on the size of the packet's payload. The host, however, needs to know in advance where to poll for the status field. The control program therefore transfers the packet's payload and the FULL word to the end of the host receive buffer. In addition, the control program transfers a word that holds the payload's displacement relative to the beginning of the host's receive buffer.

Once filled, the host buffer is passed to the client-supplied upcall routine. The client decides when it releases this buffer. Once released, the buffer is returned to the host receive buffer pool.
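The sketch below shows one way the host-side buffer layout described above could look in C: the payload ends up at the back of the buffer, adjacent to the displacement and status words, so the host polls a single fixed location. All sizes and field positions are assumptions for this illustration, not LFC's actual layout.

    #define HOST_BUF_PAYLOAD  1036      /* assumed payload area size */
    #define STATUS_EMPTY      0
    #define STATUS_FULL       1

    typedef struct {
        unsigned char payload_area[HOST_BUF_PAYLOAD];
        volatile unsigned displacement; /* where the payload starts          */
        volatile unsigned status;       /* EMPTY or FULL, polled by the host */
    } HostRecvBuf;

    /* Returns the payload pointer once the NI has filled the buffer, or NULL. */
    static const unsigned char *host_buffer_poll(HostRecvBuf *buf)
    {
        if (buf->status != STATUS_FULL)
            return NULL;
        return buf->payload_area + buf->displacement;
    }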

Fig. 3.5. LFC's receive data structures: the host receive queue (with its head, empty, and tail pointers) in host memory and the shared queue of free host buffer addresses in network interface memory; the NI fills host buffers by DMA transfer.

Multicast packets follow the same path as unicast packets, but may additionally be forwarded by receiving NIs to other destinations. An NI receive buffer that contains a multicast packet is not returned to its pool until the multicast packet has been forwarded to all forwarding destinations.

3.6 Host Receive Buffer Management

Each host has a pool of free host receive buffers. The client specifies the initial number of buffers in this pool. At initialization time, the library obtains the physical address (1) of each buffer in the pool and stores this address in the buffer's descriptor.

(1) To be precise, we obtain the bus address. This is a device's address for a particular host memory location. On some architectures, this address can be different from the physical address used by the host's CPU, but on Intel's x86 architecture the bus address and the physical address are identical.

Besides the pool, the library manages a receive queue of host receive buffers (see Figure 3.5). The library maintains three queue pointers. Head points to the first FULL, but unprocessed packet buffer. This is the next buffer that will be passed to lfc upcall(). Empty points to the first EMPTY receive buffer. This is the next buffer that the library expects to be filled by the NI. Tail points to the last buffer in the queue. When the library allocates new receive buffers from the pool, it appends them to the receive queue and advances tail.

Each time the library appends a buffer to its receive queue, it also appends the physical address of the buffer to a shared queue in NI memory. When the NI control program receives a network packet, it tries to fetch an address from the shared queue and copy the packet to this address. When the shared queue is empty, the control program defers the transfer until the queue is refilled by the host library.

It is not necessary to separate the receive queue and the pool, but doing so often reduces the memory footprint of the host receive buffers. In most cases, we keep only a small number of buffers on the receive queue. As long as the client is able to use and reuse these buffers, only a small portion of the processor's data cache is polluted by host receive buffers. In some cases, however, clients need more buffers, which are then allocated from the pool and added to the receive queue. In those cases, a larger portion of the cache will be polluted by host receive buffers.

When the library needs to drain (see Section 3.1.4), it performs two actions:

1. If the packet pointed to by empty has been filled, the library advances empty. If empty cannot be advanced (because all packets are full) and upcalls are not allowed, then the library appends a new buffer from the pool to both queues.

2. If head does not equal empty and upcalls are allowed, the library dequeues head and calls lfc upcall(), passing head as a parameter. Head is then advanced. If there are no buffers left in the receive queue, then the library appends a new buffer from the pool to both queues before calling lfc upcall().

Packets freed by clients are appended to both queues, unless the queue already contains a sufficient number of empty packets. In the latter case, the packet is returned to the buffer pool.
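The two draining actions above can be sketched as follows. The queue pointers mirror Figure 3.5; the helper routines are placeholders for library internals and do not correspond to actual LFC function names.

    /* Sketch of the library's draining step; helper names are illustrative. */
    extern void *head, *empty;                    /* receive-queue pointers      */
    extern int   buffer_is_full(void *buf);
    extern void *queue_next(void *buf);
    extern int   all_queued_buffers_full(void);
    extern int   receive_queue_is_empty(void);
    extern void  append_buffer_from_pool(void);   /* adds to both queues         */
    extern int   invoke_lfc_upcall(void *buf);    /* nonzero if client keeps it  */
    extern void  release_buffer(void *buf);

    static void library_drain(int upcalls_allowed)
    {
        /* Action 1: advance 'empty' past a buffer the NI has filled; if all
         * queued buffers are full and upcalls are not allowed, post a fresh
         * buffer so the NI never stalls for lack of host buffers. */
        if (buffer_is_full(empty))
            empty = queue_next(empty);
        if (all_queued_buffers_full() && !upcalls_allowed)
            append_buffer_from_pool();

        /* Action 2: deliver one filled buffer to the client, if allowed. */
        if (head != empty && upcalls_allowed) {
            void *buf = head;
            head = queue_next(head);
            if (receive_queue_is_empty())
                append_buffer_from_pool();
            if (!invoke_lfc_upcall(buf))          /* client returned ownership */
                release_buffer(buf);
        }
    }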

3.7 Message Detection and Handler Invocation

LFC delivers packets to a receiving process by invoking lfc upcall(). This routine is either invoked synchronously, in the context of a client call to a draining LFC routine, or asynchronously, in the context of a signal handler. Here, we discuss the implementation of both mechanisms.

Each invocation of lfc poll() checks if the control program has copied a new packet to host memory. If lfc poll() finds a new packet, it invokes lfc upcall(), passing the packet as a parameter. To allow fair scheduling of the client computation and the packet handler, lfc poll() delivers at most one network packet.

The check for the next packet is implemented by reading the status field of the next undelivered packet buffer (empty in Figure 3.5). When this buffer's address was passed to the control program, its status field was set to EMPTY. The poll succeeds when the host detects that this field has been set to FULL. This polling strategy is efficient, because the status field resides in host memory and will be cached (see Section 2.5).

Asynchronous packet delivery is implemented cooperatively by LFC's control program, the Myrinet device driver, and LFC's library. In principle, LFC's control program generates an interrupt after it has copied a packet to host memory. However, since the receiving process may poll, the control program delays its interrupts. Myrinet NIs contain an on-board timer with 0.5 µs granularity. When a packet arrives, the control program sets this timer to a user-specified timeout value (unless the timer is already running). An interrupt is generated only if the host fails to process a packet before the timer expires. The details of this polling watchdog mechanism are described in Chapter 4.

Interrupts generated by the control program are received by the operating system kernel, which passes them to the Myrinet device driver. The driver transforms the interrupt into a user-level software signal and sends this network signal to the client process. These signals are normally caught by the LFC library. The library's signal handler checks if the client is running with interrupts enabled. If this is the case, the signal handler invokes lfc poll() to process pending packets; otherwise it returns immediately.

Recall that clients may dynamically disable and enable network interrupts. For efficiency, the interrupt status manipulation functions are implemented by toggling an interrupt status flag in host memory, without any communication or synchronization with the control program. Before sending an interrupt, the control program fetches (using a DMA transfer) the interrupt status flag from host memory. No interrupt is generated when the flag indicates that the client disabled interrupts.

Since the library does not synchronize with the control program when it manipulates the interrupt status flag, two races can occur:

1. The control program may raise an interrupt just after the client disabled interrupts. Unless precautions are taken, the signal handler will invoke the packet handler even though the client explicitly disabled interrupts. To solve this problem, LFC’s signal handler always checks the interrupt status flag before invoking lfc poll() and ignores the signal if the race occurred. In practice, this race does not occur very often, so the advantage of inexpensive interrupt manipulation outweighs the cost of spurious interrupts.

2. The control program may decide not to raise an interrupt just after the client (re-)enabled interrupts. This race is fatal for clients that rely on interrupts for correctness. To avoid lost interrupts, the control program continues to generate periodic interrupts for a packet until the host has processed that packet (see Chapter 4).
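The interplay between the interrupt status flag and the signal handler can be sketched as follows. The flag, the enable/disable function names, and the registration detail are illustrative; only lfc_poll() is an actual LFC routine.

    #include <signal.h>

    static volatile sig_atomic_t interrupts_enabled = 1;   /* software mask */

    void lfc_intr_enable(void)  { interrupts_enabled = 1; }   /* illustrative */
    void lfc_intr_disable(void) { interrupts_enabled = 0; }   /* illustrative */

    static void network_signal_handler(int signo)
    {
        (void)signo;
        /* Race 1: if the client disabled interrupts after the NI decided to
         * interrupt, simply ignore the signal.  Race 2 is handled by the NI,
         * which keeps generating interrupts until the packet is processed. */
        if (interrupts_enabled)
            lfc_poll();                  /* delivers at most one packet */
    }

    /* Registration, typically done at library initialization:
     *     signal(SIGIO, network_signal_handler);
     */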

3.8 Fetch-and-Add

The F&A implementation is straightforward. Every NI stores one F&A variable, which is identified by the var argument in lfc fetch and add() (see Table 3.1). Lfc fetch and add() performs a remote procedure call to the NI that stores the F&A variable. The process that invoked lfc fetch and add() allocates a request packet, tags it with tag FA REQUEST, and appends it to the NI's host send queue. When the destination NI receives the request, it increments its F&A variable and stores the previous value in a reply packet. The reply packet, a control packet tagged FA REPLY, is then appended to the NI's transmit queue.

The implementation assumes that each participating process can issue at most one F&A request at a time. To allow multiple outstanding requests, we need a form of flow control that prevents the NI that processes those requests from running out of memory as it creates and queues FA REPLY packets. FA REQUEST packets consume credits, but those credits are returned as soon as the request has been received and possibly before a reply has been generated.

When the reply arrives at the requesting NI, the control program copies the F&A result from the reply to a location in NI memory that is polled by lfc fetch and add(). When lfc fetch and add() finds the result, it returns it to the client.

This implementation is simple and efficient. The FA REQUEST packet is handled entirely by the receiving NI, without any involvement of the host to which the NI is attached. This is particularly important when processes issue many F&A requests. Many of these requests would either be delayed or generate interrupts if they had to be handled by the host.

The implementation uses a data packet for the request and a control packet for the reply. The request is a data packet only because the current implementation of LFC does not allow hosts to send control packets. Although this is easy to add, it will increase the send overhead for all packets, because the NI will have to check if the host sends a data or a control packet.
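The NI-side handling of an F&A request can be sketched as follows. The reply layout, the tag value, and the helper routines are illustrative and do not correspond one-to-one to the control program's actual code.

    #define FA_REPLY 2                     /* illustrative numeric tag value */

    typedef struct {                       /* illustrative reply layout */
        unsigned tag;                      /* FA REPLY (see Table 3.3)  */
        unsigned value;                    /* previous value of the variable */
    } FaReply;

    static unsigned fa_variable;           /* this NI's single F&A variable */

    static void handle_fa_request(unsigned requester)
    {
        FaReply *reply = alloc_control_packet();   /* illustrative allocator   */

        reply->tag   = FA_REPLY;
        reply->value = fa_variable;                /* fetch the old value ...  */
        fa_variable += 1;                          /* ... then add             */
        enqueue_transmit(requester, reply);        /* append to transmit queue */
    }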

3.9 Limitations

The implementation of LFC on Myrinet makes a number of simplifying assumptions, some of which limit its applicability in a production environment. We distinguish between functional limitations and security violations.

3.9.1 Functional Limitations

LFC's functional limitations are all related to the assumption that LFC will be used in a homogeneous, dedicated computing environment.

Homogeneous Hosts

LFC assumes that all host processors use the same byte order. Adding byte swapping for packet headers is straightforward and will add little overhead. (It is the client's responsibility to convert the data contained in network packets; LFC does not know what type of data clients store in their packets.) Since our cluster's host processors are little-endian and the NI is big-endian, some byte swapping is already used for data that needs to be interpreted both by the host processor and the NI. This concerns mainly shared descriptors and packet headers; user data contained in packets is not interpreted by LFC and is never byte-swapped.

Device Sharing

LFC can service at most one user process per host, mainly because LFC's control program does not separate the packets from different users. In our current environment (DAS), the Myrinet device driver returns an error if a second user tries to obtain access to the Myrinet device.

Multiple users can be supported by giving different users access to different parts of the NI's memory. This can be achieved by setting up appropriate page mappings between the user's virtual address space and the NI's memory. If this is done, the NI's control program must poll multiple command queues located on different memory pages, which is likely to have some impact on performance.

Newer Myrinet network interfaces provide a doorbell mechanism. This mechanism detects host processor writes to certain regions and appends the target address and value written to a FIFO queue. With this doorbell mechanism, the NI need not poll all the pages that different processes can write to; it need only poll the FIFO queue.

Recall that LFC partitions the available receive buffer space in NI memory among all participating nodes. If the NI's memory is also partitioned to support multiprogramming, then the available buffer space per process will decrease noticeably. Each process will be given fewer send credits per destination process, which reduces performance. A simple, but expensive solution is to buy more memory. However, this may only be possible to a limited extent. A less expensive and more scalable solution is to employ swapping techniques such as described in Section 2.4.

Fixed Process Configuration

LFC assumes that all of an application's processes are known at application startup time; users cannot add processes to or remove processes from the initial set of processes. The main obstacle to adding processes dynamically is that LFC evenly divides all NI receive buffers (or rather, the send credits that represent them) among the initial set of processes. When a process is added, send credits must be reallocated in a safe way.

Reliable Network

LFC assumes that the network hardware never drops or corrupts packets. While LFC tests for CRC errors in incoming packets, it does not recover from such errors. Lost packets are not detected at all.

A retransmission mechanism would allow an LFC application to tolerate transient network failures. In addition, retransmission allows communication protocols to drop packets when they run out of resources. With retransmission, LFC would not need to partition NI receive buffers among all senders. Chapter 8 considers alternative implementations which do retransmit dropped and corrupted packets.

3.9.2 Security Violations

LFC does not limit a user process's access to the NI and is therefore not secure. Any user process with NI access can crash any other process or even the kernel. With a little kernel support, though, LFC can be made secure; the techniques to achieve secure user-level communication are well-known. Moreover, LFC's send and receive methods require relatively simple and inexpensive security mechanisms.

Below, we describe the security problems in more detail. Where necessary, we also provide solutions. These solutions have not been implemented.

Network Interface Access

LFC (or rather, the Myrinet device driver used by LFC) allows an arbitrary process to map all NI memory into its address space and does not restrict access to this memory. As a result, users can freely modify the NI control program. Once modified, the control program can both read and write all host memory locations.

Access to the NI can be restricted by mapping only part of the NI's memory into a process's address space. The process is not given any access to the control program or to another process's pages. LFC can easily be modified to work with a device driver that implements this type of protection, and the expected performance impact is small. The page mappings can be set up at application initialization time so that the cost of setting up these mappings is not incurred on the critical communication path.

Address Translation

The LFC library passes the physical memory addresses of free host receive buffers to the control program. Since the control program does not check if these addresses are valid, a malicious program can pass an illegal address, which the control program will use as the destination of a DMA transfer.

A simple solution is to have all host receive buffers allocated by a trusted kernel module. This module assigns to each buffer an index (a small integer) and maps this index to the buffer's physical address. The mapping is passed to the NI, but not to the user process. The user process, i.e., the LFC library, is only given each buffer's virtual address and its index. Instead of passing a buffer's physical address to the NI, the library will pass the buffer's index. The NI control program can easily check if the index it receives is valid. This way, DMA transfers to host memory can only target the buffers allocated by the trusted kernel module.

Another solution is to use a secure address translation mechanism such as VMMC-2's user-managed TLB [49] (see Section 2.3). This scheme is more involved, but supports dynamic pinning and unpinning of any page in the user's address space, whereas the scheme above would pin all host receive buffers once and for all.
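The index-based check described above could look as follows on the NI. The table, its size, and the helper routine are illustrative; this is a sketch of the proposed (unimplemented) solution, not existing LFC code.

    #define MAX_HOST_BUFFERS 256                       /* assumed table size */

    /* Filled in by the trusted kernel module; 0 marks an unused entry. */
    static unsigned long host_buf_phys[MAX_HOST_BUFFERS];

    /* Called when the library posts a free host buffer by index. */
    int post_host_buffer(unsigned index)
    {
        if (index >= MAX_HOST_BUFFERS || host_buf_phys[index] == 0)
            return -1;                                 /* reject invalid indices */
        enqueue_free_host_buffer(host_buf_phys[index]);   /* illustrative helper */
        return 0;
    }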

Foreign Packets

LFC's control program makes no serious attempt to reject network packets that originate from another application. LFC rejects only packets that do not carry LFC's Myrinet tag or that carry an out-of-range LFC tag. Consequently, malicious or faulty programs can send a packet to any other application as long as the packet carries tags known by LFC. The Hamlyn system [32] provides each parallel application's process with a shared random bit pattern that is appended to each message and verified by the receiver. The bit pattern is sufficiently long that an exhaustive search for its value is computationally infeasible.

3.10 Related Work

The design of LFC was motivated in part by the shortcomings of existing systems. Since many of these systems were discussed in Chapter 2, we will not treat them all again here.

LFC uses optimistic interrupt protection [134] to achieve atomicity with respect to network signals. This technique was used in the Mach operating system kernel to reduce the overhead of hardware interrupt-mask manipulation. With this technique, interrupt-masking routines do not truly disable interrupts, but manipulate a software flag that represents the current interrupt mask. It is the responsibility of the interrupt handler to check this flag. If the flag indicates that interrupts are not allowed, the interrupt handler creates an interrupt continuation, some state that allows the interrupt-handling code to be executed at a later time. LFC uses the same technique, but does not need to create interrupt continuations, because the NI will automatically regenerate the interrupt.

Some systems emulate asynchronous notification using a background thread that periodically polls the network [54]. The problem is to find a good polling frequency. Also, adding a thread to an otherwise single-threaded software system is often difficult. A similar idea is used in the Message-Passing Library (MPL) for the IBM SP/2. This library uses periodic timer interrupts to allow the network to be drained even if senders and receivers are engaged in long-running computations [129]. The main difference is that MPL's timeout handler does not invoke user code, because MPL does not provide an asynchronous notification interface.

Another emulation approach is to use a compiler or binary rewriter to insert polls automatically into applications. Shasta [125], a fine-grain DSM, uses binary instrumentation. In a simulation study [114], Perkovic and Keleher compare automatic insertion of polls by a binary rewriter and the polling-watchdog approach. Their simulation results, performed in the context of the CVM DSM, indicate that both approaches work well and perform better than an approach that relies exclusively on polling during message sends.

3.11 Summary

In this chapter, we have presented the interface and implementation of LFC, a new user-level communication system designed to support the development of parallel programming systems. LFC provides reliable point-to-point and multicast communication, interrupt management, and synchronization through fetch-and-add. LFC's interface is low-level. In particular, there is no message abstraction: clients operate on send and receive packets.

This chapter has described the key components of LFC's implementation on Myrinet (library, device drivers, and NI control program). We have described LFC's data transfer path, buffer management, control transfer, and synchronization. The NI-level communication protocols, which form an important part of the implementation, are described in the next chapter.

Chapter 4

Core Protocols for Intelligent Network Interfaces

This chapter gives a detailed description of LFC's NI-level protocols: UCAST, MCAST, RECOV, and INTR. The UCAST protocol implements reliable point-to-point communication between NIs. UCAST assumes reliable network hardware and preserves this reliability using software flow control implemented at the data link level, between NIs. LFC is named after this property: Link-level Flow Control.

Link-level flow control also forms the basis of MCAST, an NI-level spanning tree multicast protocol. MCAST forwards multicast packets on the NI rather than on the host; this prevents unnecessary reinjection of packets into the network and removes host processing, in particular interrupt processing and copying, from the critical multicast path. MCAST works for a limited, but widely used class of multicast trees (e.g., binary and binomial trees). For trees not in this class, MCAST may deadlock. The RECOV protocol extends MCAST with a deadlock recovery protocol that allows the use of arbitrary multicast trees.

UCAST, MCAST, and RECOV are concerned with data transfer between NIs. The last protocol described in this chapter, INTR, deals with NI-to-host control transfer. INTR reduces the overhead of network interrupts by delaying network interrupts.

This chapter is structured as follows. Sections 4.1 through 4.4 discuss, respectively, UCAST, MCAST, RECOV, and INTR. Section 4.5 discusses related work.


4.1 UCAST: Reliable Point-to-Point Communication

The UCAST protocol implements reliable and FIFO point-to-point transfers of packets between NIs. Since we assume that the hardware does not drop or corrupt packets (see Section 3.2), the only cause of packet loss is lack of buffer space on receiving nodes (both NIs and hosts). This chapter deals only with NI-level buffer overflow; host-level buffering was discussed in Section 3.6. UCAST uses a variant of sliding window flow control between each pair of NIs to prevent NI-level buffer overflows. The protocol guarantees that no NI will send a packet to another NI before it knows the receiving NI can store the packet. To achieve this, each NI assigns an equal number of its receive buffers to each NI in the system. An NI S that needs to transmit a packet to another NI R may do so only when it has at least one send credit for R. Each send credit corresponds to one of R’s free receive buffers for S. Each time S sends a packet to R, S consumes one of its send credits for R. Once S has consumed all its send credits for R, S must wait until R frees some of its receive buffers for S and returns new send credits to S. Send credits are returned by means of explicit acknowledgements or by piggybacking them on application-level return traffic.

4.1.1 Protocol Data Structures

Figure 4.1 shows the types and variables used by UCAST. All variables are stored in NI memory. UCAST uses two types of network packets: UNICAST packets carry user data; CREDIT packets are used only to return credits to a sender and do not carry any data. Each packet carries a header of type PacketHeader that identifies the packet's type (tag) and sender (src). In addition, the protocol uses a credits field in the header to return send credits to an NI.

Transmission requests are stored in send descriptors of type SendDesc that identify the packet to be transmitted (packet) and the destination NI (dest). Multiple send descriptors can refer to the same packet; this is important for the multicast protocol, which can forward a single packet to multiple destinations.

Each NI maintains protocol state for each other NI. This state is stored in a protocol control block of type ProtocolCtrlBlock. The protocol state for a given NI consists of the number of remaining send credits for that NI (send credits), a queue of blocked transmission requests (blocked sends), and the number of credits that can be returned to that NI (free credits).

Finally, each NI maintains several global variables. Each NI stores its unique id in LOCAL ID, the number of NIs in the system in NR NODES, and a credit refresh threshold in CREDIT REFRESH. The refresh threshold is used by receivers to determine when credits must be returned to the sender that consumed them. The protocol control blocks are stored in array pcbtab. Variable nr pkts received counts how many packets an NI has received; this variable is used by the INTR protocol (see Section 4.4).

typedef enum { UNICAST, CREDIT } PacketType;

typedef struct {
    PacketType tag;                     /* packet type          */
    unsigned   src;                     /* sender of packet     */
    unsigned   credits;                 /* piggybacked credits  */
} PacketHeader;

typedef struct {
    PacketHeader hdr;
    Byte         data[PACKET_SIZE];
} Packet;

typedef struct {
    unsigned  dest;
    Packet   *packet;
} SendDesc;

typedef struct {
    unsigned      send_credits;
    SendDescQueue blocked_sends;
    unsigned      free_credits;
} ProtocolCtrlBlock;

unsigned LOCAL_ID, NR_NODES, CREDIT_REFRESH;    /* runtime constants       */
ProtocolCtrlBlock pcbtab[MAX_NODES];            /* per-node protocol state */
unsigned nr_pkts_received;

Fig. 4.1. Protocol data structures for UCAST.

void event_unicast(unsigned dest, Packet *pkt)
{
    SendDesc *desc = alloc_send_desc();

    pkt->hdr.tag = UNICAST;
    desc->dest   = dest;
    desc->packet = pkt;
    credit_send(desc);
}

void credit_send(SendDesc *desc)
{
    ProtocolCtrlBlock *pcb = &pcbtab[desc->dest];

    if (pcb->send_credits > 0) {
        pcb->send_credits--;                    /* consume a credit */
        send_packet(desc);
        return;
    }
    enqueue(&pcb->blocked_sends, desc);         /* wait for a credit */
}

void send_packet(SendDesc *desc)
{
    ProtocolCtrlBlock *pcb = &pcbtab[desc->dest];
    Packet *pkt = desc->packet;

    pkt->hdr.src     = LOCAL_ID;
    pkt->hdr.credits = pcb->free_credits;       /* piggyback credits */
    pcb->free_credits = 0;
    transmit(desc->dest, pkt);
}

Fig. 4.2. Sender side of the UCAST protocol.

4.1.2 Sender-Side Protocol

Figure 4.2 shows the reliability protocol executed by each sending NI. All protocol code is event-driven; this part of the protocol is activated in response to unicast events which are generated when the NI's host launches a packet. How LFC generates this event was discussed in Chapter 3.

Each unicast event is dispatched to event unicast(). This routine creates a send descriptor and passes this descriptor to credit send(), which checks if a send credit is available for the destination specified in the send descriptor. If this is the case, the packet is transmitted; otherwise the send descriptor is queued on the blocked-sends queue for the destination NI. It will remain queued until the destination NI has returned a sufficient number of send credits.

Packets are transmitted by send packet(). This routine is also used to return send credits to the destination NI. In the protocol control block of each NI we count how many credits can be returned to that NI; send packet() copies the counter into the credits field of the packet header and then resets the counter. This way, any packet travelling to an NI will automatically return any available send credits to that NI. Send packets can be returned to the NI send buffer pool as soon as transmission has completed. We do not show send packet deallocation, because it does not trigger any significant protocol actions.

4.1.3 Receiver-Side Protocol

The receiver side of the reliability protocol is shown in Figure 4.3. This part of the protocol handles two events: packet arrival and packet release. Incoming packets are dispatched to event packet received(). This routine accepts any new send credits that the sender may have returned and then processes the packet. If the packet is a UNICAST data packet, it is delivered to the host. In this protocol description, we are not concerned with the details of NI-to-host delivery. CREDIT packets serve only to send credits to an NI; they do not require any further processing and are discarded immediately.

Routine accept new credits() adds the credits returned in packet pkt to the send credits for NI src. Next, this routine walks the blocked-sends queue to transmit to src as many blocked packets as possible. Since the blocked-sends queue is checked immediately when credits are returned, packets are always transmitted in FIFO order.

Although UCAST is not concerned with the exact way in which packets are delivered to the host, it needs to know when a packet buffer can be reused for new incoming packets. The send credit consumed by the packet's sender cannot be returned until the packet is available for reuse. We therefore require that a packet release event be generated when a receive packet is released. This event is dispatched to event packet released(), which returns a send credit to the packet's sender and then deallocates the packet. Credits are returned by return credit to(), which increments a counter (free credits) in the protocol control block of the packet's sender. If this counter reaches the credit refresh threshold, return credit to() sends an explicit CREDIT packet to the sender. This happens only if data packets flow mainly in one direction. If there is sufficient return traffic, free credits will never reach the refresh threshold, because send packet() will already have piggybacked all free credits on an outgoing UNICAST packet.

void event_packet_received(Packet *pkt)
{
    accept_new_credits(pkt);
    switch (pkt->hdr.tag) {
    case UNICAST:
        deliver_to_host(pkt);
        nr_pkts_received++;
        break;
    case CREDIT:
        break;
    }
}

void accept_new_credits(Packet *pkt)
{
    ProtocolCtrlBlock *pcb = &pcbtab[pkt->hdr.src];
    SendDesc *desc;

    /* Retrieve (piggybacked) credits and unblock blocked senders. */
    pcb->send_credits += pkt->hdr.credits;
    while (!queue_empty(&pcb->blocked_sends) && pcb->send_credits > 0) {
        desc = dequeue(&pcb->blocked_sends);
        credit_send(desc);
    }
}

void event_packet_released(Packet *pkt)
{
    return_credit_to(pkt->hdr.src);
}

void return_credit_to(unsigned sender)
{
    ProtocolCtrlBlock *pcb = &pcbtab[sender];

    pcb->free_credits++;
    if (pcb->free_credits >= CREDIT_REFRESH) {
        Packet credit_packet;
        SendDesc *desc = alloc_send_desc();

        credit_packet.hdr.tag = CREDIT;         /* send explicit credit packet */
        desc->dest   = sender;
        desc->packet = &credit_packet;
        send_packet(desc);
    }
}

Fig. 4.3. Receiver side of the UCAST protocol.


Fig. 4.4. Host-level (left) and interface-level (right) multicast forwarding.

UCAST is simple and efficient. Since the protocol preserves the reliability of the underlying hardware by avoiding receiver buffer overruns, senders need not buffer packets for retransmission, nor need they manage any timers. The main disadvantage of UCAST is that it statically partitions the available receive buffer space among all senders, so it needs more NI memory as the number of processors is increased. Chapter 8 studies this issue in more detail.
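To see how this scales, suppose, purely as an illustration, that an NI dedicates 512 packet buffers to receiving and that the cluster has 64 nodes. Each of the 63 possible senders then owns 512/63 = 8 send credits (rounding down). Doubling the cluster to 128 nodes leaves only 512/127 = 4 credits per sender, and restoring 8 credits per sender would require twice the NI buffer memory.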

4.2 MCAST: Reliable Multicast Communication

Since we assume that multicast is not supported in hardware, we provide a software spanning-tree multicast protocol. This protocol, MCAST, implements a reliable and FIFO-ordered multicast. Before describing MCAST, we discuss different multicast forwarding strategies and the importance of multicast tree topology.

4.2.1 Multicast Forwarding

Most spanning-tree protocols use host-level forwarding. With host-level forwarding, the sender is the root of a multicast tree and sends a multicast packet to each of its children. Each child's NI passes the packet to its host, which reinjects the packet to forward it to the next level in the multicast tree (see Figure 4.4).

Host-level forwarding has three drawbacks. First, since each reinjected packet was already available in NI memory, host-level forwarding results in an unnecessary host-to-NI data transfer at each internal tree node of the multicast tree, which wastes bus bandwidth and processor time. Second, no forwarding takes place unless the host processes incoming packets. If one host does not poll the network in a timely manner, all its children will be affected. Instead of relying on polling, the NI can raise an interrupt, but interrupts are expensive. Third, the critical sender-receiver path includes the host-NI interactions of all the nodes between the sender and the receiver. For each multicast packet, these interactions consist of copying the packet to the host, host processing, and reinjecting the packet.

MCAST addresses these problems by means of NI-level forwarding: multicast packets are forwarded by the NI without intervention by host processors.

4.2.2 Multicast Tree Topology Multicast tree topology matters in two ways. First, different trees have different performance characteristics. Many different tree topologies have been described in the literature [30, 77, 80]. Deep trees (i.e., trees with a low fanout), for exam- ple, give good throughput, because internal nodes need to forward packets fewer times, so they use less time per multicast. Latency increases because more for- warding actions are needed before the last destination is reached. Depending on the type of multicast traffic generated by an application, one topology gives better performance than another. Our goal is not to invent or analyze a particular topol- ogy that is optimal in some sense. No topology is optimal under all circumstances. It is important, however, that a multicast protocol does not prohibit topologies that are appropriate for the application at hand. Second, with MCAST’s packet forwarding scheme (described later), some trees can induce buffer deadlocks. A multicast packet requires buffer space at all its destinations. This buffer space can be obtained before the packet is sent or it can be obtained more dynamically, as the packet travels from one node in the spanning tree to another. The first approach is conceptually simple, but requires a global protocol to reserve buffers at all destination nodes. The second approach, taken by MCAST, introduces the danger of buffer deadlocks. A sender may know that its multicast packet can reach the next node in the spanning tree, but does not know if the packet can travel further once it has reached that node. Section 4.3 gives an example of such a deadlock. MCAST avoids buffer deadlocks by restricting the topology of multicast trees. An alternative approach is to detect deadlocks and recover from them. This ap- proach is taken by RECOV, an extension of MCAST that allows the use of arbi- trary multicast trees.

4.2.3 Protocol Data Structures

In MCAST, NIs multicast inside multicast groups that are created at initialization time; MCAST does not include a group join or leave protocol. Each member of a multicast group has its own spanning tree, which it uses to forward multicast packets to the other group members. Figure 4.5 shows MCAST's data structures. MCAST builds on UCAST, so we reuse and extend UCAST's data structures and routines. MCAST introduces a new packet type, MULTICAST, and extends the packet header with a tree field. Each process is given a unique index for each multicast group that it is a member

typedef enum {UNICAST, MULTICAST, CREDIT} PacketType;

typedef struct {
    ...                              /* as before */
    unsigned tree;                   /* identifies multicast tree */
} PacketHeader;

typedef struct {
    unsigned parent;
    unsigned nr_children;
    unsigned children[MAX_FORWARD];  /* MAX_FORWARD depends on tree size and shape */
} Tree;

Tree treetab[MAX_TREES];

Fig. 4.5. Protocol data structures for MCAST.

of. This index identifies the process’s multicast tree for that particular multicast group. When the process transmits a packet to the other members of that multicast group, it stores the tree identifier in the packet’s tree field (see Figure 3.1). Each NI stores a multicast forwarding table (treetab) that is indexed by this tree field. The table entry for a given tree specifies how many children the NI has in that tree and lists those children.

4.2.4 Sender-Side Protocol

An NI initiates a multicast in response to multicast events, which are handled by event_multicast(). This routine marks the packet to be multicast as a MULTICAST packet, stores the multicast tree identifier in the packet header, and tries to transmit the packet to all first-level children in the multicast tree by calling forward_packet(). Routine forward_packet() looks up the table entry for the multicast tree that the packet was sent on and creates a send descriptor for each forwarding destination listed in the table entry. From then on, exactly the same procedure is followed as for a unicast packet. The packet will be transmitted to a forwarding destination only if credits are available for that destination; otherwise, the send descriptor is moved to the destination's blocked-sends queue.

void event_multicast(unsigned tree, Packet *pkt)
{
    pkt->hdr.tag = MULTICAST;
    pkt->hdr.tree = tree;
    forward_packet(pkt);
}

void forward_packet(Packet *pkt)
{
    Tree *tree = &treetab[pkt->hdr.tree];
    SendDesc *desc;
    unsigned i;

    for (i = 0; i < tree->nr_children; i++) {
        desc = alloc_send_desc();
        desc->dest = tree->children[i];
        desc->packet = pkt;
        credit_send(desc);
    }
}

Fig. 4.6. Sender-side protocol for MCAST.

4.2.5 Receiver-Side Protocol

Figure 4.7 shows the receiving side of MCAST. When a multicast packet is re- ceived, it is delivered to the host, just like a unicast packet. In addition, the packet is forwarded to all of the receiving NI’s children in the multicast tree. This is done using the same forwarding routine described above (forward packet()). An NI receive buffer that holds a multicast packet is released when it has been delivered to the host and when it has been forwarded to all children in the multicast tree. At that point, event packet released() is called. As before, this routine returns a send credit to the (immediate) sender of the packet. In the case of a multicast packet, the immediate sender is the receiving NI’s parent in the multicast tree. We do not use the packet header’s source field (src) to determine the packet’s sender, because this field is overwritten during packet forwarding (by send packet()). Instead, we use the tree field to find this NI’s parent; this field is never modified during packet forwarding. To make this modified packet release routine work for unicast packets, we also define unicast trees. These trees consist of two nodes, a sender and a receiver; the sender is the parent of the receiver. One unicast tree is defined for each sender- receiver pair. UNICAST is modified so that it writes the unicast tree identifier in each outgoing unicast packet. Unicast trees are not defined only to make packet release work smoothly, but are also important in the RECOV protocol.
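The text does not show how the two-node unicast trees are created. The sketch below illustrates one possible initialization, assuming some numbering scheme assigns a distinct tree identifier to each sender-receiver pair; the routine name and the way identifiers are chosen are assumptions, not LFC's actual code.

/* Illustrative setup of a unicast tree for one (sender, receiver) pair.
 * The sender is the parent of the receiver; on the receiver's NI the tree
 * has no children, so event_packet_released() simply returns the credit
 * to the sender via treetab[tree_id].parent. */
void init_unicast_tree(unsigned tree_id, unsigned sender, unsigned receiver)
{
    Tree *t = &treetab[tree_id];

    t->parent = sender;              /* receiver returns its credit here */
    if (LOCAL_ID == sender) {        /* only the sender forwards */
        t->nr_children = 1;
        t->children[0] = receiver;
    } else {
        t->nr_children = 0;
    }
}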

void event_packet_received(Packet *pkt)
{
    accept_new_credits(pkt);
    switch (pkt->hdr.tag) {
    case UNICAST:
        ...;                         /* as before */
        break;
    case CREDIT:
        ...;                         /* as before */
        break;
    case MULTICAST:
        deliver_to_host(pkt);
        nr_pkts_received++;
        forward_packet(pkt);
        break;
    }
}

void event_packet_released(Packet *pkt)
{
    return_credit_to(treetab[pkt->hdr.tree].parent);
}

Fig. 4.7. Receive procedure for the multicast protocol.

4.2.6 Deadlock

MCAST’s simplicity results from building on reliable communication channels between NIs. Unfortunately, MCAST is susceptible to deadlock if an arbitrary topology is used. Figure 4.8 illustrates a deadlock scenario. The figure shows four NIs, each with its own send and receive pool. Recall that the NI receive buffer pool is partitioned. In this scenario, each processor is the root of a degenerate multicast tree, a linear chain. Initially, each NI owns one send credit for each destination NI. All processors (not shown in the figure) have injected a multicast packet and each multicast packet has been forwarded to the next NI in the sender’s multicast chain. That is, each NI has spent its send credit for the next NI in the multicast chain. Now every packet needs to be forwarded again to reach the next NI in its chain, but no packet can make progress, because the target receive buffer on the next NI is occupied by another blocked packet. Each NI has free receive buffers, but UCAST’s flow control scheme does not allow one sending NI to use another sending NI’s receive buffers. This deadlock is the result of the specific multicast trees used to forward pack- ets (linear chains). By default, LFC uses binary trees to forward multicast packets. Figure 4.9 shows the binary multicast tree for processor 0 in a 16-node system.

The multicast tree for processor p is obtained by renumbering each tree node n in the multicast tree of processor 0 to (n + p) mod P, where P is the number of processors. With these binary trees, it is impossible to construct a deadlock cycle (see Appendix A).
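For concreteness, the sketch below computes the forwarding children of an NI in the tree rooted at processor p by applying this renumbering to processor 0's tree. It assumes the heap-style numbering suggested by Figure 4.9 (node n has children 2n+1 and 2n+2); the routine itself is only an illustration, not LFC's actual tree-initialization code.

/* Illustrative: children of NI my_id in the binary multicast tree whose
 * root is NI 'root', for P processors. Node n of processor 0's tree is
 * renumbered to (n + root) mod P. Returns the number of children. */
unsigned tree_children(unsigned my_id, unsigned root, unsigned P,
                       unsigned children[2])
{
    unsigned n = (my_id + P - root) % P;   /* my position in root's tree */
    unsigned nr = 0, c;

    for (c = 2 * n + 1; c <= 2 * n + 2 && c < P; c++)
        children[nr++] = (c + root) % P;   /* renumber back to NI ids */
    return nr;
}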


Fig. 4.8. MCAST deadlock scenario.

Fig. 4.9. A binary tree.

To avoid buffer deadlocks, MCAST imposes two restrictions:

1. Multicast groups may not overlap.

2. Not all multicast tree topologies are valid.

The first restriction implies that we cannot use broadcast and multicast at the same time, because the broadcast group overlaps with every multicast group. The rea- sons for these restrictions and a precise characterization of valid multicast trees are given in Appendix A.

4.3 RECOV: Deadlock Detection and Recovery

MCAST's restrictions on multicast tree topology are potentially cumbersome, because some topologies that are not deadlock-free are more efficient than the deadlock-free alternatives [80]. For large messages, for example, linear chains give the best throughput, but as illustrated in the previous section, chains are not deadlock-free. Several sophisticated multicast protocols build trees that contain short chains. These trees are not deadlock-free either, yet we would like to use them if deadlock is unlikely. To solve this problem, we have extended MCAST with a deadlock recovery mechanism. The extended protocol, RECOV, switches to a slower, but deadlock-free protocol when a deadlock is suspected. In this section, we first give a high-level overview of RECOV's deadlock detection and recovery protocol. Next, we give a detailed description of the protocol.

4.3.1 Deadlock Detection

RECOV uses the feedback from UCAST's flow control scheme to detect potential deadlocks. In the case of a buffer deadlock, each NI in the deadlock cycle has consumed all send credits for the next NI in the cycle. In RECOV, an NI therefore signals a potential deadlock when it has run out of send credits for a destination that it needs to send a packet to. That is, nonempty blocked-sends queues signal deadlock. With this trigger mechanism an NI can signal a potential deadlock unnecessarily, because an NI may run out of send credits even if there is no deadlock. The advantage of this detection mechanism, however, is that NIs need only local information (the state of their blocked-sends queues) to detect potential deadlocks. No communication is needed to detect a deadlock, which keeps the detection algorithm simple and efficient. Since a deadlock need not involve all NIs, each NI must monitor all of its blocked-sends queues; progress on some, but not all, queues does not guarantee freedom from deadlock. Checking for a nonempty blocked-sends queue is efficient, but may signal deadlock sooner than a timeout-based mechanism. When an NI has signaled a potential deadlock, it initiates a recovery action. No new recoveries may be started by that NI while a previous recovery is still in progress. When a recovery terminates, the NI that initiated it will search for other nonempty blocked-sends queues and start a new recovery if it finds one. To avoid livelock, the blocked-sends queues are served in a round-robin fashion. Multiple NIs can detect and recover from a deadlock simultaneously. Simultaneous recoveries do not interfere, but can be inefficient. In a deadlock cycle, only one NI needs to use the slower escape protocol to guarantee forward progress. With simultaneous recoveries, all NIs that detect deadlock will start using the slower escape protocol [96].
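A minimal sketch of this local trigger is shown below, assuming the pcbtab and blocked-sends queues introduced earlier. The scan routine and its round-robin bookkeeping are assumptions for illustration, not LFC's actual code.

/* Illustrative deadlock trigger: a potential deadlock is signalled as soon
 * as some blocked-sends queue is nonempty. Queues are scanned round-robin
 * (starting where the previous scan stopped) to avoid livelock. */
int find_blocked_destination(unsigned nr_nis, unsigned *dest)
{
    static unsigned next = 0;          /* round-robin scan position */
    unsigned i;

    for (i = 0; i < nr_nis; i++) {
        unsigned d = (next + i) % nr_nis;
        if (!queue_empty(&pcbtab[d].blocked_sends)) {
            next = (d + 1) % nr_nis;
            *dest = d;
            return TRUE;               /* potential deadlock detected */
        }
    }
    return FALSE;
}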

4.3.2 Deadlock Recovery

To ensure forward progress when a deadlock is suspected, RECOV requires P − 1 extra buffers on each NI, where P is the number of NIs used by the application. Each NI N owns one buffer on every other NI. These buffers form a dedicated recovery channel for N. Communication on this recovery channel can result only from a recovery initiated by N. When an NI suspects deadlock, it will use its recovery channel to make progress on one multicast or unicast tree. (Recall that we have defined both unicast and multicast trees, so that every data packet travels along some tree.) Other NIs will use this channel only when they receive a packet that was transmitted on the channel. Specifically, when an NI receives, on recovery channel C, a packet p transmitted along tree T, then that NI may forward one packet to each of its children in T. The forwarded packets must also belong to T. This guarantees that we cannot construct a cycle within a recovery channel, because all communication links used in the recovery channel form a subtree of T. Once an NI has initiated a recovery on its recovery channel, it may not initiate another recovery until the recovery channel is completely clear. The recovery channel is cleared bottom-up. When a leaf in the recovery tree releases its escape buffer (because it has delivered the packet to its host), it sends a special acknowledgement to its parent in the recovery tree. When the parent has released its escape buffer and when it has received acknowledgements from all its children in the recovery tree, then it will propagate an acknowledgement to its parent. When the root of the recovery tree receives an acknowledgement, the recovery channel is clear. Each NI can initiate at most one recovery at a time, but different NIs can start a recovery simultaneously, even in a single multicast tree. Such simultaneous recoveries do not interfere with each other, because they use different recovery channels: each recovery uses the recovery channel owned by the NI that initiates the recovery.

Fig. 4.10. Subtree covered by the deadlock recovery algorithm.

The protocol's operation is illustrated in Figure 4.10. In this figure, vertices represent NIs and the edges show how these NIs are organized in a particular spanning tree. Dashed edges between a parent and a child indicate that the parent has no send credits left for the child. NI R triggers a recovery because it has no send credits for its child D. R selects a blocked packet p at the head of one of its blocked-sends queues and determines the tree T that this packet is being forwarded on. R now becomes the root of a recovery tree and will use its recovery channel to forward p. R marks p and sends p to D. D then becomes part of the recovery tree. If D has any children for which it has no send credits, then D may further extend the recovery tree by using R's recovery channel to forward p. If p can be forwarded to an NI using the normal credit mechanism (e.g., to F), then D will do so, and the recovery tree will not include that NI. (If F cannot forward p to its child, then F will initiate a recovery on its own recovery channel.)

4.3.3 Protocol Data Structures We now present the RECOV protocol as an extension of MCAST. Figure 4.11 shows the new and modified data structures and variables used by RECOV. A new type of packet, CHANNEL CLEAR, is used to clear a recovery channel. CHAN- NEL CLEAR packets travel from the leaves to the root of a recovery tree. The packet header contains two new fields. (In reality LFC combines these two fields in a single deadlock channel field (see Section 3.4)). Field deadlocked is set to TRUE when a packet needs to use a recovery channel. The channel field identifies the recovery channel. This field contains a valid value for all UNICAST

typedef enum {UNICAST, MULTICAST, CREDIT, CHANNEL_CLEAR} PacketType;

typedef struct {
    ...                              /* as before */
    int      deadlocked;             /* TRUE iff transmitted over recovery channel */
    unsigned channel;                /* identifies owner of the recovery channel */
} PacketHeader;

typedef struct {
    ...                              /* as before */
    int      deadlocked;             /* saves deadlocked field of incoming packets */
    unsigned channel;                /* saves channel field of incoming packets */
} SendDesc;

typedef struct {
    ...                              /* as before */
    unsigned nr_acks;                /* need to receive this many recovery acks */
} ProtocolCtrlBlock;

typedef struct {
    ...                              /* as before */
    UnsignedQueue active_channels;   /* active recovery channels in this tree */
} Tree;

int recovery_active;                 /* TRUE iff the local node initiated a recovery */

Fig. 4.11. Deadlock recovery data structures.

void forward_packet(Packet *pkt)
{
    Tree *tree = &treetab[pkt->hdr.tree];
    int deadlocked = pkt->hdr.deadlocked;    /* save! */
    unsigned channel = pkt->hdr.channel;     /* save! */
    SendDesc *desc;
    unsigned i;

    for (i = 0; i < tree->nr_children; i++) {
        desc = alloc_send_desc();
        desc->dest = tree->children[i];
        desc->packet = pkt;
        desc->deadlocked = deadlocked;
        desc->channel = channel;
        credit_send(desc);
    }
}

Fig. 4.12. Multicast forwarding in the deadlock recovery protocol.

and MULTICAST packets that travel across a recovery channel (i.e., packets that have deadlocked set to TRUE) and for all CHANNEL_CLEAR packets. The same fields are added to send descriptors. The protocol control block is extended with a counter (nr_acks) that counts how many CHANNEL_CLEAR packets remain to be received before a channel-clear acknowledgement can be propagated up the recovery tree. The Tree data structure is extended with a queue of active recovery channels (active_channels). Each time an NI participates in a recovery on some recovery channel C for some tree T, it adds C to the queue of tree T.

4.3.4 Sender-Side Protocol

The sender side of RECOV consists of several modifications to the MCAST and UCAST routines. The modified multicast forwarding routine is shown in Figure 4.12. For each child, RECOV must know whether the packet to be forwarded arrived on a deadlock recovery channel. At the time the packet is forwarded to a specific child, we can no longer trust all information in the packet header, because this information may have been overwritten when the packet was forwarded to another child. Therefore, we have modified forward_packet() to save the deadlocked and channel fields of the incoming multicast packet in the send descriptors for the forwarding destinations. The saved deadlocked field is used by credit_send() (see Figure 4.13). As before, this routine first tries to consume a send credit. If all send credits have been consumed, however, credit_send() no longer immediately enqueues the packet (or rather, the descriptor that references the packet) on a blocked-sends queue. Instead, we first check if the packet arrived on a recovery channel (using the saved deadlocked field). If this is the case, we forward a packet that belongs to the same tree on the same recovery channel. We cannot always forward the packet referenced by desc, because other packets may be waiting to travel to the same destination. If we bypass these packets and one of them arrived on the same tree as the bypassing packet, then we violate FIFOness. To prevent this, next_in_tree() (not shown) first appends desc to the blocked-sends queue for the packet's destination. Next, it removes from this queue the first packet that is travelling along the same tree as the packet referenced by desc and it returns the send descriptor for this packet. This packet is transmitted on the recovery channel. If the packet that credit_send() tries to transmit did not arrive on a deadlock recovery channel, then credit_send() will start a new recovery on this NI's recovery channel. However, it can do so only if it is not already working on an earlier recovery. If this NI has already initiated a recovery and this recovery is still active, then the packet is simply enqueued on its destination's blocked-sends queue. Otherwise the NI starts a new recovery on its own recovery channel. For completeness, we also show the modified send_packet() routine. This routine now also copies the deadlocked field and the channel field into the packet header.
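Since next_in_tree() is not shown in the figures, the following sketch illustrates one possible implementation of the behavior described above. The queue-scanning helpers (queue_head(), queue_next(), queue_remove()) are hypothetical and assume that the queue head is the oldest element; this is a sketch, not LFC's actual routine.

/* Illustrative next_in_tree(): append desc to the blocked-sends queue of
 * its destination, then remove and return the oldest queued descriptor
 * whose packet travels along the same tree, preserving FIFO order per tree. */
SendDesc *next_in_tree(ProtocolCtrlBlock *pcb, SendDesc *desc)
{
    unsigned tree = desc->packet->hdr.tree;
    SendDesc *d;

    enqueue(&pcb->blocked_sends, desc);
    for (d = queue_head(&pcb->blocked_sends); d != NULL; d = queue_next(d)) {
        if (d->packet->hdr.tree == tree) {
            queue_remove(&pcb->blocked_sends, d);
            return d;
        }
    }
    return desc;    /* not reached: desc itself is in the queue */
}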

4.3.5 Receiver-Side Protocol The receiver-side code of RECOV is shown in Figure 4.14 and Figure 4.15. Only a few modifications to event packet received() are needed. First, special action must be taken when a packet arrives on a recovery channel. In that case, the receiving NI creates some state for the recovery that this packet is part of. In the protocol control block of the recovery channel’s owner, the NI sets the channel-clear counter to 1. This counter is used when the recovery channel is cleared and indicates how many parties need to agree that the channel is clear before a channel-clear acknowledgement can be sent up the recovery tree. As long as the receiving NI does not extend the recovery tree, only it needs to say that the channel is clear, hence the counter is set to 1. If, however, the recovery tree is extended by forwarding the packet on its recovery channel, then credit send() will increase the counter to indicate that an additional channel-clear acknowledgement is needed from the child that the packet was forwarded to. Routine event packet received() also stores the identity of the recovery chan- nel’s owner (the NI that initiated the recovery) on a queue associated with the tree that the incoming packet travelled on. This information is used when packets are released.

void credit_send(SendDesc *desc)
{
    ProtocolCtrlBlock *pcb = &pcbtab[desc->dest];

    if (pcb->send_credits > 0) {
        desc->deadlocked = FALSE;
        pcb->send_credits--;                 /* consume a credit */
        send_packet(desc);
        return;
    }
    if (desc->deadlocked) {
        /* desc->packet arrived on a deadlock channel. We forward desc->packet,
         * or a predecessor in the same tree, on the same channel.
         */
        pcbtab[desc->channel].nr_acks++;
        desc = next_in_tree(pcb, desc);      /* preserve FIFOness in tree */
        send_packet(desc);                   /* no credit consumed by this send */
    } else if (recovery_active) {
        /* Cannot do more than one recovery at a time. */
        enqueue(&pcb->blocked_sends, desc);
    } else {
        /* Initiate recovery on my deadlock channel. */
        recovery_active = TRUE;
        pcbtab[LOCAL_ID].nr_acks = 1;
        desc->deadlocked = TRUE;
        desc->channel = LOCAL_ID;            /* my recovery channel */
        send_packet(desc);                   /* no credit consumed by this send */
    }
}

void send_packet(SendDesc *desc)
{
    ProtocolCtrlBlock *pcb = &pcbtab[desc->dest];
    Packet *pkt = desc->packet;

    pkt->hdr.src = LOCAL_ID;
    pkt->hdr.credits = pcb->free_credits;    /* piggyback credits */
    pkt->hdr.deadlocked = desc->deadlocked;
    pkt->hdr.channel = desc->channel;
    pcb->free_credits = 0;                   /* we returned all credits to dest */
    transmit(desc->dest, pkt);
}

Fig. 4.13. Packet transmission in the deadlock recovery protocol.

void event_packet_received(Packet *pkt)
{
    accept_new_credits(pkt);
    if (pkt->hdr.deadlocked) {
        ProtocolCtrlBlock *channel_owner = &pcbtab[pkt->hdr.channel];
        Tree *tree = &treetab[pkt->hdr.tree];

        /* Register my state for this recovery in the pcb of the node that
         * initiated the recovery. Add the owner's id to a queue of ids
         * that is maintained for the tree that the packet belongs to.
         */
        channel_owner->nr_acks = 1;          /* one 'ack' for local delivery */
        enqueue(&tree->active_channels, pkt->hdr.channel);
    }
    switch (pkt->hdr.tag) {
    case UNICAST:
        ...;                                 /* as before */
        break;
    case MULTICAST:
        ...;                                 /* as before */
        break;
    case CREDIT:
        ...;                                 /* as before */
        break;
    case CHANNEL_CLEAR:
        release_recovery_channel(pkt->hdr.channel);
        break;
    }
}

Fig. 4.14. Receive procedure for the deadlock recovery protocol.

void release_recovery_channel(unsigned channel)
{
    ProtocolCtrlBlock *channel_owner = &pcbtab[channel];

    channel_owner->nr_acks--;
    if (channel_owner->nr_acks > 0)
        return;                              /* channel not clear yet */

    /* Below and at this tree node, the recovery channel is clear. */
    if (channel == LOCAL_ID) {
        /* Recovery completed; the entire recovery channel is clear. */
        recovery_active = FALSE;
        try_new_recovery();                  /* search for new nonempty blocked-sends queue */
    } else {
        /* Propagate channel-clear ack to parent. */
        Packet deadlock_ack;
        SendDesc *desc = alloc_send_desc();

        deadlock_ack.hdr.tag = CHANNEL_CLEAR;
        desc->dest = channel_owner->parent;
        desc->packet = &deadlock_ack;
        desc->deadlocked = FALSE;
        desc->channel = channel;
        send_packet(desc);
    }
}

void event_packet_released(Packet *pkt)
{
    Tree *tree = &treetab[pkt->hdr.tree];

    if (queue_empty(&tree->active_channels)) {
        return_credit_to(tree->parent);
    } else {
        /* Do not return a credit, but try to clear a deadlock channel. */
        unsigned channel = dequeue(&tree->active_channels);
        release_recovery_channel(channel);
    }
}

Fig. 4.15. Packet release procedure for the deadlock recovery protocol.

Second, we must process incoming channel-clear acknowledgements. This is done by release recovery channel() (see Figure 4.15). This routine propagates channel-clear acknowledgements up the recovery tree and detects recovery com- pletion. If the channel-clear counter does not drop to zero, then this routine does nothing. If the counter drops to zero, there are two possibilities. If the channel- clear acknowledgement has reached the root of the recovery tree, then the recovery is complete. The owner of the recovery channel knows that the entire recovery channel is free and will try to initiate a new recovery on its recovery channel. This is done by try new recovery() (not shown), which scans all blocked-sends queues for a blocked packet. If such a queue is found, a new recovery is started. If the channel-clear acknowledgements have not yet reached the recovery chan- nel’s owner (the root of the recovery tree), then we create a new channel-clear acknowledgement and send it up the recovery tree. We have now described how a recovery is initiated, how a recovery tree is built, and how channel-clear acknowledgements travel back up the recovery tree, but we have not explained how the clearing of a recovery channel is initiated. This is done by the leaves of the recovery tree when they release a packet. The modified event packet released() first checks if this NI is participating in any recovery in the tree that the packet arrived on. If it is, then it will have stored the identities of the recovery channel owners in the active channels queue associated with the tree. If this queue is not empty, event packet released() dequeues one channel owner and returns the packet to that owner’s recovery channel. If, on the other hand, this NI is not participating in a recovery on the tree that the packet arrived on, then it will simply return a send credit to the packet’s last sender (as before).
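Because try_new_recovery() is not shown in the figures, a minimal sketch is given below. It reuses the hypothetical find_blocked_destination() helper from the detection sketch above and resubmits the head of a nonempty blocked-sends queue through credit_send(), which will either consume a fresh credit or start a new recovery on this NI's own channel. The global nr_nis is assumed to hold the number of NIs; none of this is LFC's actual code.

extern unsigned nr_nis;                      /* assumed: number of NIs */

/* Illustrative try_new_recovery(): called when this NI's recovery channel
 * has been cleared; scan for another blocked destination and retry it. */
void try_new_recovery(void)
{
    unsigned dest;

    if (find_blocked_destination(nr_nis, &dest)) {
        SendDesc *desc = dequeue(&pcbtab[dest].blocked_sends);
        credit_send(desc);
    }
}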

4.4 INTR: Control Transfer

Since interrupts are expensive, LFC tries to avoid generating interrupts unnecessarily by means of a polling watchdog [101]. This is a mechanism that delays interrupts in the hope that the target process will soon poll the network. The idea is to start a watchdog timer when a packet arrives at the NI. The NI generates an interrupt only if the host does not poll before the timer expires. Whereas the original polling watchdog proposal by Maquelin et al. is a hardware design, LFC uses the programmable NI processor to implement a software polling watchdog. In addition, LFC refines the original design in the following way. When the watchdog timer expires, LFC does not immediately generate an interrupt if the host has not yet processed all packets delivered to it. Instead, the NI performs two additional checks to determine whether it should generate an interrupt: an interrupt status check and a client progress check. The purpose of the interrupt status check is to avoid generating interrupts when the receiving process runs with network interrupts disabled. Recall that clients may dynamically disable and enable network interrupts. For efficiency, the interrupt status manipulation functions are implemented by toggling an interrupt status flag in host memory, without any communication or synchronization with the control program (see Section 3.7). Before sending an interrupt, the control program fetches (using a DMA transfer) the interrupt status flag from host memory. No interrupt is generated when the flag indicates that the client disabled interrupts. The goal of the client progress check is to avoid generating interrupts when the receiving process recently processed some (but not necessarily all) of the packets delivered to it. The host maintains a count of the number of packets that it has processed. When the watchdog timer expires, the control program fetches this counter. No interrupt is generated if the host processed at least one packet since the previous watchdog timeout. INTR starts the watchdog timer each time a packet arrives and the timer is not already running on behalf of an earlier packet. Figure 4.16 shows how INTR handles polling watchdog timeouts. The NI fetches the host's interrupt flag and packet counter using a single DMA transfer; both are stored into host. The NI then performs the first part of the client progress check: if the client has processed all (nr_pkts_received) packets delivered by the NI, then the NI cancels the polling watchdog timer. In this case, the host processed all packets that were delivered to it and there is no need to keep the timer running. If this check fails, the NI performs both the interrupt status check and the second part of the client progress check. If the host has disabled interrupts or if it recently consumed some of the packets delivered to it, then the NI does not generate an interrupt, but restarts the polling watchdog timer. If all checks fail, the NI generates an interrupt and restarts the timer. Finally, the NI saves the host's packet counter in last_processed, so that on the next timeout it can check if the client made progress.
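The host-side counterpart of these checks is not shown in the figures. The sketch below illustrates how the host might maintain the status block that the NI fetches by DMA; the structure matches HostStatus from Figure 4.16, but the routine names and the volatile placement are assumptions, not LFC's actual API.

/* Illustrative host-side bookkeeping for the polling watchdog. The NI
 * fetches host_status with a single DMA transfer on each timeout. */
static volatile HostStatus host_status;

void intr_disable(void) { host_status.intr_disabled = TRUE;  }
void intr_enable(void)  { host_status.intr_disabled = FALSE; }

/* Called whenever the host consumes a packet from its receive queue. */
void note_packet_processed(void)
{
    host_status.nr_pkts_processed++;
}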

4.5 Related Work

LFC’s NI-level protocols implement reliability, interrupt management, and multi- cast forwarding. We discuss related work in all three areas.

4.5.1 Reliability LFC assumes that the network hardware is reliable and preserves this reliabil- ity by means of its link-level flow control protocol (UCAST). Several other sys- tems (BIP [118], Hamlyn [32], Illinois Fast Messages [112, 113], FM/MC [146], VMMC [24], and PM [141]) make the same hardware reliability assumption (see Figure 2.3). These systems, however, implement flow control in a different way. 84 Core Protocols for Intelligent Network Interfaces

typedef struct {
    unsigned nr_pkts_processed;
    int intr_disabled;
} HostStatus;

unsigned last_processed;

void event_watchdog_timeout(void)
{
    HostStatus host;

    fetch_from_host(&host);
    if (host.nr_pkts_processed == nr_pkts_received) {
        timer_cancel();      /* client processed all packets */
    } else if (host.intr_disabled || host.nr_pkts_processed > last_processed) {
        timer_restart();     /* client disabled interrupts or polled recently */
    } else {
        send_interrupt_to_host();
        timer_restart();
    }
    last_processed = host.nr_pkts_processed;
}

Fig. 4.16. Timeout handling in INTR.

With the exception of PM (see Section 2.6) none of these systems implements link-level flow control. Instead, they assume that the NI can copy data to the host sufficiently fast to prevent buffer overflows (see Section 1.6). With the exception of FM/MC, none of the above systems supports a complete NI-level multicast. With NI-level forwarding, multicast packets occupy NI receive buffers for a longer time. As a result, the pressure on these buffers increases, and special measures are needed to prevent blocked-packet resets. FM/MC treats high NI receive buffer pressure as a rare event and swaps buffers to host memory when the problem occurs. For applications that multicast very heavily, however, swap- ping occurs frequently and degrades performance [15]. LFC is more conservative and uses NI-level flow control to solve the problem.

4.5.2 Interrupt Management

LFC supports both polling and interrupts as control-transfer mechanisms. Some communication systems provide only a polling primitive. The main problem with this approach is that many parallel-programming systems, notably DSM systems, cannot rely on polling by the application. Interrupts provide asynchronous notification, but are expensive on commercial architectures and operating systems: the time to field an interrupt often exceeds the latency of a small message. Research systems such as the Alewife [2] and the J-machine [132] support fast interrupts, but the mechanisms used in these machines have not found their way into commercial processor architectures. Even without hardware support, operating systems can, in principle, dispatch interrupts efficiently [94, 143]. The key idea is to save minimal processor state, and leave the rest (e.g., floating-point state) up to the application program. In practice, however, interrupt processing on commodity operating systems has remained expensive. LFC does not attack the overhead of individual interrupts, but aims to reduce the interrupt frequency by optimistically delaying interrupts. Maquelin et al. proposed the polling watchdog, a hardware implementation of this idea [101]. We augmented the polling watchdog with the interrupt-status and client-progress checks described in Section 4.4. These two checks further reduce the number of spurious interrupts.

4.5.3 Multicast

Several researchers have proposed to use the NI instead of the host processor to forward multicast packets [56, 68, 80, 146]. Our NI-level protocol is original, however, in that it integrates unicast and multicast flow control and uses a deadlock-recovery scheme to avoid routing restrictions. Verstoep et al. describe a system, FM/MC, that implements an NI-level multicast for Myrinet [146]. FM/MC runs on the same hardware as LFC, but implements buffer management and reliability in a very different way. Section 4.5.4 compares LFC and FM/MC in more detail. Huang and McKinley propose to exploit ATM NIs to implement collective communication operations, including multicast [68]. In their symmetric broadcast protocol, NIs use an ATM multicast channel to forward messages to their children; multicast acknowledgements are also collected via the NIs. The sending host maintains a sliding window; it appears that a single window is used per broadcast group. LFC, in contrast, uses sliding windows between each pair of NIs. This allows LFC to integrate unicast and multicast flow control; Huang and McKinley do not discuss unicast flow control and present simulation results only. Gerla et al. [56] also discuss using NIs for multicast packet forwarding. They propose a deadlock avoidance scheme that divides receive buffers into two classes. The first class is used when messages travel to higher-numbered NIs, the second when going to lower-numbered NIs. This scheme requires buffer resources per multicast tree, which is problematic in a system with many small multicast groups. Kesavan and Panda studied optimal multicast tree shapes for systems with programmable NIs [80]. They describe two packet-forwarding schemes, first-child-first-served and first-packet-first-served (FPFS), and show that the latter performs best.

Fig. 4.17. LFC's and FM/MC's buffering strategies.

The paper does not discuss flow control issues. MCAST integrates FPFS with UCAST's flow control scheme. MCAST usually forwards packets to all children as these packets arrive, just like in FPFS. When an NI has to forward a packet to a destination for which no send credits are available, however, the NI will queue that packet and start working on another packet. All spanning-tree multicast protocols must deal with the possibility of deadlock. Deadlock is often avoided by means of a deadlock-free routing algorithm; the literature on such algorithms is abundant. Most research in this area, however, applies to routing at the hardware level. Also, as pointed out by Anjan and Pinkston [4], most protocols avoid deadlock by imposing routing restrictions [43]. This is the approach taken by MCAST. RECOV was inspired by the DISHA protocol for wormhole-routed networks [4]. DISHA, however, is a deadlock recovery scheme for unicast worms, whereas RECOV deals with buffer deadlocks in a store-and-forward multicast protocol.

4.5.4 FM/MC

FM/MC implements a reliable NI-level multicast for Myrinet, but does so in a very different way than LFC. The most important differences are related to buffer management and the reliability protocol. Figure 4.17 shows how LFC and FM/MC allocate receive buffers. FM/MC uses a single queue of NI buffers for all inbound network packets. LFC, on the other hand, statically partitions its NI buffers among all senders: one sender cannot use another sender's receive buffers. At the host level, the situation has almost been reversed. FM/MC allocates a fixed number of unicast buffers per sender. These unicast buffers are managed by a standard sliding-window protocol that runs on the host. In addition, FM/MC has another, separate class of multicast buffers which are not partitioned statically among senders. LFC does not distinguish between unicast and multicast buffers and uses a single pool of host receive buffers. FM/MC's multicast buffers are allocated to senders by a central credit manager that runs on one of the NIs. A multicast credit represents a buffer on every receiving host. Before a process can multicast a message, it must obtain credits from the credit manager. To avoid the overhead of a credit request-reply pair for every multicast packet, hosts can request multiple credits and cache them. Credits that have been consumed are returned to the credit manager by means of a token that rotates among all NIs. At the NI level, no software flow control is present. Each NI contains a single receive queue for all inbound packets. Under conditions of high load, this queue fills up. FM/MC then moves part of the receive queue to host memory to make space for inbound packets. Eventually, senders will run out of credits and the memory pressure on receiving NIs will drop; packets can then be copied back to NI memory and processed further. The idea is that this type of swapping to host memory will occur only under exceptional conditions, and it is assumed that swapping will not take too much time. Eventually, however, FM/MC relies on hardware back-pressure to stall sending NIs. By default, FM/MC uses the same binary trees as LFC to forward multicast packets. The protocol, however, allows the use of arbitrary multicast trees, because the credit and swapping mechanisms avoid buffer deadlocks at the NI level. When NI buffers fill up, they are copied to host memory. The centralized credit mechanism guarantees that buffer space is available on the host. Since packets are not forwarded from host memory, there is no risk of buffer deadlock at the host level. Both LFC and FM/MC have their strengths and weaknesses. An important advantage of LFC over FM/MC is its simplicity. A single flow control scheme is used for unicast and multicast traffic. There is no need to communicate with a separate credit manager. No request messages are needed to obtain credits: receivers know when senders are low on credits and send credit update messages without receiving any requests. Finally, since LFC implements NI-level flow control, it does not need to swap buffers back and forth between host and NI memory. The price we pay for this simplicity is the restriction on the tree topologies that can be used by MCAST. FM/MC can use arbitrary multicast trees. LFC and FM/MC also differ in the way credits are obtained for multicast packets. LFC is optimistic in that a sender waits only for credits for its children in the multicast tree.
In FM/MC, a sender waits until every receiver has space before sending the next multicast packet(s). On the other hand, host buffers are not as scarce a resource as NI buffers, so waiting in FM/MC may occur less frequently. Both protocols have potential scalability problems. LFC partitions NI receive buffers among all senders. FM/MC, in contrast, allows NI buffers to be shared, which is attractive when the amount of NI memory is small and the number of nodes in the system large. Given a reasonable amount of memory on the NI and a modest number of nodes, however, LFC’s partitioning of NI buffers poses no problems, and obviates the need for FM/MC’s buffer swapping mechanism. 88 Core Protocols for Intelligent Network Interfaces

FM/MC has other scalability problems. First, all credit requests are processed by a single NI, which introduces a potential bottleneck. Second, to recover multi- cast credits, FM/MC uses a rotating token. The credit manager may have to wait for a complete token rotation until it can satisfy a request for credits.

4.6 Summary

This chapter has given a detailed description of LFC's NI-level protocols for reliable point-to-point communication, reliable multicast communication, and interrupt management. The UCAST protocol provides reliable point-to-point communication between NIs. MCAST extends UCAST with multicast support, but is not deadlock-free for all multicast trees. RECOV extends MCAST with deadlock recovery and the resulting protocol is deadlock-free for all multicast trees. Finally, INTR is a refined software polling watchdog protocol. INTR delays the generation of network interrupts as much as possible by monitoring the behavior of the host.

Chapter 5

The Performance of LFC

This chapter evaluates the performance of LFC’s unicast, fetch-and-add, and broad- cast primitives using microbenchmarks. The performance of client systems and applications, which is what matters in the end, will be studied in subsequent chap- ters.

5.1 Unicast Performance

First we discuss the latency and throughput of LFC’s point-to-point message- passing primitive, lfc ucast launch(). Next, we describe LFC’s point-to-point per- formance in terms of the LogGP [3] parameters.

5.1.1 Latency and Throughput Figure 5.1 shows LFC’s one-way latency for different receive methods. All mes- sages fit in a single packet and are received through polling. We used the following receive methods:

1. No-touch. The receiving process is notified of a packet’s arrival (in host memory), but never reads the packet’s contents. This type of receive behav- ior is frequently used in latency benchmarks and gives a good indication of the overheads imposed by LFC.

2. Read-only. The receiving process uses a for loop to read the packet’s con- tents into registers (one 32-bit word per loop iteration), but does not write the packet’s contents to memory. This type of behavior can be expected for small control messages, which can immediately be interpreted by the receiving processor.


Fig. 5.1. LFC’s unicast latency.

3. Copy. The receiving process copies the packet's contents to memory. (This is done using an efficient string-move instruction.) This is the typical behavior for larger messages that need to be copied to a data structure in the receiving process's address space.

The one-way latency of an empty packet is 11.6 µs (for all receive strategies). This does not include the cost of packet allocation at the sending side and packet deallocation at the receiving side, because both can be performed off the critical path. Send packet allocation costs 0.4 µs, receive packet deallocation costs 0.7 µs. As expected, the read-only and copy latencies are higher than the no-touch latencies, because the receiving process incurs cache misses when it reads the packet's contents. Surprisingly, however, the copy variant is faster than the read-only variant. This has two causes. First, the copy variant always copies incoming packets to the same, write-back cached destination buffer, so the writes that result from the copy do not generate any memory traffic (as long as the buffer fits in the cache). Second, the read-only variant accesses the data by means of a for loop, while the copy variant uses a fast string-copy instruction.

Figure 5.2 shows the one-way unicast throughput using the same three receive methods. (Note that the message size axis has a logarithmic scale.) The sawtooth shape of the curves is caused by fragmentation. Messages larger than 1 Kbyte are split into multiple packets. If the last of these packets is not full, the overall efficiency of the message transfer decreases. This effect is strongest when the last fragment is nearly empty. With the no-touch receive strategy, the maximum throughput is 72.0 Mbyte/s. Reading the data reduces the throughput significantly. For message sizes up to 128 Kbyte, copying the data is faster than just reading it, for the reasons described above (write-back caching and a fast copy instruction).


Fig. 5.2. LFC's unicast throughput.

For large messages, however, the throughput of the copy variant drops noticeably (e.g., from 63.2 Mbyte/s for 1 Kbyte messages to 33.2 Mbyte/s for 1 Mbyte messages). For messages up to 256 Kbyte, the receive buffer used by the copy variant ought to fit in the processor's second-level data cache. Since the L2 cache is physically indexed, however, some of the buffer's memory pages may map to the same cache lines as other pages. The exact page mappings vary from run to run, but for larger buffers, conflicts are more likely to occur. Conflicting page mappings result in conflict misses during the copy operation and these misses generate memory traffic. This traffic competes for bus and memory bandwidth with incoming network packets and reduces the overall throughput. Messages larger than 256 Kbyte no longer fit in the L2 cache. Cache misses are guaranteed to occur when such a message is copied. Although LFC achieves good throughput, it is not able to saturate the hardware bottleneck, the I/O bus, which has a bandwidth of 127 Mbyte/s. The main reason is that LFC uses programmed I/O (with write combining) at the sending side. As shown in Figure 2.2, this data transfer mechanism cannot saturate the I/O bus. This problem can be alleviated by switching to DMA, but DMA introduces its own problems (see Section 2.2).

5.1.2 LogGP Parameters

While the throughput and latency graphs are useful, additional insight can be gained by measuring LFC's LogGP parameters. LogGP [3, 102] is an extension of the LogP model [40, 41], which characterizes a communication system in terms of four parameters: latency (L), overhead (o), gap (g), and the number of processors (P). Latency is the time needed by a small message to propagate from one computer to another, measured from the moment the sending processor is no longer involved in the transmission to the moment the receiving processor can start processing the message. Latency includes the time consumed by the sending NI to transmit the message and the time consumed by the receiving NI to deliver the message to the receiving host. This definition of latency is different from the sender-to-receiver latency discussed above. The sender-to-receiver latency includes send and receive overhead (described below), which are not included in L. Overhead is the time a processor spends on transmitting or receiving a message. We will make the usual distinction between send overhead (os) and receive overhead (or). Overhead does not include the time spent on the NI and on the wire. Unlike latency, overhead cannot be hidden. The gap is the reciprocal of the system's small-message bandwidth. Successive packets sent between a given sender-receiver pair are always at least g microseconds apart from each other. The LogP model assumes that messages carry at most a few words. Large messages, however, are common in many parallel programs and many communication systems can transfer a single large message more efficiently than many small messages. LogGP therefore extends LogP with a parameter, G, that captures the bandwidth for large messages. For LFC, we define G as the peak throughput under the no-touch receive strategy. To measure the values of the LogGP parameters for LFC, we used benchmarks similar to those described by Iannello et al. [69]. The resulting values are given in Table 5.1. All values, except the value of G, were measured using messages with a payload of 16 bytes. The sender-to-receiver latencies reported in Figure 5.1 are related to the LogGP

parameters as follows:

    sender-to-receiver latency = os + L + or.

The sender-to-receiver latency of a 16-byte message is thus 1.6 + 8.2 + 2.2 = 12.0 µs, slightly more than the latency of a zero-byte message (11.6 µs). The largest part (L) of the sender-to-receiver latency is spent on the sending and receiving NIs; this part can in principle be overlapped with useful work on the host processor.

LogGP parameter         Value
Send overhead (os)      1.6 µs
Receive overhead (or)   2.2 µs
Latency (L)             8.2 µs
Gap (g)                 5.6 µs
Big gap (G)             72.0 Mbyte/s

Table 5.1. Values of the LogGP parameters for LFC.

5.1.3 Interrupt Overhead

All previous measurements were performed with network interrupts disabled. We now briefly consider interrupt-driven message delivery. As explained in Section 3.7, LFC delays interrupts in the hope that the destination process will poll before the interrupt needs to be raised. The default delay is 70 µs. This value, approximately twice the interrupt overhead (see below), was determined after experimenting with some of the applications described in Chapter 8. To measure the overhead of interrupt-driven message delivery, we set the delay to 0 µs and measure the null unicast latency in two different ways: first, using a program that does not poll for incoming messages and that uses interrupts to receive messages; second, using a program that does poll and that does not use interrupts. The difference in latency measured by these two programs stems from interrupt overhead. Using this method, we measure an interrupt overhead of 31 µs, which is more than twice the unicast null latency. This time includes context-switching from user mode to kernel mode, interrupt processing inside the kernel and the Myrinet device driver, dispatching the user-level signal handler, and returning from the signal handler (using the sigreturn() system call). Notice that the last overhead component, the signal handler return, need not be on the critical communication path: a message can be processed and a reply can be sent before returning from the signal handler. This large interrupt overhead is not uncommon for commercial operating systems. In user-level communication systems with low-latency communication, however, the high cost of using interrupts should clearly be avoided whenever possible. It is exactly this high cost that motivated the addition of a polling watchdog mechanism to LFC.

5.2 Fetch-and-Add Performance

A contention-free fetch-and-add operation on a variable in local NI memory costs 20.7 µs. Since we have not optimized this local case, an FA_REQUEST and an FA_REPLY message will be sent across the network. Each message travels to the switch to which the sending NI is attached; this switch then sends the message back to the same NI. A contention-free F&A operation on a remote F&A variable costs 19.8 µs, which is slightly less expensive than the local case.


Fig. 5.3. Fetch-and-add latencies under contention.

To detect the F&A reply, the host that issued the operation polls a location in its NI's memory. In the local case, this polling steals memory cycles from the local NI processor, which needs to process the F&A request and send the F&A reply (to itself). In the remote case, receiving the request, processing it, and sending the reply are all performed by a remote NI, without interference from a polling host. Figure 5.3 shows how F&A latencies increase under contention. In this benchmark, each processor, except the processor where the F&A variable is stored, executes a tight loop in which an F&A operation is issued. With as few as three processors (see inset), the NI that services the F&A variables becomes a bottleneck and the latencies increase rapidly.

5.3 Multicast Performance

The multicast performance evaluation is organized as follows. First, we present latency, throughput, and forwarding overhead results for LFC’s basic protocol. Next, we evaluate the deadlock recovery protocol under various conditions. In Chapter 8 we will also compare the performance of LFC’s multicast protocol with protocols that use host-level forwarding.

5.3.1 Performance of the Basic Multicast Protocol

We first examine the latency, throughput, and forwarding overhead of LFC's basic multicast protocol. The basic protocol uses binary trees and therefore does not need the deadlock recovery mechanism. In the following measurements, the maximum packet size is 1 Kbyte and each sender is allocated ⌊512/P⌋ send credits, where P is the number of processors.


Fig. 5.4. LFC’s broadcast latency.

All packets are received through polling. We use the copy receive strategy described in Section 5.1. Figure 5.4 shows the broadcast latencies of the basic protocol. We define broadcast latency as the latency between the sender and the last receiver of the broadcast packet. As expected, the latency increases linearly with increasing message size and logarithmically with increasing numbers of processors. With a 4-node configuration, we obtain a null latency of 22 µs; with 64 nodes, the null latency is 72 µs. Figure 5.5 shows the throughput of the basic protocol. Throughput is measured using a straightforward blast test in which the sender transmits many messages of a given size and then waits for an (empty) acknowledgement message from all receivers. We report the throughput observed by the sender. All curves decline for large messages. This is due to the same cache effect as described in Section 5.1. For messages that fit in the cache, the maximum multicast throughput (41 Mbyte/s for 4 processors) is less than the maximum unicast throughput (63 Mbyte/s) and decreases as the number of processors increases. These effects are due to the following throughput-limiting factors:

1. NI memory supports at most two memory accesses per NI clock cycle (33.33 MHz). Memory references are issued by the NI's three DMA engines and by the NI processor. None of these sources can issue more than one memory reference per clock cycle. As a result, the maximum packet receive and send rate is 127 Mbyte/s (33.33 MHz × 32 bits). This maximum can be obtained only with large packets; with LFC's 1 Kbyte packets, the maximum DMA speed is approximately 120 Mbyte/s.


Fig. 5.5. LFC’s broadcast throughput.

2. The fanout of the multicast tree determines the number of forwarding transfers that an NI's send DMA engine must make. Since these forwarding transfers must be performed serially, the attainable throughput is proportional to the reciprocal of the fanout. Binary trees have a maximum fanout of two. The maximum throughput that can be obtained is therefore 120/2 = 60 Mbyte/s.

3. High throughput can only be obtained if the NI processor manages to keep its DMA engines busy.

4. Multicast packets travelling across different logical links in the multicast tree may contend for the same physical links. The amount of contention de- pends on the mapping of processes to processors. Since different mappings can give very different performance results, we used a simulated-annealing algorithm to obtain reasonable mappings for all single-sender throughput benchmarks. The resulting mappings give much better performance than random mappings, but are not necessarily optimal, because simulated an- nealing is not guaranteed to find a true optimum and because we did not take acknowledgement traffic into account.

The first three factors explain why we do not attain the unicast throughput, but not why throughput decreases when the number of processors is increased. Since the amount of forwarding work performed by the bottleneck nodes of the multicast tree —i.e., internal nodes with two children— is independent of the size of the multicast tree, we would expect that throughput is also independent of the size of the tree.

Fig. 5.6. Tree used to determine multicast gap (the root, node 0, sends to node 1, which forwards to nodes 2, 3, ..., P-2, P-1).

Fig. 5.7. LFC's broadcast gap (measured multicast gap in microseconds as a function of fanout, with a least-squares fit).

In reality, adding more processors increases the demand for physical network links. The resulting contention leads to decreased throughput.

To estimate the amount of work performed by the NI to forward a multicast packet, we can measure the multicast gap g(P). This is the reciprocal of the multicast throughput on P processors for small (16-byte) messages. We assume that the multicast gap is proportional to the fanout F(P) of the bottleneck node(s) in the multicast tree. That is, we assume g(P) = α · F(P) + β, for some α and β. By varying F(P) and measuring the resulting multicast gaps g(P), we can find α and β. LFC's binary trees, however, have a fixed fanout of two. To vary F(P), we therefore use multicast trees that have the shape shown in Figure 5.6. In such trees, there is exactly one forwarding node and the fanout of the tree is F(P) = P - 2. We measured the multicast gaps for various numbers of processors and used a least-squares fit to determine α and β from these measurements. The measured gaps and the resulting fit are shown in Figure 5.7; we found α = 4.6 µs per forward and β = 5.2 µs. The forwarding time per destination (α) is fairly large; this time is used to look up the forwarding destination, to allocate a send descriptor, to enqueue this descriptor, and to initiate the send DMA.
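
The least-squares fit itself is ordinary linear regression of the measured gaps against the fanouts. The sketch below illustrates the procedure; the data points are synthetic values generated from the fitted model (g = 4.6 · F + 5.2), not the actual measurements, and the function names are ours.

#include <stdio.h>

/* Fit g(P) = alpha * F(P) + beta by ordinary least squares. */
static void
fit_gap_model(const double *F, const double *g, int n, double *alpha, double *beta)
{
    double sF = 0.0, sg = 0.0, sFF = 0.0, sFg = 0.0;
    int i;

    for (i = 0; i < n; i++) {
        sF  += F[i];
        sg  += g[i];
        sFF += F[i] * F[i];
        sFg += F[i] * g[i];
    }
    *alpha = (n * sFg - sF * sg) / (n * sFF - sF * sF);
    *beta  = (sg - *alpha * sF) / n;
}

int
main(void)
{
    /* Synthetic (fanout, gap in microseconds) pairs, consistent with the
     * fitted model reported in the text; not the measured data. */
    double F[] = { 2.0, 4.0, 6.0, 8.0, 10.0 };
    double g[] = { 14.4, 23.6, 32.8, 42.0, 51.2 };
    double alpha, beta;

    fit_gap_model(F, g, 5, &alpha, &beta);
    printf("alpha = %.1f us per forward, beta = %.1f us\n", alpha, beta);
    return 0;
}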

Fig. 5.8. Impact of deadlock recovery on broadcast throughput (single-sender throughput in Mbyte/s versus message size, for 8, 16, 32, and 64 processors, with and without recovery).

5.3.2 Impact of Deadlock Recovery and Multicast Tree Shape

We assume that most programs will rarely multicast in such a way that deadlock recovery is necessary. It is therefore important to know the overhead that the protocol imposes without recovery. Enabling deadlock recovery does not increase the broadcast latency. In the latency benchmark, the condition that triggers deadlock recovery (an NI has run out of send credits) is never satisfied, so the recovery code is never executed. Also, the test for the condition does not add any overhead, because it takes place both in the basic and in the enhanced protocol. Throughput is affected by enabling deadlock recovery. Figure 5.8 shows single-sender throughput results for 8, 16, 32, and 64 processors, with and without deadlock recovery. For the recovery protocol measurements, we used almost the same configuration as for the basic protocol measurements shown in Figure 5.5. The

only difference is that we enabled deadlock recovery and added P - 1 NI receive buffers on each NI (see Section 4.3). We use the same binary trees as before, so no true deadlocks can occur. Since our deadlock recovery scheme makes a local, conservative decision, however, it still signals deadlock. Our measurements indicate that, with a few exceptions, all recoveries are initiated by the root of the multicast tree. The deadlocked subtrees that result from these recoveries are small. With 64 processors and 64 Kbyte messages, for example, each deadlock recovery results (on average) in 1.1 packets being sent via a deadlock recovery channel. In almost all cases, the deadlocked subtree consists of the root and one of its children. The root performs less work than its children, because it does not have to receive and deliver incoming multicast packets. Consequently, the root can send out multicast packets faster than its children can process them and therefore runs out of credits, which triggers a deadlock recovery. For 8 processors, the performance impact of these recoveries is small, but for larger numbers of processors, the number of recoveries increases and the throughput decreases significantly. With 64 Kbyte messages, for example, the number of recoveries per multicast increases from 0.02 for 8 processors to 0.37 for 64 processors (on average). This increase in the number of recoveries and the decrease in throughput appear to be caused by increased contention on the physical network links (due to the larger number of processors). This contention delays acknowledgements, which causes senders to run out of credits more frequently. The problem is aggravated slightly by a fast path in the multicast code that tries to forward incoming multicast packets immediately. If this succeeds, the send channel is blocked by an outgoing data packet and cannot be used by an acknowledgement. Removing this optimization improves performance when deadlock recovery is enabled, but reduces performance when deadlock recovery is disabled. Figure 5.9 shows the impact of tree shape on broadcast throughput. We compare binary trees and linear chains on 64 processors. We use a large number of processors, because that is when we expect the performance characteristics of different tree shapes to be most visible. In both cases, deadlock recovery was enabled, because with chains, deadlocks can occur when multiple senders multicast concurrently. With a single sender, however, deadlocks cannot occur, and we expect low-fanout trees to perform best. Figure 5.9 confirms this expectation. The linear chain (fanout 1) performs better than the binary tree (fanout 2). To test the behavior of the deadlock recovery scheme in the case that deadlocks can occur, we performed an all-to-all benchmark in which all senders broadcast simultaneously. In this case, true deadlocks may occur for the chain. Figure 5.10 shows the per-sender throughput on 64 processors for the chain and the binary tree. Due to contention for network resources (links, buffers, and NI processor cycles), the per-sender throughput is much lower than in the single-sender case. As expected, the chain performs worse than the binary tree. Table 5.2 reports several statistics that illustrate the behavior of the recovery protocol for both tree types. These statistics are for an all-to-all exchange of 64 Kbyte messages on 64 processors. In this table, Lused is the number of logical links between pairs of

NIs that are used in the all-to-all test; Lav is the total number of logical links available (64 × 63). Lused/Lav indicates how well a particular tree spreads its packets over different links. RC is the number of deadlock recoveries; M is the number of

multicasts. RC/M indicates how often a recovery occurs. D is the number of packets that are forced to travel through a slow deadlock recovery channel. D/(RC + 1) is the average size of deadlocked subtrees.

Fig. 5.9. Impact of tree topology on broadcast throughput (throughput in Mbyte/s versus message size for the chain and the binary tree, on 64 processors).

Fig. 5.10. Impact of deadlock recovery on all-to-all throughput (per-sender throughput in Mbyte/s versus message size for the chain and the binary tree, on 64 processors).


Tree shape    Lused/Lav    RC/M    D/(RC + 1)
Binary        0.51         0.38    3.3
Chain         0.02         1.00    63.9

Table 5.2. Deadlock recovery statistics. Lused — #NI-to-NI links used in this test; Lav — #NI-to-NI links available; RC — #deadlock recoveries; M — #multicasts; D — #packets that use a deadlock recovery channel.

These statistics show that, with chains, every multicast triggers a deadlock recovery action. What is worse, these recoveries involve all 64 NIs, so each multicast packet uses a slow deadlock channel. With chains, this benchmark can use only a small fraction (0.02) of the available NI receive buffer space. In combination with the high communication load, this causes senders to run out of credits and triggers deadlock recoveries. With binary trees, which make better use of the available NI buffer space, deadlock recoveries occur less frequently and the size of the deadlocked subtrees is much smaller. Summarizing, we find that deadlock recovery is triggered frequently, but that its effect on performance is modest when the communication load is moderate and true deadlocks do not occur. To reduce the number of false alarms, a more refined, timeout-based mechanism could be used [96]. This mechanism does not signal deadlock immediately when a blocked-sends queue becomes nonempty, but starts a timer and waits for a timeout to occur. The timer is canceled when a send descriptor is removed from the blocked-sends queue.

5.4 Summary

In this chapter, we have analyzed the performance of LFC using microbenchmarks. LFC's point-to-point performance is comparable to that of existing user-level communication systems and does not suffer much from the NI-level flow control protocol. Specifically, on the same hardware LFC achieves similar point-to-point latencies as Illinois Fast Messages (version 2.0), which uses a host-level variant of the same flow control protocol. Running the protocol between NIs, however, allows for a simple and efficient multicast implementation. Due to our use of programmed I/O at the sending side, LFC cannot attain the maximum available throughput, but we believe that LFC's throughput is still sufficiently high that this is not a problem in practice. At least one study suggests that parallel applications are more sensitive to send and receive overhead than to throughput [102].

LFC's NI-level fetch-and-add implementation is not much faster than a host-level implementation would be, but has the advantage that it need never generate an interrupt. LFC's NI-level multicast also avoids unnecessary interrupts (during packet forwarding) and eliminates an unnecessary data transfer. The binary trees used by the basic multicast protocol give good latency and throughput. Client systems that wish to optimize for either latency or throughput can use other tree shapes if they use LFC's extended multicast protocol, which includes a deadlock recovery mechanism. We showed that the extended protocol allows clients to obtain the advantages of a specific tree shape such as the chain.

Chapter 6

Panda

This chapter describes the design and implementation of the Panda communication system on top of LFC. Panda is a multithreaded communication library that provides flow-controlled point-to-point message passing, remote procedure call, and totally-ordered broadcast. Since 1993, versions of Panda have been used by implementations of the Orca shared object system [9, 19, 124], the MPI message-passing system, a parallel Java system [99], an implementation of Linda tuple spaces on MPI [33], a subset of the PVM message-passing system [124], the SR programming language [124], and a parallel search system [121]. Portability and efficiency have been the main goals of all Panda implementations. Panda has been ported to a variety of communication architectures. The first Panda implementation [19] was constructed on a cluster of workstations, using the UDP datagram protocol. Since then, versions of Panda have been ported to the Amoeba distributed operating system [111], to a transputer-based system [65], to MPI for the IBM SP/2, to active messages for the Thinking Machines CM-5, to Illinois Fast Messages for Myrinet clusters [10], and to LFC, also for Myrinet clusters. This chapter, however, focuses on a single Panda implementation: Panda 4.0 on LFC. Panda 4.0 differs substantially from the original system, which had no separate message-passing layer and had a different message interface [19]. The second goal, efficiency, conflicts with the first, portability. Whereas portability favors a modular, layered structure, efficiency dictates an integrated structure. In this chapter we demonstrate that Panda can be implemented efficiently (on LFC) without sacrificing portability. This efficiency results from Panda's flexible internal structure, carefully designed interfaces (both in Panda and LFC), exploiting LFC's communication mechanisms, and integrating multithreading and communication. In this chapter, we show how Panda uses LFC's packet-based interface, NI-level multicast and fetch-and-add, and polling watchdog to implement the following abstractions:


1. Asynchronous upcalls. Since many PPSs require asynchronous message delivery, Panda delivers all incoming messages by means of asynchronous upcalls. To avoid the use of interrupts for each incoming message, Panda transparently and dynamically switches between using polling and interrupts. To choose the most appropriate mechanism, Panda uses thread-scheduling information.

2. Stream messages. LFC's packet-based communication interface allows a flexible implementation of stream messages [112]. Stream messages allow end-to-end pipelining of data transfers. Our implementation avoids unnecessary copying and decouples message notification and message consumption.

3. Totally-ordered broadcast. Panda provides a simple and efficient implementation of a totally-ordered broadcast primitive. The implementation builds on LFC's multicast and fetch-and-add primitives.

The first section of this chapter gives an overview of Panda 4.0. It describes Panda's functionality and its internal structure. Section 6.2 describes how Panda integrates communication and multithreading. Section 6.3 studies the implementation of Panda's message abstraction. Section 6.4 describes Panda's totally-ordered broadcast protocol. Section 6.5 reports on the performance of Panda on LFC and Section 6.6 discusses related work.

6.1 Overview

This section gives an overview of Panda's functionality, describes the internal structure of Panda, and discusses the key performance issues in Panda's implementation.

6.1.1 Functionality

Panda extends LFC's low-level message-passing functionality in several ways. First, processes that communicate using Panda exchange messages of arbitrary size rather than packets with a maximum size. Second, Panda implements demultiplexing. Each Panda module (described later) allows its clients to construct communication endpoints and to associate a handler function with each endpoint. Senders direct their messages to such endpoints. When a message arrives at an endpoint, Panda invokes the associated message handler and passes the message to this handler. LFC, in contrast, dispatches all network packets that arrive at a processor to a single packet handler.

Fig. 6.1. Different broadcast delivery orders (two processes A and B each broadcast a message; of the four possible delivery orders shown, only the first two are totally ordered).

Third, Panda supports multithreading. Panda clients can create threads and use them to structure a system or to overlap communication and computation. Finally, Panda supports several high-level communication primitives, all of which operate on messages of arbitrary length. With one-way message passing, messages can be sent from one process to an endpoint in another process. Such messages can be received implicitly, by means of upcalls, or explicitly, by means of blocking downcalls. Remote procedure call, a well known communication mechanism in distributed systems [23], allows a process to invoke a procedure (an upcall) in a remote process and wait for the result of the invocation. A totally-ordered broadcast is a broadcast with strong ordering guarantees. It has many applications in systems that need to maintain consistent replicas of shared data [75]. Specifically, a totally-ordered broadcast primitive guarantees that when n processes receive the same set of broadcast messages, then

1. All messages sent by the same process will be delivered to their destinations in the same order they were sent.

2. All messages (sent by any process) will be delivered in the same order to all n processes.

The first requirement (FIFOness) is also satisfied by LFC's broadcast primitive, but the second is not. Figure 6.1 illustrates the difference between a FIFO broadcast and a totally-ordered broadcast. Two processes A and B concurrently broadcast a message. The figure shows all four possible delivery orders. With a FIFO broadcast, all delivery orders are valid. With a totally-ordered broadcast, however, only the first two scenarios are possible. In the third and fourth scenarios, processes A and B receive the two messages in different orders. Both message passing and RPC are subject to flow control. If a receiving process does not consume incoming messages, Panda will stall the sender(s) of those messages. At present, no flow control is implemented for Panda's totally-ordered broadcast. Scalable multicast flow control is difficult due to the large number of receivers that a sender needs to get feedback from. LFC's current set of clients and applications, however, works fine without multicast flow control, for two reasons. First, due to application-level synchronization or due to the use of collective-communication operations, many applications have at most one broadcasting process at any time, which reduces the pressure on receiving processes. Second, a receiver that disables network interrupts and that does not poll will eventually stall all nodes that try to send to it. If there is only a single broadcasting process, this back-pressure is sufficient to stall that process when needed. This is an all-or-nothing mechanism, though: either the receiver accepts incoming packets from all senders or it does not receive any packets at all. Since Panda adds considerable functionality to LFC's simple unicast and multicast primitives, the question arises why this functionality is not part of LFC itself. The answer is that not all client systems need this extra functionality. The CRL DSM system (discussed in Section 7.3), for example, needs little more than efficient point-to-point message passing and interrupt management. Adding extra functionality would introduce unnecessary overhead.

6.1.2 Structure

Panda consists of several modules, which can be configured at compile time to build a complete Panda system. Figure 6.2 illustrates the module structure. Each box represents a module and lists its main functions. The arrows represent dependencies between modules. A detailed description of the module interfaces can be found in the Panda documentation [62].

The System Module

The most important module is the system module. For portability, Panda allows only the system module to access platform-specific functionality such as the native message-passing primitives. All other modules must be implemented using only the functionality provided by the system module. The system module implements threads, endpoints, messages, and Panda's low-level unicast and multicast primitives. The thread and message abstractions are used in all Panda modules and by Panda clients. The unicast and multicast functions, however, are mainly used by Panda's message-passing and broadcast module to implement higher-level communication primitives. Panda clients use these higher-level primitives. One way to achieve both portability and efficiency would be to define only a high-level interface that every Panda system must implement.

Fig. 6.2. Panda modules: functionality and dependencies. (The RPC module provides reliable remote procedure call; the broadcast module provides reliable, totally-ordered broadcast; the message-passing module provides demultiplexing (ports) and reliable messages of arbitrary length; the system module provides threads, protocol header stacks, demultiplexing (endpoints), and unicast and multicast.)

The fixed interface ensures that Panda clients will run on any platform that the interface has been implemented on. Also, if only the top-level interface is defined, the implementation of Panda can exploit platform-specific properties. For example, if the message-passing primitive of the target platform is reliable, then Panda need not buffer messages for retransmission. The main problem with this single-interface approach is that Panda must be implemented from scratch for every target platform. In practice platforms vary in a relatively small number of ways, so implementing Panda from scratch for every platform would lead to code duplication. To allow code reuse, Panda hides platform-specific code in the system module and requires that all other modules be implemented in terms of the system module. These modules can thus be reused on other platforms. The system-module functions have a fixed signature (i.e., name, argument types, and return type), but the exact semantics of these functions may vary in a small number of predefined ways from one Panda implementation to another. The system module exports the particular semantics of its functions by means of compile-time constants that indicate which features or restrictions apply to the target platform. This way, the system module can convey the main properties of the underlying platform to higher-level modules. The properties exported by the system module do not have to match those of the underlying system. For some platforms, including LFC, the system module sometimes exports a stronger interface than provided by the underlying system. Variation in semantics is possible along the following dimensions:

1. Maximum packet size. The system module can export a maximum packet size and leave fragmentation and reassembly to higher-level modules, or it can accept and deliver messages of arbitrary size.

2. Reliability. The system module can export an unreliable or a reliable communication interface.

3. Message ordering. The system module can export FIFO or unordered communication primitives. The system module can also indicate whether or not its broadcast primitive is totally-ordered.

Panda’s system module for LFC implements fragmentation and reassembly, both for unicast and broadcast communication. (In this respect, the system module thus exports a stronger interface than LFC.) The system module preserves the reliability and FIFOness of LFC’s unicast and multicast primitives, and exports a reliable and FIFO interface to higher-level Panda modules. In addition, the system module implements a totally-ordered broadcast using LFC’s fetch-and-add and broadcast primitives and exports this totally-ordered broadcast to higher-level modules. With this configuration, the implementation of the higher-level modules is relatively simple. Fragmentation, reliability, and ordering are all implemented in lower layers.
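
Section 6.4 describes Panda's actual protocol; as a rough, self-contained sketch of the underlying idea, the fragment below shows how a single fetch-and-add counter can impose a total order on broadcasts. The in-memory counter and the helper names are stand-ins for LFC's NI-level fetch-and-add and broadcast primitives, not LFC's real API.

#include <stdio.h>
#include <stddef.h>

#define MAX_MSGS 16

static unsigned    global_counter;          /* in reality, a counter on one NI */
static const char *pending[MAX_MSGS];       /* per-receiver reorder buffer */
static unsigned    next_to_deliver;         /* per-receiver delivery counter */

/* Stand-in for an NI-level fetch-and-add primitive. */
static unsigned
fetch_and_add(unsigned *ctr, unsigned inc)
{
    unsigned old = *ctr;
    *ctr += inc;
    return old;
}

/* Sender side: tag every broadcast with a globally unique sequence number. */
static unsigned
send_broadcast(const char *msg)
{
    (void)msg;   /* in a real system the data itself is broadcast separately */
    return fetch_and_add(&global_counter, 1);
}

/* Receiver side: deliver strictly in sequence-number order, so that all
 * processes see the same total order even if packets arrive out of order. */
static void
receive_broadcast(unsigned seqno, const char *msg)
{
    pending[seqno] = msg;
    while (next_to_deliver < MAX_MSGS && pending[next_to_deliver] != NULL) {
        printf("deliver #%u: %s\n", next_to_deliver, pending[next_to_deliver]);
        next_to_deliver++;
    }
}

int
main(void)
{
    unsigned a = send_broadcast("broadcast from A");
    unsigned b = send_broadcast("broadcast from B");

    /* Packets may arrive in any order; delivery order is fixed by seqno. */
    receive_broadcast(b, "broadcast from B");
    receive_broadcast(a, "broadcast from A");
    return 0;
}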

High-Level Modules

We discuss three higher-level modules: the message-passing module, the RPC module, and the broadcast module. These modules can exploit the information conveyed by the system module's (compile-time) parameters. This is the only information about the underlying system available to the higher-level modules. These modules can therefore be reused in Panda implementations for another platform, provided that the system module for that platform implements the same (or stronger) semantics. It is not necessary to build implementations of higher-level modules for every possible combination of system-module semantics. First, many combinations are unlikely. For example, no system module provides reliable broadcast and unreliable unicast. Second, we only need an implementation for the system module that exports the weakest semantics (i.e., unreliable and unordered communication). Such an implementation is suboptimal for system modules that offer stronger primitives, but it will function correctly and can serve as a starting point for optimization. The message-passing module implements reliable point-to-point communication and an endpoint (demultiplexing) abstraction. Since LFC is reliable and Panda's system module performs fragmentation and reassembly, the message-passing module need only implement demultiplexing, which amounts to adding a small header to every message. The RPC module implements remote procedure calls on top of the message-passing module. An RPC is implemented as follows. The client creates a request message and transmits it to the server process using the message-passing module's send primitive. The RPC module then blocks the client (using a condition variable). When the server receives the request, it invokes the request handler. This handler creates a reply message and transmits it to the client process. At the client side, the reply is dispatched to a reply handler (internal to the RPC module) which awakens the blocked client thread and which passes the reply message to the awakened client. Since the implementation is built upon the message-passing module's reliable primitives, it is small and simple. The broadcast module implements a totally-ordered broadcast primitive. All broadcast messages are delivered in the same total order to all processes, including the sender. Since the system module already implements a totally-ordered broadcast —so that it can exploit LFC's multicast and fetch-and-add— the implementation of the broadcast module is also very simple.
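
The blocking structure of the RPC path just described can be illustrated with a small, self-contained program. The sketch below uses POSIX threads in place of Panda's own thread primitives (pan_mutex_lock(), pan_cond_wait(), and so on), and a separate thread stands in for the reply upcall; it illustrates the pattern only, not Panda's code.

#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  arrived;
    const char     *reply;     /* filled in by the reply upcall */
} rpc_call;

/* Simulated reply upcall: in Panda, the RPC module invokes this when the
 * reply message reaches the client's SAP. */
static void *
reply_upcall(void *arg)
{
    rpc_call *call = arg;

    sleep(1);                              /* pretend the server is working */
    pthread_mutex_lock(&call->lock);
    call->reply = "reply message";
    pthread_cond_signal(&call->arrived);   /* wake the blocked client thread */
    pthread_mutex_unlock(&call->lock);
    return NULL;
}

int
main(void)
{
    rpc_call  call = { PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, NULL };
    pthread_t t;

    /* "Send" the request, then block until the reply handler signals us. */
    pthread_create(&t, NULL, reply_upcall, &call);
    pthread_mutex_lock(&call.lock);
    while (call.reply == NULL)
        pthread_cond_wait(&call.arrived, &call.lock);
    pthread_mutex_unlock(&call.lock);

    printf("client received: %s\n", call.reply);
    pthread_join(t, NULL);
    return 0;
}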

6.1.3 Panda's Main Abstractions

Below we describe the abstractions and mechanisms that are used throughout the Panda system and by Panda clients. On LFC, most of these abstractions are implemented in the system module.

Threads

Panda's multithreading interface includes the data types and functions that are found in most thread packages: thread create, thread join, scheduling priorities, locks, and condition variables. Since most thread packages provide similar mechanisms, Panda's thread interface can usually be implemented using wrapper routines that invoke the routines supplied by the thread package available on the target platform (e.g., POSIX threads [110]). The LFC implementation of Panda's thread interface uses a thread package called OpenThreads [63, 64]. Section 6.2 describes how Panda orchestrates the interactions between LFC and OpenThreads in an efficient way.

Addressing

The system module implements an addressing abstraction called Service Access Points (SAPs). Messages transmitted by the system module are addressed to a SAP in a particular destination process. Since Panda does not implement a name service, all SAPs must be created by all processes at initialization time. The creator of a SAP (a Panda client or one of Panda's modules) associates a receive upcall with each SAP. When a message arrives at a SAP, this upcall is invoked with the message as an argument. Panda executes at most one upcall per SAP at a time. Upcalls of different SAPs may run concurrently. Panda's high-level modules use addressing and message delivery mechanisms similar to those of the system module. The message-passing module, for example, implements an endpoint abstraction called ports. On LFC, ports are implemented as system module endpoints (SAPs), but on other platforms there need not be a one-to-one correspondence between ports and SAPs. In the Panda 4.0 implementation on top of UDP, for instance, SAPs are implemented as UDP sockets, but ports are not. The reason for this difference is that communication to UDP sockets is unreliable, while communication to a port is reliable. As to message delivery, almost all modules deliver messages by means of upcalls associated with communication endpoints (SAPs, ports, etc.). The RPC module is an exception, because RPC replies are delivered synchronously to the thread that sent the request. This thread is blocked until the reply arrives. RPC requests, on the other hand, are delivered by means of upcalls.

Message Abstractions

Panda provides two message abstractions: I/O vectors at the sending side and stream messages at the receiving side. Both abstractions are used by all Panda modules. A sending process creates a list of pointers to buffers that it wishes to transmit and passes this list to a send routine. The buffer list is called an I/O vector. Panda's low-level send routines (described below) gather the data referenced by the I/O vector by copying the data into network packets. The main advantage of a gather interface over interfaces that accept only a single buffer argument is that it does not force the client to copy data into a contiguous buffer before invoking the send routine (which will have to copy the message at least once more, to the NI). At the receiving side, Panda uses stream messages. Stream messages were introduced in Illinois Fast Messages (version 2.0) [112]. A stream message is a message of arbitrary length that can be accessed only sequentially (i.e., like a stream). A stream message behaves like a TCP connection that contains exactly one message. This property allows receivers of a message to begin consuming that message before it has been fully received. Incoming data can be copied to its final destination in a pipelined fashion, which is more efficient than first reassembling the complete message before passing it to the client. Panda consists of multiple protocol layers and Panda clients may add their own protocol layers. Each layer typically adds a header to every message that it transmits.

Fig. 6.3. Two Panda header stacks (each stack holds the headers of the system module, the message-passing or group module, the RPC module, and one or more client protocols, with unused space on top).

To support efficient header manipulation, we borrow an idea from the x-kernel [116]. All protocol headers that need to be prepended to an outgoing message are stored in a single, contiguous buffer. By using a single buffer to store protocol headers, Panda avoids copying its clients' I/O vectors just to prepend pointers to its own headers (which are small). Such buffers are called header stacks and are used both at the sending and the receiving side. Senders write (push) their headers to the header stack. Receivers read (pop) those headers in reverse order. Both pushing and popping headers are simple and efficient operations. Figure 6.3 shows two Panda protocol stacks and the corresponding header stacks. Each protocol layer, whether internal or external to Panda, exports how much space in the header stack it and its descendants in the protocol graph need. Other protocols can retrieve this information and use it to determine where in the header stack buffer they must store their header.
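
The following self-contained fragment illustrates the push/pop discipline on a contiguous header buffer. The layout, sizes, and push order are illustrative only; in Panda the position of each header is determined by the space that each layer exports, as described above.

#include <stdio.h>
#include <string.h>

typedef struct {
    char buf[64];     /* contiguous buffer sized for the whole protocol stack */
    int  top;         /* offset just above the most recently pushed header */
} header_stack;

/* Sender side: a layer pushes its header on top of those pushed before it. */
static void
push_header(header_stack *hs, const void *hdr, int size)
{
    memcpy(hs->buf + hs->top, hdr, size);
    hs->top += size;
}

/* Receiver side: layers pop headers in the reverse order. */
static void
pop_header(header_stack *hs, void *hdr, int size)
{
    hs->top -= size;
    memcpy(hdr, hs->buf + hs->top, size);
}

int
main(void)
{
    header_stack hs = { { 0 }, 0 };
    int rpc_hdr = 42, mp_hdr = 7;             /* two illustrative headers */
    int rpc_in, mp_in;

    push_header(&hs, &rpc_hdr, sizeof rpc_hdr);
    push_header(&hs, &mp_hdr, sizeof mp_hdr);

    pop_header(&hs, &mp_in, sizeof mp_in);    /* popped in reverse order */
    pop_header(&hs, &rpc_in, sizeof rpc_in);
    printf("mp = %d, rpc = %d\n", mp_in, rpc_in);
    return 0;
}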

Send and Receive Interfaces

Table 6.1 gives the signatures of the system module's communication routines. Pan_unicast() gathers and transmits a header stack (proto) and an I/O vector (iov) to a SAP (sap) in the destination process (dest). When the destination process receives the first packet of the sender's message, it invokes the handler routine associated with the service access point parameter (sap). The handler routine takes a header stack and a stream message as its arguments. The receiving process can read the stream message's data by passing the stream message to pan_msg_consume(). The header stack can be read directly.

void pan_unicast(unsigned dest, pan_sap_p sap, pan_iovec_p iov, int veclen, void *proto, int proto_size)
    Constructs a message out of all veclen buffers in I/O vector iov. Buffer proto, of length proto_size, is prepended to this message and is used to store headers. The message is transmitted to process dest. When dest receives the message, it invokes the upcall associated with sap, passing the message, its headers, and its sender as arguments.

void pan_multicast(pan_pset_p dests, pan_sap_p sap, pan_iovec_p iov, int veclen, void *proto, int proto_size)
    Behaves like pan_unicast(), but multicasts the I/O vector to all processes in dests, not just to a single process.

int pan_msg_consume(pan_msg_p msg, void *buf, unsigned size)
    Tries to copy the next size bytes from message msg to buf and returns the number of bytes that have been copied. This number equals size unless the client tried to consume more bytes than the stream contains. When the last byte has been consumed, the message is destroyed.

Table 6.1. Send and receive interfaces of Panda's system module.

Pan_msg_consume() consumes the next n bytes from the stream message and copies them to a client-specified buffer. If fewer than n bytes have arrived at the time that pan_msg_consume() is called, pan_msg_consume() blocks and polls until at least n bytes have arrived. In many cases, the message handler reads all of a stream message's data. If this is inconvenient, however, then the receiving process can store the stream message, return from the handler, and consume the data later. Storing the stream message consists of copying a pointer to the message data structure; the contents of the message need not be copied. Other Panda modules have similar send and receive signatures. In all cases, the send routine accepts an I/O vector with buffers to transmit and the receive upcall delivers a stream message and a protocol header stack.
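
As an illustration of explicit consumption, the fragment below sketches a receive upcall that reads a fixed-size request record from a stream message. Only pan_msg_consume() and the pan_msg_p type are taken from Table 6.1; the upcall's exact signature, the request layout, and handle_request() are hypothetical.

typedef struct {
    int op;
    int arg;
} request;

/* Hypothetical upcall associated with a SAP; Panda passes the sender, the
 * header stack, and the stream message (argument order is illustrative). */
static void
request_upcall(int sender, void *proto, pan_msg_p msg)
{
    request req;

    /* Blocks (polling) until sizeof(req) bytes of the stream have arrived;
     * consuming the last byte destroys the message. */
    pan_msg_consume(msg, &req, sizeof req);
    handle_request(sender, req.op, req.arg);   /* hypothetical client code */
}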

Upcalls

The system module delivers every incoming message by invoking the upcall routine associated with the message's destination SAP. The upcall routine is usually a routine inside the higher-level module that created the SAP. In most cases, this routine passes the message to a higher-level layer (either the Panda client or another Panda module). This is done by invoking another upcall or by storing the message (e.g., an RPC reply) in a known location. In other cases, the message (e.g., an acknowledgement) serves only control purposes internal to the receiving module and will not be propagated to higher-level layers. When the system module receives the first packet of a message, it schedules an upcall. Each SAP maintains a queue of pending upcalls, which are executed in-order and one at a time. Clients and higher-level modules do not have precise control over the scheduling of upcalls. They must assume that every upcall is executed asynchronously and protect their global data structures accordingly. An interesting question is whether an upcall should be viewed as an independent thread or as a subroutine of the running 'computation' thread. The former view is more natural, because the next incoming message may have nothing to do with the current activity of the running thread. Unfortunately, this view has some problems associated with it. These problems and their resolution in Panda are discussed in Section 6.2.

6.2 Integrating Multithreading and Communication

This section studies the interactions between LFC's communication mechanisms and multithreading in Panda. Section 6.2.1 explains how Panda and other multithreaded clients can be implemented safely on LFC. Next, in Section 6.2.2 we show how the knowledge that is available in a thread scheduler can be exploited to reduce interrupt overhead. Finally, in Section 6.2.3 we discuss design options and implementation techniques for upcall models. The design of an upcall model is related to multithreading and efficient implementations of some upcall models require cooperation of the multithreading system. After discussing different design options, we describe Panda's upcall model.

6.2.1 Multithread-Safe Access to LFC

LFC is not multithread-safe: concurrent invocations of LFC routines can corrupt LFC's internal data structures. LFC does not use locks to protect its global data, because several LFC clients (e.g., CRL, TreadMarks, and MPI) do not use multithreading and because we are reluctant to make LFC dependent on specific thread packages. Nevertheless, Panda and other multithreaded systems can safely use LFC without modifications to LFC's interface or implementation. With single-threaded clients LFC has to be prepared for two types of concurrent invocation of LFC routines: recursive invocations and invocations executed by LFC's signal handler. Recursive invocations occur (only) when an LFC routine drains the network (see Section 3.6). The routine that sends a packet, for example, will poll when no free send descriptors are available.

During a poll, LFC may invoke lfc_upcall() to handle incoming network packets. This upcall is defined by the client and may invoke LFC routines, including the routine that polled. This scenario is illustrated in Figure 6.4.

Fig. 6.4. Recursive invocation of an LFC routine (the client's computation calls LFC's send routine, lfc_ucast_launch(), which polls and invokes the client-defined lfc_upcall(); the upcall may call back into LFC).

To prevent corruption of global state due to recursion, LFC always leaves all global data in a consistent state before polling. When we allow network interrupts, LFC routines can be interrupted by an upcall at any point, which can easily result in data races. To avoid such data races, LFC logically disables network interrupts whenever an LFC routine is entered that accesses global state. (As explained in Section 3.7, LFC does not truly disable network interrupts.) The two measures described above are insufficient to deal with multithreaded clients, because they do not prevent two client threads from concurrently entering LFC. To solve this problem, Panda's system module employs a single lock to control access to LFC routines. Panda acquires this lock before invoking any LFC routine and releases the lock immediately after the routine returns. The lock introduces a new problem: recursive invocations of an LFC routine will deadlock, because Panda does not allow a thread to acquire a lock multiple times (i.e., Panda does not implement recursive locks). To prevent this, Panda releases the lock upon entering lfc_upcall() and acquires it again just before returning from lfc_upcall(). This works, because all recursive invocations have lfc_upcall() in their call chain. When lfc_upcall() releases the lock, another thread can enter LFC. This is safe, because the polling thread that invoked lfc_upcall() always leaves LFC's global variables in a consistent state. With these measures, we can handle concurrent invocations from multiple Panda threads and recursive invocations by a single thread.

There remains one problem: network interrupts. Network interrupts are processed by LFC's signal handler (see Section 3.7). When invoked, this signal handler checks that interrupts are enabled (otherwise it returns) and then invokes lfc_poll() to check for pending packets. Lfc_poll(), in turn, may invoke lfc_upcall() and will do so before Panda has had a chance to acquire its lock. When this occurs, lfc_upcall() will release the lock even though the lock had never been acquired. The problem is that the signal handler is another entry point into LFC that needs to be protected with a lock/unlock pair. To do that, Panda overrides LFC's signal handler with its own signal handler. Panda's signal handler acquires the lock, invokes LFC's signal handler, and releases the lock. This closes the last atomicity hole. Summarizing, Panda achieves multithread-safe access to LFC through three measures:

1. LFC leaves its global variables in a consistent state before polling.

2. The client brackets calls to LFC routines with a lock/unlock pair to prevent multiple client threads from executing concurrently within LFC.

3. The client brackets invocations of LFC's network signal handler with a lock/unlock pair.

Two properties of LFC make these measures work. First, LFC does not provide a blocking receive downcall, but dispatches all incoming packets to a user-supplied upcall function. Systems like MPI, in contrast, provide a blocking receive downcall. If we bracket calls to such a blocking receive with a lock/unlock pair, we block entrance to the communication layer until a message is received. This causes deadlock if a thread blocked in a receive downcall waits for a message that can be received only if another thread is able to enter the communication system (e.g., to send a message). This type of blocking is different from the transient blocking that occurs when LFC has run out of some resource (e.g., send packets). The latter type of blocking is always resolved if all participating processes drain the communication network and release host receive-packets. Second, LFC dispatches all packets to a single upcall function, which allows lock management to be centralized. If the communication system directly invoked user handlers, then it would be the responsibility of the user to get the locking right, or, alternatively, LFC would have to know about the locks used by the client system.
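
The following self-contained fragment illustrates the locking pattern with a POSIX mutex. Here lfc_send() is a dummy stand-in for an LFC send routine that drains the network while polling; Panda's real wrappers (and its signal-handler bracket, which follows the same lock/unlock discipline) are more elaborate.

#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t lfc_lock = PTHREAD_MUTEX_INITIALIZER;

/* Client-defined upcall: LFC dispatches every incoming packet here. */
static void
lfc_upcall(const char *pkt)
{
    pthread_mutex_unlock(&lfc_lock);    /* let other threads enter LFC ... */
    printf("delivered: %s\n", pkt);     /* hand the packet to Panda */
    pthread_mutex_lock(&lfc_lock);      /* ... and reacquire before returning */
}

/* Stand-in for an LFC send routine that drains the network while polling. */
static void
lfc_send(const char *pkt)
{
    lfc_upcall("packet drained while polling");   /* recursive invocation */
    printf("sent: %s\n", pkt);
}

/* Panda brackets every LFC call with the single lock. */
static void
panda_send(const char *pkt)
{
    pthread_mutex_lock(&lfc_lock);
    lfc_send(pkt);
    pthread_mutex_unlock(&lfc_lock);
}

int
main(void)
{
    panda_send("request");
    return 0;
}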

6.2.2 Transparently Switching between Interrupts and Polling

In Section 2.5 we discussed the main advantages and disadvantages of polling and interrupts. This section presents an approach, implemented in Panda and its underlying thread package OpenThreads, that combines the advantages of polling and interrupts. In this approach polling is used when the application is known to be idle. Interrupts are used to deliver messages to a process that is busy executing application code. Using the thread scheduler's knowledge of the state of all threads, Panda dynamically (and transparently) switches between these two strategies. As a result, Panda never polls unnecessarily and we use interrupts only when the application is truly busy. The techniques described in this section were initially developed on another user-level communication system (FM/MC [10]). The same techniques, however, are used on LFC. In addition, LFC supports the mixing of polling and interrupts by means of its polling watchdog.

Polling versus Interrupts

Choosing between polling and interrupts can be difficult. There are difficulties in two areas: ease of programming and performance cost. We consider two issues related to ease of programming: matching the polling rate to the message-arrival rate and concurrency control. With polling, it is necessary to roughly match the polling rate to the message-arrival rate. Unfortunately, many parallel programming systems cannot predict when messages arrive and when they need to be processed. These systems must either use interrupts or insert polls automatically. Since interrupts are expensive, a polling-based approach is potentially attractive. Automatic polling, however, has its own costs and problems. Polls can be inserted statically, by a compiler, or dynamically, by a runtime system. A compiler must be conservative and may therefore insert far too many polling statements (e.g., at the beginning of every loop iteration). A runtime system can poll the network each time it is invoked, but this approach works only if the application frequently invokes the runtime system. Alternatively, the runtime system can create a background thread that regularly polls the network. This, however, requires that the thread scheduler implement a form of time-slicing, and introduces the overhead of switching to and from the background thread. Polling is also troublesome from a software-engineering perspective, because all code, including large standard libraries, may have to be processed to include polls. The second issue is concurrency control. Unlike interrupts, using polls gives precise control over message-handler execution. Consequently, a single-threaded application that polls the network only when it is not in a critical section need not use locks or interrupt-status manipulation to protect its shared data. However, to exploit the synchronous nature of polling, one must know exactly where a poll may occur. Calling a library routine from within a critical section can be dangerous, unless it is guaranteed that this routine does not poll.

Quantifying the difference in cost between using interrupts and polling is difficult due to the large number of parameters involved: hardware (cache sizes, network adapters), operating system (interrupt handling), runtime support (thread packages, communication interfaces), and application (polling policy, message-arrival rate, communication patterns). The discussion below considers the base costs of polling and interrupts, and the relationship with the message-arrival rate. First, executing a single poll is typically much cheaper than taking an interrupt, because a poll executes entirely in user space without any context switching (see Section 2.5). Dispatching an interrupt to user space on a commodity operating system, on the other hand, is expensive. The main reason is that software interrupts are typically used to signal exceptions like segmentation faults, events for which operating systems do not optimize [143]. In LFC, a successful poll costs 1.0 µs; dispatching an interrupt and a signal handler costs 31 µs. Second, comparing the cost of a single poll to the cost of a single interrupt does not provide a sufficient basis for statements about application performance. A single interrupt can process many messages, so the cost of an interrupt can be amortized over multiple messages. Also, each time a poll fails, the user program wastes a few cycles. Since matching the polling rate to the message-arrival rate can be hard, an application may either poll too often (and thus waste cycles) or poll too infrequently (and delay the delivery of incoming messages). For Panda, the following observations apply. Since Panda is multithreaded, many Panda clients will already use locks to protect accesses to global data. Such clients will have no trouble when upcalls are scheduled preemptively and do not get any benefit from executing upcalls at known points in time. Moreover, several of Panda's clients cannot predict when messages will arrive, yet need to respond to those messages in a timely manner. These two (ease-of-use) observations suggest an interrupt-driven approach. However, since we expect that the exclusive use of interrupts will lead to bad performance for clients that communicate frequently, polling should also be taken into consideration. Below, we describe how Panda tries to get the best of both worlds.

Exploiting the Thread Scheduler

Both polling and interrupts have their advantages and it is often beneficial to combine the two of them. LFC provides the mechanisms to switch dynamically between polling and interrupts: primitives to disable and enable interrupts, and a polling primitive. Using these primitives, a sophisticated programmer can get the advantages of both polling and interrupts. The CRL runtime system, for example, takes this approach [73]. Unfortunately, this approach is error-prone: the programmer may easily forget to poll or, worse, to disable interrupts. To solve this problem, we have devised a simple strategy for switching between the two message extraction mechanisms. The key idea is to integrate thread management and communication. This integration is motivated by two observations. First, polling is always beneficial when the application is idle (i.e., when it waits for an incoming message). Second, when the application is active, polling may still be effective from a performance point of view, but inserting polls into computational code and finding the right polling rate can be tedious. Thus, our strategy is simple: we poll whenever the application is idle and use interrupts whenever runnable threads exist. To be able to poll when the application is idle, however, we must be able to detect this situation. This suggests that the decision to switch between polling and interrupts be taken in the thread scheduler.

Fig. 6.5. RPC example (timelines for processors A and B showing application code, LFC/Panda upcalls, the OpenThreads idle/polling loop, and the request and reply messages in flight).

We implemented this strategy in the following way. By default, Panda uses interrupts to receive messages. However, when OpenThreads's thread scheduler detects that all threads are blocked, it disables network interrupts and starts polling the network in a tight loop. A successful poll results in the invocation of lfc_upcall(). If the execution of the handler wakes up a local thread, then the scheduler will re-enable network interrupts and schedule the awakened thread. If an application thread is running when a message arrives, LFC will generate a network signal. Since LFC delays the generation of network interrupts, it is unlikely that interrupts are generated unnecessarily (see Section 3.7). OpenThreads does not detect all types of blocking: a thread is considered blocked only if it blocks by invoking one of Panda's thread synchronization primitives (pan_mutex_lock(), pan_cond_wait(), or pan_thread_join()). If a thread spin-waits on a memory location then this is not detected by OpenThreads. Also, OpenThreads is not aware of blocking system calls. If a system call blocks, it will block the entire process and not just the thread that invoked the system call. Some thread packages solve this problem by providing wrappers for blocking system calls. The wrapper invokes the asynchronous version of the system call and then invokes the scheduler, which can then block the calling thread. When a poll or a signal indicates that the system call has completed, the scheduler can reschedule the thread that made the system call. Figure 6.5 illustrates a typical RPC scenario. A thread on processor A issues an RPC to remote processor B, which also runs a computation thread. After sending its request, the sending thread blocks on a condition variable and Panda invokes the thread scheduler. The scheduler finds no other runnable threads, disables interrupts, and starts polling the network. When the request arrives at processor B, it interrupts the running application thread. (Since processor B has an active thread, interrupts are enabled.) The request is then dispatched to the destination SAP's message handler. This handler processes the request, sends a reply, and returns. On processor A, the polling thread receives the reply, enters the message-passing library, and then signals the condition variable that the application thread blocked on. When the handler thread dies, the scheduler re-enables interrupts, because the application thread has become runnable again. Next, the scheduler switches to the application thread and starts to run it.
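
In outline, the scheduler's idle path therefore looks roughly like the following sketch. Except for lfc_poll() (and, indirectly, lfc_upcall()), the names are hypothetical; OpenThreads' real scheduler is, of course, more involved.

/* Hypothetical sketch of the idle path in the thread scheduler. */
static void
scheduler_idle_loop(void)
{
    disable_network_interrupts();       /* hypothetical wrapper */
    while (!runnable_thread_exists())
        lfc_poll();                     /* a successful poll runs lfc_upcall(),
                                           which may make a thread runnable */
    enable_network_interrupts();        /* back to interrupt-driven delivery */
    switch_to(next_runnable_thread());  /* hypothetical thread switch */
}
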
Summarizing, the behavior of the integrated threads and communication system is that synchronous communication, like the receipt of an RPC reply, is usually handled through polling. Asynchronous communication, on the other hand, is mostly handled through interrupts. This mixing of polling and interrupts combines nicely with LFC's polling watchdog.

6.2.3 An Efficient Upcall Implementation

In this section we discuss the design options for upcall implementations. Next, we discuss three upcall models that take different positions in the design space. The properties of these models have influenced the design of Panda's upcall model, which is discussed toward the end of this section. The two main design axes for upcall systems are upcall context and upcall concurrency. Upcalls can be invoked from different processing contexts. A simple approach is to invoke upcall routines from the thread that happens to be running when a message arrives. We call this the procedure call approach. The alternative approach, the thread approach, is to give each upcall its own execution context. This amounts to allocating a thread per incoming message. The exact context from which a system makes upcalls affects both the programming model and the efficiency of upcalls. Upcall concurrency determines how many upcalls can be active concurrently in a single receiver process. The most restrictive policy allows at most one upcall to execute at any time; the most liberal strategy allows any number of upcalls to execute concurrently. Again, different choices lead to differences in the programming model and performance. Table 6.2 summarizes the design axes and the different positions on these axes. The table also classifies three upcall models along these axes: active messages, single-threaded upcalls, and popup threads. The active-messages model [148] is restrictive in that it prohibits all blocking in message handlers. In particular, active-message handlers may not block on locks, which complicates programming significantly. On the other hand, highly efficient implementations of this model exist. Some of these implementations allow active-message handler invocations to nest, others do not. Popup threads [116] allow message handlers to block at any time. Popup threads are often slower than active messages, because conceptually a thread is created for each message. Finally, we consider single-threaded upcalls [14], which disallow blocking in some, but not all cases. Below, we compare the expressiveness of these models and the cost of implementing them.

                          Context
Concurrency               Procedure call      Thread
One upcall at a time      Active messages     Single-threaded upcalls
Multiple upcalls          Active messages     Popup threads

Table 6.2. Classification of upcall models.

int x, y;

void
handle_store(int *x_addr, int y_val, int z0, int z1)
{
    *x_addr = y_val;
}

void
handle_read(int src, int *x_addr, int z0, int z1)
{
    AM_send_4(src, handle_store, x_addr, y, 0, 0);
}

...
/* send read request to processor 5 */
AM_send_4(5, handle_read, my_cpu, &x, 0, 0);
...

Fig. 6.6. Simple remote read with active messages.

Active Messages

As explained in Section 2.8, an active message consists of the address of a handler function and a small number of data words (typically four words). When an active message is received, the handler address is extracted from the message and the handler is invoked; the data words are passed as arguments to the handler. For large messages, the active-messages model provides separate bulk-transfer primitives. A typical use of active messages is shown in Figure 6.6. In this example, processor my_cpu sends an active message to read the value of variable y on processor 5. To send the message, we use the hypothetical routine AM_send_4(), which accepts a destination, a handler function, and four word-sized arguments. At the receiving processor, the handler function is applied to the arguments. In this case, the message contains the address of the handler handle_read(), the identity of the sender (my_cpu), and the address at which the result of the read should be stored (&x). Function handle_read() will be invoked on processor 5 with src set to my_cpu and x_addr set to &x. The handler replies with another active message that contains the value of y. When this reply arrives at processor my_cpu, handle_store() is invoked to store the value in variable x. We assume that reading an integer (variable y) is an atomic operation. Active-message implementations deliver performance close to that of the raw hardware. An important reason for this high performance is that active-message handlers do not have their own execution context. When an application thread polls or is interrupted by a network signal, the communication system invokes the handler(s) of incoming messages in the context of the application thread. That is, the handler's stack frames are simply stacked on top of the application frames (see Figure 6.8(a)). No separate thread is created, which eliminates the cost of building a thread descriptor, allocating a stack, and a thread switch. The lack of a dedicated thread stack makes active messages efficient, but makes them unattractive as a general-purpose communication mechanism for application programmers. Active-message handlers are run on the stacks of application threads. If an active-message handler blocks, then the application thread cannot be resumed, because part of its stack is occupied by the active-message handler (see Figure 6.8(a)). This type of blocking occurs when the application thread holds a resource (e.g., a lock) that is also needed by the active-message handler. Clearly, a deadlock is created if the active-message handler waits until the application thread releases the resource. Similar problems arise when a handler waits for the arrival of another message. Consider for example the transmission of a large reply message by an upcall handler. If the message is sent by means of a flow-controlled protocol, then the send routine may block when its send window closes. The send window can be reopened only after an acknowledgement has been processed, which requires another upcall. If the active-messages system allows at most one upcall at a time, then we have a deadlock. If the system allows multiple upcalls, however, then the acknowledgement handler can be run on top of the blocked handler (i.e., as a nested upcall) and no deadlock occurs. These problems are not insolvable. If it is necessary to suspend a handler, the programmer can explicitly save state in an auxiliary data structure, a continuation, and let the handler return.
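
As a concrete, hypothetical illustration of a manually created continuation, a handler that cannot complete a remote lookup immediately might save just enough state to finish the operation later; the structure, fields, and helper functions below are ours, not taken from any system described in this thesis.

/* Hypothetical continuation for a suspended lookup handler. */
typedef struct {
    int requester;       /* processor waiting for the result */
    int table_index;     /* which entry to look up once it is available */
} lookup_cont;

/* Instead of blocking, the handler records its state and returns. */
static void
suspend_lookup(int requester, int table_index)
{
    lookup_cont c = { requester, table_index };
    enqueue_continuation(&c);            /* hypothetical pending-work queue */
}

/* Later, a local thread or another handler resumes the saved computation. */
static void
resume_lookup(const lookup_cont *c)
{
    send_reply(c->requester, lookup(c->table_index));   /* hypothetical */
}
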
The state saved in the continuation can later be used by a local thread or another handler to resume the suspended computation. We assume that continuations are created manually by the programmer. Sometimes, though, it is possible to create continuations automatically. Automatic approaches either rely on a compiler to identify the state to be saved or save state conservatively. If a compiler recognizes potential suspension points in a program, then it can identify live variables and generate code to save live variables when the computation is 6.2 Integrating Multithreading and Communication 123 suspended. Blocking a thread is an example of conservative state saving: the saved state consists of the entire thread stack. To select between these state-saving alternatives, the following tradeoffs must be considered. With manual state saving, no compiler is needed and application- specific knowledge can be exploited to minimize the amount of state saved. How- ever, manual state saving can be tedious and error prone. Compilers can remove this burden, but most communication systems are constructed as libraries rather than languages. Finally, blocking an entire thread is simple, but requires that ev- ery upcall be run in its own thread context, otherwise we will also suspend the computation on which the upcall has been stacked. This alternative is discussed below, in the section on popup threads. In small and self-contained systems, continuations can be used effectively to solve the problem of blocking upcalls (see Section 7.1). On the other hand, contin- uations are too low-level and error prone to be used by application programmers. In fact, the original active-message proposal [148] explicitly states that the active- message primitives are not designed to be high-level primitives. Active messages are used in implementations of several parallel programming systems. A good example is Split-C [39, 148], an extension of C that allows the programmer to use global pointers to remote words or arrays of data. If a global pointer to remote data is dereferenced, an active message is sent to retrieve the data.
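To make the continuation technique concrete, the sketch below reworks a job-request handler in the style of Figure 6.9 so that it never blocks: when the job queue is empty it saves its state in a continuation and returns, and a local thread later completes the request. All declarations are hypothetical illustrations made for this sketch; they are not LFC, Panda, or Split-C code.

#include <stdlib.h>

/* Hypothetical active-message primitives in the style of Figures 6.6 and 6.9;
 * the exact prototypes are assumptions. */
extern void AM_send_4(int dest, void (*handler)(), int *addr, int val, int z0, int z1);
extern void handle_store();
extern int  try_fetch_job(int *job_id);    /* non-blocking; returns 0 if the queue is empty */
extern void job_queue_add(int job_id);

/* A continuation: the state a suspended job-request handler needs later. */
typedef struct cont {
    int  src;                /* requesting processor */
    int *reply_addr;         /* where the requester wants the job id stored */
    struct cont *next;
} cont_t;

static cont_t *waiting_requests;

/* Active-message handler: must not block, so it parks a continuation instead. */
void handle_job_request(int src, int *reply_addr, int z0, int z1)
{
    int job_id;

    if (try_fetch_job(&job_id)) {
        AM_send_4(src, handle_store, reply_addr, job_id, 0, 0);
        return;
    }
    cont_t *c = malloc(sizeof *c);     /* save the handler's state ... */
    c->src = src;
    c->reply_addr = reply_addr;
    c->next = waiting_requests;
    waiting_requests = c;              /* ... and return without blocking */
}

/* Called by a local thread (or another handler) whenever a new job is created. */
void job_created(int job_id)
{
    cont_t *c = waiting_requests;

    if (c != NULL) {                   /* resume one suspended request */
        waiting_requests = c->next;
        AM_send_4(c->src, handle_store, c->reply_addr, job_id, 0, 0);
        free(c);
    } else {
        job_queue_add(job_id);
    }
}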

Single-Threaded Upcalls In the single-threaded upcall model [14], all messages sent to a particular process are processed by a single, dedicated thread in that process. We refer to this thread as the upcall thread. Also, in its basic form, the single-threaded upcall model allows at most one upcall to execute at a time. Figure 6.7 shows how single-threaded upcalls can be used to access a dis- tributed hash table. Such a table is often used in distributed game-tree searching to cache evaluation values of board positions that have already been analyzed. To look up an evaluation value in a remote part of the hash table, a process sends a handle lookup message to the processor that holds the table entry. Since the table may be updated concurrently by local worker threads, and because an up- date involves modifying multiple words (a key and a value), each table entry is protected by a lock. In contrast with the active-messages model, the handler can safely block on this lock when it is held by some local thread. While the handler is blocked, though, no other messages can be processed. The single-threaded upcall model assumes that locks are used only to protect small critical sections so that pending messages will not be delayed excessively. Allowing at most one upcall at a time restricts the set of actions that a pro- grammer can safely perform in the context of an upcall. Specifically, this policy 124 Panda

void
handle_lookup(int src, int key, int *ret_addr, int z0)
{
    int index;
    int val;

    index = hash(key);
    lock(table[index].lock);
    if (table[index].key == key) {
        val = table[index].value;
    } else {
        val = -1;
    }
    unlock(table[index].lock);
    AM_send_4(src, handle_reply, ret_addr, val, 0, 0);
}

Fig. 6.7. Message handler with locking.

does not allow an upcall to wait for the arrival of a second message, because this arrival can be detected only if the second message's upcall executes. For example, if a message handler issues a remote procedure call to another processor, deadlock would ensue because the handler for the RPC's reply message cannot be activated until the handler that sent the request returns. In practice, the single-threaded upcall model requires that message handlers create continuations or additional threads in cases where condition synchronization or synchronous communication is needed.

The difference between single-threaded upcalls and active messages is illustrated in Figure 6.8. With active messages, upcall handlers are stacked on the application's execution stack. Some implementations allow upcall handlers to nest, others do not. Single-threaded upcall handlers, in contrast, do not run on the stack of an arbitrary thread; they are always run, one at a time, on the stack of the upcall thread. Executing message handlers in the context of this thread allows the handlers to block without blocking other threads on the same processor. In particular, the single-threaded upcall model allows message handlers to use locks to synchronize their shared-data accesses with the accesses of other threads. This is an important difference with active messages, where all blocking is disallowed. If an active-message handler blocked, it would occupy part of the stack of another thread (see Figure 6.8(a)), which then cannot be resumed safely.

(Figure 6.8 shows three stack layouts: (a) active messages, where active-message handlers are stacked on top of the application code on the application thread's stack; (b) single-threaded upcalls, where handlers run on the stack of a dedicated upcall thread next to the application threads; (c) popup threads, where each handler runs in its own popup thread.)

Fig. 6.8. Three different upcall models.

void
handle_job_request(int src, int *jobaddr, int z0, int z1)
{
    int job_id;

    lock(queue_lock);
    while (is_empty(job_queue)) {
        wait(job_queue_nonempty, queue_lock);
    }
    job_id = fetch_job(job_queue);
    unlock(queue_lock);
    AM_send_4(src, handle_store, jobaddr, job_id, 0, 0);
}

Fig. 6.9. Message handler with blocking.

The single-threaded upcall model has been implemented in several versions of Panda. The current version of Panda, Panda 4.0, uses a variant of the single-threaded upcall model that we will describe later.

Popup Threads While the single-threaded upcall model is more expressive than the active-messages model, it is still a restrictive model because all messages are handled by a single thread. The popup-threads model [116], in contrast, allows multiple message han- dlers to be active concurrently. Each message is allocated its own thread context (see Figure 6.8(c)). As a result, each message handler can synchronize safely on locks and condition variables and issue (possibly synchronous) communication requests, just like any other thread. Figure 6.9 illustrates the advantages of popup threads. In this example, the message handler handle job request() attempts to retrieve a job identifier (job id) from a job queue. When no job is available, the handler blocks on condition variable job queue nonempty and waits until a new job is added to the queue. While this is a natural way to express condition synchronization, it is prohibited in both the active-messages and the single-threaded upcall model. In the active- messages model, the handler is not even allowed to block on the queue lock. In the single-threaded upcall model, the handler can lock the queue, but is not allowed to wait until a job is added, because the new job may arrive in a message not yet processed. To add this new job, it is necessary to run a new message handler while the current handler is blocked. With single-threaded upcalls, this is impossible. 6.2 Integrating Multithreading and Communication 127

Dispatching a popup thread need not be any more expensive than dispatching a single-threaded upcall [88]. Popup threads, however, have some hidden costs that do not immediately show up in microbenchmarks:

1. When many message handlers block, the number of threads in the system can become large, which wastes memory.

2. The number of runnable threads on a node may increase, which can lead to scheduling anomalies. In earlier experiments we observed a severe perfor- mance degradation for a search algorithm in which popup threads were used to service requests for work [88]. The scheduler repeatedly chose to run high-priority popup threads instead of the low-priority application thread that generated new work. As a result, many useless thread switches were executed. Druschel and Banga found similar scheduling problems in UNIX systems that process incoming network packets at top priority [47].

3. Because popup threads allow multiple message handlers to execute concur- rently, the ordering properties of the underlying communication system may be lost. For example, if two messages are sent across a FIFO communica- tion channel from one node to another, the receiving process will create two threads to process these messages. Since the thread scheduler can sched- ule these threads in any order, the FIFO property is lost. In general, only the programmer knows when the next message can safely be dispatched. Hence, if FIFO ordering is needed, it has to be implemented explicitly, for example by tagging messages with sequence numbers.
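As a concrete illustration of the third point above, a receiver can restore FIFO order by having senders tag their messages with per-sender sequence numbers and by delaying handlers that run ahead of their turn. The sketch below uses POSIX threads for the synchronization; deliver(), MAX_SENDERS, and the handler signature are illustrative assumptions, not part of any of the systems discussed here.

#include <pthread.h>

#define MAX_SENDERS 64                      /* illustrative upper bound */

static pthread_mutex_t order_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  order_cond = PTHREAD_COND_INITIALIZER;
static unsigned next_seqno[MAX_SENDERS];    /* next sequence number to deliver, per sender */

extern void deliver(int sender, void *msg); /* application-level message handler */

/* Body of a popup thread; 'seqno' was attached to the message by the sender. */
void popup_handler(int sender, unsigned seqno, void *msg)
{
    pthread_mutex_lock(&order_lock);
    while (seqno != next_seqno[sender])     /* wait until all earlier messages ran */
        pthread_cond_wait(&order_cond, &order_lock);
    pthread_mutex_unlock(&order_lock);

    deliver(sender, msg);                   /* process in FIFO order */

    pthread_mutex_lock(&order_lock);
    next_seqno[sender]++;                   /* allow the next message to proceed */
    pthread_cond_broadcast(&order_cond);
    pthread_mutex_unlock(&order_lock);
}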

Several systems provide popup threads. Among these systems are Nexus [54], Horus [144], and the x-kernel [116]. These systems have all been used for a variety of parallel and distributed applications.

Panda's Upcall Model

The first Panda system [19] implemented popup threads. To reduce the overhead of thread switching, however, later versions have used the single-threaded upcall model. Two kinds of thread switching overhead were eliminated:

1. In our early Panda implementations on Unix and Amoeba [139], all incoming messages were dispatched to a single thread, the network daemon, which passed each message to a popup thread. (These systems maintained a pool of popup threads to avoid thread creation costs on the critical path.) The use of single-threaded upcalls allowed the network daemon to process all messages without switching to a popup thread.

2. Multiple upcall threads could be blocked, waiting for an event that was to be generated by a local (computation) thread or a message. When the event occurred, all threads were awakened, even if the event allowed only one thread to continue; the other threads would put themselves back to sleep. It is frequently possible to avoid putting upcall threads to sleep. Instead, upcalls can create continuations which can be resumed by other threads without any thread switching. Once upcalls do not put themselves to sleep any more, they can be executed by a single-threaded upcall instead of popup threads.

In retrospect, the first type of overhead could have been eliminated without chang- ing the programming model, by means of lazy thread-creation techniques such as described below. The second problem occurred in the implementation of Orca and is discussed in more detail in Section 7.1.4. Panda 4.0 implements an upcall model that lies between active messages and single-threaded upcalls. The model is as follows. First, as in single-threaded upcalls, Panda clients can use locks in upcalls, but should not use condition- synchronization or synchronous communication primitives in upcalls. Second, Panda clients must release their locks before invoking any Panda send or receive routine. This restriction is the main difference between Panda’s upcall model and single-threaded upcalls. It sometimes allows the implementation to run upcalls on the current stack rather than on a separate stack. Finally, Panda allows up to one upcall per endpoint to be active at any time. Panda clients must therefore be prepared to deal with concurrency between upcalls for different endpoints. In practice, this poses no problems, because programmers already have to deal with concurrency between upcalls and a computation thread or between multiple computation threads. Since Panda runs only upcalls for dif- ferent endpoints concurrently, the order of messages sent to any single endpoint is preserved, so we avoid a disadvantage of popup threads.
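A minimal sketch of an upcall handler that obeys Panda 4.0's rules: it may take a lock to protect shared data, but it releases that lock before it calls back into Panda to send a reply. The pan_mutex_* names and the send_reply() wrapper are hypothetical stand-ins; only the ordering constraint is the point.

/* Hedged sketch of a Panda-4.0-style upcall. All names are illustrative. */
extern void pan_mutex_lock(void *lock);
extern void pan_mutex_unlock(void *lock);
extern void send_reply(int src, int value);   /* wrapper around a Panda send routine */

extern void *counter_lock;
static int counter;

void counter_upcall(int src)
{
    int snapshot;

    pan_mutex_lock(counter_lock);       /* allowed: locks may be used in upcalls */
    snapshot = ++counter;
    pan_mutex_unlock(counter_lock);     /* required: release before calling into Panda */

    send_reply(src, snapshot);          /* safe: no locks are held here */
}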

Implementation of Panda’s Upcall Model On LFC, we use an optimized implementation of Panda’s upcall model. In many cases, this implementation completely avoids thread switching during message processing. The implementation distinguishes between synchronous and asyn- chronous upcalls. Synchronous upcalls occur implicitly, as the result of polling by an LFC routine, or explicitly as the result of an invocation of lfc poll() by Panda. Panda polls explicitly when all threads are idle or when it tries to consume an as yet unreceived part of a stream message (see Section 6.3). A successful poll results in the (synchronous) invocation of lfc upcall() in the context of the current thread (or the last-active thread if all threads are idle). This synchronous 6.2 Integrating Multithreading and Communication 129 upcall queues the packet just received at the packet’s destination SAP. If this is the first packet of a message, and no other messages are queued before this message, then Panda invokes the SAP’s upcall routine (by means of a plain procedure call) without switching to another stack. This is safe, because Panda’s upcall model requires that the polling thread does not hold any locks. Asynchronous upcalls occur when the NI generates a network interrupt. Net- work interrupts are propagated to the receiving user process by means of a UNIX signal. The signal handler interrupts the running thread and executes on that thread’s stack. Eventually, the signal handler invokes lfc poll() to process the next network packet. In this case, there is no guarantee that the interrupted thread does not hold any locks, so we cannot simply invoke a SAP’s upcall routine. A simple solution is to store the packet just received in a queue and signal a separate upcall thread. Unfortunately, this involves a full thread switch. To avoid this full thread switch, OpenThreads invokes lfc poll() by means of a special, slightly more expensive procedure-call mechanism. Instead of running lfc poll() on the stack of the interrupted thread, OpenThreads switches to a special upcall stack and runs the poll routine on that stack. At first sight, this is just a thread switch, but there are two differences. First, since the upcall stack is used only to run upcalls, OpenThreads does not need to restore any registers when it switches to this stack. Put differently, we always start executing at the bottom of the upcall stack. Second, OpenThreads sets up the bottom stack frame of the upcall stack in such a way that the poll routine will automatically return to the signal handler’s stack frame on the top of the stack of the interrupted thread (see Figure 6.10(a)). If the SAP handler does not block, then we will leave behind an empty upcall stack, return to the stack of the interrupted thread, return from the signal handler, and resume the interrupted thread. This is the common case that OpenThreads optimizes. If, on the other hand, the SAP handler blocks on a lock, then Open- Threads will disconnect the upcall stack from the stack of the interrupted thread. This is illustrated in Figure 6.10(b). OpenThreads modifies the return address in the bottom stack frame so that it points to a special exit function. This prevents the upcall from returning to the stack of the interrupted thread and it allows the interrupted thread to continue execution independently. Summarizing, OpenThreads’s special calling mechanism optimistically ex- ploits the observation that most handlers run to completion without blocking. 
In this case, upcalls execute without true thread switches. If the handler does block, then we promote it to an independent thread which OpenThreads schedules just like any other thread.

(Figure panels: (a) Interrupted thread and upcall thread before the upcall thread has blocked; the upcall will return, via the return link, to the signal frame on the interrupted thread's stack. (b) When the upcall thread blocks, OpenThreads modifies the return address so that the upcall returns to an exit frame instead.)

Fig. 6.10. Fast thread switch to an upcall thread.
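The control flow behind Figure 6.10 can be summarized as follows. The helpers run_on_upcall_stack(), promote_to_thread(), and allocate_upcall_stack() are hypothetical stand-ins for OpenThreads' stack-switching and return-address-patching machinery; lfc_poll() exists in LFC, but its signature is assumed here.

/* Hedged sketch of the optimistic upcall dispatch of Figure 6.10.
 * The helpers are placeholders for OpenThreads internals; only the
 * decision structure is meant to be accurate. */
typedef struct upcall_stack upcall_stack_t;

extern upcall_stack_t *current_upcall_stack;
extern int  run_on_upcall_stack(upcall_stack_t *s, void (*fn)(void)); /* nonzero if fn blocked */
extern void promote_to_thread(upcall_stack_t *s);   /* detach stack as an independent thread */
extern upcall_stack_t *allocate_upcall_stack(void);
extern void lfc_poll(void);                         /* signature assumed */

/* Invoked from the network-signal handler, on the interrupted thread's stack. */
void handle_network_signal(void)
{
    int blocked;

    /* Optimistically run the poll (and hence the SAP upcall) on the dedicated
     * upcall stack; no registers need to be restored because that stack is
     * known to be empty. */
    blocked = run_on_upcall_stack(current_upcall_stack, lfc_poll);

    if (blocked) {
        /* Slow path: the upcall blocked on a lock. Detach the upcall stack so
         * it can be scheduled as an ordinary thread, and set up a fresh stack
         * for the next asynchronous upcall. */
        promote_to_thread(current_upcall_stack);
        current_upcall_stack = allocate_upcall_stack();
    }
    /* Fast path: the upcall ran to completion; simply return from the signal
     * handler and resume the interrupted thread. */
}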

6.3 Stream Messages

Panda’s stream messages are implemented in the system module. The implemen- tation minimizes copying and allows pipelining of data transfers from application to application process. The key idea is illustrated in Figure 6.11. The data-transfer pipeline consists of four stages, which operate in parallel. In the first stage, Panda copies client data into LFC send packets. In the second stage, LFC transmits send packets to the destination NI. In the third stage, the receiving NI copies re- ceive packets to host memory. Finally, in the fourth stage, the receiving Panda client consumes data from receive packets in host memory. Messages larger than a single packet will benefit from the parallelism in this pipeline. (A stream of short messages also benefits from this pipelining, but the key feature of stream messages is that the same parallelism can be exploited within a single message.) Stream messages are implemented as follows. At the sending side, the system- module functions pan unicast() and pan multicast() are responsible for message fragmentation. They repeatedly allocate an LFC send packet, store a message header into this packet, and fill the remainder of the packet with data from the user’s I/O vector and the header stack. Unicast header formats are shown in Figure 6.12. The first packet of a message has a different header than the packets that follow. Both headers contain a count of piggybacked credits and a field that identifies the destination SAP. The credits 6.3 Stream Messages 131


Fig. 6.11. Application to application streaming with Panda’s stream messages. (1) Panda copies user data into a send packet. (2) LFC transmits a send packet. (3) LFC copies the packet to host memory. (4) The Panda client consumes data.

(Header for the first packet: piggybacked credits, header stack size, message size, SAP identifier. Header for subsequent packets: piggybacked credits, SAP identifier.)

Fig. 6.12. Panda’s message-header formats.

field is used by Panda's sliding-window flow-control scheme. The SAP identifier is used for reassembly. Panda does not interleave outgoing packets that belong to different messages. Consequently, senders need not add a message id to outgoing packets. The source identifier in the LFC header (see Figure 3.1) and the SAP identifier in the Panda header suffice to locate the message to which a packet belongs. The header of a message's first packet contains two extra fields: the size of the header stack contained in the packet and the total message size. The header-stack size is used to find the start of the data that follows the headers. The total message size is used to determine when all of a message's packets have arrived.

Figure 6.13 illustrates the receiver-side data structures. The receiver maintains an array of SAPs. Each SAP contains a pointer to the SAP's handler and a queue of pending stream messages. A stream message is considered pending when its first packet has arrived and the receiving process has not yet entirely consumed the stream message. A stream message is represented by a data structure that is created when the stream message's first packet arrives. This data structure contains an array which holds pointers to the stream message's constituent packets. Packets are entered into this array as they arrive. In Figure 6.13, one stream message is pending for SAP 0 and two stream messages are pending for SAP 3. The first stream message for SAP 3 consists of three packets. All packets but the last one are full.

Fig. 6.13. Receiver-side organization of Panda's stream messages.

When a packet arrives, lfc_upcall() searches the destination SAP's queue of pending stream messages. If it does not find the stream message to which this packet belongs, it creates a new stream message and appends it to the queue. (Since LFC delivers all unicast packets in-order, there is no need to store an offset or sequence number in the message headers.) If an incoming packet is not the first packet of its stream message, then lfc_upcall() will find the stream message in the SAP's queue of pending messages. The packet is then appended to the stream message's packet array.

When a stream message reaches the head of its SAP's queue, Panda dequeues the stream message and invokes the SAP handler, passing the stream message as an argument. The stream messages queued at a specific SAP are processed one at a time. (That is, a SAP's handler is not invoked again until the previous invocation has returned.) The function pan_msg_consume() copies data from the packets in a message's packet list to user buffers. Each time pan_msg_consume() has consumed a complete packet, it returns the packet to LFC. If a Panda client does not want to consume all of a message's data, it can either skip some bytes or discard the entire message without copying any data.
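The header layouts of Figure 6.12 and the receiver-side structures of Figure 6.13 could be declared roughly as follows. Field widths, the fixed packet-array bound, and all names are guesses made for illustration; they are not Panda's actual declarations.

/* Hedged sketch of Panda's message headers (Figure 6.12) and the receiver-side
 * reassembly structures (Figure 6.13). Sizes and names are illustrative. */
#include <stddef.h>

#define MAX_PACKETS_PER_MSG 64              /* assumed bound; the real structure may grow */

/* Header carried in every packet after the first one. */
typedef struct {
    unsigned short piggybacked_credits;     /* sliding-window flow control */
    unsigned short sap_id;                  /* destination SAP, used for reassembly */
} pan_hdr_t;

/* Header carried only in a message's first packet. */
typedef struct {
    unsigned short piggybacked_credits;
    unsigned short sap_id;
    unsigned int   header_stack_size;       /* locates the data that follows the headers */
    unsigned int   message_size;            /* detects when the last packet has arrived */
} pan_first_hdr_t;

/* A pending stream message: created when its first packet arrives. */
typedef struct stream_msg {
    struct stream_msg *next;                /* next pending message at this SAP */
    size_t total_size;                      /* from the first-packet header */
    size_t bytes_arrived;
    void  *packets[MAX_PACKETS_PER_MSG];    /* LFC receive packets, in arrival order */
    int    npackets;
} stream_msg_t;

/* A service access point: a handler plus a queue of pending stream messages. */
typedef struct {
    void (*handler)(stream_msg_t *msg);
    stream_msg_t *pending_head;             /* processed one at a time, in order */
    stream_msg_t *pending_tail;
} sap_t;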

6.4 Totally-Ordered Broadcasting

This section describes an efficient implementation of Panda's totally-ordered broadcast primitive. Totally-ordered broadcast is a powerful communication primitive that can be used to manage shared, replicated data. Panda's totally-ordered broadcast, for example, is used to implement method invocations for replicated shared objects in the Orca system (see Section 7.1). A broadcast is totally-ordered if all broadcast messages are received in a single order by all receivers and if this order agrees with the order in which senders sent the messages. Most totally-ordered broadcast protocols use a centralized component to implement the ordering constraint, and the protocol presented in this section is no exception. The protocol uses a central node, a sequencer, to order messages. The variant we describe uses LFC's fetch-and-add primitive to obtain sequence numbers.

The protocol is simple. Before sending a broadcast message, the sender just invokes lfc_fetch_and_add() to obtain a system-wide unique sequence number. This sequence number is attached to the message and then the message's packets are broadcast using LFC's broadcast primitive. All senders perform their fetch-and-add operations on a single fetch-and-add variable that acts as a shared sequence number. This variable is stored in a single NI's memory.

Receivers assemble the message in the same way they assemble unicast messages (see Section 6.3). The only difference is that a message is not delivered until all preceding messages have been delivered. This enforces the total ordering constraint.

This Get-Sequence-number-then-Broadcast (GSB) protocol was first developed for a transputer-based parallel machine [65]. The main difference with the original implementation is that with LFC, requests for a sequence number are handled entirely by the NI. This reduces the latency of such requests in two ways. First, the NI need not copy the request to host memory and the host need not copy a reply message to NI memory. This gain is measurable, since an LFC fetch-and-add costs less than an LFC (host-to-host) roundtrip, but small (19.8 µs versus 23.3 µs). The main gain is the reduction in the number of interrupts. The sequencer node does not expect a sequence-number request, so such requests are quite likely to be delivered by means of an interrupt rather than a poll. LFC's fetch-and-add primitive avoids this type of interrupt.
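In outline, the GSB protocol is only a few lines of code. The sketch below assumes signatures for lfc_fetch_and_add() and for an LFC broadcast routine (called lfc_broadcast() here), and it omits the buffer that holds out-of-order messages; it shows the structure of the protocol, not LFC's real interface.

/* Hedged sketch of the GSB totally-ordered broadcast. lfc_fetch_and_add() and
 * LFC's broadcast primitive exist, but the signatures, the sequencer
 * addressing, and the delivery bookkeeping shown here are simplified guesses. */
extern int  lfc_fetch_and_add(int ni, int var, int amount);   /* returns the old value */
extern void lfc_broadcast(void *data, unsigned size);          /* assumed name */
extern void deliver(void *data, unsigned size);                /* hand message to the client */

#define SEQUENCER_NI 0      /* NI that stores the shared fetch-and-add variable */
#define SEQ_VAR      0

typedef struct {
    int  seqno;             /* system-wide unique sequence number */
    char payload[1000];
} tob_msg_t;

static int next_to_deliver;

/* Sender: fetch a sequence number, tag the message, broadcast its packets. */
void tob_broadcast(tob_msg_t *msg, unsigned size)
{
    msg->seqno = lfc_fetch_and_add(SEQUENCER_NI, SEQ_VAR, 1);
    lfc_broadcast(msg, size);
}

/* Receiver: deliver in sequence-number order; buffering of out-of-order
 * messages is omitted for brevity. */
void tob_arrived(tob_msg_t *msg, unsigned size)
{
    if (msg->seqno == next_to_deliver) {
        deliver(msg->payload, size - (unsigned) sizeof(int));
        next_to_deliver++;
        /* ...then re-examine any buffered messages whose turn has come. */
    } else {
        /* buffer msg until all preceding messages have been delivered */
    }
}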

6.5 Performance

In this section we discuss Panda's performance. We measure the latency and throughput for the message-passing and broadcast modules. We do not present

(Plot: one-way latency in microseconds versus message size in bytes, for messages up to 1000 bytes, for Panda and LFC.)

Fig. 6.14. Panda's message-passing latency.

separate results for the RPC module, because RPCs are trivially composed of pairs of message-passing module messages. We compare the Panda measurements with the LFC measurements of Chapter 5, to assess the cost of Panda's higher-level functionality.

In Chapter 5 we measured latency and throughput using three receive methods: no-touch, read-only, and copy. Here we use only the copy method; this is the most common scenario for Panda clients and includes all overhead that Panda's stream-message abstraction adds to LFC's packet interface.

6.5.1 Performance of the Message-Passing Module

Figure 6.14 shows the one-way latency for Panda's message-passing module. For comparison, we also show the one-way latency (with copying at the receiver side) of LFC's unicast primitive. As shown, Panda adds a constant amount (approximately 5 µs) of overhead to LFC's latency. There are several reasons for this increase:

1. Locking. To ensure multithread-safe execution, Panda brackets all calls to LFC routines with lock/unlock pairs.

2. Message abstraction. At the sending side, Panda has to process an I/O vector before sending an LFC packet. At the receiving side, incoming packets have to be appended to a stream message before the receiver upcall can process the data. Panda also maintains several pointers and counters to keep track of the current packet and the current position in that packet.

(Plot: one-way throughput in Mbyte/second versus message size from 64 bytes to 1 Mbyte, for LFC and Panda.)

Fig. 6.15. Panda’s message-passing throughput.

3. Demultiplexing. Panda adds a header to each LFC packet. The header identifies the destination port and the stream message to which the packet belongs.

4. Flow control. Panda has to check and update its flow-control state for each outgoing and incoming packet.
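As an illustration of the first and third sources of overhead above, a simplified Panda-level send might look as follows: the call into LFC is bracketed by a lock, and a Panda header is prepended to the user data. The LFC routine names are hypothetical, and the POSIX mutex stands in for Panda's own locking.

/* Hedged sketch of lock bracketing and header prepending in a Panda send.
 * All names are illustrative; LFC's real send interface is not spelled this way. */
#include <string.h>
#include <pthread.h>

#define PACKET_SIZE 1024                 /* assumed LFC packet size */

typedef struct {
    unsigned short credits;              /* piggybacked flow-control credits */
    unsigned short sap_id;               /* destination SAP */
} pan_hdr_t;

static pthread_mutex_t panda_lock = PTHREAD_MUTEX_INITIALIZER;

extern void *lfc_alloc_send_packet(void);                            /* hypothetical */
extern void  lfc_unicast_packet(int dest, void *pkt, unsigned size); /* hypothetical */

int pan_send_small(int dest, int sap, const void *data, unsigned size)
{
    char      *pkt;
    pan_hdr_t *hdr;

    if (sizeof(pan_hdr_t) + size > PACKET_SIZE)
        return -1;                       /* larger messages are fragmented; omitted here */

    pthread_mutex_lock(&panda_lock);     /* overhead 1: multithread-safe access to LFC */

    pkt = lfc_alloc_send_packet();
    hdr = (pan_hdr_t *) pkt;
    hdr->credits = 0;                    /* filled in by the flow-control code */
    hdr->sap_id  = (unsigned short) sap;
    memcpy(pkt + sizeof *hdr, data, size);   /* overhead 3: the header takes packet space */

    lfc_unicast_packet(dest, pkt, (unsigned) sizeof *hdr + size);

    pthread_mutex_unlock(&panda_lock);
    return 0;
}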

For 1024-byte messages, the difference between Panda’s latency and LFC’s la- tency is larger than for smaller message sizes. For this message size, Panda needs two packets, whereas LFC needs only one; this is due to Panda’s header. Figure 6.15 shows the one-way throughput for Panda’s message-passing mod- ule and for LFC. Due to the overheads described above, Panda’s throughput for small messages is not as good as LFC’s throughput. In particular, sending and receiving the first packet of a stream message involves more work than sending and receiving subsequent packets. At the sending side, for example, we need to store the header stack into the first packet; at the receiving side, we have to create a stream message when the first packet arrives. For larger messages, these over- heads can be amortized over multiple packets. For this reason, Panda does not reach its peak throughput until a message consists of multiple LFC packets. For larger messages, Panda sometimes attains higher throughputs than LFC. This is due to cache effects that occur during the copying of data into send packets and out of receive packets. 136 Panda

(Plots: broadcast latency in microseconds versus message size up to 1000 bytes, on 16 and 64 processors, for Panda ordered, Panda unordered, and LFC unordered broadcast.)

Fig. 6.16. Panda’s broadcast latency.

6.5.2 Performance of the Broadcast Module

We measured broadcast performance using three different broadcast primitives:

1. LFC’s broadcast using the copy receive strategy (see Section 5.3).

2. Panda’s unordered broadcast. Panda provides an option to disable total or- dering. When this option is enabled, Panda does not tag broadcast mes- sages with a system-wide unique sequence number. That is, Panda skips the fetch-and-add operation that it normally performs to obtain such a sequence number.

3. Panda’s totally-ordered broadcast. This is the broadcast primitive described earlier in this chapter.

Figure 6.16 shows the latency for all three primitives, for message sizes up to 1 Kbyte and for 16 and 64 processors. The difference in latency for a null message between Panda’s unordered broadcast and LFC’s broadcast is 9 µs. As expected, adding total ordering increases the latency by a constant amount: the cost of a fetch-and-add operation. On 64 processors, the null latency difference between Panda’s unordered and ordered broadcasts is 23 µs. Figure 6.17 shows broadcast throughput for 16 and 64 processors. Adding total ordering reduces the throughput for small and medium-size messages. For larger messages, however, Panda reaches LFC’s throughput. Figure 6.16 and Figure 6.17 show only single-sender broadcast performance. With multiple senders, the performance of totally-ordered broadcasts may suffer 6.5 Performance 137

(Plots: broadcast throughput in Mbyte/second versus message size from 64 bytes to 1 Mbyte, on 16 and 64 processors, for LFC unordered, Panda unordered, and Panda ordered broadcast.)

Fig. 6.17. Panda's broadcast throughput.

from the centralized sequencer. In practice, however, this is rarely a problem. In many applications, there is at most one broadcasting process at any time. When processes do broadcast simultaneously, the overall broadcast rate is often low. In an Orca performance study [9], we measured the overhead of obtaining a sequence number from the sequencer. For 7 out of the 8 broadcasting applications considered, the time spent on fetching sequence numbers was less than 0.2% of the application's execution time. Only one application suffered from the centralized sequencer: for this application, a linear equation solver, the time to access the sequencer accounted for 4% of the application's execution time. All applications were run on FM/MC, which uses an NI-level fetch-and-add to obtain sequence numbers, just like LFC.

6.5.3 Other Performance Issues Some optimizations discussed in this chapter reduce costs that are architecture and operating system dependent. The cost of a thread switch, for example, depends on the amount of CPU state that has to be saved and on the way in which this state is saved. The early Panda implementations ran on SPARC processor architectures on which each (user-level) thread switch requires a trap to the operating system. This trap is needed to flush the register windows, a SPARC-specific cache of the top of the runtime stack, to memory [72, 79]. On other architectures, thread switches are cheaper. On the Pentium Pros used for the experiments in this thesis, a switch from one Panda thread to another costs 1.3 µs. This is still a considerable overhead to add to LFC’s one-way null latency, but it is not excessive. 138 Panda

Interrupts and signal processing are other sources of architecture- and operating-system-dependent overhead. Whereas thread switches are cheap on our experimental platform, interrupts and signals are expensive and should be avoided when possible (see Section 2.5).

6.6 Related Work

We discuss related work in five areas: portable message-passing libraries, polling and interrupts, upcall models and their implementation, stream messages, and totally-ordered broadcasting.

6.6.1 Portable Message-Passing Libraries PVM [137] and MPI [53] are the most widely used portable message-passing systems. Unlike Panda, these systems target application programmers. The main differences between these systems and Panda are that PVM and MPI are hard to use in a multithreaded environment, use receive downcalls instead of upcalls and do not support totally-ordered broadcast. PVM and MPI do not provide a multithreading abstraction. Both systems use blocking receive downcalls and do not provide a mechanism to notify an external thread scheduler that a thread has blocked. This is not a problem if the operating system provides kernel-scheduled threads and the client uses these kernel threads. If, on the other hand a client employs a thread package not supported by the op- erating system, or if the operating system is not aware of threads at all, then the use of blocking downcalls and the lack of a notification mechanism complicate the use of multithreaded systems on top of MPI and PVM. Multithreaded MPI and PVM clients can avoid the blocking-receive problem by using nonblocking variants of the receive calls, but this means that appli- cations will have to poll for messages, because PVM and MPI do not support asynchronous (i.e., interrupt-driven) message delivery. The same lack of asyn- chronous message delivery complicates the implementation of PPSs that require asynchronous delivery to operate correctly and efficiently. These PPSs are forced to poll (e.g., in a background thread), but finding the right polling rate is difficult. LFC does not provide receive downcalls: all packets are delivered through upcalls. Blocking receive downcalls can still be implemented on top of LFC, though. (Panda’s message-passing module provides a blocking receive.) When a blocking receive is implemented outside the message-passing system, control can be passed to the thread scheduler when a thread blocks on a receive. Neither PVM nor MPI provides a totally-ordered broadcast. While a totally- ordered broadcast can be implemented using PVM and MPI primitives, such 6.6 Related Work 139 an implementation will be slower than an implementation that supports ordered broadcasting at the lowest layers of the system. The Nexus communication library [54] has the same goals as Panda; Nexus is intended to be used as a compiler target or as part of a larger runtime system. Nexus provides an asynchronous, one-way message-passing primitive called a Remote Service Request (RSR). The destination of an RSR is named by means of a global pointer, which is a system-wide unique name for a communication endpoint. Endpoints can be created dynamically and references to them can be passed around in messages. Unlike Panda, Nexus does not provide a broadcast primitive. Horus [144] and its precursor Isis [22] were designed to simplify the con- struction of distributed programs. Horus focuses on message-ordering and fault- tolerance issues. Panda supports parallel rather than distributed programming, provides only one type of ordered broadcast, and does not address fault tolerance. Like Panda, Horus can be configured in different ways. Horus, however, is much more flexible in that it allows protocol stacks to be specified at run time rather than at compile time.

6.6.2 Polling and Interrupts In their remote-queueing model [28], Brewer et al. focus on the benefits of polling. They recognize, however, that interrupts are indispensable and combine polling with selective interrupts. Interrupts are generated only for specific messages (e.g., operating-system messages) or under specific circumstances (e.g., network over- flow). In contrast, our integrated thread package chooses between polling and interrupts based on the application’s state (idle or not). CRL [74] is a DSM system that illustrates the need to combine polling and interrupts in a single program. Operations on shared data are bracketed by calls to the CRL library. During or between such operations, a CRL application may not enter the library for a long time, so unless the responsibility for polling is put on the user program, protocol requests can be handled in a timely manner only by using interrupts. CRL therefore uses interrupts to deliver protocol request mes- sages; polling is used to receive protocol reply messages. This type of behavior occurs not only in CRL, but also in other DSMs such as CVM [114] and Orca [9]. It is exactly the kind of behavior that Panda deals with transparently. Foster et al. describe the problems involved in implementing Nexus’s message delivery mechanism (RSRs) on different operating systems and hardware [54]. Available mechanisms for polling and interrupts, and their costs, vary widely across different systems. Moreover, these mechanisms are rarely well integrated with the available multithreading primitives. For example, a blocking read on a socket may block the entire process rather than just the calling thread. We be- 140 Panda lieve that our system, in which we have full control over thread scheduling and communication, achieves the desired level of integration.

6.6.3 Upcall Models

We have described three message-handling models in which each message results in a handler invocation at the receiving side. Nexus uses two of these models to dispatch RSR handlers. Nexus either creates a new thread to run the handler or else runs the handler in a preallocated thread. The first case corresponds to what we call the popup threads model, the second to the active-messages model. In the second case, the thread is not allowed to block. A model closely related to popup threads is the optimistic active-messages model [150]. This model removes some of the restrictions of the basic active- messages model. It extends the class of handlers that can be executed as an active-message handler —i.e., without creating a separate thread— with handlers that can safely be aborted and restarted when they block. A stub compiler prepro- cesses all handler code. When the stub compiler detects that a handler performs a potentially blocking action, it generates code that aborts the handler when it blocks. When aborted, the handler is re-run in the context of a first-class thread (which is created on the fly). Thread-management costs are thus incurred only if a handler blocks; otherwise the handler runs as efficiently as a normal active- message handler. Optimistic active messages are less powerful than popup threads. First, opti- mistic active messages require that all potentially blocking actions be recognizable to the stub compiler. Popup threads, in contrast, can be implemented without com- piler support and do not rely on programmer annotations to indicate what parts of a handler may block. Second, an optimistic active-messages handler can be re- run safely only if it does not modify any global state before it blocks; otherwise this state will be modified twice. The programmer is thus forced to avoid or undo changes to global state until it is known that the handler cannot block any more. The stack-splitting and return-address modification techniques we use to in- voke message handlers efficiently are similar to the techniques used by Lazy Threads [58]. Lazy Threads provide an efficient fork primitive that optimistically runs a child on its parent’s stack. When the child suspends, its return address is modified to reflect the new state. Also, future children will be allocated on differ- ent stacks. In our case, a thread that finds a message —either because it polled or because it received a signal— needs to fork a message handler for this message. Unlike Lazy Threads, however, the parent thread (the thread that polled or that was interrupted) does not wait for its child (the message handler) to terminate. Also, we run all children, including the first child, on their own stack. 6.6 Related Work 141

6.6.4 Stream Messages Stream messages were introduced in Illinois Fast Messages (version 2.0). Stream messages in Fast Messages differ in several ways from stream messages in Panda. The main difference is that Fast Messages requires that each incoming message be consumed in the context of the message’s handler. This implies that the user needs to copy the message if the message arrives at an inconvenient moment. LFC allows a handler to put aside a stream message without copying that message. This difference results from the way receive-packet management is imple- mented in Fast Messages. Fast Messages appends incoming packets to a circular queue of packet buffers in host memory. For each packet in this queue, Fast Mes- sages invokes a handler function. When the handler function returns, Fast Mes- sages recycles the buffer. This scheme is simple, because the host never needs to communicate the identities of free buffers to the NI. Both the host and the NI maintain a local index into the queue: the host’s index indicates which packet to consume next and the NI’s counter indicates where to store the next packet. LFC explicitly passes the identities of free buffers to the NI (see Section 3.6), which has the advantage that the host can replace buffers. This is exactly what happens when lfc upcall() does not allow LFC to recycle a packet immediately. As a result, packets can be put aside without copying, which is convenient and efficient. Clients should be aware, however, that packet buffers reside in pinned memory and are therefore a relatively precious resource. If a client continues to buffer packets and does not return them, then LFC will continue to add new packet buffers to its receive ring and will eventually run out of pinnable memory. To avoid this, clients should keep track of their resource usage and take appropriate measures, either by implementing flow control or by copying packets when the number of buffered packets exceeds a threshold.
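One way for a client to implement the threshold policy suggested above is to count how many LFC receive buffers it is holding and to fall back to copying once that count exceeds a limit. The lfc_packet_release() name and the bookkeeping below are illustrative, not LFC's actual interface.

/* Hedged sketch of a client-side policy that copies packets once too many
 * pinned LFC receive buffers are being held. Names are illustrative. */
#include <stdlib.h>
#include <string.h>

#define MAX_HELD_PACKETS 128                 /* illustrative threshold */

extern void lfc_packet_release(void *pkt);   /* hypothetical: return a buffer to LFC */

static int held_packets;

/* Called by a client upcall that wants to keep a packet for later processing.
 * Returns the buffer to hold on to and records whether it was copied. */
void *retain_packet(void *pkt, unsigned size, int *copied)
{
    if (held_packets < MAX_HELD_PACKETS) {
        held_packets++;                      /* cheap path: keep the pinned LFC buffer */
        *copied = 0;
        return pkt;
    }
    void *copy = malloc(size);               /* slow path: copy, then return the buffer */
    memcpy(copy, pkt, size);
    lfc_packet_release(pkt);
    *copied = 1;
    return copy;
}

/* Called when the client has finally consumed a retained packet. */
void release_retained(void *buf, int copied)
{
    if (copied) {
        free(buf);
    } else {
        lfc_packet_release(buf);
        held_packets--;
    }
}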

6.6.5 Totally-Ordered Broadcast

Totally-ordered broadcast is a well-known concept in distributed computing, where it is used to simplify the construction of distributed programs. However, totally-ordered broadcast is not used much in parallel programming. As we shall see in Section 7.1, though, the Orca shared object system uses this primitive in an effective way to update replicated objects.

Many different totally-ordered broadcast protocols are described in the literature. Heinzle et al. describe the GSB protocol used by Panda on LFC [65]. The main difference between GSB and Panda's implementation is that Panda handles (through LFC) sequence number requests on the programmable NI; this reduces the number of network interrupts.

Kaashoek describes two other protocols, Point-to-point/Broadcast (PB) and

Broadcast/Broadcast (BB), which also rely on a central, dedicated sequencer node [75]. In PB, the sender sends its message to the sequencer. The sequencer tags the message with a sequence number and broadcasts the tagged message to all destinations. In BB, the sender broadcasts its message to all destinations and the sequencer. When the sequencer receives the message, it assigns a sequence number to the message and then broadcasts a small message that identifies the message and contains the sequence number. Receivers are not allowed to deliver a message until they have received its sequence number and until all preceding messages have been delivered.

The performance characteristics of GSB, PB, and BB depend on the type of network they run on. PB and BB were developed on an Ethernet, where a broadcast has the same cost as a point-to-point message. On switched networks like Myrinet, however, a (spanning-tree) broadcast is much more expensive than a point-to-point message. Consequently, the performance of large messages will be dominated by the cost of broadcasting the data, irrespective of the ordering mechanism.

For small messages, BB's separate sequence-number broadcast is unattractive, because the worst-case latency of a totally-ordered broadcast becomes equal to twice the latency of an unordered broadcast (one for the data and one for the sequence number). Incidentally, BB was also considered unattractive for small messages in an Ethernet setting, but for a different reason: BB generates more interrupts than PB.

PB is much more attractive for small messages, because it adds only a single point-to-point message to the broadcast of the data. GSB uses two messages to obtain a sequence number. For large messages, however, PB has the disadvantage that all data packets must travel to and through the sequencer, thus putting more load on the network and on the sequencer. PB also puts more load on the sequencer than GSB if many processors try to broadcast simultaneously. With PB, the occupancy of the sequencer will be high, because the sequencer must broadcast every data packet. With GSB, the occupancy will be lower, because the sequencer need only respond to fetch-and-add requests. Finally, if all broadcasts originate from the sequencer, there are fewer opportunities to piggyback acknowledgements on multicast traffic.

6.7 Summary

This chapter described the implementation of Panda on LFC. Panda extends LFC with multithreading, messages of arbitrary length, demultiplexing, remote procedure call, and totally-ordered broadcast. The efficient implementation of this functionality is enabled by LFC's performance and interface and by novel techniques employed by Panda.

LFC allows packets to be delivered through polling and interrupts. Both mechanisms are useful, but manually switching between them is tedious and error-prone. Panda hides the complexity of managing both mechanisms. Panda clients need not poll, because Panda's threading subsystem, OpenThreads, transparently switches to polling when all client threads are idle.

All messages are delivered using asynchronous upcalls. Panda uses the single-threaded upcall model. This model imposes fewer restrictions than the active-messages model, but more than popup threads. Upcalls are dispatched efficiently using thread inlining and lazy thread-creation.

Panda implements stream messages, an efficient message abstraction that enables end-to-end pipelining of data transfers. Panda's stream messages separate notification and the consumption of message data, which allows clients to defer message processing.

Finally, we discussed a simple but efficient implementation of totally-ordered broadcasting. This implementation is enabled by LFC's efficient multicast primitive and its fetch-and-add primitive.

Chapter 7

Parallel-Programming Systems

This chapter focuses on the implementation of parallel-programming systems (PPSs) on LFC and Panda. We will show that the communication mechanisms developed in the previous chapters can be applied effectively to a variety of PPSs. We consider four PPSs:

• Orca [9], an object-based DSM

• Manta, a parallel Java system [99]

• CRL [74], a region-based DSM

• MPICH [61], an implementation of the MPI message-passing standard [53]

These systems offer different parallel-programming models and their implemen- tations stress different parts of the underlying communication systems. This chapter is organized as follows. Section 7.1 to Section 7.4 describe the programming model, implementation, and performance of, respectively, Orca, Manta, CRL, and MPI. Section 7.5 discusses related work. Section 7.6 compares the different systems and summarizes the chapter.

7.1 Orca

Orca is a PPS based on the shared-object programming model. This section describes this programming model, gives an overview of the Orca implementation, and then zooms in on two important implementation issues: operation transfer and operation execution. Finally, it discusses the performance of Orca.


7.1.1 Programming Model

In shared-memory and page-based DSM systems [78, 82, 92, 155], processes communicate by reading and writing memory words. To synchronize processes, the programmer must use mutual-exclusion primitives designed for shared mem- ory, such as locks and semaphores. Orca’s programming model, on the other hand, is based on high-level operations on shared data structures and on implicit synchronization, which is integrated into the model. Orca programs encapsulate shared data in objects, which are manipulated through operations of an abstract data type. An object may contain any number of internal variables and arbitrarily complex data structures (e.g., lists and graphs). A key idea in Orca’s model is to make each operation on an object atomic, without requiring the programmer to use locks. All operations on an object are executed without interfering with each other. Each operation is applied to a single object, but within this object the operation can execute arbitrarily complex code using the object’s data. Objects in Orca are passive: objects do not contain threads that wait for messages. Parallel execution is expressed through dynamically created processes. The shared-object model resembles the use of monitors. Both shared ob- jects and monitors are based on abstract data types and for both models mutual- exclusion synchronization is done by the system instead of the programmer. For condition synchronization, however, Orca uses a higher-level mechanism [12], based on Dijkstra’s guarded commands [44]. A guard is a boolean expression that must be satisfied before the operation can begin. An operation can have multiple guards, each of which has an associated sequence of statements. If the guards of an operation are all false, the process that invoked the operation is suspended. As soon as one or more guards become true, one true guard is selected nondeter- ministically and its sequence of statements is executed. This mechanism avoids the use of explicit wait and signal calls that are used by monitors to suspend and resume processes, simplifying programming. Figure 7.1 gives an example definition of an object type Int. The definition consists of an integer instance variable x and two operations, inc() and await(). Operation inc() increments instance variable x and operation await() blocks until x has reached at least the value of parameter v. Instances of type Int are initialized by the initialization block that sets instance variable x to zero. Figure 7.2 shows how Orca processes are defined and created and how objects are shared between processes. At application-startup time, the Orca runtime sys- tem (RTS) creates one instance of process type OrcaMain() on processor 0. In this example, OrcaMain() creates 15 other processes —instances of process type Worker()— on processors 1 to 15. To each of these processes OrcaMain() passes a reference to object counter. When all processes have been forked, counter, 7.1 Orca 147

object implementation Int;
    x: integer;

    operation inc();
    begin
        x := x + 1;
    end;

    operation await(v: integer);
    begin
        guard x >= v do od;
    end;

begin
    x := 0;
end;
end;

Fig. 7.1. Definition of an Orca object type.

process Worker(counter: shared Int);
begin
    counter$inc();
    ...
end;

process OrcaMain();
    counter: Int;
begin
    for i in 1..15 do
        fork Worker(counter) on i;
    od;
    counter$await(15);
    ...
end;

Fig. 7.2. Orca process creation.

which was originally local to OrcaMain(), is shared between OrcaMain() and all Worker() processes. OrcaMain() waits until all Worker() processes have indicated their presence by incrementing counter.

(Figure 7.3 shows a layered stack, from top to bottom: compiled Orca program; Orca runtime system; Panda and OpenThreads; LFC; Myrinet and Linux.)

Fig. 7.3. The structure of the Orca shared object system.

7.1.2 Implementation Overview

Figure 7.3 shows the software components that are used during the execution of an Orca program. The Orca compiler translates the modules that make up an Orca program. For portability, the compiler generates ANSI C. Since the Orca compiler performs many optimizations such as common-subexpression elimination, strength reduction, and code motion, the C code it generates often performs as well as equivalent, hand-coded C programs. The C code is compiled to machine code by a platform-dependent C compiler. The resulting object files are linked with the Orca RTS, Panda, LFC, and other support libraries. Below, we describe how the compiler and the RTS support the efficient execution of Orca programs.

The Compiler Besides translating sequential Orca code, the compiler supports an efficient im- plementation of operations on shared objects in three ways. First, the compiler distinguishes between read and write operations. Read operations do not mod- ify the state of the object they operate on. Operation await() in Figure 7.1, for example, is a read operation. If the compiler cannot tell if an operation is a read operation, then it marks the operation as a write operation. Second, the compiler tries to determine the relative frequency of read and write operations [11]. The resulting estimates are passed to the RTS which uses them to determine an appropriate replication strategy for the object. For example, if the compiler estimates that an object is read much more frequently than it is written, then the RTS will replicate the object, because read operations on a replicated object do not require communication. The compiler’s estimates are used only as 7.1 Orca 149 a first guess. The RTS maintains usage statistics and may later decide to revise a previous decision [87]. Third, the compiler generates operation-specific marshaling code. Marshaling is discussed in Section 7.1.3.

The Runtime System The Orca RTS implements process creation and termination, shared-object man- agement, and operations on shared objects. Process creation occurs infrequently; most programs create, at initialization time, as many processes as the number of processors. Each Orca process is implemented as a Panda thread. To create a process in response to an application-level FORK call, the RTS broadcasts a fork message. Upon receiving this message, the processor on which the process is to be created (the forkee) issues an RPC back to the forking processor. This RPC is used to fetch copies of any shared objects that need to be stored on the forkee’s processor. When the forkee receives the RPC’s reply, it creates a Panda thread that executes the code for the new Orca process. The forker uses a totally-ordered broadcast message rather than a point-to-point message to initiate the fork. Broad- casting forks allows all processors to keep track of the number and type of Orca processes; this information is used during object migration decisions. The RTS implements two object-management strategies. The simplest strat- egy is to store a single copy of an object in a single processor’s memory. To decide in which processor’s memory the object must be stored, the RTS maintains operation statistics for each shared object. The RTS uses these statistics and the compiler-generated estimates to migrate each object to the processor that accesses the object most frequently [87]. The second strategy is to replicate an object in the memories of all processors that can access the object. In this case, the main problem is to keep the replicas in a consistent state. Orca guarantees a sequentially consistent [86] view of shared objects. Among others, sequential consistency requires that all processes agree on the order of writes to shared objects. To achieve this, all write operations are executed by means of a totally-ordered broadcast, which leads naturally to a single order for all writes. Fekete et al. give detailed formal description of correct implementation strategies for Orca’s memory model [52]. Most programs create only a small number of shared objects. Using the compiler-generated hints and its runtime statistics, the RTS usually decides quickly and correctly on which processor(s) it must store each object [87]. For these rea- sons, object-management actions have not been optimized: replicating or migrat- ing an object involves a totally-ordered broadcast and an RPC. Operations are executed in one of three ways: through a local procedure call, through a remote procedure call, or through a totally-ordered broadcast. Figure 7.4 150 Parallel-Programming Systems

(Decision tree: is the object replicated? If yes, a read operation uses a local procedure call and a write operation uses a totally-ordered broadcast. If no, an operation on a local object uses a local procedure call and an operation on a remote object uses a remote procedure call.)

Fig. 7.4. Decision process for executing an Orca operation.

shows how one of these methods is selected. A read operation on a replicated object is executed locally, without communication. This is, of course, the purpose of replication: to avoid communication. Write operations on replicated objects require a totally-ordered broadcast. For operations on nonreplicated objects the RTS does not distinguish between read and write operations, but considers only the object's location. If the object is stored in the memory of the processor that invokes the operation, then the operation is executed by means of a local procedure call; otherwise a remote procedure call is used.

The description so far is correct only for operations that have at most one guard. With multiple guards, the compiler does not know in advance whether the operation is a read or write operation: this depends on the guard alternative that is selected. The compiler could conservatively classify operations that have at least one write alternative as write operations. Instead, however, the compiler classifies individual alternatives as read or write alternatives. When the RTS executes an operation, it first tries to execute a read alternative, because for replicated objects reads are less expensive than writes. The write alternatives are tried only if all read alternatives fail. If the object is replicated, this involves a broadcast.

Operations that require communication are executed by means of function shipping. Instead of moving the object to the invoker's processor, the RTS moves the invocation and its parameters to all processors that store the object. This is done using an RPC or a totally-ordered broadcast, as explained above. Since all processors run the same binary program, the code to execute an operation need not be transported from one processor to another; a small operation identifier suffices. In addition to this identifier the RTS marshals the object identity and the operation's parameters. In the case of an operation on a nonreplicated object, this information is stored in the RPC's request message. When the request arrives at the processor that stores the object, this processor's RTS executes the operation and sends a reply that contains the operation's output parameters and return value. In the case of a write operation on a replicated object, the invoker's RTS broadcasts the operation identifier and the operation parameters to all processors that

Since the RTS stores each replicated object on all processors that reference it, the invoker's processor always has a copy of the object. All processors that have a replica perform the operation when they receive the broadcast message; other processors discard the message. On the invoker's machine, the RTS additionally passes the operation's output parameters and return value to the (blocked) invoker.

type Person =
    record
        age: integer;
        weight: integer;
    end;

type PersonList = array[integer] of Person;

Fig. 7.5. Orca definition of an array of records.
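Returning to the decision process of Figure 7.4, the following C sketch summarizes how a runtime system might select an execution method for an operation. The type and function names are illustrative only; they are not taken from the Orca RTS.

#include <stdbool.h>

typedef enum { LOCAL_CALL, REMOTE_CALL, ORDERED_BCAST } exec_method_t;

/* Illustrative object descriptor; the real RTS keeps much more state. */
typedef struct {
    bool replicated;   /* replicated on all processors that reference it? */
    int  home;         /* processor holding the single copy, if not       */
} object_t;

/* Select an execution method, following the decision tree of Figure 7.4. */
static exec_method_t
select_method(const object_t *obj, bool is_read, int my_cpu)
{
    if (obj->replicated) {
        /* Reads on replicated objects are local; writes are broadcast in
         * total order so that all replicas apply them in the same order. */
        return is_read ? LOCAL_CALL : ORDERED_BCAST;
    }
    /* Nonreplicated objects: only the object's location matters. */
    return obj->home == my_cpu ? LOCAL_CALL : REMOTE_CALL;
}

The actual RTS additionally distinguishes read and write guard alternatives, trying the cheaper read alternatives first, as described above.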

7.1.3 Efficient Operation Transfer This section describes how the Orca RTS, the Orca compiler, Panda, and LFC cooperate to transfer operation invocations as efficiently as possible.

Compiler and Runtime Support for Marshaling Operations In earlier implementations of Orca, all marshaling of operations was performed by the RTS. To marshal an operation, the RTS needed the following information:

- the number of parameters
- the parameter modes (IN, OUT, IN OUT, or SHARED)
- the parameter types

This information was stored in a recursive data structure in which each component type of a complex type had its own type descriptor node. During operation execution, the RTS made two passes over this data structure: one to find the total size of the data to be marshaled and one to copy the data into a message buffer. For an array of records such as defined in Figure 7.5, for example, the RTS would inspect each record's type descriptor to determine the record's size, even if all records had the same, statically known size.

The current compiler determines object sizes at compile time whenever possible. (Orca supports dynamic arrays and a built-in graph data type, so object sizes cannot always be computed statically.) For each operation defined as part of an object type, the compiler generates operation-specific marshaling and unmarshaling routines, which are invoked by the Orca RTS. In some cases, these routines call back into the RTS to marshal dynamic data structures and object metastate.

Figure 7.6 shows the code generated to marshal and unmarshal the Orca type defined in Figure 7.5; this code has been edited and slightly simplified for readability. The figure shows only the routines that are used to construct and read an operation request message. Another triple of routines is generated for RPC reply messages that hold OUT parameters and return values.

The first routine, sizeof_iovec_array_record(), computes the number of pointers in the I/O vector and is used by the RTS to allocate a sufficiently large I/O vector. (The RTS caches I/O vectors, but has to check if there is a cached vector that can hold as many pointers as needed.) The second routine, fill_iovec_array_record(), generates a Panda I/O vector that contains pointers to the data items that are to be marshaled. If the array is empty, the routine adds only a pointer to the array descriptor; otherwise it adds a pointer to the array descriptor and a pointer to the array data. To transmit the operation, the RTS stores its header(s) in a header stack and passes both the I/O vector and the header stack to one of Panda's send routines. Panda copies the data items referenced by the I/O vector into LFC send packets. At the receiving side, Panda creates a stream message and passes this message to the Orca RTS. The RTS looks up the target object and the descriptor for the operation that is to be invoked, and then invokes unmarshal_array_record(), the third routine in Figure 7.6. This operation-specific routine first unmarshals the array descriptor and then decides if it needs to unmarshal any data.
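The following sketch shows how the RTS might drive these generated routines when it builds a request message. The calls pan_alloc_iovec() and pan_send_request() are hypothetical placeholders for the Panda calls involved; the types are those used in Figure 7.6.

/* Hypothetical driver for the generated routines of Figure 7.6. */
void send_array_operation(PersonListDesc *arg)
{
    int n = sizeof_iovec_array_record(arg);          /* entries needed     */
    pan_iovec_p iov = pan_alloc_iovec(n);            /* hypothetical call  */
    pan_iovec_p end = fill_iovec_array_record(iov, arg);

    /* The real RTS would push its headers (object and operation identifier)
     * on a header stack and pass both the headers and the I/O vector to a
     * Panda send routine; pan_send_request() stands in for that step. */
    pan_send_request(iov, (int)(end - iov));
}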

Data Transfers Above, we discussed the Orca part of operation transfers. This section discusses the entire path, through all communication layers. All data transfers involved in an operation on a remote object are shown in Figure 7.7. At the sending side, Panda copies (using programmed I/O) data referenced by the I/O vector directly from user data structures into LFC send packets. The second stage in the data transfer pipeline consists of moving LFC send packets to the destination NI. In the third stage, LFC uses DMA to move network packets from NI memory to host memory. Panda organizes the packets in host memory into a stream message and passes this message to the Orca RTS. In the fourth stage, the RTS allocates space to hold the data stored in the message and unmarshals (i.e., copies) the contents of the message into this space. All four stages operate concurrently. As soon as Panda has filled a packet, for example, LFC transmits this packet, while Panda starts filling the next packet. At the sending side, one unnecessary data transfer occurs. Since LFC uses 7.1 Orca 153

int sizeof_iovec_array_record(PersonListDesc *a)
{
    if (a->a_sz > 0) {
        return 2;    /* pointer to descriptor and to array data */
    }
    return 1;        /* pointer to descriptor; no data (empty array) */
}

pan_iovec_p fill_iovec_array_record(pan_iovec_p iov, PersonListDesc *a)
{
    iov->data = a;                          /* add pointer to the array descriptor */
    iov->len = sizeof(*a);
    iov++;
    if (a->a_sz > 0) {                      /* add pointer to array data */
        iov->data = (Person *)a->a_data;
        iov->len = a->a_sz * sizeof(Person);
        iov++;
    }
    return iov;
}

void unmarshal_array_record(pan_msg_p msg, PersonListDesc *a)
{
    Person *r;

    pan_msg_consume(msg, a, sizeof(*a));    /* unmarshal array descriptor */
    if (a->a_sz > 0) {
        r = malloc(a->a_sz * sizeof(Person));
        a->a_data = r;
        pan_msg_consume(msg, r, a->a_sz * sizeof(Person));
    } else {
        a->a_sz = 0;
        a->a_data = 0;
    }
}

Fig. 7.6. Specialized marshaling code generated by the Orca compiler.

Data transfers: (1) marshaling of Orca data into send packets; (2) data transmission; (3) DMA transfer from NI to host memory; (4) unmarshaling of the Panda stream message. The path runs from the sender's Orca data structures through the Orca runtime (I/O vector), Panda (message), and LFC (send and receive packets in NI and host memory) to the receiver's Orca data structures.

Fig. 7.7. Data transfer paths in Orca. programmed I/O to move data into send packets, the data crosses the memory bus twice: once from host memory to the processor registers and from those registers to NI memory. By using DMA instead of PIO, the data would cross the memory bus only once. The data in Figure 2.2 suggests that for messages of 1 Kbyte and larger DMA will be faster than PIO. The use of DMA also frees the processor to do other work while the transfer progresses. The same asynchrony, however, also implies that a mechanism is needed to test if a transfer has completed. This requires communication between the host and the NI. Either the host must poll a location in NI memory, which slows down the data transfer, or the NI must signal the end of the data transfer by means of a small DMA transfer to host memory, which increases the occupancy of both the NI processor and the DMA engine. Another problem with DMA is that it is necessary to pin the pages containing the Orca data structures or to set up a dedicated DMA area. Unless the Orca data structures can be stored in the DMA area, the latter approach requires a copy to the DMA area. On-demand pinning of Orca data is feasible, but complex. To avoid pinning and unpinning on every transfer, a caching scheme such as VMMC- 2’s UTLB is needed (see Section 2.3). For small transfers, such a scheme is less efficient than programmed I/O. At the receiving side, the optimal scenario would be for the NI to transfer data from the network packets in NI memory to the data buffers allocated by the RTS (i.e., to avoid the use of host receive packets). In practice, this is difficult. First, the NI has to know to which message each incoming data packet belongs. This implies that the NI has to maintain state for each message and look up this state for 7.1 Orca 155


Fig. 7.8. Orca with popup threads.

each incoming packet. Second, the NI has to find a sufficiently large destination buffer in host memory into which the data can be copied. Third, after the data has been copied, the NI has to notify the receiving process. Unless the host and the NI perform a handshake to agree upon the destination buffer, this notification cannot be merged with the data transfer as in LFC. A handshake, however, increases latency.

Summarizing, eliminating all copies is difficult and it is doubtful whether doing so will significantly improve performance. We therefore decided to use a potentially slower, but simpler data path.

7.1.4 Efficient Implementation of Guarded Operations This section describes an efficient implementation of Orca’s guarded operations on top of Panda’s upcall model. When Panda receives an operation request, it dispatches an upcall to the Orca RTS. It is not obvious how the RTS should execute the requested operation in the context of this upcall. The problem is that the operation may block on a guard. Panda’s upcall model, however, forbids message handlers to block and wait for incoming messages. Such messages may have to be received and processed to make a guard evaluate to true. To solve this problem, an earlier version of the Orca RTS implemented its own popup threads. The structure of this RTS, which we call RTS-threads, is shown in Figure 7.8. In RTS-threads, Panda’s upcall thread does not execute incoming operation requests, but merely stores them in one of two queues. RPC requests for operations on nonreplicated objects are stored in the RPC queue and broadcast messages for write operations on replicated objects are stored in the broadcast queue. These queues are emptied by a pool of RTS-level server threads. When all server threads are blocked, the RTS extends the thread pool with new server 156 Parallel-Programming Systems threads. Since the server threads do not listen to the network, they can safely block on a guard when they process an incoming operation request from the RPC queue. (This blocking is implemented by means of condition variables.) Also, it is now safe to perform synchronous communication while processing a broadcast message. This type of nested communication occurs during the creation of Orca processes and during object migration. RTS-threads uses only one thread to service the broadcast queue. With mul- tiple threads, incoming (totally-ordered) broadcast messages could be processed in a different order than they were received. Using only a single thread, however, implies (once again) that, without extra measures, write operations on replicated objects cannot block. Besides this ordering and blocking problem, RTS-threads suffers from prob- lems related to the use of popup threads (see Section 6.2.3): increased thread switching and increased memory consumption. Processing an incoming operation request always requires at least one thread switch (from Panda’s upcall to a server thread). In addition, when many incoming operation requests block, RTS-threads is forced to allocate many thread stacks, which wastes memory. Finally, when some operation modifies an object that many operations are blocked on, RTS- threads will signal all threads associated with these operations. These threads will then re-evaluate the guard they were blocked on, perhaps only to find that they need to block again. This leads to excessive thread switching. To solve these problems, we restructured the RTS so that all operation requests can be processed directly by Panda’s upcall, without the use of server threads. The new structure is shown in Figure 7.9. The new RTS employs only one RTS server thread (not shown), which is used only during actions that occur infrequently, such as process creation and object migration. In these cases, it is necessary to perform an RPC (i.e., synchronous communication) in response to an incoming broadcast message. This RPC is performed by the RTS server thread, just as in RTS-threads. Blocking on guards is handled by exploiting the observation that Orca oper- ations can block only at the very beginning. 
A blocked operation can therefore always be represented by an object identifier, an operation identifier, and the op- eration’s parameters. This invocation information is available to the RTS when it invokes an operation. When an operation blocks, the compiler-generated code for that operation returns an error status to the RTS. Instead of blocking the calling thread on a condition variable, as in RTS-threads, the RTS creates a continuation. In this continuation, the RTS stores the invocation information and the name of an RTS function that, given the invocation information, can resume the blocked in- vocation. Different resume functions are used, depending on the way the original invocation took place. The resume function for an RPC invocation, for example, differs from the resume function for a broadcast invocation, because it needs to send a reply message to the invoking processor. 7.1 Orca 157


Fig. 7.9. Orca with single-threaded upcalls.

cont_init(contqueue, lock)
cont_clear(contqueue)
state_ptr = cont_alloc(contqueue, size, contfunc)
cont_save(state_ptr)
cont_resume(contqueue)

Fig. 7.10. The continuations interface.

Each shared object has a continuation queue in which the RTS stores continuations for blocked operations. When the object is later modified, the modifying thread walks the object's continuation queue and calls each continuation's resume function. If an operation fails again, its continuation remains on the queue; otherwise it is destroyed.

Figure 7.10 shows the interface to the continuation mechanism. Queues of continuations resemble condition variables, which are essentially queues of thread descriptors. This similarity makes replacing condition variables with continuations quite easy. Cont_init() initializes a continuation queue; initially, the queue is empty. Like a condition variable, each continuation queue has an associated lock (lock) that ensures that accesses to the queue are atomic. Cont_clear() destroys a continuation queue. Cont_alloc() heap-allocates a continuation structure and associates it with a continuation queue. (Continuations cannot be allocated on the stack, because the stack is reused.) Cont_alloc() returns a pointer to a buffer of size bytes in which the client saves its state. After saving its state, the client calls cont_save(), which appends the continuation to the queue. Together, cont_alloc() and cont_save() correspond to a wait operation on a condition variable. In the case of condition variables, however, no separate allocation call is needed, because the system knows what state to save: the client's thread stack. Finally, cont_resume() resembles a broadcast on a condition variable; it traverses the queue and resumes all continuations.

Representing blocked operations by continuations instead of blocked threads has three advantages. (A sketch of how this interface might be used follows the list below.)

1. Continuations consume very little memory. There is no associated stack, just the invocation information.

2. Resuming a blocked invocation does not involve any signaling and switch- ing to other threads, but only plain procedure calls.

3. Continuations are portable across communication architectures. With ap- propriate system support (such as the fast handler dispatch described in Section 6.2.3), popup threads can be implemented more efficiently than in RTS-threads. Continuations do not require such support and they avoid some of the disadvantages of popup threads.
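As announced above, here is a minimal sketch of how a runtime system might use the continuation interface of Figure 7.10 when a guarded operation blocks and is later resumed. The invocation record, the queue type cont_queue_t, the helpers try_operation() and send_rpc_reply(), and the resume function's return convention are assumptions of this sketch, not Orca RTS code.

/* Assumed prototypes for the interface of Figure 7.10. */
typedef struct cont_queue cont_queue_t;
extern void *cont_alloc(cont_queue_t *q, int size, int (*resume)(void *));
extern void  cont_save(void *state);
extern void  cont_resume(cont_queue_t *q);

/* Illustrative invocation record for a blocked Orca operation. */
typedef struct {
    int   obj_id;      /* which shared object            */
    int   op_id;       /* which operation on that object */
    void *params;      /* marshaled parameters           */
} invocation_t;

/* Resume function: retry the blocked operation. */
static int resume_rpc_invocation(void *state)
{
    invocation_t *inv = state;

    if (!try_operation(inv->obj_id, inv->op_id, inv->params))
        return 0;                 /* guards still false: stay on the queue */
    send_rpc_reply(inv);          /* operation succeeded: reply to invoker */
    return 1;                     /* continuation can be destroyed         */
}

/* Called from the upcall when all of an operation's guards fail. */
static void block_invocation(cont_queue_t *q, const invocation_t *inv)
{
    invocation_t *state = cont_alloc(q, sizeof *state, resume_rpc_invocation);

    *state = *inv;                /* save the invocation information       */
    cont_save(state);             /* append the continuation to the queue  */
}

/* Called after a write operation has modified the object. */
static void object_modified(cont_queue_t *q)
{
    cont_resume(q);               /* walk the queue; retry each operation  */
}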

Manual state saving with continuations is relatively easy if message handlers can block only at the beginning (as with Orca operations), because the state that needs to be saved consists essentially of the message that has been received. Manual state saving is more difficult when handlers block after they have created new state. In this case, the handler must be split in two parts: the code executed before and after the synchronization point. For very large systems, this manual function splitting becomes tedious. The difficulty lies in identifying the state that needs to be communicated from the function’s first half to its second half (by means of a continuation). This state may well include state stored in stack frames below the frame that is about to block (assuming that stacks grow upward). In the worst case, the entire stack must be saved. A second complication arises if a handler performs synchronous communica- tion. A synchronous communication call can only be split in two parts if there is a nonblocking alternative for the synchronous primitive. Earlier versions of Panda, for example, did not provide asynchronous message passing primitives, but only a synchronous RPC primitive. In this case, true blocking cannot be avoided and it is necessary to hand off state to a separate thread that can safely perform the synchronous operation. In the continuation-based RTS, such cases are handled by a single RTS server thread.
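To illustrate the splitting problem, the sketch below shows a handler that creates state and then blocks, rewritten as two halves connected by a continuation. It reuses the assumed cont_* prototypes of the previous sketch; the message and state types and the first_half()/second_half() functions are purely illustrative.

typedef struct { int value; } partial_t;     /* illustrative intermediate state */

/* Original blocking form, for comparison:
 *     void handler(msg_t *msg) {
 *         partial_t p = first_half(msg);
 *         wait_for_event();                 -- may block here --
 *         second_half(&p);
 *     }
 * Split version: state created before the block is carried over explicitly. */

typedef struct { partial_t partial; } handler_state_t;

static int handler_resume(void *state)       /* second half = resume function */
{
    handler_state_t *s = state;
    second_half(&s->partial);
    return 1;                                /* done; continuation destroyed  */
}

static void handler_start(cont_queue_t *q, msg_t *msg)
{
    handler_state_t *s = cont_alloc(q, sizeof *s, handler_resume);
    s->partial = first_half(msg);            /* the state that must cross over */
    cont_save(s);                            /* park until the awaited event   */
}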

7.1.5 Performance This section discusses only operation latency and throughput. An elaborate appli- cation-performance study was performed using an Orca implementation that runs on Panda (version 3.0) and FM/MC [9]. Application performance on Panda 4.0 7.1 Orca 159


Fig. 7.11. Roundtrip latencies for Orca, Panda, and LFC. and LFC is discussed in Chapter 8 which studies the application-level effects of different LFC implementations. All measurements in this chapter were performed on the experimental platform described in Section 1.6, unless noted otherwise.

Performance of Operations on Nonreplicated Objects Figure 7.11 shows the execution time of an Orca operation on a remote, non- replicated object. The operation takes an array of bytes as its only argument and returns the same array. The horizontal axis specifies the number of bytes in the array. The operation is performed using a Panda RPC to the remote machine. For comparison, Figure 7.11 also shows roundtrip latencies for Panda and LFC using messages that have the same size as the Orca array. In all three benchmarks, the request and reply messages are received through polling. Orca-specific processing costs approximately 15 µs. This includes the time to lock the RTS, to read the RTS headers, to allocate space for the array parameter, to look up the object, to execute the operation, to update the runtime object access statistics, and to build the reply message. We used the same test to measure operation throughput. The results of this test, and the corresponding roundtrip throughputs for Panda and LFC with the

copy receive strategy, are shown in Figure 7.12. We define roundtrip throughput as 2m/RTT, where m is the size in bytes of the array transmitted in the request and reply message and RTT is the measured roundtrip time.

Compared to the throughput obtained by Panda and LFC, Orca's throughput is good. This is due to the use of I/O vectors and stream messages, which avoid


Fig. 7.12. Roundtrip throughput for Orca, Panda, and LFC. extra copying inside the Orca RTS. For all systems, throughput decreases when messages no longer fit into the L2 cache (256 Kbyte).

Performance of Operations on Replicated Objects Figure 7.13 shows the latency of a write operation on an object that has been replicated on 64 processors. The operation takes an array of bytes as its only argument and does not return any results. This operation is performed by means of a totally-ordered Panda broadcast. For comparison, the figure also shows the broadcast latencies for Panda (ordered and unordered) and LFC (unordered). In all cases, the latency shown is the latency between the sender and the last receiver of the broadcast. Orca adds approximately 18 µs (17%) to Panda’s totally-ordered broadcast primitive. Figure 7.14 shows the throughput obtained using the same write operation as above. The loss in throughput relative to Panda (with ordering) is at most 15%. This occurs when Orca sends two packets and Panda one (due to an extra Orca header). In all other cases the loss is at most 8%.

The Impact of Continuations Since we have no implementation of RTS-threads on Panda 4.0, we cannot directly measure the gains of using continuations instead of popup threads. An earlier comparison showed that operation latencies went from 2.19 ms in RTS-threads 7.1 Orca 161


Fig. 7.13. Broadcast latency for Orca, Panda, and LFC.


Fig. 7.14. Broadcast throughput for Orca, Panda, and LFC.

Issue                        Orca                      Manta
Programming model
  Object placement           Transparent               Explicit
  Mutual exclusion           Per operation, implicit   Synchronized methods/blocks
  Condition synchronization  Guards                    Condition variables
  Garbage collection         No                        Yes
Implementation
  Compilation                To C                      To assembly (x86 and SPARC)
  Object replication         Yes                       No
  Object migration           Yes                       No
  Invocation mechanisms      Totally-ordered bcast     RMI
                             and RPC
  Upcall models              Panda upcalls             Popup threads

Table 7.1. A comparison of Orca and Manta.

to 1.90 ms in the continuation-based RTS [14], an improvement of 13%. These measurements were performed on 50 MHz SPARCClassic clones connected by 10 Mbps Ethernet. For communication we used Panda 2.0 on top of the datagram service of the Amoeba operating system. Thread switching on the SPARC was expensive, because it required a trap to the kernel. On the Pentium Pro, thread switches are performed entirely in user space, so the gains of using continuations instead of popup threads are smaller. Nevertheless, the costs of switching from a Panda upcall to a popup thread are significant. In Manta, for example, dispatching a popup thread costs 4 µs and accounts for 9% of the execution time of a null operation on a remote object (see Section 7.2).

7.2 Manta

Manta is a parallel-programming system based on Java [60]. Java is a portable, type-secure, and object-oriented programming language with automatic memory management (i.e., garbage collection); these properties have made Java a very popular programming language. Java is also an attractive (base) language for parallel programming. Multithreading, for example, is part of the language and an RPC-like communication mechanism is available in standard libraries. Manta implements the JavaParty [117] programming model, which is based on shared objects and resembles Orca’s shared-object model. The programming model and its implementation on Panda and LFC are described below. A detailed description of Manta is given in the MSc thesis of Maassen and van Nieuwpoort [98]. 7.2 Manta 163

7.2.1 Programming Model Manta extends Java with one keyword: remote. Java threads can share instances of any class that the programmer tags with this keyword. Such instances are called remote objects. In addition, Manta provides library routines to create threads on remote processors. Threads that run on the same processor can communicate through global variables; threads that run on different machines can communicate only by invoking methods on shared remote objects. This programming model resembles Orca’s shared-object model, but there are several important differences, which are summarized in Table 7.1 and discussed below. Section 7.2.2 discusses the implementation differences. First, in Manta, object placement is not transparent to the programmer. When creating a remote object, the programmer must specify explicitly on which proces- sor the object is to be stored. Each object is stored on a single processor —Manta does not replicate objects— and cannot be migrated to another processor. Second, Orca and Java use different synchronization mechanisms. In Orca, all operations on an object are executed atomically. In Java, individual methods can be marked as ’synchronized.’ Two synchronized methods cannot interfere with each other, but a nonsynchronized method can interfere with other (synchronized and nonsynchronized) methods. Also, Java uses condition variables for condition synchronization; Orca uses guards. The last difference is that Java is a garbage-collected language, which has some impact on communication performance (see Section 7.2.3).

7.2.2 Implementation Like Orca, Manta does not use LFC directly, but builds on Panda. Manta benefits from Panda’s multithreading support, Panda’s transparent mixing of polling and interrupts, and Panda’s stream messages. Since Manta does not replicate objects, it need not maintain multiple consistent copies of an object, and therefore does not use Panda’s totally-ordered broadcast primitive. Manta achieves high performance through a fast remote-object invocation mech- anism and the use of compilation instead of interpretation. Both are discussed below.

Remote-Object Invocation Like Orca, Manta uses function shipping to access remote objects. Method in- vocations on a remote object are shipped by means of an object-oriented variant of RPC called Remote Method Invocation (RMI) [152]. Each Manta RMI is exe- cuted by means of a Panda RPC. 164 Parallel-Programming Systems

Processing an RMI request consists of executing a method on a remote object. All parameters except remote objects are passed by value. Method invocations can block at any time on a condition variable. This is different from Orca where blocking occurs only at the beginning of an operation. Due to this difference, Manta cannot always use Panda upcalls and continuations to handle incoming RMIs. If a method is executed by a Panda upcall and blocks halfway through, then the Manta runtime system is not aware of the state created by that method and cannot create a continuation. To deal with blocking methods, Manta uses popup threads to process incoming RMIs. When an RMI request is delivered by a Panda upcall, the Manta runtime system passes the request to a popup thread in a thread pool. This thread exe- cutes the method and sends the reply. When the method blocks, the popup thread is blocked on a Panda condition variable; no continuations are created. In short, Manta uses the same system structure as earlier Orca implementations (cf. Fig- ure 7.8).
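A rough C sketch of this hand-off from the Panda upcall to a popup thread follows. The queue type, the thread-pool signaling, and the method-execution functions are illustrative placeholders, not Manta's actual runtime code.

/* Illustrative hand-off of an incoming RMI request to a popup thread. */
typedef struct rmi_request rmi_request_t;
typedef struct request_queue request_queue_t;   /* protected by a lock and a
                                                   condition variable        */
static request_queue_t *rmi_queue;

/* Runs in Panda's upcall: must not block, so only enqueue and signal. */
void rmi_upcall(rmi_request_t *req)
{
    queue_put(rmi_queue, req);
    signal_popup_thread();                      /* wake a thread in the pool */
}

/* Body of each popup thread in the pool. */
void popup_thread_main(void)
{
    for (;;) {
        rmi_request_t *req = queue_get(rmi_queue);  /* waits for work        */
        execute_remote_method(req);   /* may block on a condition variable   */
        send_rmi_reply(req);
    }
}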

Compiler Support Besides defining the Java language, the Java developers also defined the Java Vir- tual Machine (JVM) [95], a virtual instruction set architecture. Java programs are typically compiled to JVM bytecode; this bytecode is interpreted. This approach allows Java programs to be run on any machine with a Java bytecode interpreter. The disadvantage is that interpretation is slow. Current research projects are at- tacking this problem by means of just-in-time (JIT) compilation of Java bytecode for a specific architecture and by means of hardware support [104]. Manta includes a compiler that translates Java source code to machine code for SPARC and Intel x86 architectures [145]. This removes the overhead of inter- pretation for applications for which source code is available. For interoperability, Manta can also dynamically load and execute Java bytecode [99]. The experi- ments below, however, use only native binaries without bytecode. The compiler also supports Manta’s RMI. For each remote class, the com- piler generates method-specific marshaling and unmarshaling routines. Manta marshals arrays in the same way as Orca (see Figure 7.6). At the sending side, the compiler-generated code adds a pointer to the array data to a Panda I/O vec- tor and Panda copies the data into LFC send packets. At the receiving side, the compiler-generated code uses Panda’s pan msg consume() to copy data from LFC host receive packets to a Java array. For nonarray types, Manta makes an extra copy at the sending side. All nonar- ray data that is to be marshaled is copied to a single buffer and a pointer to this buffer is added to a Panda I/O vector. Since nonarray objects are often small, the extra copying costs may outweigh the cost of I/O vector manipulations. 7.2 Manta 165


Fig. 7.15. Roundtrip latencies for Manta, Orca, Panda, and LFC.

7.2.3 Performance

Figures 7.15 and 7.16 show Manta’s RMI (roundtrip) latency and throughput, re- spectively. These numbers were measured by repeatedly performing an RMI on a remote object. The method takes a single array parameter and returns this parame- ter; the horizontal axes of Figure 7.15 and Figure 7.16 specify the size of the array. For comparison, the figure shows also the roundtrip latencies and throughputs for LFC, Panda, and Orca. All four systems make the same number of data copies. The null latency for an empty array is 76 µs. This high latency is due to unoptimized array manipulations. With zero parameters instead of an empty array parameter, the null latency drops to 48 µs. The thread switch from Panda’s upcall to a popup thread in the Manta runtime system costs 4 µs. For nonempty arrays, Manta’s latency increases faster than the latency of Orca, Panda and LFC. This is not due to extra copying, but to the cache effect described below. Figure 7.16 shows the roundtrip throughput for LFC, Panda, Orca, and Manta. Manta’s RMI peak throughput is much lower than the peak throughput for Orca, Panda, and LFC. The reason is that Manta does not free the data structures into which incoming messages are unmarshaled until the garbage collector is invoked. In this test, the garbage collector is never invoked, so each incoming message is unmarshaled into a fresh memory area, which leads to poor cache behavior. Specifically, each time a message is unmarshaled into a buffer, the contents of receive buffers used in previous iterations is flushed from the cache to memory. This problem does not occur in the other tests (Orca, Panda, and LFC), because they reuse the same receive buffer in each iteration. 166 Parallel-Programming Systems


Fig. 7.16. Roundtrip throughput for Manta, Orca, Panda, and LFC.

If Manta can see that the data objects contained in a request message are no longer reachable after completion of the remote method that is invoked, then it can immediately reuse the message buffer without waiting for the garbage col- lector to recycle it. At present, the Manta compiler is able to perform the re- quired escape analysis [36] for simple methods, including the method used in our throughput benchmark. Since we wish to illustrate the garbage collection problem and since escape analysis works only for simple methods, we show the unopti- mized throughput. With escape analysis enabled, Manta tracks Panda’s through- put curve.

7.3 The C Region Library

The C Region Library (CRL) is a high-performance software DSM system which was developed at MIT [74]. Using CRL, processes can share regions of memory of a user-defined size. The CRL runtime system implements coherent caching for these regions. This section discusses CRL’s programming model, its implementa- tion on LFC, and the performance of this implementation.

7.3.1 Programming Model

CRL requires the programmer to store shared data in regions. A region is a contiguous chunk of memory of a user-defined size; a region can neither grow nor shrink. Processes can map regions into their address space and access them using memory-reference instructions. To make changes visible to other processes, and to observe changes made by other processes, each process must enclose its region accesses by calls to the CRL runtime system. (CRL is library-based and has no compiler.) A series of read accesses to a region r should be bracketed by calls to rgn_start_read(r) and rgn_end_read(r), respectively. If a series of accesses includes an instruction that modifies a region r, then rgn_start_write(r) and rgn_end_write(r) should be used. Rgn_start_read() locks a region for reading and rgn_start_write() locks a region for writing. Both calls block if another process has already locked the region in a conflicting way. (A conflict occurs whenever at least one write access is involved.) Rgn_end_read() and rgn_end_write() release the read and write lock, respectively. If all region accesses in a program are properly bracketed, then CRL guarantees a sequentially consistent [86] view of all regions. However, the CRL library cannot verify that users indeed bracket their accesses properly.
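The bracketing discipline might look as follows in application code. The start/end calls are the ones described above; the argument type, the declarations, and how the region is created and mapped are assumptions of this sketch.

/* Assumed prototypes; the actual CRL signatures may differ. */
extern void rgn_start_read(void *rgn), rgn_end_read(void *rgn);
extern void rgn_start_write(void *rgn), rgn_end_write(void *rgn);

double sum_region(void *rgn, int n)
{
    double *v = rgn;
    double  sum = 0.0;
    int     i;

    rgn_start_read(rgn);              /* lock the region for reading */
    for (i = 0; i < n; i++)
        sum += v[i];
    rgn_end_read(rgn);                /* release the read lock       */
    return sum;
}

void increment_element(void *rgn, int i)
{
    double *v = rgn;

    rgn_start_write(rgn);             /* lock the region for writing */
    v[i] += 1.0;                      /* modification is now safe    */
    rgn_end_write(rgn);               /* make the change visible     */
}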

7.3.2 Implementation Although CRL uses a sharing model that is similar to Orca’s shared objects, the implementation is different. First, CRL runs directly on LFC and does not use Panda. Second, CRL uses a different consistency protocol for replicated shared objects. Orca updates shared objects by means of function shipping. CRL, in con- trast, uses invalidation and data shipping. (A function-shipping implementation of CRL exists [67], but the LFC implementation of CRL is based on the original, data-shipping variant.) CRL uses a protocol similar to the cache coherence proto- cols used in scalable multiprocessors such as the Origin2000 [151] and the MIT Alewife [1]. The protocol treats every shared region as a cache line and runs a directory-based cache coherence protocol to maintain cache consistency. The core of CRL’s runtime system is a state machine that implements the consistency protocol. The runtime system maintains (meta)state for each region on all nodes that cache the region. The state machine is implemented as a set of handler routines that are invoked when a specific event occurs. An event is either the invocation of a CRL library routine that brackets a region access (e.g., rgn start write()) or the arrival of a protocol message. A detailed description of the state machine is given in Johnson’s PhD thesis [73]. When a process creates a region on a processor P, then P becomes the region’s home node. The home node stores the region’s directory information; it maintains a list of all processors that are caching the region. It is the location where nodes that are not caching the region can obtain a valid copy of the region. It is also a central point at which conflicting region accesses are serialized so that consistency is maintained. 168 Parallel-Programming Systems
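The directory information kept by the home node might be represented roughly as follows. The field names, the state values, and the fixed-size sharer set are simplifications for illustration, not CRL's actual data structures.

#define MAX_NODES 64

typedef enum { REGION_HOME, REGION_SHARED, REGION_EXCLUSIVE } region_state_t;

/* Illustrative per-region directory entry kept on the home node. */
typedef struct {
    region_state_t state;              /* protocol state at the home node   */
    int            last_writer;        /* holder of a dirty copy, if any    */
    int            num_sharers;        /* processors caching a read copy    */
    unsigned char  sharer[MAX_NODES];  /* readers that must be invalidated  */
                                       /* on a write miss                   */
} directory_entry_t;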

(a) Clean read miss; (b) dirty read miss; (c) clean write miss; (d) dirty write miss. In each transaction, the requesting node P exchanges messages with the home node H and, where necessary, with readers R or the last writer W (read/write miss requests, invalidations, acknowledgements, and ACK + data transfers).

Fig. 7.17. CRL read and write miss transactions.

CRL Transactions

Communication in CRL is caused mainly by read and write misses. Figure 7.17 shows the communication patterns that result from different types of misses. A miss is a clean miss if the home node H has an up-to-date copy of the region that process P is trying to access, otherwise the miss is a dirty miss. In the case of a clean miss, the data travels across the network only once (from H to P). For a clean write miss, the home node sends invalidations to all nodes R that recently had read access to the region. The region is not shipped to P until all readers R have acknowledged these invalidations. In the case of a dirty miss the data is transferred twice, once from the last writer W to home node H and once from H to the requesting process P. The CRL version we used does not implement the three-party optimization that avoids this double data transfer. (Later versions did implement this optimization). 7.3 The C Region Library 169

Polling and Interrupts CRL uses both polling and interrupts to deliver messages. The implementation is single-threaded. All messages, whether they are received through polling or interrupts, are handled in the context of the application thread. The runtime system normally enables network interrupts so that nodes can be interrupted to process incoming requests (misses, invalidations, and region- map requests) even if they happen to be executing application code. For some applications, interrupts are necessary to guarantee forward progress. For other applications, they improve performance by reducing response time. In many cases, though, interrupts can be avoided. All transactions in Fig- ure 7.17 start with a request from node P to home node H and end with a reply from H to P. While waiting for the reply from H, P is idle. Similarly, if H sends out invalidations then H will be idle until the first acknowledgement arrives and in between subsequent acknowledgements. In these idle periods, CRL disables network interrupts and polls. This way the expected reply message is received through polling rather than interrupts, which reduces the receive overhead. This behavior is a good match to LFC’s polling watchdog which works well for clients that mix polling and interrupts. Interrupts are used to handle request messages unless the destination node happens to be waiting (and polling) for a reply to one of its own requests. By delaying interrupts, LFC increases the proba- bility that this happens.
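A sketch of this wait-for-reply behavior is shown below. The LFC calls used here (lfc_intr_disable(), lfc_poll(), lfc_intr_enable()) are hypothetical placeholders for the real interface.

/* Assumed prototypes for the hypothetical LFC calls. */
extern void lfc_intr_disable(void), lfc_intr_enable(void), lfc_poll(void);

extern volatile int reply_arrived;     /* set by the reply-message handler  */

static void wait_for_reply(void)
{
    lfc_intr_disable();                /* incoming requests are now handled */
                                       /* through polling, not interrupts   */
    while (!reply_arrived)
        lfc_poll();                    /* dispatches handlers for any       */
                                       /* packets that have arrived         */
    lfc_intr_enable();                 /* re-enable network interrupts      */
}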

Message Handling CRL sends two types of messages. Data messages are used to transfer regions containing user data (e.g., when a region is replicated). These messages consist of a small header followed by the region data. Control messages are small messages used to request copies of regions, to invalidate regions, to acknowledge requests, etc. Both data and control messages are point-to-point messages; CRL does not make significant use of LFC’s multicast primitive. Each CRL message carries a small header. Each header contains a demulti- plexing key that distinguishes between control messages, data messages, and other (less frequently used) message types. For control messages, the header contains the address of a handler function and up to four integer-sized arguments for the handler. A data message consists of a region pointer, a region offset, and a few more fields. The region pointer identifies a mapped region in the requesting pro- cess’s address space. Since regions can have any size, they may well be larger than an LFC packet. The offset is used to reassemble large data messages; it indicates where in the region to store the data that follows the header. Incoming packets are always processed immediately. The data contained in 170 Parallel-Programming Systems


Fig. 7.18. Write miss latency.

data packets is always copied to its destination without delay. Protocol-message handlers, in contrast, cannot always be invoked immediately; in that case the protocol message is copied and queued. (This copy is unnecessary, but protocol messages consist of only a few words.) Since no process can initiate more than one coherence transaction, the amount of buffering required to store blocked protocol handlers is bounded.
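The control- and data-message formats described above might be laid out roughly as follows; the field names and types are invented for this sketch.

/* Illustrative layouts of CRL's two message types. */
typedef struct {                        /* control message                   */
    unsigned type;                      /* demultiplexing key                */
    void   (*handler)(int, int, int, int);  /* handler function to invoke    */
    int      arg[4];                    /* up to four integer-sized args     */
} ctrl_msg_t;

typedef struct {                        /* header preceding region data      */
    unsigned type;                      /* demultiplexing key                */
    void    *region;                    /* mapped region in the requesting   */
                                        /* process's address space           */
    unsigned offset;                    /* where in the region the data that */
                                        /* follows this header belongs       */
} data_msg_t;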

7.3.3 Performance

CRL programs send many control messages and are therefore sensitive to send and receive overhead. Moreover, several CRL applications use small regions, which yields small data messages. Since many CRL actions use a request-reply style of communication, communication latencies also have an impact on performance. Figures 7.18 and 7.19 show the performance of clean write misses for various numbers of readers. The home node H sends an invalidation to each reader and awaits all readers’ acknowledgements before sending the data to requesting pro- cess P (see Figure 7.17(c)). The LFC curve shows the performance of a raw LFC test program that sends request and reply messages that have the same size as the messages sent by CRL in the case of zero readers (i.e., when no invalidations are needed). For small regions, CRL adds approximately 5 µs (22%) to LFC’s roundtrip latency (see Figure 7.18). CRL’s throughput curves are close to the throughput curve of the LFC benchmark. 7.4 MPI — The Message Passing Interface 171


Fig. 7.19. Write miss throughput.

7.4 MPI — The Message Passing Interface

The Message Passing Interface (MPI) is a standard library interface that defines a large variety of message-passing primitives [53]. MPI is likely the most frequently used implementation of the message-passing programming model. Several MPI implementations exist. Major vendors of parallel computers (e.g., IBM and Silicon Graphics) have built implementations that are optimized for their hardware and software. A popular alternative to these vendor-specific implemen- tations is MPI Chameleon (MPICH) [61]. MPICH is free and widely used in workstation environments. We implemented MPI on LFC by porting MPICH to Panda. The following subsections describe this MPI/Panda/LFC implementation and its performance.

7.4.1 Implementation MPICH consists of three layers. At the bottom, platform-specific device chan- nels implement reliable, low-level send and receive functions. The middle layer, the application device interface (ADI), implements various send and receive fla- vors (e.g., rendezvous) using downcalls to the current device channel. The top layer implements the MPI interface. The ADI and the device channels have fixed interfaces. We ported MPICH version 1.1 and ADI version 2.0 to Panda by imple- menting a new device channel. This Panda device channel implements the basic send and receive functions using Panda’s message-passing module. Since MPICH is not a multithreaded system, the Panda device channel uses 172 Parallel-Programming Systems a stripped-down version of Panda that does not support multithreading and that receives all messages through polling. We refer to this simpler Panda version as Panda-ST (Panda-SingleThreaded). The multithreaded version of Panda described in Chapter 6 is referred to as Panda-MT (PandaMultiThreaded). By assuming that the Panda client is single-threaded, Panda-ST eliminates all locking inside Panda. Panda-ST also disables LFC’s network interrupts. MPICH does not need interrupts, because all communication is directly tied to send and receive actions in user processes. It suffices to poll when a receive statement is executed (either by the user program or as part of the implementation of a complex function). Message handlers do not execute, not even logically, in the context of a dedicated upcall thread. Instead, they execute in the context of the application’s main thread, both physically (on the same stack) and logically. MPICH provides its own spanning-tree broadcast implementation. Unlike LFC’s multicast implementation, however, MPICH forwards messages rather than packets. This way, the multicast primitive can be implemented easily in terms of MPI’s unicast primitives which also operate on messages. An obvious disadvan- tage of this implementation choice is the loss of pipelining at processors that are internal nodes of a multicast tree and that have to forward messages. Given a large multicast message, such a processor will not start forwarding before it has received the entire message. If done on a per-packet basis, as in LFC, forwarding can begin as soon as the first packet has arrived at the forwarding processor’s NI. We experimented both with MPICH’s default broadcast implementation and with an implementation that uses Panda’s unordered broadcast on top of LFC’s broad- cast. The performance of these two broadcast implementations, MPI-default and MPI-LFC, is discussed in the next section.

7.4.2 Performance Below, we discuss the unicast and broadcast performance of MPICH on Panda and LFC.

Unicast Performance Figure 7.20 and Figure 7.21 show MPICH’s one-way latency and throughput, re- spectively. Both tests were performed using MPI’s MPI Send() and MPI Recv() functions. For comparison, the figures also show the results of the corresponding tests at the Panda and LFC level. Figure 7.20 shows MPI’s unicast latency and, for comparison, the unicast la- tencies of Panda and LFC. MPICH is a heavyweight system and adds 6 µs (42%) to Panda’s 0-byte one-way latency. Except for small messages, MPICH attains the same throughput as Panda and 7.4 MPI — The Message Passing Interface 173


Fig. 7.20. MPI unicast latency.

LFC. The reason is that MPICH makes no extra passes over the data that is trans- ferred. MPICH simply constructs an I/O vector and hands it to Panda, which then fragments the message. As a result, the MPICH layer does not incur any per-byte costs and throughput is good.
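The one-way latency measurement corresponds to a standard MPI ping-pong benchmark; a minimal, self-contained version is sketched below. The iteration count and message size are arbitrary choices for illustration, not the settings used for the figures.

#include <mpi.h>
#include <stdio.h>
#include <string.h>

/* Minimal MPI ping-pong between ranks 0 and 1.  One-way latency is half the
 * average roundtrip time. */
int main(int argc, char **argv)
{
    enum { ITERS = 10000, SIZE = 1024 };
    char buf[SIZE];
    int rank, i;
    double start, elapsed;
    MPI_Status status;

    memset(buf, 0, SIZE);
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    MPI_Barrier(MPI_COMM_WORLD);
    start = MPI_Wtime();
    for (i = 0; i < ITERS; i++) {
        if (rank == 0) {
            MPI_Send(buf, SIZE, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(buf, SIZE, MPI_CHAR, 1, 0, MPI_COMM_WORLD, &status);
        } else if (rank == 1) {
            MPI_Recv(buf, SIZE, MPI_CHAR, 0, 0, MPI_COMM_WORLD, &status);
            MPI_Send(buf, SIZE, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
        }
    }
    elapsed = MPI_Wtime() - start;
    if (rank == 0)
        printf("one-way latency: %.2f microseconds\n",
               elapsed / (2.0 * ITERS) * 1e6);

    MPI_Finalize();
    return 0;
}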

Broadcast Performance

Figure 7.22 shows multicast latency on 64 processors for LFC, Panda, and three MPI implementations. Both MPI-default results were obtained using MPICH's default broadcast implementation, which forwards broadcast messages as a whole rather than packet by packet. MPI-default normally uses binomial broadcast trees. Since LFC uses binary trees, we added an option to MPICH that allows MPI-default to use both binomial and binary trees. The MPI-LFC results were obtained using LFC's broadcast primitive (i.e., with binary trees).

As in the unicast case, we find that MPICH adds considerable overhead to Panda. Binary and binomial trees yield no large differences in MPI-default's broadcast performance, because 64-node binary and binomial trees have the same depth. There is a large difference between the MPI-default versions and MPI-LFC. With binary trees, MPI-default adds 67 µs (77%) to MPI-LFC's latency for a 64-byte message; with binomial trees, the overhead is 59 µs (68%). In this latency test it is not possible to pipeline multipacket messages. The cause of the performance difference is that MPI-default adds at least two extra packet copies to the critical path at each internal node of the multicast tree: one copy from the network interface to the host and one copy per child from the host to the network interface.


Fig. 7.21. MPI unicast throughput.


Fig. 7.22. MPI broadcast latency.


Fig. 7.23. MPI broadcast throughput.

Figure 7.23 compares the throughput of the same configurations. The difference between the MPI-default curves is caused by the difference in tree shape: binomial trees have a higher fanout than binary trees, which reduces throughput. Both MPI-default versions perform worse than MPI-LFC, not so much due to extra data copying, but because MPI-default does not forward multipacket multicast messages in a pipelined manner. In addition, host-level, message-based forwarding has a larger memory footprint than packet-based forwarding. When the message no longer fits into the L2 cache, MPI-default's throughput drops noticeably (from 20 to 9 Mbyte/s for binary trees and from 10 to 5 Mbyte/s for binomial trees). The problem is that the use of MPI-default results in one copying pass over the entire message for each child that a processor has to forward the message to. If the message does not fit in the cache, then these copying passes trash the L2 cache.

7.5 Related Work

We consider related work in two areas: operation transfer and operation execution.

7.5.1 Operation Transfer Orca’s operation-specific marshaling routines eliminate interpretation overhead by exploiting compile-time knowledge. In addition, they do not copy to inter- mediate RTS buffers, but generate an I/O vector which is passed to Panda’s send 176 Parallel-Programming Systems routines. The same approach is taken in Manta [99] (see Section 7.2), which gen- erates specialized marshaling code for Java methods. In their implementation of Optimistic Orca [66] by means of Optimistic Active Messages (OAMs), Hsieh et al. use compiler support to optimize the transfer of simple operations of simple object types (see also Section 6.6.3). For such oper- ations, Optimistic Orca avoids the generic marshaling path and copies arguments directly into an (optimistic) active message. At the receiving side, these optimistic active messages are dispatched in a special way (see Section 7.5.2). We optimize marshaling for all operations, using a generic compilation strategy. To attain the performance of Optimistic Orca, however, further optimization would be neces- sary. For example, we would have to eliminate Panda’s I/O vectors (which are expensive for simple operations). Muller et al. optimized marshaling in Sun’s RPC protocol by means of a pro- gram specialization tool, Tempo, for the C programming language [109]. Given the stubs generated by Sun’s nonspecializing stub compiler this tool automatically generates specialized marshaling code by removing many tests and indirections. We use the Orca compiler instead of a general-purpose specialization tool to gen- erate operation-specific code. As described in Section 7.1.3, there are redundant data transfer operations on Orca’s operation transfer path. These transfers can be removed by using DMA instead of PIO at the sending side, and by removing the receiver-side copy (from LFC packets to Orca data structures). Chang and von Eicken describe a zero-copy RPC architecture for Java, J-RPC, that removes these two data transfers [35]. J- RPC is a connection-oriented system in which receivers associate pinned pages with individual (per-sender) endpoints. Once a sender has filled these pages, the receiver unpins them and allocates new pages. The address of the new receive area is piggybacked on RPC replies. J-RPC suffers from two problems. First, as noted by its designers, allocating pinned memory for each endpoint does not scale well to large numbers of proces- sors. Second, J-RPC relies on piggybacking to return information about receive areas to a sender. This works well for RPC, but not for one-way multicasting.

7.5.2 Operation Execution Whereas our approach has been to avoid expensive thread switches by hand, OAMs [66, 150], Lazy Task Creation [105], and Lazy Threads [58] all rely on compiler support. OAMs transform an AM handler that runs into a locked mutex into a true thread. The overhead of creating a thread is paid only when the lock operation fails. Hsieh et al. used OAMs to improve the performance of one of the first Panda-based Orca implementations. Like RTS-threads, this implementation used popup threads. On the CM-5, the implementation based on OAMs reduced 7.6 Summary 177 the latency of Orca object invocation by an order of magnitude by avoiding thread switching and by avoiding the CM-5’s bulk transfer mechanism for small mes- sages. In contrast with our approach, OAMs use compiler support and require that all locks be known in advance. Draves and Bershad used continuations inside an operating system kernel to speed up thread switching [46]. Instead of blocking a kernel thread, a continuation is created and the same stack is used to run the next thread. We use the same tech- nique in user space for the same reasons: to reduce thread switching overhead and memory consumption. Orca’s continuation-based RTS uses continuations in the context of upcalls and accesses them through a condition-variable like interface. R¬uhl and Bal use continuations in a collective-communication module im- plemented on top of Panda [123]. Several collective-communication operations combine results from different processors using a reverse spanning tree. Rather than assigning to one particular thread (e.g., the local computation thread) the task of waiting for, combining, and propagating data, the implementation stores intermediate results in continuations. Subresults are created either by the local computation thread or by an upcall thread. The first thread that creates a subresult creates a continuation and stores its subresult in this continuation. When another thread later delivers a new subresult, it merges its subresult with the existing sub- result by invoking the continuation’s resume function. This avoids thread switches to a dedicated thread.

7.6 Summary

This chapter has shown that the communication mechanisms developed in the previous chapters can be applied effectively to a variety of PPSs. We described portable techniques to transfer and execute Orca operations effi- ciently. Operation transfer is optimized by letting the compiler generate operation- specific marshaling code. Also, the Orca RTS does not copy data to intermediate buffers: data is copied directly between LFC packets and Orca data structures. Operation execution is optimized by representing blocked invocations compactly by continuations instead of blocked threads. The use of continuations allows oper- ations to be executed directly by Panda upcalls, saves memory, and avoids thread switching during the re-evaluation of the guards of blocked operations. The Orca RTS exploits all communication mechanisms of Panda and LFC. Panda’s transparent switching between polling and interrupts and LFC’s polling watchdog work well for messages that carry operation requests or replies. An interrupt is generated when the receiver of a request is engaged in a long-running computation and does not respond to the request. RPC replies are usually received through polling. 178 Parallel-Programming Systems

Panda’s totally-ordered broadcast is used to update replicated objects. This broadcast is implemented efficiently using LFC’s NI-supported fetch-and-add and multicast primitives. LFC’s NI-level fetch-and-add does not generate interrupts on the sequencer node. LFC’s multicast may generate interrupts on the receiving nodes, but these interrupts are not on the multicast-packet forwarding path. Panda’s stream messages allow the RTS to marshal and unmarshal data with- out making unnecessary copies. Finally, Panda’s upcall model works well for Orca, even though blocking upcalls occur frequently in Orca programs. Manta has a similar programming model as Orca (shared objects) and also uses some of Orca’s implementation techniques, in particular function shipping and compiler-generated, method-specific marshaling routines. The communica- tion requirements of Manta and Orca are different, though. First, Manta does not replicate objects and therefore does not need multicast support. Second, Manta uses popup threads to process incoming operation requests, because Java meth- ods can create new state before they block, which makes it difficult to represent a blocked invocation by a continuation (as in Orca). In the current implementa- tion, the use of popup threads adds a thread switch to the execution path of all operations on remote objects. In terms of its communication requirements, CRL is the simplest of the PPSs discussed in this chapter. CRL runs directly on top of LFC and benefits from LFC’s efficient packet interface and its polling watchdog. The implementation of MPI uses Panda’s stream messages and Panda’s broad- cast. As in Orca and Manta, Panda’s stream messages allow data to be copied directly between LFC packets and application data structures. Panda’s broadcast is much more efficient than MPICH’s default broadcast. MPICH performs all for- warding on the host, rather than on the NI. What is worse, however, is that MPICH forwards message-by-message rather than packet-by-packet, and therefore incurs the full message latency at each hop in its multicast tree. All PPSs have the same data transfer behavior. At the sending side, each client copies data from client-level objects into LFC send packets in NI memory. At the receiving side, each client copies data from LFC receive packets in host memory to client-level objects. Table 7.2 summarizes the minimum latency and the maximum throughput of characteristic PPS-level operations on PPS-level data. The table also shows the cost of the communication pattern induced by these operations, at all software lev- els below the PPS level. The performance results for different levels are separated by slashes; differences in these results are caused by layer-specific overheads. Re- call that Panda can be configured with (Panda-MT) or without (Panda-ST) threads. For CRL, Table 7.2 shows the cost of a clean write miss. The communication pattern consists of a small, fixed-size control message to the home node which replies with a data message containing a region. For MPI, the table shows the cost 7.6 Summary 179

                        Latency (µs)            Throughput (Mbyte/s)
PPS                     Unicast     Broadcast   Unicast     Broadcast
LFC/CRL                 23/28       –           60/56       –
LFC/Panda-ST/MPI        12/15/21    72/78/87    63/61/60    28/27/27
LFC/Panda-MT/Orca       24/31/46    72/103/121  58/56/52    28/28/27
LFC/Panda-MT/Manta      24/31/76    –           58/56/35    –

Table 7.2. PPS performance summary. All broadcast measurements were taken on 64 processors.

The results in Table 7.2 show that our PPSs add significant overhead to an LFC-level implementation of the same communication pattern. These overheads stem from several sources.

1. Demultiplexing. All clients perform one or more demultiplexing operations. In Orca, for example, an operation request carries an object identifier and an operation identifier. In CRL, each message carries the address of a handler function and the name of a shared region. In MPI, each message carries a user-specified tag. MPI and Orca are both layered on top of Panda, which performs additional demultiplexing.

2. Fragmentation and reassembly. Both Panda and CRL implement fragmentation and reassembly. This requires extra header fields and extra protocol state.

3. Locking. Orca has a multithreaded runtime system that uses a single lock to achieve mutual exclusion. To avoid recursion deadlocks, this lock is released when the Orca runtime system invokes another software module (e.g., Panda) and acquired again when the invocation returns. Panda-MT also uses locks to protect global data structures.

4. Procedure calls. Although all PPSs and Panda use inlining inside software modules and export inlined versions of simple functions, many procedure calls remain. This is due to the layered structure of the PPSs. Layering is used to manage the complexity and to enhance the portability of these systems: most systems are fairly large and have been ported to multiple platforms.

The overheads added by the various software layers running on top of LFC are large compared to, for example, the roundtrip latency at the LFC level (e.g., an empty-array roundtrip for Manta adds 217% to an LFC roundtrip). Consequently, we expect that these overheads will dampen the application-level performance differences between different LFC implementations. Application performance is studied in the next chapter.

Chapter 8

Multilevel Performance Evaluation

The preceding chapters have shown that the LFC implementation described in Chapter 3 and Chapter 4 provides effective and efficient support for several PPSs. This implementation, however, is but one point in the design space outlined in Chapter 2. This chapter discusses alternative designs of two key components of LFC: reliable point-to-point communication by means of NI-level flow control and reliable multicast communication by means of NI-level forwarding. Both services are critically important for PPSs and their applications. Existing and proposed communication architectures take widely different approaches to these two issues (see Chapter 2), yet there have been few attempts to compare these architectures in a systematic way.

A key idea in this chapter is to use multiple implementations of LFC's point-to-point, multicast, and broadcast primitives. This allows us to evaluate the decisions made in the original design and implementation. We focus on assumptions 1 and 3 stated in Chapter 4: reliable network hardware and the presence of an intelligent NI. The alternative implementations all relax one or both of these assumptions. The use of a single programming interface is crucial; existing studies compare communication architectures with different functionality and different programming interfaces [5], which makes it difficult to isolate the effects of particular design decisions.

We compare the performance of five implementations at multiple levels. First, we perform direct comparisons between the systems by means of microbenchmarks that run directly on top of LFC's programming interface. Second, we compare the performance of parallel applications by running them on all five LFC implementations. Each of these applications uses one of four different PPSs: Orca, CRL, MPI, and Multigame. Orca, CRL, and MPI were described in the previous chapter; Multigame will be introduced later in this chapter. Our performance comparison thus involves three levels of software: the different LFC implementations, the PPSs, and the applications. A key contribution of this chapter is that it ties together low-level design issues, the communication style induced by particular PPSs, and characteristics of parallel applications.

This chapter is organized as follows. Section 8.1 gives an overview of all LFC implementations. Section 8.2 and Section 8.3 present, respectively, the point-to-point and multicast parts of the implementations and compare the different implementations using microbenchmarks. Section 8.4 describes the communication style and performance of the PPSs used by our application suite. Section 8.5 discusses application performance. Section 8.6 classifies applications according to their performance on different LFC implementations. Section 8.7 studies related work.

8.1 Implementation Overview

LFC's programming interface and one implementation were described in Chapter 3 and Chapter 4. This section gives an overview of all implementations and describes those implementation components that are shared by all implementations.

8.1.1 The Implementations

We developed three point-to-point reliability schemes and two multicast schemes. Each LFC implementation consists of a combination of one reliability scheme and one multicast scheme. One of these LFC implementations corresponds to the system described in Chapters 3 and 4; we refer to this system as the default LFC implementation.

The point-to-point schemes are no retransmission (Nrx), host-level retransmission (Hrx), and NI-level retransmission (Irx). Some systems, including the NI-level protocols described in Chapter 4, assume that the network hardware and software behave and are controlled as in an MPP environment. These systems exploit the high reliability of many modern networks and use nonretransmitting communication protocols [26, 32, 106, 113, 141, 154]. Such protocols are called careful protocols [106], because they may never drop packets, which requires careful resource management. The no-retransmission (Nrx) scheme represents these protocols; it assumes that the network hardware never drops or corrupts network packets and will fail if this assumption is violated.

Systems intended to operate outside an MPP environment usually implement retransmission. Retransmission used to be implemented on the host, but several research systems (e.g., VMMC-2 [49]) implement retransmission on a programmable NI. Our retransmitting schemes, Hrx and Irx, represent these two types of systems.

                              Forwarding
Retransmission                Host forwards (Hmc)   Interface forwards (Imc)
No retransmission (Nrx)       NrxHmc                NrxImc (default)
Interface retransmits (Irx)   IrxHmc                IrxImc
Host retransmits (Hrx)        HrxHmc                Not implemented

Table 8.1. Five versions of LFC's reliability and multicast protocols.

The multicast schemes are host-level multicast (Hmc) and NI-level multicast (Imc). On scalable, switched networks, multicasting is usually implemented by means of spanning-tree protocols that forward multicast data. Most communication systems implement these forwarding protocols on top of point-to-point primitives. Other systems use the NI to forward multicast traffic, which results in fewer data transfers and interrupts on the critical path from sender to receivers [16, 56, 68, 146]. Hmc and Imc represent both types of systems.

The point-to-point reliability and the multicast schemes can be combined in six different ways (see Table 8.1). We implemented five out of these six combinations. The combination of host-level retransmission and interface-level multicast forwarding has not been implemented for reasons described in Section 8.3.1. The implementation described in Chapter 3 and Chapter 4 is essentially NrxImc. There are, however, some small differences between the implementation described earlier and the one described here. (NrxImc recycles send descriptors in a slightly different way than the default LFC implementation.)

Table 8.2 summarizes the high-level differences between the five LFC implementations. We have already discussed reliability and multicast forwarding. All implementations use the same polling-watchdog implementation. The retransmitting implementations use retransmission and acknowledgement timers. IrxImc and IrxHmc use Myrinet's on-board timer to implement these timers. HrxHmc uses both Myrinet's on-board timer and the Pentium Pro's timestamp counter. All protocols except HrxHmc implement LFC's fetch-and-add operation on the NI. Since HrxHmc represents conservative protocols that make little use of the programmable NI, it stores fetch-and-add variables in host memory and handles fetch-and-add messages on the host.

8.1.2 Commonalities

The LFC implementations share much of their code. All implementations transmit data in variable-length packets. Hosts and NIs store packets in packet buffers which all have the same maximum packet size. Each NI has a send buffer pool and a receive buffer pool; hosts only have a receive buffer pool.

We distinguish four packet types: unicast, multicast, acknowledgement, and synchronization. Unicast and multicast packets contain client data. Acknowledgements are used to implement reliability. Synchronization packets carry fetch-and-add (F&A) requests and replies. An F&A operation is implemented as an RPC to the node holding the F&A variable. Depending on the implementation, this variable is stored either in host or NI memory.

All implementations organize multicast destinations in a binary tree with the sender as its root and forward multicast packets along this tree, in parallel. One of the problems in implementing a multicast forwarding scheme is the potential for buffer deadlock. Several strategies can be used to deal with this problem: reservation in advance [56, 146], deadlock-free routing, and deadlock recovery [16].

To send a packet, LFC's send routines store one or more transmission requests in send descriptors in NI memory (using PIO). Each descriptor identifies a send packet, the packet's destination, the packet's size, and protocol-specific information such as a sequence number. Multiple descriptors can refer to the same packet, so that the same data can be transmitted to multiple destinations.

LFC clients use PIO to copy data that is to be transmitted into LFC send packets (which are stored in NI memory). To speed up these data copies, all implementations mark NI memory as a write-combined memory region (see Section 3.3).

Each host maintains a pool of free receive buffers. The addresses of free host buffers are passed to the NI control program through a shared queue in NI memory. When the NI has received a packet in one of its receive buffers, it copies the packet to a free host buffer (using DMA). The host can poll a flag in this buffer to test whether the packet has been filled.

The NI control program's event loop processes the following events (a simplified sketch of this loop follows the list):

1. Host transmission request. The host enqueued a packet for retransmission. The NI tries to move the packet to its packet transmission queue. In some protocols, however, the packet may have to wait until an NI-level send window opens up. If the window is closed, the packet is stored on a per-destination blocked-sends queue. Incoming acknowledgements will later open the window and move the packet from this queue to the packet transmission queue.

2. Packet transmitted. The NI hardware completed the transmission of a packet. If the packet transmission queue is not empty, the NI starts a network DMA to transmit the next packet. For each outgoing packet, Myrinet computes (in hardware) a CRC checksum and appends it to the packet. The receive hardware recomputes and verifies the checksum and appends the result (checksum verification succeeded or failed) to the packet. This result is checked in software.

3. Packet received. The NI hardware received a packet from the network. The NI checks the checksum of the packet just received. If the checksum fails, the NI drops the packet or signals an error to the host. (The retransmitting protocols drop the packet, forcing the sender to retransmit. The nonretransmitting protocols signal an error.) Otherwise, if the packet is a unicast or multicast packet, then the NI enqueues it on its NI-to-host DMA queue. Some protocols also enqueue multicast packets for forwarding on the packet transmission queue or, if necessary, on a blocked-sends queue. Acknowledgement and synchronization packets are handled in a protocol-specific way.

4. NI-to-host DMA completion. If the NI-to-host DMA queue is not empty, then a DMA is started for the next packet.

5. Timeout. All implementations use timeouts to implement the NI-level polling watchdog described in Section 3.7. Some implementations also use timeouts to trigger retransmissions or acknowledgements.
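The sketch below gives a highly simplified, host-side rendering of the structure of such an event loop. The event and helper names are invented for illustration, and real hardware interaction (DMA registers, CRC results, window state, timer hardware) is stubbed out; the point is only the dispatch structure described in the list above.

/* Simplified sketch of the NI control program's event loop (illustrative only;
 * queue and event names are invented, and real hardware interaction, CRC
 * checking, and protocol-specific acknowledgement handling are stubbed out). */
#include <stdio.h>

enum ni_event { HOST_SEND_REQ, PKT_TRANSMITTED, PKT_RECEIVED,
                DMA_TO_HOST_DONE, TIMEOUT, SHUTDOWN };

static void start_next_transmit(void) { printf("  start network DMA for next packet\n"); }
static void start_next_host_dma(void) { printf("  start NI-to-host DMA for next packet\n"); }
static int  window_open(void)         { return 1; }   /* stub: send window state   */
static int  checksum_ok(void)         { return 1; }   /* stub: hardware CRC result */

static void handle(enum ni_event ev)
{
    switch (ev) {
    case HOST_SEND_REQ:          /* 1. host enqueued a transmission request    */
        if (window_open()) start_next_transmit();
        else printf("  window closed: move packet to blocked-sends queue\n");
        break;
    case PKT_TRANSMITTED:        /* 2. transmit DMA finished                   */
        start_next_transmit();
        break;
    case PKT_RECEIVED:           /* 3. packet arrived from the network         */
        if (!checksum_ok()) { printf("  drop or report corrupted packet\n"); break; }
        start_next_host_dma();   /*    data packets go to the host ...         */
        printf("  (multicast packets may also be queued for forwarding)\n");
        break;
    case DMA_TO_HOST_DONE:       /* 4. NI-to-host DMA finished                 */
        start_next_host_dma();
        break;
    case TIMEOUT:                /* 5. polling watchdog / retransmission timer */
        printf("  raise interrupt or retransmit unacknowledged packets\n");
        break;
    default:
        break;
    }
}

int main(void)
{
    /* Replay a synthetic event trace instead of reading hardware registers. */
    enum ni_event trace[] = { HOST_SEND_REQ, PKT_TRANSMITTED, PKT_RECEIVED,
                              DMA_TO_HOST_DONE, TIMEOUT, SHUTDOWN };
    for (int i = 0; trace[i] != SHUTDOWN; i++) {
        printf("event %d:\n", trace[i]);
        handle(trace[i]);
    }
    return 0;
}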

Each implementation is configured using the parameters shown in Table 8.3.

Parameter   Meaning                                                  Unit
P           #processors that participate in the application          Processors
PKTSZ       Maximum payload of a packet                              Bytes
ISB         NI send pool size                                        Packet buffers
IRB         NI receive pool size                                     Packet buffers
HRB         Host receive pool size                                   Packet buffers
W           Maximum send window size                                 Packets
INTD        Polling watchdog's network interrupt delay               µs
TGRAN       Timer granularity                                        µs

Table 8.3. Parameters used by the protocols.

8.2 Reliable Point-to-Point Communication

To compare different point-to-point protocols, we have implemented LFC's reliable point-to-point primitive in three different ways. The retransmitting protocols (Hrx and Irx) assume unreliable network hardware and can recover from lost, corrupted, and dropped packets by means of time-outs, retransmissions, and CRC checks. In Hrx, reliability is implemented by the host processor; in Irx, it is implemented by the NI. As explained above, the three implementations have much in common; below, we discuss how the three implementations differ from one another.

8.2.1 The No-Retransmission Protocol (Nrx)

Nrx assumes reliable network hardware and correct operation of all connected NIs. Nrx does not detect lost packets and treats corrupted packets as a fatal error. Nrx never retransmits any packet and can therefore avoid the overhead of buffering packets for retransmission and timer management.

To avoid buffer overflow, Nrx implements flow control between each pair of NIs (see the UCAST protocol described in Section 4.1). At initialization time, each NI reserves W = ⌊IRB/P⌋ receive buffers for each sender. The number of buffers (W) is the window size for the sliding window protocol that operates between sending and receiving NIs. Each NI transmits a packet only if it knows that the receiving NI has free buffers. If a packet cannot be transmitted due to a closed send window, then the NI queues the packet on a blocked-sends queue for that destination. Blocked packets are dequeued and transmitted during acknowledgement processing (see below).

Each NI receive buffer is released when the packet contained in it has been copied to host memory. The number of newly released buffers is piggybacked on each packet that flows back to the sender. If there is no return traffic, then the receiving NI sends an explicit half-window acknowledgement after releasing W/2 buffers. (That is, the credit refresh threshold discussed in Section 4.1 is set to W/2.)
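The following C sketch illustrates this credit accounting for a single sender/receiver pair. The variable names and the simulated burst are illustrative and do not come from the LFC source; the constants correspond to the parameters of Table 8.3.

/* Sketch of Nrx-style NI-level credit accounting for one sender/receiver pair
 * (illustrative; names are not taken from the LFC source). The sender may
 * transmit only while it holds credits; the receiver returns credits
 * piggybacked on return traffic or, failing that, in an explicit half-window
 * acknowledgement once W/2 buffers have been released. */
#include <stdio.h>

#define IRB 512          /* NI receive pool size (packet buffers) */
#define P    64          /* number of processors                  */
#define W   (IRB / P)    /* per-sender window: reserved buffers   */

static int credits = W;          /* sender side: packets it may still send    */
static int released_unacked = 0; /* receiver side: freed but unreported bufs  */

static int try_send(void)
{
    if (credits == 0) return 0;          /* window closed: queue on blocked-sends */
    credits--;                           /* consume one reserved receive buffer   */
    return 1;
}

/* Receiver side: called when a packet has been copied to host memory. */
static void release_buffer(void)
{
    released_unacked++;
    if (released_unacked >= W / 2) {     /* credit refresh threshold = W/2 */
        printf("send explicit half-window ack (%d credits)\n", released_unacked);
        released_unacked = 0;            /* credits travel back to the sender */
    }
}

int main(void)
{
    int sent = 0;
    for (int i = 0; i < W + 2; i++)      /* try to send a burst of W+2 packets */
        sent += try_send();
    printf("sent %d of %d packets before the window closed (W = %d)\n",
           sent, W + 2, W);
    for (int i = 0; i < W; i++)          /* receiver drains its buffers */
        release_buffer();
    return 0;
}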

8.2.2 The Retransmitting Protocols (Hrx and Irx)

The two retransmission schemes make different tradeoffs between send and receive overhead on the host processor and complexity of the NI control program. We first describe the protocols in general terms and then explain protocol-specific modifications. In the following, the terms sender and receiver refer to NIs in the case of Irx and to host processors in the case of Hrx.

Each sender maintains a send window to each receiver. A sender that has filled its send window to a particular destination is not allowed to send to that destination until some of the packets in the window have been acknowledged by the receiver. Each packet carries a sequence number which is used to detect lost, duplicated, and out-of-order packets. After transmitting a packet, the sender starts a retransmission timer for the packet. When this timer expires, all unacknowledged packets are retransmitted. This go-back-N protocol [115, 138] is efficient if packets are rarely dropped or corrupted (as is the case with Myrinet).

Retransmission requires that packets be buffered until an acknowledgement is received. With asynchronous message passing, this normally requires a copy on the sending host. However, since the Myrinet hardware can transmit only packets that reside in NI memory, all protocols must copy data from host memory to NI memory. The retransmitting protocols use the packet in NI memory for retransmission and therefore do not make more memory copies than Nrx. This optimization is hardware-specific; it does not apply to NIs that use FIFOs into the network (e.g., several ATM interfaces).

Receivers accept a packet only if it carries the next-expected sequence number; otherwise they drop the packet. In addition, NIs drop incoming packets that have been corrupted (i.e., packets that yield a CRC error) or that cannot be stored due to a shortage of NI receive buffers.

Since send packets cannot be reused until they have been acknowledged and since NI memory is relatively small, senders can run out of send packets when acknowledgements do not arrive promptly. To prevent this, receivers acknowledge incoming data packets in three ways. As in Nrx, receivers use piggybacked and half-window acknowledgements. In addition, each receiver sets an acknowledgement timer whenever a packet arrives. When the timer expires and the receiver has not yet sent any acknowledgement, the receiver sends an explicit delayed acknowledgement.

In Irx, receiving NIs keep track of the amount of buffer space available. When a packet must be dropped due to a buffer space shortage, the NI registers this. When more buffer space becomes available, the NI sends a NACK to the sender of the dropped packet. When a NACK is received, the sender retransmits all unacknowledged packets. Without the NACKs, packets would not be retransmitted until the sender's timer expired.
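The sender-side bookkeeping of such a go-back-N scheme can be sketched as follows. This is an illustrative model only: the NI send packets are represented by sequence numbers, and acknowledgement timers, NACKs, and the receiver side are omitted.

/* Sender-side go-back-N bookkeeping as used (conceptually) by Hrx and Irx;
 * an illustrative sketch only, with the NI send packets modeled by their
 * sequence numbers. Unacknowledged packets stay in their send buffers; a
 * timeout retransmits every packet not yet cumulatively acknowledged. */
#include <stdio.h>

#define WINDOW 8

static unsigned next_seq   = 0;  /* sequence number of the next new packet */
static unsigned lowest_una = 0;  /* oldest unacknowledged sequence number  */

static int send_packet(void)
{
    if (next_seq - lowest_una >= WINDOW) return 0;   /* window full */
    printf("transmit packet %u (kept in its NI send buffer)\n", next_seq);
    next_seq++;
    return 1;
}

/* A (possibly piggybacked) acknowledgement carries a cumulative sequence
 * number: everything below it may be reused. */
static void handle_ack(unsigned acked_up_to)
{
    if (acked_up_to > lowest_una) lowest_una = acked_up_to;
}

/* Retransmission timeout: resend every outstanding packet. */
static void handle_timeout(void)
{
    for (unsigned s = lowest_una; s < next_seq; s++)
        printf("retransmit packet %u\n", s);
}

int main(void)
{
    for (int i = 0; i < 10; i++)
        if (!send_packet()) printf("window closed at packet %d\n", i);
    handle_ack(4);        /* packets 0..3 acknowledged; buffers 0..3 reusable */
    handle_timeout();     /* e.g., the ack for 4..7 was lost: resend 4..7     */
    send_packet();        /* window has room again after the acknowledgement */
    return 0;
}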

8.2.3 Latency Measurements

Figure 8.1 shows the one-way latency for Nrx, Hrx, and Irx. All messages fit in a single network packet and are received through polling. Nrx outperforms the retransmitting protocols, which must manage timers and which perform more work on outstanding send packets when an acknowledgement is received. All curves have identical slopes, indicating that the protocols have identical per-byte costs.

[Figure 8.1 plots one-way latency (microseconds) against message size (0–1000 bytes) for Nrx, Irx, and Hrx.]

Fig. 8.1. LFC unicast latency.

Protocol   os    or    L     g     os + or + L
Nrx        1.5   2.3   7.2   6.7   11.0
Irx        1.4   2.3   8.3   9.7   12.0
Hrx        2.4   5.3   6.5   8.3   14.2

Table 8.4. LogP parameter values (in microseconds) for a 16-byte message.
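As a consistency check (not part of the original text), the last column of Table 8.4 is simply the sum of the send overhead, the receive overhead, and the latency:

% End-to-end latency = o_s + o_r + L, using the values from Table 8.4.
\[
\begin{aligned}
\text{Nrx:}\quad & 1.5 + 2.3 + 7.2 = 11.0~\mu\text{s}\\
\text{Irx:}\quad & 1.4 + 2.3 + 8.3 = 12.0~\mu\text{s}\\
\text{Hrx:}\quad & 2.4 + 5.3 + 6.5 = 14.2~\mu\text{s}
\end{aligned}
\]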

Table 8.4 shows the values of the LogP [40, 41] parameters and the end-to-end latency (os + or + L) of the three protocols. The LogP parameters were defined in Section 5.1.2. Nrx and Irx perform the same work on the host, so they have (almost) identical send and receive overheads (os and or, respectively). Irx, however, runs a retransmission protocol on the NI, which is reflected in the protocol's larger gap (g = 9.7 µs versus g = 6.7 µs) and latency (L = 8.3 µs versus L = 7.2 µs). Hrx runs a similar protocol on the host and therefore has larger send and receive overheads than Nrx and Irx.

Table 8.5 shows the F&A latencies for the different protocols. Again, Nrx is the fastest protocol (18 µs). Hrx is the slowest protocol (31 µs); it is the only protocol in which F&A operations are handled by the host processor of the node that stores the F&A variable. In an application, handling F&A requests on the host processor can result in interrupts. In this benchmark, however, Hrx receives all F&A requests through polling.

Protocol   Latency (µs)
Nrx        18
Irx        25
Hrx        31

Table 8.5. Fetch-and-add latencies.

8.2.4 Window Size and Receive Buffer Space

To obtain maximum throughput, acknowledgements must flow back to the sender before the send window closes. To avoid sender stalls, all protocols therefore need sufficiently large send windows and receive buffer pools. Figure 8.2 shows the measured peak throughput for various window sizes and for two packet sizes. To speed up sequence number computations for Irx, we use only power-of-two window sizes. In this benchmark, the receiving process copies data from incoming packets to a receive buffer. This is not strictly necessary, but all LFC clients perform such a copy while processing the data they receive. Throughput without this receiver-side copy is discussed later.

[Figure 8.2 plots peak throughput (Mbyte/s) against send-window size (packets) for Nrx, Irx, and Hrx, for 1 Kbyte and 2 Kbyte packets, with a receiver-side copy.]

Fig. 8.2. Peak throughput for different window sizes.

For 1 Kbyte packets, Nrx and Irx need only a small window (W ≤ 4) to attain high throughput. Hrx's throughput still increases noticeably when we grow the window size from W = 4 (45 Mbyte/s) to W = 8 (56 Mbyte/s). A small window is important for Nrx, because each NI allocates W receive buffers per sender. In the retransmitting protocols (Irx and Hrx) NI receive buffers are shared between senders and we need only W receive buffers to allow any single sender to achieve maximum throughput.

Surprisingly, Irx's throughput decreases for W > 8. The reason is that NI-level acknowledgement processing in Irx involves canceling all outstanding retransmission timers. With a larger window, more (virtual) timers must be canceled. (Irx multiplexes multiple virtual timers on a single physical timer; canceling a virtual timer consists of a dequeue operation.) At some point, the slow NI can no longer hide this work behind the current send DMA transfer. The effect disappears when we switch to 2 Kbyte packets, that is, when the send DMA transfers take more time.

Figure 8.3 shows one-way throughput for 1-Kbyte and 2-Kbyte packets, with and without receiver-side copying. For all three protocols, the window size was set to W = 8.

[Figure 8.3 shows four panels of one-way throughput (Mbyte/s) against message size (64 bytes to 1 Mbyte) for Nrx, Hrx, and Irx: 1 Kbyte and 2 Kbyte packets, each with and without a receiver-side copy.]

Fig. 8.3. LFC unicast throughput.

With 1 Kbyte packets and without copying, we see clear differences in throughput between the three protocols. Performing retransmission administration on the host (Hrx) increases the per-packet costs and reduces throughput. When the retransmission work is moved from the host to the NI (Irx), throughput is reduced further. The NI performs essentially the same work as the host in Hrx, but does so more slowly. The LogP measurements in Table 8.4 confirm this: Irx has a larger gap (g) than the other protocols. Increasing the packet size to 2 Kbyte reduces the number of times per-packet costs are incurred and increases the throughput of all protocols, but most noticeably for Irx. For Hrx and Nrx, the improvement is smaller and comes mainly from faster copying at the sending side. Irx, however, is now able to hide more of its per-packet overhead in the latency of larger DMA transfers.

Throughput is reduced when the receiving process copies data from incoming packets to a destination buffer, especially when the message is large and does not fit into the L2 cache. This is important, because all LFC clients do this. Since the maximum memcpy() speed on the Pentium Pro (52 Mbyte/s) is less than the maximum throughput achieved by our protocols (without copying), the copying stage becomes a bottleneck. In addition, memory traffic due to copying interferes with the DMA transfers of incoming packets. With copying, the throughput differences between the three protocols are fairly small. To a large extent, this is due to the use of the NI memory as 'retransmission memory,' which eliminates a copy at the sending side.

8.2.5 Send Buffer Space

While NI receive buffer space is an issue for Nrx, send buffer space is just as important for Irx and Hrx. First, however, we consider the send buffer space requirements for Nrx. Nrx can reuse send buffers as soon as their contents have been put on the wire, so only a few send buffers are needed to achieve maximum throughput between a single sender-receiver pair. With more receivers it is useful to have a larger send pool, in case some receiver does not consume incoming packets promptly. Transmissions to that receiver will be suspended and extra packets are then needed to allow transmissions to other receivers.

In the retransmitting protocols, send buffers cannot be reused until an acknowledgement confirms the receipt of their contents. (This is a consequence of using NI memory as retransmission memory.) Since we do not acknowledge each packet individually, a sender that communicates with many receivers may run out of send buffers. One of our applications, a Fast Fourier Transform (FFT) program, illustrates this problem. In FFT, each processor sends a unique message to every other processor. With 64 processors, these messages are of medium size (2 Kbyte) and fill less than half a window. If the number of send buffers is small, a sender may run out of send buffers before any receiver's half-window acknowledgement has been triggered. If the number of send buffers is large relative to the number of receivers, the sender will trigger half-window acknowledgements before running out of send buffers. In general, we cannot send more than P × (W/2 − 1) packets without triggering at least one half-window acknowledgement, unless packets are lost. (With P = 64 and W = 8, for example, this limit is 192 packets.)

To avoid unnecessary retransmissions when the number of send buffers is small, Irx and Hrx could acknowledge packets individually, but this would lead to increased network occupancy, NI occupancy, and (in the case of Hrx) host occupancy. Instead, Irx and Hrx start an acknowledgement timer for incoming packets. Each receiver maintains one timer per sender. The timer for sender S is started whenever a data packet from S arrives and the timer is not yet running. The timer is canceled when an acknowledgement (possibly piggybacked) travels back to S. If the timer expires, the receiver sends an explicit acknowledgement.

This scheme requires small timeout values for the acknowledgement timer, because a sender needs only a few hundred microseconds to allocate all send buffers. Using OS timer signals to implement the timer on Hrx does not work, because the granularity of OS timers is often too coarse (10 ms is a common value). The Myrinet NI, however, provides access to a clock with a granularity of 0.5 µs and is able to send signals to the user process. We therefore use the NI as an intelligent clock, as follows. We choose a clock period TGRAN and let the NI generate an interrupt every TGRAN µs.

To reduce the number of clock signals, we use two optimizations. First, before generating a clock interrupt, the NI reads a flag in host memory that indicates whether any timers are running. No interrupt is generated when no timers are running. Second, each time the host performs a timer operation it records the current time by reading the Pentium Pro's timestamp counter (see Section 1.6.2). This counter is incremented every clock cycle (5 ns). If necessary, the host invokes timeout handlers. Before generating a clock interrupt, the NI reads the time recorded by the host to decide whether the host recently read its clock. If so, no interrupt is generated.
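The two interrupt-suppression optimizations can be sketched as follows. The sketch simulates both sides in a single process and uses invented names; on the real system the "shared" variables would live in host memory, the host timestamp would come from the timestamp counter, and the check would run on the NI.

/* Illustrative sketch of the 'intelligent clock' used by Hrx (names invented).
 * The NI would run ni_clock_tick() every TGRAN microseconds; before raising a
 * clock interrupt it checks two values the host keeps in host memory: whether
 * any acknowledgement timers are running at all, and when the host last
 * looked at its own clock. Here both sides are simulated in one process. */
#include <stdio.h>

#define TGRAN 5000               /* clock period in microseconds */

/* Shared (host-memory) state that the NI can read. */
static int           timers_running = 0;
static unsigned long host_last_read = 0;   /* host timestamp, in microseconds */

/* Host side: every timer operation records the current time. */
static void host_timer_op(unsigned long now_us, int running)
{
    timers_running = running;
    host_last_read = now_us;     /* stands in for reading the timestamp counter */
}

/* NI side: decide whether a clock interrupt is needed at time now_us. */
static int ni_clock_tick(unsigned long now_us)
{
    if (!timers_running)                 return 0;  /* nothing to expire     */
    if (now_us - host_last_read < TGRAN) return 0;  /* host checked recently */
    return 1;                                       /* raise interrupt       */
}

int main(void)
{
    printf("no timers:        interrupt=%d\n", ni_clock_tick(10000));
    host_timer_op(10000, 1);             /* host starts an acknowledgement timer */
    printf("host just active: interrupt=%d\n", ni_clock_tick(12000));
    printf("host idle:        interrupt=%d\n", ni_clock_tick(20000));
    return 0;
}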

8.2.6 Protocol Comparison

An important difference between the retransmitting (Irx and Hrx) and the nonretransmitting (Nrx) protocols is that the retransmitting protocols do not reserve buffer space for individual senders. In Irx and Hrx, NI receive buffers can be shared by all sending NIs, because an NI that runs out of receive buffers can drop incoming packets, knowing that the senders will eventually retransmit. In Irx and Hrx, a single sender can fill all of an NI's receive buffers. Nrx, in contrast, may never drop packets and therefore allocates a fixed number of NI receive buffers to each sender. In most cases this is wasteful, but since the small bandwidth-delay product of Nrx allows small window sizes, this strategy does not pose problems in medium-size clusters. Nevertheless, this static buffer allocation scheme prevents Nrx from scaling to very large clusters, unless NI memory scales as well.

Acknowledgements in Nrx have a different meaning than in Irx and Hrx. Nrx implements flow control (at the NI level), while Irx and Hrx implement only reliability. In Nrx, an acknowledgement consists of a count of buffers released since the last acknowledgement. In Irx and Hrx, an acknowledgement consists of a sequence number. (A count instead of a sequence number fails when an acknowledgement is lost.) The sequence number indicates which packets have arrived at the receiver (host or NI), but does not indicate which packets have been released.

An obvious disadvantage of Nrx is that it cannot recover from lost or corrupted packets. On Myrinet, we have observed both types of errors, caused by both hardware and software bugs. On the positive side, careful protocols are easier to implement than retransmitting protocols. There is no need to buffer packets for retransmission, to maintain sequence numbers, or to deal with timers. For the same reasons, the nonretransmitting LFC implementations achieve better latencies than the retransmitting implementations. With 1 Kbyte packets there is also a throughput advantage, but this is partly obscured by the memory-copy bottleneck.

In spite of these differences, the microbenchmarks indicate that all three protocols can be made to perform well, including protocols that perform most of their work on the NI. The keys to good performance are to use NI buffers for retransmission and to overlap DMA transfers with useful work.

8.3 Reliable Multicast

We consider two multicast schemes, which differ in where they implement the forwarding of multicast packets. In the Hmc protocols, multicast packets are forwarded by the host processor. In the Imc protocols, the NI forwards multicast packets. In Section 8.3.1, we first compare host-level and interface-level forwarding. Section 8.3.2 compares individual implementations in more detail. Section 8.3.3 discusses deadlock issues. Finally, Section 8.3.4 analyzes the performance of all multicast implementations.

[Figure 8.4 contrasts host-level forwarding, in which each host reinjects the packet through its NI, with interface-level forwarding, in which the NIs forward the packet directly.]

Fig. 8.4. Host-level (left) and interface-level (right) multicast forwarding.

8.3.1 Host-Level versus Interface-Level Packet Forwarding

Host-level forwarding using a spanning tree is the traditional approach for networks without hardware multicast. The sender is the root of a multicast tree and transmits a packet to each of its children. Each receiving NI passes the packet to its host, which reinjects the packet to forward it to the next level in the multicast tree (see Figure 8.4). All host-level forwarding protocols reinject each multicast packet at most once. Instead of making a separate host-to-NI copy for each forwarding destination, the host creates multiple transmission requests for the same packet.

Host-level forwarding has three disadvantages. First, since each reinjected packet was already available in the NI's memory, host-level forwarding results in an unnecessary host-to-NI data transfer at each internal tree node of the multicast tree, which wastes bus bandwidth and processor time. Second, no forwarding takes place unless the host processes incoming packets. If one host does not poll the network in a timely manner, all its children will be affected. Instead of relying on polling, the NI can raise an interrupt, but interrupts are expensive. Third, the critical sender-receiver path includes the host-NI interactions of all the nodes between the sender and the receiver. For each multicast packet, these interactions consist of copying the packet to the host, host processing, and reinjecting the packet.

The NI-level multicast implementations address all three problems. In the Imc protocols, the host does not reinject multicast packets into the network (which saves a data transfer), forwarding takes place even if the host is not polling, and the host-NI interactions are not on the critical path.

Host-level and NI-level multicast forwarding can be combined with the three point-to-point reliability schemes described in the previous section. We implemented all combinations except the combination of interface-level forwarding with host-level retransmission. In such a variant we expect the host to maintain the send and receive window administration, as in Hrx. However, if an NI needs to forward an incoming packet then it needs a sequence number for each of the forwarding destinations. To make this scheme work we would have to duplicate sequence number information (in a consistent way).
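One common way to construct the binary forwarding tree for an arbitrary sender is to renumber the nodes relative to the root, as in the C sketch below. This is an illustrative construction only; LFC's actual tree layout may differ in detail.

/* Compute the forwarding destinations in a binary spanning tree rooted at the
 * sender (illustrative; not necessarily LFC's exact tree). Nodes are
 * renumbered relative to the root, so any sender can be the root of its own
 * tree; relative node r forwards to relative nodes 2r+1 and 2r+2. */
#include <stdio.h>

#define NODES 8

/* Print the (absolute) children to which 'me' forwards a multicast rooted
 * at 'root', for a cluster of 'n' nodes. */
static void forward_targets(int me, int root, int n)
{
    int rel = (me - root + n) % n;               /* my position in root's tree */
    for (int c = 2 * rel + 1; c <= 2 * rel + 2; c++) {
        if (c < n)
            printf("node %d forwards to node %d\n", me, (root + c) % n);
    }
}

int main(void)
{
    int root = 3;                                /* the multicast sender */
    for (int me = 0; me < NODES; me++)
        forward_targets(me, root, NODES);
    return 0;
}

With interface-level forwarding these targets would be computed and served by the NI control program; with host-level forwarding the host computes them and enqueues one transmission request per child for the same packet.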

8.3.2 Acknowledgement Schemes

For reliability, multicast packets must be acknowledged by their receivers, just like unicast packets. To avoid an acknowledgement implosion, most multicast forwarding protocols use a reverse acknowledgement scheme in which acknowledgements flow back along the multicast tree. Our protocols use two reverse acknowledgement schemes: ACK-forward and ACK-receive.

ACK-forward is used by NrxImc (the default implementation), IrxImc, and HrxHmc. In these protocols, acknowledgements are sent at the level (NI or host) at which multicast forwarding takes place. The main characteristic of ACK-forward is that it does not allow a receiver to acknowledge a multicast packet to its parent in the multicast tree until that packet has been forwarded to the receiver's children.

ACK-forward cannot easily be used by NrxHmc or IrxHmc, because in these protocols acknowledgements are sent by the NI and forwarding is performed by the host. NrxHmc and IrxHmc therefore use a simpler scheme, ACK-receive, which acknowledges multicast packets in the same way as unicast packets (i.e., without waiting for forwarding). In the case of NrxHmc, multicast packets can be acknowledged as soon as they have been delivered to the local host. In the case of IrxHmc, multicast packets can be acknowledged as soon as they have been received by the NI.

A potential problem with ACK-receive is that it creates a flow-control loophole that cannot easily be closed by an LFC client. In NrxHmc and IrxHmc, a host receive buffer containing a multicast packet is not released until the packet has been delivered to the client and has been forwarded to all children. Consequently, after processing a packet, the client cannot be sure that the packet is available for reuse, because it may still have to be forwarded. This makes it difficult for the client to implement a fool-proof flow-control scheme (when needed).

8.3.3 Deadlock Issues

With a spanning-tree multicast protocol, it is easy to create a buffer deadlock cycle in which all NIs (or hosts) have filled their receive pools with packets that need to be forwarded to the next NI (or host) in the cycle, which also has a full receive pool. Such deadlocks can be prevented or recovered from in several ways. The NI-level flow control protocol of NrxImc (the default implementation), for example, allows deadlock-free multicasting for a certain class of multicast trees (including binary trees).

Our other protocols do not implement true flow control at the multicast forwarding level and may therefore suffer from deadlocks. For certain types of multicast trees, including binary trees, the ACK-forward scheme prevents buffer deadlocks if sufficient buffer space is available at the forwarding level. Buffer space is sufficient if each receiver can store a full send window from each sender (see Appendix A). NrxImc's NI-level flow control scheme satisfies this requirement. Unfortunately, reserving buffer space for each sender destroys one of the advantages of retransmission: better utilization of receive buffers (see Section 8.2.4).

NrxHmc and IrxHmc do not provide any flow control at the forwarding (i.e., host) level and are therefore susceptible to deadlock. With our benchmarks and applications, however, the number of host receive buffers is sufficiently large that deadlocks do not occur.

8.3.4 Latency and Throughput Measurements

Multicast latency for 64 processors is shown in Figure 8.5. We define multicast latency as the time it takes to reach the last receiver. The top three curves in the graph correspond to the Hmc protocols, which perform multicast forwarding on the host processor. These protocols clearly perform worse than those that perform multicast forwarding on the NI. The reason is simple: with host-level forwarding two extra data copies occur on the critical path at each internal node of the multicast tree.

[Figure 8.5 plots multicast latency (microseconds) against message size (0–1000 bytes) on 64 processors with 1 Kbyte packets and no receiver copy, for IrxHmc, HrxHmc, NrxHmc, IrxImc, and NrxImc.]

Fig. 8.5. LFC multicast latency.

[Figure 8.6 plots multicast throughput (Mbyte/s) against message size (256 bytes to 1 Mbyte) on 64 processors with 1 Kbyte packets and a receiver copy, for NrxImc, NrxHmc, HrxHmc, IrxImc, and IrxHmc.]

Fig. 8.6. LFC multicast throughput.

Figure 8.6 shows multicast throughput on 64 processors, using 1 Kbyte packets and copying at the receiver side. Throughput in this graph refers to the sender's outgoing data rate, not to the summed receiver-side data rate. In contrast with the latency graph (Figure 8.5), this graph shows that interface-level forwarding does not always yield the best results: for message sizes up to 128 Kbyte, for example, HrxHmc achieves higher throughput than IrxImc.

The dip in the HrxHmc curve is the result of HrxHmc's higher receive overhead and a cache effect. For larger messages, the receive buffer no longer fits in the L2 cache, and so the copying of incoming packets becomes more expensive, because these copies now generate extra memory traffic. In HrxHmc, these increased copying costs turn the receiving host into the bottleneck: due to its higher receive overhead, HrxHmc does not manage to copy a packet completely before the next packet arrives. In the other protocols, the receive overhead is lower, so there is enough time to copy an incoming packet before the next packet arrives.

8.4 Parallel-Programming Systems

All applications discussed in this chapter run on one of the following PPSs: Orca, CRL, MPI, or Multigame. Orca, CRL, and MPI were discussed in Chapter 7; Multigame is introduced below. This section describes the communication style of all PPSs and discusses PPS-level performance.

8.4.1 Communication Style

Each of our PPSs implements a small number of communication patterns, which are executed in response to actions in the application. For each PPS, we summarize its main communication patterns and discuss the interactions between these patterns and the communication system.

CRL

In CRL, most communication results from read and write misses. The resulting communication patterns were described in Section 7.3.2 (see Figure 7.17). CRL's invalidation-based coherence protocol does not allow applications to multicast data. (CRL uses LFC's multicast primitive only during initialization and barrier synchronization.) Also, since CRL's single-threaded implementation does not allow a requesting process to continue while a miss is outstanding, applications are sensitive to roundtrip performance. Roundtrip latency is more important than roundtrip throughput, because CRL mainly sends small messages. Control messages are always small and data messages are small because most applications use fairly small regions.

The roundtrip nature of CRL's communication patterns and the small region sizes used in most CRL applications lead to low acknowledgement rates for the Nrx protocols. The roundtrips make piggybacking effective and due to the small data messages few half-window acknowledgements are needed. Finally, NrxImc and NrxHmc never send delayed acknowledgements. This is important: for one of our applications (Barnes, see Section 8.5), for example, HrxHmc sends 31 times as many acknowledgements as NrxImc, the default LFC implementation.

MPI

Since MPI is a message-passing system, programmers can express many different communication patterns. MPI's main restriction is that incoming messages cannot be processed asynchronously, which occasionally forces programmers to insert polling statements into their programs. The sensitivity of an MPI program to the parameters of the communication architecture depends largely on the application's communication rate and communication pattern.

All our MPI applications (QR, ASP, and SOR) use MPI's collective-communication operations. Internally, these operations (broadcast and reduce-to-all) all use broadcasting.

All MPI measurements in this chapter were performed using the same MPICH-based MPI implementation as described in Section 7.4. Recall that MPICH provides a default spanning-tree broadcast implementation built on top of unicast primitives. This default implementation can be replaced with a more efficient 'native' implementation. All MPI measurements in Section 8.5 were performed with a native implementation that uses Panda's broadcast which, in turn, uses one of our implementations of LFC's broadcast. In Section 7.4, we used microbenchmarks to show that this native implementation is more efficient than MPICH's default implementation. Figure 8.7 shows that this difference is also visible at the application level. The figure shows the relative performance of two broadcasting MPI applications, ASP and QR, which we will discuss in more detail in Section 8.5. The MPICH-default measurements were performed using unicast primitives on top of the default LFC implementation (NrxImc). For both ASP and QR, this default broadcast is significantly slower than the implementation that uses the broadcast of NrxImc: ASP runs 25% slower and QR runs 50% slower.

[Figure 8.7 shows, for ASP/MPI and QR/MPI, normalized execution times with the NrxImc broadcast and with MPICH's default broadcast.]

Fig. 8.7. Application-level impact of efficient broadcast support.

MPICH's default broadcast implementation suffers from two problems. First, MPICH forwards entire messages rather than individual packets. For large messages, this eliminates pipelining and reduces throughput. ASP pipelines multiple broadcast messages, each of which consists of multiple packets. In QR, there is no such pipelining of messages; only the packets within a single message can be pipelined. Second, the default implementation cannot reuse NI packet buffers. At each internal node of the multicast tree, the message will be copied to the network interface once for each child. The LFC implementations reuse NI packet buffers to avoid this repeated copying.
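The penalty of forwarding whole messages instead of packets can be illustrated with a simple store-and-forward model; this is an illustrative calculation, not taken from the thesis. With a per-packet transfer time t, messages of k packets, and a forwarding chain of d hops in the multicast tree:

% Illustrative store-and-forward model (not from the original text):
% t = per-packet transfer time, k = packets per message, d = hops in the tree.
\[
T_{\text{packet-by-packet}} \approx (d + k - 1)\,t
\qquad\text{versus}\qquad
T_{\text{message-by-message}} \approx d\,k\,t .
\]

For multi-packet messages and deep trees, per-packet forwarding therefore pays roughly one message latency plus a pipeline fill, while per-message forwarding pays the full message latency at every hop, which matches the behavior described above.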

Orca

Orca programs can perform two types of communication patterns: RPC and totally-ordered broadcast. In their communication behavior, Orca programs resemble message-passing programs, except that there is no asynchronous one-way message send primitive. Such a primitive can be simulated by means of multithreading, but none of our applications does this. (Our Orca applications are dominated by multicast traffic.) In Orca, messages carry the parameters and results of user-defined operations, so programmers have control over message size and message rate.

Multigame

Multigame (MG) is a parallel game-playing system developed by Romein [122]. Given the rules of a board game and a board evaluation function, Multigame automatically searches for good moves, using one of several search strategies (e.g., IDA*). During a search, processors push search jobs to each other. A job consists of (recursively) evaluating a board position. To avoid re-searching positions, positions are cached in a distributed hash table. When a process needs to read a table entry, it sends the job to the owner of the table entry [122]. The owner looks up the entry and continues to work on the job until it needs to access a remote table entry.

Multigame's job descriptors are small (32 bytes). To avoid the overhead of sending and receiving many small messages, Multigame aggregates job messages. The communication granularity depends on the maximum number of jobs per message.

Multigame runs on Panda-ST and receives all messages through polling. Almost all communication consists of small to medium-size one-way messages, which are sent to random destinations. The maximum message size depends on the message aggregation limit. The message rate also depends on this limit and, additionally, on the application-specific cost of evaluating a board position. Senders need not wait for replies, so as long as each process has a sufficient amount of work to do, latency (in the LogP sense) is unimportant. Communication overhead is dominated by send and receive overhead.

8.4.2 Performance Issues

Table 7.2 summarizes the minimum latency and the maximum throughput of characteristic operations for Orca, CRL, and MPI. Rules in Multigame specifications are not easily tied to communication patterns. The most important pattern, however, consists of sending a number of jobs in a single message to a remote processor. All communication consists of Panda-ST point-to-point messages.

Table 7.2 shows that our PPSs add significant overhead to an LFC-level implementation of the same communication pattern, so we expect that these overheads will reduce the application-level performance differences between different LFC implementations.

Client-level optimizations form another dampening factor. Two optimizations are worth mentioning: message combining and latency hiding.

As described above, Multigame performs message combining by aggregating search jobs in per-processor buffers. Instead of sending out a search job as soon as it has been generated, the Multigame runtime system stores jobs in an aggregation queue for the destination processor (see the sketch below). The contents of this queue are not transmitted until a sufficient number of jobs has been placed into the queue. The difference in packet rate between Puzzle-4 and Puzzle-64 in Figure 8.8 shows that message combining significantly reduces the packet rate for all protocols. Of course, the resulting performance gain (see Figure 8.9) is larger for protocols with higher per-packet costs.

CRL and Orca implement latency hiding. Both systems perform RPC-style transactions: a processor sends a request to another machine and waits for the reply. Both systems then start polling the network and process incoming packets while waiting for the reply. Consequently, protocols can compensate for high latencies by processing other packets in their idle periods. (This holds only if the increased latency is not the result of increased host processor overhead.)
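The following C sketch shows the essence of such an aggregation queue. The data structure and names are invented for illustration and the real Multigame runtime system differs in detail (for example, it would also flush queues when a processor runs out of work); the constants match the job size and the Puzzle-4 aggregation limit mentioned in the text.

/* Sketch of Multigame-style message combining (illustrative only; the real
 * runtime system's data structures and names differ). Jobs destined for the
 * same processor are queued and sent as one message once the aggregation
 * limit is reached; a real implementation would also flush on idle. */
#include <stdio.h>

#define MAX_JOBS_PER_MSG 4        /* aggregation limit (cf. Puzzle-4)       */
#define JOB_SIZE 32               /* Multigame job descriptors are 32 bytes */

struct agg_queue { int njobs; };  /* one queue per destination processor    */

static void flush(struct agg_queue *q, int dest)
{
    if (q->njobs == 0) return;
    printf("send %d-byte message with %d jobs to processor %d\n",
           q->njobs * JOB_SIZE, q->njobs, dest);
    q->njobs = 0;
}

static void push_job(struct agg_queue *q, int dest)
{
    q->njobs++;
    if (q->njobs >= MAX_JOBS_PER_MSG)   /* limit reached: combine and send */
        flush(q, dest);
}

int main(void)
{
    struct agg_queue to7 = { 0 };
    for (int i = 0; i < 10; i++)        /* generate ten jobs for processor 7 */
        push_job(&to7, 7);
    flush(&to7, 7);                     /* flush the remainder               */
    return 0;
}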

8.5 Application Performance

This section analyzes the performance of several applications on all five LFC implementations. Section 8.5.1 summarizes the performance results. Sections 8.5.2 to 8.5.10 discuss and analyze the performance of individual applications.

8.5.1 Performance Results

Table 8.6 lists, for each application, the PPS it runs on, its input parameters, sequential execution time, speedup, and parallel efficiency. (Parallel efficiency is defined as speedup divided by the number of processors: E64 = S64/64.) The sequential execution time and parallel efficiency were measured using the default implementation (NrxImc). Sequential execution times range from a few seconds to several minutes; speedups on 64 processors range from poor (Radix) to superlinear (ASP). While we use small problems that easily fit into the memories of 64 processors, 7 out of 10 applications achieve an efficiency of at least 50%. Superlinear speedup occurs because we use fixed-size problems and because 64 processors have more cache memory at their disposal than a single processor.

With the exception of Awari and the Puzzle programs, all programs implement well-known parallel algorithms that are frequently used to benchmark PPSs. All CRL applications (Barnes, FFT, and Radix), for example, are adaptations of programs from the SPLASH-2 suite of parallel benchmark programs [153].

Application   PPS    Problem size                      T1       S64     E64
ASP           MPI    1024 nodes                        49.32    74.73   1.17
Awari         Orca   13 stones                         448.90   31.81   0.50
Barnes        CRL    16,384 bodies                     123.24   23.25   0.36
FFT           CRL    1,048,576 complex floats          4.46     49.56   0.77
LEQ           Orca   1000 equations                    610.90   30.12   0.47
Puzzle-4      MG     15-puzzle, ≤ 4 jobs/message       281.00   45.32   0.71
Puzzle-64     MG     15-puzzle, ≤ 64 jobs/message      281.00   53.63   0.84
QR            MPI    1024 × 1024 doubles               54.16    45.51   0.71
Radix         CRL    3,000,000 ints, radix 128         3.20     10.32   0.16
SOR           MPI    1536 × 1536 doubles               30.91    51.52   0.80

Table 8.6. Application characteristics and timings. T1 is the execution time (in seconds) on one processor; S64 is the speedup on 64 processors; E64 is the efficiency on 64 processors.

We performed the application measurements using the following parameter settings: 64 processors (P = 64), a packet size of 1 Kbyte (PKTSZ = 1 Kbyte), 4096 host receive buffers (HRB = 4096), a send window of 8 packet buffers (W = 8), an interrupt delay of 70 µs (INTD = 70), and a timer granularity of 5000 µs (TGRAN = 5000). For the retransmitting protocols (IrxImc, IrxHmc, and HrxHmc), we use 256 NI send buffers and 384 NI receive buffers (ISB + IRB = 256 + 384 = 640). For the careful protocols (NrxImc and NrxHmc), we use ISB + IRB = 128 + 512 = 640. With these settings, the control program occupies approximately 900 Kbyte of the NI's memory (1 Mbyte). The remaining space is used by the control program's runtime stack.

Communication statistics for all applications are summarized in Figure 8.8, which gives per-processor data and packet rates, broken down according to packet type. The figure distinguishes unicast data packets, multicast data packets, acknowledgements, and synchronization packets. Data and packet rates refer to incoming traffic, so a broadcast is counted as many times as the number of destinations (63).

The goal of the figure is to show that the applications exhibit diverse communication patterns. Compare, for example, Barnes, Radix, and SOR. Barnes has high packet rates and low data rates (i.e., packets are small). Radix, in contrast, has both high packet and data rates. SOR has a low packet rate and a high data rate (i.e., packets are large). These rates were all measured using HrxHmc; the rates on other implementations are different, of course, but show similar differences between applications.

[Figure 8.8 shows, for each application, the per-processor packet rate (packets/processor/s) and data rate (Mbyte/processor/s), broken down into unicast, multicast, acknowledgement, and synchronization packets.]

Fig. 8.8. Data and packet rates of NrxImc on 64 nodes.

Figure 8.9 shows application performance of the alternative implementations relative to the performance of the default implementation (NrxImc). None of the alternative implementations is faster than the default implementation. In several cases, however, the alternatives are slower. Below, these slowdowns are discussed in more detail.

[Figure 8.9 shows, for each application, the slowdown of NrxHmc, IrxImc, IrxHmc, and HrxHmc relative to NrxImc, on a scale from 1.0 to 1.5.]

Fig. 8.9. Normalized application execution times.

8.5.2 All-pairs Shortest Paths (ASP)

ASP solves the All-pairs Shortest Path (ASP) problem using an iterative algorithm (Floyd-Warshall). ASP finds the shortest path between all nodes in a graph. The graph is represented as a distance matrix which is row-wise distributed across all processors. In each iteration, one processor broadcasts one of its 4 Kbyte rows. The algorithm iterates over all of this processor's rows before switching to another processor's rows. Consequently, the current sender can pipeline the broadcasts of its rows.

Figure 8.9 shows that both protocols that use interface-level multicast forwarding perform better than the protocols that use host-level forwarding. This is surprising, because Figure 8.6 showed that IrxImc achieves the worst multicast throughput. Indeed, ASP revealed several problems with host-level forwarding that do not show up in microbenchmarks. First, in ASP, it is essential that the sender can pipeline its broadcasts. Receivers, however, are often still working on one iteration when broadcast packets for the next iteration arrive. These packets are not processed until the receivers invoke a receive primitive. (This is a property of our MPI implementation, which uses only polling; see Section 7.4.) Consequently, the sender is stalled, because acknowledgements do not flow back in time. In the interface-level forwarding protocols, acknowledgements are sent by the NI, not by the host processor. As long as the host supplies a sufficiently large number of free host receive buffers, NIs can continue to deliver and forward broadcast packets.

We augmented the Hmc versions of ASP with application-level polling statements (MPI probe calls), which improved the performance of the host-level protocols. Figure 8.9 shows the improved numbers. Nevertheless, the Hmc protocols do not attain the same performance as the Imc protocols. The remaining difference is due to the processor overhead caused by failed polls and host-level forwarding. The effect of forwarding-related processor overhead is visible in the multicast latency benchmark (see Figure 8.5).

[Figure 8.9 is a bar chart showing, for each application (ASP/MPI, Awari/Orca, Barnes/CRL, FFT/CRL, LEQ/Orca, Puzzle-4/MG, Puzzle-64/MG, QR/MPI, Radix/CRL, and SOR/MPI), the slowdown of NrxHmc, IrxImc, IrxHmc, and HrxHmc relative to NrxImc (horizontal axis: 1.0 to 1.5).]

Fig. 8.9. Normalized application execution times.

8.5.3 Awari

Awari creates an endgame database for Awari, a two-player board game, by means of parallel retrograde analysis [8]. In contrast with top-down search techniques like α-β search, retrograde analysis proceeds bottom-up by making unmoves, starting with the end positions of Awari. The Orca program creates an endgame database DBn, which can be used by a game-playing program. Each entry in the database represents a board position's hash and the board's game-theoretical value. The game-theoretical value represents the outcome of the game in the case that both players make best moves only. DBn contains game-theoretical values for all boards that have at most n stones left on the board. The game-theoretical value is a number between -n and n that indicates the number of pieces that can be won (or lost). We ran Awari with n = 13.

Parallel Awari operates as follows. Each processor stores part of the database. The parents of a board are the boards that can be reached by applying a legal unmove to the board. When a processor updates a board's game-theoretical value it must also update the board's parents. Since the boards are randomly distributed across all processors, it is likely that a parent is located on another processor. A single update may thus result in several remote update operations that are executed by RPCs. To avoid excessive communication overhead, remote updates are delayed and stored in a queue for their destination processor. As soon as a reasonable number of updates has accumulated, they are transferred to their destination with a single RPC. The performance of Awari is determined by these RPCs; broadcasts do not play a major role in this application.

The updates are transferred by a single communication thread, which can run in parallel with the computation thread. Application-level statistics, however, indicate that most updates are transmitted when there is no work for the computation thread. As a result, performance is dominated by the speed at which work can be distributed. Queued updates that must be sent to different destination processors cannot be sent out at a rate that is higher than the rate at which the communication thread can perform RPCs.

Although Awari's data rate appears low (see Figure 8.8), communication takes place in specific program phases. In these phases, communication is bursty and most messages are small despite message combining. Performance is therefore dominated by occupancy rather than roundtrip latency. The LogP parameters in Table 8.4 show that Irx has the highest gap (g = 9.7 µs), followed by Hrx (g = 8.3 µs) and then Nrx (g = 6.7 µs). This ranking is reflected in the application's performance.
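
The following sketch shows the structure of the remote-update combining described above. It is illustrative only: the data layout, the flush threshold, and the RPC stub are assumptions, not the Orca program's actual code.

    #define MAX_PROCS   64            /* cluster size (illustrative)       */
    #define MAX_PENDING 64            /* assumed combining threshold       */

    typedef struct {
        unsigned long board;          /* board position (hash)             */
        int           value;          /* new game-theoretical value        */
    } update_t;

    typedef struct {
        update_t buf[MAX_PENDING];
        int      n;
    } update_queue_t;

    static update_queue_t queue[MAX_PROCS];   /* one queue per destination */

    /* Hypothetical stub: ships a batch of updates to 'dest' in one RPC. */
    extern void rpc_apply_updates(int dest, const update_t *u, int n);

    /* Called when a parent of an updated board lives on processor 'dest'. */
    void queue_remote_update(int dest, unsigned long board, int value)
    {
        update_queue_t *q = &queue[dest];

        q->buf[q->n].board = board;
        q->buf[q->n].value = value;

        /* Message combining: many small updates travel in a single RPC,
         * so performance is bounded by the RPC gap of the communication
         * thread rather than by per-update roundtrips. */
        if (++q->n == MAX_PENDING) {
            rpc_apply_updates(dest, q->buf, q->n);
            q->n = 0;
        }
    }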

8.5.4 Barnes-Hut

Barnes simulates a galaxy using the hierarchical Barnes-Hut N-body algorithm. The program organizes all simulated bodies in a shared oct-tree. Each node in this tree and each body (stored in a leaf) is represented as a small CRL region (88-108 bytes). Each processor owns part of the bodies. Most communication takes place during the phase that computes the gravitational forces that bodies exert on one another. In this phase, each processor traverses the tree to compute for each of its bodies the interaction with other bodies or subtrees. Barnes has a relatively high packet rate, but due to the small regions the data rate is low (see Figure 8.8).

Even though Barnes runs on a different PPS, its performance profile in Figure 8.9 is similar to that of Awari. Since CRL makes little use of LFC's multicast, there is no large difference between interface-level and host-level forwarding. Retransmission support, however, leads to decreased performance. This effect is strongest for Irx, which has the highest interpacket gap (g = 9.7 µs) and is therefore more likely to suffer from NI occupancy under high loads. These high loads occur because Barnes has a high packet rate (see Figure 8.8). Due to CRL's roundtrip communication style (see Section 7.3), Barnes is sensitive to this increase in occupancy.

8.5.5 Fast Fourier Transform (FFT)

FFT performs a Fast Fourier transform on an array of complex numbers. These numbers are stored in a matrix which is partitioned into P^2 square blocks (where P is the number of processors). Each processor owns a vertical stripe of P blocks and each block is stored in a 2 Kbyte CRL region. The main communication phases consist of matrix transposes. During a transpose a personalized all-to-all exchange of all blocks takes place: each processor exchanges each of its blocks (except the block on the diagonal) with a block owned by another processor. Each block transfer is the result of a write miss. The node that generates the write miss sends a small request message to the block's home node, which replies with a data message that contains the block. As explained in Section 8.2.5, this communication pattern puts pressure on the number of send buffers for the retransmitting implementations. Since we have configured these implementations with a fairly large number of send buffers (ISB = 256), however, this poses no problems.

The performance results in Figure 8.9 show no large differences between different LFC implementations. The retransmitting protocols perform slightly worse than the careful protocols. Occasionally, these protocols send delayed acknowledgements. If we increase the delayed acknowledgement timeout, no delayed acknowledgements are sent, and the differences become even smaller.
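
The transpose traffic can be pictured with the following sketch. The rgn_* calls follow CRL's published region interface [74], but their prototypes are paraphrased here and the region-naming helper is an assumption; the sketch shows where the request/reply pair of a write miss arises, not the actual FFT code.

    #include <string.h>

    #define BLOCK_BYTES 2048                  /* one CRL region per block    */

    /* Paraphrased CRL primitives (see Johnson et al. [74]). */
    extern void *rgn_map(unsigned rid);       /* map a region locally        */
    extern void  rgn_unmap(void *rgn);
    extern void  rgn_start_write(void *rgn);  /* write miss: request + reply */
    extern void  rgn_end_write(void *rgn);

    /* Hypothetical helper: region id of the block in column c, row r. */
    extern unsigned block_rid(int c, int r);

    void transpose_my_stripe(int my_rank, int nprocs,
                             unsigned char my_blocks[][BLOCK_BYTES])
    {
        for (int p = 0; p < nprocs; p++) {
            if (p == my_rank)
                continue;                     /* the diagonal block is local */

            void *dst = rgn_map(block_rid(p, my_rank));

            /* Each of these writes misses: a small request goes to the
             * home node, which answers with a 2 Kbyte data message. */
            rgn_start_write(dst);
            memcpy(dst, my_blocks[p], BLOCK_BYTES);
            rgn_end_write(dst);

            rgn_unmap(dst);
        }
    }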

8.5.6 The Linear Equation Solver (LEQ)

LEQ is an iterative solver for linear systems of the form Ax = b. Each iteration refines a candidate solution vector x_i into a better solution x_{i+1}. This is repeated until the difference between x_{i+1} and x_i becomes smaller than a specified bound. In each iteration, each processor produces a part of x_{i+1}, but needs all of x_i as its input. Therefore, all processors exchange their 128-byte partial solution vectors at the end of each iteration (6906 total). Each processor broadcasts its part of x_{i+1} to all other processors. After this exchange, all processors synchronize to determine whether convergence has occurred.

In LEQ, HrxHmc suffers from its host-level fetch-and-add implementation. Processing F&A requests on the host rather than on the NI increases the response time (see Table 8.5) and the occupancy of the processor that stores the F&A variable (processor 0).

Although LEQ is dominated by broadcast traffic, the cost of executing a retransmission protocol appears to be the main performance factor: increased host and NI occupancy lead to decreased performance. This is due to LEQ's communication behavior. All processes broadcast at the same time, congesting the network and the NIs. In ASP and QR, processes broadcast one at a time.
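
LEQ is an Orca program; the following MPI-flavored sketch merely renders its per-iteration communication pattern in the style used for the other sketches in this chapter. The function and buffer names and the use of MPI_Allgather/MPI_Allreduce are illustrative assumptions, not the Orca code.

    #include <mpi.h>

    #define PART_DOUBLES 16          /* 128-byte partial solution vector    */

    /* One LEQ iteration step: exchange the partial vectors and agree on
     * convergence. 'x_new' must hold nprocs * PART_DOUBLES doubles. */
    void leq_exchange(double my_part[PART_DOUBLES], double *x_new,
                      double local_diff, double *global_diff)
    {
        /* All processes broadcast their parts at the same time; with a
         * retransmitting LFC implementation this simultaneous traffic
         * raises host and NI occupancy (unlike ASP and QR, where only
         * one process broadcasts at a time). */
        MPI_Allgather(my_part, PART_DOUBLES, MPI_DOUBLE,
                      x_new, PART_DOUBLES, MPI_DOUBLE, MPI_COMM_WORLD);

        /* Synchronizing convergence test at the end of the iteration. */
        MPI_Allreduce(&local_diff, global_diff, 1, MPI_DOUBLE,
                      MPI_MAX, MPI_COMM_WORLD);
    }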

8.5.7 Puzzle

Puzzle performs a parallel IDA* search to solve the 15-puzzle, a well-known single-player sliding-tile puzzle. We experimented with two versions of Puzzle: in Puzzle-4, Multigame aggregates at most 4 jobs before pushing these jobs to another processor, while in Puzzle-64 up to 64 jobs are accumulated. Both programs solve the same problem, but Puzzle-4 sends many more messages than Puzzle-64.

In Puzzle, all communication is one-way and processes do not wait for incoming messages. As a result, send and receive overhead are more important than NI-level latency and occupancy. HrxHmc has the worst send and receive overheads of our LFC implementations (see Table 8.4) and therefore performs worse than the other implementations. In Puzzle-4, the difference between HrxHmc and the other protocols is larger than in Puzzle-64, because Puzzle-4 needs to send more messages to transfer the same amount of data. Consequently, the higher send and receive overheads are incurred more often during the program's execution.

8.5.8 QR Factorization

QR is a parallel implementation of QR factorization [59] with column distribution. In each iteration, one column, the Householder vector H, is broadcast to all processors, which update their columns using H. The current upper row and H are then deleted from the data set so that the size of H decreases by 1 in each of the 1024 iterations. The vector with maximum norm becomes the Householder vector for the next iteration. This is decided with a Reduce-To-All collective operation to which each processor contributes two integers and two doubles.
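
The Reduce-to-All step can be pictured as follows. This is a simplification: the thesis states that each processor contributes two integers and two doubles, whereas this sketch reduces a single (norm, rank) pair, so it conveys the synchronizing pattern rather than the actual QR code.

    #include <mpi.h>

    /* Decide which processor owns the next Householder vector: every rank
     * contributes its maximum column norm, and MAXLOC returns the winning
     * norm together with the owning rank. */
    int select_next_householder(double my_max_norm, int my_rank)
    {
        struct { double norm; int rank; } in, out;

        in.norm = my_max_norm;
        in.rank = my_rank;

        /* This Reduce-to-All synchronizes all processors in every
         * iteration, which is why the QR broadcaster, unlike ASP's,
         * cannot pipeline several broadcasts ahead of the receivers. */
        MPI_Allreduce(&in, &out, 1, MPI_DOUBLE_INT, MPI_MAXLOC,
                      MPI_COMM_WORLD);

        return out.rank;
    }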

The performance characteristics of QR are similar to those of ASP (see Figure 8.9): the implementations based on host-level forwarding perform worse than those based on interface-level forwarding. The performance of QR is dominated by the broadcasts of column H. As in ASP, host-level forwarding leads to increased processor overhead. In addition, the implementations based on host-level forwarding are sensitive to their higher broadcast latency. At the start of each iteration, each receiving processor must wait for an incoming broadcast. In contrast with ASP, the broadcasting processor cannot get ahead of the receivers by pipelining multiple broadcasts, because the Reduce-to-All synchronizes all processors in each iteration. (Also, due to pivoting, the identity of the broadcasting processor changes in almost every iteration.)

8.5.9 Radix Sort

Radix sorts a large array of (random) integers using a parallel version of the radix sort algorithm. Each processor owns a contiguous part of the array and each part is further subdivided into CRL regions, which act as software cache lines. Communication is dominated by the algorithm's permutation phase, in which each processor moves integers in its partition to some other partition of the array. If the keys are distributed uniformly, each processor accesses all other processors' partitions. We chose a region size (1 Kbyte) that minimizes write sharing in this phase. After the permutation phase each processor reads the new values in its partition and starts the next sorting iteration.

Radix has the highest unicast data rate of all our applications (see Figure 8.8). In each of the three permutation phases, most of the array to be sorted is transferred across the network. Although Radix sends larger messages than Awari, Barnes, and LEQ, it has a similar performance profile as these applications. Radix's messages are still relatively small (at most 1 Kbyte of region data) and each message requires at most two LFC packets, a full packet and a nearly empty packet. Consequently, Radix remains sensitive to the small-message bottleneck, so that NI-level retransmission (IrxImc and IrxHmc) performs worst, followed by host-level retransmission (HrxHmc).

8.5.10 Successive Overrelaxation (SOR)

SOR is used to solve discretized Laplace equations. The SOR program uses red-black iteration to update a 1536 × 1536 matrix. Each processor owns an equal number of contiguous rows of the matrix. In each of the 42 iterations, processors exchange neighboring border rows (12 Kbyte) and perform a single-element (one double) reduction to determine whether convergence has occurred.

Class      Applications          NrxImc  NrxHmc  IrxImc  IrxHmc  HrxHmc
Roundtrip  Barnes, Radix, Awari  +       +       --      --      -
Multicast  ASP, QR               +       --      +       --      --
One-way    Puzzle-4, Puzzle-64   +       +       +       +       --

Table 8.7. Classification of applications.

For SOR, HrxHmc performs worse than the other implementations. With the HrxHmc implementation, the host processor suffers from sliding-window stalls. Each 12 Kbyte message that is sent to a neighbor requires 13 LFC packets, while the window size is only 8 packets. If the destination process is actively polling when the sender's packets arrive, it will send a half-window acknowledgement before the sender is stalled. In SOR, however, all processes send out their messages at approximately the same time, first to their higher-numbered neighbor, then to their lower-numbered neighbor. The destination process will therefore not poll until one of its own sends blocks (due to a closed send window). At that point, the sender has already been stalled.

The implementations that implement NI-level reliability do not suffer from this problem, because they implement the sliding window on the NI. Even if the NI's send window to another NI closes, the host processor can continue to post packets until the supply of free send descriptors is exhausted. Also, if the NI's send window to one NI fills up, it can still send packets to other NIs if the host supplies packets for other destinations.

The performance of HrxHmc improves significantly if the boundary row exchange code is modified so that not all processes send in the same direction at the same time. In practice, this improves the probability that acknowledgements can be piggybacked and reduces the number of window stalls.
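
One way to realize such a modified exchange is sketched below in MPI form: ranks alternate the order of their two border transfers by parity, so every send is quickly matched by a receive on the other side and acknowledgements can be piggybacked. The exact restructuring used in the thesis may differ; the buffer names and the use of MPI_Sendrecv are assumptions.

    #include <mpi.h>

    #define ROW_DOUBLES 1536             /* one 12 Kbyte border row          */

    void exchange_borders(double *top, double *bottom,
                          double *from_up, double *from_down,
                          int rank, int nprocs)
    {
        int up   = (rank > 0)          ? rank - 1 : MPI_PROC_NULL;
        int down = (rank < nprocs - 1) ? rank + 1 : MPI_PROC_NULL;

        if (rank % 2 == 0) {
            /* Phase 1: pair up with the odd neighbor below. */
            MPI_Sendrecv(bottom,    ROW_DOUBLES, MPI_DOUBLE, down, 0,
                         from_down, ROW_DOUBLES, MPI_DOUBLE, down, 0,
                         MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            /* Phase 2: pair up with the odd neighbor above. */
            MPI_Sendrecv(top,       ROW_DOUBLES, MPI_DOUBLE, up, 1,
                         from_up,   ROW_DOUBLES, MPI_DOUBLE, up, 1,
                         MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        } else {
            MPI_Sendrecv(top,       ROW_DOUBLES, MPI_DOUBLE, up, 0,
                         from_up,   ROW_DOUBLES, MPI_DOUBLE, up, 0,
                         MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            MPI_Sendrecv(bottom,    ROW_DOUBLES, MPI_DOUBLE, down, 1,
                         from_down, ROW_DOUBLES, MPI_DOUBLE, down, 1,
                         MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        }
    }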

8.6 Classification

Based on the performance analysis of the individual applications, we have identified three classes of applications with distinct behavior. These classes and an indication of the relative performance of each LFC implementation for each class are shown in Table 8.7. In this table, a '+' indicates that an implementation performs well for a class of applications; a '-' indicates a modest decrease in performance; and '--' indicates a significant decrease in performance.

The performance of roundtrip applications is dominated by roundtrip latency. Since multicast plays no important role in these applications, performance differences between the LFC implementations are determined by differences in the reliability schemes. This class of applications shows that the robustness of retransmission has a price. Both the host-level retransmission schemes and the interface-level retransmission schemes perform worse than the no-retransmission scheme, but for different reasons. Host-level retransmission (HrxHmc) suffers from its higher send and receive overhead and is up to 11% slower than the default implementation (NrxImc). NI-level retransmission (IrxImc and IrxHmc) suffers from increased (NI) occupancy and latency. These implementations are up to 16% slower than the default implementation. For this class of applications, the LogP measurements presented earlier in this chapter give a good indication of the causes of differences in application-level performance.

The performance of the multicast applications is determined by the efficiency of the multicast implementation. The performance results show the advantages of NI-level multicast forwarding. HrxHmc, IrxHmc, and NrxHmc all suffer from host-level forwarding overhead, which takes away time from the application. Host-level forwarding is up to 38% slower than interface-level forwarding in the default implementation.

The 'class' of one-way applications contains both variants of the Puzzle application. In contrast with roundtrip applications, one-way applications do not wait for reply messages, so latency and NI occupancy can be tolerated fairly well. Send and receive overhead, on the other hand, cannot be hidden. As a result, HrxHmc suffers from its increased send and receive overhead; this leads to a performance loss of up to 16% relative to the default implementation.

The remaining applications (FFT, LEQ, and SOR) do not fit into any of these categories.

8.7 Related Work

This chapter has concentrated on NI protocol issues and their relationship with properties of PPSs and applications. Figure 8.10 classifies several existing Myrinet communication systems along two protocol design axes: reliability and multicast. Most systems do not implement retransmission. We assume that this is partly due to the prototype nature of research systems. The figure also shows that few systems provide multicast or broadcast support. Below, we first discuss related work in these two areas. Next, we discuss other NI-related application studies.

[Figure 8.10 arranges the following systems in a grid with reliability (no retransmission, NI retransmits, host retransmits) on one axis and multicast forwarding (NI forwards, host forwards) on the other: FM/MC, PM, Hamlyn, Trapeze, VMMC-2, AM-II, U-Net, FM, VMMC, BIP (small), and BIP (large).]

Fig. 8.10. Classification of communication systems for Myrinet based on the multicast and reliability design issues.

8.7.1 Reliability

Only a few studies have compared careful and retransmitting protocols. The earliest study that we are aware of is by Mosberger and Peterson [106], which compares a careful and a retransmitting protocol implementation for FiberChannel. They note the scalability problems of static buffer reservation (as in Nrx). This problem exists, but the low bandwidth-delay product of modern networks (and protocols) allows the use of static reservation in medium-size clusters. (Another, dynamic solution, used in PM's NI protocol, is described below.) Mosberger and Peterson used only one application, Jacobi. That application shows much larger benefits for careful protocols than we find with our application suite. This may be due to extra copying in their retransmitting protocol. As explained in Section 8.2.2, our retransmitting protocols do not make more copies than our careful protocols.

Some of the protocols in Figure 8.10 combine the strategies used by Nrx, Hrx, and Irx. Like Nrx, for example, PM assumes that the hardware never drops or corrupts packets. Like Irx, though, PM lets senders share NI receive buffers and drops incoming data packets when all buffers are full. When a data packet is dropped, a negative acknowledgement is sent to its sender. PM never drops acknowledgements (negative or positive). Since it is assumed that the hardware delivers all packets correctly, senders always know when one of their packets has been dropped and can retransmit that packet without having to set a retransmission timer.

Active Messages II combines an NI-level (alternating-bit) reliability protocol with a host-level sliding window protocol which is used both for reliability and flow control [37].

Our NI-supported fine-grain timer implementation is similar to the soft timer scheme described by Aron and Druschel [6] —developed independently— who also use polling to implement a fine-grain timer. They optimize kernel-level timers and poll the host's timestamp counter on system calls, exceptions, and interrupts to amortize the state-saving overhead of these events over multiple actions (e.g., network interrupt processing and timer processing). Soft timers use the kernel's clock interrupt as a backup mechanism. This interrupt often has a granularity of at least several milliseconds. Hrx uses the NI's timer as a backup. This timer has a 0.5 µs granularity and is polled in each iteration of the NI's main loop. Our backup mechanism is therefore more precise and can generate an interrupt at a time close to the desired timer expiration time in case the host's timestamp counter is not polled frequently enough. This is only a modest advantage, because it does not address the interrupt overhead problem that can result from infrequent polling of the timestamp counter.
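
For concreteness, the fragment below sketches how such a backup timer could sit in an NI control program's main loop. All names, the slack constant, and the arming interface are assumptions; only the idea, polling a fine-grain NI timer every loop iteration and interrupting the host only when it has missed its soft-timer deadline, follows the text.

    extern unsigned long ni_ticks(void);        /* fine-grain NI timer (0.5 us) */
    extern void          raise_host_interrupt(void);
    extern void          handle_network(void);  /* normal send/receive work     */

    /* Shared with the host (e.g., in NI memory). */
    volatile unsigned long next_expiry    = (unsigned long)-1;  /* set by host */
    volatile unsigned long last_host_poll = 0;                  /* set by host */

    #define POLL_SLACK 32               /* assumed slack before falling back    */

    void ni_main_loop(void)
    {
        for (;;) {
            handle_network();

            unsigned long now = ni_ticks();     /* polled every iteration       */

            /* If a host-level retransmission timer has expired and the host
             * has not polled its timestamp counter recently, generate the
             * backup interrupt close to the desired expiration time. */
            if (now >= next_expiry && now - last_host_poll > POLL_SLACK) {
                raise_host_interrupt();
                next_expiry = (unsigned long)-1;   /* disarm until re-armed     */
            }
        }
    }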

8.7.2 Multicast

To the best of our knowledge, this is the first study that compares a high-performance NI-level and high-performance host-level multicast and their respective impact on application performance. Several papers describe multicast protocols that use NI-level multicast forwarding [16, 56, 68, 80, 146]. This work was described in Section 4.5.3. Some of these papers (e.g., [146]) compare host-level forwarding and NI-level forwarding, but most of these comparisons use a host-level forwarding scheme that copies data to the NI multiple times. We are not aware of any previous study that studies the application-level impact of both forwarding strategies.

8.7.3 NI Protocol Studies

Araki et al. used microbenchmarks and the LogP performance model to compare the performance of Generic Active Messages, Illinois Fast Messages (version 2), BIP, PM, and VMMC-2 [5]. Their study compares communication systems with programming interfaces that are sometimes fundamentally different (e.g., memory-mapped communication in VMMC-2 versus message-based communication in PM and rendezvous-style communication in BIP versus asynchronous message passing in Fast Messages). In our study, we compare five different implementations of the same interface. The most important difference with the work described in this chapter, however, is that Araki et al. do not consider the application-level impact of their results.

In another study [102], Martin et al. do focus on applications, but evaluate only a single communication architecture (Active Messages) and a single programming system (Split-C). Where Martin et al. vary the performance characteristics (the LogGP parameters) of a single communication system, we use five different systems which have different performance characteristics due to the way they divide protocol work between the host processor and the NI. An important contribution of our work is the evaluation of different network interface protocols using a wide variety of parallel applications (fine-grain to medium-grain, unicast and multicast) and PPSs (distributed search, message passing, update-based DSM, and invalidation-based DSM). Each system has its own, fairly large and complex runtime system, which imposes significant communication overheads (see Table 7.2). In addition, three out of four systems do not run immediately on top of one NI protocol layer (LFC), but on an intermediate message-passing layer (Panda).

Bilas et al. discuss NI support for shared virtual memory systems (SVMs) [21]. They study NI support for fetching data from remote memories, for depositing data into remote memories, and for distributed locks. They find that these mechanisms significantly improve the performance of SVM implementations and claim that they are more widely applicable. Using these mechanisms, and by restructuring their SVM, they eliminated all asynchronous host-level protocol processing in their SVM. As a result, interrupts are not needed and polling is used only when a message is expected. This result, however, depends on the ability to push asynchronous protocol processing to the NI. Bilas et al. use a PPS in which asynchronous processing consists of accessing data at a known address or accessing a lock. These actions are relatively simple and can be performed by the NI. PPSs such as Orca and Manta, however, must execute incoming user-defined operations, which cannot easily be handled by the NI.

In a simulation study, Bilas et al. identify bottlenecks in software shared-memory systems [20]. This study is structured around the same three layers as we used in this chapter: low-level communication software and hardware, PPS, and application. Both the communication layer and the PPSs are different from the ones studied in this thesis. Bilas et al. assume a communication layer based on virtual memory-mapped communication (see Chapter 2) and analyze the performance of page-based and fine-grained DSMs. Our work uses a communication layer based on low-level message passing and PPSs that implement message passing or object-based sharing. (CRL uses a similar cache coherence protocol as fine-grained DSMs, but relies on the programmer to define 'cache lines' and to trigger coherence actions. Fine-grained DSMs such as Tempest [119] and Shasta [125] require little or no programmer intervention.)

8.8 Summary

In this chapter, we have studied the performance of five implementations of LFC's communication interface. Each implementation is a combination of one reliability scheme (no retransmission, host-level retransmission, or NI-level retransmission) and one multicast forwarding scheme (host-level forwarding or NI-level forwarding). These five implementations represent different assumptions about the capabilities of network interfaces (availability of a programmable NI processor and its speed) and the operating environment (reliability of the network hardware). We compared the performance of all five implementations at multiple levels. We used microbenchmarks for direct comparisons between the different implementations and for comparisons at the PPS level. We used four different PPSs, which represent different programming paradigms and which exhibit different communication patterns. Most importantly, we also performed an application-level comparison.

These performance comparisons reveal several interesting facts. First, although performance differences are visible, all five LFC implementations can be made to perform well on most LFC-level microbenchmarks. Multicast latency forms an exception: NI-level multicast forwarding yields better multicast latency than host-level forwarding.

Second, all PPSs add large overheads to LFC implementations. Measuring essentially the same communication pattern at multiple levels shows large increases in latency and large reductions in throughput. While this is a logical consequence of the layered structure of the PPSs, it is important to make this observation, because one would expect that these overheads, and not the relatively small performance differences between LFC implementations, will dominate application performance.

Third, in spite of the previous observation, running applications on the different LFC implementations yields significant differences in application execution time. Most of our applications exhibited one of three communication patterns: roundtrip, one-way, or multicast. For each of these patterns, the relative performance of the LFC implementations is similar:

- The roundtrip applications perform best on the nonretransmitting LFC implementations. Retransmission support adds sufficient overhead that it cannot be hidden by applications in this class. The overheads have a larger impact when retransmission is implemented on the NI.

- The multicast applications perform best on the implementations that perform NI-level forwarding. NI-level forwarding yields lower multicast latency and does not waste host-processor cycles on packet forwarding.

- The one-way applications —really two variants of the same application— perform well on all implementations, except on the implementations that implement host-level retransmission. This implementation suffers from its larger send and receive overheads.

Our original implementation of LFC, NrxImc, performs best for all patterns. This implementation, however, optimistically assumes the presence of reliable network hardware and an intelligent NI.

Chapter 9

Summary and Conclusions

The implementation of a high-level programming model consists of multiple layers: a runtime system or parallel-programming system that implements the programming model and one or more communication layers that have no knowledge of the parallel-programming model, but that provide efficient communication mechanisms. This thesis has concentrated on the lower communication layers, in particular on the network interface (NI) protocol layer. A key contribution of this thesis, however, is that it also addresses the interactions of the NI protocol layer with higher-level communication layers and applications. The following sections summarize our work as it relates to each layer and draw conclusions.

9.1 LFC and NI Protocols

We have implemented our ideas in and on top of LFC, a new user-level communication system. In several ways, LFC resembles a hardware device: communication is packet-based and all packets are delivered to a single upcall. However, LFC adds two services that are usually not provided by network hardware: reliable point-to-point and multicast communication. The efficient implementation of both services has been the key to LFC's success.

Efficient communication is of obvious importance to parallel-programming systems (PPSs). While many communication systems provide low-latency, high-throughput point-to-point primitives, very few systems provide efficient multicast implementations. In fact, many do not provide multicast at all. This is unfortunate, because PPSs such as Orca and MPI rely on multicasting to update replicated objects and to implement collective-communication operations, respectively. Moreover, multicast is not an add-on feature: multicasts layered on top of point-to-point primitives perform considerably worse than multicasts that are supported by the bottom-most communication layer (see Chapters 7 and 8).
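
The cost of layering multicast over point-to-point can be pictured with a sketch of host-level, per-packet forwarding along a multicast tree (illustrative only; the send primitive and tree representation are assumptions, not LFC code). Each forwarded packet crosses the host's I/O bus once per child and the forwarding work is charged to the host processor, whereas an NI-level forwarder replicates the packet directly from NI memory.

    #define MAX_CHILDREN 2

    /* Hypothetical reliable unicast send provided by the bottom layer. */
    extern void unicast_send(int dest, const void *pkt, int len);

    struct mcast_tree {
        int nchildren;
        int child[MAX_CHILDREN];
    };

    /* Host-level forwarding, invoked from the packet-receive upcall:
     * deliver the packet locally, then retransmit it once per child. */
    void host_forward(const struct mcast_tree *t, const void *pkt, int len)
    {
        for (int i = 0; i < t->nchildren; i++)
            unicast_send(t->child[i], pkt, len);  /* extra host-to-NI transfer */
    }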


The presence of reliable communication greatly reduces the effort needed to implement a PPS. Compared to fragmentation and demultiplexing —services not provided by LFC— reliability protocols are difficult to implement correctly and efficiently. Nevertheless, all PPSs must deliver data reliably. For efficiency, the default implementation of LFC uses NI support. The NI performs four tasks: flow control for reliability, multicast forwarding, interrupt management, and fetch-and-add processing. The following paragraphs discuss these tasks in more detail.

The default implementation assumes reliable network hardware and implements reliable communication channels by means of a simple, NI-level flow control protocol. This protocol, UCAST, is also the basis of MCAST and RECOV, LFC's NI-supported multicast protocols. Since these protocols assume reliable hardware, however, they can operate only in a controlled environment such as a cluster of computers dedicated to running high-performance parallel jobs.

MCAST is simple and efficient, but does not work for all multicast tree topologies. RECOV works for all topologies, but is more complex and requires additional NI buffer space. Both MCAST and RECOV perform fewer data transfers than host-level store-and-forward multicast protocols, which reduces the number of host cycles spent on multicast processing. The performance measurements in Chapter 8 show that host-level multicasting reduces application performance by up to 38%. These measurements compared the performance of an NI-supported multicast to an aggressive host-level multicast. In practice, host-level multicast implementations are frequently implemented on top of unicast primitives, which further increases the number of redundant data transfers.

Network interrupts form an important source of overheads in communication systems. To reduce the number of unnecessary interrupts, LFC implements a software polling watchdog on the NI. This mechanism delays network interrupts for incoming packets for a certain amount of time. In addition, LFC's polling watchdog monitors host-level packet processing progress to determine whether a network interrupt should be generated.

We have implemented an NI-supported fetch-and-add operation. Panda combines LFC's fetch-and-add with LFC's broadcast to implement a totally-ordered broadcast, which, in turn, is used by Orca. Fetch-and-add also has other applications. Karamcheti and Chien, for example, have used fetch-and-add to implement pull-based messaging [76].

Other types of synchronization primitives can also benefit from NI support. Bilas et al. use the NI to implement distributed locking in a shared virtual memory system [21]. These examples indicate that NI support for synchronization primitives is a good idea. Without such support, synchronization requests generate expensive interrupts. It is not clear yet, however, which primitives exactly must be supported. Different systems require different primitives and, in spite of the generality claimed by the implementors of some mechanisms [21], it is unlikely that one primitive will satisfy every system. Another problem is the virtualization of NI-supported primitives. LFC, for example, supports only one fetch-and-add variable per NI. This constraint was introduced to bound the space occupied by NI variables and to avoid introducing an extra demultiplexing step. Ideally, this type of constraint would be hidden from users.
Active Messages II virtualizes NI-level communication endpoints by using host memory as backing storage for NI memory [37]. This strategy is general, but increases the complexity and decreases the efficiency of the implementation.

It is frequently claimed that programmable NIs are 'too slow'; such claims are often accompanied with references to the failure of I/O channels. LFC performs four different tasks on the NI: flow control for reliability, multicast forwarding, interrupt management, and fetch-and-add processing. Nevertheless, LFC is an efficient communication system. The main issue is not whether NIs should be programmable, but which mechanisms the lowest communication layer should supply and where they should be implemented. Programmable NIs can be considered one tool in a spectrum of tools that help answer this type of question; others include formal analysis and simulation. In this thesis, we studied reliability, multicast, synchronization and interrupt management. Others have investigated NI support for address translation in zero-copy protocols [13, 49, 142], distributed locking [21], and remote memory access [51, 83]. Some of these mechanisms (e.g., remote memory access) have recently found their way into industry standards (the Virtual Interface Architecture) and commercial products.

9.2 Panda

Many parallel-programming systems are sufficiently complex that they can benefit from an intermediate communication layer that provides higher-level communication services than a system like LFC. Panda provides threads, messages, message passing, RPC, and group communication. These services are used in the implementation of three of the four PPSs described in Chapter 7.

Panda's thread package, OpenThreads, dynamically switches between polling and interrupts. By default, interrupts are enabled, but when all threads are idle, OpenThreads disables interrupts and polls the network. This strategy is simple but effective. When a message is expected, it will often be received through polling, which is more efficient than receiving it by means of an interrupt. Unexpected messages generate an interrupt, unless the receiving process polls before LFC's polling watchdog generates the interrupt.

Panda's stream messages allow Panda clients to transmit and receive a message in a pipelined manner: the receiver can start processing an incoming message before it has been fully received. Stream messages are implemented on top of LFC's packet-based interface without introducing extra data copies.

Blocking in message handlers is a difficult problem. Creating a (popup) thread per message allows message handlers to block whenever convenient, but it also removes the ordering between messages, introduces scheduling anomalies, and wastes (stack) space. Disallowing all handlers to block (as in active messages) forces programmers to use asynchronous primitives and continuations whenever any form of blocking (e.g., acquiring a lock) is required. Panda uses single-threaded upcalls to process incoming messages. Message handlers are allowed to block on locks, but cannot wait for the arrival of other messages. If the receiver of a message can decide early that waiting for another message will not be necessary, then the message can be processed without creating a new thread. In other cases, blocking can be avoided by using nonblocking primitives instead of blocking primitives or by using continuations.

Panda and LFC work well together. Panda's thread scheduler uses LFC's interrupt support to switch dynamically between polling and interrupts. Stream messages pipeline the transmission and consumption of LFC packets. Finally, Panda's totally-ordered broadcast combines LFC's fetch-and-add and broadcast primitives.
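
To illustrate the last point, the sketch below shows how a totally-ordered broadcast can be layered on a fetch-and-add and an unordered broadcast: the sender fetches a global sequence number and receivers deliver messages in sequence order. Panda's real implementation differs in detail, and all names, the header layout, and the reordering queue (omitted) are assumptions.

    #include <string.h>

    #define MAX_PAYLOAD 8192                 /* assumed maximum message size  */

    /* Assumed wrappers around LFC primitives. */
    extern unsigned lfc_fetch_and_add(unsigned inc);             /* NI-level F&A */
    extern void     lfc_broadcast(const void *msg, unsigned len);
    extern void     deliver(const void *payload, unsigned len);  /* client upcall */

    struct tob_hdr { unsigned seqno; };

    static unsigned next_deliver;            /* next sequence number to deliver */

    void tob_send(const void *payload, unsigned len)
    {
        unsigned char buf[sizeof(struct tob_hdr) + MAX_PAYLOAD];
        struct tob_hdr *h = (struct tob_hdr *)buf;

        /* The NI-level fetch-and-add hands out a global sequence number
         * without interrupting the host that stores the F&A variable. */
        h->seqno = lfc_fetch_and_add(1);
        memcpy(buf + sizeof *h, payload, len);
        lfc_broadcast(buf, (unsigned)sizeof *h + len);
    }

    /* Broadcast upcall: deliver in sequence-number order; out-of-order
     * messages would be buffered (reordering queue omitted for brevity). */
    void tob_upcall(const void *msg, unsigned len)
    {
        const struct tob_hdr *h = (const struct tob_hdr *)msg;

        if (h->seqno == next_deliver) {
            deliver((const unsigned char *)msg + sizeof *h,
                    len - (unsigned)sizeof *h);
            next_deliver++;
            /* ...then drain any buffered messages that are now in order... */
        }
        /* else: enqueue the message until its sequence number comes up. */
    }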

9.3 Parallel-Programming Systems

Different PPSs have different communication styles and are therefore sensitive to different aspects of the underlying communication system. This is illustrated by the four PPSs studied in Chapter 7.

Orca uses two communication primitives to implement operations on shared objects: remote procedure call and asynchronous, totally-ordered broadcast. The performance of the Orca applications that we studied in Chapter 8 depends on one of the two. Orca operations may block on guards. Since guards occur only at the beginning of an operation, it is easy to suspend the operation without blocking. Without blocking and thread switching, the Orca runtime system creates a small continuation that represents the blocked operation.

Like Orca, the Java-based Manta system provides multithreading and shared objects. Since Manta does not replicate objects, it requires only RPC to implement operations on shared objects. In contrast with Orca, Manta operations can block halfway through. This makes it difficult for the RTS to use small continuations to represent the blocked operation. The RTS therefore creates a popup thread for each operation, unless the compiler can prove that the operation will never block. Another complication in Manta is the presence of a garbage collector. The space occupied by unmarshaled parameters usually cannot be reused until the parameters have been garbage collected. Consequently, unmarshaling leaves a large memory footprint and pollutes large parts of the data cache.

In CRL, most communication patterns are roundtrips. When invalidations are needed, nested RPCs occur. On average, CRL's RPCs are smaller than those of Manta and Orca, because CRL frequently sends small control messages (region requests, invalidations, and acknowledgements). Manta and Orca generate some control traffic, but far less than CRL.

MPI has the simplest programming model of the four PPSs studied in this thesis. In most cases, communication at the MPI level corresponds in a straightforward way with communication at lower levels. MPI's collective-communication operations form an important exception. These high-level primitives are usually implemented by a combination of point-to-point and multicast messages.

While these PPSs generate different communication patterns, they often rely on the same communication mechanisms. All PPSs require reliable communication. Several low-level communication systems (e.g., U-Net and BIP) do not provide reliable communication primitives.

Orca, Manta, and CRL all benefit from LFC's and Panda's mechanisms to reduce the number of network interrupts. Orca and Manta rely on Panda's thread package, OpenThreads, to poll when all threads are idle. In CRL, the same optimization has been hardcoded. Orca and MPI benefit from an efficient multicast implementation. Orca's totally-ordered broadcast is layered (in Panda) on top of LFC's broadcast and fetch-and-add. MPI's collective-communication operations use (through Panda) LFC's broadcast.

Orca, Manta, and MPI all use Panda's stream messages. Stream messages provide a convenient and efficient message interface without introducing intermediate data copies. The Orca and Manta compilers generate operation-specific code to marshal to and from stream messages.

9.4 Performance Impact of Design Decisions

Chapter 8 studied the impact of different reliability and multicast protocols on application performance and showed that the LFC implementation described in Chapters 3 and 4 (NrxImc) performs better than alternative implementations. This implementation, however, is aggressive in two ways. First, it assumes reliable hardware, which means it can be used only in dedicated environments. Second, it delegates some tasks (flow control, multicast, fetch-and-add, and interrupt management) to the NI. These tasks are performed by the NI-level protocols described in Chapter 4.

Additional robustness can be obtained by means of retransmission. We investigated both host-level and NI-level retransmission. The unicast-dominated applications of Chapter 8 show that both types of retransmission have a modest application-level cost. The largest overhead (up to 16%) is incurred by applications in the 'roundtrip' class. The performance cost of moving NI-level components to the host can be large. Performing multicast forwarding on the host rather than on the NI slows down applications by up to 38%.

Chapter 8 also revealed interesting interactions between the structure of applications and the implementation of the underlying communication system(s). In SOR, the exact location (host or NI) of the sliding window protocol had a noticeable impact on performance. In ASP, we added polling statements to reduce the delays that resulted from synchronous host-level multicast forwarding. In some cases, applications can be restructured to work around the weaknesses of a particular communication system. The Puzzle application, for example, can be configured to use a larger aggregation buffer on top of a host-level reliability protocol to reduce the impact of increased send and receive overhead. The performance of SOR was improved by changing the boundary exchange phase.

9.5 Conclusions

Based on our experiences with LFC and the systems layered on top of LFC, we draw the following conclusions:

1. Low-level communication systems should support both polling and interrupt-driven message delivery. We studied four PPSs. All of them use polling and three of them use interrupts (Orca, Manta, and CRL). Transparent emulation of interrupt-driven message delivery is nontrivial —several systems use binary rewriting— or relies on interrupts as a backup mechanism.

2. Low-level communication systems should support multicast. Multicast plays an important role in two of the four PPSs studied (Orca and MPI). Multicast implementations are often layered on top of point-to-point primitives. This always introduces unnecessary data transfers that consume cycles that could have been used by the application. Moreover, if multicast forwarding is implemented at the message rather than the packet level, then this strategy also reduces throughput by removing intramessage pipelining.

3. A variety of PPSs can be implemented efficiently on a simple communication system if all layers cooperate to avoid unnecessary interrupts, thread switches, and data copies. LFC provides a simple interface: reliable, packet-based point-to-point and multicast communication and interrupt management. These mechanisms are used to implement abstractions that preserve efficiency. In Panda, they are used to implement OpenThreads's automatic switching between polling and interrupts, message streams, and totally-ordered multicast. Orca, Manta, and MPI are all built on top of Panda. In addition, Orca and Manta use operation-specific marshaling in combination with stream messages to avoid making extra data copies.

4. An aggressive, NI-supported communication system can yield better performance than a traditional communication system without NI support. We have experimented with a reliability protocol (UCAST), two multicast protocols (MCAST and RECOV), a polling watchdog (INTR), and a fetch-and-add primitive. Additional robustness and simplicity can be obtained by systems that implement all functionality on the host, but these systems perform worse, both at the level of microbenchmarks and at the level of applications.

9.6 Limitations and Open Issues

Although this thesis has a broad scope, several important issues have not been addressed and some issues that have been addressed require further research.

First, LFC lacks zero-copy support. Zero-copy mechanisms are based on DMA rather than PIO and, for sufficiently large transfers, give better throughput than PIO (see Chapter 2). Sender-side zero-copy support is relatively easy to implement and use. For optimal performance, however, data should also be moved directly to its final destination at the receiver side. Some systems (e.g., Hamlyn [32]) achieve this by having the sender specify a destination address in the receiver's address space. PPSs need to be restructured in nontrivial ways to determine this address without introducing extra messages [21, 35]. While zero-copy communication support for PPSs warrants further investigation, we are not aware of any study that shows significant application-level performance advantages of zero-copy communication mechanisms over message-based mechanisms. There are studies that show that parallel applications are relatively insensitive to throughput [102].

Second, more consideration needs to be given to different buffer and timer configurations for the various LFC implementations studied in Chapter 8. We selected buffer and timer settings that yield good performance for each implementation. It is not completely clear how sensitive the different implementations are to different settings.

Finally, the evaluation in Chapter 8 focused on the impact of alternative reliability and multicast schemes. Additional work is needed to determine the impact of NI-supported synchronization and interrupt delivery (i.e., the polling watchdog). Some of our results (not presented in this thesis), however, show a good qualitative match with those presented by Maquelin et al. [101].

Bibliography

[1] A. Agarwal, R. Bianchini, D. Chaiken, K.L. Johnson, D. Kranz, J. Kubiatowicz, B.H. Lim, K. MacKenzie, and D. Yeung. The MIT Alewife Machine: Architecture and Performance. In Proc. of the 22nd Int. Symp. on Computer Architecture, pages 2–13, Santa Margherita Ligure, Italy, June 1995.

[2] A. Agarwal, J. Kubiatowicz, D. Kranz, B.H. Lim, D. Yeung, G. D'Souza, and M. Parkin. Sparcle: An Evolutionary Processor Design for Large-Scale Multiprocessors. IEEE Micro, 13(3):48–61, June 1993.

[3] A. Alexandrov, M.F. Ionescu, K.E. Schauser, and C. Scheiman. LogGP: Incorporating Long Messages into the LogP Model – One Step Closer Towards a Realistic Model for Parallel Computation. In Proc. of the 1995 Symp. on Parallel Algorithms and Architectures, pages 95–105, Santa Barbara, CA, July 1995.

[4] K.V. Anjan and T.M. Pinkston. An Efficient, Fully Adaptive Deadlock Recovery Scheme: DISHA. In Proc. of the 22nd Int. Symp. on Computer Architecture, pages 201–210, Santa Margherita Ligure, Italy, June 1995.

[5] S. Araki, A. Bilas, C. Dubnicki, J. Edler, K. Konishi, and J. Philbin. User-Space Communication: A Quantitative Study. In Supercomputing '98, Orlando, FL, November 1998.

[6] M. Aron and P. Druschel. Soft Timers: Efficient Microsecond Software Timer Support for Network Processing. In Proc. of the 17th Symp. on Operating Systems Principles, pages 232–246, Kiawah Island Resort, SC, December 1999.

[7] H.E. Bal. Programming Distributed Systems. Prentice Hall International Ltd., Hemel Hempstead, UK, 1991.

[8] H.E. Bal and L.V. Allis. Parallel Retrograde Analysis on a Distributed System. In Supercomputing ’95, San Diego, CA, December 1995.


[9] H.E. Bal, R.A.F. Bhoedjang, R.F.H. Hofman, C. Jacobs, K.G. Langendoen, T. Rühl, and M.F. Kaashoek. Performance Evaluation of the Orca Shared Object System. ACM Trans. on Computer Systems, 16(1):1–40, February 1998.

[10] H.E. Bal, R.A.F. Bhoedjang, R.F.H. Hofman, C. Jacobs, K.G. Langendoen, and K. Verstoep. Performance of a High-Level Parallel Language on a High-Speed Network. Journal of Parallel and Distributed Computing, 40(1):49–64, February 1997.

[11] H.E. Bal and M.F. Kaashoek. Object Distribution in Orca using Compile-time and Run-Time Techniques. In Conf. on Object-Oriented Programming Systems, Languages and Applications, pages 162–177, Washington, DC, September 1993.

[12] H.E. Bal, M.F. Kaashoek, and A.S. Tanenbaum. Orca: A Language for Parallel Programming of Distributed Systems. IEEE Trans. on Software Engineering, 18(3):190–205, March 1992.

[13] A. Basu, M. Welsh, and T. von Eicken. Incorporating Memory Management into User-Level Network Interfaces. In Hot Interconnects'97, Stanford, CA, April 1997.

[14] R.A.F. Bhoedjang and K.G. Langendoen. Friendly and Efficient Message Handling. In Proc. of the 29th Annual Hawaii Int. Conf. on System Sciences, pages 121–130, Maui, HI, January 1996.

[15] R.A.F. Bhoedjang, J.W. Romein, and H.E. Bal. Optimizing Distributed Data Structures Using Application-Specific Network Interface Software. In Proc. of the 1998 Int. Conf. on Parallel Processing, pages 485–492, Minneapolis, MN, August 1998.

[16] R.A.F. Bhoedjang, T. Rühl, and H.E. Bal. Efficient Multicast on Myrinet Using Link-Level Flow Control. In Proc. of the 1998 Int. Conf. on Parallel Processing, pages 381–390, Minneapolis, MN, August 1998.

[17] R.A.F. Bhoedjang, T. Rühl, and H.E. Bal. LFC: A Communication Substrate for Myrinet. In 4th Annual Conf. of the Advanced School for Computing and Imaging, pages 31–37, Lommel, Belgium, June 1998.

[18] R.A.F. Bhoedjang, T. Rühl, and H.E. Bal. User-Level Network Interface Protocols. IEEE Computer, 31(11):53–60, November 1998.

[19] R.A.F. Bhoedjang, T. Rühl, R.F.H. Hofman, K.G. Langendoen, H.E. Bal, and M.F. Kaashoek. Panda: A Portable Platform to Support Parallel Programming Languages. In Proc. of the USENIX Symp. on Experiences with Distributed and Multiprocessor Systems (SEDMS IV), pages 213–226, San Diego, CA, September 1993.

[20] A. Bilas, D. Jiang, Y. Zhou, and J.P. Singh. Limits to the Performance of Software Shared Memory: A Layered Approach. In Proc. of the 5th Int. Symp. on High-Performance Computer Architecture, pages 193–202, Orlando, FL, January 1999.

[21] A. Bilas, C. Liao, and J.P. Singh. Using Network Interface Support to Avoid Asynchronous Protocol Processing in Shared Virtual Memory Systems. In Proc. of the 26th Int. Symp. on Computer Architecture, pages 282–293, Atlanta, GA, May 1999.

[22] K.P. Birman. The Process Group Approach to Reliable Distributed Computing. Communications of the ACM, 36(12):37–53, December 1993.

[23] A.D. Birrell and B.J. Nelson. Implementing Remote Procedure Calls. ACM Trans. on Computer Systems, 2(1):39–59, February 1984.

[24] M.A. Blumrich, R.D. Alpert, Y. Chen, D.W. Clark, S.N. Damianakis, E.W. Felten, L. Iftode, K. Li, M. Martonosi, and R.A. Shillner. Design Choices in the SHRIMP System: An Empirical Study. In Proc. of the 25th Int. Symp. on Computer Architecture, pages 330–341, Barcelona, Spain, June 1998.

[25] M.A. Blumrich, C. Dubnicki, E.W. Felten, and K. Li. Protected, User-level DMA for the SHRIMP Network Interface. In Proc. of the 2nd Int. Symp. on High-Performance Computer Architecture, pages 154–165, San Jose, CA, February 1996.

[26] M.A. Blumrich, K. Li, R. Alpert, C. Dubnicki, E.W. Felten, and J. Sandberg. Virtual Memory Mapped Network Interface for the SHRIMP multicomputer. In Proc. of the 21st Int. Symp. on Computer Architecture, pages 142–153, Chicago, IL, April 1994.

[27] N.J. Boden, D. Cohen, R.E. Felderman, A.E. Kulawik, C.L. Seitz, J.N. Seizovic, and W. Su. Myrinet: A Gigabit-per-second Local Area Network. IEEE Micro, 15(1):29–36, February 1995.

[28] E.A. Brewer, F.T. Chong, L.T. Liu, S.D. Sharma, and J.D. Kubiatowicz. Remote Queues: Exposing Message Queues for Optimization and Atomicity. In Proc. of the 1995 Symp. on Parallel Algorithms and Architectures, pages 42–53, Santa Barbara, CA, July 1995.

[29] A. Brown and M. Seltzer. Operating System Benchmarking in the Wake of Lmbench: A Case Study of the Performance of NetBSD on the Intel x86 Architecture. In Proc. of the 1997 Conf. on Measurement and Modeling of Computer Systems (SIGMETRICS), pages 214–224, Seattle, WA, June 1997.

[30] J. Bruck, L. De Coster, N. Dewulf, C.-T. Ho, and R. Lauwereins. On the Design and Implementation of Broadcast and Global Combine Operations Using the Postal Model. IEEE Trans. on Parallel and Distributed Systems, 7(3):256–265, March 1996.

[31] M. Buchanan. Private communication, 1997.

[32] G. Buzzard, D. Jacobson, M. MacKey, S. Marovich, and J. Wilkes. An Implementation of the Hamlyn Sender-managed Interface Architecture. In 2nd USENIX Symp. on Operating Systems Design and Implementation, pages 245–259, Seattle, WA, October 1996.

[33] J. Carreira, J. Gabriel Silva, K.G. Langendoen, and H.E. Bal. Implementing Tuple Space with Threads. In 1st Int. Conf. on Parallel and Distributed Systems, pages 259–264, Barcelona, Spain, June 1997.

[34] C.-C. Chang, G. Czajkowski, C. Hawblitzel, and T. von Eicken. Low Latency Communication on the IBM RS/6000 SP. In Supercomputing '96, Pittsburgh, PA, November 1996.

[35] C.-C. Chang and T. von Eicken. A Software Architecture for Zero-Copy RPC in Java. Technical Report 98-1708, Dept. of Computer Science, Cornell University, Ithaca, NY, September 1998.

[36] J.-D. Choi, M. Gupta, M. Serrano, V.C. Sreedhar, and S. Midkiff. Escape Analysis for Java. In Conf. on Object-Oriented Programming Systems, Languages and Applications, pages 1–19, Denver, CO, November 1999.

[37] B. Chun, A. Mainwaring, and D.E. Culler. Virtual Network Transport Protocols for Myrinet. In Hot Interconnects'97, Stanford, CA, April 1997.

[38] D.D. Clark. The Structuring of Systems Using Upcalls. In Proc. of the 10th Symp. on Operating Systems Principles, pages 171–180, Orcas Island, WA, December 1985.

[39] D.E. Culler, A. Dusseau, S.C. Goldstein, A. Krishnamurty, S. Lumetta, T. von Eicken, and K. Yelick. Parallel Programming in Split-C. In Supercomputing '93, pages 262–273, Portland, OR, November 1993.

[40] D.E. Culler, R. Karp, D. Patterson, A. Sahay, K.E. Schauser, E. Santos, R. Subramonian, and T. von Eicken. LogP: Towards a Realistic Model of Parallel Computation. In 4th Symp. on Principles and Practice of Parallel Programming, pages 1–12, San Diego, CA, May 1993.

[41] D.E. Culler, L.T. Liu, R.P. Martin, and C.O. Yoshikawa. Assessing Fast Network Interfaces. IEEE Micro, 16(1):35–43, February 1996.

[42] L. Dagum and R. Menon. OpenMP: An Industry-Standard API for Shared-Memory Programming. IEEE Computational Science and Engineering, 5(1):46–55, January/March 1998.

[43] W.J. Dally and C.L. Seitz. Deadlock-Free Message Routing in Multiprocessor Interconnection Networks. IEEE Trans. on Computers, 36(5):547–553, May 1987.

[44] E.W. Dijkstra. Guarded Commands. Communications of the ACM, 18(8):453–457, August 1975.

[45] J.J. Dongarra, S.W. Otto, M. Snir, and D.W. Walker. A Message Passing Standard for MPP and Workstations. Communications of the ACM, 39(7):84–90, July 1996.

[46] R.P. Draves, B.N. Bershad, R.F. Rashid, and R.W. Dean. Using Continuations to Implement Thread Management and Communication in Operating Systems. In Proc. of the 13th Symp. on Operating Systems Principles, pages 122–136, Asilomar, Pacific Grove, CA, October 1991.

[47] P. Druschel and G. Banga. Lazy Receiver Processing (LRP): A Network Subsystem Architecture for Server Systems. In 2nd USENIX Symp. on Operating Systems Design and Implementation, pages 261–275, Seattle, WA, October 1996.

[48] P. Druschel, L.L. Peterson, and B.S. Davie. Experiences with a High-Speed Network Adaptor: A Software Perspective. In Proc. of the 1994 Conf. on Communications Architectures, Protocols, and Applications (SIGCOMM), pages 2–13, London, UK, September 1994.

[49] C. Dubnicki, A. Bilas, Y. Chen, S. Damianakis, and K. Li. VMMC-2: Efficient Support for Reliable, Connection-Oriented Communication. In Hot Interconnects'97, Stanford, CA, April 1997.

[50] C. Dubnicki, A. Bilas, K. Li, and J. Philbin. Design and Implementation of Virtual Memory-Mapped Communication on Myrinet. In 11th Int. Parallel Processing Symp., pages 388–396, Geneva, Switzerland, April 1997.

[51] D. Dunning, G. Regnier, G. McAlpine, D. Cameron, B. Shubert, F. Berry, A.M. Merritt, E. Gronke, and C. Dodd. The Virtual Interface Architecture. IEEE Micro, 18(2):66–76, March–April 1998.

[52] A. Fekete, M. F. Kaashoek, and N. Lynch. Implementing Sequentially Consistent Shared Objects Using Broadcast and Point-to-Point Communication. In Proc. of the 15th Int. Conf. on Distributed Computing Systems, pages 439–449, Vancouver, Canada, May 1995.

[53] MPI Forum. MPI: A Message Passing Interface Standard. Int. Journal of Supercomputing Applications, 8(3/4), 1994.

[54] I. Foster, C. Kesselman, and S. Tuecke. The NEXUS Approach to Integrating Multithreading and Communication. Journal of Parallel and Distributed Computing, 37(2):70–82, February 1996.

[55] M. Frigo, C.E. Leiserson, and K.H. Randall. The Implementation of the Cilk-5 Multithreaded Language. In Proc. of the Conf. on Programming Language Design and Implementation, pages 212–223, Montreal, Canada, June 1998.

[56] M. Gerla, P. Palnati, and S. Walton. Multicasting Protocols for High-Speed, Wormhole-Routing Local Area Networks. In Proc. of the 1996 Conf. on Communications Architectures, Protocols, and Applications (SIGCOMM), pages 184–193, Stanford University, CA, August 1996.

[57] R.B. Gillett. Memory Channel for PCI. IEEE Micro, 15(1):12–18, February 1996.

[58] S.C. Goldstein, K.E. Schauser, and D.E. Culler. Lazy Threads: Implementing a Fast Parallel Call. Journal of Parallel and Distributed Computing, 37(1):5–20, August 1996.

[59] G.H. Golub and C. F. van Loan. Matrix Computations. The Johns Hopkins University Press, 3rd edition, 1996.

[60] J. Gosling, B. Joy, and G. Steele. The Java Language Specification. Addison-Wesley, Reading, MA, 1996.

[61] W. Gropp, E. Lusk, N. Doss, and A. Skjellum. A High-Performance, Portable Implementation of the MPI Message Passing Interface Standard. Parallel Computing, 22(6):789–828, September 1996.

[62] The Panda group. The Panda 4.0 Interface Document. Dept. of Mathematics and Computer Science, Vrije Universiteit, April 1998. On-line at http://www.cs.vu.nl/panda/panda4/panda.html.

[63] M.D. Haines. An Open Implementation Analysis and Design of Lightweight Threads. In Conf. on Object-Oriented Programming Systems, Languages and Applications, pages 229–242, Atlanta, GA, October 1997.

[64] M.D. Haines and K.G. Langendoen. Platform-Independent Runtime Optimizations Using OpenThreads. In 11th Int. Parallel Processing Symp., pages 460–466, Geneva, Switzerland, April 1997.

[65] H.P. Heinzle, H.E. Bal, and K.G. Langendoen. Implementing Object-Based Distributed Shared Memory on Transputers. In Transputer Applications and Systems '94, pages 390–405, Villa Erba, Cernobbio, Como, Italy, September 1994.

[66] W.C. Hsieh, K.L. Johnson, M.F. Kaashoek, D.A. Wallach, and W.E. Weihl. Efficient Implementation of High-Level Languages on User-Level Communication Architectures. Technical Report MIT/LCS/TR-616, MIT Laboratory for Computer Science, May 1994.

[67] W.C. Hsieh, M.F. Kaashoek, and W.E. Weihl. Dynamic Computation Migration in DSM Systems. In Supercomputing '96, Pittsburgh, PA, November 1996.

[68] Y. Huang and P.M. McKinley. Efficient Collective Operations with ATM Network Interface Support. In Proc. of the 1996 Int. Conf. on Parallel Processing, volume I, pages 34–43, Bloomingdale, IL, August 1996.

[69] G. Iannello, M. Lauria, and S. Mercolino. Cross-Platform Analysis of Fast Messages for Myrinet. In Proc. of the Workshop on Communication, Architecture, and Applications for Network-based Parallel Computing (LNCS 1362), pages 217–231, Las Vegas, NV, January 1998.

[70] Intel Corporation. Pentium Pro Family Developer’s Manual, 1995.

[71] Intel Corporation. Pentium Pro Family Developer’s Manual, Volume 3: Operating System Writer’s Guide, 1995. 232 Bibliography

[72] SPARC International. The SPARC Architecture Manual: Version 8, 1992.

[73] K.L. Johnson. High-Performance All-Software Distributed Shared Memory. PhD thesis, MIT Laboratory for Computer Science, Cambridge, USA, December 1995. Technical Report MIT/LCS/TR-674.

[74] K.L. Johnson, M.F. Kaashoek, and D.A. Wallach. CRL: High-Performance All-Software Distributed Shared Memory. In Proc. of the 15th Symp. on Operating Systems Principles, pages 213–226, Copper Mountain, CO, December 1995.

[75] M.F. Kaashoek. Group Communication in Distributed Computer Systems. PhD thesis, Vrije Universiteit Amsterdam, 1992.

[76] V. Karamcheti and A.A. Chien. A Comparison of Architectural Support for Messaging in the TMC CM-5 and the Cray T3D. In Proc. of the 22nd Int. Symp. on Computer Architecture, pages 298–307, Santa Margherita Ligure, Italy, June 1995.

[77] R.M. Karp, A. Sahay, E.E. Santos, and K.E. Schauser. Optimal Broadcast and Summation in the LogP Model. In Proc. of the 1993 Symp. on Parallel Algorithms and Architectures, pages 142–153, Velen, Germany, June 1993.

[78] P. Keleher, A.L. Cox, S. Dwarkadas, and W. Zwaenepoel. TreadMarks: Distributed Shared Memory on Standard Workstations and Operating Systems. In Proc. of the Winter 1994 Usenix Conf., pages 115–131, San Francisco, CA, January 1994.

[79] D. Keppel. Register Windows and User Space Threads on the SPARC. Technical Report UWCSE 91-08-01, Dept. of Computer Science and Engineering, University of Washington, Seattle, WA, August 1991.

[80] R. Kesavan and D.K. Panda. Optimal Multicast with Packetization and Network Interface Support. In Proc. of the 1997 Int. Conf. on Parallel Processing, pages 370–377, Bloomingdale, IL, August 1997.

[81] C.H. Koelbel, D.B. Loveman, R. Schreiber, G.L. Steele Jr., and M.E. Zosel. The High Performance Fortran Handbook. MIT Press, Cambridge, MA, 1994.

[82] L.I. Kontothanassis and M.L. Scott. Using Memory-Mapped Network Interface to Improve the Performance of Distributed Shared Memory. In Proc. of the 2nd Int. Symp. on High-Performance Computer Architecture, pages 166–177, San Jose, CA, February 1996.

[83] A. Krishnamurthy, K.E. Schauser, C.J. Scheiman, R.Y. Wang, D.E. Culler, and K. Yelick. Evaluation of Architectural Support for Global Address-Based Communication in Large-Scale Parallel Machines. In Proc. of the 7th Int. Conf. on Architectural Support for Programming Languages and Operating Systems, pages 37–48, Cambridge, MA, October 1996.

[84] V. Kumar, A. Grama, A. Gupta, and G. Karypis. Introduction to Parallel Computing: Design and Analysis of Algorithms. The Benjamin/Cummings Publishing Company, Inc., Redwood City, CA, 1994.

[85] J. Kuskin, D. Ofelt, M. Heinrich, J. Heinlein, R. Simoni, K. Gharachorloo, J. Chapin, D. Nakahira, J. Baxter, M. Horowitz, A. Gupta, M. Rosenblum, and J. Hennessy. The Stanford FLASH Multiprocessor. In Proc. of the 21st Int. Symp. on Computer Architecture, pages 302–313, Chicago, IL, April 1994.

[86] L. Lamport. How to Make a Multiprocessor Computer That Correctly Executes Multiprocess Programs. IEEE Trans. on Computers, C-28(9):690–691, September 1979.

[87] K. Langendoen, R.A.F. Bhoedjang, and H.E. Bal. Automatic Distribution of Shared Data Objects. In Third Workshop on Languages, Compilers and Run-Time Systems for Scalable Computers, Troy, NY, May 1995.

[88] K.G. Langendoen, R.A.F. Bhoedjang, and H.E. Bal. Models for Asynchronous Message Handling. IEEE Concurrency, 5(2):28–37, April–June 1997.

[89] K.G. Langendoen, J. Romein, R.A.F. Bhoedjang, and H.E. Bal. Integrating Polling, Interrupts, and Thread Management. In The 6th Symp. on the Frontiers of Massively Parallel Computation, pages 13–22, Annapolis, MD, October 1996.

[90] M. Lauria and A.A. Chien. MPI-FM: High Performance MPI on Workstation Clusters. Journal of Parallel and Distributed Computing, 40(1):4–18, January 1997.

[91] C.E. Leiserson, Z.S. Abuhamdeh, D.C. Douglas, C.R. Feynman, M.N. Ganmukhi, J.V. Hill, W.D. Hillis, B.C. Kuszmaul, M.A. St. Pierre, D.S. Wells, M.C. Wong-Chan, S.-W. Yang, and R. Zak. The Network Architecture of the Connection Machine CM-5. In Proc. of the 1992 Symp. on Parallel Algorithms and Architectures, pages 272–285, San Diego, CA, June 1992.

[92] K. Li and P. Hudak. Memory Coherence in Shared Virtual Memory Systems. ACM Trans. on Computer Systems, 7(4):321–359, November 1989.

[93] C. Liao, D. Jiang, L. Iftode, M. Martonosi, and D.W. Clark. Monitoring Shared Virtual Memory Performance on a Myrinet-based PC Cluster. In Supercomputing'98, Orlando, FL, November 1998.

[94] J. Liedtke. On Micro-Kernel Construction. In Proc. of the 15th Symp. on Operating Systems Principles, pages 237–250, Copper Mountain, CO, December 1995.

[95] T. Lindholm and F. Yellin. The Java Virtual Machine Specification. Addison-Wesley, Reading, MA, 1997.

[96] P. López, J.M. Martínez, and J. Duato. A Very Efficient Distributed Deadlock Detection Mechanism for Wormhole Networks. In Proc. of the 4th Int. Symp. on High-Performance Computer Architecture, pages 57–66, Las Vegas, NV, February 1998.

[97] H. Lu, S. Dwarkadas, A.L. Cox, and W. Zwaenepoel. Quantifying the Performance Differences between PVM and TreadMarks. Journal of Parallel and Distributed Computing, 43(2):65–78, June 1997.

[98] J. Maassen and R. van Nieuwpoort. Fast Parallel Java. Master's thesis, Dept. of Mathematics and Computer Science, Vrije Universiteit, Amsterdam, The Netherlands, August 1998.

[99] J. Maassen, R. van Nieuwpoort, R. Veldema, H.E. Bal, and A. Plaat. An Efficient Implementation of Java's Remote Method Invocation. In 7th Symp. on Principles and Practice of Parallel Programming, pages 173–182, Atlanta, GA, May 1999.

[100] A.M. Mainwaring, B.N. Chun, S. Schleimer, and D.S. Wilkerson. System Area Network Mapping. In Proc. of the 1997 Symp. on Parallel Algorithms and Architectures, pages 116–126, Newport, RI, June 1997.

[101] O. Maquelin, G.R. Gao, H.H.J. Hum, K.B. Theobald, and X. Tian. Polling Watchdog: Combining Polling and Interrupts for Efficient Message Handling. In Proc. of the 23rd Int. Symp. on Computer Architecture, pages 179–188, Philadelphia, PA, May 1996.

[102] R.P. Martin, A.M. Vahdat, D.E. Culler, and T.E. Anderson. Effects of Communication Latency, Overhead, and Bandwidth in a Cluster Architecture. In Proc. of the 24th Int. Symp. on Computer Architecture, pages 85–97, Denver, CO, June 1997.

[103] M. Martonosi, D. Ofelt, and M. Heinrich. Integrating Performance Monitoring and Communication in Parallel Computers. In Proc. of the 1996 Conf. on Measurement and Modeling of Computer Systems (SIGMETRICS), pages 138–147, Philadelphia, PA, May 1996.

[104] H. McGhan and M. O'Connor. PicoJava: A Direct Execution Engine for Java Bytecode. IEEE Computer, 31(10):22–30, October 1998.

[105] E. Mohr, D.A. Kranz, and R.H. Halstead. Lazy Task Creation: A Technique for Increasing the Granularity of Parallel Programs. IEEE Trans. on Parallel and Distributed Systems, 2(3):264–280, July 1991.

[106] D. Mosberger and L.L. Peterson. Careful Protocols or How to Use Highly Reliable Networks. In Proc. of the Fourth Workshop on Workstation Operating Systems, pages 80–84, Napa, CA, October 1993.

[107] S.S. Mukherjee, B. Falsafi, M.D. Hill, and D.A. Wood. Coherent Network Interfaces for Fine-Grain Communication. In Proc. of the 23rd Int. Symp. on Computer Architecture, pages 247–258, Philadelphia, PA, May 1996.

[108] S.S. Mukherjee and M.D. Hill. The Impact of Data Transfer and Buffering Alternatives on Network Interface Design. In Proc. of the 4th Int. Symp. on High-Performance Computer Architecture, pages 207–218, Las Vegas, NV, February 1998.

[109] G. Muller, R. Marlet, E.-N. Volanschi, C. Consel, C. Pu, and A. Goel. Fast, Optimized Sun RPC Using Automatic Program Specialization. In Proc. of the 18th Int. Conf. on Distributed Computing Systems, pages 240–249, Amsterdam, The Netherlands, May 1998.

[110] B. Nichols, B. Buttlar, and J. Proulx Farrell. Pthreads Programming. O'Reilly & Associates, Inc., Newton, MA, 1996.

[111] M. Oey, K. Langendoen, and H.E. Bal. Comparing Kernel-Space and User-Space Communication Protocols on Amoeba. In Proc. of the 15th Int. Conf. on Distributed Computing Systems, pages 238–245, Vancouver, Canada, May 1995.

[112] S. Pakin, V. Karamcheti, and A.A. Chien. Fast Messages (FM): Efficient, Portable Communication for Workstation Clusters and Massively Parallel Processors. IEEE Concurrency, 5(2):60–73, April–June 1997.

[113] S. Pakin, M. Lauria, and A.A. Chien. High Performance Messaging on Workstations: Illinois Fast Messages (FM) for Myrinet. In Supercomputing '95, San Diego, CA, December 1995.

[114] D. Perkovic and P.J. Keleher. Responsiveness without Interrupts. In Proc. of the Int. Conf. on Supercomputing, pages 101–108, Rhodes, Greece, June 1999.

[115] L.L. Peterson and B.S. Davie. Computer Networks: A Systems Approach. Morgan Kaufmann Publishers, Inc., San Francisco, CA, 1996.

[116] L.L. Peterson, N. Hutchinson, S. O'Malley, and H. Rao. The x-kernel: A Platform for Accessing Internet Resources. IEEE Computer, 23(5):23–33, May 1990.

[117] M. Philippsen and M. Zenger. JavaParty – Transparent Remote Objects in Java. Concurrency: Practice and Experience, pages 1225–1242, November 1997.

[118] L. Prylli and B. Tourancheau. Protocol Design for High Performance Networking: A Myrinet Experience. Technical Report 97-22, LIP-ENS Lyon, July 1997.

[119] S.K. Reinhardt, J.R. Larus, and D.A. Wood. Tempest and Typhoon: User-Level Shared Memory. In Proc. of the 21st Int. Symp. on Computer Architecture, pages 325–337, Chicago, IL, April 1994.

[120] S.H. Rodrigues, T.E. Anderson, and D.E. Culler. High-Performance Local-Area Communication With Fast Sockets. In USENIX Technical Conf., pages 257–274, Anaheim, CA, January 1997.

[121] J.W. Romein, H.E. Bal, and D. Grune. An Application Domain Specific Language for Describing Board Games. In Parallel and Distributed Processing Techniques and Applications, volume I, pages 305–314, Las Vegas, NV, July 1997.

[122] J.W. Romein, A. Plaat, H.E. Bal, and J. Schaeffer. Transposition Driven Work Scheduling in Distributed Search. In AAAI National Conference, pages 725–731, Orlando, FL, July 1999.

[123] T. Rühl and H.E. Bal. A Portable Collective Communication Library using Communication Schedules. In Proc. of the 5th Euromicro Workshop on Parallel and Distributed Processing, pages 297–306, London, United Kingdom, January 1997.

[124] T. Rühl, H.E. Bal, R. Bhoedjang, K.G. Langendoen, and G. Benson. Experience with a Portability Layer for Implementing Parallel Programming Systems. In The 1996 Int. Conf. on Parallel and Distributed Processing Techniques and Applications, pages 1477–1488, Sunnyvale, CA, August 1996.

[125] D.J. Scales, K. Gharachorloo, and C.A. Thekkath. Shasta: A Low Overhead, Software-Only Approach for Supporting Fine-Grain Shared Memory. In Proc. of the 7th Int. Conf. on Architectural Support for Programming Languages and Operating Systems, pages 174–185, Cambridge, MA, October 1996.

[126] I. Schoinas and M.D. Hill. Address Translation Mechanisms in Network Interfaces. In Proc. of the 4th Int. Symp. on High-Performance Computer Architecture, pages 219–230, Las Vegas, NV, February 1998.

[127] J.T. Schwartz. Ultracomputers. ACM Trans. on Programming Languages and Systems, 2(4):484–521, October 1980.

[128] R. Sivaram, R. Kesavan, D.K. Panda, and C.B. Stunkel. Where to Provide Support for Efficient Multicasting in Irregular Networks: Network Interface or Switch? In Proc. of the 1998 Int. Conf. on Parallel Processing, pages 452–459, Minneapolis, MN, August 1998.

[129] M. Snir, P. Hochschild, D.D. Frye, and K.J. Gildea. The Communication Software and Parallel Environment of the IBM SP2. IBM Systems Journal, 34(2):205–221, 1995.

[130] E. Speight, H. Abdel-Shafi, and J.K. Bennett. Realizing the Performance Potential of the Virtual Interface Architecture. In Proc. of the Int. Conf. on Supercomputing, pages 184–192, Rhodes, Greece, June 1999.

[131] E. Speight and J.K. Bennett. Using Multicast and Multithreading to Reduce Communication in Software DSM Systems. In Proc. of the 4th Int. Symp. on High-Performance Computer Architecture, pages 312–323, Las Vegas, NV, February 1998.

[132] E. Spertus, S.C. Goldstein, K.E. Schauser, T. von Eicken, D.E. Culler, and W.J. Dally. Evaluation of Mechanisms for Fine-Grained Parallel Programs in the J-Machine and the CM-5. In Proc. of the 20th Int. Symp. on Computer Architecture, pages 302–313, San Diego, CA, May 1993.

[133] P.A. Steenkiste. A Systematic Approach to Host Interface Design for High-Speed Networks. IEEE Computer, 27(3):47–57, March 1994.

[134] D. Stodolsky, J.B. Chen, and B. Bershad. Fast Interrupt Priority Management in Operating Systems. In Proc. of the USENIX Symp. on Microkernels and Other Kernel Architectures, pages 105–110, San Diego, September 1993.

[135] C.B. Stunkel, J. Herring, B. Abali, and R. Sivaram. A New Switch Chip for IBM RS/6000 SP Systems. In Supercomputing'99, Portland, OR, November 1999.

[136] C.B. Stunkel, R. Sivaram, and D.K. Panda. Implementing Multidestination Worms in Switch-based Parallel Systems: Architectural Alternatives and their Impact. In Proc. of the 24th Int. Symp. on Computer Architecture, pages 50–61, Denver, CO, June 1997.

[137] V.S. Sunderam. PVM: A Framework for Parallel Distributed Computing. Concurrency: Practice and Experience, 2(4):315–339, December 1990.

[138] A.S. Tanenbaum. Computer Networks. Prentice Hall PTR, Upper Saddle River, NJ, 3rd edition, 1996.

[139] A.S. Tanenbaum, R. van Renesse, H. van Staveren, G.J. Sharp, S.J. Mullender, A.J. Jansen, and G. van Rossum. Experiences with the Amoeba Distributed Operating System. Communications of the ACM, 33(2):46–63, December 1990.

[140] H. Tang, K. Shen, and T. Yang. Compile/Run-Time Support for Threaded MPI Execution on Multiprogrammed Shared Memory Machines. In 7th Symp. on Principles and Practice of Parallel Programming, pages 107–118, Atlanta, GA, May 1999.

[141] H. Tezuka, A. Hori, Y. Ishikawa, and M. Sato. PM: An Operating System Coordinated High-Performance Communication Library. In High-Performance Computing and Networking (LNCS 1225), pages 708–717, Vienna, Austria, April 1997.

[142] H. Tezuka, F. O'Carroll, A. Hori, and Y. Ishikawa. Pin-down Cache: A Virtual Memory Management Technique for Zero-copy Communication. In 12th Int. Parallel Processing Symp., pages 308–314, Orlando, FL, March 1998.

[143] C.A. Thekkath and H.M. Levy. Hardware and Software Support for Efficient Exception Handling. In Proc. of the 6th Int. Conf. on Architectural Support for Programming Languages and Operating Systems, pages 110–119, San Jose, CA, October 1994.

[144] R. van Renesse, K.P. Birman, and S. Maffeis. Horus, a Flexible Group Communication System. Communications of the ACM, 39(4):76–83, April 1996.

[145] R. Veldema. Jcc, a Native Java Compiler. Master's thesis, Dept. of Mathematics and Computer Science, Vrije Universiteit, Amsterdam, The Netherlands, August 1998.

[146] K. Verstoep, K.G. Langendoen, and H.E. Bal. Efficient Reliable Multicast on Myrinet. In Proc. of the 1996 Int. Conf. on Parallel Processing, volume III, pages 156–165, Bloomingdale, IL, August 1996.

[147] T. von Eicken, A. Basu, V. Buch, and W. Vogels. U-Net: A User-Level Network Interface for Parallel and Distributed Computing. In Proc. of the 15th Symp. on Operating Systems Principles, pages 303–316, Copper Mountain, CO, December 1995.

[148] T. von Eicken, D.E. Culler, S.C. Goldstein, and K.E. Schauser. Active Messages: a Mechanism for Integrated Communication and Computation. In Proc. of the 19th Int. Symp. on Computer Architecture, pages 256–266, Gold Coast, Australia, May 1992.

[149] T. von Eicken and W. Vogels. Evolution of the Virtual Interface Architecture. IEEE Computer, 31(11):61–68, November 1998.

[150] D.A. Wallach, W.C. Hsieh, K.L. Johnson, M.F. Kaashoek, and W.E. Weihl. Optimistic Active Messages: A Mechanism for Scheduling Communication with Computation. In 5th Symp. on Principles and Practice of Parallel Programming, pages 217–226, Santa Barbara, CA, July 1995.

[151] H. Wasserman, O.M. Lubeck, Y. Luo, and F. Bassetti. Performance Evaluation of the SGI Origin2000: A Memory-Centric Characterization of LANL ASCI Applications. In Supercomputing'97, San Jose, CA, November 1997.

[152] A. Wollrath, J. Waldo, and R. Riggs. Java-Centric Distributed Computing. IEEE Micro, 17(3):44–53, May/June 1997.

[153] S.C. Woo, M. Ohara, E. Torrie, J.P. Singh, and A. Gupta. The SPLASH-2 Programs: Characterization and Methodological Considerations. In Proc. of the 22nd Int. Symp. on Computer Architecture, pages 24–36, Santa Margherita Ligure, Italy, June 1995.

[154] K. Yocum, J. Chase, A. Gallatin, and A. Lebeck. Cut-Through Delivery in Trapeze: An Exercise in Low-Latency Messaging. In The 6th Int. Symp. on High Performance Distributed Computing, pages 243–252, Portland, OR, August 1997.

[155] Y. Zhou, L. Iftode, and K. Li. Performance Evaluation of Two Home-Based Lazy Release-Consistency Protocols for Shared Virtual Memory Systems. In 2nd USENIX Symp. on Operating Systems Design and Implementation, pages 75–88, Seattle, WA, October 1996.

Appendix A

Deadlock issues

As discussed in Section 4.2, LFC's basic multicast protocol cannot use arbitrary multicast trees. With chains, for example, we can create a buffer deadlock (see Figure 4.8). Below, in Section A.1, we informally derive a sufficient condition for deadlock-free multicasting in LFC. The binary trees used by LFC satisfy this condition. The condition applies only to multicasting within a single multicast group. The example in Section A.2 shows that the condition does not guarantee deadlock-free multicasting in overlapping multicast groups.

A.1 Deadlock-Free Multicasting in LFC

Before deriving a condition for deadlock-free multicasting, we first consider the nature of deadlocks in LFC. Recall that LFC partitions each NI's receive buffer space among all possible senders (see Section 4.1). That is, each NI has a separate receive buffer pool for each sender. A deadlock in LFC consists of a cycle C of NIs such that for each NI S with successor R (both in C):

1. S has a nonempty blocked-sends queue BSQ_{S→R} for R, and

2. R has a full NI receive buffer pool RBP_{S→R} for S.

We assume that hosts are not involved in the deadlock cycle. This is true only if all hosts regularly drain the network by supplying free host receive buffers to their NI. This requirement, however, is part of the contract between LFC and its clients (see Section 3.6). If all hosts drain the network, the full receive pools in the deadlock cycle do not contain packets that are waiting (only) for a free host receive buffer. Consequently, each packet p in a full receive pool RBP_{S→R} is waiting to be forwarded. That is, p is a multicast packet that has travelled from S to R and is enqueued on at least one blocked-sends queue involved in a deadlock cycle (not necessarily C). Finally, observe that if RBP_{S→R} is part of deadlock cycle C, then at least one of the packets in RBP_{S→R} has to be enqueued on the blocked-sends queue to the next NI in C (i.e., to the successor of R).

We now formulate a sufficient condition for deadlock-free multicasting in LFC. We assume the system consists of P NIs, numbered 0, ..., P-1. The 'distance' from NI m to NI n is defined as (n - m) mod P: the number of hops needed to reach n from m if all NIs are organized in a unidirectional ring. In a multicast group G, the set of multicast trees T_G (one per group member) cannot yield deadlock if the following condition holds:

For each member g ∈ G and each tree t ∈ T_G, the distance from g's parent to g in t is smaller than each of the distances from g to its children in t.

To see why this is true, assume that the condition is satisfied and that we can construct a deadlock cycle C of length k. According to the nature of deadlocks in LFC, each NI n_i in C holds at least one multicast packet p_i from its predecessor in C that needs to be forwarded to its successor in C. According to the condition, each NI in the cycle must therefore have a larger distance to its successor than to its predecessor. That is, if we denote the deadlock cycle

\[ n_0 \xrightarrow{d_0} n_1 \xrightarrow{d_1} \cdots \xrightarrow{d_{k-2}} n_{k-1} \xrightarrow{d_{k-1}} n_0 \]

where d_i is the distance from n_i to its successor (indices taken modulo k), then we must have

\[ d_0 < d_1 < \cdots < d_{k-1} < d_0, \]

which is impossible.

Using this condition, we can see why LFC's binary trees are deadlock-free. In the binary tree for node 0, it is clear that each node's distances to its children are larger than the distance to the node's parent (see, for example, Figure 4.9). The multicast tree for a node p ≠ 0 is obtained by renumbering each tree node n in the multicast tree of node 0 to (n + p) mod P, where P is the number of processors. This renumbering preserves the distances between nodes, so the condition applies to all trees. Consequently, LFC's binary trees are deadlock-free.

Previous versions of LFC used binomial instead of binary trees [16].¹ Figure A.1 shows a 16-node binomial tree.

¹ This paper ([16]) contains two errors. First, it incorrectly states that forwarding in binomial trees is equivalent to e-cube routing, which is deadlock-free [43]. This is true for some senders (e.g., node 0), but not for all. Second, the paper assumes that binary trees are not deadlock-free.

Fig. A.1. A binomial spanning tree. [Figure: a 16-node binomial tree rooted at node 0; node 0's children are nodes 1, 2, 4, and 8.]

In binomial trees, the distance (as defined above) between any pair of nodes is always a power of two. Moreover, and in contrast with binary trees, binomial trees have the property that the distance between a node and its parent is always greater than the distances between that node and its children. The condition for deadlock-free multicasting in LFC can easily be adapted to apply to trees that have the latter property. Using this adapted version, we can show that binomial multicast trees are also deadlock-free.
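The condition of this section is easy to check mechanically. The C sketch below builds, purely for illustration, one multicast tree per root by renumbering a heap-shaped binary tree rooted at node 0, and verifies that every node's distance to its parent is smaller than its distances to its children. The heap-shaped tree and all function names are assumptions made for this example; the exact shape of LFC's trees is defined by Figure 4.9 and may differ.

```c
/*
 * Checker for the deadlock-freedom condition of Section A.1: in every
 * multicast tree, each node's distance to its parent must be smaller than
 * its distance to each of its children, with dist(m, n) = (n - m) mod P.
 * The root-0 tree used here is a heap-shaped binary tree (an assumption
 * for illustration); the tree for root p renumbers node n to (n + p) mod P.
 */
#include <stdio.h>

#define P 16                            /* number of NIs */

static int dist(int m, int n)
{
    return ((n - m) % P + P) % P;       /* hops from m to n on a unidirectional ring */
}

/* Parent of node n in the tree rooted at p (heap tree, renumbered). */
static int parent(int n, int p)
{
    int v = dist(p, n);                 /* position of n in the root-0 tree */
    if (v == 0)
        return -1;                      /* n is the root */
    return (p + (v - 1) / 2) % P;
}

/* Check the condition for every node of every tree. */
static int trees_are_deadlock_free(void)
{
    for (int p = 0; p < P; p++) {               /* one tree per root p */
        for (int n = 0; n < P; n++) {           /* every node in that tree */
            int par  = parent(n, p);
            int d_in = (par < 0) ? 0 : dist(par, n);
            for (int c = 0; c < P; c++) {       /* every child c of n */
                if (c != n && parent(c, p) == n && d_in >= dist(n, c)) {
                    printf("violation: tree %d, edge %d -> %d\n", p, n, c);
                    return 0;
                }
            }
        }
    }
    return 1;
}

int main(void)
{
    printf("condition %s for all %d trees\n",
           trees_are_deadlock_free() ? "holds" : "violated", P);
    return 0;
}
```

Replacing the heap-shaped tree by a chain, in which every node forwards to its direct successor, makes the check fail, which is consistent with the chain deadlock mentioned at the start of this appendix.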

A.2 Overlapping Multicast Groups

Without deadlock recovery, overlapping multicast groups are unsafe. Consider a 9-node system with three multicast groups G1 = {0, 3, 4, 5, 6}, G2 = {0, 3, 6, 7, 8}, and G3 = {0, 1, 2, 3, 6}. Within each group, we use binary multicast trees. These trees are constructed as if the nodes within each group were numbered 0, ..., 4. Figure A.2 shows the trees for node 0 in G1, for node 3 in G2, and for node 6 in G3. The bold arrows in each tree indicate a forwarding path that overlaps with a forwarding path in another tree. By concatenating these paths we can create a deadlock cycle.

Fig. A.2. Multicast trees for overlapping multicast groups. [Figure: the three 5-node trees rooted at node 0 (in G1), node 3 (in G2), and node 6 (in G3).]

Appendix B

Abbreviations

ADC  Application Device Channel
ADI  Application Device Interface
AM  Active Messages
API  Application Programming Interface
ASCI  Advanced School for Computing and Imaging
ATM  Asynchronous Transfer Mode
BB  Broadcast/Broadcast
BIP  Basic Interface for Parallelism
CRL  C Region Library
DAS  Distributed ASCI Supercomputer
DMA  Direct Memory Access
DSM  Distributed Shared Memory
F&A  Fetch and Add
FM  Fast Messages
GSB  Get Sequence number, then Broadcast
JIT  Just In Time
JVM  Java Virtual Machine
LFC  Link-level Flow Control
MG  Multigame
MPI  Message Passing Interface
MPICH  MPI Chameleon
MPL  Message-Passing Library
MPP  Massively Parallel Processor
NI  Network Interface
NIC  Network Interface Card
OT  OpenThreads


PB  Point-to-point/Broadcast
PIO  Programmed I/O
PPS  Parallel Programming System
RMI  Remote Method Invocation
RPC  Remote Procedure Call
RSR  Remote Service Request
RTS  RunTime System
SAP  Service Access Point
TCP  Transmission Control Protocol
TLB  Translation Lookaside Buffer
UDP  User Datagram Protocol
UTLB  User-managed TLB
VI  Virtual Interface
VMMC  Virtual Memory-Mapped Communication
WWW  World Wide Web

Summary (Samenvatting)

Parallel programs let a number of computers work on a single compute-intensive problem simultaneously, with the aim of solving that problem faster than is possible with one computer. Because it is difficult to make computers cooperate efficiently and correctly, systems have been developed to simplify the writing and execution of parallel programs. We call such systems parallel-programming systems.

Different parallel-programming systems offer different programming paradigms and are used in different hardware environments. This thesis concentrates on programming systems for computer clusters. Such a cluster consists of a collection of computers connected by a network. Most clusters consist of tens of computers, but clusters of hundreds and even thousands of computers exist as well. Characteristic of a computer cluster is that both the computers and the network are relatively cheap due to mass production, and that the computers communicate by exchanging messages over the network rather than through a shared memory.

The exchange of messages is supported by communication software. This software largely determines the performance of the parallel programs that are developed with a parallel-programming system. This thesis, entitled "Communication Architectures for Parallel-Programming Systems", focuses on this communication software and addresses the following questions:

1. Which mechanisms should a communication system offer?

2. How should parallel-programming systems use the mechanisms offered?

3. How should these mechanisms be implemented in the communication system?

With respect to these questions, the thesis makes the following contributions:


1. It shows that a small set of simple communication mechanisms can be used effectively to implement several different parallel-programming systems efficiently. These communication mechanisms have been implemented in a new communication system, LFC, which is described in more detail later in this summary. Several of these mechanisms assume support from the network interface. The network interface is the component that connects a computer to the network. Modern network interfaces are often programmable and can therefore carry out a number of communication tasks that are traditionally performed by the host computer.

2. It describes an efficient and reliable multicast algorithm. Multicast is one of the communication mechanisms mentioned above and plays an important role in several parallel-programming systems. It is a form of communication in which one computer sends a message to several other computers. On networks that do not support multicast in hardware, multicast is implemented by means of forwarding. A multicast message is first sent sequentially by the sender to a (usually) small number of computers; each of those computers receives the message, processes it, and forwards it to other computers, and so on. This style of forwarding is reasonably efficient, because the message is processed and forwarded by several computers at the same time. As described, however, the method is not optimal, because every forwarding computer copies the message back to its network interface. This thesis describes an algorithm that uses the network interface to forward multicast messages more efficiently: as soon as the network interface receives a multicast message, it autonomously forwards the message to other network interfaces and also copies it to the host. The receiving host is no longer involved in forwarding the message (a small sketch of this idea follows this list).

3. It systematically investigates how different divisions of protocol tasks between the host and the network interface affect the performance of parallel programs. The evaluation concentrates on whether or not the network interface is used to implement reliable communication and to forward multicast messages. An interesting result is that certain divisions of work improve the performance of some parallel programs but degrade the performance of others. The impact of a particular division of work turns out to depend partly on the communication patterns that a parallel-programming system uses.
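The following toy simulation illustrates the forwarding idea of contribution 2: each simulated network interface that receives a multicast packet forwards it to its children in the sender's spanning tree and hands exactly one copy to its host, so the host never participates in forwarding. The heap-shaped, renumbered tree and all names are assumptions made for this sketch; this is not LFC's firmware.

```c
/*
 * Toy simulation of NI-level multicast forwarding: an NI that receives a
 * multicast packet first forwards it to its children in the sender's
 * spanning tree and then delivers one copy to its local host.
 */
#include <stdio.h>

#define P 8                     /* number of nodes (hosts with their NIs) */

static int delivered[P];        /* number of copies delivered to each host */

static void ni_receive(int ni, int root)
{
    int pos = ((ni - root) % P + P) % P;    /* position of ni in the sender's tree */

    /* The NI forwards the packet to its children; the host is not involved. */
    for (int c = 2 * pos + 1; c <= 2 * pos + 2; c++)
        if (c < P)
            ni_receive((root + c) % P, root);

    /* One copy goes up to the local host. */
    delivered[ni]++;
}

int main(void)
{
    ni_receive(3, 3);                        /* node 3 multicasts a packet */
    for (int i = 0; i < P; i++)
        printf("host %d received %d copy(ies)\n", i, delivered[i]);
    return 0;
}
```

Running the simulation shows that every host receives exactly one copy, even though only the network interfaces do the forwarding.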

Chapter 1 of this thesis describes the research questions and contributions above in detail.

Chapter 2 gives an overview of existing, efficient communication systems. Almost without exception, these are systems that can be accessed directly by user processes, that is, without operating-system intervention. Although eliminating the operating system yields good performance in simple tests, none of these communication systems fully satisfies the requirements of parallel-programming systems. In particular, many of these communication systems support neither asynchronous message handling nor multicast; both are important in parallel-programming systems. Another important observation is that existing communication systems differ strongly in the way they divide work between the host and the network interface. Some systems keep the work on the network interface to a minimum, because the network interface's processor is generally slow. Other systems, in contrast, run complete communication protocols on the network interface. Chapter 8 of this thesis systematically investigates the impact of different divisions of work.

Chapter 3 describes the design and implementation of LFC, a new communication system for parallel-programming systems. LFC supports five parallel-programming systems. Although these programming systems have widely different communication requirements, all of them have been implemented efficiently on top of LFC. LFC offers the user a simple, low-level communication interface. This interface consists of functions to send network packets reliably to one or more other computers. These packets can be received both synchronously and asynchronously. The interface also contains a simple but useful synchronization primitive (fetch-and-add). The implementation of the interface makes aggressive use of a programmable network interface.

Chapter 4 describes in detail the protocols and tasks that LFC executes on the network interface:

- preventing buffer overflow for the sake of reliable communication (flow control)

- forwarding multicast messages

- delayed generation of interrupts (see the sketch after this list)

- handling synchronization messages
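The third task in this list, delayed interrupt generation, can be pictured with the following toy simulation. It is a sketch with invented names and a made-up time scale, not LFC's firmware: the network interface raises an interrupt for pending packets only if the host has failed to poll them away within a small grace period.

```c
/*
 * Toy simulation of delayed interrupt generation ("polling watchdog"):
 * the NI interrupts the host only when pending packets have gone
 * unclaimed for INTR_DELAY ticks, so a timely poll avoids the interrupt.
 */
#include <stdio.h>

#define INTR_DELAY 3            /* grace period, in simulated ticks */

static int pending;             /* packets handed to host memory, not yet consumed */
static int oldest_arrival = -1; /* arrival time of the oldest pending packet */

static void drain(void)
{
    pending = 0;
    oldest_arrival = -1;
}

static void ni_packet_arrived(int now)
{
    if (pending++ == 0)
        oldest_arrival = now;
    printf("t=%d: packet arrived (pending=%d)\n", now, pending);
}

static void host_poll(int now)
{
    if (pending > 0)
        printf("t=%d: host poll consumed %d packet(s); no interrupt needed\n",
               now, pending);
    drain();
}

static void ni_tick(int now)    /* executed by the NI every tick */
{
    if (pending > 0 && now - oldest_arrival >= INTR_DELAY) {
        printf("t=%d: no poll within %d ticks; NI raises an interrupt\n",
               now, INTR_DELAY);
        drain();                /* the interrupt handler consumes the packets */
    }
}

int main(void)
{
    ni_packet_arrived(0);
    ni_tick(1);
    host_poll(2);               /* the host polls in time: interrupt avoided */

    ni_packet_arrived(5);
    for (int t = 6; t <= 9; t++)
        ni_tick(t);             /* no poll this time: interrupt fires at t=8 */
    return 0;
}
```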

LFC assumes that the network hardware neither corrupts nor loses packets, but that is not sufficient to guarantee reliable communication between processes. It is also necessary that message receivers have enough buffer space to store incoming messages. To guarantee this, LFC implements a simple flow-control protocol on the network interface.

LFC lets the network interface forward multicast messages without host intervention. This turns out to be implementable as a simple extension of the flow-control protocol mentioned above.

LFC allows users to receive messages synchronously, by letting the host actively test whether a message has arrived (polling), or asynchronously, by letting the network interface generate an interrupt. A host can test quickly whether a message has arrived, but handling an interrupt is usually slow. If a message is expected, polling is therefore more efficient than using interrupts. If it is not known when the next message will arrive, polling is generally expensive, because most polls will not find any messages. Since interrupts are expensive, LFC lets the network interface generate an interrupt only after a short delay. This way, the receiving host gets a chance to retrieve the message by polling and to avoid the interrupt.

LFC offers a simple synchronization primitive: fetch-and-add. This function atomically reads and increments the value of a shared integer variable. LFC stores fetch-and-add variables in the memory of the network interface and lets the network interface handle incoming fetch-and-add requests autonomously.

Chapter 5 evaluates the performance of LFC. The tasks that LFC executes on the network interface are relatively simple, but since the processor of a network interface is generally slow, it is not a priori clear whether it is wise to execute all these tasks on the network interface. The performance measurements in Chapter 5 show, however, that LFC performs excellently, despite its aggressive use of the relatively slow network interface.

Chapter 6 describes Panda, a communication system that offers a higher-level interface than LFC. Panda is implemented on top of LFC and offers the user threads, message passing, Remote Procedure Call (RPC), and group communication. These abstractions often simplify the implementation of a parallel-programming system. Panda uses knowledge available in the thread system to decide automatically whether incoming messages should be received by polling or through interrupts. Normally, Panda uses interrupts, but as soon as all threads are blocked, Panda switches to polling. Panda treats each incoming message as a byte stream and allows a receiving process to start reading this stream before the message has been received in its entirety. Finally, Panda implements totally ordered group communication using a simple but effective combination of LFC's synchronization and broadcast functions.
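The last combination, totally ordered group communication built from fetch-and-add and broadcast, can be sketched as follows. The code is a single-process illustration with invented names rather than Panda's code: a sender first fetch-and-adds a shared counter to obtain a global sequence number and then broadcasts; receivers buffer out-of-order messages and deliver them in sequence-number order.

```c
/*
 * Sketch of totally ordered group communication: obtain a global sequence
 * number with fetch-and-add, then broadcast; receivers deliver messages
 * in sequence-number order regardless of arrival order.
 */
#include <stdio.h>
#include <string.h>

#define MEMBERS  2
#define MAX_MSGS 8

/* In LFC the counter lives in NI memory; here it is just a variable. */
static int seqno_counter;

static int fetch_and_add(int *var, int inc)
{
    int old = *var;
    *var += inc;
    return old;
}

/* Per-member reordering buffer: slot s holds the message with sequence number s. */
static char buffered[MEMBERS][MAX_MSGS][32];
static int  next_to_deliver[MEMBERS];

static void deliver_in_order(int member)
{
    while (buffered[member][next_to_deliver[member]][0] != '\0') {
        printf("member %d delivers #%d: %s\n", member, next_to_deliver[member],
               buffered[member][next_to_deliver[member]]);
        next_to_deliver[member]++;
    }
}

/* Broadcast packets may arrive in any order; the sequence number restores order. */
static void receive(int member, int seq, const char *msg)
{
    strncpy(buffered[member][seq], msg, sizeof buffered[member][seq] - 1);
    deliver_in_order(member);
}

int main(void)
{
    /* Two broadcasts obtain their global sequence numbers first. */
    int sa = fetch_and_add(&seqno_counter, 1);   /* 0 */
    int sb = fetch_and_add(&seqno_counter, 1);   /* 1 */

    /* Member 0 receives the broadcasts in order... */
    receive(0, sa, "update A");
    receive(0, sb, "update B");

    /* ...member 1 receives them reversed, yet delivers them in the same order. */
    receive(1, sb, "update B");   /* buffered until #0 arrives */
    receive(1, sa, "update A");
    return 0;
}
```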
Chapter 7 describes four parallel-programming systems that have been implemented using LFC and Panda. Orca, CRL, and Manta are object-based systems that allow processes on different computers to share objects and to communicate through those shared objects. MPI, in contrast, is based on the explicit exchange of messages between processes on different computers (message passing). Orca and Manta are language-based systems (based on Orca and Java, respectively). CRL and MPI are not supported by a compiler.

Although Orca, CRL, and Manta offer comparable programming abstractions, they implement those abstractions in different ways. Orca replicates some objects and uses RPC and group communication to transport the parameters of operations on shared objects. Manta does not replicate objects and uses only RPC to transport operation parameters. CRL replicates objects, but uses invalidation to keep the values of replicated objects consistent. Moreover, CRL does not transfer operation parameters, but object data. Orca, Manta, and MPI have all been implemented using Panda; CRL has been implemented directly on top of LFC.

The chapter shows that all these systems can be implemented efficiently on LFC and Panda. In the case of Orca, a relatively complex system, two optimizations are introduced to achieve this. First, the Orca compiler generates specialized code for marshaling and unmarshaling messages that contain operation parameters. This avoids unnecessary copying of data and unnecessary interpretation of type descriptions. Second, Orca uses continuations instead of threads to represent blocked operations on objects compactly. Previous Orca implementations used threads instead of continuations. Threads, however, take up more space and can be activated in unpredictable orders.
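The continuation technique mentioned above can be illustrated with the following sketch. It is an invented, single-process example rather than Orca's runtime code: a blocked operation is represented by a small record (a retry function plus saved state) queued at the object, and the queued records are re-run whenever the object changes.

```c
/*
 * Sketch of representing blocked object operations with continuations
 * instead of threads: a blocked "get" on an empty bin is stored as a
 * small record and retried when a "put" changes the object.
 */
#include <stdio.h>
#include <stdlib.h>

/* A continuation: the function to retry plus its (small) saved state. */
typedef struct continuation {
    int  (*retry)(void *obj, void *state);  /* returns 1 when the operation completed */
    void *state;
    struct continuation *next;
} continuation_t;

/* A shared object with a queue of blocked operations. */
typedef struct {
    int value;
    int full;
    continuation_t *blocked;
} bin_t;

static void block_on(bin_t *bin, int (*retry)(void *, void *), void *state)
{
    continuation_t *c = malloc(sizeof *c);
    c->retry = retry;
    c->state = state;
    c->next  = bin->blocked;
    bin->blocked = c;
}

/* Re-run blocked continuations whenever the object changes. */
static void object_changed(bin_t *bin)
{
    continuation_t **cp = &bin->blocked;
    while (*cp) {
        continuation_t *c = *cp;
        if (c->retry(bin, c->state)) {      /* guard now holds: operation done */
            *cp = c->next;
            free(c);
        } else {
            cp = &c->next;
        }
    }
}

/* Operation "get": guarded on the bin being full. */
static int try_get(void *obj, void *state)
{
    bin_t *bin = obj;
    if (!bin->full)
        return 0;                           /* guard fails: stay blocked */
    *(int *)state = bin->value;
    bin->full = 0;
    printf("get completed: %d\n", *(int *)state);
    return 1;
}

static void get(bin_t *bin, int *result)
{
    if (!try_get(bin, result))
        block_on(bin, try_get, result);     /* no thread is parked, just this record */
}

static void put(bin_t *bin, int v)
{
    bin->value = v;
    bin->full  = 1;
    object_changed(bin);                    /* wakes a blocked get, if any */
}

int main(void)
{
    bin_t bin = { 0, 0, NULL };
    int r;
    get(&bin, &r);      /* blocks: the bin is empty, so a continuation is queued */
    put(&bin, 42);      /* the queued continuation now runs and completes */
    return 0;
}
```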

Chapter 8 describes and evaluates alternative implementations of LFC's communication interface. In total, five implementations are studied, including the LFC implementation described in Chapters 3 through 5. These implementations differ in the way they divide protocol tasks between the host and the network interface, in particular tasks related to reliability and multicast. The chapter concentrates on the impact of these different divisions of work on the performance of parallel programs. Because all LFC implementations implement the same communication interface, this impact can be studied without modifying the parallel-programming systems.

Chapter 8 studies the behavior of nine parallel programs. These programs use Orca, CRL, MPI, and Multigame. Orca, CRL, and MPI have already been discussed above. Multigame is a declarative parallel-programming system that searches game trees automatically and in parallel. Performance measurements show that the LFC implementation described in Chapters 3 through 5 always yields the best parallel-program performance. This LFC implementation assumes that the network hardware is reliable and uses the network interface to forward multicast messages. The other implementations assume that the network hardware is unreliable and therefore implement a more expensive reliability protocol. Some implementations run this protocol on the network interface, others on the host.

An interesting outcome of the measurements is that performing more work on the network interface improves the performance of some parallel programs but degrades the performance of others. In particular, using the network interface to implement reliability can work out either positively or negatively. By contrast, it always appears beneficial to let the network interface (and not the host) forward multicast messages. The impact of a particular division of work turns out to depend partly on the communication patterns that a parallel-programming system uses. For communication patterns that contain many relatively small RPC transactions, increasing the load on the network interface is unfavorable. For asynchronous communication patterns, on the other hand, it is useful to off-load the host and shift work to the network interface.

Chapter 9 concludes this thesis. The work presented shows that a communication system with a simple interface can efficiently support a variety of parallel-programming systems. Such a communication system must offer interrupts, polling, and multicast. The network interface plays an important role in the implementation of the communication system. Multicast, in particular, can be implemented efficiently with the help of the network interface. The performance of parallel applications, however, is determined not only by the communication system, but also by communication libraries such as Panda and by the parallel-programming systems themselves. This thesis describes several techniques to make these layers perform well, too, and relates properties of communication systems and properties of parallel-programming systems to the performance of parallel programs.

Curriculum Vitae

Personal data

Name: Raoul A.F. Bhoedjang

Born: 14 December 1970, The Hague, The Netherlands

Nationality: Dutch

Address: Dept. of Computer Science, Upson Hall, Cornell University, Ithaca, NY 14853–7501, USA

Email: [email protected]

WWW: http://www.cs.cornell.edu/raoul/

Education

November 1992: Master’s degree in Computer Science (cum laude) Dept. of Computer Science, Vrije Universiteit, Amsterdam, The Netherlands

August 1988: Athenaeum degree Han Fortmann College, Heerhugowaard, The Netherlands


Professional Experience

Aug. 1999 – present: Researcher, Dept. of Computer Science, Cornell University, Ithaca, NY, USA

July 1998 – July 1999: Researcher, Dept. of Computer Science, Vrije Universiteit, Amsterdam, The Netherlands

July 1994 – July 1998: Graduate student, Dept. of Computer Science, Vrije Universiteit, Amsterdam, The Netherlands

Feb. 1994 – June 1994: Programmer, Dept. of Computer Science, Vrije Universiteit, Amsterdam, The Netherlands

Dec. 1993 – Jan. 1994: Teaching assistant (Compiler Construction), Dept. of Computer Science, Vrije Universiteit, Amsterdam, The Netherlands

Nov. 1993: Visiting student, Dept. of Computer Science, Cornell University, Ithaca, NY, USA

Sept. 1993 – Oct. 1993: Visiting student, Laboratory for Computer Science, MIT, Cambridge, MA, USA

Feb. 1993 – Sept. 1993: Teaching assistant (Operating Systems), Dept. of Computer Science, Vrije Universiteit, Amsterdam, The Netherlands

July 1992 – Sept. 1992: Summer Scholarship student, Edinburgh Parallel Computing Centre, Scotland, UK

Jan. 1992 – June 1992: Exchange student, Computing Dept., Lancaster University, Lancaster, UK

Sept. 1991 – Dec. 1991: Teaching assistant (Introduction to Programming), Dept. of Computer Science, Vrije Universiteit, Amsterdam, The Netherlands

Publications

Journal publications

1. R.A.F. Bhoedjang, T. Rühl, and H.E. Bal. User-Level Network Interface Protocols. IEEE Computer, 31(11):53–60, November 1998.

2. H.E. Bal, R.A.F. Bhoedjang, R. Hofman, C. Jacobs, K.G. Langendoen, T. Rühl, and M.F. Kaashoek. Performance Evaluation of the Orca Shared Object System. ACM Trans. on Computer Systems, 16(1):1–40, February 1998.

3. H.E. Bal, K.G. Langendoen, R.A.F. Bhoedjang, and F. Breg. Experience with Parallel Symbolic Applications in Orca. J. of Programming Languages, 1998.

4. K.G. Langendoen, R.A.F. Bhoedjang, and H.E. Bal. Models for Asynchronous Message Handling. IEEE Concurrency, 5(2):28–37, April–June 1997.

5. H.E. Bal, R.A.F. Bhoedjang, R. Hofman, C. Jacobs, K.G. Langendoen, and K. Verstoep. Performance of a High-Level Parallel Language on a High-Speed Network. J. of Parallel and Distributed Computing, 40(1):49–64, February 1997.

Conference publications

1. T. Kielmann, R.F.H. Hofman, H.E. Bal, A. Plaat, and R.A.F. Bhoedjang. MAGPIE: MPI's Collective Communication Operations for Clustered Wide Area Systems. In Conf. on Principles and Practice of Parallel Programming (PPoPP'99), pages 131–140, Atlanta, GA, May 1999.

2. T. Kielmann, R.F.H. Hofman, H.E. Bal, A. Plaat, and R.A.F. Bhoedjang. MPI's Reduction Operations in Clustered Wide Area Systems. In Message Passing Interface Developer's and User's Conference (MPIDC'99), pages 43–52, Atlanta, GA, March 1999.


3. R.A.F. Bhoedjang, J.W. Romein, and H.E. Bal. Optimizing Distributed Data Structures Using Application-Specific Network Interface Software. In Int. Conf. on Parallel Processing, pages 485–492, Minneapolis, MN, August 1998.

4. R.A.F. Bhoedjang, T. Rühl, and H.E. Bal. Efficient Multicast on Myrinet Using Link-Level Flow Control. In Int. Conf. on Parallel Processing, pages 381–390, Minneapolis, MN, August 1998. (Best paper award).

5. K.G. Langendoen, J. Romein, R.A.F. Bhoedjang, and H.E. Bal. Integrating Polling, Interrupts, and Thread Management. In The 6th Symp. on the Frontiers of Massively Parallel Computation (Frontiers '96), pages 13–22, Annapolis, MD, October 1996.

6. T. Rühl, H.E. Bal, R. Bhoedjang, K.G. Langendoen, and G. Benson. Experience with a Portability Layer for Implementing Parallel Programming Systems. In The 1996 Int. Conf. on Parallel and Distributed Processing Techniques and Applications, pages 1477–1488, Sunnyvale, CA, August 1996.

7. R.A.F. Bhoedjang and K.G. Langendoen. Friendly and Efficient Message Handling. In Proc. of the 29th Annual Hawaii Int. Conf. on System Sciences, pages 121–130, Maui, HI, January 1996.

8. K.G. Langendoen, R.A.F. Bhoedjang, and H.E. Bal. Automatic Distribution of Shared Data Objects. In B. Szymanski and B. Sinharoy, editors, Languages, Compilers and Run-Time Systems for Scalable Computers, pages 287–290, Troy, NY, May 1995. Kluwer Academic Publishers.

9. H.E. Bal, K.G. Langendoen, and R.A.F. Bhoedjang. Experience with Parallel Symbolic Applications in Orca. In T. Ito, R.H. Halstead, Jr, and C. Queinnec, editors, Parallel Symbolic Languages and Systems (PSLS'95), number 1068 in Lecture Notes in Computer Science, pages 266–285, Beaune, France, October 1995.

10. R.A.F. Bhoedjang, T. Rühl, R. Hofman, K.G. Langendoen, H.E. Bal, and M.F. Kaashoek. Panda: A Portable Platform to Support Parallel Programming Languages. In Proc. of the USENIX Symp. on Experiences with Distributed and Multiprocessor Systems (SEDMS IV), pages 213–226, San Diego, CA, September 1993.

Technical reports

1. R. Veldema, R.A.F. Bhoedjang, and H.E. Bal. Distributed Shared Memory Management for Java. In 6th Annual Conf. of the Advanced School for Computing and Imaging, Lommel, Belgium, June 2000.

2. R. Veldema, R.A.F. Bhoedjang, and H.E. Bal. Jackal: A Compiler-Based Implementation of Java for Clusters of Workstations. In 5th Annual Conf. of the Advanced School for Computing and Imaging, pages 348–355, Heijen, The Netherlands, June 1999.

3. R.A.F. Bhoedjang, T. Rühl, and H.E. Bal. LFC: A Communication Substrate for Myrinet. In 4th Annual Conf. of the Advanced School for Computing and Imaging, pages 31–37, Lommel, Belgium, June 1998.

4. H.E. Bal, R.A.F. Bhoedjang, R. Hofman, C. Jacobs, K.G. Langendoen, T. Rühl, and M.F. Kaashoek. Orca: a Portable User-Level Shared Object System. Technical Report IR-408, Dept. of Mathematics and Computer Science, Vrije Universiteit, Amsterdam, The Netherlands, July 1996.

5. H.E. Bal, R.A.F. Bhoedjang, R. Hofman, C. Jacobs, K.G. Langendoen, and K. Verstoep. Performance of a High-Level Parallel Language on a High-Speed Network. Technical Report IR-400, Dept. of Mathematics and Computer Science, Vrije Universiteit, Amsterdam, The Netherlands, 1996.

6. K.G. Langendoen, J. Romein, R.A.F. Bhoedjang, and H.E. Bal. Integrating Polling, Interrupts, and Thread Management. In 2nd Annual Conf. of the Advanced School for Computing and Imaging, pages 168–173, Lommel, Belgium, June 1996.

7. R.A.F. Bhoedjang and K.G. Langendoen. User-Space Solutions to Thread Switching Overhead. In 1st Annual Conf. of the Advanced School for Computing and Imaging, pages 397–406, Heijen, The Netherlands, May 1995.