MASSACHUSETTS INSTITUTE OF TECHNOLOGY ARTIFICIAL INTELLIGENCE LABORATORY
A.I. Technical Report No. 1445 September, 1993
Robust, High-Speed Network Design for Large-Scale Multiprocessing
Andre´ DeHon [email protected]
Abstract: Large-scale multiprocessing remains an elusive, yet promising paradigm for achieving high-performance computation. As machine size scales upward, there are two important aspects of multiprocessor systems which will generally get worse rather than better: (1) interprocessor communication latency will increase and (2) the probability that some component in the system will fail will increase. Both of these problems can prevent us from realizing the potential benefits of large-scale multiprocessing. In this document we consider the problem of designingnetworks which simultaneously minimize communication latency while maximizing fault tolerance for large-scale multiprocessors. Using a synergy of techniques including connection topologies, routing protocols, signalling techniques, and packaging technologies we assemble integrated, system-level solutions to this network design problem. In particular, we recommend the use of multipath, multistage networks, simple, source-responsible routing protocols, stochastic fault-avoidance, dense three- dimensional packaging, low-voltage, series-terminated transmissionline signalling, and scan based diagnostic and reconfiguration.
Acknowledgements: This report describes research done at the Artificial Intelligence Laboratory of the Mas- sachusettsInstitute of Technology. Support for the Laboratory’s Artificial Intelligence Research is provided in part by the AdvancedResearch Projects Agency under Office of Naval Research contract N00014-91-J-1698. This material is based upon work supported under a National Science Foundation Graduate Fellowship. Any opinions, findings, conclusions or recommendations expressed in this publication are those of the author and do not necessarily reflect the views of the National Science Foundation. Acknowledgments
The ideas presented herein are the collaborative product of the MIT Transit Project under the direction of Dr. Thomas Knight, Jr. The viewpoint is my own and, as such, the description reflects my perspective and biases on the subject matter. Nonetheless, the whole picture presented here was shaped by the effort of the group and would definitely have been less complete without such a collective effort. At the expense of slighting some who’s influences and efforts I may not, yet, fully appreciated, I feel it only appropriate to acknowledge many by name. Knight provided many of the seed ideas which have formed the basis of our works. Preliminary work with the MIT Connection Machine Project and with the Symbolics VLSI group raised many of the issues and ideas which the Transit Project inherited. Collaboration with Tom Leighton, Bruce Maggs, and Charles Leiserson on interconnection topologies has also been instrumental in developing many of the ideas presented here. Knight, of course, provided the project with the oversight and encouragement to make this kind of effort possible. Henry Minsky has been a key player in the Transit effort since its inception. Minsky has helped shape ideas on topology, routing, and packaging. Without his VLSI efforts, RN1 would not have existed and all of us would have suffered greatly on other VLSI projects. Along with Knight, Alex Ishii and Thomas Simon have been invaluable in the effort to understand and develop high-performance signalling strategies. While Simon has a very different viewpoint than myself on most issues, the alternative perspective has generally been healthy and produced greater understanding. The packaging effort would remain arrested in the conceptual stage without the efforts of Fred Drenckhahn. Frederic Chong and Eran Egozy provided the cornerstone for our network organization evaluations and developments. Their efforts took vague notions and turned them into quantifiable entities which allowed us greater insight. Eran Egozy spearheaded the original METRO effort which also included Minsky, Simon, Samuel Peretz, and Matthew Becker. Simon helped shape the MBTA effort. The MBTA effort has also benefitted from contributions by Minsky, Timothy Kutscha, David Warren, and Ian Eslick. Patrick Sobalvarro, Michael Bolotski, Neil Brock, and Saed Younis have also offered notable help during the course of this effort. This work has also been shaped by numerous discussionswith people in “rival” research groups here at MIT. Notably, discussions with Anant Agarwal, John Kubiatowicz, David Chaiken, and Kirk Johnson in the MIT Alewife Project have been quite useful. William Dally, Michael Noakes, Ellen Spertus, Larry Dennison, and Deborah Wallach of the Concurrent VLSI Architecture group have all provided many useful comments and feedback. Noakes and George Andy Boughton have provided much valuable technical support during the life of this project. I am also indebted to William Weihl, Gregory Papadopoulos, and Donald Troxel for their direction. Our productivity during this effort has been enhanced by the availability of software in source form. Thishasallowedustocustomizeexistingtoolstosuit our novel needs and correct deficiencies. Notably, we have benefitted from software from the GNU Project, the Berkeley CAD group, and the Active Messages group at Berkeley. We have also been fortunate that several companies offer generous university programs through which they have made tools available to us. Intel has been most helpful in providing a rich suite of tools for working with the 80960 microprocessor series in source form. Actel provided sufficient discountson theirFPGA toolstomaketheirFPGAs useful forour prototypingefforts. Exemplarhas
i also provided generous discounts on their synthesis tools. Cadence provided us with Verilog-XL for simulation. Logic Modeling Corporation has made their library of models for Verilog available. MetaSoftware provided us with HSpice. Many of our efforts could not have succeeded without the support of these industrial strength tools. This research is supported in part by the Defense Advanced Research Projects Agency under contracts N00014-87-K-0825 and N00014-91-J-1698. This material is based upon work supported under a National Science Foundation Graduate Fellowship. Any opinions, findings, conclusions or recommendations expressed in this publication are those of the author and do not necessarily reflect the views of the National Science Foundation. Fabrication of printed-circuit boards and integrated circuits was provided by the MOSIS Project. We are greatly indebted to the careful handling and attention they have given to our projects. Particularly we are grateful for the help provided by Terry Dosek, Wes Hansford, Sam DelaTorre, and Sam Reynolds.
ii Contents
I Introduction and Background 1
1 Introduction 2