Machine CM-5 Technical Summary

Total Page:16

File Type:pdf, Size:1020Kb

Machine CM-5 Technical Summary (M5 Connection Machine CM-5 Technical Summary Corporation -lrrlrll S llY~r =ir-~Ur - Connection Machine CM-5 Technical Summary November 1993 EMImtm~t NUMBI-RUIS~~Rn; First printing, October 1991 Revised, November 1992 Revised, November 1993 The information in this document is subject to change without notice and should not be construed as a commitment by Thinking Machines Corporati. Thining Machines reserves the right to make changes to any product described herein. Although the informanationin this document has been reviewed and is believed to be reliable, Thinking Machines Corpotioa assumes no liability for emrs in this document. Thinking Machines does not assume any liability arising from the application or use of any information or product described herein Connection Machine is a registered trademark of Thinking Machines Corporation. CM, CM-2,CM-200, CM-5, CM-5 Scale 3, and DataVault are trademarks of Thinking Machines Corporation. CMos; CMAX,and Prism are trademarks of Thinking Machines Corporation. C * is a registered trademark of Thinking Machines Corporation. Paris, *Lisp, and CM Fortran are trademarks of Thinking Machines Corporation. CMMD,CMSSL, and CMX11are trademarks of Thinking Machines Corporation. CMviewis a trademark of Thinking Machines Corporation. FastGraph is a trademark of Thinking Machines Corporation. FastGraphU.S. Patent No. 5,247,694 Scalable Computing (SC) is a trademark of Thinking Machines Corporation. Scalable Disk Array (SDA)is a trademark of Thinking Machines Corporation. Thinking Machines® is a registered trademark of Thinking Machines Corporation. AVSis a trademark of Advanced Visualization Systems, Inc. Ethernet is a trademark of Xerox Corporation. IBM is a trademark of International Business Machines Corporation. Motif is a trademark of The Open Software Foundation, Inc. NFS and UNIX are trademarks of UNIX System Laboratories. SPARCand SPARstation are trademarks of SPARCInternational, Inc. Products bearing a SPARCtrademark are based upon an architecture developed by Sun Microsystems, Inc. SPARCstationis licensed exclusively to Sun Microsystems, Inc. Sun, Sun-4, and Sun Workstation are trademarks of Sun Microsystems, Inc. VMEbus is a trademark oif Motorola Corporation The X Window System is a trademark of the Massachusetts Institute of Technology. Copyright C 1991-1993 by Thinking Machines Corporation. All rights reserved. Thinking Machines Corporation 245 First Street Cambridge, Massachusetts 02142-1264 (617) 234-1000 Contents iAs. _sli~-- Part I Introduction Chapter 1 Supercomputing and Parallelism ......... 3 1.1 Parallelism.......................... .. .. ............... 4 1.2 Parallel Programming ................. ......... ............... 5 1.3 Advantages of a Universal Architecture . .... .... ................ 9 1.4 Looking Ahead ...................... .......................... 11 Chapter 2 The Basic Components of the CM-5 ...... ............... 13 2.1 Processors............................... ................ 13 2.2 Networks................................ ................ 15 2.3 I/O ..................................... ............... 17 2.4 A Universal Architecture ................... ................ 18 Chapter 3 Data Parallel Programming.............. ................ 19 3.1 Data Sets and Distributed Memory .......... ................ 19 3.2 Interconnected Data Structures ............. ................ 21 3.3 Interprocessor Communications ............. ............... 21 3.4 Conditionals............................ ................ 27 3.5 In Summary .............................. 28 3.6 More Information to Come ...................................... 29 Novemer 1993 Coppight @1993 Thining Machines Coporation iii - - Connection Machine CM-5 TechnicalSummary % ~~~~Psss~~~~~%........B ... 81MMI111a - ----- ----------- ---- .------- Part II CM-S Software Chapter 4 Connection Machine Software .................... 33 4.1 Base System Software ........... ... ... .. ................... 33 4.2 Languages and Libraries ......... ......... ......... 33 ...... .. 4.3 CM Software Summarized ....... .................... 35 Chapter 5 The Operating System: CMOST .......... 37 5.1 CMOST and the CM-5 Architecture ......... ............... 38 5.2 CMOST and the Users ..................... ............... 40 5.3 CMOSTand the Administrator.............. .............. 41 5.4 I/O and File Systems ...................... ............... 41 Chapter 6 The Programming Environment:Prism ... .......... 45 6.1 Using Prism ............................. .......... 46 6.2 Analyzing Program Performance ............ .......... .47 6.3 Visualizing Data ............. ......... .................... 48 6.4 Using Prism with CMAX ................... ..................... 49 6.5 On-Line Documentation and Help .......... .................... 50 Chapter 7 The Program Execution Environment . .......53 7.1 Batch Utilities . ...... ........ .......... ........... 54 7.2 Timers............................... .......... ........... 57 7.3 Timesharing........................... ......... ........... 57 7.4 Checkpointing......................... .......... ........... 58 Chapter 8 The CMAX Converter ................. 59 Chapter 9 The CM Fortran Programming Language ...................... 63 9.1 Structuring Parallel Data .... ... ....... .........................63 9.2 Computing in Parallel .......................................... 65 9.3 Communicating in Parallel ................ ....... ...... 66 9.4 EransformingParallel Data ............... ................... 68 November1993 iv Copyght 0 1993ThirlngMachines Corm ion Contents ,8g(m· .................................... Chapter 10 The C* Programming Language.. 71 ........................ 10.1 Structuring Parallel Data ........... .... 71 ........................ 10.2 Computing in Parallel ............. ..... e . ..e 73 . 10.3 Communicating in Parallel ......... 75 ........................ 10.4 ¶IransformingParallel Data ......... 76 ........................ Chapter 11 The *Lisp Programming Language .. 79 11.1 Structuring Parallel Data .............. .............. 80 11.2 Computing in Parallel ................ .............. 82 113 Communicating in Parallel ............ ............. 84 11.4 'Iansforming Parallel Data ............ .............. 85 Chapte r 12 CM Scientific Software Library ..... 87 12.1 Overview ............................ ........ ................. 87 12.2 Multiple Instances .................... ........ ................. 88 12.3 Matrix Operations .................... ........ ................ 92 12.4 Linear Algebraic Equations ............ ........ ................. 94 12.5 Eigensystem Analysis ................ ........ ................. 97 12.6 Fourier ransforms .................... ........ ................ 98 12.7 Ordinary Differential Equations ......... ........ ................. 99 12.8 Optimization ......................... ........ ................. 99 12.9 Random Number Generation ........... ........ ................. 100 L2.10 Statistical Analysis .................... ........ ................. 100 12.11 Communication Functions .............. ........ ................. 101 Chaptelr 13 Data Visualization................... ........... 105 13.1 A Distributed Graphics Strategy ........ ............. ........... 105 13.2 An Integrated Environment ............ .............. ........... 106 13.3 The X11 Protocol ..................... .............. ........... 107 13.4 The CMX11 Library ................... .............. ........... 107 13.5 Visualization Environments - CM/AVS .. ..... 108 Novembr 1993 CopyrWt 1993 Thinkgachins Coporation v Connection Machine CM-5 TechnicalSummary Chapter 14 CMMD..... ., .. ....... ...... ... 111 14.1 Node-Level Programming with CMMD ... ...... ... 112 .... .. 14.2 Programming Models ......................... ... 112 .. ... .. 14.3 Message-Passing Protocols .................... ... 113 14.4 I/O ........................................ ... 119 Part HII CM-5 Architecture Chapter 15 Architecture Overview .......... .............................. 125 15.1 Processors...................... .............................. 125 15.2 Networks and VO ................ .............................. 127 15.3 Further Information .............. 129 .............................. Chapter 16 The User-Level Virtual Machine .............................. 131 16.1 Communications Facilities ......... .............................. 132 16.2 Data Parallel Computations ....... ..............................134 16.3 Low-Level User Programming ..... .............................. 137 Chapter 17 Local Architecture.............. 139 .............................. 17.1 Control Processor Architecture .... .............................. 139 17.2 Processing Node Architecture ...... .............................. 140 17.3 Vector Unit Architecture .......... .............................. 143 Chapter 18 Global Architecture .. 155 .......... 18.1 The Network Interface.. ..... ..... ..... ..... ..... .......... 155 18.2 The Control Network ... ... ... ... .. ... ... ... .. .......... 158 18.3 The Data Network ..... .............................. 162 .......... Chapter 19 System Architectureand Administration ....................... 165 19.1 The System Console .......................................... 165 19.2 Allocation of Resources .................................... 166 19.3 Partitions and Networks ..................................... 167 November1993 vi Copyright0 1993 ThinkingMachines Corporation Contents 11 1, ll ll11 -- - - - . - --- - ----
Recommended publications
  • Appendix E Final-Draft.Fm Page 1 Friday, July 14, 2006 10:23 AM
    Appendix E final-draft.fm Page 1 Friday, July 14, 2006 10:23 AM E.1 Introduction 3 E.2 Interconnecting Two Devices 6 E.3 Connecting More than Two Devices 20 E.4 Network Topology 30 E.5 Network Routing, Arbitration, and Switching 45 E.6 Switch Microarchitecture 56 E.7 Practical Issues for Commercial Interconnection Networks 63 E.8 Examples of Interconnection Networks 70 E.9 Internetworking 80 E.10 Crosscutting Issues for Interconnection Networks 84 E.11 Fallacies and Pitfalls 89 E.12 Concluding Remarks 96 E.13 Historical Perspective and References 98 Exercises 108 Appendix E final-draft.fm Page 2 Friday, July 14, 2006 10:23 AM E Interconnection Networks 1 Revised by Timothy Mark Pinkston, USC, and Jose Duato, UPV and Simula “The Medium is the Message” because it is the medium that shapes and controls the search and form of human associations and actions. Marshall McLuhan Understanding Media (1964) The marvels—of film, radio, and television—are marvels of one-way communication, which is not communication at all. Milton Mayer On the Remote Possibility of Communication (1967) The interconnection network is the heart of parallel architecture. Chuan-Lin and Tse-Yun Feng Interconection Networks for Parallel and Distributed Pro- cessing (1984) Indeed, as system complexity and integration continues to increase, many designers are finding it more efficient to route packets, not wires. Bill Dally Principles and Practices of Interconnection Networks (2004) Appendix E final-draft.fm Page 3 Friday, July 14, 2006 10:23 AM 3 n Appendix E Interconnection Networks E.1 Introduction Previous chapters and appendices cover the components of a single computer but give little consideration to the interconnection of those components and how multiple computer systems are interconnected.
    [Show full text]
  • Simulating Physics with Computers
    International Journal of Theoretical Physics, VoL 21, Nos. 6/7, 1982 Simulating Physics with Computers Richard P. Feynman Department of Physics, California Institute of Technology, Pasadena, California 91107 Received May 7, 1981 1. INTRODUCTION On the program it says this is a keynote speech--and I don't know what a keynote speech is. I do not intend in any way to suggest what should be in this meeting as a keynote of the subjects or anything like that. I have my own things to say and to talk about and there's no implication that anybody needs to talk about the same thing or anything like it. So what I want to talk about is what Mike Dertouzos suggested that nobody would talk about. I want to talk about the problem of simulating physics with computers and I mean that in a specific way which I am going to explain. The reason for doing this is something that I learned about from Ed Fredkin, and my entire interest in the subject has been inspired by him. It has to do with learning something about the possibilities of computers, and also something about possibilities in physics. If we suppose that we know all the physical laws perfectly, of course we don't have to pay any attention to computers. It's interesting anyway to entertain oneself with the idea that we've got something to learn about physical laws; and if I take a relaxed view here (after all I'm here and not at home) I'll admit that we don't understand everything.
    [Show full text]
  • Think in G Machines Corporation Connection
    THINK ING MACHINES CORPORATION CONNECTION MACHINE TECHNICAL SUMMARY The Connection Machine System Connection Machine Model CM-2 Technical Summary ................................................................. Version 6.0 November 1990 Thinking Machines Corporation Cambridge, Massachusetts First printing, November 1990 The information in this document is subject to change without notice and should not be construed as a commitment by Thinking Machines Corporation. Thinking Machines Corporation reserves the right to make changes to any products described herein to improve functioning or design. Although the information in this document has been reviewed and is believed to be reliable, Thinking Machines Corporation does not assume responsibility or liability for any errors that may appear in this document. Thinking Machines Corporation does not assume any liability arising from the application or use of any information or product described herein. Connection Machine® is a registered trademark of Thinking Machines Corporation. CM-1, CM-2, CM-2a, CM, and DataVault are trademarks of Thinking Machines Corporation. C*®is a registered trademark of Thinking Machines Corporation. Paris, *Lisp, and CM Fortran are trademarks of Thinking Machines Corporation. C/Paris, Lisp/Paris, and Fortran/Paris are trademarks of Thinking Machines Corporation. VAX, ULTRIX, and VAXBI are trademarks of Digital Equipment Corporation. Symbolics, Symbolics 3600, and Genera are trademarks of Symbolics, Inc. Sun, Sun-4, SunOS, and Sun Workstation are registered trademarks of Sun Microsystems, Inc. UNIX is a registered trademark of AT&T Bell Laboratories. The X Window System is a trademark of the Massachusetts Institute of Technology. StorageTek is a registered trademark of Storage Technology Corporation. Trinitron is a registered trademark of Sony Corporation.
    [Show full text]
  • Testing the Limits of Tapered Fat Tree Networks
    2019 IEEE/ACM Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems (PMBS) Testing the Limits of Tapered Fat Tree Networks Philip Taffet Sanil Rao Edgar A. Leon´ and Ian Karlin Dept. of Computer Science Dept. of Electrical and Computer Engineering Livermore Computing Rice University Carnegie Mellon University Lawrence Livermore National Laboratory Houston, TX Pittsburgh, PA Livermore, CA [email protected] [email protected] fleon,[email protected] Abstract—HPC system procurement with a fixed budget is an was made. This work takes a view back at previous work to optimization problem with many trade-offs. In particular, the evaluate the decision made to use a 2:1 tapered network [1] for choice of an interconnection network for a system is a major Quartz, a system at Lawrence Livermore National Laboratory choice, since communication performance is important to overall application performance and the network makes up a substantial (LLNL). We use three applications from this study, including fraction of a supercomputer’s overall price. It is necessary to the two that were most impacted by network tapering and understand how sensitive representative jobs are to various one of the most latency sensitive applications. Building on aspects of network performance to procure the right network. our work previously presented in poster form at SC17 and Unlike previous studies, which used mostly communication-only SC18 [6], [7], we add controlled congestion to study applica- motifs or simulation, this work employs a real system and measures the performance of representative applications under tion performance under adverse traffic conditions. Our work controlled environments.
    [Show full text]
  • On-Line Reconfigurable Extended Generalized Fat Tree Network-On-Chip for Multiprocessor System-On-Chip Circuits
    Julkaisu 614 Publication 614 Heikki Kariniemi On-Line Reconfigurable Extended Generalized Fat Tree Network-on-Chip for Multiprocessor System-on-Chip Circuits Tampere 2006 Tampereen teknillinen yliopisto. Julkaisu 614 Tampere University of Technology. Publication 614 Heikki Kariniemi On-Line Reconfigurable Extended Generalized Fat Tree Network-on-Chip for Multiprocessor System-on-Chip Circuits Thesis for the degree of Doctor of Technology to be presented with due permission for public examination and criticism in Sähkötalo Building, Auditorium S1, at Tampere University of Technology, on the 27th of September 2006, at 12 noon. Tampereen teknillinen yliopisto - Tampere University of Technology Tampere 2006 ISBN 952-15-1651-8 (printed) ISBN 952-15-1746-8 (PDF) ISSN 1459-2045 Abstract The System-on-Chip (SoC) circuits contain already today a large number of processors and other blocks, which sets several hard requirements to their communication infrastructures implemented with Networks-On-Chip (NOC). The NOCs are a generally accepted concept in the semiconductor industry for solving the problems related with an on-chip communication which include among other things high wire delays and increasing clock skew. Other problems related to on-chip communication are the scalability of the NOC topologies for different systems sizes and for different performance and reliability requirements. This thesis concerns the implementation of communication infrastructures with eXtended Generalized Fat Tree (XGFT) Network-On-Chip (NOC) for Multi-Processor SoCs (MPSoC). Actually this thesis presents two different NOCs, because the XGFT NOC is compared with a two-dimensional (2-D) mesh NOC which was especially developed and designed for this purpose. The 2-D mesh is used as a reference NOC, because it is the most commonly used topology in different NOC implementations presented by today.
    [Show full text]
  • Parallel Architecture Hardware and General Purpose Operating System
    Parallel Architecture Hardware and General Purpose Operating System Co-design Oskar Schirmer G¨ottingen, 2018-07-10 Abstract Because most optimisations to achieve higher computational performance eventually are limited, parallelism that scales is required. Parallelised hard- ware alone is not sufficient, but software that matches the architecture is required to gain best performance. For decades now, hardware design has been guided by the basic design of existing software, to avoid the higher cost to redesign the latter. In doing so, however, quite a variety of supe- rior concepts is excluded a priori. Consequently, co-design of both hardware and software is crucial where highest performance is the goal. For special purpose application, this co-design is common practice. For general purpose application, however, a precondition for usability of a computer system is an arXiv:1807.03546v1 [cs.DC] 10 Jul 2018 operating system which is both comprehensive and dynamic. As no such op- erating system has ever been designed, a sketch for a comprehensive dynamic operating system is presented, based on a straightforward hardware architec- ture to demonstrate how design decisions regarding software and hardware do coexist and harmonise. 1 Contents 1 Origin 4 1.1 Performance............................ 4 1.2 Limits ............................... 5 1.3 TransparentStructuralOptimisation . 8 1.4 VectorProcessing......................... 9 1.5 Asymmetric Multiprocessing . 10 1.6 SymmetricMulticore ....................... 11 1.7 MultinodeComputer ....................... 12 2 Review 14 2.1 SharedMemory.......................... 14 2.2 Cache ............................... 15 2.3 Synchronisation .......................... 15 2.4 Time-Shared Multitasking . 15 2.5 Interrupts ............................. 16 2.6 Exceptions............................. 16 2.7 PrivilegedMode.......................... 17 2.8 PeripheralI/O .........................
    [Show full text]
  • THE RISE and Fall the 01 BRILLIANT START-UP THAT Some Day We Will Build a Think­ I~Z~~~~~ Thinking Ing Machine
    Company Profile THE RISE and Fall THE 01 BRILLIANT START-UP THAT Some day we will build a think­ I~Z~~~~~ Thinking ing machine. It will be a truly NEVER GRASPED intelligent machine. One that can see and hear and speak. A THE BASICS Mach-Ines machine that will be proud of us. by Gary Taubes -From a Thinking Machines brochure seven 'years a~ter. its The truth is very different. This is the simple proeessors, all of them completing In 19 90 founding, Thlllklllg story of how Thinking Machines got the a single instruction at the same time. To Machines was the market leader in paral­ jump on a hot new market-and then get more speed, more processors would lel supercomputers, with sales of about screwed up, big time. be added. Eventually, so the theory went, $65 million. Not only was the company with enough processors (perhaps billions) protitable; it also, in the words of one IBM ntil W. Daniel Hillis came along, and the right software, a massively paral­ computer scientist, had cornered the mar­ Ucomputers more or less had been de­ lel computer might start acting vaguely . ket "on sex appeal in high-performance signed along the lines of ENIAC. Ifl that human. Whether it would take pride in its computing." Several giants in the com­ machine a single processor complete? in­ creators would remain to be seen. puter industry were seeking a merger or a structions one at a time, in sequence. "Se­ Hillis is what good scientists call a very partnership with the company.
    [Show full text]
  • Parallelization of Tree-Recursive Algorithms on a SIMD Machine *
    From: AAAI Technical Report SS-93-04. Compilation copyright © 1993, AAAI (www.aaai.org). All rights reserved. Parallelization of Tree-Recursive Algorithms on a SIMD Machine * Curt Powley, Richard E. Korf, and Chris Ferguson Computer Science Department University of California, Los Angeles Los Angeles, CA 90024 ABSTRACT The set of tree-recursive algorithms includes constraint satisfaction using backtracking, iterative-deepening The set of tree-recursive algorithms is large, search such as IDA*, depth-first branch-and-bound, including constraint satisfaction using back- two-player game minimax search, and most divide-and- tracking, iterative-deepening search such as conquer algorithms. IDA*, depth-first branch-and-bound, two- player game minimax search, and many 1.2 SIMD Machines divide-and-conquer algorithms. We describe a structured method for implementing such Two examples of single-instruction, multiple-data algorithms on SIMDmachines, and identify (SIMD) machines are the MasPar1 and the Connec- measures for determining if a tree-recursive tion Machine (CM)2. The MasPar is composed of up application is amenable or ill-suited for SIMD to 16,384 (16K) processors. The Connection Machine, parallelization. Using these ideas, we evaluate the computer used in this study, is composed of up to results from four testbeds. 65,536 (64K) one-bit processors. Every processor on a SIMDmachine must execute the same instruction at the same time, or no instruction 1 Introduction at all. This can make programming and verification of straightforward tasks easier, but can considerably 1.1 Tree-RecursiveAlgorithms complicate the programming of complex tasks, such as tree-traversal. Because of this programmingconstraint, A tree-recursive algorithm is one that traverses a tree SIMDmachines have largely been used for "data par- in executing multiple recursive calls.
    [Show full text]
  • Parallel Computing MIMD Machine (I)
    Lecture 2 Parallel Programming Platforms Flynn’s Taxonomy In 1966, Michael Flynn classified systems according to numbers of instruction streams and the number of data stream. Data stream Single Multiple Instruction stream stream Instruction Single Multiple SISD SIMD Uniprocessors Processor arrays Pipelined vector processors MISD MIMD Systolic arrays Multiprocessors Multicomputers SISD Machine Example: single CPU computers (serial computer) • Single instruction: Only one instruction stream is acted on by CPU during one clock cycle • Single data: Only one data stream is used as input during one clock cycle • Deterministic execution SIMD Machine (I) • A parallel computer • It typically has a single CPU devoted exclusively to control, a large number of subordinate ALUs, each with its own memory and a high- bandwidth internal network. • Control CPU broadcasts an instruction to all subordinate ALUs, and each of the subordinate ALUs either executes the instruction it is idle. • Example: CM-1, CM-2, IBM9000 SIMD Machine (2) Control CPU ALU 0 ALU 1 ALU p Mem 0 Mem 1 Mem p Interconnection network SIMD Machine (3) From Introduction to Parallel Computing MIMD Machine (I) • Most popular parallel computer architecture • Each processor is a full-fledged CPU with both a control unit and an ALU. Thus each CPU is capable of executing its own program at its own space. • Execution is asynchronous. Processors can also be specifically programmed to synchronize with each other. • Examples: networked parallel computers, symmetric multiprocessor (SMP) computer. MIMD Machine (II) Load A(1) call func Load B(1) X = Y*Z C(1) = A(1)*B(1) Sum = X^2 time Store C(1) call subroutine1(i) … Next instruction Next instruction CPU 0 CPU 1 Further classification according to memory access: • Shared-memory system • Distributed-memory system (Message-passing) Shared-Memory MIMD Machine (I) • Multiple processors can operate independently, but share the same memory resources (a global address space).
    [Show full text]
  • Network Learning on the Connection Machine
    Network Learning on the Connection Machine Guy Blelloch Charles R. Rosenberg M.I.T Artificial Intelligence Laboratory Cognitive Science Laboratory 545 Technology Square Princeton University Cambridge, Massachusetts 02139 Princeton, New Jersey 08542 on the Connection Machine. Currently, the Connection Machine Abstract offers a factor of 500 speedup over a previous implementation on Connectionist networks are powerful techniques, inspired by a VAX780 and a factor of two speed-up over an implementation the parallel architecture of the brain, for discovering intrinsic of a similar network on a Cray-2. Considering that parallel com• structures in data. However, they are not well suited for im• puting is only in it's infancy, we expect the speed-up to be much plementation on serial computers. In this paper, we discuss the greater in the near future. Finally, we present an application in first implementation of a connectionist learning algorithm, error the domain of speech synthesis, called NETtalk [14], that uses our back-propagation, on a fine-grained parallel computer, the Con• implementation of back-propagation on the Connection Machine. nection Machine. As an example of how the system can be used, we present a parallel implementation of NETtalk, a connectionist 2 Error Back-Propagation network that learns the mapping from English text to the pro• Connectionist network models are dynamic systems composed of nunciation of that text. Currently, networks containing up to a large number of simple processing units connected via weighted 16 million links can be simulated on the Connection Machine at links, where the state of any unit in the network depends on the speeds nearly twice that of the Cray-2.
    [Show full text]
  • Download File
    Reconfigurable Optically Interconnected Systems Yiwen Shen Submitted in partial fulfillment ofthe requirements for the degree of Doctor of Philosophy in the Graduate School of Arts and Sciences COLUMBIA UNIVERSITY 2020 © 2020 Yiwen Shen All Rights Reserved ABSTRACT Reconfigurable Optically Interconnected Systems Yiwen Shen With the immense growth of data consumption in today’s data centers and high- performance computing systems driven by the constant influx of new applications, the network infrastructure supporting this demand is under increasing pressure to enable higher bandwidth, latency, and flexibility requirements. Optical interconnects, able to support high bandwidth wavelength division multiplexed signals with extreme energy efficiency, have become the basis for long-haul and metro-scale networks around the world, while photonic components are being rapidly integrated within rack and chip- scale systems. However, optical and photonic interconnects are not a direct replacement for electronic-based components. Rather, the integration of optical interconnects with electronic peripherals allows for unique functionalities that can improve the capacity, compute performance and flexibility of current state-of-the-art computing systems. This requires physical layer methodologies for their integration with electronic components, as well as system level control planes that incorporates the optical layer characteristics. This thesis explores various network architectures and the associated control plane, hard- ware infrastructure, and other supporting software modules needed to integrate silicon photonics and MEMS based optical switching into conventional datacom network systems ranging from intra-data center and high-performance computing systems to the metro- scale layer networks between data centers. In each of these systems, we demonstrate dynamic bandwidth steering and compute resource allocation capabilities to enable sig- nificant performance improvements.
    [Show full text]
  • 69- DR; ARVIND Massachusetts Institute of Technology Among
    -69- DR; ARVIND Massachusetts Institute of Technology Among many machine projects at MIT, including the "Connection machine" Jack Schwartz alluded to, are two major data-flow projects. I am going to review one of them, The Tagged-Token Data-Flow Machine. The goal of my project, simply stated, is to design a general-purpose parallel computer in which all processors will cooperate to solve one problem. Clearly we are interested in big problems, and the question of "What is general purpose computing? II has to be understood in that context. If an application does not have any parallelism, we are no magicians and therefore we can't invent it. However, many applications have plenty of parallelism and one can build a useful machine to E!xploi t parallelism in a large class of applications. Table J[ lists some characteristics of typical applications which have massive amounts of parallelism. Table I. Parallel Applications and their Characteristics. Number Crunching, e.g., -Scientific Computing -High performance -Simple data structure - arrays Symbol Manipulation, e.g., -AI type applications-algebraic simplifier -Complex data structure -Higher order functions -Structure of the program itself is important Concurrent Real time Computing, e.g., -Process control - Missile defense system -Number of asynchronous inputs -Adhoc hardware structures -No' coherent functional view -70- In the area of symbol manipulation also, there are lots of programs with parallelism except that these programs are not as well understood as scientific computing. One reason is that algorithms in AI programs are not so stable. AI Programs tend to be far more complex then scientific programs.
    [Show full text]