Future of Supercomputing Yoshio Oyanagi ∗
PDF, 16 pages, 1020 KB
Recommended publications
(52) Control Data
Control Data Literature and Distribution Services, 308 North Dale Street, St. Paul, Minnesota 55103. August 29, 1988. Dear Customer: Attached is the third (3) catalog supplement since the 1988 catalog was published. [Signed] G. F. Moore, Manager, Literature & Distribution Services.
LDS CATALOG SUPPLEMENT, AUGUST 1988. Columns: Pub No., Rev, [Page], Title, Price. Legend: + = new publication, r = revision, - = obsolete.
r 15190060 [4-07] FULL SCREEN EDITOR (FSEDIT) RM (NOS 1 & 2), 12.00
r 15190118 K [4-07] NETWORK JOB ENTRY FACILITY (NJEF) IHB (NOS 2), 5.00
r 15190129 F [4-07] NETWORK JOB ENTRY FACILITY (NJEF) RM (NOS 2)
+ 15190150 C [4-07] NETWORK TRANSFER FACILITY (NTF) USAGE (NOS/VE), 15.00
r 15190762 [4-07] TIELINE/NP V2 IHB (L642) (NOS 2), 12.00
r 20489200 D [4-29] WREN II HALF-HEIGHT 5-1/4" DISK DRIVE
+ 20493400 [4-20] CDCNET DEVICE INTERFACE UNITS
+ 20493600 [4-20] CDCNET ETHERNET EQUIPMENT
r 20523200 B [4-14] COMPUTER MAINTENANCE SERVICES - DEC
r 20535300 A [4-29] WREN II 5-1/4" RLL CERTIFIED
r 20537300 A [4-18] SOFTWARE
Recent Supercomputing Development in Japan
Supercomputing in Japan. Yoshio Oyanagi, Dean, Faculty of Information Science, Kogakuin University (April 24, 2006).
Generations:
- Primordial Ages (1970s): Cray-1, 75APU, IAP
- 1st Generation (first half of the 1980s): Cyber 205, XMP, S810, VP200, SX-2
- 2nd Generation (second half of the 1980s): YMP, ETA-10, S820, VP2600, SX-3, nCUBE, CM-1
- 3rd Generation (first half of the 1990s): C90, T3D, Cray-3, S3800, VPP500, SX-4, SP-1/2, CM-5, KSR2 (HPC ventures went out)
- 4th Generation (second half of the 1990s): T90, T3E, SV1, SP-3, Starfire, VPP300/700/5000, SX-5, SR2201/8000, ASCI (Red, Blue)
- 5th Generation (first half of the 2000s): ASCI, TeraGrid, BlueGene/L, X1, Origin, Power4/5, ES, SX-6/7/8, PP HPC2500, SR11000, ...
Primordial Ages (1970s):
- 1974: DAP, BSP and HEP started
- 1975: ILLIAC IV becomes operational
- 1976: Cray-1 delivered to LANL (80 MHz, 160 MF)
- 1976: FPS AP-120B delivered
- 1977: FACOM 230-75 APU (22 MF)
- 1978: HITAC M-180 IAP
- 1978: PAX project started (Hoshino and Kawai)
- 1979: HEP operational as a single processor
- 1979: HITAC M-200H IAP (48 MF)
- 1982: NEC ACOS-1000 IAP (28 MF)
- 1982: HITAC M-280H IAP (67 MF)
Characteristics of Japanese supercomputers:
1. Manufactured by mainframe vendors with semiconductor facilities (not ventures)
2. Vector processors are attached to mainframes
3. HITAC IAP: (a) memory-to-memory; (b) summation, inner product and first-order recurrence can be vectorized; (c) vectorization of loops with IFs (M280)
4. No high-performance parallel machines
1st Generation (first half of the 1980s):
- 1981: FPS-164 (64 bits)
- 1981: CDC Cyber 205 (400 MF)
- 1982: Cray XMP-2, Steve Chen (630 MF)
- 1982: Cosmic Cube at Caltech; Alliant FX/8 delivered; HEP installed
- 1983: HITAC S-810/20 (630 MF)
- 1983: FACOM VP-200 (570 MF)
- 1983: Encore, Sequent and TMC founded; ETA spun off from CDC
- 1984: Multiflow founded
- 1984: Cray XMP-4 (1260 MF)
- 1984: PAX-64J completed (Tsukuba)
- 1985: NEC SX-2 (1300 MF)
- 1985: FPS-264
- 1985: Convex C1
- 1985: Cray-2 (1952 MF)
- 1985: Intel iPSC/1, T414, NCUBE/1, Stellar, Ardent...
- 1985: FACOM VP-400 (1140 MF)
- 1986: CM-1 shipped; FPS T-series (max 1 TF!!)
Characteristics of Japanese SC in the 1st G.
VHSIC and ETA10
VHSIC and the First CMOS and Only Cryogenically Cooled Supercomputer. David Bondurant, former Honeywell Solid State Electronics Division VHSIC System Applications Manager. [Figure: ETA10 supercomputer at the Met Office, the UK's national weather forecasting service.]
Sources:
- "Very High Speed Integrated Circuits (VHSIC) Final Program Report 1980-1990", VHSIC Program Office, Office of the Under Secretary of Defense for Acquisition, Deputy Director, Defense Research and Engineering for Research and Advanced Technology, September 30, 1990.
- Carlson, Sullivan, Bach, and Resnick, "The ETA10 Liquid-Nitrogen-Cooled Supercomputer System", IEEE Transactions on Electron Devices, Vol. 36, No. 8, August 1989.
- Cummings and Chase, "High Density Packaging for Supercomputers".
- Tony Vacca, "First Hand: The First CMOS and the Only Cryogenically Cooled Supercomputer", ethw.org.
- Robert Peglar, "The ETA Era or How to (Mis-)Manage a Company According to Control Data Corp.", April 17, 1990.
Background:
- I graduated from Missouri S&T in May 1971.
- My first job was at Control Data Corporation, the leading supercomputer company, where I worked in memory development.
- The CDC 7600 was in production; the 8600 (Cray 1) was in development by Seymour Cray in Chippewa Falls, WI.
- The Star-100 vector processor was in development in Arden Hills.
- I was assigned to the development of the first DRAM memory module for a CDC supercomputer.
- I designed the DRAM module tester using ECL logic.
- The first DRAM module was 4K x 32, using 128 1K DRAM chips. [Photo: Control Data Corporation Arden Hills Development Center, June 1971.]
Perspective on the Next Ten Years in Plasma Physics
Particle Accelerators, 1986, Vol. 19, pp. 247-255. © 1986 Gordon and Breach, Science Publishers, S.A. Printed in the United States of America.
Perspective on the Next Ten Years in Plasma Physics. John Killeen, National Magnetic Fusion Energy Computer Center, Lawrence Livermore National Laboratory, University of California, Livermore, California 94550 (received March 7, 1985).
Abstract: A brief survey of developments in the plasma physics of magnetic-fusion research is presented. The major experimental facilities of the next decade are listed. Eight fusion-physics issues are identified which must be addressed as a complete plasma system in order to reach the goal of a fusion reactor. In order to resolve the physics issues, the results from these new experimental facilities must be augmented by an extensive program of computer modeling. The use of computer models of a magnetically confined plasma and the implementation of these models on the new supercomputers of the next decade are the main topics of this paper.
I. Introduction. During the early 1970s the U.S. magnetic-fusion program supported at least fifteen varieties of experimental concepts. These were rather small experiments compared to today's large facilities. During the years 1974 to 1980, the program went through a period of dramatic growth, but at the same time evaluations and reviews reduced the number of experimental concepts supported to the following six: tokamak, tandem mirror, reverse-field pinch (RFP), stellarator, compact toroids, and Elmo bumpy torus. The most advanced of the above concepts is the tokamak, and all four of the major international groups have commissioned large facilities (Table I) to establish the scientific feasibility of fusion.
Supercomputers: The Amazing Race (Gordon Bell, November 2014)
Supercomputers: The Amazing Race. Gordon Bell, November 2014. Technical Report MSR-TR-2015-2. Gordon Bell, Researcher Emeritus, Microsoft Research, Microsoft Corporation, 555 California, San Francisco, CA 94104. Version 1.0, January 2015. Submitted to STARS, IEEE Global History Network.
Supercomputers: The Amazing Race. Timeline (the top 20 significant events; constrained for the draft IEEE STARS article):
1. 1957: Fortran introduced for scientific and technical computing.
2. 1960: Univac LARC, IBM Stretch, and Manchester Atlas finish the 1956 race to build the largest "conceivable" computers.
3. 1964: Beginning of the Cray era with the CDC 6600 (0.48 MFlops) and its parallel functional units. "No more small computers" (S. R. Cray); "first super" (G. A. Michael).
4. 1964: IBM System/360 announcement: one architecture for commercial and technical use.
5. 1965: Amdahl's Law defines the difficulty of increasing parallel-processing performance based on the fraction of a program that has to be run sequentially (a formula sketch follows the timeline).
6. 1976: Cray 1 vector processor (26 MF), vector data. Sid Karin: "The first super was the Cray 1."
7. 1982: Caltech Cosmic Cube (4 nodes; 64 nodes in 1983); Cray 1 cost-performance x 50.
8. 1983-93: Billion-dollar SCI, the Strategic Computing Initiative of DARPA IPTO, in response to the Japanese Fifth Generation project; redirected to supercomputing in 1990 after failure to achieve its AI goals.
9. 1982: Cray XMP (1 GF), Cray shared-memory vector multiprocessor.
10. 1984: NSF establishes the Office of Scientific Computing in response to scientists' demand and to counteract the use of VAXen as personal supercomputers.
11. 1987: nCUBE (1K computers) achieves 400-600x speedup, Sandia winning the first Bell Prize; stimulated Gustafson's Law of Scalable Speed-Up (Amdahl's Law corollary).
12.
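Item 5 describes Amdahl's argument only in words; for reference, a minimal statement of the usual fixed-size formulation (with s for the serial fraction and N for the processor count, symbols that are conventions assumed here rather than notation from the report) is:

```latex
% Amdahl's Law, fixed-size formulation: s = serial fraction, N = processor count
S_{\mathrm{fixed}}(N) = \frac{1}{\,s + \dfrac{1-s}{N}\,},
\qquad
\lim_{N\to\infty} S_{\mathrm{fixed}}(N) = \frac{1}{s}.
```

The 1/s ceiling is what item 11's Gustafson's Law of Scalable Speed-Up revisits; a numerical comparison of the two models accompanies the scaled-size excerpt further below.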
Some Thoughts for Particle-In-Cell on Leadership Class Computers
Some thoughts for particle-in-cell on leadership class computers. W. B. Mori, University of California Los Angeles (UCLA), Departments of Physics and Astronomy and of Electrical Engineering, Institute of Digital Research and Education, [email protected]. Much of the material from the Joule Metric study is from R. A. Fonseca, GoLP/IPFN, Instituto Superior Técnico, Lisboa, Portugal. (IPAM PLWS, May 14, 2012.)
My experimental colleagues' view of the world: [slide contrasting the life of an experimentalist with the life of a simulationist].
But we face a tsunami of advances in theory, in computational methods, in new hardware, and in new algorithms. Contrary to the perception of my experimental colleagues, we rarely get to "relax". This workshop offers us a rare opportunity to think in a "quiet room".
The evolution of state-of-the-art HPC systems cannot be ignored. [Figure: top system performance (MFlop/s) versus year, 1950-2010, on a log scale, tracing the fastest systems from the IBM 704, IBM 7030 Stretch and UNIVAC LARC through the CDC 6600/7600, ILLIAC IV, CDC STAR-100, Cray-1, CDC Cyber 205, Cray X-MP/4, ETA10-G/8, NEC SX-3/44R, Numerical Wind Tunnel, ASCI Red, Earth Simulator, Blue Gene/L, Roadrunner, and Jaguar (Cray XT5, #2 in 11/2010, Rpeak 2.33 PFlop/s, Rmax 1.76 PFlop/s) to Tianhe-1A.] Memory also has increased!
The Evolution of Software in High Energy Physics
International Conference on Computing in High Energy and Nuclear Physics 2012 (CHEP2012), Journal of Physics: Conference Series 396 (2012) 052016, doi:10.1088/1742-6596/396/5/052016.
The Evolution of Software in High Energy Physics. René Brun, [email protected], CERN, Geneva, Switzerland.
Abstract. The paper reviews the evolution of software in high energy physics from the time of expensive mainframes to grid and cloud systems using thousands of multi-core processors. It focuses on the key parameters and events that have shaped the current software infrastructure.
1. Introduction. A review of the evolution of the software and/or hardware over the past 40 or 50 years has been made several times in recent years. A very interesting article by D. O. Williams [1] was published in 2005 with a general overview of the software evolution and a detailed description of the hardware and network areas. A book [2] was recently published with a detailed description of the events, systems and people involved in this evolution. The intention of this paper is to focus on a few elements that have been very important in the development of the general tools and libraries commonly used today in HEP. As we live in a world with an increasing frequency of changes and new features, we must prepare the ground for massive upgrades of our software systems if we want to make efficient use of the rapidly arriving parallel hardware. The general tendency has been to build a coherent family of systems, as illustrated in Figure 1.
Chapter 5. Driving Forces on Packaging: Physical Interconnects
5. Driving Forces on Packaging: Physical Interconnects
The single most important element of the package and interconnect that influences the system clock speed, the performance density, and often the cost of a system is how close together the devices can be placed. The term that best describes this is packaging efficiency. The three fabrication and assembly issues that constrain the packaging efficiency are:
- the I/O off the chip: the number, the form factor, the pitch
- the I/O off the package: the number, the form factor, the pitch
- the interconnect substrate: the pad pitch, the via pitch, the line pitch, the number of layers
Pin Count Requirements and Rent's Rule. As the number of gates on a single die has increased, the number of I/Os required to interface them to the outside world has also increased. In 1960, E. F. Rent of IBM identified a definite empirical relationship between the number of gates in a block, N_gates, and the number of I/Os they require, N_io. This relationship, since coined Rent's Rule, has been extended and generalized to encompass a variety of chip types and module sizes (a numerical sketch follows this excerpt). Figure 5-1 is a plot of the signal I/Os required for various gate arrays and microprocessors. In general, for every 4 to 10 signal I/Os, one power or one ground pin is used. As the clock frequency goes up, a higher fraction of power and ground pads is required to keep switching noise at acceptable levels. For clock frequencies over 250 MHz, the ratio is closer to 3:1 signal to power/ground pads.
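The excerpt names Rent's Rule but stops short of stating it. The sketch below uses the common power-law form N_io = k * N_gates^p; the values of k and p and the example gate counts are illustrative assumptions (typical gate-array figures), not numbers from this chapter, while the 3:1 signal-to-power/ground ratio is the one quoted above.

```python
# Illustrative sketch of Rent's Rule: N_io = k * N_gates**p.
# k and p below are typical gate-array values chosen for illustration;
# they are assumptions, not figures taken from this chapter.

def rent_signal_io(n_gates: int, k: float = 1.9, p: float = 0.5) -> int:
    """Estimate the signal I/O count for a block of n_gates."""
    return round(k * n_gates ** p)

def total_pins(signal_io: int, signals_per_power_ground: float = 3.0) -> int:
    """Add power/ground pins at the high-frequency 3:1 ratio quoted above."""
    return signal_io + round(signal_io / signals_per_power_ground)

if __name__ == "__main__":
    for gates in (10_000, 100_000, 1_000_000):
        sig = rent_signal_io(gates)
        print(f"{gates:>9,} gates -> ~{sig:>5,} signal I/Os, "
              f"~{total_pins(sig):>5,} pins including power/ground")
```

With these assumed constants, I/O demand grows roughly as the square root of the gate count, which is qualitatively the kind of growth the chapter's Figure 5-1 plots.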
Needs and Challenges for Modeling FACET-II and Beyond
Needs and challenges for modeling FACET-II and beyond. W. B. Mori, University of California Los Angeles (UCLA), Departments of Physics and Astronomy and of Electrical Engineering, Institute of Digital Research and Education, [email protected]. With help from W. An, X. Xu, C. Joshi, R. Fonseca, A. Tableman, W. Lu, M. Hogan, PICKSC, and SLAC.
Simulations will be critical for FACET-II and PWFA linear collider research:
- Need simulation tools that can support the design of experiments at FACET-II.
- Need simulation tools that can aid in interpreting experiments at FACET-II.
- Need simulation tools that can simulate new physics concepts, e.g., 3D down-ramp injection and matching sections.
- Need simulation tools that can simulate the physics of a PWFA-LC, including the final focus.
- Need simulation tools that help design a self-consistent set of parameters for a PWFA-LC.
Simulations are critical for FACET-II and PWFA linear collider research:
- Simulation tools need to be continually improved and validated.
- Simulation tools need to run on the entire ecosystem of resources.
- Simulation and analysis tools need to be easy to use.
- The relationship between code developers/maintainers and users is critical (best practices are not always easy to document).
Local clusters can be very useful: Dawson2 @ UCLA: 96 nodes, ranked 148 in the Top500, 68 TFlops on Linpack. Node configuration: 2x Intel G7 X5650 CPUs and 3x NVIDIA M2070 GPUs. Computing cores: each GPU has 448 cores; total GPU cores 129,024; total CPU cores 1,152 (the arithmetic is checked in the sketch after this excerpt). Funded by NSF. Existing leadership
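As a quick sanity check on the Dawson2 numbers quoted above, the totals follow directly from the per-node configuration. The only figure not in the excerpt is the core count per Intel X5650 CPU; 6 cores is assumed here because it reproduces the quoted total of 1,152.

```python
# Arithmetic check of the Dawson2 totals quoted in the excerpt.
# Node, GPU and GPU-core counts come from the excerpt; the 6 cores per
# X5650 CPU is an assumption needed to reproduce the 1,152 CPU-core total.

nodes = 96
gpus_per_node = 3
cores_per_gpu = 448
cpus_per_node = 2
cores_per_cpu = 6  # assumed

gpu_cores = nodes * gpus_per_node * cores_per_gpu   # 129,024
cpu_cores = nodes * cpus_per_node * cores_per_cpu   # 1,152

print(f"total GPU cores: {gpu_cores:,}")
print(f"total CPU cores: {cpu_cores:,}")
```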
Recent Supercomputing Development in Japan
Development of Supercomputers in Japan: From the Numerical Wind Tunnel to Fugaku. Yoshio Oyanagi, Science Advisor, RIST Kobe Center (高度情報科学技術研究機構). KSC 2019, September 5, 2019.
How computers started in Japan: prehistory.
"Computers" before WWII:
- Many Powers/Hollerith punch card systems had been introduced to Japan since 1923.
- The first one in a university: IBM punch card systems at Kobe Univ. in 1941.
- Since no PCS could be imported during WWII, Japan tried to copy the machine.
- The next slide shows the computer history exhibit of Kobe Univ. RIEB (Research Institute for Economics and Business Administration).
Universities in the 1950s and 1960s:
- Big computers were too expensive to buy with an ordinary budget (we were poor then!).
- Some universities introduced small computers; some developed computers by themselves:
  TAC, 1952-1959, Univ. of Tokyo (Eng.), with Toshiba (V); (no name), 1953, unfinished, Osaka Univ. (V); PC-1, 1956-1958, Univ. of Tokyo (Sci.) (P); KDC-1, 1958-1960, Kyoto Univ., with Hitachi (T); SENAC-1, 1956-1958, Tohoku Univ., with NEC (P); K-1, 1958-1960, Keio Univ. (T).
- Those computers were open to teachers and students on campus and were heavily used.
[Photo: PC-1 at Univ. of Tokyo (hand made).]
Authorized Computer Centers:
- 1963/5/13: recommendation of the Science Council of Japan.
- 1963/7: budget proposal from Univ. of Tokyo.
- 1964/3: accepted by the government.
- Finalist computers: Hitachi HITAC 5020 (under development), IBM 7094 II (popular but out of date), CDC 3600 (new in Japan).
- 1964/5: the committee selected the HITAC 5020, rejecting the American computers.
Computer Center, Univ.
The Scaled-Sized Model: A Revision of Amdahl's Law
The Scaled-Sized Model: A Revision of Amdahl's Law. John L. Gustafson, Sandia National Laboratories.
Abstract. A popular argument, generally attributed to Amdahl [1], is that vector and parallel architectures should not be carried to extremes because the scalar or serial portion of the code will eventually dominate. Since pipeline stages and extra processors obviously add hardware cost, a corollary to this argument is that the most cost-effective computer is one based on uniprocessor, scalar principles. For architectures that are both parallel and vector, the argument is compounded, making it appear that near-optimal performance on such architectures is a near-impossibility. A new argument is presented that is based on the assumption that program execution time, not problem size, is constant for various amounts of vectorization and parallelism. This has a dramatic effect on Amdahl's argument, revealing that one can be much more optimistic about achieving high speedups on massively parallel and highly vectorized machines. The revised argument is supported by recent results of over 1000 times speedup on 1024 processors on several practical scientific applications [2].
1. Introduction. We begin with a review of Amdahl's general argument, show a revision that alters the functional form of his equation for speedup, and then discuss consequences for vector, parallel, and vector-parallel architectures.
[Figure 1. Fixed-Sized Model (Amdahl's Argument).] The figure makes obvious the fact that speedup with these assumptions can never exceed 1/s even if N is infinite. However, the assumption that the problem is fixed in size is questionable. An alternative is presented in the next section (a numerical comparison follows this excerpt).
3. Scaled Problem Model. When given a more powerful processor, the problem generally
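To make the contrast concrete, here is a small numerical comparison of the two models the excerpt discusses: the fixed-size (Amdahl) speedup, which the text notes can never exceed 1/s, and the scaled-size speedup, usually written S = s + (1 - s)N. The latter formula is Gustafson's published result rather than something visible in this truncated excerpt, and the serial fraction and processor counts below are illustrative choices, not values from the paper.

```python
# Fixed-size (Amdahl) versus scaled-size (Gustafson) speedup.
# s = serial fraction, N = number of processors; the values of s and N
# used below are illustrative assumptions, not data from the paper.

def fixed_size_speedup(s: float, n: int) -> float:
    """Amdahl: problem size held constant; bounded above by 1/s."""
    return 1.0 / (s + (1.0 - s) / n)

def scaled_size_speedup(s: float, n: int) -> float:
    """Gustafson: execution time held constant; S = s + (1 - s) * N."""
    return s + (1.0 - s) * n

if __name__ == "__main__":
    s = 0.01  # 1% serial fraction
    for n in (16, 256, 1024):
        print(f"N = {n:>4}: fixed-size {fixed_size_speedup(s, n):7.1f} "
              f"(limit {1.0 / s:.0f}), scaled-size {scaled_size_speedup(s, n):7.1f}")
```

With a 1% serial fraction and N = 1024 the scaled model gives roughly a 1014x speedup, of the same order as the "over 1000 times speedup on 1024 processors" cited in the abstract, while the fixed-size model saturates near 1/s = 100.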
The Marketplace of High Performance Computing
The Marketplace of High Performance Computing. Erich Strohmaier (a), Jack J. Dongarra (a, b), Hans W. Meuer (c), and Horst D. Simon (d). (a) Computer Science Department, University of Tennessee, Knoxville, TN 37996; (b) Mathematical Science Section, Oak Ridge National Laboratory, Oak Ridge, TN 37831; (c) Computing Center, University of Mannheim, D-68131 Mannheim, Germany; (d) NERSC, Lawrence Berkeley Laboratory, 50A, Berkeley, CA 94720.
Abstract. In this paper we analyze the major trends and changes in the High Performance Computing (HPC) marketplace since the beginning of the journal 'Parallel Computing'. The initial success of vector computers in the seventies was driven by raw performance. The introduction of this type of computer system started the era of 'supercomputing'. In the eighties the availability of standard development environments and of application software packages became more important. Next to performance, these criteria determined the success of MP vector systems, especially with industrial customers. MPPs became successful in the early nineties due to their better price/performance ratios, which was enabled by the attack of the 'killer micros'. In the lower and medium market segments the MPPs were replaced by microprocessor-based SMP systems in the middle of the nineties. This success was the basis for the emerging cluster concepts for the very high-end systems. In the last few years, only the companies which have entered the emerging markets for massively parallel database servers and financial applications attract enough business volume to be able to support the hardware development for the numerical high-end computing market as well.