The TOP500 List and Progress in High-Performance Computing


COVER FEATURE: GRAND CHALLENGES IN SCIENTIFIC COMPUTING

Erich Strohmaier, Lawrence Berkeley National Laboratory
Hans W. Meuer, University of Mannheim
Jack Dongarra, University of Tennessee
Horst D. Simon, Lawrence Berkeley National Laboratory

Computer, November 2015. Published by the IEEE Computer Society. 0018-9162/15/$31.00 © 2015 IEEE

For more than two decades, the TOP500 list has enjoyed incredible success as a metric for supercomputing performance and as a source of data for identifying technological trends. The project’s editors reflect on its usefulness and limitations for guiding large-scale scientific computing into the exascale era.

The TOP500 list (www.top500.org) has served as the defining yardstick for supercomputing performance since 1993. Published twice a year, it compiles the world’s largest installations and some of their main characteristics. Systems are ranked according to their performance on the Linpack benchmark, which solves a dense system of linear equations. Over time, the data collected for the list has enabled the early identification and quantification of many important technological and architectural trends related to high-performance computing (HPC).

Here, we briefly describe the project’s origins, the principles guiding data collection, and what has made the list so successful during the two-decades-long transition from giga- to tera- to petascale computing. We also examine the list’s limitations. The TOP500’s simplicity has invited many criticisms, and we consider several complementary or competing projects that have tried to address these concerns. Finally, we explore several emerging trends and reflect on the list’s potential usefulness for guiding large-scale HPC into the exascale era.

TOP500 ORIGINS

In the mid-1980s, coauthor Hans Meuer started a small and focused annual conference that has since evolved into the prestigious International Supercomputing Conference (www.isc-hpc.com). During the conference’s opening session, Meuer presented statistics about the numbers, locations, and manufacturers of supercomputers worldwide, collected from vendors and colleagues in academia and industry.

Initially, it was obvious that the supercomputer label should be reserved for vector processing systems from companies such as Cray, CDC, Fujitsu, NEC, and Hitachi that each claimed to have the fastest system for scientific computation by some selective measure. By the end of the decade, however, the situation became increasingly complicated as smaller vector systems became available from some of these vendors as well as new competitors (Convex, IBM) and as massively parallel systems with SIMD architectures (Thinking Machines, MasPar) and MIMD systems based on scalar processors (Intel, nCube, and others) entered the market. Simply counting the installation base for systems of such vastly different scales did not produce any meaningful data about the market. New criteria for which systems constituted supercomputers were needed.

After two years of experimenting with various metrics and approaches, Meuer and coauthor Erich Strohmaier concluded that the best way to provide a consistent, long-term picture of the supercomputer market was to maintain a list of systems up to a predetermined cutoff number, ranked according to their actual performance. On the basis of previous studies they determined that at least 500 qualified systems could be assembled, and so the TOP500 list was born.

[Figure 1 plot: performance (log scale, 100 Mflops to 100 Pflops) versus year (1993 to 2015) for the No. 1 system, the average, and the No. 500 system. Labeled values: 59.7 Gflops, 400 Mflops, 33.9 Pflops, 165 Tflops.]

FIGURE 1. Supercomputer performance over time as tracked by the TOP500. The red and orange lines show performance of the first and last systems, and the blue line average performance of all systems. Dashed lines are fitted exponential growth curves before and after […] for the orange line and before and after […] for the blue line.

RANKING SUPERCOMPUTER PERFORMANCE

The simplest and most universal ranking metric for scientific computing is floating-point operations per second (flops). More specialized metrics such as time to solution or time per iteration and time per gridpoint can be more meaningful in particular application domains and allow more detailed comparisons (for example, between alternative algorithms with different complexities) but are harder to define properly, more restricted in their use, and, due to their specialization, not applicable to the overall scientific computing market.

In addition to limiting performance measurement to flops, we decided to use actual measured values to avoid contaminating collected data with unsubstantiated and often outlandish performance “estimates” for systems that did not reliably function or even exist. In principle, measured results from different benchmarks or applications could be used to rank different systems, but this would lead to inconsistent values and make comparisons difficult. To address this problem, we opted to select and mandate use of a single benchmark for all TOP500 editions.

This benchmark would not represent performance of an actual scientific application but coarsely embody scientific computing’s main architectural requirements. Because scientific computing is primarily driven by integrated large-scale calculations, we decided to avoid using simplistic benchmarks, such as embarrassingly parallel workloads, that could lead to very high rankings for systems otherwise unsuitable for scientific computing. Instead, we sought a benchmark that would showcase systems’ capabilities without being overly harsh or restrictive. Overall, the collected data should provide reasonable upper bounds for actual performance while penalizing systems unable to support a large fraction of scientific computing applications.

Obviously no single benchmark can ever hope to represent or approximate performance for most scientific computing applications, as the space of algorithms and implementations is too vast. The purpose of using a single benchmark in the TOP500 was never to claim such representativeness but to collect reproducible and comparable metrics.

Using a single benchmark that does not utilize all the system components necessary for most scientific applications or that maps better to particular computer architectures could lead to misleadingly high performance numbers for some systems, incorrectly indicating these systems’ suitability for scientific computing. To minimize such implicit bias, we decided that the benchmark should exercise all major system components and be based on a relatively simple algorithm that allows optimization for a wide range of architectures.

LINPACK

An evaluation of benchmarks suitable for supercomputing in the early 1990s found that Linpack had the most documented results by a large margin and thus allowed immediate ranking of most of the systems of interest. The NAS Parallel Benchmarks (NAS PB) were also widely used, as most of them simulated actual application performance more closely, but none of them provided enough results to rank more than […] percent of the systems.

Linpack solves a dense system of linear equations, which today is sometimes criticized as an overly simplistic problem. However, the benchmark is by no means embarrassingly parallel, and it worked well with respect to reducing the rankings of loosely coupled architectures, which were of limited use to scientific computing in general. The High-Performance Linpack (HPL) implementation comes with a self-adjustable problem size, which allows it to be used seamlessly on systems of vastly different sizes, as compared to the discrete, fixed sizes for the NAS PB. Unlike many other benchmarks with variable problem sizes, HPL achieves its best performance on large-scale problems that use all of a system’s available memory and not on small problems that fit into the cache. This greatly reduces the need for elab- […]

[…] that they do not reduce the number of floating-point operations performed. The TOP500 therefore cannot provide any basis for research into algorithmic improvements over time. Linpack and HPL could certainly be used to compare algorithmic improvements, but not in the context of the TOP500 ranking.

TOP500 TRENDS

Although we started the TOP500 to provide statistics about the HPC market at specific dates, it became immediately clear that the inherent ability to systematically track the evolution […]

[…] predict. Figure 1 shows performance values for the first and last systems as well as average performance of all systems in the TOP500. Until 2008, these curves grew exponentially at a rate of 1.91 per year (multiplicative factor). Compared to the exponential growth rate of Moore’s law at 1.59 per year, TOP500 system performance had an excess exponential growth rate of 1.20 per year. We suspected that this additional growth was driven by an increasing number of processor sockets in our system sample. (We use the term “processor sockets” to clearly differentiate processors from processor cores.)

[Figure 2 plot: number of processor sockets (log scale, 10 to 10,000) versus year (1996 to 2014).]

FIGURE 2. Average number of processor sockets for new supercomputers in the TOP500, excluding systems with SIMD processors, vector processors, or accelerators. The exponential increase in the number of sockets up to 2008 accounts for the higher-than-expected growth rate in supercomputing performance during the same time period.

To better understand this and other technological trends contained in the TOP500 data, we obtained a clean and uniform subsample of systems from each edition of the list by extracting the new systems and those systems that did not use special processors with vastly different characteristics, including SIMD processors, vector processors, or accelerators (such as Nvidia GPUs and Intel Phi coprocessors).