Leadership Computing at the National Center for Computational Science: Transitioning to Heterogeneous Architectures
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
Petaflops for the People
PETAFLOPS SPOTLIGHT: NERSC housands of researchers have used facilities of the Advanced T Scientific Computing Research (ASCR) program and its EXTREME-WEATHER Department of Energy (DOE) computing predecessors over the past four decades. Their studies of hurricanes, earthquakes, NUMBER-CRUNCHING green-energy technologies and many other basic and applied Certain problems lend themselves to solution by science problems have, in turn, benefited millions of people. computers. Take hurricanes, for instance: They’re They owe it mainly to the capacity provided by the National too big, too dangerous and perhaps too expensive Energy Research Scientific Computing Center (NERSC), the Oak to understand fully without a supercomputer. Ridge Leadership Computing Facility (OLCF) and the Argonne Leadership Computing Facility (ALCF). Using decades of global climate data in a grid comprised of 25-kilometer squares, researchers in These ASCR installations have helped train the advanced Berkeley Lab’s Computational Research Division scientific workforce of the future. Postdoctoral scientists, captured the formation of hurricanes and typhoons graduate students and early-career researchers have worked and the extreme waves that they generate. Those there, learning to configure the world’s most sophisticated same models, when run at resolutions of about supercomputers for their own various and wide-ranging projects. 100 kilometers, missed the tropical cyclones and Cutting-edge supercomputing, once the purview of a small resulting waves, up to 30 meters high. group of experts, has trickled down to the benefit of thousands of investigators in the broader scientific community. Their findings, published inGeophysical Research Letters, demonstrated the importance of running Today, NERSC, at Lawrence Berkeley National Laboratory; climate models at higher resolution. -
Workload Management and Application Placement for the Cray Linux Environment™
TMTM Workload Management and Application Placement for the Cray Linux Environment™ S–2496–4101 © 2010–2012 Cray Inc. All Rights Reserved. This document or parts thereof may not be reproduced in any form unless permitted by contract or by written permission of Cray Inc. U.S. GOVERNMENT RESTRICTED RIGHTS NOTICE The Computer Software is delivered as "Commercial Computer Software" as defined in DFARS 48 CFR 252.227-7014. All Computer Software and Computer Software Documentation acquired by or for the U.S. Government is provided with Restricted Rights. Use, duplication or disclosure by the U.S. Government is subject to the restrictions described in FAR 48 CFR 52.227-14 or DFARS 48 CFR 252.227-7014, as applicable. Technical Data acquired by or for the U.S. Government, if any, is provided with Limited Rights. Use, duplication or disclosure by the U.S. Government is subject to the restrictions described in FAR 48 CFR 52.227-14 or DFARS 48 CFR 252.227-7013, as applicable. Cray and Sonexion are federally registered trademarks and Active Manager, Cascade, Cray Apprentice2, Cray Apprentice2 Desktop, Cray C++ Compiling System, Cray CX, Cray CX1, Cray CX1-iWS, Cray CX1-LC, Cray CX1000, Cray CX1000-C, Cray CX1000-G, Cray CX1000-S, Cray CX1000-SC, Cray CX1000-SM, Cray CX1000-HN, Cray Fortran Compiler, Cray Linux Environment, Cray SHMEM, Cray X1, Cray X1E, Cray X2, Cray XD1, Cray XE, Cray XEm, Cray XE5, Cray XE5m, Cray XE6, Cray XE6m, Cray XK6, Cray XK6m, Cray XMT, Cray XR1, Cray XT, Cray XTm, Cray XT3, Cray XT4, Cray XT5, Cray XT5h, Cray XT5m, Cray XT6, Cray XT6m, CrayDoc, CrayPort, CRInform, ECOphlex, LibSci, NodeKARE, RapidArray, The Way to Better Science, Threadstorm, uRiKA, UNICOS/lc, and YarcData are trademarks of Cray Inc. -
Red Storm Infrastructure at Sandia National Laboratories
Red Storm Infrastructure Robert A. Ballance, John P. Noe, Milton J. Clauser, Martha J. Ernest, Barbara J. Jennings, David S. Logsted, David J. Martinez, John H. Naegle, and Leonard Stans Sandia National Laboratories∗ P.O. Box 5800 Albquuerque, NM 87185-0807 May 13, 2005 Abstract Large computing systems, like Sandia National Laboratories’ Red Storm, are not deployed in isolation. The computer, its infrastructure support, and the operations of both have to be designed to support the mission of the or- ganization. This talk and paper will describe the infrastructure and operational requirements of Red Storm along with a discussion of how Sandia is meeting those requirements. It will include discussions of the facilities that surround and support Red Storm: shelter, power, cooling, security, networking, interactions with other Sandia platforms and capabilities, operations support, and user services. 1 Introduction mary components: Stewart Brand, in How Buildings Learn[Bra95], has doc- 1. The physical setting, such as buildings, power, and umented many of the ways in which buildings — often cooling, thought to be static structures – evolve over time to meet the needs of their occupants and the constraints of their 2. The computational setting which includes the re- environments. Machines learn in the same way. This is lated computing systems that support ASC efforts not the learning of “Machine learning” in Artificial Intel- on Red Storm, ligence; it is the very human result of making a machine 3. The networking setting that ties all of these to- smarter as it operates within its distinct environment. gether, and Deploying a large computational resource teaches many lessons, including the need to meet the expecta- 4. -
Merrimac – High-Performance and Highly-Efficient Scientific Computing with Streams
MERRIMAC – HIGH-PERFORMANCE AND HIGHLY-EFFICIENT SCIENTIFIC COMPUTING WITH STREAMS A DISSERTATION SUBMITTED TO THE DEPARTMENT OF ELECTRICAL ENGINEERING AND THE COMMITTEE ON GRADUATE STUDIES OF STANFORD UNIVERSITY IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY Mattan Erez May 2007 c Copyright by Mattan Erez 2007 All Rights Reserved ii I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy. (William J. Dally) Principal Adviser I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy. (Patrick M. Hanrahan) I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy. (Mendel Rosenblum) Approved for the University Committee on Graduate Studies. iii iv Abstract Advances in VLSI technology have made the raw ingredients for computation plentiful. Large numbers of fast functional units and large amounts of memory and bandwidth can be made efficient in terms of chip area, cost, and energy, however, high-performance com- puters realize only a small fraction of VLSI’s potential. This dissertation describes the Merrimac streaming supercomputer architecture and system. Merrimac has an integrated view of the applications, software system, compiler, and architecture. We will show how this approach leads to over an order of magnitude gains in performance per unit cost, unit power, and unit floor-space for scientific applications when compared to common scien- tific computers designed around clusters of commodity general-purpose processors. -
Cray XT and Cray XE Y Y System Overview
Crayyy XT and Cray XE System Overview Customer Documentation and Training Overview Topics • System Overview – Cabinets, Chassis, and Blades – Compute and Service Nodes – Components of a Node Opteron Processor SeaStar ASIC • Portals API Design Gemini ASIC • System Networks • Interconnection Topologies 10/18/2010 Cray Private 2 Cray XT System 10/18/2010 Cray Private 3 System Overview Y Z GigE X 10 GigE GigE SMW Fibre Channels RAID Subsystem Compute node Login node Network node Boot /Syslog/Database nodes 10/18/2010 Cray Private I/O and Metadata nodes 4 Cabinet – The cabinet contains three chassis, a blower for cooling, a power distribution unit (PDU), a control system (CRMS), and the compute and service blades (modules) – All components of the system are air cooled A blower in the bottom of the cabinet cools the blades within the cabinet • Other rack-mounted devices within the cabinet have their own internal fans for cooling – The PDU is located behind the blower in the back of the cabinet 10/18/2010 Cray Private 5 Liquid Cooled Cabinets Heat exchanger Heat exchanger (XT5-HE LC only) (LC cabinets only) 48Vdc flexible Cage 2 buses Cage 2 Cage 1 Cage 1 Cage VRMs Cage 0 Cage 0 backplane assembly Cage ID controller Interconnect 01234567 Heat exchanger network cable Cage inlet (LC cabinets only) connection air temp sensor Airflow Heat exchanger (slot 3 rail) conditioner 48Vdc shelf 3 (XT5-HE LC only) 48Vdc shelf 2 L1 controller 48Vdc shelf 1 Blower speed controller (VFD) Blooewer PDU line filter XDP temperature XDP interface & humidity sensor -
Developing Integrated Data Services for Cray Systems with a Gemini Interconnect
Developing Integrated Data Services for Cray Systems with a Gemini Interconnect Ron A. Oldeld Todd Kordenbrock Jay Lofstead May 1, 2012 Abstract Over the past several years, there has been increasing interest in injecting a layer of compute resources between a high-performance computing application and the end storage devices. For some projects, the objective is to present the parallel le system with a reduced set of clients, making it easier for le-system vendors to support extreme-scale systems. In other cases, the objective is to use these resources as staging areas to aggregate data or cache bursts of I/O operations. Still others use these staging areas for in-situ analysis on data in-transit between the application and the storage system. To simplify our discussion, we adopt the general term Integrated Data Services to represent these use-cases. This paper describes how we provide user-level, integrated data services for Cray systems that use the Gemini Interconnect. In particular, we describe our implementation and performance results on the Cray XE6, Cielo, at Los Alamos National Laboratory. 1 Introduction den on the le system and potentially improve over- all ecency of the application workow. Current In our quest toward exascale systems and applica- work to enable these coupling and workow scenar- tions, one topic that is frequently discussed is the ios are focused on the data issues to resolve resolu- need for more exible execution models. Current tion and mesh mismatches, time scale mismatches, models for capability class high-performance comput- and make data available through data staging tech- ing (HPC) systems are essentially static, requiring niques [20, 34, 21, 11, 2, 40, 10]. -
Jaguar and Kraken -The World's Most Powerful Computer Systems
Jaguar and Kraken -The World's Most Powerful Computer Systems Arthur Bland Cray Users’ Group 2010 Meeting Edinburgh, UK May 25, 2010 Abstract & Outline At the SC'09 conference in November 2009, Jaguar and Kraken, both located at ORNL, were crowned as the world's fastest computers (#1 & #3) by the web site www.Top500.org. In this paper, we will describe the systems, present results from a number of benchmarks and applications, and talk about future computing in the Oak Ridge Leadership Computing Facility. • Cray computer systems at ORNL • System Architecture • Awards and Results • Science Results • Exascale Roadmap 2 CUG2010 – Arthur Bland Jaguar PF: World’s most powerful computer— Designed for science from the ground up Peak performance 2.332 PF System memory 300 TB Disk space 10 PB Disk bandwidth 240+ GB/s Based on the Sandia & Cray Compute Nodes 18,688 designed Red Storm System AMD “Istanbul” Sockets 37,376 Size 4,600 feet2 Cabinets 200 3 CUG2010 – Arthur Bland (8 rows of 25 cabinets) Peak performance 1.03 petaflops Kraken System memory 129 TB World’s most powerful Disk space 3.3 PB academic computer Disk bandwidth 30 GB/s Compute Nodes 8,256 AMD “Istanbul” Sockets 16,512 Size 2,100 feet2 Cabinets 88 (4 rows of 22) 4 CUG2010 – Arthur Bland Climate Modeling Research System Part of a research collaboration in climate science between ORNL and NOAA (National Oceanographic and Atmospheric Administration) • Phased System Delivery • Total System Memory – CMRS.1 (June 2010) 260 TF – 248 TB DDR3-1333 – CMRS.2 (June 2011) 720 TF • File Systems -
The Gemini Network
The Gemini Network Rev 1.1 Cray Inc. © 2010 Cray Inc. All Rights Reserved. Unpublished Proprietary Information. This unpublished work is protected by trade secret, copyright and other laws. Except as permitted by contract or express written permission of Cray Inc., no part of this work or its content may be used, reproduced or disclosed in any form. Technical Data acquired by or for the U.S. Government, if any, is provided with Limited Rights. Use, duplication or disclosure by the U.S. Government is subject to the restrictions described in FAR 48 CFR 52.227-14 or DFARS 48 CFR 252.227-7013, as applicable. Autotasking, Cray, Cray Channels, Cray Y-MP, UNICOS and UNICOS/mk are federally registered trademarks and Active Manager, CCI, CCMT, CF77, CF90, CFT, CFT2, CFT77, ConCurrent Maintenance Tools, COS, Cray Ada, Cray Animation Theater, Cray APP, Cray Apprentice2, Cray C90, Cray C90D, Cray C++ Compiling System, Cray CF90, Cray EL, Cray Fortran Compiler, Cray J90, Cray J90se, Cray J916, Cray J932, Cray MTA, Cray MTA-2, Cray MTX, Cray NQS, Cray Research, Cray SeaStar, Cray SeaStar2, Cray SeaStar2+, Cray SHMEM, Cray S-MP, Cray SSD-T90, Cray SuperCluster, Cray SV1, Cray SV1ex, Cray SX-5, Cray SX-6, Cray T90, Cray T916, Cray T932, Cray T3D, Cray T3D MC, Cray T3D MCA, Cray T3D SC, Cray T3E, Cray Threadstorm, Cray UNICOS, Cray X1, Cray X1E, Cray X2, Cray XD1, Cray X-MP, Cray XMS, Cray XMT, Cray XR1, Cray XT, Cray XT3, Cray XT4, Cray XT5, Cray XT5h, Cray Y-MP EL, Cray-1, Cray-2, Cray-3, CrayDoc, CrayLink, Cray-MP, CrayPacs, CrayPat, CrayPort, Cray/REELlibrarian, CraySoft, CrayTutor, CRInform, CRI/TurboKiva, CSIM, CVT, Delivering the power…, Dgauss, Docview, EMDS, GigaRing, HEXAR, HSX, IOS, ISP/Superlink, LibSci, MPP Apprentice, ND Series Network Disk Array, Network Queuing Environment, Network Queuing Tools, OLNET, RapidArray, RQS, SEGLDR, SMARTE, SSD, SUPERLINK, System Maintenance and Remote Testing Environment, Trusted UNICOS, TurboKiva, UNICOS MAX, UNICOS/lc, and UNICOS/mp are trademarks of Cray Inc. -
A Cray Manufactured XE6 System Called “HERMIT” at HLRS in Use in Autumn 2011
Press release For release 21/04/2011 More capacity to the PRACE Research Infrastructure: a Cray manufactured XE6 system called “HERMIT” at HLRS in use in autumn 2011 PRACE, the Partnership for Advanced Computing in Europe, adds more capacity to the PRACE Research Infrastructure as the Gauss Centre for Supercomputing (GCS) partner High Performance Computing Center of University Stuttgart (HLRS) will deploy the first installation step of a system called Hermit, a 1 Petaflop/s Cray system in the autumn of 2011. This first installation step is to be followed by a 4-5 Petaflop/s second step in 2013. The contract with Cray includes the delivery of a Cray XE6 supercomputer and the future delivery of Cray's next-generation supercomputer code-named "Cascade." The new Cray system at HLRS will serve as a supercomputing resource for researchers, scientists and engineers throughout Europe. HLRS is one of the leading centers in the European PRACE initiative and is currently the only large European high performance computing (HPC) center to work directly with industrial partners in automotive and aerospace engineering. HLRS is also a key partner of the Gauss Centre for Supercomputing (GCS), which is an alliance of the three major supercomputing centers in Germany that collectively provide one of the largest and most powerful supercomputer infrastructures in the world. "Cray is just the right partner as we enter the era of petaflops computing. Together with Cray's outstanding supercomputing technology, our center will be able to carry through the new initiative for engineering and industrial simulation. This is especially important as we work at the forefront of electric mobility and sustainable energy supply", said prof. -
Red Storm IO Performance Analysis James H
1 Red Storm IO Performance Analysis James H. Laros III #1,LeeWard#2, Ruth Klundt ∗3, Sue Kelly #4 James L. Tomkins #5, Brian R. Kellogg #6 #Sandia National Laboratories 1515 Eubank SE Albuquerque NM, 87123-1319 [email protected] [email protected], [email protected] [email protected], [email protected] ∗Hewlett-Packard 3000 Hanover Street Palo Alto, CA 94304-1185 [email protected] Abstract— This paper will summarize an IO1 performance hardware or software, potentially affects how the entire system analysis effort performed on Sandia National Laboratories Red performs. Storm platform. Our goal was to examine the IO system We consider the testing described in this paper to be second performance and identify problems or bottle-necks in any aspect of the IO sub-system. Our process examined the entire IO path phase testing. Initial testing was accomplished to establish from application to disk both in segments and as a whole. Our baselines, develop test harnesses and establish test parameters final analysis was performed at scale employing parallel IO that would provide meaningful information within the lim- access methods typically used in High Performance Computing its imposed such as time constraints. All testing presented applications. in this paper was accomplished during system preventative Index Terms— Red Storm, Lustre, CFS, Data Direct Networks, maintenance periods on the production Red Storm platform, Parallel File-Systems on production file-systems. Testing time on heavily utilized production platforms such as Red Storm is precious. Detailed results of previous, current and future testing is posted on- I. INTRODUCTION line[2]. IO is a critical component of capability computing2.At In section II we will discuss components of the Red Storm Sandia Labs, high value is placed on balanced platforms. -
Pubtex Output 2011.12.12:1229
TM Using Cray Performance Analysis Tools S–2376–53 © 2006–2011 Cray Inc. All Rights Reserved. This document or parts thereof may not be reproduced in any form unless permitted by contract or by written permission of Cray Inc. U.S. GOVERNMENT RESTRICTED RIGHTS NOTICE The Computer Software is delivered as "Commercial Computer Software" as defined in DFARS 48 CFR 252.227-7014. All Computer Software and Computer Software Documentation acquired by or for the U.S. Government is provided with Restricted Rights. Use, duplication or disclosure by the U.S. Government is subject to the restrictions described in FAR 48 CFR 52.227-14 or DFARS 48 CFR 252.227-7014, as applicable. Technical Data acquired by or for the U.S. Government, if any, is provided with Limited Rights. Use, duplication or disclosure by the U.S. Government is subject to the restrictions described in FAR 48 CFR 52.227-14 or DFARS 48 CFR 252.227-7013, as applicable. Cray, LibSci, and PathScale are federally registered trademarks and Active Manager, Cray Apprentice2, Cray Apprentice2 Desktop, Cray C++ Compiling System, Cray CX, Cray CX1, Cray CX1-iWS, Cray CX1-LC, Cray CX1000, Cray CX1000-C, Cray CX1000-G, Cray CX1000-S, Cray CX1000-SC, Cray CX1000-SM, Cray CX1000-HN, Cray Fortran Compiler, Cray Linux Environment, Cray SHMEM, Cray X1, Cray X1E, Cray X2, Cray XD1, Cray XE, Cray XEm, Cray XE5, Cray XE5m, Cray XE6, Cray XE6m, Cray XK6, Cray XMT, Cray XR1, Cray XT, Cray XTm, Cray XT3, Cray XT4, Cray XT5, Cray XT5h, Cray XT5m, Cray XT6, Cray XT6m, CrayDoc, CrayPort, CRInform, ECOphlex, Gemini, Libsci, NodeKARE, RapidArray, SeaStar, SeaStar2, SeaStar2+, Sonexion, The Way to Better Science, Threadstorm, uRiKA, and UNICOS/lc are trademarks of Cray Inc. -
Use Style: Paper Title
Titan: Early experience with the Cray XK6 at Oak Ridge National Laboratory Arthur S. Bland, Jack C. Wells, Otis E. Messer, Oscar R. Hernandez, James H. Rogers National Center for Computational Sciences Oak Ridge National Laboratory Oak Ridge, Tennessee, 37831, USA [blandas, wellsjc, bronson, oscar, jrogers, at ORNL.GOV] Abstract— In 2011, Oak Ridge National Laboratory began an turbines, building a virtual nuclear reactor to increase power upgrade to Jaguar to convert it from a Cray XT5 to a Cray output and longevity of nuclear power plants, modeling the XK6 system named Titan. This is being accomplished in two ITER prototype fusion reactor, predicting impacts from phases. The first phase, completed in early 2012, replaced all of climate change, earthquakes and tsunamis, and improving the XT5 compute blades with XK6 compute blades, and the aerodynamics of vehicles to increase fuel economy. replaced the SeaStar interconnect with Cray’s new Gemini Access to Titan is available through several allocation network. Each compute node is configured with an AMD programs. More information about access is available Opteron™ 6274 16-core processors and 32 gigabytes of DDR3- through the OLCF web site at: 1600 SDRAM. The system aggregate includes 600 terabytes of http://www.olcf.ornl.gov/support/getting-started/. system memory. In addition, the first phase includes 960 The OLCF worked with Cray to design Titan to be an NVIDIA X2090 Tesla processors. In the second phase, ORNL will add NVIDIA’s next generation Tesla processors to exceptionally well-balanced system for modeling and increase the combined system peak performance to over 20 simulation at the highest end of high-performance PFLOPS.