Sequoia and the Petascale Era

Sequoia and the Petascale Era
SCICOMP 15, May 20, 2009
Thomas Spelce, Development Environment Group
Lawrence Livermore National Laboratory, P.O. Box 808, Livermore, CA 94551
This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344. LLNL-PRES-411030

The Advanced Simulation and Computing (ASC) Program delivers high-confidence prediction of weapons behavior
(Diagram: integrated codes, i.e. codes to predict safety and reliability; physics and engineering models, providing models and understanding; verification and validation; experiments, including NNSA Science Campaigns and legacy underground tests (UGTs), provide critical validation data.)
ASC integrates all of the science and engineering that makes stewardship successful.

ASC pursued three classes of systems to cost-effectively meet current (and anticipate future) compute requirements
• Capability systems ==> the most challenging integrated design calculations
  - More costly but proven
  - Production workload
• Capacity systems ==> day-to-day work
  - Less costly, somewhat less reliable
  - Throughput for less demanding problems
• Advanced architectures ==> performance, power consumption, etc.
  - Targeted but demanding workload
  - Tomorrow's mainstream solutions?
(Chart: performance versus time, FY01 onward, with the three curves populated by systems including Red, Blue, White, Q, Purple, MCR, Thunder, Peloton, TLCC (Juno), BlueGene/L, Roadrunner, and Sequoia; annotations include "Higher performance and lower power consumption," "Original concept: develop capability," "Low-cost capacity," and "Mainframes (RIP).")
The "three curves" approach (capability, capacity, and advanced architectures) has been successful in delivering good cost performance across the spectrum of need.

Sequoia represents the largest increase in computational power ever delivered for NNSA Stockpile Stewardship
(Timeline, CY2006 through CY2012: market survey; CD0 approved; RFP written; vendor responses; CD1 approved; contract package, selection, and Sequoia plan review; Sequoia contract award; CD2/3 approved; Dawn LA and Dawn GA, Dawn early science, and transition to classified; Dawn Phase 1 and Phase 2; Dawn system acceptance; Sequoia build decision; Sequoia parts commit & option, parts build, and phased system deliveries; Sequoia demo; Sequoia early science and transition to classified; CD4 approved; Sequoia operational readiness; Sequoia final system acceptance.)
Sequoia planned lifetime: five years, through CY17.

"Dawn speeds a man on his journey, and speeds him too in his work" ...Hesiod (~700 B.C.E.)
Dawn specifications:
• IBM BG/P architecture
• 36,864 compute nodes (500 TF)
• 147,456 PPC 450 cores
• 4 GB memory per node (147.5 TB total)
• 128-to-1 compute-to-I/O node ratio
• 288 10GbE links to the file system
Dawn installation:
• February 27 – final rack delivery
• March 5 – 36-rack integration complete
• March 15-24 – synthetic workload runs start
• End of March – acceptance (planned)
Reference: ibm.com/systems/deepcomputing/bluegene/

Dawn, the Sequoia initial delivery system: building blocks
• Chip: 850 MHz PPC 450, 4 cores / 4 threads, 13.6 GF/s peak, 8 MB EDRAM
• Compute card: 13.6 GF/s, 4.0 GB DDR2, 13.6 GB/s memory bandwidth, 0.75 GB/s 3D-torus bandwidth
• Node card: 435 GF/s, 128 GB
• Rack: 14 TF/s, 4 TB, 36 kW
• System (36 racks): 0.5 PF/s, 144 TB, 1.3 MW, >8-day MTBF
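
As a quick cross-check of the aggregate Dawn figures quoted above (a reconstruction: the 1,024-compute-nodes-per-rack figure is the standard BG/P rack size and is implied by, not stated on, the slides):

$$36 \text{ racks} \times 1{,}024 \tfrac{\text{nodes}}{\text{rack}} = 36{,}864 \text{ nodes}, \qquad 36{,}864 \times 4 \text{ cores} = 147{,}456 \text{ cores}$$

$$36{,}864 \times 13.6 \text{ GF/s} \approx 501 \text{ TF/s} \approx 0.5 \text{ PF/s}, \qquad 36{,}864 \times 4 \text{ GB} = 147{,}456 \text{ GB} \approx 147.5 \text{ TB}$$

The memory total also reconciles the two figures that appear on the slides: 147,456 GB is roughly 147.5 TB in decimal units, or 144 TB in binary (TiB) units as listed in the building-block table.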

Dawn initial delivery infrastructure
(Diagram: the Dawn core, 9 x 4 BG/P racks, connects through 288 10GbE links toward the file system and LLNL 10GbE networks, with 1GbE management Ethernet (E-net core), an HTC 10GbE link, login and service nodes, primary and backup HMCs, FC4-attached local disk, and additional 10GbE uplinks.)

Sequoia target architecture and infrastructure
• Production operation FY12-FY17
• 20 PF/s, 1.6 PB memory
• 96 racks, 98,304 nodes
• 1.6 M cores (1 GB/core)
• 50 PB Lustre file system
• 6.0 MW power (160 times more efficient than Purple)
• Will be used as a 2D ultra-resolution and 3D high-resolution uncertainty quantification (UQ) engine
• Will be used for 3D science capability runs exploring key materials science problems

High-performance materials science simulations will contribute directly to ASC programmatic success
Six physics/materials science applications are targeted for early implementation on the Sequoia infrastructure:
• Qbox – quantum molecular dynamics for determination of material equations of state
• DDCMD – molecular dynamics for material dynamics
• Miranda – 3D continuum fluid dynamics for interfacial mixing
• ALE3D – 3D continuum mechanics for ignition and detonation propagation of explosives
• LAMMPS – molecular dynamics for shock initiation in high explosives
• ParaDiS – dislocation dynamics for high-pressure strength in materials

The single Sequoia platform mandatory requirement is P ≥ 20
• P is the "peak" of the machine, measured in petaFLOP/s
• The target requirement is P + S ≥ 40
  - S is a weighted average of five "marquee" benchmark codes
  - Four code-package benchmarks: UMT, IRS, AMG, and SPhot; the program goal is 24x the Purple capability throughput
  - One "science workload" benchmark from SNL: LAMMPS (molecular dynamics); the program goal is 20x-50x BG/L for science capability
• Reference points: Purple – 100 TF/s; BlueGene/L – 367 TF/s
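
Written out as a formula (a hedged reconstruction from the bullets above; the individual benchmark weights w_i and the normalization of each benchmark's figure of merit s_i are not given on the slide and are assumed here), the procurement target reads:

$$P \ge 20 \ \text{PF/s} \quad \text{(mandatory)}, \qquad P + S \ge 40 \quad \text{(target)}, \qquad S = \sum_{i=1}^{5} w_i\, s_i, \quad \sum_{i=1}^{5} w_i = 1,$$

where s_i is the measured figure of merit of marquee benchmark i (UMT, IRS, AMG, SPhot, LAMMPS), taken here to be expressed in the same PF/s-like units as P, and w_i is its weight.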

Sequoia operating system perspective
Compute nodes (CN): lightweight kernel
• Optimized for scalability and reliability; as simple as possible; extremely low OS noise
• Direct access to the interconnect hardware
• Linux/Unix syscall compatible, with I/O syscalls function-shipped to the I/O nodes
• Support for dynamic library runtime loading and shared memory regions
• Open source
• Application stack on each compute node (applications 1..N): glibc with dynamic loading, NPTL POSIX threads, OpenMP, SE/TM, MPI over ADI, shared memory, futex, RAS, and hardware transport to the Sequoia CN interconnect
I/O nodes (ION): open-source Linux/Unix
• Leverages the large Linux/Unix base and community
• Enhanced TCP offload, PCIe, and I/O
• Standard file systems: Lustre (via LNet), NFSv4, etc., over TCP/UDP/IP
• Aggregates N compute nodes for I/O and administration (function-shipped syscalls)
• Hosts FSD, SLURMD, performance tools, and TotalView components

Sequoia software stack, applications perspective
(Diagram: in user space, the application sits on C/C++/Fortran compilers and Python, code-development tools, parallel and optimized math libraries, OpenMP/threads/SE/TM, the Clib/F03 runtime, and MPI2 over ADI, with SLURM/Moab for resource management; in kernel space, the lightweight kernel, RAS, and control-system infrastructure, sockets, and the Lustre client over LNet and TCP/UDP/IP connect to the interconnect interface and the external network; syscalls are function-shipped between the two sides.)

The tools that users know and love will be available on Sequoia, with improvements and additions as needed
(Chart: tools plotted by operational scale, from 1 to 10^7 processes, against debugging, performance, and infrastructure features. Existing tools include TotalView, gprof, PAPI, Valgrind MemCheck, ThreadCheck, mpiP, PMPI, O|SS, OTF, TAU, DPCL, and Dyninst; new and focus tools include STAT, MRNet, LaunchMON, TV memlight, memP, lightweight tools, an OpenMP analyzer and OpenMP profiling interface, an SE/TM analyzer, monitor, and debugger, a stack walker, and APAI.)

Application programming requirements and challenges
• The availability of 1.6 M cores pushes all-MPI codes to extreme concurrency
• The availability of many threads on many SMP cores encourages low-level parallelism for higher performance
• A mixed MPI/SMP programming environment and the possibility of heterogeneous compute distribution bring load imbalance to the fore
• I/O and visualization requirements encourage innovative strategies to minimize memory and bandwidth bottlenecks
(Figure: a layered view of the programming model, from MPI scaling through SMP threads and hybrid models up to I/O and visualization.)

The RFP asked interested vendors to address a "Unified Nested Node Concurrency" model
(Diagram: a pthreads-based MAIN running from MPI_INIT to MPI_FINALIZE with Thread0 through Thread3; worker threads and the functions Funct1 and Funct2 are nested beneath Thread0, which makes the MPI calls.)
1) Pthreads are born with MAIN
2) Only Thread0 calls functions to nest parallelism
3) The pthreads-based MAIN calls the OpenMP-based Funct1
4) The OpenMP-based Funct1 calls the TM/SE-based Funct2
5) Funct2 returns to the OpenMP-based Funct1
6) Funct1 returns
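
To make the nesting concrete, here is a minimal hybrid MPI + OpenMP sketch in the spirit of the model above. It is an illustration, not the vendor's implementation: the MPI layer plays the role of MAIN, an OpenMP parallel region plays the role of Funct1, and the TM/SE-protected update of Funct2, which has no portable C equivalent, is stood in for by an OpenMP critical section. The names funct1 and funct2 simply mirror the slide's labels.

    /* Hypothetical sketch of hybrid MPI + OpenMP nesting, in the spirit of the
     * "unified nested node concurrency" model described above.  The TM/SE layer
     * is not portable C, so it is represented here by an OpenMP critical section.
     * Compile with, e.g.:  mpicc -fopenmp nested.c -o nested
     */
    #include <mpi.h>
    #include <omp.h>
    #include <stdio.h>

    /* Innermost level: stands in for a TM/SE-protected update (the slide's Funct2). */
    static void funct2(double *sum, double contribution)
    {
        #pragma omp critical   /* placeholder for a TM/SE atomic region */
        *sum += contribution;
    }

    /* Middle level: OpenMP-parallel work called from the MPI layer (the slide's Funct1). */
    static double funct1(int rank)
    {
        double sum = 0.0;
        #pragma omp parallel
        {
            double local = (double)(rank + omp_get_thread_num());
            funct2(&sum, local);   /* nested call into the innermost level */
        }
        return sum;
    }

    int main(int argc, char **argv)
    {
        int provided, rank;

        /* FUNNELED: only the thread that called MPI_Init_thread makes MPI calls,
         * matching the "only Thread0 calls MPI" rule on the slide. */
        MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        double node_sum = funct1(rank);   /* on-node SMP parallelism */

        double global_sum = 0.0;          /* off-node MPI parallelism */
        MPI_Reduce(&node_sum, &global_sum, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

        if (rank == 0)
            printf("global sum = %f\n", global_sum);

        MPI_Finalize();
        return 0;
    }

Requesting MPI_THREAD_FUNNELED rather than MPI_THREAD_MULTIPLE keeps the MPI library's threading requirements minimal and mirrors the slide's constraint that only Thread0 drives the nested parallelism and the MPI calls.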