SC14, November 17-21, 2014
Terri Quinn, Principal Deputy Department Head, Integrated Computing and Communications
LLNL-PRES-664252 This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344. Lawrence Livermore National Security, LLC
Science and Technology on a Mission
Strengthen the United States’ security by developing and applying world-class science, technology, and engineering.
CORAL is a DOE NNSA & Office of Science project to procure 3 leadership computers for ANL, ORNL, & LLNL with delivery in CY17-18
• Modeled on successful LLNL/ANL/IBM Blue Gene partnership (Sequoia/Mira)
• Long-term contractual partnership with 2 vendors
[Diagram: LLNL’s IBM Blue Gene Systems: BG/L, BG/P (Dawn), BG/Q (Sequoia)]
[Diagram: RFP leading to 2 nonrecurring engineering (NRE) contracts and 3 platform contracts: ORNL Summit contract (2017 delivery), LLNL Sierra contract (2017 delivery), ANL computer contract]
CORAL is the next major phase in the U.S. Department of Energy’s scientific computing roadmap and path to exascale computing
Sierra workloads were derived directly from the needs to fulfill NNSA’s Advanced Simulation and Computing (ASC) mission
Sierra will provide computational resources that are essential for nuclear weapon scientists to fulfill the stockpile stewardship mission through simulation in lieu of underground testing.
Two broad simulation classes constitute Sierra’s workload:
#1 Assess the performance of integrated nuclear weapon systems
#2 Perform weapons science and engineering calculations
NNSA’s Advanced Simulation and Computing (ASC) Platform Timeline
[Timeline chart, Fiscal Year ‘12 to ‘21. Legend: System Delivery; Dev. & Deploy; Use; Retire]
Advanced Technology Systems (ATS)
• Cielo (LANL/SNL)
• Sequoia (LLNL)
• ATS 1 – Trinity (LANL/SNL)
• ATS 2 – Sierra (LLNL)
• ATS 3 – (LANL/SNL)
Commodity Technology Systems (CTS)
• Tri-lab Linux Capacity Cluster II (TLCC II)
• CTS 1
• CTS 2
ASC Platform Strategy includes application code transition for all platforms
LLNL selected the most compelling system for NNSA
• 5x to 7x NNSA app performance improvement over Sequoia
• Unmodified codes can run on Power®
• Memory rich nodes; high node memory bandwidth
• Volta™ GPUs provide substantial performance potential
• Outstanding benchmark analysis by IBM + NVIDIA
• Cost competitive; low risk solution; outstanding hardware reliability
NRE contract provides significant benefit
• Center of Excellence - expert help with porting and optimizing actual applications
• Motherboard design and novel cooling concept
• GPU reliability; file system performance; open source compiler infrastructure
• Advanced system diagnostics and scheduling; advanced networking capabilities
[Diagram: Notional Sierra node: multi-core CPU with memory, GPU with memory, coherent memory capability]
Sierra System
Compute System
• 2.1 – 2.7 PB Memory
• 120-150 PFLOPS
• 10 MW
Compute Rack
• Standard 19”
• Warm water cooling
Compute Node
• POWER® Architecture Processor
• NVIDIA® Volta™
• NVMe-compatible PCIe 800GB SSD
• > 512 GB DDR4 + HBM
• Coherent Shared Memory
Components
• IBM POWER: NVLink™
• NVIDIA Volta: HBM, NVLink
Mellanox® Interconnect
• Dual-rail EDR Infiniband®
GPFS™ File System
• 120 PB usable storage
• 1.0 TB/s bandwidth
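The headline Sierra figures quoted above (120-150 PFLOPS, 10 MW, 2.1 – 2.7 PB memory, 120 PB usable storage, 1.0 TB/s file system bandwidth) imply a few derived ratios that are easy to check. The sketch below is only back-of-the-envelope arithmetic, not a statement from the presentation: it assumes decimal SI prefixes, treats the 10 MW figure as applying to the whole system at peak, and the function names are hypothetical.

```python
# Back-of-the-envelope ratios from the Sierra figures quoted on the slide.
# Assumptions (not from the slide): decimal SI prefixes, peak numbers, and
# the 10 MW figure covering the whole system.

PEAK_PFLOPS = (120.0, 150.0)   # quoted range of peak performance
POWER_MW = 10.0                # quoted system power
MEMORY_PB = (2.1, 2.7)         # quoted range of total memory
FS_CAPACITY_PB = 120.0         # quoted usable file system storage
FS_BANDWIDTH_TB_PER_S = 1.0    # quoted file system bandwidth


def gflops_per_watt(pflops, megawatts):
    """Convert PFLOPS and MW into GFLOPS per watt."""
    flops = pflops * 1e15
    watts = megawatts * 1e6
    return flops / watts / 1e9


def memory_dump_seconds(memory_pb, bandwidth_tb_per_s):
    """Idealized time to write all of memory to the file system once."""
    memory_tb = memory_pb * 1000.0
    return memory_tb / bandwidth_tb_per_s


efficiency = tuple(gflops_per_watt(p, POWER_MW) for p in PEAK_PFLOPS)
dump_times = tuple(memory_dump_seconds(m, FS_BANDWIDTH_TB_PER_S) for m in MEMORY_PB)
capacity_ratio = tuple(FS_CAPACITY_PB / m for m in MEMORY_PB)

if __name__ == "__main__":
    print("GFLOPS per watt:", efficiency)
    print("Full-memory dump time (s):", dump_times)
    print("File system capacity / memory:", capacity_ratio)
```

Under these assumptions the quoted ranges work out to about 12 to 15 GFLOPS per watt, roughly 35 to 45 minutes to write the full memory to the file system at the quoted bandwidth, and a file system roughly 44 to 57 times the size of system memory.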