The Swiss High-Performance Computing and Networking (HPCN) Initiative

Thomas C. Schulthess

| T. Schulthess !1 What is considered success in HPC?

| T. Schulthess !2 High-Performance Computing Initiative (HPCN) in Switzerland

Three-pronged approach of the HPCN Initiative: (1) a new, flexible, and efficient building; (2) efficient supercomputers; (3) efficient applications, through high-risk & high-impact projects (www.hp2c.ch) and application-driven co-design.

Timeline 2009–2017: begin construction of the new building and complete it; Monte Rosa (Cray XT5, 14’762 cores), hex-core upgrade to 22’128 cores, upgrade to a Cray XE6 with 47,200 cores; development & procurement of petaflop/s-scale system(s); Phase I with the Aries network & multi-core nodes; Phase II upgrade to a K20X-based hybrid; upgrade to a Pascal-based hybrid; Phase III of the pre-exascale supercomputing ecosystem.

| T. Schulthess !3 FACT SHEET

“Piz Daint”, one of the most powerful supercomputers

A hardware upgrade in the final quarter of 2016 saw “Piz Daint”, Europe’s most powerful supercomputer, more than triple its computing performance. ETH Zurich invested around CHF 40 million in the upgrade, so that simulations, data analysis and visualisation can be performed more efficiently than ever before.

With a peak performance of seven petaflops, “Piz Daint” has been Europe’s fastest supercomputer since its debut in November 2013. And it is set to remain number one for now thanks to a hardware upgrade in late 2016, which boosted its peak performance to more than 25 petaflops. This increase in performance is vital for enabling the higher-resolution, compute- and data-intensive simulations used in modern materials science, physics, geophysics, life sciences and climate science. Data science too, an area where ETH Zurich is establishing strategic research strength, calls for high-power computing facilities. These fields involve the processing of vast amounts of data.

Thanks to the new hardware, researchers can run their simulations more realistically and more efficiently. In the future, big science experiments such as the Large Hadron Collider at CERN will also see their data analysis support provided by “Piz Daint”.

ETH Zurich has invested CHF 40 million in the upgrade of “Piz Daint” – from a Cray XC30 to a Cray XC40/XC50. The upgrade involved replacing two types of compute nodes as well as the integration of a novel technology from Cray known as DataWarp. DataWarp’s “burst buffer mode” quadruples the effective bandwidth to and from storage devices, markedly accelerating data input and output rates and so facilitating the analysis of millions of small, unstructured files. Thus, “Piz Daint” is able to analyse the results of its computations even while they are still in progress. The revamped “Piz Daint” remains an extremely energy-efficient and balanced system where simulations and data analyses are scalable from a few to thousands of compute nodes. The new system is now well equipped to provide an infrastructure that will accommodate the increasing demands in high performance computing (HPC) up until the end of the decade.

“Piz Daint” 2017 fact sheet

Piz Daint specifications: ~5’000 NVIDIA P100 GPU-accelerated nodes and ~1’400 dual multi-core-socket nodes

Model: Cray XC40/Cray XC50
Number of Hybrid Compute Nodes: 5 320
Number of Multicore Compute Nodes: 1 431
Theoretical Peak Floating-point Performance per Hybrid Node: 4.761 teraflops (Intel Xeon E5-2690 v3 / Nvidia Tesla P100)
Theoretical Peak Floating-point Performance per Multicore Node: 1.210 teraflops (Intel Xeon E5-2695 v4)
Theoretical Hybrid Peak Performance: 25.326 petaflops
Theoretical Multicore Peak Performance: 1.731 petaflops
Hybrid Memory Capacity per Node: 64 GB; 16 GB CoWoS HBM2
Multicore Memory Capacity per Node: 64 GB or 128 GB
Total System Memory: 437.9 TB; 83.1 TB
System Interconnect: Cray Aries routing and communications ASIC with Dragonfly network topology
Sonexion 3000 Storage Capacity: 6.2 PB
Sonexion 3000 Parallel File System Theoretical Peak Performance: 112 GB/s
Sonexion 1600 Storage Capacity: 2.5 PB
Sonexion 1600 Parallel File System Theoretical Peak Performance: 138 GB/s
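As a quick sanity check (my own arithmetic, not part of the fact sheet), the system-level peak numbers follow directly from the node counts and the per-node peaks:

```python
# Consistency check of the fact-sheet numbers: total peak = nodes x per-node peak.
hybrid_nodes, hybrid_tflops = 5320, 4.761        # XC50 nodes (Xeon E5-2690 v3 + Tesla P100)
multicore_nodes, multicore_tflops = 1431, 1.210  # XC40 nodes (Xeon E5-2695 v4)

print(f"hybrid peak    ~ {hybrid_nodes * hybrid_tflops / 1000:.2f} petaflops")      # ~25.33
print(f"multicore peak ~ {multicore_nodes * multicore_tflops / 1000:.2f} petaflops")  # ~1.73
```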

http://www.cscs.ch/publications/fact_sheets/index.html

| T. Schulthess !4

First production level GPU deployment in 2013

• 2009: Call for application development projects
• 2010: Start of 11 high-risk, high-impact development projects (incl. COSMO)
• Nov. 2011: decision to engage in a study with Cray (x86 vs. Xeon Phi vs. GPU)
• GPUs are NOT plan of record for Cray XC systems
• Jan. 2012: study of 9 applications shows GPUs are a viable option
• GPU blade design begins, but Xeon Phi (KNC) has higher priority
• Jul. 2012: difficulties in getting applications to perform on KNC
• Sep.–Oct. 2012: demonstrate that KNC can’t work (deficient memory subsystem)
• Nov. 2012: Cray switches priorities, putting GPU ahead of Xeon Phi
• Nov. 2012: first deployment of Piz Daint (CPU)
• Nov. 2013: full scale-out of Piz Daint with Kepler (K20X) GPUs
• Feb. 2014 – Apr. 2015: analysis, design, negotiations of the upgrade
• Nov. 2016: upgrade to Pascal GPUs
• Apr. 2017: fully integrated platform for compute (GPU & CPU) and data services

| T. Schulthess !5 FACT SHEET

“Piz Daint”, one of the most powerful supercomputers


“Piz Daint” 2017 fact sheet

Piz Daint specifications (as in the table above)

Institutions using Piz Daint
• User Lab (including PRACE Tier-0 allocations)
• University of Zurich, USI, PSI, EMPA
• MaterialsCloud and HBP Collaboratory (EPFL)
• CHIPP (since Aug. 2017)
• Others, e.g. the Swiss Data Science Center (exploratory)


| T. Schulthess !6

| T. Schulthess !7 Higher resolution is necessary for quantitative agreement with experiment (18 days for July 9–27, 2006)

Altdorf (Reuss valley) Lodrino (Leventina)

COSMO-2 COSMO-1

source: Oliver Fuhrer, MeteoSwiss

| T. Schulthess !8 Prognostic uncertainty: the weather system is chaotic → rapid growth of small perturbations (butterfly effect)

Ensemble method: compute the distribution over many simulations (figure labels: start, prognostic timeframe)
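To make the idea concrete, here is a toy sketch (my own illustration, not from the talk): the Lorenz-63 system stands in for the chaotic atmosphere, and an ensemble started from tiny perturbations of the same initial state shows how the spread grows over the prognostic timeframe.

```python
import numpy as np

def lorenz63_step(state, dt=0.005, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """One forward-Euler step of the Lorenz-63 system (a chaotic toy model)."""
    x, y, z = state[..., 0], state[..., 1], state[..., 2]
    dx = sigma * (y - x)
    dy = x * (rho - z) - y
    dz = x * y - beta * z
    return state + dt * np.stack([dx, dy, dz], axis=-1)

rng = np.random.default_rng(0)
n_members = 50
# All members start from (almost) the same analysis; only tiny perturbations differ.
ensemble = np.array([1.0, 1.0, 1.0]) + 1e-6 * rng.standard_normal((n_members, 3))

for step in range(1, 4001):                      # the "prognostic timeframe"
    ensemble = lorenz63_step(ensemble)
    if step % 1000 == 0:
        print(f"step {step:5d}  ensemble std (x, y, z) = {ensemble.std(axis=0)}")

# The spread grows from ~1e-6 to order one: far into the forecast a single run is
# not reliable, but the distribution over the ensemble still carries information.
```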

| T. Schulthess !9 Benefit of ensemble forecast (heavy thunderstorms on July 24, 2015)

Adelboden (reliable?)

source: Oliver Fuhrer, MeteoSwiss

| T. Schulthess !10 Benefit of ensemble forecast (heavy thunderstorms on July 24, 2015)

Adelboden

source: Oliver Fuhrer, MeteoSwiss

| T. Schulthess !11 MeteoSwiss’ performance ambitions in 2013

Requirements from MeteoSwiss: data assimilation (6x), ensemble with multiple forecasts (24x), grid refinement from 2.2 km to 1.1 km (10x), with a constant budget for investments and operations.

We need a 40x improvement between 2012 and 2015 at constant cost

| T. Schulthess !12 COSMO: old and new (refactored) code

Old code (current, Fortran): main calls dynamics (Fortran) and physics (Fortran), communicating via MPI on the system.

New code (refactored): main (Fortran) calls dynamics rewritten in C++ on top of a stencil library with boundary conditions & halo exchange, and physics (Fortran) with OpenMP/OpenACC; both sit on a generic communication library and shared infrastructure (MPI or whatever) and target x86 and GPU back-ends.
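The actual rewrite uses a C++ stencil library (with the Fortran physics annotated for OpenMP/OpenACC); the following is only an illustrative Python/mpi4py sketch of the two ingredients the diagram names: a stencil update on the interior of each subdomain and a halo exchange between neighbouring subdomains.

```python
# Illustrative sketch only: a 5-point stencil on the interior of a local
# subdomain plus a halo exchange with neighbouring ranks (1-D decomposition).
# Run with e.g.:  mpirun -n 4 python stencil_halo_sketch.py
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

nx, ny = 64, 64
field = np.zeros((nx + 2, ny))          # one halo row above and below the interior
field[1:-1, :] = float(rank)            # rank-dependent initial data

up, down = (rank - 1) % size, (rank + 1) % size   # periodic neighbours

for _ in range(10):
    # Halo exchange: send the first/last interior rows, receive into the halo rows.
    comm.Sendrecv(sendbuf=field[1, :],  dest=up,   recvbuf=field[-1, :], source=down)
    comm.Sendrecv(sendbuf=field[-2, :], dest=down, recvbuf=field[0, :],  source=up)

    # Stencil: a 5-point Laplacian applied to interior points (y boundaries held fixed).
    lap = (field[:-2, 1:-1] + field[2:, 1:-1] +
           field[1:-1, :-2] + field[1:-1, 2:] - 4.0 * field[1:-1, 1:-1])
    field[1:-1, 1:-1] += 0.1 * lap

print(f"rank {rank}: mean of local interior = {field[1:-1, :].mean():.3f}")
```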

| T. Schulthess !13 “Piz Kesch” (“Today’s Outlook: GPU-accelerated Weather Forecasting”, John Russell, September 15, 2015)

| T. Schulthess !14 Where the factor 40 improvement came from

Investment in software allowed mathematical improvements and change in architecture

Requirements from MeteoSwiss (data assimilation 6x, ensemble with multiple forecasts 24x, grid 2.2 km → 1.1 km 10x, at a constant budget for investments and operations) were met by:
• 1.7x from software refactoring (old vs. new implementation on x86)
• 2.8x from mathematical improvements (resource utilisation, precision)
• 2.3x from the change in architecture (CPU → GPU), with a bonus: reduction in power!
• 2.8x from Moore’s Law & architectural improvements on x86
• 1.3x from additional processors
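A quick check (my own arithmetic, not on the slide) confirms that these factors compound to roughly the required 40x:

```python
factors = {
    "software refactoring":      1.7,
    "mathematical improvements": 2.8,
    "CPU -> GPU architecture":   2.3,
    "Moore's law & x86 arch.":   2.8,
    "additional processors":     1.3,
}
total = 1.0
for factor in factors.values():
    total *= factor
print(f"combined speed-up ~ {total:.1f}x")   # ~39.8x, i.e. about the required 40x
```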

There is no silver bullet!

| T. Schulthess !15 Setting a new baseline for atmospheric simulations

The state-of-the-art implementation of COSMO running at most weather services on multi-core hardware.

~10x

The refactored version of COSMO running at MeteoSwiss on multi-core or GPU accelerated hardware.

| T. Schulthess !16 What is common to both systems (User Lab & MeteoSwiss)

We started by rewriting large applications

| T. Schulthess !17 “We develop algorithms, we don’t have time to deal with C/C++ or MPI”

–a well-known computer science colleague working in machine learning

| T. Schulthess !18 … echoed by many scientists working with data

Nishant Shukla (2017)

| T. Schulthess !19 Using Jupyter to get our heads around interactive supercomputing

• Jupyter allows users to
  • integrate development, execution of computations, and pre- and post-processing with visualisation into one “workflow”;
  • share these workflows in a team; and
  • document their work
• Need to provide an environment where the web-based front-end of the notebook is separated from the computation backend (see the sketch below)
  • run some of the computation on a supercomputer
• Interactive simply means sub-second response time
  • this requires properly organising data in the memory/storage subsystem
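One possible way to realise this separation (a hypothetical sketch, not the CSCS deployment; dask_jobqueue, the queue name and the resource numbers are my own choices) is to let the notebook request workers through the batch system and ship only results back to the front-end:

```python
from dask.distributed import Client
from dask_jobqueue import SLURMCluster
import dask.array as da

# The notebook front-end stays lightweight; the heavy lifting runs in SLURM jobs.
cluster = SLURMCluster(
    queue="normal",          # hypothetical partition name
    cores=12,                # cores per worker job
    memory="60GB",           # memory per worker job
    walltime="01:00:00",
)
cluster.scale(jobs=8)        # ask the batch system for 8 worker jobs
client = Client(cluster)     # the notebook now talks to remote workers

x = da.random.random((100_000, 100_000), chunks=(10_000, 10_000))
print(x.mean().compute())    # computed on the workers; only the scalar result
                             # travels back to the notebook front-end
```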

| T. Schulthess !20 Docker solves the software deployment problem

Docker Container Service at CSCS
• Operational since 2016 for HBP SGA1 and now SGA2
• Currently using the “Shifter” solution to address HPC security problems
• Sarus, an OCI-compliant container runtime for Docker images on HPC systems, is under development
• Ongoing collaboration with Cray and Docker to improve HPC container orchestration

| T. Schulthess !21 Data Flow Architectural Developments – Traditional Architecture Research Community CSCS User

CSCS External Login Access (ELA) → Piz Daint login & management → /store and Piz Daint compute

| T. Schulthess !22 Data Flow Architectural Developments – Improved Architecture Based on an External Portal (research community, CSCS user)

Research community → domain-specific portal (repository, workflow manager, access) → CSCS External Login Access (ELA) → Piz Daint login & management → /store and Piz Daint compute. Callout on the diagram: “Does Not Scale”.

| T. Schulthess !23 Architectural developments – Service Oriented Architecture (SOA) Research Community CSCS User

Domain Specific Portal: Repository, Workflow Manager, access

CSCS Infrastructure Services

Authentication & authorization, User Management, Data Management, Workflow Automation, Capacity Management

IT Infrastructure

Networking & security, OpenStack Services, Archival Storage, DWH, Active Storage, HPC Services
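To make the service-oriented idea concrete, here is a purely hypothetical sketch (none of these endpoints or fields are real CSCS APIs): a portal obtains a token from the authentication & authorization service and then talks to a job/workflow service over HTTP instead of logging in to the machine.

```python
import requests

AUTH_URL = "https://auth.example.org/token"       # hypothetical authentication service
JOBS_URL = "https://compute.example.org/v1/jobs"  # hypothetical job/workflow service

# 1) The portal authenticates once and receives a short-lived access token.
token = requests.post(AUTH_URL, data={
    "grant_type": "client_credentials",
    "client_id": "materials-portal",
    "client_secret": "REPLACE_ME",
}).json()["access_token"]

headers = {"Authorization": f"Bearer {token}"}

# 2) Work is submitted through the service instead of an SSH login.
job = requests.post(JOBS_URL, headers=headers,
                    json={"script": "run_simulation.sh", "nodes": 4}).json()

# 3) Status (and, in the same spirit, data movement via a data-management
#    service) is queried over the same kind of interface.
status = requests.get(f"{JOBS_URL}/{job['id']}", headers=headers).json()
print(status)
```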


| T. Schulthess !24 … and the service should be up most of the time (like 99+ %)

| T. Schulthess !25 Supporting Federation using SOA

Research communities and CSCS users access domain-specific portals (repository, workflow manager, access); these build on software services and platform services, which in turn run on infrastructure services delivered by multiple infrastructure providers, each operating its own infrastructure.

| T. Schulthess !26 Fenix: federated data and compute infrastructure

| T. Schulthess !27 Summary and conclusions

• Start by investing in application development
• Access to modern data centres and hardware is necessary
• System development based on rewritten applications
• Don’t build new stuff on top of legacy codes
• The COSMO example shows the performance gain can be a (non-trivial) 10x
• (Real) cloud technologies, i.e. a Service Oriented Architecture, add tremendous value to the scientific computing enterprise
• Federated services because of resilience, not political correctness

| T. Schulthess !28