Highlights of the 53Rd TOP500 List
Total Page:16
File Type:pdf, Size:1020Kb
ISC 2019, Frankfurt, Highlights of June 17, 2019 the 53rd Erich TOP500 List Strohmaier ISC 2019 TOP500 TOPICS • Petaflops are everywhere! • “New” TOP10 • Dennard scaling and the TOP500 • China: Top consumer and producer ? A closer look • Green500, HPCG • Future of TOP500 Power # Site Manufacturer Computer Country Cores Rmax ST [Pflops] [MW] Oak Ridge 41 LIST: THESummit TOP10 1 IBM IBM Power System, USA 2,414,592 148.6 10.1 National Laboratory P9 22C 3.07GHz, Mellanox EDR, NVIDIA GV100 Sierra Lawrence Livermore 2 IBM IBM Power System, USA 1,572,480 94.6 7.4 National Laboratory P9 22C 3.1GHz, Mellanox EDR, NVIDIA GV100 National Supercomputing Sunway TaihuLight 3 NRCPC China 10,649,600 93.0 15.4 Center in Wuxi NRCPC Sunway SW26010, 260C 1.45GHz Tianhe-2A National University of 4 NUDT ANUDT TH-IVB-FEP, China 4,981,760 61.4 18.5 Defense Technology Xeon 12C 2.2GHz, Matrix-2000 Texas Advanced Computing Frontera 5 Dell USA 448,448 23.5 Center / Univ. of Texas Dell C6420, Xeon Platinum 8280 28C 2.7GHz, Mellanox HDR Piz Daint Swiss National Supercomputing 6 Cray Cray XC50, Switzerland 387,872 21.2 2.38 Centre (CSCS) Xeon E5 12C 2.6GHz, Aries, NVIDIA Tesla P100 Los Alamos NL / Trinity 7 Cray Cray XC40, USA 979,072 20.2 7.58 Sandia NL Intel Xeon Phi 7250 68C 1.4GHz, Aries National Institute of Advanced AI Bridging Cloud Infrastructure (ABCI) 8 Industrial Science and Fujitsu PRIMERGY CX2550 M4, Japan 391,680 19.9 1.65 Technology Xeon Gold 20C 2.4GHz, IB-EDR, NVIDIA V100 SuperMUC-NG 9 Leibniz Rechenzentrum Lenovo ThinkSystem SD530, Germany 305,856 19.5 Xeon Platinum 8174 24C 3.1GHz, Intel Omni-Path Lawrence Livermore Lassen 10 IBM USA 288,288 18.2 National Laboratory IBM Power System, P9 22C 3.1GHz, Mellanox EDR, NVIDIA Tesla V100 100 150 200 250 300 350 50 0 1993 1994 1995 1996 1997 REPLACEMENT RATE 1998 1999 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016 2017 2018 2019 96 PERFORMANCE DEVELOPMENT 1,00E+10 1.56 EFlop/s 1 Eflop/s 1,00E+09 149 PFlop/s 1,00E+08100 Pflop/s 10 Pflop/s SUM 1,00E+07 1.01 PFlop/s 1,00E+061 Pflop/s 1,00E+05100 Tflop/s N=1 1,00E+0410 Tflop/s 1,00E+031 Tflop/s 1.17 TFlop/s N=500 1,00E+02100 Gflop/s 59.7 GFlop/s 1,00E+0110 Gflop/s June 2008 1,00E+001 Gflop/s Slowdown of Dennard Scaling 422 MFlop/s 1,00E-01100 Mflop/s 1994 1996 1998 2000 2002 2004 2006 2008 2010 2012 2014 2016 2018 Age [Months] 10 15 20 25 0 5 1995 1996 AVERAGE SYSTEM AGE 1998 1999 2001 leads olderto systems systeMs upgrade to incentives Less 2002 2004 2005 2007 2008 2010 2011 2013 2014 7.6 month 2016 2017 2019 RANK AT WHICH HALF OF TOTAL PERFORMANCE IS ACCUMULATED 100 Less freQuent purchases allow to increase systeM sizes and 80 lead to a More top-heavy TOP500 60 40 20 0 1994 1996 1998 2000 2002 2004 2006 2008 2010 2012 2014 2016 2018 ANNUAL PERFORMANCE INCREASE OF THE TOP500 Less freQuent purchases on steady 2,6 budget leads to increases in systeM 2,4 sizes, which temporarily keeps TOP500 2,2 overall perforMance increase up. 1000 x 2 TOP500: Averages 11 y 1,8 15 y 1,6 20 y 1,4 Moore’s Law 1,2 1 1994 1996 1998 2000 2002 2004 2006 2008 2010 2012 2014 2016 2018 PERFORMANCE DEVELOPMENT 1,00E+11100 Eflop/s June 2013 10 Eflop/s 1.56 EFlop/s 149 PFlop/s 1,00E+091 Eflop/s 100 Pflop/s SUM 10 Pflop/s 1,00E+07 1.02 PFlop/s 1 Pflop/s 100 Tflop/s 1,00E+05 N=1 10 Tflop/s 1,00E+031 Tflop/s 1.17 TFlop/s 100 Gflop/s N=500 59.7 GFlop/s 1,00E+0110 Gflop/s June 2008 1 Gflop/s 1,00E-01100 Mflop/s 422 MFlop/s 1994 1996 1998 2000 2002 2004 2006 2008 2010 2012 2014 2016 2018 PROJECTED PERFORMANCE DEVELOPMENT 1,00E+11100 Eflop/s 1,00E+1010 Eflop/s 1,00E+091 Eflop/s 1,00E+08100 Pflop/s 10 Pflop/s 1,00E+07 SUM 1,00E+061 Pflop/s 100 Tflop/s 1,00E+05 N=1 1,00E+0410 Tflop/s 1,00E+031 Tflop/s 1,00E+02100 Gflop/s N=500 1,00E+0110 Gflop/s 1,00E+001 Gflop/s 1,00E-01100 Mflop/s 1994 1996 1998 2000 2002 2004 2006 2008 2010 2012 2014 2016 2018 2020 2022 2024 COUNTRIES 500 China 400 Korea, South Italy 300 Canada 200 France United Kingdom 100 Germany 0 Japan United States 1993 1995 1997 1999 2001 2003 2005 2007 2009 2011 2013 2015 2017 2019 COUNTRIES / SYSTEM SHARE Canada; 2% Others; 10% United States Ireland; 3% China Netherlands; 2% Japan Germany; 3% United France United Kingdom; States; 23% 3% United Kingdom Germany France; 4% China; 44% Japan; 6% Netherlands Ireland Canada COUNTRIES / PERFORMANCE SHARE Others; 77; 5% Ireland; United States Italy; 2% 1% China Switzerland; 4% Japan Netherlands; 1% France Germany; 4% United States; 38% United Kingdom United Kingdom; Germany 3% Netherlands France; 4% China; 30% Ireland Japan; 8% Switzerland COUNTRIES (TOP50) / SYSTEM SHARE United States Others; 10% China Italy; 4% Japan France United States; United Kingdom Germany; 8% 38% Germany United Italy Kingdom; 6% Japan; 18% China; 6% Others France; 10% 100 200 300 400 500 0 1993 1995 1997 1999 2001 PRODUCERS 2003 2005 2007 2009 2011 2013 2015 2017 2019 USA Japan Europe China Russia VENDORS / SYSTEM SHARE Lenovo Others; 41; 8% Inspur IBM; 16; 3% Sugon Fujitsu; 14; 3% Lenovo; 175; HPE Dell EMC; 17; 3% 35% Cray Inc. Bull; 21; 4% Bull Dell EMC Cray Inc.; 42; 9% Inspur; 71; 14% Fujitsu HPE; 40; 8% IBM Sugon; 63; 13% # of systems, % of 500 VENDORS / PERFORMANCE SHARE NUDT; 66; 4% others; 90; 6% Lenovo NRCPC; 93; 6% Inspur Lenovo; Sugon 306; 20% HPE Inspur; 87; 6% IBM; 324; Cray Inc. 21% Sugon; 96; 6% Bull Dell EMC HPE; 120; 8% Fujitsu Fujitsu; 71; 4% Cray Inc.; 195; IBM Dell EMC; 58; 4% Bull; 54; 3% 12% Sum of Pflop/s, % of whole list VENDORS (TOP50) / SYSTEM SHARE Cray Inc. HPE Others IBM Bull; 3; 6% ; 8; 16% Cray Inc.; 14; Fujitsu Lenovo; 3; 6% 28% Lenovo Bull Fujitsu; 5; 10% HPE; 10; Others IBM; 7; 20% 14% LENOVO* / SYSTEM SHARE 2018+2019 Others; 14; 9% Australia; 3; 2% China Canada; 4; 2% United States Singapore; 5; 3% Ireland Netherlands United Kingdom; China; 69; 6; 4% 42% United Kingdom Netherlands; 12; Singapore 7% United Canada Ireland; 13; 8% States; 38; Australia 23% Others * Inspur, Sugon, Huawei have not sold outside of China yet. ACCELERATORS 170 Matrix-2000 160 150 PEZY-SC 140 130 Kepler/Phi 120 Xeon Phi Main 110 100 Intel Xeon Phi 90 Clearspeed 80 IBM Cell SysteMs 70 60 50 ATI Radeon 40 Nvidia Volta 30 20 Nvidia Pascal 10 Nvidia Kepler 0 Nvidia Fermi 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016 2017 2018 2019 ACCELERATORS (TOP50) / SYSTEM SHARE Nvidia Kepler; 1 Nvidia Pascal; 3 Nvidia Kepler Nvidia Pascal Nvidia Nvidia Volta Volta; 9 Matrix-2000; 1 Matrix-2000 Deep Computing P. None; 23 Deep Computing Xeon Phi P.; 1 Xeon Phi Main Main; 8 ShenWei Power BQC ShenWei; 1 None Power BQC; 3 MOST ENERGY EFFICIENT ARCHITECTURES Rmax/ Computer Power Shoubou system B, ZettaScaler-2.2 Xeon 16C 1.3GHz Infiniband EDR PEZY-SC2 17.6 DGX Saturn V, NVIDIA DGX-1 Volta36 Xeon 20C 2.2GHz Infiniband EDR Tesla V100 *15.1 Summit, IBM Power System Power9 22C 3.07GHz Infiniband EDR Volta GV100 14.7 AI Bridging Cloud Infrastructure (ABCI), Xeon Gold 20C Infiniband EDR Tesla V100 SXM2 *14.4 Fujitsu PRIMERGY, NVIDIA Tesla V100 2.4GHz MareNostrum P9 CTE, IBM Power System Power9 22C 3.1GHz Infiniband EDR Tesla V100 14.1 Tsubame 3.0, SGI ICE XA Xeon 14C 2.4GHz Intel Omni-Path Tesla P100 SXM2 *13.7 PANGEA III, IBM Power System Power9 18C 3.45GHz Infiniband EDR Volta GV100 13.1 Sierra, IBM Power System Power9 22C 3.1GHz Infiniband EDR Volta GV100 12.7 Hygon Dhyana (Epyc ACS (PreE), Sugon TC8600 200Gb 6D Torus Deep Computing 11.4 7501) 32C 2GHz Xeon Gold 6154 18C Tawania 2, QCT QuantaGrid D52G-4U/LC InfiniBand EDR Tesla V100 SXM2 11.3 3GHz * Efficiency based on Power optimized HPL runs of equal size to TOP500 run. [Gflops/Watt] POWER EFFICIENCY 10 TOP10 /W] 8 Gflops 6 TOP50 4 /Power [ /Power 2 TOP500 Linpack 0 2008 2009 2010 2011 2012 2013 2014 2015 2016 2017 2018 2019 ENERGY EFFICIENCY 20 ZettaScaler-2.2 18 Max-Efficiency /W] 16 TsubaMe 3.0 14 DGX SaturnV Gflops 12 ZettaScaler-1.6 c 10 TsubaMe KFC 8 AMD FirePro NVIDIA K20x – K80 /Power [ /Power 6 Mic 4 BlueGene/Q Cell 2 Linpack TOP500 Average 0 2008 2009 2010 2011 2012 2013 2014 2015 2016 2017 2018 2019 HPCG HPCG/ HPCG/ # T Site Manufacturer Computer Country Rmax ST [Pflop/s] [Pflop/s] Peak HPL 41 LIST: SummitTHE TOP10 1 1 Oak Ridge National Laboratory IBM IBM Power System, USA 2.9258 148.6 1.6% 2.0% P9 22C 3.07 GHz, Volta GV100, EDR Sierra Lawrence Livermore 2 2 IBM IBM Power System, USA 1.7957 94.6 1.5% 1.9% National Laboratory P9 22C 3.1 GHz, Volta GV100, EDR RIKEN Advanced Institute for K Computer 3 18 Fujitsu Japan 0.6027 10.5 5.3% 5.7% Computational Science SPARC64 VIIIfx 2.0GHz, Tofu Interconnect Trinity Los Alamos NL / 4 6 Cray Cray XC40, USA 0.5461 20.2 1.2% 2.7% Sandia NL Intel Xeon Phi 7250 68C 1.4GHz, Aries National Institute of Advanced AI Bridging Cloud Infrastructure (ABCI) 5 7 Industrial Science and Fujitsu PRIMERGY CX2550 M4, Japan 0.5089 19.9 1.6% 2.6% Technology Xeon Gold 20C 2.4GHz, IB-EDR, NVIDIA V100 Piz Daint Swiss National Supercomputing 6 5 Cray Cray XC50, Switzerland 0.4970 21.2 1.8% 2.3% Centre (CSCS) Xeon E5 12C 2.6GHz, Aries, NVIDIA Tesla P100 National Supercomputing Sunway TaihuLight 7 3 NRCPC China 0.4808 93.0 0.4% 0.5% Center in Wuxi NRCPC Sunway SW26010, 260C 1.45GHz Nurion Korea Institute of Science and 8 13 Cray Cray CS500, South Korea 0.3915 13.9 1.5% 2.8% Technology Information Intel Xeons Phi 7250 68C 1.4 GHz, OmniPath JCAHPC Oakforest-PACS 9 14 Fujitsu Japan 0.3855 13.6 1.5% 2.8% Joint Center for Advanced HPC PRIMERGY CX1640 M1, Intel Xeons Phi 7250 68C 1.4 GHz, OmniPath Cori Lawrence Berkeley 10 12 Cray Cray XC40, USA 0.3554 14.0 1.3% 2.5% National Laboratory Intel Xeons Phi 7250 68C 1.4 GHz, Aries ISC 2019 TOP500 HIGHLIGHTS • Petaflops are everywhere! • Slow-down of Dennard’s scaling lead to longer procurement cycles which provided a 5 year delay from slowing down at the very TOP (2008-2013) – Don’t expect this for Moore’s Law! • China: Top consumer and producer overall but not in the TOP50 (yet) • Future increase in architectural diversity will necessitate a flexible approach to benchmarking.