Posted on Authorea 7 Dec 2020 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.160738123.33788182/v1 — This a preprint and has not been peer reviewed. Data may be preliminary. n t urn mato P swl ssm te uoentrends. European other some as The well as NVIDIA. HPC were on from systems impact these accelerators current table, with following its coupled the and supercomputing from Intel, put seen China and and be This Japan IBM can US, As from the decade, 2002. chips past list. in US-designed AMD-based the Top-500 For back the on Simulator of Europe. primarily in top Earth also built the the but at US, of dominated the introduction Japan in have until the especially years, spotlight, with the industry—for the jolt in back and a the advance, US industry—including computing the to gave continue the technologies dominated ever. US systems, than The computer diverse and largest interesting world’s more Strassen the partial are with operation-saving of trends factorization turnover the LU to less using conforming Despite without calculations www..org). the equations doing 2 (see but linear precision, benchmark of using lower Linpack pivoting of systems mix High-Performance dense a the or measurement large algorithm a using Both solve is list computers must Top-500 COVID-19. The largest presentationSystems to the SC20. world’s Companies during at related discussed the session tools. Birds-of-a-Feather was uncertainties Top-500 related of as the their of 2020, at of during statistics lot demos systems the top500.org the high-end exploring a of seeing of from procurement and with users their architectures preventing slowed novel year also of virtual, have examples went their unsettling SC20 with and and floors ISC’20 show conferences strange supercomputing a premier both the been has It 2020 7, December 1 Elster . Atos Anne to ARM From Factor: European The ffiito o available not Affiliation uecmue,aentbeecpin.I hsatce ewl ics h itr fARM of history the discuss will we article, this In exceptions. notable are supercomputer, / 1 3 n ryHEsse n the and system Cray/HPE 3 + O ( n 2 obepeiin(4bt otn on numbers. point floating (64-bit) double-precision ) Sunway 1 ytmi hn,a ela h e ARM-based new the as well as China, in system Posted on Authorea 7 Dec 2020 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.160738123.33788182/v1 — This a preprint and has not been peer reviewed. Data may be preliminary. Ps(oeie R’ ai rmdmrdobsbn opnns(o oiepoe) nfc,due fact, In as phones). such man- mobile components to (for other ARM components with from baseband designs designs modem/radio (SoC) the or System-on-chip license ARM Mali) own and often ARM’s microcontrollers. their Ethos, Samsung (sometimes and into Neoverse, and GPUs GPUs Cortex-R, them Apple integrate Cortex-M, CPUs, as GPU and Cortex-A, including such Mali ufacture include products, companies their Units) design Mobile behind Processing IP is SecureCore. (Neural of and NPUs range Norway and wide ARM CPUs a as companies, offers smaller known now of now ARM dozens is acquired GPUs that years and several startup CPUs last low-power Norwegian the architecture on processor. a over its focused Falanx, has licenses particularly and as but has devices, such chips, It embedded computer vendors. buying and any other mobile is produce to for it not ecosystem does product that and ARM China. announced and Freescale, designs - Sprint and NVIDIA ARM of AMD, subsidiary 2020, Intel, merger its September Unlike recent regarding the In issues via licensing stock some into T-Mobile 2020. running significant April has also in also ( which Group, completed and ARM Softbank Sprint was company US investing that in and Semiconductors). T-Mobile NXP conglomerate stake spin-off Japanese main Philips the the a of by part has 2016 now in and billion bought 1 was USD Computers ARM for Apple Phillips Computers, by Acorn bought ( later company Advanced company UK as former started Inc.) included that that Apple company venture (now Property) joint (Intellectual a IP Ltd., semiconductor Machines UK-based RISC old to 30-year a systems is ARM Embedded and RISC —from ARM Computer umtJn 2018- June Tianhe- Tianhe- eui 020-1U B leeeQ B oe Q 6 1.6GHz, @ 16C BQC Power IBM BlueGene/Q, IBM US 2012-06-01 Sequoia Sunway Fugaku Taihu- aeDtsn.1WeeAcietr Cores Architecture Where 1 no. Dates Name ia 021-1U ryHECa XK7, Cray Cray/HPE US 2012-11-01 Titan Light SC 2A 1A K- VDAt cur r o S 0Billion 40 USD for Arm Acquire to NVIDIA ue21 - 2011 June ue2020- June ue2013- June 001-1CiaND HMP ne enX60@29 GHz, 2.93 @ X5670 Intel MPP, TH NUDT China 2010-11-01 u 06- 2016 Jun o.2020 Nov. o.2015 Nov. o 2019 Nov o 2017 Nov 2011 Nov pl oJi cr,VS nCi-aigVenture Chip-Making in VLSI Acorn, Join to Apple aa Fujitsu/RIKEN Japan hn ne en HEpes2 arx20 981 4 649 10 NRCPC 2000 260C, Matrix SW26010 Sunway Express-2, MPP, TH Sunway Xeon, Intel China China 024 705 interconn. Tofu with 2.0GHz @ VIIIfx SPARC64 Fujitsu Japan SIMA92Pwr vdaG10GU,Mlao 424 2 Mellanox GPUs, GV100 Nvidia + Power9 AC922 IBM US al :Tp0 o ytm 2010-20 systems 1 No. Top500 1: Table .Gzw ryGmn necn. vdaK0 GPUs K20x Nvidia interconn., Gemini Cray w/ 2.2GHz nnbn D T10,Nii 00GPUs 2050 Nvidia FT-1000, FDR Infiniband 2 R-ae 6F 48C A64FX ARM-based u tmytk iet opeesnei is it since complete to time take may it but n.d.), , /Tf interconnects Tofu w/ utminterconnect Custom M Opteron AMD n LITcnlg aUS (a Technology VLSI and n.d.), , 241C@ 16C 6274 . GHz 2.2 @ 8 368 186 6 640 560 (Nov) (Jun) 630 7 299 7 572 1 848 072 864 492 600 760 Posted on Authorea 7 Dec 2020 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.160738123.33788182/v1 — This a preprint and has not been peer reviewed. Data may be preliminary. on fve,i htmdr ne hp aeoe52btAX etrui,weesADcishv 2-3 have operation chips vector AMD a whereas unit, from puter vector AMD, AVX2 and 512-bit Intel one Fugaku have from chips chips Intel x86 modern between units. that differences 128-bit FinFET is biggest view, ( 7nm extensions the of TSMC vector of point 2 a x one is 512 that chip SVE has RIKEN Note A64 also The It by Global developed pins. Fujitsu jointly signal are interconnects. 594 : chips fusion) And Fugaku 48C (torus transistors. A64FX Tofu billion ARM-based 8.786 Fujitsu’s its with uses Fuji, and Mt Fujitsu after and Peak” “High Japanese Named new list. the 2020, Summer In ARM ones, and Intel Fugaku the than efficient more particularly it either. see this not on did focusing they solver been linear consumption, not on have power 1.87 but to as Carlo far Monte have on as They speedup However, alleviate level. 1.6X from this to achieving at Haswell-based seems they far, awareness Intel architecture, so so new interesting the other, the system a the each versus the on requires to with of but feature results next locality, one new core good appear where a had with not Socket-direct, UCX, associated did issues cores to fashion. performance ARM non-uniform OpenMPI many the very cores, a Intel from in unlike ranging behaved that, packages was familiar faced the they challenges of many ported have ( Stanford at 00a ubr5bt nteTp50adGen0 lists. Green500 and Top-500 the on 4 both or 5 2 number unit, at compute 2020 48-core the to Sierra, addition in system). has (operating The chip OS A64 the Its run that cores). units 7,630,848 assist to cores 7,299,072 (from Bermuda- of a administrator group, systems Technology main Marvell the by Aguilar, 2018 Michael in bought being itself conglomerate. registered to processors networking designing diitain agadpormta ok tpooyessesfravne architectures. The advanced the for targeted systems of more prototype 1.529 One benchmark at with a looks 204 benchmark, that number Gradient program Conjugate at Vanguard HP Administration) ranked the system use. on Enterprise)-built its 36 Packard to number (Hewlett made HPE and The was petaFLOPS, list Top-500 SC18. the the at on nounced appear to supercomputer ARM-based first The already Astra are that ARM-based designs ARM future impacts purchase it NVIDIA’s the how If has see the so designs. to of increased CPU interesting very has 75% server-class workloads. have devices be offered AI will these to also by it of influenced has claimed through, power ARM was goes processing 2018, ARM ARM for since tablets, of demand CPUs; as the their such As for devices 2005. demand handheld in and market CPU mobile world’s of proliferation the to Astra Fugaku u a,ulk h etsses( systems next 3 the unlike had, but n.d.), , n Sunway. and ( 2020 November in spot number-1 its maintained Astra ytmhs514AMbsdCvu hneX 8cr rcsos aimIc rwfrom grew Inc. Cavium processors. 28-core ThunderX2 Cavium ARM-based 5,184 has system Astra ytmi lovr nryecet akn ubr1 nteGen0 it unlike list, Green500 the on 10 number ranking efficient, energy very also is system SR:ALreSaeAM4HCDpomn insideHPC - Deployment HPC ARM64 Scale Large A ASTRA: a h rtsse hc htwspr fteU O NA(ainlNcerSecurity Nuclear (National NNSA DOE US the of part was that which system first the was oe saot10tmsfse hnteAMbsdCUcisfudi elphones. in found chips CPU ARM-based the than faster times 100 about is nodes Selene n.d.). , h vdaDXA0 ytma VDAwt 5,2 oe,rne nNov. in ranked cores, 555,520 with NVIDIA at system A100 DGX Nvidia the , Trinity Fugaku S ltomte sda baseline. a as used they platform ASC uecmue tRKNgie h o pto h Top-500 the on spot top the gained RIKEN at Supercomputer umt Sierra, Summit, 3 Astra aa’ uaukespsto sfsetsupercom- fastest as position keeps Fugaku Japan’s aeanc nrdcint h ytmi 2019 in system the to introduction nice a gave , and Sunway Astra edsrbshwthey how describes He n.d.). , ,addoe 0,0 cores 300,000 over added ), tSni ainlLb,an- Labs, National Sandia at Supercomputer Summit, Posted on Authorea 7 Dec 2020 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.160738123.33788182/v1 — This a preprint and has not been peer reviewed. Data may be preliminary. h ipc ecmrsue o h o-0 aknsta nyalwfrfl 4btpeiin—eare from precisions—we benchmarks—differing 64-bit HPL-AI full for the allow at only look that and rankings era. ( mixed-precision Top-500 exaFLOPS exascale the or first the the for low- in become used already for likely benchmarks allow will Linpack we and the if schedule, on course, still Of supposedly US. is the Laboratory in National system exascale Ridge Oak at system Issues ( n.d.) uig hscuiga xetd1ya rmr ea fisGUPneVchocis hsmyaandelay again manufac- may This 7nm chips. its Vecchio Ponte with GPU issues its having of is chips, delay it more Apple that or the 1-year 2020 for the expected July demand an in causing the statement thus meet earnings turing, its to Samsung. through struggling use announced is also Intel it again reports to have TSMC may However, Apple Apple. so for chip M1 5nm instance, For nodes. Fugaku vendor-dependent. Supercomputing? 7nm affecting more foundries’ this become with the do is in compatible and to how has be size So but yet to Intel, node” length, assumed has gate technologies: “technology are to to node SMIC nodes refer tied 10nm to 14nm China’s Intel’s used been than Samsung. denomination years (nanometer) less and several nm using The last Company), are processes. Manufacturing 14nm world than Semiconductor the smaller (Taiwan in manufacturers chip TSMC three just TSMC Currently, versus —Intel Race nm The J Top-500. the the to on GPUs Atos ranking A100 - 2020 NVIDIA GPU Nov. new 7 A100 by the number NVIDIA continued adding its with was was to Equipped it work Supercomputer reorganizations, that his First several announced after Launches but Atos it, 2020 1924, where Technologies. France May in Atos in In of Paris died roots to subsidiary Bull Bull their relocated a Rosing have was machines. Bull, Fredrik it IBM as pioneer, punched-card (1931) and Norwegian emerged later build the Bull and Norway, after to both in named started how others fact describes 1918 in is 2008) in Bull (Heide, who machines. presence 1991) punch-card US (Heide, to its back Heide strengthened dating (2018). note, also Syntel has historical and It a and (2014) Europe. Bull, As ITO in (2011), company. Xerox vendor IT IT of HPC Siemens Dutch well-known acquisition including a a the companies, and latter several through companies French the of two (2014), merger/acquisition of the ITO merger Xerox with the grown from since J emerged has Forschungzentrum that It at company number-7 French installed a the chips is Atos EPYC is AMD list the Top-500 on 2020 based Germany. XH200 Nov. Sequena the Bull on Atos entry interesting Another EPYC AMD The and Atos 2019.) Sept. A64fx. in the HPE feature by computer will acquired that Cray Japan formally 80/ outside was Apollo announced (Cray system HPE first . Brook’s the 2020 Stony is in system Nov. featured in announced also Japanese), is A64fx Fujitsu The Aurora n ftetonweacl ytm lne nteU n22.TeAMD-powered The 2021. in US the in planned systems exascale new two the of one n.d.), , M-oee rnirMyBcm is SEacl uecmue u oItls7mProcess 7nm Intel’s to Due Supercomputer Exascale US First Become May Frontier AMD-Powered n M’ eetEY n hpaebt auatrda SC hc lopoue h new the produces also which TSMC, at manufactured both are chip 7nm EPYC recent AMD’s and ( Labs National Argonne at planned system onBy,IE pcrmTc ak ue22:ulPg Reload Page 2020:Full June Talk, Tech Spectrum IEEE Boyd, John Fugaku lorne ubr1o hsbnhaki o.22 ekn t2.0 at peaking 2020 Nov. in benchmark this on 1 number ranked also 4 ne eas7m osdr sn ia foundries rival using considers 7nm, delays Intel hc lal contributed clearly which n.d.), , UESBotrModule Booster JUWELS n.d.). , Ookami ( system ulich ¨ lc FJ in (FZJ) ulich ¨ “of in (“wolf” Frontier Ookami Atos an , , Posted on Authorea 7 Dec 2020 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.160738123.33788182/v1 — This a preprint and has not been peer reviewed. Data may be preliminary. for-40-Billion-Creating-World-s-Premier-Computing-Company-for-the-Age-of-AI.html http://www.globenewswire.com/news-release/2020/09/13/2092654/0/en/NVIDIA-to-Acquire-Arm- Arm-for-40-Billion-Creating-World-s-Premier-Computing-Company-for-the-Age-of-AI.html. https://www. https://www.globenewswire.com/news-release/2020/09/13/2092654/0/en/NVIDIA-to-Acquire- latimes.com/archives/la-xpm-1990-11-28-fi-4993-story.html https://www.latimes.com/archives/la-xpm-1990-11-28-fi-4993-story.html. feature. Xilinx department References future 2021. a HPC of in and end discussed computing be the graphics will by Accelerators AI/ML, through emerging targets go other U250, and to in accelerator FPGAs FPGA headquartered expected FPGA-based Alveo offer is to the is deal started card, Xilinx also ( The Although accelerator has computations workloads. for and Inc.). center Asia, cloud Xilinx program data and to and Europe new company hard in AI FPGA presence fairly targeting announced strong the are, (AMD a cards buying companies still has be FPGA it and bought will Jose, have been, San it makers have offer 2020 CPU but hardware largest October in word’s codes, in the functions application of and two user algorithms However, for scientists. encode you speeds let potential that great (FPGAs) Arrays Gate Modulo Field-Programmable HPC? ParaStation ParTec’s for with maturing XH2000 —finally BullSequana FPGA Atos’ 240-petaFLOPS a 1 number be current software. to the than announced faster now be to is projected is and GPUs, Instinct LUMI eacl supercomputers Petascale that roadmap but a systems, with Atos Undertaking) current Joint the HPC including class: (European list, JU Pre-Exascale Top-500 EuroHPC the systems: of new following ten their the top for includes hard the supercomputers pushing in now systems European is some of had have wave Europe next the and JU EuroHPC • • • • • • • • ih dacdCmuigCnr MC) Portugal, (MACC), Centre Computing Advanced Minho Republic Bulgaria Czech Sofiatech, Centre, Supercomputing National IT4Innovations Slovenia IZUM, Luxembourg LuxProvide, oP xc i.TlsPoueet P,adErp’ ffrst oto t P Destiny insideHPC HPC - its Control to Efforts Leonardo Europe’s and 5 EPI, Procurement, Marenostrum Talks Dir. Exec. roHPC LUMI ilb nHECa Xsproptrwt M 4cr PCCU n etr h e AMD new the feature and CPUs EPYC 64-core AMD with supercomputer EX Cray HPE an be will aeotu 5 Marenostrum ( Finland Kajaani, in center CSC’s at Infrastructure) Modern Unified (Large le 20Dt etrAclrtrCard Accelerator Center Data U250 Alveo ( Italy CINECA, at n.d.) , tBreoaSproptn ete Spain Centre, Supercomputing Barcelona at sdsg a o enanucda fti ril’ writing. article’s this of as announced been not has design ’s : IEAPrnr ihAo o 4PLP Load’Supercomputer ‘Leonardo’ 240PFLOPS for Atos with Partners CINECA 5 n.d.). , Fugaku uecmue.And supercomputer. LUMI ( n.d.) , Leonardo n.d.) , Eu- Posted on Authorea 7 Dec 2020 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.160738123.33788182/v1 — This a preprint and has not been peer reviewed. Data may be preliminary. https://www.xilinx.com/ products/boards-and-kits/alveo/u250.html https://www.xilinx.com/products/boards-and-kits/alveo/u250.html. https://insidehpc.com/2020/11/cineca-partners-with-atos-for-240pflops- leonardo-supercomputer/ supercomputer/. https://insidehpc.com/2020/11/cineca-partners-with-atos-for-240pflops-leonardo- efforts-to-control-its-hpc-destiny/value https://www.hpcwire.com/2020/11/19/eurohpc-exec-dir-talks-procurement-epi-and-europes- here https://www.lumi-supercomputer.eu/lumi-one-of-the-worlds-mightiest-supercomputers/value https: in-the-world-to-simultaneously-top-all-high-performance-benchmarks //spectrum.ieee.org/tech-talk/computing/hardware/japans-fugaku-supercomputer-is-first- is-first-in-the-world-to-simultaneously-top-all-high-performance-benchmarks. https://phonemantra.com/amd-powered-frontier-may- https://spectrum.ieee.org/tech-talk/computing/hardware/japans-fugaku-supercomputer- become-first-us-exascale-supercomputer-due-to-intels-7nm-process-issues/ due-to-intels-7nm-process-issues/. https://phonemantra.com/amd-powered-frontier-may-become-first-us-exascale-supercomputer- rival-foundries/ https://www.datacenterdynamics.com/en/news/intel-delays-7nm-considers-using- foundries/. https://atos.net/en/2020/press- https://www.datacenterdynamics.com/en/news/intel-delays-7nm-considers-using-rival- with-nvidia-a100-gpu release/general-press-releases_2020_05_14/atos-launches-first-supercomputer-equipped- first-supercomputer-equipped-with-nvidia-a100-gpu. (2008). https://atos.net/en/2020/press-release/general-press-releases_2020_05_14/atos-launches- 1889–1918. diffusion technology Europe information to of dynamics States //doi.org/10.1080/07341510802044686 the United revisiting the offices: from European professional for cards Punched https://www.japantimes.co.jp/news/2020/11/17/national/japan-fugaku-fastest- Knutsen A. K. and Bull R. mahc.1991.10024 F. by Machines Punched-card (1991). of Development 1918-1930. The Production: to Invention From supercomputer/ . https: https://www.japantimes.co.jp/news/2020/11/17/national/japan-fugaku-fastest-supercomputer/ https://www.fujitsu.com/global/ about/innovation/fugaku/ https://www.fujitsu.com/global/about/innovation/fugaku/. //insidehpc.com/2019/02/astra-a-large-scale-arm64-hpc-deployment/ https://insidehpc.com/2019/02/astra-a-large-scale-arm64-hpc-deployment/. EEAnl fteHsoyo Computing of History the of Annals IEEE here 6 itr n Technology and History , 13 3,261–272. (3), https://doi.org/10.1109/ , 24 4,307–320. (4), https: