Preparando o SPRACE para a era do HL-LHC T HIAGO TOMEI, R OGERIO I OPE, J ADIR M ARRA, MARCIO C OSTA

SPRACE High-Luminosity LHC Schedule

2 SPRACE Computing Power Evolution

105

104

103

102

Pledged Computing Power (HS06) 2000 20042005 20092010 20142015 2020 20252024 Year 3 CMS HL-LHC Resource Projections

4 CHF/HS06 Future Projections

Price / performance evolution of installed CPU servers (CERN) v8 Jan 2021

Past evolution

CHF / HS06 Conservative projection 102 Pessimistic projection

2GB ® 3GB/core memory HDD ® SSD 1.33 INTEL-AMD price war, low RAM prices 1.21 1.07 1.08 1.05 1.14 10 1.69 AMD market push 0.77 1.62

Improvement factor/year 1.15 120% RAM price increase

1 1.20

2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016 2017 2018 2019 2020 2021 2022 2023 2024 2025 2026 2027 2028 2029 Last 5 year average improvement factor = 1.25 5 Machine Learning in HEP: Jet Generation

Generative Adversarial Networks (GANs) q Generator + discriminator in tandem

Hadronic jets q As calorimeter images § Convolutional NNs § But: sparse and unordered q As point-cloud-style datasets § Graph NNs § But: physics features and variable particle number

Message Passing GAN (MPGAN) q Outperforms point cloud GANs q https://arxiv.org/abs/2106.11535 6 Machine Learning in HEP: Jet Generation

Generative Adversarial Networks (GANs) 0.4 1 q Generator + discriminator in tandem 0.3

Hadronic jets 0.2 q As calorimeter images § Convolutional NNs 0.1 -1 § But: sparse and unordered 10 rel 0 q As point-cloud-style datasets f § Graph NNs -0.1 § But: physics features and variable particle number -0.2 -0.3 10-2 Message Passing GAN (MPGAN) -0.4 q Outperforms point cloud GANs -0.4-0.3-0.2-0.1 0 0.1 0.2 0.3 0.4 h q https://arxiv.org/abs/2106.11535 rel 7 Machine Learning in HEP: Jet Generation

Generative Adversarial Networks (GANs) q Generator + discriminator in tandem

-1 Hadronic jets 10 q As calorimeter images § Convolutional NNs

§ But: sparse and unordered T rel p q As point-cloud-style datasets § Graph NNs § But: physics features and

variable particle number 0.05

0 f 0.05 Message Passing GAN (MPGAN) rel -0.05 0 -0.05 q Outperforms point cloud GANs -0.1 hrel q https://arxiv.org/abs/2106.11535 8 Machine Learning in HEP: Jet Generation

Generative Adversarial Networks (GANs) Generator fn

q Generator + discriminator in tandem Number of particles fe randomly samples from real distribution N = masks mask ∑ … G en erated 1 1 Particle Cloud

1 Final{ Features

Initial Noise 0 Hadronic jets Intermediate{ Features η φ p / T 1 In itial Noise q As calorimeter images … Masks assigned to first N h1 points, sorted in point … FC Layer feature space § Convolutional NNs h2 § But: sparse and unordered Message Passing

q As point-cloud-style datasets fn fe

§ Graph NNs Discriminator Message Passing

Initial{ Features § But: physics features and Initial{ Features R eal Particle R eal or η Cloud φ p T 1 … G en erated

variable particle number … Ave FC Layer Pool G en erated Particle Cloud Message Passing GAN (MPGAN) q Outperforms point cloud GANs q https://arxiv.org/abs/2106.11535 9 Machine Learning in HEP: Tracking

One of the most challenging HL- LHC computing tasks

q Exponential complexity with nhits q Track-ML challenge in 2018

Joining efforts with Exa.TrkX q Geometric Deep Learning § Metric learning § (Attention) Graph NNs q https://arxiv.org/abs/2103.06995 10 Machine Learning in HEP: Tracking

One of the most challenging HL- LHC computing tasks

q Exponential complexity with nhits q Track-ML challenge in 2018

Joining efforts with Exa.TrkX q Geometric Deep Learning § Metric learning § (Attention) Graph NNs q https://arxiv.org/abs/2103.06995 11 Heterogeneous Computing

Prototype: Phase-2 CMS HLT q https://cds.cern.ch/record/2766559 q 37.3 MHS06 farm (PU = 200) q Process events at 750 kHz q High efficiency, low output rate General-purpose GPUs q Cheaper CHF/HS06 q Structure of Arrays q Vectorized loops CMSSW with GPUs q Run-3: pixel tracks, calo local reco q Phase-2: HGCAL reconstruction 12 Heterogeneous Computing

Prototype: Phase-2 CMS HLT q https://cds.cern.ch/record/2766559 q 37.3 MHS06 farm (PU = 200) q Process events at 750 kHz q High efficiency, low output rate General-purpose GPUs q Cheaper CHF/HS06 q Structure of Arrays q Vectorized loops CMSSW with GPUs q Run-3: pixel tracks, calo local reco q Phase-2: HGCAL reconstruction 13 Heterogeneous Computing

Prototype: Phase-2 CMS HLT q https://cds.cern.ch/record/2766559 q 37.3 MHS06 farm (PU = 200) q Process events at 750 kHz q High efficiency, low output rate General-purpose GPUs q Cheaper CHF/HS06 q Structure of Arrays q Vectorized loops CMSSW with GPUs q Run-3: pixel tracks, calo local reco q Phase-2: HGCAL reconstruction 14 Networking – LHCONE

Private network connecting LHCONE services Tier1s and Tier2s: q L3VPN (VRF): routed Virtual q Serving any LHC sites according to their Private Network – operational needs and allowing them to grow q P2P: dedicated, bandwidth guaranteed, point-to-point q Model: sharing the cost and links – in development use of expensive resources q Monitoring: monitoring q A collaborative effort among Research infrastructure – operational & Education Network Providers

15 HEPCAS, LHCONE L3VPN: A global infrastructure for High Energy Physics data analysis (LHC, Belle II, Pierre Auger Observatory, NOvA, XENON, JUNO) Cyfronet IHEP Beijing PIONIER KREONet2 ScotGrid T2 To: ESnet, , CANET CSTNET CMS ATLAS >NORDUnet, RU PSNC To: CERN CNGI-CERNET CANARIE/CANET JANET SouthGrid via MAN LAN Poland SINET Durham via ANA Nikhef AGH Kraków, China to via UK T2 POZMAN China Amsterdam UToronto NetherlLight UVic - - ECDF RAL-PPD NL-T1 U.Warsaw China Next MOXY 400 VRF To: SINET, SimFraU NorthGrid SURFsara ICM-PUB Generation ESnet, Internet2 TRIUMF-T1 Montreal (Montreal) Glasgow T2 Bristol NORDUnet Internet, Netherlands SINET BCNet Lancs KIAE-TRANSIT Internet To: JGN, ESnet, SINET Peers: StarLight London Oxford Nordic Peer: Exchange UTNET Japan TEIN, GÉANT Internet2, RU-VRF, UBC T2 Liv NDGF-T1, NORDUnet (RU-VRF) Tokyo NORDUnet, To RU-VRF, Sussex ESnet StarLight To: GÉANT Helsinki BINP IHEP To GÉANT To >NORDUnet VRF NDGF-T1b To ICEPP Hirosh., Waterloo GÉANT, IC-HEP , Russia Protvino >NORDUnet Man RAL-LCG NetherLight To TEIN U Tokyo Tsukuba, MAN LAN, EEnet NDGF-T1c Brunel Peer: Moscow JINR-T1/T2 TransPAC Tokyo (US NSF) Shef Estonia CANARIE, SarFTI SINET QMUL SINET

to LA Tallinn HEPC Peer: , INR RRC-KI T1/T2 StCANARIEarLight HEPNET-J Tokyo GÉANT, CERN, RU SURFsara , RHUL GÉANT, (Troitsk) ASGC to

GÉANT,CERNET PIONIER, TO: ESnet, via Japan SINET GÉANT+CERN NETHERLIGHT NetherLight ASGC, GridPNPI peer: CERN, GARR KEK T1 to Singapore GÉANT+CERN KREONet2, To ESnet Korea CERNLight

To Amsterdam DFN ANA

ANA -400

SINET to Amsterdam to SINET ANA , then to - VRF via - NORDUnet KREONet2 To ASGC, 400 GÉANT, NEAAR Germany

- NORDUnet Taiwan 400 , , LA Peer: / ESnet, Korea – CANARIE RWTH DE-KIT-T1 TEIN, via URAN SANET KISTI –T1 PNU JGN Global Research Platform Network (GRPnet) CERN JGN TO CSTNET NEAAR/IU/US NSF NSCC PacWave VRF Slovakia CERN - To StarLight, Chicago SingAREN Via TEIN and ANA DESY GSI Ukraine

CNGI-CERNET ASGC, Taiwan to KNU, KCMS FMPh-UNIBA xx ASGC -

via Starlight 400 To RU

Kharkov-KIPT SINET to Tokyo IEPSAS-Kosice

PacWave / CANARIE To SINET, KREONet2 SINET to ESnet, Internet2, ESnet CANARIE via PNWG JGN via Starlight (Seattle) SINET SINET Seattle Peer: ESnet (same VRF as VRF/KIAE Japan KREONet2 ESnet, KREONet2 (Chicago) - To Tokyo To Tokyo To ASGC, KREONet2 ESnet Europe) SURFsara AARNet USA RU CESNET , JGN ,TEIN, HEPPERN, Peer: RU KRLight To Seattle KREONet2, Peer: ASGC, ESnet, ASGC SINET GÉANT and Internet2, CANARIE, to CERN Hong BNL-T1 ANL CANARIE NetherLight GÉANT, Czechia to Daejeon FNAL-T1 KREONet2;

Kong BNL-T1 ------(Amsterdam) - prague_cesnet_lcg2, TransPAC VRF To JANET Europe GÉANT, GARR, CSTNET To GÉANT, Paris NORDUnet NORDUnet to SINET,via ASGC, COMSATS, JGN NCP CANARIE To FZU/praguelcg2 AARNet PNNL, SLAC, ORNL

(London via Orient+) via (London All sites at MANLAN JGN, multipoint VLAN, Peers: ASGC, MANLAN To GÉANT, London TEIN to CERNLight, Hamburg TEIN NORDUnet, NET2 (North SINet ASGC SINET -400 NORDUnet2, Poznan inc Peer: VanderbiltTo WIX, ANSP, and RNPUFL via MIT, GÉANT, RU- , Peer: Internet2 East Tier 2) (Via ANA To: ESnet, London via MOXY) CERN, GÉANT Beijing/ ESnet, GenevaESnet, SOX, -400 VRF/KIAE, SINET, Dublin Amsterdam UCSB, TACC, UNL, Wisc MAN LAN SINET / , AGLT2, KERONet, Amaparo NL-T1, KREONet2; to TIFR To: Internet2 Peer: Caltech, UCSD, ANA Singapore UChi Beijing to GÉANT TEIN North ASGC, , IU, Purdue, () to ASGC ------To Internet2, SINET, Amst Frankfurt TEIN UT Arlington, UIUC, via RNP SURFsara, ESnet, Belnet Prague TransPAC via Orient+ Duke via Internet2 L2 . PSNC, ANA Paris

KREONet2 Internet2, CERN, RU

Hong Kong US/NSF/IU - GÉANT, CSTNET ESnet, NORDUnet, 400

- Vienna VRF

PNWG , Bejing

Hong SINET CANARIE CERN to CSTNet - London ARNES PacificWave Kong To LA ESnet Singapore to PacificWave Belnet ANA-400 Budapest JGN UWisc -400 ANA RENATER Geneva GÉANT, CERNET (distributed exchange) (distributed CAE-1 KREONet ESnet HKIX Madison Brussels NetherLight Peers: ESnet Hong Kong ESnet, Peer: AARNet ESnet To: ESnet, CANARIE, ESnet NORDUnet

TANet2 Singapore AGLT2 Harvard SingAREN Internet2 via Internet2 StarLight, LA, then To ThaiREN GÉANT, NORDUnet, UM To ESnet Lisbon GARR Bucharest Taiwan CERN via Marseille NCU, NTU MIT NET2 / BU VRF

Thailand - , TEIN, PacWave SINET, Tokyo ASGC (North East Tier 2) To

AGLT2 BTAA UNINET, (Sunnyvale) MREN RedCLARA Milan RU (Chicago)

SingAREN OmniPoP ERX-CHULANET MSU RENATER (Chicago) Madrid PSNC

, CERN , TransPAC Taiwan Peers: ESnet, NORDUnet THAISARN, JGN RU UCSD,UCSD To GARR GRNET Via GÉANT- PUBNET-TH, ASGC-T1 To ESnet VRF

– /NSCC UT To ESnet RedIRIS NSTDA-TH , India NKN, ASGC2 Arlington ESnet Internet2 GÉANT, ASGC to RoEduNet CENIC/ ESnet CalREN UNINET-TH, To ASGC via LA TIFR UIC To ESnet Singapore SUT-TH ESnet ANA CANARIE, -400 GARR Amsterdam NORDUnet, UChi ESnet To UNL MyREN UCSB To ESnet SINET CERNLight Malaysia (MWT2) -400 ESnet PERN SINET to Purdue Indiana ANA GÉANT SingAREN WIX RU NORDUnet TOKYO ANA-400 GÉANT Korea GÉANT - Open UCSD VRF Pakistan Singapore ASGC USA (Washington) NCP-LCG2 UIUC IU GÉANT London CERN Peers: Caltech (MWT2) ND PK-CIIT ASGC, SINET, JGN, TEIN, AARNet, AtlanticWave Geneva

ASGC to ASGC to To: Internet2 ESnet GÉANT (distributed GÉANT

IUPUI exchange) SINET, ESnet, CalREN, U Fl CERN-T0 Vanderbilt London ASGC to Tokyo (MWT2) SOX CERN ESnet PacWave Peer: ESnet Geneva Amsterdam CERN-T1 CAE (Los Angeles) (Atlanta) Amsterdam -1: Singapore To ASGC SINET CERN (TEINCC, -London, ESnet SurfNet London via Singapore ASGC Taiwan Taiwan ASGC GÉANT AARnet , Nordunet , SingAREN , SINET, London Peers: SINET, , GÉANT To MANLAN to ) ESnet, Peers: TransPAC, LAN Internet2, NORDUnet , ESnet, U Fl Internet2 GÉANT, via MAN London via PIREN) TransPAC ESnet, Internet2 , GOREX USA AMPath/ ERX-ERNET NKN (U of Guam) -LA, Amsterdam Singapore ASGC to StarLight Peers: UOK, UFL, CANET, AMLight Guam IU, PURDUE, IUPUI, NAP of Americas Belnet India India (Internet2+SingAREN/NSCC) CEA LHC1-RENATER Geneva ESnet, CALTECH, MIT, to ESnet, Internet2, (Miami) and GÉANT London Belgium TIFR VECC AARNet UOC, NORDUNET, To:ESnet, Internet2 Saclay and Paris via ANA-400 France, AS2091 CERN Mumbai Kolkata GÉANT, AARNet, BEgrid-ULB-VUB, LPNHE ARNES Begrid-UCL, IIHE CC-IN2P3-T1 GÉANT RedCLARA, LPC, APC, Slovenia

to AARNet To: ESnet, LAL S. America , et al CPPM, SiGNET xx RoEduNet Internet2 IPHC, IPNHO Australia São Paulo CBPF HEPGrid DNIC CUDI SUBATECH LLR, Romania Panama PTTA (UERJ) CEA- UMel NAP do Brasil RedCLARA VRF UAIC xx, ISS, BELLA: GÉANT, et al, Perth IRFU - NIHAM, UNAM REUNA Peers: ITIM, NIPNEx3 Chile Chile Fortaleza FCCN RNP / REDNESP GÉANT, RU LHCONE Map Ver. 5.4, September 2020 – WEJohnston, ESnet, [email protected] UTFSM/ Brazil Portugal CCTVal GARR International infrastructure by provider/collaboration To: GÉANT GRNET GARR LHCONE VRF domain/aggregator CANARIE NREN/site router at exchange point SAMPA Ioannina SPRACE Italy - A provider network. various NORDUnet (USP) RedIRIS CNAF-T1 Greece Communication links: (UNESP) ANSP Connector network – provides, GÉANT KIAE, Russia e.g., an L2 path between VRFs. 1/10, 20/30/40, and 100Gb/s or N x 100G Spain INFN SINET, Japan KREONet2, Korea Bari, Catania, JGN – Underlined link information ASGC, Taiwan PIC-T1 INFN London Provider network PoP router SingAREN/NSCC BELLA: GÉANT, et al, Pisa, Napoli Frascati, Legnaro, indicates link provider, not use ESnet transatlantic, USA RedCLARA, et al CIEMAT-LCG2, Milano, Roma1, CUDI Not currently connected to UAM-LCG2 Torino NDGF-T1 Dotted outline indicates distributed site JGN, SingAREN (+ TIENCC, SurfNet, NORDUnet, 16 LHCONE GÉANT on the London link) Adapted by R. L. Iope Exchange point Blue dashed outline indicating a WLCG ANA-300/400 - Various links provided by CANARIE, ESnet, (July/2021) RAL federation site not currently on LHCONE GÉANT, Internet2, NORDUnet, SURFnet, SINET,IU/NSF NCC – NAP Network Connection

UNESP Center for Scientific Computing Academic Network of São Paulo - REDNESP (@ Equinix SP4*)

Padtec Padtec Flexponder Flexponder Cisco Catalyst 6506E

10G 10G CH 27 CH 27 10G 10G 10G 10G 40G 40G Brocade MLXe DWDM link 4x 10G (ANSP Router) 40G Megatelecom 40G (2x 100 Gbps)

~ 40 Km Padtec Padtec 4x 10G Huawei CE 8861 Mux / Mux / Huawei CE 6881 Demux Demux 10G 40G 100G CH 31 CH 31 100G

100G Brocade MLXe Padtec Padtec (SOL - SouthernLight) Transponder Transponder 40G SPRACE cluster

Juniper (RNP Router) To AmLight (4x 10G + 2x 100G)

* https://www.equinix.com.br/data-centers/americas-colocation/brazil-colocation/sao-paulo-data-centers

17 Conclusions and Outlook The HL-LHC represents a monumental step in computing q Similar to the Tevatron à LHC jump q Computing power evolution alone will not save us

Two approaches to Research and Development q Machine Learning q Heterogeneous computing

SPRACE will be part of the HL-LHC computing era q Upgrades foreseen for 2021, 2023 – and beyond. q Involvement in R&D activities

18 Computação no SPRACE na era do HL-LHC

Infrastrutura de Computação q Upgrade (Storage + WNs) 2021: (92k + 150k) = 242k USD q Upgrade (Storage + WNs) 2023: (92k + 150k) = 242k USD q Instalação de GPUs na Tier-2: 108k USD q Upgrade para HL-LHC (2026): 242k USD q Upgrade para HL-LHC (2031): 242k USD Azul: previsto no q Infrastrutura do datacenter: 40k USD Projeto Temático P&D para Computação FAPESP 2018/25225-9 q ML TT-V: 2020-2025: 178k BRL q HetComp TT-V: 2020-2025: 178k BRL Vermelho: q ML TT-V: 2026-2027 178k BRL ainda não custeado q HetComp TT-V: 2026-2027 178k BRL Cluster de Desenvolvimento q 2 servidores, 8 GPUs: 59k USD

19