CSCS Site Update HPC Advisory Council Workshop 2018

Colin McMurtrie, Associate Director and Head of HPC Operations.

9th April 2018

The Swiss National Supercomputing Centre – driving innovation in computational research in Switzerland

. Established in 1991 as a unit of ETH Zurich
. >90 motivated staff from ~15 nations
. Now with a division in Zurich working on Scientific Software and Libraries (aka SSL)
. Develops and operates the key supercomputing capabilities required to solve problems important to science and/or society
. Leads the national strategy for High-Performance Computing and Networking (HPCN) that was passed by the Swiss Parliament in 2009
. Has had a dedicated User Laboratory for supercomputing since 2011
. Research infrastructure funded by the ETH Domain on a programmatic basis
. ~1000 users, 200 projects

Flexible Facilities Infrastructure

• Flexible Facilities Infrastructure is important since we cannot be certain about future system requirements
• CSCS' Data Centre provides:
  • power/cooling: 12 MW
  • upgradable to 25 MW
  • Free cooling with water from lake Lugano
  • Current Power Usage Effectiveness (PUE) = 1.2 (see the worked definition at the end of this slide)

Sophisticated Infrastructure
The basement of the computer building houses the "resource deck" containing the basic infrastructure: 960 batteries for the emergency power supply as well as the electricity and water supply systems. Thick cables deliver the power to the computer centre at a medium voltage of 16,000 volts, where braided copper cables, as thick as an arm, distribute it to the twelve currently installed transformers. The transformers convert the power to 400 volts, before it is taken via power rails to the middle floor, the "installation deck", and finally from there to the supercomputers. The existing power supply allows the computer centre to operate computers with an output of about 11 megawatts, and this could even be extended to operate up to 25 megawatts.

A separate storey instead of a raised floor
From the "resource deck", the processed power and water are sent to the "distribution deck", the installation floor located directly above. In most conventional computer centres, the installation deck consists of a raised floor measuring 40 to 60 centimetres in height through which kilometres of cable are fed. The cabinets for the power distribution units (PDU) are located in the computer room and so limit the options for installing supercomputers. In order to avoid this limitation in the new CSCS building, the raised floor has been replaced by a five-metre-high storey which houses the entire technical infrastructure, also called the secondary distribution system. The decision to opt for this construction was made on the basis of experience in the previous computer centre in Manno, where the raised floor was barely able to accommodate the installation of new computers.

The computer building (left) and the office block (right) are connected by a bridge and an underground tunnel. (Picture: CSCS)

Pumping once to cool twice
The lake water pipe, measuring 80 centimetres in diameter, enters the building on the south side. Alongside it, a pipe of the same size leads back to the lake. Between the incoming and outgoing pipes there is a sophisticated cooling system in operation: the lake water and the internal cooling water circuit meet in heat exchangers which are as tall as a person. There the low temperature of the lake water is transferred to the internal cooling circuit, which delivers water at about 8, or at most 9, degrees Celsius to the supercomputers to cool them. Once the water has passed through this first cooling circuit, it has been warmed up by eight degrees. However, this now 16 to 17 °C water is still cold enough to cool the air in the housings of lower-density computers and hard discs. To this end, it is sent through a further heat exchanger connected to a second, mid-temperature cooling circuit. This means that with one pumping operation, cold water is supplied to two cooling circuits that cool two types of systems. This, too, saves energy.

The cold water pipe is designed to cool up to 14 megawatts on the first cooling circuit. The second circuit can cool a further 7 megawatts of computers. The more the second circuit is used, the higher the waste heat absorbed by the water, and so the more useful it is to the local industrial works, who will be able to use it.

Only a trapdoor indicates the existence of the pumping station below the surface of Parco Ciani to visitors. (Picture: CSCS)

Before the lake water returns to the lake, it passes through a stilling basin which can hold 120 cubic metres. The basin collects the water and makes sure that it flows freely down the return pipe back to the lake at a constant pressure, with no need for further power to be used. On the contrary, the plan is to use the energy generated as it falls to produce electricity. That is why connections for a microturbine have been provided in the pumping station.

So as not to affect the ecological balance of the lake, the water going back into the lake must never exceed 25 degrees Celsius. To ensure that this is always the case, a back-mixing funnel has been fitted which will add cold water if necessary.


The suction strainers for the lake water pipe, just before they were lowered 45 metres into Lake Lugano. (Picture: CSCS)

The water pipe (green) stretches 2.8 km across the city to connect the lake (right) with the computing centre. On its way it crosses under the Cassarate river twice.
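The PUE value quoted on this slide relates total facility power to the power actually consumed by the IT equipment. A minimal worked illustration of the definition (the 1.2 value is from the slide; the 12 MW / 10 MW split is an assumed round-number example, not a measured CSCS figure):

$$\mathrm{PUE} \;=\; \frac{P_{\text{total facility}}}{P_{\text{IT equipment}}}, \qquad \text{e.g.}\quad \frac{12\ \mathrm{MW}}{10\ \mathrm{MW}} = 1.2$$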


Flagship “Piz Daint”

. Cray XC40 / Cray XC50
. Operational since April 2013
. Extension + upgrade to hybrid in late 2013
. Upgrade to new GPU in 2016
. Compute nodes
  . 5'320 dual-socket nodes with CPU and NVIDIA Tesla P100 GPU
  . 1'815 dual-socket nodes with Intel Xeon CPUs
. Total system memory: 521 TB RAM
. Peak performance
  . Hybrid partition: 25.3 Petaflop/s
  . Multicore partition: 1.7 Petaflop/s
  . Measured Linpack performance of 19.59 Petaflop/s
. Most powerful petascale supercomputer in the Top 10 of the Green500

Evolution of the Flagship System - Piz Daint

2013 (Cray XC30)
. 5,272 hybrid nodes (Cray XC30)
  . NVIDIA Tesla K20X, 6 GB GDDR5
  . 16x PCIe 2.0
  . Intel Xeon E5-2670 (SB)
  . 32 GB DDR3
. No multi-core partition
. Cray Aries interconnect
  . 16x PCIe 3.0
  . Dragonfly topology
  . ~33 TB/s bisection bandwidth
  . Fully provisioned for 28 cabinets
. Sonexion Lustre file system
  . 2.7 PB Snx1600
. Slurm WLM
  . Slurm + ALPS

2018 (Cray XC50/XC40)
. 5,320 hybrid nodes (Cray XC50)
  . NVIDIA Tesla P100, 16 GB HBM2
  . 16x PCIe 3.0
  . Intel Xeon E5-2690 v3 (HSW)
  . 64 GB DDR4
. 1,815 multi-core nodes (Cray XC40)
  . Dual-socket Intel Xeon E5-2695 v4 (BDW)
  . 64 GB and 128 GB DDR4
. Cray Aries interconnect
  . 16x PCIe 3.0
  . Dragonfly topology
  . ~36 TB/s bisection bandwidth
  . Public IP routing to CSCS network
. Cray Sonexion Lustre file system
  . 9.6 PB Snx3000
  . 2.7 PB Snx1600
  . External GPFS on selected nodes
. Slurm WLM
  . Native Slurm (no ALPS)

Piz Daint – A Consolidated HPC Environment

Capabilities consolidated on Piz Daint (2013 → 2017):
. Computing
. Visualisation
. Data Analysis
. Pre- & post-processing
. Data Mover
. DataWarp
. Machine Learning
. Deep Learning
. Support for Docker

Data Centre Ecosystem

Internet Access (via SWITCHlan; 100 Gbit/s)

CSCS LAN

Consolidated HPC Environment
On-site Cloud IaaS
Dedicated Customer Systems/Platforms

Data Centre Network (IB, Ethernet)

TSM + Tape Library
Site-wide Storage

Infrastructure-as-a-Service (IaaS)

. “IaaS is a service model that delivers computer infrastructure on an outsourced basis to support enterprise operations. Typically, IaaS provides hardware, storage, servers and data center space or network components; it may also include software.” Source: Techopedia.com

Legend: You = Platform Provider; Other = Infrastructure Provider

IaaS - Why Use It?

• Enables the hosting of domain-specific portals that are managed by external entities
• Separation of concerns means we don’t need to get involved with the details of what powers the Portal(s)
• Dynamic Provisioning is possible
• But the Web Services themselves need to be scalable in such an environment
• Challenges:
  • The Infrastructure provider has no visibility on what is happening within the Platform(s)

Example of a Web Service (arrows in the diagram denote functional dependency): an NGINX Reverse Proxy Server (SSL + caching) with an NGINX OIDC extension (Lua), an OIDC Service (MITREid Connect based, Java) exposing an OIDC REST API, an Identity Service, Django REST with the Django ORM + business logic backed by a MySQL DB, an RDBMS (PostgreSQL or MySQL), and log collection and monitoring services, all running on the IaaS VM infrastructure.
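To make the “Example of a Web Service” concrete, here is a minimal sketch of the Django REST layer named in the diagram. It is illustrative only: the Experiment model, field names, and the "portal" app are hypothetical, the code assumes a normally configured Django project with djangorestframework installed, and SSL termination plus OIDC authentication are assumed to be handled by the NGINX reverse proxy in front of it.

```python
# Hypothetical portal app: a Django REST Framework endpoint of the kind that
# sits behind the NGINX reverse proxy / OIDC layer shown in the diagram.
# Assumes an installed Django app ("portal") and `pip install djangorestframework`.
from django.db import models
from rest_framework import routers, serializers, viewsets


class Experiment(models.Model):
    """Toy domain object standing in for the portal's real data model."""
    name = models.CharField(max_length=128)
    created = models.DateTimeField(auto_now_add=True)

    class Meta:
        app_label = "portal"  # hypothetical app name


class ExperimentSerializer(serializers.ModelSerializer):
    class Meta:
        model = Experiment
        fields = ["id", "name", "created"]


class ExperimentViewSet(viewsets.ModelViewSet):
    """Django ORM + business logic exposed as a REST resource; the RDBMS
    (MySQL/PostgreSQL) sits behind the ORM, as in the diagram."""
    queryset = Experiment.objects.all()
    serializer_class = ExperimentSerializer


# Wired into urls.py: the reverse proxy forwards /api/ requests here.
router = routers.DefaultRouter()
router.register(r"experiments", ExperimentViewSet)
urlpatterns = router.urls
```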

OpenStack IaaS Architecture Summary

Production OpenStack Environment - Pollux

New system for generic Cloud Services
. 30 new servers
. Directors, Controllers, Compute
. Dedicated 40 Gb/s network
  . 2 x 48-port 40 Gbit/s switches integrated into the CSCS network
. Storage
  . ~30 TB usable internal CEPH storage
  . External Swift-on-GPFS storage
. Red Hat OpenStack Platform 11 (RHOSP11)
. Integrated with the CSCS AAI via KeyCloak (RHSSO)
. Firewall rules configured for Internet-facing services

Now hosting production platforms for third-party projects
. Adding more HW
. Augmenting the Cloud offerings (work ongoing)

Architecture diagram components: Director node, Controller nodes, Compute nodes, Storage nodes, KeyCloak/RHSSO, SWIFT (IBM Spectrum Scale), SAN Storage.
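A hedged sketch of how a platform provider might drive an OpenStack cloud such as Pollux programmatically, using the standard openstacksdk client. The cloud name "pollux" (a clouds.yaml entry) and the image, flavor and network names are assumptions for illustration, not the actual names used at CSCS.

```python
# Minimal openstacksdk sketch: provision a VM on an OpenStack cloud.
# Assumes a clouds.yaml entry named "pollux" with valid credentials; the image,
# flavor and network names below are placeholders, not real CSCS resources.
import openstack

conn = openstack.connect(cloud="pollux")

# Look up the building blocks for the server.
image = conn.compute.find_image("centos-7")          # placeholder image name
flavor = conn.compute.find_flavor("m1.large")        # placeholder flavor name
network = conn.network.find_network("project-net")   # placeholder network name

# Boot the VM and wait until it is ACTIVE.
server = conn.compute.create_server(
    name="demo-platform-node",
    image_id=image.id,
    flavor_id=flavor.id,
    networks=[{"uuid": network.id}],
)
server = conn.compute.wait_for_server(server)
print(server.name, server.status)

# List what is currently running in the project.
for s in conn.compute.servers():
    print(s.name, s.status)
```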

Bringing Cloud Technologies closer to Piz Daint – Docker Containers and Shifter

Production workflows are using Docker containers with Shifter on Piz Daint:

1. Build and test containers with Docker on your laptop (see the sketch after this list)
   • Convenience for the user
2. Run securely and with high performance on Piz Daint using Shifter
   • Native GPU and MPI performance
   • Improves parallel file system performance for some applications (e.g. Spark)
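As an illustration of step 1, here is a small sketch using the Docker SDK for Python (docker-py) to build and smoke-test an image locally before it is pulled onto Piz Daint with Shifter. The image tag, registry and test command are hypothetical placeholders, and the Shifter pull/run step on Piz Daint itself is deliberately not shown.

```python
# Step 1 of the workflow: build and test a container locally with Docker.
# Requires `pip install docker` and a running Docker daemon; the image name,
# registry and test command are placeholders for illustration.
import docker

client = docker.from_env()

# Build the image from the Dockerfile in the current directory.
image, build_logs = client.images.build(
    path=".", tag="registry.example.org/myapp:latest"
)

# Quick local smoke test inside the freshly built container.
output = client.containers.run(
    "registry.example.org/myapp:latest",
    command="python -c 'import myapp; print(myapp.__version__)'",
    remove=True,
)
print(output.decode())

# Push to a registry that Shifter on Piz Daint can pull from (step 2 then
# happens on the supercomputer itself, via Shifter).
for line in client.images.push(
    "registry.example.org/myapp", tag="latest", stream=True, decode=True
):
    print(line)
```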

Current use cases:
• LHConCray
• Data Analytics frameworks (e.g. Spark)
• HBP Neurorobotics Platform
• >5000 container launches per day
• Others coming… watch this space

Bringing it all together

1. IaaS relies on REST APIs to offer Services to Platforms (see the sketch after this list)
   . We collectively term these services Infrastructure Services

2. OpenStack is one way to provide IaaS and this can be done with satellite clusters

3. For Piz Daint we need other mechanisms to provide the necessary Infrastructure Services APIs
   . Work underway in this area

4. IaaS opens the door to Interactive Supercomputing
   . There are known use cases coming from various communities
   . This implies some policy-level changes (e.g. job preemption or node sharing for some queues)

5. This does NOT mean we will do away with our usual operations
   . The User Lab will remain our core business
   . These new services are a way to augment our capabilities and open doors to new communities
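To make point 1 concrete, here is a minimal sketch of what "Infrastructure Services via REST APIs" looks like in practice with OpenStack: authenticate against Keystone, read the service catalogue, then call the compute API. The endpoint URL, user, project and password are placeholders, not real CSCS credentials or endpoints.

```python
# Hedged sketch of OpenStack's REST-style Infrastructure Services:
# 1) obtain a token from Keystone (identity), 2) discover the compute endpoint
# from the service catalogue, 3) list servers via the Nova REST API.
# The URL and credentials below are placeholders for illustration only.
import requests

KEYSTONE_URL = "https://openstack.example.org:5000/v3"  # placeholder endpoint

auth_body = {
    "auth": {
        "identity": {
            "methods": ["password"],
            "password": {
                "user": {
                    "name": "demo-user",
                    "domain": {"name": "Default"},
                    "password": "not-a-real-password",
                }
            },
        },
        "scope": {"project": {"name": "demo-project", "domain": {"name": "Default"}}},
    }
}

resp = requests.post(f"{KEYSTONE_URL}/auth/tokens", json=auth_body, timeout=30)
resp.raise_for_status()
token = resp.headers["X-Subject-Token"]

# The service catalogue in the token response lists the other Infrastructure
# Service endpoints (compute, network, block storage, object storage, ...).
catalog = resp.json()["token"]["catalog"]
compute_url = next(
    ep["url"]
    for svc in catalog if svc["type"] == "compute"
    for ep in svc["endpoints"] if ep["interface"] == "public"
)

# Call the compute (Nova) REST API with the token.
servers = requests.get(
    f"{compute_url}/servers",
    headers={"X-Auth-Token": token},
    timeout=30,
).json()
for s in servers.get("servers", []):
    print(s["id"], s["name"])
```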

Q&A

Contact: [email protected]