Creating Abstractions for Piz Daint and its Ecosystem
Sadaf Alam, Chief Technology Officer, Swiss National Supercomputing Centre
November 16, 2017

CSCS in a Nutshell
§ A unit of the Swiss Federal Institute of Technology in Zurich (ETH Zurich)
  § Founded in 1991 in Manno
  § Relocated to Lugano in 2012
§ Develops and promotes technical and scientific services
  § For the Swiss research community in the field of high-performance computing
§ Enables world-class scientific research
  § By pioneering, operating and supporting leading-edge supercomputing technologies
§ Employs 90+ persons from 15+ different nations

Piz Daint and the User Lab
http://www.cscs.ch/uploads/tx_factsheet/FSPizDaint_2017_EN.pdf
http://www.cscs.ch/publications/highlights/
http://www.cscs.ch/uploads/tx_factsheet/AR2016_Online.pdf
§ Model: Cray XC40/XC50
§ XC50 compute nodes: Intel® Xeon® E5-2690 v3 @ 2.60 GHz (12 cores, 64 GB RAM) and NVIDIA® Tesla® P100 (16 GB)
§ XC40 compute nodes: Intel® Xeon® E5-2695 v4 @ 2.10 GHz (18 cores, 64/128 GB RAM)
§ Interconnect configuration: Aries routing and communications ASIC, Dragonfly network topology
§ Scratch capacity: ~9 + 2.7 PB

Piz Daint (2013 → 2016)
2013 (Cray XC30)
§ 5,272 hybrid nodes (Cray XC30)
  § Nvidia Tesla K20x
  § Intel Xeon E5-2670
  § 6 GB GDDR5
  § 32 GB DDR3
§ No multi-core partition
§ Cray Aries dragonfly interconnect
  § ~33 TB/s bisection bandwidth
§ Sonexion Lustre file system
  § 2.7 PB (Sonexion 1600)
2016 (Cray XC50/XC40)
§ 5,320 hybrid nodes (Cray XC50)
  § Nvidia Tesla P100
  § Intel Xeon E5-2690 v3
  § 16 GB HBM2
  § 64 GB DDR4
§ 1,431 multi-core nodes (Cray XC40)
  § 2 x Intel Xeon E5-2695 v4
  § 64 and 128 GB DDR4
§ Cray Aries dragonfly interconnect
  § ~36 TB/s bisection bandwidth
  § Fully provisioned for 28 cabinets
§ Sonexion Lustre file system
  § ~9 PB (Sonexion 3000) & 2.7 PB (Sonexion 1600)
§ External GPFS on selected nodes

Piz Daint – More Versatile Than Before
§ 2013: computing, visualization
§ 2016 adds: data analysis, pre-/post-processing, data mover, DataWarp, machine learning, deep learning

Overview of Hardware Infrastructure
[Diagram: SWITCHlan (100 Gbit Ethernet), CSCS LAN, dedicated platforms, data center network (InfiniBand, Ethernet)]

LHConCray Collaborative Project
§ Team members from CHIPP (Swiss Institute of Particle Physics) and CSCS
§ Aim: explore the efficiency of LHC workflows in a shared environment (Piz Daint)
§ Goals: full transparency to users while sustaining efficiency metrics and supporting monitoring and accounting tools (complete workflow mapping for multiple experiments)
§ Publications, presentations and community meetings
  § G. Sciacca, “ATLAS and LHC Computing on Cray”, CHEP, 2016
  § L. Benedicic, M. Gila et al., “Opportunities for container environments on Cray XC30 with GPU devices”, Proceedings of the Cray User Group meeting, 2015 (a container-launch sketch follows this slide)
§ Status (Jan 2017): https://wiki.chipp.ch/twiki/pub/LCGTier2/MeetingLHConCRAY20170127/20170127.CSCS_CHIPP_F2F.pdf
§ Status (Aug 2017): https://wiki.chipp.ch/twiki/pub/LCGTier2/BlogAcceptanceTests2017/LHConCRAY-Run4_CSCS.pdf
§ Community tools
  § https://wiki.chipp.ch/twiki/bin/view/LCGTier2/LHConCRAYMonitoring
  § http://ganglia.lcg.cscs.ch/ganglia/sltop_lhconcray.html
§ In production since April 2017
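The project above runs LHC payloads inside containers on Piz Daint, and a later slide notes that most Docker/Shifter image pulls on the system are for LHC software. As a rough illustration only, the Python sketch below builds and submits a Slurm batch job that runs a containerized payload with Shifter. It assumes NERSC-style Shifter tooling (`shifterimg pull`, the `--image` batch option, `srun shifter`) and a hypothetical image name; the actual LHConCray configuration is not described at this level of detail in the slides and may differ.

```python
"""Hedged sketch: submitting a containerized payload with Shifter under Slurm.

Assumptions (not taken from the slides): NERSC-style Shifter tooling and a
hypothetical image name. Site-specific options on Piz Daint may differ.
"""
import subprocess

IMAGE = "docker:lhc/example-payload:latest"  # hypothetical image name

# Pull the image through the Shifter image gateway so compute nodes can use it.
subprocess.run(["shifterimg", "pull", IMAGE], check=True)

# Batch script: request the image, then run the payload inside the container.
batch_script = f"""#!/bin/bash
#SBATCH --job-name=lhc-payload
#SBATCH --nodes=1
#SBATCH --time=01:00:00
# The constraint below selects the multicore partition; it is site-specific.
#SBATCH --constraint=mc
#SBATCH --image={IMAGE}

srun shifter /bin/sh -c 'echo "running the LHC payload inside the container"'
"""

# sbatch accepts the script on standard input (equivalent to `sbatch < job.sh`).
subprocess.run(["sbatch"], input=batch_script, text=True, check=True)
```

The pull is done once up front because the image gateway converts the Docker image into a form the compute nodes can use; subsequent jobs then reference the converted image by name.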
LHConCray Project – WLCG Platform Statistics
§ ~170 sites in 40 countries
§ 350,000+ cores
§ 500+ PB
§ 2+ million jobs per day
§ 10-100 Gb links
§ http://wlcg.web.cern.ch/tools
[Diagram: data center network (InfiniBand, Ethernet)]

Status of LHConCray Project
§ Operational since April 2017
§ Monitor status and progress at http://wlcg.web.cern.ch/tools and https://wiki.chipp.ch/twiki/bin/view/LCGTier2/LHConCRAYMonitoring
§ Statistics
  § ~20% of total job submissions (<0.4% of total compute resources)
  § Over 90% of Docker/Shifter image pull requests are for LHC software
§ Open items (tuning and optimization)
  § A data-corruption patch generated an issue for the swap test case
  § Continued investigation into DVS and DWS optimization and tuning

Bridging the Gap → Creating New Abstractions
§ Lightweight operating system (SLES-based)
  § Possible solution: containers or other virtualization interfaces
§ Diskless compute nodes
  § Possible solution: exploit burst buffers or tiered storage hierarchies
§ Compute node connectivity (high-speed Aries interconnect)
  § Possible solution: web-services access with no address-translation overhead

Future Policy and Technical Considerations: Convergence of HPC and Data Science Workflows
§ Resource Management Systems (job schedulers)
  § Too many jobs (relative to the HPC job mix)
  § Fine-grained control and interactive access
§ Resource Specialization (multi-level heterogeneity)
  § Subsets of nodes with special operating conditions, e.g. node sharing
§ Resource Access (authentication, authorization and accounting)
  § Delegation of access (service and user account mappings)
§ Resource Accessibility and Interoperability (middleware services)
  § Secure and efficient access through web services
  § Interoperability with multiple storage targets (POSIX & object) (see the sketch after the closing slide)

Thank you for your attention.
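As a postscript to the interoperability point on the last content slide (multiple storage targets, POSIX & object), here is a hedged Python sketch of what such dual access could look like from user code. The endpoint URL, bucket name and scratch path are hypothetical placeholders, and the object interface is assumed to be S3-compatible (accessed with the standard boto3 client); the slides state the goal but not a concrete mechanism.

```python
"""Hedged sketch: writing one result to both a POSIX path and an object store.

Assumptions (not from the slides): an S3-compatible object endpoint with
credentials already configured, plus hypothetical bucket and path names.
"""
from pathlib import Path

import boto3

result = b"energy_sum,1.2345e6\n"  # placeholder payload

# POSIX target: an ordinary file on the parallel (scratch) file system.
posix_path = Path("/scratch/example/run001/result.csv")  # hypothetical path
posix_path.parent.mkdir(parents=True, exist_ok=True)
posix_path.write_bytes(result)

# Object target: the same bytes stored via an S3-compatible interface.
s3 = boto3.client("s3", endpoint_url="https://object.example.org")  # hypothetical endpoint
s3.put_object(Bucket="results", Key="run001/result.csv", Body=result)
```

Hiding both writes behind a small helper like this is one way the "multiple storage targets" goal can stay a middleware concern rather than forcing an application rewrite.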