2017 ALCF Operational Assessment Report
Total Page:16
File Type:pdf, Size:1020Kb
ANL/ALCF-20/04 2017 Operational Assessment Report ARGONNE LEADERSHIP COMPUTING FACILITY On the cover: This visualization shows a small portion of a cosmology simulation tracing structure formation in the universe. The gold-colored clumps are high-density regions of dark matter that host galaxies in the real universe. This BorgCube simulation was carried out on Argonne National Laboratory’s Theta supercomputer as part of the Theta Early Science Program. Image: Joseph A. Insley, Silvio Rizzi, and the HACC team, Argonne National Laboratory Contents Executive Summary ................................................................................................ ES-1 Section 1. User Support Results .............................................................................. 1-1 ALCF Response ........................................................................................................................... 1-1 Survey Approach ................................................................................................................ 1-2 1.1 User Support Metrics .......................................................................................................... 1-3 1.2 Problem Resolution Metrics ............................................................................................... 1-4 1.3 User Support and Outreach ................................................................................................. 1-5 1.3.1 Tier 1 Support ......................................................................................................... 1-5 1.3.1.1 Phone and E-mail Support ....................................................................... 1-5 1.3.1.2 Continuous Improvement......................................................................... 1-6 Accounts Website Changes ...................................................................... 1-7 Tracking Master User Agreements and User Acknowledgements .......... 1-7 Collaboration to Improve Collection of Project Quarterly and Annual Reports ........................................................................................ 1-7 1.3.2 Application Support ................................................................................................ 1-7 Modifying the LAMMPS code for Aurora .............................................. 1-7 Resolving Memory Limitations for I/O Buffers in c64 Mode ................. 1-8 MPI-Driver Design Manages Calculations for Materials Science ........... 1-8 HOPping from Cray/SGI to IBM Blue Gene/Q with High Efficiency .... 1-8 Optimizing VSVB Kernel, Reducing Time to Solution Submitted for the Association for Computing Machinery (ACM) Gordon Bell Prize ... 1-8 A Director’s Discretionary Team Reaches Its Goal of Obtaining a 2018 INCITE Award................................................................................ 1-9 Reducing Core Hours at a Significant Pace – Accelerated Climate Modeling for Energy ................................................................................ 1-9 Hydrodynamics into HACC to Enable Full Theta Simulation ................ 1-9 1.3.3 Resource Support .................................................................................................. 1-10 1.3.3.1 General Support ..................................................................................... 1-10 Jupyter Notebook ................................................................................... 1-10 Condor.................................................................................................... 1-10 1.3.4 Outreach Efforts .................................................................................................... 1-10 1.3.4.1 General Outreach ................................................................................... 1-10 User Advisory Council .......................................................................... 1-11 Connection to Technology Commercialization and Partnerships Division .................................................................................................. 1-11 Support of GlobalFoundries ALCC Extension ...................................... 1-11 ALCF CY 2017 Operational Assessment Report iii Contents (Cont.) 1.3.4.2 Workshops, Webinars, and Training Programs ..................................... 1-11 Getting Started Videoconferences ......................................................... 1-11 Preparing for KNL — Videos for Users ................................................ 1-11 Preparing for KNL — Developer (Jam) Sessions.................................. 1-12 ALCF Computational Performance Workshop ..................................... 1-12 Theta — Many-Core Tools and Techniques and Developer Sessions .. 1-12 ATPESC 2017 ........................................................................................ 1-12 1.3.4.3 Community Outreach ............................................................................. 1-13 Youth Outreach ...................................................................................... 1-13 Women in STEM ................................................................................... 1-13 Hour of Code.......................................................................................... 1-13 Summer Coding Camp ........................................................................... 1-13 CodeGirls@Argonne Camp ................................................................... 1-13 Visits and Tours .................................................................................................... 1-13 International Delegations ....................................................................... 1-14 School Groups ........................................................................................ 1-14 Industry Groups ..................................................................................... 1-14 Other STEM Activities ......................................................................................... 1-14 1.3.5 Communications ................................................................................................... 1-15 Conclusion ................................................................................................................................. 1-16 Section 2. Business Results .................................................................................... 2-1 ALCF Response ........................................................................................................................... 2-1 ALCF Resources .......................................................................................................................... 2-2 2.1 Resource Availability ......................................................................................................... 2-2 Theta ............................................................................................................................................ 2-3 2.1.1 Scheduled and 2.1.2 Overall Availability ............................................................... 2-3 2.1.3 System Mean Time to Interrupt (MTTI) and 2.1.4 System Mean Time to Failure (MTTF) ....................................................................................................... 2-6 2.2 Resource Utilization ........................................................................................................... 2-6 2.2.1 Total System Utilization ......................................................................................... 2-6 2.3 Capability Utilization .......................................................................................................... 2-8 Mira .......................................................................................................................................... 2-10 2.1.1 Scheduled and 2.1.2 Overall Availability ............................................................. 2-10 2.1.3 System Mean Time to Interrupt (MTTI) and 2.1.4 System Mean Time to Failure (MTTF) ..................................................................................................... 2-12 2.2 Resource Utilization ......................................................................................................... 2-13 2.2.1 Total System Utilization ....................................................................................... 2-13 2.3 Capability Utilization ........................................................................................................ 2-15 Conclusion ................................................................................................................................. 2-18 ALCF CY 2017 Operational Assessment Report iv Contents (Cont.) Section 3. Strategic Engagement and Results ....................................................... 3-1 ALCF Response ........................................................................................................................... 3-1 3.1 Scientific Output ................................................................................................................. 3-1 3.2 Scientific Accomplishments ............................................................................................... 3-2 3.2.1 Molecular Design of Dye-Sensitized Solar Cells with Data Science ..................... 3-2 3.2.2 Multiscale Simulations of Human Pathologies ....................................................... 3-3 3.2.3 Direct Numerical Simulation of Compressible, Turbulent Flow ...........................