ECP Applications Summary

Jan 2017

High Order Multi-physics for Stockpile Stewardship (LLNL-ABS-698614)

Exascale Challenge Problem
• Multi-physics simulations of High Energy-Density Physics (HEDP) and focused experiments driven by high-explosive, magnetic, or laser-based energy sources
• Use of high-order numerical methods expected to better utilize exascale architectural features (higher flop-to-memory-access ratios)
• Support for multiple diverse algorithms for each major physics package (e.g., built-in cross-check or "second vote" capability for validation)
• Improve end-user productivity for the overall concept-to-solution workflow, including improved setup and meshing, support for UQ ensembles, in-situ visualization and post-processing, and optimized workflow in the LC Advanced Technology Systems (exascale platform) environment
• Stockpile stewardship

Applications & S/W Technologies Cited
• C++, Fortran 2003, Lua
• MPI, RAJA, OpenMP 4.x, CUDA (a minimal RAJA portability sketch appears after this project's development plan)
• Conduit, LLNL CS toolkit (SiDRe, Quest, CHAI, SPIO, others…)
• MFEM, MAGMA (UTK)
• HDF5, SCR, zfp, ADIOS
• UQ-Pipeline, VisIt, P-Mesh

Risks and Challenges
• Radiation transport on high-order meshes is a research area
• High-order coupling of multi-physics is a research area
• Immaturity of vendor-supported programming models and compiler technology can slow progress
• Arbitrary order selection (e.g., 2nd order vs. 8th order) at run-time can be expensive – important tradeoffs in flexibility for the user vs. compile-time optimizations and higher performance
• Complete separation of core CS components into a reusable toolkit is a new model for LLNL ASC code development

Development Plan
Y1: ASC L2 language: Demonstrate at least one modular hydrodynamics capability using the CS toolkit. Description: Integrate the CS Toolkit in-memory data repository (SiDRe) into the code and demonstrate the benefits of centralized data management across multiple physics packages by enabling access to generalized Toolkit services such as parallel I/O, runtime interrogation and steering, and computational geometry capabilities
Y2: ASC L2 language: Demonstration of coupled multi-physics using the CS toolkit linking capability. Description: Demonstrate multi-physics coupling via a mesh linking library that directly interacts with the mesh-aware data description in SiDRe
Y3: ASC L2 language: See Appendix of ASC Implementation Plan
Y4: ASC L1 milestone demonstrating a problem of programmatic relevance on Sierra and Trinity
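The slide above cites RAJA alongside OpenMP 4.x and CUDA as the portability layer. As a hedged illustration only (not the LLNL application source), the sketch below shows the basic RAJA pattern: one loop body dispatched through an execution policy, so the same kernel can be retargeted to sequential, OpenMP, or CUDA back ends. The array names and the update itself are hypothetical placeholders.

```cpp
// Minimal RAJA sketch (illustrative only, not the LLNL multi-physics code):
// the execution policy selects the back end while the loop body stays the same.
#include "RAJA/RAJA.hpp"
#include <vector>

int main() {
  const int n = 1 << 20;
  std::vector<double> pressure(n, 1.0), energy(n, 0.0);
  double* p = pressure.data();
  double* e = energy.data();

  // Sequential policy shown here; RAJA::omp_parallel_for_exec (or a CUDA
  // policy with device-resident arrays) retargets the identical loop body.
  RAJA::forall<RAJA::seq_exec>(RAJA::RangeSegment(0, n), [=](int i) {
    e[i] += 0.5 * p[i];  // placeholder physics update
  });
  return 0;
}
```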

PI: Rob Rieben (LLNL)

LANL ASC Advanced Technology Development and Mitigation: Next-Generation Code Project (NGC)

Exascale Challenge Problem
• Multi-physics simulations of systems using advanced material modeling for extreme conditions, supporting experimental programs at MaRIE
• Multi-physics simulations of high-energy-density physics (HEDP) in support of inertial confinement fusion (ICF) experimental programs at NIF
• Routine 3D simulation capabilities to address a variety of new mission spaces of interest to the NNSA complex
• Develop an abstraction layer (FleCSI) to separate physics method expression from underlying data and execution model implementations
• Demonstrate the use of advanced programming systems such as Legion for scalable parallel multi-physics, multi-scale code development

Applications & S/W Technologies
Applications
• NGC and ASC IC modernization
Software Technologies Cited
• Legion, MPI
• Kokkos, Thrust, CUDA, OpenMP
• C++17, LLVM/Clang, Python, Lua
• HYPRE
• ParaView, VTK-m, HDF5, Portage, FleCSI, Ingen

Risks and Challenges
• Immaturity of advanced programming systems such as Legion
• Performance impact of the FleCSI abstraction layer may be too great in the context of dynamic multi-physics problems
• Serial nature of the existing operator split may limit performance at exascale and beyond
• Integration of advanced material models in modern unstructured hydrodynamics codes is a research topic
• Scalable storage in support of routine 3D simulations of sufficient resolution is unproven

Development Plan
Y1: Release version 1.0 of a production toolkit for multi-physics application development on advanced architectures. Numerical physics packages that operate atop this foundational toolkit will be employed in a low-energy-density multi-physics demonstration problem.
Y2: Toolkit release version 2.0. Demonstration of a high-energy-density multi-physics problem.
Y3: Toolkit release version 3.0. Workflow integration in preparation for the Y4 goal.
Y4: ASC L1 milestone demonstrating a problem of programmatic relevance on ASC Advanced Technology Systems

PI: Aimee Hungerford, David Daniel (LANL); LA-UR-16-25966

Next Generation Electromagnetics Simulation of Hostile Environment

Exascale Challenge Problem
• Self-consistent simulation from a hostile builder device, through radiation transport, plasma generation, and propagation, to NW system circuits, cables, and components, with uncertainties
• Develop a coupled Source Region ElectroMagnetic Pulse (SREMP) to System Generated ElectroMagnetic Pulse (SGEMP) simulation; physical spatial domain on the order of kilometers down to system geometry at millimeters
• Efficient radiation transport and air chemistry through Direct Simulation Monte Carlo (DSMC) in rarified domains and condensed time history in thick regions
• Hybrid meshing (unstructured/regular mesh) for geometric fidelity near geometry and performance in the bulk domain for particle/radiation transport
• Single integrated code base with efficient execution on diverse modern hardware (a minimal Kokkos sketch appears after this project's development plan)

Applications & S/W Technologies
Applications
• EMPRESS
• Drekar
Software Technologies Cited
• C++
• MPI, Kokkos (OpenMP, CUDA), DARMA (Charm++)
• DataWarehouse, Qthreads, node-level resource manager
• Trilinos (Solvers, Tpetra, Sacado, Stokhos, Panzer, Tempus, KokkosKernels)
• Percept, Exodus, CGNS, NetCDF, PnetCDF, HDF5
• In-situ visualization (VTK-m, Catalyst)

Risks and Challenges
• Embedded uncertainty propagation through stochastic methods such as DSMC – research is in its infancy
• Scalable solvers for high concurrency not available in solver tools, and it is not clear such tools exist in the literature
• Particle-fluid exchange of moment densities (mass, momentum, and energy)
• Coupling of uncertainties between fluid- and particle-based codes
• Load-balancing computing between particle and field codes, AMT technologies

Development Plan
Y1: Complete development of a fluid representation of plasma models with simple sources, verified – SREMP problem at low altitudes
Y2: Simple radiation transport and PIC coupled to EM/ES fields
Y3: PIC code verified for simple problems – SGEMP problem at high altitude
Y4: Initial coupled PIC/fluid approach for plasma simulation – kinetic SREMP problem at middle to low altitudes
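EMPRESS cites Kokkos (with OpenMP and CUDA back ends) as the path to a single integrated code base that runs on diverse hardware. The sketch below is a hedged, generic Kokkos example and not EMPRESS source; the view names, the kernel, and its coefficient are made-up placeholders.

```cpp
// Minimal Kokkos sketch (not EMPRESS source): one kernel compiled for the
// OpenMP or CUDA back end without source changes. Names are illustrative only.
#include <Kokkos_Core.hpp>

int main(int argc, char* argv[]) {
  Kokkos::initialize(argc, argv);
  {
    const int n = 1 << 20;
    // Device-resident field arrays; memory space follows the enabled back end.
    Kokkos::View<double*> efield("efield", n);
    Kokkos::View<double*> charge("charge", n);

    // Hypothetical update loop: the same source runs as OpenMP threads or CUDA blocks.
    Kokkos::parallel_for("accumulate_charge", n, KOKKOS_LAMBDA(const int i) {
      charge(i) += 0.5 * efield(i);
    });

    // A reduction, e.g. total charge for a diagnostic.
    double total = 0.0;
    Kokkos::parallel_reduce("total_charge", n,
        KOKKOS_LAMBDA(const int i, double& sum) { sum += charge(i); }, total);
  }
  Kokkos::finalize();
  return 0;
}
```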

PI: Matt Bettencourt (SNL)

Software Technology Requirements For Sandia ECP/ATDM ElectroMagnetic Plasma Application (EMPRESS)

• Programming Models and Runtimes
– C++/C++17 (1), Python (1), C (1)
– MPI (1), OpenMP (1), CUDA (1), DARMA (1)
– PGAS (3), Kokkos (1)
• Tools
– CMake (1), Git (1), GitLab (1), PAPI (2), DDT (3), VTune (2), Jenkins (1), CTest (1), CDash (1), LLVM/Clang (2)
• Mathematical Libraries, Scientific Libraries, Frameworks
– Trilinos (1), AgileComponents (1), BLAS/LAPACK (1), cuBLAS (1), DARMA (2), Drekar (1), Sacado/Stokhos (1), Dakota (1)

Requirements Ranking (used throughout): 1. Definitely plan to use; 2. Will explore as an option; 3. Might be useful but no concrete plans

Software Technology Requirements For Sandia ECP/ATDM ElectroMagnetic Plasma Application (EMPRESS)

• Data Management and Workflows
– MPI-IO (1), HDF5 (1), NetCDF (1), Exodus (1), ADIOS (3), STK (1), CGNS (2), DataWarehouse (1)
• Data Analytics and Visualization
– VisIt (1), ParaView (1), Catalyst (2), VTK (3), FFTW (3)

Virtual Flight Testing for Hypersonic Re-Entry Vehicles

Exascale Challenge Problem
• Virtual flight test simulations of re-entry vehicles from bus separation (exo-atmospheric) to target for normal and hostile environments
• DSMC-based simulation of the exo-atmospheric flight regime with hand-off to continuum Navier-Stokes at the appropriate altitude
• Time-accurate wall-modeled LES of high-Reynolds-number (100k–10M) hypersonic gas dynamics
• Fully coupled simulation of re-entry vehicle ablator/thermal (shape change, ablation products blowing) and structural dynamic (random vibration) response
• DNS- and DSMC-enhanced reacting gas models and turbulence models via a-priori and on-the-fly model parameter calculations
• Embedded sensitivity analysis, uncertainty quantification, and optimization

Applications & S/W Technologies
Applications
• SPARC (continuum compressible Navier-Stokes, hypersonic gas dynamics)
• SPARTA (direct simulation Monte Carlo, rarefied gas dynamics)
• Sierra (Aria – thermal response, Salinas – structural dynamics)
Software Technologies Cited
• C++
• MPI, Kokkos (OpenMP, CUDA), DARMA (Charm++)
• DataWarehouse, Qthreads, node-level resource manager
• Trilinos (Belos, MueLu, Tpetra, Sacado, Stokhos, KokkosKernels)
• Percept, IOSS, Exodus, CGNS, NetCDF, PnetCDF, HDF5
• In-situ visualization (VTK-m, Catalyst)

Risks and Challenges
• Scalable solvers for hypersonic gas dynamics (multigrid methods for hyperbolic problems)
• Extreme-scale mesh generation and refinement
• Accurate LES models for hypersonic gas dynamics
• Developing appropriate hypersonic boundary layer/ablator surface interaction models
• Effective task-parallelism and load balancing of heterogeneous physical model workloads

Development Plan
FY17: Demonstrate extreme-scale mesh generation and refinement; continue UQ development efforts; begin research activity on scalable solvers; develop low-dissipation schemes for unsteady turbulent gas dynamics; document KNL performance on ATS-1 (Trinity)
FY18: Focus on DARMA task-parallelism implementation; continue physics model development and implementation for hypersonic turbulent flows
FY19: Focus on multi-physics coupling and simulation development, including workflows; continue DARMA development activities; document GPU performance on ATS-2 (Sierra)
FY20: Full-physics (SPARTA-SPARC coupling, unsteady hypersonic turbulent flows, ablator & structural response coupling) demonstrations with UQ and optimization; document performance

PI: Micah Howard (SNL)

Software Technology Requirements For Sandia ECP/ATDM Hypersonic Reentry Application (SPARC)

• Programming Models and Runtimes
1. C++/C++17 (1), Python (2)
2. MPI (1), Kokkos (1) [including OpenMP, CUDA], DARMA (1) [including Charm++, HPX, Legion/Regent]
3. Boost (2)
4. UPC/UPC++ (3), PGAS (3)
• Tools
1. CMake (1), Git (1), GitLab (2), Gerrit (2), DDT (1), TotalView (1), Jenkins (1), CDash (1)
2. VTune (1), TAU (2), OpenSpeedShop (2), PAPI (2)
3. LLVM/Clang (2)
• Mathematical Libraries, Scientific Libraries, Frameworks
1. Trilinos (1), AgileComponents (1), BLAS/PBLAS (1), LAPACK/ScaLAPACK (1), Metis/ParMetis (1), SuperLU (1), MueLu (1), Dakota (1)
2. Chombo (3)

Software Technology Requirements For Sandia ECP/ATDM Hypersonic Reentry Application (SPARC)

• Data Management and Workflows
1. MPI-IO (1), HDF (1), NetCDF (1)
2. GridPro (meshing) (1), Pointwise (meshing) (1)
3. Sierra (1), STK (1), DTK (2), CGNS (1), DataWarehouse (1)
• Data Analytics and Visualization
1. ParaView (1), EnSight (1)
2. Slycat (2)
• System Software

Enabling GAMESS for Exascale Computing in Chemistry & Materials
Heterogeneous Catalysis on Mesoporous Silica Nanoparticles (MSN)

• MSN: highly effective and selective heterogeneous catalysts for a wide variety of important reactions
• MSN selectivity is provided by "gatekeeper" groups (red arrows) that allow only desired reactants A to enter the pore, keeping undesirable species B from entering the pore
• Presence of solvent adds complexity: accurate electronic structure calculations are needed to deduce the reaction mechanisms and to design even more effective catalysts
• Narrow pores (3–5 nm) create a diffusion problem that can prevent product molecules from exiting the pore; hence the reaction dynamics must be studied on a sufficiently realistic cross-section of the pore
• Adequate representation of the MSN pore requires ~10–100K atoms with a reasonable basis set; reliably modeling an entire system involves >1M basis functions
• Understanding the reaction mechanism and dynamics of the system(s) is beyond the scope of current hardware and software – requiring capable exascale

PI: Mark Gordon (Ames)

Enabling GAMESS for Exascale Computing in Chemistry & Materials

Exascale Challenge Problem
• Design new materials for heterogeneous catalysis in ground & excited electronic states
• Employ ab initio electronic structure methods to analyze the reaction mechanisms and selectivity of mesoporous silica nanoparticles (MSN) with >10K-atom simulations
• Impact development of green energy sources due to MSN catalysts' ability to provide specific conversion of cellulosic-based chemicals into fuels or other industrially important products
• Reduce time & expense of supporting experimental efforts; supports R&D in photochemistry, photobiology, and ion solvation
• Develop strategies to reduce power consumption and a common driver for multiple-program interoperability

Applications & S/W Technologies
Applications
• GAMESS (General Atomic and Molecular Electronic Structure System), QMCPACK, NWChem, PSI4
Software Technologies Cited
• Fortran, C++, Python, MPI, OpenMP, OpenACC, CUDA
• Swift, DisPy, Luigi, BLAS
• MacMolPlt
• Gerrit, Git, Doxygen
• ASPEN, Oxbow

Risks and Challenges
• Meeting power-capping targets due to the size of the computations
• GAMESS–QMCPACK interoperability
• Optimizing use of on- and off-node hierarchical memory
• Interfacing QMC kernels to libcchem (C++ library for electronic structure codes)
• Hardware architecture uncertainties
• Refactoring of a very large and mature code base

Development Plan
• Initiate GAMESS code analysis
• Initiate GAMESS refactoring by enabling OpenMP for FMO/EFMO/EFP methods
• Release a new version of GAMESS
• Release a new version of libcchem with RI-MP2 energies and gradients
• Complete and assess an initial threaded GAMESS RI-MP2 energy + gradient code; conduct benchmarks
• Initiate the development of a GAMESS–QMCPACK interface in collaboration with the QMCPACK group

PI: Mark Gordon (Ames)

Software Technology Requirements: Enabling GAMESS for Exascale Computing in Chemistry & Materials

• Programming Models and Runtimes
1. Fortran, C++/C++17, Python, C, JavaScript, MPI, OpenMP, OpenACC, CUDA, OpenCL, GDDI, PaRSEC, Boost, DASK-Parallel, PYBIND11, OpenMP 4.x
2. TiledArrays, UPC/UPC++, Co-Array Fortran, Julia
3. Argobots, HPX, Kokkos, RAJA, Thrust, OpenSHMEM, SYCL, TASCEL
• Tools
1. HPCToolkit, PAPI, Oxbow, ASPEN, CMake, Git, TAU, GitLab, Docker, Gerrit, GitHub, Travis CI, HDF5, PSiNSTracer, EventTracer, PEBIL, VecMeter
2. Jira, Cython, PerfPal
3. LLVM/Clang, ROSE, Caliper, CDash, Flux, Shifter, ESGF, EPAX
• Mathematical Libraries, Scientific Libraries, Frameworks
1. BLAS/PBLAS, PETSc, LAPACK/ScaLAPACK, libint, libcchem, MKL, FFTW, MOAB, SciPy, nose, MAGMA
2. DPLASMA, SymPy
3. BoxLib, HYPRE, Chombo, SAMRAI, Metis/ParMetis, SuperLU, Repast HPC, APOSMM, HPGMG, Dakota

Software Technology Requirements: Enabling GAMESS for Exascale Computing in Chemistry & Materials

• Data Management and Workflows
1. Python-pandas, HDF, Swift, DisPy, Luigi
2. BlitzDB, Drake
3. Airflow
• Data Analytics and Visualization
1. MacMolPlt, Matplotlib
2. Seaborn, Mayavi, h5py, PyTables, Statsmodels
3. Pygal, Chaco
• System Software

Software Technology Plans: Enabling GAMESS for Exascale Computing in Chemistry & Materials

• Data Management and Workflows
– Common driver: capable of performing computations using multiple quantum chemistry codes (with QCDB a common shared component), possibly using RabbitMQ as an MPI wrapper
• Data Analytics and Visualization
– MacMolPlt / Matplotlib: used to view molecular systems (~20K atoms); need to tailor to differentiate those molecular fragments computed with different techniques and to scale up the capability
– Quantum Chemistry Database (QCDB): common framework for managing quantum chemical data, e.g., automatic generation of data tables, summaries of error statistics, etc., that are easily transferrable to other quantum chemistry applications
– In-situ data analysis (including filtering and reduction) and visualization (to various points in the visualization pipeline) can help reduce data movement and storage; workflow analysis tools such as Panorama can be used to optimize data movement

NWChemEx: Tackling Chemical, Materials and Biomolecular Challenges in the Exascale Era

• Deliver molecular and materials modeling capabilities for the development of new feedstocks for biomass production on marginal lands, and new catalysts for the conversion of these feedstocks to usable biofuels and other products
• NWChemEx will be a technology step change relative to NWChem, a powerful molecular modeling application (PNNL)
• Deliver a redesigned NWChem to enhance its scalability, performance, extensibility, and portability, and enable NWChemEx to take full advantage of capable exascale
• A modular, library-oriented framework with new algorithms to reduce complexity, enhance scalability, and exploit new science approaches to reduce memory requirements and separate software implementation from hardware details
• Implement leading-edge developments in computational chemistry, computer science, and applied mathematics

PI: Thom Dunning (PNNL)

NWChemEx: Tackling Chemical, Materials and Biomolecular Challenges in the Exascale Era

Exascale Challenge Problem
• Aid and accelerate advanced biofuel development by exploring new feedstocks for efficient production of biomass for fuels and new catalysts for efficient conversion of biomass-derived intermediates into biofuels and bioproducts
• Molecular understanding of how proton transfer controls protein-assisted transport of ions across biomass cellular membranes, often seen as a stress response in biomass, would lead to more stress-resistant crops through genetic modifications
• Molecular-level prediction of the chemical processes driving the specific, selective, low-temperature catalytic conversion (e.g., zeolites such as H-ZSM-5) of biomass-derived alcohols into fuels and chemicals in constrained environments

Applications & S/W Technologies
Applications
• NWChemEx (evolved from a redesigned NWChem)
Software Technologies Cited
• Fortran, C, C++
• Global Arrays, TiledArrays, PaRSEC, TASCEL
• VisIt, Swift
• TAO, Libint
• Git, SVN, JIRA, Travis CI
• Co-Design: CODAR, CE-PSI, ExaGraph

Risks and Challenges
• Unknown performance of parallel tools
• Insufficient performance, scalability, or capacity of local memory will require algorithmic reformulation
• Unavailable tools for hierarchical memory, I/O, and resource management at exascale
• Unknown exascale architectures
• Unknown types of correlation effects for systems with a large number of electrons
• Framework cannot support effective development

Development Plan
Y1: Framework with tensor DSL, RTS, execution state tracking; operator-level NK-based CCSD with flexible data distributions and symmetry/sparsity exploitation
Y2: Automated computation of CC energies and 1-/2-body CCSD density matrices; HF and DFT computation of >1K-atom systems via multithreading
Y3: Couple embedding with HF and DFT for multilevel memory hierarchies; QMD using HF and DFT for 10K atoms; scalable R12/F12 for 500 atoms with CCSD energies and gradients using task-based scheduling
Y4: Optimized data distribution and multithreaded implementations for the most time-intensive routines in HF, DFT, and CC

PI: Thom Dunning (PNNL)

Software Technology Requirements: NWChemEx

• Programming Models and Runtimes
1. Fortran, Python, C++, Global Arrays, MPI, OpenMP, CUDA
2. Intel TBB, OpenCL, PaRSEC, TASCEL
3. OpenCR
• Tools
1. CHiLL, ADIC/Sacado/OpenAD, PAPI, HPCToolkit
2. LLVM, TAU, Travis, GitHub
• Mathematical Libraries, Scientific Libraries, Frameworks
1. BLAS + (Sca)LAPACK (single-threaded, multi-threaded, and …), Elemental, FFTW, MADNESS, libint

Software Technology Requirements: NWChemEx

• Data Management and Workflows
1. MPI-IO, HDF, Swift
2. ADIOS
• Data Analytics and Visualization
1. VisIt
• System Software
1. Standard software development environment, Eclipse

Multiscale Coupled Urban Systems

• Impacts of greenhouse gases (GHG) on local climate
• Resulting impacts on city function
• Incorporation of renewables into city energy portfolio
• Resilience of physical infrastructure
• Economic protection, resilience, and enhancement
• …

[Figure: coupled urban sectors – Social-Economic & Transportation Activities, Urban Atmosphere, and Buildings – exchanging quantities such as vehicle mix and driving habits, trip plans, response and service times, building weather, emissions, heat demands, demand pricing, and wind/pressure/heat/moisture, informed by open, sensed, sensitive, and real-time state data from municipal data sources, sensor networks, census/social sources, mobility data, population dynamics, and infrastructure]

Charlie Catlett (ANL), Mary Ann Piette (LBNL), Tianzhen Hong (LBNL), Budhendra Bhaduri (ORNL), Thom Dunning (PNNL), John Fey (PNNL), Nancy Carlisle (NREL), Daniel Macumber (NREL), Ralph Muehleisen (ANL)…

Multiscale Coupled Urban Systems

Exascale Challenge Problem
• Urbanization is increasing demand for energy, water, transportation, healthcare, infrastructure, physical & cyber security and resilience, food, and education – and deepening the interdependencies between these systems. New technologies, knowledge, and tools are needed to retrofit / improve urban districts, with the capability to model and quantify interactions between urban systems
• Integrate modules for urban atmosphere and infrastructure heat exchange and air flow; building energy demand at district or city scale, generation, and use; urban dynamics & activity-based decision, behavioral, and socioeconomic models; population mobility and transportation; energy systems; water resources
• Chicago metro area as a testbed for coupling an agent-based social/economics model with transportation, regional climate, CFD microclimate, and energy of (up to 800K) buildings

Applications & S/W Technologies
Applications
• MetroSEED framework to integrate sector-specific models
• chiSIM, RegCM4, WRF, Nek5000, EnergyPlus, CityBES, URBANOpt, TUMS, Polaris, P-MEDM
Software Technologies Cited
• Fortran, C, C++, Ruby, Python, JavaScript, C#, R, Swift/T
• MPI, OpenMP, OpenACC
• MOAB, PETSc, Boost, CityGML, SIGMA, OpenStudio, LandScan USA, Repast HPC, OpenStreetMap, TRANSIMS
• CESIUM

Risks and Challenges
• HPC resources may not be fully available
• Some city data may not be available
• Synergistic software and co-design projects may not be supported by ECP
• Difficulties in coupling urban atmosphere and buildings models at individual building time steps
• Scalability of the building energy code (EnergyPlus)
• Scalability of agent-based models
• Limited stakeholder testing and assessment

Development Plan
Y1: Nek5000 verification; EnergyPlus, TUMS, Polaris, chiSIM, P-MEDM on HPC; EnergyPlus 500-building model and simulation; city sub-region meshes; model integration and data exchange architecture (CityGML-based)
Y2: Nek5000 + EnergyPlus coupling; CityGML + EnergyPlus 10K-building model and simulation; P-MEDM + TUMS + chiSIM coupling; WRF RCP4.5 high-resolution simulations; demo P-MEDM + TUMS coupling
Y3: Integration with transportation / ABM; 50-building simulation with coupled atmosphere + EnergyPlus model; CityGML + EnergyPlus 100K-building model and simulation; coupling of chiSIM with TUMS + P-MEDM
Y4: Demo coupled system on all platforms; CityGML + EnergyPlus models for 800K buildings; fully coupled simulations of 500 buildings on Aurora; 800K buildings with downscaled atmospheric input; tune performance of transportation & coupled 800K buildings on Aurora

PI: Charles Catlett (ANL)

Software Technology Requirements: Multiscale Coupled Urban Systems

• Definitely plan to use (Rank 1)
– Programming Models and Runtimes: C++, C, C#, R, JavaScript, Python, Fortran, MPI, OpenMP, OpenACC, OpenStudio, Repast HPC, TUMS, POLARIS, TRANSIMS, Nek5000, WRF, EnergyPlus
– Tools: CMake, Git, TAU, GitLab, Jira, Valgrind, HPCToolkit
– Mathematical Libraries, Scientific Libraries, Frameworks: PETSc, MOAB, CouPE, Boost, LAPACK, BLAS, and other standard libraries for scientific computing
– Data Management and Workflows: Swift, Swift/T, MPI-IO, HDF5, CityBES, CityGML, SensorML, RabbitMQ, URBANOpt, LandScan, CESIUM
– Data Analytics and Visualization: Matplotlib, Graphviz, ParaView, WebGL, VisIt
– System Software: Pmem, libnuma, memkind
• Will explore as an option (Rank 2)
– Programming Models and Runtimes: Globus Online, Scala, CUDA, PyCUDA
– Tools: Origami, OpenTuner, Orio, PAPI
– Mathematical Libraries, Scientific Libraries, Frameworks: CNTK, Scikit-learn, Pylearn2, PETSc, BLAS, cuBLAS, cuSPARSE, H2O, Neon
– Data Management and Workflows: Jupyter
– Data Analytics and Visualization: Cesium, WorldWind

Computing the Sky at Extreme Scales
Elucidating cosmological structure formation by uncovering how smooth and featureless initial conditions evolve under gravity in an expanding universe to eventually form a complex cosmic web

• Modern cosmological observations have led to a remarkably successful model for the dynamics of the Universe; three key ingredients – dark energy, dark matter, and inflation – are signposts to further breakthroughs, as all reach beyond the known boundaries of the particle physics Standard Model
• A new generation of sky surveys will provide key insights and new measurements, such as of neutrino masses
• New discoveries – e.g., primordial gravitational waves and modifications of general relativity – are eagerly awaited
• Capable exascale simulations of cosmic structure formation are essential to shed light on some of the deepest puzzles in all of physical science, with a comprehensive program to develop and apply a new extreme-scale cosmology simulation framework for verification of gravitational evolution, gas dynamics, and subgrid models at very high dynamic range

PI: Salman Habib (ANL)

Computing the Sky at Extreme Scales

Exascale Challenge Problem
• Evolve and meld aspects of the capabilities of Lagrangian particle-based techniques (gravity + gas) with Eulerian adaptive mesh refinement (AMR) methods to achieve a unified cosmological simulation approach at the exascale
• Determine the dark energy equation of state
• Search for deviations from general relativity
• Determine the neutrino mass sum (to less than 0.1 eV) from galaxy clustering measurements
• Characterize the properties of dark matter
• Test the theory of inflation

Applications & S/W Technologies
Applications
• HACC, Nyx
Software Technologies Cited
• UPC++, C++17
• MPI, OpenMP, OpenCL, CUDA
• BoxLib, High-Performance Geometric Multigrid (HPGMG), FFTW, PDACS
• Thrust

Risks and Challenges
• Accuracy of subgrid modeling
• Filesystem stability and availability, and fast access to storage for post-processing
• Loss of personnel; the team is relatively small and needs to avoid single points of failure
• Resilience – machine MTBF

Development Plan
Y1: First major HACC hydro simulation on Theta on the full machine; first HACC tests on the IBM Power8/NVIDIA Pascal 36-node early-access system; release HACC & Nyx
Y2: Access to 25% of the Summit system as part of the CAAR project with HACC, scaling runs and optimization; Nyx scale-up test on Cori/Theta: clusters of galaxies (deep AMR); Summit CAAR project simulations: HACC hydrodynamic simulations on the full machine; release HACC & Nyx
Y3: HACC and Nyx scaling runs; scaling of CosmoTools to full scale on Aurora; meeting FOMs; scaling of CosmoTools to full scale on Summit; release HACC & Nyx
Y4: Final major HACC and Nyx code releases

PI: Salman Habib (ANL)

Software Technology Requirements: Cosmology

Definitely plan to use (Rank 1)
• Programming Models and Runtimes: C++/C++17, C, Fortran, GASNet, Python, MPI, OpenMP, CUDA, Thrust, UPC++
• Tools: Make, SVN, Git, GitLab, Valgrind, LLVM/Clang, VTune, SKOPE
• Mathematical Libraries, Scientific Libraries, Frameworks: FFTW, MKL, HPGMG, VODE, BoxLib, TensorFlow
• Data Management and Workflows: MPI-IO, HDF5, Jupyter, Globus, Smaash, Docker, Shifter, PDACS/Galaxy, PnetCDF
• Data Analytics and Visualization: ParaView, VTK-m, vl3, CosmoTools, Gimlet, Reeber

Will explore as an option (Rank 2)
• Programming Models and Runtimes: OpenACC, OpenCL, Julia, Chapel, Kokkos, RAJA
• Tools: FTI, SKOPE, SCR, CMake, Vampir
• Data Management and Workflows: Swift, Decaf

• Data Analytics and Visualization: BayesDB, yt, R

Data Analytics and Visualization Plans: Computing the Sky at Extreme Scales (Cosmology)

• Use of NVRAM will help in implementing in-situ analysis
• Dynamically partition MPI processes into MPI groups that simultaneously perform different tasks, with the analysis groups synchronizing with the main "simulation" group when required (see the communicator-splitting sketch after this list)
• Three levels of data hierarchy
– Level 1: raw/compressed simulation data
– Level 2: analyzed/reduced simulation data
– Level 3: further reduced to a database or catalog level
• Data reduction and in-situ analysis act on level 1/2 data sets to produce level 2/3 data
– Level 2 data is further analyzed in situ or offline; level 3 is primarily for offline analyses with databases
– In the future, offline analysis of level 1 data will be severely disfavored, and a new offline/in-situ boundary will enter at level 2
• Tools and technologies being used and/or explored
– ParaView, VTK-m, vl3, CosmoTools, Gimlet, Reeber, BayesDB, yt, R
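The bullet above on dynamically partitioning MPI processes into simulation and analysis groups maps directly onto MPI communicator splitting. The sketch below is a hedged illustration of that pattern (not HACC or Nyx source); the 3:1 rank split and the placement of work inside each branch are made-up placeholders.

```cpp
// Hedged sketch of splitting MPI_COMM_WORLD into a "simulation" group and an
// "analysis" group that run different tasks and synchronize via the parent
// communicator. Illustration only; the split policy is hypothetical.
#include <mpi.h>

int main(int argc, char** argv) {
  MPI_Init(&argc, &argv);
  int world_rank = 0;
  MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);

  // Assign every fourth rank to analysis, the rest to simulation.
  const int color = (world_rank % 4 == 3) ? 1 : 0;  // 0 = simulation, 1 = analysis
  MPI_Comm task_comm;
  MPI_Comm_split(MPI_COMM_WORLD, color, world_rank, &task_comm);

  if (color == 0) {
    // ... advance the simulation using task_comm ...
  } else {
    // ... run in-situ analysis/reduction using task_comm ...
  }

  // Periodic synchronization point between the two groups.
  MPI_Barrier(MPI_COMM_WORLD);

  MPI_Comm_free(&task_comm);
  MPI_Finalize();
  return 0;
}
```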

Exascale Deep Learning Enabled Precision Medicine for Cancer
CANDLE accelerates solutions toward three top cancer challenges

• Focus on building a scalable deep neural network code called the CANcer Distributed Learning Environment (CANDLE)
• CANDLE addresses three top challenges of the National Cancer Institute:
1. Understanding the molecular basis of key protein interactions
2. Developing predictive models for drug response, and automating the analysis
3. Extraction of information from millions of cancer patient records to determine optimal cancer treatment strategies

PI: Rick Stevens (ANL)

Exascale Deep Learning Enabled Precision Medicine for Cancer

Exascale Challenge Problem
• Three cancer problem drivers: the RAS pathway problem, the drug response problem, and the treatment strategy problem
• Focus is on the machine learning aspect of the three problems
• Building a single scalable deep neural network code, CANDLE: the CANcer Distributed Learning Environment
• Common threads:
– Cancer types at all three scales: molecular, cellular, and population
– Significant data management and data analysis problems
– Need to integrate simulation, data analysis, and machine learning

Applications & S/W Technologies
Applications
• Deep learning platforms as candidates for the foundation of CANDLE: Theano, Torch/fbcunn, TensorFlow, DSSTNE, Neon, Caffe, Poseidon, LBANN, CNTK
Software Technologies Cited
• cuDNN, BLAS and DAAL, OpenMP, OpenACC, CUDA, MPI, PGAS

Risks and Challenges
• CANDLE scalability may require more RAM than is available on nodes
• Distributed implementation of CANDLE may not achieve high utilization of the high-performance interconnections between nodes (a hedged data-parallel sketch follows the development plan)
• None of the existing deep neural network code bases will have all the features required for CANDLE, in particular the need to interface to scalable simulations
• NCI-provided data insufficient for proper training / testing of deep learning networks

Development Plan
Y1: Release CANDLE 1.0, scalable to at least 1,000 nodes, 10 billion weights, 10 million neurons
Y2: Release CANDLE 2.0, scalable to at least 5,000 nodes, 30 billion weights, 30 million neurons
Y3: Release CANDLE 3.0, scalable to at least 10,000 nodes, 100 billion weights, 100 million neurons
Y4: Release CANDLE 4.0, scalable to at least 50,000 nodes, 300 billion weights, 300 million neurons
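The scalability targets and the interconnect-utilization risk above concern distributed training. One common pattern in distributed deep learning – offered here only as a hedged illustration, not as CANDLE's actual design – is data-parallel training, where each rank computes gradients on its own data shard and the gradients are averaged with an allreduce every step. The model size, learning rate, and update rule below are placeholders.

```cpp
// Hedged illustration (not CANDLE code) of the gradient-allreduce pattern used
// in data-parallel neural-network training: each rank computes gradients on its
// shard of the data, then gradients are averaged across all ranks each step.
#include <mpi.h>
#include <vector>

int main(int argc, char** argv) {
  MPI_Init(&argc, &argv);
  int size = 0;
  MPI_Comm_size(MPI_COMM_WORLD, &size);

  const std::size_t n_weights = 1 << 20;   // placeholder model size
  std::vector<double> weights(n_weights, 0.0);
  std::vector<double> grad(n_weights, 0.0);
  const double lr = 0.01;                  // placeholder learning rate

  for (int step = 0; step < 100; ++step) {
    // ... compute local gradients into `grad` from this rank's data shard ...

    // Sum gradients across ranks; this collective is what stresses the interconnect.
    MPI_Allreduce(MPI_IN_PLACE, grad.data(), static_cast<int>(n_weights),
                  MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);

    // Average and apply a plain SGD update (placeholder optimizer).
    for (std::size_t i = 0; i < n_weights; ++i)
      weights[i] -= lr * (grad[i] / size);
  }

  MPI_Finalize();
  return 0;
}
```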

PI: Rick Stevens (ANL)

Software Technology Requirements: Precision Medicine for Oncology

• Definitely plan to use (Rank 1)
– Programming Models and Runtimes: C++, C, Python, Scala, Fortran, MPI, OpenMP, Spark, OpenACC, CUDA, PGAS, Globus Online, Boost, OpenSHMEM, Lua
– Tools: CMake, Git, TAU, GitLab, Jira, Valgrind, PAPI, LLVM/Clang, HPCToolkit, Jenkins
– Mathematical Libraries, Scientific Libraries, Frameworks: cuDNN, ESSL, MKL, DAAL, TensorFlow, Caffe, Torch, Theano, Mocha, LAPACK, BLAS, and other standard libraries for scientific computing
– Data Management and Workflows: Swift, MPI-IO, HDF5, Jupyter, DIGITS, DataSpaces, Bellerophon Environment for Analysis of Materials (BEAM @ ORNL)
– Data Analytics and Visualization: deep visualization toolbox for deep learning, Grafana, Matplotlib, Graphviz, ParaView
– System Software: Pmem, libnuma, memkind
• Will explore as an option (Rank 2)
– Programming Models and Runtimes: Java, Thrust, Minerva, Latte
– Tools: Origami, OpenTuner, Orio
– Mathematical Libraries, Scientific Libraries, Frameworks: CNTK, Scikit-learn, Pylearn2, PETSc, BLAS, cuBLAS, cuSPARSE, H2O, Neon
– Data Management and Workflows: Mesos, Heron, Beam, Zeppelin, Docker
– Data Analytics and Visualization: R, EDEN, Origami graph visualization on Everest

Software Technology Requirements: Precision Medicine for Oncology

• Might be useful but no concrete plans (Rank 3)
– Programming Models and Runtimes: Julia, EventWave for event-based computing/simulations
– Tools: GraphLab, TAO
– Mathematical Libraries, Scientific Libraries, Frameworks: GraphLab, ADIC, Keras, TensorFlow Serving
– Data Analytics and Visualization: ActiveFlash


Exascale Lattice Gauge Theory: Opportunities and Requirements for Nuclear and High Energy Physics
Lattice quantum chromodynamics (QCD) calculations are the scientific instrument connecting observed properties of hadrons to the fundamental laws of quarks and gluons, and are critically important to particle and nuclear physics experiments in the decade ahead

• Lattice QCD has made formidable progress in formulating the properties of hadrons (particles containing quarks), but experimental particle and nuclear physics programs require lattice calculations orders of magnitude more demanding still
• Searching for the tiny effects of yet-to-be-discovered physics beyond the standard model, particle physics must have simulations accurate to ~0.10%, an order of magnitude more precise than typically realized today
• To accurately compute properties and interactions of hadrons and light nuclei, nuclear physics needs lattice calculations on much larger volumes to investigate multi-hadron states in a reliable, controlled way
• Exascale lattice gauge theory will make breakthrough advances possible in particle and nuclear physics

PI: Paul Mackenzie (FNAL)

Exascale Lattice Gauge Theory: Opportunities and Requirements for Nuclear and High Energy Physics

Exascale Challenge Problem
• Develop a software infrastructure that exploits recent compiler advances and improved language support to enable the creation of portable, high-performance QCD code with a shorter software tool-chain
• Focus on two nuclear/HEP applications:
– Compute from first principles the properties and interactions of nucleons and light nuclei with physical quark masses, and achieve the multi-physics goal of incorporating both QCD and electromagnetism
– Search for beyond-the-standard-model physics by increasing the precision of calculations of the properties of quark–antiquark and three-quark states

Applications & S/W Technologies
Applications
• MILC, Columbia Physics System, Chroma, QDP++ (all built upon the USQCD software infrastructure)
Software Technologies Cited
• MPI, OpenMP, CUDA, Kokkos, OpenACC, C++17, Thrust
• SYCL, QUDA, QPhiX, LAPACK, ARPACK

Risks and Challenges
• Critical slowing down in gauge evolution
• Correlation functions for large nuclei
• Sub-optimal solver performance
• Multi-level time integration for correlation functions
• Performance in data-parallel GPU offload
• Extending the multigrid solver base
• Architectural pathfinding (appropriate parallelization approach)

Development Plan
Y1: Develop adaptive multigrid for domain wall and staggered fermions
Y2: Release new versions of old apps augmented with new implementations of algorithms; initial release of workflow framework
Y3: Release of data-parallel API with GPU support; release benchmark suite for non-volatile memory storage; scalable MG-based deflation methods with variance reduction applicable to other HPC domains
Y4: Validated and documented high-performance code implementing the 1–2 most successful algorithms for reducing critical slowing down

PI: Paul Mackenzie (FNAL)

Molecular Dynamics at the Exascale: Spanning the Accuracy, Length and Time Scales for Critical Problems in Materials Science (EXAALT)
Combining time-acceleration techniques, spatial decomposition strategies, and high-accuracy quantum mechanical and empirical potentials

• Tackle materials challenges for energy, especially fission and fusion, by allowing the scientist to target, at the atomistic level, the desired region in accuracy, length, and time space
• Shown here is a simulation aimed at understanding tungsten as a fusion first-wall material, where plasma-implanted helium leads to He bubbles that grow and burst at the surface, ultimately leading to surface "fuzz" by a mechanism not yet understood
• At slower, more realistic growth rates (100 He/µsec), the bubble shows a different behavior, with less surface damage, than the fast-grown bubble simulated with direct molecular dynamics (MD)
• Atomistic simulation allows for complete microscopic understanding of the mechanisms underlying the behavior
• At the slower growth rate, crowdion interstitials emitted from the bubble have time to diffuse over the surface of the bubble, so that they are more likely to release from the surface-facing side of the bubble, giving surface-directed growth

[Figure: slowly growing He bubble in W at bursting]

PI: Arthur Voter (LANL)

Molecular Dynamics at the Exascale: Spanning the Accuracy, Length and Time Scales for Critical Problems in Materials Science

Exascale Challenge Problem
• Success of MD is at risk from the inability to reach necessary length & time scales while maintaining accuracy; simple scale-up of current practices only allows larger systems and doesn't improve current timescales (nsec) & accuracy (empirical potentials)
• Predictive microstructure evolution requires access to msec timescales and accurately describing complex defects with explicit consideration of the electronic structure, which cannot be done using conventional empirical potentials
• Develop novel MD methodologies, driven by two challenges: (1) extending the burnup of nuclear fuel in fission reactors (dynamics of defects & fission gas clusters in UO2) and (2) developing plasma-facing components (tungsten first wall) to resist the harsh conditions of fusion reactors (a generic MD time-stepping sketch appears after this project's development plan)
• Bring three state-of-the-art MD codes into a unified tool to leverage exascale platforms across dimensions of accuracy, length, and time

Applications & S/W Technologies
Applications
• LAMMPS, LATTE, AMDF
Software Technologies Cited
• MPI, OpenMP, CUDA
• Kokkos
• VTK, ParaView
• Legion

Risks and Challenges
• Lowest levels of DFTB theory might prove insufficient for actinide-bearing materials; might require higher-order DFTB expansion
• Performance of spatially parallel replica-based AMD methods (SLParRep) might be affected by certain kinds of very low barrier events
• SNAP potential descriptors may not represent the potential energy surface of target multispecies materials with sufficient accuracy; would have to augment the descriptor set

Development Plan
Y1: Code integration demonstration on homogeneous nodes (problems 1, 2a); EXAALT package release
Y2: Science-at-scale demonstration on homogeneous nodes (Trinity, Mira) for problems 1 and 2a,b at a target simulation rate of 2 µsec/day; EXAALT package release
Y3: Science-at-scale demonstration on heterogeneous nodes (e.g., Summit) for problems 1 and 2a,b at a target simulation rate of 10 µsec/day; EXAALT package release
Y4: Science-at-scale demonstration (e.g., Aurora, Summit) for problems 1 and 2a,b at a target simulation rate of 50 µsec/day; final EXAALT package release
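The MD codes cited above (LAMMPS, LATTE) advance atoms with a time-stepping integrator, and the replica-based accelerated-MD methods mentioned in the risks run many such trajectories. As a generic, hedged illustration only (not LAMMPS/LATTE/EXAALT source), the sketch below shows a velocity-Verlet step with a placeholder harmonic force law; real runs use empirical, SNAP, or DFTB potentials as discussed above.

```cpp
// Generic velocity-Verlet MD step (illustration only). The force law here is a
// placeholder harmonic tether; production codes use the potentials named above.
#include <vector>

struct Atom { double x[3], v[3], f[3]; };

// Placeholder force: harmonic tether to the origin with spring constant k.
void compute_forces(std::vector<Atom>& atoms, double k) {
  for (auto& a : atoms)
    for (int d = 0; d < 3; ++d) a.f[d] = -k * a.x[d];
}

// One velocity-Verlet step of size dt for atoms of mass m.
void verlet_step(std::vector<Atom>& atoms, double dt, double m, double k) {
  for (auto& a : atoms)
    for (int d = 0; d < 3; ++d) {
      a.v[d] += 0.5 * dt * a.f[d] / m;   // half kick
      a.x[d] += dt * a.v[d];             // drift
    }
  compute_forces(atoms, k);              // forces at the new positions
  for (auto& a : atoms)
    for (int d = 0; d < 3; ++d)
      a.v[d] += 0.5 * dt * a.f[d] / m;   // second half kick
}
```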

PI: Arthur Voter (LANL)

Software Technology Requirements: Molecular Dynamics at the Exascale: Spanning the Accuracy, Length and Time Scales for Critical Problems in Materials Science

• Programming Models and Runtimes
1. MPI, Kokkos
2. Legion and various other task-management runtimes, OpenMP, CUDA, various fault-tolerant communication libraries (e.g., 0MQ)
3. HPX, Charm++
• Tools
1. Git/GitLab
• Mathematical Libraries, Scientific Libraries, Frameworks
1. Dakota, FFTW

Software Technology Requirements: Molecular Dynamics at the Exascale: Spanning the Accuracy, Length and Time Scales for Critical Problems in Materials Science

• Data Management and Workflows
1. We currently use bdb on node, but we plan to assess various other embedded and distributed databases (such as MDHIM)
3. Swift, DHARMA

• Data Analytics and Visualization
1. VTK, ParaView

• System Software

Exascale Models of Stellar Explosions: Quintessential Multi-Physics Simulation
Creation of the Heavy Elements in the Supernova Explosion of a Massive Star

• Hot material roils around a newly born neutron star at the center of a core-collapse supernova
• Shown is the matter entropy from a 3D Chimera simulation of the first half second of the explosion
• Driven by intense heating from neutrinos coming from the cooling neutron star at the center, the shock wave will be driven outward, eventually ripping the star apart and flinging the elements that make up our solar system and ourselves into interstellar space

• White dwarf: small dense star formed when a low-mass star has exhausted its central nuclear fuel and lost its outer layers as a planetary nebula; will eventually happen to the Sun
• Stellar collision: coming together of two stars that merge into one larger unit through the force of gravity; recently has been observed
• A series of stellar collisions in a dense cluster over time can lead to an intermediate-mass black hole via "runaway stellar collisions"
• Also a source of characteristic "inspiral" gravitational waves!
• 3D Castro simulation of merging white dwarfs

PI: Daniel Kasen (LBNL)

Exascale Models of Stellar Explosions: Quintessential Multi-Physics Simulation

Exascale Challenge Problem
• Stellar explosion simulations to explain the origin of the elements (especially those heavier than iron), a longstanding problem in physics
• Define the conditions for astrophysical nucleosynthesis that motivate, guide, and exploit experimental nuclear data (near drip lines) from the Facility for Rare Isotope Beams (FRIB)
• Address the physics of matter under extreme conditions, including neutrino and gravitational wave emission and the behavior of dense nuclear matter
• Fully self-consistent calculations will model all proposed explosive nucleosynthesis sites (core-collapse supernovae, neutron star mergers, accreting black holes) and related stellar eruptions: novae, x-ray bursts, and thermonuclear supernovae

Applications & S/W Technologies
Applications
• CLASH framework: components from FLASH, CASTRO, Chimera, Sedona
Software Technologies Cited
• C, C++, Fortran, UPC++, Python
• MPI, OpenMP, GASNet
• Charm++, Perilla
• AMR (BoxLib), MC
• VisIt, yt, Bellerophon

Risks and Challenges
• Immaturity of programming models will inhibit progress
• Unavailability of key personnel
• Performance predictions for one or more code modules prove too ambitious
• Modules produced will be incompatible between major code lines

Development Plan
Y1: Establish / publish API, components, and composability for high-level framework design
Y2: Core-collapse supernova simulation with two-moment transport; CLASH proto-code release with API examples
Y3: Release of CLASH 1.0; X-ray burst simulation with large network
Y4: Release of entire CLASH ecosystem; binary neutron star simulation: neutrino radiation transport calculated using a moment expansion with a closure relation calibrated by IMC Boltzmann transport; general relativistic hydro with the metric computed assuming conformal flatness; detailed neutrino/matter coupling (non-isoenergetic scattering) and neutrino velocity-dependent and relativistic effects; nuclear reaction networks including of order 1000 isotopes

PI: Daniel Kasen (LBNL)

Software Technology Requirements Provided by Exascale Models of Stellar Explosions: Quintessential Multi-Physics Simulation

• Programming Models and Runtimes
1. Fortran, C++, C, MPI, OpenMP, OpenACC, CUDA
2. UPC++, GASNet, Charm++, Co-Array Fortran
• Tools
1. Git, PAPI
3. TAU
• Mathematical Libraries, Scientific Libraries, Frameworks
1. BoxLib, hypre, BLAS, LAPACK
2. MAGMA

Software Technology Requirements Provided by Exascale Models of Stellar Explosions: Quintessential Multi-Physics Simulation

• Data Management and Workflows
1. HDF, Bellerophon, MPI-IO
3. ADIOS
• Data Analytics and Visualization
1. VisIt, yt
• System Software

High Performance, Multidisciplinary Simulations for Regional Scale Earthquake Hazard and Risk Assessments

• The ability to accurately simulate the complex processes associated with major earthquakes will become a reality with capable exascale
• Simulations offer a transformational approach to earthquake hazard and risk assessments
• Dramatically increase our understanding of earthquake processes
• Provide improved estimates of the ground motions that can be expected in future earthquakes
• Time snapshots (map view looking at the surface of the earth) of a simulation of a rupturing earthquake fault and propagating seismic waves

PI: David McCallen (LBNL)

High Performance, Multidisciplinary Simulations for Regional Scale Earthquake Hazard and Risk Assessments

Exascale Challenge Problem
• Build upon and advance simulation and data exploitation capabilities to transform computational earthquake hazard and risk assessments to a frequency range relevant to estimating the risk to key engineered systems (e.g., up to 10 Hz)
• Require simulations of unprecedented size and fidelity utilizing measured ground motion data to constrain and construct geologic models that can support high-frequency simulations; highly leverages investment of a commercial partner (Pacific Gas and Electric) in obtaining unprecedented, dense ground motion data at regional scale from a SmartMeter system
• Provide the first strong coupling and linkage between HPC simulations of earthquake hazards (ground motions) and risk (structural system demands) – a true end-to-end simulation of complex, coupled phenomena

Applications & S/W Technologies
Applications
• SW4, ESSI
Software Technologies Cited
• OpenMP, OpenACC, CUDA, MPI
• Fortran (inner loop of SW4)
• HDF5 (coupling SW4 to ESSI)
• VisIt for visualization

Risks and Challenges
• Transitioning software to emerging architectures and optimizing performance for both forward and inverse calculations
• Utilization of data to refine geologic models at regional scales – determine just how far data can push simulation realism
• Developing an inversion simulation capability to extend the frequency of reliability/realism of ground motion simulations to constrain currently ill-defined geologic structure at fine scale
• The realized effectiveness of inversion algorithms at unprecedented scale

Development Plan
Y1: Define core computational kernels of SW4; OpenMP within MPI partitions for many-core machines; CUDA/OpenACC or OpenMP for GPU/CPU machines; checkpointing in SW4; distributed processing of rupture model
Y2: Load balance within MPI partitions; optimize 2D solver at mesh refinement interfaces; distribute processing for adjoint calculations
Y3: Asynchronous and overlapping MPI communication; mesh refinement within curvilinear mesh; multi-scale material model for FWI; construct local model from FWI and 5 Hz synthetic data
Y4: Combine local and regional SFBA models; construct local model from FWI and 10 Hz synthetic data

PI: David McCallen (LBNL)

Software Technology Requirements Provided by High Performance, Multidisciplinary Simulations for Regional Scale Earthquake Hazard and Risk Assessments

• Programming Models and Runtimes
1. Fortran 2003, C++, C, MPI
2. OpenMP, CUDA
3. OpenACC
• Tools
1. Git, CMake, LLVM/Clang, TotalView, Valgrind
2. ROSE, NVVP (NVIDIA Visual Profiler)
3. HPCToolkit
• Mathematical Libraries, Scientific Libraries, Frameworks
1. BLAS/PBLAS, LAPACK/ScaLAPACK, Proj4 (Cartographic Projections Library)
2. PETSc, BoxLib, Chombo
3. hypre

Software Technology Requirements Provided by High Performance, Multidisciplinary Simulations for Regional Scale Earthquake Hazard and Risk Assessments

• Data Management and Workflows
1. HDF5
2. ASDF (Adaptable Seismic Data Format), Python
• Data Analytics and Visualization
1. VisIt, ObsPy (Python framework for processing seismological data)
2. GMT (Generic Mapping Tools)
• System Software
1. Linux/TOSS, Lustre, GPFS

An Exascale Subsurface Simulator of Coupled Flow, Transport, Reactions and Mechanics

Exascale Challenge Problem
• Safe and efficient use of the subsurface for geologic CO2 sequestration, petroleum extraction, geothermal energy, and nuclear waste isolation
• Predict reservoir-scale behavior as affected by the long-term integrity of the hundreds of thousands of deep wells that penetrate the subsurface for resource utilization
• Resolve pore-scale (0.1–10 µm) physical and geochemical heterogeneities in wellbores and fractures to predict evolution of these features when subjected to geomechanical and geochemical stressors
• Integrate multi-scale (µm to km), multi-physics in a reservoir simulator: non-isothermal multiphase fluid flow and reactive transport, chemical and mechanical effects on formation properties, induced seismicity, and reservoir performance
• Century-long simulation of a field of wellbores and their interaction in the reservoir

Applications & S/W Technologies
Applications
• Chombo-Crunch, GEOS
Software Technologies Cited
• C++, Fortran, LLVM/Clang
• MPI, OpenMP, CUDA
• RAJA, CHAI
• Chombo AMR, PETSc
• ADIOS, HDF5, Silo, ASCTK
• VisIt

Risks and Challenges
• Porting to exascale results in suboptimal usage across platforms
• No file abstraction API that can meet coupling requirements
• Batch scripting interface incapable of expressing simulation workflow semantics
• Scalable AMG solver in PETSc
• Physics coupling stability issues
• Fully overlapping coupling approach proves inefficient

Development Plan
Y1: Evolve GEOS and Chombo-Crunch; coupling framework v1.0; large-scale (100 m) mechanics test (GEOS); fine-scale (1 cm) reactive transport test (Chombo-Crunch)
Y2: GEOS + Chombo-Crunch coupling for single phase; coupling framework with physics; multiphase flow for Darcy & pore scale; GEOS large-strain deformation conveyed to Chombo-Crunch surfaces; Chombo-Crunch precipitation/dissolution conveyed to GEOS surfaces
Y3: Full demo of fracture asperity evolution-coupled flow, chemistry, and mechanics
Y4: Full demo of km-scale wellbore problem with reactive flow and geomechanical deformation, from the pore scale, to resolve the geomechanical and geochemical modifications to the thin interface between cement and subsurface materials in the wellbore and to asperities in fractures and fracture networks

PI: Carl Steefel (LBNL)

Software Technology Requirements Provided by An Exascale Subsurface Simulator of Coupled Flow, Transport, Reactions and Mechanics

• Programming Models and Runtimes
1. Fortran, C++/C++11, MPI, OpenMP
2. UPC++, PGAS, TiledArrays
3. OpenSHMEM
• Tools
1. HPCToolkit, PAPI, ROSE, Subversion, parallel debugger
• Mathematical Libraries, Scientific Libraries, Frameworks
1. Chombo, PETSc, FFTW, BLAS, LAPACK
2. HPGMG, SLEPc

Software Technology Requirements Provided by An Exascale Subsurface Simulator of Coupled Flow, Transport, Reactions and Mechanics

• Data Management and Workflows
1. HDF5, BurstBuffer, HPSS
• Data Analytics and Visualization
1. VisIt, ChomboVis
2. VTK-m
• System Software
1. DataWarp

Exascale Modeling of Advanced Particle Accelerators
Toward compact and affordable particle accelerators. A laser beam or a charged particle beam propagating through ionized gas displaces electrons and creates a wakefield that supports electric fields orders of magnitude larger than with usual methods, accelerating a charged particle beam to high energy over a very short distance.

• Particle accelerators: a vital part of DOE infrastructure for discovery science and university- and private-sector applications – broad range of benefits to industry, security, energy, the environment, and medicine
• Improved accelerator designs are needed to drive down size and cost; plasma-based particle accelerators stand apart in their potential for these improvements
• Translating this promising technology into a mainstream scientific tool depends critically on exascale-class high-fidelity modeling of the complex processes that develop over a wide range of space and time scales
• Exascale-enabled acceleration design will realize the goal of compact and affordable high-energy physics colliders, with many spinoff plasma accelerator applications likely

PI: Jean-Luc Vay (LBNL)

Exascale Modeling of Advanced Particle Accelerators

Exascale Challenge Problem
• Design an affordable, compact 1 TeV electron-positron collider based on plasma acceleration for high-energy physics
• Develop ultra-compact plasma accelerators with transformative implications in discovery science, medicine, industry, and security
• Enable virtual start-to-end optimization of the design and virtual prototyping of every component before they are built, leading to huge savings in design and construction
• Develop a powerful new accelerator modeling tool (WarpX) designed to run efficiently at scale on exascale systems

Applications & S/W Technologies
Applications
• Warp, PICSAR
Software Technologies Cited
• Fortran, C, C++, Python
• MPI, OpenMP, GASNet, UPC++
• BoxLib
• HDF5, VisIt, ParaView, yt

Risks and Challenges
• Dynamic load balancing
• Parallel I/O, data analysis & visualization
• Scaling electrostatic solver
• Scaling high-order electromagnetic solver
• Spurious reflections or charges owing to AMR
• Lost signal with AMR
• Numerical Cherenkov instability
• Low temperature plasmas/beams
Development Plan
Y1: Modeling of single plasma-based accelerator stage with WarpX on a single grid; verification against previous results.
Y2: Modeling of single plasma-based accelerator stage with WarpX with static mesh refinement, on a uniform plasma case.
Y3: Optimized FDTD and spectral PIC on 5-to-10 million cores, near-linear weak scaling with AMR, on a uniform plasma case.
Y4: Convergence study in 3-D of ten consecutive multi-GeV stages in linear and bubble regime. Release of software to community.
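The plan above revolves around particle-in-cell (PIC) solvers with mesh refinement. The heart of any PIC code is the particle-grid exchange: deposit particle charge onto the mesh, solve for the fields, gather them back, and push the particles. The sketch below shows only the 1-D linear-weighting ("cloud-in-cell") deposition step in plain C++; it is a generic illustration, not WarpX or PICSAR code.

// 1-D cloud-in-cell (linear weighting) charge deposition: each particle's
// charge is shared between its two nearest grid nodes in proportion to its
// distance from them. Generic PIC illustration, not WarpX code.
#include <cstdio>
#include <vector>

int main() {
    const int    nx = 64;          // number of grid cells
    const double dx = 1.0 / nx;    // cell size on a unit domain

    std::vector<double> rho(nx + 1, 0.0);                  // charge density at nodes
    std::vector<double> xp = {0.113, 0.5, 0.871, 0.8715};  // particle positions
    const double qp = 1.0;                                 // charge per particle

    for (double x : xp) {
        int    i = static_cast<int>(x / dx);   // index of the node to the left
        double w = x / dx - i;                 // fractional distance past that node
        rho[i]     += qp * (1.0 - w) / dx;     // share deposited on the left node
        rho[i + 1] += qp * w / dx;             // share deposited on the right node
    }

    for (int i = 0; i <= nx; ++i)
        if (rho[i] != 0.0) std::printf("node %2d: rho = %g\n", i, rho[i]);
    return 0;
}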

Exascale Computing Project PI: Jean-Luc Vay (LBNL) 48 Software Technology Requirements Provided by Exascale modeling of advanced particle accelerators

• Programming Models and Runtimes 1. Fortran, C++, Python, C, MPI, OpenMP, 2. UPC/UPC++, OpenACC, CUDA, GASNet 3. HPX, Global Arrays, TiledArrays, Co-Array FORTRAN • Tools 1. LLVM/Clang, CMake, git, 2. TAU, HPCToolkit, • Mathematical Libraries, Scientific Libraries, Frameworks 1. BoxLib, FFTW, NumPy 2. N/A 3. BLAS/PBLAS, LAPACK/ScaLAPACK, P3DFFT

Requirements Ranking 1. Definitely plan to use 2. Will explore as an option Exascale Computing Project 3. Might be useful but no concrete plans 49 Software Technology Requirements Provided by Exascale modeling of advanced particle accelerators

• Data Management and Workflows 1. MPI-IO, HDF, h5py 2. ADIOS • Data Analytics and Visualization 1. VisIt, Jupyter notebook 2. N/A 3. VTK, Paraview • System Software 1. N/A 2. N/A 3. N/A

Requirements Ranking 1. Definitely plan to use 2. Will explore as an option Exascale Computing Project 3. Might be useful but no concrete plans 50 Exascale Solutions for Microbiome Analysis Microbiomes: integral to the environment, agriculture, health and biomanufacturing. Analyzing the DNA of these microorganism communities is a computationally demanding bioinformatic task, requiring exascale computing and advanced algorithms.

• Microorganisms are central players in climate change, environmental remediation, food production, human health • Occur naturally as “microbiomes” - communities of thousands of microbial species of varying abundance and diversity, each contributing to the function of the whole • <1% of millions of worldwide microbe species have been isolated and cultivated in the lab, and a small fraction have been sequenced • Collections of microbial data are growing exponentially, representing untapped information useful for environmental remediation and the manufacture of novel chemicals and medicines. • “Metagenomics” — the application of high-throughput genome sequencing technologies to DNA extracted from microbiomes — is a powerful method for studying microbiomes • The first assembly step has high computational complexity, like putting together thousands of puzzles from a jumble of their pieces • After assembly, further data analysis must find families of genes that work together and compare them across metagenomes • The ExaBiome application is developing exascale algorithms and software to address these challenges
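The "puzzle" analogy above refers to de novo assembly, whose first pass typically counts fixed-length substrings of the reads (k-mers) and links overlapping k-mers into a graph. A toy k-mer counter in plain C++ is sketched below; it is illustrative only — HipMer performs this step with distributed hash tables across thousands of nodes.

// Toy k-mer counter: slide a window of length k over each read and tally
// occurrences in a hash map. Illustrative only -- not HipMer code.
#include <cstdio>
#include <string>
#include <unordered_map>
#include <vector>

int main() {
    const size_t k = 5;
    std::vector<std::string> reads = {"ACGTACGTGA", "CGTACGTGAT", "TTACGTACGT"};

    std::unordered_map<std::string, long> counts;
    for (const std::string& r : reads)
        for (size_t i = 0; i + k <= r.size(); ++i)
            ++counts[r.substr(i, k)];           // count this k-mer

    for (const auto& kv : counts)
        std::printf("%s : %ld\n", kv.first.c_str(), kv.second);
    return 0;
}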

Exascale Computing Project PI: Katherine Yelick (LBNL) 51 Exascale Solutions for Microbiome Analysis

Exascale Challenge Problem
• Enable biological discoveries and bioengineering solutions through high quality assembly and comparative analysis of 1 million metagenomes on an early exascale system.
• Support biomanufacturing of materials for energy, environment, and health applications (e.g., antibiotics) through identification of genes and gene clusters in microbial communities
• Provide scalable tools for three core computational problems in metagenomics: (i) metagenome assembly, (ii) protein clustering, and (iii) signature-based approaches to enable scalable and efficient comparative metagenome analysis.
Applications & S/W Technologies
• Applications: HipMer, GOTTCHA, Mash
• Software Technologies Cited: UPC, UPC++, MPI, GASNet
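The UPC/UPC++-over-GASNet dependence above reflects that the assembly pipeline is built on one-sided PGAS communication: any rank can read or write a remote slice of a distributed table without the remote side participating. A minimal UPC++ put/get sketch follows; it assumes a working UPC++ installation and its upcxx compiler wrapper, and it is an illustration of the communication style only, not the HipMer data structures.

// Minimal one-sided PGAS sketch with UPC++: rank 0 allocates a shared array,
// every rank writes one element remotely with rput, then reads a neighbor's
// element back with rget. Illustrative only -- not HipMer's hash table.
// Build (assumption): upcxx -O2 pgas_sketch.cpp -o pgas_sketch
#include <upcxx/upcxx.hpp>
#include <cstdio>

int main() {
    upcxx::init();
    const int me = upcxx::rank_me();
    const int n  = upcxx::rank_n();

    // Rank 0 owns the array; everyone learns its global address via broadcast.
    upcxx::global_ptr<int> table;
    if (me == 0) table = upcxx::new_array<int>(n);
    table = upcxx::broadcast(table, 0).wait();

    // One-sided write: each rank stores a value in slot `me` on rank 0.
    upcxx::rput(me * 10, table + me).wait();
    upcxx::barrier();

    // One-sided read: each rank reads the next rank's slot.
    int neighbor = upcxx::rget(table + (me + 1) % n).wait();
    std::printf("rank %d read %d\n", me, neighbor);

    upcxx::barrier();
    if (me == 0) upcxx::delete_array(table);
    upcxx::finalize();
    return 0;
}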

Risks and Challenges
• Exascale systems are not balanced for this data-intensive workload, limiting scalability or requiring new algorithmic approaches
• No GASNet, UPC or UPC++ implementation exists on the exascale systems, requiring a different implementation strategy
• GOTTCHA algorithm cannot accurately distinguish metagenomes, requiring a new analysis approach
Development Plan
Y1: Release HipMer for metagenomes on short read data; demonstrate HipMer at scale on Cori, optimized for Xeon Phi
Y2: Release Mash-based pipeline for whole metagenome classification; release GOTTCHA/Mash-based visualization toolkit
Y3: Release HipMer for long read metagenomes; release HipMer, HipMCL, GOTTCHA for metagenome assembly and annotation on long/short reads for APEX/CORAL platforms
Y4: Complete assembly and analysis of data in SRA and IMG

Exascale Computing Project PI: Katherine Yelick (LBNL) 52 Software Technology Requirements Provided by Exascale Solutions for Microbiome Analysis

• Programming Models and Runtimes 1. UPC/UPC++, PGAS, GASNetEX 2. Thrust 3. MPI • Tools 1. Git, CMake 2. GDB, Valgrind • Mathematical Libraries, Scientific Libraries, Frameworks 1. Smith-Waterman
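Smith-Waterman, listed above as the project's core math-library requirement, is the dynamic-programming kernel for local sequence alignment: each cell of a score matrix takes the best of a diagonal match/mismatch, a gap, or zero (which lets an alignment restart anywhere). A scalar reference version in C++ is sketched below; the production pipelines use vectorized and GPU variants.

// Scalar Smith-Waterman local alignment score. Reference sketch only.
#include <algorithm>
#include <cstdio>
#include <string>
#include <vector>

int smith_waterman(const std::string& a, const std::string& b,
                   int match = 2, int mismatch = -1, int gap = -2) {
    const size_t m = a.size(), n = b.size();
    std::vector<std::vector<int>> H(m + 1, std::vector<int>(n + 1, 0));
    int best = 0;
    for (size_t i = 1; i <= m; ++i) {
        for (size_t j = 1; j <= n; ++j) {
            int diag = H[i - 1][j - 1] + (a[i - 1] == b[j - 1] ? match : mismatch);
            int up   = H[i - 1][j] + gap;       // gap in b
            int left = H[i][j - 1] + gap;       // gap in a
            H[i][j]  = std::max({0, diag, up, left});
            best     = std::max(best, H[i][j]);
        }
    }
    return best;   // score of the best local alignment
}

int main() {
    std::printf("score = %d\n", smith_waterman("ACACACTA", "AGCACACA"));
    return 0;
}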

Requirements Ranking 1. Definitely plan to use 2. Will explore as an option Exascale Computing Project 3. Might be useful but no concrete plans 53 Software Technology Requirements Provided by Exascale Solutions for Microbiome Analysis

• Data Management and Workflows 1. IMG/KBase, SRA, Globus 2. MPI-IO • Data Analytics and Visualization 1. CombBLAS, 2. Elviz, GAGE, MetaQuast • System Software

Requirements Ranking 1. Definitely plan to use 2. Will explore as an option Exascale Computing Project 3. Might be useful but no concrete plans 54 Full-loop, Lab-Scale Gasifier Simulation with MFIX-DEM The MFIX-Exa project will track billions of reacting particles in a full-loop reactor, making it feasible to simulate a pilot-scale chemical looping reactor in a time-to-solution small enough to enable simulation-based reactor design and optimization • Simulation contains over 1 million reacting particles coupled to a gas phase through interphase momentum, energy, and mass transfer. • Particles comprised of carbon, volatiles, moisture and ash • Gas comprised of O2, CO, CO2, CH4, H2, H2O, N2 • Animation: (Left) Solids particles colored by temperature. (Right) Volume rendering of CO mass fraction. Grey surfaces indicate regions in the domain where the water-gas shift reaction is strong. • Graphs. (Left) Gas species composition and (Right) temperature at the outlet.

Exascale Computing Project PI: Madhava Syamlal (NETL) 55 Performance Prediction of Multiphase Energy Conversion Devices with Discrete Element, PIC, and Two-Fluid Models (MFIX-Exa)

Exascale Challenge Problem Applications & S/W Technologies

• Curbing man-made CO2 emissions at fossil-fuel power plants relies on carbon-capture and storage (CCS); understanding how to scale laboratory designs of multiphase reactors to industrial size is required to drive large-scale commercial deployment of CCS, and a high-fidelity modeling capability is critically needed to meet DOE's CCS goals because build-and-test is prohibitive in both cost and time
• Deliver computational fluid dynamics-discrete element modeling (CFD-DEM) capability for lab-scale reactors, demonstrating it by simulating NETL's 50 kW chemical looping reactor (CLR), including all relevant individual chemical and physical phenomena present in the reactor
• The decadal problem is CFD-DEM capability for small pilot-scale (0.5-5 MWe) reactors, demonstrated by simulating the NETL CLR scaled up to 1 MWe with sufficient fidelity and time-to-solution to impact design decisions.
Applications
• MFIX (MFIX-TFM, MFIX-PIC, MFIX-DEM)
Software Technologies Cited
• C++, FORTRAN, MPI
• BoxLib, AMG solvers (e.g., Hypre, PETSc)
• HDF5
• MPI, OpenMP, GASNet
• UPC++
Risks and Challenges
• Unexpected effects from changes in temporal and spatial resolution within multilevel algorithms with subcycling in time
• Prototype node hardware not available for performance evaluation of intra-node programming model
• Available external AMG solvers fail to scale
• Loss of personnel
• Supplied Cartesian cut-cell geometry files from MFIX-GUI fail to work correctly with BoxLib integration
Development Plan
Y1: Restructure DEM hydro models and particle data into BoxLib; replace BiCGStab with the BoxLib GMG solver; improve single-level algorithm performance; optimize particle data layout; incorporate HDF5 I/O
Y2: Migrate scalar and thermodynamic models into BoxLib; incorporate coarse cut-cell support; implement non-subcycling multilevel time-stepping; improve multilevel particle-particle and particle-mesh performance; enable existing analytics for use in-situ
Y3: Migrate species transport into BoxLib; adapt non-subcycling multilevel algorithm load balance to species transport and chemical reactions; port expensive kernels to GPU (OpenMP); extend analytics to use the BoxLib sidecar
Y4: Optimize load balance for subcycled multilevel algorithm with full physics; deliver spectral deferred corrections (SDC); optimize on-node performance; optimize workflow to minimize total runtime
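The Y1 milestone above includes incorporating HDF5 I/O for particle data. The sketch below writes a flat array of particle temperatures with the serial HDF5 C API; it is a generic example, not MFIX-Exa's I/O path, which would use parallel HDF5 through the BoxLib infrastructure.

// Write a 1-D array of particle temperatures to an HDF5 file using the
// serial C API. Generic illustration only -- not MFIX-Exa code.
// Build (assumption): h5c++ write_particles.cpp -o write_particles
#include <hdf5.h>
#include <vector>

int main() {
    const hsize_t n = 1000;
    std::vector<double> temperature(n, 300.0);   // placeholder values, K

    hid_t file  = H5Fcreate("particles.h5", H5F_ACC_TRUNC,
                            H5P_DEFAULT, H5P_DEFAULT);
    hid_t space = H5Screate_simple(1, &n, nullptr);          // 1-D dataspace
    hid_t dset  = H5Dcreate2(file, "temperature", H5T_NATIVE_DOUBLE, space,
                             H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);

    H5Dwrite(dset, H5T_NATIVE_DOUBLE, H5S_ALL, H5S_ALL, H5P_DEFAULT,
             temperature.data());

    H5Dclose(dset);
    H5Sclose(space);
    H5Fclose(file);
    return 0;
}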

Exascale Computing Project PI: Madhava Syamlal (NETL) 56 Software Technology Requirements Multiphase (NETL)

• Programming Models and Runtimes 1. Fortran, C++/C++17, C, MPI, OpenMP, Python 2. OpenACC, CUDA, OpenCL, UPC/UPC++ • Tools 1. Git, GitLab, TAU, Jenkins (testing), DDT (debugger) 2. HPCToolkit, PAPI • Mathematical Libraries, Scientific Libraries, Frameworks 1. BoxLib, Hypre, PETSc 2. Trilinos

Requirements Ranking 1. Definitely plan to use 2. Will explore as an option Exascale Computing Project 3. Might be useful but no concrete plans 57 Software Technology Requirements Multiphase (NETL)

• Data Management and Workflows 1. HDF

• Data Analytics and Visualization 1. VisIt, ParaView • System Software

Requirements Ranking 1. Definitely plan to use 2. Will explore as an option Exascale Computing Project 3. Might be useful but no concrete plans 58 Exascale Predictive Wind Plant Flow Physics Modeling Understanding Complex Flow Physics of Whole Wind Plants

• Must advance fundamental understanding of flow physics governing whole wind plant performance: wake formation, complex terrain impacts, turbine-turbine interaction effects • Greater use of U.S. wind resources for electric power generation (~30% of total) will have profound societal and economic impact: strengthening energy security and reducing greenhouse-gas emissions • Wide-scale deployment of wind energy on the grid without subsidies is hampered by significant plant-level energy losses by turbine-turbine interactions in complex terrains • Current methods for modeling wind plant performance are not reliable design tools due to insufficient model fidelity and inadequate treatment of key phenomena • Exascale-enabled predictive simulations of wind plants composed of O(100) multi-MW wind turbines sited within a 10 km x 10 km area with complex terrains will provide validated "ground truth" foundation for new turbine design models, wind plant siting, operational controls and reliably integrating wind energy into the grid

Exascale Computing Project PI: Steve Hammond (NREL) 59 Exascale Predictive Wind Plant Flow Physics Modeling

Exascale Challenge Problem
• A key challenge to wide-scale deployment of wind energy, without subsidy, in the utility grid is predicting and minimizing plant-level energy losses. Current methods lack model fidelity and inadequately treat key phenomena.
• Deliver predictive simulation of a wind plant composed of O(100) multi-MW wind turbines sited within a 10 km x 10 km area, with complex terrain (O(10^11) grid points).
• Predictive physics-based high-fidelity models, validated with target experiments, provide fundamental understanding of wind plant flow physics and will drive blade, turbine, and wind plant design innovation.
• This work will play a vital role in addressing the urgent national need to dramatically increase the percentage of electricity produced from wind power, without subsidy.
Applications & S/W Technologies
• Applications: Nalu, FAST
• Software Technologies Cited: C++, MPI, OpenMP (via Kokkos), CUDA (via Kokkos); Trilinos (Tpetra), MueLu, Sierra Toolkit (STK), Kokkos; Spack, Docker; DHARMA (Distributed asynchronous Adaptive Resilient Management of Applications)
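As listed above, performance portability comes from Kokkos: the same loop body is compiled for OpenMP threads on CPUs or CUDA on GPUs depending on how Kokkos was configured. A minimal Kokkos sketch is shown below; it assumes a Kokkos installation and is a generic illustration, not Nalu or Sierra Toolkit code.

// Minimal Kokkos sketch: a parallel_for fills a View and a parallel_reduce
// sums it; the same source targets OpenMP or CUDA back ends at build time.
// Generic illustration, not Nalu code.
#include <Kokkos_Core.hpp>
#include <cstdio>

int main(int argc, char* argv[]) {
    Kokkos::initialize(argc, argv);
    {
        const int n = 1 << 20;
        Kokkos::View<double*> u("velocity", n);   // device-resident array

        Kokkos::parallel_for("fill", n, KOKKOS_LAMBDA(const int i) {
            u(i) = 0.001 * i;
        });

        double sum = 0.0;
        Kokkos::parallel_reduce("sum", n,
            KOKKOS_LAMBDA(const int i, double& partial) { partial += u(i); },
            sum);

        std::printf("sum = %g\n", sum);
    }
    Kokkos::finalize();
    return 0;
}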

Risks and Challenges
• Transition to next generation platforms
• Robustness of high-order schemes on turbulence model equations
• Sliding-mesh algorithm scalability
• Kokkos support
Development Plan
Y1: Baseline run for canonical ABL simulation with MPI; single-blade-resolved sim in non-rotating turbulent flow; incorporate Kokkos and demonstrate faster ABL run; demonstrate single-blade-resolved simulation with rotating blades
Y2: Baseline single-blade-resolved capability (SBR) run; demonstrate mixed-order run with overset or sliding mesh algorithm; demonstrate faster SBR run; demonstrate single-turbine blade-resolved simulation
Y3: Demonstrate simulation of several turbines operating in flat terrain
Y4: Demonstrate simulation of O(10) turbines operating in complex terrain

Exascale Computing Project PI: Steve Hammond (NREL) 60 Coupled Monte Carlo Neutronics and Fluid Flow Simulation of Small Modular Reactors

• DOE has motivated and supported (just in this decade) the creation and enhancement of a new suite of high resolution physics applications for nuclear reactor analysis • Petascale-mature applications include new Monte Carlo (MC) neutronics and computational fluid dynamics (CFD) capabilities suitable for much-improved analysis of light water reactors (LWR) on the grid today • Petascale has enabled pin-resolved reactor physics solutions for reactor startup conditions • Capable exascale: needed to model operational behavior of LWRs at hot full power with full-core multiphase CFD and fuel depletion (over the complete operational reactor lifetime) • Capable exascale: allow coupling high-fidelity MC neutronics + CFD into an integrated toolkit for also modeling the operational behavior of Small Modular Reactors (<300 MWe) • Penultimate exascale challenge problem for nuclear reactors: generate experimental-quality simulations of steady-state and transient reactor behavior

Exascale Computing Project PI: Thomas Evans (ORNL) 61 Coupled Monte Carlo Neutronics and Fluid Flow Simulation of Small Modular Reactors
Exascale Challenge Problem
• Unprecedentedly detailed simulations of coupled coolant flow and neutron transport in Small Modular Reactor (SMR) cores to streamline design, licensing, and optimal operation.
• Coupled Monte Carlo (MC)/CFD simulations of the steady state operation of a 3D full small modular reactor core
• Coupled MC/CFD simulations of the full core in steady-state, low-flow critical heat flux (CHF) conditions
• Transient coupled MC neutronics/CFD simulations of the low-flow natural circulation startup of a small modular reactor
Applications & S/W Technologies
• Applications: Nek5000, SHIFT, OpenMC
• Software Technologies Cited: MPI, OpenMP, Kokkos, Trilinos, PETSc, CUDA, OpenACC, DTK
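Shift and OpenMC are Monte Carlo transport codes: they follow individual neutron histories through sampled flight distances and collision outcomes, and tally the results. A toy 1-D slab transmission estimate in plain C++ conveys the structure of such a random walk; it is illustrative only, since real MC transport uses continuous-energy cross sections and full 3-D geometry.

// Toy Monte Carlo neutron transport: mono-directional neutrons enter a 1-D
// slab, travel exponentially distributed distances between collisions, and
// either scatter isotropically or are absorbed; we tally the transmitted
// fraction. Illustrative only -- not Shift or OpenMC.
#include <cmath>
#include <cstdio>
#include <random>

int main() {
    const double thickness = 5.0;   // slab thickness (units of mean free path)
    const double sigma_t   = 1.0;   // total macroscopic cross section
    const double p_scatter = 0.7;   // scattering probability per collision
    const long   nhist     = 1000000;

    std::mt19937_64 rng(12345);
    std::uniform_real_distribution<double> U(0.0, 1.0);

    long transmitted = 0;
    for (long h = 0; h < nhist; ++h) {
        double x = 0.0, mu = 1.0;                      // position, direction cosine
        while (true) {
            x += -std::log(U(rng)) / sigma_t * mu;     // sampled flight to collision
            if (x >= thickness) { ++transmitted; break; }   // leaked out the back
            if (x < 0.0) break;                        // leaked out the front
            if (U(rng) > p_scatter) break;             // absorbed
            mu = 2.0 * U(rng) - 1.0;                   // isotropic scatter
        }
    }
    std::printf("transmission = %.4f\n", double(transmitted) / nhist);
    return 0;
}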

Risks and Challenges
• Intra-node efficiency of MC random walks is insufficient to meet challenge problem requirements
• Transient acceleration techniques like ensemble averaging prove impractical for CFD at full scale
• Advanced Doppler broadening models fail to provide the desired accuracy, memory requirements, or completeness
• Algebraic multi-grid factorization too expensive for the largest cases under consideration, or novel AMG techniques prove ineffective
Development Plan
Y1: Three-dimensional full core neutronics and single assembly CFD, including quantitative profiling and performance analysis to establish a baseline for on-node concurrency
Y2: Three-dimensional, fully coupled neutronics and CFD, including quantitative performance analysis
Y3: Improvements in full core neutronics and CFD performance on Phi and GPU architectures
Y4: Demonstration of full-core, fully-coupled neutronics and CFD with quantified machine utilization on Aurora and Summit

Exascale Computing Project PI: Thomas Evans (ORNL) 62 Software Technology Requirements Nuclear Reactors

• Programming Models and Runtimes 1. C++/C++-17, C, Fortran, MPI, OpenMP, Thrust, CUDA, Python 2. Kokkos, OpenACC, NVL-C 3. Raja, Legion/Regent, HPX • Tools 1. LLVM/Clang, PAPI, Cmake, git, CDash, gitlab, Oxbow 2. Docker, Aspen 3. TAU • Mathematical Libraries, Scientific Libraries, Frameworks 1. BLAS/PBLAS, Trilinos, LAPACK 2. Metis/ParMETIS, SuperLU, PETSc 3. Hypre

Requirements Ranking 1. Definitely plan to use 2. Will explore as an option Exascale Computing Project 3. Might be useful but no concrete plans 63 Software Technology Requirements Nuclear Reactors

• Data Management and Workflows 1. MPI-IO, HDF, Silo, DTK 2. ADIOS • Data Analytics and Visualization 1. VisIt 2. Paraview • System Software

Requirements Ranking 1. Definitely plan to use 2. Will explore as an option Exascale Computing Project 3. Might be useful but no concrete plans 64 QMCPACK: A Framework for Predictive and Systematically Improvable Quantum-Mechanics Based Simulations of Materials

Exascale Challenge Problem
• Develop a performance portable, exascale version of the open source petascale code QMCPACK that can provide highly accurate calculations of the properties of complex materials.
• Find, predict and control materials and properties at the quantum level with an unprecedented and systematically improvable accuracy.
• The ten-year challenge problem is to simulate transition metal oxide systems of approximately 1000 atoms to 10 meV statistical accuracy, such as complex oxide heterostructures that host novel quantum phases, using the full concurrency of exascale systems.
Applications & S/W Technologies
• Applications: QMCPACK
• Software Technologies Cited: MPI, CUDA, OpenMP, OpenACC, RAJA, Kokkos, C++17, runtimes, BLAS, LAPACK, sparse linear algebra
• Motifs: Particles, dense and sparse linear algebra, Monte Carlo
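The Monte Carlo motif cited above means populations of random walkers sampling a probability distribution, with statistical error shrinking as the square root of the number of samples. A toy Metropolis walker sampling a 1-D Gaussian and estimating <x^2> is sketched below; it is purely illustrative of the motif, since QMCPACK's walkers sample many-electron trial wavefunctions.

// Toy Metropolis Monte Carlo: one walker samples p(x) ~ exp(-x^2) and we
// estimate <x^2> (exact value 0.5). Illustrative of the MC motif only.
#include <cmath>
#include <cstdio>
#include <random>

int main() {
    std::mt19937_64 rng(2017);
    std::uniform_real_distribution<double> U(0.0, 1.0);
    std::normal_distribution<double> step(0.0, 0.8);   // proposal width

    double x = 0.0, sum_x2 = 0.0;
    const long nsteps = 2000000;

    for (long i = 0; i < nsteps; ++i) {
        double trial = x + step(rng);
        // Accept with probability min(1, p(trial)/p(x)) for p(x) ~ exp(-x^2).
        if (U(rng) < std::exp(x * x - trial * trial)) x = trial;
        sum_x2 += x * x;
    }
    std::printf("<x^2> ~ %.4f (exact 0.5)\n", sum_x2 / nsteps);
    return 0;
}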

Risks and Challenges
• Portable performance between multiple architectures not achievable within project time frame.
• 10-year scientific challenge problems are no longer relevant or interesting due to scientific advances.
• Significant changes in computational architectures that require new solutions or problem decompositions.
• Application not running at capability scale with good figure of merit.
Development Plan
Y1: Development of miniapps to assess programming models, expose and exploit additional concurrency. Full scale calculations on current generation supercomputers for code development and scaling tests.
Y2: Testing of programming models, scaling techniques. Focus on on-node concurrency and portable performance on next generation nodes. Selection of preferred programming model.
Y3: Implementation of newer programming model to expose greater concurrency. Multiple nodes on next generation supercomputers for code development and demo science application.
Y4: Full scale calculations and their refinement on next generation supercomputers for code development and demonstration science application.

Exascale Computing Project PI: Paul Kent (ORNL) 65 Software Technology Requirements QMCPACK

• Programming Models and Runtimes 1. C++, MPI, OpenMP (existing), CUDA (existing). 2. Kokkos, RAJA, C++17, OpenMP 4.x+, OpenACC, any sufficiently capable supported runtimes (distributed capability not essential). Intend to identify a preferred solution and promote to #1. 3. DSL (IRP used for associated Quantum Chemistry application), code generators & autotuners. • Tools 1. git (version control and git-based development workflow tools), CMake/CTest/CDash. 2. LLVM, Performance analysis and prediction tools, particularly for memory hierarchy, Static analysis (correctness, metrics/compliance). • Mathematical Libraries, Scientific Libraries, Frameworks 1. BLAS, FFTW, BOOST, MKL (Sparse BLAS, existing implementation), Python, Numpy. 2. Any portable/supported sparse BLAS (including distributed implementations), runtime data compression (e.g. ZFP)

Requirements Ranking 1. Definitely plan to use 2. Will explore as an option Exascale Computing Project 3. Might be useful but no concrete plans 66 Software Technology Requirements QMCPACK

• Data Management and Workflows 1. NEXUS (existing lightweight Python workflow tool included with QMCPACK), HDF5, XML. 2. Supported workflow & data management tools with a fair balance of simplicity-complexity / capability / portability. • Data Analytics and Visualization 1. Matplotlib, VisIt, VESTA, h5py • System Software 1. Support for fault tolerance/resilience (if provided at the system software level, and in the remainder of the software stack). Because our application performs Monte Carlo and does limited I/O, the most common faults should be easily handled.

Requirements Ranking 1. Definitely plan to use 2. Will explore as an option Exascale Computing Project 3. Might be useful but no concrete plans 67 Transforming Additive Manufacturing through Exascale Simulation (ExaAM)

• The Exascale AM (ExaAM) project is building a new multi-physics modeling and simulation toolkit for Additive Manufacturing (AM) to provide an up-front assessment of the manufacturability and performance of additively manufactured parts • An Integrated Platform for AM Simulation (IPAMS) will be experimentally validated, enabled by in-memory coupling between continuum and mesoscale models to quantify microstructure development and evolution during the AM process • Microscopic structure determines local material properties such as residual stress and leads to part distortion and failure • A validated AM simulator enables determination of optimal process parameters for desired material properties, ultimately leading to reduced-order models that can be used for real-time in situ process optimization • Coupled to a modern design optimization tool, IPAMS will enable the routine use of AM to build novel and qualifiable parts

Exascale Computing Project PI: John Turner (ORNL) 68 Transforming Additive Manufacturing through Exascale Simulation (ExaAM)
Exascale Challenge Problem
• Develop, deliver, and deploy the Integrated Platform for Additive Manufacturing Simulation (IPAMS) that tightly couples high-fidelity sub-grid simulations within a continuum process simulation to determine microstructure and properties, and hence performance, at each time-step using local conditions
• Dramatically accelerate the widespread adoption of additive manufacturing (AM) by enabling fabrication of qualifiable metal parts with minimal trial-and-error iteration and realization of location-specific properties
• Large suite of physics models: melt pool-scale process modeling, part-scale process modeling (microstructure, residual stress & properties), microstructure modeling, material property modeling, and performance modeling
• Simulate AM of a part where mass reduction results in significant energy savings in the application and structural loading requires a graded structure
Applications & S/W Technologies
• Applications — IPAMS: ALE3D, Truchas, Diablo, AMPE, MEUMAPPS, Tusas
• Software Technologies Cited: C++, Fortran; MPI, OpenMP, OpenACC, CUDA; Kokkos, Raja, Charm++; Hypre, Trilinos, P3DFFT, SAMRAI, Sundials, Boost; DTK, netCDF, HDF5, ADIOS, Metis, Silo; GitHub, GitLab, CMake, CDash, Jira, Eclipse ICE
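The continuum process codes above solve heat transfer with a moving energy source (the laser or electron beam) plus phase change. A bare-bones explicit 1-D heat-conduction step with a moving Gaussian source shows the structure of such a model; it is a toy sketch with illustrative parameters, not the physics or numerics of ALE3D, Truchas, or Diablo.

// Toy explicit 1-D heat conduction with a moving Gaussian heat source,
// loosely mimicking a scan track in additive manufacturing. Illustrative
// parameters only -- not ALE3D/Truchas/Diablo physics.
#include <algorithm>
#include <cmath>
#include <cstdio>
#include <vector>

int main() {
    const int    nx    = 200;
    const double dx    = 1e-4;                   // 0.1 mm cells
    const double alpha = 5e-6;                   // thermal diffusivity, m^2/s
    const double dt    = 0.4 * dx * dx / alpha;  // within explicit stability limit
    const double vscan = 0.1;                    // source scan speed, m/s
    const double q0    = 2.0e5;                  // source strength, K/s (toy value)

    std::vector<double> T(nx, 300.0), Tn(nx, 300.0);   // temperature, K

    for (int step = 0; step < 200; ++step) {
        const double xs = vscan * step * dt;     // current source position
        for (int i = 1; i < nx - 1; ++i) {
            const double x    = i * dx;
            const double cond = alpha * (T[i+1] - 2.0 * T[i] + T[i-1]) / (dx * dx);
            const double src  = q0 * std::exp(-std::pow((x - xs) / (2.0 * dx), 2));
            Tn[i] = T[i] + dt * (cond + src);
        }
        Tn[0] = Tn[1];                           // insulated ends
        Tn[nx - 1] = Tn[nx - 2];
        T.swap(Tn);
    }

    std::printf("peak temperature = %.0f K\n",
                *std::max_element(T.begin(), T.end()));
    return 0;
}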

Risks and Challenges
• New characterization techniques may be required for validation
• Inability to efficiently spawn subgrid simulations
• Microstructure model results do not match experiment
• Linear solver Hypre is unable to efficiently take advantage of hybrid architectures (multiple IPAMS components depend on Hypre)
• Incompatibilities or inefficiencies in integrating components using disparate approaches (e.g. Raja and Kokkos)
• Uncertainty in input parameters for both micro- and macro-scale simulations
Development Plan
Y1: Simulate microstructure / residual stress in macro-scale solid part against experimental result; IPAMS v1.0 release; initial IPAMS code coupling; unsupported bridge and supported comb-type samples (Demo 1)
Y2: Initial coupled demo simulation, with file-based communication; in-memory IPAMS code coupling; IPAMS v2.0 release; individual struts within multiple-unit-cell lattice (Demo 2)
Y3: Common workflow system for problem setup and analysis; IPAMS v3.0 release; full scale mechanical test sample (Demo 3)
Y4: Fully-integrated, in-memory process simulation capability available; IPAMS v4.0 release; full scale conformal lattice with local microstructure (Demo 4)

Exascale Computing Project PI: John Turner (ORNL), co-PI: Jim Belak (LLNL) 69 Transforming Additive Manufacturing through Exascale Simulation (ExaAM)
Application Domain
• Application Area: Dramatically accelerate the widespread adoption of additive manufacturing (AM) by enabling fabrication of qualifiable metal parts with minimal trial-and-error iteration and realization of location-specific properties
• Coupling of high-fidelity sub-grid simulations within a continuum process simulation to determine microstructure and properties at each time-step using local conditions
• Challenge Problem: Simulate AM of a part where mass reduction results in significant energy savings in the application and structural loading requires a functionally graded structure
Physical Models and Code(s)
• Physical Models: fluid flow, heat transfer, phase change (melting/solidification and solid-solid), nucleation, microstructure formation and evolution, residual stress
• Codes: Continuum: ALE3D, Diablo, Truchas; Mesoscale: AMPE, MEUMAPPS, Tusas
• Motifs: Sparse Linear Algebra, Dense Linear Algebra, Spectral Methods, Unstructured Grids, Dynamic Programming, Particles

Partnerships
• Co-Design Centers: CEED (discretization), CoPA (particles), CODAR (data), AMReX (AMR)
• Software Technology Centers: ALExa (DTK), PEEKS (Trilinos), ATDM (Hypre), Kokkos, Exascale MPI, OMPI-X, ForTrilinos, Sparse Solvers, xSDK4ECP, SUNDIALS, PETSc/TAO, ADIOS, potentially others
• Application Projects: ATDM projects (primarily LANL and LLNL), Exascope
First Year Development Plans
• Demonstrate simulation of unsupported bridge and supported comb-type samples
• Release initial set of proxy apps
• Integrated Platform for Additive Manufacturing Simulation (IPAMS) v1.0 release with initial code coupling

Exascale Computing Project PI: John Turner (ORNL), co-PI: Jim Belak (LLNL) 70 Transforming Additive Manufacturing through Exascale Simulation (ExaAM)
Goals and Approach
• Application Area: Dramatically accelerate the widespread adoption of additive manufacturing (AM) by enabling fabrication of qualifiable metal parts with minimal trial-and-error iteration and realization of location-specific properties
• Coupling of high-fidelity sub-grid simulations within a continuum process simulation to determine microstructure and properties at each time-step using local conditions
• Challenge Problem: Simulate AM of a part where mass reduction results in significant energy savings in the application and structural loading requires a functionally graded structure
Critical Needs Currently Outside the Scope of ExaAM
• modeling of powder properties and spreading
• shape and topology optimization
• post-build processing, e.g. hot isostatic pressing (HIP)
• data analytics and machine learning of process / build data
• reduced-order models

Models and Code(s)
• Physical Models: fluid flow, heat transfer, phase change (melting/solidification and solid-solid), nucleation, microstructure formation and evolution, residual stress
• Codes: Continuum: ALE3D, Diablo, Truchas; Mesoscale: AMPE, MEUMAPPS, Tusas
• Motifs: Sparse Linear Algebra, Dense Linear Algebra, Spectral Methods, Unstructured Grids, Dynamic Programming, Particles
Software and Numerical Library Dependencies
• C++, Fortran
• MPI, OpenMP, OpenACC, CUDA
• Kokkos, Raja, Charm++
• Hypre, Trilinos, P3DFFT, SAMRAI, Sundials, Boost
• DTK, netCDF, HDF5, ADIOS, Metis, Silo
• GitHub, GitLab, CMake, CDash, Jira, Eclipse ICE

Exascale Computing Project PI: John Turner (ORNL), co-PI: Jim Belak (LLNL) 71 Transforming Additive Manufacturing through Exascale Simulation (ExaAM) Goal and Approach • Accelerate the widespread adoption of additive manufacturing (AM) by enabling fabrication of qualifiable metal parts with minimal trial-and-error iteration and realization of location-specific properties • Coupling of high-fidelity sub-grid simulations within a continuum process simulation to determine microstructure and properties at each time-step using local conditions

Models and Code(s)
• Physical Models: fluid flow, heat transfer, phase change (melting/solidification and solid-solid), nucleation, microstructure formation and evolution, residual stress
• Codes: Continuum: ALE3D, Diablo, Truchas; Mesoscale: AMPE, MEUMAPPS, Tusas
• Motifs: Sparse Linear Algebra, Dense Linear Algebra, Spectral Methods, Unstructured Grids, Dynamic Programming, Particles
Software and Numerical Library Dependencies
• C++, Fortran
• MPI, OpenMP, OpenACC, CUDA
• Kokkos, Raja, Charm++
• Hypre, Trilinos, P3DFFT, SAMRAI, Sundials, Boost
• DTK, netCDF, HDF5, ADIOS, Metis, Silo
• GitHub, GitLab, CMake, CDash, Jira, Eclipse ICE
Critical Needs Currently Outside the Scope of ExaAM
• modeling of powder properties and spreading
• shape and topology optimization
• post-build processing, e.g. hot isostatic pressing (HIP)
• data analytics and machine learning of process / build data
• reduced-order models
Exascale Computing Project PI: John Turner (ORNL), co-PI: Jim Belak (LLNL) 72 Software Technology Requirements Advanced Manufacturing

• Programming Models and Runtimes 1. Fortran, C++/C++17, Python, MPI, OpenMP, OpenACC, CUDA, Kokkos, Raja, Boost 2. Legion/Regent, Charm++ 3. other asynchronous, task-parallel, programming/execution models and runtime systems • Tools 1. git, CMake, CDash, GitLab 2. Docker, Jira, Travis, PAPI, Oxbow • Mathematical Libraries, Scientific Libraries, Frameworks 1. BLAS/PBLAS, Trilinos, PETSc, LAPACK/ScaLAPACK, Hypre, DTK, Chaco, ParMetis, WSMP (direct solver from IBM) 2. HPGMG, MueLu (actually part of Trilinos – replacement for ML), MAGMA, Dakota, SuperLU, AMP

Requirements Ranking 1. Definitely plan to use 2. Will explore as an option Exascale Computing Project 3. Might be useful but no concrete plans 73 Software Technology Requirements Advanced Manufacturing

• Data Management and Workflows 1. HDF, netCDF, Exodus 2. ADIOS • Data Analytics and Visualization 1. VisIt, ParaView, VTK • System Software

Requirements Ranking 1. Definitely plan to use 2. Will explore as an option Exascale Computing Project 3. Might be useful but no concrete plans 74 Optimizing Stochastic Grid Dynamics at Exascale Intermittent renewable sources, electric vehicles, and smart loads will vastly change the behavior of the electric power grid and impose new stochastics and dynamics that the grid is not designed for nor can easily accommodate

• Optimizing such a stochastic and dynamic grid with sufficient reliability and efficiency is a monumental challenge • Not solving this problem appropriately or accurately could result in either significantly higher energy cost, or decreased reliability inclusive of more blackouts, or both • Power grid data are clearly showing the trend towards dynamics that cannot be ignored and would invalidate the quasi-steady-state assumption used today for both emergency and normal operation • The increased uncertainty and dynamics severely strain the analytical workflow that is currently used to obtain the cheapest energy mix at a given level of reliability • The current practice is to keep the uncertainty, dynamics and optimization analysis separate, and then to make up for the error by allowing for larger operating margins • The cost of these margins is estimated by various sources to be $5-15B per year for the entire United States • The ECP grid dynamics application can deliver the best achievable bounds on these errors, resulting in potentially billions of dollars a year in savings.

Exascale Computing Project PI: Zhenyu (Henry) Huang (PNNL) 75 Optimizing Stochastic Grid Dynamics at Exascale

Exascale Challenge Problem
• Exascale optimization problem in electric system expansion planning with stochastic and transient constraints. Create exascale software tools that achieve optimal bounds on errors, potentially saving billions of dollars annually by producing the best planning decisions, and enable large-scale renewable energy (especially wind and solar) without compromising reliability under economic constraints.
• Planning encompasses the problem of determining the structure, positioning, and timing of billions in assets for any market operator in such a way that electricity costs and reliability satisfy given performance requirements.
• Major step forward from today's practice, which only considers steady-state constraints in a deterministic context and inconsistently samples time intervals for reliability metric estimation.
Applications & S/W Technologies
• Applications: GRID-Pack, PIPS
• Software Technologies Cited: MPI, OpenMP, MA57, LAPACK, MAGMA, PETSc, Global Arrays, ParMETIS, Chaco, BLAS, GridOPTICS software system (GOSS), Julia, Elemental, PARDISO
• Co-Design: ExaGraph
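The planning problem above is stochastic optimization: choose today's decisions so that the expected cost over many uncertainty scenarios (weather, demand) is minimal. A toy sample-average-approximation loop over a single capacity decision conveys the idea; it is illustrative only and unrelated to the PIPS/GRID-Pack formulations used by the project.

// Toy sample-average approximation (SAA): pick a capacity x minimizing
// build cost plus the expected shortfall penalty over sampled demand
// scenarios. Illustrative of stochastic planning only.
#include <algorithm>
#include <cstdio>
#include <random>
#include <vector>

int main() {
    const double build_cost = 1.0, shortfall_penalty = 4.0;
    const int nscenarios = 10000;

    // Sample demand scenarios (e.g., load net of wind), here Normal(100, 15).
    std::mt19937_64 rng(7);
    std::normal_distribution<double> demand(100.0, 15.0);
    std::vector<double> scenarios(nscenarios);
    for (double& d : scenarios) d = demand(rng);

    double best_x = 0.0, best_cost = 1e300;
    for (double x = 50.0; x <= 160.0; x += 1.0) {       // candidate capacities
        double expected_penalty = 0.0;
        for (double d : scenarios)
            expected_penalty += shortfall_penalty * std::max(0.0, d - x);
        double cost = build_cost * x + expected_penalty / nscenarios;
        if (cost < best_cost) { best_cost = cost; best_x = x; }
    }
    std::printf("best capacity ~ %.0f, expected cost ~ %.1f\n", best_x, best_cost);
    return 0;
}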

Risks and Challenges
• Availability of target systems for software development
• Alignment of the planned software attributes with target computers
• Availability of large-scale test datasets for the target domain
• Achieving target computational performance via integration of optimization and dynamics, spanning a time horizon of sub-seconds to decades
• Achieving target end-to-end performance considering data ingestion on an exascale platform using a data-driven approach
Development Plan
Y1: PIPS for SCOPF with nonlinear constraints (NL) achieves 10K-way parallelism; transient simulation/PETSc scales to 100-way parallelism.
Y2: SCOPF/NL/PIPS with weather uncertainty (U) scales to 10K–100K-way parallelism; parallel adjoint computations of transients scale to 100K-way parallelism.
Y3: SCOPF/PIPS/NL/U with transient constraints scales to 100K–1M-way parallelism.
Y4: 20-year production cost (planning) model with integer variables, uncertainty, and transient constraints scales to 1M–10M-way parallelism.

Exascale Computing Project PI: Zhenyu (Henry) Huang (PNNL) 76 High-Fidelity Whole Device Modeling of Magnetically Confined Fusion Plasmas

• Progress must be made in simulating the multiple spatiotemporal scale processes in magnetic plasma confinement for fusion scientists to understand physics, predict performance of future fusion reactors such as ITER, and accelerate development of commercial fusion reactors • Plasma confinement in a magnetic fusion reactor is governed through self-organized, multiscale interaction between plasma turbulence and instabilities, evolution of macroscopic quantities, kinetic dynamics of macroscopic plasma particles, atomic physics interaction with neutral particles generated from wall-interaction, plasma heating sources and sinks • Capable exascale is required to perform high-fidelity whole-device simulations of plasma confinement: two advanced, scalable fusion gyrokinetic codes are being coupled self-consistently to build the basis for a whole device model of a tokamak fusion reactor: the grid-based GENE code for the core region and the particle-in-cell XGC1 code for the edge region

Exascale Computing Project PI: Amitava Bhattacharjee (PPPL) 77 High-Fidelity Whole Device Modeling of Magnetically Confined Fusion Plasmas
Exascale Challenge Problem
• Develop high-fidelity Whole Device Model (WDM) of magnetically confined fusion plasmas to understand and predict the performance of ITER and future next-step facilities, validated on present tokamak (and stellarator) experiments
• Couple existing, well established extreme-scale gyrokinetic codes: the GENE continuum code for the core plasma and the XGC particle-in-cell (PIC) code for the edge plasma, into which a few other important (scale-separable) physics modules will be integrated at a later time for completion of the whole-device capability
Applications & S/W Technologies
• Applications: GENE, XGC
• Software Technologies Cited: MPI, OpenMP, OpenACC, CUDA-Fortran; Adios, Python; PETSc, BLAS/LAPACK, FFTW, SuperLU, SLEPc, Trilinos-Fortran; VisIt, PAPI, Tau
(Figure: Turbulence fills the whole plasma volume, controlling tokamak plasma confinement.)

Risks and Challenges
• Efficient scaling of GENE and XGC to exascale
• Telescoping to experimental time scales
• Load balancing among different kernels in coupled codes
• FLOPS-intensive (communication-avoiding) algorithms
• Numerical approaches to coupling
• Adaptive algorithms
Development Plan
Y1: Demonstrate initial implicit coupling capability between core (GENE) and edge (XGC) on the ITG turbulence physics
Y2: Demonstrate telescoping of the gyrokinetic turbulent transport using a multiscale time integration framework on leadership class computers
Y3: Demonstrate and assess the experimental (transport) time scale telescoping of whole-device gyrokinetic physics
Y4: Complete the phase I integration framework and demonstrate the capability of the WDM of multiscale gyrokinetic physics in realistic present-day tokamaks on full-scale SUMMIT, AURORA, and CORI
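The coupling challenge above amounts to making two solvers that own overlapping regions (core and edge) agree at their interface by exchanging boundary data each iteration. A schematic 1-D overlapping-domain relaxation in plain C++ shows the pattern of such an iterative interface exchange; it illustrates the structure only, not the GENE-XGC gyrokinetic coupling algorithm.

// Schematic overlapping-domain coupling: two 1-D Laplace solvers ("core" and
// "edge") each relax their subdomain, then exchange boundary values in the
// overlap until the interfaces stop changing (alternating Schwarz iteration).
// Structure only -- not the GENE-XGC coupling scheme.
#include <cstdio>
#include <vector>

static void relax(std::vector<double>& u, int sweeps) {
    for (int s = 0; s < sweeps; ++s)
        for (size_t i = 1; i + 1 < u.size(); ++i)
            u[i] = 0.5 * (u[i - 1] + u[i + 1]);   // interior relaxation sweep
}

int main() {
    // Global problem: u'' = 0 on [0,1], u(0)=1, u(1)=0 (exact: u = 1 - x),
    // on 101 nodes; "core" owns nodes 0..70, "edge" owns nodes 30..100.
    std::vector<double> core(71, 0.0), edge(71, 0.0);
    core.front() = 1.0;                // physical boundary u(0) = 1
    edge.back()  = 0.0;                // physical boundary u(1) = 0

    for (int it = 0; it < 200; ++it) {
        core.back()  = edge[70 - 30];  // core's right BC from edge at node 70
        relax(core, 50);
        edge.front() = core[30];       // edge's left BC from core at node 30
        relax(edge, 50);
    }
    // Compare both solvers against the exact solution at the overlap midpoint.
    std::printf("u(0.5): core %.4f, edge %.4f, exact %.4f\n",
                core[50], edge[50 - 30], 0.5);
    return 0;
}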

Exascale Computing Project PI: Amitava Bhattacharjee (PPPL) 78 Software Technology Requirements Provided by Fusion Whole Device Modeling

• Programming Models and Runtimes 1. Fortran, Python, C, MPI, OpenMP, OpenACC, CUDA-Fortran 2. Co-Array FORTRAN, PGAS • Tools 1. Allinea DDT, PAPI, Globus Online, git, TAU 2. 3. GitLab • Mathematical Libraries, Scientific Libraries, Frameworks 1. PETSc, SCOREC, LAPACK/ScaLAPACK, Hypre, IBM ESSL, Intel MKL, CUBLAS, CUFFT, SLEPc, BLAS/PBLAS, FFTW, Dakota 2. Trilinos [with Fortran interface], SuperLU, Sundials 3. DPLASMA, MAGMA, FMM

Requirements Ranking 1. Definitely plan to use 2. Will explore as an option Exascale Computing Project 3. Might be useful but no concrete plans 79 Software Technology Requirements Provided by Fusion Whole Device Modeling

• Data Management and Workflows 1. Adios 2. ZFP, SZ 3. SCR • Data Analytics and Visualization 1. VisIt, VTK 2. Paraview • System Software

Requirements Ranking 1. Definitely plan to use 2. Will explore as an option 3. Might be useful but no concrete plans Exascale Computing Project 80 Data Analytics at the Exascale for Free Electron Lasers The Linac Coherent Light Source (LCLS) is revealing biological structures in unprecedented atomic detail, helping to model proteins that play a key role in many biological functions. The results could help in designing new life-saving drugs.

• Biological function is profoundly influenced by dynamic changes in protein conformations and interactions with molecules – processes that span a broad range of timescales • Biological dynamics are central to enzyme function, cell membrane proteins and the macromolecular machines responsible for transcription, translation and splicing • Modern X-ray crystallography has transformed the field of structural biology by routinely resolving macromolecules at the atomic scale • LCLS has demonstrated the ability to resolve structures of macromolecules previously inaccessible - using the new approaches of serial nanocrystallography and diffract-before- destroy with high-peak-power X-ray pulses • Higher repetition rates of LCLS-II can enable major advances by revealing biological function with its unique capability to follow dynamics of macromolecules and interacting complexes in real time and in native environments • Advanced solution scattering and coherent imaging techniques can characterize sub- nanometer scale conformational dynamics of heterogeneous ensembles of macromolecules – both spontaneous fluctuations of isolated complexes, and conformational changes that may be initiated by the presence of specific molecules, environmental changes, or by other stimuli

Exascale Computing Project PI: Amedeo Perazzo (SLAC) 81 Data Analytics at the Exascale for Free Electron Lasers

Exascale Challenge Problem
• LCLS detector data rates up 10^3-fold by 2025; XFEL data analysis times down from weeks to minutes with real-time interpretation of molecular structure revealed by X-ray diffraction; LCLS-II beam rep rate goes from 120 Hz to 1 MHz by 2020.
• LCLS X-ray beam, at atomic-scale wavelengths and 10^9 brighter than other sources, probes complex, ultra-small structures with ultrafast pulses to freeze atomic motions
• Science drivers to orchestrate compute, network, and storage: Serial Femtosecond Crystallography (SFX) and Single Particle Imaging (SPI)
• SFX: study of biological macromolecules (e.g., protein structure / dynamics) and crystalline nano materials; need rapid image analysis feedback on diffraction data to make experimental decisions
• SPI: discern 3D molecular structure of individual nano particles & molecules; rapid diffraction pattern tuning of sample concentrations needed for sufficient single particle hit rate, adequate data collection
Applications & S/W Technologies
• Applications: Psana Framework, cctbx, lunus, M-TIP, IOTA
• Software Technologies Cited: Tasking runtime (Legion); C++, Python; MPI, OpenMP, CUDA; FFT, BLAS/LAPACK; HDF5, Shifter, XTC

Risks and Challenges
• Schedulability of NERSC & LCLS resources
• LCLS-II data rate > ESnet data rate
• HPC execution overhead
• Scalability of file format(s)
• Keeping up with network infrastructure upgrades
• Experiment calendar uncertainty
• New model for HPC utilization; bursty, short workloads imply lower machine utilization unless resilient jobs can be preempted
• Maturity of tasking runtime/image analysis kernels
Development Plan
Y1: Release cctbx, psana; benchmark exascale-aware M-TIP routines; prototype psana tasking, image kernels ported to C++ parallel STL; deploy live streaming HDF5 files from FFB to Cori; SFX experiment on Cori PII
Y2: Release cctbx, psana, M-TIP; port psana-MPI and tasking to Summit, key image kernels ported to image DSL; decision for psana-MPI vs psana-task; SFX experiment using IOTA on Cori PII & SPI experiment on Cori PII
Y3: Release cctbx, psana, M-TIP; optimized M-TIP under streaming; cctbx & M-TIP integrated into psana with 50% scaling on Cori PII & Sierra; SFX experiment using ray tracing
Y4: End-to-end cctbx on NERSC9 / ESnet6; optimized scheduler for live streaming jobs on Cori; SFX & SPI experimental demos for LCLS users visualizing structures in < 10 min
Exascale Computing Project PI: Amedeo Perazzo (SLAC) 82 Software Technology Requirements Data Analytics for Free Electron Lasers

• Programming Models and Runtimes 1. Python, C++, Legion, MPI, OpenMP 2. Argobots, Qthreads 3. Halide • Tools 1. CMake, git, Github, Zenhub, LLVM, Travis • Mathematical Libraries, Scientific Libraries, Frameworks 1. scikit-beam, BLAS/LAPACK, FFTW, GSL

Requirements Ranking 1. Definitely plan to use 2. Will explore as an option Exascale Computing Project 3. Might be useful but no concrete plans 83 Software Technology Requirements Data Analytics for Free Electron Lasers

• Data Management and Workflows 1. HDF5, psana 2. iRods, GASNet, Mercury, bbcp, globus online, gridftp, xrootd, zettar • Data Analytics and Visualization 1. cctbx, M-TIP • System Software 1. shifter

Requirements Ranking 1. Definitely plan to use 2. Will explore as an option Exascale Computing Project 3. Might be useful but no concrete plans 84 Transforming Combustion Science and Technology with Exascale Simulations

• Direct Numerical Simulation (DNS) of a turbulent lifted jet flame stabilized on pre-ignition species reveals how low-temperature reactions help stabilize the flame against the disruptive effects of high velocity turbulence • Understand the role of multi-stage ignition in high pressure diesel jet flames • High-fidelity geometrically faithful simulation of the relevant in-cylinder processes in a low temperature reactivity controlled compression ignition (RCCI) internal combustion engine that is more thermodynamically favorable than existing engines, with potential for groundbreaking efficiencies yet limiting pollutant formation • Develop DNS and hybrid DNS-LES adaptive mesh refinement solvers for transforming combustion science and technology through capable exascale • Prediction of the relevant processes -- turbulence, mixing, spray vaporization, ignition and flame propagation, and soot/radiation -- in an RCCI internal combustion engine will become feasible

Lifted jet flame showing the separation of low-temperature reactions (yellow to red) and the high-temperature flame (blue to green)

Exascale Computing Project PI: Jacqueline Chen (SNL) 85 Transforming Combustion Science and Technology with Exascale Simulations
Exascale Challenge Problem
• First-principles (DNS) and near-first-principles (DNS/LES hybrids) AMR-based technologies to advance understanding of fundamental turbulence-chemistry interactions in device-relevant conditions
• High-fidelity geometrically faithful simulation of the relevant in-cylinder processes in a low temperature reactivity controlled compression ignition (RCCI) internal combustion engine that is more thermodynamically favorable than existing engines, with potential for groundbreaking efficiencies yet limiting pollutant formation
• Also demonstrate technology with hybrid DNS/LES simulation of a sector from a gas turbine for power generation burning hydrogen-enriched natural gas.
• High-fidelity models will account for turbulence, mixing, spray vaporization, low-temperature ignition, flame propagation, soot/radiation, non-ideal fluids
Applications & S/W Technologies
• Applications: S3D, LMC, Pele, PeleC, PeleLM
• Software Technologies Cited: C++, UPC++; MPI, OpenMP, GASNet, OpenShmem; Kokkos, Legion, Perilla; BoxLib AMR, Chombo AMR, PETSc, Hypre; HDF5; IRIS
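The AMR strategy above refines the mesh only where the solution demands it, for example where flame-front gradients are steep. A minimal "tag cells whose gradient exceeds a threshold" sketch in plain C++ follows; the actual tagging and regridding machinery lives in the BoxLib/Chombo AMR frameworks, so this is only a schematic of the refinement criterion.

// Minimal AMR-style refinement tagging: flag cells where the temperature
// gradient exceeds a threshold, as one would around a flame front.
// Schematic of the criterion only -- not BoxLib/Chombo code.
#include <cmath>
#include <cstdio>
#include <vector>

int main() {
    const int nx = 100;
    const double dx = 1.0 / nx;
    std::vector<double> T(nx);
    for (int i = 0; i < nx; ++i)     // synthetic "flame front" temperature profile
        T[i] = 300.0 + 1500.0 / (1.0 + std::exp(-(i * dx - 0.5) / 0.02));

    const double grad_threshold = 5000.0;           // K per unit length
    std::vector<int> tag(nx, 0);
    for (int i = 1; i < nx - 1; ++i) {
        double grad = (T[i + 1] - T[i - 1]) / (2.0 * dx);
        if (std::fabs(grad) > grad_threshold) tag[i] = 1;   // refine here
    }

    int ntagged = 0;
    for (int t : tag) ntagged += t;
    std::printf("%d of %d cells tagged for refinement\n", ntagged, nx);
    return 0;
}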

Risks and Challenges
• Embedded boundary (EB) multigrid performance
• Effective AMR for challenge problem demos
• Accurate DNS/LES coupling
• Maturity of Eulerian spray models
• Effective EB load-balance strategies
• Performance portability of chemistry DSL
• Performant task-based particle implementation that is interoperable with BoxLib
Development Plan
Y1: Initial release of Pele; report on baseline performance on KNL; simulation of turbulent premixed flame with non-ideal fluid behavior
Y2: Verification of Lagrangian spray implementation in compressible solver; annual end-of-year Pele performance benchmarks
Y3: Benchmarking of linear solvers on Summit; simulation of turbulent flame with complex geometry; annual end-of-year Pele performance benchmarks
Y4: Low Mach simulation of gas turbine sector with real geometry

Exascale Computing Project PI: Jacqueline Chen (SNL) 86 Software Technology Requirements Combustion (SNL)

• Programming Models and Runtimes Fortran (1), C++/C++17 (1), MPI (1), OpenMP (1), CUDA (1), TiledArrays (1), OpenCL (2), Legion/Regent (1), GASNet (1), Tida (1), OpenACC (2), PGAS (1), Kokkos (2), UPC/UPC++ (1), Perilla (2), OpenShmem (2), Boost (3), Thrust (3) • Tools CMake (1), Git (1), GitLab (1), PAPI (2), DDT (1), VTune (2), Jenkins (1) • Mathematical Libraries, Scientific Libraries, Frameworks BoxLib (1), HPGMG (1), FFTW, Sundials (1), Tida (1), Hypre (2), PETSc (1), BLAS/PBLAS (3)

Requirements Ranking 1. Definitely plan to use 2. Will explore as an option Exascale Computing Project 3. Might be useful but no concrete plans 87 Software Technology Requirements Combustion (SNL)

• Data Management and Workflows MPI-IO (1), HDF5(1), ADIOS (3) • Data Analytics and Visualization Visit (1), Paraview (1), VTK (3), FFTW (3), Dharma (3)

Requirements Ranking 1. Definitely plan to use 2. Will explore as an option Exascale Computing Project 3. Might be useful but no concrete plans 88 Cloud-Resolving Climate Modeling of the Earth's Water Cycle

• Cloud-resolving Earth system model with throughput necessary for multi-decade, coupled high resolution climate simulations • Target substantial reduction of major systematic errors in precipitation with realistic / explicit convective storm treatment • Improve ability to assess regional impacts of climate change on the water cycle that directly affect multiple sectors of the U.S. and global economies (agriculture & energy production) • Implement advanced algorithms supporting a super-parameterization cloud-resolving model to advance climate simulation and prediction • Design super-parameterization approach to make full use of GPU-accelerated systems, using performance-portable approaches, to ready the model for capable exascale

Hurricane simulated by the ACME model at the high resolution necessary to simulate extreme events such as tropical cyclones

Exascale Computing Project PI: Mark Taylor (SNL) 89 Cloud-Resolving Climate Modeling of Earth’s Water Cycle Summary example of an application project development plan

Exascale challenge problem
• Earth system model (ESM) with throughput needed for multi-decadal coupled high-resolution (~1 km) climate simulations, reducing major systematic errors in precipitation via explicit treatment of convective storms
• Improve regional impact assessments of climate change on the water cycle, e.g., influencing agriculture/energy production
• Integrate cloud-resolving GPU-enabled convective parameterization into the ACME ESM using the Multiscale Modeling Framework (MMF); refactor key ACME model components for GPU systems
• ACME ESM goal: fully weather-resolving atmosphere / cloud-resolving superparameterization, eddy-resolving ocean/ice components, throughput (5 SYPD) enabling 10–100 member ensembles of 100 year simulations
Applications
• ACME Earth system model: ACME-Atmosphere
• MPAS (Model for Prediction Across Scales)-Ocean (ocean)
• MPAS-Seaice (sea ice); MPAS-Landice (land ice)
• SAM (System for Atmospheric Modeling)
Software technologies
• Fortran, C++, MPI, OpenMP, OpenACC
• Kokkos, Legion
• PIO, Trilinos, PETSc
• ESGF, Globus Online, AKUNA framework

Risks and challenges
• Insufficient LCF allocations
• Obtaining necessary GPU throughput on the cloud-resolving model
• Cloud-resolving convective parameterization via multi-scale modeling framework does not provide expected improvements in water cycle simulation quality
• Global atmospheric model cannot obtain necessary throughput
• MPAS ocean/ice components not amenable to GPU acceleration
Development Plan
• Y1: Demonstrate ACME-MMF model for Atmospheric Model Intercomparison Project configuration; complete 5 year ACME-MMF simulation with active atmosphere and land components at low resolution and ACME atmosphere diagnostics/metrics
• Y2: Demonstrate ACME-MMF model with active atmosphere, land, ocean and ice; complete 40 year simulation with ACME coupled group water cycle diagnostics/metrics
• Y3: Document GPU speedup in performance-critical components: Atmosphere, Ocean and Ice; compare SYPD with and without using the GPU
• Y4: ACME-MMF configuration integrated into the ACME model; document highest resolution able to deliver 5 SYPD; complete 3 member ensemble of 40 year simulations with all active components (atmosphere, ocean, land, ice) with ACME coupled group diagnostics/metrics
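In the Multiscale Modeling Framework ("super-parameterization") referenced above, each coarse global-model column embeds a small cloud-resolving model that is sub-cycled within one global time step and returns an averaged tendency to the coarse state. The plain C++ sketch below shows that control structure only; the placeholder embedded model and all numbers are assumptions, not ACME or SAM numerics.

// Structural sketch of super-parameterization: for every coarse column, an
// embedded fine-scale model is sub-cycled over the global time step and its
// mean effect is returned to the coarse state. Schematic only.
#include <cstdio>
#include <vector>

// Placeholder embedded model: relax fine-scale values toward the coarse value
// and return the mean change accumulated over the global step (a stand-in for
// convective tendencies computed by a real cloud-resolving model).
double embedded_crm_step(double coarse_T, int nsub, double dt_sub) {
    std::vector<double> fine(32, coarse_T + 1.0);   // perturbed fine columns
    double mean_change = 0.0;
    for (int s = 0; s < nsub; ++s)
        for (double& f : fine) {
            double dT = -0.1 * (f - coarse_T) * dt_sub;   // toy relaxation
            f += dT;
            mean_change += dT / fine.size();
        }
    return mean_change;
}

int main() {
    const int ncolumns = 8, nsub = 20;
    const double dt_global = 1200.0, dt_sub = dt_global / nsub;   // seconds
    std::vector<double> T(ncolumns, 288.0);         // coarse column temperatures, K

    // Columns are independent, which is what makes the MMF approach
    // well suited to GPU acceleration.
    for (int step = 0; step < 3; ++step)
        for (double& Tc : T)
            Tc += embedded_crm_step(Tc, nsub, dt_sub);

    std::printf("column 0 temperature after 3 steps: %.3f K\n", T[0]);
    return 0;
}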

Exascale Computing Project PI: Mark Taylor (SNL) 90 Software Technology Requirements Climate (ACME)

• Programming Models and Runtimes 1. Fortran, C++/C++17, Python, C, MPI, OpenMP, OpenACC, Globus Online 2. Kokkos, Legion/Regent 3. Argobots, HPX, PGAS, UPC/UPC++ • Tools 1. LLVM/Clang, JIRA, CMake, git, ESGF 2. TAU, GitLab 3. PAPI, ROSE, HPCToolkit • Mathematical Libraries, Scientific Libraries, Frameworks 1. Metis 2. MOAB, Trilinos, PETSc 3. Dakota, Sundials, Chaco

Requirements Ranking 1. Definitely plan to use 2. Will explore as an option 3. Might be useful but no concrete plans Exascale Computing Project 91 Software Technology Requirements Climate (ACME)

• Data Management and Workflows 1. MPI-IO, HDF, PIO 2. Akuna 3. ADIOS • Data Analytics and Visualization 1. VTK, Paraview, netCDF

• System Software

Requirements Ranking 1. Definitely plan to use 2. Will explore as an option Exascale Computing Project 3. Might be useful but no concrete plans 92