Frameworks, Algorithms and Scalable Technologies for Mathematics (FASTMath)
SciDAC Institute First Year Progress Report

Principal Investigator:
Lori Diachin
Lawrence Livermore National Laboratory
Livermore, CA 94551

Senior Investigators:
Argonne National Laboratory: Mihai Anitescu, Lois McInnes, Todd Munson, Barry Smith, Tim Tautges
Lawrence Berkeley National Laboratory: Ann Almgren, John Bell, Phil Colella, Sherry Li, Esmond Ng, Brian Van Straalen, Chao Yang
Lawrence Livermore National Laboratory: Milo Dorr, Rob Falgout, Jeff Hittinger, Mark Miller, Carol Woodward, Ulrike Yang
Rensselaer Polytechnic Institute: Mark Shephard, Onkar Sahni, Seegyoung Seol
Sandia National Laboratories: Karen Devine, Vitus Leung, Glen Hansen, Jonathan Hu, Siva Rajamanickam, Andy Salinger
Columbia University: Mark Adams
Southern Methodist University: Dan Reynolds
UC Berkeley: Jim Demmel
University of British Columbia: Carl Ollivier-Gooch

Contents
1 FASTMath Overview
2 Executive Summary of Progress to Date
3 FASTMath Technologies: First Year Progress and Plans
  3.1 Tools for Problem Discretization
    3.1.1 Structured grid technologies
    3.1.2 Unstructured grid technologies
    3.1.3 Particle methods
    3.1.4 Time discretization
  3.2 Tools for Solution of Algebraic Systems
    3.2.1 Iterative solution of linear systems
    3.2.2 Direct solution of linear systems
    3.2.3 Nonlinear systems
    3.2.4 Eigensystems
    3.2.5 DVI methods
  3.3 High-Level Integrated Technologies
    3.3.1 Mesh/solver interactions
    3.3.2 Mesh-to-mesh coupling methods
    3.3.3 Full analysis codes and UQ processes using unstructured grid technologies
    3.3.4 Software Strategies
4 Application Interactions
  4.1 SciDAC Application Partnerships
  4.2 NNSA ParaDiS Dislocation Dynamics
5 Team Interactions and Outreach
6 Summary
References
Appendix A: Institutional Statements of Work
Appendix B: Scientific Application Partnerships
Appendix C: FASTMath Publications
Appendix D: FASTMath Presentations

1 FASTMath Overview

As the complexity of computer architectures and the range of physical phenomena that can be numerically simulated for important DOE applications continue to grow, application scientists have two fundamental challenges to overcome. First, they must continue to improve the quality of their simulations by increasing the accuracy and fidelity of the solution and improving the robustness and reliability of both their software and their algorithms. Second, they must adapt their computations to make effective use of the high-end computing facilities being acquired by DOE over the next five years. This challenge will necessitate million-way parallelism and implementations that are efficient on multi-/many-core nodes.

The FASTMath SciDAC Institute is helping DOE application scientists address both of these challenges by focusing on the interactions among mathematical algorithms, software design, and computer architectures. FASTMath work is organized around the following three broad topical areas:

1. Tools for Problem Discretization
   • Structured mesh capabilities: block structured adaptive mesh refinement, embedded boundary methods, high-order discretization
   • Unstructured mesh capabilities: complex geometry representations, adaptive mesh refinement, dynamic partitioning, mesh quality improvement, high-order discretization
   • Particle techniques: particles with both structured and unstructured meshes
   • Time discretization methods: implicit/explicit methods, symplectic, multiscale, backward differentiation, generalized linear, differential algebraic equations, error control

2. Tools for Solution of Algebraic Systems
   • Linear solvers: geometric and algebraic multigrid, domain decomposition, Krylov iterative techniques, ILU and LU factorizations
   • Nonlinear solvers: Newton-based with various globalization schemes, fixed-point methods with accelerations, and nonlinear multilevel methods
   • Eigensolvers: Krylov and non-Krylov subspace methods, optimization-based techniques
   • Variational inequality solvers: Newton-based active set methods, semi-smooth methods

3. High-Level Integrated Technologies
   • Mesh-solver interactions: linear and nonlinear solvers with structured and unstructured meshes
   • Coupling technologies: mesh-to-mesh and particle-to-mesh methods
   • Unstructured mesh toolchain: inline mesh adaptation in application analysis codes and UQ processes

One of the key challenges facing the scientific computing community is the shift to multi-/many-core nodes and million-way parallelism. Thus, a pervasive theme in our work is understanding the most effective ways to implement our algorithms at scale on these architectures, with particular emphasis on hybrid programming models, architecture-aware partitioning and data layout techniques, and communication-reducing algorithms.

This report documents the progress of the FASTMath team in each of our core technology areas. In Section 2, we provide a high-level executive summary of progress toward our stated goals; a more detailed description is given in Section 3. An additional expected activity that turned out to require considerable effort was the development of Scientific Application Partnership proposals. The FASTMath team was heavily engaged in this activity throughout the first year and successfully partnered on 15 of the 18 funded activities. Our nonlinear solvers SUNDIALS team was also engaged in the prototype NNSA activity involving ParaDiS.
A summary of our planned activities with application partners (including the SAPs, ParaDiS, and related DOE SC applications) is given in Section 4. Finally, in Section 6, we provide a high-level description of the future plans of the FASTMath project, focusing primarily on the next 6 to 12 months of work.

2 Executive Summary of Progress to Date

The FASTMath team is on track and making progress with respect to our stated goals. In particular, we have worked extensively on the core technologies: structured and unstructured grids, particle methods, time discretization, linear and nonlinear solvers, eigensolvers, and DVI schemes. We have also focused efforts on the high-level integrated capabilities, particularly mesh-solver interactions, mesh-to-mesh coupling methods, and the integration of the suite of unstructured mesh tools into analysis codes and UQ tool chains.

One of the key products of the FASTMath project is the software that encapsulates the algorithms and expertise of the team. We are currently examining our software development and distribution strategy with the aim of improving the quality, ease of adoption, and integration of the many FASTMath software products, with the ultimate goal of maximizing our impact in the application community. As a first step toward better software usability, we have completed an initial assessment of the download/build processes used in the FASTMath software tool suite.
Table 1: Key accomplishments - Tools for Problem Discretization (Section 3.1)

Structured Mesh Capabilities
  New Math Algorithms:
    • Initial implementations of high-order methods on mapped multiblock grids in Chombo
    • Implemented high order time integrators to match 4th order spatial accuracy
    • New region-based paradigm for AMR in BoxLib (RAMR)
  Petascale Implementation:
    • Parallel optimization of EB compressible flow solver scales to 100K cores
  Software:
    • Grid generation extension for mapped multiblock and EB meshes in Chombo
    • Visualization techniques for EB at large scale (w/ SDAV)
    • New software distribution systems for Chombo and BoxLib
    • New Chombo (3/12) and BoxLib (10/11) software releases

Unstructured Mesh Capabilities
  New Math Algorithms:
    • New boundary smoothing capabilities in Mesquite
    • Parallel adaption of semi-structured boundary layer meshes
    • Hierarchical partitioning algorithms in Zoltan
    • Improved ParMA load balancing algorithms for multiple mesh entity types
    • Using ParMA in predictive load balancing
  Petascale Implementation:
    • Mesquite scales to 125K cores with 90% efficiency
    • MPI+threads geometric partitioner in Zoltan2
    • Architecture-aware data structures in MPI+threads geometric partitioner in Zoltan2
    • MOAB supports GPU mesh kernels (collab w/ SDAV)
    • PHASTA scales to 512K cores
  Software:
    • Initial release of Zoltan2 in Trilinos v11
    • Released MOAB4.5

Particle Techniques
    • Efficient space-filling curve technique for particle sorting to maintain load balance
    • Redesign particle interface in Chombo to improve performance

Time Discretization Methods
    • Working to develop optimized parameters for new multi-rate IMEX solver, ARKode
    • Completed ARKode implementation in SUNDIALS
    • New SUNDIALS version released

In Tables 1, 2, and 3, we give a high-level view of the progress to date in each of the three key FASTMath technology areas.
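Table 1 lists an efficient space-filling curve technique for particle sorting to maintain load balance. The report does not say which curve or data layout is used, so the following is only a minimal sketch of the general idea under stated assumptions: a 2D Morton (Z-order) curve, non-negative particle coordinates, and equal-count chunks per rank. The names `morton_key` and `partition_particles` are hypothetical and are not part of Chombo or any FASTMath package.

```python
def morton_key(ix: int, iy: int, bits: int = 16) -> int:
    """Interleave the low `bits` bits of ix and iy into one Z-order key.

    Assumption for this sketch: ix and iy are non-negative cell indices.
    """
    key = 0
    for b in range(bits):
        key |= ((ix >> b) & 1) << (2 * b)        # x bits go to even positions
        key |= ((iy >> b) & 1) << (2 * b + 1)    # y bits go to odd positions
    return key


def partition_particles(particles, cell_size, num_ranks):
    """Sort (x, y) particles along the curve and cut them into equal chunks.

    Particles that are close in space tend to be close along the curve, so
    each contiguous chunk is spatially compact, and chunk sizes differ by
    at most one particle.
    """
    def key(p):
        x, y = p
        return morton_key(int(x // cell_size), int(y // cell_size))

    ordered = sorted(particles, key=key)
    base, extra = divmod(len(ordered), num_ranks)
    chunks, start = [], 0
    for rank in range(num_ranks):
        size = base + (1 if rank < extra else 0)
        chunks.append(ordered[start:start + size])
        start += size
    return chunks
```

Re-sorting by key after particles move and re-cutting the list keeps the per-rank particle counts even, which is the load-balance property the table entry refers to; a production code would also weigh communication and mesh ownership, which this sketch ignores.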
For each of the areas outlined in Section 1, the stated goals of the FASTMath project can be loosely categorized into three bins: 1) new mathematical developments, 2) implementation developments that target near-term petascale architectures, and 3) robust, interoperable software packages for application use. Tables 1, 2, and 3 reflect

Table 2: Key Accomplishments - Solution of Algebraic Systems (Section 3.2) Tools for Algebraic System Solution New Math Algorithms Petascale Implementation Software Iterative Linear Solvers • New techniques limit stencil • Investigating redundant coarse • Completed Trilinos MueLu growth in Galerkin coarse grid grid solves to decrease open source license correction in AMG communication cost • Working on hypre release 2.9.0b • New structured