Solving Pdes in Space-Time

Solving PDEs in Space-Time: 4D Tree-Based Adaptivity, Mesh-Free and Matrix-Free Approaches Masado Ishii Milinda Fernando Kumar Saurabh University of Utah University of Utah Iowa State University Salt Lake City, Utah Salt Lake City, Utah Ames, Iowa Biswajit Khara Baskar Hari Sundar Iowa State University Ganapathysubramanian University of Utah Ames, Iowa Iowa State University Salt Lake City, Utah Ames, Iowa ABSTRACT KEYWORDS Numerically solving partial diferential equations (PDEs) remains a 4D, space-time adaptive, sedectree, mesh-free, matrix-free, fnite compelling application of supercomputing resources. The next gen- element method, distributed memory parallel eration of computing resources – exhibiting increased parallelism and deep memory hierarchies– provide an opportunity to rethink ACM Reference Format: Masado Ishii, Milinda Fernando, Kumar Saurabh, Biswajit Khara, Baskar how to solve PDEs, especially time dependent PDEs. Here, we con- Ganapathysubramanian, and Hari Sundar. 2019. Solving PDEs in Space- sider time as an additional dimension and simultaneously solve for Time: 4D Tree-Based Adaptivity, Mesh-Free and Matrix-Free Approaches. In the unknown in large blocks of time (i.e. in 4D space-time), instead The International Conference for High Performance Computing, Networking, of the standard approach of sequential time-stepping. We discretize Storage, and Analysis (SC ’19), November 17–22, 2019, Denver, CO, USA. ACM, the 4D space-time domain using a mesh-free kD tree construction New York, NY, USA, 15 pages. https://doi.org/10.1145/3295500.3356198 that enables good parallel performance as well as on-the-fy construction of adaptive 4D meshes. To best use the 4D space-time mesh adaptivity, we invoke concepts from PDE analysis to establish 1 INTRODUCTION rigorous a posteriori error estimates for a general class of PDEs. We We present kD tree-based parallel algorithms for solving a general solve canonical linear as well as non-linear PDEs (heat difusion, class of partial diferential equations (PDEs) using a space-time- advection-difusion, and Allen-Cahn) in space-time, and illustrate adaptive approach. This approach is primarily motivated by the ne- the following advantages: (a) sustained scaling behavior across a cessity of designing computational methodologies that leverage the larger processor count compared to sequential time-stepping ap- availability of very large computing clusters (exascale and beyond). proaches, (b) the ability to capture “localized” behavior in space and For evolution problems, the standard approach of decomposing the time using the adaptive space-time mesh, and (c) removal of any spatial domain is a powerful paradigm of parallelization. However, time-stepping constraints like the Courant-Friedrichs-Lewy (CFL) for a fxed spatial discretization, the efciency of purely spatial condition, as well as the ability to utilize spatially varying time-steps. domain decomposition degrades substantially beyond a threshold We believe that the algorithmic and mathematical developments – usually tens of thousands of processors – which makes this ap- along with efcient deployment on modern architectures shown proach unsuitable on next generation machines. To overcome this in this work constitute an important step towards improving the barrier, a natural approach is to consider the time domain as an scalability of PDE solvers on the next generation of supercomputers. additional dimension and simultaneously solve for blocks of time, instead of the standard approach of sequential time-stepping. This approach of solving for large blocks of space-time is one CCS CONCEPTS of several promising approaches to time-parallel integration ap- • Computing methodologies → Massively parallel algorithms; proaches that have been developed over the past century, but which Distributed algorithms; • Applied computing → Engineering. are gaining increasing attention due to the availability of appro- priate computing resources [17]. Broadly one can consider three types of parallelization approaches to solving a space-time problem. The frst type of methods explicitly parallelizes only over time, and leaves spatial parallelism (and spatial adaptivity) undefned [10, 25]. Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed These methods may also be considered as shooting methods [17]. for proft or commercial advantage and that copies bear this notice and the full citation The second type of methods explicitly parallelizes over space, and on the frst page. Copyrights for components of this work owned by others than ACM leaves temporal parallelism (and temporal adaptivity) undefned. must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specifc permission and/or a These methods include wave form relaxation methods that attempt fee. Request permissions from [email protected]. to reconcile the solution at spatial boundaries between space-time SC ’19, November 17–22, 2019, Denver, CO, USA blocks [17]. The third type of methods explicitly targets parallelism © 2019 Association for Computing Machinery. ACM ISBN 978-1-4503-6229-0/19/11...$15.00 (and adaptivity) in space and time. The current work seeks to ad- https://doi.org/10.1145/3295500.3356198 vance type three methods. In the more narrow context of fnite 1 SC ’19, November 17–22, 2019, Denver, CO, USA Ishii and Fernando, et al. leveraging novel architectures that exhibit extreme parallelism and deep memory hierarchies. In addition to this obvious advantage, solving in space-time provides the following advantages: • Allows natural incorporation of a posteriori error estimates for space-time mesh adaptivity. This has several additional tangible benefts in the context of computational overhead. For evolution problems – including wave equations, and problems involving moving interfaces like bubbles and shocks – that exhibit “localized” behavior in space and time, solving in blocks of space- time that are locally refned to match the local behavior provides substantial computational gain [7]. An illustration of this advantage (in 2D space +1D time for illustration purposes) is shown in Fig. 1(B), where the refned mesh is localized along the regions of high gradient of the solution. • Provides easy access to the full time history of the solution which is essential for the solution of design/inverse problems involving adjoints [15, 16] Removes any time-stepping constraints (like the CFL condition) Figure 1: (A) Conventional approaches to computationally solv- • on the numerical approach as the solution across all time-steps ing evolution equations rely on marching sequentially in time. This are simultaneously solved. This concept is illustrated in Fig. 1(B). limits parallel scalability to the spatial domain. In contrast, we The temporal discretization far away from the oscillating source present an approach for simultaneously solving for large blocks is 25 times larger than the smallest temporal discretization. This of space-time. This approach not only exhibits good scalability, but large variation in time step has no impact on numerical stability, also provides improved temporal convergence (proved via es- a priori which is an advantage over conventional sequential time stepping timates) as well as localized refnement in space and time based on a approaches. error estimates. (B) Illustrative results in 2 +1 posteriori D space D time Finally, considering time as an additional dimension allows us to dimensions (i.e. 3D space-time) for a rotating heat source problem. • harness the benefts of using higher order basis functions to get Notice that the mesh is refned in space-time only in regions of inter- high order accurate temporal schemes for no additional imple- est. This not only guarantees accurate solutions in space and time, mentation cost (since these basis functions are already available but is also substantially more efcient than the strategy of taking for the spatial discretization). We derive theoretical esti- very small time steps necessary to guarantee temporal accuracy in a priori mates of convergence with basis order and mesh size, and show the sequential approach. that temporal error scales as a power of the mesh size in space α time ju − ue j2 ≤ Ch , where α = order of basis function + 1. Fig. 2 computationally illustrates this point. We solve a 3D tran- element methods, early work on type three methods was consid- sient problem in space-time on a 4D mesh. Using cubic basis ered by Hughes and coworkers [20, 22], Tezduyar et al. [35], and functions is equivalent to a multi-step fourth order Runge-Kutta Potanza and Reddy [28], while variations on this theme have re- scheme for the sequential time stepping problem. cently been explored by several groups [5, 6, 26, 27, 29, 37]. Additionally, instead of using a traditional mesh-based abstrac- Our preliminary work [9] using the Finite Element Method (FEM) tion for performing fnite element (FEM) computations, we design a indicates that the third approach is particularly promising (see new mesh-free abstraction on adaptive space-time meshes with the Fig. 1) as it provides a natural way to integrate mathematical anal- objective of achieving high performance and scalability on current ysis including construction of fnite element error estimates (both and future architectures with extreme levels of parallelism and

Solving Pdes in Space-Time

A Fast, Verified, Cross-Platform Cryptographic Provider

Extracting and Mapping Industry 4.0 Technologies Using Wikipedia

Best Practices HPC and Introduction to the New System FRAM

Intel Technology Platform for HPC Intel® Xeon® and Intel® Xeon Phi™ Processor Update

Arhitectura Procesoarelor X86

Computer-Aided Cryptography

Evaluation of Programming Models for Manycore And/Or Heterogeneous

Evaluation of Programming Models for Manycore and / Or Heterogeneous Architectures for Monte Carlo Neutron

Sok: Computer-Aided Cryptography

Výpočetní Jednotky Procesorů Poslední Generace a Jejich Využití.: Bakalárska Práca

The Lesser Work Isbn 978-1-68474-073-4

Optimization of Supersingular Isogeny Cryptography for Deeply Embedded Systems Jeffrey Denton Calhoun University of New Mexico - Main Campus