Parareal in Julia with Singularity

Fachhochschule Aachen Campus Jülich Department: Medizintechnik und Technomathematik Degree program: Angewandte Mathematik und Informatik Parareal in Julia with Singularity Masterthesis from Isabel Heisters Jülich, December 2020 Declaration of independence This work is made and written by myself. No other sources and tools have been used than those indicated. Place and date Signature The thesis was supervised by: First examiner: Prof. Dr. Johannes Grotendorst Second examiner: Dr. Robert Speck This thesis was done at : Jülich Supercomputing Centre Forschungszentrum Jülich GmbH Abstract In this thesis a Julia version of the parallel-in-time method Parareal is in- troduced. Parareal decomposes the time as an approach for parallelization. Parareal is an hierarchical, iterative algorithm that uses a coarse, cheap integrator to propagate information quickly forward in time in order to provide initial values for the parallelized original time integration scheme. The problems here used to test the Parareal algorithm are the Lorenz equation, the heat equation and the Allen-Cahn equation. Julia is a programming language that was specifically designed to be used for numerical applications and parallelization. Julia is becoming more popular due to the fact that it is easier to implement than C but has a better runtime than Python. Julia is a new language and not available on most host systems. Singularity is a container solution to create the necessities for scientific application-driven workloads. By using a container the user can configure the environment in which the application can run independently of the host system and its software specifications. This thesis shows how a Singularity container for the Parareal algorithm implemented in Julia can be built. The container is portable to different hosts like HPC and cloud systems without having a runtime overhead in comparison to the runtime without a container. V Zusammenfassung In dieser Arbeit wird der Parareal-Algorithmus als Werkzeug zur Paralleli- sierung vorgestellt. Parareal zerlegt die Zeit als Ansatz zur Parallelisierung. Parareal ist ein hierarchischer, iterativer Algorithmus, der einen günstigen In- tegrator verwendet, um Informationen schnell in der Zeit vorwärts zu senden, um Anfangswerte für das parallelisierte ursprüngliche Zeitintegrationsschema bereitzustellen. Die Probleme, die hier zum Testen des Parareal-Algorithmus verwendet werden, sind die Lorenz-Gleichung, die Wärmeleitungsgleichung und die Allen-Cahn-Gleichung. Julia ist eine Programmiersprache, die speziell für numerische Anwendungen und Parallelisierung entwickelt wurde. Julia wird immer beliebter, da sie einfacher zu implementieren ist als C, aber eine bes- sere Laufzeit als Python hat. Julia ist eine neue Sprache und auf den meisten Host-Systemen nicht verfügbar. Singularity ist eine Containerlösung, mit dem Ziel die Vorraussetzung für wissenschaftliche Projekte zu schaffen. Durch die Verwendung eines Containers kann der Benutzer die Umgebung konfigurieren, in der die Anwendung unabhängig vom Hostsystem und dessen Softwarespe- zifikationen laufen kann. Diese Arbeit zeigt, wie ein Singularity-Container für den in Julia implemen- tierten Parareal-Algorithmus gebaut werden kann. Der Container ist auf ver- schiedene Hosts wie HPC- und Cloud-Systeme portierbar, ohne einen Laufzeit- Overhead im Vergleich zur Laufzeit ohne Container zu haben. VI Contents 1. Motivation1 2. The Parareal algorithm3 2.1. Classification in literature . .3 2.2. Mathematical approach . .4 2.3. Parallelization . .7 3. A testbed 11 3.1. Preliminary remarks . 11 3.2. Lorenz system . 13 3.3. Heat equation in 1D with periodic boundary conditions . 15 3.4. Allen-Cahn Equation . 18 4. Containerization 21 4.1. Introduction to virtualization . 21 4.2. Singularity . 24 4.3. Example: Singularity container for Julia and OpenMPI . 26 5. Testing with Singularity 31 5.1. Singularity on Jureca . 31 5.2. Portability to other hosts . 34 5.3. Singularity with a different MPI version . 36 6. Summary and outlook 39 A. Appendix 41 A.1. Singularity Definition File . 41 Bibliography 48 VII List of Figures 2.1. Visualization of the Parareal algorithm with c=1 . .7 2.2. Visualization of the Parareal algorithm with c=2 . .9 3.1. Solution of the Lorenzsystem over time t = (0, 100) with stepsize ∆t = 0.01 and explicit Euler as solver . 13 3.2. Error and Iteration of the Lorenz system over time solved with Euler as coarse and fine solver with nf = 10 for different nc run on 8 cores . 14 3.3. Runtime of the Lorenz equation with different number of cores and number of time steps nt .................... 14 3.4. Error of the heat equation over nx and IMEX-Euler method as coarse and fine solver for different nc and nf = 10 ........ 16 3.5. Number of iterations of the heat equation with periodic boundry conditions over time and IMEX-Euler method as coarse and fine solver with nf = 10 for different nc and nx = 256 ........ 17 3.6. Runtime of the heat equation for nx = 256 with different number of cores and number of time steps nc and nf = 10 ........ 17 3.7. Calculated and exact Radius of the Allen-Cahn equation with IMEX-Euler and stabilized IMEX-Euler . 19 3.8. Calculations with Parareal of the radius, error of the radius and iteration number against time with stabilized IMEX-Euler for different nc, different nx and nf = 10 ............... 20 3.9. Runtime of the Allen-Cahn equation for nx = 128 with different number of cores and number of time steps nt and . 20 5.1. Run time in seconds of the Lorenz equation run on Jureca with Open MPI comparing a native run and a run with a container . 31 5.2. Run time in seconds of the heat equation run on Jureca with Open MPI comparing a native run and a run with a container . 32 5.3. Run time in seconds of the Allen-Cahn equation run on Jureca with Open MPI comparing a native run and a run with a container 33 5.4. Run time in seconds of the Allen-Cahn equation run on Jureca on multiple nodes with Open MPI comparing a native run and a run with a container . 33 5.5. Run time in seconds of the Allen-Cahn equation run on Jusuf with Open MPI compared to the runtime on Jureca run with a container . 35 VIII 5.6. Run time in seconds of the Allen-Cahn equation run on Jureca with Intel MPI comparing a native run and a run with a container . 37 5.7. Run time in seconds of the Allen-Cahn equation run on Jureca on multiple nodes with Intel MPI comparing a native run and a run with a container . 38 5.8. Run time in seconds of the Allen-Cahn equation run on Jureca on multiple nodes comparing Open MPI and Intel MPI as well a native run and a run with a container . 38 IX List of Tables 5.1. Runtime in seconds of the Allen-Cahn equation with Parareal and IMEX Euler as integrator, nc = 128 and nf = 10 compared for different hosts . 36 X Listings 4.1. Header of the definition file . 27 4.2. Environment of the definition file of the Julia Open MPI container 27 4.3. Installation of Julia into the container . 28 4.4. Installation of Open MPI into the Container . 29 4.5. Binding MPI into Julia . 30 4.6. Building of Julia packages . 30 5.1. Specifications of the container with Intel MPI . 37 A.1. Definition file Open MPI . 41 A.2. Definition file Intel MPI . 44 XI 1. Motivation Classical numerical methods to solve initial value problems run serial in time, which means the time is divided into individual time steps and one step is pro- cessed after the next. To speed up solving the problems mostly parallelization in the spatial domain is used especially since supercomputers are easier to use parallelization as a tool to reduce the run time is very common. The fast development of supercomputers leads to limitations where more computing power cannot save you run time. An idea to solve this problem is a parallelization in the time domain. Most of the time when working on parallelization C or Python is used. C is popular for achieving a high execution speed. In Python on the other hand it is easy to write code. A new language that is trying to combine these two ben- efits is Julia. Julia is a programming language that was specifically designed to be used for numerical applications and high-performance computing. Ju- lia [Bez+14] is a dynamic high-performance programming language. The first stable open-source version was released in February 2012, the development of the language began in 2009. Julia is mainly used in scientific computing. The language Julia is in contrast to other programming languages almost ex- clusively written in Julia itself. Only basic functions, such as integer operations, for loops, recursion or float operations, use C operations or work with C structures. Julia works with a Just-in-Time (JIT) compiler. The code is converted to machine code during compilation. Just-in-time compilation dif- fers from so-called Ahead-of-Time (AOT) compilations in that these programs compile during run time, whereas AOT compilers compile before run time. The parser, which converts the Julia code to machine code, is implemented in Scheme [Dyb09] and is converted to optimized machine code using the LLVM compiler framework [LA04], a virtual machine. For this purpose, the framework uses the compiler backend of Clang for C/C++, which provides excel- lent auto-SIMD vectorization and can lead to high performance. This happens even though Julia is a dynamic language since this usually only occurs in static programming languages. Julia was also developed to make parallel implemen- tation, whether shared memory computing or distributed memory computing, easier.

Parareal in Julia with Singularity

MPI–Openmp Algorithms for the Parallel Space–Time Solution of Time Dependent Pdes

Numerical Investigation of Parallel-In-Time Methods for Dominantly Hyperbolic Equations

A Parareal Algorithm for Optimality Systems Martin Gander, Félix Kwok, Julien Salomon

Asynchronous Iterations of Parareal Algorithm for Option Pricing Models Frédéric Magoulès, Guillaume Gbikpi-Benissan, Qinmeng Zou

Θ-Parareal Schemes

Accelerating Chemical Simulation Through Model Modification

Modified Parareal Method for Solving the Two-Dimensional Nonlinear

A Hybrid Parareal Spectral Deferred Corrections Method Michael L.Minion

Real Time Computing with the Parareal Algorithm

Parareal Neural Networks Emulating a Parallel-In-Time Algorithm∗

A Parareal Algorithm for Optimality Systems Martin Gander, Félix Kwok, Julien Salomon

Shared Memory Pipelined Parareal