

Solving incompressible Navier-Stokes equations on heterogeneous parallel architectures Yushan Wang

To cite this version:

Yushan Wang. Solving incompressible Navier-Stokes equations on heterogeneous parallel architectures. Distributed, Parallel, and Cluster Computing [cs.DC]. Université Paris-Sud - Paris XI, 2015. English. NNT: 2015PA112047. tel-01152623.

HAL Id: tel-01152623 https://tel.archives-ouvertes.fr/tel-01152623 Submitted on 18 May 2015

HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.

Université Paris-Sud

ÉCOLE DOCTORALE 427 : INFORMATIQUE PARIS-SUD

Laboratoire de Recherche en Informatique

THÈSE DE DOCTORAT

INFORMATIQUE

par

Yushan WANG

Solving Incompressible Navier-Stokes Equations on Heterogeneous Parallel Architectures

Date de soutenance: 09/04/2015

Composition du jury:

Directeur de thèse : Marc BABOULIN, Professeur (Université Paris-Sud, Orsay)
Co-directeur de thèse : Olivier LE MAÎTRE, Directeur de Recherche (LIMSI/CNRS, Orsay)

Rapporteurs : Fabienne JÉZÉQUEL, Maître de Conférences (LIP6, Paris) ; Masha SOSONKINA, Professeur (Old Dominion University, USA)
Examinateurs : Abdel LISSER, Professeur (Université Paris-Sud, Orsay) ; Michel KERN, Chargé de Recherche (INRIA, Le Chesnay)
Membre invité : Yann FRAIGNEAU, Ingénieur de Recherche (LIMSI/CNRS, Orsay)

Résumé

Dans cette thèse, nous présentons notre travail de recherche dans le domaine du calcul haute performance en mécanique des fluides. Avec la demande croissante de simulations à haute résolution, il est devenu important de développer des solveurs numériques pouvant tirer parti des architectures récentes comprenant des processeurs multi-cœurs et des accélérateurs. Nous nous proposons dans cette thèse de développer un solveur efficace pour la résolution sur architectures hétérogènes CPU/GPU des équations de Navier-Stokes (NS) relatives aux écoulements 3D de fluides incompressibles.

Tout d'abord nous présentons un aperçu de la mécanique des fluides avec les équations de NS pour fluides incompressibles et nous présentons les méthodes numériques existantes. Nous décrivons ensuite le modèle mathématique, et la méthode numérique choisie qui repose sur une technique de prédiction-projection incrémentale.

Nous obtenons une distribution équilibrée de la charge de calcul en utilisant une méthode de décomposition de domaines. Une parallélisation à deux niveaux combinée avec de la vectorisation SIMD est utilisée dans notre implémentation pour exploiter au mieux les capacités des machines multi-cœurs. Des expérimentations numériques sur différentes architectures parallèles montrent que notre solveur NS obtient des performances satisfaisantes et un bon passage à l'échelle.

Pour améliorer encore la performance de notre solveur NS, nous intégrons le calcul sur GPU pour accélérer les tâches les plus coûteuses en temps de calcul. Le solveur qui en résulte peut être configuré et exécuté sur diverses architectures hétérogènes en spécifiant le nombre de processus MPI, de threads, et de GPUs.

Nous incluons également dans ce manuscrit des résultats de simulations numériques pour des benchmarks conçus à partir de cas tests physiques réels. Les résultats obtenus par notre solveur sont comparés avec des résultats de référence. Notre solveur a vocation à être intégré dans une future bibliothèque de mécanique des fluides pour le calcul sur architectures parallèles CPU/GPU.

Mots clés : équations de Navier-Stokes, méthode de prédiction-projection, calcul haute performance, parallélisation multi-niveaux, calcul sur GPU.

Abstract

In this PhD thesis, we present our research in the domain of high performance software for computational fluid dynamics (CFD). With the increasing demand for high-resolution simulations, there is a need for numerical solvers that can fully take advantage of current manycore accelerated parallel architectures. In this thesis we focus more specifically on developing an efficient parallel solver for the 3D incompressible Navier-Stokes (NS) equations on heterogeneous CPU/GPU architectures.

We first present an overview of the CFD domain along with the NS equations for incompressible fluid flows and existing numerical methods. We describe the mathematical model and the numerical method that we chose, based on an incremental prediction-projection method.

A balanced distribution of the computational workload is obtained by using a domain decomposition method. A two-level parallelization combined with SIMD vectorization is used in our implementation to take advantage of current distributed multicore machines. Numerical experiments on various parallel architectures show that this solver provides satisfactory performance and good scalability.

In order to further improve the performance of the NS solver, we integrate GPU computing to accelerate the most time-consuming tasks. The resulting solver can be configured for running on various heterogeneous architectures by specifying explicitly the numbers of MPI processes, threads and GPUs.

This thesis manuscript also includes simulation results for two benchmarks designed from real physical cases. The computed solutions are compared with existing reference results. The code developed in this work will be the base for a future CFD library for parallel CPU/GPU computations.

Keywords: Navier-Stokes equations, prediction-projection method, Helmholtz solver, Poisson solver, high performance computing, multi-level parallelization, GPU computing.

Acknowledgements

First of all, I would like to thank Marc Baboulin and Olivier Le Maˆıtre for acting as advisors for my thesis. I appreciate very much their help in this multidisciplinary collaborative research work and their availability.

I would also like to thank Yann Fraigneau, with whom I had detailed and fruitful discussions about the code SUNFLUIDH.

I also thank the referees of my thesis, Fabienne Jézéquel and Masha Sosonkina, for their constructive comments on the manuscript.

I express my gratitude to Franck Cappello (Joint-Laboratory on Extreme Scale Com- puting) for funding my visit to Argonne National Laboratory, USA, in August 2013.

I want to thank Jack Dongarra and Stanimire Tomov for giving me access to their computing resources at the Innovative Computing Laboratory (University of Tennessee, USA).

I acknowledge the Texas Advanced Computing Center (TACC) at The University of Texas at Austin for providing HPC resources that have contributed to the research results reported within this thesis.

Finally, I want to express my gratitude to Joël Falcou, Karl Rupp and the colleagues at LRI, who enabled me to progress in my thesis.


Contents

Résumé

Abstract

Acknowledgements

Contents

List of Figures

Introduction

1 Navier-Stokes equations
  1.1 Computational fluid dynamics and Navier-Stokes equations
    1.1.1 Fluid mechanics
    1.1.2 Equations of fluid dynamics
    1.1.3 The dimensionless Navier-Stokes equations
  1.2 Numerical methods for the incompressible Navier-Stokes equations
  1.3 Incremental prediction-projection method
  1.4 Solution of the Helmholtz system
  1.5 Solution of the Poisson equation
  1.6 Spatial discretization
    1.6.1 Staggered mesh
    1.6.2 Spatial discretization for the Navier-Stokes equation
  1.7 Conclusion of Chapter 1

2 Parallel algorithms for solving Navier-Stokes equations
  2.1 Domain decomposition approach
  2.2 Multi-level parallelism
    2.2.1 Shared memory architecture
    2.2.2 Distributed memory architecture
    2.2.3 Combining shared and distributed memory systems
  2.3 General structure of the solver
  2.4 Accelerating the solution of the tridiagonal systems
  2.5 Performance results for Navier-Stokes computations
    2.5.1 Shared memory with pure MPI programming model


    2.5.2 Performance using MPI + OpenMP
    2.5.3 Performance comparison with an iterative method
  2.6 Conclusion of Chapter 2

3 Taking advantage of GPU in Navier-Stokes equations
  3.1 Introduction to GPU computing
  3.2 Using GPU for solving Navier-Stokes equations
    3.2.1 A GPU Helmholtz-like solver
    3.2.2 A GPU Poisson solver
    3.2.3 General structure of the GPU solver
  3.3 Experimental results
    3.3.1 Overview of computational resources
    3.3.2 Performance of the Helmholtz solver
    3.3.3 Performance of the Poisson solver
    3.3.4 Performance of the hybrid CPU/GPU Navier-Stokes solver
  3.4 Conclusion of Chapter 3

4 Simulations of Physical Problems
  4.1 Three dimensional Taylor-Green vortices
    4.1.1 Benchmark settings
    4.1.2 Validation
    4.1.3 Performance analysis
  4.2 Flow around a square cylinder
    4.2.1 Benchmark settings
    4.2.2 Validation
    4.2.3 Performance analysis
  4.3 Conclusion of Chapter 4

A Iterative Methods for Linear Systems
  A.1 Iterative Methods
    A.1.1 Bases of iterative methods
    A.1.2 Jacobi method
    A.1.3 Gauss-Seidel method
    A.1.4 Successive over-relaxation method
  A.2 Multigrid methods

List of Figures

1.1 A page from "On Floating Bodies".
1.2 A free water jet issuing from a square hole into a pool, by Leonardo da Vinci.
1.3 Title page of "Principia", first edition (1687).
1.4 Staggered mesh showing the pressure (black squares at the cells' center) and velocity component unknowns (red and blue squares for the x and y components, respectively).
1.5 Locations of the pressure and velocity unknowns: p in black, u-component in red, v-component in blue. Fictitious cells for the (pressure) boundary conditions are plotted with a dashed line.
1.6 Illustration of the stencils for the discretization of the operators appearing in the equation in two dimensions.

2.1 Example of 3D domain decomposition with a Cartesian topology 4 × 2 × 2.
2.2 2D domain decomposition with order 1 overlapping.
2.3 Ordering of variables in a 2D domain (two subdomains).
2.4 Matrix pattern using ordering from Fig. 2.3.
2.5 Ordering of variables in a 2D domain (two subdomains, interior variables are numbered first).
2.6 Matrix pattern using ordering from Fig. 2.5.
2.7 Stampede system. Image from https://www.tacc.utexas.edu
2.8 Illustration of a single node on Stampede.
2.9 Solver structure.
2.10 The main procedure of the NS solver and the time percentage of each step.
2.11 Read 8 bytes from unaligned memory.
2.12 Thomas algorithm with vectorization.
2.13 Time breakdown for one iteration of the NS solver.
2.14 Strong scalability of the NS solver.
2.15 Weak scaling performance of the Navier-Stokes solver on shared memory.
2.16 Performance of two implementations of the Thomas algorithm. Matrix size = 100. Test carried out using Intel Xeon E5645.
2.17 Performance of the NS solver on the Stampede system.
2.18 Weak scaling of the NS solver (MPI-OpenMP implementation).
2.19 Performance comparison between different Poisson solvers.

3.1 Nvidia GeForce 256.
3.2 Nvidia GeForce 8800 GTX.
3.3 Time breakdown in the Helmholtz equation (Intel Xeon E5645, 2 × 6 cores, 2.4 GHz).


3.4 Illustration of GPU thread, block and grid.
3.5 Assignment of tridiagonal matrix and RHS to thread blocks.
3.6 Time breakdown in the Poisson equation (Intel Xeon E5645, 2 × 6 cores, 2.4 GHz).
3.7 3D domain decomposition along the i = 1 direction.
3.8 Distribution of Q_1^(-1) = {Q_ij}, i = 1,...,p; j = 1,...,p, and s on multiple processors.
3.9 Matrix-matrix multiplication with multiple subdomains.
3.10 Solver structure with GPU computing.
3.11 Performance of the Helmholtz solver using the Thomas algorithm and explicit inverse.
3.12 Performance of the Poisson solver.
3.13 Time for solving the Navier-Stokes equations using a CPU/GPU system (Stampede).
3.14 Parallel speedup for the CPU/GPU Navier-Stokes solver (Stampede).
3.15 "Weak" scaling performance for the Navier-Stokes solver (Stampede).

4.1 Initial status of the Taylor-Green vortices problem.
4.2 Iso-surface of Q = 0.01 of Taylor-Green vortices at different times.
4.3 Dissipation rate at Re = 400, using 64³ and 128³ meshes.
4.4 Dissipation rate at Re = 800, using 128³ and 256³ meshes.
4.5 Comparison of our computation for the dissipation rate at Re = 800 with the computations of (a) Ouzzine [76], (b) Brachet et al. [15] and (c) Gassner and Beck [35].
4.6 The geometry of the 3D laminar flow problem.
4.7 Longitudinal velocity field at different times. Simulation for Re = 100 with total mesh size 320 × 240 × 32.
4.8 Snapshots of the longitudinal velocity field illustrating the structure of the Von Karman street at different Reynolds numbers.
4.9 Transverse component of the vorticity for the flow at Reynolds number 150.
4.10 Transverse component of the velocity for observation points P1 and P2.

A.1 V-cycle multigrid scheme.

Introduction

This PhD thesis is a collaboration between the laboratories LRI and LIMSI, in the framework of the CALIFHA project, funded by Région Île-de-France and Digitéo. The main objective of this project was to take advantage of state-of-the-art parallel architectures (e.g. multicore processors, GPUs) in solving the Navier-Stokes equations for incompressible fluid flows.

The Navier-Stokes equations play a major role in the domain of fluid dynamics since they describe a large class of fluid flows. The existence and smoothness of solutions to the Navier-Stokes equations is one of the seven Millennium Prize Problems listed by the Clay Mathematics Institute. Since analytical solutions are generally out of reach, computing numerical solutions efficiently for these equations becomes essential. There exist many types of Navier-Stokes solvers relying on various numerical methods, based for instance on finite differences [21], finite elements [37, 90], finite volume methods [31] or discontinuous Galerkin methods [84]. The importance of the Navier-Stokes equations also results from their numerous fields of application. The numerical simulation of these equations is used for instance in aircraft design [53] and weather forecasting [45, 75], among other domains. For these application domains, the Navier-Stokes equations are used to model different types of fluid flows. For example, in the modeling of the Earth's atmosphere, the fluid (here the air) is considered compressible, while in the simulation of ocean tides the fluid (water) is considered incompressible.

In this document, we consider the Navier-Stokes equations applied to incompressible fluid flows, based on a finite difference discretization and an operator splitting method [52], the idea behind this operator splitting method being to divide the original boundary value problem into a set of subproblems that are easier to solve. Based on the Helmholtz-Hodge decomposition [99], we split the Navier-Stokes equations into a Helmholtz-like equation for an auxiliary velocity field and a Poisson equation for the auxiliary pressure. The temporal solutions are then computed via successive updates. In our work, we focus

LRI: Laboratoire de Recherche en Informatique, http://www.lri.fr
LIMSI: Laboratoire d'Informatique pour la Mécanique et les Sciences de l'Ingénieur, http://www.limsi.fr
Digitéo: http://www.digiteo.fr

more specifically on solving the sparse linear systems arising from the discretization of the Helmholtz and Poisson equations, which represent the major part of the computational time in our Navier-Stokes solver. Our main objective in designing and implementing our solver was to combine the different paradigms of parallel programming (vectorization through SIMD extensions, MPI message passing, multithreading with OpenMP, GPU programming with CUDA) in order to compute solutions efficiently on current heterogeneous multicore/GPU architectures. This required identifying the most time-consuming kernels in the solver and determining the most appropriate architecture and programming model to obtain the best performance for each kernel. In particular, a major concern in our algorithmic and programming choices was to exploit parallelism as much as possible while minimizing data communication, which is the main limitation to performance in current parallel systems. Our work in developing software for the Navier-Stokes equations will be validated by numerical experiments on state-of-the-art parallel machines, including the Stampede system, ranked #7 in the Top500 list (http://www.top500.org/).

In Chapter 1, we present the numerical method that we use for solving the Navier-Stokes equations. We start with a brief overview of computational fluid dynamics and the Navier-Stokes equations. Then we give the details of the mathematical model used in our 3D Navier-Stokes solver. For this model, we describe the spatial and temporal discretization and the proposed numerical methods. We present the derivation of the Helmholtz-like and Poisson equations as well as their solution methods, such as the alternating direction implicit method [54], partial diagonalization [78] and some iterative methods.

In Chapter 2, we describe the algorithms and implementations of the Navier-Stokes solver for current multicore machines. We divide the computational domain into subdomains, and computations on each subdomain are performed in parallel. We associate each subdomain with an MPI process representing a computational node of the parallel architecture. On each node, the computations are optimized using multithreading techniques. We also apply vectorization techniques to accelerate the solution of the tridiagonal systems, which represents the most time-consuming task in the Helmholtz and Poisson solvers. Performance results are presented using two different parallel machines.

In Chapter 3, we further extend the Navier-Stokes solver to the use of accelerators, namely Graphics Processing Units (GPUs). We designed GPU routines for the Helmholtz and Poisson solvers and obtained satisfactory speedups. Then we added GPU functionalities into the global Navier-Stokes solver, which can adapt automatically to the architecture and the presence of GPUs. From the user's point of view, the Navier-Stokes solver does not need to be modified to run on different architectures. The input

parameters to be specified by the user are the number of MPI processes, the number of OpenMP threads, and possibly the number of GPUs.

In Chapter 4, we present numerical simulations of real physical problems in the domain of fluid dynamics using our hybrid CPU/GPU Navier-Stokes solver. The Taylor-Green vortex problem enables us to evaluate the accuracy of the solver for different Reynolds numbers. In the second benchmark, we simulate the flow around a square cylinder. Iterative methods are used in this case due to the presence of an obstacle in the domain. The computed results for both benchmarks are compared with published results to validate our approach numerically. By testing our solver on these problems, we illustrate that it provides accurate solutions and can be applied to a wide range of applications.

Finally, we give concluding remarks and we propose possible research tracks that deserve further investigations.

Chapter 1

Navier-Stokes equations

In this chapter, we start by giving a brief history of computational fluid dynamics. We then introduce the Navier-Stokes (NS) equations considered in this thesis. We present the numerical method for the solution of the three-dimensional NS equations and finally discuss its discretization in the case of second-order finite difference methods on Cartesian grids.

1.1 Computational fluid dynamics and Navier-Stokes equations

Computational fluid dynamics, abbreviated as CFD, is a branch of fluid mechanics which aims at solving numerically a system of partial differential equations that describes the motion of fluids. The evolution of the CFD field is closely related to the development of modern computer architectures. Although the use of computers did not come into practice until the 1940s, the conceptual beginning of CFD can be traced back as early as 1917, when Lewis Fry Richardson (1881-1953, English mathematician, physicist and meteorologist) pioneered modern mathematical techniques of weather forecasting. One of Richardson's most celebrated achievements is his retroactive attempt to forecast the weather during a single day, 20 May 1910, by direct computation. Richardson's forecast failed dramatically because he did not apply smoothing techniques to the data [67]. When these are applied, Richardson's forecast turns out to be essentially accurate.

However, the real development of CFD started in the 1940s, when Kopal compiled massive tables of the supersonic flow over sharp cones by numerically solving the governing

equations [59]. Since the 1950s, with the improvements achieved in computer hardware, especially in storage and execution speed, CFD began to play an important role in many scientific and engineering applications, including fluid mechanics.

1.1.1 Fluid mechanics

Fluid mechanics is a branch of physics that studies fluids (liquids, gases, and plasmas) and the forces applied to them. Fluid mechanics can be divided into three categories: fluid statics, which concerns the study of fluids at rest; fluid kinematics, which concerns the study of fluids in motion; and finally fluid dynamics, which concerns the study of the effect of forces on fluid flows.

Early fluid mechanics studies can be traced back to ancient Greece. "Any object, wholly or partially immersed in a fluid, is buoyed up by a force equal to the weight of the fluid displaced by the object." This is the famous Archimedes' principle, stated by Archimedes of Syracuse in his treatise on hydrostatics, On Floating Bodies, which is considered to be the first major work in fluid mechanics.

Figure 1.1: A page from “On Floating Bodies”.

Leonardo da Vinci (1452-1519) was the first to visualize the motion of fluid particles in a flow, with his famous sketch (shown in Fig. 1.2) of a free water jet issuing from a square hole into a pool. This is perhaps the world's first use of visualization as a scientific tool to study a turbulent flow. Leonardo wrote (translated by Ugo Piomelli, University of Maryland): "Observe the motion of the surface of the water, which resembles that of hair, which has two motions, of which one is caused by the weight of the hair, the other by the direction of the curls; thus the water has eddying motions, one part of which is due to the principal current, the other to the random and reverse motion."

Sir Isaac Newton (1642-1727), an English physicist and mathematician widely recognized as one of the most influential scientists of all time and as a key figure in the scientific revolution, also played an important role in fluid mechanics.

Figure 1.2: A free water jet issuing from a square hole into a pool, by Leonardo da Vinci.

In his book Principia, Newton formulated the laws of motion, which dominated scientists' views of the physical universe for the next three centuries. He also introduced the notion of Newtonian fluids, in which the viscous stresses arising from the flow are, at every point, linearly proportional to the local strain rate, i.e. the rate of change of its deformation over time [12, 77]. In other words, in a Newtonian fluid the internal forces are proportional to the rates of change of the fluid's velocity vector as one moves away from the point in question in various directions.

Figure 1.3: Title page of “Principia”, first edition (1687).

From the publication of Principia in 1687 to the 1950s, research in fluid mechanics was mainly divided into theoretical and experimental approaches. In theoretical research, scientists focus mainly on the derivation of the governing equations for the flow and on the subsequent solution of these equations, while experimentalists perform measurements on real flows to investigate their behavior and dynamics and to provide material to confirm and validate theories. Before the emergence of computers, the resolution of the governing equations was limited to simple situations for which explicit solutions could be derived analytically, essentially limiting computable predictions to linear situations.

After the 1950s, with the growing availability of computers, the CFD approach started to have an increasing impact on fluid dynamics. Today, computational approaches are used to support theoretical investigations and are considered a substitute for the experimental approach in many situations. This success is mostly due to the continuous increase in the available computational resources and the development of ever more efficient numerical methods dedicated to the resolution of the equations describing flows. Another reason for the success of CFD is its relatively low cost (compared to the experimental approach) and also the possibility to conduct simulations for situations that cannot be physically controlled in a real experiment. Examples of CFD application domains are the calculation of forces on aircraft, the determination of the mass flow rate of petroleum through pipelines, weather forecasting, the simulation of nebulae in interstellar space, the hydrodynamical modeling of nuclear weapons, etc.

1.1.2 Equations of fluid dynamics

In this section, we briefly recall the principles behind the derivation of the fundamental governing equations [12] of fluid dynamics.

We start by summarizing the principal properties of fluids and flows:

• Compressibility. All fluids are compressible to some extent, that is, changes in pressure or temperature result in changes in density. However, in many situations the changes in pressure and temperature are sufficiently small that the changes in density can be neglected. In this case the flow can be modeled as an incompressible flow [27, 106]. For flows of gases, to determine whether to use compressible or incompressible fluid models, the Mach number of the flow is to be evaluated. (In fluid mechanics, the Mach number, named after the Austrian physicist and philosopher Ernst Mach (1838-1916), is a dimensionless quantity representing the ratio of the characteristic fluid velocity to the local speed of sound.) As a rough guide, compressibility effects can be ignored at Mach numbers below approximately 0.3. For liquids, whether the incompressible assumption is valid depends on the fluid properties (specifically the critical pressure and temperature of the fluid) and the flow conditions (how close the actual flow pressure comes to the critical pressure). Acoustic problems always require allowing for compressibility, since sound waves are compression waves involving changes in pressure and density of the medium through which they propagate. In this thesis we only consider flows of incompressible fluids.

• Viscosity. We can find in [68] a simple definition of (dynamic) viscosity: all fluids offer resistance to any force tending to cause one layer to move over another.


Viscosity, often denoted by µ, is the fluid property responsible for this resistance. It is a matter of common experience that, under particular conditions, one fluid offers greater resistance to flow than another. Liquids such as tar, treacle and glycerine cannot be rapidly poured or easily stirred, and are usually spoken of as thick. On the other hand, so-called thin liquids such as water, petrol and paraffin flow much more readily. In fact, for fluids in motion, the ability to transmit a shear force introduces the property of dynamic viscosity. It is found empirically that, for a large class of fluids, the shear stress is directly proportional to the velocity gradient, with the dynamic viscosity as the constant of proportionality. Such fluids are recognized as Newtonian fluids. In this thesis we only consider Newtonian fluids.

• Steady and unsteady flows. Steady-state flow refers to the condition where the fluid properties at a point in the system do not change over time. Otherwise, the flow is said to be unsteady (sometimes called transient). Whether a particular flow is steady or unsteady can depend on the chosen frame of reference. For instance, the laminar flow over a translating sphere can be steady in the frame of reference that is stationary with respect to the sphere. In a frame of reference that is stationary with respect to a background flow, the flow is unsteady.

The partial differential equation formulation of the fluid flow is based on the continuum hypothesis and expresses the fundamental conservation laws of physics. From the microscopic perspective, a fluid consists of molecules which are individually in a state of random motion. By the continuum hypothesis, only the large-scale (macroscopic) motion of these molecules is perceived; therefore the various properties of the fluid in motion are assumed to vary continuously with position and time. Usually, the physical properties of interest are the density, the pressure, the temperature and the velocity (or the momentum).

There are two ways to define a coordinate system to describe a fluid in motion, namely the Eulerian description and the Lagrangian description. In the Eulerian description, the values of the velocity and of the thermodynamical properties are specified at fixed locations in the space-time domain, i.e. they are defined as functions of (x, y, z, t). The alternative Lagrangian description follows individual particles (fluid parcels) along time, their positions and thermodynamical properties being the dependent variables. We shall adopt the Eulerian description throughout this thesis.

In deriving the governing equations of fluid dynamics, it is postulated that mass, momentum and energy are conserved. In order to derive the mathematical formulations of the corresponding conservation laws, we consider a closed control volume V of the

fluid domain, which is fixed in space and time in an Eulerian coordinate system. The boundary of V is an orientable surface S, with the unit vector n normal to S pointing from the inside of V toward the outside.

1.1.2.1 Conservation of mass

The mass conservation states that the rate of accumulation of mass in a volume V is exactly balanced by the mass flux across its boundary S; it expresses as

∂/∂t ∫_V ρ dV + ∫_S ρ (n · u) dS = 0,    (1.1)

where ρ is the density and u the fluid velocity vector. The surface integral in Eq. (1.1) can be converted into a volume integral by Gauss' theorem, hence Eq. (1.1) can be written as

∫_V [ ∂ρ/∂t + ∇ · (ρu) ] dV = 0,    (1.2)

which gives the integral form of mass conservation. From now on we further assume that all functions considered are continuous and sufficiently differentiable. Since V is arbitrary, we must have

∂ρ/∂t + ∇ · (ρu) = 0.    (1.3)

By simple vector analysis, Eq. (1.3) implies

∂ρ/∂t + u · ∇ρ + ρ ∇ · u = 0,    (1.4)

or

Dρ/Dt + ρ ∇ · u = 0,    (1.5)

where Dρ/Dt = ∂ρ/∂t + u · ∇ρ is called the material (or substantial) derivative of ρ. The material derivative is often used in fluid dynamics; it specifies the rate of change of a physical quantity when moving with the fluid flow.

Eqs. (1.3) and (1.5) are often referred to as forms of the continuity equation. If the fluid is incompressible, the density ρ is constant with respect to both space and time, hence Dρ/Dt = 0 and

∇ · u = 0,    (1.6)

which is the incompressibility condition for the flow field.

Footnote (Gauss' divergence theorem): suppose V is a subset of R^n (in the case n = 3, V represents a volume in 3D space) which is compact and has a piecewise smooth boundary S. If F is a continuously differentiable vector field defined on a neighborhood of V, then ∫_V ∇ · F dV = ∫_S F · n dS.

1.1.2.2 Conservation of momentum

The conservation of momentum states that the rate of accumulation of momentum in V plus the flux of momentum out through S is equal to the rate of gain of momentum due to body forces and surface stresses. Mathematically it leads to

∂/∂t ∫_V ρu dV + ∫_S ρ (n · u) u dS = ∫_V ρf dV + ∫_S n · σ dS,    (1.7)

where f is the body force and σ is the stress tensor. Using Gauss' divergence theorem, Eq. (1.7) yields

∫_V [ ∂(ρu)/∂t + ∇ · (ρuu) ] dV = ∫_V ( ρf + ∇ · σ ) dV,    (1.8)

where uu stands for the dyadic or tensor product. From Eq. (1.8), we again use the argument that the volume V is arbitrary to obtain

∂(ρu)/∂t + ∇ · (ρuu) = ρf + ∇ · σ,    (1.9)

and by vector analysis

u ∂ρ/∂t + ρ ∂u/∂t + ρ u · ∇u + u ∇ · (ρu) = ρf + ∇ · σ.    (1.10)

The continuity Eq. (1.3) implies that the first and last terms on the left hand side of Eq. (1.10) must cancel, hence

ρ ∂u/∂t + ρ u · ∇u = ρf + ∇ · σ,    (1.11)

or in terms of the material derivative,

ρ Du/Dt = ρf + ∇ · σ.    (1.12)

Eqs. (1.11) and (1.12) are usually referred to as the equation of motion.

For Newtonian fluids, the constitutive relationship for the stress tensor σ is given by Newton’s law as

σ = −p I + τ,    (1.13a)
τ = λ (∇ · u) I + 2µ D(u),    (1.13b)

where τ is the deviatoric stress tensor, D(u) = (∇u + ∇u^T)/2 is the rate-of-strain tensor, p is the pressure, µ the dynamic viscosity, and λ the second coefficient of viscosity.

Introducing Eqs. (1.13a) and (1.13b) into Eq. (1.12) yields

ρ Du/Dt = ρf − ∇p + µ∆u + (λ + µ) ∇(∇ · u) + (∇ · u) ∇λ + 2 D(u) · ∇µ.    (1.14)

In the case of an incompressible fluid with constant viscosity, the equation of motion reduces to

∂u/∂t + u · ∇u = −(1/ρ) ∇p + ν∆u + f,    (1.15)

where ν = µ/ρ is called the kinematic viscosity. Eq. (1.15) is one of the most frequently encountered governing equations in fluid dynamics, known as the Navier-Stokes equation, which was derived independently by Claude-Louis Navier (1785-1836) and George Gabriel Stokes (1819-1903). In the literature, Eqs. (1.6) and (1.15) are jointly referred to as the Navier-Stokes equations for viscous incompressible flow.

1.1.2.3 Conservation of energy

The conservation of energy in the control volume V means that the rate of accumulation of energy plus the flux of energy out through S is equal to the flux of heat in through S plus the rate of gain of energy due to surface stresses (dissipation). Mathematically this principle implies

∂/∂t ∫_V ρE dV + ∫_S ρE (n · u) dS = −∫_S n · q dS + ∫_S n · (σ · u) dS.    (1.16)

In Eq. (1.16), q is the heat-flux vector and E is the total specific energy given by

E = e + (1/2) u² − f · u,    (1.17)

where e is the specific internal energy, (1/2) u² the specific kinetic energy, and −f · u the specific potential energy. By using Gauss' divergence theorem we have

∫_V [ ∂(ρE)/∂t + ∇ · (ρEu) ] dV = ∫_V [ −∇ · q + ∇ · (σ · u) ] dV.    (1.18)

Again, since V is arbitrary, we obtain

∂(ρE)/∂t + ∇ · (ρEu) = −∇ · q + ∇ · (σ · u).    (1.19)

It can be shown, in [83], that Eq. (1.19) leads to

ρ De/Dt = −∇ · q − p ∇ · u + τ : ∇u,    (1.20)

where the double dot product of two tensors S and T is defined as S : T = trace(S T^T). Eq. (1.20) expresses the first law of thermodynamics. Further discussions on the energy equation (1.20) can be found in [83].

1.1.3 The dimensionless Navier-Stokes equations

In this thesis, we shall restrict ourselves to the case of isothermal incompressible flows with uniform density ρ and viscosity ν. Within these restrictions, the energy equation is irrelevant (no thermodynamical effects) and the flow is entirely governed by Eqs. (1.6) and (1.15) which are conveniently recast in a dimensionless form.

Dimensionless quantities are constructed considering the fluid density ρ, viscosity ν, and characteristic (reference) length L and velocity U. Using these references, we define

Dimensionless quantities are constructed considering the fluid density ρ, viscosity ν, and characteristic (reference) length L and velocity U. Using these references, we define

x̄ = x/L,   ū = u/U,   t̄ = (U/L) t,   p̄ = p/(ρU²),   f̄ = (L/U²) f,

where the upper bars denote the dimensionless character of the quantities. The incompressible Navier-Stokes equations can be recast in terms of the dimensionless quantities, yielding the dimensionless form of (1.6) and (1.15):

∇ · u = 0,    (1.21a)
∂u/∂t + u · ∇u = −∇p + (1/Re) ∆u + f,    (1.21b)

where for notational convenience we have denoted the dimensionless variables by the same symbols as the corresponding dimensional ones. The non-dimensional number Re = LU/ν, known as the Reynolds number, is one of the most important non-dimensional numbers in fluid dynamics. The magnitude of Re measures how large the inertial effects are compared to the effects of the viscous dissipation in a particular fluid flow. For Re ≪ 1, one can neglect the nonlinearity (inertial effects) and the solution of the Navier-Stokes equations can be found in closed form in many instances [62]. In many flows of interest Re is very large. For example, river flows have Reynolds numbers as high as Re ≈ 10⁷. For Re ≫ 1 there is no stable steady solution of the Navier-Stokes equations. The solutions are strongly affected by the nonlinearity, and the actual flow pattern is complicated, convoluted and vortical. When such inertial instabilities are fully developed, the flow is said to be turbulent.
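As a back-of-the-envelope illustration of these orders of magnitude, the snippet below evaluates Re = LU/ν for a few example flows. The length and velocity scales used here are rough, hypothetical values chosen for illustration only; only the river-flow order of magnitude (Re ≈ 10⁷) is taken from the text.

```python
# Illustrative computation of the Reynolds number Re = L*U/nu.
# All numerical values below are approximate, order-of-magnitude choices.

def reynolds_number(length_m, velocity_m_s, kinematic_viscosity_m2_s):
    """Return Re = L * U / nu (all inputs in SI units)."""
    return length_m * velocity_m_s / kinematic_viscosity_m2_s

if __name__ == "__main__":
    nu_water = 1.0e-6   # kinematic viscosity of water at ~20 C, m^2/s (approximate)
    nu_air = 1.5e-5     # kinematic viscosity of air at ~20 C, m^2/s (approximate)

    # A river: depth scale L ~ 10 m and U ~ 1 m/s gives Re ~ 10^7,
    # consistent with the order of magnitude quoted in the text.
    print("river flow:     Re =", reynolds_number(10.0, 1.0, nu_water))

    # A slow laboratory flow: L ~ 1 cm, U ~ 1 mm/s gives Re ~ 10,
    # a regime where viscous effects dominate.
    print("laboratory flow: Re =", reynolds_number(0.01, 0.001, nu_water))

    # Air over a 1 m plate at 15 m/s: Re ~ 10^6.
    print("air over plate:  Re =", reynolds_number(1.0, 15.0, nu_air))
```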

1.2 Numerical methods for the incompressible Navier-Stokes equations

We consider the unsteady flow of an incompressible Newtonian fluid governed by the Navier-Stokes (NS) equations. Recall the dimensionless form of the NS equations:

∂u/∂t + (u · ∇)u = −∇p + (1/Re) ∆u,    (1.22a)
∇ · u = 0.    (1.22b)

There exist many numerical methods for the discretization and resolution of the Navier- Stokes equations. The most common ones are the Finite Difference (FD) [8, 20, 31, 46, 102], Finite Volume (FV) [31, 102], Finite Element (FE) [41, 82] and Spectral Element (SE) [19].

All of these methods are based on Eulerian formulations of the governing equations. Mesh-free methods, including Particle and Vortex Methods (VM) [24] and Smoothed Particle Hydrodynamics (SPH) [65], are on the contrary based on Lagrangian formulations. All these methods have their own advantages and disadvantages, making them more or less suitable for the discretization of a given problem.

For instance, Particle methods naturally deal with flows in unbounded domains, but require great effort to accurately enforce boundary conditions on solid surfaces. Regarding Eulerian methods, FD methods are often considered simpler to implement, in the case of Cartesian meshes, compared to other methods, but the treatment of curved domains and the introduction of adaptive strategies can quickly become cumbersome. FV methods are often chosen because they enforce the "physical" conservation properties of the continuous equations at the discrete level; on the contrary, FE approaches call for a special discretization to be conservative. However, high-order FV schemes are difficult to derive, especially in the case of non-Cartesian meshes, while FE methods, in particular the Discontinuous Galerkin method [47], are naturally amenable to higher-order discretization schemes (polynomial approximation) and flexible enough to accommodate general meshes with local adaptation. However, FE methods can suffer from numerical stability issues, requiring stabilization fixes, and the discrete functional spaces must be selected with care in order to ensure the well-posedness of the weak formulation (e.g. the famous inf-sup condition, see [28]).

As discussed below, there are two central difficulties when solving the incompressible Navier-Stokes equations for a Newtonian fluid. The first one concerns the non-linearities induced by the convective term. The second issue concerns the enforcement of the mass conservation and the role of the pressure term, which has no thermodynamical significance.

Concerning the non-linearity, it is very common to rely on explicit treatments (in time) of the convective term, so as to end up with the resolution of linear problems for the time-advancement of the solution and avoid the need to solve a full set of non-linear equations. However, even when treated explicitly, the non-linearity continues to manifest itself through the emergence of small structures in the flow dynamics. When the convective effects are important (the case of high Reynolds number flows), the discretization must be fine enough to properly account for all the scales present in the dynamics. This means in practice that one must use finer and finer spatial and temporal discretizations as the Reynolds number increases, with large computational costs as a result.

Concerning the enforcement of the mass conservation, the divergence-free constraint introduces an algebraic constraint on the velocity field, and the pressure is understood as the Lagrange multiplier associated with this constraint. The coupled velocity-pressure problem, for an explicit treatment of the non-linear terms (Stokes problem), has a saddle-point structure which makes its resolution difficult. For instance, the use of preconditioners is critical for an efficient iterative resolution of such saddle-point problems. Different alternatives have been proposed to enforce the divergence-free character of the velocity field. For instance, the artificial compressibility method [20] re-introduces a pseudo state equation relating the rate of change of the pressure, ∂p/∂t, to the divergence of the velocity field. Carefully tuning this pseudo state equation allows one to control the divergence of the velocity field, but introduces an additional stiffness in the governing equations. Alternatively, the prediction-projection approach [17, 20, 40, 58], considered in this thesis and detailed below in Section 1.3, introduces an auxiliary elliptic problem for the determination of the pressure with an exact enforcement of the divergence-free character of the velocity field.

In our work, we apply the incremental prediction-projection method which is originally implemented in the 3D Navier-Stokes solver SUNFLUIDH. Within this method, we will apply the alternating direction implicit and the partial diagonalization methods to solve the subproblems resulting from the prediction-projection method.

1.3 Incremental prediction-projection method

For the approximation of the time derivative term, we consider the second-order backward Euler scheme, also known as the two-step backward differentiation formula (BDF2) [9, 48], with a constant time step ∆t. We denote by u^(n) and p^(n) the approximations of the velocity and pressure fields at time t_n = n∆t. For the BDF2 formula, the approximation of the time derivative writes

(∂u/∂t)^(n+1) = (3u^(n+1) − 4u^(n) + u^(n−1)) / (2∆t) + O(∆t²).

Further, denoting NL(u) = u · ∇u the non-linear term of the momentum equation, and considering an implicit discretization of the NS equations, we have to solve at each time step the semi-discrete problem

(3u^(n+1) − 4u^(n) + u^(n−1)) / (2∆t) + NL(u)^(n+1) = −∇p^(n+1) + (1/Re) ∆u^(n+1),    (1.23)
∇ · u^(n+1) = 0.    (1.24)

The non-linear character of the semi-discrete problem is first removed by approximating NL(u)^(n+1) explicitly. Possible choices are linear time extrapolations, using for instance

NL(u)^(n+1) ≈ NL(u)^(*) = 2 u^(n) · ∇u^(n) − u^(n−1) · ∇u^(n−1),

or

NL(u)^(n+1) ≈ NL(u)^(*) = (3/2) u^(n) · ∇u^(n) − (1/2) u^(n−1) · ∇u^(n−1).

Substituting such an approximation for NL yields a linear problem for u^(n+1) and p^(n+1), but maintains the velocity-pressure coupling. The decoupling is achieved through an incremental prediction-projection method. In a first step, a prediction of u^(n+1), denoted u*, is computed by solving

(3u* − 4u^(n) + u^(n−1)) / (2∆t) + NL(u)^(*) = −∇p^(n) + (1/Re) ∆u*,    (1.25)

that is, making the pressure explicit and disregarding the divergence constraint. The efficient resolution of the (Helmholtz) prediction problem for u* is a central contribution of the thesis and will be further discussed later in this chapter.

Once the velocity prediction has been computed, u* must be corrected to enforce the divergence-free condition ∇ · u^(n+1) = 0, and the pressure must be updated to p^(n+1). The correction equation for u^(n+1) − u* is obtained by taking the difference of Eqs. (1.23) and (1.25), which gives

3(u* − u^(n+1)) / (2∆t) = ∇(p^(n+1) − p^(n)) + (1/Re) ∆(u* − u^(n+1)).    (1.26)

Following the Helmholtz-Hodge decomposition (HHD) [99], the velocity correction is sought as the gradient of a potential φ, that is

u^(n+1) = u* − ∇φ.    (1.27)

Using the divergence-free condition ∇ · u^(n+1) = 0, the correction potential is the solution of the Poisson equation

∆φ = ∇ · u*.    (1.28)

The Poisson Eq. (1.28), along with appropriate boundary conditions, allows us to compute the velocity field u^(n+1) using Eq. (1.27) and to update the pressure from Eq. (1.26):

p^(n+1) = p^(n) + (3/(2∆t)) φ − (1/Re) ∇ · u*.    (1.29)

It is seen that the incremental prediction-projection approach amounts to solving at every time step a system of Helmholtz equations for the velocity prediction u*, followed by the resolution of a Poisson equation for the velocity correction potential φ. These are the main computationally expensive steps for the time-integration of the incompressible Navier-Stokes equations. The principal contribution of the thesis consists in proposing efficient implementations for these two steps. In Section 1.4 we present the Alternating Direction Implicit method used for the resolution of the Helmholtz system, and in Section 1.5 we discuss the resolution of the Poisson equation using Partial Diagonalization.
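To make the sequence of operations concrete, the sketch below outlines one BDF2 time step as described by Eqs. (1.25)-(1.29). It is a minimal illustration only: the discrete operators and solvers it calls (nonlinear_term, gradient, divergence, helmholtz_solve, poisson_solve) are hypothetical placeholders, not the routines of the actual solver developed in this thesis.

```python
# Sketch of one BDF2 prediction-projection time step (Eqs. (1.25)-(1.29)).
# The operator arguments are hypothetical placeholders standing in for the
# staggered-grid routines of a real Navier-Stokes solver.

def time_step(u_n, u_nm1, p_n, dt, Re,
              nonlinear_term, gradient, divergence,
              helmholtz_solve, poisson_solve):
    # Explicit extrapolation of the convective term, NL(u)^(*).
    nl_star = 2.0 * nonlinear_term(u_n) - nonlinear_term(u_nm1)

    # Prediction step: Eq. (1.25) rearranged so that the unknown u* appears
    # on the left, (I - 2*dt/(3*Re) * Laplacian) u* = rhs.
    rhs = (4.0 * u_n - u_nm1) / 3.0 - (2.0 * dt / 3.0) * (nl_star + gradient(p_n))
    u_star = helmholtz_solve(rhs, coeff=2.0 * dt / (3.0 * Re))

    # Projection step: Poisson equation (1.28) for the correction potential.
    phi = poisson_solve(divergence(u_star))

    # Velocity correction (1.27) and pressure update (1.29).
    u_np1 = u_star - gradient(phi)
    p_np1 = p_n + 3.0 * phi / (2.0 * dt) - divergence(u_star) / Re

    return u_np1, p_np1
```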

1.4 Solution of the Helmholtz system

The incremental prediction-projection method introduced above leads to a prediction step consisting in the resolution of the Helmholtz system (1.25) for u*, which can be recast as

(I − (2∆t/(3Re)) ∆) u* = S,    (1.30)

where

S = (4u^(n) − u^(n−1)) / 3 − (2∆t/3) (NL(u)^(*) + ∇p^(n)).    (1.31)

In this section, we introduce the Alternating Direction Implicit (ADI) method for the resolution of this system. The central idea of the ADI method is to approximate the three-dimensional Helmholtz operator by a product of one-dimensional operators acting along the three spatial coordinates:

(I − α(∆t/Re) ∆) ≈ (I − α(∆t/Re) ∆x) (I − α(∆t/Re) ∆y) (I − α(∆t/Re) ∆z),    (1.32)

with α = 2/3.

However, applying directly the approximation of the Helmholtz operator to Eq. (1.30) leads to a numerical error which is of first order in time. The second-order time accuracy can be recovered by formulating the prediction problem in terms of the velocity increment δu* = u* − u^(n). This is achieved by rewriting the prediction problem as follows:

(3(u* − u^(n)) − u^(n) + u^(n−1)) / (2∆t) + NL(u)^(*) = −∇p^(n) + (1/Re) ∆(u* − u^(n)) + (1/Re) ∆u^(n),    (1.33)

leading to the Helmholtz system for the velocity increment

(I − (2∆t/(3Re)) ∆) δu* = δS,    (1.34)

where

δS = (u^(n) − u^(n−1)) / 3 − (2∆t/3) (NL(u)^(*) + ∇p^(n) − (1/Re) ∆u^(n)).    (1.35)

As previously mentioned, the Helmholtz equation is then approximated using the ADI method, which consists of approximating the 3D operator by a product of one-dimensional operators:

(I − α(∆t/Re) ∆x) (I − α(∆t/Re) ∆y) (I − α(∆t/Re) ∆z) δu* = δS.    (1.36)

With this operator decomposition, we can replace Eq. (1.36) by a sequence of three one-dimensional equations:

(I − α(∆t/Re) ∆x) T1 = δS,    (1.37a)
(I − α(∆t/Re) ∆y) T2 = T1,    (1.37b)
(I − α(∆t/Re) ∆z) δu* = T2.    (1.37c)

The main advantage of using the ADI method comes from the fact that, for meshes consisting of orthogonal Cartesian grids supporting a centered second-order finite-difference spatial discretization, the discrete one-dimensional operators are well conditioned tridiagonal systems. These tridiagonal systems can be efficiently solved with the Thomas algorithm [92].
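As an illustration of this last point, the sketch below gives a plain NumPy version of the Thomas algorithm for a single tridiagonal system, together with a one-dimensional implicit sweep of the form (I − α∆t/Re ∆x). It is a serial, unvectorized sketch assuming a uniform grid spacing and homogeneous Dirichlet values outside the domain; the solver developed in this thesis relies on a vectorized and parallel implementation instead (see Chapter 2), so the function names and boundary treatment here are illustrative assumptions only.

```python
import numpy as np

def thomas_solve(a, b, c, d):
    """Solve a tridiagonal system with sub-diagonal a, diagonal b,
    super-diagonal c and right-hand side d (1D arrays of length n;
    a[0] and c[-1] are unused). No pivoting is performed, which is
    legitimate for the diagonally dominant 1D Helmholtz operators."""
    n = len(d)
    cp = np.empty(n)
    dp = np.empty(n)
    cp[0] = c[0] / b[0]
    dp[0] = d[0] / b[0]
    for i in range(1, n):                      # forward elimination
        m = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / m
        dp[i] = (d[i] - a[i] * dp[i - 1]) / m
    x = np.empty(n)
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):             # back substitution
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x

def helmholtz_1d_sweep(rhs, dx, dt, Re, alpha=2.0 / 3.0):
    """One implicit sweep (I - alpha*dt/Re * d2/dx2) T = rhs on a uniform
    grid of spacing dx, with homogeneous Dirichlet values outside."""
    n = len(rhs)
    k = alpha * dt / (Re * dx * dx)
    a = np.full(n, -k)                         # sub-diagonal
    b = np.full(n, 1.0 + 2.0 * k)              # main diagonal
    c = np.full(n, -k)                         # super-diagonal
    return thomas_solve(a, b, c, rhs)
```

Applying such a sweep line by line along x, then y, then z reproduces the sequence (1.37a)-(1.37c).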

The previous development can also be applied to other time-discretization schemes. In this thesis, we also consider the Crank-Nicolson (CN) scheme for the prediction problem.

The CN scheme for the prediction problem is written as:

(u* − u^(n)) / ∆t + NL(u)^(*) = −∇p^(n) + (1/(2Re)) (∆u^(n) + ∆u*).    (1.38)

The incremental version of the CN scheme is

(I − (∆t/(2Re)) ∆) δu* = δS,    (1.39)

with

δS = −∆t (∇p^(n) + NL(u)^(*) − (1/Re) ∆u^(n)).    (1.40)

The ADI resolution of the incremental problem for the CN scheme can be carried out in a similar fashion as for the BDF2 scheme, using α = 1/2 in the definition of the one-dimensional operators. Both the CN and BDF2 schemes are implemented in the code. In this thesis, we will keep the BDF2 scheme for demonstration purposes.

1.5 Solution of the Poisson equation

There exist many methods to solve the Poisson equation. In this section, we present a direct method based on the eigendecomposition [85] of square matrices, called Partial Diagonalization. The partial diagonalization method is most advantageous in the case of separable problems discretized on meshes consisting of orthogonal Cartesian grids.

Given a real square diagonalizable matrix A ∈ R^(n×n), the diagonalization of A corresponds to the factorization A = Q Λ Q^(−1), where Λ = diag(λ_i), i = 1, 2, ..., n, is a diagonal matrix whose diagonal elements are the eigenvalues of A, and Q is the square matrix whose i-th column is the eigenvector q_i of A. Now take A to be the discrete Laplacian operator in Eq. (1.28). We can write formally

A = ∆ = ∆x + ∆y + ∆z,

where ∆x, ∆y and ∆z are the discrete versions of the ∂²/∂x², ∂²/∂y² and ∂²/∂z² operators respectively.

The principle of the partial diagonalization method is to diagonalize the components of the Laplacian operator associated with two directions (more generally, diagonalization along N − 1 directions if N is the number of spatial dimensions of the problem).

For instance, diagonalizing ∆x and ∆y, we obtain

(Qx Λx Qx^(−1) + Qy Λy Qy^(−1) + ∆z) φ = S,    (1.41)

where, for simplicity, we have denoted by S the right-hand-side vector of the discretized Poisson equation.

Multiplying Eq. (1.41) by Qx^(−1) Qy^(−1) and using the following properties of the Qx and Qy matrices,

Qx^(−1) Qy^(−1) = Qy^(−1) Qx^(−1),
Qx^(−1) Λy = Λy Qx^(−1),    (1.42)
Qy^(−1) Λx = Λx Qy^(−1),

we obtain

(Λx + Λy + ∆z) φ̃ = S̃,    (1.43)

where

φ̃ = Qy^(−1) Qx^(−1) φ,
S̃ = Qy^(−1) Qx^(−1) S.

For a separable problem with regular Cartesian grids, the matrices Qx and Qy have specific structures that can be exploited for an efficient parallel implementation. In addition, for such grids and using the classical second order central finite difference scheme, the resulting operator Λx + Λy + ∆z is tridiagonal and well conditioned, allowing the use of the Thomas algorithm mentioned in Section 1.4. This direct method is then faster than iterative methods. An important remark concerning Eq. (1.43) is that when using homogeneous Neumann boundary conditions there will be zero eigenvalues which cause singularities in the system. These singularities can be removed by introducing artificial Dirichlet boundary conditions on the free modes when discretizing the problem.

For non-separable problems, we use the successive over-relaxation (SOR) and multigrid methods, as described in Appendix A, which are also very successful for solving elliptic and hyperbolic partial differential equations and the linear systems that arise when they are discretized. As explained previously, in a domain with obstacles the partial diagonalization method is not applicable; an iterative method such as SOR or Jacobi is then used to solve the Poisson equation.
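For concreteness, the following NumPy sketch applies the partial diagonalization of Eq. (1.43) to a small 3D Poisson problem on a uniform grid with homogeneous Dirichlet conditions. It uses dense eigendecompositions and SciPy's banded solver for the remaining z direction; the parallel distribution of the Q matrices and the Neumann free-mode treatment discussed above are deliberately left out, and the function names are illustrative assumptions rather than the solver's actual routines.

```python
import numpy as np
from scipy.linalg import eigh, solve_banded

def laplacian_1d(n, h):
    """Dense 1D second-derivative matrix, 2nd-order centered scheme,
    homogeneous Dirichlet values outside the domain."""
    return (np.diag(-2.0 * np.ones(n)) +
            np.diag(np.ones(n - 1), 1) +
            np.diag(np.ones(n - 1), -1)) / h**2

def poisson_partial_diag(S, hx, hy, hz):
    """Solve (Dxx + Dyy + Dzz) phi = S by diagonalizing the x and y
    operators and solving a tridiagonal system along z for each (i, j)."""
    nx, ny, nz = S.shape
    lx, Qx = eigh(laplacian_1d(nx, hx))   # symmetric => Qx^(-1) = Qx^T
    ly, Qy = eigh(laplacian_1d(ny, hy))

    # S_tilde: apply Qx^T along x and Qy^T along y (Eq. (1.43) notation).
    St = np.einsum('ia,ajk->ijk', Qx.T, S)
    St = np.einsum('jb,ibk->ijk', Qy.T, St)

    # For each eigenvalue pair, solve (Dzz + (lx_i + ly_j) I) along z.
    phi_t = np.empty_like(St)
    main, off = -2.0 / hz**2, 1.0 / hz**2
    for i in range(nx):
        for j in range(ny):
            ab = np.zeros((3, nz))
            ab[0, 1:] = off                      # super-diagonal
            ab[1, :] = main + lx[i] + ly[j]      # main diagonal
            ab[2, :-1] = off                     # sub-diagonal
            phi_t[i, j, :] = solve_banded((1, 1), ab, St[i, j, :])

    # Transform back: phi = Qx (along x) Qy (along y) phi_tilde.
    phi = np.einsum('ia,ajk->ijk', Qx, phi_t)
    phi = np.einsum('jb,ibk->ijk', Qy, phi)
    return phi
```

A quick check on a manufactured solution (e.g. a product of sine modes) can be used to verify the second-order convergence of this sketch.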

1.6 Spatial discretization

As stated above, the thesis considers simple computational domains consisting of right-rectangular prisms, with orthogonal principal directions aligned with the x, y, and z directions. The three-dimensional domain is discretized using a Cartesian grid resulting from the tensorization of one-dimensional grids in x, y and z, which supports centered second-order finite-difference schemes for the approximation of the first and second order derivatives in the respective directions. However, the velocity and pressure unknowns are defined over different Cartesian grids, as explained below.

1.6.1 Staggered mesh

As illustrated in Fig. 1.4, for a two-dimensional mesh, the computational domain is discretized into rectangular parallelepipeds (shown with black solid lines, referred to as cells). The discrete velocity and pressure unknowns are defined along a staggered arrangement, in which the pressure points are located at the center of each cell, and the velocity components are defined on the centers of the corresponding edges.


Figure 1.4: Staggered mesh showing the pressure (black squares at the cells’ center) and velocity components unknowns (red and blue squares for the x and y components, respectively).

The staggered arrangement is widely used in CFD because it prevents spurious pressure oscillations [64]. For this type of arrangement, one can also easily construct second order approximations of the gradient, Laplacian, and divergence operators that verify the classical identities of vector analysis. This so-called mimetic discretization provides a better numerical stability for the prediction-projection method compared to a discretization where all the variables are defined on the same points of the mesh.

For the sake of simplicity, consider a two-dimensional rectangular domain D, of length Lx and height Ly. The position of each point of the domain is defined within an orthogonal Cartesian frame whose origin is placed at the lower left corner. The mesh is orthogonal Cartesian and the (two-dimensional) domain discretization has Nx × Ny mesh cells. The mesh associated with the pressure, denoted Mc, is such that:

- The discrete coordinates xc(i) and yc(j) define the position of the center of mesh cell Mc(i, j), where all the scalar variables are defined. As the mesh discretization is Cartesian, xc depends only on i and yc depends only on j.

- The discrete coordinates xi(i) and yi(j) define the position of the upper interfaces of mesh cell (i, j) and serve to locate the velocity components. The relations between the centered coordinates and those of the interfaces are:

xc(i) = (xi(i − 1) + xi(i)) / 2

yc(j) = (yi(j − 1) + yi(j)) / 2

- The dimensions of the cell Mc(i, j) are defined by:

∆Xci = xi(i) − xi(i − 1)

∆Ycj = yi(j) − yi(j − 1)

- The control volume (the surface in 2D) of each mesh cell is defined as the product of the cell dimensions in each direction:

∆Vci,j = ∆Xci · ∆Ycj

From the definition of Mc, it is possible to construct the shifted meshes Mu and Mv associated with the velocity components u and v respectively. These meshes Mu and Mv are such that:

- For Mu:

Coordinates of ui,j : (xi(i), yc(j))
Dimensions : ∆Xui = xc(i + 1) − xc(i), ∆Yuj = yi(j) − yi(j − 1) = ∆Ycj
Control volume : ∆Vui,j = ∆Xui · ∆Yuj

- For Mv:

Coordinates of vi,j : (xc(i), yi(j))
Dimensions : ∆Xvi = xi(i + 1) − xi(i) = ∆Xci, ∆Yvj = yc(j + 1) − yc(j)
Control volume : ∆Vvi,j = ∆Xvi · ∆Yvj

Additional peripheral cells, usually named fictitious or ghost cells, are reserved for the treatment of the boundary conditions. Considering that the three grids have the same dimensions Nx × Ny (for practical reasons of programming array sizes), the discrete interior points of the domain are:

For Mc : [2, Nx − 1] × [2, Ny − 1]
For Mu : [2, Nx − 2] × [2, Ny − 1]
For Mv : [2, Nx − 1] × [2, Ny − 2]

We note that the shifted meshes associated to the velocity components are defined with one cell less along the direction of the considered velocity component. In this case, mesh points associated to boundary conditions coincide with the boundaries of the domain. Otherwise, they would be outside of the domain in the fictitious cells (cf. Fig. 1.5).

Figure 1.5: Locations of the pressure and velocity unknowns: p in black, u-component in red, v-component in blue. Fictitious cells for the (pressure) boundary conditions are plotted with a dashed line.

1.6.2 Spatial discretization for the Navier-Stokes equation

Consider the momentum equation for incompressible flow:

$$\frac{\partial u}{\partial t} + NL(u) = -\nabla p + L(u), \qquad (1.45)$$
where NL is the non-linear convection term and L is the viscous diffusion term ($L(u) = \frac{1}{Re}\Delta u$). In the remainder of this section, we present the centered second-order finite-difference schemes for the approximation of the different terms appearing in the discretization of the momentum equation, restricting ourselves to the two-dimensional case for simplicity, i.e. $u = (u, v)$.

1.6.2.1 Pressure gradient discretization

Owing to the staggered arrangement of the variables on the mesh, the components of the pressure gradient appear in the equations for the respective velocity components. This allows for a straightforward approximation of the pressure gradient components at the corresponding velocity points. For a two-dimensional mesh, we obtain

$$\left.\frac{\partial p}{\partial x}\right|_{(i,j)} = \frac{p(i+1,j) - p(i,j)}{\Delta X_{i+1/2}}, \qquad (1.46)$$
$$\left.\frac{\partial p}{\partial y}\right|_{(i,j)} = \frac{p(i,j+1) - p(i,j)}{\Delta Y_{j+1/2}}. \qquad (1.47)$$
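To make the indexing explicit, the two gradient components above can be evaluated with small helper functions; the sketch below is only illustrative, and the argument names (neighbouring pressure values and the interface spacings $\Delta X_{i+1/2}$, $\Delta Y_{j+1/2}$) are hypothetical, not taken from the solver's actual code.

/* Sketch: discrete pressure gradient (1.46)-(1.47) at the u- and v-points of cell (i,j).
   p_ij, p_ip1j, p_ijp1 are cell-centered pressures; dX_iph = DX_{i+1/2}, dY_jph = DY_{j+1/2}. */
static double dpdx_at_u(double p_ij, double p_ip1j, double dX_iph)
{
    return (p_ip1j - p_ij) / dX_iph;   /* Eq. (1.46) */
}

static double dpdy_at_v(double p_ij, double p_ijp1, double dY_jph)
{
    return (p_ijp1 - p_ij) / dY_jph;   /* Eq. (1.47) */
}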

1.6.2.2 Laplacian operator discretization

Owing to the Cartesian structure of the meshes, the discretization of the Laplacian operators appearing in the prediction and projection steps is immediate. For instance, the discretization of the viscous term

$$L(u)\big|_{(i,j)} = \frac{1}{Re}\,\Delta u\Big|_{(i,j)} = \frac{1}{Re}\left( \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2},\ \frac{\partial^2 v}{\partial x^2} + \frac{\partial^2 v}{\partial y^2} \right)_{(i,j)}, \qquad (1.48)$$
is performed on the mesh associated to the considered velocity component: the Laplacian of u is discretized on the mesh $\mathcal{M}_u$ and the Laplacian of v on the mesh $\mathcal{M}_v$. As a result, for the second order finite-difference scheme, the approximations of the second order operators for the u and v components are:

$$\left.\frac{\partial^2 u}{\partial x^2}\right|_{(i,j)} = \frac{u_{i+1,j} - u_{i,j}}{\Delta X_{i+1}\,\Delta X_{i+1/2}} - \frac{u_{i,j} - u_{i-1,j}}{\Delta X_{i}\,\Delta X_{i+1/2}}, \qquad (1.49)$$
$$\left.\frac{\partial^2 u}{\partial y^2}\right|_{(i,j)} = \frac{u_{i,j+1} - u_{i,j}}{\Delta Y_{j+1/2}\,\Delta Y_{j}} - \frac{u_{i,j} - u_{i,j-1}}{\Delta Y_{j-1/2}\,\Delta Y_{j}}, \qquad (1.50)$$
$$\left.\frac{\partial^2 v}{\partial x^2}\right|_{(i,j)} = \frac{v_{i+1,j} - v_{i,j}}{\Delta X_{i+1/2}\,\Delta X_{i}} - \frac{v_{i,j} - v_{i-1,j}}{\Delta X_{i-1/2}\,\Delta X_{i}}, \qquad (1.51)$$
$$\left.\frac{\partial^2 v}{\partial y^2}\right|_{(i,j)} = \frac{v_{i,j+1} - v_{i,j}}{\Delta Y_{j+1}\,\Delta Y_{j+1/2}} - \frac{v_{i,j} - v_{i,j-1}}{\Delta Y_{j}\,\Delta Y_{j+1/2}}. \qquad (1.52)$$
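As an illustration of how the non-uniform spacings enter these stencils, a C sketch of Eq. (1.49) could be written as follows; this is a hypothetical helper whose argument names follow the notation of Section 1.6.1 and are not taken from the actual solver.

/* Sketch of Eq. (1.49): second x-derivative of u at (i,j) on a non-uniform grid.
   u_im1, u_i, u_ip1 are u(i-1,j), u(i,j), u(i+1,j);
   dX_i = DX_i, dX_ip1 = DX_{i+1}, dX_iph = DX_{i+1/2}. */
static double d2u_dx2(double u_im1, double u_i, double u_ip1,
                      double dX_i, double dX_ip1, double dX_iph)
{
    return (u_ip1 - u_i) / (dX_ip1 * dX_iph)
         - (u_i - u_im1) / (dX_i   * dX_iph);
}

On a uniform grid with spacing h, this expression reduces to the familiar $(u_{i-1,j} - 2u_{i,j} + u_{i+1,j})/h^2$ stencil.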

The discretization of the Laplacian equation of the projection step proceeds in a similar fashion, but for the unknown correction potential Φ, which is collocated with the pressure, i.e. defined on the mesh $\mathcal{M}_c$.

1.6.2.3 Convective term discretization

Concerning the convective term, two formulations are used in the present thesis: the conservative formulation $NL(u) = \nabla \cdot (u \otimes u)$ and the convective formulation $NL(u) = (u \cdot \nabla) u$. These two formulations, under the incompressibility condition $\nabla \cdot u = 0$, are equivalent:
$$\nabla \cdot (u \otimes u) - (u \cdot \nabla) u =
\begin{pmatrix}
\dfrac{\partial(uu)}{\partial x} + \dfrac{\partial(uv)}{\partial y} + \dfrac{\partial(uw)}{\partial z} \\[2mm]
\dfrac{\partial(vu)}{\partial x} + \dfrac{\partial(vv)}{\partial y} + \dfrac{\partial(vw)}{\partial z} \\[2mm]
\dfrac{\partial(wu)}{\partial x} + \dfrac{\partial(wv)}{\partial y} + \dfrac{\partial(ww)}{\partial z}
\end{pmatrix}
-
\begin{pmatrix}
u\dfrac{\partial u}{\partial x} + v\dfrac{\partial u}{\partial y} + w\dfrac{\partial u}{\partial z} \\[2mm]
u\dfrac{\partial v}{\partial x} + v\dfrac{\partial v}{\partial y} + w\dfrac{\partial v}{\partial z} \\[2mm]
u\dfrac{\partial w}{\partial x} + v\dfrac{\partial w}{\partial y} + w\dfrac{\partial w}{\partial z}
\end{pmatrix}
=
\begin{pmatrix}
u\left(\dfrac{\partial u}{\partial x} + \dfrac{\partial v}{\partial y} + \dfrac{\partial w}{\partial z}\right) \\[2mm]
v\left(\dfrac{\partial u}{\partial x} + \dfrac{\partial v}{\partial y} + \dfrac{\partial w}{\partial z}\right) \\[2mm]
w\left(\dfrac{\partial u}{\partial x} + \dfrac{\partial v}{\partial y} + \dfrac{\partial w}{\partial z}\right)
\end{pmatrix}
= u\,(\nabla \cdot u) = 0. \qquad (1.53)$$

In the two-dimensional case, the convective term in conservative form is
$$\nabla \cdot (u \otimes u)\big|_{(i,j)} = \left( \frac{\partial(uu)}{\partial x} + \frac{\partial(uv)}{\partial y},\ \frac{\partial(vu)}{\partial x} + \frac{\partial(vv)}{\partial y} \right)_{(i,j)}, \qquad (1.54)$$
leading to the discrete version

$$\left.\frac{\partial(uu)}{\partial x}\right|_{(i,j)} = \frac{(u_{i+1,j} + u_{i,j})^2 - (u_{i-1,j} + u_{i,j})^2}{4\,\Delta X_{i+1/2}}, \qquad (1.55)$$
$$\left.\frac{\partial(uv)}{\partial y}\right|_{(i,j)} = \frac{(u_{i,j} + u_{i,j+1})(v_{i,j} + v_{i+1,j}) - (u_{i,j} + u_{i,j-1})(v_{i,j-1} + v_{i+1,j-1})}{4\,\Delta Y_{j}}, \qquad (1.56)$$
$$\left.\frac{\partial(vu)}{\partial x}\right|_{(i,j)} = \frac{(u_{i,j} + u_{i,j+1})(v_{i,j} + v_{i+1,j}) - (u_{i-1,j} + u_{i-1,j+1})(v_{i-1,j} + v_{i,j})}{4\,\Delta X_{i}}, \qquad (1.57)$$
$$\left.\frac{\partial(vv)}{\partial y}\right|_{(i,j)} = \frac{(v_{i,j} + v_{i,j+1})^2 - (v_{i,j} + v_{i,j-1})^2}{4\,\Delta Y_{j+1/2}}. \qquad (1.58)$$

Similarly, for the convective form
$$(u \cdot \nabla) u\big|_{(i,j)} = \left( u\frac{\partial u}{\partial x} + v\frac{\partial u}{\partial y},\ u\frac{\partial v}{\partial x} + v\frac{\partial v}{\partial y} \right)_{(i,j)}, \qquad (1.59)$$
the discrete form is
$$\left. u\frac{\partial u}{\partial x}\right|_{(i,j)} = \frac{1}{2}\, u_{i,j}\left( \frac{u_{i,j} - u_{i-1,j}}{\Delta X_{i}} + \frac{u_{i+1,j} - u_{i,j}}{\Delta X_{i+1}} \right), \qquad (1.60)$$
$$\left. v\frac{\partial u}{\partial y}\right|_{(i,j)} = \frac{1}{4}\left( (v_{i,j} + v_{i+1,j})\,\frac{u_{i,j+1} - u_{i,j}}{\Delta Y_{j+1/2}} + (v_{i,j-1} + v_{i+1,j-1})\,\frac{u_{i,j} - u_{i,j-1}}{\Delta Y_{j-1/2}} \right), \qquad (1.61)$$
$$\left. u\frac{\partial v}{\partial x}\right|_{(i,j)} = \frac{1}{4}\left( (u_{i-1,j} + u_{i-1,j+1})\,\frac{v_{i,j} - v_{i-1,j}}{\Delta X_{i-1/2}} + (u_{i,j+1} + u_{i,j})\,\frac{v_{i+1,j} - v_{i,j}}{\Delta X_{i+1/2}} \right), \qquad (1.62)$$
$$\left. v\frac{\partial v}{\partial y}\right|_{(i,j)} = \frac{1}{2}\, v_{i,j}\left( \frac{v_{i,j} - v_{i,j-1}}{\Delta Y_{j}} + \frac{v_{i,j+1} - v_{i,j}}{\Delta Y_{j+1}} \right). \qquad (1.63)$$

Note that the discretization of the convective term may require a mixture of central differences and donor-cell discretization, e.g. up-winding, to ensure stability for strongly convective problems. Such stabilization schemes are not considered in the present work; we only remark that, if needed, they can easily be incorporated into the proposed framework, as stabilized convective schemes would only affect the discretization of the convective terms, which are treated explicitly. In fact, the objective of the present thesis is to derive efficient computational approaches allowing for a fine enough spatial discretization such that stabilization methods, and their inherent numerical diffusivity, can be avoided.

The stencils for the discretization of the operators associated to the pressure and velocity cells are illustrated in Fig. 1.6.

Figure 1.6: Illustration of the stencils for the discretization of the operators appearing in the momentum equation in two dimensions: (a) stencil for centered cells, (b) stencil for u-cells, (c) stencil for v-cells.

1.7 Conclusion of Chapter 1

After a brief introduction of the incompressible Navier-Stokes equations for a Newtonian fluid, we have discussed in Section 1.3 a prediction-projection method for the time-integration of these equations. In particular, we have shown that this approach essentially amounts to the resolution at each time-step of Helmholtz and Poisson problems. Relying on the discretized operators provided in Section 1.6.2, one could readily construct the fully discretized version of the Helmholtz and Poisson problems. However, this direct discretization results in large discrete systems. In addition, when using fine grids, the explicit treatment of the convective terms yields a stability constraint on the time-step, known as the Courant-Friedrichs-Lewy (CFL) condition [72], $\Delta t < \min(h/|u|)$, with h being the size of the spatial discretization and |u| the local velocity modulus. As a result, a smaller and smaller time-step must be used when h decreases, making the simulation more and more demanding. This issue calls for efficient solvers in order to maintain reasonable computational times for the simulations. The remainder of the thesis is dedicated to the design and implementation of fast solvers on different parallel hybrid architectures.
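For concreteness, the CFL constraint above can be turned into an explicit time-step selection. The sketch below assumes a uniform spacing h and a safety factor cfl < 1; all names are illustrative and not part of the solver described in this thesis.

/* Sketch: CFL-limited time step dt < min(h/|u|) over the n cells of the mesh. */
#include <math.h>

double cfl_time_step(int n, const double *u, const double *v, const double *w,
                     double h, double cfl)
{
    double dt = 1.0e30;                                   /* large initial value */
    for (int i = 0; i < n; ++i) {
        double speed = sqrt(u[i]*u[i] + v[i]*v[i] + w[i]*w[i]);
        if (speed > 0.0 && h / speed < dt)
            dt = h / speed;                               /* keep the most restrictive cell */
    }
    return cfl * dt;                                      /* cfl < 1 acts as a safety margin */
}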

In the case of separable problems, the ADI and Partial Diagonalization techniques were introduced in Sections 1.4 and 1.5 respectively. These techniques allow us to recast the three-dimensional problems into sequences of one-dimensional ones with lower computational complexity. The crucial point is that, when considering discretization over Cartesian grids, the discretization of the one-dimensional second order operators $\partial^2/\partial x^2$, $\partial^2/\partial y^2$, and $\partial^2/\partial z^2$ leads to tridiagonal systems (for instance, the operator in direction x involves unknowns with the same (j, k) indices). This specificity allows for direct, fast (vectorized) and parallel solutions of the linear systems using the Thomas algorithm, with significant computational savings compared to a direct solution of the corresponding three-dimensional problems. Two crucial aspects have been mainly investigated in view of an implementation on modern multicore computers: first, the efficient parallel implementation of the linear system solution and, second, the fast computation of the change of bases in the partial diagonalization method.

Finally, for non-separable problems, for instance in the presence of solid obstacles inside the computational domain, the fully three-dimensional discrete systems must be considered. Due to the size of these systems, direct solvers are challenged in the case of very fine discretization meshes, and iterative solvers need to be considered. We have also investigated the implementation of available iterative solvers on hybrid architectures. A quick overview of the iterative methods for the solution of linear systems is provided in Appendix A.

Chapter 2

Parallel algorithms for solving Navier-Stokes equations


As described in Chapter 1, the fine discretization of the Navier-Stokes equations represents a huge computational work. We use double precision variables in order to improve the accuracy, although this doubles the memory usage compared to single precision variables. We also want the mesh to be fine enough so that the flow motion at small scales can be well captured by our numerical simulations. However, such a fine discretization increases the size of the resulting linear systems and requires a proper use of parallelism in the implementation. In this chapter, we present the Navier-Stokes solver and its implementation for multicore systems.


Section 2.1 describes the domain decomposition method used in the Navier-Stokes solver. We also explain how to compute the Schur complement in the solver. In Section 2.2, we discuss the interest of multi-level parallelism and why we choose the MPI/OpenMP programming model in our parallel implementation. In Section 2.3, we present the main structure of the NS solver and its sequential performance. Then in Section 2.4, we optimize the solver for tridiagonal systems using vectorization techniques. In Section 2.5, we describe some numerical experiments on both shared and distributed memory architectures, including performance results. Finally, some concluding remarks are given in Section 2.6.

2.1 Domain decomposition approach

In the area of physical simulations, many problems arise from the discretization of partial differential equations. Because of the complexity of the computational domain, the domain decomposition method is often applied. This method solves a boundary value problem not on the original domain but on a set of subdomains. Different approaches exist to coordinate the solutions between subdomains. We choose the domain decomposition method for several reasons. The first reason is that the original boundary value problem is easier to solve on smaller and more regular subdomains. This is mainly the motivation of the domain decomposition method proposed by Schwarz [87]. However, the original Schwarz method, often called the alternating Schwarz method, consists in solving the same problem on subdomains, alternating from one to another. This procedure is sequential since solving the problem on one subdomain cannot begin if the problem is not yet solved on its neighbors. Pierre-Louis Lions modified the alternating Schwarz method and proposed the parallel Schwarz method in [66]. In the parallel method, the problem is solved independently on each subdomain and the solutions on the interfaces are updated after each iteration. The second reason is related to the increasing amount of data handled to solve the problem. As physicists want more and more realistic simulations, the number of degrees of freedom can be too high for the whole problem to fit into the computer memory.

Often referred to as divide-and-conquer techniques, the first step of any domain decomposition method consists in dividing the domain. We can find in [25, 80, 94] more details on the types of partitioning. For example, when using a finite element discretization, the elements are mapped into subdomains, resulting in a so-called element-based decomposition. In this way, all information related to one element is included in the same subdomain. The edge-based decomposition is often used in finite volume discretizations, where edges cannot be split between two subdomains. The vertex-based partitioning consists in dividing the vertex set into subsets and allowing edges and elements to be shared between subdomains. In the context of this thesis, we use an element-based decomposition. Fig. 2.1 shows an example of a 3D domain decomposition of a rectangular domain (16 subdomains). Moreover, to balance the computational work on each subdomain, we make sure that the subdomains have equal size.


Figure 2.1: Example of 3D domain decomposition with a Cartesian topology 4 × 2 × 2.

Another important aspect in domain decomposition methods is the existence of over- lapping. We can have no overlapping, or overlapping with different sizes. In our work, we choose the overlapping to be of order one, which means that we introduce one layer of mesh cells to each subdomain. This layer can be either the fictive cells of the domain which are mainly used to deal with boundary conditions, or the real cells of the neighbor

subdomain.


Figure 2.2: 2D domain decomposition with order 1 overlapping.

As illustrated in Fig. 2.2 (left), the domain is equally divided into four subdomains. When looking for instance at subdomain IV in Fig. 2.2 (right), the solid black rectangle surrounds the “real” cells while the cells outside the black rectangle are the boundary (fictive) cells and interface cells. For this example, we mention that we do not need the values on the four corners of the interface because of the differencing scheme we use. When updating the value on one cell, we only need the values of its left, right, upper and bottom neighbor cells in the 2D case (front and back cells are needed as well in the 3D case).

The domain decomposition in our NS solver serves as a “work distributor”. As the domain is equally divided into subdomains, each MPI process has the same computational load. Also, the reason why each subdomain has a one-layer interface lies in the differencing scheme we chose. For example, when we compute the source term of the Helmholtz-like equation (1.34), we need the gradient of the pressure, the Laplacian of the velocity and the convection term. In order to compute these quantities for one specific mesh cell in one subdomain, we need information from its four neighbor cells (six neighbors in the 3D case). If we did not have the interface layer, we would have to fetch information from neighboring subdomains in order to compute on one subdomain, which involves information exchange. Once the source term is computed, by using the ADI method, we obtain three tridiagonal systems for the Helmholtz problem. We note that each MPI process holds one part of these systems. In the following, we show how these systems can be solved efficiently in parallel using the Schur complement method [107].
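In practice, the interface layer of each subdomain is refreshed by exchanging halo data with the neighboring MPI processes before the explicit terms are evaluated. The sketch below shows such an exchange along one direction with MPI_Sendrecv; the data layout (one contiguous ghost layer on each side) and all names are illustrative assumptions, not the actual routines of the solver.

/* Sketch: exchange one ghost layer with the left/right neighbors along one direction.
   'field' holds n_layers layers of layer_size values each, the first and last layers
   being ghost layers. Neighbors on physical boundaries are MPI_PROC_NULL. */
#include <mpi.h>

void exchange_halo(double *field, int layer_size, int n_layers,
                   int left, int right, MPI_Comm comm)
{
    MPI_Status st;
    double *first_inner = field + layer_size;                   /* first real layer        */
    double *last_inner  = field + (n_layers - 2) * layer_size;  /* last real layer         */
    double *left_ghost  = field;                                 /* ghost layer, left side  */
    double *right_ghost = field + (n_layers - 1) * layer_size;   /* ghost layer, right side */

    /* send the first real layer to the left, receive the right neighbor's layer */
    MPI_Sendrecv(first_inner, layer_size, MPI_DOUBLE, left,  0,
                 right_ghost, layer_size, MPI_DOUBLE, right, 0, comm, &st);
    /* send the last real layer to the right, receive the left neighbor's layer */
    MPI_Sendrecv(last_inner,  layer_size, MPI_DOUBLE, right, 1,
                 left_ghost,  layer_size, MPI_DOUBLE, left,  1, comm, &st);
}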

Let us consider a 2D domain with 2 subdomains (pink and blue) shown in Fig. 2.3, where the variables are ordered by subdomain. Within each subdomain, the variables are ordered by row and then by column. Starting from Eq. (1.37a) for example, the resulting matrix can be depicted as in Fig. 2.4. We can see that, with this ordering, each process holds only its own part of the system and we cannot solve this system efficiently in parallel.


Figure 2.3: Ordering of variables in a 2D domain (two subdomains).


Figure 2.4: Matrix pattern using the ordering from Fig. 2.3.

By changing the ordering of variables as shown in Fig. 2.5 (we order first the interior variables and then the interface variables), the resulting matrix (see Fig. 2.6) has a block structure that enables us to apply the Schur complement method.


Figure 2.5: Ordering of variables in a 2D domain (two subdomains, interior variables are numbered first).


Figure 2.6: Matrix pattern using ordering from Fig. 2.5.

Given a linear system Ax = b, the Schur complement can be defined using an expression of the linear system in its block form :

$$\begin{pmatrix} B & E \\ F & C \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} \qquad (2.1)$$

or equivalently,

$$B x_1 + E x_2 = b_1 \qquad (2.2)$$
$$F x_1 + C x_2 = b_2 \qquad (2.3)$$

From Eq. (2.2), the unknown x1 can be expressed as

$$x_1 = B^{-1}(b_1 - E x_2) \qquad (2.4)$$

Then by substitution in Eq. (2.3), the following reduced system is obtained:

$$(C - F B^{-1} E)\, x_2 = b_2 - F B^{-1} b_1 \qquad (2.5)$$

The matrix $S = C - F B^{-1} E$ is called the Schur complement matrix of system (2.1). If this matrix can be formed and Eq. (2.5) can be solved, the variable $x_2$ is then available. Once this variable is computed, the remaining variable $x_1$ can be computed from Eq. (2.4).

Let us go back to Fig. 2.6. We can consider that the block B corresponds in this pattern to the two tridiagonal blocks. E and F are two sparse blocks and C is the bottom right block. We also notice that, except for C, the blocks B, E and F can be decoupled into two independent parts, which provides parallelism.

We mention that in the implementation of our NS solver, starting from Fig. 2.3, blocks E, F and C are built directly by choosing the right coefficients, leading to a matrix as represented in Fig. 2.6. The whole procedure consists of five steps:

• Identify the interior and interface variables and construct the blocks E, F and C.
• Compute the right-hand side of Eq. (2.5).
• Construct the Schur complement S.
• Solve Eq. (2.5) to obtain the interface values $x_2$.
• Solve Eq. (2.4) to obtain the interior values $x_1$.
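The third step, the construction of S, is the only one that requires solves with the interior block B; it can be sketched as follows, forming S column by column. The dense storage of E, F and C and the callback solve_interior (which would be the tridiagonal solve of Section 2.4) are illustrative assumptions, not the actual data structures of the solver.

/* Sketch: form S = C - F B^{-1} E column by column.
   ni = number of interior unknowns, ns = number of interface unknowns.
   E is ni x ns (column-major), F is ns x ni (row-major), C and S are ns x ns (row-major).
   solve_interior(rhs, y, ni) returns y = B^{-1} rhs, e.g. via the Thomas algorithm. */
#include <stdlib.h>

void build_schur(int ni, int ns,
                 const double *E, const double *F, const double *C, double *S,
                 void (*solve_interior)(const double *rhs, double *y, int ni))
{
    double *y = malloc((size_t)ni * sizeof *y);
    for (int j = 0; j < ns; ++j) {
        solve_interior(&E[(size_t)j * ni], y, ni);     /* y = B^{-1} E(:,j) */
        for (int i = 0; i < ns; ++i) {
            double acc = 0.0;
            for (int k = 0; k < ni; ++k)
                acc += F[(size_t)i * ni + k] * y[k];   /* (F y)_i */
            S[(size_t)i * ns + j] = C[(size_t)i * ns + j] - acc;
        }
    }
    free(y);
}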

2.2 Multi-level parallelism

With the growing demand of three-dimensional models, parallelism became essential in numerical simulations. In this section, we present a brief description of the parallel programming models related to shared and distributed memory architectures.

2.2.1 Shared memory architecture

As the name suggests, a shared memory system is an architecture where all processors have the same access to a global memory. This means that the address space is the same for all processors. The main advantage of a shared memory architecture is that access to data depends very little on its location in memory. This property facilitates programming, but one must pay attention to memory conflicts as well as to data coherence to avoid incorrect results. One drawback of shared memory systems is that they do not exploit data locality. When solving partial differential equations, the discretization often has an intrinsically local nature: data that are highly related are often stored nearby in the global memory.

There are several possibilities for programming shared memory architectures. For example, we can use threads as in Pthreads [18]. We can also use a dedicated parallel programming language (e.g., Ada [3]) or specific extensions to existing programming languages (C/C++, Fortran, etc.) [2, 74]. Another possibility is to use an existing programming language enriched with compiler directives to specify parallelism, for instance OpenMP [4].

Let us consider the domain decomposition that we presented in Section 2.1. When dealing with shared memory systems, we assign each subdomain to a processor (for shared memory, a processor usually refers to a single compute core) and the computation can be performed in parallel by each core.

2.2.2 Distributed memory architecture

A typical distributed memory system is mainly composed of many processors, which are often identical and are connected via a network with a given topology. Each processor has its own memory, processing units, etc. Contrary to shared memory systems, processors only access their own memory, and access to the memory of other processors is performed through communication.

To deal with communication among processors, one standard API is the Message Passing Interface (MPI) [32]. MPI is a standardized and portable message-passing model designed by researchers from academia and industry to be used on a wide variety of parallel computers. This standard defines the syntax and semantics of library routines useful for a wide range of users writing portable message-passing programs in different programming languages such as Fortran, C, C++ and Java. MPI is a specification, not an implementation. There are multiple implementations of MPI, such as MPICH [39] and OpenMPI [34].

Similarly to shared memory systems, when using MPI in a domain decomposition method, we divide the domain into subdomains according to the number of processors which are usually multicore processors.

2.2.3 Combining shared and distributed memory systems

A classical type of architecture, commonly used for clusters, is the distributed-shared memory model. In such a memory model, we have several nodes connected through some high-speed interconnect. A node can be either a single-processor computer, a symmetric multiprocessor (SMP), or even a non-uniform memory access (NUMA) architecture [43]. The memory is considered to be shared within each node and is not directly addressable from other nodes.

An example of such a system is the Stampede¹ computer from the Texas Advanced Computing Center (TACC) at the University of Texas at Austin, USA (see Fig. 2.7).

Figure 2.7: Stampede system. Image from https://www.tacc.utexas.edu

This system is a 10 petaflop/s Dell Linux cluster based on more than 6400 Dell PowerEdge server nodes. Each node of Stampede is composed of 2 Intel Xeon E5-2680 (Sandy Bridge) processors running at 2.7 GHz with 8 cores each (16 cores total) and 32 GB of memory, and an Intel Xeon Phi coprocessor. The aggregate peak performance of the Intel Xeon E5 processors is about 2 petaflop/s, while the Xeon Phi coprocessors provide an additional peak performance of more than 7 petaflop/s. Among the 6400 nodes of Stampede, 128 compute nodes also host a single NVIDIA K20 GPU with 5 GB of on-board memory. Stampede is ranked #7 in the Top500 list² published in November 2014. From the computing resources of Stampede, we will use CPU nodes (in Chapter 2) and GPUs (in Chapter 3), but we will not use the Intel Xeon Phi coprocessors.

Figure 2.8: Illustration of a single node on Stampede.

Fig. 2.8 shows the architecture of a single SMP node on Stampede. On each node, we have 2 sockets to hold the Intel Sandy Bridge processors. Each socket has 8 cores and

the local memory of the node is addressable from any core in any socket. As the memory is attached to the sockets, the 8 cores sharing a socket have the fastest access to the attached memory.

¹ http://www.tacc.utexas.edu/stampede
² http://www.top500.org/

There are several parallel programming models for such an architecture: pure MPI, pure OpenMP, hybrid MPI+MPI and hybrid MPI+OpenMP. Each of these models has pros and cons. For example, with a pure MPI programming model where we have one MPI process on each core, any existing MPI program can run without modifications and the MPI library does not need to support multiple threads. However, we lose performance with unnecessary intra-node communication, and a topology problem may occur: for a given application, the topology of MPI processes within a node can have an important effect on the performance. For example, given a 3D domain of size N³ on a node with 32 cores, a 1-dimensional data decomposition yields a communication volume per core of about 2N²; with a 3-dimensional data decomposition, we can have three times less communication. If we use the OpenMP-only model, although we will not have a topology problem within a node, we would need a virtual shared memory system across nodes, which results in inter-node communication much slower than with MPI. The hybrid MPI+MPI model consists in using MPI for inter-node communications and the MPI-3.0 standard [69] for shared memory programming. The advantage of this model is that no message passing is performed inside the SMP nodes and thread-safety is not an issue. However, we can encounter a topology problem similar to the pure MPI model, and unnecessary communications are also present, as in the pure MPI model. Finally, the hybrid MPI+OpenMP model has neither the topology problem nor the communication cost within an SMP node. The downside of this model is an increased code complexity and no performance guarantee compared to the other models.

The optimal parallel programming model for distributed shared memory architectures depends on the application. However, the hybrid programming model is usually better than pure MPI and pure OpenMP models for the following reasons. First, we eliminate the domain decomposition at the node level. Second, we ensure automatic memory coherence at node level. Third, data movement is bounded between nodes. Last, we can synchronize on memory instead of using barriers.

Of course, the hybrid programming model has its drawbacks. For example, a multithreaded algorithm created by aggregating MPI parallel components on a node will usually run slower than the MPI version of the algorithm. Moreover, the hybrid programming model adds code complexity to the application. Despite these drawbacks, the hybrid programming model has become standard since it not only balances the computational load of the applications, but also reduces memory traffic (especially for memory-bound applications). We will use this model in developing our Navier-Stokes solver. We decompose the computational domain into as many subdomains as available nodes. Within each subdomain, we use threads, according to the number of cores, for the heavy computations.
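A minimal sketch of this hybrid organization is given below: one MPI process per subdomain, OpenMP threads for the local work, and MPI calls issued only from the master thread (hence MPI_THREAD_FUNNELED). The subdomain size and the per-cell work are placeholders, not the actual solver code.

/* Sketch: hybrid MPI+OpenMP skeleton (one MPI process per node/subdomain,
   OpenMP threads for the heavy local computation). */
#include <mpi.h>
#include <omp.h>

#define LOCAL_CELLS 1000                 /* illustrative local subdomain size */

static double field[LOCAL_CELLS];

static void compute_cell(int k) { field[k] += 1.0; }   /* placeholder for the real work */

int main(int argc, char **argv)
{
    int provided, rank;
    MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* local computation performed by the OpenMP threads of this process */
    #pragma omp parallel for
    for (int k = 0; k < LOCAL_CELLS; ++k)
        compute_cell(k);

    /* halo exchanges and reductions would be issued here by the master thread */

    MPI_Finalize();
    return 0;
}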

2.3 General structure of the solver

We first present the main structure of the solver (see Fig. 2.9).

• Initialization
  – Data initialization: read the input file
  – Domain initialization: construct the subdomains
  – Fields initialization: allocate arrays
  – Operators initialization: construct the tridiagonal matrices

• Loop on time
  – Solve Helmholtz-like problem
  – Solve Poisson problem
  – Velocity update
  – Store results

• Finalize

Figure 2.9: Solver structure

In the data initialization step, we read inputs from an external file in which we specify the dimension, some physical quantities, and other characteristics of the problem. Then we initialize the computational domain and construct the set of subdomains according to the given number in the input file. In the field initialization step, we allocate the necessary memory for both velocity and pressure. In the final phase of initialization, we construct the operators (mainly tridiagonal) as described in Chapter 1.

Then we consider the solving phase which is inside a loop on time. This step, which has been clearly explained in Chapter 1, consists of solving a Helmholtz-like equation followed by a Poisson equation, and of updating the auxiliary variables by an explicit correction. We can choose to store the solution after each time iteration, or after a certain number of iterations in order to save memory space. For example, we can decide to store solutions every 500 iterations. If the time step is 0.001 second, then the final simulation will be constructed by the solutions obtained every 0.5 second. Of course, this can be changed, depending on the required accuracy.

[Flowchart: Initialization; then time iterations until the maximum time is reached, each iteration solving the Helmholtz-like equation (30%), solving the Poisson equation (66%), updating the variables (4%), and saving the data.]

Figure 2.10: The main procedure of NS solver and the time percentage of each step.

We present in Fig. 2.10 the main procedure of the NS solver. The result depicted in this figure is obtained by executing an existing NS solver developed at LIMSI [33]. We use one core of an AMD Opteron 6172 processor and compute, for one iteration, the percentage of time spent in the three main tasks (velocity solve in the Helmholtz problem, pressure solve in the Poisson problem, and variable updates). We can see in this figure that the Helmholtz problem takes about 30% of the time of one iteration, and that the Poisson problem is the most time-consuming task, taking about 2/3 of the execution time. Improving the performance of both solvers can therefore significantly improve the NS solver performance. In this chapter, we focus on the tridiagonal solver. As explained previously in Chapter 1, solving a Helmholtz-like equation using ADI involves solving a set of tridiagonal systems. Thus, improving the performance of the tridiagonal solve will lead to better performance of the Navier-Stokes solver.

We would like to mention that in Fig. 2.10, we use the partial diagonalization method to solve the Poisson equation.

2.4 Accelerating the solution of the tridiagonal systems

We recall that, from Eqs. (1.37) and (1.43), the operators are all of a Laplacian form. Then, using the second order central differencing scheme ($\Delta_h u_i = \frac{u_{i-1} - 2u_i + u_{i+1}}{\Delta h^2}$), the discretization of these operators leads to diagonally dominant tridiagonal systems. We explain in this section how to improve these tridiagonal solves in our NS solver.

Let us consider a general diagonally dominant tridiagonal system (2.6), $Ax = s$, where $A \in \mathbb{R}^{m \times m}$ and $x, s \in \mathbb{R}^m$.

$$\begin{pmatrix} b_1 & c_1 & & & \\ \ddots & \ddots & \ddots & & \\ & a_i & b_i & c_i & \\ & & \ddots & \ddots & \ddots \\ & & & a_m & b_m \end{pmatrix} \begin{pmatrix} x_1 \\ \vdots \\ x_i \\ \vdots \\ x_m \end{pmatrix} = \begin{pmatrix} s_1 \\ \vdots \\ s_i \\ \vdots \\ s_m \end{pmatrix}. \qquad (2.6)$$

This system can be solved using the Thomas algorithm given in Algorithm 1. This algorithm consists of two steps. The first step is a forward elimination where we eliminate the lower diagonal coefficients and transform the original matrix into an upper triangular one. The second step is a backward substitution that solves the upper triangular system. This requires O(m) operations.

Algorithm 1: Thomas algorithm.
Data: diagonals (a, b, c), RHS s.
Result: solution x (stored in s).
Forward elimination: for i = 2 to m do
    b_i = b_i − c_{i−1} × a_i / b_{i−1}
    s_i = s_i − s_{i−1} × a_i / b_{i−1}
end
Backward substitution: s_m = s_m / b_m
for i = m − 1 down to 1 do
    s_i = (s_i − c_i × s_{i+1}) / b_i
end
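For reference, a direct C transcription of Algorithm 1 is sketched below (double precision, no pivoting; the arrays are zero-based, a[0] and c[m-1] are unused, and the solution overwrites s). The function name and interface are illustrative, not the solver's actual routine.

/* Sketch of Algorithm 1: Thomas algorithm for a diagonally dominant tridiagonal system.
   b and s are overwritten; the solution is returned in s. */
void thomas(int m, const double *a, double *b, const double *c, double *s)
{
    for (int i = 1; i < m; ++i) {            /* forward elimination */
        double w = a[i] / b[i - 1];
        b[i] -= w * c[i - 1];
        s[i] -= w * s[i - 1];
    }
    s[m - 1] /= b[m - 1];                    /* backward substitution */
    for (int i = m - 2; i >= 0; --i)
        s[i] = (s[i] - c[i] * s[i + 1]) / b[i];
}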

This algorithm actually corresponds to a Gaussian elimination without pivoting. In the LAPACK [7] linear algebra library, the solution of tridiagonal systems is implemented in the DGTSV routine (when using double precision arithmetic) which performs a Gaussian elimination with partial pivoting. Note that, if there is no need for pivoting, the Thomas algorithm and the DGTSV routine are similar. In the following, we explain how we can accelerate the Thomas algorithm by using vectorization techniques.

Since the late 90's, processor manufacturers have provided specialized processing units called Single Instruction Multiple Data (SIMD) extensions. This feature allows processors to exploit the latent data parallelism available in applications by executing a given instruction simultaneously on multiple data stored in a single special register. However, taking advantage of the SIMD extensions remains a complex task. We can find for instance in [29] a description of Boost.SIMD, a high-level C++ library to program SIMD architectures. Boost.SIMD provides both expressiveness and performance by using generic programming to handle vectorization in a portable way.

A first requirement to get performance is to store data in arrays that have been specially aligned in memory. The SIMD unit is able to read a certain number of bytes at a time, and if it reads data from an unaligned memory address, the request will involve two read operations. For instance in Fig. 2.11, assuming the SIMD unit works with 8-byte units, reading 8 bytes from a relative offset of 5 will be done by first reading bytes 0-7, and then reading bytes 8-15. As a result, the relatively slow memory access becomes even slower.

Figure 2.11: Read 8 bytes from unaligned memory.

To improve performance, we need to make sure that the arrays that we use in the tridiagonal system are aligned in memory. However, part of our NS solver is written in standard Fortran 95, which does not provide any memory alignment guarantee for arrays. The only way to use aligned memory is to allocate it with an external C/C++ function, such as simd::allocator in the Boost.SIMD library. Once the C/C++ allocation is done, we can transfer this aligned memory allocation to Fortran through a standard Fortran array pointer via the ISO_C_BINDING interface (see Algorithm 2).

Algorithm 2: Aligned memory allocation in Fortran 95.
use ISO_C_BINDING
real(C_DOUBLE), pointer :: arr(:,:,:)
type(C_PTR) :: p
p = simd_alloc(int(L * M * N, C_SIZE_T))
!... simd_alloc is the C++ allocator of Boost.SIMD
!... L, M and N determine the shape of the 3-D array
!... C_SIZE_T is the kind of the number of elements to allocate
call c_f_pointer(p, arr, [L,M,N])
!... associate pointer arr with the allocated memory p
!... use arr and arr(i,j,k) as usual
call simd_dealloc(p)
!... free pointer p
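In the solver, simd_alloc is provided by Boost.SIMD. For illustration only, an equivalent minimal C-side allocator could be written with posix_memalign, as sketched below; the trailing-underscore names and pass-by-reference arguments mimic a typical Fortran calling convention, but the exact binding details depend on the compiler and are an assumption here.

/* Sketch: aligned allocation callable from Fortran (hypothetical replacement for
   Boost.SIMD's allocator). 32-byte alignment is chosen to suit AVX-width loads. */
#include <stdlib.h>

void *simd_alloc_(const size_t *n_elems)        /* number of double elements to allocate */
{
    void *p = NULL;
    if (posix_memalign(&p, 32, (*n_elems) * sizeof(double)) != 0)
        return NULL;                            /* allocation failed */
    return p;
}

void simd_dealloc_(void **p)
{
    free(*p);                                   /* release the aligned block */
    *p = NULL;
}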

The interest of using vectorization comes from the fact that we handle multiple right-hand sides (RHS) in the tridiagonal systems. We illustrate the method in Fig. 2.12, where we consider the simplified case of two RHS.

Figure 2.12: Thomas algorithm with vectorization, illustrated on a 4 × 4 system with two right-hand sides: (a) original system, (b) first iteration, (c) second iteration, (d) after elimination, (e) backward substitution.

The main steps of our vectorized algorithm applied to the system given in Fig. 2.12(a) can be described as follows.

Forward elimination (Fig. 2.12(b), 2.12(c), 2.12(d)):

- load two register units with $(s_i^1, s_{i+1}^1)$ and $(s_i^2, s_{i+1}^2)$, for $i = 1, 3, \ldots, n-1$,
- “shuffle” the two vectors to obtain $(s_i^1, s_i^2)$ and $(s_{i+1}^1, s_{i+1}^2)$ (the blue boxes in Fig. 2.12(b)).

- perform two Thomas iterations to eliminate $a_i$, $a_{i+1}$ and update the corresponding diagonal and RHS entries:

$$b_i = b_i - \frac{c_{i-1}\, a_i}{b_{i-1}}, \qquad (s_i^1, s_i^2) = (s_i^1, s_i^2) - \frac{(s_{i-1}^1, s_{i-1}^2)\, a_i}{b_{i-1}},$$
$$b_{i+1} = b_{i+1} - \frac{c_i\, a_{i+1}}{b_i}, \qquad (s_{i+1}^1, s_{i+1}^2) = (s_{i+1}^1, s_{i+1}^2) - \frac{(s_i^1, s_i^2)\, a_{i+1}}{b_i}.$$

Backward substitution (Fig. 2.12(e)):

- $(x_n^1, x_n^2) = \dfrac{(s_n^1, s_n^2)}{b_n}$,
- $(x_i^1, x_i^2) = \dfrac{(s_i^1, s_i^2) - c_i\,(x_{i+1}^1, x_{i+1}^2)}{b_i}$, for $i = n-1, \ldots, 1$.

Similarly to the RHS s, we compute two solution values with one operation.

The procedure is also described in Algorithm 3.

Algorithm 3: Thomas algorithm using SIMD.
• load two register units with two coefficients of the main diagonal and the lower diagonal;
• shuffle (permute) the data to obtain a 2 × 2 dense matrix;
• perform two Gaussian eliminations to eliminate the coefficients of the lower diagonal (see Fig. 2.12(c));
• solve the system backward with two RHS coefficients loaded in one register (see Fig. 2.12(e)).

Shuffling is a characteristic idiom of SIMD programming that replaces some classes of complex memory access patterns by computation on simpler patterns. Depending on the extension, these shuffles can either be limited (as in SSE2, where shuffling can only occur piecewise inside a given register) or arbitrary (as in Altivec, where shuffles are in fact complete byte permutations). Shuffling also enables idioms like deinterleaving, where scattered data can be fetched from main memory and brought back into a single, contiguous SIMD register. In our case, we use shuffling to aggregate values from the sparse representation of the system into a set of contiguous values in order to perform the elimination and back substitution in a SIMD way. The code is expected to be faster due to fewer memory accesses (thus limiting cache misses) as well as the SIMD speedup allowed by the layout of the data after the shuffling.

Our implementation extends this method to multiple RHS. Note that the memory alignment described previously enables the “load” step to be performed efficiently. We mention that this vectorization approach can also be used in other parts of our solver, for instance to compute the source term faster. More generally, this technique enables us to get good speedups when performing computations on arrays that have no data dependencies.
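To make the idea concrete, the sketch below solves one tridiagonal matrix with two RHS using SSE2 intrinsics, each __m128d holding the two RHS values of a given row. For simplicity it assumes the two RHS are already stored interleaved (s[2*i], s[2*i+1]), so no shuffle is needed; the solver described above instead shuffles the data from its original layout into this form. Function and array names are illustrative.

/* Sketch: Thomas algorithm for one matrix (a,b,c) and two interleaved RHS, using SSE2.
   s[2*i + r] is the value of RHS r at row i; the interleaved solution is written to x. */
#include <emmintrin.h>

void thomas_2rhs_sse2(int m, const double *a, double *b, const double *c,
                      double *s, double *x)
{
    for (int i = 1; i < m; ++i) {                              /* forward elimination */
        double w = a[i] / b[i - 1];
        b[i] -= w * c[i - 1];
        __m128d si   = _mm_loadu_pd(&s[2 * i]);
        __m128d sim1 = _mm_loadu_pd(&s[2 * (i - 1)]);
        si = _mm_sub_pd(si, _mm_mul_pd(_mm_set1_pd(w), sim1)); /* both RHS updated at once */
        _mm_storeu_pd(&s[2 * i], si);
    }
    __m128d xi = _mm_div_pd(_mm_loadu_pd(&s[2 * (m - 1)]),     /* backward substitution */
                            _mm_set1_pd(b[m - 1]));
    _mm_storeu_pd(&x[2 * (m - 1)], xi);
    for (int i = m - 2; i >= 0; --i) {
        __m128d rhs = _mm_loadu_pd(&s[2 * i]);
        xi = _mm_loadu_pd(&x[2 * (i + 1)]);
        rhs = _mm_sub_pd(rhs, _mm_mul_pd(_mm_set1_pd(c[i]), xi));
        _mm_storeu_pd(&x[2 * i], _mm_div_pd(rhs, _mm_set1_pd(b[i])));
    }
}

With the aligned allocation of Algorithm 2, the unaligned loads and stores above could be replaced by their aligned counterparts.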

2.5 Performance results for Navier-Stokes computations

2.5.1 Shared memory with pure MPI programming model

We performed some numerical experiments on a shared memory machine. The following experiments were carried out using a MagnyCours-48 system from the University of Tennessee, Knoxville, USA. This machine has a NUMA architecture and is composed of four AMD Opteron 6172 processors (with the SSE-4a instruction set) running at 2.1 GHz, with twelve cores each (48 cores total) and 128 GB of memory. Our solver is linked with the LAPACK and ScaLAPACK routines from version 10.3.6 of the Intel MKL [51] library, and communications are performed using OpenMPI 1.4.3. We use the pure MPI parallel programming model mentioned in Section 2.2 (no OpenMP directives) and one MPI process per core.

We consider a 3D vortex problem with mesh size 240³, i.e. about 1.4 × 10⁷ unknowns. The discretization sizes used in our simulations were chosen by physicists and are considered to be realistic for such a test problem. We represent the performance (in seconds) for one iteration of the NS solver, including the Helmholtz and Poisson equations and miscellaneous tasks (mainly I/O and velocity/pressure updates). The number of MPI processes varies from 1 to 48. We observe in Fig. 2.13 that the CPU time decreases significantly with the number of processes, showing a good scalability of the solver. We observe that the Helmholtz equation represents about 30% of the global computational time, and this percentage remains the same when the number of processes increases.


Figure 2.13: Time breakdown for one iteration of the NS solver.

Fig. 2.14 shows the parallel speedup of the solver. We obtain a speedup of 33 using 48 cores. We also observe in Fig. 2.14 that the solution of the Poisson equation scales slightly better than that of the Helmholtz equation. This is because in the Poisson equation we have only one tridiagonal system to solve (vs. three systems in the Helmholtz equation) and thus there is less information exchanged (as mentioned in Section 2.1, when we solve a tridiagonal system via the Schur complement method, we have a reordering of the variables that results in additional communication).


Figure 2.14: Strong scalability of NS solver.

In another experiment, we consider a 3D vortex problem where the size increases with the number of processes, with a fixed mesh size per subdomain (weak scalability).

The local size is set to 240 × 240 × 10 and 10 time iterations are performed. We observe in Fig. 2.15 that the time spent in solving the Helmholtz and Poisson equations does not vary (about 60 and 110 milliseconds, respectively) because the number of unknowns computed is always 240² × 10 = 5.76 × 10⁵. However, the global CPU time increases linearly with the number of processes due to a larger amount of I/O after each iteration. Indeed, the numerical solution is stored after each time iteration in order to generate a detailed animation that visualizes the fluid movement. In this numerical test, we only divide the domain along one direction (the z-direction here) to keep the shape of the interfaces. Thus, the amount of information to be exchanged (the number of interface elements) is $240^2 \times (\text{number of MPI processes} - 1)$.


Figure 2.15: Weak scaling performance of the Navier-Stokes solver on shared memory.

We now compare the performance of the vectorized Thomas algorithm (illustrated in Fig. 2.12) with the LAPACK 3.2 routine DGTSV from Netlib, linked with the MKL BLAS library, which solves a general tridiagonal system using Gaussian elimination with partial pivoting (note that, since our matrices are diagonally dominant, the routine DGTSV does not pivot and only the search for pivot is performed).

In Fig. 2.16(a), we plot the number of cycles required to compute one element of the result with respect to the number of RHS. This metric, computed as

$$c = \frac{\text{execution time} \times \text{frequency}}{\text{number of elements}}$$

allows us to do a fine-grain analysis of both the impact of vectorization and the impact of memory access. For the curve related to DGTSV, we observe a cycles-per-value cost that jumps each time we hit a cache size (L2 and L3). This is due to the fact that every computation requires the full amount of memory accesses to be completed.

However, as shown in the second curve of Fig. 2.16(a), by using vectorized shuffles which replace memory accesses by computation, the vectorized version of the Thomas algorithm has a constant CPU cycles-per-value cost up to the largest size (10⁶ in our experiment). In Fig. 2.16(b), we plot the ratio of execution times for DGTSV and the vectorized Thomas algorithm. For large problem sizes (e.g. more than 10⁶ unknowns), we observe that the vectorized version of the Thomas algorithm outperforms DGTSV by a factor 3. In the remainder of this chapter, this tridiagonal solver using SIMD extensions will be used in the Navier-Stokes solver.

(a) Cycles per value for the Thomas algorithm. (b) Ratio = time for LAPACK routine DGTSV / time for vectorized Thomas algorithm.

Figure 2.16: Performance of two implementations of the Thomas algorithm. Matrix size = 100. Test carried out using an Intel Xeon E5645.

2.5.2 Performance using MPI + OpenMP

We study in this section the performance of the Navier-Stokes solver on a cluster of multicore processors. We run the solver for 10 iterations and set the global mesh size to 256³.

(a) Time for solving the NS equations. (b) Parallel speedup of the NS solver.

Figure 2.17: Performance of NS solver on the Stampede system.

We use the Stampede system described in Section 2.2.3 to run the code. Each node of Stampede is composed of two Intel Xeon E5 processors, on which we use all 16 cores for multithreading. We observe in Fig. 2.17(a) that the execution time drops with the

increasing number of nodes. This execution time is consistent with the time obtained in Section 2.5.1 (note that the problem size and the processor speed are slightly different).

The performance is good since we obtain, in Fig. 2.17(b), a parallel speedup of about 24 with 32 compute nodes (the speedup here measures the ratio of the execution time using one node to the time using various numbers of nodes). This speedup is similar to the one obtained in Section 2.5.1.

Next, we study the weak scalability of the solver. The mesh size per node is set to 64³. We increase the number of nodes and measure the execution time for 10 iterations.


Figure 2.18: Weak scaling of NS solver (MPI-OpenMP implementation).

We observe in Fig. 2.18 that the execution time increases with the number of nodes. This increase of time is due to a larger amount of information exchanged as we increase the total mesh size. We did not store the solution after each iteration in order to observe the impact of increasing communication on the solver performance.

2.5.3 Performance comparison with an iterative method

We present in this section a performance comparison between the direct method based on partial diagonalization (see Section 1.5) and an iterative method for the solution of the Poisson equation described in Appendix A. The Hypre library [63] is used for the implementation of a Poisson solver based on the multigrid method and the successive over-relaxation (SOR) method.

Hypre is a library for solving large, sparse linear systems of equations on massively parallel computers. It contains several families of preconditioned algorithms, including structured multigrid and element-based algebraic multigrid. Hypre also provides commonly used Krylov-based iterative methods to be combined with its scalable preconditioners, such as the Conjugate Gradient and GMRES algorithms. Data structures in Hypre can take different forms: Hypre provides data structures to represent and manipulate sparse matrices through interfaces, and each interface gives access to several solvers without the need to write new interface code. These interfaces include stencil-based structured/semi-structured interfaces, a finite-element based unstructured interface, and a linear-algebra based interface.

Fig. 2.19 depicts the comparison of performance for the following Poisson solvers:

• our algorithm based on partial diagonalization and using the vectorized tridiagonal solves described in Section 2.4,
• an iterative method based on a multigrid preconditioned SOR solver (see Appendix A) with a straightforward implementation,
• the Hypre routine for the multigrid preconditioned SOR solver.

Figure 2.19: Performance comparison between different Poisson solvers.

These experiments were carried out using one node of the Stampede system described in Section 2.5.2. The test problem is the 3D Taylor-Green vortex that will be detailed in Chapter 4. The domain is considered to be entirely fluid, thus with no obstacles. The mesh size is 128³. We can use both the direct method (partial diagonalization) and the iterative method (multigrid + SOR) to solve the Poisson equation in the NS problem. We measured the average execution time for solving the Poisson equation using 2 MPI processes and 8 threads per process. We observe in Fig. 2.19 that the direct method is more than 3 times faster than the iterative method based on the Hypre library. Note that the Hypre implementation outperforms our straightforward implementation of the multigrid preconditioned SOR solver. However, as will be explained in Chapter 4, the iterative method enables us to address fluid flow simulations that include obstacles in the domain, which is not possible with the direct method.

2.6 Conclusion of Chapter 2

In this chapter, we have described the implementation of our Navier-Stokes solver for CPUs. Using a domain decomposition method, the solver shows good performance and satisfying scalability on both shared and distributed memory architectures. We have optimized the tridiagonal solver using vectorization techniques and improved its performance by a factor of three compared to the LAPACK routine DGTSV linked with MKL. For the MPI/OpenMP hybrid implementation, we obtain a speedup of 24 using 32 MPI processes and 16 threads per MPI process. The problem size is set to 256³, which is commonly used in such Navier-Stokes solvers. We published some of these results in a recent publication [100].

We also presented the performance of the Poisson solver implemented in the Navier- Stokes solver. We observe that the iterative method for solving Poisson problems can be improved using the Hypre library while the direct method, for suitable problems, remains faster.

If we want to further accelerate the simulations and adapt our NS solver to current architectures (namely heterogeneous architectures), vectorization techniques will not be sufficient and the use of accelerators cannot be avoided. In the next chapter, we will explain how to take advantage of Graphics Processing Units (GPUs) without modifying the main structure of our Navier-Stokes solver.

Chapter 3

Taking advantage of GPU in Navier-Stokes equations


In the domain of computational science, when considering high performance computing applications, we cannot neglect the use of accelerators, which have become a major component of modern supercomputers. Originally designed for graphics processing, the GPU (Graphics Processing Unit) can also enhance performance in scientific computing applications, and in particular in computational fluid dynamics. In this chapter, we describe how GPU computing can be used in our Navier-Stokes solver. In the first section, we discuss the development of CPU/GPU computing. In the second section, we propose GPU algorithms and implementations to accelerate our Navier-Stokes solver; we also describe the GPU version of the Helmholtz and Poisson solvers. In the third section, performance results are presented, followed by a concluding section.

3.1 Introduction to GPU computing

Different from classical CPU processors, GPUs were initially designed for and used in visual processing. The world's first GPU was the GeForce 256, released by Nvidia in 1999. The technical definition of this GPU is “a single-chip processor with integrated transform, lighting, triangle setup/clipping, and rendering engines that are capable of processing a minimum of 10 million polygons per second”¹.

Figure 3.1: Nvidia GeForce 256.

As the name suggests, GPUs were developed for graphics rendering. During their first years, the main purpose of GPUs was related to the graphics pipeline in visual processing. Later on, the GPU developed into a powerful programmable processor: an application programming interface was added and the compute capacity increased considerably. This era is marked by the GeForce 8800 GTX, which delivered over 330 Gflops, more than a high-end CPU of that time (2006).

Figure 3.2: Nvidia GeForce 8800 GTX.

Since then, GPU design has entered a new stage where it is not only used for calculations related to 3D computer graphics but also for other general purpose computations,

leading to the term GPGPU (General Purpose Graphics Processing Unit). The GPGPU is considered as a modified form of stream processor. This concept turns the massive computational power of a modern graphics accelerator's shader pipeline into general-purpose computing power. In some applications requiring massive vector operations, this can yield several orders of magnitude higher performance than a conventional CPU.

¹ www.nvidia.com/page/geforce256.html

GPGPUs are used for many types of parallel tasks. They are generally suited to high-throughput computations that exhibit data parallelism. Furthermore, GPU-based high performance computers are playing a significant role in large-scale modeling. In the Top500 list², three of the 10 most powerful supercomputers in the world use GPUs as accelerators.

Great progress has been achieved by using GPUs in many applications. For example, as reported on the Nvidia web site³, the software HMMER [26], which is used for searching sequence databases for homologs of protein sequences and for making protein sequence alignments, can be accelerated by a factor of 100 using 3 Tesla C1060 GPUs instead of one CPU. In [30], we can find experiments on Matlab code with the CUDA extension: for example, the simulation of a 2D elliptic vortex evolution (mesh size 256 × 256) is accelerated by a factor of 11 using a Quadro FX5600 GPU compared to an Opteron 250 processor.

So far, there are only few applications of GPUs to solve the full non-stationary incompressible 3D Navier-Stokes equations with an Eulerian grid based approach. Krüger [61] was one of the first to publish results in this field, with the target of real-time applications for fluid dynamics. His solver uses Chorin's projection approach [21] on a staggered grid, finite differences with forward and central differencing, the advection of velocity as proposed by Stam [88], a conjugate gradient solver for the pressure Poisson equation, and vorticity confinement [89] to reduce the numerical diffusion introduced by Stam's method.

Furthermore, Thibault and Senocak [91] implemented a multi-GPU solver for the full incompressible Navier-Stokes equations. Instead of Stam's advection approach, they use a first order explicit Euler scheme. The pressure Poisson equation is solved by a Jacobi iterative solver. To use multiple GPUs, they consider a shared-memory parallelization with Posix threads and a standard domain decomposition approach. Due to hardware limitations, they also compute in single precision. This way, they obtain a speedup factor of 33 on one GPU compared to a single CPU and a speedup factor of 100 on four GPUs.

² www.top500.org
³ http://www.nvidia.com/object/bio_info_life_sciences.html

Cohen and Molemaker [23] implemented a double precision solver for the Navier-Stokes equations. They included temperature in their model via the Boussinesq approximation [38]. The discretization employs a second order finite volume approach on a staggered regular grid, the pressure projection method and a second order Adams-Bashforth time integration [44]. A multigrid solver handles the Poisson equation. This way, a maximum speedup of 8.5 is obtained on the latest available graphics hardware (NVIDIA C1060) compared to an eight-core multithreaded fluid solver.

A significant effort to implement three-dimensional finite difference methods on GPUs has been made by Micikevicius [70]. He introduced base patterns for the fast computation of high order finite difference stencils. Additionally, he presented a scalable and fast multi-node/multi-GPU parallelization using MPI.

Because our Navier-Stokes solver is already functional on multicore architectures, we do not want to modify its main structure, for the sake of re-usability by physicists at LIMSI. The strategy of adding GPU computation to the original code is called the "minimal invasion" strategy [22]: we replace the routines to be optimized by GPU versions while keeping the same inputs and outputs. In this way, we may not reach the optimal GPU performance, but it is simple to switch between the CPU and CPU/GPU codes. The main purpose of our work is to develop a general Navier-Stokes solver supporting different architectures. In Chapter 2, we have shown that the user can configure the solver with MPI and OpenMP. Now, if the user has access to accelerators, he can also take advantage of these devices by enabling GPU computing in the solver.

The most common programming languages on GPU are CUDA ("Compute Unified Device Architecture", [73]) by Nvidia and OpenCL ("Open Computing Language", [95]) by the Khronos Group4. CUDA is specific to NVIDIA GPUs, whilst OpenCL is designed to work across a multitude of architectures including GPUs, CPUs and digital signal processors (DSP). These technologies allow specified functions from a C program to run on the GPU's stream processors. This enables C programs to take advantage of the GPU's ability to operate on large matrices in parallel, while still making use of the CPU when it is appropriate.

We use CUDA in our work. To avoid using the CUDA Fortran compiler, we add a C interface (see Algorithm 4) using the ISO_C_BINDING module to correctly call any CUDA function.

4 The Khronos Group was founded in January 2000 by a number of leading media-centric companies, including 3Dlabs, ATI, Discreet, Evans & Sutherland, Intel, NVIDIA, SGI and Sun Microsystems, dedicated to creating open standard APIs to enable the authoring and playback of rich media on a wide variety of platforms and devices.

Algorithm 4: Illustration of the Fortran-C-CUDA interface.

In a Fortran file:

    use ISO_C_BINDING
    type(C_PTR)     :: device_array   ! declare a CUDA array of C-pointer type
    integer(C_LONG) :: device_ad      ! declare a variable to store the address of the device array

    call GPU_alloc(device_array, device_ad, size)    ! allocate some GPU memory
    call GPU_routine(device_array, device_ad, ...)   ! do some computations on the GPU
    call GPU_free(device_array, device_ad)           ! free the GPU memory

In a C file:

    void GPU_alloc_(void **device_array, unsigned long *device_ad, int *size)
    {
        cudaMalloc(device_array, *size);                 /* CUDA memory allocator */
        *device_ad = (unsigned long) *device_array;      /* store the explicit address */
    }

    void GPU_routine_(void **device_array, unsigned long *device_ad)
    {
        *device_array = (double *) *device_ad;           /* recover the device pointer */
        /* ... perform some computations on the GPU ... */
    }

    void GPU_free_(void **device_array, unsigned long *device_ad)
    {
        *device_array = (double *) *device_ad;
        cudaFree(*device_array);                         /* CUDA memory deallocator */
    }

3.2 Using GPU for solving Navier-Stokes equations

As described in Chapter 2, the Navier-Stokes solver is composed mainly of a Helmholtz-like solver and a Poisson solver. It is therefore necessary to improve the performance of these two solvers in order to achieve better performance of the Navier-Stokes solver. We explain in the following how we can use GPU capabilities to attain such a goal.

3.2.1 A GPU Helmholtz-like solver

In this section, we present a Helmholtz solver using GPUs. The method for solving a Helmholtz-like problem is, as mentioned in Section 2.1, the ADI method. With the same domain configuration and spatial discretization, the ADI method consists in solving three tridiagonal systems. An efficient tridiagonal system solver on GPU is therefore essential for developing the Helmholtz solver.

Fig. 3.3 shows how the global computational time is distributed among these tasks when considering one iteration of the Navier-Stokes solver (CPU code using only MPI) for a Helmholtz-like problem (mesh size = 240^3) on a multicore system of two Intel E5645

6-core processors. We observe that solving the tridiagonal systems (solve) represents about 2/3 of the execution time. We also see in this figure that the calculations for convection and diffusion flux represent about 30% of the total execution time. So it is also an important aspect to take into account in the design of a GPU Helmholtz solver.

Figure 3.3: Time breakdown in the Helmholtz equation (Intel Xeon E5645, 2 × 6 cores, 2.4 GHz).

To solve the tridiagonal systems resulting from the ADI method, we implement the Thomas algorithm on GPU. If the domain has no solid obstacles, the tridiagonal system can be considered as one tridiagonal matrix with multiple right-hand-side vectors. Indeed, if we look at Fig. 2.6 and Eq. (2.4), when solving the block tridiagonal subsystem with B, each tridiagonal block in B is identical. Thus matrix B can be represented by only one of its tridiagonal blocks. With this "multiple RHS structure" of the system, we can exploit parallelism with GPU computing, as will be explained in the remainder of this section.
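To make the "one tridiagonal matrix, multiple right-hand sides" structure concrete, the following minimal C sketch (a hypothetical CPU helper, not the solver's GPU kernel) applies the sequential Thomas algorithm to all columns of a right-hand-side matrix at once. The elimination coefficients depend only on the matrix, so they are computed once and reused for every column, which is precisely the parallelism exploited on the GPU.

    #include <stdlib.h>

    /* Thomas algorithm for one tridiagonal matrix (sub-diagonal l, diagonal d,
     * super-diagonal u, size n) applied to nrhs right-hand sides stored
     * column-wise in b (n x nrhs, column-major). Hypothetical helper. */
    void thomas_multi_rhs(int n, int nrhs, const double *l, const double *d,
                          const double *u, double *b)
    {
        double *dp = malloc(n * sizeof *dp);    /* modified diagonal */
        dp[0] = d[0];
        /* forward elimination: coefficients depend only on the matrix */
        for (int i = 1; i < n; i++) {
            double w = l[i] / dp[i - 1];
            dp[i] = d[i] - w * u[i - 1];
            for (int j = 0; j < nrhs; j++)      /* same update for every RHS column */
                b[j * n + i] -= w * b[j * n + i - 1];
        }
        /* backward substitution, independently for each RHS column */
        for (int j = 0; j < nrhs; j++) {
            b[j * n + n - 1] /= dp[n - 1];
            for (int i = n - 2; i >= 0; i--)
                b[j * n + i] = (b[j * n + i] - u[i] * b[j * n + i + 1]) / dp[i];
        }
        free(dp);
    }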

One important aspect in GPU computing is the manipulation of GPU threads. As shown in Fig. 3.4, threads are gathered into blocks that form a grid. One thread block is executed on one streaming multiprocessor. Threads in the same block have faster access to the shared memory than to the global memory. If possible, it is better to store data in shared memory for faster memory accesses.

We describe the GPU version of the Thomas algorithm using one GPU. When there are multiple GPUs, or in other words multiple subdomains, we apply the Schur complement method described in Section 2.1, which divides the tridiagonal system into smaller ones and associates each subproblem to one GPU.

As illustrated by the upper part of Fig. 3.5, we store the tridiagonal matrix in the GPU shared memory so that it is accessible by all threads in the same thread block. The right-hand side matrix is divided into blocks according to the number of thread blocks, which is a preset parameter. The size of the thread block, or in other words the number of threads in one block, is defined by the ratio between the number of columns of the right-hand side matrix and the number of blocks. For a problem of size 256^3, if we want to use 32 thread blocks, then the block size should be 256^2/32 = 2048.

Figure 3.4: Illustration of GPU thread, block and grid.


Figure 3.5: Assignment of the tridiagonal matrix and RHS to thread blocks.

Once the input data have been distributed, as shown in the lower part of Fig. 3.5, each thread block has access to the tridiagonal matrix and to one part of the right-hand side matrix. However, the right-hand side matrix cannot fit in the shared memory: for the problem of size 256^3, the portion of the right-hand side matrix assigned to one thread block already occupies 4 MB, while the shared memory is only 49152 bytes on a Kepler K20 GPU. So we keep the right-hand side matrix in the global memory and load the row that we want to update into the shared memory, as explained in Algorithm 5.
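As a simplified illustration of this access pattern (a hypothetical kernel, not the code of Algorithm 5), the CUDA sketch below stages the elements of one row of the right-hand-side matrix through shared memory, updates them, and writes them back to global memory. It could be launched, for instance, as scale_row<<<nblocks, nthreads, nthreads*sizeof(double)>>>(bd, d, row, ncol).

    // Hypothetical kernel sketching the memory pattern of Algorithm 5: the RHS
    // matrix bd (n rows of ncol values) stays in global memory; each thread
    // loads the row element it owns into shared memory, updates it (here: the
    // first backward-substitution step, a division by d[row]), and stores it back.
    __global__ void scale_row(double *bd, const double *d, int row, int ncol)
    {
        extern __shared__ double s1[];                  // one value per thread
        int col = blockIdx.x * blockDim.x + threadIdx.x;
        if (col < ncol) {
            s1[threadIdx.x] = bd[row * ncol + col];     // load from global memory
            s1[threadIdx.x] /= d[row];                  // update in shared memory
            bd[row * ncol + col] = s1[threadIdx.x];     // store back to global memory
        }
    }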

Recall the Helmholtz-like equations resulting from the ADI method:

$\left(I - \frac{\alpha\Delta t}{Re}\,\Delta_x\right) T_1 = S,$  (3.1a)
$\left(I - \frac{\alpha\Delta t}{Re}\,\Delta_y\right) T_2 = T_1,$  (3.1b)
$\left(I - \frac{\alpha\Delta t}{Re}\,\Delta_z\right) \delta u^{(n+1)} = T_2.$  (3.1c)

As system (3.1) is solved successively, the solution of the first equation is used as the right-hand side of the second equation. However, the matrices $I - \frac{\alpha\Delta t}{Re}\Delta_x$, $I - \frac{\alpha\Delta t}{Re}\Delta_y$ and $I - \frac{\alpha\Delta t}{Re}\Delta_z$ in system (3.1) are tridiagonal only if the variables are ordered with respect to the solving direction (x, y, z, respectively). So we need a GPU kernel which performs the reordering according to the solving direction.

Algorithm 6 is an example of reordering from x→y→z to y→z→x, where n_1, n_2, n_3 are the sizes of the variables along the x, y, z directions, respectively.

Another method for solving Eq. (3.1) uses the explicit inverse of matrix B. We can find in [98] an expression for the inverse of a general non-singular tridiagonal matrix A, where each component of A^{-1} can be expressed as

$(A^{-1})_{ij} = \begin{cases} (-1)^{i+j}\, c_i c_{i+1} \cdots c_{j-1}\, \theta_{i-1}\phi_{j+1}/\theta_m, & i < j,\\ \theta_{i-1}\phi_{i+1}/\theta_m, & i = j,\\ (-1)^{i+j}\, a_{j+1} a_{j+2} \cdots a_i\, \theta_{j-1}\phi_{i+1}/\theta_m, & i > j, \end{cases}$  (3.2)

where the θ_i verify the recurrence relation

$\theta_i = b_i\,\theta_{i-1} - c_{i-1}\, a_i\,\theta_{i-2}, \qquad i = 2, \dots, m,$

with initial conditions θ_0 = 1 and θ_1 = b_1, and the φ_i verify the recurrence relation

$\phi_i = b_i\,\phi_{i+1} - c_i\, a_{i+1}\,\phi_{i+2}, \qquad i = m-1, \dots, 1,$

Algorithm 5: GPU implementation of Thomas algorithm with multiple RHS.

Data: a copy of the tridiagonal matrix (ld, d, ud) and the RHS vectors bd.
Result: solution xd (stored in bd).

    Declare two shared vectors s1 and s2
    Forward elimination:
        Load row 1 of bd into s1
        for i = 2, 4, 6, ... do
            Load row i of bd into s2
            Compute the elimination coefficient δ_i
            for each thread do
                Update s2 by s2 += δ_i s1
            end
            Store s2 back to bd
            Load row i+1 of bd into s1
            Compute the elimination coefficient δ_{i+1}
            for each thread do
                Update s1 by s1 += δ_{i+1} s2
            end
            Store s1 back to bd
        end
    Backward substitution:
        Load row n of bd into s1
        for each thread do
            Update s1 using s1 /= d_n
        end
        Store s1 back to bd
        for i = n-1, n-3, ... do
            Load row i of bd into s2
            for each thread do
                Update s2 using s2 = (s2 - ud_i s1) / d_i
            end
            Store s2 back to bd
            Load row i-1 of bd into s1
            for each thread do
                Update s1 using s1 = (s1 - ud_{i-1} s2) / d_{i-1}
            end
            Store s1 back to bd
        end

with initial conditions φ_{m+1} = 1 and φ_m = b_m. We also observe that θ_m = |A|.

We use formula (3.2) to compute the inverses of the three tridiagonal matrices of system (3.1) in the initialization step of the solver and store them in GPU memory. The solution of system (3.1) is then computed by matrix-matrix multiplications.
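A minimal C sketch of this initialization step is given below (a hypothetical helper; the copy of the result to GPU memory is omitted). It evaluates the θ and φ recurrences and then fills the dense inverse using formula (3.2).

    #include <stdlib.h>

    /* Hypothetical helper: explicit inverse of an m x m tridiagonal matrix with
     * sub-diagonal a[2..m], diagonal b[1..m] and super-diagonal c[1..m-1],
     * using formula (3.2). Arrays are 1-based for readability; inv is m x m,
     * row-major. */
    void tridiag_inverse(int m, const double *a, const double *b, const double *c,
                         double *inv)
    {
        double *theta = malloc((m + 1) * sizeof *theta);
        double *phi   = malloc((m + 2) * sizeof *phi);

        theta[0] = 1.0; theta[1] = b[1];
        for (int i = 2; i <= m; i++)     /* theta_i = b_i theta_{i-1} - c_{i-1} a_i theta_{i-2} */
            theta[i] = b[i] * theta[i-1] - c[i-1] * a[i] * theta[i-2];

        phi[m+1] = 1.0; phi[m] = b[m];
        for (int i = m - 1; i >= 1; i--) /* phi_i = b_i phi_{i+1} - c_i a_{i+1} phi_{i+2} */
            phi[i] = b[i] * phi[i+1] - c[i] * a[i+1] * phi[i+2];

        for (int i = 1; i <= m; i++)
            for (int j = 1; j <= m; j++) {
                double v;
                if (i == j) {
                    v = theta[i-1] * phi[i+1] / theta[m];
                } else if (i < j) {
                    double prod = 1.0;
                    for (int k = i; k <= j - 1; k++) prod *= c[k];
                    v = ((i + j) % 2 ? -1.0 : 1.0) * prod * theta[i-1] * phi[j+1] / theta[m];
                } else {
                    double prod = 1.0;
                    for (int k = j + 1; k <= i; k++) prod *= a[k];
                    v = ((i + j) % 2 ? -1.0 : 1.0) * prod * theta[j-1] * phi[i+1] / theta[m];
                }
                inv[(i-1) * m + (j-1)] = v;
            }
        free(theta); free(phi);
    }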

Algorithm 6: Reorder the solution array from x→y→z to y→z→x.

    int id1, id2, id3;
    for each thread i do
        id1 = i % n1;
        id2 = (i % (n1*n2)) / n1;
        id3 = i / (n1*n2);
        array_new[id2 + id3*n2 + id1*n2*n3] = array_old[i];
    end

3.2.2 A GPU Poisson solver

In this section, we will discuss the construction of a Poisson GPU solver using the partial diagonalization method described in Chapter 1.

Fig. 3.6 represents the time breakdown for one iteration of a Poisson problem (mesh size = 240^3) using a CPU implementation on a multicore system with two Intel E5645 6-core processors and using only MPI parallelization. According to the partial diagonalization method mentioned in Section 1.5, the main tasks are base projections and tridiagonal solves. We observe in Fig. 3.6 that the most time-consuming part is the base projections, which correspond to matrix-matrix multiplications. In the remainder of this section, we explain how to improve this calculation using GPU accelerators.

Figure 3.6: Time breakdown in the Poisson equation (Intel Xeon E5645, 2 × 6 cores, 2.4 GHz).

To reduce the execution time spent in matrix-matrix multiplication, we take advantage of GPU accelerators by calling the MAGMA [10, 93] routine magmablas_dgemm. MAGMA is a dense linear algebra library designed for multicore+GPU heterogeneous/hybrid systems. It contains most of the BLAS [1] and LAPACK routines modified to use accelerators like GPUs and, more recently, Intel Xeon Phi co-processors. One remark for performing the matrix-matrix multiplication is that we need to reorder the variables according to the considered direction. For example, to compute the new source term s̃, we have to reorder the source array s in the order y→z→x to be able to use the magmablas_dgemm routine to compute Q_y^{-1} s. Next, we have to once again reorder the result Q_y^{-1} s in the order x→y→z to perform the second multiplication with Q_x^{-1}. For the same reason, the product Q_x^{-1} Q_y^{-1} s must be ordered by z→x→y to fit into the block tridiagonal structure, and the solution needs to be once again reordered according to the multiplication factor. The algorithm for the reordering is exactly the same as Algorithm 6 shown in Section 3.2.1.
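The sketch below illustrates one base-projection step on a single GPU. It is a hypothetical example: the function and buffer names are invented, and the MAGMA routine magmablas_dgemm used in the solver is replaced here by cuBLAS's cublasDgemm, which plays the same computational role.

    #include <cublas_v2.h>

    /* Hypothetical sketch of one base projection on a single GPU:
     * d_Qx_inv is Qx^{-1} (n1 x n1), d_s is the source array stored as an
     * n1 x (n2*n3) column-major block, d_s_tilde receives Qx^{-1} * s.
     * A reordering kernel (cf. Algorithm 6) would then permute the result
     * for the next multiplication with Qy^{-1}. */
    void project_x(cublasHandle_t h, const double *d_Qx_inv,
                   const double *d_s, double *d_s_tilde,
                   int n1, int n2, int n3)
    {
        const double one = 1.0, zero = 0.0;
        /* s_tilde = Qx^{-1} * s : (n1 x n1) * (n1 x n2*n3) */
        cublasDgemm(h, CUBLAS_OP_N, CUBLAS_OP_N,
                    n1, n2 * n3, n1,
                    &one, d_Qx_inv, n1, d_s, n1,
                    &zero, d_s_tilde, n1);
    }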

Let us now study how to implement on GPU the matrix-matrix multiplication for the matrices Qi on multiple subdomains. Suppose that the 3D domain is divided into p subdomains along x direction as shown in Fig. 3.7. According to the principles of domain decomposition, each subdomain is assigned to one multicore processor Pi and to one GPU Gi.


Figure 3.7: 3D domain decomposition along i = 1 direction.

Recall that the Poisson equation is solved via,

$(\Lambda_x + \Lambda_y + \Delta_z)\,\tilde{\phi} = \tilde{s},$

where

$\tilde{\phi} = Q_y^{-1} Q_x^{-1}\,\phi, \qquad \tilde{s} = Q_y^{-1} Q_x^{-1}\,s.$

Let us consider for instance the multiplication of Q_x^{-1} and s that is used to compute the projection s̃ of the source term. As the source array s is already distributed on the corresponding processor or accelerator, and its size is often large, we do not want to re-distribute s by column blocks to perform the usual parallel matrix-matrix multiplication. The redistribution of s can be very expensive, especially when s is stored in the GPU memory. Instead, we distribute the matrix Q_x^{-1} by column blocks to the corresponding processors or accelerators, as shown in Fig. 3.8.

With Q_x^{-1} distributed in blocks, on accelerator G_i we multiply Q_{ji} and s_i for j = 1, 2, ..., p, where Q_{ji} ∈ R^{n1×n1} and s_i ∈ R^{n1×(n2×n3)}. Once all the p multiplications are performed, we send the results back to processor P_i and we call the MPI routines MPI_ALLTOALL and MPI_REDUCE to distribute the block multiplication results and to obtain the final multiplication result, as shown in Fig. 3.9.


Figure 3.8: Distribution of Q_x^{-1} = {Q_ij}_{i,j=1,...,p} and s on multiple processors.

P_1 → Q_11 s_1   Q_21 s_1   ...   Q_p1 s_1
P_2 → Q_12 s_2   Q_22 s_2   ...   Q_p2 s_2
 .
P_p → Q_1p s_p   Q_2p s_p   ...   Q_pp s_p

                MPI_ALLTOALL

P_1 → Q_11 s_1   Q_12 s_2   ...   Q_1p s_p
P_2 → Q_21 s_1   Q_22 s_2   ...   Q_2p s_p
 .
P_p → Q_p1 s_1   Q_p2 s_2   ...   Q_pp s_p

                MPI_REDUCE(+)

P_1 → Σ_{l=1}^{p} Q_1l s_l
P_2 → Σ_{l=1}^{p} Q_2l s_l
 .
P_p → Σ_{l=1}^{p} Q_pl s_l

Figure 3.9: Matrix-matrix multiplication with multiple subdomains.

To summarize, we compute Q_x, Q_y, Q_x^{-1} and Q_y^{-1} in the initialization phase and copy the corresponding column blocks to the associated GPUs. When performing the matrix-matrix multiplication, we first transfer the source array s to the GPU (unless it is already there) and then call the MAGMA routine magmablas_dgemm to obtain the blocks Q_ij s_j, which are then sent back to the processor. Then we exchange the necessary information via MPI routines and send the final result back to the GPU in order to perform the next multiplication or the tridiagonal solve.
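The exchange step can be sketched as follows (a hypothetical helper with invented buffer names): each process sends the j-th block of its local products to process j with MPI_Alltoall and then accumulates the p blocks it receives. The solver performs the final accumulation with an MPI reduction as indicated in Fig. 3.9; a simple local sum is shown here instead.

    #include <mpi.h>

    /* Hypothetical helper: exchange and accumulate the block products once
     * Q_{1i}s_i ... Q_{pi}s_i have been copied back to the host of process i.
     * products: p blocks of block_len doubles computed locally;
     * received: p blocks of block_len doubles after the all-to-all;
     * result:   block_len doubles holding sum_l Q_{il} s_l on process i. */
    void exchange_and_sum(const double *products, double *received, double *result,
                          int p, int block_len, MPI_Comm comm)
    {
        MPI_Alltoall(products, block_len, MPI_DOUBLE,
                     received, block_len, MPI_DOUBLE, comm);
        for (int k = 0; k < block_len; k++) {
            double sum = 0.0;
            for (int l = 0; l < p; l++)
                sum += received[l * block_len + k];
            result[k] = sum;
        }
    }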

3.2.3 General structure of the GPU solver

In the previous sections, we described our GPU solvers for the Helmholtz and Poisson problems separately. In this section, we explain how the GPU routines are integrated into the general NS solver.

We described in Chapter 2 a Navier-Stokes solver for shared and distributed memory architectures. Because of the fast development of GPUs, most modern computers have at least one GPU attached to the CPU host. So if we want our solver to be of general use, it should exploit the capabilities of GPUs for the tasks that are able to take advantage of them, similarly to the approach described for instance in [11]. However, the solver should remain able to work without accelerators; thus the main structure of the code remains unchanged and the use of GPUs is considered as an alternative when performing simulations.

As mentioned above, the strategy we applied in developing the code is the "minimal invasion" method, meaning that the GPU code should not interfere much with the CPU code. A minimal interface is established between these two programs for information exchanges.

Of course, GPU computing is not always the best choice when constructing the Navier-Stokes solver. As shown in Sections 3.2.1 and 3.2.2, we can identify the most time-consuming tasks in the Navier-Stokes solver. We choose to add GPU alternatives for the routines dealing with tridiagonal solves and matrix-matrix multiplications. In this way, if the user chooses to use GPU resources, the solver executes the GPU version of these routines. Otherwise, the solver keeps the CPU version and the GPU routines are simply ignored.

The code structure is illustrated by Fig. 3.10, which shows how GPU computing is integrated into the Navier-Stokes solver.

Initialization
    - Data initialization: read the input file
    - Domain initialization: construct the subdomains
    - Fields initialization: allocate arrays
    - Operators initialization: construct the tridiagonal matrices for the Helmholtz-like and Poisson problems
    - Compute the projection matrices Q_x, Q_y, Q_x^{-1}, Q_y^{-1} if partial diagonalization is chosen
    - Memory transfer: send the tridiagonal matrices and projection matrices to the device

Loop on time
    - Solve the Helmholtz-like problem on device:
        * Send right-hand sides to device
        * Perform tridiagonal solve on device
        * Send solution to host
    - Solve the Poisson problem:
        * Send right-hand sides to device
        * Perform matrix-matrix multiplication and tridiagonal solve on device
        * Send solution to host
    - Velocity update
    - Store results

Finalize

Figure 3.10: Solver structure with GPU computing.

3.3 Experimental results

3.3.1 Overview of computational resources

We perform experiments mainly on two machines. The first one, which is used for code development, is a local workstation in our laboratory. This machine is composed of two sockets, each with one Intel Xeon E5645 hexa-core processor, and a Tesla C2060 GPU.

The second machine is the Stampede system from the Texas Advanced Computing Center (TACC), already described in Section 2.2.3. In particular, we use the NVIDIA K20 GPUs attached to some of the nodes (128) of Stampede.

3.3.2 Performance of the Helmholtz solver

The 3D Helmholtz test problem that we consider is defined as follows.

$\begin{cases} V(x) - \alpha\Delta V(x) = S(x), & x \in \Omega = (0,1)^3,\\ V(x) = 0, & x \in \partial\Omega, \end{cases}$  (3.3)

where S = (1 + 3απ^2)V, x = (x_1, x_2, x_3) and α = 10^{-7}. The exact solution is

$V(x) = \begin{pmatrix} \sin(\pi x_1)\sin(\pi x_2)\sin(\pi x_3)\\ \sin(\pi x_1)\sin(\pi x_2)\sin(\pi x_3)\\ \sin(\pi x_1)\sin(\pi x_2)\sin(\pi x_3) \end{pmatrix}.$

In Fig. 3.11 we compare the two methods described in Section 3.2.1 for solving the tridiagonal systems resulting from Eq. (3.3) for different mesh sizes (Thomas algorithm and explicit inverse). For both methods we compute the absolute error given by ‖u − u_h‖/√N, where u and u_h are respectively the exact and approximate solutions. We noticed that the error is the same for both methods. Regarding the performance, Fig. 3.11 reports the execution time for the two methods. For instance, for a mesh size of 256^3, using the explicit inverse enables us to gain a factor 4 over the Thomas algorithm. However, since the explicit inverse is suitable only for problems with the same boundary conditions in each direction, the standard Thomas algorithm remains useful for more general Helmholtz problems.
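For reference, the error measure used above can be computed with a few lines of C (a hypothetical post-processing helper):

    #include <math.h>

    /* Hypothetical helper: absolute error ||u - uh|| / sqrt(N) between the exact
     * solution u and the approximate solution uh, both sampled at N points. */
    double abs_error(const double *u, const double *uh, long N)
    {
        double s = 0.0;
        for (long k = 0; k < N; k++) {
            double d = u[k] - uh[k];
            s += d * d;
        }
        return sqrt(s) / sqrt((double) N);
    }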

Figure 3.11: Performance of the Helmholtz solver using the Thomas algorithm and the explicit inverse.

3.3.3 Performance of the Poisson solver

Let us now consider the following Poisson problem.

$\begin{cases} \Delta\phi(x) = s(x), & x \in \Omega = (0,1)^3,\\ \dfrac{\partial\phi}{\partial n}(x) = 0, & x \in \partial\Omega, \end{cases}$  (3.4)

where s = -3π^2 φ and the exact solution is

$\phi(x) = \cos(\pi x_1)\cos(\pi x_2)\cos(\pi x_3).$

The performance of the Poisson solver is presented in Fig. 3.12. We observe that, when the problem size grows, the error (same definition as in Section 3.3.2) decreases and the execution time increases.

Figure 3.12: Performance of Poisson solver.

Tab. 3.1 lists the time breakdown for one iteration of the Helmholtz and Poisson solvers and compares the performance of the GPU solvers to that of the CPU solvers. We observe that the GPU implementations enable us to accelerate the calculations by roughly a factor 8 and 5 for the Helmholtz and Poisson solvers respectively, when compared to the CPU solvers using MPI or multithreading. The data transfer from CPU to GPU for the Helmholtz solver includes three RHS vectors (size 240^3) and three inverted matrices B^{-1} (size 240^2). For the Poisson solver, the amount is three diagonals of size 240^3, one RHS vector of size 240^3, and the matrices Q_1, Q_1^{-1}, Q_2, Q_2^{-1} of size 240^2. Consequently, the data movements for the Poisson solver require about 1/3 more time than for the Helmholtz solver (roughly 4 × 240^3 versus 3 × 240^3 double-precision values, the 240^2 matrices being negligible), which is confirmed in Tab. 3.1.

                                    Helmholtz (with B^{-1})    Poisson
Transfers CPU-GPU (only once)                  85                 109
Matrix multiplication                         216                  96
Solution reordering                           126                  84
Tridiagonal system solve                        -                 165
Total CPU solver (12 MPI procs)              2700                1460
Total CPU solver (12 threads)                2750                1760
Total GPU solver                              342                 345

Table 3.1: Time (ms) distribution for the Poisson and Helmholtz GPU solvers (mesh size = 240^3).

3.3.4 Performance of the hybrid CPU/GPU Navier-Stokes solver


Figure 3.13: Time for solving Navier-Stokes equations using CPU/GPU system (Stampede).

Fig. 3.13 depicts the performance of the Navier-Stokes solver by measuring the execution time of 10 time iterations for a problem of size 256^3. On the x-axis, one node includes 2 Intel Xeon E5 processors and one Kepler K20 GPU. We observe in Fig. 3.13 that the execution time decreases with the increasing number of computing nodes. We compare this result to the performance result presented in Fig. 2.17(a) in Section 2.5.2 by computing the acceleration ratio 1 − Time(GPU)/Time(CPU). We obtain an acceleration of 44.8% with one node and 34.5% with 32 nodes. The acceleration ratio decreases when we increase the number of nodes. This is because, in the strong scaling test, the computational workload per node decreases when we increase the number of nodes.


Figure 3.14: Parallel speedup for CPU/GPU Navier-Stokes solver (Stampede).

GPU routines are more efficient when they deal with large amounts of data. Thus the acceleration ratio drops with multiple nodes, illustrating that GPUs are less efficient in this case. In Fig. 3.14 we plot the speedup of the solver (strong scaling). We obtain a speedup of 20 with 32 computing nodes.


Figure 3.15: “Weak” scaling performance for Navier-Stokes solver (Stampede).

We also study the "weak" scalability of the solver, with a mesh size per node equal to 64^3. In this experiment, we increase the number of nodes and measure the execution time for 10 iterations. We observe in Fig. 3.15 that the execution time increases slightly with the number of nodes. This behavior is similar to that of Fig. 2.18 in Section 2.5.2, because the amount of information exchanged between nodes is the same in these two tests. Only the execution time for a given number of nodes is different, due to the use of GPU computing in Fig. 3.15.

3.4 Conclusion of Chapter 3

In this chapter, we explained our motivation for using GPU accelerators in our Navier-Stokes solver. We designed new algorithms and implementations for the Helmholtz and Poisson solvers using GPUs and we obtained satisfactory speedups. Then we modified our Navier-Stokes solver to integrate the new GPU versions of some routines. The resulting hybrid CPU/GPU Navier-Stokes solver is operational and can be used on different types of architectures. The experiments using the CPU/GPU solver on a cluster with up to 32 computational nodes (512 cores and 32 K20 GPUs) show a good scalability and a gain of up to 45% in execution time when compared to the CPU-only implementation. Our performance results also show that our GPU solver is more efficient for large-scale problem simulations. Some of these results were published in [101].

Chapter 4

Simulations of Physical Problems


In the field of computational fluid dynamics, comprehensive benchmarks are very important yet numerically challenging. Numerical benchmark cases provide frameworks to quantitatively explore the limits of computational tools and to validate them. In this chapter, we use two benchmark problems for the purpose of validating our numerical solver and assessing its computational efficiency. The first benchmark concerns the simulation of the evolution of three dimensional Taylor-Green vortices in a triply periodic cube. This setting allows us to use GPU acceleration for the tridiagonal solves of the linear systems resulting from the ADI (see Section 1.4) and partial diagonalization (see Section 1.5) methods for the Helmholtz and Poisson problems. The second benchmark problem concerns the three-dimensional flow around a square cylinder. The presence of an obstacle (the cylinder) inside the computational domain renders the problem non-separable. The solver then continues to rely on the ADI method for solving the Helmholtz

systems, but no partial diagonalization is made for the Poisson equations. Instead, an iterative method (SOR with multigrid, see Appendix A) is employed.

The numerical simulations presented in this chapter were all carried out on the Stampede system described in Section 2.2.3. The visualizations of the flow fields are performed using ParaView [71].

4.1 Three dimensional Taylor-Green vortices

The three dimensional Taylor-Green vortices benchmark is selected as it gives rise to a separable problem. It allows for the use of the direct methods designed in the thesis for the solution of the Helmholtz and Poisson equations. The benchmark is therefore ideal to measure the performance of our solver on a heterogeneous architecture.

4.1.1 Benchmark settings

The Taylor-Green (TG) flow is three-dimensional and periodic in all spatial directions. The dimensionless initial condition of the TG flow is given by

$u = \sin(x)\cos(y)\cos(z),$  (4.1a)
$v = \cos(x)\sin(y)\cos(z),$  (4.1b)
$w = 0,$  (4.1c)
$p = p_0 + \frac{\rho_0}{16}\,(\cos(2x) + \cos(2y))(\cos(2z) + 2),$  (4.1d)

for (x, y, z) ∈ [−π, π]^3. The TG flow is governed by the 3D incompressible Navier-Stokes equations. The Reynolds number of the flow is here defined as Re = ρ_0 V_0 L / μ, where V_0 and L are the reference velocity and length. In our tests, we considered Re = 400 and 800, while simulations are carried out from the initial condition at t = 0 to the final time t_final = 20 t_c, where t_c = L/V_0 is the characteristic convective time.

Regarding the domain decomposition, we used 8 subdomains, 2 along each spatial dimension. Each subdomain is assigned to one computational node; thus, the simulation uses in total 8 nodes. Computations use three mesh sizes: 64^3, 128^3, and 256^3.

Fig. 4.1 shows the initial condition of the flow, where the magnitude of the dimensionless velocity is plotted. The plot shows the initially regular arrangement of the vortices in the periodic domain. From this initial condition, the vortices subsequently evolve in a non-linear fashion and break up into smaller vortices, with the emergence of complicated small scale structures.

Figure 4.1: Initial state of the Taylor-Green vortices problem.

4.1.2 Validation

For the visualization of the flow and the evolution of the vortices, we plot at different times iso-surfaces of the so-called Q-criterion [49], a quantity classically used in CFD to extract and visualize coherent eddy structures in turbulent flows [50]. The Q-value is defined as the second invariant of ∇u. For an incompressible flow, we have

$Q = \frac{1}{2}\left(\|\Omega\|^2 - \|S\|^2\right),$

where S and Ω are the symmetric and antisymmetric parts of ∇u respectively, and ‖·‖ is the Euclidean matrix norm. Coherent eddy structures are then visualized by plotting iso-surfaces of Q. In our results, we plot the iso-surface of Q = 0.01.
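For illustration, the Q-value at a grid point can be evaluated from the local velocity-gradient tensor as in the following C sketch (a hypothetical post-processing helper, using the Frobenius norm):

    /* Hypothetical helper: Q-criterion at one grid point from the velocity
     * gradient tensor g[i][j] = du_i/dx_j, with S = (g + g^T)/2 the symmetric
     * part, Omega = (g - g^T)/2 the antisymmetric part, and
     * Q = 0.5*(||Omega||^2 - ||S||^2) in the Euclidean (Frobenius) matrix norm. */
    double q_criterion(const double g[3][3])
    {
        double s2 = 0.0, o2 = 0.0;
        for (int i = 0; i < 3; i++)
            for (int j = 0; j < 3; j++) {
                double s = 0.5 * (g[i][j] + g[j][i]);   /* symmetric part */
                double o = 0.5 * (g[i][j] - g[j][i]);   /* antisymmetric part */
                s2 += s * s;
                o2 += o * o;
            }
        return 0.5 * (o2 - s2);
    }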

Fig. 4.2 shows iso-surfaces of the Q-criterion at different times for Re = 800 with mesh size 256^3. For clarity, only the upper half of the domain is shown. In the plots, the iso-surfaces are colored by the magnitude of the velocity. We observe that, from the very simple coherent structures at t = 0 (see Fig. 4.2(a)), the flow quickly evolves to a more complicated pattern of coherent structures at t = 5 convective times (see Fig. 4.2(d)), and eventually yields coherent structures at smaller and smaller scales as time further increases, through vortex reconnection and break-up. The emergence of small scale structures also highlights the need for a fine mesh to correctly approximate the dynamics of such small features of the flow during the simulation. In addition, the complexity of the iso-surface relates to high velocity regions where viscous dissipation takes place. The effect of the viscous dissipation can also be appreciated by the global decay of the velocity magnitude as time advances. Note however the presence at early times of higher velocity regions, in particular at t = 5 (see dark red areas in Fig. 4.2(d)), due to vortex stretching.

(a) Iso-surface of Q = 0.01 at t = 0. (b) Iso-surface of Q = 0.01 at t = 1.

(c) Iso-surface of Q = 0.01 at t = 3. (d) Iso-surface of Q = 0.01 at t = 5.

(e) Iso-surface of Q = 0.01 at t = 10. (f) Iso-surface of Q = 0.01 at t = 20.

Figure 4.2: Iso-surface of Q = 0.01 of Taylor-Green vortices at different times.

The representation of the coherent structures provides a qualitative validation of the solver. For a quantitative validation we need a better criterion allowing for a comparison with results published in the literature. To this end, we consider the time evolution of the dissipation rate [79]. The dissipation rate ⟨ε⟩(t) is defined as the (negative) instantaneous variation of the averaged kinetic energy, that is

$\langle\varepsilon\rangle(t) \doteq -\frac{d\langle k\rangle(t)}{dt}, \qquad \langle k\rangle = \frac{1}{2}\,\langle u^2 + v^2 + w^2\rangle.$
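In practice, ⟨ε⟩(t) is estimated from the recorded time series of the averaged kinetic energy by finite differences, as in the following C sketch (a hypothetical post-processing helper, assuming a constant output time step dt):

    /* Hypothetical helper: estimate eps(t_i) = -d<k>/dt from nt samples k[i] of
     * the averaged kinetic energy taken with constant time step dt, using
     * central differences inside the interval and one-sided differences at the
     * two ends. */
    void dissipation_rate(const double *k, double *eps, int nt, double dt)
    {
        eps[0] = -(k[1] - k[0]) / dt;
        for (int i = 1; i < nt - 1; i++)
            eps[i] = -(k[i + 1] - k[i - 1]) / (2.0 * dt);
        eps[nt - 1] = -(k[nt - 1] - k[nt - 2]) / dt;
    }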

Figs. 4.3 and 4.4 report the time evolution of the dissipation rate from our simulations, at Re = 400 and 800 respectively. For the two Reynolds numbers, computations using two meshes are contrasted to appreciate the convergence of the simulations. Focusing first on the case of Re = 400, in Fig. 4.3, we observe that the computed dissipation rates are the same for the 64^3 and 128^3 meshes up to t ≈ 5. For later times, differences appear, denoting the lack of resolution of the coarser mesh. In particular, the plateau in the maximum dissipation rate for t ∈ [6, 9] is not well captured with the coarser mesh. In the case of the larger Reynolds number tested, Re = 800, the simulations for the two meshes agree better, although the 128^3 mesh seems to slightly overestimate the dissipation rate after the maximum. Again, insufficient resolution is to blame.


Figure 4.3: Dissipation rate ⟨ε⟩(t) at Re = 400, using 64^3 and 128^3 meshes.


Figure 4.4: Dissipation rate ⟨ε⟩(t) at Re = 800, using 128^3 and 256^3 meshes.

To complete the validation of the solver on the TG flow benchmark, we compare in Fig. 4.5 our computation of the dissipation rate ⟨ε⟩(t) at Re = 800, using the 256^3 global mesh, with computations reported in the literature [15, 35, 76]. We observe that our computation agrees well with the reference results. In [76], the mesh sizes are the same as in our tests for the different Re numbers. The authors of [15] have chosen a mesh size of 256^3 for all tests with different Re numbers. As for [35], the authors have used a high-order discontinuous Galerkin discretization for the simulation, and the result we plotted as reference was obtained using 64^3 degrees of freedom.


Figure 4.5: Comparison of our computation of the dissipation rate at Re = 800 with the computations of (a) Ouzzine [76], (b) Brachet et al. [15] and (c) Gassner and Beck [35].

4.1.3 Performance analysis

To quantify the improvements brought by the GPU acceleration developed in the thesis, we compare in Tab. 4.1 the execution times for different types of parallelization of the 3D Navier-Stokes solver:

- MPI+OpenMP+CUDA: CPU/GPU hybrid parallelization using MPI, OpenMP and CUDA,

- MPI+OpenMP: parallelization using MPI combined with OpenMP.

The tests were carried out using 8 computational nodes of Stampede (128 cores in total), for meshes with total sizes 64^3, 128^3 and 256^3, and a Reynolds number Re = 400. For the time measurement, we use 100 time iterations, without storing the solution. The initialization and set-up times are also excluded from the time measurements. In this way, the reported execution times truly represent the computational time spent in solving the NS equations.

Table 4.1 reports the computational times (in seconds) of one iteration, for the three meshes and the different parallelizations. We first observe that, as expected, the GPU acceleration is able to significantly reduce the computational time for all the mesh sizes considered. The improvement is quantified by means of the acceleration percentage defined as

$\text{Acceleration percentage} = 1 - \frac{\text{Time(GPU)}}{\text{Time(CPU)}},$  (4.2)

where Time(GPU) represents the execution time using the CPU/GPU three-level parallelization and Time(CPU) is the execution time using MPI with OpenMP. The reported values of the acceleration percentage indicate that the GPU parallelization becomes more efficient as the mesh size increases. Specifically, the acceleration relative to the MPI+OpenMP parallelization goes from ≈22% to 36% between the 64^3 and 256^3 meshes. This positive trend can be explained by the fact that the GPU better exploits the parallelism when dealing with large amounts of data. This effect is also visible from the factors given in parentheses in Tab. 4.1, which are the increases of execution time with respect to the 64^3 mesh for each of the parallelizations: while going to the 128^3 and 256^3 meshes we expect theoretically an increase of the computational load by factors of 8 and 64 respectively, it is seen that the execution times for the MPI+OpenMP+CUDA parallelization increase only by factors ≈7 and ≈58 respectively. This finding confirms the huge potential of using GPUs in large scale CFD simulations.

Mesh size                           64^3     128^3          256^3
MPI+OpenMP+CUDA                     0.028    0.2  (×7.14)   1.62 (×57.85)
MPI+OpenMP                          0.036    0.27 (×7.5)    2.53 (×70.3)
GPU acceleration from MPI+OpenMP    22.2%    25.9%          36%

Table 4.1: Time (s) per iteration of NS solver for Taylor-Green vortices (Re = 400) on Stampede (128 cores).

4.2 Flow around a square cylinder

The second benchmark considered in the thesis concerns the flow around a square cylinder. This benchmark, along with many others, has been defined within a DFG High-Priority Research Program by Schäfer and Turek [86], and has since been investigated by many researchers. Results on this benchmark can be found in [55] and [14]. In these articles, Re = 20 and this low Reynolds number leads to a steady flow. For higher Reynolds numbers, as the flow becomes unsteady, there are no precisely determined results [13]. However, we can find simulation results in many publications such as [56, 60, 81]. In our test, we are interested in the unsteady state of the flow around a square cylinder.

As the unsteadiness appears when Re > 45 [96], we choose to study the cases Re = 50, 100 and 150.

4.2.1 Benchmark settings

The configuration consists in a fixed cylinder with square section, having edge length D and axis in the z-direction. The cylinder is placed in the center of a channel made of two parallel walls (with normal in the y-direction) separated by a distance H = 10 D. The distance from the cylinder centerline to the channel entrance is equal to 10 D. The inflow conditions at the channel entrance are set with a bulk velocity U_m in the x-direction and a parabolic velocity profile:

$u(0, y, z) = 6 U_m \frac{y}{H}\left(1 - \frac{y}{H}\right),$  (4.3a)
$v(0, y, z) = 0,$  (4.3b)
$w(0, y, z) = 0.$  (4.3c)

The boundary conditions on the walls at y = 0 and y = 10 D are homogeneous Dirichlet conditions u = 0. On the lateral planes, at z = 0 and z = 8, we apply periodic boundary conditions. On the outlet plane of the channel, at x = 40 D, we apply the Neumann condition ∂u/∂n = 0. We also impose a flow rate preservation condition to ensure that the volume flux at the inlet equals the volume flux at the outlet. The configuration of the flow domain is illustrated in Fig. 4.6.

Finally, the discretization of the domain is performed using orthogonal grids matching the cylinder boundaries. The numerical simulations are carried out using 4 nodes of Stampede, with 2 subdomains along both the x and z directions. No domain decomposition is used along the y direction, because sharing the obstacle between subdomains slows down the convergence of the iterative method used to solve the Poisson equation.

4.2.2 Validation

Based on the characteristic velocity U_m, the length D and the fluid viscosity ν, the problem is entirely defined by the Reynolds number Re = U_m D / ν, which is set to 100. For this value of the Reynolds number, the flow around the cylinder is unstable and develops a pattern of alternated vortices in the wake, the so-called Von-Karman street.

Fig. 4.7 shows fields of the longitudinal component of the velocity at different times t, normalized by the convective time t_c = D/U_m. The plots show the results in an (x, y)-plane of the domain. We can see that, from the inlet with its parabolic profile, the flow develops instabilities behind the cylinder to form the Von-Karman street.


Figure 4.6: The geometry of the 3D laminar flow problem.

The instability grows as time goes on, eventually reaching a periodic dynamics as expected for the Reynolds number considered here. Further analysis of the results (not shown) reveals that the flow is actually two-dimensional (the w-component of the velocity is zero), although the simulation is three-dimensional. This finding demonstrates the correct behavior of the parallel solver, which does not break the z-invariance of the flow by introducing spurious numerical instabilities.

The Von-Karman street pattern is closely related to the Reynolds number, and we expect to observe vortex shedding with different wavelengths and amplitudes for different Reynolds numbers. In Fig. 4.8, we plot simulation results for Reynolds numbers of 50, 100 and 150. We see that the shedding is weaker for flows at lower Reynolds numbers: in Fig. 4.8(a) the shedding is hardly formed, while in Fig. 4.8(c) the shedding is very intense.

The vortex shedding process and the Von-Karman street can be better appreciated in Fig. 4.9, where the transverse component of the vorticity field is plotted.

For the validation of the numerical simulations, we focus on the reduced frequency of the vortex shedding. To this end, we record the temporal velocity field at two observation points P1 and P2 in the domain. We set P1 and P2 on the center line of the domain, at a distance of D and 2 D respectively from the cylinder in the downstream direction, and monitor the magnitude of the transverse velocity at P1 and P2.

(a) t = 10. (b) t = 30.

(c) t = 50. (d) t = 70.

Figure 4.7: Longitudinal velocity field at different times. Simulation for Re = 100 with total mesh size = 320 × 240 × 32.

(a) Re = 50. (b) Re = 100.

(c) Re = 150.

Figure 4.8: Snapshots of the longitudinal velocity field illustrating the structure of the Von-Karman street at different Reynolds numbers. Chapter 4. Simulations of Physical Problems 83

Figure 4.9: Transverse component of the vorticity for the flow at Reynolds number 150.

The signals are reported for Re = 150 in Fig. 4.10. The first plot, in Fig. 4.10(a), shows the signals during the transient time when the wake instability develops; the second plot, in Fig. 4.10(b), shows the signals after the periodic state has been reached. The spectra of these signals can be computed to extract their dimensionless fundamental frequency, or Strouhal number [104], defined by St = fD/U_m, where f denotes the shedding frequency. For our computation with Re = 150 we found St = 0.144, a value that agrees favorably with results reported in the literature [6, 36], therefore validating our simulations.
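As a simple alternative to the spectral analysis, the shedding frequency of the periodic signal can also be estimated from the mean spacing of its upward zero crossings, as in the C sketch below (a hypothetical post-processing helper; the thesis result was obtained from the spectra):

    /* Hypothetical helper: crude estimate of the Strouhal number St = f*D/Um from
     * nt samples v[i] of the transverse velocity recorded with time step dt,
     * once the signal is periodic. The shedding frequency f is taken as the
     * inverse of the mean period between upward zero crossings. */
    double strouhal(const double *v, int nt, double dt, double D, double Um)
    {
        int first = -1, last = -1, ncross = 0;
        for (int i = 1; i < nt; i++)
            if (v[i - 1] < 0.0 && v[i] >= 0.0) {     /* upward zero crossing */
                if (first < 0) first = i;
                last = i;
                ncross++;
            }
        if (ncross < 2) return -1.0;                 /* not enough periods recorded */
        double period = (last - first) * dt / (ncross - 1);
        return D / (Um * period);                    /* St = f*D/Um with f = 1/period */
    }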

4.2.3 Performance analysis

Next, we look at the performance of the solver on this benchmark. As for the first benchmark, we compare the performance of two types of parallelization of the 3D Navier-Stokes solver. However, because of the obstacle, and for the sake of accuracy, we cannot perform the simulation on a too coarse mesh, so that only two meshes are used to assess the performance. In addition, the configuration of the domain decomposition can significantly impact the number of iterations needed for the convergence of the iterative method, making a fair comparison of the computational times difficult when changing the domain decomposition configuration. As a result, we choose to report the performance of the two parallelization types for a fixed domain decomposition configuration, and we provide no complete scalability results for this benchmark.

Table 4.2 reports the execution times (in seconds per iteration) for the two different parallelizations of the NS solver. As previously, we also report the acceleration percentages defined by Eq. (4.2). It is seen that the use of GPUs does not decrease the execution time significantly. This is due to the fact that most of the execution time (∼70%) is dedicated to solving the Poisson problem, a task completely performed on CPU. In these tests, only the tridiagonal solves (for the Helmholtz problem using ADI) are performed on GPUs, and these calculations do not represent a large workload. This percentage is not constant because for the second mesh, locally coarser, the number of unknowns per process is smaller than for the first one.


(a) Oscillations begin to form for P1 and P2.


(b) Stable oscillations for P1 and P2.

Figure 4.10: Transverse component of the velocity at the observation points P1 and P2.

The fork-join operation of the threads being a constant overhead in both settings, the computational work shared by the threads is heavier in the first configuration and the final acceleration percentage is higher.

4.3 Conclusion of Chapter 4

In this chapter, the 3D Navier-Stokes solver developed in the thesis has been tested on two classical benchmarks in CFD, with the objective of validating the simulation code and assessing the performance of the parallelizations.

First, the separable problem of the Taylor-Green vortices has been considered. This problem was solved using direct solvers for the Helmholtz and Poisson problems.

Number of nodes                              1                 9
Number of subdomains along each direction    (1, 1, 1)         (3, 3, 1)
Mesh size per node                           320 × 240 × 32    160 × 64 × 32
Total mesh size                              320 × 240 × 32    480 × 192 × 32
MPI+OpenMP+CUDA                              0.98              0.21
MPI+OpenMP                                   1.07              0.225
GPU acceleration from MPI+OpenMP             8.4%              6.7%

Table 4.2: Time (s) for one NS iteration.

The GPU accelerators described in Chapter 3 could be used in this case, not only for the tridiagonal solves but also for solving the Poisson equations. We examined several criteria in order to validate the numerical solutions. We also performed computations for different types of parallelization and different mesh sizes. The results confirmed the benefit of using GPUs for large scale numerical simulations in the domain of fluid dynamics. Specifically, we obtained an acceleration of up to 36% by integrating GPUs in the Navier-Stokes solver.

Second, the flow around a square cylinder was considered. In this case, the domain includes one obstacle, making the problem non-separable and requiring an iterative method to solve the Poisson equation. A combination of SOR and multigrid methods was used to solve the Poisson problem (while the Helmholtz problem was solved using the ADI method on GPU). The simulations were first validated by comparing the Strouhal number of the flow with values reported in the literature. Finally, the investigation of the parallelization performance has shown that we cannot achieve significant improvement by using GPU for this problem, because most of the computational time is spent on CPU by solving the Poisson problem with the iterative method. Future work will investigate the use of GPUs in iterative methods.

Conclusion

This PhD manuscript described a parallel 3D Navier-Stokes solver that runs efficiently on different types of architectures and uses GPU accelerators. The solver has been enhanced thanks to the use of several levels of parallelism based on MPI, OpenMP and GPU programming. The improvement concerns the solution of the Helmholtz and Poisson problems that represent the main computational cost of the Navier-Stokes solver, including also the solution of tridiagonal systems using SIMD vectorization.

Our solver can be used with or without GPU accelerators, depending on the targeted architecture. It has been tested on various parallel architectures and has shown satisfactory scalability results. For instance, on the Stampede system, we obtained a speedup of 24 using 32 multicore nodes (512 cores in total), and a speedup of 20 with 32 compute nodes when also using GPUs. We have also developed independent GPU solvers for the Helmholtz and Poisson problems. Benchmarks on real applications enabled us to validate numerically the computed solutions by comparing them to reference results. We plan to integrate this solver in a future library for fluid dynamics simulations.

There are still some research directions that deserve further investigations.

This PhD thesis illustrated how it is possible to solve the incompressible NS equations by taking advantage of heterogeneous parallel architectures. However, the memory transfers between the CPU host and the GPU device can still be reduced in a future implementation. These data movements between CPU and GPU occur for instance in the following phases. First, at every time step, when solving a tridiagonal system, we need to transfer the right-hand side vectors (which are different at each time step) to the GPU, while the tridiagonal matrix is transferred once in the initialization phase because the matrix is constant. The solutions of the tridiagonal systems are then sent back to the CPU for the next time iteration. Second, in the Poisson equation, before calling "dgemm" from MAGMA, we have to send the source matrix (matrix S in Section 3.2.2) to the GPU and receive the solution matrix φ from the GPU after the MAGMA call. In a next version, it would be preferable to overlap these memory transfers with computation (e.g., the computation of the temperature when the temperature is a variable). Another possibility is to use task

streaming, which would allow us to transfer data by blocks and to start the computation before having received the whole amount of data.

Moreover, the Navier-Stokes solver should also be adapted to the Intel Xeon Phi co-processors in order to take advantage of their large vector registers for better vectorization performance. This coprocessor can also be considered as an accelerator by using the offload execution mode.

The solver could also benefit from closer links with external libraries such as Hypre, which provides many iterative solvers. Indeed, when there are obstacles inside the fluid flow (see Section 4.2), our direct solver is not applicable and the Hypre library can provide good performance on CPU architectures (see Section 2.5.3) for the solution of the Poisson equation (and also of the Helmholtz-like equation, as a replacement for ADI). Our solver could also benefit from some external libraries for which there has been recent development on GPUs, such as the Paralution [5] library.

Finally, our solver, which is developed for simulations of incompressible fluid flows, can also be adapted to address dilatable flows and low Mach number flows. It can also address the case where the Navier-Stokes equations are coupled with transport equations. Different numerical methods are used in such situations and they are not addressed by this PhD thesis.

Appendix A

Iterative Methods for Linear Systems

A.1 Iterative Methods

Direct methods for solving linear systems theoretically give the exact solution in a finite number of operations. Unfortunately, this is not always true in practice because of round-off errors. Contrary to direct methods, iterative methods consist in constructing a sequence of approximate solutions that converges to the exact solution of the system. The main advantage of an iterative method is that it is self-correcting. In this appendix, we present some iterative methods for solving a general linear system

$Ax = b,$  (A.1)

where A ∈ R^{n×n} and x, b ∈ R^n.

A.1.1 Bases of iterative methods

An iterative method for solving the linear system Ax = b constructs a sequence of approximations x_i, i = 0, 1, 2, ..., which under certain conditions converges to the exact solution x_e of the system Ax_e = b. To do so, it is necessary to choose a starting point x_0 and a rule that is iteratively applied to compute x_{i+1} from x_i.

A starting point x_0 is usually chosen as an approximation of x_e. Next, given x_i, i ∈ N, the next element of the sequence is computed using a rule of the form

$x_{i+1} = B_i x_i + C_i b, \qquad i = 0, 1, 2, \dots,$  (A.2)

where B_i, C_i ∈ R^{n×n}, i ∈ N. Different choices of B_i and C_i define different iterative methods.

To guarantee the convergence of an iterative method, several conditions must be satisfied. First of all, the rule must satisfy

$B_i + C_i A = I_n \quad \text{for all } i \in \mathbb{N},$

or equivalently,

$x_e = B_i x_e + C_i A x_e, \qquad i \in \mathbb{N}.$

In other words, the exact solution x_e is a fixed point of the rule and the method cannot diverge from the exact solution. Secondly, given a starting point x_0 ≠ x_e, the rule must ensure that the approximate solution x_i converges to x_e as i increases.

To satisfy the second condition, we must have

$\lim_{i\to\infty} B_i B_{i-1} \cdots B_0 = 0.$

If we choose a stationary iterative method, which means that B_i = B for all i, we must have

$\rho(B) < 1,$  (A.3)

where ρ(B) is the spectral radius of B, ρ(B) = max_{i=1,...,n} |λ_i|, with λ_i the eigenvalues of B.

We note that the convergence condition ρ(B) < 1 holds, for example, if ‖B‖ < 1 in any matrix norm. Moreover, condition (A.3) guarantees the self-correcting property of iterative methods, since convergence takes place independently of the choice of the starting point x_0. Thus, if round-off errors affect x_i during the i-th iteration, x_i can always be considered as a new starting point and the iterative method will still converge. As a result, iterative methods are in general more robust than direct methods.

In principle, an iterative procedure should be kept going until x_i = x_e. This is impractical and usually unnecessary. Therefore, a stopping (or convergence) criterion is used to stop the iterative procedure when a pre-specified condition is met. One of the most used stopping criteria is based on the change of the solution or residual vector along the iterations. Specifically, given a small ε > 0, the iterative procedure is stopped after the i-th iteration when ‖x_i − x_{i−1}‖ ≤ ε, ‖r_i − r_{i−1}‖ ≤ ε, or ‖r_i‖ ≤ ε, where r_i = Ax_i − b is the residual vector. A maximum number of iterations is also usually specified. If an iterative method does not meet the stopping criterion before reaching the maximum iteration number, it is considered to be inefficient. In this case, other methods should be considered or a preconditioner should be applied.

From the above general principles of iterative methods, several variants can be derived by choosing different matrices B and C (for stationary iterative methods).

A.1.2 Jacobi method

The Jacobi method is based on the following observation. Suppose that A has nonzero diagonal elements. Then the diagonal part D of A is nonsingular and Eq. (A.1) can be rewritten as Dx + (L + U)x = b, where U and L denote the strictly upper and strictly lower triangular parts of A, respectively. As a result,

$x = D^{-1}\left[(-L - U)x + b\right].$  (A.4)

Replacing x on the left-hand side by x_{i+1} and x on the right-hand side by x_i leads to the Jacobi iteration:

$x_{i+1} = -D^{-1}(L + U)\,x_i + D^{-1}b.$  (A.5)

The intuition behind the Jacobi method is very simple: given an approximation x^{old} of the solution, we express the k-th component x_k of x as a function of the other components using the k-th equation, and compute x_k^{new} from x^{old}:

$x_k^{new} = \frac{1}{A_{kk}}\Big(b_k - \sum_{\substack{j=1\\ j\neq k}}^{n} A_{kj}\, x_j^{old}\Big), \qquad k = 1, \dots, n.$  (A.6)

The convergence condition for the Jacobi method is ρ(D^{-1}(L + U)) < 1. This condition is satisfied for a relatively large class of matrices, including diagonally dominant matrices1 and symmetric matrices2 such that D, L + D + U, and −L + D − U are all positive definite3. Although there are many variants of the basic principle of the Jacobi method to improve the convergence rate, the advantage of this method is a simple and fast implementation: the computation of a component x_k^{new} of the new iterate is independent of the other components, making the Jacobi method embarrassingly parallel.

1 Matrices A such that Σ_{j=1, j≠i}^{n} |A_ij| ≤ |A_ii| for i = 1, ..., n.
2 Matrices A such that A_ij = A_ji.
3 A symmetric real matrix A is said to be positive definite if z^T A z is positive for every non-zero column vector z.
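A minimal dense-matrix C sketch of the Jacobi iteration (A.6), with the stopping criterion discussed above, is given below (a hypothetical helper; in the solver the method is of course applied to sparse, structured matrices).

    #include <math.h>
    #include <string.h>

    /* Hypothetical helper: Jacobi iterations for a dense n x n system Ax = b
     * (row-major A), following Eq. (A.6). Stops when ||x_new - x_old|| <= eps
     * or after maxit iterations; returns the number of iterations performed. */
    int jacobi(int n, const double *A, const double *b, double *x, double *xnew,
               double eps, int maxit)
    {
        for (int it = 1; it <= maxit; it++) {
            for (int k = 0; k < n; k++) {            /* each component is independent */
                double s = b[k];
                for (int j = 0; j < n; j++)
                    if (j != k) s -= A[k * n + j] * x[j];
                xnew[k] = s / A[k * n + k];
            }
            double diff = 0.0;
            for (int k = 0; k < n; k++) {
                double d = xnew[k] - x[k];
                diff += d * d;
            }
            memcpy(x, xnew, n * sizeof(double));     /* accept the new iterate */
            if (sqrt(diff) <= eps) return it;
        }
        return maxit;
    }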

A.1.3 Gauss-Seidel method

From the decomposition A = D + L + U, similar to the Jacobi method, the Gauss-Seidel method expresses Eq. (A.1) as

$(L + D)x + Ux = b,$  (A.7)

which implies

$x = (L + D)^{-1}\left[-Ux + b\right].$  (A.8)

This gives the iteration formula of the Gauss-Seidel method:

$x_{i+1} = -(L + D)^{-1} U x_i + (L + D)^{-1} b,$  (A.9)

or componentwise

$x_k^{new} = \frac{1}{A_{kk}}\Big(b_k - \sum_{j=1}^{k-1} A_{kj}\, x_j^{new} - \sum_{j=k+1}^{n} A_{kj}\, x_j^{old}\Big), \qquad k = 1, \dots, n.$  (A.10)

The main difference between the Gauss-Seidel and Jacobi methods lies in a more efficient use of Eq. (A.6). When computing the k-th component x_k^{new} of the new approximate solution, the first k−1 elements x_1^{new}, ..., x_{k−1}^{new} are already known and are assumed to be more accurate than x_1^{old}, ..., x_{k−1}^{old}. Thus, it is possible to use these new values instead of the old ones and increase the convergence rate. Moreover, using this strategy, the newly computed elements of x^{new} can directly overwrite the respective elements of x^{old}, which saves memory. The Gauss-Seidel convergence condition is ρ((L+D)^{-1}U) < 1. This condition holds for diagonally dominant and positive definite matrices.

A.1.4 Successive over-relaxation method

The successive over-relaxation (SOR) method is a further refinement of the Gauss-Seidel method. By adding and subtracting $Dx_i$ in the Gauss-Seidel formula (A.9), we obtain

\[ (D + L)x_{i+1} = b - (U + D)x_i + Dx_i, \tag{A.11} \]
or
\[ x_{i+1} = x_i - D^{-1}\big[ L x_{i+1} + (D + U)x_i - b \big] = x_i - \Delta_i, \tag{A.12} \]
which expresses the next approximate solution through a correction $\Delta_i$ from $x_i$ to $x_{i+1}$. It is then natural to ask whether the iterations can converge faster if we “overly” correct $x_i$ at each iteration, i.e. if $x_i$ is corrected by a multiple $\omega$ of $\Delta_i$. This idea leads to the SOR formula:

\[ x_{i+1} = x_i - \omega D^{-1}\big[ L x_{i+1} + (D + U)x_i - b \big], \tag{A.13} \]
or in component form

 k−1 n  new 1 X new X old old xk = ω bk Akjxj Akjxj  + (1 ω)xk . (A.14) Akk − − − j=1 j=k

The parameter $\omega > 0$ is called the (over)relaxation parameter, and it can be shown that SOR can only converge for $\omega \in (0, 2)$ [57]. A well-selected $\omega$ can accelerate the convergence, as measured by the spectral radius of the corresponding iteration matrix $B$ [42] (a lower spectral radius $\rho(B)$ means faster convergence).

One important result regarding the value of $\omega$ can be found in [105]. Let the matrix $A$ be two-cyclic consistently ordered⁴. Then, if the Gauss-Seidel iteration matrix $B = -(L + D)^{-1}U$ has a spectral radius $\rho(B) < 1$, the optimal relaxation parameter $\omega$ in SOR is given by
\[ \omega_{opt} = \frac{2}{1 + \sqrt{1 - \rho(B)}}, \tag{A.15} \]
and for this optimal value it holds that $\rho(B; \omega_{opt}) = \omega_{opt} - 1$.

Using SOR with the optimal relaxation parameter significantly increases the rate of convergence compared to the Gauss-Seidel method. If $\omega_{opt}$ cannot be computed exactly, it is better to take $\omega$ slightly larger than its optimal value, rather than smaller.
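To make the relaxation explicit, here is a hedged C++ sketch of one SOR sweep, Eq. (A.14), together with the optimal parameter of Eq. (A.15); setting omega = 1 recovers the Gauss-Seidel sweep. The dense row-major storage and the function names are illustrative assumptions, not the thesis implementation.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// One SOR sweep, Eq. (A.14): the Gauss-Seidel value is blended with the
// previous iterate through the relaxation parameter omega; omega = 1
// recovers the Gauss-Seidel sweep. Dense row-major storage is assumed
// for illustration only.
void sor_sweep(const std::vector<double>& A, const std::vector<double>& b,
               std::vector<double>& x, std::size_t n, double omega)
{
    for (std::size_t k = 0; k < n; ++k) {
        double s = 0.0;
        for (std::size_t j = 0; j < n; ++j)
            if (j != k) s += A[k * n + j] * x[j];
        double gs = (b[k] - s) / A[k * n + k];     // Gauss-Seidel value
        x[k] = omega * gs + (1.0 - omega) * x[k];  // over-relaxed update
    }
}

// Optimal relaxation parameter of Eq. (A.15) for a two-cyclic consistently
// ordered matrix, given the spectral radius rho (< 1) of the Gauss-Seidel
// iteration matrix.
double omega_opt(double rho)
{
    return 2.0 / (1.0 + std::sqrt(1.0 - rho));
}
```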

A.2 Multigrid methods

Iterative solvers (Jacobi, Gauss-Seidel, SOR), when applied to the resolution of discretized partial differential equations, e.g. Helmholtz and Poisson equations, have a tendency to “stall”, i.e. to fail in effectively reducing the residual after several iterations. The problem usually becomes more severe when the spatial mesh is refined; in fact, standard solvers behave much better on coarse meshes. A close inspection of this behavior reveals that the convergence rate is a function of the error frequency, i.e. the fluctuation of the error from one grid point to another. The error contained in high-frequency modes has a fast convergence rate and is quickly smoothed out. However, the

remaining error with low-frequency fluctuations has a much lower convergence rate. As a result, for most iterative methods the number of iterations needed to reach a converged solution is linearly proportional to the number of nodes (in one direction). This behavior can be traced back to the fact that, during the iterative process, the information travels over one grid point per iteration, while convergence requires the information to travel back and forth through the mesh several times.

⁴ A matrix $A$ is said to be two-cyclic consistently ordered if the eigenvalues of the matrix $M(\alpha) = \alpha D^{-1}L + \alpha^{-1}D^{-1}U$, $\alpha \neq 0$, are independent of $\alpha$.

To remedy this issue, multigrid methods [16] have been proposed. These methods are based on the idea of using coarser grids, on which a low-frequency error is seen as a high-frequency one, to improve the convergence rate of iterative methods. Multigrid methods aim at accelerating the convergence of a basic iterative method by introducing a global correction, which is accomplished by solving a coarse problem. This principle involves interpolation between coarser and finer grids [97]. There are many variants of multigrid algorithms, but the common features (of geometric ones) are a hierarchy (levels) of discretizations (grids) [103], the restriction and prolongation operators that transfer the residual and solutions between levels, and finally the smoothing procedure. In summary, the procedure of a multigrid method can be defined in three elementary steps:

• Smoothing: reducing high-frequency errors by solving a residual equation. An iterative method is usually used for that purpose.

• Restriction: casting the residual error from a fine grid to a coarser one. For example, we keep the point-wise residual defined on the points shared by two successive grid levels.

• Prolongation: interpolating a correction computed on a coarse grid onto a finer grid. A weighted mean is often used for the interpolation. For example, on the finer grid we keep the value at the shared points and send half of the value to the neighboring points; in this way, the value at the points which are not defined on the coarse grid is computed as the mean of its two neighbors (see the sketch after this list).
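As a concrete one-dimensional illustration of the two transfer operators just described, the following C++ sketch assumes that the coarse grid keeps every other point of the fine grid (fine size $= 2 \times$ coarse size $- 1$). The layout and function names are illustrative assumptions, not the operators of the thesis solver.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Restriction: keep the fine-grid residual at the points shared with the
// coarse grid.
std::vector<double> restrict_to_coarse(const std::vector<double>& fine)
{
    std::vector<double> coarse((fine.size() + 1) / 2);
    for (std::size_t i = 0; i < coarse.size(); ++i)
        coarse[i] = fine[2 * i];
    return coarse;
}

// Prolongation: shared points take the coarse value, the in-between fine
// points receive half of each neighbouring coarse value (linear interpolation).
std::vector<double> prolongate_to_fine(const std::vector<double>& coarse)
{
    assert(!coarse.empty());
    const std::size_t fine_size = 2 * coarse.size() - 1;
    std::vector<double> fine(fine_size, 0.0);
    for (std::size_t i = 0; i < coarse.size(); ++i)
        fine[2 * i] = coarse[i];
    for (std::size_t i = 1; i + 1 < fine_size; i += 2)
        fine[i] = 0.5 * (fine[i - 1] + fine[i + 1]);
    return fine;
}
```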

A typical multigrid approach is illustrated in Fig. A.1, for the case of a two-dimensional domain. Starting from the finest grid, with 8 × 8 cells, a hierarchy of 3 successively coarser grids is constructed by halving the number of cells in each spatial direction when going from one level to the next.

Different multigrid algorithms can be constructed from a hierarchy of grids and the multigrid operators. The most common multigrid algorithms are the V-cycle, W-cycle and F-cycle. In this thesis, we only consider the V-cycle, which is also illustrated in Fig. A.1.

Figure A.1: V-cycle multigrid scheme

A V-cycle starts on the finest grid, supporting the current approximation of the solution, where several smoothing iterations are performed to reduce the high-frequency components of the residual. The remaining (low-frequency) residual is restricted to the next coarser grid, where it forms the right-hand side of the problem discretized on that grid. Several smoothing iterations are performed to approximate the solution at the current level, before restricting the remaining residual to the next coarser grid. The sequence of restriction / smoothing steps is repeated until the coarsest grid of the hierarchy is reached, terminating the first leg of the V-cycle. The second leg of the V-cycle proceeds from the coarsest grid to the finest one. First, the solution obtained on the coarsest grid is prolongated and added to the solution at the next grid level. The solution at that level being updated, several smoothing iterations are performed to further reduce the residual, before it is prolongated and added to the next finer level solution. This sequence is repeated until the finest grid is reached and the V-cycle is completed. The convergence criterion is finally checked and, if it is not satisfied, a new V-cycle is performed. Clearly, an important parameter of the multigrid algorithm is the number of smoothing iterations performed at each level. This number must be large enough to effectively reduce the high-frequency components of the residual, but not so large that iterations become ineffective once stagnation occurs.
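To make the recursive structure of the V-cycle concrete, here is a self-contained C++ sketch for a toy 1D Poisson problem $-u'' = f$ with homogeneous Dirichlet boundaries, using a damped-Jacobi smoother, injection at the shared points for the restriction, and linear interpolation for the prolongation. The model problem, the smoother and the parameter choices are illustrative assumptions and are not the solver described in this thesis.

```cpp
#include <cstddef>
#include <vector>

// Toy 1D Poisson V-cycle for -u'' = f with homogeneous Dirichlet boundaries,
// discretized as (2u_i - u_{i-1} - u_{i+1}) / h^2 = f_i on a grid of
// 2^k + 1 points (boundaries included).

// One damped-Jacobi smoothing sweep (damping factor 0.5).
static void smooth(std::vector<double>& u, const std::vector<double>& f,
                   double h, int sweeps)
{
    const std::size_t n = u.size();
    std::vector<double> tmp(u);
    for (int s = 0; s < sweeps; ++s) {
        for (std::size_t i = 1; i + 1 < n; ++i)
            tmp[i] = u[i] + 0.25 * (u[i - 1] + u[i + 1] - 2.0 * u[i] + h * h * f[i]);
        u.swap(tmp);
    }
}

// Recursive V-cycle: pre-smooth, restrict the residual, solve the coarse
// residual equation (recursively), prolongate and add the correction,
// then post-smooth.
static void v_cycle(std::vector<double>& u, const std::vector<double>& f,
                    double h, int pre, int post)
{
    const std::size_t n = u.size();
    if (n <= 3) {                        // coarsest grid: one interior unknown
        if (n == 3) u[1] = 0.5 * h * h * f[1];
        return;
    }
    smooth(u, f, h, pre);                // pre-smoothing

    std::vector<double> r(n, 0.0);       // residual r = f - A u
    for (std::size_t i = 1; i + 1 < n; ++i)
        r[i] = f[i] - (2.0 * u[i] - u[i - 1] - u[i + 1]) / (h * h);

    const std::size_t nc = (n + 1) / 2;  // restriction: keep shared points
    std::vector<double> rc(nc, 0.0), ec(nc, 0.0);
    for (std::size_t i = 0; i < nc; ++i) rc[i] = r[2 * i];

    v_cycle(ec, rc, 2.0 * h, pre, post); // coarse-grid correction

    for (std::size_t i = 0; i < nc; ++i) // prolongation: shared points ...
        u[2 * i] += ec[i];
    for (std::size_t i = 1; i + 1 < n; i += 2)  // ... and interpolation in between
        u[i] += 0.5 * (ec[i / 2] + ec[i / 2 + 1]);

    smooth(u, f, h, post);               // post-smoothing
}
```

Repeated calls to v_cycle, combined with a residual-based stopping test such as the one discussed at the beginning of this appendix, reproduce the procedure described above.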

Bibliography

[1] Basic Linear Algebra Subprograms Technical Forum Standard. Int. J. of High Performance Computing Applications, 16(1), 2002.

[2] Boost.Thread Web page, 2007. http://www.boost.org/doc/libs/release/libs/thread/.

[3] Ada reference manual. http://www.ada-auth.org/standards/12rm/html/RM-TTL.html, 2012.

[4] OpenMP application program interface, version 4.0. http://www.openmp.org/mp-documents/OpenMP4.0.0.pdf, 2013.

[5] PARALUTION Web page. http://www.paralution.com, 2014.

[6] A. Agrawal, L. Djenidi, and R. A. Antonia. Investigation of flow around a pair of side-by-side square cylinders using the lattice Boltzmann method. Computers and Fluids, 35:1093–1107, 2006.

[7] E. Anderson, Z. Bai, C. Bischof, S. Blackford, J. Demmel, J. Dongarra, J. D. Croz, A. Greenbaum, S. Hammarling, A. McKenney, and D. Sorensen. LAPACK Users’ Guide. SIAM, 1999. Third edition.

[8] J. P. D. Angeli, A. M. P. Valli, N. C. J. Reis, and A. F. D. Souza. Finite difference simulations of the Navier-Stokes equations using parallel distributed computing. In 15th Symposium on Computer Architecture and High Performance Computing, SBAC-PAD’03, pages 149–156, Washington, DC, USA, 2003. IEEE Computer Society.

[9] U. M. Ascher and L. R. Petzold. Computer Methods for Ordinary Differential Equations and Differential-Algebraic Equations. Society for Industrial and Applied Mathematics, Philadelphia, PA, USA, 1st edition, 1998.

[10] M. Baboulin, J. Dongarra, and S. Tomov. Some issues in dense linear algebra for multicore and special purpose architectures. In 9th International Workshop on State-of-the-Art in Scientific and Parallel Computing, volume 6126-6127 of PARA’08. Springer-Verlag, 2008.

[11] M. Baboulin, S. Donfack, J. Dongarra, L. Grigori, A. Rémy, and S. Tomov. A class of communication-avoiding algorithms for solving general dense linear systems on CPU/GPU parallel machines. In International Conference on Computational Science (ICCS 2012), volume 9 of Procedia Computer Science, pages 17–26. Elsevier, 2012.

[12] G. K. Batchelor. An Introduction to Fluid Dynamics. Cambridge University Press, 2000.

[13] E. Bayraktar, O. Mierka, and S. Turek. Benchmark computations of 3D laminar flow around a cylinder with CFX, OpenFOAM and FeatFlow. International Journal of Computational Science and Engineering, 7(3):253–266, 2012.

[14] M. Braack and T. Richter. Solutions of 3D Navier-Stokes benchmark problems with adaptive finite elements. Computers & Fluids, 35(4):372–392, 2006.

[15] M. Brachet, D. I. Meiron, S. A. Orszag, B. G. Nickel, R. H. Morf, and U. Frisch. Small-scale structure of the Taylor-Green vortex. Journal of Fluid Mechanics, 130:411–452, 1983.

[16] A. Brandt. Multi-level adaptive solutions to boundary-value problems. Mathematics of Computation, 31(138):333–390, 1977.

[17] D. L. Brown, R. Cortez, and M. L. Minion. Accurate projection methods for the incompressible Navier-Stokes equations. Journal of Computational Physics, 168: 464–499, 2001.

[18] D. R. Butenhof. Programming with POSIX Threads. Addison-Wesley, 1997.

[19] C. Canuto, M. Hussaini, A. Quarteroni, and T. Zang. Spectral Methods, Fundamentals in Single Domains. Scientific Computing. Springer, 2006.

[20] A. J. Chorin. A numerical method for solving incompressible viscous flow problems. Journal of Computational Physics, 2:12–26, 1967.

[21] A. J. Chorin. Numerical solution of the Navier-Stokes equations. Mathematics of Computation, 22(104):745–762, 1968.

[22] J. Cohen and M. Garland. Novel architectures: Solving computational problems with GPU computing. Computing in Science & Engineering, 11(5):58–63, 2009.

[23] J. M. Cohen and M. J. Molemaker. A fast double precision CFD code using CUDA. In 21st International Conference on Parallel Computational Fluid Dynamics (ParCFD2009), 2009.

[24] G.-H. Cottet and P. Koumoutsakos. Vortex Methods: Theory and Practice. Cambridge University Press, 2000.

[25] V. Dolean, P. Jolivet, and F. Nataf. An introduction to domain decomposition methods: algorithms, theory and parallel implementation. Lecture notes, 2015.

[26] S. R. Eddy. Accelerated profile HMM searches. PLoS Computational Biology, 7 (10):e1002195, 2011.

[27] D. F. Elger, B. C. Williams, C. T. Crowe, and J. A. Roberson. Engineering Fluid Mechanics. Wiley, 10th edition, 2012.

[28] A. Ern and J.-L. Guermond. Theory and Practice of Finite Elements. Number 159 in Applied Mathematical Sciences. Springer, 2004.

[29] P. Estérie, M. Gaunard, J. Falcou, J.-T. Lapresté, and B. Rozoy. Boost.SIMD generic programming for portable SIMDization. In 21st International Conference on Parallel Architectures and Compilation Techniques, PACT ’12, pages 431–439, New York, NY, USA, 2012. ACM.

[30] M. Fatica and W.-K. Jeong. Accelerating MATLAB with CUDA. The High Performance Embedded Computing (HPEC) Workshop, 2007.

[31] J. H. Ferziger and M. Perić. Computational Methods for Fluid Dynamics. Springer, 3rd edition, 2002.

[32] Message Passing Interface Forum. MPI: A Message-Passing Interface Standard. Int. J. Supercomputer Applications and High Performance Computing, 1994.

[33] Y. Fraigneau. Principes de base des méthodes numériques utilisées dans le code SUNFLUIDH pour la simulation des écoulements incompressibles et à faible nombre de Mach. Tech. Rep. 2013-09, LIMSI, 2013.

[34] E. Gabriel, G. E. Fagg, G. Bosilca, T. Angskun, J. J. Dongarra, J. M. Squyres, V. Sahay, P. Kambadur, B. Barrett, A. Lumsdaine, R. H. Castain, D. J. Daniel, R. L. Graham, and T. S. Woodall. Open MPI: Goals, concept, and design of a next generation MPI implementation. In Proceedings, 11th European PVM/MPI Users’ Group Meeting, pages 97–104, 2004.

[35] G. J. Gassner and A. D. Beck. On the accuracy of high-order discretizations for underresolved turbulence simulations. Theoretical & Computational Fluid Dynamics, 27(3/4):221–237, 2013.

[36] B. Gera, K. S. Pavan, and R. K. Singh. CFD analysis of 2D unsteady flow around a square cylinder. International Journal of Applied Engineering Research, 1(3):602–610, 2010.

[37] V. Girault and P.-A. Raviart. Finite element approximation of the Navier-Stokes equations. Lecture Notes in Mathematics, Berlin Springer Verlag, 749, 1979.

[38] M. Griebel, T. Dornseifer, and T. Neunhoeffer. Numerical Simulation in Fluid Dynamics: A Practical Introduction. Society for Industrial and Applied Mathematics, 1998.

[39] W. Gropp, E. Lusk, and A. Skjellum. Using MPI: Portable Parallel Programming with the Message-Passing Interface. MIT Press, 1999.

[40] J.-L. Guermond. Remarques sur les méthodes de projection pour l’approximation des équations de Navier-Stokes. Numerische Mathematik, 67(4):465–473, 1994.

[41] M. D. Gunzburger. Finite Element Methods For Viscous Incompressible Flows: A Guide To Theory, Practice, And Algorithms. Academic Press, 1989.

[42] A. Hadjidimos. Successive overrelaxation (SOR) and related methods. Journal of Computational and Applied Mathematics, 123(1-2):177–199, 2000.

[43] G. Hager and G. Wellein. Introduction to High Performance Computing for Scientists and Engineers. CRC Press, 2011.

[44] E. Hairer, S. P. Nørsett, and G. Wanner. Solving ordinary differential equations I: Nonstiff problems. Springer Verlag, 2nd edition, 1993.

[45] G. J. Haltiner. Numerical Weather Prediction. John Wiley & Sons, 1971.

[46] F. H. Harlow and J. E. Welch. Numerical calculation of time-dependent viscous incompressible flow of fluid with free surface. Physics of Fluids, 8(12):2182–2189, 1965.

[47] J. S. Hesthaven and T. Warburton. Nodal Discontinuous Galerkin Methods: Algorithms, Analysis and Applications. Number 54 in Texts in Applied Mathematics. Springer, 2008.

[48] W. Hundsdorfer. Partially implicit BDF2 blends for convection dominated flows. SIAM Journal on Numerical Analysis, 38(6):1763–1783, 2001.

[49] J. C. R. Hunt, A. A. Wray, and P. Moin. Eddies, Streams, and Convergence Zones in Turbulent Flows. Technical Report CTR-S88, Center for Turbulence Research, 1988.

[50] A. K. M. F. Hussain. Coherent structures and turbulence. Journal of Fluid Mechanics, 173:303–356, 1986.

[51] Intel. Math Kernel Library (MKL), 2014. https://software.intel.com/en-us/intel-mkl.

[52] R. I. Issa. Solution of the implicitly discretised fluid flow equations by operator-splitting. Journal of Computational Physics, 62:40–65, 1986.

[53] A. Jameson. Computational aerodynamics for aircraft design. Science, 245:361–371, 1989.

[54] J. Douglas, Jr. Alternating direction methods for three space variables. Numerische Mathematik, 4(1):41–63, 1962.

[55] V. John. Higher order finite element methods and multigrid solvers in a benchmark problem for the 3D Navier-Stokes equations. International Journal for Numerical Methods in Fluids, 40(6):775–798, 2006.

[56] V. John. On the efficiency of linearization schemes and coupled multigrid methods in the simulation of a 3D flow around a cylinder. International Journal for Numerical Methods in Fluids, 50:845–862, 2006.

[57] W. Kahan. Gauss-Seidel methods of solving large systems of linear equations. PhD thesis, University of Toronto, Canada, 1958.

[58] J. Kim and P. Moin. Application of a fractional-step method to incompressible Navier-Stokes equations. Journal of Computational Physics, 59(2):308–323, 1985.

[59] Z. Kopal. Tables of supersonic flow around cones. Massachusetts Institute of Technology, 1947.

[60] I. M. Kozlov, K. V. Dobergo, and N. N. Gnesdilov. Application of RES methods for computation of hydrodynamic flows by an example of 2D flow past a circular cylinder for Re=5-200. International Journal of Heat and Mass Transfer, 54:887–893, 2011.

[61] J. H. Krüger. A GPU Framework for Interactive Simulation and Rendering of Fluid Effects. Dissertation, Technische Universität München, München, 2006.

[62] H. Lamb. Hydrodynamics. Cambridge University Press, 1895.

[63] HYPRE Reference Manual version 2.9.0b. Lawrence Livermore National Laboratory, 2012.

[64] R. J. LeVeque. Numerical Methods for Conservation Laws. Lectures in Mathematics. ETH Zurich. Birkhäuser, 2nd edition, 1992.

[65] S. Li and W. K. Liu. Meshfree and particle methods and their applications. Applied Mechanics Review, 55:1–34, 2002.

[66] P.-L. Lions. On the Schwarz alternating method. I. In R. Glowinski, G. H. Golub, G. A. Meurant, and J. Périaux, editors, First International Symposium on Domain Decomposition Methods for Partial Differential Equations, pages 1–42. SIAM, 1988.

[67] P. Lynch. The Emergence of Numerical Weather Prediction. Cambridge University Press, 2006.

[68] B. Massey. Mechanics of Fluids. Taylor & Francis, 8th edition, 2006.

[69] MPI: A Message-Passing Interface Standard Version 3.0. Message Passing Interface Forum, 2012.

[70] P. Micikevicius. 3D finite difference computation on GPUs using CUDA. In 2nd Workshop on General Purpose Processing on Graphics Processing Units, GPGPU-2, pages 79–84. ACM, 2009.

[71] K. Moreland. The ParaView tutorial, version 4.1. Technical Report SAND 2013-6883P, Sandia National Laboratories, 2013.

[72] C. A. d. Moura and C. S. Kubrusly. The Courant-Friedrichs-Lewy (CFL) Condition: 80 Years After Its Discovery. Birkhäuser Basel, 2012.

[73] J. Nickolls, I. Buck, M. Garland, and K. Skadron. Scalable Parallel Programming with CUDA. ACM Queue, 6(2):40–53, 2008.

[74] NAG Parallel Library Manual. The Numerical Algorithms Group (NAG), Oxford, United Kingdom, 2000.

[75] J. Oliger and A. Sundström. Theoretical and practical aspects of some initial boundary value problems in fluid dynamics. Appl. Math., 35(3):419–446, 1978.

[76] K. Ouzzine. Étude d’une classe de schémas numériques de haute précision et mise en œuvre sur le cas du tourbillon de Taylor-Green. Master’s thesis, Université Pierre & Marie Curie, 2013.

[77] R. L. Panton. Incompressible Flow. John Wiley & Sons, 4th edition, 2013.

[78] R. Peyret. Spectral Methods for Incompressible Viscous Flow. Springer Science & Business Media, 2002.

[79] S. B. Pope. Turbulent Flows. Cambridge University Press, 2011.

[80] A. Quarteroni and A. Valli. Domain Decomposition Methods for Partial Differential Equations. Oxford University Press, 1999.

[81] B. N. Rajani, A. Kandasamy, and S. Majumdar. Numerical simulation of laminar flow past a circular cylinder. Applied Mathematical Modelling, 33:1228–1247, 2009.

[82] R. Rannacher. Finite element methods for the incompressible Navier-Stokes equations. In G. P. Galdi, J. G. Heywood, and R. Rannacher, editors, Fundamental directions in mathematical fluid mechanics, pages 191–293. Birkhäuser, 2000.

[83] S. M. Richardson. Fluid Mechanics. New York : Hemisphere Pub. Corp., 1989.

[84] B. Riviere. Discontinuous Galerkin Methods For Solving Elliptic And Parabolic Equations: Theory and Implementation. Society for Industrial and Applied Mathematics, Philadelphia, PA, USA, 2008.

[85] Y. Saad. Iterative Methods for Sparse Linear Systems. Society for Industrial and Applied Mathematics, 2nd edition, 2003.

[86] M. Schäfer and S. Turek. Benchmark computations of laminar flow around a cylinder. In E. Hirschel, editor, Flow Simulation with High-Performance Computers II. DFG priority research program results 1993-1995, number 52, pages 547–566. Vieweg, 1996.

[87] H. A. Schwarz. Über einen Grenzübergang durch alternierendes Verfahren. Vierteljahrsschrift der Naturforschenden Gesellschaft in Zürich, 18:272–286, 1870.

[88] J. Stam. Stable fluids. In 26th Annual Conference on Computer Graphics and Interactive Techniques, SIGGRAPH’99, pages 121–128, New York, NY, USA, 1999. ACM Press/Addison-Wesley Publishing Co.

[89] J. Steinhoff and D. Underhill. Modification of the Euler equation for “vorticity confinement” - Application to the computation of interacting vortex rings. Physics of Fluids, 6(8):2738–2744, 1994.

[90] C. Taylor and P. Hood. A numerical solution of the Navier-Stokes equations using the finite element technique. Computers & Fluids, 1(1):73–100, 1973.

[91] J. C. Thibault and I. Senocak. CUDA implementation of a Navier-Stokes solver on multi-GPU desktop platforms for incompressible flows. In 47th AIAA Aerospace Sciences Meeting, 2009.

[92] L. H. Thomas. Elliptic problems in linear differential equations over a network. Technical report, Columbia University, 1949.

[93] S. Tomov, J. Dongarra, and M. Baboulin. Towards dense linear algebra for hybrid GPU accelerated manycore systems. Parallel Computing, 36(5/6):232–240, 2010.

[94] A. Toselli and O. Widlund. Domain Decomposition Methods - Algorithms and Theory. Springer, 2005.

[95] N. Trevett. OpenCL introduction. Khronos Group, 2013.

[96] D. J. Tritton. Physical fluid dynamics. Oxford University Press, 2nd edition, 1987.

[97] U. Trottenberg, C. W. Oosterlee, and A. Schüller. Multigrid. Academic Press, 2001.

[98] R. A. Usmani. Inversion of a tridiagonal Jacobi matrix. Linear Algebra and its Applications, 212-213:413–414, 1994.

[99] H. von Helmholtz. Über Integrale der hydrodynamischen Gleichungen, welche den Wirbelbewegungen entsprechen. Journal für die reine und angewandte Mathematik, 55:25–55, 1858.

[100] Y. Wang, M. Baboulin, J. Dongarra, J. Falcou, Y. Fraigneau, and O. Le Maître. A parallel solver for incompressible fluid flows. Procedia Computer Science, 18:439–448, 2013.

[101] Y. Wang, M. Baboulin, K. Rupp, O. Le Maître, and Y. Fraigneau. Solving 3D Incompressible Navier-Stokes Equations on Hybrid CPU/GPU Systems. In Proceedings of the High Performance Computing Symposium, HPC’14, pages 12:1–12:8, San Diego, CA, USA, 2014. Society for Computer Simulation International.

[102] J. F. Wendt, editor. Computational Fluid Dynamics: An Introduction. Springer, 3rd edition, 2010.

[103] P. Wesseling. An introduction to Multigrid methods. John Wiley & Sons, 1992.

[104] F. M. White. Fluid Mechanics. McGraw Hill, 4th edition, 1999.

[105] D. Young. Iterative methods for solving partial differential equations of elliptic type. Transactions of the American Mathematical Society, 76(1):92–111, 1954.

[106] D. F. Young, B. R. Munson, T. H. Okiishi, and W. W. Huebsch. A brief introduction to fluid mechanics. John Wiley & Sons, 5th edition, 2010.

[107] F. Zhang. The Schur Complement and its Applications. Springer, 2006.