University of Lille ∼ End-Of-Studies Internship Report LARGE EDDY

University of Lille Faculty of Sciences and Technologies Department of Mathematics ∼ Master’s degree in High Performance Computing, Simulation Academic year : 2019-2020

End-of-studies Internship Report

LARGE EDDY SIMULATIONS OF UNSTEADY FLOWS WITH HIGH ORDER NUMERICAL METHODS

Author : Maxime Jonval Supervisor : Marc Montagnac, PhD

Abstract

Computational fluid dynamics (CFD) is widely used in aerodynamics, and many solvers have historically been implemented in Fortran. However, new programming paradigms are attractive for building next-generation solvers. Onera, DLR and Airbus propose a solver, called CODA, written in C++17, which exploits these new possibilities. During this internship, a low-storage Runge-Kutta method was implemented in CODA, and a comparison between the CERFACS solver JAGUAR and CODA was carried out.

Summary

In industry, notably in aviation, simulation problems are solved using CFD (Computational Fluid Dynamics). Programs have historically been written in Fortran, but the recent development of new programming paradigms is pushing some industrial actors to consider creating new programs that exploit these new possibilities. Onera, DLR and Airbus are working on a program, named CODA, that uses these new features and is written in C++17. During this internship, a low-storage Runge-Kutta method was implemented in CODA, and performance comparisons with the CERFACS program JAGUAR were carried out.

Acknowledgements

First of all, thank you to Aurore Smets and to the two heads of the master's programme, Nouredine Melab and Benoît Merlet, for making this internship possible and for keeping it going during the lockdown period.

I thank Marc Montagnac for his supervision, notably during the lockdown period. I also thank Jean-François Boussuge for his follow-up.

Thank you to the whole CSG IT team at CERFACS, and more particularly to Isabelle d'Ast, for the help in installing the various software packages.

I would also like to thank all the interns and PhD students at CERFACS with whom I was able to chat, whether during a break, over a meal or over a drink. Thank you to my office mate Baptiste Vinot for the discussions and the arrow-word puzzles after lunch.

I thank my friends from my bachelor's degree, notably for the evenings on Discord during the lockdown. A big thank you to my friends from the CHPS master's programme: all those discussions on Discord and the coffee breaks made the lockdown more pleasant.

Finally, I thank my partner, Claire, for the support she gave me throughout this internship.

Contents

1 Introduction
  1.1 Presentation of CERFACS
    1.1.1 CFD Team
    1.1.2 Computing resources
  1.2 Context and objective of the internship
  1.3 High order methods
  1.4 Presentation of the solvers
    1.4.1 JAGUAR (proJect of an Aerodynamic solver using General Unstructured grids And high ordeR schemes)
    1.4.2 CODA (CFD for Onera, Airbus and DLR)

2 Governing equations
  2.1 Navier-Stokes equations
  2.2 Behavior laws
    2.2.1 Law for the shear stress tensor
    2.2.2 Law of the viscosity
    2.2.3 Law for the heat flux
  2.3 Closure of the Navier-Stokes equations
    2.3.1 Perfect gas model
    2.3.2 The Navier-Stokes system of equations

3 Discretization methods
  3.1 Spectral Difference method
    3.1.1 Algorithm description
    3.1.2 Position of solution points and location of intern flux points
    3.1.3 Polynomials
  3.2 Discontinuous Galerkin method
    3.2.1 Polynomial basis
    3.2.2 Runge-Kutta DG
  3.3 Implemented features
    3.3.1 Time integration method
    3.3.2 Sponge zone

4 Numerical results
  4.1 Validation of the time integration method
  4.2 Presentation of the COVO test case
    4.2.1 Test case in JAGUAR
    4.2.2 Test case in CODA
  4.3 Performance comparisons between JAGUAR and CODA
    4.3.1 Accuracy results
    4.3.2 Execution time comparison
    4.3.3 Sponge zone validation

5 Conclusion

List of Figures

1.1 Difference between RANS, LES and DNS. Image from [7].
1.2 FlowSimulator architecture.
1.3 Modular structure of CODA.

3.1 Step 0: Solution points ( ), second-order interpolation polynomial and solution values on the segment.
3.2 Step 1: extrapolation of quantities at flux points ( ).
3.3 Step 2: non linear transformation to build the flux F ( for the current cell and for neighbouring cells).
3.4 Step 3: computation of a unique flux at segment end points with an approximated Riemann solver.
3.5 Step 4: building a new polynomial from flux points (at degree p + 1).
3.6 Step 5: differentiation of the flux polynomial at solution points.
3.7 Example of the distribution of solution ( ) and flux ( ) points in 1D for a third-order accurate polynomial.

4.1 Presentation of the Onera M6 Wing validation case and the associated mesh.
4.2 Relative convergence of the residuals of second order RK schemes.
4.3 Relative convergence of the residuals of fourth order RK schemes.
4.4 X-density component comparison with exact solution after 50 periods. Order 3 methods at the left and Order 4 ones at the right.
4.5 X-density component comparison with exact solution after 50 periods.
4.6 Vortex shift for order 3 methods.
4.7 Ratio TCODA/TJAGUAR with different options of compilation for order 3 (left) and 4 (right).
4.8 Ratio TCODA/TJAGUAR with different options of compilation and parallel programming models for order 3 (left) and 4 (right).
4.9 CODA performances considering various number of iterations (left) and speedup for both CODA and JAGUAR (right).
4.10 NACA 0012 Mesh.
4.11 Sponge region.
4.12 Pressure variation of NACA 0012 test case in JAGUAR.

University of Lille CERFACS

Chapter 1 Introduction

1.1 Presentation of CERFACS

CERFACS ("Centre Européen de Recherche et de Formation Avancée en Calcul Scientifique"), created in 1987, is a basic and applied research center specialized in modeling and numerical simulation. Through its facilities and expertise in High Performance Computing, CERFACS deals with major scientific and technical research issues of public and industrial interest. CERFACS is involved in major national and international projects and interacts strongly with its seven shareholders: Airbus Group, CNES, EDF, Météo-France, Onera, Safran and Total.

1.1.1 CFD Team

The Computational Fluid Dynamics (CFD) team is the largest at CERFACS: it represents approximately half of CERFACS, both in human resources and in financial support. The CFD team is organised around three main components: combustion, turbomachinery and applied aerodynamics. The objective of the CFD group at CERFACS is to solve problems involving both CFD and High Performance Computing (HPC). Despite the recent progress observed in CFD, the solution of many flows of interest is still beyond present capabilities, and the challenge of HPC for CFD remains as open and difficult as it ever was. In most CFD problems, brute-force approaches still fail, and advances in this field rely on defining proper compromises between physics and numerics. This is especially true in the fields of CFD chosen at CERFACS: aerodynamics, turbulence, combustion, unsteady flows, and coupled phenomena between fluid mechanics and other mechanisms (fluid-structure interaction, optimisation, two-phase flows, radiation, etc.). In recent years, the requests of CERFACS partners as well as the general orientation of the CFD community have led the CFD project to a deeper involvement in Direct Numerical Simulation tools, especially for reacting flows or flows in complex geometries, as well as to the development of new aspects of CFD such as multiphysics or active control. This has been done through an increase of the CFD staff, so that the classical expertise of the CFD team (aerodynamics, turbulence modeling, optimisation and parallelization, combustion) has been maintained or even reinforced. An important new field of application for CERFACS is Large Eddy Simulation (LES). The role of the CFD team and of its partners in the development of LES is now significant through multiple collaborations, contracts and dissemination of information and tools. The LES approach has emerged as a promising technique for problems associated with time-dependent phenomena and coherent eddy structures. This leading-edge CFD technology can nowadays be applied to geometries of reasonable complexity (such as combustion chambers in gas turbines but also in piston engines), which is the result of both constantly increasing computer capacities and improved underlying numerical methods and grid techniques.

1.1.2 Computing resources

Two supercomputers are available at CERFACS as internal resources (CERFACS researchers also have access to some external resources). The one used for the experiments is named Kraken. Its peak computing power is 498 Tflop/s.

Compute partition

This cluster includes 185 compute nodes, each of them with:

• 2 Intel Skylake (Xeon Gold 6140) processors, 18 cores each, 2.3 GHz

• 96 GB DDR4 RAM 2677 MHz

• Private L1 caches of 32 KB and L2 caches of 1 MB

• Shared L3 cache of 24 MB per socket

It also includes 2 compute nodes, each of them with two AMD Rome processors (64 cores at 2 GHz) and 256 GB of memory. The nodes used during the internship are the Skylake ones.

Pre/Post processing partition

Kraken includes a visualization support composed of 5 nodes with 288 GB of memory and Nvidia Tesla M60 cards. The NICE EngineFrame environment provides remote display to internal and external users.

1.2 Context and objective of the internship

In 2017, Airbus signed a partnership with Onera (French Aerospace Research Center) and DLR (German Aerospace Center) [1]. The aim is the development of new common Computational Fluid Dynamics capabilities for flow prediction. As a shareholder of CERFACS, Airbus gives CERFACS access to this platform. In this context, the goal of this internship was to compare the performance of this platform, called CODA, with that of the CERFACS solver JAGUAR. The performance criteria studied are accuracy and execution time. The primary objective was to perform a jet aeroacoustics simulation with both solvers and to compare their respective performance.


1.3 High order methods

In the context of High Performance Computing, one needs to limit the stencil size, that is, the number of neighbouring points needed to compute the solution at a given point. One way to do so is to increase the number of degrees of freedom inside mesh elements. It is convenient to define high-order representations of quantities inside each element using a polynomial approximation. The reconstructed variables at the faces depend on the mesh element only, and the two different extrapolations (one from the left side and one from the right side) may lead to a discontinuous flow at mesh faces. Many techniques exist in the literature; the three main methods are:

• The Discontinuous Galerkin (DG) technique: based on the Finite Element framework. The principle is to look for a polynomial representation of the solution that satisfies a variational form of the governing system within each element. The DG method was first introduced by Reed and Hill in 1973 [22] for the analysis of neutron transport problems.

• The Spectral Volume (SV) technique: based on the Finite Volume (FV) method. It follows the pioneering work of Wang [27]. It consists in defining element subdivisions on which a classical FV technique is considered. The mean quantity over each volume is necessary to build the high-order representation of data inside the element.

• The Spectral Difference (SD) technique: it follows the Finite Difference (FD) approach. Kopriva and Kolias published it in 1996 [10], and Liu, Vinokur and Wang published a more general presentation of the technique in 2006 [14]. The idea is to define a high-order approximation of the quantities from FD inside each mesh cell. In this method, one solves the strong formulation of the equations.

All techniques define a high-order continuous solution inside each mesh element, and since the reconstruction leads to two different quantities at a mesh interface, a Riemann solver is necessary to compute the flux exchanged between cells. The interest of all these methods also comes from the possibility of controlling both the space refinement parameter h and the polynomial degree p.

1.4 Presentation of the solvers

Over the last decades, numerical simulation has become widespread for the design of aircraft and turbomachines. This has been made possible by the design and development of numerical software dedicated to Computational Fluid Dynamics (CFD). This software can be divided into three main branches, according to the kind of modeling it accounts for, and in particular according to the assumptions made for turbulence modeling. On one side, industrial CFD codes generally solve for the mean influence of the turbulence, resolving the so-called Reynolds Averaged Navier-Stokes (RANS) equations. The second kind of software is dedicated to Direct Numerical Simulation (DNS). In this case, all turbulence scales are computed. This approach is designed for academic computations on simple configurations. The last technique is called Large Eddy Simulation (LES). As suggested by its name, LES consists in computing all the large turbulence scales that can be captured with an appropriate choice of numerical schemes and mesh refinement. The remaining small scales are modeled with a subgrid-scale model. The solvers presented below use the LES technique.


Figure 1.1: Difference between RANS, LES and DNS. Image from [7].

1.4.1 JAGUAR (proJect of an Aerodynamic solver using General Unstructured grids And high ordeR schemes)

The first solver, JAGUAR [21], is a new CFD code that is currently being developed at CERFACS. It is designed for the analysis of high-order schemes (convection and diffusion) on high-order meshes. JAGUAR is based on an unstructured grid technique. The method used is the Spectral Difference (SD) technique, which was chosen for several reasons:

• The SD method has been built in order to correct some drawbacks of DG and SV:

– It seems more efficient in terms of CPU usage than the DG technique,

– SV suffers from a high sensitivity with respect to element decomposition, and this drawback is avoided with the SD method.

• It is less mature and the potential for research work is greater.

The structure of JAGUAR has been optimized in order to get the best computing efficiency on both classical and GPGPU architectures. It is written in Fortran 90, and parallel computations can be performed with an MPI implementation using asynchronous communications. For massively parallel computations, an OpenMP paradigm is also implemented in order to limit the number of messages: a hybrid paradigm is considered, with MPI communications between nodes and OpenMP inside each node. The GPGPU implementation is based on the CUDA Fortran compiler.


1.4.2 CODA (CFD for Onera, Airbus and DLR)

The second solver, called CODA, is part of a collaboration between Onera, DLR and Airbus. It was originally named Flucs and was developed in the DLR project Digital-X (04/2012-06/2016) [11, 13]. The primary objective of Digital-X was the development and deployment of a flexible, parallel software platform for multidisciplinary analysis and optimization of aircraft and helicopters. For this purpose, Flucs was created with the aim of providing the basis for a consolidated flow solver using modern software techniques. Detailed specifications for Flucs were established based on an extensive survey of current and potential users of flow solvers. The focus was on a common framework for two typical discretizations: second-order finite volume and discontinuous Galerkin with variable order.

FlowSimulator software architecture

The highest control module of CODA is designed as a compatible Python API for the simulation environment FlowSimulator [17]. The FlowSimulator software is designed to be a platform for multi-disciplinary simulation on massively parallel architectures. The focus has been on coupled simulations of the different disciplines of aviation systems. The interface is realized as C++ classes wrapped to Python, providing a control layer for all the operations.

Figure 1.2: FlowSimulator architecture.

The FSDataManager (FSDM) makes it possible to handle the data from different tools using the same format, since all tools can read from and write to the FSDM directly. As a consequence, the data transfers between the tools are independent of the internal data representation used by the individual tools. The main interest of FlowSimulator is the possibility of integrating software into a common environment that has been designed to perform numerical simulations on massively parallel high-performance architectures.

Overview of the solver

Currently, CODA provides a first- or second-order Finite Volume discretization, a Discontinuous Galerkin discretization of order one or higher, and an implicit solver which uses Automatic Differentiation (AD). There is a large synergy between the two spatial approaches, leading to an abstract design that exploits their similarities. In CODA, a high level of abstraction is used to allow a high level of code reuse.


Figure 1.3: Modular structure of CODA.

From an HPC point of view, CODA provides a two-level parallelism: a node-to-node level based on a domain decomposition of the grid, and a multi-core level based on a subdomain decomposition of a node-level grid domain. The focus for the node-to-node level is on the GASPI standard, but MPI is available as a fallback option.


Chapter 2 Governing equations

This chapter is dedicated to the presentation of the equations involved in CFD, namely the so-called Navier-Stokes system of equations and its closure.

2.1 Navier-Stokes equations

In Computational Fluid Dynamics, one needs to solve the Navier-Stokes equations. These equations are derived (more details in [20]) from some physical assumptions and principles:

• the mass conservation,

• the momentum conservation,

• the energy conservation.

The variables involved are the fluid density $\rho$, the velocity $\vec{u}$ and the total energy $E$. Using the previous principles, one can write the Navier-Stokes equations in conservative form (without external force):

$$
\begin{cases}
\partial_t \rho + \nabla \cdot (\rho \vec{u}) = 0, \\
\partial_t (\rho \vec{u}) + \nabla \cdot (\rho \vec{u} \otimes \vec{u}) = \nabla \cdot \sigma, \\
\partial_t (\rho E) + \nabla \cdot (\rho \vec{u} E) = \nabla \cdot (\vec{u} \sigma) - \nabla \cdot q,
\end{cases}
\tag{2.1}
$$

with $\sigma = -pI + \tau$, where $p$ is the pressure, $\tau$ is the shear stress tensor and $q$ is the heat flux.

This system is open, with more unknown variables than equations. To close it, additional relations are necessary. The closure is done by providing behavior laws for $\tau$ and $q$, and the state law that links the intermediate variables $(\tau, q, p)$ with the main variables $(\rho, \vec{u}, E)$.


2.2 Behavior laws

2.2.1 Law for the shear stress tensor

The shear stress tensor $\tau$ depends by nature on the fluid viscosity. It represents the rate of change of deformation over time and depends on symmetric combinations of the velocity gradient. One can write it as

$$
\tau = \mu \left( \nabla \vec{u} + \nabla \vec{u}^{T} \right) + \xi \, (\nabla \cdot \vec{u}) \, I, \tag{2.2}
$$

where $\mu$ is the dynamic viscosity and $\xi$ is the second viscosity coefficient.

Eq. 2.2 is known as Newton's law for the viscosity. It can be written in a different form, using spherical and deviatoric contributions [6]:

$$
\tau = \mu \left( \nabla \vec{u} + \nabla \vec{u}^{T} - \frac{2}{3} (\nabla \cdot \vec{u}) \, I \right) + \eta \, (\nabla \cdot \vec{u}) \, I, \tag{2.3}
$$

where $\eta = \xi + \frac{2}{3} \mu$ is defined as a volume viscosity.

Stokes' hypothesis comes from thermodynamic assumptions at equilibrium. It states that the mechanical pressure is equal to the thermodynamic pressure:

$$
p_m := p + \eta \, \nabla \cdot \vec{u} = p \iff \eta = 0. \tag{2.4}
$$

With this hypothesis, Eq. 2.3 becomes:

$$
\tau = \mu \left( \nabla \vec{u} + \nabla \vec{u}^{T} - \frac{2}{3} (\nabla \cdot \vec{u}) \, I \right), \tag{2.5}
$$

known as the Newton-Stokes law. It makes it possible to compute the shear stress tensor from a given viscosity law.

2.2.2 Law of the viscosity

In general, the viscosity depends on the temperature. For aircraft or turbomachinery flows, the classical relation used to define $\mu$ is Sutherland's law:

$$
\mu(T) = \mu_{ref} \left( \frac{T}{T_{ref}} \right)^{1.5} \frac{T_{ref} + 110.4}{T + 110.4}, \tag{2.6}
$$

where $T_{ref} = 273.15$ K and $\mu_{ref} = 1.711 \times 10^{-5}$ kg.m$^{-1}$.s$^{-1}$.

Eq. 2.6 is a good approximation of the viscosity of air at non-extreme temperatures (below 1500 K).
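As an illustration, Eq. 2.6 can be evaluated with a few lines of Python. This is only a minimal sketch: the function and constant names (`sutherland_viscosity`, `T_REF`, `MU_REF`, `S`) are hypothetical and do not come from JAGUAR or CODA; the numerical values are those given with Eq. 2.6.

```python
# Constants of Eq. 2.6 (values taken from the text above).
T_REF = 273.15     # reference temperature, in K
MU_REF = 1.711e-5  # reference dynamic viscosity, in kg.m^-1.s^-1
S = 110.4          # Sutherland constant appearing in Eq. 2.6, in K


def sutherland_viscosity(T):
    """Dynamic viscosity of air at temperature T (in K), following Eq. 2.6."""
    return MU_REF * (T / T_REF) ** 1.5 * (T_REF + S) / (T + S)


if __name__ == "__main__":
    for T in (250.0, 273.15, 300.0, 1000.0):
        print(f"T = {T:8.2f} K  ->  mu = {sutherland_viscosity(T):.4e} kg/(m.s)")
```

By construction, the function returns the reference viscosity at the reference temperature, and the viscosity increases with temperature over the range where the law is considered valid.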


2.2.3 Law for the heat flux

The heat flux $q$ is the energy flux density transported by thermal conduction. In the case of a low temperature gradient, $q$ can be expanded in powers of this gradient. Keeping only the first-order term [12] gives Fourier's law:

$$
q = -\lambda \nabla T, \tag{2.7}
$$

with $T$ the temperature and $\lambda$ the thermal conductivity, which is related to $\mu$ by

$$
\lambda = \frac{C_p \mu}{Pr}, \tag{2.8}
$$

where $C_p$ is the heat capacity at constant pressure and $Pr$ is the Prandtl number. For a fixed reference length, $Pr$ represents the ratio of the thermal diffusion time to the dynamic (viscous) diffusion time.

2.3 Closure of the Navier-Stokes equations

2.3.1 Perfect gas model

The kinetic theory of perfect gases gives the state law for a perfect monatomic gas as

$$
p = n k T, \tag{2.9}
$$

where $k = 1.38064852 \times 10^{-23}$ m$^2$.kg.s$^{-2}$.K$^{-1}$ is Boltzmann's constant and $n$ is the number of molecules per unit volume.

Let $M$ be the molar mass. Using Avogadro's number $N_A$ ($= 6.02214076 \times 10^{23}$ mol$^{-1}$), one can write the density as

$$
\rho = \frac{n M}{N_A}. \tag{2.10}
$$

One defines the universal perfect gas constant as $\mathcal{R} = k N_A$ and the perfect gas constant for the considered gas as $R = \mathcal{R} / M$ ($R = 287$ J.kg$^{-1}$.K$^{-1}$ for air). Hence a perfect gas is characterized by

$$
p = \rho R T. \tag{2.11}
$$

Other important quantities to consider in thermodynamics are the specific enthalpy

$$
h = e + \frac{p}{\rho}, \tag{2.12}
$$

where $e$ is the internal energy per unit mass, and the heat capacities at constant pressure and at constant volume, respectively:

$$
C_p = \left( \frac{\partial h}{\partial T} \right)_p \quad \text{and} \quad C_v = \left( \frac{\partial e}{\partial T} \right)_V. \tag{2.13}
$$

One can prove that, for a perfect gas, the enthalpy and the internal energy are functions of the temperature only, leading to

$$
e = C_v T \quad \text{and} \quad h = C_p T. \tag{2.14}
$$


For transonic flows around civil aircraft and for turbomachinery, one assumes air to be a perfect gas. This means that air follows Eq. 2.11 and also that it is a perfect polytropic gas. One can define the polytropic coefficient for this gas as

$$
\gamma = \frac{C_p}{C_v}. \tag{2.15}
$$

Finally, using Eq. 2.14, one can rewrite the total energy per unit mass as

$$
E = C_v T + \frac{\| \vec{u} \|^2}{2}, \tag{2.16}
$$

with $\| \cdot \|$ the Euclidean norm.
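The closure relations above can be chained to recover the temperature and the pressure from the conservative variables. The sketch below is illustrative only and rests on two assumptions that are not stated in the text: the value $\gamma = 1.4$ for air, and the relation $C_v = R / (\gamma - 1)$, which follows from inserting Eq. 2.11 and Eq. 2.14 into Eq. 2.12 (giving $C_p - C_v = R$) together with Eq. 2.15. The function name `primitive_from_conservative` is hypothetical.

```python
import numpy as np

GAMMA = 1.4                  # ASSUMPTION: polytropic coefficient of air (not given in the text)
R_AIR = 287.0                # perfect gas constant for air, J.kg^-1.K^-1
C_V = R_AIR / (GAMMA - 1.0)  # from C_p - C_v = R and gamma = C_p / C_v


def primitive_from_conservative(rho, rho_u, rho_E):
    """Return (u, T, p) from the conservative variables (rho, rho*u, rho*E)."""
    u = np.asarray(rho_u, dtype=float) / rho
    E = rho_E / rho
    T = (E - 0.5 * np.dot(u, u)) / C_V  # Eq. 2.16 solved for T
    p = rho * R_AIR * T                 # Eq. 2.11
    return u, T, p


if __name__ == "__main__":
    rho, T0 = 1.2, 300.0
    u0 = np.array([100.0, 0.0, 0.0])
    rho_E = rho * (C_V * T0 + 0.5 * np.dot(u0, u0))
    u, T, p = primitive_from_conservative(rho, rho * u0, rho_E)
    print("u =", u, " T =", T, " p =", p)
```

With these definitions, the pressure can equivalently be written $p = (\gamma - 1)\left(\rho E - \tfrac{1}{2} \rho \|\vec{u}\|^2\right)$, which is the form usually evaluated in compressible flow solvers.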

2.3.2 The Navier-Stokes system of equations

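The Navier-Stokes system of equations has now been closed, and one finally obtains:

$$
\begin{cases}
\partial_t \rho + \nabla \cdot (\rho \vec{u}) = 0, \\
\partial_t (\rho \vec{u}) + \nabla \cdot (\rho \vec{u} \otimes \vec{u}) + \nabla p - \nabla \cdot \tau = 0, \\
\partial_t (\rho E) + \nabla \cdot (\vec{u} (\rho E + p)) - \nabla \cdot (\vec{u} \tau + \lambda \nabla T) = 0,
\end{cases}
\tag{2.17}
$$

where $\tau = \mu \left( \nabla \vec{u} + \nabla \vec{u}^{T} \right) - \frac{2 \mu}{3} I \, \nabla \cdot \vec{u}$ and $\lambda = \frac{C_p \mu}{Pr}$.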

In order to characterize the flow, two dimensionless numbers are used:

• the Mach number $M = \frac{\| \vec{u} \|}{c}$, with $c$ the speed of sound,

• the Reynolds number $Re = \frac{\rho \| \vec{u} \| L}{\mu}$, where $L$ is a characteristic length.

The Mach number compares the flow velocity with the speed of sound, and the Reynolds number measures the ratio of the inertial forces to the viscous forces.
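Both numbers are straightforward to evaluate. The short sketch below is illustrative; it assumes the perfect gas expression $c = \sqrt{\gamma R T}$ for the speed of sound and the value $\gamma = 1.4$ for air, neither of which is stated in the text. The function names are hypothetical.

```python
import math


def speed_of_sound(T, gamma=1.4, R=287.0):
    """Speed of sound of a perfect gas, c = sqrt(gamma * R * T) (ASSUMED relation)."""
    return math.sqrt(gamma * R * T)


def mach_number(u_norm, T, gamma=1.4, R=287.0):
    """Mach number M = ||u|| / c."""
    return u_norm / speed_of_sound(T, gamma, R)


def reynolds_number(rho, u_norm, L, mu):
    """Reynolds number Re = rho * ||u|| * L / mu."""
    return rho * u_norm * L / mu


if __name__ == "__main__":
    # Hypothetical flow conditions, chosen only for illustration.
    print("M  =", mach_number(100.0, 300.0))
    print("Re =", reynolds_number(1.2, 100.0, 1.0, 1.8e-5))
```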


Chapter 3 Discretization methods

This chapter is dedicated to the presentation of the numerical methods involved in the solvers. The spectral difference method is implemented in JAGUAR, whereas the discontinuous Galerkin one is used in CODA. A brief presentation of those techniques is given in order to convey the overall idea. Afterward, a description of the features implemented in CODA during the internship is given. In each case, references are provided for more details.

3.1 Spectral Difference method

In order to simplify the description, one considers a 1D implementation of the SD technique. Let the interval [0, 1] be a mesh cell with $N_{SP}$ Degrees of Freedom (DoF). For simplicity, one calls them Solution Points (SP). In the classical FV method, $N_{SP} = 1$ and the flux balance has to be calculated at the border points using the averaged value $\bar{Q}$ of the conserved variable $Q$. The SD method proposes to replace this average value by a more accurate expression of $Q$ that takes into account the $N_{SP} > 1$ degrees of freedom inside the element. As a consequence, the first question concerns the location of the DoF. The border values $Q(0)$ and $Q(1)$ are obtained by extrapolating a polynomial of degree $p$ constructed on the $N_{SP}$ solution points. The number $p > 0$ is called the local method order, and it can vary from one cell to another inside a single mesh. In order to be consistent with the polynomial order, the following equality needs to be satisfied for each cell:

$$
N_{SP} = p + 1. \tag{3.1}
$$

At the same time, all solution points must interact inside the volume [0, 1] as well as with the adjacent volumes [a, 0] and [1, b]. Hence, the variable $Q$ at the solution points must be updated in time using the divergence of the flux density function $F$, represented by a polynomial $dF$ of degree $p$. This polynomial $dF$ is unknown and can be found by differentiating a polynomial of degree $p + 1$ (corresponding to $F$) built using $N_{FP}$ Flux Points ($N_{FP} > N_{SP}$). These points are usually located inside the volume as well as at its interfaces. The flux points situated at the interfaces of the volume communicate with other points, and thus information propagates through the whole domain. The total number of flux points also depends on the required polynomial order $p$, and in practice one chooses

$$
N_{FP} = p + 2. \tag{3.2}
$$


3.1.1 Algorithm description

To explain the principle of the SD method in a 1D configuration, one considers the following 1D unsteady Euler equation:

$$
\frac{\partial Q}{\partial t} + \frac{\partial F(Q)}{\partial x} = 0, \tag{3.3}
$$

and a mesh composed of several 1D segments.

As suggested by its name, the Spectral Difference method can be seen as a new formulation that follows some assumptions of the Finite Difference approach. In particular, two key points are preserved:

1. Equations are solved locally in their local strong conservation form. In particular, there is no integral transformation, unlike in the Finite Volume or Finite Element techniques.

2. The divergence term is explicitly computed: one builds the F term and then differentiates it.

The principle of the SD approach is to assume that the vector of unknowns $Q$ varies as a polynomial of predefined degree inside each segment. In the following, and in order to simplify explanations, one assumes that the number of unknowns $Q$ is $p + 1$, meaning there are $p + 1$ solution point locations inside the segment. From these $p + 1$ values, an interpolation polynomial of degree $p$ is built to represent the variations of the quantities inside each mesh segment. As for a Finite Difference approach, the solution is computed at every solution point $i$:

$$
\frac{\partial Q_i}{\partial t} + \left. \frac{\partial F}{\partial x} \right|_i = 0. \tag{3.4}
$$

Following a time-marching process, the evolution of the solution is known once the derivative of the flux density has been computed at each solution point.

Figure 3.1: Step 0: Solution points ( ), second-order interpolation polynomial and solution values on the segment.

Solution points, field values and the interpolation polynomial are shown in Fig. 3.1. One can remark that the SD procedure does not assume that polynomials are continuous at the segment ends; this is represented by the two lines on both sides of the figure.


Figure 3.2: Step 1: extrapolation of quantities at flux points ( ).

Considering the initial Euler equation (Eq. 3.3), keeping a polynomial of degree p for the quantities Q is only possible if F is represented by a polynomial of degree p + 1: its derivative is then a polynomial of degree p. One defines p + 2 flux points in a staggered way: the two end points plus one flux point between each pair of contiguous solution points. One uses the polynomial description of the data to extrapolate them to all flux points (Fig. 3.2).

Figure 3.3: Step 2: non linear transformation to build the flux F ( for the current cell and for neighbouring cells).

The flux F is obtained as a non linear combination of the quantities Q at the flux points (Fig. 3.3). The segment end points are shared by two segments and since quantities are discontinuous at the segment end points, the flux is not continuous.

Figure 3.4: Step 3: computation of a unique flux at segment end points with an approximated Riemann solver


Since the solution is discontinuous at the end points, the interface flux is computed with an approximate Riemann solver, so that the flux is uniquely defined at each interface. Conservation is thereby guaranteed.

Figure 3.5: Step 4: building a new polynomial from flux points (at degree p + 1).

From the flux density, one can build an interpolation polynomial of degree p + 1 (Fig. 3.5). The resulting flux function is globally continuous, but it is differentiable only inside each segment and not at the boundaries.

Figure 3.6: Step 5: differentiation of the flux polynomial at solution points.

The last step consists in differentiating the polynomial to compute the increment at each solution point (Fig. 3.6).

3.1.2 Position of solution points and location of internal flux points

In the mesh, segment sizes can differ, which makes it difficult to place the solution and flux points directly on each segment. An isoparametric transformation is therefore applied to map each segment of the physical space onto the segment [0, 1], defined as the computational domain. Once all segments are transformed, the location of the solution and flux points is easily defined on the isoparametric segment. The solution points are chosen to be the following Gauss points:

$$
X_s = \frac{1}{2} \left[ 1 - \cos\left( \frac{2s - 1}{2N} \pi \right) \right] \quad \text{with } s \in \{1, \ldots, N\}. \tag{3.5}
$$

There is no consensus in the literature regarding the position of the flux points. In JAGUAR, two different kinds of points are available. In the following, flux points are labelled with a non-integer index: the flux point numbered $s + 1/2$ is located between the solution points of indices $s$ and $s + 1$. A first choice follows the location of the Gauss-Lobatto points:

$$
X_{s + \frac{1}{2}} = \frac{1}{2} \left[ 1 - \cos\left( \frac{s}{N} \pi \right) \right] \quad \text{with } s \in \{0, 1, \ldots, N\}. \tag{3.6}
$$

Another choice follows the Legendre roots. Those flux points are the roots of Legendre polynomials, a sequence of polynomials of increasing degree defined on [−1, 1] by the following ordinary differential equation:

$$
\frac{d}{dx} \left[ (1 - x^2) \frac{d}{dx} P_n(x) \right] + n(n + 1) P_n(x) = 0, \quad P_n(1) = 1. \tag{3.7}
$$

The flux point distribution has an important influence on the stability and accuracy of the SD schemes, whereas the solution point distribution has very little influence on their properties. The study in [26] shows that, for an order of accuracy higher than two, the scheme based on the Gauss-Lobatto points becomes unstable, whereas the Legendre-Gauss quadrature points yield a stable scheme. Therefore, the Legendre flux points are used in JAGUAR.

Figure 3.7: Example of the distribution of solution ( ) and flux ( ) points in 1D for a third-order accurate polynomial

In Fig. 3.7 the Legendre flux point locations are described for an order of 3 (i.e. p = 3). Five flux points and four solution points are represented.
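The point distributions of this section can be generated numerically. The following Python sketch is an illustration only and is not taken from JAGUAR: the function names are hypothetical, and the Legendre variant assumes that the interior flux points are the $p$ roots of the Legendre polynomial of degree $p$, mapped from [−1, 1] to [0, 1] and completed with the two end points. The roots are obtained with NumPy's `numpy.polynomial.legendre.leggauss`.

```python
import numpy as np


def solution_points(p):
    """Gauss points of Eq. 3.5 on [0, 1], with N = p + 1 points."""
    N = p + 1
    s = np.arange(1, N + 1)
    return 0.5 * (1.0 - np.cos((2 * s - 1) * np.pi / (2 * N)))


def flux_points_gauss_lobatto(p):
    """Gauss-Lobatto flux points of Eq. 3.6 on [0, 1], with N + 1 = p + 2 points."""
    N = p + 1
    s = np.arange(0, N + 1)
    return 0.5 * (1.0 - np.cos(s * np.pi / N))


def flux_points_legendre(p):
    """End points plus the p roots of the degree-p Legendre polynomial (ASSUMED layout)."""
    roots, _weights = np.polynomial.legendre.leggauss(p)  # roots on [-1, 1], ascending
    return np.concatenate(([0.0], 0.5 * (roots + 1.0), [1.0]))


if __name__ == "__main__":
    p = 3
    print("solution points       :", solution_points(p))
    print("Gauss-Lobatto flux pts:", flux_points_gauss_lobatto(p))
    print("Legendre flux pts     :", flux_points_legendre(p))
```

For p = 3, this gives four solution points and five flux points, as in Fig. 3.7, with each solution point lying between two consecutive flux points.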

3.1.3 Polynomials

To calculate the ﬂux balance between the cells, the solution values on the cell interface are necessary, i.e. in {0} and in {1} for the interval [0, 1]. These values can be extrapolated using the p degree Lagrange polynomial built on the Gauss points and on the Lagrange basis deﬁned by:

$$h_i(X) = \prod_{s=1,\,s\neq i}^{p+1} \frac{X - X_s}{X_i - X_s}. \tag{3.8}$$

The polynomial of degree p + 1 (based on the flux points) is also built with a Lagrange basis, which is the following:

$$l_i(X) = \prod_{s=0,\,s\neq i}^{p+1} \frac{X - X_{s+\frac{1}{2}}}{X_{i+\frac{1}{2}} - X_{s+\frac{1}{2}}}, \tag{3.9}$$

where $X_{i+\frac{1}{2}}$ represents a flux point located between the solution points $X_i$ and $X_{i+1}$.
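The extrapolation of the solution to the cell interfaces with the Lagrange basis of Eq. 3.8 can be sketched as follows. This is a minimal illustration in plain Python, not the JAGUAR implementation; the helper names and the sampled field are hypothetical.

```python
import math


def solution_points(n):
    """Solution points of Eq. 3.5 on [0, 1]."""
    return [0.5 * (1.0 - math.cos((2 * s - 1) * math.pi / (2 * n)))
            for s in range(1, n + 1)]


def lagrange_basis(nodes, i, x):
    """h_i(x) of Eq. 3.8: product over s != i of (x - X_s) / (X_i - X_s)."""
    value = 1.0
    for s, x_s in enumerate(nodes):
        if s != i:
            value *= (x - x_s) / (nodes[i] - x_s)
    return value


def extrapolate(nodes, values, x):
    """Evaluate at x the Lagrange interpolant of the nodal values."""
    return sum(q * lagrange_basis(nodes, i, x) for i, q in enumerate(values))


if __name__ == "__main__":
    p = 3
    nodes = solution_points(p + 1)
    # Hypothetical smooth field sampled at the solution points
    values = [math.sin(x) for x in nodes]
    print("left interface: ", extrapolate(nodes, values, 0.0))
    print("right interface:", extrapolate(nodes, values, 1.0))
```

Since the interpolant has degree p, any polynomial field of degree at most p is recovered exactly at X = 0 and X = 1, up to round-off.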


3.2 Discontinuous Galerkin method

The discontinuous Galerkin (DG) method [30] is a finite element method which works with piecewise continuous functions. One considers again the 1D unsteady Euler equation

$$\frac{\partial Q}{\partial t} + \frac{\partial F(Q)}{\partial x} = 0, \tag{3.10}$$

and multiplies it by a polynomial test function $\psi_k$. An integration over the reference element [0, 1] gives

$$\int_0^1 \partial_t Q\,\psi_k + \int_0^1 \partial_x F\,\psi_k = 0. \tag{3.11}$$

The flux divergence term is then integrated by parts in space, thus yielding

$$\int_0^1 \partial_t Q\,\psi_k + \left[F\,\psi_k\right]_0^1 - \int_0^1 F\,\frac{d\psi_k}{dx} = 0. \tag{3.12}$$

The boundary term $[F\,\psi_k]_0^1$ of Eq. 3.12 is evaluated through the solution of a Riemann problem at the cell interfaces.

The solution of this equation is written as

$$Q_h(x, t) = \sum_{l=0}^{p} \hat{Q}_l(t)\,\psi_l(x), \tag{3.13}$$

where $\{\psi_k\}_{k=0}^{p}$ is a polynomial basis and the $\hat{Q}_k$ are the unknowns at the integration points.

3.2.1 Polynomial basis

To build the polynomial basis $\{\psi_k\}_{k=0}^{p}$, two choices are possible, namely the modal basis and the nodal one, leading to two different schemes. In CODA, the implemented method uses a nodal basis.

Modal basis A modal polynomial basis is a set of p + 1 polynomials of degree from 0 to p. For instance, the Legendre polynomials (Eq. 3.7) constitute a modal polynomial basis.

Nodal basis A nodal polynomial basis is a set of p + 1 polynomials of degree p built as in Eq. 3.8.


3.2.2 Runge-Kutta DG

Substituting $Q_h(x, t) = \sum_{l=0}^{p} \hat{Q}_l(t)\,\psi_l(x)$ into Eq. 3.12 yields the expression:

$$\sum_{l=0}^{p} \frac{d\hat{Q}_l}{dt} \int_0^1 \psi_l\,\psi_k\,dx + \left[F(Q_h)\,\psi_k\right]_0^1 - \int_0^1 F(Q_h)\,\frac{d\psi_k}{dx} = 0, \tag{3.14}$$

which represents a system of coupled ordinary differential equations in time. Since the polynomials $\psi_k$ are known analytically, the first integrals of Eq. 3.14 can be computed exactly, and only once since they are time independent. Numerical quadrature is used for the second integral.

As an example, one considers the fourth-order case, namely p = 3, with a modal basis made of the Legendre polynomials rescaled to the reference element [0, 1]. The four shifted Legendre polynomials are

$$\psi_0(x) = 1, \tag{3.15}$$

$$\psi_1(x) = 2x - 1, \tag{3.16}$$

$$\psi_2(x) = 6x^2 - 6x + 1, \tag{3.17}$$

$$\psi_3(x) = 20x^3 - 30x^2 + 12x - 1. \tag{3.18}$$

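In this case, one has $Q_h(x, t) = \sum_{l=0}^{3} \hat{Q}_l(t)\,\psi_l(x)$ and the corresponding system of coupled ordinary differential equations is

$$\frac{d\hat{Q}_0}{dt} + F(Q_h(1, t)) - F(Q_h(0, t)) = 0, \tag{3.19}$$

$$\frac{1}{3}\frac{d\hat{Q}_1}{dt} + \psi_1(1)\,F(Q_h(1, t)) - \psi_1(0)\,F(Q_h(0, t)) - \int_0^1 F(Q_h)\,\frac{d\psi_1}{dx} = 0, \tag{3.20}$$

$$\frac{1}{5}\frac{d\hat{Q}_2}{dt} + \psi_2(1)\,F(Q_h(1, t)) - \psi_2(0)\,F(Q_h(0, t)) - \int_0^1 F(Q_h)\,\frac{d\psi_2}{dx} = 0, \tag{3.21}$$

$$\frac{1}{7}\frac{d\hat{Q}_3}{dt} + \psi_3(1)\,F(Q_h(1, t)) - \psi_3(0)\,F(Q_h(0, t)) - \int_0^1 F(Q_h)\,\frac{d\psi_3}{dx} = 0. \tag{3.22}$$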
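The factors 1, 1/3, 1/5 and 1/7 in front of the time derivatives come from the integrals of $\psi_l \psi_k$ over [0, 1], which vanish for l ≠ k because the shifted Legendre polynomials are orthogonal. A minimal sketch that checks this with exact rational arithmetic is given below; the coefficient-list representation and the helper names are illustrative choices, not part of CODA or JAGUAR.

```python
from fractions import Fraction

# Shifted Legendre polynomials of Eqs. 3.15-3.18, stored as coefficient
# lists in increasing powers of x.
PSI = [
    [1],
    [-1, 2],
    [1, -6, 6],
    [-1, 12, -30, 20],
]


def multiply(a, b):
    """Coefficients of the product of two polynomials."""
    out = [0] * (len(a) + len(b) - 1)
    for i, a_i in enumerate(a):
        for j, b_j in enumerate(b):
            out[i + j] += a_i * b_j
    return out


def integrate_unit_interval(coeffs):
    """Exact integral over [0, 1] of sum_k c_k x^k."""
    return sum(Fraction(c, k + 1) for k, c in enumerate(coeffs))


def mass_matrix(basis):
    """Matrix of the integrals of psi_l * psi_k over [0, 1]."""
    return [[integrate_unit_interval(multiply(a, b)) for b in basis]
            for a in basis]


if __name__ == "__main__":
    for row in mass_matrix(PSI):
        print([str(entry) for entry in row])
```

The matrix is diagonal, so each modal coefficient evolves through its own equation, coupled to the others only through the flux terms.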

This system can be solved with a Runge-Kutta scheme, leading to the Runge-Kutta Discontinuous Galerkin scheme. Note that the first-order case, obtained by keeping only Eq. 3.19, corresponds to a first-order Finite Volume discretization. Note also that the flux values at the cell boundaries can be obtained by solving a Riemann problem.
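To make the link with the Finite Volume method concrete, the sketch below applies Eq. 3.19 to the linear advection flux F(Q) = aQ on a periodic 1D mesh, with a forward Euler update in time. The assumptions are stated plainly: the flux is linear, so the Riemann problem at an interface reduces to taking the upwind state; the factor 1/dx comes from mapping the reference element back to a physical cell of size dx; and all names are hypothetical.

```python
def upwind_flux(q_left, q_right, a):
    """Riemann flux for F(Q) = a * Q: the upwind state is selected."""
    return a * q_left if a >= 0.0 else a * q_right


def residual(q, a, dx):
    """Cell residuals -(F_{i+1/2} - F_{i-1/2}) / dx on a periodic mesh."""
    n = len(q)
    # flux[i] is the interface flux between cell i and cell i + 1
    flux = [upwind_flux(q[i], q[(i + 1) % n], a) for i in range(n)]
    return [-(flux[i] - flux[i - 1]) / dx for i in range(n)]


def advance(q, a, dx, dt, n_steps):
    """Forward Euler time stepping of the cell averages."""
    for _ in range(n_steps):
        r = residual(q, a, dx)
        q = [q_i + dt * r_i for q_i, r_i in zip(q, r)]
    return q


if __name__ == "__main__":
    q0 = [0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
    print(advance(q0, a=1.0, dx=0.125, dt=0.0625, n_steps=4))
```

Because each interface flux is added to one cell and subtracted from its neighbor, the sum of the cell averages is conserved on a periodic mesh.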


3.3 Implemented features

This section presents the theory behind the features implemented in CODA. First, the low storage Runge-Kutta method, which was used in all the simulations, is described. Afterward, the sponge zone principle is explained.

3.3.1 Time integration method

Currently the time integration methods used in JAGUAR are a family of Explicit Runge-Kutta (ERK) schemes including low storage methods. Some implicit methods are also available [16]. These implicit schemes are the DIRK (Diagonally Implicit Runge-Kutta) and SDIRK (Singly-DIRK) methods. However ERK schemes are the most used.

In CODA the preferred choice is the use of implicit methods such as [(E)S]DIRK (ESDIRK being an SDIRK scheme with an explicit first stage) or BDF (Backward Differentiation Formula). The implementation of these methods uses a mechanism for nesting time integrations. However, ERK schemes are also available and the low storage method implemented is one of them.

The low storage Runge-Kutta method In the field of CFD, considerable amounts of data are handled, and low storage methods have been developed to deal with this [29]. The method used in JAGUAR is the 2N-storage explicit Runge-Kutta method, which requires a storage of only 2 times the number of degrees of freedom.

In order to present this method, one considers the general case of a non-autonomous system of ordinary differential equations

$$\frac{dQ}{dt} = F(t, Q(t)); \quad Q(t_0) = Q_0. \tag{3.23}$$

The general form of an s-stage RK scheme is

$$Q^{n+1} = Q^n + \Delta t \sum_{i=1}^{s} b_i k_i; \quad k_i = F\left(t^n + \Delta t\,c_i,\; Q^n + \Delta t \sum_{j=1}^{i-1} a_{ij} k_j\right), \tag{3.24}$$

where $c_i = \sum_{j=1}^{i-1} a_{ij}$.
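A direct transcription of Eq. 3.24 for a scalar unknown is sketched below. It is only an illustration: the function names are hypothetical, and the tableau used is the classical four-stage fourth-order one, chosen here as a well-known example rather than taken from CODA or JAGUAR.

```python
import math

# Classical four-stage, fourth-order Butcher tableau (example choice)
RK4_A = [
    [0.0, 0.0, 0.0, 0.0],
    [0.5, 0.0, 0.0, 0.0],
    [0.0, 0.5, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
]
RK4_B = [1.0 / 6.0, 1.0 / 3.0, 1.0 / 3.0, 1.0 / 6.0]


def erk_step(f, t, q, dt, a, b):
    """One explicit RK step of Eq. 3.24 for dq/dt = f(t, q)."""
    stages = []  # all the k_i are kept until the final combination
    for i in range(len(b)):
        c_i = sum(a[i][:i])  # c_i = sum_j a_ij
        q_i = q + dt * sum(a[i][j] * stages[j] for j in range(i))
        stages.append(f(t + c_i * dt, q_i))
    return q + dt * sum(b_i * k_i for b_i, k_i in zip(b, stages))


def integrate(f, t0, q0, dt, n_steps, a, b):
    """Apply n_steps explicit RK steps starting from (t0, q0)."""
    t, q = t0, q0
    for _ in range(n_steps):
        q = erk_step(f, t, q, dt, a, b)
        t += dt
    return q


if __name__ == "__main__":
    # Non-autonomous test problem dq/dt = -2 t q, exact solution exp(-t^2)
    q_end = integrate(lambda t, q: -2.0 * t * q, 0.0, 1.0, 0.05, 20, RK4_A, RK4_B)
    print(q_end, math.exp(-1.0))
```

In this general form, the s stage values $k_i$ must all be stored before the final combination, which is the memory cost that low storage variants aim to reduce.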

To obtain low-storage schemes, the idea is to leave useful information in the storage array at each stage instead of starting from an empty array. After some algebra, the coefficients of the Butcher tableau can be rewritten and the algorithm becomes [4, 25]:

$$Q^0 = Q^n, \tag{3.25}$$

$$Q^l = Q^n + \alpha_l\,\Delta t\,F\left(Q^{l-1}\right), \quad l = 1, \dots, s, \tag{3.26}$$

$$Q^{n+1} = Q^s, \tag{3.27}$$

where the $\alpha_l$ are coefficients specific to the method.


JAGUAR uses a fourth-order RK scheme with 4 stages and a low-dissipation, low-dispersion second-order RK scheme with 6 stages, described in [4].
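The algorithm of Eqs. 3.25 to 3.27 can be sketched in a few lines of Python. The coefficients below are an assumption made for the illustration and are not taken from JAGUAR or CODA: with $\alpha_l = 1/(s - l + 1)$ and s = 4, the scheme reproduces the Taylor expansion of the exact solution up to $\Delta t^4$ for linear problems. The function names are hypothetical as well.

```python
import math

# Assumed coefficients (not from the report): alpha_l = 1 / (s - l + 1), s = 4
ALPHA_4_STAGES = [1.0 / 4.0, 1.0 / 3.0, 1.0 / 2.0, 1.0]


def low_storage_rk_step(f, q_n, dt, alpha):
    """One step of Eqs. 3.25-3.27 for an autonomous system dQ/dt = F(Q)."""
    q_l = list(q_n)  # Q^0 = Q^n
    for alpha_l in alpha:
        rhs = f(q_l)
        # Q^l = Q^n + alpha_l * dt * F(Q^{l-1}): only Q^n and the current
        # stage are carried from one stage to the next.
        q_l = [q + alpha_l * dt * r for q, r in zip(q_n, rhs)]
    return q_l  # Q^{n+1} = Q^s


def integrate(f, q0, dt, n_steps, alpha):
    """Apply n_steps low storage RK steps starting from q0."""
    q = list(q0)
    for _ in range(n_steps):
        q = low_storage_rk_step(f, q, dt, alpha)
    return q


if __name__ == "__main__":
    decay = lambda q: [-x for x in q]  # dQ/dt = -Q, exact solution exp(-t)
    print(integrate(decay, [1.0], 0.1, 10, ALPHA_4_STAGES), math.exp(-1.0))
```

This Python version allocates temporary lists at each stage for readability; an in-place implementation would overwrite a single work array so that only two arrays of the size of the unknowns are needed.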

3.3.2 Sponge zone

Another feature implemented in both JAGUAR and CODA during the internship is the sponge region. Flow simulations are computed on a bounded domain, hence the external boundaries must be treated artificially. The aim of a sponge zone is to allow the flow to leave the computational domain without any reflection back into it [15]. A simple approach to treat the external boundaries is to use a sponge layer [8, 3]. This treatment is done by adding a source term to the Navier-Stokes equations, leading to the following system:

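$$\left\{\begin{aligned} &\partial_t \rho + \nabla \cdot (\rho \vec{u}) = \sigma(\rho_{ref} - \rho), \\ &\partial_t (\rho \vec{u}) + \nabla \cdot (\rho \vec{u} \otimes \vec{u}) + \nabla p - \nabla \cdot \tau = \sigma\left((\rho \vec{u})_{ref} - \rho \vec{u}\right), \\ &\partial_t (\rho E) + \nabla \cdot \left(\vec{u}\,(\rho E + p)\right) - \nabla \cdot \left(\vec{u}\,\tau + \lambda \nabla T\right) = \sigma(E_{ref} - E). \end{aligned}\right. \tag{3.28}$$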

The new term on the right-hand side is only active near the external boundaries. The coefficient σ is called the damping coefficient: this additional source term damps the flow variables toward a known reference solution, which can be the initial one. In practice, one must specify the sponge depth (or length) $\sigma_{depth}$ and the maximum value $\sigma_{max}$ of the damping coefficient, which must be taken sufficiently large.
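The structure of the source term can be illustrated with a minimal 1D sketch. The report does not specify how σ varies inside the sponge, so the quadratic ramp used here is an assumption, as are the function names and the parameter `x_start` marking the beginning of the sponge; this is not the JAGUAR or CODA implementation.

```python
def sponge_sigma(x, x_start, sigma_depth, sigma_max):
    """Damping coefficient: zero before x_start, then an assumed quadratic
    ramp reaching sigma_max at x_start + sigma_depth."""
    xi = (x - x_start) / sigma_depth
    xi = min(max(xi, 0.0), 1.0)  # clip to the sponge extent
    return sigma_max * xi * xi


def sponge_source(q, q_ref, sigma):
    """Right-hand-side term sigma * (q_ref - q), as in Eq. 3.28."""
    return sigma * (q_ref - q)


if __name__ == "__main__":
    for x in (5.0, 12.0, 15.0, 20.0):
        sigma = sponge_sigma(x, x_start=10.0, sigma_depth=10.0, sigma_max=100.0)
        print(x, sigma, sponge_source(1.2, 1.0, sigma))
```

Outside the sponge the coefficient is zero and the original equations are recovered; inside, the sign of the source term always pushes the variable back toward its reference value.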


Chapter 4 Numerical results

This chapter presents the numerical results obtained during this internship. It contains the validation of the implementations done in CODA and comparisons between the two solvers. Each test case is briefly presented before its results.

4.1 Validation of the time integration method

This part aims to validate the implementation of the low storage method in CODA. This feature was implemented in the C++ layer and consists of four source files. The first one implements the low storage class, including the function which performs the time iteration updates. The second one is dedicated to the storage of the predefined Butcher tableaux, and the third one allows the user to define a Butcher tableau from the Python layer. The last one defines all the control instructions needed to select the low storage method from the Python layer.

Onera M6 wing validation case In order to check the implementation, one can consider a steady test case and study the convergence of the residuals. The residual is defined as the right-hand-side term in the formulation of the equations. For instance, in the case of the 1D unsteady Euler equation, one has

$$\frac{dQ_i}{dt} = R_i, \tag{4.1}$$

with $R_i$ the residual.

Since the test case considered here is steady, the time derivative vanishes and one can iterate using a pseudo-time integration process which drives the residuals toward zero. CODA already provides explicit Runge-Kutta methods, and comparing the residual convergence of these methods with that of the low storage one makes it possible to validate the implementation of this new feature. For this purpose, the test case used is the so-called Onera M6 wing validation case [19, 24] with the steady Euler equations.


Figure 4.1: Presentation of the Onera M6 Wing validation case and the associated mesh.

The Onera M6 wing was designed by Onera engineers in 1972 in order to provide experimental support for studies of high Reynolds number flows. In this test case, the numerical discretization used is a second order Finite Volume method.

Results The measurements were performed on two low storage schemes :

• the low dissipation and dispersion second order RK with 6 stages,

• a fourth order RK with 4 stages.

Each of them was compared with a classical equivalent in terms of order of accuracy. In the fourth order case, the low storage scheme is based on the classical four-stage fourth order RK scheme.

Figure 4.2: Relative convergence of the residuals of second order RK schemes


Figure 4.3: Relative convergence of the residuals of fourth order RK schemes

In both cases the convergence of the residuals of the low storage methods follows that of the already implemented schemes; in other words, the implementation is correct. One can notice in Fig. 4.3 that the residuals are almost the same, because the two schemes are of the same nature: they are based on the same Butcher tableau.

4.2 Presentation of the COVO test case

In order to perform some comparisons between JAGUAR and CODA, one considers the so-called COnvection of a VOrtex (COVO) test case [5]. The COVO test case is aimed at testing the capability of a high-order method to preserve vorticity in an unsteady inviscid flow. The goal is to transport a vortex in a periodic box. The governing equations are the unsteady 2D Euler equations, with a constant ratio of specific heats γ = 1.4 and a gas constant $R_{gas}$ = 287.15 J.kg$^{-1}$.K$^{-1}$. The domain is first initialized with a uniform flow of pressure $P_\infty$, temperature $T_\infty$ and Mach number $M_\infty$, on which a vortical movement is superimposed. The exact solution of this test case consists in a translation of the initial vortex by a distance equal to the mean-flow velocity times the time interval.

4.2.1 Test case in JAGUAR

In JAGUAR the COVO is implemented using a 2D Lamb-Oseen inviscid vortex generated by the stream function :

$$\psi_1(x, y) = \Gamma \exp\left(-\frac{r_1^2}{2}\right), \quad r_1 = \frac{\sqrt{(x - x_c)^2 + (y - y_c)^2}}{R_c}, \tag{4.2}$$

where

• Γ is the vortex strength,

• Rc is the vortex radius,


• (xc, yc) are the coordinates of the vortex center.

The resulting initial velocity distribution is:

$$u_0 = U_\infty + \partial_y \psi_1 \quad \text{and} \quad v_0 = -\partial_x \psi_1, \tag{4.3}$$

where $U_\infty = M_\infty \sqrt{\gamma R_{gas} T_\infty}$ is the speed of the unperturbed flow. The temperature, density and pressure of the fluid are described by:

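$$T_0 = T_\infty - \frac{\Gamma^2 \exp(-r_1^2)}{2 C_p}, \quad \rho_0 = \frac{P_\infty}{R_{gas} T_\infty} \left(\frac{T_0}{T_\infty}\right)^{1/(\gamma - 1)} \quad \text{and} \quad P_0 = \rho_0 R_{gas} T_0. \tag{4.4}$$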

4.2.2 Test case in CODA

The implementation of the COVO test case in CODA is slightly different: it follows the definition given in [9]. In that case one has:

$$\psi_2(x, y) = U_A \exp\left(\frac{1 - r_2^2}{2}\right), \quad r_2 = \frac{\sqrt{(x - x_c)^2 + (y - y_c)^2}}{b}, \tag{4.5}$$

where

• $U_A = \beta U_\infty$ with β a parameter defining the vortex strength,

• $b = L / \sqrt{\ln 2}$ with L a representative length scale of the vortex (with $\exp(-r_2^2) = 1/2$ at $r_2 = L/b$).

The velocity distribution has the same definition as in Eq. 4.3 (considering $\psi_2$). Since CODA uses a nondimensionalization following the DPT scheme [18], one has $\rho_\infty = T_\infty = P_\infty = R_{gas} = 1$ (so $U_\infty = M_\infty \sqrt{\gamma}$) and:

$$T_0 = 1 - \frac{\gamma - 1}{2} \frac{U_A^2}{U_\infty^2} \exp(1 - r_2^2), \quad \rho_0 = T_0^{1/(\gamma - 1)} \quad \text{and} \quad P_0 = \rho_0 T_0. \tag{4.6}$$

4.3 Performance comparisons between JAGUAR and CODA

This section is dedicated to the performance comparisons between the solvers. Throughout this section, DG4 and SD4 denote the order 4 Discontinuous Galerkin method (used by CODA) and the order 4 Spectral Difference method (used by JAGUAR), where the method order corresponds to the number of solution points per cell, which is p + 1 with p the polynomial degree.


4.3.1 Accuracy results

To measure the accuracy of the solvers, the RK scheme with low dissipation and low dispersion is used on the COVO test case. The experiment consists in performing 50 rotation periods of the vortex and comparing the numerical results with the analytical solution. A time step dt = 3.3e−5 was considered and 256107 iterations were performed for both order 3 and order 4 methods. A slice along y = 0.05 (middle of the domain) of the x-density component is used for the comparison.

Parameters In order to obtain the same conditions, the parameters used are the following.

For CODA:

• $(x_c, y_c)$ = (0.05, 0.05)

• L = 0.00416277 (so b = 0.005)

• β = 0.8 (so $U_A = 0.8\,U_\infty$)

• $M_\infty$ = 0.5

For JAGUAR:

• $(x_c, y_c)$ = (0.05, 0.05)

• $R_c$ = 0.005

• Γ = 0.78031732620442706 (= $U_A e^{1/2}$)

• $M_\infty$ = 0.5

• $T_\infty = P_\infty = R_{gas} = 1$

• $C_v$ = 2.5
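The correspondence between the two parameter sets can be checked numerically. The minimal sketch below recomputes b and Γ from the CODA parameters and compares the stream functions of Eqs. 4.2 and 4.5 at an arbitrary sample point; the function and variable names are hypothetical, and the nondimensional values $R_{gas} = T_\infty = 1$ are used for $U_\infty$.

```python
import math


def psi_jaguar(x, y, xc, yc, strength, rc):
    """Stream function of Eq. 4.2."""
    r_sq = ((x - xc) ** 2 + (y - yc) ** 2) / rc ** 2
    return strength * math.exp(-0.5 * r_sq)


def psi_coda(x, y, xc, yc, u_a, b):
    """Stream function of Eq. 4.5."""
    r_sq = ((x - xc) ** 2 + (y - yc) ** 2) / b ** 2
    return u_a * math.exp(0.5 * (1.0 - r_sq))


HEAT_RATIO = 1.4     # gamma
MACH = 0.5
BETA = 0.8
LENGTH = 0.00416277  # L

U_INF = MACH * math.sqrt(HEAT_RATIO)   # with R_gas = T_inf = 1
U_A = BETA * U_INF
B = LENGTH / math.sqrt(math.log(2.0))  # b = L / sqrt(ln 2)
STRENGTH = U_A * math.exp(0.5)         # Gamma = U_A * e^(1/2)

if __name__ == "__main__":
    print("b     =", B)
    print("Gamma =", STRENGTH)
    print(psi_jaguar(0.052, 0.047, 0.05, 0.05, STRENGTH, B),
          psi_coda(0.052, 0.047, 0.05, 0.05, U_A, B))
```

Since $U_A \exp((1 - r^2)/2) = U_A e^{1/2} \exp(-r^2/2)$, the two stream functions coincide as soon as $\Gamma = U_A e^{1/2}$ and $R_c = b$, which is what the listed values express.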

Results

Figure 4.4: X-density component comparison with exact solution after 50 periods. Order 3 methods at the left and Order 4 ones at the right.


Figure 4.5: X-density component comparison with exact solution after 50 periods.

As expected, Fig. 4.4 shows that the numerical results are only slightly dissipated or dispersed. Hence the rk6ldld scheme works correctly in CODA. The two zooms in Fig. 4.5 show some differences between the solvers. The accuracy is good for the order 4 methods whereas the order 3 methods suffer from a shift of the vortex.

(a) JAGUAR (b) CODA

Figure 4.6: Vortex shift for order 3 methods.

One can see in Fig. 4.6 that the vortex computed by JAGUAR is more shifted than the one computed by CODA, which explains the observations made in Fig. 4.5. Hence, on this test case, the order 3 DG scheme is more accurate than the order 3 SD scheme.

4.3.2 Execution time comparison

The comparison in terms of execution time was made on the COVO test case in a 3D configuration with $64^3$ cells and periodic boundary conditions in all directions of the domain $[0.1]^3$. The time integration scheme used is the low storage RK6 with low dissipation and dispersion. The objective was to compare the solvers in terms of execution time for different compilation options. The measurements were performed on one compute node of the Kraken cluster.

Results First, the effect of the -OX flags was investigated in order to assess the level of code optimization, in the sense that a badly optimized code benefits more from compiler optimizations than a well optimized one. The gain obtained by exploiting the specifications of the architecture was also studied. For this purpose the compilation flag -march=native was used: it allows the compiler to produce code specific to the CPU of the system, with all its capacities and features (like AVX instructions for vectorization). The time measurements considered are the total execution time of the program (denoted as Total) and the time spent in the time iteration loop (denoted as Time loop) after 50 iterations with ∆t = 1e−4.

Figure 4.7: Ratio $T_{CODA}/T_{JAGUAR}$ with different options of compilation for order 3 (left) and 4 (right).

Fig. 4.7 shows the results obtained. One can notice that the gain from the optimization flags is larger for CODA than for JAGUAR. Also, going from order 3 to order 4 is much more expensive for CODA. As suggested by these results, CODA needs more optimizations. Fortunately, a study has been made at Airbus on the performance issues in the code generated by the C++ compiler [23]. According to this document, some additional compilation flags are useful:

• -DNDEBUG : disables debug assertions, particularly useful for Eigen.

• -DEIGEN_DONT_VECTORIZE : the SIMD instructions used by Eigen in a loop prevent its vectorization and this flag allows vectorizing loops in CODA.

• --param inline-unit-growth=200 : specifies the maximal overall growth of the compilation unit caused by inlining. The value 200 limits the unit growth to 3 times the original size.

• --param inline-min-speedup=1 : when the estimated performance improvement of the caller + callee runtime exceeds this threshold (in percent), the function can be inlined.

Considering these compilation flags, measurements were performed and compared with the best previous version (-O3 -march=native). Also, the difference between a hybrid MPI+OpenMP job and a job using only MPI was measured.


Figure 4.8: Ratio $T_{CODA}/T_{JAGUAR}$ with different options of compilation and parallel programming models for order 3 (left) and 4 (right).

Fig. 4.8 shows a better performance when using the combination of optimized compilation and MPI only. One can notice a gap between the total execution time of a hybrid MPI+OpenMP job and that of a classical MPI-only job at order 4: in the MPI-only case, the preprocessing is done faster. In order to obtain a better overview of the performance of CODA, time measurements were made for different numbers of iterations, which makes it possible to check whether the execution time increases linearly with the number of iterations. A speedup study with 400 iterations and ∆t = 1e−4 was also carried out for orders 3 and 4 with both solvers.

Figure 4.9: CODA performances considering various number of iterations (left) and speedup for both CODA and JAGUAR (right).

The execution time of CODA does not increase much with the number of iterations, at least on one compute node. The speedup shown in Fig. 4.9 is good; however, increasing the polynomial order leads to a worse speedup for CODA while an improvement is observed for JAGUAR. These performance results are poor and unexpected: in the best case, CODA is 3 times slower than JAGUAR for order 3 and 7.4 times slower for order 4 in terms of time loop execution time.

4.3.3 Sponge zone validation

During this internship, the sponge zone was implemented in JAGUAR and CODA. For CODA, an external source term class exists in the C++ layer. In order to test it, the implementation of the sponge zone was done in the Python layer. Unfortunately, some internal problems of CODA prevented us from obtaining the expected results. Hence the CODA sponge zone validation is not presented.

The NACA 0012 test case In order to study the effects of the sponge zone, the NACA 0012 airfoil test case was used. NACA stands for National Advisory Committee for Aeronautics [28]; it was a US federal agency whose activities were transferred to NASA in 1958. It developed equations to generate consistent airfoil shapes. The NACA 0012 is one of them. The four digits are defined as follows:

• first : describes the maximum camber as a percentage of the chord length,

• second : describes the distance of the maximum camber from the airfoil leading edge in tenths of the chord,

• third and fourth : describe the maximum thickness of the airfoil as a percentage of the chord.

Airfoils with a series number beginning with 00 are symmetrical and have no camber. The NACA 0012 profile is given by the following equation [2]:

$$y_t = 5\,t\,c \left[0.2969 \sqrt{\frac{x}{c}} - 0.1260\,\frac{x}{c} - 0.3516 \left(\frac{x}{c}\right)^2 + 0.2843 \left(\frac{x}{c}\right)^3 - 0.1015 \left(\frac{x}{c}\right)^4\right], \tag{4.7}$$

where c is the chord length, x is the position along the chord from 0 to c, t is the maximum thickness as a fraction of the chord and $y_t$ is the half thickness at a given value of x.
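Eq. 4.7 is straightforward to evaluate. The sketch below is a minimal Python transcription, with t = 0.12 for the NACA 0012 according to the digit convention above; the function name and default arguments are hypothetical.

```python
import math


def naca_half_thickness(x, c=1.0, t=0.12):
    """Half thickness y_t of Eq. 4.7 at the chordwise position x in [0, c]."""
    xi = x / c
    return 5.0 * t * c * (
        0.2969 * math.sqrt(xi)
        - 0.1260 * xi
        - 0.3516 * xi ** 2
        + 0.2843 * xi ** 3
        - 0.1015 * xi ** 4
    )


if __name__ == "__main__":
    for x in (0.0, 0.1, 0.3, 0.6, 1.0):
        y_t = naca_half_thickness(x)
        # Upper and lower surfaces of the symmetrical profile are +y_t and -y_t
        print(f"x/c = {x:.1f}  y_t/c = {y_t:.5f}")
```

The half thickness is close to 0.06 c near x = 0.3 c, i.e. a total thickness of about 12 % of the chord. With these coefficients the half thickness at x = c is small but not exactly zero.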

Figure 4.10: NACA 0012 Mesh.

Many NACA airfoils have been physically tested and the results are used in the evaluation of CFD simulations.


Parameters The validation case with a sponge layer in JAGUAR used the following parameters :

Inflow data:

• Mach = 0.2

• angle of attack : 0.0

• angle of sideslip : 7.0

• p = 101325 Pa

• T = 293.15 K

Sponge layer data:

• $\sigma_{max}$ = 100

• $\sigma_{depth}$ = 10

• Reference point : (20, 0, 0)

Figure 4.11: Sponge region.

Fig. 4.11 shows the region where the sponge zone term is active. The NACA 0012 airfoil is located around the origin (0, 0) of the domain; it is too small to be seen in this figure since the mesh is very large.


Results The experiments were performed with a CFL of 0.5 for 1 000 000 iterations with the order 5 SD method. The pressure variation ∆p = p − 101325 Pa, where 101325 Pa is the standard atmospheric pressure, is considered to study the sponge zone effect.

(a) Without sponge layer. (b) With sponge layer.

Figure 4.12: Pressure variation of NACA 0012 test case in JAGUAR.

Fig. 4.12 shows the results obtained without and with a sponge zone. In the case without sponge zone, a second wave is generated when the turbulence reaches the right boundary, leading to a reflection of the perturbation. With the sponge layer, the turbulence propagation is sufficiently damped when it enters the sponge zone. These results validate the implementation in JAGUAR.


Chapter 5 Conclusion

The new CFD code CODA implemented in C++17 is a good step toward next generation software. Nevertheless, it is not well optimized and needs some performance improvements before its execution time can be fairly compared with that of another solver. As an example of improvement, we saw during this internship that a git branch exists with a major modification in the computation of the polynomial bases. In the tested version of CODA the polynomial basis is recomputed at each time step; this is a huge bottleneck and the proposed alternative is to compute each polynomial basis once during the preprocessing phase. Hence the polynomial bases are stored and only memory accesses are necessary at each time step. This leads to better performance and reduces the execution time gap between orders 3 and 4. Unfortunately, some additional plugins are needed to test this git branch and we could not use it.

As CODA is used as a Python plugin, it is generally straightforward to run a desired simulation. However, when a deeper usage is needed, navigating the C++ source code is rather difficult. In the case of the sponge zone, it appears that some necessary features were not anticipated, leading to a fallback option. Also, the mesh format widely used at CERFACS, namely the GMSH format, is not well supported by the architecture of CODA and we encountered significant problems. Another feature of the same kind as the sponge zone was considered but identical issues were encountered.

The initiative of providing a next generation solver is great, but it is premature to compare it with optimized Fortran solvers. The same kind of performance comparisons will be relevant once CODA is well optimized. Also, a better integration in the CODA project team would be crucial to develop new features in accordance with their vision and expectations.

Bibliography

[1] Airbus. Airbus signs strategic partnership with ONERA and DLR, 2017.

[2] Autodesk. Autodesk simulation CFD external Flow Validation : NACA 0012 Airfoil.

[3] Daniel J. Bodony. Analysis of sponge zones for computational fluid mechanics. Journal of Computational Physics, 212(2):681–702, 3 2006.

[4] Christophe Bogey and Christophe Bailly. A family of low dispersive and low dissipative explicit schemes for flow and noise computations. Journal of Computational Physics, 194(1):194–214, 2004.

[5] C1.4. Vortex transport by uniform flow. In 3rd international Workshop on High-Order CFD Methods, 2015.

[6] P. Chassaing. Mécanique des fluides. Editions Cepadues, 1995.

[7] Ideal Simulation. https://www.idealsimulations.com/resources/turbulence-models-in-cfd/.

[8] Moshe Israeli and Steven A. Orszag. Approximation of radiation boundary conditions. Journal of Computational Physics, 41(1):115–135, 1981.

[9] J. C. Kok. A high-order low-dispersion symmetry-preserving finite-volume method for compressible flow on curvilinear grids. Technical report, Nationaal Lucht- en Ruimtevaartlaboratorium, 2008.

[10] David A. Kopriva and John H. Kolias. A Conservative Staggered-Grid Chebyshev Multidomain Method for Compressible Flows. Journal of Computational Physics, 125(1):244–261, 1996.

[11] N. Kroll, M. Abu-Zurayk, D. Dimitrov, T. Franz, T. Führer, T. Gerhold, S. Görtz, R. Heinrich, C. Ilic, J. Jepsen, J. Jägersküpper, M. Kruse, A. Krumbein, S. Langer, D. Liu, R. Liepelt, L. Reimer, M. Ritter, A. Schwöppe, J. Scherer, F. Spiering, R. Thormann, V. Togiti, D. Vollmer, and J. H. Wendisch. DLR project Digital-X: towards virtual aircraft design and flight testing based on high-fidelity methods. CEAS Aeronautical Journal, 7(1):3–27, 2016.

[12] L. Landau and E. Lifchitz. Physique théorique. Editions Mir, 1989.

[13] T Leicht, D Vollmer, J Jägersküpper, A Schwöppe, R Hartmann, J Fiedler, Linder Höhe, and Linder Höhe. DLR-PROJECT DIGITAL-X NEXT GENERATION CFD SOLVER ‘FLUCS’, 2016.

[14] Yen Liu, Marcel Vinokur, and Z. J. Wang. Spectral difference method for unstructured grids I: Basic formulation. Journal of Computational Physics, 216(2):780–801, 8 2006.

[15] A Mani. On the reflectivity of sponge zones in compressible flow simulations. Technical report, Center for turbulence research, 2010.

[16] T. Marchal. Study and implementation of implicit time integration methods in a high-order CFD code, 2018.

[17] Michael Meinel and G Einarsson. The FlowSimulator framework for massively parallel CFD applications. PARA 2010 conference: state of the art in Scientific and Parallel Computing, 2010.

[18] Marc Montagnac. Variable Normalization (nondimensionalization and scaling) for Navier-Stokes equations: a practical guide. Technical report, CERFACS, 2013.

[19] Nasa. Turbulence Modeling Resource.

[20] G. Puigt and H. Deniau. CFD e-Learning, Mesh and discretization, 2011.

[21] Guillaume Puigt. Jaguar : http://www.cerfacs.fr/~puigt/jaguar.html, 2017.

[22] W. H. Reed and T. R. Hill. Triangular mesh methods for the neutron transport equation. Proceedings of the American Nuclear Society, 1973.

[23] Jerome Robert. Compiler behavior in CODA. Technical report, Airbus, 2020.

[24] Volker Schmitt and François Charpin. Pressure Distributions on the ONERA-M6-Wing at Transonic Mach Numbers. Technical report, AGARD, 1979.

[25] D. Stanescu and W. G. Habashi. 2N-Storage Low Dissipation and Dispersion Runge-Kutta Schemes for Computational Acoustics. Journal of Computational Physics, 143(2):674–681, 1998.

[26] Kris Van Den Abeele, Chris Lacor, and Z. J. Wang. On the stability and accuracy of the spectral difference method. Journal of Scientific Computing, 37(2):162–188, 11 2008.

[27] Z. J. Wang. Spectral (finite) volume method for conservation laws on unstructured grids. Basic Formulation. Journal of Computational Physics, 178(1):210–251, 5 2002.

[28] Wikipedia. National Advisory Committee for Aeronautics.

[29] J. H. Williamson. Low-storage Runge-Kutta schemes. Journal of Computational Physics, 35(1):48–56, 1980.

[30] Olindo Zanotti. Discontinuous Galerkin method for hyperbolic PDEs, 2016.