
CFD Platforms and Coupling

An Overview of the Multi-Purpose elsA Flow Solver

L. Cambier, M. Gazaix, S. Heib, S. Plot, M. Poinot, J.-P. Veuillot (Onera)
J.-F. Boussuge, M. Montagnac (Cerfacs)

E-mail: [email protected]

Development of the elsA software for complex external and internal flow aerodynamics and multidisciplinary applications started in 1997 at Onera. Due to the multi-purpose nature of elsA, many common basic CFD features can be shared by a wide range of aerospace applications: aircraft, helicopters, turbomachinery, missiles, launchers… The elsA software is based on an Object-Oriented design method and on an Object-Oriented implementation using three programming languages: C++, Fortran and Python. The elsA strategy for interoperability is based on a component approach that relies on standard interfaces for the CFD simulation components. This paper presents an overview of the capabilities of the elsA software in terms of modeling, mesh topology, numerics and boundary conditions, whereas a more detailed description of these capabilities is given in companion papers of this issue of the electronic journal. The importance of High Performance Computing activities is also outlined in the present paper.

Introduction

Unlike other industries, such as the automobile industry, the simulation software used for aerodynamic analysis and design in the aeronautic industry is not usually provided by commercial software vendors, but is generally developed either by Research Establishments or sometimes by the aeronautic industry itself. The overview of software tools used for flow simulation in the European aeronautic industry, presented in [26], shows that in Europe they are mostly managed by Research Establishments. One major reason for this is that the high levels of accuracy and reliability required today for improving aeronautic design are obtained through long term expertise and innovative research in various areas: physical modeling, numerical methods, software efficiency on rapidly evolving hardware, and validation by comparison with detailed experimental data.

The elsA software (http://elsa.onera.fr) for complex external and internal flow aerodynamics and multidisciplinary applications has been developed at Onera since 1997 [4], [6]. The main objective is to offer the French and European aerospace community a tool that capitalizes on the innovative results of Computational Fluid Dynamics (CFD) research over time and is able to deal with miscellaneous industrial applications. The range of aerospace applications dealt with using elsA (aircraft, helicopters, tilt-rotors, turbomachinery, counter-rotating open rotors, missiles, launchers, etc.) is very wide, as shown in figure 1, presenting a few examples of elsA results. A large variety of the advanced aerodynamic applications handled by elsA is presented in [25]. Note that it is quite uncommon for a CFD tool to deal both with external flows around airframes and with internal flows in turbomachinery; since it allows common basic CFD features to be shared, we clearly consider that it is an advantage. The research, development and validation activities are carried out using a project approach in cooperation with the aircraft industry and external laboratories or universities (see Box 1). This project approach has been the result, at Onera as elsewhere, of the change in CFD software development from a one-code, one-developer paradigm at the beginning of the eighties to a team-based approach necessary to cope with the complexity of today's CFD. The effort carried out in France and coordinated by Onera with elsA was also initiated approximately at the same time as projects in other countries, such as the WIND-US flow solver of the NPARC Alliance [3] or the FAAST program at NASA Langley RC [19] in the United States, or the TAU flow solver in Germany [14].

This paper gives a general presentation of the elsA solver, and mainly focuses on software topics such as Object-Oriented design and implementation, software interoperability or High Performance Computing. Only a general overview of the capabilities of the elsA software in terms of modeling, mesh topology, numerics, aeroelasticity and optimum design is presented here, whereas a more detailed description of these capabilities and results, showing evidence of their functionality and correctness, is given in companion papers of this issue of the electronic journal ([1] for transition and turbulence modeling, [8] for space discretization methods on various mesh topologies, [21] for time integration methods, [10] for aeroelasticity, [22] for optimum design). Also, the reader should refer to another companion paper [25] for a detailed presentation of the application results of elsA, which only appear in this paper as illustrations.

Figure 1 – Examples of applications carried out with elsA:
a) Transport aircraft configuration: simulation of the flow around an aircraft with deployed spoilers
b) Turbomachinery configuration: simulation of the rotating stall in an axial compressor stage
c) CROR configuration: simulation of the flow around a CROR for an aeroacoustics study
d) Helicopter configuration: simulation of the interaction between the main rotor and the fuselage

Box 1 - Internal and external developers and users

Many partners, not only inside Onera but also in external laboratories and universities and in the aerospace industry, contribute to the development of new capabilities and to the validation of elsA. Inside Onera, the CFD and Aeroacoustics Department coordinates the elsA software project and contributes to the development in terms of software architecture, numerical methods and CPU efficiency. Several other Departments take part in development and validation activities related to the elsA software: namely, the Applied Aerodynamics Department for thorough validation and some specific applied aerodynamics developments, the Aerodynamics and Energetics Modeling Department for transition and turbulence modeling and fundamental validation, and the Aeroelasticity and Structural Dynamics Department for fluid/structure capability development and validation.

Since 2001, there has been a partnership with the Cerfacs research organization for elsA development; since that time, Cerfacs has taken part in many developments, dealing in particular with mesh strategies, numerical methods and CPU efficiency. Other labs also take part in the development and validation of elsA, such as the Fluid Mechanics and Acoustics Lab of "École Centrale de Lyon" for the development of complex turbomachinery boundary conditions, the applied research center Cenaero (Belgium) for turbomachinery flow simulation, and the DynFluid lab of "Arts et Métiers ParisTech" for high accuracy numerical schemes. The use of elsA in French engineering schools and universities for academic teaching purposes is also developing.

elsA is today intensively used as a reliable design tool in the French and European aeronautic industry. In the turbomachinery industry, elsA is used in the design teams of the Safran group (Snecma and Turbomeca in France, Techspace Aero in Belgium). For transport aircraft configurations, elsA is one of the two CFD programs used at Airbus for performance prediction and for design (the other one is the TAU software from DLR, see [5]). Among other industry partners, we should mention Eurocopter for helicopter applications and MBDA for missile configurations.

Why a multi-purpose tool for solving the Navier-Stokes equations?

CFD methods and software have improved tremendously over the last forty years. Whereas in the 1970s, CFD software for design was mostly based on formulations assuming the flow to be inviscid, today CFD codes solving the Navier-Stokes equations (see Box 2) have become standard tools in the aeronautic industry [27]. The improvements concern various areas: mesh topology capabilities, physical modeling, numerical algorithms. However, in none of these areas is there a universal method answering all of the problems. The best choice of methods depends on the type of application and on the levels of accuracy, robustness and efficiency that are required. In the case of mesh topology capabilities, the relative advantages and disadvantages of Cartesian structured grids, curvilinear structured grids and unstructured grids are well known today, and in recent years it has become clearer that an association of various types of grids in a single simulation is very powerful.

It is also well known that the transition and turbulence modeling required for the simulation of turbulent flows has to be adapted to the type of application and even to local flow phenomena. For example, a classical eddy viscosity model should give good results at low cost for the flow simulation around an airfoil at low angle of attack, whereas Large Eddy Simulation would be required for the simulation of the flow near stall conditions. The selection of the best numerical algorithm strongly depends on the flow regime (subsonic, transonic or supersonic) and on the compromise required between accuracy, efficiency and robustness.

One solution could be to build dedicated software focusing on a narrow application domain. But this solution leads to a proliferation of specific software tools, which are very difficult to maintain, document, optimize and port to different computers. In fact, real-world applications today require a large range of capabilities. And if the best choice of methods depends on the type of application, it is also true that one specific method may be useful for various types of applications (see one example of this, the Chimera method, in the section "Mesh topology capabilities" below). Also, the scientific community is looking for larger and more complex simulations, and it has become increasingly necessary to involve and combine several models and/or several meshing strategies in the same flow simulation.

So, CFD software designers are faced with the challenge of meeting a very wide range of requirements, while keeping software complexity and development cost under control. Thus, a very broad range of CFD capabilities has to be grouped together in an interoperable and evolving software package. To cope with these broad requirements, the designers of the elsA software chose to rely on an Object-Oriented design method, as will be described below; elsA was one of the first major Object-Oriented scientific packages written in C++ [13].

This choice was quite successful, since there has been intensive development of elsA throughout the years. Today, elsA is being developed towards a component architecture (see the section dealing with interoperability) to cope with ever increasing requirements: smart integration in the simulation environments of the aeronautic industry, runtime control of the simulation, coupling with external software for multi-disciplinary applications, etc. Coupling independent components through a common high-level infrastructure provides a natural way to reduce the complexity.

Object-Oriented design and implementation

The elsA software is based on an Object-Oriented (OO) design method. The central concept of OO design is the class: a class encapsulates data and methods. The class interface is the way of communicating with class users. The most important difference between procedural and OO programming is the switch from function to class as the fundamental abstraction. OO programming is interesting because it makes it easier to think about programs as collections of abstractions and to hide the details of how these abstractions work from users who do not care about these details. OO programming can be used to partition problems into well-separated parts, none of which needs to know more about the others than absolutely necessary. This ability to break a large, developing program down into parts that can be pursued independently, thus enhancing collaboration among multiple contributors, explains why integration of new developments is easier in OO software.

Box 2 - Mathematical model of elsA: the Navier-Stokes equations

Navier-Stokes equations are the main mathematical model of aerodynamics in the continuous regime, for which the characteristic length scales are large compared with the mean free path of the molecules. These equations result from the application of the principles of mechanics and thermodynamics, and, in integral form, express an equilibrium between a volume term expressing the time variation of mass, momentum (mass times velocity) and total energy (sum of internal energy and kinetic energy) contained in a volume Ω, and a surface flux term corresponding to the exchanges between the fluid inside Ω and the fluid outside Ω.
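In a generic compact notation (ours, for illustration; the exact formulation used in elsA may differ in detail), this integral balance reads

$$\frac{d}{dt}\int_{\Omega} W \, dV + \oint_{\partial\Omega} \mathcal{F}(W)\cdot n \, dS = 0, \qquad W = (\rho,\ \rho u,\ \rho E)^{T}$$

where $\rho$, $\rho u$ and $\rho E$ are the mass, momentum and total energy per unit volume, and $\mathcal{F}$ gathers the convective and viscous fluxes through the boundary $\partial\Omega$ of outward normal $n$.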

The Navier-Stokes equations are completed, on the one hand, by constitutive ("behavior") laws representing the irreversibility effects associated with viscosity and thermal conductivity, and, on the other hand, by state laws describing the thermodynamic properties of the fluid.

As a result of the low viscosity of air, flows in aeronautics applications are turbulent. Turbulence is included in the Navier-Stokes equations, which represent all of the turbulence scales. However, Direct Numerical Simulation, which relies on solving the instantaneous Navier-Stokes equations, is still very far from being applicable to real world applications. Therefore, a statistical approach has to be applied to the Navier-Stokes equations, which leads to the Reynolds Averaged Navier-Stokes equations or to Large Eddy Simulation (see [1] for further details).
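In standard notation (ours, for illustration), the statistical approach splits each flow variable into a mean and a fluctuating part; averaging the equations then introduces unclosed correlations, the Reynolds stresses, which the turbulence models described in [1] must supply:

$$u_i = \overline{u_i} + u_i', \qquad \tau_{ij}^{R} = -\overline{\rho \, u_i' u_j'}$$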

One important objective of elsA is to reap the benefits of the OO approach without impairing numerical efficiency. Since nearly all of the CPU time of CFD calculations is spent in computing loops acting over quantities such as cells, nodes or interfaces, the OO design of elsA does not deal with objects such as individual cells or fluxes, since loops operating on them would suffer a high penalty. Conversely, outside of loops, it has been shown with the elsA design that OO concepts may be introduced without any significant CPU penalty.

The last version of elsA delivered includes about 600 classes grouped in 26 modules, each specialized for a given CFD task. The OO model based on CFD experience has been defined using the classical UML ("Unified Modeling Language") method. Ideally, developers should be able to work inside a module without having to know the implementation details of the other modules. Achieving a good breakdown is very important for ease of co-operative development and maintenance. A UML modeling tool was used to define the initial design of elsA, but without relying on its automatic code generation capability: the use of such a tool appeared to be too difficult for managing the cooperative development of elsA, and it was given up.

Each module in elsA is identified by a key of 3 to 5 letters. Inside each module, each class name is then prefixed by the key of the module to which it belongs. As an example, the TurKL class associated with the (k, l) turbulence model belongs to the Tur module, which deals with turbulence modeling and transition prediction.

Some of the turbulence models implemented in elsA are built on the Boussinesq hypothesis. Their common feature is the use of the eddy viscosity, which can be calculated either by an algebraic turbulence model, a subgrid-scale model, or using transport equations. Each turbulence model is implemented through a specific class, which is derived from the abstract base class TurBase. All the classes deriving from the abstract class TurBase share its interface, which declares the method compMut(). When manipulated in terms of the interface defined by TurBase, the concrete classes do not have to be known by the client classes. Client classes are only aware of the abstract class. In elsA, the client classes of the Tur class hierarchy are the diffusive fluxes (see module Fxd below) that manipulate a pointer to an instance of a class derived from TurBase by means of:

tur -> compMut()

The computation of the eddy viscosity depends on each particular turbulence model and cannot be performed in the TurBase abstract class. Polymorphism allows the correct version of compMut() to be called dynamically, without any explicit coding by the programmer. As a consequence, adding a new turbulence model will not modify the code of the client class. Figure 2 presents a simplified and reduced view of the UML class diagram of Tur.

Figure 2 – UML model of the Tur module: the abstract base class TurBase declares compMut(); derived classes include TurTransp, TurLes and TurAlg, with concrete models such as TurSA, TurKO, TurKOMenter, TurKEps, TurKEpsV2F, TurKL, TurWale, TurFSF, TurSmago and TurBlx
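The dispatch described above can be mimicked in a few lines of Python (the actual elsA hierarchy is written in C++; the class and method names below follow Figure 2, whereas the formulas are mere placeholders, not the real models):

from abc import ABC, abstractmethod

class TurBase(ABC):
    """Abstract base class: declares the eddy viscosity computation."""
    @abstractmethod
    def compMut(self) -> float:
        ...

class TurSA(TurBase):
    """Spalart-Allmaras one-equation model (placeholder formula)."""
    def __init__(self, nu_tilde: float):
        self.nu_tilde = nu_tilde
    def compMut(self) -> float:
        return self.nu_tilde  # stands in for the real SA closure

class TurKL(TurBase):
    """(k, l) two-equation model (placeholder formula)."""
    def __init__(self, k: float, l: float):
        self.k, self.l = k, l
    def compMut(self) -> float:
        return self.k ** 0.5 * self.l  # stands in for the real (k, l) closure

def diffusive_flux(tur: TurBase) -> float:
    # The client (the Fxd module in elsA) sees only the TurBase
    # interface; the concrete model is resolved dynamically.
    return tur.compMut()

Adding a new model means adding one derived class; diffusive_flux() is left untouched, which is precisely the property exploited in elsA.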

elsA design also follows the objective of organizing the modules into layers, in such a way that each layer should mainly affect the layers above it (see Figure 3). This means that classes in a layer are only allowed to use services of classes of lower layers (or of the same layer); the goal of this organization is to achieve mono-directional relationships. The advantage is that maintenance becomes easier, since one layer's interface only affects the next layers. The main modules of elsA are presented below, after Box 3.

The lowest layer (the Base layer in Figure 3) contains all of the low level modules, such as the Fld module (corresponding to the data storage classes, which encapsulate the dynamic memory needed to store any computational data), the Pcm module (dealing with the implementation on parallel computer architectures; it encapsulates the message passing interface) or the Sio module, dealing with IO (Input/Output).

Figure 3 – elsA design organization: from top to bottom, the Factory layer (Descp, Fact, Obf), the Solver layer (Tmo, Lhs, Rhs), the Space Discretization layer (Oper, Fxc, Fxd, Sou, Bnd), the Physical model layer (Eos, Tur), the Geometry layer (Geo, Blk, Dtw, Mask, Chim, Join, Glob) and the Base layer (Def, Agt, Fld, Tbx, Pcm, Sio)

Box 3 - Joint development project

The elsA project is a joint development project including organizations and individuals located at several geographic locations. This cooperative development involving developers at different sites is greatly facilitated by the use of a simple and robust version control system: the Concurrent Versions System (CVS), and soon the Subversion system (SVN). The version control system maintains a central "repository", which stores files and gathers them into coherent and named sets of revisions. The developers work in private workspaces and use the repository as a common basis for source exchange. Only one person, called the "integrator", is allowed to "commit" the set of changes made by a developer back into the repository. Each new production of the software includes approximately 5 to 10 developments.

A delivered release of the software is a particular production, corresponding to a higher level of documentation and validation quality than a current production. The average time between two successive delivered releases of elsA is 12 to 18 months, and generally about 5 intermediate productions are made in between.

Communication is obviously one of the key elements in a joint development project. A Web site (http://elsa.onera.fr) facilitates information transmission. This site, including restricted areas for external users and developers, gives, in particular, access to the documentation (User Reference Manual, User Starting Guide, Developer’s Guide, Validation report…), the validation scripts (around 170 test cases at the present time in the validation database) and the problem tracking database. A specific email address is available for software support.

The geometry layer contains all of the modules which describe geometrical and topological elements:
• Blk: defines the "block" notion. A block corresponds to a region of the discretized physical space defined by a mesh. Blocks are specialized to take into account grid motion, the Arbitrary Lagrangian-Eulerian (ALE) technique and Hierarchical Mesh Refinement (HMR) features;
• Geo: defines the abstraction of the computational grid and provides all geometrical ingredients used by the finite volume formulation (metrics: volume of cells, surface of cell interfaces; topological relations between geometrical entities: cells, interfaces, nodes);
• Dtw: contains all of the distance and boundary-layer integral thickness computations;
• Mask: defines concepts, such as masks for blanking, defined in the Chimera technique;
• Chim: contains the grid assembly used within the Chimera technique;
• Join: deals with multi-block computations with various types of multi-block interface connectivity: 1-to-1 abutting, 1-to-n abutting or mismatched abutting.

The physical model layer includes the Eos module, which computes quantities such as pressure, temperature (according to the equation of state of the gas) or the laminar viscosity coefficient, and the Tur module already quoted above.

The space discretization layer gathers several important modules for the computation of the terms of the equations to solve and of the boundary conditions:
• Oper: defines the operator notion. Each operator class is responsible for the computation of a single term in the CFD equations: convective flux (Fxc), diffusive flux (Fxd), source term (Sou);
• Bnd: contains the numerous classes devoted to the large variety of boundary conditions.

The solver layer contains:
• Rhs: builds the (explicit) right hand side of the equation system;
• Lhs: deals with all the implicit methods;
• Tmo: manages the main iterative time loop.

Finally, the top layer is the factory layer, which is responsible for the dynamic creation of all objects (in OO methods, "objects" means "instances of classes") and in particular includes the Fact module, which implements several object "factories" to build objects from user input data coming from the interface.

In elsA, the choice of programming languages was made so as to respect both the OO patterns of the design and CPU constraints. The first objective led to the mandatory choice of a true OO programming language, and the C++ language was chosen, because it is the most widely used OO language, available on all usual platforms. Besides C++ as the main language for implementing the OO design, it was decided to use Fortran for the two following reasons. First, the CPU tests we carried out at the beginning of the elsA project showed better CPU performance of Fortran in comparison with C. Second, some Fortran lines of legacy code were re-implemented in the elsA loops; however, due to a completely new design, it was in general not possible to keep the complete subroutines of the legacy code. These Fortran routines are very similar to private class methods: since they do not appear in the public class interface, they do not affect the OO design. A third programming language is used in elsA: the Python language, which is a freely available interpreted OO language and is used for programming the elsA interface. Today, elsA includes more than one million lines (600,000 in C++, 420,000 in Fortran and 55,000 in Python).

elsA interoperability

The interoperability of a program or of a piece of software is its ability to interact with another program or piece of software. The elsA software has evolved towards a software suite containing the elsA kernel (mainly the CFD solver) and some additional tools (dealing in particular with pre-processing and post-processing). All the elements of the suite should be seen as boxes in a larger simulation process. Our target is the integration of these boxes into all the platforms of our customers. Each aerospace industrial company already has a large set of programs, and most of the time they have in-house software systems to manage them in their own process. So our goal is to be integrated, rather than to integrate. The strategy for interoperability, and the so-called "component approach" software architecture we have designed for it, are tightly bound to this goal of all-platforms integration.

The component approach

The elsA component approach is based on the interface description. We use standard interfaces for the CFD simulation components. The CGNS standard (see Box 4) is preferred for the data model, and the Python programming language (see the box describing this language in [10]) defines the protocol, in other words the order in which the component functions should be used. Each box in the simulation workflow, including the elsA kernel, has to provide a Python interface supporting the CGNS data model: the CGNS/Python interface. When you integrate such a CGNS/Python component into a platform, the interface you use is not proprietary; it is based on Open System standards. A proprietary interface is a set of services which has a single implementation - the customer has no choice about the service provider - whilst an Open System interface should have more than one implementation. You can write Python scripts and just pass the CGNS data trees from one software tool to another in the memory of the process, or by means of a network if you have to, as long as they use the same public interface, no matter which implementation they use. This component interoperability is the basis of the Onera component approach.

CGNS/Python and elsAxdt

The CGNS/Python mapping of the CGNS/SIDS is used by an increasing number of Onera software components. Each tool has its own interface to CGNS/Python; the interface of the elsA kernel is elsAxdt. It parses CGNS/Python data trees and reads/writes elsA data in the tree for the next step of the workflow. We call this tree a "shuttle tree", just like a shuttle bus going from code to code and picking up or dropping data. During this process, we avoid the use of files, because file access often leads to problems on large computation clusters, and because most intermediate trees are not archived and are trashed as soon as they are used. These transient trees can be created, used and killed within the memory.

A CGNS/Python computation with elsA uses quite simple commands, the read of the CGNS tree and the write of the resulting CGNS/Python tree:

import elsAxdt
parser=elsAxdt.XdtCGNS("001Disk.cgns")
parser.compute()
parser.save("Result.cgns")

In this simple example, all the required data is obviously embedded in the 001Disk.cgns file: CGNS has been designed to be able to handle all the CFD data and even the specific solver parameters. Once the computation is performed, the output is obtained in Result.cgns, another CGNS tree.

Whereas in this first example we use files, the next example shows how a full memory transfer of CGNS/Python trees can be achieved:

import elsAxdt
from CGNS.MAP import *
(input_tree,links)=load("001Disk.cgns")
parser=elsAxdt.XdtPython(input_tree)
output_tree=parser.compute()
save("Result.cgns",output_tree,links)

The code lines are almost the same, but the architecture is quite different, because the elsA interface only uses the CGNS/Python tree. The actual load and save on the disk is not performed by elsA; it is performed by an Open Source module (CGNS.MAP) handling the CGNS/Python tree and HDF5 files.

We dissociate the elsA computation from the means used for the actual data exchange. You can use the CGNS/Python tree in memory for exchanges with a network layer or a dedicated proprietary database. For example, if you run a large parallel computation with local generation of grids, the elsA suite includes Python modules (Post, Converter, Generator) for this grid generation as CGNS/Python trees in memory. The solver runs on these grids, and produces one result per process. The merging of these trees is again performed in memory, and the result is sent by means of a network to a remote post-processing workstation. Such a workflow limits the disk accesses on the high-performance computer and reduces the exchanges on the network to the data required for this particular application. This can be extended to code coupling workflows [10].

Concluding remarks on interoperability

The elsAxdt interface is now used by some of our largest customers. These users are developing their own "integrated environments" with dedicated methods and tools. The common exchange backbone is based on CGNS. They can use the standard for file archiving as well as for component interoperability. The grid is generated as a CGNS tree by a commercial tool, then the proprietary process takes the tree and enriches it with elsA specific parameters, the tree is submitted to the solver, and the result is directly passed to a commercial visualization tool.

The next step is now to use the CGNS/Python interface in the solver itself. Such a re-design is not necessary or useful for every part of the solver. We have selected a limited set of interfaces where we can split the solver into separate and re-usable components. Each component would provide the CGNS/Python interface and thus would increase the interoperability with a finer granularity. The next solver generation would extend its flexibility up to its inner software components, and thus would be a future platform for better research and better integration into the customers' proprietary platforms.

Box 4 - CGNS

CGNS (the CFD General Notation System) [24] provides a data model and a portable, extensible standard for the exchange and archiving of CFD analysis data. The main target is data associated with computed solutions of the Navier-Stokes equations and their derivatives, but it can be applied to the field of computational physics in general.

CGNS started in the 1990s as a joint NASA, Boeing and McDonnell Douglas project. They developed the so-called SIDS (Standard Interface Data Structure) document, which specifies the data model and serves as the reference document. The first implementation was performed by ANSYS/ICEM teams on top of the ADF (Advanced Data Format) proprietary low level storage system.

Figure B4-01 – Graphical view of a part of a CGNS tree using the Open Source CGNS.NAV tool

In 1999, the CGNS Steering Committee (CGNS/SC) was formed with academic organizations, aerospace industrial companies and software vendors. Onera joined the CGNS/SC in 2001 and is an active member.

The standard is built around a data model specification (CGNS/SIDS), which gives the standard its name: Notation System. The main goal of the CGNS standard is to specify CFD data. The data can be used for exchange in a CFD workflow or as a storage conceptual model. Four CGNS implementations are available: ADF, HDF5 (Hierarchical Data File), XML (eXtensible Markup Language) and finally the Python mapping. Today the main implementations are CGNS/HDF5 and CGNS/Python; the first is archiving oriented, while the second is more workflow oriented. CGNS/Python is the basis of the interoperability system developed at Onera around elsA.

The standard is not used for internal data representation but only for the public view of the data, in a workflow exchange, for example from the CAD to the mesh tool, or from the solver to the visualizer. More complex workflows can be defined, with unsteady computations involving both CFD and CSM, or with optimization algorithms.

The data model has a tree structure, starting from the root node, the base, up to the smallest nodes that can be modeled, such as a single real value or a boundary condition application range (see figure B4-01).
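For illustration, in the CGNS/Python mapping each node is a plain four-item Python list - [name, value, children, SIDS type] - with values held as numpy arrays; the toy tree below (node names and values chosen for the example) sketches the structure displayed in figure B4-01:

import numpy

# Each CGNS/Python node is: [name, value, list of child nodes, SIDS type]
zone = ["Zone", numpy.array([[8, 7, 0]], dtype=numpy.int32), [], "Zone_t"]
base = ["Base", numpy.array([3, 3], dtype=numpy.int32), [zone], "CGNSBase_t"]
tree = ["CGNSTree", None, [base], "CGNSTree_t"]

# Such a "shuttle tree" can be passed in memory from tool to tool, or
# saved to an HDF5 file, e.g. with CGNS.MAP.save("case.cgns", tree, []).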

The standard is extensible, and the CGNS/SC has a process to add new structures when user needs are raised and agreed upon by the members. We often define new data structures in the framework of our projects. CGNS is a good candidate for data specification when several partners have to exchange data at run-time or to exchange computing files. CGNS can hold a complete simulation context, and this has an important effect: when users define their data model, they have to identify all of the required and exact data needed for the simulation, which avoids hidden behavior of codes or unwanted side-effects.

Modeling capabilities

The elsA multi-application CFD simulation platform deals with internal and external aerodynamics from the low subsonic to the high supersonic flow regime and relies on the solving of the compressible 3-D Navier-Stokes equations (see Box 2). The thermodynamic properties of the fluid may correspond either to the perfect gas assumption or to the equilibrium real gas assumption described by a Mollier diagram. elsA allows the simulation of the flow around moving bodies in several formulations, according to the use of absolute or relative velocities and the definition of the projection frame. This is useful when dealing, with the same software, with applications to turbomachinery flows, where the use of relative velocities is advantageous, and with applications to flows around propellers, where the use of absolute velocities is necessary. The bodies may be deformable, as well as the associated meshes. The gravity source term is optionally taken into account.

A large variety of turbulence models, from eddy viscosity to full Differential Reynolds Stress models, is implemented in elsA for the Reynolds averaged Navier-Stokes (RANS) equations (see [1] for further details). The range of turbulence models includes classical one- and two-transport-equation models, more advanced two-equation models, multi-scale four-equation models, and one-layer or two-layer Algebraic Reynolds Stress models. When thermal effects are important, specific versions of the models are available. Low Reynolds versions of the models are mostly used for an accurate description of the boundary layer profiles, in association with the use of a very fine mesh near the wall. However, wall laws approximating the behavior of the boundary layer near the wall are also available, which allow for the use of a coarser mesh near the walls and may be used to reduce the cost of the calculations, in particular for unsteady calculations.

Special attention has been paid to laminar-turbulent transition modeling [1], which may be a key point for obtaining accurate flow predictions. A reliable design of wings or turbines often requires an accurate modeling of the transition process. The transition prediction capability in the elsA RANS solver is based on the application of criteria that either were previously developed at Onera for use in boundary layer codes, or result from classical criteria from the literature. These criteria are used to describe Tollmien-Schlichting instabilities (including laminar separation bubble predictions), cross-flow instabilities, bypass transition for high external turbulence rates, attachment line contamination, and wall roughness. This transition capability is available for complex geometry configurations. Nevertheless, to improve the applicability for very complex geometries, a transition model based on transport equations is now also available.

In order to deal with flows exhibiting strong unsteadiness and large separated regions, and/or to provide input data for aeroacoustic simulations, the user can perform Detached Eddy Simulations (DES) and Large Eddy Simulations (LES) [1]. The variants of DES methods available in elsA (basic DES, Zonal-DES, Delayed DES) are associated with the Spalart-Allmaras model or with the Menter (k, ω) model. The LES approach can be used to compute the larger structures of the turbulent flows while the smaller structures are dissipated by the numerical model, either by the MILES approach, which relies on the properties of advanced upwind schemes and dissipates the unresolved turbulent structures, or by subgrid models such as the Smagorinsky, Wale and filtered structure function models available in elsA.

Mesh topology capabilities

CFD solvers may rely on several meshing paradigms, such as structured body-fitted grids, unstructured grids or structured Cartesian grids. Even when considering the same configuration, there is no universally accepted definitive choice, as can be seen by looking at the contributions to the Drag Prediction Workshop (http://aaac.larc.nasa.gov/tsab/cfdlarc/aiaa-dpw/Workshop4/presentations/DPW4_Presentations.htm). In that workshop, the contributions were roughly evenly divided, half of them on structured body-fitted grids and the other half on unstructured grids. Each of the mesh types has inherent advantages and disadvantages, which may depend on the type of configuration and even more on the flow region of a given configuration. For example, Cartesian grids are easy to generate, to adapt, and to extend to higher-order spatial accuracy, but they are not suitable for resolving boundary layers around complex geometries. Body-fitted structured grids work well for resolving boundary layers, but the grid generation process for complex geometries remains tedious and requires considerable user expertise. General unstructured grids are well-suited to complex geometries and are relatively easy to generate, but their spatial accuracy is often limited to second order, and the associated data structures tend to be less computationally efficient than their structured-grid counterparts. Thus, today, the tendency is to couple meshing paradigms in the same software [29] or in the same coupling infrastructure [28], and this is also the way chosen for elsA.

In elsA, the main focus has been put firstly on structured body-fitted grids, which allow for the use of very efficient numerical algorithms due to the natural (I, J, K) ordering of the hexahedral cells. Since it is generally impossible to define a unique structured body-fitted grid around complex geometries, the computational domain is divided into several adjacent or overlapping domains or blocks, in which simpler component grids can be generated more readily. Communication between component grids is achieved either by direct transfer or by interpolation across interfacing boundaries (patched grids), or by interpolation within overlapping grid regions (overset grids). In order to cope with more and more geometrically complex configurations, highly flexible advanced techniques of multi-block structured meshes are available in elsA, in addition to matching techniques for 1-to-1 abutting or 1-to-n abutting patched grids. These advanced matching techniques include quasi-conservative mismatched abutting patched grids (also called totally non-coincident matchings) and the Chimera technique for overlapping meshes [8].

Mismatched abutting patched grids are intensively used in industrial computations with elsA. The reason is that they simplify mesh generation for complex configurations, and reduce the global number of mesh points for a given configuration, by preventing the propagation of mesh refinements throughout the computational domain. They are also well adapted to deal with sliding meshes.

The Chimera technique, which enables a discretization of the flow equations in meshes composed of overset grids, may be applied to a wide range of configurations. The two main application domains of this method originally dealt with the treatment of separate bodies (such as different positions of a missile below a wing) or with configurations including bodies in relative motion (for instance, helicopter rotor and fuselage, or booster separation). During recent years, the Chimera method has also been increasingly used to simplify and improve the meshing of complex configurations, composed of the meshing of a basic geometry and of additional meshes adapted to joint bodies (for instance, a spoiler associated with a wing) or to geometrical details (for instance, technological effects in a turbomachinery row, such as clearances, leakage slots, grooves, etc.). Finally, Chimera may also be used to achieve mesh adaptation for the simulation of phenomena involving a large range of length scales.

The development in elsA of hybrid multi-block capabilities, allowing for the use of unstructured meshes in some blocks of a multi-block configuration [8], presently relies on a strong cooperative effort between Onera and Cerfacs. The objective is to benefit from the high flexibility of unstructured meshes in the flow regions where it becomes too difficult to build a structured grid. Tetrahedral, hexahedral and prismatic cells are considered in this elsA development, in order to be able to use hexahedral cells near the walls, which is very important in aerodynamics for an accurate description of the boundary layers. A first set of hybrid capabilities will be available very soon in the main version of the elsA software. The mismatched abutting technique is extended in elsA to patch structured and unstructured grids together. In the future, the plan is to also apply the Chimera technique to the matching between structured and unstructured grids.

Lastly, structured Cartesian grid capabilities are also currently being developed, so that they can be part of the elsA suite in the future. The Cartesian formulation enables high order spatial discretization and mesh adaptation, which in turn allows for better capturing of off-body flow phenomena such as shear layers and wakes [8]. Since some specific high order schemes have been developed in this Cartesian solver, the way forward is to couple this solver with the block-structured solver of elsA. Matchings between blocks are again done using the Chimera method. So, elsA will soon offer a quite complete multiple-gridding paradigm, providing the potential for optimizing the gridding strategy on a local basis for the particular problem at hand, in order to cope with the increasing complexity of CFD applications (see [8] for further details on structured/unstructured and Cartesian/curvilinear block matchings).

Numerics and boundary conditions capabilities

The flow equations are solved by a cell centered finite-volume method [8]. Space discretization schemes include a range of second order centered or upwind schemes. Centered schemes are stabilized by scalar or matrix artificial dissipation, including damping capabilities inside viscous layers, in order to preserve accuracy. Upwind schemes are based on numerical fluxes such as the van Leer, Roe, Coquel-Liou HUS, or AUSM fluxes, and are associated with classical slope limiters. Second and third order Residual Based Compact schemes are also available in elsA. The semi-discrete equations are integrated either by multistage Runge-Kutta schemes with implicit residual smoothing, or by backward Euler integration with implicit schemes solved by robust LU relaxation methods [21], which in general leads to a higher efficiency with elsA. An efficient multigrid technique can be selected in order to accelerate convergence. For time accurate computations, the implicit dual time stepping method or the Gear integration scheme is employed. Preconditioning is used for low speed flow simulations.

An extensive range of boundary conditions is available in elsA, from standard inlet, outlet or wall conditions, to more specific conditions for helicopter configurations (such as the so-called "Froude" far field boundary conditions in hover) or turbomachinery configurations (such as the radial equilibrium condition, or, as described in Box 5, the Reduced Blade Count method and the phase-lagged technique for the simulation of rotor/stator interactions). Actuator-disc models are available to economically model the effects of helicopter rotors or propellers, when complete detailed calculations are not worthwhile.

Box 5 - Reduced Blade Count method and Phase-Lagged technique, two different approaches for simulating the time-periodic flow in a turbomachinery stage configuration

To improve turbomachinery performance, 3D Navier-Stokes flow computations in blade rows are commonly used for turbine and compressor design. Approximate steady flow calculations through multi-stage machines have been usual in the design process for many years. In elsA, they are performed using a specific steady condition, the mixing plane condition, to connect two consecutive rows. This condition is based on azimuthal averages, which are computed at the interfaces and transferred from one row to the next. It gives quite a good prediction of the overall efficiency of a machine but of course does not give any information on the unsteady flow fluctuations. Unsteady computations are increasingly used for industrial purposes: in a complete staged machine, they are necessary when simulating unsteady non-periodic phenomena. They are still very expensive, as discussed in the High Performance Computing section of this paper. But under some conditions, techniques for reducing the computational domain can be used. They can be applied to time-periodic flows (in the frame of reference of each row), that is, to flows where the unsteadiness is only due to the relative motion of the rows. The first technique, called the "Reduced Blade Count" method [11], was introduced in the elsA software [23]. The computation is performed on the actual geometry with reduced blade counts. The interface between the blade rows accounts for the non-equal pitches on each side of the interface by an appropriate scaling. The second technique, known as the "Phase-Lagged" technique [9], or "Chorochronic" method, was implemented in 2003 in the elsA software [2]. A single blade passage is computed for each row. The flow solution is stored on the interface boundaries and on the azimuthal periodic boundaries to deal with the phase lag which exists between rows and between adjacent blade passages in a row.
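Schematically, and in our own notation, the mixing plane condition mentioned above exchanges only the pitchwise average of each flow quantity $q$ at radius $r$ across the inter-row interface:

$$\overline{q}(r) = \frac{1}{2\pi}\int_{0}^{2\pi} q(r,\theta)\, d\theta$$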

Let us denote by $N_1$ and $N_2$ the actual blade numbers of two consecutive rows, and by $\omega_1$ and $\omega_2$ their rotation speeds. The flow periods are equal to $T_1 = T_{rot}/N_2$ and $T_2 = T_{rot}/N_1$, respectively in the rotating frame of the first row and in the rotating frame of the second one, $T_{rot} = 2\pi/|\omega_2 - \omega_1|$ being the time taken by a blade passage to make a whole revolution.

The "Reduced Blade Count" technique consists in reducing the computational domain to $K_1$ and $K_2$ blade passages without changing the geometry. $K_1$ and $K_2$ are chosen such that the ratios $K_1/N_1$ and $K_2/N_2$ are about the same. We assume that the flow is identical on two consecutive groups of $K_i$ blade passages at each instant, which is exact when the ratios $K_i/N_i$ are equal. Between the lower and the upper boundaries, flow continuity is enforced, which is an approximation with respect to the actual flow, since the real phase lag between these boundaries is cancelled.

We can define $D_M$ as a mean value of the $K_i/N_i$ ratios. This quantity represents the geometric reduction which is applied to each row to obtain the computational domain. On the interface, the two blade groups are linked by an instantaneous continuity condition through a common azimuthal extension $e_m = 2\pi/D_M$, with a scaling given by $\lambda_i = (N_i/K_i)(1/D_M)$ for each row. The "Reduced Blade Count" technique induces a slight approximation, since the computational periods are $T'_1 = T_{rot}/(K_2 D_M)$ and $T'_2 = T_{rot}/(K_1 D_M)$, that is, $T'_i = \lambda_i T_i$. This technique can be applied to several consecutive rows if the actual blade numbers are cooperative enough to allow a geometric reduction.

In the “Phase-Lagged” approach, the computational domain is limited to a single blade passage for each row. As the flow is time-periodic in each blade row, a phase lag exists between two adjacent blade passages. This phase lag is the time taken by a blade of the next row to cover the pitch of the row, modulo the time period of the row. The “Phase-Lagged” technique consists of storing the flow values on the azimuthal periodic boundaries and on the interface in order to use them later to build the flow.

Let us consider two space-periodic points A and B of the upper and lower boundaries of the first row. B is ahead of A, and what happens at time $t$ in B will happen in A at $t + T_2$, and more generally at $t + T_2 + nT_1$, or has happened at $t + T_2 - mT_1$ ($m$ being an integer such that $T_2 - mT_1$ is negative). The flow condition stored in A at time $t + T_2 - mT_1$ can be used for the boundary treatment of B at time $t$.

The treatment of the interface relies on the same principle. At each time step, flow continuity is enforced at the interface between one cell facet of the first row and the suitable storage of the second row, taking into account the relative position of the rows and using the necessary spatial interpolation on the stored data.

The direct storage of the flow solution may lead to very large memory requirements. The data storage is lowered to an acceptable amount by Fourier analysis.
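As a sketch of this compression (a minimal illustration in our own notation, not the elsA implementation), one period of boundary data can be reduced to a few Fourier coefficients and then re-evaluated at any lagged time, such as $t + T_2 - mT_1$:

import numpy as np

def fourier_coeffs(samples: np.ndarray, n_harmonics: int) -> np.ndarray:
    """Truncated Fourier analysis of one period of (real) boundary data."""
    return np.fft.rfft(samples)[: n_harmonics + 1] / samples.size

def evaluate(coeffs: np.ndarray, period: float, t: float) -> float:
    """Rebuild the time-periodic signal at an arbitrary (lagged) time t."""
    k = np.arange(coeffs.size)
    weights = np.where(k == 0, 1.0, 2.0)  # mean counted once, harmonics twice
    return float(np.real(np.sum(weights * coeffs *
                                np.exp(2j * np.pi * k * t / period))))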

In the framework of the TATEF2 European project [7], unsteady flow simulations of the stator-rotor interaction in a transonic turbine stage have been performed using the "Phase-Lagged" approach. Figure B5-01 clearly shows the shock structures obtained for a high pressure ratio.

Figure B5-01 – Shock progression through the turbomachinery stage at $t/T_r$ = 0.0, 0.25, 0.5 and 0.75: schematic (left) and density gradient (right). $y/H \approx 25\%$.

The “Phase-Lagged” approach does not make any approximation on the number of actual blades and it accounts for the real time period in each reference frame. So it is more accurate than the “Reduced Blade Count” technique. Moreover, as the computational domain is limited to a single blade passage for each row, it is less expensive in terms of CPU time and computer memory. But this approach can only be applied to a single stage, whereas the “Reduced Blade Count” technique can be applied to multi-stage configurations, even when several rotors have different rotational speeds.

A generalization of the “Phase-Lagged” approach, called “multiple frequency Phase-Lagged method”, is under development in elsA [18]. This method allows for unsteady computations through several rows, whilst still limiting the computational domain to one single blade-to-blade passage in each row.

Multidisciplinary and optimization capabilities

elsA also includes the Ael module, offering a general framework for aeroelastic applications [10]. This module provides for the following simulations:
• harmonic forced motion simulations for a given structural mode;
• linearized Euler or Navier-Stokes simulations;
• static and dynamic fluid/structure coupling simulations in the time domain, with different levels of structural modeling ("reduced flexibility matrix" approach for static coupling, modal approach, full finite element structural model).

The Opt module, dealing with the calculation of sensitivities by linearized equation or by adjoint solver techniques, is useful for optimization and control [22]. The calculation of sensitivities consists in calculating the derivatives of objective functions, such as drag or lift, with respect to control parameters.

High Performance Computing: towards massively parallel computations

As said above, typical aeronautic configurations nowadays take into account fine geometrical details, resulting in huge problem sizes, as in the example of the simulation of the complete multistage compressor in Figure 4 (a simulation which does not use the approximate techniques described in Box 5 for reducing the computational domain). Moreover, although the type of physical modeling still remains mostly RANS, the major trend is to move towards URANS and even DES or LES in order to get more accurate results, leading to an estimated additional CPU cost of about two orders of magnitude.

Figure 4 – Unsteady flow simulation of a multistage compressor (134 million cells) using 512 to 4096 computing cores (SNECMA configuration) [17]

To handle such demands, High Performance Computing is now unavoidable, first in order to tackle simulations with a very large number of points (and thus requiring a huge amount of memory), and second to reduce the CPU wall clock time as much as possible. Onera, Cerfacs and CS have made very large efforts [15, 16] to improve the code's performance on state-of-the-art vector and x86-64 based computing platforms. In the CFD context, vector machines are now being rapidly supplanted by clusters of x86-64 based nodes, due to their high operating cost and relatively poor energy efficiency (see in Figure 5 the comparison between many-core and vector platforms on the simulation presented in Figure 4). elsA performance improvement efforts are now entirely dedicated to massively parallel computers.

Figure 5 – Comparison between many-core (scalar IBM Blue Gene/P, 1024 to 4096 computing cores) and vector (NEC SX8++, 8 to 32 computing cores) platforms for a single unsteady physical iteration on the multistage compressor configuration: a) normalized computational time; b) consumed energy (MJ) (from [17])

Parallel strategy for structured multi-block calculations with elsA

The Message Passing Interface (MPI) standard library is used to implement communications between processors. elsA uses a standard coarse-grained SPMD approach: each block is allocated to a processor. Several blocks can be allocated to the same processor.

Issue 2 - March 2011 - An Overview of the Multi-Purpose elsA Flow Solver 11 As said previously, meshing a configuration with structured grids algorithm loops over all of the blocks searching for the largest one in often leads to a complex multi-block topology, thus requiring specific terms of cells and allocates it to the computing core with the fewest treatments to exchange data between adjacent blocks. In the elsA cells until all of the blocks are allocated. Note that the number of ghost solver, each block is surrounded by two layers of ghost cells storing cells increases with the number of blocks split, leading to an increasing values coming from the adjacent block. If the two blocks are allocated problem size. Topology modification also implies carefully handling to two different computing cores, then point-to-point message passing block-based implicit algorithms since convergence may rapidly be communication occurs. Otherwise, ghost cells are directly filled by a degraded. Therefore, communications occur at each relaxation step memory-to-memory copy. inside the implicit LU stage.

Point-to-point communications are implemented either with blocking elsA has been ported to most high-performance computing platforms, (MPI_Sendrecv_replace) or non-blocking (MPI_Irecv/ achieving good CPU efficiency on both scalar multi-core computers MPI_send) point-to-point messages. Only blocking point-to-point and vector computers. As an example, Figure 7a shows typical communications require the scheduling of messages as shown by Fig. speedup results on a civil aircraft configuration including 27,8 106 6. The scheduling of communications comes from a heuristic coloring mesh points and 1037 blocks. The numerical options include multigrid algorithm adapted from graph theory. algorithm (3 levels) and the Spalart-Allmaras one-equation turbulence model. For large number of processors, the configuration has been Scheduler split, ending up with 1774 blocks. The computer is the Cerfacs' No scheduler BlueGene/L computer. Another concrete example (Figure 7b) is the 1 High Lift Prediction Workshop configuration, where a good scalability is obtained up to 256 computing cores, on a SGI cluster built with Intel Nehalem processors (Onera's Stelvio computer), on a grid of 160 106 1: blocking mesh points, in 1235 blocks. 0.8 2: non-blocking 16 Ideal Speedup Measured Speedup Normalized efficiency 0.6 12

The non-coincident inter-block connectivity is implemented with blocking collective communications (MPI_Allgatherv). If only one computing core handles the whole inter-block connection, data exchanges come down to memory-to-memory copies without MPI messages.
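The collective pattern can be sketched as follows (hypothetical buffers; the sub-communicator interfaceComm is assumed to group the ranks sharing the interface): every rank contributes its donor values, receives the whole interface, and then interpolates its own ghost cells locally.

#include <mpi.h>
#include <vector>

// Gather the donor values of a non-coincident interface on every rank of
// 'interfaceComm'; each rank then interpolates its ghost cells from
// 'allDonors'.
void exchangeNonCoincident(const std::vector<double>& localDonors,
                           std::vector<double>& allDonors,
                           MPI_Comm interfaceComm) {
    int nranks = 0;
    MPI_Comm_size(interfaceComm, &nranks);

    // First share how many donor values each rank contributes.
    int mycount = (int)localDonors.size();
    std::vector<int> counts(nranks), displs(nranks, 0);
    MPI_Allgather(&mycount, 1, MPI_INT, counts.data(), 1, MPI_INT,
                  interfaceComm);
    for (int r = 1; r < nranks; ++r)
        displs[r] = displs[r - 1] + counts[r - 1];

    // Blocking collective: afterwards every rank holds the full interface.
    allDonors.resize(displs[nranks - 1] + counts[nranks - 1]);
    MPI_Allgatherv(localDonors.data(), mycount, MPI_DOUBLE,
                   allDonors.data(), counts.data(), displs.data(),
                   MPI_DOUBLE, interfaceComm);
}

In the single-core case mentioned above, such a call would be bypassed entirely in favor of a direct memory-to-memory copy.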

Performance discussion

The well known Amdahl's law states that load balancing is crucial for obtaining good efficiency on parallel computers when the number of processors increases. Due to topological constraints, the number of cells in blocks can be very different, ranging from 10³ to 100³. Most of the time, some blocks must be split to achieve a good load balancing. It is not clear whether an optimal partition can be obtained in a reasonable amount of time. Therefore, elsA integrates a heuristic block splitting algorithm in the load balancing process. Based on the relative error between the number of cells allocated to the computing core and the ideal number of cells, it checks whether the largest block to be allocated needs to be split. The partitioning algorithm handles many constraints, such as the multigrid constraint. The so-called "greedy" load balancing algorithm loops over all of the blocks, searching for the largest one in terms of cells, and allocates it to the computing core with the fewest cells, until all of the blocks are allocated. Note that the number of ghost cells increases with the number of split blocks, leading to an increasing problem size. Topology modification also implies carefully handling block-based implicit algorithms, since convergence may rapidly be degraded. Therefore, communications occur at each relaxation step inside the implicit LU stage.

elsA has been ported to most high-performance computing platforms, achieving good CPU efficiency on both scalar multi-core computers and vector computers. As an example, Figure 7a shows typical speedup results on a civil aircraft configuration including 27.8 million mesh points and 1037 blocks. The numerical options include a multigrid algorithm (3 levels) and the Spalart-Allmaras one-equation turbulence model. For large numbers of processors, the configuration has been split, ending up with 1774 blocks. The computer is the Cerfacs BlueGene/L computer. Another concrete example (Figure 7b) is the High Lift Prediction Workshop configuration, where good scalability is obtained up to 256 computing cores, on an SGI cluster built with Intel Nehalem processors (Onera's Stelvio computer), on a grid of 160 million mesh points, in 1235 blocks.

Figure 7 – Normalized speedup obtained with elsA on two configurations: a) generic civil aircraft configuration (up to 1024 processors); b) configuration of the High Lift Prediction Workshop 2010 (up to 256 processors)
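A minimal sketch of the greedy allocation described above, with a naive halving rule standing in for the actual splitting heuristic (which also honors the multigrid and topology constraints mentioned earlier), could read as follows; all names are hypothetical.

#include <algorithm>
#include <cstddef>
#include <vector>

// Sketch of a "greedy" load balancing with block splitting. Returns, for
// each computing core, the sizes of the (possibly split) blocks it receives.
// 'tol' is the tolerated relative error with respect to the ideal load.
std::vector<std::vector<long>> greedyBalance(std::vector<long> pool,
                                             int ncores, double tol) {
    long total = 0;
    for (long s : pool) total += s;
    const double ideal = static_cast<double>(total) / ncores; // cells/core

    std::vector<std::vector<long>> plan(ncores);
    std::vector<long> load(ncores, 0);

    while (!pool.empty()) {
        // Search for the largest remaining block (in terms of cells).
        auto it = std::max_element(pool.begin(), pool.end());
        long s = *it;
        pool.erase(it);

        // If the block is too large relative to the ideal load per core,
        // split it in two and return one half to the pool (naive halving).
        while (s > 1 && s > (1.0 + tol) * ideal) {
            long half = s / 2;
            pool.push_back(half);
            s -= half;
        }

        // Allocate it to the computing core with the fewest cells so far.
        std::size_t core = std::min_element(load.begin(), load.end())
                           - load.begin();
        plan[core].push_back(s);
        load[core] += s;
    }
    return plan;
}

As noted above, every split adds ghost-cell layers, so the gain in load balance has to be weighed against the growth of the overall problem size.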

Future challenges in HPC

Since the performance improvement through increases in clock frequency has reached a limit, mainly due to the power dissipation problem, new parallel paradigms have emerged, namely the increase of the number of cores on a chip and the use of specialized accelerators (FPGA, GPU, Cell processor). To handle such paradigms, we have gained experience in two approaches: OpenMP thread programming and GPU using CUDA. In our previous studies on thread parallelism [20], OpenMP appeared to be less efficient than MPI on nodes with small numbers of cores (<16). We plan to update these studies in the context of upcoming many-core chips. In addition, we are currently building a prototype version dedicated to GPU; here the main challenge is the limited data transfer bandwidth between the CPU host and the GPU.

As memory bandwidth will probably remain the limiting factor on performance in the near future, major efforts are planned for both the fine-grain parallelism (data-parallelism, illustrated in the sketch at the end of this section) and the optimized use of cache memory. This will be mandatory as the trend of increasing the number of cores implies that many cores will compete for hardware resources.

To summarize on parallel performance, elsA is a portable code reasonably well adapted to current generations of HPC platforms, and continuous work is undertaken to strike a balance between good efficiency and maintainability. elsA is routinely used on hundreds of processors in industry; the biggest computation was run with 8192 cores on a 1.7 billion point grid [12].

Note also that not only the solver, but also the pre- and post-processing steps should be fully and efficiently addressed for massively parallel configurations. Moreover, the efficiency of the whole simulation in the framework of a multiple-gridding paradigm will be a big challenge for the next few years.
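As an illustration of the fine-grain, data-parallel level referred to above, a hypothetical hybrid kernel would thread the loop over the cells of a block with OpenMP, while MPI keeps handling the coarse-grained block level; this is a sketch of the approach under study, not the current elsA implementation.

#include <vector>

// Fine-grain, data-parallel update of one block: the cells are shared
// among the threads of a node, on top of the coarse-grained MPI layer.
void updateBlock(std::vector<double>& w, const std::vector<double>& residual,
                 double coef) {
    const long n = static_cast<long>(w.size());
    #pragma omp parallel for schedule(static)
    for (long i = 0; i < n; ++i) {
        // Independent cell updates: the threads share the block's arrays
        // and compete for memory bandwidth, the expected bottleneck.
        w[i] += coef * residual[i];
    }
}

Since all the cores of a chip share the memory interface, the throughput of such a kernel is bounded by memory bandwidth rather than by the core count, which is consistent with the limits discussed above.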

Acknowledgements

The authors would like to thank all colleagues of several Onera Departments (CFD and Aeroacoustics, Applied Aerodynamics, Aerodynamics and Energetics Modeling, Aeroelasticity and Structural Dynamics), of Cerfacs and of development partners for their sustained effort to make the elsA software a powerful and reliable tool for research and for industry.

We would particularly like to acknowledge the contributions of the following former and present members of the elsA software team in the Onera CFD and Aeroacoustics Department: A. Gazaix-Jollès, M. Lazareff, J. Mayeur, B. Michel, P. Raud, A.-M. Vuillot.

We also thank our colleagues who have provided us with the elsA results presented in the paper: F. Blanc (Cerfacs/Airbus) for the transport aircraft configuration, N. Gourdain (Onera Applied Aerodynamics Department, then Cerfacs) for the rotating stall turbomachinery simulation and for the HPC turbomachinery calculation, G. Delattre (Onera Applied Aerodynamics Department) for the CROR configuration, and C. Benoit, G. Jeanfaivre and X. Juvigny (Onera CFD and Aeroacoustics Department) for the helicopter configuration.

References

[1] B. AUPOIX, D. ARNAL, H. BÉZARD, B. CHAOUAT, F. CHEDEVERGNE, S. DECK, V. GLEIZE, P. GRENARD and E. LAROCHE - Transition and Turbulence Modeling. Aerospace Lab, Issue 2, 2011.
[2] G. BILLONNET, S. PLOT and G. LEROY - Implementation of the elsA Software for the Industrial Needs of Flow Computations in Centrifugal Compressors. 41e colloque d'Aérodynamique Appliquée, Lyon, 2006.
[3] R.H. BUSH, G.D. POWER and C.E. TOWNE - WIND: The Production Flow Solver of the NPARC Alliance. AIAA Paper 98-0935, 36th AIAA Aerospace Science Meeting and Exhibit, Reno, 1998.
[4] L. CAMBIER and M. GAZAIX - elsA: an Efficient Object-Oriented Solution to CFD Complexity. AIAA Paper 2002-0108, 40th AIAA Aerospace Science Meeting and Exhibit, Reno, 2002.
[5] L. CAMBIER and N. KROLL - MIRACLE - A Joint DLR/Onera Effort on Harmonization and Development of Industrial and Research Aerodynamic Computational Environment. Aerospace Science and Technology, vol. 12, 2008.
[6] L. CAMBIER and J.-P. VEUILLOT - Status of the elsA CFD Software for Flow Simulation and Multidisciplinary Applications. AIAA Paper 2008-664, 46th AIAA Aerospace Science Meeting, Reno, 2008.
[7] L. CASTILLON, G. PANIAGUA, T. YASA, A. DE LA LOMA and T. COTON - Unsteady Strong Shock Interactions in a Transonic Turbine: Experimental and Numerical Analysis. ISABE Paper 2007-1218, Beijing, China, 2007.
[8] B. COURBET, C. BENOIT, V. COUAILLIER, F. HAIDER, M.-C. LE PAPE and S. PÉRON - Space Discretization Methods. Aerospace Lab, Issue 2, 2011.
[9] J.I. ERDOS and E. ALZNER - Computation of Unsteady Transonic Flows Through Rotating and Stationary Cascades. NASA CR-2900, 1977.
[10] M. ERRERA, A. DUGEAI, P. GIRODROUX-LAVIGNE, J.-D. GARAUD, M. POINOT, S. CERQUEIRA and G. CHAINERAY - Multi-Physics Coupling Approaches for Aerospace Numerical Simulations. Aerospace Lab, Issue 2, 2011.
[11] A. FOURMAUX - Assessment of a Low Storage Technique for Multi-Stage Turbomachinery Navier-Stokes Computations. ASME Winter Annual Meeting, Chicago, 1994.
[12] M. GAZAIX and S. CHAMPAGNEUX - Recent Results with elsA on Multi-Cores. Symposium on CFD on Future Architectures, DLR Braunschweig, 2009.
[13] M. GAZAIX, A. JOLLÈS and M. LAZAREFF - The elsA Object-Oriented Tool for Industrial Applications. 23rd ICAS Meeting, Toronto, 2002.
[14] T. GERHOLD, V. HANNEMANN and D. SCHWAMBORN - On the Validation of the DLR-TAU Code. In W. NITSCHE, H.J. HEINEMANN and R. HILBIG (Eds.), New Results in Numerical and Experimental Fluid Mechanics, NNFM 72, Vieweg, 1999.
[15] N. GOURDAIN, L.Y.M. GICQUEL, M. MONTAGNAC, O. VERMOREL, M. GAZAIX, G. STAFFELBACH, M. GARCIA, J.-F. BOUSSUGE and T. POINSOT - High Performance Parallel Computing of Flows in Complex Geometries - Part 1: Methods. Computational Science and Discovery, vol. 2, 2009.

[16] N. GOURDAIN, L.Y.M. GICQUEL, G. STAFFELBACH, O. VERMOREL, F. DUCHAINE, J.-F. BOUSSUGE and T. POINSOT - High Performance Parallel Computing of Flows in Complex Geometries - Part 2: Applications. Computational Science and Discovery, vol. 2, 2009.
[17] N. GOURDAIN, M. MONTAGNAC, F. WLASSOW and M. GAZAIX - High-Performance Computing to Simulate Large-Scale Industrial Flows in Multistage Compressors. International Journal of High Performance Computing Applications, vol. 24, 2010.
[18] L. HE and H.D. LI - Single-Passage Analysis of Unsteady Flows Around Vibrating Blades of a Transonic Fan Under Inlet Distortion. Journal of Turbomachinery, 2002.
[19] W.L. KLEB, E.J. NIELSEN, P.A. GNOFFO, M.A. PARK and W.A. WOOD - Collaborative Software Development in Support of Fast Adaptive AeroSpace Tools (FAAST). AIAA Paper 2003-3978, 33rd AIAA Fluid Dynamics Conference, Orlando, 2003.
[20] F. LOERCHER - Optimisation d'un code numérique pour des machines scalaires et parallélisation avec OpenMP et MPI [Optimization of a numerical code for scalar machines and parallelization with OpenMP and MPI]. Technical Report TR/CFD/03/104, CERFACS, 2003.
[21] C. MARMIGNON, V. COUAILLIER and B. COURBET - Solution Strategies for Integration of Semi-Discretized Flow Equations in elsA and CEDRE. Aerospace Lab, Issue 2, 2011.
[22] J. PETER, G. CARRIER, D. BAILLY, P. KLOTZ, M. MARCELET and F. RENAC - Local and Global Search Methods for Design in Aeronautics. Aerospace Lab, Issue 2, 2011.
[23] S. PLOT, G. BILLONNET and L. CASTILLON - Turbomachinery Flow Simulations Using elsA Software: Steady Validations and First Abilities for Unsteady Computations. 38th AIAA/ASME/SAE/ASEE Joint Propulsion Conference & Exhibit, Indianapolis, 2002.
[24] D. POIRIER, S.R. ALLMARAS, D.R. McCARTHY, M.F. SMITH and F.Y. ENOMOTO - The CGNS System. AIAA Paper 1998-3007, 29th AIAA Fluid Dynamics Conference, Albuquerque, 1998.
[25] J. RENEAUX, P. BEAUMIER and P. GIRODROUX-LAVIGNE - Advanced Aerodynamic Applications with the elsA Software. Aerospace Lab, Issue 2, 2011.
[26] C.C. ROSSOW and L. CAMBIER - European Numerical Aerodynamics Simulation Systems. In E.H. HIRSCHEL and E. KRAUSE (Eds.), 100 Volumes of 'Notes on Numerical Fluid Mechanics', NNFM 100, Springer-Verlag, 2009.
[27] J.B. VOS, A. RIZZI, D. DARRACQ and E.H. HIRSCHEL - Navier-Stokes Solvers in European Aircraft Design. Progress in Aerospace Sciences, vol. 38, 2002.
[28] A.M. WISSINK, J. SITARAMAN, V. SANKARAN, D.J. MAVRIPLIS and T.H. PULLIAM - A Multi-Code Python-Based Infrastructure for Overset CFD with Adaptive Cartesian Grids. AIAA Paper 2008-927, 46th AIAA Aerospace Science Meeting, Reno, 2008.
[29] H. YANG, D. NUERNBERGER and H.-P. KERSKEN - Towards Excellence in Turbomachinery CFD: A Hybrid Structured-Unstructured RANS Solver. Journal of Turbomachinery, vol. 128, 2006.

Acronyms

ADF (Advanced Data Format)
AUSM (Advection Upstream Splitting Method)
CAD (Computer-Aided Design)
CFD (Computational Fluid Dynamics)
CGNS (CFD General Notation System)
CROR (Counter Rotating Open Rotor)
CSM (Computational Structural Mechanics)
CUDA (Compute Unified Device Architecture)
DES (Detached Eddy Simulation)
elsA (ensemble logiciel pour la simulation en Aérodynamique)
FPGA (Field-Programmable Gate Array)
GPU (Graphics Processing Unit)
HDF (Hierarchical Data Format)
HPC (High Performance Computing)
HUS (Hybrid Upwind Splitting)
LES (Large Eddy Simulation)
LU (Lower Upper)
MPI (Message Passing Interface)
OO (Object-Oriented)
RANS (Reynolds Averaged Navier-Stokes)
SIDS (Standard Interface Data Structure)
SPMD (Single Program, Multiple Data)
TATEF2 (Turbine Aero-Thermal External Flows 2)
UML (Unified Modeling Language)
URANS (Unsteady Reynolds Averaged Navier-Stokes)
XML (eXtensible Markup Language)

AUTHORS

Laurent Cambier graduated in 1980 from "École Centrale de Paris" and is presently Assistant Director of the CFD and Aeroacoustics department of Onera and Head of the elsA program, coordinating research, software and validation activities.

Michel Gazaix graduated in 1979 from "École Normale Supérieure de Saint-Cloud" and has been a Research Scientist at Onera since 1986. In the CFD and Aeroacoustics department, his research topics include High Performance Computing, real gas modeling in compressible flows, and software engineering applied to large scientific codes.

Sébastien Heib, PhD in Numerical Analysis from Paris 6 University, joined Onera in 2001. He is presently the head of the elsA software project in the CFD and Aeroacoustics department.

Sylvie Plot graduated from "École Nationale Supérieure de l'Aéronautique et de l'Espace" in 1990. Since that time, she has been working at Onera, mainly on the development of helicopter and turbomachinery CFD capabilities. She is currently head of the "Software and Advanced Simulations for Aerodynamics" research unit of the CFD and Aeroacoustics Department.

Marc Poinot, who received an MSc in Computer Science from Paris XI Orsay University, is a software engineer in the CFD and Aeroacoustics department of Onera. He is in charge of interoperability topics for code coupling and numerical simulation platforms. He is a member of the CGNS steering committee and he initiated and actively participated in the migration of CGNS to HDF5.

Jean-Pierre Veuillot graduated in 1969 from "École Supérieure de l'Aéronautique" and received a PhD from Paris 6 University in 1975. He was Assistant Director of the CFD and Aeroacoustics department when he retired from Onera in 2010, after a whole career at Onera devoted to numerical methods and software development in CFD.

Jean-François Boussuge graduated from the Von Karman Institute and worked as a numerical simulation engineer in the automotive field until 2001, before becoming a research engineer at CERFACS, where he was in charge of code development on various CFD methods and algorithms until 2005. Today, he is the leader of the external aerodynamics group at CERFACS.

Marc Montagnac is a research engineer in the external aerodynamics group at CERFACS. He graduated from the "Institut National des Sciences Appliquées" with an engineering diploma in 1994 and received a PhD in Computer Science from Paris 6 University in 1999. His primary research and professional interests are in the areas of software engineering, High Performance Computing and numerical methods.
