Heat Transfer Prediction for Methane in Regenerative Cooling Channels with Neural Networks

G. Waxenegger-Wilfing ∗, K. Dresia †, J.C. Deeken ‡ and M. Oschwald §
DLR Institute of Space Propulsion, Hardthausen, Germany

Methane is considered a good choice as a propellant for future reusable launch systems. However, heat transfer prediction for supercritical methane flowing in the cooling channels of a regeneratively cooled combustion chamber is challenging. Because accurate heat transfer predictions are essential for designing reliable and efficient cooling systems, heat transfer modeling is a fundamental issue to address. Advanced computational fluid dynamics (CFD) calculations achieve sufficient accuracy, but the associated computational cost prevents efficient integration into optimization loops. Surrogate models based on artificial neural networks (ANNs) offer a great speed advantage. It is shown that an ANN, trained on data extracted from samples of CFD simulations, is able to predict the maximum wall temperature along straight rocket engine cooling channels using methane with convincing precision. The combination of the ANN model with simple relations for pressure drop and enthalpy rise results in a complete reduced order model, which can be used for numerically efficient design space exploration and optimization.

arXiv:1907.11281v1 [cs.LG] 24 Jul 2019

Nomenclature

A = channel area [mm2]

b = channel width [mm]

d = wall thickness [mm]

Dh = hydraulic diameter [mm]

f = friction factor [−]

G = mass ﬂow density [kg s−1 m−2]

h = channel height [mm]

∗ Research scientist, rocket engine department.
† PhD student, rocket engine department.
‡ Group leader, rocket engine department.
§ Department head, rocket engine department.

h = specific enthalpy [kJ kg−1]

l = channel length [mm]

ṁ = mass flow rate [kg s−1]

p = pressure [Pa]

ρ = density [kg m−3]

Q̇ = heat flow rate [W]

q̇ = heat flux [W m−2]

r = wall roughness [µm]

Re = Reynolds number [−]

T = temperature [K]

v = ﬂow velocity [m s−1]

y+ = dimensionless wall distance [−]

z = ﬂow length [mm]

Subscripts

b = bulk

in = inlet

out = outlet

stat = static

tot = total

w = wall

Acronyms

ANN = Artiﬁcial Neural Network

AR = Aspect Ratio

CFD = Computational Fluid Dynamics

CH4 = Methane

MAE = Mean Absolute Error

ML = Machine Learning

ReLU = Rectiﬁed Linear Unit

RMS = Root Mean Square

I. Introduction

Although most liquid rocket engines that have ﬂown until now used oxygen/hydrogen, oxygen/kerosene or a

hypergolic propellant combination like nitrogen tetroxide/monomethylhydrazine [1, 2], several countries have started to

develop engines that use methane as fuel and oxygen as oxidizer in recent years. Oxygen/hydrogen offers the highest

speciﬁc impulse, but the low density of hydrogen leads to large rocket stages. In addition, the low boiling temperature

of hydrogen at 20 K makes the handling very diﬃcult and increases operating costs. Kerosene is much denser than

hydrogen and easier to handle. Disadvantages are a lower speciﬁc impulse and that kerosene may coke and form

deposits, which is problematic in terms of engine reuse. The main drawback of nitrogen tetroxide/monomethylhydrazine

is its extreme toxicity. The propellant combination oxygen/methane has many favorable characteristics, e.g. methane

is six times as dense as hydrogen, is easier to handle, has preferable coking temperature limits [3] and low toxicity.

Furthermore, oxygen/methane oﬀers a slightly higher speciﬁc impulse than oxygen/kerosene [4].

Despite the mentioned advantages, the prediction of heat transfer for methane ﬂowing in the cooling channels of a

regeneratively cooled combustion chamber has proven challenging [5], but is needed for an eﬃcient cooling system

design. Regenerative cooling performance is especially important for engines that are reusable or use an expander

(bleed) power cycle, where the energy absorbed in the cooling channels drives the turbopumps [6, 7]. The main

difficulty in predicting the heat transfer of methane is that it usually enters the cooling channels at supercritical

pressure but subcritical temperature. It is then heated up in the cooling channels and in most cases crosses the Widom-line

[8] close to the critical point. Strong changes in ﬂuid properties at the Widom-line introduce various physical phenomena,

e.g. heat transfer deterioration [9, 10], which inﬂuence the heat transfer. In contrast to that, hydrogen usually enters the

cooling system already in a gas-like state with pressures and temperatures far above the critical values [11].

Several methods exist to study the regenerative cooling of liquid rocket engines. A simple approach is to use

semi-empirical one-dimensional correlations to estimate the local heat transfer coeﬃcient [12, 13]. By using an energy

balance for each combustion chamber wall section, the wall temperatures can be estimated. Especially the maximum wall temperature, which occurs at the hot gas side, is a critical parameter, because it determines the fatigue life of the

chamber [14]. The advantage of such simple relations is the negligible computation time. However, one-dimensional

relations are not able to capture all relevant eﬀects that occur in asymmetrically heated channels like thermal stratiﬁcation

[15] or the inﬂuence of turbulence and wall roughness. Correction factors and quasi-two-dimensional models have been

developed [16], but only full three-dimensional CFD calculations achieve convincing accuracy. Many papers have been

published on CFD simulations for supercritical methane ﬂowing in rocket engine cooling channels [17–19] and CFD

results were compared with experimental data [20, 21]. The main disadvantage of CFD simulations is that they are not

suitable for design optimization, design space exploration, and sensitivity analysis due to their large calculation eﬀort.

By constructing surrogate models using samples of the computationally expensive calculation, one can alleviate this burden. However, it is crucial that the surrogate model mimics the behavior of the simulation model as closely as possible and generalizes well to unsampled locations, while being computationally cheap to evaluate. ANNs are known to be universal function approximators [22] and have been successfully applied as surrogate models in a number of domains [23, 24]. These models have also been applied to heat transfer prediction of supercritical fluids [25–27]. The possibility of using ANNs with multiple hidden layers makes it possible to generate surrogate models even for high-dimensional problems, given a suitable number of samples. In this paper, for the first time, an ANN is trained with data extracted from samples of CFD simulations for heat transfer prediction of supercritical methane.

The rest of the paper is organized as follows: Section II describes the basics of machine learning (ML) and the theory of ANNs. A procedure to generate suitable training data by CFD calculations is presented in Section III. Section IV discusses the proposed ANN and reports the results. Section V shows how the ANN can be used as a building block of a complete reduced order model for cooling channel flows, and Section VI provides concluding remarks. Much of the material presented in this paper can also be found in the master's thesis of one co-author, supervised by the author [28].

II. Artiﬁcial Neural Networks

ANNs are models that belong to the field of ML. To understand ANNs well, a basic understanding of the principles of ML is needed. The following section briefly elaborates on the basic theory. A comprehensive presentation can be found in the book of Goodfellow et al. [29].

A. Machine Learning Basics

The field of ML studies algorithms that use datasets to change parts of a mathematical model in order to solve a certain task, instead of using fixed pre-defined rules. The mathematical model is often a function that maps input data to output data, and the algorithm has to learn the adjustable parameters of this function in such a way that the mapping has the desired properties. In other words, ML is primarily concerned with the problem of finding and adjusting functions that usually have a large number of parameters. ML algorithms can be divided into supervised and unsupervised. In supervised learning, the training dataset contains both the inputs and the desired outputs, and the mathematical model can, among other things, be used for classification or regression. In a classification task, the model is asked to identify to which of $k$ categories a specific input belongs. Assuming that each example of the input data is represented as a feature vector $\vec{x} \in \mathbb{R}^n$, the learning algorithm is asked to produce a suitable function $f : \mathbb{R}^n \to \{1, \dots, k\}$ with a discrete target output. A well-known example of a classification task is object recognition in images. In a regression task, e.g. with a single explanatory variable, the goal is to predict a numerical value given

some input. To solve this task, the learning algorithm is asked to output a function $f : \mathbb{R}^n \to \mathbb{R}$. Unsupervised learning

algorithms receive datasets without target outputs and learn useful properties of the structure of these datasets.

The central challenge in ML is that the model must perform well on new, previously unseen input data. The

capability to perform well on those inputs is called generalization. Generalization is also central to understanding the

relationship between mathematical optimization and ML. While optimization algorithms can be used to minimize some

error measure on the training set, ML tries to reduce the generalization error, also called the test error. During training,

one must prevent two central issues. Underﬁtting occurs when the model is not able to obtain a suﬃciently low error on

the training data. Overﬁtting occurs when the gap between the training error and test error is too large; thus, the model

is not able to generalize. The ability of a model to fit a wide variety of functions is called the model’s capacity. Models with low capacity may struggle to fit the training data. Models with high capacity can solve complex tasks, but when their capacity is higher than needed, they may overfit by memorizing properties of the training data that do not carry over to the previously unseen test data. ML achieves good results when the capacity of the model is appropriate

for the true complexity of the relevant task and the amount of training data. However, for practical applications, it

is nearly impossible to guess a model with appropriate capacity in advance. Furthermore, models with higher capacity in

combination with proper methods to prevent overﬁtting often work better than less complex models. Modiﬁcations of a

learning algorithm that are intended to reduce its generalization error, possibly at the cost of an increased training error, are known

as regularization. Instead of reducing the capacity of the model, one can, for example, change the learning algorithm to

express the preference of one function over another.

Most ML models and algorithms have hyperparameters that are not adapted by the learning algorithm, but can be

used to control the outcome, for example, by changing the capacity of a model. Optimal values of hyperparameters and

estimates for the generalization error are found by splitting the available data into three disjoint subsets. The training set

is used to adapt the trainable parameters by the learning algorithm. The second dataset, the validation set, exists to

estimate the generalization error during or after training, allowing for hyperparameter tuning with the goal to ﬁnd a

good balance between performance and avoidance of overﬁtting. However, the estimate of the generalization error of

the ﬁnal model will be biased because the validation data was used to select the model. Thus, a third dataset, the test set,

is used to estimate the real generalization error.
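The three-way split described above can be sketched as follows. This is a minimal illustration using NumPy; the helper name `three_way_split` and the 80/10/10 fractions are assumptions made for the example and are not taken from this paper.

```python
import numpy as np

def three_way_split(n_samples, val_frac=0.1, test_frac=0.1, seed=0):
    """Return disjoint index arrays (train, val, test) covering range(n_samples).

    The fractions are illustrative assumptions, not values from the paper.
    """
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_samples)          # shuffle once, then cut into three parts
    n_val = int(round(val_frac * n_samples))
    n_test = int(round(test_frac * n_samples))
    val = idx[:n_val]                         # used for hyperparameter tuning
    test = idx[n_val:n_val + n_test]          # touched only for the final error estimate
    train = idx[n_val + n_test:]              # used to fit the trainable parameters
    return train, val, test

train_idx, val_idx, test_idx = three_way_split(1000)
```

Keeping the test indices untouched until the final model is selected is what makes the resulting error estimate unbiased with respect to the hyperparameter search.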

B. Theory of Artiﬁcial Neural Networks

One successful family of models used for ML is that of ANNs. ANNs are inspired by the functionality of biological

brains, which are made of a huge number of biological neurons that work together to control the behavior of animals

and humans. A collection of connected units, called artiﬁcial neurons, form the basis of an ANN. Furthermore, artiﬁcial

neurons loosely model biological neurons and are usually represented by nonlinear functions acting on the weighted sum of their input signals. Let $\vec{\mathrm{in}} = (\mathrm{in}_1, \mathrm{in}_2, \dots, \mathrm{in}_n)$ denote an input vector, $\vec{w} = (w_1, w_2, \dots, w_n)$ a weight vector, where $n$ is the input dimension, $b$ a bias term and $\varphi$ an activation function. Then the output $\mathrm{out}$ of a single artificial neuron can be written as

$$\mathrm{out} = \varphi\left( \sum_{j=1}^{n} w_j \, \mathrm{in}_j + b \right). \tag{1}$$
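As a small numerical illustration of Eq. (1), the following sketch evaluates a single artificial neuron with NumPy. The input values, weights and biases are made up for the example, and a rectified linear unit, max{0, x}, is assumed as the activation function.

```python
import numpy as np

def relu(x):
    """Rectified linear unit: max{0, x}."""
    return np.maximum(0.0, x)

def neuron_output(inputs, weights, bias, activation=relu):
    """Eq. (1): activation applied to the weighted input sum plus the bias."""
    return activation(np.dot(weights, inputs) + bias)

x = np.array([1.0, -2.0, 0.5])    # illustrative input vector
w = np.array([0.4, 0.1, -0.6])    # illustrative weights; the weighted sum is -0.1
out_pos = neuron_output(x, w, bias=0.3)    # pre-activation 0.2 passes through unchanged
out_neg = neuron_output(x, w, bias=-0.3)   # pre-activation -0.4 is clipped to zero
```

The two calls show how the bias shifts the point at which the neuron becomes active.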

The bias term b can be used to shift the activation function φ. A rectified linear unit (ReLU), where φ(x) = max{0, x}, is the most common activation function in modern ANNs. Trainable parameters are usually the weights and biases of the neurons. In most cases, the connectivity architecture of such ANNs is layered, with an input layer, multiple hidden layers and an output layer. ANNs are called feedforward networks when no feedback connections are present. One can prove that a feedforward network with a single hidden layer can approximate any reasonable function if the hidden layer has enough neurons. Nevertheless, using multiple hidden layers can add exponentially more expressive power for some classes of functions. Among other things, each layer can be used to extract increasingly abstract features and hence more suitable representations of the input data. An ANN with more than one hidden layer is called a deep ANN. Such deep ANNs can discover a suitable hierarchy of representations during training and, as a result, learn and also generalize better.

During training or learning, the algorithm requires a measure for the quality of its prediction to adjust the parameters of the model. In regression problems, a typical choice for the cost function is the mean squared error between predicted values and ground truth:

$$J(\vec{\theta}) = \frac{1}{2m} \sum_{\vec{x}} \left( y(\vec{x}) - f(\vec{x}, \vec{\theta}) \right)^2, \tag{2}$$

where $\vec{x}$ and $y(\vec{x})$ are the input vector and the ground truth, respectively, of a training data point, $m$ is the total number of training data points and $f(\vec{x}, \vec{\theta})$ is the predicted output of the model according to its parameters $\vec{\theta}$. For ANNs, the model parameters $\vec{\theta}$ are given by all weights and biases associated with the neurons. Training corresponds to finding optimal parameters $\vec{\theta}$ such that $J(\vec{\theta})$ is minimal. Often one adds an extra term for regularization:

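$$\tilde{J}(\vec{\theta}) = J(\vec{\theta}) + \alpha \, \Omega(\vec{\theta}) \quad \text{with} \quad \Omega(\vec{\theta}) = \frac{1}{2} \lVert \vec{w} \rVert_2^2 = \frac{1}{2} \left( w_1^2 + w_2^2 + \dots + w_i^2 \right), \tag{3}$$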

where $\vec{w}$ denotes the weights of the ANN, $i$ is the total number of trainable weights and $\alpha$ is an additional

hyperparameter, which controls the amount of regularization. The extra term penalizes larger network weights. The

procedure is known as weight decay or L2 regularization. Because of the nonlinearity of ANNs, $J(\vec{\theta})$ (or $\tilde{J}(\vec{\theta})$) is

a non-convex function. One can still use gradient-based optimizers, but there is no global convergence guarantee.

Nevertheless, training algorithms of ANNs are mostly based on using the gradient to descend the cost function to lower values. After initializing all trainable parameters - for example by small random numbers - the gradient of the cost

function is used to update the parameters by

$$\vec{\theta}' = \vec{\theta} - \epsilon \, \nabla_{\vec{\theta}} J, \tag{4}$$

where $\epsilon$ is a small parameter called the learning rate that ensures that the change in $\vec{\theta}$ is small. The gradient $\nabla_{\vec{\theta}} J$ of

the cost function with respect to $\vec{\theta}$ can be computed efficiently with the backpropagation algorithm. For large training

datasets, gradient computation can still be very time consuming. It turns out that the eﬃciency can be improved by

calculating the gradient on small randomized subsets of the training set called minibatches and applying updates to the

parameters more often. This procedure is called stochastic gradient descent. Finally, one pass of the full training set is

called an epoch.
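The training loop described above (random initialization, minibatches, the update rule of Eq. (4) and the regularized cost of Eq. (3)) can be sketched in a few lines of NumPy. To keep the example self-contained, a linear model is swapped in for the ANN, so the gradient is available in closed form and no backpropagation is needed; the data, learning rate and batch size are illustrative assumptions.

```python
import numpy as np

def cost_and_grad(theta, X, y, alpha):
    """Regularized cost of Eq. (3) and its gradient for a linear model f = X @ theta."""
    m = X.shape[0]
    residual = X @ theta - y
    cost = residual @ residual / (2 * m) + 0.5 * alpha * theta @ theta
    grad = X.T @ residual / m + alpha * theta
    return cost, grad

def sgd(X, y, alpha=1e-4, lr=0.1, batch_size=32, epochs=50, seed=0):
    rng = np.random.default_rng(seed)
    theta = 0.01 * rng.standard_normal(X.shape[1])   # small random initialization
    for _ in range(epochs):                          # one epoch = one pass over the data
        order = rng.permutation(X.shape[0])          # reshuffle before each epoch
        for start in range(0, X.shape[0], batch_size):
            batch = order[start:start + batch_size]
            _, grad = cost_and_grad(theta, X[batch], y[batch], alpha)
            theta = theta - lr * grad                # update rule of Eq. (4)
    return theta

# Synthetic data: y = 2*x0 - 3*x1 plus small noise (illustrative only).
rng = np.random.default_rng(1)
X = rng.standard_normal((512, 2))
y = X @ np.array([2.0, -3.0]) + 0.01 * rng.standard_normal(512)
theta_hat = sgd(X, y)
```

For this convex toy problem the iteration recovers the generating coefficients closely; for an ANN the same loop applies, but the cost is non-convex and the gradient comes from backpropagation.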

The use of ANNs for surrogate modelling has advantages and disadvantages. A big advantage is that ANNs can

capture the behavior of complicated functions because they can scale to large datasets and also generalize non-locally

[29]. Especially, if a deep network can extract the underlying factors, ANNs are well suited even for high-dimensional

problems. The biggest disadvantage is that ANNs mostly act as black boxes. The ﬁeld of explainable artiﬁcial

intelligence studies methods to make models like ANNs more explainable and interpretable, but is still in its infancy.

III. CFD based Data Generation

For the generation of training and test datasets, CFD calculations of supercritical methane ﬂowing inside of straight

cooling channel segments are performed. As mentioned in the introduction, many studies have been performed to derive

suitable CFD setups, which can reproduce all essential eﬀects inﬂuencing the heat transfer. The focus of this paper

is to show the feasibility of ANNs to tackle the challenge of numerically eﬃcient heat transfer predictions under the

assumption that precise CFD solvers are available for the corresponding problem setting.

A. CFD Models

Fig. 1 Geometry and boundary conditions of the cooling channel (not to scale)

The CFD models are generated with standard ANSYS CFX 18.0. The channel flow is modeled as compressible and stationary, while buoyancy and gravitational forces are neglected. The two-equation shear stress transport (SST) model is used as the turbulence model; it combines the k − ω turbulence model for the inner region of the boundary layer with the k − ε turbulence model for the free shear flow. The geometry and boundary conditions of the cooling channel

model are shown in Fig. 1 and Fig. 2. For symmetry reasons, it is sufficient to model one half of the channel. h and b denote channel height and width, while d is used for the chamber wall thickness in the figure. To restrict the number of independent variables, a fin thickness of 1 mm is assumed for all simulations. In stream-wise direction, no heat flux (q̇ = 0 W m−2) is applied for the first 80 mm of the channel to obtain a fully developed flow and velocity boundary layer. l denotes the channel length and is set to 250 mm for a cross section smaller than or equal to 5 mm2, while it is increased

for channels with a larger cross section to allow the thermal boundary layer to grow further. The channel surface is

modeled as a rough wall with diﬀerent values for the surface roughness and a no-slip condition. A mass ﬂow boundary

condition and the coolant total temperature are imposed at the ﬂuid inlet. Furthermore, the static pressure is ﬁxed at the

domain outlet and a symmetric ﬂow boundary condition assures no mass or energy ﬂuxes across the symmetry plane.

For the solid domain, all faces, except the hot gas wall, are modeled as adiabatic walls. Thermodynamic properties

of supercritical methane are evaluated with data from the well known NIST database [30], which provides data up to

625 K. For higher temperatures, an ideal gas behavior is assumed. The solid domain uses two diﬀerent material models.

Combustion chamber and solid fins consist of a CuCrZr alloy, which in this case is 99.25 % copper, 0.62 % chromium and

0.1 % zirconium. For the material properties of the alloy the reader is referred to Oschwald et al. [31]. The galvanic

layer is assumed to be made of copper. To reduce the influence of axial heat transfer, the thermal conductivity in the stream-wise direction is set to zero for both materials.

Fig. 2 Computational domain with boundary conditions (not to scale); walls denoted with "a" are adiabatic

The following parameters are varied for data generation: mass flow density G, heat flux q̇, outlet pressure pstat,out,

inlet temperature Tstat,in, surface roughness r, channel area A, aspect ratio AR and inner wall thickness d. Their upper and lower bounds are chosen so that the data cover the geometrical dimensions and operating conditions of both upper

stage and ﬁrst stage liquid rocket engines with moderate chamber pressure. The outlet pressure ranges between 50 bar

and 150 bar, which means that the fluid pressure is always above the critical pressure of methane, and that consequently

no boiling or phase change occurs. The fluid inlet temperature varies from 120 K to 400 K. Hence, there are

simulations where the coolant temperature crosses the Widom-line and a transition from a liquid-like to a gas-like state

takes place. Furthermore, both outlet pressures and inlet temperatures are clustered more narrowly around the critical

point to ensure that these critical cases are well represented. To model both smooth and rougher walls, sand-grain

roughnesses between 0.2 µm and 15 µm are considered. The channel area varies from 1 mm2 to 10 mm2 and also

diﬀerent channel aspect ratios (1.0 to 9.2) are simulated, because of their impact on heat transfer and maximum wall

temperature. For the channel with a cross section of 1 mm2, only an aspect ratio of 1.0 is used to take manufacturing

restrictions into account. The inner chamber wall thickness varies from 0.8 mm to 1.2 mm, which significantly influences

the hot gas wall temperature. Generally, higher mass ﬂow densities are considered for high heat ﬂuxes and smoother walls because they result in reasonable wall temperatures and pressure losses.
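The geometric parameters A and AR, together with the mass flow density G, determine the channel dimensions and the mass flow rate. The following sketch illustrates the relations for a rectangular channel. It assumes the convention AR = h/b, which is not stated explicitly in the text above, and the function names are hypothetical.

```python
import math

def channel_geometry(area_mm2, aspect_ratio):
    """Width b, height h and hydraulic diameter D_h (all in mm) of a rectangular channel.

    Assumes AR = h / b, a convention chosen here for illustration.
    """
    b = math.sqrt(area_mm2 / aspect_ratio)
    h = aspect_ratio * b
    d_h = 2.0 * b * h / (b + h)    # 4 * area / wetted perimeter
    return b, h, d_h

def mass_flow_rate(G, area_mm2):
    """Mass flow rate in kg/s from mass flow density G [kg s^-1 m^-2] and area [mm^2]."""
    return G * area_mm2 * 1e-6     # 1 mm^2 = 1e-6 m^2

b, h, d_h = channel_geometry(area_mm2=4.0, aspect_ratio=4.0)
m_dot = mass_flow_rate(G=10_000.0, area_mm2=4.0)
```

For the example values this gives a channel of 1 mm by 4 mm with a hydraulic diameter of 1.6 mm and a mass flow rate of 0.04 kg/s.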

For both solid and ﬂuid domains, hexahedral mesh elements are generated with ANSYS ICEM. The ﬁrst element in

the boundary layer has a thickness of 0.1 µm to satisfy a value of y+ < 1. The grid resolution is 100 µm in stream-wise direction and 35 µm perpendicular to it for the fluid domain. To check grid independence, a finer mesh with twice as many elements was analyzed for certain test cases. Because the resulting wall temperatures only change by 2 %, the coarser mesh shows sufficient precision and is therefore used. A converged solution has to fulfill three criteria: all RMS residuals must be below 1 × 10−5, the conservation equations must be well satisfied (solution imbalances below 1 %), and quantities of interest, such as pressure drop or maximum wall temperature, must not change significantly between two iterations. In total, approximately 20 000 CFD simulations of straight cooling channel segments are performed.

Fig. 3 Wall temperature and bulk temperature for different heat fluxes (30, 50 and 80 MW/m2): (a) maximum wall temperature Tw [K] over z [m]; (b) bulk temperature Tb [K] over z [m]

B. CFD Results

ML techniques can only cover eﬀects if they are already present in the training data. Important phenomena, which

affect flows in asymmetrically heated cooling channels and the associated heat transfer, are thermal stratification and

heat transfer deterioration. Both eﬀects can be observed in the CFD results; e.g. Fig. 3 shows the maximum wall

temperature and mean bulk temperature along the axial direction of a simulated straight cooling channel for diﬀerent

constant wall heat ﬂuxes. The wall temperature distribution exhibits a peak for higher heat ﬂuxes as a consequence of

heat transfer deterioration, while the bulk temperature increases nearly linearly. The inﬂuence of the surface roughness

is also modelled correctly. Higher roughness levels increase the production of turbulence in the boundary layer. Thus, wall temperatures are decreased, but the pressure loss is increased. These expectations coincide with the CFD results.

Overall, it can be concluded that the relevant consequences of thermal stratiﬁcation, heat transfer deterioration, and

surface roughness are represented in the generated data.

Fig. 4 Correlation matrix (Pearson correlation coefficients, lower triangle)

Variable               G      q̇      hb,stat  pb,stat  r      AR     A      z      Tw
Mass Flow Density G    1.00
Heat Flux q̇            0.68   1.00
Enthalpy hb,stat      -0.15  -0.02   1.00
Pressure pb,stat       0.40   0.36   0.31     1.00
Roughness r           -0.08  -0.04  -0.03     0.03     1.00
Aspect Ratio AR        0.07   0.05  -0.12     0.03    -0.00   1.00
Cross Section A        0.25   0.18  -0.10    -0.04     0.01   0.26   1.00
Flow Length z          0.16   0.11   0.13    -0.21     0.01   0.16   0.27   1.00
Wall Temperature Tw    0.44   0.88   0.21     0.24    -0.16  -0.07   0.18   0.18   1.00

C. Data Reduction

Only a reduced amount of the CFD results is used for training the ANN. First, only the values of bulk properties are

utilized for the ﬂuid description. Bulk properties are calculated as mass-ﬂow averaged quantities across the channel

cross section. Although most information contained in the two-dimensional distribution of ﬂuid quantities is lost, it

is hoped that the impact will be reﬂected in the correlations of the bulk variables. Second, at each cross section the

temperature distribution of the solid part is reduced to the values of the mean wall temperatures and the maximum

wall temperature at the hot gas side. Nevertheless, it would be interesting to check how far the accuracy of

a data-driven model can be increased by using a more complete description of the ﬂuid and solid states. Third, these variables are only evaluated every 2 mm in stream-wise direction and saved in a table-like ﬁle structure. Each data point

is extended by the associated geometric information, such as cross section area, aspect ratio, ﬂow length as well as

boundary conditions like heat flux and surface roughness. The flow length is used to include boundary layer effects on

the heat transfer.
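The mass-flow averaging used to obtain bulk properties can be illustrated with a short NumPy sketch. The cell values below are invented for the example; in practice they would come from the cells of one CFD cross section, and the function name `bulk_average` is hypothetical.

```python
import numpy as np

def bulk_average(phi, rho, v, dA):
    """Mass-flow-weighted average of phi over the cells of one cross section."""
    mass_flux = rho * v * dA           # mass flow through each cell [kg/s]
    return np.sum(mass_flux * phi) / np.sum(mass_flux)

# Three illustrative cells: slow, hot fluid near the wall; fast, cold fluid in the core.
T   = np.array([400.0, 200.0, 200.0])   # temperature [K]
rho = np.array([100.0, 300.0, 300.0])   # density [kg/m^3]
v   = np.array([10.0, 30.0, 30.0])      # stream-wise velocity [m/s]
dA  = np.array([1e-8, 1e-8, 1e-8])      # cell area [m^2]
T_bulk = bulk_average(T, rho, v, dA)
```

Because the hot near-wall cell carries little mass flow, the bulk temperature (about 210.5 K) lies well below the plain arithmetic mean of the cell temperatures (about 266.7 K), which shows how stratification information is compressed into a single value.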

After data generation, it is always recommended to study the content and distribution of the data. First, a correlation matrix can be used to visualize the correlations between multiple variables. Figure 4 shows the correlation matrix for certain variables, where each entry visualizes the value of the corresponding Pearson correlation coefficient. It is important to note that Pearson correlation only describes the strength of linear relationships and does not imply causation. For example, the correlation coefficient between wall temperature and heat flux is 0.88 and indicates a strong positive relationship, whereas the correlation between wall temperature and surface roughness is negative, as expected. One can see that physically reasonable relations are still represented in the reduced data. Second, the data is not uniformly distributed and there are regions where the data is very sparse or where no data points are present at all. As an example, Figure 5 shows the distribution with respect to enthalpy and pressure. As a result of the data generation process with its manually chosen boundary conditions, there are regions with higher and lower data density. A so-called covariate shift refers to a situation where the distribution of input variables is different in the data available for training and the data one expects to use as input in the future [32]. This needs to be taken into account because the ANN should also produce good predictions in sparsely sampled regions.

Fig. 5 Data distribution: number of data points as a function of pressure pb,stat [bar] and enthalpy hb,stat [kJ/kg]
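A Pearson correlation matrix like the one discussed above can be computed with `numpy.corrcoef`. The sketch below uses synthetic data with a made-up linear relation between heat flux, roughness and wall temperature; it is not the CFD dataset, and the coefficients are illustrative assumptions.

```python
import numpy as np

def pearson_matrix(columns):
    """Pearson correlation matrix for a dict mapping variable name -> 1-D array."""
    names = list(columns)
    data = np.vstack([columns[name] for name in names])   # one row per variable
    return names, np.corrcoef(data)

rng = np.random.default_rng(0)
q = rng.uniform(10.0, 80.0, 2000)        # synthetic heat flux values
r = rng.uniform(0.2, 15.0, 2000)         # synthetic roughness values
# Toy relation (not CFD data): temperature rises with heat flux, falls with roughness.
T_w = 300.0 + 8.0 * q - 5.0 * r + rng.normal(0.0, 20.0, 2000)
names, corr = pearson_matrix({"q": q, "r": r, "T_w": T_w})
```

In this toy setting the entry for T_w and q is strongly positive and the entry for T_w and r is weakly negative, which mirrors the qualitative pattern described for the reduced data.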

IV. Artiﬁcial Neural Network for Wall Temperature Prediction

An important problem is the prediction of the maximum temperature for each section of the combustion chamber wall given a certain cooling channel design and suitable boundary conditions. The maximum temperature is a critical

parameter, because it directly determines the fatigue life of the chamber and is therefore a crucial constraint for design

considerations [14]. The main driver for the temperature is the heat transfer from the cooling channel to the coolant.

Hence, the prediction can only be successful if the implicit heat transfer modelling takes the underlying mechanisms

correctly into account. Put diﬀerently, this means that an accurate wall temperature prediction implies a proper reduced

order modelling of the relevant heat transfer.

Fig. 6 Exemplary network architecture with 2 fully connected hidden layers

A. Network Architecture and Hyperparameter Optimization

A fully connected, feedforward network is proposed for the wall temperature prediction. The term fully connected

means that every neuron of one layer is linked with all neurons of the next layer. Figure 6 shows an exemplary model with two hidden layers, four neurons per hidden layer and all input parameters. The optimal number of hidden layers

and neurons depends on the speciﬁc problem and data respectively. To ﬁnd the best network architecture and training

parameters, one needs to split the available data into suitable training and validation sets. Therefore, 90 % of the

data points are randomly selected for training and the rest is held back for validation. However, under a covariate shift,

data points should be weighted according to their so-called importance, which can be calculated by kernel density

estimations, when calculating error measures for training and validation. For further details the reader is referred to

Sugiyama et al. [32].
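The idea of importance weighting can be sketched in one dimension as follows. The importance of a training point is taken as the ratio of the target input density to the training input density, both estimated with a simple Gaussian kernel density estimate. This is a simplified illustration rather than the procedure of [32]; the bandwidth, the uniform target distribution and all function names are assumptions.

```python
import numpy as np

def gaussian_kde_1d(samples, query, bandwidth):
    """Gaussian kernel density estimate of `samples`, evaluated at `query`."""
    z = (query[:, None] - samples[None, :]) / bandwidth
    norm = len(samples) * bandwidth * np.sqrt(2.0 * np.pi)
    return np.exp(-0.5 * z**2).sum(axis=1) / norm

def importance_weights(train_x, target_x, bandwidth):
    """w(x) = p_target(x) / p_train(x) at each training point, normalized to mean 1."""
    p_train = gaussian_kde_1d(train_x, train_x, bandwidth)
    p_target = gaussian_kde_1d(target_x, train_x, bandwidth)
    w = p_target / p_train
    return w / w.mean()

def weighted_mae(y_true, y_pred, weights):
    """Mean absolute error in which each point counts according to its importance."""
    return np.sum(weights * np.abs(y_true - y_pred)) / np.sum(weights)

rng = np.random.default_rng(0)
# Training inputs: densely sampled on the left half, sparsely on the right half.
train_x = np.concatenate([rng.uniform(0.0, 0.5, 900), rng.uniform(0.5, 1.0, 100)])
target_x = rng.uniform(0.0, 1.0, 1000)   # inputs expected in later use (assumed uniform)
w = importance_weights(train_x, target_x, bandwidth=0.05)
```

Points in the sparsely sampled half receive larger weights, so errors made there contribute more to a weighted error measure such as `weighted_mae`.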

Given training and validation data, classical grid search and random search can be used to determine the optimal

parameters. Bergstra and Bengio [33] showed that a random search algorithm performs as well as grid search but with

less computational cost. The proposed ANN uses ReLUs for the activation functions of the hidden neurons and a linear

unit is employed for the continuous output. During training, the weight and bias update is calculated with the ADAM

optimizer, which is an extension of the classic stochastic gradient descent algorithm [34]. For faster and more robust

learning, all inputs are automatically scaled and standardized with the StandardScaler from Scikit-learn [35]. The cost

function is given by a mean squared error term plus an extra term for L2 regularization, as in Eq. (3). The model is

generated and trained with Keras, which is an open-source ANN library written in Python [36]. Random search of

500 diﬀerent hyperparameter combinations and network architectures leads to the following optimal model:

• 4 hidden layers

• 408 neurons per hidden layer

• L2 regularization with α = 0.1

• minibatch size of 4096

• 150 epochs

The training takes about 15 minutes on a Nvidia Quadro P4000 GPU.
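A random search over architectures and training parameters could be organized as in the following sketch. The search ranges, the helper names and the stand-in objective `toy_error` are assumptions made for illustration; only the general ingredients (ReLU hidden layers, linear output, L2 penalty, Adam, mean squared error) follow the description above. The Keras model builder imports TensorFlow lazily, so the sampling logic runs without it.

```python
import random

# Search space: the ranges are illustrative assumptions, not those used in the paper.
SEARCH_SPACE = {
    "hidden_layers": lambda rng: rng.randint(1, 6),
    "neurons": lambda rng: rng.randint(16, 512),
    "alpha": lambda rng: 10.0 ** rng.uniform(-4.0, 0.0),    # log-uniform
    "batch_size": lambda rng: rng.choice([256, 1024, 4096]),
}

def sample_config(rng):
    """Draw one hyperparameter combination at random."""
    return {name: draw(rng) for name, draw in SEARCH_SPACE.items()}

def build_model(config, n_inputs):
    """Fully connected ReLU network with a linear output, L2 penalty and Adam."""
    from tensorflow import keras    # imported lazily; needed only when a model is built
    model = keras.Sequential([keras.Input(shape=(n_inputs,))])
    for _ in range(config["hidden_layers"]):
        model.add(keras.layers.Dense(
            config["neurons"], activation="relu",
            kernel_regularizer=keras.regularizers.l2(config["alpha"])))
    model.add(keras.layers.Dense(1))             # linear unit for the continuous output
    model.compile(optimizer="adam", loss="mse")
    return model

def random_search(evaluate, n_trials, seed=0):
    """Return the sampled configuration with the lowest validation error."""
    rng = random.Random(seed)
    trials = [sample_config(rng) for _ in range(n_trials)]
    return min(trials, key=evaluate)

# Stand-in objective so the sketch runs without training; a real `evaluate` would
# call build_model, fit on the training set and return the validation error.
def toy_error(cfg):
    return abs(cfg["hidden_layers"] - 4) + abs(cfg["neurons"] - 408) / 408.0

best = random_search(toy_error, n_trials=500)
```

Note that `keras.regularizers.l2(alpha)` adds alpha times the sum of squared weights, that is, without the factor 1/2 that appears in Eq. (3), so the numerical value of alpha is not directly interchangeable between the two conventions.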

B. Results and Visualization

Figure 7 compares predicted and target values for the wall temperature. One can see that the proposed network

achieves a convincing precision. The mean absolute error (MAE) of the wall temperature prediction is 8.38 K on the

training set and 8.40 K on the validation set with a standard deviation of 17.7 K and 18.5 K, respectively. The reason

for the smaller error on the training data is that the training data was directly used to optimize the model’s weights, but the performance on the validation data is nearly identical. One can conclude that the chosen input variables are suitable for predicting the maximum wall temperature with high precision. Furthermore, the number of data samples is sufficient to train the network. Hence, one would conclude that the network has generalized well and does not overfit.
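The error measures reported above can be computed as in the following sketch. It assumes that the quoted standard deviation refers to the spread of the absolute errors, which the text does not state explicitly; the temperatures are invented for the example.

```python
import numpy as np

def error_summary(target, predicted):
    """Mean absolute error and standard deviation of the absolute errors."""
    abs_err = np.abs(np.asarray(predicted) - np.asarray(target))
    return abs_err.mean(), abs_err.std()

target = np.array([500.0, 650.0, 800.0, 950.0])       # illustrative wall temperatures [K]
predicted = np.array([504.0, 640.0, 806.0, 950.0])    # illustrative predictions [K]
mae, std = error_summary(target, predicted)
```

Comparing these two numbers between the training and validation sets is a simple first check for overfitting: a validation error far above the training error would indicate that the network does not generalize.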

Fig. 7 Training and validation results for the proposed model: predicted versus target wall temperature [K], colored by number of data points, for the training set (MAE = 8.38 K) and the validation set (MAE = 8.40 K)

Nevertheless, there is still the risk of overfitting, especially because of the empty regions in the input space, which are also present in the validation set. To evaluate the quality of the ANN model, it is necessary to study the performance on an independent test set with yet unseen data. For this purpose, 25 further CFD calculations for 5 different channel geometries are made and the resulting maximum wall temperatures are compared with the predictions of the ANN. The input parameters of 6 exemplary simulations are presented in Table 1. To include various engine sizes and operating conditions, the boundary conditions are varied over a wide range, leading to both lower and higher wall temperatures. As both channel geometries and operating conditions differ from those of the training and validation data, the test set is an unbiased and independent performance measure for the ANN. The reader is referred to the appendix for a detailed overview of the training and test data distributions.

Table 1 Exemplary boundary conditions for the test dataset

Test Case   Tin    pout    q̇         G             A       AR    d      r
[–]         [K]    [bar]   [MW/m2]   [kg/(m2 s)]   [mm2]   [–]   [mm]   [µm]
1           140    80      49        11 700        1.9     2.0   0.83   2.1
2           131    217     81        11 700        4.1     4.1   0.90   3.0
3           173    129     57        23 900        7.4     3.7   1.14   3.0
4           127    57      55        26 000        6.0     7.5   0.96   14.2
5           290    51      14        10 100        7.4     3.7   1.14   1.7
6           148    174     37        13 200        3.2     2.3   1.07   6.4

Figure 8 shows the maximum wall temperature as a function of the axial length for both the CFD simulation and the ANN. The MAE is 16.0 K with a standard deviation of 12.0 K. Overall, the ANN shows convincing performance in all wall temperature regimes: wall temperatures as high as 1000 K and as low as 500 K are predicted with small errors. By using the flow length, the trained network is also able to reproduce the non-linear evolution in the stream-wise direction. Moreover, the effect of heat transfer deterioration is learned, as test case 1 shows. It can be concluded that the ANN has captured the essential underlying factors. It successfully predicts the maximum wall temperatures, even for regions in the input space where no training or validation data points are present.

[Figure 8: panels for the individual test cases, each showing the wall temperature Tw [K] over the axial coordinate z [mm] for CFD and ANN]

Fig. 8 Wall temperature prediction for various design points, which are representative of different operating conditions and cooling channel geometries (see Table 1)

Finally, it is often useful to visualize the prediction of the model for a certain range of input values. Directly observing the output helps to decide whether a model has learned the fundamental underlying factors of the given task or merely memorizes the training data. Additionally, it is important to visualize how the model performs between the given data points. One can employ so-called heat maps, where two inputs are varied parametrically while all other parameters are kept constant. The output is then plotted in a two-dimensional scatter plot. Heat maps can be used to identify possible overfitting problems. For example, further investigation would be necessary if there were regions with strong unexpected discontinuities. Figure 9 illustrates the effect of varying coolant bulk enthalpy and pressure as well as the influence of channel roughness and flow length. In terms of physical interpretation, the response seems reasonable. In general, a lower coolant bulk enthalpy or a higher bulk pressure leads to lower wall temperatures for a given heat flux because of changes in the transport properties of the coolant. Furthermore, the wall temperature builds up excessively close to the critical point. Higher roughness levels enhance the production of turbulence in the boundary layer and thus decrease the wall temperature. The flow length reflects the influence of boundary layer growth. Overall, the wall temperature prediction changes smoothly without unphysical discontinuities.
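The heat map procedure described above can be sketched as follows: two inputs are swept over a grid while all other inputs stay at a fixed base point, and the largest jump between neighboring grid cells serves as a crude indicator for unexpected discontinuities. The functions `heat_map_grid` and `max_jump`, the input names and the linear `toy_model` are hypothetical; `toy_model` only stands in for the trained ANN and has no physical meaning.

```python
def heat_map_grid(model, base_point, name_x, values_x, name_y, values_y):
    """Evaluate `model` on a 2-D grid, varying two inputs and fixing the rest."""
    grid = []
    for y in values_y:
        row = []
        for x in values_x:
            point = dict(base_point)  # copy so the base point stays unchanged
            point[name_x] = x
            point[name_y] = y
            row.append(model(point))
        grid.append(row)
    return grid


def max_jump(grid):
    """Largest difference between horizontally or vertically adjacent cells."""
    jumps = [0.0]
    for i, row in enumerate(grid):
        for j, value in enumerate(row):
            if j + 1 < len(row):
                jumps.append(abs(row[j + 1] - value))
            if i + 1 < len(grid):
                jumps.append(abs(grid[i + 1][j] - value))
    return max(jumps)


def toy_model(point):
    """Hypothetical stand-in for the trained ANN; NOT a physical model."""
    return 500.0 + 0.5 * point["h_b"] - 1.0 * point["p_b"]


# Base point with assumed input names and made-up values.
base = {"h_b": 300.0, "p_b": 100.0, "r": 5.0, "z": 100.0}
pressures = [50.0, 100.0, 150.0, 200.0]
enthalpies = [0.0, 250.0, 500.0, 750.0, 1000.0]
grid = heat_map_grid(toy_model, base, "p_b", pressures, "h_b", enthalpies)
print(max_jump(grid))
```

With a real model, the resulting grid would be passed to a plotting routine; a jump that is large compared with the neighboring cells marks a region that deserves closer inspection.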

[Figure 9: two heat maps of the hot-gas wall temperature Tw [K]; first panel: bulk enthalpy hb,stat [kJ/kg] over bulk pressure pb,stat [bar]; second panel: flow length z [mm] over roughness r [µm]]

Fig. 9 Exemplary heat maps for the trained network

V. Reduced Order Model for Cooling Channel Flow

In addition to forecasting maximum wall temperatures, the prediction of critical variables such as pressure loss

and heating of the coolant is essential for regenerative cooling design. If CFD calculations are not suitable due to

their high computational cost, further reduced order models are required to calculate the stream-wise development of

thermodynamic properties like pressure and enthalpy, while the ANN is used to predict the wall temperature.

A. Pressure Drop Model

The Darcy-Weisbach equation can be used to estimate the pressure loss along the cooling channels. The pressure loss in a channel segment of length ∆z is given by

$$ \Delta p = f \, \frac{1}{2} \rho_b v_b^2 \, \frac{\Delta z}{D_h}, \tag{5} $$

where f is the so-called friction factor, ρb the bulk density of the coolant, vb the bulk flow velocity and Dh the hydraulic diameter of the channel. The friction factor f can be calculated by means of a simple empirical correlation valid for laminar, transitional and turbulent flow [37]:

$$ f = 8 \left[ \left( \frac{8}{Re} \right)^{12} + \frac{1}{(A + B)^{1.5}} \right]^{1/12}, \tag{6} $$

$$ A = \left[ 2.457 \, \ln \left( \frac{1}{\left( \frac{7}{Re} \right)^{0.9} + 0.27 \, \frac{r}{D_h}} \right) \right]^{16} \quad \text{and} \quad B = \left( \frac{37530}{Re} \right)^{16}, \tag{7} $$

where Re denotes the local Reynolds number and r the surface roughness.
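As an illustration, Eqs. (5) to (7) translate into the following short sketch. The function names and the numerical values in the usage example are assumptions chosen for demonstration and do not correspond to a specific test case.

```python
import math


def churchill_friction_factor(reynolds, roughness, d_h):
    """Darcy friction factor from Eqs. (6) and (7).

    `roughness` and `d_h` must be given in the same length unit.
    """
    a = (2.457 * math.log(1.0 / ((7.0 / reynolds) ** 0.9
                                 + 0.27 * roughness / d_h))) ** 16
    b = (37530.0 / reynolds) ** 16
    return 8.0 * ((8.0 / reynolds) ** 12
                  + 1.0 / (a + b) ** 1.5) ** (1.0 / 12.0)


def pressure_drop(f, rho_b, v_b, dz, d_h):
    """Darcy-Weisbach pressure loss over a segment of length dz, Eq. (5)."""
    return f * 0.5 * rho_b * v_b ** 2 * dz / d_h


# Illustrative values in SI units (assumed, not taken from Table 1).
f = churchill_friction_factor(reynolds=1.0e5, roughness=3.0e-6, d_h=2.0e-3)
dp = pressure_drop(f, rho_b=300.0, v_b=50.0, dz=0.01, d_h=2.0e-3)
print(f, dp)
```

In the laminar limit the correlation reduces to f = 64/Re, which is a convenient sanity check for any implementation.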

B. Enthalpy Increase Model

Conservation of energy yields the change of the specific total enthalpy of the fluid over a channel section of length ∆z:

$$ h_{b,\mathrm{tot}}(z + \Delta z) = h_{b,\mathrm{tot}}(z) + \frac{\dot{Q}(z, \Delta z)}{\dot{m}} \quad \text{with} \quad h_{b,\mathrm{tot}}(z) = h_{b,\mathrm{stat}}(z) + \frac{1}{2} v_b(z)^2, \tag{8} $$

where z is the stream-wise coordinate, hb,stat the specific static bulk enthalpy of the fluid, vb the bulk flow velocity, ṁ the mass flow rate and Q̇ the overall heat flow rate in the channel segment.
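A single step of Eq. (8) can be written down directly. The sketch below uses SI units throughout (J/kg, W, kg/s, m/s); the function names and the numbers are assumptions for illustration only.

```python
def total_enthalpy(h_stat, v_b):
    """Specific total enthalpy from static enthalpy and bulk velocity."""
    return h_stat + 0.5 * v_b ** 2


def advance_total_enthalpy(h_tot, q_dot, m_dot):
    """Eq. (8): total enthalpy after a segment that receives q_dot [W]."""
    return h_tot + q_dot / m_dot


# Made-up inlet state and segment heat flow rate.
h_tot_in = total_enthalpy(h_stat=300.0e3, v_b=50.0)
h_tot_out = advance_total_enthalpy(h_tot_in, q_dot=2000.0, m_dot=0.02)
print(h_tot_in, h_tot_out)
```

For these numbers the kinetic contribution (1.25 kJ/kg) is small compared with the enthalpy rise of the segment (100 kJ/kg), but it grows with the square of the bulk velocity.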

C. Comparison with CFD

If one adds a mass continuity equation and a suitable equation of state (or uses the NIST database), one obtains a complete reduced order model for supercritical methane flowing in a rocket engine cooling channel. The predictions of the reduced order model can be compared with the results of a full CFD calculation. First, Fig. 10 shows the evolution of the bulk pressure and bulk enthalpy for an exemplary test case. Although error propagation increases the error in the stream-wise direction, the simple models produce results with sufficient accuracy. The mean absolute percentage error on the entire test data set is 4.3 % for the enthalpy and 4.2 % for the pressure.

[Figure 10: two panels showing the bulk pressure pb,stat [bar] and the bulk enthalpy hb,stat over the axial coordinate z [mm] for CFD and the simple model]

Fig. 10 Comparison between simple one-dimensional models and CFD data for bulk enthalpy and bulk pressure for an exemplary test case

Thus, these models can be used to calculate pressure and enthalpy along a channel, which, in turn, the ANN can use as input for the wall temperature prediction. Figure 11 shows the wall temperature prediction of the proposed network when the reduced order models are used for input generation. The error increases only marginally, from 16.0 K to 19.6 K, when the reduced order model is used for the pressure and enthalpy calculation. In summary, the proposed reduced order model is able to predict the evolution of the bulk pressure, the bulk enthalpy, and the resulting maximum wall temperature for low heat fluxes in the range of 10 MW m−2 (test case 5), medium heat fluxes (test cases 1 and 6), as well as very high heat fluxes up to 80 MW m−2 (test cases 2, 3 and 4), which can occur in the nozzle throat of a liquid rocket engine.
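How the individual relations could be chained into a stream-wise marching scheme is sketched below. This is a minimal explicit scheme under stated assumptions, not the authors' implementation: the callables `density`, `viscosity` and `friction_factor` are hypothetical placeholders for an equation of state (or property table) and for the friction correlation, the heat input per unit length is taken as constant, and the constant property values in the usage example serve illustration only.

```python
def march_channel(p_in, h_stat_in, m_dot, area, d_h, length, n_steps,
                  heat_per_length, density, viscosity, friction_factor):
    """Explicit stream-wise march of bulk pressure and static enthalpy (SI).

    Returns a list of (z, p, h_stat) tuples, starting at the inlet.
    """
    dz = length / n_steps
    p, h_stat = p_in, h_stat_in
    profile = [(0.0, p, h_stat)]
    for i in range(n_steps):
        rho = density(p, h_stat)              # equation of state (placeholder)
        v = m_dot / (rho * area)              # mass continuity
        re = rho * v * d_h / viscosity(p, h_stat)
        h_tot = h_stat + 0.5 * v ** 2
        # Pressure loss of the segment, Darcy-Weisbach relation.
        p -= friction_factor(re) * 0.5 * rho * v ** 2 * dz / d_h
        # Enthalpy rise of the segment from the energy balance.
        h_tot += heat_per_length * dz / m_dot
        # Simplification: kinetic term evaluated with the upstream velocity.
        h_stat = h_tot - 0.5 * v ** 2
        profile.append(((i + 1) * dz, p, h_stat))
    return profile


profile = march_channel(
    p_in=100.0e5, h_stat_in=300.0e3, m_dot=0.016, area=4.0e-6, d_h=2.0e-3,
    length=0.2, n_steps=100, heat_per_length=8000.0,
    density=lambda p, h: 400.0,        # constant density: illustration only
    viscosity=lambda p, h: 2.0e-5,     # constant viscosity: illustration only
    friction_factor=lambda re: 0.02,   # constant friction factor: illustration only
)
z_out, p_out, h_out = profile[-1]
print(z_out, p_out, h_out)
```

At every station, the pair of bulk pressure and bulk enthalpy, together with the geometry and the heat flux, would then be passed to the wall temperature model.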

D. Performance Assessment

Although ANNs require a time-intensive training phase, the predictive speed is very high, because the network is just a composite function that multiplies matrices and vectors. Additionally, the numerical effort does not depend on the actual values of the inputs (e.g. channel area), whereas CFD simulations need increasingly more time with larger model sizes and thus higher numbers of mesh elements. On a computer with an Intel Xeon Gold 6140 CPU, the CFD calculation of one straight channel segment takes up to 1 hour depending on the channel cross section, while the reduced order model delivers the result after 0.6 s. This comparison shows the great potential of data-driven surrogate models for design space exploration and optimization loops.
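As a rough worked example of the quoted run times, taking the upper CFD figure of 1 hour at face value (the symbols t_CFD and t_ROM are introduced here for illustration only):

```latex
\frac{t_\mathrm{CFD}}{t_\mathrm{ROM}} \approx \frac{3600\,\mathrm{s}}{0.6\,\mathrm{s}} = 6000,
\qquad
10^{3} \cdot 0.6\,\mathrm{s} = 600\,\mathrm{s} = 10\,\mathrm{min}
```

The speed-up thus reaches several thousand for the longest CFD runs, and a factor of 10^3 corresponds to a CFD run time of 10 minutes.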

[Figure 11: panels for the individual test cases, each showing the wall temperature Tw [K] over the axial coordinate z [mm] for CFD, ANN, and ANN + simple model for pressure and enthalpy]

Fig. 11 Comparison of wall temperatures for CFD, the reduced order model and a hybrid model that uses pressure and enthalpy from CFD and the ANN for wall temperature prediction

VI. Conclusion and Outlook

In this paper, an ANN was successfully trained to predict the maximum wall temperature for each cylindrical section of a rocket combustion chamber wall, given a regenerative cooling design using supercritical methane and suitable boundary conditions. The network was trained on data generated by CFD simulations of straight cooling channel segments. The ANN predicts the wall temperature for previously unseen test cases, including different channel geometries and operating conditions, with an MAE of 16.0 K. Furthermore, the prediction for an entire channel segment takes only 0.6 s, which is at least 10^3 times faster than comparable three-dimensional CFD simulations. Thus, this numerically efficient method constitutes a convincing building block of a reduced order model for supercritical methane flowing in rocket engine cooling channels. It is also shown which further reduced order models can be added to obtain a suitable description for cooling channel design considerations. The presented methodology can be used to generate predictions with a precision similar to full CFD calculations, and, after training, a prediction takes only a fraction of the computation time of a comparable CFD simulation. Therefore, it is well suited for optimization loops and as a component of system analysis tools.

However, ANNs have disadvantages too. On the one hand, there are disadvantages that all data-driven surrogate models share. The data sample selection determines the reachable accuracy. First, if the underlying data is wrong, the resulting model will be wrong as well. For the described methodology this means that it only works if a CFD code is available which can model all relevant effects, e.g. heat transfer deterioration or the correct influence of different surface roughness levels. Second, depending on the complexity of the problem, the construction of a precise approximation model can require a huge number of data samples. This data generation can become computationally very expensive. One challenge of surrogate modeling is the generation of a model that is as accurate as needed, using as few simulation evaluations as possible. It would be interesting to study how the additional use of experimental data could improve the situation. On the other hand, there are disadvantages that are typical for ANNs. ANNs are not able to extrapolate, but only provide reliable predictions within the region of the input space that is populated with training points. It is important to take this into account when using ANN-based models for design space exploration or optimization. Furthermore, due to the high number of parameters, these algorithms often lack a deeper understanding of the fundamental physics. Thus, domain knowledge and the understanding of physical processes are still crucially important to evaluate and justify the predictions of data-driven algorithms.

The present work can be improved in many directions. Clearly, the data generation process is not optimal: the density of the data points is far from uniform in the input space of interest. In future research, an optimization of the data generation should be studied. Building on this, the question of how much data is needed to reach a certain accuracy should be examined. A different choice of input parameters may increase the precision; parameters like the boundary layer thickness were not explicitly used in the current model. A further extension should consider curvature effects. It is well known that centrifugal forces induce recirculation phenomena in the flow which influence the heat transfer and should not be neglected. Finally, the performance of ANNs should be compared with other types of surrogate models for the task of wall temperature prediction and heat transfer modelling, respectively. Overall, it is hoped that the current work will serve as a basis for future studies regarding the application of ANNs in the field of rocket engine design.

Appendix

Table 2 and Table 3 give an overview of the mean value, standard deviation and different percentiles of the most relevant thermodynamic properties of the coolant, the channel geometries and the resulting wall temperature at the hot-gas wall for the training and test datasets.

        Tb    hb          pb      vb        G              q̇          r      A       AR    d      Tw
        [K]   [kJ kg−1]   [bar]   [m s−1]   [kg s−1 m−2]   [MW m−2]   [µm]   [mm2]   [–]   [mm]   [K]
Mean    251   566         125     126       18 483         36         6.9    6.7     4.4   1.0    669
Std     84    317         42      78        8078           24         6.1    3.2     3.1   0.1    302
1 %     123   56          53      18        3027           9          0.2    1.0     1.0   0.8    230
25 %    183   279         90      64        12 500         10         1.0    5.0     1.7   1.0    426
50 %    240   572         119     109       17 500         30         5.0    5.0     3.5   1.0    620
75 %    302   790         158     174       25 000         50         15.0   10.0    9.2   1.0    854
99 %    433   1175        215     357       35 000         80         20.0   10.0    9.2   1.2    1482
Table 2 Mean value, standard deviation and percentiles of the training data

        Tb    hb          pb      vb        G              q̇          r      A       AR    d      Tw
        [K]   [kJ kg−1]   [bar]   [m s−1]   [kg s−1 m−2]   [MW m−2]   [µm]   [mm2]   [–]   [mm]   [K]
Mean    267   620         128     119       16 402         42         5.4    4.5     3.8   1.0    741
Std     93    343         44      57        7070           21         3.9    2.1     1.9   0.1    177
Table 3 Mean value, standard deviation and percentiles of the test data

References

[1] Sutton, G. P., History of Liquid Propellant Engines, AIAA, 2005.

[2] Caisso, P., Souchier, A., Rothmund, C., Alliot, P., Bonhomme, C., Zinner, W., Parsley, R., Neill, T., Forde, S., Starke, R., Wang,

W., Takahashi, M., Atsumi, M., and Valentian, D., “A liquid propulsion panorama,” Acta Astronautica, Vol. 65, 2009, pp.

1723–1737. doi:10.1016/j.actaastro.2009.04.020.

[3] Liang, K., Yang, B., and Zhang, Z., “Investigation of Heat Transfer and Coking Characteristics of Hydrocarbon Fuels,” Journal

of Propulsion and Power, Vol. 14, No. 5, 1998, pp. 789–796. doi:10.2514/2.5342.

[4] Burkhardt, H., Sippel, M., Herbertz, A., and Klevanski, J., “Kerosene vs. Methane: A Propellant Tradeoﬀ for Reusable Liquid

Booster Stages,” Journal of Spacecraft and Rockets, Vol. 41, No. 5, 2004, pp. 762–769. doi:10.2514/1.2672.

[5] Pizzarelli, M., Nasuti, F., Onofri, M., Roncioni, P., Votta, R., and Battista, F., “Heat transfer modeling for supercritical

methane ﬂowing in rocket engine cooling channels,” Applied Thermal Engineering, Vol. 75, 2015, pp. 600–607. doi:

10.1016/j.applthermaleng.2014.10.008.

[6] Hahn, R., Waxenegger, G., Deeken, J., Oschwald, M., and Schlechtriem, S., “Utilization of LOx/LCH4 for Expander-Bleed

Cycle at Upper Stage Engine Application,” EUCASS 2017, 2017. doi:10.13009/EUCASS2017-370.

[7] Leonardi, M., Nasuti, F., and Onofri, M., “Basic Analysis of a LOX/Methane Expander Bleed Engine,” EUCASS 2017, 2017.

doi:10.13009/EUCASS2017-332.

[8] Banuti, D., “Crossing the Widom-line – Supercritical pseudo-boiling,” The Journal of Supercritical Fluids, Vol. 98, 2015, pp.

12–16. doi:10.1016/j.supflu.2014.12.019.

[9] Urbano, A., and Nasuti, F., “Parametric Analysis of Heat Transfer to Supercritical-Pressure Methane,” Journal of Thermophysics

and Heat Transfer, Vol. 26, No. 3, 2012, pp. 450–463. doi:10.2514/1.t3840.

[10] Urbano, A., and Nasuti, F., “Onset of Heat Transfer Deterioration in Supercritical Methane Flow Channels,” Journal of

Thermophysics and Heat Transfer, Vol. 27, No. 2, 2013, pp. 298–308. doi:10.2514/1.t4001.

[11] Locke, J., and Landrum, D., “Study of Heat Transfer Correlations for Supercritical Hydrogen in Regenerative Cooling

Channels,” Journal of Propulsion and Power, Vol. 24, No. 1, 2008, pp. 94–103. doi:10.2514/1.22496.

[12] Dittus, F., and Boelter, L., “Heat transfer in automobile radiators of the tubular type,” International Communications in Heat

and Mass Transfer, Vol. 12, No. 1, 1985, pp. 3–22. doi:10.1016/0735-1933(85)90003-x.

[13] Huzel, D. K., and Huang, D. H., Modern Engineering for Design of Liquid-Propellant Rocket Engines, AIAA, 1992.

[14] Waxenegger, G., Riccius, J., Zametaev, E., Deeken, J., and Sand, J., “Implications of Cycle Variants, Propellant Combinations and

Operating Regimes on Fatigue Life Expectancies of Liquid Rocket Engines,” EUCASS 2017, 2017. doi:10.13009/EUCASS2017-

[15] Cook, R., “Methane Heat Transfer Investigation,” NASA CR-171199, 1984.

[16] Pizzarelli, M., Carapellese, S., and Nasuti, F., “A Quasi-2-D Model for the Prediction of the Wall Temperature of Rocket Engine

Cooling Channels,” Numerical Heat Transfer, Part A: Applications, Vol. 60, 2011, pp. 1–24. doi:10.1080/10407782.2011.578011.

[17] Ruan, B., and Meng, H., “Supercritical Heat Transfer of Cryogenic-Propellant Methane in Rectangular Engine Cooling

Channels,” Journal of Thermophysics and Heat Transfer, Vol. 26, No. 2, 2012, pp. 313–321. doi:10.2514/1.t3670.

[18] Pizzarelli, M., Nasuti, F., and Onofri, M., “CFD analysis of transcritical methane in rocket engine cooling channels,” The Journal of Supercritical Fluids, Vol. 62, 2012, pp. 79–87. doi:10.1016/j.supflu.2011.10.014.

[19] Wang, L., Chen, Z., and Meng, H., “Numerical study of conjugate heat transfer of cryogenic methane in rectangular

engine cooling channels at supercritical pressures,” Applied Thermal Engineering, Vol. 54, No. 1, 2013, pp. 237–246.

doi:10.1016/j.applthermaleng.2013.02.007.

[20] Haemisch, J., Suslov, D., and Oschwald, M., “Experimental Analysis of Heat Transfer Processes in Cooling Channels of a Subscale Combustion Chamber at Real Thermal Conditions for Cryogenic Hydrogen and Methane,” Space Propulsion Conference 2018, Seville, 2018.

[21] Haemisch, J., Suslov, D., and Oschwald, M., “Experimental Study of Methane Heat Transfer Deterioration in a Subscale

Combustion Chamber,” Journal of Propulsion and Power, Vol. 35, No. 4, 2019, pp. 819–826. doi:10.2514/1.B37394.

[22] Hornik, K., Stinchcombe, M., and White, H., “Multilayer feedforward networks are universal approximators,” Neural Networks,

Vol. 5, 1989, pp. 359–366. doi:10.1016/0893-6080(89)90020-8.

[23] Sudakov, O., Koroteev, D., Belozerov, B., and Burnaev, E., “Artiﬁcial Neural Network Surrogate Modeling of Oil Reservoir: A

Case Study,” , 2019. doi:10.1007/978-3-030-22808-8_24.

[24] Dresia, K., Waxenegger-Wilﬁng, G., Riccius, J., and Oschwald, M., “Numerically Eﬃcient Fatigue Life Prediction of Rocket

Combustion Chambers using Artiﬁcial Neural Networks,” EUCASS 2019, 2019. doi:10.13009/EUCASS2019-264.

[25] Scalabrin, G., and Piazza, L., “Analysis of forced convection heat transfer to supercritical carbon dioxide inside tubes

using neural networks,” International Journal of Heat and Mass Transfer, Vol. 46, No. 7, 2003, pp. 1139–1154. doi:

10.1016/s0017-9310(02)00382-4.

[26] Scalabrin, G., Piazza, L., and Condosta, M., “Convective cooling of supercritical carbon dioxide inside tubes: heat transfer

analysis through neural networks,” International Journal of Heat and Mass Transfer, Vol. 46, No. 23, 2003, pp. 4413–4425.

doi:10.1016/s0017-9310(03)00256-4.

[27] Chang, W., Chu, X., Fareed, A. F. B. S., Pandey, S., Luo, J., Weigand, B., and Laurien, E., “Heat transfer prediction

of supercritical water with artiﬁcial neural networks,” Applied Thermal Engineering, Vol. 131, 2018, pp. 815–824. doi:

10.1016/j.applthermaleng.2017.12.063.

[28] Dresia, K., “Prediction of Heat Transfer in Methane for Liquid Rocket Engines Using Artiﬁcial Neural Networks,” Master’s

thesis, RWTH Aachen, 2018.

[29] Goodfellow, I., Bengio, Y., and Courville, A., Deep Learning, The MIT Press, 2017.

[30] Linstrom, P., “NIST Chemistry WebBook, NIST Standard Reference Database 69,” , 1997. doi:10.18434/t4d303.

[31] Oschwald, M., Suslov, D., and Woschnak, A., “Einﬂuss der Temperaturabhängigkeit der Materialeigenschaften auf den

Wärmehaushalt in regenerativ gekühlten Brennkammern,” DGLR Jahrbuch 2004, 2004.

[32] Sugiyama, M., Krauledat, M., and Müller, K.-R., “Covariate Shift Adaptation by Importance Weighted Cross Validation,”

Journal of Machine Learning Research, Vol. 8, 2007, pp. 985–1005.

[33] Bergstra, J., and Bengio, Y., “Random Search for Hyper-Parameter Optimization,” Journal of Machine Learning Research,

Vol. 13, 2012, pp. 281–305.

[34] Kingma, D. P., and Ba, J., “Adam: A Method for Stochastic Optimization,” International Conference on Learning Representations,

Banﬀ, 2014.

[35] Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R.,

Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M., and Duchesnay, E., “Scikit-learn: Machine

Learning in Python,” Journal of Machine Learning Research, Vol. 12, 2011, pp. 2825–2830.

[36] Chollet, F., et al., “Keras,” https://keras.io, 2015.

[37] Churchill, S., “Friction factor equation spans all ﬂuid-ﬂow regimes,” Chemical Engineering Journal, Vol. 84, 1977, pp. 91–92.