
Article

Deformation of Air Bubbles Near a Plunging Jet Using a Machine Learning Approach

Fabio Di Nunno 1,2 , Francisco Alves Pereira 1 , Giovanni de Marinis 2 , Fabio Di Felice 1 , Rudy Gargano 2, Massimo Miozzi 1 and Francesco Granata 2,*

1 National Research Council, Institute of Marine Engineering, Via di Vallerano 139, 00128 Roma, Italy; [email protected] (F.D.N.); [email protected] (F.A.P.); [email protected] (F.D.F.); [email protected] (M.M.)
2 Department of Civil and Mechanical Engineering, University of Cassino and Southern Lazio, Via G. Di Biasio 43, 03043 Cassino (FR), Italy; [email protected] (G.d.M.); [email protected] (R.G.)
* Correspondence: [email protected]

 Received: 9 May 2020; Accepted: 1 June 2020; Published: 3 June 2020 

Abstract: The deformation of air bubbles in a liquid flow field is of considerable interest in phenomena such as cavitation, air entrainment, and foaming. In complex situations, this problem cannot be addressed theoretically, while the accuracy of an approach based on Computational Fluid Dynamics (CFD) is often unsatisfactory. In this study, a novel approach to the problem is proposed, based on the combined use of a shadowgraph technique, to obtain experimental data, and some machine learning algorithms to build prediction models. Three models were developed to predict the equivalent diameter and aspect ratio of air bubbles moving near a plunging jet. The models differ in terms of their input variables. Five variants of each model were built, changing the implemented machine learning algorithm: Additive Regression of Decision Stumps, Bagging, K-Star, Random Forest and Support Vector Regression. In relation to the prediction of the equivalent diameter, two models provided satisfactory predictions, assessed on the basis of four different evaluation metrics, while the third model was slightly less accurate in all its variants. Regarding the forecast of the bubble's aspect ratio, the choice of input variables showed a greater influence on the accuracy of the results. Overall, the proposed approach proves to be promising for addressing complex problems in the study of multi-phase flows.

Keywords: air bubbles; plunging jet; shadowgraph technique; machine learning

1. Introduction

The study of the dynamics of air bubbles in a liquid flow field is of practical interest in many areas of environmental, chemical, naval and ocean engineering. The interest in phenomena such as cavitation, air entrainment and foaming is particularly relevant. The theoretical characterization of the motion and deformation of isolated bubbles is effective only in simple special cases. The most basic theoretical model of the dynamics of a single bubble is represented by the Rayleigh–Plesset equation [1,2], which describes the variation in the bubble radius as a function of four quantities: the external pressure, the internal pressure of the bubble, the surface tension of the liquid and the viscosity of the liquid. It is valid in the case of a perfectly spherical bubble in an infinite liquid domain that is at rest far from the bubble. Moreover, temperature gradients are not considered, while the pressure is a known input governing the bubble deformation. In most situations of practical interest, these assumptions cannot be justified. Later, various authors proposed other theoretical models under different hypotheses [3–6].


More complex situations are generally modelled by experimental [7–10] or CFD-based approaches [11–14]. The dynamics of individual bubbles in a water volume are governed by the physical characteristics of water and air, which can be expressed in terms of the water density ρw, the bubble density ρb, the water dynamic viscosity µ and the water surface tension σ. The rising of an air bubble also depends on the buoyancy, which is a function of the pressure gradient ∂P/∂z, where z indicates the water depth, and of the gravitational acceleration g. The buoyancy also depends on the air bubble volume, which is related to the bubble equivalent diameter Deq (i.e., the diameter of a sphere with the same volume as the bubble). Furthermore, the relative motion between the air bubble and the surrounding water, which moves with velocity Vw, involves a resistance related to the bubble velocity Vb [15]. In general, the evolution of a single bubble that moves in the water can be described according to:

f(ρw, ρb, µ, σ, Deq, g, ∂P/∂z, Vw, Vb) = 0    (1)

Based on the well-known dimensional analysis [16–18], four essential dimensionless numbers affecting the evolution of individual bubbles can be obtained:

Reb = (Vb · Deq) / υ    (2)

Frb = Vb / √(g · Deq)    (3)

Web = (ρw · Vb² · Deq) / σ    (4)

Eo = ((ρw − ρb) · g · Deq²) / σ    (5)

where Reb is the bubble Reynolds number, υ is the water kinematic viscosity, Frb is the bubble Froude number, Web is the bubble Weber number and Eo is the Eötvös number.

A somewhat complex phenomenon that induces the movement of bubbles in a water volume is observed in the presence of a plunging jet. When a water jet plunges into a water pool, air bubbles are generally entrained and carried away in the water volume (Figure 1) if the jet impact velocity exceeds a critical value [19]. The impact of the jet induces a fluctuating pressure field that is very difficult to characterize [20]. Moreover, in the turbulent flow induced by the plunging jet, the evolution of the bubbles is also governed by other mechanisms, e.g., turbulence fluctuation, drag, turbulent shear, etc. [21]. The knowledge of the pressure and shear stress is essential to accurately predict the evolution of the size and shape of the bubbles near the plunging jet, since the shape of a single bubble can fluctuate in response to oscillations in the pressure and shear stress in the liquid surrounding the bubble [22,23]. However, even if the pressure field is not known, the availability of a good amount of data on the evolution of the size and shape of a suitable number of bubbles, as well as of data on some characteristics of the flow field, leads us to consider an interesting alternative approach for predicting the deformation of individual bubbles: machine learning algorithms. These procedures, particularly suited to dealing with nonlinear regression problems dependent on several variables, have been widely used in recent years in solving a variety of water engineering problems [24–34].
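As a point of reference for the dimensionless groups in Equations (2)–(5), the short Python sketch below evaluates them for a single bubble. The fluid properties are indicative values for water and air at room temperature and are assumptions made for this example, not quantities measured in the present study.

```python
import numpy as np

RHO_W = 998.0    # water density [kg/m^3] (assumed)
RHO_B = 1.2      # air (bubble) density [kg/m^3] (assumed)
SIGMA = 0.0728   # water surface tension [N/m] (assumed)
NU = 1.0e-6      # water kinematic viscosity [m^2/s] (assumed)
G = 9.81         # gravitational acceleration [m/s^2]

def dimensionless_numbers(v_b, d_eq):
    """Return (Re_b, Fr_b, We_b, Eo) for a bubble of velocity v_b [m/s]
    and equivalent diameter d_eq [m], following Equations (2)-(5)."""
    re_b = v_b * d_eq / NU                        # Eq. (2): bubble Reynolds number
    fr_b = v_b / np.sqrt(G * d_eq)                # Eq. (3): bubble Froude number
    we_b = RHO_W * v_b**2 * d_eq / SIGMA          # Eq. (4): bubble Weber number
    eo = (RHO_W - RHO_B) * G * d_eq**2 / SIGMA    # Eq. (5): Eötvös number
    return re_b, fr_b, we_b, eo

# Example: a 3 mm bubble moving at 0.25 m/s
print(dimensionless_numbers(0.25, 3.0e-3))
```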

Figure 1. Air bubbles entrained by the vertical plunging jet (a); individual air bubble path, rising inside the lateral recirculation zone (only the highlighted bubble is at the same time as the recorded frame) (b).

The complexity of the observed phenomena makes two-phase air–water flows a field of investigation for which machine learning algorithms can represent a very useful tool. However, so far in the literature, there are few studies relating to the application of machine learning algorithms to two-phase air–water flow issues. Shaban and Tavoularis [35] developed a novel method to evaluate gas and liquid flow rates in vertical upward gas–liquid pipe flows. This technique consisted of an application of multi-layer backpropagation neural networks on the probability density function and the power spectral density of the normalized output of a differential pressure transducer connected to two axially separated wall pressure taps in the pipe. Granata and de Marinis [36] used the Regression Tree M5P model, the Bagging algorithm and the Random Forest algorithm to address some complex problems of wastewater hydraulics, including the air entrainment in a circular drop manhole under supercritical flow. Mosavi et al. [37] used an adaptive network-based fuzzy inference system (ANFIS) combined with CFD data to predict macroscopic parameters such as the gas velocity in a multiphase reactor. The mentioned studies already show the remarkable potential of approaches based on machine learning algorithms in addressing issues related to multi-phase flows.

However, to the best of the authors' knowledge, there are no previous studies based on machine learning algorithms that deal with the evolution of air bubbles in a water volume.

In this research, three different data-driven models have been developed for the prediction of the bubble equivalent diameter (Deq) and aspect ratio (AR), where AR is equal to the ratio of the minor axis and major axis of the air bubble fitting ellipsoid [38]. The models differ in terms of their input variables. Five variants of each model have been built, varying the machine learning algorithm used: Additive Regression of Decision Stumps (ARDS), Bootstrap Aggregating (Bagging), K-Star, Random Forest (RF) and Support Vector Regression (SVR). In general terms, the forecasting capability of different machine learning algorithms is strongly dependent on the size and variety of the available dataset. The abovementioned algorithms have been selected because they usually ensure high performance in modelling complex and highly non-linear relationships. The input parameters of the models have been evaluated through experimental measurements carried out using a shadowgraph technique, which will be described in detail in the following section.

2. Material and Methods

2.1. Experimental Setup

The experimental facility (Figure 2) consists of a vertical steel pipe with nozzle diameter D = 21 mm, from which a vertical plunging jet falls into a Plexiglas prismatic tank with a side equal to 14D and a height equal to 24D. The water flow is recirculated in a closed circuit, supplied by a centrifugal pump. The test water flow rate was Q = 0.5 L/s, measured by means of an electromagnetic flow meter. The plunging jet axis is in the centerline of the tank, with the jet nozzle located at 5D from the free surface and 20D from the bottom of the tank.

The shadowgraph system consists of four Falcon® 1.4MP fast cameras (Teledyne DALSA, Waterloo, ON, Canada), with 35-mm focal length lenses and a resolution of 1400 × 1024 pixels at 100 frames per second. The cameras are arranged in couples and located in front of two different tank sides at a 90° angle. On the opposite tank sides, a couple of LED panels, with a size of 297 × 210 mm and a power of 30 W, were placed for the backlight illumination of the measurement volume. The investigated volume extends from the free surface to 6.5D streamwise (vertical x-direction), from −2.5D to 2.5D spanwise (y-direction) and from −1D to 1D depthwise (z-direction).

Figure 2. Experimental setup.

An experimental test consisting of 2000 consecutive images has been carried out. In order to evaluate the shape evolution of some air bubbles near the vertical plunging jet, the image sequences have been recorded with a frame rate equal to 100 Hz, with an image size of 1400 × 751 pixels and a high spatial resolution, close to 0.1 mm/pixel. It should be noted that an increase in the camera's frame rate corresponds to a decrease in the image size, leading to a reduction in the investigated volume. Therefore, the frame rate has been optimized to ensure an investigated volume wide enough to allow for an analysis of the rising bubbles inside the lateral recirculation zone.


2.2. Volumetric Shadowgraph Technique

The volumetric shadowgraph technique used in this study allows us to describe the three-dimensional evolution of air bubble shape and position, based on the observation of the bubble boundaries from different points of view [39,40]. The boundary projection in the Euclidean space defines a cone whose axis joins the center of the camera lens to the bubble centroid. This principle is illustrated in Figure 3, considering the projection of an object on two different planes, Oxy and O'x'y', with the local z-axis indicating the camera's optical axis.

Figure 3. Measurement principle.

If the extrinsic and intrinsic parameters of the camera system are known, it is possible to project, in the three-dimensional space, the n cones, with n equal to the number of cameras. The extrinsic parameters define the location and orientation of the camera reference frame with respect to a known world reference frame, while the intrinsic parameters define the optical, geometric and digital characteristics of the camera, regardless of the world around it [41,42].

The estimation of the camera parameters takes place through a well-established calibration procedure [43], which involves the use of a specific target, consisting of a planar chessboard 90 × 50 mm with squares of 5 mm ± 0.02 mm in size, stuck on a planar steel plate. The calibration requires the acquisition of a sequence of 40 images containing the entire translated and rotated target. The corner identification in each target image allows us to evaluate the extrinsic parameters of each camera. Then, the evaluation of the relative position and orientation of Cam1 to Cam0 and of Cam3 to Cam2, and the subsequent estimation of the position and orientation of the two camera couples, both relative and absolute, with respect to the world coordinate system, provide the mapping of the target and, consequently, of the investigated volume.

The intersection of the n cones defines the carved hull of the bubble [44]. Therefore, using sufficiently fast cameras, this technique allows for the detection and description of the spatial and temporal evolution of individual air bubbles within a measurement volume.

A specific image processing algorithm (Figure 4) has been used to detect the air bubble boundaries in each recorded data set. The procedure entails two preliminary steps: the first is the background removal (Figure 4b), obtained by subtracting from each image the median image calculated over the entire data set; the second step is the image binarization (Figure 4c) by means of the IsoData threshold algorithm [45]. The latter allows for the detection of the bubbles characterized by a clearly visible boundary, removing the out-of-focus bubbles. In the presence of a relevant void fraction, air bubbles tend to aggregate, forming a bubble cluster. The direct application of the shadowgraph technique on images containing bubble clusters involves the erroneous detection of a single bubble for the entire cluster. In order to avoid this inconvenience, the watershed technique has been used (Figure 4d). It considers a grayscale digital image as a relief map, with the gray levels of pixels indicating their elevation in the relief [46]. Considering a bubble cluster as a series of watersheds adjacent to each other, the watershed lines allow for their division and, consequently, their detection [47,48]. The last step consists of bubble boundary detection, obtained by considering the pixel outline of the bubbles (Figure 4e).

The air bubble tracking has been carried out by means of the Lucas–Kanade optical flow algorithm [49], a correlation-based method that defines its best measurement as the minimum of the Sum of Squared Differences (SSD) of pixel intensity values between interrogation windows in two consecutive frames [50]. The optical flow has been applied on the detected bubble centroid in the 2D frames recorded from each camera. By knowing the cameras' positions with respect to the air bubbles and the position of the latter in the Euclidean space through the volumetric shadowgraph technique, it is possible to obtain a three-dimensional tracking of the air bubbles.

It should be noted that, in a shadowgraph image of a bubbly flow, regardless of the bubble size, the backlight from the LED panel meets air–water interfaces from one or more bubbles and scatters both by total reflection and by refraction followed by internal reflections and refractions, such that the light intensity collected on the camera sensor varies accordingly. This made it possible to detect and follow the trajectories of air bubbles of different sizes, starting from a few millimeters. The same would not be possible in the presence of solid particles and/or water drops, where the introduction of a depth-of-field (DOF) criterion could be necessary [51].
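The boundary-detection chain of Figure 4 can be sketched with standard image-processing primitives. The following Python example (scikit-image and SciPy) is a minimal illustration of the sequence described above — median background removal, IsoData thresholding, watershed splitting of clusters and contour extraction. The peak-detection distance and the assumed image orientation (dark bubbles on a bright background) are illustrative choices, not the settings used in the study.

```python
import numpy as np
from scipy import ndimage as ndi
from skimage.feature import peak_local_max
from skimage.filters import threshold_isodata
from skimage.measure import find_contours
from skimage.segmentation import watershed

def detect_bubble_outlines(frames):
    """Sketch of the boundary-detection steps of Figure 4 for a stack of shadowgraph
    frames (array of shape n_frames x rows x cols)."""
    background = np.median(frames, axis=0)                 # median image over the data set
    outlines = []
    for frame in frames:
        corrected = background.astype(float) - frame       # background removal: bubbles become bright
        binary = corrected > threshold_isodata(corrected)  # IsoData thresholding
        labels, _ = ndi.label(binary)                      # connected bubble regions / clusters
        distance = ndi.distance_transform_edt(binary)      # distance map used to seed the watershed
        peaks = peak_local_max(distance, labels=labels, min_distance=5)
        markers = np.zeros(binary.shape, dtype=int)
        markers[tuple(peaks.T)] = np.arange(1, len(peaks) + 1)
        segmented = watershed(-distance, markers, mask=binary)  # split touching bubbles
        outlines.append([find_contours((segmented == lab).astype(float), 0.5)
                         for lab in range(1, int(segmented.max()) + 1)])
    return outlines
```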

Figure 4. Steps for the boundary detection: original image (a); image after background removal (b); binary image (c); image after watershed operation (d); final image containing the outline of the detected air bubbles (e).

Figure 5 shows the reconstructed sequence of positions taken by selected air bubbles, composed of 26 consecutive frames. Some air bubbles have been observed inside the jet zone and show a vertical downwards path; some other bubbles move in the lateral recirculation zones with an upwards spiraling path.

Figure 5. Sequence of air bubbles entrained by the vertical plunging jet and reconstructed by means of the volumetric shadowgraph technique. The plunging jet is downward from the top, with X/D = 0 indicating the water surface.

2.3. Algorithms

2.3.1. Additive Regression of Decision Stumps

An additive regression model [52] aims to obtain forecasts by summing up the outcomes of other models: the weak learners. The procedure begins with the development of a standard regression model (e.g., a regression tree model). The errors on the training data, i.e., the differences between the algorithm predictions and the actual values, are called residuals. In order to correct these errors, another prediction model is built, generally of the same type as the former one, which tries to forecast the abovementioned residuals. For this purpose, the initial dataset is replaced by the attained residuals before training the second model. Adding the predictions of the second model to the forecasts of the first one allows us to reduce the magnitude of the residuals. However, they are, again, different from zero because the second model is still not accurate enough. Thus, a third model is developed to predict the residuals of the residuals, and the procedure is iterated until a stopping rule based on cross-validation is met. In particular, a cross-validation is executed at each iteration, up to a user-defined maximum number of iterations, and the result that minimizes the cross-validation estimate of the squared error is selected. In this work, the algorithm performed 100 iterations for each of the considered models.

The Decision Stump algorithm has been chosen as the weak learner. This algorithm is based on a one-level decision tree: one root node is directly linked to the terminal leaf nodes. A decision stump provides a prediction on the basis of the value of a single input feature: a threshold value is selected, and the stump is characterized by two leaves, respectively for values above and below the threshold. There are also cases where multiple thresholds should be selected and the stump includes multiple leaves.
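A minimal sketch of the additive scheme with decision stumps can be written with scikit-learn's one-level trees. For brevity it uses a fixed number of iterations rather than the cross-validation stopping rule described above; the class and variable names are illustrative, not those of the Weka implementation used in the study.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

class AdditiveDecisionStumps:
    """Additive regression: each new stump is fitted to the residuals left by the
    sum of the previous ones, and predictions are the sum of all stump outputs."""

    def __init__(self, n_iterations=100):
        self.n_iterations = n_iterations
        self.stumps = []

    def fit(self, X, y):
        residuals = np.asarray(y, dtype=float).copy()
        for _ in range(self.n_iterations):
            stump = DecisionTreeRegressor(max_depth=1)  # one-level tree (decision stump)
            stump.fit(X, residuals)
            residuals -= stump.predict(X)               # what is still unexplained
            self.stumps.append(stump)
        return self

    def predict(self, X):
        return sum(stump.predict(X) for stump in self.stumps)
```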

2.3.2. K-Star

The K-Star procedure [53] is an instance-based algorithm very similar to the k-Nearest Neighbor regression algorithm. The latter procedure provides a prediction by evaluating a weighted average of the k nearest neighbors' values, weighted by the inverse of their distance, and the Euclidean metric is typically used as the distance metric. The innovative aspect of the K-Star algorithm is the use of an entropy metric instead of the Euclidean distance. The complexity of transforming one instance into another is chosen as the distance between the different instances. Complexity is evaluated by introducing a finite set of transformations among the instances and by defining the K* distance:

K*(b|a) = −log2 Pr*(b|a)    (6)

in which Pr* is the probability of all paths between instances a and b. If the instances are real numbers, it can be demonstrated that Pr*(b|a) depends only on the absolute value of the difference between a and b:

K*(b|a) = K*(i) = (1/2)·log2(2s − s²) − log2(s) + i·[log2(1 − s) − log2(1 − √(2s − s²))]    (7)

in which i = |a − b| and s is a parameter whose value is between zero and one. Consequently, the distance is a function of the absolute value of the difference between two instances. In addition, it can be assumed that the real space is underlain by a discrete space, with the distance between the discrete instances being very small. In Equation (7), as s approaches zero, this leads to a probability density function, where the probability of obtaining an integer between i and i + ∆i is:

Pr*(i) = √(s/2) · e^(−i·√(2s)) · ∆i    (8)

which needs to be rescaled in terms of a real value x, where x/x0 = i·√(2s), in order to obtain the probability density function over the real numbers:

Pr*(x) = (1/(2x0)) · e^(−x/x0) dx    (9)

In practical uses, a realistic value of x0 must be chosen, which is the mean expected value for x over the probability distribution. In the K-Star procedure, x0 is chosen by selecting a number between n0 and N, where N is the total number of training elements, while n0 is the number of training elements at the smallest distance from a. The choice of x0 is generally made by introducing the "blending parameter" b, which takes the value b = 0% for n0 and b = 100% for N, while intermediate values are linearly interpolated. For a more detailed description of the algorithm, see Cleary and Trigg [53]. A value of b = 40% has proven to be optimal for the problems addressed in this study.
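As an illustration of how the exponential transformation probability of Equation (9) produces a prediction, the sketch below implements a simplified K*-style weighted average for a single real-valued feature. Passing the scale x0 directly replaces the blending-parameter selection described above, so this is a toy version under stated assumptions, not the Weka implementation used in the study.

```python
import numpy as np

def kstar_predict(x_train, y_train, x_query, x0):
    """Simplified K*-style regression for one real-valued feature: the prediction is a
    weighted mean of the training targets, with weights proportional to the
    transformation probability P*(x) ~ exp(-|x - x_query| / x0) of Equation (9)."""
    distances = np.abs(np.asarray(x_train, dtype=float) - x_query)
    weights = np.exp(-distances / x0)
    return float(np.sum(weights * np.asarray(y_train, dtype=float)) / np.sum(weights))

# Example: estimate the target at x = 2.0 from five training instances
print(kstar_predict([0.5, 1.0, 1.8, 2.2, 3.0], [1.1, 1.9, 3.5, 4.4, 6.1], 2.0, x0=0.5))
```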

2.3.3. Bagging and Random Forest

A regression tree essentially consists of a decision tree used as a predictive model [54]. Each internal node represents an input variable, while the leaves correspond to specified real values of the target variable. The development of a regression tree consists of a recursive procedure in which the input data domain is divided into subdomains, while a multivariable linear regression model is employed to obtain predictions in each subdomain. The recursive tree growth process is carried out by splitting each subdivision into smaller branches, considering all the possible splits in every field and detecting, at each step, the subdivision into two separate subsets minimizing the least squared deviation, defined as:

R(t) = (1/N(t)) · Σi∈t (yi − ym(t))²    (10)

where N(t) is the number of sample elements in node t, yi is the value of the target variable in the i-th element and ym is the average value of the target variable in node t. This sum estimates the "impurity" at each node. The algorithm stops when the lowest impurity level is reached or if a different stopping rule is encountered.

The grouping of multiple learning models leads to an ensemble method, which may increase the forecast accuracy. The outcomes of different regression trees can be combined to obtain a single numeric result, for example, through a weighted average. Bootstrap aggregating, also known as Bagging, an ML algorithm introduced by Breiman, is based on such an approach: numerous training datasets of the same size are randomly sampled, with replacement, from the original dataset in order to build a regression tree for each dataset. Different predictions will be provided by different regression trees, since small variations in the training dataset may lead to significant changes in ramifications and in outcomes for test instances. Bagging attempts to counteract the instability of the regression tree development process by varying the original training dataset instead of sampling a new independent training set each time: some instances are repeated while others are left out. Finally, the individual predictions of the different regression trees are averaged.

The Random Forest algorithm is similar to the Bagging algorithm, but the procedure by which regression trees are built is different. Each node is split without considering the best subdivision among all the input variables, but by randomly choosing only a part of the variables to split on. The number of these variables does not change during the training process. Each tree is developed as far as possible, without pruning, until the minimum number of instances in a leaf node is reached. In this study, the minimum number of leaf elements has been chosen to be equal to five, while each forest consists of 1000 trees.
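For reference, the two ensembles can be instantiated in scikit-learn with the settings reported above (1000 trees and, for the Random Forest, at least five instances per leaf). The toy data stand in for the experimental input vectors and are assumptions made for this sketch.

```python
import numpy as np
from sklearn.ensemble import BaggingRegressor, RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.random((300, 3))                                  # placeholder for ~300 input vectors
y = X @ np.array([1.0, 0.5, 2.0]) + 0.1 * rng.standard_normal(300)

# Bagging of regression trees: bootstrap resampling of the training set (default base learner)
bagging = BaggingRegressor(n_estimators=1000).fit(X, y)

# Random Forest: a random subset of variables is considered at each split, trees grown without pruning
forest = RandomForestRegressor(n_estimators=1000, min_samples_leaf=5).fit(X, y)

print(bagging.predict(X[:2]), forest.predict(X[:2]))
```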

2.3.4. Support Vector Regression

Consider a training dataset {(x1, y1), (x2, y2), ..., (xl, yl)} ⊂ X × ℝ, where X is the space of the input arrays (e.g., X = ℝⁿ). Support Vector Regression (SVR) is a supervised learning algorithm [55] whose purpose is to find a function f(x) that is as flat as possible and has a maximum deviation ε from the observed target values yi. Therefore, given a linear function of the form:

f(x) = ⟨w, x⟩ + b    (11)

in which w ∈ X and b ∈ ℝ, the Euclidean norm ‖w‖² must be minimized, and this leads to a constrained convex optimization problem. In most cases, a significant error must be tolerated, so it is necessary to introduce the slack variables ξi, ξi* in the constraints. Consequently, the optimization problem can be formulated as follows:

minimize (1/2)·‖w‖² + C·Σi=1..l (ξi + ξi*)    (12)

subject to yi − ⟨w, xi⟩ − b ≤ ε + ξi and ⟨w, xi⟩ + b − yi ≤ ε + ξi*    (13)

where the function flatness and the tolerated deviations are dependent on the constant C > 0. The SVR algorithm becomes nonlinear by pre-processing the training instances xi by means of a function Φ: X → F, where F is some feature space. Since SVR only depends on the dot products between the different instances, a kernel k(xi, xj) = ⟨Φ(xi), Φ(xj)⟩ can be used rather than explicitly using the function Φ(·).

In this research, the Pearson VII universal function kernel (PUFK) has been selected:

k(xi, xj) = 1 / [1 + (2·‖xi − xj‖·√(2^(1/ω) − 1) / σ)²]^ω    (14)

in which the parameters σ and ω control the half-width and the tailing factor of the peak. In this study, the best results have been obtained for σ = 0.4 and ω = 0.4.
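scikit-learn has no built-in Pearson VII kernel, but Equation (14) can be supplied as a callable kernel. In the sketch below the kernel parameters follow the values reported above, while C and ε are illustrative assumptions, not the tuned values of the study.

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.metrics.pairwise import euclidean_distances

SIGMA, OMEGA = 0.4, 0.4  # kernel parameters reported as optimal in this study

def pearson_vii_kernel(X, Y):
    """Gram matrix of the Pearson VII universal kernel, Equation (14)."""
    dist = euclidean_distances(X, Y)
    term = 2.0 * dist * np.sqrt(2.0 ** (1.0 / OMEGA) - 1.0) / SIGMA
    return 1.0 / (1.0 + term ** 2) ** OMEGA

# Hedged sketch: regularization constant and tube width are assumed, not calibrated here.
svr = SVR(kernel=pearson_vii_kernel, C=1.0, epsilon=0.1)
```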

2.4. Evaluation Metrics and Cross-Validation

Different evaluation metrics have been used to assess the effectiveness of the predicting models: the coefficient of determination R², the Mean Absolute Error (MAE), the Root Mean Squared Error (RMSE) and the Relative Absolute Error (RAE). These well-known metrics are defined below:

R² = 1 − [Σi=1..m (fi − yi)²] / [Σi=1..m (ya − yi)²]    (15)

where m is the total number of experimental data, fi is the predicted value for data point i, and yi is the experimental value for data point i:

MAE = (1/m) · Σi=1..m |fi − yi|    (16)

RMSE = √[(1/m) · Σi=1..m (fi − yi)²]    (17)

RAE = [Σi=1..m |fi − yi|] / [Σi=1..m |ya − yi|]    (18)

in which ya is the averaged value of the observed data.

Each model has been built by a k-fold cross-validation process, using a set of about 300 vectors. The k-fold cross-validation involves the random subdivision of the original dataset into k subsets. Subsequently, k − 1 subsets are used to train the model, while the k-th single subset is reserved for testing and validating the model. The recurrent cross-validation procedure is carried out k times: each of the k subsets is used once as the testing dataset. Finally, the k results are averaged to obtain a single prediction. In this study, k = 10 has been assumed.
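The four metrics and the 10-fold procedure translate directly into code. The sketch below is a plain restatement of Equations (15)–(18) together with a generic cross-validation loop; the shuffling seed and the model interface (fit/predict) are assumptions of this example.

```python
import numpy as np
from sklearn.model_selection import KFold

def r2(y, f):   return 1.0 - np.sum((f - y) ** 2) / np.sum((np.mean(y) - y) ** 2)  # Eq. (15)
def mae(y, f):  return np.mean(np.abs(f - y))                                      # Eq. (16)
def rmse(y, f): return np.sqrt(np.mean((f - y) ** 2))                              # Eq. (17)
def rae(y, f):  return np.sum(np.abs(f - y)) / np.sum(np.abs(np.mean(y) - y))      # Eq. (18)

def k_fold_scores(model, X, y, k=10, seed=0):
    """Average R2, MAE, RMSE and RAE over a k-fold cross-validation (k = 10 in the study)."""
    scores = []
    for train_idx, test_idx in KFold(n_splits=k, shuffle=True, random_state=seed).split(X):
        model.fit(X[train_idx], y[train_idx])
        f = model.predict(X[test_idx])
        yt = y[test_idx]
        scores.append((r2(yt, f), mae(yt, f), rmse(yt, f), rae(yt, f)))
    return np.mean(scores, axis=0)
```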

3. Results and Discussion

Different models were built to predict the equivalent diameter Deq after one time step, Deq+1, after two time steps, Deq+2, and after three time steps, Deq+3. In the same way, different models were developed to forecast the aspect ratio AR after one time step, AR+1, after two time steps, AR+2, and after three time steps, AR+3.

Based on the input variables, three different models were built for the prediction of Deq and AR. First, a model taking into account a greater number of variables that can affect the investigated phenomena was developed; then two simpler models were built, based on some relevant parameters for the evolution of the bubbles. Therefore, Model 1 is characterized by the following quantities as input variables: the bubble velocity Vb, the bubble depth below the free surface hb, the bubble Weber number Web, the bubble Froude number Frb, the bubble Reynolds number Reb, the Eötvös number Eo and the initial equivalent diameter Deq0. Model 2 admits as input variables Web, Reb and Eo. Model 3 has the following input variables: Web, Reb and Frb. In addition, five variants of each model were developed, changing the implemented machine learning algorithm (a compact summary of the three input sets is sketched below).

Table 1 shows a comparison of the results provided by the different models. With regard to Deq+1, all models showed good predictive capabilities. Model 1 showed the best forecast performance in all its variants. The Bagging algorithm proved to be the most accurate one in this case (R² = 0.9749, MAE = 0.0974 mm, RMSE = 0.1434 mm, RAE = 13.42%), while the SVR-based variant led to less accurate predictions (R² = 0.9472, MAE = 0.1418 mm, RMSE = 0.2020 mm, RAE = 20.18%). RF and K-Star showed a comparable performance, outperforming ARDS.
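For clarity, the three input-variable sets can be written out as a small configuration mapping; the names are hypothetical column labels for the experimental dataset, not identifiers used by the authors.

```python
# Input variables of the three predictive models; the feature names are illustrative.
MODEL_INPUTS = {
    1: ["h_b", "V_b", "We_b", "Fr_b", "Re_b", "Eo", "Deq_0"],
    2: ["We_b", "Re_b", "Eo"],
    3: ["We_b", "Fr_b", "Re_b"],
}
```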

Table 1. Synthesis of the results of equivalent diameter (Deq) prediction models.

Target | Model | Input Variables | Algorithm | R² | MAE [mm] | RMSE [mm] | RAE
Deq+1 | 1 | hb, Vb, Web, Frb, Reb, Eo, Deq0 | ARDS | 0.9671 | 0.1132 | 0.1686 | 15.39%
Deq+1 | 1 | hb, Vb, Web, Frb, Reb, Eo, Deq0 | Bagging | 0.9749 | 0.0974 | 0.1434 | 13.42%
Deq+1 | 1 | hb, Vb, Web, Frb, Reb, Eo, Deq0 | K-Star | 0.9692 | 0.1111 | 0.1582 | 15.27%
Deq+1 | 1 | hb, Vb, Web, Frb, Reb, Eo, Deq0 | RF | 0.9734 | 0.1047 | 0.1489 | 14.33%
Deq+1 | 1 | hb, Vb, Web, Frb, Reb, Eo, Deq0 | SVR | 0.9472 | 0.1418 | 0.2020 | 20.18%
Deq+1 | 2 | Web, Reb, Eo | ARDS | 0.9640 | 0.1209 | 0.1748 | 16.31%
Deq+1 | 2 | Web, Reb, Eo | Bagging | 0.9742 | 0.0973 | 0.1448 | 13.61%
Deq+1 | 2 | Web, Reb, Eo | K-Star | 0.9629 | 0.1152 | 0.1717 | 16.29%
Deq+1 | 2 | Web, Reb, Eo | RF | 0.9676 | 0.1138 | 0.1645 | 15.42%
Deq+1 | 2 | Web, Reb, Eo | SVR | 0.9495 | 0.1466 | 0.2001 | 20.36%
Deq+1 | 3 | Web, Frb, Reb | ARDS | 0.8425 | 0.2710 | 0.3552 | 36.80%
Deq+1 | 3 | Web, Frb, Reb | Bagging | 0.9269 | 0.1706 | 0.2461 | 23.38%
Deq+1 | 3 | Web, Frb, Reb | K-Star | 0.9214 | 0.1749 | 0.2488 | 24.68%
Deq+1 | 3 | Web, Frb, Reb | RF | 0.9512 | 0.1354 | 0.2032 | 18.52%
Deq+1 | 3 | Web, Frb, Reb | SVR | 0.9546 | 0.1388 | 0.1931 | 19.10%
Deq+2 | 1 | hb, Vb, Web, Frb, Reb, Eo, Deq0 | ARDS | 0.9536 | 0.1495 | 0.1936 | 20.04%
Deq+2 | 1 | hb, Vb, Web, Frb, Reb, Eo, Deq0 | Bagging | 0.9650 | 0.1284 | 0.1675 | 17.35%
Deq+2 | 1 | hb, Vb, Web, Frb, Reb, Eo, Deq0 | K-Star | 0.9651 | 0.1278 | 0.1687 | 17.38%
Deq+2 | 1 | hb, Vb, Web, Frb, Reb, Eo, Deq0 | RF | 0.9637 | 0.1346 | 0.1715 | 18.15%
Deq+2 | 1 | hb, Vb, Web, Frb, Reb, Eo, Deq0 | SVR | 0.9413 | 0.1563 | 0.2123 | 22.12%
Deq+2 | 2 | Web, Reb, Eo | ARDS | 0.9499 | 0.1555 | 0.2009 | 20.77%
Deq+2 | 2 | Web, Reb, Eo | Bagging | 0.9590 | 0.1403 | 0.1820 | 18.88%
Deq+2 | 2 | Web, Reb, Eo | K-Star | 0.9481 | 0.1495 | 0.2003 | 20.86%
Deq+2 | 2 | Web, Reb, Eo | RF | 0.9553 | 0.1507 | 0.1898 | 20.32%
Deq+2 | 2 | Web, Reb, Eo | SVR | 0.9434 | 0.1655 | 0.2144 | 22.39%
Deq+2 | 3 | Web, Frb, Reb | ARDS | 0.9261 | 0.1899 | 0.2470 | 25.75%
Deq+2 | 3 | Web, Frb, Reb | Bagging | 0.9168 | 0.1894 | 0.2610 | 26.04%
Deq+2 | 3 | Web, Frb, Reb | K-Star | 0.9170 | 0.1843 | 0.2580 | 25.77%
Deq+2 | 3 | Web, Frb, Reb | RF | 0.9386 | 0.1682 | 0.2254 | 22.85%
Deq+2 | 3 | Web, Frb, Reb | SVR | 0.9410 | 0.1619 | 0.2147 | 22.58%
Deq+3 | 1 | hb, Vb, Web, Frb, Reb, Eo, Deq0 | ARDS | 0.9526 | 0.1437 | 0.1980 | 19.29%
Deq+3 | 1 | hb, Vb, Web, Frb, Reb, Eo, Deq0 | Bagging | 0.9695 | 0.1181 | 0.1589 | 15.84%
Deq+3 | 1 | hb, Vb, Web, Frb, Reb, Eo, Deq0 | K-Star | 0.9639 | 0.1272 | 0.1705 | 17.28%
Deq+3 | 1 | hb, Vb, Web, Frb, Reb, Eo, Deq0 | RF | 0.9713 | 0.1137 | 0.1541 | 15.38%
Deq+3 | 1 | hb, Vb, Web, Frb, Reb, Eo, Deq0 | SVR | 0.9399 | 0.1555 | 0.2153 | 21.92%
Deq+3 | 2 | Web, Reb, Eo | ARDS | 0.9451 | 0.1570 | 0.2107 | 21.10%
Deq+3 | 2 | Web, Reb, Eo | Bagging | 0.9670 | 0.1204 | 0.1670 | 16.17%
Deq+3 | 2 | Web, Reb, Eo | K-Star | 0.9422 | 0.1471 | 0.2094 | 20.86%
Deq+3 | 2 | Web, Reb, Eo | RF | 0.9632 | 0.1285 | 0.1734 | 17.39%
Deq+3 | 2 | Web, Reb, Eo | SVR | 0.9425 | 0.1645 | 0.2133 | 22.58%
Deq+3 | 3 | Web, Frb, Reb | ARDS | 0.8932 | 0.2189 | 0.2950 | 29.47%
Deq+3 | 3 | Web, Frb, Reb | Bagging | 0.8845 | 0.2228 | 0.3036 | 30.51%
Deq+3 | 3 | Web, Frb, Reb | K-Star | 0.9031 | 0.1971 | 0.2772 | 27.79%
Deq+3 | 3 | Web, Frb, Reb | RF | 0.9170 | 0.1974 | 0.2623 | 26.45%
Deq+3 | 3 | Web, Frb, Reb | SVR | 0.9416 | 0.1590 | 0.2162 | 21.74%

Model 2 exhibited forecasting capabilities very similar to those of Model 1. Furthermore, in this case, the Bagging algorithm (R² = 0.9742, MAE = 0.0973 mm, RMSE = 0.1448 mm, RAE = 13.61%) led to the best predictions, while SVR (R² = 0.9495, MAE = 0.1466 mm, RMSE = 0.2001 mm, RAE = 20.36%) was outperformed by all the other considered algorithms. RF showed an accuracy very close to that of Bagging, while the ARDS-based variant of Model 2 slightly outperformed the homologous variant of Model 1.

Model 3 was generally characterized by a significant decline in forecasting capabilities; however, the overall performance was still good. Unlike the previous models, the SVR-based variant (R² = 0.9546, MAE = 0.1388 mm, RMSE = 0.1931 mm, RAE = 19.10%) outperformed all the others. ARDS led to significantly less accurate predictions (R² = 0.8425, MAE = 0.2710 mm, RMSE = 0.3552 mm, RAE = 36.80%), while RF ensured predictions comparable to those of SVR, outperforming both Bagging and K-Star.

As for Deq+2, Model 1 led to slightly less accurate results than those relating to the prediction of Deq+1. Again, the Bagging-based variant of Model 1 (R² = 0.9650, MAE = 0.1284 mm, RMSE = 0.1675 mm, RAE = 17.35%) led to the most accurate predictions. On the other hand, SVR was, again, the worst performing algorithm (R² = 0.9413, MAE = 0.1563 mm, RMSE = 0.2123 mm, RAE = 22.12%). RF and K-Star showed a comparable performance and both slightly outperformed ARDS. Again, Model 2 predictive capabilities were just below those of Model 1, and the Bagging algorithm led to the most accurate forecasts (R² = 0.9590, MAE = 0.1403 mm, RMSE = 0.1820 mm, RAE = 18.88%), whereas SVR (R² = 0.9434, MAE = 0.1655 mm, RMSE = 0.2144 mm, RAE = 22.39%) provided the least accurate outcomes. RF, ARDS and K-Star were characterized by comparable forecasting capabilities in this case. Model 3 was again characterized by a remarkable decrease in the accuracy of forecasts, compared to Model 1 and Model 2, with a manifest increase in errors. As in the case of Deq+1, the SVR-based variant (R² = 0.9410, MAE = 0.1619 mm, RMSE = 0.2147 mm, RAE = 22.58%) led to the best results, while the Bagging-based variant (R² = 0.9168, MAE = 0.1894 mm, RMSE = 0.2610 mm, RAE = 26.04%) provided the worst performance. RF outperformed both ARDS and Bagging. Moreover, the ARDS-based variant outperformed the homologous variant developed to forecast Deq+1.

Regarding Deq+3, the forecasting capabilities of Model 1 were in line with those of the homologous forecasting model of Deq+2. In this case, RF led to the best forecasts (R² = 0.9713, MAE = 0.1137 mm, RMSE = 0.1541 mm, RAE = 15.38%), while ARDS and SVR provided the least accurate results. Once again, the predictions of Model 2 were slightly less accurate than those of Model 1. The variant based on the Bagging algorithm (R² = 0.9670, MAE = 0.1204 mm, RMSE = 0.1670 mm, RAE = 16.17%) proved to be the best performing, while those based on ARDS and SVR also in this case led to the least accurate results. Model 3 confirmed itself as the worst performing model. Among the various variants of Model 3, the one based on SVR (R² = 0.9416, MAE = 0.1590 mm, RMSE = 0.2162 mm, RAE = 21.74%) was confirmed to be the most accurate, significantly outperforming all the others.

A comparative analysis of the three considered models, in all variants, showed that they do not undergo a significant reduction in the forecasting capability of the equivalent diameter of the bubble when gradually passing from Deq+1 to Deq+2 and Deq+3.
Therefore, with a good experimental dataset available, an approach based on machine learning may allow us to predict, with good accuracy, the variation in the bubble volume over time. Model 1 generally leads to the best predictions, but it requires a greater number of input variables and, in particular, it needs the initial depth of the bubble to be known. Therefore, for practical purposes it may be better to use Model 2, whose high forecasting capabilities demonstrate that, to predict the volumetric deformation of the bubble with satisfactory accuracy, it is enough to know the initial values of the Weber, Reynolds and Eötvös numbers. While it was established that, within Model 1 and Model 2, all the considered algorithms are able to provide good predictions, the variants based on the Bagging algorithm led to the best results, while those based on SVR provided the least accurate results. The latter algorithm, on the other hand, is the one that provided the best results within Model 3. However, Model 3 proved to be less accurate than the other two models: the absence of the Eötvös number among the input variables significantly reduced its effectiveness. Moreover, it should be noted that the SVR algorithm has consistently shown the same level of accuracy across all models and input variables. This result highlights its robustness, as its accuracy does not appear to be significantly sensitive to the choice of input parameters.

Figure 6 shows a comparison between the predicted and measured values of Deq+1, relative to Model 1. In addition, it shows the relative error as a function of the measured Deq+1. Overestimation errors were observed more frequently than underestimation errors in all Model 1 variants. Furthermore, the maximum relative errors were observed for bubbles with equivalent diameters less than 2 mm. These errors showed a tendency to decrease as the equivalent diameter grew and generally did not exceed ±25%. The exception was represented by the SVR algorithm. If the bubble diameter was less than 2 mm, SVR generally tended to overestimate the diameter itself and the error could also exceed 30%, while, if the diameter was greater than 4 mm, SVR tended to underestimate the diameter, with an error which did not exceed 15%. If the equivalent bubble diameter was between 2 and 4 mm, the SVR-based model variant provided results comparable to those of the other variants.

6 0.5

0.25 4

0 ) predicted [mm] predicted ) Relative Errors Relative

eq+1 2

(D -0.25

ARDS ARDS 0 -0.5 02460246 (Deq+1) measured [mm] (Deq+1) measured [mm] 6 0.5

0.25 4

0 ) predicted [mm] predicted ) Relative Errors Relative

eq+1 2

(D -0.25

Bagging Bagging 0 -0.5 02460246 (Deq+1) measured [mm] (Deq+1) measured [mm] 6 0.5

0.25 4

0 ) predicted [mm] predicted ) Relative Errors Relative

eq+1 2

(D -0.25

K-Star K-Star 0 -0.5 02460246 (Deq+1) measured [mm] (Deq+1) measured [mm] 6 0.5

0.25 4

0 ) predicted ) eq+1 Relative Errors Relative

(D 2 -0.25

RF RF 0 -0.5 02460246 (Deq+1) measured (Deq+1) measured [mm] Figure 6. Cont. Appl. Sci. 2020, 9, x FOR PEER REVIEW 14 of 22 Appl. Sci. 2020, 10, 3879 14 of 22 Appl. Sci. 2020, 9, x FOR PEER REVIEW 14 of 22 6 0.5

6 0.5

0.25 4 0.25

4 0 ) predicted ) eq+1

Relative Errors Relative 0

(D 2 ) predicted )

eq+1 -0.25 Relative Errors Relative

(D 2 -0.25 SVR SVR 0 -0.5 02460246 SVR SVR 0 (Deq+1) measured -0.5 (Deq+1) measured [mm] 0246 0246 (Deq+1) measured (Deq+1) measured [mm] FigureFigure 6.6. Equivalent diameter prediction “Model“Model 1”:1”: in the left column, predicted versus experimental values, in the right column, residuals versus experimental values. Figurevalues, 6. in Equivalent the right column, diameter residuals prediction versus “Model experimental 1”: in the left values. column, predicted versus experimental values, in the right column, residuals versus experimental values. The comparison between the predicted and measured values of D shown in Figure7 relates to The comparison between the predicted and measured values ofeq D+1eq+1 shown in Figure 7 relates Model 2. As in the previous case, the same figure also shows the relative error as a function of the to ModelThe comparison 2. As in the betweenprevious thecase, predicted the same and figure measured also shows values the of relative Deq+1 shown error asin aFigure function 7 relates of the experimental Deq . The error distributions were quite similar to those of Model 1, while the maximum toexperimental Model 2. As D ineq++1 1the. The previous error distributions case, the same were figure quite also simila showsr to those the relative of Model error 1, while as a function the maximum of the relative errors were observed when Deq < 2.5 mm. In this case, both K-Star and SVR led to relative experimentalrelative errors D wereeq+1. The observed error distributions when Deq < were 2.5 mm. quite In simila this case,r to those both of K-Star Model and 1, while SVR theled maximumto relative errors, in some cases greater than 30% in the forecasts for smaller bubbles. relativeerrors, in errors some were cases observed greater than when 30% D eqin < the 2.5 forecasts mm. In thisfor smallercase, both bubbles. K-Star and SVR led to relative errors, in some cases greater than 30% in the forecasts for smaller bubbles. 6 0.5

6 0.5

0.25 4 0.25

4 0 ) predicted [mm] predicted )

2 Errors Relative 0 eq+1 (D

) predicted [mm] predicted ) -0.25

2 Errors Relative eq+1

(D -0.25 ARDS ARDS 0 -0.5 02460246 ARDS ARDS 0 (Deq+1) measured [mm] -0.5 (Deq+1) measured [mm] 0246 0246 6 0.5 (Deq+1) measured [mm] (Deq+1) measured [mm] 6 0.5

0.25 4 0.25

4 0 ) predicted [mm] predicted )

Relative Errors Relative 0

eq+1 2 (D

) predicted [mm] predicted ) -0.25 Relative Errors Relative

eq+1 2

(D -0.25 Bagging Bagging 0 -0.5 02460246 Bagging Bagging 0 (Deq+1) measured [mm] -0.5 (Deq+1) measured [mm] 0246 0246 6 0.5 (Deq+1) measured [mm] (Deq+1) measured [mm] 6 0.5

0.25 4 0.25

4 0 ) predicted [mm] predicted )

Relative Errors Relative 0

eq+1 2 (D

) predicted [mm] predicted ) -0.25 Relative Errors Relative

eq+1 2

(D -0.25 K-Star K-Star 0 -0.5 02460246 K-Star K-Star 0 (Deq+1) measured [mm] -0.5 (Deq+1) measured [mm] 0246 0246

(Deq+1) measured [mm] (Deq+1) measured [mm] Figure 7. Cont. Appl. Sci. 2020, 10, 3879 15 of 22 Appl. Sci. 2020, 9, x FOR PEER REVIEW 15 of 22

6 0.5 Appl. Sci. 2020, 9, x FOR PEER REVIEW 15 of 22

6 0.50.25 4

0.25 0 4 ) predicted [mm] predicted ) Relative Errors Relative

eq+1 2 0

(D -0.25

) predicted predicted ) [mm] Relative Errors Relative

eq+1 2

(D -0.25 RF RF 0 -0.5 02460246 RF RF 0 (Deq+1) measured [mm] -0.5 (Deq+1) measured [mm] 0 2 4 6 0 2 4 6 6 (Deq+1) measured [mm] 0.5 (Deq+1) measured [mm] 6 0.5

0.25 4 0.25 4

0 0

) predicted [mm] predicted ) ) predicted predicted ) [mm] Relative Errors Relative

eq+1 2 Relative Errors Relative

eq+1 2 (D -0.25 (D -0.25

SVRSVR SVR SVR 0 0 -0.5-0.5 02460 2 4 6 0 02462 4 6 (D ) measured [mm] (D ) measured [mm] (Deq+1)eq+1 measured [mm] eq+1(Deq+1) measured [mm]

Figure 7. Equivalent diameter prediction “Model 2”: in the left column, predicted versus experimental Figure 7. Equivalent diameter prediction “Model 2”: in the left column, predicted versus experimental values,Figure 7. in Equivalent the right column, diameter residuals prediction versus “Model experimental 2”: in the values.left column, predicted versus experimental values, in the right column, residuals versus experimental values. values, in the right column, residuals versus experimental values. An alternative effective representation of the models’ errors in predicting Deq+1, in terms of An alternative effective representation of the models’ errors in predicting Deq+1, in terms of residuals (i.e., absolute errors), is provided by the notched box plots in Figure8. Theseeq+1 plots show residualsAn alternative (i.e., ab soluteeffective errors), representation is provided by of the the notched models’ box plotserrors in Figurein predicting 8. These plotsD show, in terms the of theresiduals rangerange (i.e., ofof errorserrors absolute with with errors),respect respect tois to theprovided the experimental experimental by the values. notched values. The box lower The plots lower end in of Figure endeach of box 8. each These plot box denotes plots plot show the denotes the therange lowerlower of errors quartile quartile with Q1 Q1 respect(25th (25th percentile), percentile), to the experimental the the upper upper end values. enddenotes denotes Thethe upper lower the upper quartile end of quartile Q3 each (75th box Q3 percentile), (75thplot denotes percentile), the the thelower bandband quartile inside inside Q1 thethe (25th boxbox representspercentile), the the the median median upper of ofend the the dedata. data.notes The The the whiskers whiskersupper extend quartile extend from Q3 from the (75th bottom the percentile), bottom of the of the boxband tobox inside the to smallestthe the smallest box non-outlier represents non-outlier andthe and median from from thethe of toptop the of data. the the box boxThe to towhiskers the the highest highest extend non non-outlier-outlier from inthe the inbottom dataset. the dataset. of the Thebox boxtoThe the plotsbox smallest plots further furthe non-outlier confirmr confirm what and what from was was previously thepreviously top of theargued argued box in to in terms the terms highest of ofthe the accuracy non-outlier accuracy of the of in different the the di dataset.fferent models.The boxmodels. Itplots can It canfurther also also be be notedconfirm noted that thatwhat the the was variants variants previously ofof the models models argued based basedin terms on on the theof Bagging the Bagging accuracy and andRF algorithmsof RF the algorithms different have their residual distributions characterized by greater symmetry than the others. havemodels. their It residualcan also distributionsbe noted that characterizedthe variants of by the greater models symmetry based on than the Bagging the others. and RF algorithms have their residual distributions characterized by greater symmetry than the others.

Figure 8. Equivalent diameter prediction: box plots of the residuals. Figure 8. Equivalent diameter prediction: box plots of the residuals. Figure9 shows the Deq time series of two of the analyzed bubbles. Again, it can be observed that all models’Figure variants 9 shows provided the Deq predictionstime series ofin two good of the agreement analyzed bubbles. with the Again, experimental it can be observed data. In addition,that it is interestingall models’ tovariants noteFigure thatprovided 8. higher Equivalent predictions errors diameter were in good observed prediction: agreement in box correspondence with plots the of expe the rimentalresiduals. with higher data. In changes addition, in the experimentalit is interesting values to betweennote that twohigher consecutive errors were time observed steps. in Finally,correspondence it may bewith worth higher highlighting changes in that the experimental values between two consecutive time steps. Finally, it may be worth highlighting the algorithmFigure 9 shows that provided the Deq thetime most series accurate of two predictionsof the analyzed for a bu singlebbles. bubble Again, may it can be di beff erentobserved from that the that the algorithm that provided the most accurate predictions for a single bubble may be different algorithmall models’ that variants was the provided most accurate predictions in the in entiregood agreement dataset. with the experimental data. In addition, it is interestingfrom the algorithm to note that that was higher the most errors accurate were in observed the entire in dataset. correspondence with higher changes in the experimental values between two consecutive time steps. Finally, it may be worth highlighting that the algorithm that provided the most accurate predictions for a single bubble may be different from the algorithm that was the most accurate in the entire dataset. Appl. Sci. 2020, 10, 3879 16 of 22 Appl. Sci. 2020, 9, x FOR PEER REVIEW 16 of 22

[Figure 9: two panels, (a) and (b), plotting Deq [mm] versus time step (frame number) for the experimental data and the ARDS, Bagging, K-Star, RF and SVR variants.]

Figure 9. Comparison among equivalent diameters by means of time series; (a) Model 1—bubble 1; (b) Model 2—bubble 12.

The aspect ratio (AR) results are summarized in Table 2. The accuracy of the prediction models was significantly lower than that of the analogous forecast models of the equivalent diameter.

Table 2. Synthesis of the results of aspect ratio (AR) prediction models.

        Model Number   Input Variables                   Algorithm   R2       MAE      RMSE     RAE
AR+1    1              hb, Vb, Web, Frb, Reb, Eo, Deq0   ARDS        0.5376   0.0244   0.0317   67.89%
                                                         Bagging     0.4651   0.0254   0.0342   69.85%
                                                         K-Star      0.3505   0.0277   0.0372   76.64%
                                                         RF          0.4002   0.0276   0.0384   72.71%
                                                         SVR         0.2976   0.0272   0.0373   76.96%
        2              Web, Reb, Eo                      ARDS        0.4795   0.0283   0.0372   72.86%
                                                         Bagging     0.3884   0.0286   0.0376   77.28%
                                                         K-Star      0.3573   0.0277   0.0369   77.11%
                                                         RF          0.3302   0.0303   0.0407   79.89%
                                                         SVR         0.3637   0.0279   0.0374   76.38%
        3              Web, Frb, Reb                     ARDS        0.1463   0.0400   0.0540   107.30%
                                                         Bagging     0.1957   0.0321   0.0429   85.76%
                                                         K-Star      0.2257   0.0311   0.0417   83.70%
                                                         RF          0.2038   0.0303   0.0406   84.40%
                                                         SVR         0.3023   0.0287   0.0387   79.55%
AR+2    1              hb, Vb, Web, Frb, Reb, Eo, Deq0   ARDS        0.4628   0.0284   0.0385   72.66%
                                                         Bagging     0.3401   0.0294   0.0386   80.03%
                                                         K-Star      0.3783   0.0308   0.0420   77.99%
                                                         RF          0.3772   0.0300   0.0409   78.33%
                                                         SVR         0.4035   0.0280   0.0371   75.81%
        2              Web, Reb, Eo                      ARDS        0.3354   0.0332   0.0439   83.01%
                                                         Bagging     0.3659   0.0312   0.0404   80.95%
                                                         K-Star      0.4459   0.0315   0.0411   77.78%
                                                         RF          0.4416   0.0311   0.0417   76.21%
                                                         SVR         0.4242   0.0332   0.0432   80.22%
        3              Web, Frb, Reb                     ARDS        0.0557   0.0434   0.0547   105.91%
                                                         Bagging     0.3282   0.0349   0.0434   86.55%
                                                         K-Star      0.4226   0.0328   0.0411   80.80%
                                                         RF          0.3466   0.0344   0.0448   84.39%
                                                         SVR         0.4538   0.0334   0.0422   80.08%
AR+3    1              hb, Vb, Web, Frb, Reb, Eo, Deq0   ARDS        0.1364   0.0358   0.0468   93.19%
                                                         Bagging     0.3394   0.0301   0.0393   80.06%
                                                         K-Star      0.1662   0.0345   0.0457   89.14%
                                                         RF          0.3065   0.0315   0.0413   81.96%
                                                         SVR         0.2293   0.0301   0.0403   83.25%
        2              Web, Reb, Eo                      ARDS        0.4327   0.0295   0.0388   75.35%
                                                         Bagging     0.3878   0.0302   0.0389   78.77%
                                                         K-Star      0.3258   0.0314   0.0399   84.21%
                                                         RF          0.3190   0.0315   0.0411   82.30%
                                                         SVR         0.3232   0.0294   0.0396   79.09%
        3              Web, Frb, Reb                     ARDS        0.0371   0.0406   0.0511   104.77%
                                                         Bagging     0.1654   0.0355   0.0449   92.77%
                                                         K-Star      0.2369   0.0337   0.0430   88.32%
                                                         RF          0.2354   0.0336   0.0430   88.87%
                                                         SVR         0.3501   0.0322   0.0411   83.04%
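The four metrics reported in Table 2 can be computed as in the minimal sketch below. The arrays are placeholders, and the relative absolute error (RAE) is assumed here to be the total absolute error normalized by that of a predictor that always returns the mean of the observations; this common definition is an assumption of the sketch, not a statement of the paper.

```python
import numpy as np

def metrics(y_true, y_pred):
    """Return R2, MAE, RMSE and RAE (in %) for a set of predictions."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    err = y_pred - y_true
    r2 = 1.0 - np.sum(err**2) / np.sum((y_true - y_true.mean())**2)
    mae = np.mean(np.abs(err))
    rmse = np.sqrt(np.mean(err**2))
    rae = 100.0 * np.sum(np.abs(err)) / np.sum(np.abs(y_true - y_true.mean()))
    return r2, mae, rmse, rae

# Placeholder values, for illustration only.
print(metrics([0.42, 0.38, 0.45, 0.40], [0.41, 0.39, 0.43, 0.42]))
```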

With regard to AR+1, Model 1 showed again the best prediction capability and the ARDS algorithm was the most accurate one (R2 = 0.5376, MAE = 0.0244, RMSE = 0.0317, RAE = 67.89%), while all the other considered algorithms showed comparable performances. Model 2 showed an appreciable deterioration in forecasting capabilities compared to Model 1, except for the SVR-based variant. The results of Model 3 were even less accurate, especially those of the ARDS-based variant.

As for AR+2, Model 1 outcomes were comparable to the results of the homologous predictive model of AR+1, in some cases even slightly better. Model 2 predictions also had an accuracy similar to that of Model 1. In particular, from the comparison with Model 1, by analyzing the results of the different Model 2 variants, it could be observed that ARDS was characterized by a deterioration of the predictive capabilities, while K-Star showed an improvement. For the other algorithms, the different metrics led to conflicting results. Model 3 proved to be the least accurate once again, except for the SVR-based variant, which outperformed the analogous variant of Model 2. As for AR+3, Model 2 outperformed Model 1 in all variants, while Model 3 led to the least accurate predictions, except for the SVR-based variant, the results of which are comparable to those of the analogous variant of Model 2.

Figure 10 shows a comparison between the predicted and experimental values of AR+1, with reference to Model 1. In addition, it shows the relative error versus the measured AR+1.
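Plots of this kind (predicted versus measured values on the left, relative errors on the right) can be produced with a few lines of matplotlib, as in the sketch below; the measured and predicted arrays are hypothetical stand-ins for the AR+1 data.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical measured and predicted AR+1 values for one model variant.
rng = np.random.default_rng(1)
ar_measured = rng.uniform(0.25, 0.55, 150)
ar_predicted = ar_measured + rng.normal(0.0, 0.03, ar_measured.size)
relative_errors = (ar_predicted - ar_measured) / ar_measured

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(9, 4))
ax1.scatter(ar_measured, ar_predicted, s=10)
ax1.plot([0.0, 0.6], [0.0, 0.6], "k--", linewidth=0.8)  # perfect-agreement line
ax1.set_xlabel("(AR+1) measured")
ax1.set_ylabel("(AR+1) predicted")

ax2.scatter(ar_measured, relative_errors, s=10)
ax2.axhline(0.0, color="grey", linewidth=0.8)
ax2.set_xlabel("(AR+1) measured")
ax2.set_ylabel("Relative errors")
plt.tight_layout()
plt.show()
```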

[Figure 10: five pairs of panels, one pair per algorithm (ARDS, Bagging, K-Star, RF, SVR); the left panels plot (AR+1) predicted versus (AR+1) measured, the right panels plot the relative errors versus (AR+1) measured.]

Figure 10. Aspect ratio prediction “Model 1”: in the left column, predicted versus experimental values; in the right column, residuals versus experimental values.

It is evident that the results which most negatively affect the metrics refer to AR < 0.3. In this range, the model outcomes were particularly poor, mainly due to the scarcity of training data falling within this range. However, it should be noted that the models’ predictions cannot be considered satisfactory over the entire investigated experimental range. An important limitation of the models is that their input variables do not include a parameter that adequately takes into account the stress state on the surface of the bubbles (i.e., the pressure and shear stress in the liquid surrounding the bubble). Unfortunately, measurements carried out via the shadowgraph technique are not able to overcome this limitation. For the sake of brevity, the diagrams relating to the similar Model 2 results and the poorer Model 3 results are not shown.

Figure 11 shows the notched box plots of the residuals in the AR+1 predictions. The box plots also show quite clearly that Model 1 outperforms Model 2, which, in turn, outperforms Model 3. Furthermore, the number of outliers is significant. Finally, the distributions of the residuals are slightly asymmetric, with a prevalence of positive values.

Figure 11. Aspect ratio prediction: box plots of the residuals.

Figure 12 shows the AR time series of two of the analyzed bubbles. The results are better than what was expected on the basis of the analysis carried out above. They confirm that the model evaluation metrics have been significantly worsened by the results relating to bubbles characterized by very low aspect ratios. Therefore, it is reasonable to believe that the forecast models based on machine learning algorithms are able to provide accurate predictions of the aspect ratio of single bubbles, when used for bubbles whose characteristics are comparable to those used for model training.

0.5 0.5 0.45 0.45

0.45 0.45 0.4 0.4 AR AR

0.4 Experimental data 0.4 Experimental data AR AR ARDS ARDS Bagging 0.35 Experimental data 0.35 Bagging K-Star Experimental data ARDS K-Star RF ARDS 0.35 Bagging 0.35 RF SVR Bagging K-Star K-StarSVR 0.3 RF 0.3 RF 0 5 10 15 20SVR 25 0SVR 5 10 15 20 25 0.3 0.3 0 5 Time step10 (Frame number)15 20 25 0 5 Time10 step (Frame15 number) 20 25 Time step (Frame number) Time step (Frame number) (a) (b) (a) (b) FigureFigure 12. 12.Comparison Comparison among among aspect aspect ratios ratios by by means means of timeof time series; series (a );(a) Model Model 1—bubble 1—bubble 1; (b )1; Model (b) Figure 12. Comparison among aspect ratios by means of time series ;(a) Model 1—bubble 1; (b) 2—bubbleModel 2—bubble 12. 12. Model 2—bubble 12. ModelModel 2, based2, based on on the the numbers numbers of of Weber, Weber, ReynoldsReynolds and Eötvös only, only, has has proven proven to to be be able able to to Model 2, based on the numbers of Weber, Reynolds and Eötvös only, has proven to be able to predict,predict, with with good good approximation, approximation, the the evolution evolution of singleof single bubbles, bubbles, in termsin terms of equivalentof equivalent diameter diameter and predict, with good approximation, the evolution of single bubbles, in terms of equivalent diameter and aspect ratio, in a field of uncertain pressures originating from a plunging jet. The combination of aspectand ratio, aspect in aratio, field in of a uncertainfield of uncertain pressures pressures originating originating from from a plunging a plunging jet. jet. The The combination combination of Web, Web, Frb and Reb leads to an additional dimensionless parameter, indicated as Mo: Frb andWeReb, Frb bleads and Re tob leads an additional to an additional dimensionless dimensionless parameter, parameter indicated, indicated as as Morton Morton number numberMo Mo:: 3 3 Web Mo = Web3 Mo  We24 (19) Fr24⋅Reb (19) Mo = Fr bbRe (19) 2bb4 Frb Reb However,However, it itis is not not useful useful to to introduce introduce a newnew· parameterparameter in intoto prediction prediction models models that that is ais a combinationHowever,combination it of is of notalready already useful considered considered to introduce parameters, parameters, a new parameter nornor isis it intoit possiblepossible prediction to to define define models a a simple thatsimple is functionala combinationfunctional of alreadyrelationshiprelationship considered between between parameters, the the deformation deformation nor is itof of possible a single to bubblebubble define andand a simple the the Morton Morton functional number. number. relationship Therefore, Therefore, between this this theparameter deformationparameter has has ofnot not a been single been considered considered bubble and inin thisthis the study. Morton number. Therefore, this parameter has not been consideredOverall,Overall, in this preliminary preliminary study. elaborations elaborations have shown that,that, with with regard regard to to the the equivalent equivalent diameter, diameter, the the numbernumber of oftracked tracked bubbles bubbles is is sufficient sufficient for the convergenceconvergence ofof thethe model. model. If If a singlea single bubble bubble is is Appl. Sci. 2020, 10, 3879 20 of 22
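As a quick numerical check of Equation (19), the sketch below evaluates Mo from illustrative values of the three dimensionless numbers; the values are hypothetical and are not taken from the measurements of this study. For an air–water system, Mo is known to be very small (of the order of 10^-11).

```python
# Illustrative (hypothetical) dimensionless numbers for a single bubble.
we_b = 2.0    # Weber number
fr_b = 1.5    # Froude number
re_b = 600.0  # Reynolds number

# Equation (19): Morton number from the Weber, Froude and Reynolds numbers.
mo = we_b**3 / (fr_b**2 * re_b**4)
print(f"Mo = {mo:.3e}")
```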

Overall, preliminary elaborations have shown that, with regard to the equivalent diameter, the number of tracked bubbles is sufficient for the convergence of the model: if a single bubble is excluded from the training dataset, the developed models still lead to results quite close to those obtained from the entire dataset. Conversely, regarding the aspect ratio, a greater number of bubbles in the AR < 0.3 range could improve the accuracy of the models.
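The exclusion test described above amounts to a leave-one-bubble-out validation. A minimal sketch of such a check is given below; it assumes a hypothetical feature matrix with a bubble-identifier array and uses scikit-learn's LeaveOneGroupOut splitter as one possible implementation, not necessarily the procedure followed in the study.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import LeaveOneGroupOut

# Hypothetical dataset: rows are bubble observations, "groups" holds bubble IDs.
rng = np.random.default_rng(2)
X = rng.normal(size=(300, 6))           # e.g. Web, Frb, Reb, Eo, hb, Vb
y = rng.normal(3.5, 0.4, size=300)      # e.g. Deq at the next time step [mm]
groups = rng.integers(1, 21, size=300)  # 20 tracked bubbles

maes = []
for train_idx, test_idx in LeaveOneGroupOut().split(X, y, groups):
    model = RandomForestRegressor(n_estimators=200, random_state=0)
    model.fit(X[train_idx], y[train_idx])
    maes.append(mean_absolute_error(y[test_idx], model.predict(X[test_idx])))

# If the per-bubble errors stay close to the error on the full dataset,
# the number of tracked bubbles can be considered sufficient.
print(f"Leave-one-bubble-out MAE: {np.mean(maes):.3f} +/- {np.std(maes):.3f}")
```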

4. Conclusions

A reliable prediction of the deformation of air bubbles in a water volume in the presence of an uncertain pressure field is an important aspect of the study of phenomena such as cavitation, air entrainment and foaming. If suitable experimental data are available, machine learning algorithms can provide powerful tools to obtain accurate predictions.

In this study, three different models were built to predict the equivalent diameter and aspect ratio of air bubbles moving near a plunging jet. Five variants of each model were developed, varying the implemented machine learning algorithm: Additive Regression of Decision Stump, Bagging, K-Star, Random Forest and Support Vector Regression. The experimental data were obtained from measurements carried out using a shadowgraph technique.

As for the prediction of the equivalent diameter, Model 1 provided the best predictions in most cases, but it needs a higher number of input variables, including the initial depth of the bubble. Model 2 provided results comparable to those of Model 1, but requires only the Reynolds, Weber and Eötvös numbers as inputs. The lower accuracy of Model 3 can be attributed to the absence of the Eötvös number among the input variables. Within Model 1 and Model 2, the variant based on the Bagging algorithm led to the best results in most cases, while the variant based on SVR led to the least accurate results. On the other hand, the SVR-based variant provided the best results under Model 3.

Regarding the forecast of the aspect ratio, the results of the three models cannot be considered fully satisfactory in general, even if, for some bubbles, the models provided very good predictions. The unsatisfactory results can be attributed to the training dataset, which does not include an adequate number of bubbles with low aspect ratio values, and probably to the limits of the models, whose input variables do not adequately take into account the stresses acting on the surface of the bubbles.

From what is shown in this study, it is therefore clear that a combined approach based on a shadowgraph technique and machine learning algorithms can be considered a promising way to address complex problems in the study of multi-phase flows.
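For readers who wish to experiment with a similar setup, the sketch below assembles rough scikit-learn analogues of the five algorithm variants. These are illustrative stand-ins chosen by the editor, not the implementations used in the study; in particular, K-Star, an instance-based learner with an entropic distance measure, has no direct scikit-learn counterpart and is loosely approximated here by a distance-weighted k-nearest-neighbours regressor.

```python
from sklearn.ensemble import (BaggingRegressor, GradientBoostingRegressor,
                              RandomForestRegressor)
from sklearn.neighbors import KNeighborsRegressor
from sklearn.svm import SVR

# Approximate analogues of the five variants (illustrative only).
variants = {
    # Additive regression of decision stumps: stagewise boosting of depth-1 trees.
    "ARDS (analogue)": GradientBoostingRegressor(max_depth=1, n_estimators=200,
                                                 learning_rate=0.1, random_state=0),
    "Bagging": BaggingRegressor(n_estimators=100, random_state=0),
    "K-Star (analogue)": KNeighborsRegressor(n_neighbors=5, weights="distance"),
    "RF": RandomForestRegressor(n_estimators=200, random_state=0),
    "SVR": SVR(kernel="rbf", C=10.0, epsilon=0.01),
}

# All variants share the same interface, e.g.:
# variants["RF"].fit(X_train, y_train); y_pred = variants["RF"].predict(X_test)
```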

Author Contributions: Conceptualization, F.G.; methodology, F.G. and F.D.N.; software, F.G.; validation, F.D.N., F.A.P., F.D.F., G.d.M., R.G. and M.M.; formal analysis: F.G. and F.D.N.; investigation: F.G., F.D.N., F.A.P., F.D.F., G.d.M., R.G. and M.M.; data curation: F.D.N.; writing—original draft preparation, F.G. and F.D.N.; writing—review and editing: F.A.P., F.D.F., G.d.M., M.M. and R.G.; supervision: F.G. All authors have read and agreed to the published version of the manuscript.

Funding: The work described here was conducted as part of the Italian Ministry of Education, University and Research 2017 PRIN Project ACE, grant 2017RSH3JY.

Conflicts of Interest: The authors declare no conflict of interest.

References

1. Rayleigh, L. On the pressure developed in a liquid during the collapse of a spherical cavity. Philos. Mag. 1919, 34, 94–98. [CrossRef] 2. Plesset, M.S. The dynamics of cavitation bubbles. J. Appl. Mech. 1949, 16, 228–231. [CrossRef] 3. Gilmore, F.R. The Growth or Collapse of a Spherical Bubble in a Viscous Compressible Liquid; California Institute of Technology: Pasadena, CA, USA, 1952; Unpublished. 4. Fujikawa, S.; Akamatsu, T. Effects of the non-equilibrium condensation of vapour on the pressure wave produced by the collapse of a bubble in a liquid. J. Fluid Mech. 1980, 97, 3–512. [CrossRef] 5. Keller, J.B.; Miksis, M. Bubble oscillations of large amplitude. J. Acoust. Soc. Am. 1980, 68, 2–633. [CrossRef]

6. Prosperetti, A.; Crum, L.A.; Commander, K.W. Nonlinear bubble dynamics. J. Acoust. Soc. Am. 1988, 83, 2–514. [CrossRef] 7. Fujiwara, A.; Danmoto, Y.; Hishida, K.; Maeda, M. Bubble deformation and flow structure measured by double shadow images and PIV/LIF. Exp. Fluids 2004, 36, 1–165. [CrossRef] 8. Cao, Y.; Kawara, Z.; Yokomine, T.; Kunugi, T. Experimental and numerical study on nucleate bubble deformation in subcooled flow boiling. Int. J. Multiph. Flow 2016, 82, 93–105. [CrossRef] 9. Feng, J.; Cao, J.; Jiang, J.; Sun, B.; Cai, Z.; Gao, Z. Single bubble breakup in the flow field induced by a horizontal jet—The experimental research. Asia Pac. J. Chem. Eng. 2019, 14, 1. [CrossRef] 10. Korobeynikov, S.M.; Ridel, A.V.; Medvedev, D.A. Deformation of bubbles in transformer oil at the action of alternating electric field. Eur. J. Mech. B Fluids 2019, 75, 105–109. [CrossRef] 11. Ekambara, K.; Dhotre, M.T.; Joshi, J.B. CFD simulations of bubble column reactors: 1D, 2D and 3D approach. Chem. Eng. Sci. 2005, 60, 23–6746. [CrossRef] 12. Kerdouss, F.; Bannari, A.; Proulx, P. CFD modeling of gas dispersion and bubble size in a double turbine stirred tank. Chem. Eng. Sci. 2006, 61, 10–3322. [CrossRef] 13. Rzehak, R.; Krepper, E. CFD modeling of bubble-induced turbulence. Int. J. Multiph. Flow 2013, 55, 138–155. [CrossRef] 14. Fletcher, D.F.; McClure, D.D.; Kavanagh, J.M.; Barton, G.W. CFD simulation of industrial bubble columns: Numerical challenges and model validation successes. Appl. Math. Model. 2017, 44, 25–42. [CrossRef] 15. Pfister, M.; Chanson, H. Two-phase air-water flows: Scale effects in physical modeling. J. Hydrodyn. Ser. B 2014, 26, 2–298. [CrossRef] 16. Schmidt, E. Ähnlichkeitstheorie der Bewegung von Flüssigkeitsgasgemsichen (Similarity Theory of Motion in Fluid-Gas Mixtures). Forschungsheft 1934, 365, 1–3. (In German) 17. Clift, R.; Grace, J.; Weber, M. Bubbles, Drops, and Particles; Academic Press: New York, NY, USA, 1978; p. 380. 18. Tripathi, M.; Sahu, K.; Govindarajan, R. Dynamics of an initially spherical bubble rising in quiescent liquid. Nat. Commun. 2015, 6, 6268. [CrossRef] 19. Cummings, P.D.; Chanson, H. Air Entrainment in the Developing Flow Region of Plunging Jets—Part 1: Theoretical Development. J. Fluids Eng. Trans. ASME 1997, 119, 3–602. [CrossRef] 20. Ervine, D.A.; Falvey, H.T.; Withers, W.A. Pressure fluctuations on plunge pool floors. J. Hydraul. Res. 1997, 35, 257–279. [CrossRef] 21. Liao, Y.; Rzehak, R.; Lucas, D.; Krepper, E. Baseline closure model for dispersed bubbly flow: Bubble coalescence and breakup. Chem. Eng. Sci. 2015, 122, 336–349. [CrossRef] 22. Lunde, K.; Perkins, R.J. Shape Oscillations of Rising Bubbles. Flow Turbul. Combust. 1997, 58, 387–408. 23. Avdeev, A.A. Bubble Systems; Springer International Publishing: Berlin/Heidelberg, Germany, 2016. 24. Solomatine, D.P.; Xue, Y. M5 Model Trees and Neural Networks: Application to Flood Forecasting in the Upper Reach of the Huai River in China. J. Hydrol. Eng. 2004, 9, 6–501. [CrossRef] 25. Ahmad, S.; Kalra, A.; Stephen, H. Estimating soil moisture using remote sensing data: A machine learning approach. Adv. Water Resour. 2010, 33, 1–80. [CrossRef] 26. Nourani, V.; Kisi, Ö.; Komasi, M. Two hybrid artificial intelligence approaches for modeling rainfall–runoff process. J. Hydrol. 2011, 402, 41–59. [CrossRef] 27. Kisi, O. Modeling discharge-suspended sediment relationship using least square support vector machine. J. Hydrol. 2012, 456, 110–120. [CrossRef] 28. Najafzadeh, M.; Etemad-Shahidi, A.; Lim, S.Y. 
Scour prediction in long contractions using ANFIS and SVM. Ocean Eng. 2016, 111, 128–135. [CrossRef] 29. Granata, F.; Papirio, S.; Esposito, G.; Gargano, R.; De Marinis, G. Machine learning algorithms for the forecasting of wastewater quality indicators. Water 2017, 9, 2. [CrossRef] 30. Naghibi, S.A.; Ahmadi, K.; Daneshi, A. Application of support vector machine, random forest, and genetic algorithm optimized random forest models in groundwater potential mapping. Water Resour. Manag. 2017, 31, 9–2775. [CrossRef] 31. Granata, F. Evapotranspiration evaluation models based on machine learning algorithms—A comparative study. Agric. Water Manag. 2019, 217, 303–315. [CrossRef] 32. Arnon, T.A.; Ezra, S.; Fishbain, B. Water characterization and early contamination detection in highly varying stochastic background water, based on Machine Learning methodology for processing real-time UV-Spectrophotometry. Water Res. 2019, 155, 333–342. [CrossRef]

33. Granata, F.; Di Nunno, F.; Gargano, R.; de Marinis, G. Equivalent Discharge Coefficient of Side Weirs in Circular Channel—A Lazy Machine Learning Approach. Water 2019, 11, 11. [CrossRef] 34. Granata, F.; Gargano, R.; de Marinis, G. Artificial intelligence based approaches to evaluate actual evapotranspiration in wetlands. Sci. Total Environ. 2020, 703, 135653. [CrossRef][PubMed] 35. Shaban, H.; Tavoularis, S. Measurement of gas and liquid flow rates in two-phase pipe flows by the application of machine learning techniques to differential pressure signals. Int. J. Multiph. Flow 2014, 67, 106–117. [CrossRef] 36. Granata, F.; de Marinis, G. Machine learning methods for wastewater hydraulics. Flow Meas. Instrum. 2017, 57, 1–9. [CrossRef] 37. Mosavi, A.; Shamshirband, S.; Salwana, E.; Chau, K.W.; Tah, J.H. Prediction of multi-inputs bubble column reactor using a novel hybrid model of computational fluid dynamics and machine learning. Eng. Appl. Comput. Fluid Mech. 2019, 13, 1–492. [CrossRef] 38. Li, Q.; Griffiths, J.G. Least squares ellipsoid specific fitting. In Proceedings of the Geometric Modeling and Processing, Beijing, China, 13–15 April 2004; pp. 335–340. 39. Pereira, F.; Avellan, F.; Dupont, P. Prediction of cavitation erosion: An energy approach. J. Fluids Eng. 1998, 120, 4–727. [CrossRef] 40. Di Nunno, F.; Alves Pereira, F.; Granata, F.; de Marinis, G.; Di Felice, F.; Gargano, R.; Miozzi, M. A shadowgraphy approach for the 3D Lagrangian description of bubbly flows. Meas. Sci. Technol. 2020, in press. [CrossRef] 41. Tsai, R.Y. An Efficient and Accurate Camera Calibration Technique for 3D Machine Vision. In Proceedings of the Computer Vision and Pattern Recognition, Miami Beach, FL, USA, 22–26 June 1986; pp. 364–374. 42. Trucco, E.; Verri, A. Introductory Techniques for 3-D Computer Vision; Prentice Hall: Englewood Cliffs, NJ, USA, 1998; p. 343. 43. Zhang, Z. A flexible new technique for camera calibration. IEEE Trans. Pattern Anal. Mach. Intell. 2000, 22, 1330–1334. [CrossRef] 44. Fu, Y.; Liu, Y. 3D bubble reconstruction using multiple cameras and space carving method. Meas. Sci. Technol. 2018, 29, 7. [CrossRef] 45. Ridler, T.; Calvard, S. Picture thresholding using an iterative selection method. IEEE Trans. Syst. Man Cybern. SMC-8 1978, 8, 630–632. 46. Vincent, L.; Soille, P. Watersheds in digital spaces: An efficient algorithm based on immersion simulations. IEEE Trans. Pattern Anal. Mach. Intell. 1991, 13, 6–598. [CrossRef] 47. Di Nunno, F.; Alves Pereira, F.; de Marinis, G.; Di Felice, F.; Gargano, R.; Granata, F.; Miozzi, M. Experimental study of air-water two-phase jet: Bubble size distribution and velocity measurements. J. Phys. Conf. Ser. 2018, 1110, 012011. 48. Di Nunno, F.; Alves Pereira, F.; de Marinis, G.; Di Felice, F.; Gargano, R.; Granata, F.; Miozzi, M. Two-phase PIV-LIF measurements in a submerged bubbly water jet. J. Hydraul. Eng. 2019, 145, 9. [CrossRef] 49. Lucas, B.; Kanade, T. An Iterative Image Registration Technique with an Application to Stereo Vision. In Proceedings of the 7th International Joint Conference on Artificial Intelligence, Vancouver, BC, Canada, 24–28 August 1981; pp. 121–130. 50. Miozzi, M.; Jacob, B.; Olivieri, A. Performances of feature tracking in turbulent boundary layer investigation. Exp. Fluids 2008, 45, 4–780. [CrossRef] 51. Fdida, N.; Blaisot, J.B. Drop size distribution measured by imaging: Determination of the measurement volume by the calibration of the point spread function. Meas. Sci. Technol. 2009, 21, 2. [CrossRef] 52. 
Witten, I.H.; Frank, E.; Hall, M.A.; Pal, C.J. Data Mining: Practical Machine Learning Tools and Techniques, 2nd ed.; Morgan Kaufmann: Burlington, VT, USA, 2016; p. 560. 53. Cleary, J.G.; Trigg, L.E. K*: An instance based learner using an entropic distance measure. In Proceedings of the 12th International Conference on Machine Learning, Tahoe City, CA, USA, 9–12 July 1995; pp. 108–114. 54. Breiman, L.; Friedman, J.; Stone, C.J.; Olshen, R.A. Classification and Regression Trees; Taylor & Francis: Abingdon, UK, 1984; p. 368. 55. Cortes, C.; Vapnik, V. Support-vector networks. Mach. Learn. 1995, 20, 273–297. [CrossRef]

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).