Multistep Flood Inundation Forecasts with Resilient Backpropagation Neural Networks: Kulmbach Case Study
Qing Lin *, Jorge Leandro , Stefan Gerber and Markus Disse Chair of Hydrology and River Basin Management, Department of Civil, Geo and Environmental Engineering, Technical University of Munich, Arcisstrasse 21, 80333 Munich, Germany; [email protected] (J.L.); [email protected] (S.G.); [email protected] (M.D.) * Correspondence: [email protected]; Tel.: +49-89-289-23228
Received: 21 October 2020; Accepted: 10 December 2020; Published: 19 December 2020
Abstract: Flooding, a significant natural disaster, attracts worldwide attention because of its high impact on communities and individuals and its increasing trend due to climate change. A flood forecast system can minimize these impacts by predicting the flood hazard before it occurs. Artificial neural networks (ANN) can efficiently process large amounts of data and find relations that enable faster flood predictions. The aim of this study is to perform multistep forecasts for 1–5 h after the flooding event has been triggered by a forecast threshold value. In this work, an ANN developed for the real-time forecast of flood inundation with a high spatial resolution (4 m × 4 m) is extended to allow for multiple forecasts. After being trained with 120 synthetic flood events, the ANN was first tested with 60 synthetic events to verify the forecast performance for 3 h, 6 h, 9 h and 12 h lead times. The model produces good results, with more than 81% of all grids having an RMSE below 0.3 m. The ANN is then applied to three historical flood events to test the multistep inundation forecast. For the historical flood events, the results show that the ANN outputs forecast the water depths with good accuracy for (at least) the 3 h forecast, with over 70% accuracy (RMSE within 0.3 m), and with moderate accuracy for the subsequent forecasts, with (at least) 60% accuracy.
Keywords: hazard; artificial neural network; resilient backpropagation; multistep urban flood forecast
Water 2020, 12, 3568; doi:10.3390/w12123568; www.mdpi.com/journal/water

1. Introduction

Floods are one of the most threatening hazards to civilian safety and infrastructure, causing damages and losses all over the world [1]. Especially in densely populated urban areas, urbanization, aging drainage systems and climate change contribute to growing flood risk in many countries. One important mitigation measure is the prediction of future flood occurrences. A real-time flood forecast with sufficient lead time can boost the use of preventive measures for flood mitigation. Such measures can minimize the threats to communities and individuals at risk of flooding [2]. Various types of hydrological and hydraulic models are available for flood forecasting [3]. These can be classified into rainfall-runoff models, one-dimensional (1D) models, two-dimensional (2D) models, and coupled 1D–1D and 1D–2D models. From the forecast application perspective, 2D and 1D–2D models are able to directly provide a spatial representation of surface flooding, which is essential for flood damage estimation. 1D–1D models, on the other hand, rely heavily on GIS pre- and post-processing. Most 2D hydrodynamic models are computationally expensive [4]. Even with the help of up-to-date computational techniques for 2D simulations, the computational capability is still inadequate for a real-time forecast [5].

Data-driven modeling is a fast-growing alternative to hydrodynamic models due to the development of computing technologies in recent years. Data-driven models ignore the physical background of a problem and rather explore the relation between the input and output data [6]. For short- or long-term flood forecasts, different data-driven models have been implemented, such as neuro-fuzzy [7], support vector machines [8], support vector regression [9,10], Bayesian linear regression methods [11] and artificial neural networks (ANN) [12]. Among them, artificial neural networks can be an effective tool for flood modeling, if properly applied, overcoming pitfalls such as over-fitting/under-fitting with sufficient and representative data for model training [13]. Dawson and Wilby applied ANN to conventional hydrological models in flood-prone catchments in the UK in 1998 [14]. After that, numerous studies on flood forecasts at the catchment scale emerged [2,15]. Sit and Demir [16] integrated river network spatial information to improve the forecast accuracy in Iowa. Bustami et al. applied a backpropagation ANN model for forecasting water levels at gauging stations [17]. ANN showed great potential for the short-term forecast of extreme water levels, with an accuracy comparable to a physical model at a far lower computational cost [18]. Berkhahn et al. applied an ANN for two-dimensional (2D) urban pluvial inundation extent forecasts [19]. Lin et al. applied backpropagation networks for maximum flood inundation extent prediction and achieved an accuracy comparable to the hydraulic model [20]. The objective of this work is to develop a multistep flood forecast method for urban areas at a fine spatial resolution of 4 m by 4 m. Unlike Lin et al. [20], in this study, an ANN-based framework is proposed for performing multistep forecasts for 1–5 h. To the authors' knowledge, only the works of Chang et al. [2] and Shen and Chang [21] were able to produce multistep flood forecasting maps.
The novelty herein is the forecast at a finer resolution of 4 m × 4 m, suitable for urban flood forecasting. Our hypothesis is that the ANN forecast model can provide flood extents as accurate as those of hydraulic models, but within a much shorter time of several seconds. As hydrodynamics-based model runs usually take several hours, the reduction in forecast time with the ANN enables more flood mitigation measures to become effective. In Section 2, we introduce the resilient backpropagation artificial neural network structure and the validation of the model. Section 3 describes the study area and the data preparation of the event database. Section 4 shows the results of the flood forecasts of the first intervals for synthetic and historical flood events and the real-time forecast for historical flood events. Sections 5 and 6 present the discussion and conclusions of this work.
2. Methods
2.1. Data and Structure of Artificial Neural Networks

Artificial neural networks are algorithms applied to map features onto a series of outputs. Through a structure of input, output and intermediate hidden layers, artificial neural networks can learn relationships between input and output data [22]. A feedforward neural network is applied in this work for modeling the study area, processing and transmitting data in a network structure [23]. One of the most widely used ANN types is the multilayer perceptron (MLP) [24]. The MLP consists of highly interconnected neurons organized in layers to process information. The neurons in one layer are fully connected to each neuron in the next layer, and each connection is assigned a weight. Each neuron collects values from the previous layer by summing the previous neuron values multiplied by the weight on each incoming connection and stores the result. An activation function is used to transfer the results from the hidden layers to the output layer, and a loss function is applied to measure the fit of the neural network to a set of input–output data pairs.

In this case study, the input layer collects the seven inflows to the urban area of Kulmbach, given as hourly discharge intensities. The output layer is fed with the hourly raster inundation map with a resolution of 4 m × 4 m from the event database. Between the input layer and the output layer, the ANN has 2 hidden layers with 10 nodes per layer. One hundred twenty synthetic events from the event database are used for network training (see Section 3.2). Afterward, the other 60 events in the event database are used for model validation. This represents 2/3 of the data for training and 1/3 for testing, which is often found in the literature [25]. Finally, the model is applied to forecast three historical events.
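To make the layer arithmetic above concrete, the forward pass of one such per-grid MLP can be sketched as follows. This is a minimal sketch, not the authors' code: the 7 inputs and two hidden layers of 10 nodes follow the text, while the 400 output cells per grid and the random placeholder weights are assumptions for illustration only.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
n_in, n_hidden, n_out = 7, 10, 400   # 7 inflows; 10 nodes per hidden layer;
                                     # 400 output cells per grid is an assumed tile size
W1, b1 = rng.normal(size=(n_in, n_hidden)), np.zeros(n_hidden)
W2, b2 = rng.normal(size=(n_hidden, n_hidden)), np.zeros(n_hidden)
W3, b3 = rng.normal(size=(n_hidden, n_out)), np.zeros(n_out)

def forward(x):
    h1 = sigmoid(x @ W1 + b1)        # first hidden layer
    h2 = sigmoid(h1 @ W2 + b2)       # second hidden layer
    return sigmoid(h2 @ W3 + b3)     # one scaled water depth per raster cell

depths = forward(rng.random(n_in))   # one vector of 7 hourly inflow discharges
print(depths.shape)                  # (400,)
```

With untrained random weights the outputs are meaningless; training (Section 2.2) adjusts W1–W3 so the mapping from inflows to depths matches the event database.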
The widely applied sigmoid function is chosen as the activation function for the neural network [26]. Due to the high resolution of the map (4 m by 4 m), the weights between the last hidden layer and the output layer would have required 1 GB of RAM, with a dimension of the problem of more than 30 million. The optimization of these weights is very time-consuming, even with the latest optimization techniques [27]. Hence, a "divide and conquer" strategy is used to enable the calculation on a single PC. In addition, it should be noted that, in principle, the results of the two strategies should be the same. Some alternative artificial network structures, such as a convolutional neural network (CNN), could not be applied in this study: the network size would require a very large number of hyperparameters, which was beyond the memory capacity of a personal computer for forecasting purposes [28]. Furthermore, the time for training can be reduced with our strategy due to parallelization.
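The memory argument above can be checked with back-of-the-envelope arithmetic. In the sketch below, the 3 million output cells and the factor of four per-weight state (weight, current gradient, previous gradient and step size, as resilient backpropagation maintains) are assumptions chosen only to show that the stated figures are of the right order of magnitude:

```python
cells = 3_000_000         # assumed output neurons if the full 4 m raster were one ANN
hidden = 10               # nodes in the last hidden layer (from the text)
weights = cells * hidden  # > 30 million weights, as stated above
bytes_per_value = 8       # double precision
state_arrays = 4          # weight + gradient + previous gradient + step size (assumption)
ram_gb = weights * bytes_per_value * state_arrays / 1e9
print(ram_gb)             # 0.96, i.e., on the order of 1 GB
```

Splitting the map into tiles divides `cells` (and hence the weight count) by the number of tiles, which is what makes each sub-network fit comfortably in memory.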
The estimated time for training all networks in parallel with four cores is 6 h. To reduce the training time and save memory requirements, the study area is divided into 50 × 50 square grids (see Figure 1). A similar idea of splitting has been applied in a former study [19]. Each grid has four independent ANNs, one per forecast interval (3 h, 6 h, 9 h and 12 h). In total, 10,000 ANNs are trained to produce the multistep forecasts.
Figure 1. The forward-feed neural network setup in the forecast study. The input layer is fed with discharge inflows of certain time interval windows. The output layer generates the flood inundation for that interval. Resilient backpropagation is applied for training this network.
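The "divide and conquer" bookkeeping described above can be sketched as follows; the dictionary keys are illustrative, but the counts follow directly from the text (50 × 50 grids, four forecast intervals):

```python
tiles = [(row, col) for row in range(50) for col in range(50)]  # 50 x 50 grids
intervals_h = (3, 6, 9, 12)                                     # forecast intervals

# one independent ANN per (grid, interval) pair; None stands in for a model object
models = {(tile, t): None for tile in tiles for t in intervals_h}
print(len(models))  # 2500 grids * 4 intervals = 10000 ANNs
```

Because every (grid, interval) pair is independent, the 10,000 trainings can be distributed freely across the four cores mentioned above.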
2.2. Hyperparameter Tuning in ANN

To optimize the weights in an ANN, resilient backpropagation is a widely applied and effective algorithm [29]. According to Shamim et al. and Panda et al. [23,30], backpropagation neural networks outperform other methods in flood forecasting studies due to their greater efficiency and higher robustness. Berkhahn et al. [19] compared the training algorithms for hyperparameter tuning. The authors showed that resilient backpropagation is more efficient than both backpropagation and Levenberg–Marquardt for maximum flood inundation prediction. The process has two stages: the training stage gathers information from the flood event database, changing the weights between layers to minimize the error on the output layer; the recalling stage generates the forecast for the rest of the events in the database for testing the model.

Formulas (1) and (2) show the scheme of resilient backpropagation. To calculate the update of the network weights $w_{ij}$ from the $i$th neuron to the $j$th neuron, the gradient descent algorithm is applied. It distinguishes the update of the weights upon the derivative of the loss function $L$ of the model. The loss function $L$ takes the mean square error (MSE). The iteration stops once the loss function reaches its minimum (chosen as $10^{-6}$ in this case).
\[
\Delta_{ij}(t) =
\begin{cases}
\eta^{+} \cdot \Delta_{ij}(t-1), & \dfrac{\partial L}{\partial w_{ij}}(t) \cdot \dfrac{\partial L}{\partial w_{ij}}(t-1) > 0 \\[6pt]
\eta^{-} \cdot \Delta_{ij}(t-1), & \dfrac{\partial L}{\partial w_{ij}}(t) \cdot \dfrac{\partial L}{\partial w_{ij}}(t-1) < 0 \\[6pt]
\Delta_{ij}(t-1), & \text{else}
\end{cases}
\tag{1}
\]

\[
w_{ij}(t) =
\begin{cases}
w_{ij}(t-1) + \Delta_{ij}(t), & \dfrac{\partial L}{\partial w_{ij}}(t) < 0 \\[6pt]
w_{ij}(t-1) - \Delta_{ij}(t), & \dfrac{\partial L}{\partial w_{ij}}(t) > 0 \\[6pt]
w_{ij}(t-1), & \text{else}
\end{cases}
\tag{2}
\]

The learning rate scales the speed of each weight update iteration. The larger learning rate $\eta^{+}$ is chosen when the error gradient keeps the same sign in neighboring iterations, and the lower learning rate $\eta^{-}$ when the gradient changes sign, i.e., when the loss function is close to a minimum, fulfilling $0 < \eta^{-} < 1 < \eta^{+}$.
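Formulas (1) and (2) can be sketched in code as below. This is a minimal illustration of the update rule, not the authors' implementation; the values $\eta^{+} = 1.2$ and $\eta^{-} = 0.5$ and the step bounds are common defaults for resilient backpropagation, not figures given in the paper:

```python
import numpy as np

def rprop_step(w, grad, grad_prev, step, eta_plus=1.2, eta_minus=0.5,
               step_min=1e-6, step_max=50.0):
    """One elementwise resilient-backpropagation update following Formulas (1)-(2)."""
    same_sign = grad * grad_prev
    # Formula (1): grow the step while the gradient keeps its sign, shrink it after a sign change
    step = np.where(same_sign > 0, np.minimum(step * eta_plus, step_max), step)
    step = np.where(same_sign < 0, np.maximum(step * eta_minus, step_min), step)
    # Formula (2): move each weight against the sign of its gradient by its own step size
    w = w - np.sign(grad) * step
    return w, step

# Toy usage: minimize L(w) = mean((w - 1)^2), whose gradient is 2(w - 1)
w, step, grad_prev = np.zeros(3), np.full(3, 0.1), np.zeros(3)
for _ in range(100):
    grad = 2.0 * (w - 1.0)
    w, step = rprop_step(w, grad, grad_prev, step)
    grad_prev = grad
print(np.allclose(w, 1.0, atol=0.05))  # True
```

Note that only the sign of the gradient enters the weight update; the magnitude of the move is carried entirely by the per-weight step size, which is what makes the method robust to poorly scaled gradients.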