IOP Conference Series: Earth and Environmental Science

PAPER • OPEN ACCESS

Neural Network Approaches to Modeling of Natural Emergencies. Prediction of Lena River Spring High Waters

To cite this article: G Struchkova et al 2021 IOP Conf. Ser.: Earth Environ. Sci. 666 032084


International science and technology conference "Earth science" IOP Publishing

IOP Conf. Series: Earth and Environmental Science 666 (2021) 032084 doi:10.1088/1755-1315/666/3/032084

Neural Network Approaches to Modeling of Natural Emergencies. Prediction of Lena River Spring High Waters

G Struchkova¹, M Lebedev¹, V Timofeeva¹, T Kapitonova¹, A Gavrilieva¹
¹ Larionov Institute of Physical-Technical Problems of the North, Siberian Branch of the Russian Academy of Sciences, 677980 Oktyabrskaya St. 1,

E-mail: [email protected]

Abstract. Floods are a major source of damage and a safety concern during the construction and operation of critical structures such as the bridge over the Lena river and the underwater crossings of trunk pipelines in the North and the Arctic. The rapid rise of the spring high water on the Lena river is caused by accelerated snowmelt in the basin and by the meridional direction of the river flow. If flood control measures are not taken, severe economic and social consequences are inevitable, especially in places with complex infrastructure, for example densely populated cities, strategically important facilities, underwater crossings of trunk pipelines, bridges and power lines. This paper studies whether neural network algorithms can predict the flood danger from the spring high water on a section of the Lena river, using statistical archival data collected over 70 years, and assesses the effectiveness of the neural network approach. Artificial neural networks have proven effective in various prediction problems, especially those based on statistical data. A neural network approach that predicts a time series from its previous values gives good results. The modeling used a multilayer perceptron (MLP) and a radial basis function (RBF) network. Both methods showed sufficient adequacy of the selected statistical models.

1. Introduction

The most common natural emergencies in Yakutia are the spring high waters, whose severe economic and social consequences give every reason to consider them especially catastrophic. Over the 13 years from 2001 to 2013, the spring floods on the rivers of the Republic caused damage totaling about 16 billion rubles [1]. The Lena river plays an extremely important role in the social and economic life of the Republic. By the nature of its flow, the Lena river is usually divided into three large sections: the upper one, from the sources to the mouth of the Vitim river (1690 km); the middle one, from the Vitim mouth to the confluence with the Aldan river (1400 km); and the lower one, from the mouth of the Aldan river to Stolb island in the Lena delta (1310 km) [2]. The average duration of the spring high water is 75 days. On average, the spring high water on the Lena river ends in the second ten-day period of June. In sections of the Lena river with a large number of islands, the high-water wave sometimes breaks the ice cover in several anabranches simultaneously. The river flow, dispersed among the individual anabranches, weakens, which in turn contributes to the formation of more powerful jams that block all or almost all of the channel at a

Content from this work may be used under the terms of the Creative Commons Attribution 3.0 licence. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI. Published under licence by IOP Publishing Ltd


junction. Jams of this origin often cause catastrophic floods. The amount of damage from the spring high water depends on the effectiveness of the flood control measures, which are taken on the basis of preliminary forecasts of the jams and the ice-jam water levels. Predicting the development of the spring high water is a complicated problem that requires taking a complex of factors into account. Several papers [3,4,5,6,7] consider the neural network approach to flood prediction during high waters and freshets.

The prediction problem is solved for a selected river section. The jam water levels are formed under the influence of factors common to the whole Lena: 1) the growth rate and height of the spring high water; 2) the ice and meteorological conditions during the formation and destruction of the ice cover. In this paper, retro-predictive models of the maximum water levels of the Lena river are constructed from hydrological data for the period from 1970 to 2012 for the section near Tabaga in the vicinity of Yakutsk. This section was chosen because a bridge over the Lena river is planned near the Tabaginsky Cape, close to the underwater crossing of a gas pipeline near the village of Khatassy. In addition, the frequency of jams and floods in this section over the period considered exceeds 0.8. Studying this section will help ensure the safety of the bridge construction and of the underwater gas pipeline crossing, which has been in operation since 2004 and has had 3 accidents during that time. By the nature of the flow and the hydrological regime, the part of the river near the village of Tabaga is typical of the sections of the middle Lena [2].

2. Materials and research methods

Many factors influence the water level in the river during the spring high water, and the degree of influence of individual factors can vary from one section of the river to another. Taking all the factors into account is not possible because of the excessive complexity and labor involved. Analysis of the data established that the following natural factors significantly influence the development of extreme spring high waters in this section of the Lena river: the water discharge at the breakup date; the maximum ice thickness; the water level at the beginning of freeze-up; the water reserves in the snow; and the average air temperature during the spring months. To improve the quality of the prediction, a large number of models with different predictors were analyzed. In all variants, standard procedures were performed to eliminate multicollinearity: the correlation matrix was analyzed, namely the pair correlation coefficients between the predictors. The pair correlation coefficients are not significant: they do not exceed 0.6 in absolute value, which indicates a weak relationship between the factors. A more detailed analysis was also performed using the coefficients of determination of each predictor regressed on all the others. These coefficients do not exceed 0.4, which also confirms the absence of significant relationships between the predictors. As a result of the analysis, the following predictors were found to be the most significant for the given section: the water level at the beginning of freeze-up; the water discharge at the breakup date; the maximum ice thickness; the average water reserve in the snow for April; and the average air temperature in April and the first ten days of May. These predictors allow a prediction with a lead time of 3 to 8 days.
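
The multicollinearity screen described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function names and the toy columns are hypothetical, and only the thresholds 0.6 and 0.4 come from the text. The coefficient of determination of each predictor regressed on all the others is obtained from the inverse of the correlation matrix, using the identity R_j^2 = 1 - 1 / inv(corr)[j][j].

```python
import math

def pearson(x, y):
    """Pair correlation coefficient of two equally long numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def correlation_matrix(columns):
    """Matrix of pair correlation coefficients between predictor columns."""
    return [[pearson(a, b) for b in columns] for a in columns]

def invert(matrix):
    """Gauss-Jordan inverse with partial pivoting, for small dense matrices."""
    n = len(matrix)
    aug = [list(row) + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def determination_coefficients(corr):
    """R^2 of each predictor regressed on all the others."""
    inverse = invert(corr)
    return [1.0 - 1.0 / inverse[j][j] for j in range(len(corr))]

def passes_screen(columns, max_abs_r=0.6, max_r2=0.4):
    """True if no pair correlation or R^2 exceeds the thresholds from the text."""
    corr = correlation_matrix(columns)
    n = len(columns)
    pairs_ok = all(abs(corr[i][j]) <= max_abs_r
                   for i in range(n) for j in range(i + 1, n))
    r2_ok = all(r2 <= max_r2 for r2 in determination_coefficients(corr))
    return pairs_ok and r2_ok

# Hypothetical toy columns (not data from the paper).
ice_thickness = [1.0, 2.0, 3.0, 4.0]
snow_reserve = [1.0, -1.0, -1.0, 1.0]
print(passes_screen([ice_thickness, snow_reserve]))
```

The inverse is undefined when two predictors are exactly collinear, so a practical version would test the pair correlations first and stop before inverting.
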
The data from the Yakutsk weather station and the Tabaga gauging station (about 20 km apart) were used. The breakup of the Lena river near the village of Tabaga begins on May 10-28, and the maximum water level occurs on May 20-24. The multilayer perceptron (MLP) and the radial basis function (RBF) network were considered. For most practical problems solved with the MLP, one intermediate layer is enough; two layers are used in special cases, and networks with three layers are very rarely used in practice. An RBF network has an intermediate layer of radial elements, each of which reproduces a Gaussian response surface. Since these functions are nonlinear, one intermediate layer is also sufficient to model an arbitrary function: it is only necessary to take a sufficient number of radial elements.

To determine the optimal structure of the neural network, networks with the following parameters were studied: MLP networks with a three-layer structure of 5 input neurons, 1 output neuron and 3, 4, 7, 8 or 10 hidden-layer elements; and RBF networks with 5 input neurons, 1 output neuron and 3, 5, 7, 10 or 20 hidden-layer neurons. When a neural network is used to solve a prediction problem, the time series is divided into three sets, the training, testing and control samples, which are then fed to the input of the network. The subsample sizes are 70% for training, 15% for testing and 15% for control. The identity, logistic and hyperbolic tangent functions were tried as the activation function of the hidden neurons, and the identity function was selected as the activation function of the output neurons.
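
The two network types and the 70/15/15 division can be sketched as below. This is an illustration under stated assumptions rather than the authors' implementation: the Gaussian is parameterized with a width in the form exp(-d^2 / (2 * width^2)), the records are sliced in their given order because the text does not say how they were assigned to the samples, and all weights and function names are hypothetical.

```python
import math

MLP_HIDDEN_SIZES = (3, 4, 7, 8, 10)    # hidden-layer sizes tried for the MLP
RBF_HIDDEN_SIZES = (3, 5, 7, 10, 20)   # hidden-layer sizes tried for the RBF network

def logistic(z):
    """Logistic activation function."""
    return 1.0 / (1.0 + math.exp(-z))

def mlp_forward(x, w_hidden, b_hidden, w_out, b_out, activation=logistic):
    """Perceptron with one hidden layer and an identity output neuron."""
    hidden = [activation(sum(w * xi for w, xi in zip(weights, x)) + bias)
              for weights, bias in zip(w_hidden, b_hidden)]
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out

def rbf_forward(x, centres, widths, w_out, b_out):
    """RBF network: Gaussian hidden units followed by an identity output neuron."""
    hidden = [math.exp(-sum((xi - ci) ** 2 for xi, ci in zip(x, centre))
                       / (2.0 * width ** 2))
              for centre, width in zip(centres, widths)]
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out

def split_samples(records, train=0.70, test=0.15):
    """Slice records, in order, into training, testing and control samples."""
    n_train = round(len(records) * train)
    n_test = round(len(records) * test)
    return (records[:n_train],
            records[n_train:n_train + n_test],
            records[n_train + n_test:])

# Hypothetical weights for a 5-3-1 perceptron; real values would come from training.
x = [0.2, 0.5, 0.1, 0.9, 0.4]
w_hidden = [[0.1] * 5, [-0.2] * 5, [0.3] * 5]
print(mlp_forward(x, w_hidden, [0.0, 0.0, 0.0], [1.0, 1.0, 1.0], 0.0))
print([len(part) for part in split_samples(list(range(1970, 2009)))])
```

Passing `activation=math.tanh` or an identity function to `mlp_forward` gives the other two hidden-layer activations named in the text.
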

3. Results and discussion

The data for the period from 1970 to 2008 for the section near the village of Tabaga in the vicinity of Yakutsk were used as the input data for the neural networks, and the retro-prediction was made for four years (from 2009 to 2012). The networks under consideration, each with one intermediate layer, were trained using the backpropagation algorithm, and the models were compared in terms of performance and error to select the best one.

The predictive capability of a neural network largely depends on how well the input data are prepared, so stages of preliminary and final data processing are required. One of the most common forms of input preprocessing is normalization: the input data are scaled so that they lie in the range from zero to one. Normalization is necessary because of the nature of the variables used in neural network models. Having different physical meanings, they can differ greatly in absolute value. Normalization brings the numerical values of all variables to the same range, which makes it possible to combine them in one neural network model. To normalize the data, the limits of variation of each variable must be known exactly: its minimum and maximum values (the boundaries of the normalization interval).

1. We consider feedforward neural networks of the multilayer perceptron type. Table 1 presents the comparison of the models in terms of performance and error.
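
The scaling to the range from zero to one described in this section, together with the reverse step needed to turn a network output back into centimetres, can be sketched as follows. The function names are hypothetical, and the three observed levels are taken from the tables only to have concrete numbers; the authors' normalization boundaries are not given in the text.

```python
def fit_minmax(values):
    """Return the (minimum, maximum) boundaries of the normalization interval."""
    lo, hi = min(values), max(values)
    if lo == hi:
        raise ValueError("a constant variable cannot be min-max normalized")
    return lo, hi

def normalize(value, lo, hi):
    """Map a value from [lo, hi] onto [0, 1]."""
    return (value - lo) / (hi - lo)

def denormalize(scaled, lo, hi):
    """Map a network output from [0, 1] back to physical units."""
    return lo + scaled * (hi - lo)

# Observed maximum levels for 2010-2012 from the tables (cm), used only as an illustration.
levels_cm = [956, 935, 943]
lo, hi = fit_minmax(levels_cm)
scaled = [normalize(v, lo, hi) for v in levels_cm]
restored = [denormalize(s, lo, hi) for s in scaled]
print(scaled)
print(restored)
```

A value outside the fitted boundaries maps outside the range from zero to one, which is why the text stresses that the limits of each variable must be known in advance.
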

Table 1. Comparison of models of different architectures in terms of accuracy of training and prediction on control sample.

No.  Architecture  Accuracy of prediction  Accuracy of prediction
                   on training sample      on control sample
1    MLP 5-4-1     0.867                   0.855
2    MLP 5-8-1     0.847                   0.897
3    MLP 5-7-1     0.944                   0.964
4    MLP 5-10-1    0.934                   0.873
5    MLP 5-3-1     0.983                   0.980
5*   MLP 5-3-1     0.929                   0.831

* Input data not normalized.

The accuracy value is the ratio of the standard deviations of the predicted and observed values (series) on the training and control samples. The MLP model with 5 input neurons, 3 hidden-layer neurons and 1 output neuron has the highest performance, 0.983. If the input data are not normalized, this network structure is still the best, but its performance is lower, 0.929 (line 5*, Table 1), and the water level prediction contains a larger relative error (line 5*, Table 2). Table 2 presents the results of the retro-prediction of the maximum water level of the Lena river near the village of Tabaga obtained by the neural networks with the MLP structure.

Table 2. Comparison of results of retro-prediction of maximum water levels during high water of Lena river near Tabaga village, cm.

No.  Network       2010                  2011                  2012
                   Obs.  Pred.  Err., %  Obs.  Pred.  Err., %  Obs.  Pred.  Err., %
1    MLP 5-7-1     956   928    3        935   958    2        943   961    2
2    MLP 5-4-1     956   937    2        935   953    2        943   962    2
3    MLP 5-10-1    956   925    3        935   903    3        943   953    1
4    MLP 5-8-1     956   938    2        935   967    3        943   972    3
5    MLP 5-3-1     956   945    1        935   930    0.5      943   937    1
5*   MLP 5-3-1     956   940    2        935   981    5        943   998    6

Obs. = observations, Pred. = prediction, Err. = relative error. * Input data not normalized.

The MLP model with 5 input neurons, 3 hidden-layer neurons and 1 output neuron also shows the best prediction accuracy.

2. We consider networks based on radial basis functions (RBF). Table 3 presents the comparison of the models in terms of performance and error.

Table 3. Comparison of models of different architectures in terms of accuracy of training and prediction on control sample.

No.  Architecture  Accuracy of prediction  Accuracy of prediction
                   on training sample      on control sample
1    RBF 5-10-1    0.865                   0.855
2    RBF 5-7-1     0.819                   0.773
3    RBF 5-3-1     0.787                   0.776
4    RBF 5-5-1     0.888                   0.837
5    RBF 5-20-1    0.942                   0.967
5*   RBF 5-20-1    0.911                   0.871

* Input data not normalized.


Table 4. Comparison of results of retro-prediction of maximum water levels during high water of Lena river near Tabaga village, cm.

No.  Network       2010                  2011                  2012
                   Obs.  Pred.  Err., %  Obs.  Pred.  Err., %  Obs.  Pred.  Err., %
1    RBF 5-3-1     956   974    2        935   967    3        943   987    4
2    RBF 5-10-1    956   945    1        935   963    3        943   991    5
3    RBF 5-5-1     956   932    2        935   923    1        943   957    1
4    RBF 5-7-1     956   911    5        935   903    3        943   995    6
5    RBF 5-20-1    956   958    0.4      935   901    3        943   952    1
5*   RBF 5-20-1    956   939    1.7      935   903    3.4      943   942    0.1

Obs. = observations, Pred. = prediction, Err. = relative error. * Input data not normalized.

Table 5. Comparison of results of retro-prediction of maximum water levels during high water of Lena river near Tabaga village using neural networks, ARIMA and regression model, cm.

Model              2010                  2011                  2012
                   Obs.  Pred.  Err., %  Obs.  Pred.  Err., %  Obs.  Pred.  Err., %
Neural networks
MLP 5-3-1          956   945    1        935   930    0.5      943   937    1
RBF 5-20-1         956   958    0.4      935   901    3        943   952    1
Traditional models
ARIMA              956   928    4        935   906    3        943   896    4
Regression model   956   977    2        935   979    5        943   993    5

Obs. = observations, Pred. = prediction, Err. = relative error.

The best of the RBF models is the one with 5 input neurons, 20 hidden-layer neurons and 1 output neuron. Its performance is 0.942 on the training sample (line 5, Table 3), i.e. this model falls into the good category. If the input data are not normalized, the same network structure is still the best, but its performance is lower, 0.911 (line 5*, Table 3), and the water level prediction contains a larger relative error in some years (line 5*, Table 4). Table 5 compares the predictions of the MLP and RBF neural networks with those of traditional models: the ARIMA autoregressive model and a regression model.
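
The relative errors behind the comparison in Table 5 can be recomputed from the observed and predicted levels. The sketch below copies those levels from the table; averaging the error over the three years is a summarizing choice made here, not a step taken in the paper, and the recomputed values may differ somewhat from the rounded figures printed in the tables.

```python
# Observed and predicted maximum water levels (cm) copied from Table 5,
# as (observed, predicted) pairs for 2010, 2011 and 2012.
TABLE_5 = {
    "MLP 5-3-1": [(956, 945), (935, 930), (943, 937)],
    "RBF 5-20-1": [(956, 958), (935, 901), (943, 952)],
    "ARIMA": [(956, 928), (935, 906), (943, 896)],
    "Regression model": [(956, 977), (935, 979), (943, 993)],
}

def relative_error_percent(observed, predicted):
    """Absolute prediction error as a percentage of the observed level."""
    return abs(predicted - observed) / observed * 100.0

def mean_relative_error(pairs):
    """Average relative error over all (observed, predicted) pairs."""
    errors = [relative_error_percent(o, p) for o, p in pairs]
    return sum(errors) / len(errors)

# Models ordered from the smallest to the largest mean relative error.
ranking = sorted(TABLE_5, key=lambda name: mean_relative_error(TABLE_5[name]))
for name in ranking:
    print(f"{name}: {mean_relative_error(TABLE_5[name]):.2f} %")
```

On these figures both neural networks have a smaller mean relative error than either traditional model.
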


4. Conclusion

RBF networks have several advantages over MLP networks and learn an order of magnitude faster. However, an RBF model must have more elements, so it runs more slowly and requires more memory than an MLP model. This paper attempts to show that the prediction of the maximum water level of a river section during the spring high water improves when data preprocessing methods are used. ANN methods can take hidden periodicities into account [6] and build information processing algorithms that learn and identify patterns in a flow of fuzzy, conflicting information, which makes them exceptionally important for predicting the characteristics of extreme hydrological situations. Tables 2 and 4 show that the constructed ANN models reflect the real water level satisfactorily: the prediction errors of the MLP (0.5-1%) and the RBF network (0.4-3%) indicate sufficient adequacy of the selected statistical models. Thus, on the basis of statistical data and regression modeling, the paper presents the use of neural networks to develop a model for predicting flood risks from the spring high waters. The prediction section was chosen according to the locations of potentially dangerous facilities whose flooding can cause significant material damage to the economy.

5. References

[1] Burtseva E I, Parfenova O N 2015 Economic damage caused by of the rivers in Republic Russia Problems of the modern economy 1(53) pp 256-259
[2] Nogovitsyn D D 2007 On the issue of forecasting congestion on the Lena river Science and technology in Yakutia 1 pp 19-24
[3] Grebnev Ya V, Yarovoy A V 2018 Flood monitoring and forecasting in the Krasnoyarsk Territory using neural network algorithms Siberian fire and rescue bulletin 3(10) pp 16-36
[4] Krasnogorskaya N N, Ferapontov Yu I, Nafikova E V 2013 Development of methods for forecasting hydrological processes for water management tasks Scientific notes of the Russian State Hydrometeorological University Russia 28 pp 43-50
[5] Fazel S, Mirfenderesk H, Tomlinson R, Blumenstein M 2015 Towards robust flood forecasts using neural networks Neural Networks (IJCNN) Int Joint Conf on (IEEE) pp 1-6
[6] Li H, Lai Y, Wang L, Yang X, Jiang N, Li L, Wang C, Yang B 2019 Review of the State of the Art: Interactions between a Buried Pipeline and Frozen Soil Cold Regions Science and Technology 157 pp 171-186
[7] Badrzade H, Sarukkalige R, Jayawardena A W 2013 Impact of multi-resolution analysis of artificial intelligence models inputs on multi-step ahead river flow forecasting Journal of Hydrology 507(12) pp 75-85
