Original Research Article

Flood frequency analysis using participatory GIS approach and discharge data for two stations in Burhi Gandak River

Abstract:

Aims: To find the distribution that best fits the stream flow and to calculate the magnitude and frequency of flow.

Study Design: Several statistical distributions were fitted to find the best-fit distribution for flow, and flood frequency analysis was used to estimate the magnitude and frequency of flow and the non-exceedance probability of peak discharge.

Place and Duration of Study: The study was performed at the Sikandarpur and Rosera gauging sites of the Burhi Gandak River. Historical (50 years) maximum annual peak discharge data from each station are used for statistical analysis to estimate the maximum peak discharge for 5, 10, 25, 50, and 100 year return periods.

Methodology: In this study, the Lognormal, Galton, Gamma, log-Pearson type 3, Gumbel, and Generalised extreme value distributions have been considered to describe the annual maximum stream flow. Flood frequency analysis methods are used to estimate the magnitude of the extreme event and the non-exceedance probability of the peak discharge passing through the river channel.

Results: For both Sikandarpur and Rosera stations, the log-Pearson type 3 distribution showed the lowest values of K–S and Chi-square. The annual probable peak discharge for the 5, 10, 25, 50, and 100 year return periods is calculated for each distribution.

Conclusion: The most suitable distribution for both stations is found to be the log-Pearson type 3 distribution.

Keywords: frequency analysis, flood, return period, distribution, Gumbel, log-Pearson type 3

Introduction

Flood frequency analysis is generally used in water resource management to assess the possibility of extreme events in flood-prone areas. The application of frequency analysis methods has been widely recognized by numerous researchers in various fields. In the planning and design of water resources projects, engineers and planners are often interested in estimating flood peak quantities for a set of non-exceedance probabilities in the project area. Besides the unit hydrograph method, the rational method, and rainfall-runoff models, frequency analysis is one of the main techniques used to define the relationship between the magnitude of a flood event and the frequency with which that event is exceeded (Bhagat, 2017). Different frequency distributions have been developed for hydrological frequency analysis. However, no single distribution is accepted as universal for describing flood frequency at any gauging site. The selection of a suitable distribution usually depends on the characteristics of the available data at a particular site (Hassan et al., 2019).

Floods are natural hazards and cause extreme damage throughout the world. The main reasons for floods are extreme precipitation, ice and snow melt, dam failure, and the lack of capacity of the river channel to pass the excess water. Floods are natural phenomena that cause disasters such as the destruction of infrastructure, damage to the environment and agricultural land, mortality, and economic losses (Hu et al., 2018). The main objective of the study is to carry out the flood frequency analysis for the Burhi Gandak River using the maximum annual peak discharge data measured at the Sikandarpur and Rosera stations. To describe the flood frequency in the study area, the choice of an appropriate probability distribution and parameter estimation method is of immense importance. The probability distributions used in this study are the log-normal, Galton, Gamma, log-Pearson type 3, Gumbel, and Generalised extreme value distributions. These distributions are recommended for at-site flood frequency analysis in various countries (Sisay, 2015).

Materials and Methods

Description of Study Area

The Burhi Gandak river originates in the upper plains of the from the spring of Someshwar hills at an elevation of 300 meters. It drains into River Ganga about 7 km east from district of . The total length of the river is 320 km, and it covers a catchment area of 12180 km². The basin area lies between the valley point at 25°26′02″N, 86°36′25″E and the ridge point at 27°27′21″N, 84°03′48″E. The Burhi Gandak is a minor plains-fed river occupying the position of a former, much more significant, mountain-fed river. The Burhi Gandak catchment is bounded on the north by the Himalayas, on the south by the River Ganga, on the east by the Kosi Basin, and on the west by the Gandak Basin. There may be some overbank connection with the main Gandak channel at times of high flood, but it appears to receive most of its water from minor tributaries and floodplain sheet flow in the plains. The main tributaries of the Burhi Gandak River are Masan, Harbora, Tilawe, Siriswa Koria, Pasaha, and Tiar Hahwa, with their catchments in the Someshwar hills. The Burhi Gandak River flows in, West Champaran, , Samastipur, and Khagaria districts of Bihar state.

Image 1. Location Map

Flood Frequency Analysis

To describe the flood frequency at a specific area, the selection of an appropriate probability distribution is always essential. We have considered the log-normal, Galton, Gamma, log-Pearson type 3, Gumbel, and Generalised extreme value distributions for the analysis of flood frequency at two gauging sites of the Burhi Gandak River. These distributions are standard in the literature and are recommended for flood frequency analysis in many countries. Generally, the steps followed in flood frequency analysis are as follows:

Step 1: Selection of the data

Here, annual maximum daily peak discharge data from 1956 to 2005 (50 years) are selected for the flood frequency analysis. This research focused on annual maximum peak flood discharge records for an accurate representation of the magnitude of the flood flows.
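The data selection step can be illustrated with a minimal Python sketch that reduces a daily discharge record to an annual maximum series. The function name `annual_maxima`, the grouping by calendar year, and the sample values are assumptions made for illustration only; the source does not state how the annual maxima were extracted or whether a hydrological year was used.

```python
from collections import defaultdict
from datetime import date


def annual_maxima(daily_records):
    """Return {year: maximum discharge} from (date, discharge) pairs."""
    peaks = defaultdict(lambda: float("-inf"))
    for day, discharge in daily_records:
        # Keep the largest value seen so far in each calendar year.
        peaks[day.year] = max(peaks[day.year], discharge)
    return dict(sorted(peaks.items()))


# Hypothetical daily discharge values in cumec, not data from the study.
sample = [
    (date(1956, 7, 14), 1200.0),
    (date(1956, 8, 2), 1850.0),
    (date(1957, 7, 30), 990.0),
    (date(1957, 9, 5), 1430.0),
]
series = annual_maxima(sample)
print(series)
```

Each year contributes exactly one value to the series, which is what the frequency analysis assumes when it treats the peaks as independent observations.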

Step 2: Fitting the probability distribution

Software for statistical extreme value analysis has been developing steadily, allowing more accurate estimation. Nowadays, different program packages and pre-defined Excel sheets are used to perform flood frequency analysis. Among the most used are: PeakFQ, which performs statistical flood-frequency analyses of annual-maximum peak flows (Flynn et al., 2006); CumFreq (initially developed for the analysis of hydrological measurements of variable magnitudes in space and time); RAINBOW, developed by the Institute for Land and Water Management of the Katholieke Universiteit Leuven (Raes et al., 1970); Hydrognomon (Kozanis et al., 2010), developed by the ITIA research team of the National Technical University of Athens in 1997 (the name "Itia" is not an acronym; it is the Greek name of the willow tree); and Hyfran (developed in Canada by the team of Bernard Bobée, chairman of statistical hydrology from 1992 to 2004). The Hyfran software can be used for any dataset of extreme values, provided that observations are independent and identically distributed (Adlouni & Bobée, 2015). All of these packages can perform statistical analysis using frequently used probability functions such as the normal, log-normal, Gumbel, gamma, Weibull, exponential, or Pareto distribution. This study used Hydrognomon software for flood frequency analysis. Hydrognomon is a software tool for the processing of hydrometeorological data (Kozanis et al., 2010). Although Hydrognomon is not commonly used in the reviewed literature for flood frequency analysis, it has been used successfully to simulate the hydrology of the Kaduna River in Nigeria and for modelling future climatic variation (Garba et al., 2013). A few researchers have used Hydrognomon for time-series data analysis. The software is freely available and can perform frequency analysis, among many other hydrologic tasks. One of its advantages is that it supports several time steps, from the finest minute scales up to decades, and the filling of missing values. The software can also fit over thirteen statistical distributions and perform statistical tests.
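As an illustration of what fitting a distribution involves, the following Python sketch fits a Gumbel (maximum) distribution by the method of moments. It does not reproduce Hydrognomon, and the source does not state which parameter estimation method the software applied; the function names and the sample peaks are hypothetical.

```python
import math
import statistics

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def fit_gumbel_moments(peaks):
    """Estimate Gumbel (max) location and scale from sample mean and std."""
    mean = statistics.fmean(peaks)
    std = statistics.stdev(peaks)
    scale = std * math.sqrt(6) / math.pi
    location = mean - EULER_GAMMA * scale
    return location, scale


def gumbel_cdf(x, location, scale):
    """Non-exceedance probability F(x) of the fitted Gumbel distribution."""
    return math.exp(-math.exp(-(x - location) / scale))


# Hypothetical annual peak discharges in cumec, not data from the study.
peaks = [1520.0, 2310.0, 1875.0, 2960.0, 1430.0, 2120.0, 3480.0, 1690.0]
location, scale = fit_gumbel_moments(peaks)
print(location, scale, gumbel_cdf(3000.0, location, scale))
```

The method of moments is chosen here only because it needs no numerical optimisation; maximum likelihood or L-moments are common alternatives and may give different parameters for the same record.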

Step 3: Goodness of fit test to identify the best fitting distribution

Goodness-of-fit test statistics are used for checking the validity of a specified or assumed probability distribution model (Alam et al., 2018). A goodness-of-fit test, in general, measures how well the observed data correspond to the fitted (assumed) model. The commonly used goodness-of-fit tests are the Kolmogorov–Smirnov (K–S) test, the root mean square error (RMSE) test, the Chi-square test, and the Anderson–Darling (A–D) test. The K–S test is an exact test in which the distribution of the test statistic does not depend on the underlying cumulative distribution function being tested. In addition, the Chi-square test is easy to interpret and allows more detailed information to be derived from the test statistic than many other tests (Wuensch, 2011). The Chi-square test also has the advantage that it can be applied to any univariate distribution. For these reasons, the K–S test and Chi-square test have been selected for identifying the best-fitted distribution. The Chi-square test requires a sufficiently large sample size; this is not considered a problem in the current study, which uses 50 years of data at each station.
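The two selected test statistics can be sketched in a few lines of Python. This is an illustrative implementation under stated assumptions, not the routine used by Hydrognomon: the function names are hypothetical, and the Chi-square statistic is computed over equal-probability classes, a binning choice the source does not specify. A toy uniform CDF is used so that the result can be checked by hand.

```python
def ks_statistic(sample, cdf):
    """Largest gap between the empirical CDF and the fitted CDF."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # Compare the fitted CDF with the empirical step just after and
        # just before the i-th ordered observation.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


def chi_square_statistic(sample, cdf, classes):
    """Chi-square statistic over `classes` equal-probability classes."""
    n = len(sample)
    observed = [0] * classes
    for x in sample:
        idx = min(int(cdf(x) * classes), classes - 1)
        observed[idx] += 1
    expected = n / classes
    return sum((o - expected) ** 2 / expected for o in observed)


def uniform_cdf(x):
    """Toy CDF on [0, 1], used only to make the example checkable by hand."""
    return min(max(x, 0.0), 1.0)


toy_sample = [0.1, 0.4, 0.6, 0.9]
d_stat = ks_statistic(toy_sample, uniform_cdf)
chi2_stat = chi_square_statistic(toy_sample, uniform_cdf, classes=2)
print(d_stat, chi2_stat)
```

For both statistics a smaller value indicates a closer fit, which is why the distribution with the lowest values is preferred.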

Results and Discussion

This study tested different frequency distribution functions in order to compute the return period of significant flood events in the Burhi Gandak River. Two hydrological stations measured the maximum daily discharge from 1956 to 2005. Notably, the maximum yearly discharge varies between the stations (Figure 1). The discharge values recorded at the two stations follow a similar trend at various times; the recorded maximum discharge is 4861 cumec in 1975 at Sikandarpur and 3486 cumec in 1987 at Rosera station. The Sikandarpur station recorded the higher discharge in many years because it is located downstream on the Burhi Gandak River.

Selection of the Best-Fitted Distribution

The 50-year maximum discharge data of each station are used for statistical analysis. Table 1 summarizes the goodness-of-fit test results for each station. The values of the Kolmogorov–Smirnov (K–S) test and Chi-square test for the log-normal, Galton, Gamma, log-Pearson type 3, Gumbel, and Generalised extreme value distributions are given in Table 1.

Figure 1. Maximum annual peak discharge for Sikandarpur station (in red) and Rosera station (in blue)

[Figure 2, panels a) and b): discharge (m³/s) against return period (T) for maximum values in years, normal distribution scale; legend: Weibull, LogNormal, Galton, Gamma, LogPearsonIII, Gumbel Max, GEV Max. Panels c) and d): probability density functions (PDF) with histogram; legend: LogNormal, Galton, Gamma, LogPearsonIII, Gumbel Max, GEV Max.]

Figure 2. a) Distribution plot for Sikandarpur station; b) distribution plot for Rosera station; c) density function for Sikandarpur station; d) density function for Rosera station

Table 1. Summary of best-fitted distribution for Sikandarpur and Rosera station

                      Sikandarpur station            Rosera station
Distribution          K-S test   Chi-square test     K-S test   Chi-square test
Log Normal            0.10647    8.52                0.07112    4.32
Galton                0.12733    16.64               0.07914    9.92
Gamma                 0.12411    16.64               0.09055    11.04
Log Pearson type 3    0.10579    8.52                0.069      4.88
Gumbel Max.           0.12501    16.64               0.07874    9.92
GEV Max.              0.12757    16.64               0.07757    9.92

In Table 1, the selected distribution for each station is the one with the lowest values of K–S and Chi-square. In addition to the goodness-of-fit tests, the skewness, standard deviation, and mean of the data are calculated. The discharge data at both Sikandarpur and Rosera stations are highly skewed, with skewness of 1.07 and 1.14, respectively, both greater than 1. For the 50-year discharge data at the Sikandarpur station, the K–S and Chi-square statistics are, respectively: Log-Normal (0.1064, 26.44), Galton (0.1273, 16.64), Gamma (0.1241, 9.68), log-Pearson type 3 (0.1057, 8.52), Gumbel Max. (0.1250, 16.64), and GEV Max. (0.1275, 16.64). For Rosera, the corresponding values are: Log-Normal (0.0711, 4.32), Galton (0.0791, 9.92), Gamma (0.0905, 11.04), log-Pearson type 3 (0.069, 4.88), Gumbel Max. (0.0787, 9.92), and GEV Max. (0.0775, 9.92). However, selecting one distribution does not necessarily exclude the use of the others. Other distributions also show good test results that are close to those of the selected distribution. In fact, for both stations, the log-Normal distribution could also be used for the extreme values of annual maximum peak discharge.
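The selection rule (lowest test statistic) can be applied mechanically to the values printed in Table 1. The short Python sketch below does this; the dictionary layout and the helper name `lowest` are assumptions for illustration, while the numbers are copied from Table 1.

```python
# (K-S, Chi-square) pairs copied from Table 1.
TABLE_1 = {
    "Sikandarpur": {
        "Log Normal": (0.10647, 8.52),
        "Galton": (0.12733, 16.64),
        "Gamma": (0.12411, 16.64),
        "Log Pearson type 3": (0.10579, 8.52),
        "Gumbel Max.": (0.12501, 16.64),
        "GEV Max.": (0.12757, 16.64),
    },
    "Rosera": {
        "Log Normal": (0.07112, 4.32),
        "Galton": (0.07914, 9.92),
        "Gamma": (0.09055, 11.04),
        "Log Pearson type 3": (0.069, 4.88),
        "Gumbel Max.": (0.07874, 9.92),
        "GEV Max.": (0.07757, 9.92),
    },
}


def lowest(scores, index):
    """Names of all distributions sharing the minimum statistic at `index`."""
    best = min(values[index] for values in scores.values())
    return [name for name, values in scores.items() if values[index] == best]


for station, scores in TABLE_1.items():
    print(station, "K-S:", lowest(scores, 0), "Chi-square:", lowest(scores, 1))
```

With the Table 1 values, log-Pearson type 3 has the lowest K–S statistic at both stations, while Log Normal ties with it on Chi-square at Sikandarpur and is slightly lower at Rosera, which is in line with the remark that the log-normal distribution could also be used.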

Return Period Calculation

As the best-fitted distribution, log-Pearson type 3 is selected for the return period calculation. Due to the small size of the data set, the 50 and 100 year return period discharges are computed using a 95% confidence interval. Table 2 presents a summary of the 5, 10, 25, 50, and 100 year return period annual peak discharges for each station, based on 50 years of annual maximum peak discharge data.

Table 2. a, b, c. Summary table of return period and corresponding peak discharge Log Pearson Type 3 Gumbel Max.

Sikandarpur Rosera Sikandarpur Rosera Station Station Station Station

Return period Max. Max. Return period Max. Max. (year) Discharge Discharge (year) Discharge Discharge (cumec) (cumec) (cumec) (cumec)

5 5 10 10 25 25 50 50 100 100

Galton Extreme Value Generalized Extreme Value

Sikandarpur Rosera Sikandarpur Rosera Station Station Station Station

Return period Max. Max. Return period Max. Max. (year) Discharge Discharge (year) Discharge Discharge (cumec) (cumec) (cumec) (cumec)

5 5 10 10 25 25 50 50 100 100

Log Normal distribution Gamma distribution

Sikandarpur Rosera Sikandarpur Rosera Station Station Station Station

Return period Max. Max. Return period Max. Max. (year) Discharge Discharge (year) Discharge Discharge (cumec) (cumec) (cumec) (cumec)

5 5 2684.77 10 10 25 25 50 4598.03 50 100 100
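For readers who wish to reproduce a return period calculation of this kind, the following Python sketch estimates log-Pearson type 3 quantiles with the frequency factor method. It is a sketch under stated assumptions and not the Hydrognomon procedure: parameters are taken as the moments of the base-10 logarithms of the peaks, the frequency factor uses the Wilson-Hilferty approximation, confidence intervals are not computed, and the sample peaks are hypothetical. The non-exceedance probability for a return period T is taken as 1 - 1/T.

```python
import math
from statistics import NormalDist, fmean, stdev


def sample_skew(values):
    """Bias-adjusted sample skewness coefficient."""
    n = len(values)
    m = fmean(values)
    s = stdev(values)
    return n * sum((v - m) ** 3 for v in values) / ((n - 1) * (n - 2) * s ** 3)


def frequency_factor(skew, return_period):
    """Pearson type 3 frequency factor (Wilson-Hilferty approximation)."""
    # Standard normal quantile for non-exceedance probability 1 - 1/T.
    z = NormalDist().inv_cdf(1.0 - 1.0 / return_period)
    if abs(skew) < 1e-6:
        return z  # zero skew reduces to the normal quantile
    k = skew / 6.0
    return (2.0 / skew) * ((1.0 + k * z - k * k) ** 3 - 1.0)


def lp3_quantile(peaks, return_period):
    """Peak discharge with the given return period under log-Pearson type 3."""
    logs = [math.log10(q) for q in peaks]
    k_t = frequency_factor(sample_skew(logs), return_period)
    return 10 ** (fmean(logs) + k_t * stdev(logs))


# Hypothetical annual peak discharges in cumec, not data from the study.
peaks = [1520.0, 2310.0, 1875.0, 2960.0, 1430.0, 2120.0, 3480.0, 1690.0]
for t in (5, 10, 25, 50, 100):
    print(t, round(lp3_quantile(peaks, t), 1))
```

The Wilson-Hilferty form is adequate for moderate skewness of the log-transformed data; for strongly skewed samples, tabulated frequency factors or an exact gamma quantile may be preferable.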

Conclusion

River flow can depend on the geographical location, and rainfall variation over the years has emerged as a critical factor affecting flood hazard. An essential consideration in flood frequency analysis is identifying the best-fitted distribution in order to compute the appropriate return period. In this study, different probability distribution functions and probability density functions were applied to the time series data of two stations in the Burhi Gandak catchment. The K–S test and Chi-square test were applied to each distribution, and the values of each test statistic helped to determine the best-fitted distribution in each case. This was followed by the computation of the maximum peak discharge for the 5, 10, 25, 50, and 100 year return periods. The results of the analysis give detailed information on the flow discharge likely to be expected in the river at the various return periods, based on the observed historical data. This information will be beneficial for engineering purposes, such as designing structures in or near the river that may be affected by floods, as well as designing flood structures to protect against the expected events. This may include the design of dams, bridges, and flood control structures, which will reduce flood disasters in the study area or assist considerably in stormwater management in the flood-prone region. Findings from the study show that frequency analysis in flood management could help gather information on past flood events. This opens up the potential for extending the application of GIS into the analysis of flood extent and flood depth mapping. Going forward, the methods used in this study could be applied in developing flood hazard maps for each maximum peak discharge using GIS software, while the results of the study could be used by the government in flood management studies in . More significantly, the method and results could be used by development planners at the county and national levels in identifying flood-prone areas in the Burhi Gandak catchment and in developing mitigation strategies.

References

Alam, M. A., Emura, K., Farnham, C., & Yuan, J. (2018). Best-fit probability distributions and return periods for maximum monthly rainfall in Bangladesh. Climate, 6(1), 9.

Bhagat, N. (2017). Flood Frequency Analysis Using Gumbel's Distribution Method: A Case Study of Lower Mahi Basin, India. Journal of Water Resources and Ocean Science, 6(4), 51-54.

Adlouni, S., & Bobée, B. (2015). Hydrological frequency analysis using HYFRAN-PLUS software. Chaire Industrielle Hydro-Québec/CRSNG en Hydrologie Statistique/Institut National de la Recherche Scientifique (INRS)/Centre Eau, Terre et Environnement: Montréal, QC, Canada, 71.

Flynn, K. M., Kirby, W. H., & Hummel, P. R. (2006). User's manual for program PeakFQ, annual flood-frequency analysis using Bulletin 17B guidelines (No. 4-B4).

Garba, H., Ismail, A., & Oriola, F. O. P. (2013). Calibration of hydrognomon model for simulating the hydrology of urban catchment. Open Journal of Modern Hydrology, 3(2), 75-78.

Hassan, M. U., Hayat, O., & Noreen, Z. (2019). Selecting the best probability distribution for at-site flood frequency analysis; a study of Torne River. SN Applied Sciences, 1(12), 1629.

Hu, Q., Tang, Z., Zhang, L., Xu, Y., Wu, X., & Zhang, L. (2018). Evaluating climate change adaptation efforts on the US 50 states' hazard mitigation plans. Natural Hazards, 92(2), 783-804.

Kozanis, S., Christofides, A., Mamassis, N., Efstratiadis, A., & Koutsoyiannis, D. (2010). Hydrognomon – open source software for the analysis of hydrological data. European Geosciences Union General Assembly 2010.

Raes, D., Mallants, D., & Song, Z. (1970). RAINBOW: a software package for analysing hydrologic data. WIT Transactions on Ecology and the Environment, 18.

Sisay, D. (2015). Flood Risk Analysis in Illu Floodplain, Upper Awash River Basin, Ethiopia (Doctoral dissertation, Addis Ababa University).

Wuensch, K. L. (2011). Chi-Square Tests.