International Journal of Mechanical and Production Engineering Research and Development (IJMPERD) ISSN (P): 2249–6890; ISSN (E): 2249–8001 Vol. 10, Issue 3, Jun 2020, 7281-7290 © TJPR Pvt. Ltd.

CONVENTIONAL MODELING FOR FORECASTING ANNUAL RAINFALL IN CHENNAI CITY

M. NIRMALA
Department of Mathematics, Sathyabama Institute of Science and Technology, Tamilnadu, India

ABSTRACT

This research work deals with forecasting the annual rainfall of Chennai city, India, using conventional and computational models. Chennai is one of the most populous metropolitan cities in India, and advance knowledge of extreme precipitation helps disaster management in the control of floods and droughts. For this purpose, the annual rainfall of Chennai is modeled by filtering out the white noise, and the smoothed data series is then fed into the conventional Auto Regressive Integrated Moving Average (ARIMA) model for forecasting. The experimental results show that the conventional ARIMA model forecasts the series with good accuracy, and the error measure quantifies the performance of the proposed model.

KEYWORDS: Rainfall, Forecasting, ARIMA & Error measure


Received: May 25, 2020; Accepted: Jun 27, 2020; Published: Aug 07, 2020; Paper Id.: IJMPERDJUN2020688

1. INTRODUCTION

Rainfall prediction is one of the most important and challenging tasks carried out by meteorologists all over the world. It is also useful for disaster management in tackling extreme precipitation. Many forecasters carry out research to improve the accuracy of the models they construct. Many factors affect rainfall prediction, such as temperature, humidity, wind speed, pressure and dew point, and prediction becomes more complicated because real-life data are nonlinear in nature. Nowadays Artificial Neural Network algorithms are applied to time series analysis.

2. RELATED WORK

Artificial Neural Networks play a major role in learning and capturing the characteristics of nonlinear natural phenomena. Monira Sumi et al [9] derived forecasting models for the average daily and monthly rainfall of Fukuoka city in Japan using the artificial neural network, multivariate adaptive regression splines, the k-nearest neighbor and radial basis support vector regression, together with preprocessing techniques such as the moving average method and principal component analysis. The k-nearest neighbor, artificial neural network and extreme learning machine were applied to the seasonal forecasting of summer (June-September) and post-monsoon (October-December) rainfall for the Kerala state of India by Yajnaseni Dash et al, who showed that the extreme learning machine performed better than the other two models [12]. Fhira Nhita et al constructed a forecasting system with an evolving neural network algorithm to forecast the monthly rainfall of Bandung Regency in Indonesia [4] and achieved a forecast accuracy of 70%. Duong Tran Anh et al [3] combined two preprocessing methods, seasonal decomposition and the Discrete Wavelet Transform, to decompose the monthly rainfall series in Vietnam and constructed prediction models based on the Artificial Neural Network and the Seasonal Artificial Neural Network.

www.tjprc.org SCOPUS Indexed Journal [email protected] 7282 M. Nirmala

Kavitha Rani et al developed an Artificial Neural Network based rainfall prediction model with a teaching-learning optimization algorithm to improve the accuracy of the model [7]. Prediction models based on a feed forward neural network with back propagation and Levenberg-Marquardt algorithms were constructed by Neelam Mishra et al for forecasting one-month and two-month ahead rainfall of Northern India [10]. In this study, a forecasting model for annual rainfall in Chennai is constructed using the conventional ARIMA model.

3. STUDY AREA AND MATERIALS

Chennai is the capital city of the state of Tamilnadu in India. Its geographic location is 13°04'N latitude and 80°17'E longitude [5]. According to the 2011 Indian census, it is the sixth-most populous city and fourth-most populous urban agglomeration in India. Chennai has a tropical wet and dry climate. The hottest part of the year is late May to early June, with maximum temperatures around 35 °C – 40 °C (95–104 °F). The coolest part of the year is January, with minimum temperatures around 19 °C – 25 °C (66–77 °F). The lowest recorded temperature was 13.9 °C (57.0 °F) on 11 December 1895 and 29 January 1905. The highest recorded temperature was 45 °C (113 °F) on 31 May 2003. The average annual rainfall is about 140 cm (55 in). The city gets most of its seasonal rainfall from the north–east monsoon winds, from mid–October to mid–December. Cyclones in the Bay of Bengal sometimes hit the city. The highest annual rainfall recorded is 257 cm (101 in) in 2005. Prevailing winds in Chennai are usually southwesterly between April and October and north-easterly during the rest of the year.

Figure 1: Geographical Location, Boundaries and Water bodies of Chennai

Chennai has relied on the annual monsoon rains to replenish its water reservoirs, as no major rivers flow through the area. Chennai has a water table at 2 metres for 60 percent of the year. Two major rivers flow through Chennai, the Cooum River through the centre and the Adyar River to the south. A third river, the Kortalaiyar, travels through the northern fringes of the city before draining into the Bay of Bengal. All three rivers are heavily polluted by industrial waste, and hence their water is not suitable for public use.

As Chennai is one of the metropolitan cities of India, it depends entirely on rainfall to meet the daily water needs of its people. Due to extreme precipitation during December 2015, the people of Chennai suffered for many days without food. Many lost their lives and their belongings during the flood. Hence advance knowledge of floods and droughts is necessary for disaster management to take precautionary action against both.

This research article develops a prediction model based on ARIMA for predicting the annual rainfall in Chennai. For this purpose, a dataset containing the monthly rainfall of Chennai from 1901 to 2017 was obtained from the India Meteorological Department, Pune, India and the India Water Portal.

Impact Factor (JCC): 8.8746 SCOPUS Indexed Journal NAAS Rating: 3.11 Conventional Modeling for Forecasting Annual Rainfall in Chennai City 7283

4. METHODOLOGY

4.1 AutoCorrelation and Partial AutoCorrelation Functions

Let $z_1, z_2, z_3, \ldots, z_n$ denote observations made at equidistant time intervals. A stochastic process is a statistical phenomenon that evolves in time according to probabilistic laws. The process is stationary if the properties of the time series are unaffected by a change of time origin. The covariance between $z_t$ and its value $z_{t+k}$, separated by $k$ intervals of time, must under the stationarity assumption be the same for all $t$. It is called the autocovariance at lag $k$ and is defined as

$$\gamma_k = \mathrm{cov}[z_t, z_{t+k}] = E[(z_t - \mu)(z_{t+k} - \mu)] \qquad (1)$$

where $\mu$ is the mean of the process.

The autocorrelation function (ACF) and the partial autocorrelation function (PACF) explain the internal structure of the analysed series. The ACF $\rho_k$ at lag $k$ of the time series $z_t$ is the linear correlation coefficient between $z_t$ and $z_{t+k}$:

$$\rho_k = \frac{E[(z_t - \mu)(z_{t+k} - \mu)]}{\sqrt{E[(z_t - \mu)^2]\, E[(z_{t+k} - \mu)^2]}} \qquad (2)$$

The PACF is the linear correlation between $z_t$ and $z_{t+k}$ after controlling for the possible effects of linear relationships among values at intermediate lags.
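As an illustration of the definitions above, the sample autocovariance, ACF and PACF can be computed as in the following minimal Python sketch. The function names and the toy series are assumptions made for illustration and are not taken from the study. The divisor n in the sample autocovariance and the use of the Durbin-Levinson recursion for the PACF are conventional choices, not something the text specifies.

```python
def autocovariance(z, k):
    """Sample autocovariance at lag k (divisor n, a common convention)."""
    n = len(z)
    mean = sum(z) / n
    return sum((z[t] - mean) * (z[t + k] - mean) for t in range(n - k)) / n


def acf(z, max_lag):
    """Sample autocorrelations r_0 .. r_max_lag, each gamma_k / gamma_0."""
    g0 = autocovariance(z, 0)
    return [autocovariance(z, k) / g0 for k in range(max_lag + 1)]


def pacf_from_acf(r):
    """Durbin-Levinson recursion: r[0] = 1 and r[1..K] -> [phi_11, ..., phi_KK]."""
    pacf = []
    prev = []  # coefficients phi_{k-1,1} .. phi_{k-1,k-1}
    for k in range(1, len(r)):
        num = r[k] - sum(prev[j] * r[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(prev[j] * r[j + 1] for j in range(k - 1))
        phi_kk = num / den
        prev = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)] + [phi_kk]
        pacf.append(phi_kk)
    return pacf


toy_series = [2.0, 4.0, 6.0, 4.0, 2.0, 4.0, 6.0, 4.0]  # hypothetical data
r = acf(toy_series, 3)

# Theoretical ACF of an AR(1) process with coefficient 0.5 is 0.5**k;
# its PACF is non-zero only at lag 1.
ar1_pacf = pacf_from_acf([1.0, 0.5, 0.25, 0.125])
```

In the second example the PACF vanishes after lag 1, which is the cut-off behaviour expected of a pure autoregressive process of order one.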

4.2 Auto Regressive Integrated Moving Average (ARIMA) Model

The ARIMA models are a combination of autoregressive models and moving average models. The autoregressive models AR(p) base their predictions of the value of a variable $x_t$ on a number p of past values of the same variable (the autoregressive delays) $x_{t-1}, x_{t-2}, \ldots, x_{t-p}$, together with a random disturbance $e_t$. The moving average models MA(q) generate predictions of a variable $x_t$ based on a number q of past disturbances of the same variable (the prediction errors of past values) $e_{t-1}, e_{t-2}, \ldots, e_{t-q}$. The combination of the autoregressive and moving average models AR(p) and MA(q) generates more flexible models called ARMA(p,q) models. A simple autoregressive model can be represented as

$$Y_t = b_0 + b_1 Y_{t-1} + b_2 Y_{t-2} + \cdots + b_p Y_{t-p} + e_t \qquad (3)$$

where $Y_t$ is the forecast variable, $Y_{t-1}, \ldots, Y_{t-p}$ are the explanatory variables and $e_t$ is the error term. The equation is called an autoregression because the explanatory variables are time-lagged values of the forecast variable itself. The moving average model uses the past errors as explanatory variables. A simple moving average model is represented as follows:

$$Y_t = b_0 + b_1 e_{t-1} + b_2 e_{t-2} + \cdots + b_q e_{t-q} + e_t \qquad (6)$$
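The autoregressive and moving average forms above can be read as one-step prediction rules. The Python sketch below is illustrative only: the function names and the numerical coefficients are hypothetical, and the current error term is replaced by its expected value of zero when predicting.

```python
def ar_predict(past_values, b0, b):
    """One-step AR(p) prediction: b0 + b1*Y[t-1] + ... + bp*Y[t-p].

    past_values is ordered oldest to newest; the error e_t is taken as 0.
    """
    lags = past_values[::-1][:len(b)]  # Y[t-1], Y[t-2], ..., Y[t-p]
    return b0 + sum(bi * y for bi, y in zip(b, lags))


def ma_predict(past_errors, b0, b):
    """One-step MA(q) prediction: b0 + b1*e[t-1] + ... + bq*e[t-q].

    past_errors is ordered oldest to newest; the error e_t is taken as 0.
    """
    lags = past_errors[::-1][:len(b)]  # e[t-1], e[t-2], ..., e[t-q]
    return b0 + sum(bi * e for bi, e in zip(b, lags))


# Hypothetical coefficients and history, chosen only to show the arithmetic.
ar_forecast = ar_predict([1.0, 2.0, 3.0], b0=0.5, b=[0.5, 0.25])  # 0.5 + 0.5*3 + 0.25*2
ma_forecast = ma_predict([0.2, -0.4], b0=1.0, b=[0.5, 0.25])      # 1 + 0.5*(-0.4) + 0.25*0.2
```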

The autoregressive model and the moving average model are coupled to form a general and useful class of time series models called ARMA. This class can be extended to the ARIMA model for non-stationary time series by differencing the data series. Real-world data are frequently non-stationary, and a non-stationary series cannot be modelled or forecast reliably in its raw form. It needs to be transformed into a stationary series to obtain consistent and reliable results. The Box-Jenkins methodology requires the time series being analysed to be stationary.

The ARIMA model of order (p,d,q) or an ARIMA process is defined as

d (B) zt  (B)et (4)

where $z_t$ denotes the observed value at time $t$, $B$ is the backward shift operator ($B z_t = z_{t-1}$), $\nabla = 1 - B$ is the differencing operator, the autoregressive operator $\phi(B)$ is of order $p$, $d$ is the order of differencing and the moving average operator $\theta(B)$ is of order $q$.

$\phi(B)$ and $\theta(B)$ have the following representations:

$$\phi(B) = 1 - \phi_1 B - \phi_2 B^2 - \cdots - \phi_p B^p \qquad (5)$$

$$\theta(B) = 1 - \theta_1 B - \theta_2 B^2 - \cdots - \theta_q B^q \qquad (6)$$

For the stationary process, the ARIMA model is

(B)zt (B)et (7)

Building an ARIMA model is an iterative process of identification, estimation, diagnosis and forecasting. Identification of a model may be accomplished on the basis of the data pattern, the time series plot and the autocorrelation and partial autocorrelation functions. The first and foremost step in any time series analysis is to plot the observations against time. The time series plot displays the trend, seasonality, outliers and discontinuities of the series, and it is vital both for describing the data and for formulating a sensible model.

The autocorrelation coefficients measure the correlation between observations at different times. Arranging a set of autocorrelation coefficients as a function of separation in time gives the sample autocorrelation function, or ACF. The autocorrelation coefficient is analogous to the product-moment correlation coefficient. The partial autocorrelation function (PACF) is an estimate of the additional correlation between the data values at times t and (t-k) after adjusting for the correlation of the values between times t and (t-k). The general characteristics of the ACF and PACF are tabulated in Table 1.

Table 1: General Characteristics of Various Models

Model   ACF                                PACF
AR      Spikes decay towards zero          Spikes at lags 1 to p, then zero
MA      Spikes at lags 1 to q, then zero   Spikes decay to zero
ARMA    Spikes decay to zero               Spikes decay to zero

Here, a spike is the line drawn at each lag in the ACF and PACF plots, with length equal to the magnitude of the autocorrelation at that lag. If the series is not stationary, differencing can be used to make the data stationary. A difference of order one means that each value of the series is subtracted from the next neighbouring value:

$$w_t = z_t - z_{t-1} \qquad (8)$$

The new time series $w_t$ has one observation fewer than the original series. A difference of order two means that the order-one differenced series is differenced again:

$$y_t = w_t - w_{t-1} = z_t - 2z_{t-1} + z_{t-2} \qquad (9)$$

The series $y_t$ has two observations fewer than the original series. This process can be continued until a stationary series is obtained.
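The differencing steps in equations (8) and (9) can be sketched in a few lines of Python. The function name and the toy series are assumptions for illustration.

```python
def difference(z, d=1):
    """Apply first-order differencing d times; each pass drops one observation."""
    for _ in range(d):
        z = [z[t] - z[t - 1] for t in range(1, len(z))]
    return z


z = [3.0, 5.0, 8.0, 12.0, 17.0]  # hypothetical series with a trend
w = difference(z, 1)             # order-one difference, equation (8)
y = difference(z, 2)             # order-two difference, equation (9)

# Direct form of equation (9): z_t - 2 z_{t-1} + z_{t-2}
y_direct = [z[t] - 2 * z[t - 1] + z[t - 2] for t in range(2, len(z))]
```

For this toy series the second difference is constant, and the lengths of w and y are one and two less than the length of z, as described in the text.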

Some general rules for identifying the order of the ARIMA model are:

• If the ACF dies out rapidly, it indicates that the series is stationary. Hence there is no need for differencing (i.e. d = 0).

• If the ACF does not die out rapidly, it indicates that the series is non-stationary and in need of differencing, i.e. d > 0. The series is then differenced until a rapidly dying ACF is obtained. In this case, d is equal to the number of differences taken.

• If the ACF decays exponentially and the partial autocorrelations $\phi_{kk}$ are non-zero up to lag k in the PACF, it suggests p = k and q = 0.

• If the PACF decays exponentially and the autocorrelations $\rho_k$ are non-zero up to lag k in the ACF, it suggests p = 0 and q = k.

• If the ACF and PACF are both mixtures of exponentials and damped sine waves, and the exponential decay starts after the $k_1$th and $k_2$th lags for the ACF and PACF respectively, it suggests p = $k_1$ and q = $k_2$.
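Applying the rules above requires deciding which spikes count as non-zero. A common choice, assumed here and not stated in the text, is to treat a sample autocorrelation as significant when its magnitude exceeds the approximate 95% bound 1.96/sqrt(n), where n is the series length. The following Python sketch, with a hypothetical function name and made-up correlation values, lists the significant lags.

```python
import math


def significant_lags(r, n, z_crit=1.96):
    """Return lags k >= 1 whose correlation exceeds the bound z_crit / sqrt(n).

    r[0] is the lag-0 value and is ignored; z_crit = 1.96 is an assumed
    approximate 95% threshold.
    """
    bound = z_crit / math.sqrt(n)
    return [k for k in range(1, len(r)) if abs(r[k]) > bound]


# Hypothetical correlations for a series of 100 observations (bound = 0.196).
lags = significant_lags([1.0, 0.5, 0.1, -0.3], n=100)
```

The same helper can be applied to ACF or PACF values, so the lag at which spikes stop being significant gives a tentative value for q or p.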

The next step involves the estimation of the parameters of the tentative model. After the tentative model has been identified, its parameters are estimated using the least squares method and tested for statistical significance. If the parameter estimates do not meet the stationarity condition, a new model should be identified and its parameters estimated and tested. Once a suitable model has been found, it should be diagnosed, that is, checked for whether it provides an adequate description of the data. In the diagnosis process, the autocorrelations of the residuals from the estimated model should be sufficiently small and the residuals should resemble white noise. Residuals are generally defined as the observed value minus the fitted value. The residuals are therefore plotted as a time series and their correlograms are calculated. If the residuals remain significantly correlated among themselves, a new model should be identified, estimated and diagnosed. Once the model is selected, it is used to forecast.
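One standard way to summarise whether the residual autocorrelations are jointly small is the Ljung-Box (modified Box-Pierce) statistic, Q = n(n+2) * sum over k = 1..m of r_k^2 / (n-k), which is compared with a chi-square distribution. The Python sketch below computes Q only. The function name and the example residual correlations are assumptions for illustration.

```python
def ljung_box(residual_acf, n, m):
    """Ljung-Box Q statistic from residual autocorrelations.

    residual_acf[0] is the lag-0 value (ignored); lags 1..m are used.
    n is the number of residuals.
    """
    return n * (n + 2) * sum(residual_acf[k] ** 2 / (n - k) for k in range(1, m + 1))


# Hypothetical residual autocorrelations for n = 50 residuals.
q_stat = ljung_box([1.0, 0.1, -0.2], n=50, m=2)
```

A small Q relative to the chi-square reference value indicates that the residuals are consistent with white noise.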

4.3 Performance Evaluation Criteria

The development, examination and recommendation of methods that may be used to determine and compare the accuracy and precision of models are of primary interest. After fitting a number of different statistical forecasting models to a time series, performance evaluation criteria are used to determine model accuracy and the extent to which the model's behaviour is consistent with the data. The error measure used in this research work is the Mean Absolute Percentage Error (MAPE).

MAPE is commonly used in quantitative forecasting methods as it produces a measure of relative overall fit. The absolute values of all the percentage errors are summed up and the average is computed.

$$\mathrm{MAPE} = \frac{100}{N} \sum_{t=1}^{N} \left| \frac{E_t}{Y_t} \right| \qquad (10)$$

where $Y_t$ and $E_t$ denote the observed (desired) value and the corresponding error at $t = 1, 2, \ldots, N$ respectively.
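Equation (10) can be computed directly, as in the following minimal Python sketch. The function name and the sample values are hypothetical, and the error is taken as the observed value minus the forecast.

```python
def mape(actual, forecast):
    """Mean Absolute Percentage Error: (100/N) * sum(|E_t / Y_t|)."""
    n = len(actual)
    return 100.0 / n * sum(abs((a - f) / a) for a, f in zip(actual, forecast))


# Two hypothetical observations, each forecast with a 10% error.
error_pct = mape([100.0, 200.0], [110.0, 180.0])
```

Because each error is divided by the observed value, the measure is undefined when an observation is zero.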

5. RESULTS AND DISCUSSIONS

A time series plot is used to display the variation of the annual rainfall series over time.


Figure 2: Time Series Plot of Annual Rainfall in Chennai

To check the normality of the rainfall series, the Anderson-Darling test is applied. The Anderson-Darling test compares the empirical cumulative distribution function of the data with the distribution expected if the data are normal.

Figure 3: Anderson-Darling Test for Normality

If this observed difference is sufficiently large, the test rejects the null hypothesis of population normality. The test statistic is 0.618 with p > 0.001, on which basis the series is treated as normally distributed. Visual inspection of the time series plot (Figure 2) is often enough to judge whether a series is stationary or non-stationary. Visual inspection of the sample autocorrelation function (Figure 4) shows that the rainfall series is stationary, since the autocorrelation function dies out rapidly, reaching zero within one or two lag periods.
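For reference, the Anderson-Darling statistic for normality can be sketched in plain Python as below. This is the basic statistic with the mean and standard deviation estimated from the sample and without any small-sample correction; it is not necessarily the exact variant used by the statistical package in this study. The function names and both toy samples are assumptions for illustration.

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def anderson_darling(data):
    """A^2 = -n - (1/n) * sum((2i-1) * [ln F(y_i) + ln(1 - F(y_{n+1-i}))])."""
    n = len(data)
    mean = sum(data) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in data) / (n - 1))
    y = sorted((x - mean) / sd for x in data)  # standardised, ascending
    s = sum((2 * i + 1) * (math.log(normal_cdf(y[i]))
                           + math.log(1.0 - normal_cdf(y[n - 1 - i])))
            for i in range(n))
    return -n - s / n


# Hypothetical samples: one roughly symmetric, one with a single extreme value.
a2_symmetric = anderson_darling([-1.5, -1.0, -0.6, -0.3, -0.1, 0.1, 0.3, 0.6, 1.0, 1.5])
a2_skewed = anderson_darling([1.0] * 9 + [50.0])
```

A larger statistic indicates a larger departure from normality, so the skewed sample yields a much larger value than the symmetric one.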


Figure 4: ACF and PACF plots for Annual Rainfall series

Seasonality is defined as a pattern that repeats itself over fixed intervals of time. For stationary data, seasonality can be found by identifying those autocorrelation coefficients at more than two or three time lags that are significantly different from zero. Since the time series considered here is the annual rainfall of Chennai, the time series is non-seasonal. Since the autocorrelation function dies out rapidly, that is, it reaches zero within one or two lag periods, the time series is stationary. Hence there is no need for differencing. Therefore d = 0.

Figure 5: Residual plots for Annual Rainfall series

The partial autocorrelation function decays exponentially and the corresponding autocorrelation function coefficients are non-zero for the first 4 lags (see Figure 4 for both plots), which suggests p = 3 and q = 3. Hence the selected model is ARIMA(3,0,3).


Table 2: Final Estimates of Parameters

Type       Coef      SE Coef   T-Value   P-Value
AR 1       -0.111    0.331     -0.33     0.738
AR 2       -0.494    0.233     -2.12     0.036
AR 3       0.633     0.315     2.01      0.047
MA 1       -0.145    0.357     -0.41     0.685
MA 2       -0.625    0.212     -2.94     0.004
MA 3       0.499     0.366     1.36      0.175
Constant   6.8625    0.0280    245.03    0.000
Mean       7.0585    0.0288

Table 3: Modified Box-Pierce (Ljung-Box) Chi-Square Statistic

Lag          12      24      36      48
Chi-Square   5.45    8.40    13.39   21.82
DF           5       17      29      41
P-Value      0.363   0.957   0.994   0.994
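The entries of Tables 2 and 3 can be cross-checked with simple arithmetic. The Python sketch below rests on three assumptions that the text does not state explicitly: each T-value is the coefficient divided by its standard error, the constant equals the mean multiplied by (1 - phi_1 - phi_2 - phi_3), and the degrees of freedom equal the lag minus the seven estimated parameters (three AR, three MA and the constant). The numbers are copied from the tables, so small discrepancies due to rounding are expected.

```python
# (coefficient, standard error, reported T-value) from Table 2
rows = {
    "AR 1": (-0.111, 0.331, -0.33),
    "AR 2": (-0.494, 0.233, -2.12),
    "AR 3": (0.633, 0.315, 2.01),
    "MA 1": (-0.145, 0.357, -0.41),
    "MA 2": (-0.625, 0.212, -2.94),
    "MA 3": (0.499, 0.366, 1.36),
}

# Assumed relation: T = Coef / SE Coef
t_values = {name: coef / se for name, (coef, se, _) in rows.items()}

# Assumed relation: constant = mean * (1 - sum of AR coefficients)
ar_coefs = [rows[name][0] for name in ("AR 1", "AR 2", "AR 3")]
implied_constant = 7.0585 * (1 - sum(ar_coefs))

# Assumed relation: DF = lag - number of estimated parameters (7)
implied_df = [lag - 7 for lag in (12, 24, 36, 48)]
```

Under these assumptions the implied constant is about 6.861, close to the reported 6.8625, and the implied degrees of freedom match the DF row of Table 3. All four P-values in Table 3 are well above 0.05, which is consistent with uncorrelated residuals.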

6. CONCLUSIONS

Traditional statistical time series forecasting methods, including moving average, exponential smoothing and Auto-Regressive Moving Average, all assume stationarity and linearity of the time series. Artificial Neural Networks do not require any stationarity constraint on the time series to be learned and predicted, and their highly flexible nonlinear regressive structure can fit the target pattern space. In this research article, a prediction model based on rainfall parameters has been constructed using ARIMA, and the error measure shows the performance of the model. Based on the results, it can be concluded that conventional models can still play a major role in forecasting real-world problems.

REFERENCES

1. Dariush Rahimi, Sadat Hasheminasab.: Analysis water Quality by Artificial Neural Network in Bazoft River (Iran). J Chem. Pharma. Res. 9(3), 115-121 (2017)

2. Dawson, C. W., Wilby, R. L.: Hydrological Modeling using Artificial Neural Networks. Prog. Phy. Geography. 25(1), 80-108 (2001)

3. Duong Tran Anh, Thanh Duc Dang, Song Pham Van.: Improved Rainfall Prediction using Combined Pre-Processing methods and Feed-Forward Neural Networks. 2, 65-83 (2019)

4. Fhira Nhita, Adiwijaya, Wisesty, U. N. Izzatul Ummah.: Planting Calendar Forecasting System using Evolving Neural Network. Far East J. Elect Commu., 14(2), 81-92 (2015)

5. Geography of Chennai, https://en.wikipedia.org/wiki/Geography_of_Chennai

6. Hung, N. Q., Babel, M. S., Weesakul, S., Tripathi, N. K.: An Artificial Neural Network model for rainfall forecasting in Bangkok, Thailand. Hydrol. Earth Syst. Sci. 13, 1413-1425 (2009)

7. Kavitha Rani, B, Srinivas, K, Govardhan, A.: Rainfall Prediction with TLBO Optimized ANN. 73, 643-647 (2014)

8. Mehnaza Akhter.: Application of ANN for the Hydrological Modeling. I.J. Research Appl Sci. Engg. Tech., 5(7), 203-213 (2017)


9. Monira Sumi, S., Faisal Zaman, M., Hideo Hirose.: A Rainfall Forecasting Method using Machine Learning Models and its Application to the Fukuoka City case. Int. J. Appl. Math. Comput. Sci. 22(4), 841-854 (2012)

10. Neelam Mishra, Hemant Kumar Soni, Sanjiv Sharma, Upadhyay, A.K.: Development and Analysis of Artificial Neural Network Models for Rainfall Prediction by using Time-Series Data. I.J. Intelligent Systems and Applications, 1, 16-23 (2018)

11. Nirmala, M.: Computational Models for Forecasting Annual Rainfall in Tamilnadu. Appl. Math. Sci. 9(13), 617-621 (2015).

12. Yajnaseni Dash, Saroj K. Mishra, Bijaya K. Panigrahi.: Rainfall Prediction for the Kerala state of India using Artificial Intelligence approaches. Cop and Elec Engg. 70, 66-73 (2018)
