Artificial Neural Network Application in Water Resource Management and Flood Warning: Case Study North West Malaysia

Hassanuddin Mohamed Noor

This thesis is submitted in partial fulfilment of the requirements for the award of the degree of Doctor of Philosophy of the University of Portsmouth

Abstract

Disasters caused by floods are a major cause of loss of property and lives. The unpredictability of weather conditions due to changing weather patterns not only leads to flooding but also contributes to water resource management problems. Rapid development in many tropical countries, such as Malaysia, has resulted in the loss of natural floodplains, leading to an increase in flooding and water shortages. A sufficiently advanced flood warning system that can save lives and property can be developed using an accurate river model. The work reported in this thesis makes significant contributions to the prediction of river flow rate based on rainfall rate in the catchment area using an Artificial Neural Network (ANN). The proposed approach models the non-linear rainfall-runoff process in a wide variety of catchment area conditions. This study demonstrates a significant improvement in the accuracy and reliability of water resource management by using ANN modelling to predict river flow rate. It also shows that ANN is a fast and adaptable approach that is suitable for river flow rate modelling and does not need detailed geographical information about the catchment area. Its attractiveness lies in its ability to adapt to changing conditions, so it does not become outdated like conventional hydrology models. The research shows that river flow rate is a better parameter for an early flood warning system because it is more sensitive to rainfall rate than the river level, which is used in conventional flood warning systems. The study has also shown that an ANN with a feed-backward network with one hidden layer provides the best results and is able to produce river flow rate predictions up to 132 hours ahead with a root mean square error of 0.02. This is a significant contribution, as the flood warning system currently used in Malaysia can only predict flooding within 8 to 24 hours. The work in this thesis can assist the authorities in managing water from dams, thereby effectively managing floods and ensuring sufficient water for domestic and agricultural use. The findings of this research have already been presented to the Malaysian government agency responsible for managing waterways and dams.

Acknowledgement

This dissertation would never have been completed without the guidance of my advisors and committee members, help from friends, and support from my family, my wife and Majlis Amanah Rakyat (MARA), which sponsored my study.

I would like to express my deepest gratitude to my study director, Dr David L Ndzi, for his excellent guidance and encouragement, and for providing me with an excellent atmosphere for doing research. I would like to thank Dr David Sanders and Dr Shikun Zhou for serving on my annual review panel and helping me to develop the scope of my research. My thanks also go to the Department of Irrigation and Drainage (DID) Malaysia for providing the valuable data used in this study.

I am indebted to my friends and the members of the School of Engineering, especially those from Room A2.09, and to Dr Guangguang Yang, because this research would not have been possible without their help. I would like to thank my parents, children and friends for their prayers and best wishes. This dissertation was made possible by the undivided support and encouragement of my wife, Noor Azma. She was always there and stood by me through the good and bad times. Special thanks also go to my previous employer, Kolej Kemahiran Tinggi MARA Kemaman (KKTM), Bahagian Kemahiran dan Teknikal MARA (BKT) and Bahagian Sumber Manusia MARA (BSM) for supporting my ambition to complete the doctoral degree at the University of Portsmouth.

Declaration

Whilst registered as a candidate for the above Degree, I have not been registered for any other research award. The results and conclusions embodied in this thesis are the work of the named candidate and have not been submitted for any other academic award.

Table of Contents

Abstract
Acknowledgements
Table of Contents
List of Figures
List of Tables
List of Acronyms
Publications

CHAPTER 1. INTRODUCTION
1.1 Research Background
1.2 Introduction to Weather in Malaysia
1.3 Statement of Aims
1.4 Objectives
1.5 Justification
1.6 Background and Research Context
1.7 Flood Warning System in Malaysia
1.7.1 Telemetry System
1.7.2 Manual Water River Level Monitoring
1.7.3 Flood Warning Board
1.7.4 Mass media
1.8 Structure of the Thesis
1.9 Major Outcome

CHAPTER 2. LITERATURE REVIEW
2.1 Existing Flood Forecasting Techniques
2.1.1 River and Rain-Gauge Networks
2.1.2 Radar and Geographic Information Systems
2.1.3 Linear Statistical Models
2.1.4 Nonlinear Time Series Analysis and Prediction
2.1.5 Artificial Neural Networks (ANN)
2.2 Weather radar
2.3 Satellite-Based Weather System
2.4 Introduction to Basic Hydrology Modelling
2.4.1 Rational method
2.4.2 Time and area diagram (TAD)
2.4.3 Unit Hydrograph
2.5 Approaches to Hydrological Modelling
2.5.1 Deterministic Models
2.5.2 Conceptual Models
2.5.3 Empirical Models
2.6 Summary
2.7 Major Outcome

CHAPTER 3. ARTIFICIAL NEURAL NETWORKS (ANN)
3.1 Introduction
3.2 Nonlinear AutoRegressive Neural Network with eXogenous Input
3.3 Types of ANN
3.4 Development of an ANN
3.4.1 ANN Architecture
3.4.2 Training an ANN model
3.4.2.1 Levenberg-Marquardt (LM) Algorithm
3.4.2.2 Bayesian Regularization (BR) Algorithm
3.4.3 Network Design Configuration
3.4.4 Variable Selection
3.4.5 Data Pre-Processing
3.4.6 Training, Testing, and Validation
3.5 Neural network paradigms
3.6 Evaluation criteria
3.7 Neural network training
3.8 ANN application in prediction model
3.9 Implementation
3.10 Model Evaluation
3.11 Correlation Coefficient
3.12 Advantages and Disadvantages of ANN for Hydrological Modelling
3.13 Summary
3.14 Major Outcome

CHAPTER 4. METHODOLOGY
4.1 Study area
4.2 Mann-Kendall (MK) trend test
4.3 Rain Dynamics
4.4 Catchment Area Characteristic
4.5 Surface-runoff relationship
4.6 ANN-based river flow rate model
4.7 Comparison between river level and river flow rate measurement
4.8 Summary
4.9 Major Outcome

CHAPTER 5. RESULTS
5.1 Rainfall statistics
5.2 Comparison between level monitoring and flow monitoring for flood observation
5.2.1 Case 1: River Pelarit Flood event on 30th October to 1st November 2010
5.2.2 Case 2: River Jarum Flood event on 30th October to 1st November 2010
5.2.3 Case 3: River Pelarit, Flood event on 1 April 2011
5.2.4 Case 4: River Jarum, Flood event on 1 April 2011
5.2.5 Summary for comparison between level monitoring and flow measurement
5.3 Rainfall in the catchment area and river flow rate
5.3.1 Dry period before rain against time delay to change flow rate
5.3.2 Rain-No change condition
5.4 ANN-based River Flow Model
5.4.1 Model I: Flow Rate Model for River Pelarit using LM Training Algorithm
5.4.2 Model II River Flow Rate Model for Pelarit using BR Algorithm
5.4.3 Model III River Flow Rate for River Pelarit using LM Training Algorithm
5.4.4 Model IV River Flow for River Pelarit using BR Training Algorithm
5.4.5 Summary Experiment for River Flow Model River Pelarit
5.4.6 Model V River Flow Model for River Jarum - LM Training Algorithm (Oct 2013)
5.4.7 Model VI River Flow Model for River Jarum - BR Training Algorithm (Oct 2013)
5.4.8 Model VIII River Flow for River Jarum - LM Training Algorithm (Oct 2010)
5.4.9 Summary Experiment for River Flow Model for River Pelarit
5.5 Summary
5.6 Major Outcome

CHAPTER 6. CONCLUSION AND FUTURE WORK
6.1 Summary of the Study
6.2 Contributions
6.3 Recommendations for Future Works in ANN Rainfall-Runoff Modelling

REFERENCES

APPENDICES
Form UPR16 - Research Ethics Review Checklist
Certificate of Ethics Review

List of Figures

CHAPTER 1 INTRODUCTION
Fig. 1.1 Monsoon seasons in Peninsular Malaysia.
Fig. 1.2 Annual rain distribution for Peninsular Malaysia
Fig. 1.3 Flood-prone area in Malaysia
Fig. 1.4 Weather warning issued by MMD.
Fig. 1.5 Thesis structure

CHAPTER 2 LITERATURE REVIEW
Fig. 2.1 The phase of flood event for water reservoir or lake.
Fig. 2.2 Peninsular Malaysia weather radar network
Fig. 2.3 The time area diagram obtained by splitting the basin into n areas according to flow time to the outlet.
Fig. 2.4 Snyder's synthetic unit hydrograph

CHAPTER 3 ARTIFICIAL NEURAL NETWORKS
Fig. 3.1 Layout of feed-forward neural network
Fig. 3.2 Two-layer network with abbreviated notation
Fig. 3.3 Local and global minima of errors
Fig. 3.4 Eight steps in designing neural networks forecasting model
Fig. 3.5 Training stopping due to convergence
Fig. 3.6 A schematic of the ANN that has been applied in the model

CHAPTER 4 METHODOLOGY
Fig. 4.1 Development of River Flow Modelling for Timah Tasoh Basin
Fig. 4.2 Annual rain distribution for Peninsular Malaysia
Fig. 4.3 Location of the hydrological station in Perlis, Malaysia.
Fig. 4.4 Land usage in Timah Tasoh Reservoir catchment area
Fig. 4.5 The distance of River Pelarit from the gauging station in Kg Batu Limabelas.
Fig. 4.6 River Jarum - gauging station at Kg Masjid to Timah Tasoh Lake
Fig. 4.7 Monthly distribution in 2013 for rain stations in Perlis.
Fig. 4.8 Location of four rain stations in Timah Tasoh catchment.
Fig. 4.9 Different scenarios of rain gauge grid for rain cell detection
Fig. 4.10 The direction of rain cell movement within the catchment area
Fig. 4.11 Padang Besar daily rainfall in the year of 2010.
Fig. 4.12 Relation of rainfall and river flow in the basin of Timah Tasoh
Fig. 4.13 Water Level Classification at Flood Warning Centre

CHAPTER 5 RESULTS
Fig. 5.1 Monthly average for the accumulated rainfall for 14 meteorology stations in Perlis from 2010 to 2013
Fig. 5.2 Comparison of Rainfall Rate from 2010 to 2013 in Perlis
Fig. 5.3 Rain intensity in 2010 to 2013 for 14 weather stations in Perlis
Fig. 5.4 Rainfall for Kaki Bukit, level and flow rate of River Pelarit during the 2010 flooding event
Fig. 5.5 The plot of Jarum River Level, Flow rate and rainfall at Padang Besar Weather station.
Fig. 5.6 Plot of River Pelarit Level, Flow rate and rainfall at Kaki Bukit weather station
Fig. 5.7 Rainfall during the 2011 flooding event for Padang Besar and River Jarum at a recording interval of 5 minutes
Fig. 5.8 Relation of rainfall and flow in the basin of Timah Tasoh.
Fig. 5.9 Simplified diagram of Timah Tasoh Catchment Area
Fig. 5.10 Prediction from Model 1: River Pelarit. Deterministic forecast: (a) observed and predicted time series; (b) residual plot; (c) scatter plot.
Fig. 5.11 Prediction from Model 2: River Pelarit for Oct. 2013 dataset. Deterministic forecast: (a) observed and predicted time series; (b) residual plot; (c) scatter plot.
Fig. 5.12 Prediction from Model 3: River Pelarit for Nov 2010 dataset. Deterministic forecast: (a) observed and predicted time series; (b) residual plot; (c) scatter plot.
Fig. 5.13 Prediction from Model 2: River Pelarit for Oct. 2013 dataset
Fig. 5.14 Prediction from Model V: River Jarum for Oct. 2013 dataset
Fig. 5.15 Prediction from Model VI: River Jarum for Oct. 2013 dataset
Fig. 5.16 Prediction from Model VI: River Jarum for Oct. 2010 dataset
Fig. 5.17 Prediction from Model VI: River Jarum for Oct. 2010 dataset

CHAPTER 6 CONCLUSIONS AND FUTURE WORK
Fig. 6.1 Simplified analysis for model flood prediction in Timah Tasoh

List of Tables

Table 3.1 Summary of the settings to be used in this experiment and their values.
Table 4.1 Detail of land use in Timah Tasoh catchment area.
Table 4.2 Four significance levels and the required number of data n.
Table 4.3 Rain gauge stations in Timah Tasoh catchment
Table 4.4 Rain records for 4 stations in Timah Tasoh Catchment.
Table 4.5 Total of data for each station
Table 4.6 Related stations for Pelarit and River Jarum.
Table 4.7 Stage levels of rivers (metres) in Perlis, Malaysia
Table 5.1 The total and duration of rainfall from 2010 to 2013 for 14 stations in Perlis
Table 5.2 Frequency of rainfall rates for 10 to 150 mm/h in Perlis
Table 5.3 Descriptive statistics of 5-minute rainfall for 14 stations in Perlis for 2010 to 2013
Table 5.4 Descriptive statistics of daily rainfall for 14 stations in Perlis for 2010 to 2013.
Table 5.5 Maximum rainfall rate (mm/h) for 2010 to 2013 for 14 stations in Perlis
Table 5.6 Number of rain occasions in 2013 at 14 rain stations in Perlis
Table 5.7 Percentage of rain events in 4 different sessions
Table 5.8 Stage levels of River Pelarit (metres)
Table 5.9 River level and flow rate changes during 2010 flood event
Table 5.10 Summary of parameter changes during 2010 flood event
Table 5.11 Water level as classified by DID for River Jarum
Table 5.12 Summary of parameter changes during 2011 flood event
Table 5.13 Summary of parameter changes during 2011 flood event
Table 5.14 Dry time and Flow delay for River Pelarit and River Jarum.
Table 5.15 Rain with no changes in Pelarit and River Jarum
Table 5.16 Details of Input and the Output for Pelarit and River Jarum Flow Prediction Model.
Table 5.17 River Pelarit (Training: train LM, Target: Oct 2013). Grey shading denotes the best performing model.
Table 5.18 River Pelarit (Training: train BR, Target: Oct 2013). Grey shading denotes the best performing model.
Table 5.19 Pelarit Prediction Model. (Training: train BR, Target: Nov 2010). Grey shading denotes the best performing model.
Table 5.20 Pelarit (Training: train LM, Target: Oct 2010). Grey shading denotes the best performing model
Table 5.21 Overall of the best prediction model performances based on different training algorithms for River Pelarit.
Table 5.22 Jarum Prediction (Training: train LM, Target: Oct 2010)
Table 5.23 Jarum (Training: train BR, Target: Oct 2010)
Table 5.24 Jarum (Training: train BR, Target: Oct 2010)
Table 5.25 Model of Prediction for River Jarum (train BR, Target: Oct 2010).
Table 5.26 Best River Flow Model performance based on different training algorithms for River Jarum.

List of Acronyms

AMSR-E Advanced Microwave Scanning Radiometer for the Earth Observing System

ARMA Auto-Regressive Moving Average

ANN Artificial Neural Network

ARIMA Autoregressive Integrated Moving Average

BR Bayesian Regularization

R2 Coefficient of determination or Regression

CCD Cold Cloud Durations

DID Department of Irrigation and Drainage

GIS Geographic Information System

GMS Geosynchronous Meteorological Satellite

GFS Global Forecast System

GSMaP Global Satellite Mapping of Precipitation

HMM Hidden Markov Models

LM Levenberg-Marquardt

MMD Malaysian Meteorological Department

MYR Malaysian Ringgit

MK Mann-Kendall

MAPE Mean absolute percentage error

NN Neural Network

NNARX Neural Network Autoregressive with Exogenous Input

NLP Nonlinear Prediction

QPE Quantitative Precipitation Estimations

RBFN Radial Basis Function Networks

RNN Recurrent Neural Network

RMSE Root Mean Square Error

SOM Self-Organizing Map

SSMI Special Sensor Microwave Imager

SEVIRI Spinning Enhanced Visible and Infrared Imager

TBP-NN Temporal Backpropagation Network

TSDM Time Series Data Mining Methodology

TMI Tropical Rainfall Measuring Mission Microwave Imager

Publications

Book Chapter

Mohd-Safar, N. Z., Ndzi, D.L., Sanders, D.A., Noor, H. M. & Kamarudin, L. M., ‘Integration of Fuzzy C-Means and Artificial Neural Network for Short-Term Localized Rainfall Forecast in Tropical Climate’, in SAI Intelligent Systems Conference (IntelliSys), London, 2016, pp. 930–938.

Conference Papers

Hassanuddin Mohamed Noor, David Ndzi, Guangguang Yang, Noor Zuraidin Mohd Safar, ‘River Flow Rate Prediction Using NARX for Advanced Flood Warning in Northern Malaysia.’ 13th IEEE Colloquium on Signal Processing and its Applications (CSPA 2017), Penang, Malaysia, 10th to 12th March 2017.

Mohd-Safar, Noor Zuraidin; Ndzi, David Lorater; Sanders, David Adrian; Noor, Hassanuddin Mohamed; Kamarudin, Latifah Munirah, ‘Integration of fuzzy c-means and artificial neural network for short-term localized rainfall forecast in tropical climate’. Proceedings of 2016 SAI Intelligent Systems Conference (IntelliSys). IEEE, 2016.

H. Md. Noor, D. L. Ndzi, D. Sanders, N. Z. Mohd-Safar, S. Zarina Ibrahim, L. M. Kamarudin, A. Y. M. Shakaff, ‘Study of Rain Characteristics for Early Flood Warning System using Rain Gauge Network’. 4th International Malaysia-Ireland Joint Symposium on Engineering, Science and Business (IMiEJS), Penang, Malaysia, 25-26 June 2014.

Chapter 1 Introduction

This chapter provides a prologue to the research work presented in this thesis. It describes the research background and the objectives. It also outlines the structure of the thesis.

1.1 Research Background

Flood prediction and water resource management are major issues, especially in highly populated areas in tropical countries. Unlike in temperate climate regions, such as Europe and North America, high-resolution spatial and temporal weather data are not available in most tropical countries. Rainfall in the tropics is predominantly convective rain, characterised by small rain cell size, high intensity and short duration, which easily results in flash floods. Flooding in populated areas has become a major problem in many developing countries as development in floodplains has increased. Accurate, fast and adaptive flood prediction systems have therefore become indispensable, especially as hydrology models become outdated as land use changes.

Accurate prediction of river flow rate and level is one way of reducing the risk of flooding. It can ensure the effective management of flood barriers, the activation of temporary flood defences and early warning to residents. In principle, the longer the lead time of the forecast, the more time there is for precautions to be taken.

One of the biggest problems in designing a flood forecasting system is that models that take ground cover into account are easily outdated as a result of development in the floodplains. Watersheds and floodplains change continuously as development takes place.

To predict a river’s characteristics, a detailed understanding of its catchment area is required. This involves comprehensive modelling that takes into account external elements like rainfall, heat transfer and radiation, combined with internal factors such as the state of vegetation, soil moisture and groundwater levels. The need to forecast floods and droughts and to manage water resources would therefore require regular modifications to a hydrological modelling system.

An example of a catchment model is the Lumped Based Basin Model [1]. This is a conventional model based on the assumption that a set of hydrologic parameters can represent each sub-basin within a watershed. The parameters are a weighted average representation of the entire sub-basin. The main hydrologic inputs of this model are precipitation depth and temporal distribution. Various geometric parameters such as length of the stream, land slope, catchment area, centroid location, soil types, land use and absorbency are also required for the development of a traditional lumped based model. However, these parameters are not always available and change over time, which makes the conventional lumped based model inaccurate. With an accurate river model, river flow rate can be forecast and flooding predicted based on the prevailing conditions.

1.2 Introduction to Weather in Malaysia

Malaysia can generally be divided into two parts, West or Peninsular Malaysia and East Malaysia (Borneo). Malaysia has a tropical climate with two major monsoon seasons, the North East (NE) monsoon (from November to March) and the South West (SW) monsoon (between May and September). Fig. 1.1 illustrates the rain direction of the monsoon seasons in Peninsular Malaysia. Both monsoons bring heavy rainfall and, as a result, Malaysia receives between 2000 and 4000 mm of precipitation annually, with 150 mm to 200 mm of rain per day in some cases [2]. The transition between the NE and SW monsoons (and vice versa) occurs in April and October [3]. Continuous rainfall for a number of days is common in West Malaysia.
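The monsoon calendar described above can be written as a simple lookup, which is convenient when rainfall records need to be grouped by season. The following is a minimal sketch only; the function name `monsoon_season` and the label strings are illustrative assumptions and are not part of the thesis.

```python
def monsoon_season(month: int) -> str:
    """Return the season label for a calendar month (1-12).

    Month ranges follow the text: NE monsoon from November to March,
    SW monsoon from May to September, and inter-monsoon transitions
    in April and October. The labels are illustrative.
    """
    if not 1 <= month <= 12:
        raise ValueError("month must be in the range 1..12")
    if month in (11, 12, 1, 2, 3):
        return "NE monsoon"
    if 5 <= month <= 9:
        return "SW monsoon"
    return "inter-monsoon"  # April and October


if __name__ == "__main__":
    for m in (1, 4, 7, 10, 12):
        print(m, monsoon_season(m))
```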

Fig. 1.1 Monsoon seasons in Peninsular Malaysia: NE Monsoon (Nov – Mar), Inter Monsoon (April and Oct), SW Monsoon (May – Sep)

1.3 Statement of Aims

This research includes the study of rain dynamics across the catchment area and their effect on river flow characteristics. Broadly speaking, this research can be divided into the following:

• Study of the relation between rainfall rate and runoff in the study area.
• Design and development of a computer-based model to predict river flow rate based on rainfall in the catchment area.

This study focuses mainly on the application of time series analysis and Artificial Neural Networks (ANN) to build a model to forecast the river flow rate. Understanding some of the rain characteristics is key to the development of an efficient early flood prediction and water resource management system [4].
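To illustrate how a rainfall series and a river flow series can be arranged for a time series forecasting model, the sketch below builds a table of lagged inputs and a future flow target. It is a hedged illustration, not the configuration used in this thesis: the function name `make_lagged_dataset`, the numbers of lags and the forecast horizon are all assumed values chosen for the example.

```python
import numpy as np


def make_lagged_dataset(rain, flow, rain_lags=3, flow_lags=2, horizon=1):
    """Arrange rainfall and flow series into (inputs, targets).

    Each input row holds the most recent `rain_lags` rainfall values and
    `flow_lags` flow values up to time t; the target is the flow at
    t + horizon. All parameter values here are illustrative assumptions.
    """
    rain = np.asarray(rain, dtype=float)
    flow = np.asarray(flow, dtype=float)
    if rain.ndim != 1 or rain.shape != flow.shape:
        raise ValueError("rain and flow must be 1-D series of equal length")
    if min(rain_lags, flow_lags, horizon) < 1:
        raise ValueError("lags and horizon must be at least 1")

    start = max(rain_lags, flow_lags) - 1  # first t with a full history
    stop = len(flow) - horizon             # last t with a known target
    rows, targets = [], []
    for t in range(start, stop):
        past_rain = rain[t - rain_lags + 1 : t + 1]
        past_flow = flow[t - flow_lags + 1 : t + 1]
        rows.append(np.concatenate([past_rain, past_flow]))
        targets.append(flow[t + horizon])
    return np.array(rows), np.array(targets)


if __name__ == "__main__":
    X, y = make_lagged_dataset([0, 1, 2, 3, 4, 5], [10, 11, 12, 13, 14, 15])
    print(X)
    print(y)
```

Any regression model, including a neural network, could then be trained on the resulting input rows and targets.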

1.4 Objectives

The objectives of this research can be outlined as follows:

• To study rain characteristics in a selected area in Malaysia using available rain gauge data.
• To study the relationship between rainfall rate, river level and river flow rate to identify which river characteristic (level or flow rate) has the strongest relationship with rainfall.
• To identify possible technique(s) that can be used to predict the identified river characteristic.
• To develop a computer-based model to predict the river characteristic, and to test and analyse its performance using measured data.

The above objectives have been achieved through the following steps:

• Perlis, a state in Northwest Malaysia, was selected as the area of study. River Jarum and River Pelarit were selected.
• It has been shown that river flow rate has a more significant correlation with rainfall than river level. Therefore, river flow rate has been selected as the parameter to predict.
• The Nonlinear Autoregressive Exogenous (NARX) technique, a type of Artificial Neural Network (ANN), has been selected and used to model the river flow rate based on rainfall in the catchment.
• The implemented method can predict river flow rate based on rainfall rate up to 132 hours ahead with a correlation coefficient of more than 0.9.
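Model performance in this thesis is quoted in terms of root mean square error and correlation coefficient. As a minimal sketch of how these two measures can be computed from observed and predicted flow rates, assuming the standard definitions (the Pearson correlation coefficient is assumed here), the following may be used. The function names and the sample values are illustrative only and are not results from this study.

```python
import numpy as np


def rmse(observed, predicted):
    """Root mean square error between two equal-length series."""
    o = np.asarray(observed, dtype=float)
    p = np.asarray(predicted, dtype=float)
    if o.shape != p.shape:
        raise ValueError("series must have the same shape")
    return float(np.sqrt(np.mean((o - p) ** 2)))


def correlation_coefficient(observed, predicted):
    """Pearson correlation coefficient between two equal-length series."""
    o = np.asarray(observed, dtype=float)
    p = np.asarray(predicted, dtype=float)
    if o.shape != p.shape:
        raise ValueError("series must have the same shape")
    return float(np.corrcoef(o, p)[0, 1])


if __name__ == "__main__":
    # Made-up values, used only to show the calls.
    obs = [1.0, 2.0, 3.0, 4.0]
    pred = [1.1, 1.9, 3.2, 3.8]
    print(rmse(obs, pred), correlation_coefficient(obs, pred))
```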

1.5 Justification

Malaysia has many rivers and receives a high amount of rainfall annually. Fig. 1.2 shows the annual rain distribution in Peninsular Malaysia [5]. However, Malaysia faces serious water management challenges [6]. Rapid urbanisation has accelerated changes in catchment hydrology and geomorphology. Development within river watersheds has resulted in higher runoff and decreasing river capacity, which has led to an increase in flood frequency. Furthermore, models of the rainfall-runoff relationship are not adequately available in Malaysia. However, facilities that provide real-time measurements of rainfall and rivers are available from the Department of Irrigation and Drainage (DID).

1.6 Background and Research Context

Severe floods occurred in Malaysia in 2006, 2007, 2010, 2011, 2013, 2014 and 2015. In December 2013, most low-lying areas in the states of Johor, Pahang and Terengganu were flooded and more than 30,000 people were affected [7]. In December 2015, the worst flooding in 50 years affected four states (Kelantan, Terengganu, Pahang and Perak) [8], which led to the evacuation of more than 200,000 people and losses estimated at over 2.9 billion Malaysian Ringgit (MYR), or £530 million [9].

Fig. 1.3 Flood-prone area in Malaysia (legend: flood-prone area)

Fig. 1.3 shows that there are more than ten areas or river basins in Malaysia that are prone to flooding. Most of these areas are densely populated, and any flood event will have a high impact on the community and its economic activities. Therefore, the success of this study could make a very positive contribution to flood prediction in Perlis in particular and Malaysia in general.

1.7 Flood Warning System in Malaysia

Malaysia does not have a coordinated system of advanced flood prediction. It has an evolved national and regional disaster response system that needs better information regarding flood prediction models [10]. Flood forecasting and warning services are concentrated in river basins that are highly populated. There are several types of flood warning and monitoring systems in Malaysia.

1.7.1. Telemetry System

The telemetry system allows measured data, mainly river levels, to be transferred wirelessly to an online system managed and controlled by the Department of Irrigation and Drainage (DID). DID is responsible for collecting and analysing rainfall data and monitoring the river levels.

1.7.2. Manual Water River Level Monitoring

A total of 137 real-time river level monitoring stations have been set up in Malaysia. When the water level exceeds the predetermined critical level, a local observer transmits the real-time water level information to the DID state office for further action.

1.7.3. Flood Warning Board

Flood Warning Boards have been installed in flood-prone areas in the major river basins. Warning notices on these boards are correlated to the water level at observation points upstream. The residents are informed of the impending flood situation in their area.

1.7.4. Mass media

Fig. 1.4 shows an example of a warning published by the Malaysian Meteorological Department (MMD) due to heavy rain at 5 a.m. on 2nd December 2016. The blue zone is where thunderstorms are forecast, and the red zone is where torrential rain lasting up to three days is forecast. These warnings are published on social media, mass media and other forms of electronic broadcasting to warn the public.

Fig. 1.4 Weather warning issued by MMD. Panels: Thunderstorm Warning and Heavy Rain Warning. Legend: heavy rain for another 1 to 3 days; medium rain, sometimes heavy, for more than 1 day; affected area; heavy rain that has already lasted more than 1 day.

1.8 Structure of the Thesis

Chapter 1 has introduced the research area. It is followed by Chapter 2, which presents the literature review. Fig. 1.5 gives an overview of the structure of the thesis.

• Prologue of thesis: Chapter 1: Introduction
• Research literature review and theoretical background: Chapter 2: Literature Review; Chapter 3: ANN
• Research design and contributions: Chapter 4: Methodology
• Case study, research results, analysis and conclusion: Chapter 5: Result; Chapter 6: Conclusion

Fig. 1.5 Thesis structure

1.9 Major Outcome

This chapter has discussed the importance of river modelling given the risk of flooding, especially in highly populated areas in Malaysia. The present flood early warning system in Malaysia has proven inadequate for giving early warning to the public. As a result, loss of life and damage to infrastructure have occurred in a series of severe floods. This shows the importance of developing accurate flood prediction and an early flood warning system for Malaysia to minimise these losses.

Chapter 2 Literature Review

This chapter discusses the concepts and theories related to the objectives of this study. A review of published studies on rain, hydrological modelling, flood prediction and flood management systems is presented. Flooding can be defined as “a general and temporary condition of partial or complete inundation of normally dry land areas from overflow of inland or tidal waters from the unusual and rapid accumulation or runoff of surface waters from any source” [11]. It has also been defined as “the covering of normally dry land by water that has escaped or been released from the normal confines of any lake, or any river, creek or another natural watercourse, whether or not altered or modified; or any reservoir, canal, or dam” [12].

• A rain cell brings precipitation.
• The water level in the ground increases until the ground becomes saturated.
• Groundwater flooding occurs when the water level in the ground rises above the land surface.
• Additional rain falling on saturated soil becomes surface runoff.
• Water from surface runoff flows to the river.
• Excess water flows into the river and increases the flow rate downstream.
• The water level in the dam increases, and the dam has to release excess water, which causes river flooding.

Fig. 2.1 The phases of a rain-caused flood event for a water reservoir or lake

Fig. 2.1 shows the natural phases that lead to a rain-caused flood event. Firstly, precipitation in the catchment area causes the water level in the ground to rise until the ground becomes saturated (the tipping point). Groundwater flooding occurs when water levels in the ground rise above the land surface. Normally, groundwater levels rise and fall according to the annual cycle, but periods of prolonged rainfall may cause water levels to rise above the land surface. Additional rainfall on saturated soil becomes surface runoff. Excess water flows into rivers and increases their flow rates and levels downstream. For a river that discharges into a reservoir, dam or lake, this changes not just the amount of water in the reservoir or dam but also the rate at which it is filled. Depending on the topography, a high water level in the reservoir may result in flooding upstream or in the surrounding low-lying areas.
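The saturation and runoff phases described above can be illustrated with a very simple "bucket" calculation, in which rain first fills the available soil storage and any excess becomes surface runoff. This is a conceptual sketch under strong assumptions (a single fixed storage capacity, no drainage, no evaporation), not a model used in this thesis; the function name and the numbers are hypothetical.

```python
def saturation_excess_runoff(rainfall_mm, capacity_mm, initial_storage_mm=0.0):
    """Split each rainfall amount into soil storage and surface runoff.

    Rain infiltrates until the soil storage reaches `capacity_mm`
    (saturation); the remainder of that time step becomes runoff.
    Drainage and evaporation are ignored in this illustrative sketch.
    """
    if capacity_mm < 0 or not 0 <= initial_storage_mm <= capacity_mm:
        raise ValueError("storage must lie between 0 and capacity")
    storage = initial_storage_mm
    runoff = []
    for rain in rainfall_mm:
        if rain < 0:
            raise ValueError("rainfall cannot be negative")
        space = capacity_mm - storage      # room left before saturation
        infiltrated = min(rain, space)     # rain absorbed by the soil
        storage += infiltrated
        runoff.append(rain - infiltrated)  # excess leaves as runoff
    return runoff, storage


if __name__ == "__main__":
    # Hypothetical rainfall per time step (mm) and storage capacity (mm).
    print(saturation_excess_runoff([10, 20, 30], capacity_mm=25))
```

In this example no runoff is produced until the storage is full, after which all further rain runs off, mirroring the tipping point described in the text.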

Flooding predominantly occurs downstream of managed dams where excess water is not released gradually, especially when inflow into the dam is high and has not been predicted. This is a widespread problem in South East Asia in general, and in Malaysia in particular, where the dams are used for water supply to homes and are a key part of the irrigation infrastructure.

2.1 Existing Flood Forecasting Techniques

Flood prediction is a complex process because of the numerous factors that affect river flow rates and levels, such as the topography, rainfall amount and duration, soil types and size of catchments. The relationship between these factors often needs to be fully understood. Classical linear Gaussian time series (deterministic) models are inadequate for the analysis and prediction of complex geophysical phenomena [13].

Linear methods such as the Autoregressive Integrated Moving Average (ARIMA) approach are unable to identify complex characteristics because they need to characterise all time series observations and require the time series to be stationary and the residuals to be normal and independent [14]. Nonlinear time series approaches such as Hidden Markov Models (HMM) [15], Artificial Neural Networks (ANN) [16-17] and Nonlinear Prediction (NLP) [18] techniques applied to discharge forecasting only produce accurate predictions for short prediction periods due to the dynamic nature of rainfall and its complexity.

The time dependency of the modelling required has resulted in many studies based on nonlinear modelling with time delay parameters. These time-dependent techniques have also been applied to a variety of physical applications, such as in the domains of physiology [19][20], finance and economics [21], geophysics [22] and engineering [23]. The Time Series Data Mining (TSDM) methodology is based on a variant of time-delayed embedding called the reconstruction of phase space [30-31]. The TSDM framework combines the methods of phase space reconstruction and data mining to reveal hidden patterns predictive of future events in nonlinear, nonstationary time series. This could be applied to river flow rate or river level prediction.
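As a rough illustration of time-delayed embedding, the sketch below turns a one-dimensional series into delay vectors of the form [x(t), x(t + τ), ..., x(t + (m − 1)τ)]. The function name `delay_embed`, the parameters `dim` and `tau`, and the flow values are hypothetical and chosen only for illustration; they are not taken from the TSDM literature or from this study.

```python
import numpy as np

def delay_embed(series, dim, tau):
    """Return the time-delay embedding of a 1-D series.

    Row t of the result is [x(t), x(t + tau), ..., x(t + (dim - 1) * tau)].
    """
    x = np.asarray(series, dtype=float)
    n_rows = len(x) - (dim - 1) * tau
    if n_rows <= 0:
        raise ValueError("series is too short for this dim and tau")
    # Each column is the series shifted by a multiple of tau.
    return np.column_stack([x[i * tau : i * tau + n_rows] for i in range(dim)])

# Illustrative flow-rate values only (not measured data).
flow = [5.0, 6.0, 9.0, 14.0, 11.0, 8.0, 6.5]
points = delay_embed(flow, dim=3, tau=2)
```

Each row of `points` is one point in the reconstructed phase space; a pattern-search or prediction method would then operate on these rows rather than on the raw series.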

In recent years, numerous studies from hydrodynamics, civil engineering, statistics and data mining have contributed to the area of flood prediction. The current measurements and techniques used in flood prediction can be categorised as follows [26]:

• River and Rain-Gauge Networks.

• Radar and Information Systems.

• Linear Statistical Models.

• Nonlinear Time Series Analysis and Prediction.

The first two techniques use direct measurements and knowledge of past events to predict flooding. The linear statistical models use the weighted flood discharge, precipitation intensity, elevation, stream length, and main channel slope for flood prediction [27]. ANN modelling falls into the fourth category, Nonlinear Time Series Analysis and Prediction, which has been selected for this study.

River and Rain-Gauge Networks

Presently, river flow rates, river levels and precipitation are monitored by the Department of Irrigation and Drainage (DID) in Malaysia. DID has installed and operates 525 telemetric stations in 38 river basins. Additionally, 670 manual river gauges, 1013 stick gauges and 182 flood warning boards have been set up in flood-prone areas to provide crucial information during the flood season. In addition, 395 automatic flood warning sirens are used as an integral part of the local flood warning system [28]. Data from the monitoring stations are used to predict possible flood events. This approach often does not provide sufficient warning time, as rainfall is intense and localised and may affect areas downstream, very far from where it rained.

Radar and Geographic Information Systems

Presently, with the development of radar systems, the Geographic Information System (GIS) is being used along with traditional river models for improved flood forecasting [29]. This complex system consists of simulation programs that use data from multiple sources, such as telemetry and gauges, to calculate runoff, infiltration and precipitation volumes using land use and elevation information. However, the major disadvantages of this system are its high complexity, the longer time taken to produce a prediction and the high cost of implementation and maintenance.

Linear Statistical Models

Linear statistical models such as autocorrelation functions, spectral analysis, analysis of cross-correlations, linear regression and Autoregressive Integrated Moving Average (ARIMA) have been studied to assess their application in flood forecasting. However, several studies, such as that of Solomatine et al., found that ARMA models are unable to provide accurate long-term predictions [30][31].

Nonlinear Time Series Analysis and Prediction

The Nonlinear Prediction (NLP) method [32] has been used in flood prediction by researchers such as Porporato et al. [33]. They have experimented with flood prediction based on a single-parameter (river flow) time series [33][34] and also on multiple-variable time series [35]. Some nonlinear time series approaches, such as Hidden Markov Models and Artificial Neural Networks (ANN), are based on multiple time series. Laio et al. [36] compared the ANN and NLP approaches in daily discharge forecasting. The NLP method was found to provide accurate forecasts over a shorter prediction period (1-6 hours), but for prediction periods exceeding 24 hours, the ANN approach is more accurate [26].

Artificial Neural Networks (ANN)

ANN is widely accepted as a potentially useful way of modelling complex nonlinear and dynamic systems. ANN is useful in situations where the underlying input parameter relationships are not entirely understood or where the nature of the process being modelled is complex. Although ANN does not remove the need for knowledge or prior information about the systems of interest, it reduces the model's reliance on this prior information. This removes the need for an exact specification of the precise functional form of the relationship that the model seeks to represent.

An ANN represents the input-output relation, where different parameters such as rainfall, flow rate, temperature, humidity and wind direction can be used as inputs to the model. A number of researchers have used this technique in a variety of geophysical studies [29]. There is, however, no set algorithm that can be applied to ensure that the network will always produce an optimal solution, and the nonlinear nature of the ANN model often results in multiple predicted values. Details of ANN will be discussed in Chapter 3.

2.2 Weather radar

In Malaysia, weather radars (rain radars) are used to measure rainfall [4][37][38][39]. Twelve Doppler weather radars operate at the locations shown in Fig. 2.2 and in northern Borneo (Sabah and Sarawak). These consist of S-band radars (8 sites) and C-band radars (4 sites). Radar is used to estimate the distribution of precipitation amounts over a large area. In addition, radar coverage allows the detection of rain events which might not be detected by rain gauges [39]. However, radar-based rain measurement has limitations. The range over which accuracy is acceptable is limited and is reduced as the topography becomes more complex [40]. Because rainfall in Malaysia is highly localised, the limited spatial and temporal resolutions of radars make them an unreliable means of measuring rainfall rate.

Fig. 2.2 Peninsular Malaysia weather radar network [41].

2.3 Satellite-Based Weather System

The weather satellite is a type of satellite that is primarily used to monitor the weather and climate of the Earth. Satellites can be polar orbiting, covering the entire Earth asynchronously, or geostationary, hovering over the same spot on the equator.

With advances in satellite technology, some of the present rain forecasting and flood prediction systems have been designed to incorporate geographical information systems (GIS) and ground sensors such as rain stations. Examples include the Japanese Geosynchronous Meteorological Satellite (GMS), with a spatial resolution of up to 10 km, and the United States’ Global Forecast System (GFS), with a spatial resolution of 35 km. These resolutions render satellite-based systems inadequate for accurate prediction of localised floods, especially in tropical areas. Some of these satellite systems have a temporal resolution of 6 hours. Studies have shown that the average size of a rain cell in Malaysia is between 1.2 and 1.5 km [42]. Rain cells in tropical areas also have a high precipitation intensity, which can reach 120 mm per hour. Such rain is typically localised rather than widespread due to its convective character and has a lifespan of less than an hour [38][43]. Due to these limitations, flood prediction based on satellite systems inherently produces poor results [44][45].

Presently, the rain detection system with the highest resolution is the Global Satellite Mapping of Precipitation (GSMaP) by the Japan Aerospace Exploration Agency (JAXA) [46]. GSMaP has been designed to achieve a temporal resolution of 1 hour and a spatial resolution of 0.1 degrees on a latitude-longitude grid, and it makes the data available approximately 4 hours after capture (on a quasi-real-time basis). This system has been developed to assist developing countries with flood prediction. However, studies in the Chao Phraya River basin in Thailand have concluded that the satellite-based system is not adequate for near-real-time rainfall monitoring applications, because tropical rain cells are dynamic, small and localised compared to rain cells in moderate climates [47]. Overall, the GSMaP system has been found to be suitable for climate change studies.

Several studies have been conducted to improve the accuracy of satellite rain prediction systems. One of the systems uses geostationary satellites with thermal sensors to measure the daily minimum cloud top temperatures, or Cold Cloud Duration (CCD), and relates these to rain rates. The limitation of this method is that it assumes that rain rates are related to cloud duration. This assumption fails in the case of high rain rates that occur over a short period. Another way of calculating rain rates is the rate retrieval method [48]. This uses satellite microwave radiometer observations based on the fundamental principles of radiative transfer. However, this is an approximation based on the properties of clouds and is susceptible to inaccuracies in localised weather patterns and windy conditions [49]. Another approach used for rain detection is based on soil moisture [50]. It detects the surface soil moisture using low-frequency microwave imagery systems such as the Tropical Rainfall Measuring Mission Microwave Imager (TMI), the Advanced Microwave Scanning Radiometer for the Earth Observing System (AMSR-E), and the Special Sensor Microwave Imager (SSMI). The purpose is to improve the accuracy of GSMaP by introducing a new method to complement the high-resolution global precipitation prediction.

However, the drawback of most air-based rain detection systems, such as radar and satellite, is their poor spatial and temporal resolution [51][52]. Some flood warning systems are very complex and designed mainly for large-area prediction, which makes them unreliable in tropical areas, where flooding is localised and runoff is of short duration (flash floods) [53]. The convective nature of rain in tropical areas increases the errors and uncertainties in flood prediction and storm modelling systems [54][55]. These limitations were demonstrated when the topography-based hydrological model (TOPMODEL) was applied to river level forecasting in Nigeria [53].

2.4 Introduction to Basic Hydrology Modelling

There are several basic hydrology models that estimate water discharge based on simple calculations. Some of these basic hydrology models are described below.

2.4.1. Rational method

The Irish engineer Thomas James Mulvaney (1822-1892) published a simple equation to determine the maximum discharge for a quantity of rainfall in a catchment. The equation (2.1) published by Mulvaney in 1851 is the following:

Q = C A R (2.1)

where Q is the maximum discharge, A is the catchment area, R is the maximum catchment-average rainfall intensity, and C is an empirical coefficient representing the proportion of rainfall that contributes to runoff.

The equation only predicts the hydrograph peak, not the whole hydrograph. The maximum discharge, Q, is predicted by the equation for rainfall whose duration is at least equal to the time of concentration of the catchment.
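A worked example may help make Equation (2.1) concrete. The sketch below evaluates Q = C A R with explicit unit conversions; the function name, the choice of units (mm/h, km², m³/s) and the input values are assumptions made for illustration only and do not come from the study area.

```python
def rational_peak_discharge(c, rainfall_mm_per_h, area_km2):
    """Peak discharge Q = C * A * R, returned in cubic metres per second.

    c                 runoff coefficient (dimensionless, between 0 and 1)
    rainfall_mm_per_h maximum catchment-average rainfall intensity
    area_km2          catchment area
    """
    if not 0.0 <= c <= 1.0:
        raise ValueError("runoff coefficient must lie between 0 and 1")
    rainfall_m_per_s = rainfall_mm_per_h / 1000.0 / 3600.0  # mm/h -> m/s
    area_m2 = area_km2 * 1.0e6                              # km^2 -> m^2
    return c * area_m2 * rainfall_m_per_s

# Hypothetical catchment: half the rain runs off, 36 mm/h over 2 km^2.
q_peak = rational_peak_discharge(c=0.5, rainfall_mm_per_h=36.0, area_km2=2.0)
```

With these illustrative inputs the peak discharge is about 10 m³/s, since 36 mm/h is 1 × 10⁻⁵ m/s and the area is 2 × 10⁶ m².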

The method assumes that the maximum discharge increases with catchment area and rainfall intensity. The calculation of this maximum discharge is made in a rational way, hence the name of the method. The rational method is a popular and easy-to-use technique for estimating peak flow in any small drainage basin having mixed land use, and it should not be used in basins larger than 1 square mile. Although this method yields only the hydrograph peak (maximum discharge) rather than the entire hydrograph, this is enough for hydrological engineers to design bridges or dams that can withstand the maximum discharge.

2.4.2. Time and area diagram (TAD)

Using the time-area concept, the hydrologist obtains a time-area diagram (TAD), which gives the area from which water reaches the outlet in each time step, as shown in Fig. 2.3. The time-area diagram represents the lag in the time needed for water from each area of the catchment to reach the outlet.

The time-area concept was used in the USA and in England. The idea is the base of some newer distributed hydrologic models. Kull and Feldman adapted Clark's method for use with radar data obtained from the NEXRAD system.

Fig. 2.3 The time area diagram obtained by splitting the basin into n areas according to flow time to the outlet.
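One common way to turn a time-area diagram into an outflow hydrograph is a discrete convolution of the excess rainfall with the sub-areas ordered by travel time. The sketch below assumes that convention (rain falling on sub-area i reaches the outlet i time steps later); the function name, units and input values are hypothetical and are not taken from this study.

```python
import numpy as np

def time_area_hydrograph(excess_rain_mm, areas_km2, dt_hours=1.0):
    """Outflow in m^3/s from excess rainfall depths and a time-area histogram.

    excess_rain_mm[t] is the excess rainfall depth in time step t.
    areas_km2[i] is the sub-area whose flow time to the outlet is i steps.
    """
    rain_m = np.asarray(excess_rain_mm, dtype=float) / 1000.0
    areas_m2 = np.asarray(areas_km2, dtype=float) * 1.0e6
    # Volume reaching the outlet in step t is sum_i rain[t - i] * area[i].
    volume_per_step = np.convolve(rain_m, areas_m2)
    return volume_per_step / (dt_hours * 3600.0)

# Illustrative values: two hours of rain over three sub-areas.
q = time_area_hydrograph([10.0, 5.0], [1.0, 2.0, 1.0])
```

The total outflow volume equals the total excess rainfall depth multiplied by the total area, which is a quick sanity check on any time-area calculation.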

2.4.3. Unit Hydrograph

The time-area concept has been used in hydrologic models, but it has a limitation: the contributing areas for each time step are difficult to determine accurately because the overland and groundwater flow speeds are not easy to determine. Sherman tried to avoid this problem by using the lag time needed for water to reach the catchment outlet as a time distribution without a direct link to catchment areas, an approach known as the Unit Hydrograph [56].

Snyder's synthetic unit hydrograph is based on the creation of the hydrograph from characteristics of the rainfall and the catchment. With Snyder's method, five characteristics of the Unit Hydrograph can be obtained: the peak discharge per unit of watershed area, QpR, the basin lag, TlR, the base time, Tb, and the widths, W (in time units), of the unit hydrograph at 50 and 75 percent of the peak discharge, as shown in Fig. 2.4.

Fig. 2.4 Snyder's synthetic unit hydrograph

Another method to determine a hydrograph is the method proposed by SCS. This method creates a dimensionless hydrograph whose ordinate is the ratio Q/Qp (flow/peak flow) and whose abscissa is the ratio t/tp (time/time to peak). The dimensionless unit hydrograph can then be used to determine a watershed-specific unit hydrograph from some characteristics of the watershed. The data needed to apply the method are the area of the catchment A, the time of concentration Tc and the duration of the unit excess rainfall D.

2.5 Approaches to Hydrological Modelling

Hydrological models are used to improve the understanding of the hydrological processes as well as to make forecasts of the future hydrological events, e.g. flood events, river levels, etc. taking into account rainfall, snowmelt and evaporation. All models are simplifications of reality, and there are many different ways to represent it. Some authors have categorized hydrological modelling approaches into three main types: process-based; conceptual, and empirical or data-driven (which includes statistical) [57].

The process-based approach decomposes the physical characteristics of the catchment area into different constituents and models them [57]. An example of this type of model is the European Hydrological System [58]. It has been applied to a range of areas for flood forecasting and used to examine the effects of land use change and groundwater modelling in Europe.

Physically-based models represent a catchment as a three-dimensional grid, like global circulation models of the climate, and use the fundamental laws of conservation of energy and mass to model water movements on the surface and through the unsaturated and saturated zones to the river. A flood hydrograph is then dynamically built from the model runoff [59]. These models incorporate as full an understanding of the catchment processes as possible, and when the conditions change, the models can be used to evaluate the impact on runoff or other catchment properties [60]. Although these models are the most complex and accurate of the three model types, they require a significant amount of data, which is not always available for all catchments, and a long processing time. These models, therefore, have more value for planning than for real-time forecasting.

The conceptual model is also known as the lumped conceptual or geomorphology-based model [61]. These are viewed as the most successful models for rainfall-runoff simulation and flood modelling [62]. These models still have a physical basis but have been structured in such a way as to represent a stream as a network in the surrounding catchment. These models attempt to represent the main dynamics in the catchment but require calibration of several parameters [63]. Conceptual models are therefore less demanding than physically-based models but need more information than empirical data-driven models. The topography-based hydrological model (TOPMODEL) that was used in Nigeria is an example of a simple yet powerful conceptual model that has been used in numerous applications covering a range of catchment sizes and geographical areas [53].

Empirical or data-driven models do not use catchment characteristics or other physical parameters [64]. One major advantage of this approach is that these models are simple to use compared to deterministic or conceptual models. However, the disadvantage is that, unlike deterministic models, they do not take changes such as changes in land use into account.

ANN is classified as an empirical model: the data is fed into the model, and a relationship between the inputs and the outputs is then used to produce a forecast. Therefore, the ANN-based model does not require any understanding of the physical processes. However, some hydrological knowledge is a prerequisite, as it guides the selection of inputs (e.g. rainfall, previous flows, water levels, and so forth) and outputs (e.g. runoff, stream flows, hydraulic conductivities, etc.) used to produce a forecast. This type of modelling does not always appeal to hydrologists, who prefer the core of the model to be a dynamic, physically-based representation of the processes involved. Reported research has used sensitivity analysis to determine the most significant inputs to be used in the network [34]. This can be applied to gain a better understanding of what is happening in the model.

Unit hydrographs are a classic empirical model which captures the relationship between rainfall and catchment response. Another example is time series models such as ARMA models, which are frequently used as a type of empirical model to capture the rainfall-runoff relationship [65].

2.6 Summary

A description of the study area has been given, including the different rain measurement systems and the existing flood prediction methods available in Malaysia. An introduction to hydrological modelling approaches has been given. Several basic hydrology models for estimating water discharge have also been discussed; most of these models require some hydrological parameters as a prerequisite, as these guide the selection of inputs. The ANN approach has been selected because it does not require any understanding of the physical processes: data is fed into the model, and a relationship between the inputs (rainfall, previous dry-time condition and flow rate) and the outputs is then used to produce a forecast of river flow rate.

2.7 Major Outcomes

There are several rain measurement and flood prediction methods based on ground-based and radar-based rain measurement. However, because limited spatial and temporal resolutions make rainfall rate difficult to measure in tropical areas, these methods lead to unreliable predictions. Hydrology models estimate water discharge by calculation, but physical data is needed before the models can be used. Therefore, the ANN approach has been selected because it does not require any understanding of these physical processes. ANN models are simpler to use than deterministic or conceptual models.

Chapter 3 Artificial Neural Networks (ANN)

This chapter describes the application of ANN in hydrology for river flow modelling as a basis for developing a river flow prediction system. An overview of ANN is given, covering its structure, model development, and advantages and disadvantages. This chapter concludes with a review of the major themes that have emerged from the ANN river flow forecasting literature and how the research in this thesis fits within this context.

Introduction

An ANN is a system modelled on the human brain, with neurons and interconnections that enable inputs to be mapped to outputs. The development of the back-propagation algorithm was an important advancement in neural networks.

Neural Network Autoregressive with Exogenous Input (NNARX) is a version of ANN that is more capable of modelling nonlinear systems [66]. Its generalisation capability, effective learning rate and faster convergence make it preferable to other neural networks.

[Figure: feed-forward network with an input layer (index i; inputs I1 to I3), a hidden layer (index j) and an output layer (index k; outputs O1 and O2), with the neurons in adjacent layers connected by weights W]

Fig. 3.1 Layout of feed-forward neural network

There is a need to extract knowledge from data and to understand the connections in the ANN architecture. There is also a need to reduce complexity to ensure faster convergence [61][67]. Fig. 3.1 shows an example of an ANN architecture. The network can have many hidden layers, and it becomes more complex as layers are added.

Nonlinear AutoRegressive Neural Network with eXogenous Input (NARX)

NARX is a nonlinear model derived from the Autoregressive with Exogenous Input (ARX) model [68][69][70]. It is a recurrent dynamic neural network with feedback connections enclosing several layers of the network. NARX is well suited to modelling nonlinear systems, especially in time series analysis. One major application of NARX dynamic neural networks is in control systems [71]. NARX networks trained with a gradient-descent learning algorithm have been reported to learn much faster and to generalise better than other networks [70]. The dynamic behaviour of the NARX model is described by Equation (3.1), where F is a nonlinear function of its arguments, such as a polynomial. F can be a neural network, a wavelet network, a sigmoid network, etc.

$$y(t) = F\big(y(t-1),\, y(t-2),\, y(t-3),\, \ldots,\, u(t),\, u(t-1),\, u(t-2),\, u(t-3),\, \ldots\big) + \varepsilon(t) \tag{3.1}$$

Here, $y$ is the variable of interest, the flow rate of the river, and $u$ is the externally determined variable, which in this study is the rainfall at the related station. In this scheme, information about $u$ helps to predict $y$, as do previous values of $y$ itself. $\varepsilon$ is the error term, and the current value of the model output is denoted by $y(t)$.

Types of ANN

A multilayer perceptron (MLP) network is the most commonly used type of ANN.

Other types of ANN include Radial Basis Function Networks (RBFN) [72], Recurrent Neural Networks (RNN) [73] and Self-Organizing Maps (SOM) [74]. RBFN have architectures that are similar to MLPs. The RBFN differs in the form of the activation function in the hidden layer nodes, which is of a radial basis or Gaussian form. The parameters of the radial basis functions are usually determined first, followed by the weights during the training process. RBFN requires a large training dataset to achieve high accuracy.

RNN uses a modification of the feedforward network to allow for feedback between layers. Different learning algorithms have been developed for this type of network. It has been used successfully in handwriting recognition applications but less so in hydrological modelling.

The Self-Organizing Map (SOM) is a slightly different type of ANN [75]. SOM uses an unsupervised classification method to cluster data into similar types. Therefore, only an input data vector is required. It is used for data compression and visualisation of relationships in the data [76]. In its simplest form, SOM is composed of a single two-dimensional layer of neurons, each with a weight vector w. The training algorithm randomly selects an input vector and finds the best match based on the Euclidean distance between the two vectors. This iterates until a stopping condition is satisfied, e.g. a minimum error is reached. The disadvantage of SOM is that there is no clear methodology for the selection of the number of nodes (or clusters) or the values of the learning parameters, e.g. the size of the neighbourhood.

Development of an ANN

There are not many guidelines in the literature on how to develop an optimal ANN for a given application. Some authors have attempted to provide guidelines for model development based on empirical results [77]. Decisions to make when implementing include selecting the network architecture, the number of hidden layers, the number of hidden nodes, input variables, activation functions, training algorithm, and which performance measures to use. These decisions are described in section 3.8.

The architecture of an ANN is often selected after many trials, as there is no established methodology to determine the configuration [78][79]. Once the type of network is chosen, the number of hidden layers and the number of nodes in each hidden layer must be determined. There are no fixed rules, so this is often done by trial and error [80][81]. There are, however, some heuristics in the literature. For example, Hecht-Nielsen [82] suggested the following upper limit for the number of hidden nodes to ensure that the network can approximate any continuous function:

$$N^H \le 2N^I + 1 \tag{3.2}$$

where $N^H$ is the number of hidden nodes and $N^I$ is the number of inputs. Rogers and Dowla [83] suggested a second relationship to avoid overfitting as follows:

$$N^H \le \frac{N^{TR}}{N^I + 1} \tag{3.3}$$

where $N^{TR}$ is the number of observations in the training dataset.
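The two heuristics can be compared numerically. The sketch below assumes the commonly quoted forms of the bounds, N^H ≤ 2N^I + 1 (Hecht-Nielsen) and N^H ≤ N^TR / (N^I + 1) (Rogers and Dowla); the function name and the example sizes are hypothetical.

```python
def hidden_node_limits(n_inputs, n_train):
    """Return two heuristic upper limits on the number of hidden nodes.

    Assumed forms (stated in the text above this block as assumptions):
      Hecht-Nielsen:    N_H <= 2 * N_I + 1
      Rogers and Dowla: N_H <= N_TR / (N_I + 1)
    """
    hecht_nielsen = 2 * n_inputs + 1
    rogers_dowla = n_train // (n_inputs + 1)  # rounded down to a whole node
    return hecht_nielsen, rogers_dowla

# Hypothetical network: 4 inputs and 500 training observations.
limits = hidden_node_limits(n_inputs=4, n_train=500)
upper = min(limits)  # the stricter of the two heuristics
```

Taking the smaller of the two values only narrows the search range; the final number of hidden nodes would still be chosen by trial and error.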

Depending on which training algorithm is chosen, a number of parameters need to be specified, e.g. the learning rate and momentum.

Another important decision to make is the choice of input variables. One study used river levels or flows and rainfall [84]. Total rainfall is normally used but effective rainfall has been used [85]. Effective rainfall is total precipitation minus losses, which produces the runoff. However, the problem with effective rainfall is that it is hard to estimate because it depends on the antecedent moisture conditions of the basin, which changes over time.

Other variables that have been used include temperature, evapotranspiration, moving average antecedent precipitation, and soil moisture [86]. If too many inputs that are not independent are used, the size of the network and the training time increase. There is also a greater likelihood of overfitting the training data because the ratio of connection weights to training data increases. Although many different methods exist, trial and error and correlation analysis are used most frequently to select inputs [67][87][88].

ANN Architecture

An ANN has one of two structures: a feed-forward network or a feedback network. The feed-forward neural network has no provision for an output to be used as an input to a processing unit in the same layer. A feedback network allows outputs to be used as inputs in the same or a preceding hidden layer. An ANN system learns by changing the weighting of its inputs. During training, the network sees the real results and compares them to its outputs. If the difference is significant, the network then uses the feedback to adjust the weights of the inputs.

An ANN is composed of nodes or neurons. Information is passed between nodes through connections. A weight is associated with each connection, representing the magnitude or strength of that connection. Within the node is a nonlinear transformation function, an activation function, which is applied to the input signals of the node to produce an output signal. The simplest function that is usually used is the sigmoid function. The nodes or neurons are arranged into a series of layers: an input layer, one or more hidden layers, and one output layer, as in Fig. 3.2. A weight matrix W, a bias vector b, and an activation or transfer function f are associated with each hidden layer.

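[Figure: two-layer network in abbreviated notation, with blocks labelled Input layer, Hidden layer 1, Hidden layer 2 and Output layer. The input X passes through the first layer, a1 = f1(IW1,1∙X + b1), and then through the second layer, a2 = f2(LW2,1∙a1 + b2), to give the output Y.]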

Fig. 3.2 Two-layer network with abbreviated notation.

The input layer is where external information is received and provided to the network (e.g. antecedent rainfall or runoff) whilst the output layer produces the forecasted parameter (e.g. the river flow rate). The representation of nodes in each layer and the interconnections are more clearly shown in Fig. 3.2. This is an example of a feedforward network.

The output of each node is obtained by applying the activation function to the product of the input vector and the weight vector, plus the bias associated with that node. It is possible to express the forward processing through the network as a single equation. A network with one hidden layer and K outputs would have a functional form as in Equation (3.4).

$$f_k(x, w) = w_{0k} + \sum_{j=1}^{N^H} w_{jk}\, g\!\left(w_{0j} + \sum_{i=1}^{N^I} w_{ij}\, x_i\right) \tag{3.4}$$

where $N^I$ is the number of inputs, $N^H$ is the number of nodes in the hidden layer, $g$ is the activation function of the hidden layer nodes, and $w_{ij}$ and $w_{jk}$ are the weights, with $w_{0j}$ and $w_{0k}$ the bias terms. The indices $k$ and $j$ correspond to the output node and hidden layer nodes, respectively.

However, ANN are rarely expressed in this manner, as the equation is not interpretable.
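In practice the forward pass is usually written with matrices rather than nested sums. The following is a minimal sketch of a network with one hidden layer, assuming a sigmoid hidden activation and a linear output layer; the function names, array shapes and weight values are illustrative assumptions, not the configuration used in this study.

```python
import numpy as np

def sigmoid(z):
    """Logistic activation applied element-wise."""
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, w_hidden, b_hidden, w_out, b_out):
    """Forward pass: hidden = g(W_h x + b_h), output = W_o hidden + b_o."""
    hidden = sigmoid(w_hidden @ x + b_hidden)  # one value per hidden node j
    return w_out @ hidden + b_out              # one value per output node k

# Hypothetical network: 2 inputs, 3 hidden nodes, 2 outputs.
x = np.array([0.0, 0.0])
w_hidden = np.ones((3, 2))
b_hidden = np.zeros(3)
w_out = np.array([[1.0, 1.0, 1.0],
                  [2.0, 0.0, -2.0]])
b_out = np.array([0.5, 0.0])

y = forward(x, w_hidden, b_hidden, w_out, b_out)
```

With zero inputs and zero hidden biases every hidden node outputs 0.5, so the two outputs are 2.0 and 0.0, which can be checked by hand against the nested-sum form.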

Training an ANN model

After the network structure has been configured, training is used to find the values of the weights that minimise the error between the targets and the network outputs in the training dataset:

$$E = \sum_{i} (t_i - a_i)^2 \tag{3.5}$$

where $E$ is the sum of the squared errors between the targets, $t$, and the ANN estimates, $a$, over the $i$ observations in the input-output dataset.

Fig. 3.3 Local and global minima of errors.

The training iteratively adjusts the weights of each node until a stopping condition is reached. The initial weights are first randomly selected. In a single training run, the algorithm may fall into a local minimum on the error surface, as illustrated in Fig. 3.3. The most common objective function is the sum of the squared errors, which is minimised. This assumes that the model errors are distributed around a zero mean with unknown variance.

To achieve an acceptable level of generalisation, the dataset is usually divided into three subsets: training, validation and test datasets. The test dataset is reserved and used in the testing phase to measure the performance of the network. It is not used during the training process.
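A simple way to form the three subsets for time series data is to split the records in chronological order, so that the test period lies entirely after the training period. The sketch below assumes a 70/15/15 split; the function name and the percentages are illustrative assumptions and not the proportions used in this study.

```python
def chronological_split(n_samples, train_pct=70, val_pct=15):
    """Split sample indices 0..n_samples-1 into train, validation and test.

    The order is preserved, so later observations never leak into training.
    Percentages are integers to avoid floating-point rounding surprises.
    """
    n_train = n_samples * train_pct // 100
    n_val = n_samples * val_pct // 100
    indices = list(range(n_samples))
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # whatever remains is held out
    return train, val, test

train_idx, val_idx, test_idx = chronological_split(100)
```

The validation indices would typically be used to decide when to stop training, while the test indices are touched only once, after training has finished.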

There are two types of training algorithms, supervised and unsupervised. For a supervised training algorithm, the target output is available, and the network finds the best set of weights by minimising the error between its output and the target. For an unsupervised network, only the input dataset is given to the network, which then tries to find clusters of similar inputs without any previous knowledge.

Although many training algorithms have been investigated in this study, two algorithms are selected and described. These are the Levenberg-Marquardt (LM) and Bayesian Regularization (BR) algorithms.

3.4.2.1 Levenberg-Marquardt (LM) Algorithm

The Levenberg-Marquardt (LM) algorithm improves the training speed while avoiding the direct computation of the Hessian matrix [89]. If the objective function of the ANN is a sum of squares, then the Hessian matrix can be approximated as in equation (3.6):

$$H \approx J^{T} J \tag{3.6}$$

= + (3.7)

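The gradient can be determined as:

$$g = J^{T} e \tag{3.8}$$

where $J$ is the Jacobian matrix that contains the first derivatives of the errors of the ANN with respect to the weights and biases of the network, while e is an error vector. The LM algorithm uses the approximate Hessian matrix in the following update:

$$x_{k+1} = x_k - \left[ J^{T} J + \mu I \right]^{-1} J^{T} e \tag{3.9}$$

where $x$ is the vector of weights and biases and $\mu$ is a scalar. When $\mu$ is zero, equation (3.9) is Newton's method using the approximate Hessian matrix. When $\mu > 0$ is large, equation (3.9) becomes gradient descent with a small step size. Because Newton's method is faster and more accurate near an error minimum, $\mu$ is decreased after each successful iteration and increased only when a step would increase the error.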
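The LM update described above can be sketched on a toy problem. This is a minimal illustration rather than the implementation used in this research: the straight-line model stands in for the ANN, the function names `lm_step` and `fit_line_lm` are hypothetical, and the starting value of mu (0.001) and its step factors (0.1 down, 10 up) are taken from Table 3.1.

```python
import numpy as np


def lm_step(weights, jacobian, errors, mu):
    """One LM update: w_new = w - (J^T J + mu*I)^-1 J^T e."""
    jtj = jacobian.T @ jacobian                # approximate Hessian
    gradient = jacobian.T @ errors             # gradient J^T e
    damped = jtj + mu * np.eye(jtj.shape[0])   # damping term mu*I
    return weights - np.linalg.solve(damped, gradient)


def fit_line_lm(x, t, mu=0.001, mu_down=0.1, mu_up=10.0, max_epochs=1000):
    """Fit a = w0*x + w1 to the targets t (a toy stand-in for an ANN)."""
    w = np.zeros(2)
    # Jacobian of the errors e = a - t with respect to (w0, w1).
    jacobian = np.column_stack([x, np.ones_like(x)])
    for _ in range(max_epochs):
        errors = jacobian @ w - t
        sse = errors @ errors
        if sse < 1e-12:
            break
        candidate = lm_step(w, jacobian, errors, mu)
        new_errors = jacobian @ candidate - t
        if new_errors @ new_errors < sse:
            w, mu = candidate, mu * mu_down    # successful step: decrease mu
        else:
            mu *= mu_up                        # failed step: increase mu
    return w


x = np.array([0.0, 1.0, 2.0, 3.0])
t = 2.0 * x + 1.0
weights = fit_line_lm(x, t)
```

For this linear example the weights approach the slope and intercept of the target line within a few iterations; an ANN would recompute the Jacobian at every step because its errors are non-linear in the weights.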

3.4.2.2 Bayesian Regularization (BR) Algorithm

Bayesian Regularization (BR) is an ANN training algorithm that updates the weight and bias values using the LM algorithm as a basis. It minimises a combination of the squared errors and the squared weights, and then determines a suitable balance between the two to produce a network with a greater capability of generalisation [90].
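As a hedged illustration of the combination described above, the regularised objective is commonly written in the following form. The symbols $F$, $E_D$, $E_W$, $c_D$ and $c_W$ are notation assumed here for illustration and are not taken from this thesis.

```latex
F = c_D E_D + c_W E_W, \qquad
E_D = \sum_{i} (t_i - a_i)^2, \qquad
E_W = \sum_{j} w_j^2
```

Here $E_D$ is the sum of the squared errors, $E_W$ is the sum of the squared weights $w_j$, and $c_D$ and $c_W$ are positive coefficients. A larger ratio $c_W / c_D$ favours smaller weights and therefore a smoother network response.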

Network design configuration

Designing the ANN consists of selecting its configuration. This includes the number of layers, the number of nodes, etc. Kaastra and Boyd [21] describe a method for designing a back-propagation ANN, as shown in Fig. 3.4. All of these steps are discussed in the sections that follow.

Fig. 3.4 Eight steps in designing neural networks forecasting model [21]

Variable selection

In the case of river flow prediction, defining the input data is critical since it affects the generalisation ability of the ANN. For flow rate prediction, rain data from weather stations are used as input data. Previous rainfall and river flow rates are sometimes used both as inputs and as targets. Moving averages and exponential moving averages are used to enhance the performance of the ANN. A simple moving average (SMA) is expressed as in equation (3.12):

$$\mathrm{SMA} = \frac{p_M + p_{M-1} + \cdots + p_{M-(n-1)}}{n} \tag{3.12}$$

where $p$ is the parameter, such as the flow rate, $M$ is the index of the current sample at the given duration or data resolution, and $n$ is the number of samples in the total time period. The exponential moving average (EMA) is expressed as in equation (3.13):

$$\mathrm{EMA} = \frac{p_M + (1-\alpha)\,p_{M-1} + (1-\alpha)^2 p_{M-2} + \cdots + (1-\alpha)^{n-1} p_{M-(n-1)}}{1 + (1-\alpha) + (1-\alpha)^2 + \cdots + (1-\alpha)^{n-1}} \tag{3.13}$$

where $\alpha$ is the smoothing factor.

Correlation analysis [67] provides a way of reducing the number of inputs, and hence the number of trials, by eliminating those input variables that are independent of or unrelated to the output [91] [92]. The problems with this method are that: (i) it assumes a linear relationship between the variables, whereas most hydrological problems are non-linear; (ii) it is unclear which correlation value threshold to use; and (iii) it does not take variable independence into account.
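The two moving averages used to smooth the input series can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, and the exponential moving average is assumed to take the common finite-window weighted form, in which the k-th most recent sample receives the weight (1 - alpha)^k and `alpha` is an assumed name for the smoothing factor.

```python
def simple_moving_average(values, n):
    """Mean of the n most recent values in the series."""
    if n <= 0 or n > len(values):
        raise ValueError("n must be between 1 and len(values)")
    window = values[-n:]
    return sum(window) / n


def exponential_moving_average(values, alpha):
    """Weighted mean in which older samples receive geometrically smaller weights."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    numerator = 0.0
    denominator = 0.0
    weight = 1.0
    for value in reversed(values):      # most recent sample first
        numerator += weight * value
        denominator += weight
        weight *= (1.0 - alpha)
    return numerator / denominator


flow_rates = [1.0, 2.0, 3.0, 4.0]       # made-up flow rates for illustration
sma = simple_moving_average(flow_rates, n=2)
ema = exponential_moving_average([1.0, 2.0, 3.0], alpha=0.5)
```

With `alpha` close to 1 the exponential average follows the latest sample, while a small `alpha` approaches the simple average over the whole window.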

Data pre-processing

Since neural networks are pattern matchers, the representation of the data is critical in designing a successful network. Raw inputs and outputs are rarely fed to the ANN directly [93][94]. The data were pre-processed to reduce noise, detect patterns, and normalise the data distribution to facilitate ANN training convergence and generalisation.

The normalised data are mapped into a predefined range [0, 1]. The min-max method normalises the values of an attribute X of a dataset according to its minimum and maximum values, using equation (3.14):

$$X' = \frac{X - X_{\min}}{X_{\max} - X_{\min}} \tag{3.14}$$

This also matches the range of the data with the range of the transfer functions [93][94]. The input and output variables are scaled in this way before the network is trained. Unlike in statistical models, the data do not need to be normally distributed.
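The min-max normalisation described above can be sketched as follows. The function names are hypothetical, and the inverse mapping is included on the assumption that network outputs need to be converted back to physical units.

```python
def min_max_normalise(values):
    """Map values linearly onto [0, 1] using their own minimum and maximum."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        raise ValueError("values must not all be equal")
    return [(v - lowest) / (highest - lowest) for v in values]


def denormalise(scaled, lowest, highest):
    """Inverse mapping from [0, 1] back to the original range."""
    return [s * (highest - lowest) + lowest for s in scaled]


raw = [2.0, 4.0, 6.0]                   # made-up values for illustration
scaled = min_max_normalise(raw)
restored = denormalise(scaled, min(raw), max(raw))
```

In practice the minimum and maximum would be taken from the training dataset and reused for the testing and validation datasets so that all three share one scale.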

Training, testing, and validation

The training dataset is the largest of the three datasets. The testing set is used to test the generalisation ability of the trained network and the validation set is used to check the performance of the network. Kaastra and Boyd recommended using 10% to 30% of the data as the testing dataset [21], and Zhuang et al. [14] recommended using 70% to 90% of the data as the training dataset. Therefore, a 70-15-15 (training-testing-validation) division is considered acceptable and has been used in this research [95][96].

Neural network paradigms

The number of hidden layers and the number of neurons in the hidden layers play a vital role in the performance of a back-propagation neural network, especially in cases where the problem involves an arbitrary decision boundary with rational activation functions. Also, multiple hidden layers can approximate any smooth mapping to any accuracy.

However, there is still no defined procedure for deciding the number of hidden layers and the number of neurons in each layer [97]. Furthermore, many previous studies showed that it is better to increase the number of hidden neurons than to increase the number of layers of the network [97][98]. In this study, the following configuration has been used for the experiment:

• The number of input neurons is the same as the number of inputs. Therefore, one input neuron is assigned to the previous flow rate and another is assigned for predicting the future flow rate [97] [98].

• The number of hidden layers depends on the number of inputs and the nature of the data. It is also accepted that one layer with a sufficient number of neurons is enough for good approximations because more layers increase the possibility of over-fitting [99] [17].

• There is only one output neuron, which is the forecasted river flow rate.

• The most commonly used transfer functions are linear and sigmoid functions because of their continuity. The hyperbolic tangent sigmoid (Tan-sig) and pure linear functions are both used in this research. The Tan-sig function accepts inputs in any range and non-linearly transforms them into a value between -1 and 1. The function can be expressed as:

$$f(u) = \tanh(\alpha u) = \frac{2}{1 + e^{-2\alpha u}} - 1 \tag{3.15}$$

where u is the input and α is a variable that controls the shape of the function. The larger the value of α, the closer the function gets to the step function; and the smaller the value, the closer it gets to a purely linear function. The variable α is varied in the experiment to test the effect of different shapes of the Tan-sig function on ANN performance.

Evaluation criteria

Neural network performance is evaluated using specified performance indicators.

For error measurement, the mean square error (MSE) function was used during training. Regression was also used [100][101] to measure the neural network performance and prediction performance.

Neural network training

Training of the ANN was iterative. The main objective of training is to reach the global minimum of the error function and to generalise the model. In this study, the backpropagation algorithm was used. It was developed to improve training for different types of problems, with the aim of reducing the training time and achieving fast convergence.

The main goal of backpropagation is to optimise the weights so that the neural network learns to map arbitrary inputs to the outputs correctly. In this study, Bayesian regularisation backpropagation and Levenberg-Marquardt backpropagation were selected from a wide range of available methods after evaluation [68][69][102][103]. These two algorithms were selected after a study comparing different algorithms showed that the Levenberg-Marquardt algorithm (trainlm in Matlab) is suitable for small to medium-sized networks [104][105][106].

Another aspect of training is the criterion for deciding when to stop the training process. Since the network is trained in iterations, also called epochs, the number of iterations must be specified to guide the training process. The number of iterations can be fixed before training starts, but this method has two disadvantages. If the network converges very quickly, overtraining will occur, but if it converges slowly, training will stop before the network is able to converge. The convergence method, which was used in this research, solves this problem by stopping when the performance function stops improving, i.e. when the network converges. Fig. 3.5 illustrates an example of error performance and the criterion for stopping the training process. During training, the maximum number of epochs is fixed at a high value that is unlikely to be reached (e.g. 1000) to allow sufficient iterations to achieve convergence while ensuring that the training does not carry on forever.
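The convergence-based stopping rule can be sketched as a loop that ends when the error has stopped improving, with the maximum number of epochs as a safety limit. This is a minimal sketch under stated assumptions: `train_one_epoch` is a hypothetical callable that runs one epoch and returns the current error, and `patience` (the number of consecutive non-improving epochs tolerated) is an assumed parameter, not a setting reported in this research.

```python
def train_until_converged(train_one_epoch, max_epochs=1000, patience=6):
    """Run epochs until the error stops improving or max_epochs is reached."""
    best_error = float("inf")
    epochs_without_improvement = 0
    for epoch in range(1, max_epochs + 1):
        error = train_one_epoch()
        if error < best_error:
            best_error = error
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch, best_error      # converged: stop early
    return max_epochs, best_error             # safety limit reached


# Simulated error curve that flattens after the third epoch.
simulated_errors = iter([1.0, 0.5, 0.4, 0.4, 0.4, 0.4, 0.4])
stopped_at, best = train_until_converged(lambda: next(simulated_errors), patience=3)
```

In this simulated run the error stops improving after the third epoch, so training ends three epochs later instead of running to the 1000-epoch limit.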

Since the LM algorithm usually has a faster convergence, training time is typically shorter compared to Bayesian Regularization. However, recurrent NARX networks with Bayesian Regularisation tend to produce better results compared to the LM algorithm.

Fig. 3.5 Training stopping due to convergence.

ANN application in prediction models

In the early stage, Smith and Eli [107] used a basic three-layer ANN (input, hidden, output) to predict runoff based on simulated rainfall patterns in a synthetic catchment. Their results were not very accurate in predicting either the river flow rate or its timing: they could not successfully model the physical characteristics of their catchment and rainfall because the ANN does not model the individual physical processes.

Later, Dawson and Wilby [77] also used ANN to forecast flow rates of the Amber and Mole Rivers in the UK for lead time up to 6 hours. This was found to be comparable in performance to an existing flood forecasting system in operation.

Different networks have also been developed, e.g. a Radial Basis Function Network (RBFN) by Mason et al. [108] and a temporal backpropagation network (TBP-NN) by Sajikumar and Thandaveswara [109]. The issue of lead times was also considered, e.g. Campolo et al. [110] examined the accuracy of ANN predictions as the lead time is increased from 1 hour to 5 hours. They noted a decrease in accuracy with increasing forecasting time.

In 2000, reviews by the American Society of Civil Engineers Task Force and by Maier and Dandy [67] targeted the broader area of hydrology. They highlighted the main challenges in ANN model development, such as the selection of input variables, an optimal division of the input data, data pre-processing methods, etc. These reviews of the application of ANN in hydrology, including rainfall-runoff modelling, concluded that ANN could perform as well as existing hydrological models. However, there was a lack of an established methodology for model design and implementation.

Implementation

Table 3.1 gives a summary of the network configuration used in this research. This configuration was selected based on detailed research and study of the problem to be solved, including analysis of the available data.

Setting                                                                  Value
Input variables                                                          Rainfall, Flow Rate, Dry time
Output variable                                                          River Flow Rate
Training, testing, validation                                            70-15-15
ANN type                                                                 RNN (NARX)
Number of input neurons                                                  3
Number of hidden layers                                                  1
Number of neurons in hidden layer                                        1
Number of output neurons                                                 1
Hidden layer transfer function                                           Hyperbolic tangent sigmoid shape
Output layer transfer function                                           Hyperbolic tangent sigmoid and Pure linear
Training algorithm                                                       Bayesian Regularization and Levenberg-Marquardt backpropagation
Training algorithm learning rate (Mu)                                    0.001 to 1
Training algorithm learning rate updates (Mu step up and Mu step down)   10 and 0.1
Error (performance) function                                             MSE, RMSE
Prediction (performance)                                                 Predicted-Actual Flow Regression
Max epochs                                                               1000
Max training time                                                        Undetermined

Table 3.1 Summary of the settings to be used in this experiment and their values.

[Figure: the model inputs (Rainfall, River flow, Dry Period) feed the input layer (i), which is connected through the weights W11 to W33 to the hidden layer (j) and the output layer (k); the output is the predicted flow rate of n-time.]

Fig. 3.6 A schematic of the ANN that has been applied in the prediction model.

In this research, ANN models of the River Pelarit and River Jarum were developed to predict the river flow rate using recordings of river flow rate at 5-minute intervals from the related stream gauging stations and rainfall from the relevant stations. The network consisted of 3 lagged inputs (Rainfall, River flow and Dry Period) from each station, one hidden layer, and one output, i.e. the predicted flow rate, as in Fig. 3.6.

Model Evaluation

The performance of hydrological models is commonly assessed by measures of performance or goodness-of-fit statistics. This has also been applied to ANN rainfall-runoff models. It is suggested that the adopted criteria should not be redundant and should be sensitive to different types of errors (e.g. errors in peak prediction, errors in the timing of the hydrograph prediction, etc.) [77][111][112]. However, often only indices that are linked to each other, such as the correlation and the root mean squared error, are applied.

Prediction performance can be assessed by computing an error measure, E, over a time series:

$$E = \sum_{t} \left( \hat{x}(t) - x(t) \right)^2 \tag{3.16}$$

where E accumulates the error between the estimated or predicted element, $\hat{x}(t)$, and the actual sequence element, $x(t)$ [70].

After the testing phase is completed, the results are saved in the workspace, and a graph of the actual flow rate and the predicted flow rate over a given time period is plotted. The graph is an effective visual way of comparing the measured and predicted results. Performance measures such as the Root Mean Square Error (RMSE), Mean Absolute Percentage Error (MAPE) and Regression (R2) have been used. The RMSE is shown in equation (3.17) [113].

$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{n} (O_i - P_i)^2}{n}} \tag{3.17}$$

where $O_i$ is the observed value for the i-th observation, $P_i$ is the predicted value and $n$ is the number of observations. The mean absolute error (MAE) gives the same weight to all errors, while the RMSE penalises variance as it gives errors with larger absolute values more weight than errors with smaller absolute values [114]. When both metrics are calculated, the RMSE is by definition never lower than the MAE.
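The error measures named above can be sketched as follows. The function names and the sample values are made up for illustration, and the MAPE is assumed to take its usual form, which requires that no observed value is zero.

```python
import math


def rmse(observed, predicted):
    """Root mean square error."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def mae(observed, predicted):
    """Mean absolute error: every error carries the same weight."""
    n = len(observed)
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / n


def mape(observed, predicted):
    """Mean absolute percentage error (assumes no observed value is zero)."""
    n = len(observed)
    return 100.0 * sum(abs((o - p) / o) for o, p in zip(observed, predicted)) / n


observed = [1.0, 2.0, 4.0]      # made-up flow rates
predicted = [1.0, 2.0, 2.0]     # one large miss on the peak value
rmse_value = rmse(observed, predicted)
mae_value = mae(observed, predicted)
mape_value = mape(observed, predicted)
```

Because the single large error dominates, the RMSE in this example is larger than the MAE, which illustrates why the RMSE is more sensitive to errors in peak prediction.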

Correlation Coefficient

The correlation coefficient is one of the ways to determine the degree to which the movements of two variables are associated, as shown in equation (3.18). The correlation coefficient ranges from -1.0 to 1.0. A correlation of -1.0 indicates a perfect negative correlation, while a correlation of 1.0 indicates a perfect positive correlation [115].

$$r = \frac{n \sum O_i P_i - \left( \sum O_i \right) \left( \sum P_i \right)}{\sqrt{\left[ n \sum O_i^2 - \left( \sum O_i \right)^2 \right] \left[ n \sum P_i^2 - \left( \sum P_i \right)^2 \right]}} \tag{3.18}$$

where $O_i$ is the observed data, $P_i$ is the predicted data, and $n$ is the number of data.

Advantages and Disadvantages of ANN for Hydrological Modelling

One of the key advantages of using ANN in hydrological modelling is that, once developed, the model can run in near-real-time and continuously use available data to adapt and improve its prediction. This is very important in an area where land use changes at a faster pace than normal. Continuous and adaptive techniques such as ANN become more important as they are able to evolve their prediction with changing conditions. Other advantages include:

• Ability to adapt to non-linear problems such as hydrological problems. The interconnection between the neurons in an ANN generates non-linear data processing structures that are distributed across the network. This feature allows intrinsically non-linear processes, such as the transformation of rainfall into runoff, to be modelled.

• Ability to model input/output relationships: an ANN does not need an explicit mathematical equation to specify the connection between the inputs and the outputs.

• Adaptability: an ANN is adaptable to changes as new data become available. This feature makes it particularly useful in the treatment of non-stationary processes, where learning strategies can be designed in real time so that the model learns continuously [116]. This is important as rainfall patterns change due to long-term weather change and land use.

• Less sensitivity to noise in the data: ANN has been shown to have a greater ability to handle noise in the input data [26], especially when the amount of data available is large.

• Modularity: ANN can be integrated easily into modular architectures to solve specific subtasks of the overall problem efficiently [117]. This would allow the method to be adapted to different sections of the river course.

However, the main disadvantage of ANN is its black box approach to the process, which makes it less preferable to physically-based and conceptual models, especially for real-time applications. Trying to open up the black box and find physical meaning in ANN is an area of research that needs more attention. Some preliminary studies have begun to address this issue, but it continues to be an area where further research is needed [57][118]. At the same time, this particular aspect is also its attraction for catchment modelling, as the complexities of the physical processes and their changes do not need to be explicitly understood.

Summary

This chapter has provided an overview of the design of an ANN with particular emphasis on rainfall-runoff modelling. The advantages of ANN in hydrological modelling, as well as the disadvantages, have been discussed. The following chapter will present the methodology that has been applied in this research.

Major Outcome

The advantages of using an ANN-based prediction method include the following:

• Prior knowledge of the underlying process is not required;

• Existing complex relationships among the various aspects of the process under investigation need not be recognised;

• Solution conditions, such as those required by standard optimisation or statistical models, are not pre-set;

• Constraints and prior solution structures are neither assumed nor enforced [119].

Chapter 4 Methodology

This chapter describes a detailed study of the catchment area and how rainfall affects the river characteristics. Statistical analysis of rain in the study area is also presented to gain a detailed understanding of the local rain dynamics.

Fig. 4.1 shows the development of the river flow model for Timah Tasoh, which consists of several stages. Firstly, the rainfall characteristics are studied using rainfall data from 14 weather stations in Perlis. Secondly, the relationship between rainfall and the river characteristics (flow rate and level) for River Jarum and River Pelarit is determined, based on data from 3 rainfall stations and two river flow stations. Finally, ANN modelling is introduced, together with a description of how it can be used to predict the important river characteristics.

Fig. 4.1 Development of River Flow Modeling for Timah Tasoh Basin.

Study area

Malaysia has two predominant seasons: the dry season and the rainy season. The north-west region, where Perlis is located, is the driest part of the Peninsula during the northeast monsoon period [120]. The northeast monsoon has less influence in the north-western region because the Titiwangsa mountain range shelters the region [2]. On the other hand, the northeast monsoon strongly affects the eastern part of the Peninsula [120][121]. Fig. 4.2 shows that the annual rainfall in Peninsular Malaysia ranges from 1500 mm to above 4000 mm per year [5].

Fig. 4.2 Annual rain distribution for Peninsular Malaysia [5].

Perlis, the smallest state in the northern part of Peninsular Malaysia, has been chosen as the case study area for this research. Perlis is located at 6°30′ N 100°15′ E and has an area of 821 km2 (317 square miles). In 2005, 2010 and 2011, a series of floods occurred in Perlis, resulting in significant damage to properties and some loss of lives [122]. In 2010 alone, 13,711 people were evacuated from the flooded areas while the floods claimed four lives [122]. According to the Ministry of Agriculture and Agro-based Industry of Malaysia, losses to farmers were estimated to be up to RM 50 million (£10 million) [123] in 2010 alone.

[Map labels: 1. Padang Besar; 2. Lubok Sireh; 3. Kaki Bukit; 4. Jarum River; 5. Pelarit River]

Fig. 4.3 Location of the hydrological stations in Perlis, Malaysia.

Fig. 4.3 shows the locations of the gauging stations from which data have been collected and used in this study. They consist of two river gauging stations and three weather stations.

The main water reservoir in Perlis is known as Timah Tasoh. It receives its water supply from two main rivers, the River Tasoh and River Pelarit, with a combined basin area of 191 square km. Because of its small capacity, the water level in the reservoir often rises quickly after heavy rains. Excess water from the reservoir, when released, causes flooding downstream of the dam in the main populated area, including the state capital, Kangar. Any delay in releasing the water also causes flooding in the surrounding area upstream of the Timah Tasoh dam [124]. Therefore, monitoring and predicting the amount of run-off and the water level in the lake or dam are crucial for the authority to manage the water in the dam effectively. The dam has dual functions: to ensure that a sufficient amount of water is retained for domestic and irrigation use, and to make sure that the water level does not rise above a critical level that will cause floods. During heavy rains or the monsoon season, excess water in the dam has to be released to ensure the safety of the dam and to reduce the risk of flooding upstream. However, during the dry season, the water level in the dam falls rapidly due to its small capacity. This can lead to water rationing for domestic water supply and irrigation.

Fig. 4.4 Land usage in Timah Tasoh Reservoir catchment area.

The land use in the Timah Tasoh catchment area is shown in Fig. 4.4 and Table 4.1 provides details of land usage in the basin area. Most of the land is covered by vegetation, forest and farmland.

Land use     River Jarum   River Pelarit Hulu   River Tasoh   River Pelarit Hilir
Sugar        11.58         -                    11.58         -
Settlement   0.74          0.35                 2.68          0.35
Mixed crop   2.22          0.21                 3.37          0.76
Scrubland    2.79          0.33                 3.2           0.38
Rubber       12.94         2.09                 25.87         5.13
Rice field   5.23          0.4                  7.62          0.59
Forest       28.9          38.72                46.85         52.11
Grass        -             0.13                 0.08          0.28
Quarry       -             0.5                  -             0.5
TOTAL        64.4          42.72                101.25        60.1

Table 4.1 Detail of land use (km2) in the Timah Tasoh catchment area [125].

The distance between the River Pelarit monitoring station at Batu Limabelas and the Timah Tasoh reservoir is about 9.87 km, as shown in Fig. 4.5. The related rain stations for River Pelarit are the Kaki Bukit and Lubuk Sireh stations.

Fig. 4.5 The distance of River Pelarit from the gauging station in Kg Batu Limabelas.

Fig. 4.6 River Jarum- gauging station at Kg Masjid to Timah Tasoh Lake.

The distance from the River Jarum monitoring station to the Timah Tasoh dam is about 7.40 km. The aerial view is shown in Fig. 4.6. The related rain gauge station for River Jarum is the Padang Besar gauging station, located in the northern part of Perlis.

Mann-Kendall (MK) trend test

The Mann-Kendall (MK) test is a non-parametric way to detect a trend in a series of values, and is used here to study the trend of rainfall in Perlis. This test is suitable for cases where the trend may be assumed to be monotonic and thus no seasonal or other cycle is present in the data. The MK test is applicable in cases where the data values $x_i$ of a time series can be assumed to obey the model in equation (4.1):

$$x_i = f(t_i) + \varepsilon_i \tag{4.1}$$

where f(t) is a continuous monotonic increasing or decreasing function of time and the residuals $\varepsilon_i$ can be assumed to be from the same distribution with zero mean. It is therefore assumed that the variance of the distribution is constant in time. The null hypothesis of no trend, H0, i.e. that the observations $x_i$ are randomly ordered in time, is tested against the alternative hypothesis, H1, that there is an increasing or decreasing monotonic trend.

The MK test statistic is defined as $T = \sum_{i<j} \mathrm{sgn}(x_j - x_i)$, where $\mathrm{sgn}(x)$ is 1 if x > 0, 0 if x = 0 and -1 if x < 0. Under the null hypothesis that there is no trend, T is distributed as a normal random variable with mean zero and variance $n(n-1)(2n+5)/18$, assuming no ties between $(x_1, x_2, \ldots, x_n)$.

The MK test statistic S is calculated using formulae (4.2) and (4.3):

$$S = \sum_{k=1}^{n-1} \sum_{j=k+1}^{n} \mathrm{sgn}(x_j - x_k) \tag{4.2}$$

where $x_j$ and $x_k$ are the rainfall values in the specific durations j and k respectively, with j > k, and

$$\mathrm{sgn}(x_j - x_k) = \begin{cases} 1, & \text{if } x_j - x_k > 0 \\ 0, & \text{if } x_j - x_k = 0 \\ -1, & \text{if } x_j - x_k < 0 \end{cases} \tag{4.3}$$

If n is 9 or less, the absolute value of S is compared directly to the theoretical distribution of S derived by Mann and Kendall. In this case, the two-tailed test is used for four different significance levels α: 0.1, 0.05, 0.01 and 0.001. At a certain probability level, H0 is rejected in favour of H1 if the absolute value of S equals or exceeds a specified value $S_{\alpha/2}$, where $S_{\alpha/2}$ is the smallest S which has a probability of less than α/2 of appearing in the case of no trend. A positive value of S indicates an upward trend and a negative value of S indicates a downward trend. The minimum values of n with which these four significance levels can be reached are derived from the probability table for S, as shown in Table 4.2.

Significance level, α   Required n
0.1                     4
0.05                    5
0.01                    6
0.001                   7

Table 4.2 The four significance levels and the required number of data, n.

A significance level of 0.001 means that there is a 0.1% probability that the values $x_i$ are from a random distribution and, with that probability, a mistake is made when rejecting H0 of no trend. Thus, a significance level of 0.001 means that the existence of a monotonic trend is very probable. Similarly, a significance level of 0.1 means that there is a 10% probability that a mistake is made when rejecting H0. If the number of data values, n, is at least 10, the normal approximation test is used. However, if there are several tied values (i.e. equal values) in the time series, this may reduce the validity of the normal approximation when the number of data values is close to 10.
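The MK statistic and its normal approximation can be sketched as follows. This is an illustrative sketch, not the tool used in this research: the function name is hypothetical, the variance is the standard no-ties form n(n-1)(2n+5)/18, and the continuity-corrected Z score is the commonly used form of the normal approximation, which the text above does not state explicitly.

```python
import math


def mann_kendall(values):
    """Return (S, Var(S), Z) for a series, assuming no tied values."""
    n = len(values)
    s = 0
    for k in range(n - 1):
        for j in range(k + 1, n):
            difference = values[j] - values[k]
            s += (difference > 0) - (difference < 0)   # sign of x_j - x_k
    variance = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        z = (s - 1) / math.sqrt(variance)   # continuity correction
    elif s < 0:
        z = (s + 1) / math.sqrt(variance)
    else:
        z = 0.0
    return s, variance, z


rising = list(range(1, 11))                 # made-up, strictly increasing series
s_up, var_up, z_up = mann_kendall(rising)
s_down, _, z_down = mann_kendall(rising[::-1])
```

For a two-tailed test, the absolute value of Z would then be compared with the standard normal quantile for the chosen significance level; series with many tied values need the tie-corrected variance, which this sketch omits.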

Rain Dynamics

Annual rain data from 4 weather stations located in the catchment area have been analysed to study the rain characteristics. The data were measured at 5-minute intervals. The stations and the distances (in metres) between them are shown in Table 4.3. The closest two rain gauge stations are Lubuk Sireh and Kaki Bukit, which are 1416 metres apart. For the purpose of this study, these two stations have been used to determine the rain cell sizes and dynamics. The selection of these stations is also important because Lubuk Sireh and Kaki Bukit are located in the middle of the Timah Tasoh catchment area and Tasoh is located at the edge of the dam, as shown in Fig. 4.8. Rainfall from Wang Kelian does not flow into the catchment, but its position is important for studying the dynamics and direction of rain cells within the area.

              WANG KELIAN   LUBOK SIREH   KAKI BUKIT
WANG KELIAN   -             4958
LUBOK SIREH   4749          -
KAKI BUKIT    4094          1416          -
TASOH         7811          3643          3724

Table 4.3 Rain gauge stations in the Timah Tasoh catchment and the distances (in metres) between them.

[Chart: monthly rainfall from Jan to Dec (vertical axis from 0 to 600) for the Wang Kelian, Lubok Sireh, Kaki Bukit and Tasoh stations.]

Fig. 4.7 Monthly distribution in 2013 for rain stations in Perlis.

Fig. 4.7 compares the monthly rainfall amounts measured by the 4 stations in 2013. The results show that the wettest month was October, with up to 480 mm of rain at Wang Kelian station. Amongst the 4 stations, Wang Kelian experiences the largest amount of rain, followed by Kaki Bukit, Lubok Sireh and then Tasoh. This shows that topography significantly influences the amount of rain that falls in this region. Rain from the sea is blocked by the hill that separates Wang Kelian from the rest of the rain gauge stations.

[Map labels: rain gauges at Wang Kelian, Lubuk Sireh and Kaki Bukit; river level sensor at Tasoh.]

Fig. 4.8 Location of four rain stations in Timah Tasoh catchment.

The study of rain dynamics is achieved using the rain gauge grid described in Fig. 4.9. The box represents the coverage area, which is divided into four smaller sections, and each rain gauge is considered to be located in the centre of its section. Based on the rain cell size, there are three possible scenarios: (a) the rain cell is smaller than or equal to 1.5 km (the distance between the 2 closest rain gauge stations) and is moving across the area; (b) the rain cell is larger than 1.5 km and is moving across the area; and (c) the rain cell is small and localised. In the first scenario (Fig. 4.9 a), a small rain cell is initially, at time t0, outside the catchment area and travels into it at time t1. At each time step, only one rain gauge station experiences rain. When the rain cell enters the catchment area, rain gauge RG1 records the rainfall. The second rain gauge, RG2, records rainfall at time t2. Finally, at time t3, none of the rain gauges in the catchment area records rainfall from the rain cell.

In the second scenario (Fig. 4.9 b), the rain cell is large and covers more than one weather station at the same time. This is widespread rain, which is characteristic of rain in temperate climates and is easier to detect and predict. In the third scenario (Fig. 4.9 c), the rain cell is small and static, i.e. the rain starts and stops in the same location. Fig. 4.9 a) and Fig. 4.9 b) illustrate cases where the rain dynamics are likely to be influenced by the wind. Fig. 4.10 shows the direction of rain cell movement in the catchment area based on the study of rainfall data.

[Figure: for each scenario, a table of the readings of rain gauges RG1 to RG4 (X or 0) at times t0 to t3, and the corresponding 2 x 2 rain gauge grid at each time step.]

Fig. 4.9 Different scenarios of rain gauge grid for rain cell detection: (a) small sized rain cell < 1.5 km, (b) large rain cell > 1.5 km, (c) localized cell < 1.5 km.

[Map labels: rain stations at Wang Kelian, Lubuk Sireh and Kaki Bukit; river level station at Tasoh; arrow showing the wind direction.]

Fig. 4.10 The direction of rain cell movement within the catchment area.

Fig. 4.10 shows an example of the dynamics of the rain cell in the Timah Tasoh catchment area. It is based on Table 4.4, which indicates that the rain cell moved in the north-west direction, as deduced from analysis of the sequence of the rain gauge measurements. For this case study, initially, on 30 January 2013 at 11:15 a.m., the rain cell travelled from outside the catchment area, and at 11:25 a.m. both the Kaki Bukit and Lubuk Sireh rain gauge stations were experiencing rain. This suggests that the rain cell that entered the catchment area was larger than 1.5 km in diameter, based on the distance between the two rain gauge stations. However, the amounts of rain recorded were not the same. The rain cell was recorded for 15 minutes at the Lubok Sireh station before it disappeared at 11:35 a.m. Meanwhile, the rain cell moved in the direction of Wang Kelian station, which started recording rain at 11:35 a.m. Due to the movement of the rain cell, by 12:05 p.m. the rain cell had passed.

Date       Time    WANG KELIAN   KAKI BUKIT   LUBOK SIREH   TASOH
20130103   11:15   0             0            0             0
20130103   11:20   0             0.1          0.5           0
20130103   11:25   0             0.5          1.5           0
20130103   11:30   0             0.7          1.6           0
20130103   11:35   0.6           1.1          0             0
20130103   11:40   2.6           0.8          0             0
20130103   11:45   0.9           0.2          0             0
20130103   11:50   0.2           0.2          0             0
20130103   11:55   0.2           0            0             0
20130103   12:00   0.1           0            0             0
20130103   12:05   0             0            0             0

Table 4.4 Rain records (amount of rain in mm) for 4 stations in the Timah Tasoh catchment.
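The way the direction of movement is deduced from the sequence of rain gauge measurements can be sketched by finding the time at which each station first records rain. The values below are transcribed from Table 4.4; the function name `rain_onsets` is hypothetical and the sketch is an illustration of the idea rather than the analysis procedure used in this research.

```python
STATIONS = ("Wang Kelian", "Kaki Bukit", "Lubok Sireh", "Tasoh")

# (time, rain in mm at each station in the order of STATIONS), from Table 4.4.
RECORDS = [
    ("11:15", 0.0, 0.0, 0.0, 0.0),
    ("11:20", 0.0, 0.1, 0.5, 0.0),
    ("11:25", 0.0, 0.5, 1.5, 0.0),
    ("11:30", 0.0, 0.7, 1.6, 0.0),
    ("11:35", 0.6, 1.1, 0.0, 0.0),
    ("11:40", 2.6, 0.8, 0.0, 0.0),
    ("11:45", 0.9, 0.2, 0.0, 0.0),
    ("11:50", 0.2, 0.2, 0.0, 0.0),
    ("11:55", 0.2, 0.0, 0.0, 0.0),
    ("12:00", 0.1, 0.0, 0.0, 0.0),
    ("12:05", 0.0, 0.0, 0.0, 0.0),
]


def rain_onsets(records, stations):
    """Return the first time each station records rain (None if it never does)."""
    onsets = {name: None for name in stations}
    for time, *amounts in records:
        for name, amount in zip(stations, amounts):
            if amount > 0 and onsets[name] is None:
                onsets[name] = time
    return onsets


onsets = rain_onsets(RECORDS, STATIONS)
```

Stations that share an onset time are under the rain cell at the same moment, while a later onset at another station indicates the direction in which the cell is travelling.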

Catchment Area Characteristic

In the conventional hydrological method, hydrological processes in the floodplain are used to estimate the river characteristics, e.g. flow rate and level [126][127]. The catchment's physical characteristics (soils, geology, and the presence of different types of vegetation) must be known and some assumptions about their spatial homogeneity must be made. This is not always possible and therefore an alternative approach has to be sought.

In this study, the alternative method chosen utilises rainfall and river data. The data cover a 4-year period from 1 Jan 2010 to 31 Dec 2013. The aims of the analysis are:

• To study the relationship between rainfall and river characteristics such as river flow rate.

• To identify the dependency of the river flow rate on rainfall in terms of rainfall duration and intensity.

• To analyse how the river flow responds to rainfall in the catchment area after a dry period, and the effects of percolation into the soil.

Surface-runoff relationship

Rainfall-runoff is a process that occurs in both the space and time domains. The spatial domain is defined by the catchment boundaries, rainfall distribution, topography, soil layers, river network, and everything else that has a tangible effect on the resulting estimate. The time domain is defined by the variation of the estimate for each discrete rainfall point over time.

Fig. 4.11 Padang Besar daily rainfall in the year of 2010.

For small catchment areas like Timah Tasoh, the lumped model approach is the most suitable one to use [128]. In this method, the complexity of parameters such as topography, type of soil and type of vegetation in the catchment area can be avoided [129][130].

The water level in the Timah Tasoh reservoir is monitored and controlled manually. Therefore, the level changes artificially when water is released, irrespective of rain conditions. If too much water is released, it could result in a water shortage if the inflow from the two rivers is not sufficient. A prediction of the river flow rate would enable more efficient water release management to be put in place, to ensure the safety of the dam and to balance flood mitigation against retaining sufficient water for domestic and agricultural use. The flow rates of the rivers are affected, naturally, by rainfall in the catchment area. To understand the catchment area, further analysis of the relationship between rainfall and water discharge into the dam was carried out using the data available in the catchment area, which are:

• Padang Besar rainfall station
• Lubok Sireh rainfall station
• Kaki Bukit rainfall station
• River Pelarit flow rate
• River Jarum flow rate

[Figure: river flow rate against time, annotated with the intervals Tdry, Tdelay, Train and TF Change.]

Tdry (Dry Period): duration of the dry spell before rain.
Tdelay (Flow Delay): time between the start of rain and a change in the river flow rate.
Train (Rain Period): duration of rain.
TF Change: time for the flow rate to return to the base flow rate.

Fig. 4.12 Relation of rainfall and river flow in the basin of Timah Tasoh.

Fig. 4.12 illustrates the relation between rainfall and river flow. Rainfall affects the river flow rate after a time delay that depends on the topography, the distance of the weather station from the river, the hydrology of the catchment, etc. As the rain gets heavier, the soil becomes saturated. Surface runoff then increases and water starts flowing into the river channel, causing the rate of discharge to increase. A very steep rise indicates a rapidly increasing discharge rate, meaning that water is flowing into the river channel at a rapid rate, and thus implies heavy rainfall in the river catchment.

When the river flow reaches its maximum (peak discharge) and the rain stops, the river discharge starts to decrease and the curve slopes downwards gradually (the gradient is usually less steep than that of the rising limb). However, the flow rate can peak while it is still raining, in which case the river flow rate may start to increase again.

This analysis is necessary to determine the saturation time for the basin. It can also describe the soil moisture variation, based on rainfall accumulated over a fixed period, and the time evolution of these variables. In other words, this analysis is to determine:

• The time taken for the flow to change, based on the amount of rainfall in the basin after a dry period.

• The duration for which the flow rate is affected by rain.

• The minimum time available to predict the river flow after rain has occurred.

Table 4.5 shows the amount of data available from the stations in the Timah Tasoh basin, consisting of rainfall and river flow rate data from 2010 to 2013 at 5-minute intervals.

| No. | Station | Measurement parameter | Amount of data |
|---|---|---|---|
| 1 | Padang Besar | Rainfall (mm/h) | 420,768 |
| 2 | Lubok Sireh | Rainfall (mm/h) | 420,768 |
| 3 | Kaki Bukit | Rainfall (mm/h) | 420,768 |
| 4 | River Pelarit | Flow rate (m3/s) | 420,768 |
| 5 | River Jarum | Flow rate (m3/s) | 420,768 |

Total of data for each station.
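The figure of 420,768 samples per station follows directly from the length of the record and the sampling interval. The short sketch below reproduces the arithmetic; the function name `samples_between` is an illustrative assumption.

```python
from datetime import date


def samples_between(start, end, step_minutes=5):
    """Number of fixed-interval samples from start to end, both days inclusive."""
    days = (end - start).days + 1
    return days * (24 * 60 // step_minutes)


# 1 Jan 2010 to 31 Dec 2013 is 1461 days (2012 is a leap year),
# and each day holds 288 five-minute samples.
n_samples = samples_between(date(2010, 1, 1), date(2013, 12, 31))
print(n_samples)
```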

The next step is to group the related weather and river flow stations according to the river catchment as shown in Table 4.1.

| No. | Station | River catchment |
|---|---|---|
| 1 | Padang Besar, River Jarum | Jarum |
| 2 | Lubok Sireh, Kaki Bukit, River Pelarit | Pelarit |

Related stations for River Pelarit and River Jarum.

ANN-based river flow rate model

Satisfactory flood prediction requires a model that represents the parameters of the river. As discussed in Chapter 3, ANN falls under data-driven hydrological models.

ANN is sensitive to the data that is used, and there are several conditions that must be fulfilled to ensure that the resulting river flow rate model can be generalised. These include selecting appropriate input parameters based on their relevance and reducing the possibility of overfitting during the training phase.

Comparison between river level and river flow rate measurement

In current flood monitoring practice, the river level is constantly monitored, together with rainfall rate and duration, in order to warn the public of the possibility of flooding. In Malaysia, reliance on water level monitoring often results in warnings with very short lead times. The water level markers used are shown in Fig. 4.13. This study aims to detect events that lead to an increase in river or dam water in order to improve flood forecasting.

Table 4.7 provides the details of the current practices. It should be noted that these levels are specific to each location.

Fig. 4.13 Water Level Classification at Flood Warning Centre [131].

| Station / river (cross-section) | District / basin | Normal level | Alert level | Warning level | Danger level |
|---|---|---|---|---|---|
| Kaki Bukit, River Pelarit | River Perlis | 35.60 | 38.60 | 38.72 | 39.00 |
| River Jarum | Beseri, River Perlis | 30.00 | 33.30 | 33.39 | 33.60 |

Stage levels of rivers (metres) in Perlis, Malaysia, from the DID website for flood monitoring system [132].
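The level-based warning scheme amounts to comparing a measured stage against the three thresholds in the table. The sketch below illustrates this with the tabulated values; the dictionary layout, the function name `classify_level` and the use of "greater than or equal to" at each threshold are assumptions made for illustration, not DID's published rule.

```python
# Thresholds in metres, taken from the table above.
THRESHOLDS = {
    "River Pelarit": {"alert": 38.60, "warning": 38.72, "danger": 39.00},
    "River Jarum": {"alert": 33.30, "warning": 33.39, "danger": 33.60},
}


def classify_level(river, level_m):
    """Map a river stage (m) to normal/alert/warning/danger (assumed >= rule)."""
    limits = THRESHOLDS[river]
    if level_m >= limits["danger"]:
        return "danger"
    if level_m >= limits["warning"]:
        return "warning"
    if level_m >= limits["alert"]:
        return "alert"
    return "normal"


print(classify_level("River Pelarit", 38.65))  # between alert and warning
```

Note how narrow the bands are: for River Pelarit only 0.40 m separates the alert level from the danger level, which is consistent with the short lead times described above.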

Summary

This chapter has broadly discussed the methodology. It has discussed the rain dynamics within the catchment area, and a broad review of how rainfall in the catchment area is determined has been provided. Most importantly, ANN, which has been selected for use in this research, has been described. A brief introduction to the current flood event monitoring used in Malaysia has been provided. The following chapter describes in detail the distribution of rain in the studied area, the advantages of river flow monitoring compared to river level monitoring, and the implementation of ANN modelling to predict river flow rate.

Major Outcome

To develop the river model in this study, several stages were involved. Firstly, the land coverage had to be studied. In this case, it was found that most of the studied area consists of farmland and forest. In the next phase, the rainfall characteristics were studied to determine the characteristics of rain in the tropical area. Next, the relationship between rainfall and river characteristics (flow rate and level) for River Jarum and River Pelarit will be determined, based on data from three rainfall stations and two river flow stations. Finally, to perform flow modelling, ANN modelling will be introduced, together with a description of how it can be used to predict the river flow rate.

Chapter 5 Result

The results of the study are presented in this chapter. The aim of this study was to use rainfall data in the catchment area to predict the river flow rate, which can be used to estimate the amount of water as well as to predict and prevent flooding. To achieve this, rainfall in the catchment area has been studied. The initial studies were intended to gain an understanding of the rain pattern in the catchment area. The existing method of measuring the water level in the rivers to predict flooding is considered inadequate, as it does not provide sufficient advance warning.

To analyse the trends of precipitation based on the 5-minute data for all the stations, the Mann-Kendall (MK) test has been used [133][134][135]. The result shows that there is no fixed seasonal pattern of rainfall in Perlis based on data from 2010 to 2013. This is evident in Fig. 5.1. The North East Monsoon (NEM) period should bring the wettest months in Perlis and the South West Monsoon (SWM) period the driest. However, Fig. 5.1 shows that the rainy season in 2010 shifted several months earlier, and in 2011 a double-peaked rainy season occurred in March and September, which is abnormal compared to the other years in the four-year period.
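For reference, the Mann-Kendall statistic can be computed as sketched below. This is a generic textbook form of the test (S statistic, tie-corrected variance, normal approximation with continuity correction), not the thesis implementation; the function name `mann_kendall` is an assumption. The pairwise loop is quadratic in the series length, so in practice it would be applied to aggregated values such as monthly totals rather than to raw 5-minute samples.

```python
import math


def mann_kendall(series):
    """Return (S, Z, two-sided p-value) of the Mann-Kendall trend test."""
    n = len(series)
    # S counts concordant minus discordant pairs.
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)
    # Variance of S with the standard correction for tied values.
    counts = {}
    for value in series:
        counts[value] = counts.get(value, 0) + 1
    tie_term = sum(t * (t - 1) * (2 * t + 5) for t in counts.values() if t > 1)
    var_s = (n * (n - 1) * (2 * n + 5) - tie_term) / 18.0
    # Continuity-corrected normal approximation.
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return s, z, p_value


# Illustrative data only: a strictly increasing series has a clear trend.
s_stat, z_stat, p_val = mann_kendall([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
print(s_stat, round(z_stat, 3), p_val)
```

A small p-value indicates a monotonic trend; a p-value near 1 is consistent with the absence of a fixed pattern reported above.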


Fig. 5.1 Average Monthly rainfall for 14 weather stations in Perlis from 2010 to 2013.

Rain analysis has been carried out on the 4-year data obtained from Perlis. The results, based on measurements in the catchment area using data from 4 stations, are summarised in Table 5.1 and Table 5.2.

| Year | Number of data samples (5 min) | Number of data samples with rainfall (14 stations) | Average rain period (hours) |
|---|---|---|---|
| 2010 | 1,471,680 | 52079 | 744 |
| 2011 | 1,471,680 | 39049 | 558 |
| 2012 | 1,475,712 | 40274 | 575 |
| 2013 | 1,471,680 | 36559 | 522 |

Table 5.1 The total and duration of rainfall from 2010 to 2013 for 14 stations in Perlis.

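From Table 5.1, 2010 had the highest average duration of rain per station, at 744 hours, and 2013 had the lowest. An assumption is made in this analysis that when rain is recorded, it lasts for the full 5-minute interval (quantisation).

Stations (14): Kaki Bukit, Kg. Behor Lateh, Ulu Pauh, Ngolang, Ldg. Per. Sel, Bkt. Temiang, Guar Nangka, Padang Katong, Padang Besar, Wang Kelian, Abi Kg. Bahru, Lubok Sireh, Tasoh, Arau. The rotated column labels could not be matched to individual columns, so the columns are numbered 1 to 14.

| Rain rate (mm/h) | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 10 | 11922 | 9194 | 9056 | 10883 | 8875 | 15625 | 8718 | 13206 | 8591 | 9537 | 8987 | 8903 | 8780 | 9149 |
| 20 | 1042 | 820 | 900 | 942 | 887 | 675 | 808 | 760 | 803 | 939 | 837 | 811 | 815 | 860 |
| 30 | 508 | 383 | 445 | 458 | 444 | 330 | 383 | 364 | 420 | 430 | 424 | 413 | 396 | 434 |
| 40 | 245 | 215 | 210 | 193 | 215 | 156 | 204 | 165 | 178 | 220 | 213 | 184 | 196 | 216 |
| 50 | 153 | 143 | 117 | 152 | 145 | 107 | 128 | 129 | 106 | 145 | 116 | 136 | 131 | 143 |
| 60 | 102 | 91 | 90 | 84 | 84 | 65 | 96 | 73 | 91 | 98 | 93 | 92 | 85 | 105 |
| 70 | 50 | 57 | 46 | 40 | 54 | 49 | 55 | 51 | 44 | 51 | 62 | 73 | 68 | 76 |
| 80 | 39 | 47 | 38 | 29 | 35 | 27 | 35 | 31 | 45 | 40 | 30 | 32 | 27 | 51 |
| 90 | 42 | 35 | 12 | 18 | 23 | 30 | 25 | 30 | 21 | 25 | 29 | 32 | 28 | 42 |
| 100 | 14 | 13 | 18 | 17 | 14 | 8 | 18 | 6 | 16 | 18 | 19 | 12 | 14 | 24 |
| 110 | 11 | 11 | 10 | 4 | 10 | 12 | 13 | 7 | 10 | 9 | 7 | 13 | 12 | 16 |
| 120 | 5 | 18 | 6 | 6 | 11 | 3 | 8 | 8 | 10 | 6 | 11 | 7 | 5 | 8 |
| 130 | 3 | 5 | 1 | 7 | 7 | 4 | 3 | 3 | 2 | 5 | 3 | 5 | 5 | 4 |
| 140 | 5 | 1 | 2 | 3 | 2 | 1 | 2 | 3 | 3 | 0 | 4 | 3 | 2 | 3 |
| 150 | 8 | 7 | 1 | 5 | 4 | 6 | 3 | 4 | 2 | 0 | 0 | 3 | 6 | 4 |
| TOTAL | 14149 | 11040 | 10952 | 12843 | 10810 | 17098 | 10499 | 14840 | 10342 | 11523 | 10835 | 10719 | 10570 | 11135 |

Table 5.2 The frequency of rainfall rates for 10 to 150 mm/h in Perlis.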

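Stations (14): Kaki Bukit, Kg. Behor Lateh, Ulu Pauh, Ngolang, Ldg. Per. Sel, Bkt. Temiang, Guar Nangka, Padang Katong, Padang Besar, Wang Kelian, Abi Kg. Bahru, Lubok Sireh, Tasoh, Arau. The rotated column labels could not be matched to individual columns, so the columns are numbered 1 to 14.

| Statistic | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Number of rain occurrences of 5 min interval | 14149 | 11040 | 10952 | 12843 | 10810 | 17098 | 10499 | 14840 | 10342 | 11523 | 10835 | 10719 | 10570 | 11135 |
| Mean (mm/h) | 6.7 | 7.4 | 7.0 | 6.4 | 7.4 | 4.4 | 7.2 | 5.2 | 7.2 | 7.1 | 7.2 | 7.3 | 7.2 | 7.8 |
| Median (mm/h) | 2.4 | 2.4 | 2.4 | 2.4 | 2.4 | 1.2 | 2.4 | 1.2 | 2.4 | 2.4 | 2.4 | 2.4 | 2.4 | 2.4 |
| Standard dev (mm/h) | 14.3 | 14.3 | 12.6 | 12.3 | 13.6 | 10.2 | 13.6 | 11.1 | 13.3 | 12.8 | 13.3 | 13.8 | 13.6 | 14.6 |
| Min (mm/h) | 1.2 | 1.2 | 1.2 | 1.2 | 1.2 | 1.2 | 1.2 | 1.2 | 1.2 | 1.2 | 1.2 | 1.2 | 1.2 | 1.2 |
| Max (mm/h) | 189.6 | 216.0 | 148.8 | 166.8 | 230.4 | 192.0 | 177.6 | 170.4 | 164.4 | 129.6 | 136.8 | 246.0 | 165.6 | 166.8 |

Table 5.3 Descriptive statistics of 5-minute rainfall for 14 stations in Perlis for 2010 to 2013.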

Descriptive statistics of rainfall rate for the 14 stations are summarised in Table 5.3. The highest rain duration per year was measured at the Bukit Temiang station, with a rain duration of 356 hours per year. However, the maximum rainfall rate was recorded at the Padang Katong station in 2013, with a rain rate of more than 150 mm/h. One of the most important indications of a potential flood event is high daily rainfall. It has been found that, during the 2010 flood event, the daily rainfall was 246 mm within 24 hours [136]. Fig. 5.2 compares the distribution of rain rate from 2010 to 2013.

Fig. 5.2 Comparison of Rainfall Rate from 2010 to 2013 in Perlis

Rain intensity has been categorised into four groups [136][137][138] as follows:

• light: R ≤ 10 mm/h
• moderate: 10 < R ≤ 25 mm/h
• heavy: 25 < R ≤ 50 mm/h
• extreme: R > 50 mm/h

Fig. 5.3 shows the number of rain occurrences for each category. 85% of the rainfall is categorised as light, 9% as moderate, 4% as heavy and 2% as extreme. Heavy and extreme rainfall together account for 6% of occurrences; at these intensities, rain severely affects services such as high-frequency communication links (> 10 GHz) and can also lead to flash floods.
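The four intensity categories translate into a simple threshold rule. The sketch below applies it to rain rates in mm/h and also shows the conversion from a 5-minute rain depth to an hourly rate; the function names are illustrative assumptions, and the 0.1 mm depth is used only because it reproduces the 1.2 mm/h minimum rate listed in Table 5.3.

```python
def depth_to_rate(depth_mm, interval_min=5):
    """Convert a rain depth over one interval (mm) to a rain rate (mm/h)."""
    return depth_mm * 60 / interval_min


def rain_category(rate_mm_per_h):
    """Classify a rain rate R (mm/h) using the four groups listed above."""
    if rate_mm_per_h <= 10:
        return "light"
    if rate_mm_per_h <= 25:
        return "moderate"
    if rate_mm_per_h <= 50:
        return "heavy"
    return "extreme"


# A 0.1 mm reading in one 5-minute interval corresponds to 1.2 mm/h.
print(depth_to_rate(0.1), rain_category(depth_to_rate(0.1)))
print(rain_category(246.0))
```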


Fig. 5.3 Rain intensity in 2010 to 2013 at 14 weather stations in Perlis.

Daily rainfall varies from 0.1 mm per day to 228.9 mm per day, as shown in Table 5.4. The highest daily rainfall of 228.9 mm was recorded at the Padang Besar rain station (which is in the catchment area) on 01/11/2010. On 02/11/2010, the large flood event occurred downstream of the catchment area, indicating that the daily cumulative rainfall can be used to predict a potential flood [124].

Stations (14): Kaki Bukit, Kg. Behor Lateh, Ulu Pauh, Ngolang, Ldg. Per. Sel, Bkt. Temiang, Guar Nangka, Padang Katong, Padang Besar, Wang Kelian, Abi Kg. Bahru, Lubok Sireh, Tasoh, Arau. The rotated column labels of Tables 5.4 and 5.5 could not be matched to individual columns, so the columns are numbered 1 to 14.

| Statistic | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Mean (mm/day) | 14.0 | 13.1 | 11.7 | 12.4 | 12.4 | 12.7 | 13.1 | 12.6 | 11.9 | 12.2 | 11.7 | 12.4 | 11.8 | 13.6 |
| Median (mm/day) | 7.0 | 5.9 | 6.3 | 6.6 | 6.4 | 7.4 | 7.0 | 7.4 | 6.1 | 6.5 | 6.4 | 6.0 | 6.5 | 7.0 |
| Standard dev (mm/day) | 19.5 | 20.1 | 17.0 | 18.4 | 18.4 | 17.5 | 17.6 | 16.5 | 16.4 | 16.7 | 15.5 | 17.6 | 16.1 | 18.2 |
| Min (mm/day) | 0.1 | 0.3 | 0.4 | 0.1 | 0.2 | 0.1 | 0.1 | 0.1 | 0.2 | 0.1 | 0.1 | 0.2 | 0.3 | 0.1 |
| Max (mm/day) | 162.7 | 228.9 | 168.4 | 174.1 | 198.7 | 130.3 | 191.5 | 137.7 | 155.4 | 180.4 | 149.1 | 138.3 | 152.4 | 139.5 |

Table 5.4 The descriptive statistics for daily rainfall of 14 stations in Perlis for 2010 to 2013.

| Year | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2010 | 160 | 216 | 120 | 160 | 146 | 112 | 118 | 125 | 132 | 130 | 137 | 149 | 166 | 126 |
| 2011 | 139 | 128 | 149 | 148 | 143 | 109 | 126 | 122 | 134 | 119 | 136 | 128 | 156 | 136 |
| 2012 | 158 | 158 | 133 | 167 | 143 | 192 | 157 | 170 | 119 | 128 | 120 | 134 | 127 | 167 |
| 2013 | 162 | 246 | 125 | 162 | 164 | 121 | 145 | 138 | 178 | 190 | 144 | 230 | 109 | 145 |

Table 5.5 Maximum rainfall rate (mm/h) for 2010 to 2013 for 14 stations in Perlis.

| Session | Time | Frequency | Percentage (%) |
|---|---|---|---|
| Early Morning (EM) | 1 a.m. to 6 a.m. | 943 | 16.24 |
| Morning (M) | 7 a.m. to 12 p.m. | 1173 | 20.20 |
| Afternoon (A) | 1 p.m. to 6 p.m. | 2168 | 37.33 |
| Night (N) | 7 p.m. to 12 a.m. | 1524 | 26.24 |

Table 5.6 Number of rain occasions in 2013 at 14 rain stations in Perlis.

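Stations (14): Kaki Bukit, Kg. Behor Lateh, Ulu Pauh, Ngolang, Ldg. Per. Sel, Bkt. Temiang, Guar Nangka, Padang Katong, Padang Besar, Wang Kelian, Abi Kg. Bahru, Lubok Sireh, Tasoh, Arau. The rotated column labels could not be matched to individual columns, so the station columns are numbered 1 to 14.

| Session | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | Average (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| EM | 16 | 18 | 16 | 16 | 15 | 17 | 15 | 16 | 16 | 21 | 16 | 14 | 15 | 18 | 16 |
| M | 18 | 22 | 21 | 19 | 24 | 19 | 22 | 18 | 23 | 24 | 20 | 21 | 20 | 23 | 20 |
| A | 39 | 36 | 38 | 39 | 37 | 39 | 33 | 35 | 31 | 30 | 32 | 38 | 38 | 30 | 37 |
| N | 27 | 25 | 24 | 26 | 25 | 26 | 30 | 30 | 31 | 26 | 31 | 27 | 28 | 29 | 26 |

Table 5.7 Percentage of rain events in 4 different sessions: Early Morning (EM), Morning (M), Afternoon (A), Night (N) for 14 rain stations in Perlis.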

To analyse rain occurrences over the daily cycle, the day was divided into four sessions as shown in Table 5.7. This shows that most rainfall occurs in the afternoon, with 37% of the rain events, which are dominated by convective rain [139]. Understanding this daily variation is important, as it shows that localised flash floods are most likely in the late afternoon and early evening, whilst widespread flooding is most likely to start at night, due to the time it takes for runoff to result in a widespread flood.
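The session percentages in Table 5.6 can be checked directly from the frequency counts. The sketch below recomputes them; the variable names are illustrative assumptions.

```python
# Rain occasions per session in 2013, from Table 5.6.
COUNTS = {"EM": 943, "M": 1173, "A": 2168, "N": 1524}

total_occasions = sum(COUNTS.values())
shares = {
    session: round(100.0 * count / total_occasions, 2)
    for session, count in COUNTS.items()
}

for session, share in shares.items():
    print(session, share)
```

The afternoon session accounts for the largest share, in line with the 37% quoted above.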

Four cases of flood events are studied to establish the effectiveness of the standard river level based flood monitoring approach compared with the proposed river flow rate monitoring approach for River Pelarit and River Jarum. These four flood events are used to justify the use of flow rate as the river parameter to be predicted as an indicator of a potential flood.

Case 1: River Pelarit: 30th October to 1st November 2010

Fig. 5.4 Rainfall for Kaki Bukit, level and flow rate of River Pelarit during the 2010 flooding event (flow rate annotated from 0.3 m3/s to a peak of 47.23 m3/s).

Fig. 5.4 shows the observed discharge hydrographs between 31 October and 04 November 2010. The river flow rate, the river level and the Kaki Bukit rain gauge station reading are shown. The flood started on 2nd November downstream of the Timah Tasoh reservoir. There was a period of persistent rainfall in the Timah Tasoh catchment area between 31 October and 02 November. It was found that the cumulative rainfall per day was the highest in 2010 just before the flood event started. It can be concluded that the main cause of the flooding was diffuse precipitation covering large spatial and temporal scales within the catchment. The amount of rain meant that the Timah Tasoh reservoir exceeded its capacity within a short space of time, leading to flooding both upstream and downstream.

| Station Name | Alert Level | Warning Level | Danger Level |
|---|---|---|---|
| River Pelarit at Kaki Bukit | 38.60 | 38.72 | 39.00 |

Table 5.8 Stage levels of River Pelarit (metres)

Table 5.8 shows the water level and the warning levels of River Pelarit assigned by the authorities (DID). Table 5.9 gives a summary of the changes in the level and the flow rate of River Pelarit during the period shown in Fig. 5.4. Fig. 5.4 illustrates the rising limb of the River Pelarit flow rate, which started at 19:00 on 31 October 2010. The flow rate rose from 0.30 m3/s to 47.23 m3/s, an increase of more than 156 times the normal flow rate. By comparison, the river level only increased from 35 m to 39.90 m, which is a 14% change, as given in Table 5.9.

Note that the river level is referenced to the mean height above sea level rather than the depth of the river. It should also be noted that river level is a function of the terrain, and a change in height alone is a poor indicator.

| Parameter | Normal | Peak | Percentage change |
|---|---|---|---|
| River Level | 35 m | 39.90 m | 14 |
| Flow rate | 0.30 m3/s | 47.23 m3/s | 15643.33 |

Table 5.9 River level and flow rate changes during 2010 flood event.

The result shows that the river flow rate is more sensitive to rainfall rate than river level. The river level also depends on the location of a river course.
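The percentage changes quoted for the 2010 event follow from the normal and peak values. The sketch below reproduces the calculation; the function name `percent_change` is an illustrative assumption.

```python
def percent_change(normal, peak):
    """Percentage change from the normal value to the peak value."""
    return 100.0 * (peak - normal) / normal


# River Pelarit, 2010 flood event.
level_change = percent_change(35.0, 39.90)   # river level in metres
flow_change = percent_change(0.30, 47.23)    # flow rate in m3/s

print(round(level_change, 2), round(flow_change, 2))
```

The same rainfall event therefore produces a change of about 14% in level but more than 15,000% in flow rate, which is the sense in which flow rate is the more sensitive parameter.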

In this case, the level of River Pelarit only reached and exceeded the danger level (shown in Table 5.8) for 12 hours on 1st November. From Fig. 5.4, it can be seen that the conventional method of monitoring floods by measuring the river level is not sensitive to changes in the river, leading to a short warning lead time.

Case 2: River Jarum, Flood event on 30th October to 1st November 2010

Fig. 5.5 illustrates the observed discharge hydrographs, (the river flow rate and the water level) of the River Jarum, and the rain gauge reading at Padang Besar station between 31 Oct and 04 November 2010. Table 5.10 gives a summary of the changes in the level and the flow rate of the river during the period shown in Fig. 5.5.

Fig. 5.5 Jarum River Level, Flow rate and rainfall at Padang Besar weather station.

| Parameter | Normal | Peak | % change |
|---|---|---|---|
| River Level | 30 m | 34 m | 13 |
| Flow rate | 0.30 m3/s | 59.90 m3/s | 19866.67 |

Table 5.10 Summary of parameter changes during 2010 flood event

| Station Name | Alert Level | Warning Level | Danger Level |
|---|---|---|---|
| River Jarum | 33.30 | 33.39 | 33.60 |

Table 5.11 Water level as classified by DID for River Jarum.

From Fig. 5.5, the river flow rate started to rise at around 19:00 on 31st October. The total measured rainfall was 401.4 mm from 31st October to 02 November. The flow rate rose from 0.29 m3/s and peaked at 59.90 m3/s, while the river level increased from 30 m to 34 m, as given in Table 5.10.

Case 3: River Pelarit, Flood event on 1 April 2011

Fig. 5.6 The plot of River Pelarit Level, Flow rate and rainfall at Kaki Bukit weather station.

Fig. 5.6 illustrates the observed discharge hydrographs (flow rate and water level) for River Pelarit together with the Kaki Bukit rain gauge station reading between 27 March and 02 April 2011.

| Parameter | Normal | Peak | % change |
|---|---|---|---|
| River Level | 35 m | 38.76 m | 10.74 |
| Flow rate | 0.30 m3/s | 47.01 m3/s | 15570 |

Table 5.12 Summary of parameter changes during 2011 flood event

Table 5.12 shows that the river flow rate rose from 28 March 2011. The flow rate changed from 0.30 m3/s and peaked at 47.01 m3/s. The river level increased from 35 m to 38.76 m.

Case 4: River Jarum, Flood event on 1 April 2011

Fig. 5.7 illustrates the measured river level and flow rate, and the rainfall at the Padang Besar station, from 27 March to 02 April 2011. The flow rate changed from 0.30 m3/s to a first peak of 48.35 m3/s, and the river level increased from 30 m to 33.61 m.

Fig. 5.7 Rainfall during the 2011 flooding event for Padang Besar and River Jarum at a recording interval of 5 minutes (flow rate annotated from 0.3 m3/s to a peak of 48.35 m3/s).

| Parameter | Normal | Peak | % change |
|---|---|---|---|
| River Level | 30 m | 33.61 m | 12.03 |
| Flow rate | 0.30 m3/s | 48.35 m3/s | 16016 |

Table 5.13 Summary of parameter changes during 2011 flood event

Summary for comparison between level monitoring and flow rate measurement.

Flood forecasters rely heavily on real-time data of rainfall and river water levels as well as rainfall forecasts. They use hydrology models to estimate runoff in the catchment area and hence potential for flooding.

The case studies of the two major flood events (October 2010 and April 2011), for both River Pelarit and River Jarum, show that changes in the flow rates were more pronounced than changes in the river levels. Flow rate can also be used to calculate the volume of water, which is important for water resource management. Therefore, this study focuses on predicting river flow rate to address both flooding and water resource management.
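As an illustration of how a flow rate series yields a water volume, the sketch below integrates flow rate over time with the trapezoidal rule. It assumes flow rates in m3/s sampled every 5 minutes (300 s); the function name `inflow_volume_m3` and the example values are hypothetical.

```python
def inflow_volume_m3(flow_m3_per_s, step_seconds=300):
    """Integrate a regularly sampled flow rate series into a volume (m3)."""
    volume = 0.0
    for earlier, later in zip(flow_m3_per_s, flow_m3_per_s[1:]):
        # Trapezoid: mean flow over the interval times its duration.
        volume += 0.5 * (earlier + later) * step_seconds
    return volume


# Illustrative only: a steady 2 m3/s for one hour (13 samples, 12 intervals).
one_hour = [2.0] * 13
print(inflow_volume_m3(one_hour))
```

Summing such volumes for both rivers would give an estimate of the inflow to the reservoir over the same period, under the stated sampling assumption.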

[Figure: river flow rate against time, annotated with the intervals Tdry, Tdelay, Train and TF Change.]

Tdry (Dry Period): duration without rain before the flow rises above the average.
Tdelay (Flow Delay): duration between the start of rain and the change in the flow rate.
Train (Rain Period): duration between the start and end of rain.
TF Change: duration from when the flow rate rises above the base flow until it falls back below the base flow.

Fig. 5.8 Relationship of rainfall and river flow rate.

The purpose of this analysis is to determine the period over which river flow rate prediction is effective. Fig. 5.8 illustrates the relationship between rainfall and river flow rate.

The objective is to ascertain the length of time it takes for the river flow rate to change after rainfall has been recorded at specific rain gauge stations. This also depends on the soil moisture, or the dry period before the rain.
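One possible way to extract the dry period and the flow delay from paired 5-minute rainfall and flow series is sketched below. It is not the procedure used in the thesis: the event definition (a wet sample following at least one dry sample), the `rise` threshold of 0.05 m3/s and the function name `event_timing` are all assumptions made for illustration.

```python
STEP_HOURS = 5 / 60  # duration of one sample in hours


def event_timing(rain, flow, rise=0.05):
    """Return (t_dry_hours, t_delay_hours) for each rain event.

    An event starts at a wet sample that follows at least one dry sample.
    t_delay is the time until the flow exceeds its value at the start of
    the event by more than `rise` (m3/s), or None if that never happens.
    """
    events = []
    dry_steps = 0
    for i, depth in enumerate(rain):
        if depth <= 0:
            dry_steps += 1
            continue
        if dry_steps > 0:  # first wet sample after a dry spell
            delay_steps = None
            for j in range(i, len(flow)):
                if flow[j] - flow[i] > rise:
                    delay_steps = j - i
                    break
            t_delay = None if delay_steps is None else delay_steps * STEP_HOURS
            events.append((dry_steps * STEP_HOURS, t_delay))
        dry_steps = 0
    return events


# Illustrative series only (not measured data).
rain = [0, 0, 0, 0.5, 1.0, 0, 0, 0, 0, 0, 0, 0]
flow = [0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.4, 0.9, 1.5, 1.0, 0.6]
events = event_timing(rain, flow)
print(events)
```

In this toy series the rain starts after 15 minutes without rain and the flow responds 20 minutes later.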

Dry (no rain) period and time delay for flow rate to change

Table 5.14 shows the dry (no rain) period and the time delay between the start of rain and the start of a change in river flow rate. The maximum dry period was in 2011, when there was no rain for 912 hours or 38 days. The average time delay between the start of rain and the change in the river flow rate, taken over the four yearly averages in Table 5.14, is 18.65 hours for River Jarum and 13.65 hours for River Pelarit.

| Year | Dry period max (hours) | Dry period average (hours) | Flow delay max (hours) | Flow delay average (hours) |
|---|---|---|---|---|
| 2010 | 356.27 | 30.62 | 118.84 | 18.52 |
| 2011 | 912.42 | 30.80 | 108.45 | 17.69 |
| 2012 | 201.04 | 29.91 | 130.52 | 19.34 |
| 2013 | 427.14 | 28.22 | 125.76 | 19.04 |

i. River Jarum and Padang Besar Rain Station.

| Year | Dry period max (hours) | Dry period average (hours) | Flow delay max (hours) | Flow delay average (hours) |
|---|---|---|---|---|
| 2010 | 327.79 | 33.18 | 118.09 | 12.01 |
| 2011 | 263.83 | 25.82 | 114.84 | 13.04 |
| 2012 | 501.85 | 34.08 | 118.42 | 15.31 |
| 2013 | 599.12 | 37.18 | 115.84 | 14.23 |

ii. River Pelarit and Kaki Bukit Rain Station

Table 5.14 Dry period and river flow change delay for River Pelarit and River Jarum.

The time delay depends on the location of the rain gauge stations. It is worth noting that whilst the longest dry period for the Padang Besar rain gauge station was in 2011 (912 hours), for the Kaki Bukit rain gauge station it was in 2013 (599 hours). These differences illustrate the small rain cell size in the area. From the table, there is no strong statistical relationship between the length of the dry period before it rains and the time delay before the river flow rate is affected by runoff. This is thought to be due to variations in soil moisture; hence runoff cannot be successfully predicted from an annual average. Considering cumulative rainfall could give better results, but this has not been explored any further in this thesis.

Rain-No change condition

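| Year | Dry time min (hours) | Dry time max (hours) | Dry time average (hours) | Flow delay min (hours) | Flow delay max (hours) | Flow delay average (hours) |
|---|---|---|---|---|---|---|
| 2010 | 0.75 | 522.92 | 67.69 | 0.50 | 37.20 | 4.67 |
| 2011 | 0.92 | 362.10 | 54.48 | 0.50 | 37.20 | 4.67 |
| 2012 | 0.08 | 783.91 | 56.28 | 0.50 | 37.20 | 4.67 |
| 2013 | 0.42 | 355.69 | 52.42 | 0.50 | 37.20 | 4.67 |

a) River Jarum (Padang Besar Rain Station): period without rain and flow delay.

| Year | Dry time min (hours) | Dry time max (hours) | Dry time average (hours) | Flow delay min (hours) | Flow delay max (hours) | Flow delay average (hours) |
|---|---|---|---|---|---|---|
| 2010 | 0.83 | 592.37 | 84.89 | 0.40 | 24.90 | 4.79 |
| 2011 | 1.00 | 653.17 | 71.68 | 0.50 | 20.30 | 3.44 |
| 2012 | 0.75 | 372.51 | 55.87 | 0.50 | 41.30 | 6.31 |
| 2013 | 1.00 | 260.92 | 68.36 | 0.40 | 75.50 | 7.68 |

b) River Pelarit (Kaki Bukit Rain Station): period without rain and flow delay.

Table 5.15 Rain with no changes in River Pelarit and River Jarum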

Table 5.15 shows the dry period before rain and the time between the start of the rain event and the change in the flow rates of River Pelarit and River Jarum. For River Pelarit, the maximum dry period was 783.91 hours or 32 days, which occurred in 2012. For River Jarum, the maximum dry period was 653.17 hours or 27 days, which occurred in 2011. For the flow delay, the maximum time delay from when the rain started until the River Pelarit flow rate started to increase was 37.20 hours. Normally, a delay of this length occurs during the dry season, when the ground absorbs the rainwater until it is saturated and the underground water starts to flow into the river. However, during the wet season, when the soil cannot absorb rainfall, the delay is reduced, and it takes only 0.5 hours for River Jarum and 0.45 hours for River Pelarit for the flow to be affected after rainfall is recorded in the catchment area.

ANN has been used to study the degree of significance of the chosen factors on the river flow rate prediction.

[Figure: schematic showing the Kaki Bukit rain station on River Pelarit (9.87 km) and the Padang Besar rain station on River Jarum (7.40 km), both rivers draining into the Timah Tasoh reservoir and dam.]

Fig. 5.9 Simplified diagram of Timah Tasoh Catchment Area.

In this study, eight river flow rate models have been developed. The first four models (Model I, Model II, Model III and Model IV) have been designed for River Pelarit and the rest (Model V, Model VI, Model VII and Model VIII) for River Jarum. Two different datasets have been chosen (Nov 2010 and Oct 2013), which are periods prior to the flood events. Table 5.16 shows the details of the stations whose data were used as input and output for each river in this study.

| Station | Description |
|---|---|
| PB (input 1) | Padang Besar rainfall |
| SJ (input 2) | Previous flow rate of River Jarum at time t |
| SJ (target) | Predicted flow rate of River Jarum |

a) Model for River Jarum prediction (Model V to VIII)

| Station | Description |
|---|---|
| LS (input 1) | Lubok Sireh rainfall |
| KB (input 2) | Kaki Bukit rainfall |
| SP (input 3) | Previous flow rate of River Pelarit at time t |
| SP (output) | Predicted flow rate of River Pelarit |

b) Model for River Pelarit prediction (Model I to IV)

Table 5.16 Details of the input and output for the River Pelarit and River Jarum flow prediction models.
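To make the input and output arrangement concrete, the sketch below builds input-target pairs for the River Pelarit case from 5-minute series, pairing the inputs at time t with the flow rate a chosen number of hours ahead. It is a simplified illustration under stated assumptions: a single time step of inputs is used, whereas a NARX-type network would normally use several delayed values, and the function name `make_pairs` is hypothetical.

```python
STEPS_PER_HOUR = 12  # 5-minute samples


def make_pairs(ls_rain, kb_rain, flow, lead_hours):
    """Pair (LS rain, KB rain, flow) at time t with flow at t + lead_hours."""
    shift = lead_hours * STEPS_PER_HOUR
    inputs, targets = [], []
    for t in range(len(flow) - shift):
        inputs.append((ls_rain[t], kb_rain[t], flow[t]))
        targets.append(flow[t + shift])
    return inputs, targets


# Illustrative series only: 300 samples (25 hours) of synthetic values.
ls = [0.0] * 300
kb = [0.1] * 300
sp = [0.3 + 0.01 * i for i in range(300)]
x, y = make_pairs(ls, kb, sp, lead_hours=12)
print(len(x), x[0], y[0])
```

Increasing `lead_hours` in steps of 12 reduces the number of usable pairs by 144 samples each time, since the last `shift` samples have no target.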

For the October 2010 prediction, 87552 data samples were used, and for the November 2010 prediction, 88552 data samples were used. The 70-30 (training and test datasets) configuration was adopted. For training, the algorithm randomly selects 70% of the samples from the whole dataset for each attempt, so the value of the root mean square error (RMSE) depends on the selected data. The remaining 30% of the data is used for validation and testing.
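A random 70-30 division of the samples can be sketched as follows. This is a generic illustration rather than the toolbox routine used in the thesis; the function name `split_70_30` and the fixed `seed` are assumptions, the seed being included only so that the example is repeatable.

```python
import random


def split_70_30(samples, seed=0):
    """Randomly assign 70% of samples to training and the rest to testing."""
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)
    cut = len(samples) * 7 // 10
    train = [samples[i] for i in indices[:cut]]
    rest = [samples[i] for i in indices[cut:]]
    return train, rest


# 87552 samples, as quoted for the October dataset.
train_set, rest_set = split_70_30(list(range(87552)))
print(len(train_set), len(rest_set))
```

Because a different random selection is drawn on each attempt, repeated training runs see different samples, which is why the reported RMSE varies with the selected data.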

In this study, two training algorithms were used to determine the prediction period: the Levenberg-Marquardt (LM) algorithm for Models I, III, V and VII, and the Bayesian Regularisation (BR) backpropagation algorithm for Models II, IV, VI and VIII. The selection of the two algorithms has been discussed in Section 3.7. The prediction period was increased in steps of 12 hours until the model was no longer able to produce a satisfactory predicted flow rate, as judged by a number of performance indicators (MAPE, RMSE and Regression). In this study, the best prediction was selected as the one with the lowest value of MAPE, the lowest value of RMSE and a correlation coefficient (regression) closest to 1.
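The three performance indicators can be computed as sketched below. These are the standard definitions; whether the thesis reports MAPE as a fraction or as a percentage is not stated, so the fraction form used here is an assumption, as are the function names.

```python
import math


def rmse(observed, predicted):
    """Root mean square error."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def mape(observed, predicted):
    """Mean absolute percentage error as a fraction; zero observations skipped."""
    terms = [abs((o - p) / o) for o, p in zip(observed, predicted) if o != 0]
    return sum(terms) / len(terms)


def correlation(observed, predicted):
    """Pearson correlation coefficient between observed and predicted values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    dev_o = math.sqrt(sum((o - mean_o) ** 2 for o in observed))
    dev_p = math.sqrt(sum((p - mean_p) ** 2 for p in predicted))
    return cov / (dev_o * dev_p)


# Illustrative values only (m3/s).
obs = [2.0, 4.0]
pred = [1.0, 5.0]
print(rmse(obs, pred), mape(obs, pred), correlation(obs, pred))
```

Note that a correlation close to 1 does not by itself imply a small error: in the toy example the correlation is 1 while the RMSE is 1 m3/s, which is why the three indicators are considered together.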

In this configuration, the LM training algorithm was used for eleven different prediction periods, as outlined in Table 5.17. Table 5.17 shows that the model only gives a good result for a prediction period of up to 12 hours. When the prediction period is increased, this model fails to produce good prediction results.

| Trial | Prediction period (hours) | MAPE | RMSE | Regression |
|---|---|---|---|---|
| 1 | 12 | 0.001 | 0.006 | 0.995 |
| 2 | 24 | 163.509 | 328.210 | -0.003 |
| 3 | 36 | 130.865 | 306.647 | 0 |
| 4 | 48 | 165.430 | 57.6355 | -0.309 |
| 5 | 60 | 40.173 | 125.363 | 0.060 |
| 6 | 72 | 57.529 | 184.802 | -0.036 |
| 7 | 84 | 65.531 | 177.280 | 0.002 |
| 8 | 96 | 83.783 | 195.204 | -0.040 |
| 9 | 108 | 92.585 | 195.423 | -0.034 |
| 10 | 120 | 108.840 | 205.217 | 0 |
| 11 | 132 | 105.033 | 189.562 | -0.031 |

Table 5.17 River Pelarit (Training: train LM, Target: Oct 2013). Grey shading denotes the best performing model.

a) Plot of the predicted and measured flow rate (m3/s) of River Pelarit for 12 hours lead time using the LM training algorithm (Oct 2013)

b) Error plot of the predicted and actual flow rate (m3/s) from the NARX model

c) Regression plot of actual and forecasted flow (m3/s) of River Pelarit.

Fig. 5.10 Prediction using Model 1: River Pelarit. Deterministic forecast: (a) observed and predicted time series; (b) residual plot; (c) scatter plot.

Fig. 5.10 a) shows a comparison of the 12-hour prediction results and the measured flow rate. The graph shows that the predicted and measured results are very similar, illustrating that within that period the model is accurate. Fig. 5.10 b) shows the error between the predicted river flow rate and the actual flow rate. The values of root mean square error (RMSE), MAPE and correlation coefficient (regression) are 0.006, 0.002 and 0.99, respectively.

In this prediction model, the Bayesian Regularisation (BR) training function was used. The eleven different prediction periods outlined in Table 5.18 were tested.

| Experiment | Prediction period (hours) | MAPE | RMSE | Regression |
|---|---|---|---|---|
| 1 | 12 | 0.003 | 0.009 | 0.990 |
| 2 | 24 | 0.004 | 0.012 | 0.999 |
| 3 | 36 | 0.028 | 0.005 | 0.999 |
| 4 | 48 | 0.011 | 0.252 | 0.999 |
| 5 | 60 | 0.016 | 0.497 | 0.999 |
| 6 | 72 | 0.014 | 0.401 | 0.999 |
| 7 | 84 | 0.449 | 0.019 | 0.999 |
| 8 | 96 | 0.018 | 0.437 | 0.999 |
| 9 | 108 | 0.016 | 0.379 | 0.999 |
| 10 | 120 | 99.697 | 191.484 | -0.033 |
| 11 | 132 | 106.740 | 192.409 | -0.027 |

Table 5.18 River Pelarit (Training: train BR, Target: Oct 2013). Grey shading denotes the best performing model.

a) Plot of the predicted and measured flow rate (m3/s) of River Pelarit for 108 hours lead time using the BR training algorithm (Oct 2013)

b) Error plot of the predicted and actual flow rate (m3/s) from the ANN model

c) Regression plot of actual and forecasted flow of River Pelarit.

Fig. 5.11 Simulation for Model 2: River Pelarit for Oct. 2013 dataset. Deterministic forecast: (a) observed and predicted time series; (b) residual plot; (c) scatter plot.

Fig. 5.11 a) shows the result of the 108-hour prediction for River Pelarit and the actual flow rate in October 2013. Compared with the results in Fig. 5.10 (River Pelarit with the LM training algorithm), the BR algorithm is able to increase the prediction period up to 108 hours, or 4.5 days. Comparing the predicted and the actual flow rate, the error is in the acceptable range, with a maximum value of 1.2 m³/s at the highest peak of flow rate (39 m³/s). The results also demonstrate the excellent performance of the proposed algorithm, with values of RMSE, MAPE and regression of 0.378, 0.017 and 0.99, respectively.
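The way a best prediction period can be read off a results table may be sketched as follows. The rows are transcribed from Table 5.18; the acceptance thresholds `min_r` and `max_rmse` and the function name are hypothetical, since no numerical acceptance criteria are stated in this section.

```python
# (hours, MAPE, RMSE, R) rows transcribed from Table 5.18
# (River Pelarit, BR training, target October 2013).
TABLE_5_18 = [
    (12, 0.003, 0.009, 0.990),
    (24, 0.004, 0.012, 0.999),
    (36, 0.028, 0.005, 0.999),
    (48, 0.011, 0.252, 0.999),
    (60, 0.016, 0.497, 0.999),
    (72, 0.014, 0.401, 0.999),
    (84, 0.449, 0.019, 0.999),
    (96, 0.018, 0.437, 0.999),
    (108, 0.016, 0.379, 0.999),
    (120, 99.697, 191.484, -0.033),
    (132, 106.740, 192.409, -0.027),
]

def longest_acceptable_horizon(rows, min_r=0.99, max_rmse=1.0):
    """Return the longest lead time (hours) that passes both thresholds.

    min_r and max_rmse are assumed cut-offs chosen for illustration only.
    Returns None when no row is acceptable.
    """
    acceptable = [hours for hours, _mape, rmse, r in rows
                  if r >= min_r and rmse <= max_rmse]
    return max(acceptable) if acceptable else None

best_hours = longest_acceptable_horizon(TABLE_5_18)
print(best_hours, "hours =", best_hours / 24, "days")
```

With these assumed thresholds the selection agrees with the 108-hour (4.5-day) period reported for River Pelarit; stricter or looser thresholds would of course change the outcome.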

| Experiment | Prediction period (hours) | MAPE | RMSE | Regression |
|---|---|---|---|---|
| 1 | 12 | 0.018 | 0.0567 | 0.999 |
| 2 | 24 | 54.589 | 202.198 | -0.487 |
| 3 | 36 | 0.0915 | 0.0163 | 0.999 |
| 4 | 48 | 30.941 | 185.314 | -0.174 |
| 5 | 60 | 21.072 | 146.514 | -0.264 |
| 6 | 72 | 12.832 | 111.467 | 0.006 |
| 7 | 84 | 8.635 | 98.479 | 0.013 |
| 8 | 96 | 14.630 | 62.436 | 0.009 |
| 9 | 108 | 57.026 | 95.349 | -0.001 |
| 10 | 120 | 84.479 | 95.201 | 0.001 |
| 11 | 132 | 0.744 | 22.117 | 0.892 |

Table 5.19 Pelarit Prediction Model. (Training: train BR, Target: Nov 2010). Grey shading denotes the best performing model.

a) Plot of Predicted and Measured Flow rate of River Pelarit for 12 Hours lead time using LM training Algorithm, (Nov 2010)

b) Error Plot of the predicted and actual flow rate from NARX model


c) Regression of actual and forecasted flow of River Pelarit.

Fig. 5.12 Prediction of Model 3: River Pelarit for Nov 2010 dataset. Deterministic forecast: (a) observed and predicted time series; (b) residual plot; (c) scatter plot.

Fig. 5.12 a) shows the result of the 12-hour prediction period using the November 2010 dataset. The values of RMSE, MAPE and correlation coefficient are 0.06, 0.027 and 0.999, respectively.

| Experiment | Prediction period (hours) | MAPE | RMSE | Regression |
|---|---|---|---|---|
| 1 | 12 | 0.0191 | 0.0598 | 0.99976 |
| 2 | 24 | 0.0162 | 0.0606 | 0.99991 |
| 3 | 36 | 0.0133 | 0.0741 | 0.99995 |
| 4 | 48 | 0.0089 | 0.1087 | 0.99998 |
| 5 | 60 | 0.0095 | 0.2037 | 0.99999 |
| 6 | 72 | 0.0075 | 0.2219 | 0.99991 |
| 7 | 84 | 0.0279 | 0.7508 | 0.99973 |
| 8 | 96 | 0.0465 | 2.1526 | 0.99853 |
| 9 | 108 | 0.0397 | 1.9065 | 0.99884 |
| 10 | 120 | 0.0581 | 3.2330 | 0.99694 |
| 11 | 132 | 0.7440 | 22.1169 | 0.89261 |

Table 5.20 Pelarit (Training: train LM, Target: Oct 2010). Grey shading denotes the best performing model.

a) Plot of the Predicted and Measured Flow rate for River Pelarit 108 Hours Prediction using BR training algorithm, Prediction: Oct 2010

b) Error Plot of the predicted and actual flow rate from the output of NN model.


c) Regression of actual and simulated flow of River Pelarit.

Fig. 5.13 Prediction from Model 2: River Pelarit for Oct. 2013 dataset

Fig. 5.13 a) shows the result of the 108-hour prediction using the October 2010 dataset. The values of RMSE, MAPE and regression are 0.0397, 1.065 and 0.9988, respectively.

| Experiment | Target period | Training algorithm | Hours | Days | MAPE | RMSE | Regression |
|---|---|---|---|---|---|---|---|
| I | October 2013 | LM | 12 | 0.5 | 0.001 | 0.006 | 0.995 |
| II | October 2013 | BR | 108 | 4.5 | 0.016 | 0.378 | 0.999 |
| III | November 2010 | LM | 12 | 0.5 | 0.0181 | 0.056 | 0.999 |
| IV | November 2010 | BR | 108 | 4.5 | 0.040 | 1.906 | 0.998 |

Table 5.21 Overall of the best prediction model performances based on different training algorithms for River Pelarit.

The results of river flow rate prediction for River Pelarit are summarised in Table 5.21. When the LM training algorithm is used, acceptable prediction of the flow rate up to 12 hours is achieved. Using the BR algorithm, acceptable results up to 108 hours (4.5 days) can be obtained. In terms of processing speed, the model using the LM training function produces results up to 3 to 4 times faster, on average, than the model using the BR training function. This is because the LM algorithm converges faster and its training time is typically shorter than that of the Bayesian Regularization algorithm. Taking into account the run-time of the whole system, the LM-based model is, on average, 2 to 4 times faster than the BR-based model. However, given that the run-time is only a matter of minutes and the accuracy is higher, the model using the BR training algorithm is considered to be superior.

| Experiment | Prediction period (hours) | MAPE | RMSE | Regression |
|---|---|---|---|---|
| 1 | 12 | 0.026 | 0.080 | 0.997 |
| 2 | 24 | 0.250 | 0.997 | 0 |
| 3 | 36 | 111.248 | 200.715 | -0.908 |
| 4 | 48 | 362.444 | 148.123 | 0 |
| 5 | 60 | 133.354 | 368.767 | -0.002 |
| 6 | 72 | 116.792 | 378.476 | 0 |
| 7 | 84 | 0.009 | 0.102 | 0.999 |
| 8 | 96 | 0.617 | 10.945 | 0 |
| 9 | 108 | 173.526 | 379.840 | -0.061 |
| 10 | 120 | 232.229 | 399.939 | 0.047 |
| 11 | 132 | 239.562 | 369.233 | 0 |

Table 5.22 Jarum Prediction (Training: train LM, Target: Oct 2010)

a) Plotting of the Predicted and Measured Flow rate of River Jarum with 12 Hours Ahead using LM training algorithm (Oct 2013)


b) Error Plot of the predicted and actual flow rate from NARX model

[Regression plot, Sungai Jarum, actual and forecast flow: R = 0.99769. Legend: Data, Fit. x-axis: Actual Flow; y-axis: Predicted Flow.]

c) Regression of actual and forecasted flow of River Jarum.

Fig. 5.14 Prediction from Model V: River Jarum for Oct. 2013 dataset

Fig. 5.14 a) shows the result of the 12-hour river flow rate prediction for River Jarum. The values of RMSE, MAPE and correlation coefficient are 0.080, 0.026 and 0.997, respectively. The 12-hour prediction is the best result obtained from the models that use the Levenberg-Marquardt (LM) training algorithm.

| Experiment | Prediction period (hours) | MAPE | RMSE | Regression |
|---|---|---|---|---|
| 1 | 12 | 236.818 | 394.962 | -0.195 |
| 2 | 24 | 0.013 | 0.045 | 0.999 |
| 3 | 36 | 0.019 | 0.132 | 0.999 |
| 4 | 48 | 0.018 | 0.213 | 0.999 |
| 5 | 60 | 0.021 | 0.271 | 0.999 |
| 6 | 72 | 0.021 | 0.277 | 0.999 |
| 7 | 84 | 0.023 | 0.255 | 0.999 |
| 8 | 96 | 0.025 | 0.267 | 0.999 |
| 9 | 108 | 180.513 | 395.459 | -0.064 |
| 10 | 120 | 0.267 | 0.025 | 0.999 |
| 11 | 132 | 0.207 | 0.020 | 0.999 |

Table 5.23 Jarum (Training: train BR, Target: Oct 2010)

a) Plot of Predicted and Measured Flow rate of River Jarum with 132 Hours prediction using BR training algorithm (Oct 2013)

b) Error Plot of the predicted and measured flow rate

[Regression plot, Sungai Jarum, actual and forecast flow: R = 0.99966. Legend: Data, Fit. x-axis: Actual Flow; y-axis: Predicted Flow.]

c) Regression of actual and forecasted flow of River Jarum.

Fig. 5.15 Prediction from Model VI: River Jarum for Oct. 2013 dataset.

Fig. 5.15 a) shows the result of the 132-hour prediction and the measured flow rate using the October 2010 data set. The values of RMSE, MAPE and correlation coefficient are 0.019, 0.206 and 0.999, respectively. When the prediction period is increased in 12-hour steps, the model is able to produce good prediction results up to 132 hours, where the RMSE, MAPE and regression coefficient are in the acceptable range.

| Experiment | Prediction period (hours) | MAPE | RMSE | Regression |
|---|---|---|---|---|
| 1 | 12 | 0.00471 | 0.0084 | 0.999 |
| 2 | 24 | 222.405 | 403.116 | -0.368 |
| 3 | 36 | 166.214 | 405.339 | -0.305 |
| 4 | 48 | 58.1244 | 185.44 | 0.025 |
| 5 | 60 | 41.8196 | 149.812 | -0.174 |
| 6 | 72 | 26.7131 | 124.528 | 0.045 |
| 7 | 84 | 61.0448 | 331.454 | 0.000 |
| 8 | 96 | 32.2485 | 185.795 | -0.071 |
| 9 | 108 | 104.695 | 159.367 | 0.000 |
| 10 | 120 | 146.216 | 132.434 | 0.00085692 |
| 11 | 132 | 217.637 | 164.111 | -0.071199 |

Table 5.24 Jarum (Training: train BR, Target: Oct 2010)

[Plot, Sungai Jarum, comparison of actual and forecast flow, Nov 2010. Legend: Predictions, Actual. x-axis: Hour.]

a) The plot of the Predicted and Measured Flow Rate of River Jarum with 12 Hours Ahead using LM training algorithm (Oct 2010).


b) Error Plot of the predicted and actual flow rate.

c) Regression of actual and forecasted flow of River Jarum.

Fig. 5.16 Prediction from Model VI: River Jarum for Oct. 2010 dataset

Fig. 5.16 a) shows the result of the 12-hour prediction using the October 2010 dataset. Using this configuration, the best prediction is only up to 12 hours, with RMSE, MAPE and regression values of 0.0083, 0.0047 and 0.999, respectively. When the prediction period is increased beyond 12 hours, the model fails to produce good prediction results.

| Experiment | Prediction period (hours) | MAPE | RMSE | Regression |
|---|---|---|---|---|
| 1 | 12 | 0.004 | 0.007 | 0.999 |
| 2 | 24 | 0.006 | 0.021 | 0.999 |
| 3 | 36 | 0.016 | 0.235 | 0.99993 |
| 4 | 48 | 0.017 | 0.331 | 0.99996 |
| 5 | 60 | 0.018 | 0.613 | 0.9999 |
| 6 | 72 | 0.015 | 0.565 | 0.9998 |
| 7 | 84 | 0.032 | 1.859 | 0.99994 |
| 8 | 96 | 0.052 | 2.459 | 0.99886 |
| 9 | 108 | 0.026 | 1.359 | 0.99972 |
| 10 | 120 | 0.027 | 1.567 | 0.99964 |
| 11 | 132 | 0.023 | 1.283 | 0.99974 |

Table 5.25 Model of Prediction for River Jarum (train BR, Target: Oct 2010).

[Plot, Sungai Jarum, comparison of actual and forecast flow, Nov 2010. Legend: Network Predictions, Expected Outputs. x-axis: Days. Annotation: Step ahead = 1584, Number of neurons = 1, Delay = 1, RMSE = 1.28356, MAPE = 0.0228199.]

a) Mapping of the Predicted and Actual Flow rate River Jarum 132 Hours Prediction using BR train, Prediction: Oct 2010.

b) Error Plot of the predicted and actual flow rate from NARX model.

[Regression plot, Sungai Jarum, actual and forecast flow: R = 0.99974. Legend: Data, Fit. x-axis: Actual Flow; y-axis: Predicted Flow.]

c) Regression of actual and forecasted flow of River Jarum.

Fig. 5.17 Prediction from Model VI: River Jarum for Oct. 2010 dataset

Fig. 5.17 a) shows the result of the 132-hour prediction using the October 2010 dataset. This is the longest prediction period that can be achieved for this model. The values of RMSE, MAPE and regression are 1.283, 0.023 and 0.999, respectively.

| Experiment | Target period | Training algorithm | Hours | MAPE | RMSE | Regression |
|---|---|---|---|---|---|---|
| I | October 2013 | LM | 12 | 0.026 | 0.08 | 0.99 |
| II | October 2013 | BR | 132 | 0.20 | 0.02 | 0.99 |
| III | November 2010 | LM | 12 | 0.01 | 0.01 | 0.99 |
| IV | November 2010 | BR | 132 | 0.02 | 1.28 | 0.99 |

Table 5.26 Best River Flow Rate Model performance based on different training algorithms for River Jarum.

Table 5.26 shows the best prediction for River Jarum based on various training algorithms. Using LM training algorithm, the best result is only up to 12-hour prediction. However, using BR algorithm, the flow rate can be predicted up to 132 hours or 5.5 days.

Studies of rainfall rate in tropical areas have shown that rainfall intensity can reach and exceed 150 mm/h. Rainfall in Perlis is highly variable, ranging from low to extremely high rainfall rates, and the daily accumulated rainfall can reach 200 mm.

The studies also reveal that there was no consistency in the annual rainfall pattern in the four years from 2010 to 2013. Therefore, predicting flood events based on previous flood events will not produce reliable results.

Analyses of rain occurrences on a daily cycle found that the highest rainfall occurs in the afternoon (12 pm to 6 pm), with 37% of the total yearly rain. This is because rainfall in tropical areas is dominated by convective rain; consequently, the risk of flood occurrence is higher in the evening and at night.
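A daily-cycle analysis of this kind can be sketched as a simple aggregation of rain gauge readings by time of day. The period boundaries other than the afternoon window, the function name and the sample readings are assumptions for illustration and are not the data or the procedure used in this study.

```python
from collections import defaultdict

def rainfall_share_by_period(records, periods):
    """Return the percentage of total rainfall falling in each named period.

    records: iterable of (hour_of_day, rainfall_mm) pairs.
    periods: dict mapping a name to (start_hour, end_hour), end exclusive.
    """
    totals = defaultdict(float)
    grand_total = 0.0
    for hour, mm in records:
        grand_total += mm
        for name, (start, end) in periods.items():
            if start <= hour < end:
                totals[name] += mm
                break
    if grand_total == 0:
        return {name: 0.0 for name in periods}
    return {name: 100.0 * totals[name] / grand_total for name in periods}

# Six-hour blocks; only the 12 pm to 6 pm window is taken from the text.
PERIODS = {"night": (0, 6), "morning": (6, 12),
           "afternoon": (12, 18), "evening": (18, 24)}

# Made-up readings (hour of day, mm) used purely to exercise the function.
sample = [(1, 2.0), (9, 1.0), (13, 4.0), (16, 3.0), (20, 2.0), (23, 0.5)]
shares = rainfall_share_by_period(sample, PERIODS)
print(shares)
```

Applied to a full year of gauge records, the "afternoon" entry would correspond to the share of yearly rain reported above.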

Considering the current flood detection system which uses river level, it has been shown that the proposed method of using river flow rate is better as an early warning system because it is more sensitive to rainfall compared to river level. This is a major finding of this study.

The study of the catchment area characteristics using the rainfall and river flow data has shown that, in the dry season (or after a long period without rain), the average time delay between the start of the rain and the point at which the flow rate starts to change is 18.64 hours for River Jarum and 32.56 hours for River Pelarit. Therefore, for the locations of the rain gauge stations used, the prediction time is in addition to this delay time.

However, during the wet or rainy season, when the soil is saturated, the time delay from the start of rainfall is shorter: down to 0.5 hours for River Jarum and 0.45 hours for River Pelarit. It should, however, be noted that this delay is not fixed and will change in proportion to rain intensity and duration.
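One simple way to estimate such a response delay from paired rainfall and flow records is sketched below. This is an assumed onset-detection rule, not the procedure used in this study: the delay is taken as the time between the first rainfall sample and the first flow sample that rises above the pre-rain baseline by a chosen fraction. The function name, thresholds and sample series are hypothetical.

```python
def response_delay_hours(rain, flow, step_hours, rain_threshold=0.0, rise_fraction=0.05):
    """Hours from the start of rain to the first noticeable rise in flow rate.

    rain, flow: equally sampled series of the same length.
    step_hours: sampling interval in hours.
    rise_fraction: assumed relative rise over the baseline that counts as a response.
    Returns None if there is no rain or the flow never responds.
    """
    rain_start = next((i for i, r in enumerate(rain) if r > rain_threshold), None)
    if rain_start is None:
        return None
    baseline = flow[rain_start]  # flow at the moment the rain begins
    for i in range(rain_start, len(flow)):
        if flow[i] > baseline * (1.0 + rise_fraction):
            return (i - rain_start) * step_hours
    return None

# Made-up half-hourly series used only to exercise the function.
rain = [0, 0, 5, 8, 2, 0, 0, 0]
flow = [1.0, 1.0, 1.0, 1.0, 1.02, 1.2, 1.8, 1.5]
print(response_delay_hours(rain, flow, step_hours=0.5))
```

Averaging this delay over many events, separately for dry and wet periods, would give figures comparable in kind to those quoted above, although the result depends on the chosen thresholds.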

One of the most significant contributions of this thesis is the development of the river flow prediction model. Using the ANN-based techniques, the model can produce short-term (< 12 hours), medium-term (< 3 days) and long-term (4-10 days) flow rate forecasts for the rivers [140][141]. In general, for very fast approximation, the Levenberg-Marquardt (LM) training algorithm provides the fastest convergence. However, as the number of weights in the network increases, the advantage of this algorithm decreases. The Bayesian Regularization (BR) training algorithm, although slower, provides more accurate results for both short-term and long-term prediction.
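The difference between the two training approaches can be illustrated through the quantity each one minimises. In the commonly used formulation of Bayesian regularisation, the sum of squared errors is combined with a penalty on the sum of squared weights, which discourages over-fitting; setting the penalty to zero leaves the plain sum of squared errors. The sketch below assumes that general formulation, it is not taken from the implementation used in this study, and the parameter names `alpha` and `beta` are conventional labels. In practice these two parameters are re-estimated during training rather than fixed by hand.

```python
def regularised_objective(errors, weights, alpha, beta):
    """Return beta * (sum of squared errors) + alpha * (sum of squared weights)."""
    sum_sq_errors = sum(e * e for e in errors)
    sum_sq_weights = sum(w * w for w in weights)
    return beta * sum_sq_errors + alpha * sum_sq_weights

errors = [1.0, -2.0]   # made-up prediction errors
weights = [0.5, 0.5]   # made-up network weights

# alpha = 0 reduces to the plain sum of squared errors.
print(regularised_objective(errors, weights, alpha=0.0, beta=1.0))
# alpha > 0 additionally penalises large weights.
print(regularised_objective(errors, weights, alpha=2.0, beta=1.0))
```

The extra work of balancing the two terms is one reason why regularised training is typically slower than unregularised training, which is consistent with the run-time comparison reported in this chapter.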

Finally, in this study, the need to know the surface topology has been eliminated as it is taken into account by the implemented ANN model.

This study has revealed that there was no consistency in the annual rainfall pattern in the period from 2010 to 2013. Therefore, predicting flood events based on previous rain events will not produce reliable results. This study has also shown that river flow rate responds faster to rainfall than river level.

Chapter 6 Discussion and Future Work

This chapter summarises the contributions of this study and makes recommendations for future work.

Summary of the Study

One of the biggest problems in designing a flood and water runoff forecasting system is that models that take ground cover into account easily become outdated as a result of development in the flood plains. In hydrological modelling, a detailed understanding of the catchment area is required. A complex system is often used that models all the processes, combining external elements such as rainfall with internal factors such as the state of vegetation, soil moisture and groundwater levels. The parameters are difficult and time-consuming to obtain, yet easily become outdated in areas where development is taking place.

[Diagram: Input (rainfall in catchment area) → “Black Box Concept” (Timah Tasoh Basin) → Output (river flow rate)]

Fig. 6.1 Simplified analysis of catchment model in Timah Tasoh Catchment Area

In this research, the black box approach has been used, as shown in Fig. 6.1. The main hydrologic parameters of this model are precipitation, river flow rate and temporal rainfall distribution. Catchment area modelling based on rainfall-runoff data does not consider the physiographical factors or the catchment area characteristics explicitly and therefore does not require them to be measured. Because of its black box features, this system can be operated as a near real-time prediction system: it can “update” itself based on the real-time input to produce real-time forecasts.

These concepts are suitable for small and medium-sized catchment areas but can be extended to larger catchment areas with the use of a high spatial resolution rain gauge network. This avoids the complexity of having to know the values of parameters such as topology, type of soil and type of vegetation in the catchment area, and their variations. The time-series correlation between rainfall and the river flow rate has been used extensively in this study.

River flow rate predictions can be made with the help of an accurate river model. The relationship between river flow rate and the volume of water can therefore be used to forecast flooding by taking into account the prevailing conditions. From this study, it has been found that the main influence of rainfall in the River Pelarit and River Jarum catchment areas is on their flow rates and not on the river levels, which are used in the conventional method to predict flooding. The study has shown that the use of river flow rate provides more advanced warning, in terms of lead time, for flood control and water resource management than the use of river level as an indicator.

The aim of this research was to develop an artificial intelligence based computer model that can be used to improve flood prediction and water resource management in NW Malaysia. By considering the application of artificial neural networks, a nonlinear autoregressive machine learning technique was adopted. The developed model has been trained using preceding conditions such as rainfall values, dry period and previous flow rate to predict the future river flow rates of two rivers that discharge into the reservoir. The chosen methodology was to evaluate the performance of various architectures with a view to determining the most appropriate predictive model. The results of the prediction models for the two rivers show that the recurrent neural network which uses the Bayesian Regularization training algorithm is the more efficient of the two neural network configurations for river flow rate modelling.

The best prediction periods obtained for River Pelarit and River Jarum are 108 hours and 132 hours, respectively. The network was configured to have one hidden layer.
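The structure of such a nonlinear autoregressive model with rainfall as the external input can be sketched in two parts: pairing lagged rainfall and flow values with a future flow value, and passing one such input through a network with a single hidden layer. This is a minimal illustration under assumed names, activation function and sample data; it omits the dry-period input and the training step, and it is not the implementation used in this study.

```python
import math

def make_narx_samples(rain, flow, delay, horizon):
    """Pair lagged rainfall and flow values with the flow `horizon` steps ahead.

    Each input holds the last `delay` rainfall values followed by the last
    `delay` flow values up to time t; the target is flow[t + horizon].
    """
    inputs, targets = [], []
    for t in range(delay - 1, len(flow) - horizon):
        lagged_rain = rain[t - delay + 1 : t + 1]
        lagged_flow = flow[t - delay + 1 : t + 1]
        inputs.append(list(lagged_rain) + list(lagged_flow))
        targets.append(flow[t + horizon])
    return inputs, targets

def forward(x, hidden_weights, hidden_bias, output_weights, output_bias):
    """One hidden layer of tanh units followed by a linear output neuron."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(ws, x)) + b)
              for ws, b in zip(hidden_weights, hidden_bias)]
    return sum(w * h for w, h in zip(output_weights, hidden)) + output_bias

# Made-up series used only to show the shapes involved.
rain = [0, 1, 2, 0, 0, 0]
flow = [1, 1, 2, 3, 2, 1]
inputs, targets = make_narx_samples(rain, flow, delay=2, horizon=1)
print(inputs[0], targets[0])

# Untrained single-neuron network: zero weights give the output bias.
print(forward(inputs[0], [[0, 0, 0, 0]], [0.0], [1.0], 0.5))
```

Increasing `horizon` in steps equivalent to 12 hours reproduces the kind of experiment series reported in Chapter 5, where each prediction period is trained and scored separately.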

Overall, the results show that rainfall data can be used to predict impending floods or water shortage by using river flow rate as the parameter of measurement. Whilst the accuracy of the results reported in this thesis is high for forecasts of up to four days, it starts to decrease for longer-term forecasts. However, the model, if implemented, can regularly update the forecast as new data comes in and adapt to changes in the catchment area. This type of real-time or near real-time model, which uses new data to adapt its prediction, is envisaged to be more appropriate in the face of climate change or changing weather patterns.
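The idea of refreshing the forecast as each new observation arrives can be sketched with a rolling window of recent data. The class name, the window length and the stand-in `persistence_model` are hypothetical; a deployed system would call the trained ANN instead, and would also retrain it periodically, which this sketch does not show.

```python
from collections import deque

class RollingForecaster:
    """Keep the most recent observations and re-issue a forecast on every update."""

    def __init__(self, model, window):
        self.model = model                   # callable: list of (rain, flow) -> forecast
        self.history = deque(maxlen=window)  # oldest samples drop out automatically

    def update(self, rain, flow):
        """Add the newest observation; return a forecast once the window is full."""
        self.history.append((rain, flow))
        if len(self.history) < self.history.maxlen:
            return None
        return self.model(list(self.history))

def persistence_model(history):
    """Placeholder model: forecast equals the latest measured flow."""
    return history[-1][1]

forecaster = RollingForecaster(persistence_model, window=2)
first = forecaster.update(0.0, 1.0)    # window not yet full
second = forecaster.update(1.0, 1.5)   # forecast from the two latest samples
third = forecaster.update(2.0, 2.0)    # oldest sample has been discarded
print(first, second, third)
```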

Overall, the objectives of this research were outlined as follows:

• To study rain characteristics in a selected area in Malaysia using available rain gauge data;

• To study the relationship between rainfall rate, river level and river flow rate to identify which river characteristic (level or flow rate) has the strongest relationship with rainfall;

• To identify possible technique(s) that can be used to predict the identified river characteristic;

• To develop a computer-based model to predict the river characteristic, and to test and analyse its performance using measured data.

The objectives have been achieved by:

• Analysing the rainfall characteristics and patterns in the catchment areas of two rivers (River Jarum and River Pelarit) in Perlis, Northwest Malaysia.

• Establishing the relationship between rainfall and river flow rate, with the river flow rate being the best parameter that can be used as an indicator for advanced flood prediction and water resource availability management.

• Identifying the Artificial Neural Network (ANN) technique, and developing and implementing a model that has been used to predict river flow rate based on rainfall. This is the first river model that has been developed and implemented based on the ANN technique in this catchment area.

• Building a model that can predict river flow rates based on rainfall. The results show that it can predict river flow rates up to 132 hours in advance with correlation coefficients of more than 0.9. Currently, there is no river model that has been developed for the studied area; the dam is manually controlled based on the experience of the operator. Furthermore, several studies on river modelling from different areas in Malaysia show that the best prediction models forecast only 8 to 24 hours in advance [142][143][144].

Contributions

Although several studies have been carried out in Malaysia on flood prediction, the systems that have been put in place have proven to be ineffective in providing sufficient lead time during floods. The actions that are often taken in the management of water reservoirs or dams are also uncoordinated, and decisions are made based on experience rather than on a system. This thesis has made significant and novel contributions in two major areas: river modelling based on the ANN technique, which can be used to predict floods, and the prediction of water inflow into a reservoir, which can be used to manage water resources.

Although other authors have used ANN-based models in related areas, the developed model is considered to be one of the first to be applied in this context. Its long prediction time and high accuracy mean that it can be used to manage water dams, safeguard water supply and control flooding both upstream and downstream.

Some of the results of this study have already been presented to members of the Malaysian Department of Irrigation and Drainage (DID), and aspects of the study, e.g. the use of river flow rate rather than river level as an indicator for potential floods, are being considered for adoption. The current practice of measuring the river level does not allow sufficient lead time for issuing warnings and for evacuation.

Recommendations for Future Works in River Modelling

Although the model developed and used produces good results, the author feels that several areas of the research carried out can be further strengthened. These include:

• The development of a model that relates the position of the rain gauges in the catchment area to the time delay of the impact of the rainfall on the river flow rate;

• The development of a model that will take readings from multiple rain gauges, providing better spatial resolution, for river flow rate prediction;

• The development and deployment of a self-adapting real-time version of the model and its integration with any flooding or water resource management system.

References

[1] D. Ruelland, V. Larrat, and V. Guinot, “A comparison of two conceptual models for the simulation of hydro-climatic variability over 50 years in a large Sudano-Sahelian catchment,” Glob. Chang. facing risks Threat. to water Resour., no. October, pp. 25–29, 2010.

[2] W. Z. W. Z. & A. A. J. Jamaludin Suhaila, Sayang Mohd Deni, “Trends in Peninsular Malaysia Rainfall Data During the Southwest Monsoon and Northeast Monsoon Seasons : 1975 – 2004,” vol. 39, no. 4, pp. 533–542, 2010.

[3] N. S. Muhammad, “Probability Structure and Return Period Calculations for Multi-Day Monsoon Rainfall Events at Subang, Malaysia,” Colorado State University, 2013.

[4] K. Sene, Flash Floods: Forecasting and Warning, XIII. Dordrecht: Springer Netherlands, 2013.

[5] Malaysian Meteorological Department, “Urban Stormwater Management Manual,” Percetakan Nasional Malaysia Berhad, 2013. [Online]. Available: http://www.water.gov.my/urban-stormwater-mainmenu-564/188.

[6] Urban Stormwater Management Manual for Malaysia, 2012th ed. Department of Irrigation and Drainage, 2012.

[7] T. N. Alagesh, H. Mohd, and H. Reduan, “Flood: Worsening situation in Pahang, Terengganu,” News Straits Times, 2013. [Online]. Available: http://www.nst.com.my/latest/font-color-red-flood-font-worsening-situation-in-pahang-terengganu-1.424948.

[8] Z. A. Akasah and S. V Doraisamy, “2014 Malaysia flood : impacts & factors contributing towards the restoration of damages,” J. Sci. Res. Dev. 2, vol. 2, no. 14, pp. 53–59, 2015.

[9] “Floodlist.” [Online]. Available: http://floodlist.com/asia/malaysia-floods-kelantan-worst-recorded-costs.

[10] M. S. Bin Khalid and S. B. Shafiai, “Flood Disaster Management in Malaysia: An Evaluation of the Effectiveness Flood Delivery System,” Int. J. Soc. Sci. Humanit., vol. 5, no. 4, pp. 398–402, 2015.

[11] “United Nations Platform for Space-based Information for Disaster Management and Emergency Response - UN-SPIDER.” [Online]. Available: http://www.un-spider.org/about/what-is-un-spider. [Accessed: 31-Jan-2017].

[12] G. Australia, “What is a Flood?,” 2011. [Online]. Available: http://www.ga.gov.au/scientific-topics/hazards/flood/basics/what.

[13] M. Dong, “A tutorial on nonlinear time-series data mining in engineering asset health and reliability prediction: Concepts, models, and algorithms,” Math. Probl. Eng., vol. 2010, 2010.

[14] R. J. Povinelli, “Time Series Data Mining: Identifying Temporal Patterns for Characterization and Prediction of Time Series Events,” p. 193, 1999.

[15] B. R. Kumar, “Forecasting and control of future values on Spatial-Temporal prediction,” pp. 31–36, 2001.

[16] S. H. Elsafi, “Artificial Neural Networks (ANNs) for flood forecasting at Dongola Station in the River Nile, Sudan,” Alexandria Eng. J., vol. 53, no. 3, pp. 655–662, 2014.

[17] R. Chuentawat, S. Bunrit, C. Ruangudomsakul, and N. Kerdprasop, “Artificial Neural Networks and Time Series Models for Electrical Load Analysis,” vol. I, 2016.

[18] D. Labat, R. Ababou, and A. Mangin, “Linear and nonlinear input/output models for karstic springflow and flood prediction at different time scales,” Stoch. Environ. Res. Risk Assess., vol. 13, no. 5, pp. 337–364, 1999.

[19] C. M. Zealand, D. H. Burn, and S. P. Simonovic, “Short term streamflow forecasting using artificial neural networks,” J. Hydrol., vol. 214, no. 1, pp. 32–48, 1999.

[20] F. Guenther, “Neural networks: Biological models and applications.” pp. 1–7, 2003.

[21] I. Kaastra and M. Boyd, “Designing a neural network for forecasting financial and economic time series.pdf.” 1995.

[22] H. Tabari, P. Hosseinzadeh Talaee, and P. Willems, “Short-term forecasting of soil temperature using artificial neural network,” Meteorol. Appl., vol. 22, no. 3, pp. 576–585, 2015.

[23] O. Awodele and O. Jegede, “Neural Networks and its Application in Engineering,” Proc. Informing Sci. IT Educ. Conf. 2009, 2009.

[24] T. Htet, H. San, and M. M. Khin, “River Flood Prediction using Time Series Model,” vol. 1, no. 2, pp. 265–269, 2015.

[25] P. R.J., “Identifying Temporal Patterns for Characterization and Prediction of Time Series Events,” Marquette University, Milwaukee, 1999.

[26] C. Damle, “Flood forecasting using time series data mining,” 2005.

[27] P. Supriya, M. Krishnaveni, and M. Subbulakshmi, “Regression Analysis of Annual Maximum Daily Rainfall and Stream Flow for Flood Forecasting in Vellar River Basin,” Aquat. Procedia, vol. 4, pp. 957–963, 2015.

[28] ESCAP / WMO Typhoon Committee (Malaysia), “Members Report,” 2011.

[29] E. Baltas, “Advances in Geosciences The combined use of weather radar and geographic information system techniques for flood forecasting,” Inf. Syst., pp. 117–123, 2007.

[30] D. Solomatine, C. Rojas, S. Velickov, and H. Wust, “Chaos theory in predicting surge water levels in the North Sea,” Proc. 4th Int. Conf. …, no. July, pp. 1–8, 2000.

[31] M. Kim and J. Kim, “Predicting IGS RTS Corrections Using ARMA Neural Networks,” vol. 2015, 2015.

[32] J. D. Farmer and J. J. Sidorowich, “Predicting chaotic time series,” Phys. Rev. Lett., vol. 59, no. 8, pp. 845–848, Aug. 1987.

[33] a. Porporato and L. Ridolfi, “Nonlinear analysis of river flow time sequences,” Water Resour. Res., vol. 33, no. 6, pp. 1353–1367, 1997.

[34] B. Sivakumar, “Chaos theory in hydrology: important issues and interpretations,” J. Hydrol., vol. 227, no. 1, pp. 1–20, 2000.

[35] L. Cao, A. I. Mees, and K. Judd, “Dynamics from multivariate time series,” Phys. D Nonlinear Phenom., vol. 121, pp. 75–88, 1998.

[36] F. Laio, a. Porporato, R. Revelli, and L. Ridolfi, “A comparison of nonlinear flood forecasting methods,” Water Resour. Res., vol. 39, no. 5, p. n/a-n/a, 2003.

[37] R. F. Adler and A. J. Negri, “A satellite infrared technique to estimate tropical convective and stratiform rainfall,” J. Appl. Meteorol., vol. 27, no. 1, pp. 30–51, 1988.

[38] K. S. Takahashi, Tsutomu, “Tropical Rain Characteristics and Microphysics in a Three-Dimensional Cloud Model,” Am. Metrol. Soc., vol. Volume 61, no. 1984, pp. 2817–2845, 2004.

[39] W. Arm, A. Msm, D. Ara, A. Aw, I. Ar, and H. Saliza, “Improved Watershed Runoff Estimation Using Radar-Derived Rainfall in Peninsular Malaysia,” Int. Conf. Recent Emerg. Technol., 2009.

[40] S. Burcea, S. Cheval, A. Dumitrescu, B. Antonescu, A. Bell, and T. Breza, “Comparison Between Radar Estimated And Rain Gauge Measured Precipitation In The Moldavian Plateau,” Environ. Eng. Manag. J., vol. 11, no. 4, pp. 723–731, 2012.

[41] R. Adriyanto, “CURRENT STATUS OF WEATHER RADAR DATA EXCHANGE,” World Meteorol. Organ., no. Workshop On Radar Data Exchange, 2013.

[42] J. D. A. R. Nor Hisham Haji Khamis, “Determination of Rain Cell Size Distribution for Microwave Link Design in Malaysia,” RF Microw. Conf., pp. 38–40, 2004.

[43] Shuyi S. Chen and R. A. H. Jr, “Diurnal variation and life-cycle of deep convective systems over the tropical Pacific warm pool,” Q. J. R. Meteorol. Soc., vol. 123, no. 538, pp. 357–388, 1997.

[44] S. Ramli and W. Tahir, “Radar Hydrology: New Z/R Relationships for Quantitative Precipitation Estimation in Klang River Basin, Malaysia,” Int. J. Environ. Sci. Dev., vol. 2, no. 3, pp. 223–227, 2011.

[45] A. M. Jafri, Z. Hashim, M.L.Kavvas, Z.Q.Chen, and N.Ohara, “Development of Atmospheric Based Flood Forecasting and Warning System for Selected River Basins in Malaysia.,” J. Hydrol., vol. 356, no. 3–4, pp. 1–18, 2010.

[46] T. Ushio et al., “A Kalman Filter Approach to the Global Satellite Mapping of Precipitation (GSMaP) from Combined Passive Microwave and Infrared Radiometric Data,” J. Meteorol. Soc. Japan, vol. 87A, no. November 2008, pp. 137–151, 2009.

[47] W. Veerakachen, M. Raksapatcharawong, and S. Seto, “Performance evaluation of Global Satellite Mapping of Precipitation (GSMaP) products over the Chaophraya River basin, Thailand,” Hydrol. Res. Lett., vol. 8, no. 1, pp. 39–44, 2014.

[48] R. a. Roebeling and I. Holleman, “SEVIRI rainfall retrieval and validation using weather radar observations,” J. Geophys. Res., vol. 114, no. D21, p. D21202, Nov. 2009.

[49] R. A. Roebeling and I. Holleman, “Validation Of Rain Rate Retrievals From SEVIRI Using Weather Radar Observations,” EUMETSAT Meteorol. Satell. Conf., 2008.

[50] S. Seto, T. Tsunekawa, and T. Oki, “A new rain detection method to complement high-resolution global precipitation products,” Japan Soc. Hydrol. Water Resour., vol. 86, pp. 82–86, 2012.

[51] W. K. Al-Assadi, S. Gandla, S. Sedigh, and I. P. Dugganapally, “Design of a flood prediction system,” 2009 12th Int. IEEE Conf. Intell. Transp. Syst., pp. 1–6, Oct. 2009.

[52] J. N. Ghazali and A. Kamsin, “A Real Time Simulation and Modeling of Flood Hazard,” 12th WSEAS Int. Conf. Syst. Heraklion, Greece, pp. 438–443, 2008.

[53] E. Coppola, B. Tomassetti, M. Verdecchia, F. S. Marzano, and G. Visconti, “Small-catchment flood forecasting and drainage network extraction using computational intelligence,” 2006 IEEE Int. Jt. Conf. Neural Netw. Proc., pp. 851–858, 2006.

[54] C. R. Jackson and J. R. Apel, Synthetic Aperture Radar Marine User’s Manual, Sep 2004. National Oceanic and Atmospheric Administration (NOAA), 2004.

[55] P. Campling, A. Gobin, K. Beven, and J. Feyen, “Rainfall-runoff modelling of a humid tropical catchment: the TOPMODEL approach,” Hydrol. Process., vol. 16, no. 2, pp. 231–253, Feb. 2002.

[56] Z. Yang, “Derivation of unit hydrograph using a transfer function approach,” vol. 42, pp. 1–9, 2006.

[57] C. W. Dawson and R. L. Wilby, “Hydrological modelling using artificial neural networks,” Prog. Phys. Geogr., vol. 25, no. 1, pp. 80–108, 2001.

[58] M. B. Abbott, J. C. Bathurst, J. A. Cunge, P. E. O’Connell, and J. Rasmussen, “An introduction to the European Hydrological System — Systeme Hydrologique Europeen, ‘SHE’, 2: Structure of a physically-based, distributed modelling system,” J. Hydrol., vol. 87, no. 1–2, pp. 61–77, Oct. 1986.

[59] F. Fenicia, D. Kavetski, and H. H. G. Savenije, “Elements of a flexible approach for conceptual hydrological modeling: 1. Motivation and theoretical development,” Water Resour. Res., vol. 47, no. 11, p. n/a-n/a, Nov. 2011.

[60] R. Allen Freeze, “Modelling changes in forest,” Hydrol. Forecast., vol. 12, no. 2, pp. 214–215, 1985.

[61] P. Taylor, C. W. Dawson, and R. Wilby, “An artificial neural network approach to rainfall-runoff modelling,” Hydrol. Sci. J., vol. 43, no. August 2012, pp. 37–41, 1998.

[62] V. Koren, S. Reed, and M. Smith, “Combining physically-based and conceptual approaches in the development and parameterization of a distributed system,” no. 282, pp. 101–108, 2003.

[63] T. Wagener, M. J. Lees, and H. S. Wheater, “A toolkit for the development and application of parsimonious hydrological models,” Math. Model. small watershed Hydrol., vol. 2, 2001.

[64] I. G. Pechlivanidis, B. M. Jackson, N. R. McIntyre, and H. S. Wheater, “Catchment scale hydrological modelling: a review of model types, calibration approaches and uncertainty analysis methods in the context of recent developments in technology and applications,” Glob. NEST J., vol. 13, no. 3, pp. 193–214, 2011.

[65] C. Garc, “Time Series Analysis: Autoregressive, MA and ARMA processes,” 2012.

[66] C. L. Cocianu and H. Grigoryan, “An Artificial Neural Network for Data Forecasting Purposes,” Inform. Econ., vol. 20, no. 2/2015, pp. 34–45, 2015.

[67] R. May, G. Dandy, and H. Maier, “Review of Input Variable Selection Methods for Artificial Neural Networks,” Artif. Neural Networks - Methodol. Adv. Biomed. Appl., no. August 2016, p. 362, 2011.

[68] F. Ahmad Ruslan, A. M. Samad, Z. M. Md Zain, and R. Adnan, “Flood water level modeling and prediction using NARX neural network: Case study at Kelang river,” Signal Process. its Appl. (CSPA), 2014 IEEE 10th Int. Colloq., no. 1, pp. 204–207, 2014.

[69] Haviluddin and R. Alfred, “Performance of Modeling Time Series Using Nonlinear Autoregressive with eXogenous input (NARX) in the Network Traffic Forecasting,” vol. 2013, no. June 2013, pp. 164–168, 2015.

[70] E. Diaconescu, “The use of NARX neural networks to predict chaotic time series,” WSEAS Trans. Comput. Res., vol. 3, no. 3, pp. 182–191, 2008.

[71] Y. Liu, G. Ma, and X. Jiang, “A design method for adaptive inverse control using NARX neural networks,” Fifth World Congress on Intelligent Control and Automation (IEEE Cat. No.04EX788), vol. 1, pp. 459–463, 2004.

[72] K. C. Gupta, “Neural Network Structures,” Neural Networks RF Microw. Des., pp. 61–103, 2000.

[73] C. Bergmeir and J. M. Benitez, “Neural Networks in R Using the Stuttgart Neural Network Simulator: RSNNS,” J. Stat. Softw., vol. 46, no. 7, pp. 1–26, 2012.

[74] T. Attia, “A Comparative Study of Three Different Topologies of Neural Network-based Multiuser Detectors of WCDMA signals,” no. 6, pp. 61–70, 2006.

[75] S. M. Guthikonda, “Kohonen Self-Organizing Maps,” no. December, 2005.

[76] A. K. Mitra, S. Nath, and A. K. Sharma, “Fog forecasting using rule-based fuzzy inference system,” J. Indian Soc. Remote Sens., vol. 36, no. 3, pp. 243–253, 2008.

[77] C. W. Dawson and R. Wilby, “An artificial neural network approach to rainfall-runoff modelling,” Hydrol. Sci. J., vol. 43, no. 1, pp. 47–66, Feb. 1998.

[78] J. M. Ortiz-Rodríguez, M. R. Martínez-Blanco, and M. Manuel, “Robust Design of Artificial Neural Networks Methodology in Neutron Spectrometry.”

[79] P. Hosseinzadeh Talaee, “Multilayer perceptron with different training algorithms for streamflow forecasting,” Neural Comput. Appl., vol. 24, no. 3–4, pp. 695–703, 2014.

[80] P. P. Balestrassi, E. Popova, A. P. Paiva, and J. W. Marangon Lima, “Design of experiments on neural network’s training for nonlinear time series forecasting,” Neurocomputing, vol. 72, no. 4–6, pp. 1160–1178, 2009.

[81] B. Makki and M. N. Hosseini, “Some refinements of the standard autoassociative neural network,” Neural Comput. Appl., vol. 22, no. 7–8, pp. 1461–1475, 2013.

[82] N. Sajikumar and B. S. Thandaveswara, “A non-linear rainfall–runoff model using an artificial neural network,” J. Hydrol., vol. 216, no. 1–2, pp. 32–55, Mar. 1999.

[83] C. Engineering, “Artificial intelligence techniques in flood forecasting,” Bristol, no. January, 2005.

[84] C. M. Zealand, D. H. Burn, and S. P. Simonovic, “Short term streamflow forecasting using artificial neural networks,” vol. 214, no. May 1998, pp. 32–48, 1999.

[85] A. Jain and S. Srinivasulu, “Integrated approach to model decomposed flow hydrograph using artificial neural network and conceptual techniques,” J. Hydrol., vol. 317, no. 3–4, pp. 291–306, 2006.

[86] I. Durre, J. M. Wallace, and D. P. Lettenmaier, “Dependence of extreme daily maximum temperatures on antecedent soil moisture in the contiguous United States during summer,” J. Clim., vol. 13, no. 14, pp. 2641–2651, 2000.

[87] T. Chaipimonplin and T. Vangpaisal, “Comparison of the Efficiency of Input Determination Techniques with LM and BR Algorithms in ANN for Flood Forecasting, Mun Basin, Thailand,” Int. J. Comput. Electr. Eng., vol. 6, no. 2, pp. 90–94, 2014.

[88] T. Chaipimonplin, L. See, and P. Kneale, “Improving neural network for flood forecasting using radar data on the Upper Ping River,” no. December, pp. 12–16, 2011.

[89] S. Roweis, “Levenberg-Marquardt Optimization,” Notes, Univ. Toronto, 1996.

[90] F. C. Kruse, Predictions Nonlinearities and Portfolio. BoD – Books on Demand, 2012.

[91] W. Wang, P. H. A. J. M. Van Gelder, J. K. Vrijling, and J. Ma, “Forecasting daily streamflow using hybrid ANN models,” J. Hydrol., vol. 324, no. 1–4, pp. 383–399, Jun. 2006.

[92] J. Tao and A. P. Barros, “Prospects for flash flood forecasting in mountainous regions – An investigation of Tropical Storm Fay in the Southern Appalachians,” J. Hydrol., Mar. 2013.

[93] L. S. Maciel and R. Ballini, “Design a Neural Network for Time Series Financial Forecasting: Accuracy and Robustness Analysis,” Inst. Economia, Univ. Estadual Campinas, no. 1986, 2008.

[94] S. C. Nayak, B. B. Misra, and H. S. Behera, “Impact of Data Normalization on Stock Index Forecasting,” Int. J. Comput. Inf. Syst. Ind. Manag. Appl., vol. 6, no. 2014, pp. 257–269, 2014.

[95] A. I. Erdi TOSUN, Kadir AYDIN, Simona Silvia MEROLA, “Estimation Of Operational Parameters For A Direct Injection Turbocharged Spark Ignition Engine By Using Regression Analysis And Artificial Neural Network,” p. 3.

[96] E. J. Brouch, “Artificial Neural Network Prediction of Chemical-Disease Relationships Using Readily Available Chemical Properties,” Air Force Institute of Technology, 2014.

[97] S. Karsoliya, “Approximating Number of Hidden layer neurons in Multiple Hidden Layer BPNN Architecture,” Int. J. Eng. Trends Technol., vol. 3, no. 6, pp. 714–717, 2012.

[98] F.-J. Chang, Y.-M. Chiang, and L.-C. Chang, “Multi-step-ahead neural networks for flood forecasting,” Hydrol. Sci. J., vol. 52, no. 1, pp. 114–130, 2007.

[99] G. Panchal, A. Ganatra, Y. Kosta, and D. Panchal, “Behaviour analysis of multilayer perceptrons with multiple hidden neurons and hidden layers,” Int. J. Comput. Theory Eng., vol. 3, no. 2, pp. 332–337, 2011.

[100] D. L. Bailey and D. Thompson, “Developing Neural-network Applications,” AI Expert, vol. 5, no. 9, pp. 34–41, Jun. 1990.

[101] K. G. Sheela and S. N. Deepa, “Review on methods to fix number of hidden neurons in neural networks,” Math. Probl. Eng., vol. 2013, no. Article ID 425740, 2013.

[102] M. A. Mohammed, Lubna. B. Hamdan and E. A. Abdelhafez, “Hourly Solar Radiation Prediction Based on Nonlinear,” Jordan J. Mech. Ind. Eng., vol. 7, no. 1, pp. 11–18, 2013.

[103] T. Chaipimonplin, “Investigation internal parameters of neural network model for Flood Forecasting at Upper river Ping, Chiang Mai,” KSCE J. Civ. Eng., vol. 20, pp. 478–484, 2015.

[104] H. Demuth, “Neural Network Toolbox,” Networks, vol. 24, no. 1, pp. 1–8, 2002.

[105] Ö. Kisi and E. Uncuoglu, “Comparison of three back-propagation training algorithms for two case studies,” Indian J. Eng. Mater. Sci., vol. 12, no. October, pp. 434–442, 2005.

[106] J. Eriksson, Optimization and Regularization of Nonlinear Least Squares Problems. 1996.

[107] J. Smith and R. N. Eli, “Neural-Network Models of Rainfall-Runoff Process,” J. Water Resour. Plan. Manag., vol. 121, no. 6, pp. 499–508, 1995.

[108] C. W. Dawson and R. L. Wilby, “A comparison of artificial neural networks used for river flow forecasting,” Hydrol. Earth Syst. Sci., vol. 3, no. 4, pp. 529–540, 1999.

[109] N. Sajikumar and B. S. Thandaveswara, “A non-linear rainfall–runoff model using an artificial neural network,” J. Hydrol., vol. 216, no. 1–2, pp. 32–55, Mar. 1999.

[110] L. Xiong, K. M. O’Connor, and S. Guo, “Comparison of three updating schemes using artificial neural network in flow forecasting,” Hydrol. Earth Syst. Sci., vol. 8, no. 2, pp. 247–255, 2004.

[111] R. J. Dawson, “Performance-based management of flood defence systems,” no. May, 2003.

[112] Y. A. Pachepsky, G. Martinez, F. Pan, T. Wagener, and T. Nicholson, “Evaluating Hydrological Model Performance using Information Theory-based Metrics,” Hydrol. Earth Syst. Sci. Discuss., no. February, pp. 1–24, 2016.

[113] T. Chai and R. R. Draxler, “Root mean square error (RMSE) or mean absolute error (MAE)? – Arguments against avoiding RMSE in the literature,” Geosci. Model Dev., vol. 7, no. 3, pp. 1247–1250, 2014.

[114] C. J. Willmott and K. Matsuura, “Advantages of the mean absolute error (MAE) over the root mean square error (RMSE) in assessing average model performance,” Clim. Res., vol. 30, pp. 79–82, 2005.

[115] N. Fenton and M. Neil, “Correlation coefficient and p-values: what they are and why you need to be very wary of them,” Risk Assess. Decis. Anal. with Bayesian Networks, 2012.

[116] P. Coulibaly and C. K. Baldwin, “Dynamic neural networks for nonstationary hydrological time series modeling,” in Practical Hydroinformatics, Springer Berlin Heidelberg, 2009, pp. 71–85.

[117] A. B. Bhattacharya, D. K. Tripathi, T. Das, and A. Nag, “Statistical Characteristics Of Tropical Rain Rate And Rain Intensity From Radar And Rain Gauge Measurements,” vol. 4, no. 1, pp. 53–64, 2011.

[118] A. Tayebiyan, T. A. Mohammad, and A. H. Ghazali, “Artificial Neural Network for Modelling Rainfall-Runoff,” Pertanika J. Sci. Technol., vol. 24, no. 2, pp. 319–330, 2016.

[119] S. French, “Cynefin, statistics and decision analysis,” J. Oper. Res. Soc., vol. 64, no. 4, pp. 547–561, 2013.

[120] A. R. and K. A. R. Negin Vaghefi, Mad Nasir Shamsudin, “Impact of Climate Change on Rice Yield in the Main Rice Growing Areas of Peninsular Malaysia,” Res. J. Environ. Sci., vol. 7, no. 2, pp. 59–67, 2013.

[121] Z. A. Mohtar, A. S. Yahaya, F. Ahmad, S. Suri, and M. H. Halim, “Trends for Daily Rainfall in Northern and Southern Region of Peninsular Malaysia,” J. Civ. Eng. Res., vol. 4, no. 3A, pp. 222–227, 2014.

[122] D. U. Lawal, A. Matori, A. M. Hashim, K. W. Yusof, and A. Chandio, “Natural Flood Influencing Factors: A Case Study of Perlis, Malaysia,” Int. Conf. Civil, Offshore Environ. Eng. (ICCOEE 2012), pp. 1–6, 2012.

[123] M. I. Phang Kun Liong, “A Case Study On The Sever Rain Occured In Northwestern Peninsular Malaysia From 31 October Till 2 November 2010,” JMM Res. Publ., vol. No. 11/201, 2012.

[124] C. P. Diman and W. Tahir, “Dam Flooding Caused A Prolonged Flooding,” Int. J. Civ. Environ. Eng. IJCEE-IJENS Vol12 No06, no. 6, 2012.

[125] W. R. Ismail, “Timah Tasoh Reservoir, Perlis, Malaysia.”

[126] Department for Environment, Food & Rural Affairs, How to model and map catchment processes when flood risk management planning. Environment Agency, 2016.

[127] J. M. Wright, “Chapter 4: Flood Risk Assessment,” Floodplain Manag. Princ. Curr. Pract., pp. 1–25, 2008.

[128] K. J. Beven and M. J. Kirkby, “A physically based, variable contributing area model of basin hydrology,” Hydrol. Sci. Bull., vol. 24, no. 1, pp. 43–69, 1979.

[129] C. Reszler, G. Bloschl, and J. Komma, “Identifying runoff routing parameters for operational flood forecasting in small to medium sized catchments,” Hydrol. Sci. J., vol. 53, no. 1, pp. 112–129, 2008.

[130] C. Zhang, Y. Peng, J. Chu, C. A. Shoemaker, and A. Zhang, “Integrated hydrological modelling of small-and medium-sized water storages with application to the upper fengman reservoir basin of China,” Hydrol. Earth Syst. Sci., vol. 16, no. 11, pp. 4033–4047, 2012.

[131] Paridah Anun bt Tahir, “Malaysia Water Resource Management (MyWRMS) Forum 2014,” 2014.

[132] Department of Irrigation and Drainage (JPS), “Infobanjir.” [Online]. Available: http://infobanjir.water.gov.my/waterlevel_page.cfm?state=PLS.

[133] H. B. Mann, “Nonparametric Tests Against Trend,” Econometrica, vol. 13, no. 3, pp. 245–259, 1945.

[134] M. G. Kendall, Rank Correlation Methods, 4th ed. 1975.

[135] R. M. Hirsch, J. R. Slack, and R. A. Smith, “Techniques of trend analysis for monthly water quality data,” Water Resour. Res., vol. 18, no. 1, pp. 107–121, Feb. 1982.

[136] A. S. D. and G. T. V. K. Andy Y. Kwarteng, “Analysis of a 27-year rainfall data (1977–2003) in the Sultanate of Oman,” Int. J. Climatol., vol. 4, no. December 2007, pp. 1549–1555, 2008.

[137] I. Conference, E. Science, E. Ipcbee, and I. Press, “Radar Hydrology: New Z/R Relationships for Klang River Basin, Malaysia,” vol. 8, no. 3, pp. 248–251, 2011.

[138] Y. C. Gao and M. F. Liu, “Evaluation of high-resolution satellite precipitation products using rain gauge observations over the Tibetan Plateau,” Hydrol. Earth Syst. Sci., vol. 17, no. 2, pp. 837–849, 2013.

[139] C. M. Taylor, R. A. M. de Jeu, F. Guichard, P. P. Harris, and W. A. Dorigo, “Afternoon rain more likely over drier soils,” Nature, vol. 489, no. 7416, pp. 423–426, Sep. 2012.

[140] C. E. M. Tucci and W. Collischonn, “Flood forecasting,” WMO Bull., vol. 55, no. July, pp. 179–184, 2006.

[141] Z. Liu, “Flood Forecasting and Warning in China.”

[142] F. A. Ruslan, A. M. Samad, Z. M. Zain, and R. Adnan, “Flood water level modeling and prediction using NARX neural network: Case study at Kelang river,” Proc. 2014 IEEE 10th Int. Colloq. Signal Process. Its Appl. (CSPA 2014), IEEE Comput. Soc., 2014.

[143] Y. Abou Rjeily, O. Abbas, M. Sadek, I. Shahrour, and F. Hage Chehade, “Flood forecasting within urban drainage systems using NARX neural network,” Water Sci. Technol., Jul. 2017.

[144] K. C. Keong, M. Mustafa, A. J. Mohammad, M. H. Sulaiman, and N. R. Hasma, “Artificial Neural Network Flood Prediction for Sungai Isap Residence,” p. 11.

Certificate of Ethics Review

Project Title: Disaster Prediction, Management and Analysis System
User ID: 673723
Name: Hassanuddin Bin Mohamed Noor
Application Date: 27/01/2015 09:40:01

You must download your referral certificate, print a copy and keep it as a record of this review.

The FEC representative for the School of Engineering is Giles Tewkesbury

It is your responsibility to follow the University Code of Practice on Ethical Standards and any Department/School or professional guidelines in the conduct of your study including relevant guidelines regarding health and safety of researchers including the following:

• University Policy
• Safety on Geological Fieldwork

It is also your responsibility to follow University guidance on Data Protection Policy:

• General guidance for all data protection issues
• University Data Protection Policy

SchoolOrDepartment: ENG
Primary Role: PostgraduateStudent
SupervisorName: Dr David L Ndzi
HumanParticipants: No
PhysicalEcologicalDamage: No
HistoricalOrCulturalDamage: No
HarmToAnimal: No
HarmfulToThirdParties: No
OutputsPotentiallyAdaptedAndMisused: No
Confirmation-ConsideredDataUse: Confirmed
Confirmation-ConsideredImpactAndMitigationOfPontentialMisuse: Confirmed
Confirmation-ActingEthicallyAndHonestly: Confirmed

Certificate Code: DE21-1DCB-316A-0A2A-5BFA-18FE-9E7A-F600

Supervisor Review

As supervisor, I will ensure that this work will be conducted in an ethical manner in line with the University Ethics Policy.

Supervisor signature: Date:
