Machine Learning in Water Systems

Machine Learning in Water Systems Dragan Savic´ (editor) AISB Convention 2013 • University of Exeter • 3rd–5th April, 2013 2 Foreword from the Convention Chairs This volume forms the proceedings of one of eight co-located symposia held at the AISB Convention 2013 that took place 3rd-5th April 2013 at the University of Exeter, UK. The convention consisted of these symposia together in four parallel tracks with five plenary talks; all papers other than the plenaries were given as talks within the symposia. This symposium-based format, which has been the standard for AISB conventions for many years, encourages collabora- tion and discussion among a wide variety of disciplines. Although each symposium is self contained, the convention as a whole represents a diverse array of topics from philosophy, psychology, computer science and cognitive science under the common umbrella of artificial intelligence and the simulation of behaviour. We would like to thank the symposium organisers and their programme committees for their hard work in publicising their symposium, attracting and reviewing submissions and compiling this volume. Without these interesting, high quality symposia the convention would not be possible. Dr Ed Keedwell & Prof. Richard Everson AISB 2013 Convention Chairs Published by The Society for the Study of Artificial Intelligence and the Simulation of Behaviour http://www.aisb.org.uk ISBN: 978-1-908187-33-8 3 CONTENTS Stephen Mounce, A comparative study of artificial neural network architectures for time series prediction of water distribution system flow data Renji Remesan, Jimson Mathew and Ian Holman, Rare events and extreme flood predictions: An Application of Monte Carlo based Statistical Blockade Roger Swan, John Bridgeman and Mark Sterling, Optimisation of Water Treatment Works using Static and Dynamic Models with an NSGAII Genetic Algorithm Andrew Duncan, Albert Chen, Edward Keedwell, Slobodan Djordjević, Dragan Savić, RAPIDS: Early Warning System for Urban Flooding and Water Quality Hazards Alireza Pakgohar, David Zhang and Sarah Ward, Dynamically Integrated Project Portfolio Planning to Accommodate Asset Management Plan Cycles in the UK Water Industry Mateja Skerjanec, Darko Cerepnalkoski, Saso Dzeroski, Boris Kompare and Natasa Atanasova, Modelling dynamic systems using a hybrid approach 4 PREFACE The emergence of water, alongside energy and food, as one of the three major, interlinked, global environmental security issues provides abundant challenges and opportunities for the application of Machine Learning to such problems as optimisation of water distribution and drainage networks’ design and operation, modelling and prediction of fluvial, pluvial, urban and coastal flooding, sediment transport and water quality issues. Advances in GIS, remote sensing and weather forecasting techniques mean that environmental data is becoming increasingly abundant at the same time as demands for solutions and tools to work on these problems become more urgent. Numerical models have been applied widely to improve the understanding and operational management of natural and manmade water systems. Traditionally, so-called “physically- based” models have been applied for such purposes. However, such models are often computationally demanding, and frequently require significant data to constrain model structures and parameters. Data-Driven Models (DDMs) based on Machine Learning techniques - which seek to provide a mapping between the inputs and outputs of a given system, with little prior process knowledge – have emerged as an attractive option for prediction and classification in water systems. The principal benefit of such DDMs is their fast execution time, which allows many more model evaluations for a fixed computational budget. Such models have been applied widely to address a variety of problems within water systems modelling, including: system simulation (e.g. rainfall-runoff modelling/rating curve prediction) when trained on measured data, and also when employed as metamodels and trained to emulate models with a stronger physical (or process) basis; to improve the speed of the optimisation procedure by acting as a surrogate model to the full fitness evaluation; to correct systematic errors in physically-based models during real-time forecasting; to provide uncertainty bound predictions during model forecasting when trained on uncertainty bounds derived from offline calibration; and in classification (for example of predicted severity of a hazard or exceedances of regulatory limits). Despite their potential benefit, successful application of machine learning techniques is not straightforward. A variety of machine learning techniques, optimisation methods and evaluation procedures have been applied in the research literature. It is not always clear which methods will perform best in different settings, and how choices made will influence performance. As an example, different machine learning techniques might perform best depending on how their performance is evaluated within a given operational setting. Furthermore, although such methods are technically “black-box” models, system understanding may be required to choose the best input variables, and tailor the approach to the operational setting in question. With a view towards sharing the interdisciplinary knowledge required to make appropriate methodological decisions, papers are invited that explore issues of model design and application, and in particular, papers that compare different approaches for machine learning application. 5 A comparative study of artificial neural network architectures for time series prediction of water distribution system flow data S. R. Mounce1 Abstract. Many water utility companies are beginning to amass Continuous on-line monitors and sensors are increasingly large volumes of data by means of remote sensing of flow, being used to measure a wide range of potable water hydraulic pressure and other variables. For district meter area monitoring and quality variables within WDS [1]. The proliferation and there has been increasing interest in using this sensor data for diminishing costs of automated data transfer, such as by SMS abnormality detection, such as the real-time detection of bursts. and GPRS systems, is allowing all types of recorded data to be Research pilots have explored systems for generating ‘smart transferred from many disparate points on the networks. Water alarms’ and a key requirement is usually a prediction of future utilities are struggling to archive or to transform the data time series values. Artificial neural networks have been effectively into knowledge with which to enable operational employed in this capacity, however built in temporal memory in control. Data-driven modelling provides a mapping between the the network architecture (tap delays, feedback etc.) has not been inputs and outputs of a given system, with the advantage of not widely explored. In this comparative study, a number of requiring a detailed understanding of the physical, chemical artificial neural network architectures are evaluated for water and/or biological processes that affect a system – and it is distribution flow time series prediction, in particular by emerging as an attractive option for prediction and classification exploring using temporal memory. These models included multi- in water systems. Data-driven models can complement and layer perceptron, mixture density network, time delay network sometimes replace deterministic models [2] and Artificial Neural and recurrent network. In addition the mean diurnal cycle Networks (ANNs) are one such model. ANNs have been (calculated from the data set) was utilised as a baseline successfully applied to a range of water modelling problems and prediction. Genetic algorithm optimisation was utilised in some have displayed particular promise for forecasting applications. cases to optimise the number of hidden processing elements and They can be evaluated based on physical model outcomes and the learning rates parameters for the neural network. Two experimental/field data can be further integrated in order to reference data sets are used as a case study originating from enhance their performance. Maier and Dandy [3] provide a typical real world distribution systems and the performance comprehensive review of 43 papers dealing with the use of assessed by means of mean absolute error. The results of the ANNs for the prediction and forecasting of water resources study show that of the static networks, the mixture density variables, as well as a useful protocol for developing such network is superior for repeatability and insensitivity to models. Their study found that in all but two papers reviewed, parameter settings. Similarly, the recurrent network is generally feedforward networks were used, and that most used the superior to the time delay network in this capacity. However, the backpropagation training algorithm. use of either time delays or feedback results in approximately Sensor data (such as flow) obtained from a WDS is in the 50% less error than a static network for the best performing form of a time series—that is, a data stream consisting of one or networks. 1 more variables whose value is a function of time. Standard industry practice in the UK is to sample at a regular time interval KEYWORDS: Time series prediction, water distribution of 15 minutes to produce a discrete series. Due to opportunities systems, DMA flow, ANN, MLP, MDN, TDNN, Recurrent afforded in recent years near real time

Machine Learning in Water Systems

Neural Networks

Artificial Neural Networks

The Nnlib2 Library and Nnlib2rcpp R Package for Implementing Neural Networks

Neural Network FAQ, Part 1 of 7

Forecasting with Artificial Neural Networks

Uhm Ms 3812 R.Pdf

Neural Networks in Mobile Robot Motion, Pp

Comparative Economic Forecasting with Neural Networks: Forecasting Aggregate Business Sales from S&P 500 and Interest Rates

Artificial Neural Networks Modelling for Monitoring and Performance Analysis of a Heat and Power Plant

Available Toolboxes

Application of Neural Networks to Data Mining

High-Performance Commercial Data Mining: a Multistrategy Machine Learning Application