A Non-Parametric Bayesian Network Hydrologic Model: A Case Study of a Lowland Catchment

Sjoerd Gnodde


MSc Thesis

to obtain the degree of Master of Science at Delft University of Technology, to be defended online on June 29, 2020 at 14:00.¹

by

Sjoerd GNODDE

¹ Please contact the author if you want to attend the defence.

Thesis committee:
dr.ir. O. Morales-Nápoles, TU Delft (Chairman)
dr. M. Hrachowitz, TU Delft
dr. E. Ragno, TU Delft
ir. B. Dekens, Witteveen+Bos
dr.ir. J. Hoch, Universiteit Utrecht

Keywords: Lowland catchments, Hydrologic modeling, Non-parametric Bayesian networks, Data-driven models, Model optimization

Cover design: Anniek Keijer & Sjoerd Gnodde
Photo back cover: The river Aa during the Water-Op-Maat project middenloop Vledder Aa. Photo by Corné Joziasse.

An electronic version of this thesis is available at http://repository.tudelft.nl/.

I believe that we do not know anything for certain, but everything probably.
— Christiaan Huygens

CONTENTS

Summary

Preface

1 Introduction
  1.1 Problem statement and objective
  1.2 Previous research into Bayesian networks and similar methods in hydrology
  1.3 Overview research goals and design

2 Copulas and non-parametric Bayesian networks
  2.1 Bayesian network
  2.2 Sklar's theorem
  2.3 Types of copula
    2.3.1 Gaussian
    2.3.2 Archimedean copulas
  2.4 Fit a copula to data
    2.4.1 Maximum Likelihood Estimation
    2.4.2 Directly from correlation
  2.5 Application of copulas in non-parametric Bayesian networks
  2.6 Multivariate normal method
    2.6.1 Process
    2.6.2 Correlation matrix
    2.6.3 Conditional MVN parameters
  2.7 Random sampling
  2.8 Calculating conditional distributions
    2.8.1 By sampling regularly
    2.8.2 By sampling randomly

3 Case Study
  3.1 Selecting case study
  3.2 Vledder, Wapserveense and Steenwijker Aa
    3.2.1 History
    3.2.2 Recent works in the catchment
    3.2.3 Catchment delineation
    3.2.4 Waterways
    3.2.5 Geology
    3.2.6 Land use
    3.2.7 Climate
    3.2.8 Artificial structures and water fluxes
  3.3 Data
    3.3.1 Overview
    3.3.2 Discharge
    3.3.3 KNMI data
    3.3.4 Precipitation
    3.3.5 Groundwater level
    3.3.6 Surface water level
    3.3.7 Soil moisture
    3.3.8 NDVI
    3.3.9 Processing and filtering
  3.4 Water balance
    3.4.1 Monthly scale
    3.4.2 Complete timeframe - Budyko framework

4 Initial model and performance
  4.1 Initial model
  4.2 Determining performance
    4.2.1 NSE
    4.2.2 Kling-Gupta efficiency
  4.3 k-fold cross-validation
  4.4 Performance per observation

5 Testing the copula assumption
  5.1 Introduction
  5.2 Autocorrelation test
    5.2.1 Unconditional autocorrelation
    5.2.2 Partial autocorrelation
  5.3 Multidimensional Cramér-von Mises test
    5.3.1 Theory
    5.3.2 Results
  5.4 Absolute differences
  5.5 Quadrant Pearson correlation
    5.5.1 Tail dependence in data
    5.5.2 Tail dependence for copula types

6 Selection and implementation of the marginal distribution
  6.1 Empirical cumulative density function
  6.2 Altered logistic function
    6.2.1 Initial values for fitting the CDF
    6.2.2 Optimal number of fit parameters
  6.3 Gaussian mixture model
    6.3.1 Initial values for fitting the CDF
    6.3.2 Optimal number of normal distributions
  6.4 Inverse cumulative function
    6.4.1 Fitting the CDFs to the data
  6.5 Preferred CDF
    6.5.1 Optimal fit
    6.5.2 Extrapolation
    6.5.3 Computational duration
    6.5.4 Conclusion
  6.6 Shifting the CDFs to increase extrapolation
    6.6.1 Shift samples outwards
    6.6.2 Limit shifted range
    6.6.3 Differentiate in extrapolation sides
    6.6.4 Implementation

7 Model parameters and error sensitivity
  7.1 Most likely value from sampled values
  7.2 Number of samples to use
    7.2.1 Good fit of the target variable
    7.2.2 Confidence intervals
    7.2.3 Computational time
  7.3 Subset period data
  7.4 Sensitivity input error
    7.4.1 Systematic errors or bias
    7.4.2 Random errors
    7.4.3 Systematic errors in the discharge measurements

8 Bayesian network layout
  8.1 Criteria
  8.2 Strategy
  8.3 Connections in the model based on influences on the variables
  8.4 Selected implementations
  8.5 Final model

9 Results Bayesian network and benchmark models
  9.1 General results Bayesian network
    9.1.1 Complete k-fold test
    9.1.2 Factors KGE
    9.1.3 Error per observed values
    9.1.4 Not fixing all variables
    9.1.5 Predicting all variables
    9.1.6 Predicting all days
  9.2 Benchmark models
    9.2.1 Saturated Bayesian network
    9.2.2 SOBEK model
    9.2.3 Neural network
    9.2.4 Multiple linear regression
  9.3 Comparison models

10 Discussion
  10.1 Model principles
  10.2 Case study
  10.3 Data
  10.4 Copula assumption and multivariate normal method
  10.5 Marginal distribution
  10.6 Order of parameter optimisation
  10.7 Testing optimisations
  10.8 Bayesian network layout

11 Conclusions and recommendations

References
Acronyms
Symbols

A Anatomy of a Bayesian network
B Other methods to model a Bayesian network with copulas
  B.1 Vine-copulas
  B.2 Multivariate Archimedean copulas
C Derivations and additional equations
D Maps and figures
E So..., is a Bayesian network machine learning?
F Additional model and test results
G Additional information
  G.1 Language report
  G.2 Hardware used
  G.3 Software used
    G.3.1 General software
    G.3.2 Python modules used
  G.4 Programmes written for this thesis

SUMMARY

For many years, improvements have been made in hydrologic modelling in catchments. For a long time, a vast amount of data has been collected, which contributes to the quality of these hydrologic models. In recent decades, this has been complemented by various satellite measurements. This paves the way for a new, data-driven generation of models. This research proposes a non-parametric Bayesian network (NPBN) to model hydrologic processes. The Bayesian network (BN) is a directed, acyclic graph (DAG), in which the variables are represented by the nodes and the conditional probability distributions between variable pairs are represented by the arcs. The graph models a complex implementation of Bayes' theorem. BNs are widely used in complex risk processes, due to their ability to handle complex probabilistic dependencies. More specifically, NPBNs are computationally less expensive than many conceptual hydrologic models and are flexible in handling different continuous data sources. Moreover, variables not directly related to either a water flux or quantity can be implemented directly into an NPBN. An unsaturated BN with an effective layout, which is the type of model used in this thesis, can also predict variables other than the target, retains probability distributions even when other variables are fixed, and models relations between variables that conform to physical relations. Other data-driven methods, among which is the saturated BN, do not have these advantages. The goal of this thesis is to build an NPBN for a lowland catchment and to test its performance. This boils down to the following research question:

What is the optimal setup of a Bayesian network hydrologic model in a lowland catchment, and how does it perform?

The selected case study is the catchment of the Vledder, Wapserveense and Steenwijker Aa, which is located in the north of the Netherlands. This catchment makes this research the first in which an NPBN is comprehensively implemented for (1.) a single catchment in which the catchment processes are modelled, (2.) a Dutch catchment and (3.) a lowland, partially managed, catchment. For the BN model, the following variables are selected and data is acquired: precipitation, temperature, solar radiation, soil moisture, NDVI, groundwater levels from a single well and surface water levels from a single measurement station. The input data combines terrestrial measurements with satellite measurements. Additionally, the monthly maximum daily average discharge (MMDAD) is selected as the target variable. This target variable is used to optimise a number of parameters of the BN throughout this research, with the Kling-Gupta efficiency (KGE) as performance metric. The quality of the data is examined. The data could not be rejected on the basis of the Budyko framework. No further decisive tests on the data could be performed, except for some minor filtering of impossible and implausible values. However, using imperfect data often reflects real-world application of such a model. In this thesis, the testing has been done predominantly a posteriori. This is because the central goal is to make the model predict the MMDAD well, and the other goals for the model are secondary. Therefore, a complete BN is introduced early in the research. This model is

used to analyse all the parameters in the model, which is mostly done through calculation of the KGE, a performance indicator that is widely used in hydrology, of the prediction of the MMDAD.

The method used in this thesis to handle continuous variables in a BN is based on copulas. These are multivariate probability distributions whose uniform marginals are constructed with the cumulative distribution functions (CDFs) of the variables. A wide range of copula types exists. For this thesis, the Gaussian copula was selected to be implemented for all variable combinations, because this type has no tail dependence and allows for the use of the multivariate normal distribution to calculate a conditioned copula. The correlations that are used are the partial, normal rank correlations. This thesis tests the assumption that the Gaussian copula is sufficient for all variable combinations by comparing it with the performance of other copulas. The Gaussian copula is not optimal for all combinations, but does not produce a highly unsatisfying fit either. Moreover, this method is far more convenient than the alternative vine-copula method and most likely gives a better fit than the other alternative, the use of a single Archimedean copula for the combined probability distributions. An exception to the latter is that the Frank copula gave better fits than the Gaussian copula for several of the variable pairs.

Three distributions are compared to model the marginal distributions. These are (1.) the empirical cumulative distribution function (ECDF), which is a step function, (2.) an altered logistic function, which is introduced in this thesis, and (3.) the Gaussian mixture model. The latter has been selected, as it performed equally well as the ECDF when implemented for all variables in predicting the MMDAD and has the possibility to predict data points that are not in the data yet. This turned out to be only marginally the case. Therefore, a novel alteration function has been proposed to shift the predictions, to make the model predict more extrapolated events. In the case of the model proposed in this thesis, a minor shift has been used to predict peak discharges better.

Three parameters of the BN have been analysed and a favourable setting has been chosen for each. First, it was considered whether to use the median or the mean of the predicted variable; using the mean was considered slightly more favourable. Then, it was analysed how many samples to use for a prediction, of which 500 samples were selected. Finally, it was analysed how many days prior to the MMDAD event to take into account for aggregating the other variables; here, 8 days were selected.

Additionally, a sensitivity analysis has been performed to understand what the influence of artificial errors would be. Because usually many datapoints are aggregated per variable, random errors have a marginal influence on the network. Systematic errors do not have any influence on a BN, except that a predicted variable is off by that systematic error. New systematic errors have a bigger influence. For these errors, the sign (plus or minus) of the correlation in combination with the sign of the new systematic error matters: when these are the same, the error is generally smaller.

Criteria for a practical, well-performing BN have been presented and a strategy to create a model that satisfies these criteria has been assembled on the basis of the characteristics of a BN.
This strategy left the selection of some connection implementations open to interpretation. The implementation was chosen by looking at which gave the best prediction of a relevant variable within the network. The final model gave a median, k-fold tested KGE of 0.73 when predicting the MMDAD.

The larger discharges are slightly underestimated; however, this effect is reduced by the shift function. It is also analysed how well the MMDAD is predicted if not all other variables are fixed. Other variables can also be predicted, but not as well as the MMDAD. Another novelty is that a BN model is benchmarked against a SOBEK model, a neural network, and a multiple linear regression model. The SOBEK model, which is in use for flood prediction, performs worse than the BN. The neural network most likely needs more data than is provided in this dataset and therefore also does not perform as well as the BN. Using multiple linear regression is very easy and quick and performs only slightly worse than the BN. However, all these other models lack some of the advantages that the unsaturated BN has.

PREFACE

Finishing a Master's is like the queen stage of a grand tour. You've finished all of the other stages during the Bachelor and the beginning of the Master. Some were more difficult than others, but one thing is certain: the biggest hurdle is the final one. At the same time, this is also the stage in which you can show everything you've learned so far. And it's most definitely a long stage. Therefore, I took a significant time to determine my thesis subject. I wanted it to fit my interests perfectly: combining hydrologic models with data science and statistical models, and a little challenging. This would ensure that I wouldn't get saddle pain halfway and would make sure I could show what I'm capable of. I also decided that I wanted to join up with Witteveen+Bos, such that I could work in an inspiring setting, could ask colleagues for help, could work on a case study in which I was in direct contact with the managers of the area and in which the model I built could actually be implemented.

Now it was time to start the stage. During this stage it is always important to stay in front, work hard, but not too hard, as this might come back to haunt you later. Despite this, some intermediate sprints were done in the run-up to the progress meetings. Although you finish alone, even making a thesis is a team sport. I would like to thank this team very much, because without you my thesis would have been a far less successful individual time trial.

My core teammates consisted of the thesis committee. First of all, I would like to thank Elisa Ragno for our regular meetings that made sure that I was getting the hang of this difficult subject. I want to thank Oswaldo Morales-Nápoles for chairing the committee despite being from a different department and taking time to help me with the technical part when Elisa wasn't available. I also want to thank Jannis Hoch for staying on my committee whilst switching jobs and making sure my academic writing was up to par. Moreover, I also want to thank Markus Hrachowitz for his advice on how to make a statistical model that complies with the physical reality. Then, I also want to thank Bart Dekens for always being available for smaller or larger questions and having been a pleasant colleague. Furthermore, I also want to thank all of my other colleagues at Witteveen+Bos. In particular, I would like to thank Bart van Es for helping me with the method. From Waterschap Drents Overijsselse Delta, I would like to thank Jelle de Jong and Zwannie Visser for letting me use their data and always being available for questions about the catchment and the data. Lastly, I really want to thank Rogier Dinkla for helping me with the writing, and Anniek Keijer for some magnificent designs.

During long stages, there is always the chance of a mechanical problem. In my case, this was the unforeseen coronacrisis. A quick bike change meant that I had to work from different places than the office or home for a long time, but eventually I got the hang of this as well. The final, difficult climb of the thesis was done at home, where I'm currently in the final sprint and have just passed the flamme rouge. Looking over my shoulder, I now have a great view of what's been my thesis, and further back, also my student years. What a ride it has been.

Sjoerd Gnodde
The Hague, June 2020


1 INTRODUCTION

1.1. PROBLEM STATEMENT AND OBJECTIVE
Finding the probability of future events has always been one of humanity's greatest efforts. Especially for areas close to rivers, identifying the chance of a flood can be vital. Therefore, conceptual hydrologic and rainfall-runoff models have been introduced to predict discharges in rivers from precipitation and other variables in the catchment area. Nowadays, as more and more people live in deltas and in the close proximity of rivers, these models are increasingly important. Furthermore, as a consequence of climate change, the frequency and intensity of heavy precipitation events has likely increased in Europe and North America (IPCC, 2014). These circumstances and developments increase the hazards of flooding.

The Netherlands is a country with many small and several large rivers. Because of its flat landscape, low elevation, high economic activity and high population, industry and agriculture density, the consequences of flooding are severe in the Netherlands. This means that reliable, flexible hydrologic models are of great importance to reduce the risk of flooding.

Conventional models in hydrology, such as the HBV model (Bergström, 1976), are based on time-dependent storages and fluxes. Another range of models, often used for (mostly partially or fully managed) systems in the Netherlands, combine rainfall-runoff input with the Saint-Venant equations for modelling the hydrology in an area. For complex systems, both methods can become computationally heavy, because sufficiently small timesteps are needed to make the methods numerically stable. Moreover, they can also become difficult to implement for the user, because he or she has to make assumptions about atmospheric, lithospheric, biospheric and hydrospheric parameters.

Currently, every year more data is collected in the world than ever before (O'Dea, 2020), which also holds true for meteorological and hydrological data. This paves the way for novel, statistical, data-driven models in hydrology. This research proposes non-parametric Bayesian networks (NPBNs) to model a hydrologic process. Bayesian networks (BNs) are widely used in complex risk processes, due to their ability to handle complex probabilistic dependencies. More specifically, NPBNs are computationally less expensive than conceptual models and are flexible in handling different continuous data


sources. In conceptual models, for example, only water fluxes and storage can be part of the model. Other variables have to be translated to either a flux or a storage. In a Bayesian network, variables not directly related to either a water flux or quantity can be implemented directly. For example, solar radiation can be used in a Bayesian network, whereas a conceptual model requires translation and combination into a flux, such as evapotranspiration. On top of that, BNs require only a simple, straightforward combination of variables with logical relations to construct a model. The rest of the probabilistic relations is fitted without the need to quantify more information than is given by the layout and the order of dependence. In contrast to conceptual models, a number of variables that are difficult to measure, such as the effective soil porosity and friction parameters over the whole catchment area, do not have to be measured or inferred from a fit to data. Lastly, BNs use probability distributions throughout the network. By inference of observations, the probability masses of the other distributions are redistributed. This means that it is possible to get a result from the model when not all variables have a known value, and that the final result (of all variables) is not just a number, but a whole probability distribution. This makes, for example, an uncertainty analysis possible. Therefore, Bayesian networks and related methods have slowly been introduced in hydrology over the previous two decades (e.g. Couasnon, 2017; Favre et al., 2004; Molina et al., 2005; Nasr et al., 2018; Paprotny and Morales-Nápoles, 2017; Sanjaya, 2018; Torres Alves, 2018; Yang H. et al., 2002).

Another data-driven method is, for example, the (artificial) neural network (NN). This is a powerful, novel method that is used in e.g. autonomous vehicles and in big data recommendation systems. However, for use as hydrologic models, NNs have some key disadvantages in comparison to the Bayesian network. With large NNs it is difficult for users to comprehend the inner workings. Unexpected results cannot be clarified and calculations in the hidden layers do not always make sense physically. In BNs, on the other hand, the influence of each variable on the others can be instantly and easily interpreted with its (partial) correlation coefficients. The BN is related to (and therefore gives some insight into) the actual processes in the catchment. Most notably, for governmental bodies, models that are difficult to comprehend are highly unfavourable, as they usually have to clarify their decision making process to their inhabitants.

Despite their great potential, BNs have some drawbacks as well. To begin with, BNs often have a less accurate prediction of a single value (mean or median of a distribution) than can generally be expected from common hydrologic models, although specific research into this subject has not yet been conducted. This research does make a comparison between a BN and a conventional model, which in this case is a SOBEK model (see Section 9.2.2). Secondly, BNs can potentially generate very inaccurate results in a number of cases. The reason for this is not always certain, such as in Sanjaya (2018) and Torres Alves (2018), which makes the usage of BNs a greater risk. Moreover, NPBNs usually require a significant amount of data for their quantification.
Because relations in these types of BNs are created solely from the data, an extensive dataset that includes the same timesteps for all variables has to be available. Lastly, BNs are not time dependent. Therefore, storage in the system has to be explicitly implemented in the network, if this has a significant impact on the catchment. As it is nearly impossible to measure the complete storage, or to derive runoff from current events, Bayesian networks usually work at a lower frequency with aggregated values, such that many storage differences can be neglected.

The model proposed in this research deals with all these shortcomings. The target variable in this research is the monthly maximum daily average discharge (MMDAD) out of a catchment. In the case of the main case study, this catchment is that of the Vledder, Wapserveense and Steenwijker Aa (see Chapter 3). The selection of this target variable is an optimum between having enough datapoints, sufficient data aggregation, and being an interesting variable. High discharges often result in hindrance or flooding downstream. For more argumentation about this variable, see Section 3.3.2. The MMDAD is quantitative data, which calls for a quantitative model. Therefore, a so-called non-parametric Bayesian network is used. This is a model in which probability distributions of continuous variables are connected via probabilistic units called copulas.

In conclusion, the goal of this research is to create an optimal Bayesian network for a lowland catchment in the Netherlands, to test its performance, and to benchmark it against a SOBEK model. This boils down to the following general research question:

What is the optimal setup of a Bayesian network hydrologic model in a lowland catchment, and how does it perform?

1.2. PREVIOUS RESEARCH INTO BAYESIAN NETWORKS AND SIMILAR METHODS IN HYDROLOGY
The term Bayesian network, and a description of its characteristics, was first introduced by Pearl (1985). This formulation only contained discrete variables. Conditional probabilities between connected variables had to be explicitly defined. Prior to this, copulas had been introduced by Sklar (1959). Copulas can be used to link multiple empirical, continuous distributions and are used in this thesis as well. Over time, multiple types of copulas have been created, of which many are featured, in an introductory manner, in Nelsen (2006).

NPBNs were first introduced by Kurowicka and Cooke (2005), sampling vines has been described by Kurowicka and Cooke (2007) and the method has been further improved upon by Hanea et al. (2015). The main software package to use NPBNs is UniNet. This programme can handle NPBNs that work with discrete or continuous variables, or a combination of both. It was initially developed at the TU Delft by lead developer Dan Ababei and is now available from the website of his company LightTwist¹ with a free academic licence or a paid professional licence. Cooke et al. (2007) introduce and describe UniNet.

The method of conditioning a BN based on the Gaussian copula with a conditional multivariate normal distribution (see Section 2.6) has been introduced by Hanea et al. (2006). The conditional multivariate normal distribution itself was first described by Eaton (1983).

Bayesian statistics were first introduced in hydrology by, among others, Krzysztofowicz (2002). This work still had a typical hydrologic model at its core and mainly used Bayesian statistics for the probability distribution of the precipitation input and the river stage output. In the same year, Yang H. et al. (2002) created a Bayesian network to model desertification. This is a model that also features societal and land use changes, and has discrete states for each of the

¹ https://lighttwist-software.com/uninet/

variables.

Separately, several scholars have used copulas to model different hydrologic processes. One of the first to make an interesting contribution were Kelly and Krzysztofowicz (1997), who used a method related to copulas in hydrology. Actual copulas were introduced, among others, by Salvadori and De Michele (2002). An early overview of different uses of copulas in hydrology has been made by Renard and Lang (2007). They only acknowledge field significance determination, regional risk analysis, Discharge-Duration-Frequency models and regional frequency analysis, so not a complete hydrologic (rainfall-runoff) model. Salvadori and De Michele (2007) also give an overview in the same year, where they acknowledge the use of copulas for partial processes in the hydrologic system, such as return periods of bi- and trivariate events, similarly to work they did before (Salvadori and De Michele, 2004).

Pioneering work in using Bayesian networks as hydrologic models has been done by Molina et al. (2005), whose complete emergency decision support model featured a rudimentary hydrologic model. Then, Paprotny and Morales-Nápoles (2017) made a basic Bayesian network as a hydrologic model with data from more than 1800 European catchments and the annual maximum discharge per catchment as target value. The inputs used were the maximum precipitation event and several catchment characteristics, such as the area, the steepness and the number of lakes and marshes. This research gave some promising results for Bayesian networks as hydrologic models, but still did not achieve the accuracy of results of purposefully built conventional hydrologic models. The main purpose of the models was to give an educated estimate of annual runoff maxima. Couasnon (2017) and Couasnon et al. (2018) built upon this research. They performed research similar to Paprotny and Morales-Nápoles (2017), but in the contiguous United States. Sanjaya (2018) did a closely related case study in Java, in which a Bayesian network gave poor results for this specific case, which was equally true for the case study of Torres Alves (2018) in Ecuador. Nasr et al. (2018) use a similar method, but more successfully; in this research, the case study was in the Magdalena-Cauca Basin in Colombia.

1.3. OVERVIEW RESEARCH GOALS AND DESIGN
This thesis builds upon the research presented in the previous section. Its contribution is that the method is for the first time comprehensively implemented for (1.) a single catchment, (2.) a Dutch catchment and (3.) a lowland, partially managed, catchment. For this specific type of catchment, an optimal network layout is sought and the different copulas are compared for this case. Furthermore, a whole new marginal probability distribution is introduced (Section 6.2) and another fit function for the copula is implemented (Section 6.3). This is done to create a model that is less directly tied to the data it is trained with and is able to extrapolate better, so that it can predict high discharge peaks. Moreover, to make a more optimal model, there is still more need for extrapolation; this is handled via a new shift function in Section 6.6. Then, to further optimise the model and determine its performance when parameters change, a comprehensive parameter sensitivity analysis with several goodness-of-fit methods is executed for this case study (Chapter 5). Finally, another novelty is that a BN model is benchmarked against a SOBEK model (Section 9.2.2), a neural network (Section 9.2.3), and a multiple linear regression model (Section 9.2.4). The majority of the calculations is done with

the Python package copulabayesnet², which has been written for this thesis.

In this research, the theory of the method is introduced first, because data and case study decisions are based upon this method. First, the concept of copulas is introduced in Section 2.2. These are essential in connecting correlated parameters in non-parametric Bayesian networks. Several types of copulas are introduced, fit methods are explained and the step to the Bayesian networks is made in Appendix B.1 and Section 2.6.

Afterwards, the case study is introduced in Chapter 3. The catchment is located in the area managed by the water board Waterschap Drents Overijsselse Delta (WDODelta), in the Dutch provinces of Drenthe and Overijssel, adjoining the province of Friesland. For this research, WDODelta has kindly made all their data available. This is a very broad and useful dataset, and this catchment is therefore highly suitable for a case study. Alongside the WDODelta data, other sources of data are used. The variables are explained in this chapter and their influence on the hydrologic system is discussed. All of the data is filtered and tested with a water balance, which is also featured in this chapter. To test the parameters to be used, a first model is introduced. Finally, the metric of performance of any model is introduced in Section 4.2.

To select the optimal type of copulas to use in the model, a number of goodness-of-fit tests are introduced and implemented in Chapter 5. The performance of the copulas is also calculated, and possible weak points of the copulas are pointed out. A copula has uniform marginals, which calls for a function to project the data from its original shape to uniform values. This is a cumulative distribution function (CDF) type of equation, of which three types are introduced in Chapter 6: two probability distributions are fitted to the data and one is a step function directly acquired from the data. All parameters are optimised and the optimal function for the model is selected.

In Chapter 7, all other parameters within the model are thoroughly tested, and the parameter setting that gives the best fit to predict the MMDAD is chosen. Moreover, an analysis is made of the variables to use, and how to combine these in the final model, in relation to what gives the best performance and the most insight into the hydrologic processes in the catchment. The results of the final model are presented in Chapter 9. In the same chapter, the SOBEK model, the neural network and the multiple linear regression model, against which the model proposed in this research is benchmarked, are introduced. Afterwards, the results are discussed and put into context in Chapter 10, and a conclusion of this research is drawn, from which some recommendations follow, both in Chapter 11.

² pip install copulabayesnet, available at https://github.com/SjoerdGn/copulabayesnet; see also Appendix G.4.

2 COPULAS AND NON-PARAMETRIC BAYESIAN NETWORKS

2.1. BAYESIAN NETWORK
A Bayesian network (BN) is a probabilistic model that is graphically represented in the form of a directed, acyclic graph (DAG). These are graphs in which the edges also have an orientation (denoted with an arrow) and in which it is not possible to return to the same node following the arrows (Hanea et al., 2006).

Figure 2.1: Example of a DAG. The connections have a defined direction and no cycles can be made following the direction of the arrows. In the case of a BN, the nodes are variables and the arcs are conditional distributions. See also Appendix A for a more in-depth explanation of the BN layout.

A BN is based around a number of variables (the nodes) and the conditional correlations between them. The method offers great flexibility in handling variables from different sources and uses probability distributions of variables throughout the network, also when the network is conditioned (i.e. certain variables are fixed). In this research, it is implemented on a relatively flat, partly managed, lowland catchment. It is examined how well it performs in predicting the discharge out of this catchment and how well it models other hydro-meteorological processes.


Most variables that can be measured in a catchment area are continuous variables. A type of BN that handles these continuous variables is the non-parametric Bayesian network (NPBN). This method accepts any distribution, as long as the continuous variables have invertible distributions (Hanea et al., 2015). This flexibility is the reason why this type of BN is used in this research. The connections, which are conditional probability distributions, are represented in an NPBN by multivariate probability distributions with uniform marginals, called copulas. These were introduced by Sklar (1959).

2.2. SKLAR'S THEOREM
A model to connect multiple empirical distributions is the copula. This was first defined by Sklar (1959), in what is now called Sklar's theorem. It states that every multivariate distribution with d variables (v_1,...,v_d), with the marginals F_1(v_1),...,F_d(v_d), can be written as:

F (v1,...,vd ) = C (F1(v1),...,Fd (vd )). (2.1)

The function C(·,...,·) is the cumulative representation of the copula. For the probability density case, using the chain rule, the distribution is now defined as

f(v_1,...,v_d) = c_{1,...,d}(F_1(v_1),...,F_d(v_d)) · ∏_{i=1}^{d} f_i(v_i), (2.2)

(Aas et al., 2009). Acquiring these uniform marginals from a variable means that F(·) is the cumulative distribution function (CDF) of this variable. The uniform¹ marginals (u ∈ [0,1] and equally distributed) are also defined as u:

u = F (v). (2.3)

This gives that the probability density of a value of a variable v is distributed as f(v). Three types of functions are proposed for F(v) in Chapter 6. An example of the visual representation of a copula, in this case the Gaussian or normal copula, can be seen in Figure 2.2. Copulas have the advantage that any probability distribution can be used to relate variables, as long as the CDF is defined. This makes for a great method in a complicated system where not all complex relations between variables are easily expressible in other mathematical forms. In Figure 2.3, a visual example can be found of the construction of a bivariate copula.
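As an illustration of Equation (2.3), the minimal Python sketch below converts observations to uniform marginal values with a rank-based empirical CDF. It is not the copulabayesnet implementation used in this thesis; the synthetic data, the plotting position rank/(n+1) and all variable names are illustrative assumptions.

import numpy as np

def ecdf_uniform(values):
    # Map observations to (0, 1) with an empirical CDF (Equation (2.3)).
    # The plotting position rank / (n + 1) keeps values strictly inside (0, 1),
    # so the later transformation to standard normal values stays finite.
    values = np.asarray(values, dtype=float)
    ranks = np.argsort(np.argsort(values)) + 1  # rank of each observation
    return ranks / (len(values) + 1)

# Example: two correlated (hypothetical) variables converted to uniform marginals
rng = np.random.default_rng(seed=0)
discharge = rng.gamma(shape=2.0, scale=3.0, size=200)         # MMDAD-like sample
radiation = 0.3 * discharge + rng.normal(0.0, 1.0, size=200)  # solar-radiation-like sample

u1 = ecdf_uniform(discharge)
u2 = ecdf_uniform(radiation)
print(u1.min(), u1.max())  # both strictly inside (0, 1)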

2.3. TYPES OF COPULA
Over the years, many different copula types have been proposed, which are different formulas that can be used to describe Equations (2.1) and (2.2). In this thesis, the Gaussian copula is used, which assumes a multivariate normal distribution between the inverse standard normal transforms of the marginals. However, there is also the family of Archimedean copulas, of which many have a different dependence in each opposite corner. This is called tail dependence.

¹ In this thesis, values that originate from the uniform distribution between 0 and 1 are called uniform values. However, in the case of a conditioned copula, these values do not follow a uniform distribution anymore. But as they have a similar role in association with a copula, they are still called uniform values. Any value u that can be converted to a value of a certain variable with F_i^{-1}(u) is called a uniform value in this thesis.


(a) Probability density function (PDF) (c(u1,u2)) (b) CDF (C(u1,u2))

Figure 2.2: Examples of a bivariate Gaussian copula. The CDF is plotted in a 2-dimensional colourmap, because this gives a better comparison between the different copulas. Note that the colour scale is different for the PDF and the CDF and that the 3-dimensional PDF graph is cut off at p = 5.

The main purpose of testing these copulas is to test the assumption that the Gaussian copula is sufficient for all variable combinations. Some variable combinations potentially have tail dependence. If Archimedean copulas give better fits to these variables, it suggests that the Gaussian copula is imperfect. Other copula types, such as copulas based on mixture distributions, will not be analysed in this research.

2.3.1. GAUSSIAN
Gaussian copulas are symmetrical copulas that are based on the Gaussian or normal distribution. For a correlation matrix R (of size d × d), which represents the correlation of all variable pairs (see Section 2.6.2), the CDF of the copula (C) and the PDF (c) are defined as

C_R^{Ga}(u) = Φ_R(Φ^{-1}(u_1),...,Φ^{-1}(u_d)), (2.4a)

c_R^{Ga}(u) = 1/√|R| · exp(−½ · Q^T · (R^{-1} − I) · Q), (2.4b)

where |·| denotes the determinant, I is the identity matrix,

Q = [Φ^{-1}(u_1), ..., Φ^{-1}(u_d)]^T,

and Φ^{-1}(·) is the inverse CDF of the standard normal distribution (Arbenz, 2013).
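The sketch below evaluates the Gaussian copula density of Equation (2.4b) with NumPy and SciPy. It is a hedged illustration rather than the thesis implementation; the correlation value 0.56 is only borrowed from Figure 2.3 as an example.

import numpy as np
from scipy.stats import norm

def gaussian_copula_pdf(u, R):
    # Density of the Gaussian copula, Equation (2.4b).
    # u : array of shape (n, d) with uniform marginal values in (0, 1)
    # R : (d, d) correlation matrix
    q = norm.ppf(np.asarray(u))                    # Q = Phi^{-1}(u), shape (n, d)
    R_inv = np.linalg.inv(R)
    d = R.shape[0]
    # quadratic form Q^T (R^{-1} - I) Q for every row of q
    quad = np.einsum("ij,jk,ik->i", q, R_inv - np.eye(d), q)
    return np.exp(-0.5 * quad) / np.sqrt(np.linalg.det(R))

# Example for a bivariate copula with correlation 0.56
R = np.array([[1.0, 0.56],
              [0.56, 1.0]])
print(gaussian_copula_pdf([[0.7, 0.8], [0.1, 0.9]], R))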

[Figure 2.3 panels: scatter plot of the uniform marginals (MMDAD and solar radiation), PDF of the fitted Gaussian copula (r = 0.56), ECDF of the MMDAD, and ECDF of the solar radiation (axes flipped).]

Figure 2.3: Example of two variables, the monthly maximum daily average discharge (MMDAD), see Section 3.3.2, and the solar radiation (see Section 3.3.3), that are converted to uniform values via Equation (2.3). In this case, the empirical cumulative distribution function (ECDF) is used. In Chapter 6, other versions of the CDF are introduced as well. Note that for the solar radiation variable, the axes are flipped in comparison to the common display of a CDF function, to share the axis with the graph above and project the values onto the scatter plot (similarly to an exceedance probability graph). The two sets of uniformly distributed values form the uniform marginals of a copula, of which the PDF is shown in this figure.

2.3.2. ARCHIMEDEAN COPULAS
Archimedean copulas generally do not have a straightforward multivariate (more than 2 variables) version. Besides some complicated multivariate extensions, other options respectively make use of vine-copulas (see Appendix B.1) or meta-elliptical copulas (Aas et al., 2009). Archimedean copulas are characterised by the generator that each copula type has and by the fact that they have a straightforward closed form for the bivariate case. Nelsen (2006) recognises 22 types of (1-parameter) Archimedean copulas, which are all initially defined for the bivariate case. The parameter that defines the shape of the copula is denoted with θ. In this research, only a selection of these copulas is considered for the case study model.

GUMBEL-HOUGAARD COPULA
Gumbel-Hougaard copulas have upper tail dependence: this means they have a higher dependence in the north-east corner than in the south-west corner. See Figure 2.4.


(a) PDF (c(u1,u2)) (b) CDF (C(u1,u2))

Figure 2.4: Examples of a bivariate Gumbel-Hougaard copula.

(a) PDF (c(u1,u2)) (b) CDF (C(u1,u2))

Figure 2.5: Examples of a bivariate Clayton copula.

C_θ^{GH}(u_1,u_2) = exp(−[(−ln u_1)^θ + (−ln u_2)^θ]^{1/θ}), for θ ∈ [1,∞). (2.5)

CLAYTON COPULA
Clayton copulas have a high tail dependence in the south-west corner. See Figure 2.5.

C_θ^{Cl}(u_1,u_2) = [max(u_1^{−θ} + u_2^{−θ} − 1, 0)]^{−1/θ}, for θ ∈ [−1,∞)\{0}. (2.6)

FRANK COPULA
Frank copulas are similar to their Gaussian counterpart and have no tail dependence (Embrechts et al., 2001). See Figure 2.6.

(a) PDF (c(u1,u2)) (b) CDF (C(u1,u2))

Figure 2.6: Examples of a bivariate Frank copula.

C_θ^{Fr}(u_1,u_2) = −(1/θ) · ln(1 + ((e^{−θ·u_1} − 1)(e^{−θ·u_2} − 1)) / (e^{−θ} − 1)), for θ ∈ (−∞,∞)\{0}. (2.7)

JOE COPULA
Joe copulas have a high tail dependence in the north-east corner. See Figure 2.7.

(a) PDF (c(u1,u2)) (b) CDF (C(u1,u2))

Figure 2.7: Examples of a bivariate Joe copula.

C_θ^{J}(u_1,u_2) = 1 − [(1 − u_1)^θ + (1 − u_2)^θ − (1 − u_1)^θ · (1 − u_2)^θ]^{1/θ}, for θ ∈ [1,∞). (2.8)
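The four bivariate Archimedean CDFs of Equations (2.5) to (2.8) translate directly into code. The sketch below is an illustrative NumPy version, not the implementation used in this thesis; the parameter values in the example call are arbitrary.

import numpy as np

def gumbel_hougaard_cdf(u1, u2, theta):
    # Equation (2.5), theta in [1, inf)
    return np.exp(-(((-np.log(u1)) ** theta + (-np.log(u2)) ** theta) ** (1.0 / theta)))

def clayton_cdf(u1, u2, theta):
    # Equation (2.6), theta in [-1, inf) excluding 0
    return np.maximum(u1 ** (-theta) + u2 ** (-theta) - 1.0, 0.0) ** (-1.0 / theta)

def frank_cdf(u1, u2, theta):
    # Equation (2.7), theta in (-inf, inf) excluding 0
    num = (np.exp(-theta * u1) - 1.0) * (np.exp(-theta * u2) - 1.0)
    return -np.log(1.0 + num / (np.exp(-theta) - 1.0)) / theta

def joe_cdf(u1, u2, theta):
    # Equation (2.8), theta in [1, inf)
    a, b = (1.0 - u1) ** theta, (1.0 - u2) ** theta
    return 1.0 - (a + b - a * b) ** (1.0 / theta)

# All four evaluated at the same point for comparison
print(gumbel_hougaard_cdf(0.7, 0.8, 2.0),
      clayton_cdf(0.7, 0.8, 2.0),
      frank_cdf(0.7, 0.8, 5.0),
      joe_cdf(0.7, 0.8, 2.0))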

NEGATIVELY CORRELATED VARIABLES
For a number of Archimedean copulas, it is not possible to create a copula with a negative correlation. Therefore, in this thesis, in the case of negative correlation, one of the variables is rotated around 0 (multiplied by −1, which is equal to u_r = 1 − u). It does matter, of course, which of the variables is rotated. However, as the main goal of Chapter 5 is to test the assumption that Gaussian copulas are optimal to use for this case study, only one of the variables is rotated. This is sufficient for this research because the Clayton and Joe copulas are almost the opposite of each other. In addition, the Gumbel and Clayton copulas also model an opposite tail dependence. Therefore, the effect of determining whether the Gaussian copula is optimal is already achieved by rotating one variable.

2.4. FIT A COPULA TO DATA
In order to have the copula describe the multivariate distribution of a number of variables well, the copula has to be fitted to the data. A number of methods is available to do so. The preferred method depends on preference and on the type of copula that is used. This section discusses two methods and argues which method should be used on which occasion.

2.4.1. MAXIMUM LIKELIHOOD ESTIMATION
A likelihood function is intuitively defined as follows: the likelihood of drawing this empirical combination, given the copula of choice with its parameter ϑ. This is a useful function as it takes all values of the variable into consideration.

CANONICAL MAXIMUM LIKELIHOOD ESTIMATION
The Canonical Maximum Likelihood Estimation (CMLE) is the maximum likelihood function that is made for copulas, and it is therefore used as the preferred function to fit most of the copulas. In this research, the log-likelihood is taken. This means that the logarithm of the copula values is taken. The maximum likelihood estimate of the parameter ϑ, denoted ϑ̂, is defined as follows (SAS Institute, 2017):

ϑ̂ = argmax_{ϑ∈Θ} ∑_{i=1}^{m} log c_ϑ(u_{i,1},...,u_{i,d}), (2.9)

where m is the number of observations in the training set.

2.4.2. DIRECTLY FROM CORRELATION
For Gaussian copulas, the parameter that has to be fitted is the correlation coefficient, or a correlation matrix R (see Equation (2.4)). This coefficient can be fitted to the data directly with one of the previous two methods. However, it is also possible to calculate the correlation coefficient directly. Three different types of correlation coefficients are discussed in this thesis.

An example of a likelihood function for a copula is to calculate the probability of each combination of uniform values in the PDF of the copula and to sum these. When the logarithm of each of these values is taken, the method tempers the high peaks that a copula can have, such that the fit is not dominated by the errors in the corners. To find the maximum likelihood, a Nelder-Mead algorithm can be implemented, for example.
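As a sketch of such a fit, the snippet below maximises the log-likelihood of Equation (2.9) for a bivariate Gaussian copula with SciPy's Nelder-Mead optimiser. It is an illustration under simplifying assumptions (bivariate case, synthetic pseudo-observations), not the copulabayesnet routine used in this thesis.

import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

def gaussian_copula_loglik(rho, u):
    # Log-likelihood of Equation (2.9) for a bivariate Gaussian copula
    if not -0.99 < rho < 0.99:
        return -np.inf
    q = norm.ppf(u)                                   # normal scores, shape (m, 2)
    R = np.array([[1.0, rho], [rho, 1.0]])
    quad = np.einsum("ij,jk,ik->i", q, np.linalg.inv(R) - np.eye(2), q)
    return np.sum(-0.5 * quad - 0.5 * np.log(np.linalg.det(R)))

def fit_gaussian_copula_cmle(u, start=0.0):
    # Canonical MLE of the correlation parameter via a Nelder-Mead search
    result = minimize(lambda p: -gaussian_copula_loglik(p[0], u),
                      x0=[start], method="Nelder-Mead")
    return result.x[0]

# u is an (m, 2) array of uniform marginal values, e.g. from an ECDF
rng = np.random.default_rng(seed=1)
z = rng.multivariate_normal([0, 0], [[1, 0.6], [0.6, 1]], size=300)
u = norm.cdf(z)                                       # synthetic pseudo-observations
print(fit_gaussian_copula_cmle(u))                    # should be close to 0.6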

PEARSON CORRELATION COEFFICIENT
Pearson correlation is based upon how much the variables show linear correlation and is the most widely used. Its coefficient is defined according to Equation (2.10).

ρ_p(V′_1, V′_2) = cov(V′_1, V′_2) / (σ_{V′_1} · σ_{V′_2}), (2.10)

where cov(V′_1, V′_2) is the covariance between V′_1 and V′_2 and σ denotes the standard deviation of a variable. A Pearson correlation of 1 is obtained when parameter V′_2 is a direct positive linear translation of parameter V′_1.

SPEARMAN'S RANK CORRELATION COEFFICIENT
Spearman's rank correlation coefficient is defined according to Equation (2.11).

r_s(V′_1, V′_2) = ρ_p(rg_{V′_1}, rg_{V′_2}) = cov(rg_{V′_1}, rg_{V′_2}) / (σ_{rg_{V′_1}} · σ_{rg_{V′_2}}), (2.11)

where the only difference with Equation (2.10) is that the covariance is not taken from the values that make up the variable, but from the ranks of the values (rg). The rank is the position of a value in the list if all values were ordered from low to high. This has the advantage that any two variables that are not related linearly, but through a strictly increasing function, still have a correlation coefficient of 1. This is exactly what is done by creating a copula, as the CDF F(·) also orders all values by rank.

NORMAL RANK CORRELATION
Thirdly, there is the normal rank correlation, which is defined as follows (Hanea and Harrington, 2009):

r_{s,norm}(U′_1, U′_2) = (6/π) · arcsin( ρ_p(Φ^{-1}(U′_1), Φ^{-1}(U′_2)) / 2 ), (2.12)

where U′ is the vector of the uniform values of a variable (also defined in Equation (2.13b)). The normal rank correlation is the rank correlation between variables V′_1 and V′_2 if their relation were perfectly described by the Gaussian copula. Therefore, this is the correlation coefficient that is used for Gaussian copula models. However, the R-matrix for a Bayesian network is not necessarily constructed from r_{s,norm} for each variable pair. This topic is covered in Section 2.6.2.
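For comparison, the sketch below computes the three correlation coefficients on the same sample, with the normal rank correlation of Equation (2.12) built from the Pearson correlation of the normal scores. The synthetic data and the ECDF helper are illustrative assumptions, not part of the thesis code.

import numpy as np
from scipy.stats import norm, pearsonr, spearmanr

def normal_rank_correlation(u1, u2):
    # Normal rank correlation of Equation (2.12); u1, u2 are uniform marginal values
    rho_normal_scores = pearsonr(norm.ppf(u1), norm.ppf(u2))[0]
    return 6.0 / np.pi * np.arcsin(rho_normal_scores / 2.0)

def ecdf_uniform(v):
    # rank-based uniform values strictly inside (0, 1)
    return (np.argsort(np.argsort(v)) + 1) / (len(v) + 1)

# Comparison of the three coefficients on a monotone but non-linear relation
rng = np.random.default_rng(seed=2)
x = rng.gamma(2.0, 3.0, size=500)
y = np.exp(0.2 * x) + rng.normal(0.0, 0.5, size=500)

print("Pearson:     ", pearsonr(x, y)[0])
print("Spearman:    ", spearmanr(x, y)[0])
print("Normal rank: ", normal_rank_correlation(ecdf_uniform(x), ecdf_uniform(y)))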

2.5. APPLICATION OF COPULAS IN NON-PARAMETRIC BAYESIAN NETWORKS
In an NPBN, the multivariate probability distribution is modelled by a copula or copulas. There is a number of approaches to apply these copulas in a BN, such that the network can be conditioned. In this section, three of these are discussed and the preferred method is selected for this thesis:

• Model the network as a single Gaussian copula. The Gaussian copula has a clear definition for multivariate (more than 2 variables) usage. Because its parameter (R) is a matrix with all the variable combinations, it is possible to implement different parameters per combination (see Equation (2.4)). Moreover, the Gaussian copula also offers the possibility to use the multivariate normal method, which offers great flexibility in conditioning (see Section 2.6). Therefore, this method is chosen for this research. The assumption that this type of copula fits the data sufficiently well is verified in Chapter 5.

• Model the BN as one single multivariate Archimedean copula. This is unfavourable as not all Archimedean copulas have a straightforward definition in the multivariate form and many have specific tail dependence, which is likely not to perform well for all variable combinations. Moreover, using a single Archimedean copula requires a single fit parameter θ to be used for all variable pairs, which most likely will not suit all combinations.

• To make the implementation of different Archimedean copulas possible in a BN, a complex conjunction of copulas must be implemented. The principal method to do this is the vine-copula. This method combines bivariate copulas with so-called vines. However, when using for example 8 variables, such as proposed in Table 4.1, this method requires 20160 different vines to be calculated (see Equation (B.3) in the appendix). This is highly unfavourable. See Appendix B.1 for a more in-depth coverage of vine-copulas.

2.6. MULTIVARIATE NORMAL METHOD
The (multidimensional) Gaussian copula is defined by the multivariate CDF of the inverse standard normal CDFs of the variables (see Equation (2.4a)). Conditioning of this copula can therefore be done by using the multivariate normal (MVN) distribution (Φ_R) directly. This makes it more straightforward to calculate a network which is not conditioned on all variables (Hanea et al., 2006).

2.6.1. PROCESS
For a set of values (one value per variable) V = [v_1,...,v_d]^T, U = [u_1,...,u_d]^T is defined as a vector with the uniform values u of the corresponding values v of these parameters:

u_i = F_i(v_i), (2.13a)

U = [F_1(v_1),...,F_d(v_d)]^T. (2.13b)

These uniform values are then transformed to standard normal values² with the inverse cumulative distribution function of the standard normal distribution Φ (i.e. N(0,1)):

s_i = Φ^{-1}(u_i), (2.14a)

S = Φ^{-1}(U). (2.14b)

² As a generalised term similar to uniform values, standard normal values are any values s which are related to uniform values via s = Φ^{-1}(u).

Now a conditional value v_k, given the conditioning (fixed) values V_c = [v_{c,1},...,v_{c,z}]^T, is distributed according to G_k:

v_k | V_c ∼ G_k(S_c) = F_k^{-1}(Φ(s_k | S_c)), (2.15)

where S_c is defined by Equation (2.14b), s_{1,...,z} are the values of the z conditioning variables, s_k follows from Equation (2.14a) and the conditioning is done with:

s_i | S_c ∼ N(µ̄(U_c), R̄(U_c)). (2.16)

In this equation, µ̄(U_c) is defined according to Equation (2.18) and R̄(U_c) according to Equation (2.19) in Section 2.6.3 (Hanea et al., 2006). Figure 2.8 contains a visual overview of the multivariate normal method.

2.6.2. CORRELATION MATRIX
The correlation matrix R of the BN is based on the partial correlations between all variables (Hanea et al., 2006). For each position in the correlation matrix, the generic correlation coefficient r for the variables V′_1 and V′_2, given the other variables (denoted in the subscripts), is defined as:

r_{12;3,...,d} = (r_{12;4,...,d} − r_{13;4,...,d} · r_{23;4,...,d}) / √((1 − r²_{13;4,...,d}) · (1 − r²_{23;4,...,d})). (2.17)

In this case, the correlation coefficient r that is used is the normal rank correlation r_{s,norm} (Equation (2.12)). Now, because the correlation is dependent on other partial correlation coefficients, this becomes a recursive formula, for which the order of correlations between parameters matters. In practical terms, the rank order of arrows in the BN affects the values in this matrix, and therefore the outcome of this model. In this research, the matrix R is acquired by putting the model into the commercially available software Uninet. Uninet enables ordering the variables manually.

2.6.3. CONDITIONAL MVN PARAMETERS
As Gaussian copulas are used, the uniform marginals can be rewritten as a Gaussian distribution with mean µ = 0 and standard deviation σ = 1, that is Φ^{-1}(u). The conditional multidimensional Gaussian distribution is then defined with mean µ̄, as follows:

µ̄ = µ_k + R_{kc} R_{cc}^{-1} (A − µ_c), (2.18)

and with correlation matrix R̄:

R̄ = R_{kk} − R_{kc} R_{cc}^{-1} R_{ck}, (2.19)

where µ_k = [0,...,0]^T with length (d − z), in which d is the number of variables in the multivariate distribution and the number of nodes in the BN, and z is the number of variables which are fixed, such that (d − z) is equal to the number of unfixed, conditional variables; µ_c = [0,...,0]^T with length z, as the means of the initial distribution are 0; and A = [a_1,...,a_z]^T is made up of the z standard normal values of the conditioning variables:

a_i = Φ^{-1}(F_i(v_{c,i})) = Φ^{-1}(u_{c,i}). (2.20)

[Figure 2.8 panels: ECDF of the MMDAD, CDF of the standard normal distribution, CDF of the standard normal distribution (flipped), bivariate normal distribution (r = 0.56), and ECDF of the solar radiation (axes flipped).]

Figure 2.8: Example of two variables, the MMDAD (see Section 3.3.2) and the solar radiation (see Section 3.3.3), that are converted to uniform values via Equation (2.3) and then converted into standard normal values via the inverse CDF of the standard normal distribution. In this case, the empirical CDF is used. In Chapter 6, other versions of the CDF are introduced as well. The dark grey striped line represents an example timestep with v_MMDAD = 8.0 and v_SR = 1.0, which is plotted in the multivariate normal space as (1.17, -0.93) (in the plot visible as (-0.93, 1.17) because the axes flip). The combination of the standard normal values of both variables is modelled by the multivariate normal distribution, of which the PDF is shown in this figure.

Furthermore, the subsets of R are defined as follows:

R = [ R_{kk}  R_{kc} ]
    [ R_{ck}  R_{cc} ]

with block sizes

    [ (d−z)×(d−z)   (d−z)×z ]
    [ z×(d−z)       z×z     ],   (2.21)

where R is the correlation matrix of the network. The submatrices R_{kk},...,R_{cc} consist of the indices of the fixed (conditioning) variables where a size z is given, with the corresponding subscript c, and of the indices of the variables with a conditioned distribution where a size (d − z) is given, with the corresponding subscript k (Helwig, 2017). Thus, the locations of the subsets of R are not necessarily in the same position as shown in Equation (2.21). This way, a network can be conditioned with the R of the network (see Section 2.4.2) and any number of conditioning values.
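A minimal NumPy sketch of Equations (2.18) and (2.19) is given below, assuming the means µ_k and µ_c are zero as stated above. The function name, the example correlation matrix and the index convention are illustrative assumptions, not the thesis implementation.

import numpy as np

def conditional_mvn(R, fixed_idx, a):
    # Conditional mean and correlation matrix, Equations (2.18) and (2.19).
    # R         : (d, d) correlation matrix of the network
    # fixed_idx : indices of the z conditioning (fixed) variables
    # a         : standard normal values of the fixed variables, Equation (2.20)
    d = R.shape[0]
    fixed_idx = list(fixed_idx)
    free_idx = [i for i in range(d) if i not in fixed_idx]

    R_kk = R[np.ix_(free_idx, free_idx)]
    R_kc = R[np.ix_(free_idx, fixed_idx)]
    R_ck = R[np.ix_(fixed_idx, free_idx)]
    R_cc = R[np.ix_(fixed_idx, fixed_idx)]

    R_cc_inv = np.linalg.inv(R_cc)
    mu_bar = R_kc @ R_cc_inv @ np.asarray(a)     # Equation (2.18), with mu_k = mu_c = 0
    R_bar = R_kk - R_kc @ R_cc_inv @ R_ck        # Equation (2.19)
    return mu_bar, R_bar, free_idx

# Three-variable example: condition the first variable on the other two
R = np.array([[1.0, 0.5, 0.3],
              [0.5, 1.0, 0.4],
              [0.3, 0.4, 1.0]])
mu_bar, R_bar, free_idx = conditional_mvn(R, fixed_idx=[1, 2], a=[1.2, -0.4])
print(mu_bar, R_bar)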

2.7. RANDOM SAMPLING
For the verification of the copula (Section 5.3) and the conditioning of a copula (Section 2.8), the copula has to be sampled randomly. According to Embrechts et al. (2001), n random samples of a Gaussian copula, [u_1,...,u_n]^T ∼ C_R^{Ga}, can be generated as follows:

1. Find the lower triangular matrix L of the Cholesky decomposition of the correlation matrix R.

2. Draw n independent random values W = [w_1,...,w_n]^T from N(0,1).

3. Set S = LW, with S = [s_1,...,s_n]^T.

4. Now u_i = Φ(s_i) for i = 1,...,n.

Calculating n random samples of a multivariate normal distribution is very similar (Gentle, 2009):

1. Find the lower triangular matrix L of the Cholesky decomposition of the conditioned correlation matrix \bar{R}.

2. Draw n independent random values W = [w_1, \ldots, w_n]^T from N(0,1).

3. Set S = LW + \bar{\mu}, with S = [s_1, \ldots, s_n]^T.
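A minimal numpy sketch of these two sampling procedures is given below; the correlation matrix is a made-up example, not the fitted matrix of the case study.

import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(seed=0)

def sample_gaussian_copula(R, n):
    """Draw n samples of a Gaussian copula with correlation matrix R."""
    L = np.linalg.cholesky(R)                 # step 1: Cholesky factor
    W = rng.standard_normal((R.shape[0], n))  # step 2: independent N(0,1) draws
    S = L @ W                                 # step 3: correlated standard normals
    return norm.cdf(S)                        # step 4: back to uniform marginals

def sample_conditional_mvn(R_bar, mu_bar, n):
    """Draw n samples of the conditioned multivariate normal distribution."""
    L = np.linalg.cholesky(R_bar)
    W = rng.standard_normal((R_bar.shape[0], n))
    return L @ W + mu_bar[:, None]            # shift by the conditional mean

R = np.array([[1.0, 0.56], [0.56, 1.0]])
U = sample_gaussian_copula(R, n=5000)         # shape (2, 5000), values in (0, 1)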

Sampling from an Archimedean copula is done by using its generators and its Laplace-Stieltjes transform, see for example Hofert (2008).

2.8. CALCULATING CONDITIONAL DISTRIBUTIONS
To use information that is known in a catchment to update the probability distribution of other variables, the copula has to be conditioned. Unconditioned, v is distributed as follows:

v_u = F^{-1}(u_u) \quad \text{with} \quad u_u \sim U(0,1), \qquad (2.22)

where the subscript u denotes that it is unconditioned and U(0,1) is the uniform distribution between 0 and 1. However, when a copula is conditioned, u is no longer distributed as U(0,1), but according to the copula density c(u_1, \ldots, u_c, \ldots, u_n), where u_c is the variable that is conditioned on all of the other variables, which are now each represented by a single (known) number.

v_c = F^{-1}(u_c) \quad \text{with} \quad u_c \sim c(u_1, \ldots, u_c, \ldots, u_n). \qquad (2.23)

In this section, a closer look is taken at the bivariate case, as an illustration. The two-dimensional copula function is defined as follows from Equation (2.2):

f(v_1, v_2) = c(u_1, u_2) \cdot f(v_1) \cdot f(v_2), \qquad (2.24)

which results in, given Bayes' theorem:

f(v_1 | v_2) = \frac{f(v_1, v_2)}{f(v_2)} = c(u_1, u_2) \cdot f(v_1), \qquad (2.25)

in which c(u_1, u_2) is the PDF of the copula function (for example the Gaussian copula as given by (2.4b)), and f_1 and f_2 are the PDFs of the marginals. As f(v_1) is not well defined, this function cannot be calculated directly. If, for example, an expected value is required, a stepwise approach needs to be taken. This can be done in two ways:

2.8.1. BY SAMPLING REGULARLY
To directly take values from the copula function, the uniform marginal needs to be sampled regularly, as sampling the variable itself regularly gives an irregular sample in the copula space, which distorts the probability distribution within the copula. The method uses the copula density as a weight for the values v. The following formulas calculate the expected value with this method. Take n values of u at a regular interval: U_g = [u_{(1,1)}, \ldots, u_{(1,n)}]^T = [0, 1/n, 2/n, \ldots, (n-1)/n, 1]^T. Now, the updated distribution becomes:

E_f(v_1 | v_2) \approx \frac{1}{n} F_1^{-1}(U_g)^T c(U_g, u_2) = \frac{1}{n} \sum_{i=1}^{n} F^{-1}(u_{(1,i)}) \cdot c(u_{(1,i)}, u_2). \qquad (2.26)

For the multivariate normal method, the following holds, when 0 and 1 are removed from U_g:

E_f(v_1 | v_2) \approx \frac{1}{n} F_1^{-1}(U_g)^T \cdot \phi\left(\bar{r}\,\Phi^{-1}(U_g) + \bar{\mu}\right) = \frac{1}{n} \sum_{i=1}^{n} F^{-1}(u_{(1,i)}) \cdot \phi\left(\bar{r}\,\Phi^{-1}(u_{(1,i)}) + \bar{\mu}\right), \qquad (2.27)

where \bar{\mu} follows from Equation (2.18) and \bar{r} is the corresponding correlation coefficient for that variable pair from Equation (2.19).

2.8.2. BY SAMPLING RANDOMLY
It is also possible to sample the copula randomly, via Section 2.7. If, from the randomly sampled values of the copula, only those values are taken that lie between u_2 − ϵ and u_2 + ϵ, with ϵ a small value, only values are used that are approximately on the line of the conditioned variable v_2 → u_2. These values are denoted as U_r = [u_{(r,1)}, \ldots, u_{(r,n)}]^T. Here, F_1^{-1}(U_r) can be used as an approximation of the conditional distribution.

E_f(v_1 | v_2) \approx \frac{1}{n} \sum_{i=1}^{n} F_1^{-1}(u_{r,i}). \qquad (2.28)

For the multivariate normal case, the final conditional distribution can be sampled as well, as it is just another normal distribution (Section 2.7 with the updated R matrix). Therefore, no ϵ is needed. The sampled values are S_r = [s_{r,1}, \ldots, s_{r,n}]^T.

E_f(v_1 | v_2) \approx \frac{1}{n} \sum_{i=1}^{n} F_1^{-1}(\Phi(s_{r,i})). \qquad (2.29)

When no summation is made, this method also allows easy calculation of the median and quantiles of the data, to be used in a confidence interval, for example. That is why this method is used in the rest of this thesis.
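The snippet below sketches this random-sampling approach for the bivariate multivariate normal case of Equation (2.29); the empirical quantile function standing in for the marginal inverse CDF and the argument names are assumptions for illustration only.

import numpy as np
from scipy.stats import norm

def conditional_stats(v1_data, R, u2, n=5000, rng=None):
    """Expected value and quantiles of v1 given u2, via random sampling (Eq. 2.29)."""
    if rng is None:
        rng = np.random.default_rng(0)
    r = R[0, 1]
    # Conditional normal parameters for the bivariate case (Eqs. 2.18 and 2.19)
    mu_bar = r * norm.ppf(u2)
    sigma_bar = np.sqrt(1.0 - r ** 2)
    s = rng.normal(mu_bar, sigma_bar, size=n)    # samples in standard normal space
    u = norm.cdf(s)                              # back to uniform marginals
    v1 = np.quantile(v1_data, u)                 # empirical inverse CDF F1^{-1}
    return v1.mean(), np.quantile(v1, [0.05, 0.5, 0.95])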

3 CASE STUDY

3.1. SELECTING CASE STUDY
This research focuses on a single catchment, as the goal of this research is predominantly creating an optimal model. For this single catchment, all of the data is examined and the processes are described extensively, to thoroughly study the model and its workings in the catchment. To do research that is also partly applicable to other lowland catchments, the catchment used in this thesis should be representative of lowland catchments. Furthermore, in order to be able to construct a useful Bayesian network (BN), the data available in this catchment should have high enough temporal frequencies, as for every month the data is subset on several days, and should be spatially representative of the whole catchment. The data for this catchment should also cover a long temporal range, to have enough training timesteps, and have a relatively low amount of errors. Furthermore, data on various variables should be available, to be able to model different catchment processes. However, the data that is available for the case study should not have an extraordinarily better quality (i.e. high temporal and spatial density, long timespan, low amount of errors) than in most other lowland catchments, as this would make the research no longer representative of similar catchments. Based upon this reasoning, the catchment of the Vledder, Wapserveense and Steenwijker Aa1 is chosen.

3.2. VLEDDER, WAPSERVEENSE AND STEENWIJKER AA
The catchment on which the method is tested is that of the Vledder and Wapserveense Aa, flowing into the Steenwijker Aa. Another important tributary is the managed canal called Nijensleker Schipsloot. It is a partly managed river system of approximately 180 km2 with a nature reserve in the north. Furthermore, the catchment consists mostly of agriculture and discharges into the town of Steenwijk. An overview of the catchment can be found in Figure 3.2 and, for an overview with a clear digital elevation map, in Figure D.1 in the appendix.

1The main rivers are the Vledder Aa, Wapserveense Aa and Steenwijker Aa; 'Vledder', 'Wapserveense' and 'Steenwijker' are adjectives derived from the location names.



Figure 3.1: Location of the catchment in The Netherlands and in the area under management by Waterschap Drents Overijsselse Delta (WDODelta).

3.2.1. HISTORY
Prior to the 1950s, the larger, upstream section of the catchment, the Vledder and Wapserveense Aa, coped with many floods. Therefore, a water board called De Vledder en Wapserveense Aa was established in 1950 (Nieuwsblad van het Noorden, 1950). This water board implemented many canalisation works that made the river flow less naturally (Friese Koerier, 1953), which proved successful in battling floods (Friese Koerier, 1965). During this time, the agricultural area and production volume in the catchment were increased.

3.2.2. RECENT WORKS IN THE CATCHMENT
In 2002 and 2003, some of the agricultural area in the northern part of the catchment was converted to nature and the upstream section of the Vledder Aa was reshaped to meander again (Langendijk et al., 2014). During this time, the Moordstuw2 was also removed, because the river now experienced higher resistance (RTV Drenthe, 2014).
The middle section of the Vledder Aa was remodelled in 2014 to have a more natural flow regime, by adding artificial meanders. On top of that, the cross section of the river was made more natural, which causes more friction in the system (Langendijk et al., 2014). At the same time, 200,000 m3 of storage was created to harvest rainwater for dry spells (Zandstra, 2016).

2Murder weir in English, named after an attempted murder in 1964 (RTV Drenthe, 2014)

To establish more fish migration, several fish ladders have been created over the whole catchment, of which the biggest were installed in the period 2014-2016.
As additional research, it would be interesting to look at the influence of the works in the catchment on the discharge in relation to various variables in the catchment. In order to identify these influences, a model has to be made before and after one of these changes, both with an abundantly high data quantity to create a representative model. This amount of data is relatively large in comparison with conceptual hydrologic models, as the non-parametric Bayesian network (NPBN) is entirely data driven. As there have been a multitude of works in the catchment, of which the implementation also took a significant time period, it is not possible to select a number of time periods between different works in the catchment that can form the basis of a reliable model. As a result, any research into influences of works in the catchment is not possible with the data used in this thesis.

3.2.3. CATCHMENT DELINEATION
METHOD
It is essential to make a good approximation of the location of the catchment border, for a number of reasons. Firstly, in order to make a water balance (see Section 3.4), the area of the catchment needs to be known. This is because some of the variables in the water balance are volumes (discharge, for example) and others are lengths (such as precipitation); to compare the variables in the same units, the catchment area is required. Furthermore, it is also useful to know which variables might come into play in the catchment, for both the water balance (for example drinking water wells) and the BN (for example, if there are large surface water areas in the catchment, the surface water variable might play a more central role in the BN layout). Finally, for some of these variables, the location of the catchment is needed to see which measurements influence the studied catchment directly (for example, the location of soil moisture measurements inside or outside the catchment area).
The catchment has been delineated with a combination of two approaches: (i) the water level zones of the water board3 and (ii) a watershed delineation in GRASS, an open-source software package with a broad range of hydrology tools.
i: The water level zones in The Netherlands are zones with a legal status that are put in place to give inhabitants, most importantly farmers, a degree of certainty over water levels. This is mainly useful for (partly) managed catchments with low elevations. In most cases, as also applies to the water level zones of the case study, these zones drain towards a single point at the edge of the zone. Therefore, it is possible to make an initial delineation out of these water level zones.
ii: Far upstream in the catchment, in the Vledder Aa and the Tilgrup, the flow is mostly natural and freely discharging. Therefore, water levels are hardly managed and the exact locations of the water level zones are not so clear, because the exact border is not considered to be very relevant for the water board. Therefore, the choice is made to shift the border of the catchment to the border generated with GRASS in these specific areas.

VERLENGDE NIJENSLEKER SCHIPSLOOT
In the north-west of the catchment, the canal called the Verlengde Nijensleker Schipsloot is positioned close to and across the catchment boundary. This section has a low elevation and is fairly flat, and therefore discharges to the catchment north of the catchment of this case study.

3Peilgebieden in Dutch

[Figure: catchment map with legend: primary waterways, WWTPs, weirs with measurements, inlets, groundwater measurement station, surface water level station, elevation (AHN3) in m to NAP (MSL); labelled waterways: Vledder Aa, Tilgrup, Wapserveense Aa, Nijensleker Schipsloot, Verlengde Nijensleker Schipsloot, Steenwijker Aa.]

Figure 3.2: Catchment map of Vledder, Wapserveense and Steenwijker Aa, with the names of waterways and other notable items.

To accomplish this, there are culverts underneath this canal (conversations with Zwannie Visser (2020) and Waterschap Drents Overijsselse Delta (2019)). A small seepage discharge can be expected out of this Verlengde Nijensleker Schipsloot.

3.2.4. WATERWAYS The catchment consists of a partly natural river system and some agricultural canals. These are the main streams, as can be found in Figure 3.2:

• Vledder Aa. This is the river that branches out almost to the north-north-eastern border, passing the measurement weir and the former Moordstuw.

• Tilgrup. This is a highly natural tributary to the Vledder Aa that branches out to the pumping station Bosweg, where it becomes an agricultural canal upstream of the catchment.

• Wapserveense Aa. This is the largest tributary of the Vledder Aa, which is connected to the inlets at Dieverbrug and Wittelte.

• Steenwijker Aa. Somewhere downstream of the confluence of the Vledder Aa and the Wapserveense Aa, the river changes name to the Steenwijker Aa.

• Nijensleeker Schipsloot. One of the largest canals in the area and the most managed part of the catchment. The canal passes three small pumping stations in a row and the surface water level station. It continues near the catchment border as the ’Verlengde (Elongated) Nijensleeker Schipsloot’.

Along and close to the south-eastern border of the catchment, at the inlets of Dieverbrug and Wittelte, is a large canal that splits up the original larger catchment (see Figure D.2 in the appendix). It is possible that water is seeping into the catchment from this slightly higher elevated canal.

3.2.5. GEOLOGY
The catchment is relatively flat in comparison with European catchments, with elevations ranging between 0 and 25 m to NAP (approx. mean sea level) in the west, and upstream in the north east, up to 13 m NAP (Actueel Hoogtebestand Nederland, 2019). For a Dutch catchment, this is actually only moderately flat, as there are also catchments that have almost no inclination. The top soil consists of a wide range of soil types, but the most important ones are humus-rich sand, highly mineral sand, podzolic soil and loamy soils (Wageningen UR, 2019).

3.2.6. LAND USE
The land use is divided as follows:

• Nature, small (production) forests and wastelands: approximately 35-40% of the total catchment area. Of this percentage, approx. 30% is a nature reserve (10.7% of the total area (Ministerie van Economische Zaken, 2019)). These areas are usually managed less and have different groundwater levels. Actual evaporation4 is relatively low.

• Agriculture: approximately 43% of the catchment area (Ministerie van Economische Za- ken, 2019). The water in the vicinity of agriculture is usually managed more and has a higher actual evaporation rate because of sprinkling and the vegetation. The most prominent crops are:

– Pasture/grasslands (approx. 68% of the agricultural area and 29% of the total area)
– Maize (approx. 15% of the agricultural area and 6.5% of the total area)
– Potatoes (approx. 10% of the agricultural area and 4% of the total area)

• Built-up area: approximately 7 to 8% (Kadaster, 2019). A proportion of this area discharges to the sewer system instead of the catchment.

• Open water: 1.36% of the catchment area (Kadaster, 2019). The actual evaporation is higher in open water than on land.

4In this research, actual evaporation contains the actual evaporation as well as the actual transpiration that is happening.

• Other: 8-12%. These areas contain, among others, a military training ground, several holiday resorts and campsites, farm buildings and other housing outside built-up areas, and a small zoo.

3.2.7. CLIMATE
Together with the rest of The Netherlands, the catchment lies within a region with a marine climate. For its latitude, it has a relatively mild winter, as well as a mild summer. It rains year-round and seasonal differences in precipitation are low.

3.2.8. ARTIFICIAL STRUCTURES AND WATER FLUXES
The usage of copulas presumes that the variables are independent and identically distributed (IID). Artificial structures and management of water flows have the potential to make this assumption less valid. The following artificial structures are present in the catchment.

WEIRS
There are 22 weirs in the catchment (Zandstra, 2016). Most of these lie in the smaller tributaries and are not managed regularly, or are simply fixed. Therefore, most of these weirs do not affect the assumption that the data are IID. However, there are five weirs that are larger and situated in the main streams. These control the water level, and therefore also the discharge to a certain extent. Because of this, the water level, and also the discharge, behave less like IID variables.

PUMPING STATIONS
There are some minor, automatic pumping stations inside the catchment that are run by WDODelta. As they work automatically, they are regarded as not changing the presumption of IID discharge values in this thesis. Moreover, the total discharge that they contribute is relatively low. They can be found in Figure 3.2. There is a possibility that some farmers have their own pumps, but these are also presumed to be negligible.
At one of the northern edges of the catchment, one of the tributaries to the Vledder Aa, the Tilgrup, stretches all the way to the edge of the catchment and actually continues as a canal further upstream, which connects to other canals as well. The area upstream is another (narrow) flat area where the water is managed at the border of the catchment by the pumping station Bosweg. This pumping station pumps water into the catchment, but the exact area from which water is discharged to the Tilgrup depends on water levels in that upstream subcatchment. This means that the catchment will always be imperfectly delineated. However, data is available on the amount of water flowing into the catchment, so it is decided to position the catchment border just downstream of the pumping station and record the inflow. Since 2018, a larger area upstream of the catchment drains into the Vledder Aa, supported by the pumping station Bosweg (Langendijk et al., 2014).

INLETS
There are two inlets from the Drentse Hoofdvaart to the catchment that are used to make up for water shortages upstream at the east side of the catchment. The inlets are near the villages of Dieverbrug and Wittelte respectively; see Figure 3.2 for their locations. According to J. de Jong of the water board WDODelta (personal communication, December 19, 2019), these inlets are manually operated, causing a non-randomly distributed parameter in the model. The discharge Q is calculated with the horizontal weir formula:

Q = 1.7 \, c_m \, b \, h_w^{3/2}, \qquad (3.1)

where c_m is a friction factor, in this case fitted by WDODelta as 1.00588, b is the width of the weir and h_w is the water level above the weir. The inlet of Dieverbrug is used in the water balance, but the one at Wittelte is not, as this inlet gave too many implausible values. Both inlets are not featured in the BN, as their total contribution to the network is less than 1% and, according to WDODelta, all the water is used upstream to sprinkle crops. In the water balance, however, it makes a small difference in the actual evaporation, so to be complete, it is implemented there.
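As an illustration of Equation (3.1), the snippet below computes the inlet discharge for example values of the weir width and overflow depth; the numbers are hypothetical, only the friction factor is taken from the text.

# Horizontal weir formula (Eq. 3.1): Q = 1.7 * c_m * b * h_w**(3/2)
c_m = 1.00588   # friction factor fitted by WDODelta
b = 2.0         # weir width in m (example value)
h_w = 0.15      # water level above the weir crest in m (example value)

Q = 1.7 * c_m * b * h_w ** 1.5
print(f"Inlet discharge: {Q:.3f} m3/s")   # roughly 0.2 m3/s for these numbers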

SEWERAGE
The sewer system is neglected, as the part of the catchment that is connected to the sewer is small and most of the sewage ends up in the waste water treatment plant (WWTP) at Steenwijk, which discharges the water back into the Steenwijker Aa just upstream of the measurement location. A small portion ends up in WWTP Dieverbrug, which does leave the catchment. See Figure 3.2 for the locations.

GROUNDWATER EXTRACTION
A short distance north of the catchment lies the groundwater extraction well Terwisscha. This well extracts approximately 6·10^6 m3 of water per year. Due to the extraction, the water level has declined by up to 30 cm in some places (AdviesCommissie Schade Grondwater, 2015). This causes harm to the nature reserve, which is why the extraction will be reduced in the future (Leeuwarder Courant, 2016). As the groundwater extraction is a local difference and not known per month, this number is only used in the water balance. The extraction well is situated outside of the catchment; therefore, the amount attributed to the catchment is arbitrarily chosen as 45% of the total extraction.
There are also minor, privately owned groundwater pumping stations. The most important use is sprinkling agricultural fields. The amount of pumping is not known and, as most of this water does not leave the catchment other than through discharge and local evaporation, the influence of this pumping is not implemented in the model nor in the water balance.

3.3. DATA
3.3.1. OVERVIEW
For this research, two main categories of data are used: (1) data to use in the BN, and (2) data to close the water balance, to verify the data in category 1 as much as possible. It is preferred that the data has a high spatial resolution, or has a low spatial variation over the whole catchment.

3.3.2. DISCHARGE
TARGET VARIABLE
The target variable of this research is the monthly maximum daily average discharge (MMDAD). This is an interesting variable since water boards need to make sure the land in their

area does not inundate often (Rijksoverheid et al., 2003). Therefore, they should know how often floods and high discharges are to be expected, especially given the observed increase of heavy precipitation events (IPCC, 2014). The reason this variable is chosen, instead of the maximum monthly value, is that the discharge measurements fluctuate a lot; therefore, the random error is assumed to be large. To make the influence of these errors smaller, daily average values are taken. Moreover, this variable choice makes the studied variable more likely to be IID, which is needed for a probabilistic method, in contrast to the use of monthly averages. This can be seen in Figure 3.3, where the autocorrelation is significantly lower for the MMDAD. On top of that, the correlation coefficient with one of the most important variables, precipitation, also improves.

[Figure: (a) autocorrelation of the monthly average discharge and (b) autocorrelation of the monthly maximum daily average discharge, for lags of 0 to 12 months.]

Figure 3.3: Comparison of the autocorrelation of the monthly average versus the monthly maximum daily average discharge.

For the correlation with the monthly average discharge, the Pearson correlation coefficient is 0.31, whereas it is 0.42 when the MMDAD is used. As variables make for better predictors when the correlation between them is higher, this is favourable.

SELECTION BASED UPON DATE MAXIMUM
As it does not make sense to use data from after the MMDAD event (this cannot have influenced the discharge), all the other datasets have to be subset on a period before that event. For the first model, this period is chosen as 8 days, but it is a parameter that is optimised in Section 7.3. Over this period, the mean or the sum of the values is taken, depending on the type of variable. In essence, it does not matter whether the mean or the sum is chosen, as the subset period is constant. It was considered to implement a formula that gives data from different days before the event different weights, to obtain a dataset with the data that influences the MMDAD the most. However, this requires knowledge about the delay time in the system and similar hydrologic parameters, which is not the purpose of this model. When the accuracy of the model has to be improved further, using such a weighting function for this subset period could be a subject of further research. All other datasets should have at least one datapoint in this period, but preferably many more, such that non-bias errors get filtered out and a good average over the time period can be obtained. It is possible that some of these periods overlap, which is negative for the IID presumption.
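A minimal pandas sketch of this subsetting step is given below; the function name, column names and the 8-day window are assumptions for illustration, not the exact implementation used in this thesis.

import pandas as pd

def subset_before_mmdad(daily_q, predictors, window_days=8):
    """For every month, find the MMDAD date and average the predictor data
    over the window_days days before that date."""
    rows = []
    for _, month in daily_q.resample("M"):
        mmdad_date = month["discharge"].idxmax()            # date of the MMDAD
        window = predictors.loc[mmdad_date - pd.Timedelta(days=window_days):mmdad_date]
        row = window.mean()                                  # use .sum() for e.g. precipitation
        row["MMDAD"] = month["discharge"].max()
        rows.append(row)
    return pd.DataFrame(rows)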

DISCHARGE MEASUREMENTS
The discharge measurement is taken from a measurement station just downstream of the town of Steenwijk. The discharge is measured by multiplying the average flow velocity and the cross section times a factor (see Equations (C.1) to (C.3) in the appendix for the equation used by the water board). The velocities are measured with side-looking Doppler (SLD) measurement devices.

VERIFICATION
The method is verified with an acoustic Doppler current profiler (ADCP). However, this is not done regularly and not for all discharge amounts. It is likely that there are errors in the discharge measurements, but these are unknown. Therefore, it is useful to take a look at the quality of the measurements, solely from the data. Creating a histogram of the individual measurements (per quarter of an hour) showed an approximate distribution of the measurements (Figures D.4 and D.5 in the appendix). It was decided that any discharge below -5 m3/s (i.e. flowing upstream) was an outlier and was removed. The exact cut-off does not matter much, as there was a low number of recordings below -5 m3/s and the goal of this research is to look at the high peaks in discharges, not low values.
Moreover, there is also data available on a weir approximately 7 km upstream of the measurement location. The discharge at this weir is solely based on a Q-h relation, which can cause large discharge measurement errors for weirs, and it is unclear whether the flow over the weir is free flowing or submerged, which can also cause errors in the measurements. Additionally, not all of the flow passes this weir. Therefore, the weir data cannot be regarded as more trustworthy. The water that enters the system after this spot mostly originates from more managed sources. In spite of its untrustworthiness, the weir data has been used to point out potential measurement errors. See Figure D.6 in the appendix for the corresponding plot. The general flow at the weir is slightly lower, which is logical because it is farther upstream. Moreover, the highest peaks are significantly lower. It is possible that either of these measurements is bad at capturing these high discharges. Lastly, in the winter of 2016-2017, the discharge at the weir drops well below the measurements downstream. The only conclusion that can be drawn from this is that this is a period that deserves attention.

3.3.3. KNMI DATA
The Koninklijk Nederlands Meteorologisch Instituut (KNMI) is the Royal Dutch Meteorological Institute, which operates over 100 weather stations in the Netherlands and offshore in Dutch waters. These stations are largely similar to each other and measure a wide variety of atmospheric parameters. For this research, the following are used:

TEMPERATURE
The KNMI dataset contains hourly temperature measurements at 1.5 m above the ground, in degrees Celsius.

SOLAR RADIATION
The KNMI stations measure global radiation hourly: the short-wave radiation falling onto a horizontal surface (PIK, 2020). The measurements are in J/cm2. Together with the

temperature, the solar radiation can serve as a proxy for the evaporation and transpiration in the catchment.

POTENTIAL EVAPORATION
The KNMI stations calculate the potential evaporation daily from a number of values that they measure. The Makkink equation is used to obtain this number.

PRECIPITATION
The KNMI precipitation data is featured in Section 3.3.4.

SPATIALLY COMBINING KNMI DATA
The KNMI data is combined by using Thiessen polygons (Luxemburg and Coenders, 2017). These polygons are based on the Voronoi diagram. This method is chosen over inverse distance weighting, as the latter has an ambiguous β factor, which has not been fitted for this catchment, and leaves open the question of how many stations to take into account. In the case of the Vledder and Wapserveense Aa catchment, this results in two KNMI stations that are used for the measurements, which lie about 16 and 19 km outside of the catchment. For the BN and the water balance, one value is taken for the whole catchment: a combination of the measurements of the stations, scaled by the percentage of the catchment each polygon covers. Using Thiessen polygons, only KNMI stations 273 (Marknesse) and 279 have any influence in the catchment, with rounded contribution factors of 0.5 and 0.5 (see Figure D.2 in the appendix).

3.3.4. PRECIPITATION
In this research, two sources of precipitation data are used.
• 2 stations of the Dutch meteorological institute KNMI, which lie between 16 and 19 km outside of the catchment. However, for verification, 4 stations are used that lie up to 26 km outside of the catchment.

• 2 stations of the water board WDODelta, of which one lies in the centre of the catchment and the other lies close to the catchment.
The data of the KNMI is heavily verified and is therefore assumed to contain few errors. Moreover, WDODelta stated that a multitude of their stations in the northern part of the area that they manage are placed at inadequate locations, such that more or less precipitation could fall into their rain gauges than should be the case. If the errors only arise from a constant bias, the BN's accuracy would not be affected. The stations of WDODelta, however, are located closer to the catchment. Therefore, those are the preferred measuring stations if their quality is sufficiently high. To test this quality, a cumulative plot has been made with all of the stations (see Figure 3.4). Here, it can be clearly seen that the WDODelta stations measure higher rainfall amounts. The assumption is that this relatively flat area in the Netherlands should not display regional differences when averages over multiple years are taken from stations so close to one another. The chance that the cumulative precipitation values from WDODelta are drawn from the statistical distribution that models the precipitation of the KNMI stations as a normal distribution is tested with a Student t-test5. Here, for both stations

5See The Editors of Encyclopaedia Britannica (2019) for an explanation of the Student t-test

[Figure: cumulative precipitation (mm) from 2002 to 2020 for the two WDODelta stations (Appelscha and Frederiksoord) and KNMI stations 273, 278, 279 and 280.]

Figure 3.4: Cumulative precipitation of different rain gauge stations since February 2001.

of WDODelta, the null hypothesis H0 is tested: a value from the WDODelta station is taken from the normal distribution of the KNMI data. In other words: the station of WDODelta measures the same rainfall intensities as KNMI over longer periods. The cumulative values from February 1, 2001 to June 6, 2018 are used, and the t-values were 13.88 for the station at Appelscha and 5.59 for the station at Frederiksoord. With n = 4 and α = 0.01, tα,n = 3.747, so H0 can be rejected. Therefore, the KNMI data is used instead of the WDODelta data.
As an additional test, the precipitation data has also been used to create a water balance in the Budyko framework. Both sets of precipitation data give plausible results (see Figure 3.7), although the position obtained with the KNMI data is more likely. The fluxes have been averaged per month of the year to create a water balance over an average year, which can be found in Figure 3.5. In general, in months with a higher potential evaporation than precipitation, the groundwater is depleted, whereas in the other months, the groundwater is resupplied.
It is possible that in the months with groundwater depletion, the water board is using artificial inflow to compensate for the water shortage. If that is the case, a higher discharge than expected is possible during these months. Moreover, in the same months, many fields are sprinkled, such that the actual evaporation is higher than could be expected and the open water levels are depleted if no measures are taken.
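A minimal sketch of this test with scipy is shown below; the cumulative totals are placeholder numbers chosen only to mimic the setup (four KNMI stations tested against one WDODelta station), not the actual values behind the reported t-statistics.

import numpy as np
from scipy import stats

# Cumulative precipitation totals (mm) over Feb 2001 - Jun 2018 (placeholder numbers)
knmi_totals = np.array([14100.0, 14350.0, 14200.0, 14050.0])  # 4 KNMI stations
wdod_total = 15600.0                                          # one WDODelta station

# One-sample t-test: is the WDODelta total consistent with the KNMI distribution?
t_stat, p_value = stats.ttest_1samp(knmi_totals, popmean=wdod_total)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
# Compare |t| against the critical value used in the text (3.747 at alpha = 0.01)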

[Figure: precipitation, potential evaporation, groundwater depletion and groundwater recharge (mm) per month over an average year.]

Figure 3.5: Precipitation and evaporation fluxes over an average year.

3.3.5. GROUNDWATER LEVEL
There are thousands of groundwater measurement stations in the catchment, from both WDODelta and the website DINOloket6 of the Dutch scientific organisation TNO. These are groundwater wells from a wide range of sources. However, the data from these groundwater observation wells should meet three requirements for it to be usable in a BN:
1. High temporal frequency (at least one measurement per subset period, see Section 3.3.2, but preferably over 100 per subset period)
2. Spanning the temporal range of the other parameters
3. Small number of errors
All stations except one failed to meet these three requirements. Most had highly irregular data, or were only available for a short period of time, and many had highly unlikely values. The one station that did meet all of the requirements is a water pressure station from WDODelta and can be seen in Figure 3.2. The fact that only one well met the requirements is highly unfavourable for the model, as water levels, and changes in water level, can differ strongly in space. However, combining insufficient records has the implication that different forces work on the model for different timesteps, which interferes with the IID assumption that is needed for a BN. In the water balance, however, many imbalances may arise from the fact that only one station is used.
As the goal for the water balance is to see differences in water level, the pressure should be translated to a water level, which can be done as follows:

h_w = \frac{p_w - p_a}{g}, \qquad (3.2)

6https://www.dinoloket.nl/

where h_w is the water level, p_w is the water pressure and p_a is the atmospheric pressure. The latter is taken from KNMI station 279, as this was the only station in close proximity that measures this quantity. Now, only an effective porosity factor is needed, which is chosen as 0.18, because it is an area with a fairly large amount of sand and humus-rich soils (see Section 3.2.5). The range of the effective porosity is quite wide; the number 0.18 is arbitrarily chosen as one of the median values. As BNs are able to use any number directly, the water pressure p_w is used directly in the BN model.

3.3.6. SURFACE WATER LEVEL
Surface water levels are water levels in rivers, canals, lakes, etc. Levels measured near discharge measurement stations are highly correlated with the discharge. In a sense, they do not so much predict discharges as merely provide an early measurement. As the goal of this research is to make a predictive model, these kinds of surface water levels are not used in the model.
However, there is also a more managed region in the catchment, located in the (north-)western part of the catchment (see Figure 3.2). The surface water in this region is managed based on the desired groundwater level in that area. The management of the water level with weirs and some small pumping stations potentially has a large influence on the discharge from that area as well. This region has one water level measurement station with sufficiently frequent recordings over a long period. This measurement location can be used in the BN, because BNs can also handle measurements that do not cover the whole area. This means that the water level is not representative of the whole catchment, but it still adds an additional partial correlation, and therefore predictive power, to the model.
As this water level showed a lot of seasonality, implementing a moving average filter was considered. However, this was not implemented, as deviations from this moving average can imply that there was an extreme event during that time. Only clear outliers that showed higher and lower levels than the measurement device can measure have been removed.

3.3.7. SOIL MOISTURE
Soil moisture is an important hydrologic parameter, as it tells a lot about recent rainfall, evaporation and crop suction. In the catchment, no terrestrially measured, frequently sampled soil moisture datasets are available that are representative of a large area of the catchment. Therefore, satellite data is used. As these satellites only measure the top part of the soil, the data mainly consists of water contents in the unsaturated zone. Due to its high-frequency fluctuations but limited volume, the soil moisture content is negligible on a monthly timescale compared to other water fluxes and storages, and is therefore not included in the water balance (Section 3.4). However, soil moisture measurements can potentially give an indication of the discharge downstream. Therefore, soil moisture measurements are collected to be used in the BN.
There is a range of soil moisture satellite data products, but there is only one useful set that goes back to 2009: the Soil Moisture and Ocean Salinity (SMOS) satellite of the European Space Agency (ESA) (European Space Agency, 2020). SMOS measures L-band brightness temperatures with a radiometer, which it uses to derive an estimate of the soil moisture (European Space Agency, 2017).
SMOS uses an Icosahedral Snyder Equal Area Earth grid, which consists of imperfect hexagons,

with equally spaced cell centres at approximately 15 km distance (González-Zamora et al., 2015). Just two pixels cover more than 95 percent of the catchment area. These hexagons are irregular in nature, and a detailed decomposition of this shape in relation to the catchment shape is not assumed to deliver significant differences compared to simply taking the average of these two points. Therefore, the latter approach is used in this research.
Due to its sun-synchronous orbit (European Space Agency, 2017), SMOS has an irregular temporal distribution, with sometimes multiple values per day, but also gaps of up to 10 days. This can cause problems when subsetting shorter time periods and therefore demands temporal averaging. For the first model made in this research, all the subsets did contain soil moisture data, but some were more frequent than others. Using a more frequent soil moisture product would be an important improvement to the model in the future. For now, adding the soil moisture only gives an indication of whether this could be useful in a model, and its contribution may not always be useful for each timestep.

3.3.8. NORMALISED DIFFERENCE VEGETATION INDEX (NDVI) The NDVI is an indicative value for the so-called "greenness" of the vegetation in that area, with theoretical values between -1 and 1. It is defined as follows:

NDVI = \frac{NIR - VIS}{NIR + VIS}, \qquad (3.3)

where NIR stands for near-infrared light and VIS for visible light. Chlorophyll, the pigment in plant leaves, strongly absorbs visible light, while the cell structure of the leaves strongly reflects near-infrared light (NASA, 2000), resulting in a high NDVI when there is a lot of leaf activity. That is why the NDVI is a good indicator of the condition of the fields and of the plants' need for and availability of water. However, in common hydrologic models, the NDVI is of no use, as it does not represent a water flux or storage. The BN, however, can deal with these kinds of parameters. On the other hand, the parameter is not opaque to the user of the model, as it shows something in the network directly, in contrast to, e.g., using only a single band of radiation. Therefore, it has the potential to be an excellent parameter for a hydrologic model based on a BN.

Parameter    Band      Wavelength (nm)
NIR          Band 2    841-876
VIS (red)    Band 1    620-670

Table 3.1: Wavelengths used to construct the NDVI in the MODerate resolution Imaging Spectroradiometer (MODIS) (Didan et al., 2015; Earth Observing System, 2013)

In this research, MODIS from the National Aeronautics and Space Administration (NASA) is used as the source of the NDVI, as it has a high density and assumed accuracy, and has been collecting data since 2000. MODIS' product MOD13A1 (Didan, 2015) is chosen, which provides 16-day averages on a square grid of 500 m (Didan et al., 2015). This is abundantly dense, as there are hundreds of pixels in the catchment, and the time average merely provides practical support, as temporal subsetting becomes easier. Because NASA already averaged the raw data, some information has been lost here.

MODIS also has a more advanced vegetation product, the so-called enhanced vegetation index (EVI). This has the advantage that NDVI extremes are better captured. In this research, this advantage is assumed to be of little use, because no extremes are expected in the case study area.

3.3.9. PROCESSING AND FILTERING
The data sources have been combined in Python, to be used for the BN and the water balance. To remove errors, the following operations have been performed:
1. The data has been summed/averaged as much as possible to remove non-bias errors. For the water balance, monthly averages were taken. As for the BN: for the discharge this meant that the maximum runoff was based on the MMDAD. The other variables were subset and averaged over a period before this event. In the initial model, this was 8 days plus the first half of the day of the MMDAD event. This number of days has been optimised in Section 7.3.

2. From the discharge data, values that had a larger negative discharge than -5 m3/s were removed, as these did not fit in the distribution that followed from the other parameters.

3. Impossible values (often values that actually represent a NaN value) have been removed, such as negative precipitation depth (KNMI records precipitation below the detection limit as negative values or very low positive values).

4. Data has been spatially combined, as mentioned before.

It was out of the scope of this thesis to manually check for biases, so they were not removed. This does not matter for the BN, but it does matter for the water balance. Other potential errors were also not removed, as it was not as clear as in the other cases that the observed effects were indeed due to errors.
For usage, all the parameters were converted to mm for the water balance. Where water fluxes were concerned, this meant that the catchment area was used to calculate the mm values. The precipitation was also expressed in mm in the BN, because that is the widely used unit for precipitation measurements.

3.4. WATER BALANCE
Creating a water balance is a useful method to verify the data, as well as to give insight into the climatology of the catchment. For the water balance, the fluxes that will be implemented in the BN are used, supplemented with data that closes the balance as much as possible. For variables that will be included in the BN, such as the groundwater level, adding additional stations could be considered to create a more spatially dense dataset, which can make for a more complete, less erroneous water balance. This has been disregarded because the focus of the water balance is to test the data, not to make as perfect a balance as possible. See Table 3.2 for all of the fluxes. Other water fluxes, such as seepage and infiltration, have been neglected (as these are difficult to measure), and data sources such as the NDVI or solar radiation that are not water fluxes are not included. As the actual evaporation is not known, the potential evaporation is used initially.

Flux                                      In or out   Source
Discharge                                 Out         WDODelta
Precipitation                             In          KNMI
Potential evaporation                     Out         KNMI
Groundwater level difference              Out         WDODelta
Inlet Dieverbrug                          In          WDODelta
Pumping station Bosweg                    In          WDODelta
Groundwater extraction well Terwisscha    Out         AdviesCommissie Schade Grondwater

Table 3.2: Fluxes and storage in the water balance.

3.4.1. MONTHLY SCALE
By averaging all of the fluxes per month, and by neglecting soil moisture differences, a clear overview of the fluxes can be obtained for an entire year. Moreover, in a truly freely discharging system, the difference between incoming and outgoing fluxes should be approximately zero for each month. Large deviations from zero suggest measurement errors or incomplete information about the catchment. Moreover, as the actual evaporation is not known, the potential evaporation is used, which gives an outflow out of the catchment that can be too high.

[Figure: monthly water balance (mm) showing discharge, precipitation, potential evaporation, groundwater difference, pumping station Bosweg, inlet Dieverbrug, pumping station Terwisscha and the balance error.]

Figure 3.6: Water balance in the Vledder, Wapserveense and Steenwijker Aa over an average year.

In Figure 3.6, the average yearly water balance is shown. Considering the scale of this catchment, the fluxes of the pumping stations and inlets are deemed insignificant. Moreover, the balance between the fluxes closes reasonably well during most of the year, except for the winter months December to February. This implies that more water is added to storage, or discharged, during the winter than can be expected from the other fluxes. The reason for this is unclear.

3.4.2. COMPLETE TIMEFRAME - BUDYKO FRAMEWORK
In hydrology, two indices have been constructed to get a sense of the climatology and hydrology of a catchment. The aridity index is constructed as the ratio between the potential evaporation and the precipitation in a catchment (E_p/P). It gives an indication of how dry the region is in which the catchment is situated. The evaporation index is the ratio between the actual evaporation and the precipitation (E_a/P). This gives a sense of how much of the water that comes into the catchment leaves the catchment by means of evaporation.

[Figure: evaporation index (E_a/P) versus aridity index (E_p/P), showing the water limit, the energy limit, the Budyko curve and the catchment plotted for both the WDODelta and KNMI precipitation data.]

Figure 3.7: Catchment plotted for the aridity index versus the evaporation index, with different precipitation measurement sources. It can be seen that the WDODelta data plots just underneath the energy limit, which is possible but not common. Additionally, the Budyko curve of Equation (3.5) is also plotted.

For a natural catchment, there are two limits for both of these indices. As the influence of the artificial influxes is negligible (see Figure 3.6), the assumption is that these limits should hold true:

1. The actual evaporation cannot be higher than the potential evaporation, because no more energy is available. The limit is Ea ≤ Ep , which gives the limit Ea/P ≤ Ep /P. This is the so-called ’energy limit’.

2. The actual evaporation cannot be higher than the amount of precipitation, because no more water is available. The limit is then Ea ≤ P, which gives the limit Ea/P ≤ 1. This is the so-called 'water limit'.

The Budyko framework is created for the whole time period of the data, rounded down to whole years. The difference in storage over this range can be neglected. The actual evaporation is calculated with the water balance, from the difference in the fluxes of Table 3.2, leaving the potential evaporation out of the equation and neglecting storage differences:

E_a = P - Q + \sum F_{add,i}, \qquad (3.4)

where Q is the discharge and F_{add,i} are the additional fluxes, with their appropriate sign. In Figure 3.7, the aridity index of the catchment has been plotted against the evaporation index, for two different precipitation measurements: the KNMI and WDODelta data (see Section 3.3.4). Both measurements meet the energy limit, so neither can be rejected. Therefore, continuing to use the KNMI data because it would be more accurate is still an assumption.
When many catchments are plotted in the same graph as Figure 3.7, these catchments seem to loosely follow a curve. At least six functions have been proposed to describe this curve (Arora, 2002). These are called Budyko curves. One example of such a curve is the one proposed by Budyko and Miller (1974), which is the geometric mean of two other Budyko curves:

\frac{E_a}{P} = \left[ \frac{E_p}{P} \tanh\left(\frac{P}{E_p}\right) \left(1 - e^{-E_p/P}\right) \right]^{1/2}. \qquad (3.5)
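A small sketch evaluating Equation (3.5), for instance to place a catchment on the curve, is given below; the aridity index used is an example value, not the value computed for the case study.

import numpy as np

def budyko_curve(aridity):
    """Evaporation index Ea/P as a function of the aridity index Ep/P (Eq. 3.5)."""
    return np.sqrt(aridity * np.tanh(1.0 / aridity) * (1.0 - np.exp(-aridity)))

aridity_index = 0.9                     # example Ep/P
print(budyko_curve(aridity_index))      # expected Ea/P on the Budyko curve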

This function has also been plotted in Figure 3.7. It is clear that with the KNMI precipitation dataset the catchment plots much closer to an average position on the Budyko framework than with the WDODelta precipitation dataset. This supports, to a small degree, the preference for using the KNMI data over the WDODelta data. A possible reason why the catchment has a higher actual evaporation than is expected from the Budyko curve may be the large amount of agricultural fields and the sprinkling of these fields. This should be visible in open water and groundwater levels.

4 INITIAL MODEL AND PERFORMANCE

4.1. INITIAL MODEL
In this research, many model parameters are analysed, to find the optimal model for this catchment and to test whether the copulas used are in agreement with the data. A first Bayesian network (BN) layout has been constructed to be used in this optimisation. The data shown in Table 4.1 has been implemented in the first model, which looks like Figure 4.1. A correlation diagram can be found in Figure D.7 in the appendix. After this optimisation, in Chapter 8, a number of steps will be taken to optimise the BN layout1.

Variable               Source         Frequency    Usage     Unit
MMDAD                  WDODelta       1/15 min.    Average   m3/s
Precipitation          KNMI           1/hour       Sum       mm
Temperature            KNMI           1/hour       Average   °C
Solar radiation        KNMI           1/hour       Average   J/cm2/h
Soil moisture          SMOS (ESA)     irregular    Average   m3/m3
NDVI                   MODIS (NASA)   irregular    Average   -
Groundwater levels     WDODelta       irregular    Average   kPa
Surface water levels   WDODelta       irregular    Average   m

Table 4.1: Variables used in BN models

Finally, here are some other, arbitrarily chosen parameters of the first model, which are tested in Sections 7.1 to 7.3: • When sampling a conditioned copula, the expected value is taken as the mean of the sampled values, instead of the median

1Another order of methods is to first perform the initial steps of Chapter 8, which has some advantages in comparison to this order. However, this order is the one that has been used throughout the process of making the thesis. The model proposed in this section does not fully comply with these criteria, albeit not by much. It is assumed, however, that for most of the parameters, no significant changes in the results of the optimisation can be expected because of this discrepancy, as the initial and the final BN layout overlap considerably.



Figure 4.1: Initial BN model it-0 that is used in this research, in Uninet. The numbers on the arrows denote the correlation coefficients. See Appendix A for an explanation of the BN layout.

• The copulas are sampled 5000 times

• The other parameters use data from the 8 days before the highest discharge date

4.2. DETERMINING PERFORMANCE
In order to test whether parameters increase or decrease the model performance, the definition of model performance needs to be determined. As the target variable of this research is the monthly maximum daily average discharge (MMDAD), the performance indicator evaluates the error between the prediction and the observations in most of the cases in this research. Two similar performance coefficients are discussed, to be used to attach a number to the performance of the model. This ensures that a great number of parameters can easily be fitted on the basis of the model performance, without ambiguity about what it means to have a well-performing model.

4.2.1. NASH-SUTCLIFFE EFFICIENCY (NSE)
In hydrology, the NSE is usually used to determine the performance of a model. The NSE is defined as follows (Nash and Sutcliffe, 1970):

NSE = 1 - \frac{\sum_{t=1}^{t_e} \left(Q_{sim}^{t} - Q_{obs}^{t}\right)^2}{\sum_{t=1}^{t_e} \left(Q_{obs}^{t} - \overline{Q}_{obs}\right)^2}, \qquad (4.1)

where Qobs is the observed discharge, Qsim is the simulated discharge, the overlined Qobs is the mean of the observed values and te is the final timestep of the data. The NSE has an upper limit of 1 (which is a perfect fit) and no lower limit. It highly depends on the purpose of the model and the quality of the data, but in general an NSE greater than 0.9 is regarded as a very good model, greater than 0.8 as a good model, and greater than 0.7 as a decent model. 4.3. k-FOLDCROSS-VALIDATION 41
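A direct numpy implementation of Equation (4.1) is sketched below; the array names are illustrative.

import numpy as np

def nse(q_sim, q_obs):
    """Nash-Sutcliffe efficiency (Eq. 4.1) of simulated vs. observed discharge."""
    q_sim, q_obs = np.asarray(q_sim), np.asarray(q_obs)
    return 1.0 - np.sum((q_sim - q_obs) ** 2) / np.sum((q_obs - q_obs.mean()) ** 2)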

4.2.2. KLING-GUPTA EFFICIENCY
An improved version of the NSE has been constructed by Gupta et al. (2009). This method contains separate elements for the difference in correlation (ρ_p), a measure of the difference in volatility (α), and a measure of the bias (β). The aggregated Kling-Gupta efficiency (KGE) is a useful way to get a quick overview of the model performance, but the separate building blocks give a more holistic view of the model performance and give room for a more purpose-dependent validation (Knoben et al., 2019). Therefore, this is the preferred test method in this research. A KGE score cannot be compared one-to-one with an NSE score (Knoben et al., 2019); however, the score limits are equal and the performance ranges are similar. The KGE is defined as follows:

KGE = 1 - \sqrt{(\rho_p - 1)^2 + (\alpha - 1)^2 + (\beta - 1)^2}, \qquad (4.2)

with ρ_p the Pearson correlation coefficient (Equation (2.10)) and

\alpha = \frac{\sigma_{sim}}{\sigma_{obs}}, \qquad (4.3a)
\beta = \frac{\mu_{sim}}{\mu_{obs}}, \qquad (4.3b)

where σ are the standard deviations of the simulations and observations, and µ is the mean (Knoben et al., 2019). In the case of validating discharges, µ_obs of Equation (4.3b) is equal to \overline{Q}_{obs} of Equation (4.1).
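The corresponding sketch for Equations (4.2) and (4.3) is shown below, returning the aggregated KGE together with its three building blocks.

import numpy as np

def kge(q_sim, q_obs):
    """Kling-Gupta efficiency (Eq. 4.2) and its components rho_p, alpha, beta."""
    q_sim, q_obs = np.asarray(q_sim), np.asarray(q_obs)
    rho_p = np.corrcoef(q_sim, q_obs)[0, 1]      # correlation term
    alpha = q_sim.std() / q_obs.std()            # volatility term (Eq. 4.3a)
    beta = q_sim.mean() / q_obs.mean()           # bias term (Eq. 4.3b)
    score = 1.0 - np.sqrt((rho_p - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)
    return score, rho_p, alpha, beta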

4.3. k-FOLD CROSS-VALIDATION
In order to prevent overfitting, and similarly, to test whether the method can predict with data that it was not trained with, a k-fold cross-validation is used in almost all of the tests. In the method proposed in Section 2.6, there are two functions that have to be fitted:

1. The cumulative distribution function (CDF) fit function F(v) for each v_1, \ldots, v_n, as defined in Equation (2.3).
2. The correlation matrix R, as defined in Section 2.6.2.
The CDF can easily be fitted differently per fold. However, the correlation matrix is fitted via the recursive Equation (2.17), which is done in Uninet. Constructing a new correlation matrix requires a whole new model in Uninet. Because of this, it is too time-consuming to create a new correlation matrix per fold. Therefore, the k-fold cross-validation is done only partially: on step 1. In Section 9.1, there is a single k-fold test for both of the fitted items.
Another method to do cross-validation could be Monte Carlo cross-validation. This method consists of randomly selecting a test and a training set, running the test and repeating this process n times. However, as the correlation matrix R is determined from all of the data in the range, it is more consistent to take all of the timesteps into account, which is not guaranteed with the Monte Carlo method. Moreover, not all timesteps are weighted equally in this test. Therefore, the k-fold cross-validation is initially used in this research.
In many cases, a single k-fold test does not suffice, because having random folds gives less certainty about the median performance. That is why, in many cases in this research, a 5-fold test is combined with a 10 or 20 times Monte Carlo repetition, in order to get repeatable results.
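A minimal sketch of this repeated k-fold setup with scikit-learn is shown below; the prediction step (refitting the marginal CDFs per fold and evaluating the BN) is abstracted into a hypothetical predict_fn, as it falls outside this snippet.

import numpy as np
from sklearn.model_selection import KFold

def kfold_scores(data, predict_fn, k=5, repeats=10, seed=0):
    """Repeated k-fold cross-validation; predict_fn(train, test) returns a performance score."""
    scores = []
    for r in range(repeats):                      # Monte Carlo repetition of the folds
        kf = KFold(n_splits=k, shuffle=True, random_state=seed + r)
        for train_idx, test_idx in kf.split(data):
            scores.append(predict_fn(data[train_idx], data[test_idx]))
    return np.median(scores), scores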

4.4. PERFORMANCE PER OBSERVATION
In several cases, it is useful to not only look at a single error number, for example when a clear differentiation can be made between predictions of high observations and low observations. In Section 6.6, the error is also plotted against the corresponding observation. This gives an overall image of the performance of the model across the observations.

5 TESTING THE COPULA ASSUMPTION

5.1. INTRODUCTION
In Chapter 2, it is determined that for this case study, the Gaussian copula modelled through the multivariate normal (MVN) method is likely the most useful way to implement copulas in the Bayesian network (BN). The assumptions that the data of the case study are usable in a BN and that the Gaussian copula suits the joint probability distributions of the variable pairs are tested in this chapter. The following tests are performed:
(i) Autocorrelation test: The purpose of the BN is to model the complete joint dependence structure of the variables. This means that the parent variables that are not a child of any other variable should behave like independent variables. The marginal distributions of these variables should be independent and identically distributed (IID), as these are all uniformly distributed. The notion of independence also holds that the variables should be independent in time. This assumption is flawed, as the variables show a degree of seasonal influence, for example the solar radiation. Moreover, for the child variables, it means that they should be independent given their parents (Paprotny and Morales-Nápoles, 2017).
As the target of the variable selection in Section 3.3 was to select all significant influences on the catchment, the only other dependence should come from the other variables. If there is no influence from parameters other than those that are used in the BN, or from the variable itself, the variable is regarded as conditionally independent. This assumption is challenged because the values come from time series, which often show serial correlation. This is already counteracted to a degree by taking the monthly maximum and subsetting the data, see Figure 3.3. How much of the self-dependence remains is tested with an autocorrelation test in Section 5.2. Moreover, for the monthly maximum daily average discharge (MMDAD), it is also tested how much autocorrelation remains after the effect of its parent variables is removed, through a partial autocorrelation test.
Whether other variables, of which data is known, also have an influence on a certain variable is tested by calculating the correlation coefficients between these parameters. For example, air pressure had an absolute correlation of less than 0.1 with the discharge, and is therefore regarded as not influencing the model. Other parameters, such as a discharge close


Other parameters, such as a discharge close upstream of the measurement station, were regarded as having too much predictive power and therefore as being more an additional measurement than a useful variable for the BN. An exhaustive analysis of this theme is outside the scope of this thesis.
(ii) Test the fit of the copula and show indications of tail dependence: Secondly, in this research, the Gaussian copula is used. It is tested whether this is one of the copulas that fits the data best in Section 5.3, and at which positions the copula over- and underestimates the correlation in Section 5.4. Lastly, it is tested whether there is tail dependence in the data, since the employed Gaussian copulas do not model this. If there is tail dependence, it is explored which type of copula would have modelled this better. This is featured in Section 5.5.
Testing copulas requires the use of uniform marginals, which can be deduced from the variables with a cumulative distribution function (CDF) (see Equation (2.3)). In Chapter 6, the best probability distribution is selected. In this chapter, the empirical cumulative distribution function (ECDF) is used (see Equation (6.1)). All the tests are conducted on pair-copulas.

5.2. AUTOCORRELATION TEST
5.2.1. UNCONDITIONAL AUTOCORRELATION
The autocorrelation ρ_auto(l) of a variable is the correlation between a value and the value that occurred l timesteps earlier, over all values. These l timesteps are called the lag.

\rho_{\text{auto}}(l) = \frac{1}{n-l} \sum_{i=1+l}^{n} \rho_p\!\left(v_i, v_{i-l}\right), \qquad (5.1)

where n is the number of values of a variable and ρ_p(x, y) is Pearson's correlation coefficient between two variables x and y (see Equation (2.10)). In Figure 5.1, the autocorrelation test is done for all variables that are used in the initial model. Three lags have been plotted. A lag of 1 month usually yields the highest autocorrelation coefficient, and this lag says a lot about the general volatility of the variable. Lags of 6 and 12 months denote the seasonal influence of parameters.
Concluding from Figure 5.1, the variables Soil Moisture, MMDAD and especially the precipitation have a low autocorrelation and can be used as virtually temporally independent values. However, most of the other parameters show a large seasonal effect. This means there is less information that can be extracted from the dataset than without a seasonal effect, and it potentially makes the correlation between the variables appear stronger than their actual influence. However, it can also be argued that the season is well taken into account when conditioning, because the main seasonal forcing factors on the model (temperature, solar radiation, water levels) are all implemented in the BN. The groundwater level has a slightly lower seasonal effect and can therefore be regarded as approximately unconditionally independent in time.
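As an illustration of the statistic in Equation (5.1), a minimal numpy sketch of the lagged Pearson autocorrelation is given below; the series `monthly_maxima` is a synthetic stand-in for one of the processed monthly variables, not actual catchment data.

```python
import numpy as np

def autocorrelation(values, lag):
    """Pearson autocorrelation of a series with itself shifted by `lag` steps
    (Equation (5.1)): correlate v_i with v_{i-lag} over all valid i."""
    v = np.asarray(values, dtype=float)
    return np.corrcoef(v[lag:], v[:-lag])[0, 1]

# Lags used in Figure 5.1 (monthly data, so 1, 6 and 12 months).
monthly_maxima = np.random.default_rng(1).normal(size=120)  # stand-in series
for lag in (1, 6, 12):
    print(lag, round(autocorrelation(monthly_maxima, lag), 3))
```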

5.2.2. PARTIAL AUTOCORRELATION
The assumption of the BN is that the data is conditionally independent of its previous values, given its parents (Paprotny and Morales-Nápoles, 2017). Therefore, it is also important to look at the autocorrelation of a variable with the effect of its parents removed. For the target variable, the MMDAD, the partial Pearson's correlation has been calculated with Equation (2.17) to show this effect. Pearson's correlation coefficient is used, instead of, for example, Spearman's correlation, because this makes it more comparable to the previous test and because it is more widely used. The conditioning variables were the precipitation, surface water level, groundwater level and solar radiation, as can be seen in Figure 4.1. The results of the test can be found in Table 5.1.


Figure 5.1: Autocorrelation for the variables as processed in model 1, for a delay of 1 month (usually the highest autocorrelation and an important factor in showing volatility) and for 6 and 12 months (to see seasonality effects).

Lag (months)    Pearson's corr. coeff. (ρ_p)
1               0.204
6               0.066
12              0.092

Table 5.1: Partial autocorrelation for the MMDAD.

This table shows that the autocorrelation for the longer periods of half a year and a whole year has become very low; these lags are effectively independent of the measurement at lag 0. This means that the seasonality is carried by the other variables. Moreover, the partial correlation for a single month has been marginalised and only plays a small role. Therefore, it can be concluded that the way the MMDAD variable is set up makes it effectively conditionally independent.

5.3. MULTIDIMENSIONAL CRAMÉR-VON MISES TEST
5.3.1. THEORY
To check whether a copula fits the data well, the goodness-of-fit test proposed by Wang and Wells (2000) is implemented. They propose a multidimensional Cramér-von Mises statistic S_n:

S_n = n \int \left( C_{\text{emp}}(\mathbf{u}) - C(\mathbf{u}) \right)^2 \mathrm{d}C(\mathbf{u}), \qquad (5.2)

where n is the number of samples, C(u) is the CDF of the d-dimensional copula to test against, and u ∈ [0,1]^d is the point at which the copula is evaluated. Lastly, C_emp(u) is the empirical copula and is defined as follows (Genest and Remillard, 2008):

C_{\text{emp}}(\mathbf{u}) = \frac{1}{n} \sum_{i=1}^{n} \mathbf{1}\!\left( \frac{\mathbf{R}_i}{n+1} \leq \mathbf{u} \right), \qquad (5.3)

where R_i is the rank vector, of size d, of each data point and 1(·) is the indicator function. Genest and Remillard (2008) propose the following solution for Equation (5.2):

S_n = \sum_{i=1}^{n} \left( C_{\text{emp}}(\mathbf{u}_i) - C(\mathbf{u}_i) \right)^2. \qquad (5.4)

In this report, an alteration of the Cramér-von Mises statistic is presented, based on the root-mean-squared deviation (RMSD):

S_{\text{RMSD}} = \sqrt{ \frac{1}{n} \sum_{i=1}^{n} \left( C_{\text{emp}}(\mathbf{u}_i) - C(\mathbf{u}_i) \right)^2 }. \qquad (5.5)

It is important to note that this is a mean over the data points, not over the whole unit square of the copula. It is, in a sense, weighted by the chance of occurrence in the data: more of the differences are taken into account where there are more data points.
Furthermore, Genest and Remillard (2008) also propose a bootstrapped method to test H0, that the combined data follows a Gaussian copula. In this research, the one-level bootstrap method posed in this paper is used, with Equation (5.5) instead of Equation (5.4). The one-level method is sufficient, as it performs equally well as the two-level method. This method is defined as follows:

1. Create an empirical copula Cemp (u) with the data.

2. Calculate SRMSD via Equation (5.5).

3. Pick a large N and repeat the following steps for every i ∈ {1, 2, ..., N}.
   (a) Generate a random sample u*_{1,i}, ..., u*_{n,i} (u* ∈ [0,1]^d) from the copula C(u).
   (b) Calculate the empirical copula of this random sample, C*_{emp,i}(u), via Equation (5.3).
   (c) Generate a copula C*_i(u) with the same method as the original copula.
   (d) Compute S*_{RMSD,i} via Equation (5.5), with C*_{emp,i}(u) and C*_i(u) in place of C_emp(u) and C(u), respectively.

4. An approximation of the p-value, the probability of finding at least S_RMSD assuming that H0 is correct, is then defined as follows:

p \approx \frac{1}{N} \sum_{i=1}^{N} \mathbf{1}\!\left( S^{*}_{\text{RMSD},i} > S_{\text{RMSD}} \right). \qquad (5.6)

Note that the results for p are equal when using Equation (5.4) instead of Equation (5.5).
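The one-level bootstrap procedure above could be sketched as follows, assuming a hypothetical `fit_copula` callable that returns an object with a `cdf(u)` method and a `sample(n)` method for the copula family under test; this is an illustrative outline under those assumptions, not the code used in this research.

```python
import numpy as np
from scipy.stats import rankdata

def pseudo_obs(x):
    """Rank-based pseudo-observations R_i / (n + 1), computed per column."""
    x = np.asarray(x, dtype=float)
    return rankdata(x, axis=0) / (x.shape[0] + 1)

def empirical_copula(u, points):
    """C_emp of Equation (5.3), evaluated at each row of `points`."""
    return np.mean(np.all(u[None, :, :] <= points[:, None, :], axis=2), axis=1)

def s_rmsd(u, copula_cdf):
    """RMSD statistic of Equation (5.5), averaged over the data points."""
    diff = empirical_copula(u, u) - copula_cdf(u)
    return np.sqrt(np.mean(diff ** 2))

def bootstrap_p_value(x, fit_copula, n_boot=500):
    """One-level parametric bootstrap for H0: the data follows the fitted copula."""
    u = pseudo_obs(x)
    cop = fit_copula(u)                           # hypothetical: exposes .cdf(u) and .sample(n)
    s_obs = s_rmsd(u, cop.cdf)
    s_star = np.empty(n_boot)
    for i in range(n_boot):
        u_star = pseudo_obs(cop.sample(len(u)))   # steps 3a-3b
        cop_star = fit_copula(u_star)             # step 3c
        s_star[i] = s_rmsd(u_star, cop_star.cdf)  # step 3d
    return np.mean(s_star > s_obs)                # Equation (5.6)
```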

5.3.2. RESULTS
The measure proposed in Equation (5.5) is calculated for all of the copulas presented in Section 2.3. For the marginals, the empirical CDF is used. In Chapter 6, two other methods are proposed for the marginal distributions. The average of the values of S_RMSD per type of copula is presented in Figure 5.2. See Table F.1 in the appendix for the results per connection.


Figure 5.2: Results of the 2-dimensional CvM test for all connections in the BN of model 1. The SRMSD value for each variable combination and all copulas can be found in Table F.1 in the appendix.

It is clear that the Frank copula has the best fit in general for this test, with the Gaussian copula a close second. This is logical, as they have a similar configuration, see Figure 2.2 and Figure 2.6. Both of these copulas fit about twice as well as the other three copulas.
For all variable pairs, the p-value has been calculated with N = 500. For H0, that the empirical copula follows a similar joint distribution as the fitted copula, a significance level of α = 0.05 is used. The p-value has been calculated for all copulas described in Section 2.3. Sampling (step 3a) has been done with the Python module pycopula, in a way described by Hofert (2008), for all types of copula except the Joe copula. The Joe copula was excluded because its S_RMSD results already showed bad fits and, on top of that, because sampling from the Joe copula is not implemented in pycopula. In Figure 5.3, the number of rejected bivariate copulas is shown out of the 24 combinations that are in model 1. It is clear that most bivariate distributions could follow the Frank copula, whereas for the Gaussian and the Gumbel-Hougaard copula, many fits can be rejected. The Clayton copula makes hardly any plausible fit. The complete values can be found in Table F.2 in the appendix.

5.4. ABSOLUTE DIFFERENCES
A visual tool to determine the consequence of a suboptimal fit of the copula is to calculate

\Delta C = C_{\text{emp}}(\mathbf{u}) - C(\mathbf{u}), \qquad (5.7)

for each of the values of u_1 and u_2 in the dataset. This is similar to the multidimensional Cramér-von Mises test (Section 5.3). An example plot, between the empirical copula of the NDVI variable and the solar radiation variable of model 1 and a fitted Gaussian copula, can be found in Figure 5.4.


Figure 5.3: Number of bivariate copulas that can be rejected based upon the p-value method proposed by Genest and Remillard (2008) stated in this section, per type of copula tested on all combinations in model 1. The p-value for each of the variable combinations and all copulas can be found in Table F.2 in the appendix.

Figure 5.4 shows that, based on the data, the copula underestimates the chances of occurrence in the north-west and south-east corners, whereas in the middle it overestimates these chances. This tool is not conclusive about which copula to use, but, given a copula, it shows where errors might arise.

5.5. QUADRANT PEARSON CORRELATION

5.5.1. TAIL DEPENDENCE IN DATA

The assumption that the parameters follow the Gaussian copula can be verified with another method: the investigation of its tail behaviour. Copulas show different types of tail behaviour; for example, Archimedean copulas have different correlations close to one corner of the model in comparison to the opposite corner. Tail dependence can intuitively be defined as the probability that U1 reaches extremely large values, given that the random variable U2 obtains extremely large values. The closer to a corner of the copula, the larger the chance of the other variable reaching an extreme value as well. The Gaussian copula does not have tail dependence (Paprotny, 2017). To test this for each combination of variables, the data is converted, via the uniform values, to a standard normal distribution for both variables. Then, the data is divided into quadrants, with the dividers at the x- and y-axes. Next, the Pearson correlation coefficient is calculated for each of these quadrants, as well as for the whole dataset. When the absolute correlation coefficient for one of these quadrants is higher than that of the whole dataset, tail dependence can be expected (Joe, 2015).
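A possible implementation of this quadrant test is sketched below: both variables are transformed to standard normal scores through their rank-based uniform values, after which Pearson's correlation coefficient is computed per quadrant and for the whole dataset. This is an illustrative sketch, not the exact routine used in this research.

```python
import numpy as np
from scipy.stats import norm, rankdata

def quadrant_correlations(x, y):
    """Pearson correlation per quadrant after transforming both variables to
    standard normal scores, plus the overall correlation (Section 5.5.1)."""
    def to_normal_scores(v):
        u = rankdata(v) / (len(v) + 1)        # uniform pseudo-observations
        return norm.ppf(u)                    # standard normal scores
    zx, zy = to_normal_scores(x), to_normal_scores(y)
    overall = np.corrcoef(zx, zy)[0, 1]
    quadrants = {}
    masks = {
        "NE": (zx >= 0) & (zy >= 0), "NW": (zx < 0) & (zy >= 0),
        "SW": (zx < 0) & (zy < 0),   "SE": (zx >= 0) & (zy < 0),
    }
    for name, mask in masks.items():
        quadrants[name] = (np.corrcoef(zx[mask], zy[mask])[0, 1]
                           if mask.sum() > 2 else np.nan)
    return overall, quadrants
```

A quadrant whose absolute correlation exceeds the overall value then points at possible tail dependence for that pair.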

Of all the combinations tested, 16.7% of the quadrants had a higher correlation coefficient than the overall correlation. This suggests that many quadrants had no significant tail dependence. However, 11 of the 24 combinations had at least one quadrant that showed tail dependence. This implies that using the Gaussian copula for these combinations is imperfect. See Table F.3 in the appendix for the complete results.


Figure 5.4: Absolute differences for the empirical copula of the variables NDVI and Solar radiation of model 1 in comparison with a fitted Gaussian copula.

5.5.2. TAIL DEPENDENCE FOR COPULA TYPES
Calculating the correlation coefficient per quadrant is also a way to see whether copulas fit the data well regarding tail dependence (Morales Nápoles, 2019). The test works by comparing the quadrant correlations of the dataset with the quadrant correlations of samples from a fitted copula. It is performed as follows:

1. Calculate the quadrant correlations for the two variables.

2. Fit a copula of a certain family to the two variables.

3. Generate n samples from this copula.

4. Use the same method to normalise the n sampled variables.

5. Calculate the difference in correlation coefficient per quadrant.

In Figure 5.5, the average result per copula type can be found. It is clear that, again, the Frank copula performs best, closely followed by the Gaussian copula. The Clayton copula performs better in this test than in the Cramér-von Mises test (see Section 5.3). The relatively lower correlation in the NW, NE and SE quadrants was similar to some of the data combinations. See Table F.4 in the appendix for the complete results.


Figure 5.5: Average absolute difference in correlation coefficient per quadrant, per copula type.

6 SELECTION AND IMPLEMENTATION OF THE MARGINAL DISTRIBUTION

The marginal distribution, used in the copula with Equation (2.3), can be implemented in several ways: empirically from the data, or theoretically, with a fitted function. The advantage of fitting a function is that it can smooth out the empirical function when there is not abundant data, and that it has the possibility to predict values smaller or larger than ever recorded. It is essential that the function used to convert the variables to uniform marginals is actually a probability distribution, of which the cumulative distribution function (CDF) is used. Otherwise, it is possible that it predicts wrongly outside of the bounds of the previous data, or that it is non-invertible, which makes calculating the value from the respective uniform marginal ambiguous for some values.
A small exploratory study has been conducted into implementing conventional probability distributions, such as the beta, gamma, Gumbel, Rayleigh and normal distributions, as CDF for the marginal variables. This often gave poor fits, and the prediction of the monthly maximum daily average discharge (MMDAD) became significantly worse than with the empirical CDF (see Section 6.1). Therefore, it was presumed that a more flexible CDF was needed for the marginals. As these flexible CDFs follow the empirical distribution very closely, it is also assumed that selecting different CDF types for the different variables does not make a large improvement to the model. Therefore, the same type of probability distribution is applied to all variables in each of the tests. Moreover, as the central target of this research is for the Bayesian network (BN) to predict the MMDAD well, all testing is done a posteriori, by optimising the Kling-Gupta efficiency (KGE) of predicting the MMDAD with the BN introduced in Section 4.1 for each of the parameters.
Three different methods to fit the marginals are introduced in this chapter. For each of these methods, its parameters are optimised. Afterwards, the implementation of the inverse of these functions (F^{-1}(·)) is discussed. In Section 6.5, the optimal method for this thesis is chosen.


6.1. EMPIRICAL CUMULATIVE DENSITY FUNCTION
The marginal distribution can be implemented as a step-function CDF:

F_{\text{emp}}(v) = \frac{1}{n} \sum_{i=1}^{n} \mathbf{1}\left( v_i \leq v \right), \qquad (6.1)

where n is the number of samples in the dataset, v1,...,vn are all the values in the dataset and 1(x) is the indicator function. This is the one-dimensional case of Equation (5.3).
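As a small illustration, a step-function ECDF of this form can be implemented as follows; the example values are arbitrary.

```python
import numpy as np

def ecdf(values):
    """Step-function ECDF: F_emp(v) = (1/n) * #{v_i <= v}, as in Equation (6.1)."""
    sorted_vals = np.sort(np.asarray(values, dtype=float))
    def F_emp(v):
        return np.searchsorted(sorted_vals, v, side="right") / len(sorted_vals)
    return F_emp

# Mapping a variable to its uniform marginal with the ECDF:
F = ecdf([3.1, 4.7, 2.2, 5.0, 3.8])
print(F(3.8))   # 0.6: three of the five values are <= 3.8
```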

6.2. ALTERED LOGISTIC FUNCTION
The logistic function is an S-shaped (sigmoid) function which can serve as a cumulative distribution function. It is defined as follows:

F(v) = \frac{\alpha_0}{1 + e^{-\alpha_1 (v - \alpha_2)}}, \qquad (6.2)

with α_0, α_1 and α_2 as constants. To be used as a CDF, α_0 should be 1. This leaves the function with only two more constants, which is too little to make a close fit to the data. Therefore, a polynomial is added to the function:

F_{\text{logi}}(v) = \frac{1}{1 + e^{\alpha_1 (P(v) - \alpha_0)}}, \qquad (6.3)

where P is a K-parameter ((2K+1)-degree) odd polynomial:

P(v) = \sum_{i=0}^{K} \alpha_{i+3} \cdot (v - \alpha_2)^{2i+1}, \qquad (6.4)

and α_0, ..., α_{K+3} are the constants. As F_logi(v) represents a cumulative distribution function, it should be ever increasing. Therefore, F'_logi(v) = f_logi(v) ≥ 0 for v ∈ (−∞, +∞), which gives¹:

f_{\text{logi}}(v) = \frac{\mathrm{d}}{\mathrm{d}v} F_{\text{logi}}(v)
= \frac{ -\alpha_1 \cdot P'(v) \cdot e^{\alpha_1 (P(v) - \alpha_0)} }{ \left( e^{\alpha_1 (P(v) - \alpha_0)} + 1 \right)^2 } \geq 0 \quad \text{for all } v \in (-\infty, +\infty). \qquad (6.5)

As e^{α_1(P(v) − α_0)} ≥ 0, the condition becomes −α_1 · P'(v) ≥ 0. Since P(v) only consists of odd powers, P'(v) consists only of even powers. Now, there are two solutions which make −α_1 · P'(v) ≥ 0:

\alpha_1 \leq 0, \quad \alpha_3, \ldots, \alpha_{K+3} \geq 0, \qquad (6.6a)

\alpha_1 \geq 0, \quad \alpha_3, \ldots, \alpha_{K+3} \leq 0, \qquad (6.6b)

which are equivalent because of the way Equation (6.3) is constructed. In this thesis, the condition of Equation (6.6b) is used. Furthermore, the formula should be asymptotic to 0 and 1, which is the case using the conditions of Equation (6.6):

\lim_{v \to +\infty} F_{\text{logi}}(v) = \frac{1}{1 + e^{-\infty}} = \frac{1}{1 + 0} = 1, \qquad (6.7a)

¹See Equation (C.4) in the appendix for the derivation.

\lim_{v \to -\infty} F_{\text{logi}}(v) = \frac{1}{1 + e^{+\infty}} = \frac{1}{1 + \infty} = 0. \qquad (6.7b)

Therefore, this function can be used as a cumulative distribution function.

6.2.1. INITIAL VALUES FOR FITTING THE CDF
In order to get a stable curve-fit algorithm as described in Section 6.4.1, which always finds an optimum, the initial values should be chosen well. By testing configurations that produced a globally similar distribution for each of the variables, it was determined that the following parameters deliver a stable fit in almost all cases:

\begin{aligned}
\alpha_0 &= 0.2 \cdot \min(V) + \tfrac{B}{2}, \\
\alpha_1 &= \frac{0.07}{B}, \\
\alpha_2 &= \min(V) + \tfrac{B}{2}, \\
\alpha_3 &= -80 \cdot B^{-0.5}, \\
\alpha_4 &= -1 \cdot B^{-0.5}, \\
\alpha_5 &= -0.01 \cdot B^{-2},
\end{aligned} \qquad (6.8)

where B is defined as max(V) − min(V), with max(V) and min(V) respectively the maximum and minimum value of the variable V. Using 7 parameters (up to α_6) was not implemented, as this delivers a power of 7 in the odd polynomial of Equation (6.4), which proved to be highly unstable.

6.2.2. OPTIMAL NUMBER OF FIT PARAMETERS
The optimal number of parameters is determined a posteriori, by calculating the KGE for each number of parameters with a k-fold cross-validation, instead of determining the optimal number of parameters a priori via, for example, the AIC or BIC. This is because the central objective is to find the optimal BN, not the optimal F(v) fit. It was outside the scope of this research to test whether a better performance could be acquired by using different numbers of parameters per variable. For each number of parameters, the model was tested with 5 folds and the experiment was repeated 10 times; therefore, in total 50 KGEs were calculated. A boxplot of the KGE per fold is shown in Figure 6.1. It is clear that at least five parameters are needed to acquire a good model performance. However, using six parameters does not add anything to the model performance. Therefore, it is decided to use five parameters for the altered logistic function.

6.3. GAUSSIAN MIXTURE MODEL
The Gaussian mixture model is a combination of multiple Gaussian distributions. It has the ability to fit well to various empirical distributions. It also has a straightforward description of both a probability density function (PDF) and a CDF.


Figure 6.1: Boxplot of KGE per fold, per number of parameters for the altered logistic CDF.

The PDF of the mixture of K normal distributions is defined as follows:

f_{\text{gm}}(v) = \sum_{i=1}^{K} \frac{\alpha_i}{\sigma_i} \, \varphi\!\left( \frac{v - \mu_i}{\sigma_i} \right), \qquad (6.9)

where φ(·) is the PDF of the standard normal distribution (N(0,1)), μ_i the optimal mean, and σ_i the optimal standard deviation of that Gaussian. The CDF is then determined as:

F_{\text{gm}}(v) = \sum_{i=1}^{K} \alpha_i \, \Phi\!\left( \frac{v - \mu_i}{\sigma_i} \right), \qquad (6.10)

with Φ(·) the integral (CDF) of φ(·). The conditions are that:

\sum_{i=1}^{K} \alpha_i = 1, \qquad (6.11a)

α1,...,αK ≥ 0. (6.11b)

Now, because φ(·) > 0 and because of Equation (6.11b), f_gm(v) > 0 for all v ∈ (−∞, +∞). Hence, the first condition for being a proper probability distribution is met. This also implies that F_gm(v) is ever increasing. Moreover, as \int_{-\infty}^{+\infty} \varphi(x)\,\mathrm{d}x = 1, or equivalently, lim_{v→−∞} Φ(v) = 0 and lim_{v→+∞} Φ(v) = 1, the total probability is 1, which satisfies the second condition for being a probability distribution.
As an additional advantage, in selected cases the different normal distributions can be descriptive of an underlying process in the data. These cases mostly consist of several independent underlying processes that each follow a more or less normal distribution, corresponding to the number of normal distributions that is selected.
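For illustration, a minimal evaluation of the mixture CDF of Equation (6.10) could look as follows; the weights, means and standard deviations in the example are arbitrary, and the weights are renormalised so that Equation (6.11a) holds.

```python
import numpy as np
from scipy.stats import norm

def gaussian_mixture_cdf(v, weights, means, sigmas):
    """Mixture CDF of Equation (6.10): F_gm(v) = sum_i alpha_i * Phi((v - mu_i) / sigma_i).
    The weights are renormalised so that they sum to one (Equation (6.11a))."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    v = np.atleast_1d(v)[:, None]
    return np.sum(w * norm.cdf((v - np.asarray(means)) / np.asarray(sigmas)), axis=1)

# Example with K = 3 components (the number selected in Section 6.3.2).
print(gaussian_mixture_cdf([0.0, 1.0],
                           weights=[1, 2, 3], means=[-1, 0, 1], sigmas=[0.5, 1.0, 1.5]))
```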

6.3.1. INITIAL VALUES FOR FITTING THE CDF
Similarly to the altered logistic model, it was determined empirically that the following initial values give stable fits in all cases:

\begin{aligned}
\mu_i &= \frac{i + 1.1}{1.008\,K} \, B + \min(V), \\
\sigma_i &= \frac{i + 3}{0.15\,K \cdot B}, \\
\alpha_i &= \frac{i + 5}{6},
\end{aligned} \qquad (6.12)

where K is the number of Gaussians, B is max(V) − min(V), and max(V) and min(V) are respectively the maximum and minimum value of the variable. In the code, Equation (6.11a) is satisfied by dividing by the sum of the α values.

6.3.2. OPTIMAL NUMBER OF NORMAL DISTRIBUTIONS
With a method similar to the one described in Section 6.2.2, the Gaussian mixture model was tested for one to five normal distributions. As each normal distribution has three parameters (α_i, μ_i and σ_i), this amounts to a total of 3 to 15 parameters.


Figure 6.2: Boxplot of KGE per fold, per number of normal distributions for the Gaussian mixture model

In Figure 6.2, a boxplot is shown of the KGE per number of normal distributions. From this figure, using only one normal distribution is insufficient, but using two already performs well. In this research, it is chosen to use three normal distributions, as the median is higher than with two distributions, and the performance improves less noticeably beyond this number of components.

6.4. INVERSE CUMULATIVE FUNCTION
For both theoretical distribution functions, the altered logistic function (Section 6.2) and the Gaussian mixture model (Section 6.3), the inverse function (F^{-1}(·)) is not defined in closed form for all parameters α. Therefore, the model is first sampled regularly with 1,000,000 points over the x-axis and then linearly interpolated. To be able to extrapolate, the sampling range starts 20% of the data range (the difference between the lowest and the highest value) below the lowest measured value and ends 20% of this range above the highest measured value. This interpolation has an especially good resolution in the margins, as the derivative there is close to 0. For the empirical cumulative distribution function (ECDF) (Section 6.1), the value with the uniform value closest to the sought-after value is chosen.
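A sketch of this numerical inversion, for any fitted CDF passed in as a callable, is given below; the simple logistic CDF in the example is only a stand-in for a fitted marginal.

```python
import numpy as np

def invert_cdf_numerically(cdf, v_min, v_max, n_points=1_000_000, extend=0.2):
    """Invert a fitted CDF by dense regular sampling and linear interpolation,
    extending the grid by 20% of the data range on both sides (Section 6.4)."""
    span = v_max - v_min
    grid = np.linspace(v_min - extend * span, v_max + extend * span, n_points)
    u_grid = cdf(grid)
    def inverse(u):
        # np.interp requires increasing x-coordinates; a CDF is non-decreasing.
        return np.interp(u, u_grid, grid)
    return inverse

# Example with a simple logistic CDF as stand-in for a fitted marginal:
F = lambda v: 1.0 / (1.0 + np.exp(-v))
F_inv = invert_cdf_numerically(F, v_min=-4.0, v_max=4.0, n_points=10_000)
print(F_inv(0.5))   # approximately 0.0
```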

6.4.1. FITTING THE CDFS TO THE DATA

For any number of parameters, fitting of the constants α_0, ..., α_{K+3} is done in Python with a trust region reflective (TRF) algorithm, because it is a bounded problem (Branch et al., 1999).
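As an illustration of such a bounded fit, the sketch below fits the altered logistic CDF of Equations (6.3)-(6.4) with K = 1 (the five-parameter version selected in Section 6.2.2) to the ECDF values of a synthetic variable, using SciPy's curve_fit with method='trf'. The starting values and the gamma-distributed stand-in data are illustrative only and do not correspond to Equation (6.8) or to the catchment data.

```python
import numpy as np
from scipy.optimize import curve_fit

def altered_logistic_cdf(v, a0, a1, a2, a3, a4):
    """Five-parameter altered logistic CDF (Equations (6.3)-(6.4) with K = 1)."""
    P = a3 * (v - a2) + a4 * (v - a2) ** 3
    return 1.0 / (1.0 + np.exp(a1 * (P - a0)))

# Fit to the ECDF values of a variable, enforcing the sign conditions of
# Equation (6.6b) via bounds and the bounded TRF algorithm.
rng = np.random.default_rng(0)
data = np.sort(rng.gamma(shape=2.0, scale=1.5, size=200))   # stand-in data
ecdf_vals = np.arange(1, len(data) + 1) / len(data)

lower = [-np.inf, 0.0, -np.inf, -np.inf, -np.inf]   # a1 >= 0
upper = [np.inf, np.inf, np.inf, 0.0, 0.0]          # a3, a4 <= 0
p0 = [0.0, 1.0, np.median(data), -1.0, -0.1]        # illustrative starting values
params, _ = curve_fit(altered_logistic_cdf, data, ecdf_vals,
                      p0=p0, bounds=(lower, upper), method="trf")
print(params)
```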

6.5. PREFERRED CDF
6.5.1. OPTIMAL FIT
A k-fold test with 5 folds has been performed for model 1 with the ECDF, the altered logistic CDF and the mixed Gaussian CDF as fit function for all parameters. This process was repeated 20 times, such that there were 100 test results for each of the fit functions. The results can be found in the boxplot in Figure 6.3.


Figure 6.3: Boxplot of KGE per fold, per type of cumulative density function described in this chapter.

It can be seen from the graph that the ECDF and the mixed Gaussian CDF perform similarly, whereas the altered logistic function performs slightly worse. The range of the results from the mixed Gaussian CDF is a little wider than that from the empirical CDF, but not significantly.

STABILITY RESULTS
All probability distributions had some folds that performed poorly, but the number of such folds was not significantly different between the distributions. However, the altered logistic CDF had three folds in which one of the variables could not be fitted to the function. This makes this distribution less stable.

6.5.2. EXTRAPOLATION
The functions that mimic the empirical distributions have the possibility to extrapolate. Therefore, they can potentially calculate higher discharges than have ever been measured before. However, it is not known how well they do this.

6.5.3. COMPUTATIONAL DURATION
For all of the methods, the time to fit all of the CDFs and test one fifth of the model (k = 5 for all tests) was less than 10 seconds². This is much shorter than, for example, running a SOBEK model (which can take from hours up to days, and for which fitting partly has to be done manually), and it is therefore regarded as an insignificant amount of time. However, when larger models with more parameters than proposed in this thesis are used, the difference in fitting time can be of importance. Performing a run as mentioned in this section took on average 3.5 seconds for the ECDF, 2.4 seconds for the altered logistic CDF, and 5 seconds for the mixed Gaussian CDF.

6.5.4. CONCLUSION
As the ECDF and the mixed Gaussian CDF performed similarly when used to predict the MMDAD with the BN proposed in Section 4.1, but the mixed Gaussian has the potential to extrapolate, the latter is chosen for the rest of this research.

6.6. SHIFTING THE CDFS TO INCREASE EXTRAPOLATION
The ECDF's lowest value is exactly 0 and its highest value is exactly 1. This causes the fitted CDFs to lie very close to 0 and 1 at the edges of the data. Despite these fitted values never actually touching the limits 0 and 1, and thus in principle allowing extrapolation, this extrapolation very rarely happens. The reason is that the curve fit moves the function as close to 0 as possible at the lowest value of the ECDF and as close to 1 as possible at the highest value. Suppose that, in a fitted CDF, the uniform value of the maximum data value is 0.999. Then there is only 0.1% of the uniform space left for extrapolation on the high end. For predicting relatively high MMDADs, for example, it can be useful to be able to draw more samples from the extrapolated region.
As a consequence, the method underestimates the high discharges (see the blue dots in Figure 6.7). In general, high discharges are most prone to cause floods. Therefore, underestimating these is unfavourable. It would therefore be beneficial for these regions if the method extrapolated more.

²See Appendix G.2 for the hardware used for these calculations.

6.6.1. SHIFT SAMPLES OUTWARDS
This research proposes a shift of the uniform values of the fitted CDF, after the CDF is fitted to the variable. The shift should make values in the extremes of the range [0,1] a little less extreme: for example, an extrapolation range of [0.99,1) could better be shifted to [0.97,1). This way, three times more samples can be drawn from the extrapolated areas. This makes the model more likely to extrapolate, but it also makes the CDF less directly based on the data. The shift is based only on the uniform value, and a prior uniform value of 0 should remain 0 after the shift, and similarly for 1, as otherwise not all uniform values (u) can be translated to a value (v). The tangent is a function that enlarges these ranges when scaled correctly. When applied to the uniform space of a copula marginal distribution, a tangent-based shift for the uniform value u_tf can be set up as follows:

u_{\text{tf}} = \frac{\tan\!\left( g (u_{\text{pre}} - 0.5) \right)}{2 \tan(g/2)} + \frac{1}{2}, \qquad (6.13)

where upre is the initially calculated uniform value u and g is a scale factor.


(a) Equation (6.13) for different values of g. (b) Equation (6.14) for different values of g.

Figure 6.4: Shift functions

In Figure 6.4a, an example of the shift function can be found for different values of the scaling factor g. This shift alters the middle section as well; as the middle does not involve the part that can be extrapolated, this is unfavourable.

6.6.2. LIMIT SHIFTED RANGE
To solve this, a balance between no translation and the shift of Figure 6.4a is implemented. To focus largely on the edges, the blending factor is taken quadratic:

u_{\text{sd}} = u_{\text{tf}} \, (u_{\text{pre}} - 0.5)^2 + u_{\text{pre}} \left( 1 - (u_{\text{pre}} - 0.5)^2 \right). \qquad (6.14)

In Figure 6.4b, Equation (6.14) is plotted for different values of g. The shift now happens almost only at the edges, increases closer to the ends, and comes back to the points (0,0) and (1,1) to make a range that accepts all inputs. The total difference is also far less pronounced.

6.6.3. DIFFERENTIATE IN EXTRAPOLATION SIDES
For some variables, extrapolation is mostly needed beyond the upper limit, or the other way around; for example, when the method keeps underestimating high discharge peaks. In this thesis, this is the case (see Figure 6.7). Therefore, a shift can be made to focus the change more on one half of the CDF. To do this, the value 0.5, which centres the shift between 0 and 1, can be parameterised:

u_{\text{sd},h} = u_{\text{tf}} \, (u_{\text{pre}} - h)^2 + u_{\text{pre}} \left( 1 - (u_{\text{pre}} - h)^2 \right), \qquad (6.15)

with h as the parameter.
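The shift functions can be combined into a few lines of Python; the sketch below evaluates Equations (6.13) and (6.15) (Equation (6.14) being the special case h = 0.5) for the parameter values used later in Section 6.6.4. It is an illustrative sketch, not the code used in this research.

```python
import numpy as np

def tan_shift(u_pre, g):
    """Tangent-based shift of Equation (6.13)."""
    return np.tan(g * (u_pre - 0.5)) / (2.0 * np.tan(g / 2.0)) + 0.5

def edge_shift(u_pre, g, h=0.5):
    """Edge-focused shift of Equations (6.14)-(6.15): blend u_tf and u_pre with
    a quadratic weight centred at h, so that mainly values near 0 and 1 move."""
    u_tf = tan_shift(u_pre, g)
    w = (u_pre - h) ** 2
    return u_tf * w + u_pre * (1.0 - w)

# Values used in Section 6.6.4: g = 2.65 and h = 0.5 for all variables,
# except the MMDAD, for which h = 0.4.
u = np.array([0.01, 0.5, 0.95, 0.999])
print(edge_shift(u, g=2.65, h=0.4))
```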


Figure 6.5: Equation (6.15) for different values of h.

An example of different values of h can be found in Figure 6.5. It shows that a bigger shift on one side implies a smaller shift on the other side when done with h. Figure 6.6 shows a possible effect on the soil moisture CDF, as an example. Here, h is 0.4, so there is an emphasis on the higher levels. The figure clearly shows that the chances of high soil moisture levels become higher. For example, the chance of a soil moisture higher than 0.40 m³/m³ was initially about 0.02, but after applying Equation (6.15) it is approximately 0.10. However, the real extremes (> 0.45 m³/m³) still occur very rarely, as the function changes its slope to come back to (1,1) at the very end. This way, almost no highly implausible events (> 0.6 m³/m³, for example) are drawn.

6.6.4. IMPLEMENTATION
For the model used up to this point, Equation (6.15) was applied to all of the variables to get better extrapolation results. For all of the variables, the scale factor was g = 2.65, as higher values distort the distributions too much, and the shift was equal on both sides with h = 0.5, except for the MMDAD: relatively high MMDADs were often predicted too low, while these have the highest chance of creating a flood. Therefore, an h of 0.4 was used for this variable.


Figure 6.6: ECDF, fitted mixed Gaussian CDF and an example of a CDF shifted by Equation (6.15) (g = 2.8, h = 0.4), for the soil moisture variable.

In Figure 6.7, the effect on the error per discharge amount is shown. The figure shows that the method underestimates less for the observations higher than 9 m³/s. For the middle section (5 to 9 m³/s), the error increases, as the method overestimates more on average. For the lowest discharges, the errors do not change significantly. There was a slight increase in median KGE for the model by using this shift: it increased from 0.72 to 0.73³. The decision on whether to use the proposed shift is therefore a choice between two alternatives that each have an advantage and a disadvantage. As this research is interested in discharge extremes, it is chosen to implement the shift as described in this section.

³Tested with a 5-fold test, repeated 20 times, both with and without the shift.


Figure 6.7: Error for different observations, before and after applying Equation (6.15) with g = 2.65 and h = 0.5, except for the MMDAD variable, where h = 0.4. The shift does not make the model perform considerably better in general, but it changes where the errors are: for higher discharges, the error decreases, whereas for median discharges (5 to 9 m³/s), the error increases.

7 MODEL PARAMETERS AND ERROR SENSITIVITY

Next to the optimal copulas to use and the optimal cumulative distribution function (CDF), there is a variety of other parameters that can be tested and changed for the Bayesian network (BN) model to make it perform better. This chapter tests these parameters one after the other, in which the model for the next parameter test uses the preferred results of all the previous tests. Testing all parameters at once through a Monte Carlo method would create calculation times in the order of days and is therefore outside the scope of this thesis. The tests are conducted by calculating the Kling-Gupta efficiency (KGE) for the BN predicting the monthly maximum daily average discharge (MMDAD), through a repeated k-fold test.

7.1. MOST LIKELY VALUE FROM SAMPLED VALUES
The target variable is sampled n times (see Section 7.2 for the optimal value of n) to see what the potential outcomes might be. This gives a probability distribution. However, for the ease of testing a parameter or model, for practical use of the model, and for communicating model results, a single value for the most likely outcome is useful. There are two easy ways to obtain this:
(i) Using the mean of the values. This is the most common method to calculate an expected value in the model. An implication is that, for example, when really high peaks are possible, the mean also shifts upwards, and vice versa. Therefore, this might say more about the whole distribution than about a single most likely outcome. This is also the method that has been used up to this point in the research.
(ii) Using the median of the values. This means that there are exactly as many sampled values higher than this number as there are lower. This makes practical sense: this value has an equal chance of being too low as of being too high. Moreover, when plotted with confidence intervals (see for example the figure in Section 9.1), this is consistent, as both the confidence interval and the median are in a sense quantiles of the dataset. For example, the boxplots in Chapters 6 and 7 also use the median as the single value to show the most likely KGE per fold.
As both show advantages, it is interesting to see which method delivers the best fit to the measured values.



Figure 7.1: A comparison between the KGEs of using the mean versus the median of the samples. In general, using the mean gives results that are slightly closer to the actual values.

Therefore, the model that is acquired with the optimal CDF from Chapter 6 is tested for both of these methods. It is tested with a k-fold test with k = 5, repeated 20 times with other random folds to get a good average. This sums up to 100 folds and KGEs per method. These results can be found in Figure 7.1. The plot shows that using the mean of the samples delivers a slightly better fit to the data. Moreover, the results from using the median are a little less stable, with two folds being outliers and the range of 50% of the data (the box) being a little wider. For this research, this is the decisive argument to use the mean. However, it is possible to use the median when the arguments posed in (ii) are more important, as this does not deliver dramatically worse results.

7.2. NUMBER OF SAMPLES TO USE
The copula is sampled according to Section 2.6. It is interesting to see how many samples are needed to acquire sufficient predictive value. Three conditions for the number of samples to use are discussed in this research.

7.2.1. GOOD FIT OF THE TARGET VARIABLE
To test how many samples are sufficient to get a good fit of the target variable, the model is tested 5-fold with 10 repetitions of random folds, so 50 folds and KGEs in total per number of samples. The results can be found in Figure 7.2. The figure suggests that using 20 samples would already be sufficient to get results that are similar to those from using a higher number of samples.


Figure 7.2: Boxplot of the KGE per number of samples used to predict the target variable from the Gaussian copula. The performance seems to level off from around 20 samples onwards.

7.2.2. CONFIDENCE INTERVALS
When confidence intervals are required to get a sense of the certainty of the calculated value, the number of samples should be sufficient to calculate stable quantiles. For example, when a confidence interval of 90% is used with 100 samples, it leaves 5 values of the data on each side. As variables usually have a low probability density towards the edges of their distributions, especially since they have been sampled from a Gaussian copula, these 5 values have a higher volatility than the values in the centre. That is why this does not deliver a stable quantile. The rule of thumb followed in this research is to use at least 50 samples in each of the margins outside of the confidence interval. As the confidence interval used in Section 9.1 is 80%, at least 50/0.1 = 500 samples are necessary.
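For illustration, an 80% interval from 500 sampled values can be obtained directly from the sample quantiles; the lognormal samples below are only a stand-in for values drawn from the conditioned BN.

```python
import numpy as np

# 80% confidence interval from 500 sampled MMDAD values (rule of thumb: at least
# 50 samples in each tail outside the interval, so 50 / 0.1 = 500 samples).
rng = np.random.default_rng(0)
samples = rng.lognormal(mean=1.0, sigma=0.4, size=500)   # stand-in for BN samples
lower, upper = np.quantile(samples, [0.10, 0.90])
print(f"80% interval: [{lower:.2f}, {upper:.2f}], mean prediction: {samples.mean():.2f}")
```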

7.2.3. COMPUTATIONAL TIME
In the testing of the model, most of the time running the model goes into fitting the CDFs. The computation time only increases noticeably when testing with more than 10,000 samples in the Python code written for this research. Below this number, the calculation time was in the order of 7 to 9 seconds per run, but for more than 10,000 samples it increased to more than 10 seconds for most of the folds. From Sections 7.2.1 and 7.2.2 it follows that this number of samples is not necessary to acquire good results. Therefore, as the computational time is not a limiting factor for the rest of this research, no lower number of samples needs to be used, and 500 samples are used from this section onwards. However, for a model with a wide confidence interval, many prediction timesteps, and potentially intermediate results, computational speed might be a limiting factor.

7.3. SUBSET PERIOD DATA
The data for the model has been aggregated from the days before the MMDAD event, see Section 3.3.9, step 1. As the model is almost solely data-driven, it is unclear at the start which period to take into account. Therefore, a Monte Carlo test has been done with a 10 times randomly repeated 5-fold test of the model, for 9 different periods before the maximum event. For example, when this subset period was 4 days for an MMDAD on June 8, the data was taken from June 4, 5, 6 and 7 and the first 12 hours of June 8.


Figure 7.3: KGE per number of days before the discharge event over which the data is aggregated to construct the model.

It is clear from Figure 7.3 that it does not matter greatly which subset period is used. Going 8, 9 or 10 days back seems to perform best, but this does not significantly outperform using anywhere between 6 and 10 days as a subset period. Therefore, in this thesis, a subset period of 8 days was taken.

7.4. SENSITIVITY TO INPUT ERRORS
It is common to recognise two types of errors: systematic and random errors. For this research, it is also interesting to look into whether the error is in the conditioning variables or in the conditioned variable(s).

7.4.1. SYSTEMATIC ERRORS OR BIAS
A continuous systematic error, or even an error that is a direct monotonically increasing function of the actual values, does not influence the BN results. This is because of the translation to the uniform margins for the copulas. What is interesting, however, is to test what the influence of a new systematic error is.

MULTIVARIATE NORMAL FUNCTION
When a variable is changed by some amount, the vector A from Equation (2.18) is shifted by ΔA, which is the shift in standard normal values per conditioning variable. This means that the new mean of the conditional normal distribution subject to an error, μ̂_e, is defined as:

\hat{\mu}_e = R_{12} R_{22}^{-1} (A + \Delta A). \qquad (7.1)

The shift in the mean is then given by:

\Delta \hat{\mu}_e = R_{12} R_{22}^{-1} \, \Delta A. \qquad (7.2)

However, the implication for the predicted variable depends on both μ̂_e and Δμ̂_e, as F(v) from Equation (2.3), as well as Equation (2.20), are non-linear functions. This also holds for ΔA. Therefore, to see the shift in the prediction for this research, a simulation with artificial systematic errors is conducted.

SIMULATION OF ARTIFICIAL ERRORS
There are 7 conditioning variables that could have any or no systematic error. Each of the variables has been tested with an artificial systematic error, one after the other, for its influence on the prediction of the MMDAD. The error is a factor times the difference between the highest and the lowest value; this factor is called the relative error (ε_r), and the resulting error is added to the values of a variable. No k-fold test is done, as the test set is already altered by the error. The results can be found in Figure 7.4.
The results show that an error in variables that are not directly correlated to the MMDAD has no significant impact on the final results, as these variables have very little influence whatsoever when the child variable's value is known. However, for the other variables, the impact can be greater. From an absolute relative error of 0.5 times the range of the values for a single variable, the results start to become unacceptably worse. However, all of the direct variables also have a region in which the prediction actually gets better. For the negatively correlated variables, surface water level and solar radiation, this happens when small values are subtracted from the measurements, and for the positively correlated variables it is the other way around. A possible explanation is that the variables are averaged too much in the BN, such that insufficient extremes are predicted.

7.4.2. RANDOM ERRORS
THEORY
In this research, a constant standard error σ_m is assumed to influence the measurements. The influence of these errors on the network can first be studied by looking at the standard error of the aggregated subset values. In mathematical terms, this is the standard error of the mean, defined as follows:


Figure 7.4: The loss/gain in KGE for different artificial systematic errors. Note that both axes are logarithmic in factors of 10, for both positive and negative values; this is why a shift to a linear scale (symlog axis) is made between −0.1 and 0.1 on the x-axis and between −0.2 and 0.2.

\sigma_v = \frac{\sigma_m}{\sqrt{n}}, \qquad (7.3)

where n is the number of measurements in the subset period. For the subset period of 8 days, the standard error of the aggregated value is calculated with an artificial standard error of 0.2B for the measurements, where B is the range between the highest and lowest measured value of a variable. The results can be found in Table 7.1.

Variable                Avg. number of measurements     Standard error of means with
                        in subset period (n)            σ_m = 0.2B (×B)
MMDAD                   96                              0.0204
Precipitation           204                             0.0140
Temperature             204                             0.0140
Solar radiation         204                             0.0140
Soil moisture           9                               0.0667
NDVI                    1.495¹                          0.1641
Groundwater levels      119                             0.0183
Surface water levels    1080                            0.00609

Table 7.1: Standard error factorisation of means

¹NDVI data is already aggregated by NASA, so a σ_m of 0.2B would be far more significant for these measurements. The standard error of this measurement is expected to be lower than that of a real individual measurement.

It is clear that the error of the aggregated subset values decreases significantly, as most variables have many values. For the soil moisture, however, the error remains significant. The data of the NDVI is already aggregated by NASA, so it does not compare proportionately.

SIMULATION OF ARTIFICIAL ERRORS
To test the influence of an additional random error, the model has been tested with an artificial random error. This is composed as follows: per variable, for each of the timesteps, a random normal error is added to the value. The errors are assumed to follow a Gaussian distribution, so this random error is drawn from N(0, σ_v), where σ_v is defined according to Equation (7.3). In this function, the average number of measurements is used for n (see Table 7.1), and σ_m is expanded as:

\sigma_m = \epsilon_{r,r} \, B, \qquad (7.4)

where B is the largest value of the variable minus the lowest value (max(V) − min(V)), and ε_{r,r} is called the relative error, a factor that is varied in this method. This procedure is done 20 times per variable per ε_{r,r} to get a good average. The results can be found in Figure 7.5.
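A minimal sketch of this perturbation, following Equations (7.3) and (7.4), is given below; the uniform series is a synthetic stand-in for a variable such as the soil moisture.

```python
import numpy as np

def add_random_error(values, n_measurements, eps_rr, rng):
    """Add a Gaussian random error per timestep, with standard deviation
    sigma_v = eps_rr * B / sqrt(n) as in Equations (7.3)-(7.4)."""
    v = np.asarray(values, dtype=float)
    B = v.max() - v.min()
    sigma_v = eps_rr * B / np.sqrt(n_measurements)
    return v + rng.normal(0.0, sigma_v, size=v.shape)

# Example: perturb a soil-moisture-like series (n = 9 measurements per subset period).
rng = np.random.default_rng(2)
series = rng.uniform(0.10, 0.45, size=60)
print(add_random_error(series, n_measurements=9, eps_rr=4.0, rng=rng)[:5])
```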


Figure 7.5: Artificial random error and its influence on the performance in predicting the MMDAD. The factor ε_{r,r} is defined as σ_m/B; in other words, the standard error is the factor ε_{r,r} times the range between the highest and the lowest value of a variable (B).

According to the test, for almost none of the variables does a random error have any significant impact on the results. A relative error of 4, which is already very high, has no impact on the variables that are not directly correlated, and very little on the results of the surface water level, solar radiation and groundwater levels. For the surface water level and solar radiation, an increase was even visible, which is remarkable and suggests that these variables may not be optimal predictors. Only the precipitation shows a large decrease in KGE from an ε_{r,r} of 4 onwards. In conclusion, the model is very robust against random errors in the data.

7.4.3. SYSTEMATIC ERRORS IN THE DISCHARGE MEASUREMENTS
If the discharge measurements were to have systematic errors, the model would be fitted wrongly. However, the predicted discharges would be the same as if the model had been trained on the correct measurements, shifted by the systematic error in the discharge measurements. For the KGE, this means that the factors r and α in Equation (4.2) do not change. The mean µ_obs can be redefined as follows:

µobs = µact + ϵs,MMDAD , (7.5)

where µ_act is the mean of the actual measurements and ε_{s,MMDAD} is the systematic error in the observations. If the actual discharges were known, it would be possible to calculate the KGE with µ_act instead of µ_obs. This gives a sense of the sensitivity of the KGE to potential systematic errors. In the BN proposed in this research, the mean of the simulations is almost equal to the mean of the observations (see Section 9.1.2). This means that the factor β (Equation (4.3b)) in the KGE formula (Equation (4.2)) is almost 1. In this section, this factor is presumed to be exactly 1, yielding the following difference in KGE (ΔKGE):

\begin{aligned}
\Delta \text{KGE} &= 1 - \sqrt{ (r-1)^2 + (\alpha-1)^2 + \left( \frac{\mu_{\text{sim}}}{\mu_{\text{obs}} - \epsilon_{s,\text{MMDAD}}} - 1 \right)^2 } - \text{KGE}_{\text{pre}} \\
&= 0.28 - \sqrt{ 0.28^2 + \left( \frac{\epsilon_{s,\text{MMDAD}}}{\epsilon_{s,\text{MMDAD}} - 4.24} \right)^2 },
\end{aligned} \qquad (7.6)

where r, α and µ_sim are the usual KGE variables, µ_obs in β has been substituted by µ_act, with the latter defined in Equation (7.5), and KGE_pre is the original KGE. The second line of Equation (7.6) is obtained by assuming that β_pre = 1 and noting that KGE_pre was 0.72 on average. The full derivation can be found in Equation (C.5) in the appendix. Equation (7.6) has been plotted in Figure 7.6.
From the graph it follows that, in general, positive systematic discharge errors are worse for the KGE than negative systematic discharge errors of the same magnitude. The reason is that, in Equation (7.6), the magnitude of the denominator becomes smaller when ε_{s,MMDAD} is larger than 0. This does not mean that it is in all cases better for the model to have underestimations rather than overestimations, as the KGE is just a measure of performance. The user's view on the performance is more related to the goal of the model. If, for example, the model is made for flood safety, it might actually be better to measure a higher discharge than is actually occurring.


Figure 7.6: The effect on the KGE if the discharge measurements were to have a systematic error and the model was benchmarked against the actual discharge.

8 BAYESIAN NETWORK LAYOUT

Up to this point in the research, the sensitivity analyses and optimisation steps only concerned the parameters and settings related to the copulas. For the user of the model, the visible part of a Bayesian network (BN) is the most intuitive and comes closest to reality. Furthermore, the layout of the BN has considerable impact on what can be conditioned and on how well some variables can be predicted. It can also influence whether the influences of a certain variable can be distinguished correctly. For example, if it is known that on a certain date the temperature will be higher, one might want to see what the influence of this variable on the network is. If this variable mostly influences the precipitation, which in turn influences the rest of the variables, then mostly the effect of the precipitation is seen. This chapter goes into the creation of a BN layout that suits the catchment best and achieves the modelling goal of this research.
To come up with the preferred layout for this research, first the criteria that make one layout superior to another have to be drawn up. Afterwards, a strategy is composed to satisfy these criteria, based on the criteria and the characteristics of non-parametric Bayesian networks (NPBNs). When applying the strategy, some implementations are ambiguous: implementing a section in one way or another explains the workings of the catchment differently. As the model is no perfect representation of reality, neither argumentation is wrong. In this research, the choice of which implementation to take is based solely on the performance of certain predictions in the model, as no other method can be used to make an unambiguous decision.

8.1. CRITERIA
In this research, the following criteria are used to assess the BN layout¹.

1. Physical relations: the model layout should adhere to logical cause-effect relations. This makes the model physically relevant and better interpretable.

2. No useless variables: the model should not contain variables that do not add any predictive power to any of the other variables. This keeps the model relevant and orderly.

¹See Appendix A for an explanation of the terminology used in this section.


3. Accurate predictions and no overfitting: the model should perform well, even when tested on data that is not in the training data.

4. Understandable model: the model should be understandable for anyone who has a decent amount of proficiency in hydrology and rudimentary knowledge about BNs.

5. Variables predictable from other variables when not known: in case the value of a variable is not known at a certain timestep, the other variables should update the distribution of that variable in line with the other data points. The updated distribution should also help the update of other variables, such as the target variable, to better comply with observations.

An additional condition is that the configuration cannot have circular relations (i.e., that the same point in the graph can be reached by following arcs in their direction), as BNs are directed acyclic graphs (DAGs).

8.2. STRATEGY
The following strategy to create a BN layout that complies with these criteria has been established:

1. Select potential variables to use in the model.

2. Calculate normal rank correlations for all combinations and remove variables that do not hold normal rank correlations higher than 0.3 with any of the other variables and are therefore not of use to make predictions.

3. Use physical relations to create a model.

4. Determine order of dependencies to child variables.

5. Remove connections with low partial normal rank correlations (< 0.1) from the BN. Usually, these connections do not have a significant influence on the update of the distributions of other variables. An exception can be made to this rule if the connection does make a noticeable difference in the update of a variable which is regarded as important.

The normal rank correlations of the variables used in this research can be found in Figure D.8 in the appendix. None of the variables introduced earlier has to be removed. Despite this, the low correlation of the precipitation measurements is noteworthy.
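For illustration, the normal rank correlations used in steps 2 and 5 of the strategy can be computed by transforming each variable to standard normal scores via its ranks and taking the Pearson correlation matrix of those scores; the sketch below also flags variables whose strongest correlation with any other variable stays below the 0.3 threshold. The random data is only a stand-in for the catchment variables.

```python
import numpy as np
from scipy.stats import norm, rankdata

def normal_rank_correlation_matrix(data):
    """Correlation matrix of the normal scores of the ranks: transform each
    variable to standard normal via its pseudo-observations, then take the
    Pearson correlation matrix of those scores."""
    data = np.asarray(data, dtype=float)
    u = rankdata(data, axis=0) / (data.shape[0] + 1)
    z = norm.ppf(u)
    return np.corrcoef(z, rowvar=False)

# Step 2 of the strategy: flag variables whose strongest normal rank correlation
# with any other variable stays below 0.3.
rng = np.random.default_rng(3)
X = rng.normal(size=(120, 4))                       # stand-in for catchment data
R = normal_rank_correlation_matrix(X)
weak = [j for j in range(R.shape[1]) if np.max(np.abs(np.delete(R[j], j))) < 0.3]
print(R.round(2), weak)
```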

8.3. CONNECTIONS IN THE MODEL BASED ON INFLUENCES ON THE VARIABLES
To address Items 3 and 4 of Section 8.2, a look is taken at which variables could potentially influence other variables in the actual catchment. The former variables will be the parents of the latter variables. The following list goes through all of the variables and determines which variables could have influenced that variable. Then, it discusses the order of dependencies per variable. For some of the arcs, multiple interpretations of the system are possible; it is then merely a choice which setup to implement. In this research, the corresponding decisions are based solely on the accuracy of the model, see Section 8.4.

• Solar radiation: The solar radiation is determined by the solar incidence angle, cloudiness, and other minor atmospheric parameters. These other atmospheric parameters are not known, so they cannot be implemented. The cloud cover could be correlated to the precipitation, so this variable has a small potential to be a parent. However, the normal rank correlation between precipitation and solar radiation is only -0.14 (see Figure D.8), which is very low. Therefore, the solar radiation does not have any parents in the model.

• Precipitation: Due to the climate in The Netherlands, the precipitation pattern is very similar throughout the year (see Figure 3.6). Therefore, a season-based variable, such as solar radiation or temperature, is not a good candidate as a parent variable. Moreover, rain clouds originate almost entirely from outside the catchment area (in contrast with large rainforest catchments), which indicates that the precipitation is likely not influenced by any other variable. Therefore, precipitation is also regarded as a top-level variable.

• Temperature: The temperature is influenced by solar radiation, weather factors, and local solar radiation reflection and cooling effects. As the area of this catchment is relatively small for local weather influences, and the terrain is relatively flat, local effects are presumed to be insignificant. Solar radiation influences the temperature directly and is therefore implemented this way. The feedback mechanism through evaporation and cloud formation is regarded as insignificant in this direction. The same could be said for using the NDVI, because of the cooling effect of plant transpiration. This is choice (1). When the NDVI is a parent of the temperature, its correlation is conditional on that of the solar radiation and the NDVI, because the solar radiation influences the plant activity as well.

• NDVI: This vegetation index is a proxy for leaf cover and plant activity in the catchment. As plants need solar radiation to grow, this is the first parent for the NDVI. Regarding the connection with temperature: plants also fare better at a higher temperature, so the arrow between the NDVI and temperature could also point the other way around in choice (1). Finally, soil moisture availability also influences how much plants can grow. This relation is physically not monotonic, as there is a strict margin (between wilting point and field capacity) within which the plants can grow. Therefore, the NDVI is influenced by the soil moisture. Unconditionally influencing the NDVI is the solar radiation. If implemented in this fashion, the temperature comes second and is only conditional on the solar radiation. The soil moisture, if used as such, is dependent on the other two variables, to remove the seasonal effect from it and let the soil moisture explain the relative water availability at that moment.

• Soil moisture: As the soil moisture measurements are of the top soil, this consists mostly of unsaturated soil moisture. This quantity is mostly influenced by the precipitation, suction by plants, capillary rise from groundwater, or directly by groundwater rise (because of seepage or saturation effects) if this reaches the top layer of the soil. Moreover, evaporation forces (temperature, solar radiation) can also influence soil moisture.

The first parent variable is the precipitation, as this is presumed to cause soil moisture differences very directly. Secondly, arguments can be made for connecting groundwater to the soil moisture in either direction: this depends on whether capillary rise or percolation is the dominant process. This is choice (2). Lastly, it is debatable how much evaporation happens in the catchment in relation to transpiration. This is choice (3). Whether to select the groundwater or the precipitation as the first variable for the partial correlation is also a choice (4). After this come the solar radiation and the temperature, in that order, if they are used.

• Groundwater level: As mentioned above, soil moisture has one of the largest effects on the groundwater level, to be selected in choice (2). Groundwater is depleted again via the surface water and is afterwards discharged at the downstream end of the catchment. However, arguments can also be made that the surface water actually influences the groundwater level. This is because the surface water level is measured in a managed area (see Figure 3.2), in which the surface water level is artificially raised to create a higher groundwater level. However, as can be seen from the same map, the measurement stations are also relatively far away from each other, which can also be an argument for there not being an arc in the BN. This is choice (5).
If the surface water level is a parent variable to the groundwater, it will have the unconditional correlation to the groundwater, as they are very closely related in managed areas. The influence of the soil moisture is then used indirectly: given the surface water level.

• Surface water level: As mentioned in Section 3.2.6, the area of surface water is only 1.36% of the catchment. This means that the influence of direct precipitation on the surface water is low in comparison to that of the groundwater, which is the main contributor. Therefore, choice (4) determines what to do with the surface water levels. No other variables are connected. This means that the surface water level will never have multiple parent variables, so no choice has to be made on the order of partial correlations.

• Monthly maximum daily average discharge (MMDAD): The discharge does not influence anything in the catchment, as it (almost always) flows out of the catchment. What influences the discharge is, on the other hand, up for interpretation. This is choice (6). Directly, it is only the downstream surface water level that influences the discharge. However, the measurement station is located in only a small tributary, and using a single variable does not make for a good prediction. That is why, in this research, the groundwater level, precipitation and soil moisture (which can be related to the discharge more directly than the other variables) are also used to predict the discharge directly. Lastly, a variable with a strong predictive power of the season is added as well: the solar radiation.
The order of dependence between the surface water and groundwater levels is determined by choice (5). As the direct relation of some of the other variables is more or less imaginary, no strong arguments can be made for the partial correlation order of the other variables. In this research, the order is: surface water level, groundwater level, precipitation, soil moisture, and solar radiation.

8.4. SELECTED IMPLEMENTATIONS
It is not possible to make a completely satisfying decision on the BN layout, because choices have to be made between implementations that all have positive and negative aspects. In this research, we will therefore decide upon these choices solely based on the performance of predicting a variable in the model. Users of similar models can make other decisions, based on their preferences and other leading processes in their catchments.
All the choices with a small number of variables (choices (1) to (5)) are recapitulated in Table 8.1. Selecting a certain implementation in choice (6) is very arbitrary, and (given that the surface water level is connected to the MMDAD) still has 2^6 = 64 different options. However, all of the different options in choices (1) to (6) still give 3·3·2·3·2·3 = 324 different implementations, which is too time-consuming and labour-intensive for this research.

Choice 1 - NDVI - Temperature:
  A: no connection;  B: NDVI → Temperature;  C: NDVI ← Temperature
Choice 2 - Groundwater - Soil moisture:
  A: Groundwater → Soil moisture;  B: Groundwater ← Soil moisture
Choice 3 - Solar radiation & Temperature - Soil moisture:
  A: no connection;  B: Solar radiation → Soil moisture;  C: Solar radiation & Temperature → Soil moisture
Choice 4 - Groundwater & Precipitation - Soil moisture (order of dependence):
  A: Groundwater dependent: ρ_Prec,SM, ρ_GW,SM|Prec,SM;  B: Precipitation dependent: ρ_GW,SM, ρ_Prec,SM|GW,SM
Choice 5 - Surface water level - Groundwater level:
  A: no connection;  B: Surface water level → Groundwater level;  C: Surface water level ← Groundwater level

Table 8.1: Overview of the choices in the model. In bold are the implementations that are used in the intermediate models to solve the choices preceding that choice.

Therefore, each of the choices is considered one after the other, from (1) to (6). For each of these choices, a test is constructed to test the performance of the model in a specific section of the model. The model is only conditioned on the variables that are of interest for the choice at hand. This method is chosen, instead of conditioning the whole model and testing the accuracy of the MMDAD prediction (in Kling-Gupta efficiency (KGE)), because no large differentiation in performance is expected, especially for variables that are not directly connected to the MMDAD. Moreover, the model should also perform well in predicting other variables.
Each of the different implementations was tested 5-fold, with 20 random repetitions, hence providing a total of 100 results per implementation. For each of the implementations, the median of the KGEs obtained by comparing the target variable predictions and observations was taken. The median is used because in a few cases the model gave very inaccurate results, with KGEs in the order of -10 to -60, which have a big influence on the mean of the KGEs. The optimal implementation was chosen as the one with the highest median KGE, and this setup was used in the testing of the subsequent choices. The results of these tests can be found in Table 8.2.

Choice   Target          Conditioned on                                 Median KGE A   B      C      Selected
1        Soil moisture   NDVI, Temperature                              0.26           0.23   0.24   A
2        MMDAD           Soil moisture, Groundwater level               0.34           0.38   -      B
3        NDVI            Solar radiation, Temperature, Soil moisture    0.37           0.38   0.41   C²
4        -               -                                              -              -      -      -³
5        MMDAD           Groundwater level, Surface water level         0.41           0.42   0.39   B

Table 8.2: Outcome of testing all of the choices and the selected implementations.

Table 8.2 shows that the KGEs of the different implementations do not differ much. Therefore, the selected implementations should not be regarded as the overall optimal setup, but merely as the optimal model - by a small margin - for the KGE of a part of the model. Implementation C for choice (3) delivers a connection with a partial normal rank correlation of -0.089, so slightly below 0.1 in absolute value. This is in conflict with Item 5 of Section 8.2. However, as this arrow does add something to the prediction of the NDVI, the connection is kept in. As the result of choice (2) is an arrow from soil moisture to groundwater, choice (4), the order of dependencies between the groundwater and precipitation, has become redundant.
Regarding the final choice (6), the variables surface water level, groundwater level, soil moisture, solar radiation and precipitation have been selected to be connected to the MMDAD.
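The performance-based selection described above (5-fold cross-validation, repeated 20 times, summarised by the median KGE) can be sketched as follows. The functions `build_and_predict` and `kge` are placeholders for the actual model and metric code, so this is an outline rather than the thesis implementation.

```python
# Sketch of the implementation test: 20 random repetitions of 5-fold
# cross-validation, summarised by the median KGE per implementation.
import numpy as np
from sklearn.model_selection import KFold

def median_kge(X, y, build_and_predict, kge, n_splits=5, repeats=20, seed=0):
    rng = np.random.RandomState(seed)
    scores = []
    for _ in range(repeats):
        kf = KFold(n_splits=n_splits, shuffle=True,
                   random_state=rng.randint(1_000_000))
        for train_idx, test_idx in kf.split(X):
            y_pred = build_and_predict(X[train_idx], y[train_idx], X[test_idx])
            scores.append(kge(y[test_idx], y_pred))
    # The median is used because a few strongly negative KGEs (of the order
    # of -10 to -60) would otherwise dominate the mean.
    return np.median(scores)
```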

8.5. FINAL MODEL
The final model is shown in Figure 8.1. See Appendix A for an explanation of how a BN layout works.

2This keeps a seemingly redundant connection (ρs,norm < 0.1) in the model; however, it does yield a better prediction. See the explanation in Section 8.4.
3Because of the implementation of choice (2), this choice has become redundant.


Figure 8.1: Final model layout. The numbers at the bottom of the boxes are the mean and the standard deviation of the values, in their respective units: NDVI (-), temperature (°C), solar radiation (J/cm²), groundwater level (hPa), soil moisture (m³/m³), precipitation (mm, cumulative over the whole subset period), surface water level (m). The numbers on the arcs are the (conditional) normal rank correlations between variable pairs, defined by Equations (2.12) and (2.17). See also Appendix A for a more extensive description of a BN layout.

9 RESULTS BAYESIAN NETWORK AND BENCHMARK MODELS

In this chapter, the results of the final model, which has been optimised in the previous chapters, are discussed. Afterwards, a number of other models are introduced and tested with the data, to be able to benchmark the results of the Bayesian network.

9.1. GENERAL RESULTS BAYESIAN NETWORK
The median performance of the final model was a Kling-Gupta efficiency (KGE) of 0.73 per fold. For Figure 9.1, a model is made on all of the data in the dataset (and is therefore not k-fold tested). It can be seen that most of the observations are within the 80% confidence interval. The absolute error is usually highest for the highest peaks, which is common. However, as high peaks usually deliver the worst floods, it is also not favourable.

9.1.1. COMPLETE k-FOLD TEST
In Section 4.3, it is discussed that only the fitting of the CDFs can be performed automatically in the Python code written for this thesis, via k-fold cross-validation. As a final evaluation, a complete k-fold test has been executed, in which the matrix R (see Section 2.4.2) is also created from the data in the training set via Uninet. The results of the 5-fold test can be found in Figure 9.2. Evidently, the separate R-matrices do not lower the KGE significantly. The average difference between the maximum and minimum correlation over the 5 folds, per connection, was 0.06, and the maximum of these differences was 0.14. This was the connection between the monthly maximum daily average discharge (MMDAD) and the surface water level.

Figure 9.1: Prediction and measured observations of the MMDAD at Heerenslagen (KGE = 0.79). This specific model is not k-fold tested. Cumulative distribution function (CDF): Gaussian mixture with 3 normal distributions; most likely value: mean (see Section 7.1); number of samples: 500; shifted CDF as mentioned in Section 6.6.4.

Figure 9.2: Result of a 5-fold test where the matrix R has also been calculated per fold, showing the KGE per fold together with the median and mean. See Section 4.3.

9.1.2. FACTORS KGE
As was mentioned in the previous section, the average KGE was 0.73. Since the KGE is used, it is also known how the different KGE components have influenced this number. First of all, the correlation coefficient ρ between the predicted values and the observations was 0.85 on average. This is one of the reasons the KGE is not higher. Secondly, the factor α, the ratio between the two standard deviations, was on average 0.83. It can also be seen from Figure 9.1 that the volatility of the model is slightly lower than that of the observations. One of the reasons for the reduced volatility of the model is the underestimation of the high peaks. Lastly, the factor β was on average 0.9997. This means that the mean of the predictions is about the same as the mean of the observations. This can be explained by the fact that a probability distribution is made from the observations. The mean of this distribution is also drawn out of the random samples, and the factor has about an equal chance of being higher than 1 as being lower. However, it is still remarkably close to 1 most of the time.

Additionally, the median Nash-Sutcliffe efficiency (NSE) of the final model was 0.66, when 5-fold tested 20 times. Optimising the model for the KGE might not have given the optimal model for the NSE. Both determine performance with a slightly different philosophy, which means that the optimal model is optimised for the KGE philosophy.
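For reference, the decomposition of the KGE into its correlation, variability and bias components, together with the NSE, can be computed as sketched below. This uses the standard formulations; the exact definitions used in this thesis are those of Section 4.2.

```python
# Sketch of the KGE decomposition discussed above (standard Gupta et al. 2009
# form) and of the NSE, for arrays of observations and simulations.
import numpy as np

def kge_components(obs, sim):
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    r = np.corrcoef(obs, sim)[0, 1]          # correlation term
    alpha = sim.std() / obs.std()            # variability (std. dev.) ratio
    beta = sim.mean() / obs.mean()           # bias (mean) ratio
    kge = 1.0 - np.sqrt((r - 1)**2 + (alpha - 1)**2 + (beta - 1)**2)
    return kge, r, alpha, beta

def nse(obs, sim):
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return 1.0 - np.sum((sim - obs)**2) / np.sum((obs - obs.mean())**2)
```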

9.1.3. ERROR PER OBSERVED VALUE
For the 20-times repeated 5-fold test, the error (predictions minus observations) has been plotted against the observed discharge in Figure 9.3. It is clear that, in general, for the low discharges the error is often small, but many folds overestimate the discharge significantly. For the median discharges (5 to 8 m³/s), the general error is higher, but this is still an overestimation. Finally, for the highest discharges (8+ m³/s), the model tends to underestimate the discharge slightly. Two observations in the high range are never really estimated well.

Figure 9.3: Model results for the final model, plotted as error versus the observed MMDAD.

9.1.4. NOT FIXING ALL VARIABLES
For almost all of the tests in this thesis, the target was the prediction of the MMDAD based on fixing all of the other variables. However, Bayesian networks are also able to condition the distribution of the target variable when not all other variables are fixed. If this is used, the non-fixed variables are also conditioned by the fixed variables, which propagates through the network. In this section, the MMDAD is predicted by fixing a number of variables in the network. As there are 7 variables in the model besides the MMDAD, there are 2^7 = 128 combinations. Testing all of these combinations would give very little overview of the results. Therefore, in this thesis, 1, respectively, 2 variables are fixed. This gives a total of:

n_{comb} = \binom{7}{2} + \binom{7}{1} = 21 + 7 = 28        (9.1)

combinations. For each of the variables or variable combinations, a 5-fold test is repeated 10 times with different folds, using the final model. The median KGE of this test can be found in Figure 9.4. In general, the precipitation, groundwater level and surface water level are the best predictors in this model. However, combining precipitation with one of these water levels gives a significantly better result than combining the groundwater level with the surface water level. After these variables, the soil moisture is a decent predictor. The NDVI and temperature on their own are useless predictors. However, combined with, for example, precipitation, the combined prediction becomes significantly better.

Figure 9.4: Median KGE for a 5-fold test, repeated 10 times, predicting the MMDAD by fixing one or two of the variables soil moisture, precipitation, NDVI, temperature, solar radiation, groundwater level and surface water level. On the diagonal from top left to bottom right are the results where only one variable was fixed.
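The 28 conditioning sets of Equation (9.1) can be enumerated directly, for example as in the sketch below (variable names as used in the model).

```python
# Sketch: enumerate the 28 conditioning sets of Equation (9.1)
# (all single variables and all unordered pairs of the 7 non-target nodes).
from itertools import combinations

variables = ["Soil moisture", "Precipitation", "NDVI", "Temperature",
             "Solar radiation", "Groundwater level", "Surface water level"]

conditioning_sets = [set(c)
                     for k in (1, 2)
                     for c in combinations(variables, k)]
print(len(conditioning_sets))  # 7 + 21 = 28
```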

9.1.5. PREDICTING ALL VARIABLES
As mentioned in Section 9.1.4, the Bayesian network updates the probability distributions of other variables (than solely the target of this research) as well. How well these other variables can be predicted is tested with a 5-fold test, which has been repeated 20 times with different divisions. All of the other variables are fixed in this test. The results can be found in Figure 9.5.


Figure 9.5: KGE per variable when all other variables are fixed. Tested 5-fold with 20 random repetitions per variable.

As can be expected, the MMDAD can be predicted best. The model is optimised for this variable, which also has the most connections to other variables. The solar radiation is also relatively well predictable. A likely reason for this is the fact that the solar radiation variable has three connections in the model and has a high seasonal dependency, which makes a prediction in general relatively easy. The precipitation, which of all variables was best able to predict the discharge (see Section 9.1.4), is itself only moderately predictable. The NDVI is not very useful for predicting the MMDAD and is also difficult to predict itself. This makes the NDVI, in the prediction sense, a less useful variable in the model.

9.1.6. PREDICTING ALL DAYS

SAME MODEL, TESTING ALL DAYS
In this thesis, the modelling has been focused on the monthly maximum discharge days. However, in practice, when an extreme event has to be predicted, it is likely that this day is unclear. Therefore, a number of tests has been conducted to determine the predictions when the model is tested on all days.
The first test is testing the model on all of the days. This means that the subset periods heavily overlap. The KGE that resulted from this test was 0.04. This is mainly due to the variance and mean parts of the KGE, which were 1.45 and 1.82 respectively. Especially the too high mean has a strong influence on the KGE, as has been determined in Section 7.4.3. This poor performance is very logical, as the model's goal was to model monthly maximums. As only maximums were fed into the training set, the model is likely to predict high values as well.
Therefore, it is more relevant to look at the predictions of the monthly maximums. The KGE that came out of this test was 0.72. So, the model still performs about equally well when other days are also tested. However, only in 16% of the cases was the date on which the event happened predicted correctly. This is still higher than the fully random 3%. The average distance between the day of the actual event and the predicted day was 6.5 days. Furthermore, 61% of the predicted days were within 4 days of the actual event, whereas a random distance would be approximately 10 days. Thus, when the date of the event is predicted wrongly, the model still predicts a date within the same weather pattern. Lastly, in 62% of the cases, the model predicted the date too late in the month.

MODEL BASED ON ALL DAYS
It is also possible to set up the model itself for all days in the dataset. This completely ignores the conditional independence assumption, as all subset periods of data now highly overlap. The KGE that was acquired when testing this model was 0.73, when 5-fold tested with 20 random repetitions. This suggests that the conditional independence assumption is not needed for predictions with these kinds of models. However, as approximately 30 times more data is used for this model, it could be expected that the model based on all days would perform better. As this is not the case, this might be an indication that there is a negative effect of the non-independent dataset.

9.2. BENCHMARK MODELS
It is interesting to see how well other models would perform in this catchment. That is why four different kinds of models were selected to see how well they perform on the same target variable, the MMDAD, with (as much as possible) the same input values.

9.2.1. SATURATED BAYESIAN NETWORK
Creating a model that has all possible connections has a number of advantages. First of all, it eliminates the need for knowledge about the catchment in order to create a physically-based Bayesian network (BN) layout. Of course, this also has the disadvantage that the model is less intuitive and features inconsistent physical relations; therefore, not all conditioning makes sense. Secondly, there is no need to use the recursive correlation equation (Equation (2.17)), which can be complicated to implement. In this thesis, Uninet is used to acquire these correlation coefficients. The correlation matrix R used for a saturated BN has the normal rank correlation r_r,norm (Equation (2.12)) on all indices. Lastly, because this model uses all information available, the model has a marginally better accuracy in predicting the target variable MMDAD.
Using the same BN parameters as the unsaturated model, except for the layout, the KGE became on average 0.74 per fold, compared to 0.73 for the unsaturated BN.

9.2.2. SOBEK MODEL
SOBEK is a software suite created by Deltares, which offers water system calculations that follow the Saint-Venant equations1. It is widely used in the Dutch water sector, and many water boards use it for multiple purposes, among which their official peak flow modelling. This is done to check whether they comply with the agreement Nationaal Bestuursakkoord Water (National Governmental Agreement on Water), which states several inundation limits (Rijksoverheid et al., 2003). Waterschap Drents Overijsselse Delta (WDODelta) also uses a SOBEK model for their water flows, for example to check the effect of any alterations that they might implement in their system. Other SOBEK models are constructed for a multitude of goals: among others, monitoring effects on discharge and water level extremes, checking the water quality and the water balance, and predicting aridity.
The model used in this thesis is made for predicting discharge peaks in a quick fashion, a so-called decision supporting system. It is maintained by the company Deltares, and changes are added regularly. The model calculates solutions numerically, with timesteps in the order of 1 minute to 1 hour. A slightly altered version of this model is used to benchmark the results against (no alterations were made in or closely surrounding the catchment). The same precipitation and potential evaporation values were taken as those used in Section 3.4. Wind and water temperature were not defined, because these are not acquired for this research. The discharge is taken from the same stretch as where the main measurement station is situated. The parameters have been fitted beforehand, but despite this, running a model evaluation still takes 25 hours when timesteps of 10 minutes are used over 8 years of data.

1See https://www.deltares.nl/nl/software/sobek-suite/

Figure 9.6: Daily average results for the SOBEK model as well as the discharge observations at the measurement station Heerenslagen. The values at the MMDAD events are marked as well.

RESULTS
The general model results can be found in Figure 9.6. The model seems to predict a base flow similar to the measurements. However, the volatility is far off, and some peaks and periods of higher discharge, such as the winter of 2012-2013, are underestimated. The KGE for all dates was only 0.576, and the α factor of the volatility was 0.700, which is very low for such a model. Moreover, for the monthly maximums, the model performed even worse: it acquired a KGE of only 0.400, as all peaks except for two were (sometimes strongly) underestimated. Possible reasons for this low value are:

• The model has been made for the whole network of the water board. Therefore, it is possible that the fitting of this section was made worse, because it had to comply with other measurements downstream of the catchment.

• The fitting of the discharge at the Steenwijker Aa did not deliver as good results as that of other catchments (De Graaf and Rusticus, 2013).

• There were only two months of data used in fitting the network (De Graaf and Rusticus, 2013). This can cause the model to overfit that event, thus producing worse fits for other events (for example in a different season).

• Weir level differences were not implemented in the model. To store water for dry periods, the water level is artificially raised in spring and summer. This parameter is implemented in the BN (surface water level, Section 3.3.6), but not in the SOBEK model.

However, this is the model that is currently used to predict high discharges at the water board. A BN provides a simple tool with which extreme events can be predicted in a computationally inexpensive way and with a better performance than the SOBEK model.

9.2.3. NEURAL NETWORK
The data of model 1 has been used in an (artificial) neural network (NN). This is a model that is loosely based on the workings of the brain. Via simple switch-like functions in a great number of neurons, the model is able to make predictions of various phenomena. It is usually not traceable what the exact reasoning is behind the choices of the parameters, but for the target value, the method is capable of calculating a wide range of types of predictions. These types can be highly abstract, such as predicting whether an image contains a cat or a dog. This means that the method is often called a 'black box': the reasoning behind the model's 'choice' cannot be explained easily. This can be unfavourable for governments, because they often have to explain their reasoning.


Figure 9.7: Layout of the Tensorflow artificial neural network that is used. It has two dense layers with 64 neurons each and, to obtain the target value, a single dense layer of 1 neuron.

The model that is used is shown in Figure 9.7 and consists of two hidden, dense layers of 64 neurons. This is the type of model that is advised by Google's NN package Tensorflow in a tutorial for similar regression models by Chollet (2017). The core of this model consists of Keras elements.
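A minimal sketch of a network of this type (two hidden dense layers of 64 neurons and a single output neuron, in the style of the tf.keras regression tutorial) is given below; the training settings shown are illustrative and not necessarily those used for the results in this section.

```python
# Sketch of a regression network of the type described: two hidden dense
# layers of 64 neurons and one output neuron for the target value.
import tensorflow as tf

def build_model():
    # The input dimension (7 predictors here) is inferred on the first fit.
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(1),
    ])
    model.compile(optimizer="adam", loss="mse", metrics=["mae"])
    return model

# Usage (illustrative):
# model = build_model()
# model.fit(X_train, y_train, epochs=200, validation_split=0.2, verbose=0)
# y_pred = model.predict(X_test)
```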

The average KGE for the model shown in Figure 9.7 was 0.58, and fitting and predicting once with the model took on average 18.4 seconds. A possible explanation for the low performance is the relatively small number of data points: NNs typically need a large training dataset.

9.2.4. MULTIPLE LINEAR REGRESSION
A basic method to predict one variable from another variable is linear regression (y = αx + β). It is also possible to predict values V_sim from any number of variables with the formula

V_{sim} = X A,        (9.2)

with X defined as

X = \begin{pmatrix} 1 & v_1^1 & \cdots & v_d^1 \\ \vdots & \vdots & & \vdots \\ 1 & v_1^n & \cdots & v_d^n \end{pmatrix},

where v_j^i denotes the value of variable j at timestep i, so v_1^1, ..., v_1^n are the values of the first variable and so forth. The parameters A = (α_1, ..., α_{d+1})^T can be calculated with the normal equation:

A = (X^T X)^{-1} X^T V_{target},        (9.3)

where V_{target} is the vector of target values (n × 1). It is not possible to return a whole probability distribution for any of the prediction variables or the target variable (Ng, 2011). The column of 1s, in combination with the additional α, represents the offset from the origin, or, symmetrically, the factor β in the bivariate linear regression formula.
The method has been tested in Python with k-fold cross-validation (see Section 4.3); in this case, the X in Equation (9.2) is different from the X in Equation (9.3). The average KGE was 0.70, and a model evaluation takes on average 0.003 seconds.
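Equations (9.2) and (9.3) translate directly into a few lines of NumPy. The sketch below is illustrative and not the exact code used in this thesis; np.linalg.solve is used instead of an explicit matrix inverse, which gives the same normal-equation solution.

```python
# Sketch of multiple linear regression via the normal equation,
# Equations (9.2) and (9.3): a column of ones provides the intercept.
import numpy as np

def fit_normal_equation(V, v_target):
    """V: (n, d) array of predictor values, v_target: (n,) target values."""
    X = np.column_stack([np.ones(len(V)), V])        # prepend the 1s column
    A = np.linalg.solve(X.T @ X, X.T @ v_target)     # (X^T X)^-1 X^T V_target
    return A

def predict(V, A):
    X = np.column_stack([np.ones(len(V)), V])
    return X @ A                                     # Equation (9.2)
```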

9.3. COMPARISON OF MODELS
In Table 9.1, the different models discussed in this chapter are compared. This table shows that, for the modelling goal of this thesis, the unsaturated Bayesian network is optimal. It offers advantages in modelling time and computational time, and it keeps physical relations. Moreover, other variables can also be predicted from fixed variables. Meanwhile, the method keeps returning complete (conditioned) probability distributions. In comparison with the multiple linear regression, the unsaturated Bayesian network performs only slightly better in predictions. As the multiple linear regression is easier to set up, this might be an alternative in certain cases, where the other advantages of the unsaturated Bayesian network are not of importance. The same holds true for a saturated Bayesian network, which results in even slightly higher KGEs.

Method              Avg. KGE (-)   Avg. duration per run   Probability distribution returned
Bayesian network    0.73           5 s                     Yes
Saturated BN        0.74           5 s                     Yes
SOBEK               0.40           45 hours                No
Neural network      0.58           18.4 s                  No
Mult. linear regr.  0.70           0.03 s                  No

Method              Conditioning                          Black box   Preparation effort
Bayesian network    Physically valid relations            No          Low
Saturated BN        All, including impossible relations   No          Very low
SOBEK               Only spatially                        No          High
Neural network      No                                    Yes         Low
Mult. linear regr.  No                                    No          Very low

Table 9.1: Comparison between the different models tested in this research.

10 DISCUSSION

In this chapter, the methodology, the acquired results and the determination of the optimal model are put into context. This is done by looking at multiple topics: a general evaluation and interpretation of the research, potential limitations of the study, and a comparison with similar research. As the application of the method proposed in this thesis as a hydrologic model is still very novel, there are not many sources to compare the research against.

10.1. MODEL PRINCIPLES
In this research, the focus is completely on a single catchment and on creating the best possible model for the prediction of the monthly maximum daily average discharge (MMDAD) for this catchment. This is fundamentally different from, for example, Paprotny and Morales-Nápoles (2017), who use many catchments and strive to make a single model that can estimate the annual maximum runoff of catchments. The variables of that model are mainly catchment parameters such as the area and average slope of the catchment. At the other end, the work by Couasnon et al. (2018) focuses on very directly related variables: discharges predicted by the discharge upstream. This means that the case study proposed in this research is the first catchment in which a Bayesian network is made that is predominantly based on meteo-hydrological variables.
This means that the catchment variables have a different role in this thesis in contrast to Paprotny and Morales-Nápoles (2017). In this thesis, the workings of the meteo-hydrological variables are modelled by the Bayesian network (BN) nodes, which are expressed through their cumulative distribution functions (CDFs), whereas in Paprotny and Morales-Nápoles (2017), these are modelled by the arcs, and therefore the copulas. The opposite is true for the catchment parameters (i.e. slope, area etc.), which are modelled implicitly by the copulas in this thesis, whereas they are used as nodes in Paprotny and Morales-Nápoles (2017). Couasnon et al. (2018) take a similar approach, although instead of the meteo-hydrological variables, they mostly use discharges, which are more directly correlated to the downstream discharge than the meteorological variables.
This approach also means that the model is based on time-dependent variables. This has as a consequence that the variables are likely to also depend on themselves, i.e. show autocorrelation.

This means that an additional factor influences the variable which cannot be modelled easily in the BN: the previous values of the same variable. As mentioned in Section 3.3.2, this phenomenon is minimised in this thesis by using monthly maximum values (MMDAD). Couasnon et al. (2018) use a far more frequent interval, using mean daily discharges. They assume that, because the autocorrelation for their case drops rapidly, no issues can be expected from this. However, a lag of 1 still produces significant correlations in their research. It would be interesting to see the influence of autocorrelation on the predictions in future research. Paprotny and Morales-Nápoles (2017), on the other hand, use a spatial dataset and only one row per catchment, predicting the mean of the yearly maximum discharges. Influences of potential spatial correlation are not tackled in that paper, which is also an interesting subject for future research.

10.2. CASE STUDY
The goal of this research is to make a model of a single catchment, as a case study, such that several techniques and methods can be analysed in depth. This is in contrast with the research by Paprotny and Morales-Nápoles (2017); Sanjaya (2018); Torres Alves (2018), which tested a multitude of catchments. These studies did not attempt to create an optimal discharge model per single catchment, but merely an overarching model for all these catchments.
In order to do research on a case study which is also useful for various other catchments, a very typical Dutch catchment has been chosen. This means that many Dutch catchments should be comparable to this catchment, and conclusions made in this thesis are more likely to be applicable to these other catchments as well. For this to be true, the chosen catchment should be representative of other lowland, partially managed catchments. However, no research has been put into verifying whether it is indeed representative. This is a vulnerability of the research, as it is possible that other catchments give different accuracies when tested with a similar model. However, the data used for this research is available for many catchments in The Netherlands and other regions, such that a performance test of the BN should not be difficult in other catchments.
Another reason for the selection of this catchment is that the data availability was good. Moreover, an outline of the catchment boundary could be made such that the data could be verified with a water balance (see Section 3.4). To exclude additional influences on the catchment, it is preferred that the case study is not predominantly managed. In that case, the discharge would depend too much on the decisions of the manager, instead of on the other variables.

10.3. DATA
The data that has been used for this research is acquired from different sources, which have not all been scientifically quality-checked, in contrast to the data used in, for example, Paprotny and Morales-Nápoles (2017). Therefore, in this research, the data has been examined for its quality. As the model should and does still work in circumstances in which the measurements contain a bias, the data is not meticulously tested. This is in contrast with highly fundamental research, in which the data needs to be highly accurate.
This means that it is possible that there are some errors in the measurements which have gone unnoticed. To exclude errors as much as possible, different sources of data have been compared and the final dataset has been verified as much as possible in the Budyko framework (Section 3.4). On top of that, in Section 7.4, a comprehensive analysis is done of what the effect of errors could be, which is not large in the case of most of the variables.
During the timespan over which the data is used, a number of changes have been made to the river stream. Most of these changes concerned a minor section of the catchment, but changes in the outflow pattern cannot be ruled out.

10.4. COPULA ASSUMPTION AND MULTIVARIATE NORMAL METHOD
In Chapter 5, the assumption that the combined probability distributions of the variables follow a Gaussian copula is tested. Especially noticeable is the fact that, for a small majority of the variable pairs, the Gaussian copula does not satisfy the one-level bootstrapped test. This shows that the Gaussian copula is not a perfect description of the combined probability distributions for 13 of the variable pairs. However, finding a single copula that can model all of the pairs perfectly is not the goal of this research.
What Chapter 5 does show is that, in general, the Frank copula would have been a better fit for the variable pairs. The Frank copula is more complicated to implement than the multivariate normal method, and therefore not strictly a better (user-friendly) model than the Gaussian copula. What is also clear from Chapter 5 is that 11 variable pairs showed some kind of tail dependence, which Gaussian copulas fail to model. However, this leaves a majority of variable pairs for which the Gaussian copula does model the lack of tail dependence well. This also shows in the fact that the Frank copula, which also has no tail dependence, fits best for many variable pairs. All in all, the Gaussian copula turned out to be not strictly optimal, but it still has a good basis to be used. Moreover, the goal was to make a good overall model, not to make every element perfect.
One of the reasons for the usage of the multivariate Gaussian copula is that it avoids the need for the highly complicated vine-copula structure. No hydrologic models similar to the one proposed in this thesis have been made using vine-copulas. There are other calculations in hydrology that do use vine-copulas, such as Gräler et al. (2013). However, these use significantly fewer nodes, such that the number of vines, which can be calculated by Equation (B.3), is significantly lower.

10.5. MARGINAL DISTRIBUTION
In Chapter 6, the choice has been made to opt for volatile marginal distributions that fit the empirical distributions very closely, and to implement the same type of distribution for all variables. The Gaussian mixture model is selected instead of the empirical cumulative distribution function (ECDF), as this allows extreme uniform values to be translated to values that have not been measured before. Paprotny and Morales-Nápoles (2017) use the ECDFs for all of the non-target variables. Their study included as many European catchments with measurements suiting these methods as possible. This means that their dataset provides a wide range of measurements, from 1841 stations. Therefore, it is likely that, for any new catchments that are tested, the variables are included within the minimal and maximal values of the ECDF. In this thesis, as only about 8 years of data are used, it is likely that new extremes will be measured in the future.

The Gaussian mixture model, which is used here, has been implemented before by Couasnon et al. (2018). In that paper, the rest of the variables are fitted with a generalised extreme value (GEV) distribution. Paprotny and Morales-Nápoles (2017) also use this distribution for their discharge measurements. This thesis proposes a shift function of the uniform variables, to make the model underestimate the extreme discharges less. This underestimation happens because the closely fitting marginal distributions predict very few values in their extremes. Marginals that use the GEV distribution do not have this problem, as its CDF is smoother and approaches 1 relatively more gradually. The shift function still benefits from the use of a theoretical distribution, as the shift function only exaggerates what is happening in the fringes: when the input in these edges is highly unfavourable, the output of the shift function cannot show highly favourable behaviour. For the ECDF this boils down to two disadvantages: (1) the step function that is the ECDF means that only a small number of steps is actually shifted and the shift is highly dependent on the location of the step, and, even more importantly, (2) as the ECDF has a fixed point at the highest value, where u = 1, the complete top step cannot be shifted whatsoever.

10.6. ORDER OF PARAMETER OPTIMISATION
The optimisation of the BN has been done through consecutively finding the best-fitting parameters. However, this leaves out certain parameter combinations that have not been tested. The reason for not thoroughly testing all combinations is that the computational time would increase greatly if a 20 times randomly repeated 5-fold test were conducted for every combination. Especially parameters that have been tested on a large number of values, such as the number of samples, would make this method very slow. Moreover, testing each parameter consecutively allows for a clearer visualisation of the test results. It is assumed that, because the differences between the various parameter values were low, it is not likely that an untested combination of parameters performs significantly better than the final model does.

10.7. TESTING OPTIMISATIONS
Except for the copula assumption, most of the testing has been conducted a posteriori with the MMDAD as goal variable. There is a number of reasons why this is done. Firstly, in this thesis, the main target was predicting the MMDAD. The copulas modelling the multivariate distribution perfectly was also important, but was ranked secondary. Moreover, these kinds of optimisations are often more understandable for users, as they often also want to predict such a target variable. Functions like the Akaike information criterion (AIC) are often regarded as less intuitive. On this point, the work of Paprotny and Morales-Nápoles (2017) looks at this differently. For example, when fitting a marginal distribution for the discharge (the target variable in that paper), the AIC is used on the fit of the CDF to the data. This means the optimal function is determined a priori.
The main test metric used throughout this research is the Kling-Gupta efficiency (KGE). This coefficient gives an overall score based on the difference between the predicted outcome and the observations, based on three submetrics. However, how hard more extreme inaccuracies should be penalised in the score is always up for debate. The KGE uses the Pearson correlation coefficient, but other methods (the mean squared error, for example) could also be possible, depending on the preferred relative penalty for high inaccuracies.
If the scores of this model are to be compared to other research, the Nash-Sutcliffe efficiency (NSE) is a more common metric; the main reason for this is that the NSE is an older metric. It is therefore likely that in the future a higher share of research will use the KGE. The KGE is, in most of the cases, used for the target variable MMDAD. As a goal of the model is also to be able to predict other variables, this might in some cases have given a different result. However, as the MMDAD is regarded as the most important variable to predict and a single answer was required, testing on this variable only was chosen in most of the cases.
These tests are almost all conducted with a k-fold test, in which 5 folds were used. This was generally repeated 10 or 20 times with new random folds. The performance when using different parameters was compared with boxplots of these outcomes. Differences between these boxplots were often small; therefore, a clear optimum could not always be selected. Repeating the k-fold test more often would probably not have helped, as a repetition of the test of 50 to 100 times was already a lot.

10.8. BAYESIAN NETWORK LAYOUT
In this thesis, the configuration of the BN is made based on catchment processes in a single catchment. Every step is thoroughly analysed based on the strategy that followed from the criteria (see Section 8.1), and if no decisive argument could be made about two contradicting local configurations, predictions within the model decided upon the implementation. Other research, such as Couasnon et al. (2018), is more spatially based, or has executed the configuration less descriptively, such as Paprotny and Morales-Nápoles (2017).
The implementation of these criteria was often open to multiple interpretations, and not all tests gave significant differences in results. Therefore, the final model setup cannot be regarded as the overall best for any user or any purpose. However, with this two-step method, anyone who repeats this test will likely end up with the same model.


11 CONCLUSIONS AND RECOMMENDATIONS

For many years, improvements have been made in hydrologic modelling of catchments. For a long time, a vast amount of data has been collected, which contributes to the accuracy of these hydrologic models. Over the last decades, this has been complemented by various satellite measurements. This makes way for a new generation of models, which relies heavily on the availability of data over the last decades and removes the need for modelling based on the physical structure of the catchment.
For conceptual models, the area, storage and runoff parameters have to be determined for a catchment. In lumped and semi-distributed models, this is inherently flawed because of the heterogeneity of catchments. Moreover, these models can show equifinality, in which significantly different parameter sets perform equally optimally. These problems are not present in many data-driven methods, such as the non-parametric Bayesian network (NPBN). This statistical method is implemented in this thesis to describe hydrologic processes in a lowland catchment in The Netherlands.
This thesis posed a twofold research objective. (i) Firstly, the goal was to create a hydrologic model that is formed by a Bayesian network (BN), which is easily usable by managers and researchers of the catchment and delivers a high accuracy in predicting variables for the catchment of the Vledder, Wapserveense and Steenwijker Aa. More specifically, the monthly maximum daily average discharge (MMDAD) has been selected as the main target variable. (ii) Secondly, the goal of this thesis was to analyse the performance of the model that was constructed during the completion of objective (i) and to benchmark it against other model types that predict the discharge in a catchment. Combined, the objectives form the research question:

What is the optimal setup of a Bayesian network hydrologic model in a lowland catchment, and how does it perform?

The first objective has been addressed via several subproblems. First of all, the most favourable method to construct an NPBN was chosen in Chapter 2. This has not been an exhaustive study in which all methods have been made into a model and the most suitable one has been chosen. However, in Chapter 5, the method has been compared to other copulas and has been shown to fit the data sufficiently well.
There are various other methods to approach a BN. These also differ in the degree to which they are related to the NPBN with the multivariate normal (MVN) approach. Future work can compare different approaches to a hydrologic BN model with the one proposed in this thesis.

Secondly, the variables to be used in the model have been determined in Chapter 3. This has been done on the basis of the availability of data, the presumed relation to other relevant variables and their actual correlation among themselves. This thesis provides a range of variables that meet the minimal frequency and time frame and are highly relevant for users and discharge predictions. Of these variables, an initial model has been made for the tests that followed.
It is recommended that users of similar models look critically at their own case: the creation of such a model is dependent on the availability of good quality data sources in their catchment. It is possible that not all variables that are used in the model of this thesis are also available (with sufficient longevity, frequency and quality) in that catchment. The opposite is also possible: additional variables such as seepage, or other measurement locations in different subcatchments, can also be added in models of other catchments. In the future, other sources for the same type of data can be used once they have acquired enough longevity. Examples of these are Soil Moisture Active Passive (SMAP) for soil moisture data and other groundwater level stations. Additionally, using the exact same model for a longer time might result in more accurate predictions. The data used in this thesis consists of 91 rows (timesteps). If the model is recreated in a number of years, this number increases significantly, which has the potential of increasing the accuracy of the model.

The data is aggregated per month because this forms an optimum between, on the one hand, having variables with a certain degree of temporal independence and, on the other hand, having enough rows. If, for example, a week were taken as the interval between timesteps, the number of rows would increase to approximately 390, whereas the temporal independence would be significantly reduced. The consequences of this can be researched in future work.
The third step in answering the first research objective was to create a cumulative distribution function (CDF) that fits the data well and lets the model perform well in predicting the MMDAD. This has been covered in Chapter 6. Three different CDF functions have been proposed. The first is a strictly empirical one and is expressed as a step function based directly on the data. This means that this is the option requiring the least effort to make. However, unaltered, it is not possible to extrapolate with this function. The second is a novel function, based on the logistic function. The final option was the Gaussian mixture model, which was selected as the best function. The latter two models found very similar fitted functions, as they had a relatively high variance. It is likely that there is no single CDF, used for all variables, that performs significantly better than the Gaussian mixture model. It was apparent from Chapter 6 that CDFs with less variance resulted in worse Kling-Gupta efficiencies (KGEs). A very high variance CDF, with many parameters, did not improve the results compared to one with a relatively moderate number. Therefore, it can be argued that the optimal CDF has been found for cases similar to this one. However, the CDF has not been optimised per individual variable: the same optimal number of parameters and type is used for all variables. Additionally, the shift function that is produced in Section 6.6 is up to the user's preference. Therefore, the chosen implementation of the CDFs cannot unambiguously be regarded as the optimal implementation for all purposes of this model.

Lastly, the layout of the nodes and connections in the BN (see Appendix A) has been discussed in Chapter 8. This gave rise to a range of layouts that comply with the criteria posed in the same chapter, which all fulfilled the first objective of this thesis. A single optimal model had to be selected by testing the predictions that a certain layout could produce. This layout in combination with the parameters and settings determined before, makes up the model is the model that answers the first part of the research question (see Chapter 11). It is recommended for creators of similar models to first make physical-based decisions for their specific BN, based on leading processes in their catchment. If ambiguities arise, they can also resort to performance-based choices.

Due to several subproblems mentioned above that have not been tested exhaustively and a number of ambiguities in the determination of the most favourable model, it cannot be un- conditionally determined that the model proposed in this thesis, is in fact the optimal model for the catchment of the Vledder, Wapserveense and Steenwijker Aa. However, on all levels of the NPBN, this thesis has striven to create the best as possible model, both performance-wise, physical-wise and usability-wise. Therefore, in the effort-sense of the word ’optimal’, an opti- mal NPBN model (that uses the MVN method) has indeed been constructed.

The second objective, the analysis of the model performance and benchmarking it against other models, has been addressed on multiple levels as well. The first level is the testing of the assumption that the Gaussian copula fits the variable pairs sufficiently in Chapter 5. This assumption has been thoroughly examined and it can be concluded that the Gaussian copula fits many variable pairs well, in spite it was not always the best fitting copula for all pairs. The methods used in this chapter have been very thorough and therefore it cannot be expected that using other tests gives significantly different results.

Indirectly related to this objective, is the analysis of how errors in the data would influence model performance. As it is not possible to determine the actual errors that were in the data, 11 an approach has been taken that evaluates the sensibility to changes in the variables, which can be seen as ’adding new errors in to data’. These tests helped putting the performance of the model in relation to the data quality. It showed that changes to variables not directly connected with the MMDAD did not have a significant influence on predictions of the MM- 100 11. CONCLUSIONSANDRECOMMENDATIONS

DAD. Moreover, the influence of artificial random errors was very low. The influence of bias was higher and also depended on the sign of the correlation: negatively correlated variables showed actually improvement in predicting the MMDAD in cases with low underestimations. It is recommended, if a more precise prediction is expected, to quantify actual errors in the data. This can be done by in situ measurements of the same variable. An example of this are the discharge measurements with several side looking doppler measurements devices, which can be verified by executing more acoustic doppler current profiler (ADCP) measurements to verify or improve regular measurements.
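A minimal sketch of such a sensitivity test is given below: one variable is given an artificial bias and random errors before re-running the prediction. The multiplicative form of the bias and the normally distributed noise are assumptions for illustration, not necessarily the exact perturbations used in the thesis.

import numpy as np

def perturb(values, bias=0.0, noise_std=0.0, seed=0):
    # Return a copy of a variable with a relative bias and additive
    # random errors, to test how sensitive the predictions are.
    rng = np.random.default_rng(seed)
    values = np.asarray(values, dtype=float)
    return values * (1.0 + bias) + rng.normal(0.0, noise_std, size=values.shape)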

Thirdly, the KGE, a metric that quantifies the accuracy of a model, has been calculated for the final model in Section 8.5. This indicated that the model performs reasonably well. As an additional check, a complete k-fold cross-validation has been performed over all fitted parts of the model. A k-fold test divides the data into k folds, fits the model on k-1 folds and tests on the remaining fold. It gave a similar result to other (randomly repeated k-fold) tests. The k-fold test can be regarded as one of the definitive answers to the second part of the research question. To give a more in-depth review of the model performance, a number of other analyses have been performed on the predictions of the model. The relationship between the prediction error and the observed MMDAD has been calculated in this section as well. This makes it possible to analyse which discharge magnitudes are likely to give rise to which error magnitudes, and whether this is an under- or overestimation. The research objective of analysing the model performance has become more detailed with Figure 9.3. In the same section, two more analyses have been done: the prediction of the target variable MMDAD when not all other variables are fixed, and the prediction of other variables than the target variable. This gives a broader overview of the general model performance, instead of just looking at the primary function of the model.
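For completeness, the KGE used throughout these tests can be computed as below (Gupta et al., 2009); the helper name and any example arrays passed to it are illustrative only.

import numpy as np

def kge(sim, obs):
    # Kling-Gupta efficiency: 1 - sqrt((r-1)^2 + (alpha-1)^2 + (beta-1)^2)
    sim, obs = np.asarray(sim, dtype=float), np.asarray(obs, dtype=float)
    r = np.corrcoef(sim, obs)[0, 1]        # correlation
    alpha = np.std(sim) / np.std(obs)      # volatility factor
    beta = np.mean(sim) / np.mean(obs)     # bias factor
    return 1.0 - np.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)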

Lastly, four different models have been introduced to benchmark the model performance. The reason for this is that a model's performance is hard to assess on its own; when compared to other models, the context of what is possible with this data becomes clearer. The production of these other methods has not been an exhaustive effort, as was the case for the BN in this research. Therefore, the result that the BN performs better than or similarly to the other models cannot be taken as definitive. However, it does give a more contextual indication of how well the model is performing. It is recommended for Waterschap Drents Overijsselse Delta (WDODelta) to improve their SOBEK model if they want to keep using it for modelling the discharge at station Heerenslagen, as it performs unsatisfactorily now. Possible reasons for this are that the model spans multiple catchments and is fitted better to the other catchments. Moreover, it is likely better fitted to baseflow than to peaks, although peaks are often more important for floods and pump capacities. Another recommendation is to benchmark the model proposed in this thesis against other models (such as the neural network (NN)) if they are optimised in a similar fashion as in this research. This will give a fairer comparison between the (near) optimal performance of the models.
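One of the simpler benchmark-type models, a multiple linear regression, can for instance be fitted with the normal equation (cf. Ng, 2011). This generic sketch is not the exact benchmark configuration used in the thesis; the function names are illustrative.

import numpy as np

def fit_linear_regression(X, y):
    # Solve the normal equation for the parameter vector A (intercept first).
    Xb = np.column_stack([np.ones(len(X)), X])
    return np.linalg.solve(Xb.T @ Xb, Xb.T @ y)

def predict_linear(X, A):
    return np.column_stack([np.ones(len(X)), X]) @ A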

Based on these conclusions, managers of the catchment of the Vledder, Wapserveense and Steenwijker Aa could use the model proposed in this thesis. The model has proven to predict the target variable, the MMDAD, decently and has other benefits, such as that it also provides a probability distribution of the predicted variable(s) and is able to predict when not all other variables are known. This might happen when a measurement station stops functioning, or when reliable predictions can only be made for a number of variables if the discharge needs to be predicted further into the future than a single day. The method is ready to be used through the commercially available software Uninet1 in combination with the Python package copulabayesnet2. It is recommended that Equation (2.17) is added to copulabayesnet, or that the prediction and testing capabilities of copulabayesnet are added to Uninet, such that everything can be run from a single platform. The rudimentary application this thesis proposes is the prediction of the monthly discharge peak. However, this thesis does not give a detailed description of the precise practical applications of the model. Examples could be flood protection, pump capacity prediction, and determining which catchment processes influence high discharges to what extent. Therefore, the first step for authors and operators of this model, and of similar models in other catchments, could be determining its professional application or applications.

In order to create a similar model in a different catchment, first of all, the recommendations mentioned above could be helpful to create and verify such a model. These are: checking the data availability (and potentially also its quality), finding the optimal parameters, and finding out the core catchment processes for the best BN layout. Additionally, using models that have very similar settings to the model proposed in this thesis can answer some interesting research questions. First of all, it can further verify the method proposed in this thesis. If the model also performs similarly in a significant number of other catchments, it can be believed that the method functions well in general, not just in the Vledder, Wapserveense and Steenwijker Aa catchment. It is also helpful to know the influence of the amount of data. The catchment used here had a low amount of data, with only eight years in the dataset, which was resampled to monthly data. It is interesting to see how similar catchments with a lot more data, or even less data, perform. Testing other catchments could also be a method to see the influence of the part of the catchment that is managed. As this is a statistical method, the assumption is that the variables (that is, the monthly aggregated values) behave randomly, only being influenced by the other variables. This is not the case for managed systems, where a person or system 'decides' upon a measure. When the share of water that is managed is low enough, the catchment as a whole can still be regarded as sufficiently random. However, to find the maximum share of managed area, more catchments with different portions of management should be tested. Some management of water flows, such as of the water level, happens automatically because of a certain condition. This automation can, in a way, be regarded as only being influenced by other variables that are already included in the BN, and is therefore quasi-random. Therefore, it would be interesting to see if it matters whether management happens automatically or manually. In addition to the recommendations and potential for future work already given, there are

1 Available on https://lighttwist-software.com/uninet/
2 See Appendix G.4.

some research subjects that are not directly in line with this research but are also very interesting. First of all, instead of predicting the MMDAD, a range of other target variables could be used. Examples are different values within the set, such as the monthly minimum or monthly average discharge, and different timeframes, such as weekly or yearly maxima, or a combination of these. Moreover, research could be done into the creation of models that centre around other target variables, such as the water level, or that try to maximise the prediction of all variables.

So, all in all, in this research a hydrologic Bayesian network model has been created for the catchment of the Vledder, Wapserveense and Steenwijker Aa, which has been optimised on several levels to predict the monthly maximum daily average discharge out of the catchment, and which on other levels has been suited to the intended implementation and usage. The accuracy of this model is decent and it outperforms several other models. Although the method is readily applicable, there are some parts that can still be researched further.

REFERENCES

Aas, K., Czado, C., Frigessi, A., and Bakken, H. (2009). Pair-copula constructions of multiple dependence. Insurance: Mathematics and Economics, 44(2):182–198, ISSN: 0167-6687, DOI: 10.1016/J.INSMATHECO.2007.02.001.

Actueel Hoogtebestand Nederland (2019). AHN3. https://www.pdok.nl/introductie/-/ article/actueel-hoogtebestand-nederland-ahn3- (visited on 2019-11-12).

AdviesCommissie Schade Grondwater (2015). Schadeonderzoek Grondwateronttrekking Ter- wisscha. Technical report, AdviesCommissie Schade Grondwater, Utrecht.

Arbenz, P. (2013). Bayesian Copulae Distributions, with Application to Operational Risk Management-Some Comments. Methodology and Computing in Applied Probability, 15(1):105–108, ISSN: 15737713, DOI: 10.1007/s11009-011-9224-0.

Arora, V.K. (2002). The use of the aridity index to assess climate change effect on annual runoff. Journal of Hydrology, ISSN: 00221694, DOI: 10.1016/S0022-1694(02)00101-4.

Bedford, T. and Cooke, R. M. (2001). Probability density decomposition for conditionally de- pendent random variables modeled by vines. Annals of Mathematics and Artificial Intel- ligence, 32(1-4):245–268, ISSN: 10122443, DOI: 10.1023/A:1016725902970.

Bergström, S. (1976). Development and application of a conceptual runoff model for Scandi- navian catchments. In SMHI Report RHO 7, page 134 pp., Norrköping.

Branch, M. A., Coleman, T. F., and Li, Y. (1999). Subspace, interior, and conjugate gradient method for large-scale bound-constrained minimization problems. SIAM Journal of Sci- entific Computing, 21(1):1–23, ISSN: 10648275, DOI: 10.1137/S1064827595289108.

Budyko, M. I. and Miller, D. H. (1974). Climate and Life. Academic press New York, 508.

Cambridge English Dictionary (2020). MACHINE LEARNING | meaning in the Cambridge English Dictionary. https://dictionary.cambridge.org/dictionary/english/ machine-learning (visited on 2020-2-11).

Chollet, F. (2017). Basic regression: Predict fuel efficiency (Tensorflow Tutorial). https:// www.tensorflow.org/tutorials/keras/regression (visited on 2020-04-03).

Cooke, R. M., Kurowicka, D., Hanea, A. M., Morales, O., Ababei, D. A., Ale, B., and Roelen, A. (2007). Continuous/Discrete Non Parametric Bayesian Belief Nets with UNICORN and UNINET. http://resolver.tudelft.nl/uuid:ec6eb1df-2bb6-4aa0-98b2-e1990c03e4b0.


Couasnon, A. (2017). Characterizing flood hazard at two spatial scales with the use of stochastic models: An application to the contiguous United States of America and the Houston Ship Channel. Master’s Thesis. https://repository.tudelft. nl/islandora/object/uuid%3Af90f1b6c-d5fa-4891-b287-e03c9fec4118? collection=education.

Couasnon, A., Sebastian, A., and Morales-Nápoles, O. (2018). A Copula-Based Bayesian Net- work for Modeling Compound Flood Hazard from Riverine and Coastal Interactions at the Catchment Scale: An Application to the Houston Ship Channel, Texas. Water (Switzer- land), 10(9), ISSN: 20734441, DOI: 10.3390/w10091190.

De Graaf, J. and Rusticus, R. (2013). Waterschap Reest en Wieden - Herijking WB21 waterop- gave - Eindrapport. Technical report, Waterschap Reest en Wieden, Meppel.

Didan, K. (2015). MOD13A1 MODIS/Terra Vegetation Indices 16-Day L3 Global 500m SIN Grid V006 [dataset]. NASA EOSDIS Land Processes DAAC. https://doi.org/10.5067/ MODIS/MOD13A1.006.

Didan, K., Barreto Munoz, A., Solano, R., and Huete, A. (2015). MODIS Vegetation Index User’s Guide (MOD13 Series). Technical report, NASA, https://modis.gsfc.nasa. gov/data/dataprod/mod13.php.

Earth Observing System (2013). MODIS Satellite Sensor: bands and specifications. https: //eos.com/modis-mcd43a4/ (visited on 2020-02-17).

Eaton, M. L. (1983). Multivariate Statistics: a Vector Space Approach. John Wiley and Sons, ISBN: 978-0-471-02776-8.

Embrechts, P., Lindskog, F., and Mcneil, E. (2001). Modelling dependence with copulas and applications to risk management. Handbook of Heavy Tailed Distributions in Finance, 8, DOI: 10.1016/B978-044450896-6.50010-8.

European Space Agency (2017). SMOS Data Products. http://bec.icm.csic.es/ land-datasets.

European Space Agency (2020). Soil Moisture and Ocean Salinity (SMOS) - Soil Moisture [dataset]. https://earth.esa.int/web/guest/missions/ esa-operational-eo-missions/smos.

Favre, A. C., Adlouni, S. E., Perreault, L., Thiémonge, N., and Bobée, B. (2004). Multivariate hydrological frequency analysis using copulas. Water Resources Research, 40(1), ISSN: 00431397, DOI: 10.1029/2003WR002456.

Friese Koerier (1953). Kanalisatiewerken in Vledder en vragen millioenen. Heerenveen, 1953-12-22. https://resolver.kb.nl/resolve?urn=ddd:010734895:mpeg21:a0132.

Friese Koerier (1965). Vledder: Geen Wateroverlast. Heerenveen, 1965-9-8. https://resolver.kb.nl/resolve?urn=ddd:010690031:mpeg21:a0146.

Genest, C. and Remillard, B. (2008). Validity of the parametric bootstrap for goodness-of-fit testing in semiparametric models. Annales de l’institut Henri Poincare (B) Probability and Statistics, 44(6):1096–1127, ISSN: 02460203, DOI: 10.1214/07-AIHP148.

Gentle, J. E. (2009). Computational statistics, volume 308. Springer.

González-Zamora, A., Sánchez, N., Gumuzzio, A., Piles, M., Olmedo, E., and Martínez- Fernández, J. (2015). Validation of SMOS L2 and L3 soil moisture products over the Duero basin at different spatial scales. In International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences - ISPRS Archives, volume 40, pages 1183–1188. International Society for Photogrammetry and Remote Sensing, ISSN: 16821750, DOI: 10.5194/isprsarchives-XL-7-W3-1183-2015.

Google Trends (2020). machine learning, statistical - Compare. https://trends.google. com/trends/explore?date=all&q=machine%20learning,statistical (visited on 2020-03-03).

Gräler, B., Van den Berg, M., Vandenberghe, S., Petroselli, A., Grimaldi, S., De Baets, B., and Verhoest, N. (2013). Multivariate return periods in hydrology: a critical and practical re- view focusing on synthetic design hydrograph estimation. Hydrology and Earth System Sciences, 17(4):1281–1296.

Gupta, H. V., Kling, H., Yilmaz, K. K., and Martinez, G. F. (2009). Decomposition of the mean squared error and NSE performance criteria: Implications for improving hy- drological modelling. Journal of Hydrology, 377(1-2):80–91, ISSN: 0022-1694, DOI: 10.1016/J.JHYDROL.2009.08.003.

Hanea, A. and Harrington, W. (2009). Ordinal data mining for fine particles with non paramet- ric continuous bayesian belief nets. Information Processes Journal, 4:280–286, https: //www.researchgate.net/publication/228399374_Ordinal_Data_Mining_for_ Fine_Particles_with_Non_Parametric_Continuous_Bayesian_Belief_Nets.

Hanea, A., Morales Napoles, O., and Ababei, D. (2015). Non-parametric Bayesian net- works: Improving theory and reviewing applications. In Reliability Engineering and System Safety, volume 144, pages 265–284. Elsevier Ltd, ISSN: 09518320, DOI: 10.1016/j.ress.2015.07.027.

Hanea, A. M., Kurowicka, D., and Cooke, R. M. (2006). Hybrid method for quantifying and analyzing bayesian belief nets. In Quality and Reliability Engineering International, vol- ume 22, pages 709–729. ISSN: 07488017, DOI: 10.1002/qre.808.

Helwig, N. E. (2017). Introduction to Normal Distribution. Technical report, University of Minnesota.

Hofert, M. (2008). Sampling Archimedean copulas. Computational Statistics & Data Analysis, 52(12):5163–5174, DOI: 10.1016/j.csda.2008.05.019.

Hoge Raad voor de Adel (1959). Register Hoge Raad voor de Adel. Available on: http://www.therightproductions.nl/hogeraadvanadel/index.php?id=109&wapen=46.

IPCC (2014). Climate change 2014: synthesis report. IPCC, ISBN: 9789291691432.

Joe, H. (2015). Dependence Modeling with Copulas. Chapman & Hall/CRC, Boca Raton, FL, ebook - pdf edition, ISBN: 978-1-4665-8323-8.

Kadaster (2019). Basisregistratie Topografie (BRT) TOPNL [dataset]. https://www.pdok. nl/downloads/-/article/basisregistratie-topografie-brt-topnl (visited on 2019-12-10).

Kelly, K. and Krzysztofowicz, R. (1997). A bivariate meta-Gaussian density for use in hydrology. Stochastic Hydrol Hydraul, 11:17–31, DOI: https://doi.org/10.1007/BF02428423.

Knoben, W. J. M., Freer, J. E., and Woods, R. A. (2019). Technical note: Inherent benchmark or not? Comparing Nash-Sutcliffe and Kling-Gupta efficiency scores. Hydrology and Earth System Sciences, 23(10):4323–4331, ISSN: 1607-7938, DOI: 10.5194/hess-23-4323-2019.

Krzysztofowicz, R. (2002). Bayesian system for probabilistic river stage forecasting. Journal of Hydrology, (268):16–40, DOI: 10.1016/S0022-1694(02)00106-3.

Kurowicka, D. and Cooke, R. M. (2005). Distribution-Free Continuous Bayesian Belief Nets. In Modern Statistical and Mathematical Methods in Reliability, pages 309– 322. World Scientific Publishing, London, tenth edition, ISBN: 981-256-356-3, DOI: 10.1142/9789812703378_0022.

Kurowicka, D. and Cooke, R. M. (2007). Sampling algorithms for generating joint uniform dis- tributions using the vine-copula method. Computational Statistics and Data Analysis, 51(6):2889–2906, ISSN: 01679473, DOI: 10.1016/j.csda.2006.11.043.

Langendijk, D., De Vries, D., Fagel, M., Jansen, M., Reikens, B., and Nijboer, R. (2014). Op weg naar schoon water; Achtergronddocument Kaderrichtlijn Water 2e planperiode 2016- 2021. Technical report, Waterschap Reest en Wieden.

Leeuwarder Courant (2016). Vitens halveert waterwinning in Drents- Friese Wold - Friesland - LC.nl. https://www.lc.nl/friesland/ Vitens-halveert-waterwinning-in-Drents-Friese-Wold-21764326.html (vis- ited on 2019-12-6).

Luxemburg, W. and Coenders, A. (2017). CIE4440 Hydrological Processes and Measurements, Lecture Notes. Delft.

Ministerie van Economische Zaken (2019). Basisregistratie Gewasperce- len (BRP) [dataset]. https://www.pdok.nl/introductie/-/article/ basisregistratie-gewaspercelen-brp- (visited on 2019-12-10).

Molina, M., Fuentetaja, R., and Garrote, L. (2005). Hydrologic Models for Emergency Decision Support Using Bayesian Networks. pages 88–99. Springer, Berlin, Heidelberg, DOI: 10.1007/11518655_9.

Morales Nápoles, O. (2019). Bivariate Dependence (Copulas) - Lecture slides. TU Delft.

NASA (2000). Normalized Difference Vegetation Index (NDVI). https:// earthobservatory.nasa.gov/features/MeasuringVegetation/measuring_ vegetation_2.php (visited on 2020-02-17).

Nash, J. and Sutcliffe, J. (1970). River flow forecasting through conceptual models part I: A discussion of principles. Journal of Hydrology, 10(3):282–290, ISSN: 00221694, DOI: 10.1016/0022-1694(70)90255-6.

Nasr, A., Alfonso, L., and Morales Nápoles, O. (2018). Bayesian Networks and Data Driven Models for Estimating Extreme River Discharges Case Study: Magdalena-Cauca Basin, Colombia. 20th EGU General Assembly, EGU2018, Proceedings from the conference held 4-13 April, 2018 in Vienna, Austria, p.19813, 20:19813.

Nelsen, R. (2006). An Introduction to Copulas. Springer, second edition, ISBN: 0-387-28659-4.

Ng, A. (2011). Linear Regression with multiple variables: Normal equation. Lecture Slides: Machine Learning. Coursera.

Nieuwsblad van het Noorden (1950). Overlast Vledder en Wapserveense Aa. Groningen, 1950- 1-4. Available on: https://resolver.kb.nl/resolve?urn=ddd:010886514:mpeg21: a0112.

O’Dea, S. (2020). Information created globally 2010-2025. Statista. https://www.statista. com/statistics/871513/worldwide-data-created/ (visited on 2020-2-11).

Paprotny, D. (2017). Estimating extreme river discharges in Europe through a Bayesian network, Appendix: Diagnosis of underlying assumptions regard- ing the Non-Parametric Bayesian Networks. Hydrology and Earth System Sci- ences, 21(6):2615–2636, ISSN: 1607-7938, DOI: 10.5194/hess-21-2615-2017. https://www.hydrol-earth-syst-sci.net/21/2615/2017/.

Paprotny, D. and Morales-Nápoles, O. (2017). Estimating extreme river discharges in Europe through a Bayesian network. Hydrology and Earth System Sciences, 21(6):2615–2636, ISSN: 1607-7938, DOI: 10.5194/hess-21-2615-2017. https:// www.hydrol-earth-syst-sci.net/21/2615/2017/.

Pearl, J. (1985). Bayesian Networks: a Model of Self-Activated Memory for Evidential Reason- ing. In Seventh Annual Conference of the Cognitive Science Society, pages 1–20, Irvine, CA. Cognitive Science Society.

PIK (2020). Global radiation PIK Research Portal. https://www.pik-potsdam.de/services/climate-weather-potsdam/climate-diagrams/global-radiation (visited on 2020-04-08).

Renard, B. and Lang, M. (2007). Use of a Gaussian copula for multivariate extreme value analysis: some case studies in hydrology. Advances in Water Resources, 30:897–912, DOI: 10.1016/j.advwatres.2006.08.001.

Rijksoverheid, Interprovinciaal Overleg, Unie van Waterschappen, and Vereniging van Ned- erlandse Gemeenten (2003). Het Nationaal Bestuursakkoord Water. https://www. helpdeskwater.nl/onderwerpen/wetgeving-beleid/@176067/nationaal/.

RTV Drenthe (2014). Vledder ’moordstuw’ verwijderd. 2014-2-17. https://www. rtvdrenthe.nl/nieuws/81971/Vledder-moordstuw-verwijderd# (visited on 2020- 01-24).

Salvadori, G. and De Michele, C. (2002). 2-copulas In Statistical Hydrology: Theoretical Models of Bivariate Dependence. https://ui.adsabs.harvard.edu/abs/2002EGSGA..27. 4330S.

Salvadori, G. and De Michele, C. (2004). Frequency analysis via copulas: Theoretical as- pects and applications to hydrological events. Water Resources Research, 40(12), ISSN: 00431397, DOI: 10.1029/2004WR003133.

Salvadori, G. and De Michele, C. (2007). On the Use of Copulas in Hydrology: Theory and Practice. J. Hydrol. Eng., 12(4):369–380, DOI: 10.1061/ASCE1084-0699200712:4369.

Sanjaya, S. (2018). The Application of Bayesian Network Model: Quantify- ing Riverine Flood Hazard in the Java Island. Additional thesis. Avail- able on: https://repository.tudelft.nl/islandora/object/uuid% 3Ae1beefe5-d403-4500-a461-fd373a599a8c?collection=education.

SAS Institute (2017). Canonical Maximum Likelihood Estimation (CMLE). https: //documentation.sas.com/?docsetId=etsug&docsetTarget=etsug_copula_ details21.htm&docsetVersion=14.3&locale=en (visited on 2020-04-12).

Sklar, A. (1959). Fonctions de répartition à n dimensions et leurs marges. Publ. Inst. Statist. Univ. Paris, 8:229–231.

The Editors of Encyclopaedia Britannica (2019). Student’s t-test. https://www.britannica. com/science/Students-t-test (visited on 2020-05-11). Encyclopaedia Britannica, inc.

Torres Alves, G. (2018). Application of Bayesian Networks to Estimate River Discharges in Ecuador. Additional thesis. Available on: https://repository.tudelft. nl/islandora/object/uuid%3A6579ad40-1339-44b6-b9d4-6cae90190819? collection=education.

Wageningen UR (2019). Bodemkaart Nederland [dataset]. https://www.pdok.nl/ introductie/-/article/bodemkaart-1-50-000 (visited on 2019-12-20).

Wang, W. and Wells, M. T. (2000). Model selection and semiparametric inference for bivariate failure-time data. Journal of the American Statistical Association, 95(449):62–72, DOI: 10.1080/01621459.2000.10473899.

Waterschap Drents Overijsselse Delta (2019). Peilgebieden WDODelta [dataset]. https://waterschap-wdodelta.opendata.arcgis.com/datasets/peilgebieden.

Yang H., Abbaspour, K. C., and Zhang Y. L. (2002). Desertification Control and Sandstorm Mitigation in the Area Encircling Beijing-with a Discussion on the Application of Bayesian Network and Hydrological Modeling. In 12th ISCO Conference, Beijing. https://www. dora.lib4ri.ch/eawag/islandora/object/eawag:12063.

Zandstra, B. (2016). Gebiedsbeschrijving Vledder en Wapserveense Aa. Technical report, WDODelta, https://www.wdodelta.nl/mgd/files/waterbeheerplan_2016-2021_ wdo_delta_ab_4_januari_26nov15.pdf.


ACRONYMS

ADCP acoustic doppler current profiler 29, 100

AIC Akaike information criterion 94

BN Bayesian network xi, xii, xiii, 1, 2, 3, 4, 7, 8, 14, 15, 16, 21, 23, 27, 30, 32, 33, 34, 35, 39, 40, 43, 44, 47, 51, 53, 57, 63, 67, 70, 73, 74, 76, 77, 78, 79, 86, 88, 90, 91, 92, 94, 95, 97, 98, 99, 100, 101, 117, 118, 119, 120

CDF cumulative distribution function viii, ix, xii, 5, 8, 9, 10, 11, 12, 14, 15, 17, 41, 44, 46, 47, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 63, 64, 65, 81, 82, 91, 94, 98, 99, 113, 114, 115

DAG directed, acyclic graph xi, 7, 74

ECDF empirical cumulative distribution function xii, 10, 44, 56, 57, 60, 93, 94, 129

ESA European Space Agency 33

GEV generalised extreme value 94

IID independent and identically distributed 26, 28, 32, 43

KGE Kling-Gupta efficiency ix, xi, xii, 41, 51, 53, 54, 55, 56, 60, 63, 64, 65, 68, 69, 70, 71, 77, 78, 81, 82, 83, 84, 85, 86, 87, 89, 94, 95, 98, 100

KNMI Koninklijk Nederlands Meteorologisch Instituut 29, 30, 31, 33, 35, 36, 38

MMDAD monthly maximum daily average discharge xi, xii, xiii, 3, 5, 10, 17, 27, 28, 35, 40, 43, 44, 45, 51, 57, 60, 61, 63, 66, 67, 68, 69, 76, 77, 78, 81, 83, 84, 85, 86, 87, 91, 92, 94, 95, 97, 98, 99, 100, 101, 102, 118

MODIS MODerate resolution Imaging Spectroradiometer 34, 35

MVN multivariate normal vii, 15, 16, 43, 98, 99

NASA National Aeronautics and Space Administration 34

NDVI normalised difference vegetation index viii, 34

NN neural network 2, 88, 89, 100

NPBN non-parametric Bayesian network xi, 1, 2, 3, 8, 14, 23, 73, 97, 98, 99

NSE Nash-Sutcliffe efficiency viii, 40, 41, 83, 95

PDF probability density function 9, 10, 11, 12, 13, 17, 18, 19, 53, 54, 113, 114


SMAP Soil Moisture Active Passive 98

SMOS Soil Moisture and Ocean Salinity 33, 34

WDODelta Waterschap Drents Overijsselse Delta 5, 22, 26, 27, 30, 31, 32, 36, 37, 38, 86, 100

SYMBOLS

1 Indicator function. 46, 52

A Conditioning (fixed) standard normal values. 16, 67

α Volatility factor. 41

α cumulative distribution function (CDF) parameter. 52, 53, 54, 55, 56, 89

A Parameters of multiple linear regression. 89

b Width of a weir. 27

B max(V ) − min(V ) 53, 55, 68, 69

β Bias factor. 41

c Copula function in its probability density function (PDF) form. 8, 9, 19

C Copula function in its CDF form. 8, 9, 45, 47

Cemp Empirical copula. 45, 46, 47

cm Fitted friction factor weir formula. 27

µ̄ Conditional mean of the normal distribution. 16, 18

r̄ Conditional correlation coefficient. 19

R̄ Conditional correlation matrix of Gaussian copulas. 16, 18

C^Cl_θ Clayton copula in its CDF form. 11

C^Fr_θ Frank copula in its CDF form. 12

C^Ga_R Gaussian copula in its CDF form. 9

C^GH_θ Gumbel-Hougaard copula in its CDF form. 11

C^J_θ Joe copula in its CDF form. 13

cov Covariance. 14

d Number of variables. 8, 9, 13, 15, 16, 17, 89

E Expected value. 19


Ea Actual evaporation (i.e. evaporation and transpiration). 37, 38

Ep Potential evaporation (i.e. evaporation and transpiration). 37, 38

ϵ Error in the data. 67, 69, 70

f PDF of a variable. 8, 19

F Cumulative density function ∼ converts values to uniform values. 8, 14, 15, 19, 41, 56, 114, 116

F Generic water flux. 37, 38

Femp Empirical CDF. 52

fgm PDF of the Gaussian mixture model. 54

Fgm CDF of the Gaussian mixture model. 54

Flogi CDF of the altered logistic function. 52, 53

g Scale factor shift function 58, 59

h Water level. 27, 32

H0 Null hypothesis. 31

h Offset factor shift function 59, 60, 116

I Identity matrix. 9

Φ^-1 Inverse CDF of the standard normal distribution. 9, 14, 15, 16

k Number of folds in k-fold testing. ix, xii, 41, 53, 56, 57, 63, 64, 67, 81, 82, 89, 100

K Number of parameters in F (·) fit function. 52, 53, 54, 55

l Lag. 44

L Lower triangular matrix of the Cholesky decomposition. 18

m Number of values in the dataset/variable. 13

µ Mean of the normal distribution. 16, 55, 114

µact Actual mean. 70

µ̂ µ subject to error. 67

µobs Mean of the observation values. 41, 70

µsim Mean of the simulation values. 41

n Number of samples. 18, 44, 46, 52, 89

N Normal distribution 16

N Repetition number. 46

NIR Near-infrared radiation. 34

ϑˆ Copula parameter with the optimal fit. 13

p p-value. 46

P Precipitation. 37, 38

P Odd polynomial 52

ϕ PDF of the standard normal distribution. 19

Φ CDF of the standard normal distribution. 19

ΦR Multivariate CDF of the standard normal distribution. 9, 15

Q Discharge. 27, 37, 40

r Generic correlation coefficient. 16

R Correlation matrix of Gaussian copulas. 9, 13, 14, 16, 18, 41, 67, 81, 82, 86

rg Rank vector of a variable. 14

ρauto (Pearson) autocorrelation. 44

rs,norm Normal rank correlation coefficient. 14, 16

ρp Pearson’s correlation coefficient. 14, 41, 44, 45

rs Spearman’s rank correlation coefficient. 14

R Rank vector. 46

s Standard normal value 15, 16, 18, 19

S Vector with standard normal values 15, 16, 18

SRMSD Altered Cramér-von Mises statistic. 46, 47

Sn Cramér-von Mises statistic. 45, 46

σ Standard deviation. 14, 41, 55, 68, 69

t Time/event. 40

te Last event in the data. 40

θ Archimedean copula parameter. 10, 11, 12, 13, 15

Θ Set of all (tested) valid copula parameters. 13

u Single value of a variable converted to a uniform value through F (v). 8, 11, 12, 13, 15, 18, 19, 45, 46, 47, 58, 59, 94

U A set of uniform values (one value per variable). 15, 16

usd,h Shifted uniform value. 59

U0 Vector of the values converted to uniform values through F(v), of a variable. 14

v Single value of a variable. 8, 15, 18, 19, 41, 44, 52, 53, 89

V A set of values (one value per variable). 15, 16, 89

ϑ Generic copula parameter. 13

U Uniform distribution. 18

VIS Visible light. 34

V0 Vector of the values of a single variable. 14, 16

w Independent random standard normal value. 18

W Independent random standard normal vector. 18

z Number of conditioning (fixed) variables. 16, 17, 18

A ANATOMY OF A BAYESIAN NETWORK

Figure A.1: Anatomy of a Bayesian network (BN). This model is made in Uninet.

This appendix serves as a synopsis of the layout and terminology of a BN. The following items correspond with the numbers in Figure A.1.

1. Variable: a variable that has influence on the network, such as NDVI. Also a node in the BN. The bars in the node form a histogram of the values; the first number is the mean, and the number after the ’§’ is the standard deviation of the values. The unit of the variables is not mentioned in Uninet. For the inner workings of a BN, the unit does not matter.


2. Target (variable): the variable that is predicted in a test setup. Despite the fact that the probability distribution of all other variables is updated if one or more variables get conditioned, there is often one variable that is the final goal of a model; the variable for which the KGE is calculated. In this research the target variable in most of the tests is the monthly maximum daily average discharge (MMDAD).

3. Connection: also called a (directed) edge, an arc, an arrow or a link in this thesis. The number on the arrow denotes the (conditional) normal rank correlation (see Equa- tion (2.12)). The direction of the arrow determines the inference between the variables in the model: the parent influences the child. This does not mean that a parent dis- tribution cannot be updated from the child variable. This can be explained as follows: rain causes discharge, but when it is known that there is a high discharge, it can also be assumed that there was statistically more rain. The BN updates in both ways of the connection.

4. Parent (variable): for a variable, a parent is another variable on the tail end of the arrow which points to this variable. In this case the temperature is one of the parent variables of the soil moisture.

5. Child (variable): for a variable, a child is another variable on the head end of an arrow that starts at this variable. In this case the MMDAD is the child variable of the soil mois- ture. A parent variable is regarded as influencing the child variable in a BN.

6. Order of dependencies: this is the order of calculating the partial correlations in Equation (2.17). For the top variable, the correlation is just the normal rank correlation between the variables. For the variables lower in the order, this is the correlation of this variable and the child variable, given the correlation of all of the variables that are higher in this list and the child variable. In other words, it is the correlation of any information that is not already given by the variables higher in the order. This order changes the correlation of the connections and therefore the model workings. In this research, a mostly physically-based approach is taken to determine this order. The variables that are physically most directly related to the child variable are highest in the order. Other approaches would be a solely performance-based order or a random order (such that the model is easier to implement for the user).

7. Fixing: fixing one or more variables means setting such a variable to a single known value. In Figure A.1 only the precipitation is fixed. This updates the distributions of the other variables via Equation (2.18), as is visible in the other nodes: the grey histograms show the unconditioned distributions, while the conditioned, updated distribution is shown in black. A minimal sketch of this conditioning step is given after this list.
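The sketch below illustrates the conditioning step using standard multivariate-normal conditioning on the (rank) correlation matrix R. It is illustrative only and does not reproduce the exact routines of Uninet or copulabayesnet; the function name is hypothetical.

import numpy as np

def condition_gaussian(R, fixed_idx, fixed_vals):
    # Condition a standard-normal vector with correlation matrix R on the
    # standard-normal values of the fixed variables; returns the conditional
    # mean and covariance of the remaining (free) variables.
    n = R.shape[0]
    free = np.array([i for i in range(n) if i not in set(fixed_idx)])
    fixed = np.asarray(fixed_idx)
    R11 = R[np.ix_(free, free)]
    R12 = R[np.ix_(free, fixed)]
    R22 = R[np.ix_(fixed, fixed)]
    a = np.asarray(fixed_vals, dtype=float)
    mean = R12 @ np.linalg.solve(R22, a)
    cov = R11 - R12 @ np.linalg.solve(R22, R12.T)
    return mean, cov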

B OTHER METHODS TO MODEL A BAYESIAN NETWORK WITH COPULAS

B.1. VINE-COPULAS
Copulas are a model that represents the joint probability of multiple variables. For Archimedean copulas, all variables are connected with the same parameter. Gaussian copulas, in contrast, can have multiple parameters. In a Bayesian network (BN), variables can have different correlations and dependencies. Therefore, a more complex handling of copulas must be used to apply Archimedean copulas in a BN. One of these methods is using vine-copulas (Appendix B.1); another is using only one type of copula, which is split into using one Archimedean copula (Appendix B.2) or using one Gaussian copula with the conditional multivariate normal distribution (Section 2.6). Vines were introduced by Bedford and Cooke (2001). The two most important ones are the canonical (C-) vine and the D-vine. The C-vine connects all variables via a single common (root) variable and is defined as follows (Aas et al., 2009):

f(v_1,\dots,v_n) = \prod_{k=1}^{n} f(v_k) \prod_{j=1}^{n-1} \prod_{i=1}^{n-j} c_{j,\,j+i\,|\,1,\dots,j-1}\Bigl(F(v_j \mid v_1,\dots,v_{j-1}),\, F(v_{j+i} \mid v_1,\dots,v_{j-1})\Bigr). \quad (B.1)

D-vines connect the variables in series and are defined as follows (Aas et al., 2009):

f(v_1,\dots,v_n) = \prod_{k=1}^{n} f(v_k) \prod_{j=1}^{n-1} \prod_{i=1}^{n-j} c_{i,\,i+j\,|\,i+1,\dots,i+j-1}\Bigl(F(v_i \mid v_{i+1},\dots,v_{i+j-1}),\, F(v_{i+j} \mid v_{i+1},\dots,v_{i+j-1})\Bigr). \quad (B.2)

The number of possible vines for n nodes (variables) is

\frac{n!}{2}, \quad (B.3)


(Aas et al., 2009). For 8 variables, such as proposed in Table 4.1, this means that there are 20160 vines that have to be calculated, which is highly undesirable.

B.2. MULTIVARIATE ARCHIMEDEAN COPULAS
To avoid having to use copula-vines, it is also possible to use one multivariate copula for the whole BN. This is unfavourable because not all Archimedean copulas are easily defined in the multivariate form, and many have a tail dependence, which is likely not to perform well for all relations.

C DERIVATIONS AND ADDITIONAL EQUATIONS

FROM SECTION 3.3.2:

The discharge Q is measured by multiplying the average flow velocity V_avg by the cross-section times a factor (cA):

Q = cA \cdot V_{avg}. \quad (C.1)

The cross-section multiplied by the factor is defined as follows from the four factors c_0 ... c_3:

cA = c_0 + c_1 h + c_2 h^2 + c_3 h^3, \quad (C.2)

where h is the water level. The average velocity V_avg is computed from n velocity measurements that are evenly divided over the width of the stream:

V_{avg} = \frac{1}{n} \sum_{i=1}^{n} v_i. \quad (C.3)
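As a small illustration of Equations (C.1) to (C.3), the discharge can be computed as below. The coefficient values and the numbers in the example call are placeholders, not the fitted station constants.

import numpy as np

def discharge(h, velocities, c=(0.0, 1.0, 0.0, 0.0)):
    # Q = cA(h) * V_avg, with cA a cubic polynomial in the water level h
    # and V_avg the mean of the individual velocity measurements.
    c_a = c[0] + c[1] * h + c[2] * h ** 2 + c[3] * h ** 3
    return c_a * np.mean(velocities)

print(discharge(1.2, [0.35, 0.40, 0.38]))  # placeholder numbers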


FROM EQUATION (6.5):

f_{fit,un}(v) = \frac{d}{dv} F_{fit,un}(v)
= \frac{d}{dv}\left(\frac{1}{1 + e^{a_1(P(v)-a_0)}}\right)
= \frac{d}{d\,e^{a_1(P(v)-a_0)}}\left(\frac{1}{1 + e^{a_1(P(v)-a_0)}}\right) \cdot \frac{d\,e^{a_1(P(v)-a_0)}}{d\bigl(a_1(P(v)-a_0)\bigr)} \cdot \frac{d\bigl(a_1(P(v)-a_0)\bigr)}{d\,P(v)} \cdot \frac{d\,P(v)}{dv} \quad \text{[chain rule]}
= \frac{-1}{\bigl(e^{a_1(P(v)-a_0)} + 1\bigr)^2} \cdot e^{a_1(P(v)-a_0)} \cdot a_1 \cdot P'(v)
= \frac{-a_1\, P'(v)\, e^{a_1(P(v)-a_0)}}{\bigl(e^{a_1(P(v)-a_0)} + 1\bigr)^2} \ge 0 \quad \text{for all } v \in (-\infty, +\infty) \quad (C.4)

FROM EQUATION (7.6):

KGE = 1 - \sqrt{(r-1)^2 + (\alpha-1)^2 + (\beta-1)^2}

With \beta = 1 and KGE = 0.72:

0.72 = 1 - \sqrt{(r-1)^2 + (\alpha-1)^2 + (1-1)^2} = 1 - \sqrt{(r-1)^2 + (\alpha-1)^2}
\sqrt{(r-1)^2 + (\alpha-1)^2} = 0.28
(r-1)^2 + (\alpha-1)^2 = 0.28^2 \quad (C.5)

D MAPS AND FIGURES


Figure D.1: Hillshaded catchment Vledder, Wapserveense and Steenwijker Aa.

Figure D.2: Map of the catchment and its surroundings with different layers.


Figure D.3: Crest of the former Water Board "De Vledder en Wapserveense Aa", since 1959. The clover is a symbol for the newly planted agriculture. Source: Hoge Raad voor de Adel (1959)

Figure D.4: Histogram of all the individual discharge measurements at Heerenslagen. Note that the y-axis is logarithmic.

Figure D.5: Histogram of all the individual discharge measurements at Heerenslagen with a limited x-axis around the actual discharges. Note that the y-axis is logarithmic.

Figure D.6: Discharge measurements compared: measurement station Heerenslagen and weir Wulpen, which is 7 km upstream of the measurement station.

Figure D.7: Correlation diagram of data used in this research (8 days subset). This diagram uses Pearson's correlation coefficient (see Section 2.4.2). Rows and columns are ordered as: Soil Moisture, MMDAD, Precipitation, NDVI, Temperature, Solar radiation, Groundwater level, Surface water level.

Soil Moisture          1.00   .59   .19  -.38  -.37  -.52   .45  -.54
MMDAD                   .59  1.00   .43  -.37  -.40  -.52   .50  -.53
Precipitation           .19   .43  1.00   .15   .26  -.09  -.22   .13
NDVI                   -.38  -.37   .15  1.00   .67   .50  -.54   .57
Temperature            -.37  -.40   .26   .67  1.00   .74  -.51   .78
Solar radiation        -.52  -.52  -.09   .50   .74  1.00  -.18   .82
Groundwater level       .45   .50  -.22  -.54  -.51  -.18  1.00  -.44
Surface water level    -.54  -.53   .13   .57   .78   .82  -.44  1.00

Figure D.8: Correlation diagram of data used in this research (8 days subset). This diagram uses the normal rank correlation coefficient (see Section 2.4.2). Rows and columns are ordered as in Figure D.7.

Soil Moisture          1.00   .54   .19  -.37  -.35  -.49   .36  -.48
MMDAD                   .54  1.00   .43  -.32  -.43  -.56   .45  -.56
Precipitation           .19   .43  1.00   .20   .23  -.14  -.20   .10
NDVI                   -.37  -.32   .20  1.00   .66   .48  -.48   .53
Temperature            -.35  -.43   .23   .66  1.00   .67  -.44   .68
Solar radiation        -.49  -.56  -.14   .48   .67  1.00  -.12   .72
Groundwater level       .36   .45  -.20  -.48  -.44  -.12  1.00  -.38
Surface water level    -.48  -.56   .10   .53   .68   .72  -.38  1.00

E SO..., IS A BAYESIAN NETWORK MACHINE LEARNING?

According to Cambridge English Dictionary (2020), machine learning is defined as:

The process of computers changing the way they carry out tasks by learning from new data, without a human being needing to give instructions in the form of a program

So, is a Bayesian network machine learning? Yes, it is, because, at least in the method proposed in this thesis, the computer changes the way it carries out tasks by learning from data, without human interference. In BNs, this happens with the construction of the correlation matrix and the empirical cumulative distribution function (ECDF), either fitted or not fitted. However, an argument could also be made for the opposite: no, it is not, because BNs do not fit into the category of models that are generally (for example in scientific literature) regarded as machine learning models, such as naive Bayes, K nearest neighbours, K means, support vector machines and neural networks. So, in my opinion, a sound argument could be made for Bayesian networks being machine learning or not. In general, I would advise calling a BN a statistical method towards scientists, because calling it machine learning raises wrong expectations, but do call it machine learning when you have to sell or promote the method, because this is not incorrect and it is highly popular and still gaining popularity (Google Trends, 2020).

F ADDITIONAL MODEL AND TEST RESULTS

Table F.1: S_RMSD values of the altered Cramér-von Mises tests for all of the connections in the Bayesian network of the first model iteration, per type of copula (Gaussian, Gumbel-Hougaard, Clayton, Frank, Joe). The lowest value for each combination of variables is shown in bold.
Clayton 0.864 0.944 0.276 0.296 0.000 0.000 0.000 0.000 0.004 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.024 0.103 0.000 0.000 0.004 0.000 0.016 0.044 0.000 eetadRemillard and Genest ( 2008 Frank 0.290 0.298 0.424 0.690 0.068 0.308 0.266 0.600 0.080 0.168 0.174 0.066 0.110 0.198 0.690 0.640 0.800 0.570 0.274 0.022 0.000 0.004 0.046 0.026 0.048 ,see ), 133 0.036 0.290 0.384 0.177 0.244 0.369 0.641 0.577 0.616 0.753 0.675 0.646 -0.407 -0.476 -0.343 -0.501 -0.286 -0.399 -0.528 -0.514 -0.441 -0.279 -0.123 -0.408 All data SW 0.057 0.135 0.030 0.218 0.025 0.073 0.106 0.081 0.005 0.007 0.044 0.194 0.061 0.089 0.619 0.500 -0.153 -0.210 -0.295 -0.004 -0.400 -0.280 -0.417 -0.416 SE 0.101 0.075 0.152 0.022 0.025 0.036 0.051 0.175 0.219 0.700 0.464 0.615 -0.148 -0.023 -0.214 -0.029 -0.040 -0.377 -0.019 -0.191 -0.208 -0.174 -0.067 -0.212 Quadrant NW 0.192 0.003 0.144 0.172 0.118 0.156 0.221 0.192 0.024 0.128 0.204 0.177 0.358 -0.004 -0.404 -0.059 -0.304 -0.364 -0.021 -0.082 -0.041 -0.011 -0.034 -0.155 F . The parameters are not multiplied by -1 in case of negative correlation, as this does bold NE 0.096 0.168 0.393 0.273 0.182 0.134 0.093 0.261 0.404 0.179 0.169 0.056 0.307 0.316 -0.440 -0.266 -0.260 -0.271 -0.147 -0.022 -0.031 -0.246 -0.423 -0.139 . Section 5.5.1 2 NDVI Surface water level Precipitation NDVI Surface water level Temperature Precipitation Temperature Solar radiation Groundwater level Temperature Solar radiation Groundwater level Surface water level Solar radiation Groundwater level Surface water level Solar radiation Groundwater level Surface water level u Discharge Variables 1 Soil Moisture Soil Moisture Discharge Discharge Discharge NDVI Soil Moisture Soil Moisture Soil Moisture Soil Moisture Discharge Discharge Discharge Precipitation NDVI NDVI NDVI Temperature Temperature Temperature Solar radiationSolar radiationGroundwater level Groundwater level Surface water Solar level radiation u Soil Moisture Table F.3: correlation Pearson’s values for the quadrants oftotal the correlation normalised point data, in for all the combinations direction in of model tail 1. dependence Quadrants and with are a shown higher in absolute correlation than the not matter for this test. See also 134 F. ADDITIONALMODELANDTESTRESULTS eoe htteata orlto a ihrfrti udat hntecreainfo h eeae ape.Aeae r ae rmaslt aus See values. absolute from taken are Averages samples. generated number the negative A from correlation copula. the fitted than from quadrant, generated this samples for 20000 and higher points was 5.5.2 data correlation Section between actual quadrant the per that coefficient denotes correlation Pearsons’s in Differences F.4: Table Average − − − Temperature − Temperature NDVI − NDVI NDVI Precipitation − Discharge − − − Discharge − Moisture Soil − − − Moisture Soil Moisture Soil u 1 1 1 1 1 1 1 1 1 1 1 1 1 1 · · · · · · · · · · · · · rudae v.Sraewtrll 02 .901 .401 04 .902 .102 01 .601 02 .0-.702 .201 0.19 0.12 0.22 0.34 0.24 -0.36 0.13 -0.17 0.20 0.13 -0.48 -0.23 -0.12 -0.37 0.18 0.16 0.28 -0.55 0.26 0.11 0.08 -0.12 -0.14 -0.46 0.26 0.16 -0.01 0.11 -0.38 0.38 0.16 0.22 -0.30 0.11 0.11 0.29 -0.16 -0.48 0.00 -0.41 0.27 -0.63 0.15 0.19 -0.36 0.36 0.04 -0.10 -0.17 -0.42 0.16 -0.17 0.00 0.13 -0.39 0.12 -0.57 0.19 0.25 -0.43 -0.39 -0.36 -0.23 0.15 -0.19 lvl. water Surface -0.21 -0.21 radiation Solar lvl. 0.10 -0.23 Groundwater lvl. Groundwater 0.08 radiation Solar -0.21 radiation Solar lvl. 
Groundwater Temperature 0.35 0.33 -0.07 -0.02 -0.67 0.23 -0.57 -0.16 -0.10 0.26 -0.50 -0.28 -0.46 0.37 0.13 NDVI 0.37 -0.34 -0.42 -0.34 -0.66 0.27 -0.57 -0.15 -0.40 0.27 -0.34 -0.28 0.39 -0.29 0.11 0.00 0.39 -0.30 0.02 -0.64 Discharge 0.29 -0.15 -0.54 -0.10 -0.75 0.29 -0.26 0.41 0.16 -0.10 -0.71 Discharge -0.73 0.37 -0.65 -0.25 -0.08 0.25 Discharge -0.54 -0.65 -0.14 lvl. 0.19 -0.33 water Discharge Surface 0.10 -0.55 -0.42 Moisture radiation Soil Solar Temperature Moisture Soil NDVI Moisture Soil Moisture Soil h mt pt akdpit nteequadrants these in points lacked spots empty The . Variables oa aito 00 .3-.8-.601 02 .5-.0-.401 .501 02 03 .800 .8-.6-.70.16 0.17 -0.17 -0.34 -0.26 -0.07 0.18 -0.05 0.16 -0.22 0.03 0.25 -0.09 0.20 0.13 -0.18 0.15 0.28 -0.56 -0.19 0.55 -0.35 -0.03 -0.07 0.25 0.18 -0.24 -0.10 0.08 -0.24 -0.25 0.18 0.13 -0.23 0.02 0.64 -0.19 0.18 0.35 0.23 0.29 0.24 -0.10 -0.37 -0.29 -0.44 -0.49 -0.01 0.13 0.18 -0.08 0.56 -0.02 0.37 -0.14 -0.06 0.12 0.29 -0.23 0.25 -0.48 -0.20 0.08 -0.49 -0.59 0.09 0.22 -0.46 0.61 0.15 -0.09 0.26 0.17 -0.07 -0.16 0.16 0.22 0.16 0.22 -0.42 -0.08 -0.26 0.33 0.05 0.31 -0.04 0.10 -0.18 -0.21 -0.17 0.32 0.18 -0.15 0.14 0.13 0.29 0.11 -0.04 -0.17 -0.15 -0.26 0.58 -0.30 0.30 -0.19 0.30 -0.31 -0.38 0.25 0.17 -0.22 0.32 -0.05 -0.20 0.12 -0.25 -0.68 lvl. 0.03 -0.47 water -0.84 0.14 0.64 Surface 0.13 -0.54 0.20 0.21 0.15 -0.28 0.11 -0.45 -0.38 -0.07 -0.03 0.15 -0.29 -0.13 -0.06 0.22 0.27 0.05 -0.26 -0.01 -0.18 -0.12 0.13 0.03 -0.29 -0.11 -0.29 -0.03 0.32 0.06 -0.32 0.09 0.27 radiation -0.22 0.46 -0.22 0.33 Solar -0.37 -0.08 -0.32 -0.44 0.16 0.08 0.37 -0.44 0.05 -0.06 lvl. 0.02 0.21 0.52 water -0.10 -0.56 0.20 Surface 0.22 0.15 -0.30 -0.57 0.16 0.38 0.18 -0.32 -0.42 -0.19 -0.05 -0.23 0.07 lvl. -0.14 0.33 Groundwater 0.03 -0.04 0.00 0.13 -0.45 0.12 -0.15 -0.21 0.31 radiation -0.40 Solar 0.31 -0.60 0.05 0.09 -0.43 -0.56 0.03 0.31 -0.57 0.11 0.24 0.25 -0.31 Temperature -0.71 0.34 0.20 0.11 -0.58 0.09 0.06 -0.30 lvl. 0.17 -0.09 -0.05 water 0.10 Surface -0.23 -0.24 0.07 -0.05 0.06 0.00 -0.29 lvl. 0.15 -0.20 water Surface -0.54 -0.34 -0.40 0.11 0.05 -0.38 -0.31 -0.21 lvl. Groundwater 0.20 0.18 0.28 -0.10 -0.17 -0.30 -0.60 radiation Solar -0.03 -0.10 0.11 -0.46 0.12 -0.15 Temperature -0.15 -0.21 -0.27 0.13 0.22 NDVI -0.01 -0.62 -0.06 0.03 Precipitation 0.15 -0.20 -0.32 -0.04 0.14 lvl. Groundwater -0.08 -0.08 0.15 -0.24 Precipitation Discharge u 2 01 .500 .001 03 .101 .602 00 .800 01 .6-.703 .801 0.17 0.15 0.08 0.38 -0.07 0.16 -0.14 0.07 0.38 -0.04 0.24 0.16 0.10 0.18 0.41 -0.17 -0.30 0.41 0.16 0.08 0.10 0.05 0.05 0.27 0.35 -0.37 -0.12 0.39 0.11 0.21 0.22 -0.14 0.39 0.09 -0.28 0.16 -0.25 0.32 -0.02 -0.05 01 .101 01 .8-.403 .4-.202 00 .802 04 .4-.702 .1-.10.16 -0.11 0.21 0.25 -0.07 0.24 -0.44 0.23 0.28 -0.02 0.26 -0.12 0.24 0.32 -0.34 0.18 -0.18 0.19 0.21 -0.15 EN ES v.N WS WAg EN ES v.N WS WAvg. SW SE NW NE Avg. SW SE NW NE Avg. SW SE NW NE Avg. SW SE NW NE Gaussian 0.23

F Gumbel-H. yeo copula of Type 0.26 Clayton 0.25 0.22 -0.32 -0.18 -0.17 -0.21 0.16 0.15 -0.02 -0.01 0.32 0.27 0.20 Frank 0.21 G ADDITIONALINFORMATION

G.1. LANGUAGE REPORT
British English

G.2. HARDWARE USED
PC HP Probook 650 G2
Processor Intel Core i5-6200U CPU @ 2.30 GHz 2.40 GHz
RAM 16.0 GB (15.9 GB usable)

G.3. SOFTWARE USED
G.3.1. GENERAL SOFTWARE
Operating system Windows 10 Professional
Programming Python 3.6 and Python 3.7 (Anaconda release)
Python editor Anaconda Spyder 3.3.5 and Spyder 4.0.1
Bayesian network Uninet1
GIS QGIS 3.81 with GRASS 7.6.1
LaTeX editor Overleaf
LaTeX compiler XeLaTeX
SOBEK SOBEK 213
Additional text editor Atom 1.46

G.3.2. PYTHON MODULES USED

• copulabayesnet (see Appendix G.4)

• numpy

• pandas

1Available on https://lighttwist-software.com/uninet/


• matplotlib

• pycopula

• scipy

• datetime

• rasterio

• statsmodels

• sklearn.metrics

• geopandas

• pingouin

• easygui

• biokit

• sys

• os

• mpl_toolkits

G.4. PROGRAMMES WRITTEN FOR THIS THESIS
For this thesis, a Python package called copulabayesnet has been developed. It is distributed with the MIT licence, can be accessed through https://github.com/SjoerdGn/copulabayesnet or https://pypi.org/project/copulabayesnet/, and can be installed directly by entering pip install copulabayesnet in the command prompt. The Python package contains all methods of the copula testing and the full multivariate normal method, including methods to test it k-fold over a whole dataset, as well as methods to plot copulas and results. For access to the data, please contact Waterschap Drents Overijsselse Delta and the author; for the supporting code, please contact Witteveen+Bos and the author.
