7. Concepts in Probability, Statistics and Stochastic Modelling
Total Page:16
File Type:pdf, Size:1020Kb
wrm_ch07.qxd 8/31/2005 12:18 PM Page 168 7. Concepts in Probability, Statistics and Stochastic Modelling 1. Introduction 169 2. Probability Concepts and Methods 170 2.1. Random Variables and Distributions 170 2.2. Expectation 173 2.3. Quantiles, Moments and Their Estimators 173 2.4. L-Moments and Their Estimators 176 3. Distributions of Random Events 179 3.1. Parameter Estimation 179 3.2. Model Adequacy 182 3.3. Normal and Lognormal Distributions 186 3.4. Gamma Distributions 187 3.5. Log-Pearson Type 3 Distribution 189 3.6. Gumbel and GEV Distributions 190 3.7. L-Moment Diagrams 192 4. Analysis of Censored Data 193 5. Regionalization and Index-Flood Method 195 6. Partial Duration Series 196 7. Stochastic Processes and Time Series 197 7.1. Describing Stochastic Processes 198 7.2. Markov Processes and Markov Chains 198 7.3. Properties of Time-Series Statistics 201 8. Synthetic Streamflow Generation 203 8.1. Introduction 203 8.2. Streamflow Generation Models 205 8.3. A Simple Autoregressive Model 206 8.4. Reproducing the Marginal Distribution 208 8.5. Multivariate Models 209 8.6. Multi-Season, Multi-Site Models 211 8.6.1. Disaggregation Models 211 8.6.2. Aggregation Models 213 9. Stochastic Simulation 214 9.1. Generating Random Variables 214 9.2. River Basin Simulation 215 9.3. The Simulation Model 216 9.4. Simulation of the Basin 216 9.5. Interpreting Simulation Output 217 10. Conclusions 223 11. References 223 wrm_ch07.qxd 8/31/2005 12:18 PM Page 169 169 7 Concepts in Probability, Statistics and Stochastic Modelling Events that cannot be predicted precisely are often called random. Many if not most of the inputs to, and processes that occur in, water resources systems are to some extent random. Hence, so too are the outputs or predicted impacts, and even people’s reactions to those outputs or impacts. To ignore this randomness or uncertainty is to ignore reality. This chapter introduces some of the commonly used tools for dealing with uncertainty in water resources planning and management. Subsequent chapters illustrate how these tools are used in various types of optimization, simulation and statistical models for impact prediction and evaluation. 1. Introduction of the system. If expected or median values of uncertain parameters or variables are used in a deterministic model, Uncertainty is always present when planning, developing, the planner can then assess the importance of uncertainty managing and operating water resources systems. It arises by means of sensitivity analysis, as is discussed later in this because many factors that affect the performance of water and the two subsequent chapters. resources systems are not and cannot be known with Replacement of uncertain quantities by either expected, certainty when a system is planned, designed, built, median or worst-case values can grossly affect the evalua- managed and operated. The success and performance of tion of project performance when important parameters each component of a system often depends on future are highly variable. To illustrate these issues, consider meteorological, demographic, economic, social, technical, the evaluation of the recreation potential of a reservoir. and political conditions, all of which may influence Table 7.1 shows that the elevation of the water surface future benefits, costs, environmental impacts, and social varies over time depending on the inflow and demand for acceptability. Uncertainty also arises due to the stochastic water. The table indicates the pool levels and their associ- nature of meteorological processes such as evaporation, ated probabilities as well as the expected use of the recre- ation facility with different pool levels. rainfall and temperature. Similarly, future populations of – towns and cities, per capita water-usage rates, irrigation The average pool level L is simply the sum of each patterns and priorities for water uses, all of which affect possible pool level times its probability, or – water demand, are never known with certainty. L ϭ 10(0.10) ϩ 20(0.25) ϩ 30(0.30) There are many ways to deal with uncertainty. One, and ϩ 40(0.25) 50(0.10) ϭ 30 (7.1) perhaps the simplest, approach is to replace each uncertain This pool level corresponds to 100 visitor-days per day: quantity either by its average (i.e., its mean or expected – VD(L) ϭ 100 visitor-days per day (7.2) value), its median, or by some critical (e.g., ‘worst-case’) value, and then proceed with a deterministic approach. Use A worst-case analysis might select a pool level of ten as a of expected or median values of uncertain quantities may be critical value, yielding an estimate of system performance adequate if the uncertainty or variation in a quantity is rea- equal to 100 visitor-days per day: ϭ ϭ sonably small and does not critically affect the performance VD(Llow) VD(10) 25 visitor-days per day (7.3) wrm_ch07.qxd 8/31/2005 12:18 PM Page 170 170 Water Resources Systems Planning and Management adequate representations of the data. Sections 4, 5 and 6 expand upon the use of these mathematical models, and possible probability recreation potential E021101a pool levels of each level in visitor-days per day discuss alternative parameter estimation methods. for reservoir with Section 7 presents the basic ideas and concepts of different pool levels the stochastic processes or time series. These are used to model streamflows, rainfall, temperature or other 10 0.10 25 20 0.25 75 phenomena whose values change with time. The section 30 0.30 100 contains a description of Markov chains, a special type of 40 0.25 80 stochastic process used in many stochastic optimization 50 0.10 70 and simulation models. Section 8 illustrates how synthetic flows and other time-series inputs can be generated for Table 7.1. Data for determining reservoir recreation potential. stochastic simulations. Stochastic simulation is introduced with an example in Section 9. Many topics receive only brief treatment in this intro- ductory chapter. Additional information can be found in Neither of these values is a good approximation of the applied statistical texts or book chapters such as Benjamin average visitation rate, that is and Cornell (1970), Haan (1977), Kite (1988), Stedinger –– VD ϭ 0.10 VD(10) ϩ 0.25 VD(20) ϩ 0.30 VD(30) et al. (1993), Kottegoda and Rosso (1997), and Ayyub ϩ 0.25 VD(40) ϩ 0.10 VD(50) and McCuen (2002). ϭ 0.10(25) ϩ 0.25(75) ϩ 0.30(100) ϩ 0.25(80) ϩ 0.10(70) (7.4) 2. Probability Concepts and ϭ 78.25 visitor-days per day Methods –– Clearly, the average visitation rate, VD ϭ 78.25, the visitation rate corresponding to the average pool level This section introduces the basic concepts and definitions – ϭ ϭ VD(L) 100, and the worst-case assessment VD(Llow) used in analyses involving probability and statistics. These 25, are very different. concepts are used throughout this chapter and later Using only average values in a complex model can chapters in the book. produce a poor representation of both the average performance and the possible performance range. When 2.1. Random Variables and Distributions important quantities are uncertain, a comprehensive analysis requires an evaluation of both the expected The basic concept in probability theory is that of the performance of a project and the risk and possible random variable. By definition, the value of a random magnitude of project failures in a physical, economic, variable cannot be predicted with certainty. It depends, ecological and/or social sense. at least in part, on the outcome of a chance event. This chapter reviews many of the methods of proba- Examples are: (1) the number of years until the flood bility and statistics that are useful in water resources stage of a river washes away a small bridge; (2) the num- planning and management. Section 2 is a condensed ber of times during a reservoir’s life that the level of the summary of the important concepts and methods of pool will drop below a specified level; (3) the rainfall probability and statistics. These concepts are applied in depth next month; and (4) next year’s maximum flow at this and subsequent chapters of this book. Section 3 a gauge site on an unregulated stream. The values of all presents several probability distributions that are often of these random events or variables are not knowable used to model or describe the distribution of uncertain before the event has occurred. Probability can be used quantities. The section also discusses methods for fitting to describe the likelihood that these random variables these distributions using historical information, and will equal specific values or be within a given range of methods of assessing whether the distributions are specific values. wrm_ch07.qxd 8/31/2005 12:18 PM Page 171 Concepts in Probability, Statistics and Stochastic Modelling 171 The first two examples illustrate discrete random The value of the cumulative distribution function FX(x) variables, random variables that take on values that are for a discrete random variable is the sum of the probabil- discrete (such as positive integers). The second two exam- ities of all xi that are less than or equal to x. ples illustrate continuous random variables. Continuous ϭ FxXXi() ∑ p () x (7.10) random variables take on any values within a specified Յ xxi range of values. A property of all continuous random variables is that the probability that the value of any of The right half of Figure 7.1 illustrates the cumulative those random variables will equal some specific number – distribution function (upper) and the probability mass any specific number – is always zero.