<<

Prof. (Dr.) Rajib Kumar Bhattacharjya Indian Institute of Technology Guwahati Guwahati, Assam Email: [email protected] Web: www.iitg.ernet.in/rkbc Visiting Faculty NIT Meghalaya Probabilistic Approach: A X is a variable described by a probability distribution.  The distribution specifies the chance that an observation x of the variable will fall in a specific of X.

 A set of observations n1, n2, n3,…………., nn of the random variable is called a .  It is assumed that samples are drawn from a hypothetical infinite population possessing constant statistical properties.  Properties of sample may vary from one sample to others.

The probability P (A) of an event is the chance that it will occur when an observation of the random variable is made.

푛 푃(퐴) = lim 퐴 1. Total Probability 푛→∞ 푛 푃 퐴 + 푃 퐴 + 푃 퐴 +. . … … … … … + 푃(퐴 ) = 푃(Ω) = 1 n --- number in range of event A. 1 2 3 푛 A 1. Complementarity n----- Total observations 푃 퐴 = 1 − 푃(퐴) 1. Conditional Probability

퐵 푃(퐴∩퐵) 푃 = 퐴 푃(퐴) B will occur provided A has already occurred. Joint Probability 푃 퐴 ∩ 퐵 = 푃(퐴)푃(퐵)

Example: The probability that annual precipitation will be less than 120 mm is 0.333. What is the probability that there will be two successive year of precipitation less than 120 mm.

P(R<35) = 0.333 P(C) = 0.3332 =0.111 15

10 9 Precipi- tation 5 5 2

Year

Relative Frequency: 푛 푓 푥 = 푖 푠 푖 푛

Which is equivalent to 푃 푥푖 − ∆푥 ≤ 푋 ≤ 푥푖 , the probability that the random variable X will lie in the interval [푥푖 − ∆푥,푥푖]

Cumulative Frequency Function: 푖

퐹푠 푥푖 = 푓푠 푥푗 푗=1 This is estimated as 푃 푋 ≤ 푥푖 , the cumulative probability of xi. This is estimated for sample , corresponding function for population will be Probability density Function:

푓푠(푥) 푓 푥 = 푛lim→∞ ∆푥→0 ∆푥

Probability Distribution Function:

퐹 푥 = 푛lim→∞ 퐹푠 푥 ∆푥→0

Whose derivative is the probability density function

푑퐹 훾 푓 훾 = 푑푥 The cumulative probability as 푃 푋 ≤ 푥 , can be expressed as

푥 푃 푋 ≤ 푥 = 퐹 푥 = 푓 푢 푑푢 −∞

푃 푥푖 = 푃 푥푖 − ∆푥 ≤ 푋 ≤ 푥푖

푥푖 = 푓 푥 푑푥

푥푖−∆푥

푥푖 푥푖−∆푥 = 푓 푥 푑푥 − 푓 푥 푑푥 −∞ −∞

=퐹 푥푖 − 퐹 푥푖 − ∆푥

=퐹 푥푖 − 퐹 푥푖−1 Sample Population Pxi fsxi

∆x x xi

Fsxi Fsxi

dF(x) dx

x Probability density function

푓(푥)

1 푥 − 휇 2 푓 푥 = 푒푥푝 − 2휋 2휎2

2 퐹(푥) 1 푧 푥 − 휇 푓 푧 = 푒푥푝 − 푧 = −∞ ≤ 푧 ≤ ∞ 2휋 2 휎

푧 1 2 퐹 푧 = 푒−푢 /2푑푢 2휋 푥 −∞ Cumulative probability of standard

푧 1 2 퐹 푧 = 푒−푢 /2푑푢 2휋 −∞

퐹 푧 = 퐵 for 푧 < 0

퐹 푧 = 1 − 퐵 for 푧 ≥ 0

1 퐵 = 1 + 0.196854 푧 + 0.115194 푧 2 + 0.000344 푧 3 + 0.019527 푧 4 −4 2 Ex. What is the probability that the standard normal random variable 푧 will be less than -2? Less than 1? What is 푃 −2 < 푍 < 1 ?

Solution

푃 푍 < −2 = 퐹 −2 = 1 − 퐹 2 = 1 − 0.9772 = 0.228

푃 푍 < 1 = 퐹 1 = 0.8413

푃 −2 < 푍 < 1 = 퐹 1 − 퐹 2 = 0.841 − 0.023 = 0.818 Example 2: The annual runoff of a stream is modeled by a normal distribution with and of 5000 and 1000 ha-m respectively. i. Find the probability that the annual runoff in any year is more than 6500 ha-m. ii. Find the probability that it would be between 3800 and 5800 ha-m. Solution: Let X is the random variable denoting the annual runoff. Then z is given by

(푋−5000) 푧 = N (0, 1) 1000

(6500−5000) (i) P (X≥6500) = P (z ≥ ) 1000 = P (z≥1.5) = 1-P (z≤1.5) = 1- F (1.5) F (1.5) = 0.9332 P (X≥6500) =1-0.9332 = 0.0688 P (X≥6500) =0.0688 (ii) P (3800≤X≤5800)

(3800−5000) (5800−5000) = P [ ≤ 푧 ≤ P ] 1000 1000 = P [-1.2≤z≤0.8] = F (0.8)-F (-1.2) = F (0.8)-[1-F (1.2)] =0.7881-(1-0.8849) =0.673 Statistical 퐸 푥 = 휇 = 푥푓 푥 푑푥 −∞ It is the first about the origin of the random variable, a measure of the midpoint or of the distribution 푛 1 푥 = 푥 The sample estimate of the mean is the average 푛 푖 푖=1

The variability of data is measured by the 휎2

∞ 퐸 푥 − 휇 2 = 휎2 = 푥 − 휇 2푓 푥 푑푥 −∞ 푛 1 The sample estimate of the variance is given by 푠2 = 푥 − 푥 2 푛 − 1 푖 푖=1 A symmetry of distribution about the mean is a measured by the . This is the third moment about the mean ∞ 퐸 푥 − 휇 3 = 푥 − 휇 3푓 푥 푑푥 −∞

The coefficient of skewness 훾 is defined as

1 훾 = 퐸 푥 − 휇 3 휎3

A sample estimate for 훾 is given by

푛 푛 푥 − 푥 3 퐶 = 푖=1 푖 푠 (푛 − 1)(푛 − 2)푠3 Fitting a probability distribution

A probability distribution is a function representing the probability of occurrence of a random variable.

By fitting a distribution function, we can extract the probabilistic information of the random variable

Fitting distribution can be achieved by the method of moments and the method of maximum likelihood

Method of moments

Developed by Karl Pearson in 1902 Karl Pearson (27 March He considered that good estimate of the of a 1857 – 27 April 1936) was probability distribution are those for which moments of an English mathematician the probability density function about the origin are equal to the corresponding moments of the sample data The moments are

∞ 푛 1 퐸 푥 = 휇 = 푥푓 푥 푑푥 = 푥 = 푥 푛 푖 −∞ 푖=1

∞ 푛 1 퐸 푥 − 휇 2 = 휎2 = 푥 − 휇 2푓 푥 푑푥 = 푠2 = 푥 − 푥 2 푛 − 1 푖 −∞ 푖=1

∞ 푛 푛 푥 − 푥 3 퐸 푥 − 휇 3 = 푥 − 휇 3푓 푥 푑푥 = 푖=1 푖 (푛 − 1)(푛 − 2) −∞ Method of moments

Developed by R A Fisher in 1922 Sir Ronald Aylmer Fisher (17 February 1890 – He reasoned that the best value of a parameter of a 29 July 1962), known as probability distribution should be that value which R.A. Fisher, was an English , evolutionary maximizes the likelihood of joint probability of biologist, mathematician, occurrence of the observed sample geneticist, and eugenicist. Suppose that the sample space is divided into intervals of length 푑푥 and that a sample of independent and identically distributed observations 푥1, 푥2, 푥3, … , 푥푛 is taken

The value of the probability density for 푋 = 푥푖 is 푓(푥푖) The probability that the random variable will occur in that interval including 푥푖 is 푓 푥푖 푑푥 Since the observation is independent, their joint probability of occurrence is

푛 푛 푓 푥1 푑푥푓 푥2 푑푥푓 푥3 푑푥 … 푓 푥푛 푑푥 = 푓(푥푖)푑푥 푖=1 Since 푑푥 is fixed, we can maximize

푀푎푥 퐿 = 푓(푥푖) 푖=1 푛

푀푎푥 푙푛퐿 = 푙푛푓(푥푖) 푖=1 Example: The can be used to describe various kinds of hydrological data, such as inter arrival times of rainfall events. Its probability density function is 푓 푥 = 휆푒−휆푥 for 푥 > 0. Determine the relationship between the parameter 휆 and the first moment about the origin 휇.

Solution: Method of maximum likelihood

Method of moments 푛 푛 휕푙푛퐿 푛 푙푛퐿 = 푙푛푓(푥 ) = − 푥 = 0 ∞ 푖 휕휆 휆 푖 휇 = 퐸 푥 = 푥푓 푥 푑푥 푖=1 푖=1 −∞ 푛 푙푛퐿 = 푙푛 휆푒−휆푥푖 푛 ∞ 1 1 푖=1 = 푥푖 −휆푥 휆 푛 = 푥휆푒 푑푥 푖=1 −∞ 푛 푙푛퐿 = 푙푛휆 − 휆푥 1 푖 휆 = 푖=1 푥 1 휇 = 푛 휆 푙푛퐿 = 푛푙푛휆 − 휆 푥푖 푖=1

Normal family Normal, lognormal, lognormal-III

Generalized extreme value family EV1 (Gumbel), GEV, and EVIII (Weibull)

Exponential/Pearson type family Exponential, Pearson type III, Log-Pearson type III Normal Distribution

1 1 푥−휇 2 푓 푥 = 푒−2 휎 −∞ ≤ 푥 ≤ +∞ 휎 2휋

0.8 0.08 =0.5 =-20 0.7 =1.5 0.07 =-10 =2.5 =0 0.6 =3.5 0.06 =10 =4.5 =20

0.5 0.05

0.4 0.04

f(x)

f(x)

0.3 0.03

0.2 0.02

0.1 0.01

0 0 -10 -8 -6 -4 -2 0 2 4 6 8 10 -50 -40 -30 -20 -10 0 10 20 30 40 50 x x Log-Normal Distribution

2 1 푦−휇푦 1 − 푓 푦 = 푒 2 휎푦 푥 > 0 and 푦 = ln(푥) 휎푦 2휋 Extreme value (EV) distributions

• Extreme values – maximum or minimum values of sets of data • Annual maximum discharge, annual minimum discharge • When the number of selected extreme values is large, the distribution converges to one of the three forms of EV distributions called Type I, II and III

25 0.2 =0.5 0.18 =1 EV type I distribution =1.5 0.16 =2.0 0.14 =2.5 If M1, M2…, Mn be a set of daily rainfall or streamflow, and let X = max(Mi) be the maximum for the year. If Mi are independent and 0.12 identically distributed, then for large n, X has an extreme value type I 0.1 or Gumbel distribution. f(x) 0.08

0.06

0.04

0.02

0 -5 0 5 10 15 20 1 −푦 푓 푥 = 푒− 푦+푒 x 훽 0.2 =1 푥 − 휇 0.18 =2 =3 푦 = 0.16 훽 =4 0.14 6푆푥 휇 = 푥 − 0.5772훽 훽 = 0.12 휋 0.1

f(x)

0.08 Distribution of annual maximum stream flow follows an EV1 distribution 0.06

0.04

0.02

0 -5 0 5 10 15 20 x EV type III distribution 1.8 If Wi are the minimum stream flows in different days of the year, =1 k=0.5 let X = min(Wi) be the smallest. X can be described by the EV type 1.6 =1 k=1.0 III or . =1 k=1.5 1.4 =1 k=2.0

1.2

1

f(x) 0.8 푘−1 푥 푘 푘 푥 − 푓 푥 = 푒 휆 푥 > 0, 휆, 푘 > 0 휆 휆 0.6

0.4

0.2

0 0 0.5 1 1.5 2 2.5 x

Distribution of low flows (eg. 7-day min flow) follows EV3 distribution. 4.5 Exponential distribution =0.5 4 =1.5 =2.5 3.5 =3.5 Poisson process – a stochastic process in =4.5 which the number of events occurring in two 3 disjoint subintervals are independent 2.5 random variables.

f(x) In hydrology, the interarrival time (time 2 between stochastic hydrologic events) is 1.5 described by exponential distribution 1

0.5

0 0 0.2 0.4 0.6 0.8 1 1.2 1.4 1.6 1.8 2 1 x f (x)  ex x  0;   x

Interarrival times of polluted runoffs, rainfall intensities, etc are described by exponential distribution. • The time taken for a number of events (b) in a Poisson process is described by the gamma distribution • Gamma distribution – a distribution of sum of b independent and identical exponentially distributed random variables.

 x 1ex f (x)  x  0;   gamma function ( )

Skewed distributions (eg. hydraulic conductivity) can be represented using gamma without log transformation. Pearson Type III

Named after the statistician Pearson, it is also called three-parameter gamma distribution. A lower bound is introduced through the third parameter (e)

 (x e ) 1e(xe ) f (x)  x  e;   gamma function ( )

It is also a skewed distribution first applied in hydrology for describing the pdf of annual maximum flows. Log-Pearson Type III

If log X follows a Person Type III distribution, then X is said to have a log- Pearson Type III distribution

 (y e) 1e( ye ) f (x)  y  log x  e () Frequency analysis for extreme events

Q. Find a flow (or any other event) that has a return period of T years

1  x u  x u    x u  f (x)  exp  exp  EV1 pdf and cdf      F(x)  exp exp        6s   x u  x  0.5772  x u Define a reduced variable y y   F(x)  exp exp(y)

y  ln lnF(x) ln ln(1 p) where p  P(x  xT )   1  yT  ln ln1    T 

If you know T, you can find 푦푇, and once 푦푇 is know, 푥푇 can be computed by

xT  u yT 32 Example 01

Given annual maxima for 10-minute storms. The mean and standard deviation are 0.649 in and 0.177 in. Find 5 & 50 year return period 10-minute storms x  0.649in s  0.177in 6s 6 *0.177     0.138 u  x  0.5772  0.649  0.5772 *0.138  0.569  

  T    5  y5  ln ln    ln ln   1.5  T 1   5 1

x5  u y5  0.569  0.138 *1.5  0.78in

x50 1.11in

33 Normal Distribution 2 1 x     xT x 1 2  K   z Normal distribution f (x)  e   T s T X  2

So the frequency factor for the Normal Distribution is the xT  x  KT s  x  zT s standard normal variate

Example: 50 year return period

1 T  50; p   0.02; K  z  2.054 50 50 50

34 ∞ 1/2 1 1 푤 = 푙푛 0 < 푝 ≤ 0.5 푃 푋 > 푥푇 = = 푓 푥 푑푥 푝2 푇 푥푇

퐾푇휎

휇 푥푇

2.515517 + 0.802853푤 + 0.010328푤2 푧 = 푤 − 1 + 1.432788푤 + 0.189269푤2 + 0.001308푤3 EV-I (Gumbel) Distribution

  x  u  6s   T  F(x)  exp exp    u  x  0.5772 yT  ln ln         T 1 xT  u yT 6 6    T   x  0.5772 s  s ln ln        T 1 6    T   x  0.5772  ln ln  s     T 1

6    T  xT  x  KT s KT   0.5772  ln ln      T 1

36 Example 02

Given annual maximum rainfall, calculate 5-yr storm using frequency factor

6    T  KT   0.5772  lnln      T 1

6    5  KT   0.5772  lnln   0.719     5 1

xT  x  KT s  0.649  0.719  0.177  0.78 in

37 Distribution PDF CDF Range Mean 1 푥−휇 2 푥 Normal 1 − 1 푥−휇 2 −∞ ≤ 푥 ≤ +∞ 휇 and 휎 2 휎 1 − 푒 푒 2 휎 푑푥 휎 2휋 휎 2휋 Or −∞ 푧2 1 − 푥−휇 푒 2 , 푧 = 2휋 휎 2 푦 2 1 푥−휇푦 1 푥−휇푦 −∞ ≤ 푦 ≤ +∞ 휇 and 휎 Lognormal 1 − 1 − 푦 푦 푦 = ln(푥) 푒 2 휎푦 푒 2 휎푦 푑푦 0 ≤ 푥 ≤ +∞ 휎푦 2휋 휎푦 2휋 −∞ −푒−푦 Extreme value 1 −푦 푒 −∞ ≤ 푥 ≤ +∞ 훽 = 푥 − 0.5772훼 푒− 푦+푒 Type I 훼 6푆 훼 = 푥 푦 = (푥 − 훽)/훼 휋 Extreme value 훼푥훼−1훽−훼푒− 푥/훽 훼 1 − 푒− 푥/훽 훼 푥 ≥ 0 훽Γ 1 + 1/훼 Type III 훽 Γ 1 + 2/훼 Ex. Annual maximum values of 10 min duration rainfall at some place from 1913 to 1947 are presented in Table below. Develop a model for storm rainfall frequency analysis using Extreme Value Type I distribution and calculate the 5, 10, and 50 year return period maximum values of 10 min rainfall of the area.

Year 1910 1920 1930 1940 Solution 0 0.53 0.33 0.34 6푠 6 × 0.177 1 0.76 0.96 0.70 훼 = = = 0.138 휋 휋 2 0.57 0.94 0.57 3 0.49 0.80 0.80 0.92 F푢 (=x)푥 −exp0.5772훼 exp(=0.y649) − 0.5772 × 0.138 = 0.569

4 0.66 0.66 0.62 0.66 y  ln lnF(x) ln ln(1 p) where p  P(x  xT ) 5 0.36 0.68 0.71 0.65   1  6 0.58 0.68 1.11 0.63 yT  ln ln1    T  7 0.41 0.61 0.64 0.60 8 0.47 0.88 0.52 1 9 0.74 0.49 0.64 푦 = −푙푛 푙푛 1 − = 1.5 푇 5 Mean 0.649 SD 0.177

푥푇 = 푢 + 훼푦푇 = 0.569 + 0.138 × 1.5 = 0.78 6    T  KT   0.5772  ln ln      T 1

6 5 퐾 = − 0.5772 + 푙푛 푙푛 = 0.719 푇 휋 5 − 1

푥푇 = 푥 + 퐾푇푠 = 0.649 + 0.719 × 0.177 = 0.78 LOG – PEARSON TYPE III DISTRIBUTION

푦 = log(푥)

Calculate mean 푦 , standard deviation 푠푦 and the coefficient of skewness 퐶푠

The value of z can be calculated using the procedure described earlier.

Ex. Calculate the 5 and 50 year return period maximum discharge of the Guadalupe River, Texas using lognormal and Log-Pearson Type III distribution. SOLUTION

Year 1930 1940 1950 1960 1970 Lognormal Distribution 0 55900 13300 23700 9190 1 58000 12300 55800 9740 푦 = 4.2743 푠푦 = 0.4027 퐶푠 = −0.0696 2 56000 28400 10800 58500

3 7710 11600 4100 33100 퐾50 for 퐶푠 = 0 is 2.054 4 12300 8560 5720 25200 푦 = 푦 + 퐾 푠 5 38500 22000 4950 15000 30200 50 50 푦 = 4.2743 + 2.054 × 0.4027 = 5.101 6 179000 17900 1730 9790 14100 5.101 7 17200 46000 25300 70000 54500 푥50 = 10 = 126300 푐푓푠 8 25400 6970 58300 44300 12700 9 4940 20600 10100 15200 푥5 = 41060 푐푓푠 Log-Pearson Type III Distribution

2.0 − 2.054 퐾 = 2.054 + −0.0696 − 0 = 2.016 50 −0.1 − 0

푦50 = 푦 + 퐾50푠푦 = 4.2743 + 2.016 × 0.4027 = 5.0863

5.0863 푥50 = 10 = 121990 푐푓푠

푥5 = 41170 푐푓푠 PLOTING POSITIONS

Plotting position refers to the probability value assigned to each piece of data to be plotted

푚 푚 − 푏 푃 푋 ≥ 푥푚 = California’s formula 푃 푋 ≥ 푥 = 푛 푚 푛 + 1 − 2푏

푚 − 1 푃 푋 ≥ 푥푚 = Modified formula 푛 푏 = 0.5 For Hazen (1930) formula

푚 − 0.5 푏 = 0.3 For Chegodayev’s formula 푃 푋 ≥ 푥푚 = Hazen (1930) formula 푛 푏 = 0 For Weibull formula

푚 − 0.3 푃 푋 ≥ 푥 = Chegodayev’s formula 푚 푛 + 0.4 푏 = 3/8 For Blom’s formula

푚 푏 = 1/3 For Tukey’s formula 푃 푋 ≥ 푥푚 = Weibull formula 푛 + 1 푏 = 0.44 For Gringorten’s formula 1960 1974 1957 1938 1952 1975 1973 1935 1968 1947 1977 1961 1940 1942 1941 1958 1972 1967 1936

Year Maimum 1090 1254 1303 1543 1580 1583 1586 1642 1651 1657 1982 5069

671 714 716 719 804 855 937 Discharge (m3/s) 19 18 17 16 15 14 13 12 11 10

9 8 7 6 5 4 3 2 1 Rank 0.422 0.400 0.378 0.356 0.333 0.311 0.289 0.267 0.244 0.222 0.200 0.178 0.156 0.133 0.111 0.089 0.067 0.044 0.022 Exceedence probability 1.31 1.35 1.40 1.44 1.48 1.53 1.58 1.63 1.68 1.73 1.79 1.86 1.93 2.01 2.10 2.20 2.33 2.50 2.76 w 0.20 0.25 0.31 0.37 0.43 0.49 0.56 0.62 0.69 0.76 0.84 0.92 1.01 1.11 1.22 1.35 1.50 1.70 2.01 z 2.83 2.85 2.86 2.86 2.91 2.93 2.97 3.04 3.10 3.11 3.19 3.20 3.20 3.20 3.22 3.22 3.22 3.30 3.70 log(Q) Log Q from

2.81 2.83 2.85 2.88 2.90 2.92 2.95 2.98 3.00 3.03 3.07 3.10 3.13 3.17 3.22 3.27 3.33 3.41 3.54 Lognormal distribution 1956 1963 1939 1955 1964 1948 1943 1954 1970 1971 1966 1959 1962 1953 1951 1944 1978 1950 1976 1965 1969 1937 1946 1949 1945 116 140 140 162 197 218 242 260 276 277 286 306 328 348 348 360 377 399 425 430 487 507 583 623 49 44 43 42 41 40 39 38 37 36 35 34 33 32 31 30 29 28 27 26 25 24 23 22 21 20 0.978 0.956 0.933 0.911 0.889 0.867 0.844 0.822 0.800 0.778 0.756 0.733 0.711 0.689 0.667 0.644 0.622 0.600 0.578 0.556 0.533 0.511 0.489 0.467 0.444 0.21 0.30 0.37 0.43 0.49 0.53 0.58 0.63 0.67 0.71 0.75 0.79 0.83 0.86 0.90 0.94 0.97 1.01 1.05 1.08 1.12 1.16 1.20 1.23 1.27 ------1.83 1.60 1.43 1.30 1.19 1.08 0.99 0.91 0.83 0.75 0.68 0.62 0.55 0.49 0.43 0.37 0.31 0.25 0.20 0.14 0.08 0.03 0.03 0.08 0.14 1.69 2.06 2.15 2.15 2.21 2.30 2.34 2.38 2.42 2.44 2.44 2.46 2.49 2.52 2.54 2.54 2.56 2.58 2.60 2.63 2.63 2.69 2.70 2.77 2.79 1.99 2.08 2.15 2.20 2.25 2.29 2.33 2.36 2.39 2.42 2.45 2.48 2.50 2.53 2.55 2.58 2.60 2.62 2.65 2.67 2.69 2.72 2.74 2.76 2.78 4.00

3.50

3.00

2.50 Log(Q)

2.00

1.50 0.802 0.602 0.402 0.202 0.002 Exceedence probability RELIABILITY OF ANALYSIS

Confidence Limits: Statistical estimate often presented with a range, or within which the true value can be expected to lie.

Confidence Level 훽

1 − 훽 Significant Level 훼 훼 = i.e. is 훽 = 90%, then 훼 = 0.05 2

For a return period 푇, the upper limit 푈푇,훼 and lower limit 퐿푇,훼 may be specified as

푈 푈푇,훼 = 푦 + 푠푦퐾푇,훼

퐿 푈푇,훼 = 푦 + 푠푦퐾푇,훼

퐾 + 퐾2 − 푎푏 퐾 − 퐾2 − 푎푏 퐾푈 = 푇 푇 퐾퐿 = 푇 푇 푇,훼 푎 푇,훼 푎 2 2 푧훼 푧 푎 = 1 − 푏 = 퐾2 − 훼 2 푛 − 1 푇 푛

푧훼 is the standard normal variable with exceedence probability 훼 Problem: Determine 90% confidence limits for the 100 year discharge using the following data Logarithmic mean =3.639, standard deviation = 0.4439, coefficient of skewedness = -0.64 for 16 years of data Solution:

훽 = 0.9 훼 = 0.05

푧훼has the exceedence probability of 0.05. This cumulative probability is 0.95.

푧훼 = 1.645 2 2 퐾푇 + 퐾푇 − 푎푏 1.843 + 1.843 − 푎푏 퐾 = 1.843 퐾푈 = = = 2.7714 Considering log-normal distribution, the 100 푇,훼 푎 푎 푧2 1.6452 푎 = 1 − 훼 = 1 − = 0.909799 2 푛 − 1 2 16 − 1 2 2 2 2 퐿 퐾푇 − 퐾푇 − 푎푏 1.843 − 1.843 − 푎푏 푧훼 1.645 퐾 = = = 1.2804 푏 = 퐾2 − = 1.8432 − = 3.227522 푇,훼 푎 푎 푇 푛 16

푈 푈푇,훼 = 푦 + 푠푦퐾푇,훼 = 3.639 + 0.4439 × 2.7714 = 4.869225

퐿 퐿푇,훼 = 푦 + 푠푦퐾푇,훼 = 3.639 + 0.4439 × 1.2804 = 4.207211 Testing the

휒2 test is used to test the goodness of fitting

푚 푛 푓 푥 − 푝 푥 2 휒2 = 푠 푖 푖 Where 푚 is the number of interval 푝 푥푖 푖=1

푛푓푠 푥푖 is the observed number of occurrence and 푛푝 푥푖 is the corresponding expected number of occurrence in the interval 푖. For describing 휒2 test, 휒2 probability distribution must be defined

푘 1 −1 푓 푥 = 푥2 푒−푥/2 2푘/2Γ 푘/2

1 F 푥 = 훾 푘/2, 푥/2 Γ 푘/2 Degree of freedom 푣 = 푚 − 푝 − 1 Check the goodness of fit on Normal Distribution Interval Range 푛푖 1 20 1 2 25 2 퐹 푧 = 퐵 for 푧 < 0 3 30 6 4 35 14 퐹 푧 = 1 − 퐵 for 푧 ≥ 0 5 40 11 6 45 16 1 퐵 = 1 + 0.196854 푧 + 0.115194 푧 2 + 0.000344 푧 3 + 0.019527 푧 4 −4 7 50 10 2 8 55 5 9 60 3 10 65 1 69 Mean 39.77 SD 9.17 2 Interval Range 푛푖 푓푠(푥푖) 퐹푠(푥푖) 푧푖 B 퐹(푥푖) 푝(푥푖) 휒

1 20 1 0.0145 0.0145 -2.1559 0.0154 0.0154 0.0154 0.004084 푥푖 − 휋 푧푖 = 2 25 2 0.0290 0.0435 -1.6107 0.0535 0.0535 0.0380 0.147864 휎 3 30 6 0.0870 0.1304 -1.0654 0.1436 0.1436 0.0901 0.007632 4 35 14 0.2029 0.3333 -0.5202 0.3012 0.3012 0.1577 0.895245 5 40 11 0.1594 0.4928 0.0251 0.4901 0.5099 0.2087 0.801553 6 45 16 0.2319 0.7246 0.5703 0.2840 0.7160 0.2061 0.22287 7 50 10 0.1449 0.8696 1.1156 0.1325 0.8675 0.1515 0.01965 8 55 5 0.0725 0.9420 1.6609 0.0482 0.9518 0.0843 0.115501 9 60 3 0.0435 0.9855 2.2061 0.0136 0.9864 0.0346 0.159163 10 65 1 0.0145 1.0000 2.7514 0.0032 0.9968 0.0104 0.108361 69 1 2.481923 Mean 39.77 SD 9.17

2 Degree of freedom 푣 = 푚 − 푝 − 1 = 10 − 2 − 1 = 7 휒7,1−0.95 = 14.067

2 2 Since 휒7,1−0.95 is greater than 휒푐 , the null hypothesis can not be rejected