
Bernoulli Distribution

Example: Toss of a coin

Define X = 1 if head comes up and X = 0 if tail comes up.

Both realizations are equally likely: $P(X = 1) = P(X = 0) = \frac{1}{2}$.

Often: two outcomes which are not equally likely. Examples:

- Success of medical treatment
- Interviewed person is female
- Student passes exam
- Transmission of a disease

Bernoulli distribution (with parameter $\theta$)

X takes two values, 1 and 0, with probabilities $\theta$ and $1 - \theta$.

Frequency function of X:

$$p(x) = \begin{cases} \theta^{x} (1 - \theta)^{1 - x} & \text{for } x \in \{0, 1\} \\ 0 & \text{otherwise} \end{cases}$$

Often:

$$X = \begin{cases} 1 & \text{if event } A \text{ has occurred} \\ 0 & \text{otherwise} \end{cases}$$

Example: A = blood pressure above 140/90 mm Hg.
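The frequency function above can be written down directly. The following is a minimal sketch; the function name `bernoulli_pmf` and the value theta = 0.3 are illustrative assumptions, not part of the slides.

```python
def bernoulli_pmf(x, theta):
    """Frequency function p(x) = theta^x * (1 - theta)^(1 - x) for x in {0, 1}, else 0."""
    if not 0.0 <= theta <= 1.0:
        raise ValueError("theta must lie in [0, 1]")
    if x not in (0, 1):
        return 0.0
    return theta ** x * (1.0 - theta) ** (1 - x)

# Fair coin: both outcomes have probability 1/2.
fair = [bernoulli_pmf(x, 0.5) for x in (0, 1)]

# Two outcomes that are not equally likely (theta = 0.3 is a made-up value).
biased = [bernoulli_pmf(x, 0.3) for x in (0, 1)]

print(fair, biased)
```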

Distributions, Jan 30, 2003

Bernoulli Distribution

Let $X_1, \ldots, X_n$ be independent Bernoulli random variables with the same parameter $\theta$.

Frequency function of $X_1, \ldots, X_n$:

$$p(x_1, \ldots, x_n) = p(x_1) \cdots p(x_n) = \theta^{x_1 + \cdots + x_n} (1 - \theta)^{n - x_1 - \cdots - x_n}$$

for $x_i \in \{0, 1\}$ and $i = 1, \ldots, n$.

Example: Paired-Sample Sign Test

Study the success of a new, elaborate safety program. Record the average weekly losses in hours of labor due to accidents before and after installation of the program in 10 industrial plants.

Plant     1   2   3    4   5   6   7   8   9  10
Before   45  73  46  124  33  57  83  34  26  17
After    36  60  44  119  35  51  77  29  24  11

Define for the ith plant

$$X_i = \begin{cases} 1 & \text{if the first value is greater than the second} \\ 0 & \text{otherwise} \end{cases}$$

Result: 1 1 1 1 0 1 1 1 1 1

The $X_i$'s are independently Bernoulli distributed with unknown parameter $\theta$.
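As a check on the example, the indicators can be recomputed from the table, and the joint frequency function of independent Bernoulli variables evaluated for the observed sequence. This is only a sketch: the helper name `joint_pmf` and the two trial values of theta (0.5 and 0.9) are assumptions chosen for illustration, since theta is unknown in the example.

```python
before = [45, 73, 46, 124, 33, 57, 83, 34, 26, 17]
after = [36, 60, 44, 119, 35, 51, 77, 29, 24, 11]

# X_i = 1 if the first value (before) is greater than the second (after).
x = [1 if b > a else 0 for b, a in zip(before, after)]

def joint_pmf(xs, theta):
    """p(x_1, ..., x_n) = theta^(sum x_i) * (1 - theta)^(n - sum x_i)."""
    s = sum(xs)
    return theta ** s * (1.0 - theta) ** (len(xs) - s)

# Joint frequency of the observed sequence for two hypothetical values of theta.
p_half = joint_pmf(x, 0.5)
p_high = joint_pmf(x, 0.9)

print(x, p_half, p_high)
```

The joint frequency depends on the data only through the sum of the indicators, which is why the number of successes is the natural summary.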


Let $X_1, \ldots, X_n$ be independent Bernoulli random variables. Often we are only interested in the number of successes

$$Y = X_1 + \cdots + X_n.$$

Example: Paired-Sample Sign Test (contd)

Define for the ith plant

$$X_i = \begin{cases} 1 & \text{if the first value is greater than the second} \\ 0 & \text{otherwise} \end{cases}
\qquad \text{and} \qquad
Y = \sum_{i=1}^{n} X_i.$$

Y is the number of plants for which the number of lost hours has decreased after the installation of the safety program.

We know:

- $X_i$ is Bernoulli distributed with parameter $\theta$
- the $X_i$'s are independent

What is the distribution of Y?

Probability of a realization $x_1, \ldots, x_n$ with y successes:

$$p(x_1, \ldots, x_n) = \theta^{y} (1 - \theta)^{n - y}$$

Number of different realizations with y successes: $\binom{n}{y}$
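The counting claim can be checked by brute force for a small n. The sketch below enumerates all 0/1 sequences of length n = 10 (the number of plants in the example) and compares the counts with the binomial coefficients; the variable names are illustrative.

```python
from itertools import product
from math import comb

n = 10  # number of trials, as in the ten plants of the example

# Count, for each y, the 0/1 sequences of length n with exactly y ones.
counts = [0] * (n + 1)
for seq in product((0, 1), repeat=n):
    counts[sum(seq)] += 1

# Every count should agree with the binomial coefficient "n choose y".
matches = all(counts[y] == comb(n, y) for y in range(n + 1))
print(counts, matches)
```

Multiplying the number of realizations by the probability of each one gives the frequency function of Y.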

Binomial Distribution

Binomial distribution (with parameters n and $\theta$)

Let $X_1, \ldots, X_n$ be independent and Bernoulli distributed with parameter $\theta$, and let

$$Y = \sum_{i=1}^{n} X_i.$$

Y has frequency function

$$p(y) = \binom{n}{y} \theta^{y} (1 - \theta)^{n - y} \qquad \text{for } y \in \{0, \ldots, n\}.$$

Y is binomially distributed with parameters n and $\theta$. We write

$$Y \sim \mathrm{Bin}(n, \theta).$$

Note that the number of trials is fixed, the probability of success is the same for each trial, and the trials are independent.
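A short sketch of the binomial frequency function follows. The function name `binomial_pmf` is an assumption; the values of theta are the ones used in the slides' plot for n = 10.

```python
from math import comb

def binomial_pmf(y, n, theta):
    """p(y) = C(n, y) * theta^y * (1 - theta)^(n - y) for y in {0, ..., n}, else 0."""
    if not (isinstance(y, int) and 0 <= y <= n):
        return 0.0
    return comb(n, y) * theta ** y * (1.0 - theta) ** (n - y)

n = 10
for theta in (0.1, 0.3, 0.5, 0.8):
    pmf = [binomial_pmf(y, n, theta) for y in range(n + 1)]
    mode = max(range(n + 1), key=lambda y: pmf[y])
    # The probabilities should sum to 1 up to rounding error.
    print(f"theta={theta}: total={sum(pmf):.6f}, most likely y={mode}")
```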

Example: Paired-Sample Sign Test (contd)

Let Y be the number of plants for which the number of lost hours has decreased after the installation of the safety program. Then

$$Y \sim \mathrm{Bin}(n, \theta)$$

with $n = 10$ plants.
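With the binomial model in hand, probabilities for the observed count can be evaluated once a value of theta is fixed. The sketch below uses theta = 0.5, which is an assumption made here for illustration (it would correspond to a decrease and an increase being equally likely); the slides do not carry out this calculation.

```python
from math import comb

x = [1, 1, 1, 1, 0, 1, 1, 1, 1, 1]  # indicators from the example
n, y = len(x), sum(x)                # n = 10 plants, y = 9 decreases

def binomial_pmf(k, n, theta):
    """Frequency function of Bin(n, theta) at k."""
    return comb(n, k) * theta ** k * (1.0 - theta) ** (n - k)

theta0 = 0.5  # assumed value for illustration, not given in the slides
p_exact = binomial_pmf(y, n, theta0)                               # P(Y = 9)
p_tail = sum(binomial_pmf(k, n, theta0) for k in range(y, n + 1))  # P(Y >= 9)

print(y, p_exact, p_tail)
```

Under this assumed theta, nine or more decreases out of ten would be a rare outcome (11 of the 1024 equally likely sequences).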

Binomial Distribution

[Figure: Binomial distribution for n = 10. Four panels show the frequency function p(x) against x = 0, ..., 10 for θ = 0.1, θ = 0.3, θ = 0.5, and θ = 0.8.]


Geometric Distribution

Consider a sequence of independent Bernoulli trials. On each trial, a success occurs with probability $\theta$. Let X be the number of trials up to the first success.

What is the distribution of X?

- Probability of no success in $x - 1$ trials: $(1 - \theta)^{x - 1}$
- Probability of one success in the xth trial: $\theta$

The frequency function of X is

$$p(x) = \theta (1 - \theta)^{x - 1}, \qquad x = 1, 2, 3, \ldots$$

X is geometrically distributed with parameter $\theta$.

Example: Suppose a batter has probability $\frac{1}{3}$ of hitting the ball. What is the chance that he misses the ball fewer than 3 times?

The number X of balls up to the first success is geometrically distributed with parameter $\frac{1}{3}$. Fewer than 3 misses means that the first hit comes on the third ball at the latest. Thus

$$P(X \le 3) = \frac{1}{3} + \frac{2}{3} \cdot \frac{1}{3} + \left(\frac{2}{3}\right)^{2} \frac{1}{3} = \frac{19}{27} \approx 0.7037.$$
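The batter example can be reproduced with exact fractions. This is a minimal sketch; the function name `geometric_pmf` is an assumption.

```python
from fractions import Fraction

def geometric_pmf(x, theta):
    """p(x) = theta * (1 - theta)^(x - 1) for x = 1, 2, 3, ..., else 0."""
    if x < 1:
        return 0
    return theta * (1 - theta) ** (x - 1)

theta = Fraction(1, 3)  # probability of hitting the ball

# Fewer than 3 misses means the first hit comes on ball 1, 2 or 3.
p_at_most_3 = sum(geometric_pmf(x, theta) for x in (1, 2, 3))

print(p_at_most_3, round(float(p_at_most_3), 4))
```

Working with `Fraction` keeps the result exact, so the sum comes out as 19/27 before it is rounded to 0.7037.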

Hypergeometric Distribution

Example: Quality Control

Quality control: sample and examine a fraction of the produced units.

- N produced units
- M defective units
- n sampled units

What is the probability that the sample contains x defective units?

Let X be the number of defective units in the sample. The frequency function of X is

$$p(x) = \frac{\binom{M}{x} \binom{N - M}{n - x}}{\binom{N}{n}}, \qquad x = 0, 1, \ldots, n.$$

X is a hypergeometric random variable with parameters N, M, and n.

Example: Suppose that of 100 applicants for a job 50 were women and 50 were men, all equally qualified. If we select 10 applicants at random, what is the probability that x of them are female?

The number of chosen female applicants is hypergeometrically distributed with parameters 100, 50, and 10. The frequency function is

$$p(x) = \frac{\binom{50}{x} \binom{50}{10 - x}}{\binom{100}{10}} \qquad \text{for } x = 0, 1, \ldots, 10.$$
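The applicant example can be tabulated numerically. The sketch below assumes a helper named `hypergeom_pmf` (not from the slides) and uses the parameters of the example.

```python
from math import comb

def hypergeom_pmf(x, N, M, n):
    """p(x) = C(M, x) * C(N - M, n - x) / C(N, n) for x = 0, ..., n, else 0."""
    if not 0 <= x <= n:
        return 0.0
    # math.comb returns 0 when the lower index exceeds the upper one,
    # so impossible counts get probability 0 automatically.
    return comb(M, x) * comb(N - M, n - x) / comb(N, n)

N, M, n = 100, 50, 10  # applicants, women among them, applicants selected
pmf = [hypergeom_pmf(x, N, M, n) for x in range(n + 1)]

for x, p in enumerate(pmf):
    print(f"{x:2d}  {p:.4f}")
```

Because there are as many women as men, the frequency function is symmetric around x = 5.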


Poisson Distribution

Often we are interested in the number of events which occur in a specific period of time or in a specific area or volume:

- Number of alpha particles emitted from a radioactive source during a given period of time
- Number of telephone calls coming into an exchange during one unit of time
- Number of diseased trees per acre of a certain woodland
- Number of death claims received per day by an insurance company

Characteristics

Let X be the number of times a certain event occurs during a given unit of time (or in a given area, etc.).

- The probability that the event occurs in a given unit of time is the same for all the units.
- The number of events that occur in one unit of time is independent of the number of events in other units.
- The mean (or expected) rate is $\lambda$.

Then X is a Poisson random variable with parameter $\lambda$ and frequency function

$$p(x) = \frac{\lambda^{x}}{x!} e^{-\lambda}, \qquad x = 0, 1, 2, \ldots$$
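A minimal sketch of the Poisson frequency function, with a numerical check that the probabilities sum to 1 and that the mean equals the rate. The function name `poisson_pmf`, the rate 2.5, and the truncation point 60 are assumptions chosen for illustration.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """p(x) = lam^x / x! * exp(-lam) for x = 0, 1, 2, ..., else 0."""
    if x < 0:
        return 0.0
    return lam ** x / factorial(x) * exp(-lam)

lam = 2.5       # hypothetical mean rate per unit of time
xs = range(60)  # truncation point; the tail beyond it is negligible for this rate

total = sum(poisson_pmf(x, lam) for x in xs)
mean = sum(x * poisson_pmf(x, lam) for x in xs)

print(total, mean)
```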

Poisson Approximation

The Poisson distribution is often used as an approximation for binomial probabilities when n is large and $\theta$ is small:

$$p(x) = \binom{n}{x} \theta^{x} (1 - \theta)^{n - x} \approx \frac{\lambda^{x}}{x!} e^{-\lambda}$$

with $\lambda = n\theta$.
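How good the approximation is can be checked numerically. The sketch below compares Bin(40, theta) with the Poisson distribution with rate 40 * theta for the four values of theta that the slides plot; the helper names and the choice of the largest pointwise gap as error measure are assumptions.

```python
from math import comb, exp, factorial

def binomial_pmf(x, n, theta):
    """Frequency function of Bin(n, theta) at x."""
    return comb(n, x) * theta ** x * (1.0 - theta) ** (n - x)

def poisson_pmf(x, lam):
    """Frequency function of the Poisson distribution with rate lam at x."""
    return lam ** x / factorial(x) * exp(-lam)

def max_abs_error(n, theta):
    """Largest gap between Bin(n, theta) and Poisson(n * theta) over x = 0, ..., n."""
    lam = n * theta
    return max(abs(binomial_pmf(x, n, theta) - poisson_pmf(x, lam))
               for x in range(n + 1))

n = 40
errors = {theta: max_abs_error(n, theta) for theta in (1/400, 1/40, 1/8, 1/4)}

for theta, err in errors.items():
    print(f"theta={theta:.4f}, lambda={n * theta:.1f}, max error={err:.5f}")
```

For fixed n the gap grows as theta grows, which matches the condition that theta should be small.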

Example: Fatalities in Prussian cavalry

Classical example from von Bortkiewicz (1898).

- Number of fatalities resulting from being kicked by a horse
- 200 observations (10 corps over a period of 20 years)

Statistical model:

- Each soldier is kicked to death by a horse with probability $\theta$.
- Let Y be the number of such fatalities in one corps. Then $Y \sim \mathrm{Bin}(n, \theta)$, where n is the number of soldiers in one corps.

Observation: The data are well approximated by a Poisson distribution with $\lambda = 0.61$.

Deaths per Year   Observed   Rel. Frequency   Poisson Prob.
0                 109        0.545            0.543
1                  65        0.325            0.331
2                  22        0.110            0.101
3                   3        0.015            0.021
4                   1        0.005            0.003
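The table can be reproduced from the observed counts. The slides state λ = 0.61 without saying how it was obtained; the sketch below assumes it is the sample mean of the 200 observations, which gives the same value.

```python
from math import exp, factorial

deaths = [0, 1, 2, 3, 4]
observed = [109, 65, 22, 3, 1]

total = sum(observed)  # 200 observations
# Assumption: lambda is estimated by the average number of deaths per observation.
lam = sum(d * o for d, o in zip(deaths, observed)) / total

rows = []
for d, o in zip(deaths, observed):
    rel_freq = o / total
    poisson = lam ** d / factorial(d) * exp(-lam)
    rows.append((d, o, rel_freq, poisson))
    print(f"{d}  {o:3d}  {rel_freq:.3f}  {poisson:.3f}")
```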

Poisson Approximation

[Figure: Poisson approximation of Bin(40, θ). Four rows of paired panels show p(x) against x = 0, ..., 20 for θ = 1/400 and λ = 1/10, θ = 1/40 and λ = 1, θ = 1/8 and λ = 5, and θ = 1/4 and λ = 10.]

Continuous Distributions

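Uniform distribution $U(0, \theta)$

- Range: $(0, \theta)$
- $f(x) = \frac{1}{\theta} \mathbf{1}_{(0, \theta)}(x)$
- $E(X) = \frac{\theta}{2}$
- $\mathrm{var}(X) = \frac{\theta^{2}}{12}$

Exponential distribution $\mathrm{Exp}(\lambda)$

- Range: $[0, \infty)$
- $f(x) = \lambda \exp(-\lambda x) \mathbf{1}_{[0, \infty)}(x)$
- $E(X) = \frac{1}{\lambda}$
- $\mathrm{var}(X) = \frac{1}{\lambda^{2}}$

Normal distribution $N(\mu, \sigma^{2})$

- Range: $\mathbb{R}$
- $f(x) = \frac{1}{\sqrt{2\pi\sigma^{2}}} \exp\left(-\frac{1}{2\sigma^{2}} (x - \mu)^{2}\right)$
- $E(X) = \mu$
- $\mathrm{var}(X) = \sigma^{2}$

[Figure: Histograms (frequency against X) of samples from U(0, θ), Exp(λ), and N(µ, σ²), followed by a panel comparing the three samples side by side.]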
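The stated means and variances can be checked by simulation. The sketch below draws samples with Python's standard `random` module and compares the sample mean and variance with the formulas; the parameter values, the sample size, and the seed are arbitrary assumptions, not values from the slides.

```python
import random

random.seed(0)   # fixed seed so the run is reproducible
size = 100_000   # sample size (an arbitrary choice)
theta, lam, mu, sigma = 2.0, 1.5, 1.0, 0.5  # hypothetical parameter values

samples = {
    "U(0, theta)": [random.uniform(0.0, theta) for _ in range(size)],
    "Exp(lam)": [random.expovariate(lam) for _ in range(size)],
    "N(mu, sigma^2)": [random.gauss(mu, sigma) for _ in range(size)],
}

# Theoretical (mean, variance) pairs for each distribution.
theory = {
    "U(0, theta)": (theta / 2, theta ** 2 / 12),
    "Exp(lam)": (1 / lam, 1 / lam ** 2),
    "N(mu, sigma^2)": (mu, sigma ** 2),
}

def mean_and_variance(xs):
    """Sample mean and (population-style) sample variance of xs."""
    m = sum(xs) / len(xs)
    v = sum((x - m) ** 2 for x in xs) / len(xs)
    return m, v

estimates = {name: mean_and_variance(xs) for name, xs in samples.items()}

for name, (m, v) in estimates.items():
    tm, tv = theory[name]
    print(f"{name}: mean {m:.3f} (theory {tm:.3f}), variance {v:.3f} (theory {tv:.3f})")
```

With a sample of this size the estimates should agree with the formulas to about two decimal places; smaller samples will scatter more.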
