Chapter 4 Commonly Used Distributions Bernoulli Trial

Chapter 4 Commonly Used Distributions Bernoulli Trial

Chapter 4 Commonly Used Distributions 4.1 Binomial distribution x3.4 Bernoulli trial: the simplest kind of random trial (like coin tossing) ♠ Each trial has two outcomes: S or F. ♠ For each trial, P (S) = p . ♠ The trials are independent. 76 ORF 245: Common Distributions { J.Fan 77 th Associated r.v.: let Yi be the outcome of i trial. Its takes only two values, f0; 1g, with pmf P (Yi = 1) = p and P (Yi = 0) = 1 − p, called Bernoulli dist. Then, E(Yi) = p and var(Yi) = pq. Generic name: |Quality control: Defective/Good; ♠Clinical trial: Survival/death; disease/non- disease; cure/not cure; effective/ineffective; |sampling survey: male/female; old/young; Coke/Pepsi; ♠online ads: click/not click .... Example 4.1 Which of following are Bernoulli trials? ♠ Roll a die 100 times and observe whether getting even or odd spots. ♠ Observe daily temperature and see whether it is above or below ORF 245: Common Distributions { J.Fan 78 freezing temperature. ♠ Sample 1000 people and observe whether they belong to upper, middle or low SES class. Definition: Let X be the number of successes in a Bernoulli process with n trials and probability of success p. Then, the dist of X called a Binomial distribution, denoted by X ∼ Bin(n; p). Fact: Let X ∼ Bin(n; p), then n x n−x • (pmf) b(x; n; p) = P (X = x) = x p q . • (summary) EX = np and var(X) = npq. • (sum of indep. r.v.) X = Y1 + ··· + Yn, FYi = a Bernoulli r.v. ORF 245: Common Distributions { J.Fan 79 Binomial with n= 8 p= 0.5 Binomial with n= 25 p= 0.5 Binomial with n= 50 p= 0.5 0.15 0.25 0.20 0.08 0.10 0.15 y y y 0.10 0.04 0.05 0.05 0.00 0.00 0.00 0 2 4 6 8 0 5 10 15 20 25 10 15 20 25 30 35 40 x x x Binomial with n= 8 p= 0.3 Binomial with n= 25 p= 0.3 Binomial with n= 60 p= 0.3 0.30 0.15 0.08 0.20 0.10 y y y 0.04 0.10 0.05 0.00 0.00 0.00 0 2 4 6 8 0 5 10 15 20 25 5 10 15 20 25 30 x x x Binomial with n= 8 p= 0.1 Binomial with n= 25 p= 0.1 Binomial with n= 100 p= 0.1 0.4 0.25 0.12 0.20 0.3 0.08 0.15 y y y 0.2 0.10 0.04 0.1 0.05 0.0 0.00 0.00 0 2 4 6 8 0 5 10 15 20 25 0 5 10 15 20 x x x Figure 4.1: Bionomial distributions with different parameters. ORF 245: Common Distributions { J.Fan 80 Example 4.2 Genetics According to Mendelian's theory, a cross fertilization of related species of red and white flowered plants produces offsprings of which 25% are red flowered plants. Crossing 5 pairs, what is the chance of | no red flowered plant? Let X = # of red flowered plants. Then, X ∼ Bin(5; 0:25). P (X = 0) = q5 = 0:755 = 0:237: | no more than 3 red flowered plants? P (X ≤ 3) = 1 − P (X ≥ 4) = = 0:984: ORF 245: Common Distributions { J.Fan 81 Use dbinom, pbinom, qbinom, rbinom for density (pmf), probability (cdf), quantile, and random numbers. > dbinom(0,5,0.25) #pmf P(X = 0) or density for discrete r.v. [1] 0.2373047 > pbinom(3,5,0.25) #cdf P(X <= 3) [1] 0.984375 > dbinom(3,5,0.25) #pmf P(X = 3) [1] 0.08789063 4.2 Hypergeometric and Negative Binomial Distributions x3.5 In a sampling survey, one typically draws w/o replacement. n draws w/o replacement Hypergeometric dist: M defect N-M Good X defect n-X Good (Use hyper in R: e.g. Total N Total n Figure 4.2: Illustration of sampling without replacement. 1 ORF 245: Common Distributions { J.Fan 82 dhyper, phyper, qhyper, rhyper) MN−M P (X = x) = x n−x ; N n for maxf0; n − N + Mg ≤ x ≤ min(n; M). Remarks ♠ When sampling with replacement, the distribution is binomial. ♠ When sampling a small fraction from a large population, like a poll, the trials are nearly Bernoulli. E.g. N = 40000 and n = 1000, # of families with income ≥$50K is 40%. 15999 15001 P (S jS ) = ≈ P (S ):P (S jS ··· S ) = ≈ P (S ): 2 1 39999 2 1000 1 999 39001 1000 N − n Fact: EX = np; var(X) = npq, where p = M . N − 1 N |corr{z factor} ORF 245: Common Distributions { J.Fan 83 Example 4.3 Capture-recapture to estimate population size Suppose that there are N animals in a region. Capture 5 of them, tag them and then release them back. Now among newly 10 captured animals, X of them are tagged. Probabilistic question: If N = 25, how likely is it that X = 2? The distribution of X is \hyper" with N = 25, M = 5, n = 10. 520 P (X = 2) = 2 8 = :385: 25 10 > dhyper(2,5, 20, 10) #pmf P(X = 2), N-M = 20 [1] 0.3853755 > phyper(2,5, 20, 10) #cdf P(X <= 2) [1] 0.6988142 Statistical question: What is the population size if we observe 3 ORF 245: Common Distributions { J.Fan 84 tagged animal? Thus, the realization of X, denoted by x, is 3. Since 5 EX = nN and an observed x is a reasonable estimate of EX, then 5 5n x = n or Nb = = 16:667 or 17: N x Definition: In a Bernoulli trial w/ P (S) = p, let Y be the number of draws required to get r successes and X = Y − r (so that it begins from 0). The distribution of X is called a negative Binomial (Use nbinom in R, namely dnbinom, pnbinom, qnbinom, rnbinom): x + r − 1 P (X = x) = prqx; x = 0; 1; 2; ··· r − 1 Y z }| { Y1 Y2 Y3 Yr FFSz }| { FFFFFFFSz }| { FFFSz }| { ··· FFFFFSz }| { When r = 1, Y follows a geometric distribution: P (Y = y) = pqy−1; y = 1; 2; ··· : 2 Recall that EYi = 1=p; var(Yi) = q=p . Then, F Y = Y1 + ··· + Yr and X = Y − r. ORF 245: Common Distributions { J.Fan 85 2 F EY = r=p, var(Y ) = rq=p . 2 F EX = EY − r = rq=p, var(X) = rq=p . 4.3 Poisson Distribution x3.6 Useful for modeling number of rare events (# of traffic accidents, # of houses damaged by fire; # of customer arrivals to a service center). When is the Poisson distribution (Use pois in R) reasonable? • independence: # of events occurring in any time interval is independent of # of events in any other non-overlaping interval; • rare: almost impossible for two or more events simultaneously; • constant rate: The average # of occurrences per time unit is ORF 245: Common Distributions { J.Fan 86 constant, denoted by λ. Poission dist with lambda = 0.5 Poission dist with lambda = 1 0.6 0.3 0.4 0.2 dpois(x, 1) 0.2 dpois(x, 0.5) 0.1 0.0 0.0 0 5 10 15 20 25 30 0 5 10 15 20 25 30 (a) (b) Poission dist with lambda = 7 Poission dist with lambda = 15 0.15 0.08 0.10 0.04 dpois(x, 7) 0.05 dpois(x, 15) 0.00 0.00 0 5 10 15 20 25 30 0 5 10 15 20 25 30 (c) (d) Figure 4.3: The Poisson Distribution (line diagram) for different values of λ Let X = # of occurrences in a unit time interval. It can be shown that the distribution Poisson(λ) is given by: e−λλx P (X = x) = ; x = 0; 1; 2; ··· : x! ORF 245: Common Distributions { J.Fan 87 x= 0:30 plot(x, dpois(x,0.5), xlab="(a)", type="n") #lambda = 0.5 for(i in 0:30) lines(c(i,i), c(0,dpois(i, 0.5)),lwd=1.5, col="blue") plot(x, dpois(x,15), xlab="(d)", type="h", col="blue", lwd=1.5) title("Poission dist with lambda = 0.5") Poisson approximation to the Binomial: When n is large and p is small, and n and 1=p are of the same order of magnitude (e.g., n = 100; p = 0:01; n = 1000; p = 0:003), then: n e−λλx pxqn−x ≈ ; with λ = np: x x! Example 4.4 Suppose that a manufacturing process produces 1% of defective products. Com- pute P( # of defectives ≥ 2 in 200 products). Note that Bin(200, 0.01) ≈ Poisson(2). Thus, P (X ≥ 2) = 1 − P (X = 0) − P (X = 1) e−220 e−22 = 1 − − = 0:594 0! 1! Direct calculation shows that the probability is 0.595. ORF 245: Common Distributions { J.Fan 88 Expected value and variance: EX = λ and var(X) = λ. (Think of the Poisson approximation to the Binomial). 4.4 The normal distribution x4.3 Definition: A cont rv X has a normal dist ("dnorm, pnorm, qnorm, rnorm " for density, probability (cdf), quantile, and random number generator) if its pdf is given by ! 1 (x − µ)2 f(x; µ, σ) = p exp − ; −∞ < x < 1: 2πσ 2σ2 denoted by X ∼ N(µ, σ2). Facts: It can been shown that EX = µ and var(X) = σ2. Standard normal distribution refers to the specific case in ORF 245: Common Distributions { J.Fan 89 Figure 4.4: Normal - bell-shaped - distributions with different means (center) and variances (spreadness).

View Full Text

Details

  • File Type
    pdf
  • Upload Time
    -
  • Content Languages
    English
  • Upload User
    Anonymous/Not logged-in
  • File Pages
    29 Page
  • File Size
    -

Download

Channel Download Status
Express Download Enable

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

  • Not to be reproduced or distributed without explicit permission.
  • Not used for commercial purposes outside of approved use cases.
  • Not used to infringe on the rights of the original creators.
  • If you believe any content infringes your copyright, please contact us immediately.

Support

For help with questions, suggestions, or problems, please contact us