Lecture Notes 5
Chapter 3  Continuous Random Variables

3.1 Introduction

Rather than summing probabilities related to discrete random variables, here, for continuous random variables, the density curve is integrated to determine probability.

Exercise 3.1 (Introduction) Patient's number of visits, X, and duration of visit, Y.

[Figure 3.1: Comparing discrete and continuous distributions. Top left: pmf f(x) of X with P(X = 0) = 0.25, P(X = 1) = 0.50, P(X = 2) = 0.25; the probability P(X < 1.5) = P(X = 0) + P(X = 1) = 0.25 + 0.50 = 0.75 is a sum of probabilities at specific values. Top right: pdf f(y) = y/6, 2 < y ≤ 4; P(Y < 3) = 5/12 is the area under the curve, and the probability at 3 is P(Y = 3) = 0. Bottom: the corresponding cdfs, where F(1.5) = P(X ≤ 1.5) = 0.75 is a value of the stepwise cdf and F(3) = P(Y ≤ 3) = 5/12 is a value of the smooth cdf.]

1. Number of visits, X, is a (i) discrete (ii) continuous random variable, and duration of visit, Y, is a (i) discrete (ii) continuous random variable.

2. Discrete

(a) P(X = 2) = (i) 0 (ii) 0.25 (iii) 0.50 (iv) 0.75

(b) P(X ≤ 1.5) = P(X ≤ 1) = F(1) = 0.25 + 0.50 = 0.75 requires (i) summation (ii) integration and is a value of a (i) probability mass function (ii) cumulative distribution function, which is a (i) stepwise (ii) smooth increasing function.

(c) E(X) = (i) Σ x f(x) (ii) ∫ x f(x) dx

(d) Var(X) = (i) E(X²) − µ² (ii) E(Y²) − µ²

(e) M(t) = (i) E(e^{tX}) (ii) E(e^{tY})

(f) Examples of discrete densities (distributions) include (choose one or more): (i) uniform (ii) geometric (iii) hypergeometric (iv) binomial (Bernoulli) (v) Poisson

3.
Continuous

(a) P(Y = 3) = (i) 0 (ii) 0.25 (iii) 0.50 (iv) 0.75

(b) P(Y ≤ 3) = F(3) = ∫_{x=2}^{x=3} (x/6) dx = [x²/12]_{x=2}^{x=3} = 3²/12 − 2²/12 = 5/12 requires (i) summation (ii) integration and is a value of a (i) probability density function (ii) cumulative distribution function, which is a (i) stepwise (ii) smooth increasing function.

(c) E(Y) = (i) Σ y f(y) (ii) ∫ y f(y) dy

(d) Var(Y) = (i) E(X²) − µ² (ii) E(Y²) − µ²

(e) M(t) = (i) E(e^{tX}) (ii) E(e^{tY})

(f) Examples of continuous densities (distributions) include (choose one or more): (i) uniform (ii) exponential (iii) normal (Gaussian) (iv) Gamma (v) chi-square (vi) Student's t (vii) F

3.2 Definitions

Random variable X is continuous if its probability density function (pdf) f is continuous at all but a finite number of points and possesses the following properties:

• f(x) ≥ 0, for all x,
• ∫_{−∞}^{∞} f(x) dx = 1,
• P(a < X ≤ b) = ∫_a^b f(x) dx.

The (cumulative) distribution function (cdf) for random variable X is

F(x) = P(X ≤ x) = ∫_{−∞}^{x} f(t) dt,

and has properties

• lim_{x→−∞} F(x) = 0,
• lim_{x→∞} F(x) = 1,
• if x₁ < x₂, then F(x₁) ≤ F(x₂); that is, F is nondecreasing,
• P(a ≤ X ≤ b) = P(X ≤ b) − P(X ≤ a) = F(b) − F(a) = ∫_a^b f(x) dx,
• F′(x) = (d/dx) ∫_{−∞}^{x} f(t) dt = f(x).

[Figure 3.2: Continuous distribution. Left: density f(x) ≥ 0 with total area 1; the area to the left of a is the cdf F(a) = P(X ≤ a). Right: the area between a and b is P(a < X ≤ b) = F(b) − F(a).]

The expected value or mean of random variable X is given by

µ = E(X) = ∫_{−∞}^{∞} x f(x) dx;

the variance is

σ² = Var(X) = E[(X − µ)²] = E(X²) − [E(X)]² = E(X²) − µ²,

with associated standard deviation σ = √(σ²). The moment-generating function is

M(t) = E(e^{tX}) = ∫_{−∞}^{∞} e^{tx} f(x) dx

for values of t for which this integral exists.
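The pdf and cdf properties above can be illustrated numerically. The following is a minimal sketch (not part of the original notes), written in Python with a hand-rolled midpoint-rule integrator standing in for symbolic integration; it checks the Exercise 3.1 density f(x) = x/6 on (2, 4]:

```python
# Sketch: numerically verify two pdf/cdf properties from Section 3.2
# for the density f(x) = x/6 on (2, 4] (Exercise 3.1).
def f(x):
    # density is zero outside (2, 4]
    return x / 6.0 if 2 < x <= 4 else 0.0

def integrate(g, a, b, n=200_000):
    # simple midpoint-rule approximation of the integral of g over [a, b]
    h = (b - a) / n
    return sum(g(a + (i + 0.5) * h) for i in range(n)) * h

total = integrate(f, 2, 4)   # second property: total area should equal 1
F3 = integrate(f, 2, 3)      # F(3) = P(X <= 3), should equal 5/12

print(round(total, 4))       # 1.0
print(round(F3, 4))          # 0.4167
```

The same integrator could check P(a < X ≤ b) = F(b) − F(a) for any a, b in (2, 4].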
The expected value, assuming it exists, of a function u of X is

E[u(X)] = ∫_{−∞}^{∞} u(x) f(x) dx.

The (100p)th percentile is the value of X, denoted π_p, where

p = ∫_{−∞}^{π_p} f(x) dx = F(π_p),

and π_p is also called the quantile of order p. The 25th, 50th, and 75th percentiles are also called the first, second, and third quartiles, denoted p₁ = π₀.₂₅, p₂ = π₀.₅₀, p₃ = π₀.₇₅; the 50th percentile is also called the median, denoted m = p₂. The mode is the value x where f is maximum.

Exercise 3.2 (Definitions)

1. Waiting time. Let the time waiting in line, in minutes, be described by the random variable X, which has the following pdf:

f(x) = (1/6)x, 2 < x ≤ 4; 0, otherwise.

[Figure 3.3: f(x) and F(x). Left: pdf f(x), where the probability P(X < 3) = 5/12 is the area under the curve and the probability at 3 is P(X = 3) = 0. Right: cdf F(x), where F(3) = P(X ≤ 3) = 5/12 is a value of the function.]

(a) Verify that f(x) satisfies the second property of pdfs:

∫_{−∞}^{∞} f(x) dx = ∫_2^4 (1/6)x dx = [x²/12]_{x=2}^{x=4} = 4²/12 − 2²/12 = 12/12 =

(i) 0 (ii) 0.15 (iii) 0.5 (iv) 1

(b) P(2 < X ≤ 3) =

∫_2^3 (1/6)x dx = [x²/12]_{x=2}^{x=3} = 3²/12 − 2²/12 =

(i) 0 (ii) 5/12 (iii) 9/12 (iv) 1

(c) P(X = 3) = P(3 < X ≤ 3) = 3²/12 − 3²/12 = 0 ≠ f(3) = (1/6) · 3 = 0.5
(i) True (ii) False
So the pdf f(x) = (1/6)x evaluated at some value of x does not give a probability.

(d) P(2 < X ≤ 3) = P(2 < X < 3) = P(2 ≤ X ≤ 3) = P(2 ≤ X < 3)
(i) True (ii) False
because P(X = 3) = 0 and P(X = 2) = 0.

(e) P(0 < X ≤ 3) =

∫_2^3 (1/6)x dx = [x²/12]_{x=2}^{x=3} = 3²/12 − 2²/12 =

(i) 0 (ii) 5/12 (iii) 9/12 (iv) 1
Why integrate from 2 to 3 and not 0 to 3?

(f) P(X ≤ 3) =

∫_2^3 (1/6)x dx = [x²/12]_{x=2}^{x=3} = 3²/12 − 2²/12 =

(i) 0 (ii) 5/12 (iii) 9/12 (iv) 1

(g) Determine the cdf (not pdf) value F(3).

F(3) = P(X ≤ 3) = ∫_2^3 (1/6)x dx = [x²/12]_{x=2}^{x=3} = 3²/12 − 2²/12 =

(i) 0 (ii) 5/12 (iii) 9/12 (iv) 1

(h) Determine F(3) − F(2).
F(3) − F(2) = P(X ≤ 3) − P(X ≤ 2) = P(2 ≤ X ≤ 3) = ∫_2^3 (1/6)x dx = [x²/12]_{x=2}^{x=3} = 3²/12 − 2²/12 =

(i) 0 (ii) 5/12 (iii) 9/12 (iv) 1

because everything left of (below) 3, minus everything left of 2, equals what is between 2 and 3.

(i) The general distribution function (cdf) is

F(x) = ∫_{−∞}^{x} 0 dt = 0, for x ≤ 2;
F(x) = ∫_{t=2}^{t=x} (t/6) dt = [t²/12]_{t=2}^{t=x} = x²/12 − 4/12, for 2 < x ≤ 4;
F(x) = 1, for x > 4.

In other words, F(x) = x²/12 − 4/12 = (x² − 4)/12 on (2, 4]. Both the pdf (density) and the cdf (distribution) are given in the figure above.
(i) True (ii) False

(j) F(2) = 2²/12 − 4/12 = (i) 0 (ii) 5/12 (iii) 9/12 (iv) 1.

(k) F(3) = 3²/12 − 4/12 = (i) 0 (ii) 5/12 (iii) 9/12 (iv) 1.

(l) P(2.5 < X < 3.5) = F(3.5) − F(2.5) = (3.5²/12 − 4/12) − (2.5²/12 − 4/12) = (i) 0 (ii) 0.25 (iii) 0.50 (iv) 1.

(m) P(X > 3.5) = 1 − P(X ≤ 3.5) = 1 − F(3.5) = 1 − (3.5²/12 − 4/12) = (i) 0.3125 (ii) 0.4425 (iii) 0.7650 (iv) 1.

(n) Expected value. The average wait time is

µ = E(X) = ∫_{−∞}^{∞} x f(x) dx = ∫_2^4 x · (1/6)x dx = ∫_2^4 (x²/6) dx = [x³/18]_2^4 = 4³/18 − 2³/18 =

(i) 23/9 (ii) 28/9 (iii) 31/9 (iv) 35/9.

library(MASS) # install the MASS package once; run library(MASS) once per session
mean <- 4^3/18 - 2^3/18; mean; fractions(mean) # turn decimal into fraction
[1] 3.111111
[1] 28/9

(o) Expected value of the function u(x) = x².

E(X²) = ∫_{−∞}^{∞} x² f(x) dx = ∫_2^4 x² · (1/6)x dx = ∫_2^4 (x³/6) dx = [x⁴/24]_2^4 = 4⁴/24 − 2⁴/24 =

(i) 9 (ii) 10 (iii) 11 (iv) 12.

(p) Variance, method 1. The variance in wait time is

σ² = Var(X) = E(X²) − µ² = 10 − (28/9)² =

(i) 23/81 (ii) 26/81 (iii) 31/81 (iv) 35/81.
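As an illustrative cross-check (not part of the original notes, sketched in Python), the mean, E(X²), and variance from parts (n)-(p) can be recomputed directly, along with the quartiles from the percentile definition of Section 3.2. The closed form π_p = √(12p + 4), found by solving p = F(π_p) = (π_p² − 4)/12 for π_p, is an added derivation:

```python
import math

# Cross-check parts (n)-(p) for the waiting-time density f(x) = x/6 on (2, 4].
mu = 4**3 / 18 - 2**3 / 18      # E(X)   = 28/9  (part n)
ex2 = 4**4 / 24 - 2**4 / 24     # E(X^2) = 10    (part o)
var = ex2 - mu**2               # Var(X) = 26/81 (part p)

def percentile(p):
    # pi_p solves p = F(pi_p) = (pi_p^2 - 4)/12, so pi_p = sqrt(12p + 4)
    return math.sqrt(12 * p + 4)

print(round(mu, 4))                # 3.1111
print(round(var, 4))               # ≈ 0.321
print(round(percentile(0.50), 4))  # median = sqrt(10) ≈ 3.1623
```

Note that for this right-skewed density the median √10 ≈ 3.16 slightly exceeds the mean 28/9 ≈ 3.11 computed in part (n)... the quartiles π₀.₂₅ = √7 and π₀.₇₅ = √13 follow from the same formula.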