Chapter 5

Joint Distributions and Random Samples

5.1 Discrete joint distributions §5.1

Purpose: to predict or control one variable from another (e.g., cases of flu last week and this week; income and education).

Joint event: The joint event

$$\{(X, Y) = (x, y)\} = \{\omega: X(\omega) = x,\ Y(\omega) = y\} = \{X = x\} \cap \{Y = y\}.$$


Range: If X has possible values $x_1, \cdots, x_k$ and Y has possible values $y_1, \cdots, y_l$, then the range of (X, Y) is the set of all ordered pairs
$$\{(x_i, y_j):\ i = 1, \cdots, k;\ j = 1, \cdots, l\}.$$

Joint dist: Describes how probability is distributed over the ordered pairs:
$$p(x_i, y_j) = P(X = x_i, Y = y_j), \qquad \sum_i \sum_j p(x_i, y_j) = 1,$$
called the joint mass function.

Example 5.1 Discrete joint distribution

Let X = deductible amount, in $, on the auto policy and Y = deductible amount on the homeowner's policy. Assume that their distribution over a population is

Table 5.1: Joint distribution of deductible amounts

p(x, y)               y
                 0      100     200
  x     100     .20     .10     .20
        200     .05     .15     .30

(a) (Marginal) What is the (marginal) distribution of X?
$$P(X = 100) = \sum_y p(100, y) = .20 + .10 + .20 = .50,$$
$$P(X = 200) = \sum_y p(200, y) = .05 + .15 + .30 = .50.$$

(b) (Marginal)¹ What is the marginal distribution of Y?

$$P(Y = y) = \sum_x P(X = x, Y = y) = \sum_x p(x, y);$$
$$P(Y = 0) = .25, \qquad P(Y = 100) = .25, \qquad P(Y = 200) = .50.$$

¹ Small type indicates material that is less important; it is nonetheless required and assigned as reading.

(c) (Functions of two rv's) Let $Z = g(X, Y)$. Then
$$P(Z = z) = \sum_{(x,y):\ g(x,y) = z} p(x, y).$$
E.g., for $Z = X + Y$ the range is $\{100, 200, 300, 400\}$.

$$P(Z = 100) = P(X = 100, Y = 0) = .20,$$
$$P(Z = 200) = P(X = 100, Y = 100) + P(X = 200, Y = 0) = .10 + .05 = .15,$$
$$P(Z = 300) = P(X = 100, Y = 200) + P(X = 200, Y = 100) = .20 + .15 = .35,$$
$$P(Z = 400) = P(X = 200, Y = 200) = .30.$$

(d) (Expectation): $E\, g(X, Y) = \sum_x \sum_y g(x, y)\, p(x, y)$.

(e) (Additivity): $E(g_1(X) + g_2(Y)) = E g_1(X) + E g_2(Y)$. E.g.,
$$E(X + Y) = .20 \times 100 + .15 \times 200 + .35 \times 300 + .30 \times 400 = \$275,$$

while $EX = \$150$ and $EY = \$125$.
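The computations in (a)–(e) are easy to verify numerically. A minimal R sketch (the matrix p below simply encodes Table 5.1; nothing else is assumed):

> p = matrix(c(.20, .10, .20, .05, .15, .30), nrow = 2, byrow = TRUE)
> x = c(100, 200); y = c(0, 100, 200)        # row and column labels
> rowSums(p)                                 # marginal of X: 0.50 0.50
> colSums(p)                                 # marginal of Y: 0.25 0.25 0.50
> sum(outer(x, y, "+") * p)                  # E(X + Y) = 275
> sum(x * rowSums(p)); sum(y * colSums(p))   # EX = 150; EY = 125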

Example 5.2 Multinomial distribution and machine learning

Figure 5.1: Classification and the multinomial distribution (n draws fall into categories $1, 2, \ldots, r$ with probabilities $p_1, p_2, \ldots, p_r$).

Make n draws from a population with r categories. Let $X_i$ be the number of "successes" in the $i$th category. Then its pmf is given by
$$p(x_1, \cdots, x_r) = \binom{n}{x_1} \binom{n - x_1}{x_2} \cdots \binom{n - x_1 - \cdots - x_{r-1}}{x_r}\, p_1^{x_1} \cdots p_r^{x_r} = \frac{n!}{x_1! \cdots x_r!}\, p_1^{x_1} \cdots p_r^{x_r},$$
where $x_1 + \cdots + x_r = n$.
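In R this pmf is available as dmultinom; a quick check of the formula, with arbitrarily chosen numbers (n = 5, r = 3):

> dmultinom(c(2, 2, 1), size = 5, prob = c(0.2, 0.3, 0.5))
[1] 0.054
> factorial(5) / (factorial(2) * factorial(2) * factorial(1)) * 0.2^2 * 0.3^2 * 0.5^1
[1] 0.054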

Q1: Are $X_i$ and $X_{i+1}$ independent?

Q2: What is the distribution if 100 people are sampled as in Example 5.1 and classified according to their insurance policies?

Figure 5.2: What object is in the image?

Example 5.3 Trinomial distribution and population genetics.

In genetics, when sampling from an equilibrium population with respect to a gene with two alleles A and a, three genotypes can be observed, with proportions given below (Figure 5.3 illustrates the Hardy-Weinberg formula):

Table 5.2: Hardy-Weinberg formula

                 aa                aA                 AA
              p₁ = θ²         p₂ = 2θ(1 − θ)      p₃ = (1 − θ)²
  θ = 0.5       0.25               0.5                0.25

Probability question: If θ = 0.5 and n = 10, what is the probability that $X_1 = 2$, $X_2 = 5$ and $X_3 = 3$?
$$\text{probability} = \frac{10!}{2!\, 5!\, 3!} (.25)^2 (.5)^5 (.25)^3 = 0.0769.$$

> X = rmultinom(1000, 1, prob = c(0.25, 0.5, 0.25))   # sample 1000 from trinomial
> X[, 1:10]                                           # first 10 realizations
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,]    1    0    0    1    0    0    0    0    0     1
[2,]    0    1    0    0    0    1    1    1    0     0
[3,]    0    0    1    0    1    0    0    0    1     0
> apply(X, 1, mean)                                   # calculate relative frequencies
[1] 0.265 0.492 0.243
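The same probability can be read off from R's built-in multinomial pmf:

> dmultinom(c(2, 5, 3), size = 10, prob = c(0.25, 0.5, 0.25))
[1] 0.0769043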

Statistical question: If we observe $x_1 = 2$, $x_2 = 5$ and $x_3 = 3$, are the data consistent with the H-W formula? If so, what is θ?

Addition rule for expectation (holds without any conditions):

$$E(X_1 + \cdots + X_n) = E(X_1) + \cdots + E(X_n).$$

Independence: X and Y are independent if $P(X = x, Y = y) = P(X = x) P(Y = y)$, i.e. $p(x, y) = p_X(x)\, p_Y(y)$, for all x and y.

Expectation of product: If X and Y are independent, then $E(XY) = E(X) E(Y)$, since
$$E(XY) = \sum_x \sum_y x y\, p_X(x)\, p_Y(y) = \sum_x x\, p_X(x) \sum_y y\, p_Y(y) = EX \cdot EY.$$
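A quick Monte Carlo check of the product rule, with X and Y generated independently (the two distributions are arbitrary choices, so only approximate equality is expected):

> x = rbinom(1e5, 10, 0.3); y = rpois(1e5, 2)   # independent; EX = 3, EY = 2
> mean(x * y)                                   # ≈ 6
> mean(x) * mean(y)                             # ≈ 6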

5.2 Joint densities §5.1

Let X and Y be continuous random variables. Then the joint density function f(x, y) is the one such that
$$P((X, Y) \in A) = \iint_A f(x, y)\, dx\, dy.$$
It indicates the likelihood of getting values near (x, y), per unit area:
$$f(x, y) \approx \frac{P(X \in x \pm \Delta x,\ Y \in y \pm \Delta y)}{4\, \Delta x\, \Delta y}.$$

Figure 5.4: The joint density function of two rv's is such that probability equals the volume under its surface (A = shaded rectangle).

Marginal density functions can be obtained by
$$f_X(x) = \int_{-\infty}^{+\infty} f(x, y)\, dy, \qquad f_Y(y) = \int_{-\infty}^{+\infty} f(x, y)\, dx.$$

Example 5.4 Uniform distribution on a set A

The uniform distribution on a set A has the joint density

$$f(x, y) = I_A(x, y)/\text{Area}(A),$$
where $I_A(x, y)$ is the indicator function of the set A:
$$I_A(x, y) = \begin{cases} 1, & \text{if } (x, y) \in A, \\ 0, & \text{otherwise}. \end{cases}$$

If $A = \{(x, y): x^2 + y^2 \le 1\}$ is the unit disk, then
$$f_X(x) = \int_{-\sqrt{1 - x^2}}^{\sqrt{1 - x^2}} \frac{1}{\pi}\, dy = \frac{2}{\pi} \sqrt{1 - x^2}, \qquad \text{for } -1 \le x \le 1.$$
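As a sanity check, this marginal density integrates to 1; numerically in R:

> integrate(function(x) (2/pi) * sqrt(1 - x^2), -1, 1)$value   # ≈ 1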

By symmetry, $f_Y(y) = f_X(y)$. Thus X and Y have identical distributions (but $P(X = Y) = 0$ in this case).

Expectation: $E\, g(X, Y) = \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} g(x, y)\, f(x, y)\, dx\, dy$.

Independence: X and Y are independent if

$$f(x, y) = f_X(x)\, f_Y(y), \qquad \text{for all } x \text{ and } y.$$

Rule of product: If X and Y are independent, then

E(XY) = E(X)E(Y)

since $\int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} x y\, f_X(x) f_Y(y)\, dx\, dy = \int_{-\infty}^{+\infty} x f_X(x)\, dx \int_{-\infty}^{+\infty} y f_Y(y)\, dy$.

Example 5.5 Independence and uncorrelatedness

If (X, Y) is uniformly distributed on the unit disk, then X and Y are not independent. But the rule of product still holds:

EXY = 0 = E(X)E(Y ).

— termed uncorrelated in the next section.
— uncorrelated but not independent (independent ⟹ uncorrelated, but not conversely).

Now suppose (X, Y) is uniformly distributed on the unit square. Then
$$f_X(x) = \int_0^1 dy = 1, \qquad \text{for } 0 \le x \le 1.$$

Similarly, $f_Y(y) = I_{[0,1]}(y)$. It is now easy to see that $f(x, y) = I_{[0,1] \times [0,1]}(x, y) = f_X(x) f_Y(y)$, i.e. X and Y are independent. Hence,

E(XY ) = E(X)E(Y ) = 1/4.
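Simulation makes the contrast between the disk and the square vivid. A sketch, generating uniform points on the disk by rejection from the square (the ≈ values indicate what to expect, not exact output):

> u = runif(1e5, -1, 1); v = runif(1e5, -1, 1)   # uniform on the square
> keep = u^2 + v^2 <= 1                          # rejection: keep points in the disk
> x = u[keep]; y = v[keep]
> cor(x, y)                                      # ≈ 0: uncorrelated on the disk
> cor(x^2, y^2)                                  # clearly negative: hence dependent
> cor(u, v)                                      # ≈ 0 on the square, where they ARE independent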

5.3 Covariance and correlation §5.2

Purpose:
• Correlation measures the strength of the linear association between two variables.
• Covariance helps compute var(X + Y):
$$\text{var}(X + Y) = E(X + Y - \mu_X - \mu_Y)^2 = \text{var}(X) + \text{var}(Y) + 2 \underbrace{E(X - \mu_X)(Y - \mu_Y)}_{\text{covariance}}.$$

Covariance: $\text{cov}(X, Y) = E(X - \mu_X)(Y - \mu_Y)$. It is equal to

cov(X,Y ) = E(XY ) − (EX)(EY ).

If X and Y are independent, then cov(X, Y) = 0.

Correlation is the covariance of the standardized variables:
$$\rho = E\left[\left(\frac{X - \mu_X}{\sigma_X}\right) \times \left(\frac{Y - \mu_Y}{\sigma_Y}\right)\right] = \frac{\text{cov}(X, Y)}{\sigma_X \sigma_Y}.$$
When ρ = 0, X and Y are said to be uncorrelated.
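As a concrete discrete illustration, the deductibles of Example 5.1 are positively correlated; reusing p, x, y from the R sketch there:

> EXY = sum(outer(x, y) * p)                  # E(XY) = 20000
> covXY = EXY - 150 * 125                     # cov(X, Y) = 1250
> sdX = sqrt(sum(x^2 * rowSums(p)) - 150^2)   # sigma_X = 50
> sdY = sqrt(sum(y^2 * colSums(p)) - 125^2)   # sigma_Y ≈ 82.9
> covXY / (sdX * sdY)                         # rho ≈ 0.30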

Properties: ρ measures the strength of linear association.

♠ $-1 \le \rho \le 1$;

♠ The larger the |ρ|, the stronger the linear association.

♠ ρ > 0 means positive association: large x tends to associate with large y and small x tends to associate with small y. ρ < 0 means negative association.

♠ ρ is independent of units: corr(1.8X + 32, 1.5Y ) = corr(X,Y ).

Example 5.6 Bivariate normal distribution

Figure 5.5: Simulated data with sample correlations 0.026, -0.595, -0.990, 0.241, 0.970, 0.853.

" #! 1 1 x − µ 2 x − µ y − µ y − µ 2 f(x, y) = exp − 1 − 2ρ 1 2 + 2 . p 2 2 2σ1σ2π 1 − ρ 2(1 − ρ ) σ1 σ1 σ2 σ2 If X and Y are standardized, then 1  1  f(x, y) = exp − x2 − 2ρxy + y2 . 2πp1 − ρ2 2(1 − ρ2) It has the following properties:

• (Marginal dist) $X \sim N(\mu_1, \sigma_1^2)$ and $Y \sim N(\mu_2, \sigma_2^2)$.

• (Correlation) $\text{corr}(X, Y) = \rho$.

• (Linearity) $aX + bY \sim N(a\mu_1 + b\mu_2,\ a^2\sigma_1^2 + 2ab\rho\sigma_1\sigma_2 + b^2\sigma_2^2)$.

• (Independence) ρ = 0 if and only if X and Y are independent.
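A minimal simulation check of these properties, assuming the standard MASS package for mvrnorm:

> library(MASS)
> rho = 0.7
> Z = mvrnorm(1e5, mu = c(0, 0), Sigma = matrix(c(1, rho, rho, 1), 2, 2))
> mean(Z[, 1]); sd(Z[, 1])      # marginal ≈ N(0, 1): ≈ 0 and ≈ 1
> cor(Z[, 1], Z[, 2])           # ≈ 0.7
> var(2*Z[, 1] + 3*Z[, 2])      # linearity: ≈ 2^2 + 2*2*3*0.7 + 3^2 = 21.4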

Example 5.7 A “Normal” appointment.

A man and a woman make an appointment at 12:00pm at the Palma Square. The man's arrival time is normally distributed with mean 11:55 and standard deviation 10 minutes, and the woman's arrival time is normally distributed with mean 12:05 and an SD of 8 minutes. If they can wait for each other for at most 15 minutes, what is the chance that they actually meet?

Let X = the man's arrival time relative to 12:00pm and Y = the woman's arrival time relative to 12:00pm. Then $X \sim N(-5, 10^2)$ and $Y \sim N(5, 8^2)$. Assuming X and Y are independent, $W = X - Y \sim N(-5 - 5,\ 10^2 + 8^2) = N(-10, 164)$. Then the chance is

$$P(|X - Y| \le 15) = P(-15 \le W \le 15) = \Phi\left(\frac{15 + 10}{12.81}\right) - \Phi\left(\frac{-15 + 10}{12.81}\right) = 97.45\% - 34.81\% = 62.64\%.$$
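The same number comes from one line of R, using W ∼ N(−10, 164):

> pnorm(15, mean = -10, sd = sqrt(164)) - pnorm(-15, mean = -10, sd = sqrt(164))   # ≈ 0.6264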

Addition rule for variance: If $X_1, \cdots, X_n$ are pairwise uncorrelated, then

$$\text{var}(X_1 + \cdots + X_n) = \text{var}(X_1) + \cdots + \text{var}(X_n).$$

Example 5.8 Variance of the binomial distribution

Note $X = Y_1 + \cdots + Y_n$, where $Y_1, \cdots, Y_n$ are i.i.d. Bernoulli(p). Then

$$\text{var}(X) = \text{var}(Y_1) + \cdots + \text{var}(Y_n) = npq, \qquad q = 1 - p.$$
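A one-line simulation check (n = 20 and p = 0.3 are arbitrary choices):

> var(rbinom(1e5, 20, 0.3))   # ≈ npq = 20 * 0.3 * 0.7 = 4.2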

5.4∗ Multivariate random variables §5.1

Discrete: The joint pmf of discrete rv’s (X1,X2, ··· ,Xn) is

p(x1, ··· , xn) = P (X1 = x1, ··· ,Xn = xn).

Continuous: The joint pdf f(x1, ··· , xn) is such that

$$P(a_1 \le X_1 \le b_1, \cdots, a_n \le X_n \le b_n) = \int_{a_1}^{b_1} \cdots \int_{a_n}^{b_n} f(x_1, \cdots, x_n)\, dx_1 \cdots dx_n.$$

Independence: X1, ··· ,Xn are said to be independent if their pmf or pdf is equal to the product of the marginal ones:

$$P(X_1 = x_1, \cdots, X_n = x_n) = P(X_1 = x_1) \cdots P(X_n = x_n)$$
for all $x_1, \cdots, x_n$ in the ranges, or

$$f(x_1, \cdots, x_n) = f_{X_1}(x_1) \cdots f_{X_n}(x_n).$$

Conditional distribution: gives the distribution of Y given the associated characteristics $(x_1, \cdots, x_p)$:

$$p(Y = y \mid X_1 = x_1, \cdots, X_p = x_p) = \frac{P(Y = y, X_1 = x_1, \cdots, X_p = x_p)}{P(X_1 = x_1, \cdots, X_p = x_p)},$$
for all y in the range. For the continuous case,

$$f_{Y|X}(y \mid x_1, \cdots, x_p) = \frac{f(y, x_1, \cdots, x_p)}{f(x_1, \cdots, x_p)}.$$

5.5 Square root law §5.3

A random sample of size n means that $X_1, \cdots, X_n$ are independent and identically distributed (iid) rv's, like the outcomes of drawing tickets from a box with replacement. If the population has mean µ and variance σ², then

$$E(S_n) = n\mu, \qquad \text{var}(S_n) = n\sigma^2, \qquad \text{where } S_n = X_1 + \cdots + X_n.$$

Figure 5.6: Outcomes of random samples $X_1, \ldots, X_n$ from the income box; the population density is simulated as rgamma(10000, 3, 1/15).

Square root law:
$$SD(S_n) = \sqrt{n}\, \sigma, \qquad SD(\bar{X}_n) = \sigma/\sqrt{n}.$$

— Adding n measurement errors inflates the error size by $\sqrt{n}$.
— Averaging n measurement errors decreases the error size by $\sqrt{n}$.
— Sampling error depends on the sample size n, not the population size.

Figure 5.7: The statement in the quote is wrong!
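A quick simulation of the square root law, using the Gamma income box of Figure 5.6 (rate 1/15, i.e. scale 15, so σ ≈ 26):

> xbar = replicate(2000, mean(rgamma(100, 3, 1/15)))   # 2000 sample averages, n = 100
> sd(xbar)                                             # ≈ σ/√n = 26/10 = 2.6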

Example 5.9 Illustration of sampling variability.

We take 6 random samples of size n = 10 from the income box depicted in Figure 5.6. The population, simulated from $X \sim \text{Gamma}(3, 15)$, has mean $45K and SD $26K: $\mu = EX = 3 \times 15 = 45$ and $\sigma = SD(X) = \sqrt{3} \times 15 \approx 26$.

Table 5.3: An illustration of sampling variability

                                                                               ave†     SD‡
sample 1   26.49  19.43  50.46  14.49  16.26  47.86  54.07  54.10  28.15  72.34   38.36   19.85
sample 2    6.92  47.18  25.38  27.72  13.59  26.90  25.45  47.11 100.65  36.24   35.71   26.14
sample 3   21.60 130.56  64.06 115.58  83.08  71.67  19.07  25.78  89.37  40.21   66.10   39.38
sample 4   43.71  49.27 161.65  28.49  18.42  29.60  53.33  35.66  28.18 102.11   55.04   44.14
sample 5   41.36  21.56  23.57  52.37 100.38  59.69  62.44  33.25  38.69  90.12   52.34   26.54
sample 6   49.87  40.12  38.64  51.48  64.87  37.44 107.41  67.06  53.05  59.25   56.92   20.58
average                                                                            50.75   29.44ᵃ
SD                                                                                 11.61ᵇ  10.05

† This column should be distributed around µ.
‡ This column should be approximately σ.

ᵃ The average of the 6 sample SDs; should be close to σ.
ᵇ The SD of the 6 sample averages; approximately $\sigma/\sqrt{n}$.

Law of large numbers: For any given ε > 0, as n → ∞,

$$P(|\bar{X}_n - \mu| > \varepsilon) \to 0, \qquad \text{since } \text{var}(\bar{X}_n) \to 0$$

— sample average converges to population average.

5.6 Central Limit Theorem

Central Limit Theorem: No matter which population is sampled from (i.e., whatever the content of the box), the distribution of the sample average (or sum) closely follows the normal curve when n is sufficiently large:
$$P\Big\{\underbrace{\frac{\bar{X}_n - \mu}{\sigma/\sqrt{n}}}_{\text{standardization}} \le x\Big\} = P\big\{\sqrt{n}(\bar{X}_n - \mu)/\sigma \le x\big\} \approx \Phi(x).$$

This can be demonstrated by simulation (homework); a minimal sketch follows.
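A minimal sketch of such a simulation (the Exp(1) population is an arbitrary choice; it has µ = σ = 1):

> z = replicate(5000, sqrt(50) * (mean(rexp(50)) - 1))   # standardized sample averages, n = 50
> hist(z, freq = FALSE)                                  # should resemble the normal curve
> curve(dnorm(x), add = TRUE)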

Simple case: $X_1, \cdots, X_n$ are i.i.d. Bernoulli(p), i.e. draws from a box with tickets 1 (probability p) and 0 (probability 1 − p). Then $S_n \sim \text{Binom}(n, p)$; Fig. 4.1 shows that it closely follows the normal curve.

Normal approximation to the binomial: The approximation is good when $\sqrt{npq} \ge 2.5$ (an alternative criterion is $n \min(p, q) \ge 10$).

• Normal approximation with continuity correction:

$$P(a \le X \le b) = P(a - 0.5 \le X \le b + 0.5) \approx \Phi\left(\frac{b + 0.5 - np}{\sqrt{npq}}\right) - \Phi\left(\frac{a - 0.5 - np}{\sqrt{npq}}\right).$$

Figure 5.8: The rule of continuity correction always makes the interval wider.

• Without continuity correction: the 0.5 terms are not used; not accurate!

Example 5.10 Let X = # heads in 100 tosses, so $X \sim \text{Bin}(100, 0.5)$.

(a) Note that $EX = 50$ and $SD(X) = \sqrt{100 \times 0.5 \times 0.5} = 5$. Thus,

$$P(X = 50) = P(49.5 \le X \le 50.5) \approx P(-0.1 \le Z \le 0.1) = 0.5398 - 0.4602 = 7.96\%.$$

(b) $P(X \ge 60) = P(X \ge 59.5) \approx 1 - \Phi(9.5/5) = 2.87\%$.

(c) $P(45 \le X \le 60) = P(44.5 \le X \le 60.5) \approx \Phi(2.1) - \Phi(-1.1) = 84.64\%$.
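For comparison, the exact binomial probabilities in R show how accurate the corrected approximation is:

> dbinom(50, 100, 0.5)                           # exact P(X = 50) ≈ 0.0796
> 1 - pbinom(59, 100, 0.5)                       # exact P(X >= 60) ≈ 0.0284
> pbinom(60, 100, 0.5) - pbinom(44, 100, 0.5)    # exact P(45 <= X <= 60) ≈ 0.8468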

Example 5.11 Normal approximation.

In a large class of size 100, students are asked to simulate drawing 400 tickets with replacement from the box

| 0 | 2 | 5 | 8 | 10 |   (five equally likely tickets)

(a) What percentage of students will get a sum of 2100 or more?

By the LLN, the percentage is approximately the probability of getting a sum ≥ 2100.

Let $X_1, \cdots, X_{400}$ be the outcomes of the 1st, 2nd, $\cdots$, 400th draws, respectively. Then $X_1, \cdots, X_{400}$ are i.i.d. with
$$P(X_1 = 0) = \frac{1}{5}, \quad P(X_1 = 2) = \frac{1}{5}, \quad \cdots, \quad P(X_1 = 10) = \frac{1}{5}.$$
We want to find
$$p = P\Big\{\sum_{i=1}^{400} X_i \ge 2100\Big\}.$$
To use the CLT, we need to compute the population mean and SD:
$$\mu = EX_i = \frac{1}{5} \times 0 + \frac{1}{5} \times 2 + \cdots + \frac{1}{5} \times 10 = 5,$$
$$EX_i^2 = \frac{1}{5}(0^2 + 2^2 + \cdots + 10^2) = 38.6.$$
Hence,

$$\text{var}(X_i) = EX_i^2 - \mu^2 = 38.6 - 5^2 = 13.6, \qquad \sigma = \sqrt{13.6} \approx 3.7.$$

Consequently,
$$p = 1 - \Phi\left(\frac{2100 - 400 \times 5}{\sqrt{400} \times 3.7}\right) = 1 - \Phi(1.35) \approx 9\%.$$

Fact: The mean and SD of an rv obtained by drawing from a box are the same as the mean and SD of the box.
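As a check on part (a), the normal approximation is one line in R (2000 = 400 × 5):

> 1 - pnorm(2100, mean = 2000, sd = sqrt(400 * 13.6))   # ≈ 0.088, i.e. about 9%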

(b) What is the likely size of the error of the estimate?

(Classifying and counting) Let $Y_i = 1$ if the i-th student gets a sum ≥ 2100, and 0 otherwise. Then $Y_1, \cdots, Y_{100}$ are i.i.d. Bernoulli(p). Let $\hat{p} = \bar{Y}$ be the proportion of students who get a sum ≥ 2100. It follows that
$$E\hat{p} = E\bar{Y} = p = 0.09,$$
and, by the square root law,
$$SD(\hat{p}) = SD(\bar{Y}) = \frac{\sqrt{pq}}{\sqrt{100}} = 2.86\%,$$
i.e. the estimation (chance) error is about 2.86%.

Example 5.12 Classifying and Counting

In a city, the average family income is $40K with an SD of $30K, and 20% of families have income ≥ $80K. A survey organization takes a random sample of 900 families.

(a) What is the chance that the sample average falls between $38K and $42K?

Let $\bar{X}$ be the sample average. By the square root law,
$$E\bar{X} = 40, \qquad SD(\bar{X}) = 30/\sqrt{900} = 1.$$

Hence, by the CLT,
$$P(38 \le \bar{X} \le 42) = \Phi\left(\frac{42 - 40}{1}\right) - \Phi\left(\frac{38 - 40}{1}\right) = \Phi(2) - \Phi(-2) \approx 95\%$$
— very likely.

(b) What is the chance that between 18% and 22% of the selected families have an income ≥ $80K (i.e. poll error ≤ 2%)?

Let $Y_i = 1$ if the i-th draw has income ≥ $80K. Then $\hat{p} = \bar{Y}$ is the proportion of selected families having income ≥ $80K, and
$$\text{prob} = P\{.18 \le \hat{p} \le .22\} = P\Big\{162 \le \sum_{i=1}^{900} Y_i \le 198\Big\}.$$
Now $EY_i = 0.2$ and $SD(Y_i) = \sqrt{0.2 \times 0.8} = 0.4$. Hence,
$$ES_{900} = 900 \times .2 = 180, \qquad SD(S_{900}) = \sqrt{900 \times .2 \times .8} = 12.$$

Using the normal approximation with continuity correction,
$$\text{prob} = \Phi\left(\frac{198.5 - 180}{12}\right) - \Phi\left(\frac{161.5 - 180}{12}\right) = 87.68\%.$$
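A quick check of this number in R:

> pnorm(198.5, mean = 180, sd = 12) - pnorm(161.5, mean = 180, sd = 12)   # ≈ 0.8768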

This method is preferable (more accurate) for this specific problem.

Conclusions:

• Chance errors or poll errors depend on the sample size n, not the population size N.

• Random sampling has the advantages of avoiding biases and of allowing chance errors (variance) to be calculated.

• An arbitrary sample is NOT a random sample: it creates unintentional biases and makes it impossible to quantify the errors.

• The statistical lesson is to avoid biases.