Lecture 22: Bivariate Normal Distribution
Statistics 104, Colin Rundel — April 11, 2012

6.5 Conditional Distributions

General Bivariate Normal

Let $Z_1, Z_2 \sim \mathcal{N}(0, 1)$ be independent, which we will use to build a general bivariate normal distribution. Their joint density is

$$f(z_1, z_2) = \frac{1}{2\pi} \exp\left[-\frac{1}{2}\left(z_1^2 + z_2^2\right)\right]$$

We want to transform these unit normals to have the following arbitrary parameters: $\mu_X, \mu_Y, \sigma_X, \sigma_Y, \rho$.

$$X = \sigma_X Z_1 + \mu_X \qquad Y = \sigma_Y\left[\rho Z_1 + \sqrt{1 - \rho^2}\, Z_2\right] + \mu_Y$$

General Bivariate Normal - Marginals

First, let's examine the marginal distributions of $X$ and $Y$:

$$X = \sigma_X Z_1 + \mu_X = \sigma_X \mathcal{N}(0, 1) + \mu_X = \mathcal{N}(\mu_X, \sigma_X^2)$$

$$\begin{aligned}
Y &= \sigma_Y\left[\rho Z_1 + \sqrt{1 - \rho^2}\, Z_2\right] + \mu_Y \\
  &= \sigma_Y\left[\rho\, \mathcal{N}(0, 1) + \sqrt{1 - \rho^2}\, \mathcal{N}(0, 1)\right] + \mu_Y \\
  &= \sigma_Y\left[\mathcal{N}(0, \rho^2) + \mathcal{N}(0, 1 - \rho^2)\right] + \mu_Y \\
  &= \sigma_Y\, \mathcal{N}(0, 1) + \mu_Y = \mathcal{N}(\mu_Y, \sigma_Y^2)
\end{aligned}$$

General Bivariate Normal - Cov/Corr

Second, we can find $\operatorname{Cov}(X, Y)$ and $\rho(X, Y)$:

$$\begin{aligned}
\operatorname{Cov}(X, Y) &= E\left[(X - E(X))(Y - E(Y))\right] \\
&= E\left[\left(\sigma_X Z_1 + \mu_X - \mu_X\right)\left(\sigma_Y\left[\rho Z_1 + \sqrt{1 - \rho^2}\, Z_2\right] + \mu_Y - \mu_Y\right)\right] \\
&= E\left[\left(\sigma_X Z_1\right)\left(\sigma_Y\left[\rho Z_1 + \sqrt{1 - \rho^2}\, Z_2\right]\right)\right] \\
&= \sigma_X \sigma_Y\, E\left[\rho Z_1^2 + \sqrt{1 - \rho^2}\, Z_1 Z_2\right] \\
&= \sigma_X \sigma_Y\, \rho\, E\left[Z_1^2\right] = \sigma_X \sigma_Y \rho
\end{aligned}$$

$$\rho(X, Y) = \frac{\operatorname{Cov}(X, Y)}{\sigma_X \sigma_Y} = \rho$$

General Bivariate Normal - RNG

Consequently, if we want to generate a bivariate normal random variable with $X \sim \mathcal{N}(\mu_X, \sigma_X^2)$ and $Y \sim \mathcal{N}(\mu_Y, \sigma_Y^2)$, where the correlation of $X$ and $Y$ is $\rho$, we can generate two independent unit normals $Z_1$ and $Z_2$ and use the transformation:

$$X = \sigma_X Z_1 + \mu_X \qquad Y = \sigma_Y\left[\rho Z_1 + \sqrt{1 - \rho^2}\, Z_2\right] + \mu_Y$$

Multivariate Change of Variables

Let $X_1, \ldots, X_n$ have a continuous joint distribution with pdf $f$ defined on $S$.
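As a quick sanity check on this construction, the transformation can be simulated directly. A minimal sketch using NumPy; the parameter values are arbitrary, chosen only for illustration:

```python
import numpy as np

# Illustrative (arbitrary) parameters
mu_x, mu_y, s_x, s_y, rho = 1.0, -2.0, 1.5, 0.8, 0.6

rng = np.random.default_rng(0)

# Two independent unit normals
z1 = rng.standard_normal(200_000)
z2 = rng.standard_normal(200_000)

# The transformation from the lecture
x = s_x * z1 + mu_x
y = s_y * (rho * z1 + np.sqrt(1 - rho**2) * z2) + mu_y

# Empirical moments should match mu_X, mu_Y, sigma_X, sigma_Y, rho
print(x.mean(), y.mean())       # ≈ 1.0, -2.0
print(x.std(), y.std())         # ≈ 1.5, 0.8
print(np.corrcoef(x, y)[0, 1])  # ≈ 0.6
```

The empirical means, standard deviations, and correlation should agree with the target parameters up to Monte Carlo error.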
We can define $n$ new random variables $Y_1, \ldots, Y_n$ as follows:

$$Y_1 = r_1(X_1, \ldots, X_n) \quad \cdots \quad Y_n = r_n(X_1, \ldots, X_n)$$

If we assume that the $n$ functions $r_1, \ldots, r_n$ define a one-to-one differentiable transformation from $S$ to $T$, then let the inverse of this transformation be

$$x_1 = s_1(y_1, \ldots, y_n) \quad \cdots \quad x_n = s_n(y_1, \ldots, y_n)$$

Then the joint pdf $g$ of $Y_1, \ldots, Y_n$ is

$$g(y_1, \ldots, y_n) = \begin{cases} f\left(s_1, \ldots, s_n\right) |J| & \text{for } (y_1, \ldots, y_n) \in T \\ 0 & \text{otherwise} \end{cases}$$

where

$$J = \det \begin{bmatrix} \dfrac{\partial s_1}{\partial y_1} & \cdots & \dfrac{\partial s_1}{\partial y_n} \\ \vdots & \ddots & \vdots \\ \dfrac{\partial s_n}{\partial y_1} & \cdots & \dfrac{\partial s_n}{\partial y_n} \end{bmatrix}$$

We can use this result to find the joint density of the bivariate normal using a 2-D change of variables.

General Bivariate Normal - Density

The first thing we need to find are the inverses of the transformation. If $x = r_1(z_1, z_2)$ and $y = r_2(z_1, z_2)$, we need to find functions $s_1$ and $s_2$ such that $Z_1 = s_1(X, Y)$ and $Z_2 = s_2(X, Y)$.
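The inversion and Jacobian computation can be done symbolically rather than by hand. A sketch assuming SymPy is available; the symbol names are my own:

```python
import sympy as sp

z1, z2, x, y = sp.symbols('z1 z2 x y')
mx, my = sp.symbols('mu_X mu_Y')
sx, sy = sp.symbols('sigma_X sigma_Y', positive=True)
rho = sp.symbols('rho')

# Forward transformation (z1, z2) -> (x, y) from the lecture
eqs = [sp.Eq(x, sx * z1 + mx),
       sp.Eq(y, sy * (rho * z1 + sp.sqrt(1 - rho**2) * z2) + my)]

# Invert it to obtain s1(x, y) and s2(x, y)
inv = sp.solve(eqs, [z1, z2], dict=True)[0]
s1, s2 = inv[z1], inv[z2]

# Jacobian determinant of the inverse transformation
J = sp.Matrix([s1, s2]).jacobian(sp.Matrix([x, y])).det()
# Should simplify to 1/(sigma_X*sigma_Y*sqrt(1 - rho**2))
print(sp.simplify(J))
```

The simplified determinant agrees with the hand-derived Jacobian used in the density below.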
Solving for $Z_1$ and $Z_2$:

$$X = \sigma_X Z_1 + \mu_X \quad\Longrightarrow\quad Z_1 = \frac{X - \mu_X}{\sigma_X}$$

$$Y = \sigma_Y\left[\rho Z_1 + \sqrt{1 - \rho^2}\, Z_2\right] + \mu_Y \quad\Longrightarrow\quad \frac{Y - \mu_Y}{\sigma_Y} = \rho\, \frac{X - \mu_X}{\sigma_X} + \sqrt{1 - \rho^2}\, Z_2 \quad\Longrightarrow\quad Z_2 = \frac{1}{\sqrt{1 - \rho^2}}\left[\frac{Y - \mu_Y}{\sigma_Y} - \rho\, \frac{X - \mu_X}{\sigma_X}\right]$$

Therefore,

$$s_1(x, y) = \frac{x - \mu_X}{\sigma_X} \qquad s_2(x, y) = \frac{1}{\sqrt{1 - \rho^2}}\left[\frac{y - \mu_Y}{\sigma_Y} - \rho\, \frac{x - \mu_X}{\sigma_X}\right]$$

Next we calculate the Jacobian:

$$J = \det \begin{bmatrix} \dfrac{\partial s_1}{\partial x} & \dfrac{\partial s_1}{\partial y} \\ \dfrac{\partial s_2}{\partial x} & \dfrac{\partial s_2}{\partial y} \end{bmatrix} = \det \begin{bmatrix} \dfrac{1}{\sigma_X} & 0 \\ \dfrac{-\rho}{\sigma_X \sqrt{1 - \rho^2}} & \dfrac{1}{\sigma_Y \sqrt{1 - \rho^2}} \end{bmatrix} = \frac{1}{\sigma_X \sigma_Y \sqrt{1 - \rho^2}}$$

The joint density of $X$ and $Y$ is then given by

$$\begin{aligned}
f(x, y) &= f(z_1, z_2)\, |J| = \frac{1}{2\pi} \exp\left[-\frac{1}{2}\left(z_1^2 + z_2^2\right)\right] |J| = \frac{1}{2\pi \sigma_X \sigma_Y \sqrt{1 - \rho^2}} \exp\left[-\frac{1}{2}\left(z_1^2 + z_2^2\right)\right] \\
&= \frac{1}{2\pi \sigma_X \sigma_Y \sqrt{1 - \rho^2}} \exp\left[-\frac{1}{2}\left[\left(\frac{x - \mu_X}{\sigma_X}\right)^2 + \frac{1}{1 - \rho^2}\left(\frac{y - \mu_Y}{\sigma_Y} - \rho\, \frac{x - \mu_X}{\sigma_X}\right)^2\right]\right] \\
&= \frac{1}{2\pi \sigma_X \sigma_Y (1 - \rho^2)^{1/2}} \exp\left[\frac{-1}{2(1 - \rho^2)}\left(\frac{(x - \mu_X)^2}{\sigma_X^2} + \frac{(y - \mu_Y)^2}{\sigma_Y^2} - 2\rho\, \frac{(x - \mu_X)}{\sigma_X} \frac{(y - \mu_Y)}{\sigma_Y}\right)\right]
\end{aligned}$$

General Bivariate Normal - Density (Matrix Notation)

Obviously, the density for the bivariate normal is ugly, and it only gets worse when we consider higher-dimensional joint densities of normals. We can write the density in a more compact form using matrix notation:

$$\mathbf{x} = \begin{pmatrix} x \\ y \end{pmatrix} \qquad \boldsymbol{\mu} = \begin{pmatrix} \mu_X \\ \mu_Y \end{pmatrix} \qquad \Sigma = \begin{pmatrix} \sigma_X^2 & \rho \sigma_X \sigma_Y \\ \rho \sigma_X \sigma_Y & \sigma_Y^2 \end{pmatrix}$$

$$f(\mathbf{x}) = \frac{1}{2\pi} (\det \Sigma)^{-1/2} \exp\left[-\frac{1}{2} (\mathbf{x} - \boldsymbol{\mu})^T \Sigma^{-1} (\mathbf{x} - \boldsymbol{\mu})\right]$$

We can confirm our results by checking the values of $(\det \Sigma)^{-1/2}$ and $(\mathbf{x} - \boldsymbol{\mu})^T \Sigma^{-1} (\mathbf{x} - \boldsymbol{\mu})$ for the bivariate case. Recall that for a $2 \times 2$ matrix,

$$A = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \qquad A^{-1} = \frac{1}{\det A} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix} = \frac{1}{ad - bc} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix}$$

Then

$$\begin{aligned}
(\mathbf{x} - \boldsymbol{\mu})^T \Sigma^{-1} (\mathbf{x} - \boldsymbol{\mu})
&= \frac{1}{\sigma_X^2 \sigma_Y^2 (1 - \rho^2)} \begin{pmatrix} x - \mu_X \\ y - \mu_Y \end{pmatrix}^T \begin{pmatrix} \sigma_Y^2 & -\rho \sigma_X \sigma_Y \\ -\rho \sigma_X \sigma_Y & \sigma_X^2 \end{pmatrix} \begin{pmatrix} x - \mu_X \\ y - \mu_Y \end{pmatrix} \\
&= \frac{1}{\sigma_X^2 \sigma_Y^2 (1 - \rho^2)} \begin{pmatrix} \sigma_Y^2 (x - \mu_X) - \rho \sigma_X \sigma_Y (y - \mu_Y) \\ -\rho \sigma_X \sigma_Y (x - \mu_X) + \sigma_X^2 (y - \mu_Y) \end{pmatrix}^T \begin{pmatrix} x - \mu_X \\ y - \mu_Y \end{pmatrix} \\
&= \frac{1}{\sigma_X^2 \sigma_Y^2 (1 - \rho^2)} \left[\sigma_Y^2 (x - \mu_X)^2 - 2\rho \sigma_X \sigma_Y (x - \mu_X)(y - \mu_Y) + \sigma_X^2 (y - \mu_Y)^2\right]
\end{aligned}$$
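The closed-form density derived above can be spot-checked numerically against a library implementation. A sketch assuming SciPy is installed, with arbitrary illustrative parameters:

```python
import numpy as np
from scipy.stats import multivariate_normal

# Arbitrary illustrative parameters
mu_x, mu_y, s_x, s_y, rho = 1.0, -2.0, 1.5, 0.8, 0.6
Sigma = np.array([[s_x**2, rho * s_x * s_y],
                  [rho * s_x * s_y, s_y**2]])

def bvn_pdf(x, y):
    """Bivariate normal density in the expanded form derived above."""
    u = (x - mu_x) / s_x
    v = (y - mu_y) / s_y
    q = (u**2 - 2 * rho * u * v + v**2) / (1 - rho**2)
    return np.exp(-q / 2) / (2 * np.pi * s_x * s_y * np.sqrt(1 - rho**2))

ref = multivariate_normal(mean=[mu_x, mu_y], cov=Sigma)
print(np.isclose(bvn_pdf(0.5, -1.0), ref.pdf([0.5, -1.0])))  # True
```

Agreement at arbitrary evaluation points confirms the change-of-variables derivation.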
$$= \frac{1}{1 - \rho^2} \left[\frac{(x - \mu_X)^2}{\sigma_X^2} - 2\rho\, \frac{(x - \mu_X)(y - \mu_Y)}{\sigma_X \sigma_Y} + \frac{(y - \mu_Y)^2}{\sigma_Y^2}\right]$$

$$(\det \Sigma)^{-1/2} = \left(\sigma_X^2 \sigma_Y^2 - \rho^2 \sigma_X^2 \sigma_Y^2\right)^{-1/2} = \frac{1}{\sigma_X \sigma_Y (1 - \rho^2)^{1/2}}$$

Both agree with the exponent and normalizing constant of the density derived directly from the change of variables.

General Bivariate Normal - Examples

[Figure: contour plots of the bivariate normal density for various parameter settings — $X \sim \mathcal{N}(0, 1)$ or $\mathcal{N}(0, 2)$ and $Y \sim \mathcal{N}(0, 1)$ or $\mathcal{N}(0, 2)$, with $\rho = 0, \pm 0.25, \pm 0.5, \pm 0.75$.]

Multivariate Normal Distribution

Matrix notation allows us to easily express the density of the multivariate normal distribution for an arbitrary number of dimensions. We express the $k$-dimensional multivariate normal distribution as

$$X \sim \mathcal{N}_k(\boldsymbol{\mu}, \Sigma)$$

Multivariate Normal Distribution - Cholesky

In the bivariate case, we had a nice transformation such that we could generate two independent unit normal values and transform them into a sample from an arbitrary bivariate normal distribution. There is a similar method for the multivariate normal distribution that takes advantage of the Cholesky decomposition of the covariance matrix.
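Both identities above — the determinant and the expanded quadratic form — are easy to verify numerically. A minimal NumPy sketch with arbitrary parameters:

```python
import numpy as np

# Arbitrary illustrative parameters
mu_x, mu_y, s_x, s_y, rho = 0.5, 1.0, 2.0, 1.2, -0.4
Sigma = np.array([[s_x**2, rho * s_x * s_y],
                  [rho * s_x * s_y, s_y**2]])

# det(Sigma) = sigma_X^2 sigma_Y^2 (1 - rho^2)
print(np.isclose(np.linalg.det(Sigma), s_x**2 * s_y**2 * (1 - rho**2)))  # True

# Quadratic form: matrix version vs the expanded scalar version
d = np.array([1.7 - mu_x, -0.3 - mu_y])  # (x, y) minus the mean, arbitrary point
quad_matrix = d @ np.linalg.inv(Sigma) @ d
u, v = d[0] / s_x, d[1] / s_y
quad_scalar = (u**2 - 2 * rho * u * v + v**2) / (1 - rho**2)
print(np.isclose(quad_matrix, quad_scalar))  # True
```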
Here $\boldsymbol{\mu}$ is the $k \times 1$ column vector of means and $\Sigma$ is the $k \times k$ covariance matrix, with $\{\Sigma\}_{i,j} = \operatorname{Cov}(X_i, X_j)$. The density of the distribution is

$$f(\mathbf{x}) = \frac{1}{(2\pi)^{k/2}} (\det \Sigma)^{-1/2} \exp\left[-\frac{1}{2} (\mathbf{x} - \boldsymbol{\mu})^T \Sigma^{-1} (\mathbf{x} - \boldsymbol{\mu})\right]$$

The Cholesky decomposition is defined for a symmetric, positive definite matrix $M$ as

$$L = \operatorname{Chol}(M)$$

where $L$ is a lower triangular matrix such that $L L^T = M$.

Multivariate Normal Distribution - RNG

Let $Z_1, \ldots, Z_k \sim \mathcal{N}(0, 1)$ be independent and $\mathbf{Z} = (Z_1, \ldots, Z_k)^T$. Then

$$\boldsymbol{\mu} + \operatorname{Chol}(\Sigma)\, \mathbf{Z} \sim \mathcal{N}_k(\boldsymbol{\mu}, \Sigma)$$

This is offered without proof in the general $k$-dimensional case, but we can check that it results in the same transformation we started with in the bivariate case, which should justify how we knew to use that particular transformation.

Cholesky and the Bivariate Transformation

We need to find the Cholesky decomposition of $\Sigma$ for the general bivariate case, where

$$\Sigma = \begin{pmatrix} \sigma_X^2 & \rho \sigma_X \sigma_Y \\ \rho \sigma_X \sigma_Y & \sigma_Y^2 \end{pmatrix}$$

That is, we need to solve the following for $a$, $b$, $c$:

$$\begin{pmatrix} a & 0 \\ b & c \end{pmatrix} \begin{pmatrix} a & b \\ 0 & c \end{pmatrix} = \begin{pmatrix} a^2 & ab \\ ab & b^2 + c^2 \end{pmatrix} = \begin{pmatrix} \sigma_X^2 & \rho \sigma_X \sigma_Y \\ \rho \sigma_X \sigma_Y & \sigma_Y^2 \end{pmatrix}$$
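Solving this system gives $a = \sigma_X$, $b = \rho \sigma_Y$, and $c = \sigma_Y \sqrt{1 - \rho^2}$, i.e. exactly the coefficients of the bivariate transformation. A NumPy sketch confirming this with arbitrary parameters:

```python
import numpy as np

# Arbitrary illustrative parameters
s_x, s_y, rho = 1.5, 0.8, 0.6
Sigma = np.array([[s_x**2, rho * s_x * s_y],
                  [rho * s_x * s_y, s_y**2]])

# numpy returns the lower triangular factor L with L @ L.T == Sigma
L = np.linalg.cholesky(Sigma)

# Solving a^2 = sigma_X^2, ab = rho sigma_X sigma_Y, b^2 + c^2 = sigma_Y^2
# gives a = sigma_X, b = rho sigma_Y, c = sigma_Y sqrt(1 - rho^2)
expected = np.array([[s_x, 0.0],
                     [rho * s_y, s_y * np.sqrt(1 - rho**2)]])
print(np.allclose(L, expected))  # True
```

With this factor, $\boldsymbol{\mu} + L \mathbf{Z}$ reproduces $X = \sigma_X Z_1 + \mu_X$ and $Y = \sigma_Y [\rho Z_1 + \sqrt{1 - \rho^2}\, Z_2] + \mu_Y$ componentwise.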