Lecture 22: Bivariate Normal Distribution


Statistics 104 (Colin Rundel), April 11, 2012

6.5 Conditional Distributions

General Bivariate Normal

Let $Z_1, Z_2 \sim \mathcal{N}(0, 1)$ be independent, which we will use to build a general bivariate normal distribution. Their joint density is

$$f(z_1, z_2) = \frac{1}{2\pi} \exp\left[-\frac{1}{2}\left(z_1^2 + z_2^2\right)\right]$$

We want to transform these unit normals to have the following arbitrary parameters: $\mu_X, \mu_Y, \sigma_X, \sigma_Y, \rho$.

$$X = \sigma_X Z_1 + \mu_X$$
$$Y = \sigma_Y \left[\rho Z_1 + \sqrt{1 - \rho^2}\, Z_2\right] + \mu_Y$$

General Bivariate Normal - Marginals

First, let's examine the marginal distributions of $X$ and $Y$:

$$X = \sigma_X Z_1 + \mu_X = \sigma_X \mathcal{N}(0,1) + \mu_X = \mathcal{N}(\mu_X, \sigma_X^2)$$

$$Y = \sigma_Y\left[\rho Z_1 + \sqrt{1-\rho^2}\, Z_2\right] + \mu_Y = \sigma_Y\left[\rho\,\mathcal{N}(0,1) + \sqrt{1-\rho^2}\,\mathcal{N}(0,1)\right] + \mu_Y = \sigma_Y\left[\mathcal{N}(0,\rho^2) + \mathcal{N}(0, 1-\rho^2)\right] + \mu_Y = \sigma_Y\,\mathcal{N}(0,1) + \mu_Y = \mathcal{N}(\mu_Y, \sigma_Y^2)$$

General Bivariate Normal - Cov/Corr

Second, we can find $\text{Cov}(X, Y)$ and $\rho(X, Y)$:

$$
\begin{aligned}
\text{Cov}(X, Y) &= E[(X - E(X))(Y - E(Y))] \\
&= E\left[(\sigma_X Z_1 + \mu_X - \mu_X)\left(\sigma_Y\left[\rho Z_1 + \sqrt{1-\rho^2}\, Z_2\right] + \mu_Y - \mu_Y\right)\right] \\
&= E\left[(\sigma_X Z_1)\left(\sigma_Y\left[\rho Z_1 + \sqrt{1-\rho^2}\, Z_2\right]\right)\right] \\
&= \sigma_X \sigma_Y\, E\left[\rho Z_1^2 + \sqrt{1-\rho^2}\, Z_1 Z_2\right] \\
&= \sigma_X \sigma_Y \rho\, E[Z_1^2] \qquad \text{(by independence, } E[Z_1 Z_2] = 0 \text{, and } E[Z_1^2] = 1\text{)} \\
&= \sigma_X \sigma_Y \rho
\end{aligned}
$$

$$\rho(X, Y) = \frac{\text{Cov}(X, Y)}{\sigma_X \sigma_Y} = \rho$$

General Bivariate Normal - RNG

Consequently, if we want to generate a bivariate normal random variable with $X \sim \mathcal{N}(\mu_X, \sigma_X^2)$ and $Y \sim \mathcal{N}(\mu_Y, \sigma_Y^2)$ where the correlation of $X$ and $Y$ is $\rho$, we can generate two independent unit normals $Z_1$ and $Z_2$ and use the transformation:

$$X = \sigma_X Z_1 + \mu_X$$
$$Y = \sigma_Y\left[\rho Z_1 + \sqrt{1 - \rho^2}\, Z_2\right] + \mu_Y$$

We can also use this result to find the joint density of the bivariate normal using a 2d change of variables.
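The transformation above suggests a direct sampling recipe. Here is a minimal sketch in Python (NumPy assumed; the parameter values are arbitrary illustrations, not from the lecture) that draws samples this way and checks the sample mean, standard deviation, and correlation against the targets.

```python
import numpy as np

rng = np.random.default_rng(0)

# Target parameters (arbitrary illustrative values)
mu_x, mu_y = 1.0, -2.0
sigma_x, sigma_y = 2.0, 0.5
rho = 0.75

n = 100_000
z1 = rng.standard_normal(n)   # Z1 ~ N(0, 1)
z2 = rng.standard_normal(n)   # Z2 ~ N(0, 1), independent of Z1

# The transformation from the lecture
x = sigma_x * z1 + mu_x
y = sigma_y * (rho * z1 + np.sqrt(1 - rho**2) * z2) + mu_y

print(x.mean(), x.std())          # approximately mu_x, sigma_x
print(y.mean(), y.std())          # approximately mu_y, sigma_y
print(np.corrcoef(x, y)[0, 1])    # approximately rho
```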
Multivariate Change of Variables

Let $X_1, \ldots, X_n$ have a continuous joint distribution with pdf $f$ defined on $S$. We can define $n$ new random variables $Y_1, \ldots, Y_n$ as follows:

$$Y_1 = r_1(X_1, \ldots, X_n) \quad \cdots \quad Y_n = r_n(X_1, \ldots, X_n)$$

If we assume that the $n$ functions $r_1, \ldots, r_n$ define a one-to-one differentiable transformation from $S$ to $T$, let the inverse of this transformation be

$$x_1 = s_1(y_1, \ldots, y_n) \quad \cdots \quad x_n = s_n(y_1, \ldots, y_n)$$

Then the joint pdf $g$ of $Y_1, \ldots, Y_n$ is

$$g(y_1, \ldots, y_n) = \begin{cases} f(s_1, \ldots, s_n)\,|J| & \text{for } (y_1, \ldots, y_n) \in T \\ 0 & \text{otherwise} \end{cases}$$

where

$$J = \det \begin{bmatrix} \dfrac{\partial s_1}{\partial y_1} & \cdots & \dfrac{\partial s_1}{\partial y_n} \\ \vdots & \ddots & \vdots \\ \dfrac{\partial s_n}{\partial y_1} & \cdots & \dfrac{\partial s_n}{\partial y_n} \end{bmatrix}$$

General Bivariate Normal - Density

The first thing we need to find are the inverses of the transformation. If $x = r_1(z_1, z_2)$ and $y = r_2(z_1, z_2)$, we need to find functions $s_1$ and $s_2$ such that $Z_1 = s_1(X, Y)$ and $Z_2 = s_2(X, Y)$:

$$X = \sigma_X Z_1 + \mu_X \implies Z_1 = \frac{X - \mu_X}{\sigma_X}$$

$$Y = \sigma_Y\left[\rho Z_1 + \sqrt{1-\rho^2}\, Z_2\right] + \mu_Y \implies \frac{Y - \mu_Y}{\sigma_Y} = \rho\,\frac{X - \mu_X}{\sigma_X} + \sqrt{1-\rho^2}\, Z_2 \implies Z_2 = \frac{1}{\sqrt{1-\rho^2}}\left[\frac{Y - \mu_Y}{\sigma_Y} - \rho\,\frac{X - \mu_X}{\sigma_X}\right]$$

Therefore,

$$s_1(x, y) = \frac{x - \mu_X}{\sigma_X} \qquad s_2(x, y) = \frac{1}{\sqrt{1-\rho^2}}\left[\frac{y - \mu_Y}{\sigma_Y} - \rho\,\frac{x - \mu_X}{\sigma_X}\right]$$

Next we calculate the Jacobian,

$$J = \det \begin{bmatrix} \dfrac{\partial s_1}{\partial x} & \dfrac{\partial s_1}{\partial y} \\ \dfrac{\partial s_2}{\partial x} & \dfrac{\partial s_2}{\partial y} \end{bmatrix} = \det \begin{bmatrix} \dfrac{1}{\sigma_X} & 0 \\ \dfrac{-\rho}{\sigma_X \sqrt{1-\rho^2}} & \dfrac{1}{\sigma_Y \sqrt{1-\rho^2}} \end{bmatrix} = \frac{1}{\sigma_X \sigma_Y \sqrt{1-\rho^2}}$$

The joint density of $X$ and $Y$ is then given by

$$
\begin{aligned}
f(x, y) &= f(z_1, z_2)\,|J| = \frac{1}{2\pi} \exp\left[-\frac{1}{2}\left(z_1^2 + z_2^2\right)\right]|J| \\
&= \frac{1}{2\pi \sigma_X \sigma_Y \sqrt{1-\rho^2}} \exp\left[-\frac{1}{2}\left[\left(\frac{x - \mu_X}{\sigma_X}\right)^2 + \frac{1}{1-\rho^2}\left(\frac{y - \mu_Y}{\sigma_Y} - \rho\,\frac{x - \mu_X}{\sigma_X}\right)^2\right]\right] \\
&= \frac{1}{2\pi \sigma_X \sigma_Y (1-\rho^2)^{1/2}} \exp\left[\frac{-1}{2(1-\rho^2)}\left(\frac{(x-\mu_X)^2}{\sigma_X^2} + \frac{(y-\mu_Y)^2}{\sigma_Y^2} - 2\rho\,\frac{(x-\mu_X)(y-\mu_Y)}{\sigma_X \sigma_Y}\right)\right]
\end{aligned}
$$

General Bivariate Normal - Density (Matrix Notation)

Obviously, the density for the bivariate normal is ugly, and it only gets worse when we consider higher dimensional joint densities of normals. We can write the density in a more compact form using matrix notation,

$$\mathbf{x} = \begin{pmatrix} x \\ y \end{pmatrix} \qquad \mu = \begin{pmatrix} \mu_X \\ \mu_Y \end{pmatrix} \qquad \Sigma = \begin{pmatrix} \sigma_X^2 & \rho \sigma_X \sigma_Y \\ \rho \sigma_X \sigma_Y & \sigma_Y^2 \end{pmatrix}$$

$$f(\mathbf{x}) = \frac{1}{2\pi} (\det \Sigma)^{-1/2} \exp\left[-\frac{1}{2}(\mathbf{x} - \mu)^T \Sigma^{-1} (\mathbf{x} - \mu)\right]$$

We can confirm our results by checking the value of $(\det \Sigma)^{-1/2}$ and $(\mathbf{x} - \mu)^T \Sigma^{-1} (\mathbf{x} - \mu)$ for the bivariate case.

$$(\det \Sigma)^{-1/2} = \left(\sigma_X^2 \sigma_Y^2 - \rho^2 \sigma_X^2 \sigma_Y^2\right)^{-1/2} = \frac{1}{\sigma_X \sigma_Y (1-\rho^2)^{1/2}}$$

Recall for a $2 \times 2$ matrix,

$$A = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \qquad A^{-1} = \frac{1}{\det A}\begin{pmatrix} d & -b \\ -c & a \end{pmatrix} = \frac{1}{ad - bc}\begin{pmatrix} d & -b \\ -c & a \end{pmatrix}$$

Then,

$$
\begin{aligned}
(\mathbf{x} - \mu)^T \Sigma^{-1} (\mathbf{x} - \mu)
&= \frac{1}{\sigma_X^2 \sigma_Y^2 (1-\rho^2)} \begin{pmatrix} x - \mu_X \\ y - \mu_Y \end{pmatrix}^T \begin{pmatrix} \sigma_Y^2 & -\rho \sigma_X \sigma_Y \\ -\rho \sigma_X \sigma_Y & \sigma_X^2 \end{pmatrix} \begin{pmatrix} x - \mu_X \\ y - \mu_Y \end{pmatrix} \\
&= \frac{1}{\sigma_X^2 \sigma_Y^2 (1-\rho^2)} \begin{pmatrix} \sigma_Y^2 (x - \mu_X) - \rho \sigma_X \sigma_Y (y - \mu_Y) \\ -\rho \sigma_X \sigma_Y (x - \mu_X) + \sigma_X^2 (y - \mu_Y) \end{pmatrix}^T \begin{pmatrix} x - \mu_X \\ y - \mu_Y \end{pmatrix} \\
&= \frac{1}{\sigma_X^2 \sigma_Y^2 (1-\rho^2)} \left[\sigma_Y^2 (x - \mu_X)^2 - 2\rho \sigma_X \sigma_Y (x - \mu_X)(y - \mu_Y) + \sigma_X^2 (y - \mu_Y)^2\right] \\
&= \frac{1}{1-\rho^2} \left[\frac{(x - \mu_X)^2}{\sigma_X^2} - 2\rho\,\frac{(x - \mu_X)(y - \mu_Y)}{\sigma_X \sigma_Y} + \frac{(y - \mu_Y)^2}{\sigma_Y^2}\right]
\end{aligned}
$$

which matches the exponent of the density derived above.

General Bivariate Normal - Examples

[Figures: density and contour plots of the bivariate normal for various parameter settings: X ~ N(0,1), Y ~ N(0,1) with ρ = 0, 0.25, 0.5, 0.75 and ρ = -0.25, -0.5, -0.75; X ~ N(0,2), Y ~ N(0,1) and X ~ N(0,1), Y ~ N(0,2), each with ρ = 0 and ρ = -0.75.]
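As a numerical sanity check, the expanded scalar form of the density and the matrix form above can be evaluated at a few points. Below is a sketch assuming NumPy and SciPy are available, with arbitrary illustrative parameter values; both forms should also agree with scipy.stats.multivariate_normal.

```python
import numpy as np
from scipy.stats import multivariate_normal

mu_x, mu_y = 1.0, -2.0
sigma_x, sigma_y = 2.0, 0.5
rho = 0.75

def f_scalar(x, y):
    # Bivariate normal density in the expanded (scalar) form
    q = ((x - mu_x)**2 / sigma_x**2
         + (y - mu_y)**2 / sigma_y**2
         - 2 * rho * (x - mu_x) * (y - mu_y) / (sigma_x * sigma_y))
    norm = 2 * np.pi * sigma_x * sigma_y * np.sqrt(1 - rho**2)
    return np.exp(-q / (2 * (1 - rho**2))) / norm

mu = np.array([mu_x, mu_y])
Sigma = np.array([[sigma_x**2, rho * sigma_x * sigma_y],
                  [rho * sigma_x * sigma_y, sigma_y**2]])

def f_matrix(x, y):
    # Same density in matrix notation: (2*pi)^-1 (det Sigma)^-1/2 exp(-q/2)
    v = np.array([x, y]) - mu
    q = v @ np.linalg.solve(Sigma, v)
    return np.exp(-q / 2) / (2 * np.pi * np.sqrt(np.linalg.det(Sigma)))

for (x, y) in [(0.0, 0.0), (1.0, -2.0), (2.5, -1.5)]:
    print(f_scalar(x, y), f_matrix(x, y),
          multivariate_normal(mu, Sigma).pdf([x, y]))  # all three agree
```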
Multivariate Normal Distribution

Matrix notation allows us to easily express the density of the multivariate normal distribution for an arbitrary number of dimensions. We express the k-dimensional multivariate normal distribution as follows,

$$\mathbf{X} \sim \mathcal{N}_k(\mu, \Sigma)$$

where $\mu$ is the $k \times 1$ column vector of means and $\Sigma$ is the $k \times k$ covariance matrix with $\{\Sigma\}_{i,j} = \text{Cov}(X_i, X_j)$.

The density of the distribution is

$$f(\mathbf{x}) = \frac{1}{(2\pi)^{k/2}} (\det \Sigma)^{-1/2} \exp\left[-\frac{1}{2}(\mathbf{x} - \mu)^T \Sigma^{-1} (\mathbf{x} - \mu)\right]$$

Multivariate Normal Distribution - Cholesky

In the bivariate case, we had a nice transformation such that we could generate two independent unit normal values and transform them into a sample from an arbitrary bivariate normal distribution.

There is a similar method for the multivariate normal distribution that takes advantage of the Cholesky decomposition of the covariance matrix.

The Cholesky decomposition is defined for a symmetric, positive definite matrix $X$ as $L = \text{Chol}(X)$, where $L$ is a lower triangular matrix such that $LL^T = X$.

Multivariate Normal Distribution - RNG

Let $Z_1, \ldots, Z_k \sim \mathcal{N}(0, 1)$ be independent and $\mathbf{Z} = (Z_1, \ldots, Z_k)^T$; then

$$\mu + \text{Chol}(\Sigma)\,\mathbf{Z} \sim \mathcal{N}_k(\mu, \Sigma)$$

This is offered without proof in the general k-dimensional case, but we can check that it gives the same transformation we started with in the bivariate case, which should justify how we knew to use that particular transformation.

Cholesky and the Bivariate Transformation

We need to find the Cholesky decomposition of $\Sigma$ for the general bivariate case, where

$$\Sigma = \begin{pmatrix} \sigma_X^2 & \rho \sigma_X \sigma_Y \\ \rho \sigma_X \sigma_Y & \sigma_Y^2 \end{pmatrix}$$

We need to solve the following for $a$, $b$, $c$:

$$\begin{pmatrix} a & 0 \\ b & c \end{pmatrix} \begin{pmatrix} a & b \\ 0 & c \end{pmatrix} = \begin{pmatrix} a^2 & ab \\ ab & b^2 + c^2 \end{pmatrix} = \begin{pmatrix} \sigma_X^2 & \rho \sigma_X \sigma_Y \\ \rho \sigma_X \sigma_Y & \sigma_Y^2 \end{pmatrix}$$
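Solving this system gives $a = \sigma_X$, $b = \rho\,\sigma_Y$, and $c = \sigma_Y\sqrt{1-\rho^2}$, so $\mu + \text{Chol}(\Sigma)\,\mathbf{Z}$ reproduces exactly the transformation used at the start of the lecture. The sketch below (NumPy assumed; parameter values are arbitrary illustrations) verifies the factor numerically and uses it to draw multivariate normal samples.

```python
import numpy as np

rng = np.random.default_rng(1)

sigma_x, sigma_y, rho = 2.0, 0.5, 0.75
Sigma = np.array([[sigma_x**2, rho * sigma_x * sigma_y],
                  [rho * sigma_x * sigma_y, sigma_y**2]])

# Lower-triangular Cholesky factor; should match
# [[sigma_x, 0], [rho*sigma_y, sigma_y*sqrt(1 - rho^2)]]
L = np.linalg.cholesky(Sigma)
print(L)
print(np.array([[sigma_x, 0.0],
                [rho * sigma_y, sigma_y * np.sqrt(1 - rho**2)]]))

# Multivariate normal RNG via mu + Chol(Sigma) Z
mu = np.array([1.0, -2.0])
Z = rng.standard_normal((2, 100_000))   # each column is an independent N(0, I) draw
samples = mu[:, None] + L @ Z

print(samples.mean(axis=1))             # approximately mu
print(np.cov(samples))                  # approximately Sigma
```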