The Bivariate and Multivariate Normal Distribution


Probability 2 - Notes 11: The bivariate and multivariate normal distribution

Definition. Two r.v.'s $(X, Y)$ have a bivariate normal distribution $N(\mu_1, \mu_2, \sigma_1^2, \sigma_2^2, \rho)$ if their joint p.d.f. is

$$f_{X,Y}(x,y) = \frac{1}{2\pi\sigma_1\sigma_2\sqrt{1-\rho^2}}\,\exp\left\{-\frac{1}{2(1-\rho^2)}\left[\left(\frac{x-\mu_1}{\sigma_1}\right)^2 - 2\rho\left(\frac{x-\mu_1}{\sigma_1}\right)\left(\frac{y-\mu_2}{\sigma_2}\right) + \left(\frac{y-\mu_2}{\sigma_2}\right)^2\right]\right\} \qquad (1)$$

for all $x, y$. The parameters $\mu_1, \mu_2$ may be any real numbers, $\sigma_1 > 0$, $\sigma_2 > 0$, and $-1 < \rho < 1$.

It is convenient to rewrite (1) in the form

$$f_{X,Y}(x,y) = c\,e^{-\frac{1}{2}Q(x,y)}, \qquad \text{where } c = \frac{1}{2\pi\sigma_1\sigma_2\sqrt{1-\rho^2}}$$

and

$$Q(x,y) = \frac{1}{1-\rho^2}\left[\left(\frac{x-\mu_1}{\sigma_1}\right)^2 - 2\rho\left(\frac{x-\mu_1}{\sigma_1}\right)\left(\frac{y-\mu_2}{\sigma_2}\right) + \left(\frac{y-\mu_2}{\sigma_2}\right)^2\right] \qquad (2)$$

Statement. The marginal distributions of $N(\mu_1, \mu_2, \sigma_1^2, \sigma_2^2, \rho)$ are normal, with r.v.'s $X$ and $Y$ having density functions

$$f_X(x) = \frac{1}{\sqrt{2\pi}\,\sigma_1}\,e^{-\frac{(x-\mu_1)^2}{2\sigma_1^2}}, \qquad f_Y(y) = \frac{1}{\sqrt{2\pi}\,\sigma_2}\,e^{-\frac{(y-\mu_2)^2}{2\sigma_2^2}}.$$

Proof. The expression (2) for $Q(x,y)$ can be rearranged as follows:

$$Q(x,y) = \frac{1}{1-\rho^2}\left[\left(\frac{x-\mu_1}{\sigma_1} - \rho\,\frac{y-\mu_2}{\sigma_2}\right)^2 + (1-\rho^2)\left(\frac{y-\mu_2}{\sigma_2}\right)^2\right] = \frac{(x-a)^2}{(1-\rho^2)\sigma_1^2} + \frac{(y-\mu_2)^2}{\sigma_2^2}, \qquad (3)$$

where $a = a(y) = \mu_1 + \rho\frac{\sigma_1}{\sigma_2}(y-\mu_2)$. Hence

$$f_Y(y) = \int_{-\infty}^{\infty} f_{X,Y}(x,y)\,dx = c\,e^{-\frac{(y-\mu_2)^2}{2\sigma_2^2}}\int_{-\infty}^{\infty} e^{-\frac{(x-a)^2}{2(1-\rho^2)\sigma_1^2}}\,dx = \frac{1}{\sqrt{2\pi}\,\sigma_2}\,e^{-\frac{(y-\mu_2)^2}{2\sigma_2^2}},$$

where the last step makes use of the formula $\int_{-\infty}^{\infty} e^{-\frac{(x-a)^2}{2s^2}}\,dx = \sqrt{2\pi}\,s$ with $s = \sigma_1\sqrt{1-\rho^2}$.

Exercise. Derive the formula for $f_X(x)$.

Corollaries.

1. Since $X \sim N(\mu_1, \sigma_1^2)$ and $Y \sim N(\mu_2, \sigma_2^2)$, we know the meaning of four of the parameters in the definition of the bivariate normal distribution, namely $E(X) = \mu_1$, $\operatorname{Var}(X) = \sigma_1^2$, $E(Y) = \mu_2$, $\operatorname{Var}(Y) = \sigma_2^2$.

2. $X \mid (Y = y)$ is a normal r.v. To verify this statement we substitute the necessary ingredients into the formula defining the relevant conditional density:

$$f_{X\mid Y}(x \mid y) = \frac{f_{X,Y}(x,y)}{f_Y(y)} = \frac{1}{\sqrt{2\pi(1-\rho^2)}\,\sigma_1}\,e^{-\frac{(x-a(y))^2}{2\sigma_1^2(1-\rho^2)}}.$$

In other words, $X \mid (Y = y) \sim N\big(a(y),\,(1-\rho^2)\sigma_1^2\big)$. Hence:

3. $E(X \mid Y = y) = a(y)$ or, equivalently, $E(X \mid Y) = \mu_1 + \rho\frac{\sigma_1}{\sigma_2}(Y - \mu_2)$. In particular, we see that $E(X \mid Y)$ is a linear function of $Y$.

4. $E(XY) = \sigma_1\sigma_2\rho + \mu_1\mu_2$.

Proof.
$$E(XY) = E[E(XY \mid Y)] = E[Y\,E(X \mid Y)] = E\big[Y\big(\mu_1 + \rho\tfrac{\sigma_1}{\sigma_2}(Y - \mu_2)\big)\big] = \mu_1 E(Y) + \rho\tfrac{\sigma_1}{\sigma_2}\big[E(Y^2) - \mu_2 E(Y)\big]$$
$$= \mu_1\mu_2 + \rho\tfrac{\sigma_1}{\sigma_2}\big[E(Y^2) - \mu_2^2\big] = \mu_1\mu_2 + \rho\tfrac{\sigma_1}{\sigma_2}\operatorname{Var}(Y) = \sigma_1\sigma_2\rho + \mu_1\mu_2.$$

5. $\operatorname{Cov}(X,Y) = \sigma_1\sigma_2\rho$. This follows from Corollary 4 and the formula $\operatorname{Cov}(X,Y) = E(XY) - E(X)E(Y)$.

6. $\rho(X,Y) = \rho$. In words: $\rho$ is the correlation coefficient of $X$ and $Y$. This is now obvious from the definition $\rho(X,Y) = \frac{\operatorname{Cov}(X,Y)}{\sqrt{\operatorname{Var}(X)\operatorname{Var}(Y)}}$.

Exercise. Show that $X$ and $Y$ are independent iff $\rho = 0$. (We proved this in the lecture; it is easily seen from either the joint p.d.f. or the joint m.g.f.)

Remark. It is possible to show that the m.g.f. of $X, Y$ is

$$M_{X,Y}(t_1,t_2) = e^{(\mu_1 t_1 + \mu_2 t_2) + \frac{1}{2}(\sigma_1^2 t_1^2 + 2\rho\sigma_1\sigma_2 t_1 t_2 + \sigma_2^2 t_2^2)}$$

Many of the above statements follow from it. (To actually do this is a very useful exercise.)
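The Statement above is easy to check numerically. The following sketch (an illustration added to these notes; the parameter values are arbitrary assumptions) implements the joint p.d.f. (1), integrates it over $x$ with a simple Riemann sum, and compares the result with the claimed marginal density $f_Y(y)$.

```python
import numpy as np

# Illustrative parameter values (assumed for this sketch only)
mu1, mu2, s1, s2, rho = 1.0, -2.0, 2.0, 0.5, 0.8

def f_xy(x, y):
    """Joint p.d.f. (1) of N(mu1, mu2, s1^2, s2^2, rho)."""
    q = (((x - mu1) / s1) ** 2
         - 2 * rho * ((x - mu1) / s1) * ((y - mu2) / s2)
         + ((y - mu2) / s2) ** 2) / (1 - rho ** 2)
    c = 1 / (2 * np.pi * s1 * s2 * np.sqrt(1 - rho ** 2))
    return c * np.exp(-q / 2)

def f_y(y):
    """Marginal density of Y claimed in the Statement."""
    return np.exp(-(y - mu2) ** 2 / (2 * s2 ** 2)) / (np.sqrt(2 * np.pi) * s2)

y0 = -1.7                                              # an arbitrary fixed y
xs = np.linspace(mu1 - 10 * s1, mu1 + 10 * s1, 20001)  # wide grid in x
dx = xs[1] - xs[0]
print((f_xy(xs, y0) * dx).sum())  # numerical integral of f_xy over x
print(f_y(y0))                    # should agree to several decimal places
```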
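Corollaries 3 and 5 can be checked by simulation in the same spirit. The sketch below (again with assumed parameter values) draws a large sample, compares the sample covariance with $\rho\sigma_1\sigma_2$, and uses the fact that $E(X \mid Y)$ is linear in $Y$: the least-squares slope of $X$ on $Y$, namely $\operatorname{Cov}(X,Y)/\operatorname{Var}(Y)$, should be close to $\rho\sigma_1/\sigma_2$.

```python
import numpy as np

# Same illustrative parameters as in the previous sketch
mu1, mu2, s1, s2, rho = 1.0, -2.0, 2.0, 0.5, 0.8
m = np.array([mu1, mu2])
V = np.array([[s1 ** 2, rho * s1 * s2],
              [rho * s1 * s2, s2 ** 2]])     # variance-covariance matrix

rng = np.random.default_rng(0)
x, y = rng.multivariate_normal(m, V, size=200_000).T

C = np.cov(x, y)                         # 2x2 sample covariance matrix
print(C[0, 1], rho * s1 * s2)            # Corollary 5: Cov(X,Y) = rho*s1*s2
print(C[0, 1] / C[1, 1], rho * s1 / s2)  # Corollary 3: slope of E(X|Y) in Y
```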
The Multivariate Normal Distribution. Using vector and matrix notation.

To study the joint normal distributions of more than two r.v.'s, it is convenient to use vectors and matrices. But let us first introduce this notation for the case of two normal r.v.'s $X_1, X_2$. We set

$$\mathbf{X} = \begin{pmatrix} X_1 \\ X_2 \end{pmatrix}, \quad \mathbf{x} = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}, \quad \mathbf{t} = \begin{pmatrix} t_1 \\ t_2 \end{pmatrix}, \quad \mathbf{m} = \begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix}, \quad V = \begin{pmatrix} \sigma_1^2 & \rho\sigma_1\sigma_2 \\ \rho\sigma_1\sigma_2 & \sigma_2^2 \end{pmatrix}$$

Then $\mathbf{m}$ is the vector of means and $V$ is the variance-covariance matrix. Note that $|V| = \sigma_1^2\sigma_2^2(1-\rho^2)$ and

$$V^{-1} = \frac{1}{1-\rho^2}\begin{pmatrix} \frac{1}{\sigma_1^2} & \frac{-\rho}{\sigma_1\sigma_2} \\ \frac{-\rho}{\sigma_1\sigma_2} & \frac{1}{\sigma_2^2} \end{pmatrix}$$

Hence

$$f_{\mathbf{X}}(\mathbf{x}) = \frac{1}{(2\pi)^{2/2}|V|^{1/2}}\,e^{-\frac{1}{2}(\mathbf{x}-\mathbf{m})^T V^{-1}(\mathbf{x}-\mathbf{m})} \quad \text{for all } \mathbf{x}, \qquad \text{and} \qquad M_{\mathbf{X}}(\mathbf{t}) = e^{\mathbf{t}^T\mathbf{m} + \frac{1}{2}\mathbf{t}^T V\mathbf{t}}.$$

We again use matrix and vector notation, but now there are $n$ random variables, so that $\mathbf{X}$, $\mathbf{x}$, $\mathbf{t}$ and $\mathbf{m}$ are now $n$-vectors with $i$th entries $X_i$, $x_i$, $t_i$ and $\mu_i$, and $V$ is the $n \times n$ matrix with $ii$th entry $\sigma_i^2$ and $ij$th entry (for $i \neq j$) $\sigma_{ij}$. Note that $V$ is symmetric, so that $V^T = V$.

The joint p.d.f. is

$$f_{\mathbf{X}}(\mathbf{x}) = \frac{1}{(2\pi)^{n/2}|V|^{1/2}}\,e^{-\frac{1}{2}(\mathbf{x}-\mathbf{m})^T V^{-1}(\mathbf{x}-\mathbf{m})} \quad \text{for all } \mathbf{x}.$$

We say that $\mathbf{X} \sim N(\mathbf{m}, V)$.

We can find the joint m.g.f. quite easily:

$$M_{\mathbf{X}}(\mathbf{t}) = E\Big[e^{\sum_{j=1}^n t_j X_j}\Big] = E\big[e^{\mathbf{t}^T\mathbf{X}}\big] = \int_{-\infty}^{\infty}\cdots\int_{-\infty}^{\infty} \frac{1}{(2\pi)^{n/2}|V|^{1/2}}\,e^{-\frac{1}{2}\left((\mathbf{x}-\mathbf{m})^T V^{-1}(\mathbf{x}-\mathbf{m}) - 2\mathbf{t}^T\mathbf{x}\right)}\,dx_1\cdots dx_n$$

We do the equivalent of completing the square, i.e. we write

$$(\mathbf{x}-\mathbf{m})^T V^{-1}(\mathbf{x}-\mathbf{m}) - 2\mathbf{t}^T\mathbf{x} = (\mathbf{x}-\mathbf{m}-\mathbf{a})^T V^{-1}(\mathbf{x}-\mathbf{m}-\mathbf{a}) + b$$

for a suitable choice of the $n$-vector $\mathbf{a}$ of constants and a constant $b$. Then

$$M_{\mathbf{X}}(\mathbf{t}) = e^{-b/2}\int_{-\infty}^{\infty}\cdots\int_{-\infty}^{\infty} \frac{1}{(2\pi)^{n/2}|V|^{1/2}}\,e^{-\frac{1}{2}(\mathbf{x}-\mathbf{m}-\mathbf{a})^T V^{-1}(\mathbf{x}-\mathbf{m}-\mathbf{a})}\,dx_1\cdots dx_n = e^{-b/2}.$$

We just need to find $\mathbf{a}$ and $b$. Expanding, we have

$$((\mathbf{x}-\mathbf{m})-\mathbf{a})^T V^{-1}((\mathbf{x}-\mathbf{m})-\mathbf{a}) + b = (\mathbf{x}-\mathbf{m})^T V^{-1}(\mathbf{x}-\mathbf{m}) - 2\mathbf{a}^T V^{-1}(\mathbf{x}-\mathbf{m}) + \mathbf{a}^T V^{-1}\mathbf{a} + b$$
$$= (\mathbf{x}-\mathbf{m})^T V^{-1}(\mathbf{x}-\mathbf{m}) - 2\mathbf{a}^T V^{-1}\mathbf{x} + \big[2\mathbf{a}^T V^{-1}\mathbf{m} + \mathbf{a}^T V^{-1}\mathbf{a} + b\big]$$

This has to equal $(\mathbf{x}-\mathbf{m})^T V^{-1}(\mathbf{x}-\mathbf{m}) - 2\mathbf{t}^T\mathbf{x}$ for all $\mathbf{x}$. Hence we need $\mathbf{a}^T V^{-1} = \mathbf{t}^T$ and $b = -\big[2\mathbf{a}^T V^{-1}\mathbf{m} + \mathbf{a}^T V^{-1}\mathbf{a}\big]$. Hence $\mathbf{a} = V\mathbf{t}$ and $b = -\big[2\mathbf{t}^T\mathbf{m} + \mathbf{t}^T V\mathbf{t}\big]$. Therefore

$$M_{\mathbf{X}}(\mathbf{t}) = e^{-b/2} = e^{\mathbf{t}^T\mathbf{m} + \frac{1}{2}\mathbf{t}^T V\mathbf{t}}$$

Results obtained using the m.g.f.

1. Any (non-empty) subset of multivariate normals is multivariate normal. Simply put $t_j = 0$ for all $j$ for which $X_j$ is not in the subset. For example,

$$M_{X_1}(t_1) = M_{X_1,\dots,X_n}(t_1, 0, \dots, 0) = e^{t_1\mu_1 + t_1^2\sigma_1^2/2}.$$

Hence $X_1 \sim N(\mu_1, \sigma_1^2)$. A similar result holds for each $X_i$. This identifies the parameters $\mu_i$ and $\sigma_i^2$ as the mean and variance of $X_i$. Also

$$M_{X_1,X_2}(t_1,t_2) = M_{X_1,\dots,X_n}(t_1,t_2,0,\dots,0) = e^{t_1\mu_1 + t_2\mu_2 + \frac{1}{2}(\sigma_1^2 t_1^2 + 2\sigma_{12} t_1 t_2 + \sigma_2^2 t_2^2)}$$

Hence $X_1$ and $X_2$ have a bivariate normal distribution with $\sigma_{12} = \operatorname{Cov}(X_1, X_2)$. A similar result holds for the joint distribution of $X_i$ and $X_j$ for $i \neq j$. This identifies $V$ as the variance-covariance matrix for $X_1,\dots,X_n$.

2. $\mathbf{X}$ is a vector of independent random variables iff $V$ is diagonal (i.e. all off-diagonal entries are zero, so that $\sigma_{ij} = 0$ for $i \neq j$).

Proof. From Result 1, if the $X$'s are independent then $\sigma_{ij} = \operatorname{Cov}(X_i, X_j) = 0$ for all $i \neq j$, so that $V$ is diagonal.

If $V$ is diagonal then $\mathbf{t}^T V\mathbf{t} = \sum_{j=1}^n \sigma_j^2 t_j^2$ and hence

$$M_{\mathbf{X}}(\mathbf{t}) = e^{\mathbf{t}^T\mathbf{m} + \frac{1}{2}\mathbf{t}^T V\mathbf{t}} = \prod_{j=1}^n e^{\mu_j t_j + \sigma_j^2 t_j^2/2} = \prod_{j=1}^n M_{X_j}(t_j)$$

By the uniqueness of the joint m.g.f., $X_1,\dots,X_n$ are independent.

3. Linearly independent linear functions of multivariate normal random variables are multivariate normal random variables. If $\mathbf{Y} = A\mathbf{X} + \mathbf{b}$, where $A$ is an $n \times n$ non-singular matrix and $\mathbf{b}$ is a (column) $n$-vector of constants, then $\mathbf{Y} \sim N(A\mathbf{m}+\mathbf{b},\,AVA^T)$.

Proof. Use the joint m.g.f.:

$$M_{\mathbf{Y}}(\mathbf{t}) = E\big[e^{\mathbf{t}^T\mathbf{Y}}\big] = E\big[e^{\mathbf{t}^T(A\mathbf{X}+\mathbf{b})}\big] = e^{\mathbf{t}^T\mathbf{b}}\,E\big[e^{(A^T\mathbf{t})^T\mathbf{X}}\big] = e^{\mathbf{t}^T\mathbf{b}}\,M_{\mathbf{X}}(A^T\mathbf{t})$$
$$= e^{\mathbf{t}^T\mathbf{b}}\,e^{(A^T\mathbf{t})^T\mathbf{m} + \frac{1}{2}(A^T\mathbf{t})^T V (A^T\mathbf{t})} = e^{\mathbf{t}^T(A\mathbf{m}+\mathbf{b}) + \frac{1}{2}\mathbf{t}^T(AVA^T)\mathbf{t}}$$

This is just the m.g.f. for the multivariate normal distribution with vector of means $A\mathbf{m}+\mathbf{b}$ and variance-covariance matrix $AVA^T$. Hence, from the uniqueness of the joint m.g.f., $\mathbf{Y} \sim N(A\mathbf{m}+\mathbf{b},\,AVA^T)$. Note that, by Result 1, any subset of the $Y$'s is multivariate normal.

NOTE. The results concerning the vector of means and the variance-covariance matrix for linear functions of random variables hold regardless of the joint distribution of $X_1,\dots,X_n$.
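As in the bivariate case, these results can be illustrated numerically. The sketch below (added to these notes, with assumed values for $\mathbf{m}$, $V$ and $\mathbf{t}$) compares a Monte Carlo estimate of $E[e^{\mathbf{t}^T\mathbf{X}}]$ with the closed form $e^{\mathbf{t}^T\mathbf{m} + \frac{1}{2}\mathbf{t}^T V\mathbf{t}}$ derived above.

```python
import numpy as np

rng = np.random.default_rng(1)

# Assumed illustrative parameters for a 2-dimensional example
m = np.array([0.5, -0.3])
V = np.array([[1.0, 0.4],
              [0.4, 0.8]])           # symmetric positive definite
t = np.array([0.3, -0.2])            # fixed argument of the m.g.f.

X = rng.multivariate_normal(m, V, size=1_000_000)

mc = np.exp(X @ t).mean()             # Monte Carlo estimate of E[exp(t'X)]
cf = np.exp(t @ m + 0.5 * t @ V @ t)  # closed-form m.g.f.
print(mc, cf)                         # the two should be close
```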
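Result 3 can be checked the same way. This final sketch (illustrative values again, chosen so that $A$ is non-singular and $V$ is positive definite) draws from $N(\mathbf{m}, V)$, forms $\mathbf{Y} = A\mathbf{X} + \mathbf{b}$, and compares the sample mean and covariance of $\mathbf{Y}$ with $A\mathbf{m}+\mathbf{b}$ and $AVA^T$.

```python
import numpy as np

rng = np.random.default_rng(2)

# Assumed illustrative parameters for a 3-dimensional example
m = np.array([0.0, 1.0, -1.0])
V = np.array([[2.0, 0.6, 0.3],
              [0.6, 1.0, 0.2],
              [0.3, 0.2, 0.5]])      # symmetric positive definite
A = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 3.0]])      # non-singular (det = 5)
b = np.array([1.0, 0.0, -2.0])

X = rng.multivariate_normal(m, V, size=500_000)
Y = X @ A.T + b                      # Y = A X + b, applied to each sample row

print(Y.mean(axis=0))                # should be close to A m + b
print(A @ m + b)
print(np.cov(Y, rowvar=False))       # should be close to A V A^T
print(A @ V @ A.T)
```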
