The Multivariate Normal Distribution
STA 302 Fall 2017
(See the last slide for copyright information.)

Overview
1. Moment-generating Functions
2. Definition
3. Properties
4. $\chi^2$ and $t$ distributions

Joint moment-generating function
Of a $p$-dimensional random vector $\mathbf{x}$:
$$M_{\mathbf{x}}(\mathbf{t}) = E\left(e^{\mathbf{t}'\mathbf{x}}\right)$$
For example,
$$M_{(x_1,x_2,x_3)}(t_1,t_2,t_3) = E\left(e^{x_1 t_1 + x_2 t_2 + x_3 t_3}\right)$$
Just write $M(\mathbf{t})$ if there is no ambiguity. Section 4.3 of Linear Models in Statistics has some material on moment-generating functions (optional).

Uniqueness (proof omitted)
Joint moment-generating functions correspond uniquely to joint probability distributions. $M(\mathbf{t})$ is a function of $F(\mathbf{x})$:

Step one: $f(\mathbf{x}) = \frac{\partial}{\partial x_1} \cdots \frac{\partial}{\partial x_p} F(\mathbf{x})$.
For example, $f(x_1,x_2) = \frac{\partial}{\partial x_1}\frac{\partial}{\partial x_2} \int_{-\infty}^{x_2}\int_{-\infty}^{x_1} f(y_1,y_2)\, dy_1\, dy_2$.

Step two: $M(\mathbf{t}) = \int \cdots \int e^{\mathbf{t}'\mathbf{x}} f(\mathbf{x})\, d\mathbf{x}$.

So we could write $M(\mathbf{t}) = g\left(F(\mathbf{x})\right)$. Uniqueness says the function $g$ is one-to-one, so that $F(\mathbf{x}) = g^{-1}\left(M(\mathbf{t})\right)$.

$g^{-1}\left(M(\mathbf{t})\right) = F(\mathbf{x})$: a two-variable example
$$g^{-1}\left(\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} e^{x_1 t_1 + x_2 t_2} f(x_1,x_2)\, dx_1\, dx_2\right) = \int_{-\infty}^{x_2}\int_{-\infty}^{x_1} f(y_1,y_2)\, dy_1\, dy_2$$

Theorem
Two random vectors $\mathbf{x}_1$ and $\mathbf{x}_2$ are independent if and only if the moment-generating function of their joint distribution is the product of their moment-generating functions.

Proof
That independence implies the MGFs factor is left as an exercise. For the converse, suppose the moment-generating function of the joint distribution factors:
$$\begin{aligned}
M_{x_1,x_2}(t_1,t_2) &= M_{x_1}(t_1)\, M_{x_2}(t_2) \\
&= \int_{-\infty}^{\infty} e^{x_1 t_1} f_{x_1}(x_1)\, dx_1 \int_{-\infty}^{\infty} e^{x_2 t_2} f_{x_2}(x_2)\, dx_2 \\
&= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} e^{x_1 t_1} e^{x_2 t_2} f_{x_1}(x_1) f_{x_2}(x_2)\, dx_1\, dx_2 \\
&= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} e^{x_1 t_1 + x_2 t_2} f_{x_1}(x_1) f_{x_2}(x_2)\, dx_1\, dx_2
\end{aligned}$$

Proof continued
We have $M_{x_1,x_2}(t_1,t_2) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} e^{x_1 t_1 + x_2 t_2} f_{x_1}(x_1) f_{x_2}(x_2)\, dx_1\, dx_2$. Using $F(\mathbf{x}) = g^{-1}\left(M(\mathbf{t})\right)$,
$$\begin{aligned}
F(x_1,x_2) &= g^{-1}\left(\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} e^{x_1 t_1 + x_2 t_2} f_{x_1}(x_1) f_{x_2}(x_2)\, dx_1\, dx_2\right) \\
&= \int_{-\infty}^{x_2}\int_{-\infty}^{x_1} f_{x_1}(y_1) f_{x_2}(y_2)\, dy_1\, dy_2 \\
&= \int_{-\infty}^{x_2} f_{x_2}(y_2)\left(\int_{-\infty}^{x_1} f_{x_1}(y_1)\, dy_1\right) dy_2 \\
&= \int_{-\infty}^{x_2} f_{x_2}(y_2)\, F_{x_1}(x_1)\, dy_2 \\
&= F_{x_1}(x_1) \int_{-\infty}^{x_2} f_{x_2}(y_2)\, dy_2 \\
&= F_{x_1}(x_1)\, F_{x_2}(x_2),
\end{aligned}$$
so that $x_1$ and $x_2$ are independent. ∎

A helpful distinction
If $\mathbf{x}_1$ and $\mathbf{x}_2$ are independent, then $M_{\mathbf{x}_1+\mathbf{x}_2}(\mathbf{t}) = M_{\mathbf{x}_1}(\mathbf{t})\, M_{\mathbf{x}_2}(\mathbf{t})$.
$\mathbf{x}_1$ and $\mathbf{x}_2$ are independent if and only if $M_{\mathbf{x}_1,\mathbf{x}_2}(\mathbf{t}_1,\mathbf{t}_2) = M_{\mathbf{x}_1}(\mathbf{t}_1)\, M_{\mathbf{x}_2}(\mathbf{t}_2)$.

Theorem: Functions of independent random vectors are independent
We show that if $\mathbf{x}_1$ and $\mathbf{x}_2$ are independent, then $\mathbf{y}_1 = g_1(\mathbf{x}_1)$ and $\mathbf{y}_2 = g_2(\mathbf{x}_2)$ are independent.
Let $\mathbf{y} = \begin{pmatrix} \mathbf{y}_1 \\ \mathbf{y}_2 \end{pmatrix} = \begin{pmatrix} g_1(\mathbf{x}_1) \\ g_2(\mathbf{x}_2) \end{pmatrix}$ and $\mathbf{t} = \begin{pmatrix} \mathbf{t}_1 \\ \mathbf{t}_2 \end{pmatrix}$. Then
$$\begin{aligned}
M_{\mathbf{y}}(\mathbf{t}) &= E\left(e^{\mathbf{t}'\mathbf{y}}\right) = E\left(e^{\mathbf{t}_1'\mathbf{y}_1 + \mathbf{t}_2'\mathbf{y}_2}\right) = E\left(e^{\mathbf{t}_1'\mathbf{y}_1}\, e^{\mathbf{t}_2'\mathbf{y}_2}\right) \\
&= E\left(e^{\mathbf{t}_1' g_1(\mathbf{x}_1)}\, e^{\mathbf{t}_2' g_2(\mathbf{x}_2)}\right) \\
&= \int\!\!\int e^{\mathbf{t}_1' g_1(\mathbf{x}_1)}\, e^{\mathbf{t}_2' g_2(\mathbf{x}_2)} f_{\mathbf{x}_1}(\mathbf{x}_1) f_{\mathbf{x}_2}(\mathbf{x}_2)\, d\mathbf{x}_1\, d\mathbf{x}_2 \\
&= \int e^{\mathbf{t}_2' g_2(\mathbf{x}_2)} f_{\mathbf{x}_2}(\mathbf{x}_2) \left(\int e^{\mathbf{t}_1' g_1(\mathbf{x}_1)} f_{\mathbf{x}_1}(\mathbf{x}_1)\, d\mathbf{x}_1\right) d\mathbf{x}_2 \\
&= \int e^{\mathbf{t}_2' g_2(\mathbf{x}_2)} f_{\mathbf{x}_2}(\mathbf{x}_2)\, M_{g_1(\mathbf{x}_1)}(\mathbf{t}_1)\, d\mathbf{x}_2 \\
&= M_{g_1(\mathbf{x}_1)}(\mathbf{t}_1)\, M_{g_2(\mathbf{x}_2)}(\mathbf{t}_2) = M_{\mathbf{y}_1}(\mathbf{t}_1)\, M_{\mathbf{y}_2}(\mathbf{t}_2),
\end{aligned}$$
so $\mathbf{y}_1$ and $\mathbf{y}_2$ are independent. ∎

$M_{A\mathbf{x}}(\mathbf{t}) = M_{\mathbf{x}}(A'\mathbf{t})$
Analogue of $M_{ax}(t) = M_x(at)$:
$$M_{A\mathbf{x}}(\mathbf{t}) = E\left(e^{\mathbf{t}'A\mathbf{x}}\right) = E\left(e^{(A'\mathbf{t})'\mathbf{x}}\right) = M_{\mathbf{x}}(A'\mathbf{t})$$
Note that $\mathbf{t}$ is the same length as $\mathbf{y} = A\mathbf{x}$: the number of rows in $A$.
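The identity $M_{A\mathbf{x}}(\mathbf{t}) = M_{\mathbf{x}}(A'\mathbf{t})$ is easy to check numerically. The sketch below (not part of the original slides; the matrix $A$ and vector $\mathbf{t}$ are made-up example values) takes $\mathbf{x} = \mathbf{z}$ with independent standard normal coordinates, so that $M_{\mathbf{z}}(\mathbf{s}) = e^{\frac{1}{2}\mathbf{s}'\mathbf{s}}$ is known in closed form, and compares a Monte Carlo estimate of $M_{A\mathbf{z}}(\mathbf{t})$ against $M_{\mathbf{z}}(A'\mathbf{t})$:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical example values: a 2x3 matrix A and a 2-vector t
# (t must have the same length as Az: the number of rows in A).
A = np.array([[1.0, 0.5, 0.0],
              [0.0, 1.0, -0.5]])
t = np.array([0.3, -0.2])

# Monte Carlo estimate of M_{Az}(t) = E(exp(t'Az)),
# with z a vector of three independent standard normals.
z = rng.standard_normal((1_000_000, 3))
mc = np.mean(np.exp((z @ A.T) @ t))

# Closed form via the identity: M_{Az}(t) = M_z(A't) = exp((A't)'(A't)/2)
s = A.T @ t
closed = np.exp(s @ s / 2.0)

print(mc, closed)  # the two estimates should agree to a few decimals
```

With a million draws the Monte Carlo estimate typically lands within about 0.001 of the closed-form value.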
$M_{\mathbf{x}+\mathbf{c}}(\mathbf{t}) = e^{\mathbf{t}'\mathbf{c}} M_{\mathbf{x}}(\mathbf{t})$
Analogue of $M_{x+c}(t) = e^{ct} M_x(t)$:
$$M_{\mathbf{x}+\mathbf{c}}(\mathbf{t}) = E\left(e^{\mathbf{t}'(\mathbf{x}+\mathbf{c})}\right) = E\left(e^{\mathbf{t}'\mathbf{x} + \mathbf{t}'\mathbf{c}}\right) = e^{\mathbf{t}'\mathbf{c}}\, E\left(e^{\mathbf{t}'\mathbf{x}}\right) = e^{\mathbf{t}'\mathbf{c}}\, M_{\mathbf{x}}(\mathbf{t})$$

Distributions may be defined in terms of moment-generating functions
Build up the multivariate normal from univariate normals. If $y \sim N(\mu, \sigma^2)$, then $M_y(t) = e^{\mu t + \frac{1}{2}\sigma^2 t^2}$. Moment-generating functions correspond uniquely to probability distributions, so define a normal random variable with expected value $\mu$ and variance $\sigma^2$ as a random variable with moment-generating function $e^{\mu t + \frac{1}{2}\sigma^2 t^2}$. This has one surprising consequence…

Degenerate random variables
A degenerate random variable has all the probability concentrated at a single value, say $Pr\{y = y_0\} = 1$. Then
$$M_y(t) = E\left(e^{yt}\right) = \sum_y e^{yt} p(y) = e^{y_0 t} \cdot p(y_0) = e^{y_0 t} \cdot 1 = e^{y_0 t}$$

If $Pr\{y = y_0\} = 1$, then $M_y(t) = e^{y_0 t}$. This is of the form $e^{\mu t + \frac{1}{2}\sigma^2 t^2}$ with $\mu = y_0$ and $\sigma^2 = 0$. So $y \sim N(y_0, 0)$. That is, degenerate random variables are "normal" with variance zero. Call them singular normals. This will be surprisingly handy later.

Independent standard normals
Let $z_1, \ldots, z_p \stackrel{i.i.d.}{\sim} N(0,1)$ and
$$\mathbf{z} = \begin{pmatrix} z_1 \\ \vdots \\ z_p \end{pmatrix}$$
Then $E(\mathbf{z}) = \mathbf{0}$ and $cov(\mathbf{z}) = I_p$.

Moment-generating function of $\mathbf{z}$
Using $M_y(t) = e^{\mu t + \frac{1}{2}\sigma^2 t^2}$ with $\mu = 0$ and $\sigma^2 = 1$,
$$M_{\mathbf{z}}(\mathbf{t}) = \prod_{j=1}^p M_{z_j}(t_j) = \prod_{j=1}^p e^{\frac{1}{2}t_j^2} = e^{\frac{1}{2}\sum_{j=1}^p t_j^2} = e^{\frac{1}{2}\mathbf{t}'\mathbf{t}}$$

Transform $\mathbf{z}$ to get a general multivariate normal
Remember: $A$ non-negative definite means $\mathbf{v}'A\mathbf{v} \geq 0$ for all $\mathbf{v}$. Let $\Sigma$ be a $p \times p$ symmetric non-negative definite matrix and $\boldsymbol{\mu} \in \mathbb{R}^p$. Let $\mathbf{y} = \Sigma^{1/2}\mathbf{z} + \boldsymbol{\mu}$. The elements of $\mathbf{y}$ are linear combinations of independent standard normals. Linear combinations of normals should be normal, and $\mathbf{y}$ has a multivariate distribution. We'd like to call $\mathbf{y}$ a multivariate normal.

Moment-generating function of $\mathbf{y} = \Sigma^{1/2}\mathbf{z} + \boldsymbol{\mu}$
Remember: $M_{A\mathbf{x}}(\mathbf{t}) = M_{\mathbf{x}}(A'\mathbf{t})$, $M_{\mathbf{x}+\mathbf{c}}(\mathbf{t}) = e^{\mathbf{t}'\mathbf{c}} M_{\mathbf{x}}(\mathbf{t})$, and $M_{\mathbf{z}}(\mathbf{t}) = e^{\frac{1}{2}\mathbf{t}'\mathbf{t}}$.
$$\begin{aligned}
M_{\mathbf{y}}(\mathbf{t}) &= M_{\Sigma^{1/2}\mathbf{z}+\boldsymbol{\mu}}(\mathbf{t}) \\
&= e^{\mathbf{t}'\boldsymbol{\mu}}\, M_{\Sigma^{1/2}\mathbf{z}}(\mathbf{t}) \\
&= e^{\mathbf{t}'\boldsymbol{\mu}}\, M_{\mathbf{z}}\left(\Sigma^{1/2}\mathbf{t}\right) \\
&= e^{\mathbf{t}'\boldsymbol{\mu}}\, e^{\frac{1}{2}(\Sigma^{1/2}\mathbf{t})'(\Sigma^{1/2}\mathbf{t})} \\
&= e^{\mathbf{t}'\boldsymbol{\mu}}\, e^{\frac{1}{2}\mathbf{t}'\Sigma^{1/2}\Sigma^{1/2}\mathbf{t}} \\
&= e^{\mathbf{t}'\boldsymbol{\mu} + \frac{1}{2}\mathbf{t}'\Sigma\mathbf{t}},
\end{aligned}$$
using the symmetry of $\Sigma^{1/2}$. So define a multivariate normal random variable $\mathbf{y}$ as one with moment-generating function $M_{\mathbf{y}}(\mathbf{t}) = e^{\mathbf{t}'\boldsymbol{\mu} + \frac{1}{2}\mathbf{t}'\Sigma\mathbf{t}}$.

Compare univariate and multivariate normal moment-generating functions
Univariate: $M_y(t) = e^{\mu t + \frac{1}{2}\sigma^2 t^2}$. Multivariate: $M_{\mathbf{y}}(\mathbf{t}) = e^{\mathbf{t}'\boldsymbol{\mu} + \frac{1}{2}\mathbf{t}'\Sigma\mathbf{t}}$. So the univariate normal is a special case of the multivariate normal with $p = 1$.

Mean and covariance matrix
For a univariate normal, $E(y) = \mu$ and $Var(y) = \sigma^2$. Recall $\mathbf{y} = \Sigma^{1/2}\mathbf{z} + \boldsymbol{\mu}$. Then
$$E(\mathbf{y}) = \boldsymbol{\mu}, \qquad cov(\mathbf{y}) = \Sigma^{1/2}\, cov(\mathbf{z})\, \Sigma^{1/2\,\prime} = \Sigma^{1/2}\, I\, \Sigma^{1/2} = \Sigma.$$
We will say $\mathbf{y}$ is multivariate normal with expected value $\boldsymbol{\mu}$ and variance-covariance matrix $\Sigma$, and write $\mathbf{y} \sim N_p(\boldsymbol{\mu}, \Sigma)$. Note that because $M_{\mathbf{y}}(\mathbf{t}) = e^{\mathbf{t}'\boldsymbol{\mu} + \frac{1}{2}\mathbf{t}'\Sigma\mathbf{t}}$, $\boldsymbol{\mu}$ and $\Sigma$ completely determine the distribution.
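The construction $\mathbf{y} = \Sigma^{1/2}\mathbf{z} + \boldsymbol{\mu}$ can be carried out directly in code. The sketch below (not from the slides; $\boldsymbol{\mu}$ and $\Sigma$ are made-up example values) computes the symmetric square root $\Sigma^{1/2}$ from the spectral decomposition of $\Sigma$, simulates draws of $\mathbf{y}$, and checks that the sample mean and sample covariance matrix come out near $\boldsymbol{\mu}$ and $\Sigma$:

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical example values with p = 2
mu = np.array([1.0, -2.0])
Sigma = np.array([[2.0, 0.8],
                  [0.8, 1.0]])   # symmetric, positive definite

# Symmetric square root Sigma^{1/2} via the spectral decomposition:
# Sigma = V diag(lambda) V', so Sigma^{1/2} = V diag(sqrt(lambda)) V'.
vals, vecs = np.linalg.eigh(Sigma)
root = vecs @ np.diag(np.sqrt(vals)) @ vecs.T   # root @ root == Sigma

# y = Sigma^{1/2} z + mu, with z a vector of independent standard normals;
# each row of y below is one simulated draw.
z = rng.standard_normal((500_000, 2))
y = z @ root.T + mu

print(y.mean(axis=0))              # should be close to mu
print(np.cov(y, rowvar=False))     # should be close to Sigma
```

The eigendecomposition route works for any symmetric non-negative definite $\Sigma$; a Cholesky factor would also do for the strictly positive definite case, since the MGF argument only needs some matrix $A$ with $AA' = \Sigma$.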
Probability density function of $\mathbf{y} \sim N_p(\boldsymbol{\mu}, \Sigma)$
Remember, $\Sigma$ need only be positive semi-definite. It is easy to write down the density of $\mathbf{z} \sim N_p(\mathbf{0}, I)$ as a product of standard normals. If $\Sigma$ is strictly positive definite (and not otherwise), the density of $\mathbf{y} = \Sigma^{1/2}\mathbf{z} + \boldsymbol{\mu}$ can be obtained using the Jacobian Theorem as
$$f(\mathbf{y}) = \frac{1}{|\Sigma|^{\frac{1}{2}} (2\pi)^{\frac{p}{2}}} \exp\left\{-\frac{1}{2}(\mathbf{y}-\boldsymbol{\mu})'\,\Sigma^{-1}(\mathbf{y}-\boldsymbol{\mu})\right\}$$
This is usually how the multivariate normal is defined.

$\Sigma$ positive definite?
Positive definite means that for any non-zero $p \times 1$ vector $\mathbf{a}$, we have $\mathbf{a}'\Sigma\mathbf{a} > 0$. Since the one-dimensional random variable $w = \sum_{i=1}^p a_i y_i$ may be written as $w = \mathbf{a}'\mathbf{y}$ and $Var(w) = cov(\mathbf{a}'\mathbf{y}) = \mathbf{a}'\Sigma\mathbf{a}$, it is natural to require that $\Sigma$ be positive definite. All it means is that every non-zero linear combination of the elements of $\mathbf{y}$ has a positive variance.
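The density formula can be sanity-checked in code: with $\boldsymbol{\mu} = \mathbf{0}$ and $\Sigma = I$ it should reduce to the product of $p$ standard normal densities, exactly as the slide says for $\mathbf{z}$. A minimal sketch (the helper `mvn_density` and the test point are illustrative, not from the slides):

```python
import numpy as np

def mvn_density(y, mu, Sigma):
    """Evaluate the N_p(mu, Sigma) density at y, for strictly
    positive definite Sigma, straight from the formula."""
    p = len(mu)
    diff = y - mu
    quad = diff @ np.linalg.inv(Sigma) @ diff
    return np.exp(-0.5 * quad) / (np.sqrt(np.linalg.det(Sigma))
                                  * (2 * np.pi) ** (p / 2))

# Sanity check: with mu = 0 and Sigma = I, the general formula should
# equal the product of p independent standard normal densities.
y = np.array([0.5, -1.0, 2.0])
general = mvn_density(y, np.zeros(3), np.eye(3))
product = np.prod(np.exp(-0.5 * y**2) / np.sqrt(2 * np.pi))
print(general, product)   # the two values should match
```

Note that `mvn_density` would fail (singular matrix, or a meaningless number) for a merely semi-definite $\Sigma$, which mirrors the "and not otherwise" caveat above: a singular normal has no density on $\mathbb{R}^p$.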