4.3 Sampling from a Multivariate Normal Distribution and Maximum Likelihood Estimation


(a) The Multivariate Normal Likelihood

Let $X_1, X_2, \ldots, X_n \sim N_p(\mu, \Sigma)$ be a random sample from a multivariate normal population. Then, the joint density function of $X_1, X_2, \ldots, X_n$ is

$$
\begin{aligned}
f(x_1, \ldots, x_n)
&= \prod_{j=1}^{n} \frac{1}{(2\pi)^{p/2} |\Sigma|^{1/2}} \exp\left[ -\frac{(x_j - \mu)^t \Sigma^{-1} (x_j - \mu)}{2} \right] \\
&= \frac{1}{(2\pi)^{np/2} |\Sigma|^{n/2}} \exp\left[ -\frac{\sum_{j=1}^{n} (x_j - \mu)^t \Sigma^{-1} (x_j - \mu)}{2} \right].
\end{aligned}
$$

The following results will be used to obtain the maximum likelihood estimates of $\mu$ and $\Sigma$.

Result: Let $A$ be a $p \times p$ symmetric matrix and $x$ be a $p \times 1$ vector. Then,

(a) $x^t A x = \mathrm{tr}(x^t A x) = \mathrm{tr}(A x x^t)$, where $A = [a_{ij}]$ and $\mathrm{tr}(A) = \sum_{i=1}^{p} a_{ii}$.

(b) $\mathrm{tr}(A) = \sum_{i=1}^{p} \lambda_i$, where $\lambda_i$ are the eigenvalues of $A$.
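Both parts of this result are easy to check numerically. The following is a minimal sketch, assuming NumPy is available; the seed, the dimension p = 4, and the variable names are arbitrary choices made for illustration and are not part of the notes.

```python
import numpy as np

rng = np.random.default_rng(0)   # arbitrary seed, for reproducibility only
p = 4                            # illustrative dimension

G = rng.standard_normal((p, p))
A = (G + G.T) / 2                # a symmetric p x p matrix
x = rng.standard_normal(p)       # a p x 1 vector

quad_form = x @ A @ x                        # x^t A x
trace_form = np.trace(A @ np.outer(x, x))    # tr(A x x^t)
eigenvalue_sum = np.linalg.eigvalsh(A).sum() # sum of the eigenvalues of A

print(np.isclose(quad_form, trace_form))        # part (a)
print(np.isclose(np.trace(A), eigenvalue_sum))  # part (b)
```

`np.linalg.eigvalsh` is used because A is symmetric, so its eigenvalues are real.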

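Based on the result, we have

$$
\begin{aligned}
\sum_{j=1}^{n} (x_j - \mu)^t \Sigma^{-1} (x_j - \mu)
&= \sum_{j=1}^{n} \mathrm{tr}\left[ (x_j - \mu)^t \Sigma^{-1} (x_j - \mu) \right] \\
&= \sum_{j=1}^{n} \mathrm{tr}\left[ \Sigma^{-1} (x_j - \mu)(x_j - \mu)^t \right] \\
&= \mathrm{tr}\left[ \Sigma^{-1} \sum_{j=1}^{n} (x_j - \mu)(x_j - \mu)^t \right] \qquad \text{(2A.12, p. 98)}
\end{aligned}
$$

Then,

$$
\begin{aligned}
\sum_{j=1}^{n} (x_j - \mu)(x_j - \mu)^t
&= \sum_{j=1}^{n} (x_j - \bar{x} + \bar{x} - \mu)(x_j - \bar{x} + \bar{x} - \mu)^t \\
&= \sum_{j=1}^{n} (x_j - \bar{x})(x_j - \bar{x})^t + \sum_{j=1}^{n} (\bar{x} - \mu)(\bar{x} - \mu)^t \\
&= \sum_{j=1}^{n} (x_j - \bar{x})(x_j - \bar{x})^t + n (\bar{x} - \mu)(\bar{x} - \mu)^t,
\end{aligned}
$$

where $\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$. The cross-product terms vanish because $\sum_{j=1}^{n} (x_j - \bar{x}) = 0$. Further,

$$
\begin{aligned}
&\mathrm{tr}\left[ \Sigma^{-1} \left( \sum_{j=1}^{n} (x_j - \bar{x})(x_j - \bar{x})^t + n (\bar{x} - \mu)(\bar{x} - \mu)^t \right) \right] \\
&\quad= \mathrm{tr}\left[ \Sigma^{-1} \sum_{j=1}^{n} (x_j - \bar{x})(x_j - \bar{x})^t \right] + n\, \mathrm{tr}\left[ \Sigma^{-1} (\bar{x} - \mu)(\bar{x} - \mu)^t \right] \\
&\quad= \mathrm{tr}\left[ \Sigma^{-1} \sum_{j=1}^{n} (x_j - \bar{x})(x_j - \bar{x})^t \right] + n (\bar{x} - \mu)^t \Sigma^{-1} (\bar{x} - \mu).
\end{aligned}
$$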
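The decomposition of the exponent can be verified on simulated data. This is a sketch under stated assumptions: the function names `exponent_direct` and `exponent_decomposed`, the sample size, and the test values of $\mu$ and $\Sigma^{-1}$ are hypothetical and chosen only for illustration.

```python
import numpy as np

def exponent_direct(X, mu, Sigma_inv):
    """sum_j (x_j - mu)^t Sigma^{-1} (x_j - mu), summed term by term."""
    d = X - mu                                   # row j holds x_j - mu
    return float(np.einsum("ji,ik,jk->", d, Sigma_inv, d))

def exponent_decomposed(X, mu, Sigma_inv):
    """tr[Sigma^{-1} sum_j (x_j - xbar)(x_j - xbar)^t] + n (xbar - mu)^t Sigma^{-1} (xbar - mu)."""
    n = X.shape[0]
    xbar = X.mean(axis=0)
    centered = X - xbar
    scatter = centered.T @ centered              # sum_j (x_j - xbar)(x_j - xbar)^t
    shift = xbar - mu
    return float(np.trace(Sigma_inv @ scatter) + n * shift @ Sigma_inv @ shift)

rng = np.random.default_rng(0)                   # arbitrary seed
n, p = 25, 3
X = rng.standard_normal((n, p))                  # any data set works for this identity
mu = np.array([0.3, -0.1, 0.7])
G = rng.standard_normal((p, p))
Sigma_inv = np.linalg.inv(G @ G.T + p * np.eye(p))  # inverse of a positive definite matrix

print(np.isclose(exponent_direct(X, mu, Sigma_inv),
                 exponent_decomposed(X, mu, Sigma_inv)))
```

The identity is purely algebraic, so it holds for any data matrix, not only for normal samples.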

Therefore, the likelihood function of $X_1, X_2, \ldots, X_n$ can be simplified to

$$
\begin{aligned}
L(\mu, \Sigma) = f(x_1, \ldots, x_n)
&= \frac{1}{(2\pi)^{np/2} |\Sigma|^{n/2}} \exp\left[ -\frac{\sum_{j=1}^{n} (x_j - \mu)^t \Sigma^{-1} (x_j - \mu)}{2} \right] \\
&= \frac{1}{(2\pi)^{np/2} |\Sigma|^{n/2}} \exp\left\{ -\frac{1}{2} \left( \mathrm{tr}\left[ \Sigma^{-1} \sum_{j=1}^{n} (x_j - \bar{x})(x_j - \bar{x})^t \right] + n (\bar{x} - \mu)^t \Sigma^{-1} (\bar{x} - \mu) \right) \right\}.
\end{aligned}
$$

(b) Maximum Likelihood Estimation of $\mu$ and $\Sigma$

To obtain the maximum likelihood estimates, the following result will be used.

Result: Given a $p \times p$ symmetric positive definite matrix $B$ and a scalar $b > 0$, it follows that

$$
\frac{1}{|\Sigma|^{b}} \exp\left[ -\frac{\mathrm{tr}(\Sigma^{-1} B)}{2} \right] \le \frac{1}{|B|^{b}} (2b)^{pb} \exp(-bp)
$$

for all positive definite $\Sigma$, with equality holding only for $\Sigma = \frac{1}{2b} B$.

Important Result (MLE of $\mu$ and $\Sigma$)

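Let $X_1, X_2, \ldots, X_n \sim N_p(\mu, \Sigma)$ be a random sample from a multivariate normal population. Then,

$$
\hat{\mu} = \bar{X} \quad \text{and} \quad \hat{\Sigma} = \frac{1}{n} \sum_{j=1}^{n} (X_j - \bar{X})(X_j - \bar{X})^t = \frac{n-1}{n} S
$$

are the maximum likelihood estimators of $\mu$ and $\Sigma$, respectively, where

$$
S = \frac{1}{n-1} \sum_{j=1}^{n} (X_j - \bar{X})(X_j - \bar{X})^t
$$

is an unbiased estimator of $\Sigma$. Their observed values, $\bar{x}$ and $\frac{1}{n} \sum_{j=1}^{n} (x_j - \bar{x})(x_j - \bar{x})^t$, are called the maximum likelihood estimates of $\mu$ and $\Sigma$.

[proof:]

For fixed $\Sigma$, the value $\hat{\mu}$ maximizing the function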

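$$
\frac{1}{(2\pi)^{np/2} |\Sigma|^{n/2}} \exp\left\{ -\frac{1}{2} \left( \mathrm{tr}\left[ \Sigma^{-1} \sum_{j=1}^{n} (x_j - \bar{x})(x_j - \bar{x})^t \right] + n (\bar{x} - \mu)^t \Sigma^{-1} (\bar{x} - \mu) \right) \right\}
$$

also minimizes the function

$$
\mathrm{tr}\left[ \Sigma^{-1} \sum_{j=1}^{n} (x_j - \bar{x})(x_j - \bar{x})^t \right] + n (\bar{x} - \mu)^t \Sigma^{-1} (\bar{x} - \mu).
$$

Since $\Sigma^{-1}$ is positive definite, $n (\bar{x} - \mu)^t \Sigma^{-1} (\bar{x} - \mu) \ge 0$, with equality only when $\mu = \bar{x}$. Hence, at $\mu = \bar{x}$, the function

$$
\mathrm{tr}\left[ \Sigma^{-1} \sum_{j=1}^{n} (x_j - \bar{x})(x_j - \bar{x})^t \right] + n (\bar{x} - \mu)^t \Sigma^{-1} (\bar{x} - \mu)
= \mathrm{tr}\left[ \Sigma^{-1} \sum_{j=1}^{n} (x_j - \bar{x})(x_j - \bar{x})^t \right]
$$

achieves its minimum over $\mu$, so $\hat{\mu} = \bar{x}$. It remains to find $\hat{\Sigma}$ maximizing

$$
L(\hat{\mu}, \Sigma) = \frac{1}{(2\pi)^{np/2} |\Sigma|^{n/2}} \exp\left\{ -\frac{1}{2} \mathrm{tr}\left[ \Sigma^{-1} \sum_{j=1}^{n} (x_j - \bar{x})(x_j - \bar{x})^t \right] \right\}.
$$

By the previous result with $b = \frac{n}{2}$ and $B = \sum_{j=1}^{n} (x_j - \bar{x})(x_j - \bar{x})^t$, the maximum occurs at

$$
\hat{\Sigma} = \frac{1}{n} \sum_{j=1}^{n} (x_j - \bar{x})(x_j - \bar{x})^t.
$$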
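The estimates derived above are straightforward to compute. The following is a minimal sketch, assuming NumPy; the function names `mvn_mle` and `log_likelihood`, the seed, and the "true" parameters used to simulate data are hypothetical and serve only as an illustration.

```python
import numpy as np

def mvn_mle(X):
    """Return (mu_hat, Sigma_hat) for an n x p data matrix X (rows are observations)."""
    n = X.shape[0]
    mu_hat = X.mean(axis=0)
    centered = X - mu_hat
    Sigma_hat = centered.T @ centered / n        # divisor n, not n - 1
    return mu_hat, Sigma_hat

def log_likelihood(X, mu, Sigma):
    """log L(mu, Sigma) for the multivariate normal likelihood."""
    n, p = X.shape
    d = X - mu
    _, logdet = np.linalg.slogdet(Sigma)
    quad = np.einsum("ji,ji->", d, np.linalg.solve(Sigma, d.T).T)
    return -0.5 * (n * p * np.log(2 * np.pi) + n * logdet + quad)

rng = np.random.default_rng(1)                   # arbitrary seed
true_mu = np.array([1.0, -2.0, 0.5])             # illustrative parameters
L = np.array([[1.0, 0.0, 0.0], [0.5, 1.0, 0.0], [-0.3, 0.2, 0.8]])
true_Sigma = L @ L.T
X = rng.multivariate_normal(true_mu, true_Sigma, size=200)
n = X.shape[0]

mu_hat, Sigma_hat = mvn_mle(X)
S = np.cov(X, rowvar=False)                      # sample covariance, divisor n - 1

ll_hat = log_likelihood(X, mu_hat, Sigma_hat)
ll_true = log_likelihood(X, true_mu, true_Sigma)
ll_S = log_likelihood(X, mu_hat, S)

print(np.allclose(Sigma_hat, (n - 1) / n * S))   # Sigma_hat = ((n-1)/n) S
print(ll_hat >= ll_true, ll_hat >= ll_S)         # no other parameter pair does better
```

Note that the unbiased estimate S gives a slightly lower likelihood than $\hat{\Sigma}$, which is consistent with $\hat{\Sigma}$ being the maximizer.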

Note: Maximum likelihood estimators possess an invariance property. Let $\hat{\theta}$ be the maximum likelihood estimator of $\theta$, and consider estimating the parameter $h(\theta)$, which is a function of $\theta$. Then the maximum likelihood estimate of $h(\theta)$ is given by $h(\hat{\theta})$. For example,

1. The maximum likelihood estimator of $\mu^t \Sigma^{-1} \mu$ is $\hat{\mu}^t \hat{\Sigma}^{-1} \hat{\mu}$.

2. The maximum likelihood estimator of $\sigma_{ii}$ is $\hat{\sigma}_{ii}$, where

$$
\hat{\sigma}_{ii} = \frac{1}{n} \sum_{j=1}^{n} (X_{ij} - \bar{X}_i)^2
$$

and $X_{ij}$ denotes the $i$th component of $X_j$.

Note:

Let $X_1, X_2, \ldots, X_n \sim N_p(\mu, \Sigma)$ be a random sample from a multivariate normal population. Then,

$$
\hat{\mu} = \bar{X} \quad \text{and} \quad \hat{\Sigma} = \frac{1}{n} \sum_{j=1}^{n} (X_j - \bar{X})(X_j - \bar{X})^t
$$

are sufficient statistics.

4.4 The Sampling Distribution of $\bar{X}$ and $S$

Definition of the Wishart Distribution:

Let $Z_1, Z_2, \ldots, Z_n \sim N_p(0, \Sigma)$ be independently distributed. Then, the random matrix $M = \sum_{j=1}^{n} Z_j Z_j^t$ is distributed as a Wishart distribution with $n$ d.f., $W_n(\cdot, \Sigma)$. The density of a Wishart distribution with $n$ d.f. is

$$
f(m \mid \Sigma) = \frac{|m|^{(n-p-1)/2} \exp\left[ -\frac{\mathrm{tr}(m \Sigma^{-1})}{2} \right]}{2^{pn/2}\, \pi^{p(p-1)/4}\, |\Sigma|^{n/2} \prod_{i=1}^{p} \Gamma\left( \frac{1}{2}(n + 1 - i) \right)}.
$$
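The definition translates directly into a simulation, and the density can be evaluated on the log scale. This is a sketch assuming NumPy and the standard library; the function names `wishart_logpdf` and `sample_wishart`, the matrix $\Sigma$, and n = 6 are hypothetical choices. The comparison with $n\Sigma$ uses the fact that $E(Z_j Z_j^t) = \Sigma$, which follows from the definition but is not stated in the notes.

```python
import numpy as np
from math import lgamma

def wishart_logpdf(m, n, Sigma):
    """Log of the Wishart density with n d.f. at a positive definite p x p matrix m."""
    p = m.shape[0]
    _, logdet_m = np.linalg.slogdet(m)
    _, logdet_sigma = np.linalg.slogdet(Sigma)
    log_num = 0.5 * (n - p - 1) * logdet_m - 0.5 * np.trace(np.linalg.solve(Sigma, m))
    log_den = (0.5 * p * n * np.log(2.0)
               + 0.25 * p * (p - 1) * np.log(np.pi)
               + 0.5 * n * logdet_sigma
               + sum(lgamma(0.5 * (n + 1 - i)) for i in range(1, p + 1)))
    return log_num - log_den

def sample_wishart(rng, n, Sigma, reps):
    """Draw reps matrices M = sum_j Z_j Z_j^t with Z_1, ..., Z_n iid N_p(0, Sigma)."""
    p = Sigma.shape[0]
    Z = rng.multivariate_normal(np.zeros(p), Sigma, size=(reps, n))  # shape (reps, n, p)
    return np.einsum("rji,rjk->rik", Z, Z)                           # shape (reps, p, p)

rng = np.random.default_rng(0)                   # arbitrary seed
Sigma = np.array([[2.0, 0.5], [0.5, 1.0]])       # illustrative covariance
n = 6
M = sample_wishart(rng, n, Sigma, reps=20000)

print(M.mean(axis=0))                            # Monte Carlo average of M
print(n * Sigma)                                 # should be close to the line above
print(wishart_logpdf(M[0], n, Sigma))            # log-density at one simulated matrix
```

For p = 1 the formula reduces to the density of $\sigma^2$ times a chi-square variable with n d.f., which gives a quick sanity check of the implementation.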

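Properties of the Wishart Distribution:

1. If $M_1 \sim W_{n_1}(\cdot, \Sigma)$ and $M_2 \sim W_{n_2}(\cdot, \Sigma)$ are independent, then $M_1 + M_2 \sim W_{n_1 + n_2}(\cdot, \Sigma)$.

2. If $M_1 \sim W_{n_1}(\cdot, \Sigma)$, then $C M_1 C^t \sim W_{n_1}(\cdot, C \Sigma C^t)$.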

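Important Result:

Let $X_1, X_2, \ldots, X_n$ be a random sample from $N_p(\mu, \Sigma)$. Then,

1. $\bar{X} \sim N_p\left( \mu, \frac{1}{n} \Sigma \right)$.

2. $(n-1) S = \sum_{j=1}^{n} (X_j - \bar{X})(X_j - \bar{X})^t \sim W_{n-1}(\cdot, \Sigma)$.

3. $\bar{X}$ and $(n-1) S$ are independent.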
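A small Monte Carlo experiment can illustrate these three statements. The sketch below assumes NumPy; the parameters, the sample size n = 10, and the number of replicates are arbitrary illustrative choices. It checks only moments: the covariance of $\bar{X}$ against $\Sigma / n$, the average of $(n-1)S$ against $(n-1)\Sigma$, and a near-zero correlation between one component of $\bar{X}$ and one entry of $(n-1)S$. Zero correlation is a consequence of independence, not a proof of it.

```python
import numpy as np

rng = np.random.default_rng(0)                   # arbitrary seed
mu = np.array([1.0, -1.0])                       # illustrative parameters
Sigma = np.array([[2.0, 0.5], [0.5, 1.0]])
n, reps = 10, 20000

# reps independent samples, each of size n, from N_p(mu, Sigma).
X = rng.multivariate_normal(mu, Sigma, size=(reps, n))   # shape (reps, n, p)

xbar = X.mean(axis=1)                                     # one sample mean per replicate
centered = X - xbar[:, None, :]
A = np.einsum("rji,rjk->rik", centered, centered)         # (n - 1) S per replicate

cov_xbar = np.cov(xbar, rowvar=False)                     # compare with Sigma / n
mean_A = A.mean(axis=0)                                   # compare with (n - 1) Sigma
corr = np.corrcoef(xbar[:, 0], A[:, 0, 0])[0, 1]          # expected to be near 0

print(cov_xbar)
print(Sigma / n)
print(mean_A)
print((n - 1) * Sigma)
print(corr)
```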

4.5 Large-Sample Behavior of $\bar{X}$ and $S$

Important Result:

Let $X_1, X_2, \ldots, X_n$ be a random sample from a population with mean $\mu$ and finite (nonsingular) covariance $\Sigma$. Then, $\sqrt{n} (\bar{X} - \mu)$ is approximately $N_p(0, \Sigma)$ and

$$
n (\bar{X} - \mu)^t S^{-1} (\bar{X} - \mu)
$$

is approximately $\chi^2_p$ for $n - p$ large.