556: MATHEMATICAL STATISTICS I 1 the Multinomial Distribution the Multinomial Distribution Is a Multivariate Generalization of T

556: MATHEMATICAL STATISTICS I MULTIVARIATE PROBABILITY DISTRIBUTIONS 1 The Multinomial Distribution The multinomial distribution is a multivariate generalization of the binomial distribution. The binomial distribution can be derived from an “infinite urn” model with two types of objects being sampled without replacement. Suppose that the proportion of “Type 1” objects in the urn is θ (so 0 · θ · 1) and hence the proportion of “Type 2” objects in the urn is 1 ¡ θ. Suppose that n objects are sampled, and X is the random variable corresponding to the number of “Type 1” objects in the sample. Then X » Bin(n; θ). Now consider a generalization; suppose that the urn contains k + 1 types of objects (k = 1; 2; :::), with θi being the proportion of Type i objects, for i = 1; :::; k + 1. Let Xi be the random variable corresponding to the number of type i objects in a sample of size n, for i = 1; :::; k. Then the joint T pmf of vector X = (X1; :::; Xk) is given by e n! n! kY+1 f (x ; :::; x ) = θx1 ....θxk θxk+1 = θxi X1;:::;Xk 1 k x !:::x !x ! 1 k k+1 x !:::x !x ! i 1 k k+1 1 k k+1 i=1 where 0 · θi · 1 for all i, and θ1 + ::: + θk + θk+1 = 1, and where xk+1 is defined by xk+1 = n ¡ (x1 + ::: + xk): This is the mass function for the multinomial distribution which reduces to the binomial if k = 1. 2 The Dirichlet Distribution The Dirichlet distribution is a multivariate generalization of the Beta distribution. The joint pdf T of vector X = (X1; :::; Xk) is given by e ¡(®) ®1¡1 ®k¡1 ®k+1¡1 fX1;:::;Xk (x1; :::; xk) = x1 :::xk xk+1 ¡(®1):::¡(®k)¡(®k+1) for 0 · xi · 1 for all i such that x1 + ::: + xk + xk+1 = 1, where ® = ®1 + ::: + ®k+1 and where xk+1 is defined by xk+1 = 1 ¡ (x1 + ::: + xk): This is the density function which reduces to the Beta distribution if k = 1. It can also be shown that the marginal distribution of Xi is Beta(®i; ®). 3 The Multivariate Normal Distribution The multivariate normal distribution is a multivariate generalization of the normal distribution. T The joint pdf of X = (X1; :::; Xk) takes the form e µ ¶k=2 ½ ¾ 1 1 1 T ¡1 fX1;:::;Xk (x1; :::; xk) = 1 exp ¡ (x ¡ ¹) § (x ¡ ¹) 2¼ j§j 2 2 e e e e T where x = (x1; :::; xk) , ¹ is a k £ 1 vector, and § is a symmetric, positive-definite k £ k matrix. e e It can be shown that all marginal and all conditional distributions derived from the multivariate normal are also multivariate normal, and that any linear combination Y = AX e e for matrix A also has a multivariate normal distribution. 1 THE MULTIVARIATE NORMAL DISTRIBUTION MARGINAL AND CONDITIONALS DISTRIBUTIONS T Suppose that vector random variable X = (X1;X2;:::;Xk) has a multivariate normal distribution with pdf given by e µ ¶k=2 ½ ¾ 1 1 1 T ¡1 fX (x) = exp ¡ x § x (1) e e 2¼ j§j1=2 2e e where § is the k £ k variance-covariance matrix (we can consider here the case where the expected value ¹ is the k £ 1 zero vector; results for the general case are easily available by transformation). e Consider partitioning X into two components X and X of dimensions k1 and k2 = k¡k1 respectively, e e 1 e 2 that is, · ¸ X X = 1 : Xe e e 2 We attempt to deduce (a) the marginal distribution of X1, and (b) the conditional distribution ofe X given that X = x . e 2 e 1 e1 First, write · ¸ § § § = 11 12 §21 §22 T where §11 is k1 £ k1, §22 is k2 £ k2, §21 = §12, and · ¸ V V §¡1 = V = 11 12 V21 V22 so that §V = Ik (Ir is the r £ r identity matrix) gives · ¸ · ¸ · ¸ § § V V I 0 11 12 11 12 = k1 §21 §22 V21 V22 0 Ik2 where 0 represents the zero matrix of appropriate dimension. More specifically, §11V11 + §12V21 = Ik1 (2) §11V12 + §12V22 = 0 (3) §21V11 + §22V21 = 0 (4) §21V12 + §22V22 = Ik2 : (5) From the multivariate normal pdf in equation (1), we can re-express the term in the exponent as T ¡1 T T T T x § x = x V11x + x V12x + x V21x + x V22x : (6) e e e1 e1 e1 e2 e2 e1 e2 e2 In order to compute the marginal and conditional distributions, we must complete the square in x2 in this expression. We can write e xT§¡1x = (x ¡ m)TM(x ¡ m) + c (7) e e e2 e e2 e e and by comparing with equation (6) we can deduce that, for quadratic terms in x , e2 T T x V22x = x Mx ) M = V22 (8) e2 e2 e2 e2 2 for linear terms T T ¡1 x V21x = ¡x Mm ) m = ¡V V21x (9) e2 e1 e2 e e 22 e1 and for constant terms T T T T ¡1 x V11x = c + m Mm ) c = x (V11 ¡ V V V21)x (10) e1 e1 e e e e e1 21 22 e1 thus yielding all the terms required for equation (7), that is T ¡1 ¡1 T ¡1 T T ¡1 x § x = (x + V V21x ) V22(x + V V21x ) + x (V11 ¡ V V V21)x ; (11) e e e2 22 e1 e2 22 e1 e1 21 22 e1 which, crucially, is a sum of two terms, where the first can be interpreted as a function of x2, given x1, and the second is a function of x only. e e e1 Hence we have an immediate factorization of the full joint pdf using the chain rule for random vari- ables; fX (x) = fX jX (x2jx1)fX (x1) (12) e e e2 e1 e e e1 e where ½ ¾ 1 ¡1 T ¡1 fX jX (x2jx1) / exp ¡ (x2 + V22 V21x1) V22(x2 + V22 V21x1) (13) e2 e1 e e 2 e e e e giving that ¡ ¡1 ¡1¢ X jX = x »Nk ¡V V21x ;V (14) e 2 e 1 e1 2 22 e1 22 and ½ ¾ 1 T T ¡1 fX (x ) / exp ¡ x (V11 ¡ V21V22 V21)x (15) e1 e1 2e1 e1 giving that ³ ´ T ¡1 ¡1 X »Nk 0; (V11 ¡ V V V21) : (16) e 1 1 21 22 ¡1 But, from equation (3), §12 = ¡§11V12V22 , and then from equation (2), substituting in §12, ¡1 ¡1 ¡1 T ¡1 ¡1 §11V11 ¡ §11V12V22 V21 = Id ) §11 = (V11 ¡ V12V22 V21) = (V11 ¡ V21V22 V21) : Hence, by inspection of equation (16), we conclude that X »Nk (0; §11) ; (17) e 1 1 that is, we can extract the §11 block of § to define the marginal sigma matrix of X . e 1 Using similar arguments, we can define the conditional distribution from equation (14) more precisely. ¡1 First, from equation (3), V12 = ¡§11 §12V22, and then from equation (5), substituting in V12 ¡1 ¡1 ¡1 T ¡1 ¡§21§11 §12V22 + §22V22 = Ik¡d ) V22 = §22 ¡ §21§11 §12 = §22 ¡ §12§11 §12: Finally, from equation (3), taking transposes on both sides, we have that V21§11 + V22§21 = 0. Then ¡1 ¡1 pre-multiplying by V22 , and post-multiplying by §11 , we have ¡1 ¡1 ¡1 ¡1 V22 V21 + §21§11 = 0 ) V22 V21 = ¡§21§11 ; so we have, substituting into equation (14), that ¡ ¡1 ¡1 ¢ X jX = x »Nk §21§ x ; §22 ¡ §21§ §12 : (18) e 2 e 1 e1 2 11 e1 11 Thus any marginal, and any conditional distribution of a multivariate normal joint distribution is also multivariate normal, as the choices of X and X are arbitrary. e 1 e 2 3.

556: MATHEMATICAL STATISTICS I 1 the Multinomial Distribution the Multinomial Distribution Is a Multivariate Generalization of T

On Two-Echelon Inventory Systems with Poisson Demand and Lost Sales

Distance Between Multinomial and Multivariate Normal Models

The Exciting Guide to Probability Distributions – Part 2

Probability Distributions and Related Mathematical Constructs

Binomial and Multinomial Distributions

Discrete Probability Distributions

Statistical Distributions

5. the Multinomial Distribution

Chapter 5: Multivariate Distributions

A Multivariate Beta-Binomial Model Which Allows to Conduct Bayesian Inference Regarding Θ Or Transformations Thereof

Review of Probability Distributions for Modeling Count Data

An Alternative Approach of Binomial and Multinomial Distributions