Multivariate Distributions
IEOR E4602: Quantitative Risk Management                    Spring 2016
© 2016 by Martin Haugh

We will study multivariate distributions in these notes, focusing[1] in particular on multivariate normal, normal-mixture, spherical and elliptical distributions. In addition to studying their properties, we will also discuss techniques for simulating and, very briefly, estimating these distributions. Familiarity with these important classes of multivariate distributions is important for many aspects of risk management. We will defer the study of copulas until later in the course.

[1] We will not study copulas in these notes as we defer this topic until later in the course.

1 Preliminary Definitions

Let X = (X1, …, Xn)⊤ be an n-dimensional vector of random variables. We have the following definitions and statements.

Definition 1 (Joint CDF) For all x = (x1, …, xn)⊤ ∈ ℝⁿ, the joint cumulative distribution function (CDF) of X satisfies

    FX(x) = FX(x1, …, xn) = P(X1 ≤ x1, …, Xn ≤ xn).

Definition 2 (Marginal CDF) For a fixed i, the marginal CDF of Xi satisfies

    FXi(xi) = FX(∞, …, ∞, xi, ∞, …, ∞).

It is straightforward to generalize the previous definition to joint marginal distributions. For example, the joint marginal distribution of Xi and Xj satisfies

    Fij(xi, xj) = FX(∞, …, ∞, xi, ∞, …, ∞, xj, ∞, …, ∞).

If the joint CDF is absolutely continuous, then it has an associated probability density function (PDF) so that

    FX(x1, …, xn) = ∫_{−∞}^{x1} ··· ∫_{−∞}^{xn} f(u1, …, un) du1 … dun.

Similar statements also apply to the marginal CDFs. A collection of random variables is independent if the joint CDF (or PDF if it exists) can be factored into the product of the marginal CDFs (or PDFs).

If X1 = (X1, …, Xk)⊤ and X2 = (Xk+1, …, Xn)⊤ is a partition of X then the conditional CDF satisfies

    FX2|X1(x2 | x1) = P(X2 ≤ x2 | X1 = x1).

If X has a PDF, f(·), then the conditional CDF satisfies

    FX2|X1(x2 | x1) = ∫_{−∞}^{xk+1} ··· ∫_{−∞}^{xn} [ f(x1, …, xk, uk+1, …, un) / fX1(x1) ] duk+1 … dun

where fX1(·) is the joint marginal PDF of X1.

Assuming it exists, the mean vector of X is given by

    E[X] := (E[X1], …, E[Xn])⊤

whereas, again assuming it exists, the covariance matrix of X satisfies

    Cov(X) := Σ := E[(X − E[X])(X − E[X])⊤]

so that the (i, j)th element of Σ is simply the covariance of Xi and Xj. Note that the covariance matrix is symmetric so that Σ⊤ = Σ, its diagonal elements satisfy Σi,i ≥ 0, and it is positive semi-definite so that x⊤Σx ≥ 0 for all x ∈ ℝⁿ. The correlation matrix, ρ(X), has as its (i, j)th element ρij := Corr(Xi, Xj). It is also symmetric, positive semi-definite and has 1's along the diagonal. For any matrix A ∈ ℝ^{k×n} and vector a ∈ ℝ^k we have

    E[AX + a] = A E[X] + a                                           (1)
    Cov(AX + a) = A Cov(X) A⊤.                                       (2)

Finally, the characteristic function of X is given by

    φX(s) := E[e^{is⊤X}]  for s ∈ ℝⁿ                                 (3)

and, if it exists, the moment-generating function (MGF) is given by (3) with s replaced by −is.

2 The Multivariate Normal Distribution

If the n-dimensional vector X is multivariate normal with mean vector µ and covariance matrix Σ then we write

    X ∼ MNn(µ, Σ).

The standard multivariate normal has µ = 0 and Σ = In, the n × n identity matrix. The PDF of X is given by

    f(x) = (2π)^{−n/2} |Σ|^{−1/2} exp( −(1/2)(x − µ)⊤ Σ⁻¹ (x − µ) )  (4)

where |·| denotes the determinant, and its characteristic function satisfies

    φX(s) = E[e^{is⊤X}] = exp( is⊤µ − (1/2) s⊤Σs ).                  (5)

Recall again our partition of X into X1 = (X1, …, Xk)⊤ and X2 = (Xk+1, …, Xn)⊤. If we extend this notation naturally, so that in block form

    µ = [µ1; µ2]  and  Σ = [Σ11, Σ12; Σ21, Σ22],

then we obtain the following results regarding the marginal and conditional distributions of X.

Marginal Distribution
The marginal distribution of a multivariate normal random vector is itself multivariate normal. In particular, Xi ∼ MN(µi, Σii) for i = 1, 2.

Conditional Distribution
Assuming Σ is positive definite, the conditional distribution of a multivariate normal distribution is also a multivariate normal distribution. In particular,

    X2 | X1 = x1 ∼ MN(µ2·1, Σ2·1)

where µ2·1 = µ2 + Σ21 Σ11⁻¹ (x1 − µ1) and Σ2·1 = Σ22 − Σ21 Σ11⁻¹ Σ12.
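To make these formulas concrete, the following minimal Matlab sketch computes µ2·1 and Σ2·1 for a hypothetical 3-dimensional example with k = 1; the particular values of µ, Σ and the observed x1 are assumptions chosen purely for illustration.

Sample Matlab Code

% Hypothetical parameters: X is 3-dimensional, partitioned as X1 = X(1), X2 = (X(2), X(3))'.
mu    = [0; 1; -1];
Sigma = [1.0 0.5 0.5; 0.5 2.0 0.3; 0.5 0.3 1.5];
x1    = 0.8;                                  % assumed observed value of X1

S11 = Sigma(1,1);   S12 = Sigma(1,2:3);
S21 = Sigma(2:3,1); S22 = Sigma(2:3,2:3);

mu21  = mu(2:3) + S21 * (S11 \ (x1 - mu(1)))  % conditional mean mu_{2.1}
Sig21 = S22 - S21 * (S11 \ S12)               % conditional covariance Sigma_{2.1}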
Linear Combinations
Linear combinations of multivariate normal random vectors remain normally distributed, with mean vector and covariance matrix given by (1) and (2), respectively.

Estimation of Multivariate Normal Distributions
The simplest and most common method of estimating a multivariate normal distribution is to take the sample mean vector and sample covariance matrix as our estimators of µ and Σ, respectively. It is easy to justify this choice since they are the maximum likelihood estimators. It is also common to take m/(m − 1) times the sample covariance matrix as an estimator of Σ, where m denotes the number of observations, as this rescaled estimator is known to be unbiased.
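As a brief illustration of these estimators on simulated data (a sketch only; the true parameter values below are assumed for the example), note that Matlab's cov(data, 1) normalizes by the number of observations m, giving the maximum likelihood estimator, while cov(data) normalizes by m − 1, giving the unbiased estimator.

Sample Matlab Code

% Simulate m observations from MN(0, Sigma_true), one observation per row.
m = 100000;
Sigma_true = [1.0 0.5; 0.5 2.0];
data = randn(m, 2) * chol(Sigma_true);   % each row is a draw from MN(0, Sigma_true)

mu_hat         = mean(data)'             % sample mean vector (MLE of mu)
Sigma_mle      = cov(data, 1)            % normalizes by m: the MLE of Sigma
Sigma_unbiased = cov(data)               % normalizes by m - 1: unbiased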
Testing Normality and Multivariate Normality
There are many tests that can be employed for testing normality of random variables and vectors. These include standard univariate tests and tests based on QQ plots, as well as omnibus moment tests based on whether the skewness and kurtosis of the data are consistent with a multivariate normal distribution. Section 3.1.4 of MFE should be consulted for details on these tests.

2.1 Generating Multivariate Normally Distributed Random Vectors

Suppose we wish to generate X = (X1, …, Xn)⊤ where X ∼ MNn(0, Σ). Note that it is then easy to handle the case where E[X] ≠ 0. Let Z = (Z1, …, Zn)⊤ where the Zi's are IID N(0, 1) for i = 1, …, n. If C is an (n × m) matrix then it follows that

    C⊤Z ∼ MN(0, C⊤C).

Our problem therefore reduces to finding C such that C⊤C = Σ. We can use the Cholesky decomposition of Σ to find such a matrix, C.

The Cholesky Decomposition of a Symmetric Positive-Definite Matrix
A well-known fact from linear algebra is that any symmetric positive-definite matrix, M, may be written as

    M = U⊤DU

where U is an upper triangular matrix and D is a diagonal matrix with positive diagonal elements. Since Σ is symmetric positive-definite, we can therefore write

    Σ = U⊤DU = (U⊤√D)(√D U) = (√D U)⊤(√D U).

The matrix C = √D U therefore satisfies C⊤C = Σ. It is called the Cholesky decomposition of Σ.

The Cholesky Decomposition in Matlab and R
It is easy to compute the Cholesky decomposition of a symmetric positive-definite matrix in Matlab and R using the chol command, and so it is also easy to simulate multivariate normal random vectors. As before, let Σ be an (n × n) variance-covariance matrix and let C be its Cholesky decomposition. If X ∼ MN(0, Σ) then we can generate random samples of X in Matlab as follows:

Sample Matlab Code

>> Sigma = [1.0 0.5 0.5; 0.5 2.0 0.3; 0.5 0.3 1.5];
>> C = chol(Sigma);
>> Z = randn(3,1000000);
>> X = C'*Z;
>> cov(X')

ans =

    0.9972    0.4969    0.4988
    0.4969    1.9999    0.2998
    0.4988    0.2998    1.4971

We must be very careful[2] in Matlab[3] and R to pre-multiply Z by C⊤ and not C. We have the following algorithm for generating multivariate random vectors, X.

[2] We must also be careful that Σ is indeed a genuine variance-covariance matrix.
[3] Unfortunately, some languages take C⊤ to be the Cholesky decomposition rather than C. You must therefore always be aware of exactly which convention your programming language / package is using.

Generating Correlated Normal Random Variables

    generate Z ∼ MN(0, I)
    /* now compute the Cholesky decomposition */
    compute C such that C⊤C = Σ
    set X = C⊤Z

3 Normal-Mixture Models

Normal-mixture models are a class of models generated by introducing randomness into the covariance matrix and/or the mean vector. Following the development of MFE, we have the following definition of a normal variance mixture:

Definition 3 The random vector X has a normal variance mixture distribution if

    X ∼ µ + √W A Z

where (i) Z ∼ MNk(0, Ik), (ii) W ≥ 0 is a scalar random variable independent of Z, and (iii) A ∈ ℝ^{n×k} and µ ∈ ℝⁿ are a matrix and vector of constants, respectively.

Note that if we condition on W, then X is multivariate normally distributed. This observation also leads to an obvious simulation algorithm for generating samples of X: first simulate a value of W and then simulate X conditional on the generated value of W. (A concrete sketch of this algorithm appears at the end of this section.)

We are typically interested in the case where rank(A) = n ≤ k and Σ = AA⊤ is a full-rank positive definite matrix. In this case we obtain a non-singular normal variance mixture. Assuming W is integrable[4], we immediately see that

    E[X] = µ  and  Cov(X) = E[W] Σ.

[4] That is, W is integrable if E[W] < ∞.

We refer to µ and Σ as the location vector and dispersion matrix of the distribution. It is also clear that the correlation matrices of X and AZ are the same as long as W is integrable. This means that if A = Ik then the components of X are uncorrelated, though they are not in general independent. The following result emphasizes this point.

Lemma 1 Let X = (X1, X2)⊤ have a normal mixture distribution with A = I2, µ = 0 and E[W] < ∞, so that Cov(X1, X2) = 0. Then X1 and X2 are independent if and only if W is almost surely constant.
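The following minimal Matlab sketch implements the simulation algorithm described above with A = I2 and µ = 0; the choice W ∼ Exp(1) is an assumption made purely for the example. It also illustrates the point of Lemma 1: the simulated components are approximately uncorrelated, yet their squares are clearly positively correlated, so X1 and X2 are not independent.

Sample Matlab Code

% Simulate a bivariate normal variance mixture X = sqrt(W) * Z with A = I, mu = 0.
N = 1000000;
W = -log(rand(1, N));               % W ~ Exp(1) via the inverse-transform method
Z = randn(2, N);                    % IID N(0,1) components
X = repmat(sqrt(W), 2, 1) .* Z;     % column i equals sqrt(W_i) * Z_i

corrcoef(X(1,:), X(2,:))            % off-diagonal approximately 0: uncorrelated
corrcoef(X(1,:).^2, X(2,:).^2)      % off-diagonal clearly positive: not independent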