
Open Journal of Statistics, 2016, 6, 443-450
Published Online June 2016 in SciRes. http://www.scirp.org/journal/ojs
http://dx.doi.org/10.4236/ojs.2016.63040

A Multivariate Student’s t-Distribution

Daniel T. Cassidy Department of Engineering Physics, McMaster University, Hamilton, ON, Canada

Received 29 March 2016; accepted 14 June 2016; published 17 June 2016

Copyright © 2016 by author and Scientific Research Publishing Inc. This work is licensed under the Creative Commons Attribution International License (CC BY). http://creativecommons.org/licenses/by/4.0/

Abstract
A multivariate Student's t-distribution is derived by analogy to the derivation of a multivariate normal (Gaussian) probability density function. This multivariate Student's t-distribution can have different shape parameters $\nu_i$ for the marginal probability density functions of the multivariate distribution. Expressions for the probability density function, for the variances, and for the covariances of the multivariate t-distribution with arbitrary shape parameters for the marginals are given.

Keywords
Multivariate Student's t, Variance, Covariance, Arbitrary Shape Parameters

1. Introduction
An expression for a multivariate Student's t-distribution is presented. This expression, which differs in form from the expression that is commonly used, allows the shape parameter $\nu$ for each marginal probability density function (pdf) of the multivariate pdf to be different. The form that is typically used is [1]

$$
f([x];\nu) = \frac{\Gamma\left((\nu + n)/2\right)}{\Gamma\left(\nu/2\right)\,(\nu\pi)^{n/2}\,\sqrt{\det[\Sigma]}}\left(1 + \frac{[x]^{T}[\Sigma]^{-1}[x]}{\nu}\right)^{-(\nu + n)/2}.
\qquad (1)
$$

This "typical" form attempts to generalize the univariate Student's t-distribution and is valid when the $n$ marginal distributions have the same shape parameter $\nu$. The form of this multivariate t-distribution arises from the observation that the pdf for $[x] = [y]/\sigma$ is given by Equation (1) when $[y]$ is distributed as a multivariate normal with covariance matrix $[\Sigma]$ and $\sigma^2$ is distributed as chi-squared. The multivariate Student's t-distribution put forth here is derived from a Cholesky decomposition of the scale matrix, by analogy to the multivariate normal (Gaussian) pdf. The derivation of the multivariate normal pdf is


given in Section 2 to provide background. The multivariate Student’s t-distribution and the variances and covariances for the multivariate t-distribution are given in Section 3. Section 4 is a conclusion.

2. Background Information
2.1. Cholesky Decomposition

A method to produce a multivariate pdf with known scale matrix $[\Sigma_s]$ is presented in this section. For normally distributed variables, the covariance matrix $[\Sigma] = [\Sigma_s]$, since the scale factor for a normal distribution is the standard deviation of the distribution. An example with $n = 4$ is used to provide concrete examples. Consider the transformation $[y] = [M][x]$, where $[y]$ and $[x]$ are $4\times 1$ column matrices, $[M]$ is a $4\times 4$ square matrix, and the elements of $[x]$ are independent random variables. The off-diagonal elements of $[M]$ introduce correlations between the elements of $[y]$.

yx11m1,1 000  yxmm 00 22= 2,1 2,2  (2) yxmmm3,1 3,2 3,3 0  33 yx44mmmm4,1 4,2 4,3 4,4  T The scale matrix [Σ=s ] [MM][ ] . The covariance matrix [Σ] has elements Σ=ij, E{yyij} where E{yyij} is the expectation of yyij and Σ=Σi,, j ji. If the [ x] are normally distributed, then T [Σ=Σ] [ s ] =[MM][ ] , where the superscript T indicates a transpose of the matrix. If [Σs ] is known, then

$[M]$ is the Cholesky decomposition of the matrix $[\Sigma_s]$ [2]. For the $4\times 4$ example of Equation (2),

$$
[\Sigma_s] =
\begin{bmatrix}
m_{1,1}^2 & m_{1,1}m_{2,1} & m_{1,1}m_{3,1} & m_{1,1}m_{4,1} \\
m_{1,1}m_{2,1} & m_{2,1}^2 + m_{2,2}^2 & m_{2,1}m_{3,1} + m_{2,2}m_{3,2} & m_{2,1}m_{4,1} + m_{2,2}m_{4,2} \\
m_{1,1}m_{3,1} & m_{2,1}m_{3,1} + m_{2,2}m_{3,2} & m_{3,1}^2 + m_{3,2}^2 + m_{3,3}^2 & m_{3,1}m_{4,1} + m_{3,2}m_{4,2} + m_{3,3}m_{4,3} \\
m_{1,1}m_{4,1} & m_{2,1}m_{4,1} + m_{2,2}m_{4,2} & m_{3,1}m_{4,1} + m_{3,2}m_{4,2} + m_{3,3}m_{4,3} & m_{4,1}^2 + m_{4,2}^2 + m_{4,3}^2 + m_{4,4}^2
\end{bmatrix}.
\qquad (3)
$$

From linear algebra, $\det\left([M][M]^{T}\right) = \det[M]\,\det[M]^{T} = \left(\det[M]\right)^2$. For $[M]$ as defined in Equation (2), $\det[M] = m_{1,1}m_{2,2}m_{3,3}m_{4,4}$ and $\det[\Sigma_s] = m_{1,1}^2 m_{2,2}^2 m_{3,3}^2 m_{4,4}^2 = \left(\det[M]\right)^2$, whereas $\Sigma_{i,i} = E\{y_i y_i\} = \sigma_{y_i}^2$ is the variance of the zero-mean random variable $y_i$ and $\Sigma_{i,j} = E\{y_i y_j\} = \sigma_{y_i y_j}$ is the covariance of the zero-mean random variables $y_i$ and $y_j$.
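As a concrete illustration of Equations (2) and (3), the following sketch (assuming NumPy is available; the numerical values of $[M]$ are made up for the example) builds a lower-triangular $[M]$, forms the scale matrix $[\Sigma_s] = [M][M]^T$, and recovers $[M]$ from $[\Sigma_s]$ with a Cholesky decomposition.

```python
import numpy as np

# Hypothetical 4x4 lower-triangular transform [M] (values chosen for illustration only).
M = np.array([[2.0, 0.0, 0.0, 0.0],
              [0.6, 1.5, 0.0, 0.0],
              [0.3, 0.4, 1.2, 0.0],
              [0.1, 0.2, 0.5, 1.0]])

# Scale matrix [Sigma_s] = M M^T, Equation (3).
Sigma_s = M @ M.T

# The Cholesky decomposition of [Sigma_s] recovers the lower-triangular [M].
M_chol = np.linalg.cholesky(Sigma_s)
assert np.allclose(M_chol, M)

# det(Sigma_s) = (det M)^2 = (m_{1,1} m_{2,2} m_{3,3} m_{4,4})^2, as stated in the text.
assert np.isclose(np.linalg.det(Sigma_s), np.prod(np.diag(M))**2)
```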

2.2. Multivariate Normal Probability Density Function

To create a multivariate normal pdf, start with the joint pdf $f_N([x])$ for $n$ unit normal, zero mean, independent random variables $[x]$:

$$
f_N([x]) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi}}\exp\!\left(-\frac{x_i^2}{2}\right)
= \frac{1}{\sqrt{(2\pi)^n}}\exp\!\left(-\frac{1}{2}[x]^{T}[x]\right)
\qquad (4)
$$

where $[x]$ is an $n$-row column matrix, $[x] = [x_1, x_2, \ldots, x_n]^{T}$. $f_N([x])\,\mathrm{d}x_1\,\mathrm{d}x_2\cdots\mathrm{d}x_n$ gives the probability that the random variables $[x]$ lie in the interval $[x] < [x] \le [x] + [\mathrm{d}x]$.
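A quick numerical check of Equation (4) (a sketch assuming SciPy is available; the test point is arbitrary): the product of univariate standard-normal densities equals the joint density with identity covariance.

```python
import numpy as np
from scipy.stats import norm, multivariate_normal

x = np.array([0.3, -1.2, 0.7, 2.0])            # arbitrary test point, n = 4
product_form = np.prod(norm.pdf(x))             # product form of Equation (4)
joint_form = multivariate_normal(mean=np.zeros(4), cov=np.eye(4)).pdf(x)
assert np.isclose(product_form, joint_form)
```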

The requirement for zero mean random variables is not a restriction. If $E\{x_i\} = \mu_i$, then $x_i' = x_i - \mu_i$ is a zero mean random variable with the same shape and scale parameters as $x_i$. Use Equation (2) to transform the variables. The Jacobian determinant of the transformation relates the products of the infinitesimals of integration such that

∂ ( xx12,,, xn ) ddxxin2 d x= ddyy12 d. yn (5) ∂ ( yy12,,, yn )


The magnitude of the Jacobian determinant of the transformation $[y] = [M][x]$ is (Appendix)

∂ ( xx,,, x) 11 12 n = = (6) ∂ yy, , , y det M ( 12 n ) [ ] det[Σs ]

where the equality $\det[\Sigma_s] = \left(\det[M]\right)^2$ has been used.

Since $[\Sigma_s] = [M][M]^{T}$, $[\Sigma_s]^{-1} = \left([M]^{T}\right)^{-1}[M]^{-1}$, and since $[x] = [M]^{-1}[y]$, the multivariate "z-score" $[x]^{T}[x]$ becomes $[y]^{T}\left([M]^{-1}\right)^{T}[M]^{-1}[y] = [y]^{T}[\Sigma_s]^{-1}[y]$, which equals $[y]^{T}[\Sigma]^{-1}[y]$ since $[\Sigma_s] = [\Sigma]$ for normally distributed variables.
The result is that the unit normal, independent, multivariate pdf, Equation (4), becomes under the transformation Equation (2)

$$
f_N([y]) = \frac{1}{\sqrt{(2\pi)^n \det[\Sigma_s]}}\exp\!\left(-\frac{1}{2}[y]^{T}[\Sigma_s]^{-1}[y]\right)
\qquad (7)
$$

where $[y]$ is an $n$-row column matrix, $[y] = [y_1, y_2, \ldots, y_n]^{T}$, and $[\Sigma_s] = [\Sigma]$. For the $4\times 4$ example,

$$
[M]^{-1} =
\begin{bmatrix}
\dfrac{1}{m_{1,1}} & 0 & 0 & 0 \\[2mm]
-\dfrac{m_{2,1}}{m_{1,1}m_{2,2}} & \dfrac{1}{m_{2,2}} & 0 & 0 \\[2mm]
\dfrac{m_{3,2}m_{2,1} - m_{3,1}m_{2,2}}{m_{1,1}m_{2,2}m_{3,3}} & -\dfrac{m_{3,2}}{m_{2,2}m_{3,3}} & \dfrac{1}{m_{3,3}} & 0 \\[2mm]
m^{-1}_{4,1} & \dfrac{m_{4,3}m_{3,2} - m_{4,2}m_{3,3}}{m_{2,2}m_{3,3}m_{4,4}} & -\dfrac{m_{4,3}}{m_{3,3}m_{4,4}} & \dfrac{1}{m_{4,4}}
\end{bmatrix},
\qquad (8)
$$

from which $[\Sigma_s]^{-1} = \left([M]^{T}\right)^{-1}[M]^{-1}$ can be calculated. In Equation (8),

$$
m^{-1}_{4,1} = -\frac{m_{2,1}m_{3,2}m_{4,3} - m_{4,2}m_{2,1}m_{3,3} - m_{2,2}m_{3,1}m_{4,3} + m_{4,1}m_{2,2}m_{3,3}}{m_{1,1}m_{2,2}m_{3,3}m_{4,4}}.
\qquad (9)
$$

The denominator in the expression for $m^{-1}_{4,1}$ is $\det[M]$.

3. Multivariate Student's t Probability Density Function
A similar approach can be used to create a multivariate Student's t pdf. Assume truncated or effectively truncated t-distributions, so that moments exist [3] [4]. For simplicity, assume that support is $[\mu - b\beta, \mu + b\beta]$, where $b$ is a positive, large number, $\beta$ is the scale factor for the distribution, and $\mu$ is the location parameter for the distribution. If $b$ is a large number, then a significant portion of the tails of the distribution is included. If $b = \infty$ then all of the tails are included. Start with the joint pdf for $n$ independent, zero-mean (location parameters $[\mu] = 0$) Student's t pdfs with shape parameters $[\nu]$ and scale parameters $[\beta] = 1$:

$$
f_t([x];[\nu]) = \prod_{i=1}^{n} \frac{\Gamma\left((\nu_i + 1)/2\right)}{\Gamma\left(\nu_i/2\right)\sqrt{\pi\nu_i}}\left(1 + \frac{x_i^2}{\nu_i}\right)^{-(\nu_i + 1)/2} = \prod_{i=1}^{n} g_i\left(x_i; \nu_i\right)
\qquad (10)
$$

with $-\infty \le x_i \le +\infty$. $f_t([x];[\nu])\,\mathrm{d}x_1\,\mathrm{d}x_2\cdots\mathrm{d}x_n$ gives the probability that a random draw of the column matrix $[x]$ from the joint Student's t-distribution lies in the interval $[x] < [x] \le [x] + [\mathrm{d}x]$. The pdf $g_i\left(x_i; \nu_i\right)$ is a function of only $x_i$ and the shape parameter $\nu_i$, and thus is independent of any other $g_j\left(x_j; \nu_j\right)$, $j \ne i$.
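The construction used in Equation (11) below can be sketched numerically: draw independent standard Student's t variates with possibly different shape parameters and mix them through a lower-triangular $[M]$. This is a sketch assuming SciPy; the shape parameters and the matrix are illustrative values only.

```python
import numpy as np
from scipy.stats import t as student_t

rng = np.random.default_rng(0)

# Illustrative lower-triangular transform [M] and shape parameters [nu].
M = np.array([[2.0, 0.0, 0.0, 0.0],
              [0.6, 1.5, 0.0, 0.0],
              [0.3, 0.4, 1.2, 0.0],
              [0.1, 0.2, 0.5, 1.0]])
nu = np.array([5.0, 7.0, 9.0, 11.0])      # one shape parameter per marginal x_i

# Independent, zero-mean, unit-scale Student's t draws x_i with shape nu_i (Equation (10)).
n_samples = 200_000
x = student_t.rvs(df=nu[:, None], size=(4, n_samples), random_state=rng)

# Correlated variates [y] = [M][x] (Equation (2)); row i holds samples of y_i.
y = M @ x
```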


Use the transformation of Equation (2) to create a multivariate pdf:

$$
f_t([y];[\nu]) = \frac{1}{\det[M]}\prod_{i=1}^{n} \frac{\Gamma\left((\nu_i + 1)/2\right)}{\Gamma\left(\nu_i/2\right)\sqrt{\pi\nu_i}}\left(1 + \frac{\left(\sum_{j=1}^{i} m^{-1}_{i,j}\, y_j\right)^2}{\nu_i}\right)^{-(\nu_i + 1)/2}.
\qquad (11)
$$

The solution $x_i = \sum_{j=1}^{i} m^{-1}_{i,j}\, y_j$ of the transformation Equation (2) was used. The elements of the inverse matrix $[M]^{-1}$, $m^{-1}_{i,j}$, are given in terms of the $m_{i,j}$ by Equation (8) for the $n = 4$ example. Note that the shape parameters $\nu_i$ of the constituent distributions need not be the same in the multivariate t-distribution given by $f_t([y];[\nu])$. $f_t([y];[\nu])\,\mathrm{d}y_1\,\mathrm{d}y_2\cdots\mathrm{d}y_n$ gives the probability that a random draw of the column matrix $[y]$ from the multivariate Student's t-distribution with shape parameters $[\nu]$ lies in the interval $[y] < [y] \le [y] + [\mathrm{d}y]$.

From the definition of the exponential function, $\mathrm{e}^{x} = \lim_{n\to\infty}\left(1 + x/n\right)^{n}$, where $\mathrm{e} = 2.718281828\ldots$ is Euler's number,

$$
\lim_{\nu\to\infty}\left(1 + \frac{t^2}{\nu}\right)^{-(\nu + 1)/2} = \exp\!\left(-t^2/2\right)
\qquad (12)
$$

and

2 n i 11−1 limft ([ y] ;[ν ]) = ∏ exp− 0.5∑myij, j ν →∞  det [M ] i=1 2π j=1 2 ni 1 −1 = exp− 0.5∑∑myij, j n  det[M ] ( 2π) ij=11 = (13)

11T −1 = exp−Σ[ yy] [ s ] [ ] n 2 (2π) det [Σs ]

= fyN ([ ]).

In the limit as $[\nu]\to\infty$, the multivariate Student's t-distribution $f_t([y];[\nu])$, Equation (11), becomes a multivariate normal distribution.
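The limit in Equation (13) can be checked numerically. The sketch below (assuming SciPy; the $[M]$ and the test point are illustrative values) evaluates Equation (11) with large, equal shape parameters and compares it with the multivariate normal density of Equation (7).

```python
import numpy as np
from scipy.stats import t as student_t, multivariate_normal

M = np.array([[2.0, 0.0, 0.0, 0.0],
              [0.6, 1.5, 0.0, 0.0],
              [0.3, 0.4, 1.2, 0.0],
              [0.1, 0.2, 0.5, 1.0]])
Minv = np.linalg.inv(M)
Sigma_s = M @ M.T

def f_t(y, nu):
    """Multivariate Student's t pdf of Equation (11)."""
    x = Minv @ y                                   # x_i = sum_j m^{-1}_{i,j} y_j
    return np.prod(student_t.pdf(x, df=nu)) / abs(np.linalg.det(M))

y = np.array([1.0, -0.5, 0.3, 2.0])               # arbitrary test point
nu_large = np.full(4, 1.0e6)                       # large shape parameters
f_normal = multivariate_normal(mean=np.zeros(4), cov=Sigma_s).pdf(y)
print(f_t(y, nu_large), f_normal)                  # nearly equal, per Equation (13)
```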

3.1. Some $\Sigma_{i,j}$ for the $n = 4$ Example
In this subsection some examples of the variances and covariances of a multivariate Student's t-distribution using the $n = 4$ example of Equation (2) are given.

The variance of the random variable $y_3$ is

$$
E\left\{y_3^2\right\} = \int\!\!\int\!\!\int\!\!\int y_3^2\, f_t\!\left([M]^{-1}[y];[\nu]\right)\mathrm{d}y_4\,\mathrm{d}y_3\,\mathrm{d}y_2\,\mathrm{d}y_1
\qquad (14)
$$

with the limits of the integrations equal to $\mu_i - b\beta_i$ and $\mu_i + b\beta_i$, $i = 1, 2, 3, 4$. Perform the integrations as listed. The integral over $\mathrm{d}y_4$ is unity since only $x_4$ depends on $y_4$ (c.f. Equation (2)) and $f_t([x];[\nu])$ factors into a product $g_1(x_1)\,g_2(x_2)\,g_3(x_3)\,g_4(x_4)$; see Equation (10). Write

$$
x_4 = m^{-1}_{4,4}\, y_4 + m^{-1}_{4,3}\, y_3 + m^{-1}_{4,2}\, y_2 + m^{-1}_{4,1}\, y_1
\qquad (15)
$$

$$
\phantom{x_4} = m^{-1}_{4,4}\, y_4 - \mu_4
\qquad (16)
$$

where the $m^{-1}_{i,j}$ are the elements of the inverse of matrix $[M]$ and are as given by $[M]^{-1}$, Equation (8), and $\mu_4$ is a constant as far as the integral over $y_4$ is concerned. Repeat the procedure for the integrals for $\mathrm{d}y_3$, $\mathrm{d}y_2$, and $\mathrm{d}y_1$. These integrals are not equal to unity owing to the presence of the $y_3^2$ term.


The variance of the random variable $y_i$ for the multivariate Student's t-distribution with support $[-\infty, \infty]$ and with $\nu_i = \nu$ for all $i$ is given by

$$
E\left\{y_i^2\right\} = \sigma_{y_i}^2 = \frac{\nu}{\nu - 2}\sum_{j=1}^{i} m_{i,j}^2.
\qquad (17)
$$

The expression for $\sigma_{y_i}^2$ is valid only for $\nu > 2$. The expression would be valid for $\nu \ge 1$ if the region of support was $[\mu_i - b\beta_i, \mu_i + b\beta_i]$ rather than $[-\infty, \infty]$, where $\beta_i$ is a scale factor and $b < \infty$ [3]-[5]. Note that the scale factors for the multivariate t-distribution are $\beta_i = m_{i,i}$. Truncation or effective truncation of the pdf keeps the moments finite [3]-[5]. For example, the second central moment for a $\nu = 1$ Student's t-distribution with scale factor $\beta$ and support $[\mu - b\beta, \mu + b\beta]$ is

$$
\mu_2(\nu = 1, \beta, b) = \beta^2 \times \frac{b - \arctan(b)}{\arctan(b)},
\qquad (18)
$$

which is finite provided that $b < \infty$. In the interest of brevity, only variances and covariances that were calculated for support of $[-\infty, \infty]$ will be discussed. The requirement that $\nu_i > 2$ will be understood to be waived if the pdf is truncated or effectively truncated. It is also to be understood that the variances and covariances as calculated for support of $[-\infty, \infty]$ provide upper limits for variances and covariances calculated for truncation or effective truncation of the pdf.
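Equation (18) is easy to verify by direct numerical integration. The sketch below (assuming SciPy; $\beta$ and $b$ are arbitrary illustrative values) integrates $x^2$ times the renormalized, truncated $\nu = 1$ Student's t density and compares the result with Equation (18).

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import t as student_t

beta, b = 1.5, 50.0        # illustrative scale factor and truncation parameter
nu = 1.0

# nu = 1 Student's t density with scale beta, truncated to [-b*beta, +b*beta] and renormalized.
norm_const = quad(lambda x: student_t.pdf(x / beta, df=nu) / beta, -b * beta, b * beta)[0]
second_moment = quad(lambda x: x**2 * student_t.pdf(x / beta, df=nu) / beta,
                     -b * beta, b * beta)[0] / norm_const

# Equation (18): mu_2 = beta^2 * (b - arctan(b)) / arctan(b).
mu2_eq18 = beta**2 * (b - np.arctan(b)) / np.arctan(b)
assert np.isclose(second_moment, mu2_eq18, rtol=1e-5)
```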

If the $\nu_i$ are not equal, then for the $n = 4$ example of Equation (2)

$$
E\left\{y_3^2\right\} = \frac{\nu_1}{\nu_1 - 2}\times m_{3,1}^2 + \frac{\nu_2}{\nu_2 - 2}\times m_{3,2}^2 + \frac{\nu_3}{\nu_3 - 2}\times m_{3,3}^2.
\qquad (19)
$$
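A Monte Carlo sketch of Equation (19) (assuming NumPy and SciPy; the same illustrative $[M]$ and $[\nu]$ as in the earlier sketches): the sample second moment of $y_3$ should approach $\sum_j \nu_j/(\nu_j - 2)\, m_{3,j}^2$.

```python
import numpy as np
from scipy.stats import t as student_t

rng = np.random.default_rng(1)
M = np.array([[2.0, 0.0, 0.0, 0.0],
              [0.6, 1.5, 0.0, 0.0],
              [0.3, 0.4, 1.2, 0.0],
              [0.1, 0.2, 0.5, 1.0]])
nu = np.array([5.0, 7.0, 9.0, 11.0])

x = student_t.rvs(df=nu[:, None], size=(4, 1_000_000), random_state=rng)
y = M @ x

# Equation (19): E{y_3^2} = sum_j nu_j/(nu_j - 2) * m_{3,j}^2 (row index 3 -> Python index 2).
var_y3_theory = np.sum(nu[:3] / (nu[:3] - 2) * M[2, :3]**2)
print(np.mean(y[2]**2), var_y3_theory)      # the two values should agree to roughly 1%
```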

The covariance $E\{y_2 y_3\}$ for $\nu_i = \nu$ for all $i$ is given by

$$
E\{y_2 y_3\} = \frac{\nu}{\nu - 2}\left(m_{2,1}m_{3,1} + m_{2,2}m_{3,2}\right).
\qquad (20)
$$

If the $\nu_i$ are not equal, then the covariance $E\{y_2 y_3\}$ is

$$
E\{y_2 y_3\} = \frac{\nu_1}{\nu_1 - 2}\times m_{2,1}m_{3,1} + \frac{\nu_2}{\nu_2 - 2}\times m_{2,2}m_{3,2}.
\qquad (21)
$$

The expression for $E\{y_1 y_3\}$, which is valid for the $\nu_i$ not equal, is

$$
E\{y_1 y_3\} = \frac{\nu_1}{\nu_1 - 2}\times m_{1,1}m_{3,1}.
\qquad (22)
$$

The expressions for $E\{y_3 y_3\}$, $E\{y_2 y_3\}$, and $E\{y_1 y_3\}$ show a simple pattern for the relationship between the covariance matrix $[\Sigma]$, the scale matrix $[\Sigma_s]$, Equation (3), and the matrix $[M]$, Equation (2).

3.2. General Expressions for $\Sigma_{i,j}$

Given a matrix $[M]$ that is an $n\times n$ square matrix with elements $m_{i,j}$, an expression for the variance (assuming support $[-\infty, \infty]$, $\nu_i > 2$ for all $i$, and $i \le n$) for the multivariate Student's t-distribution $f_t([y];[\nu])$ is

$$
E\left\{y_i^2\right\} = \sum_{j=1}^{i} \frac{\nu_j}{\nu_j - 2}\times m_{i,j}^2.
\qquad (23)
$$

A general expression for the covariance (assuming support $[-\infty, \infty]$, $\nu_i > 2$ for all $i$, and $i \le n$, $j \le n$) for the multivariate Student's t-distribution $f_t([y];[\nu])$ is

$$
E\{y_i y_j\} = \sum_{k=1}^{\min(i,j)} \frac{\nu_k}{\nu_k - 2}\times m_{i,k}\, m_{j,k}.
\qquad (24)
$$

If support is $[\mu - b\beta, \mu + b\beta]$, then the general expressions need to be multiplied by functions that depend on $b$ and $\nu$. Truncation or effective truncation keeps the moments finite and defined for all $\nu \ge 1$ [3]-[5]. The general expression for the covariance, Equation (24), yields, when $i = j$, the general expression for the variance, Equation (23). The general expression for the variance, Equation (23), is given to emphasize the $m_{i,j}^2$


nature of the variance.

Unlike normally distributed random variables, the covariance matrix $[\Sigma]$ for random variables that are distributed as Student's t is not equal to $[M][M]^{T}$. For normally distributed variables, the scale factor $\beta$ equals the standard deviation $\sigma$. For Student's t distributed variables, the standard deviation $\sigma$ does not equal the scale parameter $\beta$. For a Student's t distribution with shape parameter $\nu$, scale parameter $\beta$, and support $[-\infty, \infty]$, $\sigma^2 = \beta^2 \times \nu/(\nu - 2)$. If the region of support for the Student's t distribution is truncated to $[\mu - b\beta, \mu + b\beta]$, then the variance $\sigma^2 < \beta^2 \times \nu/(\nu - 2)$ for all $\nu \ge 2$ and is finite for all $\nu \ge 1$ [3]-[5].

Given a matrix of the variances and the covariances, $[\Sigma]$, and a column matrix of the shape parameters $[\nu]$ associated with each variable, the scale matrix $[\Sigma_s] = [M][M]^{T}$ would in principle be determined sequentially, starting with $m_{1,1}$ and $m_{2,1}$. The shape parameters $[\nu]$ would be obtained from the marginal distributions or from other knowledge.
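Equations (23) and (24) translate directly into a small helper. The sketch below (assuming NumPy; `student_t_covariance` is a hypothetical helper name, and the $[M]$ and $[\nu]$ values are illustrative) builds the full covariance matrix $[\Sigma]$ from $[M]$ and $[\nu]$, and contrasts it with the scale matrix $[\Sigma_s] = [M][M]^T$.

```python
import numpy as np

def student_t_covariance(M, nu):
    """Covariance matrix of Equation (24): Sigma_{i,j} = sum_k nu_k/(nu_k - 2) m_{i,k} m_{j,k}.

    Assumes support [-inf, inf] and nu_k > 2 for every k (or an effectively
    truncated pdf, as discussed in the text).
    """
    weights = nu / (nu - 2.0)          # one factor nu_k/(nu_k - 2) per column k
    return (M * weights) @ M.T         # scales column k of M by its weight, then forms M W M^T

M = np.array([[2.0, 0.0, 0.0, 0.0],
              [0.6, 1.5, 0.0, 0.0],
              [0.3, 0.4, 1.2, 0.0],
              [0.1, 0.2, 0.5, 1.0]])
nu = np.array([5.0, 7.0, 9.0, 11.0])

Sigma = student_t_covariance(M, nu)    # Equations (23) and (24)
Sigma_s = M @ M.T                      # scale matrix, Equation (3)
print(np.diag(Sigma))                  # variances, Equation (23)
print(Sigma - Sigma_s)                 # nonzero: [Sigma] != [Sigma_s] for Student's t
```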

4. Conclusion
A multivariate Student's t-distribution is derived by analogy to the derivation for a multivariate normal (or Gaussian) pdf. The variances and covariances for the multivariate t-distribution are given. It is noteworthy that the shape parameters $[\nu]$ of the constituent Student's t-distributions of the multivariate t-distribution, Equation (11), need not be the same.

Acknowledgements
This work was funded by the Natural Sciences and Engineering Research Council (NSERC) of Canada.

References
[1] Kotz, S. and Nadarajah, S. (2004) Multivariate t-Distributions and Their Applications. Cambridge University Press, Cambridge.
[2] Press, W.H., Teukolsky, S.A., Vetterling, W.T. and Flannery, B.P. (1992) Numerical Recipes in FORTRAN: The Art of Scientific Computing. 2nd Edition, Cambridge University Press, Cambridge, 89.
[3] Cassidy, D.T. (2011) Describing n-Day Returns with Student's t-Distributions. Physica A, 390, 2794-2802. http://dx.doi.org/10.1016/j.physa.2011.03.019
[4] Cassidy, D.T. (2012) Effective Truncation of a Student's t-Distribution by Truncation of the Chi Distribution in a Mixing Integral. Open Journal of Statistics, 2, 519-525. http://dx.doi.org/10.4236/ojs.2012.25067
[5] Cassidy, D.T. (2016) Student's t Increments. Open Journal of Statistics, 6, 156-171. http://dx.doi.org/10.4236/ojs.2016.61014


Appendix: The Jacobian
The Jacobian determinant is used in physics, mathematics, and statistics. Many of these uses can be traced to the Jacobian determinant as a measure of the volume of an infinitesimally small, n-dimensional parallelepiped.

1. Volume of a Parallelepiped
The volume of an n-dimensional parallelepiped is given by the absolute value of the determinant of the components of the edge vectors that form the parallelepiped. The area of a parallelogram with edge vectors $\mathbf{a}$ and $\mathbf{b}$ is $|\mathbf{a}\times\mathbf{b}|$.

The volume of a parallelepiped with edge vectors $\mathbf{a} = (a_1, a_2, a_3)$, $\mathbf{b} = (b_1, b_2, b_3)$, and $\mathbf{c} = (c_1, c_2, c_3)$ is given by the determinant

$$
\mathbf{c}\cdot(\mathbf{a}\times\mathbf{b}) =
\begin{vmatrix}
c_1 & c_2 & c_3 \\
a_1 & a_2 & a_3 \\
b_1 & b_2 & b_3
\end{vmatrix}.
\qquad (25)
$$

2. Inversion Exists

Assume that there are $n$ functions $x_i = f_i(q_1, q_2, \ldots, q_n)$. The necessary and sufficient condition that the functions can be inverted to find $q_i = f_i^{-1}(x_1, x_2, \ldots, x_n)$ is that the Jacobian determinant is nonzero, i.e.,

$$
\frac{\partial\left(x_1, x_2, \ldots, x_n\right)}{\partial\left(q_1, q_2, \ldots, q_n\right)} \ne 0
\qquad (26)
$$

where

$$
\frac{\partial\left(x_1, x_2, \ldots, x_n\right)}{\partial\left(q_1, q_2, \ldots, q_n\right)} \equiv
\begin{vmatrix}
\dfrac{\partial x_1}{\partial q_1} & \dfrac{\partial x_1}{\partial q_2} & \cdots & \dfrac{\partial x_1}{\partial q_n} \\[2mm]
\dfrac{\partial x_2}{\partial q_1} & \dfrac{\partial x_2}{\partial q_2} & \cdots & \dfrac{\partial x_2}{\partial q_n} \\[2mm]
\vdots & \vdots & \ddots & \vdots
\end{vmatrix}.
\qquad (27)
$$

To simplify the notation, assume that $n = 3$ so that $x_i = f_i(q_1, q_2, q_3)$, $i = 1, \ldots, 3$. The total differential is

$$
\mathrm{d}x_i = \frac{\partial f_i}{\partial q_1}\,\mathrm{d}q_1 + \frac{\partial f_i}{\partial q_2}\,\mathrm{d}q_2 + \frac{\partial f_i}{\partial q_3}\,\mathrm{d}q_3.
\qquad (28)
$$

These equations can be put in matrix form:

$$
\begin{bmatrix} \mathrm{d}x_1 \\ \mathrm{d}x_2 \\ \mathrm{d}x_3 \end{bmatrix}
=
\begin{bmatrix}
\dfrac{\partial f_1}{\partial q_1} & \dfrac{\partial f_1}{\partial q_2} & \dfrac{\partial f_1}{\partial q_3} \\[2mm]
\dfrac{\partial f_2}{\partial q_1} & \dfrac{\partial f_2}{\partial q_2} & \dfrac{\partial f_2}{\partial q_3} \\[2mm]
\dfrac{\partial f_3}{\partial q_1} & \dfrac{\partial f_3}{\partial q_2} & \dfrac{\partial f_3}{\partial q_3}
\end{bmatrix}
\begin{bmatrix} \mathrm{d}q_1 \\ \mathrm{d}q_2 \\ \mathrm{d}q_3 \end{bmatrix}.
\qquad (29)
$$

These three equations can be solved for the $\mathrm{d}q_i$ if the determinant of the $3\times 3$ matrix is non-zero. This is a standard result from linear algebra. The determinant of the $3\times 3$ matrix is called the Jacobian determinant of the transformation.

3. Change of Variables
The Jacobian determinant of the transformation is used in change of variables in integration:


$$
\int\!\!\int\!\!\int \mathrm{d}V
= \int\!\!\int\!\!\int \mathrm{d}x_1\,\mathrm{d}x_2\,\mathrm{d}x_3
= \int\!\!\int\!\!\int \left|\frac{\partial\left(x_1, x_2, x_3\right)}{\partial\left(q_1, q_2, q_3\right)}\right|\mathrm{d}q_1\,\mathrm{d}q_2\,\mathrm{d}q_3.
\qquad (30)
$$

The absolute value sign is required since the determinant could be negative (i.e., the transformation could reverse orientation). The Jacobian determinant for the inverse transformation (to obtain $[x]$ as functions of $[y]$), given by Equation (8), is

$$
\begin{vmatrix}
\dfrac{\partial x_1}{\partial y_1} & 0 & 0 & 0 \\[2mm]
\dfrac{\partial x_2}{\partial y_1} & \dfrac{\partial x_2}{\partial y_2} & 0 & 0 \\[2mm]
\dfrac{\partial x_3}{\partial y_1} & \dfrac{\partial x_3}{\partial y_2} & \dfrac{\partial x_3}{\partial y_3} & 0 \\[2mm]
\dfrac{\partial x_4}{\partial y_1} & \dfrac{\partial x_4}{\partial y_2} & \dfrac{\partial x_4}{\partial y_3} & \dfrac{\partial x_4}{\partial y_4}
\end{vmatrix}
=
\begin{vmatrix}
\dfrac{1}{m_{1,1}} & 0 & 0 & 0 \\[2mm]
-\dfrac{m_{2,1}}{m_{1,1}m_{2,2}} & \dfrac{1}{m_{2,2}} & 0 & 0 \\[2mm]
\dfrac{m_{3,2}m_{2,1} - m_{3,1}m_{2,2}}{m_{1,1}m_{2,2}m_{3,3}} & -\dfrac{m_{3,2}}{m_{2,2}m_{3,3}} & \dfrac{1}{m_{3,3}} & 0 \\[2mm]
m^{-1}_{4,1} & \dfrac{m_{4,3}m_{3,2} - m_{4,2}m_{3,3}}{m_{2,2}m_{3,3}m_{4,4}} & -\dfrac{m_{4,3}}{m_{3,3}m_{4,4}} & \dfrac{1}{m_{4,4}}
\end{vmatrix},
\qquad (31)
$$

which equals

$$
\frac{1}{m_{1,1}}\times\frac{1}{m_{2,2}}\times\frac{1}{m_{3,3}}\times\frac{1}{m_{4,4}} = \frac{1}{\det[M]}.
\qquad (32)
$$
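A one-line numerical check of Equations (31) and (32) (a sketch assuming NumPy; the matrix is the same illustrative $[M]$ used in the earlier sketches): the determinant of $[M]^{-1}$, which is the Jacobian determinant of the inverse transformation, equals $1/\det[M]$.

```python
import numpy as np

M = np.array([[2.0, 0.0, 0.0, 0.0],
              [0.6, 1.5, 0.0, 0.0],
              [0.3, 0.4, 1.2, 0.0],
              [0.1, 0.2, 0.5, 1.0]])

# Jacobian determinant of x = M^{-1} y is det(M^{-1}) = 1/det(M), Equation (32).
assert np.isclose(np.linalg.det(np.linalg.inv(M)), 1.0 / np.linalg.det(M))
```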
