Wishart Distribution

- Consider, as usual, a random sample $x_1, x_2, \dots, x_N$ from $N_p(\mu, \Sigma)$.

- Recall: the sample covariance $S$ satisfies
\[
S = \frac{1}{N - 1} A,
\]
where
\[
A = \sum_{\alpha=1}^{N} (x_\alpha - \bar{x})(x_\alpha - \bar{x})'.
\]

Statistics 784 NC STATE UNIVERSITY 1 / 13 Multivariate Analysis Wishart Distribution

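- Write
\[
X = \begin{pmatrix} x_1' \\ x_2' \\ \vdots \\ x_N' \end{pmatrix}.
\]

- Then
\[
X_{\mathrm{dev}} = \begin{pmatrix} (x_1 - \bar{x})' \\ (x_2 - \bar{x})' \\ \vdots \\ (x_N - \bar{x})' \end{pmatrix}
= \left( I_N - \frac{1}{N} 1_N 1_N' \right) X
\]
and
\[
A = X_{\mathrm{dev}}' X_{\mathrm{dev}} = X' \left( I_N - \frac{1}{N} 1_N 1_N' \right) X.
\]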
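The centering-matrix form of $A$ can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the sample size, dimension, and seed are arbitrary illustration choices, not values from the notes.

```python
import numpy as np

rng = np.random.default_rng(0)
N, p = 8, 3
X = rng.normal(size=(N, p))                  # rows are the observations x_alpha'

ones = np.ones((N, 1))
centering = np.eye(N) - ones @ ones.T / N    # I_N - (1/N) 1_N 1_N'

X_dev = centering @ X                        # rows are (x_alpha - xbar)'
A = X.T @ centering @ X                      # A = X' (I_N - (1/N) 1_N 1_N') X
S = A / (N - 1)                              # sample covariance

print(np.allclose(X_dev, X - X.mean(axis=0)))
print(np.allclose(A, X_dev.T @ X_dev))
print(np.allclose(S, np.cov(X, rowvar=False)))
```

`np.cov` divides by $N - 1$ by default, which is why it agrees with $S = A/(N-1)$.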


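- The Helmert matrix is the orthogonal matrix
\[
H_N = \begin{pmatrix}
\frac{1}{\sqrt{2}} & \frac{-1}{\sqrt{2}} & 0 & \cdots & 0 \\
\frac{1}{\sqrt{6}} & \frac{1}{\sqrt{6}} & \frac{-2}{\sqrt{6}} & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
\frac{1}{\sqrt{N(N-1)}} & \frac{1}{\sqrt{N(N-1)}} & \frac{1}{\sqrt{N(N-1)}} & \cdots & \frac{-(N-1)}{\sqrt{N(N-1)}} \\
\frac{1}{\sqrt{N}} & \frac{1}{\sqrt{N}} & \frac{1}{\sqrt{N}} & \cdots & \frac{1}{\sqrt{N}}
\end{pmatrix}
\]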


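- If we partition $H_N$ as
\[
H_N = \begin{pmatrix} H_N^{(1)} \\ \frac{1}{\sqrt{N}} 1_N' \end{pmatrix}
\]
then
\[
I_N = H_N' H_N = H_N^{(1)\prime} H_N^{(1)} + \frac{1}{N} 1_N 1_N'.
\]

- So
\[
I_N - \frac{1}{N} 1_N 1_N' = H_N^{(1)\prime} H_N^{(1)}
\]
and
\[
A = X' \left( I_N - \frac{1}{N} 1_N 1_N' \right) X
= X' H_N^{(1)\prime} H_N^{(1)} X
= \left( H_N^{(1)} X \right)' \left( H_N^{(1)} X \right).
\]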
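The two identities above can be verified on simulated data. This is a sketch under assumed values of $N$ and $p$; `helmert_contrasts` is a hypothetical helper that returns only the first $N - 1$ rows, $H_N^{(1)}$.

```python
import numpy as np

def helmert_contrasts(N):
    """First N-1 rows of the Helmert matrix, i.e. H_N^(1)."""
    H1 = np.zeros((N - 1, N))
    for k in range(1, N):
        scale = np.sqrt(k * (k + 1))
        H1[k - 1, :k] = 1.0 / scale
        H1[k - 1, k] = -k / scale
    return H1

rng = np.random.default_rng(1)
N, p = 7, 2
X = rng.normal(loc=3.0, size=(N, p))         # nonzero mean on purpose

H1 = helmert_contrasts(N)
centering = np.eye(N) - np.ones((N, N)) / N  # I_N - (1/N) 1_N 1_N'

X_dev = X - X.mean(axis=0)
A_direct = X_dev.T @ X_dev                   # sum of centered outer products
Z1 = H1 @ X                                  # H_N^(1) X, an (N-1) x p matrix
A_helmert = Z1.T @ Z1

print(np.allclose(H1.T @ H1, centering))
print(np.allclose(A_direct, A_helmert))
```

Note that no sample mean is subtracted when forming `A_helmert`: the contrast rows sum to zero, so they remove the mean automatically.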


- If $Z = H_N X$ then the rows of $Z$ are uncorrelated and, being jointly normal, independent, with the first $n = N - 1$ distributed as $N_p(0, \Sigma)$ and the last distributed as $N_p(\sqrt{N}\mu, \Sigma)$.

- So we can write
\[
A = \sum_{\alpha=1}^{n} z_\alpha z_\alpha',
\]
where $z_1, z_2, \dots, z_n$ is a random sample from $N_p(0, \Sigma)$.

- $A$ is said to have the Wishart distribution $W_p(\Sigma, n)$.
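The representation gives a direct way to simulate $W_p(\Sigma, n)$ matrices. The sketch below draws many such matrices and compares their average with $n\Sigma$, which is the mean implied by the sum-of-outer-products form. The particular $\Sigma$, $n$, and replication count are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
n, p, reps = 5, 2, 20000
Sigma = np.array([[2.0, 0.5],
                  [0.5, 1.0]])
L = np.linalg.cholesky(Sigma)                # Sigma = L L'

# Each slice Z[r] holds n rows z_alpha' drawn from N_p(0, Sigma).
Z = rng.normal(size=(reps, n, p)) @ L.T
# A[r] = sum over alpha of z_alpha z_alpha' = Z[r]' Z[r]
A = np.einsum("rni,rnj->rij", Z, Z)

mean_A = A.mean(axis=0)
print(np.round(mean_A, 2))                   # should be close to n * Sigma
print(n * Sigma)
```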

Distribution of A

- First assume that $\Sigma = I_p$.

- Then the elements of $Z^{(1)} = H_N^{(1)} X$ are iid $N(0, 1)$.

- We can use Gram-Schmidt orthogonalization to write
\[
Z^{(1)} = W T'
\]
where:
  - $T$ is $(p \times p)$ lower-triangular;
  - the columns of $W$ are mutually orthogonal and normalized;

and
\[
A = Z^{(1)\prime} Z^{(1)} = T T'.
\]
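The factorization $Z^{(1)} = WT'$ can be computed numerically. The sketch below uses NumPy's QR routine instead of explicit Gram-Schmidt (a different algorithm that yields the same factorization once signs are fixed), and confirms that $T$ coincides with the Cholesky factor of $A$. The matrix `Z` is simulated and only stands in for $Z^{(1)}$.

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 10, 3
Z = rng.normal(size=(n, p))              # stands in for Z^(1) when Sigma = I_p

W, R = np.linalg.qr(Z)                   # reduced QR: Z = W R, R upper-triangular
signs = np.sign(np.diag(R))
W = W * signs                            # flip column signs of W ...
R = signs[:, None] * R                   # ... and row signs of R, so diag(R) > 0
T = R.T                                  # lower-triangular, with Z = W T'

A = Z.T @ Z
print(np.allclose(W @ T.T, Z))           # Z = W T'
print(np.allclose(W.T @ W, np.eye(p)))   # orthonormal columns
print(np.allclose(T @ T.T, A))           # A = T T'
print(np.allclose(T, np.linalg.cholesky(A)))
```

With a positive diagonal the lower-triangular factor of $A$ is unique, which is why the last check succeeds.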


- Anderson shows that, when $n \ge p$:
  - the non-zero elements of $T$ are independent;
  - the sub-diagonal elements are all distributed as $N(0, 1)$;
  - the diagonal element $t_{i,i}$ satisfies $t_{i,i}^2 \sim \chi^2_{n-i+1}$.

- That is, the joint density of these $\frac{1}{2} p(p + 1)$ variables is
\[
\frac{\displaystyle \prod_{i=1}^{p} t_{i,i}^{\,n-i} \exp\left( -\frac{1}{2} \sum_{i=1}^{p} \sum_{j=1}^{i} t_{i,j}^2 \right)}
{\displaystyle 2^{\frac{1}{2} p(n-2)} \, \pi^{\frac{1}{4} p(p-1)} \prod_{i=1}^{p} \Gamma\left[ \frac{1}{2}(n + 1 - i) \right]}
\]
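The distributional facts above suggest a way to generate $T$ directly, without first drawing $n$ normal vectors. The sketch below fills the diagonal with square roots of chi-square variates and the sub-diagonal with standard normals, then checks by simulation that $TT'$ averages to $nI_p$. The helper name `bartlett_factor` and all numeric settings are assumptions for this illustration.

```python
import numpy as np

def bartlett_factor(n, p, rng):
    """Lower-triangular T with t_ii^2 ~ chi^2_{n-i+1} and t_ij ~ N(0,1) for j < i."""
    T = np.tril(rng.normal(size=(p, p)), k=-1)    # strictly sub-diagonal part
    dof = n - np.arange(p)                        # n, n-1, ..., n-p+1
    T[np.diag_indices(p)] = np.sqrt(rng.chisquare(dof))
    return T

rng = np.random.default_rng(5)
n, p, reps = 6, 3, 5000

total = np.zeros((p, p))
for _ in range(reps):
    T = bartlett_factor(n, p, rng)
    total += T @ T.T                              # one draw of A when Sigma = I_p
mean_A = total / reps

print(np.round(mean_A, 2))                        # should be close to n * I_p
```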


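- The result for general $\Sigma$ is that the joint density is
\[
\frac{\displaystyle \prod_{i=1}^{p} t_{i,i}^{\,n-i} \exp\left[ -\frac{1}{2} \operatorname{trace}\left( \Sigma^{-1} T T' \right) \right]}
{\displaystyle 2^{\frac{1}{2} p(n-2)} \, \pi^{\frac{1}{4} p(p-1)} \det(\Sigma)^{\frac{1}{2} n} \prod_{i=1}^{p} \Gamma\left[ \frac{1}{2}(n + 1 - i) \right]}
\]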

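- $A$ is symmetric, and hence varies in a space of the same dimensionality, $\frac{1}{2} p(p + 1)$, as $T$.

- The density of $A$ is found by transformation:
\[
\frac{\displaystyle \det(A)^{\frac{1}{2}(n-p-1)} \exp\left[ -\frac{1}{2} \operatorname{trace}\left( \Sigma^{-1} A \right) \right]}
{\displaystyle 2^{\frac{1}{2} pn} \, \pi^{\frac{1}{4} p(p-1)} \det(\Sigma)^{\frac{1}{2} n} \prod_{i=1}^{p} \Gamma\left[ \frac{1}{2}(n + 1 - i) \right]}
\]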
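The density of $A$ is easiest to evaluate on the log scale. The following sketch transcribes the formula term by term; the function name `wishart_logpdf` is hypothetical, and the check uses the special case $p = 1$, where $A/\sigma^2$ has a chi-square distribution with $n$ degrees of freedom.

```python
import math
import numpy as np

def wishart_logpdf(A, Sigma, n):
    """Log of the W_p(Sigma, n) density at a positive definite matrix A."""
    A = np.asarray(A, dtype=float)
    Sigma = np.asarray(Sigma, dtype=float)
    p = A.shape[0]
    _, logdet_A = np.linalg.slogdet(A)
    _, logdet_Sigma = np.linalg.slogdet(Sigma)
    trace_term = np.trace(np.linalg.solve(Sigma, A))     # trace(Sigma^{-1} A)
    log_norm = (0.5 * p * n * math.log(2.0)
                + 0.25 * p * (p - 1) * math.log(math.pi)
                + 0.5 * n * logdet_Sigma
                + sum(math.lgamma(0.5 * (n + 1 - i)) for i in range(1, p + 1)))
    return 0.5 * (n - p - 1) * logdet_A - 0.5 * trace_term - log_norm

# Special case p = 1: a / sigma2 ~ chi-square with n degrees of freedom.
n, sigma2, a = 5, 2.0, 3.0
log_chi2_scaled = ((n / 2 - 1) * math.log(a) - a / (2 * sigma2)
                   - (n / 2) * math.log(2 * sigma2) - math.lgamma(n / 2))
print(wishart_logpdf([[a]], [[sigma2]], n), log_chi2_scaled)
```

If SciPy is installed, `scipy.stats.wishart.logpdf(A, df=n, scale=Sigma)` should offer an independent cross-check for $p > 1$.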

Characteristic Function

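- Let $\Theta$ be a real, symmetric $(p \times p)$ matrix.

- Then
\[
\operatorname{trace}(A\Theta) = \sum_{i=1}^{p} A_{i,i}\,\theta_{i,i} + 2 \sum_{i=1}^{p-1} \sum_{j=i+1}^{p} A_{i,j}\,\theta_{i,j}.
\]

- So the joint characteristic function of $A_{1,1}, A_{2,2}, \dots, A_{p,p}, 2A_{1,2}, 2A_{1,3}, \dots, 2A_{p-1,p}$ is
\[
E\{\exp[i \operatorname{trace}(A\Theta)]\}.
\]

- Since $A$ can be written as
\[
A = \sum_{\alpha=1}^{n} z_\alpha z_\alpha',
\]
we have
\[
\operatorname{trace}(A\Theta) = \operatorname{trace}\left( \sum_{\alpha=1}^{n} z_\alpha z_\alpha' \Theta \right) = \sum_{\alpha=1}^{n} z_\alpha' \Theta z_\alpha.
\]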
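Both expressions for $\operatorname{trace}(A\Theta)$ are exact algebraic identities, so they can be checked on any simulated $Z$ and symmetric $\Theta$. A minimal sketch with arbitrary assumed dimensions:

```python
import numpy as np

rng = np.random.default_rng(4)
n, p = 6, 3
Z = rng.normal(size=(n, p))              # rows are z_alpha'
A = Z.T @ Z                              # A = sum of z_alpha z_alpha'

M = rng.normal(size=(p, p))
Theta = (M + M.T) / 2                    # real symmetric Theta

trace_form = np.trace(A @ Theta)

# Diagonal terms plus twice the strictly upper-triangular terms.
iu = np.triu_indices(p, k=1)
expanded = np.sum(np.diag(A) * np.diag(Theta)) + 2 * np.sum(A[iu] * Theta[iu])

# Sum of quadratic forms z_alpha' Theta z_alpha.
quad_form = sum(z @ Theta @ z for z in Z)

print(trace_form, expanded, quad_form)
```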


- It follows that
\[
E\{\exp[i \operatorname{trace}(A\Theta)]\} = \frac{1}{\det(I_p - 2i\Theta\Sigma)^{\frac{1}{2} n}}.
\]
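The characteristic function can be compared with a Monte Carlo average. In the sketch below the determinant is evaluated through the real eigenvalues $\lambda_j$ of $L'\Theta L$, where $\Sigma = LL'$, using $\det(I_p - 2i\Theta\Sigma) = \prod_j (1 - 2i\lambda_j)$; taking the power factor by factor is a choice made here to keep the complex power on its principal branch. The values of $\Sigma$, $\Theta$, and $n$ are assumptions for illustration.

```python
import numpy as np

def wishart_cf(Theta, Sigma, n):
    """det(I_p - 2i Theta Sigma)^(-n/2), computed via eigenvalues of L' Theta L."""
    L = np.linalg.cholesky(Sigma)                  # Sigma = L L'
    lam = np.linalg.eigvalsh(L.T @ Theta @ L)      # real: same as eigenvalues of Theta Sigma
    return np.prod((1 - 2j * lam) ** (-n / 2))

rng = np.random.default_rng(7)
n, p, reps = 4, 2, 50000
Sigma = np.array([[1.0, 0.3],
                  [0.3, 2.0]])
Theta = np.array([[0.20, 0.10],
                  [0.10, -0.15]])

L = np.linalg.cholesky(Sigma)
Z = rng.normal(size=(reps, n, p)) @ L.T            # rows of Z[r] ~ N_p(0, Sigma)
# trace(A Theta) = sum over alpha of z_alpha' Theta z_alpha, per replication
q = np.einsum("rni,ij,rnj->r", Z, Theta, Z)

cf_mc = np.mean(np.exp(1j * q))
cf_exact = wishart_cf(Theta, Sigma, n)
print(cf_mc, cf_exact)
```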

- Note that if matrices $A_i$ are independent, with $A_i \sim W_p(\Sigma, n_i)$, then
\[
\sum_i A_i \sim W_p\left( \Sigma, \sum_i n_i \right).
\]

- This follows either from the representation of $A_i$ or from multiplying the individual characteristic functions.
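The representation argument for additivity amounts to stacking the underlying normal vectors. A short sketch, with assumed degrees of freedom $n_1$ and $n_2$:

```python
import numpy as np

rng = np.random.default_rng(6)
p, n1, n2 = 3, 4, 5
Z1 = rng.normal(size=(n1, p))            # n1 independent rows behind A1
Z2 = rng.normal(size=(n2, p))            # n2 independent rows behind A2

A1 = Z1.T @ Z1
A2 = Z2.T @ Z2

Z = np.vstack([Z1, Z2])                  # n1 + n2 independent rows
A_sum = Z.T @ Z

print(np.allclose(A1 + A2, A_sum), Z.shape[0])
```

Because the stacked rows are again independent draws from the same normal distribution, $A_1 + A_2$ has the sum-of-outer-products form with $n_1 + n_2$ terms.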

Linear Transformation

- If $A \sim W_p(\Sigma, n)$ and $B = CAC'$ for some nonsingular $C$, then
\[
B \sim W_p(C\Sigma C', n).
\]

- This also follows either from the representation of $A$ or from transforming the characteristic function.
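In terms of the representation, $B = CAC'$ is the sum of outer products of the transformed vectors $Cz_\alpha$, each of which is $N_p(0, C\Sigma C')$. A sketch with an arbitrary assumed $C$:

```python
import numpy as np

rng = np.random.default_rng(8)
n, p = 5, 3
Z = rng.normal(size=(n, p))                      # rows z_alpha'
A = Z.T @ Z

C = rng.normal(size=(p, p)) + 3.0 * np.eye(p)    # an arbitrary matrix for illustration
B = C @ A @ C.T

Y = Z @ C.T                                      # rows are (C z_alpha)'
B_from_rows = Y.T @ Y                            # sum of (C z_alpha)(C z_alpha)'

print(np.allclose(B, B_from_rows))
```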

Marginal Distribution

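- If $A \sim W_p(\Sigma, n)$ and $A$ and $\Sigma$ are partitioned as
\[
A = \begin{pmatrix} A_{1,1} & A_{1,2} \\ A_{2,1} & A_{2,2} \end{pmatrix}, \qquad
\Sigma = \begin{pmatrix} \Sigma_{1,1} & \Sigma_{1,2} \\ \Sigma_{2,1} & \Sigma_{2,2} \end{pmatrix},
\]
then marginally, $A_{i,i} \sim W_{p_i}(\Sigma_{i,i}, n)$, $i = 1, 2$.

- If $\Sigma_{1,2} = \Sigma_{2,1}' = 0$, then furthermore $A_{1,1}$ and $A_{2,2}$ are independent.

- This also follows either from the representation of $A$ or from the structure of the characteristic function.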
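The representation makes the marginal result visible: each diagonal block of $A$ is built only from the corresponding sub-vectors of the $z_\alpha$. A sketch with assumed block sizes $p_1$ and $p_2$:

```python
import numpy as np

rng = np.random.default_rng(10)
n, p1, p2 = 6, 2, 3
Z = rng.normal(size=(n, p1 + p2))        # rows z_alpha'
A = Z.T @ Z

Z_first = Z[:, :p1]                      # first p1 coordinates of each z_alpha
Z_second = Z[:, p1:]                     # remaining p2 coordinates

A11 = A[:p1, :p1]
A22 = A[p1:, p1:]

print(np.allclose(A11, Z_first.T @ Z_first))
print(np.allclose(A22, Z_second.T @ Z_second))
```

When $\Sigma_{1,2} = 0$ the two sets of sub-vectors are independent, so the blocks built from them are independent as well.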

Conditional Distribution

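- If $A \sim W_p(\Sigma, n)$ and $A$ and $\Sigma$ are partitioned as above, then
\[
A_{1,1\cdot 2} = A_{1,1} - A_{1,2} A_{2,2}^{-1} A_{2,1} \sim W_{p_1}(\Sigma_{1,1\cdot 2}, n - p_2),
\]
where $\Sigma_{1,1\cdot 2} = \Sigma_{1,1} - \Sigma_{1,2} \Sigma_{2,2}^{-1} \Sigma_{2,1}$, and is independent of $A_{2,2}$ and $A_{1,2} A_{2,2}^{-1}$.

- This is proved using the representation of $A$ and other arguments presented earlier.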
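One way to see where the $n - p_2$ degrees of freedom come from is to write $A_{1,1\cdot 2}$ as a residual sum of squares. The sketch below is an illustration under assumed dimensions, not necessarily the proof the notes refer to: it regresses the first block of columns on the second and checks that the projection removing the fitted part has trace $n - p_2$.

```python
import numpy as np

rng = np.random.default_rng(9)
n, p1, p2 = 8, 2, 3
Z = rng.normal(size=(n, p1 + p2))
Z1, Z2 = Z[:, :p1], Z[:, p1:]

A = Z.T @ Z
A11, A12 = A[:p1, :p1], A[:p1, p1:]
A21, A22 = A[p1:, :p1], A[p1:, p1:]

A11_2 = A11 - A12 @ np.linalg.solve(A22, A21)     # A11 - A12 A22^{-1} A21

P = Z2 @ np.linalg.solve(A22, Z2.T)               # projection onto columns of Z2
resid = (np.eye(n) - P) @ Z1                      # residuals of Z1 regressed on Z2

print(np.allclose(A11_2, resid.T @ resid))
print(np.trace(np.eye(n) - P))                    # equals n - p2 up to rounding
```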
