Wishart Distribution

Statistics 784: Multivariate Analysis, NC State University

- Consider, as usual, a random sample $x_1, x_2, \dots, x_N$ from $N_p(\mu, \Sigma)$.
- Recall: the sample covariance matrix $S$ satisfies
$$S = \frac{1}{N-1} A,$$
where
$$A = \sum_{\alpha=1}^{N} (x_\alpha - \bar{x})(x_\alpha - \bar{x})'.$$
- Write
$$X = \begin{pmatrix} x_1' \\ x_2' \\ \vdots \\ x_N' \end{pmatrix}.$$
- Then
$$X_{\text{dev}} = \begin{pmatrix} (x_1 - \bar{x})' \\ (x_2 - \bar{x})' \\ \vdots \\ (x_N - \bar{x})' \end{pmatrix} = \left(I_N - \frac{1}{N} 1_N 1_N'\right) X$$
and
$$A = X_{\text{dev}}' X_{\text{dev}} = X' \left(I_N - \frac{1}{N} 1_N 1_N'\right) X.$$
- The Helmert matrix is the orthogonal matrix
$$H_N = \begin{pmatrix}
\frac{1}{\sqrt{2}} & \frac{-1}{\sqrt{2}} & 0 & \cdots & 0 \\
\frac{1}{\sqrt{6}} & \frac{1}{\sqrt{6}} & \frac{-2}{\sqrt{6}} & \cdots & 0 \\
\vdots & & & \ddots & \vdots \\
\frac{1}{\sqrt{N(N-1)}} & \frac{1}{\sqrt{N(N-1)}} & \frac{1}{\sqrt{N(N-1)}} & \cdots & \frac{-(N-1)}{\sqrt{N(N-1)}} \\
\frac{1}{\sqrt{N}} & \frac{1}{\sqrt{N}} & \frac{1}{\sqrt{N}} & \cdots & \frac{1}{\sqrt{N}}
\end{pmatrix}.$$
- If we partition $H_N$ as
$$H_N = \begin{pmatrix} H_N^{(1)} \\ \frac{1}{\sqrt{N}} \;\; \frac{1}{\sqrt{N}} \;\; \frac{1}{\sqrt{N}} \;\; \cdots \;\; \frac{1}{\sqrt{N}} \end{pmatrix},$$
then
$$I_N = H_N' H_N = H_N^{(1)\prime} H_N^{(1)} + \frac{1}{N} 1_N 1_N'.$$
- So
$$I_N - \frac{1}{N} 1_N 1_N' = H_N^{(1)\prime} H_N^{(1)}$$
and
$$A = X' \left(I_N - \frac{1}{N} 1_N 1_N'\right) X = X' H_N^{(1)\prime} H_N^{(1)} X = \left(H_N^{(1)} X\right)' \left(H_N^{(1)} X\right).$$
- If $Z = H_N X$, then the rows of $Z$ are uncorrelated, and hence independent, with the first $n = N - 1$ distributed as $N_p(0, \Sigma)$ and the last distributed as $N_p(\sqrt{N}\mu, \Sigma)$.
- So we can write
$$A = \sum_{\alpha=1}^{n} z_\alpha z_\alpha',$$
where $z_1, z_2, \dots, z_n$ is a random sample from $N_p(0, \Sigma)$.
- $A$ is said to have the Wishart distribution $W_p(\Sigma, n)$.

Distribution of A

- First assume that $\Sigma = I_p$.
- Then the elements of $Z^{(1)} = H_N^{(1)} X$ are iid $N(0, 1)$.
- We can use Gram-Schmidt orthogonalization to write $Z^{(1)} = W T'$, where:
  - $T$ is $(p \times p)$ lower-triangular;
  - the columns of $W$ are mutually orthogonal and normalized;

  and $A = Z^{(1)\prime} Z^{(1)} = T T'$.
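The Helmert construction and the triangular factorization above can be checked numerically. The following is a minimal sketch, assuming NumPy; the function names `helmert` and `scatter_two_ways` are illustrative and not part of the slides, and the QR factorization is used here as a stand-in for the Gram-Schmidt step (it yields the same kind of orthonormal-times-triangular factorization, up to signs).

```python
import numpy as np

def helmert(N):
    """Return the N x N Helmert matrix with the constant row last."""
    H = np.zeros((N, N))
    for k in range(1, N):
        # Row k (1-based): k entries of 1/sqrt(k(k+1)), then -k/sqrt(k(k+1)), then zeros.
        H[k - 1, :k] = 1.0 / np.sqrt(k * (k + 1))
        H[k - 1, k] = -k / np.sqrt(k * (k + 1))
    H[N - 1, :] = 1.0 / np.sqrt(N)
    return H

def scatter_two_ways(X):
    """Compute A from centered data and from the first N-1 Helmert rows."""
    N = X.shape[0]
    H1 = helmert(N)[:-1]            # H_N^{(1)}: drop the constant row
    X_dev = X - X.mean(axis=0)      # rows are (x_alpha - xbar)'
    A_direct = X_dev.T @ X_dev
    Z1 = H1 @ X                     # Z^{(1)} = H_N^{(1)} X
    A_helmert = Z1.T @ Z1
    return A_direct, A_helmert, Z1

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 3))         # hypothetical data: N = 6, p = 3
A_direct, A_helmert, Z1 = scatter_two_ways(X)

# QR in place of Gram-Schmidt: Z1 = W R with R upper-triangular, so T = R'.
W, R = np.linalg.qr(Z1)
T = R.T

print(np.allclose(helmert(6) @ helmert(6).T, np.eye(6)))  # orthogonality of H_N
print(np.allclose(A_direct, A_helmert))                   # two expressions for A agree
print(np.allclose(T @ T.T, A_direct))                     # A = T T'
```

The diagonal of `T` returned by QR may contain negative entries; flipping the sign of a column of `T` (and the matching column of `W`) leaves $TT'$ unchanged.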
- Anderson shows that, when $n \ge p$:
  - the non-zero elements of $T$ are independent;
  - the sub-diagonal elements are all distributed as $N(0, 1)$;
  - the diagonal element $t_{i,i}$ satisfies $t_{i,i}^2 \sim \chi^2_{n-i+1}$.
- That is, the joint density of these $\frac{1}{2} p(p+1)$ variables is
$$\frac{\displaystyle \prod_{i=1}^{p} t_{i,i}^{\,n-i} \exp\left(-\frac{1}{2} \sum_{i=1}^{p} \sum_{j=1}^{i} t_{i,j}^2\right)}{\displaystyle 2^{\frac{1}{2} p(n-2)} \, \pi^{\frac{1}{4} p(p-1)} \prod_{i=1}^{p} \Gamma\left[\frac{1}{2}(n+1-i)\right]}.$$
- The result for general $\Sigma$ is that the joint density is
$$\frac{\displaystyle \prod_{i=1}^{p} t_{i,i}^{\,n-i} \exp\left[-\frac{1}{2} \operatorname{trace}\left(\Sigma^{-1} T T'\right)\right]}{\displaystyle 2^{\frac{1}{2} p(n-2)} \, \pi^{\frac{1}{4} p(p-1)} \det(\Sigma)^{\frac{1}{2} n} \prod_{i=1}^{p} \Gamma\left[\frac{1}{2}(n+1-i)\right]}.$$
- $A$ is symmetric, and hence varies in a space of the same dimensionality, $\frac{1}{2} p(p+1)$, as $T$.
- The density of $A$ is found by transformation:
$$\frac{\displaystyle \det(A)^{\frac{1}{2}(n-p-1)} \exp\left[-\frac{1}{2} \operatorname{trace}\left(\Sigma^{-1} A\right)\right]}{\displaystyle 2^{\frac{1}{2} pn} \, \pi^{\frac{1}{4} p(p-1)} \det(\Sigma)^{\frac{1}{2} n} \prod_{i=1}^{p} \Gamma\left[\frac{1}{2}(n+1-i)\right]}.$$

Characteristic Function

- Let $\Theta$ be a real, symmetric $(p \times p)$ matrix.
- Then
$$\operatorname{trace}(A\Theta) = \sum_{i=1}^{p} A_{i,i}\,\theta_{i,i} + 2 \sum_{i=1}^{p-1} \sum_{j=i+1}^{p} A_{i,j}\,\theta_{i,j}.$$
- So the joint characteristic function of $A_{1,1}, A_{2,2}, \dots, A_{p,p}, 2A_{1,2}, 2A_{1,3}, \dots, 2A_{p-1,p}$ is
$$E\{\exp[i \operatorname{trace}(A\Theta)]\}.$$
- Since $A$ can be written as
$$A = \sum_{\alpha=1}^{n} z_\alpha z_\alpha',$$
we have
$$\operatorname{trace}(A\Theta) = \operatorname{trace}\left(\sum_{\alpha=1}^{n} z_\alpha z_\alpha' \Theta\right) = \sum_{\alpha=1}^{n} z_\alpha' \Theta z_\alpha.$$
- It follows that
$$E\{\exp[i \operatorname{trace}(A\Theta)]\} = \frac{1}{\det(I_p - 2i\Theta\Sigma)^{\frac{1}{2} n}}.$$
- Note that if matrices $A_i$ are independent, with $A_i \sim W_p(\Sigma, n_i)$, then
$$\sum_i A_i \sim W_p\left(\Sigma, \sum_i n_i\right).$$
- This follows either from the representation of $A_i$ or from multiplying the individual characteristic functions.
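The distributional description of $T$ for $\Sigma = I_p$ suggests a direct way to simulate $A = TT'$. The sketch below assumes NumPy; `sample_bartlett_factor` is a hypothetical helper name, and the Monte Carlo average is only a sanity check that $E(A) = n I_p$, which follows from $E(t_{i,i}^2) = n - i + 1$ plus $i - 1$ unit-variance sub-diagonal terms in row $i$.

```python
import numpy as np

def sample_bartlett_factor(n, p, rng):
    """Draw lower-triangular T with the distribution stated for Sigma = I_p."""
    if n < p:
        raise ValueError("this construction requires n >= p")
    T = np.zeros((p, p))
    for i in range(p):
        # With 0-based i, the slide's chi-square degrees of freedom n - i + 1 become n - i.
        T[i, i] = np.sqrt(rng.chisquare(n - i))
        # Sub-diagonal elements of row i are independent N(0, 1).
        T[i, :i] = rng.standard_normal(i)
    return T

rng = np.random.default_rng(1)
n, p, reps = 7, 3, 20000            # illustrative values only
A_mean = np.zeros((p, p))
for _ in range(reps):
    T = sample_bartlett_factor(n, p, rng)
    A_mean += (T @ T.T) / reps      # running average of A = T T'

print(np.round(A_mean, 2))          # should be close to n * I_p
```

Sampling $T$ this way needs only $p$ chi-square draws and $\frac{1}{2}p(p-1)$ normal draws per replicate, rather than the $np$ normal draws required by forming $Z^{(1)}$ explicitly.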
Linear Transformation

- If $A \sim W_p(\Sigma, n)$ and $B = C A C'$ for some nonsingular $C$, then
$$B \sim W_p(C \Sigma C', n).$$
- This also follows either from the representation of $A$ or from transforming the characteristic function.

Marginal Distribution

- If $A \sim W_p(\Sigma, n)$ and $A$ and $\Sigma$ are partitioned as
$$A = \begin{pmatrix} A_{1,1} & A_{1,2} \\ A_{2,1} & A_{2,2} \end{pmatrix}, \qquad \Sigma = \begin{pmatrix} \Sigma_{1,1} & \Sigma_{1,2} \\ \Sigma_{2,1} & \Sigma_{2,2} \end{pmatrix},$$
then marginally, $A_{i,i} \sim W_{p_i}(\Sigma_{i,i}, n)$, $i = 1, 2$.
- If $\Sigma_{1,2} = \Sigma_{2,1}' = 0$, then furthermore $A_{1,1}$ and $A_{2,2}$ are independent.
- This also follows either from the representation of $A$ or from the structure of the characteristic function.

Conditional Distribution

- If $A \sim W_p(\Sigma, n)$ and $A$ and $\Sigma$ are partitioned as above, then
$$A_{1,1 \cdot 2} = A_{1,1} - A_{1,2} A_{2,2}^{-1} A_{2,1} \sim W_{p_1}(\Sigma_{1,1 \cdot 2}, n - p_2)$$
and is independent of $A_{2,2}$ and $A_{1,2} A_{2,2}^{-1}$.
- This is proved using the representation of $A$ together with other arguments presented earlier.
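The conditional-distribution result can be probed by simulation. This is a rough sketch, assuming NumPy and assuming that $\Sigma_{1,1 \cdot 2}$ denotes $\Sigma_{1,1} - \Sigma_{1,2}\Sigma_{2,2}^{-1}\Sigma_{2,1}$ by analogy with $A_{1,1 \cdot 2}$ (the slides do not spell this out). The matrix `Sigma`, the sizes, and the helper `schur_complement` are illustrative choices. Since a $W_{p_1}(\Sigma_{1,1 \cdot 2}, n - p_2)$ matrix has mean $(n - p_2)\Sigma_{1,1 \cdot 2}$, the simulated average of $A_{1,1 \cdot 2}$ should land near that value.

```python
import numpy as np

def schur_complement(M, p1):
    """Return M11 - M12 M22^{-1} M21 for the leading p1 x p1 block (batched over leading axes)."""
    M11, M12 = M[..., :p1, :p1], M[..., :p1, p1:]
    M21, M22 = M[..., p1:, :p1], M[..., p1:, p1:]
    return M11 - M12 @ np.linalg.solve(M22, M21)

rng = np.random.default_rng(2)
Sigma = np.array([[2.0, 0.5, 0.3],
                  [0.5, 1.0, 0.2],
                  [0.3, 0.2, 1.5]])   # hypothetical positive-definite covariance
p = Sigma.shape[0]
n, p1, reps = 10, 1, 20000
p2 = p - p1

# Representation A = sum_alpha z_alpha z_alpha' with z_alpha ~ N_p(0, Sigma).
C = np.linalg.cholesky(Sigma)                  # Sigma = C C'
Z = rng.standard_normal((reps, n, p)) @ C.T    # each row of Z[r] has covariance Sigma
A = Z.transpose(0, 2, 1) @ Z                   # reps independent draws of A

mean_A11_2 = schur_complement(A, p1).mean(axis=0)
target = (n - p2) * schur_complement(Sigma, p1)

print(np.round(mean_A11_2, 3), np.round(target, 3))
```

Matching means is only a necessary condition for the stated distribution; it does not test the independence claims.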
