Covariance Matrices

ACM 118 10/29/09

Definitions

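• Let $Y = (Y_1, Y_2, \ldots, Y_n)$ be a random vector

• Let $\mu = (\mu_1, \ldots, \mu_n)$ with $\mu_i = E(Y_i)$

• Then $E(Y) = \mu$

• We define the covariance matrix by:

$$\operatorname{cov}(Y) = E\left[ (Y - E[Y]) (Y - E[Y])^T \right]$$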

The covariance of Y with itself is sometimes referred to as a variance-covariance matrix.

Definitions Cont.

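• Alternatively, entry by entry:

$$\Sigma_{i,j} = \operatorname{cov}(Y_i, Y_j)$$

$$\Sigma = \begin{pmatrix} \Sigma_{11} & \cdots & \Sigma_{1n} \\ \vdots & \ddots & \vdots \\ \Sigma_{n1} & \cdots & \Sigma_{nn} \end{pmatrix}$$

$$\operatorname{cov}(Y_i, Y_j) = E\left[ (Y_i - E(Y_i)) (Y_j - E(Y_j)) \right]$$

Properties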

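• Let X = AY, where A is a non-random m x n matrix. Then: $E[X] = E[AY] = A\,E[Y]$

• Proof: the expectation of a random matrix with entries $x_{ij}$ is taken entry by entry,

$$E\begin{pmatrix} x_{11} & \cdots & x_{1n} \\ \vdots & \ddots & \vdots \\ x_{m1} & \cdots & x_{mn} \end{pmatrix} = \begin{pmatrix} E[x_{11}] & \cdots & E[x_{1n}] \\ \vdots & \ddots & \vdots \\ E[x_{m1}] & \cdots & E[x_{mn}] \end{pmatrix}$$

With $Y = (Y_1, \ldots, Y_n)^T$:

$$A = \begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & \ddots & \vdots \\ a_{m1} & \cdots & a_{mn} \end{pmatrix}, \quad AY = \begin{pmatrix} a_{11} Y_1 + a_{12} Y_2 + \cdots + a_{1n} Y_n \\ \vdots \\ a_{m1} Y_1 + a_{m2} Y_2 + \cdots + a_{mn} Y_n \end{pmatrix}$$

Proof Cont.

By linearity of expectation applied to each entry:

$$E[AY] = \begin{pmatrix} a_{11} E[Y_1] + a_{12} E[Y_2] + \cdots + a_{1n} E[Y_n] \\ \vdots \\ a_{m1} E[Y_1] + a_{m2} E[Y_2] + \cdots + a_{mn} E[Y_n] \end{pmatrix} \implies E[AY] = A\,E[Y]$$

Identities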

• For cov(X), the covariance matrix of X with itself, the following are true:

  - cov(X) is a symmetric n x n matrix with the variance of $X_i$ on the diagonal

  - $\operatorname{cov}(AX) = A \operatorname{cov}(X) A^T$

Proof

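• First: immediate from the definition, since $\operatorname{cov}(X_i, X_j) = \operatorname{cov}(X_j, X_i)$ and $\operatorname{cov}(X_i, X_i) = \operatorname{var}(X_i)$

• Second:

$$\begin{aligned}
\operatorname{cov}(X) &= E\left[ (X - E[X]) (X - E[X])^T \right] \\
&= E\left[ X X^T - X E[X]^T - E[X] X^T + E[X] E[X]^T \right] \\
&= E\left[ X X^T \right] - E\left[ X E[X]^T \right] - E\left[ E[X] X^T \right] + E\left[ E[X] E[X]^T \right]
\end{aligned}$$

Now set Y = AX for a non-random matrix A:

$$\begin{aligned}
\operatorname{cov}(Y) = \operatorname{cov}(AX) &= E\left[ (AX - A E[X]) (AX - A E[X])^T \right] \\
&= E\left[ A X X^T A^T - A X E[X]^T A^T - A E[X] X^T A^T + A E[X] E[X]^T A^T \right] \\
&= A E\left[ X X^T \right] A^T - A E\left[ X E[X]^T \right] A^T - A E\left[ E[X] X^T \right] A^T + A E\left[ E[X] E[X]^T \right] A^T \\
&= A \operatorname{cov}(X) A^T
\end{aligned}$$

Estimation: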

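• Let $X = (X_1, \ldots, X_n)$ and let N be the number of observations of the ith variable $X_i$. Write $x_{ik}$ for the kth observation of $X_i$ and $\bar{x}_i$ for its sample mean. Then the sample covariance is

$$p_{ij} = \frac{1}{N-1} \sum_{k=1}^{N} (x_{ik} - \bar{x}_i)(x_{jk} - \bar{x}_j)$$

Example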
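The estimator above can be sketched directly in R. This is a minimal illustration on synthetic data, not the stock file used in these slides; the helper name `sample_cov`, the matrix `A`, and the random data are assumptions made for the sketch. It compares the hand-written formula with R's built-in `cov()` and also checks that the identity cov(AX) = A cov(X) A^T carries over to the sample covariance.

```r
# Sample covariance p_ij = 1/(N-1) * sum_k (x_ik - xbar_i) * (x_jk - xbar_j).
# Rows of `obs` are observations (k = 1..N); columns are variables (i = 1..n).
sample_cov <- function(obs) {
  obs <- as.matrix(obs)
  N <- nrow(obs)
  centered <- sweep(obs, 2, colMeans(obs))  # subtract each column's mean
  crossprod(centered) / (N - 1)             # t(centered) %*% centered / (N - 1)
}

set.seed(1)                                 # synthetic data (assumption)
obs <- matrix(rnorm(200 * 3), nrow = 200, ncol = 3)
P <- sample_cov(obs)

# The hand-written estimator should agree with the built-in one.
print(all.equal(P, cov(obs)))

# Transform every observation x_k to A x_k with a hypothetical 2 x 3 matrix A.
A <- matrix(c(1, 0, 2, -1, 1, 0), nrow = 2)
print(all.equal(sample_cov(obs %*% t(A)), A %*% P %*% t(A)))
```

Storing observations as rows matches the layout `cov()` expects for a data frame such as the one read from a CSV file.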

• Stocks: one year of log daily price ratios for Microsoft, Apple, Google, and Yahoo

    > price <- read.csv("1yrlogprice.csv", header=TRUE)
    > price[1,]
             MSFT        APPL       GOOG        YHOO
    1 0.001884144 -0.01971508 0.01284900 0.007799784
    > cov(price)
                 MSFT         APPL         GOOG         YHOO
    MSFT 1.315902e-04 0.0001022785 0.0001040492 5.402314e-05
    APPL 1.022785e-04 0.0002327443 0.0001420152 9.584520e-05
    GOOG 1.040492e-04 0.0001420152 0.0001880265 5.842570e-05
    YHOO 5.402314e-05 0.0000958452 0.0000584257 3.278960e-04

Data Explained

• 4 stocks => the matrix is 4x4

• Symmetric: cov(APPL,MSFT) = cov(MSFT,APPL)

• The largest covariance between two different stocks is between Google and Apple (1.420152e-04)

Multivariate Normal Distribution

• X is an n-dimensional random vector

• X is said to have a multivariate normal distribution (with mean μ and covariance Σ) if every linear combination of its components is normally distributed:

$$X \sim N(\mu, \Sigma) \iff a^T X \sim N(a^T \mu,\; a^T \Sigma a) \text{ for every non-random vector } a$$

Multivariate Normal Cont.

• μ is an n x 1 vector, E[X] = μ

• Σ is an n x n matrix, Σ = cov(X)

• If Σ is non-singular, the density is given by:

$$f(x) = \frac{1}{(2\pi)^{n/2} |\Sigma|^{1/2}} \exp\left( -\frac{1}{2} (x - \mu)^T \Sigma^{-1} (x - \mu) \right)$$

• If A is non-random, $AX \sim N(A\mu,\; A \Sigma A^T)$

Linear Combination MVN:
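As a rough numerical illustration of AX ~ N(Aμ, AΣA^T), the sketch below draws samples from a bivariate normal, applies a non-random matrix, and compares the sample mean and covariance with Aμ and AΣA^T. The values of `mu`, `Sigma`, and `A` are hypothetical, and sampling through the Cholesky factor is one common approach chosen for this sketch, not something prescribed by these slides.

```r
set.seed(42)
mu    <- c(1, -2)                               # hypothetical mean vector
Sigma <- matrix(c(1, 0.5, 0.5, 1), nrow = 2)    # hypothetical covariance
A     <- rbind(c(1, 1), c(1, -1))               # sum and difference of coordinates

# chol() returns upper-triangular R with t(R) %*% R == Sigma, so rows of
# Z %*% R have covariance Sigma when rows of Z are standard normal.
N <- 1e5
Z <- matrix(rnorm(N * 2), ncol = 2)
X <- sweep(Z %*% chol(Sigma), 2, mu, "+")       # rows approximately N(mu, Sigma)
Y <- X %*% t(A)                                 # each row is A x

# Sample moments of Y next to the theoretical values.
print(colMeans(Y));  print(as.vector(A %*% mu))
print(cov(Y));       print(A %*% Sigma %*% t(A))
```

The sample values only approximate the theoretical ones; the gap shrinks as N grows.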

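• Consider Z = X + Y, with X and Y independent standard bivariate normal

• The density of Z is given by a convolution:

$$\begin{aligned}
f_Z(x, y) &= \frac{1}{(2\pi)^2} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} e^{-\frac{1}{2}(u^2 + v^2)} \, e^{-\frac{1}{2}\left( (x - u)^2 + (y - v)^2 \right)} \, du \, dv \\
&= \frac{1}{4\pi} e^{-\frac{1}{4}(x^2 + y^2)}
\end{aligned}$$

so Z is again bivariate normal, with mean 0 and covariance 2I.

Examples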
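A quick numerical check of the convolution, assuming X and Y are independent standard bivariate normals, so that Z = X + Y has covariance 2I and density exp(-(x^2 + y^2)/4) / (4π). Because the integrand factorises into a u-part and a v-part, the double integral is a product of two one-dimensional convolutions, which R's `integrate()` can evaluate. The function names and test points below are hypothetical.

```r
# One-dimensional convolution of two standard normal densities at x.
conv1d <- function(x) {
  integrate(function(u) dnorm(u) * dnorm(x - u), -Inf, Inf)$value
}

# The 2-D integrand factorises, so f_Z(x, y) = conv1d(x) * conv1d(y).
f_Z_numeric <- function(x, y) conv1d(x) * conv1d(y)

# Closed form for the density of N(0, 2I) in two dimensions.
f_Z_closed <- function(x, y) exp(-(x^2 + y^2) / 4) / (4 * pi)

pts <- rbind(c(0, 0), c(1, -0.5), c(2, 1.5))    # arbitrary test points
for (i in seq_len(nrow(pts))) {
  x <- pts[i, 1]; y <- pts[i, 2]
  cat(x, y, f_Z_numeric(x, y), f_Z_closed(x, y), "\n")
}
```

The two printed columns of density values should agree up to the numerical integration error.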

• Bivariate Normal Distribution

$$\Sigma = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$$

Examples Cont.

• Bivariate Normal (w/ non-zero covariance)

$$\Sigma = \begin{pmatrix} 1 & 0.5 \\ 0.5 & 1 \end{pmatrix}$$

Marginal Distributions

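$$X = \begin{pmatrix} X^{(1)} \\ X^{(2)} \end{pmatrix}, \quad X^{(1)} = (X_1, \ldots, X_p), \quad X^{(2)} = (X_{p+1}, \ldots, X_n)$$

$$\mu = \begin{pmatrix} \mu^{(1)} \\ \mu^{(2)} \end{pmatrix}, \quad \Sigma = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix} \implies X^{(1)} \sim N\left( \mu^{(1)}, \Sigma_{11} \right)$$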
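One way to see why the marginal is normal with these parameters is to write the first p coordinates as a linear transformation of X and apply the rule AX ~ N(Aμ, AΣA^T) for non-random A. The selection matrix A below is notation introduced only for this sketch, with $I_p$ the p x p identity and 0 a p x (n - p) block of zeros.

```latex
A = \begin{pmatrix} I_p & 0 \end{pmatrix}, \qquad
A X = X^{(1)}

A \mu = \mu^{(1)}, \qquad
A \Sigma A^T
  = \begin{pmatrix} I_p & 0 \end{pmatrix}
    \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix}
    \begin{pmatrix} I_p \\ 0 \end{pmatrix}
  = \Sigma_{11}

\Longrightarrow \quad X^{(1)} = A X \sim N\left( \mu^{(1)}, \Sigma_{11} \right)
```

The same argument works for any subset of coordinates by choosing the rows of A to pick out that subset.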

• The marginal distribution of any subset of coordinates is multivariate normal

Marginal Distributions Cont.

• The conditional distribution of $X^{(2)}$ given $X^{(1)}$ is normal

  - $X^{(1)}$ and $X^{(2)}$ are independent iff they are uncorrelated, i.e. $\Sigma_{12} = 0$

References:

• Emmanuel Candes, http://www.acm.caltech.edu/~emmanuel/stat/Handouts/covariance.pdf