STA135 Lecture 2: Sample Mean Vector and Sample Covariance Matrix
Xiaodong Li, UC Davis

1 Sample mean and sample covariance

Recall that in the 1-dimensional case, for a sample $x_1, \dots, x_n$ we define
\[
\bar{x} = \frac{1}{n}\sum_{i=1}^n x_i
\]
as the sample mean, and
\[
s^2 := \frac{1}{n-1}\sum_{i=1}^n (x_i - \bar{x})^2
\]
as the (unbiased) sample variance.

$p$-dimensional case: Suppose we have $p$ variates $X_1, \dots, X_p$. For the vector of variates
\[
\vec{X} = \begin{bmatrix} X_1 \\ \vdots \\ X_p \end{bmatrix},
\]
we have a $p$-variate sample of size $n$:
\[
\vec{x}_1, \dots, \vec{x}_n \in \mathbb{R}^p.
\]
This sample of $n$ observations gives the following data matrix:
\[
X = \begin{bmatrix} x_{11} & x_{12} & \dots & x_{1p} \\ x_{21} & x_{22} & \dots & x_{2p} \\ \vdots & & \ddots & \vdots \\ x_{n1} & x_{n2} & \dots & x_{np} \end{bmatrix}
= \begin{bmatrix} \vec{x}_1^\top \\ \vec{x}_2^\top \\ \vdots \\ \vec{x}_n^\top \end{bmatrix}. \tag{1.1}
\]
Notice that each column of the data matrix corresponds to a particular variate $X_j$.

Sample mean: For each variate $X_j$, define the sample mean
\[
\bar{x}_j = \frac{1}{n}\sum_{i=1}^n x_{ij}, \qquad j = 1, \dots, p.
\]
Then the sample mean vector is
\[
\bar{\vec{x}} := \begin{bmatrix} \bar{x}_1 \\ \vdots \\ \bar{x}_p \end{bmatrix}
= \begin{bmatrix} \frac{1}{n}\sum_{i=1}^n x_{i1} \\ \vdots \\ \frac{1}{n}\sum_{i=1}^n x_{ip} \end{bmatrix}
= \frac{1}{n}\sum_{i=1}^n \vec{x}_i.
\]

Sample covariance matrix: For each variate $X_j$, define its sample variance as
\[
s_{jj} = s_j^2 := \frac{1}{n-1}\sum_{i=1}^n (x_{ij} - \bar{x}_j)^2, \qquad j = 1, \dots, p,
\]
and the sample covariance between $X_j$ and $X_k$ as
\[
s_{jk} = s_{kj} := \frac{1}{n-1}\sum_{i=1}^n (x_{ij} - \bar{x}_j)(x_{ik} - \bar{x}_k), \qquad 1 \le j, k \le p,\ j \neq k.
\]
The sample covariance matrix is defined as
\[
S = \begin{bmatrix} s_{11} & s_{12} & \dots & s_{1p} \\ s_{21} & s_{22} & \dots & s_{2p} \\ \vdots & & \ddots & \vdots \\ s_{p1} & s_{p2} & \dots & s_{pp} \end{bmatrix}.
\]
Then
\[
S = \begin{bmatrix}
\frac{1}{n-1}\sum_{i=1}^n (x_{i1}-\bar{x}_1)^2 & \dots & \frac{1}{n-1}\sum_{i=1}^n (x_{i1}-\bar{x}_1)(x_{ip}-\bar{x}_p) \\
\vdots & \ddots & \vdots \\
\frac{1}{n-1}\sum_{i=1}^n (x_{ip}-\bar{x}_p)(x_{i1}-\bar{x}_1) & \dots & \frac{1}{n-1}\sum_{i=1}^n (x_{ip}-\bar{x}_p)^2
\end{bmatrix}
\]
\[
= \frac{1}{n-1}\sum_{i=1}^n \begin{bmatrix}
(x_{i1}-\bar{x}_1)^2 & \dots & (x_{i1}-\bar{x}_1)(x_{ip}-\bar{x}_p) \\
\vdots & \ddots & \vdots \\
(x_{ip}-\bar{x}_p)(x_{i1}-\bar{x}_1) & \dots & (x_{ip}-\bar{x}_p)^2
\end{bmatrix}
\]
\[
= \frac{1}{n-1}\sum_{i=1}^n \begin{bmatrix} x_{i1}-\bar{x}_1 \\ \vdots \\ x_{ip}-\bar{x}_p \end{bmatrix}
\begin{bmatrix} x_{i1}-\bar{x}_1 & \dots & x_{ip}-\bar{x}_p \end{bmatrix}
= \frac{1}{n-1}\sum_{i=1}^n (\vec{x}_i - \bar{\vec{x}})(\vec{x}_i - \bar{\vec{x}})^\top.
\]

2 Linear transformation of observations

Consider a sample of $\vec{X} = (X_1, \dots, X_p)^\top$ of size $n$:
\[
\vec{x}_1, \dots, \vec{x}_n,
\]
with data matrix $X$ represented as in (1.1). For some $C \in \mathbb{R}^{q \times p}$ and $\vec{d} \in \mathbb{R}^q$, consider the linear transformation
\[
\vec{Y} = \begin{bmatrix} Y_1 \\ \vdots \\ Y_q \end{bmatrix} = C\vec{X} + \vec{d}.
\]
Then we get a $q$-variate sample
\[
\vec{y}_i = C\vec{x}_i + \vec{d}, \qquad i = 1, \dots, n.
\]
The sample mean of $\vec{y}_1, \dots, \vec{y}_n$ is
\[
\bar{\vec{y}} = \frac{1}{n}\sum_{i=1}^n \vec{y}_i
= \frac{1}{n}\sum_{i=1}^n (C\vec{x}_i + \vec{d})
= C\Big(\frac{1}{n}\sum_{i=1}^n \vec{x}_i\Big) + \vec{d}
= C\bar{\vec{x}} + \vec{d},
\]
and the sample covariance is
\[
S_y = \frac{1}{n-1}\sum_{i=1}^n (\vec{y}_i - \bar{\vec{y}})(\vec{y}_i - \bar{\vec{y}})^\top
= \frac{1}{n-1}\sum_{i=1}^n (C\vec{x}_i - C\bar{\vec{x}})(C\vec{x}_i - C\bar{\vec{x}})^\top
\]
\[
= \frac{1}{n-1}\sum_{i=1}^n C(\vec{x}_i - \bar{\vec{x}})(\vec{x}_i - \bar{\vec{x}})^\top C^\top
= C\Big(\frac{1}{n-1}\sum_{i=1}^n (\vec{x}_i - \bar{\vec{x}})(\vec{x}_i - \bar{\vec{x}})^\top\Big)C^\top
= C S_x C^\top.
\]

3 Block structure of the sample covariance

For the vector of variates $\vec{X} = (X_1, \dots, X_p)^\top$, we can divide it into two parts:
\[
\vec{X}^{(1)} = \begin{bmatrix} X_1 \\ \vdots \\ X_q \end{bmatrix}, \qquad
\vec{X}^{(2)} = \begin{bmatrix} X_{q+1} \\ \vdots \\ X_p \end{bmatrix}.
\]
In other words,
\[
\vec{X} = \begin{bmatrix} \vec{X}^{(1)} \\ \vec{X}^{(2)} \end{bmatrix},
\quad \text{in correspondence} \quad
\vec{x}_i = \begin{bmatrix} \vec{x}_i^{(1)} \\ \vec{x}_i^{(2)} \end{bmatrix},
\qquad \text{where} \quad
\vec{x}_i^{(1)} = \begin{bmatrix} x_{i1} \\ \vdots \\ x_{iq} \end{bmatrix}, \quad
\vec{x}_i^{(2)} = \begin{bmatrix} x_{i(q+1)} \\ \vdots \\ x_{ip} \end{bmatrix}.
\]
We have the partition of the sample mean vector and the sample covariance matrix as follows:
\[
\bar{\vec{x}} = \begin{bmatrix} \bar{\vec{x}}^{(1)} \\ \bar{\vec{x}}^{(2)} \end{bmatrix}, \qquad
S = \begin{bmatrix}
s_{11} & \dots & s_{1q} & s_{1,q+1} & \dots & s_{1p} \\
\vdots & \ddots & \vdots & \vdots & & \vdots \\
s_{q1} & \dots & s_{qq} & s_{q,q+1} & \dots & s_{qp} \\
s_{q+1,1} & \dots & s_{q+1,q} & s_{q+1,q+1} & \dots & s_{q+1,p} \\
\vdots & & \vdots & \vdots & \ddots & \vdots \\
s_{p1} & \dots & s_{pq} & s_{p,q+1} & \dots & s_{pp}
\end{bmatrix}
= \begin{bmatrix} S_{11} & S_{12} \\ S_{21} & S_{22} \end{bmatrix}.
\]
By definition, $S_{11}$ is the sample covariance matrix of $\vec{X}^{(1)}$ and $S_{22}$ is the sample covariance matrix of $\vec{X}^{(2)}$. Here $S_{12}$ is referred to as the sample cross-covariance matrix between $\vec{X}^{(1)}$ and $\vec{X}^{(2)}$.
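The formulas above can be checked numerically. The following sketch (using NumPy; the data, dimensions, and random seed are made up for illustration) computes the sample mean vector and sample covariance matrix directly from the definitions, and verifies the transformation rules $\bar{\vec{y}} = C\bar{\vec{x}} + \vec{d}$ and $S_y = C S_x C^\top$:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, q = 50, 3, 2
X = rng.normal(size=(n, p))        # data matrix: rows are observations x_i^T

# Sample mean vector and sample covariance matrix from the definitions
x_bar = X.mean(axis=0)             # (1/n) sum_i x_i
D = X - x_bar                      # centered observations x_i - x_bar
S = D.T @ D / (n - 1)              # (1/(n-1)) sum_i (x_i - x_bar)(x_i - x_bar)^T

# np.cov with rowvar=False uses the same n-1 normalization by default
assert np.allclose(S, np.cov(X, rowvar=False))

# Linear transformation y_i = C x_i + d
C = rng.normal(size=(q, p))
d = rng.normal(size=q)
Y = X @ C.T + d

y_bar = Y.mean(axis=0)
S_y = np.cov(Y, rowvar=False)
assert np.allclose(y_bar, C @ x_bar + d)   # mean rule:       y_bar = C x_bar + d
assert np.allclose(S_y, C @ S @ C.T)       # covariance rule: S_y = C S_x C^T
```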
In fact, we can derive the following formula:
\[
S_{21} = S_{12}^\top = \frac{1}{n-1}\sum_{i=1}^n \big(\vec{x}_i^{(2)} - \bar{\vec{x}}^{(2)}\big)\big(\vec{x}_i^{(1)} - \bar{\vec{x}}^{(1)}\big)^\top.
\]

4 Standardization and Sample Correlation Matrix

For the data matrix (1.1), the sample mean vector is denoted $\bar{\vec{x}}$ and the sample covariance matrix is denoted $S$. In particular, for $j = 1, \dots, p$, let $\bar{x}_j$ be the sample mean of the $j$-th variate and $\sqrt{s_{jj}}$ its sample standard deviation. For any entry $x_{ij}$, $i = 1, \dots, n$ and $j = 1, \dots, p$, we get the standardized entry
\[
z_{ij} = \frac{x_{ij} - \bar{x}_j}{\sqrt{s_{jj}}}.
\]
Then the data matrix $X$ is standardized to
\[
Z = \begin{bmatrix} z_{11} & z_{12} & \dots & z_{1p} \\ z_{21} & z_{22} & \dots & z_{2p} \\ \vdots & & \ddots & \vdots \\ z_{n1} & z_{n2} & \dots & z_{np} \end{bmatrix}
= \begin{bmatrix} \vec{z}_1^\top \\ \vec{z}_2^\top \\ \vdots \\ \vec{z}_n^\top \end{bmatrix}.
\]
Denote by $R$ the sample covariance matrix of the sample $\vec{z}_1, \dots, \vec{z}_n$. What is the connection between $R$ and $S$?

The $i$-th row of $Z$ can be written as
\[
\begin{bmatrix} z_{i1} \\ z_{i2} \\ \vdots \\ z_{ip} \end{bmatrix}
= \begin{bmatrix} (x_{i1} - \bar{x}_1)/\sqrt{s_{11}} \\ (x_{i2} - \bar{x}_2)/\sqrt{s_{22}} \\ \vdots \\ (x_{ip} - \bar{x}_p)/\sqrt{s_{pp}} \end{bmatrix}
= \begin{bmatrix} \frac{1}{\sqrt{s_{11}}} & & & \\ & \frac{1}{\sqrt{s_{22}}} & & \\ & & \ddots & \\ & & & \frac{1}{\sqrt{s_{pp}}} \end{bmatrix}
\begin{bmatrix} x_{i1} - \bar{x}_1 \\ x_{i2} - \bar{x}_2 \\ \vdots \\ x_{ip} - \bar{x}_p \end{bmatrix}.
\]
Let
\[
V^{-\frac{1}{2}} = \begin{bmatrix} \frac{1}{\sqrt{s_{11}}} & & & \\ & \frac{1}{\sqrt{s_{22}}} & & \\ & & \ddots & \\ & & & \frac{1}{\sqrt{s_{pp}}} \end{bmatrix}.
\]
This transformation can be represented as
\[
\vec{z}_i = V^{-\frac{1}{2}}(\vec{x}_i - \bar{\vec{x}}) = V^{-\frac{1}{2}}\vec{x}_i - V^{-\frac{1}{2}}\bar{\vec{x}}, \qquad i = 1, \dots, n.
\]
This implies that the sample mean of the new data matrix is
\[
\bar{\vec{z}} = V^{-\frac{1}{2}}(\bar{\vec{x}} - \bar{\vec{x}}) = \vec{0}.
\]
By the formula for the sample covariance under linear transformations of the variates, the sample covariance matrix of the new data matrix $Z$ is
\[
R = V^{-\frac{1}{2}}\, S\, \big(V^{-\frac{1}{2}}\big)^\top
= \begin{bmatrix}
1 & \frac{s_{12}}{\sqrt{s_{11}s_{22}}} & \dots & \frac{s_{1p}}{\sqrt{s_{11}s_{pp}}} \\
\frac{s_{21}}{\sqrt{s_{22}s_{11}}} & 1 & \dots & \frac{s_{2p}}{\sqrt{s_{22}s_{pp}}} \\
\vdots & & \ddots & \vdots \\
\frac{s_{p1}}{\sqrt{s_{pp}s_{11}}} & \frac{s_{p2}}{\sqrt{s_{pp}s_{22}}} & \dots & 1
\end{bmatrix}
=: \begin{bmatrix} r_{11} & r_{12} & \dots & r_{1p} \\ r_{21} & r_{22} & \dots & r_{2p} \\ \vdots & & \ddots & \vdots \\ r_{p1} & r_{p2} & \dots & r_{pp} \end{bmatrix}.
\]
The matrix $R$ is called the sample correlation matrix of the original data matrix $X$.
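The identity $R = V^{-1/2} S V^{-1/2}$ can also be verified numerically. The sketch below (NumPy, with made-up data whose columns deliberately have very different scales) standardizes a data matrix and confirms that its sample covariance equals the sample correlation matrix of the original data:

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 40, 4
X = rng.normal(size=(n, p)) * np.array([1.0, 5.0, 0.2, 3.0])  # unequal scales

S = np.cov(X, rowvar=False)

# Standardize: z_ij = (x_ij - x_bar_j) / sqrt(s_jj), i.e. z_i = V^{-1/2}(x_i - x_bar)
V_inv_sqrt = np.diag(1.0 / np.sqrt(np.diag(S)))
Z = (X - X.mean(axis=0)) @ V_inv_sqrt

z_bar = Z.mean(axis=0)
R = np.cov(Z, rowvar=False)

assert np.allclose(z_bar, 0.0)                       # standardized sample has mean 0
assert np.allclose(R, V_inv_sqrt @ S @ V_inv_sqrt)   # R = V^{-1/2} S V^{-1/2}
assert np.allclose(np.diag(R), 1.0)                  # unit diagonal
assert np.allclose(R, np.corrcoef(X, rowvar=False))  # matches NumPy's correlation
```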
5 Mahalanobis distance and mean-centered ellipse

Sample covariance is p.s.d. Recall that the sample covariance matrix is
\[
S = \frac{1}{n-1}\sum_{i=1}^n (\vec{x}_i - \bar{\vec{x}})(\vec{x}_i - \bar{\vec{x}})^\top.
\]
Is $S$ always positive semidefinite? Consider the spectral decomposition
\[
S = \sum_{j=1}^p \lambda_j \vec{u}_j \vec{u}_j^\top.
\]
Then $S\vec{u}_j = \lambda_j \vec{u}_j$, which implies that
\[
\vec{u}_j^\top S \vec{u}_j = \vec{u}_j^\top (\lambda_j \vec{u}_j) = \lambda_j \vec{u}_j^\top \vec{u}_j = \lambda_j.
\]
On the other hand,
\[
\vec{u}_j^\top S \vec{u}_j
= \vec{u}_j^\top \Big(\frac{1}{n-1}\sum_{i=1}^n (\vec{x}_i - \bar{\vec{x}})(\vec{x}_i - \bar{\vec{x}})^\top\Big)\vec{u}_j
= \frac{1}{n-1}\sum_{i=1}^n \vec{u}_j^\top (\vec{x}_i - \bar{\vec{x}})(\vec{x}_i - \bar{\vec{x}})^\top \vec{u}_j
= \frac{1}{n-1}\sum_{i=1}^n \big|\vec{u}_j^\top (\vec{x}_i - \bar{\vec{x}})\big|^2 \ge 0.
\]
This implies that all eigenvalues of $S$ are nonnegative, so $S$ is positive semidefinite.
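The positive-semidefiniteness argument can be mirrored in code. The sketch below (NumPy, with made-up random data) computes the spectral decomposition of a sample covariance matrix, checks that the eigenvalues are nonnegative, and verifies the key step of the proof, namely that $\vec{u}^\top S \vec{u} = \frac{1}{n-1}\sum_i |\vec{u}^\top(\vec{x}_i - \bar{\vec{x}})|^2 \ge 0$ for any vector $\vec{u}$:

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 10, 6
X = rng.normal(size=(n, p))
S = np.cov(X, rowvar=False)

# Spectral decomposition S = sum_j lambda_j u_j u_j^T (eigh since S is symmetric)
eigvals, U = np.linalg.eigh(S)
assert np.all(eigvals >= -1e-12)   # all eigenvalues nonnegative: S is p.s.d.

# Quadratic form u^T S u = (1/(n-1)) sum_i |u^T (x_i - x_bar)|^2 >= 0 for any u
u = rng.normal(size=p)
D = X - X.mean(axis=0)             # centered observations
quad = u @ S @ u
assert np.isclose(quad, np.sum((D @ u) ** 2) / (n - 1))
assert quad >= 0
```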
