
Covariance and Correlation
Math 217 Probability and Statistics
Prof. D. Joyce, Fall 2014

Covariance. Let $X$ and $Y$ be joint random variables. Their covariance $\operatorname{Cov}(X,Y)$ is defined by
$$\operatorname{Cov}(X,Y) = E((X-\mu_X)(Y-\mu_Y)).$$
Notice that the variance of $X$ is just the covariance of $X$ with itself:
$$\operatorname{Var}(X) = E((X-\mu_X)^2) = \operatorname{Cov}(X,X).$$
Analogous to the identity for variance,
$$\operatorname{Var}(X) = E(X^2) - \mu_X^2,$$
there is an identity for covariance:
$$\operatorname{Cov}(X,Y) = E(XY) - \mu_X\mu_Y.$$
Here's the proof:
$$\begin{aligned}
\operatorname{Cov}(X,Y) &= E((X-\mu_X)(Y-\mu_Y)) \\
&= E(XY - \mu_X Y - X\mu_Y + \mu_X\mu_Y) \\
&= E(XY) - \mu_X E(Y) - E(X)\mu_Y + \mu_X\mu_Y \\
&= E(XY) - \mu_X\mu_Y.
\end{aligned}$$
Covariance can be positive, zero, or negative. Positive covariance indicates an overall tendency that when one variable increases, so does the other, while negative covariance indicates an overall tendency that when one increases, the other decreases.

If $X$ and $Y$ are independent variables, then their covariance is 0:
$$\operatorname{Cov}(X,Y) = E(XY) - \mu_X\mu_Y = E(X)E(Y) - \mu_X\mu_Y = 0.$$
The converse, however, is not always true: $\operatorname{Cov}(X,Y)$ can be 0 for variables that are not independent. For an example where the covariance is 0 but $X$ and $Y$ aren't independent, let there be three outcomes, $(-1,1)$, $(0,-2)$, and $(1,1)$, all with the same probability $\frac{1}{3}$. They're clearly not independent, since the value of $X$ determines the value of $Y$. Note that $\mu_X = 0$ and $\mu_Y = 0$, so
$$\operatorname{Cov}(X,Y) = E((X-\mu_X)(Y-\mu_Y)) = E(XY) = \tfrac{1}{3}(-1) + \tfrac{1}{3}\cdot 0 + \tfrac{1}{3}\cdot 1 = 0.$$
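As a quick numerical check of this example (my addition, not part of the original notes, and assuming NumPy is available), here is a short Python sketch that computes $\operatorname{Cov}(X,Y) = E(XY) - \mu_X\mu_Y$ over the three outcomes and confirms it is 0 even though $Y$ is determined by $X$:

```python
import numpy as np

# The three equally likely outcomes (x, y) from the example above.
outcomes = np.array([(-1, 1), (0, -2), (1, 1)], dtype=float)
p = np.full(3, 1/3)                 # each outcome has probability 1/3

x, y = outcomes[:, 0], outcomes[:, 1]
mu_x, mu_y = p @ x, p @ y           # both means come out to 0
cov = p @ (x * y) - mu_x * mu_y     # Cov(X,Y) = E(XY) - mu_X mu_Y

print(mu_x, mu_y, cov)              # 0.0 0.0 0.0, yet Y is a function of X
```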

We've already seen that when $X$ and $Y$ are independent, the variance of their sum is the sum of their variances. There's a general formula to deal with their sum when they aren't independent; a covariance term appears in that formula:
$$\operatorname{Var}(X+Y) = \operatorname{Var}(X) + \operatorname{Var}(Y) + 2\operatorname{Cov}(X,Y).$$
Here's the proof:
$$\begin{aligned}
\operatorname{Var}(X+Y) &= E((X+Y)^2) - (E(X+Y))^2 \\
&= E(X^2 + 2XY + Y^2) - (\mu_X + \mu_Y)^2 \\
&= E(X^2) + 2E(XY) + E(Y^2) - \mu_X^2 - 2\mu_X\mu_Y - \mu_Y^2 \\
&= E(X^2) - \mu_X^2 + 2(E(XY) - \mu_X\mu_Y) + E(Y^2) - \mu_Y^2 \\
&= \operatorname{Var}(X) + 2\operatorname{Cov}(X,Y) + \operatorname{Var}(Y).
\end{aligned}$$
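Here is a similar sketch (again my addition, assuming NumPy) that spot-checks the sum formula on correlated sample data. Since NumPy's np.var uses the same $E(X^2) - \mu_X^2$ convention, the identity holds exactly, up to floating-point roundoff:

```python
import numpy as np

# Spot-check of Var(X+Y) = Var(X) + Var(Y) + 2 Cov(X,Y) on sample data.
rng = np.random.default_rng(0)
x = rng.normal(size=100_000)
y = 0.5 * x + rng.normal(size=100_000)          # y depends on x, so Cov(X,Y) != 0

cov_xy = np.mean(x * y) - x.mean() * y.mean()   # E(XY) - mu_X mu_Y
lhs = np.var(x + y)
rhs = np.var(x) + np.var(y) + 2 * cov_xy

print(lhs, rhs)                                 # equal up to roundoff
```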

Bilinearity of covariance. Covariance is linear in each coordinate. That means two things. First, you can pass constants through either coordinate:
$$\operatorname{Cov}(aX, Y) = a\operatorname{Cov}(X,Y) = \operatorname{Cov}(X, aY).$$
Second, it preserves sums in each coordinate:
$$\operatorname{Cov}(X_1 + X_2, Y) = \operatorname{Cov}(X_1, Y) + \operatorname{Cov}(X_2, Y)$$
and
$$\operatorname{Cov}(X, Y_1 + Y_2) = \operatorname{Cov}(X, Y_1) + \operatorname{Cov}(X, Y_2).$$

Here's a proof of the first equation in the first condition:
$$\begin{aligned}
\operatorname{Cov}(aX, Y) &= E((aX - E(aX))(Y - E(Y))) \\
&= E(a(X - E(X))(Y - E(Y))) \\
&= aE((X - E(X))(Y - E(Y))) \\
&= a\operatorname{Cov}(X,Y).
\end{aligned}$$
The proof of the second condition is also straightforward.
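Both bilinearity properties can be spot-checked numerically as well. A minimal sketch, assuming NumPy; the cov helper is my own, not a library function:

```python
import numpy as np

# Numerical spot-check of the two bilinearity properties.
rng = np.random.default_rng(1)
x1, x2, y = rng.normal(size=(3, 100_000))
a = 2.5

def cov(u, v):
    """Population-style covariance: E(UV) - E(U)E(V)."""
    return np.mean(u * v) - u.mean() * v.mean()

print(np.isclose(cov(a * x1, y), a * cov(x1, y)))            # constants pass through
print(np.isclose(cov(x1 + x2, y), cov(x1, y) + cov(x2, y)))  # sums are preserved
```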

Correlation. The correlation $\rho_{XY}$ of two joint variables $X$ and $Y$ is a normalized version of their covariance. It's defined by the equation
$$\rho_{XY} = \frac{\operatorname{Cov}(X,Y)}{\sigma_X \sigma_Y}.$$
Note that independent variables have 0 correlation as well as 0 covariance.

By dividing by the product $\sigma_X \sigma_Y$ of the standard deviations, the correlation becomes bounded between plus and minus 1:
$$-1 \le \rho_{XY} \le 1.$$
There are various ways you can prove that inequality. Here's one. We'll start by proving
$$0 \le \operatorname{Var}\!\left(\frac{X}{\sigma_X} \pm \frac{Y}{\sigma_Y}\right) = 2(1 \pm \rho_{XY}).$$
There are actually two equations there, and we can prove them at the same time. First note that the "$0 \le$" parts follow from the fact that variance is nonnegative. Next use the property proved above about the variance of a sum:
$$\operatorname{Var}\!\left(\frac{X}{\sigma_X} \pm \frac{Y}{\sigma_Y}\right) = \operatorname{Var}\!\left(\frac{X}{\sigma_X}\right) + \operatorname{Var}\!\left(\frac{\pm Y}{\sigma_Y}\right) + 2\operatorname{Cov}\!\left(\frac{X}{\sigma_X}, \frac{\pm Y}{\sigma_Y}\right).$$
Now use the fact that $\operatorname{Var}(cX) = c^2\operatorname{Var}(X)$ to rewrite that as
$$\frac{1}{\sigma_X^2}\operatorname{Var}(X) + \frac{1}{\sigma_Y^2}\operatorname{Var}(\pm Y) + 2\operatorname{Cov}\!\left(\frac{X}{\sigma_X}, \frac{\pm Y}{\sigma_Y}\right).$$
But $\operatorname{Var}(X) = \sigma_X^2$ and $\operatorname{Var}(-Y) = \operatorname{Var}(Y) = \sigma_Y^2$, so that equals
$$2 + 2\operatorname{Cov}\!\left(\frac{X}{\sigma_X}, \frac{\pm Y}{\sigma_Y}\right).$$
By the bilinearity of covariance, that equals
$$2 \pm \frac{2}{\sigma_X \sigma_Y}\operatorname{Cov}(X,Y) = 2 \pm 2\rho_{XY},$$
and we've shown that
$$0 \le 2(1 \pm \rho_{XY}).$$
Next, divide by 2 and move one term to the other side of the inequality to get
$$\mp\rho_{XY} \le 1,$$
so
$$-1 \le \rho_{XY} \le 1.$$

This exercise should remind you of the same kind of thing that goes on in linear algebra. In fact, it is the same thing exactly. Take a set of real-valued random variables, not necessarily independent. Their linear combinations form a vector space. Their covariance is the inner product (also called the dot product or scalar product) of two vectors in that space:
$$X \cdot Y = \operatorname{Cov}(X,Y).$$
The norm $\|X\|$ of $X$ is the square root of $\|X\|^2$, defined by
$$\|X\|^2 = X \cdot X = \operatorname{Cov}(X,X) = \operatorname{Var}(X) = \sigma_X^2,$$
and, so, the angle $\theta$ between $X$ and $Y$ is defined by
$$\cos\theta = \frac{X \cdot Y}{\|X\|\,\|Y\|} = \frac{\operatorname{Cov}(X,Y)}{\sigma_X \sigma_Y} = \rho_{XY},$$
that is, $\theta$ is the arccosine of the correlation $\rho_{XY}$.
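To tie the two viewpoints together, here is a last sketch (assuming NumPy; the construction of y is an arbitrary choice of mine) that computes $\rho_{XY}$ from the definition, confirms it lies in $[-1,1]$, and recovers the angle $\theta = \arccos\rho_{XY}$:

```python
import numpy as np

# Correlation as normalized covariance, and the angle it defines.
rng = np.random.default_rng(2)
x = rng.normal(size=100_000)
y = 0.8 * x + 0.6 * rng.normal(size=100_000)   # built so rho is near 0.8

cov_xy = np.mean(x * y) - x.mean() * y.mean()
rho = cov_xy / (x.std() * y.std())     # rho_XY = Cov(X,Y) / (sigma_X sigma_Y)
theta = np.arccos(rho)                 # X . Y = Cov(X,Y), so cos(theta) = rho

print(rho, -1 <= rho <= 1)             # about 0.8, True
print(np.degrees(theta))               # about 36.9 degrees
```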

Math 217 Home Page at http://math.clarku.edu/~djoyce/ma217/