
Covariance and Correlation
Math 217 Probability and Statistics
Prof. D. Joyce, Fall 2014

Covariance. Let $X$ and $Y$ be joint random variables. Their covariance $\operatorname{Cov}(X,Y)$ is defined by
$$\operatorname{Cov}(X,Y) = E((X-\mu_X)(Y-\mu_Y)).$$
Notice that the variance of $X$ is just the covariance of $X$ with itself:
$$\operatorname{Var}(X) = E((X-\mu_X)^2) = \operatorname{Cov}(X,X).$$
Analogous to the identity for variance,
$$\operatorname{Var}(X) = E(X^2) - \mu_X^2,$$
there is an identity for covariance:
$$\operatorname{Cov}(X,Y) = E(XY) - \mu_X\mu_Y.$$
Here's the proof:
$$\begin{aligned}
\operatorname{Cov}(X,Y) &= E((X-\mu_X)(Y-\mu_Y)) \\
&= E(XY - \mu_X Y - X\mu_Y + \mu_X\mu_Y) \\
&= E(XY) - \mu_X E(Y) - E(X)\mu_Y + \mu_X\mu_Y \\
&= E(XY) - \mu_X\mu_Y.
\end{aligned}$$
Covariance can be positive, zero, or negative. Positive covariance indicates an overall tendency that when one variable increases, so does the other, while negative covariance indicates an overall tendency that when one increases, the other decreases.

If $X$ and $Y$ are independent variables, then their covariance is 0:
$$\operatorname{Cov}(X,Y) = E(XY) - \mu_X\mu_Y = E(X)E(Y) - \mu_X\mu_Y = 0.$$
The converse, however, is not always true: $\operatorname{Cov}(X,Y)$ can be 0 for variables that are not independent. For an example where the covariance is 0 but $X$ and $Y$ aren't independent, let there be three outcomes, $(-1,1)$, $(0,-2)$, and $(1,1)$, all with the same probability $\frac{1}{3}$. They're clearly not independent, since the value of $X$ determines the value of $Y$. Note that $\mu_X = 0$ and $\mu_Y = 0$, so
$$\operatorname{Cov}(X,Y) = E((X-\mu_X)(Y-\mu_Y)) = E(XY) = \tfrac{1}{3}(-1) + \tfrac{1}{3}\cdot 0 + \tfrac{1}{3}\cdot 1 = 0.$$
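As a quick numerical check of this example (my addition, not part of the original notes, and assuming NumPy is available), here is a short Python sketch that computes $\operatorname{Cov}(X,Y) = E(XY) - \mu_X\mu_Y$ over the three outcomes and confirms it is 0 even though $Y$ is determined by $X$:

```python
import numpy as np

# The three equally likely outcomes (x, y) from the example above.
outcomes = np.array([(-1, 1), (0, -2), (1, 1)], dtype=float)
p = np.full(3, 1/3)                 # each outcome has probability 1/3

x, y = outcomes[:, 0], outcomes[:, 1]
mu_x, mu_y = p @ x, p @ y           # both means come out to 0
cov = p @ (x * y) - mu_x * mu_y     # Cov(X,Y) = E(XY) - mu_X mu_Y

print(mu_x, mu_y, cov)              # 0.0 0.0 0.0, yet Y is a function of X
```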

We've already seen that when $X$ and $Y$ are independent, the variance of their sum is the sum of their variances. There's a general formula to deal with their sum when they aren't independent; a covariance term appears in that formula:
$$\operatorname{Var}(X+Y) = \operatorname{Var}(X) + \operatorname{Var}(Y) + 2\operatorname{Cov}(X,Y).$$
Here's the proof:
$$\begin{aligned}
\operatorname{Var}(X+Y) &= E((X+Y)^2) - (E(X+Y))^2 \\
&= E(X^2 + 2XY + Y^2) - (\mu_X + \mu_Y)^2 \\
&= E(X^2) + 2E(XY) + E(Y^2) - \mu_X^2 - 2\mu_X\mu_Y - \mu_Y^2 \\
&= E(X^2) - \mu_X^2 + 2(E(XY) - \mu_X\mu_Y) + E(Y^2) - \mu_Y^2 \\
&= \operatorname{Var}(X) + 2\operatorname{Cov}(X,Y) + \operatorname{Var}(Y).
\end{aligned}$$
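Here is a similar sketch (again my addition, assuming NumPy) that spot-checks the sum formula on correlated sample data. Since NumPy's np.var uses the same $E(X^2) - \mu_X^2$ convention, the identity holds exactly, up to floating-point roundoff:

```python
import numpy as np

# Spot-check of Var(X+Y) = Var(X) + Var(Y) + 2 Cov(X,Y) on sample data.
rng = np.random.default_rng(0)
x = rng.normal(size=100_000)
y = 0.5 * x + rng.normal(size=100_000)          # y depends on x, so Cov(X,Y) != 0

cov_xy = np.mean(x * y) - x.mean() * y.mean()   # E(XY) - mu_X mu_Y
lhs = np.var(x + y)
rhs = np.var(x) + np.var(y) + 2 * cov_xy

print(lhs, rhs)                                 # equal up to roundoff
```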

Bilinearity of covariance. Covariance is linear in each coordinate. That means two things. First, you can pass constants through either coordinate:
$$\operatorname{Cov}(aX, Y) = a\operatorname{Cov}(X,Y) = \operatorname{Cov}(X, aY).$$
Second, it preserves sums in each coordinate:
$$\operatorname{Cov}(X_1 + X_2, Y) = \operatorname{Cov}(X_1, Y) + \operatorname{Cov}(X_2, Y)$$
and
$$\operatorname{Cov}(X, Y_1 + Y_2) = \operatorname{Cov}(X, Y_1) + \operatorname{Cov}(X, Y_2).$$

Here's a proof of the first equation in the first condition:
$$\begin{aligned}
\operatorname{Cov}(aX, Y) &= E((aX - E(aX))(Y - E(Y))) \\
&= E(a(X - E(X))(Y - E(Y))) \\
&= aE((X - E(X))(Y - E(Y))) \\
&= a\operatorname{Cov}(X,Y).
\end{aligned}$$
The proof of the second condition is also straightforward.
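Both bilinearity properties can be spot-checked numerically as well. A minimal sketch, assuming NumPy; the cov helper is my own, not a library function:

```python
import numpy as np

# Numerical spot-check of the two bilinearity properties.
rng = np.random.default_rng(1)
x1, x2, y = rng.normal(size=(3, 100_000))
a = 2.5

def cov(u, v):
    """Population-style covariance: E(UV) - E(U)E(V)."""
    return np.mean(u * v) - u.mean() * v.mean()

print(np.isclose(cov(a * x1, y), a * cov(x1, y)))            # constants pass through
print(np.isclose(cov(x1 + x2, y), cov(x1, y) + cov(x2, y)))  # sums are preserved
```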

Correlation. The correlation $\rho_{XY}$ of two joint variables $X$ and $Y$ is a normalized version of their covariance. It's defined by the equation
$$\rho_{XY} = \frac{\operatorname{Cov}(X,Y)}{\sigma_X \sigma_Y}.$$
Note that independent variables have 0 correlation as well as 0 covariance.

By dividing by the product $\sigma_X \sigma_Y$ of the standard deviations, the correlation becomes bounded between plus and minus 1:
$$-1 \le \rho_{XY} \le 1.$$
There are various ways you can prove that inequality. Here's one. We'll start by proving
$$0 \le \operatorname{Var}\!\left(\frac{X}{\sigma_X} \pm \frac{Y}{\sigma_Y}\right) = 2(1 \pm \rho_{XY}).$$
There are actually two equations there, and we can prove them at the same time. First note that the "$0 \le$" parts follow from the fact that variance is nonnegative. Next use the property proved above about the variance of a sum:
$$\operatorname{Var}\!\left(\frac{X}{\sigma_X} \pm \frac{Y}{\sigma_Y}\right) = \operatorname{Var}\!\left(\frac{X}{\sigma_X}\right) + \operatorname{Var}\!\left(\frac{\pm Y}{\sigma_Y}\right) + 2\operatorname{Cov}\!\left(\frac{X}{\sigma_X}, \frac{\pm Y}{\sigma_Y}\right).$$
Now use the fact that $\operatorname{Var}(cX) = c^2\operatorname{Var}(X)$ to rewrite that as
$$\frac{1}{\sigma_X^2}\operatorname{Var}(X) + \frac{1}{\sigma_Y^2}\operatorname{Var}(\pm Y) + 2\operatorname{Cov}\!\left(\frac{X}{\sigma_X}, \frac{\pm Y}{\sigma_Y}\right).$$
But $\operatorname{Var}(X) = \sigma_X^2$ and $\operatorname{Var}(-Y) = \operatorname{Var}(Y) = \sigma_Y^2$, so that equals
$$2 + 2\operatorname{Cov}\!\left(\frac{X}{\sigma_X}, \frac{\pm Y}{\sigma_Y}\right).$$
By the bilinearity of covariance, that equals
$$2 \pm \frac{2}{\sigma_X \sigma_Y}\operatorname{Cov}(X,Y) = 2 \pm 2\rho_{XY},$$
and we've shown that
$$0 \le 2(1 \pm \rho_{XY}).$$
Next, divide by 2 and move one term to the other side of the inequality to get
$$\mp\rho_{XY} \le 1,$$
so
$$-1 \le \rho_{XY} \le 1.$$

This exercise should remind you of the same kind of thing that goes on in linear algebra. In fact, it is the same thing exactly. Take a set of real-valued random variables, not necessarily independent. Their linear combinations form a vector space. Their covariance is the inner product (also called the dot product or scalar product) of two vectors in that space:
$$X \cdot Y = \operatorname{Cov}(X,Y).$$
The norm $\|X\|$ of $X$ is the square root of $\|X\|^2$, defined by
$$\|X\|^2 = X \cdot X = \operatorname{Cov}(X,X) = \operatorname{Var}(X) = \sigma_X^2,$$
and, so, the angle $\theta$ between $X$ and $Y$ is defined by
$$\cos\theta = \frac{X \cdot Y}{\|X\|\,\|Y\|} = \frac{\operatorname{Cov}(X,Y)}{\sigma_X \sigma_Y} = \rho_{XY},$$
that is, $\theta$ is the arccosine of the correlation $\rho_{XY}$.
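To tie the two viewpoints together, here is a last sketch (assuming NumPy; the construction of y is an arbitrary choice of mine) that computes $\rho_{XY}$ from the definition, confirms it lies in $[-1,1]$, and recovers the angle $\theta = \arccos\rho_{XY}$:

```python
import numpy as np

# Correlation as normalized covariance, and the angle it defines.
rng = np.random.default_rng(2)
x = rng.normal(size=100_000)
y = 0.8 * x + 0.6 * rng.normal(size=100_000)   # built so rho is near 0.8

cov_xy = np.mean(x * y) - x.mean() * y.mean()
rho = cov_xy / (x.std() * y.std())     # rho_XY = Cov(X,Y) / (sigma_X sigma_Y)
theta = np.arccos(rho)                 # X . Y = Cov(X,Y), so cos(theta) = rho

print(rho, -1 <= rho <= 1)             # about 0.8, True
print(np.degrees(theta))               # about 36.9 degrees
```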

Math 217 Home Page at http://math.clarku.edu/~djoyce/ma217/