4.2 Variance and Covariance
The most important measure of variability of a random variable X is obtained by letting g(X) = (X − µ)²; then E[g(X)] gives a measure of the variability of the distribution of X.

Definition 4.3. Let X be a random variable with probability distribution f(x) and mean µ. The variance of X is

    σ² = E[(X − µ)²] = ∑_x (x − µ)² f(x)

if X is discrete, and

    σ² = E[(X − µ)²] = ∫_{−∞}^{∞} (x − µ)² f(x) dx

if X is continuous.

• The positive square root of the variance, σ, is called the standard deviation of X.
• The quantity x − µ is called the deviation of an observation x from its mean.

Note that σ² ≥ 0. When the standard deviation of a random variable is small, we expect most of the values of X to be grouped around the mean. Standard deviations are often used to compare two or more distributions that have the same unit of measurement.

Example 4.8 (page 97). Let the random variable X represent the number of automobiles that are used for official business purposes on any given workday. The probability distributions of X for two companies are given below:

    Company A:  x     1    2    3
                f(x)  0.3  0.4  0.3

    Company B:  x     0    1    2    3    4
                f(x)  0.2  0.1  0.3  0.3  0.1

Find the variances of X for the two companies.

Solution: µ_A = (1)(0.3) + (2)(0.4) + (3)(0.3) = 2 and, similarly, µ_B = 2. Then

    σ²_A = (1 − 2)²(0.3) + (2 − 2)²(0.4) + (3 − 2)²(0.3) = 0.6

and

    σ²_B = ∑_{x=0}^{4} (x − 2)² f(x) = 1.6,

so σ²_B > σ²_A: company B's usage is more variable than company A's, even though the two means are equal.

Theorem 4.2. The variance of a random variable X is

    σ² = E(X²) − µ².

Proof: For the discrete case we can write

    σ² = ∑_x (x − µ)² f(x) = ∑_x (x² − 2µx + µ²) f(x)
       = ∑_x x² f(x) − 2µ ∑_x x f(x) + µ² ∑_x f(x).

Since µ = ∑_x x f(x) by definition, and ∑_x f(x) = 1 for any discrete probability distribution, it follows that

    σ² = ∑_x x² f(x) − µ² = E(X²) − µ².

For the continuous case the proof is the same step by step, with summations replaced by integrations.
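As a quick numerical check of Example 4.8 and Theorem 4.2, the discrete-case computation can be sketched as follows (the helper name `mean_var` and the dictionaries are illustrative, not from the text):

```python
def mean_var(dist):
    """Return (mu, sigma^2) for a discrete distribution given as {x: f(x)},
    using Theorem 4.2: sigma^2 = E(X^2) - mu^2."""
    mu = sum(x * p for x, p in dist.items())
    ex2 = sum(x * x * p for x, p in dist.items())
    return mu, ex2 - mu * mu

# Distributions from Example 4.8
company_a = {1: 0.3, 2: 0.4, 3: 0.3}
company_b = {0: 0.2, 1: 0.1, 2: 0.3, 3: 0.3, 4: 0.1}

mu_a, var_a = mean_var(company_a)   # approximately 2.0 and 0.6
mu_b, var_b = mean_var(company_b)   # approximately 2.0 and 1.6
```

Both companies have the same mean, so the variance alone separates them, matching the text's conclusion σ²_B > σ²_A.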
Example 4.10. The weekly demand for Pepsi, in thousands of liters, at a local chain of efficiency stores is a continuous random variable X having the probability density

    f(x) = 2(x − 1),  1 < x < 2
    f(x) = 0,         elsewhere.

Find the mean and variance of X.

Solution:

    µ = E(X) = 2 ∫_1^2 x(x − 1) dx = 5/3

and

    E(X²) = 2 ∫_1^2 x²(x − 1) dx = 17/6.

Therefore,

    σ² = 17/6 − (5/3)² = 1/18.

Theorem 4.3. Let X be a random variable with probability distribution f(x). The variance of the random variable g(X) is

    σ²_{g(X)} = E[(g(X) − µ_{g(X)})²] = ∑_x (g(x) − µ_{g(X)})² f(x)

if X is discrete, and

    σ²_{g(X)} = E[(g(X) − µ_{g(X)})²] = ∫_{−∞}^{∞} (g(x) − µ_{g(X)})² f(x) dx

if X is continuous.

Definition 4.4 (Covariance of two random variables). Let X and Y be random variables with joint probability distribution f(x, y). The covariance of X and Y is

    σ_XY = E[(X − µ_X)(Y − µ_Y)] = ∑_x ∑_y (x − µ_X)(y − µ_Y) f(x, y)

if X and Y are discrete, and

    σ_XY = E[(X − µ_X)(Y − µ_Y)] = ∫_{−∞}^{∞} ∫_{−∞}^{∞} (x − µ_X)(y − µ_Y) f(x, y) dx dy

if X and Y are continuous.

Remark. If large values of X often occur with large values of Y, and small values of X with small values of Y, then a positive X − µ_X will often be paired with a positive Y − µ_Y, so the product (X − µ_X)(Y − µ_Y) will tend to be positive (positive association between X and Y). Similarly, if small values of X often occur with large values of Y, and large values of X with small values of Y, then a positive X − µ_X will often be paired with a negative Y − µ_Y, so the product (X − µ_X)(Y − µ_Y) will tend to be negative (negative association between X and Y). The sign of the covariance therefore indicates whether the relationship between the two variables is positive or negative. It can be shown that when two variables are independent, their covariance is zero; the converse, however, does not hold in general.

Theorem 4.4. The covariance of two random variables X and Y with means µ_X and µ_Y, respectively, is given by

    σ_XY = E(XY) − µ_X µ_Y.
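The integrals in Example 4.10 can be checked numerically; here is a minimal sketch using a composite Simpson's rule (the helper name `simpson` is ours, and Simpson's rule is exact for the cubic integrands that appear here):

```python
def simpson(func, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    s = func(a) + func(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * func(a + i * h)
    return s * h / 3

f = lambda x: 2 * (x - 1)                     # density of Example 4.10 on (1, 2)
mu  = simpson(lambda x: x * f(x), 1, 2)       # approximately 5/3
ex2 = simpson(lambda x: x * x * f(x), 1, 2)   # approximately 17/6
var = ex2 - mu * mu                           # approximately 1/18
```

This reproduces µ = 5/3 and σ² = 1/18 via Theorem 4.2, without computing E[(X − µ)²] directly.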
Proof: For the discrete case we can write

    σ_XY = ∑_x ∑_y (x − µ_X)(y − µ_Y) f(x, y)
         = ∑_x ∑_y (xy − µ_X y − µ_Y x + µ_X µ_Y) f(x, y)
         = ∑_x ∑_y xy f(x, y) − µ_X ∑_x ∑_y y f(x, y)
           − µ_Y ∑_x ∑_y x f(x, y) + µ_X µ_Y ∑_x ∑_y f(x, y).

Since ∑_x ∑_y y f(x, y) = µ_Y, ∑_x ∑_y x f(x, y) = µ_X, and ∑_x ∑_y f(x, y) = 1, it follows that

    σ_XY = E(XY) − µ_X µ_Y − µ_Y µ_X + µ_X µ_Y = E(XY) − µ_X µ_Y.

Example 4.14 (page 101). The fraction X of male runners and the fraction Y of female runners who compete in marathon races are described by the joint density function

    f(x, y) = 8xy,  0 ≤ x ≤ 1, 0 ≤ y ≤ x
    f(x, y) = 0,    elsewhere.

Find the covariance of X and Y.

Solution: We first compute the marginal density functions. They are

    g(x) = 4x³,        0 ≤ x ≤ 1
    h(y) = 4y(1 − y²), 0 ≤ y ≤ 1

(and zero elsewhere). From these,

    µ_X = E(X) = ∫_0^1 4x⁴ dx = 4/5,
    µ_Y = E(Y) = ∫_0^1 4y²(1 − y²) dy = 8/15,
    E(XY) = ∫_0^1 ∫_y^1 8x²y² dx dy = 4/9.

Therefore,

    σ_XY = E(XY) − µ_X µ_Y = 4/9 − (4/5)(8/15) = 4/225.

Correlation Coefficient — a measure of the strength of the relationship.

Definition 4.5. Let X and Y be random variables with covariance σ_XY and standard deviations σ_X and σ_Y, respectively. The correlation coefficient of X and Y is

    ρ_XY = σ_XY / (σ_X σ_Y).

Remark.
• ρ_XY is free of units.
• ρ_XY satisfies the inequality −1 ≤ ρ_XY ≤ 1.
• ρ_XY = 0 exactly when σ_XY = 0; in particular, ρ_XY = 0 when X and Y are independent.
• ρ_XY = ±1 if Y = a + bX for constants a and b with b ≠ 0, the sign of ρ_XY matching the sign of b.
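Example 4.14 and Definition 4.5 can also be checked numerically. The sketch below (helper name `expect` is ours) approximates E[g(X, Y)] for the density f(x, y) = 8xy on the triangle 0 ≤ y ≤ x ≤ 1 with a midpoint-rule double integral, then applies Theorem 4.4 and Definition 4.5:

```python
def expect(g, n=500):
    """Approximate E[g(X, Y)] for f(x, y) = 8xy on 0 <= y <= x <= 1
    via a midpoint rule: outer sum over x, inner sum over y in [0, x]."""
    total = 0.0
    hx = 1.0 / n
    for i in range(n):
        x = (i + 0.5) * hx
        m = max(1, int(n * x))      # inner subdivisions scale with the strip width
        hy = x / m
        for j in range(m):
            y = (j + 0.5) * hy
            total += g(x, y) * 8.0 * x * y * hx * hy
    return total

e_xy = expect(lambda x, y: x * y)   # approximately 4/9
mu_x = expect(lambda x, y: x)       # approximately 4/5
mu_y = expect(lambda x, y: y)       # approximately 8/15
cov = e_xy - mu_x * mu_y            # approximately 4/225 (Theorem 4.4)

var_x = expect(lambda x, y: x * x) - mu_x ** 2
var_y = expect(lambda x, y: y * y) - mu_y ** 2
rho = cov / (var_x ** 0.5 * var_y ** 0.5)   # Definition 4.5
```

The covariance 4/225 ≈ 0.018 is small in absolute terms, but the unit-free ρ_XY ≈ 0.49 shows a moderate positive association, illustrating why the correlation coefficient is the better measure of strength.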