
STAT 3202: Practice 01 Autumn 2018, OSU

Exercise 1

Consider independent random variables X1, X2, and X3 with

• E[X1] = 1, Var[X1] = 4
• E[X2] = 2, SD[X2] = 3
• E[X3] = 3, SD[X3] = 5

(a) Calculate E[5X1 + 2]. Solution:

E[5X1 + 2] = 5 · E[X1] + 2 = 7

(b) Calculate E[4X1 + 2X2 − 6X3]. Solution:

E[4X1 + 2X2 − 6X3] = 4 · E[X1] + 2 · E[X2] − 6 · E[X3] = 4 · (1) + 2 · (2) − 6 · (3) = −10

(c) Calculate Var[5X1 + 2]. Solution:

Var[5X1 + 2] = 5² · Var[X1] = 25 · (4) = 100

(d) Calculate Var[4X1 + 2X2 − 6X3]. Solution:

Var[4X1 + 2X2 − 6X3] = 4² · Var[X1] + 2² · Var[X2] + (−6)² · Var[X3] = 4² · (4) + 2² · (3²) + (−6)² · (5²) = 1000
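These identities depend only on the stated means and variances, so they can be sanity-checked by simulation; the normal distributions in the sketch below are an arbitrary assumption, since any distributions with the stated moments would do.

set.seed(42)
n <- 1e6
x1 <- rnorm(n, mean = 1, sd = 2)  # Var[X1] = 4
x2 <- rnorm(n, mean = 2, sd = 3)  # SD[X2] = 3
x3 <- rnorm(n, mean = 3, sd = 5)  # SD[X3] = 5
mean(4 * x1 + 2 * x2 - 6 * x3)    # close to -10
var(4 * x1 + 2 * x2 - 6 * x3)     # close to 1000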

Exercise 2

Consider random variables H and Q with

• E[H] = 3, Var[H] = 16
• SD[Q] = 4, E[Q²/5] = 3.2

(a) Calculate E[5H² − 10]. Solution:

Var[H] = E[H²] − (E[H])²
16 = E[H²] − 3²
E[H²] = 25

E[5H² − 10] = 5 · E[H²] − 10 = 5 · 25 − 10 = 115

(b) Calculate E[Q]. Solution:

E[Q²] = 5 · E[Q²/5] = 5 · (3.2) = 16

Var[Q] = (SD[Q])² = 16

Var[Q] = E[Q²] − (E[Q])²
16 = 16 − (E[Q])²

E[Q] = 0

Exercise 3

Consider a random variable S with probability density function

f(s) = (2s + 10)/9000, 40 ≤ s ≤ 100.

(a) Calculate E[S]. Solution:

E[S] = ∫₄₀¹⁰⁰ s · (2s + 10)/9000 ds = 74

(b) Calculate SD[S]. Solution:

E[S²] = ∫₄₀¹⁰⁰ s² · (2s + 10)/9000 ds = 5760

Var[S] = E[S²] − (E[S])² = 5760 − 74² = 284

SD[S] = √Var[S] = √284 ≈ 16.8523
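Both moments can be checked numerically in R with integrate():

f <- function(s) (2 * s + 10) / 9000                                     # density on [40, 100]
ES <- integrate(function(s) s * f(s), lower = 40, upper = 100)$value     # 74
ES2 <- integrate(function(s) s^2 * f(s), lower = 40, upper = 100)$value  # 5760
sqrt(ES2 - ES^2)                                                         # roughly 16.8523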

Exercise 4

Consider independent random variables X and Y with

• X ∼ N(µX = 2, σX² = 9)
• Y ∼ N(µY = 5, σY² = 4)

(a) Calculate P[X > 5]. Solution:

P[X > 5] = P[(X − µX)/σX > (5 − 2)/3] = P[Z > 1] = 1 − P[Z < 1] = 1 − 0.8413 = 0.1587

pnorm(q = 5, mean = 2, sd = sqrt(9), lower.tail = FALSE)

## [1] 0.1586553

1 - pnorm(1)

## [1] 0.1586553

(b) Calculate P[X + 2Y > 5]. Solution:

E[X + 2Y] = 1 · E[X] + 2 · E[Y] = 12

Var[X + 2Y] = 1² · Var[X] + 2² · Var[Y] = 25

X + 2Y ∼ N(µX+2Y = 12, σX+2Y² = 25)

P[X + 2Y > 5] = P[(X + 2Y − µX+2Y)/σX+2Y > (5 − 12)/5] = P[Z > −1.4] = 1 − P[Z < −1.4] = 1 − 0.0808 = 0.9192

pnorm(q = 5, mean = 12, sd = sqrt(25), lower.tail = FALSE)

## [1] 0.9192433

1 - pnorm(-1.4)

## [1] 0.9192433
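A simulation check of the same probability (an alternative to the exact pnorm() computation above):

set.seed(42)
n <- 1e6
x <- rnorm(n, mean = 2, sd = 3)
y <- rnorm(n, mean = 5, sd = 2)
mean(x + 2 * y > 5)   # close to 0.9192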

Exercise 5

Consider random variables Y1, Y2, and Y3 with

• E[Y1] = 1, E[Y2] = −2, E[Y3] = 3
• Var[Y1] = 4, Var[Y2] = 6, Var[Y3] = 8
• Cov[Y1, Y2] = 1, Cov[Y1, Y3] = −1, Cov[Y2, Y3] = 0

(a) Calculate Var[3Y1 − 2Y2]. Solution:

Var[3Y1 − 2Y2] = 3² · Var[Y1] + (−2)² · Var[Y2] + 2(3)(−2) · Cov[Y1, Y2] = 48

(b) Calculate Var[3Y1 − 4Y2 + 2Y3]. Solution:

Var[3Y1 − 4Y2 + 2Y3] = (3)² · Var[Y1] + (−4)² · Var[Y2] + (2)² · Var[Y3]
                     + 2(3)(−4) · Cov[Y1, Y2] + 2(3)(2) · Cov[Y1, Y3] + 2(−4)(2) · Cov[Y2, Y3] = 128
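Both variances can also be computed as quadratic forms a′Σa, where Σ is the covariance matrix assembled from the given values; a short R check:

Sigma <- matrix(c(4, 1, -1,
                  1, 6,  0,
                 -1, 0,  8), nrow = 3, byrow = TRUE)
a <- c(3, -2, 0)      # coefficients for part (a)
t(a) %*% Sigma %*% a  # 48
b <- c(3, -4, 2)      # coefficients for part (b)
t(b) %*% Sigma %*% b  # 128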

Exercise 6

Consider using ξ̂ to estimate ξ.

(a) If Bias[ξ̂] = 5 and Var[ξ̂] = 4, calculate MSE[ξ̂]. Solution:

MSE[ξ̂] = (bias[ξ̂])² + var[ξ̂] = 5² + 4 = 29

(b) If ξ̂ is unbiased, ξ = 6, and MSE[ξ̂] = 30, calculate E[ξ̂²].

Solution: First we use the definition of MSE to recover the variance of ξ̂.

MSE[ξ̂] = (bias[ξ̂])² + var[ξ̂]

30 = 0 + var[ξ̂]
var[ξ̂] = 30

Then, using the fact that the estimator is unbiased, we have

E[ξ̂²] = var[ξ̂] + (E[ξ̂])² = 30 + 6² = 66

Exercise 7

Using the identity

(θ̂ − θ) = (θ̂ − E[θ̂]) + (E[θ̂] − θ) = (θ̂ − E[θ̂]) + Bias[θ̂]

show that

MSE[θ̂] = E[(θ̂ − θ)²] = Var[θ̂] + (Bias[θ̂])²

Solution: We compute:

MSE[θ̂] = E[(θ̂ − θ)²]
        = E[((θ̂ − E[θ̂]) + (E[θ̂] − θ))²]                                  (using the given identity)
        = E[(θ̂ − E[θ̂])² + 2 · (θ̂ − E[θ̂])(E[θ̂] − θ) + (E[θ̂] − θ)²]        (algebra)
        = E[(θ̂ − E[θ̂])²] + 2 · (E[θ̂] − θ) · E[θ̂ − E[θ̂]] + (E[θ̂] − θ)²    (linearity of expectation)

where (E[θ̂] − θ) in the last line is a constant.

Note that:

• E[(θ̂ − E[θ̂])²] = Var[θ̂], by the definition of variance
• E[θ̂ − E[θ̂]] = E[θ̂] − E[θ̂] = 0
• (E[θ̂] − θ)² = (Bias[θ̂])², by the definition of bias

Substituting these into the last expression above, we have

MSE[θ̂] = Var[θ̂] + (Bias[θ̂])²
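The decomposition is also easy to see empirically. The sketch below uses a deliberately biased estimator, 0.9 · X̄ for a N(5, 4) mean; this setup is an arbitrary assumption chosen only to illustrate the identity.

set.seed(42)
theta <- 5
est <- replicate(1e5, 0.9 * mean(rnorm(10, mean = theta, sd = 2)))
mean((est - theta)^2)              # empirical MSE
var(est) + (mean(est) - theta)^2   # Var + Bias^2, should agree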

Exercise 8

Let X1, X2, ..., Xn denote a random sample from a population with mean µ and variance σ². Consider three estimators of µ:

µ̂1 = (X1 + X2 + X3)/3,   µ̂2 = X1/4 + (X2 + ··· + Xn−1)/(2(n − 2)) + Xn/4,   µ̂3 = X̄

Calculate the mean squared error for each estimator. (It will be useful to first calculate their biases and variances.) Solution: We first calculate the expected value of each estimator.

E[µ̂1] = E[(X1 + X2 + X3)/3] = (1/3)(µ + µ + µ) = µ
E[µ̂2] = E[X1/4 + (X2 + ··· + Xn−1)/(2(n − 2)) + Xn/4] = µ/4 + (n − 2)µ/(2(n − 2)) + µ/4 = µ/4 + µ/2 + µ/4 = µ
E[µ̂3] = E[X̄] = µ

Thus we see that each estimator is unbiased. Next we calculate the variance of each estimator.

Var[µ̂1] = Var[(X1 + X2 + X3)/3] = (1/9)(σ² + σ² + σ²) = σ²/3
Var[µ̂2] = Var[X1/4 + (X2 + ··· + Xn−1)/(2(n − 2)) + Xn/4] = σ²/16 + (n − 2)σ²/(4(n − 2)²) + σ²/16 = nσ²/(8(n − 2))
Var[µ̂3] = Var[X̄] = σ²/n

Since each estimator is unbiased, and for any estimator

MSE[θ̂] = E[(θ̂ − θ)²] = (bias[θ̂])² + var[θ̂],

we see that the resulting mean squared errors are simply the variances.

MSE[µ̂1] = σ²/3
MSE[µ̂2] = nσ²/(8(n − 2))
MSE[µ̂3] = σ²/n
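A simulation comparison of the three estimators (normal data with µ = 3, σ = 2, and n = 10 is an arbitrary assumption; only µ and σ² matter), where the formulas above give MSEs of 4/3, 40/64 = 0.625, and 0.4:

set.seed(42)
mu <- 3; sigma <- 2; n <- 10
sim <- replicate(1e5, {
  x <- rnorm(n, mean = mu, sd = sigma)
  c(mean(x[1:3]),                                             # muhat1
    x[1] / 4 + sum(x[2:(n - 1)]) / (2 * (n - 2)) + x[n] / 4,  # muhat2
    mean(x))                                                  # muhat3
})
rowMeans((sim - mu)^2)   # empirical MSEs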

Exercise 9

Let X1,X2,...,Xn denote a random sample from a distribution with density

f(x) = (1/θ) e^(−x/θ), x > 0, θ > 0.

Consider five estimators of θ:

θ̂1 = X1,   θ̂2 = (X1 + X2)/2,   θ̂3 = (X1 + 2X2)/3,   θ̂4 = X̄,   θ̂5 = 5

Calculate the mean squared error for each estimator. (It will be useful to first calculate their biases and variances.) Solution: First, notice that this is an exponential distribution with mean θ, thus we have

E[Xi] = θ,   Var[Xi] = θ²

Following the work from the previous exercise, we will find

MSE[θ̂1] = θ²
MSE[θ̂2] = θ²/2
MSE[θ̂3] = 5θ²/9
MSE[θ̂4] = θ²/n
MSE[θ̂5] = (5 − θ)²

Note that the first four estimators are unbiased, and thus their mean squared error is simply their variance. The fifth estimator has zero variance, and thus its mean squared error is its bias squared.
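A simulation check with, say, θ = 4 and n = 10 (arbitrary choices), where the formulas give MSEs of 16, 8, 80/9 ≈ 8.89, 1.6, and (5 − 4)² = 1:

set.seed(42)
theta <- 4; n <- 10
sim <- replicate(1e5, {
  x <- rexp(n, rate = 1 / theta)   # rexp() is parameterized by rate = 1 / mean
  c(x[1], (x[1] + x[2]) / 2, (x[1] + 2 * x[2]) / 3, mean(x), 5)
})
rowMeans((sim - theta)^2)          # empirical MSEs of thetahat1, ..., thetahat5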

Exercise 10

Suppose that E[θ̂1] = E[θ̂2] = θ, Var[θ̂1] = σ1², Var[θ̂2] = σ2², and Cov[θ̂1, θ̂2] = σ12. Consider the unbiased estimator

θ̂3 = aθ̂1 + (1 − a)θ̂2.

If θ̂1 and θ̂2 are independent, what value should be chosen for the constant a in order to minimize the variance, and thus the mean squared error, of θ̂3 as an estimator of θ? Solution: We first verify that θ̂3 is unbiased.

E[θ̂3] = E[aθ̂1 + (1 − a)θ̂2] = aE[θ̂1] + (1 − a)E[θ̂2] = aθ + (1 − a)θ = θ

If θ̂1 and θ̂2 are independent, then

Var[θ̂3] = Var[aθ̂1 + (1 − a)θ̂2] = a²Var[θ̂1] + (1 − a)²Var[θ̂2] = a²σ1² + (1 − a)²σ2²

Thus we want to find a to minimize the function

f(a) = a²σ1² + (1 − a)²σ2²

Taking the derivative, we have

f′(a) = 2aσ1² − 2(1 − a)σ2²

Setting this equal to 0 and solving for a gives

a = σ2²/(σ1² + σ2²)

Finally, we check that this gives a minimum by evaluating the second derivative.

f″(a) = 2σ1² + 2σ2²

Since this is always positive,

a = σ2²/(σ1² + σ2²)

gives a minimum.
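A quick numerical check with, e.g., σ1² = 9 and σ2² = 4 (arbitrary values), where the formula gives a = 4/13 ≈ 0.3077:

s1sq <- 9; s2sq <- 4
f <- function(a) a^2 * s1sq + (1 - a)^2 * s2sq
optimize(f, interval = c(0, 1))$minimum   # numerical minimizer, about 0.3077
s2sq / (s1sq + s2sq)                      # closed form, 4/13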

Exercise 11

Let Y have a binomial distribution with parameters n and p. Consider two estimators for p:

p̂1 = Y/n

and

p̂2 = (Y + 1)/(n + 2)

For what values of p does p̂2 achieve a lower mean squared error than p̂1? Solution: Recall that for a binomial random variable we have

E[Y] = np
Var[Y] = np(1 − p)

We first calculate the bias, variance, and mean squared error of pˆ1.

Bias[p̂1] = E[p̂1] − p = p − p = 0
Var[p̂1] = (1/n²) · Var[Y] = np(1 − p)/n² = p(1 − p)/n
MSE[p̂1] = p(1 − p)/n

Next we calculate the bias, variance, and mean squared error of p̂2.

E[p̂2] = E[(Y + 1)/(n + 2)] = (np + 1)/(n + 2)
Bias[p̂2] = E[p̂2] − p = (np + 1)/(n + 2) − p
Var[p̂2] = Var[(Y + 1)/(n + 2)] = (1/(n + 2)²) · Var[Y] = np(1 − p)/(n + 2)²
MSE[p̂2] = (Bias[p̂2])² + Var[p̂2] = ((np + 1)/(n + 2) − p)² + np(1 − p)/(n + 2)²

Finally, we solve for the values of p where MSE[p̂2] < MSE[p̂1]. We have

MSE[p̂2] < MSE[p̂1]

((np + 1)/(n + 2) − p)² + np(1 − p)/(n + 2)² < p(1 − p)/n

Now we decide to be lazy and use WolframAlpha.

Thus, MSE[p̂2] < MSE[p̂1] when

(1/2)(1 − √((n + 1)/(2n + 1))) < p < (1/2)(1 + √((n + 1)/(2n + 1)))

Note that this interval is symmetric about 0.5, which makes sense because p̂2 is biased towards 0.5.
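The interval is easy to check numerically for a particular n; the sketch below uses n = 20 (an arbitrary choice) and compares the two MSE formulas on a grid of p values.

n <- 20
p <- seq(0.01, 0.99, by = 0.01)
mse1 <- p * (1 - p) / n
mse2 <- ((n * p + 1) / (n + 2) - p)^2 + n * p * (1 - p) / (n + 2)^2
range(p[mse2 < mse1])                                # grid approximation of the interval
0.5 * (1 + c(-1, 1) * sqrt((n + 1) / (2 * n + 1)))   # closed-form endpoints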

Exercise 12

Let X1, X2, ..., Xn denote a random sample from a population with mean µ and variance σ². Create an unbiased estimator for µ². Hint: Start with X̄². Solution: First note that from previous results we know that

E[X̄] = µ,   Var[X̄] = σ²/n

Also, recall that, re-arranging the variance formula based on the first and second moments, for any random variable we have

E[Y²] = Var[Y] + (E[Y])²

Using these, we find that

E[X̄²] = Var[X̄] + (E[X̄])² = σ²/n + µ²

This expression contains the µ² that we're after, but also an extra term that we would like to eliminate. Recall that

E[S²] = σ²

Let’s now consider the estimator

ξ̂ = X̄² − S²/n

Its expected value is

E[ξ̂] = E[X̄² − S²/n] = E[X̄²] − (1/n) · E[S²] = σ²/n + µ² − σ²/n = µ²

Thus ξ̂ = X̄² − S²/n is an unbiased estimator for µ².
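A simulation check of unbiasedness (normal data with µ = 3, σ = 2, and n = 10 is an arbitrary assumption, so µ² = 9):

set.seed(42)
mu <- 3; sigma <- 2; n <- 10
est <- replicate(1e5, {
  x <- rnorm(n, mean = mu, sd = sigma)
  mean(x)^2 - var(x) / n   # var() is the unbiased sample variance S^2
})
mean(est)   # close to mu^2 = 9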

Exercise 13

Let X1, X2, X3, ..., Xn be iid random variables from U(θ, θ + 2). (That is, a uniform distribution with a minimum of θ and a maximum of θ + 2.) Consider the estimator

θ̂ = (1/n) Σᵢ₌₁ⁿ Xᵢ = X̄

(a) Calculate the bias of θ̂ when estimating θ. Solution:

E[θ̂] = E[X̄] = E[X] = (θ + (θ + 2))/2 = θ + 1

bias[θ̂] = E[θ̂] − θ = (θ + 1) − θ = 1

(b) Calculate the variance of θ̂. Solution:

Var[θ̂] = Var[X̄] = Var[X]/n = [(θ + 2) − θ]²/(12n) = 1/(3n)

(c) Calculate the mean squared error of θ̂ when estimating θ. Solution:

MSE[θ̂] = (bias[θ̂])² + var[θ̂] = 1² + 1/(3n) = (3n + 1)/(3n)
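A simulation check with, e.g., θ = 2 and n = 10 (arbitrary choices), where the bias should be near 1 and the MSE near 31/30 ≈ 1.033:

set.seed(42)
theta <- 2; n <- 10
est <- replicate(1e5, mean(runif(n, min = theta, max = theta + 2)))
mean(est) - theta       # empirical bias, close to 1
mean((est - theta)^2)   # empirical MSE, close to (3n + 1) / (3n)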
