Review of Mathematical Statistics Chapter 10
Definition. If $\{X_1, \dots, X_n\}$ is a set of random variables on a sample space, then any function $f(X_1, \dots, X_n)$ is called a statistic. For example, $\sin(X_1 + X_2)$ is a statistic. If a statistic is used to approximate an unknown quantity, then it is called an estimator. For example, we may use $\frac{X_1 + X_2 + X_3}{3}$ to estimate the mean of a population, so $\frac{X_1 + X_2 + X_3}{3}$ is an estimator.

Definition. By a random sample of size $n$ we mean a collection $\{X_1, X_2, \dots, X_n\}$ of random variables that are independent and identically distributed. To refer to a random sample we use the abbreviation i.i.d. (referring to: independent and identically distributed).

Example (exercise 10.6 of the textbook) ∗. You are given two independent estimators of an unknown quantity $\theta$. For estimator $A$, we have $E(\hat{\theta}_A) = 1000$ and $\mathrm{Var}(\hat{\theta}_A) = 160{,}000$, while for estimator $B$, we have $E(\hat{\theta}_B) = 1200$ and $\mathrm{Var}(\hat{\theta}_B) = 40{,}000$. Estimator $C$ is a weighted average $\hat{\theta}_C = w\,\hat{\theta}_A + (1-w)\,\hat{\theta}_B$. Determine the value of $w$ that minimizes $\mathrm{Var}(\hat{\theta}_C)$.

Solution. Independence implies that
\[
\mathrm{Var}(\hat{\theta}_C) = w^2\,\mathrm{Var}(\hat{\theta}_A) + (1-w)^2\,\mathrm{Var}(\hat{\theta}_B) = 160{,}000\,w^2 + 40{,}000\,(1-w)^2 .
\]
Differentiate this with respect to $w$ and set the derivative equal to zero:
\[
2(160{,}000)w - 2(40{,}000)(1-w) = 0 \;\Rightarrow\; 400{,}000\,w - 80{,}000 = 0 \;\Rightarrow\; w = 0.2 .
\]

Notation: Consider four values $x_1 = 12,\ x_2 = 5,\ x_3 = 7,\ x_4 = 3$. Let's arrange them in increasing order: $3, 5, 7, 12$. We denote these "ordered" values by these notations:
\[
x_{(1)} = 3, \qquad x_{(2)} = 5, \qquad x_{(3)} = 7, \qquad x_{(4)} = 12 .
\]
The median of these four values is $\frac{x_{(2)} + x_{(3)}}{2} = \frac{5+7}{2} = 6$. As another example, if we have observed the values $\{1, 2, 1, 7, 6\}$, then
\[
x_{(1)} = 1, \qquad x_{(2)} = 1, \qquad x_{(3)} = 2, \qquad x_{(4)} = 6, \qquad x_{(5)} = 7,
\]
and the median is $x_{(3)} = 2$.

Definition: If $\{X_1, X_2, \dots, X_n\}$ is a collection of random variables, then the $k$-th order statistic is $X_{(k)}$. The median can be written in terms of order statistics:
\[
\text{median} =
\begin{cases}
X_{\left(\frac{n+1}{2}\right)} & \text{if } n \text{ is odd},\\[2ex]
\dfrac{X_{\left(\frac{n}{2}\right)} + X_{\left(\frac{n}{2}+1\right)}}{2} & \text{if } n \text{ is even}.
\end{cases}
\]
Note that
\[
X_{(1)} = \min(X_1, \dots, X_n), \qquad X_{(n)} = \max(X_1, \dots, X_n).
\]

Theorem. Let $\{X_1, \dots, X_n\}$ be an i.i.d. collection of continuous random variables with common density $f(x)$ and common distribution function $F(x)$. Let $Y_1 < Y_2 < \cdots < Y_n$ be their order statistics. (Why can we ignore the possibility of equality between them? Because the variables are continuous, so ties occur with probability zero.) Then the density of the $k$-th order statistic $Y_k$ is
\[
g_k(y) = \frac{n!}{(k-1)!\,(n-k)!}\,[F(y)]^{k-1}\,[1 - F(y)]^{n-k}\,f(y).
\]

Example. Consider a population with density function
\[
f(x) = 2x, \qquad 0 < x < 1 .
\]
Then
\[
F(x) = \int_0^x f(t)\,dt = x^2, \qquad 0 < x < 1 .
\]
Then the density of the random variable $\min(X_1, \dots, X_6)$ is
\[
g_1(y) = \frac{6!}{0!\,(6-1)!}\,\left[y^2\right]^{1-1}\left[1 - y^2\right]^{6-1}(2y) = 12y\,(1-y^2)^5 ,
\]
and the density of the random variable $\max(X_1, \dots, X_6)$ is
\[
g_6(y) = \frac{6!}{5!\,(6-6)!}\,\left[y^2\right]^{6-1}\left[1 - y^2\right]^{6-6}(2y) = 12y^{11} .
\]

Definition. An estimator $\hat{\theta}$ for a parameter $\theta$ is said to be unbiased if for all possible values of $\theta$ we have $E(\hat{\theta} \mid \theta) = \theta$.

Note. What is good about an unbiased estimator? Answer: if $\hat{\theta}$ is unbiased for estimating $\theta$, then by observing a large number of instances $\{\hat{\theta}_1, \hat{\theta}_2, \dots, \hat{\theta}_{100}\}$ of $\hat{\theta}$, say, and taking the simple average $\frac{\hat{\theta}_1 + \cdots + \hat{\theta}_{100}}{100}$, we will have a good approximation for $\theta$. This is due to the Law of Large Numbers.

Example. Suppose $\{X_1, X_2\}$ are two random variables from a population $N(\mu, \sigma^2 = 1)$. Find the parameter for which $X_1^2 + 3X_2$ is an unbiased estimator.

Solution.
\[
E(X_1^2) = \mathrm{Var}(X_1) + [E(X_1)]^2 = 1 + \mu^2
\;\Rightarrow\;
E(X_1^2 + 3X_2) = E(X_1^2) + 3E(X_2) = 1 + \mu^2 + 3\mu,
\]
regardless of the true value of $\mu$. So $X_1^2 + 3X_2$ is an unbiased estimator for the parameter $\mu^2 + 3\mu + 1$.
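This unbiasedness can be checked numerically. The short Python sketch below is an illustration added for this review, not part of the original notes; it assumes NumPy is available, and the two values of $\mu$ it tries are arbitrary choices (with $\sigma^2 = 1$ as in the example). It simulates many draws of $X_1$ and $X_2$ and compares the average of $X_1^2 + 3X_2$ with $\mu^2 + 3\mu + 1$.

    import numpy as np

    rng = np.random.default_rng(0)

    # Check E(X1^2 + 3*X2) = mu^2 + 3*mu + 1 by simulation for two arbitrary
    # values of mu; the population is N(mu, 1) as in the example above.
    for mu in (0.0, 2.5):
        x1 = rng.normal(mu, 1.0, size=1_000_000)
        x2 = rng.normal(mu, 1.0, size=1_000_000)
        estimator = x1**2 + 3 * x2
        print(mu, estimator.mean(), mu**2 + 3 * mu + 1)

For each value of $\mu$ the last two printed numbers should agree up to simulation noise, which is exactly what it means for $X_1^2 + 3X_2$ to be unbiased for $\mu^2 + 3\mu + 1$.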
Definition. Consider a class of unbiased estimators for a particular parameter. An estimator from this class is called a Uniformly Minimum Variance Unbiased Estimator (UMVUE) if it has the minimum variance among all elements of that class of unbiased estimators.

Example (exercise 10.8 of the textbook) ∗. Two instruments are available for measuring a particular nonzero distance. The random variable $X$ represents the measurement with the first instrument and the random variable $Y$ the measurement with the second instrument. Assume $X$ and $Y$ are independent, and that
\[
E[X] = 0.8\,m, \qquad E[Y] = m, \qquad \mathrm{Var}(X) = m^2, \qquad \mathrm{Var}(Y) = 1.5\,m^2,
\]
where $m$ is the true distance. Consider the estimators of $m$ that are of the form $Z = \alpha X + \beta Y$. Determine the values of $\alpha$ and $\beta$ that make $Z$ a UMVUE within the class of estimators of this kind.

Solution. Being unbiased requires
\[
m = E(\alpha X + \beta Y) = \alpha E(X) + \beta E(Y) = (0.8\alpha + \beta)\,m
\;\Rightarrow\; 0.8\alpha + \beta = 1 \;\Rightarrow\; \beta = 1 - 0.8\alpha .
\]
Then
\[
\mathrm{Var}(\alpha X + \beta Y) = \alpha^2\,\mathrm{Var}(X) + \beta^2\,\mathrm{Var}(Y)
= \alpha^2 m^2 + (1 - 0.8\alpha)^2 (1.5\,m^2)
= \left\{\alpha^2 + 1.5(1 - 0.8\alpha)^2\right\} m^2
= \left\{1.96\,\alpha^2 - 2.4\,\alpha + 1.5\right\} m^2 .
\]
Since the coefficient of $\alpha^2$ is positive (1.96), the quadratic is minimized at
\[
\alpha = \frac{2.4}{2(1.96)} = 0.6122 .
\]
Then also $\beta = 1 - 0.8(0.6122) = 0.5102$.

Theorem. Let $\{Y_1, \dots, Y_n\}$ be random variables from the same population with mean $\mu$ and variance $\sigma^2$. Then the sample mean $\bar{Y} = \frac{1}{n}\sum_{j=1}^{n} Y_j$ is an unbiased estimator for the population mean $\mu$. The proof does not need any assumption of independence.

Proof.
\[
E(\bar{Y}) = \frac{1}{n}\sum_{j=1}^{n} E(Y_j) = \frac{1}{n}\sum_{j=1}^{n} \mu = \mu .
\]

Theorem. Let $\{Y_1, \dots, Y_n\}$ be mutually independent random variables from the same population with mean $\mu$ and variance $\sigma^2$. Then

(a) $\mathrm{Var}(\bar{Y}) = \dfrac{\sigma^2}{n}$.

(b) The sample variance $\dfrac{1}{n-1}\displaystyle\sum_{j=1}^{n}(Y_j - \bar{Y})^2$ is an unbiased estimator for the population variance $\sigma^2$.

Proof.
\[
\mathrm{Var}(\bar{Y}) = \mathrm{Var}\!\left(\frac{Y_1 + \cdots + Y_n}{n}\right)
= \frac{1}{n^2}\,\mathrm{Var}(Y_1 + \cdots + Y_n)
= \frac{1}{n^2}\left\{\mathrm{Var}(Y_1) + \cdots + \mathrm{Var}(Y_n)\right\}
= \frac{1}{n^2}\left\{\sigma^2 + \cdots + \sigma^2\right\}
= \frac{1}{n^2}(n\sigma^2) = \frac{\sigma^2}{n} .
\]
This proves part (a). To prove part (b) we first note that
\[
\begin{aligned}
\sum_{j=1}^{n} (Y_j - \bar{Y})^2
&= \sum_{j=1}^{n} \left[(Y_j - \mu) + (\mu - \bar{Y})\right]^2 \\
&= \sum_{j=1}^{n} \left[(Y_j - \mu)^2 + 2(Y_j - \mu)(\mu - \bar{Y}) + (\mu - \bar{Y})^2\right] \\
&= \sum_{j=1}^{n} (Y_j - \mu)^2 + 2(\mu - \bar{Y})\sum_{j=1}^{n}(Y_j - \mu) + \sum_{j=1}^{n}(\mu - \bar{Y})^2 \\
&= \sum_{j=1}^{n} (Y_j - \mu)^2 + 2(\mu - \bar{Y})(n\bar{Y} - n\mu) + n(\mu - \bar{Y})^2 \\
&= \sum_{j=1}^{n} (Y_j - \mu)^2 - 2n(\mu - \bar{Y})^2 + n(\mu - \bar{Y})^2 \\
&= \sum_{j=1}^{n} (Y_j - \mu)^2 - n(\mu - \bar{Y})^2 .
\end{aligned}
\]
So
\[
\sum_{j=1}^{n} (Y_j - \bar{Y})^2 = \sum_{j=1}^{n} (Y_j - \mu)^2 - n(\mu - \bar{Y})^2 .
\]
Applying the expectation $E$ to both sides, we have
\[
E\!\left[\sum_{j=1}^{n} (Y_j - \bar{Y})^2\right]
= \sum_{j=1}^{n} E(Y_j - \mu)^2 - n\,E(\mu - \bar{Y})^2
= \sum_{j=1}^{n} \mathrm{Var}(Y_j) - n\,\mathrm{Var}(\bar{Y})
= n\sigma^2 - n\left(\frac{\sigma^2}{n}\right) = (n-1)\sigma^2 .
\]
So
\[
E\!\left[\frac{1}{n-1}\sum_{j=1}^{n} (Y_j - \bar{Y})^2\right] = \sigma^2 .
\]
This proves part (b).

Note. Repeating the argument we used for the identity
\[
\sum_{j=1}^{n} (Y_j - \bar{Y})^2 = \sum_{j=1}^{n} (Y_j - \mu)^2 - n(\mu - \bar{Y})^2
\]
shows that for any constant $a$ we have
\[
\sum_{j=1}^{n} (Y_j - \bar{Y})^2 = \sum_{j=1}^{n} (Y_j - a)^2 - n(a - \bar{Y})^2 ,
\]
and in particular for $a = 0$ we have
\[
\sum_{j=1}^{n} (Y_j - \bar{Y})^2 = \sum_{j=1}^{n} Y_j^2 - n\,\bar{Y}^2 .
\]
Dividing by $n$ gives
\[
\frac{1}{n}\sum_{j=1}^{n} (Y_j - \bar{Y})^2 = \frac{1}{n}\sum_{j=1}^{n} Y_j^2 - \bar{Y}^2 .
\]
Equivalently,
\[
\frac{1}{n}\sum_{j=1}^{n} (Y_j - \bar{Y})^2 = \frac{1}{n}\left\{\sum_{j=1}^{n} Y_j^2 - \frac{1}{n}\left(\sum_{j=1}^{n} Y_j\right)^2\right\} .
\]
Note that the left-hand side is the empirical variance.
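The role of the divisor $n-1$ versus $n$ can also be seen numerically. The Python sketch below is an illustration added for this review (not from the notes); it assumes NumPy, and the normal population with $\mu = 5$, $\sigma^2 = 4$, and $n = 10$ is an arbitrary choice. It averages both versions of the variance over many simulated samples: the sample variance hovers near $\sigma^2$, while the empirical variance hovers near $\frac{n-1}{n}\sigma^2$.

    import numpy as np

    rng = np.random.default_rng(1)

    # Arbitrary illustrative population: normal with mu = 5, sigma^2 = 4; sample size n = 10.
    mu, sigma2, n, trials = 5.0, 4.0, 10, 100_000
    samples = rng.normal(mu, np.sqrt(sigma2), size=(trials, n))

    s2_sample = samples.var(axis=1, ddof=1)      # divides by n - 1 (sample variance)
    s2_empirical = samples.var(axis=1, ddof=0)   # divides by n (empirical variance)

    print(s2_sample.mean())      # close to sigma^2 = 4, as the unbiasedness theorem predicts
    print(s2_empirical.mean())   # close to (n - 1)/n * sigma^2 = 3.6

In NumPy the ddof argument selects the divisor $n - \mathrm{ddof}$, so ddof=1 corresponds to the unbiased sample variance of part (b) and ddof=0 to the empirical variance noted above.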
Example ∗. Mrs. Actuarial Gardener has used a global positioning system to lay out a perfect 20-meter by 20-meter gardening plot in her back yard. Her husband, Mr. Actuarial Gardener, decides to estimate the area of the plot. He paces off a single side of the plot and records his estimate of the length. He repeats this experiment an additional 4 times along the same side. Each trial is independent and follows a normal distribution with a mean of 20 meters and a standard deviation of 2 meters. He then averages his results and squares that number to estimate the total area of the plot. Which of the following is a true statement regarding Mr. Gardener's method of estimating the area?

A) On average, it will underestimate the true area by at least 1 square meter.
B) On average, it will underestimate the true area by less than 1 square meter.
C) On average, it is an unbiased method.
D) On average, it will overestimate the true area by less than 1 square meter.
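One way to examine Mr. Gardener's method (added here as an illustration, not as the textbook's solution; NumPy and the simulation size are assumptions) is to note that his estimator is $\bar{X}^2$ with $\bar{X} \sim N(20, \tfrac{4}{5})$, so $E(\bar{X}^2) = \mathrm{Var}(\bar{X}) + [E(\bar{X})]^2 = 0.8 + 400 = 400.8$ square meters, while the true area is $400$ square meters. The sketch below confirms this by simulation.

    import numpy as np

    rng = np.random.default_rng(2)

    # Simulate Mr. Gardener's procedure: average 5 independent N(20, 2^2) side
    # measurements, then square the average to estimate the 20 x 20 = 400 m^2 area.
    trials = 500_000
    paces = rng.normal(20.0, 2.0, size=(trials, 5))
    area_estimates = paces.mean(axis=1) ** 2

    print(area_estimates.mean())   # close to Var(Xbar) + (E[Xbar])^2 = 0.8 + 400 = 400.8

So on average the method overestimates the true area, but only by about 0.8 square meters.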