Stats 135: Efficiency and Sufficiency
Joan Bruna
Department of Statistics, UC Berkeley
February 8, 2015

Cramer-Rao lower bound

This result basically states that the price to pay for having an unbiased estimator is a certain amount of variance.

Theorem. Suppose the $X_i$ are i.i.d. with probability distribution $P_\theta$. Under smoothness conditions on $f_\theta$ (the density or frequency function), the following holds: if $\hat{\theta}$ is an unbiased estimator of $\theta$, then
\[
\operatorname{var}(\hat{\theta}) \ge \frac{1}{n I(\theta)}.
\]
Note: the right-hand side is the asymptotic variance of the MLE.

Proof idea: take $Z$ to be the partial derivative (with respect to $\theta$) of the log-likelihood. Show that $\operatorname{cov}(Z, \hat{\theta}) = 1$ and $\operatorname{var}(Z) = n I(\theta)$. Use Cauchy-Schwarz to conclude.

Efficiency

Question: how do we compare two estimators? Check their variances.

Definition (Efficiency). Given two estimators $\hat{\theta}$ and $\tilde{\theta}$ of a parameter $\theta$, the efficiency of $\hat{\theta}$ relative to $\tilde{\theta}$ is
\[
\operatorname{eff}(\hat{\theta}, \tilde{\theta}) = \frac{\operatorname{var}(\tilde{\theta})}{\operatorname{var}(\hat{\theta})}.
\]
Note: if $\operatorname{eff}(\hat{\theta}, \tilde{\theta}) \le 1$, then $\operatorname{var}(\hat{\theta}) \ge \operatorname{var}(\tilde{\theta})$, i.e. $\hat{\theta}$ is less efficient than $\tilde{\theta}$.

Efficiency measures the "fraction of samples needed for the two estimators to have the same variance."

Warning: the comparison only makes sense when $\hat{\theta}$ and $\tilde{\theta}$ have the same bias (ideally zero).

Efficient estimators

Definition (Efficient estimator). An unbiased estimator that achieves the Cramer-Rao lower bound is called efficient.

Note: unbiased estimators cannot do better, in terms of variance, than the Cramer-Rao bound. The MLE attains the bound asymptotically, so the MLE is asymptotically efficient. However, the MLE is not necessarily efficient in finite samples, and asymptotically efficient estimators are not unique.

Sufficiency

$(X_1, \ldots, X_n)$ is n-dimensional and might be complicated or expensive to store.

Question: is there a function of the data that contains all the information there is in the sample about the parameter $\theta$? If so, $T(X_1, \ldots, X_n)$ contains all the relevant information, so it is the only thing we need to keep track of.

Definition (Sufficiency). A statistic $T$ is said to be sufficient for $\theta$ if the conditional distribution of $X_1, \ldots, X_n$ given $T(X_1, \ldots, X_n) = t$ does not depend on $\theta$ for any $t$. $T$ is then called a sufficient statistic for $\theta$.

Sufficiency: Example

Consider $\{X_i\}_{i=1}^n$ i.i.d. Bernoulli($p$) (equal to 1 with probability $p$, 0 with probability $1-p$). Let $T = \sum_i X_i$. Then $T$ is Binomial($n, p$): $P(T = t) = \binom{n}{t} p^t (1-p)^{n-t}$.

For $x_1, \ldots, x_n$ with $\sum_i x_i = t$,
\[
P(X_1 = x_1, \ldots, X_n = x_n \mid T = t)
= \frac{P(X_1 = x_1, \ldots, X_n = x_n, T = t)}{P(T = t)}
= \frac{p^{x_1}(1-p)^{1-x_1} \cdots p^{x_n}(1-p)^{1-x_n}}{\binom{n}{t} p^t (1-p)^{n-t}}
= \frac{p^{t}(1-p)^{n-t}}{\binom{n}{t} p^t (1-p)^{n-t}}
= \frac{1}{\binom{n}{t}}.
\]
This does not depend on $p$, so $T$ is sufficient for $p$.

Sufficiency: N&S condition

Theorem. A necessary and sufficient condition for $T$ to be sufficient for $\theta$ is that
\[
f_\theta(x_1, \ldots, x_n) = g_\theta(T)\, h(x_1, \ldots, x_n).
\]

In the previous example,
\[
P_p(X_1 = x_1, \ldots, X_n = x_n) = p^{\sum_i x_i} (1-p)^{n - \sum_i x_i} = \left(\frac{p}{1-p}\right)^{\sum_i x_i} (1-p)^n,
\]
so $h \equiv 1$ and $g_p(T) = (1-p)^n \left(\frac{p}{1-p}\right)^T$.

Proof: see class notes.

Sufficiency and MLE

Corollary. If $T$ is sufficient for $\theta$, then $\hat{\theta}_{\mathrm{ML}}$ is a function of $T$.

Why? If $f_\theta(x_1, \ldots, x_n) = g_\theta(T)\, h(x_1, \ldots, x_n)$, then
\[
\log(\operatorname{lik}(\theta)) = \log(g_\theta(T)) + \log(h(x_1, \ldots, x_n)),
\]
and $\log(h(x_1, \ldots, x_n))$ does not involve $\theta$, so it plays no role in the maximization.

Rao-Blackwell theorem: preparations

There are various ways of measuring the quality of estimators. A possible one is the mean-squared error (MSE):
\[
\operatorname{MSE}(\hat{\theta}) = E_\theta\big[(\hat{\theta} - \theta)^2\big] = \operatorname{bias}^2 + \operatorname{var}(\hat{\theta}).
\]
A nice property of the MSE: it captures the bias-variance tradeoff.
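The decomposition above is a standard identity; as a quick check (a short derivation, not on the original slides), expand the square around $E_\theta[\hat{\theta}]$:
\[
E_\theta\big[(\hat{\theta} - \theta)^2\big]
= E_\theta\Big[\big(\hat{\theta} - E_\theta[\hat{\theta}] + E_\theta[\hat{\theta}] - \theta\big)^2\Big]
= \operatorname{var}(\hat{\theta}) + \big(E_\theta[\hat{\theta}] - \theta\big)^2,
\]
since the cross term $2\,\big(E_\theta[\hat{\theta}] - \theta\big)\, E_\theta\big[\hat{\theta} - E_\theta[\hat{\theta}]\big]$ vanishes.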
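To make the tradeoff concrete, here is a minimal simulation sketch (my own illustration using numpy, not from the slides). For i.i.d. Bernoulli($p$) data it compares the unbiased MLE $\bar{X}$, whose variance equals the Cramer-Rao bound $p(1-p)/n$, with an illustrative shrinkage estimator $(\sum_i X_i + 1)/(n+2)$: the latter is biased but has smaller MSE near $p = 1/2$ (the comparison reverses for $p$ close to 0 or 1).

# Sketch: MSE of an unbiased vs. a biased (shrinkage) estimator of a Bernoulli p.
import numpy as np

rng = np.random.default_rng(0)
p, n, reps = 0.5, 20, 200_000            # true parameter, sample size, replications

x = rng.binomial(1, p, size=(reps, n))   # reps independent samples of size n
p_hat = x.mean(axis=1)                   # unbiased MLE; variance = p(1-p)/n here
p_tilde = (x.sum(axis=1) + 1) / (n + 2)  # biased shrinkage estimator (illustrative)

mse = lambda est: np.mean((est - p) ** 2)
print("Cramer-Rao bound      :", p * (1 - p) / n)
print("MSE of MLE (unbiased) :", mse(p_hat))
print("MSE of shrinkage      :", mse(p_tilde))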
Rao-Blackwell theorem

Question: given an estimator $\hat{\theta}$, can I find a better one if I know a sufficient statistic $T$? "Better" here is in the sense of MSE.

Theorem (Rao-Blackwell). Let $\hat{\theta}$ be an estimator for $\theta$ with $E[\hat{\theta}^2] < \infty$. Suppose $T$ is sufficient for $\theta$, and let $\tilde{\theta} = E[\hat{\theta} \mid T]$. Then
\[
\operatorname{MSE}(\tilde{\theta}) \le \operatorname{MSE}(\hat{\theta}).
\]

Questions: why do we need $T$ to be sufficient? Can we improve the MLE in this fashion?

Proof: see class notes; basically properties of conditional expectation.
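As a small numerical sketch (my own addition, not part of the slides), the Bernoulli example above can be Rao-Blackwellized explicitly: start from the crude unbiased estimator $\hat{\theta} = X_1$ and condition on the sufficient statistic $T = \sum_i X_i$. By symmetry $E[X_1 \mid T] = T/n$, so $\tilde{\theta} = \bar{X}$, and a quick simulation confirms the drop in MSE.

# Sketch: Rao-Blackwellizing theta_hat = X_1 with T = sum(X_i) in the Bernoulli model.
import numpy as np

rng = np.random.default_rng(1)
p, n, reps = 0.3, 25, 200_000

x = rng.binomial(1, p, size=(reps, n))
theta_hat = x[:, 0]               # crude unbiased estimator: first observation only
theta_tilde = x.sum(axis=1) / n   # E[X_1 | T] = T / n, the Rao-Blackwellized version

mse = lambda est: np.mean((est - p) ** 2)
print("MSE of theta_hat   (X_1)  :", mse(theta_hat))    # roughly p(1-p)
print("MSE of theta_tilde (X_bar):", mse(theta_tilde))  # roughly p(1-p)/n

Note, as a side remark on the first question above: sufficiency of $T$ is what makes $\tilde{\theta}$ a genuine statistic, since the conditional expectation $E[X_1 \mid T]$ does not involve the unknown $p$.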