Biostatistics
Total Page:16
File Type:pdf, Size:1020Kb
Last Lecture . Biostatistics 602 - Statistical Inference Lecture 15 • Can Cramer-Rao bound be used to find the best unbiased estimator for any distribution? If not, in which cases? Bayes Estimator . • When Cramer-Rao bound is attainable, can Cramer-Rao bound be used for find best unbiased estimator for any τ(θ)? If not, what is the Hyun Min Kang restriction on τ(θ)? • What is another way to find the best unbiased estimator? • Describe two strategies to obtain the best unbiased estimators for March 12th, 2013 τ(θ), using complete sufficient statistics. Hyun Min Kang Biostatistics 602 - Lecture 15 March 12th, 2013 1 / 1 Hyun Min Kang Biostatistics 602 - Lecture 15 March 12th, 2013 2 / 1 Recap - The power of complete sufficient statistics Finding UMUVE - Method 1 . .Use Cramer-Rao bound to find the best unbiased estimator for τ(θ). 1. If ”regularity conditions” are satisfied, then we have a Cramer-Rao . bound for unbiased estimators of τ(θ). Theorem 7.3.23 . • It helps to confirm an estimator is the best unbiased estimator of τ(θ) Let T be a complete sufficient statistic for parameter θ. Let ϕ(T) be any if it happens to attain the CR-bound. estimator based on T. Then ϕ(T) is the unique best unbiased estimator of • If an unbiased estimator of τ(θ) has variance greater than the .its expected value. CR-bound, it does NOT mean that it is not the best unbiased estimator. [τ (θ)]2 2. When ”regularity conditions” are not satisfied, ′ is no longer a In(θ) valid lower bound. • There may be unbiased estimators of τ(θ) that have variance smaller 2 than [τ ′(θ)] . In(θ) Hyun Min Kang Biostatistics 602 - Lecture 15 March 12th, 2013 3 / 1 Hyun Min Kang Biostatistics 602 - Lecture 15 March 12th, 2013 4 / 1 Finding UMVUE - Method 2 Frequentists vs. Bayesians A biased view in favor of Bayesians at http://xkcd.com/1132/ . Use complete sufficient statistic to find the best unbiased estimator for .τ(θ). 1. Find complete sufficient statistic T for θ. 2. Obtain ϕ(T), an unbiased estimator of τ(θ) using either of the following two ways • Guess a function ϕ(T) such that E[ϕ(T)] = τ(θ). • Guess an unbiased estimator h(X) of τ(θ). Construct ϕ(T) = E[h(X) T], then E[ϕ(T)] = E[h(X)] = τ(θ). | Hyun Min Kang Biostatistics 602 - Lecture 15 March 12th, 2013 5 / 1 Hyun Min Kang Biostatistics 602 - Lecture 15 March 12th, 2013 6 / 1 Bayesian Statistic Bayesian Framework • Prior distribution of θ : θ π(θ). ∼ Frequentist’s Framework • Sample distribution of X given θ. X θ f(x θ) | ∼ | = X f (x θ), θ Ω . P { ∼ X | ∈ } . • Joint distribution X and θ Bayesian Statistic . f(x, θ) = π(θ)f(x θ) • Parameter θ is considered as a random quantity | • Distribution of θ can be described by probability distribution, referred • Marginal distribution of X. to as prior distribution m(x) = f(x, θ)dθ = f(x θ)π(θ)dθ • A sample is taken from a population indexed by θ, and the prior | ∫θ Ω ∫θ Ω distribution is updated using information from the sample to get ∈ ∈ posterior distribution of θ given the sample. • Posterior distribution of θ (conditional distribution of θ given X) . f(x, θ) f(x θ)π(θ) π(θ x) = = | (Bayes’ Rule) | m(x) m(x) Hyun Min Kang Biostatistics 602 - Lecture 15 March 12th, 2013 7 / 1 Hyun Min Kang Biostatistics 602 - Lecture 15 March 12th, 2013 8 / 1 Example Inference Under Bayesian’s Framework . Leveraging Prior Information . Burglary (θ) Pr(Alarm|Burglary) = Pr(X = 1 θ) Suppose that we know that the chance of Burglary per household per | 7 True (θ = 1) 0.95 night is 10− . False 0.01 Pr(θ = 1) (θ = 0) Pr(θ = 1 X = 1) = Pr(X = 1 θ = 1) (Bayes’ rule) | | Pr(X = 1) Suppose that Burglary is an unobserved parameter (θ 0, 1 ), and Alarm Pr(θ = 1) ∈ { } is an observed outcome (X = 0, 1 ). = Pr(X = 1 θ = 1) { } | Pr(θ = 1, X = 1) + Pr(θ = 0, X = 1) • Under Frequentist’s Framework, Pr(X = 1 θ = 1) Pr(θ = 1) = | • If there was no burglary, there is 1% of chance of alarm ringing. Pr(X = 1 θ = 1) Pr(θ = 1) + Pr(X = 1 θ = 0) Pr(θ = 0) • If there was a burglary, there is % of chance of alarm ringing. 95 | 7 | • One can come up with an estimator on θ, such as MLE 0.95 10− 6 = 7 × 7 9.5 10− • However, given that alarm already rang, one cannot calculate the . 0.95 10− + 0.01 (1 10− ) ≈ × probability of burglary. × × − So, even if alarm rang, one can conclude that the burglary is unlikely to happen. Hyun Min Kang Biostatistics 602 - Lecture 15 March 12th, 2013 9 / 1 Hyun Min Kang Biostatistics 602 - Lecture 15 March 12th, 2013 10 / 1 What if the prior information is misleading? Advantages and Drawbacks of Bayesian Inference . Over-fitting to Prior Information Advantages over Frequentist’s Framework . Suppose that, in fact, a thief found a security breach in my place and • Allows making inference on the distribution of θ given data. planning to break-in either tonight or tomorrow night for sure (with the • Available information about θ can be utilized. same probability). Then the correct prior Pr(θ = 1) = 0.5. • Uncertainty and information can be quantified probabilistically. Pr(θ = 1 X = 1) . | Pr(X = 1 θ = 1) Pr(θ = 1) = | . Pr(X = 1 θ = 1) Pr(θ = 1) + Pr(X = 1 θ = 0) Pr(θ = 0) Drawbacks of Bayesian Inference | | . 0.95 0.5 • Misleading prior can result in misleading inference. = × 0.99 0.95 0.5 + 0.01 (1 0.5) ≈ • Bayesian inference is often (but not always) prone to be ”subjective” × × − • See : Larry Wasserman ”Frequentist Bayes is Objective” (2006) However, if we relied on the inference based on the incorrect prior, we may Bayesian Analysis 3:451-456. end up concluding that there are > 99.9% chance that this is a false alarm, and ignore it, resulting an exchange of one night of good sleep with • Bayesian inference could be sometimes unnecessarily complicated to interpret, compared to Frequentist’s inference. .quite a bit of fortune. Hyun Min Kang Biostatistics 602 - Lecture 15 March 12th, 2013 11 / 1 Hyun Min Kang Biostatistics 602 - Lecture 15 March 12th, 2013 12 / 1 Bayes Estimator Solution (1/4) Prior distribution of p is . Γ(α + β) α 1 β 1 Definition π(p) = p − (1 p) − . Γ(α)Γ(β) − Bayes Estimator of θ is defined as the posterior mean of θ. Sampling distribution of X given p is E(θ x) = θπ(θ x)dθ . | θ Ω | ∫ ∈ n xi 1 xi f (x p) = p (1 p) − . X | − Example Problem i=1 . ∏ { } i.i.d. X , , Xn Bernoulli(p) where 0 p 1. Assume that the prior Joint distribution of X and p is 1 ··· ∼ ≤ ≤ distribution of p is Beta(α, β). Find the posterior distribution of p and the .Bayes estimator of p, assuming α and β are known. fX(x, p) = fX(x p)π(p) n | xi 1 xi Γ(α + β) α 1 β 1 = p (1 p) − p − (1 p) − − Γ(α)Γ(β) − i=1 ∏ { } Hyun Min Kang Biostatistics 602 - Lecture 15 March 12th, 2013 13 / 1 Hyun Min Kang Biostatistics 602 - Lecture 15 March 12th, 2013 14 / 1 Solution (2/4) Solution (3/4) The marginal distribution of X is 1 n n Γ(α + β) xi+α 1 n xi+β 1 The posterior distribution of θ x : m(x) = f(x, p)dp = p i=1 − (1 p) − i=1 − dp | 0 Γ(α)Γ(β) ∑ − ∑ ∫ 1 ∫ f(x, p) Γ(α + β) Γ( xi + α)Γ(n xi + β) π(θ x) = = − | m(x) 0 Γ(α)Γ(β) Γ(α + β + n) ∫ ∑ ∑ Γ(α + β) x n x xi+α 1 n xi+β 1 Γ( i + α + i + β) xi+α 1 n xi+β 1 p − (1 p) − − − p − (1 p) − − dp Γ(α)Γ(β) ∑ − ∑ ×Γ( xi + α)Γ(n xi + β) ∑ − ∑ = [ ] ∑ n − ∑ n Γ(α + β) Γ( xi + α)Γ(n xi + β) Γ(α + β) Γ( xi + α)Γ(n xi + β) − = ∑ i=1 ∑ − i=1 Γ(α)Γ(β) Γ(α + β + n) Γ(α)Γ(β) Γ(α + β + n) [ ∑ ∑ ] ∑ ∑ Γ(α + β + n) 1 = p xi+α 1(1 p)n xi+β 1 x n x − − − fBeta( xi+α,n xi+β)(p)dp Γ( i + α)Γ( i + β) ∑ − ∑ × 0 − − ∫ ∑ n ∑ n Γ(α + β) Γ( xi + α)Γ(n xi + β) ∑ ∑ = i=1 − i=1 Γ(α)Γ(β) Γ(α + β + n) ∑ ∑ Hyun Min Kang Biostatistics 602 - Lecture 15 March 12th, 2013 15 / 1 Hyun Min Kang Biostatistics 602 - Lecture 15 March 12th, 2013 16 / 1 Solution (4/4) Is the Bayes estimator unbiased? The Bayes estimator of p is n n i=1 xi + α i=1 xi + α pˆ = n n = n x + α + n x + β α + β + n i=1 +α np + α i=1 i ∑ i=1 i ∑ E = = p n − α + β + n α + β + n ̸ xi n α α + β = ∑i=1 +∑ [ ∑ ] n α + β + n α + β α + β + n α ∑ Unless α+β = p. = [Guess about p from data] weight · 1 + [Guess about p from prior] weight np + α α (α + β)p · 2 Bias = p = − α + β + n − α + β + n As n increase, weight = n = 1 becomes bigger and bigger and 1 α+β+n α+β +1 n As n increases, the bias approaches to zero.