Confidence & Credible Interval Estimates

Robert L. Wolpert
Department of Statistical Science
Duke University, Durham, NC, USA

1 Introduction

Point estimates of unknown parameters $\theta \in \Theta$ governing the distribution of an observed quantity $X \in \mathcal{X}$ are unsatisfying if they come with no measure of accuracy or precision. One approach to giving such a measure is to offer set-valued or, for one-dimensional $\Theta \subset \mathbb{R}$, interval-valued estimates for $\theta$, rather than point estimates. Upon observing $X = x$, we construct an interval $I(x)$ which is very likely to contain $\theta$, and which is very short.

The approach to exactly how these are constructed and interpreted is different for inference in the Sampling Theory tradition, and in the Bayesian tradition. In these notes I’ll present both approaches to estimating the means of the Normal and Exponential distributions, using “pivotal quantities,” and of integer-valued random variables with monotone CDFs.

1.1 Confidence Intervals and Credible Intervals

A $\gamma$-Confidence Interval is a random interval $I(X) \subset \Theta$ with the property that $P_\theta[\theta \in I(X)] \ge \gamma$, i.e., that it will contain $\theta$ with probability at least $\gamma$ no matter what $\theta$ might be. Such an interval may be specified by giving the end-points, a pair of functions $A : \mathcal{X} \to \mathbb{R}$ and $B : \mathcal{X} \to \mathbb{R}$ with the property that

$$(\forall \theta \in \Theta) \quad P_\theta[A(X) < \theta < B(X)] \ge \gamma. \tag{1}$$

Notice that this probability is for each fixed $\theta$; it is the endpoints of the interval $I(X) = (A, B)$ that are random in this calculation, not $\theta$, in the sampling-based Frequentist paradigm.

A $\gamma$-Credible Interval is an interval $I \subset \Theta$ with the property that $P[\theta \in I \mid X] \ge \gamma$, i.e., that the posterior probability that it contains $\theta$ is at least $\gamma$. A family of intervals $\{I(x) : x \in \mathcal{X}\}$ for all possible outcomes $x$ may be specified by giving the end-points as functions $A : \mathcal{X} \to \mathbb{R}$ and $B : \mathcal{X} \to \mathbb{R}$ with the property that

$$P[A(x) < \theta < B(x) \mid X = x] \ge \gamma. \tag{2}$$

This expression is a conditional or posterior probability that the random variable $\theta$ will lie in the interval $[A, B]$, given the observed value $x$ of the random variable $X$. It is the parameter $\theta$ that is random in the Bayesian paradigm, not the endpoints of the interval $I(x) = (A, B)$.

1.2 Pivotal Quantities

Confidence intervals for many parametric distributions can be found using “pivotal quantities”. A pivotal quantity is a function of the data and the parameters (so it’s not a statistic) whose probability distribution does not depend on any uncertain parameter values. Some examples:

• $\mathrm{Ex}(\lambda)$: If $X \sim \mathrm{Ex}(\lambda)$ then $\lambda X \sim \mathrm{Ex}(1)$ is pivotal and, for samples of size $n$, $\lambda \bar X_n \sim \mathrm{Ga}(n, n)$ and $2n\lambda \bar X_n \sim \chi^2_{2n}$ are pivotal.

• $\mathrm{Ga}(\alpha, \lambda)$: If $X \sim \mathrm{Ga}(\alpha, \lambda)$ with $\alpha$ known then $\lambda X \sim \mathrm{Ga}(\alpha, 1)$ is pivotal and, for samples of size $n$, $\lambda \bar X_n \sim \mathrm{Ga}(\alpha n, n)$ and $2n\lambda \bar X_n \sim \chi^2_{2\alpha n}$ are pivotal.

• $\mathrm{No}(\mu, \sigma^2)$: If $\mu$ is unknown but $\sigma^2$ is known, then $(X - \mu) \sim \mathrm{No}(0, \sigma^2)$ is pivotal and, for samples of size $n$, $\sqrt{n}(\bar X_n - \mu) \sim \mathrm{No}(0, \sigma^2)$ is pivotal.

• $\mathrm{No}(\mu, \sigma^2)$: If $\mu$ is known but $\sigma^2$ is unknown, then $(X - \mu)/\sigma \sim \mathrm{No}(0, 1)$ is pivotal and, for samples of size $n$, $\sum (X_i - \mu)^2/\sigma^2 \sim \chi^2_n$ is pivotal.

• $\mathrm{No}(\mu, \sigma^2)$: If $\mu$ and $\sigma^2$ are both unknown then for samples of size $n$, $\sqrt{n}(\bar X_n - \mu)/\sigma \sim \mathrm{No}(0, 1)$ and $\sum (X_i - \bar X_n)^2/\sigma^2 \sim \chi^2_{n-1}$ are both pivotal. This is the key example below.

• $\mathrm{Un}(0, \theta)$: If $X \sim \mathrm{Un}(0, \theta)$ with $\theta$ unknown then $(X/\theta) \sim \mathrm{Un}(0, 1)$ is pivotal and, for samples of size $n$, $\max(X_i)/\theta \sim \mathrm{Be}(n, 1)$ is pivotal. Find a sufficient pair of pivotal quantities for iid $\{X_i\} \sim \mathrm{Un}(\alpha, \beta)$.

• $\mathrm{We}(\alpha, \beta)$: If $X \sim \mathrm{We}(\alpha, \beta)$ has a Weibull distribution then $\beta X^\alpha \sim \mathrm{Ex}(1)$ is pivotal.

1.3 Confidence Intervals from Pivotal Quantities

Pivotal quantities allow us to construct sampling-theory “confidence intervals” for uncertain parameters. For the simplest example, let’s consider the problem of finding a $\gamma = 90\%$ Confidence Interval for the unknown rate parameter $\lambda$ from a single observation $X \sim \mathrm{Ex}(\lambda)$ from the exponential distribution.
Since $\lambda X \sim \mathrm{Ex}(1)$ is pivotal, we have

$$P_\lambda[a \le \lambda X \le b] = e^{-a} - e^{-b} = 0.95 - 0.05 = 0.90$$

if $a = -\log(0.95) = 0.0513$ and $b = -\log(0.05) = 2.996$, so for every fixed $\lambda > 0$

$$0.90 = P_\lambda\left[\frac{0.0513}{X} \le \lambda \le \frac{2.996}{X}\right] \tag{3a}$$

and $[0.0513/X,\ 2.996/X]$ is a symmetric (equal-tailed) 90% Confidence Interval for $\lambda$ from a single observation $X$. Note it is the endpoints of the interval that are random (through their dependence on $X$) in this sampling-distribution based approach to inference, while the parameter $\lambda$ is fixed. That’s why the word “confidence” is used for these intervals, and not “probability.” In Section (4), when we consider Bayesian interval estimates, this will switch: the observed (and so fixed) values of $X$ will be used, while the parameter will be regarded as a random variable.

With a larger sample-size something similar can be done. For example, since $2n\lambda \bar X_n \sim \chi^2_{2n}$ is pivotal for iid $\{X_j\} \sim \mathrm{Ex}(\lambda)$, and since the 5% and 95% quantiles of the $\chi^2_{10}$ distribution are 3.940 and 18.307, a sample of size $n = 5$ from the $\mathrm{Ex}(\lambda)$ distribution satisfies

$$0.90 = P_\lambda[3.940 \le 2n\lambda \bar X_n \le 18.307] = P_\lambda\left[\frac{0.3940}{\bar X_5} \le \lambda \le \frac{1.8307}{\bar X_5}\right], \tag{3b}$$

so $[0.39/\bar X_5,\ 1.83/\bar X_5]$ is a 90% confidence interval for $\lambda$ based on a sample of size $n = 5$. Intervals (3a) and (3b) are both of the same form, straddling the MLE $\hat\lambda = 1/\bar X_n$, but (3b) is narrower because its sample-size is larger. This will be a recurring theme: larger samples will lead to more precise estimates (i.e., narrower intervals) at the cost of more sampling.

Most discrete distributions don’t have (exact) pivotal quantities, but the central limit theorem usually leads to approximate confidence intervals for most distributions for large samples. Exact intervals are available for many distributions, with a little more work, even for small samples; see Section (5.3) for a construction of exact confidence and credible intervals for the Poisson distribution. Ask me if you’re interested in more details.

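
The construction behind intervals (3a) and (3b) can be sketched numerically. The following is a minimal illustration, not part of the original notes: it assumes only the Python standard library, and the function names (`chi2_even_cdf`, `chi2_even_quantile`, `exp_rate_ci`, `coverage`) are hypothetical. It relies on the fact that a $\chi^2$ distribution with an even number $2n$ of degrees of freedom has a closed-form CDF, so its quantiles can be found by bisection, and it then estimates by simulation how often the interval covers the true $\lambda$.

```python
import math
import random


def chi2_even_cdf(x, n):
    """CDF of the chi-square distribution with 2n degrees of freedom (n >= 1).

    Uses the closed form P[chi2_{2n} <= x] = 1 - exp(-x/2) * sum_{k<n} (x/2)^k / k!.
    """
    if x <= 0:
        return 0.0
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, n):
        term *= half / k
        total += term
    return 1.0 - math.exp(-half) * total


def chi2_even_quantile(p, n, tol=1e-10):
    """Quantile of chi2_{2n}, found by bisection on the CDF above."""
    lo, hi = 0.0, 1.0
    while chi2_even_cdf(hi, n) < p:  # expand until the quantile is bracketed
        hi *= 2.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if chi2_even_cdf(mid, n) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def exp_rate_ci(xs, gamma=0.90):
    """Equal-tailed gamma-confidence interval for lambda from iid Ex(lambda) data.

    Inverts q_lo <= 2 * lambda * sum(xs) <= q_hi, where 2*n*lambda*xbar ~ chi2_{2n}.
    """
    n = len(xs)
    total = sum(xs)  # equals n * xbar
    tail = (1.0 - gamma) / 2.0
    q_lo = chi2_even_quantile(tail, n)
    q_hi = chi2_even_quantile(1.0 - tail, n)
    return q_lo / (2.0 * total), q_hi / (2.0 * total)


def coverage(lam, n, reps=5000, seed=1):
    """Fraction of simulated samples whose interval contains the true rate lam."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        xs = [rng.expovariate(lam) for _ in range(n)]
        lo, hi = exp_rate_ci(xs)
        hits += lo <= lam <= hi
    return hits / reps
```

For a sample of size 5 with mean 1, `exp_rate_ci([1.0] * 5)` returns approximately (0.394, 1.831), the constants in (3b). The estimated coverage should be close to 0.90 whatever value of `lam` is used, which is the defining property (1) of a confidence interval.
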
2 Confidence Intervals for a Normal Mean

First let’s verify the claim above that, for a sample of size $n$ from the $\mathrm{No}(\mu, \sigma^2)$ distribution, the pivotal quantities $\sum (X_i - \bar X_n)^2/\sigma^2 \sim \chi^2_{n-1}$ and $\sqrt{n}(\bar X_n - \mu)/\sigma \sim \mathrm{No}(0, 1)$ have the distributions I claimed for them and, moreover, that they are independent.

Let $x = \{X_1, \dots, X_n\} \overset{\text{iid}}{\sim} \mathrm{No}(\mu, \sigma^2)$ be a simple random sample from a normal distribution with mean $\mu$ and variance $\sigma^2$. The log likelihood function for $\theta = (\mu, \sigma^2)$ is

$$\begin{aligned}
\log f(x \mid \theta) &= \log\left\{ (2\pi\sigma^2)^{-n/2}\, e^{-\sum (x_i - \mu)^2 / 2\sigma^2} \right\} \\
&= -\frac{n}{2} \log(2\pi\sigma^2) - \frac{1}{2\sigma^2} \sum (x_i - \bar x_n)^2 - \frac{n(\bar x_n - \mu)^2}{2\sigma^2}
\end{aligned}$$

so the MLEs are

$$\hat\mu = \bar x_n = \frac{1}{n} \sum x_i \quad\text{and}\quad \hat\sigma^2 = \frac{1}{n} S \quad\text{where}\quad S := \sum (x_i - \bar x_n)^2. \tag{4}$$

First we turn to discovering the probability distributions of these estimators, so we can make sampling-based interval estimates for $\mu$ and $\sigma^2$. Since the $\{X_i\}$ are independent, their sum $\sum X_i$ has a $\mathrm{No}(n\mu, n\sigma^2)$ distribution and

$$\hat\mu = \bar x_n \sim \mathrm{No}(\mu, \sigma^2/n).$$

Since the covariance between $\bar x_n$ and each component of $(x - \bar x_n)$ is zero, and since they’re all jointly Gaussian, $\bar x_n$ must be independent of $(x - \bar x_n)$ and hence of any function of $(x - \bar x_n)$, including $S = \sum (x_i - \bar x_n)^2$.

Now we can use moment generating functions to discover the distribution of $S$. Since $Z^2 \sim \chi^2_1 = \mathrm{Ga}(1/2, 1/2)$ for a standard normal $Z \sim \mathrm{No}(0, 1)$, we have

$$n(\bar x_n - \mu)^2/\sigma^2 \sim \mathrm{Ga}(1/2,\ 1/2) \quad\text{and}\quad \sum (x_i - \mu)^2/\sigma^2 \sim \mathrm{Ga}(n/2,\ 1/2)$$

so

$$n(\bar x_n - \mu)^2 \sim \mathrm{Ga}(1/2,\ 1/2\sigma^2) \quad\text{and}\quad \sum (x_i - \mu)^2 \sim \mathrm{Ga}(n/2,\ 1/2\sigma^2).$$

By completing the square we have

$$\sum (x_i - \mu)^2 = \sum (x_i - \bar x_n)^2 + n(\bar x_n - \mu)^2$$

as the sum of two independent terms. Recall (or compute) that the Gamma $\mathrm{Ga}(\alpha, \lambda)$ MGF is $(1 - t/\lambda)^{-\alpha}$ for $t < \lambda$, and that the MGF for the sum of independent random variables is the product of the individual MGFs, so

$$\begin{aligned}
E \exp\left\{ t \sum (x_i - \mu)^2 \right\} &= (1 - 2\sigma^2 t)^{-n/2} \\
&= E \exp\left\{ t \sum (x_i - \bar x_n)^2 \right\}\, E \exp\left\{ t\, n(\bar x_n - \mu)^2 \right\} \\
&= E \exp\left\{ t \sum (x_i - \bar x_n)^2 \right\}\, (1 - 2\sigma^2 t)^{-1/2}, \quad\text{so} \\
E \exp\left\{ t \sum (x_i - \bar x_n)^2 \right\} &= (1 - 2\sigma^2 t)^{-(n-1)/2}
\end{aligned}$$

by dividing. Thus

$$S := \sum (x_i - \bar x_n)^2 \sim \mathrm{Ga}\left( \frac{n-1}{2},\ \frac{1}{2\sigma^2} \right)$$

and so $S/\sigma^2 \sim \chi^2_{n-1}$ is independent of $\sqrt{n}(\bar X_n - \mu)/\sigma \sim \mathrm{No}(0, 1)$, as claimed.

2.1 Confidence Intervals for Mean $\mu$ when Variance $\sigma_0^2$ is Known

The pivotal quantity

$$Z = \frac{\bar x_n - \mu}{\sqrt{\sigma_0^2/n}}$$

has a standard $\mathrm{No}(0, 1)$ normal distribution, with CDF $\Phi(z)$.
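
The distributional claims of Section 2 can also be checked by simulation. The sketch below is an illustration only, not part of the original notes: it uses the Python standard library, and the function names (`simulate`, `correlation`) and the default values of `n`, `mu`, and `sigma` are arbitrary assumptions. It compares the sample moments of $Z$ and $S/\sigma^2$ with those of $\mathrm{No}(0, 1)$ and $\chi^2_{n-1}$ (mean $n - 1$, variance $2(n - 1)$), and looks at sample correlations. Near-zero correlation is a consequence of independence, not a proof of it.

```python
import random
import statistics


def simulate(n=5, mu=2.0, sigma=3.0, reps=20000, seed=0):
    """Draw reps samples of size n from No(mu, sigma^2).

    Returns two lists: the values of Z = sqrt(n) * (xbar - mu) / sigma
    and the values of S / sigma^2, where S = sum (x_i - xbar)^2.
    """
    rng = random.Random(seed)
    zs, qs = [], []
    for _ in range(reps):
        xs = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = sum(xs) / n
        s = sum((x - xbar) ** 2 for x in xs)
        zs.append((xbar - mu) / (sigma / n ** 0.5))
        qs.append(s / sigma ** 2)
    return zs, qs


def correlation(u, v):
    """Sample (Pearson) correlation of two equal-length lists."""
    mean_u, mean_v = statistics.fmean(u), statistics.fmean(v)
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v)) / (len(u) - 1)
    return cov / (statistics.stdev(u) * statistics.stdev(v))


if __name__ == "__main__":
    n = 5
    zs, qs = simulate(n=n)
    print("mean, var of Z:        ", statistics.fmean(zs), statistics.variance(zs))
    print("mean, var of S/sigma^2:", statistics.fmean(qs), statistics.variance(qs))
    print("target mean, var:      ", n - 1, 2 * (n - 1))
    print("corr(Z, S/sigma^2):    ", correlation(zs, qs))
    print("corr(Z^2, S/sigma^2):  ", correlation([z * z for z in zs], qs))
```

The second correlation, between $Z^2$ and $S/\sigma^2$, is included because two quantities can be uncorrelated and still dependent through their magnitudes; here both correlations should be near zero.
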
