12 Interval Estimation


12.1 Introduction

In Chapter 11, we looked into point estimation in the sense of giving single values or points as estimates for well-defined parameters in a pre-selected population density/probability function. If p is the probability that someone contesting an election will win and we give an estimate as p = 0.7, then we are saying that there is exactly a 70% chance of winning. From a layman's point of view, such an exact number may not be that reasonable. If we say that the chance is between 60% and 75%, it may be more acceptable to a layman. If the waiting time in a queue at a check-out counter in a grocery store is exponentially distributed with expected waiting time θ minutes, time being measured in minutes, and we give an estimate of θ as between 5 and 10 minutes, it may be more reasonable than giving a single number such as saying that the expected waiting time is exactly 6 minutes. If we give an estimate of the expected lifetime of individuals in a certain community of people as between 80 and 90 years, it may be more acceptable than saying that the expected lifetime is exactly 83 years. Thus, when the unknown parameter θ has a continuous parameter space Ω, it may be more reasonable to come up with an interval so that we can say that the unknown parameter θ is somewhere on this interval. We will examine such interval estimation problems here.

12.2 Interval estimation problems

In order to explain the various technical terms in this area, it is better to examine a simple problem first and then define the various terms appearing there, in the light of the illustration.

Example 12.1. Let x_1, …, x_n be iid variables from an exponential population with density

\[ f(x, \theta) = \frac{1}{\theta}\, e^{-x/\theta}, \quad x \ge 0,\ \theta > 0, \]

and zero elsewhere. Compute the densities of (1) u = x_1 + ⋯ + x_n; (2) v = u/θ, and then evaluate a and b such that Pr{a ≤ v ≤ b} = 0.95.

Solution 12.1. The moment generating function (mgf) of x is known and it is M_x(t) = (1 − θt)^{−1}, 1 − θt > 0. Since x_1, …, x_n are iid, the mgf of u = x_1 + ⋯ + x_n is M_u(t) = (1 − θt)^{−n}, 1 − θt > 0, so u has a gamma distribution with parameters (α = n, β = θ). The mgf of v is available from M_u(t) as M_v(t) = (1 − t)^{−n}, 1 − t > 0. In other words, v has a gamma density with parameters (α = n, β = 1), and hence it is free of all parameters since n is known. Let the density of v be denoted by g(v). Then all sorts of probability statements can be made on the variable v. Suppose that we wish to find an a such that Pr{v ≤ a} = 0.025; then we have

\[ \int_0^a \frac{v^{n-1}}{\Gamma(n)}\, e^{-v}\, \mathrm{d}v = 0.025. \]

We can either integrate by parts or use incomplete gamma function tables to obtain the exact value of a since n is known. Similarly, we can find a b such that

\[ \Pr\{v \ge b\} = 0.025 \;\Rightarrow\; \int_b^{\infty} \frac{v^{n-1}}{\Gamma(n)}\, e^{-v}\, \mathrm{d}v = 0.025. \]

This b is also available either by integrating by parts or from the incomplete gamma function tables. Then the probability coverage over the interval [a, b] is 0.95, or

\[ \Pr\{a \le v \le b\} = 0.95. \]

We are successful in finding a and b because the distribution of v is free of all parameters. If the density of v had contained some parameters, then we could not have found a and b, because those points would have been functions of the parameters involved.
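Today the incomplete gamma function tables would typically be replaced by a numerical quantile function. The following is a minimal computational sketch, assuming SciPy is available and taking n = 10 purely for illustration.

```python
# A sketch (not from the text) of computing the cut-off points a and b of
# Example 12.1 numerically; the quantile function gamma.ppf plays the role
# of the incomplete gamma function tables. n = 10 is an assumed sample size.
from scipy.stats import gamma

n = 10          # sample size (known)
alpha = 0.05    # total tail area to be cut off

# v = (x_1 + ... + x_n)/theta ~ Gamma(shape = n, scale = 1), free of theta
a = gamma.ppf(alpha / 2, n)        # Pr{v <= a} = 0.025
b = gamma.ppf(1 - alpha / 2, n)    # Pr{v >= b} = 0.025

print(a, b)
print(gamma.cdf(b, n) - gamma.cdf(a, n))   # check: prints 0.95
```

For n = 10 this gives a ≈ 4.80 and b ≈ 17.08.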
Hence the success of our procedure depends upon finding a quantity such as v here, which is a function of the sample values x_1, …, x_n and the parameter (or parameters) under consideration, but whose distribution is free of all parameters. Such quantities are called pivotal quantities.

Definition 12.1 (Pivotal quantities). A function of the sample values x_1, …, x_n and the parameters under consideration, but whose distribution is free of all parameters, is called a pivotal quantity.

Let us examine Example 12.1 once again. We have a probability statement Pr{a ≤ v ≤ b} = 0.95. Let us examine the mathematical inequalities here:

\[ a \le v \le b \;\Rightarrow\; a \le \frac{x_1 + \cdots + x_n}{\theta} \le b \;\Rightarrow\; \frac{1}{b} \le \frac{\theta}{x_1 + \cdots + x_n} \le \frac{1}{a} \;\Rightarrow\; \frac{x_1 + \cdots + x_n}{b} \le \theta \le \frac{x_1 + \cdots + x_n}{a}. \]

Since these inequalities are mathematically identical, we must have the probability statements over these intervals identical. That is,

\[ \Pr\Big\{ a \le \frac{x_1 + \cdots + x_n}{\theta} \le b \Big\} = \Pr\Big\{ \frac{x_1 + \cdots + x_n}{b} \le \theta \le \frac{x_1 + \cdots + x_n}{a} \Big\}. \tag{12.1} \]

Thus, we have converted a probability statement over v into a probability statement over θ. What is the difference between these two probability statements? The first one says that the probability that the random variable v falls on the fixed interval [a, b] is 0.95. In the second statement, θ is not a random variable but a fixed yet unknown parameter, and the random variables are at the end points of the interval; here the interval is random, not θ. Hence the probability statement over θ is to be interpreted as: the probability that the random interval [u/b, u/a] covers the unknown θ is 0.95.

In this example, we have cut off 0.025 area at the right tail and 0.025 area at the left tail, so that the total area cut off is 0.025 + 0.025 = 0.05. If we had cut off an area α/2 each at both tails, then the total area cut off is α and the area in the middle is 1 − α. In our Example 12.1, α = 0.05 and 1 − α = 0.95. We will introduce some standard notations which will come in handy later on.

Notation 12.1. Let y be a random variable whose density f(y) is free of all parameters. Then we can compute a point b such that from that point onward to the right the area cut off is a specified number, say α. This b is usually denoted as y_α, the value of y from which onward to the right the area under the density curve or probability function is α, or

\[ \Pr\{ y \ge y_\alpha \} = \alpha. \tag{12.2} \]

Then, following Notation 12.1, if a is a point below which the left tail area is α, the point a should be denoted as y_{1−α}, the point from which onward to the right the area under the curve is 1 − α, or equivalently the left tail area is α. In Example 12.1, if we wanted to compute a and b so that an equal area α/2 is cut off at the right and left tails, then the first part of equation (12.1) could have been written as

\[ \Pr\{ v_{1-\alpha/2} \le v \le v_{\alpha/2} \} = 1 - \alpha. \]

Definition 12.2 (Confidence intervals). Let x_1, …, x_n be a sample from the population f(x|θ), where θ is the parameter. Suppose that it is possible to construct two functions of the sample values ϕ_1(x_1, …, x_n) and ϕ_2(x_1, …, x_n) so that the probability that the random interval [ϕ_1, ϕ_2] covers the unknown parameter θ is 1 − α for a given α. That is,

\[ \Pr\{ \phi_1(x_1, \ldots, x_n) \le \theta \le \phi_2(x_1, \ldots, x_n) \} = 1 - \alpha \]

for all θ in the parameter space Ω. Then 1 − α is called the confidence coefficient, the interval [ϕ_1, ϕ_2] is called a 100(1 − α)% confidence interval for θ, ϕ_1 is called the lower confidence limit, ϕ_2 is called the upper confidence limit, and ϕ_2 − ϕ_1 is the length of the confidence interval.
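As an illustration of Definition 12.2 in the setting of Example 12.1, the following sketch (again assuming SciPy, with hypothetical values θ = 6 and n = 10 chosen only for demonstration) inverts the pivotal statement a ≤ u/θ ≤ b into the confidence limits ϕ_1 = u/b and ϕ_2 = u/a for one simulated sample.

```python
# A sketch (my illustration, not from the text) of inverting the pivotal
# quantity v = u/theta into a 95% confidence interval for theta.
# theta_true = 6 and n = 10 are assumed values for demonstration only.
import numpy as np
from scipy.stats import gamma

rng = np.random.default_rng(0)
theta_true, n, alpha = 6.0, 10, 0.05

x = rng.exponential(scale=theta_true, size=n)   # sample x_1, ..., x_n
u = x.sum()

a = gamma.ppf(alpha / 2, n)        # lower cut-off, i.e. v_{1 - alpha/2}
b = gamma.ppf(1 - alpha / 2, n)    # upper cut-off, i.e. v_{alpha/2}

# a <= u/theta <= b  is equivalent to  u/b <= theta <= u/a
lower, upper = u / b, u / a
print(f"95% confidence interval for theta: [{lower:.2f}, {upper:.2f}]")
```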
When a random interval [ϕ_1, ϕ_2] is given, we are placing 100(1 − α)% confidence on our interval, saying that this interval will cover the true parameter value θ with probability 1 − α. The meaning is that if we construct the same interval by using samples of the same size n, then in the long run 100(1 − α)% of the intervals will contain the true parameter θ. If one interval is constructed, then that interval need not contain the true parameter θ; the chance that this interval contains the true parameter θ is 1 − α. In our Example 12.1, we were placing 95% confidence in the interval

\[ \Big[ \frac{x_1 + \cdots + x_n}{v_{0.025}},\ \frac{x_1 + \cdots + x_n}{v_{0.975}} \Big] \]

to contain the unknown parameter θ.

From Example 12.1 and the discussions above, it is clear that we will be successful in coming up with a 100(1 − α)% confidence interval for a given parameter θ if we have the following:

(i) a pivotal quantity Q, that is, a quantity containing the sample values and the parameter θ but whose distribution is free of all parameters [note that there may be many pivotal quantities in a given situation];

(ii) Q enables us to convert a probability statement on Q into a mathematically equivalent statement on θ.

How many such 100(1 − α)% confidence intervals can be constructed for a given θ, if one such interval can be constructed? The answer is: infinitely many.
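Returning to the long-run interpretation given above: a short simulation sketch (with the same assumed values θ = 6 and n = 10 as before) can check that close to 95% of the random intervals [u/b, u/a] cover the fixed true θ.

```python
# A sketch (my illustration, with assumed theta_true = 6, n = 10) of the
# long-run interpretation: over many repetitions, about 100(1 - alpha)% of
# the random intervals [u/b, u/a] cover the fixed true parameter theta.
import numpy as np
from scipy.stats import gamma

rng = np.random.default_rng(42)
theta_true, n, alpha, trials = 6.0, 10, 0.05, 100_000

a = gamma.ppf(alpha / 2, n)
b = gamma.ppf(1 - alpha / 2, n)

# one row per repetition of the experiment; u = x_1 + ... + x_n for each
u = rng.exponential(scale=theta_true, size=(trials, n)).sum(axis=1)
coverage = np.mean((u / b <= theta_true) & (theta_true <= u / a))
print(coverage)   # close to 1 - alpha = 0.95
```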