Chapter 17 Inference for One Numerical Population
Total Page:16
File Type:pdf, Size:1020Kb
Chapter 17 Inference for One Numerical Population In Chapter 10 you learned about finite populations. You learned about smart and dumb random samples from a finite population. You learned that i.i.d. trials can be viewed as the outcomes of a dumb random sample from a finite population. Chapter 11 developed these ideas in the special case of a dichotomous response. This was a very fruitful development, leading to all the results not named Poisson in Chapters 12–16. And, of course, our results for the Poisson are related to our results for the binomial. In Chapters 17–20 we mimic the work of Chapters 11–16, but for a numerical response rather than a dichotomy. First, you will see the familiar distinction between a finite population and a mathematical model for the process that generates the outcomes of trials. Second, you will see that responses that are counts must be studied differently than responses that are measurements. We begin by studying responses that are obtained by counting. Before we get to count responses, let me lay out some notation for this chapter. Recall that either dumb random sampling from a finite population or the assumption that trials are i.i.d., result in our observing n i.i.d. random variables: X1,X2,X3,...,Xn. The probability/sampling distribution for each of these random variables is determined by the population. Recall that for a dichotomous response the population is quite simple; it is determined by the single number p. For a numerical response, as you will soon see, the population is more complex—it is a picture, not a single number. Finally, when I want to talk about a generic random variable—one observation of a trial or one population member selected at random—I will use the symbol X, without a subscript. You may need—on occasion—to refer back to the preceding paragraph as you work through this chapter. 17.1 Responses Obtained by Counting I will begin with finite populations. 425 Table 17.1: The population distribution for the cat population. x 0 1 2 3 Total P (X = x) 0.10 0.50 0.30 0.10 1.00 Figure 17.1: The probability histogram for the cat population. 0.50 0.30 0.10 0.10 0 1 2 3 17.1.1 Finite Populations for Counts Please remember that the two examples in this subsection are both hypothetical. In particular, I claim no knowledge of cat ownership or household size in our society. Example 17.1 (The cat population.) A city consists of exactly 100,000 households. Nature knows that 10,000 of these households have no cats; 50,000 of these households have exactly one cat; 30,000 of these households have exactly two cats; and the remaining 10,000 households have exactly three cats. We can visualize the cat population as a population box that contains 100,000 cards, one for each household. On a household’s card is its number of cats: 0, 1, 2 or 3. Consider the chance mechanism of selecting one card at random from the population box. (Equivalently, selecting one household at random from the city.) Let X be the number on the card that will be selected. It is easy to determine the sampling distribution of X and it is given in Table 17.1. For example, 50,000 of the 100,000 households have exactly one cat; thus P (X =1)= 50,000/100,000 =0.50. It will be useful to draw the probability histogram of the random variable X; it is presented in Figure 17.1. To this end, note that consecutive possible values of X differ by 1; thus, δ = 1 and the height of each rectangle in Figure 17.1 equals the probability of its center value. For example, the rectangle centered at 1 has a height of 0.50 because P (X =1)=0.50. Either the distributionin Table 17.1 or its probability histogram in Figure 17.1 can play the role of the population. In the next section, we will see that for a measurement response the population is a picture, called the probability density function. (Indeed, the population must be a picture for mathematical reasons—trust me on this.) 426 Because we have no choice with a measurement—the population is a picture—for consistency, I will refer to the probability histogram of a count response as the population. Except when I don’t; occasionally, it will be convenient for me to view the probability distribution—such as the one in Table 17.1—as being the population. As Oscar Wilde reportedly said, Consistency is the last refuge of the unimaginative. It can be shown that the mean, µ, of the cat population equals 1.40 cats per household and its standard deviation, σ, equals 0.80 cats per household. I suggest you trust me on the accuracy of these values. Certainly, if one imagines a fulcrum placed at 1.40 in Figure 17.1, it appears that the picture will balance. If you really enjoy hand computations, you can use Equations 7.1 and 7.3 on pages 147 and 148 to obtain µ = 1.40 and σ2 = 0.64. Finally, if you refer to my original description of the cat population in Example 17.1, you can easily verify that the median of the 100,000 population values is 1. (In the sorted list, positions 10,001 through 60,000 are all home to the response value 1. Thus, the two center positions, 50,000 and 50,001 both house 1’s; hence, the median is 1.) For future use it is convenient to have a Greek letter to represent the median of a population; we will use ν, pronounced as new. You have now seen the veracity of my comment in the first paragraph of this chapter; the population for a count response—a probability histogram—is much more complicated than the population for a dichotomy—the number p. Thus far with the cat population, I have focused exclusively on Nature’s perspective. We now turn to the view of a researcher. Imagine that you are a researcher who is interested in the cat population. All you would know is that the response is a count; thus, the population is a probability histogram. But which probability histogram? It is natural to begin with the idea of using data to estimate the population’s probability histogram. How should you do that? Mathematically, the answer is simple: Select a random sample from the population of 100,000 households. Provided that the sample size, n, is 5% or fewer of the population size, N = 100,000, whether the sample is smart or dumb matters little and can be ignored. For the cat population, this means a sample of 5,000 or fewer households. (It is beyond my imagination—see Wilde quote above—that a cat population researcher would have the energy and resources to sample more than 5,000 households!) In practice, a researcher would attempt to obtain a sample for which the WTP assumption (Definition 10.3 on page 240) is reasonable. Because the cat population is hypothetical, I cannot show you the results of a real survey. Instead, I put the cat population in my computer and instructed my favorite statistical software package, Minitab, to select a random sample of n = 100 households. (I chose a dumb sample because it is easier to program.) The data I obtained are summarized in Table 17.2. In this table, of course, the researcher would not know the numbers in the Unknown Probability column; but Nature would. Nature can see that the researcher’s relative frequencies are somewhat close to the unknown probabilities. Given that the population is a picture, it seems reasonable to draw a picture of the data. Which picture? The idea is that I want to allow Nature to compare the picture of the data to the picture of the population. Thus, the dot plot is not a good idea—it’s like comparing apples to music. A histogram 427 Table 17.2: Data from a simulated random sample of size n = 100 from the cat population in Table 17.1. Relative Unknown Value Frequency Frequency Probability 0 9 0.09 0.10 1 44 0.44 0.50 2 35 0.35 0.30 3 12 0.12 0.10 Total 100 1.00 1.00 Figure 17.2: The density histogram for the data in Table 17.2. This is our picture-estimate of the population in Figure 17.1. 0.44 0.35 0.09 0.12 0 1 2 3 seems reasonable, but which one? The natural choice is the density histogram because, like the probability histogram, its total area is one. For a count response—but not a measurement response—we need one modification of the den- sity histogram before we use it. Remembering back to Chapter 2, one of the reasons for drawing a histogram of data is to group values into class intervals—sacrificing precision in the response values for a picture that is more useful. In Chapter 7, we never grouped values for the probabil- ity histogram. Thus, when we use a density histogram to estimate the probability histogram of a population, we do not group values together. Without grouping, there is no need for an endpoint convention; thus, we modify slightly our method for drawing a density histogram. The modifica- tion is presented in Figure 17.2, our density histogram of the data in Table 17.2.