Chapter 5, Section 4
Often people have a notion of the precision they want in a survey or a study: they’ll want a 95% confidence interval, or they’ll want the mean within 5 dollars or within 3%. Then you have to tailor the experiment to the demands, or show that what they want is not doable.
In this section we’ll look at determining the sample size: just how many need to be in a sample to get what you want.
We calculate a confidence interval as a sample statistic “plus or minus a determined amount.”
If there’s a predetermined bound on that amount, and the confidence level is already chosen, then the only thing that can be changed in the amount is n, the size of the sample.
For example:
If we want a 95% confidence interval and we’re willing to take a large sample, then our z score is fixed at 1.96. We’ll have a working estimate of the standard deviation (the sample standard deviation), which we’ll use, and we’ll set the plus or minus part of the equation equal to the bound.
Let’s suppose we did a random sample of some experimental units (n = 64) and found a sample mean of 5 and a sample standard deviation of 4.8. Using a confidence level of 85% (z = 1.44), we’ll say that the population mean is within

$$5 \pm 1.44 \cdot \frac{4.8}{\sqrt{64}} = 5 \pm 1.44 \cdot \frac{4.8}{8} = 5 \pm .864$$

and someone says: that’s too much leeway, how about 85% confidence and $\pm .6$? So you say: sure, but we’ll have to change the size of our sample.

$$1.44 \cdot \frac{4.8}{\sqrt{n}} = .6$$
Solving for n: $\sqrt{n} = 1.44(4.8)/.6 = 11.52$, so $n = 11.52^2 \approx 132.7$. We need n = 133, plus just a few more because we fudged and used the sample standard deviation. Always round upward for your n’s. So you take the plus or minus part of your confidence interval calculation, set it equal to the bound you want, and back-solve for n.
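If you’d like to check the arithmetic by machine, here is a minimal Python sketch of the back-solve. The function names `margin` and `sample_size_for_mean` are made up for this illustration; they just mirror the plus or minus formula and its rearrangement for n.

```python
import math

def margin(z, s, n):
    """The plus-or-minus part of the interval: z * s / sqrt(n)."""
    return z * s / math.sqrt(n)

def sample_size_for_mean(z, s, bound):
    """Back-solve z * s / sqrt(n) = bound for n, always rounding upward."""
    return math.ceil((z * s / bound) ** 2)

# Numbers from the example: 85% confidence (z = 1.44), s = 4.8
print(round(margin(1.44, 4.8, 64), 3))       # 0.864
print(sample_size_for_mean(1.44, 4.8, 0.6))  # 133
```

Note that the code rounds up with `math.ceil` rather than rounding to the nearest integer, which matches the “always round upward” rule.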
With sample proportions: unless you have some estimate of p, the proportion, use a p of .5. If you have some information, you may use the proportion from your information. For example, in the text there’s a sample proportion of .63, and the author chose to underestimate and use .6. If you round a value of p, round toward .5: .38 would go to .4 and .63 goes to .6. (Rounding toward .5 makes pq larger, so n comes out a little bigger than strictly needed, which is the safe direction.) In any case, I’ll give you p’s to use in any test questions on this, so you won’t have to guess in a test situation.
So, if a value of p is not obvious, use p = .5.
Note, too, that if sigma or a sample standard deviation is not available, you may use Range/4 as a substitute for s.
Example:
How many children must be sampled to estimate the mean age of learning to walk to within 1 month of the true mean with 99% confidence? The researcher knows that learning to walk starts as early as 8 months and as late as 26 months.
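One way to work this example, sketched in Python: take Range/4 = (26 − 8)/4 = 4.5 as the stand-in for s, look up the z score for 99% confidence, and back-solve for n as before. The helper names here are hypothetical, and the z score is computed from the standard library’s `statistics.NormalDist` (about 2.576) rather than read off a table, so a table value of 2.575 or 2.58 would give slightly different intermediate numbers.

```python
import math
from statistics import NormalDist

def z_for_confidence(level):
    """Two-sided z score, e.g. level=0.95 gives about 1.96."""
    return NormalDist().inv_cdf(1 - (1 - level) / 2)

def sample_size_from_range(low, high, bound, level):
    """Use Range/4 as a rough substitute for s, then solve for n (rounded up)."""
    s_guess = (high - low) / 4          # (26 - 8) / 4 = 4.5 months
    z = z_for_confidence(level)
    return math.ceil((z * s_guess / bound) ** 2)

# Walking ages from 8 to 26 months, within 1 month, 99% confidence
print(sample_size_from_range(8, 26, 1, 0.99))  # 135
```

Under these assumptions the answer is 135 children; remember that Range/4 is only a rough guess at s, so treat that as a ballpark figure.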
Example:
Suppose you are a retailer and you want to estimate the proportion of shoplifters operating in your store. Your security chief thinks that 5% of the customers shoplift. You decide to select a random sample of shoppers for special surveillance to check out his feeling. How many customers should you include in the survey to estimate the proportion of shoplifters correct to within .02 with 95% confidence?
Recall: $\sigma_{\hat{p}} = \sqrt{\dfrac{pq}{n}}$
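Setting z times that standard error equal to the bound and solving gives n = z²pq / bound². Here is a small Python sketch of that calculation for the shoplifting example, using the security chief’s p = .05 and z = 1.96; the function name `sample_size_for_proportion` is made up for this illustration.

```python
import math

def sample_size_for_proportion(z, p, bound):
    """Solve z * sqrt(p*q/n) = bound for n, rounding upward."""
    q = 1 - p
    return math.ceil(z ** 2 * p * q / bound ** 2)

# Security chief's guess: p = .05, within .02, 95% confidence
n_chief = sample_size_for_proportion(1.96, 0.05, 0.02)
print(n_chief)  # 457

# With no information about p we would fall back on p = .5,
# which always asks for a larger sample.
n_safe = sample_size_for_proportion(1.96, 0.5, 0.02)
print(n_safe > n_chief)  # True
```

So with the chief’s estimate you’d need about 457 customers. Falling back on p = .5 costs a much bigger sample, which is why it pays to use prior information about p when you have it.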
