M 140 Test 2 A Name______

SHOW YOUR WORK FOR FULL CREDIT!

Problem            Max. Points   Your Points
1-10               10
11                 5
12                 4
13                 12
14                 6
15                 17
16                 6
17 Extra Credit    4
Total              60


Multiple Choice Questions (1 point each)

1. A new headache remedy was given to a group of 25 subjects who had headaches. Four hours after taking the new remedy, 20 of the subjects reported that their headaches had disappeared. From this information you conclude:
a. that the remedy is effective for the treatment of headaches.
b. nothing, because the sample size is too small.
c. nothing, because there is no control group for comparison.
d. that the new treatment is better than aspirin.
e. that the remedy is not effective for the treatment of headaches.

2. In the Faculty of Science of a university there are ten departments. Two faculty members are selected at random from each department. The sample of twenty faculty members selected in this way is called a:
a. Simple random sample.
b. Systematic sample.
c. Stratified random sample.
d. Voluntary response sample.
e. Multistage sample.

3. A simple random sample of 1200 adult Americans is selected and each person is asked the following question.

"In light of the huge national deficit, should the government at this time spend additional money to establish a national system of health insurance?"

Only 39% of those responding answered yes. This
a. is reasonably accurate since it used a large, simple random sample.
b. probably overstates the percentage of people that favor a system of national health insurance.
c. probably understates the percentage of people that favor a system of national health insurance.
d. is very inaccurate, but neither understates nor overstates the percentage of people that favor a system of national health insurance. Since simple random sampling was used, it is unbiased.

Use the following to answer questions 4–5: A marketing research firm wishes to determine if the adult men in Laramie, Wyoming, would be interested in a new upscale men’s clothing store. From a list of all residential addresses in Laramie, the firm selects a simple random sample of 100 and mails a brief questionnaire to each.

4. What is the population of interest?
a. All adult men in Laramie, Wyoming.
b. All residential addresses in Laramie, Wyoming.
c. The members of the marketing firm that actually conducted the survey.
d. The 100 addresses to which the survey was mailed.

5. One particular neighborhood in Laramie happens to have exactly 100 residential addresses. What do we know about the chance that all 100 homes in that neighborhood end up being the sample that is selected?
a. It is the same as for any other set of 100 residential addresses.
b. It is exactly 0. Simple random samples will spread out the addresses selected.
c. It is reasonably large due to the “cluster” effect.
d. It is 100 divided by the size of the population of Laramie.
e. It is much less likely than most sets of 100 residential addresses from Laramie are.

6. There are two statistics classes. The first has 350 students and the second has 250 students. In the first class the students are instructed to each toss a coin 20 times and record the value of $\hat{p}$, the proportion of heads. The instructor then makes a histogram of the 350 values of $\hat{p}$ obtained. The second class did the same, except that each student tossed a coin 40 times. The histogram of $\hat{p}$ values for the first class should be
a. more biased since it is based on a smaller number of tosses.
b. more variable since it is based on a smaller number of tosses.
c. less variable since it is based on a larger number of students.

7. A large university wishes to determine the percentage of its students that have committed some form of academic dishonesty, such as cheating on an examination or plagiarism on assignments during their academic career. To estimate this percentage, a random sample of their current students is selected. Each selected student is then interviewed by a staff member and asked if they had cheated. The results of this survey likely will be unreliable because
a. some students likely will refuse to answer the question.
b. those students who answer the question may not do so honestly.
c. the interviewer being a staff member may be intimidating and hence there may be response bias.
d. All of the above are reasons for concern.

8. A telemarketing firm in Los Angeles uses a device that dials residential telephone numbers in the city at random. Of the first 100 numbers dialed, 48% are unlisted. This is not surprising because 52% of all Los Angeles residential phones are unlisted.
a. the 52% is a parameter, and the 48% is a statistic
b. the 52% is a statistic, and the 48% is a parameter
c. both the 52% and the 48% are parameters
d. both the 52% and the 48% are statistics

9. A polling agency took a random sample of 1000 likely voters in Florida (population: about 18 million), and another random sample of 1000 likely voters in New Mexico (population: about 2 million).
a. the sampling variability of the Florida poll will be greater than that of the New Mexico poll
b. the sampling variability of the Florida poll will be less than that of the New Mexico poll
c. the sampling variability of the Florida poll will be about the same as for the New Mexico poll

10. Which one of the following statements is false?
a. An appropriate statistical notation for the fraction of all American adults who received at least one speeding ticket last year is p.
b. We cannot predict the likely accuracy of an estimate obtained from a sample if the sample is not taken randomly.
c. Usually a parameter value will fall within the interval specified by the point estimate plus and minus its margin of error, but this is not guaranteed to happen.
d. When the z-score of a data value is 1.8, that means that the data value could be 1.8 standard deviations either above or below the mean.


11. (5 points) Fill in the blanks. To make it easier for you, here’s the list of possible terms to use: sampling variability, sampling design, sampling distribution.

A list of potential individuals or objects to be sampled is called ___ sampling frame ______.

Simple random sampling, stratified random sampling, and multistage random sampling are the most common types of ____ sampling design _____ .

The fact that different random samples from a population will give somewhat different results is called _____ sampling variability ______.

The distribution of the sample statistic for all the possible SRSs of the same size from the same population is called the ____ sampling distribution ______.

The fact that even well constructed samples will give results that are somewhat different from the population value simply because the entire population is not sampled is called ___ sampling error ___ .

12. The drawings represent various sampling methods. Match the pictures with the sampling methods.

A. Stratified Random Sampling

B. Systematic Sampling

C. Simple Random Sampling

D.

13. An Australian study included 588 men and women who already had some pre-cancerous skin lesions. Half got a skin cream containing a sunscreen with a sun protection factor of 17; half got an inactive cream. After 7 months, those using the sunscreen had fewer new pre-cancerous skin lesions (New England Journal of Medicine).

a. What are the explanatory and response variables?

Explanatory variable: the type of skin cream (sunscreen cream or inactive cream)

Response variable: the number of pre-cancerous skin lesions

b. Explain what the placebo was in this study, and why they included it.

The placebo is the inactive skin cream. The control group received the placebo. This provides the basis for comparison. Without a placebo group to compare against, it is not possible to know whether the treatment itself had any effect.

c. This experiment was also a double-blind study. Explain what that means and why they used that.

A double-blind experiment means that neither the subjects nor the experimenters know which group gets the treatment (the skin cream with the sunscreen) and which gets the placebo (the inactive cream). This eliminates observer or experimenter bias, as experimenters may unwittingly bias participants because they expect them to respond in a particular way.

d. Draw the outline of the study. Make sure you indicate the response variable, and group sizes.

588 randomly selected subjects with some pre-cancerous skin lesions
  -> Random assignment
       -> Treatment group: 294 subjects, skin cream with sunscreen
       -> Control group: 294 subjects, placebo
  -> Compare number of pre-cancerous skin lesions after 7 months

e. The experimenters suspected that gender could be one of the lurking variables. Explain briefly what they could do to control for this variable.

They could use a block design: separate the subjects by gender, then run the experiment within each gender block.

f. Explain at what point in the study the experimenters should use random assignment:

When they separate the subjects into the two groups. They should randomly decide which subject goes to which group.
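The random assignment described in part f can be sketched in Python. This is only an illustration under stated assumptions: the subject IDs 1 to 588 are hypothetical labels, the helper name `randomly_assign` is made up for this sketch, and the seed is fixed only so that a run can be repeated.

```python
import random

def randomly_assign(subject_ids, seed=None):
    """Shuffle the subjects and split them into two equal-sized groups."""
    rng = random.Random(seed)
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Hypothetical IDs 1..588 stand in for the subjects in the study.
treatment, control = randomly_assign(range(1, 589), seed=140)
print(len(treatment), len(control))  # 294 294
```

Every subject has the same chance of landing in either group, so the groups should be similar apart from the cream they receive.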


14. The effect of alcohol consumption on the body appears to be much greater at high altitudes than at sea level. To test this theory, a scientist randomly selected 40 subjects and used a design in which each person acted as his or her own control. For the first treatment, the subject was transported to an altitude of 12,000 feet, where he or she ingested a drink containing 100 cc of alcohol. The second treatment consisted of receiving the same drink at sea level. After two hours, the amount of alcohol in the blood (grams per 100 cc) at each altitude was measured.

a. Explain why this is an experiment and not an observational study.

It is an experiment because the researchers imposed a treatment. They didn’t just observe the subjects.

b. Specifically, what type of design does this experiment use?

Matched pair design

c. Randomization was used in this experiment. Explain where.

When they decided, for each subject, at which altitude he or she would drink first.

15. (10 points) A survey of homeowners in California found that 24% feel their home is too small for their families. You randomly select 500 homeowners in California, and you find that 144 of them feel their home is too small for their family.

a. Clearly identify in words the population, the sample, the parameter of interest, and the statistic in this situation.

Population: ALL homeowners in California

Sample: the 500 randomly selected homeowners in California

Parameter: The proportion of ALL homeowners in California who feel their home is too small for their families

Statistic: The proportion of the 500 homeowners who feel their home is too small for their families

b. Fill in the blanks: $\hat{p}$ = __ 144/500 = 0.288 __    p = __ 24% = 0.24 __

c. If we could create the sampling distribution of the sample proportions for all possible samples of size 500, what would be the mean and standard deviation of this sampling distribution?

Mean: p = 0.24

Standard deviation: $\sqrt{\dfrac{p(1-p)}{n}} = \sqrt{\dfrac{0.24(1-0.24)}{500}} \approx 0.019$
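The standard deviation above can be checked with a few lines of Python. This is a minimal sketch; the function name `proportion_sampling_sd` is chosen here for illustration and does not come from the test.

```python
import math

def proportion_sampling_sd(p, n):
    """Standard deviation of the sampling distribution of a sample proportion."""
    return math.sqrt(p * (1 - p) / n)

p = 0.24   # population proportion from the survey
n = 500    # sample size
sd = proportion_sampling_sd(p, n)
print(round(sd, 3))  # 0.019
```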

d. Are the conditions satisfied to assume that the shape of the sampling distribution is approximately normal? Check the conditions.

np = 500(0.24) = 120 > 10
n(1 - p) = 500(1 - 0.24) = 380 > 10

Since both are greater than 10, we can safely assume that the sampling distribution is approximately normal.

e. Draw the sampling distribution of the sample proportions. Mark the mean, and three standard deviations below and above the mean, and label them with values.

[Normal curve centered at 0.24; axis marks at 0.183, 0.202, 0.221, 0.24, 0.259, 0.278, 0.297]

f. About how many standard deviations is the sample proportion in your sample away from p?

0.288 is more than 2 standard deviations above the mean. More precisely, it’s 2.53 standard deviations above the mean: $z = \dfrac{0.288 - 0.24}{0.019} = 2.53$

g. Based on this, do you have any doubts about the survey result? Explain briefly.

I do have some doubts, since the sample proportion (0.288) is far from the mean. The probability that this could happen just by chance is very low.

h. True or False?

T F If we did the same survey with 500 randomly selected homeowners in a smaller state, the accuracy of the estimate of the proportion of all homeowners who feel their home is too small for their family would be better than from the larger state.


T F If you could ask a random sample of 1000 homeowners instead of 500, then the margin of error of our estimate of the proportion of all homeowners who feel their home is too small for their family would be smaller.
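As a check on the z-score found earlier in Problem 15, the sketch below recomputes it in Python. The answer key divides by the rounded standard deviation 0.019, which gives 2.53; using the unrounded standard deviation gives a slightly smaller value. The variable names are chosen here for illustration only.

```python
import math
from statistics import NormalDist

p = 0.24           # proportion reported by the survey
n = 500            # sample size
p_hat = 144 / n    # sample proportion, 0.288

sd = math.sqrt(p * (1 - p) / n)        # unrounded, about 0.0191
z_key = (p_hat - p) / round(sd, 3)     # uses 0.019, as in the answer key
z_exact = (p_hat - p) / sd             # uses the unrounded value

# Chance of a sample proportion at least this far above p, under the normal model.
upper_tail = 1 - NormalDist().cdf(z_key)

print(round(z_key, 2), round(z_exact, 2))  # 2.53 2.51
print(upper_tail)
```

Either way the sample proportion sits about two and a half standard deviations above 0.24, so the upper-tail probability is well under 1%.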

16. (6 points) The weights of the contents of a cereal box are normally distributed with a mean weight of 20 ounces and a standard deviation of 0.37 ounce.

a. What percent of cereal boxes weigh more than 21 ounces?

$z = \dfrac{21 - 20}{0.37} = 2.70$

From the table, or using the calculator: 0.0034

About 0.34% of cereal boxes weigh more than 21 ounces.
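The same tail area can be found without a table. The sketch below uses `NormalDist` from Python's standard `statistics` module; the variable names are illustrative.

```python
from statistics import NormalDist

# Box weights: normal with mean 20 oz and standard deviation 0.37 oz.
weights = NormalDist(mu=20, sigma=0.37)

# Proportion of boxes weighing more than 21 oz (area to the right of 21).
share_over_21 = 1 - weights.cdf(21)
print(f"{share_over_21:.4f}")  # 0.0034
```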

b. Boxes in the lower 5% do not meet the minimum weight requirements and must be repackaged. What is the minimum weight requirement for a cereal box?

Lower 5% = 0.05, so the z-score is -1.64

x = z(s.d.) + mean = -1.64(0.37) + 20 = 19.39

The minimum weight requirement for a cereal box is 19.39 ounces.
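The cutoff can also be obtained directly from the inverse normal function. This is a minimal sketch using `NormalDist.inv_cdf` from Python's standard library; it works with the unrounded z-score (about -1.645) and still rounds to the same answer.

```python
from statistics import NormalDist

# Box weights: normal with mean 20 oz and standard deviation 0.37 oz.
weights = NormalDist(mu=20, sigma=0.37)

# Weight below which the lightest 5% of boxes fall.
cutoff = weights.inv_cdf(0.05)
print(round(cutoff, 2))  # 19.39
```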


17. EXTRA CREDIT: The histograms below show four sampling distributions of statistics intended to estimate the same parameter. Label each distribution relative to the others as having large or small bias and as having large or small variability. (Circle the correct choices.)

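[Four histograms, not reproduced here. The same two choices appear under each one.]

Bias: large / small     Variability: large / small
Bias: large / small     Variability: large / small
Bias: large / small     Variability: large / small
Bias: large / small     Variability: large / small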
