I. Survey Sampling Goals and Basic Terminology


Stat 475/920 Notes 1

Reading: Lohr, Chapter 1.1-1.4, 1.6

I. Survey Sampling Goals and Basic Terminology

Sample surveys are used to obtain information about a population by examining only a small fraction of the population.

Examples of fields in which survey samples are used:
• Governments survey human populations; for example, the U.S. government conducts health surveys and demographic surveys, and the unemployment rate is estimated by sampling.
• Public opinion polling.
• Musical composers’ royalties from their music being played on the radio are determined by sampling radio station broadcasts to estimate the frequency with which a musical piece is played.
• During audits of the financial records of large companies, sampling techniques may be used when examination of the entire set of records is impractical.

Basic Terminology

Observation unit: An object on which a measurement is taken.

Target population: The complete collection of observations we want to study.

Sample: A subset of the population.

Sampled population: The collection of all possible observation units that might have been chosen in a sample; the population from which the sample was taken.

Sampling unit: The unit we actually sample. We may want to study individuals but not have a list of all individuals in the target population. Instead, households serve as the sampling units, and the observation units are individuals in the household.

Sampling frame: The list of sampling units.

II. Judgment samples vs. probability samples

A good sample will reproduce the characteristics of interest in the population as closely as possible.

Two approaches to sampling:

1. Judgment sample: The investigator uses his or her judgment and knowledge of the population to select the units to be included in the sample.

2. Probability sample: Each unit in the population has a known probability of being selected in the sample, and a chance method, such as using numbers from a random number table, is used to choose the specific units to be included in the sample.
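As an illustration of the chance mechanism in a probability sample, here is a minimal sketch that draws a simple random sample (one particular kind of probability sample) using a pseudorandom number generator in place of a random number table. The sampling frame of household IDs, the sample size and the seed are all hypothetical.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n distinct units from the frame; every subset of size n is equally likely."""
    rng = random.Random(seed)
    return rng.sample(frame, n)

# Hypothetical sampling frame: a list of 1000 household IDs.
frame = [f"household_{i:04d}" for i in range(1000)]
sample = simple_random_sample(frame, n=50, seed=475)

# Under this design every unit has the same known inclusion probability n/N.
inclusion_probability = len(sample) / len(frame)
print(sample[:3], inclusion_probability)
```

The investigator's judgment plays no role in which units are chosen; only the frame and the sample size are decided in advance.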

Comparisons of judgment samples vs. probability samples.

A. Random Rectangles. Handout.

B. 1948 U.S. Presidential Election Polls

[Photographs: Harry Truman vs. Thomas Dewey]

During and before 1948, political pollsters favored a type of judgment sampling called quota sampling over probability sampling. In quota sampling, each interviewer is asked to find people with certain characteristics but is otherwise free to choose the people. For example, one interviewer in St. Louis was asked to interview 13 people, of whom
• exactly 6 were to live in the suburbs and 7 in the central city
• exactly 7 were to be men and 6 women

Of the 7 men (and there were similar quotas for the women):
• exactly 3 were to be under forty years old and 4 over forty
• exactly 1 was to be black and 6 white.

The monthly rentals to be paid by the 6 white men were specified also:
• 1 was to pay $44.01 or more
• 3 were to pay $18.01 to $44.00
• 2 were to pay $18.00 or less

The instructions are set up so that the overall sample matches the population on chosen characteristics.

                                   Population   Gallup Oct 14   Roper Oct 25
                                   estimate     sample          sample
Sex
  Male                             49.1         50.5            50.1
  Female                           50.9         49.5            49.9
Age
  21-34                            34.3         29.6            36.3
  35-49                            30.9         34.4            30.8
  50 and over                      34.8         35.9            32.9
Education (last school attended)
  Grade school or less             43.5         35.3            27.5
  High school                      43.4         46.8            48.8
  College                          13.0         17.9            23.7
Race
  White                            96.1         95.0            97.4
  Black                            3.9          5.0             2.6
Veteran status
  Male veteran                     14.2         13.3            18.3
  All others                       85.8         86.7            81.6
Number of respondents                           2972            3501

All three of the major polls predicted that Dewey would win decisively.

            Dewey   Truman   Thurmond   Wallace
Crossley    49.9    44.8     1.6        3.3
Gallup      49.5    44.5     2.0        4.0
Roper       52.2    37.1     5.2        4.3

The Chicago Tribune was so sure of Dewey’s victory it printed “Dewey Defeats Truman” as its headline for the next day.

But Truman won decisively:

Dewey   Truman   Thurmond   Wallace
45.1    49.5     2.4        2.4

What went wrong with quota sampling?

The final selection of respondents is left up to the subjective judgment of the interviewer rather than being determined objectively. The problem is unintentional bias on the part of the interviewers. Republicans tended to be more likely to own telephones, have permanent addresses and live on nicer blocks.

Within each broad demographic group, the Republicans are marginally easier to interview. In fact, in the years 1936 (when the Gallup poll started) through 1944, there was a bias in the poll toward the Republican candidate but not enough to change the predicted winner.

Year   Gallup final survey    Election result        Error in terms of Republican vote
1936   55.7% Roosevelt (D)    62.5% Roosevelt (D)    +6.8%
1940   52.0% Roosevelt (D)    55.0% Roosevelt (D)    +3.0%
1944   51.5% Roosevelt (D)    53.8% Roosevelt (D)    +2.3%
1948   49.5% Dewey (R)        49.5% Truman (D)       +4.4%

Probability sampling has been used since 1952 by the major polling organizations.

Year   Gallup final survey     Election result       Error in terms of Republican vote
1952   51.0% Eisenhower (R)    55.4% Eisenhower      -4.4%
1956   59.5% Eisenhower (R)    57.8% Eisenhower      +1.7%
1960   51.0% Kennedy (D)       50.1% Kennedy         -0.9%
1964   64.0% Johnson (D)       61.3% Johnson         -2.7%
1968   43.0% Nixon (R)         43.5% Nixon           -0.5%
1972   62.0% Nixon (R)         61.8% Nixon           +0.2%
1976   48.0% Carter (D)        50.0% Carter          +0.9%
1980   47.0% Reagan (R)        50.8% Reagan          -3.8%
1984   59.0% Reagan (R)        59.2% Reagan          -0.2%
1988   56.0% Bush (R)          53.0% Bush (R)        +3.0%
1992   49.0% Clinton (D)       43.3% Clinton (D)     -0.7% (see note 1)
1996   52.0% Clinton (D)       50.1% Clinton (D)     -0.4%
2000   48.0% Bush (R)          47.9% Bush (R)        +0.1%
2004   49.0% Bush (R)          51.0% Bush (R)        -2.0%

When probability sampling is used, no clear pattern emerges between Democratic and Republican vote percentages in the sample compared to the actual election. The errors involved are generally smaller under probability sampling than quota sampling. The Gallup poll reports that their average error in 24 national election polls prior to 1950, when they used quota sampling, was 2.3%, whereas their average error in 17 national election polls after 1950, when they have used probability sampling, has been only 1.5%. This improved accuracy is being accomplished with smaller sample sizes. The 1948 Gallup poll had a sample size of 3250, whereas most modern polls aim for 1500 or fewer respondents.

1 For 1992, most of the error in giving too high a vote share to Clinton was in giving too small a vote share to Perot rather than in giving too small a vote share to Bush.
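The pattern in the two Gallup tables above can be summarized numerically. The sketch below uses only the presidential-election rows from those tables and computes the mean signed error (a rough measure of bias toward the Republican candidate) and the mean absolute error. It does not reproduce the 2.3% and 1.5% averages quoted from Gallup, which cover a larger set of national election polls; the function name `summarize` is just an illustrative choice.

```python
from statistics import mean

# Error in the Republican vote share (percentage points), copied from the tables above.
quota_errors = {1936: 6.8, 1940: 3.0, 1944: 2.3, 1948: 4.4}
probability_errors = {
    1952: -4.4, 1956: 1.7, 1960: -0.9, 1964: -2.7, 1968: -0.5,
    1972: 0.2, 1976: 0.9, 1980: -3.8, 1984: -0.2, 1988: 3.0,
    1992: -0.7, 1996: -0.4, 2000: 0.1, 2004: -2.0,
}

def summarize(errors):
    """Return the mean signed error, mean absolute error, and count of positive errors."""
    values = list(errors.values())
    return {
        "mean_signed": mean(values),
        "mean_absolute": mean(abs(v) for v in values),
        "n_positive": sum(v > 0 for v in values),
        "n": len(values),
    }

quota_summary = summarize(quota_errors)
probability_summary = summarize(probability_errors)
print("quota:      ", quota_summary)
print("probability:", probability_summary)
```

In the quota years all four errors are positive, with a mean of about 4.1 points toward the Republican. In the probability years only 5 of 14 errors are positive, the mean signed error is about -0.7 points, and the mean absolute error is about 1.5 points.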

Quota sampling makes the sample representative of the population on the characteristics that are controlled but does not ensure representativeness on the uncontrolled characteristics. In probability sampling, the randomization and the law of large numbers help, for a large enough sample, to make the sample representative of the population on all factors, even ones that would be difficult to control (e.g., niceness of house).

In 1948, in Washington state, the Washington State Public Opinion Laboratory conducted two polls prior to the election, one using quota sampling and one using probability sampling.

           Actual Washington   Probability   Quota
           state vote          sample        sample
Dewey      42.7                46.0          52.0
Truman     52.6                50.5          45.3
Wallace    3.5                 2.9           2.5

The probability sample produced much more accurate results.

Key point: Probability sampling is generally a much more reliable sampling method than judgment sampling. The randomness in probability sampling tends to produce a sample that is representative of the population.
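The key point can be illustrated with a small simulation. This is a sketch under invented assumptions, not a model of the 1948 polls: the population, the 55%/35% Republican shares, and the interviewers' threefold preference for easy-to-reach people (`easy_weight`) are all hypothetical. The quota sample matches the population exactly on the controlled characteristic (sex) but not on the uncontrolled one (ease of reaching the person).

```python
import random

rng = random.Random(1948)

# Hypothetical population: (sex, easy_to_reach, votes_republican).
population = []
for _ in range(100_000):
    sex = rng.choice(["M", "F"])
    easy = rng.random() < 0.5
    p_rep = 0.55 if easy else 0.35  # assumed: easy-to-reach people lean Republican
    population.append((sex, easy, rng.random() < p_rep))

def rep_share(people):
    """Fraction of the given people who vote Republican."""
    return sum(p[2] for p in people) / len(people)

def quota_sample(pop, n_per_sex, easy_weight=3.0):
    """Meet the sex quota exactly, but favor easy-to-reach people within each sex.

    Draws are made with replacement for simplicity; with a population this
    large, repeated picks are rare.
    """
    chosen = []
    for sex in ("M", "F"):
        group = [p for p in pop if p[0] == sex]
        weights = [easy_weight if p[1] else 1.0 for p in group]
        chosen += rng.choices(group, weights=weights, k=n_per_sex)
    return chosen

truth = rep_share(population)
srs = rng.sample(population, 10_000)             # probability sample
quota = quota_sample(population, n_per_sex=5_000)  # judgment (quota) sample

srs_estimate = rep_share(srs)
quota_estimate = rep_share(quota)
print(f"truth {truth:.3f}  probability {srs_estimate:.3f}  quota {quota_estimate:.3f}")
```

Under these assumptions the quota sample is expected to overstate the Republican share by roughly five percentage points even though its sex breakdown is perfect, while the probability sample differs from the truth only by chance variation.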

In addition, as we will study, our uncertainty about our estimates from a probability sample can be accurately described and the sample size can be chosen to make this uncertainty as small as desired.
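As a preview of the kind of calculation this refers to, the sketch below uses the standard large-sample approximation for the margin of error of an estimated proportion under simple random sampling, z * sqrt(p(1 - p)/n). This formula is not derived in these notes; the 95% multiplier z = 1.96, the worst-case value p = 0.5, and the function names are assumptions made for illustration, and the finite population correction is ignored.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error for a proportion from a simple random sample of size n."""
    return z * math.sqrt(p * (1 - p) / n)

def sample_size_for(margin, p=0.5, z=1.96):
    """Smallest n whose approximate margin of error is at most `margin`."""
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)

moe_1500 = margin_of_error(1500)
n_for_3_points = sample_size_for(0.03)
print(f"n = 1500 gives a margin of about {100 * moe_1500:.1f} percentage points")
print(f"a margin of 3 points needs about n = {n_for_3_points}")
```

With these assumptions a sample of 1500 gives a margin of about 2.5 percentage points, and a margin of 3 points requires a little over 1000 respondents.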

III. Sources of Errors in Surveys

The goal of a sample survey is to obtain as accurate as possible an estimate of a quantity about the population from the sample.

The sources of error in estimates from a survey can be divided into two major groups:
1. Errors of nonobservation: arise because the sampled elements make up only part of the target population.
2. Errors of observation: arise when recorded data deviate from the truth.

Errors of nonobservation:

1. Sampling error: Error that arises because we have taken one sample rather than looked at the whole population. This is error that arises from the sample-to-sample variability of our estimates. A key point is that in probability sampling, the average magnitude of this error can be measured and the sampling error can be reduced by good survey designs and appropriate choice of sample size.

2. Coverage error: Error that arises when the sampling frame does not match up perfectly with the target population. For example, the sampling frame of all listed landline residential telephone numbers does not cover all households because some households have unlisted numbers and some households only have a cell phone.

3. Nonresponse: Nonresponse arises in one of three ways:

(i) The inability to contact the sampled unit (person or household, for example). An interviewer must not substitute a next-door neighbor, who just happens to be home at 3 p.m., for the person actually selected for the sample. This type of substitution might lead to a survey that is biased because too many families with children or too many retired people or too many people who work nights are being interviewed.

(ii) The inability of the person responding to come up with the answer to the question. A survey on the economic impact of businesses on a community would be seriously biased if a few of the larger businesses do not know how much they spend on transportation.

(iii) Refusal to answer. People who are more interested or have stronger feelings about an issue are more likely to answer, often biasing the results.

We will study how to try to reduce nonresponse and how to deal with remaining nonresponse in the analysis in Chapter 8.

Example of a survey with undercoverage and nonresponse: The Literary Digest poll.

Literary Digest was a popular magazine in the early 20th century that was well known for its poll, which had correctly predicted the winner of every presidential election from its inception in 1916 through 1932. In 1936, the poll consisted of sending postcard questionnaires to 10 million people selected from subscribers to the Digest, telephone directories and automobile owners. The 2,376,523 returned postcards showed the Republican Alf Landon a winner over the Democrat Franklin Delano Roosevelt by 57% to 43%.

[Figure: Literary Digest headline, “Landon, 1,293,669; Roosevelt, 972,897: Final Returns in the Digest’s Poll of Ten Million Voters”]

But Roosevelt won in a landslide, 63% to 37%.

The Literary Digest poll suffered from problems of coverage error and nonresponse:

(i) Coverage error. The sampling frame overrepresented higher income people who were more likely to be Republican. The readers of the Literary Digest tended to have above average incomes. Furthermore, in the 1930s, owning a car or telephone was a luxury (about 40% of households had residential phones).

(ii) Nonresponse. The anti-Roosevelt voters felt more strongly than the pro-Roosevelt voters and were more likely to respond. Surveys that use voluntary responses to mailed questionnaires are often severely biased.
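The combined effect of these two problems can be sketched with a simulation. Every number here is invented for illustration (the 62% true Roosevelt share, the frame coverage rates, and the response rates); none comes from the 1936 data. The point is only that a very large voluntary-response mailing to a skewed frame can be far less accurate than a small random sample.

```python
import random

rng = random.Random(1936)

# Hypothetical electorate: (votes_roosevelt, in_frame, would_respond).
population = []
for _ in range(200_000):
    roosevelt = rng.random() < 0.62
    # Assumed coverage error: Landon voters are more likely to be on the mailing list.
    in_frame = rng.random() < (0.30 if roosevelt else 0.60)
    # Assumed nonresponse pattern: Landon voters are more likely to mail back a card.
    would_respond = rng.random() < (0.20 if roosevelt else 0.30)
    population.append((roosevelt, in_frame, would_respond))

def roosevelt_share(people):
    """Fraction of the given people who vote for Roosevelt."""
    return sum(p[0] for p in people) / len(people)

truth = roosevelt_share(population)

# Mail poll: everyone in the frame is sent a card, only some return it.
respondents = [p for p in population if p[1] and p[2]]
mail_estimate = roosevelt_share(respondents)

# Small simple random sample of the whole electorate with full response.
srs = rng.sample(population, 5_000)
srs_estimate = roosevelt_share(srs)

print(f"truth {truth:.3f}")
print(f"mail poll ({len(respondents)} cards) {mail_estimate:.3f}")
print(f"random sample (5000 people) {srs_estimate:.3f}")
```

Under these assumptions the mail poll has several times as many respondents as the random sample, yet it puts Roosevelt well below 50%, while the random sample lands close to the true share.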

George Gallup made his name by correctly predicting, from a poll of 5000 people, that Roosevelt would win. He also correctly predicted the results of the Literary Digest poll by taking a random sample from a population chosen to match that of the Literary Digest poll.

Errors of observation

Once a person (or object) is in place and ready to be “measured,” there are still more errors that can creep into the survey.

Interviewers have a direct and dramatic effect on the way a person responds to a question.
• Reading a question with inappropriate emphasis or intonation can force a response in one direction or another. Most people who agree to an interview do not want to appear disagreeable and will tend to side with the view apparently favored by the interviewer, especially on questions for which the respondent does not have a strong opinion.
• The race or gender of the interviewer may affect whether the respondent answers a question truthfully. For example, Schuman and Converse (1971) employed both white and African American interviewers to interview African American residents of Detroit. To the question, “Do you personally feel that you can trust most white people, some white people, or none at all?” the response of 35% of those interviewed by a white person was that they could trust most white people. The percentage was 7% for those interviewed by an African American person.

Respondents differ greatly in motivation to answer correctly and ability to do so.

• Each respondent must understand the entire question and be clear about the options for the answer. This means questions must be clearly phrased and the questionnaire should not be too long.

“Twenty two percent of Americans Say Holocaust May Not Have Occurred” Akron Beacon Journal, 1993, based on an Associated Press story.

The headline was based on the following Roper poll question: “Does it seem possible or does it seem impossible to you that the Nazi extermination of the Jews never happened?”

22 percent said “it seemed possible.” Another 12% said that they did not know. Only 65% said it was “impossible it never happened.” But the double negative in the question may have confused some respondents.

The Gallup organization conducted a follow-up poll. “The term Holocaust usually refers to the killing of Jews in Nazi death camps during World War II. In your opinion, did the Holocaust: definitely happen, probably happen, probably not happen, or definitely did not happen.”

In this poll, 83% said the Holocaust definitely happened and another 13% said it probably happened. Only 1% said it definitely did not happen.

We will study more about questionnaire design later in the course.

• Errors due to the respondent can be broadly classified into four categories:
1) Recall bias (respondent simply does not remember correctly)
2) Prestige bias (respondent exaggerates a little on income or athletic success)
3) Intentional deception (the respondent will not admit breaking a law)
4) Incorrect measurement (the respondent did not understand the units and reported feet instead of inches or did not understand the definition of children and reported grandchildren as well).

• We will study questionnaire design later in the course and also discuss more about how to try to minimize the various errors on surveys.

Another broad classification of errors on surveys besides errors of nonobservation vs. errors of observation is sampling error vs. nonsampling error (all errors other than sampling error).

We will focus on methods to control sampling error for most of the course and return to nonsampling error later.

Next topic: Simple random sampling (Chapter 2 of Lohr).
