
Appendix III

Limitations of the Data

Introduction. The data presented in this Statistical Abstract came from many sources. The sources include not only Federal statistical bureaus and other organizations that collect and issue statistics as their principal activity, but also governmental administrative and regulatory agencies, private research bodies, trade associations, insurance companies, health associations, and private organizations such as the National Education Association and philanthropic foundations. Consequently, the data vary considerably as to reference periods, definitions of terms and, for ongoing series, the number and frequency of time periods for which data are available.

The statistics presented were obtained and tabulated by various means. Some statistics are based on complete enumerations or censuses while others are based on samples. Some information is extracted from records kept for administrative or regulatory purposes (school enrollment, hospital records, securities registration, financial accounts, social security records, income tax returns, etc.), while other information is obtained explicitly for statistical purposes through interviews or by mail. The estimation procedures used vary from highly sophisticated scientific techniques to crude "informed guesses."

Each set of data relates to a group of individuals or units of interest referred to as the target universe or target population, or simply as the universe or population. Prior to data collection the target universe should be clearly defined. For example, if data are to be collected for the universe of households in the United States, it is necessary to define a "household." The target universe may not be completely tractable. Cost and other considerations may restrict data collection to a survey universe based on some available list, which may be inaccurate and out of date. This list is called a survey frame or sampling frame.

The data in many tables are based on data obtained for all population units, a census, or on data obtained for only a portion, or sample, of the population units. When the data presented are based on a sample, the sample is usually a scientifically selected probability sample. This is a sample selected from a list or sampling frame in such a way that every possible sample has a known chance of selection and usually each unit selected can be assigned a number, greater than zero and less than or equal to one, representing its likelihood or probability of selection.

For large-scale sample surveys, the probability sample of units is often selected as a multistage sample. The first stage of a multistage sample is the selection of a probability sample of large groups of population members, referred to as primary sampling units (PSU’s). For example, in a national multistage household sample, PSU’s are often counties or groups of counties. The second stage of a multistage sample is the selection, within each PSU selected at the first stage, of smaller groups of population units, referred to as secondary sampling units. In subsequent stages of selection, smaller and smaller nested groups are chosen until the ultimate sample of population units is obtained. To qualify a multistage sample as a probability sample, all stages of sampling must be carried out using probability sampling methods.

Prior to selection at each stage of a multistage (or a single stage) sample, a list of the sampling units or sampling frame for that stage must be obtained. For example, for the first stage of selection of a national household sample, a list of the counties and county groups that form the PSU’s must be obtained. For the final stage of selection, lists of households, and sometimes persons within the households, have to be compiled in the field. For surveys of economic entities and for the economic census, the Census Bureau generally uses a frame constructed from the Bureau’s Standard Statistical Establishment List (SSEL). The SSEL contains all establishments with payroll in the United States, including small single establishment firms as well as large multi-establishment firms.

Wherever the quantities in a table refer to an entire universe, but are constructed from data collected in a sample survey, the table quantities are referred to as sample estimates. In constructing a sample estimate, an attempt is made to come as close as is feasible to the corresponding universe quantity that would be obtained from a complete census of the universe. Estimates based on a sample will, however, generally differ from the hypothetical census figures. Two classifications of errors are associated with estimates based on sample surveys: (1) sampling error, the error arising from the use of a sample, rather than a census, to estimate population quantities, and (2) nonsampling error, those errors arising from nonsampling sources. As discussed below, the magnitude of the sampling error for an estimate can usually be estimated from the sample data. However, the magnitude of the nonsampling error for an estimate can rarely be estimated. Consequently, actual error in an estimate exceeds the error that can be estimated.

The particular sample used in a survey is only one of a large number of possible samples of the same size which could have been selected using the same sampling procedure. Estimates derived from the different samples would, in general, differ from each other. The standard error (SE) is a measure of the variation among the estimates derived from all possible samples. The standard error is the most commonly used measure of the sampling error of an estimate. Valid estimates of the standard errors of survey estimates can usually be calculated from the data collected in a probability sample. For convenience, the standard error is sometimes expressed as a percent of the estimate and is called the relative standard error or coefficient of variation (CV). For example, an estimate of 200 units with an estimated standard error of 10 units has an estimated CV of 5 percent.

A sample estimate and an estimate of its standard error or CV can be used to construct interval estimates that have a prescribed confidence that the interval includes the average of the estimates derived from all possible samples with a known probability. To illustrate, if all possible samples were selected under essentially the same general conditions, and using the same sample design, and if an estimate and its estimated standard error were calculated from each sample, then: (1) approximately 68 percent of the intervals from one standard error below the estimate to one standard error above the estimate would include the average estimate derived from all possible samples; (2) approximately 90 percent of the intervals from 1.6 standard errors below the estimate to 1.6 standard errors above the estimate would include the average estimate derived from all possible samples; and (3) approximately 95 percent of the intervals from two standard errors below the estimate to two standard errors above the estimate would include the average estimate derived from all possible samples.

Thus, for a particular sample, one can say with the appropriate level of confidence (e.g., 90 percent or 95 percent) that the average of all possible samples is included in the constructed interval. For example, for an estimate of 200 units with a standard error of 10 units, an approximately 90 percent confidence interval (plus or minus 1.6 standard errors) is from 184 to 216.

All surveys and censuses are subject to nonsampling errors. Nonsampling errors are of two kinds: random and nonrandom. Random nonsampling errors arise because of the varying interpretation of questions (by respondents or interviewers) and varying actions of coders, keyers, and other processors. Some randomness is also introduced when respondents must estimate values. These same errors usually have a nonrandom component. Nonrandom nonsampling errors result from total nonresponse (no usable data obtained for a sampled unit), partial or item nonresponse (only a portion of a response may be usable), inability or unwillingness on the part of respondents to provide correct information, difficulty interpreting questions, mistakes in recording or keying data, errors of collection or processing, and coverage problems (overcoverage and undercoverage of the target universe). Random nonresponse errors usually, but not always, result in an understatement of sampling errors and thus an overstatement of the precision of survey estimates. Estimating the magnitude of nonsampling errors would require special experiments or access to independent data and, consequently, the magnitudes are seldom available.

Nearly all types of nonsampling errors that affect surveys also occur in complete censuses. Since surveys can be conducted on a smaller scale than censuses, nonsampling errors can presumably be controlled more tightly. Relatively more funds and effort can perhaps be expended toward eliciting responses, detecting and correcting response error, and reducing processing errors. As a result, survey results can sometimes be more accurate than census results.

To compensate for suspected nonrandom errors, adjustments of the sample estimates are often made. For example, adjustments are frequently made for nonresponse, both total and partial. Adjustments made for either type of nonresponse are often referred to as imputations. Imputation for total nonresponse is usually made by substituting for the questionnaire responses of the nonrespondents the "average" questionnaire responses of the respondents. These imputations usually are made separately within various groups of sample members, formed by attempting to place respondents and nonrespondents together that have "similar" design or ancillary characteristics. Imputation for item nonresponse is usually made by substituting for a missing item the response to that item of a respondent having
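
The coefficient of variation and confidence interval arithmetic described in this appendix can be sketched in a few lines of Python. This is only an illustration: the names `SE_MULTIPLIERS`, `coefficient_of_variation`, and `confidence_interval` are hypothetical, and the multipliers 1, 1.6, and 2 are the approximate values quoted in the text rather than exact normal quantiles.

```python
# Approximate standard-error multipliers quoted in the text for each
# confidence level (in percent). Names here are illustrative assumptions.
SE_MULTIPLIERS = {68: 1.0, 90: 1.6, 95: 2.0}


def coefficient_of_variation(estimate, standard_error):
    """Return the standard error expressed as a percent of the estimate."""
    return 100.0 * standard_error / estimate


def confidence_interval(estimate, standard_error, level=90):
    """Return (lower, upper) bounds of an approximate confidence interval."""
    half_width = SE_MULTIPLIERS[level] * standard_error
    return estimate - half_width, estimate + half_width


if __name__ == "__main__":
    # The worked example from the text: estimate 200, standard error 10.
    print(coefficient_of_variation(200, 10))       # 5.0
    print(confidence_interval(200, 10, level=90))  # (184.0, 216.0)
```

Such an interval reflects sampling error only; nonsampling error is not captured by the standard error.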
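
The idea of imputing for total nonresponse by substituting the "average" response of respondents within groups of similar sample members can be illustrated with a minimal sketch. The record layout (a `group` key and a `value` that is `None` for a nonrespondent) and the function name `impute_group_means` are assumptions made for this example; actual survey imputation procedures are more elaborate and differ from survey to survey.

```python
from collections import defaultdict


def impute_group_means(records):
    """Fill missing values with the mean of respondents in the same group.

    `records` is a list of dicts with the hypothetical keys "group" and
    "value"; a value of None marks a nonrespondent. Groups that contain no
    respondents are left unchanged because no average is available.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for rec in records:
        if rec["value"] is not None:
            totals[rec["group"]] += rec["value"]
            counts[rec["group"]] += 1

    imputed = []
    for rec in records:
        new_rec = dict(rec)  # copy so the input records are not modified
        if rec["value"] is None and counts[rec["group"]] > 0:
            new_rec["value"] = totals[rec["group"]] / counts[rec["group"]]
            new_rec["imputed"] = True
        else:
            new_rec["imputed"] = False
        imputed.append(new_rec)
    return imputed


if __name__ == "__main__":
    sample = [
        {"group": "A", "value": 10.0},
        {"group": "A", "value": 14.0},
        {"group": "A", "value": None},  # nonrespondent in group A
        {"group": "B", "value": 7.0},
        {"group": "B", "value": None},  # nonrespondent in group B
    ]
    for row in impute_group_means(sample):
        print(row)
```

Flagging imputed records, as the `imputed` key does here, is a design choice of this sketch that keeps reported and substituted values distinguishable in later analysis.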