Survey Data Collection Using Complex Automated Questionnaires
Total Page:16
File Type:pdf, Size:1020Kb
From: AAAI Technical Report SS-03-04. Compilation copyright © 2003, AAAI (www.aaai.org). All rights reserved. Survey Data Collection Using Complex Automated Questionnaires William P. Mockovak, Ph.D. Bureau of Labor Statistics Office of Survey Methods Research 2 Massachusetts Ave. N.E. Washington, DC 20212 [email protected] Abstract Since the early 1980s, researchers in the federal statistical com- The CPS is conducted by a workforce of about 1,500 inter- munity have been moving complex surveys from paper question- viewers scattered across the U.S. who conduct interviews naires to computer-assisted interviewing (CAI). The data col- in respondents’ homes and, occasionally, on doorsteps, lected through such surveys cover a variety of needs, from critical porches, lawns, etc., or over the telephone from either an information on the health of the economy to social issues as interviewer’s home or from a centralized telephone inter- measured by statistics on health, crime, expenditures, and educa- viewing facility. Although the composition of the inter- tion. This paper covers some of the key issues involved in devel- viewer workforce has undergone some changes in recent oping applications used primarily by a middle-age, part-time years with the increased hiring of retirees, historically, the workforce, which uses the software in a variety of situations, majority of interviewers have been women between the while interacting with an occasionally uncooperative public. ages of 45 and 55, who work part-time, and who lack many of the computer skills taken for granted in younger popula- Introduction tions. Many interviewers tend to work on a single, primary survey, but are available to work on others (some continu- Government surveys provide critical information on a vari- ous, some one-time). Depending on the survey, annual ety of topics of importance to the well-being of the U.S., rates of attrition or interviewer turnover can range between including data about consumer expenditures, health, crime, 18 and 31 percent (Mockovak, 1981), so recruitment and education, and employment – just to name a few. training are ongoing activities. Compared to other disciplines, the history of survey re- To collect survey data, interviewers use a computer- search is relatively new, with most major advances occur- assisted interviewing (CAI) program, called an instrument, ring since the 1930s (Brick, 2002). Although desired in- which is loaded on a laptop computer (PCs are used in formation is sometimes available from existing administra- centralized phone centers). This application displays the tive records or through simple mail surveys, more detailed question that the interviewer should ask, the valid answers information can only be obtained through direct interviews (or answer types) for the question, special instructions to with representative samples of Americans. For example, the interviewer, and any associated help information. In one of the better known, largest, and longest continuing addition, case management software exists to help the in- surveys in the United States is the Current Population Sur- terviewer manage her work assignment, receive and trans- vey or CPS. The CPS is a monthly sample survey of about mit cases to headquarters, provide security, and other use- 60,000 households that has been conducted since 1940, and ful functions (e.g., data backup). Although a keyboard is which provides information on labor force activity. This most frequently used, some applications also allow an in- survey is the source of the monthly national unemployment terviewer to enter survey responses using a mouse or rate, a key economic indicator. In addition, the CPS pro- pointing device, or some combination of the preceding. In vides rich demographic detail that can be used to analyze past, as well as in many existing instruments, navigation is the U.S. labor force, including age, sex, race, Hispanic ori- accomplished most often using function keys (F keys) to gin, educational attainment, marital status and family at- allow specific actions such as backing up or jumping to tributes, foreign-born status, veteran status, and other another section or to obtain help. However, Windows- demographic characteristics. based applications are allowing increased use of the mouse and tab and arrow keys, along with the use of graphical Copyright © 2002, American Association for Artificial Intelligence features to accomplish the same goals. (www.aaai.org). All rights reserved. The opinions expressed in this paper are solely those of the author and do not reflect official positions of the Bureau of Labor Statistics. Computer Assisted Telephone Interviewing, or CATI, was first conducted by Chilton Research Services in 1971 (Couper and Nicholls, 1998). This technology involves calling respondents from a centralized telephone inter- viewing facility. The purported benefits of CATI com- ning around, when the respondent is doing other tasks, pared to paper included the management of very large tele- etc.), follow proper interviewing procedures, and take as phone samples, randomization of question and response little time as possible, since many surveys are longitudinal category order, on-line and automatic arithmetic, access to and require multiple visits. on-line databases, increased timeliness of data, and reduced costs. Although respondents have used computers to complete survey questionnaires without the assistance of an inter- Computer Assisted Personal Interviewing, or CAPI, was viewer, for example, self-administered questionnaires us- first introduced in government surveys in the early 1980s ing touchtone phones or voice recognition (Clayton and to replace paper questionnaires used in face-to-face inter- Harrell, 1989; Appel and Cole, 1994), these types of appli- views, where interviewers traveled to a respondent’s home. cations tend to be short and simple with very minimal Its initial objectives were to reduce post-interview data branching. Similarly, although a great deal of excitement processing, to use data from previous interviews to speed and experimentation exists about using the Internet for sur- up and improve the quality of current interviews, to allow vey data collection, for the most part, these applications more complex questionnaire routing and tailoring of ques- have also focused on simple, self-completed question- tions (for example, the use of branching logic to ask tai- naires, and obtaining a representative, scientific sample lored questions of respondents with selected characteris- using this mode remains a major concern. In addition, tics), to allow “last minute” changes to questionnaires, and problems with respondent cooperation that are so common to produce statistics more quickly. Except for the ability to in interviewer-mediated data collection are even more pro- incorporate last-minute changes, these objectives have nounced in surveys done over the web. been widely achieved. It was also hoped that lower costs and higher quality data would result, but these objectives Accordingly, the focus of this paper is on the use of com- have been more elusive. plex computer-assisted interview instruments for inter- viewer-mediated data collection. These interviews can be In addition to CATI and CAPI, the acronym CASIC (Com- conducted either face-to-face or over the telephone. The puter Assisted Survey Information Collection) is also used key challenge facing the survey methodologist is designing frequently in survey research to describe any effort to use an automated interviewing tool that does not make the in- computers for survey data collection, data capture, data terviewer’s already very challenging job even more diffi- preparation, and any activities that support these tasks cult. As mentioned previously, an interviewer must inter- (Couper and Nicholls, 1998). Recent CASIC efforts have act with the computer-assisted interviewing program, while focused on use of the Internet for survey data collection. interacting with the respondent at the same time. As one might expect, not all respondents readily agree to an inter- From 1988 to 1994, the use of computer-assisted inter- view, and when they do agree to cooperate, many fre- viewing grew rapidly, to the point where now most face-to- quently want to give information in a different order or face government surveys use it to collect data. Early CAPI form from that required by the automated questionnaire. applications often placed tremendous demands on inter- These constraints pose unusual demands on the interviewer viewers. For example, the first CAPI survey required in- and the software. The interviewer must negotiate with oc- terviewers to carry 25 pound computer terminals into re- casionally hostile respondents and apply sound interview- spondents’ homes, connect them to a telephone line, and ing techniques, while interacting with the software used to dial in to a connection on a mainframe (Rothschild and collect the survey data. Wilson, 1988). Modern-day applications use light-weight laptop computers, pad computers, or PDAs, although Instrument and Interface hardware issues like screen size, screen readability, and Design Challenges battery life continue to pose important operational prob- lems, especially for PDAs. The challenges facing CAI instrument designers are the same facing any large-scale software development effort. The size and complexity of CAI instruments can vary Even if a paper questionnaire already exists, data require- widely. For some surveys, the CAI program can be quite