
67th Annual Conference

Evaluating New Frontiers in Public Opinion and Social Research

Conference Program May 17–20, 2012 JW Marriott Orlando Grande Lakes • Orlando, Florida

www.aapor.org


Thursday, May 17, 2012 1:30 p.m. - 3:00 p.m. Concurrent Session A

New Frontiers: Interactive and Gaming Techniques to Improve Surveys

Interactive and Gaming Techniques to Improve Surveys Elizabeth Dean, RTI International ([email protected]); Adam Sage, RTI International ([email protected]); Jennie W. Lai, Nielsen ([email protected]); Michael Link, Nielsen ([email protected]); Ashley Richards, RTI International ([email protected]); Lorelle Vanno, Nielsen ([email protected]); Jeffrey Henning, Affinova, Inc. ([email protected])

The mobile, digitally networked era has expanded communication tools and styles. Technological augmentation of social networks enables screening calls, blocking contacts, and avoiding face-to-face interaction. These technologies also facilitate active and passive sharing of massive amounts of personal information while building virtual communities to meet real social needs. Online communication styles signify the arrival of a model for surveying very different from the two-way “conversation with a purpose.” One norm of online communication is interactivity. Facebook’s model of status updates, user comments, “likes,” and live feeds of social conversation can now be considered a norm for online communication. A second norm of online communication is the game. People compete in online games to earn rewards, status, and achievement, but also to express themselves, build community, and experience altruism. “Gamification” makes routine experiences more engaging by providing points, badges, and status for behaviors like checking in to venues (Foursquare) and buying coffee (My Starbucks Rewards). Social games like Farmville enable users to obtain these rewards and connect with others by linking to Facebook and playing within existing social networks. These new styles of communication necessitate an examination of survey research norms. Market researchers have adopted interactive techniques, but survey methodologists have yet to systematically assess them. These papers evaluate survey techniques harnessing interactivity and game elements to improve surveys. Research in this session includes: an iPhone survey that tests gamification and social sharing features against a standard mobile survey; applications of game design elements like role-playing and arbitrary rules to questionnaire design; Facebook applications for recruiting survey respondents, panel maintenance, and engaging study participants; and a test of the randomized response technique in a controlled virtual world environment. As a whole, this session will assess the impact of gamification and social interactivity on respondent engagement and survey data quality.

Abstract #1: The growing usage of smartphone applications (or “apps”), particularly among young adults, has opened a new frontier for data collection. This emerging method of Computer-Assisted Self-Interviewing (CASI) offers new techniques to engage respondents on the mobile platform in response to the persistent challenge of respondent cooperation. Game mechanics have been integrated with smartphone apps in recent years to draw on users’ intrinsic motivation to engage in a task. Tools such as points, badges, levels, challenges, and leaderboards are used to motivate desired behaviors (i.e., “gamifying” the process without necessarily turning the task completely into a “game”). Moreover, “social sharing” on networks such as Facebook is a defining attribute for today’s youth and a critical feature of some of the most successful apps. Social sharing mechanics such as commenting, posting updates, or “liking” the status of others are engaging features that connect users within the app community and social networks such as Facebook. Leveraging both game and social mechanics for mobile app research can maximize respondent engagement for longitudinal data collection. To measure these emerging techniques for engagement, Nielsen will conduct a split-sample experiment contrasting two versions of the iPhone app used to collect media usage information. One version of the app will be fully integrated with game and social mechanics, while the other version will launch without these features and then add the game and social mechanics in phases. This experiment is expected to yield lessons about the effectiveness of these emerging techniques for respondent engagement and to demonstrate whether data collection via smartphone app is a viable method for repeated measures of hard-to-reach younger cohorts.

Abstract #2: This research explores the utility of Facebook applications as survey and passive data collection platforms. Facebook applications are user-facing tools that enhance the user experience through social games, quizzes, and other interactive and social features. Facebook’s Graph Application Programming Interface (API) offers researchers the tools necessary to develop applications that can access both public and private data (provided access is permitted), which in turn offer data collection capabilities survey researchers are only beginning to understand. Traditional survey questionnaires constrain not only the type of data that can be collected, but also the volume, accuracy, and timeliness of the data and the data collection process. Facebook’s Graph API is revolutionizing the way we conceptualize the word “data,” which has major implications across a variety of dimensions directly related to the field of survey research. For instance, Facebook offers the capability to stream data in real time, drawing from a user base of over 800 million, and in forms that are both new (e.g., location check-ins, social networks, and status updates) and old (e.g., demographic data). Facebook applications offer researchers an opportunity to develop unique approaches to address research questions. These applications also provide a platform for questionnaire administration, data creation, and passive data collection in real time. This paper explores the applicability of Facebook applications, such as social gaming and user-experience-enhancing applications, to survey research. Results from a pilot study that used a Facebook application to engage the social networks of military personnel to build registries will be used to explore the potential uses of such applications. Specifically, this research aims to provide a better understanding of the new types of data being developed, Facebook applications as a mode of questionnaire administration and participant recruitment, implications for sample development, and limitations to be addressed going forward.

Abstract #4: The Randomized Response Technique (RRT) is used to encourage accurate responding to sensitive survey questions. When using the RRT, respondents are given two questions (one sensitive and the other nonsensitive with a known response distribution) and are instructed to answer one of them. The question to be answered is determined by the outcome of a random act with a known probability (e.g., a coin toss) that only the respondent sees. Researchers do not know which question each respondent answered, but are able to calculate proportions for each response to the sensitive question. Though it is designed to reduce error, the RRT may actually increase measurement error if respondents implement it incorrectly. Evaluating the RRT is challenging because the outcome of its driving feature, the randomizer, is concealed from researchers. As a result, prior research has typically assumed that higher reporting of undesirable responses signals the RRT’s success. Eight RRT items were evaluated in a non-probability survey of 75 participants in the online virtual world Second Life (SL). Participants were randomly assigned to one of three modes: face-to-face interview in SL, voice chat interview in SL, or web. The randomizer across all modes was an interactive, three-dimensional virtual coin toss that was discreetly manipulated by the researchers in order to determine with near certainty whether participants followed the procedure. Only 67% of participants followed the procedure for every RRT item. The greatest rate of procedural noncompliance on a single item was 13%. In a true application of the RRT, such noncompliance would result in greatly inflated estimates. There were no significant differences in RRT compliance by demographic characteristics or survey mode. Most participants indicated in debriefing questions that they enjoyed this method of answering questions, but their noncompliance is cause for additional skepticism about using the RRT.
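
For readers unfamiliar with how prevalence is recovered under the RRT, the following is the standard unrelated-question estimator, shown only as a hedged illustration and not necessarily the exact variant used in this study. If each respondent answers the sensitive question with known probability $p$ (the randomizer outcome, e.g., the virtual coin landing heads) and otherwise answers the nonsensitive question, whose known population rate of "yes" answers is $\pi_u$, then the expected proportion of observed "yes" answers is $\lambda = p\,\pi_s + (1-p)\,\pi_u$, so the prevalence of the sensitive attribute $\pi_s$ is estimated by $\hat{\pi}_s = \big(\hat{\lambda} - (1-p)\,\pi_u\big)/p$, where $\hat{\lambda}$ is the sample proportion of "yes" answers. If respondents ignore the randomizer, as the noncompliance documented above suggests, $\hat{\lambda}$ no longer satisfies this relationship and $\hat{\pi}_s$ is biased.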

Advances in Survey Sampling and Weighting

Probability-Based Sampling Using Split-Frames with Listed Households Mary E. Losch, University Center Social & Behavioral Research & Dept. of Psychology ([email protected]); Mansour Fahimi, Marketing Systems Group ([email protected])

Random Digit Dialing (RDD) sampling methodology for targeting rare subgroups requires extensive field resources for screening purposes. On the other hand, sole use of targeted lists can seriously undermine the integrity of the employed sample as meaningful selection probabilities cannot be calculated from incomplete frames. However, augmenting a listed frame with the remaining households in the geography of interest can create a complete frame from which probability-based samples could be selected. As such, by relying on a disproportionate stratified sampling approach, efficient samples can be produced that will significantly reduce screening costs. The authors will discuss results from one such survey conducted to study the reproductive health of women 18 to 30 years of age in a Midwestern state. In addition to presenting sampling design details, respondent characteristics, key outcome measures, and various yield rates will be compared for different sub-frames.
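
As a hedged illustration of the design logic (notation ours, not the authors'): treat the listed households as stratum 1 and the remaining households in the geography of interest as stratum 2, with frame counts $N_1$ and $N_2$. Selecting $n_h$ households from stratum $h$ gives inclusion probability $\pi_h = n_h/N_h$ and base weight $w_h = N_h/n_h$, so every household on the completed frame has a known, nonzero selection probability. Oversampling the listed stratum ($n_1/N_1 > n_2/N_2$) concentrates interviews where the rare subgroup is most prevalent and reduces screening, while the weights keep the resulting estimates design-unbiased.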

Surveying Katrina Survivors – Challenges and Solutions Karol Krotki, RTI International ([email protected]); Darryl Creel, RTI International ([email protected]); Joseph McMichael, RTI International ([email protected]); Marjorie Hinsdale-Shouse, RTI International ([email protected])

For the Children’s Health after the Storms (CHATS) Feasibility Study, which is an environmental and epidemiologic study of Gulf Coast children, RTI International designed and implemented a sample survey of the temporarily displaced population after Hurricanes Katrina and Rita in 2005. A dual sample design was needed to meet the primary research objective, which was to determine if there is an association between prior occupancy in temporary housing units and adverse health effects among children who lived in the storm-affected areas. An exposed sample and an unexposed sample were developed separately using unique sampling strategies. This paper documents the challenges encountered in developing a robust sample design that facilitates the linking of these two disparate sample sources within a highly mobile population. The initial sample, which was generated from FEMA-supplied lists of temporary housing units, was the exposed cohort. Maps were used to understand the geographical distribution of the population as well as its mobility patterns. Maps were also used to identify the urban/rural nature of the population and to inform the sample design. Extensive tracing was needed to locate the individuals and the sample design had to react to the results of the tracing operation as we identified new post-hurricane move patterns from 2008 to 2011. For the control (unexposed) sample, a dynamic approach to sample selection was developed to match the treatment (exposed) sample of children with other children in close geographic proximity who never lived in the temporary housing units after the hurricanes. This paper will also document the extent to which field operations were successful in locating the exposed sample and the steps that had to be taken to match the unexposed sample as the exposed sample was located.

To Weight, or Not to Weight, That is the Question: Survey Weights and Multivariate Analysis Rebekah Young, The Pennsylvania State University ([email protected]); David R. Johnson, The Pennsylvania State University ([email protected])

There is widespread consensus that using survey weights is necessary for descriptive inference (i.e., percentages, means) if the findings are to be generalized to the population from which the sample was drawn. There is less agreement on when and whether weights should be used with multivariate methods, such as linear or logistic regression analysis (Winship & Radbill, 1994; Gelman, 2007). If the sample selection and nonresponse are nonignorable, it is necessary to incorporate survey design features into the estimation of linear regression coefficients (Kott, 2007). Weighted regression, however, can produce inefficient standard errors of the estimates. Rather than use weighted regression, one alternative is to use a model-based approach that includes variables in a regression model that reflect sample design features (Gelman, 2007). In this paper, we use a series of simulation models based on the 2005 Current Population Survey to explore how sensitive a model-based strategy is to misspecification. A practical concern of a model-based strategy is that it requires an analyst to replicate many of the tasks of a survey statistician; when done incorrectly or with incomplete information, a model-based strategy could be risky. Our results show that, under some circumstances, the smaller standard errors from a model-based strategy may come at the cost of biased b-coefficients, so an unweighted approach should be used cautiously. References: Gelman, Andrew. 2007. “Struggles with Survey Weights and Regression Modeling.” Statistical Science 22:153-164. Kott, Phillip S. 2007. “Clarifying some Issues in the Regression Analysis of Survey Data.” Survey Research Methods 1:11-18. Winship, Christopher and Larry Radbill. 1994. “Sampling Weights and Regression Analysis.” Sociological Methods & Research 23:230-257.
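
For reference, the two estimators being compared take the standard forms below (a sketch of the general setup, not the authors' specific simulation models). The design-weighted linear regression estimator is $\hat{\beta}_w = (X^{\top} W X)^{-1} X^{\top} W y$, with $W = \mathrm{diag}(w_1, \ldots, w_n)$ holding the survey weights, while the unweighted, model-based alternative sets $W = I$ and instead adds variables reflecting the sample design (strata, selection, and nonresponse predictors) to the columns of $X$. The question examined here is how badly the unweighted coefficients are biased when those added design variables are misspecified.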

Item-Specific Weights: A Proposal Hee-Choon Shin, NORC at the University of Chicago ([email protected]); Jibum Kim, NORC at the University of Chicago ([email protected]); Fang Wang, NORC at the University of Chicago ([email protected])

Most surveys are designed to be multipurpose, collecting numerous survey variables, sometimes in the hundreds. In the presence of nonresponse, the common approach to improving the estimates is to identify a key subset of items and impute missing values for survey respondents, and to adjust the survey weights of respondents to compensate for survey nonrespondents. In practice, the final weights in major surveys reflect adjustment only for unit nonresponse and selected item nonresponse, but do not account for all item nonresponse. However, our ultimate research interests are in the estimation of specific items, or combinations of items, for which missing-data imputation is not carried out. Therefore, the final weights should reflect the effects of item nonresponse, in which information on particular items is missing, in addition to the effect of unit nonresponse. Current practice for handling item nonresponse is either to impute the missing values or to ignore them. In contrast to current practice, we will propose a simple and novel approach that produces a separate weight for each item. We will focus on nonresponse adjustment for each survey item. As a byproduct, our proposal will enhance the quality of outcome rates for surveys (AAPOR, Standard Definitions). After specifying our approach, we will present a demonstration with simulated data.
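
The abstract does not specify the adjustment, but one generic form such an item-level weight could take (purely illustrative, not the authors' proposal) is a weighting-class adjustment applied separately to each item $k$: $w_i^{(k)} = w_i \times \big(\sum_{j \in R_c} w_j\big) / \big(\sum_{j \in R_c^{(k)}} w_j\big)$, where $w_i$ is the unit-nonresponse-adjusted weight of respondent $i$, $c$ is the adjustment cell containing $i$, $R_c$ is the set of unit respondents in that cell, and $R_c^{(k)} \subseteq R_c$ is the subset who also answered item $k$. Estimates involving item $k$ would then use $w_i^{(k)}$ rather than a single final weight.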

Optimal Sample Allocation: A Portable Tool for Estimating Design Effect Mansour Fahimi, Marketing Systems Group ([email protected])

Oftentimes, survey researchers rely on disproportionate stratified sampling schemes for allocation of the total sample to various strata. While such departures from proportional allocation or EPSEM designs can accommodate analytical objectives and/or budgetary constraints, the imposed unequal selection probabilities have to be accounted for by application of survey weights. Since weighting increases the variance of survey estimates, any disproportionate sample allocation plan has to be considered in light of its associated loss of precision due to the unequal weighting effect. Although final weights cannot be available prior to the completion of a survey, there are simple approximations one can use to estimate the resulting effect during the design phase. The author will present a portable tool for this purpose using basic spreadsheet techniques. In particular, the Solver optimization model in Excel will be used to identify sample allocation plans that accommodate the various design and cost constraints while producing the smallest design effect.
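
The approximation alluded to is typically Kish's unequal weighting effect, which requires only the planned allocation rather than the final weights: $\mathrm{deff}_w \approx 1 + \mathrm{cv}^2(w) = n \sum_i w_i^2 \,/\, \big(\sum_i w_i\big)^2$. In a stratified design the weights are constant within strata, $w_h = N_h/n_h$, so the expression can be evaluated in a spreadsheet directly from the frame counts $N_h$ and any candidate allocation $n_1, \ldots, n_H$, and Solver can then search over the $n_h$ subject to cost and sample-size constraints to minimize it. (This is a hedged sketch of the kind of tool described, not the author's exact workbook.)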

Cell Phones and Non-Sampling Error

Nonsampling Error Attributable to Sampled Cell Phone Numbers in the American Time Use Survey Brian Meekins, Bureau of Labor Statistics ([email protected]); Stephanie Denton, Bureau of Labor Statistics ([email protected])

Recent research on the impact of cell phones has largely focused on coverage and nonresponse error, with few exceptions (Kennedy et al. 2009, Brick et al. 2011). In this work the authors focus on nonsampling error in the American Time Use Survey (ATUS). This nationally representative survey is conducted by the U.S. Census Bureau on behalf of the Bureau of Labor Statistics. The sample for the ATUS is derived from households that have completed Wave 8 of the Current Population Survey. Households that volunteer a phone number for that survey are then called for the ATUS using that phone number (those who do not volunteer a phone number are mailed an invitation to participate and an incentive). The vast majority of CPS respondents provide the Census Bureau with a phone number. The ATUS further selects a sample member from within the household to answer relatively detailed questions, including a 24-hour time use diary. In this work we examine the impact of calling cell phone numbers on nonresponse and measurement error in the ATUS. Because the sample is derived from completed CPS interviews, we are able to model nonresponse using CPS data. Almost 40% of the ATUS telephone sample volunteered a cell phone number for contact in the CPS. Those who volunteer a cell phone number for survey contact in the CPS are just as likely as landline volunteers to say that a phone interview is acceptable. Cell phone volunteers are less likely to complete ATUS interviews due to noncontact, while their refusal rate is similar to that of those volunteering a landline number. Differences in measurement error appear to be negligible. There are some differences in the estimates of time use, but these are largely due to demographic differences.

Exploring Direct Calibration of NIS Weights using Cell Telephone Status from the NHIS Meena Khare, NCHS/CDC ([email protected]); Nadarajasundaram Ganesh, NORC at the University of Chicago ([email protected]); Kennon R. Copeland, NORC at the University of Chicago ([email protected]); Abera Wouhib, NCHS/CDC ([email protected])

The National Immunization Survey (NIS), a large list-assisted RDD landline telephone (LT) survey, is currently augmented with a cell phone sample. NIS estimates have generally been based on data obtained from LT household interviews and provider reports completed for a sample of preschool children in the United States. The increased number of inaccessible households in recent years due to non-landline telephone status widens the gap between the target and the sampled populations, and raises a major concern regarding the representativeness of estimates from the LT sample alone. The prevalence of cell-only households has recently escalated to approximately 30%, while nontelephone households have remained constant at about 2-3%; an additional 14% are cell-mainly and do not answer a landline telephone. Although recent NIS research has shown minimal noncoverage bias (<1%) with the increasing prevalence of cell-only and cell-mainly households, a national cell phone sample was added to the NIS starting in Q4/2010 to reduce potential bias due to noncoverage of non-landline households and to measure the direct impact on NIS estimates.

In this paper, we evaluate the impact of different weighting methods on the potential bias in estimated vaccination coverage rates due to the nonrepresentativeness of the NIS LT sample. The distribution of telephone status, household characteristics, and selected estimated vaccination coverage rates from the 2010 NIS-child, and NHIS-Provider Record Check study (NHIS-PRC) are compared. Distributions from the NHIS-PRC are utilized to directly calibrate NIS sample weights by landline and cell telephone status. We compare weighted vaccination coverage rates with and without using direct calibration of the survey weights. For comparison of methods, we also compare estimates using a propensity-based weighting method to adjust NIS weights for noncoverage.
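
In its simplest form, direct calibration to an external telephone-status distribution is a poststratification ratio adjustment (a general sketch, not necessarily the exact NIS implementation): $w_i^{\mathrm{cal}} = w_i \times P_c^{\mathrm{NHIS}} / \hat{P}_c^{\mathrm{NIS}}$ for each respondent $i$ in telephone-status cell $c$ (e.g., landline with cell, landline without cell, cell-mainly), where $P_c^{\mathrm{NHIS}}$ is the NHIS-PRC estimate of the population share of cell $c$ and $\hat{P}_c^{\mathrm{NIS}} = \sum_{j \in c} w_j / \sum_j w_j$ is the corresponding weighted share in the NIS sample.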

Assessment of Bias in the National Immunization Survey-Teen: Benchmarking to the National Health Interview Survey, 2009-2010 Christina Dorell, Centers for Disease Control and Prevention ([email protected]); Kennon R. Copeland, NORC at the University of Chicago ([email protected]); Reiping Huang, NORC at the University of Chicago ([email protected]); Benjamin Duffey, NORC at the University of Chicago ([email protected])

The proportion of children living in cell-phone-only households was 31.8% in 2010; these children are missing from the sampling frames of landline random-digit-dialing (RDD) surveys, and their omission may result in biased estimates. Address-based sample surveys like the National Health Interview Survey (NHIS) have a lower opportunity for non-coverage bias. The National Immunization Survey-Teen (NIS-Teen) reported vaccination estimates for adolescents through 2010 using a landline RDD sample. In 2010 a national cell phone sample was added, and in 2011 dual-frame (landline and cell) estimates will be reported. While inclusion of a cell phone sample in the NIS-Teen addresses non-coverage, an assessment of non-response bias of dual-frame estimates is needed. To assess the impact of non-coverage and non-response bias on NIS-Teen vaccination estimates from 2009 and 2010, we compared adolescent (ages 13-17 years) vaccination estimates from the NIS-Teen and NHIS. Bivariate analyses were used to describe NIS-Teen landline and NHIS distributions for telephone status and sociodemographic characteristics. Provider-verified vaccination estimates were reported from the 2009 and 2010 landline NIS-Teen, 2010 dual-frame NIS-Teen, and 2009 and 2010 NHIS samples. Net biases in vaccination estimates from the surveys are reported. Approximately 20% of the NHIS sample lived in cell-phone-only or phoneless households. Relative to the NHIS sample, the NIS-Teen landline sample under-represents adolescents who are non-Hispanic black, have less-educated mothers, live in rented households, and live in households in central cities, characteristics common to cell-phone-only households. NIS-Teen landline estimates were three to nine percentage points greater than the NHIS for nine of thirteen vaccines. NIS-Teen dual-frame estimates were three to six percentage points greater than the NHIS for seven of thirteen vaccines. We will present findings from our assessment of differences among NIS-Teen landline, NIS-Teen dual-frame, and NHIS vaccination estimates that may be related to differential coverage, non-response, and provider non-consent in the two surveys.

The Impact of Cell Phones in Longitudinal Studies Daniel Loew, Abt SRBI ([email protected]); Mark Morgan, Abt SRBI ([email protected])

Much attention has been given to the inclusion of cellular phones in random population studies, but they also have important ramifications when conducting longitudinal studies of listed populations. A critical element of longitudinal research projects is minimizing attrition from wave to wave. In telephone surveys a primary reason for attrition is that participants relocate and often change their phone number. The growing use of personal email and cell phones with number portability can potentially make it easier to maintain contact with people who move. In this methodological brief we test the hypothesis that cellular phones are more beneficial to minimizing attrition in longitudinal research than landline phones.

This paper will examine the differences in attrition and ease of contact between landline and cellular telephone numbers in a 3-year cohort study of military personnel. The analysis includes the impact of supplemental contact modes such as email, postal mail, new address searches, and contact referrals. The results will help guide best practices for utilizing technology to maximize participation in longitudinal studies.

Alternative Interviewing Approaches

Cumulative Effects of Dependent Interviewing on Measurement Error: Results from a Four-wave Validation Study Johannes Eggs, Institute for Employment Research ([email protected]); Annette Jaeckle, Institute for Social and Economic Research ([email protected]); Mark Trappmann, Institute for Employment Research ([email protected])

With Proactive Dependent Interviewing (PDI), respondents are reminded of the answer they gave to a survey question in a previous interview. The previous information is used to verify whether the respondent’s status has changed, or as a starting point for asking about events since the previous interview. In either case, concern is frequently voiced that measurement error from the first wave will be carried forward into future waves of the survey. We exploit a rare opportunity to validate several waves of a panel survey in order to examine how cross-sectional and longitudinal aspects of measurement error evolve when PDI is used over several waves. We use four waves of the panel study “Labour Market and Social Security” (PASS), linked to individual administrative records, to examine whether PDI has any cumulative effects on measurement error across waves that go beyond the single-wave effects reported in previous validation studies. We address the following questions: 1. When PDI is used in successive waves to measure benefit receipt, does the extent of under-reporting and over-reporting change across waves? 2. How does the resulting bias in prevalence rates change across waves? 3. How do the changes in over- and under-reporting affect transition rates onto and off benefit receipt between waves? Do errors in transition rates change across waves? 4. Do errors in reported months of receipt change across waves?

Exploring Conversational Interviewing in the American Time Use Survey Jennifer Edgar, Bureau of Labor Statistics ([email protected]); Stephanie Denton, Bureau of Labor Statistics ([email protected]); Scott Fricker, Bureau of Labor Statistics ([email protected]); Polly Phipps, Bureau of Labor Statistics ([email protected])

In the American Time Use Survey (ATUS), interviewers use a set of scripted, open-ended questions to walk respondents chronologically through the prior 24-hour day, collecting their activities and details about each activity reported. Interviewers are trained to engage the respondent in a conversational and flexible way to gather the necessary information. These conversational interviewing techniques, such as using a variety of strategies to assist with recall, anchoring questions, and active listening, are thought to put the respondent at ease and give interviewers the freedom to collect data in the best possible way.

While the 24-hour recall diary is a standard way to collect time use data, recalling activities from a previous day is a challenging task. Conversational interviewing is hypothesized to improve respondent understanding of questions and concepts as interviewers and respondents collaborate and converse on meaning. In the ATUS, conversational interviewing is also thought to improve recall by allowing interviewers to ask open-ended questions that help respondents reconstruct their day in an order and way that is meaningful to them, rather than following a set script and sequence.

In this paper, we use 100 behavior coded transcripts of ATUS interviews to address four research questions about the use of conversational interviewing in the ATUS recall diary. First, are interviewers using conversational interviewing, scripted interviewing, or some hybrid combination? Second, what conversational interviewing techniques are being used? Third, is there variation in the use of conversational interviewing by interviewer experience? Finally, is there evidence that conversational interviewing is associated with the quality and quantity of the activities reported by the respondent?

Conversational Interviewing and the Comprehension of Opinion Questions Frost A. Hubbard, Survey Research Center, University of Michigan ([email protected]); Chris Antoun, Survey Research Center, University of Michigan ([email protected]); Frederick Conrad, Survey Research Center, University of Michigan ([email protected])

Conversational interviewing, which allows interviewers to say what is required so that respondents understand survey questions, can improve comprehension of questions and accuracy of answers (e.g., Conrad & Schober, 2000). These benefits have been demonstrated for questions about facts and behaviors, but not opinions. We report a study that explored the effectiveness of conversational interviewing for opinion questions and examined differences in interviewers’ facility with the technique due to variation in their interpersonal sensitivity, i.e., their ability to perceive the internal states (including comprehension) of other people based largely on their non-verbal behavior. We embedded our experiment in the last section of the June 2011 version of the Survey of Consumers, after all questions had been administered with standardized interviewing. The interviewer then re-asked ten questions using standardized or conversational interviewing; the particular technique was randomly determined prior to data collection. Our main measure was response change between the initial and subsequent administration of the questions, the logic being that if respondents misunderstood a question in the initial standardized presentation, they would be more likely to respond differently in a conversational than a standardized re-interview, because conversational interviewing can correct the misconception, potentially leading to a different answer. The ten questions in the experiment concerned a variety of topics, including opinions about the economy and respondent demographics. The definitions concerned words in the question stem and response categories. Across all ten questions, the conversational technique produced significantly more response change than the standardized technique. This was largely due to one opinion and one behavioral question. Conversational interviewing required about one more minute than standardized administration of the ten questions. Interviewers who scored higher on a test of interpersonal sensitivity produced significantly more response change than those who scored lower. We are currently analyzing the interviewer-respondent interaction to further explain the quantitative effects.

Language Barriers to Conversational Interviewing: Results from the 2010 & 2011 SIPP-EHC Tests Rachael Walsh, U.S. Census Bureau ([email protected])

The collection of data used to describe the current financial situation of families in the United States is usually done through surveys conducted in English. As the population of non-English-speaking respondents continues to grow, the language barrier—that is, the misinterpretation of survey questions resulting from translational and interpretational inaccuracy—poses a threat to data quality and increases respondent burden. The purpose of this research is to assess the impact of language differences between interviewers and survey respondents in a new, more conversational style of interviewing. The Survey of Income and Program Participation-Event History Calendar (SIPP-EHC) utilizes a conversational style of interviewing in a calendar format that is new to the Census Bureau. The deviation from the scripted interviewing style enables a test of the hypothesis that interviewing in the respondent’s first language minimizes the burden of increased interview length resulting from the language barrier.

A quasi-experimental design using propensity score matching was applied to data from the 2010 and 2011 SIPP-EHC field tests, concluding that interviews in ESL households are significantly longer than interviews in English-only households of a demographically similar background. The propensity model compares households with the most similar demographic characteristics to ensure that the difference is the result of the language barrier rather than confounding covariates. Interviews conducted in Spanish are statistically significantly longer than all other interviews conducted in English. However, when Spanish-speaking households interviewed in Spanish are compared to English-only households interviewed in English, a statistically significant difference in interview length is not seen. This research provides statistical support for the tested hypothesis that interviewing in the respondent’s first language can minimize the burden of interview length. Rather than translating surveys into multiple languages, data collection agencies that can hire and retain bilingual interviewers may compensate for this effect.
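
As a rough illustration of the matching step, the sketch below pairs each ESL household with the demographically most similar English-only household on an estimated propensity score and then compares interview length. All file and variable names are hypothetical and this is not the SIPP-EHC production code; a real analysis would also check covariate balance and incorporate survey weights.

import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import NearestNeighbors

# Hypothetical field-test extract: one row per household, numeric covariates.
df = pd.read_csv("field_test_households.csv")
covars = ["hh_size", "income", "educ_years", "age_of_head"]  # assumed demographic covariates

# 1. Estimate each household's propensity of being an ESL household.
ps_model = LogisticRegression(max_iter=1000).fit(df[covars], df["esl_household"])
df["pscore"] = ps_model.predict_proba(df[covars])[:, 1]

# 2. Match each ESL household to the English-only household with the closest
#    propensity score (1:1 nearest-neighbor matching, with replacement).
treated = df[df["esl_household"] == 1]
control = df[df["esl_household"] == 0]
nn = NearestNeighbors(n_neighbors=1).fit(control[["pscore"]])
_, idx = nn.kneighbors(treated[["pscore"]])
matched_control = control.iloc[idx.ravel()]

# 3. Compare mean interview length between the matched groups.
diff = treated["interview_minutes"].mean() - matched_control["interview_minutes"].mean()
print(f"Matched ESL minus English-only interview length: {diff:.1f} minutes")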

Conducting Surveys with Proxies: Evaluating a Standardized Measure to Determine Need Kirsten A Barrett, Mathematica Policy Research, Inc. ([email protected]); Debra Wright, Mathematica Policy Research ([email protected]); Jennifer Gardner, Mathematica Policy Research ([email protected])

The use of proxy respondents for sample members who are unable to respond is common in research on persons with chronic health conditions. Research comparing the quality of self and proxy reports, however, is mixed. While proxies can accurately report factual information (Wright et al., 2009), there is greater variability on subjective items. For example, Andresen et al. (2000) examined disability questions on the 2000 Census and found that proxies reported that the person with a disability had greater levels of impairment than what was self-reported by the person with the disability. Historically, the decision to use a proxy has been left to interviewer judgment or gatekeeper reports. This can lead to variability in how cognitive ability is assessed and in the quality of the data collected. Currently available standard assessments are not designed to assess the ability to complete a telephone interview. There is a need for a standardized tool to objectively and reliably assess cognitive ability to participate in an interview while maximizing self-response.

For the National Beneficiary Survey (NBS), Mathematica Policy Research administered an innovative cognitive screener to determine if sample members had the capacity to complete the survey or if a proxy was needed. The purpose of the screener was to standardize the assessment of proxy need. The NBS is a CATI/CAPI survey conducted for the Social Security Administration. NBS data are collected from a nationally representative sample of disability beneficiaries. This paper will: 1) describe the screening process; 2) discuss inter-rater reliability; and 3) present information on the number of cases identified as needing a proxy, reasons a proxy was needed, the outcome of these cases, and when the proxy determinations were made (e.g. as a result of the screener or at another point). We will also suggest methodological improvements for proxy determination.

Exploring the Immigration Issue

Immigration Policy Attitudes at the Local Level: What’s Threat Got to Do With It? Maria Krysan, University of Illinois at Chicago ([email protected])

Immigration is changing the face of America in the 21st century. Public opinion on the issue of immigration has been tracked by Gallup and others for decades, generally relying on nationally representative samples who are asked fairly general questions about whether immigration to the U.S. should be “increased, decreased, or kept about the same.” The portrait of immigration policy attitudes provided by these data suffers from this focus because immigration policy, increasingly, is being played out at local and state levels—in part because of inaction at the federal level. So not only are contested immigration policy issues more varied than is suggested by the singular focus on the appropriate rate of immigration to the US, but the issues are playing themselves out at a smaller level of geography than the nation as a whole. In this paper, we explore models predicting immigration policy attitudes that are innovative in at least three ways. Our dataset, the 2010 Chicago Area Study (n=1068), has: (1) ten different immigration policy questions, ranging from local ordinances to the preferred pace of immigration to the US; (2) a very rich battery of measures of one of the core predictors of immigration policy attitudes—threat—including perceptions of educational, cultural, economic, political, and personal threat; and (3) representative samples of residents in five Chicago suburbs that have experienced different levels of immigration from Mexico. Our preliminary results highlight the complexity of immigration policy attitudes, revealing very different responses depending on the specific policy issue. In addition, the dimensions of threat that predict these attitudes differ depending on the specific policy content, though prejudice, economic threat, and concerns about the Spanish language appear particularly important.

Applying the “Contact Hypothesis” to White Anglos’ Views About Latinos and Immigration: Evidence from Five Chicago-Area Communities Marylee C. Taylor, Pennsylvania State University ([email protected])

Allport’s (1954) “contact hypothesis” specifies that contact between groups can generate positive inter-group attitudes when four conditions hold, among them equal status among interactants. Pettigrew (1986) added “friendship potential” to Allport’s list. Although the contact hypothesis has been tested in many contexts, most U.S. studies focus on the black/white divide. Research assessing environmental influences has found white residents’ racial attitudes to be more negative when the local black population share is large, contrary to simplified contact notions. However, some recent work indicates that the direction of the relationship between minority numbers and dominant group attitudes may reverse when small geographical units are the focus. In short, research evidence related to the contact hypothesis is far from complete.

As part of the 2010 Chicago Area Study, rich data were collected from a random sample of 628 white Anglos in five communities, allowing a close look at important understudied questions related to inter-group contact. Do white Anglos who report contact with Latinos hold more favorable attitudes toward Latinos, immigrants, and immigration? What opinions does this group hold about policies aimed at undocumented immigrants? Do status inequalities undercut any tendency for inter-group contact to generate positive attitudes? And how does Latino population share in the proximal geographical area – the residential block – relate to white Anglos’ attitudes and opinions about Latinos and policy related to immigrants and immigration?

Results obtained to date underscore the necessity of measuring alternate forms of contact and varied attitudes and opinions. For example, self-reported frequency of conversations with Latino neighbors, like interaction with Latino co-workers, has little relationship to white Anglos’ attitudes and policy opinions. However, white Anglos who include a Latino among the five people with whom they discuss important matters are more liberal on most immigration policy questions and many other attitudes about immigrants and Latinos.

Immigration Policy, A Non-Border State, and the Nation--A View from the Mid-West Amy Sue Goodin, University of Oklahoma Public Opinion Learning Laboratory ([email protected]); Natalie Jackson, Duke University ([email protected])

Long before the controversial Arizona immigration law SB 1070 was passed in the spring of 2010, Oklahoma was passing strict anti-illegal-immigration legislation. In 2007, the Oklahoma legislature overwhelmingly passed HB 1804—the harshest anti-immigration bill in the nation at the time. Since then, the state has continued on this path: in the spring of 2011, the Oklahoma legislature passed another bill in the style of Arizona SB 1070. The combination of laws places Oklahoma high among the states with tough anti-immigration laws in place, making it clear that its legislature is among the most conservative anti-immigrant legislatures in the nation. Since legislators are elected by the people and there was widespread support for these bills, it is assumed that anti-immigration sentiment exists among the general public of Oklahoma as well. Why are Oklahomans, who do not live in a border state and have a relatively lower illegal immigrant population than many other states, so concerned with immigration? In this paper, we examine Oklahomans’ opinions of immigration policy and consider how attitudes differ between the U.S. population and the Oklahoma population using survey data collected from samples of each population. Further, we attempt to explain any observable differences in views on immigration policy through a variety of lenses, including cultural worldviews, political views, and efficacy. These findings will help explain how public opinion on immigration issues can vary by state, as well as why even non-border states are passing such strict anti-immigration legislation.

Integration of Migrants Neli Esipova, Gallup ([email protected]); Anita Pugliese, Gallup ([email protected])

Gallup examined migrants’ lives and those of native-born residents in 15 European Union countries. Gallup interviewed more than 25,000 adults in 2009/2010 about the physical, financial, career, community, and social aspects of their wellbeing, as well as their beliefs about religion and national institutions. This model provides a comprehensive view of migrant experiences in these 15 European countries, and contrasts the experiences of migrants who have lived in their current countries for at least five years (long-timers) and migrants who have lived in their current countries for five years or less (newcomers) with those of native-born residents.

Gallup’s data reveal that migrants are not as well off as native-born residents. Migrants exhibit lower evaluative and experiential wellbeing ratings, fewer social connections, and lower attachment to their community compared with native-born residents. Financial wellbeing and job status improve with length of stay in the country; however, migrants still do not reach the level of the native born after five years in their new country. In other areas, newcomers are significantly more positive than long-timers, suggesting higher optimism and expectations that could decline over time. Because Gallup asks the same questions and uses the same methodology in more than 150 countries and areas, we are able to create a model to estimate what migrants’ lives hypothetically would have been like if they had stayed home – and therefore, what they have gained or lost due to their move to their adopted country. Gallup found that the gains and losses that migrant long-timers experienced largely depend on the level of human development in their home countries. The bigger the development gap between their home countries and the highly developed European Union countries, the bigger the gains and losses these migrants are likely to experience.

Leaving Home: Examining the Influence of Social Ties on Latin American Immigration Ana Lucia Cordova Cazar, Gallup Research Center, University of Nebraska-Lincoln ([email protected]); Matt Hastings, Gallup Research Center, University of Nebraska- Lincoln ([email protected]); Allan L. McCutcheon, Gallup Research Center, University of Nebraska-Lincoln ([email protected])

Hispanics have increasingly become an essential part of America's social fabric. Indeed, between 2000 and 2010, Hispanic growth alone accounted for more than half of the increase in the total US population (US Census Bureau, 2010). In fact, during this ten-year period, the Hispanic population grew from 35.3 million to 50.5 million, and now represents 16 percent of the total US population. Such demographic changes have important implications for society, including, for instance, reductions in social solidarity and social capital (Putnam, 2007). As such, understanding the motivations behind immigration from Latin America represents a critical issue.

Using data from the Gallup World Poll, a probability-based multinational survey conducted by the Gallup Organization, preliminary analyses indicate that Latin Americans continue to desire to immigrate to the United States. One in three Latin Americans intends to permanently move to another country, with a plurality (47 percent) choosing the United States as their destination of interest. Of this group, 10 percent indicate that they plan to move within the next twelve months.

Utilizing the social capital theory framework (see, e.g., Putnam, 2000) and a set of relevant questions from the Gallup dataset, this study will examine whether social ties with family and friends – both at home and in the United States – exert an influence on Latin American migration patterns. Further, this work will distinguish between those Latin Americans currently planning to move to the United States and those who ideally would move given the opportunity. Importantly, we will also consider the motivations behind Latin Americans’ desire to emigrate, including academic and employment objectives.

Improving Questionnaire Design

The Effects of Question Design Features on the Cognitive Processing of Survey Questions Across Cultural Groups Timothy Patrick Johnson, Survey Research Laboratory ([email protected]); Allyson Holbrook, University of Illinois at Chicago Survey Research Laboratory ([email protected]); Young Ik Cho, University of Wisconsin-Milwaukee ([email protected]); Sharon Shavitt, University of Illinois at Urbana-Champaign ([email protected]); Noel Chavez, University of Illinois at Chicago ([email protected]); Saul Weiner, University of Illinois at Chicago ([email protected])

Survey questions vary in terms of how easily respondents are able to comprehend their meaning and formulate answers. Growing evidence points to cultural variability in the information processing of survey questions. In this paper, we investigate the degree to which question design variables moderate the relationship between race/ethnicity and the cognitive processing of survey questions. Our analyses address the effects of respondent culture, as assessed by race/ethnic status, and survey question characteristics, and their interactions. Analysis is based on a sample of 105,000 individual answers to 200+ questions asked of 603 respondents. A novel feature of our analysis is that behavior coding is used to capture survey question processing, including comprehension, memory, mapping and social desirability concerns. Among the question characteristics examined are question response format (e.g., yes/no vs. numeric vs. verbal labels), question topic (physical vs. mental health), type of judgment (time-qualified vs. not), type of report (self vs. proxy), question length, and language complexity. We employ hierarchical regression models to evaluate the independent effects of these question and respondent characteristics on composite indicators of respondent comprehension, memory, mapping and social desirability. Models control for other respondent sociodemographic characteristics and for behavior code assessments of the adequacy with which interviewers read each survey question to respondents. Recommendations regarding optimal design of survey questions in surveys of racially and ethnically diverse populations will be discussed.

Exploring the Associations of Question, Respondent, and Interviewer Characteristics with Survey Data Quality Aaron Maitland, Westat ([email protected]); Heather Ridolfo, National Center for Health Statistics ([email protected]); James Dahlhamer, National Center for Health Statistics ([email protected])

A variety of factors and design characteristics have been posited to influence the quality of survey responses. At the item level, question type, format, and length have been linked to measurement error, as has the mode or method of data collection. Respondent characteristics such as age, race/ethnicity, gender, and education, along with topical knowledge, interest, and motivation, have also been found to influence the quality of survey responses. Finally, interviewer demographics, skills, and expectations can impact the quality of elicited responses. While the survey methods field is replete with research on measurement error and data quality, the bulk of this work tends to focus on just one of the aforementioned sources of error (question, mode, respondent, interviewer).

In this paper, we examine the associations of question, respondent, and interviewer characteristics with survey data quality using data from the 2010 National Health Interview Survey. We start by developing a hierarchical data set with questions nested within respondents nested within interviewers. We then use multilevel modeling to estimate the contribution of each level to variability in item nonresponse and response times. At the question level, characteristics such as question complexity, topic and sensitivity are coded and used as predictors of data quality. Respondent-level characteristics such as race/ethnicity, gender and age are also included in the model. Finally, we test for the existence of an interviewer effect, and explore the relationship between interviewer performance measures and data quality.
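
A conventional way to write down the model described (a general sketch, not the authors' exact specification) is a three-level random-intercept logistic model for item nonresponse, with question $q$ nested in respondent $r$ nested in interviewer $i$: $\mathrm{logit}\,\Pr(y_{qri}=1) = \beta_0 + \boldsymbol{x}_{qri}^{\top}\boldsymbol{\beta} + u_{r(i)} + v_i$, where $u_{r(i)} \sim N(0, \sigma_r^2)$ and $v_i \sim N(0, \sigma_i^2)$, $y_{qri}$ indicates a missing answer, and $\boldsymbol{x}_{qri}$ stacks the question-, respondent-, and interviewer-level predictors. On the latent logistic scale, the share of variance attributable to interviewers is $\sigma_i^2 / (\sigma_i^2 + \sigma_r^2 + \pi^2/3)$; response times would be handled analogously with a (log-)linear rather than logistic specification.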

Mechanisms of Misreporting to Filter Questions Frauke Kreuter, Joint Program in Survey Methodology ([email protected]); Stephanie Eckman, Institute for Employment Research ([email protected]); Annette Jaeckle, ISER ([email protected]); Antje Kirchner, Institute for Employment Research ([email protected]); Stanley Presser, JPSM ([email protected]); Roger Tourangeau, JPSM ([email protected])

To avoid asking respondents questions that do not apply to them, many surveys use filter questions to determine routing into follow-up items. Filter questions can be asked in an interleafed format, in which follow-up questions are asked immediately after each relevant filter, or in a grouped format, in which follow-up questions are asked only after multiple filters have been administered. Most previous studies of the phenomenon have found that the grouped format collects more affirmative answers to the filter questions than the interleafed format. The interpretation generally given is that respondents in the interleafed format learn that they can shorten the questionnaire by answering negatively. However, such “motivated underreporting” is only one of several potential mechanisms that could produce the observed differences in responses. Acquiescence could also explain the response patterns found in previous studies.

In the fall of 2011, we carried out a telephone survey (n=2400) specifically designed to test the mechanisms of differential reporting under the two filter question formats. Our experiments extended previous work by experimentally contrasting three different filter formats: 1) asking all filters before asking follow-up questions, 2) grouping filters by topic and asking follow-up questions after each group of filters, and 3) the interleafed format. We also asked filters for which the follow-up questions are triggered by ‘no’ responses, in addition to the filters triggered by ‘yes’ responses previously studied. Our study also included a link to administrative data, and thus we can report with greater certainty than previous studies which filter format collects the highest quality data for the filter questions. The experimental design allows us to distinguish between the motivated underreporting and acquiescence phenomena. We find support for both explanations, but on balance show that the motivated underreporting explanation fits the observed patterns better.

Turn that Frown Upside-Down: The Effects of Smiley Faces as Symbolic Language in Self-administered Surveys Amanda Libman, University of Nebraska - Lincoln ([email protected]); Jolene D Smyth, University of Nebraska - Lincoln ([email protected])

Question wording has long been known to influence responses. More recently, research has shown that the visual design of a questionnaire also influences responses (Smith 1995; Jenkins & Dillman 1997; Christian & Dillman 2004; Couper et al. 2004). In Smith’s original study, he found evidence that responses to a scalar question were influenced by the Dutch ladder visual aid provided with the question. Another somewhat common practice is to provide smiling and frowning faces in satisfaction scales. The faces are thought to act as a substantive symbolic language, helping respondents understand the meaning of scale points and negotiate through the scale, but we know very little about how respondents use the visual aids or what effect they have on responses. In this paper we use data from an eye tracking web experiment (n=62) conducted in Spring 2011 to examine the differences in respondents’ processing of and responses to satisfaction questions with and without smiley face visual aids. In particular, we look at total time to complete the questions, what parts of the questions respondents looked at, how long they looked at each part, and the order in which they looked at each part to try to better understand how visual aids such as smiley faces affect the processing of survey questions. Preliminary analyses indicate that those who received the smiley faces completed the question faster than those who did not receive the visual aid. Findings will help inform questionnaire design and contribute to the growing literature on how visual elements affect survey processing and responses.

Predicting and Adjusting for Non-Response Bias

Anticipatory Survey Design: Reduction of Nonresponse Bias through Bias Prediction Models Andy Peytchev, RTI International ([email protected]); Sarah Riley, University of North Carolina at Chapel Hill ([email protected]); Jeff Rosen, RTI International ([email protected]); Joe Murphy, RTI International ([email protected]); Mark Lindblad, University of North Carolina at Chapel Hill ([email protected])

A common strategy in survey administration is to strive for higher response rates, implicitly or explicitly, to reduce the potential for nonresponse bias. Methods blindly aimed at maximizing response rates without consideration of survey estimates, however, may fail to reduce nonresponse bias. Conversely, targeted strategies can be considered to increase participation in a manner that reduces nonresponse bias. Targeting sample members who are least likely to participate rather than most likely may seem inefficient in reducing nonresponse rates, yet rational if the goal is nonresponse bias reduction, and even variance reduction. Auxiliary information, such as data from the sampling frame, and demographic and survey data from prior survey administrations can be highly predictive of both survey participation and the likely contribution to nonresponse bias and variance. In our prior study, we demonstrated that reducing the variation in the predicted likelihood of participation through such an approach can reduce nonresponse bias. In a second experiment, we specified models predicting key survey estimates and prioritized sample cases based not only on response propensity, but also on the predicted likelihood of bias reduction in the survey estimate. In this presentation, we will describe several alternative approaches and the one that we ultimately selected for data collection in the current wave of the Community Advantage Panel Survey, a longitudinal study on financial assets and liabilities. The current wave of this survey began in June 2011 and is scheduled to end in December 2011. Preliminary results from the experiment reveal new insight about the ability to target and reduce nonresponse bias. Results can help inform improved survey designs that reduce nonresponse bias, particularly in surveys with ample auxiliary information, such as from rich sampling frames or prior waves of data collection.

Accounting for Nonresponse Bias in the Nebraska Behavioral Health Consumer Survey Brian M Wells, University of Nebraska-Lincoln ([email protected])

The Nebraska Behavioral Health Consumer Survey collects opinions from persons who use behavioral health services in Nebraska (generally called consumers) on aspects of the services they receive, such as accessibility, quality, and participation. The results of the survey are used to determine statewide policy and assess statewide performance at the federal level. But like most modern surveys (de Leeuw & de Heer, 2002), the Consumer Survey suffers from low response rates, increasing the risk of nonresponse bias of the survey estimates. Identifying predictors of both response and survey outcomes is critical to making proper postsurvey adjustments to correct for nonresponse bias (Groves, 2006). This analysis investigates gender, race, age, admission date, region of service, and service type as potential predictors to account for nonresponse bias in the Consumer Survey for seven survey outcomes. Two response propensity models were tested. Admission date is a strong predictor of nonresponse, with those admitted into a service after 2009 having seven times the odds of responding (OR = 6.94, p < 0.0001). Importantly, admission date (before 2009 and during 2009) is also a predictor of positive opinions about behavioral health services, with large differences of between 9 and 23 percent between the two periods. Service type, gender, race, age, and region are also significant predictors in both models (p < 0.05). Some potential theories are discussed in addition to proposed recommendations to avoid this major source of nonresponse bias. This study is especially relevant as changes in mental health and substance abuse services are mandated by the Affordable Care Act (i.e., health care reform), and multiple states and providers are examining how they currently administer and regulate these behavioral health services.
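
For illustration, a response propensity model of the kind described above might be fit as a logistic regression of a response indicator on the frame variables; the sketch below (Python) uses hypothetical variable and file names and is not the author's actual code.

    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    # Hypothetical sample file: one row per sampled consumer, with a 0/1 response flag.
    frame = pd.read_csv("consumer_sample.csv")

    # Logistic response propensity model using the predictors named in the abstract.
    fit = smf.logit(
        "responded ~ C(gender) + C(race) + age + C(admitted_after_2009)"
        " + C(region) + C(service_type)",
        data=frame,
    ).fit()

    # Exponentiated coefficients are odds ratios; an OR near 7 for admission after 2009
    # would correspond to roughly seven times the odds of responding.
    print(np.exp(fit.params))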

From Analysis to Action: Use of Paradata in a CAPI Environment Barbara C. O'Hare, U.S. Census Bureau ([email protected])

The growing complexity of lifestyles and modes of communication among the general public is challenging survey practitioners to further tailor their methods to specific segments of the survey population to obtain high response rates and demographic balance. The systematic statistical analysis of survey paradata during data collection provides a means to improve data collection efficiency while maintaining quality (Groves and Heeringa, 2006; Laflamme, 2008; Kreuter, Couper, Lyberg, 2010). Real-time analysis of and decisions based on survey paradata can help address the challenges of delivering a survey to meet expectations in the context of an increasingly diverse survey population. This paper discusses work at the U.S. Census Bureau to implement the use of daily paradata analysis in a CAPI field environment. Our presentation will summarize these key steps in implementing a production application of paradata analysis during data collection:
• Considerations in the integration of survey operations data from disparate systems into analytic data files for ongoing paradata analysis.
• The process to develop key indicators of survey progress, cost, and quality and their presentation in a format meaningful to field operations staff.
• An example of a key indicator – a response propensity for each case. A response propensity model, estimating likelihood of an interview on the next contact attempt, is used to “flag” cases in the operational case management system.
• Steps taken to move from paradata analysis to implementation in a CAPI field environment.
The study is based on an analytic database combining operational data from our case management system, the payroll system of interviewer daily hours and miles, and data from the Census Bureau’s Contact History Instrument (CHI). Increasing the reliance on statistical analyses of survey outcomes provides a quick means to identify trends in data collection that jeopardize meeting high response rates and demographic balance.
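
As a rough sketch of the response propensity “flag” step described above (the Bureau's production system is not shown; all file and field names are hypothetical):

    import pandas as pd
    import statsmodels.formula.api as smf

    # Hypothetical inputs: past contact attempts with outcomes, and currently open cases.
    history = pd.read_csv("attempt_history.csv")
    open_cases = pd.read_csv("open_cases.csv")

    # Model the probability that the next contact attempt yields an interview,
    # using contact-history paradata.
    fit = smf.logit(
        "interviewed ~ prior_attempts + ever_contacted + C(best_time_of_day)",
        data=history,
    ).fit()

    # Flag high-propensity open cases for the operational case management system.
    open_cases["high_propensity_flag"] = (fit.predict(open_cases) >= 0.5).astype(int)
    open_cases[["case_id", "high_propensity_flag"]].to_csv("case_flags.csv", index=False)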

The External Survey Environment: Measuring and Monitoring the Public Rupa Datta, NORC at University of Chicago ([email protected]); Nancy Bates, U.S. Census Bureau ([email protected]); Monica Wroblewski, U.S. Census Bureau ([email protected]); Jennifer Hunter Childs, U.S. Census Bureau ([email protected]); Morgan Earp, U. S. Bureau of Labor Statistics ([email protected])

Panel Description: According to the Groves and Couper (1998) conceptual model of survey cooperation, the social environment plays a critical role in the outcome of a survey request. Cooperation with surveys is subject to societal change, for example, shifts in the demographic composition of a population or public opinion among the members of a society. Levels of trust in government, political alienation, and privacy and confidentiality concerns are all part of the social environment for surveys. And although the social environment is considered to be a fairly fixed attribute, Groves and Couper warn that it should not be ignored. This is because it influences decision making, its importance changes over time, and it exhibits variation among subgroups of the population. This panel will report on the survey environment from several perspectives using different data sources. The first paper reports findings from the 2010 Census Integrated Communication Program Evaluation survey. The authors discuss the impact that an intervention introduced into the survey environment (the 2010 Decennial Census social marketing campaign) had on Census awareness, attitudes and knowledge. Using a panel component of the same survey, the second paper will report on five different census “mindsets” and how they shifted as a result of the campaign. The third paper will report on a more recent survey to determine if mindsets towards the census have changed since the decennial census environment and in light of the current economic, political, and social climate. The final two papers describe a new federal interagency initiative designed to continuously monitor the survey environment by way of a daily tracking survey. The first reports on the development of new questions designed to measure public trust in official statistics. The second provides results from a field test of these questions and describes the plan for ongoing data collection.

First Abstract: The 2010 Census Integrated Communications Program: A Comprehensive Effort to Alter the External Survey Environment A. Rupa Datta and Ting Yan, NORC As part of the 2010 Decennial Census, the U.S. Census Bureau mounted the 2010 Census Integrated Communications Program, a multi-faceted effort to improve public awareness, attitudes and knowledge about the Census in order to increase Census participation. This type of communications program is an extreme case of how an external survey environment can be altered, and of the potential of that altered environment to affect survey participation. The 2010 Census Integrated Communications Program Evaluation (CICPE) was conducted by NORC at the University of Chicago to evaluate the effectiveness of the ICP communications campaign. This paper will present data on the extent to which the 2010 ICP was able to affect knowledge and attitudes about the Census in the months leading up to Census Day. Constructs include knowledge of the Census and positive and negative attitudes about the Census, as well as some previously documented correlates of survey participation, such as civic participation, awareness of current events, and voting behavior. Some analyses will also describe the relationship between different levels of attitudes and knowledge and subsequent Census participation.

Second Abstract: Did the 2010 Census Social Marketing Campaign Shift Public Mindsets? Nancy Bates and Mary Mulry, U.S. Census Bureau In the research leading up to the 2010 U.S. Census, the team developing the paid advertising campaign used results from a 2008 pre-Census survey to construct different “mindsets”. Items used to construct the mindsets included knowledge of Census data uses, data privacy and confidentiality concerns, concern about data misuse, and self-reported intent to participate in the Census. Using discriminant analysis, five different mindsets were fashioned. These were labeled: The Insulated, the Unacquainted, the Head-Nodders, the Leading Edge, and the Cynical Fifth. These mindsets were subsequently used to develop advertising messages aimed at encouraging Census participation. As part of the evaluation of the 2010 Census communications campaign, the Census Bureau sponsored a 3-wave survey conducted before, during, and immediately after the Census. Many of the same items asked in the 2008 survey were included in the 2010 survey. After the Census, addresses of households participating in the multi-wave survey were matched to Census records. This allowed researchers to add an important variable to the data file -- whether the address had participated in the Census by mail back or personal visit. In this paper, we recreate the 5-category mindset variable using the abbreviated set of items from the evaluation survey data. Using the panel component of the survey, we examine how membership in the mindsets shifted over the course of the communications campaign. We also examine how the mindsets correlated with actual mail back behavior, message receptivity to the paid advertising, and the geographic audience segmentation for the campaign. We conclude by tying our results back to two new research efforts – a survey that measured the post-decennial Census mindsets and a new daily tracking survey designed to measure public trust in official statistics.

Third Abstract: Mindsets Revisited: Results of the Second Iteration of the Census Barriers, Attitudes, and Motivators Survey Monica J. Wroblewski The U.S. Census Bureau implemented a multi-million dollar communications campaign for the 2010 Census designed to increase awareness and participation. In part, the design of the paid advertising component relied upon results from the Census Barriers, Attitudes, and Motivators Survey (CBAMS) fielded in summer 2008. This multi-mode survey with oversampling in hard-to-count populations resulted in five distinct attitudinal segments or mindsets, each with its own unique set of characteristics that were used to help target census messaging. After the 2010 Census, the Census Bureau commissioned the second iteration of CBAMS to determine the degree to which these census mindsets have changed since the decennial census and in light of the current economic, political, and social climate. Our research found evidence of the original mindsets shifting in favor of the census, and the results also suggest moving forward with a new segmentation structure composed of seven mindsets. In this paper, we will describe the modified CBAMS II design and analysis, and we will profile each mindset in detail. These attitudinal segments will help to determine how often and what type of market research is conducted over the next decade to support communications for the 2020 Census and will ultimately be used to shape census messaging.

Fourth Abstract: Development of the Federal Statistical System Public Opinion Survey Jennifer Hunter Childs, U.S. Census Bureau (presenting author) Stephanie Willson, National Center for Health Statistics Shelly Wilkie Martinez, Office of Management and Budget Laura Rasmussen, Monica Wroblewski, U.S. Census Bureau The U.S. Census Bureau is partnering with other federal statistical agencies to understand public trust in official statistics in the United States. This interagency group is commissioning a public opinion survey of attitudes toward statistics produced by the federal government over the next two years. The study looks at trust in the federal statistical system, the credibility of federal statistics, and attitudes toward and knowledge of the statistical uses of administrative records. The research follows similar efforts in various European countries that have traced public views of official statistics there. Once the study has been fielded, we will be able to make comparisons between attitudes observed in the US and those measured in Europe. The longitudinal design of the study will also allow us to observe how current events affect public perceptions of the federal statistical system. This paper describes the development of the questionnaire employed to measure trust in official statistics in the United States. The interagency group authoring this questionnaire conducted cognitive testing involving different versions of questions to measure trust, awareness of federal statistics and attitudes toward administrative records use. Results provide preliminary findings about the structure of public opinion towards these topics in the U.S. and informed the selection of questions for a field pretest.

Fifth Abstract: Factors of Trust in Statistics that Influence Public Perceptions of Use of Administrative Records Morgan Earp, U.S. Bureau of Labor Statistics (presenting author) Melissa Mitchell, U.S. National Agricultural Statistics Service Jenny Hunter Childs, U.S. Census Bureau Peter , U.S. Census Bureau Shelly Wilkie Martinez, U.S. Office of Management and Budget In an effort to explore the public’s trust of official statistics in the United States and attitudes towards the use of administrative records, the Census Bureau is collaborating with several agencies to develop a measure of trust in statistical products, trust in statistical agencies, and attitudes towards use of administrative records. We will use this measure to monitor the public’s trust level and how this impacts their attitudes towards use of administrative records. During the construct and item development phase, we consulted similar models (Fellegi et al. & UK Office for National Statistics), conducted expert reviews, and completed cognitive interviews. The constructs we sought to measure were trust in statistical products (accuracy, credibility, objectivity and relevance), trust in statistical agencies (confidentiality protected, integrity, openness/transparency and impartiality), and attitudes towards use of administrative records. This paper focuses on the construct and item evaluation phase as well as early trend results. We will present the theoretical model we developed, the methods used to identify and validate the structure of this model and some early findings based on the initial month of data collection.

Thursday 3:00 p.m. - 4:00 p.m.

Demonstration Session #1

Completing Complex Intercept Surveys on Cell-enabled iPads James J. Dayton, ICF ([email protected]); Heather Driscoll, ICF ([email protected]); Robert S. Pels, ICF ([email protected])

Intercept field data collectors working in outdoor environments using electronic devices face a number of challenges traditional paper and pencil data collectors do not. For example, these electronic devices must be easy to use in difficult environments; their programmatic solutions must efficiently mimic the flexible, dynamic functionality that is baked into paper-based data collection tools; and interviewers must juggle interviewing several respondents participating in the same activity simultaneously on a single electronic data collection device, rather than filling out multiple paper questionnaires concurrently. ICF’s solution? The “AppPI” (App for Personal Interviewing)—a data collection application designed for cell-enabled tablets.

During this presentation, ICF will demonstrate our second generation cell phone-enabled iPad, created to interview anglers working from docks and beaches and on boats. The demonstration will review how: visual aids assist anglers and interviewers in identifying various fish species; the camera feature captures a visual record of species for later identification by ICF fisheries biologists if needed; and the GPS function can be used to assure interview quality and track members of a large field force. We will also demonstrate how AppPI workforce management applications can be used to direct interviewers to specific locations at specific times, as well as track progress toward a variety of specific interview quotas. Finally, we will demonstrate how the AppPI can be configured to mimic ease of use and other functional advantages of the current clipboard paper and pencil system through the use of flexible comment fields and the ability to toggle across multiple ongoing surveys.

A Demonstration of a Multi-Platform Mobile Survey Application: SurveyPulse™, by RTI International David James Roe, RTI International ([email protected]); Yuying Zhang, RTI International ([email protected]); Michael Keating, RTI International ([email protected])

The landscape of survey research is changing drastically as a result of advances in mobile technologies and increased accessibility. As Smartphone coverage in the U.S. nears 35%, and associated technologies become increasingly powerful, some predict Smartphones will be able to perform many of the functions of current desktop computers and laptops, leading to more people accessing the internet on mobile devices than on desktop PCs. As such, it is only reasonable to expect that capturing data via mobile surveys will be an important task the survey research industry must face in the near future.

Smartphone survey applications have the potential to offer a robust set of features to researchers: instant location data, multimedia access including video and the use of a still camera, and better respondent communication tools such as push notifications, email and SMS (text). Currently there are many paid survey offerings that collect information via SMS and mobile web browsers. However, application based offerings allowing users to enter and submit responses to questions in a customized survey application are less prevalent.

This demonstration introduces SurveyPulse™, by RTI International. SurveyPulse™ enables delivery of surveys to users of multiple mobile devices, including tablets that utilize various platforms and operating systems (Android, iOS, OSX, Windows, Mac, RIM), allowing data to be collected in real time. The app can be used to collect data from a panel of users where surveys are pushed to users’ devices based on certain selection criteria, or distributed to respondents on a study-by-study, client-by-client basis. User engagement can be maintained through the sharing of real-time results and aggregate, top-level statistics. In addition, users can be notified about new and still-to-be-completed surveys automatically, contributing to more representative data, higher response rates and lower operational costs.

Sociometric Badges: Using Wearable Sensors to Measure Behavior Ben Waber, Harvard Business School ([email protected])

I will discuss how a wearable sensing platform, the Sociometric Badge, allows us to measure and analyze human behavior in real-world settings and in very fine detail. The Sociometric Badge is capable of automatically measuring the amount of face-to-face interaction, conversational time, physical proximity to other people, and physical activity levels using social signals derived from vocal features, body motion, and relative location to capture individual and collective patterns of behavior. Through a series of studies, we show how we can use the badges to measure persuasiveness, interest levels, and social support. Finally, we detail how we have used the badges in real companies to transform organizational design and deepen our scientific understanding of management.

Poster Session 1

Web Survey with ABS Sample-A Viable Alternative to RDD? Jun Suzuki, Research Into Action, Inc. ([email protected])

This Methodological Brief describes our 2011 experience administering a sequential mixed-mode design with Address-Based Sample (ABS) as part of a fourth consecutive annual study of Oregon households’ energy use behaviors and attitudes. For the three previous studies, we completed all interviews by phone using Random Digit Dialing (RDD) of landline and cell phone numbers. Faced with the challenges of reaching Oregon’s growing number of cell-phone-only households (31%, 24% nationwide) and rising phone data collection costs, the research team sought an alternative data collection method.

In the first phase of our research, we mailed initial and reminder postcards to 4,000 randomly selected ABS households, inviting them to participate in a web survey. We offered a cash lottery incentive to entice participation. In the second phase, we attempted to reach the ABS web survey non-respondents by phone. Although we received phone numbers for half of the ABS sample, this was not enough to fill the quotas. For this reason, we conducted additional RDD calls to both landline and cell phone numbers proportionate to the rate of Oregon’s cell-phone-only households.

Interestingly, those responding to the web survey from the ABS sample most closely matched the known demographic characteristics, including phone status (cell-phone-only, landline-only, and cell-and-landline), of our target population. Given Oregon’s high rate of households with internet access (81%, 74% nationwide) and the representativeness of the ABS web survey respondents, this ABS web method could provide a cost-effective alternative to conventional RDD approaches for general household surveys while ensuring coverage of cell-phone-only households. To address non-coverage of households without internet access, a phone survey option can still be included. A key challenge was a low response rate, which can be remedied by increasing the initial mailing and encouraging participation by mailed letter with an enclosed incentive, rather than by postcard.

Reuniting with Retirees: Determining the Effectiveness of Locating Older Adults Through Milestone Reunions Sabine K. Horner, American Institutes for Research ([email protected])

Project Talent is a large-scale nationally representative longitudinal study that first surveyed the aptitudes, abilities, interests, and aspirations of 400,000 American high school students in 1960. Plans to follow up with this cohort, now in their late sixties, depend on the ability to locate an analytically large and representative sample of the original participants. This paper determines whether the tracking and outreach activities undertaken by the Project Talent team at the American Institutes for Research (AIR) are yielding a sample that is representative of the original Project Talent cohort. In 2011, AIR leveraged the opportunity presented by the 50th reunions taking place for the classes of 1961 as a way of reaching a significant number of original study participants. Representatives from Project Talent attended reunions to reconnect with sample members or asked reunion organizers to share information with their classmates on behalf of the project. This paper examines whether the schools reached through the 2011 outreach activities are representative of the entire sample of 957 high schools, based on the indicators of school size (indicative of the size of the community in which that school was situated), geographic region, per pupil spending, school type (private, public, parochial), and college admission rates. Secondly, the paper examines the students from the class of 1961 who registered themselves with Project Talent as a result of AIR’s reunion outreach efforts and, using the measures of gender, geographic region, socio-economic status, and several cognitive and personality indicators from the original 1960 survey, evaluates whether these students are representative of the entire cohort of participating high school juniors. This analysis will show the effectiveness of this strategy for locating and engaging study participants and determine whether some older populations can be reached more effectively through milestone reunions than others.

Changing Survey Modes: Does it Matter How You Get There? Felicia LeClere, NORC at the University of Chicago ([email protected]); Jennifer Vanicek, NORC at the University of Chicago ([email protected]); Kanru Xia, NORC at the University of Chicago ([email protected]); Amaya Ashley, NORC at the University of Chicago ([email protected]); Whitney Murphy, NORC at the University of Chicago ([email protected]); Kari Carris, NORC at the University of Chicago ([email protected])

The administration of multiple modes of data collection can occur at the initial design in order to offer more simultaneous response options. It can also occur dynamically in response to incomplete sampling frame information or as a follow up to those persons who have not responded or refused to respond to the initial mode of interview. In this research, we use questionnaire and paradata from the Racial and Ethnic Approaches to Community Health across the U.S. (REACH U.S.), which is an ABS multi-mode study in 28 communities, to determine whether the impact of mixed questionnaire modes on survey performance and measurement error differs by the operational reason for switching modes. REACH takes five routes to contact and interview respondents including those that begin with in-person, mail, and telephone. We will compare two of those paths: (1) those households whose addresses are obtained from the USPS Delivery Sequence File (DSF) but cannot be matched to a phone number and, thus, must be mailed a questionnaire and (2) those households that, despite having a telephone number available, “cannot be reached by phone” because they never pick up the telephone or refuse through multiple attempts. Estimating mode effects for survey statistics is complicated by the admixture of the selection of persons into mode, which in this case does not occur through an initial selection by the respondent but for other reasons, and by the consequence of delivering a questionnaire in a different mode. We will use propensity matching models to disentangle these sources of error on questions from REACH U.S. that may be particularly vulnerable to both selection and measurement effects. Past research suggests that self-reported height and weight, physical activity, and consumption of fruits and vegetables are vulnerable to misreporting due to social desirability. We will focus our analyses on these key variables.
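
One way to operationalize the propensity matching described above is sketched below; this is illustrative only, and the covariates, outcome, and file names are hypothetical rather than the study's actual specification.

    import pandas as pd
    from sklearn.linear_model import LogisticRegression
    from sklearn.neighbors import NearestNeighbors

    # Hypothetical respondent file: mode path (mail vs. phone), covariates, and an outcome.
    df = pd.read_csv("reach_respondents.csv")
    X = df[["age", "female", "income"]]

    # Step 1: model the propensity of ending up in the mail path.
    df["pscore"] = LogisticRegression(max_iter=1000).fit(X, df["mail_path"]).predict_proba(X)[:, 1]

    # Step 2: match each mail-path case to the phone-path case with the closest propensity.
    mail = df[df["mail_path"] == 1]
    phone = df[df["mail_path"] == 0]
    nn = NearestNeighbors(n_neighbors=1).fit(phone[["pscore"]])
    _, idx = nn.kneighbors(mail[["pscore"]])
    matched_phone = phone.iloc[idx.ravel()]

    # Step 3: the remaining gap between matched groups is closer to a pure mode (measurement) effect.
    print(mail["self_reported_weight"].mean() - matched_phone["self_reported_weight"].mean())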

Multiple Email Reminders and Response Rate for an Internet Based Survey Robert Brackbill, New York City Department of Health and Mental Hygiene ([email protected]); Shengchao Yu, New York City Department of Health and Mental Hygiene ([email protected]); Deborah Walker, Department of Health and Mental Hygiene ([email protected]); Lennon Turner, Department of Health and Mental Hygiene ([email protected]); Sara Miller, Department of Health and Mental Hygiene ([email protected]); Mark Farfel, Department of Health and Mental Hygiene ([email protected]); Steven Stellman, Department of Health and Mental Hygiene ([email protected])

The World Trade Center Health Registry (WTCHR) is currently conducting its third wave (W3) of data collection. Wave 1 (W1), conducted between September 2003 and November 2004, yielded a 71,437-person cohort for understanding health effects from the September 11, 2001 terrorist attack on the World Trade Center. Wave 2 (W2), completed between November 2006 and December 2007, had 46,322 adult participants. W3, like W2, used mixed mode: internet, paper, and phone. A greater rate of completion for web surveys results in cost savings for paper and phone modes. This study assessed the impact of multiple email reminders on internet-based Wave 3 survey response.

The literature on the effect of email reminders on internet-based surveys is generally limited to only a few reminders within a set period of time (e.g., Cook, Heath & Thompson, 2000; Archer, 2007).

39,386 WTCHR enrollees who had verified email addresses were invited in batches to complete the W3 internet-based questionnaire. Several weeks after the last batch of email invitations, the first reminder email was sent to non-responders to the invitation (n=33,456). Up to 12 reminders will be sent at 7-10 day intervals, on different days of the week, and with different subject lines.

Daily response rates before and after email reminders were compared by using a difference ratio. The first email reminder had the greatest impact, with a difference ratio of 21. After nine reminders the difference ratio was 4, indicating that even after multiple instances of contact, reminders continued to have a substantial impact on response rate. Further analysis will address other questions such as: a) What is the relative lag of impact on the base response rate following each reminder? and b) How does email reminder impact vary by mode of participation in the prior survey wave?
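
A toy calculation of the difference ratio used above, under the assumption that it is the ratio of the daily response rate just after a reminder to the rate just before it (the numbers below are invented for illustration):

    # Hypothetical daily completes among remaining nonrespondents.
    daily_completes_before = 40    # average in the days just before a reminder
    daily_completes_after = 840    # average in the days just after the reminder

    difference_ratio = daily_completes_after / daily_completes_before
    print(difference_ratio)        # 21.0, comparable to the first reminder's reported impact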

Trends in Residential Energy-Use Attitudes and Behaviors, 2002-2010: Did the Great Recession Have an Impact? Carla Jackson, Abt SRBI, Inc. ([email protected])

Since 2002, Abt SRBI has conducted a biennial telephone survey about energy-related attitudes and behaviors with a randomly-selected, nationwide sample of 800 consumers. Topics addressed in a comprehensive 18-minute interview include: attitudes and behaviors related to energy conservation and efficiency; ENERGY STAR® awareness; residence and demographic characteristics; and related topics.

Our hypothesis for the 2010 survey was that the Great Recession created unprecedented pressures on households to reduce their energy use and to save money on their energy bills and that these challenges would lead to more positive attitudes about saving energy and the implementation of energy-saving measures. We also expected to see less concern about the environment as economic issues were of primary importance to most consumers.

Major findings from the survey supported our hypotheses:

• More respondents reported saving money as their primary motivation for saving energy over protecting the environment.
• There was an increase in the percentage of consumers who reported doing more to save energy in their homes.
• Purchases of many types of appliances and electronic equipment decreased from 2008 to 2010. The cost of equipment was a barrier for many consumers.
• ENERGY STAR awareness increased significantly from 68 percent in 2008 to 79 percent in 2010.
• Consumers’ concerns about the environment significantly decreased between 2008 and 2010, including a decrease in the percentage convinced that global warming is occurring.

These are among the attitudinal and behavioral changes related to saving energy that will be presented in the poster, which will also describe differences among consumer segments with respect to major survey indicators.

Targeting Precise Geographic or Localized Areas Using an Address Based Sample Frame Anna Fleeman, Abt SRBI Inc. ([email protected]); Tiffany Henderson, Abt SRBI Inc. ([email protected]); John Boyle, Abt SRBI Inc. ([email protected]); Kenneth J. Ruggiero, Medical University of South Carolina ([email protected])

Abt SRBI, in partnership with the Medical University of South Carolina, conducted a study of adolescents living close to the touchdown points of severe tornadoes (F3 or greater). For sampling and weighting purposes, standard RDD and address-based sampling (ABS) frames are based on county; however, for very precise geographic or localized areas, a county-based sample is not efficient given the lengthy screening process and low incidence. To reduce the screening and maximize the incidence of adolescents most affected by the tornadoes, we used a highly targeted ABS frame based on latitude/longitude coordinates. First, the paths of the selected tornadoes were plotted and the coordinates identified. Second, Census Block IDs were assigned by plurality to increasing radii around each coordinate (e.g., 0.5 mile, 2 miles). These radii served as the strata from which the nearly 50,000 addresses were randomly selected. Third, phone numbers were appended if a match could be made. Addresses unable to be matched to a phone number were sent a letter and a screening questionnaire that identified households with children, determined phone status, and requested contact information. Addresses for which phone numbers could be matched were called and screened for presence of children. Baseline phone interviews were then conducted with eligible households from both samples. Presented findings will include response and incidence rates by strata and sample type as well as household displacement and mapping. Further, the feasibility, efficiency, and potential error related to this type of ABS will be discussed. Results will provide insight with regard to the methods of effectively sampling populations in very precise geographic areas using ABS.
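
A minimal sketch of the geographic stratification step, assuming tornado path points and geocoded ABS addresses are available; the radii, field names, and coordinates below are hypothetical, not the study's actual values.

    import math

    def haversine_miles(lat1, lon1, lat2, lon2):
        # Great-circle distance in miles between two latitude/longitude points.
        r = 3958.8  # Earth radius in miles
        p1, p2 = math.radians(lat1), math.radians(lat2)
        dp, dl = math.radians(lat2 - lat1), math.radians(lon2 - lon1)
        a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
        return 2 * r * math.asin(math.sqrt(a))

    def assign_stratum(address, tornado_points, radii=(0.5, 2.0, 5.0)):
        # Assign an address to the smallest radius band around any tornado path point.
        nearest = min(
            haversine_miles(address["lat"], address["lon"], p["lat"], p["lon"])
            for p in tornado_points
        )
        for radius in radii:
            if nearest <= radius:
                return radius
        return None  # outside all bands; not sampled

    # Example with made-up coordinates:
    path = [{"lat": 33.45, "lon": -86.90}]
    print(assign_stratum({"lat": 33.46, "lon": -86.91}, path))  # prints 2.0 (the 2-mile band)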

Effective Recruitment and Coaching Method for Long Term Panels: Using Testimonial Videos to Gain Cooperation and Improve Task Compliance Ana P. Petras, Nielsen ([email protected]); Anh Thu Burks, Nielsen ([email protected]); Rosemary Holden, Nielsen ([email protected]); Michael W. Link, Nielsen ([email protected])

Recruiting participants with little or no knowledge of the organization conducting the research can be challenging primarily due to credibility barriers and limited understanding of the benefits of participation. Gaining and sustaining task compliance after participants have agreed to participate can also be challenging primarily due to lack of motivation to complete the task or lack of knowledge on how to accurately complete the task. In 2011, Nielsen deployed two videos – a recruitment video and a coaching video – in its television ratings panel to help address these challenges.

Both videos utilize testimonials from former (and diverse) Nielsen homes that have participated in the panel, as well as testimonials from users of the data (i.e. media executives). The videos also utilize targeted motivational concepts for low cooperating demographic groups. The videos were produced in English and Spanish and subtitled in Chinese, Korean and Vietnamese. Here, we present the use of testimonial videos with targeted motivational content as effective tools to gain and maintain cooperation in long term panels. Results from a field staff survey on video usage and other metrics to measure their effectiveness will be discussed.

Response Effects of Symbolic Images in Satisfaction Scales Ziv Tepman, Google ([email protected]); Vani Henderson, Google ([email protected])

Satisfaction scales that employ symbolic images instead of text labels allow for greater comparability of responses across languages, eliminate survey translation burden, and avoid the ambiguity of purely numeric scales. However, little research has been done comparing "image-labeled" satisfaction scales with "text-labeled" scales. In this experiment, English-speaking respondents to a web-based customer satisfaction survey were randomly assigned to answer a series of three 7-point satisfaction questions in one of three ways: facial expression scales, text-labeled scales with labels at every point (running from "extremely dissatisfied" to "extremely satisfied"), or fully-labeled text scales with the addition of facial expressions at the end points only. This last condition was included to examine whether respondent transposition errors or primacy effects can be mitigated with a combined text/image scale. Key outcome measures were: the distribution of responses to the satisfaction questions, the correlation between a satisfaction question and questions measuring closely related constructs, and drop-off rates. For two of the three satisfaction questions, the facial expression scale produced average responses that were significantly lower (p<0.01) than either of the text-labeled scale conditions. Fewer respondents marked the top two categories of the facial expression scale than the scales with text-labeled points, producing a more dispersed and symmetrical distribution of responses. The facial expression scale also showed greater concurrent validity with questions measuring similar constructs than the two verbally-labeled scales. The two versions of the text-labeled scale produced nearly identical distributions of responses, and no significant differences in drop-off were found across the three conditions. Though results may vary depending on the specific verbal labels or images used, these findings suggest that a facial expression satisfaction scale can be equally or more valid than a text-labeled one while avoiding the burden of translation and the ambiguity of numeric scales.

The Relation Between Visual Imagery and Attitudes About Social Issues and Types of People John D. Edwards, Loyola University Chicago ([email protected]); Patrick R. Harrison, Loyola University Chicago ([email protected])

The ubiquity of new mass communication and social media has produced, among other results, an unprecedented availability of visual images of people and events. What is the relationship between the properties of these images and attitudes toward the entities that are seen? To address this question a sample of 112 university students described their visual images of 12 objects and their evaluations of those images, and recorded their attitudes toward each object using Semantic Differential scales of their beliefs, feelings, and action tendencies about each object. The objects consisted of three representatives each of four categories: US Presidents (e.g., Clinton), types of stigmatized people (e.g., illegal immigrants), social institutions (e.g., religion), and social issues (e.g., same-sex marriage). Overall, the relation between visual image favorability and total attitude scores combining belief, feeling, and action components was moderately high (r = .62), but this varied across categories: Presidents (r=.69), Issues (r=.64), Institutions (r=.56), Types of people (r=.29). Within categories, correlations were virtually the same for all 3 Presidents (range .74 to .76), but varied for Issues (range .55 to .69), Institutions (range .48 to .69), and especially Types of people (range .22 to .48). In general, the favorability of visual images correlated more with the cognitive (r=.66) and affective (r=.62) than with the behavioral component (r=.41), although this pattern varied somewhat among and within the object categories. Current work in our research program is exploring potential reasons for these variations in terms of visual image properties such as vividness, accessibility, and stability. We expect that a more comprehensive "picture" of attitudes on social issues, political figures, and other topics can be obtained by employing measures not only of the traditional cognitive, affective, and behavioral components but also of the visual images that appear in the "mind's eye."

Do Respondents' Self Reported Behavior Differ Over Time? Marla D. Cralley, Arbitron Inc. ([email protected])

Often self-report survey instruments such as time-use or diet diaries are used to measure specific respondent behaviors. Concerns arise around the accuracy of the reports.

Arbitron PPM, a system to passively collect Radio and Television media use over time among an ongoing panel of respondents, has been implemented in 48 top U.S. metros, replacing the traditional paper self-report radio diaries previously used in these markets. During July and December of 2008, Arbitron asked a test sample of the same PPM panelists to keep one-week radio diaries while continuing to wear their meters. Results of the first study suggest that respondents actually embellish their reports in an attempt to satisfy the researcher. Panelists also alter their actual behavior during the survey period, resulting in higher listening levels when the diary keeping task drew attention to radio.

This paper is a follow-up to the initial analysis and will include the results of the initial July 2008 study compared to findings from the December 2008 study conducted with the same PPM panelists. We will report on how the level of satisficing found in the first study results changed when panelists were asked to repeat the diary keeping task. Passively collected listening of the same panelists will be trended and compared to decipher whether the amount of satisficing decreases when respondents are asked to keep subsequent diaries. Patterns of differences will be compared and contrasted by demographic characteristics.

Also, the proportion of sample actually contributing to overall listening pattern changes and the level of differences at a panelist level will determine whether only a few respondents actually change their behavior a lot or whether a lot of panelists increase their listening and reporting only a little. These findings will help us as researchers to understand what happens when we ask respondents to self-report behaviors.

The Social Economic Determinants of Suicide Rates of the Elderly in Taiwan’s Aging Society Wen-jen Hsieh, National Cheng Kung University ([email protected])

According to the World Health Organization (WHO), suicide is considered one of the major causes of death globally. Compared to WHO data, Taiwan’s suicide rates have been higher than the worldwide average. Furthermore, the suicide rates of the elderly aged 65 and above have been the highest among the age groups in Taiwan. Hence, this paper investigates the social economic determinants of suicide rates of the elderly aged 65 and above in Taiwan’s 23 cities and counties. The explanatory variables include the ratio of the population that has completed 12 years of education to the population aged 15 and above, the crude divorce rate, the elderly dependency ratio, the female labor participation rate, the share of the low-income population, and household disposable income. We use 1998-2009 statistical data from the Department of Health and the Department of Household Registration of Taiwan’s government, for suicide rates and marital status respectively, in all counties and cities to compile a set of panel data, which is then analyzed with a fixed-effects regression model. The estimated results indicate that the elderly suicide rate is significantly positively correlated with the crude divorce rate and negatively correlated with household disposable income. Therefore, higher divorce rates, which result in broken families, and lower disposable income may directly or indirectly affect the suicide rate of Taiwan’s elderly. The lack of care and financial support from the families of the elderly also highlights the actions needed from the central and local governments in this aging society.
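
The fixed-effects estimation described above could be sketched roughly as follows, with county dummies implementing the fixed effects; all variable and file names are hypothetical stand-ins for the covariates listed in the abstract.

    import pandas as pd
    import statsmodels.formula.api as smf

    # Hypothetical panel: 23 counties/cities observed annually, 1998-2009.
    panel = pd.read_csv("taiwan_elderly_panel.csv")

    # County dummies absorb time-invariant differences across counties (fixed effects).
    fe = smf.ols(
        "elderly_suicide_rate ~ crude_divorce_rate + disposable_income"
        " + elderly_dependency_ratio + female_labor_participation"
        " + low_income_share + education_ratio + C(county)",
        data=panel,
    ).fit()
    print(fe.summary())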

Dual Frame Sample and Mixed Mode Survey Strategy for Improving Coverage Error John Tarnai, Social & Economic Sciences Research Center ([email protected]); Lori Pfingst, Washington State Budget & Policy Center ([email protected]); David Solet, Public Health-Seattle & King County ([email protected])

How to develop appropriate and affordable methods for selecting representative samples is an increasingly difficult problem. Response rates and contact rates have been falling and it is becoming increasingly difficult to reach people through traditional single mode survey methods. In this study we examine the effectiveness of using both an ABS and an RDD frame for representing a major metropolitan area, and for allowing respondents to complete the survey by telephone, mail, or Internet. The study was conducted with an address based sample of 6,400 residents of a major metropolitan area, and an RDD sample of 9,944 telephone numbers. The questionnaire was designed as a 12-page booklet consisting of 40 questions about community health. Response rates for both the mail portion of the sample and the telephone RDD sample were comparable at 28% to 29%. A total of 1,176 respondents completed the survey by telephone; 1,728 completed it by mail, and 205 completed it on the Internet. Significant differences were found for many questions, by survey mode and also by sample frame. The paper focuses on assessing the adequacy of each sample frame as well as the combined responses in representing the households in the study area, in comparison to census demographics. The paper compares the demographics and other findings by survey mode and by sample frame and discusses the implications of these results for designing surveys of households.

Electronic Democracy for Whom? Understanding Demand of Brazil's Chamber of Deputies Website Max Stabile, University of Brasilia ([email protected]); Carlos Batista, University of Brasilia ([email protected]); Deborah Cancella Pinheiro Celentano, University of Brasilia ([email protected])

This article proposes to understand how citizens have interacted with the political system through the new possibilities offered by new information and communication technologies (ICTs). Who are the users of these new tools? How do they evaluate these new channels? What is their opinion on the new possibilities of electronic participation? The main theoretical aspect addressed in the article is the debate over whether the internet replicates traditional forms of participation, or whether it is truly capable of engaging more citizens, including those who are disengaged or disinterested. These questions were directed at the Brazilian Chamber of Deputies website, which over the last few years has adapted to promote ample access to legislative information, offers interactive resources for contacting Brazilian parliamentarians, and is considered the best legislative web portal in South America.

This paper used two distinct methodologies: i) analyzing website access data and statistics, thus identifying the patterns of access, reference websites and search words used to reach the website and ii) conducting a WebSurvey to collect users’ opinions. The questionnaire was used to collect assessments from the users on the tools available on the website, to identify the profile of these users and understand their political behavior in the offline world. The survey was sent to over two hundred thousand email addresses registered on different services of the website and was also available to any user who accessed the website.

Effects of Progress Indicators on Short Questionnaires Aaron Sedley, Google ([email protected]); Mario Callegaro, Google ([email protected])

Perhaps more than paper or telephone modes, online surveys face significant data quality concerns due to breakoffs. Online surveys that are time-consuming or cognitively demanding can be closed with a simple mouse click. And more broadly, the context of Internet usage may involve more distraction and competition for respondents’ time and attention to complete surveys, compared to phone and paper surveys.

There have been dozens of studies on the impact of progress indicators on breakoff rates, with a meta-analysis showing inconsistent effects across comparable studies for constant rate progress bar indicators (Callegaro, Yang & Villar, AAPOR 2011). However, most progress bar experimentation has been conducted on long questionnaires (>10 minutes median completion time). Drop-off on shorter online surveys is still a concern, and the impact of progress bars on these surveys is largely unknown. Furthermore, the horizontal width of a progress bar could impact its effect, which has yet to be examined.

We use a split-ballot experiment on a short online survey (median response time <3 minutes) to evaluate the impact of using progress bars of two widths (wide and narrow) compared to a control group without a progress bar. We compare breakoff rates, item & unit nonresponse, and response time to provide initial experimental evidence to understand the effects of progress indicators on short questionnaires and a potential relationship with the width of the progress bar. Our hypothesis is that, keeping everything constant, the wider progress bar should produce fewer breakoffs than the narrow one because respondents can more easily see the movement of the bar. We also hypothesize that for short surveys the progress bar will reduce breakoffs in comparison to not showing the bar.

Data are currently being collected for this experiment, in a tracking survey of online product users’ attitudes.
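
For the breakoff comparison described above, a simple first-pass analysis could be a chi-square test across the three conditions; the counts below are invented for illustration and do not reflect the study's data.

    import numpy as np
    from scipy.stats import chi2_contingency

    # Hypothetical counts of [completed, broke off] for each condition.
    table = np.array([
        [950, 50],   # no progress bar (control)
        [955, 45],   # narrow progress bar
        [965, 35],   # wide progress bar
    ])

    chi2, p_value, dof, _ = chi2_contingency(table)
    print(chi2, p_value, dof)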

An Examination of the 2010 Census Be Counted Program and Its Effects on Census Coverage and Duplication Geoffrey I. Jackson, Census Bureau ([email protected]); Keith M. Wechter, Census Bureau ([email protected])

During the 2010 Census, the Census Bureau provided Be Counted questionnaires that were made available at various public sites nationwide, which allowed people to self-respond if they: a) did not receive a 2010 Census questionnaire, b) believed they were not included in their household’s original mail back questionnaire, or c) had no usual residence (including those experiencing homelessness). The Be Counted questionnaires collected the same demographic information that was present on the 2010 Census mail back questionnaires: relationship, age, sex, Hispanic origin, and race. Be Counted questionnaires were made available throughout the country in Questionnaire Assistance Centers and Be Counted sites. Questionnaire Assistance Centers provided the extra benefit of having a Census representative on site to assist respondents in completing the questionnaire or answering general questions about the 2010 Census.

This paper will provide information on how often and for what reasons the Questionnaire Assistance Centers were visited. We will also report on how many Be Counted forms were completed and how many people from Be Counted forms were counted in the final 2010 Census counts. We will examine types of living quarters that were occupied by people who completed a Be Counted form and whether Be Counted forms were more likely to be completed in urban, rural, or suburban regions of the country. Finally, we will look at issues with the questionnaire design and potential duplication caused by the processing of people with no usual residence.

Breaking Ground: Using Qualitative Data Analysis for Survey Development of an Under-researched Topic Clarissa R. Steele, Value-Added Research Center, University of Wisconsin-Madison ([email protected])

When researchers develop a survey, they can often use previous surveys, tested scales, and theory as the basis for their instrument. However, when little background information and theory is available about a topic, researchers must rely on other means to develop their instrument. In this study, a survey about how students are assigned to teachers in classrooms was developed using previous qualitative data analyses from three Midwestern urban school districts. The qualitative data collection included semi-structured interviews and focus groups with principals, former principals, and district staff as well as an observation of a school-level assignment meeting. From this analysis, two surveys, one for principals and one for teachers/staff, were created and then pre-tested in a fourth Midwestern school district. Cognitive interviews with principals and teachers revealed complexity in three key aspects of the survey: questions about data use in decision making and timing, idiosyncratic definitions of different types of classroom designs, and the grade-level precision of the assignment process. A concurrent qualitative data collection that included interviews, focus groups, and assignment meeting observations with teachers and principals in the same district revealed grade-specific adaptations of district artifacts and school-level data as well as the prevalence of particular characteristics and assignment practices across schools. The next phase of survey development includes a pilot study with the updated instruments planned for spring 2012.

Nonresponse in a Census of Chicago Public Schools Students: Relative Impacts of Schools, Principals, and Students Rachel Levenstein, Chicago Consortium on School Research, University of Chicago ([email protected]); Marisa de la Torre, Chicago Consortium on School Research, University of Chicago ([email protected]); Susan Sporte, Chicago Consortium on School Research, University of Chicago ([email protected])

Household survey response rates have certainly been declining in recent years (e.g., Groves 2008), and some research indicates that some establishment surveys are also dealing with lower response rates (DesRoches, 2008). But little research has focused on school survey nonresponse. As in other establishment surveys, school survey nonresponse is often affected by the presence of a gatekeeper. For student surveys administered in schools, a common gatekeeper is the school’s principal, who not only permits data collection, but also encourages and facilitates response. In addition, nonresponse at the student level is a function of contact and cooperation, not unlike household surveys (Groves & Couper, 1998). Survey nonresponse in in-school student surveys may therefore be characterized by a hierarchical nonresponse process, whereby student participation is conditional on principal cooperation. This paper will propose a model of contact and cooperation that operates in multiple levels within a school district. Using data from a census of Chicago Public Schools students, teachers, and principals as well as auxiliary record data, the impact of the school, principal, and students on survey nonresponse will be estimated. Hierarchical linear models will be used to estimate the contribution of these factors to response propensity.
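
The hierarchical modeling step might look roughly like the following random-intercept (students-within-schools) specification, shown here as a linear probability version; the field and file names are hypothetical, not the authors' actual variables.

    import pandas as pd
    import statsmodels.formula.api as smf

    # Hypothetical student-level file with a 0/1 survey response flag and school IDs.
    students = pd.read_csv("cps_student_census.csv")

    # The school random intercept captures school- and principal-level variation;
    # student covariates enter as fixed effects.
    hlm = smf.mixedlm(
        "responded ~ C(grade) + C(race) + C(gender) + prior_attendance",
        data=students,
        groups=students["school_id"],
    ).fit()
    print(hlm.summary())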

Home or Work or Both? Assessing the Role of Duplication of Website Visitations Using an Online Metered Panel Cristina Ion, Nielsen ([email protected]); Kumar Rao, Nielsen ([email protected]); Seema Varma, Nielsen ([email protected]); PengFei Yi, Nielsen ([email protected])

The Internet is all pervasive now. From its humble beginnings in the military five decades ago, the Internet has evolved into a dynamic medium that allows people to interact, collaborate and share information. In the U.S., more and more people are using the Internet than ever before. According to a 2009 PEW study (Rainie 2010), 74% of adults (ages 18 and older) use the Internet at home or some other location, compared to 48% in the year 2000. From this it follows that page views, which measure website activity as the number of distinct web pages served to a web user (Bhat, Bevans et al. 2002), are not indicative of the number of unique online users accessing the website because users can access the website from multiple locations. Therefore, to estimate reach, the proportion of online users who visited the website, we have to account for the duplication of website visitations from multiple locations. This estimation task is at the heart of this study and analysis.

In this study, we estimate duplication of website visitations to popular websites in the U.S. from an RDD-recruited online panel in which consenting panel members install software (a.k.a. a meter) on their home and work computers as part of their membership in the panel. The software passively tracks their online browsing and clickstream behavior at both home and work locations. We use multiple modeling approaches to study duplication of website visitations and, along the way, observe patterns in duplication across various demographic cohorts. In addition, we juxtapose these behavior-based estimates against survey data obtained from an online opt-in panel. We discuss the findings from the study and conclude with recommendations for future research.
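As a simple illustration of the estimation task, the fragment below computes a site's unduplicated reach after merging visits observed at home and at work for the same panelist. The panelist IDs, visit records, and panel size are made up for the example and do not reflect Nielsen's data.

```python
# Hypothetical metered-panel records: (panelist_id, location) pairs indicating
# that the panelist visited the site from that location during the period.
visits = [
    (101, "home"), (101, "work"),   # duplicated visitor: seen at both locations
    (102, "home"),
    (103, "work"),
    (104, "home"), (104, "work"),
]
panel_size = 10  # total active panelists (hypothetical)

home_visitors = {pid for pid, loc in visits if loc == "home"}
work_visitors = {pid for pid, loc in visits if loc == "work"}

naive_total = len(home_visitors) + len(work_visitors)   # double-counts panelists 101 and 104
unique_visitors = home_visitors | work_visitors         # deduplicated across locations
duplication = len(home_visitors & work_visitors)

print(f"Location-level visitor counts (summed): {naive_total}")
print(f"Unduplicated unique visitors:           {len(unique_visitors)}")
print(f"Home/work duplication:                  {duplication}")
print(f"Estimated reach: {len(unique_visitors) / panel_size:.0%} of panelists")
```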

Results from the National Survey of Fishing, Hunting, and Wildlife-Associated Recreation (FHWAR) Cell Phone and Debit Card Test: Response Rates Matthew Herbstritt, US Census Bureau ([email protected])

The Fishing, Hunting, and Wildlife-Associated Recreation Survey (FHWAR) team considers it necessary to research alternative methods to improve contacts and response rates for the FHWAR survey. Recently, budget restrictions have significantly reduced, and will continue to reduce, the ability to conduct computer-assisted personal interviews (CAPI). The 2011 FHWAR survey changed drastically compared to years past, with mostly computer-assisted telephone interviews (CATI) being used to contact sample cases instead of CAPI. CAPI interviews usually result in the highest response rate; this was the case for 2011 FHWAR Wave 1 interviewing. CATI interviewing costs less and can be a good alternative to CAPI, but its advantages come with major drawbacks. Matching phone numbers to addresses from the Master Address File (MAF) is not an exact science, and hanging up a phone is easier than denying an interviewer in person. For these reasons, among others, alternative designs need to be pursued to keep up with the persistent difficulties of obtaining survey responses. FHWAR has split the research population into three panels: advance letter and cell phone, advance letter and cash incentives, and advance letter only. Response rates will be tested for each individual panel against CATI and CAPI interviews. These panels will also be tested against one another using multiple means testing.

Is Past, the Future? Resampling Past Respondents to Improve Current Sample Quality Lawnzetta Tracie Yancey, The Nielsen Company ([email protected]); Lukasz Chmura, The Nielsen Company ([email protected]); Kumar Rao, The Nielsen Company ([email protected]); David Kurzynski, The Nielsen Company ([email protected]); Scott Bell, The Nielsen Company ([email protected]); Tim Dolson, The Nielsen Company ([email protected])

The Nielsen TV Ratings Diary service involves the use of a one-week TV diary survey for measuring TV ratings. While the service has been around for a while, it recently received a sampling makeover to address the diminishing coverage associated with landline random-digit dialing (RDD) surveys. Address-based sampling (ABS) replaced RDD as the sampling methodology for the diary service. While internal studies have found that the use of ABS helped improve the coverage of the diary service, concerns about low response rates to diaries continue to linger, especially among certain demographic cohorts (such as minorities and young adults). Therefore, in an effort to increase cooperation rates, improve the demographic representation of address-based samples, and reduce costs, the Nielsen Company in 2011 investigated re-contacting previous diary respondents in four key demographic cohorts: Age of Head of Household (AOH) less than 35, AOH between the ages of 35 and 54, Blacks, and Hispanics. This article describes this effort. Using an experimental design, newly sampled and re-contacted respondents in the four cohorts were provided with multiple response inducements (such as unconditional monetary incentives and reminder postcards) at various stages of the diary recruitment and response process. Our preliminary investigation has revealed some interesting findings. First, the role played by monetary incentives is in line with what we know about them in surveys and observational studies: incentivizing respondents is helpful in increasing cooperation rates, even among hard-to-reach demographic cohorts. Second, households that participated in a previous diary study are likely to participate again in the future. We will present quantitative evidence of the effect of response inducements on cooperation and television viewing activity (a.k.a. tuning) among new and re-contact households across demographic cohorts.

The Opinion Dynamics Surrounding Nuclear Energy in the U.S.: Exploring the Interplay of Risk Perceptions and Values on Public Support for Nuclear Energy Sara Yeo, University of Wisconsin-Madison ([email protected]); Kristin Runge, University of Wisconsin-Madison ([email protected]); Nan Li, University of Wisconsin-Madison ([email protected]); Dominique Brossard, University of Wisconsin-Madison ([email protected]); Dietram A. Scheufele, University of Wisconsin-Madison ([email protected]); Michael Xenos, University of Wisconsin-Madison ([email protected])

After an extended period of relative calm with respect to plant accidents, the use of nuclear energy as an alternative to conventional greenhouse-gas-producing technologies has been regaining public attention in the United States, yet few studies empirically examine the complex interplay of predispositional, cognitive, and attitudinal influences shaping public attitudes toward this issue. Using data from a 2010 Knowledge Networks survey (N=1153) completed in the months prior to the Fukushima Daiichi disaster, this study examines the dynamics linking value predispositions, deference to science, and risk/benefit perception as predictors of support for nuclear energy, and serves as an important benchmark for pre-Fukushima public sentiment in the U.S. Our results show that support for nuclear energy was chiefly predicted by perceived risks and benefits associated with nuclear power. Ideological beliefs and confidence in safety and regulatory systems were also statistically significant predictors of support for nuclear energy, with religiosity influencing support as a mediator of benefit perception. Given the recent disaster in Japan, and the increase in applications for new nuclear units submitted to the U.S. Nuclear Regulatory Commission between 2007 and 2011, this analysis provides useful insights for policymakers, public opinion researchers and media effects scholars in understanding the dynamics underlying public support for nuclear energy use.

Gender Pre-Specified Sampling for Cost Control Kien Le, Social and Economic Survey Research Institute, Qatar University ([email protected]); Abdoulaye Diop, Social and Economic Survey Research Institute, Qatar University ([email protected]); Darwish Alemadi, Social and Economic Survey Research Institute, Qatar University ([email protected]); Jill Wittrok, University of Michigan ([email protected])

Household surveys administered in Middle East and North African countries are typically conducted using survey practices developed in the West and hence outside the region. In some instances, these techniques are not sensitive to the cultural and religious characteristics of these countries. In this paper, we look at one gender barrier: male interviewers are not allowed to interview female respondents in these countries. To overcome this barrier, we propose a sampling method based on the gender matching of interviewers and respondents. The main benefit of the method is to reduce field costs and to allow for a simple and fast selection of respondents within households. The method is applied to a national survey in Qatar, a country in the Middle East. We achieve a significant reduction in field costs (by 27 percent) as fewer interviewers are needed during the fieldwork. In household surveys in the United States and Europe, this sampling method cannot be used to reduce field costs, but it can be used to match interviewers' and respondents' gender to avoid possible interviewer effects, especially in surveys with gender-sensitive questions.

Drop-Off Point for Undergraduate Students on a Web-based Alcohol and Tobacco Use Questionnaire Ananda Mitra, Department of Communication ([email protected])

This paper is based on two random-sample trials with nearly 3,000 students in each trial (fall 2009 and fall 2010), which demonstrate that there are predictable drop-off points for questionnaires delivered over the Internet to a random sample of students. The questionnaire was designed to elicit alcohol, tobacco and drug-related behavior and attitudes as well as demographic information. The data suggest that in both years the drop-off rate reached 11% at the same point in the questionnaire: the 13th screen of a 29-screen questionnaire. After that, the drop-off rate plateaued at 15% in both years. The data also suggest that among those who dropped off, nearly 50% dropped off within the first five minutes of opening the questionnaire, and drop-off plateaued about 30 minutes into the questionnaire, by which time nearly 91% of the total number of drop-offs had left the questionnaire. The same trend was observed in both years of the study. These findings suggest that it is useful to offer respondents some incentive to stay with the questionnaire when they are about halfway through a long questionnaire. The fact that drop-offs plateau near the halfway point is important to consider in reducing the overall amount of drop-off. It is also important to offer the respondent some incentive to stay with a questionnaire within the first five minutes of opening it. This could reduce the drop-off rate significantly. The nature and formatting of the questions merit additional examination to locate other predictors of drop-off.
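A minimal sketch of how such drop-off points could be located from breakoff data follows. The screen-level counts and sample size are invented for illustration and do not reproduce the study's figures.

```python
# Hypothetical breakoff data: for each respondent who abandoned the questionnaire,
# the screen (1..29) on which they were last active.
breakoff_screens = [2, 5, 5, 9, 13, 13, 13, 13, 14, 15, 20, 27]
n_started = 100          # respondents who opened the questionnaire (hypothetical)
n_screens = 29

cumulative_dropoff = []
for screen in range(1, n_screens + 1):
    dropped_by_here = sum(1 for s in breakoff_screens if s <= screen)
    cumulative_dropoff.append(dropped_by_here / n_started)

# Crude "plateau" indicator: first screen after which the cumulative drop-off
# rate stays flat for several consecutive screens.
for screen in range(1, n_screens - 3):
    window = cumulative_dropoff[screen - 1 : screen + 3]
    if len(set(window)) == 1:
        print(f"Drop-off plateaus around screen {screen} at {window[0]:.0%}")
        break

print("Cumulative drop-off by screen:",
      [f"{r:.0%}" for r in cumulative_dropoff[:15]])
```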

Cost Implications of New Address Listing Technology: Implications for Efficiency and Data Quality Katie Dekker, NORC at the University of Chicago ([email protected]); Ned English, NORC at the University of Chicago ([email protected])

For many years, address frames have been compiled by field staff using paper-and-pencil methods in a process known as "traditional listing". The recent trend in survey research has been to replace in-person listing with the United States Postal Service Computerized Delivery Sequence File (CDSF or DSF) as an address frame. However, current research shows that the CDSF is not adequate everywhere; in particular, the CDSF suffers from undercoverage in rural areas. Consequently, best practice has become to list in areas associated with undercoverage of the CDSF. In recent years, NORC has been involved in research on traditional listing vs. enhanced (or dependent) DSF listing, where the lister verifies and edits the CDSF list that has been geocoded to a selected segment. While listing has historically been conducted using paper and pencil, NORC introduced use of a handheld device to accompany a national listing effort in 2011. The device-based listing process carries advantages in that it eliminates the need for data entry, allows field capture of the geographic coordinates of each housing unit, and simplifies some lister tasks. This paper describes a preliminary exploration of the cost implications of implementing a handheld listing device as opposed to paper-and-pencil listing, not only in terms of lister time and travel, but also in terms of hardware, training, and other costs associated with the device. In addition to costs associated with the national frame listing relative to those from previous listing projects, we examine device-related error that may contribute to future costs. Understanding these errors will be beneficial when considering implementing newer technologies in the listing process.

Assessing Quality of Care Through Medical Record Reviews in Mesoamerica Gulnoza Usmanova, Institute of Health Metrics and Evaluation, University of Washington ([email protected]); Catherine M. Wetmore, Institute of Health Metrics and Evaluation, University of Washington ([email protected]); Ali Mokdad, Institute of Health Metrics and Evaluation, University of Washington ([email protected]); K. Ellicott Colson, Institute of Health Metrics and Evaluation, University of Washington ([email protected]); Emily Carnahan, Institute of Health Metrics and Evaluation, University of Washington ([email protected]); Dharani Ranganathan, Institute of Health Metrics and Evaluation, University of Washington ([email protected]); Emma Margarita Iriarte, Inter-American Development Bank ([email protected]); Paola Zúñiga Brenes, Inter-American Development Bank ([email protected]); Sebastian Martinez, Inter-American Development Bank ([email protected]); Jennifer Nelson, Inter-American Development Bank ([email protected]); Brent Anderson, Institute of Health Metrics and Evaluation, University of Washington ([email protected]); Tasha B. Murphy, Institute of Health Metrics and Evaluation, University of Washington ([email protected]); Bernardo Hernández Prado, Institute of Health Metrics and Evaluation, University of Washington ([email protected]); Rafael Lozano, Institute of Health Metrics and Evaluation, University of Washington ([email protected]); Ali H. Mokdad, Institute of Health Metrics and Evaluation, University of Washington ([email protected])

Medical records provide a unique snapshot of service provision and could be used to assess the quality of health care services. However, the use of medical record reviews to assess quality of medical care in developing countries, especially in remote areas, has not previously been done. In this paper, we discuss the advantages, disadvantages and challenges we encountered in implementing a medical record review component within the Salud Mesoamerica 2015 initiative. This initiative was established by the Inter-American Development Bank with funding from the Bill & Melinda Gates Foundation, the Carlos Slim Foundation, and the Government of Spain. Briefly, a medical record review survey was planned for 8 Mesoamerican countries (Belize, Costa Rica, El Salvador, Guatemala, Honduras, Mexico, Nicaragua, and Panama). Examples of medical record forms were obtained from each country during site visits and were used as a basis for developing computerized data collection tools. Stratified random samples of health facilities providing any reproductive health and/or pediatric care services were selected, representing the range of facility types in each country. At each site, 100 patient charts were drawn at random. A team of data collectors with medical backgrounds extracted key indicators of quality from medical records using netbooks. Data collected during the record review allowed us to estimate the coverage of best practices, which could not have been achieved through household surveys alone. However, the quality of record keeping at the facility level was a major determinant of the availability of data. Moreover, a clear limitation of this methodology is the reliability of the data, as high-quality record-keeping may not guarantee high-quality care, and vice versa. Recruitment of interviewers with basic medical knowledge who are familiar with local practices facilitated extraction of the data.

The Latino God Gap and Partisanship 1990-2008 Juhem Navarro-Rivera, University of Connecticut ([email protected]); Ariela Keysar, Trinity College ([email protected]); Barry A. Kosmin, Trinity College ([email protected])

As the fastest-growing ethnic group in the United States, Latinos are changing the way the United States looks and are expected to change the American political landscape as many young Latinos join the ranks of voters. Latinos have traditionally been Catholic and Democratic-leaning voters. The partisan balance between Democrats and Republicans among Latinos has remained constant over two decades. But real political changes are occurring among Latinos that are rooted in the religious changes the community is currently undergoing. Latinos are becoming more religiously diverse as well, with increasing polarization towards evangelical Christianity and secularity.

Given the growing gap among white Americans between religiosity and partisan preferences (the more religious are more likely to identify as Republicans), we ask whether the same dynamics are occurring among Latinos. We do not expect many changes in partisanship among Catholic Latinos, who are shrinking proportionally as they are disproportionately affected by immigration politics and language barriers. We hypothesize that the new and younger evangelical Latinos are more likely to identify as Republicans, while secular Latinos are more likely to identify as Democrats.

We look at three points in time: 1990, 2001, and 2008. Using pooled data from the American Religious Identification Survey (2001 & 2008) and the National Survey of Religious Identification (1990) we will test if changes in religious affiliation among Latinos correspond to changes in partisan preference. These three surveys include over 10,000 Latino respondents over almost 20 years, including a 1000-subject subsample conducted in English and Spanish in 2008, which provides us with a clear picture of the changes we are addressing here.

Exploring the Gender Gap and the Impact of Stressed Environment Residence on Environmental Risk Tolerance Using Observational and Experimental Data Marc D. Weiner, Bloustein Center for Survey Research, Rutgers University ([email protected]); Timothy D. MacKinnon, Bloustein Center for Survey Research, Rutgers University ([email protected])

We present the findings of a split-sample question order experiment embedded within a nationwide RDD environmental risk perception and tolerance survey; that 2010 survey also collected an oversample of households within 50-mile radii of six nuclear power production and waste management facilities. We analyze original observational and experimental data and find strong support for an environmental risk tolerance gender gap in the general population, which narrows in environmentally-stressed communities.

From the risk analysis literature, we identify and address three unsettled empirical questions concerning 1) the gender gap in risk tolerance; 2) differences in the gender gap predicated on residential location vis-à-vis proximity to environmentally-stressed locations; and 3) the effects of risk-priming on both the gender gap and stress-location effects.

By testing these hypotheses using six hypothetical local environmental risks and a 25-year local environment outlook probe as dependent variables, we find strong evidence of a gender gap in environmental risk tolerance in the general population, whereby men exhibit greater environmental risk tolerance; we further note that this general population gender gap diminishes in environmentally-stressed communities. Furthermore, we find evidence that respondents in stressed communities generally perceive environmental risks less seriously than those in non-stressed communities. The primes allow us to manipulate an enhanced level of risk, which ultimately allows us to generate even more stringent tests of the gender and stressed-location effects.

These findings help explore how question order and related experimental variation can shape theory on the gender gap; we suggest this evidence dovetails with recent international research advancing a structural, social inequality explanation of the gender gap. From a policy perspective, these findings argue strongly in favor of the importance of diversity in the gendered, and by implication racial, composition of our policy-making bodies, particularly in the areas of environmental policy and risk assessment.

Use of Ignored Data in Existing Datasets to Evaluate and Enhance the Representativeness of Survey Responses David Fan, University of ([email protected])

Although ideal for representativeness, true random samples of large human populations are difficult to obtain. Instead, typical sample selection processes require at least one sampling property belonging to population members. Surveys then measure a set of output properties. For telephone surveys, a sampling property would be phone usage and an output property could be gender. As early as 1924, it was noted that a representative response only needs to be independent of all the sampling properties. Thus a sample can provide representative results for some output responses but not others.

Furthermore, a sample user might be able to tolerate small deviations from absolute independence. Such a user would find it useful for a sampling method to have an associated tolerance map using the same format as a target at a shooting range. Shooting targets have a bullseye at the center with concentric rings of decreasing accuracy. For a tolerance map, a survey sample could have output properties lying in different tolerance zones. The bullseye would correspond to absolute independence of the sampling and output properties. The tolerance zone of one percent bias would include all output properties with this bias or less.

This paper proposes a method for using existing survey datasets to exclude output properties from tolerance zones. The method was applied to Harris and Pew datasets. Sampling properties included cell phone usage, Internet usage and conservative ideology. As an example, a tolerance map of a Pew dataset excluded 15% of the responses from the 2% tolerance zone for a sample design using the Internet mode and recruiting respondents exclusively from a politically conservative website.

The independence condition indicates that incentives can be problematic if they reduce the independence between the incentive and the responses. The approaches in this paper could be useful for obtaining representative responses from social media.
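The tolerance-map idea can be illustrated with a toy calculation: for each output property, compare its prevalence among population members who have the sampling property with its prevalence in the full population, and bin the absolute difference into tolerance zones. The microdata, property names, and zone cutoffs below are hypothetical and are not taken from the Harris or Pew datasets.

```python
import pandas as pd

# Hypothetical population microdata with one sampling property (internet_user)
# and several output properties; none of these figures come from the paper.
pop = pd.DataFrame({
    "internet_user": [1, 1, 1, 0, 0, 1, 0, 1, 1, 0],
    "female":        [1, 0, 1, 1, 0, 0, 1, 1, 0, 1],
    "age_65_plus":   [0, 0, 1, 1, 1, 0, 1, 0, 0, 1],
    "conservative":  [0, 1, 0, 1, 1, 0, 1, 0, 1, 1],
})

zones = [0.01, 0.02, 0.05]   # bullseye, then 1%, 2%, 5% tolerance zones

sample = pop[pop["internet_user"] == 1]   # a design that only reaches internet users
for prop in ["female", "age_65_plus", "conservative"]:
    bias = abs(sample[prop].mean() - pop[prop].mean())
    zone = next((f"<= {z:.0%} zone" for z in zones if bias <= z), "outside mapped zones")
    print(f"{prop:12s}  bias = {bias:.3f}  -> {zone}")
```

An output property whose bias exceeds a given cutoff would be excluded from that tolerance zone, which is the sense in which the paper excludes 15% of responses from the 2% zone for the Internet-mode design.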

Measuring the Number of Government Contractors on the Annual Survey of Public Employment and Payroll L. Morrison, U.S. Census Bureau ([email protected]); Jennifer Beck, U.S. Census Bureau ([email protected]); Kenneth L. Long, U.S. Census Bureau ([email protected]); Lisa Miller, U.S. Census Bureau ([email protected]); Regina Padgett, U.S. Census Bureau ([email protected])

The U.S. Census Bureau has been producing comprehensive uniformly classified statistics on the economic activity of state and local governments through the Census of Governments and its related annual and quarterly programs since 1957. One of the data series from the Census of Governments programs has been statistics on state and local government employment and payroll. These data provide information on the functional activities of state and local employment as well as information on full and part-time employment and payroll.

In order to ensure continued relevancy of these data, given recent economic changes and the resulting fiscal pressures on state and local governments, the Census Bureau has been examining how best to collect information on the number of contractors working on behalf of state and local governments. Indeed, a recent National Academies of Science report entitled "State and Local Government Statistics at a Crossroads" (2007) noted that “employment data would be more relevant if there were information on the privatization of government services formerly performed by government workers, a need that is understandable in light of the perceived trend toward outsourcing” (page 64). Recently, the Census Bureau explored the possibility of collecting information on state and local government contractors and contracted services. After providing some background on the Annual Survey of Public Employment and Payroll, this paper will describe the proposed questions that were cognitively tested with government agency respondents during the winter of 2011. This paper will also address the difficulties associated with collecting this type of information on this particular survey, and the outcome of the testing. Finally, the authors propose avenues for future research.

Complementing Survey Data with Observational Methods: The Role of Video Coding Cleo Jacobs Johnson, Mathematica Policy Research ([email protected])

The Administration for Children and Families (ACF) initiated the Building Strong Families (BSF) program to help interested and romantically involved unwed parents strengthen their relationships. BSF offered these couples relationship skills education delivered in group sessions, as well as other support services. ACF hired Mathematica Policy Research to conduct a rigorous random assignment evaluation of the program. Although the BSF programs were directed towards parents, it was hypothesized that BSF would indirectly affect children via the skills parents learn. That is, skills couples learn to strengthen their own relationship may also contribute to more supportive interactions with their children. Thus, an important element of the evaluation was the collection of data on the quality of parent-child interaction. The proposed session focuses on our experience with video coding parent-child observations as a measure of relationship quality to complement the BSF evaluation survey data. In this presentation, we focus on the home-based assessment, which includes the Two Bags task, a semi-structured play-based activity for mother-child and father-child dyads. The purpose of the Two Bags task is to assess parent and child behaviors as the pair interact. The video-recorded Two Bags sessions were coded by a team of specially trained video coders using the Parent-Child Interaction (PCI) Rating Scales for the Two Bags Assessment. In the session, we will discuss the process of building an in-house video coding group, with special emphasis on hiring, training and certifying coders along with our effort to maintain inter-rater reliability for the duration of the coding period. Over 3,000 videos of parent-child dyads were coded by a team of 17 video coders. Across all of the coders, inter-rater reliability was consistently over 85 percent. The session will draw lessons from both the videotape collection and video coding methodology to inform future research efforts.

A Typology and Review of Web Evaluation Strategies Bryan Wiggins, Fors Marsh Group ([email protected]); Jennifer Romano Bergstrom, Fors Marsh Group ([email protected]); Scott Turner, Fors Marsh Group ([email protected])

Web evaluation has traditionally focused on usability testing, analytics, and site intercept (pop-up) surveys. While each of these tools is useful, each also has its drawbacks. Usability testing is often used to assess functionality, navigation, and overall site experience with a new website but provides little quantitative output. Analytics provide data on hit rates and time spent on the site but little actionable information. Site intercept surveys provide information from actual users but suffer from potential non-response bias.

We believe experimental research must be combined with traditional web evaluation techniques. Proper evaluation methods differ depending on the research goal: testing navigation and site experience, evaluating the effectiveness of a new site, or measuring consumer satisfaction.

In addition to usability testing, experimental methods, such as benchmarking a new website against existing sites, are important means of quantitatively measuring the effectiveness of a new site in meeting its objectives. The authors will discuss a benchmarking study to evaluate a new career exploration website in disseminating career, college, and military information. Between-group analyses compared the amount of information learned from the new website versus existing career exploration sites.

Another experimental study compared the effectiveness of a site redesign in achieving its goal of utilizing multimedia to deliver military lifestyle information. Participants’ knowledge, attitudes, and images of military lifestyle were compared after viewing the redesigned or original website.

Web evaluation should not be confined to new or redesigned websites; to create an impactful site it is important to continuously evaluate and update. The authors will discuss the role of usability testing, analytics, and site intercept surveys in combination with various experimental design options to provide actionable results for all stages of web development. A typology of experimental strategies matching web evaluation needs with different stages of web development will be presented.

Can We Interview Your Teenager? Parent Permission Scripts and Teen Participation David Grant, UCLA Center for Health Policy Research ([email protected]); Royce Park, UCLA Center for Health Policy Research ([email protected]); May Aydin, UCLA Center for Health Policy Research ([email protected]); Yu-Chiech (Jay) Lin, Institute for Social Research, University Of Michigan ([email protected])

The California Health Interview Survey (CHIS) is one of the few population-based surveys which samples and directly interviews adolescents (aged 12 to 17 years old). Because the interview is conducted with a minor and contains potentially sensitive topics, permission must be obtained from the adolescent's parent or legal guardian to conduct the adolescent interview. Over the past decade, parental permission has declined from 75.9% in 2001 to 58.5% in 2009. When permission is obtained, teen completion rates have also declined, from 83.2% to 74.6% over the same time period. Of these two factors, parental permission rates declined at almost twice the rate (-17.4%) of teen completion (-8.6%).

To enhance teen participation in CHIS, we examine several strategies to increase parental permission. Multivariate models were applied to predict the likelihood of parents granting permission based on parent and adolescent demographic and household characteristics. Preliminary analysis suggests that factors associated with parents who do not grant permission include being male, being of non-African American race/ethnicity, being Latino with a small household size, having younger adolescents in the household, having older male adolescents, or living in an urban area. These findings are used to target demographic characteristics of focus group participants. Focus groups with parents of teens will be conducted to better understand barriers to permission and to review and modify the permission scripts in an attempt to improve the permission rate. In the final stage, an experiment will be conducted in CHIS comparing the original and revised permission scripts.

NOTE: Focus groups will be complete and findings presented at AAPOR if accepted; results from the final stage will not be available for presentation.
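One way to implement the multivariate models described above is a logistic regression of the permission outcome on parent, adolescent, and household characteristics. The sketch below is a minimal illustration using statsmodels on simulated data; the variable names and the coefficients driving the simulation are hypothetical and are not CHIS findings.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated data standing in for parent records; hypothetical predictors only.
rng = np.random.default_rng(0)
n = 2000
df = pd.DataFrame({
    "parent_male":    rng.integers(0, 2, n),
    "urban":          rng.integers(0, 2, n),
    "teen_age":       rng.integers(12, 18, n),
    "household_size": rng.integers(2, 8, n),
})
# Assumed data-generating coefficients (illustrative, not estimates).
logit = 0.8 - 0.4 * df.parent_male - 0.3 * df.urban - 0.1 * (df.teen_age - 12)
df["permission"] = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

model = smf.logit("permission ~ parent_male + urban + teen_age + household_size",
                  data=df).fit(disp=False)
print(model.summary().tables[1])   # coefficients -> factors associated with granting permission
```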

The Effects of Vignette Placement on Survey Estimates: A Split Ballot Experiment Charles Q. Strohm, RTI International ([email protected]); Judith A. Seltzer, UCLA ([email protected]); Suzanne M. Bianchi, UCLA ([email protected])

Vignettes are useful for measuring norms and beliefs, but little research investigates how to embed vignettes in the larger context of a survey. The placement of a vignette might affect responses to survey questions unrelated to the vignette because vignettes provide more information and context about the survey’s topic compared to standard attitudinal survey questions.

We conducted a split ballot experiment in a survey of the Knowledge Networks online panel (n = 3,132). The survey used two methods to measure beliefs about whether parents and adult children should live together. First, the survey included a vignette about a hypothetical family that has lost its home and needs a place to live. Respondents were asked for their opinion about whether the family members in the vignette (“John,” “Mary,” their child, and an older mother) should move in with other family members. Second, the survey included general attitudinal questions, unrelated to the vignette, about the desirability of family members living together. One of these questions was from the General Social Survey (GSS): “As you know, many older people share a home with their grown children. Do you think this is generally a good idea or bad idea?” We randomly assigned the order of the vignette (and the question about the vignette) and the GSS question to respondents.

Preliminary analysis shows that vignette placement affected responses to the GSS question. When the vignette followed the GSS question, only 17% said co-residence was a “good idea” in response to the GSS question. When the vignette preceded the GSS question, 30% said co-residence was a “good idea.” Placing the vignette first may promote empathy for people experiencing economic hardship, encouraging more positive responses to the GSS question. At AAPOR, we will also discuss how the effect of vignette placement depends on respondent and vignette characteristics.
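A two-proportion test is one straightforward way to check whether the difference between the two orderings (17% vs. 30% saying "good idea") exceeds chance. In the sketch below, the group sizes are assumed to be roughly half of the reported n = 3,132 each; the actual split is not given in the abstract, so these counts are illustrative.

```python
from statsmodels.stats.proportion import proportions_ztest

# Assumed group sizes (about half of n = 3,132 per ballot); counts are derived
# from the reported percentages for illustration only.
n_gss_first = 1566       # GSS question asked before the vignette
n_vignette_first = 1566  # vignette asked before the GSS question
good_idea = [round(0.17 * n_gss_first), round(0.30 * n_vignette_first)]
nobs = [n_gss_first, n_vignette_first]

stat, pvalue = proportions_ztest(count=good_idea, nobs=nobs)
print(f"z = {stat:.2f}, p = {pvalue:.4f}")   # a large |z| indicates the order effect is unlikely due to chance
```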

Behavioral Comparison for Originally Designated Versus Replacement Sample Ryan McKinney, Arbitron ([email protected]); Kelly Dixon, Arbitron ([email protected])

To recruit and maintain nationwide radio panels of approximately 70,000 persons, Arbitron employs a mixed mode recruitment approach from an address based sampling frame. All addresses are randomly selected and then segmented into primary and replacement sample types. Those addresses designated as primary sample types are all attempted for panel recruitment while the replacement sample types are recruited as needed to maintain demographic proportionality to the sampled market’s universe. To facilitate the maintenance of proportionality, Arbitron pools the eligible replacement sample and stratifies on household level characteristics.

Arbitron has conducted statistical comparisons of the radio listening levels of primary and replacement households in the PPM panel, while controlling for demographics, to determine if there is a meaningful difference between the two groups. The demographic composition of the primary sample is also examined to understand which groups are less likely to respond. In addition, the recruitment success metrics based on aggregate household level demographics are analyzed to understand the response rate impact of the required enumeration stage for the replacement sample and how that may impact listening differences.

Using the iPad2 as a Prize-based Incentive to Boost Response Rates Richard McClendon, Brigham Young University ([email protected]); Eric Jenson, BYU ([email protected]); Danny Olsen, BYU ([email protected])

With the rising popularity of and fervor over new technological devices like iPads and smartphones, researchers may ask whether the use of these devices as promised incentives, rather than prepaid or promised cash, is more effective in boosting the response rates of web-based surveys. In 2009, Dillman, Smyth, and Christian downplayed the use of prize drawing incentives for web-based surveys and instead concluded that, like mail and telephone surveys, the most effective way to increase response rates in web-based surveys is to use postal mail to deliver an invitation and prepaid cash incentive (pp. 274-275). However, for many public, marketing, and social researchers, this approach is not only cost-prohibitive but also runs counter to the initial purposes of using the Internet in the first place: the reduction in time and ease of use. Further, when it comes to the advancement and public use of technology, data from 2009 already feel a century behind. Thus, the purpose of this paper is to revisit the question of lottery- or prize-based drawings, particularly in light of using new technological devices as incentives; in our case, the iPad 2. In spring of 2011, the Office of Assessment and Analysis at Brigham Young University sent out two web-based surveys to gather data on academic and career advisement. Each survey was sent to a random sample of over 7,000 students. The initial invitation promised no incentive. Eleven days later, with only around an 8% response rate, a reminder was sent to the non-respondents, including an invitation to enter a drawing for an iPad 2 upon completing the survey. Within 48 hours the response rates for the academic and career advisement surveys shot up to 35% and 30%, respectively. We will present further details surrounding this significant shift.

Technologies Used to Interview Youth Who are Deaf or Have Hearing Impairments: Results from the National Longitudinal Transition Study 2012 Holly H. Matulewicz, Mathematica Policy Research ([email protected]); Daniel J. Friend, Mathematica Policy Research ([email protected]); Anne B. Ciemnecki, Mathematica Policy Research ([email protected])

Facilitating self-report is a cornerstone of high quality survey research, especially for surveys involving persons with disabilities. Surveys designed exclusively for telephone administration do not need to exclude persons who are deaf or have hearing impairments, as technology exists to bridge communications with these populations. Furthermore, technologies used to facilitate such communications have evolved over time. In the past, Text Telephone (TTY) was the norm. While it provided a way for telephone interviewing to occur, it was cumbersome to implement, as it placed substantial burden on respondents and interviewers alike. Questions and response categories were typed into a keyboard, transmitted over a modem, and displayed for respondents in a string of text across a screen. Responses were returned in the same way. Such a format is not conducive to the use of probes or clarification of responses. These challenges contributed to significantly increased length of interview administration. As technology has evolved, other means of communication have come into use, including instant messenger (used on computers or handheld devices), video relays (involving lip reading or use of sign language), and texting on cellular and other handheld devices. Despite these advances, there is limited documentation in the literature about the prevalence with which each new technology is employed and how survey researchers can maintain reliability and collect high quality data across each mode. This paper presents preliminary findings from the baseline survey of the National Longitudinal Transition Study 2012 (NLTS 2012), sponsored by the U.S. Department of Education. This study includes a telephone survey with a nationally representative sample of 15,000 youth (ages 13-21) and parents, including persons with hearing and speech impairments. Findings include a description of the prevalence rates of use for each mode of communication and operational issues encountered during administration.

Understanding How Technology Changes Have Influenced How Students Interact With Surveys in a University Environment Steve Wygant, Brigham Young University ([email protected]); Richard McClendon, Brigham Young University ([email protected]); Eric Jenson, Brigham Young University ([email protected])

Recent technology shifts have changed the way participants interact and respond to surveys. In the past, participants were limited to responding via paper/pencil or vocal response formats. Recent changes now allow survey participants to respond with computers, cell phones, and tablet computers.

The addition of these new survey interaction methods has increased the need for survey researchers to know and understand their respondent populations. Survey researchers must be aware of how their respondents interact with the survey and optimize the survey experience to best fit the method respondents are using to interact.

The proposed presentation will look at the methods of survey interaction used by students at a large US-based university. It will identify the student profiles most commonly associated with the various methods of interaction with surveys and document which questions are most challenging for each interaction type.

Data for this presentation were collected from multiple large surveys delivered at a private US-based university.

New Frontiers in Political Advertising Research: The Interaction among Candidate Position, Electoral System, and the Effects of Negative Political Advertisements Bin Xing, Kent State University ([email protected])

Literature Review
Scholars have been investigating the effects of negative political advertisements for decades. However, one essential question remains unanswered: which type of political candidate is most likely to be affected by negative ads in different electoral systems? The main purpose of this study is to explore the relationships among candidates' competitive position, electoral system, and the effects of negative ads.

The current literature shows that negative ads can have both positive and negative effects in political campaigns. Moreover, study results indicate that a candidate's competitive position (front runner/high recognition candidate vs. challenger/low recognition candidate) plays an important role in deciding who should employ negative political ads and what the corresponding effects are. Previous findings in political communication and marketing research suggest that challengers are better suited to running negative ads.

To further explore the influences of negative political ads, this study compares the effects across different electoral systems (e.g., “First-past-the-post” and “Instant-runoff voting”). Through comparing negative political ads’ effects among various electoral systems, scholars may be able to find more variables and further distinguish the most influencing factors in negative ads’ effects.

RQ1: Which type of target is most likely to be affected by negative ads and in what ways? RQ2: How do message types moderate negative ads’ influence on the way people think about the candidates? RQ3: How do types of electoral system moderate negative ads’ influence on voters’ choices?

Methodology
A 2 (ad sender: front runner vs. challenger) x 2 (message type: negative issue ad vs. negative image ad) x 2 (ad target: front runner vs. challenger) x 2 (electoral system: FPTP vs. IRV) factorial study was developed. The dependent variables were evaluation of candidates and final choice of candidate. The stimulus materials were chosen from a political ads archive. The ad tones were pre-determined through a content analysis.

Viability of Using Facebook to Increase Response Rates in an ABS Survey Paul Ruggiere, University of North Texas Survey Research Center ([email protected]); Ashton Sams, University of North Texas Survey Research Center ([email protected]); Ashley Niermann, University of North Texas Survey Research Center ([email protected]); Enrique Romero, University of North Texas Survey Research Center ([email protected])

Ever since the Internet offered a new method for collecting data, researchers have sought contact lists comparable in quality to physical address directories and listed telephone number directories. With over 150 million users in the U.S., Facebook has the potential to offer a somewhat comprehensive source for Internet contact information. The purpose of this study was to determine whether or not Facebook could be utilized to identify people who were part of an Address-Based Sampling (ABS) survey and to encourage their participation in the survey. A random sample of 5,000 ABS records was drawn for this experiment from a larger ABS sample used for a study of health in Texas households. Names and cities contained on the ABS address records were matched to names and cities of 536 people with accounts on Facebook believed to be the intended person. A Facebook account and page for the Texas health study was set up to educate potential respondents about the study’s purpose and also as a page from which messages and friend requests could be sent. Messages asked the potential respondents to watch their mail for an invitation to participate in the survey and encouraged them to respond to the invitation or to participate whenever an interviewer called. Attempts to contact the potential respondents using Facebook met many obstacles. Several strategies were used to get past filters and system spam guards. Ultimately after 151 of the targeted account holders were contacted, Facebook labeled the health study account as spam and the contact phase of the experiment ended prematurely. Results of this small sample indicated that this approach did not improve response rates. The obstacles encountered, strategies used, and viability of using Facebook as a method of active contact for ABS samples will be examined and discussed in this paper.

The DRC Model for Hot Comment Processing. Valerie Waller, Data Recognition Corporation (DRC) ([email protected]); Paula Eckel, Data Recognition Corporation (DRC) ([email protected]); Ann Davies, Data Recognition Corporation (DRC) ([email protected]); Anna Chandonnet, Data Recognition Corporation (DRC) ([email protected])

Virtually any survey can uncover indications of potential threats or risks, either to the respondent or others. This is particularly true of surveys on sensitive topics, such as suicidal ideation, sexual preference, or risk behaviors, where responses to open-ended items may reveal, however unintentionally, a danger of immediate harm to the respondent. Risks can surface even on surveys that do not ask sensitive questions, however. For example, commercial organizations can find early warning signs of consumer concerns or product problems among respondents' comments. Data Recognition Corporation (DRC) has developed a process for immediately screening all comments to identify potential threats and to alert the survey sponsors so that interventions can be made quickly, when appropriate. This presentation will focus on the process DRC uses, including technical considerations in hot comment processing for web and paper surveys, timing issues, development of key word lists, criteria for raising an alert, protection of personal identifying information, and confidentiality issues.

Breaking Down the Tailored Design Method Leslyn M. Hall, Redstone Research, LLC ([email protected]); Randy ZuWallack, ICF ([email protected]); Fred J. Eggers, Econometrica ([email protected])

While the efficacy and success of Dillman's Tailored Design Method are widely recognized, the costs and benefits of each successive stage are not well known. Using a randomized experimental design for each stage of a general population mail survey of rental housing costs, we examine the costs, and the boosts to response and cooperation rates, associated with sending pre-survey notification letters, survey packets, reminder postcards, second survey packets, and a final telephone follow-up.
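The core of such a cost-benefit comparison can be expressed as cost per incremental completed questionnaire at each stage. The per-case costs and cumulative response rates in the sketch below are placeholders, not results from this study.

```python
# Hypothetical per-stage figures for a general-population mail survey; replace
# with observed costs and completes from the experiment.
stages = [
    # (stage, cost per sampled case in $, cumulative response rate after stage)
    ("pre-survey notification letter", 0.60, 0.02),
    ("first survey packet",            2.50, 0.22),
    ("reminder postcard",              0.45, 0.29),
    ("second survey packet",           2.50, 0.36),
    ("telephone follow-up",            4.00, 0.41),
]
n_sampled = 5000

prev_rate = 0.0
for name, cost_per_case, cum_rate in stages:
    added_completes = (cum_rate - prev_rate) * n_sampled
    stage_cost = cost_per_case * n_sampled
    per_complete = stage_cost / added_completes if added_completes else float("inf")
    print(f"{name:32s} +{cum_rate - prev_rate:.0%} response, "
          f"${per_complete:,.2f} per added complete")
    prev_rate = cum_rate
```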

Multi-Mode or Multi-Choice Leslyn M. Hall, Redstone Research, LLC ([email protected]); Randy ZuWallack, ICF ([email protected]); Fred J. Eggers, Econometrica ([email protected])

Response rates in mixed-mode studies are lower when mail and Internet modes of survey response are offered at the same time (Griffin, Fischer, & Morgan, 2001; Israel, 2009; Messer & Dillman, 2010). There is also evidence that the final sample in mixed-mode studies is more representative than the final sample in mail-only studies (Messer & Dillman, 2010). Yet, there is some evidence that offering a Web response option in mixed-mode studies can lead to a lower overall response rate (Griffin et al., 2001); furthermore, survey costs increase when more than one mode of response is offered. Using an experimental design for both a telephone survey and a mail survey, both with prenotification letters, where half of the sample is offered a chance to complete the survey on the web and half is not, we evaluate the costs and benefits of the following on final response and cooperation rates: (1) sending a prenotification letter with the opportunity to complete the survey via the Web; (2) sending a prenotification letter informing people that they will be contacted in the near future; and (3) not notifying people prior to contacting them to participate in a survey.

Age Influences on Attitudes towards Information Privacy and Consent to Record Linkage Kristen L. Cibelli, Program in Survey Methodology, University of Michigan ([email protected]); Jenna Fulton, Joint Program in Survey Methodology, University of ([email protected])

The pace of technological change has accelerated rapidly in the last decade, encompassing the growing prevalence of the Internet, automation of government and business records, and the popularity of social networking websites. In this environment, privacy issues are increasingly salient to the public. Evidence suggests that privacy concerns can affect whether or not respondents participate in surveys, consent to requests for linkage of survey responses with administrative records, and provide the personally-identifying information often required to facilitate this linkage. While respondents vary in their level of privacy concern, it is unclear what characteristics contribute to this variation. We hypothesize that respondent age will influence privacy concerns due to varying degrees of familiarity with disclosure. Because younger adults have come of age during an era of technological advances in which disclosing personal information on social media websites is the norm, they might exhibit fewer privacy concerns regarding survey and administrative record sharing than older adults. More often than not, older adults are less exposed to the practice of sharing personal information and subsequently may be more hesitant regarding matters of privacy and data sharing. Drawing on data from a recent nationally representative telephone survey, this paper presents research that examines the effect of respondent age on attitudes toward privacy and data-sharing practices. We first examine indicators of privacy concern by respondent age. Further, we assess respondents' willingness to permit data sharing among federal agencies and the likelihood that respondents will grant access to various types of administrative records in a survey context, by age. Preliminary results suggest that older respondents have more privacy concerns, are less comfortable with data sharing, and may be less willing to consent to administrative record linkage.

A Revised Framework for Survey Participation: An NSI Perspective Boris Lorenc, Statistics Sweden ([email protected])

Groves and Couper (1993, 1998) provide a framework for approaching survey participation in household surveys. By assuming the perspective of an academic researcher conducting a survey, they categorize correlates of participation into two sets, one with levels considered to be out of the researcher’s control, and another with levels considered to be under the researcher’s control.

The study presented here considers a revision of this approach from the perspective of a national statistical institute (NSI). It brings novelty in two ways: first, a redefinition of what constitutes aspects within (versus outside) the control of the statistical organization, and second, a discussion of the consequences of that redefinition for allocating data collection efforts.

The framework, best presented in a figure, embeds both the surveying organization and the householder within their common societal context: the external survey environment. The householder's embedding is mediated by his or her family/household and other direct societal settings (the informal and formal groups they partake in); the surveying organization is the context in which its field organization exists, and it is thus a mediator for the individual interviewers. The survey sponsor is also embedded in the same broader societal context. To these three main components, more detailed and measurable properties are attached.

This approach makes it clear that an NSI reaches a householder (or fails to do so) either through the efforts of its field organization, in the context of a specific survey operation, or through the broader societal context, as part of efforts to positively influence the external survey environment. Some new aspects related to the latter are discussed and illustrated with examples.

Evaluating New Incentives: The Efficacy of Grand Prize Sweepstakes and Participant Compliance Ekua Kendall, Arbitron Inc ([email protected]); Arianne Buckley, Arbitron Inc. ([email protected])

Arbitron developed an electronic meter that automatically detects audio exposure to encoded radio and TV signals. Panelists are asked to wear their meter every day from the time they wake up to the time they go to sleep in order to measure their full media exposure. Arbitron also conducts ongoing research to improve the sample performance of our panels.

Incentives have been an ongoing challenge for the social research community. Since Arbitron recruits large populations from all across the United States, we are very familiar with these questions: What incentives work best? Is there an alternative to costly individualized cash incentives? Using a live split sample, we tested the efficacy of providing a grand prize sweepstakes during holiday periods with historically decreased compliance rates versus providing an individualized incentive for each participating member in our research panel. Beyond the aggregate we also sought to find out if there was any subset of demographic groups that performed better or worse as a result of this new incentive.

The survey research community has a minimal, but growing, knowledge base on the efficacy of grand prize sweepstakes as an incentive for participant compliance. The implications of studies like this are important when considering the high cost of individualized cash programs and their variations and management for large research panels. We hypothesized that offering an additional grand prize sweepstakes would be low risk and not likely to harm response or compliance rates. In this vein, the study's methodology offers great test replication possibilities.

During this presentation we will reveal the results and the implications of the findings. Compliance rates by geographical locations will be reported along with key results that vary by age, race and ethnicity. Our ongoing study in this area and recommendations for replication will prove helpful and interesting to the social research field at large.

Evaluating the Validity of Age-Targeted List in an ABS Mix-Mode Survey Ting Yan, NORC at the University of Chicago ([email protected]); Rupa Datta, NORC at the University of Chicago ([email protected]); Joshua Borton, NORC at the University of Chicago ([email protected])

The household survey component of the National Survey of Early Care and Education (NSECE) employs an address-based sampling design and a mixed-mode data collection protocol. Addresses are sampled from the US Postal Service (USPS) Delivery Sequence File (DSF) and are sent to be matched with a telephone number. Those successfully matched with a telephone number are also matched with an age-targeted flag indicating that the address has a child under the age of 13. We will evaluate the quality of this age-targeted flag by comparing it against respondents' self-reports. Since the household survey is a nationally representative sample of U.S. households, we will describe characteristics at various geographic levels that correlate with the quality of this age-targeted flag and demonstrate the variations in the quality of this flag across geographic levels.
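One way to express the quality of such a flag is the usual agreement table against respondent self-reports, summarized as sensitivity, specificity, and positive predictive value. The counts below are fabricated solely to show the calculation and are not NSECE results.

```python
# Hypothetical cross-tabulation of the vendor age-targeted flag against the
# household's self-report of having a child under 13 (counts are made up).
tp = 420    # flag = child, self-report = child
fp = 180    # flag = child, self-report = no child
fn = 95     # flag = no child, self-report = child
tn = 2305   # flag = no child, self-report = no child

sensitivity = tp / (tp + fn)    # share of true child households the flag identifies
specificity = tn / (tn + fp)    # share of non-child households correctly unflagged
ppv = tp / (tp + fp)            # share of flagged households that truly qualify

print(f"Sensitivity: {sensitivity:.1%}")
print(f"Specificity: {specificity:.1%}")
print(f"Positive predictive value: {ppv:.1%}")
```

Computing these measures separately by geography (e.g., by state or urbanicity) would show the variation in flag quality across geographic levels that the paper describes.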

Thursday, May 17, 2012 4:00 p.m. - 5:30 p.m. Concurrent Session B

Assessing the Impact of Non-Response on Survey Estimates

Effects of Nonresponse on Survey Estimates of Political Participation Richard Ohrvall, Statistics Sweden ([email protected]); Mikaela Jarnbert, Statistics Sweden ([email protected])

Over recent decades, the nonresponse rate in general surveys has increased rapidly in Sweden, as in many other countries. Although a higher nonresponse rate does not necessarily induce nonresponse bias, this development has led to increased interest in and concern about its effects on survey quality. Using unique datasets from the Swedish Party Preferences Survey conducted by Statistics Sweden, combined with information about individual electoral participation gathered from registers, we examine the changes in response rates and the effect this has had on crucial survey estimates. The study has two main purposes. First, we describe the change in the response rate for the Party Preferences Survey from 1985 until 2010. Since the sample is drawn from a national population register, we have fairly extensive information about both the respondents and the nonrespondents, and are thereby able to describe them in terms of socioeconomic variables and other aspects. A key interest here is whether the nonrespondents are becoming more different from the respondents over time. Second, we try to estimate whether the higher nonresponse rates have led to nonresponse bias. This is done by using official registers of electoral participation to determine whether the individuals in the sample voted in the election that followed immediately after the survey was carried out. Since voting is a form of political participation, i.e., an activity that is closely related to what political opinion polls aim to estimate, we consider it a more relevant variable for estimating nonresponse bias than the variables more often available for this kind of analysis, e.g., sex, age and geographical region.
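Because register-based turnout is observed for both respondents and nonrespondents, the nonresponse bias in a turnout estimate can be computed directly. The rates and sample sizes below are invented to show the arithmetic; they are not Statistics Sweden figures.

```python
# Hypothetical register-based turnout for a survey sample.
n_respondents, n_nonrespondents = 6500, 3500
turnout_respondents = 0.88       # voted, among those who answered the survey
turnout_nonrespondents = 0.71    # voted, among those who did not

n_total = n_respondents + n_nonrespondents
turnout_full_sample = (n_respondents * turnout_respondents
                       + n_nonrespondents * turnout_nonrespondents) / n_total

# Bias of the respondent-based estimate relative to the full sample:
# bias = (nonresponse rate) * (mean among respondents - mean among nonrespondents)
nonresponse_rate = n_nonrespondents / n_total
bias = nonresponse_rate * (turnout_respondents - turnout_nonrespondents)

print(f"Respondent-only turnout estimate: {turnout_respondents:.1%}")
print(f"Full-sample (register) turnout:   {turnout_full_sample:.1%}")
print(f"Nonresponse bias:                 {bias:+.1%}")
```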

Nonresponse and the Validity of Estimates from National Telephone Surveys Leah Melani Christian, Pew Research Center for the People & the Press ([email protected]); Scott Keeter, Pew Research Center ([email protected]); Michael Dimock, Pew Research Center ([email protected])

Nonresponse in social and political surveys continues to grow, raising concerns about the validity of the estimates produced from these surveys. Response rates for many of the most visible and respected polling organizations are declining and now approach and occasionally reach single digits. Yet survey estimates continue to conform reasonably well to national benchmarks, and political polls continue to accurately predict voter preferences in recent elections.

Previous research indicated that nonresponse bias in political polls was relatively small (Keeter et al., 1997, 2006) and seemed unrelated to response rates. But response rates have declined even further since then. In addition, the proportion of adults reachable only on a cell phone has grown substantially over the past 5 years (Blumberg and Luke, 2010), and including cell phones has become standard practice for quality telephone surveys. Most research indicates that response rates on cell phones are even lower than those obtained on landlines (AAPOR Cell Phone Task Force 2010), increasing the potential for nonresponse bias.

We will discuss a new study that replicates two studies conducted in 1997 and 2003, which compared results from a survey conducted using the Pew Research Center’s standard practices with one conducted using a more rigorous design that obtained a higher response rate. The current study will utilize a similar approach to those studies but will employ a dual-frame design to further understand the potential for nonresponse bias in cell phone surveys. To validate the survey estimates, the survey will include a wide range of benchmarks from high response-rate government surveys and from administrative data. It will also use household information from commercial databases to compare respondents and nonrespondents whose telephone numbers can be matched to addresses.

Examining the Impact of Nonresponse on Estimates from the 2006-2010 Continuous NSFG James Wagner, University of Michigan/Survey Research Center ([email protected]); James M. Lepkowski, University of Michigan/Survey Research Center ([email protected]); Brady T. West, University of Michigan/Survey Research Center ([email protected]); Mick P. Couper, University of Michigan/Survey Research Center ([email protected]); Nicole Kirgis, University of Michigan/Survey Research Center ([email protected]); William Axinn, University of Michigan/Survey Research Center ([email protected]); William Mosher, National Center for Health Statistics ([email protected])

Estimates from survey data may be affected by nonresponse to the survey. The impact of this nonresponse is likely to vary by statistic. Therefore, in order to evaluate possible nonresponse biases, a multi-faceted approach is needed. The US Office of Management and Budget (OMB) now requires nonresponse bias analyses of this sort for federal surveys that achieve less than an 80% response rate.

This paper presents such an analysis for the 2006-2010 National Survey of Family Growth (NSFG Cycle 7). The NSFG is an interesting case study because it was managed with the goal of minimizing the risk of nonresponse bias. This was accomplished through a number of measures, including balancing response across demographic subgroups as well as across characteristics observed by interviewers that are correlated with survey estimates. In this paper, we explore predictors of nonresponse in search of mechanisms (e.g. contactability) that may be related to the key statistics produced by the NSFG. We also implement several indicators of the risk of nonresponse bias in order to evaluate their utility. Finally, we model the potential differential impacts of nonresponse on these key statistics. This constellation of evidence aims to provide researchers with a clearer understanding of the potential impact of nonresponse as well as remedies for it.

Investigating Nonresponse Bias in a Nonresponse Bias Study Paul J. Lavrakas, Independent Consultant ([email protected]); J. Michael Dennis, Knowledge Networks ([email protected]); Jordon Peugh, Knowledge Networks ([email protected]); Jeffrey Shand-Lubbers, Knowledge Networks ([email protected]); Elissa Lee, Google, Inc.; Owen Charlebois, Google, Inc.

Going back to discussions at the 2002 International Nonresponse Workshop (organized by Bob Groves and Lars Lyberg), it has become a best practice to incorporate nonresponse bias studies into survey designs; e.g., OMB has issued a directive that any federal survey expecting less than an 80% response rate should include a study of nonresponse bias. Thus nonresponse bias studies have started to be reported, including those that attempt to gather new data from the original study’s responders and nonresponders. However, few of these nonresponse bias studies achieve a near-perfect response rate.

We will report findings about nonresponse and nonresponse bias in a nonresponse bias study. These data come from a 2011 survey conducted with existing panel members, which followed up responders and nonresponders to a prior 2011 study in which they were invited to join a new measurement panel. In the original study, a random national sample of 400 households was drawn from KnowledgePanel®, a probability-based online panel. One-third of that initially designated sample completed the original questionnaire to determine their eligibility and consent to participate in the measurement panel, yielding an AAPOR RR1 of 33.5%. After the recruitment period for the original study ended, all 400 households were invited to complete a nonresponse bias follow-up questionnaire. This follow-up survey achieved a 75.5% AAPOR RR1. A host of demographic and psychographic variables on all 400 households was known from other questionnaires these households had previously completed. We will report findings on the nature of the nonresponse in the original study (2/3 of which were nonrespondents) and the nature of the nonresponse to the follow-up nonresponse study (1/4 of which were nonrespondents). To our knowledge, this will be the first time findings have been reported on the nature of nonresponse bias in a nonresponse bias study.
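For readers unfamiliar with the AAPOR RR1 figures cited above, a minimal sketch of the standard calculation follows (completed interviews divided by all eligible plus unknown-eligibility cases, per the AAPOR Standard Definitions); the disposition counts used here are hypothetical illustrations, not the study's actual case counts.

def aapor_rr1(completes, partials, refusals, noncontacts, other, unknown):
    # AAPOR Response Rate 1: completed interviews over all eligible cases
    # plus cases of unknown eligibility (AAPOR Standard Definitions).
    return completes / (completes + partials + refusals + noncontacts + other + unknown)

# Hypothetical dispositions for 400 known-eligible panel households:
# 134 completes and 266 refusals/noncontacts reproduce the 33.5% cited above.
print(aapor_rr1(completes=134, partials=0, refusals=200, noncontacts=66, other=0, unknown=0))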

Non-Response in Recontact Surveys of Hard to Reach Populations Gregory A. Smith, Pew Research Center ([email protected]); Leah Melani Christian, Pew Research Center ([email protected])

One common way to efficiently recruit participants for surveys of hard to reach populations is to recontact respondents whose answers to previous studies indicate they are qualified members of the hard to reach population being surveyed. For example, recontacting respondents who had identified themselves as Muslims in surveys conducted between 2000 and 2006 was one key component of the sampling plan for the 2007 Pew Research Center survey of Muslim Americans. A similar approach was employed to sample Jews, Mormons, atheists and agnostics as part of Pew’s 2010 Religious Knowledge Survey, and we continued to employ this approach as part of a new 2011 survey of Muslim Americans.

But to what extent is non-response bias a problem in recontacted samples of hard to reach populations? That is, to what extent are respondents who are successfully recontacted and agree to participate in a second survey similar to or different from respondents who cannot successfully be recontacted or recruited for a second survey? This paper seeks to answer this question. Since the surveys of hard to reach populations we have conducted and the earlier surveys from which recontact samples are drawn include a number of identical questions, we are able to assess the degree to which the demographic characteristics and a small number of attitudinal indicators (such as party identification and political ideology) of the successfully recontacted respondents resemble the broader population from which the recontact sample is drawn. We pay special attention to whether the amount and nature of non-response bias observed varies depending on the hard to reach population in question. And we consider the implications of our findings for researchers planning to utilize recontacted samples in efforts to survey hard to reach populations.

Conference Chair's Potpourri: Interesting Papers that Didn't Quite Fit Elsewhere in the Conference Program

Emotional Risks of Survey Research: Experimental Evidence Susan Labott, University of Illinois at Chicago ([email protected]); Timothy Johnson, University of Illinois at Chicago ([email protected])

Risks and harms associated with survey research have rarely been studied, although a few studies have documented emotional distress in subjects with emotional vulnerabilities. The purpose of this study was to induce negative moods in subjects, to evaluate the intensity and duration of distress, and then to test novel approaches to alleviate negative moods. To ensure subject safety, individuals with pre-existing vulnerabilities that could place them at risk for harm (due to PTSD, depression, suicidality, recent psychiatric admission, trauma, or loss) were excluded. Individuals (N=395) participated in a telephone interview about an event that was personally distressing; mood, stress, and emotional reactions were assessed after the interview. In a second interview 24-48 hours later, respondents were randomly assigned to one of three novel interventions designed to either enhance or alleviate the emotional effects of the initial interview. Controlling for respondent characteristics, results indicated that interviews on distressing topics do make the moods of participants significantly more negative, although no adverse events occurred. From the end of the initial interview to the beginning of the second interview, moods naturally returned to baseline levels. During the second interview, participants who again discussed the distressing event reported moods significantly more negative than those who discussed either a neutral or a happy event. This study provides strong evidence that, in emotionally healthy subjects, interviews on distressing topics create negative moods but do not place the individuals at risk for harms due to the research. It also demonstrates that additional interventions may be useful to alleviate any negative moods induced in survey research. Future work on samples more at risk will be needed to fully evaluate the utility of these methods for the protection of human subjects.

Emotion, Neuroscience, and Responses to Survey Questions George Franklin Bishop, University of Cincinnati ([email protected]); Stephen T. Mockabee, University of Cincinnati ([email protected])

What George Lakoff calls the Enlightenment theory of mind has dominated much of modern public opinion research. In asking survey questions, many investigators assume that respondents know what they think and that the public is essentially rational, at least in the aggregate. Think here of Page and Shapiro’s "The Rational Public" and Frank Newport’s "Polling Matters". Pollsters typically presume, as well, that respondents can consciously report on the “reasons” why they have the opinions they do—why, for example, they approve or disapprove of the way the president is handling his job. But they can’t. As neuroscientist Michael Gazzaniga reminds us: “Ninety-eight percent of what the brain does is outside conscious awareness.” Our left-brain narrating mechanism, “the interpreter,” continually constructs plausible, after-the-fact explanations of the information it’s exposed to. Our “reasoning” is emotionally driven and biased by our partisan brains. We “reason with our gut,” as Drew Westen puts it, and we justify our pre-existing emotional preferences with plausible, after-the-fact justifications and rationalizations. But contemporary theories of the survey response, modeled on the metaphor of the mind as a computer, are virtually devoid of any mention of feelings and emotions in the question-and-answer process. The stages of question comprehension, retrieval of information from memory, judgment or estimation, and response-category mapping all look like purely cognitive, higher-order cortical functions, with emotions regarded as cold cognitive processes at best. None of this resembles the emotional brain illuminated by Damasio, LeDoux, et al. Our paper will: (1) review research in contemporary neuroscience, including functional magnetic resonance imaging (fMRI) studies of social and political beliefs and attitudes, and (2) present an alternative theoretical model of responses to survey questions that gives a central place to the primacy of emotion and affect—one of many neuro-frontiers in public opinion and social research.

Investigating Automated Coding of Open-Ended Survey Questions Rebecca J. Weiss, Stanford University ([email protected]); Matt Berent ([email protected]); Jon A. Krosnick ([email protected]); Arthur Lupia ([email protected])

Open-ended questions are a commonly used method to gain insight into the opinions and experiences of survey respondents. However, coding answers to open-ended questions can be a costly endeavor; human coders require training and supervision to ensure accuracy and consistency, and may require a considerable amount of time to perform the coding reliably. Machine learning coding methods may be less expensive and faster than human coding, and their results may be more accurate. In this paper, we explore the application of supervised machine learning methods to answers to open-ended survey questions from the American National Election Studies 2008 Time Series Survey. Respondents were asked to identify the nation's most important problems, the things they liked and disliked about the major party candidates running for President, their occupation, and quiz questions testing their knowledge about politics.

We evaluate the benefits and shortcomings of different supervised learning methods in terms of coding accuracy and consistency. Additionally, we investigate whether features such as response length influence the quality of automated coding. The application of machine learning and data mining techniques to public opinion research is at the cutting edge of the field. But it is important to recognize why and when these methods should and should not be applied. In this paper, we will evaluate not only the promise of these methods for automated coding but also their limitations. These lessons will have application to other text-mining approaches in public opinion research.
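As a concrete, purely illustrative example of the kind of supervised approach evaluated here, the sketch below trains a simple text classifier on hand-coded answers and applies it to new responses. It uses scikit-learn with toy data; the authors do not specify their learners or software, so none of this should be read as their actual pipeline.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Toy training data standing in for hand-coded "most important problem" answers.
train_text  = ["the economy and jobs", "the war in iraq", "health care costs",
               "unemployment is too high", "troops overseas"]
train_codes = ["economy", "foreign policy", "health care", "economy", "foreign policy"]

vectorizer = TfidfVectorizer(ngram_range=(1, 2))        # words and bigrams as features
classifier = LogisticRegression(max_iter=1000)
classifier.fit(vectorizer.fit_transform(train_text), train_codes)

# Code new, unlabeled answers with the trained model; in practice the model's
# codes would be compared against held-out human codes for accuracy and consistency.
new_answers = ["jobs and the economy", "the cost of health insurance"]
print(classifier.predict(vectorizer.transform(new_answers)))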

Guidelines for Matching Ethnographers to Targeted Race/Ethnic Sites in Cross-Cultural Survey Evaluations: How Well Did They Work in a 2010 Census Evaluation? Laurie Schwede, U.S. Census Bureau ([email protected]); Rodney Terry, U.S. Census Bureau ([email protected])

The Census Bureau is conducting the 2010 Census evaluation, “Comparative Ethnographic Studies of Enumeration Methods and Coverage in Race/Ethnic Groups.” This evaluation aims to identify reasons why miscounts of some minority subpopulations persist across decennial censuses and to suggest improvements. Seventeen ethnographers were selected to conduct a coordinated set of small-scale systematic Census field observation/debriefing studies of live 2010 Census interviews. They were to address the same research questions with the same methods at the same time in nine U.S. race/ethnic sites in each of two independent data collections at different times in the same sites. In most sites, the design called for researchers to try to observe most interviews in households of a particular race/ethnic group: Alaska Native; American Indian; Asian; Black; Native Hawaiian; non-Hispanic white; Middle Eastern or Hispanic; or in a general quasi-control site.

The aims of this paper are to: 1) describe the methods and guidelines we developed for selecting and matching ethnographers to the sites; 2) assess how well these guidelines worked in the selected field sites; 3) identify factors affecting the ethnographers’ ability to complete most of their 35 interviews with their target group; and 4) revise our guidelines. Data come from 17 ethnographers and their observed interviews. We look at different combinations of interviewer/respondent/ethnographer characteristics. In this evaluation where ethnographers were to observe unobtrusively and minimize their effects on the standardized interview, ethnographers’ fluency in the respondents’ foreign language was beneficial when the interviewers were also bilingual but problematic when the interviewers did not speak that language. Target race/ethnic group size, geographical aggregation/dispersion, and cultural barriers were also important factors, as were ethnographer and interviewer characteristics, field staff cooperation, sample size and site features. We offer revised guidelines for selecting ethnographers for observation-based survey evaluations in cross-cultural research and discuss wider survey methodology implications.

A Case Study of Developing Translation Standards for Consumer Research in Emerging Markets Jennie W. Lai, Nielsen ([email protected]); Mandy Sha, RTI International ([email protected]); Teresa Jin, Nielsen ([email protected])

The inherent challenge of managing the translation of multilingual materials for global research is developing a systematic translation process given the varying translation expertise and resources available locally. For Nielsen’s consumer research, the survey materials are typically first designed in English (the source language) and then translated into the target language(s) with culture-specific tailoring based on the research needs of the local market. In an effort to standardize the translation method globally, three experimental approaches were developed to assess the quality of translation, tailored to the varying expertise and tools available in the local markets. These three approaches involved different translator qualifications: a centralized professional translation service within Nielsen, a local professional translation service, and a Nielsen project member who is a native speaker and familiar with the consumer panel study in China. Additionally, a toolkit of translation reference materials (including a translation validation form, translation input document, vocabulary bank and/or step-by-step translation procedures) was developed for each translation approach, tailored to the qualifications of the translators. The experiment followed the standardized approach of Translation, Review, Adjudication, Pretesting and Documentation (TRAPD), and the translation output was evaluated based on the toolkit provided in addition to the expertise and resources available for translation and review locally. The face-to-face recruitment materials of a large consumer panel study in China were used to evaluate the proposed translation methods from English to Chinese. Both qualitative evaluation through a small number of one-on-one interviews in China and quantitative evaluation scoring the translations with problem codes developed by Pan and Fond (2010) were used for each approach. This paper will discuss one of the three translation approaches (using a centralized professional translation service), point to the lessons learned, and share recommendations for adapting and implementing translation standards as well as areas for future research.

Election and Opinion Polling Methods

Probabilistic Turnout Reporting for Upcoming Elections: An Analysis Catherine Wilson, American National Election Studies ([email protected])

This research analyzes an effort by the American National Election Studies to improve its measure of how likely a respondent is to vote in an upcoming election. Traditionally the ANES has asked whether or not the respondent expects to vote with a categorical or yes/no format question. In 2010 the ANES piloted a question asking respondents to report the percent chance that they would vote in the next election.

We look at two interesting aspects of the new question. Firstly, how do respondents map between their response to a categorical frequency question and their response to a question that elicits numerical percent chance? We find that the percent chance distribution skews upward across the categorical choice options. Secondly, we evaluate the accuracy of the new question in predicting likely voters by incorporating it into the estimation of a traditional voter turnout model. The results of this analysis suggest ways that we may improve measures of expected voting in future surveys.

Identifying Likely Voters in Pre-election Polls: Comparing Methods to Find the Best One David Vannette, Stanford University ([email protected])

Many pre-election polls are conducted in order to identify the candidate preferences of people labeled "likely voters". A commonly cited challenge in pre-election polling is the fact that a large proportion of respondents say they will vote in the upcoming election - often a much greater proportion than the proportion of the population who actually end up voting. Therefore, researchers wish to identify the subset of respondents who are truly likely voters, and different organizations use different approaches. Our study evaluates the effectiveness of a variety of different methods for identifying likely voters. We use data from the American National Election Studies (ANES) 2008 Time Series Study, which administered a wide array of measures that could be used to identify likely voters during pre-election interviews. Then, these same respondents were interviewed post-election and were asked whether they in fact voted. We attempt to identify the optimal combination of pre-election reports with which to effectively separate people who did vote from those who did not. We use the Gallup 7-item battery as a starting point in this work and supplement it with a variety of non-demographic and demographic measures that are known to predict voter turnout. We explore two different approaches: (1) dividing respondents into two categories: voters and non-voters via a discriminant function approach, and (2) assigning a probability of turnout to each respondent and then comparing the effectiveness of various probabilistic cut points for separating voters from non-voters. The result is evidence pointing to the methods that can be most effective in identifying which survey respondents will vote on Election Day. This evidence will be of value to all researchers who conduct pre-election polls or who interpret such data.
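As a purely illustrative sketch of the second approach described above (assigning each respondent a turnout probability and comparing cut points), the code below fits a turnout model to simulated data and sweeps several thresholds. The predictors and data are hypothetical stand-ins, not the ANES measures or the Gallup battery.

import numpy as np
from sklearn.linear_model import LogisticRegression

# Simulated pre-election predictors (e.g., stated intention, past vote, interest)
# and validated post-election turnout (1 = voted).
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
true_prob = 1 / (1 + np.exp(-(0.5 + 1.2 * X[:, 0] + 0.8 * X[:, 1])))
voted = rng.binomial(1, true_prob)

model = LogisticRegression().fit(X, voted)
p_vote = model.predict_proba(X)[:, 1]            # estimated probability of voting

# Classify respondents as "likely voters" under several probabilistic cut points.
for threshold in (0.4, 0.5, 0.6, 0.7):
    likely = (p_vote >= threshold).astype(int)
    accuracy = (likely == voted).mean()
    print(f"cut={threshold:.1f}  accuracy={accuracy:.2f}  share classified likely={likely.mean():.2f}")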

Reducing Overreporting Voter Turnout in Seven European Countries–Results from a Survey Experiment Steve Schwarzer, TNS opinion ([email protected]); Sylvia Kritzinger, University of Vienna, Department of Methods in the Social Sciences ([email protected]); Eva Zeglovits, University of Vienna, Department of Methods in the Social Sciences ([email protected])

Information on voter turnout is crucial when studying electoral behavior in liberal democracies. Most of these analyses are based on turnout questions stemming from survey research. The problem with using survey data is that they might not reflect the actual behavior of the respondents: respondents overreport turnout, meaning that they report having voted but did not actually vote. Taken further, analyses based on these data might produce biased results and conclusions.

There have been several attempts to reduce overreporting by introducing new question wordings and more extended lists of answer options. These forms have been tested successfully, mostly in the US but also recently in the Austrian context. However, social desirability bias and memory errors – two possible reasons for vote overreporting – are known to be sensitive to the survey mode and the time that has passed since the last election. National context variables, such as the general level of turnout in the last election, are also relevant.

Thus, in this paper we study how different question wordings work in different contextual settings and what impact this has on comparative survey research. To do so, we conducted a survey experiment (web survey) including the usual turnout question and two new question forms in 7 European countries in November 2011. Our analyses allow us (1) to determine whether, and how far, former results on reducing overreporting in the voter turnout question can be generalized to different settings, and (2) to examine whether, and how, models explaining reported turnout might be biased if overreporting is not reduced.

Breaking Bad? Method and Meaning of the "Breaking News" Question in Exit Polling Jennifer Agiesta, The Associated Press ([email protected]); Patrick Moynihan, Harvard University ([email protected]); Lillian Nottingham, Harvard University ([email protected])

To measure the impact of an immediate, current event on voter behavior and attitudes, Election Day exit polls sponsored by the TV news networks and the Associated Press have incorporated the "breaking news" item on their self-administered questionnaires for the last several election cycles. Since exit polls use self-administered questionnaires, this closed item is printed in advance with a blank space provided where exit poll interviewers fill in the specific news event using a very brief description decided on short notice by consortium members. Using representative data of voters from the National Election Pool since 2004, this paper will explore the performance of this item as it has varied by topic and salience (e.g., “the recent terrorism attempt” to “campaign ads”), question wording and response options (e.g., “…how would you rate the importance of _? Very important/Somewhat important/Not too important/Not at all important” vs. “…how important was _? The single most important factor/One of several important factors/A minor factor/Not a factor at all”), and use in a state or national, Democratic or Republican contest. Other factors to be analyzed include: how the breaking news item is presented to respondents (e.g., handwritten or pre-printed sticker), and the extent to which breaking news is associated with vote preference, time of voting decision (e.g., “within the last few days” or “within the past month”), ideology, education and other background variables. The paper will conclude with recommendations on future use of the breaking news item in exit polls.

Data Quality from Low Cost Data Collection Methodologies Michael W. Traugott, University of Michigan ([email protected])

There has been a rapid rise in the number of firms that offer low cost data collection on the web or by interactive voice response (IVR) techniques that employ computers to dial phone numbers and interview people with touch tone devices through the use of a computerized voice. Some firms even offer “do it yourself” polls where any questionnaire can be fielded for less than $1,000. This paper reports on a study in which multiple organizations fielded the same questionnaire employing such techniques for data collection, and the results are compared to those from a major national study that fielded the same questions. I compare the surveys in terms of sample characteristics, the use of weights, statistical properties of the main variables of interest, and the nature of the relationships between these variables. The general topic of the survey is public opinion about President Barack Obama’s citizenship and the relationship of politically relevant variables to those attitudes. The baseline for comparison is the 2010 American National Election Study.

Expanding the Frontiers of Survey Research through the Collection of Biological Data

Expanding the Frontiers of Survey Research through the Collection of Biological Data Angela Jaszczak, NORC at the University of Chicago ([email protected]); Samantha Clemens, National Centre for Social Research ([email protected]); Leslie Erickson, RTI International ([email protected]); Nickie Rose, Ipsos MORI ([email protected]); Heidi Guyer, Survey Research Center, University of Michigan ([email protected]); Katie Lundeen, NORC at the University of Chicago ([email protected])

New Frontiers: Challenges in Using Twitter to Measure Public Opinion

Methodological Considerations in Analyzing Twitter Data Annice Kim, RTI International ([email protected]); Heather Hansen, RTI International ([email protected]); Joe Murphy, RTI International ([email protected])

Twitter is a free online microblogging tool where users can send and read short messages (“tweets”) of up to 140 characters. With over 300 million registered users and more than 140 million tweets per day, Twitter allows for mass-scale sharing of opinions by users worldwide. As a result, researchers are increasingly interested in mining Twitter data to gain insights into public opinion. However, there are challenging methodological issues to consider in extracting and analyzing Twitter data.

In this session, we use examples from an ongoing study of Twitter data on a wide range of health topics, including influenza, healthcare reform, and substance abuse, to illustrate methodological issues in analyzing Twitter data. We will discuss insights on: 1) sampling: firehose vs. API access; 2) identifying relevant tweets: the high level of noise in keyword-based queries and the process of cleaning data; 3) metrics: Twitter activity as total counts vs. as a proportion of all tweet volume; and 4) time frame of analysis: variation in data when examined over hours and days vs. weeks and months, and how trends over time can be dramatically influenced by key events. We will discuss areas for future research, including the need for standards in sampling and metrics, and insights into general patterns of Twitter use and the demographics of users.
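To make the third point concrete, the toy sketch below contrasts raw daily counts of topical tweets with the same counts expressed as a proportion of all tweets collected that day; the data and the keyword rule are hypothetical illustrations only.

from collections import Counter

# Hypothetical (date, text) pairs already pulled from Twitter.
tweets = [
    ("2012-01-01", "got my flu shot today"),
    ("2012-01-01", "traffic is terrible"),
    ("2012-01-02", "everyone at work has the flu"),
    ("2012-01-02", "new phone arrived"),
    ("2012-01-02", "flu season is brutal"),
]

total_by_day = Counter(day for day, _ in tweets)
flu_by_day = Counter(day for day, text in tweets if "flu" in text.lower())

for day in sorted(total_by_day):
    count = flu_by_day[day]
    share = count / total_by_day[day]    # normalizing by total volume can change the apparent trend
    print(day, "count:", count, "share of volume:", round(share, 2))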

The Challenges in Analyzing Twitter Data for Public Opinion Researchers Masahiko Aida, Greenberg Quinlan Rosner ([email protected])

Twitter, the so-called micro-blogging service, has gained a fair amount of popularity recently, and its tweet volume is increasing at an exponential rate.

It is natural for survey researchers to wonder whether the ideas and opinions expressed on Twitter are correlated with the public opinion we have been measuring with traditional methods (such as telephone, web, or in-person interviews). If so, social media data could be another method of data collection that survey researchers can access with much less time and cost. In this paper, the author identifies four areas of challenge in obtaining and analyzing Twitter data and then compares Twitter data with measurements from a series of nationally representative telephone surveys.

The first challenge is data collection: the volume of tweets is already large and is increasing at an exponential rate. The author will discuss whether a census is possible and, if not, what kind of sampling scheme would be appropriate.

Second, as all Twitter data are text data, researchers need to transform the text into quantitative metrics. Because tweets are free text, they contain a large variety of spellings, typographical errors, abbreviations (such as hashtags) and links. The author will discuss a data cleaning strategy utilizing freely available natural language processing programs.
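As a minimal illustration of this kind of cleaning step (not the author's actual strategy), the sketch below strips links and user mentions, keeps the words behind hashtags, and normalizes whitespace and case before any further natural language processing.

import re

def clean_tweet(text):
    # Light normalization before tokenization: drop links and @mentions,
    # keep hashtag words, collapse whitespace, lowercase.
    text = re.sub(r"https?://\S+", " ", text)
    text = re.sub(r"@\w+", " ", text)
    text = re.sub(r"#(\w+)", r"\1", text)
    text = re.sub(r"\s+", " ", text)
    return text.strip().lower()

print(clean_tweet("Feeling awful :( #flu season again @friend http://t.co/xyz"))
# -> "feeling awful :( flu season again"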

Third, the optimal functional form of the relationship between public opinion and Twitter data is unknown and may be subject specific. It may be a simple linear relationship, a lagged relationship, or a non-linear one. The author will present a systematic method of estimating an optimal functional relationship between public opinion data and Twitter data.
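One simple way to begin probing that functional form (offered here only as an illustration, not the author's systematic method) is to compare correlations between the two series at several lags, as in the hypothetical sketch below.

import numpy as np

# Hypothetical weekly series: a Twitter-based index and a survey estimate.
twitter_index = np.array([0.10, 0.15, 0.20, 0.30, 0.25, 0.35, 0.40, 0.38])
survey_series = np.array([0.12, 0.11, 0.16, 0.22, 0.28, 0.27, 0.33, 0.41])

def lagged_corr(x, y, lag):
    # Correlate x at week t with y at week t + lag (Twitter leading the survey by `lag` weeks).
    if lag == 0:
        return np.corrcoef(x, y)[0, 1]
    return np.corrcoef(x[:-lag], y[lag:])[0, 1]

for lag in range(4):
    print("lag", lag, "weeks:", round(lagged_corr(twitter_index, survey_series, lag), 2))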

Lastly, as the proportion of the general public who actively use Twitter is still very small, the author will compare correlations among different subgroups to see whether the degree of Twitter use mediates the size of the correlation.

Do Social Media Affect Public Discourses? A Sentiment Analysis of Political Tweets during the French Presidential Election Campaign Steve Schwarzer, Marie Curie Initial Training Network ELECDEM ([email protected]); Leendert de Voogd, TNS (Political & Social) ([email protected]); Pascal Chelala, TNS opinion ([email protected])

Since the election of Barack Obama in 2008, social media have become an essential part of the political campaign toolkit. The literature provides us with a general understanding of the growing success of social networks; it looks as though a new form of Agora is gaining ground all over the world. On the other hand, empirical research on the role of micro-blogs in shaping and/or predicting public opinion is lagging behind. Our proposed paper focuses on the question of how micro-blogs inform us about public opinion and the political landscape in the off-line world: do social media in general, and Twitter in particular, work as a functional substitute for analyzing public opinion and political debates? To answer this, we analyze the sentiment of political tweets posted about the French presidential election campaign. Our analytical instruments use tailor-made text-mining and machine-learning algorithms built on a vast body of pre-coded political messages. Comparing the outcome of our sentiment analysis with data from our own traditional public opinion research, the paper will show the predictive power of tweets for public opinion. This analysis will allow us (1) to define how the sentiment analysis of political tweets can be used to predict movements in public opinion, and (2) to assess whether one could measure the potential impact of Twitter by taking into account the relative influence of the authors of these tweets. The paper will conclude by answering the question of whether micro-blogging can be considered a new form of participatory democracy and, more importantly, whether it can drive political discussions and political discourse.

Can Automated Sentiment Analysis of Twitter Data Replace Human Coding? Annice Kim, RTI International ([email protected]); Ashley Richards, RTI International ([email protected]); Joe Murphy, RTI International ([email protected]); Adam Sage, RTI International ([email protected]); Heather Hansen, RTI International ([email protected])

In recent years, researchers have begun analyzing massive volumes of social media data, including content from the popular microblog Twitter, as a new method to measure public opinions and behaviors. A growing number of social media analytics tools offer researchers the ability to quickly analyze the sentiment of text data. Sentiment analysis is the automated computational coding of text to determine whether expressed opinions are positive, neutral, or negative. Although automated sentiment analyses are increasingly used to analyze Twitter data (tweets), the validity of these tools is largely unknown. In this study, we compared results from automated sentiment analysis to manual coding of tweets. Random samples of 500 tweets were selected for a wide range of health topics including influenza, healthcare reform, and substance use (cocaine and salvia), for a total of 2,000 tweets. Automated sentiment analysis was conducted using IBM SPSS Text Analytics for Surveys (STAS) software and Radian6, a leading provider of social media monitoring and analytics. For manual coding, a codebook was developed with definitions for positive, negative, and neutral sentiment, and all tweets were coded by two independent coders, with any discrepancies resolved by an adjudicator. For nearly half of the tweets, the automated analysis produced results that disagreed with the manual coding. Rates of agreement between automated and manual coding were slightly higher for Radian6 than for STAS. These results suggest that current automated methods do not replicate the gold standard of manual coding. While some vendors provide the ability to customize their sentiment tools, this is not the default. Our results suggest that manual coding is a more valid method for analyzing the sentiment of text data than out-of-the-box automated tools and that researchers should use caution when considering the use of automated sentiment analysis tools.
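For readers who want to quantify this kind of agreement themselves, the sketch below computes simple percent agreement and a chance-corrected statistic (Cohen's kappa) between an automated tool's labels and adjudicated manual codes; the labels shown are invented, not the study's data.

from sklearn.metrics import cohen_kappa_score

# Hypothetical codes for the same tweets: adjudicated human labels vs. an
# automated tool's output (positive / neutral / negative).
manual    = ["pos", "neg", "neu", "neg", "pos", "neu", "neg", "pos"]
automated = ["pos", "neu", "neu", "neg", "neg", "neu", "neg", "pos"]

agreement = sum(m == a for m, a in zip(manual, automated)) / len(manual)
kappa = cohen_kappa_score(manual, automated)    # agreement corrected for chance

print("percent agreement:", round(agreement, 2))
print("Cohen's kappa:", round(kappa, 2))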

Operational Issues in Cell Phone Surveys

Why We No Longer Need Cell Phone Incentives Thomas M. Guterbock, Center for Survey Research, University of Virginia ([email protected]); John Lee P. Holmes, Center for Survey Research, University of Virginia ([email protected]); Robin A. Bebel, Center for Survey Research, University of Virginia ([email protected]); Peter A. Furia, Center for Survey Research, University of Virginia ([email protected])

Two recent cell phone survey incentive experiments lead us to conclude that the increasing prevalence of “unlimited” cell plans has undermined the case for incentives.

Early studies of cell phone incentives, ours included, concluded that offering incentives to cell phone survey participants would increase response rates (Brick et al. 2007, Diop et al. 2007). Cash also ought to reduce non-response bias if marginal and low-income potential respondents find a cash incentive highly salient, as predicted by leverage-salience theory (Groves 2000). But more recent experiments (Oldendick and Lambries 2010) have found no effect on production rates.

We conducted two experiments (Washington Metro and Danville, VA) with random assignment of cell phone users to $10 and no-incentive groups and asked respondents if they expected “to actually pay a higher bill this month as a result of doing this survey” on their cell phone. We compared production rates and representation. We expected to find more people who needed to pay for their calls among the incentive recipients. We also expected that poor and marginal groups would be more likely to pay for their minutes. Even if lower income respondents perceived that their minutes were paid for, we expected them to be better represented in the incentive treatment.

These hypotheses were not supported. In Danville the incentive caused only a small difference in production rates, whereas calling in the Washington Metro area showed no difference. In general, very few respondents expected that our call would cost them money. Marginal groups were represented nearly equally under both treatments. We will detail these findings and assess any significant substantive differences in responses between the two groups. The good news is that researchers can lower costs by not offering reimbursement to cell phone respondents, and need not be concerned that they will thereby diminish sample representativeness.

Geographical Accuracy of Cell Phone Samples and the Effect on Telephone Survey Bias, Variance, and Cost Benjamin Skalland, NORC at the University of Chicago ([email protected]); Meena Khare, National Center for Health Statistics ([email protected]); Carolyn Furlow, National Center for Immunization and Respiratory Diseases ([email protected])

Prior to sampling, geographic information can be derived from landline telephone numbers with great accuracy, allowing for state-specific landline surveys and effective geographic stratification for national surveys producing state-level estimates. However, the assignment of geographical information to cell phone numbers is more problematic because the cell phone exchange is associated with the place the service for that cell phone number was originally acquired, which is not necessarily the place where the person currently resides: a person could have acquired the service in a different state than the state of residence or could have moved to a different state since activation. Christian et al. (2009) estimate that less than 3 percent of landline households reside in a state that differs from the state associated with the landline telephone number, but about 12 percent of cell-phone-only households reside in a state that differs from the state associated with the cell phone number. In this paper, we present state-level estimates of the geographic accuracy of cell phone samples for cell-phone-only households from the 2009 National H1N1 Flu Survey, a dual-frame RDD survey conducted by NORC on behalf of the Centers for Disease Control and Prevention. We then discuss the implications of cell sample geographic inaccuracy on the bias and variance of dual-frame estimates, as well as on the cost of dual-frame surveys.

Distractions, Privacy, Costs: What are Cell Phone Respondents Concerned About? Gene M. Lutz, Center for Social & Behavioral Research, University of Northern Iowa ([email protected]); Mary E. Losch, Center for Social & Behavioral Research, University of Northern Iowa ([email protected])

Issues of coverage and response rates have made dual frame landline and cell sampling strategies common practice for population-based surveys using a telephone data collection mode. Beginning with early studies incorporating cell phone users, concerns have been raised about respondent reactions to unsolicited calls to cell phones, perceptions of privacy, social acceptability, costs to respondents, and distraction/safety, among others. While researchers have decided to use cell phone frames, few studies have provided findings related to cell respondents’ perceptions of their inclusion. Since 2008, cell-only respondents participating in the Iowa BRFSS have been asked about their location and competing behaviors during the interview along with their opinions about privacy, as well as their overall assessment of the cell-respondent experience. Results show that many concerns were less evident than anticipated and not widely shared by cell respondents. However, some subgroups of respondents were more concerned than others and implications of some of these concerns are important to consider regardless of respondent views. Four-year trends will be presented along with a general profile and implications for telephone data collection with cell phone users.

The Impact of Telephone Number Churn on Dual-Frame (Landline/Cell) RDD Response Rates Heather M. Morrison, NORC at the University of Chicago ([email protected]); Kathleen Santos, NORC at the University of Chicago ([email protected]); Robert H. Montgomery, NORC at the University of Chicago ([email protected])

Among the many factors that have been speculated to contribute to declining landline RDD response rates is telephone number turnover, or “churn.” Number portability, cell-phone-only households, 1,000-block number pooling and even simple household relocation all contribute to an increasingly unstable landscape for landline telephone surveys. Analyses conducted in 2008 on data from a suite of large CATI survey projects with a protracted data collection period examined this churn by analyzing the rates at which landline telephone numbers initially categorized as known households lose that status, and the converse. Looking at data from the State and Local Area Integrated Telephone Survey (SLAITS), conducted by NORC under the direction of the National Center for Health Statistics, we found that after a seven to eight week lag in contact approximately six percent of known households became non-households and an additional 20 percent resulted in noncontact, with both rates worsening slowly over further time.

However, as telephone surveys increasingly adopt dual-frame sampling approaches that include both landline and cell telephone numbers it may be that the impact of this churn is minimized by the inclusion of the cell lines. Once again drawing on data from SLAITS we examine the rate at which cell telephone numbers initially categorized as known households lose that status. We seek to better quantify cell phone number status change and using this analysis, provide guidance to studies that may use recontact designs or encounter other long delays between dials. We will also consider factors that influence number turnover with an eye to better understanding their impact on future surveys.

Public Opinion and Survey Research in Afghanistan

Prospects for Taliban Reconciliation in Afghanistan Matthew Warshaw, D3 Systems, Inc. ([email protected]); John Richardson, D3 Systems, Inc. ([email protected])

As Operation Enduring Freedom begins its 11th year, a political settlement with the Taliban is increasingly being viewed by US officials as a key component to ending the war in Afghanistan. However, debate continues as to what such a settlement will look like and its likelihood of success. Although reconciliation is popular with the US military and diplomats, what do Afghans think about it?

This paper explores Afghans’ opinions on Taliban reconciliation, examining multiple forms that such a settlement might take. Perceptions of the general population are compared with those of women and ethnic minorities, who are considered to have the most to lose if reconciliation efforts move forward. Regional differences are also explored to compare opinions in the southern and eastern regions, which have seen the most violence, with those of the country as a whole.

Conclusions in this paper are supported by recent polling across Afghanistan by ACSOR Surveys on behalf of D3 Systems, Inc. This study examines a recent representative survey of Afghanistan in which Afghans conducted face-to-face interviews with other Afghans following nationwide sampling in proportion to the population.

Effects of the 2009 Afghan Presidential Election on Public Opinion Jill M. Heese, University of Nebraska-Lincoln ([email protected]); Ann M. Arthur, University of Nebraska-Lincoln ([email protected])

Free and fair elections, held without fear of violence or fraud, are the building blocks of democracy. Plagued by scandal, the 2009 presidential election in Afghanistan provides a unique opportunity to examine how public opinion is affected when these building blocks are absent.

In an effort to gain support of conservative Shiites, President Hamid Karzai passed the Shiite Personal Status Law in March of 2009. This law dramatically decreased women’s rights, requiring, among other things, permission from their husbands before voting or even leaving the home.

Security was a major concern, with the number of insurgent attacks ranging between 32 and 48 per day in the two weeks prior to the election (CBC, 2009). The ousted Taliban vowed to cut off the inked fingers of voters in an effort to keep people from the polls, and when elections were held on August 20, they kept their promise.

Fraud was rampant with over 1.3 million votes cast ultimately deemed fraudulent (Independent Election Commission, 2009). When a run-off election was planned for November between Karzai and his top contender, Abdullah Abdullah, the latter conceded before the run-off was held, citing the corruption.

Using data from the Gallup World Poll, a probability-based multi-national survey, this research examines changes in Afghans’ reported confidence in government and satisfaction with personal freedoms following the 2009 election. Preliminary analysis indicates that confidence in the national government fell from 63.9% pre-election to 29.9% post-election. Interestingly, while approval of the country’s leadership fell substantially (51.8%, 32.8%), approval of city leadership remained steady (59.8%, 57.7%). Additionally, satisfaction with freedom remained relatively stable for males (72.0%, 70.5%) while declining heavily for females (64.5%, 47.7%). This research underscores the importance of free and fair elections in obtaining the confidence of the Afghan people in a “post-Taliban” national government.

Pashtun Women in Pakistan and Afghanistan Anne Dalal Pessala, D3 Systems, Inc. ([email protected])

Afghan and Pakistani Pashtuns have a strong sense of shared identity and hold many common beliefs, including the circumscribed role of women in public and private life. These stringent norms distinguish them not only within the region and Muslim world, but also from other ethnic groups within their respective nations. This paper, using data from the 2011 Women in Muslim Countries survey conducted by D3 Systems, Inc., compares the attitudes and behaviors of Pashtun women in Afghanistan and Pakistan with each other and with other ethnic groups within each country. The Women in Muslim Countries survey is a multi-wave multi-country study conducted via face-to-face and CATI interviews. Interviews in Afghanistan and Pakistan were conducted face to face. Areas of comparison for this presentation include: • Voting behavior • Control over household finances • Education • Attitudes toward Islamic law and human rights • Media use • Access to healthcare and employment.

The Unique Challenges of Polling in a War Zone Pamela Hunter, Glevum Associates ([email protected])

Polling in a war zone poses unique challenges. This paper will summarize the primary challenges of conducting public opinion research in Afghanistan and how they have been addressed. It will be based on the experiences of Glevum Associates, a research firm for which I work as a senior researcher, which has been conducting qualitative and quantitative research as a Department of Defense subcontractor in Afghanistan for more than three years. Researchers in Afghanistan must learn to be flexible and adapt to conditions to provide the best information possible.

Topics will include the following challenges: • Random selection. Landlines are not available, so Afghans who have phones use cell phones. New methods, such as utilizing social networking web sites, are not an option in Afghanistan. Therefore, researchers must use random selection methods that were developed prior to widespread telephone coverage in the US. • Safety issues. Western researchers must rely entirely on Afghans who can go into the countryside without drawing attention. However, even Afghan interviewers face unsafe areas that are completely inaccessible. Some survey questions simply cannot be asked directly, to protect interviewers. • Cultural issues. A deep cultural gap exists between Western nations and Afghanistan. Thus assumptions about the populace cannot be made as they can in other countries. For example, interviewers must be of the same gender as the interviewees, women may be inaccessible, and men may have more than one wife. • Atmospheric data collection. This method is designed to understand the “buzz” among Afghans. Afghans have limited access to media and are largely an oral society. Literacy rates are low; estimates indicate that only 20 to 40% of the population is literate. Thus news is primarily conveyed through word of mouth.

Web Survey Questionnaire Design

Using Adaptive Questionnaire Design in Open-ended Questions: A Field-experimental Study on the Size of Answer Boxes in Web Surveys Marek Fuchs, Darmstadt University of Technology ([email protected]); Matthias Emde, Darmstadt University of Technology ([email protected])

Previous research on open-ended questions has revealed several factors influencing the answers provided by respondents in web surveys. Among others, motivating instructions have been shown to elicit long and rich responses. Furthermore, the visual design of the answer box influences the answers provided: small boxes seem to pose a lower response burden and therefore reduce item nonresponse, while larger answer boxes increase burden but at the same time motivate respondents to provide more extended responses. Based on this evidence, finding the ideal box size seems to require a trade-off between item nonresponse and the length of the answers. Web surveys offer opportunities to deal with this dilemma and optimize answers to open-ended questions. In a survey among university freshman students, we conducted experiments using auto-adjusting answer boxes. Moreover, we used the amount of information respondents typed into the first open-ended question to assign them a custom-size answer box later in the survey. Results suggest that this web-survey-specific enhancement in the visual design of text boxes influences data quality. In particular, the adaptive assignment of answer boxes reduced item nonresponse among respondents who did not answer in the first place, while prolific writers typed more detailed answers into the second box. Results are discussed in light of the general question of whether questionnaire design in general, and visual design in particular, should be used to instruct respondents on how to answer a question, or whether an adaptive design should be used that adjusts elements of the questionnaire to the behavior of respondents.
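The paper does not publish its assignment rule, so the sketch below is only a guess at the kind of logic involved: use the character count of the first open-ended answer to choose the visible height of a later answer box. The thresholds and row counts are hypothetical.

def box_rows_for(first_answer, small=2, medium=5, large=10):
    # Pick the number of visible text-area rows for a later open-ended question
    # based on how much the respondent typed earlier (illustrative thresholds).
    n_chars = len(first_answer.strip())
    if n_chars < 80:
        return small
    if n_chars < 300:
        return medium
    return large

print(box_rows_for("I chose this university because it is close to home."))   # -> 2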

Yes-No versus Checkboxes Response Options in Web Surveys: What Form is Closer to Benchmarks? Mike Murakami, Google ([email protected]); Mario Callegaro, Google ([email protected]); Vani Henderson, Google ([email protected]); Ziv Tepman, Google ([email protected]); Qi Dong, Google ([email protected])

When writing survey questions, researchers who want to maximize response rates, reliability, and validity can choose among a variety of response formats. Among these formats are checkboxes, where a respondent can check off one or more answers from a provided list, and yes-no radio buttons, in which each item must individually be clicked as a positive or negative response.

Previous research shows consistently that asking questions in a yes-no format yields higher endorsements than in a checkbox format. Despite these consistent results, however, there is not enough evidence to demonstrate which of the two forms is closer to the “truth.”

In this paper, we examine the results of two web survey experiments conducted on Google advertisers in which the respondents were randomly assigned to view either a checkbox list or an analogous series of questions with yes-no answer response options.

The results from the first survey (conducted earlier this year) confirm the previous findings that yes/no questions yield greater endorsement rates than checkboxes. In the second, forthcoming survey (to be completed in the first quarter of 2012), we will be able to connect each response to internal information indicating the behavior of that specific respondent/advertiser. Using these objective measures of behavior, we will validate the responses in each experimental condition to determine whether checklists underestimate endorsements, or yes-no radio buttons overestimate them (or both).

Optimal Response Formats for Online Surveys: Branch, Grid, or List? Matthew DeBell, Stanford University ([email protected]); Catherine Wilson, Stanford University ([email protected]); Simon Jackman, Stanford University ([email protected]); Lucila Figueroa, Stanford University ([email protected]); Kyle Dropp, Stanford University ([email protected])

This paper reports the results of randomized experiments comparing branch, grid, and list question formats for many different questions in an Internet survey with a nationally representative probability sample: the American National Election Studies' Evaluations of Government and Society Study. We compare the three formats in terms of administration time, item nonresponse, survey breakoff rates, indications of satisficing, and criterion validity. Results also show there are tradeoffs between data quality and quantity. We meta-analyze the results to quantify these tradeoffs and we conclude with recommendations for optimizing these tradeoffs in various research settings. For many purposes, grids are an appropriate format choice.

Investigating the Impact of the Number of Grid Items on Web Survey Responses Fan Guo, Survey Research Center, University of Michigan ([email protected]); Elizabeth Nunge, Google Inc. ([email protected])

Grids are commonly used, particularly in web surveys, when multiple items share the same response options. Previous research has found that displaying multiple items in a single grid yields more highly correlated responses and less item differentiation compared to using either fewer grid items per screen or a single item per screen (Tourangeau et al., 2004; Yan, 2005; Toepoel et al., 2009). Also, responses in a grid appear to be less consistent when the item definitions are reversed, although the difference is not significant (Callegaro et al., 2009). The aim of our study is to investigate how the number of grid items and screens affects response behavior and data quality given a large number of overall items (12). Users who opt into a survey linked from the Google Maps interface will be shown one of five grid format conditions in a split ballot experiment. Across all groups, users will be presented with only one grid per screen, but the number of screens and grid items will vary: in the first group, users will be presented with one twelve-item grid; in the second group, two six-item grids; in the third group, three four-item grids; and so forth. Additionally, all twelve grid items will be randomized across screens, and all grids will be presented with the same fully labeled 5-point response scale. We hypothesize that the distribution of answers will differ across experimental conditions and that drop-offs and item non-response will increase with larger grids. Additionally, we hypothesize that grids with more items will lead to more measurement error, as indicated by a higher rate of inconsistency, especially when the meaning of some grid items is reversed (i.e., confusing vs. simple).

Positioning of Clarification Features in Web Surveys: Evidence from Eye Tracking Data Tanja Kunz, Darmstadt University of Technology ([email protected]); Marek Fuchs, Darmstadt University of Technology ([email protected])

Enhancing response quality is a major concern in survey research. Especially in self-administered questionnaires, clarification features such as definitions, examples, or instructions are often used to prevent misinterpretation of questions and problems with retrieving relevant information or formatting the answer. However, previous research demonstrated the ineffectiveness of such information because it was likely to be ignored during the question-answer process. In the present study, eye tracking data are used to gain an in-depth understanding of the optimal positioning of clarification features. A comprehensive understanding of questions and answers requires careful reading. Recording eye movements allows us to detect which position of definitions and instructions increases the probability that respondents actually take notice of this information when reading a survey question. Furthermore, by examining respondents' scanpaths while reading the various components of a question, we are able to determine whether attentive reading of clarification features is a sufficient condition for improving response quality irrespective of the sequence in which clarification features and other components of a question are read. In a between-subjects lab experiment, respondents (N=100) are randomly assigned either to the control condition, where no definition, example, or instruction is provided, or to one of three experimental conditions where clarification features are provided at different positions (after the question text, before the question text, after the response options). Results are discussed in light of the general question of whether differences in the responses provided can be explained by differences in eye movements and by differential scanpaths when processing the question.

Friday, May 18, 2012 8:00 a.m. - 9:30 a.m. Concurrent Session C

Address-Based Sampling: Issues and Challenges

Effectiveness of Address Based Sampling for Recruiting into a Longitudinal Panel Darby Steiger, The Gallup Organization ([email protected]); Kyley McGeeney, Gallup, Inc ([email protected]); Yongwei Yang, Gallup, Inc. ([email protected])

Numerous studies have shown that address-based sampling (ABS) is an excellent alternative to traditional RDD sampling for improving coverage in surveys (Dutwin & Link, 2011; English et al., 2010; Fahimi, 2010; Link et al., 2008). Studies of general and hard-to-reach populations have been designed to achieve comparable, if not higher, response rates than RDD frames (Boyle et al., 2011; Montaquila et al., 2010). However, few organizations have studied the efficacy of ABS for recruiting into a longitudinal panel (DiSogra & Hendarwan, 2010; Garret & DiSogra, 2010). The presence of an interviewer should produce recruiting success comparable to RDD samples for "matched" cases, but for unmatched cases, is a mail-only strategy powerful enough to persuade households to make a long-term commitment to a panel?

In 2011, Gallup conducted a large-scale experiment with 100,000 ABS records, randomly assigning households to one of four incentive conditions to join the Gallup Panel, a probability-based, nationally representative, multi-mode panel. All sampled households received an initial invitation mailing that provided details about the commitment. Matched cases received follow-up phone calls from a Gallup interviewer who encouraged them to enroll in the Panel. Unmatched cases received a one-page enrollment form and several follow-up mailings to encourage enrollment. Results show that matched records enrolled in the panel at more than twice the rate of unmatched records, and that a double incentive treatment combining both prepaid and promised incentives boosted enrollment rates by as much as 9 percentage points. This paper will review the results and cost-effectiveness of the recruitment experiment vis-à-vis traditional RDD recruiting. It will also examine the long-term efficacy of using matched and unmatched address-based sampling (and varying incentive levels) on panel retention and attrition.

Sampling From the Abyss? Exploring Biases Inherent in Address-Based Sampling with Marketing Data S. Mo Jang, University of Michigan ([email protected]); Josh Pasek, University of Michigan ([email protected]); Curtiss Cobb, Knowledge Networks ([email protected]); Charles DiSogra, Knowledge Networks ([email protected]); J. Michael Dennis, Knowledge Networks ([email protected])

What happens when survey firms develop address-based samples with the assistance of commercial marketing databases? More and more, ancillary data compiled from marketing sources is being used both as a tool to identify likely respondents for surveys and to sample a set of respondents who look more like the general public. Sampling with this kind of supplementary data can reduce the costs of finding unusual populations and can even aid in building a sample that demographically mirrors the population at large. The data collected by marketing firms is frequently incomplete, however. A large proportion of households lack ancillary data on a variety of variables. If households lacking ancillary data on any particular variable differ systematically from households where that data is present, such a sampling strategy could lead to biased outcomes.

To date, little analysis has assessed the extent to which missing ancillary data might alter the composition of a sample. The current study addresses this possibility. Using a unique dataset compiled by Knowledge Networks, we compare the demographic composition of recruited households with missing ancillary data to the self-reported data from members of those same households. We find that households missing ancillary data are systematically biased with regard to a number of demographic factors. We discuss the implications of these findings for samples based on ancillary data sources and possible methods that could be used to address the challenges posed.

Two Years of Seasonal Yield Variation and Response Patterns in Address-based Mail Samples Charles A. DiSogra, Knowledge Networks, Inc. ([email protected]); Erlina Hendarwan, Knowledge Networks, Inc. ([email protected])

The probability-based online KnowledgePanel® run by Knowledge Networks maintains its size and representative diversity through ongoing mail recruitment. At the 2011 AAPOR meeting we presented data from 2010, in which eight national address-based samples (n=22,500), drawn from the USPS Delivery Sequence File, were fielded from February through October. Those data suggested the highest yield in July and the lowest in May. With a second year of data (eight mailings, average size 26,250, fielded January through October), the observed pattern mostly persists, with some new observations. The highest yield is now the January 2011 mailing, with July the next highest (similar to 2010); the period May through June again yielded poorly (similar to 2010), and results for April conflict from year to year. All yields range from 6.2% to 7.4% and do not include the additional telephone follow-up with non-responders that raised yields into the 8.5-9.8% range.

All mailings are identical in strategy and timing. Each has an initial packet sent first class to Current Resident with a $2 incentive, a reminder postcard mailed one week later, and a reminder letter mailed to non-responders in week four. All materials are in English and Spanish. The sample design differed from 2010 to 2011. The 2010 sample had two strata, one targeting Hispanic census blocks and the other containing all remaining blocks. The 2011 sample had four strata constructed from ancillary information: Hispanic ages 18-24; all else ages 18-24; Hispanic ages 25+/unknown; and all else ages 25+/unknown. Although the 2011 stratification lowered overall yields by 1%, it is presumed not to affect seasonal variation.

The 2011 mailings approximated the mailing dates for 2010 with some differences; 2011 had a January, March and June mailing where 2010 did not. Emerging seasonal patterns will be presented both overall and for different response modes (return mail, online, telephone call-in).

Address-Based Sampling – A Better Sample? Exploring the Benefits of Using Address-Based Sampling in a State-Wide Targeted Sub-Population James M. Ellis, University of Virginia Center for Survey Research ([email protected]); Deborah L. Rexrode, University of Virginia Center for Survey Research ([email protected])

As society changes, survey research changes. The declining number of landline-only households in the U.S. has complicated and raised the cost of conducting surveys by telephone. Gone is the near-universal access to residences and greatly reduced is the geographic precision that high-coverage telephone surveys once offered. However, advances in computing power and address-based data files have created an alternative method of sampling known as address-based sampling (ABS).

ABS uses Delivery Sequence File (DSF) data from the U.S. Postal Service’s records of every address in the country that can receive mail. The quality of this national list has improved with the push for more automation and efficiency in mail delivery, the growth in geographic information systems applications, and local initiatives to enhance emergency services by assigning standard street addresses even in more rural areas. ABS boasts near-universal coverage and precise geographical targeting.

Because ABS is relatively new and often used in mixed-mode surveys, describing its use is valuable. This paper describes a mixed-mode ABS experiment within a statewide survey about recreational activities and demand. The original sample contained 13,880 records and yielded 3,149 responses from 17 geographic strata.

Data were collected by web and mail. Addresses were randomly assigned to one of four treatments: mail only, mail with web later, web with mail later, and a choice of options offered up front.

The response rate was much lower for the web-with-mail-later group. Nearly half of the responses in this group resulted from a single paper mailing. However, one-quarter of respondents from the two treatment groups offering the web mode early on used the web. Some substantive and demographic differences by treatment were observed, but they were not problematic. A hybrid protocol, a web-first invitation with multiple postal contacts to nonresponders, might have reduced costs by up to 15% compared to a postal-only protocol.

Redesigning Fair Market Rent Surveys Randal ZuWallack, ICF ([email protected]); Leslyn Hall, Redstone Research ([email protected]); Doray Sitko, Econometrica ([email protected]); Charles Hanson, Econometrica ([email protected]); Fred Eggers, Jr., Econometrica ([email protected])

The U.S. Department of Housing and Urban Development (HUD) annually estimates Fair Market Rents (FMRs) for metropolitan areas and non-metropolitan counties. HUD uses census data and area-specific surveys to determine FMRs. HUD has been conducting FMR surveys via RDD samples for more than 15 years. In recent years, the utility of conventional landline RDD studies has decreased, and the impact on FMR surveys is exacerbated because the target population is renters. According to Blumberg and Luke (2011), 50 percent of adults who rent are cell-only and thus cannot be reached through a landline survey.

HUD is redesigning the FMR surveys and is currently evaluating a landline and cell phone dual-frame telephone sample versus an address-based sample with a mail survey. Adding a cell phone sample would be least disruptive to the current methodology and is inevitable with 50 percent of the rental population being cell-only. But the limited ability to geographically target cell phone samples may make them an inefficient way to reach households in the relatively small geographic areas covered by FMR surveys. An ABS approach is a drastic change for FMR surveys, but it has a distinct advantage in targeting small areas. However, FMR surveys have traditionally been completed in a short time period, a distinct advantage of telephone surveys.

ABS or dual-frame? To answer this question, we conducted an experiment in four metropolitan areas: Peoria, IL (RDD); Fort Wayne, IN (ABS); Columbia, SC (RDD); and Charleston, SC (ABS). We present the results of the experiment with a focus on cost, timing, and quality.

Issues in Survey Non-Response

Who Doesn’t Respond When a Survey is Voluntary? Deborah Harner Griffin, US Census Bureau ([email protected])

Few surveys enjoy the benefits of being mandatory. Most surveys conducted by the federal government and the private sector cannot rely on a statement such as "Your Response is Required by Law" to encourage participation. Other incentives or appeals are needed to gain respondent cooperation. While mandatory surveys have been shown to achieve higher response rates than voluntary surveys, little research is available on who is included in a mandatory, but not a voluntary, survey. The American Community Survey (ACS) is a large national household survey that collects demographic, social, economic, and housing data throughout the year. Response to the ACS is mandatory. In 2003 the Census Bureau conducted a test to assess the operational, cost, and quality implications of using voluntary, rather than mandatory, methods in the ACS. Data for two complete sample panels were collected using voluntary methods, similar to those used in other Census Bureau household surveys. For all other sample panels the data were collected using mandatory methods. The shift to voluntary methods resulted in a drop in overall survey response rates. Recently the dataset from that test was reanalyzed to profile the characteristics of these nonrespondents. The characteristics of the population included in the voluntary implementation were compared with those of the population included in the mandatory implementation. The results show some unexpected differences. The hardest-to-interview populations were equally likely to be included in voluntary and mandatory implementations. Mandatory methods were more successful in gaining cooperation from the higher-educated, more mobile, and higher-income populations, suggesting that these individuals may be missing from voluntary surveys unless special procedures are added to encourage their participation.

Actualization of Respondents’ Participation in “Isolated” Conditions Jason Minser, Abt SRBI ([email protected]); Mindy Rhindress, Abt SRBI ([email protected]); Marci Schalk, Abt SRBI ([email protected])

Address-based sampling (ABS) is used increasingly in household surveys. Between 50% and 60% of records can be "matched" to known telephone numbers using multiple sources. "Unmatched" households are notified by an advance mailing inviting them to participate (e.g., web or call-in). Unmatched sample with limited response-rate stimuli represents a major challenge in utilizing ABS designs. This study focuses on an analysis of the "cooperators" in the unmatched sample for the second stage of a two-stage, recruit/follow-up design. Data for this study were generated by the ABS-sampled 2010-2011 Minnesota Household Travel Survey, conducted by Abt SRBI. Approximately 3,200 unmatched households opted into the recruitment task via a web site provided in a postal mailing. As part of this task, unmatched households were asked to provide additional direct contact information (e.g., phone number or email). About half of the unmatched households chose not to provide it. Those who provided contact information were reminded by telephone to complete the follow-up survey. Follow-up reminders were not possible for households that chose not to provide additional contact information; for these households, the researcher must rely mainly on participant self-motivation to complete follow-up tasks. This effectively "isolated" them from further interaction. This paper investigates the effect of "isolated" (no follow-up contact) vs. "direct interaction" conditions on nonresponse in follow-up tasks. We will examine characteristics of the "isolated" sample who participated in the survey versus those who received telephone follow-up contact. We will examine the extent to which respondents "isolated" from contact differ significantly on key geographic and demographic characteristics. We will also examine nonresponse bias among both "isolates" and nonrespondents.

Trends in Mail Survey Response Rates: An Analysis of Monthly Response Rates in a Satisfaction Survey conducted in Oregon since 1994 Virginia M. Lesser, Department of Statistics - Survey Research Center ([email protected]); Daniel Yang, Oregon State University - Survey Research Center ([email protected]); Lydia Newton, Oregon State University - Survey Research Center ([email protected])

Survey response rates have been declining across all modes of data collection. We explore this trend using a unique data set documenting changes in monthly response rates to a survey conducted from 1994 to the present. The Oregon State University Survey Research Center manages the Oregon State driver and motor vehicle services customer satisfaction survey for the Oregon Department of Motor Vehicles. The survey began in April 1994 and continues on a monthly basis. Records of response rates to this monthly survey have been summarized to show trends over this time period. Relatively few changes in survey operations were made over this period; for example, either a preletter or an additional mailing to nonrespondents was used. Since 1994, all selected individuals have received up to three or four contacts. Since 2001, little change has been made to the delivery methods, though there were minor changes to the questionnaire. Also, since 2001, both gender and birth date information have been collected for all individuals selected in the sample. This provides the detail needed for a closer examination of nonresponse across gender and age groups. We present graphical summaries illustrating trends in response rates for various age and gender groups since 2001. In addition, we employ a temporal analysis using time series techniques to estimate the change in response rate over time after adjusting for the changes in survey operations. Results of these analyses will be discussed.
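A minimal sketch of the kind of trend analysis described above, assuming a monthly dataset with indicators for the operational changes (all variable names and values below are illustrative placeholders, not the Center's data):

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 216                                    # placeholder: 18 years of monthly mailings
df = pd.DataFrame({
    "rate": rng.uniform(0.3, 0.6, n),      # placeholder monthly response rates
    "t": np.arange(n),                     # months elapsed since the first mailing
    "month": (np.arange(n) % 12) + 1,      # calendar month, to absorb seasonality
    "preletter": rng.integers(0, 2, n),    # 1 if a preletter (vs. extra mailing) was used
})

# Linear trend in response rate, adjusting for calendar month and the
# operational change between a preletter and an additional mailing.
fit = smf.ols("rate ~ t + preletter + C(month)", data=df).fit()
print(fit.params["t"])                     # estimated change in response rate per month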

Exploring the Effects of a Shorter Interview on Data Quality, Nonresponse, and Respondent Burden Scott Fricker, Bureau of Labor Statistics ([email protected])

The Consumer Expenditure Quarterly Interview Survey (CEQ) is an ongoing monthly survey conducted by the U.S. Bureau of Labor Statistics (BLS) that collects expenditure information from American households. Sample households are interviewed five times over the course of thirteen consecutive months; the interviews are long, the questions detailed, and the experience can be perceived as burdensome. This presentation reports the results of a small-scale field test which examined the effects of administering a shorter CEQ instrument on data quality, nonresponse error, and respondent burden. One-third of study participants were interviewed monthly using a 1-month reference period, and one-third received a modified CEQ which replaced half of the detailed expenditure questions with global expenditure items. Data from these two treatment groups were compared against a control group that was administered a quarterly CEQ interview consisting only of detailed expenditure questions. We present results from data quality analyses that examined both direct measures (e.g., number of expenditure reports, expenditure amounts) and indirect measures (e.g., response rates, measures of perceived burden, item nonresponse, etc.), present nonresponse bias analyses (e.g., comparisons of response rates, sample composition, and expenditure estimates across treatment conditions), and discuss limitations and implications of the study.

An Analysis of the AAPOR 2011 Membership Survey Nonresponse and Paradata Heather Hammer, Abt SRBI ([email protected]); Joe Murphy, RTI International ([email protected]); Liz Hamel, Kaiser Family Foundation ([email protected]); Chase Harrison, Harvard Business School ([email protected])

Prior to 2011, AAPOR periodically surveyed current and former members and conference attendees to gain insight into member satisfaction and the value of benefits and programs. The previous membership survey, conducted in 2007-2008, was administered sequentially by Web-CASI with e-mail reminders, member reminders posted to AAPORnet, and a hardcopy questionnaire mailed to a random sample of 500 nonrespondents. The final response rates (AAPOR RR2) were 55.4% for the Member Survey and 30.2% for the Former Member Survey. Beginning in 2011, AAPOR resolved to conduct annual surveys with a more rigorous approach. To increase response rates, the Temple University Institute for Survey Research (ISR) mailed hardcopy questionnaires to all nonrespondents, added telephone reminders for everyone who did not complete a Web-CASI or mail survey, and offered a CATI option to all nonrespondents contacted by phone. The AAPOR Membership Committee also posted reminders via e-mail, on AAPORnet, on www.aapor.org, on Twitter, and on AAPOR's Facebook and LinkedIn pages. The strategy resulted in final response rates (AAPOR RR1) of 60.8% for the Member Survey, 30.6% for the Former Member Survey, and 55% for the Non-Member Survey of Conference Attendees. Due to the multimode design, and because overall nonresponse and differential nonresponse between members, former members, and non-member conference attendees were key considerations in the 2011 Membership Survey, ISR developed standardized attempt codes to facilitate analysis of the resulting paradata within and between modes and membership groups. One interesting finding was the effectiveness of the telephone reminders in prompting Web-CASI completes, paired with an unexpected lack of interest in completing the survey by phone. This presentation will report the results of the AAPOR Membership Committee's nonresponse analysis, ISR's analysis of the unified multimode case history paradata, and a joint analysis integrating the two components.

Methodological Briefs: Issues in Cell Phones and Landline Surveys

Geographic Differences between RDD Cell and Landline Frames and Self-Report Robert Benford, GfK Custom Research North America ([email protected]); Linda Piekarski, Survey Sampling International ([email protected]); John Lien, GfK CRNA ([email protected]); Trevor Tompson, Associated Press ([email protected])

Geographic assignment in landline RDD samples has proven to be reasonably accurate based on comparisons to respondent self-reports. What remains unreported are rates of departure for cell RDD, where geographic assignment is based on the billing center from which cell phone service originated. In this paper, differences in error rates between frame geography and respondent self-reports are compared for the total U.S. and through more granular geographic disaggregation. Further, the paper includes demographic comparisons, including the likelihood of differences across population-density classifications such as urban, rural, and suburban. Survey data are from several AP-GfK Polls.

Increasing Response Rates in Cell Frames: Results from an Incentive and Voicemail Experiment Kathleen Thiede Call, University of Minnesota, SHADAC ([email protected]); Jessie Kemmick Pintor, University of Minnesota, SHADAC ([email protected]); Stefan Gildemeister, Minnesota Department of Health ([email protected]); David Dutwin, SSRS/Social Science Research Solutions ([email protected]); Robyn Rapoport , SSRS/Social Science Research Solutions ([email protected])

One approach to increasing response rates in telephone surveys is to offer an incentive for participation; however, evidence on the effectiveness of incentives is limited and mixed. Participants in cell phone samples are often remunerated for the cost of the call, yet this necessity may be changing with shifts in the pricing structures for cell phones. Another mechanism for increasing response rates is leaving tailored voicemail messages; however, evidence on the impact of voicemails is also mixed and limited. Our objective is to evaluate how cell phone users respond to different incentive and voicemail conditions, independently and combined. We compare a standard voicemail message (i.e., provision of information about the study, a call-in number, and mention of the intent to call back another time) and an incentive voicemail (i.e., the standard voicemail plus mention of the availability of an incentive for eligible respondents).

Data are from the cell phone frame of the 2011 Minnesota Health Access Survey, a dual-frame survey on health insurance coverage and access that will be completed mid-December 2011. Approximately 56,000 cell phone users will be contacted with the goal of obtaining 4,000 cell phone completes. The sample is randomized to six conditions:

1) 27,050 (48%) no incentive, standard voicemail
2) 1,050 (2%) no incentive, no voicemail
3) 7,000 (12.5%) $5 incentive, standard voicemail
4) 7,000 (12.5%) $5 incentive, incentive voicemail
5) 7,000 (12.5%) $10 incentive, standard voicemail
6) 7,000 (12.5%) $10 incentive, incentive voicemail

(Remuneration is provided to any respondent in groups 1 and 2 upon request.) We will examine response rates and the distribution of other dispositions (e.g., no answer, refusal) for each condition. Survey researchers face declining response rates and constrained budgets. Our findings will address the need for information about the efficacy of various strategies for increasing response rates in cell phone samples.
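For reference, the planned allocation above can be tallied in a few lines (a sketch; the condition labels are paraphrased, and the counts are those listed in the abstract):

conditions = {
    "no incentive, standard voicemail":   27050,
    "no incentive, no voicemail":          1050,
    "$5 incentive, standard voicemail":    7000,
    "$5 incentive, incentive voicemail":   7000,
    "$10 incentive, standard voicemail":   7000,
    "$10 incentive, incentive voicemail":  7000,
}
total = sum(conditions.values())           # 56,100 sampled cell numbers in all
for name, n in conditions.items():
    print(f"{name}: {n:>6,} ({n / total:.1%})")   # matches the rounded shares above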

Cell Phones in Smaller Geographies: Are You Reaching the Right People? Meghann Crawford, Siena College Research Institute ([email protected])

Research continues to show the importance of including cell phone samples in public opinion studies in order to be fully representative of the population under study. The latest figures from the NHIS indicate that cell-phone-only households are approaching 30% nationally. However, this figure varies greatly by state. When surveying smaller geographies by cell phone, such as a state or county, there is larger sampling error due to the mobile nature of cell phones: the fact that an individual purchased a cell phone in one location does not necessarily mean they live there. In a larger national survey this problem does not factor in as much, since most individuals will qualify based on geography. Conducting surveys on cell phones for smaller geographies presents its own unique set of challenges. Does the individual live in the geography being surveyed? Is the geography something that is readily known to respondents (county of residence vs. voting district)? Statewide and local surveys are important for understanding voting behavior, health-related concerns, local policy issues, and the like. If a cell phone sample is not included, potential bias is introduced. The Siena College Research Institute has conducted several thousand cell phone interviews within the state of New York. This paper presents findings on how accurate our statewide cell phone samples have been at reaching those living in New York while also building more representative statewide samples. We will also show which regions of New York have the largest mobility within the state. In addition, this paper shows differences by county in cell-phone-only status. Further, we will provide a profile of who these New Yorkers are and whom you would be missing if you were not including cell phone samples in your research.

Impact of a Reduced Pre-Recruitment Incentive on Non-Response in Cell Phone Surveys Vrinda Nair, Arbitron ([email protected]); Robin Gentry, Arbitron Inc. ([email protected])

In 2007, when Arbitron initially began sampling cell phone only (CPO) households, we sampled those persons who returned the ABS screener questionnaire with a cell phone number. We used higher incentives in the mailings to these CPO households than those used for RDD households to compensate for the minutes used by respondents when we called them for the survey. Since that time, there has been enormous growth in the cell phone only population and many changes in telephone ownership (e.g., increased use of smartphones) and decreased costs for cell phone use (e.g., unlimited-minute plans and rollover minutes). We hypothesized that, due to the changing telephony environment, respondents might not expect to be compensated for their minutes and that we might be able to reduce the amount of money spent on incentives sent prior to the phone call without a significant reduction in cooperation rates.

In Spring 2011 we tested the standard $5 pre-survey incentive against a reduced incentive of $2. Our study investigates whether the reduction in survey incentives led to significant differences in cell phone only respondents' willingness to be sent a 7-day radio listening diary and in the subsequent return of a completed diary. We also examined the demographic profile of consenters and returners in the two incentive groups to understand which groups were possibly affected by the reduction in the pre-placement premium.

Are Design Effects Increasing in Telephone Surveys? A Study of Design Effects in the Behavioral Risk Factors Survey Veronica Roth, The Pennsylvania State University ([email protected]); David Johnson, Pennsylvania State University ([email protected])

With greater nonresponse rates and the growth of cell-only households, the variability of the post-stratification survey weights used to bring samples back to representativeness is likely to change. An important implication of using weights in statistical analyses is that greater variability in the sample weights yields larger design effects (Sturgis, 2004). The extent to which disproportionate stratification and clustering are used can also affect the design effects (Davern et al., 2007). Greater variability among weights can reduce statistical power and the accuracy of sample estimates, therefore requiring larger samples to obtain similar degrees of accuracy (Battaglia et al.). We examine trends over a 20-year period in the design effects of several demographic and health-related variables from the Behavioral Risk Factor Surveillance System (BRFSS) national dataset. We chart changes in the design effects observed for the two types of variables and develop a model to estimate the extent to which these shifts reflect changes in the composition of the national BRFSS dataset, the variability of post-stratification weights and the group characteristics that make up these weights, and differing degrees of disproportionate sampling and clustering.
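For readers unfamiliar with the link between weight variability and design effects, Kish's widely used approximation (a standard result, not a formula taken from this paper) expresses the weighting design effect in terms of the coefficient of variation of the weights:

\[
\mathrm{deff}_w \;\approx\; 1 + \mathrm{cv}^2(w) \;=\; \frac{n \sum_{i=1}^{n} w_i^2}{\left(\sum_{i=1}^{n} w_i\right)^2},
\qquad n_{\mathrm{eff}} = \frac{n}{\mathrm{deff}_w},
\]

so that, all else equal, more variable post-stratification weights shrink the effective sample size and the precision of estimates.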

Disproportionate Stratification to increase Incidence of Finding Minorities in RDD Landline and Cell Frames Robert Benford, GfK Custom Research North America ([email protected]); Linda Piekarski, Survey Sampling International ([email protected])

Landline RDD samples provide sufficient information to stratify sets of exchanges and sort those strata by the density of the minority populations within them. Sample is then selected disproportionately across these strata so that the minority group is located at a rate greater than the rate at which it occurs in the population. Weights are used to correct for the unequal selection probabilities in each stratum. In this paper we derive a similar strategy for cell phone RDD samples that uses county and state FIPS codes to array minority densities and create strata for disproportionate stratification.
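A minimal sketch of the weighting step described above, with hypothetical frame and sample counts (none of these figures come from the paper): base weights are the inverse of each stratum's sampling fraction.

# Hypothetical density strata, deliberately oversampling the densest stratum.
frame_counts  = {"high_density": 40000, "mid_density": 120000, "low_density": 340000}  # N_h
sample_counts = {"high_density":  2000, "mid_density":   2000, "low_density":   1000}  # n_h

base_weights = {h: frame_counts[h] / sample_counts[h] for h in frame_counts}  # w_h = N_h / n_h
print(base_weights)  # oversampled strata receive smaller weights (20, 60, 340)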

Predictors of Survey Length Eran N. Ben-Porath, SSRS/Social Science Research Solutions ([email protected]); Melissa J. Herrmann, SSRS-Social Science Research Solutions ([email protected])

The length of a telephone survey can affect both costs and data quality. For one, longer interviews obviously require more interviewing time to complete. In addition, longer interviews can cause respondent attrition and breakoffs, which, in turn, require even more interviewing time and negatively affect the response rate. On any given survey, there is notable variance in interview length among respondents, even where the questionnaire itself is identical for everyone. Our paper examines the factors affecting the length of interviews beyond the questionnaire itself. We look at respondent demographics, design factors (such as the language of the survey or whether the interview is conducted by landline or cell phone), and interviewer characteristics, across a variety of surveys, in order to assess the contribution of these variables to length. Applying multiple regression analyses, our findings indicate that age, language, and telephone mode are among the key predictors of survey length. Specifically, we find that age is positively and linearly correlated with survey length, Spanish interviews are significantly longer than English ones, and cell phone interviews are longer than landline interviews (even though cell respondents are younger). We then consider the causes of these differences and discuss the implications of our findings for the estimation and design stages of survey studies.
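A minimal sketch of the kind of multiple regression described above (the variable names and placeholder data are illustrative, not the authors'):

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n = 500
df = pd.DataFrame({
    "minutes": rng.normal(20, 4, n),                    # interview length (placeholder)
    "age": rng.integers(18, 90, n),
    "language": rng.choice(["English", "Spanish"], n),
    "mode": rng.choice(["landline", "cell"], n),
})

# Interview length regressed on respondent and design factors.
fit = smf.ols("minutes ~ age + C(language) + C(mode)", data=df).fit()
print(fit.params)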

Methodological Issues in Mail Surveys

Addressing Topic Salience Bias by Questionnaire Design Pat Dean Brick, Westat ([email protected]); J. Michael Brick, Westat ([email protected]); Rob Andrews, NOAA ([email protected]); Nancy Mathiowetz, University of Wisconsin ([email protected]); Lynne Stokes, Southern Methodist University ([email protected])

As RDD surveys encounter more response rate and coverage problems, ABS mail surveys are being used more, and this requires examining a variety of issues for these surveys. One important concern in ABS mail surveys of subgroups is the potential for nonresponse bias due to topic salience, with those interested in the topic responding at a higher rate than those who are not interested. Topic salience is especially problematic in surveys that attempt to estimate totals, because the prevalence of participation in the topic may be biased upward. In surveys designed to estimate total fishing effort in the general household population, a higher response rate among those who are more likely to fish is sometimes called avidity bias. We examine avidity bias in two surveys using a two-phase mail approach to estimate total fishing effort. To assess avidity bias, we matched the sampled addresses to addresses in fishing license registers and consider the matched addresses to be the salient population. We then compare the response rates for the matched and unmatched addresses to gauge the magnitude of avidity bias. The first survey was done in 2009-2010 and used a screening questionnaire that focused solely on identifying those who fished in the last year; the survey was found to have substantial avidity bias. The second survey was done in 2010-2011 with a revised screener designed to reduce avidity bias by de-emphasizing fishing and including a variety of outdoor recreational activities. The analysis presented examines whether the avidity bias is reduced as a result of using a broader set of activities in the screener questionnaire and whether the effects are consistent across subgroups.
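As a toy illustration of the matched/unmatched comparison described above (all counts are hypothetical, not the surveys' data):

matched   = {"sampled": 3000, "responded": 1350}   # addresses found in license registers
unmatched = {"sampled": 9000, "responded": 2700}   # all other sampled addresses

rr_matched   = matched["responded"] / matched["sampled"]       # 0.45
rr_unmatched = unmatched["responded"] / unmatched["sampled"]   # 0.30
print(rr_matched / rr_unmatched)   # a ratio well above 1 signals avidity bias in the screener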

Alternative Questionnaire Effects on Response in Mail Surveys Douglas Williams, Westat ([email protected]); J. Michael Brick, Westat ([email protected]); Jill M. Montaquila, Westat ([email protected]); Daifeng Han, Westat ([email protected])

With the increasing use of address-based sample (ABS) frames, mail data collection methods are experiencing a renaissance. When screening for specific population groups, the two-phase mail approach has been demonstrated to be more effective at identifying population domains of interest than RDD methods, considering cost, coverage, and response rates. An important question is how best to motivate response from the target population with this approach. One hypothesis is that shorter surveys will result in higher response rates, but there is little research that examines this. An alternative hypothesis is that a survey that is too short may affect the legitimacy or perceived relevance of the survey and actually depress response rates when compared to a somewhat longer, more relevant survey. Leverage-salience theory posits that these factors (e.g., length, legitimacy, relevance) all affect a respondent's decision to participate. In 2009 we conducted a pilot test of the National Household Education Surveys Program (NHES) using a two-phase mail data collection as an alternative to the RDD methodology that had been used previously. The 2009 pilot included different screener versions that manipulated the length and content of the mail screening questionnaire. While the small pilot's results were inconclusive, there were indications of differential response between the target and non-target population groups by screener version. Based on these findings, we conducted a larger field test of NHES in 2011 that included an experiment that not only used two screening surveys of different lengths but also explored switching the screener questionnaires for nonresponse follow-up. We describe the effects of questionnaire versions and of switching the questionnaires for both the screener and the extended, or second-phase, response.

An Experimental Examination of Four Within-Household Selection Methods in Household Mail Surveys Kristen Olson, University of Nebraska-Lincoln ([email protected]); Jolene Smyth, University of Nebraska-Lincoln ([email protected]); Stacia Jorgensen, University of Nebraska-Lincoln ([email protected])

Household surveys are increasingly moving to self-administered mail modes of data collection. To maintain a probability sample of the population, adults within households must be selected with probability methods. However, very little experimental methodological work has been conducted on within-household selection in mail surveys. In one of the first studies, Battaglia et al. (2008) compared the next birthday, all adult, and any adult within-household selection methods. Other possible within-household selection methods remain unexamined. We experimentally examine four such methods (the next birthday method, the last birthday method, selection of the youngest adult in the household, and selection of the oldest adult in the household) in a mail survey of Nebraska residents (n=2,498, AAPOR RR1 36.3%). We also included a household roster in the questionnaire to empirically evaluate (1) whether mail respondents would complete a roster and (2) the accuracy of selection of respondents among all adults in the household. This paper will evaluate field outcomes for each of the experimental groups, correlates of accuracy of within-household selection, and correlates of household roster completion. Initial findings indicate that the youngest adult method had a significantly lower response rate than the other selection methods. Additionally, slightly less than 10 percent of respondents failed to complete any information in the roster at all, although up to 30 percent failed to complete at least one field in the roster for the first person in the household. Furthermore, among households with sufficient roster information, between 62 percent and 74 percent of within-household selections were correctly made, varying across the experimental groups. The paper will conclude with implications and suggestions for future research to improve within-household selection for mail surveys.
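The four selection rules are simple enough to state as code; the sketch below is purely illustrative (names, dates, and helper functions are hypothetical, and leap-day birthdays are ignored):

from datetime import date

def select_adult(adults, rule, today=date(2012, 5, 17)):
    """Pick the household respondent under one of the four rules.
    adults: list of dicts with 'name', 'age', and 'birthday' as (month, day)."""
    def days_until(bday):
        m, d = bday
        nxt = date(today.year, m, d)
        if nxt < today:
            nxt = date(today.year + 1, m, d)
        return (nxt - today).days

    def days_since(bday):
        m, d = bday
        prev = date(today.year, m, d)
        if prev > today:
            prev = date(today.year - 1, m, d)
        return (today - prev).days

    if rule == "next_birthday":
        return min(adults, key=lambda a: days_until(a["birthday"]))
    if rule == "last_birthday":
        return min(adults, key=lambda a: days_since(a["birthday"]))
    if rule == "youngest_adult":
        return min(adults, key=lambda a: a["age"])
    if rule == "oldest_adult":
        return max(adults, key=lambda a: a["age"])
    raise ValueError(rule)

household = [{"name": "A", "age": 44, "birthday": (9, 3)},
             {"name": "B", "age": 29, "birthday": (6, 21)}]
print(select_adult(household, "next_birthday")["name"])   # -> "B"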

Evaluating Methods to Select a Respondent for a General Population Mail Survey Wendy Hicks, Westat ([email protected]); David Cantor, Westat ([email protected])

Coverage and nonresponse concerns with RDD surveys have led to an increase in address-based sampling (ABS) and mail data collection. In 2008, the Health Information National Trends Survey III (HINTS III), a health communication survey sponsored by the National Cancer Institute, ran parallel RDD and ABS (mail) data collections (Cantor et al., 2009). HINTS IV, with collection in 2011 and 2012, will use a mail survey. One unresolved issue with a general population mail survey such as HINTS is the random selection of a respondent. The purpose of this paper is to describe experiments that compare different methods of respondent selection.

In a 2011 pilot test for HINTS IV, three methods for selecting the household respondent were compared: 1) the All Adults (AA) method, 2) the Next Birthday (NB) method, and 3) a modified Hagan-Collier (HC) approach which pre-assigns the household respondent by gender and relative age (e.g., oldest male). The results of the Pilot were ambiguous and differed somewhat from Battaglia (2008). The NB and HC methods attained an equivalent response rate (34%), while the AA method resulted in a significantly lower response rate (20%). The low response rate for the AA method ran counter to HINTS III. Respondent age distributions did not meaningfully differ by method in the HINTS Pilot; however, the sample sizes were very small.

HINTS IV will continue the experiment using the AA and NB methods in the national data collection ending in January 2012. Two thousand addresses are allocated to the AA method, and 10,000 are allocated to the NB method. The paper presents results from the Pilot and the national survey, focusing on response rates and the representativeness of the respondents compared to national estimates for various demographic characteristics. Additionally, the presentation will compare the two methods on key HINTS estimates relative to historical trends.

Reaching Medical Professionals: A Review of the Methodology for a Mail Survey of Physicians and Residents Kinsey Gimbel, Fors Marsh Group ([email protected]); Fahima Vakalia, Fors Marsh Group ([email protected])

A recent Department of Defense (DoD) effort to survey physicians and medical residents faced a number of challenges in reaching respondents. In addition to the general population's declining propensity to participate in surveys regardless of mode or sponsorship, the specific issues facing this project included: 1) a frame that contained only office and/or hospital mailing addresses, rather than home mailing or email addresses; 2) a population, especially in the case of medical residents, with extremely limited time to complete a survey; and 3) a subject matter (propensity to join the military) that would be of little interest to many respondents.

This paper presents a summary of the methodology that was used to maximize survey response rates while minimizing the potential for nonresponse bias. A total sample of 18,000 physicians and 15,000 residents were sent mail surveys containing a token financial incentive of $2. A five-point mail plan, including a "thank you" postcard and two follow-up copies of the survey, was used, and communications emphasized the importance of the public service physicians perform in both the civilian and military arenas. Ultimately, response rates of 26% for physicians and 24% for residents were obtained, with minimal profile differences between respondents and non-respondents.

Areas of discussion will include strategies used to reach this population, ramifications of those strategies on the response rate and respondent profile, and possible improvements in future efforts to reach this population.

New Frontiers: Social Media Use, Public Opinion and Behavior

Facebook User Estimates Based on a Large, Representative, Probability Sample Tom Wells, The Nielsen Company ([email protected]); Michael W. Link, The Nielsen Company ([email protected])

Facebook has recently become the most popular social networking site (SNS). The rapid increase in Facebook membership has been followed by an increase in Facebook user research. However, this research has had two major limitations: small, unrepresentative convenience samples and usage data collected through survey self-reports.

This research introduces several methodological improvements to the current research on Facebook and SNS users. Namely, it is based on a large, nationally representative, probability-based, cross-platform sample with internet and TV usage data collected from meters, not from retrospective self-reports.

Our analysis is based on approximately 20,000 respondents from Nielsen’s TVandPC panel. We utilize this powerful data source to generate more accurate and more detailed estimates of Facebook usage and cross-platform behavior.

First, our data analysis shows that about 50% of the sample are recent Facebook users. However, within this group, we find enormous differences in usage -- heavy users spend 15 more hours per month on the site than light users.

Second, we estimate regression models to predict Facebook user segments. Teens, females, and non-Hispanic whites are more likely to be Facebook users and heavy Facebook users. Household income and homeownership are negatively associated with Facebook usage, which also appears to be a function of age or life stage.

Finally, our analysis shows that Facebook user segments are strong predictors of television viewing patterns. Facebook users are more likely to watch the Fox network, more likely to watch programs such as Survivor and House, and less likely to watch 60 Minutes. In addition, heavy Facebook users are more likely to be heavy television viewers, which is not surprising given that a large percentage of web usage at home coincides with TV viewing.

Social Media, News Exposure and Political Expression: Facebook as a Venue for Political Participation Narayanan Iyer, Southern Illinois University - Carbondale ([email protected]); Aaron S. Veenstra, Southern Illinois University - Carbondale ([email protected]); Mohammad Delwar Hossain, Southern Illinois University - Carbondale ([email protected]); Chee Youn Kang, Southern Illinois University - Carbondale ([email protected]); Benjamin Lyons, Southern Illinois University - Carbondale ([email protected]); Changsup Park , Southern Illinois University - Carbondale ([email protected]); Rajvee Subramanian, Southern Illinois University - Carbondale ([email protected]); Yanfang Wu, Southern Illinois University - Carbondale ([email protected])

The U.S. presidential election of 2008 has been referred to as the 'Facebook election' (Johnson, 2010), in part because of the Obama campaign's effective use of social media. The strategy worked well in mobilizing social media users, predominantly the youth, and winning Obama and the Democratic Party a groundswell of support among young Americans. Social media usage has grown exponentially since 2008, and the current user base for technologies such as Facebook and Twitter extends beyond youth to encompass all population segments. A social media strategy is now viewed as a critical component of any political campaign.

One of the primary features of social media is its facilitation of users’ expression within their networks via status updates. These updates are constantly transmitted via the newsfeed and form the core of the content within social network sites. Social media users express themselves on any and all topics and engage with comments from other members within the network. The lack of any barriers to entry and their functional ease of use have made social media technologies invaluable in fostering community participation and discussion by presenting low-efficacy individuals with viable new pathways to political involvement.

This study extends the framework of the 'communication mediation model' to investigate relationships between expressive political participation on Facebook and traditional political participation. The study uses general population survey data to model the antecedents of expressive participation. Tentative results indicate that consumption of political news via Facebook, to the exclusion of other media, is strongly linked to expression of political views on Facebook, and that this relationship is especially strong for low-efficacy individuals. Results also indicate that reinforcement of political views via user interactions leads to increased participation through donations, rally attendance, and voting.

Wikipedia and Political Communication: The Role of the Online Encyclopedia in the German 2009 National Election Campaign Thomas Roessing, Institut fuer Publizistik, University of Mainz ([email protected]); Nicole Podschuweit, University of Mainz ([email protected])

This paper uses quantitative data from surveys and content analyses to shed light on the role of the online encyclopedia Wikipedia in the 2009 German national election campaign. During the run-up to this election the political parties adopted several new forms of online communication, such as Twitter and Facebook. Besides these widely discussed forms of online political communication, one website based on user-generated content keeps many voters up to date with a plethora of information about politics: the free online encyclopedia Wikipedia. The reach of Wikipedia is extremely high: as one of Germany's most visited websites it reaches more citizens than popular online mass media. In 2009, 65% of German Internet users accessed Wikipedia at least occasionally; among young people Wikipedia achieved a reach of 94% (ARD & ZDF Survey on Internet use). However, different articles generate different amounts of traffic. The article about the German Chancellor Angela Merkel was accessed 174,696 times in September 2009, the month of the general election. This paper discusses four aspects of Wikipedia's role during the 2009 campaign: 1. Political information is processed by Wikipedia's community on three levels: the meta-level (project organization), the article-discussion level (article formation), and the article level (actual content). 2. Due to conflicts within the community and the use of mechanisms like instrumental editing, Wikipedia's content is not always neutral and objective. 3. Content analysis reveals that the reach of Wikipedia is boosted by the fact that journalists are prone to using Wikipedia as a source. 4. Since Wikipedia requires its authors to provide references for their contributions, there is potential for a citation cycle between traditional mass media and the online encyclopedia. Further research should include thorough quantitative content analyses of article content as well as surveys among journalists on their use of Wikipedia as a source.

Opening Up Online: Social Networking and Online Survey Response Behaviors Matthew Lackey, Fors Marsh Group ([email protected]); Nicholas Irwin, Fors Marsh Group ([email protected]); Scott Turner, Fors Marsh Group ([email protected])

Social networking websites now dominate internet traffic, particularly among youth, as users spend more time sharing information, ideas, thoughts, and feelings in a computer-mediated environment. The rapid increase in activity on these sites suggests that people are becoming more comfortable going online and sharing their personal thoughts and feelings with the social networking community. As more survey research is performed electronically due to ease of contact and cost effectiveness, researchers should strive to understand the effect that social media usage has on online survey responding. This study examines the response behaviors of youth with a profile on a social networking site and those without a profile to determine whether social networkers demonstrate better survey-taking behaviors and are more willing to share information via online surveys.

Data for this paper come from a cross-sectional advertising tracking study that examines youth (ages 16-24) recall of advertising via multiple media, including social networking sites. Respondents are asked if they have a profile on a social networking site and how often they visit those sites.

Preliminary data analysis demonstrated that social networkers were more likely to provide substantive responses, when given the option, and provided longer open-ended responses than non-social networkers. Results will provide insight into the effects of social networking on survey respondents' participation, engagement, and behavior with online survey instruments. They will also provide guidance on pathways for further research into electronic survey response behaviors.

Public Opinion and Political Behavior

Issue Indifference and Policy Opinion: When Not Caring is Consequential Justine G.M. Ross, University of California, Riverside ([email protected])

A common assumption in public opinion research is that if a respondent expresses indifference toward a policy or candidate when surveyed, they not only lack an opinion but will subsequently not turn out to vote because of their non-preference. We seek to clarify and identify individual issue indifference and its consequences. In contrast to those who are ambivalent, who care about a policy outcome but can be swayed in either direction due to conflicting sub-opinions, we define indifferent voters as those individuals who self-identify a lack of attachment to an opinion. Those who are expressly indifferent do not care about policy outcomes; however, we find they can be prompted to deliver an opinion in a ballot setting. We propose that measuring indifference relies on an individual's assessment of the perceived importance of an issue. We examine policy preferences on Proposition 8, California's same-sex marriage amendment, a salient, timely issue that, apart from age cohort, lacks a reliable opinion predictor. We show that self-identified indifferent voters were much more likely to take a pro-gay rights stance, whereas those who were ambivalent were equally likely to be for or against the measure. Finally, we extend this analysis to understand the connection between electoral outcomes and representative behavior. Using multinomial MRP, we estimate public opinion by subgroup and analyze congruence. We present a conundrum for minority-group and issue-specific movements: how does one strategically mobilize those who are indifferent? Future efforts to further these types of issues must consider what other measures or candidates are included on the same ballot.

Generations in American Politics Jocelyn Kiley, Pew Research Center ([email protected]); Michael Dimock, Pew Research Center ([email protected]); Scott Keeter, Pew Research Center ([email protected])

The question of generations in politics has particular resonance in the current political moment. Over the last several years a deep chasm has emerged, sharply distinguishing the political preferences of the youngest generation (the "Millennial" generation) from those of older generations (the "Silent" generation, Baby Boomers, and Gen Xers). The Millennials have emerged as far more liberal and Democratic-leaning than their elders, and early polling on the 2012 election indicates that the Millennials will again be the Democratic Party's best age group.

But generalizations about the political views of Gen Xers and Baby Boomers, in particular, are difficult. There is, in fact, substantial diversity within these generations. The formative political experiences of "young" Baby Boomers, who came of age in the late 1970s and early 1980s, differed from those of their "older Boomer" counterparts who came of age in the 1960s and 1970s; and the party affiliation and voting histories of these two sub-groups suggest some lasting effects from these early political experiences. Similarly, Generation X counts among its members both those who came of political age in the Reagan era and those who did so in the Clinton era; the younger group more closely matches the Millennial generation politically, while the older one is more conservative and Republican.

Utilizing Pew Research Center surveys taken over the past 25 years as well as data from exit polling and historical public opinion sources, this paper will (1) analyze the partisan, ideological, and attitudinal trajectories of the conventionally-defined political generations (Millennials, Generation Xers, Baby Boomers, Silents, and the “Greatest” generation), but also (2) explore the differences within generations associated with the social and political climates that prevailed at the time of their political socialization. Finally, (3) the paper will address the degree to which distinctiveness of generations tends to erode as cohorts age.

Gaps in Americans’ Political Interest: Following Politics in Surveys from Gallup, Pew, and the ANES Joshua Robison, Northwestern University, Political Science Department ([email protected])

Since 1960, the ANES has measured general political interest by asking respondents how often they follow politics and public affairs. These responses have been shown to predict political knowledge and participation. In this paper, I present empirical data demonstrating that the response patterns to this question reported by the ANES differ substantially from those found in surveys conducted by Gallup and Pew. On average, 47% of respondents on these latter surveys indicate they follow politics most of the time, the highest response category, while the ANES reports that 25% give this response. Beyond demonstrating this remarkable gap, I provide evidence that question context underlies this difference. My evidence suggests that questions about political participation and social groups may depress interest reports. These results were observed even while controlling for the presence of knowledge items, a known source of bias for this question, on these surveys. In addition to analyzing these aggregate survey responses, I will also provide evidence from a survey experiment conducted using Amazon’s Mechanical Turk as a data generation tool. The results reported in this paper are consequential for the measurement of this empirically important variable as well as for the broader study of political participation and public opinion.

POPTOP: How Public Opinion is Related to Public Policy Cliff Zukin, Rutgers University ([email protected])

The public holds a hallowed place in American democracy. Public officials and leaders are supposed to draw on public opinion for guidance in making and carrying out public policy. Most descriptions of that process are based on an explicit or implicit model in which opinion is pushed up from the public to policy makers, usually in the form of public opinion surveys.

Drawing on 35 years of public opinion polling and some concepts from the political science literature, I argue that a bottom-up model describes the relationship between public opinion and public policy in only a few, highly visible instances. In examining the relationship between opinion and policy, the first critical distinction to make is between passive and active public opinion. The main way policy is reactive to opinion is through the political culture, the norms and values we share. This acts, through a law of anticipated consequences, as a boundary constraining policy makers to a circumscribed set of policy alternatives.

When public opinion is in the active mode, a top-down process best characterizes the relationship between opinion and policy. Here it is the policy maker or elected official who is the dominant partner. And, rather than there being any single “true” reading of public opinion, a number of sources of potential public opinion co-exist simultaneously. These include (at least) surveys, media content, election outcomes, interest groups (pluralism), and personal political cues. Each of these potential worlds of public opinion has its own set of biases, which color the way opinion is expressed and perceived.

This paper describes the workings of the political culture, bottom-up and top-down models in examining the critical relationship between public opinion and public policy.

Questions on Sensitive Topics and Social Desirability Bias

Towards a More Objective Measure of Socially Desirable Reporting in Survey Research Zeina Mneimneh, Institute for Social Research - University of Michigan ([email protected])

Background: Social desirability bias is one of the major sources of response error in survey research and a major concern among survey practitioners and methodologists. Theoretically, the survey methodology literature views social desirability as a response bias triggered by the interaction between the respondent, the question, and the interview process. The bulk of the empirical survey methods literature, however, focuses on misreporting as a question-level phenomenon affected by design features such as mode of administration, question type and wording, and interview setting, largely overlooking the respondent him/herself.

Objective: The objective of this paper is to bridge the fields of psychometrics and survey methodology to identify socially desirable reporting behavior by simultaneously modeling respondent-level and item-level characteristics.

Method: Using data from a national probability sample of the Lebanese population, a Mixed Rasch Model (MRM) was applied to three scales that differ in the desirability or undesirability of their items’ content. The three scales measure different types of temperament: hyperthymic, depressive, and anxious. Using MRM, respondents were grouped into classes that differ in their pattern of responses on each of these scales. The nature of these classes was investigated by testing predictors of class membership using respondent characteristics, design variables, and interview privacy measures.
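As a point of reference, a minimal statement of the mixed Rasch model for dichotomous items is sketched below; the temperament scales used in the study may be polytomous, so this should be read as illustrative rather than as the exact specification the authors estimate.

```latex
% Mixed Rasch model, dichotomous case: respondent v belongs to latent class g
% with probability \pi_g; within class g, a Rasch model with person parameter
% \theta_{vg} and item parameter \beta_{ig} governs the response to item i.
P(X_{vi} = 1) \;=\; \sum_{g=1}^{G} \pi_g\,
  \frac{\exp(\theta_{vg} - \beta_{ig})}{1 + \exp(\theta_{vg} - \beta_{ig})},
\qquad \sum_{g=1}^{G} \pi_g = 1 .
```

Estimated class membership (for example, a class whose response pattern is consistent with socially desirable reporting) can then be related to respondent and design covariates, as in the results reported next.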

Results: Two classes of respondents were identified from response patterns on each of the hyperthymic and depressive temperament scales. Class membership was found to be significantly associated with interview mode and interview privacy. Respondent characteristics such as the need for social approval, gender, and education were also found to be related to class membership; however, different characteristics were associated with different scales.

Conclusion: Using MRM, it is possible to identify socially desirable reporting behavior among survey respondents. Such methods could be implemented during different stages of survey design and implementation, possibly reducing measurement error.

Item Sum: A New Technique for Asking Quantitative Sensitive Questions Antje Kirchner, Institute for Employment Research (IAB) ([email protected]); Mark Trappmann, Institute for Employment Research (IAB) ([email protected]); Ivar Krumpal, Universität Leipzig ([email protected]); Ben Jann, University of Bern ([email protected])

This article joins an ongoing debate about how to measure sensitive topics in population surveys. We propose a novel technique that can be applied to the measurement of quantitative sensitive variables such as frequency of drug use or income from illicit work: the item sum technique (IST). This method is closely related to the item count technique (ICT), which was developed for the measurement of dichotomous sensitive items such as employee theft or unethical behaviors. The IST is an improvement in that it applies to a wider range of variables. First, we describe how the new technique works. Second, we present the results of a 2010 telephone survey on illicit work and moonlighting in Germany (n = 3,211). We compare results from the IST to those from direct questioning, showing that the IST is a promising data collection technique that nonetheless faces the same challenges as the ICT. Results for earnings from illicit work yield substantially higher estimates of the socially undesirable behavior than direct questioning, whereas results for hours spent working illicitly are less clear. We conclude with a discussion of how IST results can be used in regressions and other complex analyses, despite the fact that the data contain no direct reports of the sensitive behavior, and of which design improvements should be considered in future studies.
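The abstract only gestures at how the IST works, so here is a minimal sketch of the split-sample logic on simulated data; the list composition, distributions, and sample size are invented for illustration and are not taken from the German survey.

```python
# Minimal sketch of the item sum technique (IST) estimator on made-up data.
# Respondents report only the total across their list, never the sensitive
# item itself; numbers and variable names are illustrative.
import numpy as np

rng = np.random.default_rng(1)
n = 500

# Nonsensitive items (e.g., hours spent on two innocuous activities).
nonsensitive = rng.poisson(lam=[10.0, 5.0], size=(n, 2))
# Sensitive quantitative item (e.g., hours of illicit work), never observed directly.
sensitive = rng.poisson(lam=3.0, size=n)

# Randomly split the sample: the "long list" group sums all three items,
# the "short list" group sums only the nonsensitive ones.
long_list = rng.random(n) < 0.5
reported_sum = nonsensitive.sum(axis=1) + np.where(long_list, sensitive, 0)

# IST estimate of the sensitive item's mean: difference of group means.
est = reported_sum[long_list].mean() - reported_sum[~long_list].mean()
print(f"estimated mean of sensitive item: {est:.2f} (true: {sensitive.mean():.2f})")
```

Because each respondent reports only a total, no one ever states the sensitive quantity directly, yet the difference in mean sums between the long-list and short-list groups estimates its mean.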

The Relationship Between the Accuracy of Self-reported Data and the Availability of Respondent Financial Records Emily Geisen, RTI International ([email protected]); Charles Strohm, RTI International ([email protected]); Chris Stringer, US Census Bureau ([email protected]); Brandon Kopp, Bureau of Labor Statistics ([email protected]); Ashley Richards, RTI International ([email protected])

Use of respondent records is a method for supplementing survey data. Records can improve data quality when data collected by self-report are subject to recall error. One limitation of using respondent records is that respondents often cannot provide records for all survey items of interest. If the accuracy of self-reports for items with records differs from the accuracy of self-reports for items without records, the records are subject to nonresponse bias. However, if the accuracy of self-reports is not related to the availability of records, then even a limited set of records can provide detailed information about the direction and type of measurement error.

We explored the relationship between accuracy of self-reports and record availability using data from the Consumer Expenditure Records Study. The CE Records study is a non-probability feasibility study that examined the accuracy of self-reported data for various types of household expenditures. In the first interview, participants provided self-reports about the cost of household expenditures from the previous three months. In a follow-up interview, participants provided records (e.g., receipts, bank statements, bills) for all expenditures asked about. By comparing self-reports and records, we were able to evaluate the accuracy of self-reports.

Records were available for 36% of the 3,039 items reported. Of the items with records, 53% had incorrect self-reports. On average, items were under- or overestimated by 39%. Several factors, such as the date, frequency, and cost of expenditures, were associated with the availability of records. Using these factors and respondent demographics, we developed a propensity model to determine the likelihood that a record was available for a given item. We then determined whether items with higher propensities for having records were more or less likely to be accurately reported than those with lower propensities. Finally, we evaluated the implications this has for reducing measurement error.
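The abstract does not specify the form of the propensity model, so the sketch below assumes a simple logistic regression on a few hypothetical item-level predictors and simulated data; it shows one plausible way to implement the step described, not the study’s actual model.

```python
# Sketch of a record-availability propensity model on synthetic expenditure items.
# Predictors and data are invented; the CE Records Study's variables differ.
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)
n_items = 3000

items = pd.DataFrame({
    "log_cost": np.log(rng.uniform(5, 500, n_items)),      # expenditure amount
    "days_since_purchase": rng.integers(1, 90, n_items),    # recency
    "recurring": rng.integers(0, 2, n_items),                # e.g., a monthly bill
})
# Simulated indicator of whether a record (receipt, bill, statement) was provided.
has_record = (rng.random(n_items) <
              1 / (1 + np.exp(-(0.5 * items["recurring"]
                                - 0.01 * items["days_since_purchase"])))).astype(int)

# Fit the propensity model and score each reported item.
model = LogisticRegression(max_iter=1000).fit(items, has_record)
items["record_propensity"] = model.predict_proba(items)[:, 1]

# Grouping items into propensity quartiles allows self-report accuracy (not
# simulated here) to be compared across strata.
items["propensity_quartile"] = pd.qcut(items["record_propensity"], 4, labels=False)
print(items.groupby("propensity_quartile")["record_propensity"].mean())
```

Comparing self-report accuracy across such propensity strata is one way to check whether items with records behave like items without them.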

2011 New York City HIV/Sexual Practices Survey Micheline Blum, Baruch College Survey Research ([email protected]); Douglas Muzzio

This session will report on a survey conducted by Baruch College Survey Research for the New York City Department of Health and Mental Hygiene. The study surveyed 2,473 adult New Yorkers from June 25, 2011 through August 25, 2011 on a variety of HIV-related and sexual behavior questions.

The poster session will highlight one or more of the principal policy findings and implications of the study. These include:
• Sexually active seniors, who have been neglected in previous studies and recommendations.
• The central importance of doctors’ recommendations for HIV testing.
• Sexually active adults who practice “HIV roulette” by engaging in high-risk behavior, including those with multiple partners, men having sex with men, non-monogamous heterosexual marriages, and not using condoms.
• What specific sub-groups know, don’t know, or think they know about HIV and its transmission.

Using Qualitative Methods to Study Census Coverage Issues

Using Qualitative Methods to Study Census Coverage Issues M. Mandy Sha, RTI International ([email protected]); Emilia Peytcheva, RTI International ([email protected]); Jennifer Hunter Childs, US Census Bureau ([email protected]); Sarah Heimel, US Census Bureau ([email protected]); Ryan King, US Census Bureau ([email protected]); Tiffany R. King, RTI International ([email protected]); Sarah Cook, RTI International ([email protected]); Eleanor Gerber, Research Support Services ([email protected]); Alisú Schoua-Glusberg, Research Support Services ([email protected]); Katherine Kenward, Research Support Services ([email protected]); Julie Feldman, RTI International ([email protected])

The proposed panel includes six papers that investigate issues surrounding census coverage as well as the qualitative methods used to study such issues. This panel is of interest to end users of census data and to methodologists and practitioners who conduct qualitative research. The U.S. Census Bureau recently conducted a large-scale qualitative study to examine why people are counted twice in the census and how to more successfully resolve such duplication without anyone feeling that their privacy and confidentiality have been violated. Fifty semi-structured qualitative interviews were completed, as were 226 cognitive interviews testing the Targeted Coverage Follow-up (TCFU) questionnaire across 27 living situations. Respondents were real computer-identified suspected duplicates from the 2010 Census, or were reporting for household members who were suspected duplicates. Findings and recommendations from the study may inform future duplication research and the coverage follow-up operation for the 2020 Census. We propose to start the panel by describing the identified duplicates in the 2010 Census and the coverage follow-up operation that took place; these papers will provide nation-wide information as well as background for the rest of the papers in the panel. Next, we will present two papers examining the efficacy of the cognitive interviewing methodology: applying census residency rules and proxy vs. non-proxy reporting of living situations. The fifth paper looks into using a semi-structured interviewing technique to resolve duplication surrounding formal and informal custody situations. Finally, we will conclude the panel with a description of the efforts to manage data quality on this large qualitative research study with complex respondent recruitment criteria, along with an estimate of their impact.

Abstract #1:

"Characteristics of People Overcounted in the Census" The goal of the decennial census is to count every housing unit and every person in the United States, once and only once. However, sometimes people or housing units get counted more than once. For instance, an apartment might initially be included twice in the census if different unit designations were received from different address sources, but the Census Bureau has successfully implemented procedures that reduce and resolve most housing unit duplication. Person duplication can occur for a variety of reasons; a child with divorced parents might have been counted in the census by each parent while a person in prison would have been counted in the prison but also could have been counted by their family at home. Duplication of persons is more complex and challenging to resolve than housing unit duplication due to the complexity of living situations as well as the privacy and confidentiality concerns that constrain any attempts to contact and follow up with possible duplicates. In order to address the problem of person duplication, it is essential to understand characteristics of the people who we suspect were duplicated and characteristics of the living quarters where they were counted. Information presented will include demographic characteristics of the suspected duplicates (such as age and race) from the 2010 Census, and comparisons of the two questionnaires where the same person was counted (such as the distance between each address, the number of people duplicated between the two addresses, and whether the same telephone number was provided on each return). By understanding the characteristics of duplicated people, we can continue to research how to prevent duplication on initial census enumerations and how to resolve the duplication that persists.

Abstract #2:

After Census 2000, when it became apparent that a rather large number of people were duplicated in the census, the Census Bureau began designing and testing what would become known as the Coverage Followup (CFU) interview. The CFU interview would be the Census Bureau’s method of following up on census returns when there is an indication that coverage error may have occurred. Through a series of probes, the interview determines who should ultimately be counted at the housing unit as a resident, and the result of the interview is considered the “gold standard” census return for that housing unit. A series of tests was conducted by the Census Bureau leading up to the 2010 Census, and each one used the CFU interview as a way to resolve duplication. Each test built on the lessons learned from the previous test, but ultimately the results showed that the CFU was not very successful at resolving person duplication. With a limited budget for follow-up, and given the low resolution rate of the unduplication cases in our testing, these cases were not followed up during production of the 2010 Census. To find the resolution rate for unduplication cases in the actual census environment, a sample of cases was selected for completion of a CFU interview as an experiment. This presentation examines the unduplication cases that completed a CFU interview. We examine the ability of the CFU interview to elicit a respondent’s mention of a complex living situation for a potentially duplicated person, and the interview’s ability to resolve the duplication. We will also present some results on how often respondents provide alternative addresses and how often those addresses match the duplicate address, as well as demographic characteristics of the persons in this sample. This paper demonstrates how current census procedures could resolve duplication in the census.

Abstract #3:

When conducting survey research about individuals, it is ideal to interview the person in question. However, because of busy schedules, mental and physical limitations, and simple refusals, a proxy respondent is sometimes necessary. Proxy reports have been widely used and are often considered necessary in certain survey research situations, such as parent proxies for children and adolescents, people with developmental disabilities or mental illnesses, the elderly, and other populations. Research continues to seek a greater understanding of the quality of proxy reports versus self-reports, as well as of when proxy responses differ significantly from self-reports. This paper examines the use of proxy reports versus self-reports in a study conducted by the US Census Bureau. The study was designed to increase understanding of living situations in which individuals are likely to be duplicated (persons listed at more than one residence) in the decennial census. Cognitive interviews were conducted with either duplicated adults or an adult household proxy. The purpose of this research is to examine whether there was a significant difference in the quality of living-situation information provided by the proxy versus the targeted duplicate respondent. Of the 226 cognitive interviews conducted, 130 respondents were the targeted duplicates and 96 were proxies. An adult household member was allowed to serve as a proxy when the duplicate respondent was unavailable. Our analysis will include findings from RTI interviewer observations as well as potential data quality indicators, including the percent providing a match to the duplicate address, the percent providing complete dates and addresses for moves and other transitions, Targeted Coverage Follow-up (TCFU) interview item completeness, refusals, mentions of privacy concerns, and the number of probes needed to elicit address and date information. This research may be of interest to methodologists and to studies that use proxy respondents in lieu of targeted individuals.

Abstract #4:

"Duplication in the Decennial Census: Processes Leading to Double Counting Children in Custody Situations" Fifty qualitative interviews using a semi-structured protocol were carried out as part of a qualitative study undertaken to elucidate the processes which lead to duplication in the decennial census (in conjunction with a cognitive study of a census follow-up instrument.) Of these fifty, 17 involved the duplication of children or young adults in two residences. Two kinds of custody situation were encountered: those where formal agreements exist (usually as a result of divorce) and those where the arrangement is informal. Formal agreements are generally between ex-spouses. Both parents regard the child’s stay with them as “custody” and often regard the child as living in two places. Informal arrangements are those in which a child stays with a grandparent, aunts or nonrelatives. These arrangements are more uncertain, flexible, and are associated with difficult family situations. Parents in these situations tend to regard custody as remaining with them, even if the child does not live with them. The use of the term “custody” to elicit these situations can therefore be misleading as it is not necessarily associated with the child’s primary place of residence. Three custody cases also were dependent interviews: that is, the connected household had previously participated in a cognitive interview in another part of the study. In these instances, both households “claimed” the presence of the child nearly full time: both accounts cannot be true. Reasons for this pattern are explored.

Abstract #5:

"Managing Data Quality on a Large Qualitative Research Study with Complex Respondent Recruitment Criteria" Cognitive interviewing is a questionnaire pretesting technique that has shown to be successful. Commonly observed limitations about this technique include a small number of respondents, purposively chosen recruitment criteria, and interviewer variability. Managing data quality associated with these operational aspects is particularly important to the success of large cognitive testing studies with complex recruitment criteria. In 2010-2011, Census Bureau conducted a large-scale cognitive testing study of the Targeted Follow-up (TCFU) questionnaire, a detailed instrument designed to resolve instances of people who appear to have been duplicated in the census. This study faced the challenge of completing 226 cognitive interviews over a total of 6 months and two rounds, and by a team of eight experienced interviewers in five states. The majority of the interviews were conducted in the field with actual 2010 US Census participants recruited from 18 household and 9 Group Quarters living situations, without the respondents knowing that they were duplicated. Stringent requirements were put in place to protect respondent confidentiality and privacy mandated by Title 13. This paper presents both a description of the attempted improvements to manage data quality and an estimate of their impact. We measure quality by interviewer productivity and protocol adherence, whether decisions regarding the respondent inclusion criteria led to more “interesting” and valuable data, the success of recruiting and representing a diversity of living situations and duplications in the TCFU, and the quality of the TCFU information collected. Our efforts to manage data quality included improvements to interviewer selection and training, recruitment strategy-setting and monitoring, interviewer-recruiter communication and logistics management.

Friday, May 18, 2012 10:00 a.m. - 11:30 a.m. Concurrent Session D

Assessing the Accuracy of Election Prediction Methods

Maximizing the Accuracy of Final Pre-Election Polls Predicting the Outcomes of Races for Seats in the U.S. Senate and the House of Representatives: A Meta-Analysis Sam Storey, Stanford University ([email protected]); Jon A. Krosnick, Stanford University ([email protected])

Numerous surveys have been conducted in states and congressional districts prior to elections for seats in the U.S. Congress, asking respondents how they would vote if the election were held on the day of the survey interview. Such surveys conducted shortly before election day have often been used to predict election outcomes. Yet no publication to date has reported a comprehensive and systematic assessment of the accuracy of such polls in anticipating election results. In this paper, we report such a meta-analysis of all available state and congressional district polls conducted prior to the 2008 and 2010 elections. Predictors of accuracy investigated include the number of respondents interviewed, the proximity of the data collection to election day, the mode of data collection (human telephone interviewers vs. RDD IVR to landlines only vs. internet), the sampling procedure used (probability vs. non-probability), and other factors. The findings gauge the overall accuracy of such polls and identify steps that researchers can take to maximize their accuracy.

How Accurate are Robo Polls? And Why? Scott Ferguson Clement, Washington Post ([email protected]); Peyton M. Craighill ([email protected]); Jon Cohen, The Washington Post ([email protected])

A major component of the proliferation of pre-election polls in recent years is the expanded use of automated random digit dial surveys, also known as IVR or “robo” polls. While some major media organizations have shunned these polls for methodological reasons such as low response rates and lack of cell phone coverage, automated polls have earned a reputation for closely mirroring election outcomes in both high and low turnout races and have the benefit of being much less costly than traditional live interviewer surveys. Indeed, the 2010 election report from the National Council on Public Polls noted that IVR polls conducted within three weeks of the election missed the actual outcome by an average of 2.6 percentage points compared to 2.4 points for live interviewer surveys.
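For readers who want the error figures above made concrete, here is a toy calculation of one common accuracy measure, the absolute miss on the two-candidate margin; the polls and vote shares are invented, and NCPP and the authors may define error differently (e.g., per-candidate error).

```python
# Toy illustration of one common poll-accuracy measure: the absolute error
# on the two-candidate margin. All figures are invented.
polls = [
    # (poll Dem %, poll Rep %, actual Dem %, actual Rep %)
    (48.0, 44.0, 51.0, 46.0),
    (42.0, 49.0, 44.0, 54.0),
    (50.0, 45.0, 49.0, 48.0),
]

errors = [abs((poll_d - poll_r) - (act_d - act_r))
          for poll_d, poll_r, act_d, act_r in polls]
print(f"average margin error: {sum(errors) / len(errors):.1f} points")
```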

Little research has focused on exactly why automated polls - which involve major methodological shortcuts relative to dual-frame RDD surveys - tend to get election results right. This paper will expand on recent analyses comparing the accuracy of automated landline and live interviewer polls, digging below overall vote estimates to voting preferences among demographic and political groups. In addition, the paper will examine the degree to which demographic weighting, likely voter modeling, and non-coverage - across states with differing percentages of cell phone-only residents - play a role in accurate pre-election estimates for both automated and live interviewer polls. The paper will assess what is still unknown about automated methodologies as well as the current level of disclosure among major automated pollsters. Lastly, the paper will offer practical recommendations for poll observers on when automated and live interviewer polls should be relied upon as accurate gauges of public opinion and when they should be viewed with skepticism.

On Line Exit Polls: The Canadian Experience Darrell J. Bricker, Ipsos Public Affairs ([email protected])

Since the 2000 Federal Election in Canada, Ipsos Reid has had noteworthy success in using online survey techniques to both predict and explain the results of Canada's national elections. The paper will describe what we've learned over the last decade, show the accuracy of our methodology in predicting the 2011 Federal Election results, and point the way forward for exit poll practitioners interested in using online methodologies in their own jurisdictions.

Fundamental Models for Forecasting Elections David Rothschild, Yahoo! Research. PhD Economist ([email protected])

When American voters go to the polls in even-numbered years, they potentially vote in several different state and federal elections with a range of state and national implications: presidential, senatorial, and gubernatorial elections. This paper develops models for forecasting all of these types of elections at the state level using fundamental information such as past election results, economic indicators, ideological indicators, and biographical information about the candidates. Freed from the power of polls and prediction markets, we are able to analyze which of these different indicators most meaningfully correlate with or predict election outcomes in the states. Three main contributions to the political science and economics literature on politics and elections follow. First, presidential approval is positively correlated with the performance of candidates of the same party in both senatorial and gubernatorial races; presidential coattails exist. Second, economic variables carry significance as trends, not levels; national and state trends are significant for Electoral College races, but only state trends for senatorial races and only national trends for gubernatorial races. Third, data through the second quarter is almost as predictive as data through the third quarter; for some variables it is more predictive. Our forecasting models also add to the literature on forecasting by illustrating methods that more accurately forecast the results of elections than previous methods.

Cross-National Survey Research - WAPOR Sponsored Session

Cross-National Survey Research. Tom W. Smith, NORC ([email protected]); Mark Tessler, ISR/University of Michigan ([email protected]); Rory Fitzgerald, ESS ([email protected]); David Howell, ISR/University of Michigan ([email protected]); Janet Harkness, University of Nebraska ([email protected])

Frame and Coverage Issues in Address-Based Sampling

Modeling Coverage Error in Address Lists Due to Geocoding Error: The Impact on Survey Operations and Sampling Lee Fiorio, NORC at the University of Chicago ([email protected])

Survey research organizations have been investigating the use of extracts of the United States Postal Service delivery sequence file (DSF) as a replacement for traditional listing. Due to software limitations, individual housing units (HUs) on the DSF are sometimes errantly geocoded, which can influence the coverage properties of selected segments. NORC undertook a national listing effort in 2011 to augment the DSF in areas known to have limited coverage, such as rural areas and areas with new construction. We used an “enhanced” listing method, in which the lister, using a handheld device, verifies and edits the DSF list geocoded to a designated segment. One benefit of enhanced listing with a handheld device is the ability to capture the geographic coordinates of each HU, thus providing data to further explore the nature of DSF coverage. We focus on a selection of rural and urban segments from the national listing effort. For addresses on the DSF but not found by the lister, we model the likelihood of address-level geocoding error with logistic regression, using DSF flags and 2010 census data. We also build a spatial autologistic model (Besag, 1972) to account for spatially dependent data by incorporating spatial autocorrelation. Preliminary findings indicate that geocoding error occurrences are spatially dependent and that the probability of geocoding error is related to address characteristics such as vacancy and drop delivery as well as segment characteristics such as urbanicity and housing unit density. For addresses on the DSF found by the lister, we analyze spatial discrepancies. We find that positional accuracy at the block level is inadequate and that accuracy improves at larger geographic scales such as the block group or segment. Understanding the correlates of geocoding error in the DSF will increase listing efficiency and frame quality by allowing identification of areas with poorer DSF coverage.
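The following is only a crude illustration of the modeling idea on simulated points: a logistic regression of geocoding error on DSF-style flags, with the spatial autologistic component approximated by a nearest-neighbor autocovariate rather than the formal Besag (1972) estimator. All variables and data are invented.

```python
# Rough sketch of an autologistic-style model for geocoding error.
import numpy as np
import statsmodels.api as sm
from scipy.spatial import cKDTree

rng = np.random.default_rng(3)
n = 1000

coords = rng.uniform(0, 10, size=(n, 2))        # housing-unit locations
vacant = rng.integers(0, 2, n)                   # DSF vacancy flag
drop_delivery = rng.integers(0, 2, n)            # DSF drop-delivery flag
geocode_error = rng.integers(0, 2, n)            # 1 = errantly geocoded

# Autocovariate: share of the 5 nearest neighbors that also have an error
# (the first neighbor returned is the point itself, so it is dropped).
_, idx = cKDTree(coords).query(coords, k=6)
neighbor_error = geocode_error[idx[:, 1:]].mean(axis=1)

X = sm.add_constant(np.column_stack([vacant, drop_delivery, neighbor_error]))
fit = sm.Logit(geocode_error, X).fit(disp=False)
print(fit.params)   # constant, vacancy, drop delivery, spatial autocovariate
```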

Sub-National Coverage Profile of U.S. Housing Units Using the USPS Computerized Delivery Sequence File Joseph P. McMichael, RTI International ([email protected]); Rachel Harter, RTI International ([email protected]); Bonnie E. Shook-Sa, RTI International ([email protected]); Vincent G. Iannacchione, RTI International ([email protected])

Coverage of the national population of housing units (HUs) using the United States Postal Service (USPS) Computerized Delivery Sequence File (CDS) is highly variable across states, counties, and areas with varying population densities. Depending on the mode of data collection, these variations in coverage can affect the suitability of the CDS for use as an address-based sampling (ABS) frame. The recent availability of data from the 2010 Decennial Census and the American Community Survey enables an in-depth look at HU coverage in the United States at the sub-national level.

This paper explores the HU coverage properties of the CDS by state, population density, and other demographic and geo-physical characteristics by comparing Census counts of HUs to counts of mailing addresses from a commercially available version of the CDS. Additionally, we report on the increase in HU coverage provided by the CDS No-Stat file. The CDS No-Stat file is a supplement to the CDS that includes city-style addresses for households served by P.O. Box delivery, vacant rural addresses, addresses under construction, and certain types of HUs that are unlocatable from their P.O. Box or other address in the main CDS file.

Predicting When to Adopt Given Frame Construction Methods: Modeling Coverage and Cost Benefits Ned English, NORC ([email protected]); Colm O'Muircheartaigh, NORC ([email protected]); Katie Dekker, NORC at the University of Chicago ([email protected]); Ipek Bilgen, NORC at the University of Chicago ([email protected]); Lee Fiorio, NORC at the University of Chicago ([email protected]); Mark Clausen, NORC at the University of Chicago ([email protected]); Tamara Brooks, NORC at the University of Chicago ([email protected])

At present there are multiple ways to construct sampling frames for in-person or address-based studies in survey research. Depending on the environment and available technology, one could implement traditional listing, enhanced (or “dependent”) listing, or use an extract of the USPS delivery sequence file (“DSF” or “CDSF”) alone. Each method has advantages in terms of coverage properties and cost, which vary with urbanicity, the availability of lists, and other factors. At question is how, exactly, the coverage and cost properties relate across frame construction methods and environments. We use data from an experiment embedded in the National Children’s Study during 2011 in which selected segments were listed by each method and blindly verified in person for coverage. This experiment was implemented in rural, suburban, and urban areas of varying housing age and socioeconomic environment. The results of our modeling show which frame construction methods carry the greatest coverage advantages in which situations, and the cost-benefit implied by each. Our paper contributes to the literature on predicting when it is most appropriate to adopt certain frame construction methods, based on a priori information.

Assessing Coverage and Accuracy of an Address Based Frame for Subgroups of the Population Kelly Dixon, Arbitron ([email protected]); Dan Estersohn, Arbitron ([email protected]); Al Tupek, Arbitron ([email protected]); Mike Kwanisai, Arbitron ([email protected]); Missy Mosher, Survey Sampling International ([email protected]); Linda Piekarski, Survey Sampling International ([email protected]); Jessica Smith, Survey Sampling International ([email protected])

Arbitron obtains hundreds of thousands of records annually from Survey Sampling International (SSI) for purposes of selecting household samples. The primary information Arbitron gets from SSI is either a phone number or an address (sample point) and some geographic descriptive information such as county or subcounty. SSI can provide additional information about many of the sample points, including such things as the name, age, and race/ethnicity of the householder, and household size. Arbitron has traditionally used a proportional sample design to measure radio audiences. Achieving demographic and geographic proportionality is one of Arbitron’s main goals, since radio listening varies by these characteristics.

The goal of this study is to investigate if using additional frame information from SSI can help Arbitron achieve better demographic proportionality, while maintaining good geographical proportionality in its samples. Specifically, we will analyze the usefulness of the SSI auxiliary information to find Hispanic and young households and sample them at a rate that yields a proportional sample. We plan to compare the demographic information from thousands of Arbitron respondents to the SSI frame information. We will report the proportion and types of matches and non-matches by characteristic. We will frame the discussion of the metrics in terms of accuracy and coverage and the importance of each to achieve our survey’s goals.

What's Accuracy Got to Do with It?: Evaluating Tradeoffs between Sample Frames and Geographical Accuracy Ashley Amaya, NORC at the University of Chicago ([email protected]); Christopher Ward, NORC at the University of Chicago ([email protected]); Felicia LeClere, NORC at the University of Chicago ([email protected])

The population coverage of a landline random-digit dial (RDD) sampling frame has been declining for nearly a decade as an increasing proportion of U.S. households rely only on cellular telephones. As population coverage declines within the landline RDD sample frame, the potential for large inferential biases in landline RDD surveys grows. In order to increase population coverage, two new sample frames have been offered as alternatives – address-based frames (ABS) and landline-cell RDD dual frames.

As the targeted geographic area varies (e.g., census tract, county, state, etc.), frame efficiencies change. For example, telephone numbers can be associated with geography but tend to have more ambiguous geographic boundaries. Surveys that use telephone frames cannot efficiently target smaller geographic areas. Cell telephone numbers present the particular challenge of mobility as respondents relocate but retain their telephone numbers.

Address-based frames have other complications. Previous research suggests coverage discrepancies between rural and urban geographies (O’Muircheartaigh et al. 2009). Some rural addresses are excluded from standard ABS frames altogether, while cities suffer from a large proportion of drop point addresses that are unable to receive mail. Moreover, most studies require addresses to be geocoded to determine whether they fall into the target geography. This results in some level of geocoding error, which, in turn, reduces sampling efficiency.

In this paper, we examine the comparative ability of these sampling frames to efficiently cover various types of geographic areas. Precise geographic targeting is often needed to oversample specific areas, to sample respondents with particular characteristics, or to stratify the sample to produce reliable estimates for specific areas. We evaluate each type of frame in terms of coverage (i.e. proportion of the target population included in the frame) and efficiency (i.e. proportion of the frame included in the target population) across several geographies.
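To make the two evaluation metrics concrete, here is a toy calculation with invented counts for a single hypothetical target geography; the definitions mirror those in the abstract, but the numbers are purely illustrative.

```python
# Toy illustration of the coverage and efficiency definitions above.
target_population = 50_000     # addresses truly in the target geography
frame_total = 60_000           # addresses the frame assigns to that geography
frame_in_target = 45_000       # frame addresses that really belong there

coverage = frame_in_target / target_population   # share of target the frame reaches
efficiency = frame_in_target / frame_total       # share of the frame that is on target

print(f"coverage = {coverage:.0%}, efficiency = {efficiency:.0%}")
# coverage = 90%, efficiency = 75%
```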

Furthering the Debate on Non-Probability Sampling

Comparison of Dual Frame Telephone and Non-probability Online Panels Regarding Accuracy of Political Opinion Polling Clifford Young, Ipsos ([email protected]); Chris Jackson, Ipsos ([email protected]); Jon Krosnick, Stanford University ([email protected])

Abstract: In 2011, Ipsos conducted two experiments in which the same questionnaire was administered simultaneously via RDD telephone calls to a national sample of cell phones and landlines and also via the Internet with a non-probability sample of people who volunteered to complete questionnaires in return for remuneration. The questionnaires included measures of a wide range of political opinions. These two side-by-side data collection efforts allow for exploring two issues: (1) do the two methodologies differ in the degree to which they yield accurate measurements of the American adult population (assessed by comparison with benchmarks of known high accuracy), and (2) do the two methodologies yield data that differ in terms of the distributions of political opinions, the relations among opinions, and the relations of opinions with demographics. This paper will present a series of analyses exploring these issues and will also explore whether computing weights affects the conclusions we reach. Specifically, weights will be built for both samples using best-practice procedures developed by an advisory committee to the American National Election Studies. The findings of these comparisons will help researchers assess whether RDD and opt-in internet surveys yield comparable substantive results with regard to political opinions and whether optimal weighting can reduce any differences between modes that might appear.

A Systematic Review of Studies Investigating the Quality of Data Obtained with Online Panels Mario Callegaro, Google UK ([email protected]); Ana Villar, Independent consultant ([email protected]); Jon A. Krosnick, Stanford University ([email protected]); David Yeager, Stanford University ([email protected])

A core tenet of survey research is that the inferences one makes about the population can only be as good as the quality of the respondents in the sample. However, with declines in probability sample response rates and increases in non-probability Internet-based research, researchers have found it increasingly difficult to agree on the quality of a survey sample. Contributing to this difficulty is the variety of research studies that have evaluated the quality of survey data derived from probability-based and non-probability-based sources and the effectiveness of statistical methods to reduce error in data from these sources.

Specifically, some research has documented greater average error among non-probability samples relative to probability samples (Chang & Krosnick, 2009; Yeager et al., in press), while other research has found few or small differences between the two. Other research has pointed to greater variability in results from surveys of non-probability samples of Internet volunteers. For instance, Dedeker (2006) conducted the same study twice on the same Internet survey panel and reached two different business conclusions. An additional study found five to ten times greater variability in error among a sample of seven Internet surveys of non-probability samples versus seven probability sample surveys (Yeager et al., in press). Similarly sized variability was found in the National Dutch Online Panel Comparison Study. Relatedly, statistical methods such as post-stratification survey weighting have inconsistent effects on non-probability sample surveys, and in some cases increase survey error.

It is critically important to synthesize the survey accuracy studies summarized above as well as others. The present study will evaluate the evidence from more than 45 different studies that have assessed the accuracy of non-probability sample surveys and the effectiveness of methods to improve their accuracy, with the aim of helping researchers and consumers to have more informed expectations about data quality in their surveys.

Some Thoughts on a Framework for Statistical Inference from Nonprobability Samples Robert Santos, The Urban Institute ([email protected])

The survey industry in the U.S. is facing a significant challenge. High levels of non-response and noncoverage impugn the integrity of statistical inferences from probability samples without heavy reliance on model-based adjustments. Advances in technology and internet access have led to the development of quasi-probability and nonprobability web panel survey products, and smartphone and social media based survey applications are emerging. Sampling statisticians trained in classical finite population sampling (CFPS) theory are often skeptical of the validity of statistical inference from nonprobability samples. This presentation is strictly a thought piece. It explains how probability samples invoke CFPS theory to generate valid statistical inference, and it uses that to motivate and contrast a framework for making inferences with nonprobability samples. An empirical pathway is proposed, along with its components and necessary conditions. In a sense, the pathway addresses the question: "How and when might one expect that a nonprobability sample can provide valid inferences?"

In Defense of Probability: Are the Times a-Changing? Gary Langer, Langer Research Associates ([email protected])

At a conference devoted to evaluating new frontiers in public opinion research, it’s worth a pause to reflect on old realities. Calling on the literature from Marcus Tullius Cicero to George Snedecor to Leslie Kish to modern times (Groves, Biemer and Lyberg, others), this presentation offers a review of the precepts of probability sampling and their foundational role in survey research. I evaluate claims made about probability sampling by practitioners of alternative methodologies, assess some of the practices and claims of non-traditional approaches, and review recent comparisons of probability and non-probability results. Recalling the great debates of Central City, I discuss the concept of “representativeness” and propose ways of differentiating between the new frontier and the forbidden forest of survey practice.

Methodological Briefs: Methods for Improving Response Rates

Response Rate Effects in an ABS Survey for Stamped vs. Business Reply Return Envelopes, with and without Incentives, and Medium vs. Standard Size Outgoing Envelopes John Tarnai, Social & Economic Sciences Research Center ([email protected]); David Schultz, Social & Economic Sciences Research Center ([email protected]); David Solet, Public Health - Seattle & King County ([email protected]); Lori Pfingst, Washington State Budget & Policy Center ([email protected])

We conducted an experiment to test three procedures that could improve response rates to a mail survey. The experiment was conducted with an address-based sample of 6,400 residents of a major metropolitan area and tested the effects of (A) two different outgoing envelope sizes (standard 4” by 9.5” vs. a medium 6.5” by 9.5”); (B) stamped vs. business reply return envelopes; and (C) no incentive vs. a $1 bill incentive. There were 800 sample units in each of the eight treatment groups. The questionnaire was designed as a 12-page booklet consisting of 40 questions about community health. The initial sample units were randomly assigned to one of the eight treatment groups; otherwise all groups were treated identically, including follow-up reminders that were mailed to all sample units. The results indicated significant main effects for each of the three treatments: the largest response rate gain was achieved by the incentive, followed by the stamped reply envelope, and last by the medium-size outgoing envelope. The paper describes the response rate results for each of the eight treatment groups and discusses the implications of these results for designing mail surveys.

Mail Survey as a Non-Response Follow Up? Experience from the 2010 Health and Retirement Study Piotr Dworak, ISR ([email protected]); Heidi Guyer, Institute for Social Research, University of Michigan ([email protected]); James Wagner, Institute for Social Research, University of Michigan ([email protected])

Mail surveys are typically used first in the sequence of multi-mode research, which allows for reducing more costly phone and face-to-face interviewing. However, some researchers propose that a mail follow-up may be used as an effective tool for minimizing non-response after other modes have been attempted.

The recent experience from the 2010 Health and Retirement Study (HRS) screening effort provides some validation for using a short household listing paper survey as a nonresponse follow-up, for households with a long history of refusing other modes or households located in gated/locked communities.

The 2010 HRS conducted an extensive screening effort using 5-10 minute face-to-face, telephone, and paper instruments. In the process, approximately 30,000 households were contacted face-to-face. In the end, a short paper screener was sent to households that had refused all prior attempts to complete the screening interview. A mix of reminder phone calls was placed where telephone numbers were available.

The response rate exceeded expectations given the history of the sample. In other preliminary findings, the amount of face-to-face or telephone contact prior to the mailing did not affect the mailing response rate; however, telephone contact a few days after the mailing increased the chances of responding. Further, the availability of call notes collected by interviewers prior to the mailing allowed us to code some returned mail as non-sample. Available household characteristics are analyzed to determine who was more likely to respond to the survey. A cost model is developed to estimate the potential savings of implementing a similar non-response follow-up in the future.

The Effect of Personalized Address Labels on Response Rates and Postal Deliverability Rates Vrinda Nair, Arbitron ([email protected]); Yelena Pens, Arbitron ([email protected])

Traditionally, Arbitron has addressed the household by the name associated with the address whenever such a match could be made, a “personalization” approach. The main hypothesis behind personalization is that response rates should be higher because respondents are more likely to open a package addressed to them than one addressed to a generic label like “Arbitron Radio & TV Household” or “Current Resident.”

On the other hand, if we address the survey to a specific person and the dwelling is vacant because the owner has moved, the mail is returned to us as “Unable to Forward” (because we specifically instruct the USPS not to forward our packages) rather than with the correct disposition of “Vacant.” Since an ABS sampling point is an address rather than a person, personalization could negatively impact mail deliverability.

Because this also has a direct effect on response rate calculations, Arbitron conducted two studies to examine how response rates and deliverability are affected by the use of a generic salutation versus a personalized name approach.

First, during summer 2011, Arbitron conducted a field test of a mail-recruited, one-week, web-based diary of radio listening. Then, in fall 2011, Arbitron tested the impact of personalization on response rates for an ABS screener questionnaire. For both tests, all mailings were randomized, with half of the mailings using the name matched to the address when available, while the other half included a generic “Arbitron Radio Household” greeting.

The presentation will look at the impact of generic vs. personalized salutations on deliverability, address usability, and response rates, and will suggest a preferred approach for addressing mailings to an address-based sample.

Combining Prepaid and Promised Incentives: Impact of Prepaid Amount in a Mail Survey of Young Adults Luciano Viera, Fors Marsh Group, LLC ([email protected]); Scott Turner, Fors Marsh Group, LLC ([email protected]); Sean Marsh, Fors Marsh Group, LLC ([email protected])

The general population has become increasingly less likely to participate in surveys. As a result, a large body of research has focused on the use of incentives to encourage survey participation, generally showing that cash works best, especially when it is prepaid rather than promised conditional upon survey return (Church, 1993).

Last year in Phoenix, the present authors shared the results of an experiment designed to test the impact of offering a promised monetary incentive, in addition to an upfront prepaid incentive, on the survey quality of a national mail survey tracking the future career plans of young adults. Results showed that, compared to offering just a $2 prepaid incentive, combining this amount with a promised incentive of $5 or $10 increased response rates and yielded substantial cost savings with no deleterious impact on survey point estimates. Theoretically, this finding is consistent with social influence theory, which suggests that prepaid incentives may “build trust” and thereby “magnify” the positive effects of promised incentives. However, it raised the question of whether the prepaid $2 incentive could be reduced without negatively affecting survey quality.

To address this question, this study presents the results of a follow-up incentive experiment conducted in a subsequent survey administration. Specifically, 50,000 sample members were randomly assigned to one of two experimental conditions (25,000 each):

1) $1 Prepaid Incentive, $5 Promised Incentive
2) $2 Prepaid Incentive, $5 Promised Incentive

With the exception of the language describing the incentives and the actual cash amounts, which varied according to the two experimental conditions, all mailing materials sent to youth were identical, minimizing confounds. Specific measures of survey quality included survey response rates vs. costs, respondent profiles, and key metrics. Results and implications for existing survey practice, along with directions for future research, will be discussed.

Response Rate and Recall Effects of Using a Tailored Stamp on Advance Materials in a General Population Telephone Study Grant D. Benson, University of Michigan Institute for Social Research ([email protected]); Sunghee Lee, University of Michigan Institute for Social Research ([email protected]); Toby Jayaratne, University of Michigan School of Public Health ([email protected]); Paul Schulz, University of Michigan Institute for Social Research ([email protected]); Alicia Giordimaina, University of Michigan School of Public Health ([email protected])

Advance mailings are used with regularity to increase response rates. With a growing body of research demonstrating the importance of visual design in engaging respondents, reducing non-response, and minimizing error (Dillman et al. 2010), we sought to assess the impact of tailoring just one aspect of the letter’s visual appeal: the stamp. The increased availability of diverse stamp designs, as well as the ease of designing one’s own stamps, suggests the value of matching specific stamps to targeted samples in advance mailings. Even a small increase in response rates from tailored stamps would make this a useful avenue for further research.

For a national RDD study conducted from September through December 2011, we randomly assigned numbers that had been matched to Hispanic surnames to receive either a U.S. Flag stamp or a stamp from the Latino Legends series. We asked all respondents, regardless of whether they received a mailing, if they recalled receiving the advance letter. While preliminary results did not show a clear relationship between advance letter recall and treatment group, there did appear to be a large and significant relationship between contact rate and stamp treatment group. This paper will discuss cost-benefit trade-offs and implications for contact rates, cooperation rates, and future strategies for tailored advance mailings.

SHOW Me the Money? Effects of Preincentives, Differential Incentives, and Envelope Messaging in an ABS Mail Survey Kristen Cyffka, University of Wisconsin Survey Center ([email protected]); Jennifer Dykema, University of Wisconsin Survey Center ([email protected]); John Stevenson, University of Wisconsin Survey Center ([email protected]); Kelly Elver, University of Wisconsin Survey Center ([email protected]); Karen Jaques, University of Wisconsin Survey Center ([email protected])

Mail surveys that use address-based sampling frames are an increasingly important method for collecting health-related data from random samples of the general population. Identifying methods for ABS mail surveys that yield high response rates and data quality while remaining cost-effective is needed. While systematic reviews indicate incentives are effective in increasing response rates in mail surveys (e.g., Edwards et al. 2002), more research is needed to determine what monetary thresholds are most cost-effective and to examine the effects of incentives and other methods on other measures of data quality. This presentation reports on results from an experiment conducted to evaluate the effects of varying amounts of preincentives, differential incentives, and envelope messaging on response rates, costs, and data quality in an address-based mail survey. The experiment was conducted as part of the Survey of the Health of Wisconsin (SHOW). Households (N = 2,608) were randomly divided into experimental groups using a 2 (amount of preincentive) x 2 (differential incentive) x 2 (envelope message) design. The two levels for the amount of the cash preincentive were $5 versus $2. The two levels for the differential incentive were $2 versus none; households were only eligible for the differential incentive if they failed to respond to the initial mailing. The two levels for the envelope message were a message that read “Thank You! A cash gift is enclosed” versus no message. The final response rate was 67.1% overall. Analyses examine the effects of the treatments on unit and item nonresponse, costs, and nonresponse bias. Results indicate that larger preincentives are associated with significantly higher response rates and less item-missing data. The use of additional incentives and envelope messaging appears to have no effect on outcomes. Results from this presentation add to research about choosing the most cost-effective incentive combinations in mailed self-administered questionnaires.

Cash Incentives vs. Sweepstakes: What Works Best? Charles Darin Harm, Arbitron, Inc. ([email protected]); Courtney Mooney, Arbitron ([email protected])

Arbitron has developed an electronic meter that can automatically detect audio exposure to encoded radio signals. We ask panelists to wear their meter every day from rise to retire in order to measure media exposure. Arbitron conducts ongoing research to improve the compliance of our panelists.

While overall panel compliance decreases in the summer months, Black panelists between the ages of 18 and 34 typically have lower compliance rates than other demographic groups. Compliance among this group is particularly low during the weeks around the 4th of July. In July 2010, Arbitron offered Black 18-34 panelists a cash incentive for compliance during the month of July. While the cash incentive was effective at reducing Black 18-34 non-compliance, the cost of offering the incentive to this group was very large.

In the fall and winter of 2010 Arbitron tested the effectiveness of offering a sweepstakes to reduce non-compliance during the Thanksgiving and Christmas holiday periods. The sweepstakes was less expensive to implement than offering cash incentives, and significantly improved panel compliance among all demographic groups.

In July 2011, a split sample test was conducted to determine if offering a cash incentive to Black 18-34 panelists, in addition to offering sweepstakes entries, would have a greater impact on compliance than either treatment alone. This follow-up test targeted Black 18-34 panelists, as they are a key demographic group that tends to have low compliance. This presentation will examine the relative effectiveness of using “guaranteed” performance-based cash incentives, “potential” cash incentives (e.g., sweepstakes entries), and a combination of the two incentive strategies to increase participation among hard-to-reach demographics.

New Frontiers: Advances in Mobile Data Collection - New Methods, New Opportunities, New Challenges

Advances in Mobile Data Collection - New Methods, New Opportunities, New Challenges David James Roe, RTI International ([email protected]); Michael Keating, RTI International ([email protected]); Robert Furberg, RTI International ([email protected]); Trent D. Buskirk, Saint Louis University ([email protected]); Michael Link, The Nielsen Company ([email protected]); Patricia Graham, Knowledge Networks ([email protected])

Panel Description: As the cell phone coverage rate in the U.S. climbs to 85%, and the Smartphone coverage rate approaches 35% nationwide, there is an ever-increasing need for efficient and appropriate mobile survey methods. As an industry, we continue to meet the challenges of declining landline telephone coverage by developing and experimenting with a number of data collection strategies such as dual-frame RDD and cell phone surveys, applications (apps) for Smartphone data collection, Smartphone panels, and real-time data capture and surveys designed for mobile browsers. With advances in mobile technologies and accessibility, mobile data collection offers more than just the advantage of surveying people who are hard to reach. New methods for mobile survey research include combining data capture modes available via Smartphone, such as geolocation, real time status updates and insights, surveys via mobile apps and the use of SMS (text messaging) for data collection and respondent communication. From providing a high level summary of the technologies and tools available via Smartphones, to case studies on different data capture and contact modes, to methodological considerations, the papers in this session present and evaluate a wide range of issues and techniques utilizing the mobile platform for survey research that will have an impact as our industry evolves.

First Abstract: Mobile Technology and Survey Research: Lessons from Early Implementations and the Consumer Marketplace Mobile technology (e.g. smartphones and tablet computers) is increasingly becoming a part of people’s everyday lives. As user adoption increases, mobile technology offers researchers the opportunity to reduce data collection costs, increase the efficiency of tracing operations, exploit additional modes of questionnaire administration, and collect a very rich set of personal data. Due to early implementation, many lessons on the use of mobile technology have already been learned, including application design, types of data that can be collected, and limitations. This paper will summarize a variety of prior studies that have used mobile devices for research and how the technology in smartphones (camera, GPS, accelerometer, etc…) was used to collect new kinds of data. The paper will also examine how best practices in the consumer application marketplace can inform the design of data collection applications using a variety of case studies in the iTunes App Store and the Android Marketplace.

Second Abstract: Online Surveys Aren’t Just for Computers Anymore! Exploring potential mode effects between Smartphone and Computer-Based Online Surveys. This paper will present results from an experiment conducted to explore mode effects between surveys completed via a smartphone compared to those completed via a desktop or laptop computer. In particular, the Got Healthy Apps study randomized online panel members who reported having an iPhone into either iPhone or computer completion of a survey relating to the use of health-related iPhone apps, using stratified randomization based on sex, age group and education level. A total of 221 iPhone assignees and 209 computer assignees completed the survey. We will present results comparing primacy/recency effects, total survey time, total questions asked, back button usage rates and completion times per section of the survey across the two survey modes. Specific emphasis will be placed on presenting results from an embedded within-person app recognition experiment that sought to explore whether Smartphone users recognize apps by name or by icon or both. We will also provide a brief description of how the iPhone survey was developed and deployed and summarize results from this study in the form of recommendations for future Smartphone survey development.

Third Abstract: SMS-adjunct to Support Data Quality and Compliance in Health Survey Research The decline in health survey response rates over the last decade has been well documented. The increasing use of mobile phones, social networking tools and information sharing systems offers new opportunities for enhancing survey techniques and processes. The pervasiveness, low cost, and convenience of mobile phones make short-message-service (SMS) texting an ideal application for disseminating as well as gathering health information from consumers. SMS intervention data suggest that text messaging systems can effectively improve disease management. Based on this knowledge, a primary research question emerged: if SMS can be used as a means of effecting health behavior change, can it also be used as an adjunctive technology to maintain a survey sample’s engagement with longitudinal data collection? In adherence with the AAPOR Cell Phone Task Force Best Practice Recommendations, an ongoing pilot seeks to demonstrate the feasibility of using SMS as an adjunctive technology (SMS-a) to support survey data collection conducted primarily via smartphone application. Through the BreathEasy evaluation, part of RWJF's Project HealthDesign, approximately thirty patients with diagnoses of moderate to severe asthma were recruited from two Richmond, VA primary care clinics in October 2011 to participate in a six-month evaluation of a smartphone application for self-monitoring of individuals with asthma. Patients are using an Android-based application to record their observations of daily living (ODLs), including asthma and mental health symptoms, medication use, symptom triggers, physical activity, and activity limitations. The SMS-a feature has been integrated to provide daily diary reminders, compliance prompts following incomplete or missed entries, and a suite of health promotion messaging content. In addition to serving as a technical proof-of-concept, this paper will describe the effect of SMS-a on participant satisfaction, retention, and data quality, and lessons learned for the future.

Fourth Abstract: Capturing in-the-moment insights via mobile data collection Scanning various industry conferences and reflecting on client conversations, it is clear that researchers are finally thinking about mobile as an important channel for research. This comes as no surprise: the CDC's National Center for Health Statistics reported in December 2010 that during the period January – June 2010, 26.6% of U.S. households had only cell phones (no “land lines”). The same study showed a staggering 51.3% among 25 to 29 year-olds. In the wake of this trend, it is clear that technology is effecting major change in how the public interacts with media and brands. As these shifts occur, so too do people's expectations of how they will be able to engage with a brand or with each other. What can we learn by researching these interactions while they are occurring? Researchers are comfortable thinking about mobile as a method to round out their sample or address coverage bias by fielding the same surveys on a different screen, but it is less common for researchers to take the next step and actually change the way they do research to meet the expectations of how mobile information requests, survey based or otherwise, can add facts and insights from the person who is a consumer, a shopper, a brand builder, a physician, a patient and a citizen. Such an adaptation and reconstruction of where we obtain research means not just adapting to mobile phones, but targeting them so that both consumer expectations and brand insight and action objectives are met. This paper will bring forth best practices, case studies of response quality, and guidance on applying in-the-moment data capture to obtain incremental information from mobile research respondents.

Fifth Abstract: Smartphone applications (or “apps”) provide researchers with a range of ready-made tools to collect both customary and new forms of data in a more reliable manner than self-reports, such as location, visual data, barcode scanning, in-the-moment surveys, and the like. Yet unlike traditional surveys, respondents have greater experience with and expectations of smartphone apps, such as ease of use, speed, and functionality. Researchers need, therefore, to pay more attention to user engagement. To this end, we report on a study in which a smartphone app was developed to capture television viewing behaviors and to serve as a replacement for a current paper-and-pencil (PAPI) diary survey approach. The app captures the critical data elements collected in the PAPI version, but also allows users to express their views on current shows via a rating scale, comments, and “likes.” The app also contains additional features designed to enhance user engagement, including a points & status system and allowing respondents to share their viewing and comments with others using the app or with their Facebook network. We discuss the challenges encountered in developing a smartphone application to replace a long-standing PAPI approach, and provide empirical data tracking data entry, feature use and overall compliance. Additionally, using a split-sample design, with one set of respondents utilizing a “basic” app with no gamification and social sharing features and another set of respondents using a “full feature” app, we assess the impact of these techniques for enhancing user engagement in terms of increased participation and changes in television viewing behaviors (a potentially negative consequence). The findings are of interest not only to those developing smartphone applications or leveraging ready-made app utilities, but more broadly to the survey field in terms of our understanding of how to engage respondents.

Reliability and Validity of Survey Self Reports

Lying Versus Fail-to-match: Self-reported Turnout and Validated Turnout in the 2008-2009 ANES Panel Study Matthew Berent, Stanford University ([email protected]); Jon Krosnick, Stanford University ([email protected]); Arthur Lupia, University of Michigan ([email protected])

More survey respondents report turning out for elections than would be expected based on official turnout statistics. Many scholars have taken this discrepancy as evidence of survey respondents misreporting turnout behavior. One proposed method for collecting more accurate turnout data is to rely on official government turnout records. Previous turnout validation exercises typically lead to turnout rates among survey respondents that are much closer to population turnout statistics than turnout rates based on self-reports. Some have interpreted such results as evidence that validated turnout based on government records generates more accurate turnout data than self-reports.

We use data from the 2008-2009 ANES Panel Study and government turnout records from six states to test the idea that validated turnout data are more accurate than self-reported turnout. We find that the turnout rate among survey respondents was higher than turnout in the population from which respondents were recruited. While a small proportion of respondents seem to misreport turnout, problems with turnout validation methods call into question the idea that validated turnout data are more accurate than self-reports. In particular, turnout validation methods fail to locate government records for some respondents. As a result, some respondents who did turn out are coded as not turning out based on the validated turnout exercise. This fail-to-match problem, in conjunction with higher turnout among survey respondents, serendipitously generates validated turnout rates that are more similar to population turnout rates than rates based on self-reports. While respondent misreporting introduces inaccuracy into self-reported turnout data, the fail-to-match problem introduces inaccuracy into validated turnout data. We conclude that turnout self-reports are more accurate than previously believed, and are not less accurate than validated turnout data.

The Validity of Adolescents' Self-reported Data Jill Walston, American Institutes for Research ([email protected])

The Early Childhood Longitudinal Study, Kindergarten Class of 1998-99 (ECLS-K) was sponsored by the National Center for Education Statistics (NCES) within the U.S. Department of Education. The students participating in this large-scale study were followed longitudinally from kindergarten in the fall of 1998 through the spring of 2007 when most were in the eighth grade (13 and 14 years old). Data about the students and about their home and school environments were collected from multiple sources including teachers, school administrators, parents, and the students themselves. In addition, the students participated in cognitive assessments and direct physical measurements. Many of the items on the 2007 student questionnaire (paper-and-pencil) had a comparable measurement from a direct source or from an interview or questionnaire item from another respondent, which allows for a unique opportunity to check the validity of self-reported data collected from this age group (9,300 students completed the questionnaire). A wide range of self-reported items will be examined including 1) their weight status compared with their actual body mass index calculated from direct measurement of their height and weight, 2) their ability in a variety of academic subjects compared with their assessment data in those subjects, 3) enrollment in particular science and mathematics courses compared with teacher report, 4) number of hours they watch TV and do homework compared with parent report, and 5) the availability of different types of food for purchase at their school compared with school administrator report. The study will look at the comparability of these data by a variety of demographic characteristics. Some of these comparisons will not examine validity but will investigate how data about the same behavior or characteristic differs depending on the source of the data. Results from this study will be useful for those designing, or interpreting data from, studies with adolescent respondents.

Findings from a Split-Ballot Experiment on a New Approach to Measuring Health Insurance in the Current Population Survey Joanne Pascale, U.S. Census Bureau ([email protected])

This paper presents results from the Survey of Health Insurance and Program Participation (SHIPP), a large-scale split-ballot field experiment carried out in March 2010 that tested questions about health insurance coverage during the past calendar year from the Current Population Survey Annual Social and Economic Supplement (CPS) and an experimental set of health insurance questions (EXP) designed to reduce measurement error in the CPS. The EXP also contained questions on current coverage. Two distinct sample sources were used – Random Digit Dial (RDD) and enrollment files (MCARE). There were no significant differences in estimates of those uninsured throughout the year or estimates of coverage by plan type for the non-elderly RDD sample, but for the MCARE non-elderly sample the uninsured estimate was three percentage points lower in the EXP than in the CPS. Regarding reference period, the rate of uninsured-throughout-the-year was lower than the rate of uninsured at a point in time, by about two-and-a-half percentage points in the RDD sample and three-and-a-half percentage points in the MCARE sample. The EXP also produced data outside the scope of the CPS: the number, duration and start/end months of coverage spells, by plan type, over a 15-17 month period. This enables analysts to examine churning on-and-off coverage within the same plan type as well as transitions from one plan type to another. The EXP battery of health insurance questions took 91 seconds longer to administer than the CPS, and the median difference was 66 seconds.

How Likely?: Comparisons of Behavioral Intention Measurement Validity John Bremer, Toluna USA Inc. ([email protected]); Randall K. Thomas, ICF International ([email protected])

In predicting behavior, behavioral intention is generally the best predictor for behaviors requiring any degree of planning. There are a number of different measures of intention, including measures using percentages to indicate probability, single response measures with response labels to indicate likelihood (e.g. ‘Very likely’), and strength of intention (‘Probably will’). In developing response labels, we also have a choice to develop unipolar and bipolar scales (‘Not at all likely’ to ‘Absolutely certain I will’ for unipolar; ‘Absolutely certain I will not’ to ‘Absolutely certain I will’ with a neutral point of ‘Uncertain’ for bipolar scales). In a large-scale web-based study with 12,262 U.S. respondents, we randomly assigned respondents to one of 22 possible likelihood measures to assess likelihood to purchase 8 different consumer products. We compared response differentiation, response extremity, and time of completion. We generally found that unipolar measures of intention had higher correspondence with behavior than bipolar measures, and that measures with verbal response labels had higher levels of validity than did those employing numeric response entry.

Are You Sure You Didn’t See Our Ad? Factors Affecting Recall Inconsistencies in an Advertising Tracking Study Lindsey Brewer, Fors Marsh Group LLC ([email protected]); Ashton Jacobe, Fors Marsh Group LLC ([email protected]); Scott Turner, Fors Marsh Group LLC ([email protected])

Many surveys rely on self-report data to measure behaviors that are otherwise unrecorded or unobservable to researchers. When such self-report data are retrospective, they can increase respondent error by requiring respondents to recall events that are outside of their immediate awareness. Although it is not possible to evaluate the accuracy of individual responses, one method of assessing self-report reliability is measuring the consistency of responses across questions. This study examines respondent inconsistencies within a survey, across two sets of advertising recall questions (unaided and aided) that overlap in content.

In order to measure advertising awareness, respondents (n=7,175) are asked two sets of recall questions. They are first asked if they have seen or heard any advertisements for the Military or any of its branches in the past few months (unaided recall). Respondents are then presented with images and descriptions of specific advertisements and are asked whether or not they have seen each one in the past few months (aided recall).

Data analysis will focus on those respondents who indicated they had not seen any military advertisements in the unaided recall section, but later responded affirmatively in the aided recall section. The unaided recall questions are presented with growing levels of specificity and prompting; responses will be compared across these questions to examine if the degree of specificity provided is tied to recall of a particular ad. Demographic and behavioral factors related to these response patterns will also be discussed. The conclusion will provide recommendations for designing questions that may reduce recall error in self-report surveys.
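[Editor's illustration, not the authors' code.] The inconsistency of interest could be flagged from a respondent-level file along the following lines; the file name and column names are hypothetical.

    import pandas as pd

    # Hypothetical respondent-level extract: one unaided-recall item plus one 0/1
    # column per aided-recall advertisement.
    df = pd.read_csv("ad_recall.csv")
    aided_cols = [c for c in df.columns if c.startswith("aided_ad_")]

    # Respondents who denied seeing any military advertising unaided, but later
    # recognized at least one specific ad when aided.
    df["inconsistent"] = (df["unaided_recall"] == 0) & (df[aided_cols].sum(axis=1) > 0)
    print(df["inconsistent"].mean())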

Using Incentives to Increase Survey Participation and Decrease Bias

What are the Odds? Lotteries versus Cash Incentives. Response Rates, Cost and Data Quality for a Web Survey of Low-Income Former and Current College Students John Stevenson, University of Wisconsin Survey Center ([email protected]); Jennifer Dykema, University of Wisconsin Survey Center ([email protected]); Lisa Klein, Mathematica Policy Research ([email protected]); Kristen Cyffka, University of Wisconsin Survey Center ([email protected]); Sara Goldrick-Rab, University of Wisconsin-Madison ([email protected])

Collecting high quality data in surveys of students and low-income populations is critical for many studies in social science. Research indicates small, monetary pre-incentives are most effective in increasing response rates; less effective are post-incentives and lotteries, which are offered contingent upon completion of the survey. However, little is known about the effectiveness of large cash post-incentives relative to lotteries for cash or gifts. Moreover, given growing constraints on research funds, research is needed to determine which types of incentives are most cost-effective.

In this study we consider the effectiveness of various types of post-incentives with a low-income population: college students receiving Pell Grants (a federal means tested form of financial aid). In particular, we assess responsiveness to a cash post-incentive relative to lotteries offering monetary or nonmonetary incentives.

A stratified random sample of Pell Grant recipients (N=3,000) who were initially enrolled in Wisconsin public higher education in 2008 was surveyed in their first semester and then followed over time, whether or not they remained enrolled in school. All panel members were initially mailed a $5 cash pre-incentive and an invitation to complete a web survey in 2011. Nonresponders received up to three email reminders and were then sent a mail SAQ. Respondents were randomly assigned to the following post-incentive groups:

• Condition 1: no post-incentive
• Condition 2: $10 post-incentive
• Condition 3: inclusion in a lottery for $50 (paid out to 25 winners)
• Condition 4: inclusion in a lottery for an iPad

The analysis includes:
• Effects of the experimental treatments on unit and item nonresponse.
• Effects of incentives on nonresponse bias, looking at survey reports, reported civic engagement and linked administrative data.
• Differences in participation across incentive groups, looking at past participation for this study.
• Analysis of cost variation among the treatments.

Experimenting with Noncontingent and Contingent Incentives in a Media Measurement Panel Paul J. Lavrakas, Independent Consultant ([email protected]); J. Michael Dennis, Knowledge Networks ([email protected]); Jordon Peugh, Knowledge Networks ([email protected]); Jeffrey Shand-Lubbers, Knowledge Networks ([email protected]); Elissa Lee, Google, Inc.; Owen Charlebois, Google, Inc.

There is a rich history of experimentation to understand the effects of incentives on the decision to cooperate with a survey request. However, few studies have used a full factorial design to simultaneously test the impacts of both the value of a noncontingent incentive and the value of a contingent incentive. Furthermore, there is little reported on the effects of these two forms of incentives when used to build a new measurement panel from an existing panel. Our paper will report the findings of a pilot study conducted by Knowledge Networks, in which a national random sample of 400 households was drawn from KnowledgePanel®, Knowledge Networks’ probability-based online panel. These households were invited to participate in a multi-media measurement panel. A noncontingent incentive was given to all sampled households – a $2 bill mailed in a brief letter alerting the households they were chosen for the new study. The following week a formal invitation to join the new study was mailed, in which the households received either a (noncontingent) $5 or $10 cash incentive. The letter also promised the respondents that if they joined the new panel they would receive a (contingent) incentive of either $10 or $25 each month they remained in the panel. We will discuss the main effects and interaction effects on cooperation metrics (i.e., our dependent variables) in the experiment, including how a set of demographic and psychographic variables (i.e., covariates) known about each of the 400 households mediated these effects. We also will report findings from a follow-up survey, conducted after the recruitment period for the new panel ended, in which 76% of the 400 invited households completed a detailed questionnaire about their motives for wanting (or not) to participate in the new panel, including the role the incentives offered played in their final decision.
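[Editor's illustration, not the authors' analysis.] A minimal sketch of how the main and interaction effects of the two incentive factors might be estimated from a household-level file with a 0/1 cooperation flag; the file name, column names, and covariates are hypothetical.

    import pandas as pd
    import statsmodels.formula.api as smf

    # Hypothetical recruitment file: one row per invited household.
    # noncontingent = 5 or 10 (dollars); contingent = 10 or 25 (dollars/month); joined = 0/1.
    df = pd.read_csv("panel_recruitment.csv")

    # Logistic regression with main effects and the noncontingent x contingent interaction.
    model = smf.logit("joined ~ C(noncontingent) * C(contingent)", data=df).fit()
    print(model.summary())

    # Demographic/psychographic covariates could be added to the formula, e.g.
    # "joined ~ C(noncontingent) * C(contingent) + age + hh_income"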

Incentives Effects on Nonresponse Bias: Can Monetary Incentives Be Used to Decrease Nonresponse Bias in Measuring Wealth-Related Quantities? Barbara Felderer, Institute for Employment Research ([email protected]); Gerrit Müller, Institute for Employment Research (Gerrit.@iab.de); Frauke Kreuter, Institute for Employment Research ([email protected])

Respondent incentives are widely used to increase response rates, but their effect on nonresponse bias is still unclear. In this study, we examine the effect of cash incentives on nonresponse bias, in particular the question of whether cash incentives succeed in bringing people with low socio-economic status - who are typically underrepresented - into the sample, and thereby decrease nonresponse bias in wealth-related quantities.

Survey samples usually include too few people with low socio-economic status to draw valid inferences about them. That problem can be overcome by using data from the "Panel Labour Market and Social Security" (PASS) of the German Institute for Employment Research, which is specifically designed to study benefit recipients and samples them in large numbers. An incentive experiment was conducted within the third wave of PASS, consisting of two treatment groups - an unconditional 10 Euro cash incentive and a conditional lottery ticket - which allows us to examine incentive effects on nonresponse bias.

We find that the mean response propensity is higher for the "cash group," whereas the standard deviation of propensities is lower, which indicates an increase in survey representativeness. Indeed, bias is reduced for several wealth-related quantities such as household income, welfare benefit recipiency, and a deprivation index. Findings for household income and welfare benefit recipiency are compared to official register data provided by the German Federal Employment Agency, which are available for both respondents and nonrespondents. Register data are known to be highly reliable for these variables, and bias in these variables can be directly attributed to nonresponse. We conclude that cash incentives can indeed be used successfully to decrease nonresponse bias in wealth-related quantities by bringing into the sample people with low socio-economic status who would otherwise refuse.
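[Editor's illustration, not the authors' code.] Because register values are available for respondents and nonrespondents, nonresponse bias can be computed directly as the respondent mean minus the full-sample mean within each incentive group. The sketch below assumes a hypothetical gross-sample file.

    import pandas as pd

    # Hypothetical gross-sample file: responded (0/1), group ("cash" or "lottery"),
    # and a register-based variable such as household income.
    sample = pd.read_csv("pass_wave3_sample.csv")

    for group, g in sample.groupby("group"):
        full_mean = g["register_income"].mean()                       # register benchmark
        resp_mean = g.loc[g["responded"] == 1, "register_income"].mean()
        print(group, "estimated nonresponse bias:", resp_mean - full_mean)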

Survey Research of Economic Incentives: Do Incentives Given Prior to a Survey Affect Participation in the Survey? Aaron Hill, MDRC ([email protected])

Much work has been done on the role of monetary survey participation incentives and their effect on response bias and total survey error. Existing literature shows benefits of incentives on participation in surveys and their role in achieving higher response rates. However, less is known about the effect incentives have on participation in subsequent survey waves. Recent work suggests that a participation incentive in the first wave of a longitudinal survey affects response behavior in subsequent waves.

This may have implications for survey research on programs that have economic incentives unrelated to survey participation incentives (e.g., as part of a program to improve the well-being of low-income families, a $600 incentive is given to a family when a high school student in the household passes a standardized exam). This paper examines whether an economic incentive awarded prior to completion of a survey has an effect on participation in the survey. Using survey data from three randomized controlled trials of programs that offer economic incentives in Texas, New York City, and the United Kingdom, this paper explores whether the economic incentives awarded prior to survey administration affect response behavior. Specifically, it examines whether those who received an incentive respond at different rates than those who received no incentive, and whether this introduces a bias in response that leads to measurement error (in the sampling frame and the experimental research design).

The analysis includes disposition data from six surveys that were part of three large-scale randomized controlled trials (with one to three survey waves in each trial), MIS data on the occurrence and dollar amount of economic incentives received prior to survey interview, and baseline program data that provides detailed characteristics of the entire sampling frame (including nonrespondents) captured at the start of the study.

Maximizing Survey Participation for Retail Customers by Understanding Survey Mode and Incentive Preferences Joe Cardador, Service Management Group ([email protected])

In this study 1,260 retail customers were asked about their likelihood to participate in a customer experience survey given different incentives and different survey mode options. Participants ranked their preferences for 15 incentives using Maximum Difference Scaling (MaxDiff). The incentives tested included sweepstakes drawings, giftcards and discounts, and “stacked” incentives that combined elements of both. Participants were also asked to rank their preferences for web, phone, or mail surveys under a variety of conditions (e.g., inbound versus outbound). Results confirm that discounts and giftcard incentives generally perform best but the right sweepstakes drawing can generate strong interest among potential survey respondents. As expected, computer users favored web-based surveys over phone and mail surveys. This group also favored inbound automated phone surveys over outbound phone surveys conducted with a live interviewer. Differences by demographic groups are also presented in an effort to identify which incentives and which survey modes may lead to increased non-response for some subgroups of retail consumers.
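[Editor's illustration; the study's own estimation may differ.] A simple count-based MaxDiff score can be computed from a long task file in a few lines; all names below are hypothetical.

    import pandas as pd

    # Hypothetical long-format MaxDiff file: one row per incentive shown in a choice set,
    # with 0/1 flags for whether it was picked as best or worst in that set.
    tasks = pd.read_csv("maxdiff_tasks.csv")

    # Count-based score: (times chosen best - times chosen worst) / times shown.
    scores = (tasks.groupby("incentive")
                   .apply(lambda g: (g["best"].sum() - g["worst"].sum()) / len(g))
                   .sort_values(ascending=False))
    print(scores)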

Friday, May 18, 2012 1:45 p.m. - 3:15 p.m. Concurrent Session E

Consumer Confidence and Economic Issues

Americans' Economic Confidence and Objective Economic Indicators Lydia Saad, Gallup ([email protected]); Christopher Wlezien, Temple University

The prevailing indexes of consumer confidence reported in the U.S. today are based on monthly measures of consumer attitudes about the economic climate. However, Gallup’s Economic Confidence Index -- measured daily as part of Gallup Daily tracking, and reported on the basis of 3-day rolling averages -- documents that consumer attitudes often change sharply within the span of a month in reaction to specific political or economic news. This paper examines the extent to which Americans’ economic confidence is affected by two of the more prominent economic news events Americans are exposed to on a regular basis: daily changes in the Dow Jones Industrial Average and the Bureau of Labor Statistics’ monthly unemployment reports.

For this analysis, Americans’ views of the economy are based on Gallup’s Economic Confidence Index from its inception in January 2008 through January 2012. Gallup’s daily measurement of the Index affords the opportunity to view the short-term impact that changes of various magnitudes in the Dow and unemployment rate have on consumer attitudes about the economy, as well as to look at the duration of any such changes. Because the Gallup Economic Confidence Index has a .82 correlation with the Conference Board’s Consumer Confidence Index and a .95 correlation with the Thomson Reuters/University of Michigan Consumer Sentiment Index, the findings may have important implications for the utility of those monthly measures.
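[Editor's illustration, not Gallup's code.] The 3-day rolling average and a monthly series for comparison with the other indexes can be computed mechanically along these lines; file and column names are hypothetical.

    import pandas as pd

    # Hypothetical daily series of the net economic confidence score.
    daily = pd.read_csv("econ_confidence_daily.csv", parse_dates=["date"]).set_index("date")

    # 3-day rolling average, as the index is reported.
    daily["rolling_3day"] = daily["confidence"].rolling(window=3).mean()

    # Monthly average of the daily index, for comparison with monthly indexes
    # such as the Conference Board or Michigan series.
    monthly = daily["confidence"].resample("M").mean()
    print(monthly.tail())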

Confidently Partisan: Consumer Views and Political Attitudes in Good Times and Bad Dan Cassino, Fairleigh Dickinson University's PublicMind Poll ([email protected]); Peter J. Woolley, Fairleigh Dickinson University's PublicMind Poll ([email protected]); Krista Jenkins, Fairleigh Dickinson University's PublicMind Poll ([email protected])

Research has shown that, in normal economic times, an individual’s perception of the nation’s economic health is largely a function of his or her partisanship: individuals who have a favorable view of the party in power tend to perceive the economy as being in better shape than those who dislike the party in power, and vice versa. However, it is not at all clear that these relationships hold up under prolonged difficult economic conditions. This paper uses our own archive of polls on consumer confidence to identify the predictors of individuals’ views of present and future economic conditions, and how these predictors interact with voters’ political attitudes, to show how perceptions of the political environment color consumer confidence and views of the current economy in good times and bad.

Deliberate Decisions about the 2012 Federal Budget: How the American Public Would Reduce Spending and Increase Taxes to Shrink the Projected Budget Deficit Curtiss Cobb, Knowledge Networks ([email protected]); Norman Nie, Revolution Analytics ([email protected]); Saar Golde, Revolution Analytics ([email protected])

When the nation’s leaders were engaged in a polarizing debate over the appropriate balance between tax increases and spending cuts to address the federal budget deficit, Knowledge Networks captured the public’s opinion on this same topic. In an Internet poll fielded July 28 through August 1, 2011, 1,491 adults were asked how much money they think the federal government should spend and collect in taxes in 2012.

Respondents were presented with an interactive list of 28 federal programs and budget priorities— including various discretionary programs—and were asked to make actual decisions on how much money the federal government should spend on each program. They were also asked to set personal and corporate taxes at marginal rates they thought to be fair. As the respondents made their choices, the survey interactively provided feedback on the budgetary impact of their decisions.

In summary, the poll found that the American public was united and willing to propose massive spending cuts and only modest tax increases in order to reduce the projected 2012 budget deficit, regardless of political party or ideology. Democrats proposed $7 worth of cuts for every $1 in new taxes, while Republicans suggested $33 in cuts for every $1 in new taxes.

However, a deeper analysis of the public’s preferences reveals why compromise is difficult. The 28 federal programs and 6 tax brackets reduce to a set of four competing budgetary commitments along which the American public is highly fragmented.

Other key findings to be presented will include: (1) the amount of consensus among the public for spending on individual programs; and (2) a preference to modestly raise taxes on those income brackets above their own.

Interviewer Communication and Survey Participation

An Interactional Model of the Call for Participation in the Survey Interview Nora Cate Schaeffer, University of Wisconsin Survey Center, University of Wisconsin-Madison ([email protected]); Dana Garbarski, Department of Sociology, University of Wisconsin- Madison ([email protected]); Jeremy Freese, Northwestern University ([email protected]); Douglas W. Maynard, Department of Sociology, University of Wisconsin-Madison ([email protected])

Drawing on our analysis of the actions of interviewers and sample members during the call to recruit survey participation, we develop an interactional model of that call. This model brings the actions of interviewers and sample members together and draws attention to the complexities of analyzing the consequences of individual actions. To analyze features of the interactional environment quantitatively, we use audio recordings from the 2004 wave of the Wisconsin Longitudinal Study (WLS) and an innovative design that controls for sample members’ propensity to participate in the survey. The propensity score is based on information collected in prior WLS waves and was used to select pairs of acceptances and declinations based on the propensity to participate. For each call we analyze an extensive set of interviewers’ and sample members’ actions, the characteristics of those actions, and their sequential location in the interaction. Our case-control design allows us to analyze the consequences of actions for the outcome of the call. Previous research has proposed that the actions of sample members may provide encouraging, discouraging, or ambiguous interactional environments for interviewers soliciting participation in surveys. We examine features of actions that may contribute to an encouraging or discouraging environment in the opening moments of the call. We also analyze whether a sample member’s subsequent actions (e.g., a question about the length of the interview versus a statement of lack of interest) constitute an encouraging, discouraging, or ambiguous environment within which the interviewer must produce her next action. This paper is part of a larger project whose aim is to capture and model the structures of action and interaction in survey introductions.
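[Editor's illustration, not the WLS team's procedure.] A schematic version of the propensity-based case-control selection: estimate each sample member's participation propensity from prior-wave information, then pair each declination with the acceptance nearest in propensity. The covariates, file name, and case identifier are hypothetical.

    import pandas as pd
    from sklearn.linear_model import LogisticRegression
    from sklearn.neighbors import NearestNeighbors

    calls = pd.read_csv("wls_recruitment_calls.csv")           # hypothetical extract
    covariates = ["prior_participation", "age", "education", "prior_contacts"]

    # Propensity to participate, modeled from prior-wave information only.
    prop = LogisticRegression(max_iter=1000).fit(calls[covariates], calls["accepted"])
    calls["propensity"] = prop.predict_proba(calls[covariates])[:, 1]

    # 1:1 nearest-neighbor pairing of declinations with acceptances on propensity.
    accepts = calls[calls["accepted"] == 1]
    declines = calls[calls["accepted"] == 0]
    nn = NearestNeighbors(n_neighbors=1).fit(accepts[["propensity"]])
    _, idx = nn.kneighbors(declines[["propensity"]])
    pairs = list(zip(declines["case_id"], accepts["case_id"].iloc[idx.ravel()]))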

Using Interviewer Observations of Door Step Concerns to Characterize Reluctance of Survey Respondents Shirley Tsai, Bureau of Labor Statistics ([email protected]); Ting Yan, NORC at University of Chicago ([email protected]); Jay Lin, University of California at Los Angeles ([email protected])

As household surveys have experienced declining response rates over the past few decades (Curtin et al., 2000; Bates et al., 2008), reducing nonresponse and correcting for potential nonresponse error have been two major challenges for survey organizations. Paradata – data about the survey data collection process – have been increasingly used in identifying nonrespondents and in post-survey nonresponse adjustments in recent years (Kreuter et al., 2010). Interviewer observations of door step concerns – one type of paradata – have been shown to reveal the concerns sampled respondents have and also their reasons for refusing the survey request. However, due to the qualitative nature of interviewer observations, they have yet to be widely used in quantitative analysis and modeling. This paper demonstrates two different ways of using interviewer observations of door step concerns to characterize reluctance of survey respondents – principal component analysis (PCA) and latent class analysis (LCA). We will evaluate the effectiveness of both methods in characterizing the level of difficulty of recruiting survey respondents and the nature of their reluctance. We will also provide suggestions on how interviewer observations of door step concerns can be used in other survey settings.
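[Editor's illustration, not the authors' code.] To make the first of the two approaches concrete, a principal component analysis of coded door step concern indicators might look like the sketch below (LCA would typically use a dedicated package). All column names are hypothetical.

    import pandas as pd
    from sklearn.decomposition import PCA

    # Hypothetical case-level file: one 0/1 indicator per coded door step concern,
    # e.g., too_busy, not_interested, privacy, survey_too_long.
    concerns = pd.read_csv("doorstep_concerns.csv")
    indicators = [c for c in concerns.columns if c != "case_id"]

    # Reduce the concern indicators to a few components usable as reluctance scores.
    pca = PCA(n_components=3)
    scores = pca.fit_transform(concerns[indicators])
    print(pca.explained_variance_ratio_)
    concerns[["reluctance_pc1", "reluctance_pc2", "reluctance_pc3"]] = scores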

How Telephone Interviewers’ Approaches Impact their Success Jessica Broome, University of Michigan ([email protected])

Telephone survey interviewers vary widely in their success at persuading potential respondents to participate. This persuasive act can be viewed in two stages: first, the initial impression the interviewer makes on potential respondents; and, assuming there is success in this stage, the interviewer’s ability to respond appropriately to concerns expressed by potential respondents.

I report results from a study currently underway that looks at both of these stages in a corpus of audio-recorded telephone survey introductions. Initial impressions of telephone interviewers are assessed by asking web survey respondents to listen to the initial seconds of an interviewer’s recorded introduction (typically “Hello, this is ____ and I’m calling from the University of Michigan about our study on ____.”) and to rate the interviewer on several personal and vocal characteristics, including “professional,” “competent,” “confident,” “enthusiastic,” “friendly,” “pleasant to listen to,” “natural-sounding,” and “scripted.” A study by van der Vaart, Ongena, Hoogendoorn and Dijkstra (2005) indicates that a competent, confident approach may be more effective than a friendly one. At the same time, there is some evidence indicating that interviewers with voices judged as “pleasant” are more successful than those with less pleasant-sounding voices.

Interviewers’ responsiveness to concerns expressed by sample members will be assessed through analysis of the interaction in complete survey introductions (as opposed to the initial seconds). These introductions have been transcribed and codes assigned indicating the presence of a specific concern (for example, “I don't have time.”) and the interviewer’s response to the concern (for example, “We can call you back at a more convenient time.”). The hypothesis is that interviewers who offer relevant responses to concerns expressed by sample members will be more successful.

Preliminary results from this ongoing research will be discussed, as well as directions for future research and practical implications for survey operations.

It's About Time: Examining the Effect of Interviewer-Quoted Survey Completion Time Estimates on Nonresponse Bess Welch, NORC at the University of Chicago ([email protected]); Stacie Greby, National Center for Immunization and Respiratory Diseases ([email protected]); Christopher Ward, NORC at the University of Chicago ([email protected]); Jacqueline George, NORC at the University of Chicago ([email protected]); Kathleen S O'Connor, Centers for Disease Control and Prevention, National Center for Health ([email protected])

Modern survey researchers often trade off the benefits of collecting more data to guide public health decision making with the risk of a longer survey leading to lower response rates. Although significant research exists on the effect of interview length on breakoff, there is very limited research on the effect of interviewer-quoted survey completion time estimates on interview completion. This paper examines the effect of stated time estimates on interview completion rates in multiple CATI surveys: the National Immunization Survey (NIS), the National Survey of Children with Special Health Care Needs (NS-CSHCN), and the National Survey of Children's Health (NSCH). We show that the mention of a shorter survey leads to higher response rates and reduces the risk of early breakoff in the interview. Findings from the NSCH demonstrate the use of different wording to describe the same amount of time (e.g., “about half an hour” vs. “about 30 minutes”) and the effect on response rates. Additional findings from a controlled experiment on the NIS highlight the effect of mentioning survey length on response rates. Implications for survey practice and further research are discussed.

Investigating Mode Effects

Mode Effects Measurement and Correction: A Case Study Courtney Kennedy, Abt SRBI ([email protected]); Allison Ackermann, Abt SRBI; Chintan Turakhia, Abt SRBI; Michael O. Emerson, Rice University Kinder Institute for Urban Research; Adele James, Rice University Kinder Institute for Urban Research

Multiple modes of data collection are increasingly being used to reduce the cost of rigorous samples, maximize response rates and coverage, or make data collection more cost-efficient for longitudinal surveys. Depending on the survey content, such a change has the potential to generate mode effects. While the body of literature is growing, much remains to be learned about the conditions under which researchers should expect to observe a mode effect and what avenues of recourse are advisable when a mode effect is observed. In this presentation we present the results of a randomized mode experiment embedded in the second wave of The Portraits of American Life Study (PALS). The baseline survey was administered in 2006 with computer-assisted personal interviewing (CAPI) and audio computer-assisted self-interviewing (ACASI) for a module of sensitive items. The second wave, conducted in 2011, features a sequential, mixed-mode approach beginning with Web and followed up with CATI. A random subset of the Wave 2 respondents was assigned to a “CATI only” condition with no Web option. We present our methodology for exploring mode differences and correcting for mode effects.

Disentangling Mode and Nonresponse Effects in the World Trade Center Health Registry Joe Murphy, RTI International ([email protected]); Robert Brackbill, New York City Department of Health and Mental Hygiene ([email protected]); Shengchao Yu, New York City Department of Health and Mental Hygiene ([email protected]); David Wu, New York City Department of Health and Mental Hygiene ([email protected]); Deborah Walker, New York City Department of Health and Mental Hygiene ([email protected]); Lennon Turner, New York City Department of Health and Mental Hygiene ([email protected]); Steven Stellman, New York City Department of Health and Mental Hygiene ([email protected]); Sara Miller, New York City Department of Health and Mental Hygiene ([email protected]); Erica Saleska, RTI International ([email protected])

The World Trade Center (WTC) Health Registry was established in 2003 for monitoring the health of people directly exposed to the WTC disaster. Wave 1 (W1) baseline data collection achieved a cohort of 71,437 individuals. Wave 2 (W2) was completed between November 2006 and December 2007 for a response rate of 68.6% (46,322 adults). Currently, the New York City Department of Health and Mental Hygiene (DOHMH) is conducting the third wave (W3) of data collection. Both W2 and W3 used a multi-mode approach that included web, mail, and Computer Assisted Telephone Interviewing (CATI).

In W2, mail surveys accounted for 46.1% of completed questionnaires, followed by Web surveys (41.8%) and CATI (12.1%). At the time of CATI initiation, the response rate to mail and Web surveys was 59%. The CATI response rate was 21%. Estimates on certain key mental health measures, like probable post-traumatic stress disorder, were significantly lower for CATI respondents compared to all respondents (13.3% and 19.1%, respectively). This was partly explained by differences in respondent characteristics by mode, but even when controlling for these characteristics, mode differences remained. To better isolate the effect of mode, an experiment was incorporated into W3 with a subset of cases to disentangle mode effects from differential nonresponse. The experiment consists of three matched samples of 400 assigned to Web, mail, or CATI. By tracking who responds in each mode and comparing estimates of key mental and physical health measures, we can better estimate the effects of mode independent of nonresponse error.

This presentation will summarize the methods and findings of the experiment and will be useful for others considering the tradeoffs of efficiency and potential mode effects in their data collection designs.

Assessing Measurement Equivalence and Bias of Questions in Mixed-mode Surveys Under Controlled Sample Heterogeneity Thomas Klausch, Utrecht University ([email protected]); Joop Hox, Utrecht University ([email protected]); Barry Schouten, Statistics Netherlands ([email protected])

Analysts working with data generated by different modes of data collection often want to be sure that their measurements are comparable. If a set of questions is designed to measure the same latent trait, confirmatory factor analysis (CFA) is a useful analytic tool for this purpose. It can be applied to assess whether properties such as measurement error, the association between latent traits and questions (measurement invariance), and the means of latent traits are equivalent across survey modes.

We illustrate an application using empirical data from an experiment based on a national probability sample, in which 4048 respondents were randomly assigned to either face-to-face, telephone, mail or web interviewing. Two related traits, “moral support of the police” and the “obligation to obey,” were each measured with three questions; these form the basis of our CFA model.

The association between latent traits and questions was invariant across modes. However, measurement errors differed between modes. In particular, the self-administered modes yielded more reliable indicators than the interviewer modes. Moreover, we find systematic bias between modes on the mean of one of the traits. The effect signs suggest that respondents gave socially desirable answers in the interviewer-administered modes.

A particular challenge with comparing survey modes is that sample compositions are often heterogeneous. If the selection process is correlated with model elements, such as traits, it can bias invariance tests and decrease fit. We illustrate available options to adjust for this problem, for example propensity score methods or covariate adjustment, all based on the use of auxiliary variables, such as socio-demographics.

In conclusion, interviewer-administered modes seem to produce measurements of lower quality than self-administered modes with respect to random error and systematic bias. Modes may thus affect both the error variance and the bias of an estimate. An effect can be suspected particularly between interviewer and non-interviewer modes.

Mixed-mode Design and Mode Effects in Surveying Military Veterans Wendy Wang, Pew Research Center ([email protected]); Rich Morin, Pew Research Center ([email protected]); Kim Parker, Pew Research Center ([email protected])

Mixed-mode surveys can be used when survey practitioners face time or budget constraints, as well as for other research purposes. Previous research has shown the mode of survey interview can sometimes affect the way in which questions are answered, especially for sensitive questions. Using data from a pair of recent Pew Research Center surveys of veterans in the U.S., we explore possible mode effects on the data quality of the post-9/11 veteran samples in this study.

To oversample the post-9/11 veterans in the survey, we used both phone (n=498) and internet interviews (n=214 from the Knowledge Networks panel). We analyzed whether there were differences between the two samples to ensure that the data from the two modes were comparable. The preliminary results suggest that the differences in responses between the two modes were modest in size and most were not statistically significant. Statistically significant differences were found in 19 of the 60 substantive questions in the survey, although the pattern was not consistent.

The sample of post-9/11 veterans who responded by phone and those who responded online were somewhat different on a few characteristics. There were no significant differences by gender, age, race or ethnicity, but more phone than web respondents had never attended college and fewer were currently married. Results from logistic regression models show that after controlling for all demographic variables, a significant effect by mode of interview was found for 6 of the 19 questions, including whether it is best for the U.S. to be active in world affairs, satisfaction with their personal finances and how good the care for injured veterans is at U.S. military hospitals.

Estimating Mode Effects without Bias: A Randomized Experiment to Compare Mode Differences between Face-to-Face Interviews and Web Surveys Douglas Rivers, Stanford University, Department of Political Science ([email protected]); Lynn Vavreck, UCLA, Department of Political Science ([email protected])

Comparisons of Web, telephone and face-to-face surveys have conflated differences in sample composition (who participates in each mode) with mode effects (how the same person would respond to questions administered using different modes). Using an experimental design that eliminates differences in sample composition, we estimate mode differences between a self-administered Web survey and a face-to-face interview. The design recruited a diverse set of participants at a central location, secured agreement to participation, and only then randomized interview mode for each participant. To improve efficiency, respondents were randomized in blocks, further controlling differences between the two treatment groups. The experiment demonstrates that there are large mode differences between self-administered Web interviews and face-to-face interviews. The face-to-face interview causes respondents to moderate attitudes, acquiesce, or give socially appropriate answers to tough questions. The effects are especially pronounced for respondents with low cognitive skills, suggesting that low observed attitude constraint in this group may be an artifact of survey mode.
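[Editor's illustration, not the authors' protocol.] A minimal sketch of randomization within blocks as described above; the roster file, block variable, and even split between modes are illustrative assumptions.

    import numpy as np
    import pandas as pd

    rng = np.random.default_rng(seed=7)

    # Hypothetical roster of recruited participants, already grouped into blocks
    # of similar respondents before mode assignment.
    roster = pd.read_csv("participants.csv")        # columns: person_id, block_id

    def assign_modes(block):
        # Alternate the two modes to fill the block, then shuffle within the block.
        modes = np.resize(["web", "face_to_face"], len(block))
        block = block.copy()
        block["mode"] = rng.permutation(modes)
        return block

    roster = roster.groupby("block_id", group_keys=False).apply(assign_modes)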

Methodological Briefs: Questionnaire Design Issues

The Direction of Rating Scales and Its Influence on Response Behavior in Web Surveys Florian Keusch, WU Vienna University of Economics and Business, Austria ([email protected])

Measurement of attitudes, opinions, and behavior in web surveys often relies on letting respondents indicate their agreement or disagreement with several statements on a continuum of predetermined answer categories on a horizontal rating scale presented in a grid format. When designing rating scales for web surveys, a number of decisions have to be made, among others about the response order of the scale points, i.e. whether the scale runs from positive to negative or from negative to positive. In the literature, there is no conclusive evidence about the influence of the direction of extreme point labeling (e.g., Belson, 1966; Christian et al. 2009; Dickson & Albaum 1975; Friedman et al., 1993; Weng & Cheng, 2000) and various theories have been used to explain differences in answer behavior: satisficing and primacy effects (Krosnick & Alwin 1987), interpretative heuristics (Tourangeau et al. 2004; 2007; Koller/Salzberger 2010), and cultural differences (Ferall-Nunge/Couper 2011; Tellis/Chandrasekaran 2010; Yang et al. 2010). As an increasing number of samples for web surveys is drawn from online panels and some respondents potentially participate in a large number of web surveys nowadays, their past experience with and expectations about the presentation of rating scales have to be considered. The presented study considers this diverse experience among web survey respondents. In three independent experiments with online panel members, professionals, and students, the response order of rating scales was experimentally varied. The influence of scale direction was measured on different indicators of data quality (response distribution, answer differentiation, straightlining, response styles). The results indicate that respondents with more web survey experience assume that rating scales usually run from positive to negative.

The Accuracy of Retrospective Reports of Residence and Employment Lisa Lee, NORC ([email protected]); Catherine Haggerty, NORC ([email protected]); Nola du Toit, NORC ([email protected])

It is a common survey task to ask respondents to report on events from the past. However, retrospective reports may be subject to error. Respondents may forget that an event happened and, for events that are remembered, they may forget when it happened. The longer ago something happened, the more likely respondents are to forget to report it (Ebbinghaus, [1894] 1964; Sudman et al., 1996). Errors in event dating occur when respondents remember events as having happened earlier (backward telescoping) or later (forward telescoping) than they did (Neter & Waksberg, 1964; Rubin & Baddeley, 1989; Huttenlocher, Hedges & Bradburn, 1990). Since 2002, NORC has been conducting a longitudinal survey with public housing residents who relocated due to the rebuilding of substandard public housing. In the most recent wave of the survey, respondents were asked to list all the housing units they had lived in since they began the relocation process and the dates of residence. They also listed any jobs they had held since relocation began and the dates of employment. We compare the accuracy of these retrospective reports of residence and employment to two “gold standard” sources of data. One source is the data on current residence and employment status that respondents gave during earlier waves of the survey. A second source is administrative data obtained from the public housing authority and from state employment records. We will present findings on forgetting (failing to report residences or jobs) and errors in dating (remembering the residence or job but reporting the dates incorrectly), and the relationship of errors in the retrospective reports to factors such as time since move/employment, complexity of residence/employment history, and demographic factors. This work will enhance understanding of the factors that influence the accuracy of recall and illustrate the importance of longitudinal surveys, as they are less subject to recall error.

A Comparison of Extreme Response Styles between Non-Hispanic and Hispanic Populations in United States Jennifer Kelley, University of Michigan ([email protected]); Sunghee Lee, University of Michigan ([email protected])

Extreme response style, where respondents exaggerate their responses and choose extreme points on ordinal response scales, is itself a source of measurement error. When this occurs more frequently in one group of respondents than in others, it introduces systematically different measurement error into the inference. Among the correlates of extreme reporting, race/ethnicity receives considerable attention in the literature. Hispanics/Latinos and African Americans in the United States are hypothesized to engage in higher levels of extreme response, which the literature has attempted to explain through cultural influences. However, due to the limitations of the current literature, the empirical evidence does not always support this hypothesis. Mainly, the limitations come from the lack of systematic control for potential confounders of the response style. One of the confounders is item-level characteristics, such as question topic and response scale features (e.g., response wording and length). This study attempts to provide a comprehensive comparison of extreme response styles between Hispanic and non-Hispanic populations in the United States by controlling for potential confounders. The General Social Survey (GSS) from 2000 to 2010, which includes a wide range of questions with rating scales, will be used for this study. We will first categorize each question by topic (or measurement construct), response scale wording, and response scale length. We will then examine extreme reporting separately by question and response scale characteristics and compare it between Hispanics and non-Hispanics. This will allow us to further examine whether certain question characteristics induce extreme response style more than others and whether racial/ethnic differences in extreme reporting interact with question characteristics.
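
The abstract does not spell out how extreme reporting will be operationalized; one common approach is to flag answers that fall on either endpoint of an item's scale and compare the per-respondent share of such answers across groups. A minimal sketch under that assumption, with hypothetical variable names and data:

```python
import pandas as pd

# Hypothetical item-level records: one row per answer to one rating question,
# with the scale length recorded so the endpoints can be identified per item.
answers = pd.DataFrame({
    "resp_id":   [1, 1, 1, 2, 2, 2],
    "hispanic":  [1, 1, 1, 0, 0, 0],
    "response":  [1, 5, 7, 3, 4, 2],
    "scale_len": [5, 5, 7, 5, 5, 7],
})

# An answer is "extreme" if it falls on either endpoint of its scale.
answers["extreme"] = (answers["response"] == 1) | (answers["response"] == answers["scale_len"])

# Extreme response style score: share of a respondent's answers that are extreme,
# then averaged within Hispanic and non-Hispanic respondents.
ers = answers.groupby(["hispanic", "resp_id"])["extreme"].mean()
print(ers.groupby(level="hispanic").mean())
```

In a fuller analysis this score would be computed separately by question topic, scale wording, and scale length, matching the categorization the abstract describes.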

Student Effort on Large-scale, Low-stakes Assessments: Comparing Results from NAEP and PISA Pia Peltola, American Institutes for Research ([email protected]); David Miller, American Institutes for Research ([email protected]); Rhonda Baylor, Optimal ([email protected]); Laura Warren, American Institutes for Research ([email protected])

It has been argued that low-stakes assessments (i.e., those that have no direct consequences for the students, teachers, or schools) may not accurately measure student achievement (Wise and DeMars, 2005). For example, there is concern that student achievement may be underestimated if students know that their performance on an assessment does not affect their grades. Further concern has been raised about international assessments—that possible cross-cultural differences in student motivation may compromise the validity of comparing student performance across countries (Holliday and Holliday, 2003).

Using recent data from the National Assessment of Educational Progress (NAEP) and the Program for International Student Assessment (PISA), this analysis will examine student reports about how much effort they put into taking the assessment, how this relates to their achievement, and the relative usefulness of each and combined motivation measures in predicting achievement.

On NAEP, eighth-graders’ effort was measured by three questions, each with a four-point Likert scale: (1) How hard was this test compared to most other tests you have taken this year in school? (2) How hard did you try on this test compared to how hard you tried on most other tests you have taken this year in school? (3) How important was it to you to do well on this test?

On PISA, 15-year-olds were shown an “effort thermometer” with three 10-point scales. A first scale (with “10” marked) indicated a maximum amount of effort invested in a situation of high personal importance. For the second scale, participants were asked to rate how much effort they put into taking the PISA test, and for the third scale, participants were asked to rate how much effort they would have invested if their performance counted towards their school grade.

Do Longer Questionnaires Yield Lower Response Rates? Stephanie Lloyd, Center for Survey Research, University of Massachusetts Boston ([email protected]); Patricia Gallagher, Center for Survey Research ([email protected]); Carol Cosenza, Center for Survey Research ([email protected])

While it is important for health organizations to assess the care they provide to patients, they often have limited funding to devote to data collection. The Consumer Assessment of Health Care Providers and Systems Clinician and Group Patient Centered Medical Home instrument (CAHPS® C&G PCMH) was developed to measure patients’ experiences getting ambulatory care. Our experiment was designed to examine the effects on data quality of reformatting the CAHPS C&G PCMH questionnaire to require fewer pages, thus minimizing printing and postage costs. The survey of which this study was a part was funded by the Agency for Healthcare Research and Quality (AHRQ).

A sample of n=2,100 adult patients from a university-based health system was randomized to self-administer either a 4-page or a 12-page questionnaire. The 12-page version is formatted following CAHPS guidelines: 8 pages of items and 4 pages for the instructions page and covers. The test questionnaire is a 4-page version of the original, with the introduction and instructions at the top of the first page.

A standard 3-contact mail administration protocol was followed. Response rates, item nonresponse, skip compliance, horizontal and vertical presentation of a 0-10 summary rating item, responses to individual items, and CAHPS composite measures were compared across arms. The project yielded a response rate of 44% (AAPOR RR1).
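
For reference, AAPOR Response Rate 1 counts only complete interviews as respondents and keeps partial interviews, refusals, non-contacts, other non-interviews, and cases of unknown eligibility in the denominator. A minimal sketch of the calculation; the individual disposition counts below are hypothetical (only the total sample size and the 44% overall rate come from the abstract):

```python
def aapor_rr1(I, P, R, NC, O, UH, UO):
    """AAPOR RR1: complete interviews (I) divided by completes, partials (P),
    refusals (R), non-contacts (NC), other non-interviews (O), and cases of
    unknown eligibility (UH, UO)."""
    return I / (I + P + R + NC + O + UH + UO)

# Hypothetical dispositions summing to the study's n=2,100 sampled patients.
print(round(aapor_rr1(I=924, P=50, R=400, NC=500, O=26, UH=150, UO=50), 2))  # 0.44
```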

New Frontiers: Advances in Web Surveys

Information-Communication Technology Support for Online Surveys: A Need for Integration Vasja Vehovar, University of Ljubljana ([email protected]); Ana Slavec, University of Ljubljana ([email protected]); Nejc Berzelak, University of Ljubljana ([email protected]); Katja Lozar Manfreda, University of Ljubljana ([email protected])

Software tools for developing and implementing online survey questionnaires are of essential importance. Depending on the tool used, time can be saved and the complexity and quality of collected data can be increased. However, the corresponding issues have received very little attention in the literature (Macer 2002; Crawford 2002, 2006; Vehovar et al. 2005; Berzelak 2006; Kaczmirek 2006, 2008; Zuckerberg 2006). We present trends and key characteristics of software tools for online surveys by analyzing the WebSM database, which includes over 400 entries. In particular, tools are examined with regard to their support for the various stages of the research process. Our analysis shows that while the core stages (online questionnaire implementation, distribution, and data collection) can be fully realized online using most of the available tools, earlier stages and post-collection stages have not fully shifted online yet. In most software tools, support for post-collection stages, such as data editing, weighting/imputation, advanced data management, and analysis, is weak or absent. Even less satisfactory is the support for questionnaire drafting and testing, as there are usually no online collaboration features. Instead, word processors and exchanging questionnaire versions by e-mail are used. The practice of using software tools only once the questionnaire is finalized is clearly outdated, as there is significant potential for integrating all stages of the survey process. In order to elaborate performance differences between implementing the questionnaire using the usual limited tools and collecting survey data in a fully integrated online tool, a pilot online survey platform has been developed. It supports online collaboration, comments management, archiving, and versioning. We performed an experiment in which the same questionnaires were developed in both ways. Results show that even relatively primitive integrated tools provide a better user experience and greater efficiency than the standard questionnaire development procedure.

The Effect of Mode on Participant Responses to Qualitative Research in Virtual Worlds Sarah Dipko, Westat ([email protected]); Catherine Billington, Westat ([email protected]); Pat Dean Brick, Westat ([email protected])

Research associated with virtual worlds continues to grow as virtual worlds increase in popularity. The benefits of conducting research in virtual worlds include the recruitment of large numbers of respondents in a reasonable time and at low cost. Moreover, respondents can participate in research surveys from within the virtual world which increases convenience. However, this also raises new issues such as possible mode effects between real-world vs. virtual-world responses. Survey responses provided in a virtual world (e.g., Cybertown, Second Life, Muse) may differ qualitatively from responses provided in a real world online (web) survey. It is important to determine what sorts of questions are vulnerable to change as well as the extent and direction of that influence. The results will inform our ability to interpret findings of virtual world-based research. In-world study participants will be recruited by a vendor with an established virtual presence. Data will be collected in the virtual world through a self-administered survey (e.g., a survey kiosk). Respondents who provide informed consent will be asked to complete a brief in-world survey of roughly 20 questions for a small virtual incentive (e.g., Linden dollars). The survey will include both demographic/factual questions as well as attitudinal/opinion questions. Respondents who complete the in-world survey will then be asked to complete a web survey in real life for a small real-life incentive. The web survey will be similar to the virtual-world survey. The analysis will focus on mode effects between the two sets of responses.

Designing Interactive Interventions in Web Surveys: Humanness, Social Presence and Data Quality Chan Zhang, Institute for Social Research, University of Michigan ([email protected])

Humanized agents are often implemented in the user interfaces of computers as well as other devices (e.g., IKEA’s animated online assistant ‘Anna’ and the iPhone speech-based personal assistant ‘Siri’). This study explores the design of humanized agents that interact with respondents in a Web survey questionnaire. Specifically, the agent intervenes when respondents speed, i.e., answer too fast to provide high quality data. Although a few previous studies on this type of intervention reported a positive impact on response quality, the evidence also suggests that some respondents may lack the motivation to comply with the intervention requests. We speculate that the intervention could be more effective if respondents treat it as coming from a human agent rather than as simple feedback from the computer; the risk is that this may introduce social desirability bias. To explore this, we examine four designs of the intervention for speeding; in all cases respondents will be prompted and asked to reconsider their answers when their answers are faster than a pre-specified threshold. We manipulate the picture and the text to either be more human-like or feel more like a computer program. We also contrast how the intervention is presented – either on the subsequent survey screen or in a pop-up window – to test the hypothesis that the latter might help create the sense of an agent stepping into the data collection process; as a result, respondents might be more likely to respond socially to the intervention when it involves humanized cues. This study compares the four intervention designs on: 1) respondents’ compliance with the intervention, 2) their subsequent response effort in the survey, and 3) the social desirability of their answers to subsequent sensitive questions.
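
The abstract does not describe the technical implementation of the speeding trigger; a minimal sketch of the underlying logic, assuming per-screen timestamps are available and using a hypothetical threshold value, might look like this:

```python
import time

SPEEDING_THRESHOLD_SECONDS = 5.0  # hypothetical per-screen threshold

def sped_through(page_shown_at: float, answered_at: float) -> bool:
    """Flag a screen as 'sped through' if it was answered faster than the threshold."""
    return (answered_at - page_shown_at) < SPEEDING_THRESHOLD_SECONDS

shown = time.time()
# ... respondent answers the screen ...
if sped_through(shown, time.time()):
    # The prompt would then be delivered either on the next survey screen or in a
    # pop-up window, with more human-like or more computer-like cues depending on
    # the experimental condition.
    print("You answered quite quickly. Would you like to reconsider your answer?")
```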

The Persistence of Attentiveness in Web Surveys: A Panel Study Adam Berinsky, MIT ([email protected]); Samantha Luks, YouGov ([email protected]); Doug Rivers, Stanford University and YouGov ([email protected])

Survey researchers have long known that respondents often fail to pay close attention to surveys. This tendency is an especially significant concern for internet-based surveys because there is no interviewer present to monitor the quality of answers to survey questions. Recently, psychologists (Oppenheimer et al., 2009) have proposed using the Instructional Manipulation Check (IMC) as one way to measure attentiveness during surveys.

In our AAPOR presentation last year we compared the IMC with traditional attentiveness measures in their relationship with the strength of observed experimental effects. Several months after the presentation, we re-interviewed respondents from this survey, measured their attentiveness again using both types of measures, and asked them to read a new randomized news story followed by several questions about the story.

In this paper, we use the panel data to examine three questions: (1) How likely were inattentive individuals in the first wave to remain inattentive in the second wave? (2) How well were respondents in the second wave able to remember details from the first survey? (3) How should we treat responses from those who are persistently inattentive versus those who are periodically inattentive?

Perspectives on the 2008 and 2010 General Elections

American Pride and Prejudice: Public Opinion on the Meaning of Obama’s Election as President David C. Wilson, University of Delaware ([email protected])

The reverberations from the 2008 election of Barack Obama as President still hold substantial meaning for members of the public. To some his election signified the possibility of the American dream, but for others his election either made no difference to or threatened their American identity. One argument suggests that lower pride in Obama’s election is based on racial prejudice: those who hold more resentful racial beliefs are less likely to think Obama’s election is meaningful. A similar argument suggests that a racialized Obama is simply less liked than a deracialized Obama. Both arguments are based on sound theory, and testable in public opinion surveys; however, few studies have taken a rigorous approach to testing these hypotheses. Using three survey experimental studies (total N > 3,000)—two from national samples, and one from a statewide sample—I ask whether pride in Obama’s election as President is altered by randomized labels of him as the “first Black/African American/Non-White/Multiracial” President. I also consider the impact of racialized resentment toward African Americans on pride, as well as how resentment might exacerbate or diminish any effects across the racial and non-racial labels.

Race of Interviewer Effects in the 2008 Presidential Election Nuri Kim, Stanford University ([email protected]); Yphtach Lelkes, Stanford University ([email protected]); Jon Krosnick, Stanford University ([email protected])

Since the beginning of survey research in which interviewers administer questionnaires orally, researchers have been concerned that interviewer behavior and characteristics might be a source of systematic measurement error in assessments. One particular concern has been that the race of the interviewer might influence respondent reports of opinions on matters related to race. This concern seems especially relevant to pre-election polls conducted prior to the 2008 U.S. Presidential election, because one of the candidates was African-American. The matter received some attention in the press prior to that election, with leading polling organizations expressing different beliefs about whether African-American interviewers were more likely than White interviewers to elicit pro-Obama reports. In this paper, we analyzed data from hundreds of thousands of interviews conducted in 2008 by three major survey organizations (Gallup, CBS/NYT, ABC/Washington Post) to explore the effect of the race of telephone interviewers on reported voting intentions. We also tested a particular hypothesis about the mechanism that might produce such an effect: respondent decisions about whether to complete an interview or not. We found that African-American interviewers were indeed more likely to elicit statements of the intent to vote for Mr. Obama, and we found that this occurred partly because African-American interviewers were more likely than White interviewers to elicit interviews from African-American respondents and/or White interviewers were more likely than African-American interviewers to elicit interviews from White respondents. Thus, what might seem to be an interviewer-race-inspired distortion of respondents' reports of their voting intentions turns out to be at least partly the result of a process that unfolds much earlier: not when the interviewer asks the respondent how he/she would vote, but at the moment the respondent decides whether to complete the interview or hang up.

The Impact of Climate Change Issue in the 2008 U.S. Presidential Election Bo MacInnis, Stanford University ([email protected]); Jon Krosnick, Stanford University ([email protected])

Long-held theories of voting behavior posit that voters evaluate political candidates on the basis of their positions on issues, yet these theories have received little empirical confirmation in the general population and only limited support among members of the public who attach high personal importance to an issue. National surveys show that climate change is a salient issue, that large majorities of Americans acknowledge the existence and human cause of climate change and desire government action to reduce future global warming, and that the climate change issue public is a sizeable group, suggesting that climate change could be an important factor in the 2008 election. However, counterarguments exist: one is that other issues, such as the economy and the financial and mortgage crises, seemed prominently important to the electorate, diminishing the electoral importance of climate change; the other is that both candidates appeared to have expressed similar positions on climate change, leading to an expectation of a null effect of the issue on voting (Page and Brody, 1972).

As a first effort to explore whether climate change affected voting behavior in the 2008 presidential election, this study employed well-established methodologies in political science based on measures of issue congruence and yielded two findings. First, greater relative proximity to Mr. Obama on climate change increased the likelihood of voting for him instead of for Mr. McCain, a finding that is consistent with the rational choice theory of voting. The finding appears to be robust across a wide range of analyses, using different ways of representing issue proximity, different estimation techniques making different types of assumptions, and issues other than climate change as additional controls. Second, issue voting on climate change was stronger among those who attach high personal importance to the issue, a finding that is consistent with issue voting theories.
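
Issue-proximity measures of this kind are typically built from the distance between a respondent's own placement and each candidate's placement on the same scale; a minimal sketch of one such relative-proximity measure, with hypothetical 1-7 scale placements, is shown below.

```python
def relative_proximity(respondent: float, obama: float, mccain: float) -> float:
    """Positive values mean the respondent is closer to Obama than to McCain on the
    issue scale; negative values mean closer to McCain."""
    return abs(respondent - mccain) - abs(respondent - obama)

# Hypothetical placements on a 1-7 climate policy scale.
print(relative_proximity(respondent=6, obama=6, mccain=3))  # 3 -> closer to Obama
```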

Candidates, Campaigns, and Policy Issues: Original Panel Data from the 2010 Midterms Andrew Therriault, Vanderbilt University, Center for the Study of Democratic Institutions ([email protected])

How do voters evaluate candidates on specific policy issues? To answer, I present original data from an internet panel survey during the 2010 midterm elections. In both waves of a pre- and post-election survey, respondents are asked to rate the positions and qualifications of their Senate and House candidates with regard to three distinct issues. I first demonstrate how learning about candidates over the course of the campaign allows voters to better connect their own policy preferences with the candidates' positions, and explore how voters' evaluations of candidates' issue-handling competence are colored by partisan bias. The second part of the study analyzes how voters combine evaluations of both positions and qualifications with each other and across issues, in order to determine their overall candidate preferences. Finally, I introduce an original dataset which records candidates' emphasis of those same issues in their campaigns, to demonstrate how voters' attitudes toward candidates on issues can be swayed by campaign messages. My findings offer new insight into voters' evaluation process and carry important implications for studies of public opinion, voting, and campaigns.

What Actually Happened in the 2010 Midterm Elections? Scott Ferguson Clement, Washington Post ([email protected]); Peyton M. Craighill, Washington Post ([email protected]); Jon A. Cohen, Washington Post ([email protected])

The 2010 midterm elections marked the third consecutive “wave” election in American politics, resulting in massive Republican gains in the House of Representatives as well as the Senate. But Democrats had ridden such a wave in the midterm elections only four years earlier, followed by a landslide presidential victory (by modern standards) under the banner of President Obama. The trajectory of independent voters is most curious. After voting by a 57 to 39 percent margin for Democratic House candidates in 2006 and 52 to 44 for Obama over John McCain, independents reversed course in 2010, backing Republican House candidates by a 56 to 37 percent margin.

Republican leaders and commentators proclaimed the 2010 election a broad rejection of Obama’s policies and a mandate to reduce the size of government, while Democrats claimed that the public still backed their goals, but languishing enthusiasm kept their supporters from the polls.

How much of the Republican victory was due to differential turnout as opposed to actual change in voter attitudes? This paper aims to offer a clear answer to this question that has received little empirical attention from researchers. We will analyze newly released 2010 turnout data from the Census Bureau’s Current Population Survey as well as Washington Post-ABC News and Pew Research Center polls to provide a comprehensive portrait of non-voters in 2010. We also will analyze newly released media exit poll data to examine how 2010 voters - both overall and among key swing groups - differed from electorates in the 2008 general and 2006 midterm elections. Lastly, we will examine how the electorate is shaping up ahead of the 2012 elections.

Recall and Measurement Error in Surveys

Measurement Errors in Self-Reports of Consumer Expenditures: Are Errors Attributable to Respondents or Expenditure Types? Charles Q. Strohm, RTI International ([email protected]); Emily Geisen, RTI International ([email protected]); Ashley Richards, RTI International ([email protected]); Brandon Kopp, Bureau of Labor Statistics ([email protected])

The literature on measurement error highlights a variety of factors related to the accuracy of retrospective self-reports. These factors include the salience, frequency, and recency of the topic, as well as characteristics of the respondent such as education and gender. However, less is known about the relative importance of these different factors. For example, is recall error attributable largely to characteristics of the topic or the respondent? Does the salience or the recency of the topic explain more of the variance in recall accuracy?

This study uses data from the Consumer Expenditure Records Study (CERS). The CERS was designed to investigate the accuracy of self-reports for consumer expenditures (e.g., appliances, furniture, and mortgages) when compared to receipts and other records. In the first interview, respondents provided self-reports about the cost of expenditures from the previous three months. In a follow-up interview, respondents provided records for some of the reported expenditures. By comparing self-reports and records, we can describe the extent and size of measurement errors for the expenditures reported by the 115 respondents in the study.

Preliminary results show substantial variation in accuracy by expenditure type. For example, self-reports and records matched for approximately three-quarters of rental expenditures, but only for about half of home furnishing and clothing expenditures. In contrast, no demographic characteristics were consistently associated with accuracy. This paper will use multi-level modeling techniques to describe the associations between accuracy of self-reports and the correlates of measurement error. These correlates include expenditure type, cost, recency of purchase, and respondent characteristics such as education and gender. The multi-level model will quantify how much of the variance in accuracy is attributable to the characteristics of the expenditure versus the respondent. Results from this paper will provide practical guidance about methods to reduce measurement error in expenditure reports.
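
A multi-level setup of the kind described, with expenditures nested within respondents, might be sketched as follows; the file name and variable names are hypothetical, a linear mixed model is used for simplicity, and a logistic specification could be substituted for the binary match indicator.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical expenditure-level records: one row per reported expenditure, with
# 'match' = 1 if the self-report agreed with the record, nested within respondents.
df = pd.read_csv("cers_expenditures.csv")  # hypothetical file

# Random intercept per respondent; fixed effects for expenditure and respondent traits.
model = smf.mixedlm(
    "match ~ C(expenditure_type) + cost + recency_days + education + gender",
    data=df,
    groups=df["respondent_id"],
)
result = model.fit()
print(result.summary())
```

The share of variance attributable to respondents versus expenditures can then be read off the estimated random-intercept variance relative to the residual variance.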

Examination of Recall Error in Reports of H1N1 and Seasonal Flu Vaccination Ipek Bilgen, NORC at the University of Chicago ([email protected]); Kennon R. Copeland, NORC at the University of Chicago ([email protected]); Tammy A. Santibanez, Centers for Disease Control and Prevention ([email protected]); Nicholas Davis, NORC at the University of Chicago ([email protected])

In survey methodology, measurement error due to recall error in respondent self-reports has been studied across many disciplines via different types of questionnaires (Abelson, Loftus, & Greenwald, 1992; Pearson, Ross, & Dawes, 1994). Three prominent hypotheses have emerged to explain recall error that is detrimental to data quality: (1) the distinctive events hypothesis; (2) the time elapse hypothesis; and (3) the response propensity hypothesis. We investigate these hypotheses as they relate to the quality of influenza vaccination reports of occurrence and of the month in which vaccination was received. We analyze the national weekly influenza vaccination coverage estimates from the 2009-2010 National H1N1 Flu Survey and the 2010-2011 NIS National Child Flu Survey, conducted weekly by NORC for the Centers for Disease Control and Prevention. A negative binomial regression model will be used to explore differences in reports of influenza vaccinations and month of vaccination among respondents with various time intervals between influenza vaccination and survey interview date. Three types of independent variables will be explored: (a) vaccination distinctiveness (seasonal vs. H1N1); (b) length of elapsed time from vaccination to date of the interview; and (c) response propensity measures (e.g., early vs. late responders) and respondent demographic variables (e.g., SES, age, race, and education). Preliminary results suggest a decline in vaccination reports as the elapsed time between vaccination and interview dates increases, consistent with the “forgetting curve.” However, weekly reports from the 2009-2010 waves indicate that this curve is less distinct for H1N1 reports (as the latter is a more distinctive and unique event). Further analyses will explore whether the decay in vaccination reports due to elapsed time is moderated by the distinctiveness of the vaccination event, response propensity, and selected respondent characteristics. Potential approaches for reducing recall error in vaccination reports and adjusting for the decay when deriving vaccination rate estimates will be discussed.
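
As a rough illustration of the modeling step, a negative binomial regression of reported vaccination counts on elapsed time, vaccine type, and a response-propensity indicator could be specified as below; the file name and variable names are hypothetical and do not reflect the actual NORC analysis files.

```python
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

# Hypothetical weekly data: reported vaccination counts by weeks elapsed since
# vaccination, vaccine type (seasonal vs. H1N1), and responder type.
weekly = pd.read_csv("weekly_vaccination_reports.csv")  # hypothetical file

model = smf.glm(
    "reported_vaccinations ~ weeks_elapsed * C(vaccine_type) + late_responder",
    data=weekly,
    family=sm.families.NegativeBinomial(),
)
print(model.fit().summary())
```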

A Pilot Study to Validate Health Measures on the Behavior Risk Factor Surveillance System Andrew Caporaso, Westat ([email protected]); Wendy Hicks, Westat ([email protected]); David Cantor, Westat ([email protected]); Sean Hu, Centers for Disease Control and Prevention ([email protected]); Carol Pierannunzi, Centers for Disease Control and Prevention ([email protected]); Lina Balluz, Centers for Disease Control and Prevention ([email protected])

This presentation summarizes the findings from the Behavior Risk Factor Surveillance System (BRFSS) Health Measures Pilot (HMP). The objectives of the HMP were (1) to assess the feasibility of collecting health measures in respondents’ homes as a follow-up to the BRFSS telephone interview, (2) to determine whether physical health measures are useful in assessing measurement error associated with specific BRFSS questions, and (3) to obtain preliminary estimates of error in the BRFSS self-reports.

The HMP was sponsored by the Centers for Disease Control and Prevention (CDC) and was conducted in the states of Washington and Florida. BRFSS respondents were visited by health professionals who collected blood samples and measured height, weight and blood pressure. The self-reported health outcomes from the BRFSS were compared to the physical measures. The health outcomes of interest were high cholesterol, hypertension, diabetes and obesity. Measurement error was gauged by assessing false-negative rates (reporting no disease but measuring as having the disease), false-positive rates (reporting a disease but not measuring as having the disease), and overall agreement as measured by net error and Cohen’s kappa.
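
A minimal sketch of how these agreement measures could be computed from paired self-reports and physical measures is shown below; the two arrays are hypothetical, and the error rates are expressed as shares of all paired cases (other denominators are possible).

```python
import numpy as np

# Hypothetical paired indicators for one condition: 1 = present, 0 = absent.
self_report = np.array([1, 0, 0, 1, 0, 0, 1, 0, 0, 0])
measured    = np.array([1, 1, 0, 1, 0, 1, 0, 0, 0, 0])

false_negative = np.mean((self_report == 0) & (measured == 1))  # under-reporting
false_positive = np.mean((self_report == 1) & (measured == 0))  # over-reporting
net_error = false_positive - false_negative

# Cohen's kappa: observed agreement corrected for agreement expected by chance.
p_obs = np.mean(self_report == measured)
p_chance = (np.mean(self_report) * np.mean(measured)
            + np.mean(1 - self_report) * np.mean(1 - measured))
kappa = (p_obs - p_chance) / (1 - p_chance)

print(false_negative, false_positive, net_error, round(kappa, 2))
```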

Hypertension, diabetes and obesity were found to be underreported in the BRFSS. High cholesterol was overreported. False-negative reporting occurred at a higher rate than false-positives for all four conditions. False-positive reporting was minimized by accounting for medication use. For hypertension, diabetes and obesity the HMP kappa results were comparable to analyses conducted using data from the National Health and Nutrition Examination Survey (NHANES), a national in-person health examination study.

Conclusions from the analysis suggest that collecting physical measures is a useful methodology to assess measurement error in a telephone survey like the BRFSS once factors such as medication use are accounted for. Recommendations are made with respect to modifying question wording as well as the follow-up interviewing procedures.

Inconsistency in Reporting Health Conditions: Is Measurement Error to Blame? Stephen J. Blumberg, National Center for Health Statistics ([email protected]); Rosa M. Avila, National Center for Health Statistics ([email protected])

One problem with follow-up surveys of a pre-screened sample is that the respondents may not answer the screening questions the same way the next time. The Survey of Pathways to Diagnosis and Services was intended to be a survey of parents of children with special health care needs (CSHCN) aged 6 to 17 years who were reported to have been told by a doctor or health care provider that the child had one of three selected development-related health conditions: autism spectrum disorder (ASD), developmental delay, or intellectual disability. These children were identified during the 2009-2010 National Survey of CSHCN (Time 1), and to the extent possible, their parents were recontacted for follow-up interviews (Time 2).

Among parents of children said at Time 1 to have ever been diagnosed with ASD or intellectual disability, nearly one in ten were said at Time 2 to have never been diagnosed with those conditions. Inconsistent responses were also notable for the group of children who had been diagnosed with the condition but, at Time 1, were said to no longer have it. Of these children, one in three ever diagnosed with ASD and three in four ever diagnosed with intellectual disability were said at Time 2 to now have the condition.

We will take a careful look at the extent to which measurement error may have led to these inconsistencies. We will consider indicators of resistance to complete the surveys and other indicators of the likelihood of poor response quality throughout the surveys. We will consider demographic differences that exist between consistent and inconsistent respondents, the relationship of inconsistent responses to elapsed time between surveys, and other possible explanations for why inconsistencies may have occurred.

The Impact of Relationship Quality in Parent-Child Dyads on the Accuracy of Matches in Proxy Reports Mediated By Question Characteristics of Sensitivity, Abstraction, and Mutability Jennifer Benoit-Bryan, University of Illinois, Chicago ([email protected])

The use of proxy reporting, where an alternate reporter with a close relationship to a subject of interest answers survey questions without consulting that individual, can be a useful tool for researchers. Proxy reporting questions are employed when it would be too expensive to survey multiple populations of interest, when a subject is incapable of answering for themselves, or when the subject of interest is not reachable. The accuracy of proxy reports for medical conditions has been studied; however, these studies tend to be specific to single diseases or conditions and are not generalizable to proxy reporting in general. The importance of the quality of the relationship between the proxy reporter and the subject for the quality of proxy reporting has never been investigated.

This research puts forward and tests a theory of the impact of relationship quality in parent-child dyads on the accuracy of matches in proxy reports based on the sensitivity, level of abstraction, and mutability of the match variables. The analysis was conducted using data from the nationally representative Adolescent Health dataset with over 17,000 matched parent-child dyads. The relationship satisfaction rating of the subject of a sensitive question was found to be significantly correlated with the agreement rate of self and proxy responses. When the dimensions of abstraction and mutability are high, relationship satisfaction ratings are statistically significant predictors; when the variable is concrete and unchanging, relationship satisfaction ratings are not predictive of match accuracy.

Targeting Sub-populations Using Address-Based Sampling

Using Qualitative and Quantitative Testing to Improve Response from Hispanic Households Michelle A Cantave, Arbitron, Inc ([email protected]); Robin Gentry, Arbitron, Inc ([email protected])

Arbitron currently uses a mailed screener survey sent to an address-based sample (ABS) to recruit the non-landline portion of our sample frame. Traditionally, Arbitron has had difficulty getting Hispanics, particularly less acculturated Hispanic households, to return the screener. A variety of data and research suggests that Hispanics are less likely to respond to mail questionnaires than other groups. In December 2010, Arbitron conducted a series of focus groups with Spanish-primary Hispanics to determine their concerns about surveys in general and our screener in particular. From these focus groups, we learned that the main concerns are:

• Trusting that a company is legitimate and not a scam
• Feeling that their opinion is important and they make a difference in their community
• Knowing that their information will not be shared or used for non-research purposes

In the summer of 2011, we conducted a split-sample test to assess potential improvements to the ABS screener return rates by sending a pre-alert mailer to potential Hispanic households. The mailer was designed to inform households that 1) Arbitron is a legitimate and “serious” company, 2) their opinion is important and their response makes a difference to their community, 3) their personal information will be kept confidential, and 4) we will be sending them an envelope containing the survey and a small cash gift in a few days.

In order to target the mailer, we identified records in our sample that were located in Census tracts with a Hispanic population density of at least fifty percent.

We will present the return rate results as well as the analysis of the demographics of those that returned the screener to see if there were any differences in the group that received the pre-alert.

Targeting Minority Populations Using Demographic-Appended Address-Based Sampling Kyley McGeeney, Gallup ([email protected]); Manas Chattopadhyay, Gallup ([email protected]); Jenny Marlar, Gallup ([email protected])

The Computerized Delivery Sequence (CDS) file provides an enticing sample frame for conducting nationally representative studies due to its nearly total coverage of U.S. households. With the decline of landline telephone use, address-based sampling (ABS) is a viable alternative for covering the adult U.S. population, but it is not without its challenges. Males, young people, and racial and ethnic minorities appear to have lower response rates in ABS surveys than in surveys with samples drawn from an RDD or dual frame (Boyle et al., 2011). Gallup conducts a monthly study using address-based sampling that is consistent with this finding. The resulting sample typically underrepresents Hispanics, African Americans, and young adults. Gallup conducted a targeted sampling experiment to evaluate the effectiveness of appending the address-based sample with demographic data. The demographic-appended sample frame allows for identification of high-density Hispanic neighborhoods, high-density African American neighborhoods, Hispanic surname households, and households that are either known or predicted to have young adults. Respondents in the Hispanic treatment groups were randomly assigned to receive either a paper questionnaire in English with an option to complete the questionnaire in English or Spanish via the Web, or both English and Spanish paper questionnaires with an option to complete the questionnaire in English or Spanish via the Web. The design resulted in 18 treatment groups. The study evaluates the effectiveness of targeted sampling and discusses the most effective strategies for improving coverage of underrepresented groups.

Does Ethnically Stratified Address-Based Sample Result in Both Ethnic and Class Diversity? Case Studies in Oregon and Houston Robyn Rapoport, SSRS ([email protected]); Susan Sherr, SSRS ([email protected])

Over the past several years, address-based sampling (ABS) has increased in popularity due to its excellent coverage, utility in targeting small geographic areas, and the ability to reach households without a landline telephone. Although there are clearly a number of benefits to the use of ABS, research has consistently shown that the samples of respondents that emerge from ABS studies are often biased in favor of more highly educated respondents. This occurs because respondents to an ABS study whose address cannot be matched to a listed phone number must take the initiative to contact the survey research organization either by mail, web, or call-in number depending on the study design. Even studies that incorporate incentives for unlisted addresses can see this bias.

In 2011, SSRS conducted two health studies using AB samples, one in Harris County, Texas and one in the state of Oregon. In both studies, sample was stratified to target specific ethnic groups. The samples were stratified by census block group, a more precise way to target specific geographic areas with higher densities of ethnic minorities than by phone exchange, and one which is only possible using AB sample. However, it is unclear whether the unlisted ethnic sample will result in a diverse group of ethnic respondents at varying levels of education or, similar to other AB samples, a disproportionately more educated sample of African-Americans, Hispanics and Asians. This paper will use the two case studies of Harris County and Oregon to demonstrate whether targeted ABS stratification of high density minority areas can provide a diverse mix of ethnic respondents or if other methods need to be employed to reach the less educated members of these groups.

Using Ancillary Information to Stratify and Target Young Adults and Hispanics in National ABS Samples J. Michael Dennis, Knowledge Networks, Inc. ([email protected]); Charles A. DiSogra, Knowledge Networks, Inc. ([email protected]); Erlina Hendarwan, Knowledge Networks, Inc. ([email protected])

Knowledge Networks uses the USPS address frame to design recruitment samples for its probability-based online panel. A 2011 objective was to field a more efficient stratification design to recruit young adults (ages 18-24) and Hispanics. Bilingual invitation packets and $2 cash are mailed to sampled addresses. Follow-up mailings are sent to non-responders. Recruited households without Internet access are provided a computer to enable participation. In 2010, eight national mailings (n=22,500) were fielded targeting census blocks with 45% Hispanic population density in one stratum and all remaining blocks in another stratum. Simultaneously, ancillary data from commercial databases attached to each address were tracked as potential predictor variables for future targeting purposes. Hispanic ethnicity and age were key items. Based on the 2010 findings, the sample stratification for 2011 was designed using only ancillary data to make four strata: Hispanic 18-24, all else 18-24, Hispanic 25+, all else 25+. Mailing materials/incentives were unchanged. Eight surveys of approximately 26,250 addresses each were fielded with this design. The allocation of sample across these strata was conservative for this first effort (design effect = 1.52). Compared to the census block stratification design, the overall yields were 1% lower while the number of young adults and Hispanics was nearly identical. However, the ancillary data stratification design proved more efficient within stratum in the recruitment of young adults and Hispanics. For example, in 2010 the Hispanic stratum was 38% Hispanic while in 2011 the Hispanic strata were about 74% Hispanic. This is also reflected in a modest 1.7% and more substantial 3.2% lower cost per recruited young adult and Hispanic, respectively. Based on the 2011 data, it is now possible to set higher over-sampling recruitment goals and, with greater predictive certainty, allocate more sample to the young adult and Hispanic strata to take advantage of these learned efficiencies.
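
The abstract does not say how its design effect was calculated; a common approximation for the design effect due to unequal weighting alone is Kish's 1 + CV²(w), equivalently n·Σw²/(Σw)², sketched below with hypothetical weights.

```python
import numpy as np

def kish_deff(weights):
    """Kish's approximate design effect from unequal weights: n * sum(w^2) / (sum(w))^2."""
    w = np.asarray(weights, dtype=float)
    return len(w) * np.sum(w ** 2) / np.sum(w) ** 2

# Hypothetical weights implied by over-sampling one stratum relative to another.
print(round(kish_deff([1.0] * 800 + [4.0] * 200), 2))  # 1.56
```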

Friday, May 18, 2012 3:15 p.m. - 4:15 p.m. Demonstration Session #2

Leveraging Social Media Monitoring for Market Research Marie-Eve St-Arnaud, Voxco ([email protected]); Alkis Papadopoullos, Voxco ([email protected])

The ability to effectively track conversations and user postings on the web has to be more than a simple exercise in monitoring how many mentions a product, brand, or company is garnering on the web. As a source of what is trending on the web, or as an early warning system for unexpected feedback (positive or negative), social media monitoring can be very effective, but only if there is a means to quantify this qualitative data.

Voxco uses natural language processing, semantic analysis and proprietary machine learning algorithms to reliably identify meaning in online content and, more importantly, to provide analysis of results in context. This helps customers reliably mine data for actionable information and thus serves a wide range of opinion mining needs. Many opinion mining and social media monitoring solutions do a good job of collecting data and monitoring for occurrences of keywords, but then leave the customer with the task of manually figuring out how to analyze the flood of data and attempting to understand the appropriate calls to action. It is important to ensure that users can collect the data, but also to assist them in shaving costly hours off of the more complex data analysis phase.

Furthermore, the ability to show the context of the topics evoked is critical and makes it feasible for customers to draw actionable conclusions. But more importantly, the ability to convert possible web trends into targeted surveys which permit a company to quantify this immense feedback channel along specific market segments or demographic lines is going to be increasingly important. Conversely, the ability to validate relevance of surveys by tracking topics and trends on the web increases the reliability of conclusions drawn from surveys. The interplay between surveys and social media monitoring can thus lead to more effective decisions being made.

Friday, May 18, 2012 3:15 p.m. - 4:15 p.m. Poster Session 2

The Utility of the Integrated Design of the Medical Expenditure Panel Survey to Inform Mortality Related Studies Steven B. Cohen, Agency for Healthcare Research and Quality ([email protected])

The analytic capacity of surveys can be dramatically enhanced through linkage to existing secondary data sources at higher levels of aggregation as well as through direct matches to additional health and socio-economic measures acquired for the same set of sample units from other sources of survey-specific or administrative data. In this paper, the capacity of one specific integrated survey design to enhance longitudinal analyses focused on mortality studies is discussed. Examples are drawn from the Medical Expenditure Panel Survey (MEPS), an ongoing longitudinal panel survey designed to produce estimates of health care utilization, expenditures, sources of payment, and insurance coverage of the U.S. civilian non-institutionalized population. The first set of analyses is conducted to examine the differentials in pre-dispositional factors that distinguish a cohort of decedents from their surviving counterparts. Particular attention is given to the capacity to distinguish the health characteristics and the health care experiences of a cohort of decedents for a time period prior to their deaths. This is followed by a more extensive model-based study to assess the relationship between antecedent health and health care related factors and mortality. The relationship between medical expenditure levels over time and mortality is also examined to illustrate the enhanced set of longitudinal analyses that are possible through this framework. The longitudinal analyses that are highlighted are based on linkages of the MEPS to the National Health Interview Survey and the National Death Index.

Predicting the Success of Brand Launch Using Retail Index Shahzad Muhammad, The Nielsen Company ([email protected]); Ravish Khare, The Nielsen Company ([email protected])

Brand launches and Brand Managers — this relationship evokes mixed feelings. The brand manager is as delighted about the birth of a new baby as he/she is worried about its prospects. Is there a way to ensure success? Can there be some course of action to minimize anxiety? Evidently, such a task calls for the creation of a workable model using cases of brand launch successes.

For a successful model, three requirements need to be fulfilled: the model should identify factors that can be controlled; controlling these factors should produce the desired, measurable results; and the model should be capable of predicting the future. Keeping these under consideration, three sets of measures have been developed.

Measures that a marketer can exercise control on — (a) Relation between Sales and Store Count Distribution to control increase in availability and at the same time increasing sales, (b) Relation between Sales per Store & Category Weighted Distribution to control increase in throughput while adding better quality stores, (c) Ratio of Category Weighted Distribution & Category Numeric Distribution to exercise control for better retailer consolidation or have better Distribution Efficiency.

Measures on which a marketer can place targets — (a) Market Share, (b) Distribution Width, (c) Access of Brand to Total Category Market

To predict the future, we use past year trends — (a) Rate of Market Acquisition, (b) Rate of Store Acquisition, (c) Rate of Market Potential Acquisition.

The model was tested on India data and gave extremely accurate results in 75% of cases, and extremely to fairly accurate results in 98% of cases.

Successful brands achieve their goal within five to six years after launch; unsuccessful ones start declining in less than four years.

The above model works using single-source, easily available data and therefore enables marketers to make quick, informed, and precise decisions.

Uncooperative Respondents in Japan, Korea, and the U.S.: Using the General Social Surveys in Japan, Korea, and the United States Jibum Kim, NORC ([email protected]); Noriko Iwai, Osaka University of Commerce ([email protected]); Tom W. Smith, NORC ([email protected])

Are the characteristics of uncooperative respondents the same across countries? Using the 1973-2010 GSS in the U.S., the 2000-2008 JGSS in Japan, and the 2003-2008 KGSS in Korea, we examined the correlates of uncooperative respondents. Interviewers assessed the respondents’ attitudes toward the interview, which ranged from very cooperative to hostile (not cooperative at all in Korea). By collapsing the two positive and two negative categories, we created a dummy variable (uncooperative respondents=1). The percentage of uncooperative respondents ranged from 6.3% to 11.7% in Japan, from 2.8% to 6.6% in the U.S., and from 8.3% to 16.2% in Korea. Logistic regressions predicting uncooperative respondents show that, in all three countries, males, college graduates, those who do not provide income information, and those who have poor understanding are more likely to be considered uncooperative. There are also some regional differences within each country. For example, in the United States, people living in the Northeast are more likely to be uncooperative respondents compared to those in the West. However, the association between age group and survey cooperation is inconsistent across countries. The youngest age groups are the least uncooperative in the United States, but it is the oldest age groups that are the least uncooperative in Japan. In Korea, there is no age group difference. Our findings are discussed in terms of the increasing usage of paradata.
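
As a rough sketch of the modeling step described, the uncooperative-respondent dummy and the logistic regression could be set up as below; the file name, variable names, and interviewer-rating labels are hypothetical stand-ins for the GSS/JGSS/KGSS codings.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical GSS-style extract with the interviewer's cooperation rating.
gss = pd.read_csv("gss_cooperation.csv")  # hypothetical file

# Collapse the two negative rating categories into an 'uncooperative' dummy.
gss["uncooperative"] = gss["cooperation_rating"].isin(
    ["impatient_restless", "hostile"]
).astype(int)

model = smf.logit(
    "uncooperative ~ male + college_grad + income_refused + poor_understanding"
    " + C(age_group) + C(region)",
    data=gss,
)
print(model.fit().summary())
```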

Comparing Face-to-face and Online Approach: Household Recruitment of Consumer Panel Research in China Teresa (Ye) Jin, Nielsen ([email protected]); Yu-Chieh (Jay) Lin, Institute for Social Research, University of Michigan ([email protected]); Shu Duan, University of Michigan ([email protected]); Jennie Lai, Nielsen ([email protected])

Panel household recruitment can be challenging in emerging markets where households may have a limited understanding of the concept of survey research. With internet penetration rapidly expanding, online recruitment methods can be leveraged for particularly hard-to-reach segments in China. Because China is a collection of diverse regions differing in economic growth, technology development, cultures, dialects, consumer behaviors, and lifestyles, the offline and online recruitment strategies used vary across different tiers of cities. Nielsen has deployed a mixed-mode method of recruiting households online as well as face-to-face for a consumer panel across the country to study consumers’ purchasing behavior. This research paper attempts to systematically evaluate and compare cooperation rates of online versus face-to-face modes across different tiers of cities in China using household/respondent demographics, purchasing behavior, lifestyle variables, geography, etc. Paradata will also be examined to evaluate the efficiency of the face-to-face recruitment method. These key research findings and recommended best practices will be shared in hopes of shedding light on an effective and efficient method of recruiting panel households in mainland China.

The Challenge of Going National: An Experimental Evaluation of the Effects of Local vs. Distant Survey Sponsorship on General Public Internet and Mixed-Mode Response Rates Michelle L. Edwards, Social & Economic Sciences Research Ctr., Washington State University ([email protected]); Don A. Dillman, Social & Economic Sciences Research Ctr., Washington State University ([email protected])

A goal of many surveyors is to convince the general public to respond over the Internet when contact can be made only by mail, as is the case for address-based sampling. Research has shown that the combined use of token cash incentives with an initial withholding of a mail response alternative can increase Internet response rates significantly. Two recent experiments using this design found that approximately 31-41% of all households responded via the web and an additional 14-18% responded to a mail questionnaire offered late in the implementation process (Smyth et al., 2010; Messer and Dillman, 2011). However, these studies were limited to regional and state populations surveyed by a university located in that region or state that was likely to be recognized as a legitimate source of the survey request. More recent research now suggests that when survey designs include geographically distant populations that are less familiar with the survey sponsor, the proportion of respondents willing to respond over the Internet declines. In this paper we report on a new experiment fielded in January 2012, in which we compare the effects of survey sponsorship by a local (in-state) university vs. a distant (out-of-state) university on Internet response rates. Our overarching purpose is to gain a better understanding of this issue as a potential obstacle to building national study designs for web+mail survey implementation systems using address-based samples of the general public.

Adjusting the Response Bias in RES ACV: An Analysis of Propensity Score Matching for Comparing RES and Panel Data Muhammad Usman Sikander, The Nielsen Company ([email protected]); Muhammad Shahzad, The Nielsen Company ([email protected])

The ratio estimation method makes use of an auxiliary variable for reliable estimation. This auxiliary variable should be highly correlated with the study variable; otherwise the resulting estimates will be biased, inefficient and inconsistent. At Nielsen, the Retail Measurement Service (RMS) relies on the ratio estimation method. The auxiliary variable, known as All Commodity Value (ACV), and the study variable, sales volume, are the key inputs for estimation. ACV information is collected during the Retail Establishment Survey (RES) and is subject to many restrictions. The problem is more persistent in developing and under-developed countries, where this information cannot be cross-checked for validity unless the store becomes a panel member and provides access to its detailed retail data. Given the usefulness of the ACV information, it is important to correct it for possible response bias in the retailer’s reporting of ACV. This ACV information is corrected using the RMS data, but only for panel stores, as census stores do not have RMS data. Using a propensity score matching approach, a profile match between the census stores (contacted during the RES) and panel stores (part of RMS) is established. As a result, each census store can be approximated with a panel store, and the RMS information of each panel store can be used for the corresponding profile-matched census store. The analysis undertakes two different approaches, linear regression and Principal Component Analysis (PCA), to generate the weights for the propensity score calculation and compares/evaluates the effectiveness of both methods for correcting the response bias in the reporting of ACV information using the panel stores’ information.
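
A minimal sketch of the nearest-neighbor propensity matching step, assuming store profile variables are available for both census and panel stores, is shown below; the file name, profile variables, and use of a plain logistic model for the propensity score are illustrative assumptions rather than the actual Nielsen procedure.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import NearestNeighbors

# Hypothetical store profiles: 'is_panel' = 1 for panel (RMS) stores, 0 for census-only stores.
stores = pd.read_csv("res_store_profiles.csv")  # hypothetical file
profile_vars = ["store_size_sqm", "staff_count", "urban", "store_type_code"]

# Propensity of being a panel store, estimated from the profile variables.
ps_model = LogisticRegression(max_iter=1000).fit(stores[profile_vars], stores["is_panel"])
stores["pscore"] = ps_model.predict_proba(stores[profile_vars])[:, 1]

# Nearest-neighbor match: each census store gets the panel store with the closest
# propensity score, whose RMS-based ACV can then stand in for the self-reported ACV.
panel = stores[stores["is_panel"] == 1].reset_index(drop=True)
census = stores[stores["is_panel"] == 0].reset_index(drop=True)
nn = NearestNeighbors(n_neighbors=1).fit(panel[["pscore"]])
_, idx = nn.kneighbors(census[["pscore"]])
census["matched_panel_store"] = panel.loc[idx.ravel(), "store_id"].values
```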

Assessing Housing Conditions: The Validity of a Mixed Mode Research Design Amy Donley, Institute for Social and Behavioral Sciences ([email protected]); Rachel Morgan, Institute for Social and Behavioral Sciences ([email protected])

This poster presents the effectiveness of using a mixed-methodology research design to evaluate housing conditions in an urban environment. The first stage of the research consisted of a longitudinal telephone survey of residents. Data were collected in 2005 and again in 2009. In both surveys, a representative sample was acquired and residents were surveyed via telephone on the condition of their homes. Respondents were asked about many aspects of their homes, including the exterior appearance, the condition of the roof, the paint, the windows, and so on.

After the telephone surveys were completed, a windshield survey was undertaken. While not a common research method in the social sciences, windshield surveys have been used in studies that require assessment of neighborhoods or communities. They can be used, for example, as a means of assessing damage after a natural disaster, such as a tornado. They can also be used to determine levels of accessibility to handicapped citizens or pedestrians.

Assessment of one’s own home, however, can be subjective. So to add another layer of validity to the findings, a windshield survey was conducted. Homes that scored high on the “dilapidation scale” from the telephone survey were listed in a sampling frame. A random sample of homes was taken. Then these homes were visually assessed by researchers. The researchers had a set list of characteristics they were to examine and set criteria for rating those characteristics. Comparing the data from the telephone survey to the data from the windshield survey, the researchers found that the results were not significantly different except for comments/responses on the condition of house exteriors: Windshield survey results showed a greater percentage of homes with damaged exteriors than results compiled from the telephone surveys showed.

Language Measurement, Trends, and Media Usage Among Hispanics Marla D. Cralley, Arbitron Inc. ([email protected]); Kate Williams, Arbitron ([email protected])

Because, among Hispanics, language is one of the major drivers of radio listening choices, Arbitron closely monitors language and controls our Hispanic samples on language where universe estimates are available.

Arbitron PPM, a system to passively collect radio and television media use among ongoing panels of respondents, has been implemented in 48 top U.S. metros, replacing the traditional paper radio self-report diaries previously used in these markets. Estimates for weighting are provided to Arbitron by Nielsen Media Research for eleven of Arbitron’s PPM radio metro samples.

At Arbitron, language is collected first during the initial recruitment call, when a single household member aged 18 or older is asked to report personal characteristics for everyone living in the home. Then, a few days later, during the “Welcome to the Panel” call, we attempt to speak to each person in the household and at that time, collect their self-reported response to the same language question, asking about “language spoken at home.”

This paper will explore the differences in self-reported and proxy-reported language among Arbitron panelists by metro, age/gender groups, and household composition. Particular attention will be paid to those panelists indicating that they “speak both English and Spanish equally” in the home, whether their “language dominance” changes when self-reported, and whether they prefer their meter rechargers to communicate with them in their dominant language. We will also present trends in language dominance among panelists over time based on universe estimates compared to Arbitron installed persons by several demographic characteristics, including age and household size.

Lastly, we will show the resulting listening estimates to explore the relationship between language dominance (as identified by the language spoken most often at home) and respondents’ radio listening choices by station format.

Using Longitudinal Multilevel Analysis to Analyze Trends in Surveys Claire Durand, Universite de Montreal ([email protected]); François Yale, ASSSMM ([email protected])

There are topics that have been, and still are, much surveyed: for example, voting intentions, or support for issues like the death penalty. However, pollsters often survey the same topics using different question wordings or methods of survey administration. Given this variability in methods, how is it possible to analyze trends? The presentation will demonstrate the advantages of using longitudinal multilevel analysis, with time at level two, for this purpose. With this type of analysis, polls are seen as nested within time periods. The evolution of opinions over time and the impact of events on this evolution may then be modeled while taking into account differences in survey methodologies. The presentation will use support for Quebec sovereignty to illustrate the method, drawing on almost 700 polls conducted between 1976 and 2008. The results reveal the impact of major political events and of question wording – i.e., whether the question asks about general support or voting intention and whether the constitutional option is labeled as independence, separation, sovereignty or sovereignty-association – on the evolution of support for sovereignty. Only multilevel analysis can provide such a refined analysis. The presentation concludes with a discussion of other topics that may usefully be explored using this methodology.
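
A minimal sketch of this kind of model, assuming a poll-level dataset with hypothetical columns (support, period, wording, mode): a random intercept for each time period captures the common opinion level, so the poll-level methodological covariates are estimated net of the underlying trend.

```python
# Hypothetical sketch: polls (level 1) nested within time periods (level 2).
import statsmodels.formula.api as smf

def fit_trend_model(polls):
    # polls: DataFrame with assumed columns 'support' (% supporting sovereignty),
    # 'period' (time-period identifier), 'wording', and 'mode'.
    model = smf.mixedlm("support ~ C(wording) + C(mode)",
                        data=polls, groups=polls["period"])
    return model.fit()

# result = fit_trend_model(polls); print(result.summary())
```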

Using Dual Sample Surveys to Examine the Relationship between Response Rate and Bias Graham Wright, Cohen Center for Modern Jewish Studies - Brandeis University ([email protected]); Theadora Fisher, Brandeis University ([email protected]); Leonard Saxe, Brandeis University ([email protected])

In 2010 and 2011 the Cohen Center for Modern Jewish Studies at Brandeis University conducted a pair of studies, one of US respondents and a smaller one of Canadian respondents. These studies were constructed with a design that allows for an explicit examination of the relationship between survey response rate and bias. In both cases the population frame was made up of applicants to a program with a registration database that provides both contact information and key demographic variables for all members of the frame. For both studies a small stratified sample (3,000 cases in the US survey, 1,270 in the Canadian) was drawn from the total population and sent an online survey. Members of these small samples were incentivized with a guaranteed $25 Amazon.com gift card for survey completion and were called by phone and encouraged to complete the survey online. These samples achieved response rates of close to 50%. In addition, for each study a larger group (a stratified sample of 64,400 for the US survey, or the entire remaining 11,000 cases in the Canadian frame) was sent an identical online survey, incentivized only with entry into a lottery, and was not subject to follow-up calls. As expected, the response rates for these groups were significantly lower: around 10%. However, the bias between the large and small groups (on survey variables), and between all respondents and the population (on variables present in the registration database that exist for survey nonrespondents), was much smaller than expected given the vast differences in response rates. This paper will use these studies to examine the relationship between response rate and bias, and posit possible explanations for the lower than expected level of bias in surveys with low response rates.

What is Quality of Life and Can Polling Measure It? Thomas Lamatsch, Monmouth University Polling Institute ([email protected]); Patrick Murray, Monmouth University Polling Institute ([email protected]); Tyler Breder, Monmouth University

The Monmouth University Polling Institute spent nearly two years developing a comprehensive survey to measure "Quality of Life." The survey included more than 100 discrete question items and was conducted with a statewide sample of more than 2,800 respondents. The large sample size allowed for testing demographic subgroups that are generally too small with standard survey sample sizes. Multivariate analysis was used to identify a number of sub-concepts within the rubric of Quality of Life. This allowed researchers to group the population into nine distinctive clusters based on Quality of Life perceptions and behaviors; some being held together primarily by demographics, others by attitudes, and some by a mix of both. Additional analysis helped researchers identify a small subset of question items that can be developed into a summary Quality of Life index (similar to popular Consumer Confidence measures). This paper will discuss the process used to develop the survey and some of the more intriguing findings that have implications for future survey research related to quality of life and life satisfaction. One key finding is that some groups with extremely similar demographic profiles can have very different outlooks on quality of life with the variations being far more subtle than typical measures of life satisfaction that focus on the usual suspects, such as income and education.

Evaluating New Technologies for Retention of Rural Youth in Longitudinal Survey Research Eleanor M. Jaffee, The Carsey Institute at the University of New Hampshire ([email protected]); Meghan L. Mills, The Carsey Institute at the University of New Hampshire ([email protected])

The Rural Youth Study is a 10-year longitudinal panel survey on the transition of youth into adulthood in a rural New England County. The study began in 2008 with a representative sample of 7th and 11th graders attending all of the county’s public schools; the response rate was an exceptional 83% (N=657). With the cooperation of district personnel, the first two waves of data collection (in 2008 and 2009) were collected while both cohorts were attending school and therefore easy to reach. The third wave of data collection (2011), however, presented the research team with the challenge of retaining our older cohort participants who had moved on from high school, were geographically dispersed throughout the rural county and beyond, and often had limited internet access. To find and communicate with them we used email, texting, and Facebook, as well as phone and mail. We employed two survey modalities: online “SNAP Surveys” via email and Facebook and paper surveys sent in the mail. We also offered different types of incentives including entry in a raffle to win an iPad or iPod, guaranteed incentives for survey completion, and prepaid incentives. Our final third wave response rate for the older cohort was 59% (N=199).

This paper describes our participant retention activities and reports on our analyses of their comparative effectiveness in reaching participants and targeting non-responders. The emphasis of this paper is on our use of “new” technologies in participant retention and the benefits and limitations of each. We also discuss the intersection of these technologies with conventional wisdom about effective incentivization. There is limited research on the effectiveness of technology for retaining rural youth as study participants. Our paper addresses this gap in the literature and offers practical insights for retaining socially, geographically, and technologically hard-to-reach populations in survey research.

Public Opinion and Uncertain Science: Exploring the Dynamics Behind Real and Perceived Knowledge Gaps in Nanotechnology Leona Yi-Fan Su, University of Wisconsin - Madison ([email protected]); Dominique Brossard, University of Wisconsin - Madison ([email protected]); Ashley A. Anderson, University of Wisconsin - Madison ([email protected]); Dietram A. Scheufele, University of Wisconsin - Madison ([email protected]); Michael Xenos, University of Wisconsin - Madison ([email protected])

Previous research has explored the role of different media sources in widening and narrowing knowledge gaps about science, or disparities in knowledge between members of the public with high socioeconomic status and those with low socioeconomic status (SES). Researchers have used two types of empirical approaches—factual knowledge and perceived knowledge—to measure public understanding of science. However, research has yet to examine knowledge gaps in the context of these two types of knowledge about science, real and perceived. In this study, we examine the role of media sources, as well as interpersonal communication, in widening and narrowing the real and perceived knowledge gaps among different SES groups.

This study relies on a national survey collected by Knowledge Networks from an online panel (N = 2,338, response rate of 54.2%) between July 9 and 23, 2010. The findings show that (1) the factual knowledge and perceived knowledge of nanotechnology are only slightly related and are affected differently by several predictors, with the same set of factors together explaining about 43% of the total variance in perceived knowledge but only 14% in factual knowledge; (2) online media use, especially attention to scientific blogs, widens the factual knowledge gap among different SES groups while news media use reverses the perceived knowledge gap; and (3) frequent interpersonal discussion about scientific issues enlarges only the factual knowledge gap among different SES segments, not the perceived knowledge gap.

E-government 2.0: Overview of Social Media Utilization by South American Federal Executives Leonardo Costa Rodrigues, Universidade de Brasilia ([email protected]); Max Stabile, Universidade de Brasilia ([email protected])

This research measures the amount of information that federal executive branches in South America make available on social media and, consequently, examines how the supply of information can be modified by the introduction of new information and communication technologies.

To reach this objective, the methodological procedure adopted was oriented navigation. To support it, an analysis matrix of around 50 items was created and used to collect and systematize the variables found on the social media of each of the ten countries studied, for a total of over 300 observed units.

We verified that virtual political interaction exists and that the supply of information on public websites and social media is increasing, an indicator of citizen participation. However, improvements can still be made to expand the governmental presence, diversify the tools offered and, above all, ensure better participation strategies.

The Effects of Survey Design Features on Answers to Sensitive Questions Lindsey Witt, Bureau of Social Research - University of Nebraska-Lincoln ([email protected])

The challenges of asking questions with socially desirable answers are well documented (Tourangeau & Yan, 2007). To make answering such questions less threatening, the use of a self-administered mode and softer question language have been cited as means of reducing the effects of social desirability (Bradburn et al., 2004; Tourangeau & Smith, 1996). Though the effects of sponsorship have been studied in terms of nonresponse (Dillman et al., 2009), research on its effects on measurement, including the perceived threat of answering sensitive questions, has been limited.

This study aims to better understand the extent to which the threat of answering sensitive questions needs to be reduced using tested and untested survey design features. Data from the Nebraska Young Adult Alcohol Opinion Survey will be used, which includes questions about perceptions of alcohol use and alcohol behaviors. Using variations of sponsorship, scale ordering, and question wording, respondents will be randomly assigned to one of three groups varying from a low threat level to a high threat level: one group where survey features indicate that the sponsor portrays alcohol use favorably, a more neutral group using some design elements to deter social desirability, and a third group where a respondent could infer negative connotations around alcohol use. The responses will then be compared across the three groups to determine the impact of the varying survey design features on the level of threat perceived when answering sensitive questions. In addition, each group’s average response to a question asking whether the respondent has ever been convicted of driving under the influence of alcohol will be compared to the true state value. The results will also be compared across modes, given the mixed-mode nature of the survey, which utilizes both web and mail.

Predictive Validity of Vague Quantifier and Numeric Responses for Frequency Estimation Tarek Baghal, University of Nebraska ([email protected])

Vague quantifier scales provide options that are, as the name suggests, inherently vague. Therefore, there is often large variation in the numeric meaning assigned to vague quantifiers (e.g., Wallsten et al. 1985). The scales also have relative meaning, such as where on the scale respondents believe they fall in comparison to similar others (Schaeffer 1991). As such, it has been argued that such scales should be avoided when it is possible to use numeric response options (especially open-ended) instead (Beyth-Marom 1982, Schaeffer 1991, Tourangeau et al. 2000). Still, there is a dearth of research on whether vague quantifiers or numeric responses perform better with regard to validity. This study examines the measurement properties of frequency estimation questions through the assessment of predictive validity, which has been shown to be important in examining the measurement properties of competing question formats (Chang and Krosnick 2003). Data from the National Survey of Student Engagement (NSSE) are analyzed, in which two psychometrically tested scales, active and collaborative learning and student-faculty interaction, are measured through both vague quantifier and numeric responses. In particular, the study examines predictive validity through correlations and regression models relating the scales, both vague and numeric, to theoretically related constructs: grades in school and satisfaction with the educational experience. Initial results based on correlations and regression diagnostics suggest that predictive validity is higher for vague quantifier scales than for numeric responses. Thus, vague quantifiers may have better measurement properties. In addition, predictive validity is examined across numeracy, measured by SAT Math scores, as numeracy may be an important variable in the measurement properties of questions dealing with quantitative information. However, initial results suggest there is little difference across numeracy scores in terms of predictive validity for these scales, with vague quantifier scales still showing higher levels of predictive validity.
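
As a small, hypothetical illustration (not the NSSE analysis code), predictive validity can be compared by correlating each response format with a criterion such as grades:

```python
# Hypothetical sketch: compare predictive validity of two response formats.
from scipy.stats import pearsonr

def compare_predictive_validity(vague, numeric, criterion):
    # vague, numeric: scale scores for the same construct; criterion: e.g. grades.
    r_vague, _ = pearsonr(vague, criterion)
    r_numeric, _ = pearsonr(numeric, criterion)
    # The format with the stronger correlation is taken to have higher predictive validity.
    return {"vague_quantifier": r_vague, "numeric": r_numeric}
```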

The Digital Divide in Rural Louisiana: Broadband Access and Adoption Robert Kirby Goidel, Louisiana State University ([email protected])

The uneven distribution of broadband access and adoption rates across local communities has important implications for economic development, educational outcomes, and democratic governance. Often referred to as the digital divide, this uneven distribution of broadband access and adoption is associated with job creation, civic participation, and democratic competence. While much of the research has focused on the impact on racial and ethnic minorities, broadband access and adoption remain important issues in rural communities. In this paper, we present the results from a telephone survey of 1,800 landline and 400 cell phone respondents in eighteen of Louisiana's poorest parishes in the northeast, northwest, and delta regions. The results detail issues related to broadband access and obstacles to adoption, and outline strategies for increasing adoption rates.

Survey Mode Preference among Enrollees in the World Trade Center Health Registry Shengchao Yu, New York City Department of Health and Mental Hygiene ([email protected]); Robert Brackbill, New York City Department of Health and Mental Hygiene ([email protected]); Deborah Walker, New York City Department of Health and Mental Hygiene ([email protected]); Lennon Turner, New York City Department of Health and Mental Hygiene ([email protected]); Mark Farfel, New York City Department of Health and Mental Hygiene ([email protected]); Steven Stellman, New York City Department of Health and Mental Hygiene ([email protected]); Sara Miller, New York City Department of Health and Mental Hygiene ([email protected]); Jiehui Li, New York City Department of Health and Mental Hygiene ([email protected])

The World Trade Center Health Registry (WTCHR), the largest post-disaster exposure registry in the United States, is a longitudinal study that tracks 71,000 enrollees to understand the long-term health effects of the September 11, 2001 terrorist attacks. The baseline interviews were conducted during 2003 and 2004, and a follow-up survey (Wave 2) was conducted during 2006 and 2007. The third wave of the survey is currently underway and data collection will end in March 2012.

In Wave 2, enrollees were offered web and mail modes throughout the survey. Phone interviews, due to their high cost, were offered only as a last resort in the final four months of the 14-month data collection period. The Wave 3 survey employs a similar method to Wave 2.

The objectives of this paper are: 1) to understand what factors are related to enrollees’ choice of survey mode, particularly among Wave 2 phone participants; and 2) to assess the interaction between survey reminders and mode preference.

Our preliminary data show that, among enrollees who completed Wave 2 surveys by phone, 63% chose to complete Wave 3 by phone rather than by web or mail. A logistic regression on mode choice among this group found that: 1) the preference for phone over web is significant among older [Odds Ratio (OR) 0.24, 95% CI 0.07-0.08] and less educated people [OR 0.66, 95% CI 0.47-0.93]; and 2) the preference for phone over mail is significant among male [OR 0.71, 95% CI 0.50-0.95], Hispanic [OR 2.20, 95% CI 1.41-3.52], and low-income groups [OR 2.21, 95% CI 1.10-4.51]. When data collection is complete, we will assess the interaction between survey reminders and mode preference.
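
A hedged sketch of how odds ratios of this kind can be computed (hypothetical column names; not the Registry's actual model):

```python
# Hypothetical sketch: logit of choosing phone over web in Wave 3, reported as odds ratios.
import numpy as np
import statsmodels.formula.api as smf

def phone_vs_web_odds_ratios(df):
    # df: one row per Wave 2 phone respondent, with assumed columns
    # 'chose_phone' (0/1), 'age', 'educ_years', 'male', 'hispanic', 'low_income'.
    fit = smf.logit("chose_phone ~ age + educ_years + male + hispanic + low_income",
                    data=df).fit(disp=False)
    odds_ratios = np.exp(fit.params)   # exponentiated coefficients
    ci = np.exp(fit.conf_int())        # 95% confidence intervals on the OR scale
    ci.columns = ["2.5%", "97.5%"]
    return odds_ratios, ci
```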

Results of this paper would help identify strategies to encourage survey participants to respond in less expensive survey modes without sacrificing response rate.

Using State Variation to Assess the Association Between Social Change and Odds of Re-contact in a Longitudinal Study Julia McQuillan, University of Nebraska-Lincoln ([email protected]); Anna Bellatorre, University of Nebraska-Lincoln ([email protected]); Andrew Bedrous, University of Nebraska-Lincoln ([email protected]); Ashley J. Frear-Cooper, University of Nebraska-Lincoln ([email protected])

Retaining panel participants is vital for successful longitudinal research projects. Re-contacting participants years after the initial interview is particularly difficult for random digit dial (RDD) studies that often have minimal contact information. Studies of panel attrition guide our choice of individual-level predictors of re-contact. We focus on re-contact because the time between our interviews involved dramatic social changes – an economic recession, enforcement of anti-immigrant laws, and the dramatic increase in cell-phone-only homes – and because refusals were low (n=150 of 2,192 contacted and 3,170 in the sample). In prior work using state-level correlations between social change indicators and rate of re-contact, we found that changes in state contexts were associated with rates of re-contact. We now ask: do state-level social change indicators add information to an individual-level model predicting re-contact from baseline characteristics? We use the National Survey of Fertility Barriers (NSFB), a two-wave panel study of women aged 25 to 45 and their partners, first interviewed between 2004 and 2006, with re-contact attempts between 2007 and 2010. As expected, several individual characteristics are associated with re-contact (economic hardship, rural residence, relationship status, age, race/ethnicity, interview language, household size, parental status, and education level). We planned to include indicators of the state-level change in the proportion foreign born, unemployed, and cell-phone-only, but change in foreign born and change in the unemployment rate were too highly correlated (r = .63), so we excluded foreign born as the less reliable measure. The full hierarchical generalized linear model indicates that change in the unemployment rate is associated with lower odds of re-contact (OR = .0002, p < .05); the cell-phone-only effect is also negative but not significant. This study reinforces the importance of gaining extensive baseline contact information and maintaining regular mail contact with panel members between phone interviews to mitigate damage from unpredictable social change.

Where Did We Go Wrong? Using Multiple Regression to Identify Budgeting Errors Julia McQuillan, University of Nebraska-Lincoln, Sociology ([email protected]); Chan Wai Kok, University of Nebraska-Lincoln ([email protected]); Stacia Jorgensen, University of Nebraska-Lincoln ([email protected]); Jacob E. Cheadle, University of Nebraska-Lincoln, Sociology ([email protected]); Amanda Richardson, Bureau of Social Research, University of Nebraska-Lincoln ([email protected]); Nicole R. Bryner, University of Nebraska-Lincoln Bureau of Social Research ([email protected])

Academic Survey Research Organizations (ASROs) need to provide cost estimates to potential clients for project planning and grant applications, but do not always know why budgets do or do not match actual costs. Why do some budgets match costs while others do not? We used historical data from six mail survey projects and compared budgeted and actual costs using mean differences and multiple regression analyses. Two factors known at the beginning of a study – the number of target households and whether or not the project will include incentives – explain most of the variance in costs. Of the many cost elements – management costs, data entry costs, copy costs, mailing costs, the number of units requiring a second or third mailing – the difference between the assumed and actual response rate is the major reason for budgeting error.
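
A minimal sketch of the regression described (hypothetical column names; with only six projects such a model is illustrative rather than definitive):

```python
# Hypothetical sketch: regress budgeting error on features known at proposal time
# plus the response-rate miss highlighted above.
import statsmodels.formula.api as smf

def budget_error_model(projects):
    # projects: DataFrame with assumed columns
    #   budget_error  = actual_cost - budgeted_cost
    #   n_households  = number of target households
    #   has_incentive = 1 if the project included incentives
    #   rr_miss       = assumed_response_rate - actual_response_rate
    return smf.ols("budget_error ~ n_households + has_incentive + rr_miss",
                   data=projects).fit()
```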

Statistical Uses of Administrative Records in the 2010 Census of Group Quarters Population Young Chun, U.S. Census Bureau ([email protected]); Andre Williams, U.S. Census Bureau ([email protected]); Diane Barrett, U.S. Census Bureau ([email protected])

Administrative records are used to build and update the sampling frame, assist imputation of both unit and item nonresponse, and assess census coverage. Major advances in the use of administrative records, some long standing, have occurred in economic and business statistics and health research applications. Administrative records have gained renewed attention in the population census, arguably for controlling accelerating data collection costs, reducing respondent burden, and sustaining data quality. The purpose of this paper is to present possible statistical uses of administrative records, particularly in the U.S. Census of Group Quarters (GQ). GQ enumeration involves collecting population data from correctional facilities, skilled nursing facilities, college residence halls, and military barracks. These facilities house hard-to-access populations who live or stay in a group setting but are usually unrelated to each other. We demonstrate the utility of administrative records in constructing the GQ frame for coverage improvement. We examine the availability and usage of administrative records in GQ enumeration. We analyze the 2010 Census of GQ to investigate the extent to which administrative records were used by major GQ types for valid enumeration as well as frame construction. We discuss limitations of administrative data pertaining to GQ frame construction and enumeration. Findings on the pros and cons of using administrative records are connected to pragmatic implications for making the 2020 Census of GQ cost-effective and high quality.

Would You Like a Receipt With That? Availability of Respondent Records When Collecting Expenditure Information Amy Hendershott, RTI International ([email protected]); Jennifer Edgar, Bureau of Labor Statistics ([email protected]); Christopher Stringer, US Census Bureau ([email protected]); Emily Geisen, RTI International ([email protected])

Measurement error due to over- or underreporting in surveys designed to capture expenditure information may be attributable to respondent recall difficulty. The use of respondent records may reduce measurement error, but only if such records are available for the items asked about. The US Census Bureau and Bureau of Labor Statistics contracted with RTI International to conduct a feasibility study to investigate the extent to which records were available for consumer purchases, factors affecting record availability, and logistical considerations for obtaining expenditure records.

This study uses data from the Consumer Expenditure Records Study (CERS). The CERS consisted of a non-probability sample of 115 participants. Each participant completed two interviews. In the first interview, participants were asked a subset of the standard Consumer Expenditure Quarterly Interview (CEQ) questions. At the end of the first interview, interviewers asked participants to gather any records (e.g., receipts, bank statements, credit card bills) they could obtain for the expenditures asked about and to bring the records to the second interview, which was scheduled for four to seven days later. In the second interview, interviewers recorded which expenditures the participants had records for and then recorded the pertinent information from the record.

Records were available for only 36% of the expenditures reported in Interview 1. Several factors affected the availability of records, including demographic characteristics, the date or frequency of the purchase, and the amount of the expenditure.

In this paper, we describe the factors affecting the availability of expenditure records. We report on participants’ barriers to providing these records such as discarding receipts after making a purchase, other household members making the purchase, and records only being available online, which participants were unwilling or unable to provide. We conclude with a discussion of the feasibility of using respondent records in studies that collect expenditure information.

Expectation: Intention, Social Network, and Central Signal David Rothschild, Yahoo! Research ([email protected]); Zeljka Buturovic, IBOPE Zogby ([email protected])

People’s expectations reveal information about themselves, their social network, and the information they consume. Expectations about political outcomes draw on three sources: the respondent’s own intention, an expectation of the intentions of his or her social network, and a noisy central signal of the expected outcome. In order to further explore the interaction of these three information sources, we ran several repetitions of a poll of likely Republican voters during the 2012 Republican primary that included three new questions. First, we had the respondents distribute the probability that they will vote for any of the given candidates. Second, we had the respondents distribute the probability that they believe any given candidate will win. Third, we had the respondents distribute how they believe their social network will vote among the given candidates. We combine these responses with the responses to the standard intention question and with concurrent poll and prediction market data, which serve as a proxy for the central signal of the expected outcome. From these data we calibrate both the weights of the different inputs and the interactions between them in the formation of individual-level expectations and, consequently, election forecasts. These results enhance three different literatures. First, the findings relate to the forecasting literature by providing more insight into the information possessed by non-expert individuals that they reveal through unique questions addressing both intentions and expectations. We use this information to make more efficient forecasts. Second, these findings add to the literature on social networks and information flow. Third, the designs of our questions address the literature on survey design, demonstrating the enhanced value of certain graphically-based questions.
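
One hedged way to calibrate the weights of the three inputs (hypothetical column names; not necessarily the authors' specification) is a regression of stated win probabilities on the respondent's own intention, the network expectation, and the central signal:

```python
# Hypothetical sketch: calibrate how individual expectations combine three inputs.
import statsmodels.formula.api as smf

def calibrate_expectations(df):
    # df: one row per respondent-candidate pair with assumed columns
    # 'expect_win', 'own_intention', 'network_share', 'central_signal'.
    return smf.ols("expect_win ~ own_intention * central_signal + network_share",
                   data=df).fit()
```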

The Impact of a Reminder Postcard in a Multi-Mode Survey of Rental Units Stephanie Dion, ICF International ([email protected]); Katelyn Muir, ICF International ([email protected]); Randal ZuWallack, ICF International ([email protected]); Leslyn Hall, Redstone Research, LLC ([email protected])

INTRODUCTION: Postcards are frequently used in mail surveys to increase response rates. Postcards are valuable in that they remind respondents to return their completed questionnaire; notify the household that a questionnaire has been sent, in case it was received by a different household member; and can introduce an alternative response option such as a link to a web survey. Reminder postcards are generally followed by a recontact attempt such as a second mailing of the questionnaire packet or a follow-up telephone call. This presentation evaluates the impact of a postcard on a multi-mode survey of rental units in Charleston, SC and Fort Wayne, IN. The survey response options include mail, web, and telephone.

This research study is designed to measure the overall increase in survey response due to the postcard and whether the increase in response varies across recontact modes. The results are based on a two-factor experimental design, with each address assigned to a postcard condition and a recontact condition. The addresses were equally assigned to one of three postcard conditions: no postcard sent, postcard sent with an option to complete the survey on the web, or postcard sent with no mention of the web survey. Independent of the postcard assignment, addresses were assigned to a recontact condition, which included either a second mail packet or a telephone follow-up.
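
A minimal sketch (hypothetical column names; the actual analysis may differ) of tabulating response rates by condition and testing the postcard effect:

```python
# Hypothetical sketch: response rates by postcard and recontact condition.
import pandas as pd
from scipy.stats import chi2_contingency

def postcard_effect(cases: pd.DataFrame):
    # cases: one row per sampled address with assumed columns
    # 'postcard' (none / web / no_web), 'recontact' (mail / phone), 'responded' (0/1).
    rates = cases.pivot_table(index="postcard", columns="recontact",
                              values="responded", aggfunc="mean")
    chi2, p, _, _ = chi2_contingency(pd.crosstab(cases["postcard"], cases["responded"]))
    return rates, chi2, p
```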

Sleep Diary Feasibility and Mode Study: Paper Vs. Electronic Jaki Brown, RTI International ([email protected])

The Healthy Adolescent Development Project (HAD) is a prospective longitudinal study funded by a grant from the National Institute on Drug Abuse (NIDA), which examines neurocognitive susceptibility factors and consequences of marijuana abuse among adolescents in a community with high prevalence rates of early adolescent marijuana use. The 529 children being interviewed in this wave range in age from 12 to 15. Included in their interviews is a sleep habits module, since the literature shows a clear connection between unhealthy sleep habits and illicit drug use.

To explore this connection, we developed a sleep diary feasibility and mode study. By implementing this study we hoped first to learn about the feasibility of having our adolescent sample members complete a detailed 7 day sleep diary. Second, we aimed to compare two different modes for collecting sleep habits data. In addition, we wanted to learn the extent to which risk for drug abuse influenced compliance in completing these surveys.

We worked with a sub-sample (n=136) of our study children at various levels of risk of illicit drug use, as determined in Wave 1. Within their risk groups the children were randomly selected to receive either a diary programmed on a Samsung Galaxy Tab or a paper and pencil (PAPI) diary. Both diaries are identical in content. The children were asked to keep the diary of their sleep habits over the course of 7 consecutive days.

The findings we present from the sleep diary study – on its feasibility, the mode effects we encountered, and data quality and compliance issues – have important implications for other researchers hoping to gather more detailed data from their study subjects.

Intensifying the Request: Results from an Experiment on Improving Internet Response Rates for Address-based Samples of the General Public Benjamin L. Messer, Washington State University ([email protected]); Don A. Dillman, Washington State University Social & Economic Sciences Research Center ([email protected])

Recent research has shown that using a web+mail survey design (i.e., offering web first, followed by a mail alternative) with a token cash incentive can obtain responses from about half the sampled respondents, two-thirds of whom use the web to respond, in statewide and regional surveys of address-based samples of households (Messer & Dillman, 2011; Smyth et al., 2010). The proposed paper builds on this research in two ways to improve web methods for conducting general public surveys. First, we seek to determine whether the web+mail design is as effective in Alabama and Pennsylvania, two U.S. states located far from the survey sponsor state, Washington, that have different population characteristics and response rates to the American Community Survey. Second, we test a new variation of the web+mail design – 2web+mail – aimed at pushing more respondents to the web. Previous research using web+mail informed respondents in advance that a mail follow-up would be sent at a later date, and also included a token cash incentive with both the initial web and the mail follow-up requests. In the 2web+mail design, we withheld the advance notice of a mail follow-up and provided the incentives only with the web requests in an effort to convince more respondents to use the web rather than wait for the mail follow-up. The experiment permits response rate and demographic comparisons between the mail-only, web+mail, and 2web+mail designs in the three states. Results indicate that substantial state disparities exist, particularly in the effectiveness of using the web. In addition, the 2web+mail design may be more effective at increasing the proportion of web respondents compared to web+mail, but at the cost of lower overall response rates and a less demographically representative sample.

East-West Differences in German Household Telephone Ownership: A replication of 'Phone Home?' Volker Hüfken, Heinrich-Heine-University of Duesseldorf, Institute of Social Sciences ([email protected])

This study examines East-West differences in household telephone ownership. The initial conditions were significantly different in the two societies. The focus is specifically on individual and contextual effects. Following the work of T. Smith, the factors used to predict household telephone ownership were socio-economic and demographic characteristics, household composition, cultural background/urbanization, political and group ties, and political attitudes. The analysis is based on the cumulative German General Social Survey (GGSS), which consists of independent cross-sectional surveys of the adult household population of Germany: ten surveys conducted between 1982 and 2006 in West Germany and five between 1991 and 2006 in East Germany. The analysis indicates that socio-economic factors, especially income, have the largest impact. In East Germany, the SES factors are stronger than in West Germany. Differences in telephone ownership were also observed by marital status, age, and urbanization. Most of the other factors also have significant, independent effects on telephone ownership. More than 20 years after reunification, the differences are marginal.

Tracking Residential Mobility at the Household Level Kate Bachtell, NORC at the University of Chicago ([email protected]); Ned English, NORC at the University of Chicago ([email protected]); Catherine C. Haggerty, NORC at the University of Chicago ([email protected])

Considerable literature supports the desirability of studying individuals in the context of their immediate social unit, the household. Yet focused studies of household composition reveal that households in economically disadvantaged populations with low homeownership rates are particularly likely to experience additions, subtractions, and substitutions among members. This paper examines the methodological challenges associated with defining, tracking, and explaining residential mobility at the household level. We describe a new approach that was employed for the Making Connections Survey, a cross-sectional and longitudinal study of ten low-income urban communities. Our method involved linking all individuals within the household at different points in time using a combination of data queries, probabilistic matching software, and human review. The process produced personal identifiers that could be integrated with the household-level data to identify changes beyond numerical shifts in household size. We use the combined data to examine residential mobility across a gradient scale of stability in household composition. Our work advances past studies in three ways. First, we demonstrate a more comprehensive understanding of residential change by mapping various types of change in household composition – gaining, losing, or replacing individuals, or being repopulated entirely with new occupants – in combination with physical relocation during a ten-year period. A series of maps compare the geographic patterns of residential movement among households that experience specific types of changes versus those that remain a stable unit. Second, we use regression modeling to examine the influence of household instability at a finer level on the propensity to move. Finally, we highlight the social and economic characteristics of households that would be cast as stable without the addition of the linked individual identifiers.
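
The study itself combined data queries, probabilistic matching software, and human review; purely as a hypothetical illustration of score-based linkage, a pair of person records from two time points can be scored on field agreement and flagged for review above a threshold:

```python
# Hypothetical sketch: agreement-weight scoring for linking person records over time.
def link_score(person_t1: dict, person_t2: dict) -> float:
    weights = {"last_name": 4.0, "first_name": 3.0, "birth_year": 5.0, "sex": 1.0}
    score = 0.0
    for field, w in weights.items():
        a, b = person_t1.get(field), person_t2.get(field)
        if a is None or b is None:
            continue                               # missing data contributes nothing
        score += w if str(a).lower() == str(b).lower() else -w / 2
    return score

# Candidate links above a threshold would go to human review, e.g.:
# candidates = [(p1, p2) for p1 in wave1 for p2 in wave2 if link_score(p1, p2) > 7]
```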

Sports Fanship: Assessing Behaviors, Their Meaning and Impact Don Levy, Siena Research Institute ([email protected])

For years Gallup has asked respondents a single self-description question, “In general, would you describe yourself as a sports fan, or not?” Over time, approximately 60 percent of Americans, 70 percent of men and 50 percent of women, have consistently described themselves as sports fans. This research, conducted online and with both landlines and cell phones over a five-year period, uses the degree to which respondents engage in eight different fan activities to construct the Sports Fanship Index. Rather than relying on a self-description that may be subject to social desirability, we argue that an assessment of triangulated behaviors can provide a more accurate and nuanced understanding of the rate of sports fanship among Americans. We measure the frequency with which respondents watch sports or sports news, listen to sporting events or talk shows on the radio, discuss sports with friends, family or colleagues, read sports magazines or books, use the internet to track sports, participate in sports fantasy leagues, and frequent sports bars. Using these responses we compute a raw fanship score for each respondent and submit those scores to cluster analysis. We then code all fans into one of four groups – Avid, Involved, Casual or Non-fan. We find that rather than only 60 percent of Americans being fans, based on their behaviors, closer to 85 percent of Americans walk the walk of sports fanship.
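
A minimal sketch of the index-and-cluster step (hypothetical 0–7 days-per-week coding; the authors' exact scoring and clustering settings are not given in the abstract):

```python
# Hypothetical sketch: raw fanship score plus k-means clustering into four groups.
import numpy as np
from sklearn.cluster import KMeans

def fanship_groups(behaviors: np.ndarray, k: int = 4, seed: int = 0):
    # behaviors: respondents x eight activity-frequency columns (assumed coding).
    raw_score = behaviors.sum(axis=1)
    # Cluster the raw scores; clusters are later labeled Avid / Involved /
    # Casual / Non-fan by their mean score.
    labels = KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(
        raw_score.reshape(-1, 1))
    return raw_score, labels
```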

We comment on variation by age and gender as well as describe the differences between the four levels of fanship. Finally, we comment on the implications for our social fabric, citing several examples including attitudes towards current sports topics like Mixed Martial Arts, of the seeming centrality of sport consumption among a large majority of Americans.

Putting the “Social” in Exploring the Social Media Frontier: Collaborating to Investigate Social Media’s Past, Present and Future Jennifer C. Romano Bergstrom, Fors Marsh Group ([email protected]); Caitlin Krulikowski, Fors Marsh Group ([email protected]); Megan Fischer, Fors Marsh Group ([email protected]); Sarah Evans, Fors Marsh Group ([email protected]); Sean Marsh, Fors Marsh Group ([email protected]); Shawn Bergman, Appalachian State University; Fors Marsh Group ([email protected])

Researchers often use a single methodology to gather information and make assumptions. While this routine is largely successful, there may be undiscovered holes in the end product. Here, we used a “collaborative approach” to gather public opinion data regarding social media and acceptable methods for Military outreach through social media.

First, we assembled a diverse group with far-reaching expertise areas: an academic team with psychology backgrounds, a Military team with subject expertise, and a consultant team with survey methodology, communication, user experience and marketing expertise.

Second, we used an interdisciplinary approach to ensure we covered the entire field: the academic team scoured the academic literature for studies pertaining to social media from social science, marketing and advertising perspectives; the Military team examined the Military literature and interviewed personnel for information about Military social media practices, constraints, and best practices they have found through trial and error; the consultant team inspected popular press articles, reports, books, magazines and journals, and they attended conferences to learn about the latest practical social media trends and research.

Third, we collectively aggregated the vast background research, created focus group and one-on-one interview protocols designed to probe qualitatively into the holes identified in the research and created a quantitative survey to extend existing models and further delineate how the Military can best use social media for outreach. The consultants moderated the focus groups and interviews, conducted cognitive pre-testing of survey questions, and conducted the survey; the academics and consultants analyzed the data; the whole team created a social media framework and “best practices” for the Military to apply to its social media use in recruitment efforts.

This paper focuses on the “collaborative approach” used to conduct this research in a short timeframe. We compare this approach to alternative approaches and demonstrate that the “collaborative approach” yields more complete results.

Social Seniors: Determining the Viability of Reaching American Seniors through Social Media Melissa Wentzel, American Institutes for Research ([email protected])

According to the Pew Research Center, 42% of Americans 65 years of age and older are active online (“Who’s online: Demographics of internet users,” May 2011). This paper examines the effectiveness of the internet as a tool for reaching communities of older Americans, aged 65 to 70, and investigates whether some communities are more easily reached online than others.

Project Talent, a large-scale longitudinal study started in 1960, launched an extensive outreach effort to engage participants after up to 50 years of noncontact. The outreach strategy relies heavily on reaching participating classes through their 50th high school reunions in order to reacquaint them with the study and update current contact information. 957 high schools participated in the study in 1960. In 2011 Project Talent had a presence at the 50th reunions of 427 schools.

This paper examines how these reunions are located (through social media sites such as Facebook and Classmates, websites established specifically for a reunion, or by sending press releases to local media outlets) and determines whether some communities (based on the indicators of geographic region, school size, school type - private, parochial, or public - and college admission rates) are more likely to be reached online or through more traditional means.

In addition, this paper examines the characteristics of the participants who registered themselves with Project Talent through the study’s website, www.projecttalent.org. Using the indicators of geographic region, community size, school type, and college admission rates, this paper will highlight any trends among those participants who register online. The findings of this paper will enable researchers to better understand the social media habits of an older demographic and will guide future locating and engagement activities.

Conducting Research on Native American Reservations: Challenges and Solutions from the Field Robynne A. Locke, ICF International ([email protected])

Social science research can play an instrumental role in the creation of effective social policy. However, sensitive populations that could potentially benefit from this research are often the ones who, due to a history of conflict, are reluctant to participate in research. Native Americans are one example of a population that has been made suspicious of outside research by historical exploitation and conflict (Marshall and Batten, 2003). The objective of this paper is to address this conflict through the following research questions: (1) What are the areas of conflict and resistance between Native American tribes and outside researchers? (2) How do these affect participation in public opinion surveys and other important research? And (3) how can researchers and tribes work together to overcome these obstacles? Through interviews and monitoring of a survey conducted on five Native American reservations across the U.S., this paper addresses the above questions and explores this tension between the researcher and the researched from an anthropological perspective.

An Examination of the Effect of 3rd Person versus 1st Person Item Wording Valerie Waller, Data Recognition Corporation (DRC) ([email protected]); Jack Fentress, Data Recognition Corporation ([email protected]); Colleen Rasinowich, Data Recognition Corporation (DRC) ([email protected])

For evaluating program impact, a pre/post design is often used, with impact measured by the absolute change in scores across relevant metrics. Likert scales using agreement or descriptive scaling are appropriate tools, but impact measures can be diluted or obscured when there is a high level of agreement at the pre administration. This is the issue of “limited upside” from pre to post results. One solution is to reword the item to make it more demanding in terms of agreement. The insertion of adjectives like “absolutely” or “extremely” is often effective, but this study considered the effect of using the 1st person rather than a more abstract 3rd person phrasing. For example, the item “It is important to set goals for yourself” can be reworded as “I need to set goals for myself.” A pre/post survey design was used for program evaluations in six states. Results showed a high level of pre-program agreement on the third-person worded items, with pre agreement (“strongly agree” or “agree”) ranging from 84% to 95%. These items were reworded in the 1st person and the revised survey instrument is being tested in four additional state programs. Results will be available in early December 2011. The effect of increased personalization of survey items and item differentiation has significant relevance for evaluative research, particularly pre/post designs.

A Multi-Mode Approach for Assessing Key Health Indicators in Resource-Limited Settings via Household and Health Facility Surveys Catherine M Wetmore, Institute for Health Metrics and Evaluation, University of Washington ([email protected]); Emily Carnahan, Institute for Health Metrics and Evaluation, University of Washington ([email protected]); K. Ellicott Colson, Institute for Health Metrics and Evaluation, University of Washington ([email protected]); Ali Mokdad, Institute for Health Metrics and Evaluation, University of Washington ([email protected]); Gulnoza Usmanova, Institute for Health Metrics and Evaluation, University of Washington ([email protected]); Dharani Ranganathan, Institute for Health Metrics and Evaluation, University of Washington ([email protected]); Sebastian Martinez, Inter-American Development Bank ([email protected]); Paola Zúñiga Brenes, Inter-American Development Bank ([email protected]); Emma Margarita Iriarte, Inter-American Development Bank ([email protected]); Ana Pérez Expósito, Inter-American Development Bank ([email protected]); Jennifer Nelson, Inter-American Development Bank ([email protected]); Pablo Ibarrarán, Inter-American Development Bank ([email protected]); Brent Anderson, Institute for Health Metrics and Evaluation, University of Washington ([email protected]); Tasha B. Murphy, Institute for Health Metrics and Evaluation, University of Washington ([email protected]); Bernardo Hernández Prado, Institute for Health Metrics and Evaluation, University of Washington ([email protected]); Rafael Lozano, Institute for Health Metrics and Evaluation, University of Washington ([email protected]); Ali H Mokdad, Institute for Health Metrics and Evaluation, University of Washington ([email protected])

Results-based funding strategies for health financing in developing countries make donor support contingent upon demonstrated improvements in key indicators. The Mesoamerican Health Initiative (SM2015), established by the Inter-American Development Bank with funding from the Bill & Melinda Gates Foundation, the Carlos Slim Foundation, and the Government of Spain, aims to reduce inequities in the coverage of basic health services among the poorest populations in Mesoamerica by 2015. To ensure the integrity of this initiative, independent measurement and evaluation of progress will be overseen by the Institute for Health Metrics and Evaluation at the University of Washington.

In 2011, we launched a series of rapid assessment surveys of households and health facilities to measure baseline coverage of key interventions and indicators pertaining to reproductive, maternal, and child health. Simple random samples of households with age-eligible women and children were selected with representative probability from target municipalities to provide estimates among the lowest wealth quintile of the population. Complete household censuses were performed to inform the selection of households when current census figures were not available. Standardized household surveys were implemented by pen-and-paper when computer-assisted (CAPI) surveys were not possible due to security concerns. All health facilities in the selected segments completed CAPI surveys. Real-time quality checks were performed, and households and facilities were revisited as needed.

Baseline data collection in El Salvador is complete, with surveys in seven other Mesoamerican countries underway. We provide recommendations for efficient sampling and data collection modes in extremely resource-limited settings. We summarize the prevalence of indicators in the key areas of the initiative, highlighting synergies between household and facility data. For example, low rates of condom use (2.4%) among women “in need” of contraception suggest limited demand or deficiencies in distribution, rather than lack of availability, since condoms were observed in nearly all health facilities (90.7%).

Collecting Biomedical Specimens in Health Research Kevin Ulrich, Group Health Research Institute ([email protected])

Survey research organizations are increasingly being tasked with obtaining consent and, in some cases, collecting biological specimens from potential respondents. In order to obtain high response rates in these types of studies, it is crucial that survey methodologists investigate best practices to improve response in the collection of these types of data. Under what circumstances are participants willing to consent to these procedures? What methods can be utilized to best facilitate the collection of these types of data?

This paper reviews outcomes from multiple studies conducted by the Survey Research Program at Group Health Research Institute, which required the collection of various biological specimens. Recruitment procedures, methodologies for obtaining consent, response rates, characteristics of participants and non-participants, and recommendations for best practices are discussed.

An Experiment Among U.S. Hispanics Regarding Contextual Identity with Survey Research Design Implications Martin Cerda, Encuesta, Inc. ([email protected])

In this era of unprecedented technological progress, it is evident that the context in which an individual experiences the delivery of communications is more crucial than ever. This has important implications for those interested in reaching people and influencing their thoughts and actions, and correspondingly for how public opinion is gathered and measured as part of the survey research process.

U.S. Hispanics, and in particular bilingual and bicultural Latinos, represent an interesting case in this regard, as many negotiate their ethnic identity continuously during the course of an ordinary day in response to the multidimensional context that surrounds them. This can create unique challenges for survey researchers if, for example, improper question flow or inappropriate racial and ethnic identity constructs are used as part of the screening process.

To explore this issue, an experiment was designed to build on the less structured self-identification format developed by the author (An Exploration of Racial and Ethnic Identity Constructs among U.S. Hispanics with Implications for Survey Design and Analysis; AAPOR Conference 2011, Martin G. Cerda, Ilgin Basar and Jessica Jamanca; Encuesta, Inc.) in order to assess the possible impact of context from the screening process on the manner in which Latinos define their race and ethnicity or respond to questions related to Latino themes or important issues.

Specifically, a series of test and control studies will be conducted using online surveys to explore the likelihood of introducing bias and to demonstrate how respondents with similar origins, heritage, acculturation levels, or language proficiency can be influenced in how they describe their ethnic identity or answer relevant attitudinal and behavioral questions.

The findings will suggest possible solutions for reducing or eliminating the undesired influence of contextual identity when conducting survey research among U.S. Hispanics.

To Lead or Not to Lead, That is the Question: Is the Job of a Legislative Leader to Lead the Caucus Where It Should Go, or Rather, to Follow It? Debbie Borie-Holtz, Rutgers University ([email protected])

The ideological division among the current House Republicans has been reported in the popular press as limiting the Speaker's ability to "cut a deal" on deficit reduction and tax reform. This characterization has caused some to question whether today's leaders have to change strategies and lead differently than in the past. But has anything really changed? In other words, is the job of a legislative leader to lead the caucus where it should go, or rather, to follow it? A national panel survey of top state legislative leaders, from 1997 to the present, provides a unique opportunity to explore this question from the perspective of the men and women who have held and still hold these jobs. State legislative leaders were operationalized as the Speaker and Senate President, or the Senate President Pro Tem in those states where the lieutenant governor serves as the Senate President. Among the questions asked of these public officials were characterizations of their interactions with their caucuses, their own personal styles of legislating, and who influenced which policies they prioritized during the legislative session. In addition to reporting these frequencies, this paper analyzes whether differences in the job of top leaders can be observed over time, by geographic region, party affiliation, divided or unified government, or the professionalism of the legislature. Differences in the personal attributes of the leaders, such as their professional backgrounds and both their legislative and leadership tenures, are also controlled for. Finally, the relationship between leaders and their caucuses on key tactics, ranging from running campaigns to fundraising, is examined.

Communication Inequality and Fatalistic Beliefs about Cancer Prevention: The Role of Numeracy in Explaining the Socio-economic Disparities in Response to Cancer Information Seeking Sungjong Roh, Cornell University ([email protected])

Informed by the Communication Inequality and Health Disparity Model, this study tested a series of equations predicting fatalistic beliefs (i.e., pessimism, helplessness, and confusion) about cancer prevention. Utilizing the 2007 Health Information National Trends Survey, I examined two possible moderators, socioeconomic status (i.e., years of education) and numeracy (both subjective and objective measures), of the relationship between cancer information seeking and fatalistic beliefs. Although cancer information seeking is normatively expected to reduce fatalism about cancer prevention, the empirical analysis showed that such information seeking was more likely to reduce fatalistic beliefs (particularly pessimism and confusion) among people with greater education and higher levels of numeracy than among those with less education and lower levels of numeracy. The findings build on the extant literature about communication inequality and health disparity by extending the disparities between higher- and lower-SES populations from the existing cancer knowledge gap to a newly proposed gap in cancer fatalism. These factors may contribute to widening communication inequality and health disparity in cancer prevention. An important caveat in interpreting these results is that they should not be used to blame the individual. Instead, considering previous work reporting SES disparities in fatalism, the finding that the disparity holds even after cancer information seeking is taken into account raises questions about the quality of current cancer information. In particular, this study revealed that numeracy is the factor explaining SES differences in response to cancer information seeking. Given that cancer communication now relies more heavily on numeric and statistical information than in the past, policymakers and health professionals should deliberate on how cancer information should be constructed.
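
As an illustration of the moderated-regression logic described above, the following sketch fits an interaction model on simulated data; the variable names, coding, and data are illustrative assumptions, not the HINTS 2007 variables or the author's actual specification.

```python
# Illustrative sketch (not the author's code): testing whether numeracy and
# education moderate the association between cancer information seeking and
# fatalistic beliefs, using simulated data in place of HINTS 2007.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 2000
df = pd.DataFrame({
    "seeking":   rng.integers(0, 2, n),    # 1 = has sought cancer information (assumed coding)
    "education": rng.integers(8, 21, n),   # years of education (assumed coding)
    "numeracy":  rng.normal(0, 1, n),      # standardized numeracy score (assumed coding)
})
# Simulated outcome: information seeking reduces fatalism more at higher
# numeracy and education, the pattern the abstract reports.
df["fatalism"] = (3.0
                  - 0.20 * df["seeking"]
                  - 0.05 * df["seeking"] * df["numeracy"]
                  - 0.01 * df["seeking"] * (df["education"] - 12)
                  + rng.normal(0, 1, n))

# The interaction terms carry the moderation hypothesis: negative coefficients
# on seeking:numeracy and seeking:education mean the benefit of seeking is
# concentrated among the more numerate and more educated.
model = smf.ols("fatalism ~ seeking * numeracy + seeking * education", data=df).fit()
print(model.summary().tables[1])
```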

How Do Respondent Behaviors Reflect and Influence Perceptions of Surveys? Allyson L. Holbrook, Survey Research Laboratory, University of Illinois at Chicago ([email protected]); Timothy P. Johnson, Survey Research Laboratory, University of Illinois at Chicago ([email protected]); Young Ik Cho, School of Public Health, University of Wisconsin Milwaukee ([email protected]); Noel Chavez, University of Illinois at Chicago ([email protected]); Saul Weiner, University of Illinois at Chicago ([email protected]); Sharon Shavitt, University of Illinois ([email protected])

It is not uncommon for surveys to include questions that measure respondents' perceptions of the survey (e.g., how important they view the survey to be and how much effort they put into answering its questions) and/or interviewers' perceptions of respondents (e.g., perceptions of respondent effort, interest, and suspicion). Researchers have also used behavior coding to evaluate survey questions, whereby interviewer and respondent behaviors during each question are coded and quantified. This paper examines associations between behavior coding of respondent behaviors across survey items and respondent and interviewer perceptions of the survey. The data came from a CAPI survey of 405 respondents, stratified by respondent race and ethnicity. Respondents were recruited via telephone screening and asked to come into the lab to be interviewed for a 90-minute survey. Approximately equal numbers of non-Hispanic White, African-American, Mexican-American, and Korean-American respondents were recruited. Half of Korean-American and Mexican-American respondents were interviewed in Korean and Spanish, respectively. The rest were interviewed in English. Respondents' perceptions of the value of the survey, effort put into answering the survey questions, and perceived difficulty of the survey questions were measured at the end of the CAPI interview. Interviewers were also asked to rate respondents' intelligence, interest, suspicion, and sincerity. Interviews were audio- and video-recorded, and respondents' verbal and nonverbal behaviors were coded using both existing and newly developed coding systems. Multivariate analyses are used to assess correspondence between respondent behaviors and respondent and interviewer ratings. For example, do the behaviors of respondents who report that they value the survey differ from those who report they do not? Do the behaviors of respondents whom interviewers rate as highly interested in the survey differ from those who are rated as less interested? Furthermore, do these patterns vary across racial and ethnic groups?

Predictions of the Effects of Individual Media Messages on the Time Trend of Public Opinion about the Toyota Brand David Fan, University of Minnesota-Twin Cities ([email protected]); David Geddes, Institute for Public Relations ([email protected]); Felix Flory, evolve24, a Maritz Research company ([email protected]); Carrie Lu, evolve24, A Maritz Research Company

Toyota built a world-class corporate brand reputation for excellence. However, national attention began to focus on Toyota's quality problems following widespread publicity about accelerator and braking problems in late 2009 and early 2010.

To study the impact of the media on Toyota's reputation, survey data on the brand were obtained daily from YouGov's BrandIndex surveys. Toyota's reputation stayed strong throughout 2009 and then showed a precipitous drop in the first half of 2010. Subsequently, there was a gradual recovery that did not reach the earlier high levels of 2009.

Studies were performed using the ideodynamic model to predict the reputation time trend from media content scored by computer for the numbers of documents positive and negative for Toyota's image. The media came from blogs, Internet forums, and the print and online versions of major newspapers. The R-squared values for the predicted time trends ranged from 0.75 to 0.84, indicating high accuracy, along with the unexpected finding of strong intermedia agenda setting: blogs, forums, and the news all contained approximately the same content.

The research will evaluate the ability of model parameters established for data from January 2009 to March 2011 to predict for another year, up to the 2012 AAPOR conference. The study will further explore the impacts of different types of messages on the prediction. As discussed above, one of the key messages related to Toyota’s safety problems. Therefore, positive and negative media content of all types will be classified as related or unrelated to safety. Then predictions will be made including and excluding safety related messages. The impact of safety messages will be quantified by the loss of accuracy upon their exclusion. This same strategy will be applied to other important messages to assess their impacts as well.
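
For readers unfamiliar with the ideodynamic approach, a minimal sketch of the underlying idea follows; the single persuasibility constant, the toy message counts, and the simplified one-parameter form are assumptions for illustration and do not reproduce the authors' fitted model.

```python
# Minimal ideodynamic-style sketch: favorable messages convert the unfavorable
# share of the public, unfavorable messages convert the favorable share.
import numpy as np

def predict_reputation(pos_msgs, neg_msgs, p0=0.8, k=0.01):
    """Predict the daily share holding a favorable view of the brand.

    pos_msgs, neg_msgs: daily counts of favorable/unfavorable documents.
    p0: starting favorability; k: assumed persuasibility constant per message.
    """
    p = p0
    trend = []
    for pos, neg in zip(pos_msgs, neg_msgs):
        p += k * pos * (1 - p) - k * neg * p
        p = min(max(p, 0.0), 1.0)
        trend.append(p)
    return np.array(trend)

# Toy media series: a burst of negative coverage followed by slow recovery,
# loosely mimicking the 2009-2010 pattern described above.
days = 200
pos = np.full(days, 5.0)
neg = np.where((np.arange(days) > 50) & (np.arange(days) < 90), 40.0, 3.0)
print(predict_reputation(pos, neg)[[0, 60, 100, 199]])
```

In a study such as this one, the predicted trend would then be compared with the observed BrandIndex series, with R-squared summarizing the fit.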

Open-ended Questions in Web Surveys: One Large vs. Ten Small Boxes Florian Keusch, WU Vienna University of Economics and Business, Austria ([email protected])

Although it is known that answering open-ended questions requires more cognitive effort from the respondent than choosing an option in a closed-ended question, and therefore raises response burden (Bradburn 1978), the open-ended question format is regularly used in quantitative surveys to elicit spontaneously and freely formulated answers. The advantage of web surveys here is the higher data quality compared to traditional survey methods (Barrias et al. 2010; Deutsken et al. 2006; Kwak & Radler 2002). In postal surveys the size of an answer box seems to influence respondents' perceptions of the length of the answer that should be provided, with larger boxes producing longer responses (Christian & Dillman 2005; Fuchs 2009). In web surveys this effect has been demonstrated for less motivated respondents (Smyth et al. 2009). A new study looks at the difference between providing the respondent with one large answer box or ten small answer boxes when asking for unaided brand awareness.

Definitions Matter: Selective Processing of Mediated Messages in Online Environments Jiyoun Kim, University of Wisconsin - Madison ([email protected]); Dominique Brossard, University of Wisconsin - Madison ([email protected]); Mike Xenos, University of Wisconsin - Madison ([email protected]); Dietram Scheufele, University of Wisconsin - Madison ([email protected])

According to selective processing, individuals selectively take in information relevant to their preexisting ideas and beliefs at the expense of the rest of the information in a mediated message (Graf & Aday, 2008). This study uses nanotechnology as the context of enquiry to explore how definitions of a complex topic, as well as knowledge about it, might influence selective processing. The way emerging technologies are portrayed can also directly shape how individuals form attitudes and engage with these complex issues. More specifically, definitions of nanotechnology that highlight its potential risks and benefits could increase individuals' need to obtain more information (Kim et al., 2010). We therefore posit that people's previous knowledge about nanotechnology influences their message processing, but this relationship could be moderated by the way this emerging technology is presented. Thus, our study also examines how nanotechnology-related definitions moderate the role of knowledge in how individuals selectively process messages.

We conducted an online experiment that provided a nationally representative sample with three different definitions of nanotechnology (technical; risk/benefits framed; applications). Data were collected by Knowledge Networks from an online panel (N = 873) between 9 July and 23 July 2010. We show that a risk-benefits framed definition of nanotechnology made subjects perceive less bias in a neutral blog post than the other definitions. As hypothesized, we found a significant interaction between nano definition and nano knowledge. For those not knowledgeable, exposure to the technical definition made subjects judge the blog content as more favorable to nanotechnology than exposure to the other definitions. There was no significant difference in the effect of definition for people who were highly informed. Exposure to a technical definition of nano also encouraged people to perceive the blog as providing arguments against nanotechnology unless they were in the highly informed group.

The Effects of Race/Ethnicity, Cultural Values, and Language of Interview on Socially Desirable Responding Sharon Shavitt, University of Illinois ([email protected]); Timothy P. Johnson, Survey Research Laboratory ([email protected]); Allyson Holbrook, Survey Research Laboratory ([email protected]); Young Cho, School of Public Health ([email protected]); Saul J. Weiner, Department of Medicine ([email protected]); Noel Chavez, Survey Research Laboratory ([email protected])

A respondent's race/ethnicity is known to predict socially desirable responding (SDR) designed to present oneself in a favorable way. In particular, studies show that U.S. minority populations are more likely to engage in impression management, suggesting that collectivist values and backgrounds broadly predispose one to greater SDR. However, Lalwani, Shavitt, and Johnson (2006 JPSP), examining differences in national groups (Singapore vs. U.S.), ethnic groups (Asian Americans vs. European Americans), and cultural values, showed that both individualism and collectivism link to desirable responding, but in distinct ways. Collectivism was positively associated with impression management (IM)—the tendency to misrepresent one's actions to appear normatively appropriate. Individualism was positively associated with self-deceptive enhancement (SDE)—the tendency to hold and express inflated views of oneself. In this paper, we expand upon these findings by examining IM and SDE among a broader set of racial/ethnic groups and by exploring what factors moderate and mediate racial/ethnic differences in IM and SDE. Specifically, are African-Americans, Mexican Americans, and Korean Americans likely to engage in greater IM and less SDE than are Anglo Whites? We examine whether such differences are mediated by cultural values—including measures of individualism and collectivism, conformity, power, and self-direction. We also examine whether language of interview and acculturation moderate these effects. Analysis is based on a sample of 1008 respondents, stratified by respondent race and ethnicity, who were recruited via telephone screening and came into the lab to be interviewed. Approximately equal numbers of Anglo White, African-American, Mexican-American, and Korean-American respondents were recruited. Half of Korean-American and Mexican-American respondents were interviewed in Korean and Spanish, respectively. The rest were interviewed in English. This large, stratified dataset allows us to examine a number of factors, individually and in interaction, that predict SDR in specific minority populations.

The River Flows: Comparison of Experimental Effect Replicability with Different Sample Sources Randall K. Thomas, ICF International ([email protected]); John Bremer, Toluna USA Inc ([email protected])

There are many different sample sources used today in market and opinion research. Based on the number of studies being fielded and dollar volume, two of the most common sample sources in web-based research are river samples and non-probability panels. A non-probability panel recruits respondents in a number of different ways to assemble a group of people who will complete a number of surveys over time. While studies employing non-probability panels have an extensive history of use going back to the 1950s, river samples became more prominent in the past 10 years for web-based studies. A river sample recruits respondents on an ad hoc basis using online pop-up ads and other online recruitment techniques. Once river respondents respond to an invitation, they are typically directed to an initial survey that screens them with demographic and other topic-relevant questions and then passes them to ongoing surveys based on the quota needs of a number of simultaneous live surveys. This study focused on whether experimental effects are similar in both direction and size in both types of samples. Respondents were randomly assigned to either a yes-no grid, a multiple response format ("select all"), or a household combination grid to determine the purchase of a series of 16 products. We found similar experimental effects for both sample sources: the combination grid had the highest incidence, and the multiple response format was associated with the lowest incidence. Though the relative order of the results was similar, the river sample showed generally higher incidence for all product types across experimental conditions; however, the differences became smaller after controlling for demographic differences and self-rated consumer style (a series of measures to determine consumer orientation). We examine and propose additional analyses that could lead to reductions in differences in sample source outcomes.

Model Based Targeted Address Canvassing: A Simulation Based on the 2009 Address Canvassing Program John L. Boies, U.S. Census Bureau ([email protected]); Kevin M. Shaw, U.S. Census Bureau ([email protected])

In 2009, the Census Bureau performed a nationwide Address Canvassing (AC) program in preparation for the 2010 Decennial Census. We use data from this program to conduct a "What If" simulation of a model-based "targeted" AC program in which census blocks are selected for canvassing based on their predicted probabilities of containing deviations from the Master Address File (MAF). Covariates measuring block characteristics of two kinds—physical structure (e.g., housing unit count) and social structure (e.g., demographics)—were used to predict 11 different canvassing outcomes. The results indicate that both physical and social structures are important predictors of stability/instability that can be used to prioritize listings by Field Representatives (FRs). Among the interesting results is that the number of housing units in a block is negatively related to residential change once other physical structure variables (e.g., multi-unit structure composition) and social structure variables (e.g., age and sex composition) are introduced into the models. We address both gross and net coverage under these simulations. The research indicates that models to predict which blocks should be targeted for canvassing can be developed, and that this approach could result in substantial savings of time and money in preparation for the 2020 Decennial Census.
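
The targeting logic can be illustrated with a small sketch: fit a model of block-level MAF deviation on physical and social covariates, then canvass only the highest-probability blocks. The covariates, simulated data, and single binary outcome below are assumptions; the actual study modeled 11 distinct canvassing outcomes.

```python
# Hedged sketch of model-based block targeting on simulated data.
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n_blocks = 5000
blocks = pd.DataFrame({
    "housing_units":  rng.poisson(40, n_blocks),     # physical structure covariates
    "pct_multi_unit": rng.uniform(0, 1, n_blocks),
    "median_age":     rng.normal(38, 8, n_blocks),   # social structure covariates
    "pct_renters":    rng.uniform(0, 1, n_blocks),
})
# Simulated outcome: 1 if the block's MAF listing deviated during canvassing.
logit = (-1 + 2 * blocks["pct_multi_unit"] + 1.5 * blocks["pct_renters"]
         - 0.01 * blocks["housing_units"])
blocks["deviation"] = rng.binomial(1, 1 / (1 + np.exp(-logit)))

X = blocks[["housing_units", "pct_multi_unit", "median_age", "pct_renters"]]
model = LogisticRegression(max_iter=1000).fit(X, blocks["deviation"])

# Targeting rule: canvass only the blocks with the highest predicted probability
# of deviation, leaving apparently stable blocks unvisited to save field costs.
blocks["p_deviation"] = model.predict_proba(X)[:, 1]
targeted = blocks.nlargest(int(0.3 * n_blocks), "p_deviation")
print(f"Canvassing {len(targeted)} of {n_blocks} blocks captures "
      f"{targeted['deviation'].sum() / blocks['deviation'].sum():.0%} of the deviations.")
```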

Getting News from Non-news Outlets: How Incidental News Exposure on the Internet Promotes Young Adults' Political Participation JungHwan Yang, University of Wisconsin-Madison ([email protected])

Although young people increasingly use the Internet as a major source of news, some still deliberately avoid online news outlets due to enhanced selectivity. However, the prevalence of news information on non-news outlets such as social network sites, microblogs, and online communities may provide opportunities even for disengaged people to encounter news information incidentally. At the same time, news information from non-news Web sites often provides rich contextual information that helps users understand complex current affairs. We argue that incidental exposure to news information through non-news outlets on the Web influences the political participation of young adults. Using 2011 online survey data from 409 young adults in South Korea, we test whether non-news sites actually provide news information, whether incidental exposure to news information on non-news sites is associated with online and offline political participation, and for whom the incidental exposure is particularly beneficial. The results indicate that (1) the use of non-news outlets was significantly associated with incidental exposure to news information, (2) incidental exposure to news information through non-news outlets acted as a strong predictor of both online and offline political participation, and (3) the effect of incidental exposure on political participation was stronger among heavy users of traditional news outlets (e.g., TV, newspapers, radio). The findings underscore the necessity of measuring a wider range of online behaviors when studying the effect of online news information on political behavior. Further implications are discussed.

Rapid Roll-Out of Household Surveys to Assess Monitoring and Impact Evaluation Indicators Pertaining to Reproductive Health, Child Nutrition, and Immunization in Mesoamerica Catherine M. Wetmore, Institute for Health Metrics and Evaluation ([email protected]); Emily Carnahan, Institute for Health Metrics and Evaluation ([email protected]); K. Ellicott Colson, Institute for Health Metrics and Evaluation ([email protected]); Ali Mokdad, Institute for Health Metrics and Evaluation ([email protected]); Gulnoza Usmanova, Institute for Health Metrics and Evaluation ([email protected]); Dharani Ranganathan, Institute for Health Metrics and Evaluation ([email protected]); Sebastian Martínez, Inter-American Development Bank ([email protected]); Paola Zúñiga Brenes, Inter-American Development Bank ([email protected]); Ana Pérez Expósito, Inter-American Development Bank ([email protected]); Maria Fernanda Merino, Inter-American Development Bank ([email protected]); Luis Tejerina, Inter-American Development Bank ([email protected]); Emma Margarita Iriarte, Inter-American Development Bank ([email protected]); Brent Anderson, Institute for Health Metrics and Evaluation ([email protected]); Tasha B. Murphy, Institute for Health Metrics and Evaluation ([email protected]); Bernardo Hernández Prado, Institute for Health Metrics and Evaluation ([email protected]); Rafael Lozano, Institute for Health Metrics and Evaluation ([email protected]); Ali H. Mokdad, Institute for Health Metrics and Evaluation ([email protected])

In spring 2011, the Institute for Health Metrics and Evaluation at the University of Washington launched a series of rapid assessment surveys among households in the poorest regions of eight Mesoamerican countries to provide accurate information on baseline coverage of key health interventions and indicators pertaining to reproductive health, child nutrition, and immunization. In each country, simple random samples of approximately 3,800 households with age-eligible occupants were selected with representative probability to provide expected samples of 3,400 women aged 15-49 years and 3,200 children aged 0-59 months. A rapid pre-survey census was carried out within segments which had been randomly selected with probability proportional to size. The census allowed us to efficiently target our final sample selection to households with age-eligible occupants. Standardized multi-lingual household surveys were designed to be implemented using netbooks or by pen-and-paper.

In the case of El Salvador, the first country to complete baseline data collection, 18,427 households were visited for the census, and 3,625 (92.1%) of the 3,935 selected households were surveyed within approximately three months. Real-time data collection resulted in improved data quality, facilitated constructive feedback with the field team, and allowed us to produce timely reports. Two months following the completion of data collection, final datasets had been cleaned and final reports written. National census data from 2007 were used to monitor interviewer performance and assess the quality of the pre-survey census exercise. For example, when fewer than 60% of expected households were captured on the pre-survey census, field staff were instructed to return to segments and attempt to capture missing households. This occurred in 9 of 139 segments. Despite a preference for computer-assisted interviews (CAPI), all household surveys in El Salvador were completed by pen-and-paper and later input by a data entry team due to security concerns (which proved largely unfounded).

Methuselah and the Internet Survey: How the Aging Population is Eluding Research in the Technological Age Amy Flowers, Analytic Insight ([email protected]); Andrea Libby, Analytic Insight ([email protected])

Telephone research reaches the older population in such great numbers that they are often weighted down in analysis and taken for granted.

As samples are supplemented with cell phone numbers and internet surveys become more accepted, older adults are separated into distinct camps. The most educated and affluent move with their younger counterparts into the cell phone oversamples and internet panels. The oldest old (those over 75) and less affluent sectors, however, are less likely to have either cell phones or internet access, and are increasingly likely to have telephone interruptions or no telephone service at all. Televisions, microwave ovens, and answering machines are the technologies most commonly used by the oldest old. As survey researchers turn to new technologies, these commonly used technologies are not likely to provide the increased access promised by the new methodologies.

This session explores the impact of the loss of this population segment on political, opinion and health-related survey research. We explore the potential of mail surveys and other methodologies to reach this population. We present census and BRFSS data showing the unique characteristics of the population. We also explore the effect of the end of the age bubble, and what might happen as the boomers age and the oldest old become technologically savvy.

The oldest old and many elderly require the help of caregivers. The line between guardian and gatekeeper is important, as the caregiver is often on the lookout for marketing scams that target the elderly.

This presentation will discuss the size of the population now and in the future, the impact the oldest old are likely to have on opinion, health and political survey research and the most promising methodologies for accurate inclusion of this important population segment.

Incorporating Computer-Assisted and Real-time Data Collection Methods into a Field Survey in a Developing Country Setting: Lessons Learned from a Longitudinal Maternal-Child Health Survey in Eight Mesoamerican Countries Ali Mokdad, Institute for Health Metrics and Evaluation ([email protected]); Catherine Wetmore, Institute for Health Metrics and Evaluation ([email protected]); Dharani Ranganathan, Institute for Health Metrics and Evaluation ([email protected]); Emily Carnahan, Institute for Health Metrics and Evaluation ([email protected]); Gulnoza Usmanova, Institute for Health Metrics and Evaluation ([email protected]); Sebastian Martínez, Inter-American Development Bank ([email protected]); Paola Zúñiga Brenes, Inter-American Development Bank ([email protected]); Emma Margarita Iriarte, Inter-American Development Bank ([email protected]); Ana Pérez Expósito, Inter-American Development Bank ([email protected]); María Fernanda Merino, Inter-American Development Bank ([email protected]); Luis Tejerina, Inter-American Development Bank ([email protected]); Brent Anderson, Institute for Health Metrics and Evaluation ([email protected]); Tasha Murphy, Institute for Health Metrics and Evaluation ([email protected]); Rafael Lozano, Institute for Health Metrics and Evaluation ([email protected]); Ali H. Mokdad, Institute for Health Metrics and Evaluation ([email protected])

In 2011, the Institute for Health Metrics and Evaluation (IHME) at the University of Washington launched several household and health facility surveys to establish baseline measures as part of the Salud Mesoamérica 2015 Initiative (SM2015). SM2015 is funded by the Bill & Melinda Gates Foundation, the Carlos Slim Health Institute, and the Government of Spain, and aims to improve maternal and child health for the poorest population quintile in Central America and southern Mexico. We used computer-assisted personal interviewing (CAPI) instruments on netbooks and created a secure internet link to monitor survey administration in real time. In this paper, we shed light on the advantages and challenges of incorporating these tools into field surveys in developing countries.

Computer-assisted methods of data collection are being widely applied in modern survey research. Our CAPI software reduced interview time compared to the paper-based survey and reduced data collection errors through the use of value range restrictions, logical question sequences, and automated skip patterns. More importantly, the conditional logic we programmed allowed us to better tailor questions to health facilities of different scopes.

The secure internet link with IHME expedited communications with the survey teams and enabled us to monitor individual interviewer performance and point out oddities in collected data. Feedback from the interviewers was readily incorporated into revised versions of the survey, allowing us to pilot and modify our instruments in a timely manner. Moreover, this functionality improved data collection efficiency and quality while data collection was occurring simultaneously at different sites and in different countries. With advances in telecommunication, real-time data collection is becoming a vital addition to surveys, particularly in developing-country contexts where multiple instruments are being implemented across multiple sites.

Friday, May 18, 2012 4:15 p.m. - 5:45 p.m. Concurrent Session F

Addressing the Challenges of Address-Based Sampling Designs

The Public According to Marketers: Imputing National Demographics From Marketing Data Linked to Address-Based Samples Josh Pasek, University of Michigan ([email protected]); S. Mo Jang, University of Michigan ([email protected]); Curtiss Cobb, Knowledge Networks ([email protected]); Charles DiSogra, Knowledge Networks ([email protected]); J. Michael Dennis, Knowledge Networks ([email protected])

Address-based sampling (ABS) methods offer a variety of advantages over both telephone and area-probability methods. The U.S. Postal Service Computerized Delivery Sequence File used for ABS covers approximately 97% of U.S. households, and data collected via ABS can also be merged with commercial marketing databases that can provide information on non-respondents as well as respondents in the sampling frame. Presumably, the combination of this extensive coverage and the use of these ancillary data should provide an additional strategy for correcting for survey non-response, namely by matching respondents with non-respondents in the sampling frame based on ancillary data and correcting for any observed deviations. Using a dataset compiled by Knowledge Networks, the current study assesses the potential efficacy of such corrections by imputing demographic data for all individuals in the sampling frame based on the ancillary data. The results of a series of imputations are then systematically compared with Current Population Survey estimates to assess whether matching strategies of this sort could address biases due to unit non-response. We illustrate the circumstances under which the imputations produced results that were more or less representative and discuss the implications for the use of such matching strategies with both probability and non-probability sampling designs.
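
A hedged sketch of the imputation-and-benchmarking idea follows; the ancillary fields, the imputation model, and the benchmark shares are invented for illustration and are not the Knowledge Networks data or actual CPS figures.

```python
# Illustrative sketch: impute a demographic for every frame record from
# ancillary marketing variables, then compare the imputed distribution with
# an external benchmark.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(2)
n = 10000
frame = pd.DataFrame({
    "home_value_bucket":   rng.integers(1, 6, n),    # ancillary marketing data,
    "hh_size_estimate":    rng.integers(1, 7, n),    # assumed available for all
    "length_of_residence": rng.integers(0, 30, n),   # addresses on the frame
    "responded":           rng.binomial(1, 0.2, n).astype(bool),
})
# Age group is observed only for survey respondents (simulated here from the
# covariates so the example runs end to end).
true_age = (frame["length_of_residence"] // 10).clip(0, 2)
frame.loc[frame["responded"], "age_group"] = true_age[frame["responded"]]

# Fit on respondents, then impute the age group of every record in the frame,
# respondent or not, using only the ancillary variables.
covars = ["home_value_bucket", "hh_size_estimate", "length_of_residence"]
resp = frame[frame["responded"]]
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(resp[covars], resp["age_group"])
frame["age_imputed"] = clf.predict(frame[covars]).astype(int)

# Compare the imputed frame distribution with an external benchmark
# (hypothetical CPS-style shares, not actual CPS estimates).
benchmark = pd.Series({0: 0.35, 1: 0.40, 2: 0.25}, name="benchmark")
imputed_dist = frame["age_imputed"].value_counts(normalize=True).sort_index()
print(pd.DataFrame({"imputed": imputed_dist, "benchmark": benchmark}))
```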

Handling Records With Missing Information in an Address Based Sample with Appended Demographic Characteristics Dan Estersohn, Arbitron Inc ([email protected]); Kelly Dixon, Arbitron Inc ([email protected]); Mike Kwanisai, Arbitron Inc ([email protected]); Al Tupek, Arbitron Inc ([email protected]); Linda Piekarski, Survey Sampling International ([email protected]); Missy Mosher, Survey Sampling International ([email protected]); Jessica Smith, Survey Sampling International ([email protected])

Arbitron and Survey Sampling International (SSI) have collaborated on a study of the accuracy and coverage of demographic data appended to an Address Frame sample. The addresses of Arbitron respondents with known demographics were appended by SSI with information from Census as well as proprietary databases. The appended data were compared to the respondent-supplied information. The principal findings regarding accuracy and coverage are being presented elsewhere at this conference (Dixon et al.).

This paper will report on two additional evaluations of possible ways to identify low-response demographic groups. The first evaluation will look at respondent addresses that did not have appended information. Do households at these addresses share common characteristics? If so, can that information be used as if the records had been appended with those characteristics? The second will look at Census data such as tract-level demographics or tract-level Census mail-out/mail-back response rates. Can Census data be used to add information for these unappended households? Can those data help to identify members of low-response groups?

Methods to Deal with Non-Working “Matched” Phone Numbers in an Address Based Sample Survey Anna Fleeman, Abt SRBI Inc. ([email protected]); Tiffany Henderson, Abt SRBI Inc. ([email protected])

The use of address-based sampling (ABS) in survey research is increasing as the coverage of the landline phone frame has markedly declined over the last few years. To maximize contact and efficiency, the address sample is typically compared to commercially available directories so that phone numbers can be appended. Approximately 50% of U.S. addresses can be matched to a phone number. For most research organizations, these records are treated exactly as those from a phone frame. However, when the matched phone numbers from an ABS are dispositioned as non-working, they cannot be removed from the response rate calculations as ineligible because the sampling unit is the address (not the phone number). As part of a project conducted in late 2011, Abt SRBI drew more than 50,000 addresses from an ABS frame and phone numbers were matched where possible. The matched sample was dialed, with more than 6,000 phone numbers dispositioned as non-working. As a result, these ABS records were treated as if they had originally not been matched to a phone number. Therefore, we sent a survey invitation letter and a short questionnaire to each address, mirroring the methodology for non-matched ABS sample. Presented findings will include survey return rates, overall response rates, and demographic characteristics by sample type. Moreover, the non-deliverable mail (e.g., vacant) will be analyzed to discern whether non-working phone dispositions can be used as a proxy for address-based ineligibles in response rate calculations, which would reduce costs and fielding time. Results give insight into conducting ABS research as well as into the sample performance and response rate calculations of ABS surveys.
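
To make the denominator issue concrete, the arithmetic below follows the general AAPOR logic of dividing completes by eligible plus estimated-eligible cases; all counts and the assumed eligibility rate are invented for illustration and are not figures from this study.

```python
# Back-of-the-envelope sketch of how treating non-working matched numbers as
# ineligible (or not) changes a response rate.
def response_rate(completes, known_eligible_nonresp, unknown_eligibility, e):
    """e = estimated share of unknown-eligibility cases that are truly eligible."""
    return completes / (completes + known_eligible_nonresp + e * unknown_eligibility)

completes = 4000
eligible_nonresp = 30000
nonworking_matched = 6000   # matched numbers dispositioned as non-working

# If the address (the true sampling unit) may still be occupied, the 6,000
# records stay in the denominator with some assumed eligibility rate e...
print(round(response_rate(completes, eligible_nonresp, nonworking_matched, e=0.8), 3))
# ...whereas treating a non-working phone as proof of an ineligible address
# drops them entirely, inflating the apparent response rate.
print(round(response_rate(completes, eligible_nonresp, nonworking_matched, e=0.0), 3))
```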

Removal of Address Types to Improve the Effectiveness of Address Based Sampling Frame Lawnzetta Tracie Yancey, The Nielsen Company ([email protected]); Lukasz Chmura, The Nielsen Company ([email protected]); Scott Bell, The Nielsen Company ([email protected])

In an effort to increase the efficiency of address based sampling, Nielsen investigated removing vacant addresses and PO Boxes from the address based sampling frame. The inclusion of vacant and PO Box addresses increases the coverage of potential households. However, the recruitment of vacant addresses often yields minimal benefits and the inclusion of all PO Boxes creates duplicate listings of households with both a city-style address and a PO Box address.

Households at addresses designated as vacant but found to be occupied are more likely to be younger and more ethnically diverse; however, the incidence of making contact at these addresses for a survey is minimal. Consequently, a decision was made to remove these addresses from our TV diary sample.

As a consequence of including all PO Boxes, the size of the ABS frame is often larger than the household population. In addition, a large percentage of the PO Boxes cannot be matched to a phone number which makes recruitment for surveys ineffective and inefficient.

One option to reduce duplication and improve sample effectiveness is to remove PO Boxes from the frame. Our investigation has shown that by removing PO Boxes, except those where a PO Box is the only means of mail delivery, these goals can be achieved.

This paper will review the data used to make the decision to exclude these two types of addresses, and it will show whether we achieved the expected benefits within a recent TV diary survey.

Implementing the AAPOR Transparency Initiative

Implementing the AAPOR Transparency Initiative Paul J. Lavrakas, AAPOR ([email protected]); Scott Keeter, Pew Research Center; David Cantor, Westat; Leah M. Christian, Pew Research Center; Marjorie Connelly; Liz Hamel, The Henry J. Kaiser Family Foundation; Melissa J. Herrmann, Social Science Research Solutions; Timothy P. Johnson, University of Illinois at Chicago; Courtney Kennedy, Abt SRBI, Inc.; Peter Miller, U.S. Census Bureau; Joe Murphy, RTI International; Rich Morin, Pew Research Center; Chuck Shuttles, Knowledge Networks, Inc.

Improving Self-Administered Questionnaire Design

Understanding the Relationship between Literacy and Data Quality in Self- Administered Surveys Jolene Smyth, University of Nebraska-Lincoln ([email protected]); Kristen Olson, University of Nebraska-Lincoln ([email protected]); Rebecca Powell, University of Nebraska-Lincoln ([email protected]); Amanda Libman, University of Nebraska-Lincoln ([email protected])

Low literacy is commonly thought to contribute to measurement error in self-administered surveys, but there is very little empirical evidence about how or why literacy affects data quality. In this paper, we bring together two unique data sources to examine both how literacy is related to data quality and why responses differ across literacy groups. Although not identical, the questionnaires used for these two surveys have many design elements and questions in common. In both studies, we measure literacy separately from education using either the Self-Assessed Literacy Index (Olson et al. 2011) or the Wide Range Achievement Test. First, in the 2009 Quality of Life in a Changing Nebraska survey (n=565, AAPOR RR2=46%), we examine how literacy affects such outcomes as item nonresponse rates, execution of skip instructions, and responses to open-ended number and text boxes. We also examine how experimental variations in questionnaire design (i.e., design of skip instructions, size of box, labeling of grids) affect the quality of responses across literacy groups. Preliminary results indicate that those with low literacy provide shorter, less informative answers to open-ended questions, controlling for education. Second, we conducted an eye-tracking laboratory experiment in Spring 2011 (n=72). Using these data, we examine how respondents process the visual information they receive in a questionnaire and how such processing varies by literacy and by experimental questionnaire design manipulations. Preliminary results indicate significant differences in the average length of time spent reading each question and the variability in time across literacy levels. Results shed light on the link between literacy and data quality, and on what parts of the response process lead to differential data quality across literacy groups. Results will also provide insight into questionnaire design features that may improve data quality among low literacy populations.

Questionnaire Instructions and Respondent Behavior: A Cross-Survey Comparison Brett E. McBride, Westat ([email protected]); David Cantor, Westat ([email protected])

The way respondents process written instructions affects how they answer the questions presented to them. Skip instructions help respondents navigate through the questionnaire appropriately, and clarifying definitions help them interpret questions as survey authors intended. One way to design instructions that minimize error is to compare their performance across surveys.

This paper focuses on the impact of changes in instructions used in two administrations of the Health Information National Trends Survey (HINTS). HINTS, sponsored by the National Cancer Institute, is a cross-sectional, nationally representative survey about health and cancer communication. The previous administration of the survey (HINTS 3) was carried out in 2007 using a paper-based questionnaire, and the latest administration (HINTS 4) recently completed a Pilot test and will complete the main survey by January 2012.

Across surveys, general changes were made in how questionnaire instructions were designed. This paper will examine the effects of these changes on missing data and response distributions. For example, in HINTS 4 definitions were presented closer to the questions that they clarified, and skip arrow instructions were modified to limit omission errors. In HINTS 3, skip arrows were placed to the right and left of the response category; in HINTS 4, arrows were added to the left of the response category that instructed respondents to continue to the next question, and the skip arrow remained to the right of the response category that instructed respondents to skip. The results of the Pilot test for HINTS 4 suggest the modified skip instructions slightly reduced omission error on items following the instructions, though they may have increased unnecessary responses to following items. The paper will expand on this result and report other comparisons related to changes in questionnaire instructions across the HINTS 3, HINTS 4 Pilot, and the main HINTS 4 survey.
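
The two error rates referred to throughout this session can be computed directly from filter and follow-up responses; the toy records and single skip rule below are illustrative assumptions, not HINTS data.

```python
# Hedged sketch of omission and commission error rates for one skip instruction.
import pandas as pd

# Each row: answer to the filter question and whether the follow-up was answered.
# Skip rule: respondents answering "No" to the filter should skip the follow-up.
data = pd.DataFrame({
    "filter_answer":     ["Yes", "Yes", "No", "No", "Yes", "No"],
    "followup_answered": [True,  False, False, True, True,  False],
})

should_answer = data["filter_answer"] == "Yes"

# Omission error: eligible for the follow-up but left it blank.
omission_rate = (~data.loc[should_answer, "followup_answered"]).mean()
# Commission error: should have skipped the follow-up but answered it anyway.
commission_rate = data.loc[~should_answer, "followup_answered"].mean()

print(f"omission: {omission_rate:.0%}, commission: {commission_rate:.0%}")
```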

An Examination of Visual Design Effects in a Self-Administered Mail Survey Sarah Hastedt, National Center for Education Statistics ([email protected]); Douglas Williams, Westat ([email protected]); Catherine Billington, Westat

Item nonresponse in mail surveys is generally related to the complexity of the task the respondent is asked to complete. One source of complexity is the use of branching or skip instructions. While helpful for broadening the reach of the survey across different populations, these instructions increase the mental burden on respondents. Another source of complexity can come from design features that do not follow respondent expectations, for example the order of response options. This paper extends previous research using data from the 2009 National Household Education Survey (NHES) Pilot Test, which found variation in omission and commission error rates according to the visual design of skip instructions in the self-administered mail questionnaire. We look at experiments implemented in the design of skip instructions and the order of response categories using data from the 2011 NHES Field Test. The 2011 Field Test used a split panel questionnaire experiment which allows for comparisons of item-level nonresponse and response distributions across forms. In our first analysis, we examined two components of the instruction: first, whether increasing the emphasis of the skip instruction affected compliance with the skip; and second, whether this emphasis affected item nonresponse in the filter question. In our second analysis, we examine the effectiveness of skip pattern design changes that eliminated the most problematic type of skip instruction used in the 2009 Pilot Test, a large highlighted box containing a skip instruction. In our final analysis we look at simple dichotomous responses to see what effect reversing the response order has on response distributions. Specifically, we examine whether switching the order of yes and no responses violates respondent expectations about what should come first and whether respondents rely more on their expectations than on the survey when answering these questions.

Following the Yellow Brick Road: Two Experiments in Formatting Skip Instructions Carol Cosenza, Center for Survey Research/UMass Boston ([email protected]); Patricia Gallagher, Center for Survey Research/UMass Boston ([email protected]); Stephanie Lloyd, Center for Survey Research/UMass Boston ([email protected])

The Consumer Assessment of Healthcare Providers and Systems (CAHPS®) instruments are usually self-administered and contain a number of skip patterns, where respondents are instructed to skip over questions based on their answer to screening questions. When respondents make errors at screening questions, valuable data can be lost. As part of a field test conducted in a university-based health system, two smaller methodological experiments with alternative formatting of skip instructions were undertaken. The survey was funded by the Agency for Healthcare Research and Quality (AHRQ) and uses the CAHPS Clinician & Group Patient Centered Medical Home adult questionnaire.

Experiment 1: Vertical vs. Horizontal Response Options

CAHPS guidelines recommend the vertical presentation of response categories. In an effort to save space, survey users sometimes format response options horizontally. The research goal of this test is to learn about differences in skip compliance when response options are presented in a vertical versus horizontal format. The sample consists of 4,100 cases randomly assigned to one of the two test conditions.

Experiment 2: Placement of Check Box

When coding other surveys, we noticed that sometimes respondents correctly follow skip instructions, but fail to check any response box in the screening question. To test whether placement of the check boxes makes a difference in skip compliance, a random half of the sample (n=500) had answer boxes placed to the left of the response choices and the other half had boxes placed to the right (and before skip instructions).

Both experiments use a standard 3-contact mailing protocol. The analysis plan for both experiments is to compare rates of skip errors of omission, where respondents skip questions they should answer, and skip errors of commission, where questions that should be skipped are answered. The project is currently in the field. Data should be collected by December 1 and analysis completed in early 2012.

Internet Data Collection for the American Community Survey and Census

Internet Data Collection for the American Community Survey and Census Jennifer Guarino Tancreto, U.S. Census Bureau ([email protected]); Joan Hill, U.S. Census Bureau ([email protected]); Michelle Ruiter, U.S. Census Bureau ([email protected]); Michael Bentley, U.S. Census Bureau ([email protected]); Beth Nichols, U.S. Census Bureau ([email protected]); Mary Davis, U.S. Census Bureau ([email protected]); Mary Frances Zelenak, U.S. Census Bureau ([email protected]); Rachel Horwitz, U.S. Census Bureau ([email protected]); Brenna Matthews, U.S. Census Bureau ([email protected]); Samantha Stokes, U.S. Census Bureau ([email protected])

In 1997, about 19 percent of households in the United States had access to the Internet in the home. In 2010, that percentage jumped to 71 percent. Given the rapid rise in Internet use, coupled with the potential cost savings of using such a mode to collect survey data, many survey organizations are eagerly looking for ways to incorporate an Internet response mode into their surveys. The purpose of this session is to share results from several recent U.S. Census Bureau tests which included an Internet response option. The presentations in this session will share results from testing an Internet response mode for the American Community Survey and the Population and Housing Census. We will share the results from three separate experimental studies that examine the impact of the Internet response mode on response rates, data completeness, characteristics of responding households, and measurement error. We will also present results from a follow-up telephone study of respondents and nonrespondents that gathered information about how effectively the mailing materials for one of the American Community Survey studies conveyed the response option choices. Additionally, this study gathered information on why respondents chose the mode they did and whether offering an Internet response option influenced the decision not to respond at all. Lastly, we will provide detailed results from an analysis of the paradata collected in the American Community Survey experiments that help evaluate the design of the instruments.

First Abstract: The 2010 Census Quality Survey: Results from a Mixed-Mode Mail and Internet Reinterview Planning for the 2020 Census is underway even as data from the 2010 Census are still being released. One particular area of interest is the use of the Internet for the 2020 Census as an alternative means for the public to respond, which is expected to make the census more cost-effective and accurate. The 2010 Census Quality Survey was a reinterview, using nearly identical content to the 2010 Census, with both a mail-back and an Internet response component. The research was designed to be a first step in the 2020 Census Internet testing cycle. A sample of 2010 Census mail respondents was selected and assigned to one of three Census Quality Survey panels, each with a different contact strategy. The primary goal of the Census Quality Survey was to estimate and compare measurement error, using gross difference rates, between the paper and Internet response options. Other key goals included comparison of item nonresponse rates, household coverage analyses, and reporting various paradata measures from the Internet application (such as breakoff rates, help link usage, editing of answers, and other usability issues). This paper summarizes the main findings of the research.
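
The gross difference rate itself is a simple tabulation of disagreement between the two reports; the sketch below uses invented paired responses and a single item rather than the actual Census Quality Survey data.

```python
# Sketch of a gross difference rate (GDR) comparison for one item, by mode.
import pandas as pd

# Paired responses to the same item from the 2010 Census and the reinterview.
pairs = pd.DataFrame({
    "census":      ["Own", "Rent", "Own", "Own",  "Rent", "Own"],
    "reinterview": ["Own", "Own",  "Own", "Rent", "Rent", "Own"],
    "mode":        ["paper", "paper", "internet", "internet", "internet", "paper"],
})

# GDR = share of cases whose two reports disagree; computed separately by
# reinterview mode so paper and Internet measurement error can be compared.
gdr_by_mode = (pairs["census"] != pairs["reinterview"]).groupby(pairs["mode"]).mean()
print(gdr_by_mode)
```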

Second Abstract: Methods for Incorporating an Internet Response Mode into American Community Survey Mailings: A Comparison of Approaches The American Community Survey (ACS) is an ongoing monthly survey that collects demographic, social, economic, and housing information about the people and housing units in the United States and Puerto Rico using three sequential modes of data collection - mail, telephone, and personal visit. The U.S. Census Bureau initially contacts households by mail to inform them about the ACS and provide the paper questionnaire. Later, telephone calls and personal visits are used to contact nonrespondents. In response to the cultural shift in communication from paper to electronic modes, the U.S. Census Bureau tested the use of an Internet response option for the ACS during the April 2011 mail collection period. The focus was on testing different strategies for informing households about the Internet response option and encouraging response by using variations of the current mail materials and methods. The strategies included changes to the messages on the current letters and questionnaires, the addition of a new informational postcard, and modifications of the current mailing schedule. This paper will discuss the results of the April 2011 Internet test, specifically the impact of the different strategies on self-response, as well as recommendations for future testing.

Third Abstract: Can We Do Better the Second Time Around? Another Look at Incorporating an Internet Response Mode in American Community Survey Mailings Results from the 2011 American Community Survey (ACS) Internet Test conducted in April 2011 showed that the Internet notification strategies were successful not only in driving response to the Internet, but also in keeping overall self-response close to or even better than ACS production. This was a positive finding in light of the results from the first ACS Internet test in 2000, where response decreased when sampled units were offered a choice between mail and Internet modes. Surprisingly, the response rates for the "push" strategy (where we removed the paper questionnaire from the first mailing and moved up the replacement questionnaire by one week) were better than or equal to the rates when we offered mail only. Given the substantial cost savings that would be associated with moving to a "push" method, we conducted a follow-up test in November 2011 to verify and validate these findings as well as to test some changes to the mailing strategies in an effort to further increase both overall self-response and Internet response. We tested modifications to the push and choice strategies that were tested in April 2011. This paper will discuss the results of the November 2011 Internet test, specifically the impact of the different strategies on self-response and general cost implications.

Fourth Abstract: Why Do Survey Participants Choose to Report by Web, Paper, or Not at All? Results from an American Community Survey Qualitative Study In April 2011, the U.S. Census Bureau conducted an Internet Test, in which different mailing materials and mailing strategies were used to offer an Internet reporting option for the ACS. In two panels only an Internet reporting option was initially offered, with a paper form following in a subsequent mailing. The timing of the subsequent mailing varied. In two other panels, both modes were offered simultaneously, but the panels varied the amount of emphasis on the Internet option. In April and May, the Census Bureau fielded a follow-up telephone survey of approximately 1,200 ACS respondents and nonrespondents to the Internet Test survey. This follow-up study gathered information about how effectively the mailing materials for the Internet Test conveyed the response option choices. Follow-up questions focused on which components of the mailing materials or mailing strategy motivated sample households to respond by the Internet or by the paper form. Questions were also asked to determine why some households did not respond at all. This paper presents results from the follow-up telephone study of respondents and nonrespondents to the 2011 ACS Internet Test. Results of the follow-up study show that the majority of respondents knew they could respond either by paper or via the Internet to the ACS. We did not find substantial support for the finding in the literature (e.g., Smyth et al., 2010) that offering multiple modes simultaneously and forcing participants to make a reporting mode choice is the reason why some participants never respond. Instead, many nonrespondents in this study claimed never to have received the ACS envelope; if they did receive the envelope, many said they did not open it because they were too busy.

Fifth Abstract: Use of Paradata to Assess the Quality and Functionality of the American Community Survey Internet Instrument Many of the recent discussions of paradata have focused on survey operations and nonresponse weighting adjustments. Specifically, in this capacity paradata have been used to optimize call back schedules and interviewer observations have been used to supplement information from nonresponding households. However, using paradata from the respondent side has not been given as much attention. Collecting this type of information can lead to a better understanding of how a respondent interacts and understands a survey as well as provide researchers with the tools to reduce measurement error. As part of the 2011 American Community Survey (ACS) Internet Test, the Census Bureau inserted JavaScript code into every page of their online instrument to capture a vast amount of paradata, along with respondent answers. Specifically, all clicked links were captured (previous, next, radio buttons, help, etc.), along with timestamps, field values, errors rendered, invalid logins, timeouts, and logouts. These data were collected to help evaluate the quality of the ACS Internet instrument by making sure respondents were using the instrument as expected. Additionally, paradata can help identify problematic screens or other issues with the instrument that might impact data quality. This paper will analyze the paradata collected during the 2011 ACS Internet Test. Specifically, it will analyze the authentication procedures, problematic screens (as determined by errors, use of help, and breakoffs), response times, and data quality indicators. It will discuss the potential problems with the instrument, things that worked well, as well as plans for future research based on the findings from this test.
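
A sketch of the kind of tabulation such paradata support appears below; the event schema, field names, and records are assumptions for illustration, since the instrument's actual logging format is not described in this abstract.

```python
# Illustrative paradata tabulation: per-screen timings, breakoffs, and help use.
import pandas as pd

events = pd.DataFrame({
    "respondent": [1, 1, 1, 2, 2, 3, 3, 3],
    "screen":     ["login", "roster", "tenure", "login", "roster", "login", "roster", "tenure"],
    "action":     ["next", "next", "next", "next", "logout", "next", "help", "next"],
    "seconds_on_screen": [20, 95, 40, 25, 60, 18, 130, 55],
})

# Median time per screen flags items that may be hard to answer online.
print(events.groupby("screen")["seconds_on_screen"].median())

# Breakoff rate per screen: share of visits that end in a logout rather than
# a move to the next screen.
print((events["action"] == "logout").groupby(events["screen"]).mean())

# Help-link usage per screen points to wording or navigation problems.
print((events["action"] == "help").groupby(events["screen"]).mean())
```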

Maximizing Survey Response Rates

The Effect of Differential Mailing Methodologies on Response Rates: Testing Advanced Notices, Package Design, Postage, and Personalization Yelena Pens, Arbitron ([email protected]); Robin Gentry, Arbitron ([email protected])

Arbitron Inc., a provider of radio ratings data, conducted a test using a probability-based address sample to recruit the general population, age 12 and older, to complete a one-week web-based diary of their radio listening. Because web-based surveys have historically had lower response rates, several treatments were put in place to increase the response rate.

In order to find the optimal mailing strategy for recruitment, the mailing experiment included several treatments, such as advance notice postcards, pre-recorded telephone messages, and an alternative package design. The purpose of the alternative package design was to establish whether framing survey completion as an act of civic duty would appeal to households more than the current approach of direct-mail marketing materials such as flashy bulk packages, brochures, and non-monetary gifts. A personalization treatment was also tested, comparing letters addressed by name with a generic “Arbitron Ratings Household” greeting. In addition, packages were mailed on each postal day so that an optimal day of the week for mailing could be identified.

Finally, additional treatments were in place to increase response rates such as incentive choice (online Amazon Gift card or mailed Visa Gift Card), flexibility in survey start date, and an alternative mode option (mobile).

In this presentation, we will report the results from the web-diary initiatives. We will also assess the combined impact of the non-deliverable rate and the response rate for the personalized letters. Since the experiment included first-class as well as standard mail packages, we will determine the impact of postage on response rates. Finally, we will present the optimal mailing strategy for mail-based recruitment to an online survey.

Implementing Timely Data Collection Interventions Based on Response Rates and Key Survey Estimates Donsig Jang, Mathematica Policy Research ([email protected]); Flora F. Lan, National Science Foundation ([email protected]); Ananth Koppikar, Mathematica Policy Research ([email protected])

Groves and Heeringa (2006) developed a responsive survey design approach based on close observation of production measures, including response rates, contact rates, and survey-estimate tracking. Instead of sticking to a predetermined data collection protocol, researchers using this approach conduct real-time monitoring to determine the best time to implement an intervention in order to boost response rates. This approach can also help researchers reduce the variation in response rates across key domains by motivating low-responding groups to respond at the same rate as other groups. In this way, researchers can avoid having nonrespondents with markedly different characteristics from those of respondents.

Motivated by Groves and Heeringa’s responsive design approach, we monitored survey production measures on a daily basis for the National Survey of Recent College Graduates, and we made informed decisions about follow-up efforts, such as postcard and email reminders. We chose the timing of each intervention based on when the slope of cumulative response rates flattened. We investigate any changes in the response rate as our level of effort increases or decreases. This investigation will also reveal whether the overall response increment masks any adverse effects on the survey estimates due to an accelerated increase in responses from high-responding groups as opposed to low-responding groups.
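The timing rule described here, intervening when the slope of the cumulative response rate flattens, can be expressed very compactly. The Python sketch below uses invented daily completion counts and an arbitrary slope threshold purely to illustrate the trigger logic; it is not the monitoring system used for the National Survey of Recent College Graduates.

    # Illustrative daily completes for a fielded sample of 5,000 cases.
    daily_completes = [220, 180, 150, 120, 60, 30, 15, 10, 8, 5]
    sample_size = 5000
    threshold = 0.005  # intervene when the daily gain falls below 0.5 points

    cumulative = 0.0
    for day, completes in enumerate(daily_completes, start=1):
        gain = completes / sample_size  # day-over-day slope of the response rate
        cumulative += gain
        print(f"day {day}: response rate = {cumulative:.1%} (+{gain:.2%})")
        if gain < threshold:
            print(f"slope has flattened; schedule a reminder on day {day + 1}")
            break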

We will also examine key survey estimates over time to determine whether the response increments have an adverse effect on survey bias. This close monitoring will help us to develop a survey-operation system that facilitates timely and informed decisions for appropriate interventions during data collection. Our findings will help us control the survey process in such a way that a representative sample is obtained and a logical decision is made on when to stop collecting data.

Exploring the Effect of an Advance Letter on Response Rates: A Meta-Analysis Study for the National Immunization Survey Abera Wouhib, CDC/NCHS ([email protected]); Meena Khare, CDC/NCHS ([email protected]); Vicki Pineau, NORC at the University of Chicago ([email protected]); Jie Zhao, NORC at the University of Chicago ([email protected])

Response rates in telephone surveys, including the National Immunization Survey (NIS), continue to decrease over time. The NIS uses the response rate, the eligibility rate, and consent to contact providers as indicators of survey quality. The NIS response rate is defined as the product of the resolution rate, the screener rate, and the interview completion rate. Apart from the interview completion rate, the eligibility rate, generally lower than the census rate, has also been declining recently. The NIS has a provider survey to collect vaccination histories, and the rate of consent from parents to contact providers is another key rate. An advance letter has been shown to improve survey participation and may have some effect on the NIS interview completion, eligibility, and consent rates. Meta-analysis can be applied to explore possible impacts of advance letters on response rates using a numerical index called the effect size. With meta-analysis, one can combine information from multiple survey years in terms of effect sizes and generalize these effects as a measure of the impact of advance letters on response rates. The NIS provides an ideal data source for combining multiple years by various subgroups based on selected characteristics of interest and sampling information available prior to data collection. Using combined multiple years of NIS data, the effect sizes can be quantified in terms of standardized differences in response rates among the subgroups of resolved households with and without an advance letter. Then, by producing an estimated effect size for selected subgroups, meta-analysis can help identify whether an advance letter has an impact on response rates; it can also identify the subgroups for which an advance letter does not help to increase response rates, and for which alternative survey strategies such as incentives could be investigated.
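A minimal sketch of the calculation described above, assuming Cohen's h as the standardized difference between response rates and a simple fixed-effect (inverse-variance) combination across years. All rates, counts, and years below are invented for illustration and are not NIS results; the actual NIS analysis may use a different effect-size index and weighting scheme.

    import math

    def cohens_h(p1, p2):
        """Standardized difference between two proportions (Cohen's h)."""
        return 2 * math.asin(math.sqrt(p1)) - 2 * math.asin(math.sqrt(p2))

    # Hypothetical per-year response rates for resolved households with and
    # without an advance letter; all numbers are invented.
    years = [
        {"year": 2008, "p_letter": 0.68, "n_letter": 4000, "p_none": 0.61, "n_none": 1500},
        {"year": 2009, "p_letter": 0.65, "n_letter": 4200, "p_none": 0.60, "n_none": 1600},
        {"year": 2010, "p_letter": 0.63, "n_letter": 4100, "p_none": 0.59, "n_none": 1400},
    ]

    weights, weighted_h = [], []
    for y in years:
        h = cohens_h(y["p_letter"], y["p_none"])
        var = 1 / y["n_letter"] + 1 / y["n_none"]  # approximate variance of h
        weights.append(1 / var)
        weighted_h.append(h / var)
        print(f"{y['year']}: h = {h:.3f}")

    pooled = sum(weighted_h) / sum(weights)        # fixed-effect pooled effect size
    se = math.sqrt(1 / sum(weights))
    print(f"pooled effect size = {pooled:.3f} (95% CI half-width {1.96 * se:.3f})")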

Use of a 2nd Reminder Mailing, Quick Response Code and Optimized Mobile Survey to Increase Response to a Screener Questionnaire Michelle A. Cantave, Arbitron, Inc ([email protected]); Robin Gentry, Arbitron, Inc ([email protected])

Arbitron currently uses a mailed screener survey sent to an address-based sample (ABS) to recruit the non-landline portion of our sample frame. Improving the return rate for this survey is critical to improving the overall response rate of the ABS frame sample because the screener is only the first step of a multi-mode data collection process (mailed screener, phone diary placement, and mailed diary package). This means that the overall response rate for the ABS sample can never be higher than the return rate to the initial screener survey.

The current methodology for the survey is to send an initial screener questionnaire with a small cash incentive and, for non-returners, a replacement questionnaire approximately one month later. In the summer of 2011, we conducted a split-sample test to improve the ABS screener return rates by adding an inexpensive postcard-sized mailer sent to addresses that had not returned a screener approximately three weeks after the replacement questionnaire was sent. Consistent with the replacement questionnaire, this mailer contained a reminder to either 1) return the paper survey that was previously mailed to the household, 2) go online to complete the screener survey, or 3) call in to a 1-800 line to complete the survey. In addition, we included a quick response (QR) code that could be scanned by smartphone users and would take them directly to an in-bound survey website optimized for mobile phone use.

We will present the return rate results as well as an analysis of the demographics of those who returned the screener, to determine whom we brought in with the additional mailer. We will also present results on QR code utilization and information about the demographic profile of users who completed the survey via a mobile phone.

New Reminder Methods and Their Effect on Response Rates for an Establishment Survey Matthew G. Anderson, Mathematica Policy Research ([email protected]); Melissa Krakowiecki, Mathematica Policy Research ([email protected]); Lawrence Vittoriano, Mathematica Policy Research ([email protected]); Cathie E. Alderks, SAMHSA ([email protected]); Karen CyBulski, Mathematica Policy Research ([email protected])

The National Survey of Substance Abuse Treatment Services (N-SSATS) is an annual multi-mode establishment survey conducted by Mathematica Policy Research on behalf of the Substance Abuse and Mental Health Services Administration (SAMHSA). The survey includes information on facility characteristics, treatment services provided, and client counts for over 16,000 substance abuse treatment facilities across the nation. The data also provide the information for the National Directory of Drug and Alcohol Abuse Treatment Programs. During the 2011 data collection cycle, Mathematica Policy Research implemented a new multi-mode strategy for respondent reminders to help maintain its historically high response rate. This reminder strategy utilized different technologies (bulk faxes, emails, and phone reminders), increased the frequency of respondent reminders, and used technology to support new procedures. This paper will compare the 2011 N-SSATS strategy to an earlier N-SSATS data collection year that relied on the typical protocol of phone reminders and limited faxing. The bulk email function was a technological enhancement added to the 2011 N-SSATS data collection reminder process. To determine whether the new strategy impacted either the rate of return or the overall response rate, we will compare response rates at six distinct time points in the data collection cycle: the first packet mailing, the mailing of the reminder letter, the date of the reminder call, one month after the reminder call, two months after the reminder call, and October 1 of each year. The results of this comparison will be used to guide the reminder process for future rounds of N-SSATS and may provide evidence for the field of survey research about reminder methods and their effect on response rates.

New Frontiers: Design Issues for Surveys Using Mobile Devices

Response Quality and Demographic Characteristics of Respondents Using a Mobile Device on a Web-based Survey Kevin R. Guidry, Indiana University ([email protected])

As mobile Internet access has proliferated throughout much of the United States, people invited to participate in Web-based surveys are using their mobile devices (smartphones, tablet computers, etc.) to complete those surveys. Unfortunately, our understanding of the response process, particularly the role of memory and cognition, and our understanding of the usability challenges of small displays lead us to believe that responses entered using the small screens and keyboards of mobile devices may be of lower quality than those entered on devices with larger screens, especially for surveys that are not optimized for smaller screens. This paper uses responses to the 2011 National Survey of Student Engagement (NSSE) to explore this potential problem. Web browser data were captured from nearly 415,000 of the undergraduate college students who participated in the 2011 NSSE administration. This paper explores the demographic characteristics and response quality of the more than 17,000 respondents – 4.1% – who exclusively used a mobile device to participate in the Web-based version of this survey, an instrument not optimized for mobile devices. First, demographic characteristics of these respondents are described using descriptive statistics, tests of significance, and logistic regression. Second, the quality of their responses is explored by comparing their responses to other sources of data, examining item non-response, and calculating an indicator of satisficing (Chen, 2011). The results – mobile device users tended to be older, male, and non-White, and their responses were of lower quality – can inform future survey design and administration processes. In particular, these results can help us determine whether the cost of optimizing for small screens is worth paying or whether the potential lessening of response quality is acceptable.
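Two of the response-quality measures mentioned above, item non-response and a satisficing indicator, can be illustrated in a few lines of Python. The straight-lining check below is only a crude stand-in for the Chen (2011) indicator, and the records, device flag, and item battery are invented for the example.

    # Toy respondent records: a device flag plus answers to a 5-item battery
    # (None = item skipped). All values are invented.
    respondents = [
        {"mobile": True,  "items": [3, 3, 3, 3, 3]},
        {"mobile": True,  "items": [4, None, 2, None, 1]},
        {"mobile": False, "items": [4, 2, 5, 3, 2]},
        {"mobile": False, "items": [5, 4, None, 3, 4]},
    ]

    def item_nonresponse(items):
        """Share of battery items left unanswered."""
        return sum(v is None for v in items) / len(items)

    def straightlined(items):
        """Crude satisficing proxy: identical answers to every item answered."""
        answered = [v for v in items if v is not None]
        return len(answered) > 1 and len(set(answered)) == 1

    for device in (True, False):
        group = [r for r in respondents if r["mobile"] == device]
        inr = sum(item_nonresponse(r["items"]) for r in group) / len(group)
        sl = sum(straightlined(r["items"]) for r in group) / len(group)
        label = "mobile" if device else "non-mobile"
        print(f"{label}: mean item non-response = {inr:.1%}, straight-lining = {sl:.0%}")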

Use of Mobile Devices to Access Computer-optimized Web Instruments: Implications for Respondent Behavior and Data Quality Colleen A. McClain, Survey Sciences Group, LLC ([email protected]); Scott D. Crawford, Survey Sciences Group, LLC ([email protected]); John P. Dugan, Loyola University Chicago ([email protected])

With recent increases in smartphone usage, the small size of web browsers on mobile devices has become crucial for researchers to consider, especially when surveying populations with high rates of mobile browser use. Complex web surveys have generally been designed for full-screen implementation, and consideration has typically not been paid to small screen size and touch-screen usage given the numerous challenges involved in doing so; as a result, any prevalence of users responding on a mobile device may introduce undesirable trends in response (Peytchev & Hill, 2010). This presentation will focus on a national web survey of college students at nearly 100 universities, a population with a near-50% rate of smartphone ownership according to the Pew Research Center’s Internet & American Life Project (Smith, 2011). With the above considerations in mind, we sought to identify respondents who accessed the survey on mobile devices at several key points, gather information that would allow us to describe the characteristics of their experience, and experimentally test one potential method of intervention within the survey in an effort to encourage mobile users to return on a computer. We will discuss the implications of such device use on data quality, focusing on breakoffs, item-missing data rates, length of text responses, and behavior surrounding validations. Analysis by browser type and duration of use will be presented, together with recommendations for addressing challenges posed by mobile device use in future research on and implementation of web surveys.
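Identifying mobile respondents at survey entry is typically done by inspecting the browser's user-agent string. The sketch below shows one crude way to do this in Python; the patterns are illustrative only, are not the detection logic used in the study described above, and production systems rely on far more thorough device databases.

    import re

    # Very coarse user-agent patterns; treat this purely as an illustration.
    MOBILE_PATTERN = re.compile(
        r"iphone|ipod|android.*mobile|windows phone|blackberry",
        re.IGNORECASE,
    )

    def is_mobile(user_agent: str) -> bool:
        """Flag an access as coming from a mobile browser."""
        return bool(MOBILE_PATTERN.search(user_agent))

    hits = [
        "Mozilla/5.0 (iPhone; CPU iPhone OS 5_0 like Mac OS X) AppleWebKit/534.46",
        "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/535.1 Chrome/14",
    ]

    for ua in hits:
        if is_mobile(ua):
            # e.g., log the access and show a prompt inviting the respondent
            # to return on a computer, as in the intervention described above
            print("mobile access detected:", ua[:45], "...")
        else:
            print("desktop access:", ua[:45], "...")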

Scale Orientation, Number of Scale Points and Grids in Mobile Web Surveys Keith Chrzan, Maritz Research ([email protected]); Ted Saunders, Maritz Research ([email protected]); Aaron Jue, Decipher ([email protected])

Respondents increasingly use mobile smartphones to take web surveys we design for a PC viewing environment. Maritz and Decipher observe lower completion rates on surveys accessed on mobile devices, and we posit two reasons for this:

1) respondents on mobile devices are more likely to be distracted by activities outside of the survey than respondents who are seated in front of a PC, and 2) surveys have not been optimized for mobile devices, so less sophisticated mobile browsers and smaller screen sizes combine to provide respondents with a less pleasurable survey experience.

While researchers do not have much control over the environment in which a respondent receives a survey invite, there may be ways to improve web survey design for mobile respondents to offer a better survey experience. In this research we address several survey design questions:
• Given that some previous research suggested that respondents scroll vertically more readily than horizontally, will we get better results from mobile web respondents if we arrange scales vertically instead of horizontally?
• We see grid questions appearing to cause terminations among our mobile web respondents. Is this a function of the difficulty of the grid format, or would we see the same drop off if we break a k-item grid into k separate questions?

Using a multi-cell test-retest design, we seek to answer the questions above, assessing our rating scale treatments in terms of:
• Completion rates
• Construct validity
• Similarity of PC web and mobile web responses
• Individual question and overall survey completion times
• Test-retest reliability
• Self-reported respondent experience
• Perceived and actual survey length
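As one illustration of the evaluation criteria listed above, test-retest reliability within a design cell can be summarized with a simple correlation between first and second administrations. The ratings below are invented, and the study's actual reliability measure may differ.

    from statistics import correlation  # available in Python 3.10+

    # Invented test and retest ratings for respondents in one design cell
    # (e.g., a vertically oriented 5-point scale viewed on a mobile device).
    test   = [4, 3, 5, 2, 4, 3, 5, 1, 2, 4]
    retest = [4, 3, 4, 2, 5, 3, 5, 2, 2, 4]

    r = correlation(test, retest)  # Pearson r as the test-retest coefficient
    print(f"test-retest reliability: r = {r:.2f}")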

Understanding Smartphone Usage to Take Web Surveys: A Cross Country Analysis Carey Stapleton, Service Management Group ([email protected])

With the continued proliferation of smartphones in the United States and in countries around the world, survey respondents are increasingly using this platform to respond to online web surveys. Despite this, there is little understanding within the survey research world about who is using their smartphones to take online surveys and what impact their use has on data quality. In order to answer these questions, this paper will synthesize consumer satisfaction research conducted across 20+ countries and local languages, including the United States, the United Kingdom and other Western European countries, Latin American countries, Eastern European countries, and Asian countries. The following topics will be discussed in depth:
• Smartphone usage rates for responding to online surveys across countries
• Types of smartphones used across countries
• Demographics of smartphone users across countries
• Data quality issues across countries, including breakoff rates and response pattern differences
Based on the results of this research, recommendations will be made about which countries’ surveys should be optimized for smartphone usage and which need not be.

Better (Quality), Faster, Cheaper? Completing Web Surveys on Cell-Enabled iPads James J. Dayton, ICF ([email protected]); Heather Driscoll, ICF ([email protected]); Robert S. Pels, ICF ([email protected])

Intercept field data collectors working in outdoor environments with electronic devices face a number of challenges that traditional paper-and-pencil data collectors do not. For example, these electronic devices must be easy to use in difficult environments; their programmatic solutions must efficiently mimic the flexible, dynamic functionality that is baked into paper-based data collection tools; and interviewers must juggle several respondents participating in the same activity simultaneously on a single electronic data collection device, rather than having them fill out multiple paper questionnaires concurrently.

This paper will explore ICF researchers’ quest for data collection options that are both innovative and able to handle challenging environments without many of the cost and flexibility limitations that accompany more traditional CAPI (Computer-Assisted Personal Interviewing) and paper-based implementations. ICF’s solution? The “AppPI” (App for Personal Interviewing)—a data collection application designed for cell-enabled tablets (specifically the 3G iPad).

In addition to adding multiple respondent capabilities, the AppPI required other enhancements to address shortcomings discovered during initial pilot testing of the first generation tool. We also wanted to add other productivity enhancements to support field assignment logistics and field force communications. For example, innovations, such as linking photographs and audio files to specific survey responses, were investigated. Finally and most importantly, ICF researchers sought to answer the critical question: “Can the AppPI achieve the holy-grail promise of new technology—collect higher quality data, provide cost savings, and deliver clean, verified data to end-users faster than traditional in-field data collection techniques?”

The Role of the Interviewer in Survey Data Quality

Do Interviewers Influence Respondent Propensity to ‘Satisfice’? Gosia Turner, University of Southampton, UK ([email protected]); Patrick Sturgis, University of Southampton ([email protected]); Chris Skinner, London School of Economics ([email protected])

It is frequently asserted that a primary benefit of interviewer-administered surveys, relative to those conducted via self-administration, is that interviewers can help to motivate respondents to provide accurate and well-considered responses. However, an under-acknowledged implication of this assumption is that interviewers may introduce an additional source of variability to survey estimates, insofar as they vary in their ability to motivate accurate responding. A primary determinant of respondent-driven measurement error is the level of cognitive effort applied in answering questions. While some respondents expend a great deal of time and effort in order to come up with an accurate response, others employ what Jon Krosnick has termed a ‘satisficing’ strategy, which enables them to provide an ‘acceptable’ response for the minimum possible effort. Such strategies include, but are not limited to, yea-saying, choosing the first ‘reasonable’ response option presented in a list, agreeing with statements presented in the questionnaire, choosing a ‘Don’t know’ option rather than providing a substantive answer, and a tendency toward heaping in behavioral frequency questions. In this paper we investigate whether interviewer characteristics are predictive of the extent to which respondents engage in such satisficing response sets. We use cross-classified multilevel models applied to data from the UK National Travel Survey, linked to paradata on interviewer characteristics, attitudes, and beliefs, to identify the interviewer contribution to a range of measures of respondent satisficing.

Observational Strategies Associated with Increased Accuracy of Interviewer Observations in Employment Research Brady T. West, Institute for Social Research ([email protected]); Frauke Kreuter, Joint Program in Survey Methodology (JPSM) ([email protected]); Mark Trappmann, Institute for Employment Research (IAB) ([email protected])

Only one existing study has examined alternative observational strategies used by interviewers who are tasked with recording observations on sampled housing units in a CAPI survey. Presented at AAPOR 2011, this study suggested that different interviewers use distinct strategies when making their observations, and that these alternative strategies may lead to different accuracy levels in the observations collected (which could in turn impact the quality of estimates based in part on the observations). This study attempts to extend this preliminary work by examining the observational strategies used by CAPI interviewers in a national survey of labor market participation and social security in Germany (the PASS survey). In Wave 5 of this panel survey, interviewers were tasked with recording judgments about the income bracket of a sampled household and whether anyone in the household was currently receiving unemployment benefits. The interviewer judgments were validated based on survey responses, and the 10 most accurate interviewers along with the 10 least accurate interviewers were identified for the two observations. Semi-structured interviews were then completed over the telephone with these 20 interviewers, and the resulting interviews were transcribed and coded. A subsequent qualitative analysis identified distinct strategies used by the more accurate interviewers, in addition to strategies that may not be useful in practice. These strategies are currently being evaluated for training purposes in future PASS waves. Results from this qualitative study will be presented, along with directions for future research in this area.

The Utility of Interviewer Observations as a Measure of Survey Data Quality Chris Antoun, Institute for Social Research, University of Michigan ([email protected])

Many national surveys require interviewers to record subjective evaluations upon completing each interview. While these items measure characteristics of the interview that are presumed to be related to measurement error, such as whether the respondent was attentive, the link between these “interviewer observations” and measurement error is rarely tested. This analysis of approximately 22,000 observations from the National Survey of Family Growth (NSFG) Cycle 7 finds evaluations of the quality of information obtained, and of whether the respondents were attentive, upset, or tired during interviews, to be significantly associated with three data quality indicators: inconsistent answers, item missing data, and interview length. However, this analysis also finds more favorable ratings for high socioeconomic status (SES) respondents than for low-SES respondents, even when controlling for data quality indicators and other respondent characteristics. Therefore, both measurement error components and respondent social characteristics were important determinants of these particular interviewer observations. Implications are discussed.

Using Behavior Coding to Diagnose Education Question Problems in Telephone Interviewing Fan Guo, Program in Survey Methodology, University of Michigan ([email protected]); Jim Lepkowski, Survey Research Center, University of Michigan ([email protected]); Joe Matuzak, Survey Research Center, University of Michigan ([email protected])

Educational attainment is assessed in almost all social science surveys, but it is often poorly measured. In recent decades, people have changed how they acquire education and how they view it. Survey methodologists can measure how well questions are working by coding verbal behaviors between interviewer and respondent in an interview. We applied a behavior coding framework to education questions in the Survey of Consumer Attitudes, a national RDD survey of adults about their attitudes toward the economy. Behavior codes were set up to record interviewer-respondent behaviors that deviated from accepted norms of standardized interviewing. Twenty coders were assigned randomly to code 1,084 recorded telephone interviews collected in the spring of 2010 and 2011. We examined the distribution of the individual behaviors and the sequence of behaviors to understand more fully the interviewer-respondent interaction. Three behaviors were coded for six education questions: whether the interviewer read the question correctly, whether the respondent chose one of the response alternatives offered, and whether the interviewer probed the answer correctly. Among the six questions, three were about the respondent and three were proxy reports about their spouse or significant other. Proxy reports had more interviewer question-reading errors and incorrect probing behaviors. The relationship of respondents’ characteristics to coded behaviors is also explored in a logistic regression model. Lastly, inter-coder reliability was assessed, and results indicate a high frequency of agreement in question reading and response behaviors (kappa statistics greater than 0.4), but agreement levels were lower for interviewer probing behaviors.
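Inter-coder agreement statistics like the kappas reported above can be computed directly from paired coder assignments. The following Python sketch computes Cohen's kappa for two coders on a single binary behavior code; the codes themselves are invented for the example.

    def cohens_kappa(codes_a, codes_b):
        """Cohen's kappa for two coders assigning the same categorical codes."""
        assert len(codes_a) == len(codes_b)
        n = len(codes_a)
        categories = set(codes_a) | set(codes_b)
        observed = sum(a == b for a, b in zip(codes_a, codes_b)) / n
        expected = sum((codes_a.count(c) / n) * (codes_b.count(c) / n)
                       for c in categories)
        return (observed - expected) / (1 - expected)

    # Invented codes for "question read exactly as worded" (1) versus not (0).
    coder_1 = [1, 1, 0, 1, 1, 0, 1, 1, 1, 0]
    coder_2 = [1, 1, 0, 1, 0, 0, 1, 1, 1, 1]
    print(f"kappa = {cohens_kappa(coder_1, coder_2):.2f}")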

Understanding Public Opinion on Health Care

Public Knowledge and Misunderstanding about Health Reform: A Geographical Analysis Gerald M. Kosicki, School of Communication, The State University ([email protected])

Eighteen months after the passage of the Patient Protection and Affordable Care Act of 2010, it remains the subject of considerable controversy. Despite tangible benefits that many Americans enjoy because of the legislation, considerable public confusion persists about what the bill does and does not do. One aspect of the controversy surrounding the law is the very high level of misinformation about the bill that national surveys have documented. Oddly, public understanding of the bill had actually slipped by August 2010 compared with some earlier surveys.

The goal of this research is to examine the survey evidence of misinformation and confusion surrounding the health reform act. Which subgroups of Americans are most ill informed? Is the opposition largely ideological or party-based, or is it also related to education, self-interest, media use, coordinated attacks, and geography?

The study uses national survey data from the Kaiser Family Foundation, Pew, and several other public surveys that have examined public knowledge about the new law. Multivariate statistical tests will be used to help understand the correlates of public misunderstanding and confusion. A novel part of this analysis will involve geographical analysis and data mapping to better understand the role of geography and regional variation in the distribution of this information. State-by-state and regional analyses, including the dominant party affiliation of each state, will be used along with other contextual data related to health care. Data mapping using Tableau and another leading mapping software package will be used to produce high-quality maps. Implications of the results for the study of public opinion and public policy will be discussed.

The Affordable Care Act and the Republican Presidential Primaries Bianca DiJulio, Kaiser Family Foundation ([email protected]); Sarah Cho, Kaiser Family Foundation ([email protected]); Liz Hamel, Kaiser Family Foundation ([email protected]); Claudia Deane, Kaiser Family Foundation ([email protected]); Mollyann Brodie, Kaiser Family Foundation ([email protected])

Since passage of the Affordable Care Act (ACA) in March 2010, the Kaiser Family Foundation has measured Americans’ opinions and understanding of the law with an in-depth, monthly tracking survey. During its infancy, feelings toward the law remained relatively steady, but starkly divided along partisan lines. As the 2012 presidential election approaches, the candidates for the Republican presidential nomination have frequently debated the merits of the ACA and called for it to be repealed. In this paper, we will analyze how opinions of the law develop during the Republican primary season, including the extent to which Americans are tuned in to the health law as a campaign issue, current perceptions of what its impact will be on individuals and the nation, and the partisan underpinnings of opinion on health policy. We will also explore the public’s awareness of specific elements of the law, interest in repeal or modifications to the ACA, and reactions to policy options discussed on the campaign trail.

Sampling Low-Income Californians to Assess their Healthcare Preferences Julie Phelan, Langer Research Associates ([email protected]); Gregory Holyk, Langer Research Associates ([email protected]); Gary Langer, Langer Research Associates ([email protected]); David Dutwin, Social Science Research Solutions ([email protected]); Eran Ben-Porath, Social Science Research Solutions ([email protected])

Sampling barriers are a major challenge to representative studies of low-income Americans, given both their comparatively low incidence in the overall population and the reluctance of many researchers to screen on the basis of income for fear of high nonresponse. Our paper, based on a study of low-income Californians in spring 2010, suggests that such fears are largely unfounded, and that income-threshold screening, combined with list- and Census-based stratified sampling, is an effective and efficient approach to sampling poor and near-poor populations.

This presentation offers an overview of the methodological approach we employed, as well as substantive findings, in a health care survey of California adults with household incomes less than 200 percent of the federal poverty level. Researchers typically are averse to placing income questions at the beginning of a questionnaire due to the sensitive nature of the topic. However, using an income-threshold rather than an income-level question, 96.5 percent of respondents answered, enabling us to screen successfully for the target population and complete our 20-minute questionnaire with AAPOR RR3 response rates of 29.3 percent for the landline sample and 19.8 percent for the cell phone sample.
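For reference, AAPOR's Response Rate 3 discounts cases of unknown eligibility by an estimated eligibility rate. A minimal sketch of that calculation follows; the disposition counts and the eligibility estimate e are entirely invented and are not the study's actual case counts.

    def aapor_rr3(I, P, R, NC, O, UH, UO, e):
        """AAPOR Response Rate 3: complete interviews divided by estimated
        eligible cases (I complete, P partial, R refusal, NC non-contact,
        O other, UH/UO unknown-eligibility cases, e = estimated share of
        unknown cases that are eligible)."""
        return I / ((I + P) + (R + NC + O) + e * (UH + UO))

    # Entirely invented dispositions, just to show the calculation.
    rate = aapor_rr3(I=600, P=50, R=700, NC=400, O=80, UH=300, UO=120, e=0.6)
    print(f"RR3 = {rate:.1%}")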

The survey, sponsored by Blue Shield of California Foundation, produced valuable results for safety-net healthcare providers as they seek to position themselves in the new landscape under the Patient Protection and Affordable Care Act. The study moved beyond traditional measures of satisfaction with care to evaluate the low-income population’s desires, needs and expectations. In addition to detailing our sampling approach, the paper includes the results of statistical modeling on overall satisfaction with care and interest in changing health care facilities.

Public Opinion on “New Frontier” Policy Efforts to Combat Chronic Disease Stephanie Morain, Harvard University ([email protected]); Jordon Peugh, Knowledge Networks ([email protected])

Chronic disease morbidity and mortality associated with modifiable risk factors pose growing challenges for the population’s health in the twenty-first century. While infectious diseases continue to pose a threat to the nation’s health, their relative burden has been dwarfed in recent decades by chronic diseases, particularly those associated with “lifestyle choices” such as overeating, lack of exercise, and tobacco use. In response to this challenge, some health departments have undertaken policy measures that apply both traditional and innovative public health tools to reduce the influence of modifiable risk factors. Examples of such policy measures include diabetes surveillance, trans fat bans, cigarette taxation, and school-based body mass index screenings. These interventions, which constitute a new frontier for public health law and practice, have provoked political and moral controversy. Critics assert that they unduly burden individual liberties and exceed the appropriate scope of public health authority. To date, public health agencies have felt their way through this contested territory without the benefit of a solid understanding of how the public views these initiatives at “public health’s new frontier.” To address this gap, we conducted an online survey of a nationally representative sample of 1,817 American adults using KnowledgePanel®, Knowledge Networks’ (KN) national probability-based web panel that is recruited through RDD and ABS. In addition to identifying the general population’s views, we also oversampled diabetics and residents of the New York City metropolitan area, who we expected might have differing opinions given their greater exposure to such initiatives. This paper will present information on public attitudes about the legitimacy of new frontier interventions and differences in support among population subgroups. Further, we will discuss the benefits of conducting this research using an online panel, including the ability to rely on extant data on a range of respondent characteristics and health attitudes.

Saturday, May 19, 2012 8:00 a.m. - 9:30 a.m. Concurrent Session G

Analyzing Trends and Issues Concerning the 2012 Elections

The 2012 Republican Primaries: What the Heck was that all About? Gary Langer, Langer Research Associates ([email protected]); Damla Ergun, Langer Research Associates ([email protected]); Patrick J. Moynihan, Institute for Quantitative Social Science-Kennedy School of Government ([email protected])

At this writing, Michele Bachmann, and are playing public-opinion ping-pong, ’s being voted Most Boring Prom Date and , , Jon Huntsman and are rattling chains in the basement. For the sanity of all involved we should know how it ends by May, with a sheaf of exit poll data to sort out the will of the voters and some preliminary general-election data to set the scene for the fall spectacular. We’re covering this election for the ABC News television network, both reporting on the caucus- and primary-night exit polls and directing ABC’s participation in the ongoing ABC News/Washington Post poll. In a year marked by vast continuing economic discontent and the deep political alienation it produces, we’ll present our analysis of the 2012 race so far.

Key Insights on the 2012 Republican Presidential Nomination Contest from Gallup Tracking Jeffrey M. Jones, Gallup ([email protected])

In 2011-2012, Gallup tracked Republicans' presidential nomination preference and their opinions of the major presidential candidates. This paper will review the data for key insights to help understand the outcome of the contest. To date, the data have suggested that Herman Cain could become a serious contender and that would not likely be a factor if he ran. The data also show that candidates' images generally got worse over the course of the campaign.

Altogether Different: Understanding Dynamics of Primary and General Elections Andrew Smith, University of New Hampshire Survey Center ([email protected])

Final polls before the 2008 New Hampshire primary showed Barack Obama with a sizable lead over Hillary Clinton, yet Clinton ended up winning. Concern over the inability of polls to predict the winner in New Hampshire, and to accurately predict the margin of victory in several other primary states, led AAPOR to conduct an investigation of the polling methodology used in order to discover the cause. And while this study is useful, it downplays a fundamental aspect of primaries – the choice by voters is typically not one they spend a great deal of time considering, nor is it one that will have much impact on how they vote in the subsequent general election.

This paper provides evidence that most primary voters make up their minds at the very end of a primary campaign while general election voters are largely fixed in their vote choice at a fairly early stage. Data come from extensive polling conducted in New Hampshire during the 2008 and 2012 primary elections as well as in the 2008 and 2010 general elections.

The implications of this research for future campaigns highlight the need for increased education of the media so that their reporting of primary campaigns focuses more on the undecided nature of the electorate and less on the horse race.

The Tea Party and Perot Voters: Kindred Spirits? Larry Hugick, Princeton Survey Research Associates International ([email protected]); Jessica Starace, Princeton Survey Research Associates International ([email protected])

The Tea Party played a critical role in delivering the 2010 Congressional vote to the Republicans. Eighteen years earlier, another group fueled by anti-establishment, anti-Washington sentiment played a key role in helping the GOP take control of the U.S. House in a midterm election – the Perot voters. While there are obvious differences between today’s Tea Partiers and the Perotistas of the 1990s – one largely operating within the Republican Party, the other an independent group – survey data show that these two groups share certain demographic characteristics, economic attitudes, issue priorities, and sociopolitical values. Drawing upon data from surveys conducted by Princeton Survey Research Associates International for the Pew Research Center, the Poll, the Kaiser Family Foundation, and other sources, the paper examines the politics of Ross Perot’s supporters in 1994-96 and Americans who identify with the Tea Party in 2010-12. Understanding the similarities and differences of the two groups illuminates important ways in which the voters, the major parties, and the media environment have changed over the past two decades.

The End of Empire: An Examination of Party Registration Shifts in Pennsylvania Christopher Paul Borick, Muhlenberg College Institute of Public Opinion ([email protected])

One of the defining features of the political scene in the key swing state of Pennsylvania during the last decade was the dramatic shift of voters away from the Republican Party and to the Democratic Party. In areas like suburban Philadelphia, voters by the tens of thousands opted to change their party registration from Republican to Democrat, with the net effect being a 1.2 million-voter advantage for Democrats over the GOP by 2009. In a previous study by this author, utilizing a sampling frame from state registration records, former Republican voters identified dissatisfaction with President Bush and his handling of the war as the primary reasons for their flight to the Democratic Party. In the period since the Republican exodus, a counter-trend has emerged in which the Democratic Party has lost a growing number of voters to the GOP in 2010 and 2011, with numerous electoral successes accompanying the shift. In this paper, the decisions of Pennsylvania voters to leave the Democrats and become Republicans (either for the first time or in a return to the party) are examined through a survey of voters who made the switch in 2010 and 2011. Using a sampling frame developed from voter registration records provided by the Commonwealth, these “party shifters” were asked a series of questions regarding their decision to change parties. Among the factors examined are views on President Obama, health care reform, and the economy. The results provide insight into the political environment in Pennsylvania as the elections of 2012 approach.

Cross-National Research on Public Opinion

Spin the Tale of the Donkey: Networked Authoritarianism and Social Media in Azerbaijan Katy E. Pearce, University of California, Santa Barbara ([email protected]); Sarah Kendzior, Washington University

This mixed-methods study examines tactics used by the government of Azerbaijan to dissuade internet users from engaging in activist politics. We examine Azerbaijan through the frame of networked authoritarianism, a form of internet control common in authoritarian former Soviet states that emphasizes manipulation over censorship, and is increasingly emulated by other countries. Through a content analysis of three years of Azerbaijani media, a two-year structural equation model of the relationship between internet use and attitude toward protest, and interviews with Azerbaijani online activists, we found that the government has successfully dissuaded frequent internet users from supporting protest and many internet users from using social media for political purposes. State attempts to dissuade the wider Azerbaijani public from using social media have been less successful. The findings of our study contradict the belief that frequent internet use leads to greater support for dissent.

Agenda Setting in Qatar Jill Wittrock, University of Michigan ([email protected]); Michael Traugott, University of Michigan ([email protected]); Amina Albloshi, Social and Economic Survey Research Institute ([email protected]); Sara Zikri, Social and Economic Survey Research Institute ([email protected]); Kaltham Khalifa Al-Suwaidi, Social and Economic Survey Research Institute ([email protected]); Fatimah Ali Al-Khaldi, Social and Economic Survey Research Institute ([email protected])

Few studies have investigated the agenda setting model in the context of Arab countries and samples of their mass populations. This paper presents the results of the first-ever agenda setting study that includes a probability sample of major segments of the adult population in Qatar, linked to the newspapers that they read. Qatar has a complex media environment, including newspapers printed in Arabic and English designed to appeal to and inform two distinct groups of readers, Arabic speakers and English-speaking expatriates. In separate samples of Qataris and expatriates, we look at the relationship between respondents’ assessments of the most important problem facing Qatar and the front-page content of seven different newspapers, three printed in English and four printed in Arabic. The paper will include a description of differences in the newspaper front pages as well as the relationship of this content to most-important-problem evaluations.

The Public Agenda in Mexico 2007 – 2012: The evaluation of the public agenda in 14 national phone surveys between October 2007 and April 2012 Paul Francisco Valdes Cervantes, Parámetro Investigación ([email protected]); Jorge Maldonado García, Parámetro Investigación ([email protected]); Jesús Irineo Carreño Rodriguez, Parámetro Investigacion ([email protected])

This paper describes the evolution of the principal issues on the public agenda in Mexico and their effects on party identification and approval of authorities, using data from a series of 14 national phone surveys conducted between October 2007 and April 2012.

The public agenda is defined by the ranking of the main problems perceived by society. Public insecurity, the economic situation, employment, and drug trafficking are the main themes of the Mexican public agenda in this period. The public agenda was measured through the open-ended question: What is the main problem facing the country right now?

Public insecurity and the economic situation were the two main concerns in recent years; in particular, the spiral of violence has had an impact on Mexican public opinion over the last four years. This issue has dominated the series in 9 of the 12 measurements and is generally perceived negatively, with several consequences for Mexican society. The economy and unemployment have consistently been perceived as the second issue on the national public agenda.

The impact of the public agenda on public opinion has been negative for the government, party identification, and approval of authorities. The surveys show that the growing perception of violence and insecurity has contributed to lower overall levels of trust toward politics, institutions, and President Calderon, and has imposed significant costs on the performance of the ruling party in this period.

The project is sponsored by Parameter Research SC (www.parametro.com.mx) and the National Chamber of Commerce (Canaco) and is intended to assess the major themes of the national public agenda.

Methodology: National phone survey, representative of the Mexican population with a telephone line at home. Sample size: 405 effective interviews. Sampling: systematic selection of phone numbers, with probability proportional to the size of each state. Confidence level: 95%; margin of error: +/- 4.8% nationally.
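The reported precision is consistent with the standard simple-random-sampling margin-of-error calculation at the conservative p = 0.5 assumption. The short Python check below is only an illustration and ignores any design effect from the systematic, state-proportional selection.

    import math

    n = 405      # effective interviews reported above
    p = 0.5      # conservative variance assumption
    z = 1.96     # 95% confidence

    moe = z * math.sqrt(p * (1 - p) / n)
    print(f"margin of error is about {moe:.1%}")  # roughly 4.9%, in line with the reported 4.8%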

Evaluating Online Non-Probability Samples

The Challenge of Measuring Political Engagement with Online Surveys: An Analysis of Data from the British Election Study Jeffrey Karp, University of Exeter ([email protected]); Maarja Luhiste, University of Exeter ([email protected])

In a well-cited article in Political Analysis, the principal investigators of the 2005 British Election Study (BES) attempted to provide reassurance that the data collected by YouGov over the internet were equivalent to the face-to-face surveys that have hitherto formed the core of the BES. They compared responses between the Internet sample and the face-to-face sample, and found statistically significant, but small, differences in the distributions of key explanatory variables in models of turnout and party choice (Sanders et al. 2007). We examine this question, focusing on differences in political engagement, using data from online and face-to-face surveys in both the 2005 and 2010 British Election Studies. Our findings suggest that there is greater evidence of bias than discovered hitherto. Specifically, the online surveys overestimate the proportion of younger citizens and of those who are politically engaged, which would lead scholars to underestimate the effects of age compared to the in-person samples. Along with an apparent bias toward voters, there is also a strong bias toward political interest. Moreover, a comparison of the 2010 and 2005 BES reveals that the bias has not changed over time. These findings suggest that there is a greater need to acknowledge bias, which is likely the result of sampling as opposed to mode, particularly when examining questions related to political engagement.

Using Probability-based On-line Samples to Calibrate Non-probability Opt-in Samples Charles A. DiSogra, Knowledge Networks, Inc. ([email protected]); Curtiss L. Cobb, Knowledge Networks, Inc. ([email protected]); Elisa Chan, Knowledge Networks, Inc. ([email protected]); J. Michael Dennis, Knowledge Networks, Inc. ([email protected])

Probability-based sampling is the survey researcher’s most reliable method for making population estimates when only data from a sample is being used. Non-probability samples are considered less reliable with presumed biased estimates due to their convenient, non-representative construction. In the realm of Web surveys, a representative study sample, drawn from a probability-based Web panel (such as KnowledgePanel®), after post-stratification weighting, will produce reliable, generalizable unbiased study estimates. However, there are instances when too few Web panel members are available to meet minimum sample size requirements due to the finite size of the panel. In such unique situations, a supplemental sample from a non-probability opt-in Web panel may be added to satisfy sample size targets. First, this paper will show that when both samples are profiled with questions on early adopter (EA) attitudes, non-probability opt-in samples tend to have proportionally more EA characteristics compared to probability samples. This finding is consistent over different demographic groups. Second, taking advantage of these EA differences, this paper describes a statistical technique for calibrating opt-in cases blended with probability-based cases using these EA characteristics. Successful results from different studies will be demonstrated. Additionally, in order to quantify the benefits of calibration, using, for example, data from one probability sample (n=611) and one opt-in sample (n=750), a reduction in the average mean squared error from 3.8 to 1.8 can be achieved with calibration. The average estimated bias is also reduced from 2.056 to 0.064. Other examples will be presented. Knowledge Networks believes that this calibration approach is a viable methodology for combining probability and non-probability Web panel samples. It is also a relatively efficient procedure that serves projects with rapid data turnaround requirements.
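Calibration of this kind is, at its core, a weighting adjustment that pushes the blended sample's margins on the calibration variables toward benchmark targets. The sketch below shows the general idea using simple raking (iterative proportional fitting) on two invented variables, an early-adopter flag and an age group; it is not Knowledge Networks' actual procedure, and all targets and cases are made up.

    # Minimal raking sketch: adjust case weights so weighted margins match
    # benchmark targets. Variables, levels, and targets are invented.
    cases = [
        {"ea": "high", "age": "18-34", "w": 1.0},
        {"ea": "high", "age": "35+",   "w": 1.0},
        {"ea": "low",  "age": "18-34", "w": 1.0},
        {"ea": "low",  "age": "35+",   "w": 1.0},
        {"ea": "high", "age": "18-34", "w": 1.0},
        {"ea": "low",  "age": "35+",   "w": 1.0},
    ]
    targets = {"ea": {"high": 0.30, "low": 0.70},
               "age": {"18-34": 0.35, "35+": 0.65}}

    for _ in range(20):                       # a few raking iterations
        for var, dist in targets.items():
            total = sum(c["w"] for c in cases)
            factors = {}
            for level, share in dist.items():
                current = sum(c["w"] for c in cases if c[var] == level) / total
                factors[level] = share / current
            for c in cases:
                c["w"] *= factors[c[var]]

    # Check that the weighted margins now match the targets.
    total = sum(c["w"] for c in cases)
    for var, dist in targets.items():
        margins = {lvl: round(sum(c["w"] for c in cases if c[var] == lvl) / total, 3)
                   for lvl in dist}
        print(var, margins)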

How Representative is a Self-selected Web-panel? - The Effect on Representation of Different Sampling Procedures and Survey Modes! Stefan Dahlberg, University of Gothenburg ([email protected]); Johan Martinsson, University of Gothenburg ([email protected]); Sebastian Lundmark, University of Gothenburg ([email protected])

This paper makes a systematic comparison between different sampling procedures and survey modes by making use of three different types of surveys. In all three surveys, identical questions and wordings are used; however, they are three separate studies. The first study is based on a representative sample of approximately 3,000 Swedish citizens (drawn from the national census register) and is carried out as a traditional postal survey by the Swedish SOM Institute. The second study is also based on a representative sample of approximately 3,000 Swedish citizens (recruited by telephone from the national census register), but in this case the survey is carried out entirely as a web survey, distributed by e-mail. The third and final study is also carried out as a web survey but is instead based on a self-recruited citizen panel of 10,000 Swedish citizens. All three surveys were carried out during October to December 2011.

In order to evaluate potential differences between sampling procedures and survey modes, we will (a) compare the composition of respondents in terms of general SES-related background characteristics across the three surveys, both in terms of potential differences in levels and in correlations, and (b) analyze differences in three sets of questions tapping values, attitudes, and behavior. By doing this we will be able to uncover how representative a large self-recruited panel can actually become, as well as what the use of new technological media in surveys does to an initially representative sample in a technologically advanced country. Important questions we will be able to answer concern the extent to which different sampling procedures and survey modes affect the representativeness of a sample, and whether potential differences vary between different types of survey questions, i.e., questions relating to values, attitudes, and behavior.

Using Online Panels for National Surveys of Low Incidence Populations: Findings from the CDC Influenza Vaccination Monitoring Survey of Pregnant Women John M. Boyle, Abt SRBI Inc. ([email protected]); Sarah Ball, Abt Associates ([email protected]); Helen Ding, Chenega Government Consulting/CDC ([email protected]); Gary Euler, CDC ([email protected]); K. P. Srinath, Abt Associates ([email protected])

Future program evaluations may require rapid assessments of large samples of low incidence populations in order to measure targeted interventions. A case in point is the national effort to increase influenza vaccination among pregnant women to protect the health of the mother and the unborn child. The peak vaccination period for each influenza season is only about 3 months in length and less than two percent of adult women will be pregnant during this period. Hence, it is impractical to recruit and interview an effective sample of pregnant women during influenza season using traditional survey approaches, such as RDD telephone surveys.

In order to assess the vaccination uptake and evaluate the knowledge, attitudes, and behaviors of pregnant women toward influenza vaccination during the 2010-11 influenza season, the CDC adopted an innovative non-probability approach to obtaining a large national sample of pregnant women during influenza season. A large national web panel was used to recruit adult women who were either currently pregnant or had been pregnant since the beginning of vaccination in August 2010 for the influenza season. The fall 2010 assessment was launched in November for an early estimate of vaccination during the peak activity of the vaccination period.

Women currently or recently pregnant were recruited from the web panel, using both e-mail invitations to panelists (nearly 200,000) and website intercepts (nearly 30,000). In less than three weeks, we were able to interview a national sample of 1,500 eligible women, including minority oversamples. The demographic characteristics of the achieved sample were representative of population estimates. Moreover, the key variable, vaccination uptake, was consistent with estimates based on a small sample of currently pregnant women from the December 2010 BRFSS. This paper explores the methods and outcomes of this innovative method for rapid surveys of low incidence populations.

Examining Item Non-Response and Missing Data

A Tradeoff Between Quality and Quantity. An Examination of the Negative Relationship Between Unit and Item Non-Response in Survey Research Johan Martinsson, University of Gothenburg ([email protected]); Elias Markstedt, The SOM-institute, University of Gothenburg ([email protected]); Mikael Gilljam, University of Gothenburg ([email protected])

Previous studies have identified a negative relationship between unit non-response and item non-response for economic expectation questions in household surveys. Drawing on 25 years of consecutive postal surveys in Sweden with a total of 60,000 respondents, we further explore the connection between unit and item non-response across a wider set of substantive topics, including attitudes, values, self-reported behavior, economic expectations, and socio-economic status.

We start by describing the long-term trend in unit non-response and in item non-response for different types of questions. The general hypothesis is a negative relationship, but one possibly conditional on the type of question and type of respondent. Next, we move on to explore the validity of the response continuum model and the relationship between unit and item non-response. More specifically, we examine whether respondents who need more reminders before participating in the survey provide more item non-response. Our large pool of respondents permits a detailed analysis of this question across various sub-groups of the population and across different types of substantive topics and question types.

Lastly, we also assess to what extent the placement of a battery of questions in the survey influences the rate of item non-response. Does a question receive less item non-response if it is asked early in a survey than if it is asked later? Taking advantage of the variation in the placement and order of a set of question batteries between different years and between different questionnaires in the same years, we explore how large such effects are.

Trends of Income Nonresponse: Forty years of the General Social Survey Jibum Kim, NORC ([email protected]); Jaesok Son, NORC ([email protected]); Jodie Daquilanea, NORC ([email protected]); Lauren Doerr, NORC ([email protected]); Faith Laken, University of Chicago ([email protected]); Peter P. Kwok, NORC ([email protected]); Steven Pedlow, NORC ([email protected]); Hee-choon Shin, NORC ([email protected]); Tom W. Smith, NORC ([email protected])

We examined the trend and correlates of family and personal income nonresponse. Using the 1973(4)-2010 General Social Survey cumulative data, we analyzed a single item with detailed income response categories for both family and personal income. We found that both family and personal income nonresponse increased slightly from the early 1970s to 2010: from 7% to 13% for family income and from 7% to 12% for personal income. In all but 4 of the 26 GSS data collections, family income nonresponse was higher than personal income nonresponse. If we separate "refused" from "don't know" responses, "don't know" was more common than "refused" for family income in the 1970s and 1980s, but since the 1990s there have always been more "refused" than "don't know" responses. For personal income, on the other hand, there have always been more refusals than "don't know" responses. A logistic regression model predicting family and personal income nonresponse showed that age, being female, education, residence in the Northeast, residence in the 12 largest metropolitan areas, and mistrust were positively related to income nonresponse, but race was not significant after controlling for trust. As expected, the number of adults in the household was positively associated with family income nonresponse but not with personal income nonresponse. Our findings point to an increasing trend in income nonresponse and to similar correlates for family and personal income nonresponse.
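Purely as an illustrative sketch of the kind of logistic regression described above (not the authors' code; the data file and variable names below are hypothetical), such a model could be fit as follows:

```python
# Illustrative sketch only; file and variable names are hypothetical.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

gss = pd.read_csv("gss_income_items.csv")  # hypothetical extract of the GSS cumulative file
# family_inc_nr: 1 if family income is missing (refused/don't know), else 0
# female, northeast, top12_msa: 0/1 indicators; age, educ, mistrust, n_adults: numeric

model = smf.logit(
    "family_inc_nr ~ age + female + educ + northeast + top12_msa + mistrust + n_adults",
    data=gss,
).fit()

print(model.summary())                 # coefficients on the log-odds scale
print(np.exp(model.params).round(2))   # odds ratios for each predictor
```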

Nonresponse in Open-Ended Questions Bradford H. Bishop, Duke University ([email protected]); D. Sunshine Hillygus, Duke University ([email protected]); Natalie M. Jackson, Duke University ([email protected])

Open-ended questions are thought to provide researchers a more unbiased, in-depth look into what respondents think compared to closed-ended questions. Unfortunately, open-ended questions are plagued by higher levels of nonresponse than closed-ended questions. Item nonresponse on closed-ended questions is a well-researched topic, but it is unclear whether existing theories are adequate to explain item nonresponse to open-ended questions. In this paper, we investigate the predictors of nonresponse to open-ended questions in the 2008 Associated Press/Yahoo News Election Panel Study conducted by Knowledge Networks. Initial model results confirm that cognitive ability, topic interest, privacy concerns, item burden, and engagement in the survey predict the likelihood of responding. Most interesting is evidence of a positivity bias: if respondents do not have something nice to say, they are less likely to respond. In other words, a respondent's directional valence toward the object of the question (political candidates, in this case) shapes their likelihood of response; in contrast, questions not subject to directional valence (e.g., most important problem) show no such effect. We find that candidate favorability predicts the likelihood of answering an open-ended question about the presidential candidates in 2008; moreover, white respondents are less likely than minorities to offer a response about Obama, but there is no difference across race in the likelihood of offering a response to a question about McCain. Candidate favorability has no effect on the likelihood of answering the most important problem question.

Visualizing Multiply Imputed Data for Quality Review. Darryl Creel, RTI International ([email protected])

The use of multiple imputation to account for missing survey data has been increasing. However, diagnostic procedures to assess the quality of multiply imputed data have not kept pace. Consequently, how do we know whether our multiply imputed data have been reasonably imputed? To address this problem, we developed a visual quality review matrix for each imputed variable to verify that the values used to replace missing values were reasonable. The quality review matrix was used to concisely and quickly review 1) each imputed variable in relation to the original unimputed variable and 2) the imputed variable across the different imputed data sets. In this paper, we review several data visualization techniques currently in use and suggest a method for combining their strengths into one cohesive system for a survey with several types of variables.
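As a hedged illustration of the general idea (this is not the quality-review matrix described in the paper; the objects and names below are hypothetical), one could line up the observed distribution of a variable against each of the m imputed versions:

```python
# Illustrative sketch only; not the quality-review matrix itself.
# `observed` is a pandas Series of non-missing original values; `imputed_sets`
# is a list of Series holding the values filled in by each of the m imputations.
import matplotlib.pyplot as plt

def quality_review_plot(observed, imputed_sets, var_name="income"):
    m = len(imputed_sets)
    fig, axes = plt.subplots(1, m, figsize=(3 * m, 3), sharex=True, sharey=True)
    for i, (ax, imputed) in enumerate(zip(axes, imputed_sets), start=1):
        ax.hist(observed, bins=20, density=True, alpha=0.5, label="observed")
        ax.hist(imputed, bins=20, density=True, alpha=0.5, label="imputed")
        ax.set_title("Imputation %d" % i)
    axes[0].set_ylabel("density")
    axes[0].legend()
    fig.suptitle("Quality review for %s" % var_name)
    fig.tight_layout()
    plt.show()
```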

Coping with Missing Data: Assessing Methods for Logically Assigning Race/ethnicity Jessica Knoerzer, NORC ([email protected]); Lance Selfa, NORC ([email protected]); Lynn Milan, National Science Foundation ([email protected]); Karen Grigorian, NORC ([email protected])

Race/ethnicity is a key sampling stratification variable used in the Survey of Doctorate Recipients (SDR), a biennial multi-mode panel survey of more than 40,000 doctorate recipients in the science, engineering, and health fields, sponsored by NSF and NIH. 95% of SDR sample members have reported race/ethnicity in either the SDR or the Survey of Earned Doctorates (SED), the federally sponsored census of research doctorate recipients, which serves as the SDR sample frame.

For those who have not reported race/ethnicity in either the SED or SDR, logical imputation methods have been used to assign sample members to a racial group during sampling stratification. During data collection, sample members responding online are asked the race/ethnicity questions.

The logical imputation methods are applied hierarchically. The first two methods evaluate surnames: those with a Hispanic surname are assigned to the Hispanic-white racial group, and those with an Asian surname are assigned to the non-Hispanic Asian racial group. Third, birth place is used to logically impute race/ethnicity. Fourth, if a sample member previously responded to the survey and did not report race, the prior hot-deck imputed racial group assignment is retained. If none of this information is available, the case is imputed to be non-Hispanic white for sampling.
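To make the hierarchy concrete, a schematic sketch is given below; the surname and birthplace lookup lists, the field names, and the group labels used for the birthplace rule are hypothetical stand-ins rather than the actual SDR/SED processing rules.

```python
# Illustrative sketch of the hierarchical assignment described above;
# surname lists, birthplace lists, and field names are hypothetical.
HISPANIC_SURNAMES = {"garcia", "martinez", "rodriguez"}   # placeholder list
ASIAN_SURNAMES = {"nguyen", "kim", "patel"}               # placeholder list
ASIAN_BIRTHPLACES = {"china", "india", "south korea"}     # placeholder list
HISPANIC_BIRTHPLACES = {"mexico", "cuba", "puerto rico"}  # placeholder list

def assign_race_for_sampling(record):
    """Return a race/ethnicity group for sampling when none was reported."""
    surname = record.get("surname", "").lower()
    birthplace = record.get("birthplace", "").lower()

    # 1-2. Surname-based rules
    if surname in HISPANIC_SURNAMES:
        return "Hispanic white"
    if surname in ASIAN_SURNAMES:
        return "non-Hispanic Asian"

    # 3. Birthplace-based rule (mapping to groups is an assumption here)
    if birthplace in ASIAN_BIRTHPLACES:
        return "non-Hispanic Asian"
    if birthplace in HISPANIC_BIRTHPLACES:
        return "Hispanic white"

    # 4. Retain a prior hot-deck assignment if one exists
    if record.get("prior_hot_deck_race"):
        return record["prior_hot_deck_race"]

    # 5. Default when no information is available
    return "non-Hispanic white"
```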

This paper will evaluate the efficacy of the imputation methods used to assign race/ethnicity for sampling. From the 2003-2008 survey rounds, a dataset of 1,696 cases will be analyzed, using methods such as inter-coder reliability measures of association, to compare imputed race/ethnicity data to sample members’ race/ethnicity data as ultimately reported on the survey. We will estimate the accuracy of the imputation methods, including estimating the odds of correct prediction by each method.

Issues in Cell Phone Surveys

Cell Phone Operational Efficiencies for a Survey of Young Adults Ashley Mark, ICF International ([email protected]); Randal ZuWallack, ICF International ([email protected]); Cristine Delnevo, University of Medicine and Dentistry of New Jersey ([email protected]); Daniel Gundersen, University of Medicine and Dentistry of New Jersey ([email protected]); Michelle Bover Manderski, University of Medicine and Dentistry of New Jersey ([email protected])

This presentation describes operational features used to improve the efficiency of the National Young Adult Health Survey (NYAHS), a cell phone survey conducted with adults ages 18-34. The AAPOR Cell Phone Task Force Report states that a break-off may result from a dropped call and that these cases should be handled differently than traditional break-off refusals. To address this issue, we instituted an option for the interviewer to immediately call back the number if disconnected. Interviewers used their discretion to determine whether the call was "dropped" or whether the respondent hung up. Interviewers were able to reconnect with respondents 30% of the time after invoking the call-back. Another operational efficiency instituted for the NYAHS involved administering a $10 gift card to respondents who completed the survey. When the survey began, the primary mode of transmitting the gift card number or code was verbally over the phone. Text messaging the code was also an option, but only under certain conditions, and therefore most codes were delivered over the phone. Sending the code via text message is considerably more efficient. To prompt greater use of the text message option, we modified the script to offer text messaging as the primary mode of transmitting the gift code. This dramatically increased the number of respondents who opted for text messaging, and we were able to cut the closing time in half after making the script changes.

Cell Phone Usage in the United States - Estimation from the 2010 Behavioral Risk Factor Surveillance System (BRFSS) Pranesh P. Chowdhury, CDC ([email protected]); Carol Pierannunzi, CDC ([email protected]); Machell Town, CDC ([email protected]); Lina Balluz, CDC ([email protected])

According to a 2010 report based on the NHIS, 23.9% of adults and 27.5% of children in the United States lived in cell phone-only households during 2010. In the past, researchers have relied on modeled estimates of cell phone usage, because population data on cell phone usage were lacking. In order to estimate cell phone usage among adults in each U.S. state and the District of Columbia (DC), we used 2010 combined landline and cell phone data from the Behavioral Risk Factor Surveillance System (BRFSS). We compared state-level estimates from BRFSS with modeled estimates from NHIS. In addition, we calculated the percentage of each state's sample that resulted in calls reaching residents of other states, and the percentage of calls resulting in phone contact with minor children. Our results indicate that the percent of cell phone-only adults ranged from 16.7% in Pennsylvania to 35.8% in DC. State-specific BRFSS estimates were positively correlated with NHIS estimates (correlation coefficient 0.68). We calculated the percent of cell phone contacts from each state's sample who were no longer residents of the state (moved out-of-state sample) and the percentage of completed cell interviews for each state's residents that were completed by calling sample from other states (moved into state sample). The percentage of out-of-state interviews ranged from 2.5% in Alaska to 47.1% in DC (median: 6.2%). The percentage of interviews of persons who had moved into a state ranged from 1.3% in Nebraska, Rhode Island and Utah to 35.1% in Maryland (median: 7.4%). The percent of cell phone contacts who were not adults ranged from 5.4% in Alabama and Washington to 12.7% in Louisiana (median: 8.5%).
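As a hedged sketch of the state-level comparison (the input file and column names are hypothetical, not the BRFSS data):

```python
# Illustrative sketch only; the file and columns are hypothetical.
import pandas as pd

states = pd.read_csv("cell_phone_by_state.csv")
# columns: state, brfss_cell_only_pct, nhis_cell_only_pct, out_of_state_pct

print(states["brfss_cell_only_pct"].agg(["min", "max"]))                 # range of cell-only estimates
print(states["brfss_cell_only_pct"].corr(states["nhis_cell_only_pct"]))  # Pearson correlation across states
print(states["out_of_state_pct"].median())                               # median out-of-state percentage
```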

The Telephone Point of Purchase Survey Cell Phone Hit Rate Test Aniekan Okon, Bureau of the Census ([email protected]); James Arthur, Bureau of the Census ([email protected])

The Telephone Point of Purchase Survey (TPOPS) is a random digit dialing (RDD) computer-assisted telephone interview (CATI) survey. Conducted by the Census Bureau on behalf of the Bureau of Labor Statistics (BLS), it collects information from selected households on what people buy and where they buy it. The information collected in the TPOPS, along with information from the Consumer Expenditure Household and Diary Surveys, which are also conducted by the Census Bureau, is used by the BLS to update the Consumer Price Index (CPI).

Currently, the TPOPS includes only landline phone numbers in its sample frame. This presents a coverage issue, since the TPOPS does not collect outlet and expenditure data from cell phone-only households. To alleviate this problem, the BLS plans to add a cell phone frame in April 2012. In April 2011, the Census Bureau conducted a test to determine: 1) the number of cell phone numbers needed to yield one productive interview, 2) the accuracy of cell phone numbers in relation to geographic area, and 3) any changes in respondent demographics due to the addition of the cell phone frame.

We presented the methodologies for the TPOPS Cell Phone Frame Test at the 2011 AAPOR Conference. In 2012, the authors will present the results of the test, including the response and completion rates for both the cell phone frame and the landline frame, and offer comparisons.

Cognitive Lessons from Telephone Status Questions. Vincent E. Welch, NORC ([email protected])

Survey researchers have come to understand the importance of including telephone numbers drawn from cell phone frames in their samples to achieve unbiased results. To this end, a number of items intended to assess whether a particular household can be reached only by landline phone, only by cell phone, or by both have been developed. Much is known about the socio-demographic characteristics of households by telephone status. Far less is known about how respondents understand the items used to assess telephone status. To better understand how respondents think about and respond to telephone status items, we conducted a series of focus groups with dual-telephone users. The conversations in these focus groups showed that respondents generally found the current telephone status items clear and easily understood, which does not suggest a need for a dramatic overhaul of these items. However, the conversations highlighted several areas for further inquiry. It was unclear how respondents combined knowledge about their own telephone behaviors with other household members' behavior to provide a household-level telephone behavior response. It was also unclear how respondents differentiate between the various response options (e.g., somewhat likely versus very likely, or some versus most calls received on a landline phone). To follow up on the questions raised by the focus group discussions, the authors conducted a series of cognitive interviews on the current telephone status items. During the cognitive interviews, we used a think-aloud procedure and in-depth probing to explore the process by which respondents retrieve information about telephone behaviors and make judgments about what the response options mean and which is most appropriate. The current work reports on the findings from these qualitative research efforts and explores next steps in understanding the cognitive process of responding to telephone status items on surveys.

Methodological Briefs: Issues in Survey Non-Response

Interviewer Assessments of Response Propensity Stephanie Eckman, Institute for Employment Research ([email protected]); Jennifer Sinibaldi, Institute for Employment Research ([email protected])

Interviewers are in a unique position to obtain information about cases selected for surveys. For this reason, interviewers have in recent years been asked to make area and case observations, such as judgments of socio-economic characteristics and presence of children, to guide field work decisions. In the course of doing their work, interviewers also form judgments about cases’ likelihood of responding to the survey request. In this study, we investigate how interviewers form these assessments of cases' willingness to respond, and the usefulness of these willingness judgments.

For this research, we carried out a telephone survey in the fall of 2011 in Germany (n=2,400). After every call resulting in contact, interviewers recorded an assessment of the case's willingness to complete the survey, using a 100-point scale. Using more than 10,000 such judgments, we investigate how case, respondent and interviewer characteristics influence interviewers' willingness ratings. We find strong evidence of interviewer effects. We then explore whether the willingness ratings are more effective than other information contained in the call history data, such as the number and outcomes of calls, at predicting cases' final outcomes.
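By way of a hedged illustration of how interviewer effects on such ratings might be quantified (this is not the authors' model; the data frame and variable names are hypothetical), a linear mixed model with a random intercept for each interviewer could be specified like this:

```python
# Illustrative sketch only; `calls` and its columns are hypothetical.
import pandas as pd
import statsmodels.formula.api as smf

calls = pd.read_csv("willingness_ratings.csv")
# willingness: 0-100 rating recorded after each contact
# prior_calls: number of earlier calls to the case
# resp_age, resp_female: respondent characteristics
# interviewer_id: grouping variable for the random intercept

mixed = smf.mixedlm(
    "willingness ~ prior_calls + resp_age + resp_female",
    data=calls,
    groups=calls["interviewer_id"],
).fit()
print(mixed.summary())
# The variance of the interviewer random intercept indicates how much of the
# variation in ratings is attributable to interviewers rather than to cases.
```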

The Effect of Events between Waves on Panel Attrition Mark Trappmann, Institute for Employment Research (IAB) ([email protected])

Panel surveys suffer from attrition. Most panel studies use propensity models to correct for non-random dropout. These models draw on variables measured in a previous wave or taken from the paradata of the study. While it is highly plausible that events (or changes in attributes) between waves affect contactability and cooperativeness, panel studies usually cannot assess their impact on attrition. The amount of change in the population could be dramatically underestimated by panel studies if such events had an effect on participation in the subsequent wave.

The panel study PASS is a novel dataset in the field of labor market, welfare state and poverty research in Germany. In PASS, survey data on the employment and unemployment histories, income and education of participants can be linked to corresponding data from respondents' administrative records. Thus, change can be observed in the administrative data for attritors as well as for continuing participants.

In our presentation we will use the combined PASS and administrative data to show that change in variables like household composition, employment status or receipt of unemployment benefits has an influence on contact and co-operation rates in the following wave and that this still leads to biased estimates of the amount of change after the propensity weights of the survey are applied. We will then show whether the inclusion of variables which are usually not used in propensity models but might predict change (e.g. indicators for quality of partnership, job satisfaction, job search intensity) can reduce this bias.

An Examination of Cohort Retention Efforts on the National Survey of Child and Adolescent Well-Being Jennifer W. Keeney, RTI International ([email protected]); Melissa Dolan, RTI International ([email protected]); Orin Day, RTI International ([email protected]); Keith Smith, RTI International ([email protected]); Alison Kowalski, RTI International ([email protected])

Cohort maintenance is essential to the success of longitudinal studies. In order to maximize retention and reduce nonresponse, studies often incorporate panel maintenance activities designed to track respondents over time. This presentation will examine the effectiveness of an intensive panel maintenance plan used on The National Survey of Child and Adolescent Well-Being (NSCAW).

The NSCAW is sponsored by the Administration for Children and Families and is a Congressionally mandated study designed to collect nationally representative longitudinal data from children and families who have had contact with the child welfare system. Locating sample members is important on the NSCAW given an at-risk sample and the likelihood of changes in the child's caregiver or placement setting. Two waves of interviews, approximately 18 months apart, have been completed and a third wave is underway. In order to maintain contact with cohort members between waves, each family receives a panel maintenance package nine months after the completion of their interview. The package contains a letter thanking the family for their participation in the previous wave and a postage-paid address update postcard. Families are asked to return the postcard or call a toll-free number to confirm or update their contact information. Packages returned as undeliverable are sent to an in-house tracing unit for locating assistance.

This presentation will summarize the results of between-wave panel maintenance efforts on response rates. Our analysis will measure the success of retention by examining interview completion rates for caregivers who responded to the package compared to rates for caregivers whose packages were undeliverable. We will examine the extent to which child and family characteristics such as income and the child’s placement setting impact the success of retention efforts. A cost analysis will determine if panel maintenance activities implemented between waves reduced the need for more expensive in-house tracing during data collection.

Best Approaches to Mode Order and Non-response Prompting in a Multi-Mode Survey Jocelyn Newsome, Westat ([email protected]); Kerry Levin, Westat ([email protected]); Pat Dean Brick, Westat ([email protected]); Brenda Schafer, Internal Revenue Service ([email protected]); Melissa Vigil, Internal Revenue Service ([email protected])

Although survey researchers often use mixed-mode surveys to help reduce particular forms of survey error, speed up data collection, or lower costs (de Leeuw, 2005; Pierzchala, 2006), current research is unclear about which sequence of modes is most effective. Once the survey is in the field, it is also not clear which method of non-response prompting is best. Reminder postcards are a well-accepted strategy for prompting non-respondents to complete questionnaires (Dillman et al., 2008), although recent research suggests an automated phone message can be an effective (and cheaper) prompting tool (Census, 2004). However, it is not clear which method results in the highest response rate (McCarthy, 2007; McCarthy, 2008). Furthermore, there is little literature on the efficacy of live interviewer prompting as compared to automated messages or postcards. In order to address these questions, experiments were embedded in the administration of the 2010 IRS Individual Taxpayer Burden Survey. The experiments compared mode sequence as well as non-response prompting. At the initial contact, approximately one-quarter of the sample received a mailed hard-copy invitation to the web survey, while the remainder received the paper survey. A reminder prompt for the entire sample was then followed by a mailing of the paper survey to all non-respondents. For the final contact, 40% of non-respondents received a paper survey by express mail, 20% received an automated telephone prompt, and the remaining 40% received a telephone prompt from a human interviewer. Our analysis will examine the results of the experiment to answer two questions: (1) Which mode should we offer first to maximize response rates and minimize costs? and (2) Which final method of non-response prompting yields the highest response rate at the lowest cost? Whenever feasible, we will include demographic variables in the analysis to determine which contact strategies are most effective for which groups.
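As a hedged sketch of the 40/20/40 final-contact allocation described above (the case IDs and seed are hypothetical; this is not the study's actual assignment code):

```python
# Illustrative sketch of a 40/20/40 random allocation of nonrespondents.
import numpy as np

rng = np.random.default_rng(seed=2012)   # hypothetical seed for reproducibility
nonrespondent_ids = np.arange(1, 1001)   # hypothetical case IDs

conditions = rng.choice(
    ["express_mail_paper", "automated_phone_prompt", "interviewer_phone_prompt"],
    size=nonrespondent_ids.size,
    p=[0.40, 0.20, 0.40],
)
assignment = dict(zip(nonrespondent_ids.tolist(), conditions.tolist()))
```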

Increasing Mail Survey Response Using Automated Phone Call Reminders (Robocalls) Michael D. Kaplowitz, Michigan State University ([email protected]); Frank Lupi, Michigan State University ([email protected]); Scott Weicksel, Michigan State University ([email protected]); Min Chen, Michigan State University ([email protected])

Mail surveys remain an important and cost-effective approach for research, especially general population studies. Reminder contacts continue to be perhaps the most important technique for producing high response rates in mail surveys. At the same time that Web-based and other new survey methods (e.g., smart phone, SMS) benefit from technological advances, mail survey methods may also benefit from technological innovations. New technologies and data availability now enable identification of telephone numbers associated with names and addresses. Advances in programming, Internet phone calls, and automation have also enabled low-cost prerecorded phone calls (robocalls). This presentation reports on the use of robocall contacts as reminders in conjunction with the mailing of replacement surveys to increase mail survey response in a general population study. The study results are based on a mail survey of 32,230 Michigan residents randomly drawn from the state's driver license list. For mailing wave 2 (the first replacement survey after the initial wave 1 contact), nonrespondents were assigned to one of four treatment groups: (1) no telephone number match available, (2) pre-notice robocall, (3) reminder robocall, and (4) a control group with a telephone number available but not called. A similar procedure was used for wave 3 (the second replacement survey). We found that members of the general public with phone numbers in public records are more likely to respond to a mail survey request than members of the public without such phone numbers (illustrating the importance of our control group). We found that the use of automated phone calls significantly increased response to our mail survey relative to our control group. A reminder robocall (placed after a replacement survey has been received) was more effective in increasing response than a pre-notice robocall (placed before the replacement survey is received).

Increasing the Student Response Rate to University Sponsored Survey Research Eric Jenson, Brigham Young University ([email protected]); Danny Olsen, Brigham Young University ([email protected]); Steve Wygant, Brigham Young University ([email protected])

Recent technology advances have provided researchers of all skill levels with easy to use tools to create and send surveys. These technology tools have allowed a flood of surveys to quickly and easily be sent to potential respondents. Although beneficial to informed decision-making, the influx of surveys can fatigue potential respondents and drown out critical surveys.

At a large, private US-based university, the proliferation of surveys to students has endangered the response rates to critical institutional surveys required for accreditation and high-level institutional needs. Amid the deluge of survey invitations students receive, they give higher priority to surveys they deem more important. University researchers are considering how to influence students' perceptions of a survey's importance. Researchers hypothesized that changing the named sender or sponsor from the director of university assessment to the president of the university would lead students to view the survey as more important. The findings indicate students do appear to ascribe a higher importance to the survey invitation when it is sent by the university president, but this effect does not hold for all types of students. The proposed presentation will address the differences observed across types of students and explain the results through social exchange theory and helping behavior theories.

Data for this research were obtained by delivering surveys using the name of the director of assessment as the sender. Following the same protocol, the same survey was sent to the next cohort using the name of the university president as the sender. The response rates and responses provided the data used in the analysis.

New Frontiers: Data Collection Using Smartphones and Other Mobile Devices

Assessing Data Quality and Respondent Compliance in a Smartphone App Survey Lorelle Vanno, The Nielsen Company ([email protected]); Jennie W. Lai, The Nielsen Company ([email protected]); Michael W. Link, The Nielsen Company ([email protected])

The world of communication is changing quickly with the advent of new technologies such as smartphone applications (apps). This mode of communication may also serve as a new mode for survey research; however, there is little to no research on data quality or respondent compliance over the course of a survey period in this emerging mode. To fill this void, Nielsen developed a data collection app to learn about potential measurement differences between the traditional format of data collection and data collection through an app. Nielsen's TV measurement service has traditionally been conducted through either a metering device connected to the respondent's television or a paper-based diary. In 2011, Nielsen began developing a smartphone app to test collecting TV viewing data in this new medium. The pilot will take place in Q1 2012 and will last six weeks, collecting TV viewing data and user compliance data over the course of the survey period. Nielsen will seek to assess data quality throughout the data collection period as well as respondent compliance by assessing data entry variation across time, item non-response, length of time spent within the app, completion of surveys, responses to reminders, and qualitative insights gained through debriefing interviews. This learning can provide researchers with a greater understanding of how to use a smartphone application as an emerging survey mode.

A Focus Group Pilot Study of the Use of Smartphones to Collect Information about Health Behaviors Shanta Dube, Centers for Disease Control and Prevention ([email protected]); Sean Hu, Centers for Disease Control and Prevention ([email protected]); Naomi Freedner-Maguire, ICF MACRO ([email protected])

The evolution of mobile communications technologies provides a unique opportunity for innovation in public health surveillance. Text messaging and smartphone web access are immediate, accessible, and confidential, a combination of features that could make them ideal for ongoing research, surveillance, and evaluation of risk behaviors and health conditions. The purpose of this feasibility study is to explore the perceived feasibility, advantages and disadvantages of conducting a population-based survey via smartphone. A deeper understanding of factors that promote and hinder participation will be useful in creating a population-based smartphone pilot survey. Two focus groups, with participants aged 18-34 and 35-65, were conducted. The semi-structured, researcher-facilitated discussions covered the pros and cons of conducting a population-based survey via smartphone, barriers to and facilitators of using this novel data collection mode, and other issues. Audiotapes of the group discussions were transcribed and analyzed qualitatively. Further, we will determine and describe the technological feasibility: whether and under what circumstances smartphones can be used to collect population-based public health and behavior data. As mobile communications continue to evolve, a better understanding of how smartphones can be used to collect data on risk behaviors and health conditions is critical to public health surveillance and evaluation efforts.

Gathering User Experience on Metering Technology for iPhone/iPad Users Kelly L. Bristol, The Nielsen Company ([email protected]); Tom Wells, The Nielsen Company ([email protected]); Michael W. Link, The Nielsen Company ([email protected])

The growing penetration of smartphones and tablets in the US has generated a great deal of interest in the survey research community. These technologies present new opportunities for respondent interaction and general data collection. Passive electronic metering in particular allows for a rich accumulation of device usage information that can provide insight into how individuals use smartphones and tablets. As with any new method of data collection, it is vital for researchers to understand the user experience in order to maintain respondent compliance. In April and May of 2011, Nielsen conducted an employee test of its On Device Meter (ODM) for the iOS operating system to 1) gauge user experience and 2) evaluate the accuracy of the meter. This meter, when installed on iDevices such as an iPhone, iPad, or iPod Touch, tracks the applications used, the content viewed and listened to, and the websites visited on the device. A total of 30 people installed the meter on their devices and participated in data collection activities. Participants included 22 iPhone users (15 cellular network users and 7 wi-fi network users) and 8 iPad users. The study consisted of three phases of data collection. First, participants were asked to complete a brief user experience survey after installation of the ODM. Then, they were asked to use the device and keep a log of their activities to validate meter accuracy. Finally, participants were sent an exit survey and invited to attend a debrief session after testing. Primary respondent concerns were device functionality and privacy issues. Furthermore, about half of the participants were aware of the meter despite its passive nature. In some instances, changes in behavior were reported as a result of participant concerns. Reported here are the results of all three test phases and thoughts for future research.

The Effectiveness of Collecting and Transmitting Data Via Cell Phones in Poor Cell Coverage Areas Courtney N. Mooney, Arbitron ([email protected]); Arianne Buckley, Arbitron ([email protected])

Arbitron developed an electronic Portable People Meter (PPM) that automatically detects audio exposure to encoded radio signals. Arbitron asks panelists to wear their meter every day in order to measure media exposure. Currently, panelists' data are stored in the meters until they are sent to Arbitron via the panelists' home telephone line. This requires the panelists' meters to be at home and in their charging devices in order for the data to be sent to Arbitron. Arbitron must receive each day's data within seven days for a panelist's media data to be included in the ratings.

Recently, Arbitron developed a new meter that contains an internal cellular system. Through its cellular system, the meter sends data to Arbitron anytime and anywhere as long as there is cellular service. For most panelists, the new meter should not have a problem communicating with Arbitron; however, some panelists live in areas with poor cellular coverage.

To assess the feasibility of data transmission from panelists in low cellular coverage areas, in March 2011 Arbitron conducted coverage tests in four metropolitan areas classified by a cellular provider as having the lowest percentage of zip codes with at least "fair" coverage. Research staff were each provided with five PPMs, a cell phone, and a GPS unit and were instructed to drive to low cell coverage zip codes within these metropolitan areas. While in these zip codes, staff recorded the time and the number of bars on their cell phones, and the number of times each PPM connected with Arbitron was monitored.

The results of this test have implications for survey research organizations attempting to reach respondents in poor cell coverage areas or use technology to collect and transmit data from the field.

Questionnaire Design: Experiments on Demographic Questions

Effects of Conceptual Variability Among Response Category Options on Classification of Employment—Implications for Data Quality Scott Fricker, Bureau of Labor Statistics ([email protected])

Close-ended survey response options need to be exhaustive, mutually exclusive, and well understood by respondents, but differential conceptual complexity in the response categories can sometimes make the response process difficult and reduce data quality. The present study demonstrated this effect by examining respondent classification decisions in a class-of-worker (COW) question that asked respondents to select one of four employment categories: government, private company, non-profit organization, or self-employed. Study participants (n=90) were administered a series of narrative vignettes describing the employment situation of fictional individuals, and then asked to classify each individual using two different groupings of the COW classification. One half of the sample made choices from among the entire set of four response options, while respondents in the other half-sample were presented with only the "self-employed" option as a yes-or-no choice. The vignettes were then presented a second time, and everyone classified the same jobs into just three employment categories: government, private, or non-profit. The data show that the stability and accuracy of respondents' answers were highly dependent on the set of response options provided. We focus particular attention on the self-employed class of worker, which appears conceptually distinct from the other three classes and for which we observed the highest number of classification errors. We also examine the impact of including conceptually variable response categories in the set of response options of a single close-ended question. The results of this study are discussed in the context of cognitive theories of concept formation and categorization, and for their broader implications for questionnaire designers.

Religious Identification: The Impact of Survey Questions on Estimates of Religious Groups David Dutwin, Social Science Research Solutions ([email protected]); Robyn Rapoport, SSRS ([email protected]); Ron Miller, JPAR ([email protected])

There is a long tradition of survey-based research on religion and religious beliefs, and this literature has grown significantly in the past decade. Large-scale studies of religion (ARIS, the Pew Religious Landscape, and others) have successfully depended upon a typical open-ended question to determine religion (for example, "What is your religion, if any?"). But there is concern in a number of religious research circles that this question may elicit a very different response than would questions asking specifically about affiliation with any one religion.

We conducted a randomized experiment in the Chicago Jewish Population Survey to determine the difference in Jewish incidence when asking whether the respondent considers him/herself to be Jewish, compared to asking the respondent directly about their religion. We find a substantial 25% increase in the number of respondents who say they are Jewish overall, compared to respondents who identify as Jewish specifically by religion. Subsequently, we find a range of differences among Jews in the two experimental screener samples, including by denomination, Jewish education, and Jewish practices.
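To give a concrete sense of the kind of comparison implied by the reported relative difference, a hedged sketch with purely hypothetical counts (not the survey's actual figures) might use a two-proportion z-test:

```python
# Illustrative sketch only; counts are hypothetical, not the Chicago survey's data.
from statsmodels.stats.proportion import proportions_ztest

jewish_identify = [250, 200]   # screened-in counts under the two question wordings
screened = [5000, 5000]        # screener sample sizes per experimental arm

stat, pval = proportions_ztest(count=jewish_identify, nobs=screened)
print(round(stat, 2), round(pval, 4))   # z statistic and two-sided p-value
```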

We also conducted a number of experiments targeting other religious groups, such as Catholics, in our omnibus survey, again randomly asking first whether respondents considered themselves Catholic versus asking a standard religion question. We find differences across Catholics and a number of other religious groups. This paper reports the results of these experiments and draws conclusions about the impact of survey screening strategies on data for specific religious groups.

Collecting Information About Every Kind of Household from a Self-Administered Questionnaire Sarah K. Grady, American Institutes for Research ([email protected]); Jeremy Redford, American Institutes for Research ([email protected])

One major challenge of questionnaire design is collecting information about familial relationships in self-administered questionnaires. Whereas data collections with interviewers allow for a dialogue between interviewer and respondent that can help clarify familial relationships, self-administered questionnaires rely on sound item development to capture the nuances of ever-diversifying family structures. The National Household Education Surveys Program (NHES), undergoing a redesign from an RDD telephone survey to an ABS mail survey, conducted methodological experiments, including split panels of important questionnaire items, in its 2011 Field Test. The survey collected information about the education, care, and household characteristics of a sampled child. Sections of the questionnaire that were designed to collect information about children's parents (called "Mother/Female Guardian" and "Father/Male Guardian" sections) were adapted to "Parent 1" and "Parent 2" sections in order to accommodate a diversity of family structures. The marital status item was of particular importance because of the attempt to rewrite it such that same-sex parent households could report familial relationships more easily. This paper will compare the quantitative data from these two sections to assess which household characteristics section was more effective. Specifically, the paper will address: 1. How do the two sections compare in terms of descriptive statistics? Where are the differences in the data collected? 2. Comparing data to the ACS, which section gathered more reliable information about households? 3. Which marital status questionnaire item performed better? Which item identified more same-sex parent households? Which item identified more nontraditional household structures? 4. Did the change in questionnaire items introduce measurement error in the form of item nonresponse?

Demographic Question Placement and Its Effect on Item Response Rates and Means of a Veterans Health Administration Survey Robert Teclaw, VHA National Center for Organization Development ([email protected]); Mark Price, VHA National Center for Organization Development ([email protected]); Katerine Osatuke, VHA National Center for Organization Development ([email protected])

There are various opinions about the most advantageous location of demographic questions in questionnaires; however, the issue has rarely been examined empirically. This study uses an experimental design and a large sample to examine whether demographic question placement affects demographic and non-demographic question completion rates, non-demographic item means, and blank questionnaire rates in a web-based survey of Veterans Health Administration employees. Data were taken from the 2010 Voice of the Veterans Administration Survey (VoVA), a voluntary, confidential, web-based survey offered to all VA employees. Participants were given one of two versions of the questionnaire: one version had demographic questions placed at the beginning, and the second had demographic questions placed at the end of the questionnaire. Results indicated that placing demographic questions at the beginning of a questionnaire significantly increased the item response rate for demographic items without affecting the item response rate for non-demographic items or the average of item mean scores. This research has implications for surveyors who, in addition to ensuring measure validity, set data collection goals to maximize response rates and minimize the number of missing responses. It is therefore important to determine which questionnaire characteristics affect these values; the results of this study suggest that demographic question placement is one such factor.

Saturday, May 19, 2012 10:00 a.m. - 11:30 a.m. Concurrent Session H

Confidence and Trust in Institutions

Trust in American Government: Assessing the Longitudinal Measurement Equivalence in the ANES, 1964-2008 Dmitriy Poznyak, University of Cincinnati ([email protected]); George F. Bishop, University of Cincinnati; Bart Meuleman, University of Leuven

Trust in the federal government has become a key dimension of democratic theory as well as a staple in the diet of empirical political science in the USA. For fifty years (1958-2008) the American National Election Studies (ANES) program has been assessing citizens’ evaluations of the “government in Washington” using what has come to be known as the political trust index. A critical assumption in using such data for longitudinal research is that the meaning-and-interpretation of such items should be comparable across groups of respondents at any one point in time and across samples over time. Using multigroup confirmatory factor analysis (MGCFA) for ordered-categorical data, this investigation systematically tests this measurement invariance or equivalence assumption with data collected by the ANES from 1964-2008. The results confirm that the ANES political trust scale has the same basic factorial structure over time. But only partial measurement equivalence can be established for the same scale, indicating that the meaning-and-interpretation of some items, especially the question about whether the “government in Washington” wastes money people pay in taxes, varies significantly over time. Confirmation of a partially equivalent model, however, suggests that latent means and proportions remain comparable over the ANES time series from 1964-2008. The latent factor score approach is thus recommended as a superior psychometric alternative to the conventional use of summated ratings in constructing the ANES political trust scale.

Trust at the Federal, State, and Local Levels: An Examination of the Similarities and Differences Dean E. Bonner, PPIC ([email protected]); Mark Baldassare, PPIC ([email protected])

Trust in government plays an important role in a representative democracy because of the interplay between citizen expectations and their willingness to allow government to act on their behalf. When expectations are not met, this can lead to discontent and potentially a decline in the legitimacy of government institutions. The decline of trust in the federal government nationwide is well documented, and trust at both the state and federal level in California has been consistently low in Public Policy Institute of California (PPIC) Statewide Surveys over the last 10 years. Policy devolution, which has become more and more common since the 1990s, makes trust in state and local government even more important. Recent devolution from the state to the local level of a variety of programs in California further enhances the importance of trust at the local level. In recent PPIC surveys, Californians trusted their local government more than they trusted the state government in Sacramento or the federal government in Washington. Using data from recent PPIC surveys, this paper will examine the similarities and differences in the three dimensions of trust—efficacy, efficiency, responsiveness—at the federal, state, and local levels. To gain a deeper understanding of trust at the state and local levels, we will examine the impact that state and local trust has on perceptions and preferences regarding the state-local relationship. Further, by utilizing PPIC data we will be able to analyze Latino subgroups (e.g., age, education, income, immigrant status) as well as subgroups among independents. The examination of these two groups will be especially relevant considering the national implications for the 2012 election.

Is Confidence Really Declining? The Canadian Case Isabelle Valois, Université de Montréal, Département de sociologie ([email protected]); Claire Durand, Université de Montréal, Département de sociologie ([email protected]); John Goyder, University of Waterloo, Department of Sociology ([email protected])

Institutional confidence and social trust have been the topic of numerous empirical studies in the past thirty years. Researchers usually support the thesis of a decline in confidence in institutions and in social trust since the 1960s. However, the decline thesis has been criticized on the basis of its methodological and theoretical shortcomings. Using Canadian data, this research aims at revisiting the decline thesis. It hypothesizes that institutional confidence may be more stable and resilient than previous studies have suggested. Because of its geopolitical situation and its social and cultural diversity, which tends to be geographically concentrated, Canada offers a rich and interesting field in which to study the evolution of institutional confidence. Prior findings suggest that trust and confidence vary greatly across regions (Kazemipur, 2006; Helliwell, 1996; Stolle & Uslaner, 2003; Valois et al., 2010). The data used in this research come from 36 surveys conducted in Canada between 1974 and 2008. Sources include the Canadian Election Study, the Project Canada series, and the Environics Research Group series. The research examines the evolution of institutional confidence in Canada within that period according to region. The institutions examined include political, economic and judicial institutions as well as the educational system and the media.

Public Confidence in Social Institutions and Media Coverage: A Case of Belarus Dzmitry Yuran, University of Tennessee ([email protected])

Social scientists agree that public confidence in social institutions is a crucial element in building a democratic society. It is of special importance for transitional societies, including post-communist countries, because a lack of public confidence in newly emerged democratic institutions can interfere with democratic development. Existing theories explaining public confidence in social institutions tend to ignore the role that mass media play in building public confidence. The goal of this study is to examine the connection between mass media coverage of social institutions and public confidence in these institutions in Belarus by conducting a content analysis of newspapers, analyzing the results of public opinion polls, and exploring the links between coverage of social institutions and trust in them. Four institutions were chosen: two institutions with high levels of confidence representing the state (the president and the military) and two institutions with low levels of confidence representing civil society (independent labor unions and opposition political parties). Results demonstrated a noticeable connection between media coverage and public confidence in social institutions. Content analysis showed that state-controlled newspapers cover President Lukashenko extensively, presenting him within explicitly positive frames. As polls demonstrate, the president enjoys a high level of confidence among people who trust state-controlled media. On the other hand, independent newspapers present Lukashenko differently: he is depicted as a dictator and an ineffective leader. According to the polls, people who trust the independent media are less confident in the president. Given that state-controlled newspapers present the president within positive frames and independent newspapers concentrate mostly on his failures and shortcomings, we can see a strong connection between media coverage of Lukashenko and public confidence in him. Examining media coverage and public opinion of other social institutions provided similar results, confirming the connection between media coverage and public confidence found in this study.

Georgia on Their Minds: The Impact of War and Financial Crisis on Georgian Confidence in Social and Governmental Institutions Andrea Lynn Phillips, U of Nebraska - Lincoln, Survey Research and Methodology Program ([email protected]); Davit Tsabutashvili, U of Nebraska - Lincoln, Survey Research and Methodology Program ([email protected]); Allan L. McCutcheon, University of Nebraska - Lincoln ([email protected])

A violent, controversial five-day conflict between Georgia and Russia in August 2008 (the South Ossetia War) was quickly followed by a global financial crisis. These events both dramatically impacted the Georgian people and provided a unique opportunity to assess the impact of both war and an economic downturn on a society's confidence in its social and governmental institutions. Lipset and Schneider's (1983) seminal study of these relationships in the Vietnam era began a series of studies on this topic that has continued through more recent studies of 9/11 and the 2008 financial crisis (e.g., Bloch-Elkon 2007; Uslaner 2010). Opportunities to examine these relationships outside of the United States, however, have been relatively limited.

This paper uses data from the Gallup World Poll – a multinational probability-based survey – collected before and after the South Ossetia War in 2008, 2009, and 2010. Taking into account factors such as media exposure and urbanicity, this paper investigates Georgian confidence in government, the military, financial institutions, and religion over time. Furthermore, potential differential effects of the South Ossetia War and financial crisis on confidence in institutions are analyzed by controlling for respondents’ perceptions of local economic conditions.

According to preliminary analysis, urban confidence in the national government significantly decreased from 35.6% in 2008 to 26.3% in 2009, then rebounded to 50.2% in 2010. Rural-dwellers’ confidence was higher, but followed a similar trend. Confidence in financial institutions also decreased after the war, but moved less drastically for city-dwellers than rural-dwellers, while confidence in religious institutions remained steady or increased over time. This study seeks to explore and explain the patterns in this data, to better understand the impact of financial crisis and controversial war on people’s confidence in their country’s institutions.

Considering Changing Sectors in the Research Industry?: Advice From Those Who Have Done It! Professional Development Session Michael W. Link, The Nielsen Company ([email protected]); Gillian SteelFisher, Harvard Opinion Research Program ([email protected]); John H. Thompson, NORC at the University of Chicago ([email protected]); Ali Mokdad, University of Washington ([email protected]); Paul J. Lavrakas, Independent Consultant ([email protected])

Examining Partisanship and Ideology

The Dynamics of Partisanship Within Election Cycles Curtiss Cobb, Knowledge Networks ([email protected]); Norman Nie, Revolution Analytics ([email protected])

Is party identification immovable and immune from short-term influences, or is it swayed by political dynamics that arise during campaigns and elections? Social scientists have debated the answer to this question for over fifty years while relying on data that measure partisanship between elections. This dissertation incorporates data from two unique sources to offer a look into how partisanship behaves within election cycles. First, I observe individual-level systematic changes in partisanship over the course of the 2008 presidential election campaign using 10-wave panel data covering the 13 months leading up to the election. Partisanship becomes entangled with and confused with vote intention. Furthermore, these changes cannot be accounted for by measurement error alone. Next, I use the same dataset to evaluate the relative stability of partisanship compared to a diverse set of attitudes and find it as stable as, but not more stable than, ideology or evaluations of the president's job performance. Finally, by bringing together an aggregate set of data composed of over 160 high-quality commercial and academic public opinion polls, I am able to corroborate the patterns found in 2008. Overall, partisan intensity decreases closer to elections and increases further away from the glare and mixed messages of a campaign. Implications for how we understand American politics are discussed.

How Much Does the "Moderate" Label Mask Mixed Views? Survey Experiments on Self-Described Ideology Michael Mokrzycki, Consultant, University of Massachusetts Lowell ([email protected]); Jordon Peugh, Knowledge Networks ([email protected]); Stephanie Jwo, Knowledge Networks ([email protected]); Francis Talty, University of Massachusetts Lowell ([email protected])

Surveys frequently ask respondents to self-identify as liberal, moderate or conservative on political matters generally. The "moderate" category can include individuals who consider themselves moderate across multiple dimensions, but also many who hold liberal views on some dimensions and conservative views on others, which raises questions about the validity of this measure (Treier and Hillygus, POQ Winter 2009).

This paper will report the results of split-form experiments conducted in three surveys - a dual-frame RDD survey of Massachusetts registered voters, a probability-based online survey of adults nationally, and exit polls in the 2012 New Hampshire and Ohio Republican primaries - comparing a standard single-question self-assessment of political views "in general" with two separate questions about views "on social issues such as gay marriage and abortion" and "fiscal issues such as taxes and spending."

This research confirms that substantial numbers of individuals are cross-pressured - most frequently, self-describing as liberal on social issues and conservative on fiscal issues. In fact, among leaned Republican registered voters in Massachusetts, more called themselves liberal on social issues and conservative on fiscal issues than identified as moderate on both. Implications for predictions and analysis of American voting behavior will be discussed.

What’s Political about Community Attachment? Political Party Affiliation and Perceived Importance of Community-Level Problems Megan Henly, University of New Hampshire ([email protected]); Tracy Keirns, University of New Hampshire ([email protected])

Previous research (Hamilton 2011) has demonstrated that opinions about environmental issues are highly politicized: that is, Republicans and Democrats view conservation and environmentalism as the domain of liberals. Therefore, Republicans rate environmental issues as less important than Democrats do. Using both national data and region-specific rural data from the Community and Environment in Rural America (CERA) survey, this paper investigates differences between Republicans and Democrats on seemingly non-political issues, such as community attachment and reasons for moving to a community. We find that although some measures show no significant difference by political party identification, several measures do show statistically significant and sizeable differences between Republicans and Democrats. This suggests that respondents may view agreement with seemingly politically neutral survey questions as an indication that they subscribe to either a liberal or a conservative ideology. We discuss the potential role of questionnaire design and of the psychology of survey response in influencing these outcomes.

When Do They Vote For Parties, Rather Than Issues? Hyeonho Hahm, University of Michigan, Ann Arbor ([email protected])

How do voter preferences on issues translate into vote choice? According to spatial theories of voting, voters assess candidates’ positions on the relevant issues with respect to their own views and evaluate candidates based on the similarity between the voter’s positions and those of the candidates. In reality, however, not only do multiple issues exist that voters care about, but the relevance or salience of an issue is also not usually fixed and can instead be primed. Moreover, a voluminous literature has shown that people do not usually pay much attention to politics, nor do they have much knowledge of it. Voters’ cognitive constraints are important underlying conditions that make it difficult for them to calculate the proximity between candidates’ positions and their own preferences on the multiple (and varying) issues they care about. This study hypothesizes that the impact of partisanship on vote choice is likely to increase as the number of contentious issues at hand increases (regardless of their salience). To test this hypothesis, this study will employ a randomized experiment with auxiliary correlational analyses, administered to a representative sample of U.S. adults over the Internet, measuring respondents’ views on various issues.

Are We Really That Liberal? Evidence from the General Social Survey Spending Items Robert W. Oldendick, University of South Carolina ([email protected]); Dennis N. Lambries, University of South Carolina ([email protected]); Chris Werner, University of South Carolina ([email protected]); Edwin Self, University of South Carolina ([email protected])

Results from the recent General Social Survey items related to government spending have shown that the public is much more liberal in its support for various policy areas than would be expected on the basis of reported ideology as well as media accounts of the public’s desire for more limited government. As the U.S. national debt has continued to increase, the public has increasingly identified this issue as an important problem for the country to address. The perceived expansion of the federal government’s power during the first two years of the Obama administration contributed not only to increasing distrust of the federal government but also to a desire for a more limited government role, as exemplified in the emergence of the Tea Party movement. But has the general desire for a more limited government role translated into a consequent willingness among the public to reduce spending on specific programs?

This research examines data from 16 items on government spending that were asked in each of the General Social Surveys conducted from 2000 through 2010 to assess trends in these items over time and to determine how support for different programs varied in response to changes in the party of the president and party control of Congress as well as to perceived changes in national priorities. In addition to investigating these individual items, a multi-item index of support for government spending is created to investigate overall changes on this issue.

The relationship between support for government spending and background characteristics, particularly education, political knowledge, and political ideology, is examined to demonstrate the differences among subgroups and identify those that would be most willing to support reductions in spending for government programs. The potential role of response set bias in the responses to these items is also investigated.

Mixed Mode Methods of Data Collection

The Effect of Mixed Mode Designs on Nonresponse Bias Brian M. Wells, University of Nebraska - Lincoln ([email protected]); Kristen Olson, University of Nebraska - Lincoln ([email protected])

Many recent studies have used mixed mode surveys to help reduce nonresponse rates (e.g., Link & Mokdad, 2006; Dillman, et al., 2009), but few have directly observed the effect that using these mixed mode designs has on nonresponse bias compared to single mode designs. This paper will examine nonresponse bias properties of sequential mixed mode designs compared to single mode designs. To address this question, we use the 2009 Quality of Life in a Changing Nebraska (QLCN) survey, in which a sample of 1,229 Nebraska residents were assigned to one of four experimental groups receiving either a single mode (mail or web) or a sequential mixed mode survey (mail with web follow-up; web with mail follow-up). First, we find significant differences between respondents and nonrespondents in age, marital status, education, and owning a home across experimental groups. Second, we compare those who participated by mail with those who participated by web across the experimental groups and find that receiving any follow-up to a web questionnaire, whether mail or web, significantly changes the nonresponse bias properties of survey estimates. Conversely, receiving a mail or web follow-up to a mail questionnaire produces little change in the nonresponse bias of survey estimates. Finally, we examine differences between early and late respondents within each experimental group and find that there are few differences, with the exception of the web with mail follow-up group. These findings suggest that the primary mode a respondent receives has the largest effect on the change in the nonresponse bias properties of survey estimates.

When More Gets You Less: A Meta-Analysis of the Effect of Concurrent Web Options on Mail Survey Response Rates Jenna Fulton, Joint Program in Survey Methodology, University of Maryland ([email protected]); Rebecca Medway, Joint Program in Survey Methodology, University of Maryland ([email protected])

The increasing popularity of mixed-mode surveys is a testament to their potential benefits relative to surveys conducted in a single mode. Mixed-mode surveys can increase response rates, improve coverage, and reduce survey costs. In particular, researchers increasingly add concurrent web options to mail surveys with the intention of improving overall response rates. However, the existing literature reports inconsistent effects of doing so. Some surveys report significant gains in response rates, while others report a significant detrimental effect compared to a mail-only survey, and still others conclude that adding a concurrent web option does not have a significant effect on response rates. We present the results of a meta-analysis that examines the overall effect of adding a concurrent web option to a mail survey on response rates. We also evaluate the impact of various study features on the size of this effect, such as whether the study has government sponsorship, offers an incentive, or features a topic that is highly salient to sample members. Preliminary results suggest that the addition of a concurrent web option leads to significantly lower response rates as compared to a stand-alone mail survey. Generally, the study characteristics tested have little effect on the magnitude of these findings. However, the negative effect of adding a concurrent web option appears to be slightly smaller for government-sponsored studies, particularly those for which participation is required.
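As a purely illustrative aside, a fixed-effect meta-analysis of this kind can be sketched as an inverse-variance weighted average of study-level response-rate differences; the effect sizes and variances below are hypothetical placeholders, not the authors' data.

```python
# Toy sketch of inverse-variance (fixed-effect) pooling of response-rate differences.
# Per-study effects (mail+web minus mail-only response rate) and variances are hypothetical.
import numpy as np

effects = np.array([-0.04, -0.06, 0.01, -0.03])        # study-level response-rate differences
variances = np.array([0.0004, 0.0009, 0.0006, 0.0005])

weights = 1.0 / variances                               # inverse-variance weights
pooled_effect = np.sum(weights * effects) / np.sum(weights)
pooled_se = np.sqrt(1.0 / np.sum(weights))

print(pooled_effect, pooled_se)                         # pooled difference and its standard error
```

Moderators such as government sponsorship or incentives would typically enter an analysis like this through subgroup pooling or meta-regression.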

Recruitment and Retention in Multi-Mode Survey Panels Allan L. McCutcheon, Univ. of Nebraska - Lincoln, Survey Research and Methodology ([email protected]); Kumar Rao, Nielsen ([email protected]); Olena Kaminska, University of Essex ([email protected])

This study builds on a previously published panel recruitment experiment (Rao, Kaminska, and McCutcheon 2010), extending that analysis to an examination of the effectiveness of pre-recruitment factors such as mode and response inducements on three post-recruitment panel participation effects: attrition rates, survey completion rates, and panel data quality. The panel recruitment experiment, conducted with the Gallup Panel, netted 1,282 households with 2,042 panel members. For these recruited members, we collected data on panel participation and retention, and use it for analysis in this study.

This investigation is an important contribution to the growing body of literature on the use of probability-based panels that use multiple modes (i.e., Web and mail). While previous studies have typically looked at factors affecting panel recruitment, participation, and retention in isolation, we employ an integrated framework for examining the role of these factors on post-recruitment panel participation effects such as attrition, survey completion, and data quality.

Preliminary analysis indicates a number of interesting findings. First, higher panel survey burden (i.e., survey assignment rate) places higher expectations on members to be active participants in the panel, while negatively impacting their survival in the panel. Second, the effect of the incentive at the time of recruitment continues to operate beyond the life of the recruitment phase; those who received the incentive to join the panel completed a greater number of surveys, compared to their non-incentivized counterparts. Interestingly, while incentivizing panel members helps keep them in the panel, it does not have the same effect on their performance in the panel, i.e. on panel survey completion rates. Rao, K., Kaminska, O., & McCutcheon, A. L. (2010). Recruiting Probability Samples for a Multi-Mode Research Panel with Internet and Mail Components. Public Opinion Quarterly, 74(1), 68-84.

Question or Mode Effects in Mixed-Mode Surveys: A Cross-Cultural Study in the Netherlands, Germany, and the UK Edith de Leeuw, Utrecht University ([email protected]); Gerry Nicolaas, Natcen ([email protected]); Pamela Campanelli, The Survey Coach ([email protected]); Joop Hox, Utrecht University ([email protected])

The goal of mixed modes is to combine data from different sources. This assumes that data can be validly combined and that there is equivalence of measurement. Data from different modes may differ because the modes themselves lead to different response processes, or because radically different question formats are employed in different modes. For example, the visual presentation of response lists in self-completion modes (such as web surveys) and in face-to-face interviews with show cards allows the survey designer to use long lists of response options. However, such lists are not feasible in telephone interviews relying solely on verbal communication. When accommodating the restricted channel capacity of telephone surveys, designers often decompose response scales into two or more steps (unfolding or branching), while in modes that use visual presentation the full list of response categories is presented.

In a series of cross-cultural experiments we investigated whether mode or question-format effects cause differences in response between telephone and web surveys, and how large these effects are. The same types of experiments were performed in the Netherlands, Germany, and the UK. In each experiment respondents were randomly assigned to one of two modes: a computer assisted telephone interview or a web survey. Within each mode the same split ballot experiments on question format (full scale in one step vs. branching in two or three steps) were conducted. This design enables us to disentangle question format effects from ‘pure’ mode effects. Furthermore, it gives us the opportunity to investigate whether the findings are generalizable and replicate in different countries.

Making a Match: Exploring the Impact of Mode Preference on Measurement Alian Kasabian, University of Nebraska - Lincoln ([email protected]); Kristen Olson, University of Nebraska - Lincoln ([email protected]); Jolene Smyth, University of Nebraska - Lincoln ([email protected])

The popularity of mixed mode survey designs has led to an increased interest in mode preference. Previous research has shown that offering respondents their preferred mode can increase response rates (Olson, Smyth and Wood 2009), but the effect of mode preference on the quality of survey measurement is still unexplored. In this paper, we examine a variety of experimental questionnaire design manipulations, evaluating whether the quality of data from those who received their preferred mode is better than data quality for those who did not receive their preferred mode. Respondents were asked about their preferred mode and willingness to be recontacted in a 2008 survey, and those who agreed were surveyed again in 2009. The 2009 Quality of Life in a Changing Nebraska survey (n=565, AAPOR RR2=46%) randomly assigned respondents to mail or web modes, and to one of two questionnaires. The two questionnaires systematically varied many elements of questionnaire design (i.e. question order, text box labels, forced choice vs. check all that apply). Almost one quarter (24%) of the respondents were matched with their stated preferred mode from the previous year. Preliminary analyses indicate significant differences in data quality between those who received and did not receive their preferred mode. In particular, respondents who received their preferred mode had few significant differences in the rate of endorsement of items in a check-all versus forced choice format. In contrast, respondents who did not receive their preferred mode were more likely to endorse statements when presented with a forced choice format than with a check-all format, consistent with previous literature, even after controlling for respondent characteristics. These findings may indicate better cognitive processing in a check-all format when respondents receive the mode they prefer. Results provide insight into the impact of mode preference on commonly used questionnaire design features.

New Frontiers: Survey Responses vs. Tweets - New Choices for Social Measurement Frederick G. Conrad, University of Michigan ([email protected]); Michael F. Schober, New School for Social Research ([email protected]); Jeff Hancock, Cornell University ([email protected]); Piet Daas, Statistics Netherlands (CBS) ([email protected]); Roddy Lindsay; Josh Pashek; Brendan O'Connor

This panel brings together AAPOR attendees and selected scholars not currently part of the AAPOR community to focus attention on a set of serious questions in the rapidly changing world of social measurement. As methods of contacting people for surveys expand and as people resist participating in surveys more often (that is, as response rates decline), alternative means of measuring behaviors and attitudes, in particular mining the content of social media, present potentially attractive alternatives—but raise new questions about representativeness and generalizability, as well as about privacy and informed consent. The panel begins to explore (a) if analyses of social media content can provide estimates accurate enough to be used as reliable social measures or published as official statistics, (b) how self-report in surveys and analyses of social media content might complement and supplement each other, and (c) what should inform decisions about which methods to use for which purposes.

Abstract #1: The potential for using social media to enhance public opinion research is no less than transformative. The conversion of much daily interaction and conversation from evanescent spoken word into recordable text-based social media, from text messages to tweets to blogs to Facebook updates, has transformed our ability to analyze language and infer psychological, sociological and even biological dynamics. For example, can attitudes or emotions be inferred from Twitter feeds? Can circadian rhythms be inferred from half a billion tweets from around the world? Can social media be used to inform policy where traditional surveys are impractical? Can online dating companies match partners without ever asking a question? This presentation will describe a framework called Social Language Processing (SLP). SLP integrates automated text analyses resulting from landmark advances in computational linguistics, discourse processes, the representation of world knowledge and corpus analyses. Thousands of texts can be quickly accessed and analyzed on hundreds of measures. These data are mined to identify how language and discourse relate to the social states of people and groups. The SLP paradigm involves three general stages. The first involves identifying linguistic features predicted to correlate with the dimensions under study, e.g. attitudes, emotions, and social and cultural dynamics. The second involves developing methods for automatically extracting the relevant discourse features. In the third stage, the features extracted in the previous step are classified into relevant categories (e.g. pro-revolution or pro-government) using statistical classifiers. After this overview, the talk will present three recent examples of large-scale projects using an SLP approach: 1) examining emotions and attitudes in Libya during the Libyan revolution, 2) uncovering emotional contagion at the population level in the USA using Facebook updates, and 3) looking at how circadian rhythms can be inferred from 530 million tweets from over 80 countries.
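As a rough illustration only, the three SLP stages (choosing linguistic features, extracting them automatically, and classifying texts with a statistical classifier) can be sketched with a minimal word-count classifier; the example texts, labels, and scikit-learn pipeline below are assumptions for illustration, not the authors' implementation.

```python
# Minimal sketch of the three-stage SLP idea: features -> extraction -> statistical classification.
# Assumes scikit-learn; the texts and category labels are invented examples.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = ["the regime must go", "long live the government",
         "freedom for the people", "we support the ministry"]
labels = ["pro-revolution", "pro-government", "pro-revolution", "pro-government"]

# Stages 1-2: word and bigram counts serve as automatically extracted linguistic features.
# Stage 3: a logistic regression classifier assigns each text to a category.
classifier = make_pipeline(CountVectorizer(ngram_range=(1, 2)), LogisticRegression())
classifier.fit(texts, labels)

print(classifier.predict(["the people demand freedom"]))
```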

Abstract #2: This paper reports a study which connected measures of public opinion from polls with sentiment measured from Twitter messages (see O’Connor et al., 2010) over the 2008-2009 period. The survey data concerned U.S. consumer confidence (UMichigan’s Index of Consumer Sentiment and Gallup’s Economic Confidence Index), presidential election polls (Pollster.com’s compilation of 46 polls) and presidential job approval (Gallup’s Daily Tracking Polls). One billion Twitter messages were collected by querying the Twitter API as well as archiving the “Gardenhose” real-time stream. This comprises a roughly uniform sample of public messages, from about 100,000 to 7 million messages per day. Sentiment scores – the ratio of positive to negative messages – were calculated across a range of topics contained in the tweets. In several cases the correlations between the survey data and Twitter sentiment scores are as high as .80, and capture important large-scale trends. For example, the sentiment ratio for the word “jobs” over a two-year period tracks fairly closely with the month-to-month indices obtained from the Michigan and Gallup consumer sentiment polls. But the picture varies in different time periods, suggesting that qualitatively different phenomena are being captured by the text sentiment measures at different times. The results highlight the potential of text streams as a substitute and supplement for traditional polling, although substantial challenges remain: there are important unanswered questions about the accuracy of automatic linguistic analysis, which is not yet sufficiently robust across all domains, and there are questions about the demographic representativeness of Tweeters, which depending on the data source can be difficult to assess. Automatic text analysis as currently used may be more analogous to ethnography and focus groups than surveys, with respect to the problems researchers can address; the question is whether the approach can one day provide credible, quantitative estimates.
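The sentiment-ratio measure and its comparison with a poll series can be sketched in a few lines; the daily message counts and the consumer-confidence values below are hypothetical, not the study's data.

```python
# Rough sketch: daily ratio of positive to negative messages, correlated with a poll series.
# All values are invented placeholders.
import numpy as np

pos_counts = np.array([120, 140, 90, 160, 110])   # positive messages mentioning "jobs", by day
neg_counts = np.array([100, 100, 130, 80, 120])   # negative messages mentioning "jobs", by day
sentiment_ratio = pos_counts / neg_counts          # the sentiment score described above

poll_index = np.array([68.0, 71.5, 64.2, 74.8, 66.9])  # hypothetical consumer-confidence series

print(np.corrcoef(sentiment_ratio, poll_index)[0, 1])  # Pearson correlation between the two series
```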

Abstract #3: Traditionally, sample surveys are used by National Statistical Institutes (NSIs) to collect data on persons, businesses, and all kinds of social and economic phenomena. During the last 30 years, however, more and more NSIs have gradually been replacing these primary sources of information with data available in secondary sources such as administrative registers. Apart from registers there are, however, also other types of secondary data sources available in the world around us that could potentially provide data of interest for producers of statistics. Nowadays, more and more information is captured by the electronic devices that surround us, for example by route planners and mobile phones. In addition, the ever increasing use of the internet causes more and more persons (and companies) to leave their digital footprint on the web. To get a grip on the practical and methodological implications, a number of ‘new’ secondary data sources were studied at Statistics Netherlands. These include measures of product prices on the internet, mobile phone location data, traffic loop information, and Twitter feeds. In this talk I will focus on the Twitter research, which classifies the content of topics under discussion in the Netherlands. The topics found were compared to the various themes Statistics Netherlands is interested in. Automatic and manual classification results revealed that close to 50% of the Dutch Twitter messages collected could be allocated to themes of interest. Classification proved challenging because of the distorting effect of the many Twitter messages ‘of non-statistical interest’ observed. We therefore also looked at ways to filter out the relevant messages prior to analysis. An overview of the results obtained so far and ideas for future research will be presented.

Abstract #4: The last decade has seen an explosion in the amount of unstructured text in blogs and social networks. Paired with rich self-reported metadata, these textual data are tantalizing for researchers looking for quantitative insight into public opinion. They can be mined with Natural Language Processing (NLP) techniques, which allow researchers to quickly count term frequency, extract topics using co-occurrence, and quantify opinion using sentiment analysis. However, the approach is limited by NLP’s limits in disambiguating word-sense, by linguistic and population change, and by model bias. More specifically:
• Percentages of posts containing a term can fluctuate for non-substantive reasons. In 2007, for instance, the frequency of most English words decreased relative to the total posts in the U.S. on Facebook as Spanish bilinguals joined the site in large numbers.
• Sense identification for common, ambiguous terms like “Apple” can become outdated as new senses come into use (such as “Apple bottom jeans”, a lyric from a popular rap song) along with new co-occurrences (such as “iPad”).
• Sentiment analysis models are likely more robust, but are still sensitive to linguistic change (e.g., “sick” and “dope” are sentiment identifiers whose use has changed over time) and cultural context.
An alternative approach exploits the rich metadata about users in social media, allowing the deployment of “lightweight” one-or-two-question, precisely targeted surveys. The ability of a respondent to immediately view aggregate responses leads to an engaging user experience and broadens the population willing to provide input. Demographically and psychographically targeted survey delivery (i.e. via ad servers) allows for powerful sample creation and weighting. Interactive response widgets can be employed. Political opinion surveys delivered over several years on Facebook have shown very strong correlation with long-running phone surveys, and so offer the possibility of continuous, interpretable measurement of opinion across years and even decades.

Abstract #5: The public opinion world has been largely upended by three seismic shifts in recent decades. Declining telephone response rates have called into question the validity of RDD sampling frames; the possibility for large, diverse samples derived from opt-in Internet panels has raised the specter that probability samples may be unnecessary for many social questions; and the preponderance of unsolicited, publicly available discussions scraped from sites like Twitter suggests that surveys themselves may sometimes prove an inefficient tool by comparison. These challenges to traditional survey research threaten the basis for almost a century of survey sampling. The threats they pose, however, depend on a variety of assumptions. This paper outlines the conditions under which each of these public-opinion-gathering methods will produce results that accurately describe the public. The assumptions required for learning about the population from non-probability and web scraping techniques are compared to the assumptions implicitly employed when weighting traditional sample surveys to correct for non-response bias. Further, various data collection methods may be inconsistently accurate when used for different types of inference. The conditions required to produce accurate trends over time and relations between variables differ notably from the conditions necessary to produce population point estimates. The product of these considerations is a clear empirical agenda for exploring how we can best understand the dynamics of modern society amid a changing public opinion landscape.

Non-Response, Reluctant Respondents, and Data Quality

Do We Really Want Everyone? Evaluating the Data Quality of NCS Respondents Based on the Difficulty to Complete an Interview Bradley Parsell, NORC at the University of Chicago ([email protected]); Andrea Mayfield, NORC at the University of Chicago ([email protected]); Lee Lucas, Center for Outcomes Research and Evaluation, Maine Medical Center ([email protected])

Over the past several decades, declining survey response rates and increasing inability to contact sampled respondents have hindered the efforts of researchers to produce high-quality data. This has led to “best practices” for obtaining difficult interviews, including increasing efforts to contact sampled respondents and investing resources to convert refusal cases. Despite the emphasis placed on contacting and gaining the cooperation of respondents, relatively little research has been done to examine the effect of these practices on data quality. Recent research demonstrates that gains in response rates likely come at a cost. Busy or reluctant respondents may engage in poor respondent behaviors and have greater measurement error associated with their responses. There is a lack of evidence indicating that increasing the response rate will have an overall positive impact on the total error of a survey. This raises the question: is it worthwhile to pursue reluctant respondents? Using data from a survey designed to screen eligible women for pregnancy propensity in the National Children’s Study, we examine a number of data quality metrics such as item-level nonresponse and interview length. We look for correlations between these quality metrics and the difficulty of obtaining a completed interview. Further, we explore whether respondent characteristics and other factors contribute to this relationship. These results may be helpful in identifying respondents who may give lower quality data, specifically when their participation is difficult to obtain. Results may also help to inform survey researchers about situations when it may be beneficial to shift resources in order to decrease the total survey error.

Straight-Lining and Survey Reluctance: Prevalence and Implications James Cole, Indiana University ([email protected]); Ashley Bowers, Indiana University ([email protected]); Heather Brummett-Carter, Center for Survey Research ([email protected]); Alex McCormkick, Indiana University ([email protected])

Facing low response rates, burdensome surveys, and new technologies for data collection, survey researchers are increasingly devoting their attention to potential tradeoffs of reducing nonresponse or adding content at the expense of increasing measurement or other sources of error. More specifically, it has been shown that respondents who are initially reluctant may exert less effort in answering items (Miller & Wedeking, 2006; Triplett, 2002), but research that addresses a possible relationship between reducing nonresponse and increasing straightlining in attitudinal research among young survey populations is absent. This study investigated the prevalence of straightlining by the number of follow-up attempts needed to obtain participation and by the placement of the items in the questionnaire in the 2010 National Survey of Student Engagement, a national web survey of 371,616 students enrolled at 573 US colleges and universities. Results indicate that straightlining is rare for early item sets but increases over the length of the survey. Increased straightlining is also associated with completing the survey after a greater number of follow-up attempts and is higher among those of lower academic ability, males, and first-generation college students, who may be more likely to be brought in only through aggressive efforts to increase response rates. Given these findings, we consider possible explanations, including the role of cognitive ability as a common cause of measurement error and nonresponse (Kaminska et al., 2010), and identify difficulties in measuring straightlining, such as the inability to distinguish genuine responses from those that are a result of satisficing. We also discuss implications for survey design in a world in which it is increasingly challenging to survey young populations, who use a wide variety of new technologies both to complete surveys and as part of their daily lives.

A Comparison of Estimates from Respondents Chosen for In-Person Recruitment (IPR) Kelly Dixon, Arbitron ([email protected]); Ryan McKinney, Arbitron ([email protected]); Al Tupek, Arbitron ([email protected]); William Waldron, Arbitron ([email protected]); Beth Webb, Arbitron ([email protected])

Arbitron implemented In-Person Recruitment (IPR) of survey respondents in July 2010. The purpose was to improve the response rate and sample representation for certain demographic groups that were difficult to engage using traditional RDD or mail recruitment methods. In-person recruitment is utilized for a sample of non-responding households from an address-based frame that do not match to a landline telephone number. In-Person Recruitment also allows Arbitron to recruit non-phone households, which, although a small percent of the US population, had not previously been covered in the RDD methodology. Since its implementation, Arbitron has found that targeted in-person recruitment helps improve representation of cell-phone, young, and ethnic respondents.

To determine whether the adoption of IPR is improving the quality of the radio audience estimates, we must determine whether the radio listening of respondents recruited via IPR diverges from that of those recruited via other methods. There are differences in the number of mailings, planned telephone contacts, materials, and incentives for IPR and non-IPR panelists that may contribute to differences in performance. Although IPR and non-IPR differ in the sampling and recruitment processes, they are treated similarly in Arbitron’s weighting and estimation processes for the production of radio listening estimates. We will conduct a statistical comparison of the radio listening levels of IPR and non-IPR respondents with similar characteristics to determine if there is a meaningful distinction between the two groups.

An Investigation of Non-response Error Due to Breakoffs in Telephone Surveys Ana Villar, Independent Consultant ([email protected]); Jon Krosnick, Stanford University ([email protected])

Breakoffs in telephone surveys (unlike in web surveys) have been largely ignored in both research and applied survey contexts. Breakoffs are usually not part of the deliverable that survey agencies provide, and they are not always reported as a separate category in the outcome rates. At the same time, it is possible that there is a substantial number of breakoffs, and that respondents who break off are systematically different from the respondents who complete the survey, increasing nonresponse error. This might be particularly relevant in surveys that ask many questions related to one single topic. People who are not interested in or who dislike that particular topic may be more likely to break off. The omission of these sample units may therefore change the estimates of interest.

In this paper, we investigate two research questions: a) Do respondents of completed interviews hold different opinions towards global warming than respondents who break off? b) Does omitting breakoff interviews have an impact on measures from questions asked later in the survey?

We use data from four telephone surveys on global warming. Breakoff rates in these surveys ranged between 23% and 27%. We address the first research question by comparing estimates for the first survey questions from the complete interviews to estimates from the breakoff interviews. For the second research question, we use a predictive model based on responses to the first 7 questions to impute responses to the last section of the survey, and we simulate the estimates that would have been obtained if all interviews that were started had been completed.
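The simulation step can be illustrated with a simple sketch that fits a linear model on completed interviews and uses it to fill in the later-section item for breakoffs; the model form and all data below are assumptions for illustration, not the paper's actual predictive model.

```python
# Sketch: impute a later-section item for breakoff cases from the first 7 questions,
# then recompute the estimate as if all started interviews had been completed.
# All data are simulated placeholders.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n_complete, n_breakoff = 500, 150

# Responses to the first 7 questions, observed for everyone who started the interview.
X_complete = rng.integers(1, 6, size=(n_complete, 7)).astype(float)
X_breakoff = rng.integers(1, 6, size=(n_breakoff, 7)).astype(float)

# A later-section item, observed only for completed interviews.
y_complete = X_complete.mean(axis=1) + rng.normal(0, 0.5, n_complete)

model = LinearRegression().fit(X_complete, y_complete)   # predictive model fit on completes
y_imputed = model.predict(X_breakoff)                     # imputed values for breakoffs

print(y_complete.mean())                                  # estimate from completes only
print(np.concatenate([y_complete, y_imputed]).mean())     # estimate if all had completed
```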

First Response: Household Portraits by Timing of Response in a Mail Survey Saida Mamedova, American Institutes for Research ([email protected]); Stacey Bielick, American Institutes for Research ([email protected])

In January 2011 the National Center for Education Statistics (NCES) conducted a large-scale field test of a multi-mode survey about child care and parent involvement in children’s education with a nationally representative sample of approximately 41,000 addresses in the United States. The primary data collection mode was a two-phase mail survey. The two phases involved a screener questionnaire to determine household eligibility (presence of children) and a topical questionnaire sent only to eligible households. There were two non-response mail follow-ups and a telephone follow-up for the screener, which is the focus of this paper.

The paper examines the characteristics of early and late screener responders in the National Household Education Survey (NHES) field test. The preliminary analysis suggests that households that responded to the first mailing of the screener questionnaire are different from those who responded after the second follow-up. In this paper, we will use the data from the address vendor frame to describe the characteristics of responders and non-responders by each of the 4 waves of screener response. We will explore the changing portrait of respondents over the survey response period. The specific characteristics examined include the gender, age, education level, race/ethnicity, income, and marital status of the head of the household, as well as the number of adults in the household and residency type.

In addition, we will look at some of the characteristics of the respondents by two different questionnaire treatments – a 20 question screener versus a 5 question screener. Preliminary analysis suggests that households with children may be more likely to respond to the longer version of the screener questionnaire. We will explore at what mailing phase they tend to respond and how that affects the availability of the sample for the topical stage of the survey.

Questionnaire Design: Experiments on Rating Scales

What Number of Scale Points in an Attitude Question Optimizes Response Validity and Administration Practicality? David Scott Yeager, Stanford University ([email protected]); Sowmya Anand, University of Illinois at Chicago ([email protected]); Jon Krosnick, Stanford University ([email protected])

Surveys routinely include rating scales that vary considerably in length, from as short as 2 points (e.g., agree/disagree) to moderate lengths (e.g., strongly agree to strongly disagree) to scales as long as 101 points (from 0 to 100 to measure liking or probability). Because the same type of construct is measured in different surveys by scales of different lengths, there appears to be no consensus among investigators about the optimal length of a scale, defined in terms of measurement accuracy (as indicated by reliability and validity), administration practicality (e.g., how long it takes a respondent to answer the question), and the ratio of the two (measurement accuracy per unit of administration time, in case a large gain in measurement accuracy comes at a small price in terms of administration practicality). Optimal scale length might vary depending on the familiarity of the topic (perhaps people can effectively use longer rating scales to report more refined views on topics about which they have thought a great deal in the past), the nature of verbal labeling of scale points (perhaps full verbal labeling with optimally-chosen labels allows people to use longer scales more effectively, because the meanings of scale points are clearer) and whether the underlying construct is bipolar or unipolar (since a bipolar scale is essentially two unipolar scales joined at the middle). To explore these issues, we conducted a 3-wave web-based panel survey experiment, in which members of a representative sample of American adults (N = 6,055) were randomly assigned to receive different versions of 20 rating scales. They were also asked theoretically-relevant criteria questions to assess concurrent validity. This paper will report the results of validity assessments to identify the design of rating scales that maximizes validity while also maximizing administration practicality (a separate paper will examine reliability).

I Got a Feeling: Comparison of Feeling Thermometers with Verbally Labeled Scales in Attitude Measurement Randall K. Thomas, ICF International ([email protected]); John Bremer, Toluna USA Inc ([email protected])

Feeling thermometers are a commonly-used response format that asks respondents to indicate the extent to which they experience a particular feeling using a 0 to 100 scale, typically by saying or writing the number down. As such, they are easily implemented in telephone, paper, and web-based surveys. Many researchers believe that a feeling thermometer represents ratio scale measurement and is superior to other ordinal scales, though Brady (1985) pointed out the problem of inter-person comparability in using feeling thermometers. In a series of 4 web-based studies with over 100,000 U.S. respondents, we compared the relative effectiveness of feeling thermometers with response formats that use single response scales of 5 to 7 categories, either end-anchored or fully anchored (response labels for each category). Though 3 of the studies focused on political ratings (liking for various political figures), 1 study used the measures to assess attitudes toward various large industries. Respondents were randomly assigned to response format. We examined extent of scale differentiation, scale extremity use, and proportion of middling responses for each response format. We further analyzed the correspondence of the measures with other measures (including party ID). We generally found no superiority of the feeling thermometers in predicting behaviors toward each topic, and some slight superiority of fully-labeled response formats over feeling thermometers in concurrent validity. Response patterns for fully-anchored formats were quite different from those for end-anchored formats and the feeling thermometers, which we relate to the differences in validity we obtained.

Where is Neutral? Using Negativity Biases to Interpret Thermometer Scores Stuart Soroka, McGill University ([email protected]); Quinn Albaugh, McGill University ([email protected])

This paper draws on work on a negativity bias in political psychology to better understand - and properly interpret - results from thermometer-score rankings in telephone and online surveys. Thermometer scores are used across a wide range of political surveys to capture individuals' attitudes towards groups, politicians and parties. A small body of early work shows, however, (a) that respondents do not systematically regard 50 as the neutral point, and (b) that the range across which individuals rank groups, politicians or parties may reflect either the perception that there are only minor differences between them, or individual-level tendencies to use different ranges to reflect the same underlying beliefs. In short, in spite of some existing research on the matter, it is still difficult to know whether a score of 55 is positive or negative, and whether a 10-point difference reflects a large or small difference in perception. This paper suggests that we can get some leverage on these issues by taking into account the non-linear impact of thermometer scores on other variables; in this case, voting decisions. In short, this work capitalizes on the growing body of work in psychology, economics, and political science suggesting that individuals give more weight to negative information than to positive information. It shows that (a) the impact of thermometer scores on voting decisions is nonlinear, (b) those nonlinearities vary across individuals, and (c) those nonlinearities can be used to identify the neutral point in thermometer score ratings. Results are interpreted both in light of recent work exploring negativity biases in political behavior, and as they relate to the use of thermometer scores in survey research more generally.

A Visual Personification of Personalities. John Magnus Roos, Ergonomidesign ([email protected])

This study aims to validate a non-verbal version of the five factor model of personality traits. Personality research has established a five-factor model of personalities, constituted by the dimensions of agreeableness, conscientiousness, extraversion, neuroticism and openness. A short version of the five factor model is the HP5, an instrument constructed for large public health surveys. The shortest validated version of HP5 consists of 15 items; three items measuring each factor. This study aims to explore the potential of transforming a verbal personality scale, e.g. HP5-15, to a non-verbal (visual) personality scale. Instead of relying on the use of words, we give the respondents an opportunity to report their personalities using cartoons.

According to Desmet (2006), non-verbal scales increase the pleasure of participating and allow researchers to uncover aspects that people are unwilling and/or unable to verbally express. We validate our non-verbal scale versus verbal items in HP5 in order to investigate to what degree the five different cartoons (e.g. extreme personalities) correspond to the verbal meaning we would like them to express.

Each cartoon is measured by all 15 items. The scale used for each item was a four-level scale: completely agree (coded as 1), partly agree (coded as 2), partly disagree (coded as 3), completely disagree (coded as 4). The validation criteria were as follows: (i) the three items measuring a particular factor must have an average of 1.33; (ii) every item must correspond more to the factor it is supposed to measure than to other factors.
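One way the two criteria might be checked for a single cartoon is sketched below, assuming that lower scores indicate stronger agreement (1 = completely agree, 4 = completely disagree); the factor means are hypothetical values, not the study's results.

```python
# Sketch of the two validation checks for one cartoon; all numbers are invented.
factors = ["agreeableness", "conscientiousness", "extraversion", "neuroticism", "openness"]

# Mean rating of this cartoon on the three items belonging to each factor.
factor_means = {"agreeableness": 1.2, "conscientiousness": 2.8, "extraversion": 2.5,
                "neuroticism": 3.1, "openness": 2.9}

target = "agreeableness"   # the factor this cartoon is intended to personify

criterion_1 = factor_means[target] <= 1.33                    # average within the 1.33 bound
criterion_2 = all(factor_means[target] < factor_means[f]      # closer to its own factor
                  for f in factors if f != target)            # than to any other factor
print(criterion_1, criterion_2)
```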

The study discusses problems, challenges and opportunities with visualizing a verbal scale. The study also discusses cultural differences in body language and facial expressions. The cartoons are developed with designers at Ergonomidesign in Sweden and validated using 300 international students. The study is financed by the central bank of Sweden.

Tracking Economic Confidence: Effects of Response Format in Trend Sensitivity and Correspondence with National Measures Frances M. Barlas, ICF International ([email protected]); Randall K. Thomas, ICF International ([email protected])

Shifts in consumer confidence have been related to changes in economic activity within nations. Some researchers have indicated that consumer confidence measures are retrospective and reflect recent and current mood within the country, while others suggest that such measures can be useful in predicting future economic activity. Two notable monthly measures are the University of Michigan’s Index of Consumer Sentiment and the Conference Board’s Consumer Confidence Index, each using different types of samples, questions, and response formats. One concern in repeated measurement of consumer confidence is the ability to detect differences from one time period to another. A particular aspect of survey design that can influence the sensitivity of measurement and the consequent ability to detect changes in public opinion over time is the nature of the response scale that is used. We randomly assigned respondents to 1 of 5 response formats, using 3, 4, 5, 6, or 7 response categories, to evaluate a series of economic confidence statements (both the current and anticipated economic condition of the region, the household, and the job market, along with ratings of the economic policies of the national government, anticipation of inflation, and perceptions of the stock market). We had over 70,000 U.S. respondents complete the questions over 32 waves of administration. Our analyses indicated that fewer response categories were less sensitive in detecting month-over-month changes and had a lower correspondence with other national measures of consumer confidence. We also examine the relationship of a number of measures with national economic activity.

Sampling and Weighting Dual Frame Cell Phone/Landline Surveys

Cell Sample Demographics under Alternative Dual-Frame Sample Designs Robert Montgomery, NORC at the University of Chicago ([email protected]); Wei Zeng, NORC at the University of Chicago ([email protected]); Heather Morrison, NORC at the University of Chicago ([email protected]); Kirk Wolter, NORC at the University of Chicago ([email protected]); Stephen Blumberg, National Center for Health Statistics ([email protected]); Kathy O'Connor, National Center for Health Statistics ([email protected])

With cell-phone sampling increasingly used to supplement landline telephone surveys in order to improve sample coverage, debate continues regarding the best approach to including the cell-phone frame. Some suggest that households should be screened for landline usage. The screening approach may seek to identify households that are Cell-Phone-Only (CPO), “cell mostly” (Blumberg et al.), “cell mainly” (Boyle et al.) or some combination of categories. The goal in screening is to spend cell phone interviewing effort on only households excluded from the landline frame. But imperfections are inherent in the screening approach and research by Boyle et al. (2011) suggests that screening may yield a demographic distribution of the sample that is further from benchmark distributions than a take-all approach, where all eligible cell telephone households are interviewed regardless of telephone usage. Our purpose here is to examine the demographic distributions obtained in dual-frame telephone surveys in which the cell-phone interviews are conducted under a take-all approach versus a screening approach. Using data from a suite of large CATI surveys we seek to determine which of the two approaches yields distributions of demographics and phone status that are closer to benchmarks. We construct the distributions selectively using dual-users and Cell-Phone-Only cases from the cell sample and all Landline cases from the landline sample of the National Survey of Children’s Health sponsored by the Maternal and Child Health Bureau and conducted by NORC via the State and Local Area Integrated Telephone Survey mechanism of the National Center for Health Statistics. Distributions based on the take-all approach include all dual users from the cell sample and the landline sample, while distributions based on the screening approach are simulated by removing respondents who are not Cell-Phone-Only from the cell sample. We also consider using cell-mostly and cell-mainly cases as part of a screening design.

Telephone Status, Attitudes toward Participation in Future Surveys, and Willingness to Join a Local Survey Panel: Data from Two Dual Frame RDD Landline/Cell Phone Surveys Scott Beach, University of Pittsburgh ([email protected]); Donald Musa, University of Pittsburgh ([email protected])

Inclusion of cell phone respondents in RDD telephone surveys is becoming standard practice in order to reduce coverage bias due to cell phone-only households. Survey methodologists are still attempting to characterize differences between respondents reached via traditional landlines versus cell phones, and have begun to explore more detailed “telephone status” or usage groups (wireless-only, wireless mostly, dual users, landline mostly, landline only). While there are well-known demographic and health-related differences between, for example, cell-phone only households and those with landlines, little is known about telephone status differences in attitudes towards surveys, response mode preferences, or likelihood of future survey participation. This paper uses data from two overlapping dual frame RDD surveys in the Pittsburgh (PA) metro area that used standard questions (NHIS; Pew Center) for determining telephone status. The first survey (n=795) compares wireless-only (n=138), wireless-mostly (264), dual-users (147), landline-mostly (174), and landline-only (69) respondents on: (1) likelihood of participation in future cell phone surveys; (2) longest future survey (in minutes) they would be willing to do on a cell phone/mobile device; (3) response mode preference in future surveys (general and health-related/sensitive topics) for [a] landline phone, [b] cell phone, [c] web survey on PC, or [d] self-administered survey on cell phone/mobile device. The second survey (n=2,126) compares the same groups (192 WO, 322 WM, 757 DU, 548 LM, 307 LO) on willingness to join a local survey research panel and provide contact information including address and e-mail, and detailed demographic information (n=814, 38.2% of total sample agreed). All comparative analyses are conducted with and without statistical control of demographic variables. Results from this study will add to the literature on profiles of respondents of differing telephone use status reachable by cell and landline phone, and may have implications for the planning and design of mixed-mode surveys.

Evaluating Where Overlap Occurs in a Landline and Cell Phone Dual-Frame Piper Dubray, ICF International ([email protected]); Randal ZuWallack, ICF International ([email protected]); Kristie Hannah, ICF International ([email protected]); Naomi Freedner-Maguire, ICF International ([email protected])

The AAPOR Cell Phone Task Force Report describes two approaches to a landline and cell phone dual frame RDD sample: the screened approach, where landline users from the cell phone sample are screened out; and the overlap approach, where everyone is interviewed in both the landline sample and the cell phone sample. As the industry learns more about telecommunication behaviors and response propensities, dual-frame designs are evolving to include combinations of screening and overlap. The Behavioral Risk Factor Surveillance System (BRFSS) surveys began using the screened approach to include cell-only respondents in 2008. Respondents to the landline and cell phone surveys are asked what percentage of calls they receive on their cell phone, and in 2012, the BRFSS design will be modified to include dual-users who receive 90 percent or more of all calls on their cell phone. During the last quarter of 2011 we conducted an experiment wherein all cell users (those who also have a landline and those who do not) contacted in Arizona, Connecticut, and Vermont were eligible to complete the BRFSS cell survey. We will group respondents on a continuum according to the relative percentage of calls received on their cell phone: 100% landline, 90% landline/10% cell, and so on up to 10% landline/90% cell and finally 100% cell. Theoretically, respondents who fall in the same usage category should be similar regardless of the sample from which they are selected. We will evaluate this by demographically comparing respondents reached by landline to those reached by cell for each category in the continuum. We will also evaluate the continuum distribution for the landline sample compared to the cell phone sample. This research will inform our understanding of a blended screened and overlap dual-frame approach.

Using Iterative Proportional Fitting Techniques to Improve Estimates for RDD Phone Surveys Haci Akcin, CDC/OSELS/PHSPO ([email protected])

The number of cell phone only households has increased in recent years, resulting in undercoverage for random-digit-dialing (RDD) landline-only surveys in the United States. In addition, response rates for surveys overall are declining. The Behavioral Risk Factor Surveillance System (BRFSS) is one of the largest RDD surveys in the world, with annual sample sizes over 400,000. In order to improve coverage, cell phone samples have been added to the landline sample of the BRFSS. In 2008, the BRFSS piloted cell phone samples in 18 states. In 2009 and 2010, 48 states participated in the dual frame. In addition to the changes in sample design, the BRFSS is also changing weighting protocols to accommodate new control variables, including phone source. For years, the BRFSS used a standard post-stratification technique (PS) to adjust for non-coverage and non-response by region, race, sex, and age in the landline frame. Adding the cell phone frame brought new challenges in combining and weighting the data. To meet these challenges, the BRFSS is adding new control variables including home ownership, education, marital status and, most importantly, phone source. In 2011, for the first time, iterative proportional fitting (IPF) techniques, also known as raking, will be used to weight an integrated landline and cell phone dataset.
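For readers unfamiliar with raking, the core iterative proportional fitting loop can be sketched on a small two-way table; the cell counts and control totals below are hypothetical, and this is not the BRFSS production weighting code.

```python
# Bare-bones iterative proportional fitting (raking) on a 2x2 table of sample counts.
# Rows: sex; columns: phone source (landline, cell). All numbers are invented.
import numpy as np

sample = np.array([[200.0, 150.0],
                   [180.0, 170.0]])

row_targets = np.array([360.0, 340.0])   # population control totals for the row variable
col_targets = np.array([330.0, 370.0])   # population control totals for the column variable

weights = np.ones_like(sample)
for _ in range(50):                       # alternate adjustments until the margins match
    weighted = sample * weights
    weights *= (row_targets / weighted.sum(axis=1))[:, None]   # fit the row margins
    weighted = sample * weights
    weights *= (col_targets / weighted.sum(axis=0))[None, :]   # fit the column margins

print(sample * weights)   # weighted table now reproduces both sets of control totals
```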

The purpose of this research is to compare standard post-stratification and iterative proportional fitting methodologies for both the landline-only and the integrated landline/cell phone datasets of the 2010 BRFSS. The data will illustrate differences in outcomes and prevalence of health indicators using the two methods and datasets. The research will also illustrate the impact of the introduction of new control variables when compared to data obtained from the 2010 Census.

Practical Considerations in Design and Analysis of Dual-Frame Telephone Surveys: A Simulation Perspective Timothy R. Sahr, Ohio Colleges of Medicine Government Resource Center ([email protected]); Bo Lu, The Ohio State University ([email protected]); Jung Peng, The Ohio State University ([email protected]); Ronaldo Iachan, ICF MACRO ([email protected])

Dual-frame survey designs have become increasingly popular in large-scale telephone surveys, owing to the lack of coverage of the traditional landline survey design and the escalating use of cell phones in recent years. Several estimation strategies have been proposed and their properties have been discussed under ideal scenarios; in practice, however, estimation in dual-frame telephone surveys is vulnerable to various biases and errors (i.e., inaccessibility, topic/mode salience, measurement error, etc.). Via an innovatively designed simulation study, we compare the estimation bias under different sampling designs with various estimation strategies. To reduce the bias, different raking strategies are also compared. Simulated scenarios incorporating sampling cost are also run for practical considerations. Recommendations regarding design and analysis are provided based on the simulation findings, with consideration given to survey costs in light of the simulation results.

Saturday, May 19, 2012 1:15 p.m. - 2:15 p.m. Demonstration Session #3.

Demonstration of an Integrated Sample Management System for a Mixed Mode (Paper/Web) Survey Esther Ullman, Survey Research Center, ISR, University of Michigan ([email protected]); Hueichun Peng, Survey Research Center, ISR, University of Michigan ([email protected]); Brooke Helppie McFall, Survey Research Center, ISR, University of Michigan ([email protected])

Survey projects increasingly use multiple modes of data collection within one study. Developing an integrated sample management system (SMS) to accommodate different modes is a continual challenge. This demonstration is of an SMS devised for a mixed-mode (paper and web) panel study about economic behavior and decision making, with the current survey wave being fielded to approximately 950 sample members. Each member is assigned to paper or web mode at the beginning of the data collection, but participants may switch modes. The system, developed for the specific needs of this project, allows for accurate tracking of mode switches between paper and web and reporting of data collection progress for both modes as well as four different versions of the questionnaire. It can also be used for data entry for the paper questionnaires, sample member updating (address information, etc.), logging of paper questionnaire receipt, and scheduling and updating of follow-up and reminder mailings and emails. The SMS also incorporates mailing algorithms, facilitating exact timing of follow-up mailings. Additionally, the system provides several ways to search and retrieve status information. The program is a web application in ColdFusion utilizing a SQL Server database. Two web surveys were developed in Illume: one for the paper version and one for the web instrument. The web survey for paper is used for data entry. The two surveys are then linked with the SMS using the web survey's internal API. The SMS tracks the completion status of each sample line in either mode. This brief demonstration will show the ease of utilizing this system for sample management of a mixed mode survey.

Demonstration of the International Cross-Time, Cross-System Database David Miller, American Institutes for Research ([email protected])

This demonstration will show participants a new data-gathering tool. While the tool is being developed primarily to expand the breadth and depth of comparative international education research, a demonstration of the tool may be useful for those working in other related areas, such as public policy and survey research, who are interested in developing data-gathering tools. This web-based tool is free and publicly available, developed with funding from the International Activities Program (IAP) at the National Center for Education Statistics (NCES), U.S. Department of Education, and in collaboration with the American Institutes for Research (AIR) and other contractors.

This data-gathering tool, the International Cross-Time, Cross-System (XTXS) database (see http://intledstatsdatabase.org/default.aspx), contains information relevant to quantitative analyses, including the empirical modeling of international data. It places information collected from multiple sources and multiple points in time in one database for use by researchers. The XTXS database currently includes assessment data from three international surveys—PIRLS, PISA, and TIMSS—together with relevant background and contextual variables from these and other data sources as well as additional economic and demographic data from the World Bank and UNESCO. New data (both quantitative and qualitative) to be added over the next several months include participation measures, outcome measures, school contexts, student characteristics, education system characteristics, and more detailed national/subnational characteristics. Researchers can submit their own data for review by the XTXS steering committee for possible inclusion in the database. Data from the XTXS database are output to a flat file, with rows corresponding to nations or subnational units and columns corresponding to variables. The file is available both in a wide format (years nested in columns/variables) and in a long format (years nested in rows/countries) to meet the specific needs of the researcher and can be imported into SAS, SPSS, or Stata for statistical analysis.
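
The wide and long layouts described above can be interconverted with standard reshaping tools before import into a statistics package; the sketch below uses pandas on a hypothetical country-by-year extract, since the actual XTXS variable names are not shown here.

    import pandas as pd

    # Hypothetical wide-format extract: one row per country, years as columns.
    wide = pd.DataFrame({
        "country": ["Finland", "Japan", "Chile"],
        "score_2006": [547, 531, 442],
        "score_2009": [536, 520, 449],
    })

    # Reshape to long format: one row per country-year combination.
    long = wide.melt(id_vars="country", var_name="year", value_name="score")
    long["year"] = long["year"].str.replace("score_", "").astype(int)
    print(long.sort_values(["country", "year"]))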

Saturday, May 19, 2012 1:15 p.m. - 2:15 p.m. Poster Session 3

Programme for the International Assessment of Adult Competencies Incentive Experiment Tom Krenzke, Westat ([email protected])

Sponsored by the National Center for Education Statistics, the U.S. Programme for the International Assessment of Adult Competencies (PIAAC) field test data collection occurred between September and November 2010, with 1,510 adults interviewed and assessed in 22 primary sampling units (PSUs) across the country. The survey was part of an international study that included 25 countries. Each participant was administered (1) an in-person background questionnaire, (2) a brief Information and Communication Technology (ICT) module to determine whether the participant could use a computer to complete the assessment, and (3) either (a) a paper-and-pencil version of the assessment or (b) a computer-based assessment including an orientation module.

The PIAAC field test included an experiment to evaluate the impact of increasing the incentive amount from $35 (equivalent to the 2003 Adult Literacy and Lifeskills (ALL) and National Assessment of Adult Literacy (NAAL) incentives when accounting for inflation) to $50, to account for the added burden of a longer interview and assessment (about two hours total) than past literacy surveys and the increased complexity of the PIAAC computer-based assessment. The paper will provide a description of the experiment design and the analysis results. The incentive experiment was conducted at the segment level (clusters of dwelling units (DUs) within PSUs). Incentive payments were randomly assigned to each segment, so that each interviewer was assigned both incentive amounts to minimize any interviewer impact. Statistical analyses were conducted to examine differences in refusal rates at the two incentive levels for the screener, the background questionnaire (BQ), and the screener and BQ combined. Other measures of incentive impact, including number of attempts and item response rates, were also compared. The analysis results provided support for an increase to the incentive amount for the PIAAC main study.

New Approaches to Health Facility Surveys Michael Hanlon, Institute for Health Metrics and Evaluation, University of Washington ([email protected]); Catherine M. Wetmore, Institute for Health Metrics and Evaluation, University of Washington ([email protected]); Ali H. Mokdad, Institute for Health Metrics and Evaluation, University of Washington ([email protected])

In 2010, the Institute for Health Metrics and Evaluation at the University of Washington launched a series of facility-level surveys in 15 developing countries. The objective was to capture information on revenue, expenses, costs, inputs and outputs. Data was collected across a range of facilities, from large hospitals to local clinics and specialty providers.

We adopted several cutting-edge approaches to collect this data. First, our survey was designed in a modular fashion, and our data-collection software dynamically selected questions based on the facility’s characteristics and earlier responses. Second, a series of verification processes was built into the software to help avoid collecting incorrect or nonsensical data. Finally, data was uploaded from the surveyor’s netbook to central databases on a regular basis. This process was designed to occur while surveyors were still on site. We automated the process of evaluating data and comparing it to responses from other facilities. This enabled us to further identify potentially implausible or incorrect responses, which could then be verified before surveyors left the site.

For example, in some countries we had evidence that led us to expect a particular range for revenue-to-expense ratios. Since data were uploaded while surveyors were still on site, we could analyze a facility’s results and then ask the surveyor to confirm revenue and expense data for facilities whose ratios fell far outside that range. Given the locations in which we operated, returning to a facility after the surveyor left the site was cost-prohibitive. However, with rapid increases in the availability of internet connectivity and the manner in which we automated this process, our approach could identify and correct potential problems while surveyors were still on site. This improved our ability to collect high quality data in a cost-effective manner.
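
As a rough illustration of the on-site plausibility checks described above, the sketch below flags facilities whose revenue-to-expense ratio falls outside an expected range so that the figures can be re-verified before the surveyor leaves; the bounds and records are hypothetical, not values used in the study.

    # Hypothetical plausibility bounds for the revenue-to-expense ratio.
    EXPECTED_RANGE = (0.5, 2.0)

    facilities = [
        {"id": "F001", "revenue": 90_000, "expenses": 100_000},
        {"id": "F002", "revenue": 450_000, "expenses": 60_000},   # implausible ratio
    ]

    for fac in facilities:
        ratio = fac["revenue"] / fac["expenses"]
        if not (EXPECTED_RANGE[0] <= ratio <= EXPECTED_RANGE[1]):
            print(f"{fac['id']}: ratio {ratio:.2f} outside expected range; "
                  "ask the surveyor to confirm revenue and expense entries")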

A Survey Analysis of Identity Importance and Political Engagement among American Independents Samara Mani Klar, Northwestern University ([email protected])

Political independents play an increasingly prominent role in political discourse for, as recent survey data show, they are more numerous now than ever before. Yet we possess little insight into the process by which these apartisan Americans engage in politics. This neglect can be partially explained by the common belief that independents do not engage in politics and are instead apathetic and ill-informed. In this paper, using data from two nationally representative surveys, I show that independents vary in their political engagement to the same degree as partisans and that being independent is not a statistically significant inhibitor of political engagement. With new survey variables, I demonstrate that political engagement among independents cannot be predicted by their ideological strength, as is the case with partisans, but instead stems from the importance they attach to their independent identity. My results have implications for understanding the processes underlying political engagement among independents, and for how we understand political identity among Americans more generally.

Creating Mixed Internet and Mail Samples for Patient Satisfaction Surveys at Medical Practices Kristopher H. Morgan, Press Ganey Associates ([email protected]); Bradley R. Fulton, Press Ganey Associates ([email protected])

Press Ganey is a healthcare quality improvement organization that began testing a mixed-mode survey distribution method to increase sample size, improve respondent representativeness, and decrease sampling bias. Specifically, an internet mode (an email linked to a survey) was added to the standard mail-out, mail-back methodology. A test comparing the two was conducted using patient satisfaction data from medical practices. The test included 8 practices and 75,621 patients randomly selected to receive a mail or email survey about their most recent experience. Previous research had suggested that internet and mail responders differ demographically (Dillman 2009; Couper 2007) and that older patients tend to be more satisfied with medical care than younger patients. Analyses showed mode effects to be minor and inconsistent across practices, but mix effects were significant: internet responders were younger and more likely to be male, whereas mail responders were older and more likely to be female. Thus, an adjustment is required to ensure comparability across modes. It was determined that adjusting the internet sample to the mail sample was most appropriate. Consequently, respondents were grouped into ten categories by age and gender for both mail and internet. A percentage was calculated by dividing the sample size in a category by the total sample size for that mode. A ratio was calculated by dividing the internet category percentage of returns by the mail category percentage of returns. The category with the smallest ratio is then used to determine the percentage of patients to sample from the other categories, and is itself sampled at 100%. The sampling rate for each of the other categories is determined by dividing the smallest ratio by that category's ratio. The result of this calculation is the retention of the largest number of internet surveys and a uniform proportion of internet to mail surveys, preserving the comparability of the samples.
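
The ratio-based adjustment described above can be expressed as a short calculation. The sketch below uses hypothetical return counts and only four age/gender categories (the study used ten); it shows the internet-to-mail share ratios, the selection of the smallest ratio, and the resulting sampling rates that align the internet mix with the mail mix.

    # Hypothetical returns by age/gender category for each mode.
    internet_returns = {"F 18-44": 300, "F 45+": 200, "M 18-44": 350, "M 45+": 150}
    mail_returns     = {"F 18-44": 200, "F 45+": 400, "M 18-44": 150, "M 45+": 250}

    int_total = sum(internet_returns.values())
    mail_total = sum(mail_returns.values())

    # Ratio of the internet share to the mail share within each category.
    ratios = {k: (internet_returns[k] / int_total) / (mail_returns[k] / mail_total)
              for k in internet_returns}

    # The category with the smallest ratio is retained at 100%; every other
    # category is subsampled so the internet mix matches the mail mix.
    min_ratio = min(ratios.values())
    sampling_rates = {k: min_ratio / r for k, r in ratios.items()}

    print(ratios)          # e.g. {'F 18-44': 1.5, 'F 45+': 0.5, ...}
    print(sampling_rates)  # the 'F 45+' category is sampled at 1.0 (100%)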

New Frontiers in Public Health Campaigns: Media Message Strategies and Psychological Reactance Bin Xing, Kent State University ([email protected])

Rationale: Public health campaigns make great efforts to bring benefits to people by changing their behaviors. However, goodwill does not guarantee good results, and many health campaigns still fail. According to Psychological Reactance Theory, if people feel their freedom of choice is threatened by a health advocacy message, they may not be willing to accept it. By examining the degree of participants’ psychological reactance, this study tests people’s feelings toward four types of health advocacy messages (positively-framed statistical, positively-framed narrative, negatively-framed statistical, and negatively-framed narrative). Using need for cognition and need for affect as moderators, this study also seeks to determine which type of message is more persuasive to message recipients who have different cognition and affect levels.

Research Questions: Which of the four message types (2 framing methods × 2 evidence types) generates less psychological reactance among people who have different levels of (a) need for cognition and (b) need for affect?

Methods: 400 college students at a large Midwestern university were recruited. The hypotheses and questions were tested in two between-subjects factorial designs: 2 (framing: positive/negative) × 2 (evidence: statistical/narrative) × 2 (need for cognition: high/low), and 2 (framing) × 2 (evidence) × 2 (need for affect: high/low). Participants were randomly divided into four groups, and each group read one of the four message stimuli (positively-framed statistical, positively-framed narrative, negatively-framed statistical, or negatively-framed narrative). All participants then completed the same questionnaire, which contained three measures (the 18-Item Need for Cognition Scale, the 26-Item Need for Affect Scale, and the 6-Item Perception of Manipulative Intent Scale).

Conclusion: The results from this study will help campaign practitioners gain a better understanding of message recipients’ cognitive and affective responses to advocacy messages and thus improve practitioners’ message design in future campaigns.

Kenya: Cultural and Political Opinions from a National Survey Dameka Thompson Williams, D3 Systems, Inc. ([email protected]); Amanda Bajkowski, D3 Systems, Inc. ([email protected])

The Republic of Kenya has endured violent unrest and civil uprising while struggling to become a democratic nation, a struggle that continues in Kenya’s current political landscape. Despite these trials, Kenya has a thriving population of nearly 41 million people and is one of the most modern and industrialized nations in East Africa. D3 Systems Inc. will conduct a national survey to gauge the beliefs and opinions of Kenyans. The survey will be conducted face-to-face and inquire about respondents’ opinions and behaviors regarding media usage, domestic and international issues, their national and local governments, security, and health issues. Respondents will be asked specifically about the effects of drought conditions, religious extremism, HIV/AIDS, the economy and ethnic/tribal relations. This paper will analyze and report the findings of this national survey, highlighting findings of particular interest. This paper will also examine and report significant differences in opinion and/or behavior between demographic groups (age, gender, ethnicity, education, religion, employment, and income). In order to achieve a representative sample of all Kenyans, the survey will be conducted in the urban and rural sections of all of Kenya’s provinces, including Nairobi: Central (110 interviews), Coast (90 interviews), Eastern (150 interviews), North Eastern (60 interviews), Nyanza (150 interviews), Rift Valley (250 interviews), Western (110 interviews) and Nairobi (80 interviews). We will select a target sample of 1,000 respondents using a multi-stage random method, using probability proportional to size systematic random sampling to select the primary sampling units in each province, the random walk method to select households and the Kish grid method to select individual respondents.
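
For readers unfamiliar with the first-stage selection method named above, the sketch below shows probability-proportional-to-size systematic sampling of primary sampling units within one province; the enumeration areas, measures of size, and number of selections are hypothetical.

    import random

    # Hypothetical enumeration areas in one province with their measures of size.
    psus = {"EA-01": 520, "EA-02": 310, "EA-03": 890, "EA-04": 260, "EA-05": 1020}
    n_select = 2                                   # PSUs to draw in this province

    total = sum(psus.values())
    interval = total / n_select                    # sampling interval
    start = random.uniform(0, interval)            # random start within the first interval
    points = [start + i * interval for i in range(n_select)]

    selected, cumulative = [], 0
    for name, size in psus.items():
        lower, cumulative = cumulative, cumulative + size
        selected.extend(name for p in points if lower <= p < cumulative)

    print(selected)   # larger areas are more likely to contain a selection point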

Who Seeks Stop-Smoking Help Online? Demographic and Tobacco Use Profiles at the National Cancer Institute Site SmokeFree.gov. Janet Brigham, SRI International ([email protected]); Harold S. Javitz, SRI International ([email protected]); Ruth E. Krasnow, SRI International ([email protected]); Lisa M. Jack, SRI International ([email protected]); Gary E. Swan, SRI International ([email protected])

The National Cancer Institute’s website smokefree.gov is an accessible venue for tobacco cessation information and help. We developed the Recent Tobacco Use Questionnaire (RTUQ) to assess current smoking behaviors and beliefs. Responses were linked to tailored cessation advice for 2,000 current and former tobacco users visiting the NCI site in 2010-11. RTUQ assessed smoking patterns and correlates, exploring personal and environmental factors associated with tobacco dependence. Questions were adapted from (1) our earlier Web survey administered to >6,000 invited respondents, (2) standard items in US and state surveys, and (3) the Theory of Planned Behavior.

DATA QUALITY: We examined care in responding, determined by indications of confusion, difficulty responding, inadequate responding, and abandonment. Extreme responses occurred only in estimating cigars per day. Inappropriate straight-line responding on grid questions occurred in 2 cases. Analysis of total completion time identified 3 cases with unreasonably low completion times. Mid-survey abandonment occurred at questions asking reasons for wanting to quit and at "other" responses, although no elaboration was required.

RESPONDENTS: All U.S. states and various income levels were represented among RTUQ respondents. More than half were from low-income areas. Demographics reflected diversity in age, location, and education, but not race (85% White). Most smoked cigarettes (90%), smoking 18 cigarettes per day. Two-thirds were women, most ages 26-55 with post-high school education. Two-thirds of smokers expected to quit within 6 months, although 36% expressed low confidence. Reasons for quitting were health (74%), cost (67%), desire to be active (59%), and pressure from others (55%).

CONCLUSION: Respondent site visitors were dependent smokers anxious to quit but wanting help. The 32% ever-use of cigars and 18% use of smokeless tobacco supported the advisability of assessing and offering cessation help for multiple types of tobacco. The demographic profile of site visitors indicated a need to broaden the site’s demographic reach.

Usability and Computer Literacy in an ACASI Survey for Spanish Speakers Anna Sandoval Giron, National Center for Health Statistics ([email protected])

Audio computer-assisted self-interviewing (ACASI) has grown in prevalence in the last decade due to advances in technology and the increased use of multiple operating system platforms such as personal computers and tablets. Nonetheless, little research has taken place regarding the usability of these instruments (Hansen and Couper 2004). Research on computer-assisted interviewing (CAI) has focused on interviewers’ attitudes toward CAI, and the evidence suggests that respondents prefer ACASI technology over face-to-face interviews when reporting sensitive information. Little research, however, has examined the effect that computer literacy has on ACASI usability. Computer literacy refers to an acquired set of skills implying competency with computer applications such as, but not limited to, word processing and email.

Usability was evaluated through cognitive interviewing. Rather than asking participants to report their overall satisfaction with ACASI, researchers focused on observing the interaction between the participant and the computer and on a debriefing afterwards in which interviewers asked participants about the ways in which they interacted with the technology. The paradigm used by the researchers is an integrative one that recognizes that an individual’s question response process is set within a sociocultural context. Cognitive interviewing methodology thus provides an interpretative understanding of why respondents answered the questions the way they did.

Our findings suggest that computer literacy has an impact on the ability of respondents to successfully complete ACASI instruments. Although usability problems existed across the board, respondents from the most marginalized groups were more likely to struggle with the technology. When provided with easy-to-use instructions and visual aids, respondents were able to complete the survey at higher rates.

Ask More, Get More? Comparing Responses to Detailed and Global Questions Jennifer Edgar, Bureau of Labor Statistics ([email protected])

The Consumer Expenditure Quarterly Interview (CEQ) survey asks respondents specific, detailed questions about a wide range of expenses that the household has had in the past three months (e.g., “Since the first of June, have you purchased any pants?”). These questions require respondents to recall specific purchases and report the details of each purchase they have made. As the CE program investigates options to reduce respondent burden and streamline the survey, one option being investigated is to replace some of the detailed questions with more general questions, or global questions (e.g., “Since the first of June, how much have you spent on clothing?”), that can be used for post-collection modeling or other adjustments.

The response processes for global and detailed questions are known to differ, with respondents often resorting to less accurate strategies such as estimation or even guessing when answering global questions (Beatty, Fowler and Cosenza, 2006; Edgar, 2009). Although response strategies for global questions are thought to be less exact and accurate, research has found inconsistent results when looking at data quality, with global questions yielding higher or lower estimates than detailed questions in different contexts. Sometimes global questions perform well; in other circumstances they do not (Goldenberg and Steinberg, 2009; Kreuter, McCulloch and Presser, 2009).

This research explores the relationship between responses to detailed and global clothing questions by evaluating responses from seventy participants to both types of questions. By comparing the items reported in response to each type of question, we gain insight into question comprehension. Expenditure totals will also be compared to explore the impact of question type on the data. Finally, participants’ debriefing answers about their response strategies will be analyzed to provide insight into the response process and the errors made when answering global questions.

Using Mixed-Mode Contacts to Facilitate Participation in Public Agency Client Surveys Glenn D. Israel, University of Florida ([email protected])

Considerable research has focused on encouraging responses via the web for address-based samples of the general population. Although response rates for such mixed-mode surveys have been lower than for the postal-only mode, this strategy can generate a substantial number of responses via the web. Many public agencies and nongovernmental organizations serve large segments of the public and, consequently, often conduct surveys to identify client needs or assess customer satisfaction. Given tight budgets, public agencies are looking at mixed-mode strategies for surveying the general public or quasi-public populations. When multiple types of contact information are available, survey administrators can use a wider range of mixed-mode methods. This study examines the utility of incorporating e-mail addresses into mixed-mode procedures for a survey of an agency’s clients. The study uses clients who have received information from the Cooperative Extension Service to analyze how implementation procedures and response mode affect response rates and item distributions. The clients form three strata (based on providing contact information for postal address only, e-mail only, and both). For clients with both mail and e-mail addresses, three experimental groups were created, including two mixed-mode groups. I focus the analysis on response rates, as well as explore responses for mail and web modes over the sequence of contacts. I found that when mail and e-mail addresses were used to implement a sequence of e-mail and postal invitations in a mixed-mode design, response rates were lower (53-55%) than those for mail-only surveys (66%). On the other hand, administration costs for postage were substantially lower for the mixed-mode groups (because 60% of the surveys were completed via the web) and the distributions of the substantive and demographic questions were nearly identical to the postal-only group. This study demonstrates the benefit of obtaining e-mail addresses and using them in a mixed-mode survey process.

Differential Effects of Cash Incentives in Vulnerable Populations Tracy A. Keirns, UNH Survey Center ([email protected])

There are many reasons people choose to respond to requests for participation in survey research. Many of the reasons a respondent chooses to participate are situated within social exchange theory, such as highlighting the ways the results of the survey may benefit them or including a small token of appreciation (Dillman 2009), to name just a few. It has been well established in the literature that a small cash incentive provided with the survey invitation is one of the best ways to improve response rates (Dillman 2009). However, most of this research examines the impact of incentives in general population or student samples, with little if any research on response rates in vulnerable populations (Lesser et al., 2001). This study examines whether the inclusion of a crisp one dollar bill increases response rates in vulnerable populations to the same extent as in the general population. Every year, youth and adult satisfaction with mental health services in New Hampshire is assessed using a multi-mode survey. In 2011 we experimentally tested the inclusion of one dollar bills and an additional mailing of a second packet in an attempt to increase response rates among adults and youth, as well as parents or guardians of severely mentally ill and emotionally disabled children. Preliminary analyses of this year’s survey show significant improvement in response rates for adults but not for youth and family members of youth. Furthermore, the second packet did not improve response rates for any of the populations surveyed. These results suggest the need for additional considerations when trying to improve response rates for surveys of vulnerable populations.

Order Manipulation of a Request to Validate Responses with Records in a Web Survey of Researchers Kelly Burmeister, Children's Hospital Boston ([email protected]); Stavroula Osganian, Children's Hospital Boston - Harvard Medical School ([email protected]); Sarah de Ferranti, Children's Hospital Boston - Harvard Medical School ([email protected]); Erica Denhoff, Children's Hospital Boston ([email protected]); Sarah Stelz, Children's Hospital Boston ([email protected])

Permission to verify web survey responses with records may be considered sensitive, as it heightens participants’ concerns about privacy. This notion was explored in a survey of researchers who were asked about their experiences with pediatric and/or adult participant recruitment. A request for permission to check records to confirm participants’ responses was randomly assigned to be asked either first or last in the survey. Results showed that permission rates differed when the request for permission to validate responses was asked first compared with last. Responses to survey questions did not differ by question order. We expect that the question requesting permission to check records likely did not bias responses in our survey of health researchers. However, question order when asking respondents for permission to validate responses may have impacted the rates of permission.

“What’s Happening?” Twitter for Diary Studies Sarah Cook, RTI International ([email protected]); Ashley Richards, RTI International ([email protected]); Elizabeth Dean, RTI International ([email protected]); Saira Haque, RTI International ([email protected])

The technologies used for collecting data in diary studies change rapidly. In the last 20 years several new technologies have emerged as data collection tools, including email, web, handheld computers, text messaging, and smartphones. As new technologies continue to emerge, researchers must determine which technologies are beneficial and appropriate for data collection. In this paper, we evaluate the appropriateness and effectiveness of Twitter for diary data collection.

Diaries are used to collect frequent reports on the events and experiences in respondents’ lives. When it first emerged, Twitter prompted users to respond to the question, “What are you doing?”, much like a time-use diary would. (Twitter’s prompt has since changed to “What’s happening?”) Twitter may have several benefits relative to other data collection tools. Twitter respondents may be less likely to forget or lose their diary because Twitter is available and often used on mobile phones and tablets. The constant access respondents have to Twitter diaries may reduce nonresponse, as well as measurement error caused by retrospective responding. Other possible benefits of Twitter diaries include decreased respondent burden because of respondents’ fluency with the mode, as well as increased engagement in the study due to participants’ interest in the mode and the potential for retweeting.

We conducted Twitter diary studies with three distinct non-probability samples of Twitter users: Hispanics, young adults, and diabetics. Participants were asked to tweet responses to questions asked via a study Twitter account in a variety of question formats. Each participant was assigned to a diary on one of three topics: diet, activity, or mood. A subset of the questions was used in cognitive interviews conducted on Skype to assess how participants formulated their responses into tweets. In this paper we summarize the findings and reflect on whether Twitter should be considered for other diary studies.

Hispanic Self-Identification Among Spanish-Speakers in the US Jennifer Leeman, US Census Bureau ([email protected])

The difficulty that many Latinos encounter when asked to self-identify according to the five OMB race categories is well-known. As a result, a large body of academic and US Census Bureau research has examined the inconsistencies between official categories and Latinos’ own sense of their racial identity and explored whether “Hispanic” should be redefined as a racial rather than an ethnic category. In contrast with race, few studies have examined self-identification of Hispanic ethnicity. Although researchers have explored respondent preference for various labels (e.g., “Hispanic,” “Latino” and “Spanish origin”) and debated whether or not various groups (e.g., Brazilians or Mayans) should be included, there has been little empirical research on whether Spanish speakers in the US identify themselves or their children as Hispanic.

This paper focuses on quantitative data showing discrepancies between ‘objective’ external ascriptions of Hispanic origin and self-identification among Spanish-speaking individuals in the US. The paper begins with a brief overview of theoretical issues surrounding the construct of “Hispanic, Latino or Spanish origin” and a review of qualitative data drawn from previous Census research. Next, quantitative data from a behavior coding study of 200 Spanish language CATI interviews conducted as part of a Census 2010 follow-up operation are presented. In this study, interviewer behaviors were coded (among other things) as to whether they read the questions as worded, while respondent behaviors were coded as to whether they matched the provided response options. Findings revealed that Spanish-speaking respondents provided yes or no responses to the Hispanic, Latino or Spanish origin question only 34% of the time (vs. 73% in English-language interviews), demonstrating problems with this construct. Responses are analyzed to provide greater insights regarding the sources of difficulty of Hispanic self-identification. Discussion also covers the use of Spanish-language interviews as a proxy for selecting a dataset of “Hispanic” respondents.

Factors Impacting the Accuracy of Interviewer Observations in the National Survey of Family Growth (NSFG) Brady T. West, Institute for Social Research ([email protected]); Frauke Kreuter, Joint Program in Survey Methodology (JPSM) ([email protected])

Existing work showing that interviewer observations are associated with both response indicators and key survey variables in a variety of surveys has suggested that these observations may be useful auxiliary variables for constructing nonresponse adjustments. Unfortunately, the observations are typically estimates and judgments made by the interviewers, making them error-prone, and emerging research has suggested that errors in these types of auxiliary variables will reduce the effectiveness of nonresponse adjustments. The ability to identify both respondent- and interviewer-level factors that impact the quality of interviewer observations could assist survey researchers with the development of design strategies aimed at increasing the quality of the observations. Unfortunately, no existing studies of face-to-face surveys have attempted to identify these factors. This study attempts to fill this important gap in the literature by presenting multilevel models of accuracy in two interviewer observations recorded in the National Survey of Family Growth (NSFG). A variety of significant interactions between respondent- and interviewer-level factors were identified for the accuracy of one observation (presence of young children in the household), while independent main effects of respondent- and interviewer-level factors were found for the accuracy of a second judgment (current sexual activity). Importantly, a design strategy for improving judgment accuracy suggested by theories in social psychology is actually shown to improve the quality of the observations, and other theoretical expectations are also confirmed. Specific observational strategies used by NSFG interviewers are also shown to impact accuracy. Implications of these results for practice and directions for future research in this area will conclude the presentation.

Evaluating the Impact of Emails and Landing Page on Web Survey Access Amy E. Falcone, ICF International ([email protected]); Randall K. Thomas, ICF International ([email protected]); Amy R. Mack, ICF International, SAMHSA DTAC Project Director ([email protected])

Several studies have examined the impact of survey invitations on response rates and generally conclude that shorter invitations lead to higher response rates. In web-based studies, however, little research has examined how the design of invitation emails and the design of the landing page interact to affect survey access and response. The current experiment examines the interaction of both factors simultaneously. We designed invitation emails to be either short or long, and our landing pages presented the consent button either at the top of the page, with the explanatory text below it, or at the bottom of the page, with the text above it. Levels of both variables were randomly assigned to respondents, creating a 2 × 2 factorial design.

We report on the results of a Substance Abuse and Mental Health Services Administration Disaster Technical Assistance Center (SAMHSA DTAC) survey regarding training and technical assistance provided to recipients. This survey is being administered in November-December 2011 to 16,000 behavioral health (i.e., mental health and substance abuse) professionals and others who have requested training or technical assistance or subscribe to SAMHSA DTAC publications. We report the results for the four groups ((1) long email invitation and consent box at the bottom of the landing page; (2) long email invitation and consent box at the top of the landing page; (3) short email invitation and consent box at the bottom of the landing page; or (4) short email invitation and consent box at the top of the landing page) on response and subsequent entry into the web survey. Studying both design factors simultaneously allows us to understand factors affecting respondent behavior concerning web-based survey response.

Using Technology to Enhance the Quality of In-Depth Interview Data Collected by Telephone: A Study of Illicit Retail Methamphetamine Markets Timothy M. Mulcahy, NORC at the University of Chicago ([email protected]); Kim Nguyen, NORC at the University of Chicago ([email protected]); Henry Brownstein, NORC at the University of Chicago ([email protected]); Johannes Fernandes-Huessy, NORC at the University of Chicago ([email protected])

Because recent online technologies in qualitative research have largely been understood as alternative media for collecting primary data, their use as complements to existing modes of data collection has yet to be fully recognized.

In this paper, we reconsider this prevailing conception of web-based tools to open up new possibilities for productively engaging technology in qualitative inquiry. Based on our study of the operation and organization of methamphetamine markets across the U.S., we find that pairing traditional (telephone) and new (WebEx and Google) technologies yielded a fruitful partnership.

This paper reflects on our experiences of incorporating a web-based platform and computerized mapping technology into our telephone interviews and elaborates on how their inclusion enhanced the data collection process with respect to establishing rapport, eliciting rich description, changing relationship dynamics, and increasing data quality.

We conclude with thoughts on the wider utility and challenges of deploying web applications in this fashion for social science research.

Variable Selection Methods for Survey Data Analysis Curtis Signorino, University of Rochester ([email protected])

Survey researchers often do not have strong theory concerning which of many possible regressors are related to their dependent variables. Recent advances in variable selection techniques (e.g., the adaptive lasso and SCAD) allow researchers to estimate models with numerous variables but without the inefficiency seen in traditional regression methods. Although these methods are increasingly common in genetics, engineering, and financial data analysis, we are unaware of their use in survey research. In this paper, we demonstrate how to employ these techniques with survey data (e.g., the American National Election Study). We also demonstrate an extension to the variable selection techniques that we have developed, which allows researchers to estimate nonlinear, and possibly nonmonotonic, effects of regressors.
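
As a rough illustration of the family of methods named above, the sketch below fits one common form of the adaptive lasso (a lasso with the penalty reweighted by initial coefficient estimates) to simulated data; it is not the authors' estimator, and the nonlinear extension they describe is not shown.

    import numpy as np
    from sklearn.linear_model import LinearRegression, LassoCV

    rng = np.random.default_rng(0)
    n, p = 500, 20
    X = rng.normal(size=(n, p))
    beta = np.zeros(p)
    beta[:3] = [2.0, -1.5, 1.0]                    # only 3 of 20 regressors matter
    y = X @ beta + rng.normal(size=n)

    # Step 1: pilot estimates, here from ordinary least squares.
    init = LinearRegression().fit(X, y).coef_

    # Step 2: adaptive lasso via rescaling: a lasso on features multiplied by
    # |pilot coefficient|, which penalizes weak pilot estimates more heavily.
    w = np.abs(init)
    fit = LassoCV(cv=5).fit(X * w, y)
    coef = fit.coef_ * w                           # map back to the original scale

    print(np.round(coef, 2))                       # near zero for irrelevant regressors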

Increasing the Utility of a Cell Phone Screener Charles Darin Harm, Arbitron, Inc. ([email protected])

Arbitron is moving toward an address-based sampling (ABS) frame in an effort to reach a greater proportion of U.S. households. With an ABS sample, the selected address is the sampling unit, and it is important to verify that the survey respondent resides at the selected address.

Currently, a mail-based screener is sent to an ABS sample household where the selected address cannot be matched to a landline phone number. If a respondent reports being cell phone only or cell phone mainly, the household is added to a cell-phone frame and used to supplement a 2+ list-assisted RDD sample.

As part of the screener questionnaire, households are asked to provide an address to which we can send an incentive that was promised for returning the screener. There is concern that some respondents are providing an address other than their primary residence (e.g., work address, friend’s house) because it is more convenient for receipt of the promised incentive.

In response to this concern, two revised screeners were tested. Both revisions involved pre-populating the respondent’s address in the area of the screener where contact information is collected and asking the respondent to verify the accuracy of the address. One test version omitted the phone number request.

This presentation will discuss the results of pre-populating the address on the screener, as well as examine the impact of omitting the phone number request from the screener. Response rates and coverage data will also be presented, along with suggestions for future research based on test results.

Challenges and Lessons Learned from Tracing Highly Select Postdoctoral Populations in the NIST RAP Study Henry Tran, Westat ([email protected]); Kwang Kim, Westat ([email protected]); Kimberly Raue, Westat ([email protected]); Keith MacAllum, Westat ([email protected])

As part of the evaluation of the National Institute of Standards and Technology (NIST) Postdoctoral Research Associateship Program (RAP), a program that provides two-year postdoctoral appointments for outstanding scientists and engineers, Westat conducted a set of surveys to gather information about postdoctoral experiences and career trajectories of former NIST postdoctoral associates and unsuccessful NIST postdoctoral applicants. In order to conduct the surveys, we first traced contact information (e-mail addresses, job/company, home/work addresses, home/work phone numbers) for these highly select populations using a number of internet search tools (e.g., Google, LinkedIn) and methods (e.g., use of key term combinations). Prior available information on respondents included name, research field, Ph.D. institution, and graduation year. The results of our tracing efforts highlight the challenges of surveying a unique select population and the counterintuitive degree of tracing necessary for a population we expected to be easy to find using their record of publications. Tracing yielded contact information for 231 of the 297 (78%) former NIST postdocs and 893 of the 1,110 (80%) NIST applicants. For these found respondents, we estimated that there was incomplete or incorrect contact information for 10-15% of the respondents, yielding accurate contact information for approximately 65% of respondents in each group. Our tracing efforts may be of interest to survey and social science researchers who are conducting tracing efforts. In particular, as we gained tracing experience, we learned and adapted different search strategies (e.g., company or university directories were found to be useful) to help increase tracing efficiency and effectiveness. Furthermore, newfound strategies were shared regularly among data collectors and incorporated into our tracing protocol. In the presentation, we plan to discuss the challenges of tracing these unique populations and the lessons we learned along the way in terms of effective search approaches and best data collection practices.

Five Golden Rings? The Impact of Number of Rings on Data Quality Nicole R. Buttermore, Social Science Research Solutions ([email protected]); Melissa Herrmann, Social Science Research Solutions ([email protected])

As response rates to telephone surveys have decreased, researchers have struggled to minimize the cost of interviewing while maintaining the quality of the data collected. Researchers can maximize interviewer productivity by minimizing the amount of time interviewers spend waiting to get a respondent on the line, and interviewers can move more rapidly through a sample of phone numbers by decreasing the number of times each number is allowed to ring before moving on to the next number. This strategy is effective only insofar as those who answer the phone in fewer rings do not differ in meaningful ways from those who answer the phone after more rings.

We present the results of an experiment conducted as part of a weekly, dual frame, national omnibus survey. Each week, the phone system for half of the sample was set to allow four rings, whereas, for the other half, the phone system was set to allow six rings. We examine differences in response rate between these two groups and investigate the relationship among number of rings, number of call attempts, various demographic variables, and markers of data quality such as number of missing items and interview length.

Using a Multi-Method Approach for the Re-design and Testing of the 2012 Census of Governments: Finance Component Questionnaire Design Heidi M. Butler, US Census Bureau ([email protected])

Similar to the Economic Census, the U.S. Census Bureau conducts a Census of Governments every five years to provide benchmark data on all state and local government units across the United States. In 2007 the Committee on National Statistics conducted an extensive review of the Census of Governments program at the U.S. Census Bureau and provided recommendations on a number of issues related to data quality, timeliness, and relevance. To address these recommendations, the Census of Governments implemented a three-pronged approach to modernize and reengineer its programs, including the 2012 census. Specific to the redesign of the questionnaires, we started with facilitated group discussions with data users, followed by facilitated group discussions with data providers, which sparked a third set of comprehensive discussions with key stakeholders. These discussions directed the in-depth record-keeping meetings with data providers which, in turn, guided form design and gave focus to our cognitive interviews. Other studies have demonstrated the value of using multiple research methods for developing and pre-testing questionnaires, particularly in instances such as a major redesign. As with any approach, each method has implications for form design. This paper will explain what was learned from each individual method within the multi-method approach, describe how findings from each method informed the subsequent methods used, and demonstrate the value of using a multi-method approach for improving data quality and reducing respondent burden while redesigning one component of the Census of Governments.

Data Quality of Adolescent Reports on Person and Household Level Income and Program Participation Patricia LeBaron, RTI International ([email protected]); Gretchen McHenry, RTI International ([email protected]); Lauren Klein Warren, RTI International ([email protected])

The validity of proxy-reported data has been investigated in a wide body of literature, with mixed results. Research designs often permit knowledgeable proxy respondents to report about another sampled person when self-reports are difficult or impossible to obtain, in order to save costs and achieve acceptable unit and item response rates.

The National Survey on Drug Use and Health (NSDUH), sponsored by the Substance Abuse and Mental Health Services Administration, was first conducted in 1971. NSDUH provides national, state and substate data on substance use and mental health in the civilian, non-institutionalized population ages 12 and older. Approximately 140,000 household screenings and 67,500 interviews are completed annually. Survey respondents can nominate adult household members to answer questions about the respondent’s health insurance and income. Whereas many other survey designs regard self-reports of behavior and demographic variables as optimal, it is unknown whether adolescent respondents’ reports of income and health insurance introduce more measurement error compared to estimates from an adult proxy respondent. To determine whether allowing adolescents as young as 12 to provide data about their own and their household’s income and health insurance coverage is appropriate, this presentation will address a number of research questions. Does the demographic profile of adolescents who decide to answer these questions about health insurance themselves differ from the rest of the adolescent population who nominate a parent to complete these items on their behalf? Though allowing adolescents to report their own data may lead to more missing data, does the information that is reported by adolescents match information reported by adults in the same household, if these adults were also interviewed?

Findings on adolescents’ ability to report data about their household will evaluate the suitability of NSDUH proxy protocols, while determining whether this practice should be more widespread on other similar projects.

Comparison of the American Community Survey Voluntary Versus Mandatory Estimates Karen E. King, U.S. Census Bureau ([email protected]), Michael Starsinic, and Alfredo Navarro, Decennial Statistical Studies Division US Census Bureau

The American Community Survey (ACS) collects essentially the same detailed demographic, housing, and socio-economic data as were collected on the 2000 decennial census long form questionnaire, using similar mandatory collection methods. We were requested by Congress in 2002 to conduct research to determine if the ACS could be implemented as a voluntary collection survey. In 2003, we designed and developed a test to assess the effect of a switch from a mandatory to a voluntary ACS on:
• Public reaction – by analyzing response rates by mode
• Quality – in terms of sampling error and levels of unit and item response
• Feasibility – by looking at costs and workloads

The initial analysis of the 2003 ACS voluntary collection method test did not include an assessment of whether estimates produced from data reported using a voluntary method would differ from those produced from data reported using the mandatory method. Additional analysis of the data gathered in the 2003 collection year was done to shed light on whether a change to a voluntary ACS would result in different characteristics of the responding population and what would be the possible effect on ACS estimates.

Our technique to answer these important questions was to create two sets of weighted annualized 2003 estimates – one based on data collected using a voluntary method, and the other based on data collected using a mandatory method.

We selected a subset of demographic, economic, and social characteristics to focus on, compared the estimates by collection method, and then assessed whether the resultant estimates differed beyond sampling variability. This paper shows some of the results from this analysis.
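
One simple way to assess whether two independently weighted estimates differ beyond sampling variability, as described above, is a two-sample z statistic; the point estimates and standard errors below are hypothetical and are not results from the 2003 test.

    # Hypothetical voluntary and mandatory estimates of the same proportion,
    # with their standard errors (treated here as independent samples).
    est_vol, se_vol = 0.124, 0.004
    est_man, se_man = 0.118, 0.003

    z = (est_vol - est_man) / (se_vol**2 + se_man**2) ** 0.5
    print(round(z, 2))   # compare against 1.96 for a two-sided test at the 5% level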

Facebook Ads: An Adaptive Convenience Sample-building Mechanism Adam Sage, RTI International ([email protected]); Elizabeth Dean, RTI International ([email protected]); Ashley Richards, RTI International ([email protected])

Facebook currently has over 800 million users worldwide (www.facebook.com). Nearly 50 percent of Americans currently have a Facebook account (www.socialbreakers.com). Social networking sites, and Facebook in particular, have become a unique marketing environment and a critical medium for engaging consumers. For instance, Facebook allows advertising to be targeted to specific populations defined by a variety of demographic characteristics, “likes” and interests, location, activities, and social connections.

To date, the field of survey research has demonstrated limited utilization of social networking sites such as Facebook. Most utilization we see in research involves participant tracing and surveys of non-probability samples (i.e., convenience samples). Of additional note, most surveys we see on Facebook are conducted for market research purposes. While the development of methodologies in social networking sites is in its early stages, we believe some established marketing strategies are ready to be adapted to fulfill a variety of research needs. One area we are exploring is the value of Facebook advertising as a viable solution for recruiting study participants when a probability-based sample is not required for research needs.

This poster demonstrates how Facebook advertising methods were used to recruit study participants for three studies: 1) a cognitive interviewing project in the virtual world Second Life, 2) a project building an application user base to build online registries, and 3) a project consisting of cognitive interviewing and usability testing of a nationwide, mixed-mode survey with participants meeting specific demographic characteristics. Specifically, our analysis will demonstrate situations that are most conducive to the utilization of Facebook advertising for recruitment. This analysis will take an in-depth look into the reach, or potential audience, for an advertisement; costs associated with ad development, implementation, and level of effort; and the limitations and advantages of Facebook’s advertising technique.

Measures of Neighborhood Quality: Self-Reports of Mothers of Infant Children Melissa Clark, Brown University, Program in Public Health ([email protected]); Samantha Rosenthal, Brown University, Program in Public Health ([email protected]); Michelle Rogers, Brown University, Program in Public Health ([email protected]); Frances Saadeh, Brown University, Program in Public Health ([email protected]); Patrick Vivier, Brown University, Program in Public Health ([email protected])

There is increasing interest in the effect of neighborhood conditions on health outcomes for children. However, there have been limited studies on the reliability of self-reported measures of neighborhood conditions. We interviewed 740 mothers immediately postpartum and again 13 months later. We asked about neighborhood conditions (e.g., People in neighborhood are willing to help neighbors; People in neighborhood can be trusted) and overall perceptions of their neighborhood. Among participants who reported not moving between the two time points (n=431), we assessed consistency of reports of neighborhood conditions across seven measures and over time. We tested spatial correlation of the measures of consistency using local Moran's I and found significant correlation for each. We used generalized estimating equations to determine correlates of consistent reporting while accounting for spatial clustering and parity. We also assessed predictors of change in global neighborhood perceptions among individuals who were consistent in their reports (n=363). Compared to individuals who identified as non-Hispanic white, the odds of consistent reporting were lower for Hispanics across measures [AOR=0.18, 95% CI=(0.06, 0.53)] and over time [AOR=0.28, 95% CI=(0.14, 0.58)]. The odds of consistent reports were lower for individuals who identified as Black [AOR=0.13, 95% CI=(0.05, 0.38)] or Other race [AOR=0.23, 95% CI=(0.09, 0.54)] over time but not across measures. Mothers living in areas with a greater proportion of vacant homes were also less likely to provide consistent reports over time. Among those who provided consistent reports, individuals who identified as Black were significantly more likely than non-Hispanic whites to report a decrease in global ratings of the neighborhood over time. Parity was not associated with any of the outcomes. Studies of racial/ethnic differences in neighborhood effects on children’s health may be biased if consideration is not given to the type and frequency of self-report measures of neighborhood quality.
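
For reference, the sketch below fits a generalized estimating equations model of the kind described above using statsmodels; the simulated data, variable names, and exchangeable working correlation are hypothetical choices for illustration and do not reproduce the study's models.

    import numpy as np
    import pandas as pd
    import statsmodels.api as sm

    rng = np.random.default_rng(1)
    n = 400
    df = pd.DataFrame({
        "tract": rng.integers(0, 40, size=n),          # hypothetical spatial cluster id
        "hispanic": rng.integers(0, 2, size=n),
        "parity": rng.integers(1, 4, size=n),
    })
    # Simulated binary outcome: 1 = consistent report across measures.
    p = 0.7 - 0.15 * df["hispanic"]
    df["consistent"] = rng.binomial(1, p.to_numpy())

    model = sm.GEE.from_formula(
        "consistent ~ hispanic + parity",
        groups="tract",
        data=df,
        family=sm.families.Binomial(),
        cov_struct=sm.cov_struct.Exchangeable(),
    )
    result = model.fit()
    print(np.exp(result.params))                        # adjusted odds ratios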

Partial Interviews in the BRFSS Data Collection: Causes and Characteristics in Six States Marilyn Wilkinson, Abt SRBI, Inc. ([email protected])

The BRFSS defines partial interviews as those in which the respondent answers most of the demographic questions but does not complete the entire interview. The BRFSS survey is designed so that the Core questions asked by all states come first, followed by state-selected, CDC-supported optional modules and finally state-added questions. Because the demographic information that designates an interview as partial is gained in the core questions, much of the state-desired information will not be obtained in a partial interview.

Partial interviews occur due to a refusal to complete the entire interview; thus survey factors such as refusal conversion will affect the percentage of partials. Other factors examined include interview length, call attempt, the day and time when partials occurred, and internal disposition codes to determine the situation when interviewers encounter a partial. External factors examined include the demographic characteristics of those who produce a partial, broken down by whether the partial is due to a refusal or something else.

Abt SRBI implemented the BRFSS data collection in the states of Massachusetts, Maryland, New Jersey, Ohio, Virginia and Georgia in both 2010 and 2011. A comparison and analysis of the partials by state will provide information about the factors associated with these incomplete interviews. Comparing 2010 to 2011, the partial rate (the number of partials over the number of partials and completed interviews) in the first six months of 2010 ranged across the reported states from a low of 4.1% in Maryland to a high of 12.5% in Massachusetts. All states saw an increase in the partial rate in the first half of 2011, with a high of 17% in Massachusetts.

Greater understanding of the factors behind partial interviews will lead to better field procedures to prevent partial interviews and persuade those not interested in finishing the interview to complete it.

Effects of Technical Difficulties on Item Nonresponse and Response Favorability in a Mixed-Mode Survey Jennifer Lee Gibson, Fors Marsh Group LLC ([email protected])

This study evaluates differences between responses to a quality of life survey of military recruiters offered in web and paper modes. Of the 3,997 participants, 202 (5%) reported having technical difficulties with the web survey. Common problems were trouble logging into the survey, security restrictions on personal computers, and difficulty navigating to the survey URL. Ultimately, 3,085 (77%) responded via the web and 912 (23%) responded via the paper survey.

Respondents who reported facing technical difficulties when attempting to respond via the web were 2.4 times (t=2.25, p<.05) more likely to respond to the open-ended question at the end of the survey than those who did not report technical difficulties. Among those who did respond to the OEQ, technical difficulties did not distinguish respondents in terms of response positivity, constructiveness, length, and topic. Interestingly, average job satisfaction did not differ by mode (Mweb = 3.4, Mpaper=3.4) or having experienced technical difficulties (Mreported=3.4, Mnot reported=3.4). Furthermore, the negative effect of technical difficulties on OEQ response rate was not moderated by survey mode (web versus paper). Item nonresponse for the five job satisfaction items was higher for web respondents than paper respondents, but no effects were found for having experienced technical difficulties.

These results illustrate a negative effect of technical difficulties on response rate for an open-ended question. However, there was no impact on the nature of the obtained open-ended responses. Evaluation of a closed-ended psychometric scale, job satisfaction, provides preliminary evidence that mode, but not technical difficulties, affected item nonresponse, while neither affected average values. The finding regarding OEQ response rate was replicated using data from an earlier administration of the survey, which indicated that respondents who reported technical difficulties were 1.7 times (t=5.57, p<.05) more likely to respond to the open-ended question at the end of the survey.

Look Who’s Screening? Participant Characteristics and Pregnancy Screening Outcomes in the National Children’s Study Keeshawna Brooks, NORC at the University of Chicago ([email protected]); Andrea Mayfield, NORC at the University of Chicago ([email protected]); Lee Lucas, Center for Outcomes Research and Evaluation-Maine Medical Center ([email protected])

The National Children’s Study (NCS) is the largest and most comprehensive long-term study of children’s health and development sponsored by the National Institute of Child Health and Human Development (NICHD) and other federal agencies. The NCS aims to track the health and well-being of 100,000 children from before birth through age 21 by examining possible influences on their health and development. During the pilot phase of the NCS, researchers used a scientific selection method to choose the 105 study locations across the United States. From these locations, individual neighborhoods, and then households, were selected to take part in the NCS. Data collectors then enumerated these households and screened for eligible adult women to be included in the study according to their pregnancy probability, including those who were currently pregnant, those trying and not trying to become pregnant, those who had experienced a recent pregnancy loss, and those who were medically unable to conceive. The pregnancy screening interview also collects demographic data, preferred methods of communication for study participation (e.g., cell phone calls, text messages), ways in which participants initially heard of the NCS, and data related to mobility.

The purpose of this research is to examine the correlates of NCS pregnancy screener outcomes to determine if demographic characteristics, mobility-related behaviors, or communication preferences are related to pregnancy likelihood. We explore whether these respondent characteristics can be used to identify populations with rare pregnancy probabilities. These relationships may aid in tailoring methods for locating or identifying this and other rare populations.

Investigating Spouse/Partner Dyad Response in a Longitudinal Study of Older Adults Meredith Czaplewski, NORC at the University of Chicago ([email protected]); Jennifer Satorius, NORC at the University of Chicago ([email protected]); Michael Colicchia, NORC at the University of Chicago ([email protected])

Methodological research focused on interviewing spouse/partner dyads is a less explored area in survey literature. Most research in this area has focused on within-interview protocols for maintaining data quality, while less attention has been paid to the role of dyad response in data quality. To researchers interested in relationship dynamics, participation behavior of both partners is a key methodological research issue. Understanding why dyads choose to participate may help researchers develop data collection processes that are more likely to yield higher response rates.

Using data from the National Social Life, Health, and Aging Project (NSHAP), we will employ a multidimensional approach to examine potential influences of partner participation. NSHAP is a longitudinal, population-based study of older adults that explores the interaction between aging, relationships, and health outcomes. The first wave of NSHAP was conducted in 2005-2006 with more than 3,000 respondents. The second wave of NSHAP was conducted in 2010-2011 with the same respondents (Main Respondents) and their cohabiting spouses or romantic partners (Partners).

In our analysis of partner response patterns, we will compare demographic characteristics of Main Respondents and Partners, as well as assess any interaction between these characteristics that may have promoted or hindered Partner participation. We will also examine the ways in which the Main Respondent’s physical and social health affected Partner participation. By design, NSHAP interviewed Main Respondents first and Partners last, within households. As such, we will also use paradata to consider whether the Main Respondent’s interview experience influenced Partner participation. Finally, to assess any circumstances that uniquely affect spouse/partner dyad participation among older adults, we will discuss dyad participation in the context of younger- and older-adult households.

Siamese Triplets Neither With nor Without: Jewish Israelis, Palestinian-Israelis, and Palestinians of the Palestinian Territories Meryem Ay, UNL - Gallup Research Center ([email protected]); Tarek Baghal, UNL- Gallup Research Center ([email protected])

In 1948, Israel was established as a Jewish state. However, it is also distinctive in that 20.3% of its citizens are ethnically Palestinian, the only Arab minority population in the Middle East (Central Bureau of Statistics – Israel). In addition, the Palestinians living in the Palestinian Territories are inextricably linked to both ethnic groups in Israel through their shared history and the political nature of the land. Although all three groups are linked, differences in levels of satisfaction across a variety of life domains are expected, since living conditions for the three groups may differ significantly. For example, Arab and Jewish Israelis have been shown to live in segregation (Falah 1996). Additionally, parts of the Palestinian Territories have been under a blockade, largely put in place and enforced by Israel, that impacts the economy and quality of life in general.

Due to the possible differences in living conditions between these three groups, this study focuses on the perceived quality of life of each group. We will analyze people’s opinions on their life satisfaction using Gallup World Poll data, a randomly sampled, nationally representative dataset collected in both Israel and the Palestinian Territories. These domains include satisfaction with life overall, personal freedom, the economy, education, housing, municipal services, and religious and political expression. It is expected that Jewish Israelis will express the highest satisfaction in most domains, followed by Palestinian Israelis, with those in the Palestinian Territories expressing the least satisfaction. Consistent with this expectation, initial results show, for example, that Jewish Israelis are more satisfied with their standard of living (73%), personal freedom (69%), and local economy (72%) than Palestinian Israelis (63%, 55%, and 32%, respectively), who are in turn more satisfied than those in the Palestinian Territories (48%, 46%, and 27%, respectively).

Telephone Quality Control Checks in a Mail Survey of Residential Utility Customers Christine Ledoux, Southern Company ([email protected]); Lincoln Wood, Southern Company ([email protected])

Every three years, Southern Company (the parent company of Alabama Power, Georgia Power, Gulf Power, and Mississippi Power) conducts a company-wide Residential Saturation Survey to understand the penetration and customers' usage of appliances and electronics in the utility's service area. The survey is primarily administered by mail with telephone follow-up to fill quota cells based on operating company, geography, and housing structure type. The information garnered from the survey is used for a variety of purposes within the company, including the all-important system load forecast that determines future generation needs for the four operating companies.

Because of the importance of the survey results to generation planning, it is necessary to collect the most accurate data possible. Therefore, each returned mail questionnaire is subjected to 31 logical consistency checks. When a questionnaire fails one or more of the approximately ten checks that are considered critical, a telephone follow-up call is made to the respondent to clarify the responses that indicate an issue with any of the 31 logical consistency checks. Results will be illustrated from the most recent wave of this study, which was conducted in 2010.
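
A hypothetical sketch of how such checks might be coded and used to flag cases for telephone follow-up; the check names, fields, and thresholds below are invented for illustration and are not the utility's actual 31 checks.

from typing import Callable

Record = dict
Check = Callable[[Record], bool]   # returns True when the record FAILS the check

CRITICAL_CHECKS: dict[str, Check] = {
    # An all-electric home should not also report a natural gas water heater.
    "all_electric_vs_gas_water_heater":
        lambda r: bool(r.get("all_electric")) and r.get("water_heater_fuel") == "gas",
    # Number of refrigerators reported should be plausible.
    "refrigerator_count_plausible":
        lambda r: not (0 <= r.get("num_refrigerators", 0) <= 6),
    # Central AC reported but structure type cannot have central AC.
    "central_ac_vs_structure":
        lambda r: bool(r.get("central_ac")) and r.get("structure") == "no_ac_apartment",
}

def needs_phone_followup(record: Record) -> list:
    """Return the names of the critical checks a questionnaire fails."""
    return [name for name, check in CRITICAL_CHECKS.items() if check(record)]

example = {"all_electric": True, "water_heater_fuel": "gas", "num_refrigerators": 2}
print(needs_phone_followup(example))   # ['all_electric_vs_gas_water_heater']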

This is an important paper because it demonstrates how logical consistency checks can be applied to a mail questionnaire to improve survey data quality, particularly in instances where the resulting data are critical for forecasting purposes. While many electric utilities conduct mail saturation surveys of their residential customers, this is perhaps the only example of such a project with stringent quality control requirements. Importantly, it also provides lessons for other mail studies where strict quality control requirements are necessary.

Understanding Nonresponse and Refusal to Participate in a Biobank Jeanette Y. Ziegenfuss, Mayo Clinic ([email protected]); Jennifer Ridgeway, Mayo Clinic ([email protected]); Janet E. Olson, Mayo Clinic ([email protected]); Timothy J. Beebe, Mayo Clinic ([email protected])

The Mayo Clinic Biobank is a collection of patient blood samples, clinical data and questionnaire data from 20,000 patients with a wide array of health concerns. Two to three weeks prior to a scheduled medical appointment, patients are sent a recruitment packet. The packet contains information about the Biobank, a consent form, a form to select an incentive valued at $20, a questionnaire (containing questions about health behaviors, environmental exposures and family health history), as well as information about how to provide a blood sample. Over a specific period of time in August 2011, 1,600 such requests were made. The Biobank achieved a 25% participation rate in response to these solicitations. An additional 12% actively refused participation and the remaining 64% did not respond. In an effort to better understand participation, we randomly selected refusers and non-responders for phone follow-up and completed 26 telephone interviews in each group. We found that there were no differences in self-reported health status and little difference in educational attainment between groups. Refusers were more likely to have participated in medical research in the past and more likely to report having reviewed all or most of the consent form. Privacy concerns and lack of time were the reasons refusers gave most frequently when asked why they decided not to respond. Non-responders were also most likely to cite being too busy to participate or not being able to “get around to it.” Future analysis will include comparing the demographic composition of the refusers and non-responders to that of those who chose to participate to better understand the potential for non-response bias in the recruitment of subjects through this mechanism. The Biobank recruitment team is using these findings to consider changes to the recruitment materials in order to minimize concerns about privacy and the perceived amount of time required for participation.

A National and Multistate Survey on Issues of Importance to the 50+ Population Joanne Binette, AARP ([email protected]); Jennifer H. Sauer, AARP ([email protected])

This poster will illustrate the methodology, methodological implications, and key findings of a national survey and 53 state-level surveys conducted by AARP. Although an increasing number of American households are cell-phone only, reaching the 50+ population via landline remains an effective survey research method.

In 2010, AARP fielded 77 telephone surveys to gather state and national level information on the needs, interests and concerns of the 50+ population. AARP collected data from approximately 29,000 adults ages 50+ in all 50 states, the District of Columbia, Puerto Rico and the U.S. Virgin Islands. The survey included a standard core set of questions for each state and the opportunity for each AARP state office to choose up to ten more pretested survey questions from a large question ‘bank’. These questions were mostly advocacy related, ranging from community service issues to long-term care to financial security. This semi-custom survey design allowed for national baseline data as well as comparative state and regional data.

The national and state survey samples were generated using random digit dial methodology designed to reach all households in each state and U.S. territory with landline telephone service. Eligible households were initially identified based on telephone prefix (provided by M-S-G, Inc.) and were confirmed by respondent self-reported information. Respondents were selected within households from males and females age 50 and older who were at home at the time of the interview. This study also included 23 samples of African-American and Hispanic adults age 50+. Interviews were offered in Spanish in the Hispanic markets. The average response rate among all states was 4.43 percent, the average cooperation rate was 58.87 percent, and the margin of error for each state survey was ± 5.0 percent. The margin of error for the national survey was ± 7.0 percent.
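
For reference, the quoted per-state margins correspond to the textbook margin-of-error formula for a proportion at 95% confidence; the sketch below uses a hypothetical sample size and ignores design effects and weighting, so it is only a rough check on the figures above.

import math

def margin_of_error(n: int, p: float = 0.5, z: float = 1.96) -> float:
    """Half-width of a 95% CI for a proportion, in percentage points (worst case p = 0.5)."""
    return 100 * z * math.sqrt(p * (1 - p) / n)

# A simple random sample of ~385 completes yields roughly a +/- 5 point margin.
print(round(margin_of_error(385), 1))   # ~5.0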

Measuring, Quantifying and Bemoaning Civic Health in America Don Levy, Siena Research Institute ([email protected])

Drawing on a series of studies, this paper explores the current state of civic health in America among all citizens and college students. Simultaneously, the studies speak on several levels of analysis, in that the civic health of individuals is assessed and collectively used to comment on the universes they populate, including state, community, or institution of higher education.

As with many oft-discussed concepts, civic health cannot be measured directly but rather requires painstaking conceptualization and subsequent multi-indicator measurement. In these studies, civic health is understood as a collective state in which legitimate human needs, defined in terms dating to the United Nations’ Universal Declaration of Human Rights, are addressed, and in which both the individual-level perception of that state and the individual contribution to it are realized and expressed.

In the first study, public opinion data from a three-year study are used to describe civic health in terms of a new STRID Index. This index, comprising over twenty indicators, quantifies the extent to which citizens are (S) socially engaged with other citizens, (T) trusting of local, regional, and national institutions as well as of other citizens, (R) expressing the responsibilities of citizenship, including attending public meetings and events, (I) informed of salient local political and cultural events, and (D) performing the duties of citizenship, including voting, volunteering and charitable giving.

The second study draws from over 12,000 online interviews with college students from 30 colleges and universities. Students were asked to describe their involvement in nine areas of human need, and we measure involvement, depth, and frequency for each area. The unique ‘percent of the possible’ score provides a fruitful depiction of engagement as a component of civic health among students and of the capacity contribution expressed by the institution that houses them.
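
A hedged sketch of a 'percent of the possible' style score: observed involvement summed over areas and divided by the maximum attainable total. The nine area labels, the 0-3 depth scale, and the example student are assumptions made for illustration, not the study's actual instrument.

AREAS = ["food", "shelter", "health", "education", "safety",
         "environment", "civic voice", "economic security", "community"]
MAX_PER_AREA = 3   # e.g., 0 = none, 1 = aware, 2 = occasional, 3 = sustained

def percent_of_possible(depth_by_area: dict) -> float:
    """Sum of observed involvement scores over the maximum attainable total."""
    total = sum(depth_by_area.get(a, 0) for a in AREAS)
    return 100 * total / (MAX_PER_AREA * len(AREAS))

student = {"food": 2, "education": 3, "community": 1}
print(round(percent_of_possible(student), 1))   # 22.2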

Assessing the Feasibility of Respondent-Driven Sampling: A Telephone Survey of African American Males in Georgia Robert P. Agans, Carolina Survey Research Laboratory, Dept of Biostatistics, UNC-CH ([email protected])

Respondent-driven sampling (RDS) uses chain-referral methods to obtain representative samples of hard-to-reach but socially interconnected populations. Based on the work of Heckathorn in the late 1990s, RDS has become a legitimate probability-based method producing unbiased indicators with known variability. Most of this work, however, has been done face-to-face and has not been applied in a CATI environment. In this study, we recruited African American males between the ages of 18 and 65 in four Georgia counties (DeKalb, Fulton, Muscogee & Lowndes) using two random cell phone samples. To test the efficiency of RDS, half of the cell phone sample implemented an RDS approach, while the other half served as a comparison group. In this paper, we compare the costs of recruiting cell phone-only respondents into the study via standard methods versus RDS methods and evaluate the cost-effectiveness of RDS given that weighting and analysis are complicated by use of the procedure. In addition, we discuss practical matters and alert researchers to some of the potential problems of implementing RDS in a CATI setting.
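
For readers unfamiliar with RDS estimation, the sketch below shows one standard degree-weighted estimator (the Volz-Heckathorn RDS-II form), in which respondents are weighted inversely to their reported network size; it is illustrative only, uses invented data, and is not necessarily the weighting applied in the Georgia study.

import numpy as np

degree = np.array([10, 4, 25, 8, 15, 6])   # reported personal network sizes
y = np.array([1, 0, 1, 1, 0, 1])           # indicator of the trait of interest

weights = 1.0 / degree                      # inverse-degree weights
rds_estimate = np.sum(weights * y) / np.sum(weights)
naive_estimate = y.mean()                   # unweighted sample proportion, for contrast
print(round(rds_estimate, 3), round(naive_estimate, 3))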

Meeting Expectations: The Intersection of Issues, Traits, Party, and Gender in the Candidate Evaluation Process Lindsey Meeks, University of Washington ([email protected])

Voters draw upon a constellation of factors when forming their opinion of political candidates and eventually using that opinion to cast their vote. Among this myriad, three components compose a large share of the assessment process: political party identification, political issues, and character traits. These three factors, however, weigh differently in the evaluative process for individual voters, and within each factor, there is an internal hierarchy of influence, e.g., one issue is more salient to an individual than another. Furthermore, these factors intersect with another important, omnipresent evaluative cue in politics: gender. With these characteristics in mind, I conducted an experiment to explore the relationship between the issues and traits voters deem most important, political party identification and expectations, and electoral success. A total of 1,057 adult participants were randomly assigned to one of six news articles that pitted either two Republican or two Democratic women candidates against each other for the office of Governor. The candidates were reported as focusing on certain political issues and projecting certain character traits. Regarding issues, when the individuals’ most salient issues converged with the candidates’ issue emphases, the individual was more likely to vote for the candidate and to think that the candidate would be most likely to win. This trend was also consistent for those who identified compassionate traits (e.g., honesty, friendliness) as their most important traits. However, for those who expressed a preference for instrumental traits, the candidate that projected confidence and toughness had more electoral success than the candidate that emphasized ambition and independence. For party, the candidate that was reported as emphasizing traditionally Republican issues and traits had less electoral success among Republican voters. Conversely, the candidate that was reported as emphasizing traditionally Democratic issues and traits had more electoral success among Democratic voters.

Collecting Dried Blood Spots in a Sample of Cambodian Refugees Suzanne Perry, RAND Corporation ([email protected]); Emily Cansler, RAND Corporation ([email protected]); Judy Perlman, RAND Corporation ([email protected])

This paper describes the procedures developed and implemented for collecting health markers in a sample of Cambodian refugees (n = 526) using Dried Blood Spots (DBS). Initially developed in the 1960s for use in screening newborns for metabolic diseases, the use of DBS has become common in large-scale population and epidemiological studies. However, smaller research projects can also take advantage of the method as an easy, accurate, and cost-effective way to incorporate basic health markers into field research projects. We collected DBS as part of a follow-up study of randomly selected Khmer-speaking households from the Cambodian community in Long Beach, California. Respondents were between 41 and 81 years of age, lived in Cambodia during the Khmer Rouge reign, and immigrated to the United States prior to 1993. We used DBS to test for three health markers: total and HDL cholesterol (indicators of heart disease risk), C-reactive protein (an indicator of inflammation), and HbA1c (an indicator of diabetes risk). In this paper, we provide an overview of the history of and method for collecting DBS, tests that can be performed using DBS samples, and advantages and disadvantages of the method. Next, we describe efforts taken to train and supervise field interviewers to follow the strict protocol we designed to collect high-quality samples. Finally, we discuss issues to consider in interpreting test results and calculating disease risk. Overall, we found that cultivating a strong partnership with the lab that processed our samples was key in allowing us to formulate procedures for sample collection, provide prompt feedback to our field staff about sample quality, and interpret the test results.

Gaining Knowledge from the Field: The Importance of Fact-Finding Trips Prior to the Design and Implementation of Health Evaluation Surveys in Central America Bernardo Hernández Prado, Institute for Health Metrics and Evaluation, University of Washington ([email protected]); Paola Zúñiga Brenes, Inter-American Development Bank ([email protected]); Catherine M. Wetmore, Institute for Health Metrics and Evaluation, University of Washington ([email protected]); Rafael Lozano, Institute for Health Metrics and Evaluation, University of Washington ([email protected]); Ali H. Mokdad, Institute for Health Metrics and Evaluation, University of Washington ([email protected])

Introduction Evaluation of health programs and interventions is a growing field that often relies on primary data collection. Advances in data collection have improved the efficiency of field work and increased data quality. However, field conditions that may affect the data collection process are sometimes overlooked in the design phase. We describe the goals and outcomes of a series of fact-finding visits undertaken prior to data collection for the evaluation of the Salud Mesoamerica 2015 Initiative (SM2015).

Methods SM2015 aims to improve health indicators (mainly maternal-child health and nutrition) in Mesoamerica. The evaluation of SM2015 will be based on data collected via household and health facility surveys. Visits to five participant countries were conducted to seek out existing administrative and survey data that would inform our sampling procedures, and to obtain valuable knowledge of the field conditions and other logistical considerations prior to data collection.

Results Between July and September 2011, we conducted fact-finding visits to communities and health facilities in Guatemala, Nicaragua, Honduras, Panama and Mexico. These trips proved to be valuable experiences with important ramifications for the design and implementation of data collection efforts. The site visits forced us to consider geographic characteristics in the sample design and the field work logistics. We established contacts with community leaders in indigenous areas and obtained prior authorization to conduct the field work. We had the opportunity to observe the physical space and workflows in hospitals and clinics, including complex referral systems, which influenced the design of our facility surveys. Lastly, we were able to identify previously unrecognized bottlenecks in health care provision, which are now addressed in the surveys from a population perspective.

Conclusion Fact-finding trips improve data collection efforts by providing opportunities to consider important aspects of the field conditions that otherwise would have passed unnoticed.

Can Pre/Post Surveys Measure Media and High Visibility Enforcement Impact Towards Motivating Driver Behavior Change? Evaluation of Ticketing Aggressive Cars and Trucks Safety Campaign Danna L. Moore, Social and Economic Sciences Research Center ([email protected])

A multi-faceted, mixed-mode survey evaluation was implemented to evaluate a highway safety campaign. Speed and passing are two of the most prevalent factors contributing to traffic crashes. Combine these two factors with an aggressive driver passing a large truck, and if something goes amiss, the truck driver is most likely the one to walk away. TACT (Ticketing Aggressive Cars and Trucks) is a safety program designed to increase awareness, educate, and promote safer driving around commercial vehicles by increasing passing distances on rural highways. The campaign included new signage, high-level enforcement, and radio, TV, newspaper, and gas pump media. To know whether the campaign changed things, there is a need to know what conditions were like before the campaign was implemented. This evaluation set out to measure the effect of the advertising, media, and enforcement using surveys conducted before, during, and after the campaign in the zip code areas surrounding rural highway corridors. The key to determining effectiveness was to evaluate driver characteristics in areas exposed to the interventions and compare them to areas not exposed. Comparisons were implemented for four rural highway corridors, immediately before, during, and after the campaign. A total of 1,478 telephone CATI interviews were completed for the pre/during/post test phases. Almost 4,000 drivers had enforcement contact during the program, and surveys were completed with a random sample of violators. In addition, a long-term recall survey was conducted in an area that had been exposed a number of years previously. The results show significant differences between groups on specific test measures. These findings are likely to be of value for public safety organizations that wish to educate the public about potentially dangerous and risky driving behaviors. The results are also of interest to researchers as they demonstrate the value of a pre/post experimental design.

Update Your Status Lately? – Then Why Not Respond to Our Survey! Debbie Borie-Holtz, Rutgers University ([email protected])

Searching public records and utilizing a social network aggregator web site (which culls together data from both online and offline sources) to construct an address-based sample frame was found to have a three-fold advantage. In a national panel study of top legislative leaders in the 50 states from 1997 to 2010, this data collection process helped to reduce coverage error to less than 1%. Additionally, the detailed information allowed a customized and personalized “pitch letter” to accompany the initial and follow-up questionnaire mailings, which resulted in a 62% response rate. Finally, email contacts obtained through online searches were also used as a follow-up contact, which led to higher response rates as compared to those without emails and lowered implementation costs. As such, when the census survey of legislative leaders was extended to officials who served in 2011-12, an additional attempt was made to contact non-responders from Phase 1. In Phase 2, Facebook and LinkedIn were used to search for online contact information for approximately 250 former and current public officials. In addition to reporting the demographic profile of those leaders and former leaders who maintained profiles on these two sites, this contact information was used to send follow-up messages to the respondents requesting their participation. Again, the response rate of those respondents contacted through one of these social networks was measured. An additional question was added in Phase 2 to also measure the likelihood that the completed survey resulted from the social media follow-up contact. Finally, a Chi-square test of no difference was conducted to determine if those who responded after the social media follow-up were different from those who did not respond.

The Quality Pledge: Encouraging Accurate Reporting Inna Burdein, NPD group ([email protected])

Previous research has found that a simple quality pledge, requesting panelists to pay attention or strive for accuracy, results in less straightlining and more time spent in surveys. To further build on this research, the goal of this study is to test three different types of pledges, which appeal to different motivations: (1) Accountability (i.e. In order to qualify for prizes…), (2) Helping (i.e. Please answer honestly…), and (3) Respect (i.e. …vital to the integrity of this research). Additionally, whether the respondent was asked to passively (pressing next) or actively (committing to accuracy) agree to the pledge will be tested. The goal is to determine what kind of quality pledge is most effective in increasing accurate and non-fraudulent behavior, without hurting completion rates or satisfaction ratings of the survey.

In an effort to ensure the pledge is both read and effective, I also varied the length of the standard introduction and whether or not a picture of a survey researcher was present at the introduction.

Priming Issue Agendas and Changes in Trust in Government over Time: The Multilevel SEM approach Dmitriy Poznyak, University of Cincinnati ([email protected]); Stephen Mockabee, University of Cincinnati ([email protected]); Bart Meuleman, University of Leuven ([email protected])

A recent NYT/CBS poll found that Americans’ distrust of government is at its highest level in history and that Americans’ approval of Congress has dropped to 9%. What's driving these dramatic drops in public confidence? This research addresses the issue directly, advancing the proposition that changes in levels of political trust in the “government in Washington” can be explained by shifts in national and public issue agendas, which in turn alter the cognitive accessibility of the criteria used in judging the trustworthiness of government. We test the agenda setting (mediation) and priming (moderation) hypotheses of political trust.

We take a multilevel SEM approach, using the ANES cross-sectional time-series data—augmented by aggregate-level data—to show that shifts in the level of trust on the macro-level are highly sensitive to changes in indicators of the state of the domestic economy and the state of international affairs. Controlling for individual-level predictors, we demonstrate how changes in certain aspects of the national issue and problems agenda—using real-world and media indicators that measure the degree of severity of a problem in a nation—determine the public’s agenda—that is, the degree of public concern about these issues as measured in public opinion polls. By employing mediation analysis on the macro-level, we confirm that public perceptions of the nationally important problems mediate the effect of the national issues and problems agenda on political trust.

We test the priming hypothesis of political trust by modeling cross-level interactions between the indicators of the national agenda on the macro level and public perceptions of the nationally important problem on the individual level. We confirm that changes in national issue importance shift the weight citizens place on that issue when evaluating the government. When the issue receives its “normal attention" in the media, it does not have an effect on trust.
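
As a much-simplified stand-in for the cross-level interaction test described above (a random-intercept mixed model rather than the authors' multilevel SEM), the sketch below fits the agenda-by-concern interaction on simulated data; all variable names are placeholders, not ANES variables.

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
years = np.repeat(np.arange(1980, 2010), 50)        # macro units (survey years)
agenda = np.repeat(rng.normal(size=30), 50)         # year-level severity indicator
concern = rng.normal(size=years.size)               # individual-level issue concern
trust = (0.3 * agenda - 0.2 * concern
         - 0.15 * agenda * concern                  # simulated cross-level interaction
         + rng.normal(scale=1.0, size=years.size))

df = pd.DataFrame({"year": years, "national_agenda": agenda,
                   "issue_concern": concern, "trust": trust})

# The national_agenda x issue_concern term is the cross-level interaction of interest.
model = smf.mixedlm("trust ~ national_agenda * issue_concern",
                    data=df, groups="year").fit()
print(model.summary())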

Cultural Differences in the Validity of Self-Reports of Chronic Health Conditions Young Ik Cho, Zilber School of Public Health, University of Wisconsin - Milwaukee ([email protected]); Timothy P. Johnson, Survey Research Laboratory, University of Illinois at Chicago ([email protected]); Allyson L. Holbrook, Survey Research Laboratory, University of Illinois at Chicago ([email protected]); Sharon Shavitt, Business Administration, University of Illinois ([email protected]); Noel Chávez, School of Public Health, University of Illinois at Chicago ([email protected]); Saul Weiner, Department of Medicine, University of Illinois ([email protected])

Self-reports of health have been widely used because of their ease of data collection and strong correlation with indicators of morbidity and mortality. However, the measurement of health inequalities with self-reports can be biased if individuals from different cultural backgrounds vary in their perceptions and interpretations of health conditions. This paper addresses potential cultural disparities in the quality of self-report measures of health conditions. To better understand the quality of health reporting and the potential effects of culture on measurement quality in behavioral surveys, we compare the validity (i.e., concordance) of chronic health status reporting across race/ethnic and language groups. These responses are compared to biomedical data collected at the time of the interview. Four specific health conditions are examined -- asthma, diabetes, hypertension and obesity -- within a sample of 600 African American, Korean American, Mexican American and non-Hispanic white adults recently interviewed in Chicago. Associations between health reporting validity and acculturation status among immigrants in the sample are also investigated. These associations are further examined in multivariate models that control for several respondent level sociodemographic measures also known to be correlated with self-report validity. Discussion will focus on the degree to which findings suggest systematic differences between ethnic groups in the quality of self-reports of chronic health conditions and implications for survey research. For instance, we propose that survey items measuring health conditions should be written in ways that can minimize respondent’s subjective interpretation by providing culturally unbiased objective descriptions of health conditions. Also, use of the survey response options that are known to be culturally sensitive should be avoided.

Does Supplying Definitions on Request to Opinion Questions on the Ethics of Assisted Reproductive Techniques Affect the Response Patterns? A Comparison of Two Telephone Surveys. Brooke Long, Kent State University ([email protected]); Laurie K. Scheuble, The Pennsylvania State University ([email protected]); David R. Johnson, Pennsylvania State University ([email protected])

When seeking opinions from the public on the ethics of common assisted reproductive techniques used by infertile women to have children, a significant segment of respondents may not understand the procedure well enough to express an informed opinion. Including a one- or two-sentence definition of the procedure in the item could help, but this comes at the cost of a longer interview and greater respondent burden. An alternative method for an interviewer-administered survey mode is for the interviewer to tell the respondent that he/she will provide a definition of the procedure if the respondent is not certain about it. It is not clear, however, how this would affect the opinions expressed or whether providing the definition on demand will alter the response patterns enough to warrant its use. In this paper, we examine responses to six opinion items asking about the respondent’s perception of the ethics of common assisted reproductive techniques (ART) [insemination with partner’s sperm, donor insemination, in vitro fertilization, donor eggs, surrogate mother and gestational carrier]. These items were included in two telephone surveys: one (N=574) in which the items were asked without an offer to provide a definition and one (N=3,293) in which the offer was made. Both were RDD surveys of women age 25-45. We first examine how frequently respondents asked for a definition and the factors that predicted whether or not they asked for one. We then compare the responses on the ethics of the procedures in the two studies, adjusting for any demographic differences between the samples with regression analysis. Demographic variables and ethical concerns about ART were significantly related to asking for definitions, but there were only small differences in response patterns between the two studies.

Response Anchoring and Polarity Effects on Endorsement and Response Patterns William Bryan Higgins, ICF International ([email protected]); Randall K. Thomas, ICF International ([email protected])

Response categories may be used differently as a result of ethnic background or country of residence (e.g. Baumgartner and Steenkamp, 2001; Chen, Lee, and Stevenson, 1995; ter Hofstede, Steenkamp, and Wedel, 1999). Many researchers believe that respondents from some countries/ethnicities are less likely to use extreme response categories while those from other countries are more likely to use them. When making comparisons between countries/ethnicities, we need to ensure that we do not confound country/ethnicity with other factors, such as scale polarity (bipolar vs. unipolar) and extent of verbal labeling of response categories, before we attribute differences to culture. In this study, we present 3 web-based survey experiments, 2 from the U.S. and 1 international, in which we compared scale variants (e.g. unipolar versus bipolar), extent of semantic anchoring (fully anchored scales give a semantic label for each response; end-anchored scales provide only the extremes of the scale), and order of responses to see how they affect conclusions and can relate to differences between countries. We found significant differences in endorsement proportions for the response categories as a function of scale type. The fully anchored unipolar scale showed lower endorsement of the highest response categories across countries/ethnicities. Across the experiments, there were mean differences in the evaluations of the issues as a function of country/ethnicity. However, controlling for familiarity with the topic and demographic factors, we found that these differences between groups and countries were eliminated or reduced for most of the activities we examined. We found significant differences in response patterns (extreme versus middling response patterns) as a result of type of scale, especially for end-anchored scales, regardless of scale polarity.
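
One common way to quantify the "extreme versus middling" response patterns mentioned above is an extreme-response-style index, the share of a respondent's answers falling in the top or bottom scale category; the sketch below uses invented 5-point grid data and is not the authors' code.

import numpy as np

responses = np.array([
    [1, 5, 5, 1, 5],   # respondent using mostly the scale endpoints
    [3, 3, 4, 2, 3],   # respondent using mostly middling categories
])
scale_min, scale_max = 1, 5

extreme = (responses == scale_min) | (responses == scale_max)
ers_index = extreme.mean(axis=1)          # proportion extreme per respondent
print(ers_index)                          # [1.0 0.0]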

The Use of Online Methodology to Inform Public Policy Planning: A Case Study from San Francisco Jeffrey Shand-Lubbers, Knowledge Networks ([email protected]); J. Michael Dennis, Knowledge Networks ([email protected]); Jordon Peugh, Knowledge Networks ([email protected]); Liz Brisson, San Francisco County Transportation Authority ([email protected]); Elizabeth M. Bent, San Francisco County Transportation Authority ([email protected])

This paper will focus on methodology and learnings from a public policy project integrating qualitative and quantitative data collection as part of a San Francisco County Transportation Authority (SFCTA) project designed to incorporate public opinion in transportation planning. The key research question for this paper is how can qualitative dialogues contribute to, and integrate with, quantitative surveys to produce an enriched source of public opinion within the policy planning process?

Using its Quale® ForumSM research service, Knowledge Networks and SFCTA designed an online Town Hall approach in 2010 involving three online meetings with residents of the San Francisco Bay Area. SFCTA used Knowledge Networks as one of a variety of research tools for understanding public opinion, and also drew on quantitative data collected during the meetings as well as recruitment and follow-up quantitative surveys. This paper will describe this methodology in detail.

Participants were asked to attend an online Town Hall meeting using Quale® ForumSM consisting of a presentation on hypothetical transportation policies in the San Francisco area, polling questions, and real-time dialogues with experts in transportation policy. During each meeting a policy expert presented alternatives for congestion management and answered participant questions. Conducting region-based sessions allowed researchers to address unique questions and concerns from the different geographic areas.

Participants were also invited to take a follow-up online survey, which yielded an overall response rate of 89%. These data, combined with questions asked during the recruitment survey, provided rich insight into public opinions on traffic-related topics. The pre- and post-Town Hall survey approach using Quale® ForumSM allowed for examination of possible change in support for or opposition to key topics.

This research paper will describe the outcomes of the meetings and recruitment and follow-up surveys in terms of their impact on and contribution to next steps in the policy planning process.

A Shot in the Dark: Measurement Influence on Likelihood to Vaccination William Bryan Higgins, ICF International ([email protected]); Randall K. Thomas, ICF International ([email protected])

Influenza is associated with tens of thousands of deaths each year in the U.S., though the number of deaths rose significantly during the three pandemics of the past century. In general, flu vaccinations have become more effective in controlling the extent and severity of the annual incidence of flu. A major factor affecting vaccination effectiveness is people’s willingness to get vaccinated. We conducted an experiment to examine the effect that response measures have in predicting people’s intention to get a flu shot. We had 6,247 U.S. respondents participate in a web-based survey. We examined past flu occurrence and flu vaccination, and then examined the influence of 2 key experimental manipulations. The first contrasted the absolute risk of flu strains today (Not at all dangerous – Extremely dangerous) with their comparative risk (Less dangerous than prior strains – More dangerous than prior strains). The second randomly assigned respondents to one of two comparative tasks: to evaluate the comparative risk of the flu vaccination versus the risk of getting the flu, or to evaluate the comparative liking for getting the flu vaccination versus getting the flu. This created a 2 X 2 factorial design. We then examined a number of predictors of likelihood to get vaccinated for each cell of the experiment and found that the combination of the comparative risk rating of the current strain and the comparative liking condition for the flu vaccination yielded the best predictive model for flu vaccination. The results indicated that decision making concerning vaccinations is subject to both the relative judgment of risks and hedonic relevance to the respondent, and that public health strategies should emphasize the factors affecting these judgments to improve vaccination compliance.
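
A hypothetical sketch of how the 2 X 2 design described above could be analyzed as a logistic regression with an interaction term; the data are simulated and the variable names are placeholders, not the study's measures.

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)
n = 6247
df = pd.DataFrame({
    "risk_frame": rng.choice(["absolute", "comparative"], size=n),
    "task": rng.choice(["risk_comparison", "liking_comparison"], size=n),
})

# Simulate an intention-to-vaccinate outcome with a small interaction effect.
lin = (0.2 * (df.risk_frame == "comparative")
       + 0.1 * (df.task == "liking_comparison")
       + 0.15 * ((df.risk_frame == "comparative") & (df.task == "liking_comparison")))
p = 1 / (1 + np.exp(-(lin - 0.3)))
df["intend_vaccinate"] = rng.binomial(1, p.to_numpy())

model = smf.logit("intend_vaccinate ~ C(risk_frame) * C(task)", data=df).fit()
print(model.summary())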

Saturday, May 19, 2012 2:15 p.m. - 3:45 p.m. Concurrent Session I

Advancing the Methodology for Cognitive Pretesting and Evaluation of Multilingual Survey Instruments

Advancing the Methodology for Cognitive Pretesting and Evaluation of Multilingual Survey Instruments M. Mandy Sha, RTI International ([email protected]); Yuling Pan, U.S. Census Bureau ([email protected]); Hyunjoo Park, RTI International ([email protected]); Lu Liu, RTI International ([email protected]); Marissa Fond, US Census Bureau ([email protected]); Barbara Lazirko, US Census Bureau ([email protected]); Jiyoung Son, Independent Consultant ([email protected])

Experimenting with Incentives

Making the Money Count: Maximizing the Utility of Incentives in a Two-Stage Mail Survey Cameron Brook McPhee, American Institutes for Research ([email protected])

Since 1991, the National Center for Education Statistics (NCES) has used the National Household Education Surveys Program (NHES) to collect education-related data from households on topics that are difficult to study through institution-based frames. From 1991 through 2007, the NHES used a list-assisted RDD CATI survey. However, like most RDD surveys, NHES response rates have been declining over time and the increase in households converting from landlines to cell phone-only service has raised concerns about population coverage. These issues prompted NCES to redesign the NHES program, shifting to a two-stage address-based mail survey.

In July 2011 NCES completed a field test of the redesigned mail survey on a nationally representative sample of approximately 41,000 addresses in the United States. Included in the field test were three incentive experiments. At the screener phase, prepaid incentives of $2 and $5 were offered. At the second phase, eligible screener respondents were either offered no incentive, a prepaid incentive of $5, $10, $15, or $20 at the first mailing, or a prepaid incentive of $5 or $15 only at the second nonresponse follow-up (third mailing).

Our analyses examine the effectiveness of differential incentive levels in NCES’s new two-stage design. Logistic regression analyses indicate that topical incentives are generally more effective if provided at the initial request rather than at the follow-up phase, and that it is unnecessary to offer larger second phase incentives to households that responded quickly to the initial screener, but necessary and effective for households that required several follow-ups before returning the initial screener.
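
As an illustration of the kind of tabulation that sits behind such logistic regression results, the sketch below computes second-phase (topical) return rates by incentive amount and by when the incentive was offered; every count in it is fabricated for illustration.

import pandas as pd

cells = pd.DataFrame([
    # incentive, timing,                 mailed, returned
    ("$0",  "initial request",        2000,  480),
    ("$5",  "initial request",        2000,  640),
    ("$15", "initial request",        2000,  720),
    ("$5",  "nonresponse follow-up",  2000,  560),
    ("$15", "nonresponse follow-up",  2000,  610),
], columns=["incentive", "timing", "mailed", "returned"])

cells["return_rate"] = cells.returned / cells.mailed
print(cells.pivot(index="incentive", columns="timing", values="return_rate"))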

We found that the patterns observed nationally are not homogeneous across all demographic groups. The patterns diverged in linguistically-isolated areas, Hispanic households, African-American households, and lower income households, further demonstrating the complexity of targeting incentives in order to maximize response and representativeness while containing costs.

Address Based Sampling: Census Block Group Data Used to Define Incentive Structure Anh Thu Burks, The Nielsen Company ([email protected]); Michael Link, The Nielsen Company ([email protected])

A significant but little understood benefit of address-based sampling (ABS) is the ability to append data to each sample unit to develop sample indicators that drive differential treatment of sampled addresses. Census Block Group (CBG) information, such as socio-economic status, owner-renter status, household size and occupation, can be used to tailor recruitment (mailings and incentives) to gain respondent cooperation. We present the results from a test using different combinations of CBG variables and commercial indicators to develop two alternative ways of targeting incentives to hard-to-reach respondents (younger adults, Blacks, and Hispanics). The first identifies addresses in block groups where the percentage of Blacks was 70% or higher, or addresses associated with a Hispanic surname, or addresses with a householder aged 34 years or younger. Homes here received a $5 prepaid cash incentive to complete a self-administered survey; all others received $2. The second used a more complex combination of variables (CBG and commercial indicators: percentage of younger renters, Spanish speakers, household income, and younger, Black, or Hispanic-surname homes in low-income areas) to identify a $5 incentive group. Using a 2 x 2 nationwide ABS design, the two incentive groups were tested across two modes: half using an online questionnaire only, and half with a mail questionnaire with online follow-up 10 days later. The test allows us to examine not only the impact of different incentive structures, but also the intersection of incentive and mode. The analysis focuses on response rates but, perhaps more importantly, on which approach produces percentages of adults aged 18-34, Blacks, and Hispanics that most closely mirror nationwide estimates. The results have implications for the effective use of ABS in tailored survey designs.
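
The first targeting rule described above translates directly into code; the field names below are assumptions about how the appended CBG and commercial indicators might be stored, not the actual file layout.

def incentive_amount(addr: dict) -> int:
    """Return the prepaid cash incentive ($5 or $2) for a sampled address."""
    targeted = (
        addr.get("cbg_pct_black", 0) >= 70          # block group 70%+ Black
        or addr.get("hispanic_surname", False)      # Hispanic surname match
        or addr.get("householder_age", 99) <= 34    # householder 34 or younger
    )
    return 5 if targeted else 2

print(incentive_amount({"cbg_pct_black": 82}))                         # 5
print(incentive_amount({"householder_age": 52, "cbg_pct_black": 40}))  # 2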

Want to be an Early Bird? Can Encouraging Respondents to Contact Interviewers to Make Appointments Boost Co-operation Rates and Save Costs in the UK Context? Matt Brown, Centre for Longitudinal Studies ([email protected]); Lisa Calderwood, Centre for Longitudinal Studies ([email protected])

A recent trial conducted by the National Longitudinal Surveys in the U.S., in which respondents were financially incentivized to contact interviewers to arrange interview appointments, found that, for those taking up the offer, the interviewer time required to achieve a productive interview was significantly reduced. The savings in interviewer time more than outweighed the cost of providing incentives. This approach represents a reversal of typical fieldwork practice, whereby interviewers drive the process of gaining respondent cooperation by mailing advance materials themselves, deciding themselves when to attempt initial contact, and, to an extent, arranging appointments around their own availability.

In this paper we will present the results of a randomized control experiment, conducted in 2011 amongst 1400 households in the UK Household Longitudinal Study Innovation Panel, which seeks to evaluate the potential impact on co-operation rates and cost-efficiencies which could be achieved by encouraging respondents (with and without financial reward) to contact interviewers to schedule appointments.

Households were randomly allocated to one of three groups. Two groups received advance materials two weeks before fieldwork began. These materials sought to encourage participants to be an ‘Early-Bird’ and contact their interviewer to arrange an appointment in the first two weeks of fieldwork. Group 1 were offered a financial incentive (to all members of the household) to do this whereas the encouragement for Group 2 was additional text in the advance materials explaining that being an ‘Early-Bird’ would make ‘your interviewer’s life much easier as they will not have to make repeated telephone calls or visits to your home in order to reach you’. The third group acted as a control group and received the ‘standard’ contact strategy – advance materials were posted at the beginning of fieldwork and followed by the interviewer attempting to make contact.

Satisficing in Telephone Surveys: Do Prepaid Cash Incentives Make a Difference? Rebecca Medway, Joint Program in Survey Methodology, University of Maryland ([email protected])

Prepaid cash incentives repeatedly have been found to increase response rates in telephone surveys. Beyond convincing sample members to participate at all, incentives also may alter respondents’ motivation while completing the interview. Because motivation is a key driver of satisficing behavior, incentives may subsequently affect the prevalence of satisficing. Existing research investigating the impact of incentives on satisficing behaviors has produced mixed results. However, these studies have focused on a narrow set of indicators, particularly item nonresponse. This presentation aims to expand our knowledge of the effect of incentives on respondent effort by examining the impact on a wider variety of satisficing indicators, such as non-differentiation, recency effects, and the length of responses to open-ended items.

The data for this presentation will come from a recent nationwide telephone survey, in which a proportion of the sample members were randomly assigned to receive a $5 cash prepaid incentive. I will discuss the effect that the incentive had on twelve different indicators of satisficing, with multiple measures for many of the indicators. I will also discuss whether the impact of the incentive varied depending on respondent ability (age and education), other indicators of motivation (conscientiousness), and incentive recall.
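
Two of the satisficing indicators mentioned above (non-differentiation and length of open-ended responses) can be computed as in the sketch below; the grid data, the zero-variance flag, and the variable names are invented for illustration and are not the study's operationalizations.

import numpy as np

grid = np.array([
    [3, 3, 3, 3, 3, 3],    # straightliner: zero variance across grid items
    [4, 2, 5, 1, 3, 4],    # differentiated responses
])
open_ended = ["", "The interviewer was polite and the questions were clear."]

nondifferentiation = grid.std(axis=1) == 0                 # flag identical answers
oe_length = np.array([len(ans.split()) for ans in open_ended])  # words per OEQ answer
print(nondifferentiation, oe_length)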

Methodological Briefs: New Technologies and Web Surveys

Encouraging Survey Response via Smartphones: Effects on Respondents’ Use of Mobile Devices and Survey Response Rates Morgan M., Washington State University ([email protected]); Don A. Dillman, Washington State University ([email protected])

Smartphone usage in the U.S. is increasing. As more people adopt smartphones and rely on them for Internet access, it is important for web surveyors to consider how they affect the survey response process. Recent experiments demonstrated that web surveys which combine postal mail contacts (including a token cash incentive) with follow-up email messages obtain higher response rates than surveys which use only mail or only email contacts. However, the effectiveness of added email contacts may change as more people access the Internet on their phones. Therefore, this study examines whether it is possible to build off of these techniques to drive respondents to complete web surveys via smartphones. Using a sample of college undergraduates, a population in which smartphone use is common, we designed three experimental treatment groups, one of which actively encouraged smartphone use. This research determines whether actively encouraging smartphone use produces a higher proportion of smartphone responses, and a higher overall response rate, compared to a standard web response treatment and a treatment which offers a choice of web or mail response. Paradata reveal that only 5.5% of all web respondents completed the survey using a smartphone. However, the smartphone treatment group obtained a significantly higher percentage of smartphone responses than the standard web group. Interestingly, the treatment offering a choice of web or paper response also received a larger proportion of smartphone responders than the web control group. Additionally, encouraging the smartphone option did not significantly increase the overall response rate. Data from the questionnaire suggest that smartphone users do not prefer to respond to surveys via their phones. This, in combination with the experimental results, indicates that much work is needed to increase response to mobile surveys.

Using SMS Text Messaging To Collect Time Use Data Philip Brenner, University of Michigan ([email protected]); John DeLamater, University of Wisconsin - Madison ([email protected])

Short Message Service (SMS) text messaging is a ubiquitous technology available on the vast majority of cellphones in use in 2012. Moreover, SMS is widely and regularly used, and provides a technological common denominator between mobile devices of nearly every make and model. This level of commonality may allow researchers an avenue to collect data without the expense and difficulty of designing specific applications for every cellphone on the market for the past few years. The Student Daily Life Survey used SMS text messaging as a method of data collection with a sample of students from a large, Midwestern university. The procedure adapted conventional time use procedures to fit the device, the sample, and the behavior of interest. After answering questions on a brief Web survey, students were asked to text researchers for five days, updating major changes in their activities. Following data collection, data from the text condition were compared to data from a conventional (Web) survey and from a reverse record check from campus recreation facilities to validate the behavior of interest: physical exercise and activity. Text respondents showed consistently higher quality data on self-reports of the behaviors of interest. Moreover, indicators of text data quality (e.g., number of text messages sent, number of late messages, number of days without messages) have predictive validity on the behavior of interest. In this methodological brief, we describe our procedures and the quality of these data, suggest improvements to the procedure, and give advice to other researchers interested in using this method.

Auto vs. Manual Login Today: Updating Early Research Scott D Crawford, Survey Sciences Group, LLC ([email protected]); Colleen McClain, Survey Sciences Group, LLC ([email protected]); John Dugan, Loyola University, Chicago ([email protected])

Past research on web survey login procedures produced conflicting and unclear results regarding their best implementation (Crawford et al., 2001; Heerwegh and Loosveldt, 2002; Heerwegh & Loosveldt, 2003). Advancements in email client software, computing platforms, and browsers, as well as changing user expectations, merit renewed research on various means of communicating and implementing survey login procedures.

This presentation will discuss the results of an experiment varying login type (manual versus automated), email format (plain text versus HTML), and link style (URL vs. “click here”) on a large national study of college students. Measures of data quality, including response rates, item missing data, break-offs, and others will be used to describe how these three implementation decisions may have an impact on the quality of the data collected within this population.

Using Text-to-Speech (TTS) for Audio-CASI Mick P. Couper, University of Michigan ([email protected]); Nicole Kirgis, University of Michigan ([email protected]); Sarrah Buageila, University of Michigan ([email protected]); Patricia Berglund, University of Michigan ([email protected])

For many years, the standard approach for producing sound files for audio computer-assisted self-interviewing (ACASI) has been to use recorded human voices. This process is relatively inefficient, especially if questions are changed. An early study (Couper, Singer, and Tourangeau, 2004) found computer-generated text-to-speech (TTS) voices equally effective at eliciting sensitive information over the telephone. In this paper we compare the effects of a recorded human voice versus a text-to-speech system using a quasi-experimental design. The National Survey of Family Growth (NSFG) adopted TTS for ACASI beginning in September 2011, using the TextSpeech Pro software from Digital Future. We will compare the reports of sensitive behavior and respondent reactions to and use of ACASI in the first quarter of the new cycle of NSFG (September-December 2011) with data from the previous cycle of NSFG (completed in June 2011). We will use the substantive data to compare response distributions, interview observation data to compare use of ACASI, and keystroke data to explore time differences in completion. If necessary, we will use matching techniques to control for sample composition differences between the two cycles. Our working hypothesis is that there will be no differences between recorded and computer-generated speech. If supported, this finding will advance the development of ACASI systems.

Designing an Instrument to Measure No-notice Emergency Evacuations: The Case of the Emergency Evacuation Response Survey Rene Bautista, NORC at the University of Chicago ([email protected]); Angela Fontes, NORC at the University of Chicago/Illinois State University ([email protected]); Joshua Auld, Argonne National Laboratory ([email protected]); Vadim Sokolov, Argonne National Laboratory ([email protected])

U.S. cities are increasingly making use of social surveys to better inform decisions on emergency preparedness planning. In this context, surveys are valuable tools to collect information on the readiness and behaviors of the population. Typically, surveys on emergency preparedness have been conducted using traditional modes of data collection (mail, telephone, face to face). Fewer surveys have exploited the strengths of Internet-based methods. While traditional survey methods allow researchers to better address issues of coverage error and nonresponse error, web surveys allow researchers to collect information using cost-effective tools, such as colors, symbols, and images, that may not be readily available in other modes (Couper 2000; Dillman, Smyth and Christian 2009). This paper describes the experience of designing and conducting an Internet-based survey to measure reactions to no-notice emergency evacuations. NORC at the University of Chicago collaborated with Argonne National Laboratory to improve the instrument used in the “Emergency Evacuation Response Survey” in the City of Chicago fielded in August 2011. Using novel methods such as Google Maps technology, the survey measured how people would react in unanticipated emergency scenarios. This self-administered survey represented a powerful instrument to visually present several randomly-generated hypothetical scenarios, which would not have been feasible in other modes of data collection. Methodological issues of survey design, limitations, and main substantive findings are discussed in the paper.

Matrix vs. Single Question Formats in Web Surveys: Results from a Large Scale Experiment Joop Hox, Department Methods & Statistics, Utrecht University ([email protected]); Thomas Klausch, Utrecht University ([email protected]); Edith de Leeuw, Utrecht University ([email protected])

A matrix question format (grid) has many advantages in traditional paper-and-pen survey design: more questions can be presented in a smaller space, thereby saving paper, printing, and mailing costs. With the advent of computer-assisted questionnaires and Internet research, these cost-saving arguments are no longer valid. Still, matrix questions remain a widely used tool in Web surveys as well (e.g., Couper, 2008).

There is some empirical evidence (e.g., Toepoel et al., 2005; Peytchev, 2006; Callegaro, 2010) that matrix questions lead to more satisficing and more dropout. However, presenting questions as single items on the screen leads to longer response times and may increase response burden. A promising alternative is the scrolling matrix, which shows one question of a matrix at a time with the same response options presented (GMI, 2009). Based on this principle, the horizontal scrolling matrix (HSM) was developed and tested. In the HSM, questions are presented one by one on the screen, while ease of answering is improved (e.g., after an answer is selected, the next question is quickly and automatically presented; there is no need for next or previous buttons), and respondents still have an overview of the number of questions in the matrix through a visual bar (an example of this format in Dutch is included).

In a randomized field experiment we tested whether the HSM question format improves data quality over the traditional matrix format. We used a 2 by 3 factorial design: the experimental factors were (1) question format: traditional matrix vs. HSM and (2) length of ‘matrix’: 5, 10, and 15 items per matrix. The results were positive: the HSM format led to fewer break-offs, better data quality, and less satisficing as indicated by well-known response styles, such as non-differentiation and extremeness. Finally, respondents evaluated the HSM format very positively.

Professional Respondents in Internet Panels: Who are They and What Do They Do to Our Data? Edith de Leeuw, Utrecht University ([email protected]); Suzette Matthijsse, Erasmus University ([email protected])

Online panels are at present one of the fastest growing data collection modes for market and opinion research (AAPOR report on online panels). With the use of online panels, concern grew about the recruitment of panel members, and especially the emergence of large nonprobability panels. The combination of self-selection and incentives, both important characteristics of non-probability based internet surveys, and the increased use of such panels have led to an increased fear of the emergence of ‘professional respondents’ (e.g. Comley, 2005) and of the negative consequences for data quality. So far, there are very few empirical studies on this topic.

The goal of this study is (1) to investigate whether ‘professional’ respondents can be distinguished in online panels using latent class analysis, (2) to describe who these professional respondents are and provide a demographic and psychographic profile, and (3) to investigate whether there is a difference in the quality of the data provided by these ‘professional’ respondents and the other respondents in the panel.

In our study, we analyzed a unique data set of the NOPVO (Nederlands Online Panel Vergelijkings Onderzoek; Vonk et al, 2006) that includes 19 large Dutch online panels, which together capture 90% of the respondents to online market research in the Netherlands.

A latent class analysis showed that four types of respondents can be distinguished, ranging from the ‘professional’ respondent to the ‘altruistic-voluntary’ respondent. Also, the different respondent types can be clearly characterized using demographic and psychographic variables.

The ultimate question is whether the existence of professional respondents is a threat to the data quality of internet panels. Do professional respondents satisfice more, do they take more short-cuts? Indeed, small differences in data quality can be detected between the groups. However, these differences disappear when controlling for socio-demographic variables.

New Frontiers: Smart Data Collection - Innovations in the Use of Smartphones

Disclosure and Quality of Answers in Text and Voice Interviews on iPhones Michael F. Schober, New School for Social Research ([email protected]); Frederick G. Conrad, University of Michigan ([email protected]); Christopher Antoun, University of Michigan ([email protected]); David Carroll, Parsons the New School for Design ([email protected]); Patrick Ehlen, AT&T Research ([email protected]); Stefanie Fail, New School for Social Research ([email protected]); Andrew L. Hupp, University of Michigan ([email protected]); Michael Johnston, AT&T Research ([email protected]); Courtney Kellner, New School for Social Research ([email protected]); Kelly F. Nichols, Parsons the New School for Design ([email protected]); Leif Percifield, Parsons the New School for Design ([email protected]); Lucas Vickers, Parsons the New School for Design ([email protected]); Huiying Yan, University of Michigan ([email protected]); Chan Zhang, University of Michigan ([email protected])

As people increasingly communicate via mobile multimodal devices like iPhones, new possibilities for survey data collection become available. The study reported here examines data quality, completion rates, and respondent satisfaction in four existing or plausible survey modes that work through native apps on the iPhone. The study contrasts whether the interviewing agent is a person or automated and whether the medium of communication is voice or text (SMS), allowing us to isolate effects of the agent and the medium. The resulting interview modes are telephone-human, telephone-automated (speech IVR), text-human, and text-automated. (We built the automated systems and interviewer user interface using customized server-side technology, and used the iPhone’s standard phone and messaging applications for the respondents’ user interface). Data quality is measured by evidence of satisficing (straightlining and rounding), reports of socially desirable and sensitive behaviors, and requests for clarification. Approximately 600 iPhone users (150 per mode) recruited as respondents from Craigslist with an iTunes store incentive are randomly assigned to one of the four modes (data collected in Spring 2012). These data allow us to explore issues such as (1) how the greater asynchrony of text relative to voice (different time pressure to respond, ability to communicate when convenient) affects respondents’ thoughtfulness, willingness to disclose, satisfaction and length of interview; (2) whether text, which presents different (possibly fewer?) cues of the interviewer’s humanness than voice, increases disclosure, or alternatively decreases disclosure because there is a persistent visual record (on the iPhone screen and in the cloud) of questions and answers which others might see; (3) whether the advantages of self-administration (increased disclosure) are observed in a mobile environment that may include other people and distractions; and (4) how the environment in which respondents find themselves (lighting, sound, multitasking, distractions) affects data quality.

Measurement and Methodological Challenges in Utilizing Passive Meter Technology on Smartphones Max Kilger, Experian Simmons ([email protected]); TraShawna Boals, Experian Simmons ([email protected])

The exponential growth in mobile devices such as smartphones and tablets has spurred interest in these devices for several key reasons. The first is their use as a new method of survey administration. A number of researchers are utilizing smartphones and tablets already owned by respondents as portable survey instrument delivery and data collection platforms. The second reason researchers are interested in these devices is that they can also be used as passive data collection devices. With mobile metering technology downloaded on their devices, respondents can be observed as they use their smartphones and tablets to surf the web, consume traditional media such as television shows, movies, and music, utilize voice and alternative forms of communication such as texting and picture messaging, and deploy geospatial data in their everyday lives. This is the focus of the paper.

One of the emerging methodologies being utilized to collect data on mobile device behaviors is passive metering technology. The idea here is that a mobile metering application is downloaded to the respondent’s smartphone or tablet and this application records a large number of events and behaviors related to the respondent’s use of their mobile device. While there are considerable advantages to the direct measurement of mobile behaviors using passive meter technology, there are also new methodological and measurement challenges to be met.

This paper presents some of the knowledge gained in the course of a small, three-month pilot test utilizing a passive meter to monitor mobile device behaviors by respondents on four different smartphone platforms. During the course of the discussion we examine some of the methodological, measurement, and logistical issues uncovered during the pilot study and discuss how some of these challenges were overcome, as well as provide the reader with some initial data on mobile device behaviors.

On the Run: In the Moment Smartphone Data Collection Jeff M. Scagnelli, Nielsen ([email protected]); Justin T. Bailey, Nielsen ([email protected]); Michael W. Link, Nielsen ([email protected]); Hala Makowska, Nielsen ([email protected]); Karen Benezra, Nielsen ([email protected])

The rise in on-the-go food & beverage consumption, particularly by Millennials (18-29 yrs.), has brought increased attention within the food & beverage industry. The ability to offer insights into this unique purchasing behavior beyond simply what was bought would be a great asset to the industry. For the research community, this offers an opportunity to investigate the usage behavior of an on-the-go consumer as data collection occurs.

Building on previous longitudinal repeated-measures research (Bailey et al., 2011; Link and Bailey, 2010), a pilot test was conducted in which respondents were provided an Android smartphone pre-loaded with an app-based mobile survey. The survey was launched by the users whenever they made immediately consumable purchases during the course of a month. The sample was comprised of about 275 Millennials (18-29 yrs.) in the southern California area. Questions included: where they were, what they purchased, and what the motivators for the purchase were. The survey incorporated barcode scanning, along with taking pictures of the products, to augment the survey data. The Android smartphone was provided as an incentive for the respondents during the test period, combined with a monetary incentive paid at completion. This paper will examine the response rates, user engagement, data quality, and user experience during the month-long pilot study. This research adds to previous work to better understand on-the-go mobile behavior and product consumption.

Time Use Data Collection Using Smartphones: Results of a Pilot Study Among Experienced and Inexperienced Users Annette C. Scherpenzeel, CentERdata, Tilburg University ([email protected]); Meike Morren, CentERdata, Tilburg University ([email protected]); Nathalie Sonck, The Netherlands Institute for Social Research ([email protected]); Henk Fernee, The Netherlands Institute for Social Research ([email protected])

Time Use Research (TUR) is usually carried out using questionnaires and diaries. Respondents record, for example at the end of the day, all their activities for one day in consecutive fixed time slots. Current technology, such as Smartphones and “apps”, allows TUR to be set up in a completely different way. Respondents carrying a Smartphone can enter their activities several times during a day. In addition, Smartphones make it possible to collect complementary data, such as the GPS location of the respondent at the time of the activity or photos and videos of the activity performed. CentERdata, a research institute associated with Tilburg University, and The Netherlands Institute for Social Research have jointly started to collect experimental time-use data using Smartphones. CentERdata is the operator of the LISS panel (Longitudinal Internet Studies for the Social sciences), an online panel based on a true probability sample of households. For the pilot study in November 2011, a special TUR app was developed by CentERdata. We were especially interested in the suitability of the app for this purpose, the effect of this method of data collection, and the influence of experience with Smartphones on the quality of the data. Therefore, we selected 50 participants who owned an Android Smartphone and 50 participants without a Smartphone. The latter participants were provided with an Android Smartphone by CentERdata, with the TUR app already installed on it. The participants owning a Smartphone could download the app.

The paper will present the willingness and capacity of respondents to participate in the Smartphone TUR study. Furthermore, we analyze differences in the response behavior between the Smartphone-owners and the respondents provided with a Smartphone for this study. Additionally, we look at the quality of the resulting data and the general feasibility of collecting TUR data in this way.

What is That Thing? Knowledge and Usage of Quick Response Codes Jonathan Mendelson, Fors Marsh Group ([email protected]); Matt Lackey, Fors Marsh Group ([email protected]); Scott Turner, Fors Marsh Group ([email protected])

As smartphone technology usage continues to increase, advertisers are taking advantage of the opportunity to reach consumers at their point-of-interest in a product, and connect them with more marketing messages about the product. Quick Response (QR) code technology is a relatively new tool that allows companies to embed digital information into a real-world product. Consumers scan the code with a smartphone and are connected to a website, delivered a message, or given contact information. QR codes have recently been displayed everywhere from business cards to billboards, magazines, and even on computers. With 35% of the adult population now owning a smartphone, it is important to understand who is aware of these codes and how they are using them.

In order to obtain baseline estimates of QR code understanding and usage among three populations of interest, researchers added a battery of questions on the topic to a long-running advertising tracking study. Respondents were shown an image of a QR code, asked whether they knew what it was, and depending on their response, asked how it would be used, whether they had used one in the past, and how they had used them. Open-ended follow-ups were used to determine the specific understanding and usage of respondents.

This paper will examine how QR code understanding and behavior varies across different demographic groups. Findings on social media habits from the same study will also be examined in the context of QR codes. Preliminary results indicate that a surprisingly small proportion of the population has actually used a QR code.

Questionnaire Design: Question Wording and Order Effects

Effects of Agree/Disagree Versus Construct-Specific Items on Reliability, Validity, and Interviewer-Respondent Interaction Jennifer Dykema, University of Wisconsin Survey Center ([email protected]); Nora Cate Schaeffer, University of Wisconsin Survey Center ([email protected]); Dana Garbarski, University of Wisconsin Department of Sociology ([email protected])

Recently, Saris et al. (2010) presented an extensive analysis that appears to support the concerns about the measurement properties of items that use statements followed by “agree-disagree” response categories (e.g., Fowler 1995). Saris et al. find that agree-disagree items have lower reliability and construct validity than construct-specific items. We further explore the characteristics of agree-disagree items with an experiment that compares two versions of five questions designed to measure political efficacy that have appeared in the General Social Survey. The agree-disagree questions present statements and ask the respondent to agree or disagree with the statements. The revised construct-specific questions identify an underlying response dimension (e.g., “how much influence”) for each item and provide a set of response categories using adverbs to express intensity (e.g., “a great deal”) or frequency (“extremely often”). Our analysis has two parts. We begin by exploring outcomes related to data quality using traditional methods that include describing systematic differences in the distributions and inter-item correlations, and examining the reliability and predictive validity of the items, using self-reported voting as the criterion, as well as associations with education. We then subject the questions to an extensive coding of their interviewer-respondent interactional properties by coding behaviors exhibited by interviewers and respondents that have been associated with data quality in standardized measurement. These behaviors include question-asking and probing by interviewers and question-answering, tokens, and response latencies by respondents. The two versions of the questions were administered in the Badger Poll in December 2010 to approximately 500 respondents in an RDD survey in Wisconsin. Fowler, F.J. (1995). “Improving Survey Questions: Design and Evaluation.” Applied Social Research Methods Series 38: 56-57. Saris, W., M. Revilla, et al. (2010). "Comparing Questions with Agree/Disagree Response Options to Questions with Item-Specific Response Options." Survey Research Methods 4(1): 61-79.

Question-Wording Effect: Bias or Conceptual Difference? Ward R. Kay, Adirondack Communications ([email protected])

In designing questionnaires, we strive to develop questions that are unbiased so that the responses are as close to the “truth” as possible. However, in attitude questions, there is a rich history of question-wording effects. Schuman and Presser (1981) demonstrated the wording effect of such differences as using “not allow” versus “prohibit”.

In conducting public opinion surveys, we often ask policy questions in a duality format (Are you for or against abortion?) which ignores the complexity of both the policy and how people perceive the issue. Elizabeth Adell Cook examined the General Social Survey questions on abortion and found that 9% were against abortion in all situations and 34% were in favor of abortion in all circumstances – which means that the majority of Americans are both against and for abortion depending on the circumstance (Cook, Jelen, Wilcox, 1992). In opinion research, the “truth” we are trying to measure is conditional and therefore impossible to determine in a single question.

This study examines data on attitudes about illegal immigration policy from a national telephone survey. A series of policy options was presented, ranging from extreme options such as putting illegal immigrants in jail to opening the borders to anyone who wants to come to the United States, with a variety of policies in between. The results show that the majority supports neither the most lenient nor the most punitive policies toward immigrants. However, differences in the moderating policies depend on the question wording. For example, a question that asks whether immigrants should be “required” to become citizens receives more support from more conservative respondents than a question about immigrants being “allowed” to become citizens.

The findings suggest that these differences should not be interpreted as the result of a “bias” in the question, but rather that each wording is measuring a different point on the spectrum of opinion.

Is President Obama Up or Down? The Impact of Question Wording and Universe Definition on Approval Ratings Clifford Young, Ipsos Public Affairs ([email protected]); Julia Clark, Ipsos Public Affairs ([email protected])

Presidential approval questions are a staple of public opinion polling. Such questions serve both as a signal to political actors about the relative strength of the President’s ability to push his agenda and as fairly good predictors of the party in power maintaining the White House. A few points up or down can make all the difference.

Currently, we (Ipsos Public Affairs) conduct monthly national dual-frame (landline and cell phone), RDD, live operator telephone surveys of the American adult population for Thomson Reuters, which include both policy-oriented and political questions, including Presidential approval. Since the beginning of the Obama administration, our Presidential approval numbers have been consistently higher than other polling firms’ numbers. For instance, our numbers over this time have been approximately four (4) points higher than the aggregate averages found on Pollster.com and RCP (Real Clear Politics).

What might account for this?

Our central hypothesis is that our question wording is different from the standard "Gallup" question wording used by most polling firms. In our case (and the case for a handful of other firms), we make "mixed opinion" an explicit option in the question wording and then follow it up with a "which way do you lean?" probe. In contrast, the standard "Gallup" question wording asks whether people explicitly approve or disapprove, leaving "mixed opinion" and "undecided" as volunteered response items. To test the effect of the differences in question wording, we will run a split ballot design on six (6) survey waves. This will give us 3,000 interviews for each question condition, a sufficient sample size to analyze differences in aggregate and by subgroup.

We will also test other hypotheses including variation by universe definition (all adults vs. registered voters vs. likely voters) as well as the effect of the cell phone frame on approval ratings.

Question Order Effects in Long Question Lists Jamie L. Marincic, Mathematica Policy Research ([email protected]); Martha Kovac, Mathematica Policy Research ([email protected]); Hong Zhang, Mathematica Policy Research ([email protected])

Responses to attitudinal survey questions are inextricably linked to the context in which the attitude was assessed (Schuman & Presser, 1981; Schwarz & Sudman, 1992). In the case of question order effects, the response to a question is influenced by the presence of a previous question—typically the question immediately preceding it. Schuman and Presser (1981) outlined and demonstrated three types of question order effects: part-part consistency effects, part-whole contrast effects, and part-whole consistency effects. Such effects are often studied using pairs of questions in which one question order is considered the control condition and the opposite question order is considered the treatment condition (Strack, 1992). When more than two related items appear in a long question list, additional question order effects may occur, including salience effects, which occur when the total context preceding a question influences that question’s response, and fatigue effects, which occur when questions appearing at the end of a long list are answered less thoughtfully than questions at the beginning of a long list.

The purpose of the current study was to examine part-part, salience, and fatigue effects in a survey of 1,500 health care professionals. The order of nine related attitudinal questions was rotated such that each question appeared in each of the nine positions exactly once. As a comparison, the order of thirteen related knowledge questions was similarly rotated. Inspection of the response distributions illustrates several phenomena including (1) the seemingly non-existent effect of context on knowledge items in contrast to the sometimes extreme effect of context on attitude questions, (2) the moderation of part-part effects with increased context, and (3) the absence of fatigue effects in both knowledge and attitudinal questions. Effects of context on analytic statistics will also be discussed.
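
The rotation described above amounts to a 9 x 9 Latin square over question positions. As an illustration only (the abstract does not say how the rotation was generated), a cyclic shift is one simple way to construct such a design; the sketch below, using hypothetical question labels, produces nine orderings in which every question occupies every serial position exactly once.

    # Hypothetical sketch: build nine orderings of nine questions by cyclic rotation,
    # so that each question appears in each serial position exactly once.
    # The rotation scheme is assumed; the study's actual scheme is not specified here.
    questions = [f"Q{i + 1}" for i in range(9)]

    def cyclic_rotations(items):
        n = len(items)
        return [[items[(start + pos) % n] for pos in range(n)] for start in range(n)]

    for version, order in enumerate(cyclic_rotations(questions), start=1):
        print(f"Version {version}: {order}")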

Question Order Effect: A Web Survey Experiment with Paradata Cong Ye, University of Maryland ([email protected]); Roger Tourangeau, University of Maryland ([email protected])

Bishop, Hippler, Schwarz, and Strack (1988) asked two parallel questions on imports in both orders in a telephone survey and found a large change in the answers between the two orders. However, the order effect was eliminated in mail surveys. A replication of this study by Sangster (1993) found the order effect in both modes. The different findings in the two studies are thought by some to be due to a difference in the study population. That is, the students used in Bishop et al. answered the surveys more as if taking a test, and were more likely to review and change answers.

We tried to find some new evidence about the question order effect by collecting paradata (use of “previous” button, change of answers, etc.) in a web survey. We expected that when respondents could easily change their answers, the apparent question order effect would be diminished. We varied the difficulty of reviewing and revising answers by placing the paired questions: 1) on the same screen; 2) on adjacent screens; 3) three screens apart with irrelevant questions in between; or 4) nine questions apart with irrelevant questions in between. These four conditions were tested in a web experiment.

Societal Change across a Generation: The General Social Survey at 40 (1972-2012) Tom W. Smith, NORC at the University of Chicago ([email protected]); Peter V. Marsden, Harvard University ([email protected]); James D. Wright, University of Central Florida ([email protected]); Mark Chaves, Duke University ([email protected]); Allan McCutcheon, University of Nebraska ([email protected])

The Impact of Survey Mode on Non-Response

Assessing the Mode-dependency of Survey Response and Non-response Bias Thomas Klausch, Utrecht University ([email protected]); Joop Hox, Utrecht University ([email protected]); Barry Schouten, Statistics Netherlands ([email protected])

Nonresponse bias depends on the mode of data collection, which is often taken as a reason to conduct mixed-mode surveys in the hope that they will balance the deficits of single modes. In this project we test whether the ‘mode-dependency’ assumption actually holds, not only with regard to nonresponse bias on socio-demographics, but also on survey target variables. The latter type is often hidden to analysts, but our design allows assessing the relationship to some degree.

We administered a two-wave random factorial design based on a national probability sample. In wave 1, units were randomly assigned to one of the major survey modes (CAPI, CATI, Mail, and Web, n=8800). The context of this survey was the Dutch Security Monitor. In wave 2, both respondents and nonrespondents were solicited again after 4-8 weeks, this time all by a much shorter CAPI survey, in which some of the wave 1 questions were repeated (e.g. security attitudes).

We present three comparisons that are possible due to this design. First, we use socio-demographics from national registries to assess mode differences in non-response in wave 1. Although all modes are biased, we find only small relative differences, mainly between the interviewer modes and non-interviewer modes and caused by differential under-coverage. Second, target variables surveyed during wave 2 are compared for wave 1 (non-)respondents. Again we find indications of nonresponse bias, which, however, are equal in size across all modes. Third, we take wave 2 (CAPI) as a sequential mixed-mode extension to wave 1 (Web or Mail) and consider joint sample bias, which is not improved compared to the single modes.

We conclude that non-response bias in the Security Monitor does not strongly depend on mode. Mixed-mode designs could not make substantial use of mode-dependent response behavior. Assessing mode-dependency before deciding on mixed-mode designs is therefore advisable in terms of accuracy and costs.

Are Multiple Modes Helpful? Balancing Reduction of Nonresponse and Sampling Error against Mode Effects Benjamin Phillips, Abt SRBI ([email protected]); Chase Harrison, Harvard Business School ([email protected]); Chintan Turakhia, Abt SRBI ([email protected])

Multimode studies are becoming increasingly common as researchers seek to maximize response rates and minimize nonresponse bias. Conducting surveys in multiple modes has the potential to reduce sampling error due to increased sample size and reduction in variance of weights, in addition to the potential reduction in nonresponse error. However, literature on the relationship between response rates and nonresponse error is mixed, suggesting that strategies to increase levels of response can increase bias in some cases. In particular, research suggests that self-administered survey questions may have different measurement properties than interviewer-administered questions. Thus strategies to increase response rates to self-administered surveys using telephone interviews may actually increase measurement error. We examine the trade-offs of using telephone interviews in a multimode design using data from the 2011 U.S. Competitiveness Survey conducted by Harvard Business School (HBS).

The U.S. Competitiveness Survey was designed as a census of HBS alumni (N=73,000), using a self-administered web survey as the primary means of data collection (N=50,000 with email addresses). A random sample was selected for intensive follow-up (n=4,000). In addition to receiving the email invitations and reminders, these sample members received a mailed invitation letter and telephone interviewing attempts. Mode effects are estimated using matching models for causal effects due to the nonignorable relationship between respondent characteristics and completion of a survey in telephone or self-administered modes. Potential reduction of bias of estimates due to different sample composition under a high response rate scenario is estimated net of estimated mode effects. These potential reductions in bias and increases in measurement error under a higher response rate scenario are combined with the reduction in sampling error due to larger sample size and lower variance of weights using mean squared error. The results of this investigation will help illuminate the trade-offs in survey error inherent in multimode designs.
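
The trade-off described above is commonly summarized with the mean squared error, MSE = bias^2 + variance, so that a design which lowers nonresponse bias but adds mode-effect bias and changes sampling variance can be compared to an alternative on a single scale. The toy calculation below uses made-up numbers, not estimates from the HBS alumni survey.

    # Illustrative only: made-up numbers, not HBS alumni survey estimates.
    # MSE combines squared bias with variance, so reduced nonresponse bias can be
    # offset by added measurement (mode) bias or by changes in sampling variance.
    def mse(bias, variance):
        return bias ** 2 + variance

    # Scenario A (web only): some nonresponse bias, no mode effect, smaller sample.
    mse_web_only = mse(bias=0.020, variance=0.00015)

    # Scenario B (web plus telephone follow-up): less nonresponse bias, added
    # mode-effect bias, larger effective sample and hence lower sampling variance.
    mse_mixed_mode = mse(bias=0.012 + 0.008, variance=0.00010)

    print(f"MSE, web only:   {mse_web_only:.6f}")
    print(f"MSE, mixed mode: {mse_mixed_mode:.6f}")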

Nonresponse and Mode Effects in a Two-wave Randomized Mode Experiment Scott Beach, University of Pittsburgh ([email protected]); Donald Musa, University of Pittsburgh ([email protected])

Mixed-mode surveys, which are becoming more common in the industry, are conducted as a way to reduce coverage and nonresponse errors. While many surveys offer mode choice, researchers conducting panel surveys also use different modes for different waves (interviewer-based first, then self-administered), or switch to a more cost-effective mode (e.g., web) after using an interviewer-based approach (e.g., phone). This paper will explore nonresponse and mode effects in an undergraduate student satisfaction survey experiment conducted at the University of Pittsburgh in 2007-2009. In anticipation of switching from a phone to a web survey, freshmen and sophomores from two cohorts (n=2,507) were randomly assigned to either telephone or web survey conditions in a two-wave panel survey. Approximately one-fourth of the students were assigned to each of four conditions: (1) FR phone, SO phone; (2) FR phone, SO web; (3) FR web, SO web; (4) FR web, SO phone. Students could be surveyed as sophomores even if they had not completed a freshman survey. Preliminary analyses of baseline (2007) data revealed both differential response rates by mode and mode effects on satisfaction ratings. The paper will present nonresponse patterns across the four conditions for the total sample and key sub-groups, along with wave 2 response propensity models using wave 1 and wave 2 mode assignment, response (vs. not) at wave 1, and demographic variables (from administrative data) as predictors. For those responding to both waves, mode effects on key student satisfaction indicators at wave 2 will be examined using a similar regression approach. To explore potential links between nonresponse and mode effects, wave 2 satisfaction ratings will be regressed on response (vs. not) at wave 1, wave 1 and wave 2 mode, and demographic variables. Results have implications for mode switches in panel surveys, particularly among college students.

Developing a New Mixed Mode Methodology for a Provincial Park Camper Survey in British Columbia Brian W. Dyck, Ministry of Environment, British Columbia ([email protected]); Phil Dearden, Department of Geography, University of Victoria ([email protected]); Rick Rollins, Department of Recreation and Tourism Management, Vancouver Island Univ ([email protected])

While face-to-face/paper surveys have provided high response rates for park visitor surveys for many years, little research has examined whether face-to-face/web surveys could be used for this population. We report on an effort to develop and test a face-to-face/web survey in 13 provincial park campgrounds located throughout British Columbia during the summer of 2010. This procedure involved conducting a short face-to-face interview (6 – 8 questions) with a random sample of park visitors and requesting potential respondents (i.e. those who use the internet) to complete a short web survey after they returned home from their trip. Potential respondents were given a postcard near the end of the interview that contained a picture on the front cover and a printed message along with the website and an access code on the back cover. In testing this approach, two experiments were conducted. The first experiment was designed to compare response rates, non-response bias and item non-response between a face-to-face/paper survey and a face-to-face/web survey. It involved conducting 801 interviews at Goldstream Provincial Park located near Victoria, British Columbia. The second experiment was designed to examine the effect of the number of follow-ups on response rates and non-response bias in a face-to-face/web survey. Each postcard contained one of four letters in the access code that determined the number of follow-ups (A=no follow-ups; B=one follow-up; C=two follow-ups; and D=three follow-ups). It involved conducting 3,704 interviews at 12 campgrounds. To help determine non-response bias, several questions on the interview form (e.g. location of residence, group size, age and gender of respondents) were similar in the paper and the web questionnaires. Preliminary findings indicate that internet non-use rates were lower than expected and that the average web response rate increased steadily from no to two follow-ups and then tapered off after the 2nd follow-up.

Influencing Mode Choice in a Mixed Mode Survey Geraldine Mooney, Mathematica Policy Research, Inc. ([email protected]); Flora Lan, National Science Foundation ([email protected]); Xiaojing Lin, Mathematica Policy Research, Inc. ([email protected]); Andrew Hurwitz, Mathematica Policy Research, Inc. ([email protected])

Current research on the value of offering alternative response modes is mixed; some studies suggest that choice improves response rates (e.g., Quigley, Riemer, Cruzen, and Rosen, 2000), while others (Millar and Dillman 2011) suggest that providing alternatives may lower response rates, especially if the only mode offered is web. In a recent presentation, Olson, Smyth and Wood (2010) explored whether or not allowing sample members to respond in their preferred mode increased response rates or the speed with which they responded. The results in their general population survey were also mixed. Those who preferred web responded more quickly than those who preferred other modes, but, in general, those who preferred web tended to have the lowest response rates overall. When offered first, most sample members responded by mail, regardless of their mode preference.

With an eye toward controlling data collection costs, this paper looks at the extent to which the survey design, despite offering several options, can influence mode choice without negatively impacting response rates. Using data from the incentive and mode choice experiment embedded in the 2008 National Survey of Recent College Graduates, we will look at the potential impact of using a differential incentive and mode choice to increase online survey completions and response rates. We will compare the performance of a differential incentive to that of an incentive that rewards all completed questionnaires equally, and the impact of offering mode choice throughout the survey to that of starting with only a single mode choice, web. In addition, we will compare the relative contributions of both procedures to increasing the proportion of web completes and response rates. Cost implications will also be discussed.

Weighting and Design Issues in Dual Frame Cell Phone/Landline Surveys

In Search of a Method: Model-Based Approach to Weighting Overlapping Dual Frame RDD Samples Paul Schroeder, Abt SRBI ([email protected]); Brian Meekins, BLS ([email protected]); Randolph Atkins, NHTSA ([email protected]); Mike Battaglia, Abt Associates ([email protected])

Despite the advent of overlapping dual frame (cell, landline) RDD telephone samples a number of years ago, there has yet to be a best practice defined for weighting such samples. Design-based approaches, which have been used extensively to weight dual frame samples, use a multistage procedure that can increase the variance of the estimates significantly. In recent research Benford et al. (2011) compare a propensity-based approach to their traditional weighting approach on two AP-GfK polls, showing some positive gains in the reduction of bias.

The current research compares the effect of a design-based and a model-based approach on the variance and bias of estimates from a nationally representative overlapping dual frame sample that was fielded recently for the US Department of Transportation. The 2011 National Survey of Speeding Attitudes and Behaviors (NSSAB) surveyed over 6,100 drivers in the US, including interviews with 4,600 respondents reached on a landline and 1,500 respondents reached on a cell phone who live in either cell-phone-only or cell-phone-mostly households. The survey instrument includes a variety of sensitive questions covering topics such as speeding behavior, aggressive driving, drinking and driving, and texting while driving. We model nonresponse using paradata and frame variables that are typically present on RDD surveys.
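
As a rough illustration of this model-based direction (the actual NSSAB model, predictors, and software are not described here), the sketch below fits a response-propensity model on hypothetical paradata and frame variables and converts the fitted propensities into nonresponse adjustment factors for respondents.

    # Hypothetical sketch of a propensity-style nonresponse adjustment; the variable
    # names, simulated data, and model are illustrative, not the NSSAB specification.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    n = 1000
    # Paradata and frame variables observed for respondents and nonrespondents alike.
    X = np.column_stack([
        rng.integers(1, 10, n),   # number of call attempts
        rng.integers(0, 2, n),    # frame indicator: 1 = cell, 0 = landline
        rng.integers(0, 2, n),    # directory-listed number on the frame
    ])
    responded = rng.integers(0, 2, n)  # 1 = completed interview (toy outcome)

    # Fit a response-propensity model; weight respondents by the inverse propensity.
    model = LogisticRegression().fit(X, responded)
    propensity = model.predict_proba(X)[:, 1]
    adjustment = np.where(responded == 1, 1.0 / propensity, 0.0)
    print("Mean adjustment factor among respondents:",
          round(float(adjustment[responded == 1].mean()), 2))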

Dual-Frame Weighting: Issues and Approaches for Incorporating an Undersampled Cell Phone Frame in a Dual-Frame Telephone Survey Elizabeth Ormson, NORC at the University of Chicago ([email protected])

In June 2011, the National Center for Health Statistics (NCHS) reported that the percentage of wireless-only households had grown to 29.7%, up from 13.6% in 2007 (Blumberg and Luke, 2011). This finding has substantial implications for any telephone survey designer who wants to interview a representative sample of the population and create unbiased estimates. Until recently, most telephone surveys were fielded using a random-digit-dial (RDD) sample drawn from working banks of only landline telephones.

A dual frame design including RDD samples of both landline and cell telephones requires careful consideration of weighting approaches that both reduce coverage bias and control variance. Cell telephone samples are more costly to field and therefore are typically undersampled compared to the landline telephone sample. This leads to larger weights in the cell telephone sample compared to weights associated with the landline telephone sample and a resultant increase in variance and design effect.
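
The variance inflation from such unequal weights is often approximated with Kish's design effect due to weighting, deff = n * sum(w^2) / (sum(w))^2, which equals 1 plus the squared coefficient of variation of the weights. The toy example below uses made-up weights, not NS-CSHCN weights, to show how a small, heavily weighted cell sample shrinks the effective sample size.

    # Toy illustration of Kish's approximate design effect from unequal weighting.
    # Weights are invented to mimic an undersampled cell frame; not NS-CSHCN weights.
    import numpy as np

    landline_weights = np.full(900, 1.0)  # 900 landline completes with modest weights
    cell_weights = np.full(100, 5.0)      # 100 cell completes carrying larger weights
    w = np.concatenate([landline_weights, cell_weights])

    n = w.size
    deff_w = n * np.sum(w ** 2) / np.sum(w) ** 2
    print(f"Design effect from weighting: {deff_w:.2f}")
    print(f"Effective sample size: {n / deff_w:.0f} of {n} interviews")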

This presentation will use the National Survey of Children with Special Health Care Needs (NS-CSHCN) as a basis for considering alternative weighting approaches for a combined landline and cell phone sample where the cell phone frame is underrepresented. The NS-CSHCN is sponsored by the Maternal and Child Health Bureau and conducted by NCHS. The most recent NS-CSHCN fielded a landline telephone RDD sample during six quarters (Q3/2009 – Q4/2010), supplemented with a cell telephone RDD sample fielded for two quarters (Q3/2010 – Q4/2010). Cases from the cell telephone sample represented 9% of all household completes. This dual frame design resulted in different sampling rates from the landline and cell telephone frames, and by state.

In this presentation, we discuss various approaches considered for weighting the NS-CSHCN sample and the resultant impact on bias, variance, and mean squared error of estimates at both the state and national level.

Allocation to Cell and Landline Frames for Various Dual Frame Telephone Survey Designs Burton Levine, RTI International ([email protected])

In this paper we present methodology for calculating the allocation of telephone respondents to cell phone and landline frames in an overlap dual frame survey design that results in the smallest variance estimates for a given cost. This methodology accommodates different methods of weighting the overlap respondents. To determine the optimal design, we enumerate all possible allocations with the same specified cost. Then we calculate the unequal weighting effect and the effective sample size for each design. The design with the largest effective sample size, for the given cost, is considered optimal. Finally, we present the relationship between the relative cost of obtaining landline and cell phone respondents and the optimal allocation of landline and cell phone respondents in a national design for various weighting strategies.
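
As a simplified illustration of this enumeration approach (the paper's actual cost model, composite weighting of overlap respondents, and variance formulas are not reproduced here), the sketch below tries each feasible split of a fixed budget between landline and cell completes, computes the effective sample size under a stylized screening-type weighting scheme, and keeps the allocation with the largest effective sample size. All costs and population shares are hypothetical.

    # Simplified, hypothetical allocation sketch: enumerate feasible landline/cell
    # splits for a fixed budget, compute the unequal weighting effect and effective
    # sample size for each, and keep the best. Costs, shares, and the screening-type
    # weighting scheme are made up; overlap (composite) weighting is not modeled.
    import numpy as np

    BUDGET = 100_000.0
    COST_LANDLINE, COST_CELL = 30.0, 80.0    # cost per completed interview
    SHARE_LANDLINE, SHARE_CELL = 0.70, 0.30  # population share each frame represents

    def effective_n(n_landline, n_cell):
        # Weight each respondent by its domain's population share divided by the
        # number of completes in that domain, then apply Kish's formula.
        w = np.concatenate([
            np.full(n_landline, SHARE_LANDLINE / n_landline),
            np.full(n_cell, SHARE_CELL / n_cell),
        ])
        uwe = w.size * np.sum(w ** 2) / np.sum(w) ** 2
        return w.size / uwe

    allocations = (
        (n_ll, int((BUDGET - n_ll * COST_LANDLINE) // COST_CELL))
        for n_ll in range(100, int(BUDGET // COST_LANDLINE), 10)
    )
    best = max(allocations,
               key=lambda a: effective_n(*a) if a[1] >= 100 else 0)
    print("Best (landline, cell) allocation:", best,
          "-> effective n =", round(effective_n(*best)))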

Sunday, May 20, 2012 8:30 a.m. - 10:00 a.m. Concurrent Session J

Addressing the Challenges of Surveying Hispanics

Using a Hispanic Surname List to Tailor Contacts in an RDD Telephone Survey Sherman Edwards, Westat ([email protected]); Sarah Dipko, Westat ([email protected]); Royce Park, UCLA Center for Health Policy Research ([email protected]); David Grant, UCLA Center for Health Policy Research ([email protected])

Among the challenges facing telephone surveys is obtaining responses from Latino households. Even with Spanish-language questionnaires and materials and Spanish-speaking interviewers, both coverage and response rates tend to be lower among Latinos than among non-Hispanic white people. In the 2011 California Health Interview Survey, an RDD survey which selects one random adult per household, the achieved sample of self-identified Latinos is proportionally much smaller than in the California population. There are many reasons for this difference; one might be that the first contact with a sampled telephone number is usually done by a monolingual (English-only) interviewer; only if a “language problem” results is the number moved to a bilingual interviewer.

For a majority of landline households, additional information is available from directory services, including the name and address of the subscriber. This information may be used to tailor the initial contact. In CHIS 2011, we conducted an experiment in which surnames matched with sampled landline numbers were classified as likely Hispanic or not. Half of the numbers associated with likely Hispanic surnames were assigned to bilingual interviewers for the initial contact; the other half were available to any interviewer. This paper will compare cooperation and response rates, the proportion of interviews completed in Spanish, and the proportion of respondents identifying themselves as Latino or Hispanic between the two samples. We will also compare the overall yield of individuals identifying as Hispanic and the proportion of interviews in Spanish with other parts of the sample: cell phone numbers, matched landline numbers with non-Hispanic surnames, and unmatched landline numbers. Finally, we will compare the relative costs of the two approaches for the Hispanic surname samples, considering the number of contacts required per completed interview, the interview length, and the “opportunity cost” of doing additional screening with the rarer resource (bilingual interviewers).

Speaking the Same Language: Effective Techniques for Reaching Spanish Speaking Households in a Mail Survey Andrew Zukerberg, National Center for Education Statistics ([email protected]); Saida Mamedova, American Institutes for Research ([email protected])

Reaching non-English speaking households is a challenge for many surveys, especially those conducted by mail. Unlike telephone surveys, where the interviewer can immediately identify a language problem and route the case to an interviewer who speaks the respondent’s language, a mail survey must identify ways to target the household prior to contact. As part of the transition from a telephone-administered to a mail self-administered design, the National Household Education Survey (NHES) has conducted a number of experiments to look at optimal ways to identify and reach Spanish-speaking households.

The issue of correct identification of the language spoken in the household is especially acute for the NHES, as it is a two-phase study, where sampled households are screened by mail with a simple household roster to determine the presence of eligible children. If eligible children are present, within household sampling is performed to select a reference child. The household is then sent a longer and more complex topical survey by mail. The screener is used to determine the language for the topical survey form.

In 2011, NHES undertook several experiments to explore ways to identify Spanish-speaking respondents through a mail screener questionnaire as part of a larger field test. This paper examines the results of experiments that compared a bilingual form, an English form, separate English and Spanish forms, and sending English and Spanish forms to households that had previously received only an English form. The experiments were done on three independent samples of households: (1) those in Census tracts identified as Linguistically Isolated areas, (2) households with Hispanic surnames on the sample frame, and (3) a nationally representative sample of households. This paper explores the optimal approaches identified in the 2011 NHES field test for reaching Spanish-speaking households by mail.

Critical Lessons for Training Bilingual Assessors on a Longitudinal Study Rebecca Weiner, Mathematica Policy Research ([email protected])

One-third of all children in the United States are born to unwed parents. Although many children of unwed couples flourish, research shows that, on average, compared with children growing up with their married biological parents, they are more socially vulnerable (McLanahan and Sandefur 1994; Amato 2005). The Administration for Children and Families (ACF), part of the U.S. Department of Health and Human Services, initiated the Building Strong Families (BSF) program to help interested and romantically involved unwed parents build stronger relationships, and thus enhance their child’s wellbeing and their own future. The BSF evaluation consisted of multiple waves of data collection. We are focusing on the home-based assessment which includes a measure of children’s receptive language, a self-regulation task, and a semi-structured play-based activity for mother-child and father-child dyads.

The proposed session focuses on training designed for field interviewers on the home-based assessment with special emphasis on bilingual interviewers. Hispanics comprise a significant portion of the BSF sample. However, we experienced challenges training bilingual interviewers using traditional training methods. Initially, our trainings combined English and Spanish speakers with extra time added for bilinguals to practice in Spanish. We revised the training to better serve bilingual interviewers, holding a separate bilingual training, conducting all practice sessions in Spanish, and providing Spanish-speaking families for certification sessions at the end of training.

The bilingual interviewers trained at this session accounted for 66 percent of the completed bilingual In-Homes, and they located nearly 50 percent of sample members who completed the 36-month survey in Spanish. These figures are even more impressive considering that this group was trained six months prior to the end of data collection. Our session will discuss lessons learned and make recommendations for best practices when training bilingual assessors for complex assessment tasks.

Quantitative Evaluation of Questionnaire Translation with Bilingual Speakers Sunghee Lee, University of Michigan ([email protected]); Julia Lee, University of Michigan ([email protected])

The number of persons reporting Hispanic origin grew from 35.3 million to 50.5 million, and Hispanics contributed more than half of the total population growth in the last decade. As nearly four out of ten Hispanics are classified as linguistically isolated, many surveys in the US have started offering Spanish as an interview language, making themselves multilingual surveys. In fact, the literature suggests that conducting surveys only in English will incur biases.

Questionnaire translation, a unique feature in multilingual survey operations, influences measurement properties. Because current practice of questionnaire translation often involves a team of translators and its assessment takes mostly qualitative approaches, it is difficult to understand the extent of measurement error due to translation.

This study attempts to evaluate the quality of questionnaire translation quantitatively by using data from the National Latino and Asian American Study that included a randomized experiment on interview language: bilingual English-Spanish speakers who spoke both languages equally proficiently were randomly assigned to either English or Spanish interview language. This provides a unique opportunity to evaluate translation as these two groups are comparable in all measures by design except for the interview language. Any differences between the groups are attributable to altered measurement properties due to translation.

Because response scales influence measurement errors apart from question wording itself, we will divide questions based on their response scales: yes/no scales or complex scales. Assuming that the yes/no response scale is unlikely to be affected by translation deficiency, differences in estimates for associated variables between the two experimental groups will be regarded as coming from inadequate question translation. For complex scales, we will select particular scales used for multiple questions and examine whether there is a common pattern in between-group differences, which will indirectly indicate differential measurement properties in response scales between interview languages.
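
For the yes/no items, the comparison described amounts to testing whether the response distribution differs between the two randomly assigned language groups. A minimal sketch, using made-up counts rather than NLAAS data, is a chi-square test on the two-by-two table of response by interview language.

    # Illustrative only: invented counts, not NLAAS results. For a yes/no item,
    # compare the response distribution between randomly assigned English and
    # Spanish interviews; adequate translation implies no systematic difference.
    from scipy.stats import chi2_contingency

    #          yes   no
    table = [[120,  80],   # assigned to the English interview
             [105,  95]]   # assigned to the Spanish interview

    chi2, p_value, dof, expected = chi2_contingency(table)
    print(f"chi2 = {chi2:.2f}, p = {p_value:.3f}")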

Survey Error of Hispanics From Sample Design, Language and Effort David Dutwin, Social Science Research Solutions ([email protected]); Mark Lopez, Pew Hispanic Center ([email protected]); Melissa Herrmann, SSRS ([email protected])

Hispanics are the largest and fastest growing minority population in the United States, and it naturally follows that survey research on Hispanics has become more widespread. As with any branch of survey research, there are a myriad of options regarding how the population is sampled and how the sample plan is operationalized. Typical survey designs include simple but expensive RDD, stratified RDD, “top markets” designs, and surname designs. To complicate matters, all but the last of these designs can include both landline and cell phone sampling. Every one of these designs covers a different percentage of the U.S. Hispanic population. As well, many of these designs sample certain Hispanics at different rates than other Hispanics. Additionally, how such designs are executed, particularly with regard to the allocation of bilingual interviewers, has a tremendous impact on the data acquired. And finally, there is the question of quality, for example, whether response rates, number of call attempts, and the amount of effort given to refusals and callbacks all have an impact on bias in survey estimates of Hispanics. The following paper explores these three foci (sampling, interviewing language, and interviewing effort) by utilizing a meta-analytic approach to the five most recent Pew Hispanic Center National Surveys of Latinos, 2004 to present. We report on what we find to be best practices, and the consequences of failing to enact these practices, as measured by bias in survey estimates of Hispanics.

Assessing Public Opinion on Social and Political Issues

Public Opinion on Gun Control Revisited: Collective Consensus or Unbridgeable Ideological Divide? Bryan C. Parkhurst, University of Nebraska - Lincoln ([email protected])

Though support for gun control has been fairly high over several decades, polling data in recent years has shown a noteworthy decline in public support for stricter gun control. At the same time researchers have only begun to get a handle on what drives public opinion on gun control. Using data from the NORC GSS, Celinska (2007) has developed a particularly useful measure of respondents’ ideology of individualism vs. collectivism for predicting both gun ownership and opinions on gun control. Her analysis clearly demonstrated that respondents falling toward the individualism end of the spectrum were significantly more likely to own firearms and to oppose their regulation by the government, whereas those inclined toward collectivism were noticeably more likely not to own firearms and to support gun control. Her analysis, which was based on GSS data from 1984 to 1998, may now be somewhat dated, however, in light of recent drops in public support for gun control, growth of gun purchases in reaction to the election of President Obama, and the recent Supreme Court decision on the applicability of the 2nd Amendment to local and state gun control laws. In this presentation, the author revisits American public opinion on gun control in this altered political climate and reexamines Celinska’s conclusions using updated NORC GSS data from the past decade. The findings suggest that consensus on this enduring issue remains illusory, despite contrary polling data reported by the Gallup Organization, among others.

A Multi-Method Approach to Polling Same-Sex Marriage: Experiments in Question Wording, Framing, and Implicit Attitudes David P. Redlawsk, Rutgers University ([email protected]); Ashley A. Koning, Rutgers University ([email protected])

We explore implicit versus explicit attitudes toward same-sex marriage through a multi-method survey experiment using two different modes of data collection. A New Jersey Rutgers-Eagleton Poll in Summer 2011, conducted soon after legalization in New York State, found a majority of adults supported same-sex marriage for the first time. It also found much less opposition than expected, with a high number of refusals and “don’t knows.” The puzzle is whether increased support and unexpectedly high non-opinions were artifacts of timing or represented real shifts in public opinion in New Jersey.

To examine this, we implemented list and question-wording experiments on a subsequent poll, investigating social desirability and framing effects. In the list experiment, we identify an intriguing “gender gap” where respondents implicitly appear to be more supportive of lesbian couples versus gay male couples. The question-wording experiment explicitly tested the phrasing of the issue – the standard, and perhaps polarizing, term of “gay marriage” versus the phrase more recently advanced by advocates: “marriage equality.” We find clear evidence that “marriage equality” noticeably increases support across most, but not all, subgroups. Republicans, along with those who are more educated and those who have a gay or lesbian friend or family member, are all but unaffected by this framing.
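For readers unfamiliar with list experiments, the core estimator is a simple difference in mean item counts between the treatment list (which includes the sensitive item) and the control list. A minimal sketch with invented data, not the authors' analysis:

```python
# Sketch of the standard list-experiment (item count) estimator; data are invented.
# Treatment respondents see the baseline items plus the sensitive item; control
# respondents see only the baseline items. The difference in mean counts estimates
# the proportion holding the sensitive attitude.
import numpy as np

control_counts = np.array([2, 1, 3, 2, 1, 2, 0, 3])    # items endorsed, baseline list
treatment_counts = np.array([3, 2, 3, 2, 2, 3, 1, 3])  # baseline list + sensitive item

estimate = treatment_counts.mean() - control_counts.mean()
se = np.sqrt(treatment_counts.var(ddof=1) / len(treatment_counts)
             + control_counts.var(ddof=1) / len(control_counts))
print(f"estimated support: {estimate:.2f} (SE {se:.2f})")
# A "gender gap" like the one described above can be probed by computing this estimate
# separately for list versions naming lesbian couples versus gay male couples.
```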

Party identification has effects that are consistent throughout the list and question-wording experiments. Both experiments show noteworthy effects for Democrats and Independents, but Republicans remain completely stable in their opposition, both implicitly and explicitly and under both versions of the frames.

To explore this finding in more depth, we test primes with an additional experiment using Amazon Mechanical Turk. This online survey targets traditional Republican ideals and values that may alter underlying motivations for Republican attitudes toward same-sex marriage by making reasons for a “conservative” case in favor of the issue more salient.

Examining the Growing Support for Same-Sex Marriage in California: What Factors Are Driving the Change? Sonja Petek, Public Policy Institute of California ([email protected]); Mark Baldassare, Public Policy Institute of California ([email protected])

The same-sex marriage issue may be heading to the U.S. Supreme Court because of legal challenges to California’s Proposition 8, the voter-approved citizens’ initiative that amended the state constitution to ban same-sex marriage. Meanwhile, a marked shift in attitudes has occurred among Californians since voters narrowly passed Proposition 8 (52% yes, 48% no) in 2008. In March 2010, 50 percent of adults for the first time expressed support for same-sex marriage in a survey by the Public Policy Institute of California (PPIC). Subsequent PPIC surveys reinforced this shift of opinion: 52 percent expressed support in September 2010 and 53 percent in September 2011. While Californians were divided on the issue as recently as March 2009, the margin of support is now at 11 points (53% favor, 42% oppose). This attitudinal shift is not unique to California. Two nationwide polls in 2011—one by Gallup and one by ABC News/Washington Post—showed majority support for same-sex marriage for the first time.

Using data collected through PPIC’s ongoing Statewide Survey series, this paper will examine the liberalizing of attitudes among Californians toward same-sex marriage. Through regression analysis, it will explore the demographic and political factors that predict support for same-sex marriage at several pivotal points in time dating back to 2000, including before and after the 2008 election. We will determine whether the attitudinal shift is occurring among particular groups or whether it represents an overall liberalizing of opinion across groups. To gain a deeper understanding of the forces driving the change, we will examine partisan groups by age, income, and education subgroups to the extent possible. Similarly, a unique aspect of the PPIC data is the ability to analyze Latino subgroups (e.g. age, education, income, immigrant status).

Demographic Determinants of Trends in Public Opinion about Abortion in the United States Jason Kevern, Northwestern University ([email protected]); Jeremy Freese, Northwestern University ([email protected])

Population demographic changes and longitudinal trends in public opinion are fundamentally intertwined. Using the General Social Survey, we consider the relationship between population demographic trends and trends in abortion attitudes. After an initial period of liberalizing attitudes, the United States population’s attitudes towards abortion have been mostly stable, or have even grown more conservative, over the past two decades. Cohort replacement explains most of this pattern, and our paper explores the role of differential fertility in how cohort replacement has shaped population support for abortion rights. Opponents of abortion rights maintain considerably higher fertility than their pro-choice counterparts, and abortion attitudes have elsewhere been shown to have a high parent-child correlation. We apply these two pieces of evidence to help explain the flattening of the upward trend toward pro-choice beliefs in GSS data from 1976-2010. We also find evidence that the fertility differential between pro-life and pro-choice individuals has grown, despite declining fertility for both groups, and it does not appear that the cohort trend in abortion attitudes can be explained by changes in conservatism or education over the same time period. We are currently working to quantify more precisely how strong the force of differential fertility is on abortion attitudes, and what kinds of population dynamics may amplify or attenuate its effects. This difference can itself be decomposed into two parts, depending on the mechanism of intergenerational transmission: change due to children sharing the sociodemographic characteristics of their parents, and change due to vertical cultural inheritance, in which political orientations and attitudes are transmitted more directly.

Exploring the Gender Gap in Public Opinion Toward Global Climate Change Marc D. Weiner, Bloustein Center for Survey Research, Rutgers University ([email protected]); Orin T. Puniello, Bloustein Center for Survey Research, Rutgers University ([email protected])

The growing public opinion literature on perceptions of global climate change largely focuses on how attitudes are affected by a variety of factors, including political orientation, education, media and scientific exposures, perceived and actual temperature increases, and question-wording experiments. Surprisingly underrepresented in that literature is the gender gap; using original data supplemented with secondary data, we find that gender plays a substantively and statistically significant role in predicting climate change perceptions and attitudes.

We use two data sources to evaluate the role of gender in climate change perception. First, we use a 2010 environmental risk perception and tolerance survey; this nationwide RDD survey, with an oversample of households within six 50-mile-radius regions surrounding nuclear energy production, research, and waste management facilities, permits us to explore the gender gap in both the general population and in environmentally-stressed areas. This stratification is essential to understanding the gender gap, since the environmental public opinion literature finds that the gender gap closes in environmentally-stressed areas. The second data source is a set of Pew Research Center for the People & the Press surveys fielded in 2008, 2009, and 2010. From these, we pool three years of national trend data concerning global climate change and statistically test for gender gap effects.

Global climate change presents a policy quagmire for the United States. While the research community is in general agreement as to the fact of global climate change, American political culture has yet to form a coherent climate change attitude; in turn, American political institutions have failed to develop a coherent climate change policy. To that end, our findings demonstrate the vital importance of assuring population-proportional gender representation among the policymaking groups developing national global climate change policy.

Case Studies of Address-Based Sampling Designs

Address Based Sampling for In-Person Interviews: A Case Study in Low Coverage Randal ZuWallack, ICF ([email protected]); Matthew Denker, ICF ([email protected]); Robynne Locke, ICF ([email protected]); William Robb, ICF ([email protected]); Paul Martino, ICF ([email protected])

The U.S. Postal Service’s (USPS) Computerized Delivery Sequence File (CDS) offers a cost-effective sampling frame for in-person surveys. Studies have shown that USPS CDS address-based frames provide high coverage of residential households, but there are pockets where coverage is low. We recently experienced low coverage in three sites selected for the Census Barriers, Attitudes and Motivators Survey. These three sites are census tracts located on Native American Reservations in Arizona, South Dakota and New Mexico. Two of the locations had very few city-style addresses in comparison to the number of Census 2000 households in the tract. To select households in these locations, we developed an interactive mapping application to list the geographic coordinates of all physical structures in the site. Interviewers were equipped with hand-held GPS devices and instructed to visit each listed location and attempt an interview.

The third site had a number of address listings, but many were nonexistent when visited by an interviewer. To build an address frame for this site, we contacted the Tribal Office, which reviewed the list and identified omissions and erroneously included addresses.

Our presentation is a case study of these three sites. We demonstrate the mapping application; present outcomes of the identified structures; and share feedback from the field interviewers. Using the frame constructed from visiting the Tribal Office, we compare the coverage of the interactive mapping application.

Methodological Findings from a Two-Phase Address Based Sample Fielded by Mail Jill M. Montaquila, Westat ([email protected]); J. Michael Brick, Westat ([email protected]); Kwang Kim, Westat ([email protected])

The National Household Education Surveys Program (NHES) has made the transition from a random digit dial (RDD) sample with computer assisted telephone interviewing (CATI) administration to an address-based sample (ABS) with mail as the primary data collection mode. This study requires two phases of data collection: a Screener to determine a household’s eligibility (determined by the presence of eligible children in the household), followed by a Topical survey administered in eligible households. In the fall of 2009, we conducted a pilot study for NHES to evaluate this two-phase ABS approach, with mail as the primary mode of collection. We followed up this pilot study with a very large-scale methodological field test in 2011, and we outline key findings from that field test here.

The NHES:2011 Field Test had several embedded experiments to identify effective approaches for eliciting cooperation. These included tests of different versions of the Screener; experimentation with switching the Screener version for the nonresponse mailings; tests of different envelopes for the Topical survey mailing; tests of different mailing services for the Screener and Topical survey mailings; tests of different levels of incentives for the Screener and Topical; tests of two different versions of the Topical questionnaires that varied mainly in design elements; and tests of sending an advance letter prior to the first Screener mailing and of including a magnet in the first Screener mailing. This Field Test also included extensive testing of various approaches aimed at increasing participation among Spanish-speaking households.

In this paper, we present overall findings from this Field Test and use results of the embedded experiments to discuss the implications for approaches to increase response in a two-phase survey setting.
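A general property of two-phase designs such as this one is that overall response compounds across phases, so the Screener and Topical rates multiply. A toy illustration in Python (the rates below are invented, not NHES results):

```python
# Illustration only; the rates are invented, not NHES findings.
screener_rr = 0.60      # share of sampled addresses returning a completed Screener
topical_rr = 0.75       # share of eligible, screened households completing the Topical
overall_rr = screener_rr * topical_rr
print(f"overall two-phase response rate: {overall_rr:.0%}")   # 45%
```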

The Use of Address Based Sampling to Target Households with Children John Boyle, Abt SRBI Inc. ([email protected]); Anna Fleeman, Abt SRBI Inc. ([email protected]); Andy Weiss, Abt SRBI Inc. ([email protected]); Patricia Vanderwolf, Abt SRBI Inc. ([email protected]); Ruvini Ratnayake, Abt SRBI Inc. ([email protected])

Over the last few years, researchers have become increasingly reliant on dual frame surveys to correct for the declining coverage of landline RDD surveys. Often the cell phone component is drawn from a cellular RDD frame; however, Address Based Sampling (ABS) provides an alternative source. To explore the comparative efficiency and bias of these two approaches in sampling cell phone households with children, Abt SRBI fielded an ABS supplement to a large national survey about child safety. More than 70,000 addresses were drawn, and phone numbers were appended where a match could be made. An invitation letter and two-page questionnaire were mailed to all addresses without matched phone numbers and to a subset of those with phone numbers. The language of the materials was tailored to encourage response only from households with children. We also asked about phone status and for contact information, including phone number(s), in case the household was selected to participate in an upcoming phone survey. We estimated the proportion of cell phone only (CPO), landline only, and dual user (DU) households within both the matched and unmatched samples, as well as the percentage of respondents providing a contact phone number by phone status and sample type. For households with children that provided a contact phone number, we administered a 40-minute CATI survey about child safety. To compare the CPO and DU households from the ABS frame to those reached via cell RDD, 5,000 cell RDD numbers were dialed and, if eligible, administered the same CATI survey. While the sampling and mail recruitment of respondents were similar to other ABS research, three aspects of this study are particularly novel: materials targeting households with children, mailing to households with a phone number match, and concurrent comparison of CPO and dual user households from ABS and cell RDD.

Mode Differences within an Address Based Sample Survey of the Washington Area Peyton M. Craighill, The Washington Post ([email protected]); Jon Cohen, The Washington Post ([email protected]); Scott Clement, The Washington Post ([email protected]); David Dutwin, SSRS ([email protected]); Eran Ben-Porath, SSRS ([email protected])

The Washington metro area poses a special challenge for coverage in representative telephone survey research. With a high proportion of cell phone only households and a relatively high transient population, traditional RDD phone sampling procedures are less effective at achieving high levels of coverage. In August 2011, The Washington Post with SSRS conducted its first Address Based Sample (ABS) survey of the D.C. area. This paper will explore mode differences embedded in the ABS data among respondents who completed the survey online, those who called in to complete it, and those reached by telephone call-out. The paper will compare attitudinal and demographic differences with previous D.C. area polls using RDD sample frames. After controlling for inherent demographic differences according to mode choice, we will evaluate the data quality, price and efficiency of this new mode relative to past RDD surveys and explore methods to maximize the effectiveness of both ABS and dual frame RDD designs.

The 2011 National Survey of Fishing, Hunting, and Wildlife-Associated Recreation (FHWAR) Cell Phone and Debit Card Test Study Logistics and Cost Analysis Elke McLaren, United States Census Bureau ([email protected]); Aniekan Okon, United States Census Bureau ([email protected]); Denise Pepe, United States Census Bureau ([email protected])

The 2011 FHWAR is an address-based sample survey with interviews conducted by Computer-Assisted Telephone Interviewing (CATI) and Computer-Assisted Personal Interviewing (CAPI). We conducted a Cell Phone and Debit Card Test to research alternative survey designs that could increase the number of CATI interviews, reduce the cost and number of CAPI interviews, and reduce the variance associated with conducting fewer CAPI interviews.

This test consisted of 1,411 households with no available telephone number that were randomly assigned to three panels:
• Advance Letter and Cell Phone – respondents received a cell phone for communication between the household and the Census Bureau
• Advance Letter with a $25 Debit Card Incentive – the PIN was provided upon completion of the interview
• Advance Letter Only – only letter wording stressing that CATI interviews save government dollars

Each panel received an advance letter that 1) emphasized the cost savings of conducting a telephone interview and 2) requested that a respondent call the telephone center to conduct an interview.

We compared the test data to the production CATI and CAPI data to evaluate whether any of the test options are viable. The results may provide an alternative option for future surveys to decrease interview costs while equalizing the differences between households with available telephone numbers and households without available telephone numbers.

At the 2011 AAPOR conference, the authors presented the design of the study. In 2012, the authors plan to present study logistics and cost benefit analysis results.

Comparing Data Collected Using Mobile Devices with Other Survey Modes

The Reliability and Validity of Alternative Customer Satisfaction Measurement Scales in PC Web and Mobile Web Environments Keith Chrzan, Maritz Research ([email protected]); Ted Saunders, Maritz Research ([email protected])

Increasing numbers of respondents opt to take their web-based surveys on their smartphones. This raises questions about sampling, about comparing results across modalities, and about questionnaire design. We focus on the latter two topics, seeking to discover valid, reliable ways of measuring customer satisfaction which, ideally, also yield comparable results across PC-web and mobile web respondents.

Using a multi-cell and test-retest research design, we examine four different customer satisfaction scales:
• A standard fully-anchored five-point unipolar rating scale
• The D-T (Delighted-Terrible) scale with seven scale points and two off-scale responses (Westbrook 1980)
• An 11-point percentage-of-satisfaction scale with points labeled from 0% to 100%
• A binary scale that may lend itself better to a mobile web environment
In different cells we show these scales in horizontal and vertical format.

Using this design we compare the scales in terms of:
• Completion rates
• Similarity of PC web and mobile web responses
• Construct validity
• Criterion validity (both the ability to be well-predicted by attributes/antecedents of satisfaction and the ability to predict consequents of satisfaction like intentional loyalty)
• Test-retest reliability
• Self-reported respondent experience
• Perceived and actual survey length
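A minimal sketch of two of these comparisons, completion rates by cell and test-retest reliability, using hypothetical file and column names rather than the authors' data:

```python
# Sketch (hypothetical data layout, not the authors' analysis) of two comparisons:
# completion rates by cell and test-retest reliability of each scale version.
import pandas as pd

df = pd.read_csv("satisfaction_cells.csv")   # one row per respondent per wave

# Completion rate by scale version and device type (PC web vs. mobile web)
completion = df.groupby(["scale_version", "device"])["completed"].mean()
print(completion)

# Test-retest reliability: correlate wave-1 and wave-2 ratings within each scale version
wide = df.pivot_table(index=["respondent_id", "scale_version"],
                      columns="wave", values="satisfaction").reset_index()
retest = wide.groupby("scale_version").apply(lambda g: g[1].corr(g[2]))
print(retest)
```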

A Direct Comparison of Mobile Versus Online Survey Modes Tom Wells, The Nielsen Company ([email protected]); Justin Bailey, The Nielsen Company ([email protected]); Michael W. Link, The Nielsen Company ([email protected])

Mobile phone surveys are gaining popularity; however, very little experimental research has been conducted on them. One rare exception is the work of Peytchev and Hill (2010), who embedded several experiments within a series of mobile surveys.

In this study, we build on and extend Peytchev and Hill’s work in several ways: 1) by utilizing a large, national sample of mobile phone users, 2) by including an additional set of question wording and question formatting experiments, and 3) by conducting a parallel set of online survey experiments. The main focus of this study is replicating the experimental mobile survey findings with web surveys taken on a PC. The study design will allow us to compare findings between smartphone and online survey administrations.

This study is based on 1,500 online panelists who are also smartphone users. Respondents were randomly assigned to one of four survey versions: mobile version A, mobile version B, online version A, or online version B. Surveys for each study group were 25 questions and approximately 10 minutes long. Experiments embedded within the surveys included the presentation of low- versus high-frequency scales and closed-ended versus open-ended “other” categories (similar to Peytchev and Hill), as well as manipulation of text box size and response option order.

Finally, we also analyze survey breakoffs within and across modes, but study this observationally, rather than by explicitly manipulating length of survey.

By conducting this research, we gain a better understanding of what does and doesn’t work with mobile surveys and how they can be optimized.

Matching Data Collection Method to Purpose: In the Moment Data Collection with Mobile Devices for Occasion-Based Analysis Edward Paul Johnson, Survey Sampling International ([email protected]); Carol Shea, Olivetree Research ([email protected])

Innovation requires a steady stream of new consumer insights based on the ever-changing world of the consumer. Historically, we have relied on traditional and ethnographic research techniques for these insights, which work well for products that are purchased with long-term goals and needs in mind. Products that fulfill short-term needs are more likely to change with a person’s mood or circumstance, so the occasion of the purchase is very significant. Surprisingly, the Hartman Group has found that occasion-based purchases account for 57% of all grocery store purchases. Occasion-based marketing gains new consumer insight by revealing the specific occasions tied to different consumer habits and needs. In our example, we explore snack consumption and snacks considered but not consumed. Unfortunately, traditional methods for surveying people rely on a respondent’s memory of their behaviors and needs at a particular time, which is notoriously faulty even in daily diaries. Luckily, new data collection techniques (mobile research) allow us to collect data in the moment of the occasion we are interested in, which matches the exact need of occasion-based marketing. In our presentation we examine snacking behavior for 400 panelists. These panelists were split into two “in the moment” (ITM) groups and two traditional end-of-day (EOD) diary groups. We also explore mode effects between SMS text and mobile web when collecting data in the moment, and between mobile devices and traditional online for daily diaries. The most important new results are the differences between collecting data in the moment and collecting it at the end of the day. We show it is essential to match the data collection to the purpose of the study and to today’s mobile usage habits; relying solely on mobile devices could yield faulty conclusions unless the respondent can update responses periodically during the study.

Effectiveness and Reliability of Student Response Systems (SRS) Devices for Evaluating an Adolescent Adventure Program Jack Fentress, Data Recognition Corporation ([email protected]); John J. Deyer, United States Air Force ([email protected]); Colleen Rasinowich, Data Recognition Corporation (DRC) ([email protected])

The use of SRS devices for collecting attitudinal data from adolescents in less formal venues has inherent appeal. Relative to traditional paper instruments, SRS devices are more synergistic with less formal venues, more engaging and, certainly, a familiar mode for adolescent populations. However, when multiple program evaluations are required, SRS devices may not be available or appropriate, and a paper survey alternative may be required. Electronic devices are familiar to this population, but there are usability challenges and data equivalence issues in mixed mode data collection applications. 150 6th-8th grade students participating in a National Guard Adventure Based Education program were randomly assigned to an SRS test group and asked to complete pre- and post-program evaluations using the Quizdom Q6 Remote electronic device. Using a formal evaluation protocol, an onsite team observed the SRS administration, conducted post-evaluation interviews and recorded their observations. An additional 982 students evaluated the same program using a paper survey and were used for the equivalence analysis. Results showed that students overwhelmingly liked and preferred the SRS device to a paper survey. While maintaining engagement was an issue with the SRS device, observational and analytic results suggest that the engagement threshold is similar to a paper administration. Overall, analytic results support equivalency in SRS and paper survey results. The study was conducted in the context of an adolescent outdoor adventure based education program using a specific SRS device. Results nonetheless have relevance for the use of alternative electronic devices for data collection in less traditional venues.

Do Surveys that are Completed on Mobile Devices Differ from Surveys Completed Online, Over the Phone, or Via Mail? Adam Gluck, Arbitron ([email protected])

Arbitron uses a panel-based methodology to collect radio listening data and produce media ratings in various markets around the country. The method for collecting these data is the Portable People Meter (PPM), a cell-phone-sized device that passively measures exposure to encoded audio in media. As each individual meter is carried by a unique panelist, we can associate the media that the PPM detects with the panelist who is wearing it, thus creating an electronic log of their listening. From that we can estimate who was listening to radio. After panelists leave a panel, we occasionally re-contact them to gather additional information via surveys.

During the summer of 2011, Arbitron conducted one such survey. Panelists were given the opportunity to choose to respond via a variety of methods, including:

• Self-administering a paper survey and mailing it back to us in a Business Reply envelope
• Completing the survey on the web after clicking on a link in an email
• Calling Arbitron and having an interviewer administer the survey
• Completing the survey on the web by utilizing a “Quick Code” that was shown on the mailed survey

The survey consisted of a few multiple choice questions, as well as several open-ended questions.

In this paper, the contribution of each mode of response to the overall response rate will be analyzed, as well as differences in the types of data collected by each mode. Additionally, item nonresponse for the various modes will be examined.

Explaining Public Attitudes about Science and Technology

Religious Beliefs, Knowledge about Science and Attitudes Towards Medical Genetics Nick Allum, University of Essex ([email protected]); Elissa Sibley, University of Essex ([email protected]); Patrick Sturgis, University of Southampton ([email protected]); Paul Stoneman, University of Southampton ([email protected])

The use of genetics in medical research is one of the most important avenues currently being explored to enhance human health. For some, the idea that we can intervene in the mechanisms of human existence at such a fundamental level can be at minimum worrying and at most repugnant. In particular, religious doctrines are likely to collide with the rapidly advancing capability for science to make such interventions. The key ingredient for acceptance of genetics, on the other hand, is prototypically assumed to be scientific literacy - familiarity and understanding of the critical facts and methods of science. However, this binary opposition between science and religion runs counter to what is often found in practice amongst the general public. In this paper, we examine the association between religiosity, science knowledge and attitudes to medical genetics amongst the British public, using a new probability-based face to face survey, the Wellcome Trust Monitor. In particular, we test the hypothesis that religion acts as a ‘perceptual filter’ through which citizens acquire and use scientific knowledge in the formation of attitudes towards medical genetics in various ways. Results indicate that strongly religious citizens become more concerned about negative impacts of genetic testing as they become more knowledgeable about science. The reverse is true for their less religious counterparts.

Disentangling Public Opinion of Nanotechnology: Exploring the Interactive effects of News Media, Values, and Information Processing on Opinion Formation Doo-Hun Choi, University of Wisconsin - Madison ([email protected]); Michael Cacciatore, University of Wisconsin - Madison ([email protected]); Dietram Scheufele, University of Wisconsin - Madison ([email protected]); Dominique Brossard, University of Wisconsin - Madison ([email protected]); Michael Xenos, University of Wisconsin - Madison ([email protected])

News audiences often differ in the information processing strategies they employ when working through mass media information (Eveland, 2002; Ho et al., 2011). Moreover, for many political and scientific issues the mass public has been found to rely on cognitive shortcuts or heuristics in order to help in opinion formation (e.g., Scheufele et al., 2008). Less well understood, however, is how news information processing strategies interact with value predispositions when forming judgments about science and technology issues.

Analyzing nationally representative online survey data (N = 1,155), this study explores the processing of political and scientific news based on respondents’ political ideology, levels of religiosity, and elaborative processing strategies. This study explores how these factors interact in determining public opinion toward a novel scientific issue: nanotechnology.

Our findings show that, overall, increased political news use results in largely negative impacts on public nanotechnology support. Conversely, increased science news use was found to increase public support for the emerging technology. However, more importantly, the influences of media use were found to differ based on one’s value predispositions (i.e., religiosity and ideology). Specifically, the negative impacts of increased political news use were found to be significantly stronger among conservatives. Similarly, the positive impacts of science news use were significantly weaker among the more religious respondents in our sample.

Finally, there was a significant three-way interaction between political news, ideology and elaborative processing: among liberals, an increased tendency to engage in elaborative processing attenuated the negative effects of increased political news use, while among conservatives, elaborative processing heightened the negative effects of political news. Taken together, the findings indicate that news audiences process information through a set of perceptual filters and a reliance on biased cognitive processes, which shapes overall opinions toward the science issue (Kunda, 1990).

The Racial Gap in Confidence in Science Eric Plutzer, Penn State University ([email protected])

African Americans, compared to whites, are starkly under-represented in scientific and technological professions, are especially reluctant to participate as research subjects, and express attitudes that are skeptical of science and scientific institutions. This paper seeks to explain the racial gap in confidence in science. We examine explanations suggested by research on human capital, inequality in educational opportunity, and culture. We conclude that differential returns to schooling account for about a third of the racial divide, with various cultural mechanisms accounting for most of the balance.

Weather or Not? Examining the Impact of Meteorological Conditions on Public Opinion Regarding Climate Change Christopher Paul Borick, Muhlenberg College ([email protected]); Barry G. Rabe, University of Michigan ([email protected])

During the past three years there has been considerable volatility in the views of the American public regarding the existence of global warming. Numerous public opinion polls have tracked a moderate decline in belief beginning in 2008 and a modest rebound in belief levels during the last year. Researchers have attributed this volatility to a variety of factors including economic conditions, increased political polarization, incidents involving climate scientists and variation in local temperatures. While previous studies have been able to link shifts in attitudes towards climate change to this array of factors, the explanations for the variation in opinion appear not to be fully developed. In this study we build upon the current literature on public opinion regarding climate change by examining the effect of meteorological conditions and events on belief in the existence of global warming. Using data from the National Surveys of American Public Opinion on Climate Change between 2008 and 2012 we analyze the relationship between factors such as snowfall, drought and hurricanes and individual perceptions of global warming. Among the preliminary findings from the research is that the likelihood of belief that the planet is warming appears to be affected by significant departures in snowfall and precipitation levels from forty-year averages.

New Developments in Cognitive Interviewing

Towards a Cultural Sociology of Survey Response Data and Measurement Valerie L. Chepp, National Center for Health Statistics ([email protected]); Caroline P. Gray, National Center for Health Statistics ([email protected])

It might appear that a culturally informed approach to doing sociology would have little to say about survey methods, but this is not the case. In fact, a cultural sociological approach may be exactly what is needed in order to better understand the data produced through survey measurement. The direct way that cultural sociology can have an impact on how survey research is conducted and how survey data is interpreted is through the method known as cognitive interviewing. Though not commonly acknowledged or employed in the academy, cognitive interviewing is a widely practiced way to “pre-test” survey questions in non-academic, survey research settings, including across U.S. federal statistical agencies. This presentation will explore new developments in cognitive sociology, a sub-field of cultural sociology, which has empirically demonstrated various ways culture shapes human thought processes. Such culturally informed insights have much to offer the method of cognitive interviewing—which historically developed from theories of cognitive psychology and survey methodology—yet cultural approaches to human cognition are largely absent from discussions of the method. Specifically, a cultural sociological approach to survey response data and measurement can offer a more sophisticated understanding of how respondents: (1) perceive survey questions, (2) retrieve cognitive information when responding to survey questions, and (3) make decisions about how to answer survey questions. Moreover, culturally attuned approaches to cognitive interviewing can account for the ways in which respondents’ social identities inform the question-response process. To date, insights from cultural sociology, generally housed in academic settings, are lacking from cognitive interviewing techniques and analysis, typically practiced in applied research settings. In this presentation we advocate for more cross-dialogue between academic sociology and applied sociology, and we suggest that cognitive interviewing is an especially fruitful site where this cross-pollination of theory and practice can take place.

Measuring Environmental Barriers as a Source of Disablement: Lessons Learned from Cognitive Interviewing Heather Ridolfo, National Center for Health Statistics ([email protected]); Valerie Chepp, National Center for Health Statistics ([email protected]); Dynesha Brooks, National Center for Health Statistics ([email protected])

Over the last three decades disability researchers and advocates have increasingly recognized the importance of environmental factors (i.e., barriers in the built environment and social barriers, such as stigma and discrimination) in the disablement process. However, measurement of these barriers has been slow to develop. Current measures are limited in scope, and studies that use these measures to examine the impact of such barriers on disability outcomes have produced mixed findings. Central to this issue is the quality of existing measures. While some measures have been evaluated using quantitative analyses, such as statistical tests of validity and reliability, qualitative evaluations of these measures appear less prevalent and publicly unavailable. Yet information gained through qualitative evaluations such as cognitive interviewing provides unique insight into factors that affect the validity of data produced from these measures, insight which cannot be gained through quantitative evaluations. Using data from 30 cognitive interviews, we identify common problems with measuring environmental barriers that are likely to affect the validity of data produced by most measures of environmental barriers. These include impaired individuals’ conceptualization of a barrier, which variously informed how respondents answered survey questions. Some impaired persons lacked knowledge about barriers, and others attributed difficulties they experienced to their own impairment rather than to the nature of their environment. Impaired individuals’ experiences with environmental barriers were also situational and context-dependent. Additionally, some impaired individuals did not report the need for accommodations due to lack of perceived efficacy in attaining them and/or because they found ways to accommodate these barriers on their own. Finally, when accommodations were provided, they often only partially improved accessibility. In this presentation we will provide an overview of this research and discuss the implications of these findings for the use of current measures and the development of new measures.

Another Use for Cognitive Interviews: Understanding Inconsistencies in Survey Data HarmoniJoie Noel, National Center for Health Statistics ([email protected])

Cognitive interviews are commonly used in the survey research world as a method to pretest survey questions before they go into the field. Cognitive interviews can show how respondents go through the response process when answering survey questions. Specifically, they can show how respondents understand a question and identify potential sources of response error that may suggest changes to the question. As a qualitative method, cognitive interviews can also be used in a mixed-method framework to help explain inconsistencies in survey data that cannot be explained with numbers alone. Inconsistencies in parental reports of their children’s behavioral health conditions were found between the 2009-2010 National Survey of Children with Special Health Care Needs (NS-CSHCN) and the follow-up 2011 Survey of Pathways to Diagnosis and Services (Pathways), in which the same parents were asked the same questions at two time points. The nature and frequency of these inconsistencies led the survey sponsors to wonder whether there was something else going on in the response process that contributed to them. Twenty-four cognitive interviews were conducted, and the analyses show that the complexity of the experiences parents are asked to report on can lead to several response patterns with the potential for shifting answers. Cognitive interviews could be applied in this mixed-method framework to help understand inconsistencies in other survey data as well.

Cognitive Interviews without the Cognitive Interviewer? Jennifer Edgar, Bureau of Labor Statistics ([email protected])

Cognitive interviewing is traditionally an in-person pretesting method. The interaction between the interviewer and participant allows for in-depth probing, letting the researcher use spontaneous probes designed to elicit explanations of the participant’s thought processes. Although scripted probes are often used, it is common for interviewers to ask additional probes based on the participant’s comments.

Taking advantage of technology, however, it may be possible to meet the goals of cognitive interviewing using an unmoderated format, in which participants think aloud as they respond to a carefully determined sequence of directions and questions without researcher intervention. If researchers can collect information about a participant’s thought processes without going through the time-intensive process of recruiting participants and conducting individual interviews, perhaps they would be willing to accept slightly less detailed information from participants than is gained through spontaneous probing and face-to-face interaction.

This paper looks at a study that used an unmoderated approach to pretest a survey question. Fifty participants were audio recorded while completing online tasks during which they were asked to think aloud. To serve as a comparison, 20 participants were interviewed using standard, face-to-face cognitive interviewing methods. The results from the two approaches are compared on the amount and depth of information obtained about participants’ thought processes. Additionally, the resources required for each mode, including total study costs, participant payment, and staff recruiting time, will be estimated.

Design, Development and Evaluation of a Sexual Identity Question for the NHIS John Michael Ryan, National Center for Health Statistics ([email protected]); Kristen Miller, National Center for Health Statistics ([email protected])

This paper describes research to develop and evaluate a sexual identity question for the National Health Interview Survey (NHIS). Development and evaluation of the question are based on findings from cognitive testing studies conducted by the Questionnaire Design Research Laboratory (QDRL) at the National Center for Health Statistics, specifically, 8 different testing projects consisting of a total of 386 in-depth cognitive interviews. Additionally, data from the 2002 and 2006 National Survey of Family Growth (NSFG) were examined to further investigate findings from past cognitive interviewing studies. This paper first defines the construct to be measured and then outlines known question design problems with existing sexual identity measures. It then presents a revised version of the question, the rationale for the new design and cognitive interviewing results. In March 2012, the question will be included in an NHIS field test, and results of this field test will be presented. Findings from the field test will show whether or not the redesign improves estimates of the lesbian, gay and bisexual population.

Questionnaire Design: Experiments on Response Options and Format

Is More Better?: 4 vs. 6 Response Options Patricia M Gallagher, University of Massachusetts - Boston ([email protected]); Carol Cosenza, University of Massachusetts - Boston ([email protected]); Stephanie Lloyd, University of Massachusetts - Boston ([email protected])

There is evidence that adult humans can hold 3 to 5 distinct pieces of information in working memory. This limit may create challenges to data quality in telephone and mixed-mode surveys that require respondents to consider 6 response options. Providing more response options, however, can be useful because it generally increases response variation. Our research assesses the impact of 4- and 6-point frequency response scales on survey reliability, with the goal of providing methodological guidance regarding which scale to use and how many patient responses are needed per clinician to provide reliable estimates of care at the clinician level. The 6-point scale adds “Almost never” and “Almost always” to the 4 response categories of Never, Sometimes, Usually, Always. The survey instrument, the Consumer Assessment of Healthcare Providers and Systems Clinician and Group Patient Centered Medical Home (CAHPS® C&G PCMH), was developed to measure patients’ experiences in ambulatory settings. A sample of adult patients (n=5,800) from a university-based health system was randomized to either a 4- or a 6-response-option instrument. For this mode study, a random subset of both study arms was assigned to telephone-first administration and the rest to a mail-first protocol. Once data collection in the assigned mode is complete, non-responders are exposed to the other mode of administration. The survey was funded by the Agency for Healthcare Research and Quality (AHRQ). Using data from the initial mode of contact only, we will examine the psychometric properties of the 2 scales by mode of administration. We will also estimate the number of patient responses required to provide reliable clinician-level estimates for each version of the instrument. The project is currently in the field with an interim response rate of 53% (AAPOR RR1). Data collection will be completed in December.
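One standard way to translate a reliability target into a required number of patient responses per clinician is the Spearman-Brown projection based on the clinician-level intraclass correlation (ICC). The sketch below uses an invented ICC purely for illustration and is not drawn from this study:

```python
# Sketch of the Spearman-Brown projection for "how many patient responses per clinician?"
# The ICC value here is invented for illustration, not a study result.
def responses_needed(icc: float, target_reliability: float) -> float:
    """Number of patient responses per clinician needed to reach a target
    reliability for the clinician-level mean, given the single-response ICC."""
    return target_reliability * (1 - icc) / (icc * (1 - target_reliability))

icc = 0.05                            # hypothetical clinician-level ICC for one composite
print(responses_needed(icc, 0.70))    # roughly 44 responses per clinician
```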

Ordering Your Attention: Response Order Effects in Parallel Phone and Online Surveys Frances M. Barlas, ICF International ([email protected]); Randall K. Thomas, ICF International ([email protected])

Response order effects occur when respondents select responses based on the order in which the responses are presented. For attitudinal measures, primacy effects of response selection have often been obtained when scales are presented visually (e.g. web or paper), while recency effects have often been found for scales presented orally (e.g. telephone). A variety of explanations for response order effects have been proffered, including attentional factors, memory limitations, satisficing, reading or listening patterns of processing, education, and age. In this study, we compared response order effects for over 140 different measures that were fielded in parallel web-based and telephone RDD studies over an eight-year period. These measures varied in content and number of response categories. We found that response order effects were dependent on the nature of the scale and the nature of the mode. Intensity measures (importance, usefulness) showed the weakest order effects, while affective measures (good-bad, like-dislike) showed the strongest order effects. The web-based version was most likely to show a primacy effect while the telephone version was most likely to show a recency effect. Number of response categories only weakly interacted with mode of presentation.

Differences in Vague Quantifier Interpretation: Influences On and Detection by Latent Variable Models Jamie L. Marincic, Mathematica Policy Research ([email protected])

Surveys collecting behavioral frequency information request either numerically or vaguely quantified frequency reports. Regardless of the nature of the behavioral frequency report, researchers are often interested in placing respondents on a latent behavioral frequency continuum based on responses to a set of related items. To the extent that numerically and vaguely quantified reports capture the same frequency information, measurement models based on either should not differ; however, whereas numerically quantified reports reflect absolute behavioral frequency information, the literature has demonstrated that vaguely quantified frequency reports reflect relative or conditional behavioral frequency information. For example, researchers have found that the numeric definition of vague quantifiers is influenced by respondent characteristics such as level of engagement in the behavior (Goocher, 1965), attitude toward the behavior (Goocher, 1965), and demographic characteristics (Schaeffer, 1991). Therefore, measurement models based on numerically and vaguely quantified frequency reports may differ in important ways.

Using data from an experiment embedded in the 2006 National Survey of Student Engagement, the purpose of the current study is to determine whether the structure of the substantive factor(s) extracted from numerically quantified reports differs from the structure extracted from vaguely quantified reports. Furthermore, the study examines whether factor mixture models applied to only the vaguely quantified reports successfully reproduce the factor structure(s) extracted from the numerically quantified reports. Finally, the study addresses whether latent variable models may be used to detect differential interpretation by identifying a continuous latent interpretation factor (via a random intercept item factor model) or categorical latent interpretation classes (via a factor mixture model). The purpose of these analyses is to determine whether a substantive researcher may use any of the available latent variable models to extract a methodological artifact--differential interpretation--from measurement models intended to be purely substantive (Leite & Cooper, 2010).
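As a simplified illustration of the first comparison (the study's factor mixture and random-intercept item factor models require specialized software), one could fit a one-factor model separately to the numerically and vaguely quantified item sets and compare loadings. The column names below are hypothetical, not NSSE variables:

```python
# Sketch (not the study's models): fit a one-factor model to each report format and
# compare loadings. File and column names are hypothetical.
import pandas as pd
from sklearn.decomposition import FactorAnalysis

df = pd.read_csv("nsse_experiment.csv")
numeric_items = ["asked_questions_n", "discussed_ideas_n", "worked_peers_n"]
vague_items = ["asked_questions_v", "discussed_ideas_v", "worked_peers_v"]

for label, items in [("numeric", numeric_items), ("vague", vague_items)]:
    fa = FactorAnalysis(n_components=1).fit(df[items])
    print(label, fa.components_.round(2))   # loadings of each item on the single factor
```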

Testing Questions on a Large-scale Schools Omnibus Panel for the Fifth Wave of the UK Millennium Cohort Study Kate Smith, Centre for Longitudinal Studies, Institute of Education ([email protected]); Lucinda Platt, Centre for Longitudinal Studies, Institute of Education ([email protected])

When designing questions that require temporal frequencies there can be a trade-off between specificity and simplicity. More specific questions can offer better purchase on variation between respondents, but responses to simpler questions may prove more stable and, additionally, result in better rates of completion. This may be especially true when respondents are children.

To ensure optimal design of questions, it is standard practice on most large-scale surveys to cognitively test new or adapted questions and to carry out field pilots to evaluate data collection instruments. While such question testing is beneficial, it is often relatively small scale and lacks the sample sizes sufficient to quantitatively evaluate questions or test different versions of questions.

This paper reports on large-scale testing of response categories for questions in the child self-completion instrument to be used in the UK Millennium Cohort Study (MCS). The MCS is following over 19,000 children born in the UK in 2000/1. There have been four waves of the study at 9 months, 3 years, 5 years and 7 years. The fifth wave starts in January 2012 when the study children will be aged 11. As part of the development work for this wave, two tests were carried out with around 3,500 10-15 year olds in a schools omnibus panel.

The first test involved asking children the same bullying frequency question on two separate occasions and comparing responses obtained using numeric response categories with those using descriptive response categories. The second test involved randomly allocating children to different versions of a question about alcohol use and comparing the responses obtained using a pre-coded frequency list with an open write-in question. Initial analysis suggests that pre-coded lists ease response and produce higher quality data. The paper discusses the results and their implications for child questionnaire design.
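A minimal sketch of how the repeated-administration comparison in the first test could be quantified, using a quadratic-weighted kappa as the agreement measure; the data layout and column names are assumptions, not the MCS design:

```python
# Sketch (hypothetical data layout, not the MCS analysis): compare the stability of the
# bullying-frequency question across the two administrations for each response format.
import pandas as pd
from sklearn.metrics import cohen_kappa_score

df = pd.read_csv("omnibus_test.csv")   # one row per child, both administrations recorded

for fmt in ["numeric_categories", "descriptive_categories"]:
    sub = df[df["format"] == fmt]
    kappa = cohen_kappa_score(sub["bullying_t1"], sub["bullying_t2"], weights="quadratic")
    print(fmt, round(kappa, 2))        # higher kappa = more stable responses
```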

Sunday, May 20, 2012 10:15 a.m. - 11:45 a.m. Concurrent Session K

Addressing the Challenges of Longitudinal Surveys

All Participants Being Unequal: A Bias Analysis of Three Contemporary Strategies for Locating Longitudinal Study Participants after an Extended Hiatus Celeste Stone, American Institutes for Research ([email protected]); Jenny Bandyk, University of Michigan, Survey Research Center ([email protected]); Sandy Eyster, American Institutes for Research ([email protected]); Christopher Bradley, American Institutes for Research ([email protected]); Susan Lapham, American Institutes for Research ([email protected])

In 2011, Project Talent researchers began a pilot test to assess the feasibility of finding and reengaging study participants who had not been contacted in 37 to 51 years. Developed by the American Institutes for Research (AIR), Project Talent is a nationally representative, longitudinal study that began in 1960 by measuring the knowledge, abilities, personality, interests, and demographics of approximately 440,000 9th-12th grade students across the United States. The study conducted three waves of follow-up surveys before going on hiatus in 1974. A subsample of 4,879 participants was selected for the 2011-12 pilot test to assess the feasibility of future follow-up activities. For this pilot test, AIR, in collaboration with the University of Michigan’s Survey Research Center, employed three broadly-categorized methods to track sampled participants, who are currently 65-70 years old: 1) batch searches of administrative data, 2) outreach activities, including attendance at 50th high school reunions, and 3) intensive internet, database, and phone tracking of participants not located using methods (1) or (2). Sampled participants were classified as located, not located, or located but ineligible (e.g., deceased; incapacitated).

Using base year demographic, cognitive, and personality data, this paper focuses on one source of sample attrition bias (i.e., un-located sample members) by comparing people who can be tracked to those who cannot. We also compare the relative sample biases of the three locating strategies to determine which combination of strategies is most effective for locating an unbiased sample. We anticipate that multivariate analyses will reveal that, even after controlling for demographics, personality differences such as impulsivity and sociability are related to tracking success. These findings may offer more insight into the types of people who are missed above and beyond their demographic characteristics, and provide guidance for how best to locate, and later persuade, underrepresented participants to continue their involvement in longitudinal studies.
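A minimal sketch of this kind of located-versus-not-located comparison, using standardized mean differences on baseline measures; the file and variable names are hypothetical, not Project Talent's:

```python
# Sketch (hypothetical variable names, not Project Talent code): compare located and
# un-located sample members on 1960 baseline measures via standardized mean differences.
import numpy as np
import pandas as pd

df = pd.read_csv("pilot_sample.csv")   # includes a 'located' flag and baseline scores

def smd(located: pd.Series, unlocated: pd.Series) -> float:
    """Standardized mean difference with a pooled standard deviation."""
    pooled_sd = np.sqrt((located.var(ddof=1) + unlocated.var(ddof=1)) / 2)
    return (located.mean() - unlocated.mean()) / pooled_sd

for var in ["cognitive_score", "impulsivity", "sociability", "ses_index"]:
    print(var, round(smd(df.loc[df.located == 1, var], df.loc[df.located == 0, var]), 2))
```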

Modeling the Confounds of Divorce and Attrition in a 20-year Panel Study: Chickens or Eggs? Veronica Roth, The Pennsylvania State University ([email protected])

Assessing the impact of attrition on estimated models is always a concern in longitudinal data analysis. Two specific problems stem from the possible confounds: estimates of key variables over the duration of the study may no longer be representative of the population, and the key variables may affect the rates or types of attrition. Divorce and attrition, which have been found to share unmeasured correlates that are not appropriately fixed by weighting (Hill, 1997), present a rich example of the utility of hazard models in studying this problem. I used data from the Marital Instability over the Life Course (MILC) study (Booth et al., 2000), which focused on marital instability and dissolution. The clustered RDD sample of married persons under age 55 began in 1980 with 2,033 respondents and concluded 20 years later. To assess the impact of attrition on predicting divorce rates, I modeled the hazards of divorce in this sample and compared them to national estimates of marital longevity. Next, I examined whether divorce affects the type of attrition (refusal, could not locate), as divorced persons are more likely to move. Using a competing risks analysis, I looked at the differences among those who did not attrite, those who refused, and those who could not be located, with an emphasis on married and divorced groups. Understanding the ways these two processes impact one another can help researchers improve procedures for retaining panel survey respondents and provide better estimates of factors impacting divorce.

Booth, Alan, David Johnson, Lynne K. White, and John N. Edwards. 2000. Marital Instability over the Life Course [Computer file]. ICPSR03812. Ann Arbor, MI: Inter-university Consortium for Political and Social Research [distributor], 2011-09-01. doi:10.3886/ICPSR03812.
Hill, Daniel H. 1997. “Adjusting for Attrition in Event-History Analysis.” Sociological Methodology 27: 393-416.
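One common way to operationalize this kind of analysis is a discrete-time competing-risks model fit as a multinomial logit on person-wave records. The sketch below assumes a hypothetical data layout and variable names and is not the author's code:

```python
# Sketch, not the author's models: discrete-time competing risks for attrition type,
# with one record per respondent per wave at risk. Data layout and names are assumptions.
import pandas as pd
import statsmodels.formula.api as smf

pw = pd.read_csv("milc_person_waves.csv")
# Outcome at each wave: 0 = retained, 1 = refused, 2 = could not be located
pw["outcome_code"] = pw["outcome"].map({"retained": 0, "refused": 1, "not_located": 2})

model = smf.mnlogit(
    "outcome_code ~ divorced + moved_since_last_wave + age + C(wave)",
    data=pw).fit()
print(model.summary())   # separate coefficients for refusal vs. non-location risks
```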

Parents’ Participation in a Two-Generation Longitudinal Health Study Amy Lucas, University of North Carolina at Chapel Hill ([email protected]); Judith A. Seltzer, University of California, Los Angeles ([email protected]); Kathleen Mullan Harris, University of North Carolina at Chapel Hill ([email protected])

Intergenerational health research requires data from both children and their biological parents. Parents know more about family health histories, and biological data from parents and children provide objective measures needed for health research. Most large-scale surveys and biomedical studies use a single respondent to describe the health of others in the family. Study designs that interview two generations are difficult to field. Cross-sectional efforts that asked one member of a parent-child pair for the other’s contact information have been relatively unsuccessful. Refusal rates are high, and those who do provide contact information have higher-quality relationships. This selectivity hampers research on the social mechanisms contributing to intergenerational correlations of health. We use unique data from Add Health (National Longitudinal Study of Adolescent Health) to investigate demographic and relationship-quality correlates of a biological parent’s participation in the survey at Wave I (1994) when the Add Health respondent was a teenager in the parent’s home. We then investigate what factors affect the Add Health respondent’s willingness to report a parent’s contact information 14 years later at Wave IV (2008) for a potential follow-up parent interview. Native-born white parents with higher SES, and those with whom children report a close relationship, were more likely to participate at Wave I. Over 95% of respondents whose biological parent (mother or father) was interviewed in Wave I provided contact information for that parent at Wave IV. If a biological parent did not participate in Wave I, young-adult respondents were more likely to provide their mother’s contact information (68%) than their father’s (40%), among those with living parents. Among those without a Wave I parent interview, young-adult respondents’ demographic characteristics and relationship quality were more important predictors of providing contact information for fathers than for mothers. The paper also examines the quality and completeness of the contact information.

Predicting Retention in a National Longitudinal Study of Health and Well-Being Barry Radler, University of Wisconsin Institute on Aging ([email protected])

This presentation uses data from MIDUS (Midlife in the United States), a national study of Americans (N = 7,108), to investigate factors that predict longitudinal retention. With its extensive age range (25-75 at Time 1) and long-term design (9- to 10-year survey interval), MIDUS is useful for investigating common sociodemographic and health predictors of continuing participation. Logistic regression analyses were performed on baseline sociodemographic and health variables predicting retention. Select interaction terms examined the interplay between targeted variables. Consistent with prior research, higher retention rates were found among Whites, females, and married individuals as well as those with better health and more education. Interaction analyses further clarified that (a) health status better predicted retention among older compared to younger respondents and among women compared to men, (b) marital status better predicted retention among Whites compared to non-Whites and among women compared to men, and (c) economic status better predicted retention among those with poorer functional health status. The analyses clarify that longitudinal retention varied depending on respondents’ sociodemographic characteristics and their health status. The unique contribution of this research is that factors predicting nonparticipation can be offset by, or compensated for by, other factors.
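
A minimal sketch of this kind of retention model is shown below, using statsmodels' formula interface with interaction terms mirroring the three interactions reported above. The variable and file names are assumptions for illustration, not MIDUS variable names.

```python
# Illustrative sketch only: baseline logistic regression of retention with interactions.
import pandas as pd
import statsmodels.formula.api as smf

midus = pd.read_csv("midus_baseline.csv")  # hypothetical file with a 0/1 `retained` flag

formula = (
    "retained ~ age + female + white + married + education + self_rated_health"
    " + self_rated_health:age + self_rated_health:female"    # (a) health x age, health x sex
    " + married:white + married:female"                       # (b) marital status x race, x sex
    " + household_income:functional_limitations"              # (c) economic x functional health
)
model = smf.logit(formula, data=midus).fit()
print(model.summary())
```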

“I Still Don’t Know”: Non-substantive Responses in Longitudinal Data Rebekah Young, The Pennsylvania State University ([email protected]); David R. Johnson, The Pennsylvania State University ([email protected])

Survey respondents are occasionally unable to generate the type of response researchers hope to record. Respondents may, instead, offer non-substantive answers such as saying they are unsure, cannot recall, or don’t know. Understanding these responses is fundamental for optimal survey design and for identifying sensible methods to handle these responses in data analysis. Unfortunately, most research about non-substantive responses has been limited to cross-sectional data analysis.

In this paper we use two waves of the National Survey of Midlife in the United States (n = 3,487) to explore “Don’t know” (DK) responses over time. We answer the following four research questions: (1) Do DK responses early in a survey predict DK responses occurring later in the same survey? (2) Does DK response propensity in the first survey predict DK response propensity in a follow-up survey? (3) Do people repeatedly say DK to the same question? (4) Can DK responses in the first survey predict which respondents will drop out of the second survey?

We find that respondents increased their use of DK responses within a single survey, suggesting that some DK responses are the result of survey satisficing. DK response propensity was fairly consistent between surveys, evidence that these responses could also reflect a stable personality trait. Forty percent of respondents who said DK to a question in the first survey said DK again when asked the same question 9-10 years later, providing a compelling case that some respondents genuinely do not know. Finally, DK responses during the first survey were highly predictive of panel attrition, even when controlling for other characteristics known to predict survey drop-out. Our findings suggest that including a DK or ‘no opinion’ option in a survey design may be worthwhile since identifying likely future study dropouts is an important step in improving panel data response rates.
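
The two headline quantities in this abstract, the repeat-DK rate and the link between wave-1 DK propensity and attrition, can be sketched as follows. The column names (dk_w1_* / dk_w2_* item flags, completed_w2, dropped_out) are hypothetical and stand in for whatever item-level DK indicators the analysis file would contain.

```python
# Illustrative sketch only: repeated DK responses and DK propensity as an attrition predictor.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("midus_dk_flags.csv")  # one row per respondent; dk_w1_* / dk_w2_* are 0/1 flags

w1_items = [c for c in df.columns if c.startswith("dk_w1_")]
w2_items = [c.replace("dk_w1_", "dk_w2_") for c in w1_items]

# Share of wave-1 DK answers repeated on the same item at wave 2 (among wave-2 completers).
completers = df[df["completed_w2"] == 1]
w1_dk = completers[w1_items].to_numpy()
w2_dk = completers[w2_items].to_numpy()
repeat_rate = (w1_dk * w2_dk).sum() / w1_dk.sum()
print(f"Wave-1 DK answers repeated at wave 2: {repeat_rate:.1%}")

# Wave-1 DK propensity as a predictor of dropping out before wave 2.
df["dk_count_w1"] = df[w1_items].sum(axis=1)
attrition_model = smf.logit("dropped_out ~ dk_count_w1 + age + education + female", data=df).fit()
print(attrition_model.summary())
```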

Cross-National Studies of Muslim Public Opinion

The Arab Spring: Roots of the Popular Uprisings Meryem Ay, University of Nebraska - Lincoln, Gallup Research Center ([email protected]); Tarek Baghal, UNL- Gallup Research Center ([email protected])

The past year has seen popular uprisings in the Arab world against long-entrenched leaders. Among these countries, Tunisia, Egypt and Libya have overthrown their presidents, with continuing unrest in Syria, Yemen, and Bahrain. Because the main actors of the revolutions are the citizens of these countries, public opinion has been important in starting and shaping the uprisings, as well as holding importance in structuring the future of these countries. The ongoing revolutions are often discussed as the consequence of the public’s perceptions of economic hardship, deficient civil rights, suffering, and perceived government corruption. To understand the precursors and potential reasons for these uprisings, it is essential to examine public attitudes on the pre-revolutionary economy, government, and civil liberties as potentially important causal factors.

Using Gallup World Poll data, this study examines Tunisian, Egyptian, Bahraini, Yemeni, and Syrian pre-revolutionary attitudes towards the social and economic conditions as well as popular attitudes toward the leadership of these nations prior to the revolutions. Libya will be excluded due to the absence of data. Initial analyses indicate that, in general, these Arab populations view their governments as corrupt, with majorities saying there is corruption in the government in Egypt (79%), Syria (65%), Tunisia (55%), and Yemen (84%). Only in Bahrain (56%) did the majority state there is not corruption in government. However, only in Egypt (61%) and Yemen (74%) did majorities say the economy was not good in the area in which the respondent lived. Additional analyses will examine attitudes regarding government efficacy, personal economy, and perceived freedoms, as well as the differences across population groups. This will include sectarian differences, which have been important in Bahrain and, to a lesser extent, Syria, and rural-urban differences, which have been prevalent in nations like Syria.

The Fighting Factions within the “Clash of Civilizations”: An Examination of the Latent Classes of Conflict Lauren A. Walton, University of Nebraska - Lincoln ([email protected]); Brian M. Wells, University of Nebraska - Lincoln ([email protected])

Much discussion and media consideration have been given to the idea of major differences between Western and Islamic civilizations since the 9/11 al Qaeda attacks. Samuel Huntington’s “Clash of Civilizations” hypothesis (1993a, 1993b, 1996) received substantial attention as people attempted to understand conflict in the world around them, seeing it as an explanation for such differences between cultures. Huntington’s hypothesis contends that division and conflict in the post-Cold War world will occur between the world’s civilizations, including Western and Islamic civilization, due to distinct differences in culture. Using the Gallup World Poll data set, a probability-based multinational survey, this study examines the clash of civilizations hypothesis utilizing direct measures of popular perceptions of both potential and occurring conflict between the Western and Islamic worlds. Walton et al. (2011a, 2011b) suggest that Western and Islamic civilizations generally have the same beliefs regarding the avoidance of civilizational conflict, but that there are not two clearly defined groups as to the inevitability of conflict. Using a simultaneous latent class model (SLCM), this paper examines how there are many distinct groups within civilizations, each with its own point of view on the avoidance of conflict. Preliminary analysis suggests that a common five-class model fits within both Western and Islamic civilizations; approximately 70% of Western respondents and 60% of Islamic respondents follow the pattern suggested by Huntington’s hypothesis. This analysis will provide a more detailed picture of the nuances of perceptions regarding the “clashing” civilizations utilizing public opinion data.
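
For readers unfamiliar with latent class estimation, the sketch below shows a basic EM algorithm for a latent class model with binary indicators. It is a simplification, not the simultaneous (multi-group) model used in the paper, and the item matrix and class count are hypothetical.

```python
# Illustrative sketch only: two-step EM for a latent class model with binary items.
import numpy as np

def fit_lca(X, n_classes=5, n_iter=200, seed=0):
    """EM for a latent class model; X is an (n_respondents x n_items) 0/1 matrix."""
    rng = np.random.default_rng(seed)
    n, j = X.shape
    pi = np.full(n_classes, 1.0 / n_classes)                # class sizes
    theta = rng.uniform(0.25, 0.75, size=(n_classes, j))    # P(item = 1 | class)
    for _ in range(n_iter):
        # E-step: class responsibilities for each respondent (log scale for stability).
        log_post = X @ np.log(theta).T + (1 - X) @ np.log(1 - theta).T + np.log(pi)
        log_post -= log_post.max(axis=1, keepdims=True)
        resp = np.exp(log_post)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: update class sizes and conditional item probabilities.
        pi = resp.mean(axis=0)
        theta = (resp.T @ X) / resp.sum(axis=0)[:, None]
        theta = theta.clip(1e-6, 1 - 1e-6)
    return pi, theta, resp

# Hypothetical 0/1 conflict-perception items for one civilization group.
X_west = np.random.default_rng(1).integers(0, 2, size=(500, 6)).astype(float)
pi, theta, resp = fit_lca(X_west, n_classes=5)
print("estimated class sizes:", np.round(pi, 2))
```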

Factors Shaping the Politics of American Muslims Jessica Hamar Martinez, Pew Forum on Religion & Public Life ([email protected]); Greg Smith, Pew Forum on Religion & Public Life ([email protected])

Muslims constitute a small but nonetheless key part of the religious, social and political fabric of the U.S. But since they make up a comparatively small proportion of the overall population and are thus represented by only small numbers in most national surveys, relatively little is known about the political attitudes and behavior of this group, and even less is known about the factors that serve to shape Muslims’ political predispositions. In this paper, therefore, we seek both to describe and to understand the basic political attitudes and electoral behavior of American Muslims. We use data from two waves of a unique study of American Muslims, conducted by the Pew Research Center in 2007 (N=1,050) and 2011 (N=1,033). Drawing on these data, our analysis covers a wide range of attitudes, including foreign affairs, social issues and church/state topics, economic issues, and the translation of views on these issues into political preferences. The paper focuses specifically on understanding the importance of three kinds of traits – feelings of persecution, religious beliefs and behaviors, and demographic characteristics – in shaping Muslims’ political views. Our analysis points to a few key findings. First, feelings of persecution and experiences of discrimination are closely linked with negative views of U.S. foreign policy among American Muslims. Second, we show that their foreign policy views are the most important predictors of Muslims’ partisanship and voting decisions. Third, with respect to views on social issues, religious predictors operate among Muslims much as they do among non-Muslim religious groups in the U.S., with more religiously committed Muslims taking more conservative issue positions than their comparatively less devout counterparts. Fourth and finally, we find that demographic characteristics, on the whole, do not explain much about Muslims’ political views.

Love Thy Neighbor and Zakat: Religiosity and Positive Social Engagement in the Western and Islamic Worlds Nicholas Ruther, University of Nebraska - Lincoln, Survey Research and Methodology ([email protected]); Amanda Libman, University of Nebraska - Lincoln, Survey Research and Methodology ([email protected]); Allan McCutcheon, University of Nebraska - Lincoln, Survey Research and Methodology ([email protected])

One of the major points of conflict in the twenty-first century is the often tension-fraught interaction between the Western and Muslim worlds. Much of that tension is focused around the religious differences between the mostly Christian West and Islam. Both of these major world religions, however, value and encourage positive social engagement among their adherents, suggesting commonalities between the two worlds that might bridge that divide. Previous work in the United States has indicated that religion plays an important role in motivating individuals to work towards bettering their communities (Wilson and Janoski, 1995).

This paper uses data from the Gallup World Poll, a probability-based multinational survey, to compare the effects of religiosity on social engagement among the populations of a selection of Western and Muslim states. This study assesses the influence of the reported importance of religion in daily life and attendance at religious services on civic involvement, looking at volunteerism and charitable giving across the United States, Germany, Saudi Arabia, and Pakistan. Religion is consistently rated as important in Muslim countries (95.2% Saudi Arabia, 96.5% Pakistan) versus markedly lower rates in the West (65.4% US, 40.4% Germany). However, individuals report higher levels of volunteerism in the United States (39.0%) and Germany (28.4%), with lower rates in Saudi Arabia (10.0%) and Pakistan (8.0%).

Preliminary analysis indicates a similar relationship between volunteerism and both reported importance of religion and attendance of religious services in the US and Germany. Conversely, there is a difference between the associations of those measures with volunteering in the Muslim-majority countries. Further analysis will explore the effects of religiosity on volunteerism, charity, and other areas of civic and community engagement as well as differences among religious denominations within countries.

South Sudan: Voices from an Emerging Democracy Brian M. Kirchhoff, D3 Systems, Inc. ([email protected])

The Republic of South Sudan is the newest country in the world, gaining independence from The Republic of the Sudan on July 9, 2011. In order to better understand how the people of South Sudan view the issues facing their new nation, D3 Systems will field a South Sudan survey in November 2011.

With independence comes a host of new challenges and decisions that South Sudan will need to address. D3 Systems’ survey of South Sudan will measure public opinion as it relates to the most important issues facing this new country. This paper analyzes and presents the survey results. The research topics include political stability, hydrocarbon policy, development of an independent oil infrastructure, delivery of services and resources to a largely rural population, the HIV/AIDS epidemic, regional drought and famine, the regional spread of terrorism and a perennially contentious relationship with Sudan. In addition to improving understanding of the aforementioned issues, the survey will also capture key demographic information and will include a wide array of questions that measure media penetration and usage.

Due to the low penetration of phones and internet throughout the country, the survey will be conducted via face-to-face interviewing. The sample will consist of 5 key cities across South Sudan, with a representative sample of the 18+ population by city, gender and age group. The five cities are Juba (250 interviews), Malkal (225), Rumbek (225), Yambio (150) and Wau (150). This will result in a total sample size of 1,000 interviews. Respondents will be selected using a multi-stage random method, from PSU selection (from a list of sampling points proportional to population), to household selection (random route), to respondent selection (next birthday or Kish grid, as appropriate to South Sudan).
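
The within-household selection step named here can be illustrated with a short sketch of the next-birthday method and a simplified Kish-style random draw. This is not D3 Systems' field procedure; household records and the simplified ordering rule are assumptions for illustration (a real Kish grid uses preassigned selection tables).

```python
# Illustrative sketch only: within-household respondent selection.
import random
from datetime import date

def next_birthday_selection(household_members, today=None):
    """Pick the eligible adult whose birthday comes next after today (ignores Feb 29 edge case)."""
    today = today or date.today()
    def days_until_birthday(member):
        bday = member["birthday"].replace(year=today.year)
        if bday < today:
            bday = bday.replace(year=today.year + 1)
        return (bday - today).days
    adults = [m for m in household_members if m["age"] >= 18]
    return min(adults, key=days_until_birthday)

def kish_style_selection(household_members, rng=random):
    """Simplified Kish-style selection: order eligible adults (males oldest first,
    then females oldest first) and draw one at random."""
    adults = [m for m in household_members if m["age"] >= 18]
    ordered = sorted(adults, key=lambda m: (m["sex"] != "male", -m["age"]))
    return ordered[rng.randrange(len(ordered))]

household = [
    {"name": "A", "age": 44, "sex": "male", "birthday": date(1968, 3, 2)},
    {"name": "B", "age": 40, "sex": "female", "birthday": date(1971, 11, 20)},
    {"name": "C", "age": 19, "sex": "female", "birthday": date(1992, 7, 5)},
]
print(next_birthday_selection(household)["name"])
print(kish_style_selection(household)["name"])
```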

Interviewer Job Performance, Training, Satisfaction and Retention

Investigating the Effect of Interviewer Job Attitudes on Turnover and Job Performance Ashley Bowers, University of Michigan and Indiana University ([email protected]); Steven G. Heeringa, University of Michigan ([email protected]); Michael R. Elliott, University of Michigan ([email protected]); Alycia H. Cameron, Indiana University ([email protected]); Lilian Yahng, Indiana University ([email protected])

While high levels of interviewer turnover and concerns about job dissatisfaction have plagued survey organizations for years (Marketing Research and Intelligence Association, 2008; Wiggins, 1999), pressure to contain survey costs and growing demands on interviewers in a challenging, complex data collection environment now make addressing these issues critically important for the survey research industry. The cost associated with interviewers is one of the largest components of survey budgets, and the need to continually recruit, hire and train new interviewers significantly increases costs (Harding, Yost, & Knittle, 2007). While interviewers who leave drive up survey costs, dissatisfied interviewers who stay may exert less effort on the job and fail to meet performance standards. Such poor performance may lead to nonresponse error and measurement error. Despite these serious consequences, there has been little research to identify the attitudinal predictors of turnover and job performance. The few reported studies in this area (see Harding, Yost, & Knittle, 2007; Link, 2006) fail to consider job attitudes, which have long been associated with turnover in management research. We report findings from a mixed-methods study that included a survey of 500 current interviewers at three US and Canadian call centers and the collection of administrative data on their job performance and turnover three and six months after survey completion. With data from this study, we test a model that links a series of attitudinal states to the intent to quit and ultimately turnover. In addition, we assess the effect of interviewer job satisfaction on job performance. We use our empirical findings to propose a set of approaches to reduce turnover and to increase job satisfaction among interviewers.

CATI Interviewers’ Job Satisfaction Level Wojciech Jablonski, University of Lodz ([email protected])

The aim of this presentation is to outline the results of the methodological study that was carried out among CATI interviewers from October 2009 to August 2010. 12 major Polish research organizations, as well as 2 companies in Norway and Iceland, participated in the research (the Norwegian and Icelandic modules of the project were co-financed by technical assistance funds of the EEA Financial Mechanism and the Norwegian Financial Mechanism within the framework of the Scholarship and Training Fund). The research was based on a standardized self-completion questionnaire (in total 942 interviewers were surveyed) and in-depth interviews (which were conducted with 49 experienced CATI interviewers). The presentation focuses on selected results of this project; it investigates the satisfaction CATI interviewers derive from doing their job. I will outline the results of one question included in the questionnaire. The interviewers were given a list of different statements describing the telephone interviewers’ job and were asked to define to what extent they agree or disagree with each statement. The results are presented in six scales, measuring satisfaction with different aspects of the job: supervision, recognition, co-workers, progress / learn & grow, working conditions, and earnings. The presented results will take into account background variables such as interviewers’ age, field of education and work experience. Additionally, the qualitative data obtained in in-depth interviews will be presented. The interviewers were encouraged to describe difficult situations they encounter while working in a CATI studio. They were also asked to elaborate on the improvements which could be implemented in order to make CATI studio staff more satisfied with their job.

Evaluating Interviewer Performance in Surveys of Early Care and Education Rupa Datta, NORC at University of Chicago ([email protected]); Ting Yan, NORC at the University of Chicago ([email protected]); Jill Connelly, NORC at the University of Chicago ([email protected])

The National Survey of Early Care and Education (NSECE) is an integrated set of surveys of households with young children and of institutions and individuals providing care for young children. The objective of the NSECE is to document the nation’s current utilization and availability of early care and education, and to deepen our understanding of the extent to which families’ needs and preferences coordinate well with providers’ offerings and constraints. The complexity of the surveys and the tight data collection schedules increase the challenges faced by interviewers, who need to locate, contact, recruit, and interview sampled households and sampled establishments and individuals providing early care and education. We devised a system to monitor and evaluate interviewer performance in all aspects of data collection by employing different sources of data such as survey paradata (e.g., timing data, records of call attempts), survey-related interviewer characteristics (e.g., interviewer-level response rates), non-survey-related interviewer characteristics (e.g., interviewer experience, gender, languages spoken), and survey questionnaire data. We plan to integrate these data into a single metric that would allow us to evaluate interviewer performance and identify those who are performing inadequately.

Training Data Collectors for Panel Surveys Brad Edwards, Westat ([email protected]); Laura Branden, Westat ([email protected])

Longitudinal surveys can impose extraordinary burden on respondents and often require a large investment of resources to overcome resistance to this burden. To increase the return on investment, panel surveys (especially those that collect data in person) often incorporate many features beyond asking questions and recording answers. Linkage with administrative records, collection of biomarkers, assessments of literacy, physical or cognitive function, social network data, and collection of environmental information are increasingly used to improve survey efficiency. These components are often easier to incorporate in panel surveys than in cross-sectional surveys because their burden and cost can be spread out over time.

Data collector training is one focus of the large investment of resources. The high burden of longitudinal surveys demands that data collectors develop sophisticated skills in gaining cooperation. But as more components are folded into panel survey activities, field staff must also be trained to accomplish tasks beyond "simple" interviewing and gaining cooperation. Although the initial training costs can be amortized over subsequent data collection waves, drift in data collector skills may become such a concern that periodic training is required, to refresh skills and to ensure that the data collector component of measurement error is controlled.

On the National Health and Aging Trends Study (NHATS), we developed a rigorous training program for baseline that provided intense training and certification on assessment of cognitive and physical functioning, using asynchronous distance learning as well as classroom instruction. Re-certification on scoring the assessments was accomplished several months after baseline launch with videos delivered through a learning management system. Training plans for follow-up waves make increasing use of distance learning as the data collection staff become more experienced, and may use a split ballot experiment to examine the effects. We conclude with some insights and recommendations for other panel surveys.

Investigating Privacy Concerns

Predictors of Personal Data Privacy Attitudes and Behaviors and the Consequences for Survey Researchers Max Kilger, Experian Simmons ([email protected]); Danica Jovanova, Experian Simmons ([email protected])

Challenges surrounding the emerging issues of data privacy are of direct concern to survey researchers. The willingness of individuals to cooperate and provide demographic, attitudinal and behavioral data is a key element in the ability of researchers to conduct studies that incorporate a wide breadth of topics utilizing representative samples. As people become increasingly aware of the economic, legal and social costs of the misuse of personal data, it becomes increasingly important for researchers to understand how privacy mechanisms work as well as what types of procedures may be effective in reducing respondent apprehension and increasing cooperation in providing information to researchers. This paper explores the issues surrounding perceptions of data privacy and in doing so discredits some commonly held beliefs about attitudes and behaviors surrounding personal information and privacy. Utilizing a national probability sample of over 25,000 respondents, this research reveals demographic profiles for attitudes towards a number of key privacy issues including the propensity to read privacy statements, perceptions of the risks of online information, the incidence of negative consequences from online information, trust in the federal government to make the best decisions about personal privacy and the willingness to provide personal information in exchange for something of value. Time series trends over the last several years for some of these variables will also be presented. Following this descriptive analysis, a multivariate analysis of factors associated with the willingness to provide personal data is developed, and the results are interpreted in terms of what influences personal data privacy decisions and how survey researchers can gain better cooperation and reduce respondent apprehension.

Privacy Concern: A Question of Age or the Ages? Kristen L. Cibelli, University of Michigan ([email protected])

The Census Bureau launched a program of privacy-related research in the early 1990s to examine privacy and confidentiality attitudes and views of the public toward proposed practices such as obtaining data from other agencies. This research included a series of similar nationally representative surveys between 1995 and 2010. Research stemming from these studies and others has shown that privacy concerns can affect respondent willingness to respond to surveys and to cooperate with requests for information, presenting a challenge to the work of the Census and other government statistical agencies.

This paper examines possible differences in trends in privacy concerns by age. Specifically, we test the hypothesis that differing levels of familiarity with sharing personal information between younger and older adults in today’s society have resulted in an “age gap,” with older respondents generally demonstrating heightened privacy concerns.

We begin by examining changes in general privacy concern reported in 1995, 1999, 2000 and 2010. While the increase in general privacy concern between the 2000 and the most recent 2010 survey is not significant, results indicate a shift away from the small yet significant declines in privacy concern observed across the previous surveys. We also find that the reversal of this trend is in large part due to elevated privacy concerns among older respondents. Further analyses looking at the effect of age on privacy concern support the hypothesis that privacy concern differs among age groups, with some effect of age observed in 1995 and to a highly significant degree in 2010. The difference in the effect of age between the 1995 and 2010 surveys is statistically significant and offers evidence that an “age gap” has emerged in privacy concern not merely due to age, but due to the ages in which we live today.

Respondent-Level Influences on Consent to Record Linkage: Effects of Privacy Attitudes and Consent Request Salience Jenna Fulton, Joint Program in Survey Methodology, University of Maryland ([email protected])

A growing number of surveys ask respondents for permission to link their survey responses with administrative records. Such linked data enhance the utility of both surveys and administrative records by making possible studies that would be difficult or impossible to conduct using either source alone. Linking survey data and administrative records may also improve data quality and can reduce respondent and interviewer burden. In most cases, linkage is contingent upon respondents’ consent. When consent is required for record linkage to occur, not all respondents provide it. With evidence of declining consent rates, there is a growing need to identify factors associated with providing consent. Some research suggests that respondents’ privacy, confidentiality, and trust attitudes can influence their willingness to consent to record linkage. Further, the salience of the consent request has also been shown to affect the likelihood of consent. In this study, we used a telephone survey sponsored by the 2011 Joint Program in Survey Methodology Practicum to investigate respondent-level influences on consent. Respondents were randomly assigned to a bogus consent request to link either their health or income and employment-related administrative records with survey responses. The survey included a battery of items measuring respondents’ privacy, confidentiality, and trust attitudes in order to evaluate the effects of these attitudes on consent. In addition, the survey contained correlates of information that would be incorporated in health and income and employment-related administrative records in order to examine the influence of request salience on consent. We also present qualitative findings in which Practicum survey respondents explain their reasons for consenting or refusing consent. Together, this research provides greater understanding of the factors motivating respondents to consent to record linkage and barriers preventing them from doing so.

Respondent Permission to Contact or Locate on Facebook: Findings from the National Longitudinal Transition Study 2012 Holly H. Matulewicz, Mathematica Policy Research ([email protected]); Stephanie Boraas, Mathematica Policy Research ([email protected]); Daniel J. Friend, Mathematica Policy Research ([email protected]); Anne B. Ciemnecki, Mathematica Policy Research ([email protected])

Engagement with social networking sites such as Facebook continues to evolve rapidly, in both the number of users and the ways in which individuals construct and manage online identities. To effectively engage sample members through such mediums, more information is needed on how people perceive the networking site and the identity they have constructed for themselves within it. Social media has the potential to provide researchers with another means of locating and contacting respondents over time. Because online identities are not bound by physical space or even economic barriers, they present opportunities for longitudinal tracking of respondents that would otherwise pose challenges in the physical world. For example, access to and use of a Facebook account is not limited to those with a working telephone number or a current postal address. However, would sample members perceive it as an “invasion of privacy” if researchers used social media to locate and contact them? Do perceptions vary by age or other demographic characteristics? Due to the rapid evolution of these sites and the subsequent formation of user perceptions, there is a need for nationally representative data on these issues. The National Longitudinal Transition Study 2012 (NLTS 2012), sponsored by the U.S. Department of Education, contains a nationally representative sample of 15,000 youth ages 13-21. Baseline data collection, which includes a survey of these youth and their parents, is being conducted in Spring 2012, with follow-up in 2014. This paper presents preliminary findings on the percentage of parents and students who agree to be contacted through Facebook and the results from affective coding of parents’ and students’ oral responses to these questions. Results describe differences in responses by key demographic characteristics and the implications for survey researchers who are considering using such media for these purposes.

Media Effects on Political Views and Behaviors

Media Partisanship Scores: Developing a Holistic Measure for the Effects of Politically Relevant Media Devra C. Moehler, Annenberg School for Communication, University of Pennsylvania ([email protected]); Elizabeth Roodhouse, Annenberg School for Communication ([email protected]); Douglas Allen, Annenberg School for Communication, University of Pennsylvania ([email protected])

The recent explosion of cable television and talk radio programming provides citizens with greater choice than ever before. The proliferation of targeted entertainment and overtly partisan news shows allows citizens to select more ideologically extreme and homogenous media diets than was possible when the more temperate network channels dominated the airwaves. Observers express concern that media fragmentation along partisan lines polarizes the citizenry and undermines deliberative democracy. Yet there are also reasons to expect that exposure to more unified and outspoken partisan perspectives in the media may help mobilize citizens to participate in politics, thus bolstering participatory democracy. We test the effects of exposure to partisan news and entertainment media on polarization and participation using the Internet Panel of the 2008 National Annenberg Election Survey. To do so we construct a new measure of partisan media exposure that reflects important dimensions of the current media environment. Our measure offers four benefits. First, it incorporates in a single metric the influence of 74 TV and radio shows spanning entertainment and news genres. Second, it reflects the degree of each program’s partisanship (a continuous measure from neutral to extreme) rather than just each show’s type of partisanship (Democratic or Republican). Third, it is sensitive to the homogeneity or heterogeneity of one’s total media diet. Finally, it is replicable across different media and political systems. The paper provides a methodological contribution with the introduction of a new measure, and an empirical contribution by analyzing the effects of understudied dimensions of the new media environment. It thus serves as a complement to extant studies of partisan media effects.

The Effects of Media Localism on Political and Social Trust Michael Barthel, University of Washington, Department of Communication ([email protected])

As cornerstones of any democracy, political trust and social trust have been the focus of much research, with scholars attempting to identify their many antecedents. A large corpus of literature has examined media influences, most of which has emphasized the negativity and information found in various media outlets (e.g., Norris 1997; Putnam 2000; Robinson 1976). Studies of media effects on trust then tend to focus on the differential effects of content (e.g., news vs. entertainment) or media (e.g., newspapers vs. television). Unfortunately, this research has been concerned primarily with national news media outlets, and has given short shrift to local news media, which have been shown to bring politically disinterested, socially atomistic citizens together and to help them engage in public life. In other words, local media can promote active engagement in civic and political life (e.g., Bellah et al. 1985), which in turn can shape perceptions of political and social trust.

This study examines the extent to which local media can influence citizens’ political and social trust. We are interested not only in citizens’ consumption of local media content, but also the extent to which the media are based geographically within that community (or “media localism”). Using data from the 2006 Social Capital Community Survey conducted by Harvard University (N = 5,803), we investigate the impact of media use and media localism (operationalized as the degree to which a media market’s primary sources of local news – daily newspapers and network TV affiliates – are based in that community) on social and political trust. Full analyses will involve validation of these findings with data from the 2008 American National Election Studies, examining the political effects of media localism in each Congressional district, while controlling for sociodemographics, media use, and key political attitudes.

The Impacts of Fox and Not-Fox Television News on Americans’ Judgments about Global Warming Bo MacInnis, Stanford University ([email protected]); Jon Krosnick, Stanford University ([email protected])

Decades of research claimed “minimal effects” of news media on individual attitudes. We found the appearance of “minimal effects” of news exposure on global warming attitudes and beliefs when using a traditional analytic approach that treats news content as homogenous. But when distinguishing Fox News on television from not-Fox television news, using an instrumental variable approach to account for the endogeneity of media consumption by a nationally representative sample, we found that more exposure to Fox News raised skepticism about, while more exposure to not-Fox television news increased acceptance of, global warming. Impacts were large and increased with amount of exposure. Mediational analyses showed that, in line with the Attitude-Certainty-Evaluation model, more exposure to Fox News and not-Fox television news influenced perceived scientific consensus and trust in scientists, which in turn affected judgments of the causes and consequences of global warming and of the seriousness of the national problem. This study’s principal findings shed light on the renewed debate on the persuasive effect of mass media. Large media effects are present and detectable in the general population when the empirical conditions delineated in Zaller (1996) hold and when proper analytical methods are employed to disentangle the selective exposure effect from the direct persuasive effect. Additional implications of the present study’s findings for the persuasion literature, public understanding of science and media bias are discussed.
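
The instrumental-variable logic described here can be sketched as a manual two-stage least squares. The data file, outcome, exposure measures, and especially the instruments below are hypothetical stand-ins, not the variables the authors used, and manual 2SLS standard errors would need correction in a real analysis.

```python
# Illustrative sketch only: two-stage least squares with endogenous media exposure.
import pandas as pd
import statsmodels.formula.api as smf

naes = pd.read_csv("naes_media.csv")  # hypothetical analysis file

# Stage 1: predict Fox News exposure from hypothetical instruments plus exogenous controls.
stage1 = smf.ols(
    "fox_hours ~ cable_channel_position + local_carriage + age + education + party_id",
    data=naes,
).fit()
naes["fox_hours_hat"] = stage1.fittedvalues

# Stage 2: regress global-warming acceptance on the predicted (instrumented) exposure.
stage2 = smf.ols(
    "gw_acceptance ~ fox_hours_hat + nonfox_hours + age + education + party_id",
    data=naes,
).fit()
print(stage2.summary())  # note: second-stage SEs from manual 2SLS are not corrected
```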

Effects of Televised Campaign Advertising: Considering the Accuracy of Retrospective Survey Self-Reports of Media Consumption Sarah Niebler, University of Wisconsin - Madison ([email protected]); Carly Urban, University of Wisconsin - Madison ([email protected]); Ken Goldstein, Campaign Media Analysis Group (CMAG) ([email protected])

Current literature on the effects of televised campaign advertising relies on public opinion surveys that ask individuals to retrospectively self-report how much and what type of media they consume. Scholars then use responses to estimate the number of television advertisements individuals were exposed to and use those estimates to determine the effect of advertisements on individuals’ likelihood to vote and their vote choice. However, we know little about the accuracy of individuals’ retrospective self-reports of their media consumption habits. If individuals erroneously report how much television they watch, then estimates about how many advertisements they were exposed to are also erroneous. Thus, any effects of advertising on voter turnout or vote choice are likely biased. This paper combines unique data sources to: address the accuracy of individuals’ retrospective self-reports of their media consumption habits; determine whether some people are more accurate in their retrospective self-reports than others; and demonstrate how the errors in retrospective self-reports can bias estimates of the effects of campaign activities on a variety of political variables. Specifically, this paper employs 2010 Nielsen data from Michigan, where individuals were asked to keep a television diary of their media consumption habits. We combine this diary data with survey responses where individuals were asked to retrospectively report how much television they watched during the preceding week. Comparing individuals’ responses to these questions with their daily television diary, preliminary results indicate that, retrospectively, respondents tend to overestimate the amount of television they watch. These differences are heterogeneous across demographic groups and political variables. Finally, we merge these data with advertising data from the Campaign Media Analysis Group (CMAG) to argue that the mismatch between real-time and retrospective accounts of media consumption introduces an additional source of uncertainty in estimates of the effect of campaign advertising on political variables.

Methods to Improve Web Surveys

Advanced Paradata in Web Surveys: What Can They Tell about the Response Process? Nejc Berzelak, University of Ljubljana ([email protected]); Katja Lozar Manfreda, University of Ljubljana ([email protected]); Ana Slavec, University of Ljubljana ([email protected]); Vasja Vehovar, University of Ljubljana ([email protected])

One of the advantages of web surveys is the availability of simple and effective collection of paradata, i.e. data about the process of data collection. Widespread availability of client-side technologies in modern web browsers offers a range of different types of these data. Their utilization and analysis can significantly contribute to understanding the response process and thus identification of potential data quality problems at the level of individual respondents and the survey as a whole.

In this paper we examine different types of automatically collected paradata for evaluating data quality in web surveys. We implemented an advanced paradata collection solution that offers a number of different paradata related to: time (total and first-visit page times, estimated answering times at the question level), questionnaire routing (tracking across pages, returns in the questionnaire), notifications to respondents (frequency and location of item nonresponse and quality check reminders), indications of response-order effects across several questions, multitasking during responding, and others. We analyze the relations between these paradata, questionnaire characteristics and response quality indicators across a number of surveys conducted within our research organization. In addition, a dedicated web questionnaire with different question types, logical controls and other features for data quality estimation was fielded.

The paper begins with a discussion of the technological, methodological and ethical potentials and problems of using advanced, automatically collected paradata in web surveys. It then focuses on the analysis of empirical data and demonstrates the power of paradata for understanding the response process and estimating measurement quality. Finally, possible future developments of web survey paradata collection and uses are outlined.
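
As a rough illustration of how page-level paradata of this kind can be turned into response-process indicators, the sketch below derives per-page times, speeding flags, and backtrack counts from a hypothetical event log. The log format and column names are assumptions, not the authors' system.

```python
# Illustrative sketch only: simple indicators from a client-side paradata event log.
import pandas as pd

# One row per logged event: respondent_id, page (numeric page index),
# event ("enter" / "leave"), timestamp.
events = pd.read_csv("paradata_events.csv", parse_dates=["timestamp"])

enters = (events[events["event"] == "enter"]
          .groupby(["respondent_id", "page"])["timestamp"].min())
leaves = (events[events["event"] == "leave"]
          .groupby(["respondent_id", "page"])["timestamp"].max())
page_times = (leaves - enters).dt.total_seconds().rename("seconds_on_page").reset_index()

# Flag likely speeders: pages completed in less than a third of that page's median time.
medians = page_times.groupby("page")["seconds_on_page"].transform("median")
page_times["speeding_flag"] = page_times["seconds_on_page"] < medians / 3

# Count backtracking (returns to an earlier page) per respondent as a routing indicator.
events = events.sort_values(["respondent_id", "timestamp"])
events["prev_page"] = events.groupby("respondent_id")["page"].shift()
backtracks = (events["page"] < events["prev_page"]).groupby(events["respondent_id"]).sum()

print(page_times.head())
print(backtracks.head())
```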

Usability Issues from Testing a Census Web Survey: Results from Testing of the Census Quality Survey (CQS) Kathleen T. Ashenfelter, U.S. Census Bureau ([email protected])

The Census Bureau would like to reduce paper use as well as costs by moving from a paper form to an online version of the Decennial Census for use in 2020. The Census Quality Survey was conducted in order to estimate measurement error, such as simple response variance, from a Census Internet questionnaire compared to that from a Census paper questionnaire (Hill, Reiser, & Bentley, 2010). The Census Bureau’s Human Factors and Usability Lab performed usability testing on prototypes of an online form that was very similar to the 2010 Census and provided feedback based on human factors considerations to the developers of the survey.

When considering an online version of the Census, its overall usability must be taken very seriously, especially for an instrument that needs to be completed by every resident of the United States. For a data-collection Web survey to be successful, its user interface must support the user in completing the survey in an efficient, effective, and satisfying way. The Census Bureau’s Usability Lab conducted two rounds of usability testing of the online Census CQS instrument in April and June of 2010. The goal was to identify elements of the user-interface design that were problematic and led to ineffective and unsatisfying experiences for potential respondents of the survey. Usability issues identified during testing (such as participants having trouble with the auto-tabbing functionality of the “Date of Birth” question) will be discussed along with potential suggestions for the improvement of future online surveys.

Effects of Pagination on Short Online Surveys Aaron Sedley, Google ([email protected]); Mario Callegaro, Google ([email protected])

When designing online surveys, researchers must choose from a variety of pagination options. Respondents' expectations, experiences, and behaviors may vary depending on a survey's pagination, affecting both breakoffs and responses themselves. Surprisingly little formal experimentation has been conducted on the effects of survey pagination, with initial evidence focused on a long survey of university students (Peytchev, Couper, McCabe, Crawford 2006).

This experiment is intended to further inform understanding of the effects of pagination in online surveys. In a split-ballot experiment, we served respondents one of three versions of a short online questionnaire (~15 questions) on attitudes and experiences toward an online product. The versions were randomly assigned to respondents and were constructed with a) one question per page, b) logical groupings of questions over several pages, and c) as few pages as possible.

Effects of pagination are evaluated on breakoff rates, response time, item and unit nonresponse, inter-item correlations, and perceived length/difficulty. We hypothesize that the questionnaire with the fewest (longest) pages will cause greater initial breakoff, and the one with the most pages will suffer increased breakoff during the survey.
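
One simple way to compare breakoff across the three pagination conditions is a chi-square test on a condition-by-outcome table, sketched below. The counts are invented purely for illustration, not results from this experiment.

```python
# Illustrative sketch only: comparing breakoff rates across three pagination conditions.
import numpy as np
from scipy.stats import chi2_contingency

# Rows: conditions (one question per page, grouped pages, fewest pages);
# columns: completed vs. broke off. Counts are hypothetical.
table = np.array([
    [940, 60],
    [955, 45],
    [930, 70],
])
chi2, p, dof, expected = chi2_contingency(table)
breakoff_rates = table[:, 1] / table.sum(axis=1)
print(f"chi2={chi2:.2f}, dof={dof}, p={p:.3f}")
print("breakoff rates by condition:", np.round(breakoff_rates, 3))
```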

Testing Video Messages in Web Surveys: Effects on Sample Bias and Validity Dina Shapiro, Annenberg School for Communication, University of Pennsylvania ([email protected]); Joseph Cappella, Annenberg School for Communication, University of Pennsylvania ([email protected])

One of the benefits of web surveys over their traditional mail or phone counterparts is the capacity to deliver multi-media content, such as videos with audio components, quickly and inexpensively directly to a large sample of respondents in geographically diverse locations via their personal computers. Yet to be investigated is the question of how reliably this format can deliver video content to the intended audience and how the validity of the findings is affected by the inclusion of videos. To address these questions, this research paper focuses on two survey-related behaviors: (1) premature termination resulting from respondents’ inability to view videos embedded within online surveys and (2) the influence of participant characteristics, including those related to video content, on video viewing times. Through a secondary analysis of two studies focusing on adult smokers, this research paper used multilevel cross-classified logistic models to investigate whether video failure occurs differently across different demographic groups, whether video characteristics influence video viewing success, and whether respondents who do not view videos for their full duration differ from those who view videos for an appropriate amount of time. Results show the influence of age, education, gender, household internet access, and work status on the likelihood a respondent will report video failure and on video viewing time as a function of video length. In addition, results show that issue salience impacts video viewing times such that participants who express greater readiness to quit smoking are more likely to view anti-smoking PSA videos in their entirety. These results have important implications for scholars interested in using internet-based surveys to deliver videos to survey participants.

Panel Conditioning: Results from Two Experiments in a Probability-based Online Panel Bella Struminskaya, GESIS - Leibniz Institute for the Social Sciences ([email protected]); Lars Kaczmirek, GESIS - Leibniz Institute for the Social Sciences ([email protected]); Ines Schaurer, GESIS - Leibniz Institute for the Social Sciences ([email protected]); Wolfgang Bandilla, GESIS - Leibniz Institute for the Social Sciences ([email protected])

Online panels as a mode of data collection allow for short intervals between waves. This facilitates more frequent polling, which poses the problem of heavy survey-taking effects. One of these effects is panel conditioning — changes occurring in actual behavior and attitudes or reporting of behavior and attitudes. Yet the existing literature on panel conditioning provides mixed results since the occurrence of conditioning is not pervasive.

A common way to detect panel conditioning is to compare a panel wave with a cross-section of new panelists, but in that case it is hard to single out the influence of non-response, which is different for the initial wave and subsequent waves.

This study investigates changes in reporting using the data of a probability-based online panel. Two identically designed experiments have been conducted within the panel. In each experiment, respondents of the panel conditioning group completed several questionnaires in the online panel prior to the experiment. Respondents of the control group only answered the initial online questionnaire.

The first experiment (n=442) tested a possible positive consequence of panel conditioning – more honest reporting – and the hypothesis of social desirability reduction. The questionnaire topic in the first experiment was media consumption and the environment. The second experiment (n=378) tested a possible threat to validity – strategic reporting in order to reduce the burden of follow-up questions. The questionnaire topic of the second experiment was friends and leisure time.

Administering the questionnaire as the second wave to control groups allowed for elimination of the confounding effects of attrition since the largest attrition occurred after the first wave. Identical questionnaires were administered to the two groups parallel in time, which allowed for the exclusion of the influence of current events on reports. We discuss how the tested types of changing response behavior fit into the general framework of panel conditioning.

New Frontiers: Social Media Analysis

Social Media Intelligence: Measuring Brand Sentiment from Online Conversations David A. Schweidel, University of Wisconsin - Madison ([email protected])

With the proliferation of social media, questions have begun to emerge about its role in deriving marketing insights. In this research, we investigate the potential to "listen in" on social media conversations as a means of inferring brand sentiment. Our analysis employs data collected from multiple website domains, spanning a variety of online venue formats to which social media comments may be contributed. We demonstrate how factors relating to the focus of social media comments and the venue to which they have been contributed relate to the sentiment expressed through social media and need to be explicitly modeled in an effort to derive a measure of online brand sentiment. Our proposed approach provides an adjusted brand sentiment metric that is correlated with the results of an offline brand tracking survey, while a simple average of sentiment across all social media comments that ignores the venue and focus of the comments is uncorrelated with the same offline tracking survey. We apply our modeling framework to social media comments from three additional industries to further illustrate the limits associated with analyzing social media sentiment with comments gleaned from a single venue. We conclude with a discussion of the implications of our findings for practitioners considering social media as a potential research tool.
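
A minimal sketch of the venue- and focus-adjustment idea is shown below: regress out venue and comment-focus effects, then aggregate the residual sentiment by brand and month. This is a simplification for illustration, not the paper's model, and the data file and column names are hypothetical.

```python
# Illustrative sketch only: naive vs. venue/focus-adjusted brand sentiment.
import pandas as pd
import statsmodels.formula.api as smf

comments = pd.read_csv("social_media_comments.csv")
# Hypothetical columns: brand, month, sentiment (-1..1), venue, focus.

# Naive metric: simple average of sentiment per brand-month.
naive = comments.groupby(["brand", "month"])["sentiment"].mean()

# Adjusted metric: remove venue and focus effects, then average residuals
# (plus the grand mean) per brand-month.
model = smf.ols("sentiment ~ C(venue) + C(focus)", data=comments).fit()
comments["adjusted"] = model.resid + comments["sentiment"].mean()
adjusted = comments.groupby(["brand", "month"])["adjusted"].mean()

print(pd.concat({"naive": naive, "adjusted": adjusted}, axis=1).head())
```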

Appealing to the Masses: How Crowdsourcing can be Effectively Used as a Data Collection Tool Justin T. Bailey, The Nielsen Company ([email protected]); Michael Link, The Nielsen Company ([email protected])

Crowdsourcing, or the act of asking an unknown group or sample to perform a task normally assigned to specific individuals, has become a popular way to collect data and information that might otherwise be difficult to obtain. In this series of studies, we investigate the various strategies we used to refine crowdsourcing methodology, including tests of recruitment mode (online, in-person, mobile, or social network), incentives (cash, mobile airtime, or points), and task difficulty. By conducting a series of five crowdsourcing experiments in emerging markets such as India, Africa, and China, we were able to identify best practices and standards for using a crowdsourcing sample to collect data. In most of the tests, we asked a panel of participants to perform a moderately complex task, such as going to a store and taking a photograph of a specific product. Along with the photo, respondents would also provide the store name and address and the product name, and answer up to seven questions regarding how they felt about the experience and the reward they received. They could then transmit the data via various combinations of computer, social networking site, text message, or mobile app, depending on the test. Our testing showed that, if the optimal recruitment modes and tasks are used, crowdsourcing can present researchers with a useful alternative to gain insights regarding information about current trends or products. Leveraging modern data collection methods such as mobile phones and social network sites, crowdsourcing tasks can be optimized for potential panelists. We also address various issues relating to data accuracy and quality, sample and respondent characteristics, and types of tasks that are best suited for crowdsourcing.

The Brave New World of Social Communication: Exploring Patterns of Opinion Dissemination in Online News Environments Kristin Runge, University of Wisconsin - Madison ([email protected]); Dominique Brossard, University of Wisconsin - Madison ([email protected]); Dietram A Scheufele, University of Wisconsin - Madison ([email protected])

The growing popularity of social media as a contextual filter and aggregator for media users has become a problem for public opinion research, as conventional content analysis methodology and sampling techniques pre-date today’s decentralized, highly-fragmented and constantly changing online media environment. The sheer volume of blog, Twitter and Facebook posts and their fleeting, un-archived nature are a challenge when considering when to sample as well as where to sample. This study focuses on information searches surrounding novel scientific developments. Specifically, we use Crimson Hexagon software to analyze all English-language nanotechnology-related opinions posted on blogs, Facebook, Twitter and other social media websites between September 1, 2010 and August 31, 2011. Our purpose was to analyze opinion along dimensions of positive-neutral-pessimistic and certainty-uncertainty, identify social media channels used for opinion expression and discover commonly used frames. 49% of all opinion posts were neutral, 40% were optimistic and 11% were pessimistic. Certainty was expressed in 80% of all posts. Twitter accounted for 53% of nanotechnology-related opinion traffic, blogs accounted for 31%, news 8%, forums 4%, Facebook 3% and comments 1%. Our exhaustive online tracking shows that opinions were more likely to frame nanotechnology in medical or scientific research terms, and less likely to frame nanotechnology in economic terms. Posts were next mapped by country and state when geographic information was available. While a majority of posts were generated in the United States, regional analysis shows that California, the United Kingdom, New York and Canada generated the most nanotechnology opinion posts by volume. The strong trend toward certainty in opinion, optimism and neutrality, coupled with the distinct geographic origins of much of the social media traffic for nanotechnology opinion, has significant implications for understanding how opinion framing by key influencers determines the direction opinion takes as it is diffused among nanotechnology constituencies.

If You Ask Me I Won’t Tell You, but I’ll Tell the World When I Feel Like Doing So! The Frequency of Answering a Survey About a Specific Topic Versus Posting Comments About this Same Topic on Social Media Sites Michael G. Elasmar, Boston University ([email protected])

The social media revolution underway has resulted in an exponential growth in the volume of human expression of thoughts and feelings about a great variety of topics. This growth in human expression is in sharp contrast to the steady decline in survey response rates witnessed in the last two decades. Will the social media revolution transform the way public opinion is captured? A myriad of professionals have had to adjust to the perpetually evolving social media environment. Will those who study public opinion also need to adapt?

This paper focuses on the younger generation and the manner in which they make their opinions known. Young adults are known to be heavy users of social media, which is why this study focuses on them. The main research questions of this paper are: Are there differences between young adults’ frequency of responding to public opinion surveys about a variety of topics and the frequency with which they post their opinions about these same topics on social media sites? If discrepancies do exist, what factors help explain them? A survey of 18-22 year old college students was conducted to help shed light on these research questions.

This paper compares young adults’ frequency of responding to traditional surveys about politics, social issues, and products/services to their frequency of posting comments about these same topics on social media sites. Several theory-driven factors are tested as potential drivers of using social media for expressing one’s opinions.

Implications are drawn for both theory development and the likely evolution of the practice of public opinion research that focuses on young adults.

Predicting the Future of Social Media Analysis Peter Ph Mohler, University of Mannheim ([email protected])

Social Media Analysis (SMA) is a hot issue again. Some would even like to use it to predict the future. But without solving three crucial methodological issues, its own future is bleak. SMA can build on an impressive legacy from content analysis (CA); it could learn from CA’s many success stories as well as from its less successful projects. The success stories include propaganda analyses during WWII, newspaper analyses in the fifties and sixties, the analysis of open-ended questions in the seventies, and continuous media reporting since the eighties.

Many of the seemingly successful projects were plagued by major technical and methodological problems. Most prominent in the past was the lack of computer-ready information; that problem no longer exists. Other challenges were defining the target population and designing appropriate samples, identifying what to count, and deciding how to analyze the counts. These three challenges are more pressing today than ever:

As in surveys of special populations, a prime task in SMA is to define the target population and sampling frame. The next challenge is to sample from that population. SMA today cannot easily identify its target population or appropriate sampling strategies.

The number of possible counts in CA is unlimited, from the number of black dots on white paper to the number of “words.” Defining what to count is thus a complex issue. In addition, one could use information about the sampled material, such as readership, as a kind of “media demographics.” Media demographics are hard to obtain for much modern, web-based SMA. Statistical analyses of the resulting counts require strong text- and media-related theoretical assumptions. Two are most important: (a) measuring distances between counting units (say, words), which is necessary for computing associations, and (b) defining context boundaries for counts. The paper will discuss, based on past and recent studies, pitfalls and solutions for the three methodological challenges that overshadow the future of SMA.
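To make the two assumptions named above concrete, the following is a minimal sketch, not the author's method: the counting unit is taken to be the word, the context boundary is the individual post, the distance rule is a fixed token window, and the association measure is pointwise mutual information. All of these choices, and the example posts, are illustrative assumptions.

```python
# Minimal sketch of a distance rule between counting units and an explicit
# context boundary, with PMI as the association measure. Illustrative only.
import math
from collections import Counter
from itertools import combinations

posts = [
    "nanotechnology promises new cancer treatments",
    "new nanotechnology research funding announced",
    "cancer research benefits from nanotechnology",
]

WINDOW = 3  # maximum distance (in tokens) for two words to count as co-occurring

word_counts = Counter()
pair_counts = Counter()
for post in posts:                      # each post is treated as one context
    tokens = post.lower().split()
    word_counts.update(tokens)
    for i, j in combinations(range(len(tokens)), 2):
        if j - i <= WINDOW:             # distance rule between counting units
            pair_counts[tuple(sorted((tokens[i], tokens[j])))] += 1

total_words = sum(word_counts.values())
total_pairs = sum(pair_counts.values())

def pmi(w1, w2):
    """Pointwise mutual information of two words under the window/context choices above."""
    p_pair = pair_counts[tuple(sorted((w1, w2)))] / total_pairs
    p_w1 = word_counts[w1] / total_words
    p_w2 = word_counts[w2] / total_words
    return math.log2(p_pair / (p_w1 * p_w2)) if p_pair > 0 else float("-inf")

print(pmi("nanotechnology", "cancer"))
```

Changing the window size or the context boundary (post, sentence, thread) changes every downstream association, which is precisely why these assumptions need to be stated and defended.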

The Relationship Between Religion and Political Attitudes

Faith and Politics around the World: A Cross-National Study of the Relation between Religiosity and Political Attitudes Ariel Malka, Yeshiva University ([email protected]); Yphtach Lelkes, Stanford University ([email protected])

It is common in American political discourse for messages to imply a link between religiosity and political conservatism. Indeed, American religious elites have, over the last several decades, been more inclined to support the Republican Party and conservative policy positions than the Democratic Party and liberal policy positions. Some political psychologists have argued that there is a natural deep-seated connection between religious sentiment and political conservatism. In this view, characteristics such as sensitivity to threat, need for order and predictability, and uncritical deference to authority are said to cause both religiosity and preference for conservative social outcomes and policies. This perspective contrasts with the classic political science viewpoint that the structuring of political and related characteristics is conditional on exposure to the messages of political discourse, whose content ultimately stems from the strategic action of political elites.

In this research we use cross-national data from 47 nations to address the extent of, and the reasons for, religiosity’s relation with political attitudes. We address two broad questions in our primary analyses. First, does religiosity relate to both conservative cultural and conservative economic preferences – the “broad conservatism” hypothesis – or does religiosity relate only to conservative cultural preferences – the “narrow conservatism” hypothesis? Second, to what extent are observed relations between religiosity and conservative attitudes accounted for by dispositional characteristics – the “organic influence” hypothesis – and to what extent are they conditional on individual exposure to political discourse – the “discursively driven influence” hypothesis? In ancillary analyses, we examine nation-level characteristics – such as political institutions, economic indicators, human development indicators, and ethno-religious composition – as predictors of within-nation, individual-level processes linking religiosity and political attitudes.

A Secular Society? Examining the Religious Beliefs, Knowledge and Attitudes among the Unaffiliated in the U.S. Carolyn Funk, Pew Research Center ([email protected]); Besheer Mohamed, Pew Research Center ([email protected])

What does it mean for the U.S. to become an increasingly secular society? A growing number of Americans, sometimes labeled "seculars", evince no particular religious faith. Surprisingly little is known about this group or the implications of secularization for public opinion. This paper examines the similarities and differences among those with no religious affiliation, those espousing atheistic or agnostic views, and those affiliated with a religion. We examine the extent to which those with no religious affiliation hold religious beliefs, their knowledge and information levels about each of the world’s major religions, and their views about the role of religion in politics and society. Our analysis draws on several large-scale surveys in the U.S. conducted by the Pew Forum on Religion & Public Life over the last few years. We find a surprisingly large percentage of those with “no particular religious affiliation” believe in God and attend religious services at least occasionally. Further, views about religion and politics among this group suggest broad levels of religious tolerance while also supporting a separation between church and state.

The Interplay Between Religiosity, Moral Values and Political Party Preference: What Are Americans Willing to Die For? Ariela Keysar, ISSSC Trinity College ([email protected]); Barry A. Kosmin, ISSSC Trinity College ([email protected]); Benjamin Beit-Hallahmi, University of Haifa ([email protected])

What are Americans prepared to risk their lives to defend in the 21st century: their families, their country, or their ideals? How do Americans’ feelings about self-sacrifice correlate with their religious and political beliefs?

We explore the hypothesis that religious people declare their readiness for self-sacrifice more often than non-religious people, perhaps because of their faith in an afterlife. We also explore the alternative hypothesis that ideology, as exemplified by political party preference, is the stronger influence on one’s willingness to risk his or her life to defend family, country, and ideals. Statistical models will be introduced to weigh the relative importance of these factors, controlling for the collinearity between party preference and religious belief.

Surveys show that believers in a higher power are almost universally more trusted by the American public than non-believers because of their perceived moral values. This paper will test popular claims that religious people are more patriotic than non-religious people. By looking specifically at people who currently profess no religion, those who do not believe in God, and individuals’ upbringing (secular or religious) and current religious identification, we can begin to understand, document and challenge common perceptions regarding secular ethics and the moral values of our society.

The data to answer these questions are drawn from the American Religious Identification Survey (ARIS) 2008, a nationally representative RDD telephone survey with over 50,000 adult respondents, including a sub-sample of people who profess no religion. The study will explore the correlations between respondents’ age, gender, race, and education and their answers to the questions about self-sacrifice.
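The abstract does not specify the statistical models it will introduce; the following is only a hypothetical sketch of the general approach of entering two correlated predictors jointly so their relative importance can be weighed. The variables, coefficients, and data are simulated, not ARIS results.

```python
# Hypothetical sketch: weighing two correlated predictors (religiosity, party
# preference) against a self-sacrifice outcome. All data below are simulated.
import numpy as np

rng = np.random.default_rng(0)
n = 1000
religiosity = rng.normal(size=n)
# Party preference is simulated as correlated with religiosity (the collinearity concern).
party_pref = 0.6 * religiosity + rng.normal(scale=0.8, size=n)
willing_to_sacrifice = 0.5 * religiosity + 0.2 * party_pref + rng.normal(size=n)

# Collinearity check: correlation between the two predictors.
print("predictor correlation:", np.corrcoef(religiosity, party_pref)[0, 1])

# OLS with both predictors entered jointly, so each coefficient reflects the
# association of one predictor holding the other constant.
X = np.column_stack([np.ones(n), religiosity, party_pref])
coef, *_ = np.linalg.lstsq(X, willing_to_sacrifice, rcond=None)
print("intercept, religiosity, party preference:", coef)
```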

Mormon Presidents and Mosques Next Door: An Examination of American Attitudes Toward Religious Minorities in 2012 Daniel Cox, Public Religion Research Institute ([email protected]); Robert P. Jones, Public Religion Research Institute ([email protected])

American attitudes about Muslims, Mormons and their faiths have been a prominent feature of the political debate over the past year. The candidacies of Mitt Romney and Jon Huntsman, both of whom are Mormon, have brought the issues of Mormonism and religious pluralism to the forefront of the Republican nomination battle. The lingering hostility toward and fear of Muslims harbored by certain segments of the public has bubbled to the surface in response to the construction of an Islamic center near Ground Zero in New York City and congressional investigations into the alleged radicalization of Islam. This paper will explore how American views about and comfort with Mormons and Muslims differ and in what ways they are similar. It will also examine some possible determinants of these attitudes.

There has already been extensive polling about Romney and his faith; however, the degree to which it will prove to be a liability remains an open question. In addition to questions that measure American views about Mormonism and whether it is a Christian religion, we conducted a list experiment that allows respondents to express discomfort or concern about a Mormon becoming president without having to state this opinion directly to the interviewer.
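For readers unfamiliar with the technique, the standard list-experiment (item count) estimator works as sketched below. The counts are made up for illustration and are not PRRI's data or analysis: control respondents report how many of a set of baseline items apply to them, treatment respondents see the same list plus the sensitive item, and the difference in mean counts estimates the share holding the sensitive attitude.

```python
# Minimal sketch of the difference-in-means estimator for a list experiment.
# Counts below are invented; they are not the study's data.
control_counts = [1, 2, 0, 3, 2, 1, 2]    # items endorsed out of J baseline items
treatment_counts = [2, 3, 1, 3, 2, 2, 3]  # items endorsed out of J + 1 items (adds sensitive item)

mean_control = sum(control_counts) / len(control_counts)
mean_treatment = sum(treatment_counts) / len(treatment_counts)

estimated_share = mean_treatment - mean_control
print(f"Estimated share holding the sensitive attitude: {estimated_share:.2f}")
```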

In the last few years there has been evidence of increasing anti-Muslim sentiment in the U.S. In this context, we examine views about Islam and Muslims and the degree to which Americans are comfortable with certain aspects of Muslim culture and religious expression (e.g., wearing a burqa). Using a survey experiment, we also tested whether priming respondents with messages about religious pluralism increases feelings of comfort with Muslims in society.