
Workshop on Probability-Based and Nonprobability Survey Research

Collaborative Research Center SFB 884 University of Mannheim

June 25-26, 2018

Keynote: Jon A. Krosnick (Stanford University)

Scientific Committee: Carina Cornesse, Alexander Wenz, Annelies Blom

Location: SFB 884 – Political Economy of Reforms, B6, 30-32, 68131 Mannheim, Room 008 (Ground Floor)

Schedule

Monday, June 25

08:30 – 09:10 Registration and coffee
09:10 – 09:30 Conference opening

09:30 – 10:30 Session 1: Professional Respondents and Response Quality
o Professional respondents: are they a threat to probability-based online panels as well? (Edith D. de Leeuw)

o Response quality in nonprobability and probability-based online panels (Carina Cornesse and Annelies Blom)

10:30 – 11:00 Coffee break

11:00 – 12:30 Session 2: Sample Accuracy
o Comparing complex measurement instruments across probabilistic and non-probabilistic online surveys (Stefan Zins, Henning Silber, Tobias Gummer, Clemens Lechner, and Alexander Murray-Watters)

o Comparing web nonprobability-based surveys and telephone probability-based surveys with register data: the case of Global Entrepreneurship Monitor in Luxembourg (Cesare A. F. Riillo)

o Does sampling matter? Evidence from personality and politics (Mahsa H. Kashani and Annelies Blom)

12:30 – 13:30 Lunch

13:30 – 15:00 Session 3: Conceptual Issues in Probability-Based and Nonprobability Survey Research
o The association between population representation and response quality in probability-based and nonprobability online panels (Alexander Wenz, Carina Cornesse, and Annelies Blom)

o Probability vs. nonprobability or high-information vs. low-information? (Andrew Mercer)

o Non-probability based online panels: market research practitioners' perspective (Wojciech Jablonski)

15:00 – 15:30 Coffee break

15:30 – 17:00 Session 4: Practical Considerations in Online Panel Research
o Replenishment of the Life in Australia Panel (Benjamin Phillips and Darren W. Pennay)

o Terms of agreement: the inclusion of Muslim minorities (Elisabeth Ivarsflaten and Paul Sniderman)

o Do you get what you asked for? On the implementation of a survey in nonprobability online panels (Daniela Ackermann-Piek and Annelies Blom)

Tuesday, June 26

09:30 – 10:30 Keynote: An update on the accuracy of probability sample surveys and non-probability sample surveys (Jon A. Krosnick)

10:30 – 11:00 Coffee break

11:00 – 12:30 Session 5: Variance Estimation and Weighting Adjustments
o Precision of estimates based on non-probability online panels (Marek Fuchs and Tobias Baier)

o Improving estimates from non-probability online surveys (Dina Neiger, Andrew C. Ward, Darren W. Pennay, Paul J. Lavrakas, and Benjamin Phillips)

o Weighting and estimation strategies for probability-based and nonprobability panel research (Christian Bruch, Barbara Felderer, and Annelies Blom)

12:30 – 13:30 Lunch

13:30 – 15:00 Session 6: Combining Probability-Based and Nonprobability Samples
o Rationale for conducting and methods for calibrating hybrid probability/non-probability surveys (David Dutwin)

o Estimating the size of the LGB population with a random telephone/Internet survey and a joint Internet volunteer survey and taking advantage of the two to increase the analytical LGB subsample (Stéphane Legleye and Géraldine Charrance)

o Blending probability and nonprobability samples for survey inference under a Bayesian framework (Joseph W. Sakshaug, Arkadiusz Wisniowski, Diego Perez-Ruiz, and Annelies Blom)

15:00 – 15:30 Coffee break

15:30 – 17:00 Session 7: Nonprobability Survey Research and Big Data
o How research on probability and nonprobability panels can inform passive data collection studies (Bella Struminskaya)

o Probability, nonprobability sampling, and Big Data (Andreas Quatember)

o Digital trace data: just another nonprobability sample? (Josh Pasek)

Abstracts

Session 1: Professional Respondents and Response Quality

Professional respondents: are they a threat to probability-based online panels as well?

Edith D. de Leeuw

A major concern about the quality of non-probability online panels centers on the presence of ‘professional’ respondents. In reaction to criticism of non-probability panels, probability-based online panels have been established. However, probability-based panels suffer from initial nonresponse during panel formation, with the danger of selective nonresponse. Are probability-based panels a safeguard against professional respondents?

In the Netherlands, a large study (NOPVO) of 19 opt-in online panels reports on professional respondents. We partly replicated their study in two Dutch probability-based online panels: a probability sample of the general population of the Netherlands (LISS panel) and a probability sample of the four largest ethnic minority groups in the Netherlands (LISS immigrant panel).

In the probability-based Dutch online panels, the number of panel memberships was lower than in the NOPVO panels: 84.5% of the LISS panel members and 80.3% of the immigrant panel members did not belong to other panels, while in the NOPVO panel study only 38% were not members of multiple panels. In the NOPVO study, on average more than 80% of the respondents reported having completed more than one survey in the past 4 weeks, while in the two probability-based panels this was less than 40%.

Response quality in nonprobability and probability-based online panels

Carina Cornesse and Annelies Blom

The ongoing debate about the quality of nonprobability online panels predominantly discusses whether or not these panels have representative sets of respondents. While the number of publications on nonprobability panel representativeness is increasing, less attention has so far been paid to potential measurement errors in nonprobability as compared to probability panels. In our paper, we investigate whether there are differences in satisficing across probability and nonprobability online panels using three indicators to operationalize survey satisficing (item nonresponse and non-substantive answers, straight-lining in grids, and mid-point selection in a visual design experiment). These indicators are included in a questionnaire module that was implemented across nine online panels in Germany: one academic probability online panel that includes the offline population, one commercial probability online panel, and seven nonprobability online panels, all differing with respect to their sampling and recruitment methods. Our analyses show significantly less straight-lining in probability than in nonprobability online panels, but no significant differences regarding mid-point selection. With respect to non-substantive answers, we find that significantly more respondents in the probability than in the nonprobability panels say that they don’t know who they voted for in the last general election or refuse to report their height and body weight.
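As a rough illustration of how two of these satisficing indicators can be operationalized, the following sketch uses hypothetical data and variable names (it is not the authors' code) to compute a straight-lining rate and an item-nonresponse rate from a respondent-by-item matrix; the mid-point selection indicator would be computed analogously.

import pandas as pd

# Hypothetical grid question: rows are respondents, columns are items,
# values are answer codes on a 1-5 scale; NaN marks item nonresponse.
grid = pd.DataFrame({
    "item_1": [1, 3, 5, None, 3],
    "item_2": [1, 4, 5, 2, 3],
    "item_3": [1, 2, 5, 2, 3],
})

# Straight-lining: a respondent gives the identical answer to every item in the grid.
answered = grid.dropna()
straightlining_rate = (answered.nunique(axis=1) == 1).mean()

# Item nonresponse: share of missing answers over all respondent-item combinations.
item_nonresponse_rate = grid.isna().mean().mean()

print(f"straight-lining: {straightlining_rate:.1%}, item nonresponse: {item_nonresponse_rate:.1%}")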

Session 2: Sample Accuracy

Comparing complex measurement instruments across probabilistic and non-probabilistic online surveys

Stefan Zins, Henning Silber, Tobias Gummer, Clemens Lechner, and Alexander Murray-Watters

The quality of non-probabilistic samples is often determined only by comparing estimated frequency distributions of items with benchmark distributions, e.g. obtained from official statistics or probabilistic samples (Yeager et al., 2011). However, simple frequencies are often not of primary interest; rather, dependencies between the measured variables are at the core of theory building and testing in the behavioural sciences. In order to compare possible method effects of different non-probabilistic sampling designs with respect to the joint distribution of the measured variables, an established measurement instrument (the BFI-2-S; see Soto and John, 2017) is evaluated with measurements from two non-probabilistic samples and one probabilistic sample. The probabilistic mixed-mode sample serves as a reference. A study that has already been carried out examines the first non-probabilistic sample type, an online access panel. As a second non-probabilistic sample type we plan to analyse a so-called river sample. River sampling is a relatively uncontrolled recruitment of respondents via ads on various websites. This type of sampling has not yet been taken up in any comparative study of this kind in Germany and only rarely internationally. The main goal of this method comparison study is to compare how well the different types of samples can be used to model the latent variables of the measurement instrument. For this we will evaluate two alternative methods of modelling the latent variables: the standard method based on factor analysis, and a causal search algorithm (FOFC).
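As a minimal sketch of the standard factor-analytic approach (the data, sample sizes, and the use of scikit-learn's FactorAnalysis are illustrative assumptions; the FOFC causal search algorithm is not shown), one could fit the latent factor model separately in each sample type and compare the recovered loading structures:

import numpy as np
from sklearn.decomposition import FactorAnalysis

def simulate_items(n_respondents, seed):
    """Simulate 15 item responses driven by 5 latent traits (illustration only)."""
    rng = np.random.default_rng(seed)
    latent = rng.normal(size=(n_respondents, 5))
    true_loadings = rng.normal(size=(5, 15))
    return latent @ true_loadings + rng.normal(scale=0.5, size=(n_respondents, 15))

def loadings(item_responses, n_factors=5):
    """Fit a latent factor model and return the items-by-factors loading matrix."""
    fa = FactorAnalysis(n_components=n_factors, random_state=0)
    fa.fit(item_responses)
    return fa.components_.T  # shape: (n_items, n_factors)

probability_sample = simulate_items(1000, seed=1)  # stand-in for the reference sample
river_sample = simulate_items(800, seed=2)         # stand-in for the river sample

# Crude similarity check: correlations between the loading columns of the two solutions.
L_prob, L_river = loadings(probability_sample), loadings(river_sample)
congruence = np.abs(np.corrcoef(L_prob.T, L_river.T)[:5, 5:])
print("max similarity per probability-sample factor:", congruence.max(axis=1).round(2))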

Comparing web nonprobability-based surveys and telephone probability-based surveys with register data: the case of Global Entrepreneurship Monitor in Luxembourg

Cesare A. F. Riillo

Failure to predict Brexit and the US election outcome has called into question the accuracy of polls and surveys. This study contributes to this debate by assessing the Total Survey Error of two surveys on entrepreneurship. One survey is probability-based and is conducted by fixed-line phone. The other survey is based on an opt-in web panel and is nonprobability-based. The same questions are administered in both surveys. The study assesses which survey better resembles official register data in terms of the distribution of socio-demographic characteristics of respondents and in terms of the variable of interest (entrepreneurial activity). The research is based on the Global Entrepreneurship Monitor (GEM) survey data for Luxembourg. GEM interviews individuals to collect internationally comparable entrepreneurship information. There are two survey designs for the GEM survey in Luxembourg: telephone interviews (random dialling of fixed lines) and web interviews (from an opt-in web panel). The research is conducted in three steps. First, I compare the distribution of socio-demographic characteristics of respondents of both survey designs (web and telephone) with official data (census and business demography). Second, econometric analysis (multivariate regression, Oaxaca decomposition and Coarsened Exact Matching) disentangles differences in entrepreneurial activity in terms of observable and unobservable characteristics in both survey designs. Finally, I test how effectively weighting adjustments correct for the entrepreneurial activity bias. Results show that neither survey perfectly emulates official data. Both the telephone and the web survey underestimate the proportion of low-educated adults as recorded in census data. Additionally, fixed-line respondents are considerably older than the census population. In terms of entrepreneurship, the main variable of interest, the fixed-line survey underestimates the proportion of adults owning or managing a firm, while the web survey overestimates it. Current weighting procedures fail to account for the different survey designs (probability- and nonprobability-based) and do not correct for the bias. The study highlights the challenges of survey data collection: neither the web nor the telephone survey perfectly emulates official data, and the current weighting procedure is not appropriate. A Monte Carlo study is planned to compare weighting procedures that account for probability-based and nonprobability survey data.
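A minimal sketch of the first step, comparing a survey's socio-demographic distribution with an official benchmark, could look as follows; all numbers, categories, and column names are hypothetical.

import pandas as pd

# Register benchmark: population shares by education level (illustrative numbers only).
benchmark = pd.Series({"low": 0.30, "medium": 0.40, "high": 0.30}, name="register")

def education_bias(survey, benchmark):
    """Difference between the survey's education distribution and the benchmark shares."""
    observed = survey["education"].value_counts(normalize=True)
    return observed.reindex(benchmark.index, fill_value=0.0) - benchmark

# Toy stand-ins for the telephone and web samples.
telephone = pd.DataFrame({"education": ["low"] * 20 + ["medium"] * 45 + ["high"] * 35})
web = pd.DataFrame({"education": ["low"] * 15 + ["medium"] * 40 + ["high"] * 45})

print(pd.concat({"telephone": education_bias(telephone, benchmark),
                 "web": education_bias(web, benchmark)}, axis=1))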

Does sampling matter? Evidence from personality and politics

Mahsa H. Kashani and Annelies Blom

Internet surveys have provided survey methodologists with a faster, cheaper, and easier way to collect data. Nevertheless, the majority of Internet surveys suffer from a grave methodological setback: because there is no sampling frame of Internet users, the results cannot be easily generalized to the population. This project aims to contribute to the recent literature that explores the extent to which non-probability samples can be used to shed light on the population. Using data collected in 2015 from eight different non-probability Internet samples and one probability-based Internet sample of the German population aged 18-70, I ask: Are there differences in correlations derived from probability and non-probability samples? I answer this by looking at correlations between the Big Five personality traits and political behaviors. Results present a mixed picture for using non-probability samples. In terms of predicting political participation and political interest, the results generated by both types of samples were statistically indistinguishable from each other. However, the results were inconsistent among the models predicting vote choice, suggesting that, at least when it comes to predicting certain political behaviors, non-probability samples are not nearly as reliable as some have suggested.
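The kind of comparison described, a trait-outcome correlation estimated separately in each sample type, can be sketched as follows; the data, variable names, and the Fisher-z confidence interval are illustrative assumptions rather than the study's actual analysis.

import numpy as np
from scipy import stats

def correlation_ci(x, y, alpha=0.05):
    """Pearson correlation with an approximate Fisher-z confidence interval."""
    r = stats.pearsonr(x, y)[0]
    z, se = np.arctanh(r), 1.0 / np.sqrt(len(x) - 3)
    zcrit = stats.norm.ppf(1 - alpha / 2)
    return r, np.tanh(z - zcrit * se), np.tanh(z + zcrit * se)

rng = np.random.default_rng(1)
for label, n in [("probability sample", 1500), ("non-probability sample", 1000)]:
    openness = rng.normal(size=n)                   # hypothetical Big Five trait score
    interest = 0.3 * openness + rng.normal(size=n)  # hypothetical political interest measure
    r, lo, hi = correlation_ci(openness, interest)
    print(f"{label}: r = {r:.2f}, 95% CI [{lo:.2f}, {hi:.2f}]")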

Session 3: Conceptual Issues in Probability-Based and Nonprobability Survey Research

The association between population representation and response quality in probability-based and nonprobability online panels

Alexander Wenz, Carina Cornesse, and Annelies Blom

Despite the vast increase in the number of online panels over the past decade, online panel data quality continues to be a hotly debated issue in the survey methodological literature. This debate mainly focuses on the question of whether probability and nonprobability online panels produce data that allow valid inference to the general population in terms of sample accuracy. Less attention has so far been drawn to the amount and type of measurement error that is produced by these types of online panels. From a Total Survey Error perspective, however, both sample accuracy and response quality are crucial aspects of general data quality. In our study, we investigate sample accuracy and response quality of the data produced by eight nonprobability online panels and two probability online panels in Germany. We assess sample accuracy by comparing the data to official statistics benchmarks and investigate response quality by examining a number of indicators, including item nonresponse rates and straight-lining in grid questions. Preliminary results suggest that probability online panel samples are more accurate with regard to socio-demographic characteristics as well as political participation than nonprobability online panel samples. In addition, probability online panels have a higher response quality than nonprobability online panels with regard to the amount of straight-lining in grid questions. All online panels obtain a high response quality with regard to the amount of item nonresponse. We conclude that data quality is generally better in probability than in nonprobability online panels. In research where sample accuracy is of less importance, however, nonprobability samples may suffice.

Probability vs. nonprobability or high-information vs. low-information?

Andrew Mercer

Traditionally, the argument in favor of probability sampling has been the fact that when respondents are selected from a complete population frame with a known probability of inclusion, the sample distribution is guaranteed to match the population distribution on average for any variable that can be measured accurately. High rates of nonresponse for many probability-based surveys mean that valid inferences have come to depend less on random selection and more on modeling assumptions. This state of affairs has led some to suggest that all samples are nonprobability samples. In terms of statistical theory, it is true that for both probability samples with nonresponse and nonprobability samples, valid inferences depend on similar sets of assumptions and correctly specified models. However, the process of randomly selecting respondents from a reasonably complete sampling frame may have important implications for data quality that have nothing to do with the statistical properties of random samples. The most important differences have to do with visibility into the process that produces a sample and the amount of information available to researchers for use in statistical modeling and analysis. While probability and nonprobability samples face many similar challenges, the difficulties for opt-in samples are compounded by opaque recruitment and sampling procedures that vary considerably across sample providers. Through this lens, I will review the findings from Pew Research Center's program of research into the accuracy of survey estimates from both probability-based and opt-in samples as well as some notable examples from the literature. Although important differences remain, many concerns about the use of opt-in samples could be reduced through greater transparency on the part of sample providers regarding recruitment and selection procedures.

Non-probability based online panels: market research practitioners' perspective

Wojciech Jablonski

According to Groves (2011), there is a breach between government and academic surveys on the one hand and the private sector on the other. These fields are disconnected as far as the utilization of methodological knowledge is concerned: in market research, practitioners are less prone to follow the guidelines arising from methodological analyses (Smith 2009), which works to the detriment of survey research. In the presentation, we will focus on the results of qualitative research carried out in 2017 among top experts (based in the UK, the Netherlands, Germany, and Belgium) in the field of market surveys. We conducted 13 in-depth interviews with (a) representatives of research associations: ESOMAR, EFAMRO, MRS (Market Research Society, UK-based), and MOA (Center for Marketing Insights-Research-Analytics, Netherlands-based); (b) senior survey specialists in leading market research agencies: GfK, IPSOS, and Skopos; and (c) a market research professional working in a large international technology company that orders and conducts surveys on a regular basis. The aim of these interviews was to identify common market research practices which may affect the quality of survey data, investigate the reasons for performing these practices (e.g., insufficient methodological knowledge of survey research professionals, inflexibility of the clients), and point out factors that contribute to the client's attitude towards quality issues (e.g., research field, research agency, mode of data collection). Issues related to online panels are among those most often mentioned by the experts. Problems with the sampling techniques used by research providers (probability sampling is almost never used in the industry; most research relies on non-probability based online panels) and with panel representativeness (it is not always directly communicated; the possibility of coverage bias is not mentioned; methodologies are often not transparent) are examples of such practices.

Session 4: Practical Considerations in Online Panel Research

Replenishment of the Life in Australia Panel

Benjamin Phillips and Darren W. Pennay

The Life in Australia panel is Australia's only probability-recruited panel, with a current panel size of 3,322 members and approximately 2,100 completed surveys per wave. We describe our first panel replenishment, which will be in the field as of the conference date. The design goals are to have approximately 2,000 surveys completed per wave and to reduce biases in panel composition: adults aged 55+ are over-represented by 18 percentage points, university graduates are over-represented by 18 percentage points, and women are over-represented by 2.5 percentage points. Steps to reduce the biases include the following. First, new panellists will be recruited solely from the mobile RDD frame, because panellists recruited from the mobile frame were younger and more likely to be male, and because of declining landline coverage. (Initial panel members were recruited from both the mobile and landline RDD frames.) Second, recruitment will be selective, with only prospective panellists aged less than 55 being eligible and a subsampling rate of 50% applied to prospective panellists with a university degree. Third, inactive panellists with fewer than 20% of surveys completed will be retired. Fourth, active panellists aged 45+ will be subsampled at various rates. Weighting considerations will also be addressed. We are looking for feedback from others' experiences with panel refreshment. How have other panels addressed weighting samples from multiple recruitment waves? How are existing panellists selected for removal from the panel? For any panels recruited using dual-frame RDD, how is decline in the coverage of the landline frame addressed in replenishment?
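A toy sketch of the eligibility, subsampling, and retirement rules described above (hypothetical data and field names; not the panel's production code):

import numpy as np
import pandas as pd

rng = np.random.default_rng(2)

# Hypothetical prospective recruits from the mobile RDD frame.
prospects = pd.DataFrame({
    "age": rng.integers(18, 80, size=1000),
    "university_degree": rng.random(size=1000) < 0.4,
})

# Eligibility: only prospects aged under 55; degree holders subsampled at a 50% rate.
eligible = prospects[prospects["age"] < 55].copy()
keep_degree = rng.random(size=len(eligible)) < 0.5
eligible = eligible[~eligible["university_degree"] | keep_degree]

# Retirement: inactive panellists with fewer than 20% of surveys completed are retired.
panel = pd.DataFrame({"surveys_completed_share": rng.random(size=3322)})
retained = panel[panel["surveys_completed_share"] >= 0.20]

print(f"eligible recruits: {len(eligible)}, panellists retained: {len(retained)}")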

Terms of agreement: the inclusion of Muslim minorities

Elisabeth Ivarsflaten and Paul Sniderman

This study reports the discovery of terms on which majority citizens are supportive of inclusion of cultural diversity and Muslim minorities. Introducing the use of repeatable templates in sequential factorials, iterative experimental trials establish support for progressively more stringent standards of acknowledgement of worth and a readiness to widen the boundaries of the national identity. Methodologically, this study demonstrates the value of sequential factorial designs in reconciling the conflicting objectives of making a new discovery versus replicating previous results. Drawing on experimental trials conducted in four countries in Western Europe and the U.S., between 2012 and 2017, the results of this research bring into view for the first time terms on which native citizens are supportive of inclusion.

Do you get what you asked for? On the implementation of a survey in nonprobability online panels

Daniela Ackermann-Piek and Annelies Blom

When implementing a survey across seven commercial online panels in Germany in May 2015, we were struck by the great diversity in implementation quality across panels. Since then, we have been approached by numerous colleagues for advice on what to look out for when contracting a commercial online panel. This paper presents the ensuing results of a systematic analysis of the implementation quality of our survey in the seven commercial online panels. While survey methodological research has collected and published insights into the process quality and survey errors of probability-based surveys in different modes, still little is known about errors in nonprobability online panels and even less about how these errors arise, i.e. how such surveys are actually conducted in the field. Presumably, this lack of information is largely due to the proprietary nature of the commercial nonprobability survey sector. Yet even scientific research has so far done little to reduce this knowledge gap.

Our study systematically monitored various indicators of the quality of the survey implementation process. More specifically, we investigate what the commercial online panels promised to do to fulfill our research needs as stipulated in our call for tender. We then compare these promises to the actual outcomes realized during the survey implementation. In particular, the indicators examined relate to the following areas: panel recruitment, representativeness, costs, sample size (across waves), questionnaire implementation, data set quality, and quality of the technical report. In addition, we evaluate the amount of work involved on our side (i.e. labor costs) to ensure the desired data quality.

Keynote

An update on the accuracy of probability sample surveys and non-probability sample surveys

Jon A. Krosnick

The last decade has seen an explosion in the use of non-probability samples in online research, as well as slower growth in the development of probability-sample online panels. And for the last 20 years, scholars have been comparing the accuracy of various sampling methods used for telephone and online surveys. This address will review some of this literature and report brand-new findings on the accuracy of probability and non-probability samples, including river samples. To compare the quality of data obtained from these sorts of samples, Dr. Krosnick has conducted a series of studies comparing the same questionnaire administered simultaneously in RDD telephone interviews or face-to-face interviews of probability samples, a probability sample interviewed online, and many online survey groups that employed non-probability samples. He will compare the results to assess data accuracy and reach conclusions about the optimal mode for survey data collection. The results point toward danger in the future of survey methodology.

Session 5: Variance Estimation and Weighting Adjustments

Precision of estimates based on non-probability online panels

Marek Fuchs and Tobias Baier

Non-probability online panels offer advantages compared to probability-based online surveys. Among others, non-probability online surveys provide access to large sample sizes at relatively low cost. Also, fieldwork is typically less burdensome and field durations are shorter. However, issues concerning the quality of survey data obtained from non-probability online panels pose a major challenge. To this point, research has mainly focused on metrics which compare the accuracy of point estimates derived from non-probability-based panels to benchmarks drawn from probability-based online panels (Callegaro et al. 2014). Aside from the accuracy of point estimates, variance estimates pose a challenge for data obtained from non-probability-based online surveys. As non-probability-based surveys are prone to self-selection of respondents, their variance estimates are expected to differ from variance estimates derived from probability-based surveys. In particular, it is assumed that standard errors for estimates based on non-probability online panels are smaller, which leads to seemingly smaller confidence intervals but at the same time increases the risk of committing alpha errors. In this paper we aim to test this assumption by comparing variance estimates and standard errors for identical estimates derived from probability-based and non-probability-based online surveys. For the analysis we used data collected by the Pew Research Center (Kennedy et al. 2016) for a comparison of the American Trends Panel (ATP), a probability-based online study, to eight non-probability-based panels obtained from different vendors.
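One way to make such a comparison concrete is through Kish's effective sample size, which translates unequal weights into the precision implied for a weighted estimate; the sketch below uses invented data and weights and is not the authors' analysis.

import numpy as np

def weighted_proportion_se(y, w):
    """Weighted proportion and its standard error based on Kish's effective sample size."""
    p = np.average(y, weights=w)
    n_eff = w.sum() ** 2 / (w ** 2).sum()  # Kish effective sample size
    return p, np.sqrt(p * (1 - p) / n_eff)

rng = np.random.default_rng(3)
y = (rng.random(2000) < 0.45).astype(float)             # hypothetical binary survey outcome

equal_weights = np.ones(2000)                           # e.g. an unweighted sample
variable_weights = rng.lognormal(sigma=0.8, size=2000)  # e.g. heavily adjusted weights

for label, w in [("equal weights", equal_weights), ("variable weights", variable_weights)]:
    p, se = weighted_proportion_se(y, w)
    print(f"{label}: p = {p:.3f}, SE = {se:.4f}")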

Improving estimates from non-probability online surveys

Dina Neiger, Andrew C. Ward, Darren W. Pennay, Paul J. Lavrakas, and Benjamin Phillips

A well-designed and executed survey based on a probability sample has traditionally been the preferred approach for drawing inferences about the population. Given the cost and time required for probability surveys, as well as the difficulty of targeting rare populations, non-probability surveys are increasingly common, even beyond market research. Probability surveys are commonly weighted to the population distribution of key geo-demographic variables in an effort to account for the effects of differential non-response. For non-probability surveys, however, similar weighting adjustments often do not result in bias reduction due to different mechanisms of sample selection combined with strict quota controls on the same geo-demographic dimensions that are normally used in post-stratification. Alternatives such as blending, calibration, and propensity-based weighting have shown benefit, but there is limited research available comparing the impact of different methods on reducing bias, especially outside the U.S. We assess a range of weighting adjustments to reduce the bias of survey estimates from non-probability samples. Using the Life in Australia™ (LinA) panel, Australia's only probability-based online panel, in conjunction with data from the Australian Online Panels Benchmarking Study, we evaluate a number of different approaches to incorporating and improving the results of non-probability panels. These include the use of LinA as the reference sample to calculate pseudo-probability design weights for the non-probability samples, as well as adding key differentiators between probability and non-probability samples to the geo-demographic variables used in the post-stratification weighting adjustments. By comparing estimates of key outcome variables with independent benchmarks, we provide general guidance for reducing bias and improving the robustness of survey estimates from non-probability online surveys.
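A minimal sketch of the reference-sample idea, deriving pseudo-design weights for an opt-in sample from a logistic propensity model, might look as follows; the covariates, sample sizes, and modelling choices are illustrative assumptions, not the authors' implementation.

import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(4)

# Hypothetical covariates observed in both the probability reference sample and the opt-in sample.
reference = pd.DataFrame({"age": rng.normal(50, 18, 1500), "degree": rng.random(1500) < 0.3})
optin = pd.DataFrame({"age": rng.normal(42, 15, 3000), "degree": rng.random(3000) < 0.5})

# Model the propensity of appearing in the opt-in sample rather than the reference sample.
stacked = pd.concat([reference.assign(in_optin=0), optin.assign(in_optin=1)], ignore_index=True)
model = LogisticRegression(max_iter=1000).fit(stacked[["age", "degree"]], stacked["in_optin"])

# Pseudo-design weight: inverse odds of opt-in membership, down-weighting over-represented groups.
p = model.predict_proba(optin[["age", "degree"]])[:, 1]
optin["pseudo_weight"] = (1 - p) / p
optin["pseudo_weight"] *= len(optin) / optin["pseudo_weight"].sum()  # normalise to the sample size

print(optin["pseudo_weight"].describe())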

Weighting and estimation strategies for probability-based and nonprobability panel research

Christian Bruch, Barbara Felderer, and Annelies Blom

This paper compares different methods of weighting adjustment and estimation with regard to probability-based and nonprobability panels. Our key question is whether weighting can compensate for both self-selection and sampling error. Representative survey data are needed to be able to infer survey findings to the general population. To enable longitudinal analyses, more and more probability-based and nonprobability online panels have been established in recent years. Following statistical theory, a probability sample in which each element of the population has the same probability of being sampled leads to unbiased estimates that can be generalized to the population. However, representativeness might be limited due to self-selection into the panel and panel attrition. In the case of nonprobability panels, even the sampling process itself might lead to non-representativeness of the panel data. To tackle these challenges, weighting adjustments have been developed to correct for reduced representativeness. These strategies adjust the survey data to match known population statistics (e.g. means and proportions). The usefulness of weighting strategies depends on the benchmark variables available from official statistics, which are mostly only roughly categorized. Moreover, high correlations between these weighting variables and the survey variables of interest are needed to be able to correct the estimates of the latter. Our research includes complex weighting and multilevel estimation as well as simple raking methods.
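As an illustration of the simple raking methods mentioned, the following sketch implements iterative proportional fitting to two invented population margins; it is a generic example, not the authors' code.

import numpy as np
import pandas as pd

def rake(df, margins, n_iter=50):
    """Return weights such that the weighted margins of df match the target proportions."""
    w = pd.Series(1.0, index=df.index)
    for _ in range(n_iter):
        for var, targets in margins.items():
            current = w.groupby(df[var]).sum() / w.sum()
            w = w * df[var].map(pd.Series(targets) / current)
    return w * len(df) / w.sum()

rng = np.random.default_rng(5)
survey = pd.DataFrame({
    "sex": rng.choice(["female", "male"], size=1000, p=[0.6, 0.4]),
    "age_group": rng.choice(["18-39", "40-64", "65+"], size=1000, p=[0.5, 0.35, 0.15]),
})
population_margins = {
    "sex": {"female": 0.51, "male": 0.49},
    "age_group": {"18-39": 0.35, "40-64": 0.40, "65+": 0.25},
}

survey["weight"] = rake(survey, population_margins)
print(survey.groupby("sex")["weight"].sum() / survey["weight"].sum())  # should match the margin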

Session 6: Combining Probability-Based and Nonprobability Samples

Rationale for conducting and methods for calibrating hybrid probability/non-probability surveys

David Dutwin

Probability and non-probability research in the U.S. have separate problems. Probability samples suffer a crisis in confidence and increasingly high costs. With little recent research on probability-based nonresponse bias, and telephone response rates that are often below 10%, can we be confident in the quality of probability-based data? On the other hand, the issue with non-probability samples is better documented by research, for example by Krosnick, Dutwin, Walker, and others: non-probability samples on average hold far more bias than probability samples, and the variance of that bias is also much higher, leading to both higher mean bias and a greater frequency of outlier point estimates. This presentation will briefly document recent research on bias in probability samples, showing that while nonresponse error is greater than ever before, nonresponse bias has changed little over 20 years of telephone research. With that in mind, the presentation turns to the more central question: can one conduct studies that leverage the data quality of probability samples to "rein in" the bias of non-probability samples, but also capitalize on the low cost of non-probability surveys? Researchers are increasingly using probability estimates to calibrate point estimates in non-probability data. I provide three overarching approaches and one in-depth example of a process by which a study that gathers both probability and non-probability data can leverage the probability estimates to develop a unified overall sample. Briefly, the process entails first identifying both candidate calibration variables and candidate interaction terms that will have the greatest effectiveness in calibration, and then developing and applying a calibration process with those and other benchmarks. Data are calibrated with processes considered "light" versus "heavy", and the results of these processes will be considered in terms of bias reduction and variance inflation.

Estimating the size of the LGB population with a random telephone/Internet survey and a joint Internet volunteer survey and taking advantage of the two to increase the analytical LGB subsample

Stéphane Legleye and Géraldine Charrance

Surveying and assessing the size of rare populations such as lesbian, gay, bisexual, and transgender (LGBT) people is a scientific and political challenge. LGBT people are exposed to discrimination that could reduce their participation in general population surveys (GPS), while questioning by an interviewer on sexual orientation could generate insincere responses, resulting in an underestimation of the true LGBT proportion.

“Virage GPS” is a random telephone survey conducted in 2015 among subjects aged 20 to 69 (n = 27,268, including 503 LGB; transgender respondents were not identified), in which 634 individuals responded via the Internet. “Virage LGBT” is a volunteer Internet survey using the same questionnaire but targeting the LGBT population (n = 7,148), in which 5,790 LGB respondents were collected. Using the paradata of Virage GPS and the Virage LGBT data, we did not find any solid evidence for an overall nonresponse bias specific to LGB respondents in Virage GPS. However, when the interviewer is male, LGB orientation is under-reported (odds ratio OR = 0.69 for both sexes), and reporting of LGB status is twice as frequent on the Internet as on the telephone (OR = 1.93 for both sexes). The proportion of LGB in Virage GPS could thus be closer to 3% than to the 1.6% initially found. To overcome the small number of LGB respondents in Virage GPS, we selected respondents from Virage LGBT using a matching technique before calibrating the composite sample with the corrected LGB proportion as an additional margin. Advantages and limits of the procedure are discussed.

Blending probability and nonprobability samples for survey inference under a Bayesian framework

Joseph W. Sakshaug, Arkadiusz Wisniowski, Diego Perez-Ruiz, and Annelies Blom

Scientific surveys based on random probability samples are ubiquitously used in the social sciences to study and describe large populations. They provide a critical source of quantifiable information used by governments and policy-makers to make informed decisions. However, probability-based surveys are increasingly expensive to carry out and declining response rates observed over recent decades have necessitated costly strategies to raise them. Consequently, many survey organizations have shifted away from probability sampling in favor of cheaper non-probability sampling based on volunteer web panels. This practice has provoked significant controversy and scepticism over the representativeness and usefulness of non-probability samples. While probability-based surveys have their own representativeness concerns, comparison studies generally show that they are more representative than non-probability surveys. Hence, the survey research industry is in a situation where probability sampling is the preferred choice from an error perspective, while non-probability sampling is preferred from a cost perspective. Given the advantages of both sampling schemes, it makes sense to devise a strategy to combine them in a way that is beneficial from both a cost and error perspective. We examine this notion by evaluating a method of integrating probability and non-probability samples under a Bayesian inferential framework. The method is designed to utilize information from a non-probability sample to inform estimations based on a parallel probability sample. The method is evaluated through a real-data application involving two probability and eight non-probability surveys that fielded the same questionnaire simultaneously. We show that the method reduces the variance and mean-squared error (MSE) of a variety of survey estimates, with only small increases in bias, relative to estimates derived under probability-only sampling. The MSE/variance efficiency gains are most prominent when a small probability sample is supplemented by a larger non-probability sample. Using actual cost data we show that the Bayesian data integration method can produce cost savings for a fixed amount of error relative to a standard probability-only approach.
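To convey the general flavor of such data integration, here is a deliberately simplified conjugate-normal sketch in which the non-probability sample supplies a (variance-inflated) prior for a mean that is then updated with the probability sample; all numbers, the inflation factor, and the model itself are assumptions for illustration, and the authors' actual framework may differ.

import numpy as np

rng = np.random.default_rng(6)

# Hypothetical survey variable measured in both samples.
nonprob = rng.normal(loc=5.4, scale=2.0, size=5000)  # large, possibly biased opt-in sample
prob = rng.normal(loc=5.0, scale=2.0, size=300)      # small probability sample

# Prior from the non-probability sample, with its variance inflated to reflect the
# extra uncertainty attached to a self-selected sample (factor 4 is an arbitrary assumption).
prior_mean = nonprob.mean()
prior_var = 4.0 * nonprob.var(ddof=1) / len(nonprob)

# Likelihood summary from the probability sample, then the standard conjugate update.
like_mean = prob.mean()
like_var = prob.var(ddof=1) / len(prob)

post_var = 1.0 / (1.0 / prior_var + 1.0 / like_var)
post_mean = post_var * (prior_mean / prior_var + like_mean / like_var)

print(f"probability-only estimate: {like_mean:.3f} (variance {like_var:.5f})")
print(f"blended posterior estimate: {post_mean:.3f} (variance {post_var:.5f})")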

Session 7: Nonprobability Survey Research and Big Data

How research on probability and nonprobability panels can inform passive data collection studies

Bella Struminskaya

Much scholarly attention has focused on the issues of coverage, nonparticipation, and measurement errors in nonprobability panels, as well as on possibilities for mitigating possible biases using weighting. A body of empirical evidence accumulated in recent years suggests that the errors are nonignorable and that nonprobability online panels should not be used when the goal is making inferences about the general population. The cost and time savings that made online panel research attractive not long ago now make found (big) data equally attractive to social, political, and health scientists. A growing number of empirical studies make use of these large volumes of data, usually with little or no concern for coverage and selection. In this talk, I will focus on studies that collect such large volumes of data passively, e.g., using the built-in GPS/accelerometer sensors of smartphones. After providing some examples of empirical studies classified by the design options they use, I will discuss in which ways research evidence from nonprobability vs. probability-based online panels can inform future studies using passive data collection. The talk will illustrate how (non)probability online panels can be utilized to study selection mechanisms, as well as discuss the advantages and possible problems of using nonprobability online panels for passive data collection studies. The talk aims to stimulate discussion and generate ideas for future research.

Probability, nonprobability sampling, and Big Data

Andreas Quatember

Is a probability sample with nonresponse still better than a nonprobability sample? What would be the answer to that question if the non-probability sample were considerably “bigger” than the probability sample (cf. Meng 2018)? Under laboratory conditions, both approaches to sampling look like they could not be more different: on the one hand, the probability sampling techniques, which clearly set the standard for the selection of samples, can be described under the uniform framework of known sample inclusion probabilities for the population units. On the other hand, the different nonprobability sampling methods do not have much more in common than the absence of such calculable inclusion probabilities. Consequently, these methods have to be discussed one by one. Considered under real conditions, the conclusions from probability as well as nonprobability samples have one thing in common: they are based on models. Hence, the quality of inference depends on the correctness of these model assumptions (and on the robustness of estimators against model violations). The same applies to the analysis of really big data sets. Usually, such data are not collected with the intention of drawing conclusions about somehow defined populations: when used to characterize populations other than the population of data providers, despite the “bigness” of the data, models have to be formulated.
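The trade-off raised by Meng (2018) can be summarized by his error decomposition for the mean of a self-selected sample, reproduced here for context (notation as commonly presented):

\bar{Y}_n - \bar{Y}_N \;=\; \rho_{R,Y} \times \sqrt{\frac{N-n}{n}} \times \sigma_Y

where N is the population size, n the realized sample size, \rho_{R,Y} the correlation between the inclusion indicator and the survey variable (the “data defect”), and \sigma_Y the population standard deviation of the variable. The sheer size of a nonprobability sample only helps if the data defect correlation is very close to zero, which is exactly what cannot be guaranteed in the absence of a probability selection mechanism.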

Digital trace data: just another nonprobability sample?

Josh Pasek

Public opinion scholars have shown increasing interest in recent years in leveraging digital trace data from social media posts and search engines. Much of the work using these kinds of data has been based on the assumption that online searches and expressions can be regarded as indicators of attitudes and behaviors from an unknown sample of individuals. Implicit in this approach is an expectation that digital traces might be subject to the same sorts of challenges as surveys derived from nonprobability sampling strategies. The current study examines this proposition by comparing three types of data as they relate to approval of Barack Obama between 2009 and 2014: probability-sample telephone surveys from nationally representative samples of Americans, nonprobability Internet surveys from national samples of Americans, and sentiment in Tweets about Barack Obama. Results of the comparison indicate that the two types of survey samples, while imperfectly corresponding, yielded trends over time far more similar to one another than either survey sample was to the Twitter sentiment data. These findings imply that digital trace data and survey data differ in ways more extreme than the differences between probability and nonprobability surveys. Further, although trends in all three data sources show a downward shift in approval over the course of the Obama presidency, the effects of seminal events on the two types of survey data streams were more similar to one another than the Twitter sentiment was to either survey, again indicating that the nature of the data is different.
