
ONLINE MR QUALITY: IS IGNORANCE BLISS?
By Neil Chakraborty, PRC, PMP, MPhil, MS

MRA’s Alert! Magazine, Fourth Quarter 2014

This article focuses on specific fundamental flaws of online research that professionals, deliberately or unknowingly, are guilty of committing in their everyday work. Digital technologies have recently fueled several innovations in the marketing research field, including new online methods, Internet focus groups and communities, which have changed the way marketing research data are collected, analyzed and employed to support management decisions. Online methods have simplified and accelerated data collection, reduced administrative and interviewing costs, made it possible to reach more geographically dispersed samples of respondents, and added flexibility through remote video capability.

Despite the growing use of online sampling in the marketing research industry, little attention is given to the methodological issues underlying the use of online samples or to the potential consequences of online research practices.

Since Internet surveys have become the most popular method of collecting data, researchers must minimize survey error so that study results are accurate, replicable and meet the survey measurement objectives. There are broadly two categories of survey error that arise due to [...]: sampling and non-sampling. Sampling error occurs because it is usually impossible to collect measurements from every member of the population, so the study is conducted with a sample of respondents that is not likely to be representative of the population of concern. Sampling error generally decreases as the sample size increases, and depends on the size of the population as well as on the variability of the characteristic of interest in the population. It can be accounted for and reduced by an appropriate sampling plan. Online, by contrast, non-sampling error is the most frequent and relevant error of concern to market researchers. Non-sampling errors arise from marketing research sampling that is not probability-based or “random.”

The nature of the errors is, in large part, influenced by the type of sampling method adopted in a study. Sampling can be grouped into two major classifications: (1) probability sampling, and (2) non-probability sampling. Probability sampling is scientifically generated and gives every member of the population being studied a known, and in the simplest designs equal, chance of participation. Panels that intend to survey an entire population use this type of sampling to representatively sample a specific population. Surveys using probability-based panels seem to constitute a small portion of marketing research. On the other hand, non-probability sampling is the most commonly used marketing research tool, where participants are selected or volunteer to participate as per convenience without a scientifically valid sampling plan
and usually don’t have any correspondence to the general representative population. This sampling makes no attempt to identify the [...] or to randomly select respondents. Any inferences made about population parameters from non-probability surveys are potentially problematic and violate the underlying principles of probability theory. Typical examples of non-probability sampling are convenience or haphazard, quota, volunteer and judgment sampling.

Sampling errors can only be measured and controlled in probability sample surveys. Sampling errors cannot be computed and reported for online panels because they rely mainly on non-probability sampling methods. A dubious metric frequently misused by marketing research professionals to report on sampling error is the margin of error. Margin of error is supposed to be used to calculate the sampling error for a probability sampling method, not a non-probability method, and is an inappropriate measure for opt-in surveys based on self-selected volunteers. Further, these surveys cannot necessarily be projected to the target population and are subject to non-systematic, unmeasurable biases. Mentioning such a metric misleads readers, giving a false impression that the results apply to the general population when, in fact, the margin of error for this particular survey is limited to the adult members of that particular online panel. Even in cases when opt-in surveys are drawn from probability samples of large pools of volunteers, their results suffer from serious validity and reliability issues due to unknown biases.

Errors in the context of non-probability sampling methods are not well understood and are ubiquitous in the marketing research industry. Too many projects use non-probability sampling methods without explicitly referencing the assumptions underlying these errors. Non-sampling errors in marketing research can be further subdivided into a number of errors, the most pertinent of which include self-selection, coverage, non-response and measurement errors.

Selection Error

Selection error occurs when a panel is made up of volunteers (“self selected”) instead of being derived from a designated sample. Surveys are administered with select respondents who have Internet access, visit the website and decide to participate in the survey. Lately, the industry has witnessed a mushrooming of online survey data collection firms or online community panels developed in-house at research suppliers. These firms generally recruit respondents through river sampling by placing banner advertisements on Web pages or purchasing lists of email addresses. As Web users sign up to join opt-in online panels, they are invited periodically to complete a survey (either for no reward or for a financial or nonfinancial incentive). The process of managing these volunteers within the online panels is fraught with problems. Panel companies don’t necessarily institute processes to screen unqualified volunteers out of panels, though some might do marginal filtering and channel people into specific surveys. Often, people who have strong opinions about a survey topic self-select into panels, and people who don’t care much are under-sampled. Thus, researchers generally agree that results from a voluntary response sample should be viewed with caution, and that the volunteer sample does not have the same characteristics as a probability-based sample drawn from random representation of the entire population.

Coverage Error

Coverage error is the bias introduced into a study when certain groups are excluded (e.g., non-Internet households or non-telephone households). Researchers quite frequently overlook that online sampling is not actually a [...] of the target population. It is systematically likely to exclude certain types of consumers in the area: people who don’t have an email address and who don’t have access to the Internet or even a computer. It will also miss people who do not wish to be surveyed, or who are very particular about with whom they interact online and have set heightened spam and unsolicited mail filters. For many large groups, the population cannot be pinned down with certainty as a simple list because the sampling frame does not exist. For example, it would be inconceivable to garner a list of email addresses from every member of the adult American population. Additionally, research shows that people with Internet access differ, on average, from those without Internet access.

Moreover, even if panels can get access to willing opt-in respondents, extraneous issues like message filtering can prevent users from knowing about a survey request. In some cases, potential respondents have multiple email addresses and might not check certain accounts regularly. An invitation to participate in an online survey does not have an equal chance of reaching every member of the population.

Alarmingly, despite what is being claimed as an example of the robustness of the online methodology, best estimates of Internet access in developed countries like the U.S. and Canada indicate that roughly one third of the adult population does not use the Internet on a regular basis. As an example, even in the U.S., where the panel size is by far the largest (three million members), only about two percent of adult Internet users are online at any given time (American Association for Public Opinion Research [AAPOR], 2010). As a majority of online non-probability panels rely merely on those who already are online, these have inherent and significant [...], primarily in the form of under-coverage error.

Nonresponse Error

One of the more powerful biases is nonresponse error, which arises from nonresponse to a specific survey. Nonresponse bias is not unique to Internet surveys, but is especially severe for Web-based surveys that have inconsistent recruitment procedures. AAPOR reports that the response rates for non-probability surveys have fallen drastically over the last several years to an abysmal point, as low as 10 percent. Studies have established that response rates vary widely among demographic groups, a pattern that is also likely to skew the results. Nonresponse can arise during each of the stages of the volunteer panel process: recruitment, joining and profiling, specific study sampling, and panel maintenance. Technological imperfections (e.g., browser incompatibility and slow Internet connections) might accentuate non-response among online surveys. Responses from Internet respondents can vary depending on whether panelists have basic literacy skills and can answer surveys according to requirements.

Further adding to the problem, research and panel firms don’t really strive to track and report on non-response rates as a way to understand the extent of bias that exists in non-probability panels. There is no consensus among panels on how to best measure response rates. Even if firms agree on response rate metrics, the measurement can be misleading, since the metric can be artificially boosted by pre-selecting
the most cooperative panel members and showing favorable numbers for the panel in consideration.

Measurement Error

Measurement error is generated by the measurement process itself, and represents the difference between the information generated and the information wanted by the researcher. Measurement errors can be caused by a few different factors ascribed to the respondent, the method (e.g., the questionnaire, interviewer or mode of data collection), and contextual factors (e.g., time pressure). There is a widespread concern among clients that online surveys are vulnerable to questionable data supplied by respondents who do not give an adequate level of thought to survey questions or deliberately provide fraudulent answers. These concerns arise from increasing evidence that only a select group of panelists complete a large share of surveys, and some manipulate screener and other qualifier questions to maximize their chances of qualifying. As a result, the industry has witnessed an alarming level of response bias in the form of socially desirable responding or satisficing, acquiescent responding, extreme responding, and straight-lining or midpoint responding.

In Internet panels, in addition to the issue of recruitment of volunteer respondents, it is imperative to remember that respondent answers might not be as bias-free as usually perceived. Financial incentives are often a significant influence on respondent participation. The fact that one group responded to the survey while others chose to opt out indicates bias stemming from many factors, including price-sensitivity and the loyal nature of those respondents. Additionally, if someone has a strong desire to win a prize, they can find ways around any safeguards against multiple responses and complete several surveys, thereby increasing their chances of winning. Occasionally, there are individuals with a vested interest in survey results who complete a particular online survey multiple times and urge others to influence the results. Some have even gone to the extent of developing automated programs that can repeatedly answer surveys to influence results. Unfortunately, the marketing research profession has not been able to keep up with the technological challenges. Hence, it becomes very difficult for researchers to figure out who is accessing the survey, their respective demographic background and location, and whether the information they are volunteering to share online is accurate.

At times, biases are introduced automatically with protracted surveys, usually ones more than 30 minutes long, as such surveys tend to be completed only by a certain segment of panelists for purely financial gain. Longer surveys can disengage respondents and jeopardize data quality and reliability. A number of studies have shown an increase in satisficing behaviors and break-offs after 18-20 minutes (ESOMAR, 2014). The more questions in the survey, the less time respondents spend on average per question, leading to satisficing or speeding behaviors. Only motivated respondents are likely to take the time to complete an overly long survey, in the process compromising data accuracy and quality. Similarly, bias can be evoked in the form of patterned responding due to a repeated series of matrix or grid-style questions, or a primacy effect, in which respondents pick options at the top of a list rather than the bottom in questions with a large number of answer categories. Likewise, extensive grid response formats increase stress due to the demands of the survey task and can induce higher levels of satisficing. In addition, over-reliance on intricate tasks and formats, such as slider bars and complex conjoint designs, can increase respondent burden and negatively impact data quality.

One approach frequently used by researchers to correct bias is standard demographic quota or weighting. To ensure balanced representation, the demographics of the sample are fine-tuned by imposing quotas at the time of data collection. In addition, post-stratification weighting is implemented to yield the appearance of demographic representativeness (Malhotra and Krosnick, 2007). The most common examples of demographic variables are age, gender and region, which are used to set quotas or weights on the data, invariably to project the study results onto the broader population. However, quotas or weights are not necessarily the best strategy, as these balancing schemes end up introducing more sample composition bias. As an example, if a specific project has older respondents participate in larger numbers to start with, the data will already be skewed toward the preferences and behaviors of older respondents. Weighting younger respondents more heavily to account for the discrepancy is not the right answer, as the younger respondents who are sampled may not be representative of the younger population. Weighting is a paradoxical and false attempt to portray a more accurate picture than the data will allow. The result is “photoshopped” survey results: the information looks nice and seems to be complete, but it isn’t representative or scientifically valid (Duda and Nobile, 2010).

Conclusion

The obvious issues and methodological fallacy of the increased usage of online surveys raise the question: How aware are we as a profession, and what steps can we take to mitigate the problem? Given the high penetration of online panel research, coupled with the proliferation of non-probability sampling in marketing research projects, researchers should be honest and upfront while reporting to their clients. Prior research has established that non-probability samples are largely less accurate and more biased than probability samples. A majority of studies found significantly different results on a wide array of behaviors and attitudes after comparing results from surveys using non-probability online panels with those using probability-based methods like RDD telephone.

Foremost, market researchers have to clearly state limitations regarding methodology and inferences in their online survey research, and need to ask: Do the study results tend to project onto the greater population? How much will generalizability, or lack thereof, impact the research findings? Researchers should avoid claims of “representativeness” among non-probability online panels. Too often, non-probability samples are used to draw quantitative conclusions, assuming a very simplistic but erroneous view of the world, thus leading to oversimplification of [...], the computation of standard errors, and the conduct of significance tests. Alternatively, if the idea is to extrapolate the results onto the
general population, the research should be based on probability sampling, which can be expensive, time-consuming and elaborate. Therefore, in marketing research projects using non-probability sampling, responsible researchers are obligated to disclose the exact nature of the respondents, sampling methodology and assumptions, and should be categorical in stating that the conclusions are confined to the sample at hand.

Figure. Distribution of Sampling versus Non-Sampling Errors in Marketing Research.
Probability sampling, sampling errors: Margin of error is calculated and reported as a way to measure the amount of sampling error that exists.
Probability sampling, non-sampling errors: The errors still exist, but steps can be taken to minimize them; not relevant to marketing research, as few studies use probability sampling.
Non-probability sampling, sampling errors: Commonly misreported; no statistics should be used for opt-in panels.
Non-probability sampling, non-sampling errors: Particularly severe in marketing research; varying degrees of errors, of which the most prevalent are self-selection, coverage, non-response and measurement errors.

A recent report from AAPOR states that, despite their obvious methodological limitations, non-probability surveys are most promising when based on models that attempt to deal with challenges to inference in both the sampling and estimation stages (AAPOR, 2013). There are occasions to employ convenience samples for marketing research decisions, such as in the “idea generation” phase of research. In many projects, where incidence rates can be low and researchers have to deal with hard-to-reach respondents (e.g., c-suite audiences), non-probability sampling may be the only feasible solution. Non-probability online sampling can also be applied when every single employee or consumer of a particular product or service has access to the Internet. Situations like employee or customer satisfaction surveys can be perfect examples, where the sample is limited to a manageable size and can provide adequate information on the targeted population characteristics.

Techniques such as propensity weighting or other simple weighting schemes may be useful in improving the representativeness of Internet survey samples. Developing multi-mode research by supplementing Web surveys with telephone surveys can help develop appropriate weighting schemes. To maximize reliability, a few panel companies have also made efforts to balance their panel populations to better represent the overall population. For example, Knowledge Networks, recently acquired by GfK, had developed a system to include non-Internet population segments by providing them with Internet access in an effort to correct this skew toward Internet usage (Pineau & Slotwiner, 2003). Polimetrix, bought by YouGov, provided another method by conducting sampling by location. Luth Research has developed a propensity score system as a way to normalize the self-selection bias in panel sampling (Luth, 2008). A word of caution: these approaches can minimize a lot of bias but cannot absolutely replicate the [...] of probability sampling.

Research and panel companies can devise multiple means to improve the quality of sampling and the data. They need to develop elaborate and sophisticated approaches to eliminate fake respondents and ensure data quality. They need to verify the background of the respondents so that the respondents truly are who they say they are in their online profiles. It is important to identify potentially inattentive or deceptive respondents through a variety of engagement metrics, such as survey completion time, proportion of unanswered questions, patterned responses in matrix or grid questions (e.g., straight-lining, random responding), detection of inconsistent responses, and red herring questions. The aim is to minimize their impact and reduce biases in a study’s overall findings.

It is logical to assume that, just like any other, a non-probability Web survey method has its drawbacks. As the industry is swiftly witnessing a paradigmatic and widespread shift toward online methods, researchers and consultants have an obligation to provide the full picture to their clients and consumers so that the generalizability of the results is treated with caution. At the same time, researchers need to minimize biases in sampling, which can go a long way in providing accurate estimates of target population characteristics.

References

AAPOR. 2010. AAPOR Report on Online Panels. http://www.aapor.org/AM/Template.cfm?Section=AAPOR_Committee_and_Task_Force_Reports&Template=/CM/ContentDisplay.cfm&ContentID=2223
AAPOR. 2013. Report of the AAPOR Task Force on Non-Probability Sampling. http://www.aapor.org/AM/Template.cfm?Section=Reports1&Template=/CM/ContentDisplay.cfm&ContentID=5963
Duda, M.D., & Nobile, J.L. 2010. “The Fallacy of Online Surveys: No Data Are Better Than Bad Data.” Human Dimensions of Wildlife 15(1): 55-64.
ESOMAR. 2014. ESOMAR/GRBN guideline for online sample quality. http://www. .org/uploads/public/knowledge-and-standards/codes-and-guidelines/ESOMAR-GRBN-draft-Online-Sample-Quality-Guideline-April-2014.pdf
Luth, L. 2008. “An Empirical Approach to Correct Self-Selection Bias of Online Panel Research.” 2008 CASRO Panel Conference.
Malhotra, Neil, and Jon A. Krosnick. 2007. “The Effect of Survey Mode and Sampling on Inferences about Political Attitudes and Behavior: Comparing the 2000 and 2004 ANES to Internet Surveys with Nonprobability Samples.” Political Analysis 15: 286-323.
Pineau, V., & D. Slotwiner. 2003. Probability Samples vs. Volunteer Respondents in Internet Research: Defining Potential Effects on Data and Decision-Making in Marketing Applications. Technical Paper, Knowledge Networks, California, USA.

Neil Chakraborty, PRC, PMP, MPhil, MS, is an experienced consultant and senior research manager. His expertise includes delivering in a wide range of research approaches and the ability to lead people and projects from initial concept development to implementation.
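
As an illustration of the engagement metrics named in the discussion of panel data quality (survey completion time, proportion of unanswered questions, straight-lining in grid questions, and red herring questions), the following is a minimal Python sketch of how such flags might be computed. It is not a method taken from the article or from any panel company: the `Response` record, the function names, and every threshold (one third of the median completion time, 20 percent missing items, at least five answered grid items) are hypothetical choices made only for illustration.

```python
from dataclasses import dataclass
from statistics import median
from typing import Dict, List, Optional, Sequence


@dataclass
class Response:
    """One completed interview (hypothetical record layout)."""
    respondent_id: str
    seconds_to_complete: float
    grid_answers: Sequence[Optional[int]]  # None marks an unanswered grid item
    passed_red_herring: bool               # False if a trap question was failed


def quality_flags(resp: Response,
                  median_seconds: float,
                  speed_ratio: float = 0.33,   # assumed cutoff, not a standard
                  max_missing: float = 0.20,   # assumed cutoff, not a standard
                  min_grid_items: int = 5) -> List[str]:
    """Return the engagement flags raised by a single response."""
    answered = [a for a in resp.grid_answers if a is not None]
    total = len(resp.grid_answers)
    missing_share = (total - len(answered)) / total if total else 0.0

    flags: List[str] = []
    # Speeding: finished far faster than the typical respondent.
    if resp.seconds_to_complete < speed_ratio * median_seconds:
        flags.append("speeding")
    # Item nonresponse: too large a share of grid items left blank.
    if missing_share > max_missing:
        flags.append("high_item_nonresponse")
    # Straight-lining: the same answer given to every answered grid item.
    if len(answered) >= min_grid_items and len(set(answered)) == 1:
        flags.append("straight_lining")
    # Red herring: failed a question with a known correct answer.
    if not resp.passed_red_herring:
        flags.append("failed_red_herring")
    return flags


def screen(responses: Sequence[Response]) -> Dict[str, List[str]]:
    """Flag every response against the median completion time of the batch."""
    med = median(r.seconds_to_complete for r in responses)
    return {r.respondent_id: quality_flags(r, med) for r in responses}


if __name__ == "__main__":
    batch = [
        Response("r1", 600, [4, 2, 5, 3, 4, 1], True),
        Response("r2", 120, [3, 3, 3, 3, 3, 3], True),
        Response("r3", 540, [5, None, None, 2, 4, 3], False),
    ]
    for rid, raised in screen(batch).items():
        print(rid, raised)
```

Flags of this kind identify candidates for review rather than proof of fraud, and, consistent with the article’s argument, removing flagged cases does not turn an opt-in sample into a probability sample.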
