GSSR
Survey Research (I):
Types of survey design
www.socialinquiry.wordpress.com

Survey research in social sciences:
= method: systematic collection of information from a large number of individuals (a sample) by asking questions (interview modes).
Key words:
• survey design
• target & survey population (universe)
• sample (design & frame)
• response rates
• interview mode
• survey instrument

Survey research in social sciences:
= the product(s) the survey method yields: survey data
Key words:
• types of surveys
• units of analysis
• survey data quality
• public use (data sharing)

The Survey Lifecycle
ccsg.isr.umich.edu/index.php/chapters

Survey design: academic & non-academic surveys
Academic
Non-academic
• Statistical office surveys (e.g., Current Population Survey; Labor Force Survey)
• Marketing surveys (e.g., in segmentation research to determine the demographic, psychographic & behavioral characteristics of potential buyers)
• Political opinion polls (e.g., polls organized by TV stations & newspapers)

Survey Design
Cross-sectional: single round of data collection
Longitudinal: multiple rounds of data collection
- on different samples: cross-sectional time series design
- on same sample of respondents: panel design
Monocultural vs. multicultural survey design
3MC Survey Projects, ccsg.isr.umich.edu/

Research design: Target & survey population
Target population: the population about which inferences will be made using the sample statistics.
Survey population: actual population from which survey data are collected, given various restrictions (e.g., citizens vs. any resident)

Sampling design (a)
Non-random:
Single-stage quota sampling
Expert sampling
Random walk: only interviews, without compiling an address listing of households; + quota

Sampling design (b)
Random, single stage:
simple random sampling
systematic random sampling
stratified random sampling
cluster sampling
matched-pairs sampling
with/without substitution of dropouts permitted during fieldwork

Sampling design (c)
Random, multi-stage:
Clustered sample
Stage 1: Selection of clusters via simple random sampling
Stage 1: Selection of clusters via systematic random sampling
Stage 1: Number of selected clusters
Clustering groups for stage n
Stage n: Selection of clusters via simple random sampling
Stage n: Selection of clusters via systematic random sampling
Stage n: Number of selected clusters

Sampling design (d)
Random, multi-stage:
Multi-stage stratified sample
Characteristics used for stratification at stage 1, 2, … n

Sampling design (e)
Random route/random walk:
Random walk: compiles address listing of households only (no interviews conducted at this step)
Random walk: compiles address listing of households and interviews target person identified with a Kish table
Random walk: compiles address listing of households and interviews target person identified with a quota

Sampling design (f)
Multi-stage random sample, with quota:
Groups for quota sample at stage 1
Stage 1/n: quota sample, proportional or not
Stage 1/n: planned size of quota samples, for each group

Sampling frame
Materials used to identify all elements of the survey population from which the sample is selected
Sampling units (primary, secondary, …)
Number of sampling frames used
Type of sampling frame: list-based (what type of list)
Was the sampling frame newly created for this survey?
Did the sampling frame already exist, but was updated for this survey?

Research design: Interview Mode(s)
In person, face-to-face: questionnaire on paper (PAPI)
In person, face-to-face: electronic/computerized questionnaire (CAPI)
Telephone interview, landline
Telephone interview, cell phones
Self-completion, postal
Self-completion, internet
Multiple interview modes within survey
Change of interview mode within respondent

Survey data quality
Analytic frameworks for quality assessment
“Survey quality” as understood by data producers and by data users
Two perspectives on survey quality (as general concept):
(a) Freedom from deficiencies: Total Survey Error
(b) Responsiveness to users’ needs: Survey Process Quality Management
Survey quality as multidimensional concept

Total Survey Error (TSE): estimate & reduce the mean square error of survey estimates, given financial, time and ethical constraints

Potential sources, TSE: representation
Source: ccsg.isr.umich.edu/index.php/chapters/survey-quality-chapter

Potential sources, TSE: measurement
Source: ccsg.isr.umich.edu/index.php/chapters/survey-quality-chapter

Survey Process Quality Management (SQM)
Overall survey quality follows from:
- ongoing quality control during all stages of survey production (individually and in conjunction with each other)
- survey’s responsiveness to customers’ needs:
  • comparability
  • relevance
  • accuracy (TSE)
  • timeliness
  • accessibility
  • interpretability

The SDR database
Selected international survey projects (23):
Asian Barometer
Afrobarometer
Americas Barometer
Arab Barometer
Asia Europe Survey
Caucasus Barometer
Consolidation of Democracy
Comparative National Elections Project
Eurobarometer
European Quality of Life Survey
European Social Survey
European Values Study
International Social Justice Project
International Social Survey Programme
Latinobarometro
Life in Transition Survey
New Baltic Barometer
New Europe Barometer
Political Action II
Political Action - An Eight Nation Study
Political Participation and Equality
Values and Political Change
World Values Survey
Ca. 3,400 national surveys (i.e., project*wave*country); ~142 countries/territories, 1966-2017.
SDR 1.1 available via Harvard Dataverse (see asc.ohio-state.edu/dataharmonization/data/)

Defining survey quality in SDR
TSE & SQM frameworks
3 dimensions of survey quality:
a) Quality of surveys as reflected in the survey documentation
   - inadequate information in documentation reduces confidence in the data
b) Degree of consistency between documentation and data records in the computer file
   - processing errors can affect the overall usability of the survey
c) Quality of the data records in national datasets (i.e., computer files)
   - errors can lead to distortion of empirical results

Operationalizing quality of survey data
Indicators that measure methodological variability pertain to:
a) Quality of survey documentation;
b) (In)consistencies between data description (in codebooks, questionnaires) & data records in the computer file;
c) Quality of data records in the computer files.

Operationalization a) Survey Documentation
Summary index, 5 variables (with codes):
Does the survey documentation specify the type of sample used? Yes = 1, No = 0
Does the survey documentation provide information on the response rate? Yes = 1, No = 0
Was the translation checked by experts? Yes = 1, No = 0
Is there evidence that the questionnaire was pre-tested? Yes = 1, No = 0
Does the documentation show that the fieldwork was controlled? Yes = 1, No = 0
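The five yes/no items above can be combined into a single score. The following is a minimal sketch, assuming the summary index is the plain sum of the five 0/1 codes (consistent with the 0-5 quality index used for these data); the item names are hypothetical, since the slides give only the question wording.

```python
# Hypothetical item names: the slides list the questions, not variable names.
DOC_ITEMS = (
    "sample_type_specified",
    "response_rate_reported",
    "translation_checked",
    "questionnaire_pretested",
    "fieldwork_controlled",
)


def documentation_index(survey):
    """Sum the five 0/1 documentation indicators into a 0-5 index.

    Assumption: the index is an unweighted sum of the five codes.
    """
    total = 0
    for item in DOC_ITEMS:
        value = survey[item]
        if value not in (0, 1):
            raise ValueError(f"{item} must be coded 0 or 1, got {value!r}")
        total += value
    return total


# Illustrative (made-up) coding of one national survey's documentation.
example = {
    "sample_type_specified": 1,
    "response_rate_reported": 1,
    "translation_checked": 0,
    "questionnaire_pretested": 0,
    "fieldwork_controlled": 1,
}
print(documentation_index(example))  # 3
```

Higher scores mean more of the five features are documented.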
Effect of positive answers (Yes = 1): Increases confidence in the data

Overall documentation assessment: 1720 national surveys, 89 waves, 22 projects
68% - no information about pretesting questionnaire prior to fieldwork
62% - no information about fieldwork control
49% - no information on response rates
~ 28% - no info on sampling, or info is so poor that sample type cannot be identified
Sources: Kołczyńska, 2018; Kołczyńska & Schoene, 2018

Changes in documentation quality over time (Quality index, 0-5), SDR 1.0 – 1.1
[Figure: y-axis: Quality, three-year moving average; x-axis: Year, 1965-2015]
Source: Kołczyńska & Schoene, 2018

Operationalization: b) Consistency Documentation <-> Data: processing errors categories
Based on analysis of variable values for 7 variables: gender, age, year of birth, education levels, years of schooling, trust in parliament, participation in demonstrations
Do variable values in the codebook and the data file contain any of the following? (Codes)
Illegitimate Variable Values: No = 0, Yes = 1
Misleading Variable Values: No = 0, Yes = 1
Contradictory Variable Values: No = 0, Yes = 1
Variable Values Discrepancy: No = 0, Yes = 1
Lack of Variable Value Labels: No = 0, Yes = 1
Presence of errors (Yes = 1): decreased interpretability of the data

Operationalization: b) Consistency Documentation <-> Data: processing errors categories
Processing error index: number of mistakes per wave divided by the number of source variables analyzed within the particular wave.
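The definition above amounts to a simple ratio. The sketch below assumes that each source variable in a wave carries five 0/1 error flags (the five processing error categories), and that "number of mistakes" means the total count of flags set to 1 across the wave; the flag names and the input layout are hypothetical.

```python
# Hypothetical flag names for the five processing error categories.
ERROR_TYPES = (
    "illegitimate_values",
    "misleading_values",
    "contradictory_values",
    "values_discrepancy",
    "missing_value_labels",
)


def processing_error_index(source_variables):
    """Errors found in a wave divided by the number of source variables analyzed.

    `source_variables` is a list of dicts, one per source variable,
    mapping error-type names to 0/1 (a missing key is treated as 0).
    """
    if not source_variables:
        raise ValueError("at least one source variable is required")
    errors = sum(
        var.get(error_type, 0)
        for var in source_variables
        for error_type in ERROR_TYPES
    )
    return errors / len(source_variables)


# Illustrative (made-up) wave with three source variables and four errors.
wave = [
    {"illegitimate_values": 1},
    {},
    {"misleading_values": 1, "values_discrepancy": 1, "missing_value_labels": 1},
]
print(round(processing_error_index(wave), 2))  # 1.33
```

Because each variable can show up to five errors, the index can exceed 1.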
- The index identifies the number of discrepancies (out of 5 possible) between documentation and data, per target variable in the survey-project-wave, accounting for the number of source variables analyzed.

Empirical illustrations: b) Distribution of processing errors per survey project wave, SDR 1.0 – 1.1
Project/wave: processing error index
ASB/2: 1.33
ASB/3: 1.14
LITS/1: 1.00
ARB/1; LITS/2: 0.83
LB/1997; LB/2000: 0.67
AFB/3: 0.60
EQLS/2: 0.57
ARB/2; LB/1995: 0.50
LB/1996: 0.43
AMB/1: 0.40
ASB/1: 0.38
AMB/2-5; CB/2010; CB/2011; LB/2004: 0.33
LB/1998; LB/2001: 0.29
AFB/1; EQLS/1; LB/2002: 0.25
AFB/2; CDCEE/1; EB/77.3; ISJP/1; CNEP/3: 0.20
CB/2009; LB/2009; LB/2010: 0.17
CB/2012; LB/2007: 0.14
EQLS/3; ESS/2; ESS/6; LB/2003; LB/2008: 0.13
WVS/1: 0.11
EVS/1: 0.10
EVS/3; EVS/4; WVS/5: 0.09
PPE7N: 0.07
ASES: 0.04
Higher values = more problems. Source: Oleksyienko et al. 2018

Changes in data processing quality over time (higher values = more errors)
[Figure: y-axis: Data processing error index, 0 to 0.3; x-axis: periods 1968-1989, 1990-1995, 1996-2001, 2002-2007, 2008-2013]
Source: Oleksyienko et al. 2018

Operationalization c) Data Records in the Computer File: Are data records formally correct?
Summary index on the basis of 4 variables (with codes):
Do survey cases (respondents) have unique identification numbers (IDs)? Yes = 1, No = 0
Are survey weights free of formal errors (not inflating sample size)? Yes = 1, No = 0
Is the proportion of missing values for gender and age within the standard limits (< 5%)? Yes = 1, No = 0
Is the data file free from repeated cases (duplicates)? Yes = 1, No = 0
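The four checks above can be illustrated on a small data file. This is a sketch under stated assumptions, not the procedure used for SDR: the field names are hypothetical; "weights free of formal errors" is read here as all weights being positive and summing to roughly the number of cases; and a duplicate is taken to mean two cases that are identical on every field except the ID.

```python
def data_records_index(cases, max_missing=0.05, weight_tolerance=0.01):
    """Return (checks, total) for the four formal data-record checks.

    `cases` is a list of dicts with the hypothetical keys 'id', 'weight',
    'gender', 'age' plus any substantive variables; None marks a missing value.
    """
    n = len(cases)
    if n == 0:
        raise ValueError("the data file contains no cases")

    # 1. Every case has its own ID.
    unique_ids = len({c["id"] for c in cases}) == n

    # 2. Assumed reading of "not inflating sample size":
    #    weights are positive and sum to about n.
    weights = [c["weight"] for c in cases]
    weights_ok = (
        all(w is not None and w > 0 for w in weights)
        and abs(sum(weights) - n) <= weight_tolerance * n
    )

    # 3. Share of missing values for gender and age is below the limit.
    missing_ok = all(
        sum(c[key] is None for c in cases) / n < max_missing
        for key in ("gender", "age")
    )

    # 4. No two cases are identical once the ID is ignored.
    records = [
        tuple(sorted((k, v) for k, v in c.items() if k != "id")) for c in cases
    ]
    no_duplicates = len(set(records)) == n

    checks = {
        "unique_ids": int(unique_ids),
        "weights_ok": int(weights_ok),
        "missing_ok": int(missing_ok),
        "no_duplicates": int(no_duplicates),
    }
    return checks, sum(checks.values())


# Illustrative (made-up) file: cases 3 and 4 are duplicates apart from the ID.
cases = [
    {"id": 1, "weight": 1.0, "gender": "F", "age": 34, "trust": 7},
    {"id": 2, "weight": 1.0, "gender": "M", "age": 51, "trust": 3},
    {"id": 3, "weight": 1.0, "gender": "F", "age": 29, "trust": 5},
    {"id": 4, "weight": 1.0, "gender": "F", "age": 29, "trust": 5},
]
checks, total = data_records_index(cases)
print(checks, total)
```

In real survey files the duplicate check would compare all substantive variables; with only a handful of variables, identical records can occur legitimately.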
Effect of positive answers (Yes = 1): Less distortion of research results based on the data

Problematic samples in terms of 100% AGE or GENDER non-responses
100% missing AGE:
• AMB/3/US
• CDCEE/1/Lithuania (LT)
• CDCEE/1/Romania (RO)
• WVS/1/Finland (FI)
• WVS/1/Hungary
• WVS/1/South Korea (KR)
• WVS/1/Mexico (MX)
• WVS/1/South Africa (ZA)
100% missing GENDER:
• EB/2012/Montenegro (ME)
• EB/2012/Serbia (RS)
• WVS/1/FI
• WVS/1/MX
• WVS/1/ZA
For non-unique records in SDR 1.0-1.1: Slomczynski, Kazimierz Maciek, Przemek Powalko, and Tadeusz Krauze. 2017. "Non-unique records in international survey projects: the need for extending data quality control." Survey Research Methods 11(1):1-16.

Major organizations of survey research
World Association for Public Opinion Research, WAPOR (founded in 1947, 500 members in more than 60 countries)
- International Journal of Public Opinion Research
European Survey Research Association, ESRA (founded in 2000)
- Survey Research Methods
World Association of Opinion and Marketing Research Professionals, known as ESOMAR (founded in 1948 as the European Society for Opinion and Marketing Research; today unites users and providers of research in 100 countries)
- Professional codes and standards
American Association for Public Opinion Research, AAPOR (founded in 1947)

Data Archives
Three main data archives:
• USA: Inter-university Consortium for Political and Social Research, ICPSR (University of Michigan)
• UK: Data Archive (University of Essex)
• Germany: GESIS (Cologne/Mannheim)
Two general organizations:
1. The International Federation of Data Organizations for the Social Sciences, IFDO
2. Council of European Social Science Data Archives, CESSDA (Clickable map of the social science data archives all over the world)