
GSSR

Survey Research (I):

Types of design

www.socialinquiry.wordpress.com

Survey research in social sciences:

= method: systematic collection of information from a large number of individuals, by asking questions (interview modes).

Key words:

• survey design

• target & survey population (universe)

• sample (design & frame)

• response rates

• interview mode

• survey instrument

Survey research in social sciences:

= the product(s) the survey method yields: survey

Key words:

• types of surveys

• units of analysis

• survey data quality

• public use (data sharing)

The Survey Lifecycle: ccsg.isr.umich.edu/index.php/chapters

Survey design: academic & non-academic surveys

Academic

Non-academic

• Statistical-offices surveys (e.g., Current Population Survey; Labor Force Survey)

• Marketing surveys (e.g., in segmentation research to determine the demographic, psychographic & behavioral characteristics of potential buyers)

• Political opinion polls (e.g., polls organized by TV stations & newspapers)

Survey Design

Cross-sectional: single round of data collection

Longitudinal: multiple rounds of data collection

- on different samples: cross-sectional time series design

- on same sample of respondents: panel design

Survey design: monocultural or multicultural

3MC Survey Projects, ccsg.isr.umich.edu/

Research design: Target & survey population

Target population: the population to which inferences from the sample will be made.

Survey population: the actual population from which survey data are collected, given various restrictions (e.g., citizens vs. any resident).

Sampling design (a)

Non-random:

Single-stage quota sampling

Expert sampling

Random walk: interviews only, without compiling an address listing of households; + quota

Sampling design (b)

Random, single stage:

simple random sampling

systematic random sampling

stratified random sampling

cluster sampling

matched-pairs sampling

with/without substitution of dropouts permitted during fieldwork

Sampling design (c)

Random, multi-stage:

Clustered sample

Stage 1: Selection of clusters via simple random sampling

Stage 1: Selection of clusters via systematic random sampling

Stage 1: Number of selected clusters
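The two Stage 1 selection options listed above can be sketched as follows. This is a minimal illustration, not the procedure of any particular survey: the function names, the `seed` parameter, and the example cluster labels are assumptions introduced here, and the number of selected clusters is assumed not to exceed the number of clusters in the frame.

```python
import random


def select_clusters_srs(clusters, n_clusters, seed=None):
    """Simple random sampling: draw n_clusters without replacement."""
    rng = random.Random(seed)
    return rng.sample(clusters, n_clusters)


def select_clusters_systematic(clusters, n_clusters, seed=None):
    """Systematic random sampling: random start, then a fixed interval."""
    rng = random.Random(seed)
    interval = len(clusters) / n_clusters   # sampling interval k = N / n
    start = rng.random() * interval         # random start in [0, k)
    last = len(clusters) - 1
    # min() guards against floating-point rounding at the upper end.
    return [clusters[min(int(start + i * interval), last)]
            for i in range(n_clusters)]


# Hypothetical frame of 20 clusters (e.g., districts).
clusters = [f"district_{i:02d}" for i in range(20)]
stage1_srs = select_clusters_srs(clusters, 5, seed=1)
stage1_systematic = select_clusters_systematic(clusters, 5, seed=1)
```

With simple random sampling any combination of clusters can be drawn; with systematic sampling the order of the frame matters, because the selected clusters are spread across the list at equal intervals.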

Clustering groups for stage n

Stage n: Selection of clusters via simple random sampling

Stage n: Selection of clusters via systematic random sampling

Stage n: Number of selected clusters

Sampling design (d)

Random, multi-stage:

Multi-stage stratified sample

Characteristics used for stratification at stage 1, 2, … n

Sampling design (e)

Random route/random walk:

Random walk: compiles address listing of households only (no interviews conducted at this step)

Random walk: compiles address listing of households and interviews target person identified with a Kish table

Random walk: compiles address listing of households and interviews target person identified with a quota

Sampling design (f)

Multi-stage random sample, with quota:

Groups for quota sample at stage 1

Stage 1/n: quota sample, proportional/not

Stage 1/n: planned size of quota samples, for each group

Sampling frame

Materials used to identify all elements of the survey population from which the sample is selected

Sampling units (primary, secondary, …)

Number of sampling frames used

Type of sampling frame: list-based (what type of list)

Sampling frame created new for this survey?

Was a preexisting sampling frame updated for this survey?

Research design: Interview Mode(s)

In person, face-to-face - on paper (PAPI)

In person, face-to-face: electronic/computerized questionnaire (CAPI)

Telephone interview, landline

Telephone interview, cell phones

Self-completion, postal

Self-completion, internet

Multiple interview modes within survey

Change of interview mode within respondent

Survey data quality

Analytic frameworks for quality assessment

“Survey quality” as understood by data producers and by data users

2 perspectives on survey quality (as general concept):

(a) Freedom from deficiencies

(b) Responsiveness to users’ needs → Survey Process Quality Management

Survey quality as multidimensional concept

Total Survey Error (TSE): estimate & reduce mean square error of survey estimates given financial, time and ethical constraints

Potential sources, TSE: representation

Source: ccsg.isr.umich.edu/index.php/chapters/survey-quality-chapter

Potential sources, TSE: measurement

Source: ccsg.isr.umich.edu/index.php/chapters/survey-quality-chapter

Survey Process Quality Management (SQM)

Overall survey quality follows from:

- ongoing quality control during all stages of survey production (singularly and in conjunction with each other)

- survey’s responsiveness to customers’ needs:

• comparability

• relevance

• accuracy → TSE

• timeliness

• accessibility

• interpretability

The SDR database

Selected international survey projects (23):

• Asian Barometer

• International Social Justice Project

• International Social Survey Programme

• Americas Barometer

• Latinobarometro

• Arab Barometer

• Life in Transition Survey

• Asia Europe Survey

• New Baltic Barometer

• Caucasus Barometer

• New Europe Barometer

• Consolidation of Democracy

• Political Action II

• Comparative National Elections Project

• Political Action - An Eight Nation Study

• Political Participation and Equality

• European Quality of Life Survey

• Values and Political Change

• European Values Study

Ca. 3,400 national surveys (i.e. project*wave*countries); ~ 142 countries/territories, 1966 - 2017.

SDR 1.1 available via Harvard Dataverse (see asc.ohio-state.edu/dataharmonization/data/)

Defining survey quality in SDR

TSE & SQM frameworks → 3 dimensions of survey quality

a) Quality of surveys as reflected in the survey documentation

- inadequate information in documentation reduces confidence in the data

b) Degree of consistency documentation <-> data records in the computer file

- processing errors can affect the overall usability of the survey

c) Quality of the data records in national datasets (i.e. computer files)

- errors can lead to distortion of empirical results.

Operationalizing quality of survey data

Indicators that measure methodological variability pertain to:

a) Quality of survey documentation;

b) (In)consistencies between data description (in codebooks) & data records in the computer file;

c) Quality of data records in the computer files.

Operationalization a) Survey Documentation

Summary index, 5 variables (codes):

Does the survey documentation specify the type of sample used? Yes = 1, No = 0

Does the survey documentation provide information on the response rate? Yes = 1, No = 0

Was the translation checked by experts? Yes = 1, No = 0

Is there evidence that the questionnaire was pre-tested? Yes = 1, No = 0

Does the documentation show that the fieldwork was controlled? Yes = 1, No = 0
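The five yes/no codes above can be combined into a 0-5 score. The sketch below assumes the summary index is the simple sum of the five codes; the dictionary keys and the example record are hypothetical names introduced for illustration, not SDR variable names.

```python
# Hypothetical keys, one per documentation question listed above.
DOCUMENTATION_ITEMS = (
    "sample_type_specified",
    "response_rate_reported",
    "translation_checked",
    "questionnaire_pretested",
    "fieldwork_controlled",
)


def documentation_quality_index(survey_doc):
    """Sum the five Yes = 1 / No = 0 codes; items not recorded count as 0."""
    return sum(1 if survey_doc.get(item) else 0 for item in DOCUMENTATION_ITEMS)


# Made-up example: documentation reports sample type, response rate
# and fieldwork control, but says nothing on translation or pretesting.
example_doc = {
    "sample_type_specified": True,
    "response_rate_reported": True,
    "translation_checked": False,
    "questionnaire_pretested": False,
    "fieldwork_controlled": True,
}
example_index = documentation_quality_index(example_doc)
```

Treating an unrecorded item as 0 is a design choice of this sketch: it follows the idea that missing information in the documentation lowers confidence in the data.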

Effect of positive answers (Yes = 1): Increases confidence in the data

Overall documentation assessment: 1720 national surveys, 89 waves, 22 projects

68% - no information about pretesting questionnaire prior to fieldwork

62% - no information about fieldwork control

49% - no information on response rates

~ 28% - no info on sampling, or info is so poor that sample type cannot be identified

Sources: Kołczyńska, 2018; Kołczyńska & Schoene, 2018

Changes in documentation quality over time (Quality index, 0-5), SDR 1.0 – 1.1

[Figure: documentation quality index, three-year moving average (y-axis), by year, 1965 to 2015 (x-axis)]

Source: Kołczyńska & Schoene, 2018

Operationalization: b) Consistency Documentation <-> Data: processing errors categories

Based on analysis of variable values for 7 variables: gender, age, year of birth, education levels, years of schooling, trust in parliament, participation in demonstrations

Do variable values in the codebook and the data file contain the following? (codes)

Illegitimate Variable Values: No = 0, Yes = 1

Misleading Variable Values: No = 0, Yes = 1

Contradictory Variable Values: No = 0, Yes = 1

Variable Values Discrepancy: No = 0, Yes = 1

Lack of Variable Value Labels: No = 0, Yes = 1

Presence of errors, Yes = 1: decreased interpretability of the data

Operationalization: b) Consistency Documentation <-> Data: processing errors categories

Processing error index: number of mistakes per wave divided by the number of source variables analyzed within the particular wave.
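As a minimal sketch of this ratio: the function below counts the error categories flagged for each source variable in one wave and divides by the number of source variables analyzed. The category labels, the input layout (a dictionary from variable name to flagged categories), and the example wave are assumptions made for illustration.

```python
# Short labels for the five processing error categories (hypothetical names).
ERROR_CATEGORIES = (
    "illegitimate",
    "misleading",
    "contradictory",
    "discrepancy",
    "missing_labels",
)


def processing_error_index(flags_by_variable):
    """Number of flagged errors in a wave divided by the number of
    source variables analyzed in that wave."""
    if not flags_by_variable:
        raise ValueError("at least one source variable is required")
    n_errors = sum(
        len(set(flags) & set(ERROR_CATEGORIES))
        for flags in flags_by_variable.values()
    )
    return n_errors / len(flags_by_variable)


# Made-up wave: six source variables analyzed, two errors found.
example_wave = {
    "gender": set(),
    "age": {"illegitimate"},
    "education_level": {"missing_labels"},
    "years_of_schooling": set(),
    "trust_in_parliament": set(),
    "demonstrations": set(),
}
example_error_index = processing_error_index(example_wave)
```

Because several categories can be flagged for the same variable, the index can exceed 1.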

The index identifies the number of discrepancies (out of 5 possible) between documentation and data, per target variable in the survey-project-wave, accounting for the number of source variables analyzed.

Empirical illustrations: b) Distribution of processing errors per survey project wave, SDR 1.0 – 1.1

Survey project/wave: processing error index

ASB/2: 1.33
ASB/3: 1.14
LITS/1: 1.00
ARB/1; LITS/2: 0.83
LB/1997; LB/2000: 0.67
AFB/3: 0.60
EQLS/2: 0.57
ARB/2; LB/1995: 0.50
LB/1996: 0.43
AMB/1: 0.40
ASB/1: 0.38
AMB/2-5; CB/2010; CB/2011; LB/2004: 0.33
LB/1998; LB/2001: 0.29
AFB/1; EQLS/1; LB/2002: 0.25
AFB/2; CDCEE/1; EB/77.3; ISJP/1; CNEP/3: 0.20
CB/2009; LB/2009; LB/2010: 0.17
CB/2012; LB/2007: 0.14
EQLS/3; ESS/2; ESS/6; LB/2003; LB/2008: 0.13
WVS/1: 0.11
EVS/1: 0.10
EVS/3; EVS/4; WVS/5: 0.09
PPE7N: 0.07
ASES: 0.04

Higher values = more problems. Source: Oleksyienko et al. 2018

Changes in data processing quality over time (higher values = more errors)

[Figure: data processing error index (y-axis, 0 to 0.3) by period: 1968-1989, 1990-1995, 1996-2001, 2002-2007, 2008-2013]

Source: Oleksyienko et al. 2018

Operationalization c) Data Records in the Computer File: Are data records formally correct?

Summary index on the basis of 4 variables (codes):

Do survey cases (respondents) have unique identification numbers (IDs)? Yes = 1, No = 0

Are survey weights free of formal errors (not inflating sample size)? Yes = 1, No = 0

Is the proportion of missing values for gender and age within the standard limits (< 5%)? Yes = 1, No = 0

Is the data file free from repeated cases (duplicates)? Yes = 1, No = 0
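The four checks above can be sketched on a list of respondent records. This is an illustration under stated assumptions, not the SDR procedure: the field names (`case_id`, `weight`, `gender`, `age`) are hypothetical, "free of formal errors" is taken to mean non-negative weights whose sum stays within 1% of the number of cases, and a duplicate is taken to be a record identical to another on every field except the ID.

```python
def data_record_index(records, id_key="case_id", weight_key="weight",
                      max_missing=0.05, weight_tolerance=0.01):
    """Return (index 0-4, dict of the four Yes/No checks)."""
    n = len(records)
    if n == 0:
        raise ValueError("no records to check")
    checks = {}

    # 1. Unique identification numbers.
    checks["unique_ids"] = len({r[id_key] for r in records}) == n

    # 2. Weights: non-negative and not inflating the sample size
    #    (assumed here: sum of weights within 1% of the number of cases).
    weights = [r.get(weight_key) for r in records]
    checks["weights_ok"] = (
        all(w is not None and w >= 0 for w in weights)
        and abs(sum(weights) - n) <= weight_tolerance * n
    )

    # 3. Missing values for gender and age each below 5%.
    checks["missing_ok"] = all(
        sum(r.get(var) is None for r in records) / n < max_missing
        for var in ("gender", "age")
    )

    # 4. No repeated cases: compare records on everything except the ID.
    answers = [
        tuple(sorted((k, v) for k, v in r.items() if k != id_key))
        for r in records
    ]
    checks["no_duplicates"] = len(set(answers)) == n

    return sum(checks.values()), checks


# Made-up data file with 40 clean records.
clean = [
    {"case_id": i, "weight": 1.0, "gender": i % 2, "age": 20 + i}
    for i in range(40)
]
clean_index, clean_checks = data_record_index(clean)

# Same file with the first respondent repeated under a new ID.
with_duplicate = clean + [dict(clean[0], case_id=999)]
dup_index, dup_checks = data_record_index(with_duplicate)
```

Comparing records without the ID field matters because a repeated case can carry a different identification number, so a check on IDs alone would not detect it.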

Effect of positive answers (Yes = 1): Less distortion of research results based on the data

Problematic samples in terms of 100% AGE or GENDER non-responses

100% missing AGE:

• AMB/3/US

• CDCEE/1/Lithuania (LT)

• CDCEE/1/Romania (RO)

• WVS/1/Finland (FI)

• WVS/1/Hungary

• WVS/1/South Korea (KR)

• WVS/1/Mexico (MX)

• WVS/1/South Africa (ZA)

100% missing GENDER:

• EB/2012/Montenegro (ME)

• EB/2012/Serbia (RS)

• WVS/1/FI

• WVS/1/MX

• WVS/1/ZA

For non-unique records in SDR 1.0-1.1: Slomczynski, Kazimierz Maciek, Przemek Powalko, and Tadeusz Krauze. 2017. "Non-unique records in international survey projects: the need for extending data quality control." Survey Research Methods. 11(1):1-16.

Major organizations of survey research

World Association for Public Opinion Research, WAPOR (founded in 1947, 500 members in more than 60 countries). Journal: International Journal of Public Opinion Research

European Survey Research Association, ESRA (founded in 2000). Journal: Survey Research Methods

World Association of Opinion and Marketing Research Professionals, known as ESOMAR (founded in 1948 as the European Society for Opinion and Marketing Research; today it unites users and providers of research in 100 countries). Professional codes and standards

American Association for Public Opinion Research, AAPOR (founded in 1947)

Data Archives

Three main data archives:

• USA: Inter-university Consortium for Political and Social Research, ICPSR (University of Michigan)

• UK: UK Data Archive (University of Essex)

• Germany: GESIS (Cologne/Mannheim)

Two general organizations:

• 1. The International Federation of Data Organizations for the Social Sciences, IFDOSS

• 2. Council of European Social Science Data Archives, CESSDA (Clickable map of the social science data archives all over the world)