Sampling Schemes and Survey Quality in Cross-National Surveys
From Cross-national ex post Survey Harmonization to Substantive Analyses: A Roadmap and Empirical Illustration

Marta Kołczyńska
Institute of Philosophy and Sociology, Polish Academy of Sciences
Cross-National Studies: Interdisciplinary Research and Training Program (CONSIRT)

WAPOR 71st Annual Conference: Public Opinion Research in a Changing World
Marrakesh, June 29, 2018

Purpose
• Present the approach to ex post harmonization of survey data developed in the Survey Data Recycling (SDR, dataharmonization.org) project
• Propose steps for applying the SDR data and approach, illustrated with the analysis of individual- and macro-level determinants of participation in demonstrations

Determinants of protest participation
Individual-level predictors:
• Education (+)
• Trust in state institutions (-)
Country-level predictors:
• Quality of democracy (+)

The case for survey data harmonization
Problems:
• limited availability of survey data necessary to measure political values, attitudes and participation from countries outside of the WEIRD zone (Western, Educated, Industrialized, Rich, Democratic)
• few datasets with a sufficient number of countries for reliable estimation of country effects
(Proposed) solution in the SDR project: ex post harmonization of cross-national survey data

Slomczynski, K.M. and I. Tomescu-Dubrow. Forthcoming. “Basic Principles of Survey Data Recycling.” In Advances in Comparative Survey Methodology, ed. T. P. Johnson, B-E. Pennell, I. Stoop, and B. Dorer. Wiley.

Harmonization in SDR
Transformation of source data into a common metric and capturing the methodological variation with newly created control variables. The goal is to preserve as much information about the original (source) data as possible, with the aim of using this information in analyses to distinguish between methodological and substantive effects.
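To make the idea of a common metric plus control variables concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the SDR procedure: the linear stretch to a 0-10 target, the `SourceItem` class, and the control names `scale_length` and `scale_reversed` are all hypothetical choices made for this example.

```python
from dataclasses import dataclass


@dataclass
class SourceItem:
    """Hypothetical description of one source question (not an SDR structure)."""
    scale_min: int
    scale_max: int
    reversed_scale: bool  # True if low codes mean high trust


def harmonize(value: float, item: SourceItem, target_max: float = 10.0) -> float:
    """Linearly map a source response onto an assumed 0..target_max target metric."""
    span = item.scale_max - item.scale_min
    position = (value - item.scale_min) / span  # 0.0 .. 1.0 on the source scale
    if item.reversed_scale:
        position = 1.0 - position  # flip so that higher always means more trust
    return position * target_max


def harmonization_controls(item: SourceItem) -> dict:
    """Keep source-scale properties that the common metric would otherwise hide."""
    return {
        "scale_length": item.scale_max - item.scale_min + 1,
        "scale_reversed": int(item.reversed_scale),
    }


# Example: a 4-point item coded 1 = "a great deal" ... 4 = "none at all"
four_point = SourceItem(scale_min=1, scale_max=4, reversed_scale=True)
eleven_point = SourceItem(scale_min=0, scale_max=10, reversed_scale=False)

top_answer = harmonize(1, four_point)
controls = harmonization_controls(four_point)
```

The point of returning the controls alongside the harmonized value is that an analyst can later test whether, say, the length of the source scale is associated with the outcome, instead of assuming that the rescaling removed all methodological differences.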
2 types of control variables: harmonization controls & quality controls
They can be used:
(a) to select surveys that meet some pre-defined criteria, and/or
(b) as control variables.

Harmonization controls
Produced in the process of variable harmonization to capture properties of survey items that would be lost in the process of harmonization
• details of question wording
• properties of response scales
Item-specific, i.e., they are constructed individually for each target variable
• depending on the type of the variable and based on relevant methodological literature

Quality controls
Address the inter-survey variation in the methodology and quality of the survey process
3 types:
• based on available survey documentation (e.g., type of sampling)
• derived from the data (e.g., presence of duplicates or correctness of weights)
• consistency between the documentation and the data (e.g., labels, values)

SDR data (version 1.0)
1966-2013
22 international survey projects
1721 national surveys
142 countries/territories
2,289,060 respondents
16 harmonized variables: political attitudes, protest behavior, socio-demographics
@Dataverse: doi.org/10.7910/DVN/VWGF5Q

| Project name | Years | N waves | N surveys | N cases |
|---|---|---|---|---|
| Asian Barometer | 2001-2011 | 3 | 30 | 43691 |
| Afrobarometer | 1999-2009 | 4 | 66 | 98942 |
| Americas Barometer | 2004-2012 | 5 | 92 | 151341 |
| Arab Barometer | 2006-2011 | 2 | 16 | 19684 |
| Asia Europe Survey | 2000 | 1 | 18 | 18253 |
| Caucasus Barometer | 2009-2012 | 4 | 12 | 24621 |
| Consolidation of Democracy in Central Eastern Europe | 1990-2001 | 2 | 27 | 28926 |
| Comparative National Elections Project | 2004-2006 | 1 | 8 | 13372 |
| Eurobarometer | 1983-2012 | 7 | 152 | 138753 |
| European Quality of Life Survey | 2003-2012 | 3 | 93 | 105527 |
| European Social Survey | 2002-2013 | 6 | 146 | 281496 |
| European Values Study | 1981-2009 | 4 | 128 | 166502 |
| International Social Justice Project | 1991-1996 | 2 | 21 | 25805 |
| International Social Survey Programme | 1985-2013 | 13 | 363 | 493243 |
| Latinobarometro | 1995-2010 | 15 | 260 | 294965 |
| Life in Transition Survey | 2006-2010 | 2 | 64 | 67866 |
| New Baltic Barometer | 1993-2004 | 6 | 18 | 21601 |
| Political Action II | 1979-1981 | 1 | 3 | 4057 |
| Political Action - An Eight Nation Study | 1973-1976 | 1 | 8 | 12588 |
| Political Participation and Equality in Seven Nations | 1966-1971 | 1 | 7 | 16522 |
| Values and Political Change in Postcommunist Europe | 1993 | 1 | 5 | 4723 |
| World Values Survey | 1981-2008 | 5 | 184 | 256582 |
| Total | 1966-2013 | 89 | 1721 | 2289060 |

Data selection: availability of variables
DV: participation in demonstrations
IVs: education and trust in parliament (individual level); democracy (country level)
Controls: age, gender, rural/urban residence, household income (individual level); economic development (country level)

Data selection: availability of variables with desired properties
DV: participation in demonstrations
• Last 12 months/last year
• Last 2, 3, 5, 8, 10 years
• "ever" (no time specification)
Availability of selected variables in projects (post-1989)

Data selection: duplicated cases
Eliminate surveys with >5% non-unique records (NUR):
• WVS/3/Brazil: 6.8% NURs
• WVS/3/Mexico: 22.7% NURs
• WVS/5/Ethiopia: 35.9% NURs
37 other national surveys have between 2 non-unique records (a single duplicated record) and 36 NURs (18 duplicated records).

Slomczynski, K.M., P. Powalko, and T. Krauze. 2017. “Non-unique Records in International Survey Projects: The Need for Extending Data Quality Control.” Survey Research Methods 11(1): 1-16.

Data selection: survey multiplets
Side effect of combining data from different survey projects: occasionally there is more than one survey per country-year. Survey multiplets increase the inequality in survey coverage between countries.
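The 5% rule from the duplicated-cases slide can be sketched as follows. This is a minimal illustration, not the SDR implementation: the function names and the toy data are hypothetical, and it assumes each record is a tuple of substantive answers with case IDs already removed. Following the slide's counting, a single duplicated pair contributes 2 non-unique records.

```python
from collections import Counter


def nur_share(records):
    """Percentage of records that are identical to at least one other record."""
    counts = Counter(tuple(r) for r in records)
    # Every copy in a duplicated group counts, so one duplicated pair = 2 NURs.
    non_unique = sum(n for n in counts.values() if n > 1)
    return 100.0 * non_unique / len(records)


def select_surveys(surveys, max_nur_pct=5.0):
    """Keep only surveys whose NUR share does not exceed the threshold."""
    return {
        survey_id: records
        for survey_id, records in surveys.items()
        if nur_share(records) <= max_nur_pct
    }


# Toy data (hypothetical): 50 distinct answer patterns
clean = [(i, i % 3) for i in range(50)]

surveys = {
    "survey_a": clean,                      # no duplicates
    "survey_b": clean[:49] + [clean[0]],    # one duplicated pair: 2 of 50 records
    "survey_c": clean[:10] + clean[:2],     # two duplicated pairs: 4 of 12 records
}

kept = select_surveys(surveys)
```

In this toy example `survey_c` is dropped because a third of its records are non-unique, while `survey_b` stays because one duplicated pair in 50 records is below the threshold.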
10 instances with more than one survey per country-year
(1) Sampling scheme closer to simple random sample
(2) Larger sample size

Methodological variation
(1) Across source variables
• Time in the question on participation in demonstrations (1 year, 2, 3, 5, 8, 10 years, "ever") → selection of surveys
• The question on participation in demonstrations mentioned other activities, such as marches, protests or sit-ins, in addition to demonstrations → control variable
(2) Quality across surveys
• Sampling scheme → control variable
• Non-response to the DV → control variable

DV: Participation in demonstrations "ever"

| | Model 1 | Model 2 | Model 3 |
|---|---|---|---|
| Trust in parliament (group centered) | 0.002 | -0.000 | -0.000 |
| Education, years (centered 12) | 0.094*** | 0.103*** | 0.103*** |
| Freedom House (centered 9) | 0.107*** | 0.116*** | 0.115*** |
| Education, years (12) x Freedom House (9) | | 0.008*** | 0.008*** |
| Original question extended | | | 0.286** |
| % missing in DV | | | -0.014+ |
| Sampling (Ref.: no/insufficient information) | | | |
| Quota | | | -0.123 |
| Random route | | | -0.054 |
| Multi-stage address register | | | -0.174 |
| Multi-stage individual register | | | -0.235+ |
| Simple random sample | | | 0.095 |

Controls: age, age squared, female, rural, household income; GDP per capita, mean trust in parliament, mean education, proportion rural.
N individuals = 301598, N surveys = 276, N countries = 95.

DV: Participation in demonstrations

| | 1 year | "ever" | both | both + "ever" dummy |
|---|---|---|---|---|
| Freedom House (centered 9) | 0.014 | 0.118*** | 0.108*** | 0.120*** |
| GDP per capita, USD (ln) | -0.213 | -0.119+ | -0.118 | -0.136* |
| Mean trust in parliament | 0.184*** | 0.032 | 0.086+ | 0.055 |
| Mean education (years) | 0.065 | -0.076** | -0.111*** | -0.068** |
| Proportion rural | 0.879* | 0.032 | -0.891*** | 0.316 |
| % missing values in DV | -0.042 | -0.010 | 0.055*** | -0.008 |
| Sampling (Ref.: no/insuff. information) | | | | |
| Quota | 0.017 | -0.133+ | -0.001 | -0.065 |
| Random route | 0.304+ | -0.044 | -0.065 | 0.198* |
| Multi-stage address register | -0.460** | -0.220+ | -0.664*** | -0.284** |
| Multi-stage individual register | -0.421* | -0.188+ | -0.296* | -0.135 |
| Simple random sample | -0.753*** | 0.102 | -0.872*** | -0.364** |
| "ever" dummy | | | | 1.305*** |
| N individuals | 304747 | 286863 | 591610 | 591610 |
| N surveys / N countries | 232/71 | 278/96 | 510/113 | 510/113 |

Conclusions
• There is a need for ex post harmonization of survey data and for a systematic method of analyzing survey data harmonized ex post.
• Harmonization and quality controls proposed by SDR capture (part of) the methodological variation across surveys to distinguish between methodological and substantive effects.
• How to do this better?

Acknowledgements
The SDR project “Survey Data Recycling: New Analytic Framework, Integrated Database, and Tools for Cross-national Social, Behavioral and Economic Research” is funded by the National Science Foundation, USA (PTE Federal award 1738502). The SDR project builds on the grant funded by the National Science Centre, Poland (2012/06/M/HS6/00322).

SDR Team at the Polish Academy of Sciences and The Ohio State University: Kazimierz M. Słomczyński (PI), Irina Tomescu-Dubrow (PI), J. Craig Jenkins (PI), Marta Kołczyńska, Ilona Wysmułek, Przemek Powałko, Nika Palaguta, Weronika Boruc, Denys Lavryk, Marcin W. Zieliński, Bashir Tofangsazi

dataharmonization.org
https://dataharmonization.org/newsletter/
https://dataharmonization.org/publications/