OPEN PEER COMMENTARY Bias Due to Controlling a Collider

Total Page:16

File Type:pdf, Size:1020Kb

OPEN PEER COMMENTARY Bias Due to Controlling a Collider Journal Code Article ID Dispatch: 24.05.12 CE: Dorio, Lynette P E R 1 8 6 5 No. of Pages: 23 ME: 1 European Journal of Personality, Eur. J. Pers. (2012) 65 2 Published online in Wiley Online Library (wileyonlinelibrary.com) DOI: 10.1002/per.1865 66 3 67 4 68 5 69 6 70 7 OPEN PEER COMMENTARY 71 8 72 9 73 10 Bias due to Controlling a Collider: A Potentially Important Issue for Personality Research 74 11 75 12 76 JENS B. ASENDORPF 13 77 14 Department of Psychology, Humboldt University Berlin, Germany 78 15 [email protected] 79 16 80 17 Abstract: I focus on one bias in correlational studies that has been rarely recognised because of the current taboo on discussions of 81 fi 18 causality in these studies: bias due to controlling a collider. It cannot only induce arti cial correlations between statistically 82 19 independent predictors but also suppress or hide real correlations between predictors. If the collider is related to selective sampling, a 83 particularly nasty bias results. Bias due to controlling a collider may be as important as bias due to a suppressor effect. Copyright © 20 84 2012 John Wiley & Sons, Ltd. 21 85 22 86 23 In his stimulating paper that is unfortunately sometimes hard Self-efficacy is an important resource, so the association of im- 87 24 to read, Lee (this issue) touches a taboo topic in current migrant status with self-efficacy provides important 88 25 personality publications: causal relations among variables information. Do these immigrants have lower self-efficacy 89 26 that describe between-person differences. For many years, expectations than their Greek peers? The answer is yes (the 90 27 authors were educated by reviewers and editors to avoid zero-order correlation between dummy-coded immigrant 91 28 causal language because of the many pitfalls in causal inter- status in a sample of 969 adolescent immigrant students along 92 29 pretations of correlations. These pitfalls granted dismissing with their Greek classmates was À.15, p < .001). 93 30 causality altogether are like throwing out the baby with the In studies of immigrant adaptation, skills in the host 94 31 bathwater. As humans, we cannot avoid thinking in terms language are often routinely controlled because they may 95 32 of causality, and therefore, tabooing this topic in publications already explain most or all effects of other predictors of 96 33 does not prevent readers and mass media from their own causal adaptation (although in many cases, suppressor effects may 97 34 interpretations guided by implicit rules such as ‘A correlates occur because the effect of language skills on adaptation is 98 35 with B’ means ‘A causes B’ but ‘B correlates with A’ means relatively strong). In the aforementioned case, if one controls 99 36 ‘B causes A’. the correlation between immigrant status and self-efficacy for 100 37 Although causality is a difficult concept in correlational the ability to speak Greek, the resulting partial correlation is 101 38 studies, scientists should and actually can do better than this À.03 and not significant any more. The control of Greek speak- 102 39 if they can be pressed to explicate the causal model, or alterna- ing skills induces a bias due to a collider because these skills are 103 40 tive causal models, underlying their research questions. The very likely causally influenced by both immigrant status and 104 41 directed acyclic graph (DAG) method described by Lee (this self-efficacy. Indeed, the respective correlations were À.37, 105 42 issue) is a valuable method of achieving such an explication p < .001 and .23, p < .001. Thus, controlling for host language 106 43 (see Foster, 2010, for an excellent discussion of causality based skills is highly problematic in studies of the adaptation of immi- 107 44 on DAGs for developmental psychologists). My comment here grants where a resource and/or an adaptation outcome influence 108 45 focuses on a key concept in the DAG approach: the collider. these skills because in such cases one controls a collider. 109 46 If one starts with explicit causal models before decisions 110 47 Bias due to explicit control of a collider: Example from are made on the statistical control of variables in the model, 111 48 research on adaptation one will rarely commit this kind of erroneous over control. 112 49 But if no causal analysis is made and the models involve many 113 50 A collider is an outcome of two joint predictors that may be variables, or variables where it is not clear whether they should 114 51 correlated or not. If one statistically controls for a collider, be considered a predictor or an outcome, researchers can easily 115 52 the resulting correlation between the predictors will be neces- be lost in covariation, relying on traditional routines designed 116 53 sarily biassed. Although this bias is most often discussed for the control of certain predictor variables although they 117 54 only for the case where two predictors are uncorrelated such might be outcomes in the present context. 118 55 that the bias consists of a spurious correlation, the bias is in 119 56 fact general: any correlation will be biassed by the adjust- Bias due to implicitly controlling a collider through 120 57 ment. As Lee (this issue) has correctly observed, the bias is sampling: Example from research on achievement 121 58 obvious but rarely noticed by researchers. 122 59 For an example, let us consider data on risks and resources If a collider is related to sampling such that the sample of 123 60 for adaptation of immigrant youth in Greece to the Greek participants is restricted in variation on the collider, this is 124 61 culture (Motti-Stefanidi, Asendorpf, & Masten, in press). equivalent to statistically controlling part of the variation of 125 62 126 63 127 64 Copyright © 2012 John Wiley & Sons, Ltd. 128 1 2 Discussion 65 2 66 3 the collider and therefore also introduces a bias. This is a Note that this bias is different from effects of restricted 67 4 particularly nasty case because the researcher did not variance that can inflate or suppress a correlation but cannot 68 5 explicitly control for the collider—the collider was implicitly induce a spurious correlation or change signs of a correlation. 69 6 controlled through selective sampling. Bias due to implicit control of a collider can do this and may 70 7 Asurprisingfinding from research on achievement may be even more common than bias due to explicit control of a 71 8 illustrate this bias (I am grateful to Marco Perugini who alerted collider. Many personality researchers are aware of biases 72 9 me to this case). Studies that relate IQ and conscientiousness to due to restricted sampling, and the possibility to correct for 73 10 achievement regularly find the expected positive correlations them, but there is no tradition to consider biases due to the 74 11 of IQ and conscientiousness with achievement but at the implicit control of a collider. 75 12 same time nonsignificant or even negative correlations 76 13 between IQ and conscientiousness (see, e.g. the meta-analysis Importance for personality research 77 14 by Ackerman & Heggestad, 1997), and authors who tried to 78 15 explain this unexpected result had difficulty finding post hoc Right now it is hard to judge the importance of biases in 79 16 explanations. A causal analysis suggests a bias introduced by personality research that are due to explicit or implicit 80 17 controlling a collider related to sampling. Most of these studies control of colliders because both biases are largely 81 18 used university students or samples biassed toward high unexplored. For the time being, a working hypothesis is 82 19 achievement, and this bias in sampling alone induced a that they may be as important as the better known 83 20 negatively biassed correlation because achievement is a suppressor effects between multiple predictors of the 84 21 collider of IQ and conscientiousness. same outcome. 85 22 86 23 87 24 88 25 89 26 What Kind of Causal Modelling Approach Does Personality Research Need? 90 27 91 28 DENNY BORSBOOM1, SOPHIE VAN DER SLUIS1,2, ARJEN NOORDHOF1, MARIEKE WICHERS3, NICOLE 92 29 GESCHWIND3, STEVEN H. AGGEN4,5, KENNETH S. KENDLER4,5 AND ANGÉLIQUE O. J. CRAMER1 93 30 94 31 1 Department of Psychology, University of Amsterdam 95 32 2 Complex Trait Genetics, Department of Functional Genomics and Department of Clinical Genetics, Centre for Neurogenomics and 96 33 Cognitive Research (CNCR), FALW-VUA, Neuroscience Campus Amsterdam, VU University Medical Centre (VUmc) 97 34 3 European Graduate School for Neuroscience, SEARCH, Department of Psychiatry and Psychology, Maastricht University Medical Centre 98 35 4 Virginia Institute for Psychiatric and Behavioural Genetics 99 36 5 Department of Psychiatry, Virginia Commonwealth University 100 37 101 38 [email protected] 102 39 103 40 Q1 Abstract: Lee (2012) proposes that personality research should utilise recent theories of causality. Although 104 41 we agree that such theories are important, we also note that their empirical application has not been very 105 42 successful to date. The reason may be that psychological systems are frequently characterised by feedback, 106 43 nonlinearity and individual differences in causal structure. Such features do not preclude the application of 107 44 causal modelling but do limit the usefulness of the approach for the analysis of typical personality data. To 108 45 adequately investigate personality, intensive time series of repeated measurements are needed. Copyright © 109 46 2012 John Wiley & Sons, Ltd. 110 47 111 48 112 49 We agree with Lee that recent theories of causality (Pearl, to showing that there is a problem in either the causal 113 50 2000) are important additions to the methodological litera- assumptions or the data, or both.
Recommended publications
  • Making Progress on Causal Inference in Economics.Pdf
    See discussions, stats, and author profiles for this publication at: https://www.researchgate.net/publication/344885909 Making Progress on Causal Inference in Economics Chapter · October 2020 CITATIONS READS 0 48 1 author: Harold Kincaid University of Cape Town 108 PUBLICATIONS 1,528 CITATIONS SEE PROFILE Some of the authors of this publication are also working on these related projects: Naturalism, Physicalism and Scientism View project Causal inference in economics View project All content following this page was uploaded by Harold Kincaid on 26 October 2020. The user has requested enhancement of the downloaded file. 1 Forthcoming in A Modern Guide to Philosophy of Economics (Elgar) Harold Kincaid and Don Ross, editors Making Progress on Causal Inference in Economics1 Harold Kincaid, School of Economics, University of Cape Town Abstract Enormous progress has been made on causal inference and modeling in areas outside of economics. We now have a full semantics for causality in a number of empirically relevant situations. This semantics is provided by causal graphs and allows provable precise formulation of causal relations and testable deductions from them. The semantics also allows provable rules for sufficient and biasing covariate adjustment and algorithms for deducing causal structure from data. I outline these developments, show how they describe three basic kinds of causal inference situations that standard multiple regression practice in econometrics frequently gets wrong, and show how these errors can be remedied. I also show that instrumental variables, despite claims to the contrary, do not solve these potential errors and are subject to the same morals. I argue both from the logic of elemental causal situations and from simulated data with nice statistical properties and known causal models.
    [Show full text]
  • Causal Graph Analysis with the CAUSALGRAPH Procedure
    Paper SAS2998-2019 Causal Graph Analysis with the CAUSALGRAPH Procedure Clay Thompson, SAS Institute Inc., Cary, NC ABSTRACT Valid causal inferences are of paramount importance both in medical and social research and in public policy evaluation. Unbiased estimation of causal effects in a nonrandomized or imperfectly randomized experiment (such as an observational study) requires considerable care to adjust for confounding covariates. A graphical causal model is a powerful and convenient tool that you can use to remove such confounding influences and obtain valid causal inferences. This paper introduces the CAUSALGRAPH procedure, new in SAS/STAT® 15.1, for analyzing graphical causal models. The procedure takes as its input one or more causal models, represented by directed acyclic graphs, and finds a strategy that you can use to estimate a specific causal effect. The paper provides details about using directed acyclic graphs to represent and analyze a causal model. Specific topics include sources of association and bias, the statistical implications of a causal model, and identification and estimation strategies such as adjustment and instrumental variables. Examples illustrate how to apply the procedure in various data analysis situations. INTRODUCTION In an experimental setting, the effect of an intervention (for example, a drug treatment or a public policy) can be investigated by randomly assigning experimental units (for example, individuals or households) to either a treatment group or a control (untreated or unexposed) group. Then the magnitude of the causal effect can be estimated by directly comparing the measured outcomes in the two groups. This estimate has a valid causal interpretation because the randomization step enables you to safely assume that no confounding variables are associated with both the treatment assignment and the outcome.
    [Show full text]
  • Graphical Models and Their Applications
    k Selected preview pages from Chapter 2 (pp. 35-48) of J. Pearl, M. Glymour, and N.P. Jewell, Causal Inference in Statistics: A Primer, Wiley, 2016. 2 Graphical Models and Their Applications 2.1 Connecting Models to Data In Chapter 1, we treated probabilities, graphs, and structural equations as isolated mathematical objects with little to connect them. But the three are, in fact, closely linked. In this chapter, we show that the concept of independence, which in the language of probability is defined by alge- braic equalities, can be expressed visually using directed acyclic graphs (DAGs). Further, this graphical representation will allow us to capture the probabilistic information that is embedded in a structural equation model. k The net result is that a researcher who has scientific knowledge in the form of a structural k equation model is able to predict patterns of independencies in the data, based solely on the structure of the model’s graph, without relying on any quantitative information carried by the equations or by the distributions of the errors. Conversely, it means that observing patterns of independencies in the data enables us to say something about whether a hypothesized model is correct. Ultimately, as we will see in Chapter 3, the structure of the graph, when combined with data, will enable us to predict quantitatively the results of interventions without actually performing them. 2.2 Chains and Forks We have so far referred to causal models as representations of the “causal story” underlying data. Another way to think of this is that causal models represent the mechanism by which data were generated.
    [Show full text]
  • New Developments in Structural Equation Modeling
    New developments in structural equation modeling Rex B Kline Concordia University Montréal A1 Set A: SCM UNL Methodology Workshop A2 A3 A4 Topics o Graph theory o Mediation: Design Conditional Causal A5 Topics o Graph theory: Pearl’s SCM Causal reasoning Causal estimation A6 Topics o Mediation: Design requirements Conditional process modeling Cause × mediator (SCM) A7 Graph theory o Pearl, J. (2009a). Causal inference in statistics: An overview. Statistics Surveys , 3, 96–146. o Pearl, J. (2009b). Causality: Models, reasoning, and inference (2nd ed.). New York: Cambridge University Press. o Pearl, J. (2012). The causal foundations of structural equation modeling. In R. H. Hoyle (Ed.), Handbook of structural equation modeling (pp. 68–91). New York: Guilford Press. A8 Graph theory o Bollen, K. A., & Pearl, J. (2013). Eight myths about causality and structural equation models. In S.L. Morgan (Ed.), Handbook of causal analysis for social research (pp. 301–328). New York: Springer. o Cole, S. R., Platt, R. W., Schisterman, E. F., Chu, H., Westreich, D., Richardson, D., & Poole, C. (2010). Illustrating bias due to conditioning on a collider. International Journal of Epidemiology , 39 , 417–420 o Elwert, F. (2013). Graphical causal models. In S. L. Morgan (Ed.), Handbook of causal analysis for social research (pp. 245–273). New York, NY: Springer. A9 Graph theory o Elwert, F. (2014). Endogenous selection bias: The problem of conditioning on a collider variable. Annual Review of Sociology , 40 , 31–53. o Glymour, M. M. (2006). Using causal diagrams to understand common problems in social epidemiology. In M. Oakes & J. Kaufman (Eds), Methods in social epidemiology (pp.
    [Show full text]
  • Introduction to Causal Inference (ICI)
    Course Lecture Notes Introduction to Causal Inference from a Machine Learning Perspective Brady Neal December 17, 2020 Preface Prerequisites There is one main prerequisite: basic probability. This course assumes you’ve taken an introduction to probability course or have had equivalent experience. Topics from statistics and machine learning will pop up in the course from time to time, so some familiarity with those will be helpful but is not necessary. For example, if cross-validation is a new concept to you, you can learn it relatively quickly at the point in the book that it pops up. And we give a primer on some statistics terminology that we’ll use in Section 2.4. Active Reading Exercises Research shows that one of the best techniques to remember material is to actively try to recall information that you recently learned. You will see “active reading exercises” throughout the book to help you do this. They’ll be marked by the Active reading exercise: heading. Many Figures in This Book As you will see, there are a ridiculous amount of figures in this book. This is on purpose. This is to help give you as much visual intuition as possible. We will sometimes copy the same figures, equations, etc. that you might have seen in preceding chapters so that we can make sure the figures are always right next to the text that references them. Sending Me Feedback This is a book draft, so I greatly appreciate any feedback you’re willing to send my way. If you’re unsure whether I’ll be receptive to it or not, don’t be.
    [Show full text]
  • STATISTICAL CONTROL & CAUSATION 1 Statistical
    Running head: STATISTICAL CONTROL & CAUSATION 1 Statistical Control Requires Causal Justification Anna Wysocki Katherine M. Lawson Mijke Rhemtulla University of California, Davis STATISTICAL CONTROL & CAUSATION 2 Abstract It is common practice in correlational or quasi-experimental studies to use statistical control to remove confounding effects from a regression coefficient. Controlling for relevant confounders can de-bias the estimated causal effect of a predictor on an outcome, that is, bring the estimated regression coefficient closer to the value of the true causal effect. But, statistical control only works under ideal circumstances. When the selected control variables are inappropriate, controlling can result in estimates that are more biased than uncontrolled estimates. Despite the ubiquity of statistical control in published regression analyses and the consequences of controlling for inappropriate third variables, the selection of control variables is rarely well justified. We argue that, to carefully select appropriate control variables, researchers must propose and defend a causal structure that includes the outcome, predictors, and plausible confounders. We underscore the importance of causality when selecting control variables by demonstrating how population-level regression coefficients are affected by controlling for appropriate and inappropriate variables. Finally, we provide practical recommendations for applied researchers who wish to use statistical control. Keywords: Statistical control, Causality, Confounder, Mediator,
    [Show full text]