The Quest for Statistical Significance: Ignorance, Bias and Malpractice of Research Practitioners Joshua Abah

To cite this version: Joshua Abah. The Quest for Statistical Significance: Ignorance, Bias and Malpractice of Research Practitioners. International Journal of Research & Review (www.ijrrjournal.com), 2018, 5 (3), pp.112-129. hal-01758493

HAL Id: hal-01758493
https://hal.archives-ouvertes.fr/hal-01758493
Submitted on 4 Apr 2018

Distributed under a Creative Commons Attribution - NonCommercial - ShareAlike 4.0 International License

International Journal of Research and Review
www.gkpublication.in
E-ISSN: 2349-9788; P-ISSN: 2454-2237

Review Article

The Quest for Statistical Significance: Ignorance, Bias and Malpractice of Research Practitioners

Joshua Abah Abah
Department of Science Education, University of Agriculture, Makurdi, Nigeria

ABSTRACT

There is a growing body of evidence on the prevalence of ignorance, biases and malpractice among researchers which questions the authenticity, validity and integrity of the knowledge being propagated in professional circles. The push for academic relevance and career advancement has driven some research practitioners into committing gross misconduct in the form of innocent ignorance, sloppiness, malicious intent and outright fraud. These, among other concerns around research data handling and reporting, form the basis for this in-depth review. This discourse also draws attention to the recent official statement on the correct use of the p-value and the need for professional intervention in ensuring that the outcomes of research are neither erroneous nor misleading. The expositions in this review express cogent implications for institutions, supervisors, mentors, and editors to promote high ethical standards and rigor in scientific investigations.

Keywords: Research, Research misconduct, Bias, P-value, Statistical significance, ANCOVA Assumptions.

INTRODUCTION

Research is an enterprise aimed at finding solutions and answers to existing problems. Research can be seen as an objective, systematic, controlled and critical activity planned and directed towards the discovery and development of dependable knowledge (Emaikwu, 2012). Literally “re-search” means to “search again”. It connotes patient study and scientific investigation wherein the researcher takes another, more careful look at data to discover all that can be known about the subject of the study (Bodla, 2017). Broadly, research entails bringing together some content that is of interest, some ideas that give meaning to that content and some techniques or procedures by means of which those ideas and content can be studied (Deshmukh, n.d.). According to O’Donnell (2012), research can be defined as the creation of new knowledge and/or the use of existing knowledge in a new and creative way so as to generate new concepts, methodologies and understandings. This could include synthesis and analysis of previous research to the extent that it leads to new and creative outcomes. From all indications, research can be described as an organized mechanism for studying phenomena and testing hypotheses.

Research is an indispensable tool for growth and development in all fields of human endeavour. It has been a means of breaking forth into new frontiers in medicine, agriculture, banking, education, food security, sociology, literature, arts and the sciences. Outcomes of diverse researches across different disciplines constitute the fuel for the present scientific and technological advancement the world is witnessing. The world today, being a “global village”, is driven by the quest to know more, to venture into the unknown and make human existence much better than ever. As a result of this significance of research, it is gradually becoming a sub-discipline in itself, within every discipline. This implies that within every field of study, there is a prescribed way of doing research, broadly referred to as “Research methodology”.

Research methodology consists of learning how to adopt several common approaches when doing research, and how to conceive a research design (Jonker & Pennink, 2010). Methodology is a systematic plan for thinking and acting in the conduct of research work. Emaikwu (2012) maintains that scientific research methods must be verifiable, cumulative, ethical, theoretical and empirical. How well a research project is planned and how well the steps in the plan are integrated can make the difference between success and failure. In this respect, a plan consists of two general areas, namely research concepts and context, and research logistics (Congdon & Dunham, 1999), which are coordinated within a given time frame, culminating in the writing of a research report. The research report is the output of the entire research process made visible to a targeted audience and/or the public. For academics and researchers in universities, research centres, science laboratories and other research generating agencies, the production of quality and relevant research reports is a measure of growth and a determination of career and institutional relevance. Research reports are often published in professional journals, institutional bulletins, associations’ notices and government agencies’ gazettes. They can also be presented at workshops, seminars and conferences, where learned contributions, corrections and suggestions can be synthesized into the research process before publishing for public use. Such rigorous vetting is essential considering the fact that a published work is expected to be an addition to existing knowledge and a reference point for future studies.

In light of the ripple effect of research in the knowledge-generation circle, researchers and academic institutions place serious emphasis on research ethics. In the words of Norris (1997):

Research demands skepticism, commitment and detachment. To understand the object or domain of inquiry takes an intense degree of commitment and concentration. To remain open minded, alert to foreclosure and to sources of error needs some measure of detachment. As with other forms of art, research requires detachment from oneself, a willingness to look at the self and the way it influences the quality of data and reports; in particular research demands a capacity to accept and use criticism and to be self-critical in a constructive manner (p.173).

Ethical conduct, in general, refers to actions that one takes pride in according to his or her conscience and that live up to his or her responsibility as a member of society. Kim (2009) asserts that research ethics is a special social norm that researchers are obliged to abide by as a criterion of judgment for researchers not to operate against their professional integrity and to carry out socially responsible research activities. Ethical standards are set by professional associations, educational institutions, journal publishers and government regulatory agencies. It is likely that these organizations vary considerably in the attention they invest and the procedures they deploy to uphold research ethics (Johnson, Parker & Clements, 2001). Practices carried out by researchers outside these regulatory guidelines constitute research misconduct.

By definition, research misconduct entails fabrication, falsification or plagiarism in proposing, performing or reviewing research or in reporting research results (OSTP, 2002). Research misconduct may occur if the conduct represents a significant departure from accepted practices; has been committed intentionally, knowingly or recklessly and can be proven by a preponderance of evidence (Inzana, 2008). The ramification of research misconduct has been broadened to include other serious deviations from accepted guidelines of the scientific community for maintaining the integrity of research record and retaliation of any kind against a person who reported or provided information about suspected or alleged misconduct and who has not acted in bad faith (Fisehen, n.d.). Among the three “cardinal sins” of research conduct, only plagiarism seems to be in the public eye, with the other [...]

[...] essential for us to contemplate what responsible conduct of research actually entails and fully establish research ethics as an integral part of our academic culture (p.1).

The pressure on academics to increase their number of publications in line with requirements for promotion and career growth has also contributed to this grave concern for research ethics. In the view of Mullane and Williams (2013), bias in research, where prejudice or selectivity introduces a deviation in outcome beyond [...]

Recommended publications
  • Statistical Fallacy: a Menace to the Field of Science
    Statistical Fallacy: A Menace to the Field of Science
    Kalu Emmanuel Ogbonnaya*, Benard Chibuike Okechi**, Benedict Chimezie Nwankwo***
    * Department of Ind. Maths and Applied Statistics, Ebonyi State University, Abakaliki
    ** Department of Psychology, University of Nigeria, Nsukka
    *** Department of Psychology and Sociological Studies, Ebonyi State University, Abakaliki
    International Journal of Scientific and Research Publications, Volume 9, Issue 6, June 2019, p. 297, ISSN 2250-3153
    DOI: 10.29322/IJSRP.9.06.2019.p9048 http://dx.doi.org/10.29322/IJSRP.9.06.2019.p9048
    Abstract- Statistical fallacy has been a menace in the field of sciences. This is mostly contributed by the misconception of analysts and thereby led to the distrust in statistics. This research investigated the conception of students from selected departments on statistical concepts as it relates statistical fallacy. Students in Statistics, Economics, Psychology, and Banking/Finance department were randomly sampled with a sample size of 36, 43, 41 and 38 respectively. A Statistical test was conducted to obtain their conception score about statistical concepts. A null hypothesis which states that there will be no significant difference between the students’ conception of [...]
    [...] easier to understand but when used in a fallacious approach can trick the casual observer into believing something other than what the data shows. In some cases, statistical fallacy may be accidental or unintentional. In others, it is purposeful and for the benefits of the analyst. The fallacies committed intentionally refer to abuse of statistics and the fallacy committed unintentionally refers to misuse of statistics. A misuse occurs when the data or the results of analysis are unintentionally misinterpreted due to lack of comprehension. The fault cannot be ascribed to statistics; it lies with the user (Indrayan, 2007).
  • Misuse of Statistics in Surgical Literature
    Statistics Corner
    Misuse of statistics in surgical literature
    Matthew S. Thiese1, Brenden Ronna1, Riann B. Robbins2
    1 Rocky Mountain Center for Occupational & Environment Health, Department of Family and Preventive Medicine, 2 Department of Surgery, School of Medicine, University of Utah, Salt Lake City, Utah, USA
    Correspondence to: Matthew S. Thiese, PhD, MSPH. Rocky Mountain Center for Occupational & Environment Health, Department of Family and Preventive Medicine, School of Medicine, University of Utah, 391 Chipeta Way, Suite C, Salt Lake City, UT 84108, USA.
    Abstract: Statistical analyses are a key part of biomedical research. Traditionally surgical research has relied upon a few statistical methods for evaluation and interpretation of data to improve clinical practice. As research methods have increased in both rigor and complexity, statistical analyses and interpretation have fallen behind. Some evidence suggests that surgical research studies are being designed and analyzed improperly given the specific study question. The goal of this article is to discuss the complexities of surgical research analyses and interpretation, and provide some resources to aid in these processes.
    Keywords: Statistical analysis; bias; error; study design
    Submitted May 03, 2016. Accepted for publication May 19, 2016. doi: 10.21037/jtd.2016.06.46 View this article at: http://dx.doi.org/10.21037/jtd.2016.06.46
    Introduction
    Research in surgical literature is essential for furthering knowledge, understanding new clinical questions, as well as [...]
    [...] the most commonly used statistical tests of the time (6,7). Statistical methods have since become more complex with a variety of tests and sub-analyses that can be used to interpret, understand and analyze data.
  • United Nations Fundamental Principles of Official Statistics
    UNITED NATIONS
    United Nations Fundamental Principles of Official Statistics: Implementation Guidelines
    (Final draft, subject to editing) (January 2015)
    Table of contents
    Foreword
    Introduction
    PART I: Implementation guidelines for the Fundamental Principles
    Relevance, impartiality and equal access
    Professional standards, scientific principles, and professional ethics
    Accountability and transparency
    Prevention of misuse
    Sources of official statistics
    Confidentiality
    Legislation
    National coordination
    Use of international standards
    International cooperation
    Annex
    PART II: Implementation guidelines on how to ensure independence
    How to ensure independence
    Foreword
    The Fundamental Principles of Official Statistics (FPOS) are a pillar of the Global Statistical System. By enshrining our profound conviction and commitment that official statistics have to adhere to well-defined professional and scientific standards, they define us as a professional community, reaching across political, economic and cultural borders. They have stood the test of time and remain as relevant today as they were when they were first adopted over twenty years ago. In an appropriate recognition of their significance for all societies, who aspire to shape their own fates in an informed manner, the Fundamental Principles of Official Statistics were adopted on 29 January 2014 at the highest political level as a General Assembly resolution (A/RES/68/261). This is, for us, a moment of great pride, but also of great responsibility and opportunity. In order for the Principles to be more than just a statement of noble intentions, we need to renew our efforts, individually and collectively, to make them the basis of our day-to-day statistical work.
  • Case Study Applications of Statistics in Institutional Research
    Case Study Applications of Statistics in Institutional Research
    by Mary Ann Coughlin and Marian Pagano
    Number Ten, Resources in Institutional Research
    A joint publication of the Association for Institutional Research and the Northeast Association for Institutional Research
    © 1997 Association for Institutional Research, 114 Stone Building, Florida State University, Tallahassee, Florida 32306-3038
    All Rights Reserved. No portion of this book may be reproduced by any process, stored in a retrieval system, or transmitted in any form, or by any means, without the express written permission of the publisher. Printed in the United States.
    To order additional copies, contact: AIR, 114 Stone Building, Florida State University, Tallahassee FL 32306-3038. Tel: 904/644-4470 Fax: 904/644-8824 Home Page: www.fsu.edul-airlhome.htm
    ISBN 1-882393-06-6
    Table of Contents
    Acknowledgments
    Introduction
    Chapter 1: Basic Concepts
    Characteristics of Variables and Levels of Measurement
    Descriptive Statistics
    Probability, Sampling Theory and the Normal Distribution
    Chapter 2: Comparing Group Means: Are There Real Differences Between Average Faculty Salaries Across Departments?
  • The Numbers Game - the Use and Misuse of Statistics in Civil Rights Litigation
    Villanova Law Review, Volume 23, Issue 1, Article 2, 1977
    Recommended Citation: Marcy M. Hallock, The Numbers Game - The Use and Misuse of Statistics in Civil Rights Litigation, 23 Vill. L. Rev. 5 (1977). Available at: https://digitalcommons.law.villanova.edu/vlr/vol23/iss1/2
    THE NUMBERS GAME - THE USE AND MISUSE OF STATISTICS IN CIVIL RIGHTS LITIGATION
    MARCY M. HALLOCK†
    I. INTRODUCTION
    "In the problem of racial discrimination, statistics often tell much, and Courts listen."[1]
    "We believe it evident that if the statistics in the instant matter represent less than a shout, they certainly constitute far more than a mere whisper."[2]
    THE PARTIES TO ACTIONS BROUGHT UNDER THE CIVIL RIGHTS LAWS[3] have relied increasingly upon statistical[4] analyses to establish or rebut cases of unlawful discrimination. Although statistical evidence has been considered significant in actions brought to redress racial discrimination in jury selection,[5] it has been used most frequently in cases of allegedly discriminatory [...]
    † B.A., University of Pennsylvania, 1972; J.D., Georgetown University Law Center, 1975.
  • Misuse of Statistics
    MISUSE OF STATISTICS
    Author: Rahul Dodhia. Posted: May 25, 2007. Last Modified: October 15, 2007.
    This article is continuously updated. For the latest version, please go to www.RavenAnalytics.com/articles.php
    INTRODUCTION
    Did you know that 54% of all statistics are made up on the spot?
    Okay, you may not have fallen for that one, but there are plenty of real-life examples that bait the mind. For example, data from a 1988 census suggest that there is a high correlation between the number of churches and the number of violent crimes in US counties. The implied message from this correlation is that religion and crime are linked, and some would even use this to support the preposterous sounding hypothesis that religion causes crimes, or there is something in the nature of people that makes the two go together. That would be quite shocking, but alert statisticians would immediately point out that it is a spurious correlation. Counties with a large number of churches are likely to have large populations. And the larger the population, the larger the number of crimes.[1]
    Statistical literacy is not a skill that is widely accepted as necessary in education. Therefore a lot of misuse of statistics is not intentional, just uninformed.
    [Figure 1: Percent Return on Investment by year (year1 to year4), Group A and Group B]
    Here is the same data in more conventional but less pretty format. Now it is clear that Fund A outperformed Fund B in 3 out of 4 years, not the other way around.
    [Figure: Percent Return on Investment, Group A and Group B]
  • Regression Assumptions in Clinical Psychology Research Practice—A Systematic Review of Common Misconceptions
    Anja F. Ernst and Casper J. Albers
    Heymans Institute for Psychological Research, University of Groningen, Groningen, The Netherlands
    Submitted 18 November 2016. Accepted 17 April 2017. Published 16 May 2017.
    Corresponding author: Casper J. Albers. Academic editor: Shane Mueller.
    Subjects: Psychiatry and Psychology, Statistics
    Keywords: Linear regression, Statistical assumptions, Literature review, Misconceptions about normality
    ABSTRACT
    Misconceptions about the assumptions behind the standard linear regression model are widespread and dangerous. These lead to using linear regression when inappropriate, and to employing alternative procedures with less statistical power when unnecessary. Our systematic literature review investigated employment and reporting of assumption checks in twelve clinical psychology journals. Findings indicate that normality of the variables themselves, rather than of the errors, was wrongfully held for a necessary assumption in 4% of papers that use regression. Furthermore, 92% of all papers using linear regression were unclear about their assumption checks, violating APA-recommendations. This paper appeals for a heightened awareness for and increased transparency in the reporting of statistical assumption checking.
    INTRODUCTION
    One of the most frequently employed models to express the influence of several predictors on a continuous outcome variable is the linear regression model:
    $$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \cdots + \beta_p X_{pi} + \varepsilon_i.$$
    This equation predicts the value of a case $Y_i$ with values $X_{ji}$ on the independent variables $X_j$ ($j = 1, \ldots, p$). The standard regression model takes $X_j$ to be measured without error (cf. Montgomery, Peck & Vining, 2012, p. 71). The various $\beta_j$ slopes are each a measure of association between the respective independent variable $X_j$ and the dependent variable $Y$.
  • Quantifying Aristotle's Fallacies
    mathematics
    Article
    Quantifying Aristotle’s Fallacies
    Evangelos Athanassopoulos 1,* and Michael Gr. Voskoglou 2
    1 Independent Researcher, Giannakopoulou 39, 27300 Gastouni, Greece
    2 Department of Applied Mathematics, Graduate Technological Educational Institute of Western Greece, 22334 Patras, Greece
    * Corresponding author
    Received: 20 July 2020; Accepted: 18 August 2020; Published: 21 August 2020
    Abstract: Fallacies are logically false statements which are often considered to be true. In the “Sophistical Refutations”, the last of his six works on Logic, Aristotle identified the first thirteen of today’s many known fallacies and divided them into linguistic and non-linguistic ones. A serious problem with fallacies is that, due to their bivalent texture, they can under certain conditions disorient the nonexpert. It is, therefore, very useful to quantify each fallacy by determining the “gravity” of its consequences. This is the target of the present work, where for historical and practical reasons (the fallacies are too many to deal with all of them) our attention is restricted to Aristotle’s fallacies only. However, the tools (Probability, Statistics and Fuzzy Logic) and the methods that we use for quantifying Aristotle’s fallacies could be also used for quantifying any other fallacy, which gives the required generality to our study.
    Keywords: logical fallacies; Aristotle’s fallacies; probability; statistical literacy; critical thinking; fuzzy logic (FL)
    1. Introduction
    Fallacies are logically false statements that are often considered to be true. The first fallacies appeared in the literature simultaneously with the generation of Aristotle’s bivalent Logic.
In the “Sophistical Refutations” (Sophistici Elenchi), the last chapter of the collection of his six works on logic—which was named by his followers, the Peripatetics, as “Organon” (Instrument)—the great ancient Greek philosopher identified thirteen fallacies and divided them in two categories, the linguistic and non-linguistic fallacies [1].
  • UNIT 1 INTRODUCTION to STATISTICS Introduction to Statistics
    UNIT 1 INTRODUCTION TO STATISTICS
    Structure
    1.0 Introduction
    1.1 Objectives
    1.2 Meaning of Statistics
    1.2.1 Statistics in Singular Sense
    1.2.2 Statistics in Plural Sense
    1.2.3 Definition of Statistics
    1.3 Types of Statistics
    1.3.1 On the Basis of Function
    1.3.2 On the Basis of Distribution of Data
    1.4 Scope and Use of Statistics
    1.5 Limitations of Statistics
    1.6 Distrust and Misuse of Statistics
    1.7 Let Us Sum Up
    1.8 Unit End Questions
    1.9 Glossary
    1.10 Suggested Readings
    1.0 INTRODUCTION
    The word statistics has different meaning to different persons. Knowledge of statistics is applicable in day to day life in different ways. In daily life it means general calculation of items, in railway statistics means the number of trains operating, number of passenger’s freight etc. and so on. Thus statistics is used by people to take decision about the problems on the basis of different type of quantitative and qualitative information available to them. However, in behavioural sciences, the word ‘statistics’ means something different from the common concern of it. Prime function of statistic is to draw statistical inference about population on the basis of available quantitative information. Overall, statistical methods deal with reduction of data to convenient descriptive terms and drawing some inferences from them. This unit focuses on the above aspects of statistics.
    1.1 OBJECTIVES
    After going through this unit, you will be able to: define the term statistics; explain the status of statistics; describe the nature of statistics; state basic concepts used in statistics; and analyse the uses and misuses of statistics.
  • Prestructuring Multilayer Perceptrons Based on Information-Theoretic
    Portland State University, PDXScholar
    Dissertations and Theses, 1-1-2011
    Recommended Citation: Vurkaç, Mehmet, "Prestructuring Multilayer Perceptrons based on Information-Theoretic Modeling of a Partido-Alto-based Grammar for Afro-Brazilian Music: Enhanced Generalization and Principles of Parsimony, including an Investigation of Statistical Paradigms" (2011). Dissertations and Theses. Paper 384. https://doi.org/10.15760/etd.384
    by Mehmet Vurkaç
    A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Electrical and Computer Engineering
    Dissertation Committee: George G. Lendaris, Chair; Douglas V. Hall; Dan Hammerstrom; Marek Perkowski; Brad Hansen
    Portland State University ©2011
    ABSTRACT
    The present study shows that prestructuring based on domain knowledge leads to statistically significant generalization-performance improvement in artificial neural networks (NNs) of the multilayer perceptron (MLP) type, specifically in the case of a noisy real-world problem with numerous interacting variables.
  • A Causation Coefficient and Taxonomy of Correlation/Causation Relationships
    A causation coefficient and taxonomy of correlation/causation relationships
    Joshua Brulé (Department of Computer Science, University of Maryland, College Park)
    arXiv:1708.05069v1 [stat.ME] 5 Aug 2017
    Abstract
    This paper introduces a causation coefficient which is defined in terms of probabilistic causal models. This coefficient is suggested as the natural causal analogue of the Pearson correlation coefficient and permits comparing causation and correlation to each other in a simple, yet rigorous manner. Together, these coefficients provide a natural way to classify the possible correlation/causation relationships that can occur in practice and examples of each relationship are provided. In addition, the typical relationship between correlation and causation is analyzed to provide insight into why correlation and causation are often conflated. Finally, example calculations of the causation coefficient are shown on a real data set.
    Introduction
    The maxim, “Correlation is not causation”, is an important warning to analysts, but provides very little information about what causation is and how it relates to correlation. This has prompted other attempts at summarizing the relationship. For example, Tufte [1] suggests either, “Observed covariation is necessary but not sufficient for causality”, which is demonstrably false or, “Correlation is not causation but it is a hint”, which is correct, but still underspecified. In what sense is correlation a ‘hint’ to causation?
    Correlation is well understood and precisely defined. Generally speaking, correlation is any statistical relationship involving dependence, i.e. the random variables are not independent. More specifically, correlation can refer to a descriptive statistic that summarizes the nature of the dependence.
  • K C Chakrabarty: Uses and Misuses of Statistics
    K C Chakrabarty: Uses and misuses of statistics
    Address by Dr K C Chakrabarty, Deputy Governor of the Reserve Bank of India, at the DST Centre for Interdisciplinary Mathematical Sciences, Faculty of Science, Banaras Hindu University, as part of the 150th Birth Anniversary Celebrations of Mahamana Pandit Madan Mohan Malviya, Varanasi, 20 March 2012.
    Assistance provided by Shri Abhiman Das in preparation of this address is gratefully acknowledged.
    1. Prof. Umesh Singh, Coordinator, DST Centre for Interdisciplinary Mathematical Sciences, Prof Sengupta, Dean Faculty of Science, Prof Joshi, other distinguished members of faculty of the University, and above all, dear students. I thank you all for inviting me to be in your midst during the 150th birth anniversary celebrations of Mahamana Pandit Madan Mohan Malviya. It is a great honour and privilege for me.
    Pandit Madan Mohan Malviya
    2. You have provided an opportunity to me to pay my tribute to the “Mahamana” by delivering a lecture as part of the celebrations of his 150th birth anniversary. He was one of the greatest personalities that this nation has ever produced. In many senses, he was what most of the top class institutions today vie to produce. He was, at the same time a great patriot, an eminent educationist, a teacher of teachers, a silver-tongued transcendental orator, an ancient as well as a modern leader, a reluctant but an eminent lawyer, a social reformer, a great human being, a torch-bearer of the downtrodden, and above all, a great nation builder. He lived his life for us and for the future generations.