DATA ANALYSIS What Can Be Learned from the Past 50 Years

Total pages: 16

File type: PDF, size: 1020 KB

DATA ANALYSIS: What Can Be Learned From the Past 50 Years
Peter J. Huber, Professor of Statistics, retired, Klosters, Switzerland
WILEY: A John Wiley & Sons, Inc., Publication

WILEY SERIES IN PROBABILITY AND STATISTICS
Established by Walter A. Shewhart and Samuel S. Wilks
Editors: David J. Balding, Noel A. C. Cressie, Garrett M. Fitzmaurice, Iain M. Johnstone, Geert Molenberghs, David W. Scott, Adrian F. M. Smith, Ruey S. Tsay, Sanford Weisberg
Editors Emeriti: Vic Barnett, J. Stuart Hunter, Joseph B. Kadane, Jozef L. Teugels
A complete list of the titles in this series appears at the end of this volume.

Copyright © 2011 by John Wiley & Sons, Inc. All rights reserved.
Published by John Wiley & Sons, Inc., Hoboken, New Jersey. Published simultaneously in Canada.

No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 750-4470, or on the web at www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at http://www.wiley.com/go/permission.

Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives or written sales materials. The advice and strategies contained herein may not be suitable for your situation. You should consult with a professional where appropriate. Neither the publisher nor author shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.

For general information on our other products and services or for technical support, please contact our Customer Care Department within the United States at (800) 762-2974, outside the United States at (317) 572-3993 or fax (317) 572-4002. Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic formats. For more information about Wiley products, visit our web site at www.wiley.com.

Library of Congress Cataloging-in-Publication Data:
Huber, Peter J.
Data analysis : what can be learned from the past 50 years / Peter J. Huber.
p. cm. — (Wiley series in probability and statistics ; 874)
Includes bibliographical references and index.
ISBN 978-1-118-01064-8 (hardback)
1. Mathematical statistics—History. 2. Mathematical statistics—Philosophy. 3. Numerical analysis—Methodology. I. Title.
QA276.15.H83 2011 519.509—dc22 2010043284

Printed in the United States of America. 10 9 8 7 6 5 4 3 2 1

CONTENTS

Preface

1 What is Data Analysis?
  1.1 Tukey's 1962 paper
  1.2 The Path of Statistics

2 Strategy Issues in Data Analysis
  2.1 Strategy in Data Analysis
  2.2 Philosophical issues
    2.2.1 On the theory of data analysis and its teaching
    2.2.2 Science and data analysis
    2.2.3 Economy of forces
  2.3 Issues of size
  2.4 Strategic planning
    2.4.1 Planning the data collection
    2.4.2 Choice of data and methods
    2.4.3 Systematic and random errors
    2.4.4 Strategic reserves
    2.4.5 Human factors
  2.5 The stages of data analysis
    2.5.1 Inspection
    2.5.2 Error checking
    2.5.3 Modification
    2.5.4 Comparison
    2.5.5 Modeling and model fitting
    2.5.6 Simulation
    2.5.7 What-if analyses
    2.5.8 Interpretation
    2.5.9 Presentation of conclusions
  2.6 Tools required for strategy reasons
    2.6.1 Ad hoc programming
    2.6.2 Graphics
    2.6.3 Record keeping
    2.6.4 Creating and keeping order

3 Massive Data Sets
  3.1 Introduction
  3.2 Disclosure: Personal experiences
  3.3 What is massive? A classification of size
  3.4 Obstacles to scaling
    3.4.1 Human limitations: visualization
    3.4.2 Human-machine interactions
    3.4.3 Storage requirements
    3.4.4 Computational complexity
    3.4.5 Conclusions
  3.5 On the structure of large data sets
    3.5.1 Types of data
    3.5.2 How do data sets grow?
    3.5.3 On data organization
    3.5.4 Derived data sets
  3.6 Data base management and related issues
    3.6.1 Data archiving
  3.7 The stages of a data analysis
    3.7.1 Planning the data collection
    3.7.2 Actual collection
    3.7.3 Data access
    3.7.4 Initial data checking
    3.7.5 Data analysis proper
    3.7.6 The final product: presentation of arguments and conclusions
  3.8 Examples and some thoughts on strategy
  3.9 Volume reduction
  3.10 Supercomputers and software challenges
    3.10.1 When do we need a Concorde?
    3.10.2 General Purpose Data Analysis and Supercomputers
    3.10.3 Languages, Programming Environments and Data-based Prototyping
  3.11 Summary of conclusions

4 Languages for Data Analysis
  4.1 Goals and purposes
  4.2 Natural languages and computing languages
    4.2.1 Natural languages
    4.2.2 Batch languages
    4.2.3 Immediate languages
    4.2.4 Language and literature
    4.2.5 Object orientation and related structural issues
    4.2.6 Extremism and compromises, slogans and reality
    4.2.7 Some conclusions
  4.3 Interface issues
    4.3.1 The command line interface
    4.3.2 The menu interface
    4.3.3 The batch interface and programming environments
    4.3.4 Some personal experiences
  4.4 Miscellaneous issues
    4.4.1 On building blocks
    4.4.2 On the scope of names
    4.4.3 On notation
    4.4.4 Book-keeping problems
  4.5 Requirements for a general purpose immediate language

5 Approximate Models
  5.1 Models
  5.2 Bayesian modeling
  5.3 Mathematical statistics and approximate models
  5.4 Statistical significance and physical relevance
  5.5 Judicious use of a wrong model
  5.6 Composite models
  5.7 Modeling the length of day
  5.8 The role of simulation
  5.9 Summary of conclusions

6 Pitfalls
  6.1 Simpson's paradox
  6.2 Missing data
    6.2.1 The Case of the Babylonian Lunar Six
    6.2.2 X-ray crystallography
  6.3 Regression of Y on X or of X on Y?

7 Create order in data
  7.1 General considerations
  7.2 Principal component methods
    7.2.1 Principal component methods: Jury data
  7.3 Multidimensional scaling
    7.3.1 Multidimensional scaling: the method
    7.3.2 Multidimensional scaling: a synthetic example
    7.3.3 Multidimensional scaling: map reconstruction
  7.4 Correspondence analysis
    7.4.1 Correspondence analysis: the method
    7.4.2 Kültepe eponyms
    7.4.3 Further examples: marketing and Shakespearean plays
  7.5 Multidimensional scaling vs. correspondence analysis
    7.5.1 Hodson's grave data
    7.5.2 Plato data

8 More case studies
  8.1 A nutshell example
  8.2 Shape invariant modeling
  8.3 Comparison of point configurations
    8.3.1 The cyclodecane conformation
    8.3.2 The Thomson problem
  8.4 Notes on numerical optimization

References
Index

PREFACE

"These prolegomena are not for the use of apprentices, but of future teachers, and indeed are not to help them to organize the presentation of an already existing science, but to discover the science itself for the first time." (Immanuel Kant, 1783; transl. Gary Hatfield)

How do you learn data analysis? First, how do you teach data analysis? This is a task going beyond "organizing the presentation of an already existing science". At several academic institutions, we tried to teach it in class, as part of an Applied Statistics requirement. Sometimes I was actively involved, but mostly I played the part of an interested observer. In all cases we ultimately failed. Why? I believe the main reason was: it is easy to teach data analytic techniques, but it is difficult to teach their use in actual applied situations.
Recommended publications
  • Mathematics in the Statistical Society 1883-1933
    Mathematics in the Statistical Society 1883-1933
    John Aldrich, Economics Division, School of Social Sciences, University of Southampton, Southampton SO17 1BJ, UK. e-mail: [email protected]
    Discussion Papers in Economics and Econometrics, No. 0919. This paper is available on our website http://www.southampton.ac.uk/socsci/economics/research/papers. ISSN 0966-4246
    Abstract: This paper considers the place of mathematical methods based on probability in the work of the London (later Royal) Statistical Society in the half-century 1883-1933. The end-points are chosen because mathematical work started to appear regularly in 1883 and 1933 saw the formation of the Industrial and Agricultural Research Section; to promote these particular applications was to encourage mathematical methods. In the period three movements are distinguished, associated with major figures in the history of mathematical statistics: F. Y. Edgeworth, Karl Pearson and R. A. Fisher. The first two movements were based on the conviction that the use of mathematical methods could transform the way the Society did its traditional work in economic/social statistics, while the third movement was associated with an enlargement in the scope of statistics. The study tries to synthesise research based on the Society's archives with research on the wider history of statistics.
    Key names: Arthur Bowley, F. Y. Edgeworth, R. A. Fisher, Egon Pearson, Karl Pearson, Ernest Snow, John Wishart, G. Udny Yule.
    Keywords: History of Statistics, Royal Statistical Society, mathematical methods.
  • Two Principles of Evidence and Their Implications for the Philosophy of Scientific Method
    TWO PRINCIPLES OF EVIDENCE AND THEIR IMPLICATIONS FOR THE PHILOSOPHY OF SCIENTIFIC METHOD
    by Gregory Stephen Gandenberger. BA, Philosophy, Washington University in St. Louis, 2009; MA, Statistics, University of Pittsburgh, 2014. Submitted to the Graduate Faculty of the Kenneth P. Dietrich School of Arts and Sciences in partial fulfillment of the requirements for the degree of Doctor of Philosophy, University of Pittsburgh, 2015.
    This dissertation was presented by Gregory Stephen Gandenberger. It was defended on April 14, 2015 and approved by Edouard Machery, Satish Iyengar and John Norton (Pittsburgh, Dietrich School of Arts and Sciences), Teddy Seidenfeld (Carnegie Mellon University, Dietrich College of Humanities & Social Sciences), and James Woodward (Pittsburgh, Dietrich School of Arts and Sciences). Dissertation Director: Edouard Machery. Copyright © by Gregory Stephen Gandenberger, 2015.
    Abstract: The notion of evidence is of great importance, but there are substantial disagreements about how it should be understood. One major locus of disagreement is the Likelihood Principle, which says roughly that an observation supports a hypothesis to the extent that the hypothesis predicts it. The Likelihood Principle is supported by axiomatic arguments, but the frequentist methods that are most commonly used in science violate it. This dissertation advances debates about the Likelihood Principle, its near-corollary the Law of Likelihood, and related questions about statistical practice.
  • The Likelihood Principle
    01/28/99 © Marc Nerlove 1999. Chapter 1: The Likelihood Principle
    "What has now appeared is that the mathematical concept of probability is ... inadequate to express our mental confidence or diffidence in making ... inferences, and that the mathematical quantity which usually appears to be appropriate for measuring our order of preference among different possible populations does not in fact obey the laws of probability. To distinguish it from probability, I have used the term 'Likelihood' to designate this quantity; since both the words 'likelihood' and 'probability' are loosely used in common speech to cover both kinds of relationship." R. A. Fisher, Statistical Methods for Research Workers, 1925.
    "What we can find from a sample is the likelihood of any particular value of ρ [a parameter], if we define the likelihood as a quantity proportional to the probability that, from a particular population having that particular value of ρ, a sample having the observed value r [a statistic] should be obtained. So defined, probability and likelihood are quantities of an entirely different nature." R. A. Fisher, "On the 'Probable Error' of a Coefficient of Correlation Deduced from a Small Sample," Metron, 1:3-32, 1921.
    Introduction. The likelihood principle as stated by Edwards (1972, p. 30) is that: Within the framework of a statistical model, all the information which the data provide concerning the relative merits of two hypotheses is contained in the likelihood ratio of those hypotheses on the data. ... For a continuum of hypotheses, this principle
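The Fisher and Edwards quotations above define likelihood only up to a constant of proportionality and locate all comparative evidence in the likelihood ratio. The short Python sketch below illustrates that point for a binomial sample; the observed counts and the two hypothesised parameter values are invented for illustration and are not taken from Nerlove's chapter.

```python
# A minimal sketch of the likelihood principle for two simple hypotheses:
# the evidence comparing them is carried entirely by their likelihood ratio.
from math import comb

def binomial_likelihood(p, successes, trials):
    """Likelihood of success probability p given an observed binomial sample."""
    return comb(trials, successes) * p**successes * (1 - p)**(trials - successes)

successes, trials = 7, 10          # hypothetical observed sample
p0, p1 = 0.5, 0.7                  # two competing parameter values

lr = binomial_likelihood(p1, successes, trials) / binomial_likelihood(p0, successes, trials)
print(f"Likelihood ratio L(p={p1}) / L(p={p0}) = {lr:.3f}")

# Because likelihood is defined only up to a constant of proportionality,
# the binomial coefficient comb(trials, successes) cancels in the ratio;
# dropping it would leave the reported evidence unchanged.
```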
  • Professor A. L. Bowley's Theory of the Representative Method
    Professor A. L. Bowley's Theory of the Representative Method
    John Aldrich, Economics Division, School of Social Sciences, University of Southampton, Southampton SO17 1BJ, UK. e-mail: [email protected]
    Discussion Papers in Economics and Econometrics, No. 0801. This paper is available on our website http://www.socsci.soton.ac.uk/economics/Research/Discussion_Papers. ISSN 0966-4246
    Key names: Arthur L. Bowley, F. Y. Edgeworth, R. A. Fisher, Adolph Jensen, J. M. Keynes, Jerzy Neyman, Karl Pearson, G. U. Yule.
    Keywords: History of Statistics, Sampling theory, Bayesian inference.
    Abstract: Arthur L. Bowley (1869-1957) first advocated the use of surveys–the "representative method"–in 1906 and started to conduct surveys of economic and social conditions in 1912. Bowley's 1926 memorandum for the International Statistical Institute on the "Measurement of the precision attained in sampling" was the first large-scale theoretical treatment of sample surveys as he conducted them. This paper examines Bowley's arguments in the context of the statistical inference theory of the time. The great influence on Bowley's conception of statistical inference was F. Y. Edgeworth, but by 1926 R. A. Fisher was on the scene and was attacking Bayesian methods and promoting a replacement of his own. Bowley defended his Bayesian method against Fisher and against Jerzy Neyman when the latter put forward his concept of a confidence interval and applied it to the representative method.
    * Based on a talk given at the Sample Surveys and Bayesian Statistics Conference, Southampton, August 2008.
  • Frequentist and Bayesian Statistics
    Faculty of Life Sciences. Frequentist and Bayesian statistics. Claus Ekstrøm. E-mail: [email protected]
    Outline: 1. Frequentists and Bayesians (What is a probability? Interpretation of results / inference). 2. Comparisons. 3. Markov chain Monte Carlo.
    What is a probability? There are two schools in statistics: frequentists and Bayesians.
    Frequentist school: the school of Jerzy Neyman, Egon Pearson and Ronald Fisher.
    Bayesian school: the "school" of Thomas Bayes, with P(H|D) = P(D|H) P(H) / ∫ P(D|H) P(H) dH.
    Frequentists talk about probabilities in relation to experiments with a random component. The relative frequency of an event A is defined as P(A) = (number of outcomes consistent with A) / (number of experiments); the probability of event A is the limiting relative frequency. [Figure: relative frequency (0 to 1) plotted against the number of experiments n.]
    Frequentists, continued: the definition restricts the things we can add probabilities to. What is the probability of there being life on Mars 100 billion years ago? We assume that there is an unknown but fixed underlying parameter, θ, for a population (e.g., the mean height of Danish men). Random variation (environmental factors, measurement errors, ...) means that each observation does not result in the true value.
    The meta-experiment idea: frequentists think of meta-experiments and consider the current dataset as a single realization from all possible datasets.
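As a concrete companion to the two quoted definitions, the following Python sketch first estimates a probability as a limiting relative frequency and then applies Bayes' theorem, P(H|D) = P(D|H) P(H) / ∫ P(D|H) P(H) dH, on a discrete grid so that the integral becomes a sum. The simulated coin tosses and the flat prior are assumptions made for illustration; they do not come from Ekstrøm's slides.

```python
# Illustrative sketch only: simulated coin data and a flat grid prior.
import random

random.seed(1)

# Frequentist view: estimate P(heads) by the relative frequency of heads
# in n tosses of a fair coin; the estimate stabilises as n grows.
for n in (10, 100, 10_000):
    heads = sum(random.random() < 0.5 for _ in range(n))
    print(f"n = {n:6d}: relative frequency of heads = {heads / n:.3f}")

# Bayesian view: posterior proportional to likelihood times prior; the
# evidence (the denominator integral over H) is approximated by a grid sum.
observed_heads, tosses = 7, 10
grid = [i / 200 for i in range(1, 200)]            # candidate values of H = P(heads)
prior = [1.0 for _ in grid]                        # flat prior over the grid
likelihood = [h**observed_heads * (1 - h)**(tosses - observed_heads) for h in grid]
unnormalized = [l * p for l, p in zip(likelihood, prior)]
evidence = sum(unnormalized)
posterior = [u / evidence for u in unnormalized]

post_mean = sum(h * w for h, w in zip(grid, posterior))
print(f"posterior mean of H after {observed_heads}/{tosses} heads = {post_mean:.3f}")
```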
  • Statistical Inference: A Work in Progress
    Ronald Christensen, Department of Mathematics and Statistics, University of New Mexico. Copyright © 2019. Statistical Inference: A Work in Progress. Springer.
    Seymour and Wes.
    Preface. "But to us, probability is the very guide of life." Joseph Butler (1736). The Analogy of Religion, Natural and Revealed, to the Constitution and Course of Nature, Introduction. https://www.loc.gov/resource/dcmsiabooks.analogyofreligio00butl_1/?sp=41
    Normally, I wouldn't put anything this incomplete on the internet but I wanted to make parts of it available to my Advanced Inference Class, and once it is up, you have lost control. Seymour Geisser was a mentor to Wes Johnson and me. He was Wes's Ph.D. advisor. Near the end of his life, Seymour was finishing his 2005 book Modes of Parametric Statistical Inference and needed some help. Seymour asked Wes and Wes asked me. I had quite a few ideas for the book but then I discovered that Seymour hated anyone changing his prose. That was the end of my direct involvement.
    The first idea for this book was to revisit Seymour's. (So far, that seems only to occur in Chapter 1.) Thinking about what Seymour was doing was the inspiration for me to think about what I had to say about statistical inference. And much of what I have to say is inspired by Seymour's work as well as the work of my other professors at Minnesota, notably Christopher Bingham, R. Dennis Cook, Somesh Das Gupta, Morris L. Eaton, Stephen E. Fienberg, Narish Jain, F. Kinley Larntz, Frank B.
  • "It Took a Global Conflict"—the Second World War and Probability in British Mathematics
    Key names: M. S. Bartlett, D. G. Kendall, stochastic processes, World War II. Word count: 17,843 words.
    "It took a global conflict"—the Second World War and Probability in British Mathematics
    John Aldrich, Economics Department, University of Southampton, Southampton SO17 1BJ, UK. e-mail: [email protected]
    Abstract: In the twentieth century probability became a "respectable" branch of mathematics. This paper describes how in Britain the transformation came after the Second World War and was due largely to David Kendall and Maurice Bartlett, who met and worked together in the war and afterwards worked on stochastic processes. Their interests later diverged and, while Bartlett stayed in applied probability, Kendall took an increasingly pure line. March 2020.
    "Probability played no part in a respectable mathematics course, and it took a global conflict to change both British mathematics and D. G. Kendall." Kingman, "Obituary: David George Kendall"
    Introduction. In the twentieth century probability is said to have become a "respectable" or "bona fide" branch of mathematics, the transformation occurring at different times in different countries. In Britain it came after the Second World War with research on stochastic processes by Maurice Stevenson Bartlett (1910-2002; FRS 1961) and David George Kendall (1918-2007; FRS 1964). They also contributed as teachers, especially Kendall who was the "effective beginning of the probability tradition in this country"—his pupils and his pupils' pupils are "everywhere", reported Bingham (1996: 185). Bartlett and Kendall had full careers—extending beyond retirement in 1975 and '85—but I concentrate on the years of setting-up, 1940-55.
  • Herman Otto Hartley, Distinguished Professor: the Texas A&M University Department of Statistics Established the H. O. Hartley Memorial Lectures
    Edward George received a Ph.D. in Statistics from Stanford University in 1981; a Master's in Applied Mathematics and Statistics from SUNY at Stony Brook in 1976; and an AB in Mathematics from Cornell University in 1972. He was named full professor in the Department of Statistics at the University of Texas at Austin in 1992. From 1997-2001 he was the Ed and Molly Smith Chair in Business Administration. He then moved to the Wharton School of the University of Pennsylvania, where he has been a Universal Furniture Professor since 2002. George was the Executive Editor Elect of Statistical Science in 2004 and Executive Editor from 2005-2007. He was guest co-editor of Research on Official Statistics in 2001 and guest co-editor of Applied Stochastic Models in Business and Industry in 2006. He is also serving on the Advisory Board for Quantitative Marketing and Economics. He is currently Associate Editor for Statistics Surveys, Bayesian Analysis, and Asia-Pacific Financial Markets and has organized or participated in the organization of 13 conferences.
    Herman Otto Hartley, Distinguished Professor. The Texas A&M University Department of Statistics established the H. O. Hartley Memorial Lectures in 1988 to honor the memory of Prof. Herman Otto Hartley. Hartley accepted an appointment as Distinguished Professor at Texas A&M in 1963, founded the Texas A&M Institute of Statistics, and served as its director until his retirement in 1977. Hartley built his initial faculty of four into a group of 16, directed more than 30 doctoral students, and published over 75 papers during this period.
  • “I Didn't Want to Be a Statistician”
    "I didn't want to be a statistician": Making mathematical statisticians in the Second World War. John Aldrich, University of Southampton. Seminar, Durham, January 2018.
    The individual before the event: "I was interested in mathematics. I wanted to be either an analyst or possibly a mathematical physicist—I didn't want to be a statistician." David Cox, interview 1994.
    A generation after the event: "There was a large increase in the number of people who knew that statistics was an interesting subject. They had been given an excellent training free of charge." George Barnard & Robin Plackett (1985), Statistics in the United Kingdom, 1939-45. Cox, Barnard and Plackett were among the people who became mathematical statisticians.
    The people, born around 1920 and with a 'name' by the 60s: the 20/60s. Robin Plackett was typical: born in 1920; Cambridge mathematics undergraduate 1940; off the conveyor belt from Cambridge mathematics to statistics war-work at SR17 in 1942; Lecturer in Statistics at Liverpool in 1946; Professor of Statistics, King's College, Durham, 1962.
    Some 20/60s (in 1968). "It is interesting to note that a number of these men now hold statistical chairs in this country"* (Egon Pearson on SR17, in 1973; in 1939 he was the UK's only professor of statistics). *Including Dennis Lindley (Aberystwyth 1960), Peter Armitage (School of Hygiene 1961), Robin Plackett (Durham/Newcastle 1962), H. J. Godwin (Royal Holloway 1968), Maurice Walker (Sheffield 1972).
    SR17 women in statistical chairs? None. Few women in SR17: a small skills pool—in the 30s Cambridge graduated 5 times more men than women. Post-war careers—not in statistics or universities. Christine Stockman (1923-2015): Maths at Cambridge.
  • The Widest Cleft in Statistics - How and Why Fisher Opposed Neyman and Pearson
    School of Economics and Management, Technical University of Lisbon, Department of Economics. Working Papers, ISSN Nº 0874-4548. WP 02/2008/DE/UECE.
    The Widest Cleft in Statistics - How and Why Fisher opposed Neyman and Pearson. Francisco Louçã (UECE-ISEG, UTL, Lisbon). [email protected]
    Abstract: The paper investigates the "widest cleft", as Savage put it, between frequentists in the foundation of modern statistics: that opposing R. A. Fisher to Jerzy Neyman and Egon Pearson. Apart from deep personal confrontation through their lives, these scientists could not agree on methodology, on definitions, on concepts and on tools. Their premises and their conclusions widely differed, and the two groups they inspired ferociously opposed each other in all arenas of scientific debate. As the abyss widened, with rare exceptions economists remained innocent of this confrontation. The introduction of probability in economics occurred in fact after these ravaging battles began, even if they were not as public as they became in the 1950s. In any case, when Haavelmo, in the 1940s, suggested a reinterpretation of economics according to the probability concepts, he chose sides and inscribed his concepts in the Neyman-Pearson tradition. But the majority of the profession indifferently used tools developed by each of the opposed groups of statisticians, and many puzzled economists chose to ignore the debate. Economics became, as a consequence, one of the experimental fields for "hybridization", a synthesis between Fisherian and Neyman-Pearsonian precepts, defined as a number of practical proceedings for statistical testing and inference that were developed notwithstanding the original authors, as an eventual convergence between what they considered to be radically irreconcilable.
  • On Tests and Significance in Econometrics (Journal of Econometrics 67 (1995) 5-24)
    Journal of Econometrics 67 (1995) 5-24. Elsevier.
    On tests and significance in econometrics. Hugo A. Keuzenkamp (Department of Economics, Tilburg University, 5000 LE Tilburg, The Netherlands) and Jan R. Magnus (London School of Economics, London, UK; CentER for Economic Research, Tilburg University, 5000 LE Tilburg, The Netherlands). The authors are grateful to Michael McAleer and Mark Steel for their helpful suggestions.
    Abstract: Different aims of testing are investigated: theory testing, validity testing, simplification testing, and decision making. Different testing methodologies may serve these aims. In particular, the approaches of Fisher and Neyman-Pearson are considered. We discuss the meaning of statistical significance. Significance tests in the Journal of Econometrics are evaluated. The paper concludes with a challenge to ascertain the impact of statistical testing on economic thought.
    Key words: Statistical tests; Inference; Significance level. JEL classification: B23; B40; C12.
    1. Introduction. In a provocative paper, McCloskey (1985, p. 182) contends that 'no proposition about economic behaviour has yet been overturned by econometrics'. McCloskey is not a lonely skeptic. Many outsiders are doubtful of the value added of econometric testing (e.g., Hahn, 1992). But also many econometricians are increasingly worried about the credibility gap between econometric theory and applied economics (for example, Spanos, 1986, p. 660). Whether the skeptics are right or wrong, we must face the question: What is the significance of testing econometric hypotheses? Testing hypotheses belongs to the basic pastimes of econometricians.
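The abstract contrasts the Fisher and Neyman-Pearson approaches to testing without spelling out the operational difference. The Python sketch below, with invented sample numbers, shows it for a one-sample z-test: the Fisher-style report is a p-value left to the reader's judgement, while the Neyman-Pearson report is an accept/reject decision at a pre-specified significance level.

```python
# Hedged sketch of the two testing traditions; the sample statistics are invented.
from math import erf, sqrt

def normal_cdf(z):
    """Standard normal CDF computed via the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

# Hypothetical one-sample z-test of H0: mu = 0, with known sigma.
sample_mean, sigma, n = 0.45, 2.0, 100
z = sample_mean / (sigma / sqrt(n))

# Fisher: report the two-sided p-value as a graded measure of evidence.
p_value = 2 * (1 - normal_cdf(abs(z)))
print(f"z = {z:.2f}, two-sided p-value = {p_value:.3f}")

# Neyman-Pearson: fix alpha in advance; the cutoff yields a binary decision.
alpha = 0.05
critical = 1.96                      # two-sided cutoff for alpha = 0.05
decision = "reject H0" if abs(z) > critical else "do not reject H0"
print(f"Neyman-Pearson decision at alpha = {alpha}: {decision}")
```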
  • The Economist
    Statisticians in World War II: They also served | The Economist. http://www.economist.com/news/christmas-specials/21636589-how-statis...
    How statisticians changed the war, and the war changed statistics. Dec 20th 2014 | From the print edition.
    "I BECAME a statistician because I was put in prison," says Claus Moser. Aged 92, he can look back on a distinguished career in academia and civil service: he was head of the British government statistical service in 1967-78 and made a life peer in 2001. But statistics had never been his plan. He had dreamed of being a pianist and when he realised that was unrealistic, accepted his father's advice that he should study commerce and manage hotels. ("He thought I would like the music in the lobby.") War threw these plans into disarray. In 1936, aged 13, he fled Germany with his family; four years later the prime minister, Winston Churchill, decided that because some of the refugees might be spies, he would "collar the lot". Lord Moser was interned in Huyton, near Liverpool (pictured above). "If you lock up 5,000 Jews we will find something to do," he says now. Someone set up a café; there were lectures and concerts—and a statistical office run by a mathematician named Landau.