Statistical Inference from Data to Simple Hypotheses
Jason Grossman
Australian National University

DRAFT 2011

Please don't cite this in its current form without permission.

I am impressed also, apart from prefabricated examples of black and white balls in an urn, with how baffling the problem has always been of arriving at any explicit theory of the empirical confirmation of a synthetic statement. (Quine 1980, pp. 41–42)

Typeset in Belle 12/19 using Plain TeX.

CONTENTS

Front Matter

Chapter 1. Prologue
  1. Evaluating inference procedures
     One option: Frequentism
     Another option: factualism
     Statistical inference is in trouble
  2. A simple example
  3. What this book will show
  4. Why philosophers need to read this book

PART I: THE STATE OF PLAY IN STATISTICAL INFERENCE

Chapter 2. Definitions and Axioms
  1. Introduction
  2. The scope of this book
     Four big caveats
     Hypotheses
     Theories of theory change
  3. Basic notation
     An objection to using X
     Non-parametric statistics
  4. Conditional probability as primitive
  5. Exchangeability and multisets
     Exchangeability
     Multisets
  6. Merriment
  7. Jeffrey conditioning
  8. The words “Bayesian” and “Frequentist”
  9. Other preliminary considerations

Chapter 3. Catalogue I: Bayesianism
  1. Introduction
  2. Bayesianism in general
     Bayesian confirmation theory
  3. Subjective Bayesianism
     The uniqueness property of Subjective Bayesianism
  4. Objective Bayesianism
     Restricted Bayesianism
     Empirical Bayesianism
     Conjugate Ignorance Priors I: Jeffreys
     Conjugate Ignorance Priors II: Jaynes
     Robust Bayesianism
     Objective Subjective Bayesianism

Chapter 4. Catalogue II: Frequentism
  1. Definition of Frequentism
  2. The Neyman-Pearson school
  3. Neyman's theory of hypothesis tests
     Reference class 1: Random samples
     Reference class 2: “Random experiments”
     Probabilities fixed once and for all
     Frequentist probability is not epistemic
     Neyman-Pearson hypothesis testing
  4. Neyman-Pearson confidence intervals
  5. Inference in other dimensions
  6. Fisher's Frequentist theory
  7. Structural inference
  8. The popular theory of P-values

Chapter 5. Catalogue III: Other Theories
  1. Pure likelihood inference
     The method of maximum likelihood
     The method of support
     Fisher's fiducial inference
     Other pure likelihood methods
  2. Pivotal inference
  3. Plausibility inference
  4. Shafer belief functions
  5. The two-standard-deviation rule (a non-theory)
  6. Possible future theories

PART II: FOR AND AGAINST THE LIKELIHOOD PRINCIPLE

Chapter 6. Prologue to Part II

Chapter 7. Objections to Frequentist Procedures
  1. Frequentism as repeated application of a procedure
     General features of Frequentist procedures
     Uses of error rates: expectancy versus inference
  2. Constructing a Frequentist procedure
     Privileging a hypothesis
     Calculating a Frequentist error rate
     Choosing a test statistic (T)
     T's lack of invariance
     Problems due to multiplicity
     Are P-values informative about H?
  3. Confidence intervals
     Are confidence intervals informative about H?
     A clearly useless confidence interval
     Biased relevant subsets
  4. In what way is Frequentism objective?
  5. Fundamental problems of Frequentism
     Counterfactuals
     Conditioning on new information
  6. Conclusion

Chapter 8. The Likelihood Principle
  1. Introduction
     The importance of the likelihood principle
  2. Classification
  3. Group I: the likelihood principle
  4. Group II: Corollaries of group I
  5. Group I compared to Group II
  6. Group III: the law of likelihood
  7. A new version of the likelihood principle
  8. Other uses of the likelihood function
  9. The likelihood principle in applied statistics

Chapter 9. Is the Likelihood Principle Unclear?
  1. Objection 9.1: hypothesis space unclear
  2. Objection 9.2: Likelihood function unclear
  3. Objection 9.3: Likelihood principle unimportant

Chapter 10. Conflicts With the Likelihood Principle
  1. Objection 10.1: It undermines statistics
  2. Objection 10.2: There are counter-examples
     Objection 10.2.1: Fraser's example
     Objection 10.2.2: Examples using improper priors
  3. Objection 10.3: Akaike's unbiased estimator
     The definition of an unbiased estimator
     Unbiasedness not a virtue
     An example of talk about bias
     Why is unbiasedness considered good?
  4. Objection 10.4: We should use only consistent estimators

Chapter 11. Further Objections to the Likelihood Principle
  1. Objection 11.1: No arguments in favour
  2. Objection 11.2: Not widely applicable
     Objection 11.3: No care over experimental design
  3. Objection 11.4: Allows sampling to a foregone conclusion
  4. Objection 11.5: Implies a stopping rule principle

PART III: PROOF AND PUDDING

Chapter 12. A Proof of the Likelihood Principle
  1. Introduction
  2. Premises
     Premise: The weak sufficiency principle (WSP)
     Premise: The weak conditionality principle (WCP)
     Alternative premises
  3. Proof of the likelihood principle
     How the proof illuminates the likelihood principle
     Infinite hypothesis spaces
     Bjørnstad's generalisation

Chapter 13. Objections to Proofs of the Likelihood Principle
  1. Objection 13.1: The WSP is false
  2. Objection 13.2: Irrelevant which merriment occurs
  3. Objection 13.3: Minimal sufficient statistics

Chapter 14. Consequences of Adopting the Likelihood Principle
  1. A case study
     Sequential clinical trials
     A brief history
     A Subjective Bayesian solution
     A more objective solution
  2. General conclusions
     Mildly invalidating almost all Frequentist methods
     Grossly invalidating some Frequentist methods
     Final conclusions

References

ACKNOWLEDGEMENTS

Max Parmar, who first got me interested in this topic and thus ruined my relatively easy and lucrative careers in computing and public health. Thanks Max.

Geoffrey Berry, Peter Lipton, Neil Thomason, Paul Griffiths, Huw Price, Alison Moore, Jackie Grossman, Justin Grossman, Tarquin Grossman, Nancy Moore and Alan Moore, for massive long-term support.

Mark Colyvan, Alan Hájek, Andrew Robinson and Nicholas J.J. Smith, who read the first draft of this book amazingly thoroughly and corrected many mathematical mistakes and other infelicities. I would especially like to thank Alan Hájek for dividing his 154 suggested changes into no less than six categories, from “bigger things” to “nano things”.
The Center for Philosophy of Science, University of Pittsburgh, for a Visiting Fellowship which enabled me to write the first draft of this book.

Susie Bayarri, James O. Berger, Jim Bogen, David Braddon-Mitchell, Jeremy Butterfield, Mike Campbell, Hugh Clapin, Mark Colyvan, David Dowe, Stephen Gaukroger, Ian Gordon, Dave Grayson, Alan Hájek, Allen Hazen, Matthew Honnibal, Claire Hooker, Kevin Korb, Doug Kutach, Claire Leslie, Alison Moore, Erik Nyberg, Max Parmar, Huw Price, John Price, Denis Robinson, Daniel Steel, Ken Schaffner, Teddy Seidenfeld, David Spiegelhalter, Neil Thomason and Robert Wolpert, for discussions which helped me directly with the ideas in this book, and many others for discussions which helped me with indirectly related topics.

And two books. Ian Hacking's Logic of Statistical Inference (Hacking 1965), which introduced me to the Likelihood Principle and also (if I remember correctly) gave me the silly idea that I might want to be a professional philosopher. And James O. Berger and Robert L. Wolpert's The Likelihood Principle (Berger & Wolpert 1988), which hits more nails on the head than I can poke a stick at. One of my main hopes has been to translate some of the more exciting ideas which I find in Berger and Wolpert (or imagine I find there — any mistakes are of course my own responsibility) into philosopher-speak.

— 1 —

Prologue

This is a book about statistical inference, as used in almost all of modern science. It is a discussion of the general techniques which statistical inference uses, and an attempt to arbitrate between competing schools of thought about these general techniques. It is intended for two audiences: firstly, philosophers of science; and secondly, anyone, including statisticians, who is interested in the most fundamental controversies about how to assess evidence in the light of competing statistical hypotheses.

In addition to attempting to arbitrate between theories of statistics, this book also makes some attempt to arbitrate between the literatures of statistics and of philosophy of science. These are disciplines which have often overlapped to some extent, with the most philosophically educated statisticians and the most statistically educated philosophers generally aware of each other's work.1 And yet it is still common for work on the foundations of statistics to proceed in ignorance of one literature or the other. I am sure this book has many omissions, but at least it has the merit of paying attention to the best of my ability to what both philosophers and statisticians have to say. And much of its work will be drawing out philosophical conclusions (by which