R. A. Fisher in the 21St Century
Total Page:16
File Type:pdf, Size:1020Kb
Statistical Science 1998, Vol. 13, No. 2, 95]122 R. A. Fisher in the 21st Century Invited Paper Presented at the 1996 R. A. Fisher Lecture Bradley Efron Abstract. Fisher is the single most important figure in 20th century statistics. This talk examines his influence on modern statistical think- ing, trying to predict how Fisherian we can expect the 21st century to be. Fisher's philosophy is characterized as a series of shrewd compro- mises between the Bayesian and frequentist viewpoints, augmented by some unique characteristics that are particularly useful in applied problems. Several current research topics are examined with an eye toward Fisherian influence, or the lack of it, and what this portends for future statistical developments. Based on the 1996 Fisher lecture, the article closely follows the text of that talk. Key words and phrases: Statistical inference, Bayes, frequentist, fidu- cial, empirical Bayes, model selection, bootstrap, confidence intervals. 1. INTRODUCTION to some speculations on Fisher's role in the statisti- cal world of the 21st century. Even scientists need their heroes, and R. A. Fisher What follows is basically the text of the Fisher was certainly the hero of 20th century statistics. lecture presented to the August 1966 Joint Statisti- His ideas dominated and transformed our field to cal meetings in Chicago. The talk format has cer- an extent a Caesar or an Alexander might have tain advantages over a standard journal article. envied. Most of this happened in the second quarter First and foremost, it is meant to be absorbed of the century, but by the time of my own education quickly, in an hour, forcing the presentation to Fisher had been reduced to a somewhat minor concentrate on main points rather than technical figure in American academic statistics, with the details. Spoken language tends to be livelier than influence of Neyman and Wald rising to their high the gray prose of a journal paper. A talk encourages water mark. bolder distinctions and personal opinions, which There has been a late 20th century resurgence of are dangerously vulnerable in a written article but interest in Fisherian statistics, in England where appropriate I believe for speculations about the his influence never much waned, but also in Amer- future. In other words, this will be a broad-brush ica and the rest of the statistical world. Much of painting, long on color but short on detail. this revival has gone unnoticed because it is hidden These advantages may be viewed in a less favor- behind the dazzle of modern computational meth- able light by the careful reader. Fisher's mathemat- ods. One of my main goals here will be to clarify ical arguments are beautiful in their power and Fisher's influence on modern statistics. Both the economy, and most of that is missing here. The strengths and limitations of Fisherian thinking will broad brush strokes sometimes conceal important be described, mainly by example, finally leading up areas of controversy. Most of the argumentation is by example rather than theory, with examples from my own work playing an exaggerated role. Refer- Bradley Efron is Max H. Stein Professor of Human- ences are minimal, and not indicated in the usual ities and Sciences and Professor of Statistics and author]year format but rather collected in anno- Biostatistics, Department of Statistics, Stanford tated form at the end of the text. Most seriously, University, Stanford, California 94305-4065 Že- the one-hour limit required a somewhat arbitrary mail: [email protected].. selection of topics, and in doing so I concentrated on 95 96 B. EFRON those parts of Fisher's work that have been most methods will be widely recognized as a central important to me, omitting whole areas of Fisherian element of scientific thinking. influence such as randomization and experimental The 20th century began on an auspicious statisti- design. The result is more a personal essay than a cal note with the appearance of Karl Pearson's systematic survey. famous x 2 paper in the spring of 1900. The This is a talk Žas I will now refer to it. on Fisher's groundwork for statistics's growth was laid by a influence, not mainly on Fisher himself or even his pre]World War II collection of intellectual giants: intellectual history. A much more thorough study of Neyman, the Pearsons, Student, Kolmogorov, the work itself appears in L. J. Savage's famous Hotelling and Wald, with Neyman's work being talk and essay, ``On rereading R. A. Fisher,'' the especially influential. But from our viewpoint at the 1971 Fisher lecture, a brilliant account of Fisher's century's end, or at least from my viewpoint, the statistical ideas as sympathetically viewed by a dominant figure has been R. A. Fisher. Fisher's leading Bayesian ŽSavage, 1976.. Thanks to John influence is especially pervasive in statistical appli- Pratt's editorial efforts, Savage's talk appeared, cations, but it also runs through the pages of our posthumously, in the 1976 Annals of Statistics. In theoretical journals. With the end of the century in the article's discussion, Oscar Kempthorne called it view this seemed like a good occasion for taking the best statistics talk he had ever heard, and stock of the vitality of Fisher's legacy and its poten- Churchill Eisenhart said the same. Another fine tial for future development. reference is Yates and Mather's introduction to the A more accurate but less provocative title for this 1971 five-volume set of Fisher's collected works. talk would have been ``Fisher's influence on modern The definitive Fisher reference in Joan Fisher Box's statistics.'' What I will mostly do is examine some 1978 biography, The Life of a Scientist. topics of current interest and assess how much It is a good rule never to meet your heroes. I Fisher's ideas have or have not influenced them. inadvertently followed this rule when Fisher spoke The central part of the talk concerns six research at the Stanford Medical School in 1961, without areas of current interest that I think will be impor- notice to the Statistics Department. The strength of tant during the next couple of decades. This will Fisher's powerful personality is missing from this also give me a chance to say something about the talk, but not I hope the strength of his ideas. Heroic kinds of applied problems we might be dealing with is a good word for Fisher's attempts to change soon, and whether or not Fisherian statistics is statistical thinking, attempts that had a profound going to be of much help with them. influence on this century's development of statistics First though I want to give a brief review of into a major force on the scientific landscape. ``What Fisher's ideas and the ideas he was reacting to. One about the next century?'' is the implicit question difficulty in assessing the importance of Fisherian asked in the title, but I won't try to address that statistics is that it's hard to say just what it is. question until later. Fisher had an amazing number of important ideas and some of them, like randomization inference and conditionality, are contradictory. It's a little as if in 2. THE STATISTICAL CENTURY economics Marx, Adam Smith and Keynes turned Despite its title, the greater portion of the talk out to be the same person. So I am just going to concerns the past and the present. I am going to outline some of the main Fisherian themes, with no begin by looking back on statistics in the 20th attempt at completeness or philosophical reconcilia- century, which has been a time of great advance- tion. This and the rest of the talk will be very short ment for our profession. During the 20th century on references and details, especially technical de- statistical thinking and methodology have become tails, which I will try to avoid entirely. the scientific framework for literally dozens of fields, In 1910, two years before the 20-year-old Fisher including education, agriculture, economics, biology published his first paper, an inventory of the statis- and medicine, and with increasing influence re- tics world's great ideas would have included the cently on the hard sciences such as astronomy, following impressive list: Bayes theorem, least geology and physics. squares, the normal distribution and the central In other words, we have grown from a small limit theorem, binomial and Poisson methods for obscure field into a big obscure field. Most people count data, Galton's correlation and regression, and even most scientists still don't know much multivariate distributions, Pearson's x 2 and Stu- about statistics except that there is something good dent's t. What was missing was a core for these about the number ``.05'' and perhaps something bad ideas. The list existed as an ingenious collection of about the bell curve. But I believe that this will ad hoc devices. The situation for statistics was change in the 21st century and that statistical similar to the one now faced by computer science. R. A. FISHER IN THE 21ST CENTURY 97 In Joan Fisher Box's words, ``The whole field was be freed from the a priori assumptions of the like an unexplored archaeological site, its structure Bayesian school. hardly perceptible above the accretions of rubble, Fisher's main tactic was to logically reduce a its treasures scattered throughout the literature.'' given inference problem, sometimes a very compli- There were two obvious candidates to provide a cated one, to a simple form where everyone should statistical core: ``objective'' Bayesian statistics in agree that the answer is obvious.