STATISTICAL SCIENCE Volume 29, Number 2 May 2014

Special Section on Programming &

Four Papers on Contemporary Software Design Strategies for Statistical Methodologists ...... Vincent Carey and Dianne Cook 165
Object-Oriented Programming, Functional Programming and R ...... John M. Chambers 167
Enhancing R with Advanced Compilation Tools and Methods ...... Duncan Temple Lang 181
Reactive Programming for Interactive Graphics ...... Yihui Xie, Heike Hofmann and Xiaoyue Cheng 201
Scalable Genomics with R and Bioconductor ...... Michael Lawrence and Martin Morgan 214

General Section

On the Birnbaum Argument for the Strong Likelihood Principle ...... Deborah G. Mayo 227
Discussion of “On the Birnbaum Argument for the Strong Likelihood Principle” ...... A. P. Dawid 240
Discussion of “On the Birnbaum Argument for the Strong Likelihood Principle” ...... Michael Evans 242
Discussion: Foundations of Statistical Inference, Revisited ...... Ryan Martin and Chuanhai Liu 247
Discussion: On Arguments Concerning Statistical Principles ...... D. A. S. Fraser 252
Discussion of “On the Birnbaum Argument for the Strong Likelihood Principle” ...... Jan Hannig 254
Discussion of “On the Birnbaum Argument for the Strong Likelihood Principle” ...... Jan F. Bjørnstad 259
Rejoinder: “On the Birnbaum Argument for the Strong Likelihood Principle” ...... Deborah G. Mayo 261
Comments on the Neyman–Fisher Controversy and Its Consequences ...... Arman Sabbaghi and Donald B. Rubin 267
Two Modeling Strategies for Empirical Bayes Estimation ...... Bradley Efron 285
Pooled Association Tests for Rare Genetic Variants: A Review and Some New Results ...... Andriy Derkach, Jerry F. Lawless and Lei Sun 302

Statistical Science [ISSN 0883-4237 (print); ISSN 2168-8745 (online)], Volume 29, Number 2, May 2014. Published quarterly by the Institute of Mathematical Statistics, 3163 Somerset Drive, Cleveland, OH 44122, USA. Periodicals postage paid at Cleveland, Ohio and at additional mailing offices. POSTMASTER: Send address changes to Statistical Science, Institute of Mathematical Statistics, Dues and Subscriptions Office, 9650 Rockville Pike—Suite L2310, Bethesda, MD 20814-3998, USA.

Copyright © 2014 by the Institute of Mathematical Statistics
Printed in the United States of America

EDITOR
Peter Green, University of Bristol and University of Technology, Sydney

ASSOCIATE EDITORS
Vincent Carey, Harvard University
Rong Chen, Rutgers University
Dianne Cook, Iowa State University
Rainer Dahlhaus, University of Heidelberg
Michel Dekking, Delft University
Peter J. Diggle, Lancaster University
Robin Evans, University of Oxford
Michael Friendly, York University
Edward I. George, University of Pennsylvania
Peter Green, University of Bristol
Peter Hoff, University of Washington
Sylvie Huet, INRA
Shane Jensen, University of Pennsylvania
Samuel Kou, Harvard University
David Madigan, Columbia University
Kerrie Mengersen, Queensland University of Technology
Peter Müller, The University of Texas
Sonia Petrone, Bocconi University
Jim Pitman, University of California, Berkeley
Annie Qu, University of Illinois, Urbana-Champaign
Nancy Reid, University of Toronto
Thomas Richardson, University of Washington
Christian Robert, University of Paris, Dauphine
Andrea Rotnitzky, Universidad Torcuato Di Tella and Harvard University
Thomas Severini, Northwestern University
Glenn Shafer, Rutgers Business School–Newark and New Brunswick; Royal Holloway College, University of London
Michael Stein, University of Chicago
Jon Wakefield, University of Washington
Guenther Walther, Stanford University
Martin Wells, Cornell University
Tong Zhang, Rutgers University

MANAGING EDITOR
T. N. Sriram, University of Georgia

PRODUCTION EDITOR
Patrick Kelly

EDITORIAL COORDINATOR
Kristina Mattson

PAST EXECUTIVE EDITORS
Morris H. DeGroot, 1986–1988
Carl N. Morris, 1989–1991
Robert E. Kass, 1992–1994
Paul Switzer, 1995–1997
Leon J. Gleser, 1998–2000
Richard Tweedie, 2001
Morris Eaton, 2001
George Casella, 2002–2004
Edward I. George, 2005–2007
David Madigan, 2008–2010
Jon A. Wellner, 2011–2013

Statistical Science 2014, Vol. 29, No. 2, 165–166
DOI: 10.1214/14-STS481
© Institute of Mathematical Statistics, 2014

Four Papers on Contemporary Software Design Strategies for Statistical Methodologists
Vincent Carey and Dianne Cook


Vincent Carey is Associate Professor of Medicine (Biostatistics), Channing Division of Network Medicine, Harvard Medical School, Boston, Massachusetts 02115, USA (e-mail: [email protected]). Dianne Cook is Professor, Department of Statistics, Iowa State University, Ames, Iowa 50011, USA (e-mail: [email protected]).

Statistical Science 2014, Vol. 29, No. 2, 167–180
DOI: 10.1214/13-STS452
© Institute of Mathematical Statistics, 2014

Object-Oriented Programming, Functional Programming and R
John M. Chambers

Abstract. This paper reviews some programming techniques in R that have proved useful, particularly for substantial projects. These include several versions of object-oriented programming, used in a large number of R packages. The review tries to clarify the origins and ideas behind the various versions, each of which is valuable in the appropriate context.

R has also been strongly influenced by the ideas of functional programming and, in particular, by the desire to combine functional with object-oriented programming.

To clarify how this particular mix of ideas has turned out in the current R language and supporting software, the paper will first review the basic ideas behind object-oriented and functional programming, and then examine the evolution of R with these ideas providing context.

Functional programming supports well-defined, defensible software giving reproducible results. Object-oriented programming is the mechanism par excellence for managing complexity while keeping things simple for the user. The two paradigms have been valuable in supporting major software for fitting models to data and numerous other statistical applications.

The paradigms have been adopted, and adapted, distinctively in R. Functional programming motivates much of R but R does not enforce the paradigm. Object-oriented programming from a functional perspective differs from that used in non-functional languages, a distinction that needs to be emphasized to avoid confusion.

R initially replicated the S language from Bell Labs, which in turn was strongly influenced by earlier program libraries. At each stage, new ideas have been added, but the previous software continues to show its influence in the design as well. Outlining the evolution will further clarify why we currently have this somewhat unusual combination of ideas.

Key words and phrases: Programming languages, functional programming, object-oriented programming.
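The distinction the abstract draws between object-oriented programming from a functional perspective and the form used in non-functional languages can be sketched in R. The sketch below is only an illustration and is not taken from the paper: it uses S4 classes and reference classes from R's methods package as two examples, and the names Account, AccountRef and deposit are invented for this purpose.

```r
library(methods)

# Functional OOP (S4): a method belongs to a generic function,
# and calling it returns a new object; the argument is not changed.
setClass("Account", representation(balance = "numeric"))
setGeneric("deposit", function(acct, amount) standardGeneric("deposit"))
setMethod("deposit", "Account", function(acct, amount) {
  acct@balance <- acct@balance + amount  # modifies a local copy only
  acct
})

a <- new("Account", balance = 100)
b <- deposit(a, 50)   # a is unchanged; b carries the new balance

# Encapsulated OOP (reference class): a method belongs to the object,
# and calling it changes the object in place.
AccountRef <- setRefClass("AccountRef",
  fields = list(balance = "numeric"),
  methods = list(
    deposit = function(amount) {
      balance <<- balance + amount       # updates the field by reference
      invisible(.self)
    }
  )
)

r <- AccountRef$new(balance = 100)
r$deposit(50)         # r itself now holds the new balance

print(c(a = a@balance, b = b@balance, r = r$balance))
```

In the first style the call reads deposit(a, 50) and behaves like any other R function; in the second the call reads r$deposit(50) and has a side effect on r.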

John M. Chambers is Consulting Professor, Department of Statistics, Stanford University, Stanford, California 94305-4065, USA (e-mail: [email protected]).

Statistical Science 2014, Vol. 29, No. 2, 181–200
DOI: 10.1214/13-STS462
© Institute of Mathematical Statistics, 2014

Enhancing R with Advanced Compilation Tools and Methods
Duncan Temple Lang

Abstract. I describe an approach to compiling common idioms in R code directly to native machine code and illustrate it with several examples. Not only can this yield significant performance gains, but it allows us to use new approaches to computing in R. Importantly, the compilation requires no changes to R itself, but is done entirely via R packages. This allows others to experiment with different compilation strategies and even to define new domain-specific languages within R. We use the Low-Level Virtual Machine (LLVM) compiler toolkit to create the native code and perform sophisticated optimizations on the code. By adopting this widely used software within R, we leverage its ability to generate code for different platforms such as CPUs and GPUs, and will continue to benefit from its ongoing development. This approach potentially allows us to develop high-level R code that is also fast, that can be compiled to work with different data representations and sources, and that could even be run outside of R. The approach aims to both provide a compiler for a limited subset of the R language and also to enable R programmers to write other compilers. This is another approach to help us write high-level descriptions of what we want to compute, not how.

Key words and phrases: Programming language, efficient computation, compilation, extensible compiler toolkit.

While the code for this function is reasonably simple, there are many details involved in generating the native code, such as defining the routine and its parameters, creating the instruction blocks, loading and storing values, and creating instructions to perform subtraction, call the fib() function and return a value. The LLVM C++ API (Application Programming Interface) provides numerous classes and methods that allow us to create instances of these conceptual items such as Functions, Blocks, many different types of instructions and so on. The Rllvm package provides an R interface to these C++ classes and methods and allows us to create and manipulate these objects directly within R. For example, the following code shows how we can define the function, the entry instruction block and generate the call fib(n - 1):

    mod = Module()
    f = Function("fib", Int32Type,
                 list(n = Int32Type), mod)
    start = Block(f)
    ir = IRBuilder(start)
    parms = getParameters(f)
    n.minus.1 = binOp(ir, Sub, parms$n,
                      createConstant(ir, 1L))
    createCall(ir, f, n.minus.1)

We don’t want to write this code manually ourselves in R, although Rllvm enables us to do so. Instead, we want to programmatically transform the R code in the fib() function to create the LLVM objects. The RLLVMCompile package does this. Since R functions are regular R objects which we can query and manipulate directly in R, we can traverse the expressions in the body of a function, analyze each one and perform a simple-minded translation from R concepts to LLVM concepts. This is the basic way the compileFunction() generates the code, using customizable handler func-
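The idea of traversing the expressions in a function body can be sketched in base R alone, without Rllvm or RLLVMCompile. The sketch below is an illustration under stated assumptions, not the package's implementation: the definition of fib() is written here for the example (the paper's own definition is not shown in this section), and collect_calls() is a hypothetical helper that only lists the functions called, where a real compiler would emit instructions.

```r
# Assumed definition of fib(), written for this illustration only.
fib <- function(n) {
  if (n < 2L) n else fib(n - 1L) + fib(n - 2L)
}

# Hypothetical helper: walk an expression tree recursively and
# collect the names of the functions it calls, in traversal order.
collect_calls <- function(expr) {
  if (!is.call(expr)) return(character(0))   # constants and symbols: no call
  head <- expr[[1]]
  name <- if (is.name(head)) as.character(head) else character(0)
  args <- as.list(expr)[-1]
  c(name, unlist(lapply(args, collect_calls), use.names = FALSE))
}

# body(fib) is an ordinary R language object that can be inspected.
calls <- collect_calls(body(fib))
print(calls)
```

A translator built on this pattern would dispatch on each call name (for example "if", "-", or "fib") to a handler that generates the corresponding low-level code.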

Duncan Temple Lang is Associate Professor, Department of Statistics, University of California at Davis, 4210 Math Sciences Building, Davis, California 95616, USA (e-mail: [email protected]).

what to compute, not how. We then use smart interpreters or compilers to generate efficient code, simultaneously freeing R programmers to concentrate on their tasks and leveraging domain expertise for executing the code. We hope others will be able to use these basic building blocks to improve matters and also to explore quite different approaches and new languages within the R environment.

ACKNOWLEDGMENTS

Vincent Buffalo made valuable contributions to designing and developing the RLLVMCompile package in the initial work. Vincent Carey has provided important ideas, insights, advice and motivation and I am very grateful to him for organizing this collection of papers and the session at the 2012 Joint Statistical Meetings. Also, I appreciate the very useful comments on the initial draft of this paper by the three reviewers and also John Chambers.

SUPPLEMENTARY MATERIAL

The code for the examples in this paper, along with the timing results and their meta-data, are available from https://github.com/duncantl/RllvmTimings as a git repository. The versions of the Rllvm and RLLVMCompile packages involved in the timings can also be retrieved from their respective git repositories. The specific code used is associated with the git tag StatSciPaper.

Statistical Science 2014, Vol. 29, No. 2, 201–213
DOI: 10.1214/14-STS477
© Institute of Mathematical Statistics, 2014

Reactive Programming for Interactive Graphics
Yihui Xie, Heike Hofmann and Xiaoyue Cheng

Abstract. One of the big challenges of developing interactive statistical applications is the management of the data pipeline, which controls transformations from data to plot. The user’s interactions need to be propagated through these modules and reflected in the output representation at a fast pace. Each individual module may be easy to develop and manage, but the dependency structure can be quite challenging. The MVC (Model/View/Controller) pattern is an attempt to solve the problem by separating the user’s interaction from the representation of the data. In this paper we discuss the paradigm of reactive programming in the framework of the MVC architecture and show its applicability to interactive graphics. Under this paradigm, developers benefit from the separation of user interaction from the graphical representation, which makes it easier for users and developers to extend interactive applications. We show the central role of reactive data objects in an interactive graphics system, implemented as the R package cranvas, which is freely available on GitHub; its main developers include the authors of this paper.

Key words and phrases: Reactive programming, interactive graphics, R language.
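A reactive data object of the kind the abstract describes can be sketched with a plain R closure. This is a minimal illustration and does not use or reproduce the cranvas API: make_reactive(), on_change() and the two "views" are hypothetical names, and the views only record a message where a real system would redraw a plot.

```r
# Hypothetical helper: a value that notifies registered listeners
# whenever it is changed (the "model" in MVC terms).
make_reactive <- function(value) {
  listeners <- list()
  list(
    get = function() value,
    set = function(new_value) {
      value <<- new_value
      for (f in listeners) f(new_value)   # propagate the change to every view
      invisible(NULL)
    },
    on_change = function(f) {
      listeners[[length(listeners) + 1L]] <<- f
      invisible(NULL)
    }
  )
}

# The set of selected (brushed) rows, shared by two views.
selected <- make_reactive(integer(0))
redraw_log <- character(0)

selected$on_change(function(rows) {
  redraw_log <<- c(redraw_log, paste("scatterplot redraw:", length(rows), "rows"))
})
selected$on_change(function(rows) {
  redraw_log <<- c(redraw_log, paste("histogram redraw:", length(rows), "rows"))
})

# A user interaction (the "controller") only changes the data;
# the views update themselves in response.
selected$set(c(2L, 5L, 7L))
print(redraw_log)
```

The code that handles the interaction never refers to the views, which is the separation of interaction from graphical representation that the abstract emphasizes.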

Yihui Xie and Xiaoyue Cheng are Ph.D. Students, Department of Statistics, Iowa State University, 102 Snedecor Hall, Ames, Iowa 50011, USA (e-mail: [email protected]; URL: http://yihui.name; e-mail: [email protected]; URL: http://xycheng.public.iastate.edu). Heike Hofmann is Professor, Department of Statistics, Iowa State University, 2413 Snedecor Hall, Ames, Iowa 50011, USA (e-mail: [email protected]; URL: http://hofmann.public.iastate.edu).

Statistical Science 2014, Vol. 29, No. 2, 214–226
DOI: 10.1214/14-STS476
© Institute of Mathematical Statistics, 2014

Scalable Genomics with R and Bioconductor
Michael Lawrence and Martin Morgan

Abstract. This paper reviews strategies for solving problems encountered when analyzing large genomic data sets and describes the implementation of those strategies in R by packages from the Bioconductor project. We treat the scalable processing, summarization and visualization of big genomic data. The general ideas are well established and include restrictive queries, compression, iteration and parallel computing. We demonstrate the strategies by applying Bioconductor packages to the detection and analysis of genetic variants from a whole genome sequencing experiment.

Key words and phrases: R, Bioconductor, genomics, biology, big data.


Michael Lawrence is Computational Biologist, Genentech, 1 DNA Way, South San Francisco, California 94080, USA (e-mail: michafl[email protected]). Martin Morgan is Principal Staff Scientist, Fred Hutchinson Cancer Research Center, 1100 Fairview Ave. N., P.O. Box 19024, Seattle, Washington 98109, USA (e-mail: [email protected]).

Statistical Science 2014, Vol. 29, No. 2, 227–239
DOI: 10.1214/13-STS457
© Institute of Mathematical Statistics, 2014

On the Birnbaum Argument for the Strong Likelihood Principle
Deborah G. Mayo

Abstract. An essential component of inference based on familiar frequentist notions, such as p-values, significance and confidence levels, is the relevant sampling distribution. This feature results in violations of a principle known as the strong likelihood principle (SLP), the focus of this paper. In particular, if outcomes $x^*$ and $y^*$ from experiments $E_1$ and $E_2$ (both with unknown parameter $\theta$) have different probability models $f_1(\cdot)$, $f_2(\cdot)$, then even though $f_1(x^*; \theta) = c f_2(y^*; \theta)$ for all $\theta$, outcomes $x^*$ and $y^*$ may have different implications for an inference about $\theta$. Although such violations stem from considering outcomes other than the one observed, we argue this does not require us to consider experiments other than the one performed to produce the data. David Cox [Ann. Math. Statist. 29 (1958) 357–372] proposes the Weak Conditionality Principle (WCP) to justify restricting the space of relevant repetitions. The WCP says that once it is known which $E_i$ produced the measurement, the assessment should be in terms of the properties of $E_i$. The surprising upshot of Allan Birnbaum’s [J. Amer. Statist. Assoc. 57 (1962) 269–306] argument is that the SLP appears to follow from applying the WCP in the case of mixtures, and so uncontroversial a principle as sufficiency (SP). But this would preclude the use of sampling distributions. The goal of this article is to provide a new clarification and critique of Birnbaum’s argument. Although his argument purports that [(WCP and SP) entails SLP], we show how data may violate the SLP while holding both the WCP and SP. Such cases also refute [WCP entails SLP].

Key words and phrases: Birnbaumization, likelihood principle (weak and strong), sampling theory, sufficiency, weak conditionality.
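A standard textbook illustration of the situation described in the abstract, not taken from the text above, compares binomial and negative binomial sampling. Assume 9 successes and 3 failures are observed, either from a fixed 12 trials or by sampling until the third failure; the numbers and the one-sided test of success probability 0.5 are choices made for this sketch. The two likelihoods are proportional for every value of the parameter, yet the p-values differ because the sampling distributions differ.

```r
theta <- seq(0.1, 0.9, by = 0.1)   # candidate success probabilities

# E1: fixed n = 12 trials, 9 successes observed.
lik_binom <- dbinom(9, size = 12, prob = theta)

# E2: sample until the 3rd failure; 9 successes observed before it.
# dnbinom counts events before the size-th "target" event, so the
# target event here is a failure, with probability 1 - theta.
lik_nbinom <- dnbinom(9, size = 3, prob = 1 - theta)

# The likelihoods are proportional: the ratio is the same constant c
# for every theta (choose(12, 9) / choose(11, 2) = 220 / 55 = 4).
ratio <- lik_binom / lik_nbinom
print(ratio)

# One-sided p-values for success probability 0.5 against larger values:
# the probability of 9 or more successes under each sampling scheme.
p_binom  <- pbinom(8, size = 12, prob = 0.5, lower.tail = FALSE)   # 299/4096, about 0.073
p_nbinom <- pnbinom(8, size = 3, prob = 0.5, lower.tail = FALSE)   # 67/2048, about 0.033
print(c(binomial = p_binom, negative_binomial = p_nbinom))
```

Under the SLP the two outcomes would carry the same evidence about the parameter; a sampling-theory assessment treats them differently, which is the kind of violation the abstract refers to.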

Deborah G. Mayo is Professor of Philosophy, Department of Philosophy, Virginia Tech, 235 Major Williams Hall, Blacksburg, Virginia 24061, USA.

Statistical Science 2014, Vol. 29, No. 2, 240–241 DOI: 10.1214/14-STS470 © Institute of Mathematical Statistics, 2014

Discussion of “On the Birnbaum Argument for the Strong Likelihood Principle” A. P. Dawid

Abstract. Deborah Mayo claims to have refuted Birnbaum’s argument that the Likelihood Principle is a logical consequence of the Sufficiency and Conditionality Principles. However, this claim fails because her interpretation of the Conditionality Principle is different from Birnbaum’s. Birnbaum’s proof cannot be so readily dismissed.

Key words and phrases: Conditionality principle, Birnbaum’s theorem, likelihood principle, sufficiency principle, weak conditionality principle.

A. P. Dawid is Emeritus Professor of Statistics, Statistical Laboratory, Centre for Mathematical Sciences, Cambridge University, Wilberforce Road, Cambridge CB3 0WB, UK.

Statistical Science 2014, Vol. 29, No. 2, 242–246 DOI: 10.1214/14-STS471 © Institute of Mathematical Statistics, 2014

Discussion of “On the Birnbaum Argument for the Strong Likelihood Principle” Michael Evans

Abstract. We discuss Birnbaum’s result, its relevance to statistical reasoning, Mayo’s objections and the result in [Electron. J. Statist. 7 (2013) 2645–2655] that the proof of this result doesn’t establish what is commonly believed.

Key words and phrases: Sufficiency, conditionality, likelihood, statistical evidence.

Michael Evans is Professor, Department of Statistical Sciences, University of Toronto, 100 St. George St., Toronto, Ontario M5S 3G3, Canada.

Statistical Science 2014, Vol. 29, No. 2, 247–251 DOI: 10.1214/14-STS472 © Institute of Mathematical Statistics, 2014

Discussion: Foundations of Statistical Inference, Revisited Ryan Martin and Chuanhai Liu

Abstract. This is an invited contribution to the discussion on Professor Deborah Mayo’s paper, “On the Birnbaum argument for the strong likelihood principle,” to appear in Statistical Science. Mayo clearly demonstrates that statistical methods violating the likelihood principle need not violate either the sufficiency or conditionality principle, thus refuting Birnbaum’s claim. With the constraints of Birnbaum’s theorem lifted, we revisit the foundations of statistical inference, focusing on some new foundational principles, the inferential model framework, and connections with sufficiency and conditioning.

Key words and phrases: Birnbaum, conditioning, dimension reduction, inferential model, likelihood principle.

Ryan Martin is Assistant Professor, Department of Mathematics, Statistics, and Computer Science, University of Illinois at Chicago, 851 S. Morgan St., Chicago, Illinois 60607, USA. Chuanhai Liu is Professor, Department of Statistics, Purdue University, 250 North University St., West Lafayette, Indiana 47907-2067, USA.

Statistical Science 2014, Vol. 29, No. 2, 252–253 DOI: 10.1214/14-STS473 © Institute of Mathematical Statistics, 2014

Discussion: On Arguments Concerning Statistical Principles D. A. S. Fraser

D. A. S. Fraser is Emeritus Professor, Department of Statistical Sciences, University of Toronto, 100 St. George St., Toronto, Ontario M5S 3G3, Canada.

Statistical Science 2014, Vol. 29, No. 2, 254–258 DOI: 10.1214/14-STS474 © Institute of Mathematical Statistics, 2014

Discussion of “On the Birnbaum Argument for the Strong Likelihood Principle” Jan Hannig

Abstract. In this discussion we demonstrate that fiducial distributions provide a natural example of an inference paradigm that does not obey the Strong Likelihood Principle while still satisfying the Weak Conditionality Principle.

Key words and phrases: Generalized fiducial inference, strong likelihood principle violation, weak conditionality principle.

Jan Hannig is Professor, Department of Statistics and Operations Research, University of North Carolina at Chapel Hill, 330 Hanes Hall, Chapel Hill, North Carolina 27599-3260, USA.

Statistical Science 2014, Vol. 29, No. 2, 259–260 DOI: 10.1214/14-STS475 © Institute of Mathematical Statistics, 2014

Discussion of “On the Birnbaum Argument for the Strong Likelihood Principle” Jan F. Bjørnstad

Abstract. The paper by Mayo claims to provide a new clarification and critique of Birnbaum’s argument for showing that sufficiency and conditionality principles imply the likelihood principle. However, many of the arguments go back to arguments made thirty to forty years ago. Also, the main contention in the paper, that Birnbaum’s arguments are not valid, seems to rest on a misunderstanding.

Key words and phrases: Likelihood, conditionality, sufficiency, Birnbaum’s theorem.

Jan F. Bjørnstad is Professor of Statistics, Department of Mathematics, University of Oslo and Head of Research, Division for Statistical Methods, Statistics Norway, P.O. Box 8131 Dep., N-0033 Oslo, Norway.

Statistical Science 2014, Vol. 29, No. 2, 261–266 DOI: 10.1214/14-STS482 © Institute of Mathematical Statistics, 2014

Rejoinder: “On the Birnbaum Argument for the Strong Likelihood Principle” Deborah G. Mayo

Deborah G. Mayo is Professor of Philosophy, Department of Philosophy, Virginia Tech, 235 Major Williams Hall, Blacksburg, Virginia 24061, USA.

Statistical Science 2014, Vol. 29, No. 2, 267–284 DOI: 10.1214/13-STS454 © Institute of Mathematical Statistics, 2014

Comments on the Neyman–Fisher Controversy and Its Consequences Arman Sabbaghi and Donald B. Rubin

Abstract. The Neyman–Fisher controversy considered here originated with the 1935 presentation of Jerzy Neyman’s Statistical Problems in Agricultural Experimentation to the Royal Statistical Society. Neyman asserted that the standard ANOVA F-test for randomized complete block designs is valid, whereas the analogous test for Latin squares is invalid in the sense of detecting differentiation among the treatments, when none existed on average, more often than desired (i.e., having a higher Type I error than advertised). However, Neyman’s expressions for the expected mean residual sum of squares, for both designs, are generally incorrect. Furthermore, Neyman’s belief that the Type I error (when testing the null hypothesis of zero average treatment effects) is higher than desired, whenever the expected mean treatment sum of squares is greater than the expected mean residual sum of squares, is generally incorrect. Simple examples show that, without further assumptions on the potential outcomes, one cannot determine the Type I error of the F-test from expected sums of squares. Ultimately, we believe that the Neyman–Fisher controversy had a deleterious impact on the development of statistics, with a major consequence being that potential outcomes were ignored in favor of linear models and classical statistical procedures that are imprecise without applied contexts.

Key words and phrases: Analysis of variance, Latin squares, nonadditivity, randomization tests, randomized complete blocks.
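The quantities the abstract refers to (expected mean treatment and residual sums of squares, and the Type I error of the F-test under the null of zero average treatment effects) can be computed exactly for a toy randomized complete block design by enumerating every randomization. The following is only a sketch under stated assumptions: the table of potential outcomes `Y`, the function names, and the design size (3 blocks, 3 treatments) are hypothetical and not taken from the paper, and the critical value 6.94 is the approximate 5% point of the F distribution with (2, 4) degrees of freedom as read from standard tables.

```python
from itertools import permutations, product


def rcb_anova(y):
    """Mean squares for observed outcomes y[b][t] (block b, treatment t).

    Returns (MS_treatment, MS_residual) for a randomized complete block design.
    """
    B, T = len(y), len(y[0])
    grand = sum(map(sum, y)) / (B * T)
    block_means = [sum(row) / T for row in y]
    treat_means = [sum(y[b][t] for b in range(B)) / B for t in range(T)]
    ss_total = sum((v - grand) ** 2 for row in y for v in row)
    ss_block = T * sum((m - grand) ** 2 for m in block_means)
    ss_treat = B * sum((m - grand) ** 2 for m in treat_means)
    ss_res = ss_total - ss_block - ss_treat
    return ss_treat / (T - 1), ss_res / ((T - 1) * (B - 1))


def randomization_summary(Y, f_crit):
    """Enumerate all within-block randomizations of potential outcomes Y[b][u][t]."""
    B, T = len(Y), len(Y[0])
    n = rejections = 0
    sum_treat = sum_res = 0.0
    for assignment in product(permutations(range(T)), repeat=B):
        # assignment[b][u] is the treatment received by unit u of block b
        y = [[0.0] * T for _ in range(B)]
        for b in range(B):
            for u in range(T):
                t = assignment[b][u]
                y[b][t] = Y[b][u][t]
        ms_treat, ms_res = rcb_anova(y)
        n += 1
        sum_treat += ms_treat
        sum_res += ms_res
        rejections += ms_treat > f_crit * ms_res  # F = MS_treat / MS_res > f_crit
    return {
        "n": n,
        "mean_ms_treat": sum_treat / n,
        "mean_ms_res": sum_res / n,
        "rejection_rate": rejections / n,
    }


# Hypothetical potential outcomes: treatment t adds 3 only in block b == t,
# so every treatment has the same average over all 9 units (non-additive null).
Y = [[[10 * b + u + (3 if t == b else 0) for t in range(3)]
      for u in range(3)] for b in range(3)]
summary = randomization_summary(Y, f_crit=6.94)
print(summary)
```

Changing the hypothetical table `Y` while keeping the treatment averages equal lets one compare the ordering of the two expected mean squares with the exact rejection rate, which is the comparison the abstract says cannot be settled from expected sums of squares alone.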

Arman Sabbaghi is Assistant Professor of Statistics, Department of Statistics, Purdue University, 250 N. University Street, West Lafayette, Indiana 47907, USA. Donald B. Rubin is John L. Loeb Professor of Statistics, Department of Statistics, Harvard University, 1 Oxford Street Fl. 7, Cambridge, Massachusetts 02138, USA.

Statistical Science 2014, Vol. 29, No. 2, 285–301 DOI: 10.1214/13-STS455 © Institute of Mathematical Statistics, 2014

Two Modeling Strategies for Empirical Bayes Estimation Bradley Efron

Abstract. Empirical Bayes methods use the data from parallel experiments, for instance, observations X_k ∼ N(Θ_k, 1) for k = 1, 2, ..., N, to estimate the conditional distributions Θ_k | X_k. There are two main estimation strategies: modeling on the θ space, called “g-modeling” here, and modeling on the x space, called “f-modeling.” The two approaches are described and compared. A series of computational formulas are developed to assess their frequentist accuracy. Several examples, both contrived and genuine, show the strengths and limitations of the two strategies. Key words and phrases: f-modeling, g-modeling, Bayes rule in terms of f, prior exponential families.
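To make the f-modeling idea concrete, the following is a minimal sketch rather than the estimator developed in the paper. It assumes the normal model X_k ∼ N(Θ_k, 1), under which Tweedie's formula gives E[Θ | x] = x + (d/dx) log f(x), where f is the marginal density of the observations. As a simplifying assumption, f is modeled here as a single normal density fitted by maximum likelihood; the function names and the simulated prior are hypothetical and chosen only for illustration.

```python
import math
import random


def fit_normal_marginal(xs):
    """Fit f(x) on the x space as N(mu, var) by maximum likelihood (an assumed model)."""
    n = len(xs)
    mu = sum(xs) / n
    var = sum((x - mu) ** 2 for x in xs) / n
    return mu, var


def tweedie_posterior_mean(x, mu, var):
    """Tweedie's formula for X | Theta ~ N(Theta, 1): E[Theta | x] = x + d/dx log f(x)."""
    score = -(x - mu) / var  # derivative of log N(x; mu, var) with respect to x
    return x + score


def simulate(n, prior_var, seed=0):
    """Draw Theta_k ~ N(0, prior_var) and X_k ~ N(Theta_k, 1); the prior is illustrative."""
    rng = random.Random(seed)
    thetas = [rng.gauss(0.0, math.sqrt(prior_var)) for _ in range(n)]
    xs = [t + rng.gauss(0.0, 1.0) for t in thetas]
    return thetas, xs


if __name__ == "__main__":
    _, xs = simulate(5000, prior_var=1.0)
    mu_hat, var_hat = fit_normal_marginal(xs)
    # With prior variance 1 the marginal variance is about 2, so estimates
    # are shrunk roughly halfway toward the fitted mean.
    print(tweedie_posterior_mean(2.0, mu_hat, var_hat))
```

The prior never appears in the estimate: only the observed X_k are modeled, which is what distinguishes modeling on the x space from g-modeling, where the prior on the θ space is modeled directly.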


Bradley Efron is Professor of Statistics and Biostatistics, Department of Statistics, Stanford University, Stanford, California 94305-4065, USA (e-mail: [email protected]).

Statistical Science 2014, Vol. 29, No. 2, 302–321 DOI: 10.1214/13-STS456 © Institute of Mathematical Statistics, 2014 Pooled Association Tests for Rare Genetic Variants: A Review and Some New Results Andriy Derkach, Jerry F. Lawless and Lei Sun

Abstract. In the search for genetic factors that are associated with complex heritable human traits, considerable attention is now being focused on rare variants that individually have small effects. In response, numerous recent papers have proposed testing strategies to assess association between a group of rare variants and a trait, with competing claims about the performance of various tests. The power of a given test in fact depends on the nature of any association and on the rareness of the variants in question. We review such tests within a general framework that covers a wide range of genetic models and types of data. We study the performance of specific tests through exact or asymptotic power formulas and through novel simulation studies of over 10,000 different models. The tests considered are also applied to real sequence data from the 1000 Genomes project and provided by the GAW17. We recommend a testing strategy, but our results show that power to detect association in plausible genetic scenarios is low for studies of medium size unless a high proportion of the chosen variants are causal. Consequently, considerable attention must be given to relevant biological information that can guide the selection of variants for testing. Key words and phrases: Linear statistics, quadratic statistics, score tests, weighting, power, next generation sequencing, complex traits.
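The dependence of power on the nature of the association can be illustrated with the two families of statistics named in the key words. The sketch below is an illustration under stated assumptions, not the authors' implementation: it takes per-variant scores S_j = Σ_i (y_i − ȳ) x_ij, a linear statistic Σ_j w_j S_j, and a diagonal quadratic statistic Σ_j w_j S_j². The function names, the toy data, and the unit weights are hypothetical, and no p-values are computed.

```python
def score_vector(y, genotypes):
    """Per-variant scores S_j = sum_i (y_i - ybar) * x_ij."""
    n = len(y)
    ybar = sum(y) / n
    m = len(genotypes[0])
    return [
        sum((y[i] - ybar) * genotypes[i][j] for i in range(n))
        for j in range(m)
    ]


def linear_statistic(scores, weights):
    """Weighted sum of scores; effects with opposite signs can cancel."""
    return sum(w * s for w, s in zip(weights, scores))


def quadratic_statistic(scores, weights):
    """Weighted sum of squared scores; insensitive to the direction of effects."""
    return sum(w * s * s for w, s in zip(weights, scores))


if __name__ == "__main__":
    # Toy data: two cases (y = 1) and two controls (y = 0), two rare variants.
    # Variant 0 is carried by one case, variant 1 by one control.
    y = [1, 1, 0, 0]
    genotypes = [[1, 0], [0, 0], [0, 1], [0, 0]]
    weights = [1.0, 1.0]
    s = score_vector(y, genotypes)
    print(s, linear_statistic(s, weights), quadratic_statistic(s, weights))
```

In this toy example the two variants point in opposite directions, so their scores cancel in the linear statistic while the quadratic statistic stays positive. When all causal variants act in the same direction, the linear form accumulates signal instead, which is one reason no single test dominates across genetic models.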


Andriy Derkach is Graduate Student, Department of Statistical Sciences, University of Toronto, 100 St. George Street, Toronto, Ontario, Canada M5S 3G3. Jerry F. Lawless is Distinguished Professor Emeritus, Department of Statistics and Actuarial Science, University of Waterloo, Waterloo, Ontario, Canada N2L 3G1 and Professor, Division of Biostatistics, Dalla Lana School of Public Health, University of Toronto, 155 College Street, Toronto, Ontario, Canada M5T 3M7. Lei Sun is Associate Professor, Department of Statistical Sciences, University of Toronto, 100 St. George Street, Toronto, Ontario, Canada M5S 3G3 and Associate Professor, Division of Biostatistics, Dalla Lana School of Public Health, University of Toronto, 155 College Street, Toronto, Ontario, Canada M5T 3M7 (e-mail: [email protected]).