STATISTICAL SCIENCE Volume 29, Number 2 May 2014

Special Section on Programming & R

Four Papers on Contemporary Software Design Strategies for Statistical Methodologists
  Vincent Carey and Dianne Cook, 165
Object-Oriented Programming, Functional Programming and R
  John M. Chambers, 167
Enhancing R with Advanced Compilation Tools and Methods
  Duncan Temple Lang, 181
Reactive Programming for Interactive Graphics
  Yihui Xie, Heike Hofmann and Xiaoyue Cheng, 201
Scalable Genomics with R and Bioconductor
  Michael Lawrence and Martin Morgan, 214

General Section

On the Birnbaum Argument for the Strong Likelihood Principle
  Deborah G. Mayo, 227
Discussion of “On the Birnbaum Argument for the Strong Likelihood Principle”
  A. P. Dawid, 240
Discussion of “On the Birnbaum Argument for the Strong Likelihood Principle”
  Michael Evans, 242
Discussion: Foundations of Statistical Inference, Revisited
  Ryan Martin and Chuanhai Liu, 247
Discussion: On Arguments Concerning Statistical Principles
  D. A. S. Fraser, 252
Discussion of “On the Birnbaum Argument for the Strong Likelihood Principle”
  Jan Hannig, 254
Discussion of “On the Birnbaum Argument for the Strong Likelihood Principle”
  Jan F. Bjørnstad, 259
Rejoinder: “On the Birnbaum Argument for the Strong Likelihood Principle”
  Deborah G. Mayo, 261
Comments on the Neyman–Fisher Controversy and Its Consequences
  Arman Sabbaghi and Donald B. Rubin, 267
Two Modeling Strategies for Empirical Bayes Estimation
  Bradley Efron, 285
Pooled Association Tests for Rare Genetic Variants: A Review and Some New Results
  Andriy Derkach, Jerry F. Lawless and Lei Sun, 302

Statistical Science [ISSN 0883-4237 (print); ISSN 2168-8745 (online)], Volume 29, Number 2, May 2014. Published quarterly by the Institute of Mathematical Statistics, 3163 Somerset Drive, Cleveland, OH 44122, USA. Periodicals postage paid at Cleveland, Ohio and at additional mailing offices.

POSTMASTER: Send address changes to Statistical Science, Institute of Mathematical Statistics, Dues and Subscriptions Office, 9650 Rockville Pike, Suite L2310, Bethesda, MD 20814-3998, USA.

Copyright © 2014 by the Institute of Mathematical Statistics
Printed in the United States of America

EDITOR
Peter Green, University of Bristol and University of Technology, Sydney

ASSOCIATE EDITORS
Vincent Carey, Harvard University
Rong Chen, Rutgers University
Dianne Cook, Iowa State University
Rainer Dahlhaus, University of Heidelberg
Michel Dekking, Delft University
Peter J. Diggle, Lancaster University
Robin Evans, University of Oxford
Michael Friendly, York University
Edward I. George, University of Pennsylvania
Peter Green, University of Bristol
Peter Hoff, University of Washington
Sylvie Huet, INRA
Shane Jensen, University of Pennsylvania
Samuel Kou, Harvard University
David Madigan, Columbia University
Kerrie Mengersen, Queensland University of Technology
Peter Müller, The University of Texas
Sonia Petrone, Bocconi University
Jim Pitman, University of California, Berkeley
Annie Qu, University of Illinois, Urbana-Champaign
Nancy Reid, University of Toronto
Thomas Richardson, University of Washington
Christian Robert, University of Paris, Dauphine
Andrea Rotnitzky, Universidad Torcuato Di Tella and Harvard University
Thomas Severini, Northwestern University
Glenn Shafer, Rutgers Business School–Newark and New Brunswick; Royal Holloway College, University of London
Michael Stein, University of Chicago
Jon Wakefield, University of Washington
Guenther Walther, Stanford University
Martin Wells, Cornell University
Tong Zhang, Rutgers University

MANAGING EDITOR
T. N. Sriram, University of Georgia

PRODUCTION EDITOR
Patrick Kelly

EDITORIAL COORDINATOR
Kristina Mattson

PAST EXECUTIVE EDITORS
Morris H. DeGroot, 1986–1988
Carl N. Morris, 1989–1991
Robert E. Kass, 1992–1994
Paul Switzer, 1995–1997
Leon J. Gleser, 1998–2000
Richard Tweedie, 2001
Morris Eaton, 2001
George Casella, 2002–2004
Edward I. George, 2005–2007
David Madigan, 2008–2010
Jon A. Wellner, 2011–2013

Statistical Science
2014, Vol. 29, No. 2, 165–166
DOI: 10.1214/14-STS481
© Institute of Mathematical Statistics, 2014

Four Papers on Contemporary Software Design Strategies for Statistical Methodologists
Vincent Carey and Dianne Cook

Vincent Carey is Associate Professor of Medicine (Biostatistics), Channing Division of Network Medicine, Harvard Medical School, Boston, Massachusetts 02115, USA (e-mail: [email protected]). Dianne Cook is Professor, Department of Statistics, Iowa State University, Ames, Iowa 50011, USA (e-mail: [email protected]).

Statistical Science
2014, Vol. 29, No. 2, 167–180
DOI: 10.1214/13-STS452
© Institute of Mathematical Statistics, 2014

Object-Oriented Programming, Functional Programming and R
John M. Chambers

Abstract. This paper reviews some programming techniques in R that have proved useful, particularly for substantial projects. These include several versions of object-oriented programming, used in a large number of R packages. The review tries to clarify the origins and ideas behind the various versions, each of which is valuable in the appropriate context.

R has also been strongly influenced by the ideas of functional programming and, in particular, by the desire to combine functional with object-oriented programming.

To clarify how this particular mix of ideas has turned out in the current R language and supporting software, the paper will first review the basic ideas behind object-oriented and functional programming, and then examine the evolution of R with these ideas providing context.

Functional programming supports well-defined, defensible software giving reproducible results. Object-oriented programming is the mechanism par excellence for managing complexity while keeping things simple for the user. The two paradigms have been valuable in supporting major software for fitting models to data and numerous other statistical applications.

The paradigms have been adopted, and adapted, distinctively in R. Functional programming motivates much of R but R does not enforce the paradigm. Object-oriented programming from a functional perspective differs from that used in non-functional languages, a distinction that needs to be emphasized to avoid confusion.

R initially replicated the S language from Bell Labs, which in turn was strongly influenced by earlier program libraries. At each stage, new ideas have been added, but the previous software continues to show its influence in the design as well. Outlining the evolution will further clarify why we currently have this somewhat unusual combination of ideas.

Key words and phrases: Programming languages, functional programming, object-oriented programming.

John M. Chambers is Consulting Professor, Department of Statistics, Stanford University, Stanford, California 94305-4065, USA (e-mail: [email protected]).
