Computational Statistics in Practice
Total Pages: 16
File Type: PDF, Size: 1020 KB
										Recommended publications
									
								- 
												  Navigating the R Package Universe
												  Contributed Research Articles. By Julia Silge, John C. Nash, and Spencer Graves. Abstract: Today, the enormous number of contributed packages available to R users outstrips any given user’s ability to understand how these packages work, their relative merits, or how they are related to each other. We organized a plenary session at useR!2017 in Brussels for the R community to think through these issues and ways forward. This session considered three key points of discussion. Users can navigate the universe of R packages with (1) capabilities for directly searching for R packages, (2) guidance for which packages to use, e.g., from CRAN Task Views and other sources, and (3) access to common interfaces for alternative approaches to essentially the same problem. Introduction: As of our writing, there are over 13,000 packages on CRAN. R users must approach this abundance of packages with effective strategies to find what they need and choose which packages to invest time in learning how to use. At useR!2017 in Brussels, we organized a plenary session on this issue, with three themes: search, guidance, and unification. Here, we summarize these important themes, the discussion in our community both at useR!2017 and in the intervening months, and where we can go from here. Users need options to search R packages, perhaps the content of DESCRIPTION files, documentation files, or other components of R packages. One author (SG) has worked on the issue of searching for R functions from within R itself in the sos package (Graves et al., 2017).
- 
												  Frequently Asked Questions about Rcpp
												  Dirk Eddelbuettel, Romain François. Rcpp version 0.12.9 as of January 14, 2017. Abstract: This document attempts to answer the most Frequently Asked Questions (FAQ) regarding the Rcpp (Eddelbuettel, François, Allaire, Ushey, Kou, Russel, Chambers, and Bates, 2017; Eddelbuettel and François, 2011; Eddelbuettel, 2013) package. Contents: 1 Getting started: 1.1 How do I get started?; 1.2 What do I need?; 1.3 What compiler can I use?; 1.4 What other packages are useful?; 1.5 What licenses can I choose for my code? 2 Compiling and Linking: 2.1 How do I use Rcpp in my package?; 2.2 How do I quickly prototype my code? (2.2.1 Using inline; 2.2.2 Using Rcpp Attributes); 2.3 How do I convert my prototyped code to a package?; 2.4 How do I quickly prototype my code in a package?; 2.5 But I want to compile my code with R CMD SHLIB!; 2.6 But R CMD SHLIB still does not work!; 2.7 What about LinkingTo?
- 
												  The Rockerverse: Packages and Applications for Containerisation with R
												  Preprint. By Daniel Nüst, Dirk Eddelbuettel, Dom Bennett, Robrecht Cannoodt, Dav Clark, Gergely Daróczi, Mark Edmondson, Colin Fay, Ellis Hughes, Lars Kjeldgaard, Sean Lopp, Ben Marwick, Heather Nolis, Jacqueline Nolis, Hong Ooi, Karthik Ram, Noam Ross, Lori Shepherd, Péter Sólymos, Tyson Lee Swetnam, Nitesh Turaga, Charlotte Van Petegem, Jason Williams, Craig Willis, Nan Xiao. Abstract: The Rocker Project provides widely used Docker images for R across different application scenarios. This article surveys downstream projects that build upon the Rocker Project images and presents the current state of R packages for managing Docker images and controlling containers. These use cases cover diverse topics such as package development, reproducible research, collaborative work, cloud-based data processing, and production deployment of services. The variety of applications demonstrates the power of the Rocker Project specifically and containerisation in general. Across the diverse ways to use containers, we identified common themes: reproducible environments, scalability and efficiency, and portability across clouds. We conclude that the current growth and diversification of use cases is likely to continue its positive impact, but see the need for consolidating the Rockerverse ecosystem of packages, developing common practices for applications, and exploring alternative containerisation software. Introduction: The R community continues to grow. This can be seen in the number of new packages on CRAN, which is still growing exponentially (Hornik et al., 2019), but also in the numbers of conferences, open educational resources, meetups, unconferences, and companies that are adopting R, as exemplified by the useR! conference series, the global growth of the R and R-Ladies user groups, or the foundation and impact of the R Consortium.
- 
												  Frequently Asked Questions about Rcpp
												  Dirk Eddelbuettel, Romain François. Rcpp version 0.12.7 as of September 4, 2016. Abstract: This document attempts to answer the most Frequently Asked Questions (FAQ) regarding the Rcpp (Eddelbuettel, François, Allaire, Ushey, Kou, Chambers, and Bates, 2016a; Eddelbuettel and François, 2011; Eddelbuettel, 2013) package. Contents: 1 Getting started: 1.1 How do I get started?; 1.2 What do I need?; 1.3 What compiler can I use?; 1.4 What other packages are useful?; 1.5 What licenses can I choose for my code? 2 Compiling and Linking: 2.1 How do I use Rcpp in my package?; 2.2 How do I quickly prototype my code? (2.2.1 Using inline; 2.2.2 Using Rcpp Attributes); 2.3 How do I convert my prototyped code to a package?; 2.4 How do I quickly prototype my code in a package?; 2.5 But I want to compile my code with R CMD SHLIB!; 2.6 But R CMD SHLIB still does not work!; 2.7 What about LinkingTo?
- 
												  Installing R and CRAN Binaries on Ubuntu
												  Compounding Many Small Changes for Larger Effects. Dirk Eddelbuettel. T4 Video Lightning Talk #006 and R4 Video #5, Jun 21, 2020. R Package Installation on Linux: • In general installation on Linux is from source, which can present an additional hurdle for those less familiar with package building, and/or compilation and error messages, and/or more general (Linux) (sys-)admin skills • That said there have always been some binaries in some places; Debian has a few hundred in the distro itself; and there have been at least three distinct ‘cran2deb’ automation attempts • (Also of note is that Fedora recently added a user-contributed repo of pre-builds of all 15k CRAN packages, which is laudable. I have no first- or second-hand experience with it) • I have written about this at length (see previous R4 posts and videos) but it bears repeating. R Packages Installation on Linux, three different ways: • Barebones empty Ubuntu system, discussing the setup steps • Using r-ubuntu container with previous setup pre-installed • The new kid on the block: r-rspm container for RSPM. Containers and Ubuntu, one important point: • We show container use here because Docker allows us to “simulate” an empty machine so easily • But nothing we show here is limited to Docker • I.e. everything works the same on a corresponding Ubuntu system: your laptop, server or cloud instance • It is also transferable between Ubuntu releases (within limits: apparently still no RSPM for the now-current Ubuntu 20.04)
- 
												  Seamless R and C++ Integration with Rcpp
												  Use R! Series Editors: Robert Gentleman, Kurt Hornik, Giovanni G. Parmigiani. For further volumes: http://www.springer.com/series/6991. Dirk Eddelbuettel, Seamless R and C++ Integration with Rcpp. Dirk Eddelbuettel, River Forest, Illinois, USA. ISBN 978-1-4614-6867-7, ISBN 978-1-4614-6868-4 (eBook), DOI 10.1007/978-1-4614-6868-4. Springer New York Heidelberg Dordrecht London. Library of Congress Control Number: 2013933242. © The Author 2013. This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. Exempted from this legal reservation are brief excerpts in connection with reviews or scholarly analysis or material supplied specifically for the purpose of being entered and executed on a computer system, for exclusive use by the purchaser of the work. Duplication of this publication or parts thereof is permitted only under the provisions of the Copyright Law of the Publisher’s location, in its current version, and permission for use must always be obtained from Springer. Permissions for use may be obtained through RightsLink at the Copyright Clearance Center. Violations are liable to prosecution under the respective Copyright Law. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.
- 
												  Packages are easy!
												  Hadley Wickham (@hadleywickham), Chief Scientist, RStudio. May 2014. A package is a set of conventions that (with the right tools) makes your life easier. The right tools: the latest version of R; RStudio; code development tools (Windows: Rtools, download installer from http://cran.r-project.org/bin/windows/Rtools/; OS X: xcode, free from the app store; Linux: apt-get install r-base-dev (or similar)); packages that make your life easier: install.packages(c("devtools", "knitr", "Rcpp", "roxygen2", "testthat")). R/ (R code): devtools::create("easy") creates DESCRIPTION, creates an RStudio project, and sets up the R/ directory. NB: All devtools functions take a path to a package as the first argument. If not supplied, they use the current directory. Never use package.skeleton()! Why? It only works once, it does something very easy to do by hand, and it automates a job that needs to be manual. Add some R code, then call load_all() (in RStudio: cmd + shift + l), which also automatically saves all files for you. [Figure: programming cycle. Identify the task; modify and save code; reload in R; does it work? (no/yes); write an automated test; document.] Three ways to reload, shown on a scale labelled "Faster" and "More accurate": load_all() (cmd + shift + l) reloads changed code, data, …; RStudio Build & Reload (cmd + shift + b) installs, restarts R, and reloads; install() builds and installs (needed for vignettes). DESCRIPTION (who can use it, what it needs, and who wrote it): Package: easy; Title: What the package does (short line); Version: 0.1; Authors@R: "First Last <[email protected]> [aut, cre]"; Description: What the package does (paragraph); Depends: R (>= 3.1.0); License: What license is it under?; LazyData: true. We’re done! That’s all you need to know about packages. But you can also add data, documentation, unit tests, vignettes and C++ code. man/ (compiled documentation): Roxygen2 • Essential for function level documentation.
- 
												  Using R via Rocker
												  A Brief Introduction with Examples. Dirk Eddelbuettel. Köln R User Group, 19 Nov 2020. https://dirk.eddelbuettel.com/papers/cologneRUG2020.pdf Today's Talk, brief outline: • Docker Intro: Appeal, Motivation, “Lego” blocks, … • Brief Overview of Docker and its commands • Going Further: More about Rocker. Docker, a personal timeline: • Docker itself started in 2013 • I started experimenting with it around spring of 2014 • Was convinced enough to say ‘will change how we build and test’ in keynote at useR! 2014 conference in mid-2014 • Started the Rocker Project with Carl Boettiger fall 2014 • Gave three-hour tutorial at useR! in 2015 • Active development of multiple (widely-used) containers • Introductory paper in R Journal, 2017 • Rockerverse paper (led by Daniel Nuest) in R Journal, 2020. Motivation: [Image. Source: https://lemire.me/blog/2020/05/22/programming-inside-a-container/] Excellent discussion by Daniel Lemire: The idea of a container approach is to always start from a pristine state. So you define the configuration that your database server needs to have, and you launch it, in this precise state each time. This makes your infrastructure predictable. We can of course substitute “predictable” with “reproducible” … Or in simpler terms: [Image. Source: https://commons.wikimedia.org/wiki/File:3_D-Box.jpg]
- 
												  Benchmarking Single- and Multi-Core BLAS Implementations and GPUs for Use with R
												  Dirk Eddelbuettel, Debian Project. Abstract: We provide timing results for common linear algebra subroutines across BLAS (Basic Linear Algebra Subprograms) and GPU (Graphics Processing Unit)-based implementations. Several BLAS implementations are compared. The first is the unoptimised reference BLAS which provides a baseline to measure against. Second is the Atlas tuned BLAS, configured for single-threaded mode. Third is the development version of Atlas, configured for multi-threaded mode. Fourth is the optimised and multi-threaded Goto BLAS. Fifth is the multi-threaded BLAS contained in the commercial Intel MKL package. We also measure the performance of a GPU-based implementation for R (R Development Core Team 2010a) provided by the package gputools (Buckner et al. 2010). Several frequently-used linear algebra computations are compared across BLAS (and LAPACK) implementations and via GPU computing: matrix multiplication as well as QR, SVD and LU decompositions. The tests are performed from an end-user perspective, and 'net' times (including all necessary data transfers) are compared. While results are by their very nature dependent on the hardware of the test platforms, a few general lessons can be drawn. Unsurprisingly, accelerated BLAS clearly outperform the reference implementation. Similarly, multi-threaded BLAS hold a clear advantage over single-threaded BLAS when used on modern multi-core machines. Between the multi-threaded BLAS implementations, Goto is seen to have a slight advantage over MKL and Atlas. GPU computing is showing promise but requires relatively large matrices to outperform multi-threaded BLAS. We also provide the framework for computing these benchmark results as the corresponding R package gcbd.
- 
												  Introduction to Rcpp
												  From Simple Examples to Machine Learning. Dirk Eddelbuettel. useR! 2019 Pre-Conference Tutorial, Toulouse, France, 9 July 2019. Overview, very broad outline: • Motivation: Why R? • Who Uses This? • How Does One Use It? • Usage Illustrations. Why R? Pat Burns' View. Why the R Language? The screenshot on the left is part of a short essay at Burns-Stat. His site has more truly excellent (and free) writings. The (much longer) R Inferno (free pdf, also paperback) is highly recommended. Why the R Language? • R is not just a statistics package, it’s a language. • R is designed to operate the way that problems are thought about. • R is both flexible and powerful. Source: https://www.burns-stat.com/documents/tutorials/why-use-the-r-language/ Why R for data analysis? R is not the only language that can be used for data analysis. Why R rather than another? Here is a list: • interactive language • data structures • graphics • missing values • functions as first class objects • packages • community. Source: https://www.burns-stat.com/documents/tutorials/why-use-the-r-language/ Why R: Programming with Data. R as Middle Man: R as an Extensible Environment. • As R users we know that R can: ingest data in many formats from many sources; aggregate, slice, dice, summarize, …; visualize in many forms, …; model in just about any way; report in many useful and scriptable forms • It has become central for programming with data • Sometimes we want to extend it further than R code goes. R as central point: from any one of (csv; txt; xlsx; xml, json, ..) via R into any one of (txt; html; latex and pdf; ..)
- 
												  Getting Started in R: Tinyverse Edition
												  Saghir Bashir and Dirk Eddelbuettel. This version was compiled on October 11, 2018. Are you curious to learn what R can do for you? Do you want to see how it works? Yes, then this “Getting Started” guide is for you. It uses realistic examples and a real life dataset to manipulate, visualise and summarise data. By the end of it you will have an overview of the key concepts of R. R | Statistics | Data Science | Tinyverse. 1. Preface: This “Getting Started” guide will give you a flavour of what R can do for you. To get the most out of this guide, read it whilst doing the examples and exercises using RStudio. This note is a variant of the original document but stresses the use of Base R along with careful dependency management as discussed below. Experiment Safely. Quitting R & RStudio: When you quit RStudio you will be asked whether to Save workspace with two options: • “Yes” – Your current R workspace (containing the work that you have done) will be restored next time you open RStudio. • “No” – You will start with a fresh R session next time you open RStudio. For now select “No” (to prevent errors being carried over from previous sessions). 3. R Help: We strongly recommend that you learn how to use R’s useful and extensive built-in help system which is an essential part of finding solutions to your R programming problems. help() function: From the R “Console” you can use the help() function or ?. For example, try the following two commands (which …
- 
												  Rtips. Revival 2012!
												  Paul E. Johnson <pauljohn @ ku.edu>, June 8, 2012. The original Rtips started in 1999. It became difficult to update because of limitations in the software with which it was created. Now I know more about R, and have decided to wade in again. In January, 2012, I took the FaqManager HTML output and converted it to LaTeX with the excellent open source program pandoc, and from there I've been editing and updating it in LyX. From here on out, the latest html version will be at http://pj.freefaculty.org/R/Rtips.html and the PDF for the same will be http://pj.freefaculty.org/R/Rtips.pdf. You are reading the New Thing! The first chore is to cut out the old useless stuff that was no good to start with, correct mistakes in translation (the quotation mark translations are particularly dangerous, but also there is trouble with ~, $, and -). Original Preface: (I thought it was cute to call this “StatsRus” but the Toystore's lawyer called and, well, you know.) If you need a tip sheet for R, here it is. This is not a substitute for R documentation, just a list of things I had trouble remembering when switching from SAS to R. Heed the words of Brian D. Ripley, “One enquiry came to me yesterday which suggested that some users of binary distributions do not know that R comes with two Guides in the doc/manual directory plus an FAQ and the help pages in book form. I hope those are distributed with all the binary distributions (they are not made nor installed by default).