Sweave: Reproducible Research Using R and LaTeX

Sandra D. Griffith
Department of Biostatistics and Epidemiology
University of Pennsylvania
[email protected]

Biostatistics Computing Workshop Series
March 15, 2012

S. Griffith ([email protected]) Sweave 15 March 2012

Non-reproducible Research

• Characteristics
  - Prepare or manipulate data in a spreadsheet
  - Cut and paste output to create tables
  - Multiple versions of data and analysis scripts
  - Create many versions of graphics, selecting only one for final presentation of results
• Problems
  - Data, code, and results not linked
  - Any changes in analysis or data require manual regeneration of results
  - Workflow or organization scheme may change over time
  - Can be difficult to replicate in the future
  - Less forensic evidence if results are questioned

Response to Duke University Scandal

"We now require most of our reports to be written using Sweave, a literate programming combination of LaTeX source and R code (SASweave and odfWeave are also available) so that we can rerun the reports as needed and get the same results."

Sweave: Conceptual Overview

• Link data, code, and results with a single .Rnw file
  - Similar to a .tex file, but includes interspersed "chunks" of R code
  - Uses noweb syntax for literate programming
• Weave the .Rnw file to produce a .tex file which includes output from the R code
• Compile the TeX file to PDF or PS files as usual
• Tangle the .Rnw file to extract the R code into a separate file
• Weaving also creates an individual file for each figure, in addition to including the figures in the output
• Can refer to within-chunk R expressions in regular document text using \Sexpr{}
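The weave and tangle steps from the conceptual overview can be sketched from within R as follows. This is a minimal illustration, not part of the original slides: the file name demo.Rnw, the chunk label, and the chunk contents are assumptions chosen for the example.

```r
# Minimal sketch: write a tiny .Rnw file, then weave and tangle it.
# The file name, chunk label, and chunk contents are illustrative assumptions.
rnw <- c(
  "\\documentclass{article}",
  "\\begin{document}",
  "<<mean-chunk, echo=TRUE>>=",
  "x <- c(2, 4, 6)",
  "mean(x)",
  "@",
  "The mean of x is \\Sexpr{mean(x)}.",
  "\\end{document}"
)

# Work in a fresh temporary directory so no existing files are touched.
workdir <- tempfile("sweave-demo")
dir.create(workdir)
oldwd <- setwd(workdir)
writeLines(rnw, "demo.Rnw")

utils::Sweave("demo.Rnw")    # weave: writes demo.tex, with R output included
utils::Stangle("demo.Rnw")   # tangle: writes demo.R, containing only the R code

tex  <- readLines("demo.tex")
code <- readLines("demo.R")
setwd(oldwd)
```

After this runs, demo.tex can be compiled with pdflatex as usual, and the \Sexpr{} expression in the text has been replaced by its evaluated value.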

Getting Started with Sweave

• Assume R and LaTeX already installed
• Sweave.sty is already included with base R installation
  - Preferred method: include the R folder containing Sweave.sty in your TeX path
    * Will automatically update the style file when you update R
  - Copy Sweave.sty to a centralized location with other style files, also in your TeX path
    * Requires manual updates, but can be located in a central location shared among computers (e.g. Dropbox)
  - Hard path: include \usepackage{...\Sweave} in preamble
  - Copy Sweave.sty into the same folder as each .Rnw file

Anatomy of a Code Chunk

    << label (optional), options >>=
    insert R code here
    @

Commonly-used options (see manual for full list)

• echo = F: Suppress R input from appearing in document (default = T)
• eval = F: R code not evaluated (default = T)
• results = hide: Suppress R output from appearing in document (default = verbatim)
• results = tex: R output will be read as TeX (default = verbatim)
• fig = T: Code chunk includes a figure (default = F)

Global Options

Default options can be set in the preamble and updated throughout the document

• Set R chunk options
    \SweaveOpts{eval=T, echo=F}
• Preserve comments and spacing of echoed R code
    \SweaveOpts{keep.source=TRUE}
• Figure options for height, width, and file type

Example

Chunk:

    <<echo=T>>=
    x <- exp(2.3)
    x
    @

Output:

    > x <- exp(2.3)
    > x
    [1] 9.974182

Chunk:

    <<echo=F>>=
    x <- exp(2.3)
    x
    @

Output:

    [1] 9.974182

Chunk:

    <<echo=T, results=hide>>=
    x <- exp(2.3)
    x
    @

Output:

    > x <- exp(2.3)
    > x

Compiling an Sweave Document

• Manually (Windows or Mac)
  1. Run Sweave("foo.Rnw") in R console
  2. Open foo.tex in a TeX editor
  3. Compile PDF using TeX editor
  4. Run Stangle("foo.Rnw") to extract R code if desired
• Manually (Linux/Unix)
  1. Run R CMD Sweave foo.Rnw
  2. Run pdflatex foo or latex foo
• Integrated Development Environment (IDE)
  - RStudio, Emacs (ESS), Eclipse (StatET), etc.
  - If supported, usually one click/command for all steps (Sweave, compile TeX, view PDF)

RStudio

[Screenshot]

The xtable Package: Basic Table Code

R package to convert many R objects to LaTeX or HTML tables

    <<label=tab:GenderRace, results=tex>>=
    library(xtable)
    data(tli)
    xtable(table(tli$ethnicty, tli$sex),
           caption="Distribution of gender and ethnicity")
    @

    <<label=tab:LM1, results=tex>>=
    lm1 <- lm(tlimth ~ sex + ethnicty, data=tli)
    xtable(lm1, caption="Linear Model Results")
    @

The xtable Package: Basic Table Output

                 F    M
    BLACK       11   12
    HISPANIC     8   12
    OTHER        2    0
    WHITE       30   25

Table: Distribution of gender and ethnicity

                       Estimate  Std. Error  t value  Pr(>|t|)
    (Intercept)         71.0226      3.2894    21.59    0.0000
    sexM                 3.3734      2.8594     1.18    0.2410
    ethnictyHISPANIC    -3.7466      4.3044    -0.87    0.3863
    ethnictyOTHER       18.4774     10.4716     1.76    0.0809
    ethnictyWHITE        7.4622      3.4964     2.13    0.0354

Table: Linear Model Results

The xtable Package: Customized Tables

    > mat <- round(matrix(c(0.9, 0.89, 200, 0.045, 2.0),
    +              c(1, 5)), 4)
    > rownames(mat) <- "$y_{t-1}$"
    > colnames(mat) <- c("$R^2$", "$\\bar{R}^2$",
    +                    "F-stat", "S.E.E", "DW")
    > mat <- xtable(mat)
    > print(mat, sanitize.text.function = function(x){x})

                 $R^2$  $\bar{R}^2$  F-stat  S.E.E    DW
    $y_{t-1}$     0.90         0.89  200.00   0.04  2.00

Almost all functionality available for LaTeX tables can be included directly in R code using xtable

Aside: Using xtable for MS Word Tables

Non-statistical collaborators often prefer tabular results in MS Word

    print(xtable(table(tli$ethnicty, tli$sex)),
          type="html", file="TabGenderRace.htm")

1. Save results in an HTML file by calling print() on the xtable object in R
2. Open "TabGenderRace.htm" in a browser
3. Copy and paste into Word document as a fully-formatted table

Basic Figure Example

    <<fig=T, echo=F, width=5, height=3.5>>=
    plot(1:10, rnorm(10))
    @

[Figure: scatterplot of rnorm(10) against 1:10]

NB: Embed figure chunk within a LaTeX figure environment for more precise control

Large or Computationally Intensive Projects

• Use input statements or make files
• save() and load() intermediate results
• Conditional evaluation: if (file exists) {load file} else {run; save file}
• Change R chunk evaluation options as necessary
• R package: cacheSweave to cache intermediate results

Including R Code as an Appendix

• Useful for homework, solution sets, etc.
• Include \usepackage{listings} in the preamble
• Include the following R chunk and TeX code in foo.Rnw where you would like to place the appendix

    <<echo=FALSE, results=hide, split=TRUE>>=
    Stangle(file="foo.Rnw", output="foo.R", annotate=FALSE)
    @
    \pagebreak
    \section{R Code}
    \texttt{\lstinputlisting[emptylines=0]{foo.R}}

Miscellaneous Sweave Tricks

• Load all libraries in one chunk with the results = hide option to suppress unwanted output (e.g. package dependencies)
• Beamer presentations
  - Include the [fragile] option for every frame with R code to handle verbatim output
  - For frames with TeX and verbatim output, must include the [containsverbatim] option instead
• R graphics package ggplot2
  - Must use print() wrapper for ggplot objects
• R session information

    > toLatex(sessionInfo(), locale=F)

  - R version 2.14.1 (2011-12-22), x86_64-pc-mingw32
  - Base packages: base, datasets, graphics, grDevices, methods, stats, utils
  - Other packages: xtable 1.7-0
  - Loaded via a namespace (and not attached): tools 2.14.1
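The conditional-evaluation pattern listed under large or computationally intensive projects (load a saved result if it exists, otherwise run the computation and save it) might look like the sketch below inside a code chunk. The cache file name and the run_analysis() function are hypothetical placeholders, not taken from the slides.

```r
# Sketch of conditional evaluation with save() and load().
# `cache_file` and `run_analysis()` are hypothetical names for illustration.
cache_file <- file.path(tempdir(), "slow_result.RData")

run_analysis <- function() {
  # Stand-in for an expensive step, e.g. a long simulation or model fit.
  set.seed(1)
  mean(rnorm(1e5))
}

if (file.exists(cache_file)) {
  load(cache_file)                   # restores the saved object `result`
} else {
  result <- run_analysis()           # run the slow step once
  save(result, file = cache_file)    # later weaves reuse the saved result
}
```

Deleting the cache file forces the computation to rerun, which is worth doing whenever the data or the analysis code changes.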

Alternatives for Reproducible Research

• R for other document formats
  - HTML: R2HTML
  - OpenOffice: odfWeave
  - MS Word: SWord
  - MS PowerPoint: R2PPT
• Other statistical packages
  - StatWeave for SAS, Stata, or MATLAB and LaTeX or OpenOffice
  - Various other software-specific report generators

Resources

• Sweave user manual (Friedrich Leisch): http://www.stat.uni-muenchen.de/~leisch/Sweave/Sweave-manual.pdf
• Stack Overflow questions tagged Sweave: http://stackoverflow.com/questions/tagged/sweave
• Keith Baggerly's introduction to Sweave: http://bioinformatics.mdanderson.org/SweaveTalk/sweaveTalkb.pdf
• QuickR summary of alternatives to Sweave: http://www.statmethods.net/interface/output.html
• Citing R with Sweave: http://biostat.mc.vanderbilt.edu/wiki/pub/Main/SweaveLatex/RCitation.pdf
• xtable gallery with examples: http://cran.r-project.org/web/packages/xtable/vignettes/xtableGallery.pdf
