Simple Reproducible Analysis with Knitr, R Markdown, and Rstudio
Total Page:16
File Type:pdf, Size:1020Kb
Simple Reproducible Analysis with knitr, R Markdown, Simple Reproducible Analysis with knitr, R and RStudio Jeromy Markdown, and RStudio Anglim Melbourne R Users Group (melbURN) Introduction Markdown knitr and R Jeromy Anglim Markdown LaTeX Melbourne Business School Conclusion 18th July 2012 http://jeromyanglim.blogspot.com https://github.com/jeromyanglim/ rmarkdown-rmeetup-2012 Outline Simple Reproducible Analysis with knitr, R Markdown, 1 Introduction and RStudio Jeromy Anglim 2 Markdown Introduction Markdown knitr and R 3 knitr and R Markdown Markdown LaTeX Conclusion 4 LaTeX 5 Conclusion Motivation: How to create documents? Simple Reproducible Types and distinctions Analysis with knitr, R Markdown, Formal Documents: Journal articles, books, book chapters, and RStudio theses, consulting reports, etc. Jeromy Informal documents: preliminary analyses, statistical Anglim homework, Introduction Online content: web pages, blog posts, forum posts Markdown Browser metaphor versus page/slide-based metaphor knitr and R Markdown Context LaTeX When to use reproducible analysis? Conclusion When to use knitr with R Markdown or LaTeX? What is reproducible analysis? Simple Reproducible Reproducibility varies on a continuum Analysis with knitr, R Markdown, One particular form: and RStudio code transforms raw data and meta-data into processed Jeromy Anglim data, code runs analyses on the data, and Introduction code incorporates analyses into a report Markdown knitr and R Ideally, the process involves a one-click build Markdown Public sharing of document, code, and data is optional, LaTeX Conclusion but forms part of gold standard of scientific openness Goes by many names, particularly \reproducible research", but I prefer \reproducible analysis". See also: http://stats.stackexchange.com/a/15006/183 https://github.com/jeromyanglim/rmarkdown-rmeetup-2012/issues/11 Aims of reproducible analysis Simple Reproducible Ability to reproduce analysis Analysis with knitr, R Markdown, Increase accuracy and RStudio Ability to verify analyses are consistent with intentions Jeromy Anglim Ability to review analysis choices Introduction Increase clarity of communication Markdown Increased trustworthiness knitr and R Markdown Increased accuracy + LaTeX Ability for others to verify Conclusion Extensibility Ability to easily modify or re-use existing analyses Reproducible analysis in R Simple Reproducible Analysis with Typically: knitr, R Markdown, and RStudio Combine R and plain text file format to produce Jeromy Anglim documents (e.g., pdfs, HTML documents, etc.) Introduction Markdown Popular Instances knitr and R Markdown Sweave LaTeX brew Conclusion knitr see also http://cran.r-project.org/web/views/ReproducibleResearch.html Installation of software used in this talk Simple Reproducible R: http://www.r-project.org/ Analysis with knitr, R Markdown, R Studio: http://rstudio.org/ and RStudio In R: Jeromy Anglim install.packages("knitr) Introduction install.packages("markdown") Markdown install.packages("xtable") knitr and R install.packages("ggplot2") Markdown install.packages("lattice") LaTeX pandoc: Conclusion http://johnmacfarlane.net/pandoc/ LaTeX distribution: E.g., TeXLive, MikTeX http://www.latex-project.org/ftp.html What is markdown? Simple Reproducible Simple, readable, intuitive, light-weight markup Analysis with knitr, R Markdown, Convert to HTML and RStudio Raw HTML can be interspersed to add functionality Jeromy Anglim Various extensions and flaours of markdown Introduction Popular on websites: e.g., StackOverflow, GitHub, Reddit Markdown see also: http://daringfireball.net/projects/markdown/ knitr and R Markdown LaTeX Conclusion Headings Simple Reproducible Analysis with knitr, R Markdown, and RStudio Jeromy Anglim Introduction Markdown knitr and R Markdown LaTeX Conclusion Basic formatting Simple Reproducible Analysis with knitr, R Markdown, and RStudio Jeromy Anglim Introduction Markdown knitr and R Markdown LaTeX Conclusion Paragraphs Simple Reproducible Analysis with knitr, R Markdown, and RStudio Jeromy Anglim Introduction Markdown knitr and R Markdown LaTeX Conclusion Dot points Simple Reproducible Analysis with knitr, R Markdown, and RStudio Jeromy Anglim Introduction Markdown knitr and R Markdown LaTeX Conclusion Equations Simple Reproducible Analysis with knitr, R Markdown, and RStudio Jeromy Anglim Introduction Markdown knitr and R Markdown Uses MathJaX to render LaTeX (and other) equations LaTeX Inserts MathJaX script reference into HTML header Conclusion getting started: http://jeromyanglim.blogspot.com.au/2010/10/getting-started-with-writing.html Hyperlinks Simple Reproducible Analysis with knitr, R Markdown, and RStudio Jeromy Anglim Introduction Markdown knitr and R Markdown LaTeX Conclusion Images Simple Reproducible Analysis with knitr, R Markdown, and RStudio Jeromy Anglim Introduction Markdown knitr and R Markdown LaTeX Conclusion Code Simple Reproducible Analysis with knitr, R Markdown, and RStudio Jeromy Anglim Introduction Markdown knitr and R Markdown LaTeX Conclusion Quotes Simple Reproducible Analysis with knitr, R Markdown, and RStudio Jeromy Anglim Introduction Markdown knitr and R Markdown LaTeX Conclusion Tables Simple Reproducible Analysis with knitr, R Markdown, and RStudio Jeromy Anglim Introduction Markdown knitr and R Markdown LaTeX Conclusion Raw HTML Simple Reproducible Analysis with knitr, R Markdown, and RStudio Jeromy Anglim Introduction Markdown knitr and R Markdown LaTeX Conclusion knitr, R Markdown, and R Studio Simple Reproducible knitr: R Package developed by Yihui Xie for weaving R Analysis with knitr, R (and other languages) with various markup languages Markdown, and RStudio R Markdown: A file format that combines R code chunks Jeromy Anglim and markdown text which is converted by knitr into markdown, and other formats (e.g., HTML, pdf, etc.). Introduction Markdown R Studio: Open source, cross-platform IDE for R. knitr and R Markdown LaTeX Conclusion Benefits of knitr Simple Reproducible knitr supports many markups: LaTeX, Markdown, HTML, Analysis with knitr, R reStructuredText Markdown, and RStudio knitr has really nice defaults Jeromy Anglim Tidy placement of generated files Introduction Simplified figure production Markdown automatically print ggplot2 and lattice figures knitr and R print figures by default Markdown permit interspersing of figures and console output LaTeX Conclusion Greater extensibility: output options supports languages other than R Simplified caching And more: http: //yihui.name/slides/2012-knitr-RStudio.html Rstudio Simple Reproducible Benefits of Rstudio as IDE for R Analysis with knitr, R Markdown, Open source and RStudio Works on Linux, Mac, and Windows Jeromy Many useful features Anglim It just works Introduction Tight integration with knitr Markdown But many other options knitr and R Markdown Emacs with ESS LaTeX Vim with R plugin Conclusion Eclipse with StatET etc. RMarkdown Examples Simple Reproducible Introduction to R Markdown Analysis with knitr, R Markdown, Statistics homework example and RStudio Analysis of Winter Olympic Medals Example Jeromy Anglim Introduction Markdown knitr and R Markdown LaTeX Conclusion Rstudio screenshot Simple Reproducible Analysis with knitr, R Markdown, and RStudio Jeromy Anglim Introduction Markdown knitr and R Markdown LaTeX Conclusion R Code chunks Simple Reproducible see http://yihui.name/knitr/options Analysis with knitr, R Markdown, and RStudio ```{r my_chunk_name, some_option='some_value'} Jeromy some_r_code Anglim ``` Introduction Markdown knitr and R Markdown LaTeX Conclusion R Code chunks options Simple Reproducible Analysis with Global options: knitr, R Markdown, and RStudio `r opts_chunk$set(opt = value)` # general form Jeromy Anglim `r opts_chunk$set(cache=TRUE)` # e.g, global cache Introduction Markdown Some useful local options knitr and R Markdown LaTeX Hide console input: echo=FALSE Conclusion Hide assorted messages: warning=FALSE, error=FALSE, message=FALSE Hide console output: results="hide" Display console input as is: tidy=FALSE Output raw markup: results="asis" Inline R Code Simple Reproducible Analysis with knitr, R R Markdown `r 2 + 2` `r I(2+2)` Markdown, and RStudio Markdown `4` 4 Jeromy HTML <code>4</code> 4 Anglim Introduction Markdown knitr and R Markdown LaTeX Conclusion Figures Simple Reproducible Support for multiple figures in a code block Analysis with knitr, R Markdown, also see e.g., par(mfrow=c(2,2)) or grid.arrange and RStudio Jeromy Figures and console output can be interspersed in a code Anglim chunk Introduction Various code chunk options Markdown see http://yihui.name/knitr/options knitr and R Markdown fig.width and fig.height LaTeX dev defaults to pdf for LaTeX and png for Conclusion HTML/markdown Tables Simple Reproducible Many options for creating HTML Tables: Analysis with knitr, R Markdown, R packages: xtable, googleVis, R2HTML, hwriter and RStudio markdown extentions: github, pandoc Jeromy Custom R code Anglim xtable is a reasonable option Introduction Markdown For informal reports just use console output knitr and R css can be added later to control table appearance Markdown LaTeX If you require sophisticated tables, you may want to switch Conclusion to LaTeX xtable example Simple Reproducible Analysis with knitr, R print(xtable(my_data_frame, caption = "My Caption", Markdown, and RStudio digits = 3), type = "html", Jeromy caption.placement = "top", Anglim html.table.attributes = Introduction "style=\"border: 1px solid black;\"") Markdown knitr and R Markdown LaTeX Conclusion Caching Simple Reproducible