
Mixing R with other languages

JOHN D. COOK, PHD
VALUE CONSULTING

Why R?

- Libraries, libraries, libraries
- De facto standard for statistical research
- Nice language, as far as statistical languages go
- “Quirky, flawed, and an enormous success.”

Why mix languages?

- Improve performance of R code
  - Execution speed (e.g. loops)
  - Memory management
- Raid R’s libraries

How to optimize R
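As a minimal sketch of the loop cost and the "vectorize" advice on this slide, the R snippet below computes a sum of squares twice: once with an explicit loop and once with vectorized arithmetic. The function names `sum_sq_loop` and `sum_sq_vec` are illustrative and do not come from the talk. Timings depend on the machine, so none are quoted here.

```r
# Sum of squares of a numeric vector, written two ways.
# Function names are hypothetical, chosen for this illustration only.

# Loop version: one interpreted iteration per element.
sum_sq_loop <- function(v) {
  total <- 0
  for (vi in v) {
    total <- total + vi * vi
  }
  total
}

# Vectorized version: the multiplication and the sum run in compiled code.
sum_sq_vec <- function(v) sum(v * v)

x <- as.numeric(seq_len(1e5))

# Both versions agree on the result.
stopifnot(isTRUE(all.equal(sum_sq_loop(x), sum_sq_vec(x))))

# Compare elapsed times on your own machine.
print(system.time(sum_sq_loop(x)))
print(system.time(sum_sq_vec(x)))
```

When a computation cannot be expressed with vectorized operations, rewriting the hot loop in another language is the remaining option.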

- Vectorize
- Rewrite not using R

A few R quirks
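Most of the quirks on this slide can be checked in a plain R session. The sketch below uses made-up variable names (`x`, `y`, `m`, `my.list`) and base R only. The values in the comments are what base R returns for these inputs.

```r
x <- c(10, 20, 30)

# Everything is a vector: a "scalar" is a vector of length 1.
length(5)                  # 1

# Unit offset: the first element has index 1.
x[1]                       # 10

# A zero index is legal but returns an empty vector.
x[0]                       # numeric(0)

# Negative indices remove elements rather than counting from the end.
x[-1]                      # 20 30

# NA marks a missing value inside a vector; NULL is the absence of a value.
y <- c(1, NA, 3)
length(y)                  # 3
sum(y)                     # NA
sum(y, na.rm = TRUE)       # 4
length(c(1, NULL, 3))      # 2, NULL simply disappears

# Matrices are filled by column unless byrow = TRUE is given.
m <- matrix(1:6, nrow = 2)
m[1, ]                     # 1 3 5
matrix(1:6, nrow = 2, byrow = TRUE)[1, ]   # 1 2 3

# `$` plays the role that `.` plays in many other languages;
# a dot inside a name is just another character.
my.list <- list(a = 1, b.c = 2)
my.list$b.c                # 2
```

These are the behaviors most likely to surprise code written in a zero-offset, row-major language when it exchanges data with R.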

- Everything is a vector
- Everything can be null or NA
- Unit-offset vectors
- Zero index legal but strange
- Negative indices remove elements
- Matrices filled by column by default
- $ acts like the dot in other languages; the dot itself is not special

C package interface

- Must manage low-level details of R object model and memory
- Requires Rtools on Windows
- Lots of macros like REALSXP, PROTECT, and UNPROTECT
- Use C++ (Rcpp) instead

“I do not recommend using C for writing new high-performance code. Instead write C++ with Rcpp.” – Hadley Wickham

Rcpp

- The most widely used extension method for R
- Call C, C++, or Fortran from R
- Companion project RInside to call R from C++
- Extensive support even for advanced C++
- Create R packages or inline code
- http://rcpp.org
- Dirk Eddelbuettel’s book

Simple Rcpp example

library(Rcpp)

# Compile a C++ function and make it callable from R
cppFunction('int add(int x, int y, int z) {
  int sum = x + y + z;
  return sum;
}')

add(1, 2, 3)

.NET

- RDCOM: http://sunsite.univie.ac.at/rcom/
- F# type provider for R: http://bluemountaincapital.github.io/FSharpRProvider/
- R.NET: https://rdotnet.codeplex.com/

SQL Server 2016

execute sp_execute_external_script
    @language = N'R',
    @script = N'OutputDataSet <- data.frame(c("hello"), " ", c("world"));',
    @input_data_1 = N' '
WITH RESULT SETS (([col1] varchar(20), [col2] char(1), [col3] varchar(20)));

Haskell

- HaskellR from Tweag.io: http://tweag.github.io/HaskellR/
- Use quasi-quoting for inline R: [r| … |]
- Interactive REPL with H wrapper around GHCi
- Works with Jupyter notebooks

org-mode

- Crufty but powerful, like all things Emacs
- Ships with support for many languages
- Works reliably cross-platform
- Good for exploration / prototyping
- Literate programming

org- languages

Supported: ABC, Asymptote, Awk, C, C++, Calc, Comint, Coq, CSS, D, Ditaa, Dot, Ebnf, Elisp, Forth, Fortran, Haskell, Io, J, Javascript, LaTeX, Ledger, Lilypond, Lisp, Make, Matlab, Mscgen, OCaml, Octave, Org, Picolisp, PlantUML, Processing, Python, R, Ruby, Sass, Scala, Scheme, Screen, Sed, Shell, Shen, SQL, SQLite, Stan

Other: Axiom, Elixir, Eukleides, Fomus, Google translate, Groovy, HTML, http request, iPython, Julia, Kotlin, LFE, Mathematica, Mathomatic, MongoDB, Neo4j, OZ, Rec, SML, Stata, Tcl, Typescript

Structure of an org-mode file

Text, images, LaTeX, etc.

#+begin_src R
…
#+end_src

text etc. …

#+begin_src python
…
#+end_src

Language interop

#+name: sin_r
#+begin_src R :var x=0
sin(x)
#+end_src

#+name: cos_p
#+begin_src python :var x=1
import math
return math.cos(x)
#+end_src

#+name: sum_sq
#+begin_src perl :var a=3 :var b=4
$a*$a + $b*$b
#+end_src

#+call: sum_sq(sin_r(1), cos_p(1))

#+results:
: 1

Jupyter notebooks

- Started out as IPython notebooks
- Julia + Python + R
- Multiple languages supported (separately)
- Less transparent than org-babel
  - For better: images, formatting, etc.
  - For worse: hard to version and diff

Some languages with Jupyter kernels

Bash, C, C++, C#, Clojure, Coffeescript, Erlang, F#, Forth, Go, Haskell, J, Java, Javascript, Julia, Matlab, Maxima, OCaml, Octave, PHP, Perl(6), PowerShell, Prolog, Python, Ruby, SAS, SageMath, Scala, Tcl, Xonsh

Beaker notebooks

- A fork of IPython, predecessor to Jupyter
- http://beakernotebook.com/
- Cells can be written in different languages
- Set attribute on beaker object in one language, access attribute from another language
- R data.frame <-> Python pandas.DataFrame

Beaker example

beaker.foo = "Hello world"   # Python cell

x <- beaker::get('foo')      # R cell

beaker::set('answer', 42)    # R cell

z = beaker.answer[0]         # Python cell

Languages supported in Beaker notebooks

C++, Clojure, F#, Groovy, HTML, Java, JavaScript, Julia, Lua/Torch, Node, Python(3), R, Ruby, Scala/Spark, SQL

R Markdown

- Similar to Jupyter, Beaker
- http://rmarkdown.rstudio.com
- Can mix languages in a single document
- Exchange data between languages via data frames
- Many publication export formats

Languages supported in R Markdown

Bash, CSS, JavaScript, Python, R, Rcpp, SQL, Stan

R Markdown example

Text (markdown)…

```{r}
x <- "hello from R"
print(x)
```

Text …

```{python}
x = " ".join(["Hello", "from", "Python"])
print(x)
```

Summary

- Make R more efficient, or borrow its libraries.
- R differences: null/NA, vectors, unit offset, etc.
- Most of these approaches do not simply install and “just work.”
- Org-babel works as documented, but maybe not as expected.
- Most general/powerful approach: language <-> Rcpp <-> R

Contact