Mixing R with Other Languages

John D. Cook, PhD
Singular Value Consulting

Why R?
  • Libraries, libraries, libraries
  • De facto standard for statistical research
  • Nice language, as far as statistical languages go
  • “Quirky, flawed, and an enormous success.”

Why mix languages?
  • Improve performance of R code
      • Execution speed (e.g. loops)
      • Memory management
  • Raid R’s libraries

How to optimize R
  • Vectorize
  • Rewrite not using R

A few R quirks
  • Everything is a vector
  • Everything can be null or NA
  • Unit-offset vectors
  • Zero index legal but strange
  • Negative indices remove elements
  • Matrices filled by column by default
  • $ acts like dot, dot not special

C package interface
  • Must manage low-level details of R object model and memory
  • Requires Rtools on Windows
  • Lots of macros like REALSXP, PROTECT, and UNPROTECT
  • Use C++ (Rcpp) instead

“I do not recommend using C for writing new high-performance code. Instead write C++ with Rcpp.” – Hadley Wickham

Rcpp
  • The most widely used extension method for R
  • Call C, C++, or Fortran from R
  • Companion project RInside to call R from C++
  • Extensive support even for advanced C++
  • Create R packages or inline code
  • http://rcpp.org
  • Dirk Eddelbuettel’s book

Simple Rcpp example

    library(Rcpp)

    cppFunction('int add(int x, int y, int z) {
      int sum = x + y + z;
      return sum;
    }')

    add(1, 2, 3)

.NET
  • RDCOM: http://sunsite.univie.ac.at/rcom/
  • F# type provider for R: http://bluemountaincapital.github.io/FSharpRProvider/
  • R.NET: https://rdotnet.codeplex.com/

SQL Server 2016

    execute sp_execute_external_script
        @language = N'R'
      , @script = N' OutputDataSet <- data.frame(c("hello"), " ", c("world"));'
      , @input_data_1 = N' '
    WITH RESULT SETS (([col1] varchar(20), [col2] char(1), [col3] varchar(20)));

Haskell
  • HaskellR from Tweag.io: http://tweag.github.io/HaskellR/
  • Use quasi-quoting into inline R: [r| … |]
  • Interactive REPL with H wrapper around GHCi
  • Works with Jupyter notebooks

Emacs org-mode
  • Crufty but powerful, like all things Emacs
  • Ships with support for many languages
  • Works reliably cross-platform
  • Good for exploration / prototyping
  • Literate programming

org-babel languages
  • Supported: ABC, Asymptote, Awk, C, C++, Calc, Clojure, Comint, Coq, CSS, D, Ditaa, Dot, Ebnf, Elisp, Forth, Fortran, Gnuplot, Haskell, Io, J, Java, Javascript, LaTeX, Ledger, Lilypond, Lisp, Make, Matlab, Maxima, Mscgen, OCaml, Octave, Org, Perl, Picolisp, PlantUML, Processing, Python, R, Ruby, Sass, Scala, Scheme, Screen, Sed, Shell, Shen, SQL, SQLite, Stan
  • Other: Axiom, Elixir, Eukleides, Fomus, Google translate, Groovy, HTML, http request, iPython, Julia, Kotlin, LFE, Mathematica, Mathomatic, MongoDB, Neo4j, OZ, Prolog, Rec, SML, Stata, Tcl, Typescript

Structure of an org-mode file

    Text, images, LaTeX equations, etc.

    #+begin_src R
    …
    #+end_src

    text etc. …

    #+begin_src python
    …
    #+end_src

Language interop

    #+name: sin_r
    #+begin_src R :var x=0
    sin(x)
    #+end_src

    #+name: cos_p
    #+begin_src python :var x=1
    import math
    return math.cos(x)
    #+end_src

    #+name: sum_sq
    #+begin_src perl :var a=3 :var b=4
    $a*$a + $b*$b
    #+end_src

    #+call: sum_sq(sin_r(1), cos_p(1))

    #+results:
    : 1

Jupyter notebooks
  • Started out as IPython notebooks
  • Julia + Python + R
  • Multiple languages supported (separately)
  • Less transparent than org-babel
      • For better: images, formatting, etc.
      • For worse: hard to version and diff

Some languages with Jupyter kernels
  Bash, C, C++, C#, Clojure, Coffeescript, Common Lisp, Erlang, F#, Forth, Go, Haskell, Hy, J, Java, Javascript, Julia, Matlab, Maxima, OCaml, Octave, PHP, Perl(6), PowerShell, Prolog, Python, Ruby, SAS, SageMath, Scala, Tcl, Xonsh

Beaker notebooks
  • A fork of IPython, predecessor to Jupyter
  • http://beakernotebook.com/
  • Cells can be written in different languages
  • Set attribute on beaker object in one language, access attribute from another language
  • R data.frame <-> Python pandas.DataFrame

Beaker example

    beaker.foo = "Hello world"     # Python cell
    x <- beaker::get('foo')        # R cell
    beaker::set('answer', 42)      # R cell
    z = beaker.answer[0]           # Python cell

Languages supported in Beaker notebooks
  C++, Clojure, F#, Groovy, HTML, Java, JavaScript, Julia, Lua/Torch, Node, Python(3), R, Ruby, Scala/Spark, SQL

R Markdown
  • Similar to Jupyter, Beaker
  • http://rmarkdown.rstudio.com
  • Can mix languages in a single document
  • Exchange data between languages via data frames
  • Many publication export formats

Languages supported in R Markdown
  Bash, CSS, JavaScript, Python, R, Rcpp, SQL, Stan

R Markdown example

    Text (markdown)…

    ```{r}
    x <- "hello from R"
    print(x)
    ```

    Text …

    ```{python}
    x = " ".join(["Hello", "from", "Python"])
    print(x)
    ```

Summary
  • Make R more efficient, or borrow its libraries.
  • R differences: null/NA, vectors, unit offset, etc.
  • Most of these approaches do not simply install and “just work.”
  • Org-babel works as documented, but maybe not as expected.
  • Most general/powerful approach: language <-> Rcpp <-> R
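
The R differences named in the summary (unit offset, negative indices that remove elements, matrices filled by column) are easiest to trip over when data crosses a language boundary. The following is only a minimal Python sketch of those three behaviours, not part of the original slides. The helper names r_get, r_drop, and r_matrix are hypothetical, invented here for illustration, and the sketch simplifies R: it raises an error where R would return NA or an empty vector.

```python
def r_get(vec, i):
    """Return element i of vec using R-style 1-based indexing.

    Simplification: R returns NA (or an empty vector for index 0)
    where this hypothetical helper raises IndexError.
    """
    if i < 1 or i > len(vec):
        raise IndexError("R-style index out of range")
    return vec[i - 1]


def r_drop(vec, i):
    """Mimic R's x[-i]: copy vec without its i-th (1-based) element."""
    return [v for pos, v in enumerate(vec, start=1) if pos != i]


def r_matrix(values, nrow, ncol):
    """Mimic R's column-by-column fill; the result is a list of rows."""
    if len(values) != nrow * ncol:
        raise ValueError("values must contain nrow * ncol elements")
    # Element (r, c), counted from 0, sits at offset r + c * nrow.
    return [[values[r + c * nrow] for c in range(ncol)] for r in range(nrow)]


x = [10, 20, 30]
print(r_get(x, 1))    # what x[1] means in R: the first element
print(r_drop(x, 1))   # what x[-1] means in R: everything but the first
print(x[-1])          # Python's own x[-1]: the last element
print(r_matrix([1, 2, 3, 4, 5, 6], 2, 3))
```

The same offset rule, r + c * nrow, is the one to keep in mind when C or C++ code reads the flat storage behind an R matrix.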