Switching from Matlab to R
Total Page:16
File Type:pdf, Size:1020Kb
Appendix A Switching from Matlab to R Most analysts find that it is not difficult to switch from Matlab to R, especially if a few key differences are kept in mind. This appendix provides a list of such differences, gleaned from the experience of the author and his colleagues (especially Clark Richards, who helped to compile the list). Syntax 1. Assignment to a variable is denoted with “=” in Matlab, but with “<-”inR. (Actually, it is possible to use “=” for R assignment, but not recommended.) 2. In R, assignment statements do not print their value, so there is no need for the Matlab convention of using “;” for silencing an assignment. 3. In R, as in most modern languages except Matlab, square brackets are used for indexing; see Sect. 2.3.4. 4. R matrices are not constructed with a square-bracket syntax, but rather with matrix(), as.matrix(), cbind() or rbind(); see Sect. 2.3.5. 5. In Matlab, vectors (one-dimensional sequences of values) are often represented as single-column matrices. The same form can be used in R, but most functions work with vectors, instead. The drop() function, which drops unused matrix dimensions, helps to convert Matlab data to R format, e.g. the following shows how to create vectors for regression with lm(). library(R.matlab) m <- readMat("filename.mat") x <- drop(m$x) y <- drop(m$y) lm(y ~ x) 6. R coerces arrays to a lower dimension whenever it can (Sect. 2.3.5.2), but the drop=FALSE argument can override this, e.g. © Springer Science+Business Media, LLC, part of Springer Nature 2018 241 D. E. Kelley, Oceanographic Analysis with R, https://doi.org/10.1007/978-1-4939-8844-0 242 Appendix A Switching from Matlab to R a <- matrix(1:10, nrow=2) a[1,] # a vector [1]13579 yields a vector, whereas a[1,,drop=FALSE] # a 1-row matrix [,1] [,2] [,3] [,4] [,5] [1,]13579 yields a one-row matrix. (Note the second comma in the drop=FALSE case.) 7. Matrix multiplication in R uses the %*% operator, while the * operator does item- by-item multiplication. See Sect. 2.3.5.1 for this and other matrix operations. 8. In Matlab, a period in a variable name indicates the selection of a subcomponent of a structure. In R, a period in a variable name is generally taken to have no particular meaning except for generalized functions. See Sect. 2.3.2. 9. The q() function is called to exit R. Since this is a function call, the parentheses are required. Dropping the parentheses yields a cryptic message that does little to suggest that R is a friendly language! Graphics 1. In Matlab, hold on is used to indicate a desire to embellish an existing plot. Instead, R provides a suite of functions whose whole purpose is to add to existing plots, such as points() for adding points, lines() for adding lines, title() for adding titles, legend() for adding legends, mtext() for writing in plot margins, etc., plus an add argument to contour() and a few other functions to make them add to an existing plot; see Sect. 2.4. 2. Matlab offers better interactive control of plots than is available in the basic R graphics system, although the shiny package makes up for this, at some coding cost. See Sects. 2.4.15 and 2.8, plus Appendix B. 3. Matlab produces “flashier” default graphics, e.g. automatically using colours to distinguish between lines on a plot. The R strategy is to produce more utilitarian black/white default plots, in accordance with the tenets outlined by Cleveland and McGill (1984) and others who have studied the interpretation of graphical material. R also offers colour schemes that are suitable for viewers with vision limitations (Ihaka 2003; Light and Bartlein 2004; Zeileis et al. 2009). Freedom 1. Matlab is a commercial product, sold at a price that is significant to many research groups and is likely to be prohibitive to those “citizen scientists” who might wish to use code provided in the supplemental materials of research papers. R costs nothing. Appendix A Switching from Matlab to R 243 2. Portions of Matlab are in closed-source form, making it difficult for users to check the methods for veracity or appropriateness. The entire R source is open to inspection. 3. Matlab is covered by commercial licences that may be untoward in some circumstances. R is covered by a GNU license. Appendix B GUI Systems for R Graphical user interfaces (GUIs) simplify the interactive use of R, providing benefits to users of various skill levels. Those with limited programming experience may find GUIs less daunting than the R command-line interface. Those will more programming experience may appreciate the ease with which GUIs let them access previous commands, plots, and documentation views. Even experts may use GUIs when they need simple access to debugging tools, without losing the ability to use the editors and revision-control systems of their choice. The basic R system centres on a command-line mode, but this is supplemented by various GUI systems that can ease interactive analysis (Fig. B.1). Many users switch between GUI and the command-line, for different sorts of tasks or for a given task at different stages of completion.1 Some R newcomers, especially those with limited programming experience, focus almost entirely on GUI systems, but even experienced programmers can benefit from GUIs for occasional or everyday use. There are several GUIs to choose from. Most are in continuous development, making it problematic to provide detailed feature comparisons here. Still, a broad and quite personal overview may be of some use to readers. • Rstudio is a multi-platform system that is popular across a wide range of R users. Beginners appreciate the access it provides to previously viewed docu- mentation entries and plots, along with its provision of a simple data viewer and a code-aware text editor. Those with more computing skills will appreciate that Rstudio works well with system tools, such as external editors and revision control systems. Developers will appreciate the ease of rebuilding packages, 1For example, the author tends to use Rstudio to build packages, but not for data analysis. For the latter, he uses Mac-GUI (usually coupled with a Vim editor window) for exploratory work, moving to standalone script files as the work progresses. For any task that takes more than a minute, these script files are run with a Makefile so that analysis is repeated only when the R source-code or the data files are changed. © Springer Science+Business Media, LLC, part of Springer Nature 2018 245 D. E. Kelley, Oceanographic Analysis with R, https://doi.org/10.1007/978-1-4939-8844-0 246 Appendix B GUI Systems for R Fig. B.1 An Rstudio window, configured with panels for source code, documentation, console and graphics interacting with the debugger, running code-quality checkers, etc. Users at all levels can benefit from Rstudio notebooks, which record interactive work in webpages that combine code and graphs. • Mac-GUI is a Macintosh-only system that provides menu access to documenta- tion, graphs, console, and editor windows. It provides several Rstudio features, although typically in more limited ways. Its restriction to a single platform is a limitation for users who switch between different types of computer. Mac-GUI does not follow the Rstudio tendency of merging windows together into a single frame, so it can be a good choice for those who prefer to size and arrange their own windows. • JGR supports some of the features of Rstudio, in a more limited way. • Rcommander provides menu items for some operations such as data loading, regression and plotting, which may be appealing to beginners. In addition to GUI systems, there are connections between R and several of the general text editors that are popular in technical fields. This is important because users with programming experience tend to have a favourite code-aware text editor with which they are particularly adept. Emacs users can run R with the ESS (Emacs Speaks Statistics) system, while Vim users can use the Vim-R-plugin system. Both of these are set up for R syntax so they can indent and colourize appropriately, and both interact with R to let users execute blocks of code, consult documentation, and perform other tasks involved in using R. Appendix C Map Projections in oce The oce package provides many of the map projections used in modern cartog- raphy. The calculations are done with the rgdal package as an interface to the underlying PROJ.4 system, and this means that the notation will be familiar to readers who have experience with map projections in other computing languages. Only projections that may be inverted are incorporated in oce, because inverse calculations are required for labelling axes, etc. A table of the roughly 100 oce projections is presented in this appendix, along with general remarks on choosing projections and the details of a few common projections. Cartesian latitude-longitude plots may be suitable for limited areas, if meridional convergence is handled by setting asp to the reciprocal cosine of mean latitude. However, a projection may be preferable for scales exceeding a few hundred kilometres. There are many projections to choose from, and no firm rules to guide the selection. Often, a projection is chosen to match previous work, facilitating visual comparison of the data being displayed. The distortion of shapes and areas are also important considerations (see Airy (1861) for an early treatment and modern discussions by Snyder (1987) and Evenden (2003), the last of which employs the PROJ.4 notation used in oce).