The VGAM Package, Which Is Ca- Ter of the Publication Will Be Largely Unchanged
Total Page:16
File Type:pdf, Size:1020Kb
News The Newsletter of the R Project Volume 8/2, October 2008 Editorial by John Fox pable of fitting a dizzying array of statistical mod- els. Paul Murrell shows how the compare package I had the opportunity recently to examine the devel- is used to locate similarities and differences among opment of the R Project. A dramatic index of this de- non-identical objects, and explains how the package velopment is the rapid growth in R packages, which may be applied to grading students’ assignments. is displayed in Figure 1; because the vertical axis in Arthur Allignol, Jan Beyersmann, and Martin Schu- this graph is on a log scale, the least-squares linear- macher describe the mvna package for the Nelson- regression line represents exponential growth. Aalen estimator of cumulative transition hazards in This issue of R News is another kind of testament multistate models. Finally, in the latest installment of to the richness and diversity of applications pro- the Programmers’ Niche column, Vince Carey reveals grammed in R: Hadley Wickham, Michael Lawrence, the mind-bending qualities of recursive computation Duncan Temple Lang, and Deborah Swayne describe in R without named self-reference. the rggobi package, which provides an interface to The “news” portion of this issue of R Rnews in- GGobi, software for dynamic and interactive sta- cludes a report on the recent useR! conference in tistical graphics. Carsten Dormann, Bernd Gruber, Dortmund, Germany; announcements of useR! 2009 and Jochen Fründ introduce the bipartite package in Rennes, France, and of DSC 2009 in Copenhagen, for examining interactions among species in ecolog- Denmark, both to be held in July; news from the R ical communities. Ionnis Kosmidis explains how to Foundation and from the Bioconductor Project; the use the profileModel package to explore likelihood latest additions to CRAN; and changes in the re- surfaces and construct confidence intervals for pa- cently released version 2.8.0 of R. rameters of models fit by maximum-likelihood and Unless we accumulate enough completed articles closely related methods. Ingo Feinerer illustrates to justify a third issue in 2008, this will be my last is- how the tm (“text-mining”) package is employed for sue as editor-in-chief of R News. Barring a third issue the analysis of textual data. Yihui Xie and Xiaoyne in 2008, this is also the last issue of R News — but Cheng demonstrate the construction of statistical an- don’t panic! R News will be reborn in 2009 as The R imations in R with the animation package. Thomas Journal. Moreover, the purpose and general charac- Yee introduces the VGAM package, which is ca- ter of the publication will be largely unchanged. We Contents of this issue: Comparing Non-Identical Objects . 40 mvna: An R Package for the Nelson–Aalen Es- Editorial . .1 timator in Multistate Models . 48 An Introduction to rggobi ............3 Programmers’ Niche: The Y of R . 51 Introducing the bipartite Package: Analysing Changes in R Version 2.8.0 . 54 Ecological Networks . .8 Changes on CRAN . 60 The profileModel R Package: Profiling Objec- News from the Bioconductor Project . 69 tives for Models with Linear Predictors. 12 useR! 2008 . 71 An Introduction to Text Mining in R . 19 Forthcoming Events: useR! 2009 . 73 animation: A Package for Statistical Animations 23 Forthcoming Events: DSC 2009 . 74 The VGAM Package . 28 R Foundation News . 74 Vol. 8/2, October 2008 2 think that the new name better reflects the direction R Version in which R News has evolved, and in particular the 1.3 1.4 1.5 1.7 1.8 1.9 2.0 2.1 2.2 2.3 2.4 2.5 2.6 2.7 quality of the refereed articles that we publish. 1400 1427 1300 During my tenure as an R News editor, I’ve had 1200 1000 1000 the pleasure of working with Vince Carey, Peter Dal- 911 800 gaard, Torsten Hothorn, and Paul Murrell. I look for- 739 647 ward to working next year with Heather Turner, who 600 548 is joining the editorial board in 2009, which will be 500 my last year on the board. 400 406 357 300 John Fox 273 McMaster University Number of CRAN Packages 219 200 Canada 162 [email protected] 129 110 100 2001−06−21 2001−12−17 2002−06−12 2003−05−27 2003−11−16 2004−06−05 2004−10−12 2005−06−18 2005−12−16 2006−05−31 2006−12−12 2007−04−12 2007−11−16 2008−03−18 Date Figure 1: The number of R packages on CRAN has grown exponentially since R version 1.3 in 2001. Source of Data: https://svn.r-project.org/ R/branches/. R News ISSN 1609-3631 Vol. 8/2, October 2008 3 An Introduction to rggobi Hadley Wickham, Michael Lawrence, outliers. It is customary to look at uni- or bivariate Duncan Temple Lang, Deborah F Swayne plots to look for uni- or bivariate outliers, but higher- dimensional outliers may go unnoticed. Looking for these outliers is easy to do with the tour. Open your Introduction data with GGobi, change to the tour view, and se- The rggobi (Temple Lang and Swayne, 2001) pack- lect all the variables. Watch the tour and look for age provides a command-line interface to GGobi, an points that are far away or move differently from the interactive and dynamic graphics package. Rggobi others—these are outliers. complements GGobi’s graphical user interface by en- Adding more data sets to an open GGobi is also g$mtcars2 <- mtcars abling fluid transitions between analysis and explo- easy: will add another data ration and by automating common tasks. It builds on set named “mtcars2”. You can load any file type the first version of rggobi to provide a more robust that GGobi recognises by passing the path to that file. ggobi_find_file and user-friendly interface. In this article, we show In conjunction with , which locates how to use ggobi and offer some examples of the in- files in the GGobi installation directory, this makes it sights that can be gained by using a combination of easy to load GGobi sample data. This example loads analysis and visualisation. the olive oils data set included with GGobi: This article assumes some familiarity with library(rggobi) GGobi. A great deal of documentation, both ggobi(ggobi_find_file("data", "olive.csv")) introductory and advanced, is available on the GGobi web site, http://www.ggobi.org; newcom- ers to GGobi might find the demos at ggobi.org/ Modifying observation-level docs especially helpful. The software is there attributes, or “automatic brushing” as well, both source and executables for several platforms. Once you have installed GGobi, you Brushing is typically thought of as an operation per- can install rggobi and its dependencies using in- formed in a graphical user interface. In GGobi, it stall.packages("rggobi", dep=T). refers to changing the colour and symbol (glyph type This article introduces the three main compo- and size) of points. It is typically performed in a nents of rggobi, with examples of their use in com- linked environment, in which changes propagate to mon tasks: every plot in which the brushed observations are • getting data into and out of GGobi; displayed. in GGobi, brushing includes shadowing, where points sit in the background and have less vi- • modifying observation-level attributes (“auto- sual impact, and exclusion, where points are com- matic brushing”); pleted excluded from the plot. Using rggobi, brush- ing can be performed from the command line; we • basic plot control. think of this as “automatic brushing.” The following We will also discuss some advanced techniques such functions are available: as creating animations with GGobi, the use of edges, • change glyph colour with glyph_colour (or and analysing longitudinal data. Finally, a case study glyph_color) shows how to use rggobi to create a visualisation for a statistical algorithm: manova. • change glyph size with glyph_size • change glyph type with glyph_type Data • shadow and unshadow points with shadowed • exclude and include points with excluded Getting data from R into GGobi is easy: g <- ggobi(mtcars). This creates a GGobi object Each of these functions can be used to get or set called g. Getting data out isn’t much harder: Just the current values for the specified GGobiData. The index that GGobi object by position (g[[1]]) or by “getters” are useful for retrieving information that name (g[["mtcars"]] or g$mtcars). These return you have created while brushing in GGobi, and the GGobiData objects which are linked to the data in “setters” can be used to change the appearance of GGobi. They act just like regular data frames, except points based on model information, or to create an- that changes are synchronised with the data in the imations. They can also be used to store, and then corresponding GGobi. You can get a static copy of later recreate, the results of a complicated sequence the data using as.data.frame. of brushing steps. Once you have your data in GGobi, it’s easy to This example demonstrates the use of do something that was hard before: find multivariate glyph_colour to show the results of clustering the R News ISSN 1609-3631 Vol. 8/2, October 2008 4 infamous Iris data using hierachical clustering. Us- Name Variables ing GGobi allows us to investigate the clustering in 1D Plot 1 X the original dimensions of the data. The graphic XY Plot 1 X, 1 Y shows a single projection from the grand tour. 1D Tour n X Rotation 1 X, 1 Y, 1 Z 2D Tour n X g <- ggobi(iris) 2x1D Tour n X, n Y clustering <- hclust(dist(iris[,1:4]), Scatterplot Matrix n X method="average") Parallel Coordinates Display n X glyph_colour(g[1]) <- cuttree(clustering, 3) Time Series 1 X, n Y Barchart 1 X After creating a plot you can get and set the dis- played variables using the variable and variable<- methods.