The R Journal, December 2009
The R Journal
Volume 1/2, December 2009

A peer-reviewed, open-access publication of the R Foundation for Statistical Computing

Contents

Editorial

Special section: The Future of R
    Aspects of the Social Organization and Trajectory of the R Project

Contributed Research Articles
    Party on!
    ConvergenceConcepts: An R Package to Investigate Various Modes of Convergence
    asympTest: A Simple R Package for Classical Parametric Statistical Tests and Confidence Intervals in Large Samples
    copas: An R package for Fitting the Copas Selection Model
    Transitioning to R: Replicating SAS, Stata, and SUDAAN Analysis Techniques in Health Policy Data
    Rattle: A Data Mining GUI for R
    sos: Searching Help Pages of R Packages

From the Core
    The New R Help System

News and Notes
    Conference Review: DSC 2009
    Conference Review: WZUR(2.0) – The Second Meeting of Polish R Users
    R Changes: 2.9.1-2.10.0 Patched
    Changes on CRAN
    News from the Bioconductor Project
    R Foundation News

The R Journal is a peer-reviewed publication of the R Foundation for Statistical Computing. Communications regarding this publication should be addressed to the editors. All articles are copyrighted by the respective authors. Prospective authors will find detailed and up-to-date submission instructions on the Journal’s homepage.

Editor-in-Chief:
Vince Carey
Channing Laboratory
Brigham and Women’s Hospital
75 Francis St.
Boston, MA 02115 USA

Editorial Board: John Fox, Heather Turner, and Peter Dalgaard.
Editor Programmer’s Niche: Bill Venables
Editor Help Desk: Uwe Ligges
R Journal Homepage: http://journal.r-project.org/
Email of editors and editorial board: [email protected]

The R Journal Vol. 1/2, December 2009 ISSN 2073-4859

Editorial

by Vince Carey

This issue of the R Journal comes on the heels of R 2.10.1. R 2.10 sports a variety of changes to

• the documentation system
• factor handling
• debugging and code analysis support
• encodings management
• [[ semantics
• regular expression processing
• data compression facilities
• package installation and checking

among other features. Most users will want to familiarize themselves with the details of items described in R_HOME/NEWS and in this issue’s “Changes to R” article. Thanks are due to the core members and other contributors who have introduced these enhancements, many of which will increase the ease and scope of use of R in the growing set of domains for which effectiveness requires excellent data analysis.

The R Journal also has some new or impending features of interest. A number of readers have inquired about subscriptions and RSS feeds. We now have a feed, thanks to Heather Turner: http://journal.r-project.org/rss.xml. It is also a pleasure to announce the addition of Jay Kerns as Book Review Editor; the Book Review section will be inaugurated in the next issue. Thanks to efforts of Achim Zeileis, we have added subsections to the “Changes on CRAN” regular feature that describe new CRAN task views and new allocations of packages to CRAN task views. In the PDF image of the Journal, these are all hyperlinked to the view or package resources on CRAN, so that readers can quickly investigate or acquire packages in views of interest.

Finally, in this issue we have a nice piece by R core members Duncan Murdoch and Simon Urbanek describing changes to the R help markup language and its processing. This article is the first for a recurring journal section “From the Core” where we plan to highlight new ideas and methods in the words of core members themselves.

It has been a pleasure to assemble this number. We have a special item on the sociology of the R project from our past editor-in-chief, John Fox. Research articles cover topics in random forest interpretation, meta-analysis, complex surveys in health policy research, data mining via GUI, enhanced support for resource discovery, and issues in teaching about convergence of sequences of random variables and large sample inference.

My tenure as Editor-in-Chief of the R Journal comes to a close with this issue. Peter Dalgaard now takes the reins. I am deeply indebted to Peter, John Fox, Heather Turner, Uwe Ligges, and Bill Venables for their editorial assistance, and to Martin Maechler for systems support. John Fox is owed a special thanks for staying in the editorial group for an extra year; we welcome Martyn Plummer of IARC who is joining as Associate Editor.

To close, I’d like to suggest to readers that they spend at least a little while in the “Changes on CRAN” section. There is much to be learned there from the perspective of software interoperability alone, with new packages defining interfaces to MS Word, Apache ant, NVIDIA CUDA, and sendmail, for example. Folks interested in working with AVIRIS hyperspectral images, NIfTI-formatted brain images, or the TikZ system for algebraically specified vector graphics will find connections to R in this section. Owners of multicore hardware will want to get acquainted with new contributions from Revolution Computing, Inc. Lastly, browsing the new contributions inspired me to learn that “quaternary science” denotes the study of the past 2.6 million years on Earth. Go CRAN!

Vince Carey
Channing Laboratory, Brigham and Women’s Hospital
[email protected]

INVITED SECTION: THE FUTURE OF R

Aspects of the Social Organization and Trajectory of the R Project

by John Fox

Abstract: Based partly on interviews with members of the R Core team, this paper considers the development of the R Project in the context of open-source software development and, more generally, voluntary activities. The paper describes aspects of the social organization of the R Project, including the organization of the R Core team; describes the trajectory of the R Project; seeks to identify factors crucial to the success of R; and speculates about the prospects for R.

Introduction

This paper describes aspects of the R Core team; briefly traces the trajectory of the R Project; discusses the development and organization of the R Project; considers the reasons for the success of R; and speculates about its prospects for continued success. The paper is based on semi-structured interviews conducted during 2006 and 2007 with most members of the R Core team, whom I will occasionally quote in the paper; on publicly available archival sources; and on participant observation in the R Project, as a user, package developer, author, and – more recently – a member of the R Foundation.

The paper is not a complete consideration of the social organization of the R Project in that it does not systematically address interactions among members of the R Core team, nor between the Core team and package developers and users, nor among developers and users, all of which would be the proper subject of a more complete account. Nevertheless, I do try to identify key aspects of the social design of the R Project, particularly with respect to their contributions to the success of R and to its future.

What is problematic about open-source software development?

Why do people contribute to open-source projects such as R? Is this behaviour purely altruistic, or are there rewards – tangible and otherwise – to open-source development? Raymond (2001c), for example, suggests that the open-source development community constitutes a “gift culture” in which the cur-

Although I will address the question of motivation briefly (and although it is raised repeatedly by economists), it is not an issue unique to open-source software development: After all, people participate in a wide variety of voluntary organizations. There is a large and venerable literature in sociology on voluntary associations (for reviews, see Smith, 1975, and Knoke, 1986), much of it focusing on participation, and more recent work in the area addressing the “social capital” accruing to communities as a consequence of participation in voluntary organizations (following Putnam, 1995).

Winchester (2003, p. 215) writes of the unpaid volunteers who contributed meticulous work to the monumental Oxford English Dictionary:

    [W]e do not really know why so many people gave so much of their time for so little apparent reward. And this is the abiding and most marvelous mystery of the enormously democratic process that was the Dictionary – that hundreds upon hundreds of people, for motives known and unknown, for reasons both stated and unsaid, helped to chronicle the immense complexities of the language that was their own, and that they dedicated in many cases ... years upon years of labour to a project of which they all, buoyed by some set of unfathomable and optimistic notions, insisted on becoming a part.

With a few changes in specifics, much the same can be said of participation in the R Project – both by members of the Core team and by others.

Participation in open-source software projects is in this sense no different from participation in other voluntary organizations, such as coaching a children’s ice-hockey team or contributing to the OED. When asked about their motivation for working on the R project, members of the R Core team responded with conventional reasons for participating in a voluntary association:

• To satisfy a sense of obligation (with a hint of rational self-interest).

    [M]y feeling is that I gain great benefit from open-source software.