The R Journal, June 2010
The R Journal Volume 2/1, June 2010

A peer-reviewed, open-access publication of the R Foundation for Statistical Computing

Contents

Editorial

Contributed Research Articles
  IsoGene: An R Package for Analyzing Dose-response Studies in Microarray Experiments
  MCMC for Generalized Linear Mixed Models with glmmBUGS
  Mapping and Measuring Country Shapes
  tmvtnorm: A Package for the Truncated Multivariate Normal Distribution
  neuralnet: Training of Neural Networks
  glmperm: A Permutation of Regressor Residuals Test for Inference in Generalized Linear Models
  Online Reproducible Research: An Application to Multivariate Analysis of Bacterial DNA Fingerprint Data
  Two-sided Exact Tests and Matching Confidence Intervals for Discrete Data

Book Reviews
  A Beginner’s Guide to R

News and Notes
  Conference Review: The 2nd Chinese R Conference
  Introducing NppToR: R Interaction for Notepad++
  Changes in R 2.10.1–2.11.1
  Changes on CRAN
  News from the Bioconductor Project
  R Foundation News

The R Journal is a peer-reviewed publication of the R Foundation for Statistical Computing. Communications regarding this publication should be addressed to the editors. All articles are copyrighted by the respective authors.

Prospective authors will find detailed and up-to-date submission instructions on the Journal’s homepage.
Editor-in-Chief:
Peter Dalgaard
Center for Statistics
Copenhagen Business School
Solbjerg Plads 3
2000 Frederiksberg
Denmark

Editorial Board:
Vince Carey, Martyn Plummer, and Heather Turner.

Editor Programmer’s Niche:
Bill Venables

Editor Help Desk:
Uwe Ligges

Editor Book Reviews:
G. Jay Kerns
Department of Mathematics and Statistics
Youngstown State University
Youngstown, Ohio 44555-0002
USA
[email protected]

R Journal Homepage:
http://journal.r-project.org/

Email of editors and editorial board:
[email protected]

The R Journal Vol. 2/1, June 2010 ISSN 2073-4859

Editorial

by Peter Dalgaard

Welcome to the 1st issue of the 2nd volume of The R Journal.

I am writing this after returning from the NORDSTAT 2010 conference on mathematical statistics in Voss, Norway, followed by co-teaching a course on Statistical Practice in Epidemiology in Tartu, Estonia.

In Voss, I had the honour of giving the opening lecture entitled “R: a success story with challenges”. I shall spare you the challenges here, but as part of the talk, I described the amazing success of R, and a show of hands in the audience revealed that only about 10% of the audience was not familiar with R. I also got to talk about the general role of free software in science and I think my suggestion that closed-source software is “like a mathematician hiding his proofs” was taken quite well.

R 2.11.1 came out recently. The 2.11.x series displays the usual large number of additions and corrections to R, but if a single major landmark is to be pointed out, it must be the availability of a 64-bit version for Windows. This has certainly been long awaited, but it was held back by the lack of success with a free software 64-bit toolchain (a port using a commercial toolchain was released by REvolution Computing in 2009), despite attempts since 2007. On January 4th this year, however, Gong Yu sent a message to the R-devel mailing list that he had succeeded in building R using a version of the MinGW-w64 tools. On January 9th, Brian Ripley reported that he was now able to build a version as well. During Winter and Spring this developed into almost full-blown platform support in time for the release of R 2.11.0 in April. Thanks go to Gong Yu and the “R Windows Trojka”, Brian Ripley, Duncan Murdoch, and Uwe Ligges, but the groundwork by the MinGW-w64 team should also be emphasized. The MinGW-w64 team leader, Kai Tietz, was also very helpful in the porting process.

The transition from R News to The R Journal was always about enhancing the journal’s scientific credibility, with the strategic goal of allowing researchers, especially young researchers, due credit for their work within computational statistics. The R Journal is now entering a consolidation phase, with a view to becoming a “listed journal”. To do so, we need to show that we have a solid scientific standing with good editorial standards, giving submissions fair treatment and being able to publish on time. Among other things, this has taught us the concept of the “healthy backlog”: You should not publish so quickly that there might be nothing to publish for the next issue!

We are still aiming at being a relatively fast-track publication, but it may be too much to promise publication of even uncontentious papers within the next two issues. The fact that we now require two reviewers on each submission is also bound to cause some delay.

Another obstacle to timely publication is that the entire work of the production of a new issue is in the hands of the editorial board, and they are generally four quite busy people. It is not good if a submission turns out to require major copy editing of its LaTeX markup and there is a new policy in place to require up-front submission of LaTeX sources and figures. For one thing, this allows reviewers to advise on the LaTeX if they can, but primarily it gives better time for the editors to make sure that an accepted paper is in a state where it requires minimal copy editing before publication. We are now able to enlist student assistance to help with this. Longer-term, I hope that it will be possible to establish a front-desk to handle submissions.

Finally, I would like to welcome our new Book Review editor, Jay Kerns. The first book review appears in this issue and several more are waiting in the wings.

Contributed Research Articles

IsoGene: An R Package for Analyzing Dose-response Studies in Microarray Experiments

by Setia Pramana, Dan Lin, Philippe Haldermans, Ziv Shkedy, Tobias Verbeke, Hinrich Göhlmann, An De Bondt, Willem Talloen and Luc Bijnens.

Abstract  IsoGene is an R package for the analysis of dose-response microarray experiments to identify genes or subsets of genes with a monotone relationship between the gene expression and the doses. Several testing procedures (i.e., the likelihood ratio test, Williams, Marcus, the M, and Modified M) that take into account the order restriction of the means with respect to the increasing doses are implemented in the package. The inference is based on resampling methods, both permutations and the Significance Analysis of Microarrays (SAM).

Introduction

The exploration of dose-response relationships is important in drug discovery in the pharmaceutical industry. The response in this type of study can be either the efficacy of a treatment or the risk associated with exposure to a treatment. Primary concerns of such studies include establishing that a treatment has some effect and selecting a dose or doses that appear efficacious and safe (Pinheiro et al., 2006). In recent years, dose-response studies have been integrated with microarray technologies (Lin et al., 2010). Within the microarray setting, the response is gene expression measured at a certain dose level. The aim of such a study is usually to identify a subset of genes with expression levels that change with experimented dose levels.

One of four main questions formulated in dose-response studies by Ruberg (1995a, 1995b) and Chuang-Stein and Agresti (1997) is whether there is any evidence of the drug effect. To answer this question, the null hypothesis of homogeneity of means (no dose effect) is tested against ordered alternatives. Lin et al. (2007, 2010) discussed several testing procedures used in dose-response studies of microar-

2010), an R package called IsoGene has been developed. The IsoGene package implements the testing procedures described by Lin et al. (2007) to identify a subset of genes where a monotone relationship between gene expression and dose can be detected. In this package, the inference is based on resampling methods, both permutations (Ge et al., 2003) and the Significance Analysis of Microarrays (SAM; Tusher et al., 2001). To control the False Discovery Rate (FDR), the Benjamini-Hochberg (BH) procedure (Benjamini and Hochberg, 1995) is implemented.

This paper introduces the IsoGene package with background information about the methodology used for analysis and its main functions. Illustrative examples of analysis using this package are also provided.

Testing for Trend in Dose Response Microarray Experiments

In a microarray experiment, for each gene, the following ANOVA model is considered:

\[ Y_{ij} = \mu(d_i) + \varepsilon_{ij}, \quad i = 0, 1, \ldots, K, \quad j = 1, 2, \ldots, n_i, \tag{1} \]

where $Y_{ij}$ is the $j$th gene expression at the $i$th dose level, $d_i$ ($i = 0, 1, \ldots, K$) are the $K+1$ dose levels, $\mu(d_i)$ is the mean gene expression at each dose level, and $\varepsilon_{ij} \sim N(0, \sigma^2)$. The dose levels $d_0, \ldots, d_K$ are strictly increasing.

The null hypothesis of homogeneity of means (no dose effect) is given by

\[ H_0 : \mu(d_0) = \mu(d_1) = \cdots = \mu(d_K), \tag{2} \]

where $\mu(d_i)$ is the mean response at dose $d_i$ with $i = 0, \ldots, K$, where $i = 0$ indicates the control. The alternative hypotheses under the assumption of monotone increasing and decreasing trend of means are respectively specified by

\[ H_1^{Up} : \mu(d_0) \le \mu(d_1) \le \cdots \le \mu(d_K), \tag{3} \]

\[ H_1^{Down} : \mu(d_0) \ge \mu(d_1) \ge \cdots \ge \mu(d_K). \]