A Community of Practice Around Peer Review for Long-Term Research Software Sustainability
Total Page:16
File Type:pdf, Size:1020Kb
A Community of Practice Around Peer Review for Long-Term Research Software Sustainability Karthik Ram Noam Ross University of California, Berkeley, EcoHealth Alliance, The rOpenSci Project The rOpenSci Project Carl Boettiger Maelle€ Salmon University of California, Berkeley, The rOpenSci Project The rOpenSci Project Stefanie Butland Scott Chamberlain The rOpenSci Project University of California, Berkeley, The rOpenSci Project Abstract—Scientific open source projects are responsible for enabling many of the major advances in modern science including recent breakthroughs such as the Laser Interferometer Gravitational-Wave Observatory project recognized in the 2017 Nobel Prize for physics. However, much of this software ecosystem is developed ad hoc with no regard for sustainable software development practices. This problem is further compounded by the fact that researchers who develop software have little in the way of resources or academic recognition for their efforts. The rOpenSci Project, founded in 2011 with the explicit mission of developing software to support reproducible science, has in recent years undertaken an effort to improve the long tail of scientific software. In this paper, we describe our software peer-review system, which brings together the best of traditional academic review with new ideas from industry code review. & MODERN SCIENCE RELIES very heavily on open source software to acquire, manipulate, and ana- Digital Object Identifier 10.1109/MCSE.2018.2882753 lyze large volumes of data. One would be hard Date of publication 30 November 2018; date of current pressed to find areas of science today that do not rely on software to generate new discoveries. version 8 March 2019. March/April 2019 Published by the IEEE Computer Society 1521-9615 ß 2018 IEEE 59 Accelerating Scientific Discovery With Reusable Software Such software, often referred to as research soft- analyze their data. The free version control soft- ware, have become indispensable for progress in ware Git and the associated collaboration most areas of science and engineering. By some platform GitHub have been playing an increas- estimates, the National Science Foundation has ingly important role in scientific software over spent close to 10 billion dollars between 1996 the past several years. Much of the collaboration and 2006 in support of research proposals that around such software development now hap- described a software development component.13 pens on platforms such as GitHub (http://www. Despite its importance, a large proportion of nature.com/news/democraticdatabases-science- academic software is developed in an ad hoc on-github-1.20719), which is now being taught manner with little regard for the high standards both in a handful of university courses and by that are characteristic of other research activi- independent programs such as The Carpentries. ties. As a result, the research software ecosystem Such widespread exposure has made the plat- is fragile and the source of numerous problems form and vocabulary familiar to a large group of that afflict modern computational science. Per- researchers, making it easier for new tools built haps the most widely publicized of those prob- in this ecosystem to thrive. Yet the lack of formal lems has been the reproducibility or replication training and mentorship contributes to the grow- crisis.4 Although a multitude of reasons has cul- ing problem of poorly engineered software that minated in different forms of this crisis, some is fragile, difficult to use, and prone to bit rot. reproducibility challenges have been attributed In 2011, a subset of us noticed the need for directly to software and how it is developed in well-developed software that scientists routinely academia. When research software is inaccessi- use. These range from simple utilities to produce ble, lacks adequate usage instructions, is difficult figures, perform conversions, to more elaborate to install, suffers from inadequate testing, or is operations such as retrieving complex data from poorly maintained, it can impede efforts to repro- the web in a repeatable manner. This was the orig- duce results generated using such tools. Most inal motivation for the creation of the rOpenSci often, reproducibility is made impossible when project (https://ropensci.org). rOpenSci plays a one or more of these issues cause the software to critical role in the scientific software ecosystem, break unexpectedly. particularly in the R programming language. Our In 2014 and 2015, Collberg et al. carried out a primary mission is to promote development and series of studies to quantify the state of compu- use of high-quality research software in the sci- tational reproducibility in applied computer sci- entific community. We enable this transformation ence, an area that relies more heavily on in two major ways: in-house software develop- software than other fields. Their premise was ment and peer review of community-contributed that software described in papers should be software. readily accessible and contain enough instruc- Our core development team primarily builds tions for an informed reader to build from robust software to help researchers access, dis- source code on their own systems. Shockingly, cover, publish, and work with disparate types of less than 20% of more than 600 papers they scientific data. Over the years we have created reviewed had software that could actually be 287 high-quality software packages to support downloaded and built with any reasonable the scientific community (192 of them published effort. The study did not consider whether that on R’s primary package manager, CRAN, 2 on 20% of software functioned correctly or pro- Bioconductor https://www.bioconductor.org/, duced valid scientific results. In many ways, and 93 on GitHub https://github.com/ropensci). such a result is not entirely surprising. Scientists Compared to core infrastructure projects that spend decades in training to become experts in are the poster children of scientific open source their chosen domains. Yet critical skills neces- such as Jupyter and NumPy, rOpenSci serves to sary to engineer software are rarely taught. support packages that make up the “long tail” of Despite lack of formal training, a growing science, where software packages have been number of researchers are now spending their designed for specialized applications rather time writing software and code to organize and than general purpose needs. A complete list of 60 Computing in Science & Engineering software packages that rOpenSci has reviewed most successful projects are the ones that turn or accepted is available at https://ropensci.org/ one-person projects into thriving communities. packages/. Perhaps rOpenSci’s biggest contribu- Long-term archiving: Although software col- tion to improving the state of research software is laboration platforms such as GitHub bring visi- not just the development and maintenance of criti- bility to projects, they are not a solution for cal software tools in house but in mentoring long-term archiving. Nearly 30% of the papers domain scientists in good software development surveyed in Collberg 2014 could not be located. practices and fostering a peer-review culture for Without permanent archiving of source code in research software. In this paper, we briefly persistent repositories such as Zenodo, one can- describe our efforts to improve the state of not ensure availability of software over time. research software by creating a peer-review sy- Software design and code smells: Poorly des- stem that shares many similarities with the pub- igned software with inconsistent and unintuitive lishing system but also addresses challenges that user interfaces is a problem that cannot be easily are unique to software development in research. surfaced without a detailed review. Deeper design issues include issues such as attention to software dependencies, good error messages and handling CHALLENGES WITH RESEARCH of unexpected inputs, and following conventions SOFTWARE and coding styles that are characteristic of a com- Documentation: Over the past decade scien- munity. There are also various other code smells11 tists have been embracing various open source that indicate deeper problems with the software. programming languages for scientific computing. These include lack of a clear README with instal- Python, R, and Julia, in particular, have fueled lation instructions and basic examples, no con- the meteoric rise in popularity of data science as tributed code of conduct or clear contribution a new field. As of this writing, there are 147 000 guidelines, lack of continuous integration (serv- packages in the Python package manager (PyPi), ices that allow for the software to be tested any 13 800 packages on CRAN, the Central R Archive time a change is made to the code) for testing, and Network, and 2400 packages in the Julia language large gaps in documentation. (via libraries.io). Sorting through such large col- Maintainability: Design considerations that lections of software packaging and finding the make it easy for future developers to understand right tools presents a serious challenge for most the software, extend functionality, and fix bugs researchers. Beyond the software functionality are critical to prevent software from becoming itself, one of the biggest hurdles to using soft- stale before it reaches a natural end of cycle. ware is lack of good documentation. Given the tedious nature of the process and the need for SOFTWARE REVIEW AS A SERVICE different skills than