6 | The Rising Popularity of Preprints: More Power to the Community

Evan Sterling, School of Information Studies

Preprints have grown significantly in popularity in the life sciences in the past few years via sites such as bioRxiv, as researchers want to share their work rapidly without waiting for a peer-review process that many of them view as lengthy and ineffective. Some advocates see them not merely as an adjunct to traditional journals, but as a means to disrupt and improve this model and reduce the power of publishing gatekeepers. This change is having unintended effects on life sciences communication, such as increasing the pressure to publish, while new hybrid platforms are being created in the grey area between journal article and preprint.

Keywords: preprints; ; ; arXiv; bioRxiv; scholarly communities

One would not expect preprints, shared drafts of scientific papers, to be a contentious or even a particularly interesting topic. But in 2018 they are a bone of contention in the life sciences as they rapidly grow in popularity, one change among many underway in scholarly communications. The discussion has played out on Twitter, on blogs, in academic papers, and even in WIRED and on NPR. Some researchers make passionate cases for their adoption; others write survival guides for scientists; others still call them a “moral hazard” or write that they are in danger of being “hijacked by pseudoscience” (Teixeira da Silva, 2017; Thompson, 2017). This paper will review the growth of preprints and their effects, in particular on peer review.

Origins and Overview

Preprints are usually defined as draft versions of academic articles that are shared with others for comment before being peer-reviewed. Preprints allow researchers to share their nearly completed work with colleagues without waiting out the delay imposed by the peer review and editorial process.

Copyright © 2018 Sterling.

Preprints began as letters exchanged between scientists, and in their modern form first became popular in the 1980s, specifically in theoretical and high-energy physics. Researchers shared papers via postal mail, then e-mail, and then via the website arXiv, which was founded in 1991 (Luther, 2017). Over time this became the standard mode of scholarly communication in that subfield; a 2009 study found that over 95% of the articles in the five most prestigious journals in the field had previously been posted in draft form on the site (Gentil-Beccot, Mele, & Brooks, 2009). Preprints have been popular in mathematics and economics for roughly the same length of time. Currently, according to Google Scholar, 4 of the most cited ‘journals’ in physics are categories (subdisciplinary sections) of arXiv; the top journal in economics is the NBER Working Papers series, also a preprint archive.

In the life sciences, researchers did not release their articles as preprints in great numbers until the past 3-4 years, despite efforts and forums such as ‘Nature Precedings’, which operated from 2007 to 2012. They were held back by worries about being ‘scooped’ or plagiarised, and by a lack of clarity from journals in the field on whether they would even accept an article previously released as a preprint (Callaway, 2012; Kaiser, 2017b; Norman, 2017). However, journals have recently liberalised their policies, new preprint sites have opened, and, importantly, funding agencies including the National Institutes of Health (NIH) in the United States have announced they will consider preprints in their grant reviews (Kaiser, 2017a).1 This has led to an eight-fold increase in the number of preprints on life-science-specific forums since 2013, as shown in Figure 1, with at least 1600 papers per month currently being posted. The majority of this growth has been on bioRxiv, which was founded in 2013 and received financial backing from the Chan Zuckerberg Initiative in 2017. However, preprints still represent a relatively small fraction of all life sciences papers (one estimate from 2017 was 1%) and are still far less popular than preprints in physics and math (arXiv, n.d.; Molteni, 2017).

1 One medical researcher tweeted his surprise at how quickly journals’ acceptance of preprints has changed in the past 3 years (Gejman, 2018).

Moving through the Grey: Publishing in Action The Publishing Business: Transformations and Opportunities (ISI6314 – Winter 2018) The Rising Popularity of Preprints 50

It is important to note that preprint forums are not a complete free-for-all; many disciplinary preprint archives exercise some editorial oversight of submissions. For example, arXiv moderators perform a basic review of submissions for being of “plausible interest” to the academic audience before posting them (Kuperberg, n.d.). On a continuum of oversight, preprints lie between a scholarly blog post (all control and responsibility lies with the author) and a traditional journal article (much of the control and responsibility lies with the journal).

Figure 1 - The number of preprints posted on prominent websites for life sciences preprints, as of January 2018. Adapted from "Monthly Stats," by J. Anaya, 2018, PrePubMed. Retrieved February 16 2018, from http://www.prepubmed.org. Copyright by Jordan Anaya; data and code are made available under the MIT License.

Motivations for scholars

Academic scholars have several typical motivations in the act of publishing, which can work at cross-purposes. Hangel & Schmidt-Pfister (2017) identified the following four major categories based on their interviews: sharing and disseminating their work to benefit the scholarly community; gaining recognition from peers; obtaining funding; and the sheer enjoyment of writing. Two specific reasons relevant for preprints are establishing a record that you have originated an idea (staking a precedence claim) and seeking feedback to improve your research.


A dominant factor cited by many scholars who use preprints is the increase in author control and flexibility it offers, which allows them to improve dissemination while retaining the validation from traditional publishing. A paper shared earlier in the process can receive feedback, generate buzz, and lead to invitations from peer-reviewed journals. An example is a preprint on ribosomes posted to bioRxiv in 2014 by Nikolai Slavov, which led to active discussion and a tenure-track job offer from Northeastern University before the final version of the paper even appeared (Kaiser, 2017b). On a more altruistic level, a paper shared earlier can benefit society; a just-published paper shows that preprints published during the Ebola and Zika virus outbreaks appeared more than 100 days before the full peer-reviewed version, even with accelerated journal review (Johansson, Reich, Meyers, & Lipsitch, 2018).

Woven through these arguments are the weaknesses of the traditional journal peer review model, especially for biomedical sciences. Researchers argue that (1) the review timeframe is excessive and inhibits research dissemination (one biomedical researcher discussed how a four-year publication lag for one paper helped increase his support for preprints (Dessimoz, 2016)); (2) that it does a poor job of weeding out bad science (Eisen, 2011; Smith, 2006); and (3) that it inhibits the publication of negative or “contrarian” results (Kaiser, 2017b). Norman (2017) points out that it was not that long ago that not all research in journals such as Nature was peer reviewed.

Two improvements to peer review that are often advocated for along with preprints are:

• Post-publication peer review (PPPR). This term is used to refer both to an organized system of peer review combined with the posting of a preprint, and to informal, voluntary comments and discussion taking place about the “published” version of record (either on the journal website, on websites set up for this purpose such as PubMed Commons, or on social media).
• Open peer review, a loosely-defined term in which attributed peer reviews are shared in some way with the author or the public, either during or after the publishing process. The British Medical Journal, for example, has used open peer review since 1999. Post-publication peer review is usually but not always open. A systematic review of the term generated 122 distinct definitions of ‘open peer review’, indicating that there is a lack of clarity around this term (Ross-Hellauer, 2017).

ASAPbio, a non-profit scientist collective founded in 2015 to promote preprints, held a symposium in February 2018 to discuss improvements to peer review (ASAPBio, 2018).

The decline of importance, originality and novelty as acceptance criteria used by science journals is also related to this movement. While at first glance this may seem unrelated, the goal of reducing the role of the gatekeeper is the same. PLoS, the Public Library of Science, explains its decision to publish all submitted papers which are technically high quality, regardless of their perceived importance, thusly: “Judgments about the importance of any particular paper are then made after publication by the readership, who are the most qualified to determine what is of interest to them” (Norman, 2017; PLoS, n.d.).

Effects and Concerns

For years, concerns about researchers stealing or copying ideas, or ‘scooping’ a colleague by rushing to submit their similar research to a peer-reviewed journal, held back preprints in the life sciences (Callaway, 2012). However, in an about-face, scholars seem to be adopting preprints as a method of ‘claim-staking’ or creating a record of precedence for their work, which has a rich tradition going back to the Philosophical Transactions of the Royal Society of London in the 1600s (Guédon, 2010). This has created its own unintended consequences, with researchers being pressured even more to publish quickly, exacerbating personal stress and increasing the odds of premature results being rushed onto the internet (McGlynn, 2017). In 2017 two preprints on bioRxiv were found to have been posted without a methodology section. One was updated immediately after the authors were notified (albeit 4 months after publishing), but the second group did not update theirs until they were contacted two weeks later by the media; they stated they were not aware of standard practice in the field (Kwon, 2017).

This latter occurrence is especially important in the health sciences and medicine, as the danger of exposing unvalidated work or “pseudoscience” (Thompson, 2017) to readers who are not equipped to evaluate it themselves is higher than in most disciplines (Kaiser, 2017b).


There remains a very real risk that early results could be used and later found wanting, although this risk is not removed by the peer review process. Teixeira da Silva (2017) draws an analogy from preprints to predatory journals, in that both create an ethical hazard by providing unvetted material. bioRxiv consequently does not accept most clinical research, and the upcoming launch of a preprint server at Yale University for clinical research has met with some negative reaction (Enserink, 2017; Kaiser, 2017b). The case of the Zika and Ebola virus preprints above is an example where the ethical risk of publishing erroneous material must be weighed against the potential benefit of publishing early, a balancing act that is now left mostly up to the researchers themselves.

For early-career researchers, being able to cite preprints in grant applications will help bolster their CVs in the early years when they may only have one or two published papers. However, it does not appear that tenure review boards are taking preprints into consideration, so this help is only limited.

Finally, a result of the dynamic, unsettled status of these trends, and the disappearance of print as an end-stage for many ‘traditional’ journal articles, is confusion about what is a preprint versus a ‘journal article’, and what is a journal versus a preprint platform. The Federation of American Societies for Experimental Biology released an open letter when the NIH started to review preprints, highlighting the lack of uniformity among preprint services and the difficulty of distinguishing a preprint from a peer-reviewed product (Freeze, 2016). This confusion is only heightened for interdisciplinary research, as Neylon, Pattinson, Bilder, & Lin (2017) point out in a detailed argument for a conceptual model of communication which distinguishes between a preprint’s state and its standing in the community. Confusion about community norms was also the explanation offered by the authors of one of the preprints posted without a methodology section, discussed above (Kwon, 2017).

Preprints, Self-Publishing, and the Role of Communities

Viewing preprints through the lens of non-academic self-publishing, we see many parallels, from the desire to share work quickly and flexibly with an audience, to the potentially beneficial effect of feedback from that audience on future revisions; the term ‘self-publishing’ is not generally used in the preprint world, however (possibly because papers in the life sciences are almost always the work of more than one person, or because there has been no suggestion of authors charging money for their articles).

Just as the community of readers plays an important role in self-publishing, informal scholarly communities play an important role in a world in which preprints are part of scholarly discourse, since the preprint reduces or eliminates the role of the journal as a gatekeeper and quality controller in the field. The word ‘community’ is used frequently in the discussion of preprints, and is important to understanding their drawbacks and effects. There is no one scholarly community; rather, there is a network of galaxies connected by filaments, as shown dramatically in citation analysis visualizations of academia such as the one below from Boyack, Klavans, & Börner (2005). Even medicine contains many distinct communities. This pattern helps explain why community norms vary from subfield to subfield, and why clearly explaining these norms to outsiders is so important.

Figure 2 – Part of a cluster network visualization of science and social science, based on citations. Each dot represents a journal. From Boyack, Klavans, & Börner (2005). Copyright 2005 by Springer-Verlag/Akadémiai Kiadó.

Ideally, academic communities provide peer review and feedback, amplify good work, call out errors and bad actors, and recommend resources to each other. This latter role is important since it is harder to browse and search through preprints than academic articles, as there is not yet a popular or robust central search engine for preprints; in lieu of this, social networks such as ResearchGate, Academia.edu and Twitter are a prime method of sharing. As one genetics researcher says, “it can be difficult to find a specific [preprint] unless you know where to look for it. But using social media makes a big difference…. By using social media, communities of shared interest start to share their information on what they’ve found interesting” (Baker, 2017).

Jason Priem, a prominent member of the and altmetrics community, sees preprints as part of the broader move to a new scholarly communications system: “The Web opens the workshop windows to disseminate scholarship as it happens, erasing the artificial distinction between process and product … The editors and reviewers employed as proxy community assessors will be replaced by the aggregated, collective judgements of communities themselves. The information-overload problem supplies its own solution” (Priem, 2013). In this world, gatekeepers are replaced with curators and commentators. Biologist and PLoS co-founder Michael Eisen also advocates for a community rating model, and laments that reviews of pen refills on Amazon are more helpful for users than article peer reviews (Harris, 2018).2

The optimism of people like Priem and Eisen about the self-organizing power of online communities is of a piece with the optimism of Silicon Valley, but in 2018 it seems naïve. Scholarly communities are not in fact altruistic meritocracies which inherently surface and validate the best work, any more than the self-publishing world is, or online communities like Twitter or Reddit are. Without intentional work by preprint forums, they can exhibit some of the same unintended dynamics as online communities, which allow unhealthy behaviours to disproportionately impact some members of the community.

To give one example, leaving peer review up to ‘the community’ disadvantages early career and unconnected researchers. In the traditional system, peer reviewers are incentivized by the recognized role of peer reviewer and relationships with journals and editorial staff (even if there is no compensation or valuation in the tenure process). But in the new system, this incentive is lessened, and those in the author’s existing social network are the likeliest to contribute. For those without a large network, “[w]ill their papers ever be read if even more papers are added to the already bloated scientific literature?” (Irizarry, 2012). His colleague later agreed, writing that for all the drawbacks of ‘glamour mags’ like Nature and Science, “it is possible for someone to get a paper in no matter who they are” (Leek, 2016).

2 This prompted a satirical April Fool’s Day article on the Scholarly Kitchen website announcing “Amazon Peer Review”: https://scholarlykitchen.sspnet.org/2018/04/01/amazon-peer-review-coming-preprint-near/

The future of preprints and traditional papers

The name preprint itself implies that it is an initial version of an article to be published later in “print”. Many advocates of preprints are careful to claim that they are in no way meant as a replacement for traditional peer review. The NIH, for example, referred to them as “interim research products” in its statement approving their consideration (Kaiser, 2017a). One of the directors of the ScienceOpen research network, in his personal blog, emphasized repeatedly that preprints complement, not replace, formal peer review (Tennant, 2017).

However, this is not always the case – as discussed above, many researchers are advocating for preprints because of their unhappiness with peer review, and some would prefer that peer review be entirely post-publication. One geneticist, Graham Coop, generated a minor stir in 2017 by tweeting that he had decided to not submit his preprint to a journal at all, though he admitted that he would not do this if he had a co-author without tenure (Singh Chawla, 2017).

In high-energy physics, “[p]eer-reviewed journals have lost their role as a means of scientific discourse, which has effectively moved to the discipline repository” according to a summary of scholarly communications in the field (Gentil-Beccot et al., 2009). In general, most articles on arXiv are still submitted to journals and published, though the percentage not appearing in journals at a later stage had risen to 35% by 2008 (Gentil-Beccot et al., 2009).3

A more likely scenario, though, is a new hybrid between an arXiv-style bare-bones preprint server and a traditional journal model, tailored to the life sciences. The F1000 site, which also runs publication platforms for the Wellcome Trust and Gates Foundation, already checks “whether it is scientific work from a scientist, whether it is plagiarised, whether it meets ethical requirements, whether it is readable and meets community standards, and whether we have source data (which we require) and it adheres to FAIR [data management] principles” (Lawrence & Tracz, 2018); this process takes 2-3 weeks and rejects over 20% of submissions. Meanwhile Michael Eisen has recently announced an experimental post-publication assessment and curation tool called ‘APPRAISE’ in which a team of reviewers will add reviews and comments to bioRxiv preprints; he describes it as “an editorial board without a journal” (Eisen, 2018). As described, this is not an ‘Amazon-style’ free-for-all, which is a good thing. However, since the choice of what to comment on is left to each member of the team, this could still have the same unintended community effect of missing important work from unconnected researchers.

3 The authors did not discuss this rise, which seems a notable omission.

Conclusions

The discussion around preprints exposes the duelling principles of dissemination and community validation which are at the heart of the broader Open Science movement. Preprints offer much potential to open up scientific discourse. However, they are not an unalloyed good, and in the rush to adopt them, each research community needs to establish norms and definitions in order to clarify a document’s status not just with fellow researchers but with outsiders as well. They also need to consider whether their desired community participation model creates enough incentives for adequate and healthy participation from community members. One thing is almost certain: with all of these new experiments and existing models in flux, it will be a more confusing time for anyone reading a scientific paper.

References

arXiv. (n.d.). arXiv monthly submission rates. arXiv.org.
ASAPBio. (2018). Transparency, Recognition, and Innovation in Peer Review in the Life Sciences (February 2018). Retrieved from ASAPBio.org.
Baker, I. (2017, February 13). Preprints – what’s in it for me? Retrieved from InSight.mrc.ac.uk.
Boyack, K. W., Klavans, R., & Börner, K. (2005). Mapping the backbone of science. Scientometrics, 64(3), 351–374.
Callaway, E. (2012). Geneticists eye the potential of arXiv. Nature News, 488(7409), 19.
Dessimoz, C. (2016, March 31). Thoughts on pre- vs. post-publication peer-review – Open Reading Frame - Dessimoz Lab. Retrieved from Dessimoz Lab.
Eisen, M. (2011, October 28). Peer review is f***ed up – let’s fix it. Retrieved from MichaelEisen.org.
Eisen, M. (2018, January 24). APPRAISE (A Post-Publication Review and Assessment In Science Experiment). Retrieved from ASAPBio.org.
Enserink, M. (2017, September 12). Plan for new medical preprint server receives a mixed response. Retrieved from ScienceMag.org.
Freeze, H. (2016, December 6). FASEB Highlights Concerns Regarding Use of Preprints in Grant Applications. Retrieved from Federation of American Societies for Experimental Biology.
Gejman, R. S. (2018, February 13). 3 years ago: I was advised to refrain from posting a manuscript to @biorxivpreprint b/c risk of scooping & not all journals accepting preprinted manuscripts. Today: I was advised to post to and wait a month to get comments before submitting to a journal. [Tweet]. Retrieved from https://twitter.com/rongejman/status/963565796281679872
Gentil-Beccot, A., Mele, S., & Brooks, T. (2009). Citing and Reading Behaviours in High-Energy Physics. How a Community Stopped Worrying about Journals and Learned to Love Repositories. arXiv:0906.5418 [Cs]. Retrieved from http://arxiv.org/abs/0906.5418
Guédon, J.-C. (2010). In Oldenburg’s long shadow: librarians, research scientists, publishers, and the control of scientific publishing. Washington, D.C: Association of Research Libraries. Retrieved from http://www.arl.org/storage/documents/publications/in-oldenburgs-long-shadow.pdf
Hangel, N., & Schmidt-Pfister, D. (2017). Why do you publish? On the tensions between generating scientific knowledge and publication pressure. Aslib Journal of Information Management, 69(5), 529–544. https://doi.org/10.1108/AJIM-01-2017-0019
Harris, R. (2018, February 24). Scientists Aim To Pull Peer Review Out Of The 17th Century. Retrieved April 8, 2018, from https://www.npr.org/sections/health-shots/2018/02/24/586184355/scientists-aim-to-pull-peer-review-out-of-the-17th-century
Irizarry, R. (2012, October 8). Why we should continue publishing peer-reviewed papers. Retrieved February 16, 2018, from https://simplystatistics.org/2012/10/08/why-we-should-continue-publishing-peer-reviewed-papers/
Johansson, M. A., Reich, N. G., Meyers, L. A., & Lipsitch, M. (2018). Preprints: An underutilized mechanism to accelerate outbreak science. PLOS Medicine, 15(4), e1002549. https://doi.org/10.1371/journal.pmed.1002549
Kaiser, J. (2017a, March 24). NIH enables investigators to include draft preprints in grant proposals. Retrieved January 24, 2018, from http://www.sciencemag.org/news/2017/03/nih-enables-investigators-include-draft-preprints-grant-proposals
Kaiser, J. (2017b, September 21). Are preprints the future of biology? A survival guide for scientists. ScienceMag.org.
Kuperberg, G. (n.d.). (In)frequently asked questions. Retrieved from Front for the arXiv.
Kwon, D. (2017, August 1). Do Preprints Require More Rigorous Screening? The Scientist.
Lawrence, R., & Tracz, V. (2018, January 30). F1000: our experiences with preprints followed by formal post-publication peer review. Retrieved from ASAPbio.com.
Leek, J. (2016, February 26). Preprints are great, but post publication peer review isn’t ready for prime time. Retrieved from SimplyStatistics.
Luther, J. (2017, April 18). The Stars Are Aligning for Preprints. Retrieved from ScholarlyKitchen.sspnet.org.
McGlynn, T. (2017, July 24). What’s up with preprints? Retrieved from SmallPondScience.com.
Molteni, M. (2017, July 8). Biology’s Roiling Debate Over Publishing Research Early. WIRED.
Neylon, C., Pattinson, D., Bilder, G., & Lin, J. (2017). On the origin of nonequivalent states: How we can talk about preprints. F1000Research, 6, 608.
Norman, F. (2017). The history of peer review, and looking forward to preprints in biomedicine. In What might peer review look like in 2030? London: SpotOn, BioMed Central, Digital Science.
PLoS (Public Library of Science). (n.d.). PLOS ONE: Journal Information. Retrieved from Journals.Plos.org.
Priem, J. (2013, March 27). Scholarship: Beyond the paper [Comments and Opinion]. Retrieved from Nature.com.
Ross-Hellauer, T. (2017). What is open peer review? A systematic review. F1000Research, 6, 588.
Singh Chawla, D. (2017). When a preprint becomes the final paper. Nature News.
Smith, R. (2006). Peer review: a flawed process at the heart of science and journals. Journal of the Royal Society of Medicine, 99(4), 178–182.
Teixeira da Silva, J. A. (2017). Preprints: ethical hazard or academic liberation? KOME, 5(2), 73–80.
Tennant, J. (a.k.a. protohedgehog). (2017, May 14). Should we cite preprints? Retrieved from FossilandShit.com.
Thompson, B. (2017, June 4). The danger for bioRxiv is being hijacked by pseudoscience chasing a veneer of respectability. [Tweet]. Retrieved from Twitter.com.
