ADVANCES IN INFORMATION SCIENCE

Bias in Peer Review

Carole J. Lee Department of Philosophy, University of Washington, 361 Savery Hall, Seattle, WA 98195. E-mail: [email protected]

Cassidy R. Sugimoto, Guo Zhang, and Blaise Cronin School of Library and Information Science, Indiana University, 1320 East Tenth Street, Bloomington, IN 47405. E-mail: {sugimoto, guozhang, bcronin}@indiana.edu

Research on bias in peer review examines scholarly communication and funding processes to assess the epistemic and social legitimacy of the mechanisms by which knowledge communities vet and self-regulate their work. Despite vocal concerns, a closer look at the empirical and methodological limitations of research on bias raises questions about the existence and extent of many hypothesized forms of bias. In addition, the notion of bias is predicated on an implicit ideal that, once articulated, raises questions about the normative implications of research on bias in peer review. This review provides a brief description of the function, history, and scope of peer review; articulates and critiques the conception of bias unifying research on bias in peer review; characterizes and examines the empirical, methodological, and normative claims of bias in peer review research; and assesses possible alternatives to the status quo. We close by identifying ways to expand conceptions and studies of bias to contend with the complexity of social interactions among actors involved directly and indirectly in peer review.

Function and Purpose of Peer Review

Peer review is an established component of professional practice, the academic reward system, and the scholarly publication process. The fundamental principle is straightforward: experts in a given domain appraise the professional performance, creativity, or quality of scientific work produced by others in their field or area of competence. In most cases, reviewer identity is hidden (single-blind review) to encourage frank commentary by protecting against possible reprisals by authors; and, in some cases, author identities will be masked from reviewers (double-blind review) to protect against forms of social bias. The structure of peer review is designed to encourage peer impartiality: typically, peer review involves the use of a “third party” (Smith, 2006, p. 178), someone who is neither affiliated directly with the reviewing entity (university, research council, academic journal, etc.) nor too closely associated with the person, unit, or institution being reviewed; and peers submit their reviews without, initially at least, knowledge of other reviewers’ comments and recommendations. In some cases, however, peers will be known to one another, as with in vivo review, and may even be able to confer and compare their evaluations (e.g., members of a National Science Foundation [NSF] review panel).

Peer review, broadly construed, covers a wide spectrum of activities, including but not limited to observation of peers’ clinical practice; assessment of colleagues’ classroom teaching abilities; evaluation by experts of research grant and fellowship applications submitted to federal and other funding agencies; review by both editors and external referees of articles submitted to scholarly journals; rating of papers and posters submitted to conferences by program committee chairs and members; evaluation of book proposals submitted to university and commercial presses by in-house editors and external readers; and assessments of the quality, applicability, and interpretability of data sets (Lawrence, Jones, Matthews, Pepler, & Callaghan, 2011; Parsons, Duerr, & Minster, 2010). To this list one might add promotion and tenure decisions in higher education, for which an individual’s institutional peers and select outside experts determine that person’s suitability for tenure and/or promotion in rank, and also the procedures whereby candidates are admitted to national academies, elected fellows of learned societies, or awarded honors such as the Fields Medal or Nobel Prize.

Received May 10, 2012; revised July 12, 2012; accepted July 13, 2012

© 2012 ASIS&T • Published online 6 December 2012 in Wiley Online Library (wileyonlinelibrary.com). DOI: 10.1002/asi.22784

JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE AND TECHNOLOGY, 64(1):2–17, 2013

In many ideal depictions, peer review processes are understood as providing “a system of institutionalized vigilance” (Merton, 1973, p. 339) in the self-regulation of knowledge communities. Peer expertise is coordinated to vet the quality and feasibility of submitted work. Authors, in anticipation of the peer evaluation of their work, aim to conform to shared standards of excellence out of expediency and in accordance with an internalized ethos (Merton, 1973). The norms and values to which peers hold each other are conceived as being universally and consistently applied to all members, where these norms and values pertain to the content of authors’ evidence and arguments independently of their social caste or positional authority (Merton, 1973). When these norms and values are impartially interpreted and applied, peer evaluations are understood as being fair.

It is the impartial interpretation and application of shared norms and standards that makes for a fair process, which—psychologically (Tyler, 2006) and epistemologically—legitimizes peer review outcomes, content, and institutions. This is why critics’ charge of bias in peer review is so troubling: threats to the impartiality of review appear to threaten peer review’s psychological and epistemic legitimacy. Although there are a few exceptions (Lamont, 2009; Lee, in press; Mallard, Lamont, & Guetzkow, 2009), variations in the interpretation and application of epistemic norms and values are almost always conceived of as problematic. Failures in impartiality lead to outcomes that result from the “luck of the reviewer draw” (Cole, Cole, & Simon, 1981, p. 885), fail to uphold the meritocratic image of knowledge communities (Lee & Schunn, 2011; Merton, 1973), protect orthodox theories and approaches (Travis & Collins, 1991), insulate “old boy” networks (Gillespie, Chubin, & Kurzon, 1985; McCullough, 1989), encourage authors to “chase” disputable standards (Ioannidis, 2005, p. 696), and mask bad faith efforts by reviewers who also serve as competitors (Campanario & Acedo, 2005). Perceived partiality leads to dissatisfaction among those whose professional success or failure is determined by review outcomes (Gillespie, Chubin, & Kurzon, 1985; McCullough, 1989; Ware & Monkman, 2008).

The charge of bias also threatens the social legitimacy of peer review. Peer review signals to the body politic that the world of science and scholarship takes seriously its social responsibilities as a self-regulating, normatively driven community. The enormity and complexity of contemporary science and its ramified institutional arrangements are such that peer review has, in the words of Biagioli (2002, p. 34), been “elevated to a ‘principle’ — a unifying principle for a remarkably fragmented field.” As a consequence, the system is held to almost impossibly strict standards and routinely exposed to intense scrutiny by insiders and outsiders alike, including elected politicians (Gustafson, 1975; Walsh, 1975).

Does the “mundane reality of peer review” depart radically from its “mythology” (Biagioli, 2002, p. 13)? Given that human fallibility and venality are inescapable facts of the human condition, it seems unreasonable to imagine that “the flywheel of science” (Chubin & Hackett, 1990, p. 5) could function flawlessly. Throughout the literature, charges of systematic bias—not just isolated incidents—are repeatedly aired. Such concerns need to be addressed in an open and thoroughgoing fashion to ensure that trust in the integrity of peer review is maintained. In this spirit, our review seeks to articulate notions of impartiality and bias that are faithful to concerns raised by quantitative research on peer review; characterize major genres of research on bias by their methods, assumptions, and concerns; report their results; and indicate how alternative forms of peer review might ameliorate various forms of bias.

Our discussion will draw on literature on the origins, purpose, and mechanics of scientific peer review across multiple genres including journal articles, grant proposals, and fellowship applications (e.g., Bornmann & Daniel, 2007; Bornmann, Mutz, & Daniel, 2007, 2008, 2009; Campanario, 1998a, 1998b; Chubin & Hackett, 1990; Daniel, Mittag, & Bornmann, 2007; Hames, 2007; Holbrook & Frodeman, 2011; Kronick, 1990; Marsh, Bornmann, Mutz, Daniel, & O’Mara, 2009; Shatz, 2004; Spier, 2002); the growing body of empirical and meta-analytic research on the reliability and predictive validity of the peer review process (e.g., Bornmann, 2011a, 2011b; Peters & Ceci, 1982a), including the various kinds of presumptive bias (e.g., institutional, gender, cognitive) associated with different types of review systems (e.g., Alam et al., 2011; Blank, 1991; Budden et al., 2008); survey research on scientists’ attitudes towards peer review (e.g., Sense About Science, 2009; Ware & Monkman, 2008); and debates surrounding the relative merits of open peer review in light of new experimental, web-based systems (e.g., Delamothe & Smith, 2002).

History of Peer Review

The origins of scholarly peer review are commonly associated with the formation of national academies in 17th-century Europe, although some have found foreshadowing of the practice. Biagioli (2002, p. 31) has described in detail “the slow differentiation of peer review from book censorship” and the role state licensing and censorship systems played in 16th-century Europe. A few years after the Royal Society of London (1662) and the Académie Royale des Sciences of Paris (1699) were established, both bodies created in-house journals, the Philosophical Transactions and Journal des Sçavans, respectively. These prototypical scientific journals gradually replaced the exchange of experimental reports and findings via correspondence, formalizing a process that up until then had been essentially personal, informal, and nonassured in nature. In London, Henry Oldenburg was appointed Secretary to the Royal Society and became the journal’s first editor, gathering, reporting, and editing the work of others (Manten, 1980). From these early efforts gradually emerged the process of independent review of scientific reports by acknowledged experts that persists to this day. Indeed, as early as 1731 the Royal Society of Edinburgh had adopted a review process in

which materials sent to it for publication were vetted and evaluated by knowledgeable members (Spier, 2002, p. 357). This was the era of the amateur scientist and armchair philosopher who “produced reliable knowledge in and through a moral economy patterned upon the conventions of gentlemanly conversation” (Shapin, 1995, p. 290). But professional science is not conducted by “logically omniscient lone knowers” (Kitcher, 1993, p. 59), and mechanisms thus evolved to formalize the ways in which the trustworthiness of scientific findings could be verified and promulgated to a wider audience. Over time, three principal forms of journal peer review evolved: single-blind, double-blind, and open. Of these, single-blind (the author’s identity is known to the reviewer while the reviewer’s is concealed from the author) is the most widely used, not least because it is a less onerous and less expensive system to operate than double-blind, for which considerable (often unsuccessful) effort is required in order to remove all traces of the author’s identity from all parts of the manuscript/proposal under review (e.g., Blank, 1991; Nature, 2008).

Commitment to and Dependence on Peer Review

Today there are literally thousands (estimates vary considerably) of peer-reviewed journals in existence, although the stringency and consistency with which peer review procedures are applied across this population are variable (Mabe, 2003). In any given year these journals publish, at a conservative estimate, a million articles (Björk, Roos, & Lauri, 2009). Each one of those articles will, in all likelihood, have been read by at least one, often two, and sometimes three or more reviewers, selected by journal editors, and most of those submissions will have undergone multiple rounds of review prior to eventual publication in a journal of record. In addition to the million or so published articles there will be at any given moment a very sizeable pool of rejected articles moving through the system, as many (but not all) leading journals have high rejection rates (Schultz, 2010a, 2010b). These rejected papers will also have consumed a great deal of reviewer time (Hamermesh, 1994; Vines, Rieseberg, & Smith, 2010). Moreover, at least some of those rejected papers will be resubmitted to a different journal (possibly more than one) in an effort to see the light of day (Cronin & McKenzie, 1992). As Kravitz and Baker (2011, para. 1) put it: “each submission of a rejected manuscript requires the entire machinery of peer review to creak to life anew,” creating, in effect, “a journal loop bounded only by the number of journals available and the dignity of the Authors.”

But that is only part of the story. Research councils, foundations, universities, and other grant-awarding bodies also need to call upon the services of peer experts to review the millions of research proposals, intra- and extramural, seeking funding in any given year. In the U.S. alone, the National Institutes of Health (NIH) and the NSF, the two principal funding agencies, together receive nearly 90,000 research proposals each year and fund less than one quarter of these (National Institutes of Health, 2011; National Science Foundation, 2011). Many of those who review NSF and NIH research proposals are probably at the very least also regular reviewers of papers for a range of academic journals and conferences and also occasional reviewers of promotion and tenure dossiers, tens of thousands of which require careful scrutiny by multiple reviewers every academic year. It is not hard to grasp the enormity of the burden placed upon members of the scientific community, both junior and senior (Vines, 2010), by a system that, with very few exceptions (see Engers & Gans, 1998), operates on a voluntary, unremunerated basis (Kravitz & Baker, 2011).

With advances in technology, scientific research has become highly sophisticated, collaborative, distributed, and capital intensive in recent years: as a result many manuscripts are now accompanied by large amounts of supplementary materials that require careful scrutiny, placing an even greater burden on conscientious reviewers. As the commercial and career stakes rise, in what Ziman (2000, p. 211) has termed the age of “post-academic science,” so does the burden placed on the shoulders of those individuals refereeing for the world’s leading scientific journals.

The competition for both pecuniary resources and attention in the marketplace of ideas has intensified to such an extent that reviewers need to be ever alert to the possibility of fraud (e.g., data fabrication, data trimming), credit misallocation (e.g., unearned/gift authorship), and potential conflicts of interest (e.g., undeclared commercial or consulting ties) in the publications they evaluate. Although unethical practices have been documented repeatedly in the medical and biomedical fields (e.g., Biagioli, 1998; Cronin, 2002; Sismondo, 2009), there is also suggestive evidence that chicanery and corner-cutting may be on the rise in some of the social sciences (Shea, 2011).

More than ever, we need to rely on peer review in the efficient and effective evaluation of knowledge claims. Research on bias in peer review seeks to identify ways in which it fails to do so. However, as Bornmann (2008) notes, the focal concept of bias has not been defined unambiguously in the literature, perhaps because there is presumed to exist a shared, albeit tacit, understanding of this term. In what follows, we will articulate a general notion of bias, defined as the violation of impartiality in peer evaluation, that draws the empirical literature’s normative concerns together. We will then identify different categories of bias research by their hypothesized source of partiality and (in some cases) by the methods and assumptions adopted to study that type of bias.

Bias in the Peer Review Process

In the context of quantitative research on bias in peer review, reviewer bias is understood as the violation of impartiality in the evaluation of a submission. We define impartiality in peer evaluations as the ability for any reviewer to interpret and apply evaluative criteria in the same way in the assessment of a submission. That is, impartial reviewers

arrive at identical evaluations of a submission in relation to evaluative criteria because they see the relationship of the criteria to the submission in the same ways. And, so long as the evaluative criteria have to do with the cognitive content of the submission and its relationship to the literature, impartiality ensures evaluations are independent of the author’s and reviewer’s social identities and independent of the reviewer’s theoretical biases and tolerance for risk.

There are many reasons to challenge this ideal notion of impartiality in peer review. Lamont (2009) and Mallard et al. (2009) argue that evaluative criteria should not be subject to unifying, transdisciplinary interpretations (Lamont, 2009; Mallard et al., 2009). Lee (in press) argues that impartiality in peer evaluations may not be possible since definitions of evaluative criteria underdetermine their interpretation and application in both multidisciplinary and disciplinary contexts. Lamont (2009, p. 19) argues that the cognitive value of submissions cannot and should not be assessed in ways that are dissociated from the reviewer’s “sense of self and relative positioning” with respect to the submission’s content. Likewise, it is not clear that a reviewer’s theoretical or methodological orientations should be looked upon as normatively problematic. However, we articulate this notion of impartiality in an effort to identify an underlying ideal that aligns different genres of quantitative research on bias in peer review.

These genres of quantitative research can be categorized by differences in their conception of the primary source of bias: (a) error in assessing a submission’s “true quality,” (b) social characteristics of the author, (c) social characteristics of the reviewer, and/or (d) content of the submission.1 In what follows we will characterize these genres of work (and their subgenres), identify assumptions and methods adopted to undertake this quantitative research, and provide a selective review of their findings. In all these genres and subgenres, bias is deemed problematic qua partiality. However, when critics implicitly or explicitly express additional grounds for normative complaint, we identify them throughout.

1. From a conceptual point of view, social characteristics of editors may also bias peer review. However, we do not expand on this type of bias due to the paucity of empirical work on the topic. Whatever studies are available are mentioned in our discussion of bias as a function of “social characteristics of the reviewer.”

Bias as Deviation From “True Quality” Value

Some quantitative research conceives of bias as a kind of error in identifying “the true quality of the object being rated” (Blackburn & Hakel, 2006, p. 378). Errors in identifying the true quality of submissions violate the ideal of impartiality in peer review by demonstrating that reviewers—in succumbing to error—can fail to interpret and apply evaluative criteria in consistent ways. The assumption that there exists such a value along a single dimension is commonplace within psychometric research, which measures single-dimension constructs such as intelligence and creativity (Hargens & Herting, 1990; Rust & Golombok, 2009). Improvement in peer review practices, from this perspective, involves improving the reliability with which reviewers identify the true quality value of submissions.

Bias as deviation from proxy measures for true quality. In one subgenre of this research, studies seek to assess the construct validity of peer review as a test/process by comparing its outcomes to proxy measures for manuscript quality. Proxy measures include reviewers’ pooled mean rating (Goodman, Berlin, Fletcher, & Fletcher, 1994), ratings by super experts (Gardner & Bond, 1990), editor/panel decisions (Marsh, Jayasinghe, & Bond, 2008; van Rooyen, Godlee, Evans, Smith, & Black, 2010), ratings by readers of a journal (Justice, Cho, Winker, & Berlin, 1998), citation counts (Hagstrom, 1971; Campanario, 1995; Daniel, 2005; Bornmann & Daniel, 2009; Gottfredson, 1978), and subsequent publication (Bornmann & Daniel, 2008b; Bornmann, Mutz, Marx, Schier, & Daniel, 2011). Reliance on proxies for quality is especially common in research on peer review in medicine, which seeks to carry out randomized controlled trials to identify practices that improve peer review processes and outcomes (Godlee, Gale, & Martyn, 1998; Justice et al., 1998; van Rooyen, Godlee, Evans, Smith, et al., 2010).

For example, studies have investigated the citation patterns of “rejected-then-published-elsewhere” articles. A high subsequent citation rate is used to indicate error in the original decision to reject a manuscript (Bornmann & Daniel, 2008b, 2009; Bornmann, Mutz, Marx, et al., 2011). Bornmann and Daniel (2009) found, when comparing citation counts, that 15% of accepted papers and 15% of rejected papers (that were subsequently published elsewhere) should not have been accepted/rejected at a top chemistry journal. Bornmann, Mutz, Marx, et al. (2011) found that acceptance by the original journal was a good predictor of later citation success and that rejection was a good predictor of limited citation, thereby validating editorial decisions.

In some studies, subsequent publication of a rejected manuscript in a more prestigious journal is also used as an indication of error in the original publication decision (Bornmann, Mutz, Marx, et al., 2011). However, Cronin and McKenzie (1992) note relatively few instances of “upward migration” and challenge the notion that such cases may reflect error on the part of the original publication decision: upward migration may sometimes result from the manuscript’s finding a better fit—in terms of “focus, scope or style”—with a more prestigious journal (p. 316).

Bias as low inter-rater reliability. From a psychometric perspective, in order for peer review to be a valid test of submission quality, reviewer judgments must be reliable with respect to each other (Hargens & Herting, 1990; Rust & Golombok, 2009, p. 72). Some researchers have suggested

that inter-rater reliability for two reviewers on a single submission should be about 0.8–0.9 (Marsh et al., 2008, p. 162), which is similar to the rate found for intelligence and personality tests (Rust & Golombok, 2009). Unfortunately, agreement between reviewers is very low (e.g., Bornmann & Daniel, 2008a; Ernst & Resch, 1999; Jackson, Srinivasan, Rea, Fletcher, & Kravitz, 2011; Rothwell & Martyn, 2000), with agreement “barely beyond chance” (Kravitz et al., 2010, p. 1) and comparable to rates found for Rorschach inkblot tests (Lee, in press).

Research has demonstrated that inter-reviewer agreement is improved when reviewers evaluate more rather than fewer grant applications, suggesting improvement via learning/training (Jayasinghe, Marsh, & Bond, 2003). Research has also shown improvements in inter-rater reliability with the addition of more reviewers per grant application (Jayasinghe et al., 2003). Such improvements are important to psychometrically oriented researchers since they decrease the chance that review outcomes vary dramatically as a function of which reviewers are chosen (Cole, Cole, & Simon, 1981). However, empirical study suggests that increasing the number of reviewers per journal manuscript does not significantly affect final decisions (Schultz, 2010a).

Inter-rater reliability research focuses on recommendation outcomes without studying other qualities of reviews, such as their length, tone, and presence of references. Without considering the nature and language of the review, it is difficult to assess whether systematic bias is present and what type of bias it may be (epistemic, language, etc.). When “we shift focus away from the numerical representation of a reviewer’s assessment to the content upon which such assessments are grounded, we can identify” (Lee, in press, p. 5) ways in which inter-rater disagreement might reflect normatively appropriate disagreements. Editors and grant program officers may seek reviewers who can evaluate different aspects of a submission according to their own subspecialization and expertise (Bailar, 1991), and we would not expect high inter-rater reliability in cases where quality along these different aspects diverges (Hargens & Herting, 1990). These considerations suggest that “diversity of opinion among referees may be desirable and beneficial” (Chubin & Hackett, 1990, p. 102)—disagreement is thus sometimes normatively desirable and appropriate (Harnad, 1982; Hirschauer, 2009; James, Demaree, & Wolf, 1984; Lee, in press).

Philosophical and qualitative sociological research challenges the psychometric assumption that peer review involves assessments along a single dimension of evaluation. Peer review criteria—such as novelty, soundness, and significance—may be open to different, normatively appropriate interpretations (Lamont, 2009; Lee, in press) and fail to reduce to a single dimension of evaluation. If this is the case, then we would expect normatively appropriate disagreement between reviewers. That normative credibility is conceptually different from high inter-rater reliability is also demonstrated by Bornmann, Mutz, and Daniel (2010), who found that studies with high levels of inter-rater agreement turned out to be less statistically credible than those with low levels of agreement.

Bias as a Function of Author Characteristics

Among Merton’s classical norms of science is universalism, the ideal that knowledge claims be evaluated according to “preestablished impersonal criteria” that assess the excellence or originality of a person’s ideas, rather than on particular facts about their social identity and status (Merton, 1973, p. 269). As expressed by Peters and Ceci (1982b), universalism in the context of peer review requires that an author’s research be “judged on the merit of [his/her] ideas, not on the basis of academic rank, sex, place of work, publication record, and so on” (p. 252). Social bias is the differential evaluation of an author’s submission as a result of her/his perceived membership in a particular social category. Social bias challenges the thesis of impartiality by suggesting that reviewers do not evaluate submissions—their content and relationship to the literature—independently of the author’s (perceived) identity.

Some view this type of bias as malicious in nature. For example, acknowledging the problem of ad hominem bias, Nature’s review policies warn that reviewer anonymity cannot be protected “in the face of a successful legal action to disclose identity in the event of a reviewer having written personally derogatory [review] comments about the authors” (Nature, 2012, para. 41). However, bias that violates the norm of universalism need not be ill-intended or conscious at all (Lee & Schunn, 2011). An individual may sincerely espouse norms of equality and invoke normatively appropriate criteria to justify biased evaluations. However, implicit biases in evaluation—resulting from automatic and subconscious processes—are not usually blocked by the conscious, deliberative processes by which egalitarian beliefs are formed and sustained (Bargh & Williams, 2006; Chaiken & Trope, 1999). For example, hiring studies demonstrate that, despite identical curricula vitae, male applicants are deemed as having qualities superior to female applicants (Steinpreis, Anders, & Ritzke, 1999). Ironically, evaluators given the opportunity to disagree with blatantly sexist statements were more likely to reject women for stereotypically male jobs (Monin & Miller, 2001).

Much (but not all) research in this genre assumes that the quality of work by individuals across different social groups (e.g., prestigious vs. not, male vs. female) is, in the aggregate, roughly comparable. As a result, we should expect the rate with which members of less powerful social groups enjoy successful peer review outcomes to be proportionate to their representation in submission rates. Researchers infer the existence of bias when a difference is discovered and infer the lack of bias when no difference is discovered. Very few studies are able to demonstrate that their submission pools are similar to or representative of the larger population of researchers (for an exception, see RAND, 2005). Some models refine comparisons across groups by controlling for

additional factors that might correlate with submission quality. For example, some studies control for factors such as type of institution (Blank, 1991; Xie & Shauman, 1998), experience (RAND, 2005), and rank (Ley & Hamilton, 2008) since these are acknowledged as affecting the resources and expertise needed to do quality work. Studies that distribute for review submissions that are identical in all respects except for the perceived social category to which the stated author belongs control for quality most adequately (Borsuk et al., 2009; Peters & Ceci, 1982b). However, which author attributes should and do correlate with indicators of manuscript quality are questions that deserve further theorizing and testing.

Prestige bias. As Merton observed, prestige-based bias calls attention to a “class structure” in science, where those rich in prestige disproportionately accumulate limited resources (e.g., grant monies, publication space, awards), which allows them to garner yet more prestige in a process of cumulative advantage (Merton, 1973, p. 443; Price, 1976). The preferential evaluation of contributions by the prestigious versus the nonprestigious has been dubbed “the Matthew effect” (Merton, 1968). Some researchers perceive that prestige bias affects peer review: surveys report that applicants to the NSF and NIH are concerned about “old boy” networks (McCullough, 1989, p. 82; Gillespie et al., 1985, p. 49) and bias against researchers in nonmajor universities (Gillespie et al., 1985, p. 49).

A study of peer-reviewed grant decisions for awarding long-term fellowships to postgraduate researchers in biomedicine discovered that funding rates decreased correlatively with institutional prestige; however, the effect was small and not statistically significant (Bornmann & Daniel, 2006). In a much discussed and highly cited study, Peters and Ceci (1982b) investigated whether “researchers affiliated with prestigious institutions will tend to fare better than colleagues at less prestigious ones” (p. 748). To control for the quality of submissions, they resubmitted published articles by prestigious individuals from prestigious institutions under fictitious names associated with less prestigious institutions. They found that resubmitted manuscripts were rejected 89% of the time (higher than the journal’s 80% rejection rate) on the grounds that the studies contained “serious methodological flaws” (p. 187).

Reviewers do not necessarily use the prestige of an author as direct grounds for their recommendations. For example, Bornmann, Weymuth, and Daniel (2010) investigated the content of reviews of rejected articles to identify which negative comments were the best predictors of future success at subsequent high- and low-ranked journals. They found that reviewers cite relevance and design of research rather than social factors (such as affiliation).

Affiliation bias. Affiliation bias may also be understood as a function of reviewer characteristics, since affiliation is shared between authors and reviewers. Affiliation bias may be a form of prestige bias in cases where reviewers and authors enjoy formal or informal relationships due to shared, prestige-marked characteristics (e.g., institutional affiliation). Wennerås and Wold (1997) discovered that postdoctoral fellowship applicants with personal ties to reviewers were assessed as more competent than those who were not affiliated but equally productive. A replication of this work found a 15% affiliation bonus for both male and female applicants (Sandström & Hällsten, 2008). However, affiliation does not always result in favorable outcomes for authors and applicants: Oswald (2008) found that two journals housed at top economics departments did not favor or even discriminated against authors from the journal’s parent institution.

Nationality bias. Many studies have found that journals favor authors located in the same country as the journal (e.g., Daniel, 1993; Ernst & Kienbacher, 1991; Link, 1998), some highlighting a particularly strong degree of preferential attachment in the U.S. (Ernst & Kienbacher, 1991). Yet other studies suggest that American authors are more critical of their compatriots and more lenient when assessing grant applications of non-American authors (Marsh et al., 2008). These studies use current author address as a proxy for nationality; however, doing so conflates current affiliation or address with country of origin and ethnicity. Others worry that nationality bias may reflect prose quality and not nationality per se (Cronin, 2009). However, it may also be that language and writing style are cited as problems for manuscripts written by non-native speakers even when there is nothing problematic about the prose (Herrera, 1999).

Language bias. The potential for language bias has been examined both in terms of acceptance rates and as a dependent variable when blinding reviews. An examination of reviews in medical research demonstrated a significant difference in the acceptance rates for abstracts written by authors from English- and non-English-speaking countries (Ross et al., 2006). The difference diminished when the editors instituted blind review: “Blinding significantly attenuated the association between language and likelihood of abstract acceptance” (p. 1679). Tregenza (2002) found that acceptance rates at ecology and evolution journals were higher for first authors living in wealthy English-speaking nations versus wealthy non-English-speaking nations. However, another study found that language was not an important criterion for acceptance or rejection, noting no significant difference in acceptance rates for “linguistically criticized” manuscripts compared with those that did not receive such criticism (Loonen, Hage, & Kon, 2005, p. 1469).
and institution). Gender bias. In light of the gender gap in STEM (science, Affiliation bias. Affiliation bias occurs when reviewers and technology, education, and medicine) fields (Budden et al., authors/applicants enjoy formal or informal relationships. 2008; Wennerås & Wold, 1997), the prevailing assumption This bias may be classified as a kind of bias that varies as a has been that men are overall more favorably treated than

women in peer review. Although empirical research on gender bias in publication and grant outcomes has produced “data and interpretations which at times are contradictory” (Rees, 2011, p. 140), recent meta-analysis suggests that claims of gender bias in peer review “are no longer valid” (Ceci & Williams, 2011, p. 3,157).

For example, if there is gender bias in review, we would expect double-blind conditions to increase acceptance rates for female authors. However, this is not the case (Blank, 1991). Nor are manuscripts by female authors disproportionately rejected at single-blind review journals such as Journal of Biogeography (Whittaker, 2008), Journal of the American Medical Association (Gilbert, Williams, & Lundberg, 1994), Nature Neuroscience (Nature Neuroscience, 2006), and Cortex (Valkonen & Brooks, 2011). Even when the quality of submissions is controlled for, manuscripts authored by women do not appear to be rejected at a higher rate than those authored by men (Borsuk et al., 2009).

Wennerås and Wold (1997) found that female biomedical postdoctoral fellowship applicants had to be 2.5 times more productive than a male applicant to receive the same competence score. However, replications of the study at comparable institutions in the U.K. (Grant, Burden, & Breen, 1997), Canada (Friesen, 1998), and Germany (Bornmann & Daniel, 2006) failed to discover statistically significant gender bias in the awarding of the same type of postdoctoral fellowship. A later replication at the same institution found that gender-based allotments had reversed (Sandström & Hällsten, 2008).

Meta-analyses (Marsh et al., 2009) and large-scale studies (Marsh, Jayasinghe, & Bond, 2011; RAND, 2005) of grant outcomes found no gender differences after adjusting for factors such as discipline, country, institution, experience, and past research output. One study found that female applicants received only 63% of the funding that their male colleagues received from the NIH (RAND, 2005). However, a later study found funding success rates were nearly equal for men and women at NIH when controlling for research rank/stage (Ley & Hamilton, 2008).

Bias as a Function of Reviewer Characteristics

Bias as a function of reviewer characteristics challenges the impartiality of peer review by demonstrating that reviewers fail to evaluate a submission’s content and relationship to the literature independently of reviewer characteristics. Such bias is demonstrated by showing that specific classes of reviewers are systematically tougher or softer on identical submissions (e.g., Jayasinghe et al., 2003) or across multiple submissions (e.g., Gilbert et al., 1994).

Evaluative strictness or leniency can be idiosyncratic to individual reviewers (Casati, Marchese, Ragone, & Turrini, 2009; Marsh & Ball, 1989; Thurner & Hanel, 2010). Strictness or leniency can also vary systematically as a function of the social categories to which reviewers belong. Studies show significant differences in the patterns of reviewing by gender, with female reviewers being stricter than their male colleagues (e.g., Borsuk et al., 2009; Jayasinghe et al., 2003; Lane & Linden, 2009; Wing, Benner, Petersen, Newcomb, & Scott, 2010). Female editors have been found to reject more submissions than their male colleagues (Gilbert et al., 1994; Lane & Linden, 2009), although the reverse phenomenon has also been discovered (Wing et al., 2010). Toughness may also vary by disciplinary affiliation: Lee and Schunn (2011) found philosophers’ reviews were more negative in tone and more likely to lead to rejection than those written by psychologists. Wood (1997) found American reviewers to be more lenient than their colleagues from the U.K. or Germany. Marsh et al. (2008) found American reviewers to be more lenient than Australians and suggested that the leniency of American reviewers results from “a culture that is comfortable being generous in their evaluations” (p. 163).

This observation raises interesting normative questions about the role that cultural differences play in reviewer style and strictness. Because Marsh et al. (2008) work within a psychometric framework, they conceive of cultural differences in peer evaluations as sources of contamination or error in assessments of a submission’s true quality value. However, when evaluative cultures are specific to disciplines, it is less clear whether such differences should be understood as a form of problematic bias. Lamont (2009) and Mallard et al. (2009) argue that discipline-specific evaluative cultures articulate appropriate ways to approach theory/method and provide the proper epistemic grounds for fairly evaluating grant proposals.

Content-Based Bias

Content-based bias involves partiality for or against a submission by virtue of the content (e.g., methods, theoretical orientation, results) of the work.2 Since different types of content-based bias challenge the thesis of impartiality in different ways, we will save analysis of these challenges to discussion of the subtypes. Content-based bias is primarily studied in the context of scientific disciplines. This is because the overarching concern motivating research on content-based bias is whether peer review is capable of the kind of self-regulation that encourages scientific progress and the achievement of other scientific goals. Most studies attempt to demonstrate content-based bias by showing that review outcomes vary as a function of the submission’s content. However, when such studies are not available, surveys or anecdotal evidence from researchers or grant program managers are appealed to instead.

2 Content-based bias may also include a form of “ego bias,” where reviewers and editors prefer submissions that cite their own work (e.g., Sugimoto & Cronin, in press).

Many hypothesize that reviewers will evaluate more favorably the submissions of authors who belong to similar “schools of thought,” a form of “cognitive cronyism” (Travis & Collins, 1991, p. 323). The perception that cognitive cronyism is at play in peer review contexts is evidenced by

conversations among grant committee members at the U.K. Science and Engineering Research Council, which reveal attempts to contextualize reviewer recommendations by identifying theoretical and subdisciplinary affiliations between reviewers and proposal authors (Travis & Collins, 1991). Sandström (2009) operationalized cognitive cronyism in reviews by examining the relationships between key noun phrases appearing in the titles and abstracts of papers being reviewed and papers written by the reviewers, hypothesizing that reviewers would favor work that was similar to their own. The data did not support the hypothesis.

At what point does cognitive difference become discrimination? Travis and Collins (1991) contrast cognitive cronyism with bias based on social status. For Travis and Collins, cognitive cronyism is not pernicious like social status bias so long as the boundaries of cognitive communities and social hierarchies do not coincide. However, in cases where they do coincide, outsiders may find “old-boy networks” that control journal and conference content (Hull, 1988, p. 156) and citation networks (Ferber, 1986) difficult to penetrate for social reasons disguised as purely cognitive ones (Lee & Schunn, 2011).

If reviewers prefer research that is similar in cognitive orientation and content to their own, then we would expect that, on the whole, reviewers disfavor research inconsistent with their theoretical orientation as well as research falling outside the mainstream, including interdisciplinary and transformative research.

Confirmation bias. In the psychological literature, confirmation bias is the tendency to gather, interpret, and remember evidence in ways that affirm rather than challenge one’s already held beliefs (Nickerson, 1998). Historical and philosophical analyses have demonstrated the obstructive and constructive role that confirmation bias has played in the course of scientific inquiry, theorizing, and debate (Greenwald, Pratkanis, Leippe, & Baumgardner, 1986; Solomon, 2001). In the context of peer review, confirmation bias is understood as reviewer bias against manuscripts describing results inconsistent with the theoretical perspective of the reviewer (Jelicic & Merckelbach, 2002). As such, confirmation bias can also be classified as a type of bias that varies as a function of reviewer characteristics.

Confirmation bias challenges the impartiality of peer review by questioning whether reviewers evaluate submissions on the basis of their content and relationship to the literature, independently of their own theoretical/methodological preferences and commitments. Confirmation bias also challenges the impartiality of scientists qua scientists by questioning their ability to evaluate scientific hypotheses on the basis of the evidence independently of their “desires, value perspectives, cultural and institutional norms and presuppositions, expedient alliances and their interests” (Lacey, 1999, p. 6).

Empirical study suggests reviewers are vulnerable to confirmation bias. Ernst, Resch, and Uher (1992) found that referees who had published work in favor of a controversial clinical intervention judged a manuscript whose data supported the use of that intervention more favorably than those who had published work against it. Confirmation bias for or against manuscripts may be rooted in biased assessments along more specific dimensions of evaluation. For example, Mahoney (1977) found that reviewers judged the methodological soundness, data presentation, scientific contribution, and publishability of a manuscript to be of higher quality when its data were consistent with the reviewer’s theoretical orientation. However, consistency between a reviewer’s theoretical orientation and a manuscript’s reported results does not automatically lead to confirmation bias. Hull’s (1988) analysis of reviewer recommendations for Systematic Zoology demonstrates that, during a time of warring schools of taxonomy, confirmation bias among reviewers was “far from total” (p. 333) since allies can disagree on fundamental tenets and wish to prevent the publication of weak papers that could become easy targets for rivals.

Conservatism. Peer review is often censured for its conservatism, that is, bias against groundbreaking and innovative research (Braben, 2004; Chubin & Hackett, 1990; Wesseley, 1998). Conservatism violates the impartiality of peer review by suggesting that reviewers do not interpret and apply evaluative criteria in identical ways since what count as the proper criteria of evaluation—and their relative weightings—are disputed. Although some challenge the suggestion that conservatism is epistemically problematic (Shatz, 2004), most argue that conservatism threatens scientific progress by stifling the funding and public articulation of alternative and revolutionary scientific theories (Stanford, 2012). More locally, conservatism violates explicit mandates, articulated by journals and granting institutions, to fund and publish innovative research (Frank, 1996; Horrobin, 1990; Luukkonen, 2012).

Many have voiced concern about conservatism in peer review, including past directors at the NSF and NIH (Carter, 1979; Kolata, 2009) and applicants to these institutions (Gillespie et al., 1985, p. 49; McCullough, 1989, p. 83). Research suggests that authors proposing unorthodox as opposed to orthodox claims must meet a higher burden of proof: Resch, Ernst, and Garrow (2000) demonstrated that studies supporting unorthodox medical treatments were rated less highly even though the supporting data were equally strong. Qualitative research reveals another possible source for conservatism: for many grant panelists, “frontier” research is understood as “paradigm-shifting” and “revolutionary” (Luukkonen, 2012, p. 54), while “excellent” research is understood as involving “methodological rigour and solid quality of the research” (Luukkonen, 2012, p. 54). Because of the uncertainty surrounding the pursuit of novel methods and theories—and the need for multiple contingency plans should a new experiment or project not go as planned—it may be more difficult for frontier research to appear excellent qua methodologically rigorous or solid.
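Sandström's (2009) operationalization of cognitive cronyism, described above, is at bottom a text-similarity measure between the papers under review and a reviewer's own papers. The sketch below is illustrative only and is not Sandström's implementation: word bigrams stand in for extracted key noun phrases, and the two titles are invented for the example.

```python
from collections import Counter
from math import sqrt

def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-phrase count vectors."""
    dot = sum(a[t] * b[t] for t in set(a) & set(b))
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def phrase_counts(text: str) -> Counter:
    # Crude stand-in for noun-phrase extraction: lowercase word bigrams.
    words = [w.strip(".,;:()").lower() for w in text.split()]
    return Counter(zip(words, words[1:]))

# Invented title fragments: one from a submission, one from a paper by its reviewer.
submission = "Cladistic analysis of morphological characters in beetle taxonomy"
reviewer = "Morphological characters and cladistic methods in insect taxonomy"
score = cosine_similarity(phrase_counts(submission), phrase_counts(reviewer))
```

A test of the cronyism hypothesis would then ask whether such similarity scores predict more favorable review outcomes; on Sandström's data, recall, they did not.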

There is a paucity of quantitative work on whether and where conservatism arises in peer review. This gap indicates a crucial area for future research—one facing methodological and conceptual challenges. Since all manuscripts and grant proposals aim to be novel in some respect, studies on conservatism must find ways to measure degrees of novelty and/or parse out how different types of novelty (e.g., in methods, theory, application context, research question, or statistical analyses) impact peer evaluations.

Bias against interdisciplinary research. Some researchers have expressed concerns of bias against interdisciplinary research since, it is thought, disciplinary reviewers prefer mainstream research (Travis & Collins, 1991). Bias against interdisciplinary research, if discovered, would violate the impartiality of peer review by suggesting that reviewers do not interpret and apply evaluative criteria in identical ways because what count as the proper criteria of evaluation—and their relative weightings—are disputed. Bias against interdisciplinary research would also be problematic since many of the most important social and scientific problems require multiple disciplinary perspectives to address (Metzger & Zare, 1999, p. 642).

Efforts to demonstrate interdisciplinary bias have been mixed. Porter and Rossini (1985) found that interdisciplinary proposals at the NSF received lower ratings. However, no difference in peer rating for interdisciplinary research was found by the Finnish Research Council (Bruun, Hukkinen, Huutoniemi, & Klein, 2005) or by the International Review Committee for Physics (Rinia, van Leeuwen, van Vuren, & van Raan, 2001). The perception of bias remains, however: grant panelists at the European Research Council gave their favorite interdisciplinary projects the highest rating in a strategic effort to counterbalance anticipated bias (Luukkonen, 2012, p. 56). Rather than endorse this kind of gaming behavior by reviewers, the Public Library of Science (PLoS) and the U.K. Research Integrity Office recommend seeking the expertise of a larger number of reviewers, a practice undertaken by the Royal Society (Science and Technology Committee, 2011), to ensure that interdisciplinary work is evaluated by individuals with the appropriate skills and expertise.

Publication bias. Publication bias is the tendency for journals to publish research demonstrating positive rather than negative outcomes, where “positive outcomes” include results that have a positive direction (Bardy, 1998), are statistically significant irrespective of the direction of result (Dickersin, Min, & Meinert, 1992), or both (Fanelli, 2010; Ioannidis, 1998). The controversy surrounding publication bias demonstrates that scientists disagree about the evaluative merits of research reporting negative outcomes (Ioannidis, 2005; Palmer, 2000). More commonly, publication bias is understood as normatively problematic because it leads to exaggerated effect size measurements in later meta-analyses (Ioannidis, 2005; Palmer, 2000), creates publication patterns that conflict with overall disciplinary goals (Lee, 2012), and encourages the practice of “burying” or “redressing” negatives as positives in distorting ways (Chan, Hróbjartsson, Haahr, Gøtzsche, & Altman, 2004; Gerber & Malhotra, 2008). There is work suggesting that publication bias is the result of reviewer and editor preferences for positive outcomes: for example, the Journal of the American Medical Association was more likely to accept statistically significant results on the primary outcome (Olson et al., 2002). However, other work suggests it is authors, anticipating the rejection of negative outcomes, who are primarily responsible for the disproportionate publication of positive outcomes (Dickersin, 1990; Easterbrook, Berlin, Gopalan, & Matthews, 1991) as well as for the increased time lag in the publication of negative results (Ioannidis, 1998).

Forms of Peer Review

Despite concerns about bias, researchers still believe peer review is necessary for the vetting of knowledge claims. One of the most comprehensive surveys of perception of peer review to date found that 93% disagree with the claim that peer review is unnecessary; 85% believe peer review benefits scientific communication; and 83% believe that “without peer review there would be no control” (Ware & Monkman, 2008, p. 1). This suggests that, for researchers, “the most important question with peer review is not whether to abandon it, but how to improve it” (Smith, 2006, p. 180). Many scholars and editorial staff are advocating alternative models of peer review in the hope of accelerating the publication process (reducing “time to market”) and making the review process itself more transparent and less susceptible to bias of different kinds (e.g., Smith, 1999).

Both double-blind and single-blind review are heavily supported at present. Double-blind review is more commonly found in the humanities and social sciences, and sometimes used in the clinical medical and nursing fields (Ware & Monkman, 2008, p. 8). Single-blind review remains the norm in life sciences, physical science, and engineering (Ware & Monkman, 2008, p. 8). It is widely believed that double-blind review ensures greater fairness for authors while continuing to protect reviewer identities to promote frank commentary. In open peer review authors and reviewers know one another’s identities, reviews may be open to the public, and, in some cases, members of the public can self-select as reviewers. Variations exist on the theme of open review, although most such systems disclose the identity of the reviewer to the public. Hybrid systems also exist that combine elements of open review with public commentary and traditional anonymous peer review.

Double-blind peer review. Single-blind is the most widely used model for peer review; yet in Ware and Monkman’s (2008) survey, 56% of respondents indicated that they would prefer double-blind review (only 25% prefer single-blind): “[d]ouble-blind review was primarily supported because of its perceived objectivity and fairness” (p. 2), a perception that might have an impact on submission behavior by

authors who perceive themselves to be vulnerable to social bias (Budden et al., 2008). Studies in fields where double-blind is the norm have shown high levels of satisfaction (Baggs, Broome, Dougherty, Freda, & Kearney, 2008). Others have suggested that double-blind review could protect against bias, but noted the difficulty in truly “blinding” a submitted manuscript (Brown, 2007; Science and Technology Committee, 2011). There is far from unanimous support for double-blind review: in a survey of nearly 1,500 editors in chemistry, a plurality of respondents stated that double-blind was “pointless, because content and references give away identity” (Brown, 2007, p. 133). This assumption has been tested in a number of empirical studies, which showed that reviewers can successfully identify authors 25%–40% of the time (Baggs et al., 2008; Ceci & Peters, 1984; Justice et al., 1998; Yankauer, 1991). It has been suggested that these numbers might be significantly higher in more specialized fields (Lane, 2008).

Several studies have attempted to show that double-blind review reduces social bias against authors. Budden et al. (2008) found an increase in female first-authored manuscripts in a single journal following a policy change to double-blind review. However, in a reexamination of this work, Engqvist and Frommen (2008) demonstrated that the data were inconclusive, given the growth in submissions by female authors across similar journals within the field before the policy change; Webb, O’Hara, and Freckleton (2008) demonstrated that alternative statistical techniques suggest no change in publication outcomes; and Whittaker (2008) argued that the number of authors of unknown gender in the original study may have been large enough to cast doubt on the effect.

Other studies have tested for social bias against authors by randomizing double- and single-blind reviewing across manuscripts, but found no significant differences in outcomes (Alam et al., 2011; Blank, 1991; Smith, Nixon, Bueschen, Venable, & Henry, 2002) and no difference in the quality of the resultant reviews (Alam et al., 2011; Justice et al., 1998). One study did demonstrate that more-favorable reviews were written when the reviewers correctly guessed the author of the anonymized manuscripts (Isenberg, Sanchez, & Zafran, 2009). However, given that senior and well-known authors are more likely to be unmasked (Justice et al., 1998), it is unclear whether this is attributable to bias or to demonstrable differences in quality.

Open peer review. Open peer review, that is, a form of review in which authors and reviewers are both known to one another, is seen by proponents as a way to induce transparency in the process and speed up the process of vetting new work. One of the driving motivations behind the move for new forms of peer review is to reduce bias by opening up the “deliberative chambers” (Lamont, 2009, p. 2) to much closer scrutiny. Those who believe that submissions have a true quality value predict that transparency incentivizes better quality reviews and recommendations; however, empirical work suggests this might not be the case (Godlee et al., 1998; van Rooyen, Godlee, Evans, Black, & Smith, 1999). Those concerned about social- and content-based biases may see increased transparency as a way for reviewers to be accountable and sensitive to their own forms of partiality. In addition, transparency about reviewer identities would allow authors and members of the community to contextualize review content with knowledge about the particular theoretical, methodological, disciplinary, and cultural perspective from which the review is written. This contextualized knowledge would enable the community to use differences along these axes as a resource in the articulation of each other’s background assumptions and the development of more refined critiques and responses—a process of transformative criticism that uses the diversity of individuals’ content-based partiality to improve the objectivity of the community as a whole (Longino, 1990, 2002).

Despite the potential advantages of open peer review, researchers and scholars seem somewhat reticent to adopt it. In Ware and Monkman’s (2008) survey, only 13% preferred open review to other models and only 27% thought it could be an effective form of review compared with 17% in the Melero and Lopez-Santovena (2001) study. Nearly half of all Ware and Monkman’s (2008) respondents said that open peer review would make them less likely to review. Other studies have noted that disclosing the reviewer’s name would act as a disincentive and lead to a decline in the potential pool of willing reviewers (Baggs et al., 2008; van Rooyen, Delamothe, & Evans, 2010). Some scholars note that reviewer anonymity protects the social cohesion of research groups by allowing same-group reviewers to “play down their areas of disagreement” in public (Hull, 1988, p. 334). More generally, scholars feel that “anonymity protects younger, less powerful reviewers from possible retribution on the part of the rejected author” (Peters & Ceci, 1982b, p. 251). Of course, these statistics may be a reflection of generational differences, specifically (to some degree at least) the attachment established scholars have to the “legacy system” (Kravitz & Baker, 2011, para. 1).

Controlled studies have found that open review is associated with a higher refusal rate (on the part of reviewers) and an increase in the amount of time taken to write reviews (van Rooyen, Delamothe, et al., 2010; Walsh, Rooney, Appleby, & Wilkinson, 2000). Studies are inconclusive on the effect on quality, with some finding no difference (van Rooyen, Delamothe, et al., 2010; van Rooyen et al., 1999) and others an increase in quality (Walsh et al., 2000). It has been suggested that open peer review may increase levels of inter-rater reliability; however, recent studies have found no difference in agreement levels between open and closed peer review (e.g., Bornmann & Daniel, 2010b).

Despite the lack of empirical evidence in favor of open reviewing, many journals are implementing, or experimenting with, the approach. Shakespeare Quarterly tried it successfully (Cohen, 2010); it has been used by BMJ for more than a decade (Science and Technology Committee, 2011; Ware & Monkman, 2008); MIT Press, PLoS Medicine, and

Nature have had less successful attempts due to lack of engagement by authors and reviewers (Nature, 2006; Science and Technology Committee, 2011; Timmer, 2008; Ware & Monkman, 2008).

There are, however, multiple ways in which open review can be conducted. In the cases just cited, the editor instigated and oversaw the review process. While author-suggested reviewers are also employed by a few journals, these reviewers tend to be more favorable than editor-nominated reviewers (Bornmann & Daniel, 2010a). A more radical approach involves “crowdsourcing”: let the readers themselves decide what should be admitted into the scholarly literature. Such an approach has been used by the book publishing company unbound, which allows authors to post their work and garner support from readers to have it published commercially (http://unbound.co.uk/). In the scholarly realm, it has been proposed that arXiv.org, a popular preprint repository, use open peer review to evaluate and certify the preprints it houses, the idea being that editors could invite referees to write and sign reviews of preprints, the result of which would be the ability to label certain articles as published articles, rather than preprints, thereby, ultimately, bypassing the world of commercial scholarly publishing (Boldt, 2011).

Hybrid peer review. Hybrid systems, as noted above, combine elements of openness (often in the form of public commentary) with traditional (i.e., blind) review. Discussions of hybrid systems are not new: More than 30 years ago Harnad (1979, p. 18) argued presciently for a “complementary mechanism . . . in which scientists could solicit open peer commentary on their work.” He claimed that this “would not only provide a medium and an incentive for efforts and initiatives that might otherwise have been buried in the evanescence and anonymity of closed peer review, but it would also focus and preserve the creative disagreement, in a disciplined form, answerable to print and posterity” (Harnad, 1979, p. 19). The ubiquity of online computing has made this possibility much more of a reality.

Implicit in Harnad’s proposal for hybrid review is the idea that it serves as a supplement to, not a replacement for, traditional closed peer review. Many hybrid options have been developed based on the same assumption. Harnad (1982) classifies these as either a priori systems, where comments are invited before formal peer review, or a posteriori systems, where comments are invited after the work has been reviewed.

A priori peer review. The journal Atmospheric Chemistry and Physics experimented with an a priori open review system in which articles were posted to the site (after initial editorial screening) and select reviewers as well as the public were invited to comment over an 8-week period. The author then prepared a final draft, which was reviewed and assessed for suitability for publication by the editor (Science and Technology Committee, 2011). Other journals have also explored ways of providing a platform for discussion preceding formal peer review (e.g., Copernicus Publications). In the case of Atmospheric Chemistry and Physics, the original draft, the comments, and the final version are all made accessible online. Some journals have sought to capture and disseminate the “prepublication history” and anonymized reviews along with the final version of the paper (e.g., EMBO Journal, BioMed Central medical journals, Electronic Transactions on Artificial Intelligence, Hydrology and Earth Systems Science Discussions, and BMC Medicine [Science and Technology Committee, 2011]).

The Journal of Medical Internet Research has taken a slightly different route, providing a list of articles that are awaiting peer review. A reviewer can choose to sign up and review the article. If the article is accepted, the name of the reviewer will accompany the article. If it is rejected, the reviewer will remain anonymous. The goal of the procedure is to “give constructive feedback to the authors and/or to prevent publication of uninteresting or fatally flawed articles” (Journal of Medical Internet Research, 2012).

There have been other experiments in publishing reviews alongside the reviewed manuscript. One such system works as a social networking site for reviewing—allowing authors to submit work that is distributed anonymously to registered reviewers on the site. The reviews are then “scored” by the authors, providing an evaluation/ranking mechanism for reviewers. If reviewers agree, their signed reviews are published in an online journal (de Vrieze, 2012).

A posteriori peer review. Postpublication comments are seen by some as “a useful supplement to formal peer review, rather than a replacement for it” (Ware & Monkman, 2008, p. 2). Kravitz and Baker (2011, para. 1) have outlined a radical model that combines prepublication (anonymous) peer review with “the development of a marketplace where the priority of a paper rises and falls based on its reception from the field.” Systems where stable documents are provided and readers are allowed to post comments are the most common form of postpublication review (e.g., PLoS journals: see http://www.plos.org/), but other novel implementations have been tried; for example, Open Medicine, which posts reviewed articles on a wiki, allows real-time changes by readers. This is probably the most radical of implementations, as it allows collaborative authoring of the peer-reviewed work, rather than mere comments on it. Other proposals have included “rebound” systems, which allow the author to engage in postreview debate over a rejected manuscript (Sen, 2012). Post-peer-review commentary on, and rating of, published papers in thousands of journals by domain experts across a range of fields is the approach used by Faculty of 1000 (see http://f1000.com/). These various types of open review have been referred to as “fear review” as it is suggested that this exposure of, and the ability for peers to publicly comment on, one’s published work may “well be more nerve-racking than having it read discretely by only two or three peers” (Cronin, 2003, p. 15).

Although a number of proposals have been suggested, very few studies have investigated how these various approaches might actually mitigate bias or how bias might

JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE AND TECHNOLOGY—January 2013, DOI: 10.1002/asi

be quantified with precision. Despite the many calls for an overhaul of the status quo, there is little evidence that alternative forms of review either reduce bias or have been enthusiastically embraced by the scientific community at large. In fact, studies have found no significant difference in perceived bias or review quality in these systems but, rather, a refusal on the part of reviewers to participate and protracted turnaround times. Given that the most commonly voiced criticism of the current system is the time taken to produce reviews, as noted by 38% of survey respondents (Ware & Monkman, 2008)—and the finding that there is “no crisis in supply of peer reviewers” (Vines et al., 2010, p. 1,041)—a wholesale shift seems unlikely in the near term.

Conclusions and Future Research

Impartiality ensures both the consistency and meritocracy of peer review. Research on bias in peer review—predicated on the ideal of impartiality—raises not just local hypotheses about specific sources of partiality, but much broader questions about whether the processes by which knowledge communities regulate themselves are epistemically and socially corrupt. Contra impartiality, the evidence suggests that peer evaluations vary as a function of author nationality and prestige of institutional affiliation; reviewer nationality, gender, and discipline; author affiliation with reviewers; reviewer agreement with submission hypotheses (confirmation bias); and submission demonstration of positive outcomes (publication bias).

However, a closer look at the empirical and methodological limitations of research on bias raises questions about the existence, extent, and normative status of many hypothesized forms of bias. Psychometrically oriented research is predicated on the questionable assumption that disagreement among reviewers is not normatively appropriate or desirable. Research on bias as a function of author characteristics adopts the untested assumption that authors belonging to different social categories submit manuscripts and grant proposals of comparable quality. Despite vocal concerns about conservatism in science, there is no empirical evidence (beyond anecdote) to buttress or belie such worries. And the evidence for bias against interdisciplinary research is mixed, as is the evidence for bias against female authors and authors living in non-English-speaking countries.

Research on bias in peer review also suggests that peer review is social in ways that go beyond the social categories to which authors and reviewers belong: Relationships between individuals in the process impact outcomes (e.g., affiliation bias), and individuals make decisions conditioned on beliefs about what others value (e.g., publication bias). Future research might usefully investigate these complex and dynamic social relations. Consider, for example, how the editor’s relationships and beliefs about other actors may have an impact on his/her decisions. On the basis of previous experience with reviewers, the editor may differentially value and preferentially assign reviewers to manuscripts, which may alter final recommendations. Frequent or highly sought contributors to the journal may develop a privileged relationship with the editor and with potential reviewers. Editors may feel peer pressure when evaluating manuscripts submitted by frequent reviewers and editorial board members (Lipworth, Kerridge, Carter, & Little, 2011). The readership may function as an invisible hand in the selection of authors and manuscript content, since the editor will need to be cognizant of the needs and wants of the marketplace. An editor may also be influenced by her/his relationship with the editorial board and/or publisher (commercial, academic, or society). The editor’s strategy or vision for the journal may have a bearing on which manuscripts are reviewed and ultimately accepted for publication. As Chubin and Hackett (1990, p. 92) note, “[t]he journal editor occupies a delicate position between the author and reviewers, alternating among the roles of wordsmith and gatekeeper, caretaker and networker, literary agent and judge.”

Not all of these sources of social influence impact peer review in problematic ways. For example, the ways in which authors, reviewers, and editors anticipate each other’s scrutiny and judgment may serve to improve the quality of each of their contributions (Bailar, 1991; Hirschauer, 2009), and editors’ personal connections allow them to learn about and capture high-impact papers for publication (Laband & Piette, 1994). These examples suggest that the sociality of peer review can be structured “to enrich, rather than threaten” the well-being of peer review (Lipworth et al., 2011, p. 1,056). A natural direction for future research includes articulating and assessing alternative normative models that acknowledge reviewer partiality, with a focus on the epistemic and cultural bases for reviewer disagreement; the ways editors and grant program managers anticipate, capitalize on, and manage reviewer disagreement; and the ways publication venues and funding opportunities should be structured to accommodate reviewer differences (Hargens & Herting, 1990; Lee, 2012). Finally, the inescapable sociality and partiality of peer evaluation raise questions about whether impartiality can or should be upheld as the ideal for peer review.

Acknowledgments

The authors acknowledge the contributions of the three reviewers who provided thoughtful commentary on an earlier version of this article. Blaise Cronin, being Editor-in-Chief of JASIST, was uninvolved in the review process, which was handled entirely by Jonathan Furner, an associate editor of the Journal and editor for the Advances in Information Science section.

References

Alam, M., Kim, N.A., Havey, J., Rademaker, A., Ratner, D., Tregre, B., . . . Coleman, W.P., 3rd. (2011). Blinded vs. unblinded peer review of manuscripts submitted to a dermatology journal: A randomized multi-rater study. British Journal of Dermatology, 165(3), 563–567.

Baggs, H.G., Broome, M.E., Dougherty, M.C., Freda, M.C., & Kearney, M.H. (2008). Blinding in peer review: The preferences of reviewers for nursing journals. Journal of Advanced Nursing, 64(2), 131–138.
Bailar, J.C. (1991). Reliability, fairness, objectivity and other inappropriate goals in peer review. Behavioral and Brain Sciences, 14, 137–138.
Bardy, A.H. (1998). Bias in reporting clinical trials. British Journal of Clinical Pharmacology, 46, 147–150.
Bargh, J.A., & Williams, E.L. (2006). The automaticity of social life. Current Directions in Psychological Science, 15(1), 1–4.
Biagioli, M. (1998). The instability of authorship: Credit and responsibility in contemporary medicine. The FASEB Journal, 12(1), 3–16.
Biagioli, M. (2002). From book censorship to academic peer review. Emergences, 12(1), 11–45.
Björk, B.-C., Roos, A., & Lauri, M. (2009). Scientific journal publishing: Yearly volume and open access availability. Information Research, 14(1). Retrieved from http://informationr.net/ir/14-1/paper391.html
Blackburn, J.L., & Hakel, M.D. (2006). An examination of sources of peer-review bias. Psychological Science, 17(5), 378–382.
Blank, R.M. (1991). The effects of double-blind versus single-blind reviewing: Experimental evidence from the American Economic Review. American Economic Review, 81(5), 1041–1068.
Boldt, A. (2011). Extending ArXiv.org to achieve open peer review. Journal of Scholarly Publishing, 42(2), 238–242.
Bornmann, L. (2008). Scientific peer review: An analysis of the peer review process from the perspective of sociology of science theories. Human Architecture—Journal of the Sociology of Self-Knowledge, 6(2), 23–38.
Bornmann, L. (2011a). Peer review and bibliometrics: Potentials and problems. In J.C. Shin, R.K. Toutkoushian, & U. Teichler (Eds.), University rankings: Theoretical basis, methodology and impacts on global higher education (pp. 145–164). Berlin, Germany: Springer.
Bornmann, L. (2011b). Scientific peer review. In B. Cronin (Ed.), Annual review of information science and technology, 45 (pp. 199–245). Medford, NJ: Information Today/American Society for Information Science & Technology.
Bornmann, L., & Daniel, H.-D. (2006). Potential sources of bias in research fellowship assessments: Effects of university prestige and field of study on approval and rejection of fellowship applications. Research Evaluation, 15(3), 209–219.
Bornmann, L., & Daniel, H.-D. (2007). Gatekeepers of science: Effects of external reviewers’ attributes on the assessments of fellowship applications. Journal of Informetrics, 1(1), 83–91.
Bornmann, L., & Daniel, H.-D. (2008a). The effectiveness of the peer review process: Inter-referee agreement and predictive validity of manuscript refereeing at Angewandte Chemie. Angewandte Chemie International Edition, 47(38), 7173–7178.
Bornmann, L., & Daniel, H.-D. (2008b). Selecting manuscripts for a high impact journal through peer review: A citation analysis of communications that were accepted by Angewandte Chemie International Edition, or rejected but published elsewhere. Journal of the American Society for Information Science and Technology, 59(11), 1841–1852.
Bornmann, L., & Daniel, H.-D. (2009). Extent of type I and type II errors in editorial decisions: A case study on Angewandte Chemie International Edition. Journal of Informetrics, 3(4), 348–352.
Bornmann, L., & Daniel, H.-D. (2010a). Do author-suggested reviewers rate submissions more favorably than editor-suggested reviewers? A study on Atmospheric Chemistry and Physics. PLoS ONE, 5(10), e13345.
Bornmann, L., & Daniel, H.-D. (2010b). Reliability of reviewers’ ratings when using public peer review: A case study. Learned Publishing, 23, 124–131.
Bornmann, L., Mutz, R., & Daniel, H.-D. (2007). Gender differences in grant peer review: A meta-analysis. Journal of Informetrics, 1, 226–238.
Bornmann, L., Mutz, R., & Daniel, H.-D. (2008). How to detect indications of potential sources of bias in peer review: A generalized latent variable modeling approach exemplified by a gender study. Journal of Informetrics, 2, 280–287.
Bornmann, L., Mutz, R., & Daniel, H.-D. (2009). The influence of the applicants’ gender on the modeling of a peer review process by using latent Markov models. Scientometrics, 81(2), 407–411.
Bornmann, L., Mutz, R., & Daniel, H.-D. (2010). A reliability-generalization study of journal peer reviews: A multilevel meta-analysis of inter-rater reliability and its determinants. PLoS One, 5(12), e14331.
Bornmann, L., Mutz, R., Marx, W., Schier, H., & Daniel, H.-D. (2011). A multilevel modeling approach to investigating the predictive validity of editorial decisions: Do the editors of a high-profile journal select manuscripts that are highly cited after publication? Journal of the Royal Statistical Society, 174, 857–879.
Bornmann, L., Weymuth, C., & Daniel, H.-D. (2010). A content analysis of referees’ comments: How do comments on manuscripts rejected by a high-impact journal and later published in either a low- or high-impact journal differ? Scientometrics, 83(2), 493–506.
Borsuk, R.M., Aarssen, L.W., Budden, A.E., Koricheva, J., Leimu, R., Tregenza, T., & Lortie, C.J. (2009). To name or not to name: The effect of changing author gender on peer review. Bioscience, 59(11), 985–989.
Braben, D.W. (2004). Pioneering research: A risk worth taking. Hoboken, NJ: Wiley-Interscience.
Brown, R.J.C. (2007). Double anonymity in peer review within the chemistry periodicals community. Learned Publishing, 20(2), 131–137.
Bruun, H., Hukkinen, J., Huutoniemi, K., & Klein, J.T. (2005). Promoting interdisciplinary research: The case of the Academy of Finland. Helsinki: Academy of Finland.
Budden, A.E., Tregenza, T., Aarssen, L.W., Koricheva, J., Leimu, R., . . . (2008). Double-blind review favours increased representation of female authors. Trends in Ecology & Evolution, 23, 4–6.
Campanario, J.M. (1995). Commentary: On influential books and journal articles initially rejected because of negative referees’ evaluations. Science Communication, 16, 304–325.
Campanario, J.M. (1998a). Peer review for journals as it stands today—Part 1. Science Communication, 19(3), 181–211.
Campanario, J.M. (1998b). Peer review for journals as it stands today—Part 2. Science Communication, 19(4), 277–306.
Campanario, J.M., & Acedo, E. (2005). Rejecting highly cited papers: The views of scientists who encounter resistance to their discoveries from other scientists. Journal of the American Society for Information Science and Technology, 58(5), 734–743.
Carter, L.J. (1979). A new and searching look at NSF. Science, 204, 1064–1065.
Casati, F., Marchese, M., Ragone, A., & Turrini, M. (2009). Is peer review any good? A quantitative analysis of peer review (Technical Report No. DISI-09-045). Retrieved from Unitn-eprints Research website: http://eprints.biblio.unitn.it/1654
Ceci, S.J., & Peters, D. (1984). How blind is blind review? American Psychologist, 39(12), 1491–1494.
Ceci, S.J., & Williams, W.M. (2011). Understanding current causes of women’s underrepresentation in science. Proceedings of the National Academy of Sciences of the United States of America, 108(8), 3157–3162.
Chaiken, S., & Trope, Y. (1999). Dual-process theories in social psychology. New York: Guilford Press.
Chan, A.-W., Hróbjartsson, A., Haahr, M.T., Gøtzsche, P.C., & Altman, D.G. (2004). Empirical evidence for selective reporting of outcomes in randomized trials: Comparison of protocols to published articles. Journal of the American Medical Association, 291, 2457–2465.
Chubin, D.E., & Hackett, E.J. (1990). Peerless science: Peer review and U.S. science policy. Stony Brook, NY: State University of New York Press.
Cohen, P. (2010). Scholars test web alternative to peer review. The New York Times. Retrieved from http://www.nytimes.com/2010/08/24/arts/24peer.html?pagewanted=all
Cole, S., Cole, J.R., & Simon, G. (1981). Chance and consensus in peer review. Science, 214, 881–886.
Cronin, B. (2002). Hyperauthorship: A postmodern perversion or evidence of a structural shift in scholarly communication practices? Journal of the American Society for Information Science and Technology, 52(7), 558–569.

Cronin, B. (2003). Scholarly communication and epistemic cultures. New Review of Academic Librarianship, 9, 1–24.
Cronin, B. (2009). Editorial: Vernacular and vehicular language. Journal of the American Society for Information Science and Technology, 60(3), 433.
Cronin, B., & McKenzie, G. (1992). The trajectory of rejection. Journal of Documentation, 48(3), 310–317.
Daniel, H.-D. (1993). Guardians of science: Fairness and reliability of peer review. Weinheim, Germany: Wiley-VCH.
Daniel, H.-D. (2005). Publications as a measure of scientific achievement and scientists’ productivity. Learned Publishing, 18, 143–148.
Daniel, H.-D., Mittag, S., & Bornmann, L. (2007). The potential and problems of peer evaluation in higher education and research. In A. Cavalli (Ed.), Quality assessment for higher education in Europe (pp. 71–82). London: Portland Press.
Delamothe, T., & Smith, R. (2002). Twenty thousand conversations: Rapid responses suggest new models of knowledge creation. British Medical Journal, 324. Retrieved from http://www.bmj.com/content/324/7347/1171.full
Dickersin, K. (1990). The existence of publication bias and risk factors for its occurrence. Journal of the American Medical Association, 263, 1385–1389.
Dickersin, K., Min, Y.-I., & Meinert, C.L. (1992). Factors influencing publication of research results: Follow-up of applications submitted to two institutional review boards. Journal of the American Medical Association, 267, 374–379.
de Vrieze, J. (2012). Online social network seeks to overhaul peer review in scientific publishing. ScienceInsider. Retrieved from http://news.sciencemag.org/scienceinsider/2012/01/online-social-network-seeks-to.html
Easterbrook, P.J., Berlin, J.A., Gopalan, R., & Matthews, D.R. (1991). Publication bias in clinical research. Lancet, 337(8746), 867–872.
Engers, M., & Gans, J.S. (1998). Why referees are not paid (enough). American Economic Review, 88(5), 1341–1350.
Engqvist, L., & Frommen, J.G. (2008). Double-blind peer review and gender publication bias. Animal Behaviour, 76, E1–E2.
Ernst, E., & Kienbacher, T. (1991). Chauvinism. Nature, 352, 560.
Ernst, E., & Resch, K.L. (1999). Reviewer bias against the unconventional? A randomized double-blind study of peer review. Complementary Therapies in Medicine, 7(1), 19–23.
Ernst, E., Resch, K.L., & Uher, E.M. (1992). Reviewer bias. Annals of Internal Medicine, 116(11), 958.
Fanelli, D. (2010). Do pressures to publish increase scientists’ bias? An empirical support from US states data. PLoS ONE, 5(4), e10271.
Ferber, M.A. (1986). Citations: Are they an objective measure of scholarly merit? Signs, 11(2), 381–389.
Frank, E. (1996). Editors’ requests of reviewers: A study and a proposal. Preventive Medicine, 25, 102–104.
Friesen, H.G. (1998). Equal opportunities in Canada. Nature, 391, 326.
Gardner, M.J., & Bond, J. (1990). An exploratory study of statistical assessment of papers published in the British Medical Journal. Journal of the American Medical Association, 263, 1355–1357.
Gerber, A.S., & Malhotra, N. (2008). Publication bias in empirical sociological research: Do arbitrary significance levels distort published results? Sociological Methods & Research, 37(1), 3–30.
Gilbert, J.R., Williams, E.S., & Lundberg, G.D. (1994). Is there gender bias in JAMA’s peer-review process? Journal of the American Medical Association, 272(2), 139–142.
Gillespie, G.W., Chubin, D.E., & Kurzon, G.M. (1985). Experience with NIH peer review: Researchers’ cynicism and desire for change. Science, Technology, & Human Values, 10(3), 44–54.
Godlee, F., Gale, C.R., & Martyn, C.N. (1998). Effect on the quality of peer review of blinding reviewers and asking them to sign their reports: A randomized controlled trial. Journal of the American Medical Association, 280, 237–240.
Goodman, S.N., Berlin, J.A., Fletcher, S.W., & Fletcher, R.H. (1994). Manuscript quality before and after peer review and editing at Annals of Internal Medicine. Annals of Internal Medicine, 121, 11–21.
Gottfredson, S.D. (1978). Evaluating psychological research reports: Dimensions, reliability, and correlates of quality judgments. American Psychologist, 33, 920–934.
Grant, J., Burden, S., & Breen, G. (1997). No evidence of sexism in peer review. Nature, 390, 438.
Greenwald, A.G., Pratkanis, A.R., Leippe, M.R., & Baumgardner, M.H. (1986). Under what conditions does theory obstruct research progress? Psychological Review, 93(2), 216–229.
Gustafson, T. (1975). The controversy over peer review. Science, 190(4219), 1060–1066.
Hagstrom, W.O. (1971). Inputs, outputs, and the prestige of university science departments. Sociology of Education, 44, 375–397.
Hamermesh, D.S. (1994). Facts and myths about refereeing. Journal of Economic Perspectives, 8(1), 153–164.
Hames, I. (2007). Peer review and manuscript management in scientific journals. Oxford, UK: Blackwell.
Hargens, L.L., & Herting, J.R. (1990). Neglected considerations in the analysis of agreement among journal referees. Scientometrics, 19, 91–106.
Harnad, S. (1979). Creative disagreement. The Sciences, 19, 18–20.
Harnad, S. (Ed.). (1982). Peer commentary on peer review: A case study in scientific quality control. Cambridge, UK: Cambridge University Press.
Herrera, A.J. (1999). Language bias discredits the peer-review system. Nature, 397(6719), 467.
Hirschauer, S. (2009). Editorial judgments: A praxeology of “voting” in peer review. Social Studies of Science, 40(1), 71–103.
Holbrook, J.B., & Frodeman, R. (2011). Peer review and the ex ante assessment of societal impacts. Research Evaluation, 20(3), 239–246.
Horrobin, D.F. (1990). The philosophical basis of peer review and the suppression of innovation. Journal of the American Medical Association, 263, 1438–1441.
Hull, D. (1988). Science as a process. Chicago: University of Chicago Press.
Ioannidis, J.P.A. (1998). Effect of the statistical significance of results on the time to completion and publication of randomized efficacy trials. Journal of the American Medical Association, 279, 281–286.
Ioannidis, J.P.A. (2005). Why most published research findings are false. PLoS Medicine, 2(8), e124.
Isenberg, S.J., Sanchez, E., & Zafran, K.C. (2009). The effect of masking manuscripts for the peer-review process of an ophthalmic journal. British Journal of Ophthalmology, 93(7), 881–884.
Jackson, J.L., Srinivasan, M., Rea, J., Fletcher, K.E., & Kravitz, R.L. (2011). The validity of peer review in a general medicine journal. PLoS ONE, 6(7), e22475.
James, L.R., Demaree, R.G., & Wolf, G. (1984). Estimating within-group interrater reliability with and without response bias. Journal of Applied Psychology, 69(1), 85–98.
Jayasinghe, U.W., Marsh, H.W., & Bond, N. (2003). A multilevel cross-classified modelling approach to peer review of grant proposals: The effects of assessor and researcher attributes on assessor ratings. Journal of the Royal Statistical Society, Series A (Statistics in Society), 166(3), 279–300.
Jelicic, M., & Merckelbach, H. (2002). Peer-review: Let’s imitate the lawyers! Cortex, 38, 406–407.
Journal of Medical Internet Research. (2012). Open peer review articles. Retrieved from http://www.jmir.org/reviewer/openReview/abstracts
Justice, A.C., Cho, M.K., Winker, M.A., & Berlin, J.A. (1998). Does masking author identity improve peer review quality? A randomized controlled trial. Journal of the American Medical Association, 280(3), 240–242.
Kitcher, P. (1993). The advancement of science: Science without legend, objectivity without illusion. New York: Oxford University Press.
Kolata, G. (2009, June 27). Grant system leads cancer researchers to play it safe. New York Times. Retrieved from http://www.nytimes.com/2009/06/28/health/research/28cancer.html
Kravitz, D.J., & Baker, C.I. (2011). Toward a new model of scientific publishing: Discussion and a proposal. Frontiers in Computational Neuroscience, 5(55). Retrieved from http://www.frontiersin.org/computational_neuroscience/10.3389/fncom.2011.00055/full
Kravitz, R.L., Franks, P., Feldman, M.D., Gerrity, M., Byrne, C., & Tierney, W.M. (2010). Editorial peer reviewers’ recommendations at a general medical journal: Are they reliable and do editors care? PLoS ONE, 5(4), e10072.
Kronick, D.A. (1990). Peer review in 18th-century scientific journalism. Journal of the American Medical Association, 263, 1321–1322.
Laband, D.N., & Piette, M.J. (1994). Favoritism versus search for good papers: Empirical evidence regarding the behavior of journal editors. Journal of Political Economy, 102, 194–203.
Lacey, H. (1999). Is science value free? Values and scientific understanding. New York: Routledge.
Lamont, M. (2009). How professors think: Inside the curious world of academic judgment. Cambridge, MA: Harvard University Press.
Lane, D. (2008). Double-blind review: Easy to guess in specialist fields. Nature, 452, 28.
Lane, J.A., & Linden, D.J. (2009). Is there gender bias in the peer review process at Journal of Neurophysiology? Journal of Neurophysiology, 101(5), 2195–2196.
Lawrence, B., Jones, C., Matthews, B., Pepler, S., & Callaghan, S. (2011). Citation and peer review of data: Moving towards formal data publication. International Journal of Digital Curation, 6(2), 4–37.
Lee, C.J. (in press). A Kuhnian critique of psychometric research on peer review. Philosophy of Science.
Lee, C.J. (2012). Incentivizing procedural objectivity: Community response to publication bias. Unpublished manuscript.
Lee, C.J., & Schunn, C.D. (2011). Social biases and solutions for procedural objectivity. Hypatia: A Journal of Feminist Philosophy, 26(2), 352–373.
Ley, T.J., & Hamilton, B.H. (2008). The gender gap in NIH grant applications. Science, 322, 1472–1474.
Link, A.M. (1998). US and non-US submissions. Journal of the American Medical Association, 280(3), 246–247.
Lipworth, W.L., Kerridge, I.H., Carter, S.M., & Little, M. (2011). Journal peer review in context: A qualitative study of the social and subjective dimensions of manuscript review in biomedical publishing. Social Science & Medicine, 72, 1056–1063.
Longino, H. (1990). Science as social knowledge. Princeton, NJ: Princeton University Press.
Longino, H. (2002). The fate of knowledge. Princeton, NJ: Princeton University Press.
Loonen, M.P.J., Hage, J.J., & Kon, M. (2005). Who benefits from peer review? An analysis of the outcome of 100 requests for review by Plastic and Reconstructive Surgery. Plastic and Reconstructive Surgery, 116(5), 1461–1472.
Luukkonen, T. (2012). Conservatism and risk-taking in peer review: Emerging ERC practices. Research Evaluation, 21, 48–60.
Mabe, M. (2003). The growth and number of journals. Serials, 16(2), 191–197.
Mahoney, M.J. (1977). Publication prejudices: An experimental study of confirmatory bias in the peer review system. Cognitive Therapy and Research, 1(2), 161–175.
Mallard, G., Lamont, M., & Guetzkow, J. (2009). Fairness as appropriateness: Negotiating epistemological differences in peer review. Science, Technology, & Human Values, 34(5), 573–606.
Manten, A.A. (1980). The growth of European scientific journal publishing before 1850. In A.J. Meadows (Ed.), Development of scientific publishing in Europe (pp. 1–22). New York: Elsevier.
Marsh, H.W., & Ball, S. (1989). The peer review process used to evaluate manuscripts submitted to academic journals: Interjudgmental reliability. Journal of Experimental Education, 57, 151–169.
Marsh, H.W., Jayasinghe, U.W., & Bond, N.W. (2008). Improving the peer-review process for grant applications: Reliability, validity, bias, and generalizability. American Psychologist, 63(3), 160–168.
Marsh, H.W., Jayasinghe, U.W., & Bond, N.W. (2011). Gender differences in peer reviews of grant applications: A substantive-methodological synergy in support of the null hypothesis model. Journal of Informetrics, 5, 167–180.
Marsh, H.W., Bornmann, L., Mutz, R., Daniel, H.-D., & O’Mara, A. (2009). Gender effects in the peer reviews of grant proposals: A comprehensive meta-analysis comparing traditional and multilevel approaches. Review of Educational Research, 79, 1290–1326.
McCullough, J. (1989). First comprehensive survey of NSF applicants focuses on their concerns about proposal review. Science, Technology, & Human Values, 14(1), 78–88.
Melero, R., & Lopez-Santovena, F. (2001). Referees’ attitudes toward open peer review and electronic transmission of papers. Food Science and Technology International, 7(6), 521–527.
Merton, R.K. (1968). The Matthew effect in science. Science, 159(3810), 56–63.
Merton, R.K. (1973). The sociology of science: Theoretical and empirical investigations. Chicago: University of Chicago Press.
Metzger, N., & Zare, R.N. (1999). Interdisciplinary research: From belief to reality. Science, 283(5402), 642–643.
Monin, B., & Miller, D.T. (2001). Moral credentials and the expression of prejudice. Journal of Personality and Social Psychology, 81(1), 33–43.
National Institutes of Health. (2011). NIH databook. Retrieved from http://report.nih.gov/nihdatabook/
National Science Foundation. (2011). Proposal and award policies and procedures guide. Retrieved from http://www.nsf.gov/pubs/policydocs/pappguide/nsf11001/index.jsp
Nature. (2006). Overview: Nature’s peer review trial. Retrieved from http://www.nature.com/nature/peerreview/debate/nature05535.html
Nature. (2008). Editorial: Working double-blind. Should there be author anonymity in peer review? Retrieved from http://www.nature.com/nature/journal/v451/n7179/full/451605b.html
Nature. (2012). Peer-review policy. Retrieved from http://www.nature.com/authors/policies/peer_review.html
Nature Neuroscience. (2006). Women in neuroscience: A numbers game. Nature Neuroscience, 9(7), 853.
Nickerson, R.S. (1998). Confirmation bias: A ubiquitous phenomenon in many guises. Review of General Psychology, 2(2), 175–220.
Olson, C.M., Rennie, D., Cook, D., Dickersin, K., Flanagin, A., Hogan, J.W., . . . Pace, B. (2002). Publication bias in editorial decision making. Journal of the American Medical Association, 287, 2825–2828.
Oswald, A.J. (2008). Can we test for bias in scientific peer-review? (IZA Discussion Paper No. 3665). Bonn, Germany: Institute for the Study of Labor.
Palmer, A.R. (2000). Quasi-replication and the contract of error: Lessons from sex ratios, heritabilities and fluctuating asymmetry. Annual Review of Ecology and Systematics, 31, 441–480.
Parsons, M.A., Duerr, R., & Minster, J.-B. (2010). Data citation and peer-review. Eos, Transactions, American Geophysical Union, 91(34), 297–298.
Peters, D.P., & Ceci, S.J. (1982a). Peer-review practices of psychological journals: The fate of accepted, published articles, submitted again. Behavioral and Brain Sciences, 5(2), 187–195.
Peters, D.P., & Ceci, S.J. (1982b). Peer-review research: Objections and obligations. Behavioral and Brain Sciences, 5(2), 246–252.
Porter, A.L., & Rossini, F.A. (1985). Peer review of interdisciplinary research proposals. Science, Technology, & Human Values, 10(3), 33–38.
Price, D.S. (1976). A general theory of bibliometric and other cumulative advantage processes. Journal of the American Society for Information Science, 27(5), 292–306.
RAND. (2005). Is there gender bias in federal grant programs? (RAND Infrastructure, Safety, and the Environment Brief RB-9147-NSF). Retrieved from http://www.rand.org/pubs/research_briefs/RB9147/RAND_RB9147.pdf
Rees, T. (2011). The gendered construction of scientific excellence. Interdisciplinary Science Reviews, 36(2), 133–145.
Resch, K.I., Ernst, E., & Garrow, J. (2000). A randomized controlled study of reviewer bias against an unconventional therapy. Journal of the Royal Society of Medicine, 93, 164–167.

16 JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE AND TECHNOLOGY—January 2013 DOI: 10.1002/asi
Rinia, E.J., van Leeuwen, Th.N., van Vuren, H.G., & van Raan, A.F.J. (2001). Influence of interdisciplinarity on peer-review and bibliometric evaluations in physics research. Research Policy, 30, 357–361.
Ross, J.S., Gross, C.P., Desai, M.M., Hong, Y.L., Grant, A.O., Daniels, S.R., . . . Krumholz, H.M. (2006). Effect of blinded peer review on abstract acceptance. Journal of the American Medical Association, 295(14), 1675–1680.
Rothwell, P.M., & Martyn, C.N. (2000). Reproducibility of peer review in clinical neuroscience: Is agreement between reviewers any greater than would be expected by chance alone? Brain, 123, 1964–1969.
Rust, J., & Golombok, S. (2009). Modern psychometrics: The science of psychological assessment (3rd ed.). New York: Routledge.
Sandström, U. (2009). Cognitive bias in peer review: A new approach. In Proceedings of the 12th International Conference on Scientometrics and Informetrics, July 28–31, Rio de Janeiro, Brazil, 1–5.
Sandström, U., & Hallsten, M. (2008). Persistent nepotism in peer-review. Scientometrics, 74(2), 175–189.
Schultz, D.M. (2010a). Are three heads better than two? How the number of reviewers and editor behavior affect the rejection rate. Scientometrics, 84(2), 277–292.
Schultz, D.M. (2010b). Rejection rates for journals publishing in the atmospheric sciences. Bulletin of the American Meteorological Society, 91, 231–243.
Science and Technology Committee. (2011). Eighth report: Peer review in scientific publications. Retrieved from http://www.publications.parliament.uk/pa/cm201012/cmselect/cmsctech/856/85602.htm
Sen, C.K. (2012). Rebound peer review: A viable recourse for aggrieved authors? Antioxidants & Redox Signaling, 16(4), 293–296.
Sense About Science. (2009). Peer review survey 2009: The full report. London: Sense About Science. Retrieved from http://www.senseaboutscience.org/data/files/Peer_Review/Peer_Review_Survey_Final_3.pdf
Shapin, S. (1995). A social history of truth: Civility and science in seventeenth-century England. Chicago: University of Chicago Press.
Shatz, D. (2004). Peer review: A critical inquiry. Lanham, MD: Rowman & Littlefield.
Shea, C. (2011, November 13). Fraud scandal fuels debate over practices of social psychology. Chronicle of Higher Education. Retrieved from http://chronicle.com/article/As-Dutch-Research-Scandal/129746/
Sismondo, S. (2009). Ghosts in the machine: Publication planning in the medical sciences. Social Studies of Science, 39(2), 171–198.
Smith, J.A., Nixon, R., Bueschen, A.J., Venable, D.D., & Henry, H.H. (2002). The impact of blinded versus unblinded abstract review on scientific program content. Journal of Urology, 168(5), 2123–2125.
Smith, R. (1999). Opening up BMJ peer review: A beginning that should lead to complete transparency. British Medical Journal, 318, 4–5.
Smith, R. (2006). Peer review: A flawed process at the heart of science and journals. Journal of the Royal Society of Medicine, 99, 178–182.
Solomon, M. (2001). Social empiricism. Cambridge, MA: MIT Press.
Spier, R. (2002). The history of the peer-review process. Trends in Biotechnology, 20(8), 357–358.
Stanford, P.K. (2012). Getting what we pay for: Unconceived alternatives and historical changes in scientific inquiry. Unpublished manuscript, University of California, Irvine, Irvine, CA.
Steinpreis, R.E., Anders, K.A., & Ritzke, D. (1999). The impact of gender on the review of the curricula vitae of job applicants and tenure candidates: A national empirical study. Sex Roles, 41(7/8), 509–528.
Sugimoto, C.R., & Cronin, B. (in press). Citation gamesmanship: Testing for evidence of ego bias in peer review. Scientometrics.
Thurner, S., & Hanel, R. (2010). Peer-review in a world with rational scientists: Toward selection of the average. European Physical Journal B, 84(4), 707–711.
Timmer, J. (2008). Online peer review supplements, doesn't replace real thing. Ars Technica. Retrieved from http://arstechnica.com/old/content/2008/04/online-peer-review-supplements-doesnt-replace-real-thing.ars
Travis, G.D.L., & Collins, H.M. (1991). New light on old boys: Cognitive and institutional particularism in the peer review system. Science, Technology, & Human Values, 16(3), 322–341.
Tregenza, T. (2002). Gender bias in the refereeing process? Trends in Ecology & Evolution, 17(8), 349–350.
Tyler, T.R. (2006). Psychological perspectives on legitimacy and legitimation. Annual Review of Psychology, 57, 375–400.
Valkonen, L., & Brooks, J. (2011). Gender balance in Cortex acceptance rates. Cortex, 47, 763–770.
van Rooyen, S., Delamothe, T., & Evans, S.J.W. (2010). Effect on peer review of telling reviewers that their signed reviews might be posted on the web: Randomized controlled trial. British Medical Journal, 341, c5729.
van Rooyen, S., Godlee, F., Evans, S., Black, N., & Smith, R. (1999). Effect of open peer review on quality of reviews and on reviewers' recommendations: A randomized trial. British Medical Journal, 318(7175), 23–27.
van Rooyen, S., Godlee, F., Evans, S.J.W., Smith, R., & Black, N. (2010). Effect of blinding and unmasking on the quality of peer review. Journal of the American Medical Association, 280, 234–237.
Vines, T. (2010). The famous grouse: Do prominent scientists have biased perceptions of peer review? Scholarly Kitchen: What's hot and cooking in scholarly publishing. Retrieved from http://scholarlykitchen.sspnet.org/2012/02/01/the-famous-grouse-do-prominent-scientists-have-biased-perceptions-of-peer-review/
Vines, T., Rieseberg, L., & Smith, H. (2010). No crisis in supply of peer reviewers. Nature, 468(7327), 1041.
Walsh, J. (1975). NSF and its critics in Congress: New pressure on peer review. Science, 188(4192), 999–1001.
Walsh, E., Rooney, M., Appleby, L., & Wilkinson, G. (2000). Open peer review: A randomized controlled trial. British Journal of Psychiatry, 176, 47–51.
Ware, M., & Monkman, M. (2008). Peer review in scholarly journals: Perspective of the scholarly community. An international study. Publishing Research Consortium. Retrieved from http://www.publishingresearch.net/documents/PRCsummary4Warefinal.pdf
Webb, T.J., O'Hara, B., & Freckleton, R.P. (2008). Does double-blind review benefit female authors? Trends in Ecology and Evolution, 23(7), 351–353.
Wennerås, C., & Wold, A. (1997). Nepotism and sexism in peer-review. Nature, 387, 341–343.
Wessely, S. (1998). Peer review of grant applications: What do we know? Lancet, 352(9124), 301–305.
Whittaker, R.J. (2008). Journal review and gender equality: A critical comment on Budden et al. Trends in Ecology and Evolution, 23(9), 478–479.
Wing, D.A., Benner, R.S., Petersen, R., Newcomb, R., & Scott, J.R. (2010). Differences in editorial board reviewer behavior based on gender. Journal of Women's Health, 19(10), 1919–1923.
Wood, F.Q. (1997). The peer review process. Canberra, Australia: National Board of Employment, Education and Training.
Xie, Y., & Shauman, K.A. (1998). Sex differences in research productivity: New evidence about an old puzzle. American Sociological Review, 63(6), 847–870.
Yankauer, A. (1991). How blind is blind review? American Journal of Public Health, 81(7), 843–845.
Ziman, J. (2000). Real science: What it is, and what it means. Cambridge, UK: Cambridge University Press.
