Meeting the Challenges of Evidence-Based Policy: The Campbell Collaboration

By ANTHONY PETROSINO, ROBERT F. BORUCH, HALUK SOYDAN, LORNA DUGGAN, and JULIO SANCHEZ-MECA

ABSTRACT: Evidence-based policy has much to recommend it, but it also faces significant challenges. These challenges reside not only in the dilemmas faced by policy makers but also in the quality of the evaluation evidence. Some of these problems are most effectively addressed by rigorous syntheses of the literature known as systematic reviews. Other problems remain, including the range of quality in systematic reviews and their general failure to be updated in light of new evidence or disseminated beyond the research community. Based on the precedent established in health care by the international Cochrane Collaboration, the newly formed Campbell Collaboration will prepare, maintain, and make accessible systematic reviews of research on the effects of social and educational interventions. Through mechanisms such as rigorous quality control, electronic publication, and worldwide coverage of the literature, the Campbell Collaboration seeks to meet challenges posed by evidence-based policy.

Anthony Petrosino is a research fellow at the Center for Evaluation, American Academy of Arts and Sciences and coordinator for the Campbell Crime and Justice Group. Robert F. Boruch is University Trustee Professor at the University of Pennsylvania Graduate School of Education and cochair of the Campbell Collaboration Steering Group. Haluk Soydan is the director of the Center for Evaluation at the Swedish National Board of Health and Welfare and a cochair of the Campbell Collaboration Steering Group. Lorna Duggan is a forensic psychiatrist in the United Kingdom and a reviewer for the Cochrane Collaboration. Julio Sanchez-Meca is a psychology professor at the University of Murcia in Spain and the director of the Unit for Meta-Analysis.

NOTE: We appreciate the assistance of Maureen Matkovich and colleagues at the National Criminal Justice Reference Service who supplied the data in Figure 1. The first author’s work was supported by grants from the Mellon Foundation to the Center for Evaluation and from the Home Office to Cambridge University. Thanks to Brandon C. Welsh for his comments on earlier drafts of the article.


Downloaded from http://ann.sagepub.com at Vrije Universiteit 34820 on March 8, 2010 15

DONALD Campbell (1969) was an influential psychologist who wrote persuasively about the need for governments to take evaluation evidence into account in decisions about social programs. He also recognized, however, the limitations of the evidence-based approach and the fact that government officials would be faced with a number of political dilemmas that confined their use of research. The limits of evidence-based policy and practice, however, reside not only in the political pressures faced by decision makers when implementing laws and administrative directives or determining budgets; they also reside in problems with the research evidence.

Questions such as, What works to reduce crime in communities? are not easily answered. The studies that bear on these questions are often scattered across different fields and written in different languages, sometimes disseminated in obscure or inaccessible outlets, and can be of questionable quality such that interpretation is risky at best. How can evidence-based policy and practice be informed, if not persuaded, by such a fragmented knowledge base comprising evaluative studies that range in quality? Which study, or set of studies, if any at all, ought to be used to influence policy? What methods ought to be used to appraise and analyze a set of separate studies bearing on the same question? And how can the findings be disseminated in such a way that the very people Donald Campbell cared about, the decision makers in government and elsewhere, receive findings from these analyses that they trust were not the product of advocacy?

Donald Campbell unfortunately did not live long enough to bear witness to the creation of the international collaboration named in his honor that ambitiously attempts to address some of the challenges posed by evidence-based policy. The Campbell Collaboration was created to prepare, update, and disseminate systematic reviews of evidence on what works relevant to social and educational intervention (see http://campbell.gse.upenn.edu). The target audience will include decision makers at all levels of government, practitioners, citizens, media, and researchers.

This article begins with a discussion of the rationale for the Campbell Collaboration. We then describe the precedent established by the Cochrane Collaboration in health care. This is followed by an overview of the advent and early progress of the Campbell Collaboration. We conclude with the promise of the Campbell Collaboration in meeting the challenges posed by evidence-based policy and practice.

RATIONALE

Surge of interest in evidence-based policy

There are many influences on decisions or beliefs about what ought to be done to address problems like crime, illiteracy, and unemployment. Influential factors include ideology, politics, costs, ethics, social background, clinical experience, expert opinion, and anecdote (for example,
Lipton 1992). The evidence-based approach stresses moving beyond these factors to also consider the results of scientific studies. Although few writers have articulated the deterministic view that the term "evidence-based" suggests, it is clear that the vast majority of writers argue that decision makers need to, at the very least, be aware of the research evidence that bears on policies under consideration (for example, Davies, Nutley, and Smith 2000). Certainly the implicit or explicit goal of research-funding agencies has always been to influence policy through science (Weiss and Petrosino 1999), and there have always been individuals who have articulated the need for an evidence-based approach (for example, Fischer 1978). But there has been a surge of interest, particularly in the 1990s, in arguments for research-, science-, or evidence-based policy (for example, Amann 2000; Boruch, Petrosino, and Chalmers 1999; Nutley and Davies 1999; Wiles 2001).

One indirect gauge of this surge is the amount of academic writing on the topic. For example, in Sherman's (1999) argument for evidence-based policing, decisions about where to target police strategies would be based on epidemiological data about the nature and scope of problems. The kinds of interventions employed, and how long they were kept in place, would be guided by careful evaluative studies, preferably randomized field trials. Cullen and Gendreau (2000) and MacKenzie (2000) are among those who made similar arguments about correctional treatment. Davies (1999), Fitz-Gibbon (1999), MacDonald (1999), and Sheldon and Chilvers (2000), among others, articulated views about evidence-based education and social welfare.

A more persuasive indicator that evidence-based policy is having some impact is initiatives undertaken by governments since the late 1990s. Whether due to growing pragmatism or pressures for accountability on how public funds are spent, the evidence-based approach is beginning to take root. For example, the United Kingdom is vigorously promoting evidence-based policy in medicine and the social sectors (for example, Davies, Nutley, and Smith 2000; Wiles 2001). In 1997, its Labour government was elected using the slogan, "What counts is what works" (Davies, Nutley, and Smith 2000). The 1998 U.K. Crime Reduction Programme was greatly influenced by both the University of Maryland report to Congress on crime prevention (Sherman et al. 1997) and the Home Office's own syntheses (Nuttall, Goldblatt, and Lewis 1998). In Sweden, the National Board of Health and Welfare was commissioned by the government to draft a program for advancing knowledge in the social services to ensure they are evidence based (National Board of Health and Welfare 2001).

In the United States, the Government Performance and Review Act of 1993 was implemented to hold federal agencies responsible for identifying measurable objectives and reaching them. This has led to the development of performance indicators to assess whether there is value
added by agencies. The 1998 reauthorization of the Safe and Drug Free Schools and Communities Act required that programs funded under the law be research based. The news media now commonly ask why police are not using research-based eyewitness identification techniques (Gawande 2001) or why schools use ineffective drug prevention programs (for example, Cohn 2001).

All of these signs seem to indicate more than a passing interest in evidence-based policy. As Boruch (1997) noted, different policy questions require different types of scientific evidence. To implement the most effective interventions to ameliorate problems, careful evaluations are needed. An evidence-based approach to what works therefore requires that these evaluations be gathered, appraised, and analyzed and that the results be made accessible to influence relevant decisions whenever appropriate and possible.

Challenges to evidence-based policy: Evaluation studies

If evidence-based policy requires that we cull prior evaluation studies, researchers face significant challenges in doing so. For one, the relevant evaluations are not tidily reported in a single source that we can consult. Instead they are scattered across different academic fields. For example, medical, psychological, educational, and economic researchers routinely include crime measures as dependent variables in their studies (for example, Greenberg and Shroder 1997). These evaluations are as relevant as those reported in justice journals.

Coinciding with fragmentation, evaluation studies are not regularly published in academic journals or in outlets that are readily accessible. Instead a large percentage of evaluative research resides in what Sechrest, White, and Brown (1979) called the fugitive literature. These are government reports, dissertations and master's theses, conference papers, technical documents, and other literature that is difficult to obtain. Lipsey (1992), in his review of delinquency prevention and treatment studies, found 4 in 10 were reported in this literature. Although some may argue that unpublished studies are of lesser quality because they have not been subjected to blind peer review as journal articles are, this is an empirical question worthy of investigation. Such an assertion, at the very least, ignores the high-quality evaluations done by private research firms. Evaluators in such entities do not have organizational incentives to publish in peer-reviewed journals, as professors or university-based researchers do.

Relevant studies are not reported solely within the confines of the United States or other English-speaking nations. Recently the Kellogg Foundation supported an international project that has identified more than 30 national evaluation societies, including those in Brazil, Ghana, Korea, Sri Lanka, Thailand, and Zimbabwe (see http://home.wmis.net/~russon/icce//eorg.htm). One argument is that it is not important to consider evaluations conducted outside of one's jurisdiction because the cultural context will be very different. This is another assertion worthy of empirical test. Harlen (1997) noted that many in education believe evaluative studies and findings from different jurisdictions are not relevant to each other. This ignores the reality that interventions are widely disseminated across jurisdictions without concern for context. For example, the officer-led drug prevention program known as D.A.R.E. (Drug Abuse Resistance Education) is now in 44 nations (Weiss and Petrosino 1999). Harlen (1997) suggested that we must investigate the role of context across these evaluations. This is most effectively done through rigorous research reviews. Such reviews, however, are difficult with international literature without translation capabilities.

Another challenge to gathering evaluative studies is that there is no finite time by which the production of this evidence stops. Research, including evaluation, is cumulatively increasing (Boruch, Petrosino, and Chalmers 1999). An example is provided in Figure 1. Consider the cumulative growth of studies indexed either as evaluation or as evaluative study by the National Criminal Justice Reference Service for its database of abstracts. The data underscore the challenge faced in coping with the burgeoning evaluation literature.

FIGURE 1
CUMULATIVE GROWTH OF EVALUATION STUDIES: NATIONAL CRIMINAL JUSTICE REFERENCE SERVICE DATABASE

SOURCE: The National Criminal Justice Reference Service (www.ncjrs.org).

It would be good to identify and acquire all relevant evaluations and keep abreast of new studies as they become available. It would be even better if all evaluations were of
similar methodological quality and came to the same conclusion about the effectiveness of the intervention. Unfortunately not all evaluations are created equal. The results across studies of the same intervention will often differ, and sometimes those differences will be related to the quality of the methods used (see Weisburd, Lum, and Petrosino 2001). This highlights the importance of appraising evaluation studies for methodological quality.

Challenges to evidence-based policy: Reviewing methods

But what is the best way to draw upon these existing evaluation studies to understand what works and develop evidence-based policy? Certainly, relying on one or a few studies when others are available is very risky because it ignores evidence. For example, relying on one study if five relevant studies have been completed means that we ignore 80 percent of the evidence (Cook et al. 1992). It is true that the one study we pick may be representative of all the other studies, but as mentioned previously, studies in an area often conflict rather than converge. Evaluation studies themselves are part of a sampling distribution and may differ because of chance probability. Until we do a reasonable job of collecting those other studies, an assertion of convergence based on an inadequate sampling of studies is unsupported.

Criminologists have generally understood the problem of drawing conclusions from incomplete evidence and have a half century's experience in conducting broad surveys of the literature to identify relevant evaluations (for example, Bailey 1966; Kirby 1954; Lipton, Martinson, and Wilks 1975; Logan 1972; Witmer and Tufts 1954). Although a few of these earlier syntheses were remarkably exhaustive, the science of reviewing that developed in the 1970s focused attention on the methods used in reviews of research. Methods for analyzing separate but similar studies have a century of experience (Chalmers, Hedges, and Cooper in press), but it was not until the 1970s that reviews became scrutinized like primary reports of surveys and experiments. This was ironic, as some of the most influential and widely cited articles across fields were literature reviews (Chalmers, Hedges, and Cooper in press). Beginning in the 1970s, not only were the traditional reviews of evaluations under attack, but the modern statistical foundation for meta-analysis, or quantitative analysis of study results, was also being developed (for example, Glass, McGaw, and Smith 1981; Hedges and Olkin 1985). Research confirmed that traditional reviews, in which researchers make relative judgments about what works by using some unknown and inexplicit process of reasoning, were fraught with potential for bias (Cooper and Hedges 1994). Quinsey (1983) underscored how such bias could affect conclusions about research following his review of sex offender treatment effects: "The difference in recidivism across these studies is truly remarkable; clearly by selectively contemplating the various studies, one can conclude anything one wants" (101).
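The meta-analytic machinery cited above (Glass, McGaw, and Smith 1981; Hedges and Olkin 1985) can be sketched in a few lines of code. The sketch below is illustrative only: the five effect sizes and variances are invented, and the fixed-effect, inverse-variance model shown is just one approach in that tradition. Notice that no single hypothetical study clears the conventional two-tailed cutoff of z = 1.96 on its own, yet the pooled estimate does.

```python
import math

def fixed_effect_meta(effects, variances):
    """Pool study effect sizes with inverse-variance weights
    (a fixed-effect model in the Hedges-Olkin tradition)."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))
    return pooled, se

# Five hypothetical evaluations of one intervention: standardized
# mean differences and their variances (invented for illustration).
effects = [0.30, 0.10, -0.05, 0.25, 0.15]
variances = [0.04, 0.02, 0.05, 0.03, 0.01]

# Each study judged alone: none reaches |z| > 1.96 ("not significant").
study_z = [e / math.sqrt(v) for e, v in zip(effects, variances)]

# Pooled across studies, the estimate is far more precise.
pooled, se = fixed_effect_meta(effects, variances)
print(f"per-study z: {[round(z, 2) for z in study_z]}")
print(f"pooled = {pooled:.3f}, se = {se:.3f}, z = {pooled / se:.2f}")
```

Read one study at a time, a reviewer "selectively contemplating" the evidence could call this intervention a failure five times over; pooled, the studies point to a modest positive effect that chance alone is unlikely to produce.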


One major problem noted with regard to traditional reviews was their lack of explicitness about the methods used, such as why certain studies were included, the search methods used, and how the studies were analyzed. This includes the criteria used to judge whether an intervention was effective or not. Consider the debate over the conclusions in the Lipton, Martinson, and Wilks (1975) summary of more than 200 correctional program evaluations, briskly reported first by Martinson (1974). Despite finding that nearly half of the evaluations reported in Martinson's article had at least one statistically significant finding in favor of treatment, his overall conclusions were gloomy about the prospects of correctional intervention. The criterion for success was not readily apparent, but it must have been strict (Palmer 1975).

These earlier reviews, like Martinson's (1974), were also problematic because they seemed to rely on statistical significance as the criterion for judging whether an intervention was successful. This later proved to be problematic, as research showed that statistical significance is a function not only of the size of the treatment effect but of methodological factors such as sample size (for example, Lipsey 1990). For example, large and meaningful effects reported in studies with small samples would be statistically insignificant; the investigator and traditional reviewer would consider the finding evidence that treatment did not succeed. Given that most social science research uses small samples, moderate and important intervention effects have often been interpreted as statistically insignificant and therefore as treatment failures.

Systematic reviews

Evidence-based policy requires overcoming these and other problems with the evaluation studies and the methods for reviewing them. There is consensus among those who advocate evidence-based policy that systematic reviews are an important tool in this process (Davies 1999; Nutley, Davies, and Tilley 2000). In systematic reviews, researchers attempt to gather relevant evaluative studies, critically appraise them, and come to judgments about what works using explicit, transparent, state-of-the-art methods. Systematic reviews will include detail about each stage of the decision process, including the question that guided the review, the criteria for studies to be included, and the methods used to search for and screen evaluation reports. They will also detail how analyses were done and how conclusions were reached.

The foremost advantage of systematic reviews is that when done well and with full integrity, they provide the most reliable and comprehensive statement about what works. Such a final statement, after sifting through the available research, may be, "We know little or nothing; proceed with caution." This can guide funding agencies and researchers toward an agenda for a new generation of evaluation studies. This can also include feedback to funding agencies where additional process, implementation, and theory-driven studies would be critical to implement. Palmer (1994) noted that meta-analyses like Lipsey's (1992) helped to counter the prevailing pessimism about the efficacy of correctional treatment generated by earlier reviews.

Systematic reviews, therefore, are reviews in which rigorous methods are employed to summarize, analyze, and combine study findings, regardless of whether meta-analysis is undertaken. When meta-analysis is used, however, estimates of the average impact across studies, as well as how much variation there is and why, can be provided. Meta-analysis can generate clues as to why some programs are more effective in some settings and others are not. Meta-analysis is also critical in ruling out the play of chance when combining results (Hedges and Olkin 1985).

Systematic reviews have other byproducts. They can reconcile differences between studies. Because each study document is scrutinized, systematic reviews can underscore deficiencies in report writing and lead to better systems for collecting data required by reviewers. Reviews also ensure that relevant evaluations, which may have been ignored and long forgotten, are eternally utilized. It is satisfying to investigators to find their studies still considered 20 years or more after completion (Petrosino forthcoming).

Systematic reviews have been influential. This is understandable, as Weiss (1978) predicted that policy makers would find good syntheses of research compelling because they would reconcile conflicting studies when possible and provide a comprehensive resource for their aides to consult. Hunt (1997) discussed how the results from meta-analysis contradicted conclusions in earlier traditional reviews, in areas such as psychotherapy, class size, and school funding.

Challenges to evidence-based policy: Current systematic reviews

There seems to be growing convergence among researchers and others that systematic reviews are a critical tool for evidence-based policy (for example, Nutley, Davies, and Tilley 2000). This is reflected in the decision by the United Kingdom's most prestigious social science funding agency, the Economic and Social Research Council, to support an evidence-based policy and practice initiative featuring systematic reviews (see http://www.esrc.ac.uk/EBPesrcUKcentre.htm). On closer scrutiny, however, we find that there are some challenges to the use of systematic reviews in evidence-based policy as they are currently done.

One problem is that there is often a lack of transparency in the review process. Completed syntheses are generally submitted to peer-reviewed journals, long after the research question has been determined and the methods selected. Except for rare occasions in which reviewers submit a grant proposal for funding, researchers do not a priori describe why they are doing the review and what methods they will employ. Without transparent processes from beginning to end, ex post facto decisions that can influence a review and slant it knowingly or
unknowingly toward one conclusion or another are possible. This is especially important in persuading policy makers who want to be sure that research is not the product of slick advocacy.

Because there is no uniform quality control process, systematic reviews, like evaluations, range on a continuum of quality. In some cases, the quality is due to the methods employed. Some reviewers may use meta-analytic methods but inadequately describe their decision process. Other reviews may detail exhaustive search processes but then use questionable methods for analysis. It is difficult for even the discerning reader to know how trustworthy the findings are from a particular review. In other cases, the quality of the review is due to the way it is reported. Sometimes the nature of the outlet dictates how explicit and transparent the reviewers can be. Some reviews, particularly those prepared for academic print journals with concerns about page lengths, are briskly written. Dissertations and technical reports are usually very detailed but are less accessible to readers. Systematic reviews may all contain Materials and Methods sections, but the level of detail in each may vary depending on the dissemination outlet.

Policy makers and other interested users of research have a wide range of questions about what works to which they want answers. Although funding agencies will sponsor reviews at times to meet these information needs, reviews are generally conducted because of the interests of individual researchers. For example, in criminology, offender treatment has been a controversial and popular topic and the target of most review activity (Petrosino 2000). Other important areas for review such as police training, services for crime victims, court backlog interventions, and so on have been inadequately covered. Evidence-based policy requires that evaluations in these areas be synthesized, even if they are less relevant to longstanding criminological debates.

Even if reviews did cover many more questions than they currently do, they are often not disseminated in such ways that decision makers and the public can get them. All too often, reviews are published by academics in peer-reviewed journals, outlets that are not consulted by policy makers. In fact, decision makers often get their information about research from news media, which can selectively cover only a few of the thousands of evaluative studies relevant to crime and justice reported each year (Weiss and Singer 1988). Tyden (1996) wrote that publishing an academic paper to disseminate to policy makers was akin to shooting it over a wall, blindly, into water. The path to utilization by decision makers was haphazard at best.

To examine dissemination further, we analyzed the citations for 302 meta-analyses reported by Lipsey and Wilson (1993) of psychological and educational treatment studies. Nearly two-thirds listed in the reference section were published in academic journals. These were scattered across 93 journals during the years
covered (1977-1991). Only the Review of Educational Research published an average of one review or more per year. Unless researchers were using other unknown mechanisms such as oral briefings and internal memos to communicate to decision makers, it seems very unlikely that this evidence got into the hands of anyone other than research specialists working in narrow areas.

Most systematic reviews also tend to be one-off exercises, conducted only as funding, interest, or time permits. Rarely are they updated to take into account new studies that are relevant to the review, a challenge that is more significant given the cumulative growth of evaluation reports highlighted in Figure 1. Yet years may go by before an investigator pursues funding to update an existing review. The methodology and statistical foundation for meta-analysis is still rapidly evolving, with improved techniques and new software being developed to solve data problems. It is rare to find reviews that take into account these new techniques, conducting analyses to determine if results using different methods converge.

Some reviewers publish in print journals, an inefficient method for disseminating reviews. Because print journals find them too costly, cases in which reviewers take into account cogent criticisms by others and conduct reanalysis are rarely reported. Unlike medical journals, criminological journals do not have a strong tradition of routinely printing letters to the editor that respond to criticisms with additional analyses. Some journals also have lengthy lag times between submission and publication, delaying the dissemination of evidence even further.

THE COCHRANE COLLABORATION

Are there ways of overcoming challenges to using systematic reviews in evidence-based policy? A precedent for doing so was established in the health care field. Archie Cochrane was a noted epidemiologist who wrote persuasively about the need for medical practitioners to take scientific evidence into account in their practice. Cochrane (1972) lamented the fact that although randomized trials had shown some practices to be effective and others harmful, clinical practitioners and medical schools were ignoring the information. He later (Cochrane 1979) wondered why the medical sciences had not yet organized all relevant trials into subspecialties so that decision makers could take such evidence into account. A protégé of Cochrane, an obstetrician turned researcher named Iain Chalmers, soon identified and reviewed randomized trials relevant to childbirth and prenatal interventions (see www.cochrane.org).

In the early 1990s, the U.K. National Health Service (NHS), under the direction of Sir Michael Peckham, initiated the Research and Development Programme with the goal of establishing an evidence-based resource for health care. Because of the success of their earlier project on childbirth and pregnancy studies, Chalmers and his colleagues
were asked to extend this effort to all areas of health care intervention. The U.K. Cochrane Centre was established, with core funding from the NHS, to begin the work. It soon became clear that the amount of work far surpassed the capacity of one center or one nation. In 1993, in honor of his mentor, Chalmers and his colleagues launched the international Cochrane Collaboration to "help people make well-informed decisions about healthcare by preparing, maintaining and promoting the accessibility of systematic reviews of the effects of healthcare interventions." In just 8 years, the Cochrane Collaboration has been able to organize thousands of individuals worldwide to contribute to its work. Much more information about the Cochrane Collaboration can be found at its Web site, www.cochrane.org. But the Cochrane Collaboration, in a very brief time, established a number of mechanisms to address challenges to using systematic reviews in evidence-based health care policy.

For example, collaborative review groups (CRGs) are responsible for the core work of systematic reviewing. CRGs are international networks of individuals interested in particular health areas such as breast cancer, epilepsy, injuries, and stroke. Each CRG has an editorial board, generally comprising persons with scientific or practical expertise in the area, who are responsible for quality control of protocols (plans) and completed drafts of reviews.

It is useful to examine how a Cochrane review is prepared. First, individuals approach the CRG in which the intended topic area seems appropriate. Once a title for the proposed review is agreed on, it is circulated to ensure that no other similar reviews are being prepared by reviewers from other CRGs. Reducing overlap and duplication is a crucial goal for the Cochrane Collaboration, as scarce resources must be used judiciously. Reviews are needed in so many areas of health care that wide coverage is a priority. Once the title is agreed on, the reviewers must then submit a protocol for the review. The protocol is a detailed plan that spells out a priori the question to be answered, the background to the issue, and the methods to be employed. The protocol then goes through a round or two of criticism by the CRG editors. It must conform to a certain template to facilitate electronic publication using the Cochrane Collaboration's software, Review Manager, or RevMan. Once the protocol is approved, it is published in the next edition of the quarterly electronic publication, the Cochrane Library, and made available to all subscribers for comment and criticism. The editorial board must decide which criticisms should be taken into account.

The reviewers then prepare the review according to the protocol. Although deviation from the plan is sometimes necessary, the protocol forces a prospective, transparent process. Post hoc changes are readily detected, and analyses can be done to determine if they altered findings. After the reviewers conduct the review and write up a draft, it too is submitted to the CRG editorial board. Once the review draft is
completed, the editors critique it again. It is also sent to external readers, including researchers as well as practitioners and patients. This round of criticism is designed to improve the methods in the review and to ensure that the final review is written as accessibly as possible to a nonresearch audience, including health care patients, providers, and citizens. For each completed Cochrane review, the Cochrane Consumer Network crafts a one-page synopsis written accessibly for patients and other consumers and posts the synopses at its Web site (see http://www.cochrane.org/cochrane/consumer.htm). Once the review is approved, it is published in the next issue of the Cochrane Library and again made available for external criticism by subscribers. Again the editors and the reviewers have to determine which of these criticisms ought to be taken into account in a subsequent review update. Cochrane reviews must be updated every 2 years, to take into account new studies meeting eligibility criteria.

Another important mechanism for the Cochrane Collaboration is the methods groups. These are international networks of individuals who conduct systematic reviews focused on the methods used in systematic reviews and primary studies. For example, a methods group might collect all systematic reviews in which randomized trials are compared to nonrandomized trials. In this review, they would seek to determine if there is a consistent relationship between the reporting of random assignment and results. Their objective is to make sure that decisions in reviews, such as setting eligibility criteria, be informed as much as possible by evidence.

The Cochrane Collaboration is also facilitated by 15 centers around the world; they promote the interests of the collaboration within host countries, train people in doing systematic reviews, and identify potential collaborators and end users. Finally, Cochrane fields and networks, such as the Cochrane Consumer Network, focus on dimensions of health care other than health problems and work to ensure that their priorities are reflected in systematic reviews.

The main product of the Cochrane Collaboration is the Cochrane Library. This electronic publication is updated quarterly and made available via the World Wide Web or through CD-ROMs mailed to subscribers. The January 2001 issue contained 1000 completed reviews and 832 protocols (or plans for a review) in one central location using the same format. Wolf (2000) noted that the uniformity allows the reader to understand and find all of the necessary information in each review, facilitating training and use. Another important feature of the Cochrane Library is the Cochrane Controlled Trials Register (CCTR). The CCTR has more than a quarter million citations to randomized trials relevant to health care, an important resource in assisting reviewers find studies so they can prepare and maintain their reviews.

Empirical studies have reported that Cochrane syntheses are more rigorous than non-Cochrane systematic reviews and meta-analyses pub-
lished in medical journals. For example, Jadad and his colleagues (1998) found that Cochrane reviews provided more detail, were more likely to test for methodological effects, were less likely to be restricted by language barriers, and were updated more than print journal reviews. The Cochrane Library is quickly becoming recognized as the best single source of evidence on the effectiveness of health care interventions (Egger and Davey-Smith 1998). Reviews by the Cochrane Collaboration are frequently used to generate and support guidelines by government agencies such as the National Institute for Clinical Excellence (for example, see Chalmers, Hedges, and Cooper in press). In 1999, the U.S. National Institutes of Health made the Cochrane Library available to all 16,000 of its employees. It is now accessible by all doctors in Brazil, the U.K. NHS, and all U.K. universities (Mark Starr, personal communication, 2001). Finally, the queen recognized Iain Chalmers for his efforts with the United Kingdom's greatest honor: knighthood!

Thus the Cochrane Collaboration has been able to meet many of the challenges posed by evidence-based policy in health care. By requiring detailed protocols, the Cochrane Collaboration addresses the lack of transparency in most systematic reviews of research. Through rigorous quality control, they produce commendable reviews. By publishing electronically, dissemination is quickened, and the ability to update and correct the reviews in light of new evidence is realized. By providing an unbiased, single source for evidence and producing reviews, abstracts, and synopses for different audiences, they facilitate utilization.

THE CAMPBELL COLLABORATION

With the success of the Cochrane Collaboration, the same type of organization was soon suggested for reviewing social and educational evaluations. Adrian Smith (1996), president of the Royal Statistical Society, issued a challenge when he said,

As ordinary citizens ... we are, through the media, confronted daily with controversy and debate across a whole spectrum of public policy issues. Obvious topical examples include education-what does work in the classroom?-and penal policy-what is effective in reducing reoffending? Perhaps there is an opportunity ... to launch a campaign directed at developing analogues to the Cochrane Collaboration, to provide suitable evidence bases in other areas besides medicine [emphasis added]. (378)

A number of individuals across different fields and professions organized and met to determine how best to meet this challenge. Several exploratory meetings were held during 1999, including two headed by the School of Public Policy at University College-London, and one organized in Stockholm by the National Board of Health and Welfare. These meetings, which included researchers and members of the policy and practice communities, provided evidence that the development of an infrastructure
similar to Cochrane's for social and educational intervention, including criminal justice, should be vigorously pursued (Davies, Petrosino, and Chalmers 1999; www.ucl.ac.uk/spp/publications/campbell.htm).

TABLE 1
PRINCIPLES OF THE CAMPBELL COLLABORATION
SOURCE: C2 Steering Group (2001).
Early days and progress

The Campbell Collaboration was officially inaugurated in February 2000 at a meeting in Philadelphia, with more than 80 individuals from 12 nations participating. The Campbell Collaboration was founded on nine principles developed first by the Cochrane Collaboration (see Table 1). At the February 2000 inaugural meeting, it was agreed that the headquarters (secretariat) should reside at the University of Pennsylvania. An international eight-member steering group was officially designated to guide its early development.

The first three Campbell coordinating groups (similar to Cochrane's CRGs) were created to facilitate systematic reviews in their areas: education, social welfare, and crime and justice. The Campbell Education Coordinating Group is focused on developing protocols and reviews in the following critical areas: truancy, mathematics learning, science learning, information technology learning, work-related learning and transferable skills, assessment and learning, comprehensive school reform, school leadership and management, professional education, and economics and education. The Campbell Social Welfare Coordinating Group has also organized itself into several areas: social work, transportation, housing, social casework with certain ethnic clientele, child welfare, and employment programs within the welfare system. The early progress of the Campbell Crime and Justice Coordinating Group is described elsewhere in this issue (see Farrington and Petrosino 2001).

The Campbell Collaboration Methods Group was developed to increase the precision of Campbell reviews by conducting reviews to investigate the role of methodological and statistical procedures used in systematic reviews, as well as characteristics in original studies (see http://web.missouri.edu/~c2method). Three methods subgroups were created during the past year, including statistics, quasi-experiments, and

FIGURE 2
C2-SPECTR

SOURCE: Petrosino et al. (2000).

process and implementation subgroups. In conjunction with the Campbell secretariat and the coordinating groups, the methods group has taken the lead in developing a preliminary quality control process based on the Cochrane model (see Appendix A). The Campbell Communication and Dissemination Group will develop best practice in translating results to a variety of end users, including policy makers, practitioners, media, and the public.

To facilitate the work of reviewers, the Campbell Collaboration Social, Psychological, Educational and Criminological Trials Register (C2-SPECTR) is in development. As Figure 2 shows, preliminary work toward C2-SPECTR has already identified more than 10,000 citations to randomized or possibly randomized trials (Petrosino et al. 2000), and this has now been augmented with new additions to the data file. Like the CCTR in health care, C2-SPECTR should serve as a productive resource and facilitate preparation and maintenance of reviews. Plans to build a complementary database of nonrandomized evaluations are being discussed.

During 2000, Campbell and Cochrane groups mutually participated in the NHS Wider Public Health Project (see www.york.ac.uk/
crd/publications/wphp.htm), an attempt to collate evidence from systematic reviews relevant to the United Kingdom's intended health policies. As the NHS now considers education, social welfare, and criminal justice directly or indirectly related to public health, Campbell groups were involved (for example, Petrosino 2000). Several hundred systematic-or possibly systematic-reviews were identified in these areas and can now be used to help map the terrain and identify target areas where high-quality reviews are needed.

Funding from the Ministry of Social Affairs of Denmark has been acquired to establish a Campbell Center in Copenhagen to facilitate efforts in the Nordic region. Resources have also been secured to create the Meta-Analysis Unit at the University of Murcia in Spain (see http://www.um.es/sip/unidadmal.html), the first step in developing a Campbell Center for Mediterranean nations. Objectives of these centers include facilitating reviews through training, identifying end users and collaborators, and promoting dissemination and utilization.

These are just a few of the many developments in the early days of the Campbell Collaboration. Although the collaboration anticipates creating an electronic publication that will make available C2-SPECTR and other helpful resources, its critical product will be high-quality systematic reviews. For the Campbell Collaboration to achieve the kind of success Cochrane has obtained in the health care field, it will have to ensure that these reviews are as unbiased and technically sound as possible. Appendix B provides a preliminary model of stages in a Campbell review.

CONCLUSION

Donald Campbell articulated an evidence-based approach before the methods of systematic reviewing and meta-analysis became staples of empirical inquiry. Still we think he would be pleased with the efforts of hundreds of individuals who are working worldwide to advance the international collaboration that bears his name.

The challenges of evidence-based policy are many. We think the Campbell Collaboration will help to meet some of them. For example, with rigorous quality control and protocols, the Campbell Collaboration will attempt to produce the same level of transparency with unbiased reviews for which the Cochrane Collaboration is lauded. Through electronic publication of reviews in a single source (that is, the Campbell Library), it will attempt to extend beyond communicating with researchers and facilitate dissemination and utilization by decision makers and ordinary citizens. Maintaining reviews, taking into account evidence worldwide, and preparing reviews using the best science available should enhance the use of Campbell reviews.

Systematic reviews are certainly an important tool for evidence-based policy, but like Campbell before us, we do not wish to zealously oversell scientific evidence. Systematic reviews will not resolve all enduring political and academic conflicts, nor
will they often-if at all-provide neat and tidy prescriptions to decision makers on what they ought to do. In their proper role in evidence-based policy, they will enlighten by explicitly revealing what is known from scientific evidence and what is not. They will also generate more questions to be resolved. One unanticipated benefit in the short life of the Campbell Collaboration is the sustained forum it provides for discussions about evaluation design and how to engage people from around the world.

Criminology is a noble profession because it aims to reduce the misery stemming from crime and injustice. To the extent that this collaboration can fulfill Don Campbell's vision of assisting people in making well-informed decisions, it will help criminologists stay true to criminology's original and noble intent.
APPENDIX A
A FLOW CHART OF THE STEPS IN THE CAMPBELL QUALITY CONTROL PROCESS
SOURCE: C2 Steering Group (2001).

APPENDIX B
STAGES OF A CAMPBELL SYSTEMATIC REVIEW
SOURCE: C2 Steering Group (2001).

References

Amann, Ron. 2000. Foreword. In What Works? Evidence-Based Policy and Practice in Public Services, ed. Huw T. O. Davies, Sandra Nutley, and Peter C. Smith. Bristol, UK: Policy Press.
Bailey, William C. 1966. Correctional Outcome: An Evaluation of 100 Reports. Journal of Criminal Law, Criminology and Police Science 57:153-60.
Boruch, Robert F. 1997. Randomized Experiments for Planning and Evaluation: A Practical Guide. Thousand Oaks, CA: Sage.
Boruch, Robert F., Anthony Petrosino, and Iain Chalmers. 1999. The Campbell Collaboration: A Proposal for Multinational, Continuous and Systematic Reviews of Evidence. In Proceedings of the International Meeting on Systematic Reviews of the Effects of Social and Educational Interventions, July 15-16, ed. Philip Davies, Anthony Petrosino, and Iain Chalmers. London: University College-London, School of Public Policy.
Campbell, Donald T. 1969. Reforms as Experiments. American Psychologist 24:409-29.
Chalmers, Iain, Larry V. Hedges, and Harris Cooper. In press. A Brief History of Research Synthesis. Evaluation & the Health Professions.
Cochrane, Archie L. 1972. Effectiveness and Efficiency: Random Reflections on Health Services. London: Nuffield Provincial Hospitals Trust.
—. 1979. 1931-1971: A Critical Review, with Particular Reference to the Medical Profession. In Medicines for the Year 2000. London: Office of Health Economics.
Cohn, Jason. 2001. Drug Education: The Triumph of Bad Science. Rolling Stone 869:41-42, 96.
Cook, Thomas D., Harris Cooper, David S. Cordray, Heidi Hartmann, Larry V. Hedges, Richard J. Light, Thomas A. Louis, and Frederick Mosteller, eds. 1992. Meta-Analysis for Explanation. New York: Russell Sage.
Cooper, Harris C. and Larry V. Hedges, eds. 1994. The Handbook of Research Synthesis. New York: Russell Sage.
C2 Steering Group. 2001. C2: Concept, Status, Plans. Overarching Grant Proposal. Philadelphia: Campbell Collaboration Secretariat.
Cullen, Francis T. and Paul Gendreau. 2000. Assessing Correctional Rehabilitation: Policy, Practice, and Prospects. In Policies, Processes, and Decisions of the Criminal Justice System, Criminal Justice 2000, vol. 3, ed. Julie Horney. Washington, DC: National Institute of Justice.
Davies, Huw T. O., Sandra M. Nutley, and Peter C. Smith, eds. 2000. What Works? Evidence-Based Policy in Public Services. Bristol, UK: Policy Press.
Davies, Phillip. 1999. What Is Evidence-Based Education? British Journal of Educational Studies 47:108-21.
Davies, Philip, Anthony Petrosino, and Iain Chalmers, eds. 1999. Proceedings of the International Meeting on Systematic Reviews of the Effects of Social and Educational Interventions, July 15-16. London: University College-London, School of Public Policy.
Egger, Matthias and George Davey-Smith. 1998. Bias in Location and Selection of Studies. British Medical Journal 316:61-66.
Farrington, David P. and Anthony Petrosino. 2001. The Campbell Collaboration Crime and Justice Group. Annals of the American Academy of Political and Social Science 578:35-49.
Fischer, Joel. 1978. Does Anything Work? Journal of Social Service Research 1:215-43.
Fitz-Gibbon, Carol. 1999. Education: The High Potential Not Yet Realized. Public Money & Management 19:33-40.
Gawande, Atul. 2001. Under Suspicion: The Fugitive Science of Criminal Justice. New Yorker, 8 Jan., 50-53.
Glass, Gene V., Barry McGaw, and Mary L. Smith. 1981. Meta-Analysis in Social Research. London: Sage.
Greenberg, David and Mark Shroder. 1997. The Digest of Social Experiments. 2d ed. Washington, DC: Urban Institute Press.
Harlen, Wynne. 1997. Educational Research and Educational Reform. In The Role of Research in Mature Educational Systems, ed. Seamus Hegarty. Slough, UK: National Foundation for Educational Research.
Hedges, Larry V. and Ingram Olkin. 1985. Statistical Methods for Meta-Analysis. New York: Academic Press.
Hunt, Morton. 1997. The Story of Meta-Analysis. New York: Russell Sage.
Jadad, Alejandro R., Deborah J. Cook, Alison Jones, Terry P. Klassen, Peter Tugwell, Michael Moher, and David Moher. 1998. Methodology and Reports of Systematic Reviews and Meta-Analyses: A Comparison of Cochrane Reviews with Articles Published in Paper-Based Journals. Journal of the American Medical Association 280:278-80.
Kirby, Bernard C. 1954. Measuring Effects of Criminals and Delinquents. Sociology and Social Research 38:368-74.
Lipsey, Mark W. 1990. Design Sensitivity: Statistical Power for Experimental Research. Newbury Park, CA: Sage.
—. 1992. Juvenile Delinquency Treatment: A Meta-Analytic Inquiry into the Variability of Effects. In Meta-Analysis for Explanation, ed. Thomas D. Cook, Harris Cooper, David S. Cordray, Heidi Hartmann, Larry V. Hedges, Richard J. Light, Thomas A. Louis, and Frederick Mosteller. New York: Russell Sage.
Lipsey, Mark W. and David B. Wilson. 1993. The Efficacy of Psychological, Educational and Behavioral Treatment: Confirmation from Meta-Analysis. American Psychologist 48:1181-209.
Lipton, Douglas S. 1992. How to Maximize Utilization of Evaluation Research by Policymakers. Annals of the American Academy of Political and Social Science 521:175-88.
Lipton, Douglas S., Robert Martinson, and Judith Wilks. 1975. The Effectiveness of Correctional Treatment. New York: Praeger.
Logan, Charles H. 1972. Evaluation Research in Crime and Delinquency: A Reappraisal. Journal of Criminal Law, Criminology and Police Science 63:378-87.
MacDonald, Geraldine. 1999. Evidence-Based Social Care: Wheels Off the Runway? Public Money & Management 19:25-32.
MacKenzie, Doris Layton. 2000. Evidence-Based Corrections: Identifying What Works. Crime & Delinquency 46:457-71.
Martinson, Robert. 1974. What Works? Questions and Answers About Prison Reform. Public Interest 10:22-54.
National Board of Health and Welfare. 2001. A Programme of National Support for the Advancement of Knowledge in the Social Services. Stockholm: Author.
Nutley, Sandra and Huw T. O. Davies. 1999. The Fall and Rise of Evidence in Criminal Justice. Public Money & Management 19:47-54.
Nutley, Sandra, Huw T. O. Davies, and Nick Tilley. 2000. Letter to the Editor. Public Money & Management 20:3-6.
Nuttall, Christopher, Peter Goldblatt, and Chris Lewis, eds. 1998. Reducing Offending: An Assessment of Research Evidence on Ways of Dealing with Offending Behaviour. London: Home Office.
Palmer, Ted. 1975. Martinson Revisited. Journal of Research in Crime and Delinquency 12:133-52.
—. 1994. A Profile of Correctional Effectiveness and New Directions for Research. Albany: State University of New York Press.
Petrosino, Anthony. 2000. Crime, Drugs and Alcohol. In Evidence from Systematic Reviews of Research Relevant to Implementing the "Wider Public Health" Agenda, Contributors to the Cochrane Collaboration and the Campbell Collaboration. York, UK: NHS Centre for Reviews and Dissemination.
—. Forthcoming. What Works to Reduce Offending? A Systematic Review of 300 Randomized Field Trials in Crime Reduction. New York: Oxford University Press.
Petrosino, Anthony, Robert F. Boruch, Cath Rounding, Steve McDonald, and Iain Chalmers. 2000. The Campbell Collaboration Social, Psychological, Educational and Criminological Trials Register (C2-SPECTR) to Facilitate the Preparation and Maintenance of Systematic Reviews of Social and Educational Interventions. Evaluation Research in Education 14:293-307.
Quinsey, Vernon L. 1983. Prediction of Recidivism and the Evaluation of Treatment Programs for Sexual Offenders. In Sexual Aggression and the Law, ed. Simon N. Verdun-Jones and A. A. Keltner. Burnaby, Canada: Simon Fraser University Criminology Research Centre.
Sechrest, Lee, Susan O. White, and Elizabeth D. Brown, eds. 1979. The Rehabilitation of Criminal Offenders: Problems and Prospects. Washington, DC: National Academy of Sciences.
Sheldon, Brian and Roy Chilvers. 2000. Evidence-Based Social Care: A Study of Prospects and Problems. Dorset, UK: Russell House.
Sherman, Lawrence W. 1999. Evidence-Based Policing. In Ideas in American Policing. Washington, DC: Police Foundation.
Sherman, Lawrence W., Denise C. Gottfredson, Doris Layton MacKenzie, John E. Eck, Peter Reuter, and Shawn D. Bushway. 1997. Preventing Crime: What Works, What Doesn't, What's Promising. Washington, DC: U.S. Department of Justice, National Institute of Justice.
Smith, Adrian. 1996. Mad Cows and Ecstasy: Chance and Choice in an Evidence-Based Society. Journal of the Royal Statistical Society 159:367-83.
Tyden, Thomas. 1996. The Contribution of Longitudinal Studies for Understanding Science Communication and Research Utilization. Science Communication 18:29.
Weisburd, David, Cynthia M. Lum, and Anthony Petrosino. 2001. Does Research Design Affect Study Outcomes? Findings from the Maryland Report Criminal Justice Sample. Annals of the American Academy of Political and Social Science 578:50-70.
Weiss, Carol Hirschon. 1978. Improving the Linkage Between Social Research and Public Policy. In Knowledge and Policy: The Uncertain Connection, ed. L. E. Lynn. Washington, DC: National Academy of Sciences.
Weiss, Carol Hirschon and Anthony Petrosino. 1999. Improving the Use of Research: The Case of D.A.R.E. Unpublished grant proposal to the Robert Wood Johnson Foundation, Substance Abuse Policy Research Program. Cambridge, MA: Harvard Graduate School of Education.
Weiss, Carol Hirschon and Eleanor Singer. 1988. The Reporting of Social Science in the National Media. New York: Russell Sage.
Wiles, Paul. 2001. Criminology in the 21st Century: Public Good or Private Interest? Paper presented at the meeting of the Australian and New Zealand Society of Criminology, University of Melbourne, Australia, Feb.
Witmer, Helen L. and Edith Tufts. 1954. The Effectiveness of Delinquency Prevention Programs. Washington, DC: U.S. Department of Health, Education and Welfare, Social Security Administration, Children's Bureau.
Wolf, Frederic M. 2000. Lessons to Be Learned from Evidence-Based Medicine: Practice and Promise of Evidence-Based Medicine and Evidence-Based Education. Medical Teacher 22:251-59.
