PATIENT-CENTERED OUTCOMES RESEARCH INSTITUTE FINAL RESEARCH REPORT

Testing a New Software Program for Data Abstraction in Systematic Reviews

Tianjing Li, MD, MHS, PhD1; Ian J. Saldanha, MBBS, MPH, PhD2; Jens Jap, BA2; Joseph Canner, MHS3; Christopher H. Schmid, PhD4; on behalf of the Data Abstractor Assistant investigators

1 Center for Clinical Trials and Evidence Synthesis, Department of , Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland
2 Center for Evidence Synthesis in Health, Department of Health Services, Policy, and Practice, Brown University School of Public Health, Providence, Rhode Island
3 Center for Outcomes Research, Department of Surgery, Johns Hopkins School of Medicine, Baltimore, Maryland
4 Center for Evidence Synthesis in Health, Department of Biostatistics, Brown University School of Public Health, Providence, Rhode Island

Institution Receiving the PCORI Award:
Original Project Title: Develop, Test, and Disseminate a New Technology to Modernize Data Abstraction in Systematic Reviews
PCORI ID: ME-1310-07009
HSRProj ID: HSRP20152269

To cite this document, please use: Li T, Saldanha IJ, Jap J, Canner J, Schmid CH; Data Abstractor Assistant (DAA) Investigators. (2020). Testing a New Software Program for Data Abstraction in Systematic Reviews. Patient-Centered Outcomes Research Institute (PCORI). https://doi.org/10.25302/04.2020.ME.131007009

TABLE OF CONTENTS

ABSTRACT ...... 4
BACKGROUND ...... 6
Figure 1. Steps in completing a systematic reviewa ...... 6
Specific Aims ...... 8
PARTICIPATION OF PATIENTS AND OTHER STAKEHOLDERS ...... 10
Impact of Stakeholder Engagement on Project ...... 10
METHODS ...... 12
Aim 1: Developing DAA ...... 12
Aim 2: Conducting a Randomized Controlled Trial to Evaluate DAA ...... 14
Table 1. Assignment of 24 Pairs of Data Abstractors to 6 Sequences and 48 Articlesa ...... 17
Figure 2. Screenshot from the Baseline Tab of a data abstraction form used during the DAA trial ...... 20
Aim 3: Disseminating the Study Findings ...... 24
RESULTS ...... 25
Aim 1: Developing DAA ...... 25
Figure 3. Screenshot showing how DAA displays the source document in HTML format (right) adjacent to the data abstraction form in the data abstraction system (SRDR, left)a ...... 26
Aim 2: Conducting a Randomized Controlled Trial to Evaluate DAA ...... 27
Figure 4. Participant flow during the DAA trial ...... 27
Table 2. Baseline Characteristics of All 52 Participants in the DAA Trial ...... 28
Table 3. Baseline Characteristics of All 52 Participants in the DAA Trial by Level of Experience With Data Abstraction ...... 30
Table 4. Proportion of Errors by Data Abstraction Approach, Type of Error, Type of Data Item, and Topic ...... 33
Table 4. Proportion of Errors by Data Abstraction Approach, Type of Error, Type of Data Item, and Systematic Review Topic (cont’d) ...... 34
Table 5. Proportion of Errors Across All Approaches, by Type of Error, Type of Data Abstracted, and Systematic Review Topic ...... 36
Table 6. Between-Approach Comparisons of Error Proportions by Type of Data Abstracteda ...... 37
Table 7. Auto-recorded Time Spent (in minutes) by Data Abstraction Approach, Type of Data Item, and Systematic Review Topic ...... 39
Table 8. Self-recorded Time (in minutes) Spent by Data Abstraction Approach, Step of Data Abstraction, and Systematic Review Topic ...... 41
Table 9. Self-recorded Time (in minutes) Spent Across All Approaches, by Step of Data Abstraction and Systematic Review Topic ...... 42
Table 10. Between-Approach Comparisons of Auto-recorded Time by Type of Data Abstracteda ...... 44
Table 11. Between-Approach Comparisons of Self-recorded Time Across All Topicsa ...... 44
Aim 3. Disseminating the Study Findings ...... 46
Table 12. Considerations When Selecting Data Abstraction Approaches During Systematic Reviews ...... 47
DISCUSSION ...... 49
Error Proportions Observed and Context for Study Results ...... 49
Differences in Error Proportions and Time Among Data Abstraction Approaches ...... 50
Possible Reasons for Higher Error Proportions With DAA ...... 51
Subpopulation Considerations ...... 51
Value of Using DAA and Implications for Future Research ...... 51
Challenges With Independent Dual Data Abstraction Plus Adjudication ...... 52
Implications and Uptake of Study Results ...... 53
Study Limitations and Strengths ...... 53
CONCLUSIONS ...... 55
REFERENCES ...... 56
RELATED PUBLICATIONS ...... 59
In Preparation ...... 59
Published ...... 59
ACKNOWLEDGMENTS ...... 60
APPENDICES ...... 61
Appendix 1: Published paper describing the technical details of DAA ...... 61
Appendix 2: Survey instrument ...... 75
Appendix 3: Summary of survey responses, by level of experience with data abstraction ...... 78
Appendix 4: Published paper describing the DAA trial protocol ...... 80


ABSTRACT

Background: When performing systematic reviews, data abstraction, a predominantly manual process, is labor intensive and error prone. Current standards for abstraction rest on a weak evidence base.

Objectives:

Aim 1. Develop Data Abstraction Assistant (DAA), a software tool to identify and track the location of data in articles and to automatically enter data into the Systematic Review Data Repository.

Aim 2. Conduct a randomized controlled trial to evaluate the comparative effectiveness of 3 approaches—(A) DAA-facilitated single data abstraction plus verification, (B) single data abstraction plus verification, and (C) independent dual data abstraction plus adjudication—on the accuracy and efficiency of data abstraction.

Aim 3. Disseminate DAA, study findings, and a decision tool to enable users to better understand the trade-offs between accuracy and efficiency when selecting abstraction approaches during systematic reviews.

Methods: For aim 1, we designed DAA to be a user-friendly platform that would indicate the source of abstracted data and be compatible with various data abstraction systems. We surveyed early users of DAA regarding its user-friendliness. For aim 2, we conducted an online, randomized, crossover trial with 26 pairs of data abstractors. Pairs abstracted data from 6 articles, 2 under each approach. Outcomes were (1) proportion of data items abstracted constituting an error (compared with an answer key), and (2) time taken to complete abstraction. For aim 3, we disseminated DAA to various stakeholders.

Results:

Aim 1. Using DAA, abstractors flag specific locations in source documents, thereby creating potentially permanent linkages between abstracted information and its source. When users click on existing flags, DAA scrolls the screen to the exact highlighted location of the source text. Among the 52 surveyed early users of DAA, 83% reported that using DAA was very or somewhat easy; 71% were very or somewhat likely to use DAA; and 87% were very or somewhat likely to recommend DAA to others.

Aim 2. Although overall mean error proportions were similar among the 3 approaches (A, 17%; B, 16%; C, 15%), A was associated with 8% higher odds of errors than B (odds ratio [OR], 1.08; 95% CI, 0.99-1.17) and 12% higher odds of errors than C (OR, 1.12; 95% CI, 1.03-1.22). Approach A had more errors in data items related to study outcomes or results (41%) than approaches B (36%; OR, 1.30; 95% CI, 1.09-1.56) and C (31%; OR, 1.52; 95% CI, 1.27-1.82). Approach A took 20 minutes more (95% CI, 1-40 minutes) to implement than B and 46 minutes less than C (95% CI, 26-66 minutes).


Aim 3. We published manuscripts, made conference presentations, and developed considerations for selecting from available abstraction approaches.

Conclusions: Our findings suggest independent dual abstraction is necessary for outcomes and results data; a verification approach is sufficient for other data. By linking abstracted data with their exact source, DAA provides an audit trail crucial for reproducible research. Reviewers should choose their data abstraction approach on the basis of the inevitable trade-off between saving time and minimizing errors.

Limitations: Currently, DAA is limited to flagging entire lines (not individual words) and cannot flag image-based text. The error proportions, although consistent with those reported in previous studies, might have been inflated due to abstractors’ unfamiliarity with DAA and with the review topics selected for abstraction.


BACKGROUND

Systematic reviews are research studies in which explicit methods are used to identify, appraise, and synthesize the research evidence addressing a research question.1 The steps in completing a systematic review include preparing the topic and formulating the research question, searching for studies, screening studies for inclusion, abstracting data from relevant individual studies, analyzing the data, synthesizing the evidence, and reporting the findings (Figure 1).2 The validity of the systematic review findings is contingent on collecting accurate and complete data from reports of relevant studies, a process known as data abstraction (or data extraction).

Figure 1. Steps in completing a systematic reviewa

aAdapted from Wallace et al 2013.3

As a predominantly manual process, data abstraction is inefficient, being both labor intensive and error prone. Errors during data abstraction are common and have been well documented in the literature.4-6 Buscemi et al4 estimated that the proportion of errors, which they defined as “any small discrepancy from the reference standard,” was approximately 30% for single abstraction, regardless of the level of data abstractor experience. Abstraction errors occur when data abstractors either fail to abstract or incorrectly abstract information present in the article. When Gøtzsche et al5 examined 27 meta-analyses (ie, statistical combinations of abstracted results from studies) across a range of topics, they were unable to replicate the results of 37% of the meta-analyses. In another study, Jones et al6 documented abstraction errors in 20 of 42 systematic reviews (48%); in all cases, the errors changed the summary meta-analytic results, although none changed the systematic review conclusions.

Current recommended approaches for reducing errors in data abstraction fall into 2 categories: (1) abstraction by 1 person, followed by checking of the abstraction by a second person (ie, single abstraction plus verification); and (2) independent abstraction by 2 people followed by resolution of any discrepancies (ie, independent dual abstraction plus adjudication). Buscemi et al4 found an absolute error proportion of 17.7% for single abstraction plus verification and 14.5% for independent dual abstraction plus adjudication (an absolute difference of 3.2% and a relative difference of 21.7%), but the independent dual abstraction plus adjudication approach took approximately 50% longer.

To our knowledge, only the Buscemi et al study4 has examined the trade-offs between single abstraction plus verification and independent dual abstraction plus adjudication, and that study focused on a single systematic review topic with only 4 data abstractors; therefore, current standards for data abstraction rest on a weak evidence base. Major sponsors and producers of systematic reviews (eg, Agency for Healthcare Research and Quality [AHRQ] Evidence-based Practice Centers [EPCs], Centre for Reviews and Dissemination [CRD]) and organizations that develop methodology standards for systematic reviews (eg, AHRQ, Cochrane, Institute of Medicine [IOM; now named National Academy of Medicine]) have made inconsistent recommendations for approaches to reducing errors in data abstraction.1,2,7,8 Because “so little is known about how best to optimize accuracy and efficiency,”1 the IOM Committee stopped short of recommending independent dual abstraction for all data elements. Instead, it recommended “at minimum, use two or more researchers, working independently, to extract quantitative and other critical data from each study.”1 Thus, although the IOM recommended independent dual abstraction for “critical data,” an important gap in our current methodological understanding of data abstraction remains. The recommendation for critical data could represent unnecessary work or, conversely, the IOM’s implicit recommendation that a single person could abstract noncritical data could represent an opportunity for error. The Patient-Centered Outcomes Research Institute (PCORI) endorses the IOM standards for conducting systematic reviews in general but noted that “Dual screening and data abstraction are desirable, but fact-checking may be sufficient. Quality control procedures are more important than dual review per se.”9

Computer-aided abstraction could potentially make the abstraction process more efficient and more accurate by facilitating the location and tracking of key information in articles. In recent years, several web-based data abstraction systems, such as the Systematic Review Data Repository (SRDR),10,11 Covidence, EPPI-Reviewer,12 DistillerSR, and Doctor Evidence, have been developed to facilitate creating data abstraction forms and receiving and organizing the collected data. Although these data abstraction systems can record which source documents are used for data abstraction, they do not track the specific locations and context of relevant pieces of information in these often-lengthy documents. The ability to track the specific location and context of abstracted data in source documents would record initial data abstraction and likely facilitate data verification and adjudication. This would likely promote the validity of the systematic review findings, save time, and advance the transparency and reproducibility of the systematic review enterprise.

Specific Aims

We had 3 specific aims for this work:

Aim 1: Develop Data Abstraction Assistant (DAA), a software tool to identify and track the location of data in articles and to automatically enter data into SRDR.

Aim 2: Conduct a randomized controlled trial (RCT) to evaluate the comparative effectiveness of 3 approaches—(A) DAA-facilitated single data abstraction plus verification, (B) single data abstraction plus verification, and (C) independent dual data abstraction plus adjudication—on the accuracy and efficiency of data abstraction.


Aim 3: Disseminate DAA, study findings, and a decision tool to enable users to better understand trade-offs between accuracy and efficiency when selecting data abstraction approaches during systematic reviews.


PARTICIPATION OF PATIENTS AND OTHER STAKEHOLDERS

We partnered with 13 stakeholders, including patients, systematic reviewers, clinical trialists, practice-guideline developers, policy makers, and industry representatives. These stakeholders, recruited through the core investigative team, represented the following broad set of domains of expertise: patient advocacy, data mining, machine learning, natural language processing, health informatics, systematic review methodology, patient-centered outcomes research, epidemiology, biostatistics, medicine, educational outreach, public policy, and regulatory science. We decided on and achieved this balance to reflect the multi-stakeholder representation of the systematic review enterprise.1

Impact of Stakeholder Engagement on Project

We engaged with the entire investigative team of stakeholders via conference calls every 3 months. During these calls, we discussed progress, issues, and challenges; explored possible solutions; and laid out next steps. Specific areas where stakeholder engagement was particularly helpful were (1) refining the desired features of the DAA software; (2) providing feedback on the features and functioning of the DAA software; (3) refining the design details of the DAA trial; (4) developing the strategy for recruiting participants for the DAA trial; (5) suggesting additional analyses for the DAA trial; (6) interpreting the findings of the DAA trial; and (7) disseminating DAA and results of the DAA trial via stakeholder networks and publications in peer-reviewed journals.

Specific Contributions of Our Patient Stakeholders

In addition to the aforementioned contributions of all stakeholders, the patient stakeholders made specific contributions to this project. For aim 2, our patient stakeholders (Vernal Branch, Sandra Walsh, and Elizabeth Whamond) helped us select 4 systematic reviews that address patient-important conditions. Then, to identify the outcomes for data abstraction during the DAA trial, we worked with the patient stakeholders and selected patient-centered outcomes that had the maximum number of studies. For aim 3, the patient stakeholders helped us disseminate our work through useful comments on the manuscripts and presentations. Aim 1, which involved technical software development, was not amenable to patient stakeholder involvement given their time constraints.


METHODS

We first developed and implemented DAA, and then conducted an RCT to evaluate it, randomly assigning pairs of data abstractors to different sequences of abstracting data under 3 different data abstraction approaches. We analyzed and compared the proportion of errors and time taken to complete data abstraction under the 3 approaches. We developed a set of considerations to guide selection of data abstraction approaches during systematic reviews.

Aim 1: Developing DAA

Appendix 1 contains our published paper that describes the technical details of DAA.13

Three Essential Features for DAA

We identified the following 3 essential features for DAA:

1. A platform to indicate the source of abstracted information: The major impetus behind the development of DAA was to create a platform where data abstractors could indicate the source of information by placing flags at, or pinpointing, specific locations in source documents (eg, journal articles), thereby creating a potentially permanent linkage (ie, tracking) between abstracted information and its source.

2. Compatibility with a variety of data abstraction systems: Systematic reviewers usually use a data abstraction system to help extract, manage, and archive the primary study data abstracted during the review. Examples of data abstraction systems include SRDR, Covidence, and DistillerSR. DAA’s main purpose is to contain information that links individual abstracted data items to specific locations in source documents. To make DAA compatible with a variety of data abstraction systems, we designed the DAA platform to be distinct from the data abstraction system. This distinction is attained by keeping separate the process of linkage with an item on the data abstraction form (in the data abstraction system) and the process of capturing and navigating to the location of information (in the source document). Details about the technical implementation of this separation are available in a data repository (https://bitbucket.org/cebmbrown/daa/src/master/).

3. User-friendliness: To make navigation easy and fast, we developed DAA to be user-friendly and menu driven. When abstracting data, the data abstractor can see DAA as integrated into the data abstraction system, side by side on the same screen (ie, a split-screen view).


How DAA Functions Behind the Scenes

DAA works behind the scenes through 3 steps:

1. Converting documents from portable document format (PDF) to hypertext markup language (HTML) format
2. Transmitting the HTML version of the source document to the data abstraction system
3. Displaying and allowing for annotation of the HTML version of the source document in the data abstraction system

We examined the accuracy of the content and formatting of the PDF-to-HTML conversion via visual inspection. We developed a process that establishes linkage between abstracted data and their source as follows: (1) use of a unique identifier (ID 1) that denotes a specific data item in the data abstraction system (SRDR in this project); (2) use of a unique identifier (ID 2) that denotes a specific location in the HTML file of a specific source document; and (3) creation of a record of the mapping between IDs 1 and 2. Each mapping record is also provided a unique identifier (ID 3).
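The three-identifier linkage described above can be sketched as a simple data structure. This is a hypothetical illustration only, not the actual DAA implementation; the class and field names are ours:

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class MappingRecord:
    """Links one data item in the abstraction system to one flagged
    location in the HTML source document (hypothetical sketch)."""
    item_id: str      # ID 1: data item in the abstraction system (eg, SRDR)
    location_id: str  # ID 2: flagged location in the HTML source document
    # ID 3: each mapping record itself gets a unique identifier
    record_id: str = field(default_factory=lambda: uuid.uuid4().hex)


# Example: flag the sample-size item at a specific spot in the HTML file
record = MappingRecord(item_id="srdr-item-42", location_id="html-anchor-7")
print(record.record_id != "")  # True: every record carries its own ID 3
```

Because the mapping record references both systems only by identifier, the data abstraction system and the source-document store can remain separate, which is the design separation the text describes.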

How DAA Functions at the Data Abstractor End

DAA is designed to assist with data abstraction, a step that is carried out after the set of relevant studies for the systematic review is identified (Figure 1). Data abstractors can interface directly with DAA by logging into the password-protected DAA web application and uploading study documents (as PDF files). This uploading process can be centrally managed by the project lead if more protected governance of the data abstraction and document management process is desired.

Once source documents are uploaded as PDF files, they are converted into HTML format and organized into document stores, which are groups or collections of source documents. DAA assigns each document store a security token, allowing access to the HTML files from any systematic review project that the data abstractor is working on. Using SRDR as an example, upon logging in, the data abstractor must provide the security token to access his or her document stores. After the data abstractor selects the document store and, subsequently, a source document in HTML format, DAA transmits the HTML file to SRDR. Once DAA transmits the HTML to SRDR, SRDR displays the HTML in an area adjacent to the abstraction form (ie, the split-screen view).

Survey of Early Users of DAA

We surveyed early users of DAA (all 52 individual data abstractors enrolled in the DAA trial described in aim 2) regarding their opinions about the user-friendliness of DAA. After each data abstractor completed data abstraction for the DAA trial, we asked him or her to complete a brief survey designed using Qualtrics. We asked questions related to the data abstractor’s self-reported ease with completing each of the following tasks: (1) opening source documents in split-screen view in SRDR, (2) scrolling between pages of a source document, (3) placing flags in a source document, and (4) clicking on existing flags to automatically navigate to the relevant location in the source document. We also asked data abstractors to assess the overall ease of using DAA and to indicate the DAA feature they liked the most. Finally, we asked data abstractors about their likelihood of using DAA in the future and of recommending that others use it in the future (Appendix 2 contains the survey instrument and Appendix 3 presents a summary of the survey responses).

Aim 2: Conducting a Randomized Controlled Trial to Evaluate DAA

Appendix 4 contains our published paper describing the protocol for the DAA trial.14

Study Population

We recruited individuals who met each of the following criteria (based on self-report): at least 20 years of age, proficiency with reading scientific articles in English, completed data abstraction for at least 1 journal article for a systematic review in any field, and provided informed consent. We used 4 strategies to recruit potential participants: (1) emails to students who registered for courses in systematic review methods through Johns Hopkins Bloomberg School of Public Health (JHBSPH) and Brown University, (2) emails to faculty and staff at the Johns Hopkins EPC and the Brown EPC, (3) advertising on the SRDR website, and (4) advertising through patient organizations such as Consumers United for Evidence-based Healthcare and the Cochrane Consumer Network.

To mimic how individuals are often paired for data abstraction in systematic reviews, we formed pairs consisting of 1 less-experienced data abstractor and 1 more-experienced data abstractor. On the basis of the results of a pilot study,14,15 we determined that the number of published systematic reviews authored, dichotomized at fewer than 3 vs 3 or more, was best able (ie, had the highest area under the curve) to classify abstractors as less or more experienced with abstraction.
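For a dichotomized predictor, the area under the receiver operating characteristic curve reduces to the average of sensitivity and specificity, since the rule yields a single operating point. The cutoff evaluation described above could be sketched as follows; the pilot data below are invented for illustration (the actual analysis used the pilot study data14,15):

```python
def auc_dichotomous(n_reviews, is_experienced, cutoff=3):
    """AUC of the binary rule 'n_reviews >= cutoff' against a reference
    experience label; for a dichotomous predictor this equals
    (sensitivity + specificity) / 2."""
    preds = [n >= cutoff for n in n_reviews]
    tp = sum(p and y for p, y in zip(preds, is_experienced))
    fn = sum((not p) and y for p, y in zip(preds, is_experienced))
    tn = sum((not p) and (not y) for p, y in zip(preds, is_experienced))
    fp = sum(p and (not y) for p, y in zip(preds, is_experienced))
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


# Invented pilot data: reviews authored, and a reference experience label
n_reviews = [0, 1, 2, 2, 3, 4, 5, 8]
is_experienced = [False, False, False, True, True, True, True, True]
print(auc_dichotomous(n_reviews, is_experienced))  # 0.9 with these data
```

Comparing this quantity across candidate cutoffs (eg, 2 vs 3 vs 5 reviews) is one way the "highest area under the curve" criterion could be operationalized.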

Approaches Compared

We compared 3 abstraction approaches:

Approach A: DAA-facilitated Single Abstraction Plus Verification. In approach A, which used DAA, the less-experienced data abstractor in a pair completed the abstraction for an article first. The more-experienced data abstractor verified the information. The less-experienced data abstractor was instructed to place a flag identifying each location within the source document (eg, journal article) supporting the answer to every question on the abstraction form. DAA allowed multiple locations in the document to be flagged for a given question. Once the initial abstraction had been completed, the more-experienced data abstractor was given access to the data abstraction form with the abstracted data in SRDR, together with the flagged locations in the documents. The more-experienced data abstractor could change any of the less-experienced data abstractor’s responses that the former considered appropriate (verification) and, if desired, request discussion with the less-experienced data abstractor (data adjudication).

Approach B: Single Abstraction Plus Verification. Approach B did not use DAA. As in approach A, the less-experienced data abstractor in a pair completed the abstraction form first, without using DAA. The more-experienced data abstractor then verified the information abstracted by the less-experienced partner.


Approach C: Independent Dual Abstraction Plus Adjudication. Approach C also did not use DAA. The 2 data abstractors in a pair each abstracted data independently for the assigned articles using the abstraction form in SRDR. The 2 data abstractors informed each other when they had completed their independent abstractions, and they developed a plan for adjudication (eg, video call, phone call, in-person meeting). The data abstractors compared their abstractions and addressed any discrepancies in the abstracted data (data adjudication).

Randomization and Allocation Concealment

Data abstractors were randomly assigned in pairs. Each pair completed abstraction for 6 articles, 2 under each of the 3 aforementioned approaches. Three different pairs abstracted data for each article. To maximize efficiency, we used a crossover design (Table 1), such that each pair of data abstractors implemented all 3 approaches being evaluated, with the intent of estimating differences within pairs. The 6 possible sequences were AABBCC, AACCBB, BBCCAA, BBAACC, CCAABB, and CCBBAA. The DAA trial protocol14 (Appendix 4) and Table 1 describe the randomization schema in detail.
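The 6 sequences enumerated above are the permutations of the 3 approaches, with each approach doubled because a pair abstracts 2 articles per approach. A sketch of how such an allocation list might be produced (illustrative only; the trial generated its actual random order in R, and the seed here is arbitrary):

```python
import random
from itertools import permutations

# Each pair abstracts 6 articles, 2 per approach, so each permuted order
# of approaches A, B, C is doubled letter by letter (eg, ABC -> AABBCC).
sequences = sorted({"".join(c * 2 for c in perm) for perm in permutations("ABC")})
print(sequences)
# ['AABBCC', 'AACCBB', 'BBAACC', 'BBCCAA', 'CCAABB', 'CCBBAA']

# Assign 24 pairs to the 6 sequences, 4 pairs each, in random order
rng = random.Random(2020)  # fixed seed so the allocation is reproducible
allocation = sequences * 4
rng.shuffle(allocation)
print(len(allocation))  # 24
```

Drawing each sequence an equal number of times keeps the design balanced across pairs, while the shuffled order supports allocation concealment when sequences are released one at a time.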


Table 1. Assignment of 24 Pairs of Data Abstractors to 6 Sequences and 48 Articlesa

aA, B, and C denote 3 different approaches for data abstraction; see Aim 2: Approaches Compared. Note: Random sequence is the permuted arrangement of the 3 different approaches for data abstraction. For example, sequence 1 indicates that data abstractors collected data from 6 unique articles using approaches A, A, B, B, C, and C, in that order.


The senior statistician (C.H.S.) used the R statistical environment to generate the random order. To maintain allocation concealment, we kept the project director (I.J.S.), who was responsible for pairing data abstractors and communicating the randomized sequence to the pair, unaware of the next sequence to be assigned. When a given pair was ready to be randomized, the project director contacted the senior statistician, who communicated via email the random sequence to which the pair was to be assigned.

Masking

It was not feasible to mask data abstractors, because the data abstractors needed to be aware of the abstraction approach in order to abstract data. It is possible that the lack of masking of data abstractors might have caused some bias, but we do not anticipate that this had a meaningful impact on our results, and we are not able to surmise the direction of any bias. The project director was not masked, because he needed to be aware of the sequence of assigned approaches in order to allocate articles and follow data abstractors through the trial. However, the project director played no part in recording the data for either of the trial’s outcomes (ie, errors and time). The lack of masking is unlikely to have influenced our results.

Follow-up and Retention of Participants

To maximize retention, the project director maintained regular email contact with data abstractors throughout the trial. We provided each data abstractor US$250 as compensation for participation in the trial only after abstraction for all 6 articles had been completed (ie, there was no partial or interim compensation). As a result of these efforts and the commitment of the participants, all participants completed the trial; we had no missing data.

Evaluative Framework

We identified 48 journal articles from 4 systematic reviews reporting results of RCTs (12 articles per systematic review) for use in the trial (Table 1). To ensure the systematic reviews and the outcomes were relevant to patients, our 3 patient co-investigators were involved in the selection of systematic reviews and outcomes. In cases where a systematic review included more than 12 articles, we selected the 12 articles that reported the largest number of outcomes. The topics addressed in the selected systematic reviews were (1) multifactorial interventions to prevent falls in older adults16; (2) proprotein convertase subtilisin/kexin type 9 (PCSK9) antibodies for adults with hypercholesterolemia17; (3) interventions to promote physical activity in cancer survivors18; and (4) omega-3 fatty acids for adults with depression.19

Data Collection, Management, and Monitoring

We collected all data through websites, SRDR (the data abstraction system), and DAA. We developed and pilot tested a data abstraction form in SRDR with recommended data elements compatible with each of the 4 systematic review topics (the forms are publicly available at https://bit.ly/2w7HAUK). Each form comprised predominantly multiple-choice or numeric entry data items. We organized the data elements into separate “tabs” in SRDR: Design Tab (study design, risk of bias), Baseline Tab (characteristics of participants by study arm at baseline), Outcomes Tab (list of outcomes reported in the article), and Results Tab (quantitative results data). We combined data from the Outcomes and Results Tabs in the analysis. The forms had a median of 145 data items (range, 106-187 data items) for analysis. The total number of data items abstracted varied between articles, depending on the systematic review topic, number of outcomes, and the amount of information available in each article.

Figure 2 displays an example screenshot of 2 items (pertaining to sample size and study participant age) in the Baseline Tab of the form used during the DAA trial.


Figure 2. Screenshot from the Baseline Tab of a data abstraction form used during the DAA trial

Abbreviation: DAA, Data Abstraction Assistant.


Study Outcomes

The 2 primary outcomes for the DAA trial were the proportion of data items abstracted that constitute an error (hereafter referred to as “error proportions,” for simplicity) and the time taken to complete abstraction (by both data abstractors, including adjudication). In approaches A and B, we used the verified data from the senior data abstractor’s form as the final answers; in approach C, we used data from the senior data abstractor’s form after adjudication by both data abstractors. For each approach, we compared the final answers from the pair with an answer key generated using data independently abstracted and adjudicated by 2 investigators with extensive experience with data abstraction for systematic reviews (T.L. and I.J.S.). We also manually double-checked each data item that had an error proportion ≥50% in case its corresponding answer key value needed correction.

All errors were ascertained by a computer program that automatically compared the selected or entered value of a given data item with the answer-key value for that data item. We defined an error as any discrepancy or difference between an entry for a data item and the answer key value for that data item. We were interested in abstraction errors resulting from either omission or incorrect abstraction. If participants abstracted more data items than were in the answer key, the additional data items were discarded and not considered errors.
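The automated comparison described above amounts to matching each abstracted item against the answer key, counting any discrepancy (omission or incorrect value) as an error and discarding items not in the key. A minimal sketch, with invented field names, not the trial's actual program:

```python
def error_proportion(abstracted, answer_key):
    """Proportion of answer-key items on which the abstracted entry errs.
    An error is any discrepancy, whether an omission or an incorrect
    value; extra abstracted items not in the key are discarded."""
    errors = 0
    for item, key_value in answer_key.items():
        # A missing item (omission) and a wrong value both count as errors
        if abstracted.get(item) != key_value:
            errors += 1
    return errors / len(answer_key)


# Hypothetical example: one wrong value, one omission, one extra item
key = {"n_randomized": 120, "mean_age": 61.4, "outcome_timepoint": "12 wk"}
entry = {"n_randomized": 120, "mean_age": 64.1,  # transposed digits
         "extra_item": "ignored"}                # not in key: discarded
print(error_proportion(entry, key))  # 2 of the 3 keyed items are errors
```

In the trial, this comparison was run per data item, allowing error proportions to be summarized by approach, type of data item, and topic.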

The total time taken to complete abstraction for a given article was defined as the sum of the time taken (in minutes) for initial abstraction(s) plus subsequent verification or adjudication. Because we summed the time spent by data abstractors in a pair, the times technically refer to person-minutes. We asked each data abstractor to record the time spent (in minutes) on each step of data abstraction for each article: initial abstraction, verification, and adjudication (self-recorded time). These data were recorded using the online survey tool Qualtrics. The study data abstraction system (ie, SRDR) also automatically recorded time.

To assess the accuracy of our time measurements, we compared self-recorded time with auto-recorded time and noted that self-recorded time was consistently shorter than auto-recorded time, partly because the auto-recording clock continued to count time when data abstractors took a break. Our primary analysis of time therefore focuses on the auto-recorded time.


As a post hoc secondary objective of the DAA trial (not part of the contractual deliverables for this project), we evaluated the bias of meta-analytic summary statistics constructed using the various possible results data abstracted for 2 outcomes, compared with statistics based on data from the answer key.

Analytical and Statistical Approaches

Overview. We conducted all analyses according to the intention-to-treat (ITT) principle, using all pairs who contributed data. In addition, we conducted a per-protocol analysis using only the pairs who properly completed abstraction (ie, excluding the 2 pairs in whom protocol violations occurred). We computed summary error proportions and time statistics by approach, by systematic review topic, and by type of question (ie, questions in the Design Tab, Baseline Tab, or the Outcomes and Results Tabs).

Statistical Models. We used 2-level mixed models to compare the times and error proportions of the 3 data abstraction approaches. Analyses for the times used a linear mixed model and those for error proportions used a binomial generalized linear mixed model. The first level described variation within pairs of data abstractors across the 6 articles abstracted by each pair; the second level described variation between pairs. Factors investigated at the first level included the 3 approaches as well as indicators for the approach used on the first and last article abstracted by each pair. We included these 2 indicator variables to investigate learning effects (ie, whether time and error proportions tended to decrease as data abstractors abstracted additional articles). Factors at the second (pair) level included the systematic review from which the articles were abstracted and the sequence in which the pair abstracted data. We considered the pair as a random effect by including a random intercept in the first-level model. We also explored interactions of approach with sequence, systematic review, and first and last articles reviewed. Because all participants completed all abstractions, there was no need for additional analyses to deal with missing data.

Exploring the Impact of Errors on Meta-analysis. When different data abstractors abstract different values for an estimate of effect, or fail to abstract a value at all, meta-analytic summaries using the different abstractions might differ if the errors involve values used in the meta-analysis. To explore the potential impact of errors on meta-analyses, we identified 2 outcomes for meta-analysis, 1 continuous (from topic 2) and 1 binary (from topic 1), each of which had 5 or more studies reporting results for that outcome.

Some studies reported arm-level results, some reported between-arm results, and some reported both; we instructed data abstractors to abstract all results for each outcome of interest. Accordingly, we conducted each meta-analysis using 2 methods, 1 based on arm-level results (method 1) and the other on between-arm results (method 2). With each method, if some studies reported only 1 type of result, we used that type instead. For instance, in a meta-analysis using estimates of effect derived from between-arm data, missing between-arm results were computed from arm-level results, provided these were available.
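Computing a between-arm effect from arm-level summaries uses standard formulas, which can be sketched as below. The function names and example values are illustrative, not taken from the trial.

```python
import math

def mean_difference_from_arms(m1, sd1, n1, m2, sd2, n2):
    """Between-arm mean difference and its standard error, computed from
    arm-level summaries (mean, SD, n per arm) when a study reports only
    arm-level results for a continuous outcome."""
    md = m1 - m2
    se = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return md, se

def risk_ratio_from_arms(e1, n1, e2, n2):
    """Risk ratio and the standard error of its log, computed from
    arm-level event counts for a binary outcome."""
    rr = (e1 / n1) / (e2 / n2)
    se_log_rr = math.sqrt(1 / e1 - 1 / n1 + 1 / e2 - 1 / n2)
    return rr, se_log_rr
```

For example, arms with means 10 (SD 4) and 8 (SD 3), each with 100 participants, give a mean difference of 2.0 with a standard error of 0.5.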

By the design of the DAA trial, each article in the meta-analysis had been abstracted by 3 pairs of data abstractors. We carried out all possible combinations of meta-analyses formed by choosing, for each study in the meta-analysis, 1 of the 3 data abstractor pairs who had abstracted results for the given outcome. For each combination of abstractions, we carried out a random-effects meta-analysis using the DerSimonian and Laird method20 and recorded the mean treatment effect (mean difference [MD] for the continuous outcome and risk ratio for the binary outcome) along with the between-study variance and the I2 statistic. We examined the distribution of each statistic and compared it with the estimate obtained using data from the answer key.
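The procedure above can be sketched as follows: a DerSimonian-Laird random-effects meta-analysis applied to every combination of abstractor pairs. The numbers are illustrative placeholders, not trial data; for risk ratios, the analysis would pool log risk ratios rather than raw effects.

```python
import itertools

def dersimonian_laird(effects, variances):
    """Random-effects meta-analysis (DerSimonian & Laird): returns the
    pooled effect, tau^2 (between-study variance), and I^2 (%)."""
    w = [1 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    tau2 = max(0.0, (q - df) / (sw - sum(wi ** 2 for wi in w) / sw))
    w_re = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return pooled, tau2, i2

# Each study has 3 candidate abstractions (one per data abstractor pair);
# enumerate every combination and meta-analyze each (illustrative values)
abstractions = [  # per study: (effect, variance) from each of the 3 pairs
    [(0.8, 0.04), (0.9, 0.04), (0.8, 0.05)],
    [(1.2, 0.09), (1.1, 0.08), (1.2, 0.09)],
    [(0.5, 0.02), (0.5, 0.02), (0.6, 0.03)],
]
pooled_effects = [
    dersimonian_laird([e for e, _ in combo], [v for _, v in combo])[0]
    for combo in itertools.product(*abstractions)
]
# 3 studies x 3 pairs -> 3^3 = 27 combinations; the spread of pooled_effects
# shows how abstraction differences could shift the meta-analytic summary
```

With 5 or more studies per outcome, as in the trial, the number of combinations grows as 3 raised to the number of studies.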

Conduct of the Study

The DAA trial was approved by the IRBs at JHBSPH (dated July 13, 2015; IRB no. 00006521) and Brown University (dated August 21, 2015). Online informed consent for participation in the trial was obtained from every participant via the DAA trial consent website (http://srdr.ahrq.gov/daa/consent).


Aim 3: Disseminating the Study Findings

Because there are various stakeholders in the systematic review enterprise, our target audience for dissemination activities includes, among others, PCORI investigators, Cochrane systematic review authors, AHRQ EPCs, guideline producers, the US Preventive Services Task Force, the Centers for Disease Control and Prevention, the National Institute for Health and Care Excellence in the United Kingdom, the Centre for Reviews and Dissemination in the United Kingdom, Blue Cross Blue Shield, Hayes Inc., industry, academic institutions, and individual users.

Specific Dissemination Strategies of the DAA Software and Findings of the DAA Trial

We adopted a multipronged dissemination strategy to ensure that DAA software reaches various systematic review stakeholders. First, we developed DAA to be open source, open access, and free of charge for any future systematic reviews. This development model promotes sharing of original source code and will allow modification and improvement of the software by the community and the general public. Second, we are publishing multiple manuscripts in peer-reviewed journals. Third, we have presented information about DAA and its features at scientific conferences, where we networked with individuals and entities who are likely to use DAA in their systematic reviews.21-24 Finally, our 4 data collection forms (1 for each systematic review in the DAA trial) contain common data items that can be readily adapted for any future systematic reviews. We have made these forms publicly available (https://bit.ly/2w7HAUK). Systematic reviewers can use the forms entirely or in part and can modify questions as desired.

Developing a Set of Considerations to Guide Selection of Data Abstraction Approaches

We examined the trade-offs in accuracy versus efficiency for the various data abstraction approaches as well as the resource needs for each approach. We developed a set of considerations for stakeholders to understand the pros and cons of the various choices that they will inevitably have to make in selecting an approach for data abstraction during systematic reviews.


RESULTS

Aim 1: Developing DAA

DAA Demonstration and Source Code

DAA records mappings between abstracted data elements and their corresponding locations in source documents.13 Examples of locations include a specific line or paragraph of text, a figure, and a row in a table in a journal article or any report about the study. Mapping a single data element to multiple locations is also supported. Clicking on established mappings automatically loads the source document side by side with the data abstraction form in split-screen view, scrolls the location into view, and highlights the relevant text. Data abstractors can then use the mouse to drag a flag from any item on the data abstraction form to any desired location on the adjacent HTML document (Figure 3).

Results of Survey of Early Users of DAA

All 52 data abstractors who participated in the DAA trial (aim 2) completed the survey (Appendix 3). Most data abstractors (43 of 52; 83%) found using DAA to be either very or somewhat easy overall. Opening source documents in split-screen view and scrolling between pages of a source document were reported to be easy by 83% and 69% of data abstractors, respectively. Among those who placed flags initially (ie, less-experienced data abstractors), 62% agreed that doing so was easy. Among those who clicked on existing flags (ie, more-experienced data abstractors), 73% agreed that doing so was easy.


Figure 3. Screenshot showing how DAA displays the source document in HTML format (right) adjacent to the data abstraction form in the data abstraction system (SRDR, left)a

aA demonstration video is available at https://www.youtube.com/watch?v=29eI2ry8eqM&feature=youtu.be&t=45s. The source code of DAA (Data Abstraction Assistant) can be found at https://bitbucket.org/cebmbrown/daa. The SRDR (Systematic Review Data Repository) source code that includes the DAA implementation can be found at https://bitbucket.org/cebmbrown/srdr/branch/daa. These code repositories include documentation to assist in setting up DAA and SRDR server instances.

When asked about use of DAA for data abstraction in the future, 65% of less-experienced and 77% of more-experienced data abstractors stated they would be very or somewhat likely to use it. Similarly, 80% of less-experienced and 93% of more-experienced data abstractors stated that they would be very or somewhat likely to recommend that others use it (see Appendix 3 for a detailed breakdown of responses). When asked to name their favorite DAA feature, 54% of all data abstractors chose the ability to click on existing flags marking information sources (73% of more-experienced data abstractors named this feature), 19% chose the ability to open a document in split-screen view, and 17% chose the ability to place flags on the PDF (23% of less-experienced data abstractors named this feature).


Aim 2: Conducting a Randomized Controlled Trial to Evaluate DAA

Between March 18, 2016, and February 1, 2017, we screened 160 potential data abstractors for eligibility and randomly assigned 52 (n = 26 pairs) (Figure 4).

Figure 4. Participant flow during the DAA trial

Abbreviation: DAA, Data Abstraction Assistant.

We enrolled 26 pairs instead of the 24 pairs planned in the design, because 3 protocol violations required replacing 2 of the pairs. The first 2 violations (nos. 1 and 2) occurred because the first data abstractor in the pair forgot to place flags during data abstraction under approach A. Protocol violation no. 3 occurred because the project director assigned 2 incorrect studies to a pair. Two protocol violations (nos. 2 and 3) occurred in the same pair. After discussing the issues with the entire investigative team, we enrolled 2 additional pairs of data abstractors (pairs 25 and 26) to replace the 2 pairs in whom the violations occurred.

All participants completed the DAA trial by April 3, 2017. We did not encounter any missing data and analyzed data from all 52 participants under the ITT principle. We conducted a per-protocol sensitivity analysis by replacing the 2 pairs in whom the protocol violations occurred with the 2 added pairs. Because of the crossover design, we present the baseline characteristics of all 52 participants by sequence (Table 2). In brief, most participants were between 20 and 40 years old, reflecting the populations from which we recruited, and most participants had abstracted data within the past 6 months. Most participants (90%) had abstracted data from 10 or more studies, and all participants had previously received some form of training in systematic reviews. Nearly all participants characterized their level of experience as “somewhat/moderately experienced” or “very experienced.”

Table 2. Baseline Characteristics of All 52 Participants in the DAA Trial

Random sequence, No. (%)

Characteristic                              AABBCC    BBCCAA    CCAABB    AACCBB    BBAACC    CCBBAA
                                            (n = 8)   (n = 8)   (n = 8)   (n = 10)  (n = 10)  (n = 8)
Age range, y
  20-29                                     3 (38)    3 (37)    7 (88)    6 (60)    5 (50)    5 (63)
  30-39                                     2 (25)    5 (63)    —         4 (40)    4 (40)    2 (25)
  40-49                                     1 (13)    —         —         —         —         1 (13)
  50-59                                     2 (25)    —         —         —         1 (10)    —
  60-69                                     —         —         —         —         —         —
  ≥70                                       —         —         1 (13)    —         —         —
No. of articles abstracted
  1-9                                       —         —         2 (25)    1 (10)    1 (10)    1 (13)
  10-19                                     —         3 (38)    —         —         2 (20)    3 (38)
  ≥20                                       8 (100)   5 (63)    6 (75)    9 (90)    7 (70)    4 (50)
No. of systematic reviews published
  0                                         1 (13)    4 (50)    3 (38)    4 (40)    4 (40)    3 (38)
  1-2                                       —         —         1 (13)    1 (10)    1 (10)    1 (13)
  3-5                                       3 (38)    2 (25)    2 (25)    4 (40)    2 (20)    1 (13)
  ≥6                                        4 (50)    2 (25)    2 (25)    1 (10)    3 (30)    3 (38)
Last time abstracting data
  Within the last 6 mo                      7 (88)    7 (88)    8 (100)   8 (100)   7 (88)    7 (88)
  ≥6 mo ago                                 1 (13)    1 (13)    —         —         1 (13)    1 (13)
Training in systematic reviewsa
  No training                               —         —         —         —         —         —
  Took a systematic review methods course   5 (63)    5 (63)    3 (38)    7 (70)    7 (70)    7 (88)
  Attended a systematic review workshop     3 (38)    3 (38)    1 (13)    2 (20)    3 (38)    2 (25)
  Received on-the-job training              5 (63)    4 (50)    7 (88)    5 (50)    7 (70)    5 (63)
  Received other forms of training          2 (25)    2 (25)    2 (25)    3 (38)    1 (10)    1 (13)
Self-rated level of experience
  Slightly experienced                      —         1 (13)    —         1 (10)    1 (10)    1 (13)
  Somewhat/moderately experienced           4 (50)    3 (38)    2 (25)    5 (50)    6 (60)    6 (75)
  Very experienced                          4 (50)    4 (50)    6 (75)    4 (40)    3 (30)    1 (13)
Primary professional status
  Faculty                                   3 (38)    1 (13)    1 (13)    2 (20)    3 (38)    —
  Doctoral student                          1 (13)    2 (25)    2 (25)    3 (30)    2 (20)    2 (25)
  Master’s student                          2 (25)    2 (25)    1 (13)    2 (20)    2 (20)    1 (13)
  Staff                                     1 (13)    3 (38)    3 (38)    —         2 (20)    3 (38)
  Other                                     1 (13)    —         1 (13)    3 (30)    1 (10)    2 (25)

Abbreviation: DAA, Data Abstraction Assistant.
a Participants could select all options that apply, so the percentages add up to more than 100%.

We also present the baseline characteristics of participants by the classified level of experience with data abstraction (Table 3).


Table 3. Baseline Characteristics of All 52 Participants in the DAA Trial by Level of Experience With Data Abstraction

                                           Less experienced   More experienced   Overall
                                           (n = 26)           (n = 26)           (N = 52)
Characteristic                             No. (%)            No. (%)            No. (%)
Demographics
Age category, y
  20-29                                    18 (69)            11 (42)            29 (56)
  30-39                                    7 (27)             10 (39)            17 (33)
  40-49                                    0 (0)              2 (8)              2 (4)
  50-59                                    1 (4)              2 (8)              3 (6)
  60-69                                    0 (0)              0 (0)              0 (0)
  ≥70                                      0 (0)              1 (4)              1 (2)
Current professional status
  Master’s student                         7 (27)             3 (12)             10 (19)
  Doctoral student                         7 (27)             5 (19)             12 (23)
  Staff                                    4 (15)             8 (31)             12 (23)
  Faculty                                  3 (12)             7 (27)             10 (19)
  Other                                    5 (19)             3 (12)             8 (16)
Affiliationa
  Brown University                         1 (4)              8 (31)             9 (17)
  Johns Hopkins University                 24 (92)            9 (35)             33 (65)
  Other                                    0 (0)              7 (27)             7 (14)
  Unclear                                  1 (4)              2 (8)              3 (6)
Training
Type of training receivedb
  Systematic review methods workshop       3 (12)             11 (42)            14 (27)
  Systematic review course                 20 (77)            14 (54)            34 (65)
  On-the-job training                      10 (39)            23 (89)            33 (64)
  Other                                    2 (8)              9 (35)             11 (21)
Prior experience with data abstraction for systematic reviews
No. of articles abstracted
  1-9                                      5 (19)             0 (0)              5 (10)
  10-19                                    7 (27)             1 (4)              8 (15)
  ≥20                                      14 (54)            25 (96)            39 (75)
No. of systematic reviews published
  0                                        19 (73)            0 (0)              19 (37)
  1-2                                      7 (27)             0 (0)              7 (14)
  3-5                                      0 (0)              11 (42)            11 (21)
  ≥6                                       0 (0)              15 (58)            15 (29)
Self-assessment of prior experience
  Slightly experienced                     3 (12)             1 (4)              4 (8)
  Somewhat/moderately experienced          18 (69)            8 (31)             26 (50)
  Very experienced                         5 (19)             17 (65)            22 (42)

Abbreviation: DAA, Data Abstraction Assistant.
a Based on email addresses and project director’s familiarity with the participant’s current or past affiliation(s).
b Participants could select all that apply, so the percentages add up to more than 100%.


Errors

Error Proportions. Table 4 provides the error proportions observed during the DAA trial by data abstraction approach (A, B, and C), type of error (error of omission, incorrect abstraction, and total errors), type of data abstracted (study design, baseline characteristics, outcomes/results, and all types of data), and systematic review topic (1, 2, 3, 4, and all topics). Table 5 reports these data aggregated across all approaches.

Across all approaches, the proportion of errors committed by pairs per abstraction form was 16% (range, 2%-33%; see Table 5). These proportions were similar among data abstraction approaches: 17% (range, 6%-33%) for approach A, 16% (range, 4%-33%) for approach B, and 15% (range, 2%-30%) for approach C (Table 4). Error proportions were much higher when abstracting data items related to outcomes/results (36%) compared with data items related to study design (15%) or baseline characteristics (10%; see Table 5). When abstracting data items related to outcomes/results, error proportions were higher for approach A (41%) than for approach B (36%) or approach C (31%; see Table 4). Differences were smaller for data items related to study design (ranging between 13% and 17%) and baseline characteristics (all 10%). Error proportions did not vary as much by systematic review topic. Errors of omission were less common than incorrect abstractions, except among the outcomes/results data items, for which a large majority of the errors were omissions.


Table 4. Proportion of Errors by Data Abstraction Approach, Type of Error, Type of Data Item, and Systematic Review Topic

Approach Aa

Type of data item         Total errors     Errors of omission   Incorrect abstractions   No. of fields
and topic                 Mean % (Range)   Mean % (Range)       Mean % (Range)           Mean (Range)
Study Design
  Topic 1d                21 (7-49)        0 (0-0)              21 (7-49)                42 (37-43)
  Topic 2e                18 (9-30)        0 (0-0)              18 (9-30)                45 (42-46)
  Topic 3f                12 (5-20)        0 (0-0)              12 (5-20)                45 (42-46)
  Topic 4g                17 (2-48)        0 (0-0)              17 (2-48)                46 (42-48)
  All topics              17 (2-49)        0 (0-0)              17 (2-49)                45 (37-48)
Baseline Characteristics
  Topic 1                 11 (3-19)        0 (0-0)              11 (3-19)                62 (59-65)
  Topic 2                 11 (0-35)        0 (0-0)              11 (0-35)                76 (63-84)
  Topic 3                 9 (0-24)         0 (0-0)              9 (0-24)                 73 (64-81)
  Topic 4                 9 (0-33)         0 (0-0)              9 (0-33)                 52 (45-57)
  All topics              10 (0-35)        0 (0-0)              10 (0-35)                65 (45-84)
Outcomes and Results
  Topic 1                 48 (9-95)        44 (9-76)            4 (0-19)                 24 (10-38)
  Topic 2                 42 (0-86)        32 (0-86)            9 (0-37)                 31 (7-52)
  Topic 3                 40 (0-100)       36 (0-100)           4 (0-23)                 22 (7-43)
  Topic 4                 35 (8-100)       22 (0-100)           13 (0-71)                13 (3-25)
  All topics              41 (0-100)       33 (0-100)           8 (0-71)                 22 (3-52)
All data items
  Topic 1                 20 (6-28)        8 (2-17)             11 (3-20)                150 (131-162)
  Topic 2                 17 (6-24)        6 (0-14)             11 (6-16)                166 (128-181)
  Topic 3                 14 (8-30)        6 (0-15)             8 (2-15)                 155 (133-187)
  Topic 4                 17 (9-33)        7 (4-15)             11 (1-24)                130 (116-142)
  All topics              17 (6-33)        7 (0-17)             10 (1-24)                150 (116-187)

Approach Bb

Type of data item         Total errors     Errors of omission   Incorrect abstractions   No. of fields
and topic                 Mean % (Range)   Mean % (Range)       Mean % (Range)           Mean (Range)
Study Design
  Topic 1d                14 (7-19)        0 (0-0)              14 (7-19)                42 (37-43)
  Topic 2e                13 (0-21)        0 (0-0)              13 (0-21)                45 (42-46)
  Topic 3f                12 (2-20)        0 (0-0)              12 (2-20)                45 (42-46)
  Topic 4g                15 (2-36)        0 (0-0)              15 (2-36)                46 (42-48)
  All topics              13 (0-36)        0 (0-0)              13 (0-36)                45 (37-48)
Baseline Characteristics
  Topic 1                 10 (0-20)        1 (0-9)              9 (0-20)                 62 (59-65)
  Topic 2                 11 (0-34)        0 (0-0)              11 (0-34)                76 (63-84)
  Topic 3                 9 (0-26)         0 (0-0)              9 (0-26)                 72 (64-81)
  Topic 4                 11 (0-27)        0 (0-0)              11 (0-27)                51 (45-57)
  All topics              10 (0-34)        0 (0-9)              10 (0-34)                65 (45-84)
Outcomes and Results
  Topic 1                 37 (10-65)       35 (0-65)            2 (0-17)                 24 (10-38)
  Topic 2                 41 (6-95)        28 (0-95)            13 (0-86)                31 (7-52)
  Topic 3                 35 (7-100)       32 (0-100)           3 (0-27)                 23 (7-43)
  Topic 4                 31 (0-100)       27 (0-100)           4 (0-40)                 11 (3-25)
  All topics              36 (0-100)       31 (0-100)           5 (0-86)                 22 (3-52)
All data items
  Topic 1                 16 (7-21)        8 (0-13)             8 (4-13)                 144 (123-168)
  Topic 2                 16 (7-33)        6 (0-23)             10 (3-20)                161 (123-181)
  Topic 3                 14 (4-32)        6 (1-15)             8 (1-18)                 148 (123-177)
  Topic 4                 18 (6-25)        7 (4-10)             11 (2-21)                122 (106-137)
  All topics              16 (4-33)        6 (0-23)             9 (1-21)                 143 (106-181)

Table 4. Proportion of Errors by Data Abstraction Approach, Type of Error, Type of Data Item, and Systematic Review Topic (cont’d)

Approach Cc

Type of data item         Total errors     Errors of omission   Incorrect abstractions   No. of fields
and topic                 Mean % (Range)   Mean % (Range)       Mean % (Range)           Mean (Range)
Study Design
  Topic 1d                18 (7-35)        0 (0-0)              18 (7-35)                42 (37-43)
  Topic 2e                10 (2-23)        0 (0-0)              10 (2-23)                45 (42-46)
  Topic 3f                10 (4-21)        0 (0-0)              10 (4-21)                45 (42-46)
  Topic 4g                17 (4-36)        0 (0-0)              17 (4-36)                46 (42-48)
  All topics              14 (2-36)        0 (0-0)              14 (2-36)                45 (37-48)
Baseline Characteristics
  Topic 1                 7 (0-14)         0 (0-0)              7 (0-14)                 62 (59-65)
  Topic 2                 15 (0-41)        0 (0-0)              15 (0-41)                76 (63-84)
  Topic 3                 10 (1-24)        0 (0-0)              10 (1-24)                72 (64-81)
  Topic 4                 8 (0-20)         0 (0-0)              8 (0-20)                 52 (45-57)
  All topics              10 (0-41)        0 (0-0)              10 (0-41)                65 (45-84)
Outcomes and Results
  Topic 1                 43 (4-100)       40 (0-100)           3 (0-14)                 24 (10-38)
  Topic 2                 35 (7-86)        33 (0-86)            2 (0-14)                 31 (7-52)
  Topic 3                 21 (0-60)        17 (0-53)            5 (0-57)                 23 (7-43)
  Topic 4                 29 (0-100)       28 (0-100)           1 (0-8)                  11 (3-25)
  All topics              31 (0-100)       29 (0-100)           2 (0-57)                 22 (3-52)
All data items
  Topic 1                 16 (8-27)        8 (0-18)             9 (2-16)                 144 (123-156)
  Topic 2                 16 (8-30)        6 (0-12)             10 (4-21)                161 (128-179)
  Topic 3                 12 (2-19)        4 (0-13)             9 (2-15)                 149 (123-177)
  Topic 4                 17 (6-25)        7 (4-15)             10 (2-20)                123 (106-142)
  All topics              15 (2-30)        6 (0-18)             9 (2-21)                 144 (106-179)

a Approach A was DAA (Data Abstraction Assistant)–facilitated single abstraction plus verification.
b Approach B was single abstraction plus verification.
c Approach C was independent dual abstraction plus adjudication.
d Topic 1: Multifactorial interventions to prevent falls in older adults.16
e Topic 2: PCSK9 (proprotein convertase subtilisin/kexin type 9) antibodies for adults with hypercholesterolemia.17
f Topic 3: Interventions to promote physical activity in cancer survivors.18
g Topic 4: Omega-3 fatty acids for adults with depression.19


Table 5. Proportion of Errors Across All Approaches, by Type of Error, Type of Data Abstracted, and Systematic Review Topic

All approachesa

Type of data item         Total errors     Errors of omission   Incorrect abstractions   No. of fields
and topic                 Mean % (Range)   Mean % (Range)       Mean % (Range)           Mean (Range)
Study Design
  Topic 1b                18 (7-49)        0 (0-0)              18 (7-49)                42 (37-43)
  Topic 2c                14 (0-30)        0 (0-0)              14 (0-30)                45 (42-46)
  Topic 3d                11 (2-21)        0 (0-0)              11 (2-21)                45 (42-46)
  Topic 4e                16 (2-48)        0 (0-0)              16 (2-48)                46 (42-48)
  All topics              15 (0-49)        0 (0-0)              15 (0-49)                45 (37-48)
Baseline Characteristics
  Topic 1                 9 (0-20)         0 (0-9)              9 (0-20)                 62 (59-65)
  Topic 2                 12 (0-41)        0 (0-0)              12 (0-41)                76 (63-84)
  Topic 3                 9 (0-26)         0 (0-0)              9 (0-26)                 72 (64-81)
  Topic 4                 9 (0-33)         0 (0-0)              9 (0-33)                 52 (45-57)
  All topics              10 (0-41)        0 (0-9)              10 (0-41)                65 (45-84)
Outcomes and Results
  Topic 1                 43 (4-100)       40 (0-100)           3 (0-19)                 24 (10-38)
  Topic 2                 39 (0-95)        31 (0-95)            8 (0-86)                 31 (7-52)
  Topic 3                 32 (0-100)       28 (0-100)           4 (0-57)                 23 (7-43)
  Topic 4                 32 (0-100)       26 (0-100)           6 (0-71)                 12 (3-25)
  All topics              36 (0-100)       31 (0-100)           5 (0-86)                 22 (3-52)
All data items
  Topic 1                 17 (6-28)        8 (0-18)             9 (2-20)                 146 (123-168)
  Topic 2                 17 (6-33)        6 (0-23)             11 (3-21)                163 (123-181)
  Topic 3                 14 (2-32)        5 (0-15)             8 (1-18)                 151 (123-187)
  Topic 4                 17 (6-33)        7 (4-15)             10 (1-24)                125 (106-142)
  All topics              16 (2-33)        6 (0-23)             10 (1-24)                145 (106-187)

a Approach A was DAA (Data Abstraction Assistant)–facilitated single abstraction plus verification; approach B was single abstraction plus verification; approach C was independent dual abstraction plus adjudication.
b Topic 1: Multifactorial interventions to prevent falls in older adults.16
c Topic 2: PCSK9 (proprotein convertase subtilisin/kexin type 9) antibodies for adults with hypercholesterolemia.17
d Topic 3: Interventions to promote physical activity in cancer survivors.18
e Topic 4: Omega-3 fatty acids for adults with depression.19


Between-Approach Comparisons of Errors. We fit a variety of models to compare error proportions among the 3 data abstraction approaches. Error proportions varied by abstraction approach, by the sequence in which approaches were undertaken, by review topic, and by the order in which articles were abstracted (error proportions were generally higher for the first article extracted and lower for the last article). We did not find any interactions with approach, except for sequence, but these were difficult to interpret. Table 6 presents comparisons of error proportions between approaches using a model based on the DAA trial design that adjusted for sequence, systematic review topic, and indicators for the approach used on the first and last article abstracted by each pair.

Table 6. Between-Approach Comparisons of Error Proportions by Type of Data Abstracteda

                           Approach Ab vs Approach Cc       Approach Bd vs Approach C        Approach A vs Approach B
Tab                        Adj. OR (95% CI)     P           Adj. OR (95% CI)     P           Adj. OR (95% CI)     P
Study Design               1.30e (1.11-1.53)    .002        0.99 (0.83-1.17)     .87         1.32e (1.12-1.55)    .001
Baseline Characteristics   1.02 (0.87-1.20)     .83         1.05 (0.89-1.23)     .59         0.97 (0.83-1.14)     .74
Outcomes and Results       1.52e (1.27-1.82)    <.0001      1.17 (0.97-1.40)     .10         1.30e (1.09-1.56)    .004
All data items             1.12e (1.03-1.22)    .01         1.04 (0.95-1.13)     .41         1.08 (0.99-1.17)     .09

Abbreviation: Adj. OR, adjusted odds ratio.
a The model that did not include indicators for the approach used on the first and last article abstracted by each pair rendered similar findings.
b Approach A was DAA (Data Abstraction Assistant)-facilitated single abstraction plus verification.
c Approach C was independent dual abstraction plus adjudication.
d Approach B was single abstraction plus verification.
e Significant at the .05 level.


Overall, across all types of data items, although the crude error proportions were similar (17% for A, 16% for B, and 15% for C; see Table 4), approach A was associated with a statistically significant 12% higher odds of errors than approach C (OR, 1.12; 95% CI, 1.03-1.22) and with a nonstatistically significant 8% higher odds of errors than approach B (OR, 1.08; 95% CI, 0.99-1.17; see Table 6). The majority of these between-approach differences arose from the data items related to outcomes/results and study design, where, for example, compared with approach C, approach A was associated with 52% (OR, 1.52; 95% CI, 1.27-1.82) and 30% (OR, 1.30; 95% CI, 1.11-1.53) higher odds of errors, respectively, for each type of data item. Approach A also was associated with statistically significantly higher odds of errors than approach B in these 2 types of data items. Approaches B and C were associated with similar odds of errors in data items related to study design, but approach B was associated with marginally significantly higher odds of errors in data items related to outcomes/results than approach C. No between-approach differences were observed in data items related to baseline characteristics.

Time

Mean times for data abstraction during the DAA trial, as captured by auto-recorded time, were generally longer than those captured by self-recorded time. Across all approaches, the mean times per abstraction were 136 minutes (range, 39-399 minutes) and 107 minutes (range, 30-285 minutes), as captured by the auto- and self-recorded times, respectively.

Auto-recorded Time. Mean times for data abstraction during the DAA trial, as captured by auto-recorded time, were longer for approach C (172 minutes; range, 48-399 minutes) than for approach A (128 minutes; range, 41-350 minutes) and approach B (107 minutes; range, 39-341 minutes; see Table 7). Some systematic review topics took longer to abstract than others. Auto-recorded time clocks were not able to differentiate between initial abstraction and verification or adjudication; however, they recorded times by type of data item. Regardless of the abstraction approach, data abstractors spent 2 to 3 times as long on data items related to study design or outcomes/results as on items related to baseline characteristics. Abstracting data related to study design and baseline characteristics took slightly longer using approach A than approach B, even though both approaches involved verification.

Table 7. Auto-recorded Time Spent (in minutes) by Data Abstraction Approach, Type of Data Item, and Systematic Review Topic

Type of data item      Approach Aa     Approach Bb     Approach Cc     All approaches
and topic              Mean (range)    Mean (range)    Mean (range)    Mean (range)
Study Design
  Topic 1d             46 (21-107)     36 (19-59)      51 (22-70)      44 (19-107)
  Topic 2e             61 (17-111)     58 (10-232)     50 (36-85)      56 (10-232)
  Topic 3f             54 (17-148)     37 (9-82)       84 (43-145)     58 (9-148)
  Topic 4g             63 (16-199)     41 (16-81)      63 (24-166)     55 (16-199)
  All topicsh          56 (16-199)     43 (9-232)      63 (22-166)     54 (9-232)
Baseline Characteristics
  Topic 1              9 (5-18)        7 (4-15)        15 (5-32)       10 (4-32)
  Topic 2              27 (8-66)       14 (3-24)       29 (16-78)      24 (3-78)
  Topic 3              19 (6-52)       11 (3-20)       28 (15-47)      19 (3-52)
  Topic 4              23 (4-155)      9 (5-19)        23 (8-75)       18 (4-155)
  All topicsh          20 (4-155)      10 (3-24)       24 (5-78)       18 (3-155)
Outcomes and Results
  Topic 1              29 (5-68)       27 (10-82)      75 (16-165)     44 (5-165)
  Topic 2              44 (11-111)     46 (9-96)       55 (27-138)     48 (9-138)
  Topic 3              43 (10-128)     58 (8-244)      69 (20-140)     57 (8-244)
  Topic 4              27 (5-69)       33 (8-97)       72 (13-245)     44 (5-245)
  All topicsh          36 (5-128)      41 (8-244)      68 (13-245)     48 (5-245)
All data items
  Topic 1              98 (44-170)     84 (39-194)     162 (48-243)    114 (39-243)
  Topic 2              146 (50-290)    132 (44-341)    150 (93-267)    143 (44-341)
  Topic 3              134 (46-350)    118 (40-311)    199 (113-310)   151 (40-350)
  Topic 4              132 (41-326)    96 (42-172)     174 (51-399)    134 (41-399)
  All topicsh          128 (41-350)    107 (39-341)    172 (48-399)    136 (39-399)

a Approach A was DAA (Data Abstraction Assistant)–facilitated single abstraction plus verification.
b Approach B was single abstraction plus verification.
c Approach C was independent dual abstraction plus adjudication.
d Topic 1: Multifactorial interventions to prevent falls in older adults.16
e Topic 2: PCSK9 (proprotein convertase subtilisin/kexin type 9) antibodies for adults with hypercholesterolemia.17
f Topic 3: Interventions to promote physical activity in cancer survivors.18
g Topic 4: Omega-3 fatty acids for adults with depression.19
h Total time for All topics is greater than the sum of the Design, Baseline, and Outcomes and Results Tabs because it also incorporates time spent on the other tabs in SRDR (Systematic Review Data Repository), ie, the Key Questions, Publications, Arms, and Finalize Tabs.


Self-recorded Time. Mean times for data abstraction during the DAA trial, as captured by self-recorded time, were similar between the 2 verification approaches (90 minutes [range, 39-229 minutes] for approach A and 90 minutes [range, 30-285 minutes] for approach B). The mean time was longer for approach C, which involved independent dual abstraction (142 minutes; range, 59-256 minutes; Table 8). Across all abstraction approaches, approximately 60% of the time was spent on initial abstraction and approximately 40% on verification or adjudication. Some systematic review topics took longer to abstract than others. Table 9 reports these data aggregated across approaches.


Table 8. Self-recorded Time (in minutes) Spent by Data Abstraction Approach, Step of Data Abstraction, and Systematic Review Topic

All values are mean (range) minutes, by step of data abstraction.

Approach Aa
Topic            Initial abstraction   Verification    Adjudication    Total
  Topic 1d       51 (36-80)            24 (11-38)      5 (0-15)        80 (61-105)
  Topic 2e       70 (20-210)           30 (19-60)      4 (0-30)        103 (48-229)
  Topic 3f       61 (20-149)           31 (10-75)      4 (0-21)        96 (45-224)
  Topic 4g       45 (19-72)            31 (18-65)      5 (0-20)        81 (39-145)
  All topics     56 (19-210)           29 (10-75)      5 (0-30)        90 (39-229)

Approach Bb
Topic            Initial abstraction   Verification    Adjudication    Total
  Topic 1        44 (20-97)            23 (5-41)       8 (0-31)        75 (39-145)
  Topic 2        57 (20-140)           26 (2-50)       9 (0-30)        92 (35-172)
  Topic 3        70 (18-210)           22 (10-41)      20 (0-80)       112 (30-285)
  Topic 4        46 (18-113)           27 (10-55)      8 (0-28)        81 (43-136)
  All topics     54 (18-210)           25 (2-55)       12 (0-80)       90 (30-285)

Approach Cc
Topic            Initial abstraction   Verification    Adjudication    Total
  Topic 1        82 (40-132)           3 (0-10)        47 (22-87)      132 (65-229)
  Topic 2        90 (38-182)           0 (0-0)         48 (28-90)      138 (76-255)
  Topic 3        95 (69-145)           4 (0-42)        62 (24-145)     161 (98-227)
  Topic 4        64 (32-153)           4 (0-20)        66 (24-190)     135 (59-256)
  All topics     83 (32-182)           3 (0-42)        56 (22-190)     142 (59-256)

a Approach A was DAA (Data Abstraction Assistant)–facilitated single abstraction plus verification.
b Approach B was single abstraction plus verification.
c Approach C was independent dual abstraction plus adjudication.
d Topic 1: Multifactorial interventions to prevent falls in older adults.16
e Topic 2: PCSK9 (proprotein convertase subtilisin/kexin type 9) antibodies for adults with hypercholesterolemia.17
f Topic 3: Interventions to promote physical activity in cancer survivors.18
g Topic 4: Omega-3 fatty acids for adults with depression.19


Table 9. Self-recorded Time (in minutes) Spent Across All Approaches, by Step of Data Abstraction and Systematic Review Topic

Values are mean (range) in minutes across all approaches,^a by step of data abstraction.

| Topic | Initial abstraction | Verification | Adjudication | Total |
|---|---|---|---|---|
| Topic 1^b | 59 (20-132) | 17 (0-41) | 20 (0-87) | 96 (39-229) |
| Topic 2^c | 72 (20-210) | 19 (0-60) | 20 (0-90) | 111 (35-255) |
| Topic 3^d | 75 (18-210) | 19 (0-75) | 29 (0-145) | 123 (30-285) |
| Topic 4^e | 52 (18-153) | 21 (0-65) | 27 (0-190) | 99 (39-256) |
| All topics | 64 (18-210) | 19 (0-75) | 24 (0-190) | 107 (30-285) |

a Approach A was DAA (Data Abstraction Assistant)–facilitated single abstraction plus verification; approach B was single abstraction plus verification; approach C was independent dual abstraction plus adjudication.
b Topic 1: Multifactorial interventions to prevent falls in older adults.16
c Topic 2: PCSK9 (proprotein convertase subtilisin/kexin type 9) antibodies for adults with hypercholesterolemia.17
d Topic 3: Interventions to promote physical activity in cancer survivors.18
e Topic 4: Omega-3 fatty acids for adults with depression.19

Between-Approach Comparisons for Time. We fit a variety of models for the self- and auto-recorded times. Both sets of times varied by abstraction approach, by sequence of approaches, and by the order in which articles were abstracted, but not by review topic. Again, because of difficulties with interpretation, we ignored interactions between approach and the order in which articles were abstracted. Table 10 reports comparison data of auto-recorded time between approaches using a model based on the DAA trial design that adjusted for sequence, systematic review topic, and indicators for the approach used on the first and last article abstracted by each pair. Table 11 presents similar comparison data for self-recorded time. Irrespective of which time was used, the comparisons between approaches rendered similar findings.
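The adjusted mean differences reported in Tables 10 and 11 come from linear models of time on indicators for approach, sequence, topic, and the first and last article abstracted. The report does not include the analysis code, so the sketch below is only a minimal illustration of how such adjusted between-approach differences can be estimated by least squares; the data, layout, and injected effect sizes are all made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical layout: 96 article-abstractions, each done under one of the
# 3 approaches, nested in 4 review topics and 6 allocation sequences.
n = 96
approach = np.array(["A", "B", "C"] * (n // 3))
topic = rng.integers(0, 4, size=n)
sequence = rng.integers(0, 6, size=n)

# Injected "true" effects (minutes), with approach C as the reference.
approach_effect = {"A": -46.0, "B": -66.0, "C": 0.0}
topic_effect = np.array([0.0, 15.0, 25.0, 5.0])
time = (180.0
        + np.array([approach_effect[a] for a in approach])
        + topic_effect[topic]
        + rng.normal(0.0, 10.0, size=n))

# Design matrix: intercept, approach dummies (C = reference),
# topic dummies, and sequence dummies.
X = np.column_stack(
    [np.ones(n),
     (approach == "A").astype(float),
     (approach == "B").astype(float)]
    + [(topic == t).astype(float) for t in (1, 2, 3)]
    + [(sequence == s).astype(float) for s in range(1, 6)]
)
beta, *_ = np.linalg.lstsq(X, time, rcond=None)
adj_md_a_vs_c = beta[1]  # adjusted mean difference, approach A vs C
adj_md_b_vs_c = beta[2]  # adjusted mean difference, approach B vs C
```

Because the approach dummies use C as the reference, the fitted coefficients recover the injected differences (A about 46 minutes and B about 66 minutes faster than C), analogous in spirit to the adjusted mean differences the trial reports.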

When considering total time spent on studies, approach A took statistically significantly less time than approach C by both methods of time recording: by 46 minutes (95% CI, 26-66 minutes) using auto-recorded time and by 53 minutes (95% CI, 39-66 minutes) using self-recorded time. Approaches A and B did not differ on self-recorded time but differed in favor of B on auto-recorded time by 20 minutes (95% CI, 1-40 minutes). Approach B also took statistically significantly less time than approach C: by 66 minutes (95% CI, 47-86 minutes) using auto-recorded time and by 52 minutes (95% CI, 39-66 minutes) using self-recorded time.

When considering time spent by type of data abstracted (Table 10), approach A took statistically significantly less time than approach C for data items related to outcomes/results, but not for study design and baseline characteristics. Approach B took statistically significantly less time than approach C for each type of data item. Approach A took longer than approach B for data items related to study design and for baseline characteristics, but not for data items related to outcomes/results.


Table 10. Between-Approach Comparisons of Auto-recorded Time by Type of Data Abstracteda

| Tab | A–C: Adj. MD | A–C: 95% CI | A–C: P | B–C: Adj. MD | B–C: 95% CI | B–C: P | A–B: Adj. MD | A–B: 95% CI | A–B: P |
|---|---|---|---|---|---|---|---|---|---|
| Design Tab | –7.2 | –17.7 to 3.3 | .18 | –20.2^e | –31.2 to –9.2 | .0003 | 13.4^e | 3.0 to 23.9 | .01 |
| Baselines Tab | –4.2 | –9.8 to 1.5 | .15 | –13.8^e | –19.4 to –8.2 | <.0001 | 9.7^e | 4.0 to 15.3 | .0008 |
| Outcomes and Results | –33.3^e | –45.7 to –20.9 | <.0001 | –27.4^e | –39.8 to –15.0 | <.0001 | –5.9 | –18.3 to 6.5 | .35 |
| All data items | –45.9^e | –65.5 to –26.3 | <.0001 | –66.1^e | –85.7 to –46.5 | <.0001 | 20.2^e | 0.6 to 39.8 | .04 |

Comparisons: A–C = approach A^b – approach C^c; B–C = approach B^d – approach C; A–B = approach A – approach B.
Abbreviation: Adj. MD, adjusted mean difference.
a The model that did not include indicators for the approach used on the first and last article abstracted by each pair rendered similar findings, except that the comparison between approaches A and B for all data items was not statistically significant (when those indicators were not included in the model).
b Approach A was DAA (Data Abstraction Assistant)–facilitated single abstraction plus verification.
c Approach C was independent dual abstraction plus adjudication.
d Approach B was single abstraction plus verification.
e Significant at 0.05 level.

Table 11. Between-Approach Comparisons of Self-recorded Time Across All Topicsa

| Tab | A–C: Adj. MD | A–C: 95% CI | A–C: P | B–C: Adj. MD | B–C: 95% CI | B–C: P | A–B: Adj. MD | A–B: 95% CI | A–B: P |
|---|---|---|---|---|---|---|---|---|---|
| All data items | –52.7^e | –66.2 to –39.2 | <.0001 | –52.4^e | –65.9 to –39.0 | <.0001 | –0.3 | –13.7 to 13.2 | .97 |

Comparisons: A–C = approach A^b – approach C^c; B–C = approach B^d – approach C; A–B = approach A – approach B.
Abbreviation: Adj. MD, adjusted mean difference.
a The model that did not include indicators for the approach used on the first and last article abstracted by each pair rendered similar findings.
b Approach A was DAA (Data Abstraction Assistant)–facilitated single abstraction plus verification.
c Approach C was independent dual abstraction plus adjudication.
d Approach B was single abstraction plus verification.
e Significant at 0.05 level.


Sensitivity Analyses

For both error proportions and time, per-protocol sensitivity analyses returned results similar to those of the main ITT analyses (results not shown).

Impact of Errors on Meta-analysis

Continuous Outcome (low-density lipoprotein cholesterol [LDL-C] level, absolute change from baseline to 12 weeks). Eight of the 12 studies in systematic review topic 2 (ie, PCSK9 antibodies for adults with hypercholesterolemia17) provided sufficient data for a meta-analysis of this outcome, comparing evolocumab 420 mg (a PCSK9 antibody) with placebo. With 3 data abstractors for each study and 8 studies, there were 3⁸ = 6561 possible combinations of data that could be used for this meta-analysis. Because data abstractors sometimes omitted an outcome (ie, failed to abstract any data for an outcome) or did not abstract sufficient data for a meta-analysis (eg, abstracted a mean without measures of precision), the mean number of studies per meta-analysis was 5.67 for analysis method 1 and 4.67 for analysis method 2.

When using data from the answer key, the pooled MD in the continuous outcome (ie, LDL-C level absolute change from baseline to 12 weeks) using analysis method 1 was –2.08 mmol/L (95% CI, –2.48 to –1.68). When using the resampling meta-analysis, we found that the mean of the MDs was of slightly smaller magnitude (–1.92 mmol/L) and ranged from –2.27 mmol/L to –1.63 mmol/L. Although the objective of the DAA trial was to compare 3 data abstraction approaches, in the resampling meta-analysis, there were only 3 combinations that had the same approach for all 8 sampled studies, 1 combination for each abstraction approach. Compared with the answer key (–2.08 mmol/L), the magnitude of the MD was slightly lower for approach A (–1.89 mmol/L), similar for approach B (–2.11 mmol/L), and moderately lower for approach C (–1.68 mmol/L). Despite the smaller mean number of studies, the MDs for analysis method 2 were very similar to those for method 1.
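The resampling meta-analysis described above can be sketched as follows. This is an illustrative reconstruction, not the trial's actual code or data: it enumerates all 3⁸ combinations of one abstractor's record per study, drops omitted outcomes, and pools each combination's mean differences with a DerSimonian-Laird random-effects model (reference 20). All numbers in `studies` are made up.

```python
import itertools

# Illustrative abstractions for 8 studies: each inner list holds the
# (mean difference, standard error) that each of 3 abstractors recorded for
# LDL-C change (mmol/L); None marks an omitted outcome. Made-up numbers.
studies = [
    [(-2.1, 0.30), (-1.9, 0.32), None],
    [(-2.4, 0.41), (-2.4, 0.41), (-2.2, 0.50)],
    [(-1.7, 0.28), (-1.8, 0.28), (-1.7, 0.35)],
    [(-2.6, 0.55), None, (-2.5, 0.52)],
    [(-2.0, 0.33), (-2.0, 0.33), (-2.1, 0.30)],
    [(-1.5, 0.47), (-1.6, 0.45), (-1.5, 0.47)],
    [(-2.3, 0.38), (-2.2, 0.40), (-2.3, 0.38)],
    [(-1.9, 0.29), (-1.9, 0.29), None],
]

def dl_pool(effects, ses):
    """DerSimonian-Laird random-effects pooled estimate."""
    w = [1.0 / se ** 2 for se in ses]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (len(effects) - 1)) / c) if c > 0 else 0.0
    w_re = [1.0 / (se ** 2 + tau2) for se in ses]
    return sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)

# One meta-analysis per combination of "one abstractor's record per study".
pooled_mds = []
for combo in itertools.product(*studies):  # 3**8 = 6561 combinations
    usable = [rec for rec in combo if rec is not None]  # drop omissions
    pooled_mds.append(dl_pool([md for md, _ in usable],
                              [se for _, se in usable]))

mean_md = sum(pooled_mds) / len(pooled_mds)
min_md, max_md = min(pooled_mds), max(pooled_mds)
```

The distribution of `pooled_mds` plays the role of the resampling distribution of MDs the text summarizes by its mean and range.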


Binary Outcome (having at least 1 fall by 12 months). Ten of the 12 studies in systematic review topic 1 (ie, multifactorial interventions to prevent falls in older adults16) provided sufficient data for a meta-analysis of this outcome when comparing physical activity and usual care. With 3 data abstractors for each study and 10 studies, there were 3¹⁰ = 59 049 possible combinations of data that could be used for this meta-analysis. We considered a random sample of 10 000 of these possible combinations. There was a mean of 6.65 studies per meta-analysis for both analysis methods 1 and 2.

When using data from the answer key, the relative risk (RR) of the binary outcome (ie, having at least 1 fall by 12 months) using analysis method 1 was 0.93 (95% CI, 0.84-1.03). When using the resampling meta-analysis, the mean of the RRs indicated a slightly larger effect (0.91) and ranged from 0.86 to 0.97. Compared with the answer key (0.93), the RR was similar for approach A (0.94), slightly stronger for approach B (0.91), and slightly weaker for approach C (0.96). The RRs using analysis method 2 were very similar to those for analysis method 1.
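For the binary outcome, enumerating all 3¹⁰ combinations can be avoided by drawing a random sample of 10 000 combinations, as the text describes. The sketch below uses made-up 2×2 counts per abstractor and pools log risk ratios with a DerSimonian-Laird random-effects model; it is an assumption-laden illustration, not the trial's analysis code.

```python
import math
import random

random.seed(20)

# Illustrative abstractions for 10 studies: each abstractor's
# (events_tx, n_tx, events_ctl, n_ctl) for "at least 1 fall by 12 months";
# None marks an omitted outcome. Made-up numbers, not trial data.
studies = [
    [(30, 100, 34, 100), (30, 100, 34, 100), None],
    [(22, 80, 25, 80), (22, 80, 25, 80), (21, 80, 25, 80)],
    [(45, 150, 47, 150), (45, 150, 47, 150), (45, 150, 48, 150)],
    [(18, 60, 21, 60), None, (18, 60, 21, 60)],
    [(50, 120, 52, 120), (50, 120, 52, 120), (49, 120, 52, 120)],
    [(12, 40, 15, 40), (12, 40, 15, 40), (12, 40, 14, 40)],
    [(33, 90, 33, 90), (33, 90, 34, 90), (33, 90, 33, 90)],
    [(27, 75, 30, 75), (27, 75, 30, 75), None],
    [(40, 110, 41, 110), (40, 110, 41, 110), (39, 110, 41, 110)],
    [(16, 55, 19, 55), (16, 55, 19, 55), (16, 55, 19, 55)],
]

def pooled_rr(arms):
    """DerSimonian-Laird random-effects pooled risk ratio (log scale)."""
    logs, ses = [], []
    for e1, n1, e0, n0 in arms:
        logs.append(math.log((e1 / n1) / (e0 / n0)))
        ses.append(math.sqrt(1 / e1 - 1 / n1 + 1 / e0 - 1 / n0))
    w = [1.0 / s ** 2 for s in ses]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, logs)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, logs))
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (len(logs) - 1)) / c) if c > 0 else 0.0
    w_re = [1.0 / (s ** 2 + tau2) for s in ses]
    return math.exp(sum(wi * yi for wi, yi in zip(w_re, logs)) / sum(w_re))

# 3**10 = 59 049 combinations exist; sample 10 000 at random instead.
rrs = []
for _ in range(10_000):
    combo = [random.choice(study) for study in studies]
    rrs.append(pooled_rr([c for c in combo if c is not None]))

mean_rr = sum(rrs) / len(rrs)
```

Sampling rather than enumerating keeps the computation cheap while still characterizing the spread of pooled RRs across abstractor combinations.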

Aim 3: Disseminating the Study Findings

Table 12 presents a set of considerations to guide systematic reviewers in their choice of approach to data abstraction. We published manuscripts in peer-reviewed journals describing the features and functioning of DAA13 and the protocol for the DAA trial.14 We also are preparing a separate manuscript that describes the primary results of the DAA trial. We presented DAA and the DAA trial design at the Cochrane Colloquium in 201615 and at the Global Evidence Summit in 2017.22 We presented the results of the DAA trial at the annual meeting of the Society for Research Synthesis Methodology in 201823 and at the Cochrane Colloquium in 2018.24 We also responded to a manuscript about reducing research waste in systematic reviews.25


Table 12. Considerations When Selecting Data Abstraction Approaches During Systematic Reviews

Data abstraction system
Use electronic data abstraction systems where possible. The system chosen should be able to implement best practices of form development, enhance open science and reproducibility, and reduce research waste.

Form development
• Pilot test the form.
• Provide clear instructions.
• Provide definitions to clarify terms.
• Minimize open-ended questions.
• Use existing templates and existing (and common) data items, tailoring questions to specific topics as needed.

Training and composition of data abstractor team
Conduct regular and ongoing training to reinforce methods and prevent inconsistencies in interpretation of data items.

Data abstraction approach (directly informed by findings of the DAA trial)
• Avoid single data abstraction, to minimize errors.
• Single abstraction plus verification (without using DAA) leads to a similar amount of errors overall as independent dual abstraction plus adjudication but takes substantially less time. Single abstraction plus verification may lead to more errors than independent dual abstraction plus adjudication for data items related to outcomes and results.
• DAA-facilitated single abstraction plus verification could be considered for the following reasons:
  1. It has overall error proportions similar to those of independent dual abstraction plus adjudication.
  2. It takes substantially less time than independent dual abstraction plus adjudication.
  3. It has the potential to promote reproducible science through creation of permanent linkages between abstracted data and their sources (something that single abstraction plus verification without DAA does not do). This could facilitate the updating of systematic reviews and sharing of previously abstracted data for other purposes.
  4. It can contribute to evaluating and advancing the use of various automated and semiautomated natural-language processing and machine-learning tools for systematic review production.
• Regardless of the approach chosen, pay careful attention to data items that are more prone to errors (eg, outcomes and numeric results) and those that are subjective and require judgment (eg, risk of bias). These types of data items may benefit from independent dual abstraction plus adjudication.

Managing abstracted data
• Anticipate challenges associated with the complexities of data management, especially for large systematic reviews, and plan accordingly.
• Decide whether calculation-type questions should be dealt with during data abstraction or centrally during data management.

Abbreviation: DAA, Data Abstraction Assistant.


DISCUSSION

We developed DAA to assist the data abstraction process during systematic reviews and tested DAA using a randomized crossover trial conducted online. We found that although the overall error proportions were similar among the 3 data abstraction approaches tested in the DAA trial (range, 15%-17%), DAA-assisted single abstraction plus verification (approach A) was associated with higher odds of errors than were the other 2 approaches, especially for data items related to study outcomes and results. The overall and data type–specific error proportions for single abstraction plus verification (approach B) were similar to those for independent dual data abstraction plus adjudication (approach C). Regardless of the abstraction approach, certain types of data items (namely, outcomes/results) were more prone to errors, and a large proportion of errors in numeric results were omissions, because data abstractors missed certain outcomes. Approach A took substantially less time than approach C but longer than approach B.

Error Proportions Observed and Context for Study Results

There are several possible reasons for the relatively high error proportions observed in our study. First, the error proportions might be higher for studies in which the quality of reporting was poor, a factor we did not evaluate in this study. Ambiguity of reporting poses challenges for data abstraction. In addition, accurate abstraction of certain data items requires a nuanced understanding of methodological and statistical concepts related to study design and analysis. Although we required as an eligibility criterion for the DAA trial that participants have experience with data abstraction, we did not evaluate data abstractor expertise related to statistics or clinical trial methodology. We also did not require data abstractors to have knowledge related to the content of the topics of the reviews. In addition, data abstractors were not involved in conceiving the systematic reviews, such as protocol development, screening of studies, and form design and testing. By participating in these activities, data abstractors develop domain knowledge and become familiar with relevant concepts, terminology, measures, and methods. As such, it is possible that the data abstractors in the DAA trial were less familiar with the data items to be abstracted than might be expected of data
abstractors working on real-life systematic reviews. Finally, although we tried to make the questions and instructions on the data abstraction forms as clear as possible, we did not intervene to improve the quality of data abstraction midway, as might be attempted in real-life systematic reviews through regular and ongoing training and group discussions.

The highest proportions of errors were observed for data items related to outcomes and results, and most of these errors arose because of omissions, either of entire outcomes or specific fields within outcomes. The proportions of errors were also high for data items that require judgment (eg, risk of bias). Because opinions may vary even among the most experienced data abstractors, the so-called errors in data items that require judgment may simply reflect a range of views and interpretations. Quality assurance procedures, including development of detailed protocols and data abstraction instructions, and regular and ongoing training of data abstractors,26 should focus on these areas to minimize errors.

Differences in Error Proportions and Time Among Data Abstraction Approaches

The overall error proportions were similar among the 3 approaches in the DAA trial (range, 15%-17%). However, when focusing on data items related to outcomes and results, we noted that DAA-assisted single abstraction plus verification (approach A) was associated with a higher proportion of errors (41%) than single abstraction plus verification without DAA (approach B; 36%) and independent dual abstraction plus adjudication (approach C; 31%). At the same time, the 2 verification approaches (A and B) required considerably less time (almost 1 hour less per article) than independent dual data abstraction plus adjudication (C); this translated into a time saving of more than one-third. The independent nature of abstraction in approach C, coupled with both abstractors having to spend time adjudicating their data, likely led to approach C taking the longest time. This suggests that precautions such as independent dual data abstraction plus adjudication may be most important for data needed for meta-analysis but may not be necessary for all items.


Possible Reasons for Higher Error Proportions With DAA

The higher error proportions using DAA than the other 2 approaches may arise from several sources. First, DAA is a new software application that was tested using data abstractors who were naive to using it. Although we provided data abstractors with training videos for using DAA, some of the errors might be related to abstractors being unfamiliar with a new technology. Second, we did not monitor whether DAA was being used as intended. When placing and reviewing flags, it is possible that abstractors flagged only the first instance of relevant information for a given data item in an article and missed other locations in the article that might have provided relevant information. It is also possible that DAA verifiers were anchored to what had already been flagged and, therefore, were particularly prone to missing information that was not flagged. Regarding these first 2 factors, it should be noted that when using DAA, it is good practice to flag all locations that contain relevant information for a given data item, and we had instructed abstractors as such. In addition, there generally is a delay before human performance with a new tool can be expected to peak. With diffusion of innovation, adequate training, and integration of the tool with other enhancements, such as the use of machine learning and natural-language processing to locate items in text, we surmise that these error proportions may become lower over time. Third, outcomes and results (the type of data items with the highest error proportions) are often reported in tables and figures. It is possible that the format of such tables and figures in some articles used in the DAA trial did not allow for appropriate flagging of specific-enough pieces of text to answer specific data items, thereby limiting the value of using DAA.

Subpopulation Considerations

The consideration of subpopulations is not applicable because the interventions in the DAA trial (ie, data abstraction approaches) were not expected to affect the health of data abstractors.

Value of Using DAA and Implications for Future Research

To the extent that DAA is used appropriately, it has the potential to promote reproducible science through the creation of permanent linkages between abstracted data and
their sources. This facilitates the updating of systematic reviews and sharing of previously abstracted data for other purposes. DAA also can contribute to evaluating the performance of various automated or semiautomated tools that facilitate data abstraction during systematic reviews. These tools use natural-language processing and machine-learning approaches to assist with data abstraction.27 Most existing tools focus on automating the abstraction of data items, such as number, age, and sex of participants; number of recruiting centers; intervention groups; and outcomes.28 A few tools can abstract information about study objectives and certain aspects of study design (eg, study duration, participant flow) and risk of bias.28,29 However, to date, most of the data items typically abstracted during systematic reviews, including outcomes and results needed for meta-analyses, have not been explored for automated abstraction. Before automated tools for text identification and highlighting can achieve the goals set for their use, their performance should be evaluated using a common data set. The linkages created by DAA can facilitate this evaluation and provide lessons about how these tools can fit into existing systematic review workflows.

Challenges With Independent Dual Data Abstraction Plus Adjudication

The 2 verification approaches required considerably less time (almost 1 hour less per article) than did independent abstraction. In addition, our findings suggest that, for approach C, the step of adjudication (ie, by 2 data abstractors after the initial independent abstraction) took about two-thirds the amount of time as the initial abstraction by 1 data abstractor (Table 8). When adjudicating, data abstractors had to reorient themselves to the content of the article, identify discrepancies on their data collection forms in the data abstraction system (in our case, SRDR), and discuss the discrepant fields to arrive at consensus. Such reorientation would likely take longer as the time between initial abstraction and the adjudication session(s) increases. The time taken for adjudication would possibly be less if SRDR could automate the comparison step in identifying discrepancies. Such a data comparison tool, now available in the newly launched SRDR Plus (https://srdrplus.ahrq.gov), was unavailable during the DAA trial.


Implications and Uptake of Study Results

The findings of the DAA trial fill a critical methodological gap in our current understanding of data abstraction best practices, as revealed in a 2017 systematic review30 and the 2011 IOM standards for systematic reviews.1 Both documents identified only 1 study (by Buscemi et al4) that had compared verification with independent abstraction. In the Buscemi et al study, the absolute error proportions were similar between the 2 approaches (17.7% for verification and 14.5% for independent abstraction). These proportions are consistent with the error proportions in our study (range, 15%-17%). However, the main conclusion of the Buscemi et al study (that the verification approach resulted in more errors than did independent dual abstraction) was based on a relative difference: a 21.7% lower error proportion for independent abstraction (P = .02).4 Our study may now be added to the information needed to determine best practices for systematic reviews.

Study Limitations and Strengths

The current version of the DAA software created in aim 1 has some limitations. Currently, the smallest unit of text that can be highlighted as source material for a given data item is an entire line in a paragraph in the source document. Also, DAA currently does not allow the highlighting of text in image-based tables and figures. We are continuing to develop and refine DAA to address these limitations.

The DAA trial itself (aim 2) had some limitations. First, we evaluated as a test case DAA’s compatibility with only 1 data abstraction system (ie, SRDR). Second, certain questions on the data abstraction forms might have been reasonably interpreted differently by different pairs of data abstractors, leading to multiple acceptable answers. For example, in instances when an article provided no or ambiguous information about masking of outcome assessors, the distinction between “No,” “Not reported,” and “Not applicable” might not be readily apparent. This might have artificially inflated the error proportions for such questions.

This study also has several strengths. First, we designed DAA to be open access, open source, and free. To our knowledge, DAA is the only software application that enables tracking
of the source of abstracted data. To the extent that DAA is used as intended, it has the potential to promote reproducible science through the creation of permanent linkages between abstracted data and their sources. Such links may facilitate the updating of reviews and sharing of previously abstracted data for other purposes. Second, DAA is compatible with a wide range of data abstraction systems. Third, related to the DAA trial, we used a rigorous and efficient crossover design with random allocation and allocation concealment, testing the effectiveness of DAA-assisted single abstraction plus verification vis-à-vis 2 standard approaches to data abstraction. The studies included for data abstraction in the trial covered a range of topics and examined a range of outcomes. Fourth, we included 52 data abstractors in the trial, and each completed all steps of the trial; we had no missing data. The generalizability of the trial is likely high because of the broad eligibility criteria for data abstractors from multiple locations and organizations with various types of backgrounds and levels of experience with data abstraction for systematic reviews. Fifth, we obtained from all 52 participants, on their completion of activities in the trial, their opinion about the user friendliness of DAA and suggestions for its improvement (which we are incorporating). Once we incorporate these suggestions and make other improvements, we will make DAA available to the public, initially through SRDR and then for use with other data abstraction systems. Sixth, we collaborated with multiple stakeholders, including patients, in developing DAA and designing, conducting, analyzing, and disseminating the results of the DAA trial. Finally, our multipronged dissemination strategy likely will ensure that the DAA software reaches various systematic review stakeholders.


CONCLUSIONS

Because data abstraction is still largely a manual process, errors in data abstraction are almost inevitable and, in some cases, quite frequent. Users of systematic reviews, including patients, clinicians, guideline developers, and others, should be aware that systematic reviews may sometimes be based on inaccurately abstracted data. However, on the basis of findings from this study, we do not know and cannot predict how the conclusions of an individual systematic review and meta-analysis might be affected by data abstraction errors.

Systematic reviewers should always adopt quality assurance procedures during data abstraction, develop detailed protocols and instructions, and regularly train data abstractors. Such efforts should focus on areas where error proportions are particularly high, such as data items related to study outcomes and results.

In summary, considering accuracy and efficiency together, our findings suggest independent dual abstraction plus adjudication is necessary for outcomes and results data during systematic reviews; a verification approach is sufficient for other types of data. By linking abstracted data with their exact source, DAA provides an audit trail that is crucial for reproducible research and complete transparency. Reviewers should choose their data abstraction approach on the basis of the inevitable trade-off between saving time and minimizing errors.


REFERENCES

1. Eden J, Levit L, Berg A, Morton S, eds; Committee on Standards for Systematic Reviews of Comparative Effectiveness Research, Board on Health Care Services. Finding What Works in Health Care: Standards for Systematic Reviews. National Academies Press; 2011.

2. Chandler J, Churchill R, Higgins J, Lasserson T, Tovey D. Methodological Expectations of Cochrane Intervention Reviews (MECIR). Version 1.05. January 2018. Accessed December 5, 2018. https://community.cochrane.org/mecir-manual

3. Wallace BC, Dahabreh IJ, Schmid CH, Lau J, Trikalinos TA. Modernizing the systematic review process to inform comparative effectiveness: tools and methods. J Comp Eff Res. 2013;2(3):273-282.

4. Buscemi N, Hartling L, Vandermeer B, Tjosvold L, Klassen TP. Single data extraction generated more errors than double data extraction in systematic reviews. J Clin Epidemiol. 2006;59(7):697-703.

5. Gøtzsche PC, Hróbjartsson A, Maric K, Tendal B. Data extraction errors in meta-analyses that use standardized mean differences. JAMA. 2007;298(4):430-437.

6. Jones AP, Remmington T, Williamson PR, Ashby D, Smyth RL. High prevalence but low impact of data extraction and reporting errors were found in Cochrane systematic reviews. J Clin Epidemiol. 2005;58(7):741-742.

7. Centre for Reviews and Dissemination. Systematic reviews: CRD’s guidance for undertaking reviews in health care. York Publishing Services, Ltd. Published 2009. Accessed December 5, 2018. https://www.york.ac.uk/media/crd/Systematic_Reviews.pdf

8. Methods Guide for Effectiveness and Comparative Effectiveness Reviews. AHRQ Publication No. 10(14)-EHC063-EF. Agency for Healthcare Research and Quality website. Accessed December 5, 2018. https://effectivehealthcare.ahrq.gov/topics/cer-methods-guide/overview

9. Helfand M, Berg A, Flum D, Gabriel S, Normand S-L, eds; Patient-Centered Outcomes Research Institute Methodology Committee. Draft Methodology Report: Our Questions, Our Decisions: Standards for Patient-Centered Outcomes Research. Published July 23, 2012. Accessed December 5, 2018. http://pcori.org/assets/MethodologyReport-Comment.pdf

10. Ip S, Hadar N, Keefe S, et al. A Web-based archive of systematic review data. Syst Rev. 2012;1:15. https://link.springer.com/article/10.1186/2046-4053-1-15


11. Li T, Vedula SS, Hadar N, Parkin C, Lau J, Dickersin K. Innovations in data collection, management, and archiving for systematic reviews. Ann Intern Med. 2015;162(4):287-294.

12. Thomas J, Brunton J, Graziosi S. EPPI-Reviewer 4.0: Software for Research Synthesis. EPPI-Centre Software. Social Science Research Unit, Institute of Education, University of London; 2010. Accessed December 5, 2018. https://eppi.ioe.ac.uk/cms/Default.aspx?tabid=2967

13. Jap J, Saldanha IJ, Smith BT, Lau J, Schmid C, Li T; Data Abstraction Assistant Investigators. Features and functioning of Data Abstraction Assistant, a software application for data abstraction during systematic reviews. Res Synth Methods. 2019;10(1):2-14.

14. Saldanha IJ, Schmid CH, Lau J, et al. Evaluating Data Abstraction Assistant, a novel software application for data abstraction during systematic reviews: protocol for a randomized controlled trial. Syst Rev. 2016;5(1):196. https://doi.org/10.1186/s13643-016-0373-7

15. Saldanha IJ, Wen J, Schmid CH, Li T. Data Abstraction Assistant (DAA): what characteristics classify “experience” with data abstraction? Presented at 2016 Cochrane Colloquium; October 23-27, 2016; Seoul, South Korea.

16. Choi M, Hector M. Effectiveness of intervention programs in preventing falls: systematic review of recent 10 years and meta-analysis. J Am Med Dir Assoc. 2012;13(2):188.e13-21. doi: 10.1016/j.jamda.2011.04.022

17. Navarese EP, Kolodziejczak M, Schulze V, et al. Effects of proprotein convertase subtilisin/kexin type 9 antibodies in adults with hypercholesterolemia: a systematic review and meta-analysis. Ann Intern Med. 2015;163(1):40-51.

18. Fong DY, Ho JW, Hui BP, et al. Physical activity for cancer survivors: meta-analysis of randomised controlled trials. BMJ. 2012;344:e70. https://doi.org/10.1136/bmj.e70

19. Appleton KM, Sallis HM, Perry R, Ness AR, Churchill R. Omega-3 fatty acids for depression in adults. Cochrane Database Syst Rev. 2015;(11):CD004692. doi: 10.1002/14651858.CD004692.pub4

20. DerSimonian R, Laird N. Meta-analysis in clinical trials. Control Clin Trials. 1986;7(3):177-188.

21. Saldanha IJ, Jap J, Smith B, Dickersin K, Schmid CH, Li T. Data Abstraction Assistant (DAA): a new open-access tool being developed and tested in a randomized controlled trial. Presented at 2016 Cochrane Colloquium; October 23-27, 2016; Seoul, South Korea.


22. Saldanha IJ, Jap J, Smith B, Dickersin K, Schmid CH, Li T. Data Abstraction Assistant (DAA) – what can it do and does it work? Presented at 2017 Global Evidence Summit; September 13-16, 2017; Cape Town, South Africa.

23. Li T, Saldanha IJ, Smith B, et al. Data Abstraction Assistant, a new tool, saves time without compromising the accuracy of data abstraction during systematic reviews. Presented at 2018 Society for Research Synthesis Methodology Annual Meeting; July 8-10, 2018; Bristol, United Kingdom.

24. Li T, Saldanha IJ, Smith B, Jap J, Canner J, Schmid CH. Data Abstraction Assistant, a new tool, saves time without compromising the accuracy of data abstraction during systematic reviews. Presented at 2018 Cochrane Colloquium; September 16-18, 2018; Edinburgh, United Kingdom.

25. Jap J, Saldanha IJ, Smith BT, Lau J, Schmid C, Li T. Response to “Increasing value and reducing waste in data extraction for systematic reviews: tracking data in data extraction forms.” Syst Rev. 2018;7(1):18. https://systematicreviewsjournal.biomedcentral.com/articles/10.1186/s13643-018-0677-x

26. Goodman S, Dickersin K. Metabias: a challenge for comparative effectiveness research. Ann Intern Med. 2011;155(1):61-62.

27. Jonnalagadda SR, Goyal P, Huffman MD. Automating data extraction in systematic reviews: a systematic review. Syst Rev. 2015;4:78. https://systematicreviewsjournal.biomedcentral.com/articles/10.1186/s13643-015-0066-7

28. Marshall IJ, Kuiper J, Wallace BC. RobotReviewer: evaluation of a system for automatically assessing bias in clinical trials. J Am Med Inform Assoc. 2016;23:193-201.

29. Millard LAC, Flach PA, Higgins JPT. Machine learning to assist risk-of-bias assessments in systematic reviews. Int J Epidemiol. 2016;45:266-277.

30. Mathes T, Klaßen P, Pieper D. Frequency of data extraction errors and methods to increase data extraction quality: a methodological review. BMC Med Res Methodol. 2017;17(1):152. doi: 10.1186/s12874-017-0431-4


RELATED PUBLICATIONS

In Preparation

Li T, Saldanha IJ, Jap J, et al. A randomized trial evaluating Data Abstraction Assistant (DAA), a software application to facilitate data abstraction in systematic reviews.

Published

Jap J, Saldanha IJ, Smith BT, Lau J, Schmid C, Li T; Data Abstraction Assistant Investigators. Features and functioning of Data Abstraction Assistant, a software application for data abstraction during systematic reviews. Res Synth Methods. 2019;10(1):2-14.

Jap J, Saldanha IJ, Smith BT, Lau J, Schmid C, Li T. Response to “Increasing value and reducing waste in data extraction for systematic reviews: tracking data in data extraction forms.” Syst Rev. 2018;7(1):18. https://systematicreviewsjournal.biomedcentral.com/articles/10.1186/s13643-018-0677-x

Saldanha IJ, Schmid CH, Lau J, et al. Evaluating Data Abstraction Assistant, a novel software application for data abstraction during systematic reviews: protocol for a randomized controlled trial. Syst Rev. 2016;5(1):196. https://doi.org/10.1186/s13643-016-0373-7


ACKNOWLEDGMENTS

We are grateful to the 52 data abstractors who participated in the Data Abstraction Assistant (DAA) trial, as well as to the 98 individuals who provided consent but were not needed for participation. We are grateful to the patient stakeholders who participated as collaborators on this project: Vernal Branch (public policy manager and patient advocate), Sandra A. Walsh, BS (California Breast Cancer Organizations), and Elizabeth J. Whamond (Cochrane Consumer Network). Whenever possible, we used verbatim text from the published protocol and other manuscripts emanating from this project.

PCORI funded this work under contract no. ME-1310-07009.

DAA investigators: Joseph Lau, MD (Brown University School of Public Health); Kay Dickersin, MA, PhD (Johns Hopkins Bloomberg School of Public Health); Jesse A. Berlin, ScD (Johnson & Johnson); Vernal Branch (Public Policy Manager and Patient Advocate); Bryant T. Smith, MPH, CPH (Brown University School of Public Health); Simona Carini, MA (University of California, San Francisco, School of Medicine); Wiley Chan, MD (Kaiser Permanente Northwest); Berry De Bruijn, MSc, PhD (National Research Council Information and Communications Technologies Portfolio, Canada); Byron C. Wallace, PhD (Northeastern University College of Computer and Information Science); Susan M. Hutfless, MS, PhD (Johns Hopkins School of Medicine); Ida Sim, MD, PhD (University of California, San Francisco, School of Medicine); M. Hassan Murad, MD, MPH (Mayo Clinic); Sandra A. Walsh, BS (California Breast Cancer Organizations); Elizabeth J. Whamond (Cochrane Consumer Network).


APPENDICES

Appendix 1: Published paper describing the technical details of DAA

Received: 22 March 2018 | Revised: 5 October 2018 | Accepted: 10 October 2018
DOI: 10.1002/jrsm.1326

COMPUTATIONAL TOOLS AND METHODS

Features and functioning of Data Abstraction Assistant, a software application for data abstraction during systematic reviews

Jens Jap1 | Ian J. Saldanha1 | Bryant T. Smith1 | Joseph Lau1 | Christopher H. Schmid2 | Tianjing Li3 | on behalf of the Data Abstraction Assistant Investigators

1 Center for Evidence Synthesis in Health, Department of Health Services, Policy, and Practice, Brown School of Public Health, Providence, Rhode Island
2 Center for Evidence Synthesis in Health, Department of Biostatistics, Brown School of Public Health, Providence, Rhode Island
3 Center for Clinical Trials and Evidence Synthesis, Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland

Correspondence
Jens Jap, BA, Software Engineer, Center for Evidence Synthesis in Health, Department of Health Services, Policy, and Practice, Brown School of Public Health, 121 S Main St, Providence, RI 02903.
Email: [email protected]

Funding information
Patient-Centered Outcomes Research Institute, Grant/Award Number: ME-1310-07009

Introduction: During systematic reviews, data abstraction is labor- and time-intensive and error-prone. Existing data abstraction systems do not track specific locations and contexts of abstracted information. To address this limitation, we developed a software application, the Data Abstraction Assistant (DAA), and surveyed early users about their experience using DAA.

Features of DAA: We designed DAA to encompass three essential features: (1) a platform for indicating the source of abstracted information, (2) compatibility with a variety of data abstraction systems, and (3) user-friendliness.

How DAA functions: DAA (1) converts source documents from PDF to HTML format (to enable tracking of the source of abstracted information), (2) transmits the HTML to the data abstraction system, and (3) displays the HTML in an area adjacent to the data abstraction form in the data abstraction system. The data abstractor can mark locations on the HTML that DAA associates with items on the data abstraction form.

Experiences of early users of DAA: When we surveyed 52 early users of DAA, 83% reported that using DAA was either very or somewhat easy; 71% are very or somewhat likely to use DAA in the future; and 87% are very or somewhat likely to recommend that others use DAA in the future.

Discussion: DAA, a user-friendly software application for linking abstracted data with their exact source, is likely to be a very useful tool in the toolbox of systematic reviewers. DAA facilitates verification of abstracted data and provides an audit trail that is crucial for reproducible research.

KEYWORDS
data abstraction, data exchange, data tracking, software, systematic reviews

1 | INTRODUCTION

During the conduct of a systematic review, data abstraction (or data extraction) refers to the key process of identifying relevant information about studies included in the review and transferring this information to a data collection form (either verbatim or after interpretation and manipulation). Information about included studies is typically obtained from journal articles and other source documents. Data abstraction is typically performed by trained researchers with varying degrees of content and methodological expertise.

Res Syn Meth. 2018;1–13. wileyonlinelibrary.com/journal/jrsm © 2018 John Wiley & Sons, Ltd.

Data abstraction is labor- and time-intensive and often error-prone.1-7 The lack of an audit trail makes the verification of the abstracted information difficult. Errors made during data abstraction, which often remain undetected by peer reviewers, editors, and readers, can impact the validity of the results of systematic reviews.1 With a surge in collaborative science and reuse of previously abstracted data, it is likely that decision making by clinicians, policymakers, patients, and others may be compromised by data abstraction errors.

In recent years, several web-based data abstraction systems, such as the Systematic Review Data Repository (SRDR),8,9 Covidence,10 EPPI-Reviewer,11 DistillerSR (Evidence Partners, Ottawa, Canada), and Doctor Evidence (www.drevidence.com), have been built to aid the development and population of data abstraction forms for organizing data in an efficient way for subsequent data analysis. These data abstraction systems have a major limitation, however. While they can record which source documents are used for data abstraction, they do not track the specific locations and contexts of relevant pieces of information in these often lengthy documents. To enable accurate verification of abstracted data during a systematic review, the verifier often has to reread the entire source document or large swaths of it, a task that can require as much time as abstracting the data in the first place. An ability to track the specific location and context of abstracted data in source documents would likely help to document initial data abstraction and facilitate data verification and adjudication. This would likely promote the validity of the systematic review findings, save time, and advance the openness of the systematic review enterprise.

In this paper, we describe (1) the features and functioning of Data Abstraction Assistant (DAA), a free, open-source, open-access software application to facilitate tracking the location of abstracted information in source documents, with the potential to reduce errors and time spent during data abstraction in systematic reviews, and (2) the results from a survey of early users of DAA.

2 | FEATURES OF DAA

We designed DAA to encompass three desired essential features.

1. A platform for indicating the source of abstracted information. The major impetus behind the development of DAA was to create a platform where data abstractors could indicate the source of information by pinpointing specific locations in source documents, thereby creating a potentially permanent linkage (ie, tracking) between abstracted information and its source.

2. Compatibility with a variety of data abstraction systems. DAA's main purpose is to contain information that links individual abstracted data items to specific locations in source documents. To make DAA compatible with a variety of data abstraction systems (eg, SRDR, Covidence, and DistillerSR), we designed the DAA platform to be distinct from the data abstraction system. This distinction is attained by keeping separate the process of linkage with an item on the data abstraction form (in the data abstraction system) and the process of capturing and navigating to the location of information (in the source document). The source code for the implementation of this independence is available at https://bitbucket.org/cebmbrown/daa/src/master/. While distinct from the underlying data abstraction system, DAA is designed to appear to the end user as part of the web-based data abstraction system. This is done to enable a seamless user experience. While we have developed DAA to be compatible across data abstraction systems, we describe in this paper the test case of DAA's compatibility with SRDR.

3. User-friendliness. To make navigation easy and fast, we have developed DAA to be user-friendly and menu-driven. When abstracting data, the data abstractor visualizes DAA as integrated seamlessly into the data abstraction system.

For a technical description of how DAA achieves each of these desired features, see Table 1.

3 | HOW DAA FUNCTIONS BEHIND THE SCENES

DAA works through three steps (Figure 1):

Step 1: Converting documents from portable document format (PDF) to hypertext markup language (HTML) format. Most source documents that are used for data abstraction during systematic reviews (eg, journal articles, conference proceedings, approval packages from regulatory authorities, and clinical study reports [CSRs]) are accessed as PDF files. While documents are sometimes obtained in other formats (eg, websites or word-processing documents), our experience has been that most systematic review teams convert those documents into PDF format. Tracking the location of abstracted information from a given document is best achieved through annotating (ie, highlighting text in a different color, circling, or otherwise marking up) the specific source text, tables, and/or figures in that document.
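The conversion in Step 1 relies on pdf2htmlEX, the open-source command-line tool the authors name in this section. A minimal sketch of wrapping the documented basic invocation (pdf2htmlEX input.pdf output.html) is shown below; the helper names are ours, and a production pipeline would add output-directory handling and error reporting:

```python
import shutil
import subprocess
from pathlib import Path

def build_pdf2htmlex_command(pdf_path: str, html_name: str) -> list[str]:
    # Basic documented invocation: pdf2htmlEX <input.pdf> <output.html>
    return ["pdf2htmlEX", pdf_path, html_name]

def convert_pdf_to_html(pdf_path: str) -> str:
    """Convert a text-based PDF to HTML; raises if pdf2htmlEX is not installed."""
    if shutil.which("pdf2htmlEX") is None:
        raise RuntimeError("pdf2htmlEX not found on PATH")
    html_name = Path(pdf_path).with_suffix(".html").name
    subprocess.run(build_pdf2htmlex_command(pdf_path, html_name), check=True)
    return html_name
```

As the text notes, this approach works only for text-based PDFs; image-based (scanned) PDFs yield a single large image that such a conversion cannot annotate.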

TABLE 1 Technical description of how Data Abstraction Assistant (DAA) achieves our three desired essential features

1. A platform for indicating the source of abstracted information
We built a RESTful application programming interface (API) server, which exposes end-points and returns the following information (HTTP verb):
• List of documents with document titles and unique identifiers (GET)
• HTML of document (GET)
• List of markers (GET)
• Add a new marker to a document (POST)
• Remove a marker from a document (DELETE)

2. Compatibility with a variety of data abstraction systems
For DAA to work with a given data abstraction system, the data abstraction system must (1) have sufficient screen real estate to load the HTML document into view and (2) have a mechanism that lets the data abstractor isolate a section of the HTML document and send a web request to the DAA server with the following three pieces of information:
1. The text to highlight (STRING);
2. The position of the text identified by the unique class identifier of the text (STRING); and
3. The identification number of the document (NUMERIC).
DAA will receive this information and save it in its database. The data abstraction system's user interface then needs to reload the document or partially update it to reflect the flag placement by fetching the information from DAA. DAA also makes available a list of existing flags. The integrated system must have a mechanism to display this information as well, so that the data abstractor can choose from the list and select a marker to display.

3. User-friendliness
We built DAA's interface to have a simple-to-navigate, menu-driven design. The data abstractor receives immediate feedback upon each step, and menus are updated in real time as the data abstractor makes selections. The update to the selection menu is done using Google's AngularJS components. Whenever the data abstractor makes a selection, a JavaScript on-update listener triggers an API call to the DAA server and updates each component's content.
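Rows 1 and 2 of the table above can be illustrated with an in-memory stand-in for the DAA marker server: the methods below mirror the GET/POST/DELETE end-points, and the three fields of add_marker are the three pieces of information listed in row 2. This is an illustrative sketch, not DAA's actual code; the class and method names are ours.

```python
import itertools

class MarkerStore:
    """In-memory stand-in for the DAA marker server (illustrative only)."""

    def __init__(self):
        self._next_id = itertools.count(1)
        self._markers = {}  # marker id -> marker record

    def add_marker(self, text: str, position: str, document_id: int) -> int:
        """POST: save the three pieces of information named in Table 1, row 2."""
        marker_id = next(self._next_id)
        self._markers[marker_id] = {
            "text": text,                # the text to highlight (STRING)
            "position": position,        # unique class identifier of the text (STRING)
            "document_id": document_id,  # identification number of the document (NUMERIC)
        }
        return marker_id

    def list_markers(self, document_id: int) -> list[dict]:
        """GET: markers for one document, so the abstraction system can redraw flags."""
        return [m for m in self._markers.values() if m["document_id"] == document_id]

    def remove_marker(self, marker_id: int) -> None:
        """DELETE: drop an existing marker."""
        self._markers.pop(marker_id, None)
```

As row 2 describes, after a POST the integrated data abstraction system would re-fetch this list to reflect the new flag placement in its user interface.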

Annotating PDFs is extremely challenging because of the lack of open-source solutions (ie, software for which the original source code is made freely available and may be redistributed and modified) for manipulating and editing PDFs. Therefore, we elected to convert source documents from PDF to HTML, a format that has many open-source solutions for editing and location tagging. For converting PDFs to HTML format, we use pdf2htmlEX (an open-source tool available at https://github.com/coolwanglu/pdf2htmlEX). Another advantage of converting documents from PDF to HTML format is that distinct versions of a given document may originate from different publishers or platforms. Such distinctions arise because publishers or platforms might render PDFs of the same document using different tools (eg, Adobe Reader and Preview), resulting in PDFs of inconsistent format. Inconsistencies in format are largely eliminated when DAA converts source documents into HTML format because during the conversion process, images and plain text are separated, creating a more consistent representation of the documents. The HTML format offers consistency because text and images are clearly marked as such, while, in PDF format, whether an item is text or an image is sometimes ambiguous. In some cases, PDFs might originate as scanned versions of paper copies of documents, which, instead of being rendered as text-based PDFs, are rendered as image-based PDFs. Image-based PDFs are particularly challenging to work with because not all image processing software can read text from images accurately. This problem is compounded when the scanned version of a printed document is blurry and/or of otherwise poor quality. DAA currently does not function with PDFs that are image-based instead of text-based.

Step 2: Transmitting the HTML version of the source document to the data abstraction system. Once the source document is converted from PDF to HTML format, DAA uses encrypted communication to transmit the HTML to the data abstraction system (ie, SRDR in this instance). We authorize communication between DAA and SRDR through the use of security tokens, ie, unique hash numbers that are required for access (see Section 4). Security tokens ensure that only authorized users have access to view and edit markers to relevant source documents.
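The paper does not publish the details of DAA's token scheme, but the per-Document-Store security tokens described in Step 2 could be sketched generically as follows; the class, method names, and token format here are our assumptions, not DAA's implementation:

```python
import hmac
import secrets

class TokenRegistry:
    """Issue and verify opaque access tokens, one per Document Store (illustrative)."""

    def __init__(self):
        self._tokens = {}  # store id -> issued token

    def issue(self, store_id: str) -> str:
        # An unguessable 128-bit value, rendered as a 32-character hex string
        token = secrets.token_hex(16)
        self._tokens[store_id] = token
        return token

    def verify(self, store_id: str, presented: str) -> bool:
        expected = self._tokens.get(store_id)
        if expected is None:
            return False
        # Constant-time comparison avoids leaking token prefixes via timing
        return hmac.compare_digest(expected, presented)
```

In DAA's workflow, the data abstractor supplies such a token inside SRDR to unlock the HTML files in the corresponding Document Store (see Section 4).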

FIGURE 1 Pictorial representation of DAA's functioning. DAA, Data Abstraction Assistant; HTML, hypertext markup language; PDF, portable document format; SRDR, Systematic Review Data Repository [Colour figure can be viewed at wileyonlinelibrary.com]

Step 3: Displaying and allowing for annotating the HTML version of the source document in the data abstraction system. Once DAA transmits the HTML version of the source document to SRDR, SRDR displays the document in HTML format on the screen adjacent to the data abstraction form (Split Screen view, see online demonstration video at https://goo.gl/ZhAkq4). To the data abstractor, the HTML format appears exactly like the PDF version of the source document. Using a mouse, the data abstractor can then drag a flag from any item on the data abstraction form to any desired location on the adjacent HTML (Figure 2). Upon doing so, DAA creates a link between the specific item on the data abstraction form and the selected location on the HTML. DAA allows for the linkage of a given item on the data abstraction form with multiple locations on the HTML document and also the linkage of a given location on the HTML document with multiple items on the data abstraction form. Linkages between abstracted information and corresponding locations on the HTML document are saved as markers on the DAA server, so that the next time that particular source document is requested, the data abstractor may view the HTML document and all previously placed markers together. As a result, a running record of all markers added to any source document in the project is retained for future reference and access. By clicking on existing markers (for example, during data verification), DAA scrolls the screen to navigate to the exact location of the source text, with the pertinent text highlighted.

Because there may be multiple source documents for a given study, DAA allows the data abstractor to toggle between multiple HTML files when abstracting data in SRDR. When using this feature, the data abstractor can create links between a given item on the data abstraction form and locations on two or more separate HTML files. A single item that is linked to multiple locations will have one flag representing each of these linkages. This individual flag carries both the identifier of the HTML file and the exact linked location in the HTML file. Having both of these pieces of information allows DAA to switch to the document when clicking on the existing marker and scroll automatically to the location as described above, even if it is on a separate HTML file.

4 | HOW DAA FUNCTIONS AT THE DATA ABSTRACTOR END

DAA is designed to assist with data abstraction, which is a step that is carried out after the set of eligible studies for the systematic review is identified. Data abstractors can interface directly with DAA by logging into the password-protected DAA web application and uploading study documents (as PDFs). This uploading process can

FIGURE 2 Screenshot of how Data Abstraction Assistant (DAA) displays the source document in hypertext markup language (HTML) format (right) adjacent to the data abstraction form in the data abstraction system (Systematic Review Data Repository [SRDR], left). DAA (1) allows for placing of flags that link the data item to a specific location on the source document (red arrows) and (2) automatically records the content of the highlighted text in the data abstraction system (black arrow) [Colour figure can be viewed at wileyonlinelibrary.com]

be centrally managed by the project lead if a more protected governance of the data abstraction and document management process is desired.

Once source documents are uploaded in PDF format, they are converted into HTML format and organized into Document Stores, which are groups or collections of source documents. DAA assigns each Document Store a security token, allowing access to the HTML files from any systematic review project that the data abstractor is working on in SRDR. Upon logging into SRDR, SRDR requires the data abstractor to provide the security token in order to access the data abstractor's Document Stores. After the data abstractor selects the Document Store and, subsequently, a source document in HTML format, DAA transmits the HTML file to SRDR.

Once DAA transmits the HTML to SRDR, SRDR displays the HTML in an area adjacent to the abstraction form (see Section 3, step 3).

5 | EXPERIENCES OF EARLY USERS OF DAA

To evaluate rigorously the benefit of using DAA, we conducted an online randomized trial (DAA Trial) comparing the accuracy (ie, error rates) and efficiency of data abstraction (ie, time taken) using DAA versus verification and adjudication approaches that do not use DAA. As part of the DAA Trial, data abstractors abstracted various kinds of information pertaining to published studies, including study design, risk of bias, characteristics of study participants, treatment arms, outcomes, within-arm results, and between-arm results. We previously reported the detailed protocol for this trial2 and will report its results separately. In this section, we describe the results of surveying the trial participants, ie, the early users of DAA, with regards to the user-friendliness of DAA.

5.1 | Methods of survey

We surveyed all 52 individual data abstractors enrolled in the DAA Trial. As part of the trial, we organized these individuals into 26 pairs, each pair comprising one less experienced and one more experienced data abstractor. During the DAA Trial, each pair abstracted data from six studies; for two of the six studies, the pair used DAA to abstract data (two other abstraction approaches without DAA were used for the other four studies that the pair abstracted). For both studies assigned to a given pair for abstraction using DAA, the junior data abstractor first abstracted data, and then the senior data abstractor verified the data, making changes to the abstraction as needed. Therefore, it was always the junior data abstractor who placed flags and the senior data abstractor who could view or remove existing flags or add new flags.

After completing data abstraction for the DAA Trial, we asked each data abstractor to complete a brief survey designed using Qualtrics. We asked questions pertaining to the data abstractor's self-reported ease with which each of the following tasks could be completed: (1) opening source documents in Split Screen view in SRDR, (2) scrolling between pages of a source document, (3) placing flags on a source document, and (4) clicking on existing flags to automatically navigate to the relevant location on the source document. We also asked data abstractors to assess the overall ease of using DAA and to indicate the DAA feature that they liked the most. Finally, we asked data abstractors about their likelihood of using DAA in the future and of recommending that others use it in the future. Appendix A includes the entire survey instrument.

5.2 | Results of survey

All 52 data abstractors who participated in the DAA Trial completed the survey (Table 2). Most data abstractors (43/52, 83%) found using DAA to be either very or somewhat easy overall. Opening source documents in Split Screen view and scrolling between pages of a source document were reported to be easy by 83% and 69% of data abstractors, respectively. Among those who placed flags initially, ie, less experienced data abstractors, 62% agreed that doing so was easy. Among those who clicked on existing flags, ie, more experienced abstractors, 73% agreed that doing so was easy.

When asked about use of DAA for data abstraction in the future, 65% of less experienced and 77% of more experienced data abstractors stated they are very or somewhat likely to use it. Similarly, 80% of less experienced and 93% of more experienced data abstractors stated that they are very or somewhat likely to recommend that others use it (see Table 2 for a detailed breakdown of responses).

When asked to name their favorite DAA feature, 54% of all data abstractors chose the ability to click on existing flags marking information sources (73% of senior data abstractors named this feature); 19% of data abstractors chose the ability to open a document in Split Screen view; and 17% chose the ability to place flags on the PDF (23% of junior data abstractors named this feature) (Table 2).

6 | DISCUSSION

In this paper, we described the features and functioning of DAA, a software application that assists the data abstraction process during systematic reviews by enabling viewing the source document (eg, journal article)

TABLE 2 Survey responses, by level of experience with data abstraction

Values are n (%). Columns: Less Experienced Data Abstractors (N = 26) | More Experienced Data Abstractors (N = 26) | All Data Abstractors (N = 52)

Ease of use of DAA

It was easy to open documents in Split Screen view in SRDR.
Agree: 20 (77) | 23 (88) | 43 (83)
Disagree: 2 (8) | 0 (0) | 2 (4)
Neutral: 4 (15) | 3 (12) | 7 (13)

It was easy to scroll between pages on the document.
Agree: 17 (65) | 19 (73) | 36 (69)
Disagree: 3 (12) | 3 (12) | 6 (12)
Neutral: 6 (23) | 4 (15) | 10 (19)

It was easy to place flags on the document.
Agree: 16 (62) | N/A | N/A
Disagree: 3 (12) | N/A | N/A
Neutral: 7 (27) | N/A | N/A

It was easy to click on existing flags to automatically navigate to a relevant location on the document.
Agree: 16 (61) | 21 (85) | 37 (71)
Disagree: 2 (8) | 1 (4) | 3 (6)
Neutral: 8 (31) | 4 (15) | 12 (23)

Overall ease of using DAA
Very easy: 5 (19) | 9 (35) | 14 (27)
Somewhat easy: 17 (65) | 12 (46) | 29 (56)
Somewhat difficult: 2 (8) | 3 (12) | 5 (10)
Very difficult: 0 (0) | 0 (0) | 0 (0)
Neutral: 2 (8) | 2 (8) | 4 (8)

Future use of DAA

Likelihood of using DAA for data abstraction in the future.
Very likely: 10 (38) | 12 (46) | 22 (42)
Somewhat likely: 7 (27) | 8 (31) | 15 (29)
Somewhat unlikely: 2 (8) | 4 (15) | 6 (12)
Very unlikely: 1 (4) | 0 (0) | 1 (2)
Neutral: 6 (23) | 2 (8) | 8 (15)

Likelihood of recommending that others use DAA for data abstraction in the future.
Very likely: 11 (42) | 15 (58) | 26 (50)
Somewhat likely: 10 (38) | 9 (35) | 19 (37)
Somewhat unlikely: 0 (0) | 1 (4) | 1 (2)
Very unlikely: 2 (8) | 0 (0) | 2 (4)
Neutral: 3 (12) | 1 (4) | 4 (8)

Favorite DAA feature (even if abstractors themselves did not use the feature)
Ability to open a document in Split Screen view: 7 (27) | 3 (12) | 10 (19)
Ability to place flags on the document: 6 (23) | 3 (12) | 9 (17)
Ability to click on existing flags to navigate: 9 (35) | 19 (73) | 28 (54)
Ability to copy text to SRDR: 3 (12) | 0 (0) | 3 (6)
Other: 0 (0) | 1 (4) | 1 (2)
None (ie, no particular ability stood out): 1 (4) | 0 (0) | 1 (2)

Abbreviations: DAA, Data Abstraction Assistant; SRDR, Systematic Review Data Repository. Percentages are calculated using the column totals as the denominator.

adjacent to the data abstraction form in the data abstraction system and enabling the tracking of the source of abstracted data. When we surveyed 52 early users of DAA, most found the software user-friendly, most would use it, and most would recommend that others use it for data abstraction in the future. The most popular feature of DAA appears to be the ability to click on existing flags to navigate to portions of text/figures/tables in the source document that contain relevant data, a feature that could be very useful when verifying abstracted data and when updating systematic reviews.

6.1 | Potential utility of DAA for systematic reviews

DAA is likely to be a very useful tool in the toolbox of systematic reviewers. Systematic reviews take a median of 66 weeks from registration to publication (interquartile range 42 wk, range 6-186 wk).12 With an ever-growing size of the body of relevant evidence in most topic areas, this duration is likely to get even longer. Data abstraction accounts for a large share of the time spent conducting systematic reviews, and tools such as DAA have the potential to reduce that time. The utility of DAA would likely be further enhanced if systematic reviewers choose to share their annotations publicly, thus allowing future systematic reviewers to capitalize on existing annotations in new systematic reviews. Similarly, in the case of review updates, access to existing annotations and exact data source locations could greatly reduce the time spent on data abstraction.

6.2 | DAA in the context of automated tools for systematic reviews

DAA can contribute to evaluating the performance of various automated or semiautomated tools that facilitate data abstraction during systematic reviews. These tools use natural language processing and machine learning approaches to assist with data abstraction. Most existing tools focus on automating the abstraction of data elements such as number of participants, their age, sex, country, recruiting centers, intervention groups, and outcomes.13 A few tools are able to abstract information about study objective and certain aspects of study design (eg, study duration and participant flow) and risk of bias.14,15 However, to date, most of the data elements that are typically abstracted during systematic reviews have not been explored for automated abstraction. Before automated tools for text identification and highlighting can achieve their goals, their performance should be evaluated using a common dataset. The markers placed by DAA can facilitate this evaluation and provide lessons about how these tools can fit into existing systematic review workflows. Further, the manually annotated data collected by the tool could be used as training data for supervised machine learning approaches. Even in a future where the process of identification and highlighting of relevant locations of data elements in source documents is satisfactorily automated by these other tools, the features that DAA offers will be a much-needed complement by allowing manual tracking and checking of data elements, entry of the data elements into a data abstraction system, and creation of a permanent linkage between abstracted data and their sources.

DAA can be particularly useful when extracting data from trial reports not traditionally used during systematic reviews, for example, CSRs and regulatory documents. A CSR contains an unabridged and comprehensive description of the clinical problem, design, conduct, and results of a clinical trial, following structure and content guidance prescribed by the International Conference on Harmonization (ICH).16 To obtain marketing approval of drugs or biologics for a specific indication, pharmaceutical companies submit CSRs and other required materials to regulatory authorities. CSRs differ from trial datasets (ie, electronic individual patient data) in that they are paper (or mostly PDF) documents and can be thousands of pages long. CSRs typically contain a wealth of information for evaluating the efficacy and safety of pharmacological treatments, including information that is often missing from the public domain.17,18 Regulatory documents are summaries of CSRs and related files, prepared by the regulatory agency's staff as part of the process of approving the products. Regulatory documents are usually made available to the public in PDF format. Abstracting information and verifying and reconciling abstracted information from CSRs and regulatory documents can be particularly laborious17; DAA can greatly assist these processes by enabling tracking of the source of the information.

6.3 | Current limitations of DAA

We expect to release DAA for use by the general public by September 2018. We are continuing to develop and refine DAA to address its current limitations and the feedback provided by the early users. For example, currently, the smallest unit of text that can be highlighted as source material for a given data item is an entire line in a paragraph in the source document (Figure 1). We are updating this feature to allow for the highlighting of more granular amounts of text, such as single words or partial words, by implementing a technique of creating ranges or groupings of characters. We do this by defining the beginning and the end of a character range and the character range's location in the source document. This improvement will also help overcome another limitation: DAA does not currently allow for noncontiguous sections of text to be highlighted together. Once the ability to create ranges of text is in place, we can also group together these ranges of characters, allowing the grouping of noncontiguous sections of text. Another limitation is that DAA currently does not allow the highlighting of text that is in image-based tables and figures. DAA can, however, highlight text in tables and figures that are in text format. We plan on overcoming these limitations by incorporating into DAA general purpose annotation tools, such as Annotator.js, that have a more powerful set of annotation features. Finally, DAA is unable to read text from PDFs that are scanned documents of poor quality. In these instances, conversion of the scanned PDF to HTML format results in one large image (as opposed to text), which cannot be read or annotated. Addressing this challenge will likely require the use of commercially available software packages that convert images to text.

In summary, we described the features and functioning of DAA, a software application to facilitate tracking of the location of abstracted information in source documents, with the potential to reduce errors and time spent during data abstraction in systematic reviews. When we surveyed 52 early users of DAA, 83% stated that they found using DAA to be either very or somewhat easy; 71% stated they are very or somewhat likely to use DAA in the future; and 87% stated that they are very or somewhat likely to recommend that others use DAA in the future.

ACKNOWLEDGMENTS

We thank the Data Abstraction Assistant investigators who provided comments on a draft of this manuscript: Jesse A. Berlin, Simona Carini, Susan M. Hutfless, M. Hassan Murad, and Ida Sim. We would also like to acknowledge the other Data Abstraction Assistant investigators for their contributions to this project: Vernal Branch, Wiley Chan, Berry De Bruijn, Kay Dickersin, Byron C. Wallace, Sandra A. Walsh, and Elizabeth J. Whamond. The Patient-Centered Outcomes Research Institute (PCORI) sponsored the development of DAA and the DAA Trial under contract number ME-1310-07009.

CONFLICT OF INTEREST

The authors reported no conflict of interest.

ORCID

Jens Jap http://orcid.org/0000-0003-4625-683X
Christopher H. Schmid http://orcid.org/0000-0002-0855-5313

REFERENCES

1. Mathes T, Klaßen P, Pieper D. Frequency of data extraction errors and methods to increase data extraction quality: a methodological review. BMC Med Res Methodol. 2017;17(1):152.

2. Saldanha IJ, Schmid CH, Lau J, et al. Evaluating Data Abstraction Assistant, a novel software application for data abstraction during systematic reviews: protocol for a randomized controlled trial. Syst Rev. 2016;5(1):196.

3. Carroll C, Scope A, Kaltenthaler E. A case study of binary outcome data extraction across three systematic reviews of hip arthroplasty: errors and differences of selection. BMC Res Notes. 2013;6(1):539.

4. Horton J, Vandermeer B, Hartling L, Tjosvold L, Klassen TP, Buscemi N. Systematic review data extraction: cross-sectional study showed that experience did not increase accuracy. J Clin Epidemiol. 2010;63(3):289-298.

5. Gøtzsche PC, Hróbjartsson A, Maric K, Tendal B. Data extraction errors in meta-analyses that use standardized mean differences. JAMA. 2007;298(4):430-437.
6. Buscemi N, Hartling L, Vandermeer B, Tjosvold L, Klassen TP. Single data extraction generated more errors than double data extraction in systematic reviews. J Clin Epidemiol. 2006;59(7):697-703.
7. Jones AP, Remmington T, Williamson PR, Ashby D, Smyth RL. High prevalence but low impact of data extraction and reporting errors were found in Cochrane systematic reviews. J Clin Epidemiol. 2005;58(7):741-742.
8. Li T, Vedula SS, Hadar N, Parkin C, Lau J, Dickersin K. Innovations in data collection, management, and archiving for systematic reviews. Ann Intern Med. 2015;162(4):287-294.
9. Ip S, Hadar N, Keefe S, et al. A web-based archive of systematic review data. Syst Rev. 2012;1(1):15. https://doi.org/10.1186/2046-4053-1-15
10. Covidence systematic review software. Veritas Health Innovation, Melbourne, Australia. Available at www.covidence.org. Last accessed June 18, 2018.
11. Thomas J, Brunton J, Graziosi S. EPPI-Reviewer 4.0: software for research synthesis. EPPI-Centre Software. London: Social Science Research Unit, Institute of Education, University of London; 2010. Available at https://eppi.ioe.ac.uk/cms/Default.aspx?tabid=2967. Last accessed June 18, 2018.
12. Borah R, Brown AW, Capers PL, Kaiser KA. Analysis of the time and workers needed to conduct systematic reviews of medical interventions using data from the PROSPERO registry. BMJ Open. 2017;7(2):e012545. https://doi.org/10.1136/bmjopen-2016-012545
13. Jonnalagadda SR, Goyal P, Huffman MD. Automating data extraction in systematic reviews: a systematic review. Syst Rev. 2015;4(1):78.
14. Marshall IJ, Kuiper J, Wallace BC. RobotReviewer: evaluation of a system for automatically assessing bias in clinical trials. J Am Med Inform Assoc. 2016;23(1):193-201.
15. Millard LAC, Flach PA, Higgins JPT. Machine learning to assist risk-of-bias assessments in systematic reviews. Int J Epidemiol. 2016;45(1):266-277.
16. International Conference on Harmonisation of Technical Requirements for Registration of Pharmaceuticals for Human Use. ICH Harmonised Tripartite Guideline: Structure and Content of Clinical Study Reports E3. 1995. Available at www.ich.org/fileadmin/Public_Web_Site/ICH_Products/Guidelines/Efficacy/E3/E3_Guideline.pdf. Last accessed June 18, 2018.
17. Mayo-Wilson E, Li T, Fusco N, Dickersin K; MUDS investigators. Practical guidance for using multiple data sources in systematic reviews and meta-analyses (with examples from the MUDS study). Res Syn Meth. 2018;9(1):2-12.
18. Doshi P, Jefferson T. Clinical study reports of randomised controlled trials: an exploratory review of previously confidential industry reports. BMJ Open. 2013;3(2):e002496. https://doi.org/10.1136/bmjopen-2012-002496

How to cite this article: Jap J, Saldanha IJ, Smith BT, et al. Features and functioning of Data Abstraction Assistant, a software application for data abstraction during systematic reviews. Res Syn Meth. 2018;1-13. https://doi.org/10.1002/jrsm.1326

Appendix 2: Survey instrument

Data Abstraction Assistant (DAA) Trial Exit Survey for DAA Trial Participants (Data Abstractors)

Purposes of Survey This is a paper copy of a survey that is being administered online using Qualtrics®. The survey will be administered to each individual participant (data abstractor) in the DAA Trial. The purposes of the survey are to: 1. Obtain the opinion of the Data Abstraction Assistant (DAA) Trial Participants regarding their use of the DAA software; and 2. Obtain suggestions from the Participants for ways in which the software can be improved.

Instructions for Survey Respondents
● Please answer all 10 questions in this survey.
● We estimate that completing this survey will take you no more than 3 minutes.
● If you have any questions during the completion of this survey, please email Ian Saldanha at [email protected].

Q1. What is your FIRST name? Your response will be kept confidential. Please note that we only share data from this survey in aggregate.

Q2. What is your LAST name? Your response will be kept confidential. Please note that we only share data from this survey in aggregate.

Q3. Have you EVER worked with any of the following systematic review tools BEFORE the DAA Trial? (Select all that apply)
[ ]1 Abstrackr
[ ]2 Cochrane Author Support Tool (CAST)
[ ]3 Covidence
[ ]4 DistillerSR™
[ ]5 DOC Data
[ ]6 Early Review Organizing Software (EROS)
[ ]7 EPPI-Reviewer
[ ]8 OpenMeta[Analyst]
[ ]9 Rayyan
[ ]10 Review Manager (RevMan)
[ ]11 System for the Unified Management, Assessment, and Review of Information (SUMARI)
[ ]12 Other (please specify): ______13

[ ]14 I used some tool, but don’t remember which one
[ ]15 None of the above

Q4. Do you agree or disagree with each of the following statements?

Agree Neutral Disagree

a. It was easy to open documents in Split Screen view in SRDR. ( )1 ( )2 ( )3
b. It was easy to scroll between pages on the document. ( )1 ( )2 ( )3
c. It was easy to click on existing flags to automatically navigate to a relevant location on the document. ( )1 ( )2 ( )3

Q5. Do you agree or disagree with the following statement?

Agree Neutral Disagree Not applicable (I was a second data abstractor and did not place flags myself.)

a. It was easy to place flags on the document. ( )1 ( )2 ( )3

Q6. How would you characterize the OVERALL EASE of using DAA? (Select one)

( )1 Very easy
( )2 Somewhat easy
( )3 Neutral
( )4 Somewhat difficult
( )5 Very difficult
( )6 No opinion/don’t know

Q7. If DAA is available at no cost the next time you conduct a systematic review, how likely would you be to USE DAA for data abstraction? (Select one)

( )1 Very likely
( )2 Somewhat likely
( )3 Neutral
( )4 Somewhat unlikely. If checked, please specify why: ______5
( )6 Very unlikely. If checked, please specify why: ______7
( )8 No opinion/don’t know

Q8. If DAA is available at no cost, how likely would you be to RECOMMEND that others conducting systematic reviews use DAA for data abstraction? (Select one)

( )1 Very likely
( )2 Somewhat likely
( )3 Neutral
( )4 Somewhat unlikely
( )5 Very unlikely
( )6 No opinion/don’t know

Q9. Even if you did not perform each of the following tasks yourself, overall what ability did you like the MOST about the Data Abstraction Assistant (DAA)? (Select one)

( )1 The ability to open a document in Split Screen view in SRDR
( )2 The ability to place flags on the document
( )3 The ability to click on existing flags to automatically navigate to a relevant location on the document
( )4 The ability to copy text from the document to the data abstraction form
( )5 Other (please specify): ______6
( )7 None, i.e., no particular ability stood out

Q10. Do you have any suggestions that might help improve the DAA software? (Select one)
( )1 No
( )2 Yes, please specify: ______5

ADMINISTRATIVE DETAILS

Date survey completed (MM/DD/YYYY): __ __ / __ __ / ______ (Auto-filled by Qualtrics®)

Appendix 3: Summary of survey responses, by level of experience with data abstraction

                                                    Less experienced    More experienced    All data
Survey item                                         data abstractors    data abstractors    abstractors
                                                    (N=26), n (%)       (N=26), n (%)       (N=52), n (%)

Ease of use of DAA

It was easy to open documents in Split Screen view in SRDR.
  Agree                                             20 (77)             23 (88)             43 (83)
  Disagree                                          2 (8)               0 (0)               2 (4)
  Neutral                                           4 (15)              3 (12)              7 (13)

It was easy to scroll between pages on the document.
  Agree                                             17 (65)             19 (73)             36 (69)
  Disagree                                          3 (12)              3 (12)              6 (12)
  Neutral                                           6 (23)              4 (15)              10 (19)

It was easy to place flags on the document.
  Agree                                             16 (62)             N/A                 N/A
  Disagree                                          3 (12)
  Neutral                                           7 (27)

It was easy to click on existing flags to automatically navigate to a relevant location on the document.
  Agree                                             16 (61)             21 (85)             37 (71)
  Disagree                                          2 (8)               1 (4)               3 (6)
  Neutral                                           8 (31)              4 (15)              12 (23)

Overall ease of using DAA
  Very easy                                         5 (19)              9 (35)              14 (27)
  Somewhat easy                                     17 (65)             12 (46)             29 (56)
  Somewhat difficult                                2 (8)               3 (12)              5 (10)
  Very difficult                                    0 (0)               0 (0)               0 (0)
  Neutral                                           2 (8)               2 (8)               4 (8)

Future use of DAA

Likelihood of using DAA for data abstraction in the future.
  Very likely                                       10 (38)             12 (46)             22 (42)
  Somewhat likely                                   7 (27)              8 (31)              15 (29)

  Somewhat unlikely                                 2 (8)               4 (15)              6 (12)
  Very unlikely                                     1 (4)               0 (0)               1 (2)
  Neutral                                           6 (23)              2 (8)               8 (15)

Likelihood of recommending that others use DAA for data abstraction in the future.
  Very likely                                       11 (42)             15 (58)             26 (50)
  Somewhat likely                                   10 (38)             9 (35)              19 (37)
  Somewhat unlikely                                 0 (0)               1 (4)               1 (2)
  Very unlikely                                     2 (8)               0 (0)               2 (4)
  Neutral                                           3 (12)              1 (4)               4 (8)

Favorite DAA feature (even if abstractors themselves did not use the feature)
  Ability to open a document in Split Screen View   7 (27)              3 (12)              10 (19)
  Ability to place flags on the document            6 (23)              3 (12)              9 (17)
  Ability to click on existing flags to navigate    9 (35)              19 (73)             28 (54)
  Ability to copy text to SRDR                      3 (12)              0 (0)               3 (6)
  Other                                             0 (0)               1 (4)               1 (2)
  None (i.e., no particular ability stood out)      1 (4)               0 (0)               1 (2)

Note: Percentages are calculated using the column totals as the denominator.
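The convention stated in the note (percentages computed against the column totals) can be checked with a short script. This is an illustrative sketch, not part of the trial software; the counts are taken from the "Split Screen view" rows of the table above.

```python
# Recompute "n (%)" cells using the column totals (N) as denominators,
# as stated in the note above.

def cell(n, total):
    return f"{n} ({round(100 * n / total)})"

# "It was easy to open documents in Split Screen view in SRDR."
less_n, more_n, all_n = 26, 26, 52
assert cell(20, less_n) == "20 (77)"   # less experienced: Agree
assert cell(23, more_n) == "23 (88)"   # more experienced: Agree
assert cell(43, all_n) == "43 (83)"    # all abstractors: Agree (20 + 23)
assert cell(2, all_n) == "2 (4)"       # all abstractors: Disagree (2 + 0)
assert cell(7, all_n) == "7 (13)"      # all abstractors: Neutral (4 + 3)
print("all percentages consistent")
```

The same function reproduces every other cell in the table; running it is a quick way to catch transcription errors in the totals column.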

Appendix 4: Published paper describing the DAA trial protocol

Saldanha et al. Systematic Reviews (2016) 5:196 DOI 10.1186/s13643-016-0373-7

PROTOCOL (Open Access)

Evaluating Data Abstraction Assistant, a novel software application for data abstraction during systematic reviews: protocol for a randomized controlled trial

Ian J. Saldanha1*, Christopher H. Schmid2, Joseph Lau3, Kay Dickersin1, Jesse A. Berlin4, Jens Jap5, Bryant T. Smith5, Simona Carini6, Wiley Chan7, Berry De Bruijn8, Byron C. Wallace9, Susan M. Hutfless10, Ida Sim11, M. Hassan Murad12, Sandra A. Walsh13, Elizabeth J. Whamond14 and Tianjing Li1

Abstract

Background: Data abstraction, a critical systematic review step, is time-consuming and prone to errors. Current standards for approaches to data abstraction rest on a weak evidence base. We developed the Data Abstraction Assistant (DAA), a novel software application designed to facilitate the abstraction process by allowing users to (1) view study article PDFs juxtaposed to electronic data abstraction forms linked to a data abstraction system, (2) highlight (or “pin”) the location of the text in the PDF, and (3) copy relevant text from the PDF into the form. We describe the design of a randomized controlled trial (RCT) that compares the relative effectiveness of (A) DAA-facilitated single abstraction plus verification by a second person, (B) traditional (non-DAA-facilitated) single abstraction plus verification by a second person, and (C) traditional independent dual abstraction plus adjudication to ascertain the accuracy and efficiency of abstraction.

Methods: This is an online, randomized, three-arm, crossover trial. We will enroll 24 pairs of abstractors (i.e., sample size is 48 participants), each pair comprising one less and one more experienced abstractor. Pairs will be randomized to abstract data from six articles, two under each of the three approaches. Abstractors will complete pre-tested data abstraction forms using the Systematic Review Data Repository (SRDR), an online data abstraction system. The primary outcomes are (1) proportion of data items abstracted that constitute an error (compared with an answer key) and (2) total time taken to complete abstraction (by two abstractors in the pair, including verification and/or adjudication).

Discussion: The DAA trial uses a practical design to test a novel software application as a tool to help improve the accuracy and efficiency of the data abstraction process during systematic reviews.
Findings from the DAA trial will provide much-needed evidence to strengthen current recommendations for data abstraction approaches.

* Correspondence: [email protected] 1Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, 615 North Wolfe Street, Room W6507-B, Baltimore, MD 21205, USA Full list of author information is available at the end of the article

© The Author(s). 2016 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

Trial registration: The trial is registered at the National Information Center on Health Services Research and Health Care Technology (NICHSR) under Registration # HSRP20152269: https://wwwcf.nlm.nih.gov/hsr_project/view_hsrproj_record.cfm?NLMUNIQUE_ID=20152269&SEARCH_FOR=Tianjing%20Li. All items from the World Health Organization Trial Registration Data Set are covered at various locations in this protocol.

Protocol version and date: This is version 2.0 of the protocol, dated September 6, 2016. As needed, we will communicate any protocol amendments to the Institutional Review Boards (IRBs) of Johns Hopkins Bloomberg School of Public Health (JHBSPH) and Brown University. We also will make appropriate as-needed modifications to the NICHSR website in a timely fashion.

Keywords: Data abstraction, Systematic reviews, Randomized controlled trial

Background

Systematic reviews (“reviews”) are comparative effectiveness research studies that use explicit methods to identify, appraise, and synthesize the research evidence addressing a given research question [1]. The steps in completing a review include formulating the research question, finding and collecting data from individual studies, and synthesizing the evidence [2]. The validity of the review findings is contingent upon accurate and complete data collection from journal articles reporting results of relevant studies (“articles”), a process known as data abstraction (also known as data extraction).

As a predominantly manual process, data abstraction is inefficient, being both labor-intensive and error-prone. Errors during abstraction are common and have been well documented in the literature [3–6]. One study estimated that the error rate, defined as “any small discrepancy from the reference standard,” was approximately 30% for single abstraction, regardless of the level of abstractor experience [6]. Our pilot data showed that less experienced abstractors made more errors across all types of research questions, and errors were highest for numerical results [7]. Abstraction errors occur when abstractors either omit from the abstraction information that is present in the article or when information is abstracted incorrectly. When Gøtzsche and colleagues examined 27 meta-analyses (i.e., statistical analyses of results from included studies in systematic reviews) published in 2004 across a range of topics, they found multiple errors in 37% of meta-analyses [4]. In another study, Jones and colleagues documented abstraction errors in 20/42 reviews (48%); in all cases, the errors changed the summary meta-analytic results, although none changed the review conclusions [5].

Currently recommended approaches to reducing errors in data abstraction fall into two categories: (1) abstraction by one person followed by checking of the abstraction by a second person (“single abstraction plus verification”) and (2) independent abstraction by two persons followed by resolution of any discrepancies (“independent dual abstraction plus adjudication”). The former approach also sometimes concludes with adjudication between the two abstractors. Buscemi and colleagues showed that single abstraction plus verification results in approximately 20% more errors than independent dual abstraction plus adjudication, but the latter approach takes approximately 50% longer [3].

Because only one study [3] has examined the tradeoffs between single abstraction plus verification and independent dual abstraction plus adjudication, and that study focused on a single review topic with only four abstractors, current standards for data abstraction rest on a weak evidence base. Major sponsors and producers of reviews (e.g., Agency for Healthcare Research and Quality Evidence-based Practice Centers (AHRQ EPCs), Cochrane, Centre for Reviews and Dissemination (CRD)) and organizations that develop methodology standards for reviews (e.g., AHRQ, Cochrane, Institute of Medicine (IOM)) are inconsistent in their recommendations for approaches to reduce errors in abstraction [1, 2, 8–10]. Because “so little is known about how best to optimize accuracy and efficiency” [1], the IOM Committee stopped short of recommending independent dual abstraction for all data elements. Instead, the IOM recommended: “at minimum, use two or more researchers, working independently, to extract quantitative and other critical data from each study” [1]. Thus, although the IOM recommended independent dual abstraction for “critical data,” an important gap in our current methodological understanding of data abstraction remains. The recommendation for “critical data” could represent unnecessary work or, conversely, the IOM’s implicit recommendation that a single person could abstract non-critical data could represent an opportunity for error.

Computer-aided abstraction could potentially make the abstraction process more efficient and more accurate by facilitating the location and storage of key information in articles. With funding from the Patient-Centered Outcomes Research Institute (PCORI), we developed the Data Abstraction Assistant (DAA), a novel software application designed to facilitate tracking the location of abstracted information in articles and to reduce errors during

abstraction. DAA facilitates abstraction by allowing users to (1) view article PDFs juxtaposed to electronic data abstraction forms in data abstraction systems, (2) highlight (or “pin”) the location of text in the PDF, and (3) copy text automatically from the PDF into the form.

We are conducting a randomized controlled trial (RCT) to compare the relative effectiveness of (A) DAA-facilitated single abstraction plus verification, (B) traditional (non-DAA-facilitated) single abstraction plus verification, and (C) traditional independent dual abstraction plus adjudication on the accuracy and efficiency of abstraction. The objective of this manuscript is to describe the design of our RCT adhering to the Standard Protocol Items: Recommendations for Interventional Trials (SPIRIT) evidence-based guideline for reporting protocols of RCTs [11] (Additional file 1). Our searches of PubMed and The Cochrane Library up to August 26, 2016, did not identify any RCT or systematic review of RCTs that have compared the accuracy and efficiency of various data abstraction approaches.
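The “pin” feature described above amounts to storing, for each abstraction-form question, one or more locations in the source document from which the answer was taken. The following is a minimal, hypothetical sketch of such a data model; the class and method names are illustrative and are not the actual DAA or SRDR code.

```python
# Hypothetical sketch of a pin data model (not the actual DAA/SRDR code):
# each abstraction-form question can hold several pins, and each pin is a
# character-offset range in the source document, so pinned text can be
# copied into the form and revisited later for verification.

class Pin:
    def __init__(self, start, end):
        if not 0 <= start < end:
            raise ValueError("need 0 <= start < end")
        self.start, self.end = start, end

    def text(self, document):
        # Return the pinned text from the source document.
        return document[self.start:self.end]

class AbstractionForm:
    def __init__(self):
        self.pins = {}      # question id -> list of Pin
        self.answers = {}   # question id -> answer text

    def pin(self, question, start, end):
        self.pins.setdefault(question, []).append(Pin(start, end))

    def copy_from_source(self, question, document):
        # "Copy text automatically from the PDF into the form":
        # join all pinned locations for the question, in document order.
        pins = sorted(self.pins.get(question, []), key=lambda p: p.start)
        self.answers[question] = " ".join(p.text(document) for p in pins)
        return self.answers[question]

doc = "Participants were randomized. Mean age was 63 years."
form = AbstractionForm()
form.pin("mean_age", 43, 51)                   # pins the phrase "63 years"
print(form.copy_from_source("mean_age", doc))  # prints "63 years"
```

Because each answer carries its character ranges, a verifier can jump straight from a form entry back to the supporting text, which is the permanent answer-to-source linkage the protocol relies on.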

Methods

Study design and setting
We have designed this study as a randomized, three-arm, crossover trial. The trial flowchart is presented in Fig. 1. This trial will be conducted entirely online. However, during verification of abstracted data or adjudication of discrepancies, participants (abstractors) in a pair will have the option to communicate with each other using any preferred mode of communication (e.g., video call, phone call, in-person meeting).

Fig. 1 Flow of participants during the trial

Study population
The intended study population is individuals, including researchers, patients, clinicians, and methodologists, who have previously participated in data abstraction for systematic reviews, without restriction on the number, types, or topics of reviews.

Eligibility criteria
The trial will only include individuals who meet all the following criteria:

● At least 20 years of age
● Self-reported proficiency with reading scientific articles in English
● Completed abstraction for at least one journal article for a systematic review in any field
● Provided informed consent

Recruitment
For this trial, we will use the Systematic Review Data Repository (SRDR) as the data abstraction system. SRDR, maintained by the Brown University EPC (http://srdr.ahrq.gov), is state-of-the-art, open-source, open-access, and available free of charge to anyone conducting a review [12, 13]. We will utilize four strategies to identify and recruit potential abstractors: (1) emails to students who have registered for at least one course in systematic review methods through Johns Hopkins Bloomberg School of Public Health (JHBSPH) and Brown University; (2) emails to faculty and staff at the Johns Hopkins and Brown EPCs; (3) advertising on the SRDR website; and (4) advertising through patient organizations such as Consumers United for Evidence-based Healthcare (CUE) and Cochrane Consumer Network (CCNet). CUE is a US-based coalition of health and patient advocacy organizations committed to empowering patients to make the best use of evidence-based healthcare (http://us.cochrane.org/CUE). CCNet’s primary role is to get patients around the world involved in the production of Cochrane reviews (http://consumers.cochrane.org). All potentially eligible abstractors are directed to the DAA Trial Data Abstractor Enrollment website (“Enrollment website”), the informational page for the trial (http://srdr.ahrq.gov/daa/info).
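The four eligibility criteria above can be expressed as a single screening check. This sketch is illustrative only; the field names are hypothetical and are not those used by the actual Enrollment website.

```python
# Sketch of the trial's four eligibility criteria (field names are
# illustrative, not those of the actual Enrollment website).

def is_eligible(age, reads_scientific_english, n_articles_abstracted,
                gave_informed_consent):
    return (age >= 20                       # at least 20 years of age
            and reads_scientific_english    # self-reported proficiency
            and n_articles_abstracted >= 1  # abstracted >= 1 article before
            and gave_informed_consent)      # provided informed consent

assert is_eligible(28, True, 3, True)
assert not is_eligible(19, True, 3, True)   # under 20 years of age
assert not is_eligible(28, True, 0, True)   # no prior abstraction experience
print("eligibility checks pass")
```

All four conditions must hold; failing any single criterion excludes the candidate, which mirrors the protocol's "meet all the following criteria" wording.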

Participant enrollment, training, and pairing
To mimic how individuals are often paired for data abstraction in real-world reviews, we will form pairs of abstractors consisting of one less experienced abstractor and one more experienced abstractor. The Enrollment website will ask participants to answer questions related to their eligibility and level of experience with abstraction for reviews. Based on the results of a pilot study (Additional file 2), we determined that the number of published reviews authored, dichotomized at fewer than 3 versus 3 or more, was best able to classify abstractors as “less” or “more” experienced with abstraction, respectively.

Once an abstractor is deemed to have met all eligibility criteria for the trial and has provided information regarding experience with abstraction, the Enrollment website will notify the abstractor that s/he is eligible for participation and, upon the click of a button, will direct the participant to the DAA Trial Informed Consent website (“Consent website”) (http://srdr.ahrq.gov/daa/consent). The Consent website will automatically notify the Project Director regarding the names and email addresses of abstractors who have successfully provided informed consent.

Abstractors will be required to complete training in using SRDR and DAA before being paired and randomized. Once an abstractor has completed training, the Project Director pairs the abstractor with the next available abstractor, in chronological order, who has complementary abstraction experience (i.e., each pair will include one less experienced and one more experienced abstractor).

Randomization of pairs of abstractors
Abstractors will be randomized as pairs. Each pair will complete abstraction for six articles, two under each of the three approaches (A = DAA-facilitated single abstraction plus verification; B = traditional [non-DAA-facilitated] single abstraction plus verification; and C = traditional independent dual abstraction plus adjudication). This is to reduce possible contamination from the learning process and for ease of coordination. To maximize efficiency, we will use a crossover design such that each pair of abstractors will implement all three approaches being evaluated, with the intent of estimating differences within pairs. The six possible sequences are AABBCC, AACCBB, BBCCAA, BBAACC, CCAABB, and CCBBAA. Table 1 lays out the assignment of 24 abstractor pairs to sequences (n = 6) and to reviews (n = 4). For further explanation of sample size calculation and review selection, see the sections titled “Sample size and power calculation” and “Identification of studies and outcomes for abstraction during the trial”, respectively.

We will randomly assign each consecutive pair of abstractors to a “slot” (row in Table 1). The Senior Statistician will use the R statistical environment to generate the random order from 1 to 24. For example, if the first random number is 17, the first pair will be assigned to “pair 17”, and will abstract data from articles 31 to 36 according to sequence BBAACC, as shown in Table 1. The Project Director will release two articles at a time. To maintain allocation concealment, we will keep the Project Director, who is responsible for pairing abstractors and communicating the randomized sequence to the pair, unaware of the next slot assigned. For this reason, the Senior Statistician alone will have access to the random order. For each pair to be randomized, the Project Director will contact and receive from the Senior Statistician the randomized slot to which the pair will be assigned.

Study arms (abstraction approaches)
Approach A—DAA-facilitated single abstraction plus verification
In Approach A, which uses DAA, the less experienced abstractor in a pair will complete the abstraction form first, followed by the more experienced abstractor, who will verify the information abstracted by her/his less experienced partner. The less experienced abstractor will complete abstraction for the two assigned articles using DAA and the abstraction form in SRDR by placing a pin identifying each location of the PDF text supporting the answer to every question on the abstraction form. The software allows multiple locations of the PDF text to be pinned for a given question. Once the initial abstraction is completed, the more experienced abstractor will be given access to the abstracted data for the two articles in SRDR, together with the pinned locations on the PDFs. The more experienced abstractor can change any of the less experienced abstractor’s responses as s/he considers appropriate (verification) and, if desired, request discussion with the less experienced abstractor (data adjudication). Once the more experienced abstractor has verified the data abstracted for an article (with or without discussion with the less experienced abstractor), abstraction for that article will be considered complete.

Approach B—Traditional single abstraction plus verification
Approach B does not use DAA. As in Approach A, the less experienced abstractor in a pair will complete the abstraction form first (without using DAA), followed by the more experienced abstractor, who will verify the information abstracted by her/his less experienced partner. The less experienced abstractor in a pair will complete abstraction for the assigned articles using the abstraction form in SRDR. Once the abstraction by the less experienced abstractor is completed, the more experienced abstractor will be given access to the abstracted data for the two articles in SRDR. The more experienced abstractor can change any of the less experienced abstractor’s responses as s/he considers appropriate (verification) and, if

Table 1 Assignment of 24 pairs of abstractors to 6 sequences and to 48 articles

Articles selected from systematic review #1
  Articles 1–6:    Pair 1: A A B B C C (Sequence 1);  Pair 2: B B C C A A (Sequence 2);  Pair 3: C C A A B B (Sequence 3)
  Articles 7–12:   Pair 4: A A C C B B (Sequence 4);  Pair 5: B B A A C C (Sequence 5);  Pair 6: C C B B A A (Sequence 6)
Articles selected from systematic review #2
  Articles 13–18:  Pair 7: A A B B C C (Sequence 1);  Pair 8: B B C C A A (Sequence 2);  Pair 9: C C A A B B (Sequence 3)
  Articles 19–24:  Pair 10: A A C C B B (Sequence 4); Pair 11: B B A A C C (Sequence 5); Pair 12: C C B B A A (Sequence 6)
Articles selected from systematic review #3
  Articles 25–30:  Pair 13: A A B B C C (Sequence 1); Pair 14: B B C C A A (Sequence 2); Pair 15: C C A A B B (Sequence 3)
  Articles 31–36:  Pair 16: A A C C B B (Sequence 4); Pair 17: B B A A C C (Sequence 5); Pair 18: C C B B A A (Sequence 6)
Articles selected from systematic review #4
  Articles 37–42:  Pair 19: A A B B C C (Sequence 1); Pair 20: B B C C A A (Sequence 2); Pair 21: C C A A B B (Sequence 3)
  Articles 43–48:  Pair 22: A A C C B B (Sequence 4); Pair 23: B B A A C C (Sequence 5); Pair 24: C C B B A A (Sequence 6)

A, B, and C denote three different approaches for data abstraction; see the section “Study arms (Abstraction approaches)”. Random sequence is the permuted arrangement of the three approaches for data abstraction.
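The slot structure of Table 1 is regular enough to reproduce programmatically. The sketch below is illustrative only (the trial itself used the R statistical environment, and this is not the trial's code); it assumes slots are numbered 1 to 24 row by row as in the table.

```python
import random

# Illustrative reconstruction of Table 1's slot structure (the trial used R;
# this sketch only mirrors the table). Slots 1-24 cycle through the six
# approach sequences; each block of six slots draws on one of four reviews,
# and each group of three consecutive slots shares a block of six articles.

SEQUENCES = ["AABBCC", "BBCCAA", "CCAABB", "AACCBB", "BBAACC", "CCBBAA"]

def slot_assignment(slot):
    """For slot 1..24, return (sequence, review number, article numbers)."""
    sequence = SEQUENCES[(slot - 1) % 6]
    review = (slot - 1) // 6 + 1              # four reviews, six slots each
    first = ((slot - 1) // 3) * 6 + 1         # pairs 1-3 share articles 1-6,
    articles = list(range(first, first + 6))  # pairs 4-6 share 7-12, etc.
    return sequence, review, articles

# The Senior Statistician's step: a random permutation of slots 1..24,
# revealed one slot at a time as consecutive pairs finish training.
order = random.sample(range(1, 25), k=24)

# Worked example from the text: slot 17 -> sequence BBAACC, articles 31-36.
seq, review, articles = slot_assignment(17)
print(seq, review, articles[0], articles[-1])  # prints: BBAACC 3 31 36
```

Keeping the slot-to-sequence mapping deterministic while randomizing only the slot order is what lets the Senior Statistician conceal upcoming allocations from the Project Director.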
For example, sequence 1 indicates data abstractors will collect data from 6 unique articles using AABBCC approaches respectively desired, request discussion with the less experienced pair each will abstract data independently for the two abstractor to adjudicate the data (data adjudication). As in assigned articles using the abstraction form in SRDR. Approach A, once the more experienced abstractor has The two abstractors will inform each other that they verified the abstracted data for an article (with or without have completed their independent abstractions and will discussion with the less experienced abstractor), abstrac- develop a plan for adjudication (e.g., video call, phone call, tion for that article is considered complete. in-person meeting). In the second step, the abstractors will compare their abstractions and address any discrep- Approach C—Traditional independent dual abstraction plus ancies in the abstracted data for the two articles (data adjudication adjudication). Once the two abstractors arrive at consen- Approach C, which also does not use DAA, involves sus on all abstracted data for a given article, abstraction two main steps. In the first step, the two abstractors in a for that article is considered complete. As appropriate, Saldanha et al. Systematic Reviews (2016) 5:196 Page 6 of 11
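The allocation scheme in Table 1 is regular enough to generate programmatically. The trial generated its sequences with the R statistical environment; the following Python sketch is only an illustrative reconstruction of the same pattern, and all names in it are hypothetical rather than the trial's actual code:

```python
# Illustrative reconstruction of the Table 1 allocation pattern; the trial
# itself generated sequences in R, and these names are hypothetical.

# Six permuted sequences of the three abstraction approaches (A, B, C),
# each approach applied to two consecutive articles.
SEQUENCES = {1: "AABBCC", 2: "BBCCAA", 3: "CCAABB",
             4: "AACCBB", 5: "BBAACC", 6: "CCBBAA"}

def build_allocation():
    """Map each of the 24 abstractor pairs to its review (1-4), its random
    sequence (1-6), and the approach used for each of its 6 articles."""
    allocation = {}
    for pair in range(1, 25):
        review = (pair - 1) // 6 + 1             # 6 pairs per review
        sequence = (pair - 1) % 6 + 1            # sequences cycle 1..6
        first_article = (pair - 1) // 3 * 6 + 1  # 3 pairs share each article block
        approaches = SEQUENCES[sequence]
        articles = {first_article + i: approaches[i] for i in range(6)}
        allocation[pair] = (review, sequence, articles)
    return allocation

allocation = build_allocation()
```

Within each block of six articles, the three assigned pairs cover every article under every approach exactly once, which is what gives the design its crossover structure.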

As appropriate, both abstractors will edit their own incorrect answers. We will not allow a third abstractor to resolve discrepancies.

Masking
It is not feasible to mask abstractors or the Project Director because the abstractors need to be aware of the abstraction approach in order to abstract data, and the Project Director needs to be aware of the sequence of assigned approaches to allocate articles and follow abstractors through the trial. The data analysts will use computer programs, such as the R statistical environment, to detect errors (a step that does not involve subjective judgment), and will not be masked.

Follow-up and retention of participants
To maximize retention, the Project Director will maintain regular email contact with each abstractor (or pair of abstractors, as appropriate) throughout the trial. The scheduled junctures for email contact are: after screening and consent, once DAA and SRDR trainings are completed, after randomization, and after abstraction under each approach (two articles) is completed. Abstractors will be followed throughout the trial unless consent is withdrawn. In instances where an abstractor or pair of abstractors does not complete abstraction for the assigned articles, we will make every effort to encourage completion of abstraction. If abstraction is not completed after five weekly email reminders, or if consent is withdrawn, we will replace the abstractor with a previously unenrolled abstractor with the same level of abstraction experience, or replace both abstractors in the pair, as needed, so that the remaining abstraction is completed.
We will provide each abstractor US $250 as compensation for participation in the trial. Compensation will be provided only once abstraction for all six articles has been completed (i.e., no partial/interim payment).
The DAA trial does not include stopping rules or a Data Safety and Monitoring Board because the trial does not evaluate the safety or effectiveness of an intervention on health outcomes. We do not expect any adverse events as a result of abstracting data during this trial.

Identification of studies and outcomes for abstraction during the trial
We have identified 48 journal articles reporting results of RCTs from four reviews (12 articles per review), and these are the articles that will be abstracted from during the trial. We identified the reviews by searching MEDLINE and the Cochrane Database of Systematic Reviews for a range of clinical topic areas. To ensure that the reviews and the outcomes are relevant to patients, our consumer co-investigators (SAW, EJW, and Ms. Vernal Branch) were involved in the selection of reviews and outcomes. In cases where a review included more than 12 articles, we selected the 12 articles that reported the largest number of outcomes. We have included one article for each trial (i.e., not multiple publications, conference abstracts, or data from trial registries). The reviews chosen are (1) multi-factorial interventions to prevent falls in older adults [14]; (2) proprotein convertase subtilisin/kexin type 9 (PCSK-9) antibodies for adults with hypercholesterolemia [15]; (3) interventions to promote physical activity in cancer survivors [16]; and (4) omega-3 fatty acids for adults with depression [17].

Data collection, management, and monitoring
All data during the trial will be collected via websites: SRDR (the data abstraction system) and DAA (the document management system). We have developed and pilot tested a separate data abstraction form compatible with SRDR for each of the four reviews (forms available upon request). We want the results of our trial to be broadly applicable across a wide range of review topics, and thus the forms include recommended common data elements (Table 2) [13, 18]. In keeping with best practices of form development [13], each form comprises predominantly pre-populated multiple-choice or numerical entry data items. We have organized the data elements into separate “tabs” in SRDR: Design Tab (study design, risk of bias), Baseline Tab (characteristics of participants by study arm at baseline), Outcomes Tab (list of outcomes reported in the article), and Results Tab (quantitative results data for the outcomes). Table 2 lists the various data elements contained in each tab, framed as answerable questions. Note that some data elements include multiple data items. The total number of data items varies between articles, depending upon the review topic, the number of outcomes, and the amount of information available in each article. The forms have a median of 121 multiple-choice or numerical entry data items (interquartile range 102 to 150; range 71 to 176).

Outcomes
The two primary outcomes for our trial are the proportion of abstracted data items that constitute an error (hereafter referred to as “error rates” for simplicity) and the time taken to complete abstraction (by both abstractors, including verification/adjudication). To determine errors for Approaches A and B, we will compare the verified data to data independently abstracted and adjudicated by two investigators with extensive abstraction experience (IJS and TL), which will be considered the “answer key”. In Approach C, both abstractors edit their own abstracted data during adjudication. To determine errors for Approach C, we will use the edited data from the more experienced abstractor. We will do this

by using a computer program that automatically compares the selected/entered value of a given data item to the answer key value for that data item. An error is defined as any discrepancy or difference between an entry for a data item and the answer key value for that data item. We are interested in abstraction errors resulting from omission or incorrect abstraction. If participants abstract more data items than are in the answer key, the additional data items will not be considered errors.
The total time taken to complete abstraction for a given article is defined as the sum of the time taken for initial abstraction(s) plus subsequent verification/adjudication. To measure time, we will use three strategies. First, the study data abstraction system (i.e., SRDR) will automatically record when each abstractor logs in and out of the system, including the time spent on each tab. It is possible that this time overestimates the true time spent on the tab if the abstractor steps away from the computer. Second, as part of the Design Tab of the abstraction form in SRDR, we will ask abstractors to record the self-timed duration (in minutes) spent abstracting data for the Design Tab. Third, we will ask each abstractor to record the time spent (in minutes) on each step of data abstraction for each article (initial abstraction, verification, and adjudication), recorded using an online survey tool (Qualtrics®). We will use the automatically recorded timestamps to calculate time whenever possible, but will corroborate these data with the latter two manual strategies to assess the accuracy of our assessment of time.
To explore the impact of errors on results of meta-analyses, we also will conduct an exploratory descriptive analysis of differences among the meta-analytic estimates and 95% confidence intervals based on data derived from Approaches A, B, and C compared with those using the answer keys.

Table 2 Data elements by tab in each abstraction form used in the trial

Design Tab
  Study eligibility criteria
  Number of study centers
  Region of study participant recruitment
  Start year of study participant recruitment
  End year of study participant recruitment
  End year of randomized study participant follow-up
  Length of planned (or stated) randomized study participant follow-up
  Report of a study sample size/power calculation
  Report of conduct of an intention-to-treat analysis
  Presence of a participant flow diagram in the article
  Study method to generate the random sequence
  Risk of bias related to random sequence generation
  Study method to conceal the random allocation sequence
  Risk of bias related to concealment of the random allocation sequence
  Masking (or blinding) of study participants to treatment assigned
  Masking (or blinding) of healthcare providers to treatment assigned
  Masking (or blinding) of outcome assessors to treatment assigned
  Report of “single,” “double,” or “triple” masking without clarification
  Report of absence of any masking during the study
  Sources of monetary or material support for the study
  Financial relationships for any author of the study article
  Total number of randomized study arms (or groups)
  Number of study participants randomized, by group (or arm)
  Number of study participants followed up, by group (or arm)
  Whether reasons to follow up were similar between the groups (or arms)
  How much time the abstractor spent abstracting data for the Design Tab

Baseline Tab
  Sample size at baseline, by group (or arm)
  Age at baseline, by group (or arm)
  Sex at baseline, by group (or arm)
  Other baseline characteristics as appropriate (e.g., body mass index), by group (or arm)

Outcomes Tab
  Each outcome from a pre-defined list of outcomes with time-points specific to articles from each review

Results Tab
  For each dichotomous outcome at the relevant time-point:
    Number of participants analyzed, by group (or arm)
    Number of participants with the outcome, by group (or arm)
    Percentage of participants with the outcome, by group (or arm)
    Measure of association (e.g., relative risk, odds ratio), by between-arm comparison
    95% CI for the measure of association, by between-arm comparison
    P value for the measure of association, by between-arm comparison
  For each continuous outcome at the relevant time-point:
    Number of participants analyzed, by group (or arm)
    Mean of outcome, by group (or arm)
    Standard deviation of outcome, by group (or arm)
    Mean difference, by between-arm comparison
    95% CI for the mean difference, by between-arm comparison
    P value for the mean difference, by between-arm comparison
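The automated comparison against the answer key described in the Outcomes section can be sketched as follows. This is a minimal illustration with hypothetical item names, not the trial's actual error-detection program (which was planned in R):

```python
def count_errors(abstracted, answer_key):
    """Count abstraction errors for one article against the answer key.

    An error is any discrepancy between an entry and the answer key value:
    items missing or left blank are omission errors, and differing values
    are incorrect-abstraction errors. Items abstracted beyond the answer
    key are ignored, mirroring the rule stated in the protocol.
    """
    omission = incorrect = 0
    for item, key_value in answer_key.items():
        if item not in abstracted or abstracted[item] in (None, ""):
            omission += 1
        elif abstracted[item] != key_value:
            incorrect += 1
    return {"omission": omission, "incorrect": incorrect,
            "error_rate": (omission + incorrect) / len(answer_key)}

# Hypothetical example: one wrong value, one omitted item, one extra item.
key = {"n_randomized": 120, "mean_age": 56.2, "pct_female": 48.0}
entry = {"n_randomized": 120, "mean_age": 65.2, "extra_note": "x"}
errors = count_errors(entry, key)  # 1 incorrect + 1 omission out of 3 items
```

The returned error rate corresponds to the trial's primary outcome: the proportion of answer-key items for which the abstracted entry is missing or differs.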

Statistical methods

Statistical analysis—overview
All analyses will be conducted according to the intention-to-treat principle. We will document any protocol deviations and violations. We will compute summary error rates and time statistics for each approach, abstractor pair, review, and article, by type of question (questions in the Design Tab, Baseline Tab, or the Outcomes and Results Tabs). We expect each of our primary outcomes (error rate and time) to vary by six factors; we have included these factors as covariates in the statistical model for analyzing each of the primary outcomes. These factors are question type (design, baseline, results), abstractor pair (1 to 24), abstraction sequence (1 to 6), abstraction approach (A, B, C), article (1 to 48), and review from which articles were obtained (1 to 4). We will define errors for each data item as a binary variable (correct/incorrect) and time as a continuous variable. We will analyze error rates using logistic regression and time using linear regression. Each outcome will be analyzed in terms of the six factors using a mixed effects regression model.

Statistical analysis—technical details
For modeling error rates, let p_hijklmn be the probability of an error for the hth item of the ith question type abstracted by the jth abstractor pair following the kth abstraction sequence under the lth abstraction approach from the mth article obtained from the nth review. Then, the logistic regression model is

  logit(p_hijklmn) = a + q_i + b_l + g_k + d_n + h_j(k) + z_m(n) + (bq)_li + (bg)_lk + (bd)_ln + ...

where "..." indicates additional interaction terms one may wish to add. Question type (q_i), abstraction approach (b_l), and abstraction sequence (g_k) are fixed factors; abstractor pair (h_j(k)), review (d_n), and article (z_m(n)) are random factors. The key term of interest is b_l, representing the main effect of the abstraction approach. The interaction terms (bq)_li, (bg)_lk, and (bd)_ln in the model examine whether differences between the abstraction approaches vary with the type of questions asked on the abstraction form (since some questions might be easier to answer than others), abstraction sequence (which might indicate a learning effect or, more formally, a carry-over effect), and review (which might indicate a level-of-difficulty effect), respectively. We will check whether other factors explain differences among the abstractors (e.g., level of experience) or among the articles (e.g., better reporting). Each random effect is assumed to follow a normal distribution centered at zero with its own variance component. Correlations among items taken from the same article, for example, are explained by the common variance components. The model for the continuous time outcome has a similar structure, but uses a linear regression modeling the mean time with normally distributed errors. We will examine both error rates and time adjusting for type of question on the abstraction form and will also examine error rates and time for each type of question separately. We will compare automatically recorded and self-reported times, and analyze each separately.

Subgroup analyses
We will conduct exploratory subgroup analyses to examine whether the differences between the three abstraction approaches vary by question type, abstractor pair, abstraction sequence, article, and review, as specified in the statistical model (see the “Statistical analysis—technical details” section).

Missing data and sensitivity analysis
We anticipate that missing data may occur if abstractors do not finish all six articles assigned. We will make every effort to retain all abstractors and encourage complete data collection through (1) describing, as part of the consent process before enrollment, sufficient detail about the trial, the problems caused by missing data, and the need for commitment to the trial; (2) maintaining frequent contact and sending reminders to abstractors; and (3) compensating abstractors US $250 for their time once they have completed all assignments. We will ask for and report the reasons of those who discontinue some or all types of participation.
In handling missing data items, we will compare the characteristics of the missing and non-missing items to determine the nature of the missing data mechanism. If data appear to be missing at random, inference can be drawn based on the observed-data mixed model likelihood, as the predictor factors are all part of the design and therefore will be known. If the missing at random assumption appears invalid, we will conduct sensitivity analyses to assess the robustness of findings to different missing not at random scenarios.

Sample size and power calculation
The purpose of our trial is to determine whether (1) use of DAA (Approach A) improves accuracy (i.e., reduces error rates) compared with Approach B, and maintains the accuracy of the usual Approach C; and (2) use of DAA improves efficiency (i.e., reduces abstraction time) compared with Approaches B and C. The adjudicated abstractions of the experienced study investigators (IJS and TL) will be used as the answer key for comparison. The design in Table 1 includes factors related to the abstractor pair, abstraction sequence, article, and review.
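The structure of the logistic error model above can be illustrated with a small simulation. This sketch is in Python rather than the R environment the trial planned to use, the effect sizes are made up for illustration, and the interaction terms are omitted for brevity:

```python
import math
import random

random.seed(1)

def inv_logit(eta):
    """Convert a linear predictor on the logit scale to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))

# Made-up fixed effects (logit scale); b, the approach effect, corresponds
# to the term of interest (b_l) in the trial's model.
a = -2.0                                              # intercept
q = {"design": 0.0, "baseline": 0.3, "results": 0.6}  # question type
b = {"A": -0.4, "B": 0.0, "C": -0.5}                  # abstraction approach
g = {s: 0.0 for s in range(1, 7)}                     # abstraction sequence

# Random effects: normal, mean zero, each with its own variance component.
h = {j: random.gauss(0, 0.3) for j in range(1, 25)}   # abstractor pair
d = {n: random.gauss(0, 0.2) for n in range(1, 5)}    # review
z = {m: random.gauss(0, 0.2) for m in range(1, 49)}   # article

def error_probability(qtype, approach, sequence, pair, article, review):
    """P(error) for one data item under the additive logistic model."""
    eta = (a + q[qtype] + b[approach] + g[sequence]
           + h[pair] + z[article] + d[review])
    return inv_logit(eta)

p = error_probability("results", "A", 1, 1, 1, 1)
```

Because items from the same pair, article, or review share the same random-effect draws, simulated errors are correlated within those units, which is exactly the role the common variance components play in the fitted model.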

For determining necessary sample sizes, we make the simplifying assumption that the error rates and abstraction time per article will not depend on pair, sequence, or review. We consider this trial as having a crossover design in which each of the 48 articles is abstracted three times, once under each abstraction approach.
In our pilot study, we found that the proportion of data items incorrectly abstracted by inexperienced abstractors was 24% [7]. If an experienced reviewer caught and corrected half of these errors, the error rate would be reduced to 12%; this is what we used in the sample size and power calculations for Approach B. We assume that DAA-facilitated single abstraction plus verification (Approach A) would reduce the error rate relative to traditional single abstraction plus verification (Approach B), but perhaps not relative to independent dual abstraction plus adjudication (Approach C). Given the expected error rate, we want to be able to estimate the difference in error rates to within 1 or 2 percentage points in order to determine which approach is meaningfully more accurate. For example, with an error rate of 12% for Approach B, we will be able to detect a statistically significant difference between Approaches A and B if Approach A's error rate is less than 10%.
Using the error rate as the outcome for calculating sample size, we would expect the standard deviation for a 100-item form to be about 3% (exact if the proportion were 9%). The standard error of the difference is then σ(1 − ρ)/N, where σ is the within-unit standard deviation, N is the number of units (articles) receiving each sequence of crossovers (Approaches A, B, and C), and ρ is the correlation between two measurements on the same unit. The standard error for the difference in a crossover trial comparing two approaches, with 24 units each receiving each approach in one of two sequences and a standard deviation per unit of 3%, ranges from 0.125%, when ρ = 0, to 0%, when ρ = 1, decreasing linearly. Thus, even if measurements are uncorrelated (i.e., ρ = 0), our design would be able to detect a very small difference between error rates.
Likewise, we might assume that Approach A would reduce the total time needed to abstract data relative to Approaches B and C, and we would want to estimate this within a few minutes in order to determine meaningful differences in efficiency. The same calculation shows that, even with independent measurements, the standard error of the difference between the average times for two different approaches would be no more than σ/24, which would be less than 2 minutes for σ as large as 48 minutes (48 minutes is much longer than we would expect for the amount of data to be abstracted and the length of the articles).

Confidentiality
We will not share trial data until the trial is completed and primary analyses are done. There are no personally identifying markers for participants in the data we collect. A file linking participant names with the assigned abstraction sequence will be accessible only to the Project Director and saved on a password-protected server maintained by JHBSPH. The server is backed up daily. We will archive trial documentation and electronic files at the end of the trial and will retain them for at least 10 years. Because data abstracted in SRDR are associated with the abstractor’s name, we do not plan on allowing open access to the abstracted data in SRDR. However, upon request, we can make the exported de-identified abstracted data available after the trial.

Publications and dissemination
All investigators will collaborate to disseminate trial results through manuscripts and presentations at scientific meetings. Authorship and roles in the preparation of the manuscripts will be decided ahead of time and agreed by all concerned. PCORI, the funder of this trial, will convene an independent team to peer review the final report of the project and will post the finalized report on the PCORI website.

Discussion
Current approaches to data abstraction have resulted in a large resource burden on those conducting systematic reviews; however, efforts to reduce the burden may lead to errors in abstraction. These errors could lead to inaccurate conclusions derived in reviews and, consequently, healthcare decisions that are based on faulty evidence summaries. The very foundation of evidence-based healthcare is challenged when one of its three underlying tenets (best-available research evidence, clinician expertise, and patient values [19]) is compromised.

Challenges in designing the trial
The first challenge in designing this trial was to identify four reviews from which we could meet our target of identifying 12 appropriate RCTs each. Many of the reviews we selected initially could not be used because the outcomes were poorly defined in the reviews and, in some instances, meta-analyses in the reviews combined data from RCTs inappropriately. Ultimately, after ruling out various topics, we identified four suitable topics and reviews.
The second challenge was deriving the answer keys for all 48 articles. Two experienced abstractors on our investigator team (IJS and TL) abstracted data from all 48 articles. However, to be able to truly evaluate abstraction “errors” by abstractors during the trial, we needed to develop instructions and unambiguous language to clearly articulate every question and every answer option on every abstraction form for use during the trial.

Ambiguous articulation of these entities on our forms could have led to information bias in a primary outcome of the DAA trial if the abstraction errors resulted not from omission or incorrect abstraction, but rather from an abstractor’s misunderstanding of the abstraction requested.
To reduce the abstraction-process learning curve, we are restricting the trial to individuals with at least some experience with data abstraction for systematic reviews. Another limitation is that, unlike many real-world reviews, we are not allowing resolution of abstraction discrepancies by a third abstractor.

Strengths of the trial design
This trial has many strengths. First, it tests a novel software application (DAA) designed to make the data abstraction process more accurate and efficient. In addition, used in conjunction with a data abstraction system such as SRDR, DAA would maintain an annotated version of the data abstracted for easy access when the same or another systematic review team updates the review at a later date. Second, using a rigorous and efficient crossover design with random allocation and allocation concealment, this trial tests the effectiveness of the software application vis-à-vis two currently recommended best-practice approaches to abstraction. Third, we are collaborating with patient/consumer co-investigators as the trial moves forward, in designing the trial and identifying the review topics, study articles, and outcomes for use during abstraction. Finally, the generalizability of the trial is likely to be high because of the broad eligibility criteria for abstractors from multiple locations with various types of background and levels of experience with abstraction during reviews.
In summary, current standards for data abstraction, a key step during systematic reviews, rest on a weak evidence base. Our trial represents a potentially substantial step forward in the quest to reduce errors in data abstraction. The trial will rigorously evaluate whether a novel software application (DAA) could help the systematic review community efficiently use scarce research resources. Because systematic reviews are a key component of comparative effectiveness research, our results could help improve patient care by reducing errors in the synthesis of research evidence that informs that care.

Additional files
Additional file 1: Location of SPIRIT 2013 checklist items in this manuscript. (DOCX 47 kb)
Additional file 2: Pilot study to classify data abstractor experience with data abstraction. (DOCX 39 kb)
Additional file 3: Announcement of funding of DAA Trial by PCORI. (DOCX 486 kb)
Additional file 4: Institutional review board (IRB) approval for DAA Trial from Johns Hopkins University Bloomberg School of Public Health. (DOCX 265 kb)
Additional file 5: Institutional review board (IRB) approval for DAA Trial from Brown University. (DOCX 236 kb)

Abbreviations
AHRQ: Agency for Healthcare Research and Quality; CCNet: Cochrane Consumer Network; CRD: Centre for Reviews and Dissemination; CUE: Consumers United for Evidence-based Healthcare; DAA: Data Abstraction Assistant; EPC: Evidence-based Practice Center; IOM: Institute of Medicine; JHBSPH: Johns Hopkins Bloomberg School of Public Health; NICHSR: National Information Center on Health Services Research and Health Care Technology; PCORI: Patient-Centered Outcomes Research Institute; PCSK-9: Proprotein convertase subtilisin/kexin type 9; RCT: Randomized controlled trial; SPIRIT: Standard Protocol Items: Recommendations for Interventional Trials; SRDR: Systematic Review Data Repository

Acknowledgements
We acknowledge the contribution of Ms. Vernal Branch, a consumer co-investigator who helped identify reviews, trials, and outcomes for abstraction during the DAA Trial.

Funding
PCORI sponsors the development of DAA and the DAA Trial under contract number ME-1310-07009 (Additional file 3). PCORI does not have a representative on the Steering Committee. PCORI has no role in the design and conduct of the trial; the collection, management, analysis, and interpretation of the data; or the preparation, review, or approval of the manuscript(s). PCORI will convene an independent team to peer review the final report of the project and will post the final report on the PCORI website.

Availability of data and materials
We are publishing this manuscript describing our protocol in an open access journal (Systematic Reviews). We are committed to making the analytic datasets, codebooks, and annotated computer programming code underlying figures, tables, and other principal results publicly available within nine months of the end of the project. We will provide adequate documentation of the computer code and software environment to enable others to repeat and/or conduct similar analyses.

Authors’ contributions
TL is the Principal Investigator of the DAA Project that includes this trial. TL also is the Director of the Coordinating Center, which also includes the Project Director (IJS). TL, IJS, and CHS led the design of the trial, with scientific input from JL, KD, JAB, SC, WC, BDB, BCW, SMH, IS, MHM, SAW, and EJW. CHS is the Senior Statistician for the trial. CHS also is the Director of the Data Warehouse, which also includes JJ and BTS. JJ and BTS are responsible for the design of the DAA software. JL, KD, JAB, SC, WC, BDB, BCW, SMH, IS, MHM, SAW, and EJW contributed to the design of the trial, the protocol, and the DAA software. SAW and EJW (consumer co-authors) helped identify reviews, trials, and outcomes for abstraction during the trial. IJS is responsible for the day-to-day running of the trial and drafted the manuscript. All authors have read and approved the final manuscript.

Competing interests
All investigators are asked to disclose any financial relationships that could be perceived by others as a financial interest in the outcome of the trial.

Consent for publication
All authors have read and approved the final manuscript and have provided consent for publication of this manuscript.

Ethical approval and consent to participate
The Coordinating Center (JHBSPH) and the Data Warehouse (Brown University) have obtained approval for the trial from their respective IRBs (dated July 13, 2015 [IRB number 00006521] (Additional file 4) and August 21, 2015 (Additional file 5), respectively).

Online informed consent for participation in the trial will be obtained from every participant via the Consent Website (http://srdr.ahrq.gov/daa/consent).

Steering committee
The Steering Committee monitors the design and conduct of this trial. The Steering Committee includes CHS, KD, JAB, Ms. Vernal Branch, SC, WC, BDB, BCW, SMH, IS, MHM, SAW, EJW, and TL. The Steering Committee reviews and approves procedures and changes in procedures for the trial; monitors trial progress; resolves technical issues; appoints subcommittees as needed; and provides oversight for ancillary studies and publication of trial findings. The Steering Committee is composed of experts representing different stakeholder groups, including patients, clinicians/clinician associations, industry, policymakers, guideline developers, researchers, and educational institutions.

Trial status
As of September 6, 2016, we are in the process of recruiting and screening participants. The trial is scheduled to be completed by February 28, 2017.

Author details
1 Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, 615 North Wolfe Street, Room W6507-B, Baltimore, MD 21205, USA. 2 Department of Biostatistics, and Center for Evidence-based Medicine, Brown University School of Public Health, Providence, RI, USA. 3 Department of Health Services, Policy and Practice, and Center for Evidence-based Medicine, Brown University School of Public Health, Providence, RI, USA. 4 Epidemiology, Johnson & Johnson, Titusville, NJ, USA. 5 Center for Evidence-based Medicine, Brown University School of Public Health, Providence, RI, USA. 6 Department of Medicine, University of California San Francisco School of Medicine, San Francisco, CA, USA. 7 Internal Medicine, Kaiser Permanente Northwest, Portland, OR, USA. 8 National Research Council Information and Communications Technologies Portfolio (NRC-ICT), Ottawa, ON, Canada. 9 Northeastern University College of Computer and Information Science, Boston, MA, USA. 10 Department of Medicine, Johns Hopkins School of Medicine, Baltimore, MD, USA. 11 Department of Medicine, University of California San Francisco School of Medicine, San Francisco, CA, USA. 12 College of Medicine, and Evidence-based Practice Center, Mayo Clinic, Rochester, MN, USA. 13 California Breast Cancer Organizations, Davis, CA, USA. 14 Cochrane Consumer Network, Fredericton, NB, Canada.

Received: 26 September 2016 Accepted: 3 November 2016

References
1. Committee on Standards for Systematic Reviews of Comparative Effectiveness Research, Board on Health Care Services, Eden J, Levit L, Berg A, Morton S. Finding what works in health care: standards for systematic reviews. Washington: National Academies Press; 2011.
2. Chandler J, Churchill R, Higgins J, Lasserson T, Tovey D. Methodological standards for the conduct of new Cochrane intervention reviews. Version 2.3, December 2, 2013. Available at: http://www.editorial-unit.cochrane.org/sites/editorial-unit.cochrane.org/files/uploads/MECIR_conduct_standards%202.3%2002122013.pdf. Accessed 6 Sept 2016.
3. Buscemi N, Hartling L, Vandermeer B, Tjosvold L, Klassen TP. Single data extraction generated more errors than double data extraction in systematic reviews. J Clin Epidemiol. 2006;59(7):697–703.
4. Gøtzsche PC, Hróbjartsson A, Maric K, Tendal B. Data extraction errors in meta-analyses that use standardized mean differences. JAMA. 2007;298(4):430–7.
5. Jones AP, Remmington T, Williamson PR, Ashby D, Smyth RL. High prevalence but low impact of data extraction and reporting errors were found in Cochrane systematic reviews. J Clin Epidemiol. 2005;58(7):741–2.
6. Horton J, Vandermeer B, Hartling L, Tjosvold L, Klassen TP, Buscemi N. Systematic review data extraction: cross-sectional study showed that experience did not increase accuracy. J Clin Epidemiol. 2010;63(3):289–98.
7. Gresham G, Matsumura S, Li T. Faster may not be better: data abstraction for systematic reviews. In: Evidence-Informed Public Health: Opportunities and Challenges. Abstracts of the 22nd Cochrane Colloquium; 2014 21–26 Sep; Hyderabad, India.
8. … patient-centered outcomes research. July 23, 2012 version. Available at: http://pcori.org/assets/MethodologyReport-Comment.pdf. Accessed 6 Sept 2016.
9. Center for Reviews and Dissemination. Systematic reviews: CRD’s guidance for undertaking reviews in health care. York, UK: York Publishing Services, Ltd. Available at: https://www.york.ac.uk/media/crd/Systematic_Reviews.pdf. Accessed 6 Sept 2016.
10. Methods Guide for Effectiveness and Comparative Effectiveness Reviews. AHRQ Publication No. 10(14)-EHC063-EF. Rockville. Available at: https://www.effectivehealthcare.ahrq.gov/ehc/products/60/318/CER-Methods-Guide-140109.pdf. Accessed 6 Sept 2016.
11. Chan A-W, Tetzlaff JM, Altman DG, Laupacis A, Gøtzsche PC, Krleža-Jerić K, et al. SPIRIT 2013 statement: defining standard protocol items for clinical trials. Ann Intern Med. 2013;158:200–7.
12. Ip S, Hadar N, Keefe S, Parkin C, Iovin R, Balk EM, et al. A Web-based archive of systematic review data. Syst Rev. 2012;1:15. doi:10.1186/2046-4053-1-15.
13. Li T, Vedula SS, Hadar N, Parkin C, Lau J, Dickersin K. Innovations in data collection, management, and archiving for systematic reviews. Ann Intern Med. 2015;162(4):287–94. doi:10.7326/M14-1603.
14. Choi M, Hector M. Effectiveness of intervention programs in preventing falls: systematic review of recent 10 years and meta-analysis. J Am Med Dir Assoc. 2012;13(2):188.e13–21.
15. Navarese EP, Kolodziejczak M, Schulze V, Gurbel PA, Tantry U, Lin Y, et al. Effects of proprotein convertase subtilisin/kexin type 9 antibodies in adults with hypercholesterolemia: a systematic review and meta-analysis. Ann Intern Med. 2015;163(1):40–51. doi:10.7326/M14-2957.
16. Fong DY, Ho JW, Hui BP, Lee AM, Macfarlane DJ, Leung SS, et al. Physical activity for cancer survivors: meta-analysis of randomised controlled trials. BMJ. 2012;344:e70. doi:10.1136/bmj.e70.
17. Appleton KM, Sallis HM, Perry R, Ness AR, Churchill R. Omega-3 fatty acids for depression in adults. Cochrane Database Syst Rev. 2015;Issue 11:CD004692. doi:10.1002/14651858.CD004692.pub4.
18. Higgins JPT, Green S (editors). Cochrane Handbook for Systematic Reviews of Interventions Version 5.1.0 [updated March 2011]. The Cochrane Collaboration; 2011. Available at: http://handbook.cochrane.org. Accessed 6 Sept 2016.
19. Sackett DL, Rosenberg WM, Gray JA, Haynes RB, Richardson WS. Evidence based medicine: what it is and what it isn’t. BMJ. 1996;312(7023):71–2.
John Wiley & Sons; 2014. • Maximum visibility for your research 8. Patient-Centered Outcomes Research Institute Methodology Committee. Draft methodology report: our questions, our decisions: standards for Submit your manuscript at www.biomedcentral.com/submit Copyright © 2020. Johns Hopkins University. All Rights Reserved.

Disclaimer: The views, statements, and opinions presented in this report are solely the responsibility of the author(s) and do not necessarily represent the views of the Patient-Centered Outcomes Research Institute® (PCORI®), its Board of Governors, or Methodology Committee.

Acknowledgment: Research reported in this report was funded through a Patient-Centered Outcomes Research Institute® (PCORI®) Award (#ME-1310-07009). Further information available at: https://www.pcori.org/research-results/2014/testing-new-software- program-data-abstraction-systematic-reviews
