Critical Appraisal of Research Evidence for Its Validity and Usefulness
Joy C. MacDermid, BScPT, PhD [a,b,*]; David M. Walton, MScPT, PhD (c) [c]; Mary Law, PhD [d]

[a] Hand and Upper Limb Centre Clinical Research Laboratory, St. Joseph’s Health Centre, 268 Grosvenor Street, London, Ontario, N6A 4L6, Canada
[b] School of Rehabilitation Science, McMaster University, Institute for Applied Health Sciences, 1400 Main Street West, 4th Floor, Hamilton, Ontario L8S 1C7, Canada
[c] The University of Western Ontario School of Physical Therapy, Room EC 1588, 1201 Western Road, London, Ontario, N6G 1H1, Canada
[d] School of Rehabilitation Science, McMaster University, 268 Grosvenor Street, Hamilton, Ontario, Canada
[*] Corresponding author. School of Rehabilitation Science, LB33, McMaster University, Institute for Applied Health Sciences, Room 429, 1400 Main Street West, 4th Floor, Hamilton, Ontario L8S 1C7, Canada. E-mail address: [email protected] (J.C. MacDermid).

J.C.M. is funded by a New Investigator Award, Canadian Institutes of Health Research. D.M.W. is funded by a Doctoral Fellowship, Canadian Institutes of Health Research. M.L. holds the John and Margaret Lillie Chair in Childhood Disability.

Hand Clin 25 (2009) 29–42. doi:10.1016/j.hcl.2008.11.003. 0749-0712/08/$ – see front matter © 2009 Elsevier Inc. All rights reserved. hand.theclinics.com

KEYWORDS
Relevance; Clinical research; Critical appraisal; Evidence-based; Quality; Clinical recommendations

FIVE STEPS OF EVIDENCE-BASED PRACTICE

The five steps in the evidence-based practice (EBP) approach are:

1. Ask a specific clinical question.
2. Find the best evidence to answer the question.
3. Critically appraise the evidence for its validity and usefulness.
4. Integrate appraisal results with clinical expertise and patient values.
5. Evaluate the outcomes.

Step 3 in the EBP approach involves critical appraisal of the validity and usefulness of evidence, with the specific goal of identifying the highest quality evidence that applies to a given clinical question. Because evidence-based decision making requires using the best available evidence, quality and relevance judgments are important components in the process. In fact, this third step can be broken down into three sequential subcomponents: (1) determine whether the results of individual studies are true (internally valid); (2) determine whether the results apply to a given patient (generalizability/external validity); and (3) determine the nature and strength of recommendations based on synthesis of several individual evidence resources.

Critical Appraisal of Individual Study Quality (Internal Validity)

The importance of critical appraisal in EBP has led to the development of systems, processes, tools, and support systems for rating clinical research evidence. In fact, we now have systematic reviews of appraisal tools [1]. In addition, there has been an increased move toward having experts in critical appraisal perform this task. Clinicians are then able to “pull-out” preappraised forms of evidence, such as the PEDro Physiotherapy Evidence Database or OTSeeker. Most recently, there has been development of “push-out” approaches, where high quality, critically appraised evidence resources already rated by experts are sent directly to end users with specific information needs (eg, BMJ updates). This article focuses on how hand surgeons and therapists can access and apply ranking systems, critical appraisal tools, and guides for making overall recommendations to provide guideposts on how research evidence can be transitioned into patient specific recommendations.

Critical appraisal first focuses on the internal validity of the study, or the extent to which the conclusions of the study are true within the particular context of the study. This process can be performed at various depths of analysis, such as quick classification systems or more detailed rating tools. Critical appraisal instruments range from very structured tools that contain specific questions and defined response categories, to more open-ended scales where the assessor makes guided subjective judgments on the quality of aspects of study design, using a framework provided by the assessment tool. Different critical appraisal tools are appropriate for different study designs. Hand surgeons and therapists should select different critical appraisal instruments depending on their clinical question, its associated study design, their familiarity with critical appraisal, personal preferences, accessibility of the literature, and a realistic balance between time commitment and depth of analysis.

Different depths of critical appraisal are also appropriate at different points in practice. For example, when needing to make quick decisions at the point of care, screening for specific randomized, controlled trials (RCTs) or presynthesized evidence may be the most expedient approach. The classic five levels of evidence will be useful for this purpose. In other cases, when planning to implement a new intervention into one’s practice, there may be a significant learning curve and cost involved. Therefore, it would be important to delve more deeply into the study design to gain a more thorough understanding of issues that might affect the validity of the study conclusions, and the clinical interpretability or applicability across different patients. Furthermore, knowing the evidence about a specific planned intervention can guide its implementation. Clinicians who commit to learning and practicing detailed critical appraisal gain a greater appreciation of the issues that can compromise confidence in research studies. However, quick rating scales or even presynthesized evidence ratings have the advantage of being less time consuming than more traditional evaluation methods.

LEVELS OF EVIDENCE

The concept of ranking levels of evidence is based on the principle that certain study types have more rigor and these higher quality study designs provide more confidence to associated clinical decision-making. The “best” study design varies according to the type of study that is being conducted. For example, while the RCT is considered the best study design for detecting differences between intervention groups, for studies in prognosis a prospective cohort design with complete follow-up is the best design. The types of study designs that have been used often signify the state of knowledge about an intervention. Early in the development of an intervention, case series are the most common. Data from these designs are then used to develop RCTs. The classic “Sackett’s” five levels of evidence are a broad ordinal tool but have had a tremendous impact. For example, many evidence reviews performed by the Cochrane Collaboration include either only RCTs or the two highest levels of evidence when conducting a systematic review.

The “Classic” Levels of Evidence for Treatment Effectiveness

Because treatment effectiveness is one of the primary interests of clinicians, and the RCT is the ideal design for experimental evaluation of treatment effectiveness, the conduct of RCTs has expanded exponentially. Early evidence rating systems for treatment effectiveness designated RCTs as level 1 evidence. With the proliferation of RCTs emerged a new research methodology: the systematic review. The original levels of evidence developed at McMaster University were subsequently updated and are clearly presented on the Web site for the Oxford Center For Evidence-Based Medicine by David Sackett and colleagues (last updated May 2001, http://www.cebm.net/levels_of_evidence.asp). This rating system allows you to classify individual studies in broad categories or “levels” (see the article by Szabo and MacDermid elsewhere in this issue).

Level 1 is the highest level of evidence that can be achieved for treatment effectiveness. Three potential situations are considered to be sufficiently rigorous to be labeled as level 1. Level 1a would consist of a systematic review of a number of RCTs, where the studies substantially agree with each other in terms of the direction and approximate size of the effects observed. A level 1b study would be an individual RCT where the size of the treatment effect was defined by a narrow confidence interval. A level 1c study is a very unusual circumstance in surgery or hand therapy, and is when an all-or-none phenomenon occurs in the absence of a randomized study. An example of a level 1c would be a study where an overwhelmingly dramatic change in outcomes can be demonstrated once a new treatment becomes available. Cases where all patients die before an intervention is available, and some survive following introduction of a new intervention, provide overwhelming evidence. For example, vaccination is widely accepted in practice although not based on RCT evidence. Level 1 studies are those that provide the highest internal validity (confidence that the study results are true), enhancing our confidence that if we select this intervention for our patients, we will be

internal validity. These include the use of standardized outcome measures, adequate sampling, appropriate blinding [3–5], rigorous follow-up, and proper statistical analysis, including adjustment for important potential confounders. A level 2a study is a systematic review of cohort (prospective) studies that agree with each other in terms of the direction and approximate size of the effects obtained. A level 2b study is a single, high quality cohort study (with greater than 80%