
[ clinical commentary ]

HEIDI ISRAEL, FNP, PhD1 • RANDY R. RICHTER, PT, PhD2

A Guide to Understanding Meta-Analysis

Meta-analysis is a popular statistical technique used to combine data from several studies and reexamine the effectiveness of treatment interventions. As the number of articles using meta-analysis increases, understanding the benefits and drawbacks of the technique is essential.

SYNOPSIS: With the focus on evidence-based practice in healthcare, a well-conducted systematic review that includes a meta-analysis where indicated represents a high level of evidence for treatment effectiveness. The purpose of this commentary is to assist clinicians in understanding meta-analysis as a statistical tool using both published articles and explanations of components of the technique. We describe what meta-analysis is, what heterogeneity is, and how it affects meta-analysis, effect size, the modeling techniques of meta-analysis, and strengths and weaknesses of meta-analysis. Common components like forest plot interpretation, software that may be used, special cases for meta-analysis, such as subgroup analysis, individual patient data, and meta-regression, and a discussion of criticisms, are included. J Orthop Sports Phys Ther 2011;41(7):496-504. doi:10.2519/jospt.2011.3333

KEY WORDS: [...], [...], statistical analysis, systematic review

Journal of Orthopaedic & Sports Physical Therapy®. Downloaded from www.jospt.org on December 21, 2016. For personal use only. No other uses without permission. Copyright © 2011 Journal of Orthopaedic & Sports Physical Therapy®. All rights reserved.

Well-conducted systematic reviews of randomized controlled trials are regarded as representing a high level of evidence.28 Practicing in an evidence-based manner is a recognized goal for the [...].3 Systematic reviews are used to answer questions7,10,40 about the evidence supporting or refuting the effectiveness or efficacy of an intervention. When certain conditions are met, a systematic review may be extended to include a meta-analysis, a statistical procedure used to numerically summarize the included studies’ treatment effect.56 A meta-analysis provides a single, overall measure of the treatment effect, enhancing the clinical interpretation of findings across several studies. Because of its increasing use, high level of evidence, and enhanced clinical interpretation of treatment effects, interpreting a meta-analysis is an important skill for physical therapists. The purpose of this commentary is to expand on existing articles describing meta-analysis interpretation,6,13,14,42,61 discuss differences in the results of a meta-analysis based on the treatment questions, explore special cases in the use of meta-analysis, and provide physical therapists guidance in interpreting a meta-analysis.

WHY META-ANALYSIS

A number of reasons exist for considering the use of meta-analysis techniques. Meta-analyses enable one to combine and summarize the findings of several clinical trials that evaluate the effectiveness or efficacy of a similar treatment approach on a similar group of patients. This technique can prove especially useful when there are several similar clinical trials with or without consistent outcomes, or when there are smaller to medium-sized trials with inconclusive results.

By combining the results from 2 or more studies, a meta-analysis can increase statistical power18 and provide a single numerical value of the overall treatment effect. The meta-analysis result may show either a benefit or lack of benefit of a treatment approach that will be indicated by the effect size, which is the term used to describe the treatment effect of an intervention. Treatment effect is the gain (or loss) seen in the experimental group relative to the control group. The overall positive or negative change may be hard to discern from individual studies. For example, Clare et al15 used a meta-analysis to examine the treatment effect of McKenzie therapy for spinal pain. Three studies supported the use of McKenzie therapy for short-term pain. Two of the 3 studies reported a small but similar reduction in pain, which was statistically significant for only 1 of the 2 studies. The third study reported a reduction of pain that was twice the magnitude of the other studies. The results of the meta-analysis indicated an overall treatment effect that was statistically significant and closer

1Assistant Professor, Saint Louis University, Department of Orthopaedic Surgery, St Louis, MO. 2Associate Professor, Saint Louis University, Program in Physical Therapy, St Louis, MO. Address correspondence to Dr Heidi Israel, Saint Louis University, Department of Orthopaedic Surgery, 3635 Vista FDT7N, St Louis, MO 63104. E-mail: [email protected]

496 | july 2011 | volume 41 | number 7 | journal of orthopaedic & sports physical therapy

in magnitude to the 2 studies reporting a small reduction in pain. This example illustrates the potential value of meta-analysis when direction of the treatment effect is the same across all studies but the magnitude and [...] of the treatment effect varies. A meta-analysis can also be used to show changes in the treatment effect that occur over time. For example, Zhang et al,60 in an update of the management of hip and knee osteoarthritis (2006 to 2009), determined that the treatment effect sizes for exercise and acupuncture did not change at multiple time points, while the treatment effect sizes for weight reduction eventually reached statistical significance at later time points, and the treatment effect size for electromagnetic therapy was no longer significant at later time points. These types of results obtained from meta-analysis can be used to make better informed treatment decisions.

To interpret a meta-analysis, the reader needs to understand several concepts, including effect size, heterogeneity, the model used to conduct the meta-analysis, and the forest plot, a graphical representation of the meta-analysis. These concepts are discussed below and summarized in TABLE 1.

TABLE 1. Selected Concepts in Meta-Analysis

Concept | Definition
Meta-analysis | Statistical analysis that integrates results from 2 or more studies,27 providing a single numerical value of the overall treatment effect for that group of studies.
Effect size | A dimensionless estimate (ie, a measure with no units) that indicates both direction and magnitude of the treatment effect.16,36
Odds ratio | The odds are the ratio of the probability of an event occurring compared to the event not occurring in a particular group. The odds ratio is the ratio of the odds between 2 groups.9
Relative risk | Relative risk is equal to the risk among exposed subjects divided by the risk among unexposed subjects.49
Fixed-effects model | A model that assumes that each study included in the meta-analysis is estimating the same population treatment effect, which, in theory, represents the true population treatment effect.21
Random-effects model | A model that assumes that the treatment effects of the included studies are part of a distribution of treatment effects that fall along a range of values.21
Forest plot | A graph that visually shows the results from the individual studies (treatment effect and confidence interval), as well as the estimate of overall treatment effect and associated confidence interval.34
Confidence interval | Confidence intervals (CIs) provide upper and lower limits that capture the range of values around the true but unknown population value. The 95% CI is most commonly used and corresponds with the typical 5% significance level used in hypothesis tests. CIs of continuous measures that include 0 represent nonsignificant results. CIs of odds ratios and relative risk that include 1.0 represent nonsignificant results.23

EFFECT SIZE

Studies included in a meta-analysis must have common outcome statistics that allow their results to be combined.31 Effect sizes, which reflect the magnitude and direction of the treatment effect for each study, serve this purpose. When all the studies to be included in a meta-analysis have the same outcome measure, an effect size in the original units may be calculated. For example, if all studies in the meta-analysis measure a continuous outcome, such as range of motion, the mean difference can be used as the effect size. Standardization of the effect size is needed when treatments are not measured in the same units. Standardization makes data unitless. When the effect sizes are in the original units, the interpretation is clearer. When the effect sizes are in standardized units, the interpretation is more difficult and published guidelines for interpreting effect sizes may be used.17 Whether standardized or not, the overall effect size derived from the meta-analysis is calculated by combining the effect sizes of the included studies. There are several types of effect sizes. For dichotomous data, such as improved or not improved, odds ratios or relative risks are used for effect sizes. Other types of effect sizes are often reported in meta-analysis, and these are described in TABLE 2.

Because several factors, such as sample size, variance, and reliability of the outcome measures, can influence the magnitude and direction of the effect size, the estimates of the effect sizes will vary among studies. In addition, the effect size of the individual studies may be somewhat imprecise and, therefore, lead to an unstable finding when multiple small studies are utilized. Weighting of the [...] based on sample size allows for the best precision of the effect size estimates.36 Finally, variables such as gender, age differences, or differences in the intervention provided, such as dose, can influence the magnitude and direction of the effect size.

Use of the confidence interval can lend insight into the precision of the treatment estimates of the included studies. A wider confidence interval may be a function of a small sample size, as well as imprecision in the measurement. Larger sample sizes provide more precise estimates of the effect size,36 whereas smaller studies are less precise, unless these smaller studies have little [...]. Confidence intervals, which are reported as a probability (eg, 95% confidence interval), provide a range (upper and lower bounds) that indicates the precision of the estimate of the effect size.23,48 If the confidence interval of the effect size falls within an area considered clinically meaningful, then applications of the results in clinical care may be justified.39,51 Conversely, wide confidence intervals indicate less precise estimates and, coupled with a small sample size, can lead to questions about the stability



TABLE 2. Typical Effect Size Calculations With Various Data Types Used in Meta-Analysis

Type of Data | Type of Effect Size Reported | Examples of Outcome Measures
Continuous data | Standardized mean difference (eg, Cohen d) | Various measures of strength16
Continuous data | Unstandardized mean difference | 100-mm visual analog pain scale8,41
Continuous data | Pretest-posttest difference | Depression preintervention/postintervention19
Binary (dichotomous) data | Odds ratio | Risk factors for persistent problems following whiplash58
Binary (dichotomous) data | Relative risk | Rates of rerupture in Achilles tendon repair32
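
To make the effect sizes listed in TABLE 2 concrete, the following is a minimal sketch of how each might be computed for a single study. All input numbers are hypothetical and chosen only for illustration. The function names are assumptions rather than part of any particular software package, and the Cohen d shown uses the common pooled-standard-deviation form.

```python
import math

def cohen_d(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Standardized mean difference using the pooled standard deviation."""
    pooled_var = ((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2) / (n_t + n_c - 2)
    return (mean_t - mean_c) / math.sqrt(pooled_var)

def relative_risk(events_t, n_t, events_c, n_c):
    """Risk in the treatment group divided by risk in the control group."""
    return (events_t / n_t) / (events_c / n_c)

def odds_ratio(events_t, n_t, events_c, n_c):
    """Odds of the event in the treatment group divided by odds in the control group."""
    odds_t = events_t / (n_t - events_t)
    odds_c = events_c / (n_c - events_c)
    return odds_t / odds_c

# Hypothetical continuous outcome (pain relief on a 100-mm scale).
md = 25.0 - 5.0                              # unstandardized mean difference, in mm
d = cohen_d(25.0, 12.0, 25, 5.0, 8.9, 25)    # unitless standardized difference

# Hypothetical binary outcome: 10/100 events with treatment, 20/100 with control.
rr = relative_risk(10, 100, 20, 100)
or_ = odds_ratio(10, 100, 20, 100)

print(f"mean difference = {md:.1f} mm, Cohen d = {d:.2f}")
print(f"relative risk = {rr:.2f}, odds ratio = {or_:.2f}")
```

Note that the relative risk and odds ratio differ for the same counts. This is one reason a reader should check which measure a meta-analysis reports before interpreting its magnitude.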

of the effect size estimates. By combining the results of small studies, a meta-analysis may provide a more precise estimate of the treatment effect.

For example, Khan et al32 examined randomized trials of operative versus nonoperative treatment of Achilles tendon rupture and calculated effect sizes for rerupture, infection rates, and other complications. One aspect of this meta-analysis examined complications other than rerupture for postoperative management that included splinting with casting alone versus casting followed by functional bracing. The confidence intervals of estimates of the relative risks in the individual studies were quite wide, especially in trials with a smaller sample size; but the meta-analysis effect size favored functional bracing with a smaller confidence interval than that of the individual studies. Mollon et al38 make the point that even when many studies are excluded from a meta-analysis, based on the stringent inclusion criteria of the systematic review protocol, the confidence intervals surrounding the overall estimated effect size are larger when the small studies are included.

FOREST PLOTS

One of the most useful tools used in meta-analysis is the forest plot, which provides a visual summary of the analysis and findings. A forest plot graphically represents estimates of the effect size and corresponding confidence intervals for each study, along with an estimate of overall effect size of all included studies and the corresponding overall confidence interval.34 Even when a systematic review does not include a meta-analysis, a forest plot can be used to compare the effect size of the included studies.44

In addition to illustrating the effect sizes and related confidence intervals of individual studies, a forest plot can illustrate the extent to which the results from individual studies vary. Variability in results among studies on the same topic is called heterogeneity. When the magnitude and direction of the effect sizes among the studies are similar, heterogeneity is less likely and meta-analysis may be appropriate. Conversely, when study results vary, heterogeneity is possible and a meta-analysis may not be appropriate.

A forest plot of a meta-analysis typically includes the numerical value of the treatment effect and variability for each individual study, the modeling technique assumed (random or fixed), the “line of no effect,” a test and corresponding value for heterogeneity, and the numerical estimate of overall treatment effect (FIGURES 1-2). The forest plot, therefore, provides a quick visual assessment of the individual studies included in the meta-analysis, a visual assessment of heterogeneity, and the overall treatment effect of the individual studies included.

The clinical context or clinical significance of the findings must be considered when interpreting effect sizes. Some researchers use the term “minimal clinically important difference” (MCID) to indicate clinical versus statistical significance. Because statistical significance does not always translate into clinical significance, the confidence intervals of an effect size can be used to interpret the results of a meta-analysis in a clinically meaningful manner.39,51 For example, if a 10° change in range of motion is considered clinically meaningful and the lower bound of the 95% confidence interval is 12° and the upper bound is 18°, the statistically significant difference is also clinically meaningful, as the confidence interval exceeds the clinically meaningful value of 10°. This type of finding based on randomized controlled trials should prompt the adoption of the intervention that led to those gains. However, if the MCID falls within the lower and upper bounds of the confidence interval, the clinician will have to determine if adoption of the intervention is warranted or if additional evidence is needed.51

HETEROGENEITY

Heterogeneity is a term used to describe variability among studies,21 and both statistical and clinical heterogeneity need to be considered.54 Statistical heterogeneity occurs when the treatment effect estimates of a set of studies vary among one another.21 Because some variation in treatment effect among studies would be expected by chance, statistical heterogeneity refers to the amount of variation in treatment effect present beyond chance.21 By convention, statistical heterogeneity is referred to as just heterogeneity.21 Studies with methodological flaws and small studies may overestimate treatment effects45,59 and can contribute to statistical heterogeneity. Statistical heterogeneity can be examined and quantified using statistical tests.


Review: Laser knee osteoarthritis
Comparison: 05 Pain outcome RCTs, knee osteoarthritis
Outcome: 06 Pain relief on 100-mm VAS, end of therapy with optimal dose

Study/Subcategory | n | Treatment (Mean ± SD) | n | Control (Mean ± SD) | Weight, %† | WMD (Fixed) 95% CI*
01 Electroacupuncture
Yurtkuran | 25 | 25.00 ± 12.00 | 25 | 5.00 ± 8.90 | 22.83 | 20.00 (14.14, 25.86)
Sangdee | 48 | 48.00 ± 24.30 | 47 | 23.00 ± 24.20 | 8.23 | 25.00 (15.25, 34.75)
Vas | 48 | 53.50 ± 21.40 | 49 | 28.50 ± 35.20 | 5.85 | 25.00 (13.43, 36.57)
Subtotal (95% CI) | 121 | | 121 | | 36.92 | 21.91 (17.30, 26.51)

Test for heterogeneity: χ2 = 1.07, df = 2 (P = .59), I2 = 0%§
Test for overall effect: z = 9.32 (P<.00001)
[Plot column‡: squares and horizontal lines showing each study's WMD and 95% CI relative to the line of no effect.]

FIGURE 1. Forest plot suggesting little heterogeneity. Abbreviations: CI, confidence interval; RCT, randomized controlled trial; VAS, visual analog scale; WMD, weighted mean difference.
*A fixed-effects model, which assumes a common, fixed treatment effect. Between-study differences are assumed due to chance and not incorporated into the model.
†Larger studies have a greater influence (weight) than smaller studies.
‡A qualitative visual analysis of the studies’ results suggests little between-study variability. The individual study point estimates of the treatment effect (blue squares) are on the same side of the line of no effect and closely line up on a vertical axis, indicating a similar treatment effect magnitude. The confidence intervals for each study’s treatment effect (horizontal line) overlap one another, and none cross the line of no effect, indicating a similar estimation of the population treatment effect between studies. These qualitative results suggest that there is little heterogeneity.
§The chi-square test for heterogeneity was nonsignificant. The I2 value was zero. These quantitative results suggest that there was little between-study variability (ie, heterogeneity).
Adapted from Bjordal JM, Johnson MI, Lopes-Martins RA, Bogen B, Chow R, Ljunggren AE. Short-term efficacy of physical interventions in osteoarthritic knee pain. A systematic review and meta-analysis of randomised placebo-controlled trials. BMC Musculoskelet Disord. 2007;8:51. © 2007 Bjordal et al; licensee BioMed Central Ltd.
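
As an illustration of how the subtotal row in FIGURE 1 can arise, the sketch below applies a fixed-effect, inverse-variance calculation to the means, standard deviations, and sample sizes printed in the figure. This is an assumption about the method, not a statement of the exact procedure used by the original review software. The variable and function names are hypothetical, and 1.96 is used as the 95% critical value.

```python
import math

# (study, n_treat, mean_treat, sd_treat, n_ctrl, mean_ctrl, sd_ctrl), as printed in FIGURE 1
studies = [
    ("Yurtkuran", 25, 25.00, 12.00, 25, 5.00, 8.90),
    ("Sangdee", 48, 48.00, 24.30, 47, 23.00, 24.20),
    ("Vas", 48, 53.50, 21.40, 49, 28.50, 35.20),
]

def mean_diff_and_variance(n_t, mean_t, sd_t, n_c, mean_c, sd_c):
    """Mean difference in original units and its sampling variance."""
    return mean_t - mean_c, sd_t ** 2 / n_t + sd_c ** 2 / n_c

effects, weights = [], []
for _name, *data in studies:
    md, var = mean_diff_and_variance(*data)
    effects.append(md)
    weights.append(1.0 / var)  # inverse-variance weight: more precise studies count more

total_weight = sum(weights)
pooled = sum(w * d for w, d in zip(weights, effects)) / total_weight
se_pooled = math.sqrt(1.0 / total_weight)
ci_low = pooled - 1.96 * se_pooled
ci_high = pooled + 1.96 * se_pooled
z = pooled / se_pooled

for (name, *_), w in zip(studies, weights):
    print(f"{name}: share of subgroup weight = {100 * w / total_weight:.1f}%")
print(f"pooled WMD = {pooled:.2f} (95% CI {ci_low:.2f}, {ci_high:.2f}), z = {z:.2f}")
```

The weights printed here are shares within this subgroup only, whereas the Weight column in the figure is expressed relative to the whole review, so the percentages differ even though the proportions among the 3 studies are the same.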

However, despite having statistical tests for statistical heterogeneity, there are no accepted guidelines for when a meta-analysis should not be completed due to statistical heterogeneity, and it is left to the author’s discretion to determine if a meta-analysis is appropriate.

Clinical heterogeneity refers to differences in study methods that affect the ability to compare and/or combine data from different studies. Examples of differences in study methods that may lead to clinical heterogeneity include differences in participant demographics, such as risk or severity of disease, the settings in which the study was conducted, the [...] and intensity of the intervention, and how outcomes were measured across studies.54 While there are statistical tests to estimate the extent of statistical heterogeneity, there are no tests to determine the extent of clinical heterogeneity. Researchers and clinicians must decide if the studies contributing to a meta-analysis are similar enough clinically to make meta-analysis sensible.

Whether the amount of clinical heterogeneity is too great to warrant meta-analysis is a matter of judgment. In some instances, authors may decide not to conduct a meta-analysis because the clinical heterogeneity is too great2; in others, they may decide to do so, using a subgroup analysis to explore the origin of the clinical heterogeneity8 (for example, to ensure that certain included populations are not affected adversely by an intervention that is favorable overall for the general population in the meta-analysis). Other authors may attempt to minimize clinical heterogeneity within a meta-analysis by limiting study eligibility. Carey et al12 decided which studies should be included in an allograft meta-analysis before starting the study, based on acceptable clinical heterogeneity and the quality of assessment for inclusion. This approach, while reducing heterogeneity, typically reduces the total number of articles included on a topic.

MODELING DATA: FIXED- AND RANDOM-EFFECTS MODELS

The 2 most frequently used models to conduct a meta-analysis are the fixed- and random-effects models,53 each of which handles statistical heterogeneity differently. Although the assumptions of each model differ, they frequently lead to similar results when heterogeneity is not extreme. These findings may lead one to conclude that the choice of a fixed- or random-effects model is not critical, which is an incorrect conclusion. Understanding the assumptions of each model sheds light on when one model will be more appropriate than the other.

The fixed- and random-effects models differ in assumptions related to the observed differences among study results. The 2 models are actually answering slightly different questions. In the fixed-effects model, the question is “What is the best estimate of the population effect size?”18 An assumption of the fixed-effects model is that among a fixed set of studies there is a common treatment effect18,26 and between-study differences in results occur by chance.22,53 In other words, the true treatment effect is assumed to be fixed22 and variability of between-study results is not incorporated into the model. Because of this assumption of fixed treatment effect, larger studies are given greater weight than the smaller studies.11 Different calculation methods are available under the fixed-effects model.22 Three common fixed-effect methods are the inverse variance method, the Mantel-Haenszel method, and the Peto method.22

The random-effects model, which



Review: Laser knee osteoarthritis
Comparison: 05 Pain outcome RCTs, knee osteoarthritis
Outcome: 06 Pain relief on 100-mm VAS, end of therapy with optimal dose

Study/Subcategory | n | Treatment (Mean ± SD) | n | Control (Mean ± SD) | Weight, % | WMD (Fixed) 95% CI
02 Transcutaneous electrical stimulation*
Fargas Babjak | 19 | 56.00 ± 37.50 | 18 | 10.70 ± 64.00 | 0.68 | 45.30 (11.26, 79.34)
Yurtkuran | 25 | 25.00 ± 12.00 | 25 | 5.00 ± 8.90 | 22.83 | 20.00 (14.14, 25.86)
Cheing | 16 | 58.00 ± 27.00 | 16 | 4.96 ± 42.40 | 1.29 | 53.04 (28.41, 77.67)
Adeboyin | 15 | 68.70 ± 24.00 | 15 | 43.00 ± 24.00 | 2.65 | 25.70 (8.52, 42.88)
Cheing | 30 | 35.20 ± 41.50 | 10 | 30.00 ± 24.10 | 1.75 | 5.20 (–15.94, 26.34)
Defrin | 33 | 28.40 ± 11.10 | 21 | 5.70 ± 13.40 | 16.59 | 22.70 (15.83, 29.57)
Law | 25 | 46.70 ± 24.40 | 9 | 17.00 ± 39.70 | 1.02 | 29.70 (2.06, 57.34)
Subtotal (95% CI) | 163 | | 114 | | 46.82 | 22.21 (18.12, 26.30)

Test for heterogeneity: χ2 = 11.28, df = 6 (P = .08), I2 = 46.8%†
Test for overall effect: z = 10.65 (P<.00001)
[Plot column: squares and horizontal lines showing each study's WMD and 95% CI relative to the line of no effect.]

FIGURE 2. Forest plot suggesting heterogeneity. Abbreviations: CI, confidence interval; RCT, randomized controlled trial; VAS, visual analog scale; WMD, weighted mean difference.
*Qualitative analysis of heterogeneity. A qualitative visual analysis of the studies’ results suggests between-study variability. The individual study point estimates of the treatment effect (blue squares) are on the same side of the line of no effect but do not line up on a vertical axis, indicating a difference in treatment effect magnitude among studies. The confidence intervals for each study’s treatment effect (horizontal lines) overlap one another, but the upper and lower limits of the CI do not consistently line up on a vertical axis, indicating differences in estimation of the population treatment effect among studies. These qualitative results suggest that there is heterogeneity.
†Quantitative tests of heterogeneity. The chi-square test for heterogeneity was significant at a level of less than .10. The I2 value was 47%. These quantitative results suggest there was study variability (ie, heterogeneity).
Adapted from Bjordal JM, Johnson MI, Lopes-Martins RA, Bogen B, Chow R, Ljunggren AE. Short-term efficacy of physical interventions in osteoarthritic knee pain. A systematic review and meta-analysis of randomised placebo-controlled trials. BMC Musculoskelet Disord. 2007;8:51. © 2007 Bjordal et al; licensee BioMed Central Ltd.
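
The heterogeneity statistics reported under FIGURE 2 can be approximated from the printed study data. The sketch below computes the Cochran Q and I2 and then, as an illustrative assumption, applies the DerSimonian and Laird estimate of between-study variance to show how random-effects weights become more equal than fixed-effect weights. Names are hypothetical. Because the inputs are rounded as printed, the computed Q and I2 land close to, but not exactly on, the reported 11.28 and 46.8%.

```python
# (study, n_treat, mean_treat, sd_treat, n_ctrl, mean_ctrl, sd_ctrl), as printed in FIGURE 2
studies = [
    ("Fargas Babjak", 19, 56.00, 37.50, 18, 10.70, 64.00),
    ("Yurtkuran", 25, 25.00, 12.00, 25, 5.00, 8.90),
    ("Cheing (a)", 16, 58.00, 27.00, 16, 4.96, 42.40),
    ("Adeboyin", 15, 68.70, 24.00, 15, 43.00, 24.00),
    ("Cheing (b)", 30, 35.20, 41.50, 10, 30.00, 24.10),
    ("Defrin", 33, 28.40, 11.10, 21, 5.70, 13.40),
    ("Law", 25, 46.70, 24.40, 9, 17.00, 39.70),
]

effects = [m_t - m_c for _, _, m_t, _, _, m_c, _ in studies]
variances = [sd_t ** 2 / n_t + sd_c ** 2 / n_c
             for _, n_t, _, sd_t, n_c, _, sd_c in studies]

# Fixed-effect (inverse-variance) weights and pooled estimate.
w_fixed = [1.0 / v for v in variances]
sum_w = sum(w_fixed)
pooled_fixed = sum(w * d for w, d in zip(w_fixed, effects)) / sum_w

# Cochran Q: weighted squared distance of each study from the common effect.
q = sum(w * (d - pooled_fixed) ** 2 for w, d in zip(w_fixed, effects))
df = len(studies) - 1

# I2: percentage of total variation attributed to heterogeneity rather than chance.
i2 = max(0.0, (q - df) / q) * 100.0

# DerSimonian and Laird between-study variance (tau squared), floored at zero.
c = sum_w - sum(w ** 2 for w in w_fixed) / sum_w
tau2 = max(0.0, (q - df) / c)

# Random-effects weights add tau squared to every study's variance.
w_random = [1.0 / (v + tau2) for v in variances]
pooled_random = sum(w * d for w, d in zip(w_random, effects)) / sum(w_random)

share_fixed = [w / sum_w for w in w_fixed]
share_random = [w / sum(w_random) for w in w_random]

print(f"Q = {q:.2f} on {df} df, I2 = {i2:.1f}%, tau2 = {tau2:.2f}")
print(f"pooled WMD: fixed = {pooled_fixed:.2f}, random = {pooled_random:.2f}")
for (name, *_), sf, sr in zip(studies, share_fixed, share_random):
    print(f"{name}: fixed weight share = {100 * sf:.1f}%, random = {100 * sr:.1f}%")
```

In this sketch the largest fixed-effect weight shrinks and the smallest grows under the random-effects model, which is the mechanism by which smaller studies gain relative influence when heterogeneity is present.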

assumes a distribution of treatment effects, answers the question “What is the [...]?” The random-effects model assumes a distribution of the treatment effect for some populations, meaning that the treatment effect falls along a range of values, not a single value, as in the fixed-effects model. Because of this distribution, the effect size may be positive for some populations but may be negative or harmful for others.18,29 Studies included in the meta-analysis using a random-effects model are assumed to represent a random sample of a population of studies. The results of each study included in the meta-analysis represent a study-specific effect size that varies around a mean population effect size.18,26 In other words, the results of each study in the meta-analysis are assumed to represent a unique effect. Because of this assumption, larger studies are given proportionally less weight, while smaller studies are given proportionally more weight.11 In the random-effects model, the unique effect of each study is accounted for in the calculation.22,26 A calculation method under the random-effects model is the DerSimonian and Laird method.22

When heterogeneity is present, the random-effects model will weight the studies comprising the meta-analysis more equally, resulting in smaller studies having greater relative influence on the combined overall effect than in the fixed-effects model.50 To the extent that smaller studies overestimate treatment effects, a random-effects model may overestimate treatment effects when heterogeneity is present. In this case, one recommendation is to compare the fixed- and random-effects models.50

EXAMINING STUDIES FOR HETEROGENEITY

High heterogeneity may indicate that it is inappropriate to combine studies in a meta-analysis. Heterogeneity can be visualized using forest plots (FIGURES 1 and 2); however, like any graph, the interpretation is not absolute. Two statistical methods to analyze statistical heterogeneity that are frequently reported are the Cochran Q test (also known as the chi-square test for heterogeneity or the chi-square test for homogeneity) and the I2 (also known as Higgins I2).

The Cochran Q tests whether the individual studies’ treatment effects are farther away from the common effect, beyond what is expected by chance.22,30 When the chi-square test is significant, statistical heterogeneity is present. This test has low power when few studies make up the meta-analysis,30 and, as a result, a nonsignificant test may lead to the wrong conclusion regarding heterogeneity. A compensation for the low power of the Cochran Q is to test for heterogeneity at an alpha level of .10, rather than at .05, thereby increasing the chance of finding heterogeneity.

The test can have excessive power when there are many large studies,30 which is similar to the problems encountered in other statistics with large sample


sizes. Finally, the Cochran Q reduces the question of heterogeneity to a dichotomy based on the P value, so there is no quantification of the amount of heterogeneity, just whether or not there is statistically significant heterogeneity.

A different question to ask is “How much heterogeneity is present?” The I2 statistic was developed to answer this question.30 The range of I2 is from 0% to 100%. This percentage represents the percentage of total variation across studies due to heterogeneity. The test is not influenced by the number of studies in the meta-analysis, and, rather than a dichotomy, the results indicate how much heterogeneity is present. Another advantage of I2 is that this test can be interpreted similarly, regardless of the type of outcome data and choice of effect measure. An important disadvantage of I2 is that there are no empirically developed cut-points to determine when there is too much heterogeneity to do a meta-analysis. Higgins et al30 have suggested rule-of-thumb interpretations, such as 25% equals low heterogeneity, 50% equals medium heterogeneity, and 75% equals high heterogeneity. Other authors may choose a cut-off for heterogeneity when choosing whether a fixed or random model is appropriate.

The decision to move forward with the meta-analysis or stop at the systematic review should be made based on the results of the test of heterogeneity and clinical judgment. High heterogeneity implies dissimilarity in the studies, and a meta-analysis should be conducted with caution. The question that the informed clinician should evaluate is whether this amount of heterogeneity is so large that the results of the meta-analysis are problematic.

OTHER CONSIDERATIONS

Event Rarity
Event rarity usually leads to an overestimate of effect size.36 For example, the result that functional bracing instead of operative intervention prevented Achilles tendon rupture32 was acknowledged by the authors to be a possible result of both low event occurrence and numerous small randomized clinical trials included in the meta-analysis. When the event in question happens rarely or infrequently, or is captured in small numbers within small trials, caution should be exercised in interpreting the meta-analysis. While a random-effects model is advocated by many authors, a fixed-effects model should be considered, because the random-effects model is influenced more by smaller studies.

Subgroup Analysis
The meta-analysis produces an average effect for all the trials included in the analysis. A question that may be asked is, does this average treatment effect hold true for all types of groups included in the meta-analysis? A subgroup analysis can be used to compare the overall effect of the meta-analysis to a particular subgroup of patients within the meta-analysis. A subgroup effect can occur by chance.20 Because of the possibility of a chance finding, care should be taken when interpreting the results of a subgroup analysis. Some authors suggest determining prior to the analysis which factors to include in a subgroup analysis.21 Risk stratification encourages the same phenomenon, in which those at higher risk are more likely to benefit from a treatment. In fact, detrimental effects of a treatment may outweigh the benefits in the low-risk group. If the results in the subgroups are very different, then using meta-analysis to produce an average effect may not be appropriate. Lastly, including randomized trials in the meta-analysis does not mean that comparisons between the trials are random,20 making interpretation of subgroup analysis difficult.

Sensitivity Analysis
While a subgroup analysis attempts to estimate a treatment effect for a particular subgroup, a sensitivity analysis is used to determine if the meta-analysis findings change when different decisions related to the systematic review/meta-analysis process are made.21 For example, a sensitivity analysis could be conducted to determine if fixed- versus random-effects analyses reach different conclusions.21

Meta-Regression
Often, the studies included in a meta-analysis vary in their study characteristics (eg, variations in participant characteristics). Rather than ignoring these differences, a meta-regression tries to relate the size of the effect to characteristics of the studies involved.55 Conceptually, meta-regression is similar to regression. The predictor variables are the characteristics of the studies (ie, sample size, [...]) that influence the effect size, which is the outcome variable.21 The associations derived from meta-regressions are observational and have a weaker interpretation than the causal relationships derived from randomized studies. This applies particularly when averages of patient characteristics in each trial are used as covariates in the regression. Data dredging is the main pitfall in reaching reliable conclusions from meta-regression. It can only be avoided by prespecification of variables that are believed to be potential sources of heterogeneity.55 While some sources of heterogeneity may be expected due to differences in study design (use and nonuse of randomization), other sources of heterogeneity related to patient characteristics require expert knowledge of the clinical area.

Meta-Analysis Using Individual Patient Data (IPD)
Some critics of meta-analysis have stated that group-level analysis can lead to overestimation of low-occurrence events and nondetection of important subset effects. In theory, using IPD allows for a larger sample size and potentially better standardization of the data, because the data are reanalyzed by the researchers doing the meta-analysis, rather than relying

journal of orthopaedic & sports physical therapy | volume 41 | number 7 | july 2011 | 501

41-07 Israel.indd 501 6/15/2011 3:14:52 PM [ clinical commentary ]

on the estimates provided in group-level data. For these reasons, some researchers recommend use of IPD, rather than group-level data alone, when possible.5 However, IPD analysis has a number of drawbacks. For example, IPD analysis techniques are not standardized, IPD data require sharing of the data among investigators, the use of IPD requires all data to be available,52 and the research questions, recruitment criteria, and measures of treatment effect need to be similar. Differences in how data are collected can lead to limitations in the use of IPD. When any of these elements differ, uncertainty or error is introduced into the overall analysis. However, IPD is certainly an alternative that can be appealing when answering questions related to group or subset differences.

When Meta-Analysis Results are Counterintuitive
When results of the meta-analysis are counterintuitive, clinical judgment will be needed to decipher the unexpected results. This judgment, based on experience, education, and current practices, will help the clinician to determine whether to accept the findings or question the statistical technique. One option is to go back to the original articles, reassess the inclusion of the articles into the analysis, and judge whether the inclusion made clinical sense. In meta-analysis, there can be a loss of the original individual study assumptions. Similar to assumptions regarding the research question and […] in a given study, assumptions of the original […] can be lost when the studies are combined together.25

Software Packages
There are a variety of free and proprietary software packages for meta-analysis.4 One well-known software package offered by the Cochrane Collaboration is RevMan (http://www.cc-ims.net/revman). Another free meta-analysis software package was recently described by Wallace et al.57

CRITICISMS OF META-ANALYSIS

Meta-analysis is not without its critics. Rosenthal43 cautions that, while there are many positive aspects to meta-analysis, researchers must be cognizant that studies may vary considerably in their operational definitions of the independent and dependent variables, methods of measuring variables, data-analytic approaches, and results.

Eysenck24 has published numerous criticisms of the assumptions and the techniques of meta-analysis. Eysenck24 believes that meta-analysis encourages a narrow focus on the effect size, without consideration of other aspects of the included studies, such as methods or individual study outcomes that are in opposite directions. This may lead to an erroneous conclusion. As with many other statistical techniques, a focus on singular aspects of the data (eg, the overall effect) can lead to conflicting interpretations; thus, the entire body of evidence should be part of the interpretation of any statistical analysis.

Appraising a systematic review using the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) statement35,37 or a standardized appraisal tool, such as the AMSTAR (a measurement tool to assess the methodological quality of systematic reviews) tool,46,47 will help clinicians focus on all aspects of the systematic review. A meta-analysis should, at a minimum, include forest plots with effect size estimates and confidence intervals for each included study, a measure of heterogeneity, and the meta-analysis overall treatment effect and related confidence interval.

Eysenck24 also argues that only meta-analysis of a simple question is valid and, when several studies are positive but not significant because of insufficient statistical power, using meta-analysis to examine effect size can lead to spurious conclusions. Clinicians should examine the results of the systematic review and the protocol for article inclusion to judge whether a research question can be explored with meta-analysis.2

Other critics25,33 have claimed that meta-analysis techniques represent the destruction of the scientific methods formed to provide statistical accuracy and reproducibility in research. The aggregation of the data is a concern and, as discussed earlier in this paper, without a good systematic review and a protocol for inclusion of studies into the meta-analysis, these concerns may be valid.

Most of these criticisms can be addressed by conducting a quality systematic review and then deciding whether meta-analysis is appropriate. Having a study team with expertise in the area of the research topic, in searching databases, and in the technique of meta-analysis can help guard against some of these potential problems.

CONCLUSION

Meta-analysis is a valuable tool for researchers, because this technique allows a reexamination of treatment effect of several studies using a larger sample than is possible for most researchers to recruit on their own. In many ways, meta-analysis allows, without additional clinical resources, exploration of potential treatment benefits or drawbacks and utilizes […] made available in smaller clinical trials.

Clinicians reading the results of a meta-analysis should have a clear understanding of the strengths and limitations of the technique. In clinical medicine, many small studies are performed due to lack of access to patients, resources for conducting studies, or other forces that drive clinical practice. Meta-analysis provides a way to reevaluate the results of a particular clinical question. Meta-analysis can be misleading if the studies included are dissimilar in their research question or collect different types of outcome data. Meta-analysis, like any other statistical method, is unable to identify whether the data being utilized are appropriate. It is the responsibility of clinicians and researchers in the field to be
well informed about the […] and interpretation of the research information before them, so as to make good clinical decisions for their patients.

REFERENCES

1. Ada L, Dorsch S, Canning CG. Strengthening interventions increase strength and improve activity after stroke: a systematic review. Aust J Physiother. 2006;52:241-248.
2. Alexander LD, Gilman DR, Brown DR, Brown JL, Houghton PE. Exposure to low amounts of ultrasound energy does not improve soft tissue shoulder pathology: a systematic review. Phys Ther. 2010;90:14-25. http://dx.doi.org/10.2522/ptj.20080272
3. American Physical Therapy Association. APTA Vision Statement for Physical Therapy 2020. Available at: http://www.apta.org/AM/Template.cfm?Section=Vision_20201&Template=/TaggedPage/TaggedPageDisplay.cfm&TPLID=285&ContentID=32061. Accessed July 22, 2000.
4. Bax L, Yu LM, Ikeda N, Moons KG. A systematic comparison of software dedicated to meta-analysis of causal studies. BMC Med Res Methodol. 2007;7:40. http://dx.doi.org/10.1186/1471-2288-7-40
5. Berlin JA, Santanna J, Schmid CH, Szczech LA, Feldman HI. Individual patient- versus group-level data meta-regressions for the investigation of treatment effect modifiers: ecological bias rears its ugly head. Stat Med. 2002;21:371-387.
6. Bhandari M, Devereaux PJ, Montori V, Cina C, Tandan V, Guyatt GH. Users’ guide to the surgical literature: how to use a systematic literature review and meta-analysis. Can J Surg. 2004;47:60-67.
7. Bhandari M, Montori VM, Devereaux PJ, Wilczynski NL, Morgan D, Haynes RB. Doubling the impact: publication of systematic review articles in orthopaedic journals. J Bone Joint Surg Am. 2004;86-A:1012-1016.
8. Bjordal JM, Johnson MI, Lopes-Martins RA, Bogen B, Chow R, Ljunggren AE. Short-term efficacy of physical interventions in osteoarthritic knee pain. A systematic review and meta-analysis of randomised placebo-controlled trials. BMC Musculoskelet Disord. 2007;8:51. http://dx.doi.org/10.1186/1471-2474-8-51
9. Bloch J. Interpreting odds ratios and logistic regression: a necessity for evidence-based practice. Am J Nurse Pract. 2008;12:39-43.
10. Bolland MJ, Grey A, Reid IR. The randomised controlled trial to meta-analysis ratio: original data versus systematic reviews in the medical literature. N Z Med J. 2007;120:U2804.
11. Borenstein M, Hedges L, Higgins J, Rothstein H. Introduction to Meta-Analysis. Chichester, West Sussex, UK: John Wiley & Sons; 2009.
12. Carey JL, Dunn WR, Dahm DL, Zeger SL, Spindler KP. A systematic review of anterior cruciate ligament reconstruction with autograft compared with allograft. J Bone Joint Surg Am. 2009;91:2242-2250. http://dx.doi.org/10.2106/JBJS.I.00610
13. Chan KS, Morton SC, Shekelle PG. Systematic reviews for evidence-based management: how to find them and what to do with them. Am J Manag Care. 2004;10:806-812.
14. Chung KC, Burns PB, Kim HM. A practical guide to meta-analysis. J Hand Surg Am. 2006;31:1671-1678. http://dx.doi.org/10.1016/j.jhsa.2006.09.002
15. Clare HA, Adams R, Maher CG. A systematic review of efficacy of McKenzie therapy for spinal pain. Aust J Physiother. 2004;50:209-216.
16. Cochrane Collaboration. Glossary of Terms in The Cochrane Collaboration Version 4.2.5. Available at: http://www.cochrane.org/sites/default/files/uploads/glossary.pdf. Accessed March 22, 2005.
17. Cohen J. Statistical Power Analysis for the Behavioural Sciences. 2nd ed. Hillsdale, NJ: Lawrence Erlbaum Associates, Inc; 1988.
18. Cohn LD, Becker BJ. How meta-analysis increases statistical power. Psychol Methods. 2003;8:243-253. http://dx.doi.org/10.1037/1082-989X.8.3.243
19. Cuijpers P, van Straten A, Warmerdam L. Problem solving therapies for depression: a meta-analysis. Eur Psychiatry. 2007;22:9-15. http://dx.doi.org/10.1016/j.eurpsy.2006.11.001
20. Davey Smith G, Egger M, Phillips A. Meta-analysis. Beyond the grand mean? BMJ. 1997;315:1610-1614.
21. Deeks J, Higgins JP, Altman D. Analysing data and undertaking meta-analyses. In: Higgins JP, Green S, eds. Cochrane Handbook for Systematic Reviews of Interventions. Chichester, West Sussex, UK: John Wiley & Sons; 2008:243-296.
22. Deeks JJ, Altman D, Bradburn M. Statistical methods for examining heterogeneity and combining results from several studies in meta-analysis. In: Egger M, Smith GD, Altman D, eds. Systematic Reviews in Health Care: Meta-Analysis in Context. London, UK: BMJ Books; 2001:285-312.
23. Doll H, Carney S. Statistical approaches to uncertainty: P values and confidence intervals unpacked. Equine Vet J. 2007;39:275-276.
24. Eysenck HJ. Meta-analysis of best-evidence synthesis? J Eval Clin Pract. 1995;1:29-36.
25. Feinstein AR. Meta-analysis: statistical alchemy for the 21st century. J Clin Epidemiol. 1995;48:71-79.
26. Fleiss JL. The statistical basis of meta-analysis. Stat Methods Med Res. 1993;2:121-145.
27. Glass GV. Primary, secondary, and meta-analysis of research. Educ Res. 1976;5:3-8.
28. Guyatt G, Rennie D. Users’ Guides to the Medical Literature: A Manual for Evidence-Based Clinical Practice. Chicago, IL: American Medical Association; 2002.
29. Hedges LV, Pigott TD. The power of statistical tests in meta-analysis. Psychol Methods. 2001;6:203-217.
30. Higgins JP, Thompson SG, Deeks JJ, Altman DG. Measuring inconsistency in meta-analyses. BMJ. 2003;327:557-560. http://dx.doi.org/10.1136/bmj.327.7414.557
31. Kane RL, Saleh KJ, Wilt TJ, Bershadsky B. The functional outcomes of total knee arthroplasty. J Bone Joint Surg Am. 2005;87:1719-1724. http://dx.doi.org/10.2106/JBJS.D.02714
32. Khan RJ, Fick D, Keogh A, Crawford J, Brammar T, Parker M. Treatment of acute achilles tendon ruptures. A meta-analysis of randomized, controlled trials. J Bone Joint Surg Am. 2005;87:2202-2210. http://dx.doi.org/10.2106/JBJS.D.03049
33. Lau J, Ioannidis JP, Schmid CH. Summing up evidence: one answer is not always enough. Lancet. 1998;351:123-127. http://dx.doi.org/10.1016/S0140-6736(97)08468-7
34. Lewis S, Clarke M. Forest plots: trying to see the wood and the trees. BMJ. 2001;322:1479-1480.
35. Liberati A, Altman DG, Tetzlaff J, et al. The PRISMA statement for reporting systematic reviews and meta-analyses of studies that evaluate health care interventions: explanation and elaboration. Ann Intern Med. 2009;151:W65-94.
36. Lipsey M, Wilson D. Practical Meta-Analysis. Thousand Oaks, CA: Sage Publications; 2000.
37. Moher D, Liberati A, Tetzlaff J, Altman DG. Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement. Ann Intern Med. 2009;151:264-269, W264.
38. Mollon B, da Silva V, Busse JW, Einhorn TA, Bhandari M. Electrical stimulation for long-bone fracture-healing: a meta-analysis of randomized controlled trials. J Bone Joint Surg Am. 2008;90:2322-2330. http://dx.doi.org/10.2106/JBJS.H.00111
39. Montori VM, Kleinbart J, Newman TB, et al. Tips for learners of evidence-based medicine: 2. Measures of precision (confidence intervals). CMAJ. 2004;171:611-615. http://dx.doi.org/10.1503/cmaj.1031667
40. Patsopoulos NA, Analatos AA, Ioannidis JP. Relative citation impact of various study designs in the health sciences. JAMA. 2005;293:2362-2366. http://dx.doi.org/10.1001/jama.293.19.2362
41. Pittler MH, Brown EM, Ernst E. Static magnets for reducing pain: systematic review and meta-analysis of randomized trials. CMAJ. 2007;177:736-742. http://dx.doi.org/10.1503/cmaj.061344
42. Ried K. Interpreting and understanding meta-analysis graphs--a practical guide. Aust Fam Physician. 2006;35:635-638.
43. Rosenthal R, DiMatteo MR. Meta-analysis: recent developments in quantitative methods for literature reviews. Annu Rev Psychol. 2001;52:59-82. http://dx.doi.org/10.1146/annurev.psych.52.1.59
44. Santamaria LJ, Webster KE. The effect of fatigue on lower-limb biomechanics during single-limb landings: a systematic review. J Orthop Sports Phys Ther. 2010;40:464-473. http://dx.doi.org/10.2519/jospt.2010.3295
45. Schulz KF, Chalmers I, Hayes RJ, Altman DG. Empirical evidence of bias. Dimensions of methodological quality associated with estimates of treatment effects in controlled trials. JAMA. 1995;273:408-412.
46. Shea BJ, Grimshaw JM, Wells GA, et al. Development of AMSTAR: a measurement tool to assess the methodological quality of systematic reviews. BMC Med Res Methodol. 2007;7:10. http://dx.doi.org/10.1186/1471-2288-7-10
47. Shea BJ, Hamel C, Wells GA, et al. AMSTAR is a reliable and valid measurement tool to assess the methodological quality of systematic reviews. J Clin Epidemiol. 2009;62:1013-1020.
48. Sim J, Reid N. Statistical inference by confidence intervals: issues of interpretation and utilization. Phys Ther. 1999;79:186-195.
49. Sistrom CL, Garvan CW. Proportions, odds, and risk. Radiology. 2004;230:12-19. http://dx.doi.org/10.1148/radiol.2301031028
50. Sterne JA, Egger M, Moher D. Addressing reporting biases. In: Higgins JP, Green S, eds. Cochrane Handbook for Systematic Reviews of Interventions. Chichester, West Sussex, UK: John Wiley & Sons; 2008:297-333.
51. Stratford PW. The added value of confidence intervals. Phys Ther. 2010;90:333-335. http://dx.doi.org/10.2522/ptj.2010.90.3.333
52. Sud S, Douketis J. The devil is in the details...or not? A primer on individual patient data meta-analysis. Evid Based Med. 2009;14:100-101.
53. Sutton AJ, Higgins JP. Recent developments in meta-analysis. Stat Med. 2008;27:625-650. http://dx.doi.org/10.1002/sim.2934
54. Thompson SG. Why and how sources of heterogeneity should be investigated. In: Egger M, Davey Smith G, Altman D, eds. Systematic Reviews in Health Care: Meta-Analysis in Context. London, UK: BMJ Publishing Group; 2001:157-175.
55. Thompson SG, Higgins JP. How should meta-regression analyses be undertaken and interpreted? Stat Med. 2002;21:1559-1573. http://dx.doi.org/10.1002/sim.1187
56. Thompson SG, Pocock SJ. Can meta-analyses be trusted? Lancet. 1991;338:1127-1130.
57. Wallace BC, Schmid CH, Lau J, Trikalinos TA. Meta-Analyst: software for meta-analysis of binary, continuous and diagnostic data. BMC Med Res Methodol. 2009;9:80. http://dx.doi.org/10.1186/1471-2288-9-80
58. Walton DM, Pretty J, MacDermid JC, Teasell RW. Risk factors for persistent problems following whiplash injury: results of a systematic review and meta-analysis. J Orthop Sports Phys Ther. 2009;39:334-350. http://dx.doi.org/10.2519/jospt.2009.2765
59. Wood L, Egger M, Gluud LL, et al. Empirical evidence of bias in treatment effect estimates in controlled trials with different interventions and outcomes: meta-epidemiological study. BMJ. 2008;336:601-605. http://dx.doi.org/10.1136/bmj.39465.451748.AD
60. Zhang W, Nuki G, Moskowitz RW, et al. OARSI recommendations for the management of hip and knee osteoarthritis: part III: Changes in evidence following systematic cumulative update of research published through January 2009. Osteoarthritis Cartilage. 2010;18:476-499. http://dx.doi.org/10.1016/j.joca.2010.01.013
61. Zlowodzki M, Poolman RW, Kerkhoffs GM, Tornetta P, 3rd, Bhandari M. How to interpret a meta-analysis and judge its value as a guide for clinical practice. Acta Orthop. 2007;78:598-609. http://dx.doi.org/10.1080/17453670710014284
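
As an illustration of the heterogeneity measures and the fixed- versus random-effects comparison discussed in this commentary, the following is a minimal sketch rather than the authors' own procedure. It assumes inverse-variance weighting, computes I² as (Q - df)/Q following Higgins et al,30 and uses the DerSimonian-Laird estimate of between-study variance, a common choice that the commentary itself does not name. The function name `pool_studies` and the five trial values are hypothetical.

```python
import math


def pool_studies(effects, variances):
    """Pool study effect sizes; return fixed/random estimates, Cochran Q, and I^2.

    effects   -- one effect size per study (eg, a mean difference)
    variances -- the squared standard error of each effect size
    """
    k = len(effects)
    weights = [1.0 / v for v in variances]  # inverse-variance weights
    sum_w = sum(weights)

    # Fixed-effects estimate: weighted average of the study effects.
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum_w

    # Cochran Q: weighted squared deviations from the fixed-effects estimate.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    df = k - 1

    # I^2: share of total variation attributed to heterogeneity, floored at 0%.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    # DerSimonian-Laird estimate of the between-study variance (tau^2).
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects weights add tau^2 to each study's own variance.
    re_weights = [1.0 / (v + tau2) for v in variances]
    sum_re = sum(re_weights)
    random_est = sum(w * y for w, y in zip(re_weights, effects)) / sum_re

    return {
        "fixed": fixed,
        "fixed_se": math.sqrt(1.0 / sum_w),
        "random": random_est,
        "random_se": math.sqrt(1.0 / sum_re),
        "q": q,
        "df": df,
        "i2": i2,
        "tau2": tau2,
    }


if __name__ == "__main__":
    # Hypothetical effect sizes and variances for five trials (illustration only).
    effects = [0.10, 0.30, 0.35, 0.65, 0.45]
    variances = [0.03, 0.03, 0.05, 0.01, 0.05]
    result = pool_studies(effects, variances)
    for name, value in result.items():
        print(f"{name}: {value:.3f}")
```

Comparing the `fixed` and `random` entries of the returned dictionary is one simple form of the sensitivity analysis described above. When `tau2` is greater than zero, the random-effects weights become more alike across studies, which is why smaller studies carry relatively more influence under that model.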
