
A Simple Method for Evaluating the Clinical Literature

Robert J. Flaherty, MD

The “PP-ICONS” approach will help you separate the clinical wheat from the chaff in mere minutes.

Keeping up with the latest advances in diagnosis and treatment is a challenge we all face as physicians. We need information that is both valid (that is, accurate and correct) and relevant to our patients and practices. While we have many sources of clinical information, such as CME lectures, textbooks, pharmaceutical advertising, pharmaceutical representatives and colleagues, we often turn to journal articles for the most current clinical information.

Unfortunately, a great deal of research reported in journal articles is poorly done, poorly analyzed or both, and thus is not valid. A great deal of research is also irrelevant to our patients and practices. Separating the clinical wheat from the chaff can take skills that many of us never were taught.

The article “Making Evidence-Based Medicine Doable in Everyday Practice” in the February 2004 issue of FPM describes several organizations that can help us. These organizations, such as the Cochrane Library, Bandolier and Clinical Evidence, develop clinical questions and then review one or more journal articles to identify the best available evidence that answers the question, with a focus on the quality of the study, the validity of the results and the relevance of the findings to everyday practice. These organizations provide a very valuable service, and the number of important clinical questions that they have studied has grown steadily over the past five years. (See “Four steps to an evidence-based answer” on page 48.) If you find a systematic review or meta-analysis done by one of these organizations, you can feel confident that you have found the current best evidence.
However, these organizations have not asked all of the common clinical questions yet, and you will frequently be faced with finding the pertinent articles and determining for yourself whether they are valuable. This is where the PP-ICONS approach can help.

What is PP-ICONS?

When you find a systematic review, meta-analysis or randomized controlled trial while reading your clinical journals or searching PubMed (http://www.ncbi.nlm.nih.gov/entrez/query.fcgi), you need to determine whether it is valid and relevant. There are many different ways to analyze an abstract or journal article, some more rigorous than others.1,2 I have found a simple but effective way to identify a valid or relevant article within a couple of minutes, ensuring that I can use or discard the conclusions with confidence. This approach works well on articles regarding treatment and prevention, and can also be used with articles on diagnosis and screening.

Dr. Flaherty is a family physician with the Student Health Service and WWAMI Medical Program, Montana State University-Bozeman, and a clinical professor in the Department of Family Medicine, University of Washington. Conflicts of interest: none reported.

Illustration by Greg Clarke.

May 2004 ■ www.aafp.org/fpm ■ FAMILY PRACTICE MANAGEMENT ■ 47

Downloaded from the Family Practice Management Web site at www.aafp.org/fpm. Copyright © 2004 American Academy of Family Physicians. For the private, noncommercial use of one individual user of the Web site. All other rights reserved. Contact [email protected] for copyright questions and/or permission requests.

KEY POINTS

• Reading the abstract is often sufficient when evaluating an article using the PP-ICONS approach.
• The most relevant studies will involve outcomes that matter to patients (e.g., morbidity, mortality and cost) versus outcomes that matter to physiologists (e.g., blood pressure, blood sugar or cholesterol levels).
• Ignore the relative risk reduction, as it overstates research findings and will mislead you.

The most important information to look for when reviewing an article can be summarized by the acronym “PP-ICONS,” which stands for the following:

• Problem,
• Patient or population,
• Intervention,
• Comparison,
• Outcome,
• Number of subjects,
• Statistics.

For example, imagine that you just saw a nine-year-old patient in the office with common warts on her hands, an ideal candidate for your usual cryotherapy. Her mother had heard about treating warts with duct tape and wondered if you would recommend this treatment. You promised to call Mom back after you had a chance to investigate this rather odd treatment.

When you get a free moment, you write down your clinical question: “Is duct tape an effective treatment for warts in children?” Writing down your clinical question is useful, as it can help you clarify exactly what you are looking for. Use the PPICO parts of the acronym to help you write your clinical question; this is actually how many researchers develop their research questions.

You search Cochrane and Bandolier without success, so now you search PubMed, which returns an abstract for the following article: “Focht DR 3rd, Spicer C, Fairchok MP. The efficacy of duct tape vs cryotherapy in the treatment of verruca vulgaris (the common wart). Arch Pediatr Adolesc Med. 2002 Oct;156(10):971-974.”

You decide to apply PP-ICONS to this abstract (shown on page 49) to determine if the information is both valid and relevant.

Problem. The first P in PP-ICONS is for “problem,” which refers to the clinical condition that was studied. From the abstract, it is clear that the researchers studied the same problem you are interested in, which is important since flat warts or genital warts may have responded differently. Obviously, if the problem studied were not sufficiently similar to your clinical problem, the results would not be relevant.

Patient or population. Next, consider the patient or population. Is the study group similar to your patient or practice? Are they primary care patients, for example, or are they patients who have been referred to a

FOUR STEPS TO AN EVIDENCE-BASED ANSWER

When faced with a clinical question, follow these steps to find an evidence-based answer:

1. Search the Web site of one of the evidence review organizations, such as Cochrane (http://www.cochrane.org/cochrane/revabstr/mainindex.htm), Bandolier (http://www.jr2.ox.ac.uk/bandolier) or Clinical Evidence (http://www.clinicalevidence.com), described in “Making Evidence-Based Medicine Doable in Everyday Practice,” FPM, February 2004, page 51. You can also search the TRIP+ Web site (http://www.tripdatabase.com), which simultaneously searches the databases of many of the review organizations. If you find a systematic review or meta-analysis by one of these organizations, you can be confident that you’ve found the best evidence available.

2. If you don’t find the information you need through step 1, search for meta-analyses and systematic reviews using the PubMed Web site (see the tutorial at http://www.nlm.nih.gov/bsd/pubmed_tutorial/m1001.html). Most of the recent abstracts found on PubMed provide enough information for you to determine the validity and relevance of the findings. If needed, you can get a copy of the full article through your hospital library or the journal’s Web site.

3. If you cannot find a systematic review or meta-analysis on PubMed, look for a randomized controlled trial (RCT). The RCT is the “gold standard” in medical research. Case reports, cohort studies and other research methods simply are not good enough to use for making patient care decisions.

4. Once you find the article you need, use the PP-ICONS approach to evaluate its usefulness to your patient.
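Step 4 can be pictured as filling in a short checklist for each abstract. The Python sketch below is only an illustration of that idea: the class name, field names and wording of the cautions are hypothetical, and the thresholds (100 and 400 subjects, an NNT of 10 for primary therapies) are the rules of thumb the article gives in its discussion of number and statistics, not fixed standards.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PPIconsScreen:
    """One reader's PP-ICONS notes on a single abstract (hypothetical structure)."""
    problem_matches: bool           # same clinical problem as your question?
    population_matches: bool        # study group similar to your patient?
    intervention_matches: bool      # same test or treatment you are asking about?
    comparison_matches: bool        # compared against something you would use?
    outcome_patient_oriented: bool  # outcome that matters to patients?
    n_subjects: int                 # number of subjects in the study
    nnt: float                      # number needed to treat (1/ARR)

    def relevance_gaps(self) -> List[str]:
        """Return the PPICO items that do not fit the clinical question."""
        checks = {
            "problem": self.problem_matches,
            "patient or population": self.population_matches,
            "intervention": self.intervention_matches,
            "comparison": self.comparison_matches,
            "outcome": self.outcome_patient_oriented,
        }
        return [name for name, ok in checks.items() if not ok]

    def cautions(self) -> List[str]:
        """Return warnings based on the article's rules of thumb."""
        notes = []
        if self.n_subjects < 100:
            notes.append("fewer than 100 subjects: statistics may be unreliable")
        elif self.n_subjects < 400:
            notes.append("fewer than the 400-subject rule of thumb")
        if self.nnt > 10:
            notes.append("NNT above 10: modest benefit for a primary therapy")
        return notes


# The duct tape study as the article reads it: every PPICO item fits,
# 51 patients completed the study, and the NNT is 4.
wart_study = PPIconsScreen(True, True, True, True, True, n_subjects=51, nnt=4.0)
print("Relevance gaps:", wart_study.relevance_gaps())
print("Cautions:", wart_study.cautions())
```

For the wart study this reports no relevance gaps and a single caution about the small number of subjects, which mirrors the article's own conclusion.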


ABSTRACT FROM PUBMED

Using the PP-ICONS approach, physicians can evaluate the validity and relevance of clinical articles in minutes using only the abstract, such as this one, obtained free online from PubMed, http://www.ncbi.nlm.nih.gov/entrez/query.fcgi. The author uses this abstract to evaluate the use of duct tape to treat common warts.

tertiary care center? Are they of a similar age and gender? In this case, the researchers studied children and young adults in outpatient clinics, which is similar to your patient population. If the patients in the study are not similar to your patient, for example if they are sicker, older, a different gender or more clinically complicated, the results might not be relevant.

Intervention. The intervention could be a diagnostic test or a treatment. Make sure the intervention is the same as what you are looking for. The patient’s mother was asking about duct tape for warts, so this is a relevant study.

Comparison. The comparison is what the intervention is tested against. It could be a different diagnostic test or another therapy, such as cryotherapy in this wart study. It could even be placebo or no treatment. Make sure the comparison fits your question. You usually use cryotherapy for common warts, so this is a relevant comparison.

Outcome. The outcome is particularly important. Many outcomes are “disease-oriented outcomes,” which are based on “disease-oriented evidence” (DOEs). DOEs usually reflect changes in physiologic parameters, such as blood pressure, blood sugar, cholesterol, etc. We have long assumed that improving the physiologic parameters of a disease will result in a better disease outcome, but that is not necessarily true. For instance, finasteride can improve urinary flow rate in prostatic hypertrophy, but it does not significantly change symptom scores.3

DOEs look at the kinds of outcomes that physiologists care about. More relevant are outcomes that patients care about, often called “patient-oriented outcomes.” These are based on “patient-oriented evidence that matters” (POEMs) and look at outcomes such as morbidity, mortality and cost. Thus, when looking at a journal article, DOEs are interesting but of questionable relevance,

whereas POEMs are very interesting and very relevant. In the study on the previous page, the outcome is complete resolution of the wart, which is something your patient is interested in.

Number. The number of subjects is crucial to whether accurate statistics can be generated from the data. Too few patients in a research study may not be enough to show that a difference actually exists between the intervention and comparison groups (known as the “power” of a study). Many studies are published with fewer than 100 subjects, which is usually inadequate to provide reliable statistics. A good rule of thumb is 400 subjects.4 Fifty-one patients completed the wart study, which is a pretty small number to generate good statistics.

Statistics. The statistics you are interested in are few in number and easy to understand. Since statistics are frequently misused in journal articles, it is worth a few minutes to learn which to believe and which to ignore.

Relative risk reduction. It is not unusual to find a summary statement in a journal article similar to this one from an article titled “Long-Term Effects of Mammography Screening: Updated Overview of the Swedish Randomised Trials”:5

“There were 511 breast cancer deaths in 1,864,770 women-years in the invited groups and 584 breast cancer deaths in 1,688,440 women-years in the control groups, a significant 21 percent reduction in breast cancer mortality.”

This 21-percent statistic is the relative risk reduction (RRR), which is the percent reduction in the measured outcome between the experimental and control groups. (See “Some important statistics,” below, for more information on calculating the RRR and other statistics.) The RRR is not a good way to compare outcomes. It amplifies small differences and makes insignificant findings appear significant, and it doesn’t reflect the baseline risk of the outcome event. Nevertheless, the RRR is very popular and will be reported in nearly every journal article, perhaps because it makes weak results look good. Think of the RRR as the “reputation reviving ratio”

SOME IMPORTANT STATISTICS

Absolute risk reduction (ARR): The difference between the control group’s event rate (CER) and the experimental group’s event rate (EER).

Control event rate (CER): The proportion of patients responding to placebo or other control treatment. For example, if 25 patients are in a control group and the event being studied is observed in 15 of those patients, the control event rate would be 15/25 = 0.60.

Experimental event rate (EER): The proportion of patients responding to the experimental treatment or intervention. For example, if 26 patients are in an experimental group and the event being studied is observed in 22 of those patients, the experimental event rate would be 22/26 = 0.85.

Number needed to treat (NNT): The number of patients that must be treated to prevent one adverse outcome or for one patient to benefit. The NNT is the inverse of the ARR; NNT = 1/ARR.

Relative risk reduction (RRR): The percent reduction in events in the treated group compared to the control group event rate.
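To see how a relative figure can sit on top of a very small absolute difference, the mammography counts quoted in the article can be run through the definitions above. This is a rough sketch only: it treats deaths per woman-year as the event rate and takes a crude ratio of the quoted totals, which is an assumption made for illustration and not the pooled analysis the trial authors performed. The variable and function names are hypothetical.

```python
# Counts quoted in the article from the Swedish mammography overview.
invited_deaths, invited_woman_years = 511, 1_864_770
control_deaths, control_woman_years = 584, 1_688_440


def event_rate(events, exposure):
    """Crude event rate: events divided by the amount of follow-up."""
    return events / exposure


eer = event_rate(invited_deaths, invited_woman_years)  # screened (experimental) group
cer = event_rate(control_deaths, control_woman_years)  # control group

rrr = (cer - eer) / cer  # relative risk reduction
arr = cer - eer          # absolute risk reduction, per woman-year

print(f"RRR: {rrr:.0%}")
print(f"ARR: {arr * 100_000:.1f} breast cancer deaths per 100,000 woman-years")
```

On these crude rates the relative reduction comes out at about 21 percent, while the absolute reduction is about 7 deaths per 100,000 woman-years of follow-up. The same data therefore support either a large-sounding or a small-sounding number, depending on which statistic is reported.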

When the experimental treatment reduces the risk of a bad event
Example: Beta-blockers to prevent deaths in high-risk patients with recent myocardial infarction.
Relative risk reduction (RRR): (CER - EER)/CER; (.66 - .50)/.66 = .24 or 24 percent
Absolute risk reduction (ARR): CER - EER; .66 - .50 = .16 or 16 percent
Number needed to treat (NNT): 1/ARR; 1/.16 = 6

When the experimental treatment increases the probability of a good event
Example: Duct tape to eliminate common warts.
Relative risk reduction (RRR): (EER - CER)/CER; (.85 - .60)/.60 = .42 or 42 percent
Absolute risk reduction (ARR): EER - CER; .85 - .60 = .25 or 25 percent
Number needed to treat (NNT): 1/ARR; 1/.25 = 4
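The two worked examples above can be checked with a few lines of Python. This is a minimal sketch under stated assumptions: the function name `risk_stats` and its `event_is_good` flag are hypothetical, and the event rates are the rounded figures printed in the table, not raw trial data.

```python
def risk_stats(cer, eer, event_is_good):
    """Return (ARR, RRR, NNT) from control and experimental event rates.

    cer, eer: event rates as proportions between 0 and 1.
    event_is_good: True when the event is a benefit (e.g., wart resolution),
                   False when it is a harm (e.g., death).
    """
    if cer <= 0:
        raise ValueError("control event rate must be greater than zero")
    # Orient the difference so that a positive value always means benefit.
    difference = (eer - cer) if event_is_good else (cer - eer)
    if difference <= 0:
        raise ValueError("no benefit in the experimental group; NNT is undefined")
    arr = difference        # absolute risk reduction
    rrr = difference / cer  # relative risk reduction
    nnt = 1 / arr           # number needed to treat
    return arr, rrr, nnt


examples = {
    "Beta-blockers (bad event: death)": risk_stats(0.66, 0.50, event_is_good=False),
    "Duct tape (good event: wart resolution)": risk_stats(0.60, 0.85, event_is_good=True),
}

for name, (arr, rrr, nnt) in examples.items():
    print(f"{name}: ARR {arr:.0%}, RRR {rrr:.0%}, NNT {nnt:.2f}")
```

Rounded, the results agree with the table: an ARR of 16 percent, an RRR of 24 percent and an NNT of 6.25 (reported as 6) for beta-blockers, and an ARR of 25 percent, an RRR of 42 percent and an NNT of 4 for duct tape.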

or the “reporter’s reason for ‘riting.” Ignore the RRR. It will mislead you. In our wart treatment example, the RRR would be (85 percent - 60 percent)/60 percent x 100 = 42 percent. The RRR could thus be interpreted as showing that duct tape is 42 percent more effective than cryotherapy in treating warts.

Absolute risk reduction. A better statistic is the absolute risk reduction (ARR), which is the difference in the outcome event rate between the control group and the experimental treated group. Thus, in our wart treatment example, the ARR is the outcome event rate (complete resolution of warts) for duct tape (85 percent) minus the outcome event rate for cryotherapy (60 percent) = 25 percent. Unlike the RRR, the ARR does not amplify small differences but shows the true difference between the experimental and control interventions. Using the ARR, it would be accurate to say that duct tape is 25-percent more effective than cryotherapy in treating warts.

Number needed to treat. The single most clinically useful statistic is the number needed to treat (NNT). The NNT is the number of patients who must be treated to prevent one adverse outcome. To think about it another way, the NNT is the number of patients who must be treated for one patient to benefit. (The rest who were treated obtained no benefit, although they still suffered the risks and costs of treatment.) In our wart therapy article, the NNT would tell us how many patients must be treated with the experimental treatment for one to benefit more than if he or she had been treated with the standard treatment.

Now this is a statistic that physicians and their patients can really appreciate! Furthermore, the NNT is easy to calculate, as it is simply the inverse of the ARR. For our wart treatment study, the NNT is 1/25 percent = 1/0.25 = 4, meaning that 4 patients need to be treated with duct tape for one to benefit more than if treated by cryotherapy.

Wrapped up in this simple little statistic are some very important concepts. The NNT provides you with the likelihood that the test or treatment will benefit any individual patient, an impression of the baseline risk of the adverse event, and a sense of the cost to society. Thus, it gives perspective and hints at the “reasonableness” of a treatment. The value of this statistic has become appreciated in the last five years, and more journal articles are reporting it.

EXAMPLES OF NNTS

The number needed to treat (NNT) is one of the most useful statistics for physicians and patients. It calculates the number of patients that must be treated to prevent one adverse event or for one patient to benefit. Note that NNTs for preventive interventions will usually be higher than NNTs for treatment interventions. The lower the NNT, the better.

The following examples of NNTs are borrowed from an excellent list available through the Bandolier Web site at http://www.jr2.ox.ac.uk/bandolier/band50/b50-8.html.

Therapy (NNT)
Triple antibiotic therapy to eradicate H. pylori (1.1)
Isosorbide dinitrate for prevention of exercise-induced angina (5)
Short course of antibiotics for otitis media in children (7)
Statins for secondary prevention of adverse cardiovascular outcomes (11)
Statins for primary prevention of adverse cardiovascular outcomes (35)
Finasteride to prevent one operation for benign prostatic hyperplasia (39)
Misoprostol to prevent any gastrointestinal complication in nonsteroidal anti-inflammatory drug users (166)

What is a reasonable NNT? In a perfect world, a treatment would have an NNT of 1, meaning that every patient would benefit from the treatment. Real life is not so kind (see “Examples of NNTs,” above). Clearly, an NNT of 1 is great and an NNT of 1,000

is terrible. Although it is hard to come up with firm guidelines, for primary therapies I am satisfied with an NNT of 10 or less and very pleased with an NNT less than 5. Our duct tape NNT of 4 is good, particularly since the treatment is cheap, easy and painless.

Note that NNTs for preventive interventions (e.g., the use of aspirin to prevent cardiac problems) will usually be higher than NNTs for treatment interventions (e.g., the use of duct tape to cure warts). Prevention groups contain both higher-risk and lower-risk individuals, so they produce bigger denominators, whereas treatment groups only contain diseased patients. Thus, an NNT for prevention of less than 20 might be particularly good.

When discussing a particular therapy, I explain the NNT to my patient. Since this statistical concept is easy to understand, it can help the patient be a more informed partner in therapeutic decisions.

You will soon start to see a similar statistic, the number needed to screen (NNS), which is the number of patients needed to screen for a particular disease for a given duration for one patient to benefit.6 Although few NNSs have been calculated, they are likely to involve higher numbers, since the screening population consists of patients with and without the disease. For example, in the article on mammography screening mentioned above, the NNS was 961 for 16 years. In other words, you would need to screen 961 women for 16 years to prevent one breast cancer death.

The good news and the bad

Using PP-ICONS to assess the wart study, the problem, the patient/population, the intervention, the comparison and the outcome are all relevant to your patient. The number of subjects is on the small side, making you a little wary, but the intervention is cheap and low-risk. The statistics, particularly the NNT, are reasonable. On balance, this looks like a fair approach, so you call the patient’s mother and discuss it with her.

The PP-ICONS approach is an easy way to screen an article for validity and relevance, and the abstract often contains all of the information you need. Even the statistics can be done quickly in your head. You can apply PP-ICONS when searching for a particular article, when you come across an article in your reading, when data are presented at lectures, when a pharmaceutical representative hands you an article to support his or her pitch, and even when reading news stories describing medical breakthroughs.

Don’t be discouraged if you find that high-quality articles are rare, even in the most prestigious journals. This seems to be changing for the better, although many careers are still being built on questionable research. Nevertheless, screening articles will help you find the truth that is out there and will help you practice the best medicine. And as we become more discerning end-users of research, we might just stimulate improvements in clinical research in the process.

Send comments to [email protected].

1. Miser WF. Critical appraisal of the literature. J Amer Board Fam Pract. 1999;12(4):315-333.
2. Guyatt GH, et al. Users’ guides to the medical literature. How to use an article about therapy or prevention. Are the results of the study valid? JAMA. 1993;270(21):2598-2601.
3. Lepor H, et al. The efficacy of terazosin, finasteride or both in benign prostatic hyperplasia. Veterans Affairs Cooperative Studies Benign Prostatic Hyperplasia Study Group. N Engl J Med. 1996;335(8):533-539.
4. Krejcie RV, Morgan DW. Determining sample size for research activities. Educational and Psychological Measurement. 1970;30:607-610.
5. Nystrom L, et al. Long-term effects of mammography screening: updated overview of the Swedish randomised trials. Lancet. 2002;359(9310):909-919.
6. Rembold CM. Number needed to screen: development of a statistic for disease screening. BMJ. 1998;317:307-312.
