Some of your sessions may be at the Old Road Campus (Churchill Hospital site) – so here is a map of how to get to the relevant buildings.

Teaching Rooms

University of Oxford, 2010 0

Evidence Based Thread Course Notes

TABLE OF CONTENTS

Topic
Introduction
Program
Lecture: Introduction to Evidence Based Medicine
1: A. Study Designs & B. Asking questions
2: Critical Appraisal of a Therapy Study
3: Library sessions
4: Presentation instructions
5: Short Presentations

OTHER READINGS
Critical Appraisal Sheet: Systematic Reviews
Critical Appraisal Sheet: Diagnostic Studies
Finding the Gold in Medline: clinical queries
Measures of Association
Measurement Scales
Statistical Approaches to Uncertainty
Self Quiz
Glossary & Answers to Quiz

An introduction to EBM Evidence Based Medicine

“Evidence-based medicine is the integration of best research evidence with clinical expertise and patient values” - Dave Sackett

Welcome to the evidence-based medicine theme. The primary purpose of the EBM sessions is to give you concrete experience in using searching and critical appraisal skills in the context of your current clinical learning and future practice. Our aim is to give you some basic skills that will be useful in your other medical terms (e.g., for writing up case reports) and for your own lifelong learning and future medical career.

Objectives
At the end of year 4 you should be able to:

• Recognise and formulate your own answerable clinical questions.
• Appraise and apply the results of different types of research studies to help in the management of individual patients. (Note: we will only cover therapy studies in detail; the other types – diagnosis, prognosis, aetiology - will be covered throughout the year).
• Express the results of clinical trials in terms of both relative and absolute risk reductions, and be able to explain the numerical results to a patient.
• Identify the type of research that best answers the different classes of clinical questions.
• Know which of several research databases (MEDLINE, Cochrane Library, Embase, etc) and secondary resources (Clinical Evidence, Guidelines) are most likely to be helpful in answering different types of clinical questions.
• Search using multiple text words and MeSH headings connected by Booleans (AND, OR, NOT) and truncations (* and $).

(please mark which objectives you need to work on most)

Recommended Text
Straus SE, Richardson WS, Glasziou PP, Haynes RB. Evidence-based Medicine: How to Practise and Teach EBM. Third Edition. Churchill Livingstone: Edinburgh, 2005
or
Badenoch D, Heneghan C. Evidence-Based Medicine Toolkit, 2nd Edition. BMJ Publishing, 2006.

Also www.cebm.net - contains useful material, including a toolbox which has a glossary and useful brief summaries of the types of studies.

Contact: Dr Carl Heneghan. Tel: 289299. Email: [email protected]

PROGRAM

The program below is for the introductory block of Year 4. This will introduce you to the basics of Evidence-Based Medicine. During your other terms you will do further sessions on Evidence-Based Medicine and Evidence-Based Surgery. These will include sessions on diagnosis, on appraisal of surgical treatments, and further critically appraised topics of your own.

When: Wednesday 15th September, 9-10
What: Lecture: What is normal/abnormality?
Why: Introduction to interpreting normal and abnormal ranges; PICO questions and gather questions for October sessions

When: Monday 11th / 18th October, 09:00 – 09:30
What: Lecture: Introduction to Evidence-based Medicine; Question Formulation (JR Lecture Theatre)
Why: Introduction to the 4-step process of EBM (ask, search, appraise, & apply research evidence)

When: Monday 11th / 18th October, 09:30 – 11:00
What: EBM Presentations: Formulation/Rapid Appraisal/RCT’s and Systematic Reviews (JR Lecture Theatre)
Why: Practice of question formulation and appraisal, and preparation for developing your own topic for Thursday

When: Monday 11th / 18th October, 11:00 - 11:30
What: Advanced Searching (JR Lecture Theatre)
Why: Tips on which databases to use for what and how to use them efficiently.

When: Monday 11th / 18th October, 14:00 – 16:00
What: Lab Medicine Lectures
Evening: Working on EBM task

When: Monday 11th / 18th October, 16:00 – 17:00
What: Searching Session (see Note), CAIRNS LIBRARY/Private Study. Optional
Why: Searching for paper for Tuesday presentation.

When: Tuesday pm 12th / 19th October, 3:30-5
What: Small Group Session: Presentation of CATs (critically appraised topics)
Why: Each student will present for about 10 minutes on their topic of choice

Please see your individual group timesheets (back of these notes) for the location of your small group presentation sessions on Tuesday.

Lecture slides here

1 page trial here

Asking Questions

This first session aims to familiarise you with the PICO structure of questions, and with rapidly recognising these in research articles.

PART A. Exercise: study designs
Read the abstracts from published studies on the following pages and answer the following questions for each study:
1. What is the question (PICO) of the study?
2. What is the purpose of the study?
   a. intervention
   b. frequency
   c. diagnostic accuracy
   d. prognosis (or natural history)
   e. aetiology and risk factors
3. Which study type would give the highest quality evidence to answer the question? (see ‘Levels of evidence table’)
4. Which is the best study type that is also feasible? (You can use the Table below as a guide)
5. What is the study type used?

Table: levels of evidence according to type of research question
(Columns: Intervention, Diagnosis **, Prognosis, Aetiology †††)

Level I
Intervention: A systematic review of level II studies
Diagnosis: Systematic review of level II studies
Prognosis: A systematic review of level II studies
Aetiology: A systematic review of level II studies

Level II
Intervention: A randomised controlled trial
Diagnosis: Cross-sectional study among consecutive presenting patients
Prognosis: A prospective inception cohort study
Aetiology: A prospective cohort study

Level III-1
Intervention: A pseudo-randomised controlled trial (eg alternate allocation or some other method)
Diagnosis: Cross-sectional study among non-consecutive patients
Prognosis: Untreated control patients in a randomised controlled trial
Aetiology: A retrospective cohort study

Level III-2
Intervention: A comparative study with concurrent control group:
• Nonrandomised experimental study
• Cohort study, case-control study, interrupted time series with a control group
Diagnosis: Diagnostic case-control study
Prognosis: A retrospectively assembled cohort study
Aetiology: A case-control study

Level III-3
Intervention: A comparative study without concurrent control group:
• Historical control study
• Comparison of two or more single arm studies (ie from two studies)
• Interrupted time series without a parallel control group

Level IV
Intervention: Case series
Diagnosis: Case series
Prognosis: Case series, or cohort study of patients at different stages of
Aetiology: A cross-sectional study

Abstract 1 Voutilainen S, Rissanen TH, Virtanen J, Lakka TA, Salonen JT; Kuopio Ischemic Heart Disease Study. Low dietary folate intake is associated with an excess incidence of acute coronary events: The Kuopio Ischemic Heart Disease Risk Factor Study.

BACKGROUND: Although several prospective studies have shown that low folate intake and low circulating folate are associated with increased risk of coronary heart disease (CHD), the findings are inconsistent.
METHODS AND RESULTS: We studied the associations of dietary intake of folate, vitamin B(6), and vitamin B(12) with the risk of acute coronary events in a prospective cohort study of 1980 Finnish men 42 to 60 years old examined in 1984 to 1989 in the Kuopio Ischemic Heart Disease Risk Factor Study. Nutrient intakes were assessed by 4-day food record. During an average follow-up time of 10 years, 199 acute coronary events occurred. In a Cox proportional hazards model adjusted for 21 conventional and nutritional CHD risk factors, men in the highest fifth of folate intake had a relative risk of acute coronary events of 0.45 (95% CI 0.25 to 0.81, P=0.008) compared with men in the lowest fifth. This association was stronger in nonsmokers and light alcohol users than in smokers and alcohol users. A high dietary intake of vitamin B(6) had no significant association and that of vitamin B(12) a weak association with a reduced risk of acute coronary events.
CONCLUSIONS: The present work in CHD-free middle-aged men is the first prospective cohort study to observe a significant inverse association between quantitatively assessed moderate-to-high folate intakes and incidence of acute coronary events in men. Our findings provide further support in favor of a role of folate in the promotion of good cardiovascular health.

Question / Answer
1. What is the question (PICO) of the study?
   P:
   I:
   C:
   O:
2. What is the purpose of the study?
3. Which study type would give the highest quality evidence to answer the question?
4. Which is the best study type that is also feasible?
5. What is the study type used?

Abstract 2 Lonn E, et al for the HOPE 2 Investigators. Homocysteine lowering with folic acid and B vitamins in vascular disease. N Engl J Med. 2006

BACKGROUND: In observational studies, lower homocysteine levels are associated with lower rates of coronary heart disease and stroke. Folic acid and vitamins B6 and B12 lower homocysteine levels. We assessed whether supplementation reduced the risk of major cardiovascular events in patients with vascular disease.
METHODS: We randomly assigned 5522 patients 55 years of age or older who had vascular disease or diabetes to daily treatment either with the combination of 2.5 mg of folic acid, 50 mg of vitamin B6, and 1 mg of vitamin B12 or with placebo for an average of five years. The primary outcome was a composite of death from cardiovascular causes, myocardial infarction, and stroke.
RESULTS: Mean plasma homocysteine levels decreased by 2.4 micromol per liter (0.3 mg per liter) in the active-treatment group and increased by 0.8 micromol per liter (0.1 mg per liter) in the placebo group. Primary outcome events occurred in 519 patients (18.8 percent) assigned to active therapy and 547 (19.8 percent) assigned to placebo (relative risk, 0.95; 95 percent confidence interval, 0.84 to 1.07; P=0.41). As compared with placebo, active treatment did not significantly decrease the risk of death from cardiovascular causes (relative risk, 0.96; 95 percent confidence interval, 0.81 to 1.13), myocardial infarction (relative risk, 0.98; 95 percent confidence interval, 0.85 to 1.14), or any of the secondary outcomes. Fewer patients assigned to active treatment than to placebo had a stroke (relative risk, 0.75; 95 percent confidence interval, 0.59 to 0.97). More patients in the active-treatment group were hospitalized for unstable angina (relative risk, 1.24; 95 percent confidence interval, 1.04 to 1.49).
CONCLUSIONS: Supplements combining folic acid and vitamins B6 and B12 did not reduce the risk of major cardiovascular events in patients with vascular disease.

Question / Answer
1. What is the question (PICO) of the study?
   P:
   I:
   C:
   O:
2. What is the purpose of the study?
3. Which study type would give the highest quality evidence to answer the question?
4. Which is the best study type that is also feasible?
5. What is the study type used?

Abstract 3 Chen SM, Chang MH, Du JC, Lin CC, Chen AC, Lee HC, Lau BH, Yang YJ, Wu TC, Chu CH, Lai MW, Chen HL; Taiwan Infant Stool Color Card Study Group, 2006. Screening for biliary atresia by infant stool color card in Taiwan. Pediatrics 117(4):1147–54.

OBJECTIVE: We aimed to detect biliary atresia (BA) in early infancy to prevent additional liver damage because of the delay of referral and surgical treatment and to investigate the incidence rate of BA in Taiwan.
METHODS: A pilot study to screen the stool color in infants for the early diagnosis of BA was undertaken from March 2002 to December 2003. We had designed an ‘infant stool color card’ with 7 numbers of different color pictures and attached it to the child health booklet. Parents were then asked to observe their infant's stool color by using this card. The medical staff would check the number that the parents chose according to their infant's stool color at 1 month of age during the health checkup and then send the card back to the stool color card registry center.
RESULTS: The average return rate was approximately 65.2% (78,184 infants). A total of 29 infants were diagnosed as having BA, and 26 were screened out by stool color card before 60 days of age. The sensitivity, specificity, and positive predictive value were 89.7%, 99.9%, and 28.6%, respectively. Seventeen (58.6%) infants with BA received a Kasai operation within 60-day age period. The estimated incidence of BA in screened newborns was 3.7 of 10,000.
CONCLUSIONS: The stool color card was a simple, efficient, and applicable mass screening method for early diagnosis and management of BA. The program can also help in estimating the incidence and creating a registry of these patients.

Question / Answer
1. What is the question (PICO) of the study?
   P:
   I:
   C:
   O:
2. What is the purpose of the study?
3. Which study type would give the highest quality evidence to answer the question?
4. Which is the best study type that is also feasible?
5. What is the study type used?

Abstract 4 Brna P, Dooley J, Gordon K, Dewan T. The prognosis of childhood headache: a 20-year follow-up. Arch Pediatr Adolesc Med. 2005;159:1157-60.

BACKGROUND: Headaches affect most children and rank third among illness-related causes of school absenteeism. Although the short-term outcome for most children appears favorable, few studies have reported long-term outcome.
OBJECTIVE: To evaluate the long-term prognosis of childhood headaches 20 years after initial diagnosis in a cohort of Atlantic Canadian children who had headaches diagnosed in 1983.
METHODS: Ninety-five patients with headaches who consulted 1 of the authors in 1983 were previously studied in 1993. The 77 patients contacted in 1993 were followed up in 2003. A standardized interview was used.
RESULTS: Sixty (78%) of 77 patients responded (60 of the 95 of the original cohort). At 20-year follow-up, 16 (27%) were headache free, 20 (33%) had tension-type headaches, 10 (17%) had migraine, and 14 (23%) had migraine and tension-type headaches. Having more than 1 headache type was more prevalent than at diagnosis or initial follow-up (P<.001), and headache type varied across time.
CONCLUSIONS: Twenty years after diagnosis of pediatric headache, most patients continue to have headache, although the headache classification often changes across time.

Question / Answer
1. What is the question (PICO) of the study?
   P:
   I:
   C:
   O:
2. What is the purpose of the study?
3. Which study type would give the highest quality evidence to answer the question?
4. Which is the best study type that is also feasible?
5. What is the study type used?

PART B In this session you should try to formulate the questions for the scenarios provided and for one or more of your own scenarios (see the worked example on opposite page)

There are several supplied scenarios to choose from – do a couple of these.

There are also several blank scenario sheets for your own questions. You will need to do at least one of these (a patient you have seen, or a health problem of a relative, friend or even yourself – please keep the identity confidential).

You should complete the upper half of the sheet (FORMULATE AN ANSWERABLE QUESTION), and begin the design of the search strategy in the lower half of the sheet (TRACK DOWN THE BEST EVIDENCE) – however, most of the search section you will complete in the next tutorial.

EXAMPLE

STEP 1 : FORMULATE AN ANSWERABLE QUESTION

Example – Stockings for long flights? A 43 year old male asked for some repeat prescriptions and advice about preventing deep vein thrombosis on a 12 hour flight (his brother had had one last year). You suggest stockings as the most effective prevention.

Question
Patient or Population: In patients on long flights

Intervention or Indicator: do compression stockings

Comparator: no compression stockings

Outcome: prevent Deep Vein Thrombosis (DVT)

Question sentence: In patients on long flights (P), do compression stockings (I) prevent DVT (O)?

What type of question is this (phenomena, frequency, diagnosis, prediction, or intervention)?

What would be the ideal study type? (Randomised Trial, Inception cohort, Survey, etc)

What would be the best feasible study type?

STEP 1 : FORMULATE AN ANSWERABLE QUESTION

Scenario 1– Beta-Blockers in Heart Failure?

Over afternoon tea you are discussing a patient with heart failure who had a myocardial infarction about 6 weeks ago. He has recovered well though: no breathlessness, pulse 80 regular, BP 136/85, chest is clear, but echocardiography shows reduced function with an LVEF (left ventricular ejection fraction) of 30% - which is well below normal. You wonder whether beta-blockers are safe and helpful in such a patient.

Question Patient or Population: ______

Intervention or Indicator: ______

Comparator: ______

Outcome: ______

Question sentence: ______

What type of question is this (phenomena, frequency, diagnosis, prediction, or intervention)?

What would be the ideal study type? (Randomised Trial, Inception cohort, Survey, etc)

What would be the best feasible study type?

STEP 2 : TRACK DOWN THE BEST EVIDENCE

SEARCH STRATEGY DESIGN TABLE
    Primary Term    Synonym 1    Synonym 2
P ( ______ OR ______ OR ______ ) AND
I ( ______ OR ______ OR ______ ) AND
C ( ______ OR ______ OR ______ ) AND
O ( ______ OR ______ OR ______ )
Note: consider truncation for each word and add an “*”, e.g., child* for child, children or childhood

ACTUAL SEARCHES
Cochrane Searches: ______ Hits: ______
PubMed Searches: ______ Hits: ______

Key Reference: ______

Key Finding: ______

STEP 1 : FORMULATE AN ANSWERABLE QUESTION

Scenario 2 – Childhood Seizure Recurrence Childhood seizures are common and frightening for the parents, and the decision to initiate treatment is a difficult one. What is the risk of further recurrences following a single seizure of unknown cause? Are there any identifiable factors that modify this risk?

Question Patient or Population: ______

Intervention or Indicator: ______

Comparator: ______

Outcome: ______

Question sentence: ______

What type of question is this (phenomena, frequency, diagnosis, prediction, or intervention)?

What would be the ideal study type? (Randomised Trial, Inception cohort, Survey, etc)

What would be the best feasible study type?

STEP 2 : TRACK DOWN THE BEST EVIDENCE

SEARCH STRATEGY DESIGN TABLE
    Primary Term    Synonym 1    Synonym 2
P ( ______ OR ______ OR ______ ) AND
I ( ______ OR ______ OR ______ ) AND
C ( ______ OR ______ OR ______ ) AND
O ( ______ OR ______ OR ______ )
Note: consider truncation for each word and add an “*”, e.g., child* for child, children or childhood

ACTUAL SEARCHES
Cochrane Searches: ______ Hits: ______
PubMed Searches: ______ Hits: ______

Key Reference: ______

Key Finding: ______

STEP 1 : FORMULATE AN ANSWERABLE QUESTION

Your Scenario A

Your Question Patient or Population: ______

Intervention or Indicator: ______

Comparator: ______

Outcome: ______

Question sentence: ______

What type of question is this (phenomena, frequency, diagnosis, prediction, or intervention)?

What would be the ideal study type? (Randomised Trial, Inception cohort, Survey, etc)

What would be the best feasible study type?

STEP 2 : TRACK DOWN THE BEST EVIDENCE

SEARCH STRATEGY DESIGN TABLE
    Primary Term    Synonym 1    Synonym 2
P ( ______ OR ______ OR ______ ) AND
I ( ______ OR ______ OR ______ ) AND
C ( ______ OR ______ OR ______ ) AND
O ( ______ OR ______ OR ______ )
Note: consider truncation for each word and add an “*”, e.g., child* for child, children or childhood

ACTUAL SEARCHES
Cochrane Searches: ______ Hits: ______
PubMed Searches: ______ Hits: ______

Key Reference: ______

Key Finding: ______

Critical Appraisal of a Therapy Study

Over afternoon tea you are debriefing with a colleague about a patient who had gone into cardiogenic shock after a myocardial infarction, and died shortly thereafter. Your colleague asks whether the patient had had early angiography and revascularization. You are aware the patient didn’t have this, but wonder whether such early revascularization would really have helped such a patient.

Suppose you had tracked down the attached paper. Work through the critical appraisal worksheets for this article, and:

1. decide what question (PICO) the study asked and answered

Patients ………………………………………………
Intervention ………………………………………………
Comparator ………………………………………………
Outcome ………………………………………………

2. whether the internal validity of the study is sufficient to allow firm conclusions (all studies have some flaws; but are these flaws sufficient to discard the study?)

3. if the study is sufficiently valid, look at and interpret the results – what is the relevance or size of the effects of the intervention? What is the Relative Risk Reduction (RRR) and Absolute Risk Reduction (ARR)?

4. decide whether and how the results would apply to our patient above. Then role play explaining the condition and treatment to a patient using the following steps: (a) the prognosis, ie chance of recurrence (b) the impact of treatment on this

THERAPY STUDY: Are the results of the trial valid? (Internal Validity)

1. R - Was the assignment of patients to treatments randomised?
What is best? Centralised computer randomisation is ideal and often used in multi-centred trials. Smaller trials may use an independent person (e.g, the hospital pharmacy) to “police” the randomization.
Where do I find the information? The Methods should tell you how patients were allocated to groups and whether or not randomisation was concealed.
This paper: Yes No Unclear
Comment:

2. R - Were the groups similar at the start of the trial?
What is best? If the randomisation process worked (that is, achieved comparable groups) the groups should be similar. The more similar the groups the better it is. There should be some indication of whether differences between groups are statistically significant (ie. p values).
Where do I find the information? The Results should have a table of "Baseline Characteristics" comparing the randomized groups on a number of variables that could affect the outcome (ie. age, risk factors etc). If not, there may be a description of group similarity in the first paragraphs of the Results section.
This paper: Yes No Unclear
Comment:

3. A - Aside from the allocated treatment, were groups treated equally?
What is best? Apart from the intervention the patients in the different groups should be treated the same, eg., additional treatments or tests.
Where do I find the information? Look in the Methods section for the follow-up schedule, and permitted additional treatments, etc and in Results for actual use.
This paper: Yes No Unclear
Comment:

4. A - Were all patients who entered the trial accounted for? - and were they analysed in the groups to which they were randomised?
What is best? Losses to follow-up should be minimal - preferably less than 20%. However, if few patients have the outcome of interest, then even small losses to follow-up can bias the results. Patients should also be analysed in the groups to which they were randomised – ‘intention-to-treat analysis’.
Where do I find the information? The Results section should say how many patients were randomized (eg., Baseline Characteristics table) and how many patients were actually included in the analysis. You will need to read the results section to clarify the number and reason for losses to follow-up.
This paper: Yes No Unclear
Comment:

5. M - Were measures objective or were the patients and clinicians kept “blind” to which treatment was being received?
What is best? It is ideal if the study is ‘double-blinded’ – that is, both patients and investigators are unaware of treatment allocation. If the outcome is objective (eg., death) then blinding is less critical. If the outcome is subjective (eg., symptoms or function) then blinding of the outcome assessor is critical.
Where do I find the information? First, look in the Methods section to see if there is some mention of masking of treatments, eg., placebos with the same appearance or sham therapy. Second, the Methods section should describe how the outcome was assessed and whether the assessor/s were aware of the patients' treatment.
This paper: Yes No Unclear
Comment:

What were the results?

6. How large was the treatment effect?
Most often results are presented as dichotomous outcomes (“yes or no” outcomes that happen or don't happen) and can include such outcomes as cancer recurrence, myocardial infarction and death. Consider a study in which 15% (0.15) of the control group died and 10% (0.10) of the treatment group died after 2 years of treatment. The results can be expressed in many ways as shown below.

What is the measure? Relative Risk (RR) = risk of the outcome in the treatment group / risk of the outcome in the control group.
What does it mean? The relative risk tells us how many times more likely it is that an event will occur in the treatment group relative to the control group. An RR of 1 means that there is no difference between the two groups; thus, the treatment had no effect. An RR < 1 means that the treatment decreases the risk of the outcome. An RR > 1 means that the treatment increased the risk of the outcome.
In our example, the RR = 0.10/0.15 = 0.67. Since the RR < 1, the treatment decreases the risk of death.

What is the measure? Absolute Risk Reduction (ARR) = risk of the outcome in the control group - risk of the outcome in the treatment group. This is also known as the absolute risk difference.
What does it mean? The absolute risk reduction tells us the absolute difference in the rates of events between the two groups and gives an indication of the baseline risk and treatment effect. An ARR of 0 means that there is no difference between the two groups; thus, the treatment had no effect.
In our example, the ARR = 0.15 - 0.10 = 0.05 or 5%. The absolute benefit of treatment is a 5% reduction in the death rate.

What is the measure? Relative Risk Reduction (RRR) = absolute risk reduction / risk of the outcome in the control group. An alternative way to calculate the RRR is to subtract the RR from 1 (eg. RRR = 1 - RR).
What does it mean? The relative risk reduction is the complement of the RR and is probably the most commonly reported measure of treatment effects. It tells us the reduction in the rate of the outcome in the treatment group relative to that in the control group.
In our example, the RRR = 0.05/0.15 = 0.33 or 33%. Or RRR = 1 - 0.67 = 0.33 or 33%. The treatment reduced the risk of death by 33% relative to that occurring in the control group.

What is the measure? Number Needed to Treat (NNT) = inverse of the ARR and is calculated as 1 / ARR.
What does it mean? The number needed to treat represents the number of patients we need to treat with the experimental therapy in order to prevent 1 bad outcome and incorporates the duration of treatment. Clinical significance can be determined to some extent by looking at the NNTs, but also by weighing the NNTs against any harms or adverse effects (NNHs) of therapy.
In our example, the NNT = 1/0.05 = 20. We would need to treat 20 people for 2 years in order to prevent 1 death.

7. How precise was the estimate of the treatment effect?
The true risk of the outcome in the population is not known and the best we can do is estimate the true risk based on the sample of patients in the trial. This estimate is called the point estimate. We can gauge how close this estimate is to the true value by looking at the confidence intervals (CI) for each estimate. If the confidence interval is fairly narrow then we can be confident that our point estimate is a precise reflection of the population value. The confidence interval also provides us with information about the statistical significance of the result. If the value corresponding to no effect falls outside the 95% confidence interval then the result is statistically significant at the 0.05 level. If the confidence interval includes the value corresponding to no effect then the results are not statistically significant.
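The arithmetic in questions 6 and 7 can be checked with a few lines of Python. This is only a minimal sketch: the function and class names (effect_measures, EffectMeasures, is_significant) are made up for illustration, the risks are the 15% and 10% figures from the worked example, and the two confidence intervals are the ones quoted for the primary outcome and for stroke in Abstract 2.

```python
import math
from dataclasses import dataclass


@dataclass
class EffectMeasures:
    rr: float   # relative risk
    arr: float  # absolute risk reduction
    rrr: float  # relative risk reduction
    nnt: float  # number needed to treat (infinite when ARR is 0)


def effect_measures(control_risk: float, treatment_risk: float) -> EffectMeasures:
    """Compute RR, ARR, RRR and NNT from the two group risks (as proportions)."""
    if not (0 < control_risk <= 1 and 0 <= treatment_risk <= 1):
        raise ValueError("risks must be proportions, and control risk must be > 0")
    rr = treatment_risk / control_risk
    arr = control_risk - treatment_risk
    rrr = arr / control_risk            # same as 1 - rr
    # A negative ARR (treatment increases risk) gives a negative value here,
    # which would be read as a number needed to harm rather than an NNT.
    nnt = math.inf if arr == 0 else 1 / arr
    return EffectMeasures(rr, arr, rrr, nnt)


def is_significant(ci_low: float, ci_high: float, no_effect: float) -> bool:
    """True when the no-effect value lies outside the confidence interval.

    Use no_effect=1.0 for ratios such as RR, and no_effect=0.0 for
    differences such as ARR.
    """
    return not (ci_low <= no_effect <= ci_high)


# Worked example: 15% of controls and 10% of treated patients died.
m = effect_measures(control_risk=0.15, treatment_risk=0.10)
print(f"RR  = {m.rr:.2f}")    # RR  = 0.67
print(f"ARR = {m.arr:.2f}")   # ARR = 0.05
print(f"RRR = {m.rrr:.2f}")   # RRR = 0.33
print(f"NNT = {m.nnt:.0f}")   # NNT = 20

# Confidence intervals quoted in Abstract 2 (relative risks, so no effect = 1).
print(is_significant(0.84, 1.07, no_effect=1.0))  # primary outcome: False
print(is_significant(0.59, 0.97, no_effect=1.0))  # stroke: True
```

The sketch only reproduces the definitions given above; it does not estimate confidence intervals, which need the numbers of patients in each group.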

Will the results help me in caring for my patient? (External Validity/Applicability)
The questions that you should ask before you decide to apply the results of the study to your patient are:
• Is my patient so different to those in the study that the results cannot apply?
• Is the treatment feasible in my setting?
• Will the potential benefits of treatment outweigh the potential harms of treatment for my patient?

Searching Session

For your presentation you will need to do a search to find a study which helps to answer the question you formulated in the previous session. You are encouraged to attend a searching session where library staff will help you with your searching and access to full text papers.

Please note: during this session you will need to record the results of one search and your chosen article for the critically appraised topic you present later in the week.

You should use the second half of the sheets from Tutorial 1, viz., STEP 2 : TRACK DOWN THE BEST EVIDENCE

Use the ACTUAL SEARCHES tables for each of the questions to record the different terms you searched on, the number of hits, and the final “best evidence” you chose.

For intervention questions you should try searching:
• The Cochrane Library (available via www.thecochranelibrary.com)
• PubMed (available at www.pubmed.gov)
• Embase (available via OVID at www.bodley.ox.ac.uk/oxlip).

For a quick search, you could try the following steps (using PubMed as an example):

1. Go to www.pubmed.gov and select Clinical Queries (left hand menu)

2. Select the appropriate Category (‘therapy’ is the default)

3. Type in the most crucial single element of your PICO search (usually the I or the P) – remember to type in all your synonyms for that element separated by OR, e.g. flight sock* OR flight stocking* OR compression stocking*

4. If your search returns no articles then click the ‘Broad’ scope

5. If your search returns more than 30 articles then try adding more PICO elements: if you used only the ‘I’, now try searching the I and P, e.g. (flight sock* OR flight stocking* OR compression stocking*) AND (dvt OR deep vein thrombosis)

6. Select the best single article (eg the largest or longest trial NOT necessarily the most recent). Please record why you chose the article you did.

The above search will find you information quickly. If you need to make sure you don’t miss anything, use text words and MeSH in your search and consider looking at other databases. The following page outlines some further tips on how to go about searching.
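The way a search string is assembled in the quick-search steps (synonyms joined by OR inside brackets, PICO elements joined by AND) can be sketched in Python. This is an illustration only: or_group and build_query are hypothetical helper names, the code just builds the text to paste into a search box, and it does not contact PubMed or check how any database interprets the string.

```python
def or_group(terms):
    """Join the synonyms for one PICO element with OR, inside parentheses."""
    cleaned = [t.strip() for t in terms if t.strip()]
    if not cleaned:
        return ""
    return "(" + " OR ".join(cleaned) + ")"


def build_query(*elements):
    """Join one or more PICO elements (each a list of synonyms) with AND."""
    groups = [or_group(e) for e in elements]
    return " AND ".join(g for g in groups if g)


# Terms taken from the stockings example in the steps above.
stocking_terms = ["flight sock*", "flight stocking*", "compression stocking*"]
dvt_terms = ["dvt", "deep vein thrombosis"]

# Step 3: start with the single most crucial element.
print(build_query(stocking_terms))
# (flight sock* OR flight stocking* OR compression stocking*)

# Step 5: too many hits, so add a second element with AND.
print(build_query(stocking_terms, dvt_terms))
# (flight sock* OR flight stocking* OR compression stocking*) AND (dvt OR deep vein thrombosis)
```

Empty rows are skipped, so the same helper works whether you have filled in one row of the SEARCH STRATEGY DESIGN TABLE or all four.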

Searching Tips and Tactics

Example search:

(furunc* OR (staphylococc* NEAR skin)) AND recur*:TI

In this example:
• * is the truncation and wildcard symbol
• NEAR = AND plus words close together
• Booleans are written IN CAPITALS
• ( ) groups words
• :TI means the word must be in the TITLE

OR: Finds studies containing either of the specified words or phrases. For example, child OR adolescent finds articles with either the word child or the word adolescent.

AND: Finds studies containing both specified words or phrases. For example, child AND adolescent finds articles with both the word child and the word adolescent.

ADJ or NEAR: Like AND it requires both words, but the specified words must also be within about 6 words of each other (doesn’t work in PubMed).

NOT: Excludes studies containing the specified word or phrase. For example, child NOT adolescent means studies with the word “child” but not the word “adolescent”. Use sparingly.

Limits: Articles retrieved may be restricted in several ways, e.g., by date, by language, by publication type, etc.

( ): Use parentheses to group words. For example, (child OR adolescent) AND (hearing OR auditory) finds articles with one or both of “child” and “adolescent” and one or both of the words “hearing” or “auditory”.

*: Truncation: the “*” acts as a wildcard indicating any further letters, e.g., child* is child plus any further letters and is equivalent to (child OR childs OR children OR childhood).

[ti] or :ti: Finds studies with the word in the title. For example, hearing [ti] (in PubMed) and hearing:ti (in Cochrane) finds studies with the word hearing in the title.

[so] or :so: Retrieves studies from a specific source, e.g., hearing AND BMJ [so] finds articles on hearing in the BMJ.

MeSH: MeSH is the Medical Subject Headings, a controlled vocabulary of keywords which may be used in PubMed or Cochrane. It is often useful to use both MeSH headings and text words.
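The grouping rules above (OR between synonyms, AND between concepts, parentheses around each group) can be sketched in a few lines of Python. This is only an illustration of how a search string is assembled: the function names `or_group` and `build_query` are hypothetical, and the sketch does not talk to PubMed or any other database.

```python
def or_group(terms):
    """Join the synonyms for one concept with OR and wrap them in parentheses."""
    return "(" + " OR ".join(terms) + ")"


def build_query(*concepts):
    """Combine one OR-group per concept with AND (Booleans in capitals)."""
    return " AND ".join(or_group(terms) for terms in concepts)


# Text words taken from the flight stocking example in these notes.
stocking_terms = ["flight sock*", "flight stocking*", "compression stocking*"]
dvt_terms = ["dvt", "deep vein thrombosis"]

query = build_query(stocking_terms, dvt_terms)
print(query)
```

The printed string can be pasted into a search box. Adding a further list of terms (for example, for the patient group) narrows the search, in the same way as adding another PICO element by hand.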


Prepare a critically appraised topic

Q1. How long will it take? You should aim to set aside 2-3 hours to prepare your brief presentation for your final session.

Q2. Where do I get a clinical question from? There are three sources you can try:

A. A patient you have seen in a clinic or ward session

B. Your own, or a friend’s or family member’s, health question (for both A and B please note the need for confidentiality)

C. You can use a question asked by a clinician and for which an answer has been attempted. Go to: www.tripanswers.org

Q3. Where do I get the article? You should find the article during the library session, and get the full-text article.

Q4. How do I appraise the article? Critically appraise the article using the appropriate appraisal sheet (see the Treatment trials appraisal sheet from the tutorial, and the extra readings for systematic review and diagnosis appraisal sheets).

Q5. What do I bring for the presentation? You need to prepare a brief presentation lasting a few minutes. There are several options:
1. There are PowerPoint templates on WebLearn you can use and bring on a USB drive
2. You can make a photocopy of a paper handout for everyone, or
3. You can get a single OHP (see attached example)
For other examples you might look at (i) the CATbank on the CEBM website ( www.cebm.net ) or (ii) BestBets ( www.bestbets.org ), which has CATs for A&E

STRUCTURE OF YOUR PRESENTATION (with guide times)
1. Describe the clinical situation and the clinical question (PICO) you need to answer. (2 mins)
2. Describe your search. (1 min)
3. Briefly describe the question (PICO) and methods of the study. Appraise the validity of the study. (3 mins)
4. State how the study applies to the patient you identified. (1 min)
A template for this is attached, but you are free to use another approach if you wish.

Note: If you wish you may download CATMaker from the CEBM website ( www.cebm.net - the downloads section) to assist with writing your CAT. CATMaker structures the critical appraisal process, does the needed calculations for you, and allows you to print out the summary results.


Critically Appraised Topic Presentation Example Template

1. Give a description of the clinical situation and the clinical question you need to answer.

Question (PICO):

2. Give your search strategy including: (a) database used, (b) search terms used, (c) number of papers identified, and (d) why you chose the particular article.

3. What was the question of the study? Appraise the validity of the study.

4. What were the results of the study?

5. State how the study applies to the patient you identified.


Critically Appraised Topic Presentation Example

1. The patient & clinical question
A 43-year-old male asked for some repeat prescriptions and advice about preventing deep vein thrombosis on a 12-hour flight (his brother had had one last year). You suggest stockings as the most effective prevention.

Question (PICO): In patients on long flights (P), do compression stockings (I) prevent DVT (O)?

2. Search strategy

(a) PubMed: Clinical Queries (with therapy filters)
(b) flight* AND stocking* AND DVT
(c) 6 papers including 2 separate trials
(d) The Scurr article was the larger trial, and quality appeared equal

3. The study – the question and appraisal

Study Question: In patients on flights over 8 hours in economy class, do Grade-I below-knee compression stockings, compared to no stocking, prevent ultrasound-detected DVT?

Randomisation: was by sealed envelope (not ideal) but led to reasonable balance (see Table 1), though more females appeared to receive stockings than males.
Ascertainment: there was an 86% follow-up and ultrasound in each arm; this is adequate.
Measurements: though stockings were removed pre-ultrasound, the sonographer may have seen the stocking mark and hence been unblinded.
The study has some flaws, but these are probably insufficient to explain the size of the results.

4. The results
DVT occurred in 12% of the No Stocking group and 0% of the Stocking group. This gives a relative risk reduction of 100% and an absolute risk reduction of 12% (95% CI 69-100).

The NNT (number needed to treat) is 9. However, there was a small increase in superficial thrombophlebitis.

5. How the results apply.
My patient is a little younger than the average of 62 years seen in the trial, and hence probably at somewhat lower risk. Nevertheless, this is a simple, cheap and effective prevention procedure, which I would recommend to him.


Presentations

The aim of this session is to produce and present a short Critically Appraised Topic (CAT), and to discuss any issues and difficulties you have had with the group and your tutor.

For your brief presentation:
• Please hand a copy of your review and paper to the tutor before presenting.
• You will have about 7 minutes to present and 3 minutes for discussion.
• This should require about 2-3 overheads, e.g., one for the patient, one for the search and paper, and one for application to the individual patient.


SYSTEMATIC REVIEW: Are the results of the review valid?

What question (PICO) did the systematic review address?
What is best? The main question being addressed should be clearly stated. The exposure, such as a therapy or diagnostic test, and the outcome(s) of interest will often be expressed in terms of a simple relationship.
Where do I find the information? The Title, Abstract or final paragraph of the Introduction should clearly state the question. If you still cannot ascertain what the focused question is after reading these sections, search for another paper!
This paper: Yes No Unclear
Comment:

F - Is it unlikely that important, relevant studies were missed?
What is best? The starting point for a comprehensive search for all relevant studies is the major bibliographic databases (e.g., Medline, Cochrane, EMBASE, etc) but should also include a search of reference lists from relevant studies, and contact with experts, particularly to inquire about unpublished studies. The search should not be limited to English language only. The search strategy should include both MeSH terms and text words.
Where do I find the information? The Methods section should describe the search strategy, including the terms used, in some detail. The Results section will outline the number of titles and abstracts reviewed, the number of full-text studies retrieved, and the number of studies excluded together with the reasons for exclusion. This information may be presented in a figure or flow chart.
This paper: Yes No Unclear
Comment:

A - Were the criteria used to select articles for inclusion appropriate?
What is best? The inclusion or exclusion of studies in a systematic review should be clearly defined a priori. The eligibility criteria used should specify the patients, interventions or exposures and outcomes of interest. In many cases the type of study design will also be a key component of the eligibility criteria.
Where do I find the information? The Methods section should describe in detail the inclusion and exclusion criteria. Normally, this will include the study design.
This paper: Yes No Unclear
Comment:

A - Were the included studies sufficiently valid for the type of question asked?
What is best? The article should describe how the quality of each study was assessed using predetermined quality criteria appropriate to the type of clinical question (e.g., randomization, blinding and completeness of follow-up).
Where do I find the information? The Methods section should describe the assessment of quality and the criteria used. The Results section should provide information on the quality of the individual studies.
This paper: Yes No Unclear
Comment:

T - Were the results similar from study to study?
What is best? Ideally, the results of the different studies should be similar or homogeneous. If heterogeneity exists the authors may estimate whether the differences are significant (chi-square test). Possible reasons for the heterogeneity should be explored.
Where do I find the information? The Results section should state whether the results are heterogeneous and discuss possible reasons. The forest plot should show the results of the chi-square test for heterogeneity, and the authors should discuss reasons for heterogeneity, if present.
This paper: Yes No Unclear
Comment:


What were the results?

How are the results presented?
A systematic review provides a summary of the data from the results of a number of individual studies. If the results of the individual studies are similar, a statistical method (called meta-analysis) is used to combine the results from the individual studies and an overall summary estimate is calculated. The meta-analysis gives weighted values to each of the individual studies according to their size. The individual results of the studies need to be expressed in a standard way, such as relative risk, or mean difference between the groups. Results are traditionally displayed in a figure, like the one below, called a forest plot.

The forest plot depicted above represents a meta-analysis of 5 trials that assessed the effects of a hypothetical treatment on mortality. Individual studies are represented by a black square and a horizontal line, which correspond to the point estimate and 95% confidence interval of the odds ratio. The size of the black square reflects the weight of the study in the meta-analysis. The solid vertical line corresponds to ‘no effect’ of treatment - an odds ratio of 1.0. When the confidence interval includes 1 it indicates that the result is not significant at conventional levels (P>0.05).

The diamond at the bottom represents the combined or pooled odds ratio of all 5 trials with its 95% confidence interval. In this case, it shows that the treatment reduces the odds of death by 34% (OR 0.66, 95% CI 0.56 to 0.78). Notice that the diamond does not overlap the ‘no effect’ line (the confidence interval doesn’t include 1) so we can be assured that the pooled OR is statistically significant. The test for overall effect also indicates statistical significance (p<0.0001).

Exploring heterogeneity
Heterogeneity can be assessed using the “eyeball” test or more formally with statistical tests, such as the Cochran Q test. With the “eyeball” test one looks for overlap of the confidence intervals of the trials with the summary estimate. In the example above note that the dotted line running vertically through the combined odds ratio crosses the horizontal lines of all the individual studies, indicating that the studies are homogeneous.

Heterogeneity can also be assessed using the Cochran chi-square (Cochran Q). If Cochran Q is statistically significant there is definite heterogeneity. If Cochran Q is not statistically significant but the ratio of Cochran Q and the degrees of freedom (Q/df) is > 1 there is possible heterogeneity. If Cochran Q is not statistically significant and Q/df is < 1 then heterogeneity is very unlikely. In the example above Q/df is < 1 (0.92/4 = 0.23) and the p-value is not significant (0.92), indicating no heterogeneity. Note: The level of significance for Cochran Q is often set at 0.1 due to the low power of the test to detect heterogeneity.
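The Cochran Q decision rule above can be written out as a short Python sketch. The function names are hypothetical, the default significance level of 0.1 follows the note above, and the case Q/df exactly equal to 1 (which the text does not cover) is treated here as "very unlikely" by assumption. The p-value helper uses the closed-form chi-square tail that holds only for an even number of degrees of freedom; for other cases a statistics package would be needed.

```python
import math


def chi2_sf_even_df(q, df):
    """Upper-tail chi-square probability; closed form valid for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this helper only handles positive, even df")
    half = q / 2
    term = 1.0
    total = 1.0
    for k in range(1, df // 2):
        term *= half / k  # builds (q/2)^k / k!
        total += term
    return math.exp(-half) * total


def heterogeneity_verdict(q, df, p_value, alpha=0.1):
    """Apply the three-way rule from the notes to Cochran Q."""
    if df <= 0:
        raise ValueError("df must be positive")
    if p_value < alpha:
        return "definite heterogeneity"
    if q / df > 1:
        return "possible heterogeneity"
    return "heterogeneity very unlikely"


# Values from the forest plot example: Q = 0.92 with 4 degrees of freedom.
p = chi2_sf_even_df(0.92, 4)
print(round(p, 2), heterogeneity_verdict(0.92, 4, p))
```

With Q = 0.92 and df = 4 the helper gives a p-value of about 0.92, which agrees with the figure quoted in the example.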


DIAGNOSTIC ACCURACY STUDY: Are the results of the study valid?

R - Was the diagnostic test evaluated in a Representative spectrum of patients (like those in whom it would be used in practice)?
What is best? It is ideal if the diagnostic test is applied to the full spectrum of patients - those with mild, severe, early and late cases of the target disorder. It is also best if the patients are randomly selected or consecutive admissions so that selection bias is minimized.
Where do I find the information? The Methods section should tell you how patients were enrolled and whether they were randomly selected or consecutive admissions. It should also tell you where patients came from and whether they are likely to be representative of the patients in whom the test is to be used.
This paper: Yes No Unclear
Comment:

A - Was the reference standard ascertained regardless of the index test result?
What is best? Ideally both the index test and the reference standard should be carried out on all patients in the study. In some situations where the reference standard is invasive or expensive there may be reservations about subjecting patients with a negative index test result (and thus a low probability of disease) to the reference standard. An alternative reference standard is to follow up people for an appropriate period of time (dependent on disease in question) to see if they are truly negative.
Where do I find the information? The Methods section should indicate whether or not the reference standard was applied to all patients or if an alternative reference standard (e.g., follow-up) was applied to those who tested negative on the index test.
This paper: Yes No Unclear
Comment:

Mbo - Was there an independent, blind comparison between the index test and an appropriate reference ('gold') standard of diagnosis?
What is best? There are two issues here. First, the reference standard should be appropriate - as close to the 'truth' as possible. Sometimes there may not be a single reference test that is suitable and a combination of tests may be used to indicate the presence of disease. Second, the reference standard and the index test being assessed should be applied to each patient independently and blindly. Those who interpreted the results of one test should not be aware of the results of the other test.
Where do I find the information? The Methods section should have a description of the reference standard used and if you are unsure of whether or not this is an appropriate reference standard you may need to do some background searching in the area. The Methods section should also describe who conducted the two tests and whether each was conducted independently and blinded to the results of the other.

This paper: Yes No Unclear Comment:


What were the results?

Are test characteristics presented?
There are two types of results commonly reported in diagnostic test studies. One concerns the accuracy of the test and is reflected in the sensitivity and specificity. The other concerns how the test performs in the population being tested and is reflected in predictive values (also called post-test probabilities). To explore the meaning of these terms, consider a study in which 1000 elderly people with suspected dementia undergo an index test and a reference standard. The prevalence of dementia in this group is 25%. 240 people tested positive on both the index test and the reference standard and 600 people tested negative on both tests. The first step is to draw a 2 x 2 table as shown below. We are told that the prevalence of dementia is 25%, therefore we can fill in the last row of totals - 25% of 1000 people is 250 - so 250 people will have dementia and 750 will be free of dementia. We also know the number of people testing positive and negative on both tests and so we can fill in two more cells of the table.

                      Reference Standard
                      +ve      -ve      Total
Index test   +ve      240
             -ve               600
Total                 250      750      1000

By subtraction we can easily complete the table:

                      Reference Standard
                      +ve      -ve      Total
Index test   +ve      240      150      390
             -ve       10      600      610
Total                 250      750      1000

Now we are ready to calculate the various measures.

Sensitivity (Sn)
What is the measure? The proportion of people with the condition who have a positive test result. In our example, the Sn = 240/250 = 0.96.
What does it mean? The sensitivity tells us how well the test identifies people with the condition. A highly sensitive test will not miss many people. 10 people (4%) with dementia were falsely identified as not having it. This means the test is fairly good at identifying people with the condition.

Specificity (Sp)
What is the measure? The proportion of people without the condition who have a negative test result. In our example, the Sp = 600/750 = 0.80.
What does it mean? The specificity tells us how well the test identifies people without the condition. A highly specific test will not falsely identify many people as having the condition. 150 people (20%) without dementia were falsely identified as having it. This means the test is only moderately good at identifying people without the condition.

Positive Predictive Value (PPV)
What is the measure? The proportion of people with a positive test who have the condition. In our example, the PPV = 240/390 = 0.62.
What does it mean? This measure tells us how well the test performs in this population. It is dependent on the accuracy of the test (primarily specificity) and the prevalence of the condition. Of the 390 people who had a positive test result, 62% will actually have dementia.

Negative Predictive Value (NPV)
What is the measure? The proportion of people with a negative test who do not have the condition. In our example, the NPV = 600/610 = 0.98.
What does it mean? This measure tells us how well the test performs in this population. It is dependent on the accuracy of the test and the prevalence of the condition. Of the 610 people with a -ve test, 98% will not have dementia.

Application

1. Were the methods for performing the test described in sufficient detail to permit replication?
What is best? The article should have sufficient description of the test to allow its replication and also interpretation of the results.
Where do I find the information? The Methods section should describe the test in detail.

This paper: Yes No Unclear Comment:
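The 2 x 2 arithmetic in the dementia example above can be checked with a short Python sketch. The function names (`complete_table`, `diagnostic_measures`) are hypothetical and chosen for this illustration; the numbers are the ones given in the example (1000 people, 25% prevalence, 240 positive on both tests, 600 negative on both).

```python
def complete_table(total, prevalence, true_pos, true_neg):
    """Fill in the missing 2 x 2 cells by subtraction from the column totals."""
    with_condition = round(total * prevalence)
    without_condition = total - with_condition
    false_neg = with_condition - true_pos
    false_pos = without_condition - true_neg
    return true_pos, false_pos, false_neg, true_neg


def diagnostic_measures(true_pos, false_pos, false_neg, true_neg):
    """Sensitivity, specificity and predictive values from the four cells."""
    return {
        "sensitivity": true_pos / (true_pos + false_neg),
        "specificity": true_neg / (false_pos + true_neg),
        "ppv": true_pos / (true_pos + false_pos),
        "npv": true_neg / (false_neg + true_neg),
    }


cells = complete_table(total=1000, prevalence=0.25, true_pos=240, true_neg=600)
measures = diagnostic_measures(*cells)
for name, value in measures.items():
    print(f"{name}: {value:.2f}")
```

Changing the prevalence while keeping sensitivity and specificity fixed is a quick way to see why the predictive values depend on the population being tested.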


SELF-QUIZ: These questions will test your knowledge of some skills in evidence-based practice.

Question 1
For the issues described below (examples 1 to 4), choose the SINGLE most relevant study type from those listed (options A to G) and clearly write the letter of your chosen option in the space provided at the end of the example:

1. The best type of study to assess the effect of a new treatment would be ______
2. The best type of study to determine the prevalence of cataracts would be ______
3. The best type of study to determine the accuracy of a new diagnostic test would be ______
4. The best type of study to determine the natural history (prognosis) would be ______

Options:
A. a cohort study
B. a case-control study
C. a randomised controlled trial
D. a population survey
E. a consecutive sample of patients with a reference standard test
F. a nested case-control study
G. a qualitative study

Question 2
For the issues described below (examples 1 to 8), choose the SINGLE most relevant term from those listed (options A to L):

Examples:
1. keywords coded by the National Library of Medicine are _____
2. requires that an article contains BOTH words _____
3. is used to find words with the same stem _____
4. contains the largest database of randomised controlled trials _____
5. requires that an article contains EITHER word _____
6. is a MEDLINE interface available free via the internet _____
7. to use all subheadings of a MeSH term you would use _____
8. contains the largest database of non-randomised studies _____

Options:
A. MeSH Terms
B. limiters
C. PubMed
D. AND
E. OR
F. NOT
G. wildcard (*)
H. Webspirs
I. Grateful Med
J. The Cochrane Library
K. Medline
L. explode

Question 3
Please read the examples below (1 to 4). For each, indicate whether it is correctly formulated to search for an answer to the question (clearly circle either YES or NO where indicated).

QUESTION: What strategies can we use to minimise falls in our elderly population?
SEARCH: (elder* or old or aged) and prevent and (fall or fracture)
Correctly formulated (circle)? YES/NO
Summary of reasons if you answered no ______

Question 4
When evaluating the internal validity of a Randomised Controlled Trial the three most important things you would look for are: (number the three most important as 1, 2, 3)
• Were all patients who entered the trial accounted for at its conclusion?
• Were the patients randomly selected from the target population?
• Were the patients and clinicians kept blind to which treatment was being received?
• Was the randomisation list concealed?
• Were only patients who fully complied included in the final analysis?
• Are the outcome measures clearly defined?
• Were the outcome measures blinded or objective?


Glossary (From the EBM Journal BMJ Publishing)

TERMS USED IN THERAPEUTICS

Allocation concealed: deemed to have taken adequate measures to conceal allocation to study group assignments from those responsible for assessing patients for entry in the trial (eg, central randomisation; sequentially numbered, opaque, sealed envelopes; sealed envelopes from a closed bag; numbered or coded bottles or containers; drugs prepared by the pharmacy; or other descriptions that contain elements convincing of concealment).
Allocation not concealed: deemed to have not taken adequate measures to conceal allocation to study group assignments from those responsible for assessing patients for entry in the trial (eg, no concealment procedure was undertaken, sealed envelopes that were not opaque, or other descriptions that contain elements not convincing of concealment).
Unclear allocation concealment: the authors of the article did not report or provide us with a description of an allocation concealment approach that allowed for classification as concealed or not concealed.
Blinded: any or all of the clinicians, patients, participants, outcome assessors, or statisticians were unaware of who received which study intervention. Those that are blinded are indicated in parentheses. If "initially" is indicated (eg, blinded [patients and outcome assessor initially]), the code was broken during the trial, for instance, because of adverse effects.
Blinded (unclear): the authors did not report or provide us with an indication of who, if anyone, was unaware of who received which study intervention.
Unblinded: all participants in the trial (clinicians, patients, participants, outcome assessors, and statisticians) were aware of who received which study intervention.

When the experimental treatment reduces the risk for a bad event
RRR (relative risk reduction): the proportional reduction in rates of bad events between experimental (experimental event rate [EER]) and control (control event rate [CER]) patients in a trial, calculated as |EER-CER|/CER and accompanied by a 95% confidence interval (CI).
ARR (absolute risk reduction): the absolute arithmetic difference in event rates, |EER-CER|.
NNT (number needed to treat): the number of patients who need to be treated to prevent one additional bad outcome; calculated as 1/ARR, rounded up to the next highest whole number, and accompanied by its 95% CI.

When the experimental treatment increases the probability of a good event
RBI (relative benefit increase): the increase in the rates of good events, comparing experimental and control patients in a trial, also calculated as |EER-CER|/CER.
ABI (absolute benefit increase): the absolute arithmetic difference in event rates, |EER-CER|.
NNT: calculated as 1/ABI; denotes the number of patients who must receive the experimental treatment to create one additional improved outcome in comparison with the control treatment.

When the experimental treatment increases the probability of a bad event
RRI (relative risk increase): the increase in rates of bad events, comparing experimental patients to control patients in a trial, and calculated as for RBI. RRI is also used in assessing the effect of risk factors for disease.


ARI (absolute risk increase): the absolute difference in rates of bad events, when the experimental treatment harms more patients than the control treatment; calculated as for ABI.
NNH (number needed to harm): the number of patients who, if they received the experimental treatment, would lead to one additional person being harmed compared with patients who receive the control treatment; calculated as 1/ARI.
Confidence interval (CI): the CI quantifies the uncertainty in measurement; usually reported as 95% CI, which is the range of values within which we can be 95% sure that the true value for the whole population lies.
Weighted event rates: the contributions of individual studies to the total in a meta-analysis, determined by the sample size and the number of events in each study.
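The therapy measures defined above (RRR, ARR and NNT) follow directly from the two event rates, as the Python sketch below illustrates. The function name `therapy_measures` is hypothetical; the event rates are taken from the flight stocking example earlier in these notes (12% DVT without stockings, 0% with). Confidence intervals are not computed here.

```python
import math


def therapy_measures(cer, eer):
    """Return (RRR, ARR, NNT) from control and experimental event rates."""
    arr = abs(eer - cer)          # absolute risk reduction, |EER-CER|
    if cer == 0 or arr == 0:
        raise ValueError("RRR and NNT are undefined when CER or ARR is zero")
    rrr = arr / cer               # relative risk reduction, |EER-CER|/CER
    nnt = math.ceil(1 / arr)      # 1/ARR, rounded up to a whole number
    return rrr, arr, nnt


rrr, arr, nnt = therapy_measures(cer=0.12, eer=0.0)
print(f"RRR = {rrr:.0%}, ARR = {arr:.0%}, NNT = {nnt}")
```

For these rates the sketch reproduces the figures quoted in the example: an RRR of 100%, an ARR of 12% and an NNT of 9.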

TERMS USED IN DIAGNOSIS

Sensitivity: the proportion of patients with the target disorder who have a positive test result (a/[a + c]) (figure).
Specificity: the proportion of patients without the target disorder who have a negative test result (d/[b + d]) (figure).
Pretest probability (prevalence): the proportion of patients who have the target disorder, as determined before the test is carried out ([a + c]/[a + b + c + d]) (figure).
Pretest odds: the odds that the patient has the target disorder before the test is carried out (pretest probability/[1 - pretest probability]).
Likelihood ratio (LR): the ratio of the probability of a test result among patients with the target disorder to the probability of that same test result among patients who are free of the target disorder. The LR for a positive test is calculated as sensitivity/(1 - specificity). The LR for a negative test is calculated as (1 - sensitivity)/specificity.
Post-test odds: the odds that the patient has the target disorder after the test is carried out (pretest odds x LR).
Post-test probability: the proportion of patients with that particular test result who have the target disorder (post-test odds/[1 + post-test odds]).
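The chain from pretest probability to post-test probability defined above can be followed step by step in a short Python sketch. The function names are hypothetical. The inputs are borrowed from the dementia example in the diagnostic appraisal sheet (sensitivity 0.96, specificity 0.80, prevalence 25%), so the post-test probability after a positive test should agree with the PPV of 240/390 found there.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR for a positive test, LR for a negative test)."""
    lr_positive = sensitivity / (1 - specificity)
    lr_negative = (1 - sensitivity) / specificity
    return lr_positive, lr_negative


def post_test_probability(pretest_probability, likelihood_ratio):
    """Convert probability to odds, apply the LR, convert back."""
    pretest_odds = pretest_probability / (1 - pretest_probability)
    post_test_odds = pretest_odds * likelihood_ratio
    return post_test_odds / (1 + post_test_odds)


lr_pos, lr_neg = likelihood_ratios(sensitivity=0.96, specificity=0.80)
after_positive = post_test_probability(0.25, lr_pos)
after_negative = post_test_probability(0.25, lr_neg)
print(f"LR+ = {lr_pos:.1f}, LR- = {lr_neg:.2f}")
print(f"post-test probability: +ve {after_positive:.2f}, -ve {after_negative:.3f}")
```

The probability after a negative test is the complement of the NPV: about 1.6% of people with a negative result still have the condition (10 of 610 in the example).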


SELF-QUIZ: Possible answers.

Question 1

1. The best type of study to assess the effect of a new treatment would be ___C___
2. The best type of study to determine the prevalence of cataracts would be ___D___
3. The best type of study to determine the accuracy of a new diagnostic test would be ___E___
4. The best type of study to determine the natural history (prognosis) would be ___A___

Options:
A. a cohort study
B. a case-control study
C. a randomised controlled trial
D. a population survey
E. a consecutive sample of patients with a reference standard test
F. a nested case-control study
G. a qualitative study

Question 2
For the issues described below (examples 1 to 8), choose the SINGLE most relevant term from those listed (options A to L):

Examples:
1. keywords coded by the National Library of Medicine are _____ MeSH
2. requires that an article contains BOTH words _____ AND
3. is used to find words with the same stem _____ * (PubMed)
4. contains the largest database of randomised controlled trials _____ Cochrane Trials Register
5. requires that an article contains EITHER word _____ OR
6. is a MEDLINE interface available free via the internet _____ PubMed
7. to use all subheadings of a MeSH term you would use _____ explode
8. contains the largest database of non-randomised studies _____ Medline

Options:
A. MeSH Terms
B. limiters
C. PubMed
D. AND
E. OR
F. NOT
G. wildcard (*)
H. Webspirs
I. Grateful Med
J. The Cochrane Library
K. Medline
L. explode

Question 3
SEARCH: elder* or old or aged and prevent and fall or fracture
Correctly formulated (circle)? YES/NO
Summary of reasons if you answered no: NO. The ORs should be in parentheses and capitalised; the prevent, fall and fracture could be truncated, i.e., (elder* OR old OR aged) AND prevent* AND (fall* OR fracture*)

Question 4
When evaluating the internal validity of a Randomised Controlled Trial the three most important things you would look for are: (number the three most important as 1, 2, 3)
• 3 Were all patients who entered the trial accounted for at its conclusion?
• Were the patients randomly selected from the target population?
• Were the patients and clinicians kept blind to which treatment was being received?
• 1 Was the randomisation list concealed?
• Were only patients who fully complied included in the final analysis?
• Were the outcome measures clearly defined?
• 2 Were the outcome measures blinded or objective?
