
BASIC CONCEPTS IN Introduction

Hayley Coleman

AIMS

LO 8: To develop an understanding of research methodology and critical appraisal of the research literature
8a Research techniques: Demonstrate an understanding of basic research methodology, including both quantitative and qualitative techniques
8b Evaluation and critical appraisal of research: Assess the importance of findings, using appropriate statistical analysis

Cover: Study Types, Basic Epi and Stats

Disclaimer: I am not a or academic, I have one hour STUDY DESIGNS

Case-control studies Cohort studies Cross-sectional studies Geographical/Ecological studies Randomized controlled trials CASE-CONTROL STUDIES

Start by identifying a group of cases (individuals with a particular health outcome) in a given population and a group of controls (individuals without the health outcome) to be included in the study.

CASE CONTROL STUDIES

Advantages
• Cost-effective relative to other study designs
• Efficient for the study of rare diseases or ones with long latency
• Quick and fairly inexpensive
• Allows examination of multiple exposures
Disadvantages
• Prone to selection/recall and observer bias
• Can only examine one outcome
• Poor choice for rare exposures
• Temporal sequence may be hard to determine

COHORT STUDY

A group of individuals exposed to a factor and a group unexposed to it are followed over time (often years) to determine the occurrence of disease. The incidence of disease in the exposed group is compared with the incidence of disease in the unexposed group.

[Diagram: an exposed group and an unexposed group are each followed up and divided into case / no case]

COHORT STUDY

Advantages
• Multiple outcomes can be measured for 1 exposure, and multiple exposures can be examined
• Can delineate temporal relationship
• Good for rare exposures
• Can measure incidence and
Disadvantages
▪ Costly and time consuming
▪ Prone to
▪ Prone to loss to follow-up bias
▪ Knowledge of exposure may bias assessment of outcome
▪ Being in the study may alter participants' behaviour
▪ Poor choice if the disease is rare

CROSS-SECTIONAL STUDIES

Examines the relationship between disease (or health-related state) and other variables of interest (i.e. exposure) as they exist in a defined population at a single point in time or over a short period of time (e.g. 1 year)
• Provides a snapshot in time
• Used to assess disease burden or health needs
• Can be descriptive and analytical

CROSS-SECTIONAL STUDIES

Advantages
• Quick, easy, cheap
• Multiple outcomes and exposures measured
• Good for assessing burden/planning services
• Good for generating hypotheses
Disadvantages
• Can't determine temporal relationship
• No good if the disease is rare or of short duration
• Unable to measure incidence
• Bias due to low response and recall

ECOLOGICAL STUDIES – USES

Population level Risk Factor

Hypothesis generation
• New connections, new ideas
Are there factors which genuinely operate at the population level?
• Effect modification?
• Determinants of exposure?
Context
• The individualistic fallacy?

ECOLOGICAL STUDY: GEOGRAPHICAL

ECOLOGICAL STUDY: TIME TRENDS

[Figure: ecological study time trends; x-axis: Time]

ASSOCIATION NOT CAUSATION

ASSOCIATION VERSUS CAUSATION

Association as a result of:
• Chance (random error)
• Bias (systematic error)
• Confounding
• Causal link

Bradford-Hill criteria (J Roy Soc Med 1965:58:295-300):
• Strength of the association
• Consistency of findings
• Specificity of the association
• Temporal sequence of association
• Biological gradient
• Coherence

Limitations of ecological studies:
• Confounders: unable to adjust for confounders due to lack of data
• Bias: data on exposure and outcome may be collected in different ways or using different definitions over time or in different places
• Ecological fallacy: assuming that group-level associations between outcome and exposure also apply at the individual level leads to the ecological fallacy (or ecological bias)
• Loss to attrition/migration of populations

EPIDEMIOLOGY

MEASURES OF DISEASE

Prevalence
Incidence

MEASURING FREQUENCY OF OUTCOMES

Risk
Odds
Rates

WHAT IS A RATE?

Allows comparisons between populations of different sizes and/or at different times
Strictly speaking, a rate is expressed per unit of time, e.g. per year

rate = no. of events in population (numerator) / no. of people in population (denominator)

Person, Place, Time

PREVALENCE

• A measure of the occurrence of all cases of disease in a population

prevalence = existing cases of disease in population / number of people in population

• Case definition
• Ascertainment (registers, surveys)
• Point v period v lifetime prevalence
• Suitable for chronic disease

TYPES OF PREVALENCE

Point prevalence
• The proportion of the population with a disease at a specific point in time
Period prevalence
• The proportion of the population with a disease at any point during a defined period
Lifetime prevalence
• The proportion of the population who have, or have had, a disease during their lifetime
• Better measure for chronic, relapsing conditions

INCIDENCE

Measure of the number of new cases of a disease (or other health outcome of interest) that develop in a population at risk during a specified time period.
2 incidence measures: Risk (or cumulative incidence) or Rate

Risk = number of NEW cases occurring over a given period of time / number of people in the population at risk (free of disease) at the beginning of the time period

INCIDENCE RATE

Takes into account the sum of the time that each person remained under observation and at risk of developing the outcome under investigation.

Incidence rate = number of new cases of disease in a given time period / total person-time at risk during the time period
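As a rough illustration of the person-time calculation above, the sketch below adds up each person's time at risk and divides the number of new cases by that total. The follow-up times and case flags are made-up values for illustration only (not data from these slides), and `incidence_rate` is a hypothetical helper name.

```python
def incidence_rate(follow_up):
    """Return new cases per unit of person-time at risk.

    follow_up: list of (time_at_risk, became_case) pairs, one per person.
    """
    person_time = sum(time for time, _ in follow_up)
    new_cases = sum(1 for _, became_case in follow_up if became_case)
    return new_cases / person_time


# Hypothetical cohort: (years at risk, developed the outcome?)
cohort = [(10, False), (4, True), (6, True), (10, False), (10, False)]

rate = incidence_rate(cohort)
print(f"{rate * 1000:.0f} cases per 1,000 person-years")
```

People who become cases stop contributing person-time once they develop the outcome, which is why their follow-up times are shorter in the made-up data.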

[Figure: person-time at risk for students 1 to 10; horizontal axis 0 to 10]

RELATIONSHIP BETWEEN INCIDENCE & PREVALENCE

CRUDE AND ADJUSTED RATES

Crude rate applies to the total population
Specific rates can be calculated for sub-groups
Adjusting takes account of certain characteristics or factors in the population (potential confounders)
This is useful because we know, for example, that death rates for various conditions differ markedly by age
Standardisation allows you to compare populations with a different age profile

crude death rate = total number of deaths / size of population

age-specific death rate = number of deaths in age range / size of population in age range

STANDARDISATION

Allows you to ‘age standardise’ your data (adjusts your data for the confounder of age).
Comparisons of health outcomes between groups or across time periods, where age structures differ, require techniques that adjust for variations in the age structure of populations.

(From Naing 2000)

STANDARDIZATION

Indirect and direct methodology available

Direct Standardization
• Use when comparing several population groups or several time periods

Indirect Standardization
• To determine if disease incidence is higher or lower in one area only (compares to standard)
• Use if age-specific rates for the population groups are not available or unreliable
• Use if the event is rare and thus deaths in the population groups are small

DIRECT AGE STANDARDISATION

You will need:
- your population broken down into age bands
- death rates for each group in your population (deaths/total population)
- a “standard population” with the same age bands

Instructions
Apply your age-band specific death rates to the age bands of the standard population → “Expected deaths” (if your population had the same age distribution as the standard population)
ASDR
Preferred

EUROPEAN STANDARD POPULATION

INDIRECT AGE STANDARDISATION

You will need:
- “standard age-related rates”
- your population broken down by age bands

What you do:
Apply the “standard rates” to your population
Your answer gives expected deaths in each age group

SMR: The ratio of the number of events observed in the study population to the number that would be expected if the study population had the same age-sex specific rates as the standard population

SMR = observed number of deaths / expected number of deaths

EXAMPLE
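The indirect method described above can be sketched in a few lines: apply the standard rates to each age band of your own population, sum the expected deaths, and compare with what was observed. The age bands, population sizes and standard rates here are hypothetical, chosen only so that the totals come out as 160 observed and 100 expected deaths, the figures used in the worked SMR calculation; the function names are likewise assumptions.

```python
def expected_deaths(population_by_age, standard_rates_per_1000):
    """Apply standard age-specific death rates to the study population."""
    return sum(
        population_by_age[band] * standard_rates_per_1000[band] / 1000
        for band in population_by_age
    )


def smr(observed, expected):
    """Standardised mortality ratio, scaled so that 100 means 'as expected'."""
    return 100 * observed / expected


# Hypothetical study population and standard rates (deaths per 1,000 per year).
population_by_age = {"0-39": 20_000, "40-64": 8_000, "65+": 2_000}
standard_rates_per_1000 = {"0-39": 1, "40-64": 5, "65+": 20}

expected = expected_deaths(population_by_age, standard_rates_per_1000)
observed = 160  # deaths actually counted in the study population

print(f"Expected deaths: {expected:.0f}, SMR: {smr(observed, expected):.0f}")
```

An SMR above 100 means more deaths were observed than the standard rates would predict for a population with this age structure.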

SMR = Observed number of deaths (O) / Expected number of deaths (E) x 100
SMR = 160 / 100 = 1.6; 1.6 x 100 = 160

MEASURES OF MORTALITY

Mortality rates are often used as a proxy for disease occurrence
• Routinely counted and readily available
Poor proxy for diseases with great morbidity but little mortality
• chronic diseases
• infectious diseases

STATISTICS

HYPOTHESIS TESTING

Step 1: what is our hypothesis?
Null hypothesis = simplest position = no difference
For example: are smoking rates higher in group A or B?

H0 = no difference
Alternative: There is a difference
Step 2: gather data
Step 3: decide on statistical test and calculate test statistic
Step 4: P value from the test statistic

Step 5: interpret – reject or fail to reject (loosely, "accept") H0

SIGNIFICANCE

Statistical significance ≠ clinical significance

Statistical significance = is p-value below α?

Clinical significance: is the effect important enough to act upon? Need to consider what difference (e.g. fall in blood pressure) would change clinical practice.

Sample size is shaped by the clinical difference needed, power and statistical significance. NB: if a small difference would be clinically significant, you’d need a more precise measure and a larger sample.

HYPOTHESIS TESTING ERRORS

Four possible outcomes:

TIPS

P value 0.05: Type 1 error (α = 5%)

Power 80%: Type 2 error (β = 20%)

WHICH TEST?

COMMON VALUES

Standard deviation (s or σ): The standard deviation is a measure of how spread out numbers are. Its symbol is σ (sigma); it is the square root of the variance. → Spread

Confidence Intervals: 95% certain that the true value lies between the lower and upper limits of the interval → Precision

Odds ratio (OR) is used in case-control studies to estimate the strength of association between exposure and outcome. The results of a case-control study can be presented in a 2x2 table. The odds ratio compares the odds of exposure among cases with the odds of exposure among controls; numerically this equals the odds of disease in the exposed compared to the odds of disease in the unexposed.

OR=1 Exposure does not affect odds of outcome
OR>1 Exposure associated with higher odds of outcome
OR<1 Exposure associated with lower odds of outcome

EXERCISE

Calculate the OR from a hypothetical case-control study of smoking and cancer of the pancreas among 100 cases and 400 controls.

Answer: OR = (60 x 300) / (100 x 40) = 4.5

(RATE RATIO)
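Returning to the odds ratio exercise above, the cross-product calculation can be checked with a short sketch. The 2x2 table itself is not reproduced in the text, so the four cell counts are inferred from the numbers in the answer together with the stated 100 cases and 400 controls; treat that layout as an assumption. `odds_ratio` is a hypothetical helper name.

```python
def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Cross-product odds ratio from a 2x2 case-control table."""
    return (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)


# Inferred table: 100 cases (60 smokers, 40 non-smokers),
# 400 controls (100 smokers, 300 non-smokers).
print(odds_ratio(60, 40, 100, 300))  # 4.5
```

The same function gives the odds of exposure among cases (60/40) divided by the odds of exposure among controls (100/300), which is just another way of writing the cross-product.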

The ratio of the risk of disease among the exposed to the risk among the unexposed

RR>1 implies positive association
Answer: RR = 1.5 / 0.1 = 15

RELATIVE RISK

Relative risk (RR): R(e)/R(u)
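A minimal sketch of R(e)/R(u), assuming cohort-style counts where each risk is the number of cases divided by the number of people at risk. The counts are hypothetical and `relative_risk` is an illustrative name, not something defined in these slides.

```python
def relative_risk(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """RR = R(e) / R(u), each risk being cases / people at risk."""
    risk_exposed = exposed_cases / exposed_total
    risk_unexposed = unexposed_cases / unexposed_total
    return risk_exposed / risk_unexposed


# Hypothetical cohort: 30 of 200 exposed and 10 of 200 unexposed become cases.
rr = relative_risk(30, 200, 10, 200)
print(f"RR = {rr:.1f}")
```

Unlike the odds ratio, this needs the totals at risk in each group, which is why it comes naturally from cohort studies rather than case-control studies.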

What is the difference between an odds ratio and relative risk?

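Odds compare events with non-events; risk compares events with all possible outcomes. If you throw a die, the odds of rolling a 1 are 1/5 but the risk is 1/6. As the denominators get bigger (i.e. the event becomes rarer), the difference between the two becomes smaller.

ABSOLUTE V RELATIVE RISK - EXAMPLE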

New drug reduces risk of death by 50% !! Should NHS approve it?

But:
Risk of death without drug = 2 per million
Risk of death with drug = 1 per million
Relative risk reduction = 50%
ARR = 1 per million
NNT = 1 million
And cost??
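The arithmetic behind this example can be sketched as follows, using the standard definitions ARR = risk without treatment minus risk with treatment, relative risk reduction = ARR / risk without treatment, and NNT = 1 / ARR. The function name is an assumption made for illustration.

```python
def risk_summary(risk_without, risk_with):
    """Return (relative risk reduction, absolute risk reduction, NNT)."""
    arr = risk_without - risk_with
    rrr = arr / risk_without
    nnt = 1 / arr
    return rrr, arr, nnt


# Figures from the example: 2 per million without the drug, 1 per million with it.
rrr, arr, nnt = risk_summary(2 / 1_000_000, 1 / 1_000_000)
print(f"RRR = {rrr:.0%}, ARR = {arr * 1_000_000:.0f} per million, NNT = {nnt:,.0f}")
```

The same 50% relative reduction would give an NNT of only 10 if the baseline risk were 20% rather than 2 per million, which is why the absolute figures matter for the approval decision.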

Attributable risk: is the excess incidence of the outcome that we can attribute to the exposure if we assume a causal link Absolute effect or excess risk of disease in those exposed compared to unexposed AR = 0 if no association

Attributable risk = Incidence in exposed – Incidence in unexposed

Attributable risk fraction (percent) is the proportion of the outcome in exposed individuals that can be attributed to the exposure

Attributable risk fraction = Attributable risk / incidence in exposed

Epidemiological measure        Outcome A                   Outcome B
Incidence rate in exposed      6 per 1,000 person-years    30 per 1,000 person-years
Incidence rate in unexposed    1 per 1,000 person-years    5 per 1,000 person-years

From: Carneiro & Howard

EXAMPLE

Exposure Status    Diseased    No Disease    Population    Cumulative Incidence (Risk)
Exposed            500         9,500         10,000        0.050
Not Exposed        900         89,100        90,000        0.010
Column Totals      1,400       98,600        100,000       0.014

1,400 total cases in the "Diseased" column, but only 500 of these had the exposure of interest. None of the other 900 cases can be attributed to the exposure, because they were not exposed.

Consequently, only 500/1,400 = 0.357, or 35.7% of the diseased subjects were exposed (35.7% is the proportion of exposed cases). However, not all of these diseased cases can be attributed to the exposure.

AR = Ie – Iu
Ie = 500/10,000 = 0.05
Iu = 900/90,000 = 0.01
AR = 0.05 – 0.01 = 0.04
ARF = Attributable risk / incidence in exposed
ARF = 0.04 / 0.05 = 0.8 (80%)
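The calculation above can be reproduced with a short sketch that works directly from the counts in the example table. It also computes the population-level measures, taking population attributable risk as incidence in the population minus incidence in the unexposed, and dividing that by the incidence in the population to get the population attributable fraction; that last step is the standard formula rather than one written out in these slides, and the function name is an assumption.

```python
def attributable_measures(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Return (AR, ARF, PAR, PAF) from cohort counts."""
    i_exposed = exposed_cases / exposed_total
    i_unexposed = unexposed_cases / unexposed_total
    i_population = (exposed_cases + unexposed_cases) / (exposed_total + unexposed_total)

    ar = i_exposed - i_unexposed      # attributable risk
    arf = ar / i_exposed              # attributable risk fraction
    par = i_population - i_unexposed  # population attributable risk
    paf = par / i_population          # population attributable fraction
    return ar, arf, par, paf


# Counts from the example table: 500 of 10,000 exposed and 900 of 90,000 unexposed diseased.
ar, arf, par, paf = attributable_measures(500, 10_000, 900, 90_000)
print(f"AR = {ar:.3f}, ARF = {arf:.0%}, PAR = {par:.3f}, PAF = {paf:.1%}")
# AR = 0.040, ARF = 80%, PAR = 0.004, PAF = 28.6%
```

The 28.6% agrees with multiplying the proportion of cases who were exposed (0.357) by the attributable risk fraction (0.80).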

Therefore, in the population the fraction of cases that can be attributed to the exposure is 0.357 x 0.80 = 0.286, or 28.6%.

POPULATION ATTRIBUTABLE RISK

Population attributable risk = Incidence in population – Incidence in unexposed
The population attributable fraction (PAF) is the proportional reduction in population disease or mortality that would occur if exposure to a risk factor were reduced.

CORRELATION

Association

Correlation measures strength of association
In correlation, neither variable is treated as dependent on the other
Strength of correlation given by “correlation coefficient” r (-1 to 1)

REGRESSION

In regression, one responds to (is dependent on) the other

Regression allows prediction: if you know x you can predict y
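To make the prediction idea concrete, here is a minimal sketch of simple linear regression by ordinary least squares, written without external libraries. The x and y values are invented for illustration, and `fit_line` and `predict` are hypothetical helper names.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Predicted y for a given x on the fitted line."""
    return intercept + slope * x


# Hypothetical data: x is the explanatory variable, y the response.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.0]

slope, intercept = fit_line(xs, ys)
print(f"y = {intercept:.2f} + {slope:.2f}x; predicted y at x = 6: {predict(6, slope, intercept):.2f}")
```

Predictions are most trustworthy within the range of x values used to fit the line; x = 6 here is a small extrapolation.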

Multiple Linear

SUMMARY

Choose the right study and statistic
• CC: OR
• Cohort: RR
• >1
Think about numerator and denominator
Crude and adjusted... control
Type 1 and 2 errors
Contextualise

USEFUL LINKS/ FURTHER READING

Health Knowledge . Epidemiology for Practitioners: available on: http://www.healthknowledge.org.uk/e-learning/epidemiology/practitioners

Epidemiology for the uninitiated. BMJ available under: http://www.bmj.com/about-bmj/resources-readers/publications/epidemiology-uninitiated/1-what-epidemiology

SOME READING

Donaldson’s Essential
Health Knowledge website
“Medical Statistics Made Easy” http://sumed.sun.ac.za/Portals/0/Repository/Medical-Statistics-Made-Easy.3f8ceb88-f35b-4f29-8d70-17e78ce071d8.pdf
“Medical Statistics” Betty R Kirkwood and Jonathan AC Sterne
“Qualitative methods for health research” Judith Green and Nicki Thorogood

YOUTUBE LINKS

Rahul Patwari: Really good YouTuber for stats. Have a look through: loads of videos (not all relevant, but the stats ones are simple and short). Some of the particularly good videos are below.

Hypothesis testing: https://www.youtube.com/watch?v=_Z5gPXoRkic

Odds and Risk Ratios https://www.youtube.com/watch?v=hOtoV2Kjb0o

Confidence intervals and P values https://www.youtube.com/watch?v=1tWhe4fWp-o

Hazard ratios and survival curves https://www.youtube.com/watch?v=p1wa8W11JLI

QUIZ

The number of new cases that occur within a specific population within a defined time interval is:
A. Point prevalence
B. Incidence
C. Period prevalence
D. Lifetime prevalence

QUIZ

In epidemiology research, if the relative risk is greater than 1.0, the group with the suspected risk factor:
A. Has a lower incidence rate of the disorder.
B. Has a higher incidence rate of the disorder.
C. Has no relationship with the risk factor.
D. None of the above

QUIZ

The ratio between the incidence of disease among exposed and non-exposed is called:
A. Causal risk
B. Attributable risk
C. Relative risk
D. Odds ratio

QUIZ

Which is false about a cohort study?
A. Incidence can be measured
B. Used to study chronic diseases
C. Expensive
D. Always prospective

QUIZ

Incidence is defined as:
A. Number of cases existing in a given population at a given
B. Number of cases existing in a given period
C. Number of new cases occurring during a specific period
D. Number of old cases present

QUIZ

Relative risk can best be obtained from:
A.
B. Cohort study
C. Case control study
D. Experimental study

QUIZ

Calculate the odds ratio.

               Diseased    Not diseased
Exposed        30          20
Not exposed    20          30

A. 0.44
B. 1.5
C. 0.8
D. 2.25

QUIZ

Indirect Direct

QUIZ

If the trial comparing SuperStatin to placebo with the outcome of all-cause mortality found the following: “OR 0.5”. What would it mean?
A. The odds of death in the SuperStatin arm are 50% less than in the placebo arm.
B. There is no difference between groups.
C. The odds of death in the placebo arm are 50% less than in the SuperStatin arm.

QUIZ

“OR 0.5 95% CI 0.4-0.6” What would it mean?
A. The odds of death in the SuperStatin arm are 50% less than in the placebo arm with the true population effect between 20% and 80%.
B. The odds of death in the SuperStatin arm are 50% less than in the placebo arm with the true population effect between 40% and 60%.
C. The odds of death in the SuperStatin arm are 50% less than in the placebo arm with the true population effect between 60% and up to 10% worse.

QUIZ

“OR 0.5 95% CI 0.4-0.6 p<0.01” What would it mean?
A. The odds of death in the SuperStatin arm are 50% less than in the placebo arm with the true population effect between 60% and 40%. This result was statistically significant.
B. The odds of death in the SuperStatin arm are 50% less than in the placebo arm with the true population effect between 60% and 40%. This result was not statistically significant.
C. The odds of death in the SuperStatin arm are 50% less than in the placebo arm with the true population effect between 60% and 40%. This result was equivocal.