Basic Concepts in Epidemiology
An introduction to principles of epidemiology for psychiatric trainees
Updated 2012 Module 1 Descriptive Epidemiology
Introduction to epidemiology
Descriptive studies
Association and causation
Measures of frequency
• Prevalence
• Incidence
Case definition
Standardisation
What is epidemiology?
The study of the distribution, frequency and determinants of disease in human populations
A basic science which is the foundation of all population-based research disciplines
• health economics, health service research, clinical research
Characteristics of epidemiology
Concerned with populations, not individuals Comparative Quantifiable – numeric, statistics
Human populations
a group of persons sharing one or more variables in common variable could be age, sex, location, disease, exposure ….. or any combination
Aims of epidemiology
describe disease in populations determine aetiology of disease describe natural history of disease predict outcomes identify preventative measures assist health service planning Epidemiology and clinicians
roots are in non-clinical sciences may not be direct relevance to clinician or patient may point to socio-political interventions rather than clinical/medical interventions clinical epidemiology
Clinical epidemiology
the application by a physician who provides direct patient care of epidemiological and biometric methods to the study of the diagnostic and therapeutic processes. [It does not] constitute a distinct or isolated discipline but reflects an orientation arising from both clinical medicine & epidemiology (Sackett 1969) Epidemiological studies
distribution and frequency • descriptive, prevalence, cross sectional studies • correlational studies determinants • case-control • cohort interventions • experimental Descriptive studies
also called prevalence or cross sectional study disease status and exposure status ascertained at the same point uses survey techniques to examine populations cannot determine causal relationships useful for formulating hypothesis Correlational studies
compare entire populations comparison between different places at same time - - or same place at different times useful to form hypotheses for further testing Discussion point Association
In Denmark there is a correlation between the number of children born in a local area and the number of storks in that area Do storks really deliver children to the Danish? Storks and birth rates in 17 European countries Association and causality Assessing association
Bradford Hill criteria • consistency • strength • specificity • dose-response relationship • temporality • biological plausibility • coherence • experimental evidence Measures of frequency Measuring disease frequency
crude count rates • prevalence • incidence role of standardisation Measures of frequency What is a rate?
allows comparisons between populations of different sizes and/or at different times comprises two components - • the number of events, or numerator • the number of people, or denominator Strictly speaking a rate expresses a time interval e.g per year Measures of frequency Calculating a rate
rate = no. of events in population (numerator) / no. of people in population (denominator)
numerator and denominator must refer to the same population either numerator or denominator may not be available numerator and denominator must be comparable across all populations Measures of frequency Prevalence
a measure of the occurrence of all cases of disease in a population
prevalence = existing cases of disease in population / number of people in population
case definition
ascertainment (registers, surveys)
point v period v lifetime prevalence
suitable for chronic disease
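As a minimal sketch of the prevalence formula above, the calculation can be written in a few lines of Python. The function name and the survey counts are hypothetical and chosen only for illustration.

```python
def prevalence(existing_cases: int, population: int) -> float:
    """Existing cases divided by the number of people in the population."""
    if population <= 0:
        raise ValueError("population must be positive")
    if not 0 <= existing_cases <= population:
        raise ValueError("existing_cases must lie between 0 and population")
    return existing_cases / population


# Hypothetical figures: 150 existing cases found in a survey of 10,000 people.
point_prevalence = prevalence(150, 10_000)
print(f"{point_prevalence:.3f} ({point_prevalence * 1000:.0f} per 1,000)")
```

Numerator and denominator refer to the same population, as required for any rate.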
Discussion point Case definition
How would you define a case of dementia? How would you define a case of schizophrenia? Case definition
Most diseases exist as continua rather than discrete phenomena Comparisons require consistent definitions of caseness Particular problem for psychiatry Also consistency in social and environmental variables e.g. smoking rates, social class
Case definition
state or categorical • dichotomous variables - ill or not ill • ICD 10, DSM IV trait or dimensional • continuous variables - degrees of illness • MMSE, GHQ • often cut-offs used to convert to state Case ascertainment
structured interviews • used to collect standardised data • PSE, SADS, DIS, CIS-R, GMS diagnostic rules • used to establish consistent diagnoses • CATEGO, RDC, AGECAT International classifications • ICD10, DSMIVR
Measures of frequency Types of prevalence
Point prevalence • The proportion of the population with a disease at any specific point in time Period prevalence • The proportion of the population with a disease at any point during a defined period Lifetime prevalence • The proportion of the population who have, or have had, a disease during their lifetime • Better measure for chronic, relapsing conditions Measures of frequency Incidence
measures frequency of new cases of disease in a population over time
incidence = new cases of disease over time period / number of people at risk
case definition
ascertainment (registries, surveys)
“at risk”
suitable for acute disease
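The incidence formula above can be sketched as follows. The sketch assumes that people who already have the disease at the start of the period are not "at risk" and so are removed from the denominator; the function name and all counts are hypothetical.

```python
def cumulative_incidence(new_cases: int, population: int, existing_cases: int = 0) -> float:
    """New cases over the period divided by the number at risk at the start."""
    at_risk = population - existing_cases  # existing cases cannot become new cases
    if at_risk <= 0:
        raise ValueError("nobody is at risk")
    return new_cases / at_risk


# Hypothetical figures: 10,000 people, 150 already affected, 40 new cases in one year.
ci = cumulative_incidence(40, 10_000, existing_cases=150)
print(f"{ci * 1000:.1f} new cases per 1,000 at risk per year")
```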
Relationship between incidence & prevalence
for a disease at a steady state in a population
prevalence ≈ incidence × duration
Discussion point Prevalence and incidence
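The steady-state relationship between prevalence, incidence and duration can be illustrated with a short sketch. The incidence and duration figures are hypothetical, and the approximation is only reasonable when prevalence is low.

```python
def steady_state_prevalence(incidence_per_year: float, mean_duration_years: float) -> float:
    """Approximate prevalence as incidence multiplied by mean duration of disease."""
    return incidence_per_year * mean_duration_years


# Same hypothetical incidence (4 per 1,000 per year), different durations.
short_illness = steady_state_prevalence(0.004, 0.25)  # lasts about 3 months
long_illness = steady_state_prevalence(0.004, 10)     # lasts about 10 years

print(f"short illness: {short_illness * 1000:.0f} per 1,000")
print(f"long illness:  {long_illness * 1000:.0f} per 1,000")
```

With the same incidence, the longer-lasting condition accumulates far more existing cases, which is why prevalence suits chronic disease and incidence suits acute disease.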
[Figure: chart with time axis Year 1, 2, 3]
Special types of incidence & prevalence
prevalence • congenital malformation rates • smoking amongst teenagers incidence • teenage conceptions • hospital admission rates • mortality rates Endemic and epidemic
Conditions which exist at usually low levels over time are said to be endemic Periodically peaks of incidence occur – often seasonally – forming epidemics Occasionally very high peaks occur – associated with rapid spread – forming pandemics Measures of mortality
mortality rates often used as proxy for disease occurrence routinely counted and readily available poor proxy for diseases with great morbidity but little mortality • chronic diseases • infectious diseases
Crude and age specific rates
age specific rates may reveal diversity lost in calculating total death rates
age specific rates are used in calculating age standardised death rates
crude death rate = total number of deaths / size of population
age specific death rate = number of deaths in age range / size of population in age range
Standardisation Age standardisation
crude rates allow comparisons between populations of different sizes populations also differ in age and sex structure age (sex) standardisation modifies crude rates to account for these differences SMRs and ASDRs Standardisation Direct standardisation
age/sex-specific rates from the index population(s) are applied to the age/sex structure of a standard population the weighted average represents the rate which would have occurred in the index population(s) if they had the age/sex structure of the standard population the age standardised death rate ASDR Standardisation Direct methods - limitations
the age/sex specific rates of the index population must be known often small numbers of deaths in index populations leading to instability in rates often the indirect method is preferred Standardisation Indirect standardisation
the age/sex specific rates from a standard population are applied to the age/sex structure of the index population
the number of deaths expected in the index population is calculated
the standardised mortality ratio (SMR) is (observed deaths / expected deaths) × 100%
Standardisation Standardised Mortality Ratio
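The direct and indirect methods described above can be compared in a short sketch. All age bands, death counts, population sizes and standard rates below are hypothetical, and the function names are illustrative only.

```python
# Hypothetical index population: (deaths, population) per age band.
index_pop = {"0-39": (5, 10_000), "40-64": (20, 5_000), "65+": (90, 2_000)}

# Hypothetical standard population: age structure and age-specific death rates.
standard_pop = {"0-39": 50_000, "40-64": 30_000, "65+": 20_000}
standard_rates = {"0-39": 0.001, "40-64": 0.003, "65+": 0.030}


def direct_asdr(index_pop, standard_pop):
    """Direct method: index age-specific rates applied to the standard age structure."""
    expected_deaths = sum(
        (deaths / pop) * standard_pop[band] for band, (deaths, pop) in index_pop.items()
    )
    return expected_deaths / sum(standard_pop.values())


def smr(index_pop, standard_rates):
    """Indirect method: observed deaths over deaths expected at standard rates, x 100."""
    observed = sum(deaths for deaths, _ in index_pop.values())
    expected = sum(standard_rates[band] * pop for band, (_, pop) in index_pop.items())
    return 100 * observed / expected


print(f"ASDR: {direct_asdr(index_pop, standard_pop) * 1000:.2f} per 1,000")
print(f"SMR:  {smr(index_pop, standard_rates):.0f}")
```

An SMR above 100 means more deaths were observed than would be expected at the standard population's age-specific rates.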
The ratio of the number of events observed in the study population to the number that would be expected if the study population had the same age-sex specific rates as the standard population Break for 15 minutes Module 2 Epidemiological studies
Case control studies • Odds ratio Cohort studies • Relative risk Bias Confounding Sampling Screening • Specificity, sensitivity • Likelihood ratios Case-control study
Case and control status determined by the presence or absence of disease at the start of the study period e.g. cases have dementia and controls don’t have dementia Exposure status determined retrospectively e.g. did they smoke or not in the past
Case control studies Advantages
quick and inexpensive good for rare diseases good if there is a long latent period from exposure to disease Good for investigating multiple exposures for a single disease Case control studies Disadvantages
poor for rare exposures cannot usually calculate incidence rates poor for establishing temporal relationships prone to bias - selection and recall DEFINITION Odds
the ratio of the probability of occurrence of an event to that of non occurrence Of 100 people with a cough (cases), 60 people are smokers and 40 are not - the odds of someone with a cough being a smoker is 60:40 or 1.5 (i.e. 60/40) DEFINITION Odds ratio
the ratio of the odds in favour of exposure among the cases to the odds in favour of exposure in the controls measure of association derived from case-control study estimates relative risk if disease is rare OR > 1 indicates positive association DEFINITION Odds ratio example
Of 100 people with a cough (cases), 60 are smokers and 40 are not - the odds of someone with a cough being a smoker is 60:40 or 1.5 (i.e. 60/40) Of 100 people without a cough (controls), 20 are smokers and 80 are not – the odds of someone without a cough being a smoker is 20:80 or 0.25 (i.e. 20/80) The odds ratio is 1.5 / 0.25 = 6 Odds ratio (unmatched study)
               Diseased   Not diseased   Total
Exposed        A          B              A+B
Not exposed    C          D              C+D
Total          A+C        B+D
Odds ratio = (A / C) / (B / D) = AD / BC
Odds ratio (matched study)
                    Control exposed   Control not exposed   Total
Case exposed        A                 B                     A+B
Case not exposed    C                 D                     C+D
Total               A+C               B+D
Odds ratio = B / C
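Both odds ratio formulas can be sketched in a few lines of Python. The unmatched call reuses the cough and smoking counts from the earlier worked example; the matched-pair counts are hypothetical, as are the function names.

```python
def odds_ratio_unmatched(a: int, b: int, c: int, d: int) -> float:
    """OR = (A/C) / (B/D) = AD / BC for an unmatched 2x2 table."""
    if b == 0 or c == 0:
        raise ValueError("B and C must be non-zero")
    return (a * d) / (b * c)


def odds_ratio_matched(b: int, c: int) -> float:
    """OR = B / C, using only the discordant pairs of a matched table."""
    if c == 0:
        raise ValueError("C must be non-zero")
    return b / c


# Cough example: A = smokers with cough, B = smokers without cough,
# C = non-smokers with cough, D = non-smokers without cough.
print(odds_ratio_unmatched(a=60, b=20, c=40, d=80))  # 6.0

# Hypothetical matched study: 30 pairs where only the case was exposed,
# 10 pairs where only the control was exposed.
print(odds_ratio_matched(b=30, c=10))  # 3.0
```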
Only discordant pairs contribute to the analysis Cohort studies
allocation to groups determined by exposure to risk factor usually groups then followed prospectively to ascertain development of disease retrospective cohort studies possible Cohort studies Advantages
good where exposure is rare multiple effects of single exposure temporal relationships direct measurement of incidence in exposed and non-exposed groups Cohort studies Disadvantages
not good for rare diseases expensive and time consuming retrospectively, requires notes available losses to follow up DEFINITION Relative risk
the ratio of the risk of disease among the exposed to the risk among the unexposed estimated by cohort studies
ratio of Ie / Io
(strictly speaking the ratio of CIe / CIo) RR > 1 implies positive association Relative Risk
               Diseased   Not diseased   Total
Exposed        A          B              A+B
Not exposed    C          D              C+D
Total          A+C        B+D
Relative Risk = Ie / Io = (A / (A+B)) / (C / (C+D))
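A minimal sketch of the relative risk calculation from the 2x2 table above. The cohort counts are hypothetical and the function name is illustrative.

```python
def relative_risk(a: int, b: int, c: int, d: int) -> float:
    """RR = incidence in exposed / incidence in unexposed = (A/(A+B)) / (C/(C+D))."""
    incidence_exposed = a / (a + b)
    incidence_unexposed = c / (c + d)
    if incidence_unexposed == 0:
        raise ValueError("no disease in the unexposed group")
    return incidence_exposed / incidence_unexposed


# Hypothetical cohort: 40 of 100 exposed and 20 of 100 unexposed develop the disease.
rr = relative_risk(a=40, b=60, c=20, d=80)
print(f"RR = {rr:.2f}")
```

Unlike the odds ratio, this calculation needs the incidence in each group, which is why it is estimated from cohort studies.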
Threats to validity
Chance • Caused by RANDOM variation • Leads to an IMPRECISE measurement • Ensure sample size large enough Bias • Caused by SYSTEMATIC variation • Leads to INACCURATE measurement Confounding • Error in interpretation rather than measurement
Bias Examples of bias
a systematic error in a study that results in an incorrect estimate of association selection biases • Response bias; volunteer bias; Berkson bias. observation biases • Recall bias; reporting bias, interviewer/observer bias; instrument bias; measurement bias Analysis biases • Compliance bias; attrition bias Bias Managing bias
Randomisation Blinding and double blinding Maximise follow up Intention to treat analysis Confounding
the apparent association between risk factor & disease could be explained in whole or part by a third factor - the confounder
Risk factor Disease
Confounder Confounding: What is a confounder
the confounder is an independent risk factor for the disease the confounder is associated with the exposure under study the confounder is not simply an intermediate risk factor age and sex are frequently confounders identifying confounders is not always easy Confounding – example
Grey Hair Dementia (Risk factor) (Disease)
Age (Confounder) Confounding: Managing in study design
randomisation • matches known and unknown confounders if numbers large enough restriction • limits generalisability of study results matching • often case-control studies with small numbers of patients Confounding: Managing in analysis
stratified analysis • separate analyses for e.g. age / sex groups multivariate analysis • multiple regression models • linear or logistic regression Sampling Sampling frames
should be comprehensive (include all members), with each member represented once only examples include • post office address file (PAF) • electoral register • GP lists • hospital records • case registers Sampling
Epidemiology concerned with whole populations Usually not feasible to include everyone Sampling used to identify people to include in study Sample must be representative, unbiased and sufficiently large Sampling Sampling methods
simple random sampling systematic random sampling stratified random sampling cluster sampling quota sampling snowball sampling multistage sampling Sampling Simple random sampling
n sample subjects drawn from population size N each member of population has equal chance of selection may use “out of the hat”, random number generation every n’th name from a randomly arranged list Sampling Systematic random sampling
used when a list is not randomly arranged e.g. alphabetically arranged the starting point is random - thereafter every n’th name is chosen ensures even spread across the list can cause bias if list is arranged in a trend e.g. seniority Sampling Stratified random sampling
used to avoid inadvertent over or under representation of certain groups population divided into strata random sampling from within each strata sampling may be proportionate or disproportionate Sampling Cluster sampling
pragmatic sampling method - limits costs and time clusters of sub populations are randomly selected from the whole either all, or a sample, of the cluster are selected Sampling Quota sampling
market research sampling researchers given a target number of subjects of particular type door to door, street standing open to substantial bias – those prepared to talk not the same as those who walk by Sampling Snowball sampling
Non random form of sampling Useful where target population not known Recruitment by word of mouth Sampling Sampling in qualitative research
randomness and generalisability less important convenience sampling • easy to find, convenient, opportunistic purposive sampling • selects a group with relevant attributes snowball sampling • word of mouth, subjects recruit others Screening Purpose of screening
purpose of screening is not to identify people who have a disease but is - to identify people with a high risk of having or developing a disease screening is followed by a diagnostic procedure to identify those with disease Screening Requirements for screening
the disease • is an important health problem • has a known natural history • has a pre-clinical stage in disease the intervention • is acceptable and effective in early stage the screening procedure • is acceptable, feasible and cost effective Screening Screening test
                   Disease present   Disease absent   Total
Screen positive    a                 b                a+b
Screen negative    c                 d                c+d
Total              a+c               b+d
sensitivity = true positives = a/(a+c)
specificity = true negatives = d/(b+d)
1-sensitivity = false negatives = c/(a+c)
1-specificity = false positives = b/(b+d)
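The four quantities above can be computed directly from the cells of the screening table. This is a sketch with hypothetical counts; the function name and dictionary keys are illustrative.

```python
def screening_metrics(a: int, b: int, c: int, d: int) -> dict:
    """Sensitivity, specificity and their complements from a 2x2 screening table."""
    diseased = a + c
    not_diseased = b + d
    return {
        "sensitivity": a / diseased,              # true positive proportion
        "specificity": d / not_diseased,          # true negative proportion
        "false_negative_rate": c / diseased,      # 1 - sensitivity
        "false_positive_rate": b / not_diseased,  # 1 - specificity
    }


# Hypothetical screen: 100 people with disease (90 screen positive),
# 900 without disease (810 screen negative).
for name, value in screening_metrics(a=90, b=90, c=10, d=810).items():
    print(f"{name}: {value:.2f}")
```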
Screening Positive predictive value
the positive predictive value of a test is the ability of the test to predict the presence of disease i.e. the proportion of those screened as positive who actually have the disease This would be a/(a+b) in this example the PPV varies with the prevalence of disease in the population studied
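The dependence of PPV on prevalence can be shown with a short sketch that applies Bayes' theorem to a test of fixed sensitivity and specificity. The values of 0.9 for both, and the three prevalences, are hypothetical.

```python
def positive_predictive_value(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Proportion of screen positives who truly have the disease."""
    true_positives = sensitivity * prevalence
    false_positives = (1 - specificity) * (1 - prevalence)
    return true_positives / (true_positives + false_positives)


# Same hypothetical test (sensitivity 0.9, specificity 0.9) in three populations.
for prevalence in (0.01, 0.10, 0.50):
    ppv = positive_predictive_value(0.9, 0.9, prevalence)
    print(f"prevalence {prevalence:.0%}: PPV {ppv:.2f}")
```

In this sketch the PPV rises from about 0.08 at 1% prevalence to 0.90 at 50% prevalence, even though the test itself is unchanged.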
Screening Positive likelihood ratio
how much more likely a positive test result is in a person with the disease than in a person without the disease
= true positive rate / false positive rate
= sensitivity / (1-specificity)
= [a/(a+c)] / [b/(b+d)]
Screening Negative likelihood ratio
how much more likely a negative test result is in a person with the disease than in a person without the disease
= false negative rate / true negative rate
= (1-sensitivity) / specificity
= [c/(a+c)] / [d/(b+d)]
Receiver operator curve
dates from the war related to the ability of a receiver to respond to weak stimuli acts as a visual aid to determining the best balance between sensitivity and specificity plots sensitivity against (1-specificity) or true positives against false positives Receiver operator curves
[Figure: ROC curve; y-axis: sensitivity (0 to 1); x-axis: 1-specificity (0 to 1); diagonal labelled "no better than guess"]
Receiver operator curves
[Figure: ROC curve; y-axis: sensitivity (0 to 1); x-axis: 1-specificity (0 to 1)]
Screening Sources of bias.
lead time bias
• screen detected cases have disease diagnosed earlier and appear to have longer survival
length bias
• screen detected cases over represent those with longer pre clinical disease and therefore possibly a more benign course and better prognosis
Module 3 Community mental health
Adult Psychiatric Morbidity in England (APMS), 2007
3rd national study of psychiatric morbidity in adults – previously 1993 and 2000
data collected throughout 2007 (period prevalence rate)
Multistage stratified probability sampling design
Two stage interview process
10,000 adults in private households
350 adults with psychosis
OPCS Survey Aims
estimate prevalence of psychiatric morbidity identify social disabilities associated with mental illness ascertain service usage investigate recent stressful life events associated with mental illness investigate lifestyle indicators
OPCS Survey Sampling and interviewing
18,000 addresses from PAF
15,765 private addresses found
12,730 adults selected for interview
10,108 adults co-operated and were given the CIS-R and PSQ (schedule A)
• 1,821 adults with CIS-R score >12 or positive on PSQ (schedule B)
• 8,287 adults below threshold and negative on PSQ (schedule C)
749 adults positive on PSQ
473 agreed to SCAN
OPCS Survey Interview schedules
schedule A • general health, CIS-R, PSQ schedules B and C • longstanding illness, medication, use of health and social services (not in C) • activities of daily living, social support • stressful life events • education/employment, finances • smoking and alcohol
OPCS Survey Interviews
Clinical Interview Schedule (CIS-R) • to elicit neurotic psychopathology • 14 sections scored 0 to 4 (or 5) • threshold score 12 out of 57 Psychosis Screening Questionnaire (PSQ) • hospital care, medication etc. SCAN administered by clinician alcohol and drug questionnaire
OPCS Survey Distribution of CIS-R scores
[Figure: histogram of CIS-R scores; x-axis: CIS-R score in two-point bands from 0,1 to 28,29, then 30+; y-axis: % (0 to 50)]
OPCS Survey CIS-R scores by sex (APMS)
[Figure: bar chart of women and men by CIS-R score band (0 to 5, 6 to 11, 12 to 17, 18 +); y-axis: % (0 to 100)]
OPCS Survey CIS-R scores & social class
[Figure: bar chart of % above threshold by social class (I, II, IIIN, IIIM, IV, V); y-axis 0 to 50; data labels (bar order not recoverable): 16, 16, 18, 10, 13, 14]
OPCS Survey CIS-R scores and ethnicity
[Figure: bar chart of % above threshold by ethnicity (White, West Indian, Asian); y-axis 0 to 50; data labels (bar order not recoverable): 19, 14, 17]
OPCS Survey CIS-R scores & employment
[Figure: bar chart of % above threshold by employment status (Working F/T, Working P/T, Unemployed, Inactive); y-axis 0 to 50; data labels (bar order not recoverable): 23, 20, 10, 14]
OPCS Survey CIS-R scores & rurality
[Figure: bar chart of % above threshold by rurality (Urban, Rural); y-axis 0 to 50; data labels (bar order not recoverable): 16, 10]
OPCS Survey CIS-R scores & marriage
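[Figure: bar chart of % above threshold for men and women by marital status (Married, Single, Widowed, Divorced, Separated); y-axis 0 to 50; data labels (bar order not recoverable): 29, 29, 26, 22, 18, 20, 15, 17, 11, 10]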
OPCS Survey Diagnosis and sex (APMS)
Rates per 1000 pop; time periods - in last week for neurotic illnesses and in last 12 months for psychosis, alcohol and drug
[Figure: bar chart of rates per 1000 for men and women by diagnosis (Anx/Dep, GAD, Depression, Phobic, OCD, Panic, Psychosis, Harmful alcohol, Drug dependence); y-axis 0 to 120]
OPCS Survey Diagnosis and ethnicity
Rates per 1000 pop; time periods - in last week for neurotic illnesses and in last 12 months for psychosis, alcohol and drug
[Figure: bar chart of rates per 1000 for White, West Indian and Asian groups by diagnosis (Anx/Dep, GAD, Depression, Phobic, OCD, Panic, Psychosis, Harmful alcohol, Drug dependence); y-axis 0 to 140]
OPCS Survey Questions Quiz
Studies that produce basic estimates of the rates of disorder in a general population and its subgroups are: • A. Qualitative epidemiology • B. Analytic epidemiology • C. Experimental epidemiology • D. Descriptive epidemiology Quiz
The number of new cases that occur within a specific population within a defined time interval is: • A. Point Prevalence • B. Incidence • C. Period prevalence • D. Lifetime Prevalence Quiz
A systematic method for continuous monitoring of diseases in a population, in order to be able to detect changes in disease patterns and then to control them is: • A. Conditional probability • B. Screening • C. Prevalence • D. Surveillance Quiz
In epidemiology research, if the relative risk is greater than 1.0, the group with the suspected risk factor: • A. Has a lower incidence rate of the disorder. • B. Has a higher incidence rate of the disorder. • C. Has no relationship with the risk factor. • D. None of the above Quiz
Number of births divided by total population is the: • A. Crude birth rate • B. General fertility rate • C. Age-specific fertility rates • D. Total period fertility rate Quiz
The statistic used to explain the chances of being exposed to a risk among those with the diagnosis divided by exposure to the risk among those without the diagnosis is the: • A. Phi coefficient • B. Odds ratio • C. Chi square • D. Kappa Quiz
A useful measure of lethality of an acute infectious disease is: • A. Attack rate • B. Incidence rate • C. Case fatality rate • D. Mortality rate Quiz
In an outbreak of cholera in a village of 2,000 population, 20 cases have occurred and 5 died. The case fatality rate is: • A. 1% • B. 0.25% • C. 5% • D. 25% Quiz
Descriptive epidemiology is study in relation to: • A. Time • B. Place • C. Person • D. All of the above Quiz
When launching a study many respondents are invited, some of whom fail to come. This is called: • A. Response bias • B. Volunteer bias • C. Selection bias • D. Berksonian bias Quiz
The ratio between the incidence of disease among exposed and non-exposed is called: • A. Causal risk • B. Attributable risk • C. Relative risk • D. Odds ratio Quiz
Which is false about a cohort study? • A. Incidence can be measured • B. Used to study chronic diseases • C. Expensive • D. Always prospective Quiz
Prevalence of disease in a community can be estimated by a: • A. Case control study • B. Cohort study • C. Cross-sectional study • D. Experimental study Quiz
A sampling method which involves a random start and then proceeds with the selection of every kth element from then onwards (where k= population size/sample size): • A. Simple random sampling • B. Stratified random sampling • C. Systematic sampling • D. Snowball sampling Quiz
Data collection about everyone or everything in group or population and has the advantage of accuracy and detail: • A. Census • B. Survey • C. Probability sampling • D. Cluster sampling Quiz
Randomization is useful to eliminate: • A. Observer bias • B. Confounding factors • C. Recall bias • D. Attrition bias Quiz
The criteria for validity of a screening test is: • A. Accuracy • B. Predictability • C. Sensitivity and specificity • D. Cost effectiveness Quiz
Berksonian bias refers to: • A. Different rates of admission to the hospital • B. Interviewer bias • C. Systemic sampling • D. Systematic difference in characteristics between cases and controls Quiz
Study of a person who has already contracted the disease is called: • A. Case control • B. Cohort • C. Control cohort • D. Longitudinal Quiz
Incidence is defined as: • A. Number of cases existing in a given population at a given moment • B. Number of cases existing in a given period • C. Number of new cases occurring during a specific period • D. Number of old cases present Quiz
Relative risk can best be obtained from: • A. Case study • B. Cohort study • C. Case control study • D. Experimental study Quiz
Calculate the odds ratio.
               Diseased   Not diseased
Exposed        30         20
Not exposed    20         30
• A. 0.44 • B. 1.5 • C. 0.8 • D. 2.25
Quiz
Case control study is most suitable for: • A. Finding rare cause • B. Finding multiple risk factors • C. Finding incidence rate • D. Finding morbidity rates