
Introduction to for Clinical Researchers

University of Kansas Department of Biostatistics & University of Kansas Medical Center Department of Internal Medicine Materials

ƒ PowerPoint files can be downloaded from the Department of Biostatistics website at http://biostatistics.kumc.edu

ƒ A link to the recorded lectures will be posted in the same location Topics

ƒ Comparing Two (or more) Independent Proportions

ƒ Measures of Association for Dichotomous Data

ƒ

ƒ Study Design Considerations

Recall

Number of Samples | Parameter | Null Hypothesis (HO) | Test | Required Assumptions
1 | Mean | μ = μO | z | Normality, known variance
1 | Mean | μ = μO | t | Normality, unknown variance
2 | Difference in means (1) | μ1 − μ2 = δ | z | Normality, known variances
2 | Difference in means (1) | μ1 − μ2 = δ | t | Normality, unknown but equal variances
2 | Difference in means (1) | μ1 − μ2 = δ | t | Normality, unknown variances
1 | (Paired) difference | μD = 0 | t | Normality, unknown variance
K > 2 | Equality of means | μ1 = μ2 = . . . = μk | F | Normality, equal variances
1 | Proportion | p = pO | χ², z | Sample size considerations
2 | Difference in proportions (1) | p1 − p2 = δ | χ², z, Fisher’s Exact | Sample size considerations
K > 2 | Equality of proportions | p1 = p2 = . . . = pk | χ² | Sample size considerations

(1) Usually δ = 0

Comparing Proportions Between Two Independent Populations

In This Section

ƒ CIs for difference in proportions between two independent populations

ƒ Large sample methods for comparing proportions ƒ Normal approximation method ƒ Chi-square test

ƒ Fisher’s Exact Test

ƒ Measures of association Comparing Two Proportions

ƒ Pediatric AIDS Clinical Trials Group (ACTG) Protocol 076 Study Group (1)

ƒ Study Design
  ƒ “We conducted a randomized, double-blinded, placebo-controlled trial of the efficacy and safety of zidovudine (AZT) in reducing the risk of maternal-infant HIV transmission”
  ƒ 363 HIV infected pregnant women were randomized to AZT or placebo

(1) Connor, E., et al. (1994). Reduction of maternal-infant transmission of human immunodeficiency virus type 1 with zidovudine treatment, NEJM, 331:18

Comparing Two Proportions

ƒ Results
  ƒ Of the 180 women randomized to AZT, 13 gave birth to children who tested positive for HIV within 18 months of birth
  ƒ Of the 183 women randomized to placebo, 40 gave birth to children who tested positive for HIV within 18 months of birth

Notes

ƒ Randomization of treatment
  ƒ Helps ensure the two groups are comparable
  ƒ Patient and physician could not request a particular treatment

ƒ Double-blind
  ƒ Patient and physician did not know treatment assignment

Observed HIV Transmission Rates

ƒ AZT

$$\hat{p}_{AZT} = \frac{13}{180} = 0.07 \ (7\%)$$

ƒ PLA

$$\hat{p}_{PLA} = \frac{40}{183} = 0.22 \ (22\%)$$

Notes

ƒ Is the difference significant or can it be explained by chance?

ƒ CI on the difference in proportions? P-value?

Distribution of Difference in Sample Proportions

ƒ Since we have large samples, the sampling distributions of the sample proportions in both groups are approximately normal (CLT)

ƒ It turns out that the difference of two quantities that are approximately normally distributed is also approximately normally distributed

Sampling Distribution of Difference in Sample Proportions

ƒ The sampling distribution of the difference of two sample proportions (based on large samples) is approximately normal
ƒ This distribution is centered at the true, unknown population difference p1 − p2

[Figure: sampling distributions for the AZT group, the placebo group, and the difference AZT − PLA]

95% CIs for Difference in Proportions

ƒ General formula: Best estimate ± multiplier*SE(best estimate)

ƒ The best estimate of a population difference in proportions is the difference in sample proportions: $\hat{p}_1 - \hat{p}_2$

ƒ Here, $\hat{p}_1$ may represent the sample proportion of infants HIV positive for 180 infants in the AZT group and $\hat{p}_2$ may represent the same for the 183 infants in the placebo group

95% CI: AZT Study

$$\hat{p}_1 - \hat{p}_2 \pm \text{multiplier} \times SE(\hat{p}_1 - \hat{p}_2)$$

ƒ Statisticians have developed formulas for the standard error of the difference

ƒ These formulas depend on both sample sizes and sample proportions

$$SE(\hat{p}_1 - \hat{p}_2) = \sqrt{\frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}}$$

$$\widehat{SE}(\hat{p}_1 - \hat{p}_2) = \sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}$$

HIV/AZT Study

ƒ Recall the data:

Group               AZT     PLA
N                   180     183
Sample proportion   0.07    0.22

$$\widehat{SE}(\hat{p}_1 - \hat{p}_2) = \sqrt{\frac{0.07(0.93)}{180} + \frac{0.22(0.78)}{183}} = 0.036$$

95% CI: AZT Study

ƒ The 95% CI for the true difference in proportions between the AZT and PLA groups is:

$$-0.15 \pm 1.96\,\widehat{SE}(\hat{p}_1 - \hat{p}_2)$$

$$-0.15 \pm 1.96(0.036)$$

$$(-0.22,\ -0.08)$$

Summary

ƒ Results
  ƒ The proportion of infants who tested positive for HIV within 18 months of birth was seven percent (95% CI 4-12%) in the AZT group and twenty-two percent (95% CI 16-28%) in the placebo group
  ƒ The study results estimate the absolute decrease in the proportion of HIV positive infants born to HIV positive mothers associated with AZT to be as low as 8% and as high as 22%

Two-sample z-test: Getting a p-value

ƒ Are the proportions of infants contracting HIV within 18 months of birth equivalent at the population level for those whose mothers are treated with AZT versus untreated?

ƒ HO: p1 = p2

ƒ HA: p1 ≠ p2

ƒ In other words, is the expected difference in proportions zero?

ƒ HO: p1 − p2 = 0

ƒ HA: p1 − p2 ≠ 0

Hypothesis Test to Compare Two Independent Proportions

ƒ Recipe:

ƒ Assume HO is true

ƒ Measure the distance of our sample result from the null value of the difference (here, it’s 0)
ƒ Compare the distance (test statistic) to the appropriate distribution to get the p-value

$$z = \frac{\text{observed difference} - \text{null difference}}{SE(\text{observed difference})} = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}}$$

HIV/AZT Study

ƒ Recall,

$$\hat{p}_1 - \hat{p}_2 = -0.15$$

$$\widehat{SE}(\hat{p}_1 - \hat{p}_2) = 0.036$$

ƒ So in this study,

$$z = \frac{-0.15}{0.036} = -4.2$$

ƒ This result was 4.2 SE below the null mean of 0

P-values

ƒ Is a result 4.2 standard errors below 0 unusual? ƒ It depends on the distribution we’re dealing with

ƒ The p-value is the probability of getting a test statistic as or more extreme than what we observed (-4.2) by chance

ƒ The p-value comes from the sampling distribution of the difference in two sample proportions

ƒ What is the sampling distribution of the difference in sample proportions?
  ƒ If both groups are large, it is approximately normal
  ƒ It is centered at the true difference
  ƒ Under the null, this true difference is 0

HIV/AZT Study

ƒ To compute a p-value, we need to compute the probability of being 4.2 or more SEs away from 0 on a standard normal curve

HIV/AZT Study

ƒ If we were to look this up in a normal table, we would find a very small p (p < 0.001)
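The interval and z-test arithmetic for the HIV/AZT data can be reproduced with a short Python sketch. The function name `two_proportion_summary` and its arguments are illustrative and not part of the lecture materials. The sketch uses the unrounded proportions 13/180 and 40/183, so its results differ slightly from the rounded slide values (z comes out at roughly −4.05 rather than −4.2); the conclusion is the same.

```python
from math import sqrt, erfc

def two_proportion_summary(x1, n1, x2, n2, multiplier=1.96):
    """Large-sample CI and two-sided z-test for p1 - p2 (unpooled SE, as on the slides)."""
    p1, p2 = x1 / n1, x2 / n2
    diff = p1 - p2
    # Estimated standard error of the difference in sample proportions
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    ci = (diff - multiplier * se, diff + multiplier * se)
    z = diff / se  # null difference is 0
    # Two-sided tail area of the standard normal: P(|Z| >= |z|)
    p_value = erfc(abs(z) / sqrt(2))
    return diff, se, ci, z, p_value

# 13 of 180 (AZT) and 40 of 183 (placebo) infants tested HIV positive
diff, se, ci, z, p_value = two_proportion_summary(13, 180, 40, 183)
print(f"difference = {diff:.3f}, SE = {se:.3f}")
print(f"95% CI = ({ci[0]:.3f}, {ci[1]:.3f}), z = {z:.2f}, p = {p_value:.5f}")
```

The unpooled standard error is used because that is the formula shown in the lecture; a pooled standard error under the null is a common alternative for the test itself.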

ƒ This method is also essentially equivalent to the chi-square (χ²) method

Displaying 2X2 Data

ƒ Data of this sort can be displayed using a contingency table

The Chi-Square Test

ƒ Testing equality of two population proportions using data from two samples

ƒ HO: p1 = p2

ƒ HA: p1 ≠ p2

ƒ In other words, is the expected difference in proportions zero?

ƒ HO: p1 − p2 = 0

ƒ HA: p1 − p2 ≠ 0

ƒ In the context of a 2X2 table, this is testing whether there is a relationship between the row variable (HIV status) and the column variable (treatment type)--the test of independence Chi-Square Test

ƒ Pearson’s Chi-Square Test (χ2) can easily be done by hand

ƒ Works well for “big” sample sizes--it is an approximate method

ƒ Gives essentially the same p as the z test for comparing two proportions

ƒ Unlike the z-test, it can be extended to compare proportions between more than two independent groups in one test The Chi-Square Method

ƒ Looks at discrepancies between observed and expected cell counts (under the null hypothesis) in a 2X2 table
  ƒ O = observed
  ƒ E = expected = (row total*column total)/grand total

ƒ “Expected” refers to the values for the cell counts that would be expected if the null hypothesis is true (the expected cell proportions if the underlying population proportions are equal)

The Chi-Square Method

ƒ Recipe . . .
  ƒ Start by assuming the null hypothesis is true
  ƒ Measure the distance of the sample result from the null value
  ƒ Compare the test statistic (distance) to the appropriate distribution to get the p-value

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

ƒ The sampling distribution of this statistic when the null is true is a chi-square distribution with one degree of freedom

Chi-Square (1)

Contingency Table

ƒ The observed value for cell one is 13

ƒ We have to calculate the expected count:

$$E = \frac{R \times C}{N} = \frac{53(180)}{363} = 26.3$$

Expected Values

ƒ We could do the same for the other three cells:

ƒ Now we must compute the ‘distance’ (test statistic), χ2

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

Expected Values

$$\chi^2 = \sum \frac{(O - E)^2}{E} = \frac{(13 - 26.3)^2}{26.3} + \frac{(40 - 26.7)^2}{26.7} + \frac{(167 - 153.7)^2}{153.7} + \frac{(143 - 156.3)^2}{156.3} = 15.6$$

Sampling Distribution
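A minimal Python sketch of the hand calculation, assuming the 2X2 layout with HIV status in rows and treatment in columns. The function name `chi_square_2x2` is hypothetical. The p-value relies on the fact that a chi-square variable with one degree of freedom is a squared standard normal, so its upper tail area equals `erfc(sqrt(x / 2))`.

```python
from math import erfc, sqrt

def chi_square_2x2(table):
    """Pearson chi-square for a 2x2 table given as [[a, b], [c, d]] (no continuity correction)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    # Expected count for each cell: (row total * column total) / grand total
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    # Upper tail of chi-square with 1 degree of freedom
    p_value = erfc(sqrt(statistic / 2))
    return expected, statistic, p_value

# Rows: HIV positive, HIV negative; columns: AZT, placebo
observed = [[13, 40], [167, 143]]
expected, chi2, p_value = chi_square_2x2(observed)
print(f"expected count for cell one = {expected[0][0]:.1f}")
print(f"chi-square = {chi2:.1f}, p = {p_value:.5f}")
```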

ƒ P = 0.0001 Extending Chi-Square Test

ƒ The chi-square test can be extended to test for differences in proportions across more than two independent populations
ƒ Analogous to ANOVA with binary outcomes

Recall: HIV/AZT Example

ƒ Testing equality of two population proportions using data from two samples

ƒ HO: p1 = p2

ƒ HA: p1 ≠ p2

ƒ In other words, is the expected difference in proportions zero?

ƒ HO: p1 − p2 = 0

ƒ HA: p1 − p2 ≠ 0

ƒ In the context of a 2X2 table, this is testing whether there is a relationship between the row variable (HIV status) and the column variable (treatment type)--the test of independence Statistical Methods

ƒ Pearson’s Chi-Square Test or the Normal Approximation (z) Method
  ƒ Both are based on the Central Limit Theorem kicking in
  ƒ Both are approximate methods, but are excellent approximations if the sample sizes are large
  ƒ These do not perform as well in smaller samples

Statistical Methods

ƒ Fisher’s Exact Test
  ƒ Calculations are difficult
  ƒ Always appropriate, regardless of sample size
  ƒ Usually involves computer effort
  ƒ Exact p-values are given (no approximations)

ƒ Rationale
  ƒ Suppose the null is true--that there is no association between AZT and transmission
  ƒ Imagine putting 53 red balls (the infected) and 310 green balls (the non-infected) in a jar
  ƒ Randomly select 180 of them (these are the AZT group, the rest are the placebo group)

Fisher’s Exact Test
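The jar analogy corresponds to a hypergeometric probability, which a short Python sketch can evaluate exactly. The function name `hypergeom_lower_tail` and its parameter names are illustrative; it sums the probabilities of drawing 0, 1, ..., 13 red balls when 180 balls are drawn without replacement from 53 red and 310 green.

```python
from math import comb

def hypergeom_lower_tail(k_observed, reds, greens, draws):
    """P(number of red balls drawn <= k_observed) when drawing without replacement."""
    total_ways = comb(reds + greens, draws)
    favorable = sum(
        comb(reds, k) * comb(greens, draws - k)  # ways to draw exactly k reds
        for k in range(0, k_observed + 1)
    )
    return favorable / total_ways

# 53 infected infants (red), 310 non-infected (green), 180 mothers assigned to AZT
one_sided_p = hypergeom_lower_tail(13, reds=53, greens=310, draws=180)
print(f"one-sided exact p = {one_sided_p:.6f}")
```

Exact integer arithmetic is used until the final division, which avoids the rounding problems that arise when very large binomial coefficients are handled as floating-point numbers.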

ƒ The one-sided p-value from this test is the probability that you get 13 (what we actually observed in the AZT group) or fewer red balls among the 180
ƒ For this example, a computer will compute the exact p: 0.0001

Measures of Association for Binary Data

Risk Difference

ƒ Risk difference is the difference in proportions, p1 − p2

ƒ Ex: the difference in risk of HIV for children born to HIV+ mothers taking AZT relative to those taking placebo is

$$\hat{p}_1 - \hat{p}_2 = 0.07 - 0.22 = -0.15$$

ƒ If AZT was given to 1000 infected pregnant women, it would reduce the number of transmissions by an estimated 150

ƒ 95% CI: Study results suggest that the reduction in HIV positive births from 1,000 HIV positive pregnant women treated with AZT could range from 75 to 220 fewer than the number occurring if the 1,000 women were not treated

Relative Risk

ƒ Relative Risk (risk ratio)--ratio of proportions p1/p2

ƒ Ex: The risk of HIV with AZT vs. PLA

$$\frac{\hat{p}_1}{\hat{p}_2} = \frac{0.07}{0.22} = 0.32$$

ƒ The risk of transmission with AZT is about 1/3 the risk with placebo

ƒ An HIV+ woman can reduce her personal risk of transmitting the virus by nearly 70% if she takes AZT

ƒ 95% CI: Study results suggest that this reduction in risk could be as small as 40% and as large as 82% The Risk Difference vs. Relative Risk

ƒ The risk difference (attributable risk) provides a measure of the public health impact of an exposure (assuming causality)

ƒ The relative risk provides a measure of the magnitude of the disease-exposure association for an individual

ƒ Each provides a different piece of information about the “story”

ƒ AZT example—in this study 22% of the untreated mothers gave birth to children with HIV ƒ Relative risk: .32 ƒ Risk difference: -15% The Risk Difference vs. Relative Risk

ƒ Suppose that only 2% of the children born to untreated HIV positive women became HIV positive

ƒ Suppose the percentage in AZT treated women is .6% ƒ Relative risk: .32 ƒ Risk difference: -1.4 % The Risk Difference vs. Relative Risk

ƒ Suppose that 90% of the children born to untreated HIV positive women became HIV positive

ƒ Suppose the percentage in AZT treated women is 75% ƒ Relative risk: .83 ƒ Risk difference: -15 % What Is an Odds?

ƒ Like the relative risk, the odds ratio provides a measure of association in a ratio (as opposed to difference)

ƒ The odds ratio is a function of risk (prevalence)

ƒ Odds is the ratio of the risk of having an outcome to the risk of not having an outcome
ƒ If p represents the risk of an outcome, then the odds are given by:

$$\text{odds} = \frac{p}{1 - p}$$

Example

ƒ In the AZT example, the estimated risk of giving birth to an HIV infected child among mothers treated with AZT was 0.07

ƒ The corresponding odds estimate is

$$\widehat{\text{odds}} = \frac{\hat{p}_1}{1 - \hat{p}_1} = \frac{0.07}{0.93} = 0.08$$

Example

ƒ In the AZT example, the estimated risk of giving birth to an HIV infected child among mothers not treated was 0.22

ƒ The corresponding odds estimate is

$$\widehat{\text{odds}} = \frac{\hat{p}_2}{1 - \hat{p}_2} = \frac{0.22}{0.78} = 0.28$$

AZT/Mother-Infant Transmission Example

ƒ The estimated odds ratio of an HIV birth with AZT relative to placebo

$$\widehat{OR} = \frac{\hat{p}_1 / (1 - \hat{p}_1)}{\hat{p}_2 / (1 - \hat{p}_2)} = \frac{0.08}{0.28} = 0.28$$

ƒ The odds of HIV transmission with AZT are .28 (about 1/3) the odds of transmission with placebo

Odds Ratio

ƒ Interpretation ƒ AZT is associated with an estimated 72% (estimated OR = .28) reduction in odds of giving birth to an HIV infected child among HIV infected pregnant women ƒ Study results suggest that this reduction in odds could be as small as 46% and as large as 86% (95% CI on odds ratio, .14 – .54) Odds Ratio

ƒ What about a p-value?

ƒ What value of odds ratio indicates no difference in risk?

ƒ If p1 = p2 then

$$OR = \frac{p_1 / (1 - p_1)}{p_2 / (1 - p_2)} = 1$$

ƒ Thus, we need to test

ƒ HO: OR = 1 versus HA: OR ≠ 1
ƒ We can use the same test! (Chi-square)

Hypothesis of Equal Proportions

ƒ Equivalent hypotheses:

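$$H_0: p_1 - p_2 = 0 \qquad H_0: \frac{p_1}{p_2} = 1 \qquad H_0: \frac{p_1 / (1 - p_1)}{p_2 / (1 - p_2)} = 1$$

ƒ The OR and RR will always estimate the same direction of association

$$\widehat{OR} < 1 \iff \widehat{RR} < 1 \qquad \widehat{OR} > 1 \iff \widehat{RR} > 1 \qquad \widehat{OR} = 1 \iff \widehat{RR} = 1$$

How does OR compare to RR?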

ƒ If CI for OR does not include 1, CI for RR will not include 1

ƒ If CI for OR includes 1, CI for RR will include 1

ƒ The lower the risk in both groups being compared, the more similar the two measures will be in magnitude The Odds Ratio versus Relative Risk
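How the odds ratio tracks the relative risk at different baseline risks can be explored with a small Python sketch. The function name `association_measures` is illustrative, and the inputs are the rounded risks quoted in the lecture; because of rounding, results can differ from the slide figures in the second decimal place (for example, using the counts 13/180 and 40/183 gives an odds ratio near 0.28, while the rounded risks 0.07 and 0.22 give about 0.27).

```python
def association_measures(p1, p2):
    """Risk difference, relative risk, and odds ratio for risks p1 (exposed) and p2 (unexposed)."""
    risk_difference = p1 - p2
    relative_risk = p1 / p2
    odds1 = p1 / (1 - p1)
    odds2 = p2 / (1 - p2)
    odds_ratio = odds1 / odds2
    return risk_difference, relative_risk, odds_ratio

scenarios = [
    ("AZT study (rounded risks)", 0.07, 0.22),
    ("High baseline risk", 0.75, 0.90),
]
for label, p1, p2 in scenarios:
    rd, rr, odds_ratio = association_measures(p1, p2)
    print(f"{label}: RD = {rd:+.2f}, RR = {rr:.2f}, OR = {odds_ratio:.2f}")
```

Both ratios fall on the same side of 1 in each scenario, but they diverge in magnitude as the risks in the two groups grow.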

ƒ AZT example--in this study 7% of AZT treated mothers and 22% of the untreated mothers gave birth to children with HIV
  ƒ RR = 0.32
  ƒ OR = 0.28

ƒ Suppose only 2% of the children born to untreated HIV positive women became HIV positive and that the percentage in AZT treated women is 0.6%
  ƒ RR = 0.32
  ƒ OR = 0.30

ƒ Suppose 90% of the children born to untreated women became HIV positive and that the percentage in AZT treated women is 75%
  ƒ RR = 0.83
  ƒ OR = 0.33

Why Bother with OR?

ƒ It is less intuitively interpretable than relative risk

ƒ However, with certain types of non-randomized study designs we can not get a valid estimate of RR but can still get a valid estimate of OR Survival Analysis Survival Analysis

ƒ Survival Analysis is the class of statistical methods for studying the occurrence and timing of events

ƒ The event (to name a few) could be ƒ tumor-free time ƒ development of a disease ƒ response to treatment ƒ relapse ƒ local control ƒ death

ƒ Survival analysis methods are most often applied to the study of deaths Basic Concepts

ƒ Survival Time: the time from a well-defined point in time (time origin) to the occurrence of a given event

ƒ Survival Data: ƒ survival time ƒ response to a given treatment ƒ patient characteristics related to response, survival, and the development of a disease Basic Concepts – Basic Survival Data Structure Basic Concepts

ƒ Survival data has two features that are difficult to handle with “conventional” statistical methods: ƒ censoring ƒ time-dependent covariates

ƒ Many consider survival data analysis as the application of conventional statistical methods when the data is censored

ƒ Censoring comes in many forms and occurs for many different reasons

ƒ The most common and basic types of censored data are Types I-III right-censored data Basic Concepts

ƒ Type I Censoring (also called singly-censored):
  ƒ Subjects are observed for a fixed period of time (under control of the investigator); at the end of the fixed period all remaining subjects are sacrificed/censored [e.g., at the end of the 3-year study, all subjects who have not died are censored at time 3 years]
  ƒ If there are no accidental losses, all censored observations equal the length of the study period

[Figure: follow-up timelines for subjects A-D on a time axis from 0 to 3, with events marked x]

Basic Concepts

ƒ Type II Censoring (also called singly-censored): ƒ Subjects are observed until a fixed pre-specified number of events have occurred; after that the remaining subjects are sacrificed/censored [e.g., experiment with 100 lab rats will stop when 50 of them have died] ƒ If there are no accidental losses, the censored observations equal the largest uncensored observation Basic Concepts

ƒ Type III Censoring occurs when: ƒ subjects are terminated for reasons that are not under the control of the investigator (e.g., withdrawals, loss to follow-up) ƒ there is a single termination time, but entry times vary randomly across subjects (e.g., patients receive heart surgery at various time points, typically uncontrolled by the investigator, but the study has to be terminated on a single date; all patients still alive on that date will be censored, but their survival times from surgery will vary)

[Figure: follow-up timelines for subjects A-D on a time axis from 0 to 10, with observations marked x and o]

Basic Concepts

ƒ In most clinical studies the length of study period is fixed and the patients enter the study at different times ƒ “Lost-to-follow-up” patients’ survival times are measured from the study entry (well-defined point in time) until last contact [these will be censored observations] ƒ Patients still alive at the termination date will have survival times equal to the time from the study entry until study termination [these will be censored observations] ƒ When there are no censored survival times, the set is said to be complete Functions of Survival Time

ƒ Let T = survival time

ƒ The distribution of T can be described by the following functions:
  ƒ Survival function: the probability that an individual survives longer than t: S(t) = P(an individual survives longer than t) = P(T > t)

ƒ S(t) is a non-increasing (and usually decreasing) function of time t with the following properties: S(t) = 1 for t = 0 and S(t) = 0 for t = ∞

ƒ S(t) is also known as the cumulative survival rate Functions of Survival Time

ƒ If there are no censored observations, the survival function is estimated as the proportion of patients surviving longer than t:

$$\hat{S}(t) = \frac{\#\text{ of patients surviving longer than } t}{\text{total } \#\text{ of patients}}$$

Functions of Survival Time
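The proportion-surviving estimate for uncensored data can be sketched in a few lines of Python. The function name `empirical_survival` and the survival times are hypothetical and chosen only for illustration; the estimate is not valid when any observations are censored.

```python
def empirical_survival(times, t):
    """Estimate S(t) as the proportion of patients surviving longer than t (no censoring)."""
    return sum(1 for x in times if x > t) / len(times)

# Hypothetical, fully observed survival times in months
survival_months = [2, 5, 5, 8, 12, 15, 20, 24, 30, 36]

for t in (0, 6, 12, 24, 36):
    print(f"S({t}) = {empirical_survival(survival_months, t):.2f}")
```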

ƒ Probability Density Function: The survival time T has a probability density function defined as the limit of the probability that an individual fails in the short interval (t, t + Δt) per unit width Δt:

$$f(t) = \lim_{\Delta t \to 0} \frac{P(\text{an individual dying in the interval } (t, t + \Delta t))}{\Delta t}$$

ƒ The graph of f(t) is called the density curve

Functions of Survival Time

ƒ f(t) has the following two properties:

ƒ f(t) is a non-negative function: f(t) ≥ 0 for all t ≥ 0, and f(t) = 0 for all t < 0

ƒ The area under the density curve equals one

Functions of Survival Time

ƒ If there are no censored observations, the probability density function (p.d.f.), f(t), is estimated as the proportion of patients dying in an interval per unit width:

$$\hat{f}(t) = \frac{\#\text{ of patients dying in the interval beginning at time } t}{(\text{total } \#\text{ of patients})(\text{interval width})}$$

ƒ The density function is also known as the unconditional failure rate

Functions of Survival Time

ƒ Hazard Function: The hazard function h(t) of survival time T gives the conditional failure rate

ƒ It is defined as the probability of failure during a very small time interval, assuming the individual has survived to the beginning of the interval:

$$h(t) = \lim_{\Delta t \to 0} \frac{P\{\text{an individual of age } t \text{ fails in the time interval } (t, t + \Delta t)\}}{\Delta t}$$

Functions of Survival Time

ƒ The hazard function can also be expressed in terms of the cumulative distribution function F(t) and the p.d.f. f(t):

$$h(t) = \frac{f(t)}{1 - F(t)}$$

ƒ This is also known as the instantaneous failure rate, force of mortality, conditional mortality rate, or age-specific failure rate

ƒ The hazard at any t corresponds to the risk of event occurrence at time t [e.g., a patient’s hazard for contracting influenza is 0.015 [1.2] with time measured in months, i.e., this patient would expect to contract influenza 0.015 [1.2] times over the course of a month assuming the hazard stays constant] Functions of Survival Time

ƒ If there are no censored observations, the hazard function is estimated as the proportion of patients dying in an interval per unit time, given that they have survived to the beginning of the interval:

$$\hat{h}(t) = \frac{\#\text{ of patients dying in the interval beginning at time } t}{(\#\text{ of patients surviving at } t)(\text{interval width})} = \frac{\#\text{ of patients dying per unit time in the interval}}{\#\text{ of patients surviving at } t}$$

Functions of Survival Time

ƒ Actuaries usually use the average hazard rate of the interval in which the number of patients dying per unit time in the interval is divided by the average number of survivors at the midpoint of the interval:

$$\hat{h}(t) = \frac{\#\text{ of patients dying per unit time in the interval}}{(\#\text{ of patients surviving at } t) - \frac{1}{2}(\#\text{ of deaths in the interval})}$$

Functions of Survival Time
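The interval-based estimates of f(t) and h(t) for uncensored data, including the actuarial version, can be sketched in Python as follows. The function name `interval_estimates` and the survival times are hypothetical. The sketch assumes "surviving at t" means alive at the start of the interval, and it does not guard against an interval with nobody at risk.

```python
def interval_estimates(times, start, width):
    """Density, hazard, and actuarial hazard estimates for the interval [start, start + width)."""
    n_total = len(times)
    # Patients still alive at the beginning of the interval
    n_surviving = sum(1 for x in times if x >= start)
    # Patients dying within the interval
    n_deaths = sum(1 for x in times if start <= x < start + width)
    density = n_deaths / (n_total * width)
    hazard = n_deaths / (n_surviving * width)
    # Actuarial version: divide by the average number of survivors at the interval midpoint
    actuarial_hazard = (n_deaths / width) / (n_surviving - 0.5 * n_deaths)
    return density, hazard, actuarial_hazard

# Hypothetical, fully observed survival times in months
survival_months = [2, 5, 5, 8, 12, 15, 20, 24, 30, 36]

for start in (0, 12, 24):
    f_hat, h_hat, h_act = interval_estimates(survival_months, start, width=12)
    print(f"[{start}, {start + 12}): f = {f_hat:.4f}, h = {h_hat:.4f}, actuarial h = {h_act:.4f}")
```

The hazard estimates rise across intervals even where the density estimate falls, because the hazard conditions on the shrinking group still at risk.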

ƒ The cumulative hazard function is defined as

$$H(t) = \int_0^t h(x)\,dx = -\log_e S(t)$$

ƒ Thus,
  at t = 0: S(t) = 1, H(t) = 0
  at t = ∞: S(t) = 0, H(t) = ∞

Relationships among the Survival Functions
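The relationships among S(t), f(t), h(t), and H(t) can be checked numerically for a concrete distribution. The sketch below assumes a Weibull survival time with hypothetical shape and scale values, which is an illustrative choice and not something used in the lecture; the density is obtained by numerically differentiating S(t) so that the checks are not circular.

```python
from math import exp, log

SHAPE, SCALE = 1.5, 10.0  # hypothetical Weibull parameters

def S(t):
    """Survival function of the assumed Weibull distribution."""
    return exp(-((t / SCALE) ** SHAPE))

def H(t):
    """Cumulative hazard of the assumed Weibull distribution."""
    return (t / SCALE) ** SHAPE

def h(t):
    """Hazard function of the assumed Weibull distribution."""
    return (SHAPE / SCALE) * (t / SCALE) ** (SHAPE - 1)

def f_numeric(t, eps=1e-5):
    """Density as -S'(t), approximated with a central difference."""
    return -(S(t + eps) - S(t - eps)) / (2 * eps)

t = 5.0
print(f"f/S          = {f_numeric(t) / S(t):.6f}   h          = {h(t):.6f}")
print(f"-log S       = {-log(S(t)):.6f}   H          = {H(t):.6f}")
print(f"h * exp(-H)  = {h(t) * exp(-H(t)):.6f}   f          = {f_numeric(t):.6f}")
```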

$$h(t) = \frac{f(t)}{S(t)}$$

$$f(t) = \frac{d}{dt}\left[1 - S(t)\right] = -S'(t)$$

$$h(t) = -\frac{S'(t)}{S(t)} = -\frac{d}{dt}\log_e S(t)$$

$$-\int_0^t h(x)\,dx = \log_e S(t)$$

$$H(t) = -\log_e S(t)$$

$$S(t) = \exp\left[-H(t)\right] = \exp\left[-\int_0^t h(x)\,dx\right]$$

$$f(t) = h(t)\exp\left[-H(t)\right]$$

Nonparametric Methods for Estimation of the Survival Functions

ƒ Product-Limit Estimates (Kaplan-Meier) [most widely used in biological and medical applications]

ƒ Life Table Analysis (actuarial method) [for large number of observations or if there are many unique event times]

ƒ Relative, Five-Year, and Corrected Survival Rates

Nonparametric Methods for Comparing Survival Distributions

ƒ Did the treatment make the difference in the survival experience between “treatment A” and “treatment B” patients?

ƒ Gehan’s Generalized Wilcoxon Test [better for detecting short-term differences in survival]
ƒ Cox-Mantel Test
ƒ Log-rank Test [better for detecting long-term differences in survival]
ƒ Peto and Peto’s Generalized Wilcoxon Test
ƒ Cox’s F-Test
ƒ Mantel and Haenszel Test [allows stratification based on prognostic factors]
ƒ Cox’s Proportional Hazard Regression [adjustment for multiple prognostic factors]

Survival Examples

ƒ Example 1: ƒ Pilot study was conducted at the VA hospital to compare Accelerated Fractionation Radiation Therapy versus Standard Fractionation Radiation Therapy for patients with advanced unresectable squamous cell carcinoma of the head and neck.

ƒ Is there any difference in survival between the patients treated with Accelerated FRT and the patients treated with Standard FRT? Example 1: Summary of Patient Characteristics

                      AFRT        SFRT
Gender
  Male                28 (97%)    16 (100%)
  Female              1 (3%)      0
Age                   61          65
  Range               30-71       43-78
Primary Site
  Larynx              3 (10%)     4 (25%)
  Oral Cavity         6 (21%)     1 (6%)
  Pharynx             20 (69%)    10 (63%)
  Salivary Glands     0           1 (6%)
Stage
  III                 4 (14%)     8 (50%)
  IV                  25 (86%)    8 (50%)
Tumor Stage
  T2                  3 (10%)     2 (12%)
  T3                  8 (28%)     7 (44%)
  T4                  18 (62%)    7 (44%)

Example 1: KM Curves by Treatment

[Figure: Overall Survival by Treatment. Kaplan-Meier curves for AFRT and SFRT with censored observations marked; y-axis: Survival Probability (0.00 to 1.00); x-axis: Survival Time (months, 0 to 120)]

Median Survival Time:
  AFRT: 18.38 months (2 censored observations)
  SFRT: 13.19 months (5 censored observations)

Example 1: KM Curves by Treatment

[Figure: Overall Survival by Treatment. Kaplan-Meier curves for AFRT and SFRT; y-axis: Survival Probability (0.00 to 1.00); x-axis: Survival Time (months, 0 to 120)]

Log-Rank test p-value = 0.5421

Example 1: Further Analysis

ƒ Let us consider the analysis of the survival of these patients by Stage as well as by Treatment Example 1: KM Curves by Stage and Treatment

[Figure: Overall Survival by Treatment and Stage. Kaplan-Meier curves for AFRT/Stage 3, AFRT/Stage 4, SFRT/Stage 3, and SFRT/Stage 4; y-axis: Survival Probability (0.00 to 1.00); x-axis: Survival Time (months, 0 to 120)]

Median Survival Time:
  AFRT Stage 3: 77.98 mo.
  AFRT Stage 4: 16.21 mo.
  SFRT Stage 3: 19.34 mo.
  SFRT Stage 4: 8.82 mo.

Log-Rank test p-value = 0.0792

Example 2: Survival of Leukemia Patients

Example 2: SAS Output

Study Design

Clofibrate Trial

ƒ A study was performed to look at the effect of a drug, Clofibrate, on mortality rates for individuals with heart disease ƒ Individuals were followed for five years after administration of the drug Results: Clofibrate Group

ƒ Coronary heart disease ƒ Results from group randomized to take Clofibrate Results: Clofibrate Group

ƒ Coronary heart disease ƒ Results from group randomized to take Clofibrate

ƒ Question: Do these results strongly demonstrate the efficacy of Clofibrate in reducing the mortality in subjects with CHD? Results: Placebo Group

ƒ Coronary heart disease ƒ Results from group randomized to take placebo The Dangers of Self-Selection

ƒ Overall mortality in each of the groups in this randomized trial The Dangers of Self-Selection

ƒ Randomized trial ƒ No significant difference (p > 0.2) between the treatment and placebo groups

ƒ No difference between TX groups ƒ The compliers and non-compliers were similar with respect to other variables (age, etc.) ƒ There were no apparent harsh side effects of Clofibrate relative to placebo that may have resulted in differential compliance between the Clofibrate and placebo groups

Source: Coronary drug research group (1980). NEJM, 1038-1041 A Randomized Control Group

ƒ Important for accounting for many kinds of biases

ƒ Randomization, done correctly on a large number of subjects nearly ensures that the only systematic difference in the groups being compared is the exposure(s) of interest Note

ƒ A very famous randomized trial (once again) ƒ 200,745 vaccinated for polio ƒ ~ 400,000 school children randomized ƒ 201,229 given a placebo 1954 Salk Polio Vaccine Trial

ƒ At the end of the follow-up period there were 82 cases in the vaccine group and 162 in the placebo group

ƒ Subsequent analyses report slightly different numbers because some false positives were discovered in each of the two groups

1954 Salk Polio Vaccine Trial

ƒ Result

Results in 2X2 Table Format

ƒ How to test for an association between vaccine and polio?

ƒ HO: pvaccine = pplacebo

ƒ HA: pvaccine ≠ pplacebo

ƒ You can use either Fisher’s Exact test or a Chi-square test

ƒ With Fisher’s Exact, p < 0.001 Randomized Study Design

ƒ Prospective cohort studies ƒ Choose a fixed number with and without exposure ƒ Follow subjects for a set period of time and determine who has disease/outcome of interest

ƒ Measures of association ƒ Difference in proportions ƒ Relative risk (ratio of proportions) ƒ Odds ratio Benefits of Randomization

ƒ Randomization helps protect against self-selection biases ƒ Examples: ƒ Males are more likely to volunteer for placebo than females ƒ Smokers are less likely to be in the exposed group ƒ Healthier persons sign up for the intervention

ƒ The goal of randomization is to eliminate any systematic differences in characteristics of subjects in each of the exposure groups under study, save for the exposure itself Randomized Study Design

ƒ Many epidemiological studies are concerned with estimating an association between two dichotomous (binary) variables ƒ Example: exposure-disease association

ƒ Randomized study design with control group is a type of prospective cohort study
  ƒ Choose a fixed number with and without exposure
  ƒ Follow subjects for a set time period and determine who has the disease/outcome of interest

ƒ Measures of association ƒ Difference in proportions ƒ Relative risk (ratio of proportions) ƒ Odds ratio Methods of Randomization Patient Registration

For each patient considered suitable for inclusion in a clinical trial, the following formal sequence of events should take place. Patient Registration

ƒ Patient requires treatment ƒ Patient eligible for inclusion ƒ Clinician willing to accept randomization ƒ Patient consent is obtained ƒ Patient formally entered on trial ƒ Treatment assignment obtained from randomization list ƒ On-study forms are completed ƒ Treatment commences Patient Recruitment

ƒ The identification of appropriate patients is important to ensure that the sample is representative of the disease under investigation. ƒ Trial’s conclusions can be readily applied ƒ All relevant patients should be considered ƒ Clinician should avoid being unduly selective in choice of patients Checking Eligibility

ƒ Eligibility should be checked right away by going through the list of eligibility criteria in the protocol. ƒ If any ineligible patients accidentally make it through the screening process, they may be excluded from the analysis. Agreement to Randomize

ƒ Anything other than randomization is unacceptable if randomization was agreed upon. ƒ If a clinician is unwilling to randomize, he or she should not participate in the trial at all. Patient Consent

ƒ Written informed consent is an ethical and legal requirement that must be obtained before randomization. Formal Entry on Trial

ƒ Registering a patient for entry onto a trial before randomization ensures that all randomized patients are followed for details of treatment and evaluation. ƒ Helps to guard against investigators not giving the randomized treatment ƒ Value in keeping a separate log of eligible patients not entered into the trial, which provides insight into the representativeness of trial patients Random Treatment Assignment

ƒ A randomization list of consecutive random treatment assignment should have been prepared in advance. ƒ The clinician should not know the order of this list, nor should they be able to predict the next treatment. ƒ Prepared by an independent person, statistician, etc. Random Treatment Assignment

ƒ Methods for Revealing Treatment: ƒ Sealed envelopes opened after formal entry onto trial. ƒ List given directly to pharmacist for double-blind evaluations. ƒ For a multi-center trial with a central registration office, the assignment can be read off the randomization list to the investigator over the phone. ƒ For a trial in a single institution, it is best to have an independent person responsible for patient registration and randomization who is not concerned with treating any of the patients. Documentation

ƒ Confirmation of registration ƒ Name, date of birth, trial number, assigned treatment

ƒ On-study Form ƒ Previous therapy, personal details, clinical condition, tests and Reliability

ƒ (1) through (7) should be completed before treatment commences.
ƒ The entire process of registration and randomization should be prompt and efficient so there is no delay in treatment.
ƒ Avoid unnecessary drop-outs

Preparing the Randomization List

ƒ Simple randomization ƒ Replacement randomization ƒ Random permuted blocks ƒ Biased coin method Simple Randomization

ƒ Two treatments, A and B
ƒ Table of random numbers
ƒ Start with top row and work down
ƒ Assign A for digits 0−4
ƒ Assign B for digits 5−9

AB AB BA AB AA BB AB BA BB BB BA BA AB BB BA

Simple Randomization

ƒ Three treatments, A B C
ƒ Assign A for 1−3
ƒ Assign B for 4−6
ƒ Assign C for 7−9
ƒ Ignore 0

- B AC CB AC BA BC AC BA BB CB CA C- BC CC CB

Simple Randomization

ƒ Four treatments, A B C D
ƒ Assign A for 1,2
ƒ Assign B for 3,4
ƒ Assign C for 5,6
ƒ Assign D for 7,8
ƒ Ignore 0,9

- C AD DB BD BA CD BD CA CC - C DA D- BD DD DB

Simple Randomization
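The digit-to-treatment rules above can be sketched in Python. The lecture uses a printed table of random numbers; here a seeded pseudo-random generator stands in for the table, which is an assumption made for illustration, as are the names `assign_from_digits`, `TWO_ARM`, and `FOUR_ARM`.

```python
import random

def assign_from_digits(digits, mapping):
    """Map random digits to treatments, skipping digits that the rule says to ignore."""
    return [mapping[d] for d in digits if d in mapping]

# Two treatments: A for digits 0-4, B for digits 5-9
TWO_ARM = {d: ("A" if d <= 4 else "B") for d in range(10)}
# Four treatments: A for 1,2; B for 3,4; C for 5,6; D for 7,8; digits 0 and 9 are ignored
FOUR_ARM = {1: "A", 2: "A", 3: "B", 4: "B", 5: "C", 6: "C", 7: "D", 8: "D"}

rng = random.Random(2024)  # seeded stand-in for a random number table
digits = [rng.randrange(10) for _ in range(30)]

print("".join(assign_from_digits(digits, TWO_ARM)))
print("".join(assign_from_digits(digits, FOUR_ARM)))
```

Ignoring digits keeps each treatment equally likely when the number of treatments does not divide ten evenly.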

ƒ Each treatment assignment is completely unpredictable.
ƒ Probability theory guarantees that in the long run the numbers of patients on each treatment will not be radically different.
ƒ It is often desirable to restrict randomization to ensure similar treatment numbers throughout a trial.

Random Permuted Blocks

ƒ To ensure exactly equal treatment numbers at equally spaced points in the sequence of patient assignments. ƒ For T treatments, block into k patients and produce a different random ordering of k assignments to each treatment. Random Permuted Blocks

ƒ For two treatments A and B ƒ Blocks of two patients ƒ Assign AB for 0−4 ƒ Assign BA for 5−9 ABBA ABBA BAAB ABBA ABAB BABA ABBA BAAB …. Random Permuted Blocks

ƒ Three treatments, A B and C ƒ Blocks of three patients ƒ ABC for 1 ƒ ACB for 2 ƒ BAC for 3 ƒ BCA for 4 ƒ CAB for 5 ƒ CBA for 6 ƒ Ignore 7, 8, 9, and 0 - CAB ACB - - BCA BAC ABC CBA - BAC - CAB ABC CAB CBA . . . Random Permuted Blocks

ƒ Two treatments A and B ƒ Blocks of four patients ƒ AABB for 1 ƒ ABAB for 2 ƒ ABBA for 3 ƒ BBAA for 4 ƒ BABA for 5 ƒ BAAB for 6 ƒ Ignore 0 and 7−9 - BABA ABAB - - BBAA ABBA - BBAA AABB BAAB - ABBA - BABA …. Random Permuted Blocks
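The lists of block orderings used above can be generated rather than written out by hand. The sketch below is illustrative only; `block_orderings` and `blocks_from_digits` are hypothetical helper names, and the input digits are made up for the example.

```python
from itertools import permutations

def block_orderings(treatments, per_arm):
    """All distinct orderings of a block containing per_arm copies of each treatment."""
    block = "".join(t * per_arm for t in treatments)
    return sorted({"".join(p) for p in permutations(block)})

def blocks_from_digits(digits, orderings):
    """Digit d (1-based) selects orderings[d - 1]; any other digit is ignored."""
    return [orderings[d - 1] for d in digits if 1 <= d <= len(orderings)]

print(block_orderings("ABC", 1))  # six orderings of three treatments
print(block_orderings("AB", 2))   # six balanced blocks of four patients

# Numbering taken from the slide for blocks of four (it differs from sorted order).
slide_blocks = ["AABB", "ABAB", "ABBA", "BBAA", "BABA", "BAAB"]
print(blocks_from_digits([0, 5, 2, 9, 4], slide_blocks))
```

Either numbering works, provided it is fixed before the random digits are read.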

ƒ In general, a trial without stratification should have a reasonably large block size so as to reduce predictability, but if interim analysis is intended, not so large that serious mid-block inequality might occur. ƒ The smaller the block size, the greater the risk of predictable randomization. ƒ For stratified assignment, small block sizes are recommended.
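For larger blocks, listing every possible ordering is impractical. One alternative, sketched here as an assumption rather than as the method used in the lecture, is to fill a block with equal numbers of each treatment and shuffle it. The function name `permuted_blocks` and the seed are hypothetical.

```python
import random

def permuted_blocks(treatments, block_size, n_blocks, seed=None):
    """Return n_blocks strings, each a random ordering with equal numbers per treatment."""
    if block_size % len(treatments) != 0:
        raise ValueError("block size must be a multiple of the number of treatments")
    rng = random.Random(seed)
    per_arm = block_size // len(treatments)
    blocks = []
    for _ in range(n_blocks):
        block = list(treatments) * per_arm  # equal numbers of each treatment
        rng.shuffle(block)                  # random order within the block
        blocks.append("".join(block))
    return blocks

# 100 patients, two treatments, block size 20 (seed chosen arbitrarily)
blocks = permuted_blocks("AB", block_size=20, n_blocks=5, seed=7)
for b in blocks:
    print(b, b.count("A"), b.count("B"))
```

Every block ends with exactly 10 patients on each arm, so the arms are balanced after every 20th patient.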

Random Permuted Blocks

ƒ A trial for 100 patients could have a block size of 20: ƒ Use 0−9 for A and 10−19 for B ƒ Block 1: BBBAAAABAABABBABBAAB ƒ Block 2: BBAABAABBBABAAABBABA ƒ For three treatments, blocks of size 15: ƒ Use 1−5 for A, 6−10 for B, 11−15 for C ƒ Block 1: CCABBCBAACABBAC ƒ Block 2: CCABBCACABABBCA Biased Coin Method

ƒ At each point in the trial one observes which treatment has the fewest patients so far: that treatment is then assigned with probability p > ½ to the next patient. ƒ If the two treatments have equal numbers then simple randomization is used for the next patient. Biased Coin Method

ƒ One has to choose an appropriate value for p using probability theory. ƒ It turns out that for p = 3/4, 2/3, 3/5, or 5/9, a difference in treatment numbers of 4, 6, 10, or 16 or more, respectively, has less than a 1 in 20 chance of occurring at any particular point in the trial. Biased Coin Method

ƒ p = 3/4 maintains very strict control on treatment numbers but is somewhat predictable once any inequality exists. ƒ p = 2/3 may be an appropriate choice for a relatively small trial. ƒ p = 3/5 is good for larger trials (100 patients) Biased Coin Method

ƒ Using p = 3/5 ƒ Assign the treatment with fewest patients for 0−5 ƒ Assign the treatment with most patients for 6−9 ƒ When treatment numbers are equal, revert to 0−4 for A and 5−9 for B (equal randomization)
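The p = 3/5 rule can be traced step by step in code. The sketch below is illustrative (the function name `biased_coin` is hypothetical); it is applied to the digit string 05278437416838515696 from the worked example on this slide.

```python
def biased_coin(digits):
    """Biased coin with p = 3/5, following the digit rules stated above."""
    counts = {"A": 0, "B": 0}
    sequence = []
    for d in digits:
        if counts["A"] == counts["B"]:
            # Equal numbers: simple randomization, 0-4 -> A, 5-9 -> B
            arm = "A" if d <= 4 else "B"
        else:
            fewer = min(counts, key=counts.get)
            more = max(counts, key=counts.get)
            # Digits 0-5 (probability 6/10 = 3/5) favor the arm with fewer patients
            arm = fewer if d <= 5 else more
        counts[arm] += 1
        sequence.append(arm)
    return "".join(sequence)

digits = [int(c) for c in "05278437416838515696"]
print(biased_coin(digits))  # ABAAABBABBBBABAABBBB
```

After 20 patients this particular digit string gives 8 patients on A and 12 on B, which shows that p = 3/5 limits imbalance without eliminating it.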

Digits:     0 5 2 7 8 4 3 7 4 1 6 8 3 8 5 1 5 6 9 6
Assignment: A B A A A B B A B B B B A B A A B B B B
Stratified Randomization

ƒ Stratified randomization helps ensure that treatment groups are similar with regard to certain relevant patient characteristics. ƒ Example: breast cancer, pre- and post-menopausal ƒ The larger the trial, the less likely it is that there will be an imbalance. ƒ Stratified randomization is an ‘insurance policy.’ Reasons for Not Using Stratification

1. If the trial is very large (200+) and interim analysis of early results is either not feasible or of minor interest, stratification has little point. 2. If the organizational resources for supervising randomization are limited then the increased complexity of stratification may carry a certain risk of errors. Reasons for Not Using Stratification

3. If there is uncertainty about which patient characteristics might influence response, then one has inadequate knowledge on which to carry out stratification. Stratification

ƒ Decide which patient factors to stratify by. ƒ Outcomes of previous patients in earlier trials ƒ Should be convinced of a factor’s potential for making an impact on the outcome before it is included in stratification ƒ If only vague knowledge exists, do not stratify Random Permuted Blocks within Strata

ƒ Example: breast cancer ƒ nodal status (number of positive axillary nodes, < 4, ≥ 4) ƒ age (< 50, ≥ 50) Strata

Age     Nodes
< 50    < 4
< 50    ≥ 4
≥ 50    < 4
≥ 50    ≥ 4
Random Permuted Blocks within Strata

ƒ A separate restricted randomization list is prepared for each of the strata using the previous methods. ƒ Generally, a small block size is used when stratifying by several factors. ƒ Randomization needs to be more tightly restricted to be effective ƒ The chance of any investigator predicting the last assignment in any block is considerably reduced

Age:    < 50   ≥ 50   < 50   ≥ 50
Nodes:  < 4    < 4    ≥ 4    ≥ 4
        B      B      A      B
        A      B      A      A
        B      A      B      A
        A      A      B      B

        A      A      B      A
        B      A      A      B
        A      B      B      B
        B      B      A      A

        A      B      B      B
        B      A      A      B
        B      A      B      A
        A      B      A      A

        B      A      B      A
        B      B      A      B
        A      B      A      A
        A      A      B      B
        ......
Random Permuted Blocks within Strata
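A separate list per stratum, as in the age-by-nodes layout above, could be prepared along the following lines. This is a sketch under stated assumptions: blocks of four with two A and two B, shuffling in place of a random number table, and hypothetical names (`stratified_lists`, `factors`) and seed.

```python
import random
from itertools import product

def stratified_lists(factors, n_blocks, block="AABB", seed=None):
    """One permuted-block randomization list for every combination of factor levels."""
    rng = random.Random(seed)
    lists = {}
    for stratum in product(*factors.values()):
        sequence = []
        for _ in range(n_blocks):
            b = list(block)
            rng.shuffle(b)       # new random ordering for each block
            sequence.extend(b)
        lists[stratum] = "".join(sequence)
    return lists

factors = {"age": ["<50", ">=50"], "nodes": ["<4", ">=4"]}
lists = stratified_lists(factors, n_blocks=4, seed=1)
for stratum, sequence in lists.items():
    print(stratum, sequence)
```

Each new patient is assigned the next unused entry in the list for his or her own stratum, so balance is maintained within every stratum after each complete block.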

ƒ The more factors included, the greater the pre-trial documentation and amount of information required each time a patient is registered. ƒ Consider whether the extra burden is justified. Issues

ƒ For so many strata, will you achieve comparable treatment groups? ƒ Over-stratifying can be self-defeating since a large number of strata with incomplete randomized blocks can lead to substantial imbalance between treatment groups. The Minimization Method

ƒ Over-stratification: no one is especially interested in ambulatory patients aged less than 50 years with visceral metastases and disease-free interval for over two years since this is a very small subgroup of patients with advanced breast cancer. The Minimization Method

ƒ One is more interested in ensuring that the different treatment groups are similar as regards the percentage ambulatory, percentage under 50, etc. ƒ This method balances the marginal treatment totals for each level of each patient factor. The Minimization Method

ƒ Example: Advanced breast cancer trial, 80 patients already enrolled

                                       Treatment
Factor                Level            A     B     Next patient
PS                    Ambulatory       30    31    ←
                      Nonambulatory    10     9
Age                   < 50 y           18    17    ←
                      ≥ 50 y           22    23
DF Int                < 2 y            31    32
                      ≥ 2 y             9     8    ←
Dominant metastatic   Visceral         19    21    ←
lesion                Osseous           8     7
                      Soft tissue      13    12
The Minimization Method
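The table drives the next assignment: for the levels that describe the new patient (marked ←), sum each treatment's current counts and give the treatment with the smaller sum. A minimal sketch follows, with the table's counts typed in and hypothetical names (`counts`, `patient`, `minimization_sums`, `next_assignment`).

```python
import random

# (A, B) patient counts per factor level, copied from the table of 80 enrolled patients
counts = {
    "PS": {"Ambulatory": (30, 31), "Nonambulatory": (10, 9)},
    "Age": {"<50": (18, 17), ">=50": (22, 23)},
    "DF Int": {"<2y": (31, 32), ">=2y": (9, 8)},
    "Dominant lesion": {"Visceral": (19, 21), "Osseous": (8, 7), "Soft tissue": (13, 12)},
}

# Levels of the next patient (the rows marked with an arrow)
patient = {"PS": "Ambulatory", "Age": "<50", "DF Int": ">=2y", "Dominant lesion": "Visceral"}

def minimization_sums(counts, patient):
    """Sum the marginal totals for each treatment over the patient's factor levels."""
    a = sum(counts[factor][level][0] for factor, level in patient.items())
    b = sum(counts[factor][level][1] for factor, level in patient.items())
    return a, b

def next_assignment(counts, patient, rng=random):
    """Give the treatment with the smaller sum; break ties by simple randomization."""
    a, b = minimization_sums(counts, patient)
    if a == b:
        return rng.choice("AB")
    return "A" if a < b else "B"

print(minimization_sums(counts, patient))  # (76, 77)
print(next_assignment(counts, patient))    # A
```

After the assignment, the counts for the patient's four levels are incremented for the chosen treatment before the next patient is registered.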

ƒ For each treatment, add up the number of patients in the corresponding rows of the table: ƒ A = 30 + 18 + 9 + 19 = 76 ƒ B = 31 + 17 + 8 + 21 = 77 ƒ Give the treatment with the smallest sum of marginal totals to the next patient (A) ƒ If equal, use simple randomization ƒ Keep a continual up-to-date record of patients Balancing for Institution

ƒ In multi-center trials, we must consider whether the institution entering a patient should also be a factor worth including in stratification. ƒ Different institutions can show very different response rates. ƒ Patient selection ƒ Experimental environment Balancing for Institution

ƒ If no other factors are considered, simply provide a separate randomization list for each institution. ƒ If the minimization method is used, simply include ‘institution’ as another patient factor. Stratified Randomization

ƒ Although stratification is theoretically efficient, the practical circumstances must dictate whether its use is desirable. Unequal Randomization

ƒ Equal-sized treatment groups provide the most efficient means of treatment comparison. ƒ However, there is often interest in gaining more experience with a new treatment’s profile, or there is a large amount of enthusiasm for the new therapy. ƒ In such cases, consider randomizing more patients to the new therapy. Unequal Randomization

ƒ Consider a trial in which the 5% level of significance (type I error) is used to determine the sample size required for 95% power ƒ What happens to the power if we decide to put a greater proportion of patients on the new treatment? Unequal Randomization

ƒ If the overall size of the trial is kept constant, the power decreases relatively slowly as one begins to move away from equal-sized groups. ƒ The power decreases from 95% to 92.5% if we place 2/3 of patients on the new treatment. ƒ Power is down to 82% for 4/5 of patients on the new treatment. Unequal Randomization
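The quoted power figures can be approximated with a short calculation. The sketch below rests on assumptions that are not stated on the slides: a two-sided test, a normal approximation with equal variance in both arms, a fixed total sample size, and a negligible contribution from the opposite tail. The function name `power_unequal` is hypothetical. Under these assumptions, the standardized effect shrinks by a factor of sqrt(4f(1 − f)) when a fraction f of patients receives the new treatment.

```python
from statistics import NormalDist

def power_unequal(fraction_new, power_equal=0.95, alpha=0.05):
    """Approximate power when fraction_new of a fixed total goes to the new arm."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # two-sided critical value
    z_beta = z.inv_cdf(power_equal)      # quantile matching the equal-allocation power
    shrink = (4 * fraction_new * (1 - fraction_new)) ** 0.5
    return z.cdf((z_alpha + z_beta) * shrink - z_alpha)

for f in (1 / 2, 3 / 5, 2 / 3, 4 / 5):
    print(f"{f:.2f} on new treatment -> power {power_unequal(f):.3f}")
```

Under these assumptions the calculation gives roughly 0.925 for a 2:1 split and roughly 0.82 for a 4:1 split, in line with the figures above.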

ƒ Randomization in a 2:1 or 3:2 ratio is a realistic proposition but more extreme imbalance may be undesirable. ƒ Considerable loss of statistical power ƒ Unequal randomization is becoming more common. ƒ Response-adaptive clinical trials Adaptive Trial Designs

ƒ Unlike a fixed design, adaptive designs allow observations made during a trial to update certain aspects of the design, such as ƒ randomization scheme ƒ sample size ƒ stopping early ƒ dropping/adding arms ƒ These updates are not “ad hoc,” but are set in advance of the trial and are based on clinical, statistical, logistical, and ethical considerations. Adaptive Trial Designs

ƒ Adaptive clinical trials provide an ethical and efficient mechanism for testing multiple treatments that ƒ provide statistically sound results for future treatment. ƒ lessen exposure of current study participants to ineffective/unsafe treatments. ƒ The Bayesian approach provides a natural framework for the continual updating required by adaptive designs. Introduction to Confounding

ƒ Consider the results from the following (fictitious) study: ƒ Investigating the association between smoking and a certain disease in male and female adults ƒ 210 smokers and 240 non-smokers were recruited Confounding

ƒ Smoking is protecting against disease?

ƒ Most of the smokers are male and non-smokers are female: Confounding

ƒ Smoking is protecting against disease?

ƒ Most of the persons with disease are female: Confounding

ƒ The original outcome is disease

ƒ The original exposure of interest is smoking

ƒ In this example, gender is related to both the outcome and exposure ƒ This relationship is possibly impacting the overall relationship between disease and smoking

ƒ How can we look at the relationship between disease and smoking removing any possible ‘interference’ from gender? ƒ One approach—look at disease and smoking relationship separately within males and females Example

ƒ Is smoking related to disease in males? Example

ƒ Is smoking related to disease in females? Smoking, Disease, and Gender

ƒ Recap: ƒ The overall (crude/unadjusted) relationship (RR) between smoking and disease was nearly one (risk difference nearly 0)

ƒ The sex specific results showed similar positive associations between smoking and disease

ƒ Note: for the we aren’t considering , just using estimates to illustrate the point Simpson’s Paradox

ƒ The nature of an association can change (and even reverse direction) or disappear when data from several groups are combined to form a single group

ƒ An association between an exposure X and a disease Y can be confounded by another lurking (hidden) variable Z Simpson’s Paradox

ƒ A confounder Z distorts the true relationship between X and Y

ƒ This can happen if Z is related to both X and Y Confounding What is the solution?

ƒ If you don’t know what the potential confounders are, there’s not much you can do after the study is over ƒ Randomization is your best protection—it eliminates the potential links between the exposure of interest and potential confounders

ƒ If you can’t randomize but know what the potential confounders are, there are statistical methods to help control or adjust for confounders—make sure you’re measuring the potential confounders as part of your study How to adjust for confounding?

ƒ Stratify ƒ Look at the tables separately ƒ Male/female, by clinic, etc. ƒ Take the weighted average of stratum specific estimates

ƒ For example, in the disease/smoking situation: ƒ To get a sex adjusted relative risk for the smoking disease relationship we could weight the sex-specific relative risks by numbers of males and females: How to adjust for confounding?

ƒ There are better ways than this to take such a weighted average (weighting by standard error, for instance) but this illustrates the concept

ƒ Confidence intervals can be computed for these adjusted measures of association

ƒ One way to assess whether sex is a confounder: compare crude RR to sex-adjusted RR—if it’s ‘different’ then sex is a confounder How to adjust for confounding?
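The comparison of crude and sex-adjusted relative risks can be sketched as follows. The cell counts are HYPOTHETICAL, chosen only so that the totals match the 210 smokers and 240 non-smokers of the fictitious study and so that the pattern described earlier appears; they are not the figures from the lecture's tables. The weighting is by stratum size, as described above.

```python
# HYPOTHETICAL (diseased, total) counts; only the smoker and non-smoker totals
# (210 and 240) come from the fictitious study described in the slides.
data = {
    "male": {"smoker": (16, 160), "non-smoker": (2, 40)},
    "female": {"smoker": (20, 50), "non-smoker": (40, 200)},
}

def risk_ratio(exposed, unexposed):
    """Relative risk: risk in the exposed divided by risk in the unexposed."""
    (d1, n1), (d0, n0) = exposed, unexposed
    return (d1 / n1) / (d0 / n0)

def pooled(group):
    """Collapse over sex to obtain the crude (unadjusted) counts."""
    diseased = sum(data[sex][group][0] for sex in data)
    total = sum(data[sex][group][1] for sex in data)
    return diseased, total

crude_rr = risk_ratio(pooled("smoker"), pooled("non-smoker"))
stratum_rr = {sex: risk_ratio(c["smoker"], c["non-smoker"]) for sex, c in data.items()}
stratum_n = {sex: c["smoker"][1] + c["non-smoker"][1] for sex, c in data.items()}

# Sex-adjusted RR: average of the sex-specific RRs weighted by stratum size
adjusted_rr = sum(stratum_rr[s] * stratum_n[s] for s in data) / sum(stratum_n.values())

print(f"crude RR = {crude_rr:.2f}")
print(f"sex-specific RRs = {stratum_rr}")
print(f"sex-adjusted RR = {adjusted_rr:.2f}")
```

With these made-up counts the crude RR is about 0.98 while both sex-specific RRs, and therefore the adjusted RR, equal 2, so sex would be judged a confounder.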

ƒ Regression methods can be used ƒ More generalizable than weighted average approach, but the idea is similar Example: Arm Circumference and Height

ƒ A study to estimate the association between arm circumference and height in Nepali children ƒ 94 randomly selected subjects (ages 3 months to 6.5 years) had arm circumference, weight, and height measured ƒ This study is observational: it is not possible to randomize subjects to height groups Example: Arm Circumference and Height

ƒ The data: ƒ Arm circumference range: 11.6 – 16.5 cm ƒ Height range: 57 – 109 cm ƒ Weight range: 5 – 18 kg

ƒ To perform the analysis: ƒ Dichotomize height at median (87 cm) ƒ Dichotomize weight at median (11.4 kg) Example: Arm Circumference and Height

ƒ Mean arm circumference by height group

ƒ Shorter subjects have arm circumference on average 0.7 cm lower than taller subjects (mean difference = -0.7 cm with 95% CI -1.1 to -0.3 cm) Example: Arm Circumference and Height

ƒ However, it is very likely that arm circumference and height are both related to a child’s weight

ƒ Some of the relationships between arm circumference and height could be because of, or masked by, these ‘behind the scenes’ relationships to weight Example: Arm Circumference and Height

ƒ What about weight? Example: Arm Circumference and Height

ƒ Possible diagram of this scenario Example: Arm Circumference and Height

ƒ Recall the original finding: children below the median height had arm circumferences of 0.7 cm lower on average than children equal to or above the median height

ƒ To investigate whether this estimate is being fueled or lessened in part by weight differences in the height groups, stratify by weight group and estimate the arm circumference/height association in each weight group Example: Arm Circumference and Height

ƒ Mean arm circumference by height group ƒ Children below median weight

ƒ Shorter subjects below the median weight have arm circumferences on average 0.02 cm larger than taller subjects below the median weight (95% CI: -0.64 [lower] to 0.68 [higher] cm) Example: Arm Circumference and Height

ƒ Mean arm circumference by height group ƒ Children equal to or above median weight

ƒ Shorter subjects at or above the median weight have average arm circumferences 0.06 cm larger than taller subjects at or above the median weight (95% CI: -0.9 [lower] to 1.0 [higher] cm) Example: Arm Circumference and Height

ƒ Recap: ƒ Ignoring weight, children below the median height had arm circumferences 0.69 cm smaller (on average) than children at or above the median height, and this difference was statistically significant ƒ When stratified by weight, children below the median height had arm circumferences marginally larger (on average) than children at or above the median height in both weight groups, but these estimates were very close to 0 and not statistically significant Example: Arm Circumference and Height

ƒ It appears that the association between arm circumference and height ‘disappears’ after accounting for weight

ƒ Associations (shorter versus taller): ƒ Crude/unadjusted -0.7 cm (95% CI -1.1 to -0.3) ƒ Adjusted? ƒ One possibility: take the weighted average of weight specific AC/height associations, weighted by the inverse of SEs of weight specific associations: Example: Arm Circumference and Height

We can get a 95% CI for this adjusted estimate: -0.3 to 0.38 cm Example: Arm Circumference and Height

ƒ One approach: take a weighted average of the average arm circumference differences between subjects below and above the median height within weight groups (weighted by the size of each group)

ƒ However, this is a pain and if there are more potential confounders we could spend our life stratifying and computing such estimates

ƒ Better approach—multiple regression methods Example: Arm Circumference and Height

ƒ FYI ƒ A weighted overall average height adjusted difference in arm circumference between two weight groups is 0.98 cm with 95% CI 0.4 cm to 1.55 cm

ƒ When adjusted for weight, the arm circumference/height association disappears

ƒ When adjusted for height, the arm circumference/weight association is almost the same as the unadjusted arm circumference/weight association Example: Arm Circumference and Height

ƒ This is an interesting case, perhaps better illustrated by this picture: Example: Arm Circumference and Height

ƒ This is not always the case: many times when there is confounding between an outcome and two (or more) grouping variables, all of the adjusted outcome/group relationships will differ from the unadjusted associations Example: South African Study

ƒ A longitudinal study from South Africa: birth cohort, followed up five years after birth

ƒ Participation by medical aid status at birth, all baseline participants

ƒ 95% CI: 0.53 to 0.92 Example: South African Study

ƒ A longitudinal study from South Africa: birth cohort, followed up five years after birth

ƒ Participation by medical aid status at birth, black participants

ƒ 95% CI: 0.76 to 1.36 Example: South African Study

ƒ A longitudinal study from South Africa: birth cohort, followed up five years after birth

ƒ Participation by medical aid status at birth, white participants

ƒ 95% CI: 0.25 to 4.5 Example: South African Study

ƒ Race: majority of sample is black (91%)

ƒ Race and follow-up participation: 26% of black subjects completed follow-up as compared to 9% of white subjects

ƒ Race and medical aid: 9% of black subjects had medical aid compared to 83% of white subjects Statistical Interaction/Effect Modification Effect Modification/Interaction

ƒ Let’s look at the results from a fictitious data set comparing two treatments for a fatal disease as to the impact of each on reducing deaths: there are 600 ‘younger’ patients and 600 ‘older’ patients in this random sample of 1,200

ƒ Here is the data on all patients

ƒ Surgery and drug groups have identical proportions dying (50% in each group)! Effect Modification/Interaction

ƒ Here is the data on only the 600 younger patients Effect Modification/Interaction

ƒ Here is the data on only the 600 older patients Effect Modification/Interaction

ƒ A recap of the overall and age specific results Effect Modification/Interaction

ƒ Age is an effect modifier: ƒ Age modifies the association between death and treatment (statistical interaction between age and treatment)!

ƒ The association between death and treatment depends on age ƒ Surgery is better for younger patients, drug therapy is better for the older patients Effect Modification/Interaction
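The pattern can be made concrete with a small calculation. The counts below are HYPOTHETICAL, chosen only to reproduce the features described for the fictitious data set (600 younger and 600 older patients, 50% mortality overall in both treatment groups); they are not the lecture's actual tables. The name `death_rr` is likewise hypothetical.

```python
# HYPOTHETICAL (deaths, patients) counts, for illustration only
data = {
    "younger": {"surgery": (90, 300), "drug": (210, 300)},
    "older": {"surgery": (210, 300), "drug": (90, 300)},
}

def death_rr(cells):
    """Relative risk of death, surgery versus drug."""
    (d_s, n_s), (d_d, n_d) = cells["surgery"], cells["drug"]
    return (d_s / n_s) / (d_d / n_d)

# Collapse over age to get the overall (all patients) table
overall = {
    arm: (sum(data[age][arm][0] for age in data), sum(data[age][arm][1] for age in data))
    for arm in ("surgery", "drug")
}

print(f"overall RR = {death_rr(overall):.2f}")
for age, cells in data.items():
    print(f"{age} RR = {death_rr(cells):.2f}")
```

With these made-up counts the overall RR is 1 while the age-specific RRs are about 0.43 (younger) and 2.33 (older), on opposite sides of 1. A single pooled estimate would hide both effects, which is why the two age groups are reported separately.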

ƒ The association between death and one variable (treatment) depends on the level of another variable (age)

ƒ Here, it would not make sense to estimate one composite, overall measure of the association between death and treatment

ƒ Best way to look at this data is to just look at the two tables (young and old) separately and estimate two separate death/treatment associations Example: Tree Damage and Elevation

ƒ Data on elevation and percentage of dead or badly damaged trees, from 64 Appalachian sites (reported by Committee on Monitoring and Assessment of Trends in Acid Deposition, 1986)

ƒ Eight of the 64 sites are in Southern states

ƒ Elevation information--whether the site was above or below 1,100 meters

ƒ This is an observational study (why?) Example: Tree Damage and Elevation

ƒ Data for the first ten sites Example: Tree Damage and Elevation

ƒ Let’s try to assess the relationship between the percentage of damaged trees and elevation--here is a boxplot of the percentage of damaged trees by elevation Example: Tree Damage and Elevation

ƒ Mean percentage of damaged trees by elevation group Example: Tree Damage and Elevation

ƒ What about region though? ƒ Boxplot percentage of damaged trees by region Example: Tree Damage and Elevation

ƒ So sites in the South have less damage on average (i.e., not only is the percentage of damaged trees related to elevation, but it is also related to region)

ƒ If region is related to elevation, then it’s possible that part of the relationship we saw (or didn’t see) between damage and elevation is because of the region-damage-elevation relationship Example: Tree Damage and Elevation

ƒ Possible diagram of this scenario Example: Tree Damage and Elevation

ƒ Recall the original finding--sites with lower elevation had a marginally lower percentage of damaged trees on average: 0.2% less damaged than sites at higher elevation

ƒ To adjust for regional differences in the elevation groups, and the damage/region relationship, let’s stratify by region, and estimate the damage/elevation association in each region Example: Tree Damage and Elevation

ƒ Mean percentage of damaged trees by elevation in southern sites

ƒ Southern sites at higher elevation have 14.5% less damaged trees on average than southern sites at lower elevation (95% CI: 48.0% lower - 19.0% higher) Example: Tree Damage and Elevation

ƒ Mean percentage of damaged trees by elevation in northern sites

ƒ Northern sites at higher elevation have 23% more damaged trees on average than northern sites at lower elevation (95% CI: 2% lower - 44% higher) Example: Tree Damage and Elevation

ƒ A recap ƒ Ignoring region, sites with lower elevation had a marginally lower percentage of damaged trees on average--0.2% less damaged than sites at higher elevation

ƒ When stratified by region . . . ƒ Northern sites showed positive, statistically-significant association between damage and elevation ƒ Southern sites showed negative, non-statistically-significant association between damage and elevation Example: Tree Damage and Elevation

ƒ So, it appears as though the association between tree damage and elevation is different, both in magnitude and direction depending on region

ƒ We have a small dataset, so lack of statistical significance of negative damage/elevation associations in the south may be because of lower power Example: Tree Damage and Elevation

ƒ One approach--take a weighted average of the average damage differences between sites at low and high elevations within each region, weighted by number of observations in each region

ƒ However, does not necessarily make sense here--why combine estimates that differ in direction into one overall estimate?

ƒ Better approach--to report two mean differences in damage between low and high elevation sites (one estimate for northern sites, one estimate for southern sites)

ƒ Better approach--multiple regression methods References and Citations

Lectures modified from notes provided by John McGready and Johns Hopkins Bloomberg School of Public Health accessible from the World Wide Web: http://ocw.jhsph.edu/courses/introbiostats/schedule.cfm