
Designing longitudinal studies

Donna Spiegelman, Professor of Epidemiologic Methods, Departments of Epidemiology and Biostatistics

Xavier Basagaña, Research Assistant Professor, Centre for Research in Environmental Epidemiology (CREAL), Spain

Topics covered in this talk

• Study design formulas based on tests that are valid and efficient for observational studies, for two reasonable alternative hypotheses.

• Comprehensive assessment of the effect of all parameters on power and size.

• Extension of results to a context where not all subjects enter the study at the same time.

• Extension of results to the case of time-varying covariates, and comparisons to the time-invariant covariates case.

• Use of a computer program to perform design computations, with an intuitive parameterization that is easy to use.

Design Problems to Solve

• Power (π) for fixed N, r

• N for fixed π, r

• r for fixed π, N

• Minimum detectable effect for fixed π, N, r

• Optimal (N_opt, r_opt) to maximize power for a fixed budget, or to minimize the total cost of the study for a fixed power.
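The first four design problems all invert the same Wald-test power relation, π = Φ(√N·|effect|/sd − z_{1−α/2}), where sd² is the per-subject variance of the estimated coefficient. The sketch below is a minimal illustration in Python (standard library only), not the optitxs.r program: the function names are hypothetical, the numbers in the usage note are made up, and `cmd_cs_unit_var` is a simplification of the CMD variance formula for a CS response with V(t0) = 0 that should be re-derived before it is relied on.

```python
from math import ceil, sqrt
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution


def power(n, effect, unit_var, alpha=0.05):
    """Power for fixed N: Phi(sqrt(N) * |effect| / sd - z_{1-alpha/2})."""
    return Z.cdf(sqrt(n) * abs(effect) / sqrt(unit_var) - Z.inv_cdf(1 - alpha / 2))


def sample_size(target_power, effect, unit_var, alpha=0.05):
    """Smallest N reaching the target power for a fixed per-subject variance."""
    z_sum = Z.inv_cdf(target_power) + Z.inv_cdf(1 - alpha / 2)
    return ceil(z_sum ** 2 * unit_var / effect ** 2)


def min_detectable_effect(n, target_power, unit_var, alpha=0.05):
    """Smallest |effect| detectable with the target power for fixed N."""
    z_sum = Z.inv_cdf(target_power) + Z.inv_cdf(1 - alpha / 2)
    return z_sum * sqrt(unit_var / n)


def min_measurements(n, target_power, effect, unit_var_of_r, r_max=50):
    """Smallest r in 1..r_max reaching the target power, or None if none does."""
    for r in range(1, r_max + 1):
        if power(n, effect, unit_var_of_r(r)) >= target_power:
            return r
    return None


def cmd_cs_unit_var(r, sigma2, rho, pe):
    """Assumed simplification: CMD, CS response, V(t0) = 0."""
    return sigma2 * (1 + r * rho) / (pe * (1 - pe) * (r + 1))
```

With the made-up inputs N = 200, effect = 0.75, σ² = 12, ρ = 0.3 and pe = 0.5, `min_measurements(200, 0.7, 0.75, lambda r: cmd_cs_unit_var(r, 12, 0.3, 0.5))` finds a finite r, while asking for 80% power returns `None`: under CMD with a CS response, adding measurements cannot push power past a ceiling.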

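Notation and Preliminary Results

• We focus on two alternative hypotheses:

1. Constant Mean Difference (CMD).

[Figure: mean response Y versus time for the unexposed and exposed groups, with their difference. Panel A: CMD, V(t0) = 0. Panel C: CMD, V(t0) > 0.]

$$E(Y_{ij} \mid X_i) = \mu_{0.0} + \mu_{0.1} + \cdots + \mu_{0.r} + \beta_1 X_i$$

$$E(Y_{ij} \mid X_i) = \beta_0 + \beta_1 X_i + \beta_2 T_{ij}$$

$$E(Y_{ij} \mid X_i) = \beta_0 + \beta_1 X_i$$

$$H_0: \beta_1 = 0$$

2. Linearly Divergent Differences (LDD).

[Figure: mean response Y versus time for the unexposed and exposed groups. Panel B: LDD, V(t0) = 0. Panel D: LDD, V(t0) > 0.]

$$E(Y_{ij} \mid X_i) = \mu_{0.0} + \mu_{0.1} + \cdots + \mu_{0.r} + \gamma_1 X_i + \gamma_3 (X_i \times T_{ij})$$

$$E(Y_{ij} \mid X_i) = \gamma_0 + \gamma_1 X_i + \gamma_2 T_{ij} + \gamma_3 (X_i \times T_{ij})$$

$$E(Y_{ij} \mid X_{ij}) = \eta'_0 + \eta'_1 t_{i0} + \eta'_2 (t_{ij} - t_{i0}) + \eta'_3 k_i + \eta'_4 (k_i \times t_{i0}) + \eta'_5 \big(k_i \times (t_{ij} - t_{i0})\big)$$

$$= \eta_0 + \eta_1 t_{i0} + \eta_2 t_{ij} + \eta_3 k_i + \eta_4 (k_i \times t_{i0}) + \eta_5 (k_i \times t_{ij}), \qquad \eta_5 = \gamma_3$$

$$E(Y_{i,j+1} - Y_{ij}) = \lambda_0 + \lambda_1 k_i, \qquad \lambda_1 = s\,\gamma_3$$

$$H_0: \gamma_3 = 0$$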

Clinical trials vs. Observational studies:

• Some test statistics are invalid (SLAIN under LDD) or less efficient (ANCOVA under CMD) in an observational context, where baseline means differ between groups:

$$E(Y_{i0} \mid X_i = 0) \neq E(Y_{i0} \mid X_i = 1)$$

[Figure: two panels of response Y versus time (0 to 5) for the control and treatment groups.]

Clinical trials vs. Observational studies:

• In RCTs, the time measure of interest is time from randomization, so everyone starts at the same time.

• In observational studies in epidemiology, for example, age is the time variable of interest, and study participants do not start at the same age. Exposure may be correlated with time.

• RCTs: time-invariant exposures; observational studies: exposures can be either time-invariant or time-varying.

• RCTs: exposure (treatment) prevalence is 50% by design; observational studies: exposures often have low or high prevalence (unbalanced designs).

Intuitive parameterization of the alternative hypothesis

1) $\mu_{00}$: the mean response at baseline (or at the mean initial time) in the unexposed group, where

$$\mu_{00} = E(Y_{i0} \mid X_i = 0), \quad i = 1, \ldots, N$$

2) $p_1$: the percent difference between exposed and unexposed groups at baseline (or at the mean initial time), where

$$p_1 = \frac{E(Y_{i0} \mid X_i = 1) - E(Y_{i0} \mid X_i = 0)}{E(Y_{i0} \mid X_i = 0)}$$

Intuitive parameterization of the alternative hypothesis (2)

3) $p_2$: the percent change from baseline (or from the mean initial time) to end of follow-up (or to the mean final time) in the unexposed group, where

$$p_2 = \frac{E(Y_{i\tau} \mid X_i = 0) - E(Y_{i0} \mid X_i = 0)}{E(Y_{i0} \mid X_i = 0)}$$

When τ is not fixed, $p_2$ is defined at time s instead of at time τ.

4) $p_3$: the percent difference between the change from baseline (or from the mean initial time) to end of follow-up (or mean final time) in the exposed group and the unexposed group, where

$$p_3 = \frac{E(Y_{i\tau} - Y_{i0} \mid X_i = 1) - E(Y_{i\tau} - Y_{i0} \mid X_i = 0)}{E(Y_{i\tau} - Y_{i0} \mid X_i = 0)}$$

When $p_2 = 0$, $p_3$ is defined as the percent change from baseline (or from the mean initial time) to the end of follow-up (or to the mean final time) in the exposed group, i.e.

$$p_3 = \frac{E(Y_{i\tau} \mid X_i = 1) - E(Y_{i0} \mid X_i = 1)}{E(Y_{i0} \mid X_i = 1)}$$

Intuitive parameterization of the alternative hypothesis (3)

• Under CMD, $\beta_1 = p_1\,\mu_{00}$.

• Under LDD:

– If $p_2 = 0$, $\gamma_3 = \eta_3 = \dfrac{(1 + p_1)\,p_3\,\mu_{00}}{\tau}$

– Else $\gamma_3 = \eta_3 = \dfrac{p_2\,p_3\,\mu_{00}}{\tau}$

Notation & Preliminary Results

• We consider studies where the interval between visits (s) is fixed but the duration of the study (τ) is free (e.g. participants are assessed every two years).

– Increasing r involves increasing the duration of the study.

• We also consider studies where the duration of the study, τ, is fixed, but the interval between visits is free (e.g. the study is 5 years long).

– Increasing r involves increasing the frequency of the measurements, i.e. shortening s.

• τ = s r.

Literature review (1)

1. Cochran, W. G. (1977). Sampling techniques. New York, Wiley.
2. Dawson, J. D. (1998). "Sample size calculations based on slopes and other summary statistics." Biometrics 54(1): 323-30.
3. Diggle, P. (2002). Analysis of longitudinal data. Oxford; New York, Oxford University Press.
4. Fitzmaurice, G. M., N. M. Laird, et al. (2004). Applied longitudinal analysis. Hoboken, N.J., Wiley-Interscience.
5. Frison, L. and S. J. Pocock (1992). "Repeated measures in clinical trials: analysis using mean summary statistics and its implications for design." Stat Med 11(13): 1685-704; Frison, L. J. and S. J. Pocock (1997). "Linearly divergent treatment effects in clinical trials with repeated measures: efficient analysis using summary statistics." Stat Med 16(24): 2855-72.
6. Galbraith, S. and I. C. Marschner (2002). "Guidelines for the design of clinical trials with longitudinal outcomes." Control Clin Trials 23(3): 257-73.

Literature review (2)

7. Hedeker, D., R. D. Gibbons, and C. Waternaux (1999). "Sample size estimation for longitudinal designs with attrition: comparing time-related contrasts between two groups." Journal of Educational and Behavioral Statistics 24(1): 70-93.
8. Jung, S. H. and C. Ahn (2003). "Sample size estimation for GEE method for comparing slopes in repeated measurements data." Stat Med 22(8): 1305-15.
9. Kirby, A. J., N. Galai, and A. Munoz (1994). "Sample size estimation using repeated measurements on biomarkers as outcomes." Control Clin Trials 15(3): 165-72.
10. Liu, G. and K. Y. Liang (1997). "Sample size calculations for studies with correlated observations." Biometrics 53(3): 937-47.
11. Overall, J. E. (1996). "How many repeated measurements are useful?" J Clin Psychol 52(3): 243-52; Overall, J. E. and S. R. Doyle (1994). "Estimating sample sizes for repeated measurement designs." Control Clin Trials 15(2): 100-23.

Literature review (3)

12. Raudenbush, S. W. (1997). "Statistical analysis and optimal design for cluster randomized trials." Psychol Methods 2(2): 173-85; Raudenbush, S. W. and X.-F. Liu (2001). "Effects of study duration, frequency of observation, and sample size on power in studies of group differences in polynomial change." Psychol Methods 6(4): 387-401.
13. Rochon, J. (1998). "Application of GEE procedures for sample size calculations in repeated measures experiments." Stat Med 17(14): 1643-58.
14. Schlesselman, J. J. (1973). "Planning a longitudinal study. II. Frequency of measurement and study duration." J Chronic Dis 26(9): 561-70.
15. Schouten, H. J. (1999). "Planning group sizes in clinical trials with a continuous outcome and repeated measures." Stat Med 18(3): 255-64.

Literature review (4)

17. Snijders, T. A. B. and R. J. Bosker (1993). "Standard errors and sample sizes for two-level research." Journal of Educational Statistics 18(3): 237-59.
18. Tu, X. M., J. Kowalski, J. Zhang, K. G. Lynch, and P. Crits-Christoph (2004). "Power analyses for longitudinal trials and other clustered designs." Stat Med 23(18): 2799-815.
19. Yi, Q. and T. Panzarella (2002). "Estimating sample size for tests on trends across repeated measurements with missing data based on the interaction term in a mixed model." Control Clin Trials 23(5): 481-96.

General Theoretical Results

• Under CMD with CS response, ANCOVA is optimal (valid and efficient) in RCTs but inefficient in observational studies (Appendix 2).

• Under LDD with CS response, SLAIN is optimal in RCTs but invalid in observational studies (Appendix 2).

• With CS or RS response and V(t0) = 0, the 2-stage estimator = OLS = GLS (Appendix 3).

Notation & Preliminary Results

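• Model: $E[Y_i \mid X_i] = X_i B$, $\mathrm{Var}[Y_i \mid X_i] = \Sigma_i$, $i = 1, \ldots, n$

• The generalized least squares (GLS) estimator of B is

$$\hat{B} = \left( \frac{1}{n} \sum_i X_i' \Sigma_i^{-1} X_i \right)^{-1} \left( \frac{1}{n} \sum_i X_i' \Sigma_i^{-1} Y_i \right), \qquad \hat{B} \;\dot\sim\; N\!\left(B, \tfrac{1}{n}\Sigma_B\right), \qquad \Sigma_B = \left( E_X\!\left[ X_i' \Sigma_i^{-1} X_i \right] \right)^{-1}$$

• Power formula

$$\pi = \Phi\!\left[ \frac{\sqrt{n}\,(c'B)_{H_A}}{\sqrt{c' \Sigma_B c}} - z_{1-\alpha/2} \right]$$

Notation & Preliminary Results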

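• Let $v_{lm}$ be the (l, m)th element of $\Sigma^{-1}$.

• Under CMD:

– If V(t0) = 0:

$$\mathrm{Var}(\hat\beta_1) = \left[ p_e (1 - p_e) \sum_{j=0}^{r} \sum_{j'=0}^{r} v_{jj'} \right]^{-1}$$

– If V(t0) > 0 and s is fixed:

$$\mathrm{Var}(\hat\beta_1) = \frac{\left( \sum_{j=0}^{r} \sum_{j'=0}^{r} v_{jj'} \right)^{2} V(t_0) + s^2 \det(A)}{p_e (1 - p_e) \left( \sum_{j=0}^{r} \sum_{j'=0}^{r} v_{jj'} \right) \left[ s^2 \det(A) + \left( \sum_{j=0}^{r} \sum_{j'=0}^{r} v_{jj'} \right)^{2} \left( 1 - \rho_{e,t_0}^2 \right) V(t_0) \right]}$$

where

$$A = \begin{pmatrix} \sum_{j=0}^{r} \sum_{j'=0}^{r} v_{jj'} & \sum_{j=0}^{r} \sum_{j'=0}^{r} j\, v_{jj'} \\ \sum_{j=0}^{r} \sum_{j'=0}^{r} j\, v_{jj'} & \sum_{j=0}^{r} \sum_{j'=0}^{r} j j'\, v_{jj'} \end{pmatrix}$$

Notation & Preliminary Results

• Let $v_{lm}$ be the (l, m)th element of $\Sigma^{-1}$.

• Under LDD and fixed s, and when the difference model is used:

– If V(t0) = 0:

$$\mathrm{Var}(\hat\gamma_3) = \frac{\sum_{j=0}^{r} \sum_{j'=0}^{r} v_{jj'}}{p_e (1 - p_e)\, s^2 \det(A)}$$

– If V(t0) > 0 and $\Sigma_i = \Sigma\ \forall i$:

$$\mathrm{Var}(\hat\gamma_3) = \frac{\sum_{j=0}^{r} \sum_{j'=0}^{r} v_{jj'}}{p_e (1 - p_e) \left[ \left( 1 - \rho_{e,t_0}^2 \right) V(t_0) \left( \sum_{j=0}^{r} \sum_{j'=0}^{r} v_{jj'} \right)^{2} + s^2 \det(A) \right]}$$

Correlation structures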

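• We consider three common correlation structures:

1. Compound symmetry (CS).

$$\mathrm{Var}(Y_i \mid X_i) = \Sigma_{(r+1) \times (r+1)} = \sigma^2 \begin{pmatrix} 1 & \rho & \cdots & \rho \\ \rho & 1 & \ddots & \vdots \\ \vdots & \ddots & \ddots & \rho \\ \rho & \cdots & \rho & 1 \end{pmatrix}$$

Correlation structures

2. Damped Exponential (DEX) (Munoz et al., 1992), with $\theta \in [0, 1]$.

$$\Sigma = \sigma^2 \begin{pmatrix} 1 & \rho^{s^\theta} & \rho^{(2s)^\theta} & \cdots & \rho^{(rs)^\theta} \\ \rho^{s^\theta} & 1 & \rho^{s^\theta} & \cdots & \vdots \\ \rho^{(2s)^\theta} & \rho^{s^\theta} & 1 & \ddots & \rho^{(2s)^\theta} \\ \vdots & & \ddots & \ddots & \rho^{s^\theta} \\ \rho^{(rs)^\theta} & \cdots & \rho^{(2s)^\theta} & \rho^{s^\theta} & 1 \end{pmatrix}$$

Example correlation matrices (lower triangles) with ρ = 0.8:

θ = 0: CS

    1
    0.8   1
    0.8   0.8   1
    0.8   0.8   0.8   1
    0.8   0.8   0.8   0.8   1

θ = 0.3

    1
    0.8   1
    0.76  0.8   1
    0.73  0.76  0.8   1
    0.71  0.73  0.76  0.8   1

θ = 1: AR(1)

    1
    0.8   1
    0.64  0.8   1
    0.51  0.64  0.8   1
    0.41  0.51  0.64  0.8   1

Correlation structures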

3. Random intercepts and slopes (RS).

$$\Sigma_i = Z_i D Z_i' + (1 - \rho_{t_0})\, \sigma^2 I, \qquad D = \begin{pmatrix} \sigma_{b_0}^2 & \rho_{b_0 b_1} \sigma_{b_0} \sigma_{b_1} \\ \rho_{b_0 b_1} \sigma_{b_0} \sigma_{b_1} & \sigma_{b_1}^2 \end{pmatrix}$$

$Z_i = (1, t_i)$, $\dim(Z_i) = (r + 1) \times 2$; $\rho_{b_1,\tau} = 0 \Rightarrow$ CS.

• Reparameterizing:

– $\rho_{t_0} \in [0, 1]$ is the reliability coefficient at baseline.

– $\rho_{b_1,\tau} \in [0, 1]$ is the slope reliability at the end of follow-up ($\rho_{b_1,\tau} = 0$ is CS; $\rho_{b_1,\tau} = 1$ means all variation in slopes is between subjects).

• With this correlation structure, the variance of the response changes with time, i.e. this correlation structure gives a heteroscedastic model.

• When V(t0) > 0, $\Sigma_i \neq \Sigma$.

Example

• The goal is to investigate the effect of indicators of socioeconomic status and post-menopausal hormone use (PMH) on cognitive function (CMD) and cognitive decline (LDD).

• “Pilot study”: Lee S, Kawachi I, Berkman LF, Grodstein F. “Education, other socioeconomic indicators, and cognitive function.” Am J Epidemiol 2003; 157: 712-720. We will refer to this study as Grodstein.

• Design questions include:
– the power of the published study to detect effects of specified magnitude,
– the number and timing of additional tests needed to obtain a study with the desired power to detect effects of specified magnitude,
– the optimal number of participants and measurements needed in a de novo study of these issues.

Example

• At baseline and two years later, six cognitive tests were administered to 15,654 participants in the Nurses’ Health Study.

• Outcome: Telephone Interview for Cognitive Status (TICS)

– μ00 = 32.7 (SD 4)

– The model implies Var(Y) = 16 and Var(Y | X) ≈ 12, i.e. R² = 0.25

– p2 = −0.3% / year ⇔ 1 point / 10 years of age = γ2
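The percent-scale parameters translate into regression coefficients through the relations given with the intuitive parameterization (β1 = p1·μ00 under CMD; γ3 = p2·p3·μ00/τ under LDD, or (1 + p1)·p3·μ00/τ when p2 = 0). The Python sketch below is only an illustration of that arithmetic with the numbers quoted in this example; the function names are hypothetical, and the expression for γ2 is an assumption that follows from the definition of p2 under a linear time trend.

```python
def cmd_effect(p1, mu00):
    """beta1: absolute exposed-vs-unexposed difference at baseline."""
    return p1 * mu00


def time_slope(p2, mu00, tau):
    """gamma2: change per unit time in the unexposed (assumes a linear trend)."""
    return p2 * mu00 / tau


def ldd_effect(p2, p3, mu00, tau, p1=0.0):
    """gamma3: difference in slopes between exposed and unexposed."""
    if p2 == 0:
        return (1 + p1) * p3 * mu00 / tau
    return p2 * p3 * mu00 / tau


mu00 = 32.7                               # baseline mean TICS score
gamma2 = time_slope(-0.003, mu00, 1)      # p2 = -0.3% over one year
beta1_grad = cmd_effect(0.023, mu00)      # p1 = 2.3% for GRAD
gamma3_grad = ldd_effect(-0.006, 0.7, mu00, 2)  # p2 over 2 years, p3 = 70%

print(round(10 * gamma2, 2))   # change over 10 years of age, in points
print(round(beta1_grad, 2))    # baseline difference, in points
print(round(gamma3_grad, 4))   # slope difference, points per year
```

The first two printed values line up with the "1 point / 10 years" and "0.75 points" equivalences quoted in the example.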

Example

• Exposure: Graduate school degree vs. not (GRAD)

– pe = 6.2%
– Corr(GRAD, age at start of follow-up) = −0.01
– p1 = 2.3% ⇔ 0.75 points = β1, γ1

• Exposure: Post-menopausal hormone use (CURRHORM)

– pe = 26.7%, Corr(CURRHORM, age) = −0.06
– p1 = 0.7% ⇔ 0.02 points

• Time: age (years) is the best choice, not cycle or calendar year of test.

– The mean age was 74 and V(t0) ≈ 4.

Example

• The estimated covariance parameters were

    Parameter                    CS      RS
    ρ (CS) or ρ_{t0} (RS)        0.27    0.26
    ρ_{b1, r̃=2}                          0.04
    ρ_{b1, r=1}                          0.01
    ρ_{b0 b1}                            −0.14

• SAS code to fit the LDD model with CS covariance

    proc mixed;
      class id;
      model tics = grad age gradage / s;
      random id;
    run;

• SAS code to fit the LDD model with RS covariance

    proc mixed;
      class id;
      model tics = grad age gradage / s ddfm=bw;
      random intercept age / type=un subject=id;
    run;

Example

• To estimate ρ_{t0} and θ under DEX response:

– Get the RDEC package from Vincent Carey
– R code

    library(RDEC)
    mod1 = rdec(tics ~ age + grad + gradage, data=dat, id=ID, S=age,
                omega.init=c(.5,.5), omega.low=c(.01,.01), omega.high=c(.95,.95))
    summary(mod1)

– r ≥ 2 is needed to fit the DEX response
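To get a feel for what a fitted (ρ, θ) pair implies, it can help to print the correlation matrix it generates. The Python sketch below is a hypothetical helper (not part of RDEC or optitxs.r) that builds the DEX correlation matrix with entries ρ^((s·|j − j'|)^θ), following the definition given earlier in the talk; with ρ = 0.8 and s = 1 it should reproduce the three example matrices shown on the DEX slide.

```python
def dex_corr(r, rho, theta, s=1.0):
    """(r+1) x (r+1) damped exponential correlation matrix.

    theta = 0 gives compound symmetry, theta = 1 gives AR(1).
    """
    size = r + 1
    return [
        [1.0 if j == k else rho ** ((s * abs(j - k)) ** theta) for k in range(size)]
        for j in range(size)
    ]


for theta in (0.0, 0.3, 1.0):
    print(f"theta = {theta}")
    for j, row in enumerate(dex_corr(4, 0.8, theta)):
        # print the lower triangle only, rounded to two decimals
        print("  ".join(f"{value:.2f}" for value in row[: j + 1]))
```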

Program optitxs.r makes it all possible

http://www.hsph.harvard.edu/faculty/donna-spiegelman

http://www.hsph.harvard.edu/faculty/spiegelman/vita.html

http://www.hsph.harvard.edu/faculty/spiegelman/software.html

http://www.hsph.harvard.edu/faculty/spiegelman/optitxs.html

Illustration of use of software optitxs.r

• We’ll calculate the power of Grodstein’s published study to detect the observed 70% difference in rates of decline between those with more than high school vs. others over the original two-year period.

• Recall that 6.2% of NHS participants had more than high school; cognitive function declined by 0.3% per year.

> long.power()
Press to quit

Constant mean difference (CMD) or Linearly divergent difference (LDD)? ldd The alternative is LDD.

Enter the total sample size (N): 15000

Enter the number of post-baseline measures (r>0): 1

Enter the time between repeated measures (s): 2

Enter the exposure prevalence (pe) (0<=pe<=1): 0.062

Enter the variance of the time variable at baseline, V(t0) (enter 0 if all participants begin at the same time): 4

Enter the correlation between the time variable at baseline and exposure, rho[e,t0] (enter 0 if all participants begin at the same time): -0.01

Will you specify the alternative hypothesis on the absolute (beta coefficient) scale (1) or the relative (percent) scale (2)? 2
The alternative hypothesis will be specified on the relative (percent) change scale.

Enter mean response at baseline among unexposed (mu00): 32.7

Enter the percent change from baseline to end of follow-up among unexposed (p2) (e.g. enter 0.10 for a 10% change): -0.006

Enter the percent difference between the change from baseline to end of follow-up in the exposed group and the unexposed group (p3) (e.g. enter 0.10 for a 10% difference): 0.7

Which covariance matrix are you assuming: compound symmetry (1), damped exponential (2) or random slopes (3)? 2 You are assuming DEX covariance

Enter the residual variance of the response given the assumed model covariates (sigma2): 12

Enter the correlation between two measures of the same subject separated by one unit (rho): 0.3

Enter the damping coefficient (theta): 0.10

Power = 0.4206059
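The session above can be checked against the formulas earlier in the talk. The Python sketch below is an independent re-implementation under stated assumptions, not the optitxs.r source: it builds the DEX covariance with correlations ρ^((s·|j − j'|)^θ), inverts it with a small Gauss-Jordan routine, evaluates Var(γ̂3) for V(t0) > 0 with a common Σ, and plugs the result into the Wald power formula. All function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def dex_cov(r, s, sigma2, rho, theta):
    """DEX covariance matrix for r + 1 visits spaced s time units apart."""
    n = r + 1
    return [[sigma2 * (1.0 if j == k else rho ** ((s * abs(j - k)) ** theta))
             for k in range(n)] for j in range(n)]


def invert(m):
    """Gauss-Jordan inverse with partial pivoting (small matrices only)."""
    n = len(m)
    aug = [list(map(float, row)) + [float(i == j) for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(aug[i][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for i in range(n):
            if i != col:
                f = aug[i][col]
                aug[i] = [x - f * y for x, y in zip(aug[i], aug[col])]
    return [row[n:] for row in aug]


def ldd_unit_variance(cov, s, pe, v_t0, rho_e_t0):
    """Per-subject Var(gamma3_hat) under LDD with a common covariance matrix."""
    v = invert(cov)
    n = len(v)
    total = sum(sum(row) for row in v)                       # sum of v_jj'
    a = sum(j * v[j][k] for j in range(n) for k in range(n))
    b = sum(j * k * v[j][k] for j in range(n) for k in range(n))
    det_a = total * b - a * a
    return total / (pe * (1 - pe) *
                    ((1 - rho_e_t0 ** 2) * v_t0 * total ** 2 + s ** 2 * det_a))


def wald_power(n, effect, unit_var, alpha=0.05):
    z = NormalDist()
    return z.cdf(sqrt(n) * abs(effect) / sqrt(unit_var) - z.inv_cdf(1 - alpha / 2))


# Inputs typed into the session above
N, r, s, pe, v_t0, rho_e_t0 = 15000, 1, 2, 0.062, 4, -0.01
mu00, p2, p3 = 32.7, -0.006, 0.7
gamma3 = p2 * p3 * mu00 / (s * r)                            # tau = s * r
unit_var = ldd_unit_variance(dex_cov(r, s, 12, 0.3, 0.10), s, pe, v_t0, rho_e_t0)
print(round(wald_power(N, gamma3, unit_var), 4))
```

If the assumptions match the program, the printed power should agree with the session output to about four decimals.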

Power of current study

• To detect the observed 70% difference in cognitive decline by GRAD
– CS (ρ = 0.30): 44%
– RS (ρ_{b1, r=2} = 0.04, ρ_{b0, b1} = −0.14): 35%
– DEX (ρ_{t0} = 0.3, θ = 0.10): 42%

• To detect a hypothesized ±10% difference in cognitive decline by current hormone use
– CS & DEX: 7%
– RS: 6%

How many additional measurements are needed when tests are administered every 2 years, i.e. how many more years of follow-up are needed...

• To detect the observed 70% difference in cognitive decline by GRAD with 90% power?
– CS, DEX (θ = 0.10), RS: 3 post-baseline measurements = 6 years = τ
• one more 5-year grant cycle

• To detect a hypothesized ±20% difference in cognitive decline by current hormone use with 90% power?
– CS, DEX (θ = 0.10): 6 post-baseline measurements = 12 years = τ
• more than two 5-year grant cycles

N = 15,000 for these calculations

How many more measurements should be taken in four (one NIH grant cycle) and eight years of follow-up (two NIH grant cycles)...

• To detect the observed 70% difference in cognitive decline by GRAD with 90% power?

    Duration of follow-up    4 years    8 years
    CS                       8          1
    DEX (θ = 0.10)           10         1
    RS                       10         1

• To detect a hypothesized ±20% difference in cognitive decline by current hormone use with 90% power?

    Duration of follow-up    4 years    8 years
    CS                       >50        11
    DEX (θ = 0.10)           >50        17
    RS                       >50        13

Optimize (N,r) in a new study of cognitive decline

• Assume
– τ = 4 years of follow-up (1 NIH grant cycle); s fixed at 2 years
– the cost of recruitment and baseline measurements is twice that of subsequent measurements

• GRAD: p3 = 70%
– (N, r) = (26,795; 1) CS
– (N, r) = (26,930; 1) DEX (θ = 0.10)
– (N, r) = (28,945; 1) RS

• CURRHORM: p3 = 20%
– (N, r) = (97,662; 1) CS
– (N, r) = (98,155; 1) DEX (θ = 0.10)
– (N, r) = (105,470; 1) RS

Summary of the features of existing programs

Column key: CMD, LDD = alternative hypotheses; CS, RS, DEX = covariance structures; Corr. = exposure and time correlated; Opt. = optimal (N, r) for fixed cost and/or fixed power; N = N for fixed r; r = r for fixed N; MDE = minimum detectable effect. ✓ = supported, × = not supported.

Software | Reference | CMD | LDD | CS | RS | DEX | V(t0) > 0 | Corr. | Opt. | Power | N | r | MDE
PINT | Snijders (1993, 2003) | ✓ | ✓ | ✓ | ✓ | × | ×* | × | ✓ (fixed cost) | computes the standard errors, Var(β̂1) and Var(γ̂3), instead of Power, N, r and MDE
RMASS2 | Hedeker (1999a, 1999b) | ✓ | ✓ | ✓ | ✓ | ✓ | × | × | × | × | ✓ | × | ×
GEESIZE | Rochon (1998), Ziegler (2004) | ✓ | ✓ | ✓ | × | ✓ | × | × | × | × | ✓ | × | ×
OPTITXS | Basagana and Spiegelman (2007) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ (both constraints) | ✓ | ✓ | ✓ | ✓

PINT: http://stat.gamma.rug.nl/snijders/
RMASS2: http://tigger.uic.edu/~hedeker/works.html
GEESIZE: http://www.imbs.uni-luebeck.de/pub/Geesize/
OPTITXS: http://www.hsph.harvard.edu/faculty/spiegelman/optitxs.html

*Only considers the B&W model, which reduces to the V(t0) = 0 case (Appendix 1.3)

Designing Longitudinal Studies:

General theoretical results

• CMD:
– Power increases as ρ → 0 and as θ → 1.
– Power increases as Var(t0) goes to 0.
– Power is maximum at pe = 0.5.

• LDD:
– Power increases as ρ → 1, as θ → 0, as $\rho_{b_1,\tilde{r}} \to 0$, as V(t0) increases, and as the correlation between t0 and exposure goes to 0.
– Power is maximum at pe = 0.5.

General Theoretical Results

• Under CMD with CS response, ANCOVA is optimal (valid and efficient) in RCTs but inefficient in observational studies (Appendix 2).

• Under LDD with CS response, SLAIN is optimal in RCTs but invalid in observational studies (Appendix 2).

• With CS or RS response and V(t0) = 0, the 2-stage estimator = OLS = GLS (Appendix 3).

Theoretical Results on minimum r for fixed N and π

• Under LDD, V(t0) = 0, and fixed τ, power when r = 2 is the same as power when r = 1 (Appendix 4).

• r is minimized at pe = 0.5 (Appendix 5).

• Power is limited below 100% as r → ∞ (Appendix 6):
– Under CMD:
• CS & RS response
• CS & AR(1) response when τ is fixed
– Under LDD:
• AR(1) and fixed τ
• RS, V(t0) = 0

Theoretical Results on the effect of ρ, $\rho_{t_0}$, and $\rho_{b_1}$ on minimum r for fixed N and π

• Under CMD with CS response and V(t0) = 0:
– r ↑ as ρ ↑ when $(z_\pi + z_{1-\alpha/2})^2\, \sigma^2 > \beta_1^2\, N p_e (1 - p_e)$
– else r ↓ as ρ ↑

• Under LDD with CS or RS response, V(t0) = 0, and s fixed, r ↓ as ρ ↑.

• Under LDD with CS or RS response, V(t0) = 0, and s fixed, r ↑ as $\rho_{b_1}$ ↑.

Designing Longitudinal Studies to Optimize the Number of Subjects and Number of Repeated Measurements:

Theoretical results

Literature review

Bloch, D. A. (1986). "Sample size requirements and the cost of a randomized clinical trial with repeated measurements." Stat Med 5(6): 663-7.
Cochran, W. G. (1977). Sampling techniques. 3rd edition. New York, Wiley.
Moerbeek, M., G. J. P. van Breukelen, and M. P. F. Berger (2000). "Design issues for experiments in multilevel populations." Journal of Educational and Behavioral Statistics 25(3): 271-84.
Raudenbush, S. W. (1997). "Statistical analysis and optimal design for cluster randomized trials." Psychol Methods 2(2): 173-85.
Snijders, T. A. B. and R. J. Bosker (1993). "Standard errors and sample sizes for two-level research." Journal of Educational Statistics 18(3): 237-59.
Winkens, B., H. J. A. Schouten, G. J. P. van Breukelen, and M. P. F. Berger (2006). "Optimal number of repeated measures and group sizes in clinical trials with linearly divergent treatment effects." Contemporary Clinical Trials 27: 57-69.

Problem to solve

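$$\kappa = \frac{\text{cost of 1st measurement}}{\text{cost of 2nd or following measurement}}$$

Minimize cost subject to a target power:

$$\min_{N,r}\; N c_1 + \frac{N r c_1}{\kappa} \quad \text{subject to} \quad \Phi\!\left[ \frac{\sqrt{N}\,(c'B)_{H_A}}{\sqrt{c' \Sigma_B c}} - z_{1-\alpha/2} \right] \ge \text{Power}_{\text{target}} \quad \equiv \quad \min_r\; (\kappa + r)\, c' \Sigma_B c$$

Maximize power subject to a budget:

$$\max_{N,r}\; \Phi\!\left[ \frac{\sqrt{N}\,(c'B)_{H_A}}{\sqrt{c' \Sigma_B c}} - z_{1-\alpha/2} \right] \quad \text{subject to} \quad N c_1 + \frac{N r c_1}{\kappa} \le \text{Budget} \quad \equiv \quad \min_r\; (\kappa + r)\, c' \Sigma_B c$$

• r_opt is the same for both minimizing cost subject to fixed power and maximizing power subject to fixed cost (Appendix 9).

• N_opt will be different.

Methods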

• Results are based on a Wald test of the coefficient of interest, using the GLS estimator.

• Two scenarios:
– Fixed frequency of measurements, s.
• Increasing r involves increasing the duration of follow-up, τ.
– Fixed length of follow-up, τ.
• Increasing r involves increasing the frequency of measurement, i.e. shortening s.

Shape of the group difference

[Figure: four panels of mean response Y versus time for the unexposed and exposed groups and their difference. Panel A: CMD, V(t0) = 0. Panel B: LDD, V(t0) = 0. Panel C: CMD, V(t0) > 0. Panel D: LDD, V(t0) > 0. Under CMD the parameter of interest is the exposure group difference; under LDD it is the group by time interaction term.]

Covariance structures

1. Compound symmetry (CS).

$$\mathrm{Cov}(Y_{ij}, Y_{ij'}) = \sigma^2 \rho$$

2. Damped Exponential (DEX).

$$\mathrm{Cov}(Y_{ij}, Y_{ij'}) = \sigma^2 \rho^{|j - j'|^{\theta}}$$

3. Random intercepts and slopes (RS).

$$\mathrm{Var}(Y_i \mid X_i) = Z_i D Z_i' + (1 - \rho_{t_0})\, \sigma^2 I, \qquad Z_i = (1 \;\; t_i)$$

Theoretical results

• Comprehensive results for longitudinal studies, under different scenarios:

– Shape of the group difference: CMD or LDD.
– Covariance structure: CS, DEX, RS.
– Whether or not all participants are observed at the same time points (e.g. age is the time variable of interest).

CMD

• Under CS:
– If κ = 1, one would take no repeated measures and instead increase the number of participants.
– If κ > 1:
• If correlations are large: still no repeated measures, or just a small number of them.
• If correlations are small: taking some repeated measures and fewer participants is optimal.
• If deviations from CS exist, it is advisable to take more repeated measures and fewer participants.

LDD, same set of time points

Fixed s
• If the follow-up period is not fixed, choose the maximum length of follow-up possible with r_opt = 1 (except when RS is assumed, where in most scenarios r_opt = 2 to 6).

Fixed τ
• If the follow-up period is fixed, one would take more than one repeated measure only when κ > 5. When there are departures from CS, values of κ around 10 or 20 are needed to justify taking 3 or 4 measures.

LDD: Effect of having different time points

• Results depend additionally on V(t0) and the correlation between exposure and time.

• Increasing V(t0) can either increase or decrease r_opt, depending on the case. Few patterns appear.

[Figure: r_opt (0 to 15) versus V(t0) (0 to 10) under LDD with CS response, ρ = 0.2 and ρ_{e,t} = 0.2, for κ = 2, 5, 10 and 20.]

LDD

– If the follow-up period is not fixed, choose the maximum length of follow-up (τ) possible (except when RS is assumed).

– If the follow-up period is fixed, one would take more than one repeated measure only when the subsequent measures are more than five times cheaper. When there are departures from CS, values of κ around 10 or 20 are needed to justify taking 3 or 4 measures.
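The "more than five times cheaper" threshold can be explored numerically for the simplest case. The Python sketch below assumes LDD, a CS response, V(t0) = 0 and a fixed follow-up τ; under those assumptions the LDD variance formula given earlier simplifies (by my own algebra, worth re-deriving) to Var(γ̂3) ∝ r / ((r + 1)(r + 2)), so the quantity to minimize, (κ + r)·c'Σ_B c, is proportional to (κ + r)·r / ((r + 1)(r + 2)). The function names and the cap `r_max` are hypothetical.

```python
def relative_variance(r):
    """Var(gamma3_hat) up to a constant: LDD, CS, V(t0) = 0, fixed tau (assumed)."""
    return r / ((r + 1) * (r + 2))


def relative_cost(r, kappa):
    """Quantity minimized over r: (kappa + r) times the relative variance."""
    return (kappa + r) * relative_variance(r)


def optimal_r(kappa, r_max=20):
    """Number of post-baseline measurements in 1..r_max with the lowest cost."""
    return min(range(1, r_max + 1), key=lambda r: relative_cost(r, kappa))


for kappa in (1, 2, 5, 6, 10, 20):
    print(kappa, optimal_r(kappa))
```

In this sketch r = 1 and r = 2 give the same variance, consistent with the Appendix 4 result, and for κ ≤ 5 a single post-baseline measurement is optimal. For κ > 5 the optimum under exact CS jumps to the largest r allowed, so the cap on r matters.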

LDD

• The optimal (N, r) and the resulting power strongly depend on the correlation structure. Combinations that are optimal for one correlation structure may be bad for another.
• We recommend performing a sensitivity analysis.
• All the decisions are based on power considerations alone. There might be other reasons to take repeated measures.

Part 2. Designing longitudinal studies with a time-varying exposure

Introduction

• RCTs:
– the exposure is time-invariant, or
– the exposure varies in a manner that is controlled by design.
• Observational studies:
– the investigator does not control how exposure varies within subjects over time;
– a large number of exposure patterns are observed, with large differences in the number of exposed periods per participant and changes in the cross-sectional prevalence of exposure over time.

Introduction

• Motivating example (Medina-Ramon et al. Eur Resp J 2006):
– Study of domestic cleaners, followed for 15 consecutive days.
– Every day, cleaners provided measurements of lung function and reported on cleaning product use (e.g. used bleach yes/no).
– Exposed days per person (bleach): mean 10, range 1-15.

Methods

• We present formulas for study design calculations that address these issues for studies with a continuous outcome and a binary, time-varying exposure.
• We cover studies where the interest is the effect of a time-varying exposure on either:
– the mean level of the response (main effect of exposure) (CMD), or
– the rate of change of the response over time (exposure by time interaction) (LDD).
• We assume that participants are observed at r + 1 equidistant time points.

Literature Review

• Jones, B. & Kenward, M. G. (1989) Design and analysis of cross-over trials, London; New York, Chapman and Hall.
• Julious, S. A. (2004) Sample sizes for clinical trials with normal data. Stat Med, 23, 1921-86.
• Moerbeek, M., van Breukelen, J. P. & Berger, M. P. F. (2001) Optimal experimental designs for multilevel models with covariates. Communications in Statistics - Theory and Methods, 30, 2683-97.

Introduction: Patterns -- CMD

[Figure: two rows of paired panels of response Y versus time. Left panels: "envelope" trajectories for the exposed and unexposed groups. Right panels: a possible pattern for one subject observed at times 0 to 4 with exposure E = 0, 1, 1, 0, 0.]

Notation: Models -- CMD

• Basic model: $E(Y_{ij} \mid E_{ij}) = \beta_0 + \beta_1 E_{ij} + \beta_2 t_{ij}$
• One can assume a different intercept for every participant and fit the model by conditional likelihood.
– This model estimates the within-subject effect of exposure.
– It generalizes the paired t-test: every participant serves as his/her own control.
– It is equivalent to fitting the model on differences (Appendix B):

$$E(Y_{i,j+1} - Y_{ij} \mid E_{i,j+1}, E_{ij}) = \beta_2^W + \beta_1^W (E_{i,j+1} - E_{ij})$$

Notation: Intraclass correlation of exposure, ρe

• With a CS exposure covariance, ρe is a measure of:
– the common correlation of exposure for all pairs,
– the within-subject variability of exposure,
– the imbalance in the number of exposed periods per person, $E_{i\bullet}$.

• When ρe = 1, the exposure is time-invariant.
– There is no within-subject variation of exposure.
– There is maximum imbalance in $E_{i\bullet}$: $E_{i\bullet} = 0$ or $E_{i\bullet} = r + 1$.

• When ρe = −1/r:
– Maximum within-subject variation of exposure.
– Minimum imbalance in $E_{i\bullet}$: everyone is exposed the same number of periods (“designed study”).

• Example: ρe = 0.35 for exposure to bleach in the cleaners study (ρx = 0.36).

Notation: Within-subject exposure correlations

• Correlation between all consecutive pairs of exposure measurements within-subject.

$$\begin{pmatrix} 1 & & & \\ \rho_{e_0,e_1} & 1 & & \\ \vdots & \ddots & \ddots & \\ \rho_{e_0,e_r} & \cdots & \rho_{e_{r-1},e_r} & 1 \end{pmatrix}$$

• First-order autocorrelation of exposure, $\rho_{e_1}$

Results: CS Response, V(t0) = 0

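Model: $E(Y_{ij} \mid X_i) = E(Y_{ij} \mid E_{ij}) = \beta_0 + \beta_1 E_{ij}$

$$\mathrm{Var}(\hat\beta_1) = \frac{\sigma^2 (1 - \rho)(1 + r\rho)}{p_e (1 - p_e)(r + 1)\left[ 1 - \rho + r\rho\,(1 - \rho_e) \right]}$$

Model: $E(Y_{i,j+1} - Y_{ij} \mid E_{i,j+1}, E_{ij}) = \beta_1^W (E_{i,j+1} - E_{ij})$

$$\mathrm{Var}(\hat\beta_1^W) = \frac{\sigma^2 (1 - \rho)}{p_e (1 - p_e)\, r\, (1 - \rho_e)}$$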
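The two variance expressions for a CS response can be evaluated directly. The Python sketch below is a hypothetical illustration, not the optitxs.r code: `var_within` follows the within-subject formula, `var_main_effect` is the GLS main-effect variance written in a form derived under the assumption of a CS exposure process with constant prevalence, and the power call uses the inputs of the long.power.tv() session quoted later in the talk (N = 31, r = 14, β1 = 10, σ² = 4686, ρ = 0.88, pe = 0.37, ρe = 0.13).

```python
from math import sqrt
from statistics import NormalDist


def var_within(sigma2, rho, pe, rho_e, r):
    """Per-subject variance of the within-subject exposure effect (CS response)."""
    return sigma2 * (1 - rho) / (pe * (1 - pe) * r * (1 - rho_e))


def var_main_effect(sigma2, rho, pe, rho_e, r):
    """Per-subject GLS variance of the main effect (assumes CS exposure)."""
    return (sigma2 * (1 - rho) * (1 + r * rho)
            / (pe * (1 - pe) * (r + 1) * (1 - rho + r * rho * (1 - rho_e))))


def wald_power(n, effect, unit_var, alpha=0.05):
    z = NormalDist()
    return z.cdf(sqrt(n) * abs(effect) / sqrt(unit_var) - z.inv_cdf(1 - alpha / 2))


power_tv = wald_power(31, 10, var_within(4686, 0.88, 0.37, 0.13, 14))
print(round(power_tv, 4))

# Sample size ratio: time-invariant formula versus time-varying exposure
ssr = var_main_effect(4686, 0.88, 0.37, 1.0, 14) / var_main_effect(4686, 0.88, 0.37, 0.13, 14)
print(round(ssr, 1))
```

Two sanity checks on the sketch: at ρe = 1 the main-effect variance reduces to the time-invariant CMD formula, and at ρe = −1/r it coincides with the within-subject variance, since exposure then has no between-subject variation.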

Results: Required Input Parameters

To compute $\mathrm{Var}(\hat\beta_1)$ we need:

• If CS response: $p_e$ or $p_{e_j}\ \forall j$, and $\rho_e$ (Appendix C)
• If AR(1) response: $p_{e_j}\ \forall j$ and $\rho_{e_1}$ (Appendix E)
• If DEX response, 0 < θ < 1:
– If CS exposure: $p_{e_j}\ \forall j$ and $\rho_e$
– If exposure not CS: $p_{e_j}\ \forall j$ and $\rho_{e_j,e_{j'}}\ \forall j, j'$

Assumes the within-subjects model or V(t0) = 0

Results: Required input parameters for DEX Response and Exposure not CS

• Need $\rho_{e_j,e_{j'}}\ \forall j, j'$

• Since ρe can be viewed as an average of all the exposure correlations, one can expect that assuming $\rho_{e_j,e_{j'}} = \rho_e\ \forall j, j'$ would produce reasonable estimates of $\mathrm{Var}(\hat\beta_1)$, even if the actual covariance matrix of exposure does not follow a CS structure.
• This was evaluated in 10,000 arbitrary exposure correlation matrices and vectors.

Results: Accuracy of approximations

• Large underestimations occur when:
– the covariance of the response is close to AR(1);
– $\rho_e \ll \rho_{e_1}$.
• In those cases, conservative (large) values of ρe are recommended.

Results: Efficiency

• Under CS, probably DEX, and, more generally, when $v_{jj'} < 0\ \forall j \neq j'$ (Appendix F):
– $\mathrm{Var}(\hat\beta_1)$ is maximum when ρe = 1 (time-invariant exposure).
– $\mathrm{Var}(\hat\beta_1)$ is minimum when ρe = −1/r (maximum within-subject variation of exposure).

• Having within-subject variation in exposure improves efficiency.

Efficiency with CS response

[Figure: SSR (0 to 50) versus ρe (0 to 1) for a CS response with ρ = 0.8 and r = 1, 5, 10, 20, where SSR = N(ρe = 1, time-invariant) / N(ρe, time-varying).]

• With CS response, formulas for time-invariant exposure will always over-estimate the required sample size, especially with large r and large ρ.

Efficiency with DEX response: SSR as a function of θ assuming CS for the exposure process, for r = 5 and pe = 0.2. Lines indicate: (solid) ρ = 0.2, (dashed) ρ = 0.5, (dotted) ρ = 0.8.

• With DEX response, formulas for time-invariant exposure can over-estimate or under-estimate the required sample size, especially with large r and large ρ.

Example 1: Respiratory effects of exposure to cleaning products

• Exposure to bleach: r + 1 = 15, ρe = 0.35

• Required sample size to detect a difference of 10 L·min⁻¹ with 80% power (~3% difference):

– Time-invariant: N = 1387
– Time-varying: N = 24
– SSR = 58

Examples 2-3

These examples are based on Medina-Ramon et al.’s study of respiratory function in relation to exposure to cleaning products/tasks, where peak expiratory flow is the response and use of air fresheners is the exposure.

$E(Y_{ij} \mid E_{ij}) = \beta_0 + \beta_1 E_{ij} + \beta_2 t_{ij}$ is the assumed model, with V(t0) = 0.

More information can be found in the user’s manual at http://www.hsph.harvard.edu/faculty/spiegelman/optitxs.html

Example 2: Sample size calculation

What is the sample size (N) needed to detect a 10 L/min decrease in PEF with 14 post-baseline repeated measures (r = 14), assuming CS response?

> long.N.tv()
Enter the number of post-baseline measures (r): 14
Enter the desired power (0

This example is based on Medina-Ramon et al.’s study of the respiratory effects of exposure to cleaning products, with peak expiratory flow as the response and use of air fresheners as the exposure; t = 0, 1, …, 14 and V(t0) = 0.

What is the power to detect a 10 L/min decrease in PEF in relation to a day of exposure to air fresheners, in a study of 31 participants and 14 post-baseline repeated measures, assuming CS response?

> long.power.tv()
Enter the total sample size (N): 31
Enter the number of post-baseline measures (r): 14
Enter the time between repeated measures (s): 1
Do you want to base the calculations on a model with a main effect of exposure (1) or a model that separates the between- and within-subjects effects of exposure (2)? 2
Will you specify the alternative hypothesis on the absolute (beta coefficient) scale (1) or the relative (percent) scale (2)? 1
The alternative hypothesis will be specified on the absolute (beta coefficient) change scale.
Enter the difference between exposed and unexposed periods (beta1): 10
Which residual covariance matrix of the response are you assuming: compound symmetry (1) or damped exponential (2) ? 1
You are assuming CS covariance of the response
Enter the residual variance of the response given the assumed model covariates (sigma2): 4686
Enter the correlation between two measures of the same subject (rho): .88
Enter the mean prevalence of exposure (mean.pe): .37
Enter the intraclass correlation of exposure (-1/14 <= rho_e <= 1): .13
Power = 0.9770487

Time-varying exposure: LDD

• When generalizing the LDD setting to the case of a time-varying exposure, we distinguish between two cases:

– The effect of exposure on the response is cumulative:

$$E(Y_{ij} \mid X_i) = \gamma_0 + \gamma_1 t_{ij} + \gamma_2 E^*_{ij}, \qquad E^*_{ij} = \sum_{k \le j} E_{ik}$$

– The effect of exposure on the response is acute:

$$E(Y_{ij} \mid X_i) = \gamma_0 + \gamma_1 t_{ij} + \gamma_2 E_{ij} + \gamma_3 (E_{ij} \times t_{ij})$$

Literature Review – LDD with time-varying covariate

Time-varying exposure: LDD

Cumulative exposure trajectories

[Figure: paired panels of response Y versus time. Left: "envelope" trajectories for the exposed and unexposed groups. Right: a possible pattern for one subject observed at times 0 to 4 with exposure E = 0, 1, 1, 0, 1.]

Paper 3 25 LDD: Cumulative Exposure Effect Model

EE* = ij ∑ kj≤ ik

* Model: E(YtEij| X i )=+γ 01γγij + e ij

• To exactly compute $\mathrm{Var}(\hat{\gamma}_e)$, we need:

– $p_{e_j}\ \forall j$
– $\rho_{e_j, e_{j'}}\ \forall j, j'$, or equivalently, $E[E_j E_{j'}]\ \forall j, j'$
– $E[t_0]$ and $V(t_0)$
– $E[E_j t_0]\ \forall j$, or equivalently, $\rho_{e_j, t_0}\ \forall j$
– $E(E^*_{-1})$ and $V(E^*_{-1})$
– $E[E^*_{-1} E_j]\ \forall j$, or equivalently, $\rho_{E^*_{-1}, e_j}\ \forall j$
– $E[E^*_{-1} t_0]$, or equivalently, $\rho_{E^*_{-1}, t_0}$

LDD: Cumulative Exposure Effect Model

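Model: $E(Y_{i,j+1} - Y_{i,j} \mid X_i) = \gamma_t^W + \gamma_e^W (E^*_{i,j+1} - E^*_{i,j}) = \gamma_t^W + \gamma_e^W E_{i,j+1}$

• To exactly compute $\mathrm{Var}(\hat{\gamma}_e)$, we need:

– $p_{e_j}\ \forall j$
– $\rho_{e_j, e_{j'}}\ \forall j, j'$, or equivalently, $E[E_j E_{j'}]\ \forall j, j'$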

• Less efficient, fewer input parameters needed, more valid (between-subjects confounding eliminated)

Results: Cumulative Exposure Effect Model Efficiency

• Once the exposure prevalence is fixed, if $w_{jj'} \ge 0\ \forall j \ne j'$, then:

– $\mathrm{Var}(\hat{\gamma}_e)$ is minimum when $\rho_e = 1$ (time-invariant exposure).

– $\mathrm{Var}(\hat{\gamma}_e)$ is maximum when $\rho_e = -1/r$ (maximum within-subject variation of exposure).

• Having within-subject variation in exposure produces a loss in efficiency.

Results: Cumulative Exposure Effect Model Efficiency

• If both the response and the exposure process have CS covariance, then

$$\mathrm{Var}(\hat{\gamma}_e) = \frac{12\,\sigma^2 (1-\rho)}{p_e (1-p_e)\, s^2\, r (r+2)\, [\,2 + (r-1)\rho_e\,]}$$

[Figure legend: CS response, ρ = 0.8; curves for r = 2, r = 5, and r = 10.]
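The closed-form variance for the CS response and CS exposure case can be evaluated directly. The sketch below is illustrative only: the function names are hypothetical, the numeric inputs are borrowed from the earlier program transcript purely as plausible magnitudes, and the ratio assumes that the required sample size is proportional to this variance.

```python
def var_gamma_e(sigma2, rho, p_e, s, r, rho_e):
    """Variance expression for the cumulative-effect coefficient under
    CS response and CS exposure, as written on the slide."""
    if r < 1:
        raise ValueError("r must be at least 1")
    if not (-1.0 / r <= rho_e <= 1.0):
        raise ValueError("rho_e must lie in [-1/r, 1]")
    denom = p_e * (1.0 - p_e) * s ** 2 * r * (r + 2) * (2.0 + (r - 1) * rho_e)
    return 12.0 * sigma2 * (1.0 - rho) / denom


def sample_size_ratio(r, rho_e, **params):
    """N(rho_e = 1) / N(rho_e), ASSUMING required N is proportional to the variance."""
    return (var_gamma_e(r=r, rho_e=1.0, **params)
            / var_gamma_e(r=r, rho_e=rho_e, **params))


# Values reused from the earlier transcript purely for illustration
params = dict(sigma2=4686.0, rho=0.88, p_e=0.37, s=1.0)

for r in (2, 5, 10):
    for rho_e in (-1.0 / r, 0.0, 0.5, 1.0):
        print(r, round(rho_e, 3), round(sample_size_ratio(r, rho_e, **params), 3))
```

All terms other than $[2 + (r-1)\rho_e]$ cancel in the ratio, so it reduces to $[2 + (r-1)\rho_e]/(r+1)$: it equals 1 for a time-invariant exposure and falls to $1/r$ at $\rho_e = -1/r$, in line with the minimum and maximum variance statements above.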

[Figure: SSR plotted against $\rho_e$ (0 to 1), where $\mathrm{SSR} = N_{\rho_e = 1} / N_{\rho_e} = N_{\text{time-invariant}} / N_{\text{time-varying}}$.]

Results: LDD: Cumulative Exposure Effect Model

• DEX response: The required N increases as θ increases
• RS response: The required N increases as $\rho_{b1}$ increases
• CS, DEX response: The required N increases as $\rho_e$ decreases

LDD: Acute exposure trajectories

[Figure: "Acute, transient, exposure effect". Two panels plotting Y against Time. Left: "Envelope" trajectories for exposed and unexposed subjects. Right: a possible pattern for one subject, with periods labeled E=0, E=0, E=1, E=1, E=0.]

[Figure: two panels plotting Y against Time. Left: "Envelope" trajectories for exposed and unexposed subjects. Right: a possible pattern for one subject, with periods labeled E=1, E=1, E=1, E=0, E=0.]

Results: Acute Exposure Effect Model

Model: $E(Y_{ij} \mid X_i) = \gamma_0 + \gamma_1 t_{ij} + \gamma_2 E_{ij} + \gamma_3 (E_{ij} \times t_{ij})$

• To exactly compute $\tilde{\sigma}^2$, the following quantities need to be provided:

– $p_{e_j}\ \forall j$
– $\rho_{e_j, e_{j'}}\ \forall j, j'$, or equivalently, $E[E_j E_{j'}]\ \forall j, j'$
– $V(t_0)$
– $E[E_j t_0]\ \forall j$, or equivalently, $\rho_{e_j, t_0}\ \forall j$
– $E[E_j E_{j'} t_0]\ \forall j, j'$
– $E[E_j E_{j'} t_0^2]\ \forall j, j'$

• One possibility is to do the calculations for $V(t_0) = 0$, in which case only $p_{e_j}\ \forall j$ and $\rho_{e_j, e_{j'}}\ \forall j, j'$ are required. This will provide conservative study designs.

Results: Acute Exposure Effect Model Efficiency

[Figure: two panels plotting SSR against $\rho_e$ (0 to 1), where $\mathrm{SSR} = N_{\rho_e = 1} / N_{\rho_e} = N_{\text{time-invariant}} / N_{\text{time-varying}}$. Left panel: ρ = 0.8, θ = 0. Right panel: ρ = 0.8, θ = 0.5. Curves for r = 1, r = 2, r = 5, and r = 10.]

Results: Accuracy of approximations

• Need $p_{e_j}\ \forall j$ and $\rho_{e_j, e_{j'}}\ \forall j, j'$ or, equivalently, $E[E_j E_{j'}]\ \forall j, j'$.
• Assume CS exposure and provide $\rho_e$.

SSR in 10,000 arbitrary correlation matrices of exposure.

[Figure: $\mathrm{SSR} = N_{CS} / N_{true}$ (vertical axis, 0.7 to 1.4) in two panels, Model 2.5 (cumulative) and Model 2.6 (acute). Each panel shows the response covariance structures CS, AR(1), AR(1), RS with $\rho_{b1} = 0.8$, and RS with $\rho_{b1} = 0.2$, with ρ labels 0.8, 0.8, 0.2, 0.8, and 0.5.]

Results: Accuracy of approximations

• Low SSR when:
– Cumulative: small, negative correlations for pairs of time points close in time, and large, positive correlations for pairs of time points distant from each other.
– Acute: high correlations for pairs of time points that are both at either the beginning or the end of the study, with the remaining correlations negative.

Example. Medina-Ramon’s respiratory effects of exposure to cleaning products study

Here, we compute the sample size required to detect a 5 L/min/day decrease in PEF associated with the use of air-freshener sprays with 90% power, in a study with 14 post-baseline measures (the original study had 31 participants), assuming a DEX covariance structure of the response.
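For readers unfamiliar with the DEX (damped exponential) structure, the sketch below builds its correlation matrix under the usual definition, $\mathrm{corr}(Y_{ij}, Y_{ik}) = \rho^{|t_{ij} - t_{ik}|^{\theta}}$, which is assumed here because this example does not restate it. The function name is hypothetical, and the values ρ = 0.8 and θ = 0.5 are illustrative rather than the inputs of this example.

```python
def dex_correlation(times, rho, theta):
    """Damped exponential correlation matrix: rho ** (|t_j - t_k| ** theta).

    theta = 0 gives compound symmetry (CS); theta = 1 gives AR(1).
    The diagonal is set to 1 explicitly, since 0 ** 0 evaluates to 1 in
    Python and would otherwise put rho on the diagonal when theta = 0.
    """
    return [
        [1.0 if a == b else rho ** (abs(a - b) ** theta) for b in times]
        for a in times
    ]


# Illustrative inputs: 5 equally spaced visits, one time unit apart
times = [0, 1, 2, 3, 4]
R = dex_correlation(times, rho=0.8, theta=0.5)

for row in R:
    print([round(x, 3) for x in row])
```

With θ between 0 and 1, the correlation decays with the time separation more slowly than under AR(1) but, unlike CS, does not stay constant.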

We assume that the rates of change vary by exposure and that the exposure effect is cumulative, using the model

$E(Y_{i,j+1} - Y_{i,j} \mid X_i) = \gamma_t^W + \gamma_e^W (E^*_{i,j+1} - E^*_{i,j}) = \gamma_t^W + \gamma_e^W E_{i,j+1}$,

which is equivalent to the model $E(Y_{ij} \mid X_i) = \gamma_0 + \gamma_1 t_{ij} + \gamma_e E^*_{ij}$ when there is no between-subjects confounding.

> long.N()
* By just pressing after each question, the default value, shown between square brackets, will be entered.
* Press to quit
Enter the number of post-baseline measures (r) [1]: 14
Enter the desired power (0
Will you specify the alternative hypothesis on the absolute (beta coefficient) scale (1) or the relative (percent) scale (2) [1]? 1

Enter the interaction coefficient (gamma3) [0.1]: 5

Which covariance matrix are you assuming: compound symmetry (1), damped exponential (2) or random slopes (3) [1]? 2

Enter the residual variance of the response given the assumed model covariates (sigma2) [1]: 4570

Enter the correlation between two measures of the same subject separated by one time unit (0

Enter the damping coefficient (theta) [0.5]: .12

Sample size = 28
Do you want to continue using the program (y/n) [y]? n

Future work

• Include dropout
– For sample size calculations, simply inflate the sample size by a factor of 1/(1-f).
– However, dropout can alter the relationship between N and r.
• Binary outcomes
• Continuous exposures
• Optimal (N,r) for time-varying exposures
• Other ideas?
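The simple dropout adjustment mentioned above can be sketched as follows, reading f as the anticipated fraction of participants lost to dropout (an interpretation assumed here). The function name and the value f = 0.2 are hypothetical; N = 28 is the sample size returned in the example above.

```python
import math


def inflate_for_dropout(n, f):
    """Inflate a computed sample size n by the factor 1 / (1 - f).

    f is read here as the anticipated dropout fraction, 0 <= f < 1.
    A tiny tolerance is subtracted before rounding up so that floating-point
    noise does not add a spurious extra participant.
    """
    if not 0.0 <= f < 1.0:
        raise ValueError("f must satisfy 0 <= f < 1")
    return math.ceil(n / (1.0 - f) - 1e-9)


# N = 28 from the example; f = 0.2 is a made-up dropout fraction
print(inflate_for_dropout(28, 0.2))
```

As the slide notes, this adjustment is only a first approximation, because dropout can also change the relationship between N and r.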

For further reading

• Basagaña X, Spiegelman D. “The design of observational longitudinal studies with a time-invariant exposure.” Submitted for publication.

• Basagaña X, Spiegelman D. “Power and sample size calculations for longitudinal studies estimating a main effect of a time-varying exposure.” Submitted for publication.

• Basagaña X, Spiegelman D. “Power and sample size calculations for longitudinal studies comparing rates of change with a time-varying exposure.” Submitted for publication.

All three can be found at http://www.hsph.harvard.edu/faculty/spiegelman/optitxs.html

The bottom line

• To accurately design an expensive, long-term longitudinal study, it is best to first conduct a pilot study with r=2, with sufficient sample size to accurately estimate all necessary parameters given the assumed model. Then, design the second phase of the study.

• When time and/or funding do not permit, sensitivity analysis, including plausible worst-case scenarios, is suggested.

• Our program will be helpful: http://www.hsph.harvard.edu/faculty/spiegelman/optitxs.html

Thanks for your attention