American Journal of Epidemiology, Volume 160, Number 4, August 15, 2004. Copyright © 2004 by The Johns Hopkins Bloomberg School of Public Health. Sponsored by the Society for Epidemiologic Research. Published by Oxford University Press.

SPECIAL ARTICLE

Model-based Estimation of Relative Risks and Other Epidemiologic Measures in Studies of Common Outcomes and in Case-Control Studies

Sander Greenland

From the Departments of Epidemiology and Statistics, University of California, Los Angeles, Los Angeles, CA.

Received for publication November 7, 2003; accepted for publication May 27, 2004.

Some recent articles have discussed biased methods for estimating risk ratios from adjusted odds ratios when the outcome is common, and the problem of setting confidence limits for risk ratios. These articles have overlooked the extensive literature on valid estimation of risks, risk ratios, and risk differences from logistic and other models, including methods that remain valid when the outcome is common, and methods for risk and rate estimation from case-control studies. The present article describes how most of these methods can be subsumed under a general formulation that also encompasses traditional standardization methods and methods for projecting the impact of partially successful interventions. Approximate formulas for the resulting estimates allow interval estimation; these intervals can be closely approximated by rapid simulation procedures that require only standard software functions.

absolute risk; case-control studies; clinical trials; cohort studies; odds ratio; risk assessment

Abbreviation: GEE, generalized estimating equations.

Recently, McNutt et al. (1) noted a bias in a popular method by Zhang and Yu (2) for converting odds ratios to risk ratios. Both articles overlooked the extensive literature on estimating relative risks and other measures from fitted models. This literature addresses the problems they noted and provides valid methods for all study designs, including case-control studies, cohort studies, and clinical trials.

BACKGROUND

In 1989, Holland (3) proposed an “adjusted risk difference” using the same biased risk-ratio formula rediscovered by Zhang and Yu (2). Greenland and Holland (4) described the biases in that method and gave valid formulas for converting odds-ratio estimates into risk-difference estimates. There are now many model-based estimates and confidence intervals for risks (incidence proportions), rates, and their ratios and differences (5, pp. 414–415; 6–14) and for attributable fractions (15, 16). These methods can make use of input from logistic and other models, and most require no rare-disease assumption. One can also get valid confidence intervals for risk ratios by using Poisson regression with robust (generalized estimating equations (GEE) or

Correspondence to Dr. Sander Greenland, Departments of Epidemiology and Statistics, University of California, Los Angeles, Los Angeles, CA 90095-1772 (e-mail: [email protected]).

301 Am J Epidemiol 2004;160:301–305 302 Greenland

“sandwich”) variance estimates (10), which avoid the overly wide intervals noted elsewhere (1).

To describe the general ideas, suppose that $r(x)$ is the risk or rate at level $x$ of a regressor vector $X$. $X$ is a function of all exposures, confounders, and modifiers in the model; it may contain powers, product terms, splines, and so forth. For example, a study of cannabis smoking and lung cancer might use $X$ = (cannabis grams/year, pack-years cigarettes, age, age², female, age × female), where female = 1 for women, 0 for men. Suppose we want to compare average risks or rates when the distribution of $X$ in a target population is $p_1(x)$ versus $p_0(x)$. These distributions usually correspond to everyone exposed versus everyone unexposed to some risk factor. In policy applications, $p_1(x)$ and $p_0(x)$ may represent the population distribution without versus with the application of some intervention program.

Some $X$ components (e.g., age, sex) may have the same distribution in $p_1(x)$ and $p_0(x)$; only those components affected by the exposure will differ. For example, $p_1(x)$ could represent the existing joint distribution of cannabis smoking, cigarette smoking, age, and sex in the target, while $p_0(x)$ could represent the same distribution of cigarette smoking, age, and sex, with zero cannabis assigned to everyone (for everyone, the first entry in $X$ is shifted to zero, and the rest are unchanged).

The standardized (population-averaged) risk or rate under exposure or intervention $j$ is $R_j = \sum_x p_j(x)\,r(x)$, where the sum is over the range of $X$ in the target. The adjusted risk or rate ratio and difference are then $RR_{10} = R_1/R_0$ and $RD_{10} = R_1 - R_0$, respectively. The adjusted attributable fraction is $(R_1 - R_0)/R_1 = RD_{10}/R_1 = (RR_{10} - 1)/RR_{10}$; the population attributable fraction is the special case in which $p_1(x)$ is the current distribution and $p_0(x)$ is the distribution after exposure removal (15, 16).

The covariate-specific RR formula given by McNutt et al. (1, p. 941) is a special case in which the distributions $p_1(x)$ and $p_0(x)$ are concentrated at single values $x_1$ and $x_0$ of $X$, and the exposure differs between $x_1$ and $x_0$ but the other covariates do not. For example, to compare the risk from use of 200 g/year of cannabis with that of nonuse among males aged 50 years with 10 cigarette pack-years of smoking, if $X$ = (cannabis grams/year, cigarette pack-years, age, age², female, age × female), one would take $x_1 = (200, 10, 50, 50^2, 0, 50 \times 0)$ and $x_0 = (0, 10, 50, 50^2, 0, 50 \times 0)$.

ESTIMATION

Model-based confidence intervals for the above quantities can be obtained from variance formulas (8–14). To avoid programming these formulas, intervals can instead be obtained by simulation or other methods such as bootstrapping (16–18). If the comparison distributions $p_1(x)$ and $p_0(x)$ are also estimated, their estimates should also be resampled. There are many ways to make use of the resampling distribution of estimates; one should avoid naive use of the percentiles of the bootstrap distribution to set confidence limits, however (16, 17). Resampling methods allow use of any model to fit the risks or rates $r(x)$, and they may also be applied by replacing $r(x)$ with other outcome measures such as expected years of life lost or with clinical measures such as blood pressure, CD4 count, and so forth.

A key advantage of model-based estimates is that they do not require large numbers at each regressor level; the regressor values may even be unique to each individual (as would be expected when some covariates are continuous) (6–14). To avoid sparse-data artifacts, they do require that the numbers of cases and noncases be adequate relative to the number of model parameters, although this restriction can be reduced by using penalized estimation (shrinkage) or Bayesian methods to fit the model (19–21).

Model-based estimates of $r(x)$ can be sensitive to influential data points and to model misspecification, but they nonetheless tend to have a smaller mean-squared error than do raw covariate-specific estimates (which become wildly unstable in sparse data) if the model fits well (22, chap. 12). When one standardizes over distributions similar to those in the data, the resulting summary estimates will be far less sensitive than the specific $r(x)$ estimates; this robustness derives from the tendency of residual errors to average to zero over the data distribution.

EXAMPLE

Table 1 presents data on the relation of receptor level and staging to survival in a cohort of women with breast cancer (23, table 5.3), along with risk estimates from use of several methods. Let $x = (x_1, x_2, x_3)$, with $x_1$ an indicator of low-receptor status and $x_2$ and $x_3$ indicators of stage II and III. The first set of estimates is the observed proportion of women in each column who died. The second set is from maximum-likelihood logistic regression using the model $r(x) = \mathrm{expit}(\alpha + x\beta)$, where $\mathrm{expit}(u) = e^u/(1 + e^u)$; the model fits well (e.g., likelihood-ratio p = 0.8) and the fitted risks are $\mathrm{expit}(a + xb)$, where $a$ and $b$ are the $\alpha$ and $\beta$ estimates. The third set is from a log-linear model $r(x) = e^{\alpha + x\beta}$ fit by binomial maximum likelihood (23); this model also fits well (e.g., p = 0.8) and the fitted risks are $e^{a + xb}$. The fourth set is from this log-linear model fit using the incorrect Poisson likelihood, as one would obtain by entering the observed column totals as person-years in a Poisson regression program (1, 10, 14).

To obtain a standardized risk ratio comparing the low and high receptor group using the total group as the standard, take $p_1(x)$ and $p_0(x)$ shown in the final two rows of table 1, multiply them against a row of estimated risks, and sum the results to get the $R_1$ and $R_0$ estimates. Under the log-linear model, the $RR_{10}$ estimate simplifies to $\exp(b_1)$, where $b_1$ is the estimated $x_1$ coefficient. The estimated standardized risks, ratios, and differences are $R_1$ = 0.392, $R_0$ = 0.237, $RR_{10}$ = 1.65, and $RD_{10}$ = 0.156 if the observed proportions are used; $R_1$ = 0.401, $R_0$ = 0.239, $RR_{10}$ = 1.68, and $RD_{10}$ = 0.162 if the logistic model is used; $R_1$ = 0.371, $R_0$ = 0.238, $RR_{10}$ = 1.56, and $RD_{10}$ = 0.133 if the log-linear model fit by binomial maximum likelihood is used; and $R_1$ = 0.383, $R_0$ = 0.235, $RR_{10}$ = 1.63, and $RD_{10}$ = 0.148 if log-linear Poisson regression is used. The Mantel-Haenszel estimates (5, p. 271) are $RR_{10}$ = 1.62 and $RD_{10}$ = 0.166. Thus, all the valid approximations yield similar results, as expected given the sample size and good fit of the models.

In contrast, the odds-ratio estimate $\exp(b_1)$ from the logistic model is 2.51, and the Zhang-Yu risk-ratio estimate


TABLE 1. Data relating receptor level (low, high) and stage (I, II, III) to 5-year breast cancer mortality (23), observed and model-based estimates of average risk (incidence proportion) by receptor level and stage, and distributions for standardizing receptor-level comparisons to total-cohort stage distribution

                     Stage I          Stage II         Stage III
                     Low     High     Low     High     Low     High
Deaths               2       5        9       17       12      9
Survivors            10      50       13      57       2       6
Total                12      55       22      74       14      15

Estimates of 5-year mortality risk (per 1,000)
Observed*            167     91       409     230      857     600
Logistic             190     86       422     226      816     639
Binomial†            148     95       376     242      870     558
Poisson‡             153     94       386     237      905     555

Comparison distributions§
$p_1(x)$             0.349   0        0.500   0        0.151   0
$p_0(x)$             0       0.349    0       0.500    0       0.151

* Deaths/total (nonparametric risk estimate).
† Log-linear risk model with receptor level and stage, fit by binomial maximum likelihood.
‡ Log-linear risk model fit by incorrect Poisson maximum likelihood.
§ $p_1(x)$ puts the total cohort at a low receptor level; $p_0(x)$ puts the total cohort at a high receptor level.
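The standardization described in the EXAMPLE section can be sketched directly from the counts in Table 1. The following is a minimal illustration, not the author's code: it uses the observed proportions as the risk estimates, takes the total-cohort stage distribution as the standard, and computes the standardized risks, their ratio and difference, and the adjusted attributable fraction. The names `DEATHS_TOTAL`, `stage_weights`, `standardized_risk`, and `standardize` are hypothetical and chosen only for this sketch.

```python
# Counts from Table 1 as (deaths, total) for stages I, II, III.
DEATHS_TOTAL = {
    "low": [(2, 12), (9, 22), (12, 14)],    # low receptor level
    "high": [(5, 55), (17, 74), (9, 15)],   # high receptor level
}


def stage_weights(counts):
    """Total-cohort stage distribution (the nonzero entries of p1(x) and p0(x))."""
    totals = [low[1] + high[1] for low, high in zip(counts["low"], counts["high"])]
    n = sum(totals)
    return [t / n for t in totals]


def standardized_risk(risks, weights):
    """Weighted average of stage-specific risks: sum of weight * risk."""
    return sum(w * r for w, r in zip(weights, risks))


def standardize(counts):
    """Standardized risks and contrasts from observed proportions."""
    weights = stage_weights(counts)
    observed = {level: [d / n for d, n in cells] for level, cells in counts.items()}
    r1 = standardized_risk(observed["low"], weights)    # everyone at low receptor level
    r0 = standardized_risk(observed["high"], weights)   # everyone at high receptor level
    return {
        "R1": r1,
        "R0": r0,
        "RR10": r1 / r0,
        "RD10": r1 - r0,
        "AF": (r1 - r0) / r1,   # adjusted attributable fraction
    }


if __name__ == "__main__":
    for name, value in standardize(DEATHS_TOTAL).items():
        print(f"{name}: {value:.3f}")
```

Replacing the observed proportions with a row of model-fitted risks from Table 1 (divided by 1,000) gives the corresponding model-based estimates; small discrepancies in the last digit can arise from rounding of the tabulated values.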

(2) is 1.89; both overestimate the risk ratio, as expected given that the outcome is not rare (over a quarter of the patients died). Another invalid model-based adjustment predicts an expected number of exposed cases $E$ from a model without exposure, then divides $E$ into the observed number of exposed cases to get a standardized mortality ratio (22, sec. 4.3). This approach underestimates risk ratios (24); using a logistic model with only $x_2$ and $x_3$ yields $E$ = 17.35 and a standardized mortality ratio of 23/17.35 = 1.33.

If the observed proportions are used, 95 percent confidence limits for $RR_{10}$ are 1.06, 2.58 and for $RD_{10}$ are 0.006, 0.304 (5, p. 263); if the logistic model is used, the limits for $RR_{10}$ are 1.09, 2.57 and for $RD_{10}$ are 0.013, 0.303 (8, 9); and, if the binomial log-linear model is used, the limits for $RR_{10}$ are $\exp(b_1 \pm 1.96\,v_1^{1/2})$ = 1.05, 2.30, where $v_1$ is the estimated variance of $b_1$, and for $RD_{10}$ are 0.023, 0.312 (9). Standard Poisson regression overestimates the variance of $b_1$, yielding limits for $RR_{10}$ of 0.93, 2.87; nonetheless, GEE Poisson regression with the robust variance estimate (available in Stata's xtgee and SAS proc genmod (25)) yields limits for $RR_{10}$ of 1.07, 2.48. The Mantel-Haenszel limits for $RR_{10}$ are 1.09, 2.39 and for $RD_{10}$ are 0.016, 0.316 (5, p. 271). The log-linear model fit by binomial maximum likelihood supplies the narrowest risk-ratio interval because it is the only one of these methods that is fully efficient under the model.

Simulated 95 percent confidence limits (16) from 400,001 coefficient resamplings (which avoid complex variance formulas) were nearly the same: 1.09, 2.54 for $RR_{10}$ and 0.023, 0.312 for $RD_{10}$ using logistic regression; 1.05, 2.30 for $RR_{10}$ and 0.023, 0.312 for $RD_{10}$ using binomial log-linear regression; and 1.07, 2.47 for $RR_{10}$ and 0.021, 0.299 for $RD_{10}$ using Poisson regression.

CASE-CONTROL STUDIES

Cumulative case-control studies sample cases and controls from cohort members who do and do not get disease by the end of follow-up (5, pp. 110–111). Given a valid estimate of the crude (overall) risk $r_c$ in the target population or of the ratio of case-control sampling fractions $r_f$, one can estimate the covariate-specific risks in the target (and hence their differences and ratios) even if the disease is common. If the data are not sparse, one can use results from case-control modeling to estimate risks or rates and their contrasts (5, pp. 418–419; 26–32). For models (such as the logistic) in which the baseline odds is a multiplicative factor, $\ln(r_c)$ or $\ln(r_f)$ becomes a simple adjustment term to the model intercept (5, pp. 417–419; 26, 27); other models can be used, however (28, 31). Similar methods can be used to estimate risks from case-cohort studies, in which controls are sampled from all cohort members, not just noncases (5, pp. 417, 419; 33).

In density case-control studies, controls are sampled longitudinally from those at risk, in proportion to person-time (5, pp. 93–96). No adjustments are then needed to estimate rate ratios from the fitted logistic model (5, pp. 416–417; 34, 35), and intercept adjustments analogous to the cumulative

formulas can be used to estimate rates and rate differences (5, pp. 417; 27–30, 32).

If the analysis strata are small (sparse), as in matched analyses, special summary methods may be needed to estimate exposure-specific risks (36). To avoid sparse data, one often sees matching factors entered as simple terms in an unmatched analysis. Unfortunately, this strategy can produce bias if the matching factors are not ignorable and are modeled as continuous (e.g., age is entered directly despite being matched within 5-year categories), because case-control matching creates discontinuities in the sample factor-outcome relation at matching-category boundaries (37).

DISCUSSION

Traditional impact measures such as attributable fractions take $p_1(x)$ to be the current population distribution and $p_0(x)$ the distribution after complete exposure removal (5, p. 58; 13, 15, 18). These measures can be very misleading for policy projections: Feasible interventions can rarely achieve anything near complete exposure removal, may have untoward side effects (including adverse effects on quality of life or resources available for other purposes), and may affect the size of the population at risk. Hence, intelligent policy input requires consideration of the full spectrum of intervention limitations and side effects, rather than just traditional estimates (38–40). It further requires quantitative assessments of bias, as well as of random error (41–46); simulation confidence intervals are easily extended to subsume this task (16). Finally, because epidemiologic textbooks persist in erroneous claims otherwise (e.g., 47, p. 201), it is worth noting that attributable fractions do not approximate the etiologic fraction (fraction of cases caused by exposure) or the probability of causation, even if the disease is rare (48–50).

Rates are often substituted for risks when estimating impact measures. This substitution overstates impact on the study outcome when the exposure at issue strongly affects person-time at risk, as can occur when exposure affects other outcomes (5, p. 63; 51). One can reduce this problem by converting rates to risk estimates before standardizing, for example, by stratifying on follow-up time and then applying the exponential formula (5, p. 40): If the fitted rate in period $k$ is $r_k(x)$ and the length of period $k$ is $t_k$, an estimate of the risk over periods 1 through $K$ is $1 - \exp[-\sum_k r_k(x)\,t_k]$, where the sum is from $k = 1$ to $k = K$. One can also estimate risks from rates via survival models, which allow use of continuous time (52).

ACKNOWLEDGMENTS

The author thanks Katherine Hoggatt for helpful comments.

REFERENCES

1. McNutt LA, Wu C, Xue X, et al. Estimating the relative risk in cohort studies and clinical trials of common outcomes. Am J Epidemiol 2003;157:940–3.
2. Zhang J, Yu KF. What’s a relative risk? A method of correcting the odds ratio in cohort studies of common outcomes. JAMA 1998;280:1690–1.
3. Holland PW. A note on the covariance of the Mantel-Haenszel log-odds-ratio estimator and the sample marginal rates. Biometrics 1989;45:1009–16.
4. Greenland S, Holland PW. Estimating standardized risk differences from odds ratios. Biometrics 1991;47:319–22.
5. Rothman KJ, Greenland S, eds. Modern epidemiology. 2nd ed. Philadelphia, PA: Lippincott-Raven, 1998.
6. Lee J. Covariance adjustment of rates based on the multiple logistic regression model. J Chronic Dis 1981;34:415–26.
7. Lane PW, Nelder JA. Analysis of covariance and standardization as instances of prediction. Biometrics 1982;38:613–21.
8. Flanders WD, Rhodes PH. Large-sample confidence intervals for regression standardized risks, risk ratios, and risk differences. J Chronic Dis 1987;40:697–704.
9. Greenland S. Estimating standardized parameters from generalized linear models. Stat Med 1991;10:1069–74.
10. Stijnen T, van Houwelingen HC. Relative risk, risk difference and rate difference models for sparse stratified data: a pseudo-likelihood approach. Stat Med 1993;12:2285–303.
11. Greenland S. Modeling risk ratios from matched cohort data: an estimating equation approach. Appl Stat 1994;43:223–32.
12. Joffe MM, Greenland S. Estimation of standardized parameters from categorical regression models. Stat Med 1995;14:2131–41.
13. Bruzzi P, Green SB, Byar DP, et al. Estimating the population attributable risk for multiple risk factors using case-control data. Am J Epidemiol 1985;122:904–14.
14. Cummings P, McKnight B, Greenland S. Matched cohort methods for injury research. Epidemiol Rev 2003;25:43–50.
15. Greenland S, Drescher K. Maximum likelihood estimation of attributable fractions from logistic models. Biometrics 1993;49:865–72.
16. Greenland S. Interval estimation by simulation as an alternative to and extension of confidence intervals. Int J Epidemiol (in press).
17. Carpenter J, Bithell J. Bootstrap confidence intervals: when, which, and what? Stat Med 2000;19:1141–64.
18. Greenland S. Estimating population attributable fractions from fitted incidence ratios and exposure survey data, with an application to electromagnetic fields and childhood leukemia. Biometrics 2001;57:182–8.
19. Greenland S, Schwartzbaum JA, Finkle WD. Problems due to small samples and sparse data in conditional logistic regression analysis. Am J Epidemiol 2000;151:531–9.
20. Greenland S. When should epidemiologic regressions use random coefficients? Biometrics 2000;56:915–21.
21. Greenland S. Putting background information about relative risks into conjugate priors. Biometrics 2001;57:663–70.
22. Bishop YMM, Fienberg SE, Holland PW. Discrete multivariate analysis. Cambridge, MA: MIT Press, 1975.
23. Newman SC. Biostatistical methods in epidemiology. New York, NY: Wiley, 2001.
24. Greenland S. Bias in methods for deriving standardized morbidity ratios and attributable fraction estimates. Stat Med 1984;3:131–41.
25. Zou G. A modified Poisson regression approach to prospective studies with binary data. Am J Epidemiol 2004;159:702–6.
26. Anderson JA. Separate-sample logistic discrimination. Biometrika 1972;59:19–35.
27. Greenland S. Multivariate estimation of exposure-specific incidence from case-control studies. J Chronic Dis 1981;34:445–53.


28. Nurminen M. Assessment of excess risks in case-base studies. J Clin Epidemiol 1992;45:1081–92.
29. Benichou J, Wacholder S. A comparison of three approaches to estimate exposure-specific incidence rates from population-based case-control data. Stat Med 1994;13:651–61.
30. Benichou J, Gail MH. Methods of inference for estimates of absolute risk derived from population-based case-control studies. Biometrics 1995;51:182–94.
31. Wacholder S. The case-control study as data missing by design: estimating risk differences. Epidemiology 1996;7:144–50.
32. King G, Zeng L. Estimating risk and rate levels, ratios and differences in case-control studies. Stat Med 2002;21:1409–27.
33. Schouten EG, Dekker JM, Kok FJ, et al. Risk ratio and rate ratio estimation in case-cohort designs. Stat Med 1993;12:1733–45.
34. Sheehe PR. Dynamic risk analysis in retrospective matched-pair studies of disease. Biometrics 1962;18:323–41.
35. Prentice RL, Breslow NE. Retrospective studies and failure-time models. Biometrika 1978;65:153–8.
36. Greenland S. Estimation of exposure-specific rates from sparse case-control data. J Chronic Dis 1987;40:1087–94.
37. Greenland S. Partial and marginal matching in case-control studies. In: Moolgavkar SH, Prentice RL, eds. Modern statistical methods in chronic disease epidemiology. New York, NY: Wiley, 1986:35–49.
38. Morgenstern H, Bursic ES. A method for using epidemiologic data to estimate the potential impact of an intervention on the health status of a target population. J Community Health 1982;7:292–309.
39. Greenland S. Causality theory for policy uses of epidemiologic measures. In: Murray CJL, Salomon JA, Mathers CD, et al, eds. Summary measures of population health. Geneva, Switzerland: World Health Organization, 2002:291–302.
40. Poole C. Generalized effect estimation: An antidote to utopian preventive fantasies. (Abstract). Am J Epidemiol 2003;157:S59.
41. Eddy DM, Hasselblad V, Schachter R. Meta-analysis by the confidence profile method. New York, NY: Academic Press, 1992.
42. Lash TL, Fink AK. Semi-automated sensitivity analysis to assess systematic errors in observational epidemiologic data. Epidemiology 2003;14:451–8.
43. Phillips CV. Quantifying and reporting uncertainty from systematic errors. Epidemiology 2003;14:459–66.
44. Greenland S. The impact of prior distributions for uncontrolled confounding and response bias. J Am Stat Assoc 2003;98:47–54.
45. Greenland S. Multiple-bias modeling for observational studies (with discussion). J R Stat Soc (A) (in press).
46. Steenland K, Greenland S. Monte Carlo sensitivity analysis and Bayesian analysis of smoking as an unmeasured confounder in a study of silica and lung cancer. Am J Epidemiol 2004;160:384–92.
47. Koepsell TD, Weiss NS. Epidemiologic methods. New York, NY: Oxford University Press, 2003.
48. Greenland S, Robins JM. Conceptual problems in the definition and interpretation of attributable fractions. Am J Epidemiol 1988;128:1185–97.
49. Greenland S. The relation of the probability of causation to the relative risk and the doubling dose: a methodologic error that has become a social problem. Am J Public Health 1999;89:1166–9.
50. Greenland S, Robins JM. Epidemiology, justice, and the probability of causation. Jurimetrics 2000;40:321–40.
51. Greenland S. Absence of confounding does not correspond to collapsibility of the rate ratio or rate difference. Epidemiology 1996;7:498–501.
52. Kalbfleisch JD, Prentice RL. The statistical analysis of failure time data. 2nd ed. New York, NY: Wiley, 2002.
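As a supplementary sketch of the simulation intervals mentioned under ESTIMATION and EXAMPLE, the code below fits the main-effects logistic model to the Table 1 counts by Newton-Raphson, draws coefficient vectors from a normal approximation centered at the estimates with the estimated covariance matrix, and recomputes the standardized risk ratio and difference for each draw. Several choices here are assumptions rather than the article's procedure: the limits are taken as simple 2.5th and 97.5th percentiles of the simulated values, 20,000 draws are used instead of 400,001, and all function names (`fit_logistic`, `standardized`, `simulate_limits`, and the small matrix helpers) are hypothetical. Results will vary slightly with the seed and number of draws.

```python
import math
import random

# Table 1 cells: (x1 = low receptor, x2 = stage II, x3 = stage III, deaths, total)
CELLS = [
    (1, 0, 0, 2, 12), (0, 0, 0, 5, 55),
    (1, 1, 0, 9, 22), (0, 1, 0, 17, 74),
    (1, 0, 1, 12, 14), (0, 0, 1, 9, 15),
]
STAGES = [(0, 0), (1, 0), (0, 1)]          # (x2, x3) for stages I, II, III
WEIGHTS = [67 / 192, 96 / 192, 29 / 192]   # total-cohort stage distribution


def expit(u):
    return 1.0 / (1.0 + math.exp(-u))


def invert(m):
    """Gauss-Jordan inverse of a small square matrix, with partial pivoting."""
    n = len(m)
    a = [list(row) + [float(i == j) for j in range(n)] for i, row in enumerate(m)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        a[c], a[p] = a[p], a[c]
        piv = a[c][c]
        a[c] = [v / piv for v in a[c]]
        for r in range(n):
            if r != c:
                f = a[r][c]
                a[r] = [vr - f * vc for vr, vc in zip(a[r], a[c])]
    return [row[n:] for row in a]


def cholesky(m):
    """Lower-triangular factor L with L L' = m (m symmetric positive definite)."""
    n = len(m)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = m[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low


def fit_logistic(cells, iterations=25):
    """Newton-Raphson ML fit to grouped data; returns (coefficients, covariance)."""
    rows = [((1.0, x1, x2, x3), d, n) for x1, x2, x3, d, n in cells]
    b = [0.0] * 4
    for _ in range(iterations):
        score = [0.0] * 4
        info = [[0.0] * 4 for _ in range(4)]
        for x, d, n in rows:
            p = expit(sum(bi * xi for bi, xi in zip(b, x)))
            for i in range(4):
                score[i] += x[i] * (d - n * p)
                for j in range(4):
                    info[i][j] += x[i] * x[j] * n * p * (1 - p)
        cov = invert(info)   # inverse information = approximate covariance
        b = [bi + sum(cov[i][j] * score[j] for j in range(4)) for i, bi in enumerate(b)]
    return b, cov


def standardized(b):
    """(R1, R0): everyone at low vs. high receptor level, total-cohort stages."""
    def risk(x1):
        return sum(w * expit(b[0] + b[1] * x1 + b[2] * x2 + b[3] * x3)
                   for w, (x2, x3) in zip(WEIGHTS, STAGES))
    return risk(1), risk(0)


def simulate_limits(b, cov, draws=20000, seed=1):
    """Percentile limits for RR and RD from normal draws of the coefficients."""
    rng = random.Random(seed)
    chol = cholesky(cov)
    rr, rd = [], []
    for _ in range(draws):
        z = [rng.gauss(0, 1) for _ in b]
        bs = [bi + sum(chol[i][k] * z[k] for k in range(i + 1)) for i, bi in enumerate(b)]
        r1, r0 = standardized(bs)
        rr.append(r1 / r0)
        rd.append(r1 - r0)

    def limits(values):
        values = sorted(values)
        last = len(values) - 1
        return values[int(0.025 * last)], values[int(0.975 * last)]

    return limits(rr), limits(rd)


if __name__ == "__main__":
    b, cov = fit_logistic(CELLS)
    r1, r0 = standardized(b)
    rr_limits, rd_limits = simulate_limits(b, cov)
    print(f"RR = {r1 / r0:.2f}, simulated limits {rr_limits[0]:.2f}, {rr_limits[1]:.2f}")
    print(f"RD = {r1 - r0:.3f}, simulated limits {rd_limits[0]:.3f}, {rd_limits[1]:.3f}")
```

The same loop applies to other fitted models by swapping the risk function inside `standardized`; if the comparison distributions were themselves estimated, they would also need to be resampled, as noted under ESTIMATION.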
