On the C-Statistics for Evaluating Overall Adequacy of Risk Prediction Procedures with Censored Survival Data

View metadata, citation and similar papers at core.ac.uk brought to you by CORE provided by Collection Of Biostatistics Research Archive Harvard University Harvard University Biostatistics Working Paper Series Year 2009 Paper 101 On The C-Statistics For Evaluating Overall Adequacy Of Risk Prediction Procedures With Censored Survival Data Hajime Uno∗ Tianxi Caiy Michael J. Pencinaz Ralph B. D’Agostino∗∗ L. J. Weiyy ∗Harvard University, [email protected] yHarvard University, [email protected] zBoston University, [email protected] ∗∗Boston University, [email protected] yyHarvard University, [email protected] This working paper is hosted by The Berkeley Electronic Press (bepress) and may not be commer- cially reproduced without the permission of the copyright holder. http://biostats.bepress.com/harvardbiostat/paper101 Copyright c 2009 by the authors. ON THE C-STATISTICS FOR EVALUATING OVERALL ADEQUACY OF RISK PREDICTION PROCEDURES WITH CENSORED SURVIVAL DATA Hajime Uno1;2, Tianxi Cai2, Michael J. Pencina3; Ralph B. D’Agostino4 and L.J. Wei2 1Department of Biostatistics and Computational Biology, Dana Farber Cancer Institute, Boston, MA, USA 2Department of Biostatistics, Harvard University, Boston, MA, USA 3Department of Biostatistics, Boston University, Boston, MA, USA 4Department of Mathematics and Statistics, Boston University, Boston, MA, USA SUMMARY For modern evidence-based medicine, a well thought-out risk scoring system for predicting the timing of the occurrence of a clinical event plays an important role in selecting prevention and treatment strategies. Such an index system is often established based on the subject’s “baseline” genetic or clinical markers via a working parametric or semi-parametric model. To evaluate the adequacy of such a system, C-statistics are routinely used in the medical literature to quantify the capacity of the estimated risk score in discriminating among subjects with different event times. When the event time is possibly censored, however, the commonly used C-statistics estimate parameters which may depend on the study-specific censoring distribution. In this article, we present a simple C-statistic without this shortcoming. The new procedure consistently estimates a conventional concordance measure which is free of censoring. We provide a large sample approximation to the distribution of this estimator for making inferences about this measure. Numerical studies are also conducted to investigate the performance of the new procedure. Keywords: AUC; Cox’s proportional hazards model; Framingham risk score; ROC. 1 Hosted by The Berkeley Electronic Press 1. INTRODUCTION For modern clinical medicine, risk prediction procedures are valuable tools for disease prevention and management. Pioneered by the Framingham study, risk score systems have been established for assessing individual risks of developing cardiovascular diseases (CVD), cancer or many other conditions within a certain time period (Anderson et al., 1991; D’Agostino et al., 2008; Shariat et al., 2008; Parikh et al., 2008). A key component in the assessment of risk algo- rithm performance is its ability to distinguish subjects who will develop an event (“cases”) from those who will not (“controls”). This concept, known as discrimination, has been well studied and quantified using, for example, the estimated AUC, the area under the observed Receiver Operating Characteristics (ROC) curve, which is also referred as a “C-statistic” (Bamber, 1975). Such a statistic is an estimated conditional probability that for any pair of “case” and “control”, the predicted risk of an event is higher for the “case” (Hanley & McNeil, 1982). If the primary response variable is the time to a certain event, the aforementioned procedure for binary outcomes can be used to quantify the ability of the risk score system to differentiate cases from controls at a time point t: If one is not interested in a particular time point, a standard concordance measure may be used to evaluate the overall performance of the risk scoring system. Specifically, let T be the event time, Z be a p × 1 covariate vector, and g(Z) be the theoretical counterpart of the estimated risk score for the subject with Z: Consider two independent copies 0 0 0 f(T1;Z1; g(Z1)) ; (T2;Z2; g(Z2)) g of (T; Z; g(Z)) : A commonly used concordance measure is C = pr(g(Z1) > g(Z2) j T2 > T1) (1:1) (Heagerty & Zheng, 2005). When T is subject to right censoring, as discussed in Heagerty & 2 http://biostats.bepress.com/harvardbiostat/paper101 Zheng (2005) one would typically consider a modified Cτ with a fixed, prespecified follow-up period (0; τ); where Cτ = pr(g(Z1) > g(Z2) j T2 > T1;T1 < τ): (1:2) Estimation of (1.1) or (1.2) when the event time may be censored, however, is not straightforward (Harrell et al., 1996; Pencina and D’Agostino, 2004; Gönen & Heller, 2005; Chambless & Diao, 2006). The estimator for C or Cτ proposed by Heagerty & Zheng (2005) is derived under a proportional hazards model. If this parametric working model is not correctly specified, the resulting estimator may be biased. A popular nonparametric C-statistic for estimating C was proposed by Harrell et al. (1996), which was extensively studied by Pencina & D’Agostino (2004). Note that this generalization is a weighted average of the time-dependent AUCs, which corresponds to the “incident/dynamic” ROC curve (Heagerty & Zheng, 2005; Cai et al., 2006) with weights depending on the study-specific censoring distribution. When the study individuals had differential follow-up times, the C-statistic studied by Harrell et al. (1996) converges to an association measure, which depends on the study-specific censoring distribution. In this article, under the general random censorship assumption, we provide a simple, purely non-parametric estimator for (1.2), which is free of the censoring distribution. Furthermore, we study the large sample properties of the new estimation procedure. Our proposal is illustrated with two real examples. The performance of the new proposal under various practical settings is also examined via a simulation study. 3 Hosted by The Berkeley Electronic Press 2. INFERENCE PROCEDURES FOR DEGREE OF ASSOCIATION BETWEEN EVENT TIMES AND ESTIMATED RISK SCORES In this section, we consider a non-trivial case that at least one component of the covariate vector Z is continuous. For the survival time T; let D be the corresponding censoring variable. Assume that D is independent of T and Z. Also, let f(Ti;Zi;Di) ; i = 1; : : : ; ng be n independent copies of f(T; Z; D)g. For the ith subject, we only observe (Xi;Zi; ∆i) ; where Xi = min (Ti;Di), and ∆i equals 1 if Xi = Ti and 0 otherwise. Suppose that we fit the data with a working parametric or semi-parametric regression model, for example, a standard Cox proportional hazards model (Cox, 1972): 0 ΛZ (t) = Λ0(t) exp(β Z); (2:1) where ΛZ (·) is the cumulative hazard function for subjects with covariate vector Z; Λ0(·) is the unknown baseline cumulative hazard function and β is the unknown p × 1 parameter vector. Let the maximum partial likelihood estimator for β be denoted by β:^ Note that even when the model (2.1) is not correctly specified, under a rather mild non-separable condition that there does not 0 0 ^ exist vector ζ such that pr(T1 > T2 j ζ Z1 < ζ Z2) = 1, β converges to a constant vector, say, β0; as n ! 1: This stability property is important for deriving the new inference procedure. 0 For a pair of future patients with covariate vectors fZk ; k = 1; 2g and the potential survival 0 ^0 0 times fTk ; k = 1; 2g, their corresponding risk scores are fβ Zk ; k = 1; 2g: To evaluate this risk score system, one may use the concordance measure discussed in Section 1: ^0 0 ^0 0 0 0 Cn = pr(β Z1 > β Z2 j T1 < T2 ); 4 http://biostats.bepress.com/harvardbiostat/paper101 0 0 0 0 where the probability is evaluated with respect to the data, and (T1 ;Z1 ) and (T2 ;Z2 ). Note that Cn depends on the sample size. Let the limit of Cn be denoted by 0 0 0 0 0 0 C = pr(β0Z1 > β0Z2 j T1 < T2 ): (2:2) Now, since the support of the censoring variable D is usually shorter than that of the failure time T; the tail part of the estimated survival function of T is rather unstable. Therefore, we consider a truncated version of C in (2.2), that is, 0 0 0 0 0 0 0 Cτ = pr(β0Z1 > β0Z2 j T1 < T2 ;T1 < τ); where τ is a prespecified time point such that pr(D > τ) > 0: It follows from an “inverse probability weighting” technique proposed by Cheng et al. (1995) for dealing with a completely different problem in survival analysis that Cτ can be consistently, nonparametrically estimated by Pn Pn ∆ fG^(X )g−2I(X < X ;X < τ)I(βZ^ > βZ^ ) ^ i=1 j=1 i i i j i i j Cτ = ; (2:3) Pn Pn ^ −2 i=1 j=1 ∆ifG(Xi)g I(Xi < Xj;Xi < τ) where I(·) is the indicator function and G^(·) is the Kaplan-Meier estimator for the censoring distribution G(t) = pr(D > t): Heuristically, the consistency of the above estimator follows from the fact that as n ! 1; the denominator of (2.3) divided by n2 converges to ∆1I(X1 < X2;X1 < τ) I(T1 < T2;T1 < τ)I(D1 ^ D2 > T1) E 2 = E E 2 T1 G (X1) G (T1) = pr(T1 < T2;T1 < τ); 2 0 0 0 0 0 0 0 and the numerator of (2.3) divided by n converges to pr(β0Z1 > β0Z2 ;T1 < T2 ;T1 < τ): 5 Hosted by The Berkeley Electronic Press In the Appendix, we show that 1 ^ W = n 2 (Cτ − Cτ ) is asymptotically normal with mean 0.

On the C-Statistics for Evaluating Overall Adequacy of Risk Prediction Procedures with Censored Survival Data

Minimum Sample Sizes for On-Road Measurements of Car

Likelihood-Based Random-Effects Meta-Analysis with Few Studies

Targeted Maximum Likelihood Estimation of Treatment Effects In

Model-Based Estimation Official Statistics

Invasive Plant Researchers Should Calculate Effect Sizes, Not P-Values Author(S): Matthew J

Background Material

Precise Unbiased Estimation in Randomized Experiments Using Auxiliary Observational Data

Statistical Inference with Imprecise Data - Reinhard Viertl

Download Article (PDF)

Package 'Binomsamsize'

Guidelines on Small Area Estimation for City Statistics and Other Functional Geographies 2019 Edition Statistical Requirements Compendium Requirements Statistical

Read Book Understanding the New Statistics : Effect Sizes, Confidence