Contemporary Clinical Trials Communications 15 (2019) 100402
Journal homepage: www.elsevier.com/locate/conctc

An alternative trial-level measure for evaluating failure-time surrogate endpoints based on prediction error

Shaima Belhechmi a,b, Stefan Michiels a,b,*, Xavier Paoletti a,b, Federico Rotolo c

a Université Paris-Saclay, Univ. Paris-Sud, UVSQ, CESP, INSERM, U1018 ONCOSTAT, F-94805, Villejuif, France
b Gustave Roussy, Service de biostatistique et d’épidémiologie, F-94805, Villejuif, France
c Innate Pharma, Biostatistics and Data Management Unit, F-13009, Marseille, France

ARTICLE INFO

Keywords: Survival analysis; Surrogate endpoint evaluation; Bivariate models; Copula models; Cox model; Simulation studies

2010 MSC: 00–01, 99–00

ABSTRACT

To validate a failure-time surrogate for an established failure-time clinical endpoint such as overall survival, the meta-analytic approach is commonly used. The standard correlation approach considers two levels: the individual level, with Kendall's τ measuring the rank correlation between the endpoints, and the trial level, with the coefficient of determination $R^2$ measuring the correlation between the treatment effects on the surrogate and on the final endpoint. However, the estimation of $R^2$ is not robust with respect to the estimation error of the trial-specific treatment effects. The alternative proposed in this article uses a prediction error based on a measure of the weighted difference between the observed treatment effect on the final endpoint and a model-based predicted effect. The measures can be estimated by cross-validation within the meta-analytic setting or external validation on a set of trials. Several distances are presented, with varying weights, based on the standard error of the observed treatment effect and of its predicted value.
A simulation study was conducted under different scenarios, varying the number and the size of the trials, Kendall's τ and $R^2$. These measures have been applied to individual patient data from a meta-analysis of trials in advanced/recurrent gastric cancer (20 randomized trials of chemotherapy, 4069 patients). The distance-based measures appeared to be robust with respect to different values of simulation parameters in several scenarios (such as Kendall's τ, size and number of clinical trials). The absolute prediction error can be an alternative to the trial-level $R^2$ for evaluation of candidate time-to-event surrogates.

* Corresponding author. Postal address: Service de Biostatistique et d’épidémiologie, Gustave Roussy, B2M, RdC. 114, rue Edouard Vaillant, 94805, Villejuif, France. E-mail address: [email protected] (S. Michiels).
https://doi.org/10.1016/j.conctc.2019.100402
Received 18 January 2019; Received in revised form 13 June 2019; Accepted 24 June 2019; Available online 05 July 2019
2451-8654/ © 2019 The Authors. Published by Elsevier Inc. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/BY-NC-ND/4.0/).

1. Introduction

The main objective of confirmatory phase-III clinical trials is to determine whether a new treatment is effective. In oncology, the final endpoint of randomized phase-III clinical trials is often the overall survival (OS). However, the use of a final endpoint like OS requires a large number of patients, a long follow-up period, and high research and development costs to achieve the statistical power required in phase-III trials. It is therefore of interest to use intermediate criteria that can be evaluated earlier and used as a surrogate for the final endpoint. Surrogate endpoints may allow shorter trial duration, a smaller number of patients, and reduced costs. A surrogate endpoint can also be of interest if it can be measured in a simpler, more reproducible and accurate or less invasive way for patients compared to the reference endpoint.

Before using a surrogate endpoint as a substitute for the reference endpoint, it must be validated. The clinical conclusions must be consistent between the two criteria. In this perspective, several methodological works have been developed in the last twenty years within the meta-analytical approach, which is based on the correlation between the surrogate endpoint and the final endpoint [1–3]. The meta-analytical approach considers two levels of validation for a surrogate endpoint: the individual level and the trial level. Validity at the individual level means that, for each patient, the surrogate endpoint is correlated to the final endpoint. Individual surrogacy is straightforward to verify and requires only data from a single trial. Validity at the trial level means that, for each trial, the treatment effect on the surrogate endpoint correlates with the treatment effect on the final endpoint. At the individual level, Kendall's τ can be used as a measure of validation. At the trial level, for time-to-event endpoints, Burzykowski et al. [4] developed a two-stage copula-based approach, with an adjusted trial-level surrogacy measure $R^2$ calculated in the second stage, which takes estimation error of the treatment effects at the first stage into account. This approach can also provide, for a new trial, a prediction of the effect of the treatment on the final endpoint on the basis of the intermediate endpoint. However, the estimation of the coefficient of determination $R^2$ is not robust with respect to the estimation error of the trial-specific treatment effects, is sensitive to trials with extreme treatment effects, has some convergence issues and often comes with a large standard error [5,6].

Recently, Gabriel et al. [7] have proposed to use an absolute prediction error instead of $R^2$ to compare trial-level surrogates in a binary setting and formalized the use of cross-validation in this case. In the time-to-event context, Baker and Kramer [8] used the difference in survival at a given time as outcome and proposed a prediction error estimate using leave-one-out cross-validation.

In this paper, we propose the use of an absolute prediction error as an alternative measure to the trial-level $R^2$ for the case of two failure-time endpoints. The new surrogacy measures proposed here are based on the distance between the effect of the treatment observed on the final endpoint in each trial and its predicted value obtained from the effect observed on the surrogate endpoint, based on the prediction model fitted using data from the other trials. We present several weights for distance measurement based on the standard error of the observed treatment effect and the standard error of its predicted value. A leave-one-trial-out cross-validation scheme is used to estimate the prediction error in the context of an internal validation. The potential benefits of this type of measure are twofold: first, it avoids computing the $R^2$, for which convergence problems are frequently encountered; second, it uses metrics directly measured on the scale of the treatment effect and quantifies the error of its prediction.

2. Methods

The meta-analytic approach allows estimating the relation between the effect of the treatment on the surrogate endpoint and on the final endpoint. A valid surrogate endpoint must predict the final effect precisely, and the difference between the predicted and the observed final effects quantifies the predictive value of the surrogate.

2.1. Two-step surrogacy model

The proportional hazard model [9] is often used in the analysis of survival data to evaluate the effect of various covariates on the event of interest. Let $S_{ij}$ and $T_{ij}$ be the surrogate and final endpoint time variables, respectively, for patient $j = 1, \dots, n_i$ in trial $i = 1, \dots, N$. Burzykowski et al. [4] proposed the following two-step model to validate $S$ as surrogate of $T$:

$$h_{Sij}(s; Z_{ij}) = h_{Si}(s)\exp\{\alpha_i Z_{ij}\},$$

estimation errors at the first step, to estimate the relationship between the treatment effect on the surrogate endpoint and the treatment effect on the final endpoint. Indeed, the estimates of the treatment effects obtained in the first stage are assumed to follow the mixed model

$$\begin{pmatrix} \hat{\alpha}_i \\ \hat{\beta}_i \end{pmatrix} = \begin{pmatrix} \alpha_i \\ \beta_i \end{pmatrix} + \begin{pmatrix} \varepsilon_{ai} \\ \varepsilon_{bi} \end{pmatrix},$$

with true treatment effects

$$\begin{pmatrix} \alpha_i \\ \beta_i \end{pmatrix} \sim N\left( \begin{pmatrix} \alpha \\ \beta \end{pmatrix}, \begin{pmatrix} d_a^2 & d_a d_b \rho_{trial} \\ d_a d_b \rho_{trial} & d_b^2 \end{pmatrix} \right),$$

and estimation errors

$$\begin{pmatrix} \varepsilon_{ai} \\ \varepsilon_{bi} \end{pmatrix} \sim N\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \Omega_i \right), \qquad \Omega_i = \begin{pmatrix} \sigma_{ai}^2 & \sigma_{ai}\sigma_{bi}\rho_{\varepsilon i} \\ \sigma_{ai}\sigma_{bi}\rho_{\varepsilon i} & \sigma_{bi}^2 \end{pmatrix}.$$

The measure of the surrogacy at the trial level is then the determination coefficient $R^2_{trial} = \rho^2_{trial}$, obtained by fixing the estimation errors $\Omega_i$ at their values estimated in the first step [13]. In real-life examples, the algorithm used to estimate the coefficients of the model often does not converge, and an “unadjusted” model is used in which $\Omega_i$ is fixed to zero. We denote the model which fully accounts for measurement error as “adjusted”.

2.2. Cross-validation

In the meta-analytic context, we have previously proposed to compare the observed treatment effect on the final endpoint with an independent prediction through leave-one-(trial)-out cross-validation (LOOCV) [14]. LOOCV consists of excluding one trial at a time, estimating the two-step model (1) from the remaining trials, and then applying this prediction model to the excluded trial. This last step consists of plugging the observed treatment effect on the surrogate endpoint in the left-out trial into the prediction model in order to obtain the predicted treatment effect on the final endpoint for the left-out trial.

2.3. Motivating example

Our motivating example was the advanced GASTRIC meta-analysis [15], which included individual data of 4069 patients with advanced/recurrent gastric cancer from 20 randomized trials of chemotherapy. The main endpoint was OS and the candidate surrogate was progression-free survival (PFS). In previous works [6,16], the individual-level association between PFS and OS, as measured by the rank correlation coefficient, was given by a value of Kendall's τ of 0.85 (95% confidence interval [CI]: 0.85–0.85).
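
As a rough illustration of the leave-one-trial-out scheme described in Section 2.2, the following Python sketch computes an unweighted mean absolute prediction error from trial-level treatment effects. It rests on several assumptions that are not taken from the article: the prediction model is a plain least-squares line (an "unadjusted"-style stand-in for the two-step copula model), no standard-error weights are applied, the function names `fit_line` and `loocv_abs_prediction_error` are hypothetical, and the log hazard ratios are made-up values used only to exercise the code.

```python
from statistics import mean


def fit_line(x, y):
    """Least-squares fit y = intercept + slope * x.

    Assumption: a simple stand-in for the second-stage prediction model,
    ignoring the estimation error of the trial-specific effects.
    """
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def loocv_abs_prediction_error(alpha_hat, beta_hat):
    """Mean absolute gap between observed and leave-one-trial-out predicted effects."""
    errors = []
    for i in range(len(alpha_hat)):
        # Exclude trial i and fit the prediction model on the remaining trials.
        x_train = alpha_hat[:i] + alpha_hat[i + 1:]
        y_train = beta_hat[:i] + beta_hat[i + 1:]
        intercept, slope = fit_line(x_train, y_train)
        # Plug the left-out trial's surrogate effect into the fitted model.
        predicted = intercept + slope * alpha_hat[i]
        errors.append(abs(beta_hat[i] - predicted))
    return mean(errors)


# Hypothetical trial-level log hazard ratios (illustrative values only).
alpha_hat = [-0.40, -0.25, -0.10, 0.05, -0.30, -0.15]  # effects on the surrogate
beta_hat = [-0.30, -0.20, -0.05, 0.02, -0.28, -0.08]   # effects on the final endpoint

mae = loocv_abs_prediction_error(alpha_hat, beta_hat)
print(f"LOOCV mean absolute prediction error: {mae:.3f}")
```

The result is expressed on the scale of the treatment effect itself (here, log hazard ratios), which is the property the text highlights as an advantage over $R^2$. A weighted variant would divide each absolute difference by a quantity based on the standard errors of the observed and predicted effects, which this sketch leaves out.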