Survival Analysis
APTS 2016/17
Ingrid Van Keilegom ORSTAT KU Leuven
Glasgow, August 21-25, 2017 Basic concepts
Nonparametric estimation
Hypothesis testing in a nonparametric setting
Proportional Basic concepts hazards models
Parametric survival models What is ‘Survival analysis’ ? Basic concepts Survival analysis (or duration analysis) is an area of Nonparametric estimation statistics that models and studies the time until an Hypothesis event of interest takes place. testing in a nonparametric setting In practice, for some subjects the event of interest Proportional cannot be observed for various reasons, e.g. hazards models • the event is not yet observed at the end of the study Parametric • another event takes place before the event of interest survival models • ...
In survival analysis the aim is to model ‘time-to-event data’ in an appropriate way to do correct inference taking these special features of the data into account. Examples Basic concepts Medicine : Nonparametric • time to death for patients having a certain disease estimation • time to getting cured from a certain disease Hypothesis testing in a • time to relapse of a certain disease nonparametric setting Agriculture : Proportional hazards • time until a farm experiences its first case of a certain models disease Parametric survival models Sociology (‘duration analysis’) : • time to find a new job after a period of unemployment • time until re-arrest after release from prison
Engineering (‘reliability analysis’) : • time to the failure of a machine Common functions in survival analysis Basic concepts Let T be a non-negative continuous random variable, Nonparametric representing the time until the event of interest. estimation
Hypothesis Denote testing in a nonparametric F(t) = P(T ≤ t) distribution function setting f (t) probability density function Proportional hazards models For survival data, we consider rather Parametric survival S(t) survival function models H(t) cumulative hazard function h(t) hazard function mrl(t) mean residual life function
Knowing one of these functions suffices to determine the other functions. Survival function : Basic concepts S(t) = P(T > t) = 1 − F(t) Nonparametric estimation
Hypothesis Probability that a randomly selected individual will testing in a nonparametric survive beyond time t setting Decreasing function, taking values in [0, 1] Proportional hazards models Equals 1 at t = 0 and 0 at t = ∞
Parametric survival models Cumulative hazard function :
H(t) = − log S(t)
Increasing function, taking values in [0, +∞] S(t) = exp(−H(t)) Hazard function (or hazard rate) : Basic P(t ≤ T < t + ∆t | T ≥ t) concepts h(t)= lim Nonparametric ∆t→0 ∆t estimation 1 P(t ≤ T < t + ∆t) Hypothesis = lim testing in a P(T ≥ t) ∆t→0 ∆t nonparametric f (t) −d d setting = = log S(t) = H(t) Proportional S(t) dt dt hazards models
Parametric h(t) measures the instantaneous risk of dying right survival models after time t given the individual is alive at time t Positive function (not necessarily increasing or decreasing) The hazard function h(t) can have many different shapes and is therefore a useful tool to summarize survival data Hazard functions of different shapes
Basic
concepts 10
Nonparametric estimation Exponential Weibull, rho=0.5 Hypothesis 8 Weibull, rho=1.5 Bathtub testing in a nonparametric setting 6 Proportional hazards models Hazard
Parametric 4 survival models 2 0
0 5 10 15 20
Time Basic concepts Mean residual life function : Nonparametric The mrl function measures the expected remaining estimation
Hypothesis lifetime for an individual of age t. As a function of t, we testing in a have nonparametric R ∞ setting S(s)ds mrl(t) = t Proportional ( ) hazards S t models This result is obtained from ∞ Parametric R (s − t)f (s)ds survival mrl(t) = E(T − t | T > t)= t models S(t) Mean life time : Z ∞ Z ∞ E(T ) = mrl(0) = sf (s)ds = S(s)ds 0 0 Incomplete data Basic concepts Censoring : Nonparametric • For certain individuals under study, the time to the event estimation of interest is only known to be within a certain interval Hypothesis testing in a • Ex : In a clinical trial, some patients have not yet died at nonparametric setting the time of the analysis of the data
Proportional ⇒ Only a lower bound of the true survival time is known hazards models (right censoring)
Parametric survival Truncation : models • Part of the relevant subjects will not be present at all in the data • Ex : In a mortality study based on HIV/AIDS death records, only subjects who died of HIV/AIDS and recorded as such are included (right truncation) Censoring and truncation do not only take place in Basic concepts ‘time-to-event’ data. Nonparametric estimation Examples Hypothesis testing in a Insurance : Car accidents involving costs below a nonparametric setting certain threshold are often not declared to the
Proportional insurance company hazards models ⇒ Left truncation Parametric Ecology : Chemicals in river water cannot be detected survival models below the detection limit of the laboratory instrument ⇒ Left censoring Astronomy : A star is only observable with a telescope if it is bright enough to be seen by the telescope ⇒ Left truncation Basic concepts
Nonparametric estimation Right censoring Hypothesis Only a lower bound for the time of interest is known testing in a nonparametric setting T = survival time Proportional C = censoring time hazards models ⇒ Data : (Y , δ) with Parametric survival Y = min(T , C) models δ = I(T ≤ C) Type I right censoring
Basic All subjects are followed for a fixed amount of time concepts → all censored subjects have the same censoring time Nonparametric estimation Ex : Type I censoring in animal study Hypothesis testing in a nonparametric setting
Proportional hazards models
Parametric survival models Basic concepts
Nonparametric Type II right censoring estimation
Hypothesis All subjects start to be followed up at the same time and testing in a nonparametric follow up continues until r individuals have experienced setting the event of interest (r is some predetermined integer) Proportional hazards → The n − r censored items all have a censoring time models equal to the failure time of the r th item. Parametric survival Ex : Type II censoring in industrial study : all lamps are models put on test at the same time and the test is terminated when r of the n lamps have failed. Basic concepts Random right censoring Nonparametric estimation The study itself continues until a fixed time point but
Hypothesis subjects enter and leave the study at different times testing in a nonparametric → censoring is a random variable setting → censoring can occur for various reasons: Proportional – end of study hazards models – lost to follow up
Parametric – competing event (e.g. death due to some cause other survival than the cause of interest) models – patient withdrawing from the study, change of treatment, ... Ex : Random right censoring in a cancer clinical trial Basic concepts Example : Random right censoring in HIV study Nonparametric estimation Study enrolment: January 2005 - December 2006 Hypothesis testing in a Study end: December 2008 nonparametric setting Objective: HIV patients followed up to death due to Proportional hazards AIDS or AIDS related complication (time in month from models confirmed diagnosis) Parametric survival Possible causes of censoring : models • death due to other cause • lost to follow up / dropped out • still alive at the end of study Table: Data of 6 patients in HIV study
Basic Patient id Entry Date Date last seen Status Time Censoring concepts 1 18 March 2005 20 June 2005 Dropped out 3 0 Nonparametric 2 19 Sept 2006 20 March 2007 Dead due to AIDS 6 1 estimation 3 15 May 2006 16 Oct 2006 Dead due to accident 5 0 4 01 Dec 2005 31 Dec 2008 Alive 37 0 Hypothesis 5 9 Apr 2005 10 Feb 2007 Dead due to AIDS 22 1 testing in a 6 25 Jan 2005 24 Jan 2006 Dead due to AIDS 12 1 nonparametric setting
Proportional hazards models
Parametric survival models Left censoring
Basic Some subjects have already experienced the event of concepts interest at the time they enter in the trial Nonparametric estimation Only an upper bound for the time of interest is known Hypothesis ⇒ Data : (Y , δ ) with testing in a ` ` nonparametric setting Y` = max(T , C`)
Proportional δ` = I(T > C`) hazards models C` = censoring time Parametric survival Ex : Left censoring in malaria trial models • Children between 2 and 10 years are followed up for malaria • Once children have experienced malaria, they will have antibodies in their blood against the Plasmodium parasite • Children entered at the age of 2 might have already been in touch with the parasite Basic concepts Interval censoring Nonparametric estimation The event of interest is only known to occur within a Hypothesis certain interval (L, U) testing in a nonparametric setting Contrary to right and left censoring, we never observe
Proportional the exact survival time hazards models Typically occurs if diagnostic tests are used to assess Parametric the event of interest survival models Ex : Interval censoring in malaria trial → The exact time to malaria is between the last negative and the first positive test Basic concepts Truncation : Individuals of a subset of the population of Nonparametric interest do not appear in the sample estimation
Hypothesis testing in a Left truncation nonparametric setting Occurs often in studies where a subject must first meet
Proportional a particular condition before he/she can enter in the hazards models study and followed up for the event of interest Parametric ⇒ Subjects that experience the event of interest before survival models the condition is met, will not appear in the study Data : (T , L) observed if T ≥ L, with T = survival time L = left truncation time Basic concepts
Nonparametric estimation Ex : Left truncation in HIV study Hypothesis • Incubation period between HIV infection and testing in a nonparametric seroconversion setting • An individual is considered to have been infected with Proportional hazards HIV only after seroconversion models ⇒ If we study HIV infected individuals and follow them Parametric for survival, all subjects that died between HIV infection survival models and seroconversion will not be considered for inclusion in the study Right truncation Basic concepts Occurs when only subjects who have experienced the Nonparametric estimation event of interest are included in the sample Hypothesis Data : (T , R) observed if T ≤ R, with testing in a nonparametric setting T = survival time
Proportional R = right truncation time hazards models Ex : Right truncation in AIDS study Parametric • Consider time between HIV seroconversion and survival models development of AIDS • Often use a sample of AIDS patients, and ascertain retrospectively time of HIV infection ⇒ Patients with long incubation time will not be part of the sample, nor patients that die from another cause before they develop AIDS Basic concepts
Nonparametric estimation Remark Hypothesis testing in a Censoring : nonparametric setting At least some information is available for a ‘complete’ Proportional random sample of the population hazards models Truncation : Parametric survival No information at all is available for a subset of the models population Basic concepts
Nonparametric estimation
Hypothesis testing in a nonparametric setting
Proportional Nonparametric estimation hazards models
Parametric survival models We will develop nonparametric estimators of the Basic survival function concepts
Nonparametric cumulative hazard function estimation hazard rate Hypothesis testing in a nonparametric for censored and truncated data setting
Proportional hazards All these estimators will be based on the nonparametric models likelihood function : Parametric survival Different from the likelihood for completely observed models data due to the presence of censoring and truncation We will derive the likelihood function for : • right censored data • any type of censored data (right, left and interval censoring) • truncated data Basic Likelihood for randomly right censored data concepts
Nonparametric Random sample of individuals of size n : estimation T1,..., Tn survival time Hypothesis C ,..., C censoring time testing in a 1 n nonparametric setting ⇒ Observed data : (Yi , δi ) (i = 1,..., n) with
Proportional Yi = min(Ti , Ci ) hazards models δi = I(Ti ≤ Ci ) Parametric survival Denote models f (·) and F(·) for the density and distribution of T g(·) and G(·) for the density and distribution of C and we assume that T and C are independent (called independent censoring) Contribution to the likelihood of an event (y = t , δ = 1) : Basic i i i concepts 1 Nonparametric lim P (yi − < Y < yi + , δ = 1) estimation →0 2 > Hypothesis testing in a 1 nonparametric = lim P (yi − < T < yi + , T ≤ C) setting →0 2 > Proportional hazards yi + ∞ models 1 Z Z = lim dG(c)dF(t) (due to independence) Parametric →0 2 survival > yi − t models yi + 1 Z = lim (1 − G(t))dF(t) →0 2 > yi − = (1 − G(yi ))f (yi ) Basic Contribution to the likelihood of a right censored observation concepts (yi = ci , δi = 0) : Nonparametric estimation 1 Hypothesis lim P (yi − < Y < yi + , δ = 0) testing in a →0 2 nonparametric > setting 1 Proportional = lim P (yi − < C < yi + , T > C) hazards →0 2 models >
Parametric = (1 − F(yi ))g(yi ) survival models This leads to the following formula of the likelihood :
n Y h iδi h i1−δi (1 − G(yi ))f (yi ) (1 − F(yi ))g(yi ) i=1 Basic concepts We assume that the censoring is uninformative, i.e. the Nonparametric estimation distribution of the censoring times does not depend on the Hypothesis parameters of interest related to the survival function. testing in a nonparametric δ 1−δ setting ⇒ The factors (1 − G(yi )) i and g(yi ) i are Proportional hazards non-informative for inference on the survival function models
Parametric ⇒ They can be removed from the likelihood, leading to survival models n n Y δi 1−δi Y δi f (yi ) S(yi ) = h(yi ) S(yi ) i=1 i=1 Basic concepts
Nonparametric estimation This likelihood can also be written as Hypothesis Y Y testing in a L = f (yi ) S(yi ) nonparametric setting i∈D i∈R
Proportional with D the index set of survival times and R the index hazards models set of right censored times
Parametric survival It is straightforward to see that the same survival models likelihood is also valid in the case of fixed censoring times (type I and type II) Basic Likelihood for right, left and/or interval censored data concepts
Nonparametric Generalization of the previous likelihood to include right, left estimation and interval censoring : Hypothesis testing in a nonparametric Y Y Y Y setting L = f (yi ) S(yi ) (1 − S(yi )) (S(li ) − S(ri )), Proportional i∈D i∈R i∈L i∈I hazards models with Parametric survival D index set of survival times models R index set of right censored times L index set of left censored times I index set of interval censored times
(with li the lower limit and ri the upper limit) Likelihood for left truncated data Basic concepts Suppose that the survival time Ti is left truncated at ai Nonparametric ⇒ We have to consider the conditional distribution of Ti estimation given Ti ≥ ai : Hypothesis testing in a nonparametric 1 setting f (ti |T ≥ ai ) = lim P(ti − < T < ti + | T ≥ ai ) →0 2 Proportional > hazards 1 P(t − < T < t + , T ≥ a ) models = lim i i i Parametric →0 2 P(T ≥ ai ) survival > models 1 P(t < T < t + ) = lim i i P(T ≥ ai ) →0 > f (t ) = i S(ai ) Basic concepts This leads to the following likelihood, accommodating left
Nonparametric truncation and any type of censoring : estimation
Hypothesis Y f (ti ) Y S(ti ) Y S(ai ) − S(ti ) Y S(li ) − S(ri ) testing in a L = nonparametric S(ai ) S(ai ) S(ai ) S(ai ) setting i∈D i∈R i∈L i∈I
Proportional hazards For right truncated data : models
Parametric Consider the conditional density obtained by replacing survival models S(ai ) by 1 − S(bi ), where bi is the right truncation time for subject i The likelihood function can then be constructed in a similar way Basic concepts Nonparametric Nonparametric estimation of the survival function estimation Hypothesis The survival (or distribution) function is at the basis of testing in a nonparametric many other quantities (mean, quantiles, ...) setting
Proportional The survival function is also useful to identify an hazards models appropriate parametric distribution Parametric For estimating the survival function in a nonparametric survival models way, we need to take censoring and truncation into account Kaplan-Meier estimator of the survival function Basic concepts Kaplan and Meier (JASA, 1958)
Nonparametric estimation Nonparametric estimation of the survival function for
Hypothesis right censored data testing in a nonparametric Based on the order in which events and censored setting observations occur Proportional hazards models Notations :
Parametric survival n observations y1,..., yn with censoring indicators models δ1, . . . , δn r distinct event times (r ≤ n)
ordered event times : y(1),..., y(r) and corresponding number of events: d(1),..., d(r)
R(j) is the size of the risk set at event time y(j) Log-likelihood for right censored data : Basic n concepts X h i Nonparametric δi log f (yi ) + (1 − δi ) log S(yi ) estimation i=1 Hypothesis Replacing the density function f (yi ) by S(yi−) − S(yi ), testing in a nonparametric yields the nonparametric log-likelihood : setting n Proportional X h i hazards log L = δi log(S(yi−) − S(yi )) + (1 − δi ) log S(yi ) models i=1 Parametric ˆ survival Aim : finding an estimator S(·) which maximizes log L models It can be shown that the maximizer of log L takes the following form : ˆ Y S(t) = (1 − h(j)),
j:y(j)≤t
for some h(1),..., h(r) Basic concepts Plugging-in Sˆ(·) into the log-likelihood, gives after some Nonparametric estimation algebra : Hypothesis r testing in a X h i nonparametric log L = d(j) log h(j) + R(j) − d(j) log(1 − h(j)) setting j=1 Proportional hazards Using this expression to solve models d Parametric log L = 0 survival dh(j) models leads to ˆ d(j) h(j) = R(j) Basic concepts Plugging in this estimate hˆ in Sˆ(t) = Q (1 − h ) (j) j:y(j)≤t (j) Nonparametric we obtain : estimation Y R(j) − d(j) Hypothesis Sˆ(t) = = Kaplan-Meier estimator testing in a R(j) nonparametric j:y(j)≤t setting Step function with jumps at the event times Proportional hazards If the largest observation, say y , is censored : models n • ˆ Parametric S(t) does not attain 0 survival • Impossible to estimate S(t) consistently beyond yn models • Various solutions : ˆ - Set S(t) = 0 for t ≥ yn ˆ ˆ - Set S(t) = S(yn) for t ≥ yn ˆ - Let S(t) be undefined for t ≥ yn Uncensored case
Basic When all data are uncensored, the Kaplan-Meier estimator concepts reduces to the empirical distribution function Nonparametric estimation Consider case without ties for simplicity : Hypothesis testing in a If no censoring, R(j) − d(j) = R(j+1) for j = 1,..., r nonparametric setting We can rewrite the KM estimator as Proportional hazards ˆ R(2) R(3) R(k+1) models S(t) = ··· where y(k) ≤ t < y(k+1) R(1) R(2) R(k) Parametric survival R(k+1) models = R(1) # subjects with survival time ≥ y = (k+1) # at risk before first death time n 1 X = I(y > t) n i i=1 Asymptotic normality of the KM estimator
Basic Asymptotic variance of the KM estimator : concepts Z t dHu(s) Nonparametric V (Sˆ(t)) = n−1S2(t) , estimation As 0 (1 − H(s))(1 − H(s−)) Hypothesis where testing in a nonparametric - H(t) = P(Y ≤ t) = 1 − S(t)(1 − G(t)) setting u( ) = ( ≤ , δ = ) Proportional - H t P Y t 1 hazards models This variance can be consistently estimated as
Parametric (Greenwood formula) survival d models ˆ ˆ ˆ 2 X (j) VAs(S(t)) = S (t) R(j)(R(j) − d(j)) j:y(j)≤t Asymptotic normality of Sˆ(t) :
Sˆ(t) − S(t) d q → N(0, 1) ˆ ˆ VAs(S(t)) Basic Nelson-Aalen estimator of the cumulative hazard function concepts
Nonparametric Proposed independently by Nelson (Technometrics, estimation 1972) and Aalen (Annals of Statistics, 1978) : Hypothesis d testing in a ˆ X (j) nonparametric H(t) = for t ≤ y(r) setting R(j) j:y(j)≤t Proportional hazards Its asymptotic variance can be estimated by models X d(j) Parametric ˆ ( ˆ ( )) = VAs H t 2 survival R(j) models j:y(j)≤t Asymptotic normality :
Hˆ (t) − H(t) d q → N(0, 1) ˆ ˆ VAs(H(t)) Basic concepts Alternative for KM estimator Nonparametric estimation An alternative estimator for S(t) can be obtained based Hypothesis testing in a on the Nelson-Aalen estimator using the relation nonparametric setting S(t) = exp(−H(t)),
Proportional hazards leading to models Y d(j) Sˆ (t) = exp − Parametric alt R survival j:y ≤t (j) models (j) ˆ ˆ S(t) and Salt (t) are asymptotically equivalent ˆ ˆ Salt (t) performs often better than S(t) for small samples Basic concepts Example : Survival function for 6 HIV diagnosed patients Nonparametric estimation Ordered observed times: 3*, 5*, 6, 12*, 22, 37* Hypothesis testing in a Only two contributions to KM and NA estimator : nonparametric setting Event time Proportional 6 22 hazards models Number of events d(j) 1 1 Number at risk R(j) 4 2 Parametric KM contribution 1 − d /R 3/4 1/2 survival (j) (j) ˆ models KM estimator S(y(j)) 3/4=0.75 3/8=0.375
NA contribution exp(−d(j)/R(j)) 0.7788 0.6065 NA estimator Q exp(−d /R ) 0.7788 0.4723 j:y(j)≤t (j) (j) Basic concepts
Nonparametric 1.0 estimation
Hypothesis testing in a 0.8 nonparametric setting 0.6 Proportional hazards models 0.4 Estimated survival Parametric survival models 0.2 Kaplan−Meier Nelson−Aalen 0.0
0 5 10 15 20 25 30 35
Time Basic concepts
Nonparametric Confidence intervals for the survival function estimation From the asymptotic normality of Sˆ(t), a 100(1 − α)% Hypothesis testing in a confidence interval (CI) for S(t) (t fixed) is given by : nonparametric setting q ˆ ˆ ˆ Proportional S(t) ± zα/2 VAs(S(t)) hazards models
Parametric However, this CI may contain points outside the [0, 1] survival models interval ⇒ Use an appropriate transformation to determine the CI on the transformed scale and then transform back A popular transformation is log(− log S(t)), which takes Basic concepts values between −∞ and ∞.
Nonparametric estimation One can show that ˆ Hypothesis log(− log S(t)) − log(− log S(t)) d testing in a q → N(0, 1), nonparametric Vˆ log(− log Sˆ(t)) setting As
Proportional where hazards 1 X d(j) models Vˆ log(− log Sˆ(t)) = As ˆ 2 R (R − d ) Parametric log S(t) j:y ≤t (j) (j) (j) survival (j) models Hence, CI for log(− log S(t)) is given by q ˆ ˆ ˆ log(− log S(t)) ± zα/2 VAs log(− log S(t)) By transforming back, we get the following CI for S(t) : q ˆ ˆ exp ±zα/2 VAs log(− log S(t)) Sˆ(t) Point estimate of the mean survival time Nonparametric estimator can be obtained using the Basic concepts Kaplan-Meier estimator, since ∞ ∞ Nonparametric Z Z estimation µ = E(T ) = xf (x)dx = S(x)dx 0 0 Hypothesis testing in a ⇒ We can estimate µ by replacing S(x) by the KM nonparametric ˆ setting estimator S(x) Proportional But, Sˆ(t) is inconsistent in the right tail if the largest hazards models observation (say yn) is censored Parametric • Proposal 1 : assume yn experiences the event survival models immediately after the censoring time : Z yn ˆ µˆyn = S(t)dt 0 • Proposal 2 : restrict integration to a predetermined ˆ ˆ interval [0, tmax ] and consider S(t) = S(yn) for yn ≤ t ≤ tmax : Z tmax ˆ µˆtmax = S(t)dt 0 Basic concepts
Nonparametric µˆyn and µˆtmax are inconsistent estimators of µ, but given estimation the lack of data in the right tail, we cannot do better (at Hypothesis testing in a least not nonparametrically) nonparametric setting Variance of µˆτ (with τ either yn or tmax ): 2 Proportional r Z τ ! hazards ˆ X ˆ d(j) models VAs(ˆµτ ) = S(t)dt y R(j)(R(j) − d(j)) Parametric j=1 (j) survival models A 100(1 − α)% CI for µ is given by : q ˆ µˆτ ± zα/2 VAs(ˆµτ ) Point estimate of the median survival time Basic Advantages of the median over the mean : concepts • Nonparametric As survival function is often skewed to the right, the estimation mean is often influenced by outliers, whereas the Hypothesis median is not testing in a nonparametric • Median can be estimated in a consistent way (if setting censoring is not too heavy) Proportional th hazards An estimator of the p quantile xp is given by : models n ˆ o Parametric xˆp = inf t | S(t) ≤ 1 − p survival models ⇒ An estimate of the median is given by xˆp=0.5
Asymptotic variance of xˆp : ˆ ˆ ˆ VAs(S(xp)) VAs(xˆp) = , ˆ2 f (xp) where ˆf is an estimator of the density f Basic Estimation of f involves smoothing techniques and the concepts choice of a bandwidth sequence Nonparametric estimation ⇒ We prefer not to use this variance estimator in the Hypothesis construction of a CI testing in a nonparametric ˆ setting Thanks to the asymptotic normality of S(xp) : ˆ Proportional S(xp) − S(xp) hazards P − zα/2 ≤ q ≤ zα/2 ≈ 1 − α, models ˆ ˆ VAs(S(xp)) Parametric survival with obviously S(x ) = 1 − p. models p ⇒ A 100(1 − α)% CI for xp is given by Sˆ(t) − (1 − p) t : −zα/2 ≤ q ≤ zα/2 ˆ ˆ VAs(S(t)) Example : Schizophrenia patients Basic concepts Schizophrenia is one of the major mental illnesses Nonparametric estimation encountered in Ethiopia
Hypothesis → disorganized and abnormal thinking, behavior and testing in a nonparametric language + emotionally unresponsive setting
Proportional → higher mortality rates due to natural and unnatural hazards causes models Parametric Project on schizophrenia in Butajira, Ethiopia survival models → survey of the entire population (68491 individuals) in the age group 15-49 years
⇒ 280 cases of schizophrenia identified and followed for 5 years (1997-2001) Basic concepts Table: Data on schizophrenia patients Nonparametric estimation Hypothesis Patid Time Censor Education Onset Marital Gender Age testing in a nonparametric 1 1 1 1 37 3 1 44 setting 2 3 1 3 15 2 2 23 Proportional hazards 3 4 1 6 26 1 1 33 models
Parametric 4 5 1 12 25 1 1 31 survival 5 5 0 5 29 3 1 33 models ... 278 1787 0 2 16 2 1 18 279 1792 0 2 23 1 1 25 280 1794 1 2 28 1 1 35 Basic In R : survfit concepts schizo<-read.table("c://...//Schizophrenia.csv", Nonparametric header=T,sep=";") estimation KM_schizo_l<-survfit(Surv(Time,Censor)∼1,data=schizo, Hypothesis type="kaplan-meier", conf.type="log-log") testing in a nonparametric plot(KM_schizo_l, conf.int=T, xlab="Estimated survival", setting ylab="Time", yscale=1)
Proportional mtext("Kaplan-Meier estimate of the survival function hazards for Schizophrenic patients", 3,-3) models mtext("(confidence interval based on log-log Parametric transformation)", 3,-4) survival models In SAS : proc lifetest title1 ’Kaplan-Meier estimate of the survival function for Schizophrenic patients’; proc lifetest method=km width=0.5 data=schizo; time Time*Censor(0); run; Basic concepts
Nonparametric 1.0 Kaplan−Meier estimate of the survival function for Schizophrenic patients estimation (confidence interval based on log−log transformation) Hypothesis testing in a 0.8 nonparametric setting 0.6 Proportional hazards models Time 0.4 Parametric survival models 0.2 0.0
0 500 1000 1500
Estimated survival Basic concepts > KM_schizo_l Nonparametric Call: survfit(formula = Surv(Time, Censor) ~ 1, data = schizo, type = estimation "kaplan-meier", conf.type = "log-log")
Hypothesis n events median 0.95LCL 0.95UCL testing in a 280 163 933 757 1099 nonparametric setting > summary(KM_schizo_l) Proportional Call: survfit(formula = Surv(Time, Censor) ~ 1, data = schizo, type = hazards "kaplan-meier", conf.type = "log-log") models time n.risk n.event survival std.err lower 95% CI upper 95% CI Parametric 1 280 1 0.996 0.00357 0.9749 0.999 survival 3 279 1 0.993 0.00503 0.9717 0.998 models 4 277 1 0.989 0.00616 0.9671 0.997 … 1770 13 1 0.219 0.03998 0.1465 0.301 1773 12 1 0.201 0.04061 0.1283 0.285 1784 8 2 0.151 0.04329 0.0782 0.245 1785 6 2 0.100 0.04092 0.0387 0.197 1794 1 1 0.000 NA NA NA
Basic concepts
Nonparametric 1.0 Kaplan−Meier estimate of the survival function for Schizophrenic patients estimation (confidence interval based on Greenwood formula) Hypothesis testing in a 0.8 nonparametric setting 0.6 Proportional hazards models Time 0.4 Parametric survival models 0.2 0.0
0 500 1000 1500
Estimated survival Basic concepts > KM_schizo_g Nonparametric Call: survfit(formula = Surv(Time, Censor) ~ 1, data = schizo, type = estimation "kaplan-meier", conf.type = "plain")
Hypothesis n events median 0.95LCL 0.95UCL testing in a 280 163 933 766 1099 nonparametric setting > summary(KM_schizo_g) Call: survfit(formula = Surv(Time, Censor) ~ 1, data = schizo, type = Proportional "kaplan-meier", conf.type = "plain") hazards models time n.risk n.event survival std.err lower 95% CI upper 95% CI 1 280 1 0.996 0.00357 0.9894 1.000 Parametric 3 279 1 0.993 0.00503 0.9830 1.000 survival 4 277 1 0.989 0.00616 0.9772 1.000 models … 1770 13 1 0.219 0.03998 0.1409 0.298 1773 12 1 0.201 0.04061 0.1214 0.281 1784 8 2 0.151 0.04329 0.0659 0.236 1785 6 2 0.100 0.04092 0.0203 0.181 1794 1 1 0.000 NA NA NA Basic concepts
Nonparametric estimation Median survival time is estimated to be 933 days Hypothesis 95% CI for the median : [757, 1099] testing in a nonparametric setting Survival at, e.g., 505 days is estimated to be 0.6897
Proportional with std error 0.0290 hazards models 95% CI for S(505) : [0.6329, 0.7465] (without Parametric transformation) survival models 95% CI for S(505) : [0.6290, 0.7426] (using log-log transformation) Basic Estimation of the survival function for left truncated and right concepts censored data Nonparametric estimation We need to redefine R(j) : Hypothesis testing in a R = number of individuals at risk at time y nonparametric (j) (j) setting and under observation prior to time y(j) Proportional hazards = #{i : li ≤ y(j) ≤ yi }, models
Parametric where li is the truncation time. survival models We cannot estimate S(t), but only a conditional survival function
Sl (t) = P(T ≥ t | T ≥ l)
for some fixed value l ≥ min(l1,..., ln) Basic concepts
Nonparametric estimation
Hypothesis The conditional survival function Sl (t) is estimated by testing in a nonparametric ( setting 1 if t < l Sˆ (t)= d Proportional l Q 1 − (j) if t ≥ l hazards j:l≤y(j)≤t R(j) models
Parametric Proposed and named after Lynden-Bell (1971), an survival models astronomer Estimation of the hazard function for right censored data
Basic Usually more informative about the underlying concepts population than the survival or the cumulative hazard Nonparametric estimation function
Hypothesis testing in a Crude estimator : take the size of the jumps of the nonparametric cumulative hazard function setting
Proportional Ex : Crude estimator of the hazard function for data on hazards models schizophrenic patients
Parametric
survival 0.015 models 0.010 Hazard estimate 0.005 0.000
0 200 400 600 800 1000
Time (in days) Smoothed estimator of h(t) : (weighted) average of the Basic concepts crude estimator over all time points in the interval Nonparametric [t − b, t + b] for a certain value b, called the bandwidth estimation
Hypothesis Uniform weight over interval [t − b, t + b] : testing in a r nonparametric ˆ −1 X ˆ setting h(t) = (2b) I −b ≤ t − y(j) ≤ b ∆H(y(j)), Proportional j=1 hazards models where
Parametric - Hˆ (t) = Nelson-Aalen estimator survival ˆ ˆ ˆ models - ∆H(y(j)) = H(y(j)) − H(y(j−1)) General weight function : r X t − y(j) hˆ(t) = b−1 K ∆Hˆ (y ), b (j) j=1 where K (·) is a density function, called the kernel Example of kernels :
Basic Name Density function Support concepts ( ) = 1 − ≤ ≤ Nonparametric uniform K x 2 1 x 1 estimation 3 2 Epanechnikov K (x) = 4 (1 − x ) −1 ≤ x ≤ 1 Hypothesis 15 2 2 testing in a biweight K (x) = 16 (1 − x ) −1 ≤ x ≤ 1 nonparametric setting
Proportional hazards Ex : Smoothed estimator of the hazard function for data models on schizophrenic patients Parametric survival
models 1e−03 8e−04 6e−04 Smoothed hazard 4e−04
2e−04 Uniform Epanechnikov 0e+00
0 200 400 600 800 1000
Time Basic concepts The choice of the kernel does not have a major impact Nonparametric estimation on the estimated hazard rate, but the choice of the Hypothesis bandwidth does testing in a nonparametric ⇒ It is important to choose the bandwidth in an setting appropriate way, by e.g. plug-in, cross-validation, Proportional hazards bootstrap, ... techniques models ˆ Parametric Variance of h(t) can be estimated by survival r 2 models X t − y(j) Vˆ (hˆ(t)) = b−2 K ∆Vˆ (Hˆ (y )), As b As (j) j=1 ˆ ˆ ˆ ˆ ˆ ˆ where ∆VAs(H(y(j))) = VAs(H(y(j))) − VAs(H(y(j−1))) Basic concepts
Nonparametric estimation
Hypothesis testing in a nonparametric setting Hypothesis testing in a Proportional hazards nonparametric setting models
Parametric survival models Hypothesis testing in a nonparametric setting Basic concepts Hypotheses concerning the hazard function of one Nonparametric estimation population
Hypothesis Hypotheses comparing the hazard function of two or testing in a nonparametric more populations setting
Proportional Note that hazards models It is important to consider overall differences over time Parametric survival We will develop tests that look at weighted differences models between observed and expected quantities (under H0) Weights allow to put more emphasis on certain part of
the data (e.g. early or late departure from H0) Particular cases : log-rank test, Breslow’s test, Cox Mantel test, Peto and Peto test, ... Ex : Survival differences in leukemia patients : chemotherapy vs. chemotherapy + autologous Basic concepts transplantation Nonparametric estimation
Hypothesis testing in a nonparametric setting
Proportional hazards models
Parametric Survival survival Transplant+chemo models Only chemo 0.0 0.2 0.4 0.6 0.8 1.0
100 200 300
Time (in days) Basic Hypotheses for the hazard function of one population concepts Nonparametric Test whether a censored sample of size n comes from estimation a population with a known hazard function h (t) : Hypothesis 0 testing in a nonparametric H0 : h(t) = h0(t) for all t ≤ y(r) setting H1 : h(t) 6= h0(t) for some t ≤ y( ) Proportional r hazards models Based on the NA estimator of the cumulative hazard
Parametric function, a crude estimator of the hazard function at survival models time y(j) is d(j)
R(j)
Under H0, the hazard function at time y(j) is h0(y(j)) Basic Let w(t) be some weight function, with w(t) = 0 for concepts t > y(r) Nonparametric estimation Test statistic : r Hypothesis Z y(r) X d(j) testing in a Z = w(y(j)) − w(s)h0(s)ds nonparametric R(j) 0 setting j=1
Proportional Under H0 : hazards y models Z (r) h (s) V (Z ) = w 2(s) 0 ds Parametric 0 R(s) survival models with R(s) corresponding to the number of subjects in the risk set at time s For large samples : Z ≈ N(0, 1) pV (Z) One sample log-rank test Weight function : w(t) = R(t) Basic concepts Test statistic : Nonparametric r Z y(r) estimation X Z = d(j) − R(s)h0(s)ds Hypothesis 0 testing in a j=1 nonparametric r n setting X X Z yi = d − h (s)ds Proportional (j) 0 hazards j=1 i=1 0 models r n Parametric X X survival = d(j) − H0(yi ) = O − E models j=1 i=1
Under H0 : Z y(r) V (Z) = R(s)h0(s)ds = E 0 and O − E √ ≈ N(0, 1) E Basic Example : Survival in patients with Paget disease concepts
Nonparametric Benign form of breast cancer estimation Compare survival in a sample of patients to the survival Hypothesis testing in a in the overall population nonparametric setting • Data : Finkelstein et al. (2003)
Proportional • Hazard function of the population : standardized hazards models actuarial table Parametric Compute the expected number of deaths under H0 survival models using • follow-up information of the group of patients with Paget disease • relevant hazard function from standardized actuarial table Paget disease data: age (in years) at diagnosis Basic concepts time to death or censoring (in years) Nonparametric estimation censoring indicator
Hypothesis testing in a gender (1=male, 2=female) nonparametric setting race (1=Caucasian, 2=black)
Proportional hazards models Age Follow-up Status Gender Race Parametric 52 22 0 2 1 survival models 53 4 0 2 1 57 8 0 2 1 57 7 0 2 1 ... 85 6 1 2 1 86 1 0 2 1 Standardized actuarial table :
Basic age (in years) concepts hazard (per 100 subjects) for respectively Caucasian Nonparametric estimation males, Caucasian females, black males, and black Hypothesis females testing in a nonparametric setting Hazard function Proportional hazards Age Caucasian Caucasian black black models
Parametric male female male female survival models 50-54 0.6070 0.3608 1.3310 0.7156 55-59 0.9704 0.5942 1.9048 1.0558 60-64 1.5855 0.9632 2.8310 1.6048 ... 80-84 9.3128 6.2880 10.4625 7.2523 85- 17.7671 14.6814 16.0835 13.7017 Basic concepts Nonparametric E.g. first patient : Caucasian female followed from 52 estimation years on for 22 years : Hypothesis testing in a th nonparametric (1) hazard for the 52 year = 0.3608 setting (2) hazard for the 53th year = 0.3608 Proportional hazards ...... models (22) hazard for the 73th year = 2.3454 Parametric survival Total (cumulative hazard) = 25.637 models ⇒ for one particular patient (/100) = 0.25637 and do the same for all patients Basic concepts
Nonparametric estimation Expected number of deaths under H0 : E = 9.55 Hypothesis testing in a Observed number of deaths : O = 13 nonparametric setting Test statistic :
Proportional O − E 13 − 9.55 hazards √ = √ = 1.116 models E 9.55
Parametric Two-sided hypothesis test : survival models 2P(Z > 1.116) = 0.264
⇒ We do not reject H0 Basic concepts Other weight functions
Nonparametric Weight function proposed by Harrington and Fleming estimation (1982): Hypothesis testing in a p nonparametric w(t) = R(t)S (t)(1 − S (t))q p, q ≥ 0 setting 0 0
Proportional hazards p = q = 0 : log-rank test models
Parametric p > q : more weight on early deviations from H0 survival models p < q : more weight on late deviations from H0 p = q > 0 : more weight on deviations in the middle p = 1, q = 0 : generalization of the one-sample Wilcoxon test to censored data Basic concepts Comparing the hazard functions of two populations Nonparametric Hypothesis test : estimation
Hypothesis H0 : h1(t) = h2(t) for all t ≤ y(r) testing in a nonparametric H1 : h1(t) 6= h2(t) for some t ≤ y(r) setting Notations : Proportional hazards • y , y ,..., y : ordered event times in the pooled models (1) (2) (r) sample Parametric survival • d(j)k : number of events at time y(j) in sample k models (j = 1,..., r and k = 1, 2)
• R(j)k : number of individuals at risk at time y(j) in sample k P2 P2 • d(j) = k=1 d(j)k and R(j) = k=1 R(j)k Derive a 2 × 2 contingency table for each event time
y(j) : Basic concepts
Nonparametric Group Event No Event Total estimation 1 d(j)1 R(j)1 − d(j)1 R(j)1 Hypothesis testing in a 2 d(j)2 R(j)2 − d(j)2 R(j)2 nonparametric setting Total d(j) R(j) − d(j) R(j)
Proportional hazards Test the independence between the rows and the models columns, which corresponds to the assumption that the Parametric survival hazard in the two groups at time y(j) is the same models Test statistic with group 1 as reference group : d(j)R(j)1 Oj − Ej = d(j)1 − R(j)
with Oj = observed number of events in the first group Ej = expected number of events in the first group assuming that h1 ≡ h2 Basic concepts Test statistic : weighted average over the different event
Nonparametric times : r estimation X Hypothesis U = w(y(j))(Oj − Ej ) testing in a nonparametric j=1 setting r X d(j)R(j)1 Proportional = w(y(j)) d(j)1 − hazards R(j) models j=1
Parametric survival Different weights can be used, but choice must be models made before looking at the data For large samples and under the null hypothesis : U ≈ N(0, 1) pV (U) Basic concepts Variance of U : Nonparametric estimation Can be obtained by observing that conditional on d(j), Hypothesis R and R , the statistic d has a hypergeometric testing in a (j)1 (j) (j)1 nonparametric distribution setting
Proportional Hence, hazards r models X 2 V (U) = w (y(j))V (d(j)1) Parametric survival j=1 models R R r (j)1 (j)1 d(j) 1 − (R(j) − d(j)) X R(j) R(j) = w 2(y ) (j) R − 1 j=1 (j) Basic concepts Weights :
Nonparametric w(y(j)) = 1 estimation ,→ log-rank test Hypothesis testing in a ,→ optimum power to detect alternatives when the hazard nonparametric setting rates in the two populations are proportional to each
Proportional other hazards models w(y(j)) = R(j) Parametric survival ,→ generalization by Gehan (1965) of the two sample models Wilcoxon test
,→ puts more emphasis on early departures from H0 ,→ weights depend heavily on the event times and the censoring distribution w(y(j)) = f (R(j)) Basic concepts ,→ Tarone and Ware (1977) ,→ ( ) = p Nonparametric a suggested choice is f R(j) R(j) estimation ,→ puts more weight on early departures from H0 Hypothesis testing in a ˆ Q d(k) nonparametric w(y( )) = S(y( )) = 1 − j j y(k)≤y(j) R(k)+1 setting ,→ Peto and Peto (1972) and Kalbfleisch and Prentice Proportional hazards (1980) models ,→ based on an estimate of the common survival function Parametric survival close to the pooled product limit estimate models ˆ p ˆ q w(y(j)) = S(y(j−1)) 1 − S(y(j−1)) p ≥ 0, q ≥ 0 ,→ Fleming and Harrington (1981) ,→ include weights of the log-rank as special case ,→ q = 0, p > 0 : more weight is put on early differences ,→ p = 0, q > 0 : more weight is put on late differences Example : Comparing survival for male and female
Basic schizophrenic patients concepts
Nonparametric estimation
Hypothesis 1 testing in a nonparametric setting 0.8
Proportional hazards models 0.6
Parametric survival 0.4 Estimated survival Male models Female 0.2 0
0 500 1000 1500 2000
Time Basic concepts
Nonparametric estimation Observed number of events in female group : 93 Hypothesis testing in a Expected number of events under H0 : 62 nonparametric setting Log-rank weights : p Proportional • U/ V (U) = 4.099 hazards models • p-value (2-sided) = 0.000042 Parametric Peto and Peto weights : survival p models • U/ V (U) = 4.301 • p-value (2-sided) = 0.000017 Comparing the hazard functions of more than 2 populations
Basic Hypothesis test : concepts H0 : h1(t) = h2(t) = ... = hl (t) for all t ≤ y(r) Nonparametric estimation H1 : hi (t) 6= hj (t) for at least one pair (i, j) Hypothesis testing in a for some t ≤ y(r) nonparametric setting Notations : same as earlier but now k = 1,..., l Proportional hazards Test statistic based on the l × 2 contingency tables for models the different event times y(j) Parametric survival models Group Event No Event Total
1 d(j)1 R(j)1 − d(j)1 R(j)1 2 d(j)2 R(j)2 − d(j)2 R(j)2 ...
l d(j)l R(j)l − d(j)l R(j)l Total d(j) R(j) − d(j) R(j) t The random vector d(j) = (d(j)1,..., d(j)l ) has a Basic multivariate hypergeometric distribution concepts
Nonparametric We can define analogues of the test statistic U defined estimation previously : Hypothesis r testing in a d(j)R(j)k nonparametric X Uk = w(y(j)) d(j)k − , setting R j=1 (j) Proportional hazards which is a weighted sum of the differences between the models observed and expected number of events under H Parametric 0 survival models The components of the vector (U1,..., Ul ) are linearly Pl dependent because k=1 Uk = 0 t ⇒ define U = (U1,..., Ul−1) ⇒ derive V (U), the variance-covariance matrix of U
For large sample size and under H0 : t −1 2 U V (U) U ≈ χl−1 Example : Comparing survival for schizophrenic patients
Basic according to their marital status concepts
Nonparametric estimation 1 Single Hypothesis Married testing in a Again alone nonparametric
setting 0.8
Proportional hazards models 0.6
Parametric survival Estimated survival models 0.4 0.2 0
0 500 1000 1500 2000
Time Basic concepts
Nonparametric estimation Hypothesis Observed number of events : 55 (single), 37 (married), testing in a nonparametric 71 (alone again) setting
Proportional Expected number of events under H0 : 67, 55, 41 hazards models Test statistic : Ut V (U)−1U = 31.44 Parametric −7 2 survival p-value = 1.5 × 10 (based on a χ2) models Basic Test for trend concepts Sometimes there exists a natural ordering in the hazard Nonparametric estimation functions
Hypothesis testing in a If such an ordering exists, tests that take it into nonparametric setting consideration have more power to detect significant
Proportional effects hazards models Test for trend :
Parametric survival H0 : h1(t) = h2(t) = ... = hl (t) for all t ≤ y(r) models H1 : h1(t) ≤ h2(t) ≤ ... ≤ hl (t) for some t ≤ y(r) with at least one strict inequality
(H1 implies that S1(t) ≥ S2(t) ≥ ... ≥ Sl (t) for some t ≤ y(r) with at least one strict inequality) Test statistic for trend : l Basic X concepts U = wk Uk , Nonparametric k=1 estimation with Hypothesis th testing in a • Uk the summary statistic of the k population nonparametric th setting • wk the weight assigned to the k population, e.g.
Proportional wk = k (corresponds to a linear trend in the groups) hazards models Variance of U : l l Parametric X X survival V (U) = w w 0 Cov(U , U 0 ) models k k k k k=1 k 0=1 For large sample size and under H0 : U ≈ N(0, 1) pV (U) p If wk = k , we reject H0 for large values of U/ V (U) (one-sided test) Example : Comparing survival for schizophrenic patients according to their educational level Basic concepts 4 educational groups : none, low, medium, high Nonparametric estimation
Hypothesis testing in a 1 None nonparametric Low Medium setting High
Proportional 0.8 hazards models
Parametric 0.6 survival models Estimated survival 0.4 0.2 0
0 500 1000 1500 2000
Time Basic concepts Observed number of events : 79 (none), 43 (low), 32 Nonparametric estimation (medium), 9 (high) Hypothesis testing in a Expected number of events under H0 : 71.3, 51.6, 31.1, nonparametric setting 9.0 Proportional Consider H : h (t) ≥ ... ≥ h (t) hazards 1 1 4 models Using weights 0, 1, 2, 3 we have : Parametric p survival • U = −6.77 and V (U) = 134 so U/ V (U) = −0.58 models • One-sided p-value : P(Z < −0.58) = 0.28 p-value for ‘global test’ : p = 0.49 Stratified tests Basic concepts In some cases, subjects in a study can be grouped
Nonparametric according to particular characteristics, called strata estimation Ex : prognosis group (good, average, poor) Hypothesis testing in a It is often advisable to adjust for strata as it reduces nonparametric setting variance Proportional ⇒ Stratified test : obtain an overall assessment of the hazards models difference, by combining information over the different Parametric strata to gain power survival models Hypothesis test :
H0 : h1b(t) = h2b(t) = ... = hlb(t)
for all t ≤ y(r) and b = 1,..., m,
where hkb(·) is the hazard of group k and stratum b (k = 1,..., l; b = 1,..., m) Test statistic : Basic • Ukb = summary statistic for population k (k = 1,..., l) in concepts stratum b (b = 1,..., m) Nonparametric estimation • Stratified summary statistic for population k : Pm Hypothesis Uk. = b=1 Ukb testing in a t nonparametric • Define U. = (U1.,..., U(l−1).) setting Entries of the variance-covariance matrix V (U) of U. : Proportional hazards m models X Cov(Uk., Uk 0.) = Cov(Ukb, Uk 0b) Parametric b=1 survival models For large sample size and under H0 : t −1 2 U. V (U) U. ≈ χl−1 If only two populations : Pm U b=1 b ≈ N(0, 1) qPm b=1 V (Ub) Example : Comparing survival for schizophrenic patients
Basic according to gender stratified by marital status concepts
Nonparametric a estimation 1 Male Female
Hypothesis 0.6 testing in a 0.2 Estimated survival nonparametric 0 setting 0 500 1000 1500 2000 Time Proportional hazards b models 1 0.6 Parametric 0.2 survival Estimated survival 0 models 0 500 1000 1500 2000
Time
c 1 0.6 0.2 Estimated survival 0
0 500 1000 1500 2000
Time Basic concepts Log-rank test (weights=1) : Nonparametric estimation single married alone again
Hypothesis Ub 5.81 5.98 6.06 testing in a nonparametric V (Ub) 9.77 4.12 15.71 setting
Proportional P3 P3 hazards b=1 Ub = 17.85 and b=1 V (Ub) = 29.60 models Test statistic : Parametric survival P3 √ models b=1 Ub q = 10.76 P3 b=1 V (Ub)
p-value (2-sided) = 0.00103 Basic concepts
Nonparametric Matched pairs test estimation Particular case of the stratified test when each stratum Hypothesis testing in a consists of only 2 subjects nonparametric setting m matched pairs of censored data : (y1b, y2b, δ1b, δ2b) Proportional for b = 1,..., m, with hazards models • 1st subject of the pair receiving treatment 1 Parametric • 2nd subject of the pair receiving treatment 2 survival models Hypothesis test :
H0 : h1b(t) = h2b(t) for all t ≤ y(r) and b = 1,..., m Basic concepts
Nonparametric estimation It can be shown that under H0 and for large m :
Hypothesis testing in a U D − D nonparametric . = √ 1 2 ≈ N(0, 1), setting p V (U.) D1 + D2 Proportional hazards models where Dj = number of matched pairs in which the Parametric survival individual from sample j dies first (j = 1, 2) models ⇒ Weight function has no effect on final test statistic in this case Basic concepts
Nonparametric estimation
Hypothesis testing in a nonparametric setting
Proportional Proportional hazards models hazards models
Parametric survival models Basic concepts The semiparametric proportional hazards model Nonparametric estimation Cox, 1972
Hypothesis Stratified tests not always the optimal strategy to adjust testing in a nonparametric for covariates : setting • Can be problematic if we need to adjust for several Proportional covariates hazards models • Do not provide information on the covariate(s) on which Parametric we stratify survival models • Stratification on continuous covariates requires categorization We will work with semiparametric proportional hazards models, but there also exist parametric variations Basic concepts Simplest expression of the model Nonparametric estimation Case of two treatment groups (Treated vs. Control) :
Hypothesis testing in a hT (t) = ψhC(t), nonparametric setting with hT (t) and hC(t) the hazard function of the treated Proportional and control group hazards models Proportional hazards model : Parametric • Ratio ψ = hT (t)/hC (t) is constant over time survival models • ψ < 1 (ψ > 1): hazard of the treated group is smaller (larger) than the hazard of the control group at any time • Survival curves of the 2 treatment groups can never cross each other Basic concepts More generalizable expression of the model Nonparametric estimation Consider a treatment covariate xi (0 = control, 1 = Hypothesis treatment) and an exponential relationship between the testing in a nonparametric hazard and the covariate xi : setting ( ) = (β ) ( ), Proportional hi t exp xi h0 t hazards models with
Parametric • hi (t) : hazard function for subject i survival • h (t) : hazard function of the control group models 0 • exp(β) = ψ : hazard ratio Other functional relationships can be used between the hazard and the covariate More complex model t Basic Consider a set of covariates xi = (xi1,..., xip) for concepts subject i : Nonparametric t estimation hi (t) = h0(t) exp(β xi ), Hypothesis testing in a with nonparametric • β : the p × 1 parameter vector setting • h0(t) : the baseline hazard function (i.e. hazard for a Proportional hazards subject with xij = 0, j = 1,..., p) models
Parametric Proportional hazards (PH) assumption : ratio of the survival models hazards of two subjects with covariates xi and xj is constant over time : t hi (t) exp(β xi ) = t hj (t) exp(β xj ) Semiparametric PH model : leave the form of h0(t) completely unspecified and estimate the model in a semiparametric way Basic concepts
Nonparametric estimation Fitting the semiparametric PH model Hypothesis testing in a nonparametric Based on likelihood maximization setting As h (t) is left unspecified, we maximize a so-called Proportional 0 hazards models partial likelihood instead of the full likelihood :
Parametric • survival Derive the partial likelihood for data without ties models • Extend to data with tied observations Basic Partial likelihood for data without ties concepts Can be derived as a profile likelihood: Nonparametric estimation First β is fixed, and the likelihood is maximized as a
Hypothesis function of h0(t) only to find estimators for the baseline testing in a nonparametric hazard in terms of β setting Notations : Proportional hazards • r observed event times (r = d as no ties) models • y(1),..., y(r) ordered event times Parametric • x ,..., x corresponding covariate vectors survival (1) (r) models Likelihood : r n Y t Y t h0(j) exp x(j)β exp − H0(yi ) exp(xi β) , j=1 i=1
with h0(j) = h0(y(j)) It can be seen that the likelihood is maximized when Basic concepts H0(yi ) takes the following form : X Nonparametric H0(yi ) = h0(y(j)) estimation y(j)≤yi Hypothesis testing in a (i.e. h0(t) = 0 for t 6= y(1),..., y(r), which leads to the nonparametric setting largest contribution to the likelihood) Proportional hazards With β fixed, the likelihood can be rewritten as models L(h0(1),..., h0(r) | β) Parametric survival r r models Y Y t = h0(j) exp x(j)β j=1 j=1 r Y X t × exp − h0(j) exp xk β ,
j=1 k∈R(y(j))
where R(y(j)) is the risk set at time y(j) Maximize the likelihood with respect to h0(j) by setting Basic concepts the partial derivatives wrt h0(j) equal to 0 : Nonparametric ∂L h ,..., h | β estimation 0(1) 0(r)
Hypothesis ∂h0(1) testing in a r r nonparametric Y Y setting t = exp x(j)β exp −h0(j)bj Proportional j=1 j=1 hazards models × h0(2) ... h0(r) − h0(1)h0(2) ... h0(r)b1 = 0 Parametric survival ⇐⇒ 1 − h0(1)b1 = 0, models
P t with bj = exp x β , and in general k∈R(y(j)) k
1 1 = = h0(j) P t bj exp x β k∈R(y(j)) k Basic concepts Plug this solution into the likelihood, and ignore factors Nonparametric not containing any of the parameters : estimation t Hypothesis r exp x β Y (j) testing in a L (β)= nonparametric P exp xt β setting j=1 k∈R(y(j)) k Proportional = partial likelihood hazards models This expression is used to estimate β through Parametric survival maximization models Logarithm of the partial likelihood : r r X t X X t ` (β) = x(j)β − log exp xk β j=1 j=1 k∈R(y(j)) Basic concepts Nonparametric Maximization is often done via the Newton-Raphson estimation
Hypothesis procedure, which is based on the following iterative testing in a nonparametric procedure : setting ˆ ˆ −1 ˆ ˆ βnew = βold + I (βold )U(βold ), Proportional hazards with models ˆ • U(βold ) = vector of scores Parametric −1 ˆ survival • I (βold ) = inverse of the observed information matrix models ˆ ˆ ⇒ convergence is reached when βold and βnew are sufficiently close together Score function U(β) : ∂`(β) Basic Uh(β) = concepts ∂βh Nonparametric r r P x exp xt β estimation X X k∈R(y(j)) kh k = − x(j)h P t Hypothesis k∈R(y ) exp x β testing in a j=1 j=1 (j) k nonparametric setting Observed information matrix I(β) : 2 Proportional ∂ `(β) hazards Ihl (β) = − models ∂βh∂βl Parametric r P x x exp xt β X k∈R(y(j)) kh kl k survival = models P exp xt β j=1 k∈R(y(j)) k r "P x exp xt β# X k∈R(y(j)) kh k − P exp xt β j=1 k∈R(y(j)) k r "P x exp xt β# X k∈R(y(j)) kl k × P exp xt β j=1 k∈R(y(j)) k Basic Remarks : concepts Variance-covariance matrix of βˆ can be approximated Nonparametric estimation by the inverse of the information matrix evaluated at βˆ Hypothesis → V (βˆ ) can be approximated by [I(βˆ)]−1 testing in a h hh nonparametric ˆ setting Properties (consistency, asymptotic normality) of β are
Proportional well established (Gill, 1984) hazards models A 100(1-α)% confidence interval for βh is given by Parametric q survival ˆ ˆ βh ± zα/ V (βh) models 2 and for the hazard ratio ψh = exp(βh) : q ˆ ˆ exp βh ± zα/2 V (βh) , or alternatively via the Delta method Basic concepts Example : Active antiretroviral treatment cohort study
Nonparametric estimation CD4 cells protect the body from infections and other Hypothesis types of disease testing in a nonparametric → if count decreases beyond a certain threshold the setting patients will die Proportional hazards models As HIV infection progresses, most people experience a
Parametric gradual decrease in CD4 count survival models Highly Active AntiRetroviral Therapy (HAART) • AntiRetroviral Therapy (ART) + 3 or more drugs • Not a cure for AIDS but greatly improves the health of HIV/AIDS patients Basic concepts
Nonparametric estimation After introduction of ART, death of HIV patients Hypothesis testing in a decreased tremendously nonparametric setting → investigate now how HIV patients evolve after
Proportional HAART hazards models Data from a study conducted in Ethiopia :
Parametric • 100 individuals older than 18 years and placed under survival models HAART for the last 4 years • only use data collected for the first 2 years Basic concepts Table: Data of HAART Study
Nonparametric estimation
Hypothesis Pat Time Censo- Gen- Age Weight Func. Clin. CD4 ART testing in a ID ring der Status Status nonparametric setting 1 699 0 1 42 37 2 4 3 1 Proportional 2 455 1 2 30 50 1 3 111 1 hazards models 3 705 0 1 32 57 0 3 165 1
Parametric 4 694 0 2 50 40 1 3 95 1 survival 5 86 0 2 35 37 0 4 34 1 models ... 97 101 0 1 39 37 2 . . 1 98 709 0 2 35 66 2 3 103 1 99 464 0 1 27 37 . . . 2 100 537 1 2 30 76 1 4 1 1 How is survival influenced by gender and age ? Basic Define agecat = 1 if age < 40 years concepts = ≥ Nonparametric 2 if age 40 years estimation Define gender = 1 if male Hypothesis testing in a = 2 if female nonparametric setting Fit a semiparametric PH model including gender and Proportional agecat as covariates : hazards ˆ models • βagecat = 0.226 (HR=1.25) ˆ Parametric • βgender = 1.120 (HR=3.06) survival • models Inverse of the observed information matrix : 0.4645 0.1476 I−1(βˆ) = 0.1476 0.4638 ˆ • 95% CI for βagecat : [-1.11, 1.56] 95% CI for HR of old vs. young : [0.33, 4.77] ˆ • 95% CI for βgender : [-0.21, 2.45] 95% CI for HR of female vs. male : [0.81, 11.64] Basic concepts Partial likelihood for data with tied observations Nonparametric Events are typically observed on a discrete time scale estimation
Hypothesis ⇒ Censoring and event times can be tied testing in a nonparametric If ties between censoring time(s) and an event time setting ⇒ we assume that Proportional hazards • the censoring time(s) fall just after the event time models ⇒ they are still in the risk set of the event time Parametric survival If ties between event times of two or more subjects : models Kalbfleish and Prentice (1980) proposed an appropriate likelihood function, but • rarely used due to its complexity • different approximations have been proposed Approximation proposed by Breslow (1974) : r Q exp xt β Basic Y l:yl =y(j),δl =1 l concepts (β) = L d hP t i (j) Nonparametric j=1 exp x β k:yk ≥y k estimation (j)
Hypothesis testing in a nonparametric Approximation proposed by Efron (1977) : setting r Q t Proportional l:y =y ,δ =1 exp x β hazards Y l (j) l l models L(β) = Vj (β) j=1 Parametric survival models where
d(j) Y X t Vj (β) = exp xk β h=1 k:yk ≥y(j)
h − 1 X t − exp xl β d(j) l:yl =y(j),δl =1 Basic concepts
Nonparametric estimation
Hypothesis Approximation proposed by Cox (1972) : testing in a nonparametric r Q exp xt β setting Y l:yl =y(j),δl =1 l (β) = , L P P t Proportional q∈Q h∈q exp xhβ hazards j=1 j models
Parametric with Qj the set of all possible combinations of d(j) subjects survival models from the risk set R(y(j)) Basic concepts Example : Effect of gender on survival of schizophrenic Nonparametric estimation patients Hypothesis Fit a semiparametric PH model including gender as testing in a nonparametric covariate : setting ˆ ˆ Proportional Approx. Max(partial likel.) β s.e.(β) hazards models Breslow -776.11 0.661 0.164
Parametric Efron -775.67 0.661 0.164 survival models Cox -761.36 0.665 0.165 HR for female vs. male: 1.94 95% CI : [1.41; 2.69] Contribution to the partial likelihood at time 1096 days Basic concepts • males : 68 at risk, 2 events Nonparametric • females : 12 at risk, no event estimation • Breslow : Hypothesis testing in a exp(2 × 0) nonparametric = 0.000120 setting (68 + 12 exp β)2 Proportional hazards models • Efron : Parametric exp(2 × 0) survival = 0.000121 models (68 + 12 exp β)(67 + 12 exp β)
• Cox : exp(2 × 0) = 0.000243 h 12 12 68 68i exp(2β) 2 + exp(β) 1 1 + 2 Basic concepts
Nonparametric estimation Testing hypotheses in the framework of the semiparametric Hypothesis testing in a PH model nonparametric setting Global tests : Proportional • hypothesis tests regarding the whole vector β hazards models More specific tests : Parametric survival • hypothesis tests regarding a subvector of β models • hypothesis tests for contrasts and sets of contrasts Basic concepts Global hypothesis tests Nonparametric estimation Hypotheses regarding the p-dimensional vector β : Hypothesis H : β = β testing in a 0 0 nonparametric setting H1 : β 6= β0 Proportional Wald test statistic: hazards models 2 ˆ t ˆ ˆ UW = β − β0 I β β − β0 Parametric survival with models • βˆ = maximum likelihood estimator • I βˆ = observed information matrix 2 2 ⇒ Under H0, and for large sample size : UW ≈ χp Basic concepts Likelihood ratio test statistic : Nonparametric U2 = 2 ` βˆ − ` β estimation LR 0
Hypothesis with testing in a nonparametric • ` βˆ = log likelihood evaluated at βˆ setting • ` β0 = log likelihood evaluated at β0 Proportional 2 2 hazards ⇒ Under H0, and for large sample size : U ≈ χ models LR p
Parametric Score test statistic : survival 2 t −1 models USC = U β0 I β0 U β0 with • U β0 = score vector evaluated at β0 2 2 ⇒ Under H0, and for large sample size : USC ≈ χp Basic concepts Example : Effect of age and marital status on survival of
Nonparametric schizophrenic patients estimation Model the survival as a function of age and marital Hypothesis testing in a status : nonparametric setting βage Proportional H0 : β = βmarried = 0 hazards models βalone again Parametric survival (βsingle = 0 to avoid overparametrization) models 2 2 −7 UW = 31.6; p-value : P(χ3 > 31.6) = 6 × 10 2 ULR = 30.6 2 USC = 33.5 Local hypothesis tests β = (βt , βt )t β Basic Let 1 2 , where 2 contains the ‘nuisance’ concepts parameters Nonparametric estimation Hypotheses regarding the q-dimensional vector β1 : Hypothesis : β = β testing in a H0 1 10 nonparametric setting H1 : β1 6= β10 Proportional Partition the information matrix as hazards models " # I11 I12 Parametric I = survival I21 I22 models with I11 = matrix of partial derivatives of order 2 with respect to the components of β1 " # I11 I12 ⇒ I−1 = I21 I22 Note that the complete information matrix is required to 11 ˆ ˆ obtain I , except when β1 is independent of β2 Basic concepts Define Nonparametric estimation ˆ β1 = maximum likelihood estimator Hypothesis testing in a of β1 nonparametric setting ˆ β2(β10) = maximum likelihood estimator Proportional hazards of β2 with β1 put equal to β10 models ˆ Parametric U1 β10, β2(β10) = score subvector evaluated survival ˆ models at β10 and β2(β10) 11 ˆ 11 I β10, β2(β10) = matrix I for β1 evaluated ˆ at β10 and β2(β10) Basic concepts
Nonparametric estimation Wald test : 2 ˆ t 11 ˆ −1 ˆ 2 Hypothesis UW = β1 − β10 I (β) β1 − β10 ≈ χq testing in a nonparametric Likelihood ratio test : setting 2 ˆ ˆ 2 Proportional ULR = 2 `(β) − ` β10, β2(β10) ≈ χq hazards models Score test : Parametric t 2 ˆ 11 ˆ survival U = U1 β10, β2(β10) I β10, β2(β10) models SC ˆ 2 ×U1 β10, β2(β10) ≈ χq Basic concepts Testing more specific hypotheses Nonparametric estimation Consider a p × 1 vector of coefficients c Hypothesis Hypothesis test : testing in a nonparametric t setting H0 : c β = 0
Proportional Wald test statistic : hazards t −1 models 2 t ˆ t −1 ˆ t ˆ UW = c β c I (β)c c β Parametric survival Under H0 and for large sample size : models 2 2 UW ≈ χ1 Likelihood ratio test and score test can be obtained in a similar way If different linear combinations of the parameters are of Basic concepts interest, define Nonparametric ct estimation 1 C = . Hypothesis . testing in a t nonparametric cq setting with q ≤ p and assume that the matrix C has full rank Proportional hazards models Hypothesis test :
Parametric H0 : Cβ = 0 survival models Wald test statistic : 2 ˆt −1 ˆ t −1 ˆ UW = Cβ CI (β)C Cβ 2 2 Under H0 and for large sample size : UW ≈ χq Likelihood ratio test and score test can be obtained in a similar way Basic concepts Example : Effect of age and marital status on survival of Nonparametric schizophrenic patients estimation
Hypothesis H0 : βmarried = 0 testing in a nonparametric → ct = (0, 1, 0) setting 2 Proportional → Wald test statistic : 1.18; p-value: P(χ1 > 1.18) = 0.179 hazards models H0 : βmarried = βalone again = 0 Parametric survival 0 1 0 models → C = 0 0 1 2 2 2 → Test statistics : UW = 31.6; ULR = 30.6; USC = 33.5 2 −7 → p-value (Wald) : P(χ2 > 31.6) = 1 × 10 Basic concepts
Nonparametric estimation Building multivariable semiparametric models Hypothesis testing in a nonparametric including a continuous covariate setting including a categorical covariate Proportional hazards including different types of covariates models
Parametric interactions between covariates survival models time-varying covariates Basic Including a continuous covariate in the semiparametric PH concepts model Nonparametric estimation For a single continuous covariate xi : Hypothesis testing in a h (t) = h (t) exp(βx ) nonparametric i 0 i setting where Proportional • hazards h0(t) = baseline hazard (refers to a subject with xi = 0) models hazard of a subject i with covar. x • exp(β) = i Parametric hazard of a subject j with covar. xj = xi − 1 survival models and is independent of the covariate xi and of t • exp(rβ) = hazard ratio of two subjects with a difference of r covariate units ⇒ βˆ = increase in log-hazard corresponding to a one unit increase of the continuous covariate Basic Example : Impact of age on survival of schizophrenic concepts patients Nonparametric estimation Introduce age as a continuous covariate in the Hypothesis semiparametric PH model : testing in a nonparametric setting hi (t) = h0(t) exp(βageagei )
Proportional βage = 0.00119 (s.e. = 0.00952). hazards models hazard for a subject of age i (in years) HR = = 1.001 Parametric hazard for a subject of age i − 1 survival models 95% CI : [0.983, 1.020] Other quantities can be calculated, e.g. hazard for a subject of age 40 hazard for a subject of age 30 = exp(10 × 0.00119) = 1.012 Including a categorical covariate in the semiparametric PH Basic concepts model
Nonparametric For a single categorical covariate xi with l levels : estimation t Hypothesis hi (t) = h0(t) exp(β xi ), testing in a nonparametric where setting • β = (β1, . . . , βl ) Proportional hazards • xi is the covariate for subject i models This model is overparametrized ⇒ restrictions : Parametric survival • Set β1 = 0 so that h0(t) corresponds to the hazard of a models subject with the first level of the covariate
• exp(βj ) = HR of a subject at level j relative to a subject at level 1 0 • exp(βj − βj0 ) = HR between level j and j ˆ ˆ ˆ ˆ ˆ ˆ (note that V (βj − βj0 ) = V (βj ) + V (βj0 ) − 2Cov(βj , βj0 )) • Other choices of restrictions are possible Example : Impact of marital status on survival of Basic concepts schizophrenic patients Nonparametric Introduce marital status as a categorical covariate in estimation
Hypothesis the semiparametric PH model testing in a nonparametric hi (t) = h0(t) exp(βmarriedxi2 + βalone againxi3), setting where Proportional hazards • xi2 = 1 if patient is married, 0 otherwise models • xi3 = 1 if patient is alone again, 0 otherwise Parametric survival Married vs single : models ˆ • βmarried = −0.206 (s.e. = 0.214) • HR = 0.814 (95%CI :[0.534, 1.240]), p = 0.34 Alone again vs single : ˆ • βalone again = 0.794 (s.e. = 0.185) • HR = 2.213 (95%CI :[1.540, 3.180]), p = 1.7 × 10−5 Basic concepts
Nonparametric estimation Married vs alone again : Hypothesis ˆ ˆ • exp(βmarried − βalone again) = 0.368 testing in a nonparametric • Variance-covariance matrix : setting
Proportional ! βˆ 0.0460 0.0183 hazards married = models V ˆ βalone again 0.0183 0.0342 Parametric survival • ˆ ˆ models V (βmarried − βalone again) = 0.0436 • 95% CI : [0.244, 0.553] Including different covariates in the semiparametric PH model Basic concepts • Estimates for a particular parameter will then be
Nonparametric adjusted for the other parameters in the model estimation • Estimates for this particular parameter will be different Hypothesis testing in a from the estimate obtained in a univariate model nonparametric setting (except when the covariates are orthogonal) Proportional hazards Example : Impact of marital status and age on survival of models schizophrenic patients Parametric survival models hi (t) = h0(t) exp(βageagei + βmarriedxi2 + βalone againxi3)
Covariate βˆ s.e.(βˆ) HR 95% CI age -0.0154 0.0104 0.99 [0.97,1.01] married -0.3009 0.2238 0.74 [0.48,1.15] alone again 0.8195 0.1857 2.269 [1.58,3.27] Basic concepts Interaction between covariates Nonparametric Interaction : the effect of one covariate depends on the estimation level of another covariate Hypothesis testing in a nonparametric Continuous / categorical (j levels) : different hazard setting ratios are required for the continuous covariate at each Proportional hazards level of the categorical covariate models ⇒ add j − 1 parameters Parametric survival models Categorical (j levels) / categorical (k levels) : for each level of one covariate, different HR between the levels of the other covariate with the reference are required ⇒ add (j − 1) × (k − 1) parameters Example : Impact of marital status and age on survival of Basic concepts schizophrenic patients
Nonparametric estimation hi (t) = h0(t) exp( βmarried × xi2 + βalone again × xi3 Hypothesis testing in a +βage × agei + βage | married × xi2 × agei nonparametric setting +βage | alone again × xi3 × agei ) Proportional hazards models Covariate βˆ s.e.(βˆ) HR 95% CI Parametric survival age -0.0238 0.0172 0.977 [0.94,1.01] models married -0.6811 0.8579 0.506 [0.09,2.72] alone again 0.3979 0.7475 1.489 [0.34,6.44] age|married 0.0129 0.0299 1.013 [0.96,1.07] age|alone again 0.0133 0.0228 1.013 [0.97,1.06] Basic Effect of age in the reference group (single) : concepts ˆ Nonparametric exp(βage) = exp(−0.0238) = 0.977 estimation Effect of age in the married group : Hypothesis testing in a ˆ ˆ nonparametric exp(βage + βage|married) = exp(−0.0238 + 0.0129) setting = 0.989 Proportional hazards models Effect of age in the alone again group : ˆ ˆ Parametric exp(βage + βage|alone again) = exp(−0.0238 + 0.0133) survival models = 0.990 Likelihood ratio test for the interaction : 2 ULR = 0.76 2 P(χ2 > 0.76) = 0.684 ˆ HRmarried = exp(βmarried) = 0.506
Basic = HR of a married subject relative to a concepts single subject at the age of 0 year Nonparametric estimation ⇒ more relevant to express the age as the difference Hypothesis between a particular age of interest (e.g. 30 years) testing in a nonparametric setting ⇒ has impact on parameter estimates of differences
Proportional between groups, but not on parameter estimates hazards models related to age
Parametric survival models Covariate βˆ s.e.(βˆ) HR 95% CI age -0.0238 0.0172 0.977 [0.94,1.01] married -0.2928 0.2378 0.746 [0.47,1.19] alone again 0.7971 0.1911 2.219 [1.53,3.23] age|married 0.0129 0.0299 1.013 [0.96,1.07] age|alone again 0.0133 0.0228 1.013 [0.97,1.06] Example : Impact of marital status and gender on survival of schizophrenic patients Basic concepts hi (t) = h0(t) exp(βmarried × xi2 + βalone again × xi3 Nonparametric +β × estimation female genderi
Hypothesis +βfemale|married × xi2 × genderi testing in a nonparametric +β × x × gender ) setting female|alone again i3 i
Proportional hazards models Covariate βˆ s.e.(βˆ) HR 95% CI Parametric survival female 0.520 0.286 1.681 [0.96, 2.95] models married -0.253 0.26 0.776 [0.47, 1.29] alone again 0.807 0.236 2.242 [1.41, 3.56] female|married 0.389 0.46 1.476 [0.60, 3.64] female|alone again -0.146 0.372 0.865 [0.42, 1.79]
,→ Likelihood ratio test for the interaction : 2 2 ULR = 1.94; P(χ2 > 1.94) = 0.23 Time varying covariates Basic concepts In some applications, covariates of interest change with Nonparametric estimation time
Hypothesis Extension of the Cox model : testing in a nonparametric h (t) = h (t) exp(βt x (t)) setting i 0 i
Proportional ⇒ Hazards are no longer proportional hazards models Estimation of β :
Parametric • Let xk (y) be the covariate vector for subject k at time y survival models • Define the partial likelihood : δ n " t # i Y exp xi (yi ) β (β) = L P t exp (xk (yi ) β) i=1 k∈R(yi ) • Let ˆ β = argmaxβL(β) Basic concepts Example : Time varying covariates in data on the first time Nonparametric estimation to insemination for cows Hypothesis Aim : find constituent in milk that is predictive for the testing in a nonparametric hazard of first insemination setting • one possible predictor is the ureum concentration Proportional hazards • milk ureum concentration changes over time models
Parametric Information for an individual cow i (i = 1,..., n) : survival models yi , δi , xi (ti1) ,..., xi tiki Covariate is determined only once a month ⇒ Value at time t is determined by linear interpolation Basic concepts Ureum concentration is introduced as a time-varying Nonparametric covariate in the semiparametric PH model : estimation
Hypothesis hi (t) = h0(t) exp(βxi (t)), testing in a nonparametric where setting • hi (t) = hazard of first insemination at time t for cow i Proportional hazards having at time t ureum concentration equal to xi (t) models • β = linear effect of the ureum concentration on the Parametric log-hazard of first insemination survival models βˆ = −0.0273 (s.e. = 0.0162) HR = exp(−0.0273) = 0.973 95% CI = [0.943, 1.005] p-value = 0.094 Basic concepts Model building strategies for the semiparametric PH model Nonparametric estimation Often not clear what criteria should be used to decide Hypothesis which covariates should be included testing in a nonparametric setting Should be based first on meaningful interpretation and
Proportional biological knowledge hazards models Different strategies exist : Parametric • Forward selection survival models • Backward selection • Forward stepwise selection • Backward stepwise selection • AIC selection Basic Forward procedure : concepts • First, include the covariate with the smallest p-value Nonparametric • Next, consider all possible models containing the estimation selected covariate and one additional covariate, and Hypothesis testing in a include the covariate with the smallest p-value nonparametric setting • Continue doing this until all remaining non-selected
Proportional covariates are non-significant hazards models Backward procedure :
Parametric • First, start from the full model that includes all survival models covariates • Next, consider all possible models containing all covariates except one, and remove the covariate with the largest p-value • Continue doing this until all remaining covariates in the model are significant Forward / backward stepwise procedure : Basic concepts Start as in the forward / backward procedure, but an included / removed covariate can be excluded / Nonparametric estimation included at a later stage, if it is no longer significant / Hypothesis non-significant with other covariates in the model testing in a nonparametric Note that the above p-values can be based on either setting
Proportional the Wald, likelihood ratio or score test hazards models Akaike’s information criterion (AIC) : instead of
Parametric including / removing covariates based on their p-value, survival models we look at the AIC : AIC = −2 log(L) + kp where • p = number of parameters in the model • L = likelihood • k = constant (often 2) Example : Model building in the schizophrenic patients dataset Basic concepts Univariate models : Nonparametric −7 estimation Marital status p = 6.7 × 10 −5 Hypothesis Gender p = 9.7 × 10 testing in a nonparametric Educational status p = 0.663 setting Age p = 0.9 Proportional hazards models Forward procedure :
Parametric • Start with a model containing marital status survival models • Fit model containing marital status and one of the three remaining covariates ⇒ Gender has smallest p-value • Fit model containing marital status, gender and one of the two remaining covariates ⇒ None of the remaining covariates (educational status and age) is significant ⇒ Final model contains marital status and gender Survival function estimation in the semiparametric model
Basic Survival function for subject with covariate xi : concepts
Nonparametric Si (t) = exp(−Hi (t)) estimation = exp(−H (t) exp(βt x )) Hypothesis 0 i t testing in a exp(β xi ) nonparametric = (S0(t)) setting R t Proportional with S0(t) = exp(−H0(t)) and H0(t) = 0 h0(s)ds hazards models Estimate the baseline cumulative hazard H0(t) by Parametric ˆ X ˆ survival H0(t) = h0(j), models j:y(j)≤t where ˆ d(j) h0(j) = P exp xt βˆ k∈R(y(j)) k extends the Breslow estimator to the case of tied observations Basic concepts Nonparametric Define estimation
Hypothesis ˆt exp(β xi ) testing in a ˆ ˆ nonparametric Si (t) = S0(t) , setting Proportional ˆ ˆ hazards with S0(t) = exp(−H0(t)) models
Parametric It can be shown that survival models Sˆ (t) − S (t) i i →d N(0, 1) 1/2 ˆ V (Si (t)) Example : Survival function estimates for marital status groups in the schizophrenic patients data Basic concepts
Nonparametric estimation 1 Single Hypothesis Married Alone again testing in a nonparametric setting 0.8
Proportional hazards models 0.6
Parametric survival Estimated survival models 0.4 0.2 0
0 500 1000 1500 2000
Time Basic concepts
Nonparametric estimation Hypothesis Consider e.g. survival at 505 days : testing in a nonparametric setting
Proportional Single group : 0.755 95% CI : [0.690, 0.827] hazards models Married group : 0.796 95% CI : [0.730, 0.867]
Parametric Alone again group : 0.537 95% CI : [0.453, 0.636] survival models Stratified semiparametric PH model
Basic concepts The assumption that h0(t) is the same for all subjects
Nonparametric might be too strong in practice estimation ⇒ Possible solution : consider groups (strata) of Hypothesis testing in a subjects with the same baseline hazard nonparametric setting Stratified PH model : the hazard of subject j Proportional (j = 1,..., n ) in stratum i (i = 1,..., s) is given by hazards i models t hij (t) = hi0(t) exp xij β) Parametric survival Extension of the partial likelihood : models δij s ni t Y Y exp(xij β) L(β) = P exp(xt β) i=1 j=1 il l∈Ri (yij ) ⇒ Risk set for a subject contains only the subjects still at risk within the same stratum Example : Stratified PH model for the time to first Basic concepts insemination dataset Nonparametric estimation Cows are coming from different farms
Hypothesis ⇒ baseline hazard might differ considerably between testing in a nonparametric farms (even if the effect of the ureum concentration is setting similar) Proportional hazards Consider the effect of the ureum concentration in milk models on the time to first insemination, stratifying on the Parametric survival farms : models βˆ = −0.0588 (s.e. = 0.0198) HR = 0.943 95% CI = [0.907, 0.980] ⇒ By stratifying on the farms, ureum concentration becomes significant Basic concepts Checking the proportional hazards assumption Nonparametric estimation PH assumption : HR between two subjects with
Hypothesis different covariates is constant over time testing in a nonparametric Formal tests and diagnostic plots have been developed setting
Proportional to check this assumption hazards models Formal test :
Parametric • Add βl xi × t to the PH model : survival models hi (t) = h0(t) exp(βxi + βl xi × t)
• If βl 6= 0, the PH assumption does not hold • Instead of adding βl xi × t, one can also add βl xi × g(t) for some function g Basic concepts Diagnostic plots :
Nonparametric • Consider for simplicity the case of a covariate with r estimation levels Hypothesis • Estimate the cumulative hazard function for each level testing in a nonparametric of the covariate by means of the Nelson-Aalen estimator setting ˆ ˆ ˆ ⇒ H1(t), H2(t),..., Hr (t) should be constant multiples Proportional hazards of each other : models
Parametric survival Plot PH assumption holds if models ˆ ˆ log(H1(t)), ..., log(Hr (t)) vs t parallel curves ˆ ˆ log(Hj (t)) − log(H1(t)) vs t constant lines ˆ ˆ Hj (t) vs H1(t) straight lines through origin Example : PH assumption for the gender effect in the
Basic schizophrenic patients dataset concepts 1 Nonparametric 3.0
Male 0 estimation 2.5 Female −1 Hypothesis 2.0 −2 testing in a 1.5 −3 nonparametric 1.0 Cumulative hazard log(Cumulative hazard) setting −4 0.5 Male Female −5 Proportional 0.0 hazards 0 500 1000 1500 0 500 1000 1500 models Time Time
Parametric survival 1.0
models 1.5 0.5 1.0 0.0 0.5 Cumulative hazard Female log(ratio cumulative hazards) −0.5 0.0
0 500 1000 1500 0.0 0.2 0.4 0.6 0.8 1.0 1.2
Time Cumulative hazard Male Basic concepts
Nonparametric estimation
Hypothesis testing in a nonparametric setting
Proportional Parametric survival models hazards models
Parametric survival models Basic concepts Some common parametric distributions Nonparametric estimation Exponential distribution : Hypothesis testing in a Characterized by one parameter λ > 0 : nonparametric setting S0(t) = exp(−λt) Proportional hazards f0(t) = λ exp(−λt) models
Parametric h0(t) = λ survival models → leads to a constant hazard function
Empirical check : plot of the log of the survival estimate versus time Basic Hazard and survival function for the exponential distribution concepts
Nonparametric estimation
Hypothesis testing in a 0.4 1.0 nonparametric setting
Proportional 0.8 Lambda=0.14 0.3 hazards Lambda=0.14 models 0.6
Parametric 0.2 Hazard survival Survival models 0.4 0.1 0.2 0.0 0.0
0 2 4 6 8 10 0 2 4 6 8 10
Time Time Basic Weibull distribution : concepts
Nonparametric Characterized by a scale parameter λ > 0 and a shape estimation parameter ρ > 0 : Hypothesis ρ testing in a S0(t) = exp(−λt ) nonparametric ρ−1 ρ setting f0(t) = ρλt exp(−λt ) Proportional ρ−1 hazards h0(t) = ρλt models
Parametric → hazard decreases if ρ < 1 survival models → hazard increases if ρ > 1 → hazard is constant if ρ = 1 (exponential case)
Empirical check : plot log cumulative hazard versus log time Basic Hazard and survival function for the Weibull distribution concepts
Nonparametric estimation Hazard and survival functions for Weibull distribution Hypothesis testing in a 0.4 1.0 nonparametric setting Lambda=0.31, Rho=0.5 Lambda=0.31, Rho=0.5 Lambda=0.06, Rho=1.5 Lambda=0.06, Rho=1.5 0.8
Proportional 0.3 hazards models 0.6
Parametric 0.2 Hazard Survival survival 0.4 models 0.1 0.2 0.0 0.0
0 2 4 6 8 10 0 2 4 6 8 10
Time Time Log-logistic distribution : Basic concepts A random variable T has a log-logistic distribution if
Nonparametric logT has a logistic distribution estimation
Hypothesis Characterized by two parameters λ and κ > 0 : testing in a nonparametric 1 setting S (t) = 0 1 + (tλ)κ Proportional hazards κtκ−1λκ models ( ) = f0 t 2 Parametric [1 + (tλ)κ] survival models κtκ−1λκ h (t) = 0 1 + (tλ)κ
The median event time is only a function of the parameter λ : M(T ) = exp(1/λ) Basic Hazard and survival function for the log-logistic distribution concepts
Nonparametric estimation
Hypothesis testing in a 0.4 1.0 nonparametric setting Lambda=0.2, Kappa=1.5 Lambda=0.2, Kappa=1.5 Lambda=0.2, Kappa=0.5 Lambda=0.2, Kappa=0.5 Proportional 0.8 0.3 hazards models 0.6
Parametric 0.2 Hazard survival Survival models 0.4 0.1 0.2 0.0 0.0
0 2 4 6 8 10 0 2 4 6 8 10
Time Time Log-normal distribution : Basic concepts Resembles the log-logistic distribution but is
Nonparametric mathematically less tractable estimation
Hypothesis A random variable T has a log-normal distribution if testing in a nonparametric logT has a normal distribution setting
Proportional Characterized by two parameters µ and γ > 0 : hazards models log(t) − µ S0(t) = 1 − FN √ Parametric γ survival models 1 1 f (t) = √ exp − (log(t) − µ)2 0 t 2πγ 2γ
The median event time is only a function of the parameter µ : M(T ) = exp(µ) Basic Hazard and survival function for the log-normal distribution concepts
Nonparametric estimation
Hypothesis testing in a 0.4 1.0 nonparametric setting Mu=1.609, Gamma=0.5 Mu=1.609, Gamma=0.5 Mu=1.609, Gamma=1.5 Mu=1.609, Gamma=1.5 Proportional 0.8 0.3 hazards models 0.6
Parametric 0.2 Hazard survival Survival models 0.4 0.1 0.2 0.0 0.0
0 2 4 6 8 10 0 2 4 6 8 10
Time Time Basic concepts Parametric survival models
Nonparametric estimation The parametric models considered here have two
Hypothesis representations : testing in a nonparametric Accelerated failure time model (AFT) : setting t Proportional Si (t) = S0(exp(θ xi )t), hazards models where t Parametric • θ = (θ1, . . . , θp) = vector of regression coefficients survival t models • exp(θ xi ) = acceleration factor • S0 belongs to a parametric family of distributions Hence, t t hi (t) = exp θ xi h0 exp(θ xi )t and Basic t concepts Mi = exp(−θ xi )M0 Nonparametric where M = median of S , since estimation i i 1 t Hypothesis S0(M0) = = Si (Mi ) = S0 exp(θ xi )Mi testing in a 2 nonparametric setting Proportional Ex : For one binary variable (say treatment (T) and hazards models control (C)), we have MT = exp(−θ)MC : Parametric survival models Control Treated Survival function 0 0.25 0.5 0.75 1
M C M T 0.0 0.5 1.0 1.5 2.0 Time Basic concepts
Nonparametric Linear model : estimation t log ti = µ + γ xi + σwi , Hypothesis testing in a where nonparametric setting • µ = intercept t Proportional • γ = (γ1, . . . , γp) = vector of regression coefficients hazards models • σ = scale parameter
Parametric • W has known distribution survival models These two models are equivalent, if we choose
• S0 = survival function of exp(µ + σW ) • θ = −γ Basic concepts
Nonparametric estimation Indeed,
Hypothesis Si (t) = P(ti > t) testing in a nonparametric = P(log t > log t) setting i t Proportional = P(µ + σwi > log t − γ xi ) hazards t models = S0 exp(log t − γ xi ) Parametric t survival = S0 t exp(θ xi ) models
⇒ The two models are equivalent Basic Weibull distribution concepts Consider the accelerated failure time model Nonparametric estimation t Hypothesis Si (t) = S0 exp(θ xi )t , testing in a nonparametric setting α where S0(t) = exp(−λt ) is Weibull Proportional hazards t α models ⇒ Si (t) = exp − λ exp(β xi )t ) with β = αθ Parametric α−1 t t α survival ⇒ fi (t) = λαt exp(β xi ) exp − λ exp(β xi )t ) models α−1 t t ⇒ hi (t)= αλt exp(β xi )= h0(t) exp(β xi ),
α−1 with h0(t) = αλt the hazard of a Weibull ⇒ We also have a Cox PH model Basic concepts The above model is also equivalent to the following
Nonparametric linear model : estimation log t = µ + γt x + σw , Hypothesis i i i testing in a nonparametric where W has a standard extreme value distribution, i.e. setting w SW (w) = exp(−e ). Indeed, Proportional hazards P(W > w) = P exp(µ + σW ) > exp(µ + σw) models Parametric = S0 exp(µ + σw) survival models = exp − λ exp(αµ + ασw) Since W has a known distribution, it follows that λ exp(αµ) = 1 and ασ = 1, and hence P(W > w)= exp(−ew ) Basic concepts It follows that Nonparametric Weibull accelerated failure time model estimation
Hypothesis = Cox PH model with Weibull baseline hazard testing in a nonparametric = Linear model with standard extreme value error setting distribution Proportional hazards models and
Parametric • θ = −γ = β/α survival • α = 1/σ models • λ = exp(−µ/σ) Note that the Weibull distribution is the only continuous distribution that can be written as an AFT model and as a PH model Basic Log-logistic distribution concepts Consider the accelerated failure time model Nonparametric estimation S (t) = S exp(θt x )t, Hypothesis i 0 i testing in a nonparametric α setting where S0(t) = 1/[1 + λt ] is log-logistic Proportional 1 hazards ⇒ ( ) = β = αθ models Si t t α with 1 + λ exp(β xi )t Parametric survival Si (t) 1 models ⇒ = t α 1 − Si (t) λ exp(β xi )t
t S0(t) = exp(−β xi ) 1 − S0(t) ⇒ We also have a so-called proportional odds model Basic The above model is also equivalent to the following concepts linear model : Nonparametric estimation t log ti = µ + γ xi + σwi , Hypothesis testing in a where W has a standard logistic distribution, i.e. nonparametric setting SW (w) = 1/[1 + exp(w)]. Indeed, Proportional P(W > w) = P exp(µ + σW ) > exp(µ + σw) hazards models = S0 exp(µ + σw) Parametric survival models = 1/ 1 + λ exp(αµ + ασw)] Since W has a known distribution, it follows that λ exp(αµ) = 1 and ασ = 1, and hence 1 P(W > w)= 1 + exp(w) Basic It follows that concepts Log-logistic accelerated failure time model Nonparametric estimation = Proportional odds model with log-logistic baseline Hypothesis testing in a survival nonparametric setting = Linear model with standard logistic error Proportional hazards distribution models and Parametric survival • θ = −γ = β/α models • α = 1/σ • λ = exp(−µ/σ) Note that the log-logistic distribution is the only continuous distribution that can be written as an AFT model and as a proportional odds model Other distributions
Basic concepts Log-normal :
Nonparametric Log-normal accelerated failure time model estimation = Linear model with standard normal error Hypothesis testing in a distribution nonparametric setting Generalized gamma : Proportional hazards ti follows a generalized gamma distribution if models t log ti = µ + γ xi + σwi , Parametric survival models where wi has the following density : 2 |θ| θ−2 exp(θw)1/θ exp − θ−2 exp(θw) f (w) = W Γ(1/θ2) If θ = 1 ⇒ Weibull model If θ = 1 and σ = 1 ⇒ exponential model If θ → 0 ⇒ log-normal model Estimation Basic It suffices to estimate the model parameters in one of concepts
Nonparametric the equivalent model representations. Consider e.g. the estimation linear model : Hypothesis t testing in a log ti = µ + γ xi + σwi nonparametric setting The likelihood function for right censored data equals Proportional n hazards Y δi 1−δi models L(µ, γ, σ)= fi (yi ) Si (yi )
Parametric i=1 survival n t δ models Y h 1 log yi − µ − γ xi i i = fW σyi σ i=1 t h log y − µ − γ x i1−δi × S i i W σ Since W has a known distribution, this likelihood can be maximized w.r.t. its parameters µ, γ, σ Basic concepts
Nonparametric estimation Let Hypothesis testing in a nonparametric (ˆµ, γ,ˆ σˆ) = argmaxµ,γ,σL(µ, γ, σ) setting Proportional It can be shown that hazards models • (ˆµ, γ,ˆ σˆ) is asymptotically unbiased and normal Parametric • survival The estimators of the accelerated failure time model (or models any other equivalent model) and their asymptotic distribution can be obtained from the Delta-method Basic concepts Model selection Nonparametric To select the best parametric model, we present two estimation
Hypothesis methods testing in a nonparametric Selection of nested models : setting Consider the generalized gamma model as the ‘full’ Proportional hazards model, and test whether models • θ = 1 ⇒ Weibull model Parametric • θ = 1 and σ = 1 ⇒ exponential model survival models • θ = 0 ⇒ log-normal model The test can be done using the Wald, likelihood ratio or score test statistic derived from the likelihood for censored data Basic concepts
Nonparametric estimation AIC selection : Hypothesis AIC = −2 log L + 2(p + 1 + k), testing in a nonparametric where setting • + = (µ, γ) Proportional p 1 dimension of hazards • k = 0 for the exponential model models • k = 1 for the Weibull, log-logistic, log-normal model Parametric survival • k = 2 for the generalized gamma model models and minimize the AIC among all candidate parametric models Basic concepts
Nonparametric estimation
Hypothesis testing in a nonparametric setting The End Proportional hazards models
Parametric survival models