<<

Survival Analysis

APTS 2016/17

Ingrid Van Keilegom ORSTAT KU Leuven

Glasgow, August 21-25, 2017 Basic concepts

Nonparametric estimation

Hypothesis testing in a nonparametric setting

Proportional Basic concepts models

Parametric survival models What is ‘’ ? Basic concepts  Survival analysis (or duration analysis) is an area of Nonparametric estimation that models and studies the time until an Hypothesis event of interest takes place. testing in a nonparametric setting  In practice, for some subjects the event of interest Proportional cannot be observed for various reasons, e.g. hazards models • the event is not yet observed at the end of the study Parametric • another event takes place before the event of interest survival models • ...

 In survival analysis the aim is  to model ‘time-to-event ’ in an appropriate way  to do correct inference taking these special features of the data into account. Examples Basic concepts  Medicine : Nonparametric • time to for patients having a certain disease estimation • time to getting cured from a certain disease Hypothesis testing in a • time to relapse of a certain disease nonparametric setting  Agriculture : Proportional hazards • time until a farm experiences its first case of a certain models disease Parametric survival models  (‘duration analysis’) : • time to find a new job after a period of unemployment • time until re-arrest after release from prison

 Engineering (‘reliability analysis’) : • time to the failure of a machine Common functions in survival analysis Basic concepts  Let T be a non-negative continuous , Nonparametric representing the time until the event of interest. estimation

Hypothesis  Denote testing in a nonparametric F(t) = P(T ≤ t) distribution function setting f (t) density function Proportional hazards models  For survival data, we consider rather Parametric survival S(t) models H(t) cumulative function h(t) hazard function mrl(t) residual life function

 Knowing one of these functions suffices to determine the other functions. Survival function : Basic concepts S(t) = P(T > t) = 1 − F(t) Nonparametric estimation

Hypothesis  Probability that a randomly selected individual will testing in a nonparametric survive beyond time t setting  Decreasing function, taking values in [0, 1] Proportional hazards models  Equals 1 at t = 0 and 0 at t = ∞

Parametric survival models Cumulative hazard function :

H(t) = − log S(t)

 Increasing function, taking values in [0, +∞]  S(t) = exp(−H(t)) Hazard function (or hazard rate) : Basic P(t ≤ T < t + ∆t | T ≥ t) concepts h(t)= lim Nonparametric ∆t→0 ∆t estimation 1 P(t ≤ T < t + ∆t) Hypothesis = lim testing in a P(T ≥ t) ∆t→0 ∆t nonparametric f (t) −d d setting = = log S(t) = H(t) Proportional S(t) dt dt hazards models

Parametric  h(t) measures the instantaneous risk of dying right survival models after time t given the individual is alive at time t  Positive function (not necessarily increasing or decreasing)  The hazard function h(t) can have many different shapes and is therefore a useful tool to summarize survival data Hazard functions of different shapes

Basic

concepts 10

Nonparametric estimation Exponential Weibull, rho=0.5 Hypothesis 8 Weibull, rho=1.5 Bathtub testing in a nonparametric setting 6 Proportional hazards models Hazard

Parametric 4 survival models 2 0

0 5 10 15 20

Time Basic concepts Mean residual life function : Nonparametric  The mrl function measures the expected remaining estimation

Hypothesis lifetime for an individual of age t. As a function of t, we testing in a have nonparametric R ∞ setting S(s)ds mrl(t) = t Proportional ( ) hazards S t models  This result is obtained from ∞ Parametric R (s − t)f (s)ds survival mrl(t) = E(T − t | T > t)= t models S(t)  Mean life time : Z ∞ Z ∞ E(T ) = mrl(0) = sf (s)ds = S(s)ds 0 0 Incomplete data Basic concepts  : Nonparametric • For certain individuals under study, the time to the event estimation of interest is only known to be within a certain interval Hypothesis testing in a • Ex : In a , some patients have not yet died at nonparametric setting the time of the analysis of the data

Proportional ⇒ Only a lower bound of the true survival time is known hazards models (right censoring)

Parametric survival  Truncation : models • Part of the relevant subjects will not be present at all in the data • Ex : In a mortality study based on HIV/AIDS death records, only subjects who died of HIV/AIDS and recorded as such are included (right truncation) Censoring and truncation do not only take place in Basic concepts ‘time-to-event’ data. Nonparametric estimation Examples Hypothesis testing in a  Insurance : Car accidents involving costs below a nonparametric setting certain threshold are often not declared to the

Proportional insurance company hazards models ⇒ Left truncation Parametric  Ecology : Chemicals in river water cannot be detected survival models below the detection limit of the laboratory instrument ⇒ Left censoring  Astronomy : A star is only observable with a telescope if it is bright enough to be seen by the telescope ⇒ Left truncation Basic concepts

Nonparametric estimation Right censoring Hypothesis Only a lower bound for the time of interest is known testing in a nonparametric setting T = survival time Proportional C = censoring time hazards models ⇒ Data : (Y , δ) with Parametric survival Y = min(T , C) models δ = I(T ≤ C) Type I right censoring

Basic  All subjects are followed for a fixed amount of time concepts → all censored subjects have the same censoring time Nonparametric estimation  Ex : Type I censoring in animal study Hypothesis testing in a nonparametric setting

Proportional hazards models

Parametric survival models Basic concepts

Nonparametric Type II right censoring estimation

Hypothesis  All subjects start to be followed up at the same time and testing in a nonparametric follow up continues until r individuals have experienced setting the event of interest (r is some predetermined integer) Proportional hazards → The n − r censored items all have a censoring time models equal to the failure time of the r th item. Parametric survival  Ex : Type II censoring in industrial study : all lamps are models put on test at the same time and the test is terminated when r of the n lamps have failed. Basic concepts Random right censoring Nonparametric estimation  The study itself continues until a fixed time point but

Hypothesis subjects enter and leave the study at different times testing in a nonparametric → censoring is a random variable setting → censoring can occur for various reasons: Proportional – end of study hazards models – lost to follow up

Parametric – competing event (e.g. death due to some cause other survival than the cause of interest) models – patient withdrawing from the study, change of treatment, ...  Ex : Random right censoring in a cancer clinical trial Basic concepts Example : Random right censoring in HIV study Nonparametric estimation  Study enrolment: January 2005 - December 2006 Hypothesis testing in a  Study end: December 2008 nonparametric setting  Objective: HIV patients followed up to death due to Proportional hazards AIDS or AIDS related complication (time in month from models confirmed diagnosis) Parametric survival  Possible causes of censoring : models • death due to other cause • lost to follow up / dropped out • still alive at the end of study Table: Data of 6 patients in HIV study

Basic Patient id Entry Date Date last seen Status Time Censoring concepts 1 18 March 2005 20 June 2005 Dropped out 3 0 Nonparametric 2 19 Sept 2006 20 March 2007 Dead due to AIDS 6 1 estimation 3 15 May 2006 16 Oct 2006 Dead due to accident 5 0 4 01 Dec 2005 31 Dec 2008 Alive 37 0 Hypothesis 5 9 Apr 2005 10 Feb 2007 Dead due to AIDS 22 1 testing in a 6 25 Jan 2005 24 Jan 2006 Dead due to AIDS 12 1 nonparametric setting

Proportional hazards models

Parametric survival models Left censoring

Basic  Some subjects have already experienced the event of concepts interest at the time they enter in the trial Nonparametric estimation  Only an upper bound for the time of interest is known Hypothesis ⇒ Data : (Y , δ ) with testing in a ` ` nonparametric setting Y` = max(T , C`)

Proportional δ` = I(T > C`) hazards models C` = censoring time Parametric survival  Ex : Left censoring in malaria trial models • Children between 2 and 10 years are followed up for malaria • Once children have experienced malaria, they will have antibodies in their blood against the Plasmodium parasite • Children entered at the age of 2 might have already been in touch with the parasite Basic concepts Interval censoring Nonparametric estimation  The event of interest is only known to occur within a Hypothesis certain interval (L, U) testing in a nonparametric setting  Contrary to right and left censoring, we never observe

Proportional the exact survival time hazards models  Typically occurs if diagnostic tests are used to assess Parametric the event of interest survival models  Ex : Interval censoring in malaria trial → The exact time to malaria is between the last negative and the first positive test Basic concepts Truncation : Individuals of a subset of the population of Nonparametric interest do not appear in the sample estimation

Hypothesis testing in a Left truncation nonparametric setting  Occurs often in studies where a subject must first meet

Proportional a particular condition before he/she can enter in the hazards models study and followed up for the event of interest Parametric ⇒ Subjects that experience the event of interest before survival models the condition is met, will not appear in the study  Data : (T , L) observed if T ≥ L, with T = survival time L = left truncation time Basic concepts

Nonparametric estimation  Ex : Left truncation in HIV study Hypothesis • Incubation period between HIV infection and testing in a nonparametric seroconversion setting • An individual is considered to have been infected with Proportional hazards HIV only after seroconversion models ⇒ If we study HIV infected individuals and follow them Parametric for survival, all subjects that died between HIV infection survival models and seroconversion will not be considered for inclusion in the study Right truncation Basic concepts  Occurs when only subjects who have experienced the Nonparametric estimation event of interest are included in the sample Hypothesis  Data : (T , R) observed if T ≤ R, with testing in a nonparametric setting T = survival time

Proportional R = right truncation time hazards models  Ex : Right truncation in AIDS study Parametric • Consider time between HIV seroconversion and survival models development of AIDS • Often use a sample of AIDS patients, and ascertain retrospectively time of HIV infection ⇒ Patients with long incubation time will not be part of the sample, nor patients that die from another cause before they develop AIDS Basic concepts

Nonparametric estimation Remark Hypothesis testing in a  Censoring : nonparametric setting At least some information is available for a ‘complete’ Proportional random sample of the population hazards models  Truncation : Parametric survival No information at all is available for a subset of the models population Basic concepts

Nonparametric estimation

Hypothesis testing in a nonparametric setting

Proportional Nonparametric estimation hazards models

Parametric survival models We will develop nonparametric estimators of the Basic  survival function concepts

Nonparametric  cumulative hazard function estimation  hazard rate Hypothesis testing in a nonparametric for censored and truncated data setting

Proportional hazards All these estimators will be based on the nonparametric models : Parametric survival  Different from the likelihood for completely observed models data due to the presence of censoring and truncation  We will derive the likelihood function for : • right censored data • any type of censored data (right, left and interval censoring) • truncated data Basic Likelihood for randomly right censored data concepts

Nonparametric  Random sample of individuals of size n : estimation T1,..., Tn survival time Hypothesis C ,..., C censoring time testing in a 1 n nonparametric setting ⇒ Observed data : (Yi , δi ) (i = 1,..., n) with

Proportional Yi = min(Ti , Ci ) hazards models δi = I(Ti ≤ Ci ) Parametric survival  Denote models f (·) and F(·) for the density and distribution of T g(·) and G(·) for the density and distribution of C and we assume that T and C are independent (called independent censoring) Contribution to the likelihood of an event (y = t , δ = 1) : Basic i i i concepts 1 Nonparametric lim P (yi −  < Y < yi + , δ = 1) estimation →0 2 > Hypothesis testing in a 1 nonparametric = lim P (yi −  < T < yi + , T ≤ C) setting →0 2 > Proportional hazards yi + ∞ models 1 Z Z = lim dG(c)dF(t) (due to independence) Parametric →0 2 survival > yi − t models yi + 1 Z = lim (1 − G(t))dF(t) →0 2 > yi − = (1 − G(yi ))f (yi ) Basic Contribution to the likelihood of a right censored observation concepts (yi = ci , δi = 0) : Nonparametric estimation 1 Hypothesis lim P (yi −  < Y < yi + , δ = 0) testing in a →0 2 nonparametric > setting 1 Proportional = lim P (yi −  < C < yi + , T > C) hazards →0 2 models >

Parametric = (1 − F(yi ))g(yi ) survival models This leads to the following formula of the likelihood :

n Y h iδi h i1−δi (1 − G(yi ))f (yi ) (1 − F(yi ))g(yi ) i=1 Basic concepts We assume that the censoring is uninformative, i.e. the Nonparametric estimation distribution of the censoring times does not depend on the Hypothesis parameters of interest related to the survival function. testing in a nonparametric δ 1−δ setting ⇒ The factors (1 − G(yi )) i and g(yi ) i are Proportional hazards non-informative for inference on the survival function models

Parametric ⇒ They can be removed from the likelihood, leading to survival models n n Y δi 1−δi Y δi f (yi ) S(yi ) = h(yi ) S(yi ) i=1 i=1 Basic concepts

Nonparametric estimation  This likelihood can also be written as Hypothesis Y Y testing in a L = f (yi ) S(yi ) nonparametric setting i∈D i∈R

Proportional with D the index set of survival times and R the index hazards models set of right censored times

Parametric survival  It is straightforward to see that the same survival models likelihood is also valid in the case of fixed censoring times (type I and type II) Basic Likelihood for right, left and/or interval censored data concepts

Nonparametric Generalization of the previous likelihood to include right, left estimation and interval censoring : Hypothesis testing in a nonparametric Y Y Y Y setting L = f (yi ) S(yi ) (1 − S(yi )) (S(li ) − S(ri )), Proportional i∈D i∈R i∈L i∈I hazards models with Parametric survival D index set of survival times models R index set of right censored times L index set of left censored times I index set of interval censored times

(with li the lower limit and ri the upper limit) Likelihood for left truncated data Basic concepts Suppose that the survival time Ti is left truncated at ai Nonparametric ⇒ We have to consider the conditional distribution of Ti estimation given Ti ≥ ai : Hypothesis testing in a nonparametric 1 setting f (ti |T ≥ ai ) = lim P(ti −  < T < ti +  | T ≥ ai ) →0 2 Proportional > hazards 1 P(t −  < T < t + , T ≥ a ) models = lim i i i Parametric →0 2 P(T ≥ ai ) survival > models 1 P(t < T < t + ) = lim i i P(T ≥ ai ) →0  > f (t ) = i S(ai ) Basic concepts This leads to the following likelihood, accommodating left

Nonparametric truncation and any type of censoring : estimation

Hypothesis Y f (ti ) Y S(ti ) Y S(ai ) − S(ti ) Y S(li ) − S(ri ) testing in a L = nonparametric S(ai ) S(ai ) S(ai ) S(ai ) setting i∈D i∈R i∈L i∈I

Proportional hazards For right truncated data : models

Parametric  Consider the conditional density obtained by replacing survival models S(ai ) by 1 − S(bi ), where bi is the right truncation time for subject i  The likelihood function can then be constructed in a similar way Basic concepts Nonparametric Nonparametric estimation of the survival function estimation Hypothesis  The survival (or distribution) function is at the basis of testing in a nonparametric many other quantities (mean, , ...) setting

Proportional  The survival function is also useful to identify an hazards models appropriate parametric distribution Parametric  For estimating the survival function in a nonparametric survival models way, we need to take censoring and truncation into account Kaplan-Meier estimator of the survival function Basic concepts  Kaplan and Meier (JASA, 1958)

Nonparametric estimation  Nonparametric estimation of the survival function for

Hypothesis right censored data testing in a nonparametric  Based on the order in which events and censored setting observations occur Proportional hazards models Notations :

Parametric survival  n observations y1,..., yn with censoring indicators models δ1, . . . , δn  r distinct event times (r ≤ n)

 ordered event times : y(1),..., y(r) and corresponding number of events: d(1),..., d(r)

 R(j) is the size of the risk set at event time y(j)  Log-likelihood for right censored data : Basic n concepts X h i Nonparametric δi log f (yi ) + (1 − δi ) log S(yi ) estimation i=1 Hypothesis  Replacing the density function f (yi ) by S(yi−) − S(yi ), testing in a nonparametric yields the nonparametric log-likelihood : setting n Proportional X h i hazards log L = δi log(S(yi−) − S(yi )) + (1 − δi ) log S(yi ) models i=1 Parametric ˆ survival  Aim : finding an estimator S(·) which maximizes log L models  It can be shown that the maximizer of log L takes the following form : ˆ Y S(t) = (1 − h(j)),

j:y(j)≤t

for some h(1),..., h(r) Basic concepts  Plugging-in Sˆ(·) into the log-likelihood, gives after some Nonparametric estimation algebra : Hypothesis r testing in a X h  i nonparametric log L = d(j) log h(j) + R(j) − d(j) log(1 − h(j)) setting j=1 Proportional hazards  Using this expression to solve models d Parametric log L = 0 survival dh(j) models leads to ˆ d(j) h(j) = R(j) Basic concepts  Plugging in this estimate hˆ in Sˆ(t) = Q (1 − h ) (j) j:y(j)≤t (j) Nonparametric we obtain : estimation Y R(j) − d(j) Hypothesis Sˆ(t) = = Kaplan-Meier estimator testing in a R(j) nonparametric j:y(j)≤t setting  Step function with jumps at the event times Proportional hazards  If the largest observation, say y , is censored : models n • ˆ Parametric S(t) does not attain 0 survival • Impossible to estimate S(t) consistently beyond yn models • Various solutions : ˆ - Set S(t) = 0 for t ≥ yn ˆ ˆ - Set S(t) = S(yn) for t ≥ yn ˆ - Let S(t) be undefined for t ≥ yn Uncensored case

Basic When all data are uncensored, the Kaplan-Meier estimator concepts reduces to the empirical distribution function Nonparametric estimation Consider case without ties for simplicity : Hypothesis testing in a  If no censoring, R(j) − d(j) = R(j+1) for j = 1,..., r nonparametric setting  We can rewrite the KM estimator as Proportional hazards ˆ R(2) R(3) R(k+1) models S(t) = ··· where y(k) ≤ t < y(k+1) R(1) R(2) R(k) Parametric survival R(k+1) models = R(1) # subjects with survival time ≥ y = (k+1) # at risk before first death time n 1 X = I(y > t) n i i=1 Asymptotic normality of the KM estimator

Basic  Asymptotic of the KM estimator : concepts Z t dHu(s) Nonparametric V (Sˆ(t)) = n−1S2(t) , estimation As 0 (1 − H(s))(1 − H(s−)) Hypothesis where testing in a nonparametric - H(t) = P(Y ≤ t) = 1 − S(t)(1 − G(t)) setting u( ) = ( ≤ , δ = ) Proportional - H t P Y t 1 hazards models  This variance can be consistently estimated as

Parametric (Greenwood formula) survival d models ˆ ˆ ˆ 2 X (j) VAs(S(t)) = S (t) R(j)(R(j) − d(j)) j:y(j)≤t  Asymptotic normality of Sˆ(t) :

Sˆ(t) − S(t) d q → N(0, 1) ˆ ˆ VAs(S(t)) Basic Nelson-Aalen estimator of the cumulative hazard function concepts

Nonparametric  Proposed independently by Nelson (Technometrics, estimation 1972) and Aalen (Annals of Statistics, 1978) : Hypothesis d testing in a ˆ X (j) nonparametric H(t) = for t ≤ y(r) setting R(j) j:y(j)≤t Proportional hazards  Its asymptotic variance can be estimated by models X d(j) Parametric ˆ ( ˆ ( )) = VAs H t 2 survival R(j) models j:y(j)≤t  Asymptotic normality :

Hˆ (t) − H(t) d q → N(0, 1) ˆ ˆ VAs(H(t)) Basic concepts Alternative for KM estimator Nonparametric estimation  An alternative estimator for S(t) can be obtained based Hypothesis testing in a on the Nelson-Aalen estimator using the relation nonparametric setting S(t) = exp(−H(t)),

Proportional hazards leading to models Y  d(j)  Sˆ (t) = exp − Parametric alt R survival j:y ≤t (j) models (j) ˆ ˆ  S(t) and Salt (t) are asymptotically equivalent ˆ ˆ  Salt (t) performs often better than S(t) for small samples Basic concepts Example : Survival function for 6 HIV diagnosed patients Nonparametric estimation  Ordered observed times: 3*, 5*, 6, 12*, 22, 37* Hypothesis testing in a  Only two contributions to KM and NA estimator : nonparametric setting Event time Proportional 6 22 hazards models Number of events d(j) 1 1 Number at risk R(j) 4 2 Parametric KM contribution 1 − d /R 3/4 1/2 survival (j) (j) ˆ models KM estimator S(y(j)) 3/4=0.75 3/8=0.375

NA contribution exp(−d(j)/R(j)) 0.7788 0.6065 NA estimator Q exp(−d /R ) 0.7788 0.4723 j:y(j)≤t (j) (j) Basic concepts

Nonparametric 1.0 estimation

Hypothesis testing in a 0.8 nonparametric setting 0.6 Proportional hazards models 0.4 Estimated survival Parametric survival models 0.2 Kaplan−Meier Nelson−Aalen 0.0

0 5 10 15 20 25 30 35

Time Basic concepts

Nonparametric Confidence intervals for the survival function estimation  From the asymptotic normality of Sˆ(t), a 100(1 − α)% Hypothesis testing in a confidence interval (CI) for S(t) (t fixed) is given by : nonparametric setting q ˆ ˆ ˆ Proportional S(t) ± zα/2 VAs(S(t)) hazards models

Parametric  However, this CI may contain points outside the [0, 1] survival models interval ⇒ Use an appropriate transformation to determine the CI on the transformed scale and then transform back  A popular transformation is log(− log S(t)), which takes Basic concepts values between −∞ and ∞.

Nonparametric estimation  One can show that ˆ Hypothesis log(− log S(t)) − log(− log S(t)) d testing in a q → N(0, 1), nonparametric Vˆ log(− log Sˆ(t)) setting As

Proportional where hazards 1 X d(j) models Vˆ log(− log Sˆ(t)) = As ˆ 2 R (R − d ) Parametric log S(t) j:y ≤t (j) (j) (j) survival (j) models  Hence, CI for log(− log S(t)) is given by q ˆ ˆ ˆ  log(− log S(t)) ± zα/2 VAs log(− log S(t))  By transforming back, we get the following CI for S(t) :  q  ˆ ˆ  exp ±zα/2 VAs log(− log S(t)) Sˆ(t) Point estimate of the mean survival time  Nonparametric estimator can be obtained using the Basic concepts Kaplan-Meier estimator, since ∞ ∞ Nonparametric Z Z estimation µ = E(T ) = xf (x)dx = S(x)dx 0 0 Hypothesis testing in a ⇒ We can estimate µ by replacing S(x) by the KM nonparametric ˆ setting estimator S(x) Proportional  But, Sˆ(t) is inconsistent in the right tail if the largest hazards models observation (say yn) is censored Parametric • Proposal 1 : assume yn experiences the event survival models immediately after the censoring time : Z yn ˆ µˆyn = S(t)dt 0 • Proposal 2 : restrict integration to a predetermined ˆ ˆ interval [0, tmax ] and consider S(t) = S(yn) for yn ≤ t ≤ tmax : Z tmax ˆ µˆtmax = S(t)dt 0 Basic concepts

Nonparametric  µˆyn and µˆtmax are inconsistent estimators of µ, but given estimation the lack of data in the right tail, we cannot do better (at Hypothesis testing in a least not nonparametrically) nonparametric setting  Variance of µˆτ (with τ either yn or tmax ): 2 Proportional r Z τ ! hazards ˆ X ˆ d(j) models VAs(ˆµτ ) = S(t)dt y R(j)(R(j) − d(j)) Parametric j=1 (j) survival models  A 100(1 − α)% CI for µ is given by : q ˆ µˆτ ± zα/2 VAs(ˆµτ ) Point estimate of the survival time Basic  Advantages of the median over the mean : concepts • Nonparametric As survival function is often skewed to the right, the estimation mean is often influenced by outliers, whereas the Hypothesis median is not testing in a nonparametric • Median can be estimated in a consistent way (if setting censoring is not too heavy) Proportional th hazards  An estimator of the p xp is given by : models n ˆ o Parametric xˆp = inf t | S(t) ≤ 1 − p survival models ⇒ An estimate of the median is given by xˆp=0.5

 Asymptotic variance of xˆp : ˆ ˆ ˆ VAs(S(xp)) VAs(xˆp) = , ˆ2 f (xp) where ˆf is an estimator of the density f Basic  Estimation of f involves smoothing techniques and the concepts choice of a bandwidth sequence Nonparametric estimation ⇒ We prefer not to use this variance estimator in the Hypothesis construction of a CI testing in a nonparametric ˆ setting  Thanks to the asymptotic normality of S(xp) : ˆ Proportional  S(xp) − S(xp)  hazards P − zα/2 ≤ q ≤ zα/2 ≈ 1 − α, models ˆ ˆ VAs(S(xp)) Parametric survival with obviously S(x ) = 1 − p. models p ⇒ A 100(1 − α)% CI for xp is given by    Sˆ(t) − (1 − p)  t : −zα/2 ≤ q ≤ zα/2 ˆ ˆ  VAs(S(t))  Example : Schizophrenia patients Basic concepts  Schizophrenia is one of the major mental illnesses Nonparametric estimation encountered in Ethiopia

Hypothesis → disorganized and abnormal thinking, behavior and testing in a nonparametric language + emotionally unresponsive setting

Proportional → higher mortality rates due to natural and unnatural hazards causes models Parametric  Project on schizophrenia in Butajira, Ethiopia survival models → survey of the entire population (68491 individuals) in the age group 15-49 years

⇒ 280 cases of schizophrenia identified and followed for 5 years (1997-2001) Basic concepts Table: Data on schizophrenia patients Nonparametric estimation Hypothesis Patid Time Censor Education Onset Marital Gender Age testing in a nonparametric 1 1 1 1 37 3 1 44 setting 2 3 1 3 15 2 2 23 Proportional hazards 3 4 1 6 26 1 1 33 models

Parametric 4 5 1 12 25 1 1 31 survival 5 5 0 5 29 3 1 33 models ... 278 1787 0 2 16 2 1 18 279 1792 0 2 23 1 1 25 280 1794 1 2 28 1 1 35 Basic  In R : survfit concepts schizo<-read.table("c://...//Schizophrenia.csv", Nonparametric header=T,sep=";") estimation KM_schizo_l<-survfit(Surv(Time,Censor)∼1,data=schizo, Hypothesis type="kaplan-meier", conf.type="log-log") testing in a nonparametric plot(KM_schizo_l, conf.int=T, xlab="Estimated survival", setting ylab="Time", yscale=1)

Proportional mtext("Kaplan-Meier estimate of the survival function hazards for Schizophrenic patients", 3,-3) models mtext("( based on log-log Parametric transformation)", 3,-4) survival models  In SAS : proc lifetest title1 ’Kaplan-Meier estimate of the survival function for Schizophrenic patients’; proc lifetest method=km width=0.5 data=schizo; time Time*Censor(0); run; Basic concepts

Nonparametric 1.0 Kaplan−Meier estimate of the survival function for Schizophrenic patients estimation (confidence interval based on log−log transformation) Hypothesis testing in a 0.8 nonparametric setting 0.6 Proportional hazards models Time 0.4 Parametric survival models 0.2 0.0

0 500 1000 1500

Estimated survival Basic concepts > KM_schizo_l Nonparametric Call: survfit(formula = Surv(Time, Censor) ~ 1, data = schizo, type = estimation "kaplan-meier", conf.type = "log-log")

Hypothesis n events median 0.95LCL 0.95UCL testing in a 280 163 933 757 1099 nonparametric setting > summary(KM_schizo_l) Proportional Call: survfit(formula = Surv(Time, Censor) ~ 1, data = schizo, type = hazards "kaplan-meier", conf.type = "log-log") models time n.risk n.event survival std.err lower 95% CI upper 95% CI Parametric 1 280 1 0.996 0.00357 0.9749 0.999 survival 3 279 1 0.993 0.00503 0.9717 0.998 models 4 277 1 0.989 0.00616 0.9671 0.997 … 1770 13 1 0.219 0.03998 0.1465 0.301 1773 12 1 0.201 0.04061 0.1283 0.285 1784 8 2 0.151 0.04329 0.0782 0.245 1785 6 2 0.100 0.04092 0.0387 0.197 1794 1 1 0.000 NA NA NA

Basic concepts

Nonparametric 1.0 Kaplan−Meier estimate of the survival function for Schizophrenic patients estimation (confidence interval based on Greenwood formula) Hypothesis testing in a 0.8 nonparametric setting 0.6 Proportional hazards models Time 0.4 Parametric survival models 0.2 0.0

0 500 1000 1500

Estimated survival Basic concepts > KM_schizo_g Nonparametric Call: survfit(formula = Surv(Time, Censor) ~ 1, data = schizo, type = estimation "kaplan-meier", conf.type = "plain")

Hypothesis n events median 0.95LCL 0.95UCL testing in a 280 163 933 766 1099 nonparametric setting > summary(KM_schizo_g) Call: survfit(formula = Surv(Time, Censor) ~ 1, data = schizo, type = Proportional "kaplan-meier", conf.type = "plain") hazards models time n.risk n.event survival std.err lower 95% CI upper 95% CI 1 280 1 0.996 0.00357 0.9894 1.000 Parametric 3 279 1 0.993 0.00503 0.9830 1.000 survival 4 277 1 0.989 0.00616 0.9772 1.000 models … 1770 13 1 0.219 0.03998 0.1409 0.298 1773 12 1 0.201 0.04061 0.1214 0.281 1784 8 2 0.151 0.04329 0.0659 0.236 1785 6 2 0.100 0.04092 0.0203 0.181 1794 1 1 0.000 NA NA NA Basic concepts

Nonparametric estimation  Median survival time is estimated to be 933 days Hypothesis  95% CI for the median : [757, 1099] testing in a nonparametric setting  Survival at, e.g., 505 days is estimated to be 0.6897

Proportional with std error 0.0290 hazards models  95% CI for S(505) : [0.6329, 0.7465] (without Parametric transformation) survival models  95% CI for S(505) : [0.6290, 0.7426] (using log-log transformation) Basic Estimation of the survival function for left truncated and right concepts censored data Nonparametric estimation  We need to redefine R(j) : Hypothesis testing in a R = number of individuals at risk at time y nonparametric (j) (j) setting and under observation prior to time y(j) Proportional hazards = #{i : li ≤ y(j) ≤ yi }, models

Parametric where li is the truncation time. survival models  We cannot estimate S(t), but only a conditional survival function

Sl (t) = P(T ≥ t | T ≥ l)

for some fixed value l ≥ min(l1,..., ln) Basic concepts

Nonparametric estimation

Hypothesis  The conditional survival function Sl (t) is estimated by testing in a nonparametric ( setting 1 if t < l Sˆ (t)=  d  Proportional l Q 1 − (j) if t ≥ l hazards j:l≤y(j)≤t R(j) models

Parametric  Proposed and named after Lynden-Bell (1971), an survival models astronomer Estimation of the hazard function for right censored data

Basic  Usually more informative about the underlying concepts population than the survival or the cumulative hazard Nonparametric estimation function

Hypothesis testing in a  Crude estimator : take the size of the jumps of the nonparametric cumulative hazard function setting

Proportional  Ex : Crude estimator of the hazard function for data on hazards models schizophrenic patients

Parametric

survival 0.015 models 0.010 Hazard estimate 0.005 0.000

0 200 400 600 800 1000

Time (in days)  Smoothed estimator of h(t) : (weighted) average of the Basic concepts crude estimator over all time points in the interval Nonparametric [t − b, t + b] for a certain value b, called the bandwidth estimation

Hypothesis  Uniform weight over interval [t − b, t + b] : testing in a r nonparametric ˆ −1 X  ˆ setting h(t) = (2b) I −b ≤ t − y(j) ≤ b ∆H(y(j)), Proportional j=1 hazards models where

Parametric - Hˆ (t) = Nelson-Aalen estimator survival ˆ ˆ ˆ models - ∆H(y(j)) = H(y(j)) − H(y(j−1))  General weight function : r   X t − y(j) hˆ(t) = b−1 K ∆Hˆ (y ), b (j) j=1 where K (·) is a density function, called the kernel  Example of kernels :

Basic Name Density function Support concepts ( ) = 1 − ≤ ≤ Nonparametric uniform K x 2 1 x 1 estimation 3 2 Epanechnikov K (x) = 4 (1 − x ) −1 ≤ x ≤ 1 Hypothesis 15 2 2 testing in a biweight K (x) = 16 (1 − x ) −1 ≤ x ≤ 1 nonparametric setting

Proportional hazards  Ex : Smoothed estimator of the hazard function for data models on schizophrenic patients Parametric survival

models 1e−03 8e−04 6e−04 Smoothed hazard 4e−04

2e−04 Uniform Epanechnikov 0e+00

0 200 400 600 800 1000

Time Basic concepts  The choice of the kernel does not have a major impact Nonparametric estimation on the estimated hazard rate, but the choice of the Hypothesis bandwidth does testing in a nonparametric ⇒ It is important to choose the bandwidth in an setting appropriate way, by e.g. plug-in, cross-validation, Proportional hazards bootstrap, ... techniques models ˆ Parametric  Variance of h(t) can be estimated by survival r  2 models X t − y(j) Vˆ (hˆ(t)) = b−2 K ∆Vˆ (Hˆ (y )), As b As (j) j=1 ˆ ˆ ˆ ˆ ˆ ˆ where ∆VAs(H(y(j))) = VAs(H(y(j))) − VAs(H(y(j−1))) Basic concepts

Nonparametric estimation

Hypothesis testing in a nonparametric setting Hypothesis testing in a Proportional hazards nonparametric setting models

Parametric survival models Hypothesis testing in a nonparametric setting Basic concepts  Hypotheses concerning the hazard function of one Nonparametric estimation population

Hypothesis  Hypotheses comparing the hazard function of two or testing in a nonparametric more populations setting

Proportional Note that hazards models  It is important to consider overall differences over time Parametric survival  We will develop tests that look at weighted differences models between observed and expected quantities (under H0)  Weights allow to put more emphasis on certain part of

the data (e.g. early or late departure from H0)  Particular cases : log-rank test, Breslow’s test, Cox Mantel test, Peto and Peto test, ... Ex : Survival differences in leukemia patients : chemotherapy vs. chemotherapy + autologous Basic concepts transplantation Nonparametric estimation

Hypothesis testing in a nonparametric setting

Proportional hazards models

Parametric Survival survival Transplant+chemo models Only chemo 0.0 0.2 0.4 0.6 0.8 1.0

100 200 300

Time (in days) Basic Hypotheses for the hazard function of one population concepts Nonparametric  Test whether a censored sample of size n comes from estimation a population with a known hazard function h (t) : Hypothesis 0 testing in a nonparametric H0 : h(t) = h0(t) for all t ≤ y(r) setting H1 : h(t) 6= h0(t) for some t ≤ y( ) Proportional r hazards models  Based on the NA estimator of the cumulative hazard

Parametric function, a crude estimator of the hazard function at survival models time y(j) is d(j)

R(j)

 Under H0, the hazard function at time y(j) is h0(y(j)) Basic  Let w(t) be some weight function, with w(t) = 0 for concepts t > y(r) Nonparametric estimation  Test : r Hypothesis Z y(r) X d(j) testing in a Z = w(y(j)) − w(s)h0(s)ds nonparametric R(j) 0 setting j=1

Proportional  Under H0 : hazards y models Z (r) h (s) V (Z ) = w 2(s) 0 ds Parametric 0 R(s) survival models with R(s) corresponding to the number of subjects in the risk set at time s  For large samples : Z ≈ N(0, 1) pV (Z) One sample log-rank test  Weight function : w(t) = R(t) Basic concepts  Test statistic : Nonparametric r Z y(r) estimation X Z = d(j) − R(s)h0(s)ds Hypothesis 0 testing in a j=1 nonparametric r n setting X X Z yi = d − h (s)ds Proportional (j) 0 hazards j=1 i=1 0 models r n Parametric X X survival = d(j) − H0(yi ) = O − E models j=1 i=1

 Under H0 : Z y(r) V (Z) = R(s)h0(s)ds = E 0 and O − E √ ≈ N(0, 1) E Basic Example : Survival in patients with Paget disease concepts

Nonparametric  Benign form of breast cancer estimation  Compare survival in a sample of patients to the survival Hypothesis testing in a in the overall population nonparametric setting • Data : Finkelstein et al. (2003)

Proportional • Hazard function of the population : standardized hazards models actuarial table Parametric  Compute the expected number of under H0 survival models using • follow-up information of the group of patients with Paget disease • relevant hazard function from standardized actuarial table Paget disease data:  age (in years) at diagnosis Basic concepts  time to death or censoring (in years) Nonparametric estimation  censoring indicator

Hypothesis testing in a  gender (1=male, 2=female) nonparametric setting  race (1=Caucasian, 2=black)

Proportional hazards models Age Follow-up Status Gender Race Parametric 52 22 0 2 1 survival models 53 4 0 2 1 57 8 0 2 1 57 7 0 2 1 ... 85 6 1 2 1 86 1 0 2 1 Standardized actuarial table :

Basic  age (in years) concepts  hazard (per 100 subjects) for respectively Caucasian Nonparametric estimation males, Caucasian females, black males, and black Hypothesis females testing in a nonparametric setting Hazard function Proportional hazards Age Caucasian Caucasian black black models

Parametric male female male female survival models 50-54 0.6070 0.3608 1.3310 0.7156 55-59 0.9704 0.5942 1.9048 1.0558 60-64 1.5855 0.9632 2.8310 1.6048 ... 80-84 9.3128 6.2880 10.4625 7.2523 85- 17.7671 14.6814 16.0835 13.7017 Basic concepts Nonparametric  E.g. first patient : Caucasian female followed from 52 estimation years on for 22 years : Hypothesis testing in a th nonparametric (1) hazard for the 52 year = 0.3608 setting (2) hazard for the 53th year = 0.3608 Proportional hazards ...... models (22) hazard for the 73th year = 2.3454 Parametric survival Total (cumulative hazard) = 25.637 models ⇒ for one particular patient (/100) = 0.25637 and do the same for all patients Basic concepts

Nonparametric estimation  Expected number of deaths under H0 : E = 9.55 Hypothesis testing in a  Observed number of deaths : O = 13 nonparametric setting  Test statistic :

Proportional O − E 13 − 9.55 hazards √ = √ = 1.116 models E 9.55

Parametric  Two-sided hypothesis test : survival models 2P(Z > 1.116) = 0.264

⇒ We do not reject H0 Basic concepts Other weight functions

Nonparametric Weight function proposed by Harrington and Fleming estimation (1982): Hypothesis testing in a p nonparametric w(t) = R(t)S (t)(1 − S (t))q p, q ≥ 0 setting 0 0

Proportional hazards  p = q = 0 : log-rank test models

Parametric  p > q : more weight on early deviations from H0 survival models  p < q : more weight on late deviations from H0  p = q > 0 : more weight on deviations in the middle  p = 1, q = 0 : generalization of the one-sample Wilcoxon test to censored data Basic concepts Comparing the hazard functions of two populations Nonparametric  Hypothesis test : estimation

Hypothesis H0 : h1(t) = h2(t) for all t ≤ y(r) testing in a nonparametric H1 : h1(t) 6= h2(t) for some t ≤ y(r) setting  Notations : Proportional hazards • y , y ,..., y : ordered event times in the pooled models (1) (2) (r) sample Parametric survival • d(j)k : number of events at time y(j) in sample k models (j = 1,..., r and k = 1, 2)

• R(j)k : number of individuals at risk at time y(j) in sample k P2 P2 • d(j) = k=1 d(j)k and R(j) = k=1 R(j)k  Derive a 2 × 2 for each event time

y(j) : Basic concepts

Nonparametric Group Event No Event Total estimation 1 d(j)1 R(j)1 − d(j)1 R(j)1 Hypothesis testing in a 2 d(j)2 R(j)2 − d(j)2 R(j)2 nonparametric setting Total d(j) R(j) − d(j) R(j)

Proportional hazards  Test the independence between the rows and the models columns, which corresponds to the assumption that the Parametric survival hazard in the two groups at time y(j) is the same models  Test statistic with group 1 as reference group : d(j)R(j)1 Oj − Ej = d(j)1 − R(j)

with Oj = observed number of events in the first group Ej = expected number of events in the first group assuming that h1 ≡ h2 Basic concepts  Test statistic : weighted average over the different event

Nonparametric times : r estimation X Hypothesis U = w(y(j))(Oj − Ej ) testing in a nonparametric j=1 setting r X  d(j)R(j)1  Proportional = w(y(j)) d(j)1 − hazards R(j) models j=1

Parametric survival Different weights can be used, but choice must be models made before looking at the data  For large samples and under the null hypothesis : U ≈ N(0, 1) pV (U) Basic concepts Variance of U : Nonparametric estimation  Can be obtained by observing that conditional on d(j), Hypothesis R and R , the statistic d has a hypergeometric testing in a (j)1 (j) (j)1 nonparametric distribution setting

Proportional  Hence, hazards r models X 2 V (U) = w (y(j))V (d(j)1) Parametric survival j=1 models  R  R  r (j)1 (j)1 d(j) 1 − (R(j) − d(j)) X R(j) R(j) = w 2(y ) (j) R − 1 j=1 (j) Basic concepts Weights :

Nonparametric  w(y(j)) = 1 estimation ,→ log-rank test Hypothesis testing in a ,→ optimum power to detect alternatives when the hazard nonparametric setting rates in the two populations are proportional to each

Proportional other hazards models  w(y(j)) = R(j) Parametric survival ,→ generalization by Gehan (1965) of the two sample models Wilcoxon test

,→ puts more emphasis on early departures from H0 ,→ weights depend heavily on the event times and the censoring distribution  w(y(j)) = f (R(j)) Basic concepts ,→ Tarone and Ware (1977) ,→ ( ) = p Nonparametric a suggested choice is f R(j) R(j) estimation ,→ puts more weight on early departures from H0 Hypothesis   testing in a ˆ Q d(k) nonparametric  w(y( )) = S(y( )) = 1 − j j y(k)≤y(j) R(k)+1 setting ,→ Peto and Peto (1972) and Kalbfleisch and Prentice Proportional hazards (1980) models ,→ based on an estimate of the common survival function Parametric survival close to the pooled product limit estimate models ˆ p  ˆ q  w(y(j)) = S(y(j−1)) 1 − S(y(j−1)) p ≥ 0, q ≥ 0 ,→ Fleming and Harrington (1981) ,→ include weights of the log-rank as special case ,→ q = 0, p > 0 : more weight is put on early differences ,→ p = 0, q > 0 : more weight is put on late differences Example : Comparing survival for male and female

Basic schizophrenic patients concepts

Nonparametric estimation

Hypothesis 1 testing in a nonparametric setting 0.8

Proportional hazards models 0.6

Parametric survival 0.4 Estimated survival Male models Female 0.2 0

0 500 1000 1500 2000

Time Basic concepts

Nonparametric estimation  Observed number of events in female group : 93 Hypothesis testing in a  Expected number of events under H0 : 62 nonparametric setting  Log-rank weights : p Proportional • U/ V (U) = 4.099 hazards models • p-value (2-sided) = 0.000042 Parametric  Peto and Peto weights : survival p models • U/ V (U) = 4.301 • p-value (2-sided) = 0.000017 Comparing the hazard functions of more than 2 populations

Basic  Hypothesis test : concepts H0 : h1(t) = h2(t) = ... = hl (t) for all t ≤ y(r) Nonparametric estimation H1 : hi (t) 6= hj (t) for at least one pair (i, j) Hypothesis testing in a for some t ≤ y(r) nonparametric setting  Notations : same as earlier but now k = 1,..., l Proportional hazards  Test statistic based on the l × 2 contingency tables for models the different event times y(j) Parametric survival models Group Event No Event Total

1 d(j)1 R(j)1 − d(j)1 R(j)1 2 d(j)2 R(j)2 − d(j)2 R(j)2 ...

l d(j)l R(j)l − d(j)l R(j)l Total d(j) R(j) − d(j) R(j) t  The random vector d(j) = (d(j)1,..., d(j)l ) has a Basic multivariate hypergeometric distribution concepts

Nonparametric  We can define analogues of the test statistic U defined estimation previously : Hypothesis r testing in a  d(j)R(j)k  nonparametric X Uk = w(y(j)) d(j)k − , setting R j=1 (j) Proportional hazards which is a weighted sum of the differences between the models observed and expected number of events under H Parametric 0 survival models  The components of the vector (U1,..., Ul ) are linearly Pl dependent because k=1 Uk = 0 t ⇒ define U = (U1,..., Ul−1) ⇒ derive V (U), the variance- matrix of U

 For large sample size and under H0 : t −1 2 U V (U) U ≈ χl−1 Example : Comparing survival for schizophrenic patients

Basic according to their marital status concepts

Nonparametric estimation 1 Single Hypothesis Married testing in a Again alone nonparametric

setting 0.8

Proportional hazards models 0.6

Parametric survival Estimated survival models 0.4 0.2 0

0 500 1000 1500 2000

Time Basic concepts

Nonparametric estimation Hypothesis  Observed number of events : 55 (single), 37 (married), testing in a nonparametric 71 (alone again) setting

Proportional  Expected number of events under H0 : 67, 55, 41 hazards models  Test statistic : Ut V (U)−1U = 31.44 Parametric −7 2 survival  p-value = 1.5 × 10 (based on a χ2) models Basic Test for trend concepts  Sometimes there exists a natural ordering in the hazard Nonparametric estimation functions

Hypothesis testing in a  If such an ordering exists, tests that take it into nonparametric setting consideration have more power to detect significant

Proportional effects hazards models  Test for trend :

Parametric survival H0 : h1(t) = h2(t) = ... = hl (t) for all t ≤ y(r) models H1 : h1(t) ≤ h2(t) ≤ ... ≤ hl (t) for some t ≤ y(r) with at least one strict inequality

(H1 implies that S1(t) ≥ S2(t) ≥ ... ≥ Sl (t) for some t ≤ y(r) with at least one strict inequality)  Test statistic for trend : l Basic X concepts U = wk Uk , Nonparametric k=1 estimation with Hypothesis th testing in a • Uk the summary statistic of the k population nonparametric th setting • wk the weight assigned to the k population, e.g.

Proportional wk = k (corresponds to a linear trend in the groups) hazards models  Variance of U : l l Parametric X X survival V (U) = w w 0 Cov(U , U 0 ) models k k k k k=1 k 0=1  For large sample size and under H0 : U ≈ N(0, 1) pV (U) p  If wk = k , we reject H0 for large values of U/ V (U) (one-sided test) Example : Comparing survival for schizophrenic patients according to their educational level Basic concepts 4 educational groups : none, low, medium, high Nonparametric estimation

Hypothesis testing in a 1 None nonparametric Low Medium setting High

Proportional 0.8 hazards models

Parametric 0.6 survival models Estimated survival 0.4 0.2 0

0 500 1000 1500 2000

Time Basic concepts  Observed number of events : 79 (none), 43 (low), 32 Nonparametric estimation (medium), 9 (high) Hypothesis testing in a  Expected number of events under H0 : 71.3, 51.6, 31.1, nonparametric setting 9.0 Proportional  Consider H : h (t) ≥ ... ≥ h (t) hazards 1 1 4 models  Using weights 0, 1, 2, 3 we have : Parametric p survival • U = −6.77 and V (U) = 134 so U/ V (U) = −0.58 models • One-sided p-value : P(Z < −0.58) = 0.28  p-value for ‘global test’ : p = 0.49 Stratified tests Basic concepts  In some cases, subjects in a study can be grouped

Nonparametric according to particular characteristics, called strata estimation Ex : prognosis group (good, average, poor) Hypothesis testing in a  It is often advisable to adjust for strata as it reduces nonparametric setting variance Proportional ⇒ Stratified test : obtain an overall assessment of the hazards models difference, by combining information over the different Parametric strata to gain power survival models  Hypothesis test :

H0 : h1b(t) = h2b(t) = ... = hlb(t)

for all t ≤ y(r) and b = 1,..., m,

where hkb(·) is the hazard of group k and stratum b (k = 1,..., l; b = 1,..., m)  Test statistic : Basic • Ukb = summary statistic for population k (k = 1,..., l) in concepts stratum b (b = 1,..., m) Nonparametric estimation • Stratified summary statistic for population k : Pm Hypothesis Uk. = b=1 Ukb testing in a t nonparametric • Define U. = (U1.,..., U(l−1).) setting  Entries of the variance- V (U) of U. : Proportional hazards m models X Cov(Uk., Uk 0.) = Cov(Ukb, Uk 0b) Parametric b=1 survival models  For large sample size and under H0 : t −1 2 U. V (U) U. ≈ χl−1  If only two populations : Pm U b=1 b ≈ N(0, 1) qPm b=1 V (Ub) Example : Comparing survival for schizophrenic patients

Basic according to gender stratified by marital status concepts

Nonparametric a estimation 1 Male Female

Hypothesis 0.6 testing in a 0.2 Estimated survival nonparametric 0 setting 0 500 1000 1500 2000 Time Proportional hazards b models 1 0.6 Parametric 0.2 survival Estimated survival 0 models 0 500 1000 1500 2000

Time

c 1 0.6 0.2 Estimated survival 0

0 500 1000 1500 2000

Time Basic concepts  Log-rank test (weights=1) : Nonparametric estimation single married alone again

Hypothesis Ub 5.81 5.98 6.06 testing in a nonparametric V (Ub) 9.77 4.12 15.71 setting

Proportional P3 P3 hazards  b=1 Ub = 17.85 and b=1 V (Ub) = 29.60 models  Test statistic : Parametric survival P3 √ models b=1 Ub q = 10.76 P3 b=1 V (Ub)

 p-value (2-sided) = 0.00103 Basic concepts

Nonparametric Matched pairs test estimation  Particular case of the stratified test when each stratum Hypothesis testing in a consists of only 2 subjects nonparametric setting  m matched pairs of censored data : (y1b, y2b, δ1b, δ2b) Proportional for b = 1,..., m, with hazards models • 1st subject of the pair receiving treatment 1 Parametric • 2nd subject of the pair receiving treatment 2 survival models  Hypothesis test :

H0 : h1b(t) = h2b(t) for all t ≤ y(r) and b = 1,..., m Basic concepts

Nonparametric estimation  It can be shown that under H0 and for large m :

Hypothesis testing in a U D − D nonparametric . = √ 1 2 ≈ N(0, 1), setting p V (U.) D1 + D2 Proportional hazards models where Dj = number of matched pairs in which the Parametric survival individual from sample j dies first (j = 1, 2) models ⇒ Weight function has no effect on final test statistic in this case Basic concepts

Nonparametric estimation

Hypothesis testing in a nonparametric setting

Proportional Proportional hazards models hazards models

Parametric survival models Basic concepts The semiparametric proportional hazards model Nonparametric  estimation Cox, 1972

Hypothesis  Stratified tests not always the optimal strategy to adjust testing in a nonparametric for covariates : setting • Can be problematic if we need to adjust for several Proportional covariates hazards models • Do not provide information on the covariate(s) on which Parametric we stratify survival models • Stratification on continuous covariates requires categorization  We will work with semiparametric proportional hazards models, but there also exist parametric variations Basic concepts Simplest expression of the model Nonparametric estimation  Case of two treatment groups (Treated vs. Control) :

Hypothesis testing in a hT (t) = ψhC(t), nonparametric setting with hT (t) and hC(t) the hazard function of the treated Proportional and control group hazards models  Proportional hazards model : Parametric • Ratio ψ = hT (t)/hC (t) is constant over time survival models • ψ < 1 (ψ > 1): hazard of the treated group is smaller (larger) than the hazard of the control group at any time • Survival curves of the 2 treatment groups can never cross each other Basic concepts More generalizable expression of the model Nonparametric estimation  Consider a treatment covariate xi (0 = control, 1 = Hypothesis treatment) and an exponential relationship between the testing in a nonparametric hazard and the covariate xi : setting ( ) = (β ) ( ), Proportional hi t exp xi h0 t hazards models with

Parametric • hi (t) : hazard function for subject i survival • h (t) : hazard function of the control group models 0 • exp(β) = ψ :  Other functional relationships can be used between the hazard and the covariate More complex model t Basic  Consider a set of covariates xi = (xi1,..., xip) for concepts subject i : Nonparametric t estimation hi (t) = h0(t) exp(β xi ), Hypothesis testing in a with nonparametric • β : the p × 1 parameter vector setting • h0(t) : the baseline hazard function (i.e. hazard for a Proportional hazards subject with xij = 0, j = 1,..., p) models

Parametric  Proportional hazards (PH) assumption : ratio of the survival models hazards of two subjects with covariates xi and xj is constant over time : t hi (t) exp(β xi ) = t hj (t) exp(β xj )  Semiparametric PH model : leave the form of h0(t) completely unspecified and estimate the model in a semiparametric way Basic concepts

Nonparametric estimation Fitting the semiparametric PH model Hypothesis testing in a nonparametric  Based on likelihood maximization setting  As h (t) is left unspecified, we maximize a so-called Proportional 0 hazards models partial likelihood instead of the full likelihood :

Parametric • survival Derive the partial likelihood for data without ties models • Extend to data with tied observations Basic Partial likelihood for data without ties concepts  Can be derived as a profile likelihood: Nonparametric estimation First β is fixed, and the likelihood is maximized as a

Hypothesis function of h0(t) only to find estimators for the baseline testing in a nonparametric hazard in terms of β setting  Notations : Proportional hazards • r observed event times (r = d as no ties) models • y(1),..., y(r) ordered event times Parametric • x ,..., x corresponding covariate vectors survival (1) (r) models  Likelihood : r n Y  t  Y  t  h0(j) exp x(j)β exp − H0(yi ) exp(xi β) , j=1 i=1

with h0(j) = h0(y(j))  It can be seen that the likelihood is maximized when Basic concepts H0(yi ) takes the following form : X Nonparametric H0(yi ) = h0(y(j)) estimation y(j)≤yi Hypothesis testing in a (i.e. h0(t) = 0 for t 6= y(1),..., y(r), which leads to the nonparametric setting largest contribution to the likelihood) Proportional hazards  With β fixed, the likelihood can be rewritten as models L(h0(1),..., h0(r) | β) Parametric survival r r models Y Y t  = h0(j) exp x(j)β j=1 j=1 r Y  X t  × exp − h0(j) exp xk β ,

j=1 k∈R(y(j))

where R(y(j)) is the risk set at time y(j)  Maximize the likelihood with respect to h0(j) by setting Basic concepts the partial derivatives wrt h0(j) equal to 0 : Nonparametric ∂L h ,..., h | β estimation 0(1) 0(r)

Hypothesis ∂h0(1) testing in a r r nonparametric Y   Y setting t  = exp x(j)β exp −h0(j)bj Proportional j=1 j=1 hazards  models × h0(2) ... h0(r) − h0(1)h0(2) ... h0(r)b1 = 0 Parametric survival ⇐⇒ 1 − h0(1)b1 = 0, models

P t  with bj = exp x β , and in general k∈R(y(j)) k

1 1 = = h0(j) P t  bj exp x β k∈R(y(j)) k Basic concepts  Plug this solution into the likelihood, and ignore factors Nonparametric not containing any of the parameters : estimation  t  Hypothesis r exp x β Y (j) testing in a L (β)= nonparametric P exp xt β setting j=1 k∈R(y(j)) k Proportional = partial likelihood hazards models  This expression is used to estimate β through Parametric survival maximization models  Logarithm of the partial likelihood : r r X t X  X t   ` (β) = x(j)β − log exp xk β j=1 j=1 k∈R(y(j)) Basic concepts Nonparametric  Maximization is often done via the Newton-Raphson estimation

Hypothesis procedure, which is based on the following iterative testing in a nonparametric procedure : setting ˆ ˆ −1 ˆ ˆ βnew = βold + I (βold )U(βold ), Proportional hazards with models ˆ • U(βold ) = vector of scores Parametric −1 ˆ survival • I (βold ) = inverse of the observed information matrix models ˆ ˆ ⇒ convergence is reached when βold and βnew are sufficiently close together  Score function U(β) : ∂`(β) Basic Uh(β) = concepts ∂βh Nonparametric r r P x exp xt β estimation X X k∈R(y(j)) kh k = − x(j)h P t  Hypothesis k∈R(y ) exp x β testing in a j=1 j=1 (j) k nonparametric setting  Observed information matrix I(β) : 2 Proportional ∂ `(β) hazards Ihl (β) = − models ∂βh∂βl  Parametric r P x x exp xt β X k∈R(y(j)) kh kl k survival = models P exp xt β j=1 k∈R(y(j)) k r "P x exp xt β# X k∈R(y(j)) kh k − P exp xt β j=1 k∈R(y(j)) k r "P x exp xt β# X k∈R(y(j)) kl k × P exp xt β j=1 k∈R(y(j)) k Basic Remarks : concepts  Variance-covariance matrix of βˆ can be approximated Nonparametric estimation by the inverse of the information matrix evaluated at βˆ Hypothesis → V (βˆ ) can be approximated by [I(βˆ)]−1 testing in a h hh nonparametric ˆ setting  Properties (consistency, asymptotic normality) of β are

Proportional well established (Gill, 1984) hazards models  A 100(1-α)% confidence interval for βh is given by Parametric q survival ˆ ˆ βh ± zα/ V (βh) models 2 and for the hazard ratio ψh = exp(βh) :  q  ˆ ˆ exp βh ± zα/2 V (βh) , or alternatively via the Delta method Basic concepts Example : Active antiretroviral treatment

Nonparametric estimation  CD4 cells protect the body from infections and other Hypothesis types of disease testing in a nonparametric → if count decreases beyond a certain threshold the setting patients will die Proportional hazards models  As HIV infection progresses, most people experience a

Parametric gradual decrease in CD4 count survival models  Highly Active AntiRetroviral Therapy (HAART) • AntiRetroviral Therapy (ART) + 3 or more drugs • Not a cure for AIDS but greatly improves the health of HIV/AIDS patients Basic concepts

Nonparametric estimation  After introduction of ART, death of HIV patients Hypothesis testing in a decreased tremendously nonparametric setting → investigate now how HIV patients evolve after

Proportional HAART hazards models  Data from a study conducted in Ethiopia :

Parametric • 100 individuals older than 18 years and placed under survival models HAART for the last 4 years • only use data collected for the first 2 years Basic concepts Table: Data of HAART Study

Nonparametric estimation

Hypothesis Pat Time Censo- Gen- Age Weight Func. Clin. CD4 ART testing in a ID ring der Status Status nonparametric setting 1 699 0 1 42 37 2 4 3 1 Proportional 2 455 1 2 30 50 1 3 111 1 hazards models 3 705 0 1 32 57 0 3 165 1

Parametric 4 694 0 2 50 40 1 3 95 1 survival 5 86 0 2 35 37 0 4 34 1 models ... 97 101 0 1 39 37 2 . . 1 98 709 0 2 35 66 2 3 103 1 99 464 0 1 27 37 . . . 2 100 537 1 2 30 76 1 4 1 1 How is survival influenced by gender and age ? Basic  Define agecat = 1 if age < 40 years concepts = ≥ Nonparametric 2 if age 40 years estimation  Define gender = 1 if male Hypothesis testing in a = 2 if female nonparametric setting  Fit a semiparametric PH model including gender and Proportional agecat as covariates : hazards ˆ models • βagecat = 0.226 (HR=1.25) ˆ Parametric • βgender = 1.120 (HR=3.06) survival • models Inverse of the observed information matrix :  0.4645 0.1476  I−1(βˆ) = 0.1476 0.4638 ˆ • 95% CI for βagecat : [-1.11, 1.56] 95% CI for HR of old vs. young : [0.33, 4.77] ˆ • 95% CI for βgender : [-0.21, 2.45] 95% CI for HR of female vs. male : [0.81, 11.64] Basic concepts Partial likelihood for data with tied observations Nonparametric  Events are typically observed on a discrete time scale estimation

Hypothesis ⇒ Censoring and event times can be tied testing in a nonparametric  If ties between censoring time(s) and an event time setting ⇒ we assume that Proportional hazards • the censoring time(s) fall just after the event time models ⇒ they are still in the risk set of the event time Parametric survival  If ties between event times of two or more subjects : models Kalbfleish and Prentice (1980) proposed an appropriate likelihood function, but • rarely used due to its complexity • different approximations have been proposed Approximation proposed by Breslow (1974) : r Q exp xt β Basic Y l:yl =y(j),δl =1 l concepts (β) = L d hP t i (j) Nonparametric j=1 exp x β k:yk ≥y k estimation (j)

Hypothesis testing in a nonparametric Approximation proposed by Efron (1977) : setting r Q t  Proportional l:y =y ,δ =1 exp x β hazards Y l (j) l l models L(β) = Vj (β) j=1 Parametric survival models where

d(j) Y  X t  Vj (β) = exp xk β h=1 k:yk ≥y(j)

h − 1 X t   − exp xl β d(j) l:yl =y(j),δl =1 Basic concepts

Nonparametric estimation

Hypothesis Approximation proposed by Cox (1972) : testing in a nonparametric r Q exp xt β setting Y l:yl =y(j),δl =1 l (β) = , L P P t  Proportional q∈Q h∈q exp xhβ hazards j=1 j models

Parametric with Qj the set of all possible combinations of d(j) subjects survival models from the risk set R(y(j)) Basic concepts Example : Effect of gender on survival of schizophrenic Nonparametric estimation patients Hypothesis  Fit a semiparametric PH model including gender as testing in a nonparametric covariate : setting ˆ ˆ Proportional Approx. Max(partial likel.) β s.e.(β) hazards models Breslow -776.11 0.661 0.164

Parametric Efron -775.67 0.661 0.164 survival models Cox -761.36 0.665 0.165  HR for female vs. male: 1.94  95% CI : [1.41; 2.69]  Contribution to the partial likelihood at time 1096 days Basic concepts • males : 68 at risk, 2 events Nonparametric • females : 12 at risk, no event estimation • Breslow : Hypothesis testing in a exp(2 × 0) nonparametric = 0.000120 setting (68 + 12 exp β)2 Proportional hazards models • Efron : Parametric exp(2 × 0) survival = 0.000121 models (68 + 12 exp β)(67 + 12 exp β)

• Cox : exp(2 × 0) = 0.000243 h 12 1268 68i exp(2β) 2 + exp(β) 1 1 + 2 Basic concepts

Nonparametric estimation Testing hypotheses in the framework of the semiparametric Hypothesis testing in a PH model nonparametric setting  Global tests : Proportional • hypothesis tests regarding the whole vector β hazards models  More specific tests : Parametric survival • hypothesis tests regarding a subvector of β models • hypothesis tests for contrasts and sets of contrasts Basic concepts Global hypothesis tests Nonparametric estimation  Hypotheses regarding the p-dimensional vector β : Hypothesis H : β = β testing in a 0 0 nonparametric setting H1 : β 6= β0 Proportional  statistic: hazards models 2 ˆ t ˆ ˆ  UW = β − β0 I β β − β0 Parametric survival with models • βˆ = maximum likelihood estimator • Iβˆ = observed information matrix 2 2 ⇒ Under H0, and for large sample size : UW ≈ χp Basic concepts  Likelihood ratio test statistic :   Nonparametric U2 = 2 `βˆ − `β  estimation LR 0

Hypothesis with testing in a  nonparametric • ` βˆ = log likelihood evaluated at βˆ setting  • ` β0 = log likelihood evaluated at β0 Proportional 2 2 hazards ⇒ Under H0, and for large sample size : U ≈ χ models LR p

Parametric  statistic : survival 2 t −1   models USC = U β0 I β0 U β0 with  • U β0 = score vector evaluated at β0 2 2 ⇒ Under H0, and for large sample size : USC ≈ χp Basic concepts Example : Effect of age and marital status on survival of

Nonparametric schizophrenic patients estimation  Model the survival as a function of age and marital Hypothesis testing in a status : nonparametric   setting βage Proportional H0 : β =  βmarried  = 0 hazards   models βalone again Parametric survival (βsingle = 0 to avoid overparametrization) models 2 2 −7  UW = 31.6; p-value : P(χ3 > 31.6) = 6 × 10 2 ULR = 30.6 2 USC = 33.5 Local hypothesis tests  β = (βt , βt )t β Basic Let 1 2 , where 2 contains the ‘nuisance’ concepts parameters Nonparametric estimation  Hypotheses regarding the q-dimensional vector β1 : Hypothesis : β = β testing in a H0 1 10 nonparametric setting H1 : β1 6= β10 Proportional  Partition the information matrix as hazards models " # I11 I12 Parametric I = survival I21 I22 models with I11 = matrix of partial derivatives of order 2 with respect to the components of β1 " # I11 I12 ⇒ I−1 = I21 I22  Note that the complete information matrix is required to 11 ˆ ˆ obtain I , except when β1 is independent of β2 Basic concepts  Define Nonparametric estimation ˆ β1 = maximum likelihood estimator Hypothesis testing in a of β1 nonparametric setting ˆ β2(β10) = maximum likelihood estimator Proportional hazards of β2 with β1 put equal to β10 models ˆ  Parametric U1 β10, β2(β10) = score subvector evaluated survival ˆ models at β10 and β2(β10) 11 ˆ  11 I β10, β2(β10) = matrix I for β1 evaluated ˆ at β10 and β2(β10) Basic concepts

Nonparametric estimation  Wald test : 2 ˆ t 11 ˆ −1 ˆ  2 Hypothesis UW = β1 − β10 I (β) β1 − β10 ≈ χq testing in a nonparametric  Likelihood ratio test : setting 2  ˆ ˆ  2 Proportional ULR = 2 `(β) − ` β10, β2(β10) ≈ χq hazards models  Score test : Parametric t 2  ˆ  11 ˆ  survival U = U1 β10, β2(β10) I β10, β2(β10) models SC  ˆ  2 ×U1 β10, β2(β10) ≈ χq Basic concepts Testing more specific hypotheses Nonparametric estimation  Consider a p × 1 vector of coefficients c Hypothesis  Hypothesis test : testing in a nonparametric t setting H0 : c β = 0

Proportional  Wald test statistic : hazards t −1 models 2 t ˆ t −1 ˆ  t ˆ UW = c β c I (β)c c β Parametric survival Under H0 and for large sample size : models 2 2 UW ≈ χ1  Likelihood ratio test and score test can be obtained in a similar way  If different linear combinations of the parameters are of Basic concepts interest, define Nonparametric  ct  estimation 1 C =  .  Hypothesis  .  testing in a t nonparametric cq setting with q ≤ p and assume that the matrix C has full rank Proportional hazards models  Hypothesis test :

Parametric H0 : Cβ = 0 survival models  Wald test statistic : 2 ˆt −1 ˆ t −1 ˆ UW = Cβ CI (β)C Cβ 2 2 Under H0 and for large sample size : UW ≈ χq  Likelihood ratio test and score test can be obtained in a similar way Basic concepts Example : Effect of age and marital status on survival of Nonparametric schizophrenic patients estimation

Hypothesis  H0 : βmarried = 0 testing in a nonparametric → ct = (0, 1, 0) setting 2 Proportional → Wald test statistic : 1.18; p-value: P(χ1 > 1.18) = 0.179 hazards models  H0 : βmarried = βalone again = 0 Parametric survival  0 1 0  models → C = 0 0 1 2 2 2 → Test statistics : UW = 31.6; ULR = 30.6; USC = 33.5 2 −7 → p-value (Wald) : P(χ2 > 31.6) = 1 × 10 Basic concepts

Nonparametric estimation Building multivariable semiparametric models Hypothesis testing in a  nonparametric including a continuous covariate setting  including a categorical covariate Proportional hazards  including different types of covariates models

Parametric  interactions between covariates survival models  time-varying covariates Basic Including a continuous covariate in the semiparametric PH concepts model Nonparametric estimation  For a single continuous covariate xi : Hypothesis testing in a h (t) = h (t) exp(βx ) nonparametric i 0 i setting where Proportional • hazards h0(t) = baseline hazard (refers to a subject with xi = 0) models hazard of a subject i with covar. x • exp(β) = i Parametric hazard of a subject j with covar. xj = xi − 1 survival models and is independent of the covariate xi and of t • exp(rβ) = hazard ratio of two subjects with a difference of r covariate units ⇒ βˆ = increase in log-hazard corresponding to a one unit increase of the continuous covariate Basic Example : Impact of age on survival of schizophrenic concepts patients Nonparametric estimation  Introduce age as a continuous covariate in the Hypothesis semiparametric PH model : testing in a nonparametric setting hi (t) = h0(t) exp(βageagei )

Proportional  βage = 0.00119 (s.e. = 0.00952). hazards models hazard for a subject of age i (in years)  HR = = 1.001 Parametric hazard for a subject of age i − 1 survival models 95% CI : [0.983, 1.020]  Other quantities can be calculated, e.g. hazard for a subject of age 40 hazard for a subject of age 30 = exp(10 × 0.00119) = 1.012 Including a categorical covariate in the semiparametric PH Basic concepts model

Nonparametric  For a single categorical covariate xi with l levels : estimation t Hypothesis hi (t) = h0(t) exp(β xi ), testing in a nonparametric where setting • β = (β1, . . . , βl ) Proportional hazards • xi is the covariate for subject i models  This model is overparametrized ⇒ restrictions : Parametric survival • Set β1 = 0 so that h0(t) corresponds to the hazard of a models subject with the first level of the covariate

• exp(βj ) = HR of a subject at level j relative to a subject at level 1 0 • exp(βj − βj0 ) = HR between level j and j ˆ ˆ ˆ ˆ ˆ ˆ (note that V (βj − βj0 ) = V (βj ) + V (βj0 ) − 2Cov(βj , βj0 )) • Other choices of restrictions are possible Example : Impact of marital status on survival of Basic concepts schizophrenic patients Nonparametric  Introduce marital status as a categorical covariate in estimation

Hypothesis the semiparametric PH model testing in a nonparametric hi (t) = h0(t) exp(βmarriedxi2 + βalone againxi3), setting where Proportional hazards • xi2 = 1 if patient is married, 0 otherwise models • xi3 = 1 if patient is alone again, 0 otherwise Parametric survival  Married vs single : models ˆ • βmarried = −0.206 (s.e. = 0.214) • HR = 0.814 (95%CI :[0.534, 1.240]), p = 0.34  Alone again vs single : ˆ • βalone again = 0.794 (s.e. = 0.185) • HR = 2.213 (95%CI :[1.540, 3.180]), p = 1.7 × 10−5 Basic concepts

Nonparametric estimation  Married vs alone again : Hypothesis ˆ ˆ • exp(βmarried − βalone again) = 0.368 testing in a nonparametric • Variance-covariance matrix : setting

Proportional ! βˆ  0.0460 0.0183  hazards married = models V ˆ βalone again 0.0183 0.0342 Parametric survival • ˆ ˆ models V (βmarried − βalone again) = 0.0436 • 95% CI : [0.244, 0.553] Including different covariates in the semiparametric PH model Basic concepts • Estimates for a particular parameter will then be

Nonparametric adjusted for the other parameters in the model estimation • Estimates for this particular parameter will be different Hypothesis testing in a from the estimate obtained in a univariate model nonparametric setting (except when the covariates are orthogonal) Proportional hazards Example : Impact of marital status and age on survival of models schizophrenic patients Parametric survival models hi (t) = h0(t) exp(βageagei + βmarriedxi2 + βalone againxi3)

Covariate βˆ s.e.(βˆ) HR 95% CI age -0.0154 0.0104 0.99 [0.97,1.01] married -0.3009 0.2238 0.74 [0.48,1.15] alone again 0.8195 0.1857 2.269 [1.58,3.27] Basic concepts between covariates Nonparametric  Interaction : the effect of one covariate depends on the estimation level of another covariate Hypothesis testing in a nonparametric  Continuous / categorical (j levels) : different hazard setting ratios are required for the continuous covariate at each Proportional hazards level of the categorical covariate models ⇒ add j − 1 parameters Parametric survival models  Categorical (j levels) / categorical (k levels) : for each level of one covariate, different HR between the levels of the other covariate with the reference are required ⇒ add (j − 1) × (k − 1) parameters Example : Impact of marital status and age on survival of Basic concepts schizophrenic patients

Nonparametric estimation hi (t) = h0(t) exp( βmarried × xi2 + βalone again × xi3 Hypothesis testing in a +βage × agei + βage | married × xi2 × agei nonparametric setting +βage | alone again × xi3 × agei ) Proportional hazards models Covariate βˆ s.e.(βˆ) HR 95% CI Parametric survival age -0.0238 0.0172 0.977 [0.94,1.01] models married -0.6811 0.8579 0.506 [0.09,2.72] alone again 0.3979 0.7475 1.489 [0.34,6.44] age|married 0.0129 0.0299 1.013 [0.96,1.07] age|alone again 0.0133 0.0228 1.013 [0.97,1.06] Basic  Effect of age in the reference group (single) : concepts ˆ Nonparametric exp(βage) = exp(−0.0238) = 0.977 estimation  Effect of age in the married group : Hypothesis testing in a ˆ ˆ nonparametric exp(βage + βage|married) = exp(−0.0238 + 0.0129) setting = 0.989 Proportional hazards models  Effect of age in the alone again group : ˆ ˆ Parametric exp(βage + βage|alone again) = exp(−0.0238 + 0.0133) survival models = 0.990  Likelihood ratio test for the interaction : 2 ULR = 0.76 2 P(χ2 > 0.76) = 0.684 ˆ  HRmarried = exp(βmarried) = 0.506

Basic = HR of a married subject relative to a concepts single subject at the age of 0 year Nonparametric estimation ⇒ more relevant to express the age as the difference Hypothesis between a particular age of interest (e.g. 30 years) testing in a nonparametric setting ⇒ has impact on parameter estimates of differences

Proportional between groups, but not on parameter estimates hazards models related to age

Parametric survival models Covariate βˆ s.e.(βˆ) HR 95% CI age -0.0238 0.0172 0.977 [0.94,1.01] married -0.2928 0.2378 0.746 [0.47,1.19] alone again 0.7971 0.1911 2.219 [1.53,3.23] age|married 0.0129 0.0299 1.013 [0.96,1.07] age|alone again 0.0133 0.0228 1.013 [0.97,1.06] Example : Impact of marital status and gender on survival of schizophrenic patients Basic concepts hi (t) = h0(t) exp(βmarried × xi2 + βalone again × xi3 Nonparametric +β × estimation female genderi

Hypothesis +βfemale|married × xi2 × genderi testing in a nonparametric +β × x × gender ) setting female|alone again i3 i

Proportional hazards models Covariate βˆ s.e.(βˆ) HR 95% CI Parametric survival female 0.520 0.286 1.681 [0.96, 2.95] models married -0.253 0.26 0.776 [0.47, 1.29] alone again 0.807 0.236 2.242 [1.41, 3.56] female|married 0.389 0.46 1.476 [0.60, 3.64] female|alone again -0.146 0.372 0.865 [0.42, 1.79]

,→ Likelihood ratio test for the interaction : 2 2 ULR = 1.94; P(χ2 > 1.94) = 0.23 Time varying covariates Basic concepts  In some applications, covariates of interest change with Nonparametric estimation time

Hypothesis  Extension of the Cox model : testing in a nonparametric h (t) = h (t) exp(βt x (t)) setting i 0 i

Proportional ⇒ Hazards are no longer proportional hazards models  Estimation of β :

Parametric • Let xk (y) be the covariate vector for subject k at time y survival models • Define the partial likelihood : δ n " t  # i Y exp xi (yi ) β (β) = L P t exp (xk (yi ) β) i=1 k∈R(yi ) • Let ˆ β = argmaxβL(β) Basic concepts Example : Time varying covariates in data on the first time Nonparametric estimation to insemination for cows Hypothesis  Aim : find constituent in milk that is predictive for the testing in a nonparametric hazard of first insemination setting • one possible predictor is the ureum concentration Proportional hazards • milk ureum concentration changes over time models

Parametric  Information for an individual cow i (i = 1,..., n) : survival  models yi , δi , xi (ti1) ,..., xi tiki Covariate is determined only once a month ⇒ Value at time t is determined by linear interpolation Basic concepts  Ureum concentration is introduced as a time-varying Nonparametric covariate in the semiparametric PH model : estimation

Hypothesis hi (t) = h0(t) exp(βxi (t)), testing in a nonparametric where setting • hi (t) = hazard of first insemination at time t for cow i Proportional hazards having at time t ureum concentration equal to xi (t) models • β = linear effect of the ureum concentration on the Parametric log-hazard of first insemination survival models  βˆ = −0.0273 (s.e. = 0.0162) HR = exp(−0.0273) = 0.973 95% CI = [0.943, 1.005] p-value = 0.094 Basic concepts Model building strategies for the semiparametric PH model Nonparametric estimation  Often not clear what criteria should be used to decide Hypothesis which covariates should be included testing in a nonparametric setting  Should be based first on meaningful interpretation and

Proportional biological knowledge hazards models  Different strategies exist : Parametric • Forward selection survival models • Backward selection • Forward stepwise selection • Backward stepwise selection • AIC selection Basic  Forward procedure : concepts • First, include the covariate with the smallest p-value Nonparametric • Next, consider all possible models containing the estimation selected covariate and one additional covariate, and Hypothesis testing in a include the covariate with the smallest p-value nonparametric setting • Continue doing this until all remaining non-selected

Proportional covariates are non-significant hazards models  Backward procedure :

Parametric • First, start from the full model that includes all survival models covariates • Next, consider all possible models containing all covariates except one, and remove the covariate with the largest p-value • Continue doing this until all remaining covariates in the model are significant  Forward / backward stepwise procedure : Basic concepts Start as in the forward / backward procedure, but an included / removed covariate can be excluded / Nonparametric estimation included at a later stage, if it is no longer significant / Hypothesis non-significant with other covariates in the model testing in a nonparametric  Note that the above p-values can be based on either setting

Proportional the Wald, likelihood ratio or score test hazards models  Akaike’s information criterion (AIC) : instead of

Parametric including / removing covariates based on their p-value, survival models we look at the AIC : AIC = −2 log(L) + kp where • p = number of parameters in the model • L = likelihood • k = constant (often 2) Example : Model building in the schizophrenic patients dataset Basic concepts  Univariate models : Nonparametric −7 estimation Marital status p = 6.7 × 10 −5 Hypothesis Gender p = 9.7 × 10 testing in a nonparametric Educational status p = 0.663 setting Age p = 0.9 Proportional hazards models  Forward procedure :

Parametric • Start with a model containing marital status survival models • Fit model containing marital status and one of the three remaining covariates ⇒ Gender has smallest p-value • Fit model containing marital status, gender and one of the two remaining covariates ⇒ None of the remaining covariates (educational status and age) is significant ⇒ Final model contains marital status and gender Survival function estimation in the semiparametric model

Basic  Survival function for subject with covariate xi : concepts

Nonparametric Si (t) = exp(−Hi (t)) estimation = exp(−H (t) exp(βt x )) Hypothesis 0 i t testing in a exp(β xi ) nonparametric = (S0(t)) setting R t Proportional with S0(t) = exp(−H0(t)) and H0(t) = 0 h0(s)ds hazards models  Estimate the baseline cumulative hazard H0(t) by Parametric ˆ X ˆ survival H0(t) = h0(j), models j:y(j)≤t where ˆ d(j) h0(j) =   P exp xt βˆ k∈R(y(j)) k extends the Breslow estimator to the case of tied observations Basic concepts Nonparametric  Define estimation

Hypothesis ˆt  exp(β xi ) testing in a ˆ ˆ nonparametric Si (t) = S0(t) , setting Proportional ˆ ˆ hazards with S0(t) = exp(−H0(t)) models

Parametric  It can be shown that survival models Sˆ (t) − S (t) i i →d N(0, 1) 1/2 ˆ V (Si (t)) Example : Survival function estimates for marital status groups in the schizophrenic patients data Basic concepts

Nonparametric estimation 1 Single Hypothesis Married Alone again testing in a nonparametric setting 0.8

Proportional hazards models 0.6

Parametric survival Estimated survival models 0.4 0.2 0

0 500 1000 1500 2000

Time Basic concepts

Nonparametric estimation Hypothesis Consider e.g. survival at 505 days : testing in a nonparametric setting

Proportional Single group : 0.755 95% CI : [0.690, 0.827] hazards models Married group : 0.796 95% CI : [0.730, 0.867]

Parametric Alone again group : 0.537 95% CI : [0.453, 0.636] survival models Stratified semiparametric PH model

Basic concepts  The assumption that h0(t) is the same for all subjects

Nonparametric might be too strong in practice estimation ⇒ Possible solution : consider groups (strata) of Hypothesis testing in a subjects with the same baseline hazard nonparametric setting  Stratified PH model : the hazard of subject j Proportional (j = 1,..., n ) in stratum i (i = 1,..., s) is given by hazards i models t hij (t) = hi0(t) exp xij β) Parametric survival  Extension of the partial likelihood : models  δij s ni t Y Y exp(xij β) L(β) =    P exp(xt β) i=1 j=1 il l∈Ri (yij ) ⇒ Risk set for a subject contains only the subjects still at risk within the same stratum Example : Stratified PH model for the time to first Basic concepts insemination dataset Nonparametric estimation  Cows are coming from different farms

Hypothesis ⇒ baseline hazard might differ considerably between testing in a nonparametric farms (even if the effect of the ureum concentration is setting similar) Proportional hazards  Consider the effect of the ureum concentration in milk models on the time to first insemination, stratifying on the Parametric survival farms : models βˆ = −0.0588 (s.e. = 0.0198) HR = 0.943 95% CI = [0.907, 0.980] ⇒ By stratifying on the farms, ureum concentration becomes significant Basic concepts Checking the proportional hazards assumption Nonparametric estimation  PH assumption : HR between two subjects with

Hypothesis different covariates is constant over time testing in a nonparametric  Formal tests and diagnostic plots have been developed setting

Proportional to check this assumption hazards models  Formal test :

Parametric • Add βl xi × t to the PH model : survival models hi (t) = h0(t) exp(βxi + βl xi × t)

• If βl 6= 0, the PH assumption does not hold • Instead of adding βl xi × t, one can also add βl xi × g(t) for some function g Basic concepts  Diagnostic plots :

Nonparametric • Consider for simplicity the case of a covariate with r estimation levels Hypothesis • Estimate the cumulative hazard function for each level testing in a nonparametric of the covariate by of the Nelson-Aalen estimator setting ˆ ˆ ˆ ⇒ H1(t), H2(t),..., Hr (t) should be constant multiples Proportional hazards of each other : models

Parametric survival Plot PH assumption holds if models ˆ ˆ log(H1(t)), ..., log(Hr (t)) vs t parallel curves ˆ ˆ log(Hj (t)) − log(H1(t)) vs t constant lines ˆ ˆ Hj (t) vs H1(t) straight lines through origin Example : PH assumption for the gender effect in the

Basic schizophrenic patients dataset concepts 1 Nonparametric 3.0

Male 0 estimation 2.5 Female −1 Hypothesis 2.0 −2 testing in a 1.5 −3 nonparametric 1.0 Cumulative hazard log(Cumulative hazard) setting −4 0.5 Male Female −5 Proportional 0.0 hazards 0 500 1000 1500 0 500 1000 1500 models Time Time

Parametric survival 1.0

models 1.5 0.5 1.0 0.0 0.5 Cumulative hazard Female log(ratio cumulative hazards) −0.5 0.0

0 500 1000 1500 0.0 0.2 0.4 0.6 0.8 1.0 1.2

Time Cumulative hazard Male Basic concepts

Nonparametric estimation

Hypothesis testing in a nonparametric setting

Proportional Parametric survival models hazards models

Parametric survival models Basic concepts Some common parametric distributions Nonparametric estimation : Hypothesis testing in a  Characterized by one parameter λ > 0 : nonparametric setting S0(t) = exp(−λt) Proportional hazards f0(t) = λ exp(−λt) models

Parametric h0(t) = λ survival models → leads to a constant hazard function

 Empirical check : plot of the log of the survival estimate versus time Basic Hazard and survival function for the exponential distribution concepts

Nonparametric estimation

Hypothesis testing in a 0.4 1.0 nonparametric setting

Proportional 0.8 Lambda=0.14 0.3 hazards Lambda=0.14 models 0.6

Parametric 0.2 Hazard survival Survival models 0.4 0.1 0.2 0.0 0.0

0 2 4 6 8 10 0 2 4 6 8 10

Time Time Basic : concepts

Nonparametric  Characterized by a λ > 0 and a shape estimation parameter ρ > 0 : Hypothesis ρ testing in a S0(t) = exp(−λt ) nonparametric ρ−1 ρ setting f0(t) = ρλt exp(−λt ) Proportional ρ−1 hazards h0(t) = ρλt models

Parametric → hazard decreases if ρ < 1 survival models → hazard increases if ρ > 1 → hazard is constant if ρ = 1 (exponential case)

 Empirical check : plot log cumulative hazard versus log time Basic Hazard and survival function for the Weibull distribution concepts

Nonparametric estimation Hazard and survival functions for Weibull distribution Hypothesis testing in a 0.4 1.0 nonparametric setting Lambda=0.31, Rho=0.5 Lambda=0.31, Rho=0.5 Lambda=0.06, Rho=1.5 Lambda=0.06, Rho=1.5 0.8

Proportional 0.3 hazards models 0.6

Parametric 0.2 Hazard Survival survival 0.4 models 0.1 0.2 0.0 0.0

0 2 4 6 8 10 0 2 4 6 8 10

Time Time Log-logistic distribution : Basic concepts  A random variable T has a log-logistic distribution if

Nonparametric logT has a logistic distribution estimation

Hypothesis  Characterized by two parameters λ and κ > 0 : testing in a nonparametric 1 setting S (t) = 0 1 + (tλ)κ Proportional hazards κtκ−1λκ models ( ) = f0 t 2 Parametric [1 + (tλ)κ] survival models κtκ−1λκ h (t) = 0 1 + (tλ)κ

 The median event time is only a function of the parameter λ : M(T ) = exp(1/λ) Basic Hazard and survival function for the log-logistic distribution concepts

Nonparametric estimation

Hypothesis testing in a 0.4 1.0 nonparametric setting Lambda=0.2, Kappa=1.5 Lambda=0.2, Kappa=1.5 Lambda=0.2, Kappa=0.5 Lambda=0.2, Kappa=0.5 Proportional 0.8 0.3 hazards models 0.6

Parametric 0.2 Hazard survival Survival models 0.4 0.1 0.2 0.0 0.0

0 2 4 6 8 10 0 2 4 6 8 10

Time Time Log-normal distribution : Basic concepts  Resembles the log-logistic distribution but is

Nonparametric mathematically less tractable estimation

Hypothesis  A random variable T has a log-normal distribution if testing in a nonparametric logT has a normal distribution setting

Proportional  Characterized by two parameters µ and γ > 0 : hazards models log(t) − µ S0(t) = 1 − FN √ Parametric γ survival models 1  1  f (t) = √ exp − (log(t) − µ)2 0 t 2πγ 2γ

 The median event time is only a function of the parameter µ : M(T ) = exp(µ) Basic Hazard and survival function for the log-normal distribution concepts

Nonparametric estimation

Hypothesis testing in a 0.4 1.0 nonparametric setting Mu=1.609, Gamma=0.5 Mu=1.609, Gamma=0.5 Mu=1.609, Gamma=1.5 Mu=1.609, Gamma=1.5 Proportional 0.8 0.3 hazards models 0.6

Parametric 0.2 Hazard survival Survival models 0.4 0.1 0.2 0.0 0.0

0 2 4 6 8 10 0 2 4 6 8 10

Time Time Basic concepts Parametric survival models

Nonparametric estimation The parametric models considered here have two

Hypothesis representations : testing in a nonparametric  Accelerated failure time model (AFT) : setting t Proportional Si (t) = S0(exp(θ xi )t), hazards models where t Parametric • θ = (θ1, . . . , θp) = vector of regression coefficients survival t models • exp(θ xi ) = acceleration factor • S0 belongs to a parametric family of distributions Hence, t  t  hi (t) = exp θ xi h0 exp(θ xi )t and Basic t concepts Mi = exp(−θ xi )M0 Nonparametric where M = median of S , since estimation i i 1 t  Hypothesis S0(M0) = = Si (Mi ) = S0 exp(θ xi )Mi testing in a 2 nonparametric setting Proportional Ex : For one binary variable (say treatment (T) and hazards models control (C)), we have MT = exp(−θ)MC : Parametric survival models Control Treated Survival function 0 0.25 0.5 0.75 1

M C M T 0.0 0.5 1.0 1.5 2.0 Time Basic concepts

Nonparametric  : estimation t log ti = µ + γ xi + σwi , Hypothesis testing in a where nonparametric setting • µ = intercept t Proportional • γ = (γ1, . . . , γp) = vector of regression coefficients hazards models • σ = scale parameter

Parametric • W has known distribution survival models  These two models are equivalent, if we choose

• S0 = survival function of exp(µ + σW ) • θ = −γ Basic concepts

Nonparametric estimation Indeed,

Hypothesis Si (t) = P(ti > t) testing in a nonparametric = P(log t > log t) setting i t Proportional = P(µ + σwi > log t − γ xi ) hazards t  models = S0 exp(log t − γ xi ) Parametric t  survival = S0 t exp(θ xi ) models

⇒ The two models are equivalent Basic Weibull distribution concepts  Consider the accelerated failure time model Nonparametric estimation t  Hypothesis Si (t) = S0 exp(θ xi )t , testing in a nonparametric setting α where S0(t) = exp(−λt ) is Weibull Proportional hazards t α models ⇒ Si (t) = exp − λ exp(β xi )t ) with β = αθ Parametric α−1 t t α survival ⇒ fi (t) = λαt exp(β xi ) exp − λ exp(β xi )t ) models α−1 t t ⇒ hi (t)= αλt exp(β xi )= h0(t) exp(β xi ),

α−1 with h0(t) = αλt the hazard of a Weibull ⇒ We also have a Cox PH model Basic concepts  The above model is also equivalent to the following

Nonparametric linear model : estimation log t = µ + γt x + σw , Hypothesis i i i testing in a nonparametric where W has a standard extreme value distribution, i.e. setting w SW (w) = exp(−e ). Indeed, Proportional  hazards P(W > w) = P exp(µ + σW ) > exp(µ + σw) models  Parametric = S0 exp(µ + σw) survival  models = exp − λ exp(αµ + ασw) Since W has a known distribution, it follows that λ exp(αµ) = 1 and ασ = 1, and hence P(W > w)= exp(−ew ) Basic concepts  It follows that Nonparametric Weibull accelerated failure time model estimation

Hypothesis = Cox PH model with Weibull baseline hazard testing in a nonparametric = Linear model with standard extreme value error setting distribution Proportional hazards models and

Parametric • θ = −γ = β/α survival • α = 1/σ models • λ = exp(−µ/σ)  Note that the Weibull distribution is the only continuous distribution that can be written as an AFT model and as a PH model Basic Log-logistic distribution concepts  Consider the accelerated failure time model Nonparametric estimation S (t) = S exp(θt x )t, Hypothesis i 0 i testing in a nonparametric α setting where S0(t) = 1/[1 + λt ] is log-logistic Proportional 1 hazards ⇒ ( ) = β = αθ models Si t t α with 1 + λ exp(β xi )t Parametric survival Si (t) 1 models ⇒ = t α 1 − Si (t) λ exp(β xi )t

t S0(t) = exp(−β xi ) 1 − S0(t) ⇒ We also have a so-called proportional odds model Basic  The above model is also equivalent to the following concepts linear model : Nonparametric estimation t log ti = µ + γ xi + σwi , Hypothesis testing in a where W has a standard logistic distribution, i.e. nonparametric setting SW (w) = 1/[1 + exp(w)]. Indeed, Proportional P(W > w) = P exp(µ + σW ) > exp(µ + σw) hazards models  = S0 exp(µ + σw) Parametric survival  models = 1/ 1 + λ exp(αµ + ασw)] Since W has a known distribution, it follows that λ exp(αµ) = 1 and ασ = 1, and hence 1 P(W > w)= 1 + exp(w) Basic  It follows that concepts Log-logistic accelerated failure time model Nonparametric estimation = Proportional odds model with log-logistic baseline Hypothesis testing in a survival nonparametric setting = Linear model with standard logistic error Proportional hazards distribution models and Parametric survival • θ = −γ = β/α models • α = 1/σ • λ = exp(−µ/σ)  Note that the log-logistic distribution is the only continuous distribution that can be written as an AFT model and as a proportional odds model Other distributions

Basic concepts  Log-normal :

Nonparametric Log-normal accelerated failure time model estimation = Linear model with standard normal error Hypothesis testing in a distribution nonparametric setting  Generalized gamma : Proportional hazards ti follows a generalized if models t log ti = µ + γ xi + σwi , Parametric survival models where wi has the following density : 2 |θ|θ−2 exp(θw)1/θ exp − θ−2 exp(θw) f (w) = W Γ(1/θ2) If θ = 1 ⇒ Weibull model If θ = 1 and σ = 1 ⇒ exponential model If θ → 0 ⇒ log-normal model Estimation Basic  It suffices to estimate the model parameters in one of concepts

Nonparametric the equivalent model representations. Consider e.g. the estimation linear model : Hypothesis t testing in a log ti = µ + γ xi + σwi nonparametric setting  The likelihood function for right censored data equals Proportional n hazards Y δi 1−δi models L(µ, γ, σ)= fi (yi ) Si (yi )

Parametric i=1 survival n t δ models Y h 1 log yi − µ − γ xi i i = fW σyi σ i=1 t h log y − µ − γ x i1−δi × S i i W σ Since W has a known distribution, this likelihood can be maximized w.r.t. its parameters µ, γ, σ Basic concepts

Nonparametric estimation  Let Hypothesis testing in a nonparametric (ˆµ, γ,ˆ σˆ) = argmaxµ,γ,σL(µ, γ, σ) setting Proportional  It can be shown that hazards models • (ˆµ, γ,ˆ σˆ) is asymptotically unbiased and normal Parametric • survival The estimators of the accelerated failure time model (or models any other equivalent model) and their asymptotic distribution can be obtained from the Delta-method Basic concepts Nonparametric To select the best parametric model, we present two estimation

Hypothesis methods testing in a nonparametric  Selection of nested models : setting Consider the generalized gamma model as the ‘full’ Proportional hazards model, and test whether models • θ = 1 ⇒ Weibull model Parametric • θ = 1 and σ = 1 ⇒ exponential model survival models • θ = 0 ⇒ log-normal model The test can be done using the Wald, likelihood ratio or score test statistic derived from the likelihood for censored data Basic concepts

Nonparametric estimation  AIC selection : Hypothesis AIC = −2 log L + 2(p + 1 + k), testing in a nonparametric where setting • + = (µ, γ) Proportional p 1 dimension of hazards • k = 0 for the exponential model models • k = 1 for the Weibull, log-logistic, log-normal model Parametric survival • k = 2 for the generalized gamma model models and minimize the AIC among all candidate parametric models Basic concepts

Nonparametric estimation

Hypothesis testing in a nonparametric setting The End Proportional hazards models

Parametric survival models