<<

AMS 588 – Failure and Survival Analysis

Instructor: Prof. Wei Zhu 1. Introduction Some Basics • Binomial distribution The Binomial distribution is based on the idea of a Bernoulli trial, which is an with two, and only two, possible outcomes.

푌1,…푌푛 iid Bernoulli(p), 푝(푌𝑖 = 1) = 휋 = 1 − p(푌𝑖 = 0)

푌𝑖 = 1 is often termed as a “success” and 휋 is referred to as the success .

퐸 푌𝑖 = 휋; 푉푎푟 푌𝑖 = 휋(1 − 휋) The total number of “success” in 푛 trails follows a Binomial distribution: trials pmf: for i=0,1,…,n

Binomial Theorem: For any real numbers푛and 푦 and integer 푛 ≥ 0, 푛 푛 (푥 + 푦)푛= ෍ 푥𝑖푦푛−𝑖 𝑖 𝑖=0 Some Basics • Hypergeometric distribution Suppose we have a large urn filled with 푁 balls that are identical in every way except that 푀 are red and N − M are green. We reach in, blindfolded, and select 퐾 balls at random. The number of red balls 푋 in a sample of size 퐾 follows a Hypergeometric 푀 푁−푀 distribution: 푥 퐾−푥 푃 푋 = 푥 푁, 푀, 퐾 = 푁 , 푥 = 0,1, … , 퐾 퐾푀 퐾 E X = μ = ; 푁 퐾푀 (푁 − 푀)(푁 − 퐾) 푉푎푟(푋) = 휎2 = 푁 푁(푁 − 1)

Example: Let 푋1 and 푋2 be independent observations with 푋1~푏𝑖푛표푚𝑖푎푙 푛1, 푝1 and 푋2~푏𝑖푛표푚𝑖푎푙 푛2, 푝2 . Consider testing 퐻0: 푝1 = 푝2 푣푠 퐻1: 푝1 > 푝2. It is easy to show that under 퐻0: 푝1 = 푝2 = 푝, 푋 = 푋1 + 푋2is a sufficient for 푝, and 푋1|푋 = 푥~ℎ푦푝푒푟푔푒표푚푒푡푟𝑖푐(푛1 + 푛2, 푛1, 푥) -- Fisher’s

The data can also be formulated into a 2x2 table Under 퐻0: 푝1 = 푝2 x1 n1-x1 n1 푛1 푛2 푥 푥−푥 x2 n2-x2 n2 1 1 (hypergeometric) 푓 푥1 푛1, 푛2, 푥 = 푛 x n-x n 푥 Some Basics • Iterative expectation & If 푋 and 푌 are any two random variables, then 퐸푋 = 퐸 퐸(푋|푌) 푉푎푟 푋 = 퐸 푉푎푟 푋 푌 + 푉푎푟 퐸 푋|푌 Provided that the expectations exist

Example: 푋|푃~binomial(푛, 푃) and 푃~beta(α, β) α 퐸푋 = 퐸 퐸(푋|푃) = 퐸 푛푃 = 푛 α + β 푉푎푟 푋 = 퐸 푉푎푟 푋 푃 + 푉푎푟 퐸 푋|푃 = 퐸 푛푃(1 − 푃) + 푉푎푟 푛푃 αβ αβ = 푛 + 푛2 α + β α + β + 1 α + β 2 α + β + 1 αβ α + β + 푛 = 푛 α + β 2 α + β + 1 Some Basics :

n

Delta method: In many biomedical applications the primary endpoint of interest is time to a certain event, such as time to death; time it takes for a patient to respond to a therapy; time from response until disease relapse (i.e., disease returns); time to eradicate an infection after treatment with antibiotics; etc. We may be interested in • characterizing the distribution of “time to event” for a given population ; • comparing this “time to event” among different groups (e.g., treatment vs. control in a or an ); • modeling the relationship of “time to event” to other covariates (sometimes called prognostic factors or predictors). Difficulties: • : Since the data are collected over a finite period of time, the “time to event” may not be observed for all the individuals in our study population (sample). This results in what is called censored data. • Differential follow-up: It is also common that the amount of follow-up for the individuals in a sample vary from subject to subject. The standard statistical methods cannot properly handle these difficulties in the analysis of the “time to event” data. A new research area in statistics has emerged which is called Survival Analysis or Censored Survival Analysis. “Time To Event” Let the T denote time to the event of our interest. Of course, T is a positive random variable which has to be unambiguously defined; that is, we must be very specific about the start and end with the length of the time period in-between corresponding to T. EX: Survival time of a treatment for a population with certain disease: measured from the time of treatment initiation until death.

cdf of T: 퐹 푡 = 푃 푇 ≤ 푡 , 푡 ≥ 0 Right continuous: lim 퐹(푢) = 퐹 푡 푢→푡+ When 푇 is a survival time, 퐹 푡 is the probability that a randomly selected subject from the population will die before time 푡. 푑퐹 푡 푡 pdf of T: 푓 푡 = , 퐹 푡 = ׬ 푓 푢 푑푢 푑푡 0 In biomedical applications, it is often common to use the survival function 푆 푡 = 푃 푇 ≥ 푡 = 1 − 퐹(푡−) 푤ℎ푒푟푒, 퐹 푡− = lim 퐹(푢) 푢→푡− When T is a survival time, S(t) is the probability that a randomly selected individual will survive to time t or beyond. Note: Sometimes, a survival function may be defined as 푆 푡 = 푃 푇 > 푡 = 1 − 퐹 푡 . This definition will be identical to the above one if T is a continuous random variable, which is the case we will focus on in this course. 푆 푡

• Non-increasing; • 푆 0 = 1 and 푆 ∞ = 0 for a proper random variable, which that everyone will eventually experience the event, e.g. death. • However, we may also allow the possibility that 푆 ∞ > 0. This corresponds to a situation where there is a positive probability of not “dying” or not experiencing the event. For example, if the event of interest is the time from response until disease relapse and the disease has a cure for some proportion of individuals in the population, then we have 푆 ∞ > 0, where 푆 ∞ corresponds to the proportion of cured individuals. • If 푇 is continuous r.v., ∞ 푑푆 푡 푆 푡 = න 푓 푢 푑푢 , 푓 푡 = − 푡 푑푡 That is, there is a one-to-one correspondence between 푓 푡 and 푆 푡 .

Here f(t) is the usual probability density function (pdf), where f(t) = dF(t)/dt Definitions Survival Time: μ = 퐸(푇). Due to censoring, sample mean of observed survival times is no longer an unbiased estimate of μ = 퐸(푇). If we can estimate 푆 푡 well, then we can estimate μ = 퐸(푇) using the following fact: ∞ ∞ ∞ 퐸 푇 = න 푡푑퐹 푡 = − න 푡푑푆 푡 = න 푆 푡 푑푡 0 0 0 Survival Time: Median survival time 푚 is defined as the quantity 푚 satisfying 푆 푚 = 0.5. Sometimes denoted by 푡0.5. If 푆 푡 is not strictly decreasing, 푚 is the smallest one such that 푆 푚 ≤ 0.5. pth quantile of Survival Time (100pth ): 푡푝 such that 푆 푡푝 = 1 − 푝 0 < 푝 < 1 . If 푆 푡 is not strictly decreasing, 푡푝 is the smallest one such that 푆 푡푝 ≤ 1 − 푝 Mean Residual Life Time(푚푟푙): 푚푟푙 푡0 = 퐸[푇 − 푡0|푇 ≥ 푡0] i.e., 푚푟푙 푡0 = average remaining survival time given the population has survived beyond 푡0. It can be shown that ∞ ׬ 푆 푡 푑푡 푡0 푚푟푙 푡0 = 푆 푡0 EX:

Figure 1.1: The survival function for a hypothetical population

In this population: • 70% of the individuals will survive at least 2 years (i.e., 푡0.3 = 2) and • the median survival time is 2.8 years (i.e., 50% of the population will survive at least 2.8 years). is stochastically larger than

Figure 1.2:

We say that the survival distribution for group 1 is stochastically larger than the survival distribution for group 2 if 푆1 푡 ≥ 푆2 푡 , for all 푡 ≥ 0, where 푆𝑖 푡 is the survival function for group 𝑖. If 푇𝑖 is the corresponding survival time for groups 𝑖, we also say that 푇1 is stochastically (not deterministically) larger than 푇2. Note that 푇1 being stochastically larger than 푇1 does NOT necessarily imply that 푇1 ≥ 푇2. Mortality Rate The mortality rate at time 푡, where 푡 is generally taken to be an integer in terms of some unit of time (e.g., years, months, days, etc), is the proportion of the population who fail (die) between times 푡 and 푡 + 1among individuals alive at time 푡, , i.e., 푚 푡 = 푃[푡 ≤ 푇 < 푡 + 1|푇 ≥ 푡]

Figure 1.3: A typical mortality pattern for human Hazard Rate The hazard rate is the limit of a mortality rate if the interval of time is taken to be small (rather than one unit). The hazard rate is the instantaneous rate of failure (experiencing the event) at time 푡 given that an individual is alive at time 푡. 푃[푡 ≤ 푇 < 푡 + ℎ|푇 ≥ 푡] 휆 푡 = lim ℎ→0 ℎ 푃[푡 ≤ 푇 < 푡 + ℎ] lim ℎ = ℎ→0 푃[푇 ≥ 푡] 푓(푡) 푆′ 푡 푑푙표푔{푆 푡 } = = − = − 푆(푡) 푆 푡 푑푡

• A useful way of describing the distribution of “time to event” because it has a natural interpretation that relates to the aging of a population; • Very popular in biomedical community. • If ℎ is very small, we have 푃 푡 ≤ 푇 < 푡 + ℎ 푇 ≥ 푡 ≈휆 푡 ℎ 푡 Λ 푡 = ׬0 휆 푢 푑푢 = −푙표푔{푆 푡 }, where Λ 푡 is referred to as the cumulative • hazard function. Here we used the fact that S(0) = 1. Hence, 푡 ׬ 휆 푢 푑푢 − 푆 푡 = 푒−Λ 푡 = 푒 0 Note: 1. There is a one-to-one relationship between hazard rate 휆 푡 , 푡 ≥ 0, and survival function 푆 푡 , namely, 푡 { ׬ 휆 푢 푑푢 푑푙표푔{푆 푡 − 푆 푡 = 푒 0 , 휆 푡 = − 푑푡 2. The hazard rate is NOT a probability, but a probability rate. Therefore it is possible that a hazard rate can exceed one in the same fashion as a density function푓 푡 may exceed one.

Common Parametric Models for survival data:

(See page 38 of Klein and Moeschberger (textbook) for more distributions.) 휆 푡 = 휆, 푆 푡 = 푒−휆푡 and 푓 푡 = 휆푒−휆푡 ∞ ∞ ∞ 1 . = The Mean Survival Time: μ = 퐸 푇 = ׬ 푡푓(푡)푑푡 = ׬ 푆 푡 푑푡 = ׬ 푒−휆푡푑푡 0 0 0 휆 푙표𝑔2 The Median Survival Time: 푡 = , since 푆 푡 = 푒−휆푡0.5 = 0.5. 0.5 휆 0.5 The Mean Residual Life Time after 푡0 ∞ ∞ ׬ 푆 푡 푑푡 ׬ 푒−휆푡푑푡 푡0 푡0 1 푚푟푙 푡0 = = = = 퐸(푇) 푆 푡0 푆 푡0 휆 Notice that log 푆 푡 = −휆푡. Therefore, sometimes it is useful to plot the survival distribution on a log scale, which can be used to check if the underlying true distribution of the survival time is exponential or not given a data set.

푡 푡 Check Exponential Distribution

1. Suppose we can have an estimate 푆መ 푡 of 푆 푡 without assuming any distribution of the survival time (the Kaplan-Meier estimate to be discussed later is such an estimate). Then we can plot log 푆መ 푡 푣푠 푡 to see if it is approximately a straight line. A (approximate) straight line indicates that the exponential distribution may be a reasonable choice for the data.

2. Alternatively, we can assume the exponential distribution for the data and get the estimate of 푆 푡 = 푒−휆푡(we only need to estimate 휆; this kind of estimation will be discussed in Chapter 3). Denote this estimate by 푆መ1 푡 and Kaplan-Meier estimate by 푆መ퐾푀 푡 . If the exponential distribution assumption is correct, both estimates will be good estimates of the same survival function 푆 푡 = 푒−휆푡. Therefore, 푆መ1 푡 and 푆መ퐾푀 푡 should be close to each other and hence the plot 푆መ1 푡 푣푠 푆መ퐾푀 푡 should be approximately a straight line. A non-straight line indicates that the exponential distributional assumption is not appropriate. 훼 휆 푡 = 훼휆푡훼−1, 푆 푡 = 푒−휆푡 ∞ ∞ 훼 Γ(1+1/훼) . = The Mean Survival Time: μ = 퐸 푇 = ׬ 푆 푡 푑푡 = ׬ 푒−휆푡 푑푡 0 0 휆1/훼 푙표𝑔2 1/훼 훼 The Median Survival Time: 푡 = , since 푆 푡 = 푒−휆푡0.5 = 0.5. 0.5 휆 0.5

The Weibull model allows : • Constant hazard: 훼 = 1; • increasing hazard: 훼 > 1; • decreasing hazard: 훼 < 1.

Notice that: log log 푆 푡 = log 휆 + 훼log(푡). Therefore a straight line in the plot of log log 푆 푡 푣푠 log 푡 indicates a Weibull model. We can use the above equation to check if the Weibull model is a reasonable choice for the survival time given a data set. Alternatively, we can assume a Weibull model for the survival time and use the data to estimate 푆 푡 and plot this estimate against the Kaplan-Meier estimate as we proposed for the exponential distribution. A (approximate) straight line indicates the Weibull model is a reasonable choice for the data. Other Parametric Models for Survival Data • Log-normal; Exponential Power; Gompertz; Inverse Gaussian; Pareto; Gamma; Generalized Gamma… • See section 2.5 of Klein and Moeschberger (textbook) for more details.