Introduction to survival models
Overview
A survival model is a probabilistic model of a random variable that represents the time until the occurrence of an unpredictable event. For example, we may wish to study the life expectancy of a newborn baby, or the future working lifetime of a machine until it fails. In both cases, we study how long the subject may be expected to survive.

The theory that we will develop throughout this course can be applied in a wide range of situations in which the concept of “survival” may not be immediately obvious, for example:
• the time until a claim is made on an automobile insurance policy
• the time until a patient in a coma recovers from the coma, given that the patient recovers
• the time until a worker leaves employment.

The focus of our study, the time until the specified event, is known as a waiting time or a random time-to-event variable. Probabilities associated with these models play a central role in actuarial calculations such as pricing insurance contracts.
1 Introduction to survival models Chapter 1
1.1 The role of a survival model in a contingent payment model
Let’s start by considering the most basic contingent payment model, in which a specified amount is paid if and only if a particular event occurs. Suppose that an amount P is to be paid in n years if a random event E occurs. Otherwise, if the complementary event occurs, then nothing is to be paid. At an effective annual rate of interest i , the random present value of the payment is:
$$Z = \begin{cases} P \cdot v^n & \text{if } E \text{ occurs} \\ 0 & \text{if } E' \text{ occurs} \end{cases}$$
where $v = (1+i)^{-1}$ is the one-year present value discount factor.
The random present value of the payment, Z , is a discrete random variable. Its expected value is known as the actuarial present value of the payment, which incorporates the amount of the payment, the discount factor associated with the timing of the payment, and the probability of the payment being made:
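$$E[Z] = P \cdot v^n \cdot \Pr(E) + 0 \cdot \Pr(E') = \underbrace{P}_{\text{amount}} \cdot \underbrace{v^n}_{\text{discount}} \cdot \underbrace{\Pr(E)}_{\text{probability}}$$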
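As a quick numerical illustration of this formula, here is a minimal Python sketch. The function name `actuarial_present_value` and the input values (a payment of 1,000 in 10 years at 5% interest, with a 20% chance of payment) are hypothetical choices made for illustration, not figures from the text.

```python
def actuarial_present_value(amount, i, n, prob):
    """Return E[Z] = P * v^n * Pr(E) for a single contingent payment."""
    v = 1.0 / (1.0 + i)          # one-year present value discount factor
    return amount * v ** n * prob


# Hypothetical inputs: P = 1,000, i = 5%, n = 10 years, Pr(E) = 0.2
apv = actuarial_present_value(amount=1000.0, i=0.05, n=10, prob=0.2)
print(round(apv, 2))
```

The three factors in the return statement correspond directly to the amount, the discount, and the probability in the formula.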
Throughout Chapters 2 through 8 of this course, we will be concerned with the distribution of random present value of payment variables. We will frequently calculate the mean, the variance, a percentile, or the probability of some event regarding $Z$, such as $\Pr(Z > E[Z])$.
In Chapters 2 and 3 we will introduce contingent payment models that arise in the context of life insurance and life annuities. For these models, we will need to compute probabilities of events that are expressed in terms of the random future lifetime after age $x$, when an insurance contract has just been issued. For example, in order to calculate the appropriate life insurance premium for a 30-year-old policyholder, we need to calculate the probability that the policyholder will die before age 31, or 32, or 33, and so on.

In this first chapter, our aim is to familiarize you with the theory of survival models, the notation employed in these models, and standard terminology. There are three principal variables, all of which are measured in years:
• the random lifetime (ie time until death) of a newborn life is denoted $X$
• the random future lifetime at age $x$, given that a newborn has survived to age $x$, is denoted $T(x)$, ie $T(x) = X - x \mid X > x$
• the curtate future lifetime at age $x$, given that a newborn has survived to age $x$, is the complete number of years of future lifetime at age $x$ and is denoted $K(x)$, ie $K(x) = [T(x)]$ (greatest integer).

The variables $X$ and $T$ are assumed to be continuous random variables, whereas $K$ is discrete. For example, suppose a newborn life eventually dies at age 74.72. Then $X = 74.72$, $T(30) = 74.72 - 30 = 44.72$, and $K(30) = [T(30)] = [44.72] = 44$.
Notice that T is a function of X , and K is a function of T . So, the distributions of these three variables are closely related. These relationships will be developed over the rest of this chapter.
1.2 The life table – a discrete survival model
We begin by studying the life table, a discrete survival model commonly used in insurance applications. This model gives us the opportunity to gain an intuitive understanding of some of the most fundamental concepts before we study continuous time models.
We start by defining lx as the number of lives expected to survive to age x from a group of l0 newborn lives. A life table displays in a table format the values of lx at ages x equal to 0,1,2,… ,ω , where ω is the first whole number age at which there are no remaining lives in the group. If we are modeling human mortality, we may choose a value of ω of around 120 years. We’ll start by taking a deterministic view of future mortality. By this, we mean that the table tells us exactly how many of the l0 lives will be surviving at ages 1, 2, and so on. Here is a portion of a hypothetical life table:
x      0      1    2    3    4    5    6    7    8    9
l_x    1,000  991  985  982  979  976  972  968  964  959
d_x    9      6    3    3    3    4    4    4    5    6
The row labeled $d_x$ represents the number of lives among $l_0$ newborn lives that die in the age range $[x, x+1)$. It is computed as:
$$d_x = l_x - l_{x+1}$$
For example, since $l_2 = 985$ lives survive to age 2, and $l_3 = 982$ lives survive to age 3, then exactly $d_2 = l_2 - l_3 = 3$ lives must die between age 2 and age 3.

A number of probabilities can be computed from the entries in such a table. Let’s begin by introducing the standard notation for the most significant types of probabilities that you will see in the actuarial models throughout Chapters 2 – 8.
The probability that a life currently age $x$ will survive $n$ years is denoted ${}_np_x$. With our deterministic interpretation of the life table, along with the point of view that the probability of an event is the relative frequency with which it occurs, we have:
$${}_np_x = \frac{l_{x+n}}{l_x}$$
It is a standard convention to omit the $n$ subscript when $n = 1$, so the probability that a life currently age $x$ will survive 1 year is:
$$p_x = \frac{l_{x+1}}{l_x}$$
For example, the probability that a life age 5 survives for 2 years to age 7 is:
$${}_2p_5 = \frac{l_7}{l_5} = \frac{968}{976}$$
The probability that a life currently age $x$ will die within $n$ years is denoted ${}_nq_x$, and we have:
$${}_nq_x = 1 - {}_np_x = 1 - \frac{l_{x+n}}{l_x} = \frac{l_x - l_{x+n}}{l_x}$$
Intuitively, this is the probability that a life is one of the $(l_x - l_{x+n})$ lives to die between age $x$ and age $x+n$, out of the $l_x$ lives age $x$. For example, the probability that a life age 1 dies within 3 years is:
$${}_3q_1 = \frac{l_1 - l_4}{l_1} = \frac{991 - 979}{991} = \frac{12}{991}$$
Again, we omit the $n$ subscript when $n = 1$, so the probability that a life currently age $x$ will die within 1 year is:
$$q_x = \frac{l_x - l_{x+1}}{l_x} \quad \text{or} \quad q_x = \frac{d_x}{l_x}$$
Finally, the probability that a life currently age $x$ will survive for $m$ years and then die within the following $n$ years is denoted ${}_{m|n}q_x$, and we have:
$${}_{m|n}q_x = \frac{l_{x+m} - l_{x+m+n}}{l_x}$$
Intuitively, ${}_{m|n}q_x$ is the probability that a life age $x$ survives for $m$ years, multiplied by the probability that a life age $x+m$ dies within $n$ years:
$${}_{m|n}q_x = {}_mp_x \times {}_nq_{x+m} = \frac{l_{x+m}}{l_x} \times \frac{l_{x+m} - l_{x+m+n}}{l_{x+m}} = \frac{l_{x+m} - l_{x+m+n}}{l_x}$$
Again, we omit the $n$ subscript when $n = 1$, so the probability that a life currently age $x$ will survive for $m$ years and then die within 1 year is:
$${}_{m|}q_x = \frac{l_{x+m} - l_{x+m+1}}{l_x} \quad \text{or} \quad {}_{m|}q_x = \frac{d_{x+m}}{l_x}$$
For example, the probability that a life age 4 survives for 3 years and then dies within the following 2 years is:
$${}_{3|2}q_4 = \frac{l_7 - l_9}{l_4} = \frac{968 - 959}{979} = \frac{9}{979}$$
Example 1.1
Compute the following probabilities from the life table above:
(a) ${}_5p_0$
(b) ${}_{5|}q_0$
(c) ${}_{4|2}q_1$
(d) $p_1$
(e) $q_2$
Solution
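(a) ${}_5p_0 = \dfrac{l_5}{l_0} = \dfrac{976}{1{,}000}$

(b) ${}_{5|}q_0 = \dfrac{d_5}{l_0} = \dfrac{4}{1{,}000}$

(c) ${}_{4|2}q_1 = \dfrac{l_5 - l_7}{l_1} = \dfrac{976 - 968}{991} = \dfrac{8}{991}$

(d) $p_1 = \dfrac{l_2}{l_1} = \dfrac{985}{991}$

(e) $q_2 = \dfrac{d_2}{l_2} = \dfrac{3}{985}$ ♦♦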
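The calculations in Example 1.1 can be reproduced with a short Python sketch. The list `l` holds the hypothetical life table shown earlier; the helper names `p`, `q` and `deferred_q` are illustrative labels for ${}_np_x$, ${}_nq_x$ and ${}_{m|n}q_x$, not standard library functions.

```python
# l[x] = number of survivors at age x in the hypothetical life table above
l = [1000, 991, 985, 982, 979, 976, 972, 968, 964, 959]


def p(x, n=1):
    """n-year survival probability for a life age x: l[x+n] / l[x]."""
    return l[x + n] / l[x]


def q(x, n=1):
    """Probability that a life age x dies within n years."""
    return (l[x] - l[x + n]) / l[x]


def deferred_q(x, m, n=1):
    """Probability that a life age x survives m years, then dies within n years."""
    return (l[x + m] - l[x + m + n]) / l[x]


print(p(0, 5))              # (a) 976/1,000
print(deferred_q(0, 5))     # (b) d_5 / l_0 = 4/1,000
print(deferred_q(1, 4, 2))  # (c) 8/991
print(p(1))                 # (d) 985/991
print(q(2))                 # (e) 3/985
```

Defaulting `n` to 1 mirrors the convention of omitting the subscript when $n = 1$.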
We can also take a stochastic (ie random) view of future mortality. Under a stochastic approach, the number of survivors at age $x$ is a random variable $L(x)$, and the life table function $l_x$ represents the expected number of survivors, ie:
$$l_x = E[L(x)]$$
The random variable $L(x)$ follows a binomial distribution. Each life is viewed as an independent Bernoulli trial. The number of trials is $n = l_0$. We define “success” as survival to age $x$, with probability $p = {}_xp_0$.

For example, let’s consider the distribution of $L(5)$, the random number of survivors at age 5 in this life table from the 1,000 newborn lives. The random variable $L(5)$ follows a binomial distribution with $n = l_0 = 1{,}000$ trials and a probability of success of $p = {}_5p_0 = 976/1{,}000 = 0.976$. The mean and variance of $L(5)$ are:
$$E[L(5)] = np = l_0 \cdot {}_5p_0 = 1{,}000 \times 0.976 = 976 \quad (= l_5)$$
$$\text{var}(L(5)) = npq = l_0 \cdot {}_5p_0 \cdot {}_5q_0 = 1{,}000 \times 0.976 \times 0.024 = 23.424$$
Finally, we can deduce little from the life table about the continuous random lifetime $X$, but we can identify the distribution of the curtate lifetime of a newborn, $K(0)$.

The probability $\Pr(K(0) = k)$ is the probability that a newborn life dies in the age range $[k, k+1)$, which we have already defined as ${}_{k|}q_0$. Hence:
$$\Pr(K(0) = k) = {}_{k|}q_0 = \frac{d_k}{l_0}$$
For example, the probability that a newborn life has a curtate lifetime of 3 years is:
$$\Pr(K(0) = 3) = {}_{3|}q_0 = \frac{d_3}{l_0} = \frac{3}{1{,}000} = 0.003$$
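The stochastic view can be checked numerically with the sketch below. It uses the hypothetical life table from this section; the variable names (`mean_L5`, `var_L5`, `pmf_K0`) are illustrative only, and the curtate distribution is computed just for the ages the table portion covers.

```python
# l[x] = expected number of survivors at age x (hypothetical table above)
l = [1000, 991, 985, 982, 979, 976, 972, 968, 964, 959]

# L(5) is binomial with n = l_0 trials and success probability 5p0 = l_5 / l_0
n = l[0]
p5 = l[5] / l[0]
mean_L5 = n * p5
var_L5 = n * p5 * (1 - p5)

# Pr(K(0) = k) = d_k / l_0 for the ages available in this portion of the table
pmf_K0 = {k: (l[k] - l[k + 1]) / l[0] for k in range(len(l) - 1)}

print(mean_L5, var_L5)
print(pmf_K0[3])
```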
1.3 The theory of continuous survival models
In this section we will study five different mathematical functions that can all be employed to specify the distribution of $X$, the random lifetime (ie age at death) of a newborn life:
• the cumulative distribution function of $X$
• the probability density function of $X$
• the survival function
• the life table function
• the force of mortality.

We will focus on the relations between these functions as well as their meaning.
The pdf and cdf of the random lifetime

The random lifetime (ie age at death) of a newborn life, $X$, is assumed to be a continuous random variable. We will review the basic properties of continuous random variables and explain their interpretation in the context of the random lifetime. Let’s begin with the cumulative distribution function (cdf):
$$F_X(x) = \Pr(X \le x)$$
The cdf $F_X(x)$ represents the probability that a newborn life will die at or before age $x$. $F_X(x)$ is continuous and non-decreasing with $F_X(0) = 0$ and $F_X(\omega) = 1$, where $\omega$ is the first age at which death is certain to have occurred for a newborn life.

The probability density function (pdf) is:
$$f_X(x) = F_X'(x) \quad \text{wherever the derivative exists}$$
The pdf $f_X(x)$ is non-negative and continuous on the interval $[0, \omega)$.

Recall that a value of $f_X(x)$ is not a probability in itself. The probability that a newborn life dies between ages $a$ and $b$ is:
$$\Pr(a \le X \le b) = \int_a^b f_X(x)\,dx = F_X(b) - F_X(a)$$
You should note that since $X$ is assumed to be a continuous random variable, all of the intervals $[a, b]$, $[a, b)$, $(a, b]$, and $(a, b)$ have the same probability, $F_X(b) - F_X(a)$. Hence:
$$\int_0^\omega f_X(x)\,dx = 1$$
and:
$$F_X(x) = \Pr(X \le x) = \int_0^x f_X(u)\,du$$
Finally, the probability that a newborn life dies in the interval $[x, x + \Delta x]$ can be estimated as:
$$\Pr(x \le X \le x + \Delta x) \approx f_X(x) \cdot \Delta x$$
Example 1.2
Suppose that the lifetime $X$ of a newborn life is uniformly distributed on the interval $[0, 100]$.
(a) Identify the probability density function.
(b) Identify the cumulative distribution function.
(c) Calculate the probability of death occurring between ages 60 and 80.

Solution

(a) The pdf for a uniform distribution is constant and equal to the reciprocal of the length of the interval:
$$f_X(x) = \frac{1}{100} = 0.01 \quad \text{for } 0 \le x \le 100$$
(b) The cdf is:
$$F_X(x) = \int_0^x f_X(u)\,du = \int_0^x 0.01\,du = 0.01x \quad \text{for } 0 \le x \le 100$$
(c) The probability of death between ages 60 and 80 is:
$$\Pr(60 \le X \le 80) = F_X(80) - F_X(60) = 0.01(80 - 60) = 0.20$$
Note that the uniform distribution is not particularly well suited as a model of human mortality, but it is useful as a simple context to illustrate the theory. This mortality model is commonly known as de Moivre’s law. It was actually the first mortality model to be used in insurance practice. ♦♦
Example 1.3
Suppose that the lifetime $X$ of a newborn life is exponentially distributed with mean 75 years.
(a) Identify the probability density function.
(b) Identify the cumulative distribution function.
(c) Calculate the probability of death between ages 60 and 80.

Solution

The pdf for an exponential distribution with mean $\theta$ is $f_X(x) = \dfrac{1}{\theta} e^{-x/\theta}$ for $x > 0$. Hence:

(a) The pdf is:
$$f_X(x) = \frac{1}{75} e^{-x/75} \quad \text{for } x > 0$$
(b) The cdf is:
$$F_X(x) = \int_0^x \frac{1}{75} e^{-u/75}\,du = \left(-e^{-u/75}\right)\Big|_0^x = 1 - e^{-x/75}$$
(c) The probability of death between ages 60 and 80 is:
$$\Pr(60 \le X \le 80) = F_X(80) - F_X(60) = e^{-60/75} - e^{-80/75} = 0.10518$$ ♦♦
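Examples 1.2 and 1.3 both compute $\Pr(60 \le X \le 80)$ as a difference of cdf values. The following sketch does the same in Python; the function names `uniform_cdf` and `exponential_cdf` are hypothetical, and the default parameters are simply the values used in the two examples.

```python
import math


def uniform_cdf(x, omega=100.0):
    """cdf of a lifetime uniformly distributed on [0, omega]."""
    return min(max(x / omega, 0.0), 1.0)


def exponential_cdf(x, mean=75.0):
    """cdf of an exponentially distributed lifetime with the given mean."""
    return 1.0 - math.exp(-x / mean) if x > 0 else 0.0


# Pr(60 <= X <= 80) = F(80) - F(60) under each model
prob_uniform = uniform_cdf(80) - uniform_cdf(60)
prob_exponential = exponential_cdf(80) - exponential_cdf(60)
print(round(prob_uniform, 5), round(prob_exponential, 5))
```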
The survival function
In actuarial mathematics it is common to describe a survival model by giving the survival function rather than the density function or distribution function. The survival function is denoted $s_X(x)$ and is defined as:
$$s_X(x) = \Pr(X > x)$$
The survival function gives the probability that a newborn life dies after age $x$. This is the same as saying that the newborn survives to age $x$, or is alive at age $x$. From the preceding discussion of the lifetime variable $X$, we can deduce the following properties of the survival function.

Key properties of the survival function

1. $s_X(x)$ is continuous and non-increasing with $s_X(0) = 1$ and $s_X(\omega) = 0$

2. $s_X(x) = 1 - F_X(x)$

3. $\Pr(a \le X \le b) = \displaystyle\int_a^b f_X(x)\,dx = s_X(a) - s_X(b)$

4. $f_X(x) = -s_X'(x)$
Example 1.4
Suppose that the lifetime X of a newborn is exponentially distributed with mean 75 years.
(a) Identify the survival function $s_X(x)$.
(b) Calculate the probability that a newborn is still alive at age 100.
(c) Calculate the probability that a newborn dies between ages 60 and 75.

Solution

(a) In Example 1.3 we saw that the cdf is $F_X(x) = 1 - e^{-x/75}$. So, we have:
$$s_X(x) = 1 - F_X(x) = e^{-x/75} \quad \text{for } x > 0$$
(b) The probability a newborn is still alive at age 100 is:
$$s_X(100) = e^{-100/75} = 0.26360$$
(c) The probability a newborn dies between ages 60 and 75 is:
$$s_X(60) - s_X(75) = e^{-60/75} - e^{-75/75} = 0.08145$$ ♦♦
The life table function
In contrast with the discussion of the discrete case in Section 1.2, here we will define the life table function $l_x$ for all ages between 0 and $\omega$.

As before, let $L(x)$ denote the random number of survivors at any age $x$ from a group of $l_0$ newborn lives. The random variable $L(x)$ follows a binomial distribution with $n = l_0$ trials. We define “success” as survival to age $x$, with probability $p = \Pr(X > x) = s_X(x)$.

The life table function $l_x$ is defined as the expected number of survivors at age $x$. Hence:
$$l_x = E[L(x)] = np = l_0\, s_X(x)$$
Example 1.5
Suppose that the lifetime $X$ of a newborn is uniformly distributed on $[0, 100]$.

(a) Identify the survival function $s_X(x)$.

(b) Identify the life table function $l_x$ if $l_0 = 100$.

Solution

(a) The survival function is given by:
$$s_X(x) = \Pr(X > x) = \int_x^{100} f_X(u)\,du = \int_x^{100} 0.01\,du = 0.01(100 - x) = 1 - 0.01x \quad \text{for } 0 \le x \le 100$$
(b) The life table function is:
$$l_x = l_0\, s_X(x) = 100(1 - 0.01x) = 100 - x \quad \text{for } 0 \le x \le 100$$ ♦♦
Let’s summarize some of the key properties of the life table function, lx .
Key properties of the life table function
1. $l_x$ is the expected number of survivors at age $x$ from a group of $l_0$ newborn lives

2. $l_x = l_0\, s_X(x)$ is continuous and non-increasing with $l_\omega = 0$

3. $s_X(x) = \dfrac{l_x}{l_0}$
Note that the value of l0 (sometimes called the radix of the life table) is not important to the survival model, since (by property 3) the survival function is independent of this quantity. So, we can choose the value of l0 for convenience. The survival function will be identical whether we choose l0 = 100 or l0 = 1,000,000 .
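The independence of the survival function from the radix can be seen in a small sketch. It assumes the uniform model of Example 1.5; the names `life_table` and `s_uniform` are hypothetical helpers introduced only for this illustration.

```python
def s_uniform(x, omega=100.0):
    """Survival function for a lifetime uniform on [0, omega] (Example 1.5)."""
    return 1.0 - x / omega


def life_table(l0, s):
    """Return the life table function x -> l0 * s(x) for a chosen radix l0."""
    return lambda x: l0 * s(x)


# The ratio l_x / l_0 recovers s_X(x) whatever radix is chosen
for l0 in (100, 1_000_000):
    l = life_table(l0, s_uniform)
    print(l0, l(30), l(30) / l(0))
```

Both radix choices give the same ratio at age 30, namely $s_X(30) = 0.7$.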
Example 1.6
Suppose that the life table function is $l_x = 10{,}000(100 - x)^2$ for $0 \le x \le 100$. Identify the cumulative distribution function and the probability density function for the associated lifetime variable $X$.

Solution

It is elementary to compute the survival function from the life table by property 3:
$$s_X(x) = \frac{l_x}{l_0} = \frac{10{,}000(100 - x)^2}{10{,}000 \times 100^2} = \frac{(100 - x)^2}{100^2} \quad \text{for } 0 \le x \le 100$$
We can then calculate $F_X(x)$ using the relationship $F_X(x) = 1 - s_X(x)$:
$$F_X(x) = 1 - \frac{(100 - x)^2}{100^2} \quad \text{for } 0 \le x \le 100$$
Finally, we can calculate $f_X(x)$ using the relationship $f_X(x) = F_X'(x)$:
$$f_X(x) = F_X'(x) = \left(1 - \frac{(100 - x)^2}{100^2}\right)' = \frac{2(100 - x)}{100^2} = \frac{100 - x}{5{,}000} \quad \text{for } 0 \le x \le 100$$ ♦♦
Example 1.7
Suppose that there are 1,000 newborn lives whose lifetime follows the survival model given in Example 1.6. Determine the interval that lies within two standard deviations either side of the mean for $L(10)$, the random number of survivors at age 10.

Solution

$L(10)$ follows a binomial distribution with:
$$n = l_0 = 1{,}000$$
$$p = s_X(10) = \frac{(100 - 10)^2}{100^2} = 0.81$$
So we have:
$$E[L(10)] = np = 1{,}000 \times 0.81 = 810$$
$$\text{var}(L(10)) = npq = 1{,}000 \times 0.81 \times (1 - 0.81) = 153.90$$
$$\sigma_{L(10)} = \sqrt{153.90} = 12.406$$
Hence the required interval is:
$$(810 - 2 \times 12.406,\ 810 + 2 \times 12.406) = (785.19,\ 834.81)$$ ♦♦
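The calculation in Example 1.7 can be sketched in Python as follows. The function name `survivor_interval` and its parameter `k` (the number of standard deviations) are hypothetical; `s_X` encodes the survival function found in Example 1.6.

```python
import math


def s_X(x):
    """Survival function from Example 1.6: ((100 - x) / 100)^2."""
    return ((100.0 - x) / 100.0) ** 2


def survivor_interval(l0, p, k=2.0):
    """Mean +/- k standard deviations for a binomial(l0, p) number of survivors."""
    mean = l0 * p
    sd = math.sqrt(l0 * p * (1.0 - p))
    return mean - k * sd, mean + k * sd


low, high = survivor_interval(1000, s_X(10))
print(round(low, 2), round(high, 2))
```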
The force of mortality
We can also specify a survival model in terms of the force of mortality. The force of mortality is denoted $\mu(x)$. It is an instantaneous measure of mortality at age $x$, and it can be defined in several equivalent ways:
$$\mu(x) = \frac{f_X(x)}{s_X(x)} = -\frac{s_X'(x)}{s_X(x)} = -\left(\ln s_X(x)\right)' = -\frac{l_x'}{l_x}$$
These equalities can be verified using simple calculus. For example, using the information in Example 1.6, we have:
$$\mu(x) = -\frac{s_X'(x)}{s_X(x)} = -\frac{-2(100 - x)/100^2}{(100 - x)^2/100^2} = \frac{2}{100 - x} \quad \text{for } 0 \le x < 100$$
Or, using the information in Example 1.4, we have:
$$\mu(x) = -\left(\ln s_X(x)\right)' = -\left(-\frac{x}{75}\right)' = \frac{1}{75} \quad \text{for } x \ge 0$$
Let’s now see how to calculate the survival function from the force of mortality.
$$\mu(x) = -\left(\ln s_X(x)\right)' \;\Rightarrow\; \int_0^x \mu(y)\,dy = -\ln s_X(y)\Big|_0^x = -\ln s_X(x) + \ln s_X(0)$$
But since $s_X(0) = 1$ and $\ln(1) = 0$, we have:
$$\int_0^x \mu(y)\,dy = -\ln s_X(x)$$
$$\Rightarrow\; s_X(x) = \exp\left(-\int_0^x \mu(y)\,dy\right)$$
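The relation between the force of mortality and the survival function can be checked numerically. The sketch below approximates the integral with a simple midpoint rule (a choice made here for illustration, not a method taken from the text) and uses the force of mortality $\mu(x) = 2/(100 - x)$ obtained above from Example 1.6. The function names are hypothetical.

```python
import math


def survival_from_force(mu, x, steps=10_000):
    """Approximate s_X(x) = exp(-integral_0^x mu(y) dy) using the midpoint rule."""
    h = x / steps
    integral = sum(mu((j + 0.5) * h) for j in range(steps)) * h
    return math.exp(-integral)


def mu_example(y):
    """Force of mortality derived from Example 1.6: 2 / (100 - y)."""
    return 2.0 / (100.0 - y)


approx = survival_from_force(mu_example, 10.0)
exact = ((100.0 - 10.0) / 100.0) ** 2   # closed-form s_X(10) from Example 1.6
print(approx, exact)
```

The numerical value agrees closely with the closed form $s_X(10) = 0.81$, which is what the formula predicts.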
Example 1.8
Suppose that the force of mortality for a survival model is given by the formula:
$$\mu(x) = \frac{0.9}{90 - x} \quad \text{for } 0 \le x < 90$$
Calculate the survival function.

Solution

The survival function is calculated as:
$$s_X(x) = \exp\left(-\int_0^x \mu(y)\,dy\right) = \exp\left(-\int_0^x \frac{0.9}{90 - y}\,dy\right) = \exp\left(0.9 \ln(90 - y)\Big|_0^x\right) = \exp\left(0.9 \ln\frac{90 - x}{90}\right) = \left(\frac{90 - x}{90}\right)^{0.9} \quad \text{for } 0 \le x < 90$$ ♦♦

Since we have already studied simple relationships between the survival function, the life table function, and the pdf and cdf of the lifetime variable, we can easily calculate any of these functions from the force of mortality. For example, using the information in Example 1.8, we can calculate the life table function (with $l_0 = 1{,}000$) as:
$$l_x = l_0\, s_X(x) = 1{,}000\left(\frac{90 - x}{90}\right)^{0.9} \quad \text{for } 0 \le x < 90$$
It is clear from the definition that the force of mortality is not a probability, so how should it be interpreted? In order to understand the meaning of $\mu(x)$, it is useful to rewrite the defining formula in the form:
$$f_X(x) = s_X(x)\,\mu(x)$$
Now, recall that:
$$f_X(x)\,\Delta x \approx \Pr(x \le X \le x + \Delta x)$$
Rewriting the probability term using a conditional probability, we have:
$$f_X(x)\,\Delta x \approx \Pr(x \le X \le x + \Delta x) = \Pr(X \le x + \Delta x \mid X \ge x)\,\Pr(X \ge x) = \Pr(X \le x + \Delta x \mid X \ge x)\,s_X(x)$$
Substituting $f_X(x) = s_X(x)\,\mu(x)$, we have:
$$s_X(x)\,\mu(x)\,\Delta x \approx \Pr(X \le x + \Delta x \mid X \ge x)\,s_X(x) \;\Rightarrow\; \mu(x)\,\Delta x \approx \Pr(X \le x + \Delta x \mid X \ge x)$$
So, $\mu(x)\,\Delta x$ is approximately equal to the conditional probability that a newborn that has survived to age $x$ subsequently dies during the next $\Delta x$ years. For example, $\mu(20)$ multiplied by $\Delta x = 1/365$ (a day) is approximately equal to the conditional probability that a newborn that has survived to exact age 20 will then die during the next day.

Example 1.9

Suppose that the force of mortality for a survival model is given by the formula:
$$\mu(x) = \frac{0.9}{90 - x} \quad \text{for } 0 \le x < 90$$
Estimate the probability that a life that has survived to age 40 dies within the next week.

Solution

Setting $\Delta x = 7/365$, the required probability is:
$$\mu(40)\,\Delta x = \frac{0.9}{90 - 40} \times \frac{7}{365} = 0.00035$$ ♦♦

Let’s summarize the main properties of the force of mortality.

Key properties of the force of mortality

1. $\mu(x) = \dfrac{f_X(x)}{s_X(x)} = -\dfrac{s_X'(x)}{s_X(x)} = -\left(\ln s_X(x)\right)' = -\dfrac{l_x'}{l_x}$

2. $s_X(x) = \exp\left(-\displaystyle\int_0^x \mu(y)\,dy\right)$

3. $\mu(x) \cdot \Delta x \approx \Pr(X \le x + \Delta x \mid X \ge x)$

4. $\mu(x)$ is non-negative and piece-wise continuous where defined

5. $\displaystyle\int_0^\omega \mu(y)\,dy = \infty$ in order that $s_X(\omega) = 0$

Standard probabilities in a continuous survival model

Let’s now reconsider the ideas we met in Section 1.2 (in the context of a discrete life table), this time with $l_x$ in the form of a continuous algebraic function defined for all $x$ in $[0, \omega]$. First, the probability that a life currently age $x$ will survive $t$ years is:
$${}_tp_x = \frac{l_{x+t}}{l_x} = \frac{s_X(x+t)}{s_X(x)} = \frac{\Pr(X > x + t)}{\Pr(X > x)} = \Pr(X > x + t \mid X > x)$$
The probability that a life currently age $x$ will die within the next $t$ years is:
$${}_tq_x = \frac{l_x - l_{x+t}}{l_x} = \frac{\Pr(X > x) - \Pr(X > x + t)}{\Pr(X > x)} = \Pr(X \le x + t \mid X > x)$$
Finally, the probability that a life currently age $x$ will survive $s$ years but die within the following $t$ years is:
$${}_{s|t}q_x = \frac{l_{x+s} - l_{x+s+t}}{l_x} = \Pr(x + s < X \le x + s + t \mid X > x)$$
These three functions are defined for all ages $x$ in $[0, \omega)$ and for $0 \le t \le \omega - x$.

It should be emphasized now that all of these probabilities are conditional, ie we are given that a newborn has survived to age $x$. The symbol $(x)$ is commonly used to denote a newborn life that has survived to age $x$. So, ${}_5p_x$ is the probability that $(x)$ will still be alive in 5 years’ time, at age $x + 5$, and ${}_5q_x$ is the probability that $(x)$ will die within the next 5 years.

As in Section 1.2, the general convention is to drop the subscript $t$ from the symbol when $t = 1$. So, for example, ${}_{3|}q_x$ is the probability that $(x)$ will die between ages $x + 3$ and $x + 4$.

Example 1.10

Suppose that the force of mortality for a survival model is given by the formula:
$$\mu(x) = \frac{0.9}{90 - x} \quad \text{for } 0 \le x < 90$$
Calculate:
(a) ${}_{2.5}p_{20}$
(b) ${}_{2.5}q_{20}$
(c) ${}_{2.5|}q_{20}$

Solution

Note that this is the force of mortality in Example 1.8. The life table function is:
$$l_x = l_0\, s_X(x) = l_0\left(\frac{90 - x}{90}\right)^{0.9} \quad \text{for } 0 \le x < 90$$
Recall that we can choose any convenient value of $l_0$ without changing the distribution of $X$.
So let’s simplify our computations by choosing $l_0 = 90^{0.9}$, which gives:
$$l_x = l_0\, s_X(x) = (90 - x)^{0.9} \quad \text{for } 0 \le x < 90$$
We can now calculate the required probabilities as follows.

(a) ${}_{2.5}p_{20} = \dfrac{l_{22.5}}{l_{20}} = \dfrac{(90 - 22.5)^{0.9}}{(90 - 20)^{0.9}} = 0.96780$

(b) ${}_{2.5}q_{20} = \dfrac{l_{20} - l_{22.5}}{l_{20}} = 1 - {}_{2.5}p_{20} = 0.03220$

(c) ${}_{2.5|}q_{20} = \dfrac{l_{22.5} - l_{23.5}}{l_{20}} = \dfrac{67.5^{0.9} - 66.5^{0.9}}{70^{0.9}} = 0.01291$ ♦♦

The challenge in dealing with these standard probabilities is that there are so many relationships that involve them. The key relations are listed below without proof. Most of the proofs rely on simple probability theory, and you may like to attempt them to improve your understanding.

Key relations concerning standard probabilities

1. ${}_tp_x + {}_tq_x = 1$

2. ${}_{s+t}p_x = {}_sp_x \cdot {}_tp_{x+s}$

3. ${}_{s|t}q_x = {}_sp_x \cdot {}_tq_{x+s} = {}_sp_x - {}_{s+t}p_x = {}_{s+t}q_x - {}_sq_x$

4. ${}_np_x = p_x \cdot p_{x+1} \cdots p_{x+n-1}$ when $n$ is an integer

5. ${}_nq_x = {}_{0|}q_x + {}_{1|}q_x + \cdots + {}_{n-1|}q_x$ when $n$ is an integer

For example, if the probability that $(20)$ survives for 10 years is 0.97, and if the probability that $(30)$ survives for 10 years is 0.95, then the probability that $(20)$ is still alive at age 40 is:
$${}_{20}p_{20} = {}_{10}p_{20} \cdot {}_{10}p_{30} = 0.97 \times 0.95 = 0.92150$$
Or, the probability that $(20)$ dies between ages 30 and 40 is:
$${}_{10|10}q_{20} = {}_{10}p_{20} \cdot {}_{10}q_{30} = 0.97(1 - 0.95) = 0.04850$$
On the other hand, if $p_0 = 0.99$, $p_1 = 0.98$, and $p_2 = 0.97$, then the probability that a newborn dies within three years is:
$${}_3q_0 = 1 - {}_3p_0 = 1 - p_0 \cdot p_1 \cdot p_2 = 1 - 0.99 \times 0.98 \times 0.97 = 0.05891$$
Or the probability that a newborn dies during the second year of life is:
$${}_{1|}q_0 = p_0 \cdot q_1 = 0.99 \times (1 - 0.98) = 0.01980$$
Relations 4 and 5 are useful in constructing a discrete life table for human lives. A statistical study conducted over a time span of several years could be used to produce estimates of the mortality rates $q_0, q_1, q_2, \ldots$ and so on.
Values of $l_x$ at whole number ages can then be produced as follows:
$${}_np_0 = \frac{l_n}{l_0} \;\Rightarrow\; l_n = l_0 \cdot {}_np_0 = l_0 \cdot p_0 \cdot p_1 \cdots p_{n-1} = l_0 (1 - q_0)(1 - q_1) \cdots (1 - q_{n-1})$$

1.4 The continuous future lifetime after age x

Let the continuous random variable $X$ again denote the random lifetime of a newborn. Now suppose that we are given that a newborn has survived to age $x$, that is, $X > x$. The future time lived after age $x$ is $X - x$. The conditional distribution of the time lived after age $x$, given survival to age $x$, is:
$$T(x) = X - x \mid X > x$$
The continuous random variable $T(x)$ is a survival model defined on the interval $[0, \omega - x]$. As such, it can also be specified in the same ways that we specified the survival model for a newborn life. It should be clear that the distribution of $T(x)$ is closely related to the distribution of $X$.

The quickest way to see the relation between the distributions of $T(x)$ and $X$ is to calculate the survival function for $T(x)$, $s_{T(x)}(t) = s_T(t) = \Pr(T(x) > t)$. In fact, we have already computed this survival function in terms of the distribution of $X$, since the event $T(x) > t$ is equivalent to saying that $(x)$ is alive at age $x + t$. The probability of this event is simply ${}_tp_x$. So, we have:
$$s_{T(x)}(t) = \Pr(T(x) > t) = {}_tp_x = \frac{l_{x+t}}{l_x} = \frac{s_X(x+t)}{s_X(x)} \quad \text{since } l_x = l_0\, s_X(x)$$
Note: When there is no ambiguity, we will write $T$ for $T(x)$. However, subscripts are often important, for example to distinguish $s_X(20)$, the probability that a newborn survives to age 20, from $s_{T(10)}(20)$, the probability that $(10)$ survives to age 30.

Example 1.11

Suppose that the life table function is $l_x = 10{,}000(100 - x)^2$ for $0 \le x \le 100$.
(a) Compute the survival function for newborn lives.
(b) Compute the survival function for lives currently aged 20.
Solution

(a) The survival function for newborn lives, $s_X(x)$, is:
$$s_X(x) = {}_xp_0 = \frac{l_x}{l_0} = \frac{10{,}000(100 - x)^2}{10{,}000(100 - 0)^2} = \left(\frac{100 - x}{100}\right)^2 \quad \text{for } 0 \le x \le 100$$
(b) The survival function for lives currently aged 20, $s_{T(20)}(t)$, is:
$$s_{T(20)}(t) = {}_tp_{20} = \frac{l_{20+t}}{l_{20}} = \frac{10{,}000(100 - (20 + t))^2}{10{,}000(100 - 20)^2} = \left(\frac{80 - t}{80}\right)^2 \quad \text{for } 0 \le t \le 80$$ ♦♦

When we deal with the future lifetime after age $x$, we’ll frequently see expressions of the type $l_{x+t}$ and $\mu(x+t)$. In these expressions the value of $x$ is fixed, and the value of $t$ is allowed to vary so that we can view the expressions as being functions of $t$. For example, let’s see how to relate the pdf and cdf for the distributions of $X$ and $T$.
$$F_T(t) = \Pr(T(x) \le t) = \Pr(X - x \le t \mid X > x) = \frac{\Pr(x < X \le x + t)}{\Pr(X > x)} = \frac{F_X(x+t) - F_X(x)}{1 - F_X(x)}$$
$$f_T(t) = \frac{d}{dt} F_T(t) = \left(\frac{F_X(x+t) - F_X(x)}{1 - F_X(x)}\right)' = \frac{f_X(x+t)}{1 - F_X(x)} = \frac{f_X(x+t)}{s_X(x)}$$
In the figure below, we have areas $A = F_X(x+t) - F_X(x)$ and $A + B = s_X(x)$.

[Figure: graph of $f_X(x)$, with area $A$ under the curve between $x$ and $x+t$, and area $B$ between $x+t$ and $\omega$]

Notice that we have:
$$F_T(t) = \frac{F_X(x+t) - F_X(x)}{s_X(x)} = \frac{A}{A + B}$$
If we view the age $x$ as being fixed, then as the value of $t$ increases, the value of $A$ increases, the value of $B$ decreases, while the sum $A + B$ remains constant.

Furthermore, if we examine the relation:
$$f_T(t) = \frac{f_X(x+t)}{s_X(x)} \quad \text{for } 0 \le t \le \omega - x$$
in light of the figure above, we can see how the graph of the pdf of $T$ is related to the graph of the pdf of $X$. If we take the portion of the graph of $f_X(x)$ to the right of $x$, divide by the total area under the remainder of the graph, $A + B = s_X(x)$, and then relabel the horizontal axis as $t$ and the vertical axis as $f_T(t)$, we have a graph of the pdf of $T$:

[Figure: graph of $f_T(t)$, with area $A / s_X(x)$ between $0$ and $t$, and area $B / s_X(x)$ between $t$ and $\omega - x$]

One final point worth noting is the similarity between the pdfs for the distributions of $X$ and $T$.
For $X$ we have:
$$\mu(x) = \frac{f_X(x)}{s_X(x)} \;\Rightarrow\; f_X(x) = s_X(x)\,\mu(x) = {}_xp_0\,\mu(x)$$
and for $T$ we have:
$$f_T(t) = \frac{f_X(x+t)}{s_X(x)} = \frac{{}_{x+t}p_0\,\mu(x+t)}{{}_xp_0} = \frac{{}_xp_0 \cdot {}_tp_x\,\mu(x+t)}{{}_xp_0} = {}_tp_x\,\mu(x+t)$$

Example 1.12

Suppose that the life table function is $l_x = 10{,}000(100 - x)^2$ for $0 \le x \le 100$.
(a) Compute the distribution function for the future lifetime of a life aged 20.
(b) Compute the density function for the future lifetime of a life aged 20.

Solution

(a) In Example 1.11 we computed:
$${}_tp_{20} = s_{T(20)}(t) = \left(\frac{80 - t}{80}\right)^2 \quad \text{for } 0 \le t \le 80$$
Hence, the distribution function for $T(20)$ is:
$$F_{T(20)}(t) = 1 - s_{T(20)}(t) = 1 - \left(\frac{80 - t}{80}\right)^2 \quad \text{for } 0 \le t \le 80$$
(b) We have two options for calculating the density function. The first option is to differentiate the distribution function:
$$f_{T(20)}(t) = F_{T(20)}'(t) = \left(1 - \left(\frac{80 - t}{80}\right)^2\right)' = \frac{2(80 - t)}{80^2} = \frac{80 - t}{3{,}200} \quad \text{for } 0 \le t \le 80$$
The other option is to use the relation:
$$f_T(t) = {}_tp_x\,\mu(x+t)$$
It is consistent with the formula for $l_x$ that:
$$\mu(x) = \frac{2}{100 - x}$$
Hence, we have:
$$f_{T(20)}(t) = {}_tp_{20}\,\mu(20 + t) = \left(\frac{80 - t}{80}\right)^2 \cdot \frac{2}{80 - t} = \frac{80 - t}{3{,}200}$$ ♦♦

Key results concerning the relation of the distributions of X and T(x)

1. $s_T(t) = \Pr(T(x) > t) = {}_tp_x = \dfrac{l_{x+t}}{l_x} = \dfrac{s_X(x+t)}{s_X(x)}$

2. $F_T(t) = {}_tq_x = 1 - s_T(t) = \dfrac{F_X(x+t) - F_X(x)}{s_X(x)}$

3. $f_T(t) = {}_tp_x\,\mu(x+t) = \dfrac{f_X(x+t)}{s_X(x)}$

1.5 The curtate future lifetime after age x

In addition to computing the distribution of the continuous future lifetime $T(x)$, we may also wish to derive the distribution of the curtate future lifetime after age $x$. The curtate lifetime is a discrete random variable that is defined by:
$$K(x) = [T(x)] \quad \text{ie the integer part (or greatest integer) of } T(x)$$
Since it is a function of $T(x)$, it is simple to calculate the probability function of $K(x)$ from what we know about $T(x)$.

The possible values of $K(x)$ are the numbers $0, 1, 2, \ldots, \omega - x - 1$. For example, if $x = 70$ and $\omega = 90$, then the possible values of $K(70)$ are the twenty whole numbers 0 through 19.
If the life $(70)$ eventually dies at age 85.8, then the continuous future lifetime is $T(70) = 15.8$ and the curtate lifetime is $K(70) = [15.8] = 15$.

The key observation is that if $K(x) = k$, then we must have:
$$k \le T(x) < k + 1$$
This leads to the following formula for the probability function:
$$\Pr(K(x) = k) = \Pr(k \le T(x) < k + 1) = {}_{k|}q_x = \frac{d_{x+k}}{l_x} \quad \text{for } k = 0, 1, 2, \ldots, \omega - x - 1$$

Example 1.13

Suppose that the life table function is given by the formula:
$$l_x = 100 - x \quad \text{for } 0 \le x \le 100$$
Compute the probability function for $K(75)$.

Solution

The probability function for $K(75)$ is:
$$\Pr(K(75) = k) = \frac{d_{75+k}}{l_{75}} = \frac{l_{75+k} - l_{75+k+1}}{l_{75}} = \frac{(100 - 75 - k) - (100 - 75 - k - 1)}{100 - 75} = \frac{1}{25}$$
So $K(75)$ has 25 possible values ($0, 1, 2, \ldots, 24$) that are equally likely to occur. ♦♦

Example 1.14

Suppose that the life table function is given by the formula:
$$l_x = 1{,}000\,e^{-0.015x} \quad \text{for } 0 \le x < \infty$$
Compute the probability function for $K(75)$.

Solution

The probability function for $K(75)$ is:
$$\Pr(K(75) = k) = \frac{d_{75+k}}{l_{75}} = \frac{l_{75+k} - l_{75+k+1}}{l_{75}} = \frac{e^{-0.015(75+k)} - e^{-0.015(75+k+1)}}{e^{-0.015 \times 75}} = e^{-0.015k}\left(1 - e^{-0.015}\right)$$
Note that this is a geometric distribution. ♦♦

It is also useful to develop formulas for the cumulative distribution function and survival function of the curtate future lifetime.

Recall that for any random variable $F_X(x) = \Pr(X \le x)$, hence:
$$F_{K(x)}(k) = \Pr(K(x) \le k) = \Pr(K(x) = 0) + \Pr(K(x) = 1) + \cdots + \Pr(K(x) = k)$$
$$= \frac{d_x}{l_x} + \frac{d_{x+1}}{l_x} + \frac{d_{x+2}}{l_x} + \cdots + \frac{d_{x+k}}{l_x} = \frac{l_x - l_{x+k+1}}{l_x} = {}_{k+1}q_x \quad \text{for } k = 0, 1, \ldots, \omega - x - 1$$
The survival function of the curtate future lifetime is then easily derived as:
$$s_{K(x)}(k) = \Pr(K(x) > k) = 1 - F_{K(x)}(k) = 1 - {}_{k+1}q_x = {}_{k+1}p_x \quad \text{for } k = 0, 1, \ldots, \omega - x - 1$$

Example 1.15

Suppose that the life table function is given by the formula:
$$l_x = 100 - x \quad \text{for } 0 \le x \le 100$$
Compute the survival function for $K(75)$.
Solution

The survival function for $K(75)$ is:
$$s_{K(75)}(k) = {}_{k+1}p_{75} = \frac{l_{75+k+1}}{l_{75}} = \frac{100 - (75 + k + 1)}{100 - 75} = \frac{24 - k}{25} \quad \text{for } k = 0, 1, \ldots, 24$$ ♦♦

Example 1.16

Suppose that the life table function is given by the formula:
$$l_x = 1{,}000\,e^{-0.015x} \quad \text{for } 0 \le x < \infty$$
Compute the survival function for $K(75)$.

Solution

The survival function for $K(75)$ is:
$$s_{K(75)}(k) = {}_{k+1}p_{75} = \frac{l_{75+k+1}}{l_{75}} = \frac{e^{-0.015(75+k+1)}}{e^{-0.015(75)}} = e^{-0.015(k+1)} \quad \text{for } k = 0, 1, 2, \ldots$$ ♦♦

Let’s conclude this section with a summary of the key relations concerning the curtate future lifetime, $K(x)$.

Key relations concerning the curtate future lifetime

1. $f_{K(x)}(k) = \Pr(K = k) = {}_{k|}q_x = \dfrac{d_{x+k}}{l_x}$ for $k = 0, 1, \ldots, \omega - x - 1$

2. $s_{K(x)}(k) = {}_{k+1}p_x = \dfrac{l_{x+k+1}}{l_x}$ for $k = 0, 1, \ldots, \omega - x - 1$

3. $F_{K(x)}(k) = {}_{k+1}q_x = 1 - {}_{k+1}p_x$ for $k = 0, 1, \ldots, \omega - x - 1$

1.6 Important life table functions

Now that we have developed the basic properties of $X$, $T(x)$, and $K(x)$, it is time to study additional features of these distributions, such as the life expectancy for a newborn.

The life table functions L_x and T_x

The functions $L_x$ and $T_x$ are useful devices in the calculation of life expectancy. They are defined in terms of the life table function, $l_x$, as follows:
$$L_x = \int_x^{x+1} l_y\,dy$$
$$T_x = \int_x^\omega l_y\,dy = L_x + L_{x+1} + \cdots + L_{\omega - 1}$$
These functions have interpretations in terms of the aggregate future lifetime of a group of lives that die exactly as scheduled in the life table (ie we take a deterministic view of the life table).

Consider a brief time interval $[y, y + \Delta y]$ that is part of the interval $[x, \omega]$. At the start of this brief period there are $l_y$ survivors. We can estimate the total people-years lived by the survivors during this brief period by $l_y\,\Delta y$. This approximation ignores the possibility that anyone dies in the short time available.

If we now sum these people-years lived over a set of disjoint sub-intervals of length $\Delta y$ comprising the age interval $[x, \omega]$, we will have a Riemann sum for the integral $\int_x^\omega l_y\,dy$. This Riemann sum can be interpreted as an approximation to the total number of people-years lived after age $x$ by the survivors to age $x$. Taking a limit as $\Delta y$ goes to zero, we have the integral:
$$T_x = \int_x^\omega l_y\,dy$$
which can be interpreted as the total people-years lived after age $x$ by the survivors to age $x$. Beware of confusing $T_x$ with $T(x)$, the random future lifetime of a single life age $x$.

We can break the interval $[x, \omega]$ into subintervals of length one year, ie $[x, x+1], [x+1, x+2], \ldots, [\omega - 1, \omega]$. The function $L_x$ is calculated over just one of these one-year periods:
$$L_x = \int_x^{x+1} l_y\,dy$$
which is the number of people-years lived by the survivors to age $x$ during the next year.

Example 1.17

Compute the function $T_x$ for the following life table functions:
(a) $l_x = 1{,}000 - 10x$ for $0 \le x \le 100$
(b) $l_x = 1{,}000\,e^{-0.015x}$ for $0 \le x < \infty$

Solution

(a) We have:
$$T_x = \int_x^{100} (1{,}000 - 10y)\,dy = -\frac{(1{,}000 - 10y)^2}{20}\Bigg|_x^{100} = \frac{(1{,}000 - 10x)^2}{20} = 5(100 - x)^2$$
(b) We have:
$$T_x = \int_x^\infty 1{,}000\,e^{-0.015y}\,dy = -\frac{1{,}000\,e^{-0.015y}}{0.015}\Bigg|_x^\infty = \frac{1{,}000\,e^{-0.015x}}{0.015}$$ ♦♦

Complete life expectancy

Now let’s develop some formulas to calculate life expectancies. Before we do that, though, it’s useful to develop an alternative formula for expected value calculations. The work we do now will make the subsequent derivations simpler.

Let’s assume that $X$ is a continuous, positive-valued random variable whose mean and variance both exist. The usual formula for calculating the expected value is:
$$E[X] = \int_0^\infty x\,f_X(x)\,dx$$
We can develop an alternative formula using integration by parts:
$$u = x,\; dv = f_X(x)\,dx \;\Rightarrow\; du = dx,\; v = -s_X(x)$$
$$E[X] = \int_0^\infty x\,f_X(x)\,dx = \left(-x\,s_X(x)\right)\Big|_0^\infty + \int_0^\infty s_X(x)\,dx = -\lim_{x \to \infty} x\,s_X(x) + 0 + \int_0^\infty s_X(x)\,dx = \int_0^\infty s_X(x)\,dx$$
In the above calculation, the limit is zero. Apply L’Hopital’s rule, writing $x\,s_X(x)$ as $s_X(x)/x^{-1}$. Then use the fact that $\lim_{x \to \infty} x^2 f_X(x) = 0$ if $E[X^2]$ exists.

In other words, we can calculate $E[X]$ as the area under the graph of the survival function.
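The “area under the survival function” result can be checked numerically. The sketch below uses a midpoint-rule approximation (an illustrative choice made here) applied to the uniform model of Example 1.2 and the exponential model of Example 1.3; the function names and the truncation point of 3,000 years for the exponential integral are assumptions made for this illustration.

```python
import math


def expected_lifetime(s, upper, steps=100_000):
    """Approximate E[X] = integral_0^upper s(x) dx using the midpoint rule."""
    h = upper / steps
    return sum(s((j + 0.5) * h) for j in range(steps)) * h


def s_uniform(x):
    """Survival function for a lifetime uniform on [0, 100]."""
    return 1.0 - x / 100.0


def s_exponential(x):
    """Survival function for an exponential lifetime with mean 75."""
    return math.exp(-x / 75.0)


mean_uniform = expected_lifetime(s_uniform, upper=100.0)
# The exponential tail beyond 3,000 years is negligible, so truncate there
mean_exponential = expected_lifetime(s_exponential, upper=3000.0)
print(mean_uniform, mean_exponential)
```

The two results should be close to the known means of these distributions, 50 and 75 respectively.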
If the density function is supported on the finite interval $[0, \omega]$, then we can replace $\infty$ in the above integrals by $\omega$, since both of the functions $f_X(x)$ and $s_X(x)$ are zero for $x > \omega$. So we have:
$$E[X] = \int_0^\omega s_X(x)\,dx$$
Another type of expected value that occurs in both survival model theory and loss distribution theory is the limited expected value. Let’s define $X \wedge n$ as the minimum of the random variable $X$ and the number $n$:
$$X \wedge n = \min\{X, n\} = \begin{cases} X & \text{if } X \le n \\ n & \text{if } X > n \end{cases}$$
Since $X \wedge n$ is a function of $X$, we can compute its expected value as above with limits 0 and $n$:
$$E[X \wedge n] = \int_0^n x\,f_X(x)\,dx + \int_n^\infty n\,f_X(x)\,dx = \int_0^n x\,f_X(x)\,dx + n\,s_X(n)$$
$$= \left(-x\,s_X(x)\right)\Big|_0^n + \int_0^n s_X(x)\,dx + n\,s_X(n) = \int_0^n s_X(x)\,dx$$
Now we’re ready to calculate the two types of life expectancy: complete and curtate.

The complete expected future lifetime at age $x$ is denoted by $\mathring{e}_x$, and is defined as:
$$\mathring{e}_x = E[T(x)]$$
Using results derived earlier in this section, we can develop several methods to calculate this expected value: