Appendix A Markov processes and the product-integral
A main purpose of this book is to show how the theory of stochastic processes fits naturally into the more applied framework of event history analysis. In particular, the tools and ideas of martingale and Markov process theory are ubiquitous in an integrated approach. In spite of this close relationship, most presentations found in the literature do not acknowledge it; standard presentations of event history analysis rarely say much about Markov processes, and vice versa.

Our main text incorporates many applications of martingale theory and counting processes, and this appendix will not say much about that. The fundamental connections between hazard, survival, Markov processes, the Kolmogorov equations, and the product-integral are also at the heart of many models covered in our book, but a full discussion is too much to include in the main text. Since these topics are hard to find in a single comprehensive text, they are presented in this appendix. The results are particularly useful for understanding the connection between the Nelson-Aalen and Kaplan-Meier estimators (Chapter 3) and for multivariate survival data and competing risks (Section 3.4). This appendix deals mostly with the mathematical relationships; the parallel results used in estimation and asymptotics are in the main text. For extensions of multistate Markov models to semi-Markov models, the reader is referred to Huzurbazar (1999, 2005).

We also cover introductory ideas for diffusion processes, stochastic differential equations, and certain processes with stationary and independent increments, known as Lévy processes. These models are used extensively in the final chapters of the book (Chapters 10 and 11). The material in the appendix is meant to be self-contained, but a reader with no background in stochastic processes will find it useful (or even necessary!) to consult standard texts such as Karlin and Taylor (1975, 1981, 1998), Allen (2003), and Cox and Miller (1965).
A.1 Hazard, survival, and the product-integral
Let $T \ge 0$ be a random survival time with survival function $S(t) = P(T > t)$. It is common to assume that the survival function $S(t)$ is absolutely continuous, and let us do so for the moment. Let $f(t)$ be the density of $T$. The standard definition of the hazard rate $\alpha(t)$ of $T$ is
$$\alpha(t) = \lim_{\Delta t \to 0} \frac{1}{\Delta t}\, P(t \le T < t + \Delta t \mid T \ge t) = \frac{f(t)}{S(t)}, \tag{A.1}$$

that is, the instantaneous probability of something happening in the immediate future, conditional on survival until time $t$. Then $\alpha$ is obtainable from $S$ by

$$\alpha(t) = \frac{-S'(t)}{S(t)}. \tag{A.2}$$

Inversely, when $\alpha$ is available, $S$ solves the differential equation $S'(t) = -\alpha(t)S(t)$, which leads to

$$S(t) = \exp\Bigl(-\int_0^t \alpha(s)\,ds\Bigr) = \exp(-A(t)), \tag{A.3}$$

where $A(t) = \int_0^t \alpha(s)\,ds$ is the cumulative hazard rate. Note that $T$ does not have to be finite; if $P(T = \infty) > 0$ then $\int_0^\infty f(s)\,ds < 1$ and $\int_0^\infty \alpha(s)\,ds < \infty$, which is sometimes referred to as a defective survival distribution.

We will now see how (A.2) and (A.3) may be generalized to arbitrary distributions, which need be neither absolutely continuous nor discrete. Such generalizations, in particular the product-integral, which generalizes (A.3), are useful for a number of reasons:

• It enables us to handle both discrete and continuous distributions within a unified framework.
• It is the key to understanding the intimate relation between the Kaplan-Meier and Nelson-Aalen estimators (Chapter 3).
• It is the basis for the martingale representation of the Kaplan-Meier estimator; cf. Section 3.2.6.
• It can readily be extended to Markov processes and used to derive a generalization of the Kaplan-Meier estimator to multivariate survival data; cf. Section 3.4 and Section A.2.4.

We now prepare the ground for the product-integral. When $S$ is not absolutely continuous, it is still right-continuous with limits from the left, that is, a cadlag function. Informally, let $dS(t)$ be the increment of $S$ over the small time interval $[t, t+dt)$, so that $-dS(t) = P(t \le T < t+dt)$. If $T$ has a density, we have $-dS(t) = f(t)\,dt$, and if $T$ has point mass $p_t$ in $t$, then $-dS(t) = p_t$, still in a rather informal sense. If we let $S(t-)$ denote the left-hand limit of $S(t)$, the definition (A.1) gives
$$\alpha(t)\,dt = P(t \le T < t + dt \mid T \ge t) = \frac{-dS(t)}{S(t-)},$$

and in this form, since $S$ is monotone, it can be integrated on both sides to yield

$$A(t) = -\int_0^t \frac{dS(u)}{S(u-)}. \tag{A.4}$$

The integral on the right-hand side is now well defined as a Stieltjes integral (Rudin, 1976) regardless of whether $S$ is absolutely continuous or not, and we take this as the definition of the cumulative hazard $A$ in general, also when the hazard $\alpha(t) = dA(t)/dt$ itself does not exist. Note that for the absolutely continuous case $dS(u) = -f(u)\,du$ and $S(u-) = S(u)$, so in this case (A.4) specializes to $A(t) = \int_0^t f(u)/S(u)\,du$, in agreement with (A.2). On the other hand, for a purely discrete distribution, (A.4) takes the form $A(t) = \sum_{u \le t} \alpha_u$, where the discrete hazard
$$\alpha_u = -\{S(u) - S(u-)\}/S(u-) = P(T = u \mid T \ge u)$$

is the conditional probability that the event occurs exactly at time $u$, given that it has not occurred earlier.

Equation (A.4) expresses the cumulative hazard in terms of the survival function. We also need the inverse representation. In differential form, (A.4) may be written
$$dS(t) = -S(t-)\,dA(t), \tag{A.5}$$

or more formally as an integral equation

$$S(t) = 1 - \int_0^t S(u-)\,dA(u) \tag{A.6}$$

for $S$. In the absolutely continuous case this is just an integrated version of (A.2), but the simple solution (A.3) is not valid in general. In fact, the much-studied integral equation (A.6) may serve as an implicit definition of $S$ when an increasing function $A$ is given as the cumulative hazard.

We will sketch a (slightly simplified) argument for how an expression for the solution $S$ can be found. Consider first the conditional survival function $S(v \mid u) = P(T > v \mid T > u) = S(v)/S(u)$, which is the probability of an event occurring later than time $v$, given that it has not yet occurred at time $u$, $v > u$. The conditional survival function is an important concept that corresponds more generally to Markov transition probabilities when an individual can move between several different states, not only from "alive" to "dead". We will discuss the extension in A.2.4.

Partition the time interval $(0,t]$ into a number of subintervals $0 = t_0 < t_1 < t_2 < \cdots < t_K = t$. To survive from time 0 to time $t$, an individual needs to survive all the intermediate subintervals. By proper conditioning, we see that
$$S(t) = P(T > t_1)\,P(T > t_2 \mid T > t_1)\cdots P(T > t_K \mid T > t_{K-1}) = \prod_{k=1}^K S(t_k \mid t_{k-1}). \tag{A.7}$$
Observe that, from (A.5), we have the approximation
$$S(t_k) - S(t_{k-1}) \approx -S(t_{k-1})\,(A(t_k) - A(t_{k-1})), \tag{A.8}$$

or $S(t_k \mid t_{k-1}) \approx 1 - (A(t_k) - A(t_{k-1}))$. Entering this in (A.7), we get
$$S(t) \approx \prod_{k=1}^K \{1 - (A(t_k) - A(t_{k-1}))\}. \tag{A.9}$$

One would expect the approximation to improve with an increasing number of subintervals, and if in this product we let the number $K$ of time intervals increase while their lengths go to zero in a uniform way, the product on the right-hand side will indeed approach a limit, which is termed the product-integral. In fact, for an arbitrary cadlag function $B(t)$ (of locally bounded variation) the product-integral is defined as

$$\pi_{u \le t} \{1 + dB(u)\} \overset{\text{def}}{=} \lim_{M \to 0} \prod_{k=1}^K \{1 + (B(t_k) - B(t_{k-1}))\},$$

where $M = \max_k |t_k - t_{k-1}|$ is the length of the longest subinterval. Here the product-integral notation $\pi$ is used to suggest a limit of finite products $\prod$, just as the integral is a limit of finite sums $\sum$.

Whereas (A.4) expresses $A$ in terms of $S$, the inverse relation can thus be expressed in terms of the product-integral
$$S(t) = \pi_{u \le t} \{1 - dA(u)\}. \tag{A.10}$$
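As an informal numerical illustration (the cumulative hazard is a hypothetical choice, not taken from the text), the limiting behavior in (A.9) and (A.10) can be checked for the Weibull-type cumulative hazard $A(t) = t^2$, for which (A.3) gives $S(t) = \exp(-t^2)$:

```python
import math

# Sketch: approximate the product-integral (A.10) by the finite products (A.9)
# for the absolutely continuous cumulative hazard A(t) = t**2 (hazard
# alpha(t) = 2t, a hypothetical choice); the products approach the closed
# form S(t) = exp(-t**2) from (A.3) as the partition is refined.

def A(t):
    return t ** 2

def product_integral(t, K):
    """Evaluate prod_{k=1}^K {1 - (A(t_k) - A(t_{k-1}))} on an even grid."""
    s = 1.0
    for k in range(1, K + 1):
        s *= 1.0 - (A(k * t / K) - A((k - 1) * t / K))
    return s

t = 1.5
for K in (10, 100, 100000):
    print(K, product_integral(t, K))   # tends to exp(-2.25)
```

With $K = 100000$ subintervals the finite product agrees with $\exp(-2.25)$ to several decimals, in line with the limit defining the product-integral.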
For a purely discrete distribution, this relationship takes the form
$$S(t) = \prod_{u \le t} (1 - \alpha_u).$$
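For a concrete discrete case (a hypothetical example with made-up numbers), a geometric lifetime has constant discrete hazard, and the product above reproduces its survival function:

```python
# Sketch (hypothetical numbers): a geometric lifetime T on {1, 2, ...} with
# P(T = u) = p*(1-p)**(u-1) has survival function S(t) = (1-p)**t and constant
# discrete hazard alpha_u = P(T = u | T >= u) = p; the product
# prod_{u<=t} (1 - alpha_u) then recovers S(t).

p = 0.3

def S(t):                     # S(t) = P(T > t), for t = 0, 1, 2, ...
    return (1.0 - p) ** t

t = 5
alphas = [(S(u - 1) - S(u)) / S(u - 1) for u in range(1, t + 1)]
product = 1.0
for a in alphas:
    product *= 1.0 - a
print(alphas)                 # each discrete hazard equals p = 0.3
print(product, S(t))          # the product equals S(5) = 0.7**5
```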
When $A$ is absolutely continuous we write $dA(u) = \alpha(u)\,du$. Using the approximation $\exp(-\alpha(u)\,du) \approx 1 - \alpha(u)\,du$, valid for small $du$, it is seen (informally) that
$$S(t) = \pi_{u \le t} \{1 - dA(u)\} = \pi_{u \le t} \{1 - \alpha(u)\,du\} = \exp\Bigl(-\int_0^t \alpha(u)\,du\Bigr) = \exp\{-A(t)\},$$

so the product-integral specializes to the well-known relation (A.3). More generally, if we decompose the cumulative hazard into a continuous and a discrete part, that is, $A(t) = A_c(t) + A_d(t)$, where $A_c(t)$ is continuous and $A_d(t)$ is a step function, then the product-integral may be factored as
$$\pi_{u \le t} \{1 - dA(u)\} = e^{-A_c(t)} \prod_{u \le t} \{1 - \Delta A_d(u)\}, \tag{A.11}$$

where $\Delta A_d(u) = A_d(u) - A_d(u-)$ denotes the jump of $A_d$ at time $u$.
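The factorization (A.11) can also be checked numerically for a mixed distribution (the rates and jump below are hypothetical choices for illustration):

```python
import math

# Sketch (hypothetical mixed distribution): cumulative hazard
# A(t) = A_c(t) + A_d(t) with continuous part A_c(t) = 0.5*t and a single
# discrete jump of size 0.3 at time 1. The product-integral, approximated by
# finite products over a fine partition, should match the factorization
# (A.11): exp(-A_c(t)) * (1 - 0.3).

lam, jump_time, jump = 0.5, 1.0, 0.3

def A(t):
    return lam * t + (jump if t >= jump_time else 0.0)

def product_integral(t, K):
    s = 1.0
    for k in range(1, K + 1):
        s *= 1.0 - (A(k * t / K) - A((k - 1) * t / K))
    return s

t = 2.0
approx = product_integral(t, 100001)   # odd K: no grid point falls exactly on 1
exact = math.exp(-lam * t) * (1.0 - jump)
print(approx, exact)
```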
We have shown how the product-integral provides the natural link between the survival function and the cumulative hazard for all types of distributions. It unifies the exponential formulation in the continuous case with the geometric formulation in the discrete case. It provides a similar link between the nonparametric estimates of survival and cumulative hazard, that is, the Kaplan-Meier and Nelson-Aalen estimators. In A.2.4, we will show how the product-integral also extends naturally to expressing transition probabilities in time-inhomogeneous Markov chains.

One important application of the product-integral formulation is to show the "closeness" of two survival functions by using the closeness of their cumulative hazards. In particular, one can find the asymptotic distribution of the Kaplan-Meier estimator using the properties of the corresponding Nelson-Aalen estimator (see Chapter 3). Let $S_1(t)$, $S_2(t)$ be two survival functions with cumulative hazards $A_1(t)$ and $A_2(t)$. To compare $S_1$ and $S_2$, look at the ratio $S_1(t)/S_2(t)$. From the usual formula for differentiating ratios one would expect that

$$d(S_1/S_2) = \frac{S_2\,dS_1 - S_1\,dS_2}{S_2^2} = \frac{S_1\,d(A_2 - A_1)}{S_2},$$

using (A.5). Integrating this relationship on both sides, accounting for possible discontinuities in $S_1$ and $S_2$, we have

$$\frac{S_1(t)}{S_2(t)} = 1 + \int_0^t \frac{S_1(s-)}{S_2(s)}\,d(A_2 - A_1)(s), \tag{A.12}$$

which is Duhamel's equation. Since $S_1(t)/S_2(t) - 1 = (S_1(t) - S_2(t))/S_2(t)$, this gives an expression for the relative difference between $S_1$ and $S_2$ in terms of $A_2 - A_1$. For further details, see Andersen et al. (1993) and Gill and Johansen (1990). The multivariate extension of Duhamel's equation is given in A.2.4.
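Duhamel's equation (A.12) is easy to verify numerically in the absolutely continuous case. The sketch below (with hypothetical hazard rates) compares both sides for two exponential survival functions, approximating the integral by a midpoint Riemann sum:

```python
import math

# Sketch (hypothetical rates): check Duhamel's equation (A.12) for two
# exponential survival functions S1(t) = exp(-a1*t), S2(t) = exp(-a2*t),
# with cumulative hazards A1(t) = a1*t, A2(t) = a2*t. The left side is
# S1(t)/S2(t); the right side is 1 + int_0^t (S1(s)/S2(s)) d(A2 - A1)(s),
# approximated by a midpoint Riemann sum.

a1, a2, t, n = 0.5, 1.2, 3.0, 100000

lhs = math.exp(-a1 * t) / math.exp(-a2 * t)

rhs, ds = 1.0, t / n
for k in range(n):
    s = (k + 0.5) * ds                      # midpoint rule
    rhs += (math.exp(-a1 * s) / math.exp(-a2 * s)) * (a2 - a1) * ds

print(lhs, rhs)
```

Since the distributions are continuous, $S_1(s-) = S_1(s)$ and the left limit plays no role here; it matters only at discontinuities.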
A.2 Markov chains, transition intensities, and the Kolmogorov equations
In a standard analysis of survival data, Markov processes play a minor role. The standard setup involves only individuals in one of two possible states, either alive or dead, sick or cured, in working order or defective, etc., and individuals move only from the first state to the second. However, if $N(t)$ counts the number of individuals in a closed group (no loss or influx of new individuals, in particular no censoring) who have experienced an event by time $t$, and all individuals have a constant hazard $\alpha$ of experiencing an event, then $N$ is one of the prime examples of a continuous time Markov chain, namely the so-called pure birth process. From this observation alone, it should be clear that a basic understanding of the behavior of Markov chains may be useful when applying the theory of counting processes to event history analysis.

When the standard event history models are extended to multistate models such as the competing risk model and the illness-death model, it is necessary to keep track of the states an individual moves through and of the transition intensities between states. Markov models are then essential tools for a detailed probability description, and counting process theory is used in the estimation of transition intensities for the Markov models.

In this book, even more extensive use of Markov models is made in Chapters 10 and 11, where Markov models and extensions of these will be used to describe unobserved dynamic frailty, that is, models that describe how the risk of a specific individual develops over time, until disease or other events eventually manifest themselves as a consequence of an elevated risk level. In such models, the states of the underlying Markov process are not always observed directly. The Markov model may then be seen as a convenient and intuitive tool for constructing parametric hazards.
If the Markov model is successful in describing the underlying risk process, this will be reflected in corresponding hazard models whose properties reflect those of the observed hazard.

In the following we give a brief review of standard discrete and continuous time Markov chain models. We discuss both time-homogeneous and time-inhomogeneous Markov chains, as both concepts will be of specific use in the probability models introduced in this book. Together with the martingale concept, the Markov principle is among the most frequently used methods for simplifying the dependence structure of a stochastic process. Intuitively, a Markov process is defined through the property that once we know the current state of the process, any knowledge of the past gives no further information for predicting the future. That is, to describe the probability distribution of the process in the future it suffices to make use of its current state; there is no extra information to be extracted from past history. The practical upshot of the Markov definition is a simplification of the transition probabilities that describe the probability for the process to move from one state to another within a specified time interval; these are the same regardless of the behavior of the Markov chain in the past. We now give a more formal definition.

Let $X(t)$ be a stochastic process where the time index $t$ can either be a nonnegative integer (discrete time) or a nonnegative real value (continuous time), and let $\mathcal{S}$ denote the set of possible values $X$ can attain (the state space). We briefly discuss discrete time processes, but most of our applications will focus on continuous time. In the following introduction we assume $X$ takes values on the integers, that is, $\mathcal{S} = \{0, \pm 1, \pm 2, \ldots\}$, in which case $X$ is referred to as a Markov chain. Typically, $X$ will be nonnegative, thus $\mathcal{S} = \{0, 1, 2, \ldots\}$. However, the general Markov process theory also covers processes that are continuous in both space and time, that is, the state space is $\mathbb{R}$ (or restrictions like $\mathbb{R}_+$). We use such processes, in particular the Wiener process and the Ornstein-Uhlenbeck process, as models for random risk development (Chapters 10 and 11) and as limit processes for point process martingales (see, for example, Section 2.3). They are discussed in more detail in Section A.4. The basic Markov principle is still the same, although the theory requires some technical modifications. We say that $X(t)$ is Markov if
$$P(X(t) = x \mid X(t_k) = x_k, X(t_{k-1}) = x_{k-1}, \ldots, X(t_1) = x_1) = P(X(t) = x \mid X(t_k) = x_k) \tag{A.13}$$

for any selection of time points $t, t_1, \ldots, t_k$ such that $t_1 < \cdots < t_{k-1} < t_k < t$, and integers $x, x_1, \ldots, x_k$. As mentioned, this intuitively means that as long as the value $x_k$ of $X(t_k)$ is known, the value of $X$ at earlier times is uninformative when predicting future outcomes of $X$, or alternatively that the past and the future are independent given the present. If we let $\mathcal{F}_t = \sigma(X(s), s \le t)$ denote the entire cumulated history ($\sigma$-algebra) generated by $X$ until time $t$, the Markov property can be written

$$P(X(t) \in B \mid \mathcal{F}_s) = P(X(t) \in B \mid X(s)), \quad t \ge s,$$

for a subset $B$ of the state space $\mathcal{S}$. The Markov chain $X$ is said to be time-homogeneous when

$$P(X(s+t) = y \mid X(s) = x) = P(X(t) = y \mid X(0) = x), \tag{A.14}$$

that is, the transition probabilities depend only on the separation in time ($t$), not on the time of starting ($s$). Thus, a time-homogeneous Markov chain completely "restarts" at time $s$ if $X(s)$ is known. The past history of the process can be discarded, as can the current time $s$. We will consider both homogeneous and inhomogeneous chains in what follows.
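As an informal illustration of (A.14) (the chain and its transition probabilities below are hypothetical), homogeneity means that one and the same matrix governs every step, so the transition frequencies observed along a long simulated path should recover it:

```python
import random

# Sketch (hypothetical chain): simulate a time-homogeneous two-state Markov
# chain and estimate its one-step transition probabilities from the observed
# frequencies; by (A.14) the same matrix P governs every step.

random.seed(1)
P = {0: [0.9, 0.1], 1: [0.4, 0.6]}   # rows: transition probabilities

def step(x):
    return 0 if random.random() < P[x][0] else 1

counts = {(i, j): 0 for i in (0, 1) for j in (0, 1)}
x = 0
for _ in range(200000):
    y = step(x)
    counts[(x, y)] += 1
    x = y

for i in (0, 1):
    total = counts[(i, 0)] + counts[(i, 1)]
    print(i, counts[(i, 0)] / total, counts[(i, 1)] / total)
```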
A.2.1 Discrete time-homogeneous Markov chains
Discrete time-homogeneous Markov chains do not only have a discrete state space; the time axis is also discrete, with the process $X(n)$ defined for $n = 0, 1, 2, \ldots$. Due to the Markov property (A.13), the behavior of a discrete time-homogeneous Markov chain is completely described by the distribution of the starting value $X(0)$ and the one-step transition probability matrix $P$, with elements

$$p_{ij} = P(X(n+1) = j \mid X(n) = i) = P(X(1) = j \mid X(0) = i).$$
For instance, the two-step transition probabilities can be computed by summing over all possible states at time 1, as in
$$P(X(2) = j \mid X(0) = i) = \sum_{k \in \mathcal{S}} P(X(2) = j \mid X(1) = k, X(0) = i)\, P(X(1) = k \mid X(0) = i) = \sum_{k \in \mathcal{S}} p_{ik}\,p_{kj},$$

which in fact corresponds to the elements of the matrix $P^2$. The principle of summing over an intermediate time point leads in general to equations of the form

$$P(X(n) = j \mid X(0) = i) = \sum_{s \in \mathcal{S}} P(X(n) = j \mid X(m) = s)\, P(X(m) = s \mid X(0) = i), \quad n > m > 0,$$

which are known as the Chapman-Kolmogorov equations. Repeated applications of the Chapman-Kolmogorov equations lead to the general result that the $n$-step transition probabilities can be written as a matrix product, in that $P(X(n) = j \mid X(0) = i) = p_{ij}(n)$, where $p_{ij}(n)$ are the elements of $P^n$. Similarly, if we let $p(n)$ be the vector of probabilities $p_i(n) = P(X(n) = i)$, and $p(0)$ the distribution of the starting value $X(0)$, then

$$p(n) = (P^T)^n\,p(0). \tag{A.15}$$
It should by now be clear that all finite-dimensional distributions of $X$, that is, all probabilities of the form
$$P(X(t_k) = x_k, X(t_{k-1}) = x_{k-1}, \ldots, X(t_1) = x_1),$$

can be expressed in terms of the initial distribution $p(0)$ and the transition matrix $P$, thus completely describing the probability structure of $X$. As a simple example, consider the one-step transition probability matrix

$$P = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 1/2 & 1/2 & 0 & 0 \\ 0 & 1/2 & 0 & 1/2 \\ 0 & 0 & 1 & 0 \end{pmatrix}.$$
(Note that the first row/column corresponds to state 0, the second to state 1, etc.) A numerical computation of matrix powers shows, for instance, that
$$P^5 = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0.969 & 0.031 & 0 & 0 \\ 0.656 & 0.219 & 0 & 0.125 \\ 0.563 & 0.188 & 0.25 & 0 \end{pmatrix}, \qquad P^{10} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0.999 & 0.001 & 0 & 0 \\ 0.938 & 0.030 & 0.031 & 0 \\ 0.908 & 0.061 & 0 & 0.031 \end{pmatrix}.$$
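These powers are reproduced in a few lines (a sketch using numpy's matrix power, with states numbered 0 through 3 as above):

```python
import numpy as np

# Sketch: reproduce the n-step transition probabilities above with a
# numerical matrix power (states 0..3, state 0 absorbing).

P = np.array([[1.0, 0.0, 0.0, 0.0],
              [0.5, 0.5, 0.0, 0.0],
              [0.0, 0.5, 0.0, 0.5],
              [0.0, 0.0, 1.0, 0.0]])

P5 = np.linalg.matrix_power(P, 5)
P10 = np.linalg.matrix_power(P, 10)
print(np.round(P5, 3))
print(np.round(P10, 3))   # probability mass drifts into the absorbing state 0
```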
Thus, all $n$-step transition probabilities are easily computed. It is evident, however, that as $n$ grows, the transition probabilities in our example all go to zero, except the probability of being in state 0, which increases toward 1. This means that in the long run, $X(n)$ will be in state 0 with almost absolute certainty. The state 0 is thus an absorbing state. Since we are interested in survival models defined in terms of hitting times, that is, the time it takes $X$ to reach state 0, the limiting behavior is important for understanding how the hazard develops. This is explored in more detail in A.3 and Chapter 10.
Meanwhile, we will look at Markov chains in continuous time and see how they relate to multistate models of event histories.
A.2.2 Continuous time-homogeneous Markov chains
We now assume that the time axis of the Markov chain is continuous, that is, the process $X(t)$ is defined for all $t \in [0, \infty)$. The state space is still assumed discrete. There is an obvious connection between a continuous and a discrete time-homogeneous Markov chain. If for the continuous chain we only keep track of what happens at each transition, that is, what states the chain moves between but not at what times, the resulting discrete time process is called the embedded chain. It is intuitively clear that the embedded chain is itself a discrete time Markov chain. In addition to the transitions, the time spent by the process in each state is random. The time-homogeneous Markov property dictates that the time from one transition to the next follows an exponential distribution, where the expected time to transition may depend on the current state. The reason for this is that the exponential distribution has no "memory", since $P(T \ge t+s \mid T \ge s) = P(T \ge t)$ for all $t, s$ when $T$ has an exponential distribution, and that the hazard of a new transition at any time is constant, that is, $P(t \le T < t+dt \mid T \ge t)/dt = \alpha$ does not depend on $t$. When this is the case, the only valuable information for future predictions of the chain is the current state of the chain, not how long the chain has been in that particular state. Hence, knowing the past states is useless, in agreement with the Markov property.

The continuous time-homogeneous Markov chain can be described by its transition probability matrix $P(t)$ with elements $p_{ij}(t) = P(X(t) = j \mid X(0) = i)$, $t \ge 0$, together with the initial distribution. The transition matrix can be computed from a formula similar to the $P^n$ used in the discrete time situation. Before doing this, however, it is useful to look at an alternative description of the behavior of the continuous chain, namely the infinitesimal generator. We will introduce this by way of an example, a finite, continuous time Markov chain $X$ on the states $\{0, 1, \ldots, 5\}$.

In Figure A.1, each box represents a state of the process, and the arrows indicate the possible transitions. The process is known as a birth-death process since it can move up (a birth) or down (a death) one step at a time. The embedded (time discrete) process is known as a random walk. The parameters $\beta_1$ and $\beta_2$ are the transition intensities. If the process is in state 3, it can either move up to state 4 (intensity $\beta_2$) or down to state 2 (intensity $\beta_1$). A common way of picturing the process leading up to a jump is to imagine two independent random times $T_1$ and $T_2$, exponentially distributed with parameters $\beta_1$ and $\beta_2$, respectively, such that $T_1$ measures the time $X$ spends in a specific state until it moves down, and $T_2$ the time until it moves up. The two random times compete in the sense that if $T_1 < T_2$ the process moves down at time $T_1$, and up at time $T_2$ if $T_2 < T_1$. The time actually spent in a state is thus $T = T_1 \wedge T_2$, which is exponentially distributed with parameter $\beta_1 + \beta_2$, and the probabilities of moving down and up are $\beta_1/(\beta_1+\beta_2)$ and $\beta_2/(\beta_1+\beta_2)$, respectively, derived from the properties of the exponential distribution. However, one usually does not wish
Fig. A.1 A six-state birth-death process with one absorbing state (0) and one reflecting (5).
to tie the interpretation of the transition intensities too closely to an explicit model of independent waiting times. Rather, it is customary to say that the process has an intensity of $\beta_1 + \beta_2$ of having an event, and that the ratio of the two transition probabilities, conditional on a transition actually occurring, is $\beta_1/\beta_2$. We now see that the one-step transition probabilities for the embedded chain of our six-state chain are described by

$$P = \begin{pmatrix} 1 & 0 & 0 & 0 & 0 & 0 \\ l_1 & 0 & l_2 & 0 & 0 & 0 \\ 0 & l_1 & 0 & l_2 & 0 & 0 \\ 0 & 0 & l_1 & 0 & l_2 & 0 \\ 0 & 0 & 0 & l_1 & 0 & l_2 \\ 0 & 0 & 0 & 0 & 1 & 0 \end{pmatrix},$$

with $l_1 = \beta_1/(\beta_1+\beta_2)$ and $l_2 = 1 - l_1$. Notice that whereas states 1 through 4 all behave like state 3, state 5 is reflecting. Whenever the process reaches state 5, it will eventually return to state 4, after a sojourn time in 5 of expected length $1/\beta_1$. State 0 is absorbing; once $X$ has reached this state it will stay there forever. The infinitesimal generator of $X$ is defined as

$$\boldsymbol{\alpha} = \begin{pmatrix} 0 & 0 & 0 & 0 & 0 & 0 \\ \beta_1 & -(\beta_1+\beta_2) & \beta_2 & 0 & 0 & 0 \\ 0 & \beta_1 & -(\beta_1+\beta_2) & \beta_2 & 0 & 0 \\ 0 & 0 & \beta_1 & -(\beta_1+\beta_2) & \beta_2 & 0 \\ 0 & 0 & 0 & \beta_1 & -(\beta_1+\beta_2) & \beta_2 \\ 0 & 0 & 0 & 0 & \beta_1 & -\beta_1 \end{pmatrix}. \tag{A.16}$$
The generator is just the matrix of transition intensities. The values along the diagonal are (with signs inverted) the total intensity of a process in that state; that is, if the process is in state $i$, $i = 1, \ldots, 4$, the intensity of leaving that state is $\beta_1 + \beta_2$. In this constant intensity example, the expected time spent in any of the states $1, \ldots, 4$ is thus $1/(\beta_1+\beta_2)$. The process will, on average, spend a time $1/\beta_1$ in state 5 before being reflected to state 4, and the parameter 0 for state 0 illustrates the infinite amount of time spent in an absorbing state.
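The structure of the generator (A.16) is easy to build programmatically and to check against its defining properties; the rates below are hypothetical:

```python
import numpy as np

# Sketch (hypothetical rates): build the birth-death generator (A.16) for
# given beta1 (down) and beta2 (up) and check its defining properties:
# off-diagonal entries are nonnegative intensities and every row sums to 0.

def birth_death_generator(beta1, beta2, n_states=6):
    Q = np.zeros((n_states, n_states))
    for i in range(1, n_states - 1):          # interior states 1..4
        Q[i, i - 1] = beta1                   # move down
        Q[i, i + 1] = beta2                   # move up
        Q[i, i] = -(beta1 + beta2)
    Q[n_states - 1, n_states - 2] = beta1     # state 5 is reflecting
    Q[n_states - 1, n_states - 1] = -beta1
    return Q                                  # state 0 row stays 0: absorbing

Q = birth_death_generator(2.0, 1.0)
print(Q)
print(Q.sum(axis=1))    # all rows sum to zero
```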
A.2.3 The Kolmogorov equations for homogeneous Markov chains
Let us formalize the definition of the generator of a time-homogeneous Markov chain. The generator $\boldsymbol{\alpha}$ is defined as the time-derivative (elementwise) at time zero of the transition probability matrix $P(t)$. That is,

$$\boldsymbol{\alpha} = P'(0) = \lim_{\Delta t \to 0+} \frac{1}{\Delta t}\,(P(\Delta t) - P(0)). \tag{A.17}$$

For our birth-death generator example (A.16), this is seen from the following intuitive reasoning: if $X(0) = i$, $1 \le i \le 4$, then during the short interval $[0, \Delta t)$ the probability of $X$ moving up one step is approximately equal to $\beta_2 \Delta t$, the probability of moving down is $\beta_1 \Delta t$, and of not moving at all is $1 - (\beta_1+\beta_2)\Delta t$. The probability of a simultaneous up/down movement within a short interval is negligible, as are the probabilities of more than one upward or more than one downward movement. Similar arguments hold for the boundary states 0 and 5. Using this to express $P(\Delta t)$ approximately for small $\Delta t$, and noting that $P(0)$ is just the identity matrix, we see that the right-hand side of (A.17) ends up as the generator matrix. Note that each element of $\boldsymbol{\alpha}$ is thus

$$\alpha_{ij} = \lim_{\Delta t \to 0+} \frac{1}{\Delta t}\,P(X(t+\Delta t) = j \mid X(t) = i)$$

when $i \ne j$ (independent of $t$). Its similarity to definition (A.1) should be clear: the hazard measures the infinitesimal probability of going from "alive" to "dead" at time $t$; the Markov intensity measures the more general infinitesimal probability of going from state $i$ to state $j$ at time $t$. This is a basic connection between event history analysis and Markov theory, and we develop it further in A.2.4.

Proceeding with the relationship between $\boldsymbol{\alpha}$ and $P$, we can now use the Chapman-Kolmogorov equations to write
$$P(t+\Delta t) - P(t) = P(t)P(\Delta t) - P(t) = P(t)\,(P(\Delta t) - I) \approx P(t)\,\boldsymbol{\alpha}\,\Delta t, \tag{A.18}$$

and after using (A.17) this leads to the Kolmogorov forward (differential) equation

$$P'(t) = P(t)\,\boldsymbol{\alpha},$$

with the initial condition $P(0) = I$. Similarly, one can derive the Kolmogorov backward differential equation $P'(t) = \boldsymbol{\alpha}\,P(t)$ with the same initial condition. In fact, the Kolmogorov equations are the matrix equivalents of (A.2), the differential equation for $S(t)$, and if $P$ and $\boldsymbol{\alpha}$ were scalar, we would have the familiar solution
$$P(t) = \exp(\boldsymbol{\alpha}t). \tag{A.19}$$
This formula also holds true in the matrix case. The exponential function must then be defined as the matrix exponential, where
$$\exp(B) \overset{\text{def}}{=} I + \sum_{n=1}^\infty \frac{1}{n!}\,B^n.$$
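As a numerical sanity check (a sketch with a hypothetical constant hazard $a$; in practice one would use a dedicated routine such as a Padé-based matrix exponential rather than the raw series), truncating this series for the two-state generator $\boldsymbol{\alpha} = \begin{pmatrix} -a & a \\ 0 & 0 \end{pmatrix}$ recovers the known closed form of $\exp(\boldsymbol{\alpha}t)$:

```python
import numpy as np

# Sketch: the matrix exponential via its truncated series, applied to the
# two-state generator alpha = [[-a, a], [0, 0]] (a a hypothetical constant
# hazard), for which exp(alpha*t) is known in closed form:
# [[exp(-a*t), 1 - exp(-a*t)], [0, 1]].

def expm_series(B, terms=40):
    result = np.eye(B.shape[0])
    term = np.eye(B.shape[0])
    for n in range(1, terms + 1):
        term = term @ B / n          # term now holds B**n / n!
        result = result + term
    return result

a, t = 0.7, 2.0
alpha = np.array([[-a, a], [0.0, 0.0]])
P = expm_series(alpha * t)
print(P)    # compare with [[exp(-1.4), 1 - exp(-1.4)], [0, 1]]
```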
It is not hard to verify that when (formally) differentiating $\exp(\boldsymbol{\alpha}t)$ with respect to $t$, it satisfies the Kolmogorov equations. It should be noted that the above relations depend heavily on the chain being time-homogeneous. For inhomogeneous chains, the generator $\boldsymbol{\alpha}$ will depend on $t$, which makes the situation more complex. The extension to inhomogeneous chains is discussed in the following section. For more information on the matrix exponential, see Moler and van Loan (1978, 2003).

Let us return for a moment to the time-discrete chains. The Kolmogorov forward equation simply corresponds to the difference equations $\Delta P(n) = P(n) - P(n-1) = P(n-1)\boldsymbol{\alpha}$ with $\boldsymbol{\alpha} = P - I$, and the solution is, as we have seen, $P(n) = P^n$. This can also be seen as a continuous time chain with jumps only at integer time points, and $P(t) = P(n)$ for $n \le t < n+1$ is a cadlag function. Then $dP(t) = P(t-)\boldsymbol{\alpha}$. So both time-discrete and time-continuous homogeneous processes behave in a similar fashion, and the resulting transition probabilities behave like exponential (or geometric) functions of time. In the next subsection, we show how the product-integral serves to unify the two, as in the case of survival functions seen at the start of this appendix.
A.2.4 Inhomogeneous Markov chains and the product-integral
We now consider Markov chains where the transition intensities $\boldsymbol{\alpha} = \boldsymbol{\alpha}(t)$ depend on time, that is, time-inhomogeneous processes. To begin with, we will consider the simplest example possible and show how a two-state Markov process relates to the concept of survival, or time-to-event. Let $X(t)$ be defined on the state space $\{0, 1\}$ by the transition intensity matrix

$$\boldsymbol{\alpha}(t) = \begin{pmatrix} -\alpha(t) & \alpha(t) \\ 0 & 0 \end{pmatrix}. \tag{A.20}$$
State 1 is thus absorbing, and the intensity of leaving state 0 and entering state 1 is $\alpha(t)$ at time $t$. The process is illustrated in Figure A.2.

Fig. A.2 The standard survival model. From its initial state 0 ("alive") the process can move only to state 1 ("dead"). The hazard of moving to state 1 at time $t$ is $\alpha(t)$.

When starting in state 0, we can define the event time $T = \min\{t \mid X(t) = 1\}$, that is, the time until absorption in state 1. Then $T$ is a survival time with hazard $\alpha(t)$. Let $f(t)$ be the density of $T$, $S(t) = P(T > t)$ the corresponding survival function, and $P(t)$ the transition probabilities, with elements $P_{ij}(t)$, $i, j = 0, 1$. Notice that $P_{00}(t) = S(t)$ is the survival function (the probability of still being in state 0 after time $t$), and we know from (A.3) that $S(t) = \exp(-\int_0^t \alpha(s)\,ds)$. Since $P_{01}(t) = 1 - P_{00}(t)$, the full transition probability matrix can thus be written

$$P(t) = \begin{pmatrix} S(t) & 1 - S(t) \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} \exp\bigl(-\int_0^t \alpha(s)\,ds\bigr) & 1 - \exp\bigl(-\int_0^t \alpha(s)\,ds\bigr) \\ 0 & 1 \end{pmatrix}. \tag{A.21}$$
This reveals the simple connection between survival and Markov processes in this two-state example. Note that when $\alpha$ is constant, formula (A.19) applies, and we find

$$(\boldsymbol{\alpha}t)^n = \begin{pmatrix} (-1)^n (\alpha t)^n & (-1)^{n+1} (\alpha t)^n \\ 0 & 0 \end{pmatrix}$$

and

$$P(t) = \exp(\boldsymbol{\alpha}t) = I + \sum_{n=1}^\infty \frac{1}{n!}(\boldsymbol{\alpha}t)^n = \begin{pmatrix} \sum_{n=0}^\infty \frac{1}{n!}(-\alpha t)^n & 1 - \sum_{n=0}^\infty \frac{1}{n!}(-\alpha t)^n \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} \exp(-\alpha t) & 1 - \exp(-\alpha t) \\ 0 & 1 \end{pmatrix},$$

which verifies (A.21). In fact, in this simple example (A.19) also holds for the time-inhomogeneous case, in the sense that $P(t) = \exp\bigl(\int_0^t \boldsymbol{\alpha}(s)\,ds\bigr)$, where the integration is taken elementwise and the matrix exponential is defined as before. Unfortunately, this simple formula is not true in the general inhomogeneous case.

To find a correct expression for the solution in the inhomogeneous case we need to use the product-integral. Recall from (A.11) that the product-integral provided a unification of discrete and absolutely continuous survival distributions. We will now see that, with a little extra care, it also extends in a natural way to the matrix situation resulting from a multistate Markov process with many possible transitions.

For inhomogeneous processes we cannot use the simplifying relation (A.14) to express the transition probabilities as the matrix $P(t)$ with only a single time parameter. Rather, we need to write $P(s,t)$ for the transition probabilities, where the matrix elements are $P_{ij}(s,t) = P(X(t) = j \mid X(s) = i)$. In the homogeneous case, this reduces to $P(s,t) = P(t-s)$. When the transition probabilities are absolutely continuous, we can again use the Chapman-Kolmogorov equations as in (A.18) to see that the Kolmogorov forward equation

$$\frac{\partial}{\partial t} P(s,t) = P(s,t)\,\boldsymbol{\alpha}(t) \tag{A.22}$$

holds, with

$$\boldsymbol{\alpha}(t) = \lim_{\Delta t \to 0+} \frac{1}{\Delta t}\,(P(t, t+\Delta t) - I). \tag{A.23}$$

Indeed, in the general case (not necessarily absolutely continuous transition probabilities), the forward equation can be written

$$P(s,t) = I + \int_s^t P(s,u-)\,dA(u),$$

which is the equivalent of (A.6) for multistate processes.
The matrix function A(t) is the matrix of cumulative transition intensities and as such generalizes the cumulative hazard. In the absolutely continuous case, it is just the elementwise integral of α(t).
To find a solution in the general case, we use the Chapman-Kolmogorov equations to write

$$P(s,t) = P(t_0,t_1)\,P(t_1,t_2)\cdots P(t_{K-1},t_K)$$

for a partition $s = t_0 < t_1 < t_2 < \cdots < t_K = t$. Notice that the matrix product must be taken carefully in sequence from left to right with increasing time index. This product formulation is the multistate equivalent of the product of conditional survival probabilities given in (A.7). Arguing as we did to derive (A.9), we can now write
$$P(s,t) \approx \prod_{k=1}^K \{I + (A(t_k) - A(t_{k-1}))\}$$

with the partition $s = t_0 < t_1 < t_2 < \cdots < t_K = t$. A crucial difference from the one-dimensional case, though, is that the product needs to be taken in the appropriate increasing order from left to right, that is, as

$$\{I + (A(t_1) - A(t_0))\}\{I + (A(t_2) - A(t_1))\}\cdots\{I + (A(t_K) - A(t_{K-1}))\}.$$
The matrices A(tk) − A(tk−1) do not necessarily commute, and this is the reason (A.19) does not extend in general to the inhomogeneous case, not even in the A.2 Markov chains, transition intensities, and the Kolmogorov equations 471 absolutely continuous case. We now let the lengths of the subintervals go to zero and arrive at the solution as a matrix product-integral
$$ P(s,t) = \mathop{\pi}\limits_{u \in (s,t]} \{ I + dA(u) \}. $$
In the continuous case, we can write
$$ P(s,t) = \mathop{\pi}\limits_{u \in (s,t]} \{ I + \alpha(u)\,du \}. \tag{A.24} $$
It should be kept in mind, however, that the sequence of the product matters also in the continuous case. For a discrete-time process, the increments of the cumulative transition intensities are simply $dA(t_k) = P_k - I$, where $P_k$ is the transition probability matrix at time $t_k$. The product-integral is thus simply a restatement of the Chapman-Kolmogorov equations $P(t_k, t_{k+l}) = P_k P_{k+1} \cdots P_{k+l}$.

The product-integral is a compact way of expressing the solution to the Kolmogorov forward equation. In general, it may not be possible to reduce the product-integral to "simpler" expressions even in the continuous case. In the next subsection we look at useful examples where relatively explicit solutions can be found.

For completeness, we end this subsection with the multivariate formulation of Duhamel's equation (A.12). For two different transition probability matrices $P_1$ and $P_2$ with cumulative transition intensity matrices $A_1$ and $A_2$, the formula is

$$ P_1(s,t)P_2(s,t)^{-1} - I = \int_s^t P_1(s,u-)\,d(A_1 - A_2)(u)\,P_2(s,u)^{-1}, \tag{A.25} $$

which again is useful for deriving the asymptotic properties of the empirical transition probability matrices; cf. Section 3.4 and Gill and Johansen (1990).
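To make the left-to-right matrix product in (A.24) concrete, here is a minimal numerical sketch (not from the text; all function names, the hazard α(t) = t, and the parameter values are our own choices). For this two-state survival model the survival probability has the closed form exp(−t²/2), so the finite product can be checked against it.

```python
# Sketch: approximate P(s,t) = prodint_{(s,t]} {I + alpha(u) du} by a
# finite left-to-right matrix product over a fine partition, for a
# two-state survival model with the hypothetical hazard alpha(t) = t.

import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def alpha(t):
    a = t  # hazard of dying at time t (our illustrative choice)
    return [[-a, a],
            [0.0, 0.0]]

def product_integral(s, t, K=20000):
    """Left-to-right product of the factors I + alpha(t_k) dt."""
    dt = (t - s) / K
    P = [[1.0, 0.0], [0.0, 1.0]]  # start from the identity
    for k in range(1, K + 1):
        a = alpha(s + k * dt)
        F = [[1.0 + a[0][0] * dt, a[0][1] * dt],
             [a[1][0] * dt, 1.0 + a[1][1] * dt]]
        P = matmul(P, F)  # order matters: earlier factors on the left
    return P

P = product_integral(0.0, 2.0)
# For this one-dimensional hazard, P00(0,t) = exp(-t^2/2) in closed form.
exact = math.exp(-2.0 ** 2 / 2)
print(P[0][0], exact)
```

Each factor is row-stochastic, so the approximation keeps row sums equal to one, mirroring the fact that the product-integral yields a genuine transition probability matrix.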
A.2.5 Common multistate models
As a simple but important illustration extending (A.20), we consider the competing risk model. Let X(t) describe the state of an individual at time t. Assume individuals in state 0 are those still alive, and that these individuals may die from one of r different causes. The intensities $\alpha_{0h}(t)$, $h = 1,2,\ldots,r$, represent the instantaneous risk of dying from cause h at time t given that the individual is still alive at this time. The model is illustrated in Figure A.3. Since death is absorbing, we have the transition intensity matrix

$$ \alpha(t) = \begin{pmatrix} -\sum_{j=1}^{r} \alpha_{0j}(t) & \alpha_{01}(t) & \cdots & \alpha_{0r}(t) \\ 0 & 0 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & 0 \end{pmatrix}. $$
Fig. A.3 The competing risk model. From its initial state 0 the process can move to any of a number of terminal stages. The hazard of moving to state j at time t is α0 j(t).
The transition probability matrix is of the form

$$ P(s,t) = \begin{pmatrix} P_{00}(s,t) & P_{01}(s,t) & \cdots & P_{0r}(s,t) \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{pmatrix}. $$
By the Kolmogorov forward equation (A.22), we obtain the one-dimensional differential equations

$$ \frac{\partial}{\partial t} P_{00}(s,t) = -P_{00}(s,t) \sum_j \alpha_{0j}(t), \qquad \frac{\partial}{\partial t} P_{0h}(s,t) = P_{00}(s,t)\,\alpha_{0h}(t). $$

The equations are solved by standard methods to get

$$ P_{00}(s,t) = \exp\Bigl( -\int_s^t \sum_j \alpha_{0j}(u)\,du \Bigr), \qquad P_{0h}(s,t) = \int_s^t \alpha_{0h}(u)\,P_{00}(s,u)\,du. $$
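As a small numerical illustration (the constant hazard values are our own, not from the text): with constant cause-specific hazards the integrals above reduce to closed forms, which a simple Riemann sum over the general formula reproduces.

```python
# Sketch: competing risks with two causes and constant hazards a1, a2
# (illustrative values).  With constant hazards,
#   P00(0,t) = exp(-(a1+a2) t),
#   P0h(0,t) = (a_h/(a1+a2)) * (1 - exp(-(a1+a2) t)),
# which we check against a midpoint-rule evaluation of
#   P0h(0,t) = int_0^t a_h * P00(0,u) du.

import math

a1, a2 = 0.3, 0.1          # cause-specific hazards (our choice)
t = 2.0
atot = a1 + a2

P00 = math.exp(-atot * t)  # probability of still being alive at t

# Closed form for cause 1
P01_exact = a1 / atot * (1 - math.exp(-atot * t))

# Midpoint Riemann sum of the general integral formula
K = 100000
du = t / K
P01_num = sum(a1 * math.exp(-atot * (k + 0.5) * du) * du for k in range(K))

print(P00, P01_exact, P01_num)
```

The three quantities P00, P01, and P02 then sum to one, as the rows of a transition probability matrix must.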
Seen from a survival perspective, $S(t \mid s) = P_{00}(s,t)$ is the probability of surviving from time s to time t conditional on not yet being dead at time s. The product $\alpha_{0h}(t)P_{00}(s,t) = \alpha_{0h}(t)S(t \mid s)$ is the cause-specific probability density, and $P_{0h}(s,t)$ is the cause-specific cumulative probability, that is, the probability of dying from cause h in the interval (s,t] conditional on being alive at time s. Note that $1 - P_{0h}(s,t)$ is harder to interpret since it is the probability of not dying from cause h in the interval (s,t], which means either surviving or dying from another cause. Clearly, the states in this example may signify things other than death, as long as the states 1,...,r are "terminal". For instance, the process may describe the waiting time in a queue, which ends the first time one of several operators becomes available for service. The competing risk situation is analyzed further in Section 3.4.

Another useful example is a model sometimes referred to as a conditional model; see, for instance, Therneau and Grambsch (2000, p. 187) and Prentice et al. (1981). In this model, a subject moves through successive stages in one direction, for each stage experiencing a new hazard of moving to the next stage. It might, for instance, describe the "hazard" of giving birth to a child, conditional on already having had none, one, or two or more children. We can use the age of the mother as the time scale and let $\alpha_{01}(t)$, $\alpha_{12}(t)$, $\alpha_{23}(t),\ldots$ model the hazard of passing from 0 to 1 child, 1 to 2, etc., as functions of age t. Thus, $\alpha_{12}(33)$ would be the instantaneous "risk" of giving birth for a mother of age 33 with one child. It should be noted that a more detailed model should include information about time since last birth, since there must be a minimum separation in time between successive births. Alternatively, a model using time since last birth as the main time scale could be used.
With age as the time axis, our model can be illustrated as in Figure A.4.
Fig. A.4 The conditional model. The process moves along a series of stages, with the transition hazards α01(t), α12(t),... .
The transition intensity matrix takes the form

$$ \alpha(t) = \begin{pmatrix} -\alpha_{01}(t) & \alpha_{01}(t) & 0 & 0 & \cdots \\ 0 & -\alpha_{12}(t) & \alpha_{12}(t) & 0 & \cdots \\ 0 & 0 & -\alpha_{23}(t) & \alpha_{23}(t) & \cdots \\ 0 & 0 & 0 & -\alpha_{34}(t) & \cdots \\ \vdots & \vdots & \vdots & \vdots & \ddots \end{pmatrix}, $$

and we can still appeal to the Kolmogorov forward equation (A.22) to obtain the equations

$$ \frac{\partial}{\partial t} P_{ii}(s,t) = -P_{ii}(s,t)\,\alpha_{i,i+1}(t), $$
$$ \frac{\partial}{\partial t} P_{ij}(s,t) = P_{i,j-1}(s,t)\,\alpha_{j-1,j}(t) - P_{ij}(s,t)\,\alpha_{j,j+1}(t), $$

where $j \ge i+1$. The solutions can be obtained by first computing

$$ P_{ii}(s,t) = \exp\Bigl( -\int_s^t \alpha_{i,i+1}(u)\,du \Bigr). $$
The remaining solutions can be obtained recursively in j. For instance, from

$$ \frac{\partial}{\partial t} P_{0j}(s,t) = P_{0,j-1}(s,t)\,\alpha_{j-1,j}(t) - P_{0j}(s,t)\,\alpha_{j,j+1}(t), $$

we get

$$ P_{0j}(s,t) = P_{jj}(s,t) \int_s^t P_{0,j-1}(s,u)\,\alpha_{j-1,j}(u)\,P_{jj}(s,u)^{-1}\,du, $$

and similar solutions can be obtained for $P_{ij}(s,t)$ when $j \ge i+1$.

The last example we consider is the three-state illness-death model. An individual in state 0 is considered healthy. A healthy individual can then either get a disease, which puts him/her in state 1, or he/she can die of other causes, ending in state 2. Diseased individuals can also die, that is, go from state 1 to state 2. State 2, by its nature, is absorbing. In our example, diseased individuals cannot recover, although the possibility of a transition from state 1 to state 0 might also be added to the model. The process is depicted in Figure A.5. The corresponding transition
Fig. A.5 The illness-death model. From its initial (“Healthy”) state 0, an individual can either move to state 1 (“Diseased”) or die (state 2). In this example, an individual in state 1 can only move to state 2, that is, there is no recovery.
intensity matrix is

$$ \alpha(t) = \begin{pmatrix} -(\alpha_{01}(t) + \alpha_{02}(t)) & \alpha_{01}(t) & \alpha_{02}(t) \\ 0 & -\alpha_{12}(t) & \alpha_{12}(t) \\ 0 & 0 & 0 \end{pmatrix}. $$
It is seen that $P_{ij}(s,t) = 0$ when $i > j$, and $P_{22}(s,t) = 1$. Since $(\partial/\partial t)P_{00}(s,t) = -(\alpha_{01}(t) + \alpha_{02}(t))P_{00}(s,t)$, we get $P_{00}(s,t) = \exp\bigl( -\int_s^t \{\alpha_{01}(u) + \alpha_{02}(u)\}\,du \bigr)$. Similarly, $P_{11}(s,t) = \exp\bigl( -\int_s^t \alpha_{12}(u)\,du \bigr)$, and clearly $P_{12}(s,t) = 1 - P_{11}(s,t)$. Furthermore, we have $(\partial/\partial t)P_{01}(s,t) = \alpha_{01}(t)P_{00}(s,t) - \alpha_{12}(t)P_{01}(s,t)$, which is solved by

$$ P_{01}(s,t) = \exp\Bigl( -\int_s^t \alpha_{12}(u)\,du \Bigr) \int_s^t \alpha_{01}(u)\,P_{00}(s,u) \exp\Bigl( \int_s^u \alpha_{12}(v)\,dv \Bigr)\,du = \int_s^t P_{00}(s,u)\,\alpha_{01}(u)\,P_{11}(u,t)\,du. $$
Since $P_{02}(s,t) = 1 - P_{00}(s,t) - P_{01}(s,t)$, we have a complete set of solutions to the Kolmogorov equations.
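As a hedged numerical check (the constant intensities below are our own illustrative values, not from the text), the integral formula for P01 can be evaluated by a Riemann sum and compared with the closed form that holds for constant hazards.

```python
# Sketch: illness-death model with constant intensities a01, a02, a12
# (our illustrative values).  We evaluate
#   P01(0,t) = int_0^t P00(0,u) a01 P11(u,t) du
# numerically and compare with the constant-hazard closed form
#   P01(0,t) = a01 (exp(-a12 t) - exp(-a0 t)) / (a0 - a12),  a0 = a01+a02.

import math

a01, a02, a12 = 0.2, 0.1, 0.4
t = 3.0
a0 = a01 + a02

def P00(u):      # still healthy at time u
    return math.exp(-a0 * u)

def P11(u, t):   # diseased at u and still alive at t
    return math.exp(-a12 * (t - u))

K = 100000
du = t / K
P01_num = sum(P00((k + 0.5) * du) * a01 * P11((k + 0.5) * du, t) * du
              for k in range(K))

# Closed form (constant hazards, a0 != a12)
P01_exact = a01 * (math.exp(-a12 * t) - math.exp(-a0 * t)) / (a0 - a12)
print(P01_num, P01_exact)
```

Differentiating the closed form confirms it satisfies the forward equation $(\partial/\partial t)P_{01} = \alpha_{01}P_{00} - \alpha_{12}P_{01}$.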
A.3 Stationary and quasi-stationary distributions
We return for a while to the discussion at the end of A.2.1 and look at the long-run behavior of discrete-time Markov chains. We saw that for the one-step transition probabilities

$$ P = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 1/2 & 1/2 & 0 & 0 \\ 0 & 1/2 & 0 & 1/2 \\ 0 & 0 & 1 & 0 \end{pmatrix}, \tag{A.26} $$

the higher powers of P tended to concentrate mass at state 0; that is, $\lim_{n\to\infty} p_{ij}(n)$ is zero when $j \ne 0$ and 1 when $j = 0$, regardless of the value of i. We now discuss the limiting behavior in more detail. Additionally, one might ask if more can be said about the long-run probability distribution for those states that are not absorbing. The probabilities typically tend to zero, but conditionally on not yet having reached the absorbing state zero, what is the "balance" between the other states? We look more into this in the following.
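As an illustration (our own code, with hypothetical names), raising the matrix in (A.26) to a high power shows the probability mass concentrating in the absorbing state 0:

```python
# Sketch: compute P^100 for the transition matrix (A.26) and observe
# that every row approaches (1, 0, 0, 0), i.e. absorption in state 0.

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

P = [[1.0, 0.0, 0.0, 0.0],
     [0.5, 0.5, 0.0, 0.0],
     [0.0, 0.5, 0.0, 0.5],
     [0.0, 0.0, 1.0, 0.0]]

Pn = [row[:] for row in P]
for _ in range(99):          # Pn = P^100
    Pn = matmul(Pn, P)

for row in Pn:
    print([round(x, 6) for x in row])
```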
A.3.1 The stationary distribution of a discrete Markov chain
To better understand the long-run behavior of a given Markov chain, it is important to distinguish between different types of states. Roughly speaking, there are the recurrent states, to which the process keeps returning over and over again, and there are the transient states, which the process may visit a finite number of times before leaving forever. Although we only need a basic understanding of this in our presentation, we include some details for completeness.

It is customary to divide the state space S into subgroups, or communication classes, where all elements of a subgroup communicate. We say that states i and j
communicate if X can reach j from i and reach i from j, that is, the element $p_{ij}(n)$ of $P^n$ is positive for some n, and similarly $p_{ji}(n) > 0$ for some n. A Markov chain that contains only a single class is said to be irreducible; otherwise the state space can be reduced into a set of irreducible classes. If X cannot escape from a class, that is, there is zero probability of reaching any outside state from that class, the class is said to be closed. A class that is not closed is transient. This name is appropriate because if X starts in a transient part of the state space it will eventually leave this part and end up in a closed class, where it will remain forever.

Consider again the example given in (A.26). Recall that the first row/column corresponds to state 0, the second to state 1, etc. There are three classes, $C_1 = \{0\}$, $C_2 = \{1\}$, and $C_3 = \{2,3\}$. Only $C_1$ is closed; $C_2$ and $C_3$ are transient. If the process starts, say, in state 3, it will spend some time in class $C_3$, possibly going back and forth between states 2 and 3 for some time. It will, however, at some point leave this class and proceed to state 1. It may return to state 1 a few times, but eventually it will end up in state 0. State 0 makes up its own closed class and is an absorbing state. Because when entering state 3, the process will always return to state 2, we call state 3 a reflecting state.

In conjunction with formula (A.15), the class division of the transition probability matrix determines the long-run behavior of X. The numerical expressions for $P^5$ and $P^{10}$ in A.2.1 show the absorbing effect of state 0 but also suggest a periodic behavior of class $C_3$; transition probabilities alternate between zero and positive (albeit declining) values. One may then ask what happens with an irreducible chain, where the closed class consists of several states, not only a single absorbing one.
It is clear that the process will never be absorbed in any one state, but it is still natural to consider the limiting value of the transition probabilities $P^n$. A general result ensures the existence of $\lim_{n\to\infty} P^n$ provided the Markov chain is irreducible, aperiodic, and recurrent (see Karlin and Taylor, 1975). Not only does the limit exist, but it is also independent of what state the process starts in, as one might expect. For a finite state space, a class is recurrent if and only if it is closed, so this result is true for all irreducible aperiodic finite Markov chains.

Since the limiting distribution is independent of the starting state, all rows of $\lim_{n\to\infty} P^n$ are equal to the limiting distribution. Let π be a column vector denoting this limiting distribution. Since $(\lim_{n\to\infty} P^n)^T = P^T (\lim_{n\to\infty} P^{n-1})^T$, the equation
π = PT π (A.27)
determines the limiting distribution. This is the same as saying that 1 is an eigenvalue of $P^T$ and π is the corresponding eigenvector, normalized so that $\sum_{i\in S} \pi_i = 1$, where $\pi_i$ are the elements of π. (Equivalently, π is the left eigenvector of P.) This result guarantees the existence and uniqueness of the solution of (A.27). Any solution to (A.27) is known as a stationary distribution, since any Markov chain started in a stationary distribution will remain in that distribution in the future, as is easily verified from (A.15). It should be noted, however, that a stationary distribution is not necessarily a limiting distribution. For instance, if the Markov chain is periodic, a stationary distribution may exist, but in the strict sense the limiting distribution will not.

As another example, we slightly modify the transition matrix defined in (A.26) to

$$ P = \begin{pmatrix} 9/10 & 1/10 & 0 & 0 \\ 1/2 & 1/2 & 0 & 0 \\ 0 & 1/2 & 0 & 1/2 \\ 0 & 0 & 1 & 0 \end{pmatrix}. $$
The only difference is that the previously absorbing state 0 now communicates with state 1, having a relatively small probability of returning to state 1 after entering state 0. We then have
$$ P^5 = \begin{pmatrix} 0.835 & 0.165 & 0 & 0 \\ 0.825 & 0.175 & 0 & 0 \\ 0.581 & 0.294 & 0 & 0.125 \\ 0.515 & 0.235 & 0.25 & 0 \end{pmatrix} \quad\text{and}\quad P^{10} = \begin{pmatrix} 0.833 & 0.167 & 0 & 0 \\ 0.833 & 0.167 & 0 & 0 \\ 0.792 & 0.177 & 0.031 & 0 \\ 0.769 & 0.200 & 0 & 0.031 \end{pmatrix}. $$

The eigenvalues of $P^T$ are $\lambda_1 = 1$, $\lambda_2 = \sqrt{2}/2$, $\lambda_3 = 0.4$, and $\lambda_4 = -\sqrt{2}/2$, and the normalized eigenvector corresponding to the eigenvalue 1 is $\pi = (5/6, 1/6, 0, 0)^T$, which is the limiting distribution. Note that the eigenvalue 1 is larger in absolute value than all the other eigenvalues. To understand why this dominant eigenvalue and the corresponding eigenvector determine the long-run behavior, assume we start the process in an arbitrary distribution p(0). Let $v_1 = \pi$; $v_2$, $v_3$, and $v_4$ denote the four (linearly independent) eigenvectors of $P^T$ belonging to the eigenvalues $\lambda_1 = 1$, $\lambda_2$, $\lambda_3$, and $\lambda_4$. Write the starting distribution vector as $p(0) = \alpha_1 v_1 + \alpha_2 v_2 + \alpha_3 v_3 + \alpha_4 v_4$. We then apply the transitions $P^T$ to this n times. Using (A.15) the result is

$$ p(n) = (P^T)^n p(0) = \alpha_1 \lambda_1^n v_1 + \alpha_2 \lambda_2^n v_2 + \alpha_3 \lambda_3^n v_3 + \alpha_4 \lambda_4^n v_4, \tag{A.28} $$

and it is clear that $\alpha_1 \lambda_1^n v_1 = \alpha_1 v_1$ becomes the dominant part as $n \to \infty$. All other contributions converge at a geometric rate to zero. It is also clear why there cannot be any eigenvalues larger than 1; such eigenvalues would cause elements of p(n) to grow without restriction.
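The expansion (A.28) also suggests a way to compute π without an eigenvalue routine: repeatedly apply $P^T$ to an arbitrary starting distribution and let the subdominant terms die out. A minimal sketch (our own code) for the modified matrix:

```python
# Sketch: find the stationary distribution of the modified chain by
# computing p(n) = (P^T)^n p(0) for large n, starting from an arbitrary
# distribution.  The limit should be pi = (5/6, 1/6, 0, 0).

P = [[0.9, 0.1, 0.0, 0.0],
     [0.5, 0.5, 0.0, 0.0],
     [0.0, 0.5, 0.0, 0.5],
     [0.0, 0.0, 1.0, 0.0]]

p = [0.25, 0.25, 0.25, 0.25]   # arbitrary starting distribution
for _ in range(500):
    # one application of P^T: p_j <- sum_i p_i * P[i][j]
    p = [sum(p[i] * P[i][j] for i in range(4)) for j in range(4)]

print([round(x, 6) for x in p])
```

Convergence is geometric with rate $|\lambda_2| = \sqrt{2}/2$, so 500 iterations are far more than enough here.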
A.3.2 The quasi-stationary distribution of a Markov chain with an absorbing state
We now consider another form of stationarity that is important in the context of event history analysis. Assume state 0 is the only absorbing state for the Markov chain X(n). Define the random time $T = \min\{n \mid X(n) = 0\}$, that is, the time until absorption in state 0. The time T may be considered a (discrete) time to event, where an event is defined as absorption. The process X(n) will then serve as a model for individual "risk"; at time n, the hazard of getting absorbed for a specific individual is the probability that the Markov chain will be absorbed in zero in the next step given that it is currently in state X(n). Models of this type will play a prominent role in Chapter 10, and here we look at some properties of the chain conditional on nonabsorption. Specifically, consider the conditional probability $P(X(n) = i \mid X(n) \ne 0) = P(X(n) = i \mid T > n)$. Since T is the time of absorption, this is the state space distribution for the "surviving" individuals.

Interestingly, in many situations (more general than the current setting) the distribution of survivors converges to a quasi-stationary distribution. In terms of transition probabilities this means that $\lim_{n\to\infty} P(X(n) = i \mid X(n) \ne 0)$ exists and defines a probability distribution on the state space without state 0. It is seen that if we block-partition the matrix P as

$$ P = \begin{pmatrix} 1 & \mathbf{0}^T \\ P_0 & P_{-0} \end{pmatrix}, $$
where $\mathbf{0} = (0,0,\ldots)^T$, $P_0 = (p_{10}(1), p_{20}(1), \ldots)^T$, and $P_{-0}$ is the transition matrix for all states except 0, then

$$ P^n = \begin{pmatrix} 1 & \mathbf{0}^T \\ P_0(n) & P_{-0}^n \end{pmatrix} $$

with $P_0(n) = (p_{10}(n), p_{20}(n), \ldots)^T$. We are interested in the behavior of $P_{-0}$, but under the assumption of nonabsorption, that is, each row should be normalized to sum to one. As for the limiting distribution discussed earlier, we look at the eigenvalues of $P_{-0}^T$. In a typical situation, $P_{-0}^T$ will have linearly independent eigenvectors and a dominant (largest absolute value) eigenvalue $0 < \lambda_1 < 1$. Using an expansion like (A.28) but for $P_{-0}^T$ rather than $P^T$, it becomes clear that when n is large the distribution p(n) (on the nonzero states) is dominated by the term $\alpha_1 \lambda_1^n v_1$. Thus, the eigenvector corresponding to the dominant eigenvalue (normalized to sum to one) is the resulting quasi-stationary distribution.

We also notice that in the long run, the number of individuals still "alive" (not yet absorbed in zero) is reduced by a factor of $\lambda_1$ in each step. This means that, out of the live population, a proportion of $1 - \lambda_1$ die in each transition. The long-run (discrete) hazard of dying is thus $1 - \lambda_1$ in this model.

To illustrate this, consider the following example of a discrete-time birth-death process on the states $\{0,1,\ldots,5\}$:

$$ P = \begin{pmatrix} 1 & 0 & 0 & 0 & 0 & 0 \\ 2/4 & 1/4 & 1/4 & 0 & 0 & 0 \\ 0 & 2/4 & 1/4 & 1/4 & 0 & 0 \\ 0 & 0 & 2/4 & 1/4 & 1/4 & 0 \\ 0 & 0 & 0 & 2/4 & 1/4 & 1/4 \\ 0 & 0 & 0 & 0 & 1 & 0 \end{pmatrix}. $$
State 0 is absorbing and state 5 is immediately reflecting. We are interested in the quasi-stationary distribution on the states $\{1,2,\ldots,5\}$ and in finding the limiting
rate at which individuals get absorbed in 0. Define as before $P_{-0}^T$, the submatrix of $P^T$ with the first row and column removed. The approximate eigenvalues of $P_{-0}^T$ are $-0.494$, $-0.214$, $0.202$, $0.619$, and $0.887$, so the dominant eigenvalue is $\lambda_1 = 0.887$. The corresponding normalized eigenvector is $(0.227, 0.289, 0.254, 0.179, 0.051)^T$, and this is the quasi-stationary distribution. The rate at which individuals in the long run are absorbed into state 0 is $1 - \lambda_1 = 0.113$.

Notice that in this example only the individuals in state 1 can be absorbed at any time. Among those not yet absorbed at time n there is a proportion of about 0.227 in state 1 when n is big, the first element of the quasi-stationary distribution. Since the transition probability from 1 to 0 at any time is 1/2, one should expect an influx of $(1/2) \times 0.227 = 0.114$ into state 0, which matches the value $1 - \lambda_1$. In general, this relationship can be written

$$ \text{Event rate} = 1 - \lambda_1 = \sum_{i \ge 1} \pi_i^{q.s.} p_{i0}, $$

where $\pi_i^{q.s.}$ are the elements of the quasi-stationary distribution; it is a direct consequence of eigenvalue properties. The long-run behavior is thus described in more detail by saying that the survivors follow an approximate quasi-stationary distribution and that the rate of events is determined by the flow from the quasi-stationary distribution into state 0, specifically a proportion of $\pi_i^{q.s.} p_{i0}$ from state i. In other words, if $\pi^{q.s.}$ is the quasi-stationary distribution on the states $\{1,2,\ldots\}$ and the process is started in the distribution $p(0) = \begin{pmatrix} 0 \\ \pi^{q.s.} \end{pmatrix}$, then

$$ p(n) = (P^T)^n p(0) = \begin{pmatrix} 1 - \lambda_1^n \\ \lambda_1^n \pi^{q.s.} \end{pmatrix}. $$
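A minimal sketch of this computation (our own code): the quasi-stationary distribution can be found by applying $P_{-0}^T$ repeatedly and renormalizing the surviving mass to one, which also recovers $\lambda_1$ as the fraction of survivors kept per step.

```python
# Sketch: quasi-stationary distribution of the birth-death example by
# power iteration on P_{-0}^T with renormalization.

# Transition matrix restricted to the transient states {1,...,5}
Pm = [[0.25, 0.25, 0.00, 0.00, 0.00],
      [0.50, 0.25, 0.25, 0.00, 0.00],
      [0.00, 0.50, 0.25, 0.25, 0.00],
      [0.00, 0.00, 0.50, 0.25, 0.25],
      [0.00, 0.00, 0.00, 1.00, 0.00]]

p = [0.2] * 5          # arbitrary start on the surviving states
lam = 0.0
for _ in range(2000):
    # one application of P_{-0}^T: p_j <- sum_i p_i * Pm[i][j]
    p = [sum(p[i] * Pm[i][j] for i in range(5)) for j in range(5)]
    lam = sum(p)        # fraction of survivors kept in this step
    p = [x / lam for x in p]

print(round(lam, 3), [round(x, 3) for x in p])
```

At convergence, lam is the dominant eigenvalue and p the quasi-stationary distribution, so the event-rate identity $1 - \lambda_1 = \sum_i \pi_i^{q.s.} p_{i0}$ holds exactly; here only state 1 feeds state 0, with $p_{10} = 1/2$.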
As one would expect, many of the discrete time results on stationarity and quasi-stationarity discussed here carry over more or less directly to continuous time Markov chains and also to the corresponding diffusion process models. Hitting time distributions and quasi-stationarity are central to Chapter 10 of the main text.
A.4 Diffusion processes and stochastic differential equations
As we have seen, Markov processes with a discrete state space are natural models for multistate situations. The hazard of an event depends on the current state of the process, there is a well-defined set of possible states, and the movements of the process between states are reliably observed. On the other hand, in Chapter 10 we demonstrate that such models are also suited to describe an underlying unobserved hazard, where an event occurs at the time the process reaches a "terminal" state. For that purpose, the exact states of the process may be of less importance; it is only the resulting hazard at the terminal state that matters. In such situations, the underlying hazard development may just as well be seen as a continuously developing process.
Additionally, in situations where measurements are made on covariate processes, for instance, hormone levels or blood pressure in a hospital patient, it is natural to use stochastic process models with both continuous time and continuous state space.

The most prominent of all continuous processes is the Wiener process. Its key position in the theory of stochastic processes is similar to that of the Gaussian distribution in statistical theory. It is the archetypal example of a diffusion process, and the driving force of the random part of stochastic differential equations. Frequently, continuous processes may be described using fairly explicit representations of the process itself in terms of stochastic integrals, or of the transition probabilities as solutions to well-studied ordinary differential equations. Variants of the Wiener process also appear as limiting processes for counting process martingales and are as such indispensable in the asymptotic theory covered in this book, in particular in Chapters 2, 3, and 4.

This section starts by describing the Wiener process itself; we then describe extensions to more general diffusions as solutions to stochastic differential equations. In Section A.5 the Wiener process is extended to independent increment (Lévy) processes, which combine continuous processes with jump processes.
A.4.1 The Wiener process
Let W(t) be a process defined for t ∈ [0,∞), taking values in R, and with initial value W(0)=0. If W has increments that are independent and normally distributed with
$$ E\{W(s+t) - W(t)\} = 0 \quad\text{and}\quad \operatorname{Var}\{W(s+t) - W(t)\} = s, $$
we call W a Wiener process. The Wiener process is also frequently referred to as a Brownian motion. Note that, in particular, $EW(t) = 0$ and $\operatorname{Var} W(t) = t$. As a continuous random walk, the Wiener process will "diffuse" away from its starting point, that is, it will wander back and forth between increasingly larger positive and negative values in the state space; its position at time t has the Gaussian distribution N(0,t).

To increase the flexibility of the Wiener process for applications, one can define the process $X(t) = x + mt + \sigma W(t)$, which is called a Wiener process with initial value x, drift coefficient m, and diffusion coefficient $\sigma^2$. The parameters m and $\sigma^2$ are also referred to as the (infinitesimal) mean and variance, respectively, since $E\{X(t)\} = x + mt$ and $\operatorname{Var}\{X(t)\} = \sigma^2 t$ [which is also more generally true for diffusion processes when time t is replaced by the infinitesimal increment dt, as in (A.31) and (A.33)]. Excellent introductions to many aspects of Wiener processes are found in Cox and Miller (1965) and Allen (2003). Figure A.6 shows simulations of typical paths of the Wiener process. Note how a larger diffusion coefficient makes the variability of the process larger and partially masks the drift.
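Simulating such paths is straightforward, since the increments are independent Gaussians. A minimal sketch (our own code; seed, grid, and parameter values are illustrative choices mirroring Figure A.6):

```python
# Sketch: simulate paths of X(t) = x + m*t + sigma*W(t) on a grid and
# check the mean and variance at the terminal time by Monte Carlo.

import math
import random

random.seed(1)

def simulate_path(x, m, sigma, T=12.0, K=120):
    dt = T / K
    X = [x]
    for _ in range(K):
        dW = random.gauss(0.0, math.sqrt(dt))  # N(0, dt) increment
        X.append(X[-1] + m * dt + sigma * dW)
    return X

# Monte Carlo check: E X(T) = x + m*T and Var X(T) = sigma^2 * T.
n = 2000
finals = [simulate_path(10.0, -1.0, 1.0)[-1] for _ in range(n)]
mean = sum(finals) / n
var = sum((v - mean) ** 2 for v in finals) / (n - 1)
print(mean, var)   # should be close to -2 and 12
```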
Fig. A.6 Simulated paths of the Wiener process. Left panel: $x = 10$, $m = -1$, and $\sigma^2 = 1$. Right panel: $x = 10$, $m = -1$, and $\sigma^2 = 3$. In the left panel the process strikes a barrier at zero at about $t = 13$.
Recall that a martingale is a stochastic process X(t) such that $E(X(t) \mid X(u), u \le s) = X(s)$ for $t \ge s$ (see Jacod and Protter, 2003; Billingsley, 1995). That is, a martingale is unbiased about the future in the sense that regardless of its history, the current value X(s) is the "best guess" of the future value X(t). (In the full definition, the history may include more events than those determined by X itself, as long as they do not create bias about the future of X.) Martingales play an important role in the estimation theory for point processes in Chapters 2, 3, and 4, and they are indeed central building blocks of the general theory of stochastic processes (Karatzas and Shreve, 1991; Kallenberg, 1997; Andersen et al., 1993). We do not present general results for martingales in this appendix, but we point out situations where they naturally occur in the context of diffusion processes. A consequence of the independent increments of W is that it is a martingale and a time-homogeneous Markov process.

The discrete-state transition probabilities for Markov chains are replaced by transition densities. For a stochastic variable X with density f, we use the intuitive notation $P(X \in (x, x+dx]) = P(X \in dx) = f(x)\,dx$ for an "infinitesimal" quantity dx. The transition probabilities for the Wiener process can be written as
$$ P(W(s+t) \in dy \mid W(s) = x) = \phi(y - x;\, t)\,dy, $$

where $\phi(z;t) = \phi(z/\sqrt{t})/\sqrt{t}$ and φ is the density of the standard normal distribution. Notice that the Wiener process is not only time-homogeneous but also homogeneous in space, since the transition density only depends on the difference $y - x$, not on the actual values of x and y. The transition density for a Wiener process with general drift and diffusion can be expressed in a similar way.
A.4.2 Stochastic differential equations
As seen in Figure A.6, the path of the Wiener process is quite irregular, and in fact not differentiable as a function of time t. The "derivative" dW(t)/dt is a "white noise" process, which is not a stochastic process in the ordinary sense. This problem is seen from the following argument: for a positive dt the differential $dW(t) = W(t+dt) - W(t)$ is normally distributed with mean zero and variance dt, and $dW(t)/\sqrt{dt}$ has a standard normal distribution. Since $dW(t)/dt = (dW(t)/\sqrt{dt})/\sqrt{dt}$, difficulties are to be expected when $dt \to 0$. Similarly, the sum of the absolute increments of W over an interval goes to infinity if the sum is taken over smaller and smaller subintervals, that is,
$$ \lim_{\max|t_{i+1} - t_i| \to 0} \sum_i |W(t_{i+1}) - W(t_i)| = \infty $$

for partitions $0 = t_0 < t_1 < \cdots < t_k = t$ of the interval [0,t]. On the other hand, the square of the differential is more well behaved. It is easy to verify that

$$ E\bigl[\{(dW(t))^2 - dt\}^2\bigr] = 2\,dt^2, \tag{A.29} $$

and from this (and the independence of increments) it follows that the sum of squared increments converges in mean square:
$$ \lim_{\max|t_{i+1} - t_i| \to 0} \sum_i \{W(t_{i+1}) - W(t_i)\}^2 = t. $$

This is the optional variation process of W and is written $[W](t) = t$. (For obvious reasons, the optional variation process is sometimes called the quadratic variation process.) We discuss variation processes for other types of martingales in Chapter 2. Relation (A.29) is often expressed in the simple identity
$$ dW(t)^2 = dt, \tag{A.30} $$

which should be viewed as a formal identity to be used when computing with differentials. This observation is the starting point for a highly useful calculus based on the Wiener process. Let f be any sufficiently smooth function and consider the Taylor expansion

$$ df(W(t)) = f'(W(t))\,dW(t) + \frac{1}{2} f''(W(t))\,dW(t)^2 + \cdots. $$
Using relation (A.30) and ignoring all higher-order terms of dt, we have

$$ df(W(t)) = f'(W(t))\,dW(t) + \frac{1}{2} f''(W(t))\,dt. $$

This is the simplest version of the celebrated Itô formula. Notice that the first term in the Itô formula is what one would expect from a standard application of the chain rule; the second term is a peculiarity resulting from the path properties of the Wiener process. The Itô formula is easily extended to a more general process
$$ dX(t) = \mu(t)\,dt + \sigma(t)\,dW(t) \tag{A.31} $$

by the formula

$$ df(X(t)) = f'(X(t))\,dX(t) + \frac{1}{2}\sigma^2(t) f''(X(t))\,dt = \Bigl\{ \mu(t) f'(X(t)) + \frac{1}{2}\sigma^2(t) f''(X(t)) \Bigr\}\,dt + \sigma(t) f'(X(t))\,dW(t). \tag{A.32} $$
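A quick Monte Carlo sanity check of the Itô correction term (our own code; seed, grid, and sample sizes are illustrative): for $f(x) = x^2$ and X = W, the Itô formula gives $d(W^2) = 2W\,dW + dt$, so $E\,W(t)^2 = t$, whereas the naive chain rule would miss the dt-term entirely.

```python
# Sketch: simulate W on a grid and accumulate the right-hand side of
# the Ito formula for f(x) = x^2, i.e. sum of (2 W dW + dt), comparing
# it with W(T)^2 path by path and in mean.

import math
import random

random.seed(4)

T, K, n = 2.0, 200, 4000
dt = T / K

totals = []
for _ in range(n):
    w = 0.0
    ito_sum = 0.0
    for _ in range(K):
        dW = random.gauss(0.0, math.sqrt(dt))
        ito_sum += 2.0 * w * dW + dt   # Ito formula increment for W^2
        w += dW
    totals.append((w * w, ito_sum))

mean_sq = sum(a for a, _ in totals) / n
mean_ito = sum(b for _, b in totals) / n
print(mean_sq, mean_ito)   # both should be near T = 2
```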
Again, there is a dt-term that adds to the standard chain rule. Under the right conditions the Itô formula (A.32) is valid even when μ and σ are stochastic processes. Furthermore, if f also depends on t, the differential df(t,X(t)) is given as in (A.32) but with the additional term $f_1(t,X(t))\,dt$, where $f_1 = (\partial/\partial t) f$. Additional extensions to multivariate processes are relatively straightforward (Karatzas and Shreve, 1991).

It should be remarked that while the differential dW(t) is not strictly defined, it makes sense in an integrated form, where, for instance, X in (A.31) is really defined by

$$ X(t) = X(0) + \int_0^t \mu(s)\,ds + \int_0^t \sigma(s)\,dW(s). $$

The integral with respect to dW(t) cannot be defined as an ordinary Stieltjes integral; the reason is that the usual approximating sums over subintervals used in the definition do not converge, due to the unbounded variation of the Wiener process. Rather, the integral is defined as the Itô integral, which also extends to situations where σ is stochastic (Øksendal, 1998).

The process $\int_0^t \sigma(s)\,dW(s)$ is itself typically a mean zero martingale, and hence the Itô formula (A.32) decomposes the stochastic process f(X(t)) into an absolutely continuous (dt) part and a martingale (dW(t)) part. This is the Doob-Meyer decomposition (Karatzas and Shreve, 1991), which parallels the Doob decomposition of a counting process into a compensator and a martingale part, used to analyze the Nelson-Aalen estimator in Section 3.1.5. In fact, when σ is deterministic, the process $\int_0^t \sigma(s)\,dW(s)$ is a Gaussian martingale (i.e., the joint distribution at any finite selection of time points is multivariate normal), with distribution $N\bigl(0, \int_0^t \sigma^2(s)\,ds\bigr)$ at time t and covariance function

$$ c(s,t) = \operatorname{Cov}\Bigl( \int_0^s \sigma(u)\,dW(u), \int_0^t \sigma(u)\,dW(u) \Bigr) = \int_0^{s \wedge t} \sigma^2(u)\,du. $$
Such processes appear as the limit of compensated counting processes and are central to the asymptotic properties of hazard and survival estimators (Chapter 2).

The process X is a simple example of a diffusion process with drift μ and diffusion σ. Diffusion processes are continuous Markov processes whose sample paths resemble those of the Wiener process. A common way of introducing diffusion processes is to extend the differential equation for X to the more general equation
$$ dX(t) = \mu(t,X(t))\,dt + \sigma(t,X(t))\,dW(t), \tag{A.33} $$

that is, by allowing the coefficients μ and σ also to depend on the process itself. Equation (A.33) is, for obvious reasons, known as a stochastic differential equation, and the solution X is a diffusion process. The initial value X(0) may be random in many examples. In the context of diffusion processes, the coefficients are assumed nonrandom except for the dependence on X. In the special case σ = 0, Equation (A.33) is a standard first-order differential equation (possibly nonlinear), and the solution X(t) is random only through a possibly random initial value X(0). In that case, the equation can be solved as a deterministic differential equation with smooth paths, using a random initial value. The more interesting case where σ is nonzero is entirely different, and the solution paths are usually not differentiable.

For some specific examples, an explicit solution of (A.33) in terms of the Itô integral can be found. Such explicit expressions enable us to derive properties of the solution directly from the properties of the stochastic integral. This complements the more traditional approach to stochastic differential equations through the Kolmogorov equations described in Section A.4.4. A standard example is the Ornstein-Uhlenbeck process introduced below. In general, all linear stochastic differential equations permit such solutions (Karatzas and Shreve, 1991; Kallenberg, 1997).
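When no explicit solution is available, equations of the form (A.33) are commonly approximated on a grid by the Euler-Maruyama scheme (a standard method, though not discussed in the text). A minimal sketch (our own code; the geometric-Brownian-motion coefficients and all parameter values are illustrative choices), checked against the known mean $E\,X(t) = X(0)e^{mt}$:

```python
# Sketch: Euler-Maruyama discretization of dX = mu(t,X) dt + sigma(t,X) dW,
# illustrated on mu(t,x) = m*x and sigma(t,x) = s*x (geometric Brownian
# motion), for which E X(t) = X(0) * exp(m*t) in closed form.

import math
import random

random.seed(3)

def euler_maruyama(mu, sigma, x0, T, K):
    dt = T / K
    x = x0
    for k in range(K):
        t = k * dt
        dW = random.gauss(0.0, math.sqrt(dt))
        x += mu(t, x) * dt + sigma(t, x) * dW   # one Euler-Maruyama step
    return x

m, s, x0, T = 0.5, 0.2, 1.0, 1.0
n = 5000
vals = [euler_maruyama(lambda t, x: m * x,
                       lambda t, x: s * x, x0, T, 200) for _ in range(n)]
mean = sum(vals) / n
print(mean, x0 * math.exp(m * T))  # Monte Carlo mean vs exact mean
```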
A.4.3 The Ornstein-Uhlenbeck process
One of the classical examples of a linear stochastic differential equation is the Langevin equation

$$ dX(t) = (a - bX(t))\,dt + \sigma\,dW(t), $$

where a, b, and σ are constants. The solution X to this equation is known as the Ornstein-Uhlenbeck process. We assume here that b > 0. A look at the equation itself reveals that there is a stabilizing mechanism in the process; if X(t) exceeds a/b, the dt-part of the increment becomes negative and pushes the process downward again. Similarly, if X(t) goes below a/b, it is pulled up again by a positive increment. The effect of this is that X(t) fluctuates around a/b, and the random component dW(t) perturbs X(t) sufficiently that it never converges toward a/b, as would the solution of the corresponding deterministic equation (σ = 0). This property of the process X is called the mean-reverting property. Using the Itô integral, the explicit solution to the Langevin equation is

$$ X(t) = \frac{a}{b} + \Bigl( X(0) - \frac{a}{b} \Bigr) e^{-bt} + \sigma \int_0^t e^{-b(t-s)}\,dW(s). $$

From what was said in A.4.2 we see that X(t) is Gaussian with mean

$$ EX(t) = \frac{a}{b} + \Bigl( EX(0) - \frac{a}{b} \Bigr) e^{-bt} $$

and variance

$$ \operatorname{Var}(X(t)) = e^{-2bt} \operatorname{Var}(X(0)) + \frac{\sigma^2}{2b}\bigl(1 - e^{-2bt}\bigr). $$

There is a "run-in" effect caused by an initial value X(0) different from a/b, but apart from that, the initial value loses its influence over time, and $EX(t) \to a/b$ and $\operatorname{Var}(X(t)) \to \sigma^2/2b$ as $t \to \infty$. Except for the initial effect, X(t) is stationary. The autocorrelation $\operatorname{Cor}(X(t+h), X(t))$ goes to zero exponentially fast (proportional to $\exp(-bh)$) as $h \to \infty$. Figure A.7 shows what two sample paths of the Ornstein-Uhlenbeck process might look like. The Ornstein-Uhlenbeck process is particularly
0 5 10 15 0 5 10 15
Time Time
Fig. A.7 Simulated paths of the Ornstein-Uhlenbeck process. Left panel: X(0)=10,a= 1,b= 1, and σ 2 = 1. Right panel: X(0)=1,a= 1,b= 1, and σ 2 = 9. The mean-reverting value a/b = 1 is marked by a horizontal line in both plots.
useful as a model for variables that remain fairly stable over time, that is, they may fluctuate in a random fashion but tend to return to the same region over and over again. The Gaussian property and stationarity make it easy to work with. We present models based on the Ornstein-Uhlenbeck process in Chapters 10 and 11. 486 A Markov processes and the product-integral A.4.4 The infinitesimal generator and the Kolmogorov equations for a diffusion process
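As a numerical aside, the mean and variance formulas for the Ornstein-Uhlenbeck process in Section A.4.3 can be checked by Monte Carlo simulation. The sketch below is an illustration only: the Euler discretization, the parameter values, and the sample sizes are assumptions of this example, not part of the text.

```python
import math
import random

def simulate_ou(a, b, sigma, x0, t_max, n_steps, rng):
    """One Euler path of the Langevin equation dX = (a - b X) dt + sigma dW,
    started from the fixed value x0; returns X(t_max)."""
    dt = t_max / n_steps
    x = x0
    for _ in range(n_steps):
        x += (a - b * x) * dt + sigma * rng.gauss(0.0, math.sqrt(dt))
    return x

# Monte Carlo estimates of E X(t) and Var X(t) for fixed X(0) = x0.
a, b, sigma, x0, t = 1.0, 1.0, 1.0, 10.0, 2.0
rng = random.Random(1)
draws = [simulate_ou(a, b, sigma, x0, t, 400, rng) for _ in range(4000)]
mean = sum(draws) / len(draws)
var = sum((d - mean) ** 2 for d in draws) / len(draws)

# Theoretical values from Section A.4.3; X(0) is fixed, so Var X(0) = 0:
# E X(t) = a/b + (x0 - a/b) e^{-bt},  Var X(t) = (sigma^2 / 2b)(1 - e^{-2bt}).
mean_th = a / b + (x0 - a / b) * math.exp(-b * t)
var_th = sigma**2 / (2 * b) * (1 - math.exp(-2 * b * t))
```

For these parameter values the "run-in" from X(0) = 10 is still visible at t = 2 (mean_th ≈ 2.22), while the variance is already close to its stationary limit σ²/2b = 0.5.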
As we did for the continuous Markov chains in Section A.2.2, we now have a brief look at the infinitesimal generator for diffusion processes defined by a stochastic differential equation (A.33). The generator provides a direct link between the approach through differentials and stochastic differential equations and the more traditional approach through the Kolmogorov equations. For inhomogeneous Markov chains we defined the generator as the derivative of the transition probabilities at time t, as in (A.23). A similar approach for diffusions is to define the generator as the derivative at time t of the expectation of suitable functions. When, as before, f is a sufficiently smooth function, we define the generator α(t) as
(α(t)f)(x) = lim_{Δt→0} (1/Δt) E[f(X(t+Δt)) − f(X(t)) | X(t) = x]
           = (1/dt) E[df(X(t)) | X(t) = x].

The operator α(t) transforms, for each t, the function f into a new function α(t)f which takes the argument x. Using the Itô formula (A.32) on the differential df(X(t)), it can be seen (strictly speaking, using a martingale argument) that the expectation of the dW-part is zero, thus

(α(t)f)(x) = μ(t,x) f'(x) + (1/2) σ²(t,x) f''(x),

or, written as a differential operator,
α(t) = μ(t,x) d/dx + (1/2) σ²(t,x) d²/dx².

Notice the analogy to the finite-valued Markov chain situation: if the state space is S = {0, 1, ..., n}, then any vector ν of length n + 1 can be seen as a function mapping the state space into R. The generator α(t) is an (n+1) × (n+1) matrix, which, by matrix multiplication, maps ν into the vector α(t)ν. For diffusions, α(t) maps the function f into the function α(t)f. In the simplest case, when X is a standard Wiener process, α(t) = α = (1/2) d²/dx², that is, just a simple second-order differential operator. In the corresponding discrete situation, where X is a symmetric random walk,

        ⎛ 0    0    0    0   ··· ⎞
        ⎜ β   −2β   β    0   ··· ⎟
    α = ⎜ 0    β   −2β   β   ··· ⎟,
        ⎝ ·    ·    ·    ·   ··· ⎠

and left-multiplying a vector ν with α yields
βν_{i+1} − 2βν_i + βν_{i−1} = β((ν_{i+1} − ν_i) − (ν_i − ν_{i−1})),

which is a second-order difference operator (except at the boundary of the state space, where absorption occurs). As a further illustration, we show how the Kolmogorov forward equation for the density ψ_t(x) = P(X(t) ∈ dx)/dx of a diffusion process X(t) can be derived, following along the lines of Cox and Miller (1965, page 217). Starting with a sufficiently smooth function f and using the Chapman-Kolmogorov equations, we see that

(1/dt) E df(X(t)) = ∫_{−∞}^{∞} (1/dt) E(df(X(t)) | X(t) = x) ψ_t(x) dx
                  = ∫_{−∞}^{∞} (α(t)f)(x) ψ_t(x) dx
                  = ∫_{−∞}^{∞} [μ(t,x) f'(x) + (1/2) σ²(t,x) f''(x)] ψ_t(x) dx.
Using integration by parts, assuming that the integrand vanishes at −∞ and +∞, the differential operator can be "moved" from f to ψ_t:

(1/dt) E df(X(t)) = ∫_{−∞}^{∞} f(x) (α*(t)ψ_t)(x) dx,

where α*(t) is the (adjoint) differential operator
(α*(t)g)(x) = −(d/dx)[μ(t,x)g(x)] + (1/2)(d²/dx²)[σ²(t,x)g(x)].

However, we can also compute the same expression more directly as

(1/dt) E df(X(t)) = (d/dt) ∫_{−∞}^{∞} f(x) ψ_t(x) dx = ∫_{−∞}^{∞} f(x) (d/dt)ψ_t(x) dx.

We have thus established the equality