Advanced Topics in Markov Chains
J.M. Swart

April 25, 2018

Abstract. This is a short advanced course in Markov chains, i.e., Markov processes with discrete space and time. The first chapter recalls, without proof, some of the basic topics such as the (strong) Markov property, transience, recurrence, periodicity, and invariant laws, as well as some necessary background material on martingales. The main aim of the lecture is to show how topics such as harmonic functions, coupling, Perron-Frobenius theory, Doob transformations and intertwining are all related and can be used to study the properties of concrete chains, both qualitatively and quantitatively. In particular, the theory is applied to the study of first exit problems and branching processes.

Notation

$\mathbb{N}$                          natural numbers $\{0, 1, \ldots\}$
$\mathbb{N}_+$                        positive natural numbers $\{1, 2, \ldots\}$
$\overline{\mathbb{N}}$               $\mathbb{N} \cup \{\infty\}$
$\mathbb{Z}$                          integers
$\overline{\mathbb{Z}}$               $\mathbb{Z} \cup \{-\infty, \infty\}$
$\mathbb{Q}$                          rational numbers
$\mathbb{R}$                          real numbers
$\overline{\mathbb{R}}$               extended real numbers $[-\infty, \infty]$
$\mathbb{C}$                          complex numbers
$\mathcal{B}(E)$                      Borel $\sigma$-algebra on a topological space $E$
$1_A$                                 indicator function of the set $A$
$A \subset B$                         $A$ is a subset of $B$, which may be equal to $B$
$A^{\mathrm{c}}$                      complement of $A$
$A \setminus B$                       set difference
$\overline{A}$                        closure of $A$
$\mathrm{int}(A)$                     interior of $A$
$(\Omega, \mathcal{F}, \mathbb{P})$   underlying probability space
$\omega$                              typical element of $\Omega$
$\mathbb{E}$                          expectation with respect to $\mathbb{P}$
$\sigma(\ldots)$                      $\sigma$-field generated by sets or random variables
$\|f\|_\infty$                        supremum norm $\|f\|_\infty := \sup_x |f(x)|$
$\mu \ll \nu$                         $\mu$ is absolutely continuous w.r.t. $\nu$
$f_k \ll g_k$                         $\lim f_k / g_k = 0$
$f_k \sim g_k$                        $\lim f_k / g_k = 1$
$o(n)$                                any function such that $o(n)/n \to 0$
$O(n)$                                any function such that $\sup_n O(n)/n < \infty$
$\delta_x$                            delta measure in $x$
$\mu \otimes \nu$                     product measure of $\mu$ and $\nu$
$\Rightarrow$                         weak convergence of probability laws

Contents

0 Preliminaries
  0.1 Stochastic processes
  0.2 Filtrations and stopping times
  0.3 Martingales
  0.4 Martingale convergence
  0.5 Markov chains
  0.6 Kernels, operators and linear algebra
  0.7 Strong Markov property
  0.8 Classification of states
  0.9 Invariant laws

1 Harmonic functions
  1.1 (Sub-)harmonicity
  1.2 Random walk on a tree
  1.3 Coupling
  1.4 Convergence in total variation norm

2 Positive eigenfunctions
  2.1 Introduction
  2.2 The spectral radius
  2.3 R-recurrence
  2.4 Conditioning to stay inside a set
  2.5 The Perron-Frobenius theorem
  2.6 Quasi-stationary laws
  2.7 Sketch of the proof of the main theorem
  2.8 Further results

3 Intertwining
  3.1 Intertwining of Markov chains
  3.2 Markov functionals
  3.3 Intertwining and coupling

4 Branching processes
  4.1 The branching property
  4.2 Generating functions
  4.3 The survival probability
  4.4 First moment formula
  4.5 Second moment formula
  4.6 Supercritical processes
  4.7 Trimmed processes

A Supplementary material
  A.1 The spectral radius

Chapter 0: Preliminaries

0.1 Stochastic processes

Let $I$ be a (possibly infinite) interval in $\mathbb{Z}$. By definition, a stochastic process with discrete time is a collection of random variables $X = (X_k)_{k \in I}$, defined on some underlying probability space $(\Omega, \mathcal{F}, \mathbb{P})$ and taking values in some measurable space $(E, \mathcal{E})$. We call the random function $I \ni k \mapsto X_k(\omega) \in E$ the sample path of the process $X$. The sample path of a discrete-time stochastic process is in fact itself a random variable $X = (X_k)_{k \in I}$, taking values in the product space $(E^I, \mathcal{E}^I)$, where
$$E^I := \{x = (x_k)_{k \in I} : x_k \in E \ \forall k \in I\}$$
is the space of all functions $x : I \to E$ and $\mathcal{E}^I$ denotes the product $\sigma$-field. It is well-known that a probability law on $(E^I, \mathcal{E}^I)$ is uniquely characterized by its finite-dimensional marginals, i.e., even if $I$ is infinite, the law of the sample path $X$ is uniquely determined by the finite-dimensional distributions
$$\mathbb{P}\big[(X_k, \ldots, X_{k+n}) \in \cdot\,\big] \qquad (\{k, \ldots, k+n\} \subset I)$$
of the process.
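As an illustration (an addition to the notes, not part of the original text), the following minimal Python sketch produces sample paths of one of the simplest discrete-time stochastic processes, a random walk on $\mathbb{Z}$; the function name and parameters are illustrative choices rather than notation from the text.

```python
import random

def sample_path(n_steps, p=0.5, x0=0, rng=None):
    """Return one realization (X_0, ..., X_n) of a simple random walk on Z:
    starting from x0, each step goes +1 with probability p, else -1."""
    rng = rng or random.Random()
    path = [x0]
    for _ in range(n_steps):
        step = 1 if rng.random() < p else -1
        path.append(path[-1] + step)
    return path

# Each call maps one sample point omega to the sample path k |-> X_k(omega).
path = sample_path(10, rng=random.Random(42))
print(len(path))  # 11 values: X_0, ..., X_10
```

Here the law of the whole path on $(E^I, \mathcal{E}^I)$ with $E = \mathbb{Z}$ and $I = \{0, \ldots, n\}$ is determined by the joint distributions of finitely many coordinates, in line with the finite-dimensional characterization above.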
Conversely, if $(E, \mathcal{E})$ is a Polish space equipped with its Borel $\sigma$-field, then by the Daniell-Kolmogorov extension theorem, any consistent collection of probability measures on the finite-dimensional product spaces $(E^J, \mathcal{E}^J)$, with $J \subset I$ a finite interval, uniquely defines a probability measure on $(E^I, \mathcal{E}^I)$. Polish spaces include many of the most commonly used spaces, such as countable spaces equipped with the discrete topology, $\mathbb{R}^d$, separable Banach spaces, and much more. Moreover, open or closed subsets of Polish spaces are Polish, as are countable Cartesian products of Polish spaces, equipped with the product topology.

0.2 Filtrations and stopping times

As before, let $I$ be an interval in $\mathbb{Z}$. A discrete filtration is a collection of $\sigma$-fields $(\mathcal{F}_k)_{k \in I}$ such that $\mathcal{F}_k \subset \mathcal{F}_{k+1}$ for all $k, k+1 \in I$. If $X = (X_k)_{k \in I}$ is a stochastic process, then
$$\mathcal{F}^X_k := \sigma\big(\{X_j : j \in I,\ j \le k\}\big) \qquad (k \in I)$$
is a filtration, called the filtration generated by $X$. For any filtration $(\mathcal{F}_k)_{k \in I}$, we set
$$\mathcal{F}_\infty := \sigma\Big(\bigcup_{k \in I} \mathcal{F}_k\Big).$$
In particular, $\mathcal{F}^X_\infty = \sigma((X_k)_{k \in I})$.

A stochastic process $X = (X_k)_{k \in I}$ is adapted to a filtration $(\mathcal{F}_k)_{k \in I}$ if $X_k$ is $\mathcal{F}_k$-measurable for each $k \in I$. Then $(\mathcal{F}^X_k)_{k \in I}$ is the smallest filtration that $X$ is adapted to, and $X$ is adapted to a filtration $(\mathcal{F}_k)_{k \in I}$ if and only if $\mathcal{F}^X_k \subset \mathcal{F}_k$ for all $k \in I$.

Let $(\mathcal{F}_k)_{k \in I}$ be a filtration. An $\mathcal{F}_k$-stopping time is a function $\tau : \Omega \to I \cup \{\infty\}$ such that the $\{0,1\}$-valued process $k \mapsto 1_{\{\tau \le k\}}$ is $\mathcal{F}_k$-adapted. Obviously, this is equivalent to the statement that
$$\{\tau \le k\} \in \mathcal{F}_k \qquad (k \in I).$$
If $(X_k)_{k \in I}$ is an $E$-valued stochastic process and $A \subset E$ is measurable, then the first entrance time of $X$ into $A$
$$\tau_A := \inf\{k \in I : X_k \in A\}$$
with $\inf \emptyset := \infty$ is an $\mathcal{F}^X_k$-stopping time. More generally, the same is true for the first entrance time of $X$ into $A$ after $\sigma$
$$\tau_{\sigma,A} := \inf\{k \in I : k > \sigma,\ X_k \in A\},$$
where $\sigma$ is an $\mathcal{F}_k$-stopping time. Deterministic times are stopping times (w.r.t. any filtration). Moreover, if $\sigma, \tau$ are $\mathcal{F}_k$-stopping times, then also $\sigma \vee \tau$ and $\sigma \wedge \tau$ are $\mathcal{F}_k$-stopping times. If $f : I \cup \{\infty\} \to I \cup \{\infty\}$ is measurable and $f(k) \ge k$ for all $k \in I$, and $\tau$ is an $\mathcal{F}_k$-stopping time, then also $f(\tau)$ is an $\mathcal{F}_k$-stopping time.

If $X = (X_k)_{k \in I}$ is an $\mathcal{F}_k$-adapted stochastic process and $\tau$ is an $\mathcal{F}_k$-stopping time, then the stopped process
$$\omega \mapsto X_{k \wedge \tau(\omega)}(\omega) \qquad (k \in I)$$
is also an $\mathcal{F}_k$-adapted stochastic process. If $\tau < \infty$ a.s., then moreover $\omega \mapsto X_{\tau(\omega)}(\omega)$ is a random variable.

If $\tau$ is an $\mathcal{F}_k$-stopping time defined on some filtered probability space $(\Omega, \mathcal{F}, (\mathcal{F}_k)_{k \in I}, \mathbb{P})$ (with $\mathcal{F}_k \subset \mathcal{F}$ for all $k \in I$), then the $\sigma$-field of events observable before $\tau$ is defined as
$$\mathcal{F}_\tau := \big\{A \in \mathcal{F}_\infty : A \cap \{\tau \le k\} \in \mathcal{F}_k \ \forall k \in I\big\}.$$

Exercise 0.1 If $(\mathcal{F}_k)_{k \in I}$ is a filtration and $\sigma, \tau$ are $\mathcal{F}_k$-stopping times, then show that $\mathcal{F}_{\sigma \wedge \tau} = \mathcal{F}_\sigma \wedge \mathcal{F}_\tau$.

Exercise 0.2 Let $(\mathcal{F}_k)_{k \in I}$ be a filtration, let $X = (X_k)_{k \in I}$ be an $\mathcal{F}_k$-adapted stochastic process and let $\tau$ be an $\mathcal{F}^X_k$-stopping time. Let $Y_k := X_{k \wedge \tau}$ denote the stopped process. Show that the filtration generated by $Y$ is given by
$$\mathcal{F}^Y_k = \mathcal{F}^X_{k \wedge \tau} \qquad (k \in I \cup \{\infty\}).$$
In particular, since this formula holds also for $k = \infty$, one has
$$\mathcal{F}^X_\tau = \sigma\big((X_{k \wedge \tau})_{k \in I}\big),$$
i.e., $\mathcal{F}^X_\tau$ is the $\sigma$-algebra generated by the stopped process.

0.3 Martingales

By definition, a real stochastic process $M = (M_k)_{k \in I}$, where $I \subset \mathbb{Z}$ is an interval, is an $\mathcal{F}_k$-submartingale with respect to some filtration $(\mathcal{F}_k)_{k \in I}$ if $M$ is $\mathcal{F}_k$-adapted, $\mathbb{E}[|M_k|] < \infty$ for all $k \in I$, and
$$\mathbb{E}[M_{k+1} \mid \mathcal{F}_k] \ge M_k \qquad (\{k, k+1\} \subset I). \tag{0.1}$$
We say that $M$ is a supermartingale if the reverse inequality holds, i.e., if $-M$ is a submartingale, and a martingale if equality holds in (0.1), i.e., $M$ is both a submartingale and a supermartingale. By induction, it is easy to show that (0.1) holds more generally when $k, k+1$ are replaced by more general times $k, m \in I$ with $k \le m$.

If $M$ is an $\mathcal{F}_k$-submartingale and $(\mathcal{F}'_k)_{k \in I}$ is a smaller filtration (i.e., $\mathcal{F}'_k \subset \mathcal{F}_k$ for all $k \in I$) that $M$ is also adapted to, then
$$\mathbb{E}[M_{k+1} \mid \mathcal{F}'_k] = \mathbb{E}\big[\mathbb{E}[M_{k+1} \mid \mathcal{F}_k] \mid \mathcal{F}'_k\big] \ge \mathbb{E}[M_k \mid \mathcal{F}'_k] = M_k \qquad (\{k, k+1\} \subset I),$$
which shows that $M$ is also an $\mathcal{F}'_k$-submartingale.
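The defining inequality (0.1) can be checked mechanically on a small example. The sketch below (an illustrative addition, not from the notes) takes $M$ to be the simple symmetric random walk and conditions on $\mathcal{F}_k$, the $\sigma$-field of the first $k$ increments, by enumerating all sign sequences, so the check is exact rather than by simulation.

```python
from itertools import product
from fractions import Fraction

def cond_exp_next(f, prefix):
    """E[f(M_{k+1}) | F_k] for the simple symmetric walk: given the first k
    increments (prefix), the next increment is +1 or -1 with probability 1/2."""
    m = sum(prefix)  # M_k is determined by the first k increments
    return Fraction(f(m + 1) + f(m - 1), 2)

def check_definitions(n=5):
    """Exhaustively verify (0.1) and its variants over all 2^k prefixes."""
    for k in range(n):
        for prefix in product((-1, 1), repeat=k):
            m = sum(prefix)
            assert cond_exp_next(lambda x: x, prefix) == m            # martingale
            assert cond_exp_next(abs, prefix) >= abs(m)               # submartingale
            assert cond_exp_next(lambda x: -abs(x), prefix) <= -abs(m)  # supermartingale
    return True

print(check_definitions())  # True
```

The exact rational arithmetic (`Fraction`) avoids any floating-point slack, so equality in the martingale case is tested literally.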
In particular, a stochastic process $M$ is a submartingale with respect to some filtration if and only if it is a submartingale with respect to its own filtration $(\mathcal{F}^M_k)_{k \in I}$. In this case, we simply say that $M$ is a submartingale (resp. supermartingale, martingale).

Let $(\mathcal{F}_k)_{k \in I}$ be a filtration and let $(\mathcal{F}_{k-1})_{k \in I}$ be the filtration shifted one step to the left, where we set $\mathcal{F}_{k-1} := \{\emptyset, \Omega\}$ if $k - 1 \notin I$. Let $X = (X_k)_{k \in I}$ be a real $\mathcal{F}_k$-adapted stochastic process such that $\mathbb{E}[|X_k|] < \infty$ for all $k \in I$.
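Combining Sections 0.2 and 0.3, one can also verify on a small example that stopping a martingale at a stopping time preserves the martingale property (a special case of Doob's optional stopping theorem). The sketch below is an illustrative addition, not from the notes: the target set $A$ and the function names are hypothetical choices, and the check is again exhaustive over all increment sequences of the simple symmetric walk.

```python
from itertools import product
from fractions import Fraction

A = {-1}  # target set; tau = first entrance time of the walk into A

def stopped_value(incs, k):
    """Y_k = M_{k ^ tau} for the walk M with increment sequence incs,
    where M_0 = 0 and tau = inf{j : M_j in A} (inf of the empty set = infinity)."""
    m = 0
    for j in range(k):
        if m in A:       # tau <= j: the path is frozen from here on
            return m
        m += incs[j]
    return m

def check_stopped_martingale(n=5):
    """Exhaustively verify E[Y_{k+1} | F_k] = Y_k over all 2^k prefixes."""
    half = Fraction(1, 2)
    for k in range(n):
        for prefix in product((-1, 1), repeat=k):
            cond = half * stopped_value(prefix + (1,), k + 1) \
                 + half * stopped_value(prefix + (-1,), k + 1)
            assert cond == stopped_value(prefix, k)
    return True

print(check_stopped_martingale())  # True
```

Note that `stopped_value(incs, k)` only inspects the first $k$ increments, which is exactly the statement that the stopped process is adapted and that $\{\tau \le k\} \in \mathcal{F}^X_k$.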