Math 832: Theory of Probability
• Processes, filtrations, and stopping times
• Markov chains
• Stationary processes
• Continuous time stochastic processes
• Martingales
• Poisson and general counting processes
• Convergence in distribution
• Brownian motion
• Continuous time Markov processes
• Diffusion approximations
• φ-irreducibility and Harris recurrence
• Assignments
• Exercises
• Glossary
• Technical lemmas
• References

1. Processes, filtrations, and stopping times

• Stochastic processes
• Filtrations
• Stopping times

Stochastic processes

A stochastic process is an indexed family of random variables {Xα, α ∈ I}.

• State space: The set E in which Xα takes values. Usually E ⊂ R^d for some d. Always (for us), a complete, separable metric space (E, r).
• Index set: Usually discrete time (Z, N = {1, 2, 3, ...}, N0 = {0, 1, 2, ...}) or continuous time ([0, ∞) or (−∞, ∞)).
• Finite dimensional distributions:

      µ_{α1,...,αn}(A1 × ··· × An) = P{Xα1 ∈ A1, ..., Xαn ∈ An},  Ai ∈ B(E),   (1.1)

  B(E) being the Borel subsets of E.
• Kolmogorov extension theorem: If {µ_{α1,...,αn} ∈ P(E^n), αi ∈ I, n = 1, 2, ...} is consistent, then there exist a probability space (Ω, F, P) and {Xα, α ∈ I} defined on (Ω, F, P) satisfying (1.1).

Information structure

Available information is modeled by a sub-σ-algebra of F. Assume that the index set is discrete or continuous time, [0, ∞) to be specific.

• Filtration: {Ft, t ≥ 0}, Ft a sub-σ-algebra of F, with Fs ⊂ Ft whenever s ≤ t. Ft represents the information available at time t.
• Adapted process: {X(t) ≡ Xt, t ≥ 0} is {Ft}-adapted if X(t) is Ft-measurable for each t ≥ 0, that is, the state of X at time t is part of the information available at time t.
• Natural filtration for a process X: F_t^X = σ(X(s) : s ≤ t).
{F_t^X} is the smallest filtration to which X is adapted.

Stopping times

• Stopping time: A random variable τ with values in the index set (e.g., [0, ∞)) or ∞ is an {Ft}-stopping time if {τ ≤ t} ∈ Ft for each t ∈ [0, ∞).
• The max and min of two stopping times (or of any finite collection) are stopping times.
• If τ is a stopping time and c > 0, then τ + c is a stopping time.
• In discrete time, {τ = n} ∈ Fn for all n if and only if {τ ≤ n} ∈ Fn for all n.
• In discrete time, hitting times for adapted processes are stopping times: for τA = min{n : Xn ∈ A},

      {τA ≤ n} = ∪_{k≤n} {Xk ∈ A},   {τA = ∞} = ∩_k {Xk ∉ A}.

• In discrete time, a stopped process is adapted: If {Xn} is adapted and τ is a stopping time, then {Xn∧τ} is adapted, since

      {Xn∧τ ∈ A} = ∪_{k<n} ({Xk ∈ A} ∩ {τ = k}) ∪ ({Xn ∈ A} ∩ {τ ≥ n}).

Information at a stopping time

• Information available at a stopping time τ:

      Fτ = {A ∈ F : A ∩ {τ ≤ t} ∈ Ft, all t},

  or in the discrete time case,

      Fτ = {A ∈ F : A ∩ {τ = n} ∈ Fn, all n}.

• σ ≤ τ implies Fσ ⊂ Fτ, since

      A ∩ {τ ≤ t} = (A ∩ {σ ≤ t}) ∩ {τ ≤ t}.

Exercise 1.1 Show that Fτ is a σ-algebra.

Stopping times for discrete time processes

For definiteness, let I = {0, 1, 2, ...}, and let {Xn} be {Fn}-adapted.

Lemma 1.2 If τ is an {Fn}-stopping time, then Xm∧τ is Fτ-measurable.

Proof.

      {Xm∧τ ∈ A} ∩ {τ = n} = {Xm∧n ∈ A} ∩ {τ = n} ∈ Fn.   (1.2)

Lemma 1.3 Let F_n^X = σ(Xk : k ≤ n) be the natural filtration for X, and let τ be a finite (that is, {τ < ∞} = Ω) {F_n^X}-stopping time. Then F_τ^X = σ(Xk∧τ : k ≥ 0).

Proof. σ(Xk∧τ : k ≥ 0) ⊂ F_τ^X by (1.2). Conversely, for A ∈ F_τ^X,

      A ∩ {τ = n} = {(X0, ..., Xn) ∈ Bn} = {(X0∧τ, ..., Xn∧τ) ∈ Bn}

for some Bn.
Consequently,

      A = ∪_n {(X0∧τ, ..., Xn∧τ) ∈ Bn} ∈ σ(Xk∧τ : k ≥ 0).

Families of processes

• Markov processes: E[f(X(t + s))|Ft] = E[f(X(t + s))|X(t)], all f ∈ B(E), the bounded, measurable functions on E.
• Martingales: E = R and E[X(t + s)|Ft] = X(t).
• Stationary processes: P{X(s + t1) ∈ A1, ..., X(s + tn) ∈ An} does not depend on s.

2. Markov Chains

• Markov property
• Transition functions
• Strong Markov property
• Tulcea's theorem
• Optimal stopping
• Recurrence and transience
• Stationary distributions

Markov property

Let {Xn, n ≥ 0} be a sequence of E-valued random variables.

Definition 2.1 {Xn} is a Markov chain with respect to a filtration {Fn} if {Xn} is {Fn}-adapted and

      P{Xn+1 ∈ C|Fn} = P{Xn+1 ∈ C|Xn},  C ∈ B(E), n ≥ 0,

or equivalently

      E[f(Xn+1)|Fn] = E[f(Xn+1)|Xn],  f ∈ B(E), n ≥ 0.

(cf. the Dynkin class theorem)

Generic construction of a Markov chain

Let F : E × R → E be measurable (F^{−1}(C) ∈ B(E) × B(R) for each C ∈ B(E)). Let Xk+1 = F(Xk, Zk+1), where the {Zk} are iid and X0 is independent of the {Zk}.

Lemma 2.2 {Xk} is a Markov chain with respect to {Fn}, Fn = σ(X0, Z1, ..., Zn).

Proof. Let µZ be the distribution of Zk and define

      Pf(x) = ∫ f(F(x, z)) µZ(dz).

Then Xk is Fk-measurable and Zk+1 is independent of Fk, so

      E[f(F(Xk, Zk+1))|Fk] = Pf(Xk).

(cf. properties of conditional expectation)

Note that Fn ⊃ F_n^X.

Transition function

      P(x, C) = P{F(x, Z) ∈ C} = µZ({z : F(x, z) ∈ C})

is the transition function for the Markov chain. P : E × B(E) → [0, 1] is a transition function if P(·, C) is B(E)-measurable for each C ∈ B(E) and P(x, ·) ∈ P(E) for each x ∈ E.

Note that we are considering time homogeneous Markov chains.
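As a concrete illustration (my own example, not from the notes): on a finite state space a transition function is just a stochastic matrix, and the transition operator Pf(x) = ∫ f(y) P(x, dy) reduces to a finite sum. The state space {0, 1, 2} and matrix entries below are arbitrary assumptions.

```python
# Hypothetical finite state space E = {0, 1, 2}; each row P[x] is the
# probability measure P(x, ·), so each row must sum to 1.
P = [
    [0.5, 0.5, 0.0],   # P(0, ·)
    [0.2, 0.3, 0.5],   # P(1, ·)
    [0.0, 0.4, 0.6],   # P(2, ·)
]

def apply_P(f, x):
    # Pf(x) = ∫ f(y) P(x, dy), a finite sum on a discrete state space
    return sum(P[x][y] * f(y) for y in range(len(P)))

# Check that each P(x, ·) is a probability measure on E
row_sums = [sum(row) for row in P]

# With f the identity, Pf(1) is the mean of the next state given X_n = 1:
# 0.2*0 + 0.3*1 + 0.5*2 = 1.3
Pf_at_1 = apply_P(lambda y: float(y), 1)
```

Measurability of P(·, C) is automatic here, since every function on a finite state space is measurable.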
We could consider Xk+1 = Fk(Xk, Zk+1) for a sequence of functions {Fk}; the chain would then be time inhomogeneous.

Finite dimensional distributions

µX0 is called the initial distribution of the chain. The initial distribution and the transition function determine the finite dimensional distributions of the chain:

      P{X0 ∈ B0, ..., Xn ∈ Bn} = ∫_{B0} µX0(dx0) ∫_{B1} P(x0, dx1) ··· ∫_{Bn−1} P(xn−2, dxn−1) P(xn−1, Bn).

More generally,

      E[f0(X0) ··· fn(Xn)] = ∫_E µX0(dx0) f0(x0) ∫_E P(x0, dx1) f1(x1) ··· ∫_E P(xn−1, dxn) fn(xn),

and

      E[f(X0, ..., Xn)] = ∫_{E×···×E} f(x0, ..., xn) µX0(dx0) P(x0, dx1) ··· P(xn−1, dxn).

Example: FIFO queue

Let {(ξk, ηk)} be iid with values in [0, ∞)^2, and define

      Xk+1 = (Xk − ξk+1)^+ + ηk+1.

Xk is the time that the kth customer is in the system for a FIFO queue with service times {ηk} and interarrival times {ξk}.

Note that P : C̄([0, ∞)) → C̄([0, ∞)), where C̄ denotes the bounded continuous functions. Transition operators that satisfy this condition are said to have the Feller property.

Strong Markov property

Let τ be a stopping time with τ < ∞ a.s., and consider E[f(Xτ+1)|Fτ]. Let A ∈ Fτ. Then

      ∫_A f(Xτ+1) dP = Σ_{n=0}^∞ ∫_{A∩{τ=n}} f(Xτ+1) dP
                     = Σ_{n=0}^∞ ∫_{A∩{τ=n}} f(Xn+1) dP
                     = Σ_{n=0}^∞ ∫_{A∩{τ=n}} Pf(Xn) dP = ∫_A Pf(Xτ) dP,

so E[f(Xτ+1)|Fτ] = Pf(Xτ). (Note that Xτ is Fτ-measurable.)

Tulcea's theorem

Theorem 2.3 For k = 1, 2, ..., let (Ωk, Fk) be a measurable space. Define Ω = Ω1 × Ω2 × ··· and F = F1 × F2 × ···. Let P1 be a probability measure on F1, and for k = 2, 3, ..., let Pk : Ω1 × ··· × Ωk−1 × Fk → [0, 1] be such that for each (ω1, ..., ωk−1) ∈ Ω1 × ··· × Ωk−1, Pk(ω1, ..., ωk−1, ·) is a probability measure on Fk, and for each A ∈ Fk, Pk(·, A) is an F1 × ··· × Fk−1-measurable function.
Then there is a probability measure P on F such that for A ∈ F1 × ··· × Fk,

      P(A × Ωk+1 × ···) = ∫_{Ω1} ··· ∫_{Ωk} 1_A(ω1, ..., ωk) Pk(ω1, ..., ωk−1, dωk) ··· P1(dω1).

Corollary 2.4 There exists Px ∈ P(E^∞) such that for C0, C1, ..., Cm ∈ B(E),

      Px(C0 × C1 × ··· × Cm × E^∞) = 1_{C0}(x) ∫_{C1} P(x, dx1) ∫_{C2} P(x1, dx2) ··· ∫_{Cm−1} P(xm−2, dxm−1) P(xm−1, Cm).

For C ∈ B(E^{m+1}),

      Px(C × E^∞) = ∫_E P(x, dx1) ··· ∫_E P(xm−1, dxm) 1_C(x, x1, ..., xm).

Implications of the Markov property

Note that

      E[f1(Xn+1) f2(Xn+2)|Fn] = E[f1(Xn+1) E[f2(Xn+2)|Fn+1]|Fn]
                              = E[f1(Xn+1) E[f2(Xn+2)|Xn+1]|Fn]
                              = P(f1 P f2)(Xn),

and by induction,

      P{(Xn, Xn+1, ...) ∈ C|Fn} = PXn(C),   (2.1)

for C = C0 × C1 × ··· × Cm × E^∞, Ck ∈ B(E).
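The generic construction Xk+1 = F(Xk, Zk+1) and the transition operator Pf(x) = ∫ f(F(x, z)) µZ(dz) can be sketched in simulation, using the FIFO queue as the map F. This is a Monte Carlo illustration under assumptions of my own (exponential interarrival and service distributions, all function names); it is not part of the notes.

```python
import random

def F(x, z):
    # Lindley recursion F(x, (xi, eta)) = (x - xi)^+ + eta
    xi, eta = z                     # interarrival time xi, service time eta
    return max(x - xi, 0.0) + eta

def sample_noise(rng):
    # iid driving noise Z_k = (xi_k, eta_k); exponential rates are assumptions
    return rng.expovariate(1.0), rng.expovariate(1.25)

def simulate(x0, n, rng):
    """Run X_0 = x0, X_{k+1} = F(X_k, Z_{k+1}) for n steps."""
    path = [x0]
    for _ in range(n):
        path.append(F(path[-1], sample_noise(rng)))
    return path

def Pf(f, x, rng, m=20000):
    # Monte Carlo estimate of the transition operator
    # Pf(x) = E[f(F(x, Z))] = integral of f(F(x, z)) against mu_Z(dz)
    return sum(f(F(x, sample_noise(rng))) for _ in range(m)) / m

rng = random.Random(0)
path = simulate(0.0, 10, rng)

# E[f(X_{k+1}) | X_k = x] should match Pf(x): compare the operator estimate
# with a direct one-step simulation from x, here with f the identity.
x = 2.0
operator_est = Pf(lambda y: y, x, rng)
direct_est = sum(F(x, sample_noise(rng)) for _ in range(20000)) / 20000
```

The two estimates agree up to Monte Carlo error, which is the content of Lemma 2.2: conditioning on Fk, only the current state Xk matters for the next step.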