Part III
Appendices
A. Bain, D. Crisan, Fundamentals of Stochastic Filtering, DOI 10.1007/978-0-387-76896-0, © Springer Science+Business Media, LLC 2009

A
Measure Theory
A.1 Monotone Class Theorem
Let S be a set. A family C of subsets of S is called a π-system if it is closed under finite intersection; that is, for any A, B ∈ C we have A ∩ B ∈ C.

Theorem A.1. Let H be a vector space of bounded functions from S into ℝ containing the constant function 1. Assume that H has the property that for any sequence (f_n)_{n≥1} of non-negative functions in H such that f_n ↑ f, where f is a bounded function on S, we have f ∈ H. Assume also that H contains the indicator function of every set in some π-system C. Then H contains every bounded σ(C)-measurable function on S.

For a proof of Theorem A.1 and other related results see Williams [272] or Rogers and Williams [248].
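The objects in these statements can be illustrated on a finite set, where σ(C) can be computed by brute force: closing a family of subsets under complements and finite unions yields the generated σ-algebra. The following Python sketch does this; the four-point set S and the π-system C are arbitrary choices made purely for the illustration.

```python
from itertools import combinations

def generate_sigma_algebra(S, C):
    """Close a family C of subsets of the finite set S under complement
    and pairwise union; on a finite set this produces sigma(C)."""
    sets = {frozenset(S), frozenset()} | {frozenset(c) for c in C}
    changed = True
    while changed:
        changed = False
        current = list(sets)
        for A in current:
            comp = frozenset(S) - A          # closure under complement
            if comp not in sets:
                sets.add(comp)
                changed = True
        for A, B in combinations(current, 2):
            union = A | B                    # closure under finite union
            if union not in sets:
                sets.add(union)
                changed = True
    return sets

S = {1, 2, 3, 4}
C = [{1, 2}, {2, 3}, {2}]   # closed under intersection, hence a pi-system
sigma = generate_sigma_algebra(S, C)
```

Here the sets in C separate all four points of S, so σ(C) is the full power set of S, with 2⁴ = 16 elements.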
A.2 Conditional Expectation
Let (Ω, F, P) be a probability space and let G ⊂ F be a sub-σ-algebra of F. The conditional expectation of an integrable F-measurable random variable ξ given G is defined as the integrable G-measurable random variable, denoted by E[ξ | G], with the property that

∫_A ξ dP = ∫_A E[ξ | G] dP,  for all A ∈ G.  (A.1)
Then E[ξ | G] exists and is almost surely unique (for a proof of this result see for example Williams [272]). By this we mean that if ξ̄ is another G-measurable integrable random variable such that

∫_A ξ̄ dP = ∫_A E[ξ | G] dP,  for all A ∈ G,
then E[ξ | G] = ξ̄, P-a.s. The following are some of the important properties of the conditional expectation which are used throughout the text.

a. If α₁, α₂ ∈ ℝ and ξ₁, ξ₂ are integrable and F-measurable, then
   E[α₁ξ₁ + α₂ξ₂ | G] = α₁E[ξ₁ | G] + α₂E[ξ₂ | G], P-a.s.
b. If ξ ≥ 0, then E[ξ | G] ≥ 0, P-a.s.
c. If 0 ≤ ξ_n ↑ ξ, then E[ξ_n | G] ↑ E[ξ | G], P-a.s.
d. If H is a sub-σ-algebra of G, then E[E[ξ | G] | H] = E[ξ | H], P-a.s.
e. If ξ is G-measurable and ξη is integrable, then E[ξη | G] = ξ E[η | G], P-a.s.
f. If H is independent of σ(σ(ξ), G), then

   E[ξ | σ(G, H)] = E[ξ | G], P-a.s.
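On a finite probability space where G is generated by a partition of Ω, the conditional expectation is simply the block-wise weighted average, and properties such as (a) and (e) can be checked directly. The following Python sketch uses a six-point space; the probabilities, partition and random variables are arbitrary choices for the illustration.

```python
import numpy as np

# Omega = {0,...,5}; G is generated by the partition {0,1}, {2,3,4}, {5}.
# E[xi | G] is constant on each block, equal to the p-weighted average
# of xi over that block.
p = np.array([0.1, 0.2, 0.15, 0.25, 0.2, 0.1])
partition = [[0, 1], [2, 3, 4], [5]]

def cond_exp(xi, partition, p):
    out = np.empty_like(xi, dtype=float)
    for block in partition:
        w = p[block]
        out[block] = np.dot(w, xi[block]) / w.sum()  # block average
    return out

xi  = np.array([1.0, 3.0, 0.0, 2.0, 5.0, 4.0])
eta = np.array([2.0, 2.0, 1.0, 1.0, 1.0, 3.0])   # constant on blocks: G-measurable
```

Property (d) with H trivial reduces to E[E[ξ | G]] = E[ξ], the block-average identity, and property (e) becomes cond_exp(xi * eta) = eta * cond_exp(xi) since η is constant on each block.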
The conditional probability of a set A ∈ F with respect to the σ-algebra G is the random variable denoted by P(A | G) defined as P(A | G) ≜ E[1_A | G], where 1_A is the indicator function of the set A. From (A.1),

P(A ∩ B) = ∫_B P(A | G) dP,  for all B ∈ G.  (A.2)

This definition of conditional probability has the shortcoming that the conditional probability P(A | G) is only defined outside of a null set which depends upon the set A. As there may be an uncountable number of possible choices for A, P(· | G) may fail to be a probability measure. Under certain conditions regular conditional probabilities as in Definition 2.28 exist. Regular conditional distributions (following the nomenclature of Breiman [23], whose proof we follow) exist under much less restrictive conditions.
Definition A.2. Let (Ω, F, P) be a probability space, (E, E) a measurable space, X : Ω → E an F/E-measurable random element and G a sub-σ-algebra of F. A function Q(ω, B), defined for all ω ∈ Ω and B ∈ E, is called a regular conditional distribution of X with respect to G if

(a) for each B ∈ E, the map ω ↦ Q(ω, B) is G-measurable;
(b) for each ω ∈ Ω, Q(ω, ·) is a probability measure on (E, E);
(c) for any B ∈ E,

    Q(·, B) = P(X ∈ B | G),  P-a.s.  (A.3)
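In the discrete setting Definition A.2 is easy to realise explicitly: if G is generated by a finite partition of Ω, then Q(ω, ·) is the normalised law of X restricted to the block containing ω. The following sketch (the probabilities, the partition and the values of X are arbitrary choices for the illustration) exhibits such a Q; properties (a) and (b) appear as constancy on blocks and unit row sums, and integrating Q(·, B) over Ω recovers P(X ∈ B), consistent with (c).

```python
import numpy as np

p = np.array([0.2, 0.3, 0.1, 0.4])   # probabilities of omega = 0,...,3
X = np.array([0, 1, 1, 2])           # X takes values in E = {0, 1, 2}
partition = [[0, 1], [2, 3]]         # blocks generating G

def regular_cond_dist(p, X, partition, values):
    # Q[omega, j] = P(X = values[j] | G)(omega); constant on each block
    Q = np.zeros((len(p), len(values)))
    for block in partition:
        pb = p[block].sum()
        for j, x in enumerate(values):
            mass = p[[w for w in block if X[w] == x]].sum()
            Q[block, j] = mass / pb
    return Q

values = [0, 1, 2]
Q = regular_cond_dist(p, X, partition, values)
```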
Theorem A.3. If the space (E, E) in which X takes values is a Borel space, that is, if there exists an injective function φ : E → ℝ such that φ is E-measurable, φ(E) ∈ B(ℝ) and φ⁻¹ is B(ℝ)-measurable, then a regular conditional distribution of X with respect to G in the sense of Definition A.2 exists.
Proof. Consider first the case (E, E) = (ℝ, B(ℝ)); we construct a regular version of the conditional distribution function of X. For each rational q let Q_q be a version of P(X < q | G). For rationals r < q define

M_{r,q} ≜ {ω : Q_r(ω) > Q_q(ω)}  and  M ≜ ⋃_{r<q} M_{r,q};

by property (b) of the conditional expectation P(M_{r,q}) = 0, and the union being countable, P(M) = 0. For each rational q define

N_q ≜ {ω : lim_{r↑q, r∈ℚ} Q_r ≠ Q_q}  and  N ≜ ⋃_{q∈ℚ} N_q;

by property (c) of the conditional expectation it follows that P(N_q) = 0, so P(N) = 0. Finally define

L_∞ ≜ {ω : lim_{q→∞, q∈ℚ} Q_q ≠ 1}  and  L_{−∞} ≜ {ω : lim_{q→−∞, q∈ℚ} Q_q ≠ 0},

and again P(L_∞) = P(L_{−∞}) = 0. Define

F(x | G) ≜ lim_{r↑x, r∈ℚ} Q_r  if ω ∉ M ∪ N ∪ L_∞ ∪ L_{−∞};  F(x | G) ≜ Φ(x)  otherwise,

where Φ(x) is the distribution function of the normal N(0,1) distribution (its choice is arbitrary). It follows, using property (c) of the conditional expectation applied to the functions f_{r_i} = 1_{(−∞,r_i)} with r_i ∈ ℚ a sequence such that r_i ↑ x, that F(x | G) satisfies all the properties of a distribution function and is a version of P(X < x | G); the measure Q(ω, ·) determined by this distribution function is the required regular conditional distribution. In the general case the construction is applied to the real-valued random variable φ(X) and the resulting distribution is pulled back to (E, E) through φ⁻¹.

Lemma A.4. Let X be as in the statement of Theorem A.3 and let ψ be an E-measurable function such that E[|ψ(X)|] < ∞. If Q(· | G) is a regular conditional distribution for X given G, then

E[ψ(X) | G] = ∫_E ψ(x) Q(dx | G).

Proof. If A ∈ E then the result for ψ = 1_A follows from (A.3). By linearity this extends to simple functions, by monotone convergence to non-negative functions, and in general write ψ = ψ⁺ − ψ⁻.

A.3 Topological Results

Definition A.5. A metric space (E, d) is said to be separable if it has a countable dense set; that is, there is a countable subset D of E such that for any x ∈ E and any ε > 0 we can find y ∈ D with d(x, y) < ε.

Lemma A.6. Let (X, ρ) be a separable metric space. Then X is homeomorphic to a subspace of [0, 1]^ℕ, the space of sequences of real numbers in [0, 1] with the topology of co-ordinatewise convergence.

Proof. Define a bounded version of the metric, ρ̂ ≜ ρ/(1 + ρ); it is easily checked that this is a metric on X, and the space (X, ρ̂) is also separable. Clearly the metric satisfies the bounds 0 ≤ ρ̂ ≤ 1. As a consequence of separability we can choose a countable set x₁, x₂, … which is dense in (X, ρ̂).
Define J = [0, 1]^ℕ and endow this space with a metric d which generates the topology of co-ordinatewise convergence. Define α : X → J by

α : x ↦ (ρ̂(x, x₁), ρ̂(x, x₂), …).

Suppose x⁽ⁿ⁾ → x in X; then by continuity of ρ̂ it is immediate that ρ̂(x⁽ⁿ⁾, x_k) → ρ̂(x, x_k) for each k ∈ ℕ, and thus α(x⁽ⁿ⁾) → α(x). Conversely, if α(x⁽ⁿ⁾) → α(x), then ρ̂(x⁽ⁿ⁾, x_k) → ρ̂(x, x_k) for each k. Then by the triangle inequality

ρ̂(x⁽ⁿ⁾, x) ≤ ρ̂(x⁽ⁿ⁾, x_k) + ρ̂(x_k, x),

and since ρ̂(x⁽ⁿ⁾, x_k) → ρ̂(x, x_k) it is immediate that

lim sup_{n→∞} ρ̂(x⁽ⁿ⁾, x) ≤ 2ρ̂(x_k, x)  for all k.

As this holds for all k ∈ ℕ and the x_k are dense in X, we may pick a sequence x_{m_k} → x, whence ρ̂(x⁽ⁿ⁾, x) → 0 as n → ∞. Hence α is a homeomorphism from X onto α(X) ⊆ J.

The following is a standard result; the proof given here is based on that in Rogers and Williams [248], who reference Bourbaki [22], Chapter IX, Section 6, No. 1.

Theorem A.7. A complete separable metric space X is homeomorphic to a Borel subset of a compact metric space.

Proof. By Lemma A.6 there is a homeomorphism α : X → J. Let d denote the metric giving the topology of co-ordinatewise convergence on J. We must now consider α(X) and show that it is a countable intersection of open sets in J, and hence belongs to the σ-algebra generated by the open sets, the Borel σ-algebra.

For ε > 0 and x ∈ X we can find δ(ε) > 0 such that, for any y ∈ X, d(α(x), α(y)) < δ implies that ρ̂(x, y) < ε. For n ∈ ℕ set ε = 1/(2n) and then consider the ball B(α(x), δ(ε) ∧ ε). It is immediate that the d-diameter of this ball is at most 1/n; but also, as a consequence of the choice of δ, the image under α⁻¹ of the intersection of this ball with α(X) has ρ̂-diameter at most 1/n.

Let cl α(X) be the closure of α(X) under the metric d in J. Define a set U_n ⊆ cl α(X) to be the set of x ∈ cl α(X) such that there exists an open ball N_{x,n} about x of d-diameter less than 1/n, with the ρ̂-diameter of the image under α⁻¹ of the intersection of α(X) and this ball less than 1/n.
By the argument of the previous paragraph we see that if x ∈ α(X) we can always find such a ball; hence α(X) ⊆ U_n for every n. For x ∈ ⋂_n U_n choose x_n ∈ α(X) ∩ ⋂_{k≤n} N_{x,k}. By construction d(x, x_n) ≤ 1/n, thus x_n → x as n → ∞ under the metric d on J. However, for r ≥ n both points x_r and x_n are in N_{x,n}, thus ρ̂(α⁻¹(x_r), α⁻¹(x_n)) ≤ 1/n, so (α⁻¹(x_r))_{r≥1} is a Cauchy sequence in (X, ρ̂). But this space is complete, so there exists y ∈ X such that α⁻¹(x_n) → y. As α is a homeomorphism this implies that d(x_n, α(y)) → 0. Hence by uniqueness of limits x = α(y), and thus it is immediate that x ∈ α(X). Therefore ⋂_n U_n ⊆ α(X); since α(X) ⊆ U_n for every n it follows that

α(X) = ⋂_n U_n.  (A.4)

It is now necessary to show that U_n is relatively open in cl α(X). From the definition of U_n, for any x ∈ U_n we can find N_{x,n} with diameter properties as above which is an open subset of J containing x. For any z ∈ cl α(X) ∩ N_{x,n}, by choosing N_{z,n} = N_{x,n} it is clear that z ∈ U_n. Therefore N_{x,n} ∩ cl α(X) ⊆ U_n, from which we conclude that U_n is relatively open in cl α(X). Therefore we can write U_n = cl α(X) ∩ V_n, where V_n is open in J, and

α(X) = ⋂_n U_n = cl α(X) ∩ ⋂_n V_n,  (A.5)

where the V_n are open subsets of J. It only remains to show that cl α(X) can be expressed as a countable intersection of open sets; this is easily done since

cl α(X) = ⋂_n {x ∈ J : d(x, α(X)) < 1/n},

therefore cl α(X) is a countable intersection of open sets in J. Together with (A.5) it follows that α(X) is a countable intersection of open sets, and in particular a Borel subset of the compact metric space J.

Theorem A.8. Any compact metric space X is separable.

Proof. For each n ∈ ℕ consider the open cover of X given by the (possibly uncountable) union of all balls of radius 1/n centred on the points of X. As X is compact there exists a finite subcover; let x₁ⁿ, …, x_{N_n}ⁿ be the centres of the balls in one such finite subcover. The union over n ∈ ℕ of all these centres is a countable set.
This set is clearly dense in X and countable, so X is separable.

Theorem A.9. If E is a compact metric space then the set C(E) of continuous real-valued functions defined on E is separable.

Proof. By Theorem A.8 the space E is separable; let x₁, x₂, … be a countable dense subset of E. Define h₀(x) ≜ 1 and h_n(x) ≜ d(x, x_n) for n ≥ 1. Now define an algebra A of polynomials in these h_n with rational coefficients, that is, the set of finite sums of terms of the form

q_{k₀,…,k_r}^{n₀,…,n_r} h_{k₀}^{n₀}(x) ⋯ h_{k_r}^{n_r}(x),  q_{k₀,…,k_r}^{n₀,…,n_r} ∈ ℚ.

The closure of A is an algebra containing the constant functions, and it is clear that it separates points in E; therefore by the Stone–Weierstrass theorem it follows that A is dense in C(E). As A is countable, C(E) is separable.

Corollary A.10. If E is a compact metric space then there exists a countable set f₁, f₂, … which is dense in C(E).

Proof. By Theorem A.8 E is separable, so by Theorem A.9 the space C(E) is separable and hence has a dense countable subset.

A.4 Tulcea's Theorem

Tulcea's theorem (see Tulcea [265]) is frequently stated in a form for product spaces and their σ-algebras (for a very elegant proof in this vein see Ethier and Kurtz [95, Appendix 9]), and this form is sufficient to establish the existence of stochastic processes. We give the theorem in a more general form, in which the measures are defined on the same space X but on an increasing family of σ-algebras B_n, as this makes the important condition on the atoms of the σ-algebras clear. The approach taken here is based on that in Stroock and Varadhan [261]. Define the atom A(x) of the σ-algebra B on the space X, for x ∈ X, by

A(x) ≜ ⋂ {B : B ∈ B, x ∈ B},  (A.6)

that is, A(x) is the smallest element of B which contains x.

Theorem A.11. Let (X, B) be a measurable space and let B_n be an increasing family of sub-σ-algebras of B such that B = σ(⋃_n B_n). Suppose that these σ-algebras satisfy the following constraint: if A_n is a sequence of atoms such that A_n ∈ B_n and A₁ ⊇ A₂ ⊇ ⋯, then ⋂_{n=0}^∞ A_n ≠ ∅.
Let P₀ be a probability measure defined on B₀ and let π_n be a family of probability kernels, where π_n(x, ·) is a measure on (X, B_n) and the map x ↦ π_n(x, ·) is B_{n−1}-measurable. Such a family of kernels allows us to define inductively a family of probability measures on (X, B_n) via

P_n(A) ≜ ∫_X π_n(x, A) P_{n−1}(dx),  (A.7)

with the starting point for the induction being given by the probability measure P₀. Suppose that the kernels π_n(x, ·) satisfy the compatibility condition that for x ∉ N_n, where N_n is a P_n-null set, the kernel π_{n+1}(x, ·) is supported on A_n(x) (i.e. if B ∈ B_{n+1} and B ∩ A_n(x) = ∅ then π_{n+1}(x, B) = 0). That is, starting from a point x, the transition measure only gives positive probability to transitions to points y such that x and y belong to the same atom of B_n. Then there exists a unique probability measure P defined on B such that P|_{B_n} = P_n for all n ∈ ℕ.

Proof. It is elementary to see that P_n as defined in (A.7) is a probability measure on B_n and that P_{n+1} agrees with P_n on B_n. We can then define a set function P on ⋃_n B_n by setting P(B) ≜ P_n(B) for B ∈ B_n. From the definition (A.7), for B ∈ B_n we have defined P_n inductively via the transition functions,

P_n(B) = ∫_X ⋯ ∫_X π_n(q_{n−1}, B) π_{n−1}(q_{n−2}, dq_{n−1}) ⋯ π₁(q₀, dq₁) P₀(dq₀).  (A.8)

To simplify the notation define kernels π^{m,n} such that π^{m,n}(x, ·) is a measure on (X, B_n), as follows. If m ≥ n ≥ 0 and B ∈ B_n, then define π^{m,n}(x, B) ≜ 1_B(x), which is clearly B_n-measurable and hence, as B_m ⊇ B_n, x ↦ π^{m,n}(x, B) is also B_m-measurable. If m < n, define

π^{m,n}(x, B) ≜ ∫_X ⋯ ∫_X π_n(y_{n−1}, B) ⋯ π_{m+2}(y_{m+1}, dy_{m+2}) π_{m+1}(x, dy_{m+1}).  (A.9)

It therefore follows from the above with m = 0 and (A.8) that for B ∈ B_n,

P(B) = P_n(B) = ∫_X π^{0,n}(y₀, B) P₀(dy₀).  (A.10)

We must show that P is a probability measure on ⋃_{n=0}^∞ B_n, as then the Carathéodory extension theorem† establishes the existence of an extension to a probability measure on (X, σ(⋃_{n=0}^∞ B_n)). The only non-trivial condition which must be verified for P to be a measure is countable additivity.
A necessary and sufficient condition for countable additivity of P is that if B_n ∈ ⋃_n B_n are such that B₁ ⊇ B₂ ⊇ ⋯ and ⋂_n B_n = ∅, then P(B_n) → 0 as n → ∞ (the proof can be found in many books on measure theory; see for example page 200 of Williams [272]). It is clear that the non-trivial cases are covered by considering B_n ∈ B_n for each n ∈ ℕ.

We argue by contradiction: suppose that P(B_n) ≥ ε > 0 for all n ∈ ℕ. We must exhibit a point of ⋂_n B_n; as we started with the assumption that this intersection is empty, this is the desired contradiction. Define

F_n⁰ ≜ {x ∈ X : π^{0,n}(x, B_n) ≥ ε/2}.  (A.11)

Since x ↦ π^{0,n}(x, ·) is B₀-measurable, it follows that F_n⁰ ∈ B₀. Then from (A.10) it is clear that

P(B_n) ≤ P₀(F_n⁰) + ε/2.

As by assumption P(B_n) ≥ ε for all n ∈ ℕ, we conclude that P₀(F_n⁰) ≥ ε/2 for all n ∈ ℕ.

Suppose that x ∈ F_{n+1}⁰; then π^{0,n+1}(x, B_{n+1}) ≥ ε/2. But B_{n+1} ⊆ B_n, so π^{0,n+1}(x, B_n) ≥ ε/2. From (A.9) it follows that

π^{0,n+1}(x, B_n) = ∫_X π_{n+1}(y_n, B_n) π^{0,n}(x, dy_n);

for y ∉ N_n, the probability measure π_{n+1}(y, ·) is supported on A_n(y). As B_n ∈ B_n, from the definition of an atom it follows that y ∈ B_n if and only if A_n(y) ⊆ B_n, thus π_{n+1}(y, B_n) = 1_{B_n}(y) for y ∉ N_n. So on integration we obtain that π^{0,n}(x, B_n) = π^{0,n+1}(x, B_n) ≥ ε/2. Thus x ∈ F_n⁰. So we have shown that F_{n+1}⁰ ⊆ F_n⁰.

Since P₀(F_n⁰) ≥ ε/2 for all n and the F_n⁰ form a non-increasing sequence, it is then immediate that P₀(⋂_{n=0}^∞ F_n⁰) ≥ ε/2, whence we can find x₀ ∉ N₀ such that π^{0,n}(x₀, B_n) ≥ ε/2 for all n ∈ ℕ.

Now we proceed inductively; suppose that we have found x₀, x₁, …, x_{m−1} such that x₀ ∉ N₀ and x_i ∈ A_{i−1}(x_{i−1}) \ N_i for i = 1, …, m−1, and

† Carathéodory extension theorem: let S be a set, S₀ an algebra of subsets of S and S = σ(S₀). Let μ₀ be a countably additive map μ₀ : S₀ → [0, ∞]; then there exists a measure μ on (S, S) such that μ = μ₀ on S₀. Furthermore, if μ₀(S) < ∞, then this extension is unique. For a proof of the theorem see, for example, Williams [272] or Rogers and Williams [248].
π^{i,n}(x_i, B_n) ≥ ε/2^{i+1} for all n ∈ ℕ, for each i = 0, …, m−1. We have already established the result for the case m = 0. Now define

F_nᵐ ≜ {x ∈ X : π^{m,n}(x, B_n) ≥ ε/2^{m+1}};

from the integral representation for π^{m−1,n},

π^{m−1,n}(x, B_n) = ∫_X π^{m,n}(y_m, B_n) π_m(x, dy_m),

it follows by an argument analogous to that for F_n⁰ that

ε/2^m ≤ π^{m−1,n}(x_{m−1}, B_n) ≤ ε/2^{m+1} + π_m(x_{m−1}, F_nᵐ),

where the inequality on the left-hand side follows from the inductive hypothesis. As in the case m = 0, we can deduce that F_{n+1}ᵐ ⊆ F_nᵐ. Thus

π_m(x_{m−1}, ⋂_{n=0}^∞ F_nᵐ) ≥ ε/2^{m+1},  (A.12)

which implies that we can choose x_m ∈ ⋂_{n=0}^∞ F_nᵐ such that π^{m,n}(x_m, B_n) ≥ ε/2^{m+1} for all n ∈ ℕ; from (A.12), as the set of suitable x_m has strictly positive probability, it cannot be empty, and we can choose an x_m not in the P_m-null set N_m. Therefore this choice can be made such that x_m ∈ A_{m−1}(x_{m−1}) \ N_m. This establishes the required inductive step.

Now consider π^{n,n}(x_n, B_n); we see from the definition that this is just 1_{B_n}(x_n), but by the choice of the x_n, π^{n,n}(x_n, B_n) > 0, so x_n ∈ B_n. Consequently, as x_n ∉ N_n, by the support property of the transition kernels it follows that A_n(x_n) ⊆ B_n for each n. If we define K_n ≜ ⋂_{i=0}^n A_i(x_i), it follows that x_n ∈ K_n and that the K_n form a decreasing sequence; by the σ-algebra property it is clear that K_n ∈ B_n, and since A_n(x_n) is an atom of B_n it follows that K_n = A_n(x_n). We thus have a decreasing sequence of atoms; by the initial assumption such an intersection is non-empty, that is, ⋂_n A_n(x_n) ≠ ∅, which implies that ⋂_n B_n ≠ ∅. But this is a contradiction, since we assumed that this intersection was empty. Therefore P is countably additive and the existence of an extension follows from the theorem of Carathéodory.
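On a finite state space, with kernels that depend only on the current state (stochastic matrices, an assumption made purely for this illustration, under which the atom condition plays no role), the recursion (A.7) reduces to vector–matrix products: P_n = P_{n−1} Π_n. The matrices below are arbitrary choices for the example.

```python
import numpy as np

# A minimal finite-state sketch of (A.7): P_n(A) = int pi_n(x, A) P_{n-1}(dx)
# becomes the vector-matrix product P_n = P_{n-1} @ Pi_n, where row x of
# Pi_n is the probability measure pi_n(x, .).
P0 = np.array([0.5, 0.5])
Pi1 = np.array([[0.9, 0.1],
                [0.2, 0.8]])
Pi2 = np.array([[0.7, 0.3],
                [0.4, 0.6]])

P1 = P0 @ Pi1          # marginal law after one transition
P2 = P1 @ Pi2          # marginal law after two transitions
```

Each P_n is again a probability vector, which is the elementary observation at the start of the proof that P_n defined by (A.7) is a probability measure.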
A.4.1 The Daniell–Kolmogorov–Tulcea Theorem

The Daniell–Kolmogorov–Tulcea theorem gives conditions under which the law of a stochastic process can be extended from its finite-dimensional distributions to its full (infinite-dimensional) law. The original form of this result, due to Daniell and Kolmogorov (see Doob [81] or Rogers and Williams [248, Section II.30]), requires topological conditions on the space X; the space X needs to be Borel, that is, homeomorphic to a Borel set in some space, which is the case if X is a complete separable metric space as a consequence of Theorem A.7.

It is possible to take an alternative probabilistic approach using Tulcea's theorem. In this approach the finite-dimensional distributions are related to each other through the use of regular conditional probabilities as transition kernels; while this does not explicitly use topological conditions, such conditions may be required to establish the existence of these regular conditional probabilities (as was seen in Exercise 2.29, regular conditional probabilities are guaranteed to exist if X is a complete separable metric space).

We use the notation X^I for the I-fold product space generated by X, that is, X^I = ∏_{i∈I} X_i, where the X_i are identical copies of X, and let B^I denote the product σ-algebra on X^I, that is, B^I = ⊗_{i∈I} B_i, where the B_i are copies of B. If U and V are finite subsets of the index set I, let π_U^V denote the restriction map from X^V to X^U.

Theorem A.12. Let X be a complete separable metric space. Let μ_U be a family of probability measures on (X^U, B^U), for U any finite subset of I. Suppose that these measures satisfy the compatibility condition that for U ⊂ V,

μ_U = μ_V ∘ (π_U^V)⁻¹.

Then there exists a unique probability measure μ on (X^I, B^I) such that μ_U = μ ∘ (π_U^I)⁻¹ for any finite subset U of I.

Proof. Let Fin(I) denote the set of all finite subsets of I.
It is immediate from the compatibility condition that we can find a finitely additive μ₀ which is a probability measure on (X^I, ⋃_{F∈Fin(I)} (π_F^I)⁻¹(B^F)), such that for U ∈ Fin(I),

μ_U = μ₀ ∘ (π_U^I)⁻¹.

If we can show that μ₀ is countably additive, then the Carathéodory extension theorem implies that μ₀ can be extended to a measure μ on (X^I, B^I). We cannot directly use Tulcea's theorem to construct the extension measure; however, we can use it to show that μ₀ is countably additive. Suppose A_n is a non-increasing family of sets A_n ∈ ⋃_{F∈Fin(I)} (π_F^I)⁻¹(B^F) such that A_n ↓ ∅; we must show that μ₀(A_n) → 0.

Given the A_i, we can find finite subsets F_i of I such that A_i ∈ (π_{F_i}^I)⁻¹(B^{F_i}) for each i. Without loss of generality we can choose this sequence so that F₀ ⊂ F₁ ⊂ F₂ ⊂ ⋯. Define F_n ≜ (π_{F_n}^I)⁻¹(B^{F_n}) ⊂ B^I. As a consequence of the product space structure, these σ-algebras satisfy the condition that the intersection of a decreasing family of atoms Z_n ∈ F_n is non-empty.

For q ∈ X^I and B ∈ F_n, let

π_n(q, B) ≜ (μ_{F_n} | (π_{F_{n−1}}^{F_n})⁻¹(B^{F_{n−1}})) (π_{F_n}^I(q), π_{F_n}^I(B)),

where (μ_{F_n} | G)(ω, ·), for G ⊂ B^{F_n}, is the regular conditional probability distribution of μ_{F_n} given G. We refer to the properties of regular conditional probability distributions using the nomenclature of Definition 2.28. This π_n is a probability kernel from (X^I, F_{n−1}) to (X^I, F_n); i.e. π_n(q, ·) is a measure on (X^I, F_n) and the map q ↦ π_n(q, ·) is F_{n−1}-measurable (which follows from property (b) of the regular conditional distribution). In order to apply Tulcea's theorem we must verify that the compatibility condition is satisfied, i.e. that π_n(q, ·) is supported for a.e. q on the atom of F_{n−1} containing q, which is denoted A_{n−1}(q). This is readily established by computing π_n(q, (A_{n−1}(q))ᶜ), using property (c) of the regular conditional distribution and the fact that q ∉ (A_{n−1}(q))ᶜ.
Thus we can apply Tulcea's theorem to find a unique probability measure μ on (X^I, σ(⋃_{n=0}^∞ F_n)) such that μ is equal to μ₀ on ⋃_{n=0}^∞ F_n. Hence, as A_n ∈ F_n, it follows that μ(A_n) = μ₀(A_n) for each n, and therefore, since μ is countably additive, μ₀(A_n) ↓ 0, which establishes the required countable additivity of μ₀.

A.5 Càdlàg Paths

A càdlàg (continue à droite, limite à gauche) path is one which is right continuous with left limits; that is, x_t has càdlàg paths if for all t ∈ (0, ∞) the left limit x_{t−} exists, and for all t ∈ [0, ∞), x_t = x_{t+}. Such paths are sometimes described as RCLL (right continuous with left limits). The space of càdlàg functions from [0, ∞) to E is conventionally denoted D_E[0, ∞). Useful references for this material are Billingsley [19, Chapter 3], Ethier and Kurtz [95, Sections 3.5–3.9], and Whitt [269, Chapter 12].

A.5.1 Discontinuities of Càdlàg Paths

Clearly càdlàg paths can only have left discontinuities, i.e. points t where x_{t−} ≠ x_t.

Lemma A.13. For any ε > 0, a càdlàg path taking values in a metric space (E, d) has at most a finite number of discontinuities of size in the metric d greater than ε; that is, the set D = {t ∈ [0, T] : d(x_t, x_{t−}) > ε} contains at most a finite number of points.

Proof. Let τ be the supremum of t ∈ [0, T] such that [0, t) can be finitely subdivided as 0 = t₀ < t₁ < ⋯ < t_k = t with the oscillation sup_{u,v∈[t_{i−1},t_i)} d(x_u, x_v) at most ε on each subinterval. By the existence of left limits, [0, τ) itself admits such a subdivision, and if τ < T then right continuity at τ allows the subdivision to be extended beyond τ, contradicting the definition of τ; hence τ = T. Any discontinuity of size greater than ε must be one of the finitely many subdivision points (or T itself), which proves the result.

Lemma A.14. Let X be a càdlàg stochastic process taking values in a metric space (E, d); then

{t ∈ [0, ∞) : P(X_{t−} ≠ X_t) > 0}

contains at most countably many points.

Proof. For ε > 0, define

J_t(ε) ≜ {ω : d(X_t(ω), X_{t−}(ω)) > ε}.

Fix ε; then for any T > 0 and δ > 0 we show that there are at most a finite number of points t ∈ [0, T] such that P(J_t(ε)) > δ. Suppose this is false, so that an infinite sequence t_i of distinct times t_i ∈ [0, T] exists with P(J_{t_i}(ε)) > δ. Then by Fatou's lemma applied to the complements,

P(lim inf_{i→∞} (J_{t_i}(ε))ᶜ) ≤ lim inf_{i→∞} P((J_{t_i}(ε))ᶜ),

thus

P(lim sup_{i→∞} J_{t_i}(ε)) ≥ lim sup_{i→∞} P(J_{t_i}(ε)) > δ,

so the event that J_{t_i}(ε) occurs for infinitely many of the t_i has strictly positive probability and is hence non-empty.
This implies that there is a càdlàg path with an infinite number of jumps in [0, T] of size greater than ε, which contradicts the conclusion of Lemma A.13. Taking the union over a countable sequence δ_n ↓ 0, it then follows that P(J_t(ε)) > 0 for at most a countable set of t ∈ [0, T]. Clearly P(J_t(ε)) → P(X_t ≠ X_{t−}) as ε → 0, thus the set {t ∈ [0, T] : P(X_t ≠ X_{t−}) > 0} contains at most a countable number of points. By taking the countable union over T ∈ ℕ, it follows that {t ∈ [0, ∞) : P(X_t ≠ X_{t−}) > 0} is at most countable.

A.5.2 Skorohod Topology

Consider the sequence of functions xⁿ(t) = 1_{t≥1/n} and the function x(t) = 1_{t>0}, which are all elements of D_E[0, ∞). In the uniform topology which we used on C_E[0, ∞), the sequence xⁿ does not converge to x as n → ∞; yet, considered as càdlàg paths, it appears natural that xⁿ should converge to x, since the location of the unit jump of xⁿ converges to the location of the unit jump of x. A different topology is required. The Skorohod topology is the most frequently used topology on the space D_E[0, ∞) which resolves this problem.

Let λ : [0, ∞) → [0, ∞) and define

γ(λ) ≜ ess sup_{t≥0} |log λ′(t)| = sup_{s>t≥0} |log ((λ(s) − λ(t))/(s − t))|.

Let Λ be the space of Lipschitz continuous increasing functions λ from [0, ∞) to [0, ∞) such that λ(0) = 0, lim_{t→∞} λ(t) = ∞ and γ(λ) < ∞.

The Skorohod topology is most readily defined in terms of a metric which induces the topology. For x, y ∈ D_E[0, ∞) define a metric d_{D_E}(x, y) by

d_{D_E}(x, y) = inf_{λ∈Λ} [ γ(λ) ∨ ∫₀^∞ e^{−u} d(x, y, λ, u) du ],

where

d(x, y, λ, u) = sup_{t≥0} d(x(t ∧ u), y(λ(t) ∧ u)).

It is of course necessary to verify that this satisfies the definition of a metric. This is straightforward, but details may be found in Ethier and Kurtz [95, Chapter 3, pages 117–118]. For the functions xⁿ and x in the example, it is clear that d_{D_ℝ}(xⁿ, x) → 0 as n → ∞.
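For a piecewise-linear time change λ, the essential supremum in the definition of γ(λ) is attained among the slopes of the linear pieces, so γ(λ) = max_i |log(slope_i)|. The following sketch (the breakpoints are an arbitrary choice for the illustration) computes γ for a time change that moves a jump from t = 1.0 to t = 1.1; as the displacement of the jump shrinks, γ(λ) → 0, which is the mechanism behind the convergence of xⁿ to x in the example above.

```python
import math

def gamma(breakpoints):
    """gamma(lambda) = sup_t |log lambda'(t)| for a piecewise-linear,
    strictly increasing time change given as (t, lambda(t)) breakpoints,
    extended with slope 1 (hence zero contribution) after the last one."""
    sup_log = 0.0
    for (t0, l0), (t1, l1) in zip(breakpoints, breakpoints[1:]):
        slope = (l1 - l0) / (t1 - t0)
        sup_log = max(sup_log, abs(math.log(slope)))
    return sup_log

# time change moving t = 1.0 to 1.1, back to the identity at t = 2
lam = [(0.0, 0.0), (1.0, 1.1), (2.0, 2.0)]
g = gamma(lam)
```

Here the two slopes are 1.1 and 0.9, so γ(λ) = max(log 1.1, |log 0.9|) = −log 0.9 ≈ 0.105; the identity time change has γ = 0.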
While there are other, simpler topologies which have this property, the following proposition is the main reason why the Skorohod topology is the preferred choice of topology on D_E.

Proposition A.15. If the metric space (E, d) is complete and separable, then (D_E[0, ∞), d_{D_E}) is also complete and separable.

Proof. The following proof follows Ethier and Kurtz [95]. As E is separable, it has a countable dense set; let {x_n}_{n≥1} be such a set. Given n ∈ ℕ, rational times 0 = t₀ < t₁ < ⋯ < t_n and values y₀, …, y_{n−1} taken from the countable dense set, consider the path equal to y_i on [t_i, t_{i+1}) for i < n − 1 and to y_{n−1} on [t_{n−1}, ∞). The set of all such functions forms a dense subset of D_E[0, ∞); therefore the space is separable.

To show that the space is complete, suppose that {y_n}_{n≥1} is a Cauchy sequence in (D_E[0, ∞), d_{D_E}), which implies that there exists an increasing sequence of numbers N_k such that for n, m ≥ N_k,

d_{D_E}(y_n, y_m) ≤ 2^{−k−1} e^{−k}.

Set v_k ≜ y_{N_k}; then d_{D_E}(v_k, v_{k+1}) ≤ 2^{−k−1} e^{−k}. Thus there exists λ_k such that

∫₀^∞ e^{−u} d(v_k, v_{k+1}, λ_k, u) du < 2^{−k} e^{−k}.

As d(x, y, λ, u) is monotonic increasing in u, it follows that for any v ≥ 0,

∫₀^∞ e^{−u} d(x, y, λ, u) du ≥ d(x, y, λ, v) ∫_v^∞ e^{−u} du = e^{−v} d(x, y, λ, v).

Therefore it is possible to find λ_k ∈ Λ and u_k > k such that

max(γ(λ_k), d(v_k, v_{k+1}, λ_k, u_k)) ≤ 2^{−k}.  (A.13)

Then form the limit of the composition of the functions λ_i,

μ_k ≜ lim_{n→∞} λ_{k+n} ∘ ⋯ ∘ λ_{k+1} ∘ λ_k.

It then follows that

γ(μ_k) ≤ Σ_{i=k}^∞ γ(λ_i) ≤ Σ_{i=k}^∞ 2^{−i} = 2^{−k+1} < ∞;

thus μ_k ∈ Λ. Using the bound (A.13), and the identity μ_{k+1}⁻¹ = λ_k ∘ μ_k⁻¹, it follows that for k ∈ ℕ,

sup_{t≥0} d(v_k(μ_k⁻¹(t) ∧ u_k), v_{k+1}(μ_{k+1}⁻¹(t) ∧ u_k))
  = sup_{t≥0} d(v_k(μ_k⁻¹(t) ∧ u_k), v_{k+1}(λ_k ∘ μ_k⁻¹(t) ∧ u_k))
  = sup_{t≥0} d(v_k(t ∧ u_k), v_{k+1}(λ_k(t) ∧ u_k))
  = d(v_k, v_{k+1}, λ_k, u_k) ≤ 2^{−k}.

Since (E, d) is complete, it now follows that z_k = v_k ∘ μ_k⁻¹ converges uniformly on compact sets of t to some limit, which we denote z. As each z_k has càdlàg paths, it follows that the limit also has càdlàg paths and thus belongs to D_E[0, ∞). It only remains to show that v_k converges to z in the Skorohod topology. This follows since γ(μ_k⁻¹) → 0 as k → ∞ and, for fixed T > 0,

lim_{k→∞} sup_{0≤t≤T} d(v_k ∘ μ_k⁻¹(t), z(t)) = 0.

A.6 Stopping Times

In this section the notation F_t^o is used to emphasise that this filtration has not been augmented.

Definition A.16. A random variable T taking values in [0, ∞] is said to be an F_t^o-stopping time if, for all t ≥ 0, the event {T ≤ t} ∈ F_t^o.

The subject of stopping times is too large to cover in any detail here. For more details see Rogers and Williams [248], or Dellacherie and Meyer [77, Section IV.3].

Lemma A.17. A random variable T taking values in [0, ∞] is an F_{t+}^o-stopping time if and only if {T < t} ∈ F_t^o for all t ≥ 0.

Proof. Suppose {T < t} ∈ F_t^o for all t ≥ 0. Since

{T ≤ t} = ⋂_{n=1}^∞ {T < t + 1/n}

and {T < t + 1/n} ∈ F_{t+1/n}^o, it follows that {T ≤ t} ∈ F_{t+ε}^o for any t ≥ 0 and ε > 0, thus {T ≤ t} ∈ F_{t+}^o. Thus T is an F_{t+}^o-stopping time. Conversely, if T is an F_{t+}^o-stopping time, then since

{T < t} = ⋃_{n=1}^∞ {T ≤ t − 1/n}

and {T ≤ t − 1/n} ∈ F_{(t−1/n)+}^o ⊆ F_t^o, it follows that {T < t} ∈ F_t^o.

Lemma A.19. Let X be a continuous adapted process taking values in ℝ, and for a ∈ ℝ let T_a ≜ inf{t ≥ 0 : X_t ≥ a}. Then T_a is an F_t^o-stopping time.

Proof. By continuity of the paths, sup_{s∈[0,t]} X_s = sup_{s∈ℚ∩[0,t]} X_s, and

{T_a ≤ t} = {sup_{s≤t} X_s ≥ a} = ⋂_{n=1}^∞ ⋃_{s∈ℚ∩[0,t]} {X_s > a − 1/n}.

Thus {T_a ≤ t} may be written as a countable combination of F_t^o-measurable sets and so is itself F_t^o-measurable. Hence T_a is an F_t^o-stopping time.

Theorem A.20 (Début Theorem). Let X be a process taking values in some topological space S (with its associated Borel σ-algebra B(S)). Assume that X is progressively measurable relative to a filtration F_t^o. Then for A ∈ B(S) the mapping

D_A ≜ inf{t ≥ 0 : X_t ∈ A}

defines an F_t-stopping time, where F_t is the augmentation of F_t^o.

For a proof see Theorem IV.50 of Dellacherie and Meyer [77]. See also Rogers and Williams [248, 249] for related results. We require a technical result regarding the augmentation aspect of the usual conditions, which is used during the innovations derivation of the filtering equations.

Lemma A.21. Let G_t be the filtration F_t^o ∨ N, where N contains all the P-null sets. If T is a G_t-stopping time, then there exists an F_{t+}^o-stopping time T′ such that T = T′ P-a.s. In addition, if L ∈ G_T then there exists M ∈ F_{T′+}^o such that L = M P-a.s.

Proof. Consider first a stopping time of the form T = a1_A + ∞1_{Aᶜ}, where a ∈ ℝ⁺ and A ∈ G_a; in this case let B be an element of F_a^o such that the symmetric difference A △ B is a P-null set, and define T′ = a1_B + ∞1_{Bᶜ}.
For a general G_t-stopping time T use a dyadic approximation. Let

S⁽ⁿ⁾ ≜ Σ_{k=1}^∞ k2^{−n} 1_{{(k−1)2^{−n} ≤ T < k2^{−n}}} + ∞1_{{T=∞}}.

Clearly S⁽ⁿ⁾ is G_T-measurable and by construction S⁽ⁿ⁾ ≥ T; thus S⁽ⁿ⁾ is a G_t-stopping time. But the stopping time S⁽ⁿ⁾ takes values in a countable set, so

S⁽ⁿ⁾ = inf_k (k2^{−n} 1_{A_k} + ∞1_{A_kᶜ}),

where A_k ≜ {S⁽ⁿ⁾ = k2^{−n}}. The result has already been proved for stopping times of the form of those inside the infimum. As T = lim_n S⁽ⁿ⁾ = inf_n S⁽ⁿ⁾, the result holds for all G_t-stopping times. As a consequence of this limiting operation, F_{t+}^o appears instead of F_t^o.

To prove the second assertion, let L ∈ G_T. By the first part, since L ∈ G_∞ there exists L′ ∈ F_∞^o such that L = L′ P-a.s. Let V = T1_L + ∞1_{Lᶜ} a.s. Using the first part again, F_{t+}^o-stopping times V′ and T′ can be constructed such that V = V′ a.s. and T = T′ a.s. Define

M ≜ {L′ ∩ {T′ = ∞}} ∪ {V′ = T′ < ∞}.

Clearly M is F_{T′+}^o-measurable and it follows that L = M P-a.s.

The following lemma is trivial, but worth stating to avoid confusion in the more complex proof which follows.

Lemma A.22. Let X_t^o be the unaugmented filtration generated by a process X. Then for T an X_t^o-stopping time, if T(ω) ≤ t and X_s(ω) = X_s(ω′) for s ≤ t, then T(ω′) ≤ t.

Proof. As T is a stopping time, {T ≤ t} ∈ X_t^o = σ(X_s : 0 ≤ s ≤ t), from which the result follows.

Corollary A.23. Let X_t^o be the unaugmented filtration generated by a process X. Then for T an X_t^o-stopping time, if T(ω) ≤ t and X_s(ω) = X_s(ω′) for s ≤ t, then T(ω′) = T(ω).

Proof. Apply Lemma A.22 with t = T(ω) to conclude T(ω′) ≤ T(ω). By symmetry, T(ω) ≤ T(ω′), from which the result follows.

Lemma A.24. Let X_t^o be the unaugmented filtration generated by a process X. Then for T an X_t^o-stopping time, for all t ≥ 0,

X_{t∧T}^o = σ{X_{s∧T} : 0 ≤ s ≤ t}.

Proof. Since T ∧ t is also an X_t^o-stopping time, it suffices to show that

X_T^o = σ{X_{s∧T} : s ≥ 0}.

The definition of the σ-algebra associated with a stopping time is

X_T^o ≜ {B ∈ X_∞^o : B ∩ {T ≤ s} ∈ X_s^o for all s ≥ 0}.
If A ∈ X_T^o then it follows from this definition that

T_A ≜ T on A,  T_A ≜ +∞ otherwise,

defines an X_t^o-stopping time. Conversely, if for some set A the time T_A defined as above is a stopping time, it follows that A ∈ X_T^o. Therefore we will have established the result if we can show that A ∈ σ{X_{s∧T} : s ≥ 0} is a necessary and sufficient condition for T_A to be a stopping time.

For the first implication, assume that T_A is an X_t^o-stopping time. It is necessary to show that A ∈ σ{X_{s∧T} : s ≥ 0}. Suppose that ω, ω′ ∈ Ω are such that X_s(ω) = X_s(ω′) for s ≤ T(ω). We establish that A ∈ σ{X_{s∧T} : s ≥ 0} if we show that ω ∈ A implies ω′ ∈ A. If T(ω) = ∞ then it is immediate that the trajectories X_s(ω) and X_s(ω′) are identical and hence ω′ ∈ A. Therefore consider T(ω) < ∞; if ω ∈ A then T_A(ω) = T(ω), and since it was assumed that T_A is an X_t^o-stopping time, the fact that X_s(ω) and X_s(ω′) agree for s ≤ T_A(ω) implies by Corollary A.23 that T_A(ω′) = T_A(ω) = T(ω) < ∞, and from T_A(ω′) < ∞ it follows that ω′ ∈ A.

We must now prove the opposite implication; that is, given that T is a stopping time and A ∈ σ{X_{s∧T} : s ≥ 0}, we must show that T_A is a stopping time. Given arbitrary t ≥ 0, if T_A(ω) ≤ t and X_s(ω) = X_s(ω′) for s ≤ t, it follows that ω ∈ A (since T_A(ω) < ∞). As T(ω) = T_A(ω) ≤ t and T is a stopping time, it follows from Corollary A.23 that T(ω′) = T(ω). Since we assumed A ∈ σ{X_{s∧T} : s ≥ 0}, it follows that ω′ ∈ A, from which we deduce T_A(ω′) = T(ω′) = T(ω) = T_A(ω), whence

{T_A(ω) ≤ t, X_s(ω) = X_s(ω′) for all s ≤ t} ⇒ T_A(ω′) ≤ t,

which implies that {T_A ≤ t} ∈ X_t^o and hence that T_A is an X_t^o-stopping time.

For many arguments it is required that the augmentation of the filtration generated by a process be right continuous. While left continuity of sample paths does imply left continuity of the filtration, right continuity (or even continuity) of the sample paths does not imply that the augmentation of the generated filtration is right continuous.
This can be seen by considering the event that a process has a local maximum at time t which may be in Xt+ but not Xt (see the solution to Problem 7.1 (iii) in Chapter 2 of Karatzas and Shreve [149]). The following proposition gives an important class of process for which the right continuity does hold. Proposition A.25. If X is a d-dimensional strong Markov process, then the augmentation of the filtration generated by X is right continuous. Proof. Denote by X o the (unaugmented) filtration generated by the process X.If0≤ t0 =1{X ∈Γ ,...,X ∈Γ }P(Xt ∈ Γn+1,...,Xm ∈ Γm | Xs). t0 0 tn n n+1 The right-hand side in this equation is clearly Xs-measurable and it is P-a.s. P ∈ ∈ |F equal to (Xt0 Γ0,...Xtm Γm s+). As this holds for all cylinder sets, o o it follows that for all F ∈X∞ there exists a Xs -measurable random variable o which is P-a.s. equal to P(F |Xs+). o o o Suppose that F ∈Xs+ ⊆X∞; then clearly P(F |Xs+)=1F .Asabove o there exists a Xs -measurable random variable 1ˆF such that 1ˆF =1F a.s. o Define the event G {ω : 1ˆF (ω)=1},thenG ∈Xs and the events G and F differ by at most a null set (i.e. the symmetric difference GF is null). o Therefore F ∈Xs, which establishes that Xs+ ⊆Xs for all s ≥ 0. It is clear that Xs ⊆Xs+. Now prove the converse implication. Suppose that F ∈Xs+, which implies that for all n, F ∈Xs+1/n. Therefore there ∈Xo exists Gn s+1/n such that F and Gn differ by a null set. Define G ∞ ∞ o m=1 n=m Gn. Then clearly G ∈Xs+ ⊆Xs (by the result just proved). To show that F ∈Xs, it suffices to show that this G differs by at most a null set from F . Consider ∞ ∞ G \ F ⊆ Gn \ F = (Gn \ F ), n=1 n=1 where the right-hand side is a countable union of null sets; thus G \ F is null. Secondly *∞ ∞ c ∞ *∞ c F \ G = F ∩ Gn = F ∩ Gn m=1 n=m m=1 n=m ∞ *∞ ∞ ∞ c c = F ∩ Gn ⊆ F ∩ Gm = (F \ Gm), m=1 n=m m=1 m=1 A.7 The Optional Projection 311 and again the right-hand side is a countable union of null sets, thus F \ G is null. 
Therefore F ∈Xs, which implies that Xs+ ⊆Xs; hence Xs = Xs+. A.7 The Optional Projection Proof of Theorem 2.7 Proof. The proof uses a monotone class argument (Theorem A.1). Let H be the class of bounded measurable processes for which an optional projection exists. The class of functions 1[s,t)1F ,wheres E[1F |FT ]=E[Z∞ |FT ]=ZT whence E[1F 1{T<∞} |FT ]=ZT 1{T<∞} P-a.s. To apply the Monotone class theorem A.1 it is necessary to check that if Xn is a bounded monotone sequence in H with limit X then the optional projections o Xn converge to the optional projection of X. Consider o Y lim inf Xn1{| lim inf oX |<∞}. n→∞ n→∞ n We must check that Y is the optional projection of X. Thanks to property (c) of conditional expectation the condition (2.8) is immediate. Consequently H is a monotone class and thus by the monotone class theorem A.1 the optional projection exists for any bounded B×F-measurable process. To extend to the unbounded non-negative case consider X ∧ n and pass to the limit. In order to verify that the projection is unique up to indistinguishability, consider two candidates for the optional projection Y and Z. For any stopping time T from (2.8) it follows that YT 1{T<∞} = ZT 1{T<∞}, P-a.s. (A.14) Define F {(t, ω):Zt(ω) = Yt(ω)}.SincebothZ and Y are optional pro- cesses the set F is an optional subset of [0, ∞)×Ω.Writeπ :[0, ∞)×Ω → Ω for 312 A Measure Theory the canonical projection map π :(t, ω) → ω. Now argue by contradiction. Sup- pose that Z and Y are not indistinguishable; this implies that P(π(F )) > 0. By the optional section theorem (see Dellacherie and Meyer [77, IV.84]) it follows that given ε>0 there exists a stopping time U such that when U(ω) < ∞, (U(ω),ω) ∈ F and P(U<∞) ≥ P(π(F )) − ε. As it has been assumed that P(π(F )) > 0, by choosing ε sufficiently small, P(U<∞) > 0. It follows that on some set of non-zero probability 1{U<∞}YU =1 {U<∞}ZU . But from (A.14) this may only hold on a null set, which is a contradiction. 
Therefore P(π(F )) = 0 and it follows that Z and Y are indistinguishable. o Lemma A.26. If almost surely Xt ≥ 0 for all t ≥ 0 then Xt ≥ 0 for all t ≥ 0 almost surely. Proof. Use the monotone class argument (Theorem A.1) in the proof of the existence of the optional projection, noting that if F ∈Fthen the c`adl`ag version of E[1F |Ft] is non-negative a.s. Alternatively use the optional section theorem as in the proof of uniqueness. A.7.1 Path Regularity Introduce the following notation for a one-sided limit, in this case the right limit lim sup xs lim sup xs =inf sup xu, s↓↓t s→t:s>t v>t t Proof. It is sufficient to consider the case of lim sup. Let b ∈ R be such that b>0, then define −n −n n sup −n −n Xs if b(k − 1)2 ≤ t For every t ≤ b, the supremum in the above definition is Fb-measurable since n X is progressively measurable; thus the random variable Xt is Fb-measurable. n For every ω ∈ Ω, Xt (ω) has trajectories which are right continuous for n t ∈ [0,b]. Therefore X is B([0,b]) ⊗Fb-measurable and is thus progressively n measurable. On [0,b] it is clear that lim supn→∞ Xt = lim sups↓↓t Xs, hence lim sups↓↓t Xs is progressively measurable. A.7 The Optional Projection 313 In a similar vein, the following lemma is required in order to establish the existence of left limits. For the left limits the result is stronger and the lim inf and lim sup are previsible and thus also progressively measurable. Lemma A.28. Let X be a progressively measurable stochastic process taking values in R;thenlim infs↑↑t Xs and lim sups↑↑t Xs are previsible. Proof. It suffices to consider lim sups↑↑t Xt. Define n Xt 1{k2−n n from this definition it is clear that Xt is previsible as it is a sum of left n continuous, adapted, processes. But as lim supn→∞ Xt = lim sups↑↑t Xs,it follows that lim sups↑↑t Xs is previsible. Proof of Theorem 2.9 o Proof. First observe that if Yt is bounded then Yt must also be bounded. 
There are three things which must be established; first, the existence of right limits; second, right continuity; and third the existence of left limits. Because of the difference between Lemmas A.27 and A.28 the cases of left and right limits are not identical. The first part of the proof establishes the existence of right limits. It is sufficient to show that o o P lim inf Ys < lim sup Ys for some t ∈ [0, ∞) =0. (A.15) s↓↓t s↓↓t The following steps are familiar from the proof of Doob’s martingale reg- ularization theorem which is used to guarantee the existence of c`adl`ag modi- fications of martingales. If the right limit does not exist at t ∈ [0, ∞), that is, o o if lim infs↓↓t Ys < lim sups↓↓t Ys, then rationals a, b can be found such that o o lim infs↓↓t Ys The lim sup and lim inf processes are progressively measurable by Lemma A.27, therefore for rationals a 0 then this implies P(Sa,b < ∞) > 0. Thus a consequence of the assumption that (A.15) is false is that we can find a, b ∈ Q,witha o A1 {(t, ω):S(ω) 2 P(S1 < ∞) > (1 − 1/2 )P(S<∞). We can carry on this construction inductively defining −2k o A2k (t, ω):S(ω) We can construct stopping times using the optional section theorem such that for each i,on{Si < ∞},(Si(ω),ω) ∈ Ai,andsuchthat ! " −(i+1) P(Si < ∞) > 1 − 2 P(S<∞). −i On the event {Si < ∞} it is clear that Si P(Si = ∞,S <∞)=P(S<∞) − P(Si < ∞,S <∞) i+1 = P(S<∞) − P(Si < ∞) ≤ P(S<∞)/2 . Thus ∞ P(Si = ∞,S <∞) ≤ P(S<∞) ≤ 1 < ∞, i=0 so by the first Borel–Cantelli lemma the probability that infinitely many of the events {Si = ∞,S <∞} occur is zero. In other words for ω ∈{S<∞}, we can find an index i0(ω) such that for i ≥ i0, the sequence Si converges in o o a decreasing fashion to S and YSi But since Ti is bounded by N, from the definition of the optional projection (2.8) it is clear that o E E |F E E[ YTi ]= [ [YTi 1Ti<∞ Ti ]] = [YTi ]. 
(A.16) Thus, since Y has right limits, by an application of the bounded convergence E → E →∞ theorem [YTi ] [YT ], and so as i o o E[ YTi ] → E[ YT ]. (A.17) Thus E o E o E o E o lim sup [ YTi ]=limsup [ YT2i ] and lim inf [ YTi ] = lim inf [ YT2i+1 ], i→∞ i→∞ i→∞ i→∞ P E o so, if (S N was chosen arbitrarily, this implies that P(S = ∞)=1whichisacon- o tradiction, since we assumed P(S<∞) > 0. Thus a.s., right limits of Yt exist. o o Now we must show that Yt is right continuous. Let Yt+ be the process of right limits. As this process is adapted and right continuous, it follows that it is optional. Consider for ε>0, the set o o Aε {(t, ω): Yt(ω) ≥ Yt+(ω)+ε}. Suppose that P(π(Aε)) > 0, from which we deduce a contradiction. By the optional section theorem, for δ>0, we can find a stopping time S such that on S<∞,(S(ω),ω) ∈ Aε,andP(S<∞)=P(π(Aε)) − δ.Wemaychooseδ such that P(S<∞) > 0. Let Sn = S +1/n, and bound these times by some N, which is chosen sufficiently large that P(S o o o lim E[ YT ]=E[ YN 1S≥N ]+E[ YT +1S o o Bε = {(t, ω): Yt(ω) ≤ Yt+(ω) − ε}, which allows us to conclude that P(π(Bε)) = 0; hence o o P ( Yt = Yt+, ∀t ∈ [0, ∞)) = 1, o and thus, up to indistinguishability, the process Yt is right continuous. The existence of left limits is approached in a similar fashion; by Lemma o o A.28, the processes lim infs↑↑t Ys and lim sups↑↑t Ys are previsible and hence optional. For a, b ∈ Q we define o o Fa,b (t, ω) : lim inf Ys(ω) We assume P(π(Fa,b)) > 0 and deduce a contradiction. Since Fa,b is optional, we may apply the optional section theorem to find an optional time T such that on {T<∞},thepoint(T (ω),ω) ∈ Fa,b and with P(T<∞) >ε. Define o C0 {(t, ω):t Then define o C1 {(t, ω):R0(ω) The results used in the above proof are due to Doob and can be found in a very clear paper [82] which is discussed further in Benveniste [16]. 
These papers work in the context of separable processes, which are processes whose graph is the closure of the graph of the process with time restricted to some countable set $D$; that is, for every $t\in[0,\infty)$ there exists a sequence $t_i\in D$ such that $t_i\to t$ and $x_{t_i}\to x_t$. In these papers the ‘rules of play’ disallow the use of the optional section theorem except when unavoidable, and the above results are proved without its use. These results can be extended (with the addition of extra conditions) to optionally separable processes, which are defined similarly except that the set $D$ consists of a countable collection of stopping times; by an application of the optional section theorem it can be shown that every optional process is optionally separable. The direct approach via the optional section theorems is used in Dellacherie and Meyer [79].

A.8 The Previsible Projection

The optional projection (called the projection bien mesurable in some early articles) has been discussed extensively and is the projection which is of importance in the theory of filtering; a closely related concept is the previsible (or predictable) projection. Some of the very early theoretical papers make use of this projection. By convention we take $\mathcal{F}_{0-}=\mathcal{F}_0$.

Theorem A.29. Let $X$ be a bounded measurable process; then there exists a previsible process ${}^pX$, called the previsible projection of $X$, such that for every previsible stopping time $T$,
\[
{}^pX_T\,\mathbf{1}_{\{T<\infty\}}=\mathbb{E}\big[X_T\,\mathbf{1}_{\{T<\infty\}}\mid\mathcal{F}_{T-}\big].\tag{A.20}
\]
This process is unique up to indistinguishability, i.e. any processes which satisfy these conditions will be indistinguishable.

Proof. As in the proof of Theorem 2.7, let $F$ be a measurable set, and define $Z_t$ to be a càdlàg version of the martingale $\mathbb{E}[\mathbf{1}_F\mid\mathcal{F}_t]$. Then we define the previsible projection of $\mathbf{1}_{(s,t]}\mathbf{1}_F$ by
\[
{}^p\big(\mathbf{1}_{(s,t]}\mathbf{1}_F\big)(r,\omega)=\mathbf{1}_{(s,t]}(r)\,Z_{r-}(\omega).
\]
We must verify that this satisfies (A.20); let $T$ be a previsible stopping time.
Then we can find a sequence Tn of stopping times such that Tn ≤ Tn+1 < T for all n. By Doob’s optional sampling theorem applied to the uniformly integrable martingale Z; E |F E |F [1F Tn ]= [Z∞ Tn ]=ZTn , now pass to the limit as n →∞, using the martingale convergence theorem (see Theorem B.1), and we get E |∨∞ F ZT − = [Z∞ n=1 Tn ] and from the definition of the σ-algebra of T − it follows that ZT − = E[Z∞ |FT −]. To complete the proof, apply the monotone class theorem A.1 as in the proof for the optional projection and use the same optional section theorem argu- ment for uniqueness. The previsible and optional projection are actually very similar, as the following theorem illustrates. Theorem A.30. Let X be a bounded measurable process; then the set o p {(t, ω): Xt(ω) = Xt(ω)} is a countable union of graphs of stopping times. Proof. Again we use the monotone class argument. Consider the process 1[s,t)(r)1F , from (2.8) and (A.20) the set of points of difference is {(t, ω):Zt(ω) = Zt−(ω)} and since Z is a c`adl`ag process we can define a sequence Tn of stopping times corresponding to the nth discontinuity of Z, and by Lemma A.13 there are at most countably many such discontinuities, therefore the points of difference are contained in the countable union of the graphs of these Tns. A.9 The Optional Projection Without the Usual Conditions 319 A.9 The Optional Projection Without the Usual Conditions The proof of the optional projection theorem in Section A.7 depends crucially on the usual conditions to construct a c`adl`ag version of a martingale, both the augmentation by null sets and the right continuity of the filtration being used. The result can be proved on the uncompleted σ-algebra by making suitable modifications to the process constructed by Theorem 2.7. These results were first established in Dellacherie and Meyer [78] and take their definitive form in [77], the latter approach being followed here. 
The proofs in this section are of a more advanced nature and make use of a number of non-trivial results about σ-algebras of stopping times which are not proved here. These results and their proofs can be found in, for example, Rogers and Williams [249]. As o usual let Ft denote the unaugmented σ-algebra corresponding to Ft. Lemma A.31. Let L ⊂ R+ × Ω be such that L = {(Sn(ω),ω):ω ∈ Ω}, n o o where the Sn are positive Ft -stopping times. We can find disjoint Ft -stopping times Tn to replace the Sn such that L = {(Tn(ω),ω):ω ∈ Ω}. n Proof. Define T1 = S1 and define An {ω ∈ Ω : S1 = Sn,S2 = Sn,...,Sn−1 = Sn}. ∈Fo Then it is clear that An Sn . From the definition of this σ-algebra, if we define ∞ c Tn Sn1An + 1An , then this Tn is a stopping time. It is clear that this process may be continued inductively. The disjointness of the Tns follows by construction. Given this lemma the following result is useful when modifying a process as it allows us to break down the ‘bad’ set A of points in a useful fashion. Lemma A.32. Let A be a subset of R+ ×Ω contained in a countable union of graphs of positive random variables then A = K∪L where K and L are disjoint measurable sets such that K is contained in a disjoint union of graphs of op- tional times and L intersects the graph of any optional time on an evanescent set.† † AsetA ⊂ [0, ∞) × Ω is evanescent if the projection π(A)={ω : ∃t ∈ [0, ∞) such that (ω, t) ∈ A} iscontainedinaP-null set. Two indistinguishable processes differ on an evanescent set. 320 A Measure Theory Proof. Let V denote the set of all optional times. For Z a positive random variable define V (Z)=∪T ∈V {ω : Z(ω)=T (ω)}; consequently there is a useful decomposition Z = Z ∧ Z,where Z = Z1V (Z) + ∞1V (Z)c Z = Z1V (Z)c + ∞1V (Z). From the definition of V (Z) the set {(Z(ω),ω):ω ∈ Ω} iscontainedinthe graph of a countable number of optional times and if T is an optional time then P(Z = T<∞) = 0. 
Let the covering of A by a countable family of graphs of random variables be written ∞ A ⊆ {(Zn(ω),ω):ω ∈ Ω} n=1 and form a decomposition of each random variable Zn = Zn ∧ Zn as above. ∞ Clearly n=1{(Zn(ω),ω):ω ∈ Ω} is also covered by a countable union of graphs of optional times and by Lemma A.31 we can find a sequence of disjoint optional times Tn such that ∞ ∞ {(Zn(ω),ω):ω ∈ Ω}⊆ {(Tn(ω),ω):ω ∈ Ω}. n=1 n=1 Define ∞ K = A ∩ {(Zn(ω),ω):ω ∈ Ω} = A ∩ {(Tn(ω),ω):ω ∈ Ω} n=1 n ∞ L = A ∩ {(Z (ω),ω):ω ∈ Ω} = A \ {(Tn(ω),ω):ω ∈ Ω}. n=1 n Clearly A = K ∪ L, hence this is a decomposition of A which has the required properties. Lemma A.33. For every Ft-optional process Xt there is an indistinguishable o Ft+-optional process. Proof. Let T be an Ft-stopping time. Consider the process Xt =1[0,T ),which is c`adl`ag and Ft-adapted, and hence Ft-optional. By Lemma A.21 there exists o an Ft+-stopping time T such that T = T a.s. If we define Xt =1[0,T ),then o o since this process is c`adl`ag and Ft+-adapted, it is clearly an Ft+-optional process. P(ω : Xt(ω)=Xt(ω) ∀t)=P(T = T )=1, which implies that the processes X and X are indistinguishable. We extend from processes of the form 1[0,T ) to the whole of O using the monotone class framework (Theorem A.1) to extend to bounded optional pro- cesses, and use truncation to extended to the unbounded case. A.9 The Optional Projection Without the Usual Conditions 321 Lemma A.34. For every Ft-previsible process, there is an indistinguishable o Ft -previsible process. Proof. We first show that if T is Ft-previsible; then there exists T which is o Ft -previsible, such that T = T a.s. As {T =0}∈F0−, we need only consider † the case where T>0. Let Tn be a sequence of Ft-stopping times announcing o T . By Lemma A.33 it is clear that we can find Rn an Ft+-stopping time such that Rn = Tn a.s. Define Ln maxi=1,...,n Rn; clearly this is an increasing sequence of stopping times. Let this sequence have limit L. 
Define An {Ln =0}∪{Ln Since the sets An are decreasing, the stopping times Mn form an increasing sequence and the sequence Mn announces everywhere its limit T . This limit o is strictly positive. Because T is announced, T is an Ft -previsible time and T = T a.s. Finish the proof by a monotone class argument as in Lemma A.33. The main result of this section is the following extension of the optional projection theorem which does not require the imposition of the usual condi- tions. o Theorem A.35. Given a stochastic process X, we can construct an Ft op- tional process Zt such that for every stopping time T , ZT 1{T<∞} = E ZT 1{T<∞} |FT , (A.21) and this process is unique up to indistinguishability. Proof. By the optional projection theorem 2.7 we can construct an Ft-optional process Z¯t which satisfies (A.21). By Lemma A.33 we can find a process Zt o which is indistinguishable from Z¯t but which is Ft+-optional. In general this o process Z will not be Ft -optional. We must therefore modify it. Similarly using Theorem A.29, we can construct an Ft-previsible process o Yt, and using Lemma A.34, we can find an Ft -previsible process Y¯t which is indistinguishable from the process Yt. Let H = {(t, ω):Yt(ω) = Zt(ω)}; then it follows by Theorem A.30, that this graph of differences is contained within a countable disjoint union of graphs of random variables. Thus by Lemma A.32 we may write H = K ∪ L † A stopping time T is called announceable if there exists an announcing sequence (Tn)n≥1 for T . This means that for any n ≥ 1andω ∈ Ω, Tn(ω) ≤ Tn+1(ω) < T (ω)andTn(ω) T (ω). A stopping time T is announceable if and only if it is previsible. For details see Rogers and Williams [248]. 322 A Measure Theory o such that for T any Ft -stopping time P(ω :(T (ω),ω) ∈ L)=0andthere o exists a sequence of Ft -stopping times Tn such that K ⊂ {(Tn(ω),ω):ω ∈ Ω}. 
n E |Fo For each n let Zn be a version of [XTn 1{Tn<∞} Tn ]; then we can define Yt(ω)if(t, ω) ∈/ n{(Tn(ω),ω):ω ∈ Ω} Zt(ω) (A.22) Zn(ω)if(t, ω) ∈{(Tn(ω),ω):ω ∈ Ω}. o It is immediate that this Zt is Ft -optional. Let us now show that it satisfies F o ∈Fo ∩{ } (A.21). Let T be an t -optional time and let A T .SetAn = A T = Tn ; ∈Fo \ ∈Fo thus A Tn .LetB = A n An and thus B T . From the definition (A.22), E |Fo ZT 1An 1T<∞ = Zn1An 1Tn<∞ =1An [1Tn<∞XTn Tn ] E |Fo E |Fo = [XTn 1An 1Tn<∞ Tn ]= [XT 1An 1T<∞ T ]. Consequently E E |Fo E E[1An ZT 1T<∞]= [1An [1T<∞XT T ]] = [1An XT 1T<∞]. So on An the conditions are satisfied. Now consider B,onwhicha.s.T = Tn for all n; hence (T (ω),ω) ∈/ L.SinceP((T (ω),ω) ∈ K) = 0, it follows that a.s. (T (ω),ω) ∈/ H. Recalling the definition of H this implies that Yt(ω)=ζt(ω) a.s.; from the Definition A.22 on B, ZT = YT ,thus o E[1BZT ]=E[1BζT ]=E[1BE[XT |FT +)] = E[1BXT ]. Thus on An for each n and on B the process Z is an optional projection of X. The uniqueness argument using the optional section theorem is exactly analogous to that used in the proof of Theorem 2.7. A.10 Convergence of Measure-valued Random Variables n ∞ Let (Ω,F, P) be a probability space and let (μ )n=1 be a sequence of random measures, μn : Ω →M(S)andμ : Ω →M(S) be another measure-valued random variable. In the following we define two types of convergence for se- quences of measure-valued random variables: n 1. limn→∞ E [|μ f − μf|] = 0 for all f ∈ Cb(S). n 2. limn→∞ μ = μ, P-a.s. We call the first type of convergence convergence in expectation.Ifthere exists an integrable random variable w : Ω → R such that μn(1) ≤ w for all n, n n then limn→∞ μ = μ, P-a.s., implies that μ converged to μ in expectation by n ∞ the dominated convergence theorem. The extra condition is satisfied if (μ )n=1 is a sequence of random probability measures, since in this case, μn(1)=1 for all n. We also have the following. A.10 Convergence of Measure-valued Random Variables 323 Remark A.36. 
If $\mu^n$ converges in expectation to $\mu$, then there exists a subsequence $n(m)$ such that $\lim_{m\to\infty}\mu^{n(m)}=\mu$, $\mathbb{P}$-a.s.

Proof. Since $\mathcal{M}(S)$ is isomorphic to $(0,\infty)\times\mathcal{P}(S)$, with the isomorphism being given by
\[
\nu\in\mathcal{M}(S)\ \mapsto\ (\nu(\mathbf{1}),\,\nu/\nu(\mathbf{1}))\in(0,\infty)\times\mathcal{P}(S),
\]
it follows from Theorem 2.18 that there exists a countable convergence determining set of functions†
\[
\mathcal{M}\triangleq\{\varphi_0,\varphi_1,\varphi_2,\ldots\},\tag{A.23}
\]
where $\varphi_0$ is the constant function equal to 1 everywhere and $\varphi_i\in C_b(S)$ for any $i>0$. Since
\[
\lim_{n\to\infty}\mathbb{E}\left[|\mu^nf-\mu f|\right]=0
\]
for all $f\in\{\varphi_0,\varphi_1,\varphi_2,\ldots\}$ and the set $\{\varphi_0,\varphi_1,\varphi_2,\ldots\}$ is countable, one can find a subsequence $n(m)$ such that, with probability one, $\lim_{m\to\infty}\mu^{n(m)}\varphi_i=\mu\varphi_i$ for all $i\ge 0$; hence the claim.

If a suitable bound on the rate of convergence for $\mathbb{E}[|\mu^nf-\mu f|]$ is known, then the sequence $n(m)$ can be specified explicitly. For instance we have the following.

Remark A.37. Assume that there exists a countable convergence determining set $\mathcal{M}$ such that, for any $f\in\mathcal{M}$,
\[
\mathbb{E}\left[|\mu^nf-\mu f|\right]\le\frac{c_f}{\sqrt{n}},
\]
where $c_f$ is a positive constant independent of $n$; then $\lim_{m\to\infty}\mu^{m^3}=\mu$, $\mathbb{P}$-a.s.

Proof. By Fatou's lemma,
\[
\mathbb{E}\Big[\sum_{m=1}^{\infty}\big|\mu^{m^3}f-\mu f\big|\Big]\le\liminf_{n\to\infty}\mathbb{E}\Big[\sum_{m=1}^{n}\big|\mu^{m^3}f-\mu f\big|\Big]\le\sum_{m=1}^{\infty}\frac{c_f}{m^{3/2}}<\infty.
\]
Hence
\[
\sum_{m=1}^{\infty}\big|\mu^{m^3}f-\mu f\big|<\infty\quad \mathbb{P}\text{-a.s.},
\]
therefore
\[
\lim_{m\to\infty}\mu^{m^3}f=\mu f\quad\text{for any } f\in\mathcal{M}.
\]
Since $\mathcal{M}$ is countable and convergence determining, it also follows that $\lim_{m\to\infty}\mu^{m^3}=\mu$, $\mathbb{P}$-a.s.

† Recall that $\mathcal{M}$ is a convergence determining set if, for any sequence of finite measures $\nu_n$, $n=1,2,\ldots$, and $\nu$ another finite measure for which $\lim_{n\to\infty}\nu_nf=\nu f$ for all $f\in\mathcal{M}$, it follows that $\lim_{n\to\infty}\nu_n=\nu$.

Let $d:\mathcal{P}(S)\times\mathcal{P}(S)\to[0,\infty)$ be the metric defined in Theorem 2.19; that is, for $\mu,\nu\in\mathcal{P}(S)$,
\[
d(\mu,\nu)=\sum_{i=1}^{\infty}\frac{|\mu\varphi_i-\nu\varphi_i|}{2^i},
\]
where $\varphi_1,\varphi_2,\ldots$ are elements of $C_b(S)$ such that $\|\varphi_i\|_\infty=1$, and let $\varphi_0=\mathbf{1}$. We can extend $d$ to a metric on $\mathcal{M}(S)$ as follows:
\[
d_{\mathcal{M}}:\mathcal{M}(S)\times\mathcal{M}(S)\to[0,\infty),\qquad d_{\mathcal{M}}(\mu,\nu)=\sum_{i=0}^{\infty}\frac{1}{2^i}\,|\mu\varphi_i-\nu\varphi_i|.\tag{A.24}
\]
The careful reader should check that $d_{\mathcal{M}}$ is a metric and that indeed $d_{\mathcal{M}}$ induces the weak topology on $\mathcal{M}(S)$.
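As an illustration only, the sum in (A.24) can be truncated and evaluated for measures supported on finitely many points. The test functions `phis` below are hypothetical stand-ins for $\varphi_0,\varphi_1,\varphi_2$ (each bounded by 1, with $\varphi_0=\mathbf{1}$); nothing here is specified by the text.

```python
# Truncated version of the metric d_M in (A.24) for measures on a finite
# support, represented as dicts mapping point -> mass.
def integrate(mu, f):
    # mu(f) = sum over the support of f(x) * mu({x})
    return sum(f(x) * m for x, m in mu.items())

def d_M(mu, nu, phis):
    # d_M(mu, nu) = sum_i 2^{-i} |mu(phi_i) - nu(phi_i)|  (truncated sum)
    return sum(abs(integrate(mu, p) - integrate(nu, p)) / 2**i
               for i, p in enumerate(phis))

# Hypothetical test functions bounded by 1, with phi_0 identically 1.
phis = [lambda x: 1.0, lambda x: min(x, 1.0), lambda x: min(x**2, 1.0)]

mu = {0.0: 0.5, 1.0: 0.5}
nu = {0.0: 0.5, 1.0: 0.5}
assert d_M(mu, nu, phis) == 0.0          # identical measures are at distance 0
assert d_M(mu, {0.0: 1.0}, phis) > 0.0   # different measures are separated
```

Truncating the series is harmless for illustration since the tail of (A.24) is bounded by $2^{-i}(\mu(\mathbf{1})+\nu(\mathbf{1}))$ summed over the omitted indices.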
Using $d_{\mathcal{M}}$, the almost sure convergence 2. is equivalent to

2′. $\lim_{n\to\infty}d_{\mathcal{M}}(\mu^n,\mu)=0$, $\mathbb{P}$-a.s.

If there exists an integrable random variable $w:\Omega\to\mathbb{R}$ such that $\mu^n(\mathbf{1})\le w$ for all $n$, then similarly, (1) implies

1′. $\lim_{n\to\infty}\mathbb{E}\left[d_{\mathcal{M}}(\mu^n,\mu)\right]=0$.

However, a stronger condition (such as tightness) must be imposed in order for condition (1′) to be equivalent to condition (1).

It is usually the case that convergence in expectation is easier to establish than almost sure convergence. However, if we have control on the higher moments of the error variables $\mu^nf-\mu f$, then we can deduce the almost sure convergence of $\mu^n$ to $\mu$. The following remark shows how this can be achieved and is used repeatedly in Chapters 8, 9 and 10.

Remark A.38. i. Assume that there exists a constant $p>1$ and a countable convergence determining set $\mathcal{M}$ such that, for any $f\in\mathcal{M}$, we have
\[
\mathbb{E}\left[|\mu^nf-\mu f|^{2p}\right]\le\frac{c_f}{n^{p}},
\]
where $c_f$ is a positive constant independent of $n$. Then, for any $\varepsilon\in(0,1/2-1/(2p))$ there exists a positive random variable $c_{f,\varepsilon}$, almost surely finite, such that
\[
|\mu^nf-\mu f|\le\frac{c_{f,\varepsilon}}{n^{\varepsilon}}.
\]
In particular, $\lim_{n\to\infty}\mu^n=\mu$, $\mathbb{P}$-a.s.

ii. Similarly, assume that there exists a constant $p>1$ and a countable convergence determining set $\mathcal{M}$ such that
\[
\mathbb{E}\left[d_{\mathcal{M}}(\mu^n,\mu)^{2p}\right]\le\frac{c}{n^{p}},
\]
where $d_{\mathcal{M}}$ is the metric defined in (A.24) and $c$ is a positive constant independent of $n$. Then, for any $\varepsilon\in(0,1/2-1/(2p))$ there exists a positive random variable $c_{\varepsilon}$, almost surely finite, such that
\[
d_{\mathcal{M}}(\mu^n,\mu)\le\frac{c_{\varepsilon}}{n^{\varepsilon}},\quad \mathbb{P}\text{-a.s.}
\]
In particular, $\lim_{n\to\infty}\mu^n=\mu$, $\mathbb{P}$-a.s.

Proof. As in the proof of Remark A.37,
\[
\mathbb{E}\Big[\sum_{n=1}^{\infty}n^{2\varepsilon p}\,|\mu^nf-\mu f|^{2p}\Big]\le\sum_{n=1}^{\infty}\frac{c_f}{n^{\,p-2\varepsilon p}}<\infty,
\]
since $p-2\varepsilon p>1$. Let $c_{f,\varepsilon}$ be the random variable
\[
c_{f,\varepsilon}=\Big(\sum_{n=1}^{\infty}n^{2\varepsilon p}\,|\mu^nf-\mu f|^{2p}\Big)^{1/2p}.
\]
As $(c_{f,\varepsilon})^{2p}$ is integrable, $c_{f,\varepsilon}$ is almost surely finite and
\[
n^{\varepsilon}\,|\mu^nf-\mu f|\le c_{f,\varepsilon}.
\]
Therefore $\lim_{n\to\infty}\mu^nf=\mu f$ for any $f\in\mathcal{M}$. Again, since $\mathcal{M}$ is countable and convergence determining, it also follows that $\lim_{n\to\infty}\mu^n=\mu$, $\mathbb{P}$-a.s.
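A hypothetical numerical illustration of Remark A.38(i), not from the text: take $\mu^n$ to be the empirical measure of $n$ i.i.d. Uniform(0,1) samples and $f(x)=x$, so $\mu f = 1/2$. For bounded $f$ the moment bound holds with $p=2$ (by the Marcinkiewicz–Zygmund inequality), so $n^{\varepsilon}|\mu^nf-\mu f|$ should stay bounded along the sample path for $\varepsilon<1/4$.

```python
import random

# Empirical-measure illustration of Remark A.38 (i) with p = 2 and
# mu = Uniform(0, 1): the scaled errors n^eps |mu^n f - mu f| remain
# bounded for eps < 1/2 - 1/(2p) = 1/4.
random.seed(0)
f = lambda x: x          # bounded test function on [0, 1]
mu_f = 0.5               # exact integral of f against Uniform(0, 1)
samples = [random.random() for _ in range(100_000)]

eps = 0.2                # any eps in (0, 1/4) works here
running_sum, worst = 0.0, 0.0
for n, x in enumerate(samples, start=1):
    running_sum += f(x)
    err = abs(running_sum / n - mu_f)    # |mu^n f - mu f|
    worst = max(worst, n**eps * err)

assert worst < 10.0      # n^eps |mu^n f - mu f| is bounded along the path
```

The bound `worst` plays the role of the random variable $c_{f,\varepsilon}$: it is path-dependent but almost surely finite.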
Part (ii) of the remark follows in a similar manner.

A.11 Gronwall's Lemma

An important and frequently used result in the theory of stochastic differential equations is Gronwall's lemma.

Lemma A.39 (Gronwall). Let $x$, $y$ and $z$ be measurable non-negative functions on the real numbers. If $y$ is bounded and $z$ is integrable on $[0,T]$ for some $T\ge 0$, and for all $0\le t\le T$,
\[
x_t\le z_t+\int_0^t x_sy_s\,\mathrm{d}s,\tag{A.25}
\]
then for all $0\le t\le T$,
\[
x_t\le z_t+\int_0^t z_sy_s\exp\left(\int_s^t y_r\,\mathrm{d}r\right)\mathrm{d}s.
\]
Proof. Multiplying both sides of the inequality (A.25) by $y_t\exp\left(-\int_0^t y_s\,\mathrm{d}s\right)$ yields
\[
x_ty_t\exp\left(-\int_0^t y_s\,\mathrm{d}s\right)-\left(\int_0^t x_sy_s\,\mathrm{d}s\right)y_t\exp\left(-\int_0^t y_s\,\mathrm{d}s\right)\le z_ty_t\exp\left(-\int_0^t y_s\,\mathrm{d}s\right).
\]
The left-hand side can be written as the derivative of a product,
\[
\frac{\mathrm{d}}{\mathrm{d}t}\left[\left(\int_0^t x_sy_s\,\mathrm{d}s\right)\exp\left(-\int_0^t y_s\,\mathrm{d}s\right)\right]\le z_ty_t\exp\left(-\int_0^t y_s\,\mathrm{d}s\right),
\]
which can be integrated to give
\[
\left(\int_0^t x_sy_s\,\mathrm{d}s\right)\exp\left(-\int_0^t y_s\,\mathrm{d}s\right)\le\int_0^t z_sy_s\exp\left(-\int_0^s y_r\,\mathrm{d}r\right)\mathrm{d}s,
\]
or equivalently
\[
\int_0^t x_sy_s\,\mathrm{d}s\le\int_0^t z_sy_s\exp\left(\int_s^t y_r\,\mathrm{d}r\right)\mathrm{d}s.
\]
Combining this with the original inequality (A.25) gives the desired result.

Corollary A.40. If $x$ is a real-valued function such that for all $t\ge 0$,
\[
x_t\le A+B\int_0^t x_s\,\mathrm{d}s,
\]
then for all $t\ge 0$,
\[
x_t\le Ae^{Bt}.
\]
Proof. We have for $t\ge 0$,
\[
x_t\le A+\int_0^t ABe^{B(t-s)}\,\mathrm{d}s= A+ABe^{Bt}\,\frac{e^{-tB}-1}{-B}=Ae^{Bt}.
\]

A.12 Explicit Construction of the Underlying Sample Space for the Stochastic Filtering Problem

Let $(S,d)$ be a complete separable metric space (a Polish space) and $\Omega^1$ be the space of $S$-valued continuous functions defined on $[0,\infty)$, endowed with the topology of uniform convergence on compact intervals and with the associated Borel $\sigma$-algebra denoted by $\mathcal{F}^1$,
\[
\Omega^1=C([0,\infty),S),\qquad \mathcal{F}^1=\mathcal{B}(\Omega^1).\tag{A.26}
\]
Let $X$ be an $S$-valued process defined on this space, $X_t(\omega^1)=\omega^1(t)$, $\omega^1\in\Omega^1$. We observe that $X_t$ is measurable with respect to the $\sigma$-algebra $\mathcal{F}^1$ and consider the filtration associated with the process $X$,
\[
\mathcal{F}^1_t=\sigma(X_s,\ s\in[0,t]).
\]
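Stepping back to Corollary A.40, a quick numerical check (assuming nothing beyond the corollary itself): the Euler discretisation of $x_t=A+B\int_0^t x_s\,\mathrm{d}s$ produces $x_k=A(1+B\,\Delta t)^k$, which stays below the Gronwall bound $Ae^{Bt}$ since $1+u\le e^u$.

```python
import math

# Euler discretisation of x_t = A + B * integral_0^t x_s ds; the iterates
# equal A (1 + B*dt)^k and the corollary bounds them by A e^{Bt}.
A, B, T, steps = 2.0, 1.5, 3.0, 3000
dt = T / steps

x, integral = A, 0.0
for k in range(steps + 1):
    t = k * dt
    assert x <= A * math.exp(B * t) + 1e-12   # Gronwall bound holds at each step
    integral += x * dt                        # left Riemann sum of the integral
    x = A + B * integral

# The discrete solution converges to the equality case A e^{BT} as dt -> 0.
assert abs(x - A * math.exp(B * T)) / (A * math.exp(B * T)) < 0.01
```

The same discretisation run with any non-negative perturbation subtracted from `x` still satisfies the in-loop bound, mirroring the fact that the corollary covers inequalities rather than just the equality case.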
(A.27)

Let $A:C_b(S)\to C_b(S)$ be an unbounded operator with domain $\mathcal{D}(A)$ such that $\mathbf{1}\in\mathcal{D}(A)$ and $A\mathbf{1}=0$, and let $\mathbb{P}^1$ be a probability measure which is a solution of the martingale problem associated with the infinitesimal generator $A$ and the initial distribution $\pi_0\in\mathcal{P}(S)$; that is, under $\mathbb{P}^1$ the distribution of $X_0$ is $\pi_0$ and
\[
M^{\varphi}_t=\varphi(X_t)-\varphi(X_0)-\int_0^t A\varphi(X_s)\,\mathrm{d}s,\qquad \mathcal{F}^1_t,\quad 0\le t<\infty,\tag{A.28}
\]
is a martingale for any $\varphi\in\mathcal{D}(A)$. Let also $\Omega^2$ be defined similarly to $\Omega^1$, but with $S=\mathbb{R}^m$. Hence
\[
\Omega^2=C([0,\infty),\mathbb{R}^m),\qquad \mathcal{F}^2=\mathcal{B}(\Omega^2).\tag{A.29}
\]
We consider also $V$ to be the canonical process in $\Omega^2$ (i.e. $V_t(\omega^2)=\omega^2(t)$, $\omega^2\in\Omega^2$) and $\mathbb{P}^2$ to be a probability measure such that $V$ is an $m$-dimensional standard Brownian motion on $(\Omega^2,\mathcal{F}^2)$ with respect to it. We consider now the following:
\[
\Omega\triangleq\Omega^1\times\Omega^2,\qquad \mathcal{F}'\triangleq\mathcal{F}^1\otimes\mathcal{F}^2,\qquad \mathbb{P}\triangleq\mathbb{P}^1\otimes\mathbb{P}^2,
\]
\[
\mathcal{N}\triangleq\{B\subset\Omega: B\subset A',\ A'\in\mathcal{F}',\ \mathbb{P}(A')=0\},\qquad \mathcal{F}\triangleq\mathcal{F}'\vee\mathcal{N}.
\]
So $(\Omega,\mathcal{F},\mathbb{P})$ is a complete probability space and, under $\mathbb{P}$, $X$ and $V$ are two independent processes. They can be viewed as processes on the product space $(\Omega,\mathcal{F},\mathbb{P})$ in the usual way: as projections onto their original spaces of definition. If $W$ is the canonical process on $\Omega$, then
\[
W(t)=\omega(t)=(\omega^1(t),\omega^2(t)),
\]
\[
X=p^1(\omega)\quad\text{where } p^1:\Omega\to\Omega^1,\ p^1(\omega)=\omega^1,
\]
\[
V=p^2(\omega)\quad\text{where } p^2:\Omega\to\Omega^2,\ p^2(\omega)=\omega^2.
\]
$M^\varphi_t$ is also a martingale with respect to the larger filtration $\mathcal{F}_t$, where
\[
\mathcal{F}_t=\sigma(X_s,V_s,\ s\in[0,t])\vee\mathcal{N}.
\]
Let $h:S\to\mathbb{R}^m$ be a Borel-measurable function with the property that
\[
\mathbb{P}\left(\int_0^T \|h(X_s)\|\,\mathrm{d}s<\infty\right)=1\qquad\text{for all } T>0.
\]
Finally let $Y$ be the following stochastic process (usually called the observation process)
\[
Y_t=\int_0^t h(X_s)\,\mathrm{d}s+V_t,\qquad t\ge 0.
\]

B Stochastic Analysis

B.1 Martingale Theory in Continuous Time

The subject of martingale theory is too large to cover in an appendix such as this. There are many useful references, for example, Rogers and Williams [248] or Doob [81].

Theorem B.1. If $M=\{M_t,\ t\ge 0\}$ is a right continuous martingale bounded in $L^p$ for $p\ge 1$, that is, $\sup_{t\ge 0}\mathbb{E}[|M_t|^p]<\infty$, then there exists an $L^p$-integrable random variable $M_\infty$ such that $M_t\to M_\infty$ almost surely as $t\to\infty$.
Furthermore,

1. If $M$ is bounded in $L^p$ for $p>1$, then $M_t\to M_\infty$ in $L^p$ as $t\to\infty$.
2. If $M$ is bounded in $L^1$ and $\{M_t,\ t\ge 0\}$ is uniformly integrable, then $M_t\to M_\infty$ in $L^1$ as $t\to\infty$.

If either condition (1) or (2) holds, then the extended process $\{M_t,\ t\in[0,\infty]\}$ is a martingale. For a proof see Theorem 1.5 of Chung and Williams [53].

The following lemma provides a very useful test for identifying martingales.

Lemma B.2. Let $M=\{M_t,\ t\ge 0\}$ be a càdlàg adapted process such that for each bounded stopping time $T$, $\mathbb{E}[|M_T|]<\infty$ and $\mathbb{E}[M_T]=\mathbb{E}[M_0]$; then $M$ is a martingale.

Proof. For $s<t$, let $A\in\mathcal{F}_s$ and define
\[
T\triangleq s\mathbf{1}_A+t\mathbf{1}_{A^c}.
\]
Then $T$ is a bounded stopping time and
\[
\mathbb{E}[M_0]=\mathbb{E}[M_T]=\mathbb{E}[M_s\mathbf{1}_A]+\mathbb{E}[M_t\mathbf{1}_{A^c}],
\]
and trivially for the stopping time $t$,
\[
\mathbb{E}[M_0]=\mathbb{E}[M_t]=\mathbb{E}[M_t\mathbf{1}_A]+\mathbb{E}[M_t\mathbf{1}_{A^c}],
\]
so $\mathbb{E}[M_t\mathbf{1}_A]=\mathbb{E}[M_s\mathbf{1}_A]$, which implies that $M_s=\mathbb{E}[M_t\mid\mathcal{F}_s]$ a.s., which together with the integrability condition implies that $M$ is a martingale.

By a straightforward change to this proof the following corollary may be established.

Corollary B.3. Let $\{M_t,\ t\ge 0\}$ be a càdlàg adapted process such that for each stopping time (potentially infinite) $T$, $\mathbb{E}[|M_T|]<\infty$ and $\mathbb{E}[M_T]=\mathbb{E}[M_0]$; then $M$ is a uniformly integrable martingale.

Definition B.4. Let $M$ be a stochastic process. If $M_0$ is $\mathcal{F}_0$-measurable and there exists an increasing sequence $T_n$ of stopping times such that $T_n\to\infty$ a.s. and such that
\[
M^{T_n}=\{M_{t\wedge T_n}-M_0,\ t\ge 0\}
\]
is an $\mathcal{F}_t$-adapted martingale for each $n\in\mathbb{N}$, then $M$ is called a local martingale and the sequence $T_n$ is called a reducing sequence for the local martingale $M$. The initial condition $M_0$ is treated separately to avoid imposing integrability conditions on $M_0$.

B.2 Itô Integral

The stochastic integrals which arise in this book are the integrals of stochastic processes with respect to continuous local martingales.
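The condition of Lemma B.2 can be sanity-checked by simulation for a simple martingale. Below, $M$ is a symmetric random walk (a hypothetical example, not from the text) and $T$ is the bounded stopping time $\min(\text{first exit time of }(-3,3),\,50)$; optional sampling gives $\mathbb{E}[M_T]=\mathbb{E}[M_0]=0$, so the Monte Carlo mean should be near zero.

```python
import random

# Monte Carlo check that E[M_T] = E[M_0] for a symmetric random walk M
# stopped at the bounded stopping time T = min(first exit of (-3, 3), 50).
random.seed(2)

def sample_M_T():
    m, t = 0, 0
    while abs(m) < 3 and t < 50:   # T is bounded by 50, as Lemma B.2 requires
        m += random.choice((-1, 1))
        t += 1
    return m

trials = 20_000
mean = sum(sample_M_T() for _ in range(trials)) / trials
assert abs(mean) < 0.15            # consistent with E[M_T] = E[M_0] = 0
```

Boundedness of $T$ matters: for the unbounded stopping time $T=\inf\{t:M_t=1\}$ the walk gives $M_T=1$ a.s., so $\mathbb{E}[M_T]\ne\mathbb{E}[M_0]$, which is why Corollary B.3 strengthens the hypothesis rather than dropping it.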
The following section contains a very brief overview of the construction of the Itô integral in this context and the necessary conditions on the integrands for the integral to be well defined. The results are presented starting from previsible integrands, since in the general theory of stochastic integration these form the natural class of integrands. The results then extend, in the case of continuous martingale integrators, to integrands in the class of progressively measurable processes; and if the quadratic variation of the continuous martingale is absolutely continuous with respect to Lebesgue measure (as, for example, in the case of integrals with respect to Brownian motion), then this extends further to all adapted, jointly measurable processes. It is also possible to construct directly the stochastic integral with a continuous martingale integrator on the space of progressively measurable processes (this approach is followed in e.g. Ethier and Kurtz [95]). There are numerous references which describe the material in this section in much greater detail; examples include Chung and Williams [53], Karatzas and Shreve [149], Protter [247] and Dellacherie and Meyer [79].

Definition B.5. The previsible (predictable) $\sigma$-algebra, denoted $\mathcal{P}$, is the $\sigma$-algebra of subsets of $[0,\infty)\times\Omega$ generated by the left continuous processes valued in $\mathbb{R}$; that is, it is the smallest $\sigma$-algebra with respect to which all left continuous processes are measurable. A process is said to be previsible if it is $\mathcal{P}$-measurable.

Lemma B.6. Let $\mathcal{A}$ be the ring† of subsets of $[0,\infty)\times\Omega$ generated by the sets of the form $(s,t]\times A$ where $A\in\mathcal{F}_s$ and $0\le s<t$.
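To make the notion of a simple previsible integrand concrete, the sketch below (hypothetical, with Brownian increments simulated on a grid) integrates $H_t=\operatorname{sign}(W_{t_i})$ on each interval $(t_i,t_{i+1}]$; $H$ is previsible because it uses the path only up to the left endpoint of each interval. Since $H^2\equiv 1$, the Itô isometry predicts $\mathbb{E}[I]=0$ and $\mathbb{E}[I^2]=T$.

```python
import math
import random

# Ito integral of a simple previsible integrand against Brownian motion:
# I = sum_i H_i (W_{t_{i+1}} - W_{t_i}), with H_i measurable w.r.t. the
# path up to t_i (here H_i = sign(W_{t_i})).
random.seed(3)
T, steps, trials = 1.0, 50, 20_000
dt = T / steps

def ito_integral():
    w, integral = 0.0, 0.0
    for _ in range(steps):
        h = 1.0 if w >= 0 else -1.0           # previsible: uses W up to t_i only
        dw = random.gauss(0.0, math.sqrt(dt))  # increment W_{t_{i+1}} - W_{t_i}
        integral += h * dw
        w += dw
    return integral

vals = [ito_integral() for _ in range(trials)]
mean = sum(vals) / trials
second = sum(v * v for v in vals) / trials
assert abs(mean) < 0.05         # E[I] = 0: the integral is a martingale at T
assert abs(second - T) < 0.05   # Ito isometry: E[I^2] = E[int_0^T H^2 dt] = T
```

Evaluating `h` *after* the increment `dw` would make the integrand anticipating rather than previsible, and both sample moments would then drift away from the isometry values; this is exactly the measurability restriction that Definition B.5 encodes.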