Chapter 8  Stratonovich's Theory

From an abstract mathematical standpoint, Itô's theory of stochastic integration has a serious flaw: it behaves dreadfully under changes of coordinates (cf. Remark 3.3.6). In fact, Itô's formula itself is the most dramatic manifestation of this problem.

The origin of the problems Itô's theory has with coordinate changes can be traced back to its connection with independent increment processes. Indeed, the very notion of an independent increment process is inextricably tied to the linear structure of Euclidean space, and anything but a linear change of coordinates will wreak havoc on that structure. Generalizations of Itô's theory like the one of Kunita and Watanabe do not really cure this problem, they only make it slightly less painful.

To make Itô's theory more amenable to coordinate changes, we will develop an idea which was introduced by R.L. Stratonovich. Stratonovich was motivated by applications to engineering, and his own treatment [34] had some mathematically awkward aspects. In fact, it is ironic, if not surprising, that Itô [12] was the one who figured out how to put Stratonovich's ideas on a firm mathematical foundation.

8.1 Semimartingales & Stratonovich Integrals

From a technical perspective, the coordinate change problem alluded to above is a consequence of the fact that non-linear functions destroy the martingale property. Thus, our first step will be to replace martingales with a class of processes which is invariant under composition with non-linear functions.

Throughout, (Ω, F, P) will be a complete probability space which comes equipped with a non-decreasing family {F_t : t ≥ 0} of P-complete σ-algebras.

8.1.1. Semimartingales. We will say that Z : [0,∞) × Ω → R is a continuous semimartingale¹ and will write Z ∈ S(P;R) if Z can be written in the form Z(t,ω) = M(t,ω) + B(t,ω), where (M(t), F_t, P) is a continuous local martingale and B is a progressively measurable function with the property that B(·,ω) is a continuous function of locally bounded variation for P-almost every ω. Notice that when we insist that M(0) ≡ 0, the decomposition of a semimartingale Z into its martingale part M and its bounded variation part B is almost surely unique. This is just an application of the first statement in Corollary 7.1.2. In addition, Itô's formula (cf. Theorem 7.2.9) shows that the class of continuous semimartingales is invariant under composition with twice continuously differentiable functions f. In fact, his formula says that if Z(t) = (M_1(t) + B_1(t), …, M_n(t) + B_n(t)) is an R^n-valued, continuous semimartingale and F ∈ C²(R^n;R), then the martingale and bounded variation parts of F ∘ Z are, respectively,

¹ It is unfortunate that Doob originally adopted this term for the class of processes which are now called submartingales. However, the confusion caused by this terminological accident recedes along with the generation of probabilists for whom it was a problem.

\[
\sum_{i=1}^n \int_0^t \partial_i F\bigl(Z(\tau)\bigr)\,dM_i(\tau)
\quad\text{and}\quad
F\bigl(Z(0)\bigr) + \frac12\sum_{i,j=1}^n \int_0^t \partial_i\partial_j F\bigl(Z(\tau)\bigr)\,\langle M_i,M_j\rangle(d\tau)
+ \sum_{i=1}^n \int_0^t \partial_i F\bigl(Z(\tau)\bigr)\,B_i(d\tau),
\]

where, of course, the integrals in the first expression are taken in the sense of Itô and the ones in the second are Lebesgue (or Riemann) integrals. In various circumstances it is useful to have defined the integral

\[
\int_0^t \theta(\tau)\,dZ(\tau) = \int_0^t \theta(\tau)\,dM(\tau) + \int_0^t \theta(\tau)\,B(d\tau)
\quad\text{for } Z = M + B \text{ and } \theta \in \Theta^2_{\mathrm{loc}}\bigl(\langle M\rangle, P;\mathbb R\bigr),
\]

where the dM(τ)-integral is taken in the sense of Itô and the B(dτ)-integral is taken à la Lebesgue. Also, we take

\[
(8.1.1)\qquad \langle Z_1,Z_2\rangle(t) \equiv \langle M_1,M_2\rangle(t)\quad\text{if } Z_i = M_i + B_i \text{ for } i\in\{1,2\}.
\]

Indeed, with this notation, Itô's formula (cf. Theorem 7.2.9) becomes

\[
(8.1.2)\qquad F\bigl(Z(t)\bigr) - F\bigl(Z(0)\bigr)
= \sum_{i=1}^n \int_0^t \partial_i F\bigl(Z(\tau)\bigr)\,dZ_i(\tau)
+ \frac12\sum_{i,j=1}^n \int_0^t \partial_i\partial_j F\bigl(Z(\tau)\bigr)\,\langle Z_i,Z_j\rangle(d\tau).
\]

The notational advantage of (8.1.2) should be obvious. On the other hand, the disadvantage is that it mixes the martingale and bounded variation parts of the right hand side, and this mixing has to be disentangled before expectation values are taken. Finally, it should be noticed that the extension of the bracket given in (8.1.1) is more than a notational device and has intrinsic meaning. Namely, by Exercise 7.2.13 and the easily checked fact that, as N → ∞,

\[
\sum_{0\le m\le 2^N t}
\Bigl(Z_1\bigl((m+1)2^{-N}\bigr) - Z_1\bigl(m2^{-N}\bigr)\Bigr)\Bigl(Z_2\bigl((m+1)2^{-N}\bigr) - Z_2\bigl(m2^{-N}\bigr)\Bigr)
- \Bigl(M_1\bigl((m+1)2^{-N}\bigr) - M_1\bigl(m2^{-N}\bigr)\Bigr)\Bigl(M_2\bigl((m+1)2^{-N}\bigr) - M_2\bigl(m2^{-N}\bigr)\Bigr)
\]
tends to 0 P-almost surely uniformly for t in compacts, we know that

\[
(8.1.3)\qquad \sum_{m=0}^{\infty}
\Bigl(Z_1\bigl(t\wedge(m+1)2^{-N}\bigr) - Z_1\bigl(t\wedge m2^{-N}\bigr)\Bigr)
\Bigl(Z_2\bigl(t\wedge(m+1)2^{-N}\bigr) - Z_2\bigl(t\wedge m2^{-N}\bigr)\Bigr)
\longrightarrow \langle Z_1,Z_2\rangle(t)
\]
in P-probability uniformly for t in compacts.

8.1.2. Stratonovich's Integral. Keeping in mind that Stratonovich's purpose was to make stochastic integrals better adapted to changes of variable, it may be best to introduce his integral by showing that it is the integral at which one arrives if one adopts a somewhat naive approach, one which reveals its connection to Riemann's theory. Namely, given elements X and Y of S(P;R), we want to define the Stratonovich integral² ∫₀ᵗ Y(τ)∘dX(τ) of Y with respect to X as the limit of the Riemann integrals

\[
\lim_{N\to\infty}\int_0^t Y^N(\tau)\,dX^N(\tau),
\]
where, for α : [0,∞) → R, we use α^N to denote the polygonal approximation of α obtained by linear interpolation on each interval [m2^{-N}, (m+1)2^{-N}]. That is,

\[
\alpha^N(t) = \alpha\bigl(m2^{-N}\bigr) + 2^N\bigl(t - m2^{-N}\bigr)\,\Delta_m^N\alpha,
\qquad\text{where } \Delta_m^N\alpha \equiv \alpha\bigl((m+1)2^{-N}\bigr) - \alpha\bigl(m2^{-N}\bigr),
\]
for all m, N ∈ N and t ∈ [m2^{-N}, (m+1)2^{-N}].
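For readers who like to experiment, the dyadic polygonal approximation α^N is easy to code. The sketch below (plain Python; the function name `polygonal` is ours, not the text's) evaluates α^N from the values of α at the dyadic points and checks the two defining properties: α^N agrees with α at the points m2^{-N} and is linear in between.

```python
import math

def polygonal(alpha, N, t):
    """Evaluate the dyadic polygonal approximation alpha^N at time t.

    alpha : callable, the path being interpolated
    N     : level of the dyadic grid, mesh 2**-N
    t     : evaluation time, t >= 0
    """
    h = 2.0 ** (-N)
    m = math.floor(t / h)                    # [t]_N = m 2^{-N}
    inc = alpha((m + 1) * h) - alpha(m * h)  # Delta_m^N alpha
    return alpha(m * h) + (t - m * h) / h * inc

# alpha^N reproduces alpha at dyadic points and is linear in between
alpha = lambda t: t * t
N = 4
h = 2.0 ** (-N)
assert abs(polygonal(alpha, N, 3 * h) - alpha(3 * h)) < 1e-12
# at the midpoint of [3h, 4h] one gets the average of the endpoint values
mid = polygonal(alpha, N, 3.5 * h)
assert abs(mid - 0.5 * (alpha(3 * h) + alpha(4 * h))) < 1e-12
```

In particular, for a Brownian path α, the sample values at the dyadic points determine α^N completely, which is what makes the scheme below computable.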

² This notation is only one of many which are used to indicate it is Stratonovich's, as opposed to Itô's, sense in which an integral is being taken. Others include the use of δX(τ) in place of ∘dX(τ). The lack of agreement about the choice of notation reflects the inadequacy of all the variants.

Of course, before we can adopt this definition, we are obliged to check that this limit exists. For this purpose, write

\[
\int_0^t Y^N(\tau)\,dX^N(\tau)
= \int_0^t Y\bigl([\tau]_N\bigr)\,dX(\tau)
+ \int_0^t \bigl(Y^N(\tau) - Y([\tau]_N)\bigr)\,dX^N(\tau),
\]
where the dX-integral on the right can be interpreted either as an Itô or a Riemann integral. By using Corollary 7.2.3, one can easily check that

\[
\Bigl\| \int_0^{\,\cdot} Y\bigl([\tau]_N\bigr)\,dX(\tau) - \int_0^{\,\cdot} Y(\tau)\,dX(\tau) \Bigr\|_{[0,T]} \longrightarrow 0
\quad\text{in } P\text{-probability}.
\]

At the same time, by (8.1.3),

\[
\int_0^t \bigl(Y^N(\tau) - Y([\tau]_N)\bigr)\,dX^N(\tau)
= 4^N \sum_{m\le 2^N t} \Delta_m^N X\,\Delta_m^N Y \int_{m2^{-N}}^{t\wedge(m+1)2^{-N}} \bigl(\tau - m2^{-N}\bigr)\,d\tau
\longrightarrow \tfrac12\,\langle X,Y\rangle(t)
\]
in P-probability uniformly for t in compacts. Hence, for computational purposes, it is best to present the Stratonovich integral as

\[
(8.1.4)\qquad \int_0^t Y(\tau)\circ dX(\tau) = \int_0^t Y(\tau)\,dX(\tau) + \tfrac12\,\langle X,Y\rangle(t),
\]
where the integral on the right is taken in the sense of Itô.

So far as I know, the formula in (8.1.4) was first given by Itô in [12]. In particular, Itô seems to have been the one who realized that the problems posed by Stratonovich's theory could be overcome by insisting that the integrands be semimartingales, in which case, as (8.1.4) makes obvious, the Stratonovich integral of Y with respect to X is again a continuous semimartingale. In fact, if X = M + B is the decomposition of X into its martingale and bounded variation parts, then

\[
\int_0^t Y(\tau)\,dM(\tau) \quad\text{and}\quad \int_0^t Y(\tau)\,dB(\tau) + \tfrac12\,\langle Y,M\rangle(t)
\]

are the martingale and bounded variation parts of I(t) ≡ ∫₀ᵗ Y(τ)∘dX(τ), and so ⟨Z,I⟩(dt) = Y(t)⟨Z,X⟩(dt) for all Z ∈ S(P;R).

To appreciate how clever Itô's observation is, notice that Stratonovich's integral is not really an integral at all. Indeed, in order to deserve being called an integral, an operation should result in a quantity which can be estimated in terms of zeroth order properties of the integrand. On the other hand, as (8.1.4) shows, no such estimate is possible. To see this, take Y(t) = f(Z(t)), where Z ∈ S(P;R) and f ∈ C²(R;R). Then, because ⟨X,Y⟩(dt) = f′(Z(t))⟨X,Z⟩(dt),

\[
\int_0^t Y(\tau)\circ dX(\tau) = \int_0^t f\bigl(Z(\tau)\bigr)\,dX(\tau) + \frac12\int_0^t f'\bigl(Z(\tau)\bigr)\,\langle X,Z\rangle(d\tau),
\]
which demonstrates that there is, in general, no estimate of the Stratonovich integral in terms of the zeroth order properties of the integrand.

Another important application of (8.1.4) is to the behavior of Stratonovich integrals under iteration. That is, suppose that X, Y ∈ S(P;R), and set I(t) = ∫₀ᵗ Y(τ)∘dX(τ). Then ⟨Z,I⟩(dt) = Y(t)⟨Z,X⟩(dt) for all Z ∈ S(P;R), and so

\[
\begin{aligned}
\int_0^t Z(\tau)\circ dI(\tau)
&= \int_0^t Z(\tau)\,dI(\tau) + \frac12\int_0^t Y(\tau)\,\langle Z,X\rangle(d\tau)\\
&= \int_0^t Z(\tau)Y(\tau)\,dX(\tau) + \frac12\int_0^t \Bigl(Z(\tau)\,\langle Y,X\rangle(d\tau) + Y(\tau)\,\langle Z,X\rangle(d\tau)\Bigr)\\
&= \int_0^t Z(\tau)Y(\tau)\,dX(\tau) + \frac12\,\langle ZY,X\rangle(t)
= \int_0^t ZY(\tau)\circ dX(\tau),
\end{aligned}
\]
since, by Itô's formula,

\[
ZY(t) - ZY(0) = \int_0^t Z(\tau)\,dY(\tau) + \int_0^t Y(\tau)\,dZ(\tau) + \langle Z,Y\rangle(t),
\]
and therefore ⟨ZY,X⟩(dt) = Z(t)⟨Y,X⟩(dt) + Y(t)⟨Z,X⟩(dt). In other words, we have now proved that

\[
(8.1.5)\qquad \int_0^t Z(\tau)\circ d\Bigl(\int_0^\tau Y(\sigma)\circ dX(\sigma)\Bigr) = \int_0^t Z(\tau)Y(\tau)\circ dX(\tau).
\]

8.1.3. Itô's Formula & Stratonovich Integration. Because the origin of Stratonovich's integral is in Riemann's theory, it should come as no surprise that Itô's formula looks deceptively like the fundamental theorem of calculus when Stratonovich's integral is used. Namely, let Z = (Z_1,…,Z_n) ∈ S(P;R)^n and f ∈ C³(R^n;R) be given, and set Y_i = ∂_i f ∘ Z for 1 ≤ i ≤ n. Then Y_i ∈ S(P;R) and, by Itô's formula applied to ∂_i f, the local martingale part of Y_i is given by the sum over 1 ≤ j ≤ n of the Itô stochastic integrals of ∂_i∂_j f ∘ Z with respect to the local martingale part of Z_j. Hence,

\[
\langle Y_i,Z_i\rangle(dt) = \sum_{j=1}^n \partial_i\partial_j f\bigl(Z(t)\bigr)\,\langle Z_i,Z_j\rangle(dt).
\]

But, by Itô's formula applied to f, this means that

\[
df\bigl(Z(t)\bigr) = \sum_{i=1}^n \Bigl( Y_i(t)\,dZ_i(t) + \tfrac12\,\langle Y_i,Z_i\rangle(dt) \Bigr),
\]
and so we have now shown that, in terms of Stratonovich integrals, Itô's formula does look like the fundamental theorem of calculus:

\[
(8.1.6)\qquad f\bigl(Z(t)\bigr) - f\bigl(Z(0)\bigr) = \sum_{i=1}^n \int_0^t \partial_i f\bigl(Z(\tau)\bigr)\circ dZ_i(\tau).
\]
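Readers who want to see (8.1.4) and (8.1.6) at work numerically can do so with a few lines of code. In the sketch below (plain Python, with one simulated Brownian path; none of it is part of the text's development) we take f(x) = x²/2, so that ∂f(x) = x, and compare the Itô sums Σ B(t_m)ΔB with the symmetric sums Σ ½(B(t_m)+B(t_{m+1}))ΔB of the kind that produce Stratonovich's integral. The symmetric sums telescope exactly to B(T)²/2 = f(B(T)) − f(B(0)), in accordance with (8.1.6), while the two kinds of sums differ by ½Σ(ΔB)², which is close to ½⟨B,B⟩(T) = T/2, in accordance with (8.1.4).

```python
import random

random.seed(7)
n, T = 2 ** 16, 1.0
dt = T / n

# one sample path of Brownian motion on [0, T]
B = [0.0]
for _ in range(n):
    B.append(B[-1] + random.gauss(0.0, dt ** 0.5))

dB = [B[m + 1] - B[m] for m in range(n)]
ito = sum(B[m] * dB[m] for m in range(n))                       # Ito sums
strat = sum(0.5 * (B[m] + B[m + 1]) * dB[m] for m in range(n))  # symmetric sums

# (8.1.6) with f(x) = x^2/2: the symmetric sums telescope to f(B(T)) - f(B(0))
assert abs(strat - 0.5 * B[-1] ** 2) < 1e-8
# (8.1.4): the two integrals differ by (1/2)<B,B>(T), i.e. by roughly T/2
assert abs((strat - ito) - 0.5 * T) < 0.05
```

The first assertion holds path by path and to machine precision; the second involves the quadratic variation sums and therefore holds only up to a statistical error of order n^{-1/2}.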

As I warned above, (8.1.6) is deceptive. For one thing, as its derivation makes clear, in spite of its attractive form it is really just Itô's formula (8.1.2) in disguise. In fact, if, as we will, one adopts Itô's approach to Stratonovich's integral, then it is not even that. Indeed, one cannot write (8.1.6) unless one knows that ∂_i f ∘ Z is a semimartingale. Thus, in general, (8.1.6) requires us to assume that f is three times continuously differentiable, not just twice, as is the case with (8.1.2). Ironically, we have arrived at a first order fundamental theorem of calculus which applies only to functions with three derivatives. In view of these remarks, it is significant that, at least in the case of Brownian motion, Itô found a way (cf. [13] or Exercise 8.1.8 below) to make (8.1.6) closer to a true Fundamental Theorem of Calculus, at least in the sense that it applies to all f ∈ C¹(R^n;R).

Remark 8.1.7. Putting (8.1.6) together with our earlier footnote about the notation for Stratonovich integrals, one might be inclined to think the right notation should be ∫₀ᵗ Y(τ)Ẋ(τ) dτ. For one thing, this notation recognizes that the Stratonovich integral is closely related to the notion of generalized derivatives à la Schwartz's distribution theory. Secondly, (8.1.6) can be summarized in differential form by the expression

\[
df\bigl(Z(t)\bigr) = \sum_{i=1}^n \partial_i f\bigl(Z(t)\bigr)\circ dZ_i(t),
\]
which would take the appealing form

\[
\frac{d}{dt}\,f\bigl(Z(t)\bigr) = \sum_{i=1}^n \partial_i f\bigl(Z(t)\bigr)\,\dot Z_i(t).
\]

Of course, the preceding discussion should also make one cautious about being too credulous about all this.

8.1.4. Exercises.

Exercise 8.1.8. In this exercise we will describe Itô's approach to extending the validity of (8.1.6).

(i) The first step is to give another description of Stratonovich integrals. Namely, given X, Y ∈ S(P;R), show that the Stratonovich integral of Y with respect to X over the interval [0,T] is almost surely equal to
\[
(*)\qquad \lim_{N\to\infty}\sum_{m=0}^{2^N-1}
\frac{Y\bigl(\frac{(m+1)T}{2^N}\bigr) + Y\bigl(\frac{mT}{2^N}\bigr)}{2}
\Bigl( X\Bigl(\frac{(m+1)T}{2^N}\Bigr) - X\Bigl(\frac{mT}{2^N}\Bigr) \Bigr).
\]
Thus, even if Y is not a semimartingale, we will say that Y : [0,T] × Ω → R is Stratonovich integrable on [0,T] with respect to X if the limit in (*) exists in P-measure, in which case we will use ∫₀ᵀ Y(t)∘dX(t) to denote this limit. It must be emphasized that the definition here is "T by T" and not simultaneous for all T's in an interval.

(ii) Given an X ∈ S(P;R) and a T > 0, set X̌^T(t) = X((T − t)⁺), and suppose that (X̌^T(t), F̌_t^T, P) is a semimartingale relative to some filtration {F̌_t^T : t ≥ 0}. Given a Y : [0,T] × Ω → R with the properties that Y(·,ω) ∈ C([0,T];R) for P-almost every ω and that ω ↦ Y(t,ω) is F_t ∩ F̌_{T−t}^T-measurable for each t ∈ [0,T], show that Y is Stratonovich integrable on [0,T] with respect to X. In fact, show that

\[
\int_0^T Y(t)\circ dX(t) = \frac12\int_0^T Y(t)\,dX(t) - \frac12\int_0^T Y(T-t)\,d\check X^T(t),
\]
where each of the integrals on the right is taken in the sense of Itô.

(iii) Let (β(t), F_t, P) be an R^n-valued Brownian motion. Given T ∈ (0,∞), set β̌^T(t) = β((T − t)⁺) and F̌_t^T = σ({β̌^T(τ) : τ ∈ [0,t]}), and show that, for each ξ ∈ R^n, ((ξ, β̌^T(t))_{R^n}, F̌_t^T, P) is a semimartingale with bracket |ξ|²(t ∧ T) and bounded variation part
\[
-\int_0^{t\wedge T} \frac{\bigl(\xi,\check\beta^T(\tau)\bigr)_{\mathbb R^n}}{T-\tau}\,d\tau.
\]
In particular, show that, for each ξ ∈ R^n and g ∈ C(R^n;R), t ↦ g(β(t)) is Stratonovich integrable on [0,T] with respect to t ↦ (ξ, β(t))_{R^n}. Further, if {g_n}₁^∞ ⊆ C(R^n;R) and g_n → g uniformly on compacts, show that
\[
\int_0^T g_n\bigl(\beta(t)\bigr)\circ d\bigl(\xi,\beta(t)\bigr)_{\mathbb R^n}
\longrightarrow \int_0^T g\bigl(\beta(t)\bigr)\circ d\bigl(\xi,\beta(t)\bigr)_{\mathbb R^n}
\]
in P-measure.

(iv) Continuing with the notation in (iii), show that

\[
f\bigl(\beta(T)\bigr) - f(0) = \sum_{i=1}^n \int_0^T \partial_i f\bigl(\beta(t)\bigr)\circ d\beta_i(t)
\]
for every f ∈ C¹(R^n;R). In keeping with the comment at the end of (i), it is important to recognize that although this form of Itô's formula holds for all continuously differentiable functions, it is, in many ways, less useful than the forms which we obtained previously. In particular, when f is no better than once differentiable, the right hand side is defined only up to a P-null set for each T and not for all T's simultaneously.

(v) Let (β(t), F_t, P) be an R-valued Brownian motion. Show that, for each T ∈ (0,∞), t ↦ sgn(β(t)) is Stratonovich integrable on [0,T] with respect to β, and arrive at

\[
\beta(T) = \int_0^T \mathrm{sgn}\bigl(\beta(t)\bigr)\circ d\beta(t).
\]

After comparing this with the result in (6.1.7), conclude that the local time ℓ(T,0) of β at 0 satisfies

\[
\int_0^T \frac{|\beta(t)|}{t}\,dt - \ell(T,0) - \beta(T) = \int_0^T \mathrm{sgn}\bigl(\beta(T-t)\bigr)\,d\check M^T(t),
\]
where M̌^T is the martingale part of β̌^T. In particular, the expression on the left hand side is a centered Gaussian random variable with variance T.

Exercise 8.1.9. Itô's formula proves that f ∘ Z ∈ S(P;R) whenever Z = (Z_1,…,Z_n) ∈ S(P;R)^n and f ∈ C²(R^n;R). On the other hand, as Tanaka's treatment (cf. §6.1) of local time makes clear, it is not always necessary to know that f has two continuous derivatives. Indeed, both (6.1.3) and (6.1.7) provide examples in which the composition of a martingale with a continuous function leads to a continuous semimartingale even though the derivative of the function is discontinuous. More generally, as a corollary of the Doob–Meyer Decomposition Theorem (alluded to at the beginning of §7.1) one can show that f ∘ Z will be a continuous semimartingale whenever Z ∈ M_loc(P;R^n) and f is a continuous, convex function. Here is a more pedestrian approach to this result.

(i) Following Tanaka's procedure, prove that f ∘ Z ∈ S(P;R) whenever Z ∈ S(P;R)^n and f ∈ C¹(R^n;R) is convex. That is, if M_i denotes the martingale part of Z_i, show that

\[
f\bigl(Z(t)\bigr) - \sum_{i=1}^n \int_0^t \partial_i f\bigl(Z(\tau)\bigr)\,dM_i(\tau)
\]
is the bounded variation part of f ∘ Z.

Hint: Begin by showing that when A(t) ≡ (⟨Z_i,Z_j⟩(t))_{1≤i,j≤n}, A(t) − A(s) is P-almost surely non-negative definite for all 0 ≤ s ≤ t, and conclude that if f ∈ C²(R^n;R) is convex then

\[
\sum_{i,j=1}^n \int_0^t \partial_i\partial_j f\bigl(Z(\tau)\bigr)\,\langle Z_i,Z_j\rangle(d\tau)
\]
is P-almost surely non-decreasing.

(ii) By taking advantage of more refined properties of convex functions, see if you can prove that f ∘ Z ∈ S(P;R) when f is a continuous, convex function.

Exercise 8.1.10. If one goes back to the original way in which we described Stratonovich's integral in terms of Riemann integration, it becomes clear that the only reason why we needed Y to be a semimartingale is that we needed to know that
\[
\sum_{m\le 2^N t} \Delta_m^N Y\,\Delta_m^N X
\]
converges in P-probability to a continuous function of locally bounded variation uniformly for t in compacts.

(i) Let Z = (Z_1,…,Z_n) ∈ S(P;R)^n, and set Y = f ∘ Z, where f ∈ C¹(R^n;R). Show that, for any X ∈ S(P;R),

\[
\sum_{m\le 2^N t} \Delta_m^N Y\,\Delta_m^N X \longrightarrow \sum_{i=1}^n \int_0^t \partial_i f\bigl(Z(\tau)\bigr)\,\langle Z_i,X\rangle(d\tau)
\]
in P-probability uniformly for t in compacts.

(ii) Continuing with the notation in (i), show that

\[
\int_0^t Y(\tau)\circ dX(\tau) \equiv \lim_{N\to\infty}\int_0^t Y^N(\tau)\,dX^N(\tau)
= \int_0^t Y(\tau)\,dX(\tau) + \frac12\sum_{i=1}^n \int_0^t \partial_i f\bigl(Z(\tau)\bigr)\,\langle Z_i,X\rangle(d\tau),
\]
where the convergence is in P-probability uniformly for t in compacts.

(iii) Show that (8.1.6) continues to hold for f ∈ C²(R^n;R) when the integral on the right hand side is interpreted using the extension of Stratonovich integration developed in (ii).

Exercise 8.1.11. Let X, Y ∈ S(P;R). In connection with Remark 8.1.7, it is interesting to examine whether it is sufficient to mollify only X when defining the Stratonovich integral of Y with respect to X.

(i) Show that ∫₀¹ Y([τ]_N) dX^N(τ) tends in P-probability to ∫₀¹ Y(τ) dX(τ).

(ii) Define ψ_N(t) = 1 − 2^N(t − [t]_N), set Z_N(t) = ∫₀ᵗ ψ_N(τ) dY(τ), and show that

\[
\int_0^1 Y^N(\tau)\,dX^N(\tau) - \int_0^1 Y\bigl([\tau]_N\bigr)\,dX^N(\tau) = \sum_{m=0}^{2^N-1} \Delta_m^N X\,\Delta_m^N Z_N.
\]

(iii) Show that

\[
\sum_{m=0}^{2^N-1} \Delta_m^N X\,\Delta_m^N Z_N - \int_0^1 \psi_N(t)\,\langle X,Y\rangle(dt) \longrightarrow 0
\quad\text{in } P\text{-probability}.
\]

(iv) Show that for any Lebesgue integrable function α : [0,1] → R, ∫₀¹ ψ_N(τ)α(τ) dτ tends to ½∫₀¹ α(τ) dτ.

(v) Under the condition that ⟨X,Y⟩(dt) = β(t) dt, where β : [0,∞) × Ω → [0,∞) is a progressively measurable function, use the preceding to see that ∫₀¹ Y(τ) dX^N(τ) tends in P-probability to ∫₀¹ Y(τ)∘dX(τ).

Exercise 8.1.12. One place where Stratonovich's theory really comes into its own is when it comes to computing the determinant of the solution to a linear stochastic differential equation. Namely, suppose that A = ((A_ij))_{1≤i,j≤n} ∈ S(P; Hom(R^n;R^n)) (i.e., A_ij ∈ S(P;R) for each 1 ≤ i,j ≤ n), and assume that X ∈ S(P; Hom(R^n;R^n)) satisfies dX(t) = X(t)∘dA(t) in the sense that

\[
dX_{ij}(t) = \sum_{k=1}^n X_{ik}(t)\circ dA_{kj}(t) \quad\text{for all } 1\le i,j\le n.
\]

Show that
\[
\det X(t) = \det X(0)\,e^{\mathrm{Trace}\,(A(t)-A(0))}.
\]
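When A is smooth and deterministic, the Stratonovich equation in Exercise 8.1.12 reduces to the classical ODE Ẋ = XȦ, and the determinant identity is just Liouville's formula, which is easy to confirm numerically. The sketch below (plain Python; the particular matrix path A is our own choice, made only for illustration) integrates the 2 × 2 matrix ODE with a fourth-order Runge–Kutta step and checks det X(1) against det X(0) e^{Trace(A(1)−A(0))}.

```python
import math

def A(t):
    # a smooth, non-commuting 2x2 matrix path, chosen only for illustration
    return [[math.sin(t), t], [t * t, 1.0 - math.cos(t)]]

def Adot(t):
    return [[math.cos(t), 1.0], [2.0 * t, math.sin(t)]]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def axpy(X, Y, c):          # X + c*Y, entrywise
    return [[X[i][j] + c * Y[i][j] for j in range(2)] for i in range(2)]

def det(X):
    return X[0][0] * X[1][1] - X[0][1] * X[1][0]

def rk4_step(X, t, h):
    f = lambda s, Z: matmul(Z, Adot(s))   # dX/dt = X A'(t)
    k1 = f(t, X)
    k2 = f(t + h / 2, axpy(X, k1, h / 2))
    k3 = f(t + h / 2, axpy(X, k2, h / 2))
    k4 = f(t + h, axpy(X, k3, h))
    s = [[k1[i][j] + 2 * k2[i][j] + 2 * k3[i][j] + k4[i][j] for j in range(2)] for i in range(2)]
    return axpy(X, s, h / 6)

X = [[2.0, 1.0], [0.0, 1.0]]             # X(0), det = 2
det0, h, t = det(X), 1e-3, 0.0
for _ in range(1000):                    # integrate up to t = 1
    X = rk4_step(X, t, h)
    t += h

trace_diff = sum(A(1.0)[i][i] - A(0.0)[i][i] for i in range(2))
lhs, rhs = det(X), det0 * math.exp(trace_diff)
assert abs(lhs - rhs) < 1e-6 * abs(rhs)
```

The same identity holds path by path for semimartingale A, precisely because Stratonovich calculus obeys the classical chain rule that Jacobi's formula for d det X relies on.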

8.2 Stratonovich Stochastic Differential Equations

Seeing as every Stratonovich integral can be converted into an Itô integral, it might seem unnecessary to develop a separate theory of Stratonovich stochastic differential equations. On the other hand, the replacement of a Stratonovich equation by the equivalent Itô equation removes the advantages, especially the coordinate invariance, of Stratonovich's theory. With this in mind, we devote the present section to a non-Itô analysis of Stratonovich stochastic differential equations.

Let P⁰ denote the standard Wiener measure for r-dimensional paths p = (p_1,…,p_r) ∈ C([0,∞);R^r). In order to emphasize the coordinate invariance of the theory, we will write our Stratonovich stochastic differential equations in the form
\[
(8.2.1)\qquad dX(t,x,p) = V_0\bigl(X(t,x,p)\bigr)\,dt + \sum_{k=1}^r V_k\bigl(X(t,x,p)\bigr)\circ dp_k(t)
\quad\text{with } X(0,x,p) = x,
\]

where, for each 0 ≤ k ≤ r, V_k : R^n → R^n is a smooth function. To see what the equivalent Itô equation is, notice that

\[
\begin{aligned}
\bigl\langle (V_k)_i\bigl(X(\,\cdot\,,x,p)\bigr),\, p_k \bigr\rangle(dt)
&= \sum_{j=1}^n \partial_j (V_k)_i\bigl(X(t,x,p)\bigr)\,\langle X_j(\,\cdot\,,x,p), p_k\rangle(dt)\\
&= \sum_{j=1}^n\sum_{\ell=1}^r (V_\ell)_j\,\partial_j (V_k)_i\bigl(X(t,x,p)\bigr)\,\langle p_\ell,p_k\rangle(dt)
= \bigl(D_{V_k}V_k\bigr)_i\bigl(X(t,x,p)\bigr)\,dt,
\end{aligned}
\]
where D_{V_k} = Σ_{j=1}^n (V_k)_j ∂_j denotes the directional derivative operator determined by V_k. Hence, if we think of each V_k, and therefore X(t,x,p), as a column vector, then the Itô equivalent of (8.2.1) is

\[
(8.2.2)\qquad dX(t,x,p) = \sigma\bigl(X(t,x,p)\bigr)\,dp(t) + b\bigl(X(t,x,p)\bigr)\,dt
\quad\text{with } X(0,x,p) = x,
\]
where σ(x) = (V_1(x),…,V_r(x)) is the n × r matrix whose kth column is V_k(x) and b = V_0 + ½Σ_{k=1}^r D_{V_k}V_k. In particular, if
\[
L = \frac12\sum_{i,j=1}^n a_{ij}\,\partial_i\partial_j + \sum_{i=1}^n b_i\,\partial_i,
\quad\text{where } a = \sigma\sigma^\top,
\]
then L is the operator associated with (8.2.1) in the sense that for any f ∈ C^{1,2}([0,T] × R^n),

\[
\Bigl( f\bigl(t\wedge T, X(t,x,p)\bigr) - \int_0^{t\wedge T} \bigl(\partial_\tau f + Lf\bigr)\bigl(\tau, X(\tau,x,p)\bigr)\,d\tau,\ \mathcal B_t,\ P^0 \Bigr)
\]
is a local martingale. However, a better way to write this operator is directly in terms of the directional derivative operators D_{V_k}. Namely,
\[
(8.2.3)\qquad L = D_{V_0} + \frac12\sum_{k=1}^r \bigl(D_{V_k}\bigr)^2.
\]
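The agreement between (8.2.3) and the expression for L in terms of a = σσᵀ and b = V₀ + ½Σ D_{V_k}V_k can be checked numerically by brute force. The sketch below (plain Python with central finite differences; the fields V₀, V₁, V₂ and the test function f are our own toy choices, not from the text) evaluates both forms of Lf at a single point and compares them.

```python
h = 1e-3
x = (0.7, 1.3)

def V0(x): return (x[1], -x[0])
def V1(x): return (x[0] * x[1], x[1])
def V2(x): return (x[1] ** 2, x[0])
def f(x): return x[0] ** 3 + x[0] * x[1] ** 2

def shift(x, v, s):
    return tuple(xi + s * vi for xi, vi in zip(x, v))

def D(V, g):
    """Directional derivative D_V g as a function, via central differences."""
    return lambda x: (g(shift(x, V(x), h)) - g(shift(x, V(x), -h))) / (2 * h)

# Hoermander form (8.2.3): L = D_{V0} + (1/2) sum_k (D_{Vk})^2
Lf_hormander = D(V0, f)(x) + 0.5 * (D(V1, D(V1, f))(x) + D(V2, D(V2, f))(x))

# classical form: (1/2) sum a_ij d_i d_j f + sum b_i d_i f,
# with a = sigma sigma^T and b = V0 + (1/2) sum_k D_{Vk} Vk
def partial(i, g):
    e = [(1.0, 0.0), (0.0, 1.0)][i]
    return lambda x: (g(shift(x, e, h)) - g(shift(x, e, -h))) / (2 * h)

a = [[sum(Vk(x)[i] * Vk(x)[j] for Vk in (V1, V2)) for j in range(2)] for i in range(2)]
b = [V0(x)[i] + 0.5 * sum((Vk(shift(x, Vk(x), h))[i] - Vk(shift(x, Vk(x), -h))[i]) / (2 * h)
                          for Vk in (V1, V2)) for i in range(2)]
Lf_classical = (0.5 * sum(a[i][j] * partial(i, partial(j, f))(x)
                          for i in range(2) for j in range(2))
                + sum(b[i] * partial(i, f)(x) for i in range(2)))

assert abs(Lf_hormander - Lf_classical) < 1e-3
```

The point of the comparison is that expanding (D_{V_k})² produces both the second order term Σ (V_k)_i(V_k)_j ∂_i∂_j and the first order term (D_{V_k}V_k)·∇, which is exactly the Itô correction appearing in b.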

H¨ormander’s famous paper [10] was the first to demonstrate the advantage of writing a second order elliptic operator in the form given in (8.2.3), and, for this reason, (8.2.3) is often said to be the H¨ormanderform expression for L. The most obvious advantage is the same as the advantage that Stratonovich’s theory has over Itˆo’s: it behaves well under change of variables. To wit, if F : Rn −→ Rn is a diffeomorphism, then

\[
L(\varphi\circ F) = \Bigl( D_{F_*V_0}\varphi + \frac12\sum_{k=1}^r \bigl(D_{F_*V_k}\bigr)^2\varphi \Bigr)\circ F,
\quad\text{where } \bigl(F_*V_k\bigr)_i = D_{V_k}F_i,\ 1\le i\le n,
\]
is the "pushforward" under F of V_k. That is, (D_{F_*V_k}\varphi)∘F = D_{V_k}(\varphi∘F).

8.2.1. Commuting Vector Fields. Until further notice, we will be dealing with vector fields V_0,…,V_r which have two uniformly bounded, continuous derivatives. In particular, these assumptions are more than enough to ensure that equation (8.2.2), and therefore (8.2.1), admits a P⁰-almost surely unique solution. In addition, by Corollary 4.2.6 or Corollary 7.3.6, the P⁰-distribution of the solution is the unique solution P_x^L to the martingale problem for L starting from x. In this section we will take the first in a sequence of steps leading to an alternative (in particular, non-Itô) way of thinking about solutions to (8.2.1).

Given ξ = (ξ_0,…,ξ_r) ∈ R^{r+1}, set V_ξ = Σ_{k=0}^r ξ_k V_k, and determine E(ξ,x) for x ∈ R^n so that E(0,x) = x and
\[
\frac{d}{dt}\,E(t\xi,x) = V_\xi\bigl(E(t\xi,x)\bigr).
\]
From elementary facts (cf. §4 of Chapter 2 in [4]) about ordinary differential equations, we know that (ξ,x) ∈ R^{r+1} × R^n ↦ E(ξ,x) ∈ R^n is a twice continuously differentiable function which satisfies estimates of the form

\[
(8.2.4)\qquad
\bigl|E(\xi,x)-x\bigr| \le C|\xi|,\qquad
\bigl|\partial_{\xi_k}E(\xi,x)\bigr|\vee\bigl|\partial_{x_i}E(\xi,x)\bigr| \le Ce^{\nu|\xi|},
\]
\[
\bigl|\partial_{\xi_k}\partial_{\xi_\ell}E(\xi,x)\bigr|\vee\bigl|\partial_{\xi_k}\partial_{x_i}E(\xi,x)\bigr|\vee\bigl|\partial_{x_i}\partial_{x_j}E(\xi,x)\bigr| \le Ce^{\nu|\xi|},
\]
for some C < ∞ and ν ∈ [0,∞). Finally, define

\[
(8.2.5)\qquad \mathbf V_k(\xi,x) = \partial_{\xi_k}E(\xi,x) \quad\text{for } 0\le k\le r \text{ and } (\xi,x)\in\mathbb R^{r+1}\times\mathbb R^n.
\]

For continuously differentiable, R^n-valued functions V and W on R^n, we will use the notation [V,W] to denote the R^n-valued function which is determined so that D_{[V,W]} is equal to the commutator [D_V, D_W] = D_V ∘ D_W − D_W ∘ D_V of D_V and D_W. That is, [V,W] = D_V W − D_W V.

8.2.6 Lemma. 𝐕_k(ξ,x) = V_k(E(ξ,x)) for all 0 ≤ k ≤ r and (ξ,x) ∈ R^{r+1} × R^n if and only if [V_k,V_ℓ] ≡ 0 for all 0 ≤ k, ℓ ≤ r.

Proof: First assume that 𝐕_k(ξ,x) = V_k(E(ξ,x)) for all k and (ξ,x). Then, if e_ℓ is the element of R^{r+1} whose ℓth coordinate is 1 and whose other coordinates are 0, we see that
\[
\frac{d}{dt}\,E(\xi+te_\ell,x) = \mathbf V_\ell(\xi+te_\ell,x) = V_\ell\bigl(E(\xi+te_\ell,x)\bigr),
\]
and so, by uniqueness, E(ξ + te_ℓ, x) = E(te_ℓ, E(ξ,x)). In particular, this leads first to
\[
E\bigl(te_\ell,E(se_k,x)\bigr) = E(te_\ell+se_k,x) = E\bigl(se_k,E(te_\ell,x)\bigr),
\]
and thence to
\[
D_{V_k}V_\ell(x) = \frac{\partial^2}{\partial s\,\partial t}\bigg|_{s=t=0} E\bigl(te_\ell,E(se_k,x)\bigr)
= \frac{\partial^2}{\partial t\,\partial s}\bigg|_{s=t=0} E\bigl(se_k,E(te_\ell,x)\bigr) = D_{V_\ell}V_k(x).
\]
To go the other direction, first observe that 𝐕_k(ξ,x) = V_k(E(ξ,x)) is implied by E(ξ + te_k, x) = E(te_k, E(ξ,x)). Thus, it suffices to show that E(ξ + η, x) = E(η, E(ξ,x)) follows from [V_ξ,V_η] ≡ 0. Second, observe that E(ξ + η, x) = E(η, E(ξ,x)) is implied by E(ξ, E(η,x)) = E(η, E(ξ,x)). Indeed, if the second of these holds and F(t) ≡ E(tξ, E(tη,x)), then Ḟ(t) = (V_ξ + V_η)(F(t)), and so, by uniqueness, F(t) = E(t(ξ+η), x). In other words, all that remains is to show that E(ξ, E(η,x)) = E(η, E(ξ,x)) follows from [V_ξ,V_η] ≡ 0. To this end, set F(t) = E(ξ, E(tη,x)), and note that Ḟ(t) = (E(ξ,·)_*V_η)(E(tη,x)). Hence, by uniqueness, we will know that E(ξ, E(tη,x)) = F(t) = E(tη, E(ξ,x)) once we show that E(ξ,·)_*V_η = V_η. But
\[
\frac{d}{ds}\,E(s\xi,\,\cdot\,)_*^{-1} V_\eta\bigl(E(s\xi,\,\cdot\,)\bigr) = \bigl[V_\xi,V_\eta\bigr]\bigl(E(s\xi,\,\cdot\,)\bigr),
\]
and so we are done. □
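The content of Lemma 8.2.6 is easy to watch in action numerically. In the sketch below (plain Python, RK4 integration of the flows; the fields and the choice r = 2 with V₀ dropped are our own toy simplifications, not the text's) the commuting fields V₁(x) = (x₁, 0) and V₂(x) = (0, x₂) satisfy E(η, E(ξ,x)) = E(ξ+η, x), while replacing V₂ by the non-commuting field W(x) = (0, x₁), for which [V₁,W] = (0, x₁) ≠ 0, visibly destroys the identity.

```python
def flow(V, x, steps=2000):
    """Time-one flow of the field V from x via RK4, i.e. E(xi, x) when V = V_xi."""
    h = 1.0 / steps
    for _ in range(steps):
        k1 = V(x)
        k2 = V([x[i] + h / 2 * k1[i] for i in range(2)])
        k3 = V([x[i] + h / 2 * k2[i] for i in range(2)])
        k4 = V([x[i] + h * k3[i] for i in range(2)])
        x = [x[i] + h / 6 * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i]) for i in range(2)]
    return x

V1 = lambda x: [x[0], 0.0]
V2 = lambda x: [0.0, x[1]]     # [V1, V2] = 0
W  = lambda x: [0.0, x[0]]     # [V1, W] = (0, x[0]) != 0

def Vxi(xi, A, B):             # the field xi_1 A + xi_2 B
    return lambda x: [xi[0] * A(x)[i] + xi[1] * B(x)[i] for i in range(2)]

x, xi, eta = [1.0, 1.0], (0.3, 0.5), (0.4, -0.2)

# commuting fields: E(eta, E(xi, x)) = E(xi + eta, x)
lhs = flow(Vxi(eta, V1, V2), flow(Vxi(xi, V1, V2), x))
rhs = flow(Vxi((xi[0] + eta[0], xi[1] + eta[1]), V1, V2), x)
assert max(abs(lhs[i] - rhs[i]) for i in range(2)) < 1e-9

# non-commuting pair: the identity fails
lhs = flow(Vxi(eta, V1, W), flow(Vxi(xi, V1, W), x))
rhs = flow(Vxi((xi[0] + eta[0], xi[1] + eta[1]), V1, W), x)
assert max(abs(lhs[i] - rhs[i]) for i in range(2)) > 1e-3
```

In the commuting case one can check by hand that E(ξ,x) = (x₁e^{ξ₁}, x₂e^{ξ₂}), so the first identity is exact; in the second case the two sides differ already in their second coordinates.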

8.2.7 Theorem. Assume that the vector fields V_k commute. Then the one and only solution to (8.2.1) is (t,p) ↦ X(t,x,p) ≡ E((t,p(t)), x).

Proof: Let X(·,x,p) be given as in the statement. By Itô's formula,
\[
dX(t,x,p) = \mathbf V_0\bigl((t,p(t)),x\bigr)\,dt + \sum_{k=1}^r \mathbf V_k\bigl((t,p(t)),x\bigr)\circ dp_k(t),
\]
and so, by Lemma 8.2.6, X(·,x,p) is a solution. As for uniqueness, simply re-write (8.2.1) in its Itô equivalent form, and conclude, from Theorem 5.2.2, that there can be at most one solution to (8.2.1). □

Remark 8.2.8. A significant consequence of Theorem 8.2.7 is that the solution to (8.2.1) is a smooth function of p when the V_k's commute. In fact, for such vector fields, p ↦ X(·,x,p) is the unique continuous extension to C([0,∞);R^r) of the solution to the ordinary differential equation
\[
\dot X(t,x,p) = V_0\bigl(X(t,x,p)\bigr) + \sum_{k=1}^r V_k\bigl(X(t,x,p)\bigr)\,\dot p_k(t)
\]
for p ∈ C¹([0,∞);R^r). This fact should be compared to the examples given in §3.3.

8.2.2. General Vector Fields. Obviously, the preceding is simply not going to work when the vector fields do not commute. On the other hand, it indicates how to proceed. Namely, the commuting case plays in Stratonovich's theory the role that the constant coefficient case plays in Itô's. In other words, we should suspect that the commuting case is correct locally and that the general case should be handled by perturbation. We continue with the assumption that the V_k's are smooth and have two bounded continuous derivatives.

With the preceding in mind, for each N ≥ 0, set X^N(0,x,p) = x and

\[
(8.2.9)\qquad X^N(t,x,p) = E\bigl(\Delta^N(t,p),\,X^N([t]_N,x,p)\bigr),
\quad\text{where } \Delta^N(t,p) \equiv \bigl(t-[t]_N,\;p(t)-p([t]_N)\bigr).
\]
Equivalently (cf. (8.2.5)):

\[
dX^N(t,x,p) = \mathbf V_0\bigl(\Delta^N(t,p),X^N([t]_N,x,p)\bigr)\,dt
+ \sum_{k=1}^r \mathbf V_k\bigl(\Delta^N(t,p),X^N([t]_N,x,p)\bigr)\circ dp_k(t).
\]
Note that, by Lemma 8.2.6, X^N(t,x,p) = E((t,p(t)), x) for each N ≥ 0 when the V_k's commute. In order to prove that {X^N(·,x,p) : N ≥ 0} is P⁰-almost surely convergent even when the V_k's do not commute, we proceed as follows. Set D^N(t,x,p) ≡ X(t,x,p) − X^N(t,x,p) and W_k(ξ,x) ≡ V_k(E(ξ,x)) − 𝐕_k(ξ,x), and note that D^N(0,x,p) = 0 and

\[
(8.2.10)\qquad
\begin{aligned}
dD^N(t,x,p) ={}& \Bigl(V_0\bigl(X(t,x,p)\bigr) - V_0\bigl(X^N(t,x,p)\bigr)\Bigr)\,dt\\
&+ \sum_{k=1}^r \Bigl(V_k\bigl(X(t,x,p)\bigr) - V_k\bigl(X^N(t,x,p)\bigr)\Bigr)\circ dp_k(t)\\
&+ W_0\bigl(\Delta^N(t,p),X^N([t]_N,x,p)\bigr)\,dt\\
&+ \sum_{k=1}^r W_k\bigl(\Delta^N(t,p),X^N([t]_N,x,p)\bigr)\circ dp_k(t).
\end{aligned}
\]

8.2.11 Lemma. There exist a C < ∞ and ν > 0 such that
\[
\Bigl|\mathbf V_k\bigl(\eta,E(\xi,x)\bigr) - \mathbf V_k(\xi+\eta,x) - \tfrac12\bigl[V_\xi,V_k\bigr](x)\Bigr|
\le C\bigl(|\xi|+|\eta|\bigr)^2 e^{\nu(|\xi|+|\eta|)}
\]
for all ξ, η ∈ R^{r+1}, x ∈ R^n, and 0 ≤ k ≤ r. In particular, if
\[
\widetilde W_k(\xi,x) \equiv W_k(\xi,x) - \frac12\sum_{\ell\ne k} \xi_\ell\,\bigl[V_\ell,V_k\bigr](x),
\]
then W̃_k(0,x) = 0 and |∂_ξ W̃_k(ξ,x)| ≤ C|ξ|e^{ν|ξ|} for all (ξ,x) ∈ R^{r+1} × R^n and some C < ∞ and ν > 0.

Proof: In view of the estimates in (8.2.4), it suffices for us to check the first statement. To this end, observe that
\[
E\bigl(\eta,E(\xi,x)\bigr) = E(\xi,x) + V_\eta\bigl(E(\xi,x)\bigr) + \tfrac12 D_{V_\eta}V_\eta\bigl(E(\xi,x)\bigr) + R_1(\xi,\eta,x),
\]
where R_1(ξ,η,x) is the remainder term in the second order Taylor expansion of η ↦ E(η, E(ξ,x)) around 0 and is therefore (cf. (8.2.4)) dominated by a constant C_1 < ∞ times |η|³e^{ν|η|}. Similarly,
\[
\begin{aligned}
E(\xi,x) &= x + V_\xi(x) + \tfrac12 D_{V_\xi}V_\xi(x) + R_2(\xi,x),\\
V_\eta\bigl(E(\xi,x)\bigr) &= V_\eta(x) + D_{V_\xi}V_\eta(x) + R_3(\xi,\eta,x),\\
D_{V_\eta}V_\eta\bigl(E(\xi,x)\bigr) &= D_{V_\eta}V_\eta(x) + R_4(\xi,\eta,x),\\
E(\xi+\eta,x) &= x + V_{\xi+\eta}(x) + \tfrac12 D_{V_{\xi+\eta}}V_{\xi+\eta}(x) + R_2(\xi+\eta,x),
\end{aligned}
\]
where
\[
|R_2(\xi,x)| \le C_2|\xi|^3 e^{\nu|\xi|},\qquad
|R_3(\xi,\eta,x)| \le C_3|\xi|^2|\eta|\,e^{\nu|\xi|},\qquad
|R_4(\xi,\eta,x)| \le C_4|\xi||\eta|^2 e^{\nu|\xi|}.
\]
Hence
\[
E\bigl(\eta,E(\xi,x)\bigr) - E(\xi+\eta,x) = \tfrac12\bigl[V_\xi,V_\eta\bigr](x) + R_5(\xi,\eta,x),
\]
where |R_5(ξ,η,x)| ≤ C_5(|ξ|+|η|)³e^{ν(|ξ|+|η|)}, and so the required estimate follows. □

Returning to (8.2.10), we now write
\[
\begin{aligned}
dD^N(t,x,p) ={}& W_0\bigl(\Delta^N(t,p),X^N([t]_N,x,p)\bigr)\,dt\\
&+ \frac12\sum_{1\le k\ne\ell\le r} \bigl[V_\ell,V_k\bigr]\bigl(X^N([t]_N,x,p)\bigr)\,\Delta_\ell^N(t,p)\,dp_k(t)\\
&+ \sum_{k=1}^r \widetilde W_k\bigl(\Delta^N(t,p),X^N([t]_N,x,p)\bigr)\circ dp_k(t)\\
&+ \Bigl(V_0\bigl(X(t,x,p)\bigr) - V_0\bigl(X^N(t,x,p)\bigr)\Bigr)\,dt\\
&+ \sum_{k=1}^r \Bigl(V_k\bigl(X(t,x,p)\bigr) - V_k\bigl(X^N(t,x,p)\bigr)\Bigr)\circ dp_k(t),
\end{aligned}
\]

where, for 1 ≤ ℓ ≤ r, we have used Δ_ℓ^N(t,p) = p_ℓ(t) − p_ℓ([t]_N) to denote the ℓth coordinate of Δ^N(t,p). Also, it is important to observe that, because ⟨p_k,p_ℓ⟩ ≡ 0 for ℓ ≠ k, we were able to replace the Stratonovich stochastic integral Δ_ℓ^N(t,p)∘dp_k(t) by the Itô stochastic integral Δ_ℓ^N(t,p) dp_k(t). In fact, it is this replacement which makes what follows possible. To proceed, first use an obvious Gaussian computation to see that there exists for each q ∈ [1,∞) an A_q < ∞ such that

\[
E^{P^0}\Bigl[\bigl|\Delta^N(t,p)\bigr|^{2q} e^{2q\nu|\Delta^N(t,p)|}\Bigr]^{\frac1q} \le A_q\bigl(t-[t]_N\bigr).
\]

Hence, there exist constants Cq < ∞ such that

1 " Z · 2q # q 0 N N  2 −N EP W0 ∆ (t, p),X ([t]N , x, p) dt ≤ CqT 2 0 [0,T ] 1 " · 2q # q 0 Z P   N  N −N E Vk,V` X ([t]N , x, p) ∆` (t, p) dpk(t) ≤ CqT 2 0 [0,T ] 1 " · 2q # q 0 Z P ˜ N N  2 −N E Wk ∆ (t, p),X (t, x, p) ◦ dpk(t) ≤ Cq T + T 2 . 0 [0,T ]

In the derivation of the last two of these, we have used Burkholder's Inequality (cf. Exercises 5.1.27 or 7.2.15). Of course, in the case of the last, we had to first convert the Stratonovich integral to its Itô equivalent and then had to also apply the final part of Lemma 8.2.11. Similarly, we obtain

1 " Z · 2q # q 0   N  EP Vk X(t, x, p) − Vk X (t, x, p) ◦ dpk(t) 0 [0,T ] T 1 Z 0 h 2q i q P N ≤ Cq(1 + T ) E X( · , x, p) − X ( · , x, p) [0,T ] dt. 0 After combining the preceding with Gronwall’s inequality, we arrive at the goal toward which we have been working. 8.2.12 Theorem. For each T > 0 and q ∈ [1, ∞) there exists a C(T, q) < ∞ such that

\[
E^{P^0}\Bigl[\bigl\|X(\,\cdot\,,x,p) - X^N(\,\cdot\,,x,p)\bigr\|_{[0,t]}^{2q}\Bigr]^{\frac1q} \le C(T,q)\,t\,2^{-N}
\quad\text{if } t\in[0,T].
\]

8.2.3. Another Interpretation. In order to exploit the result in Theorem 8.2.12, it is best to first derive a simple corollary of it, a corollary which contains an important result which was proved, in a somewhat different form and by entirely different methods, originally by Wong and Zakai [42]. Namely, when p is a locally absolutely continuous element of C([0,∞);R^r), then it is easy to check that, for each T > 0,
\[
\lim_{N\to\infty}\bigl\|X^N(\,\cdot\,,x,p) - X(\,\cdot\,,x,p)\bigr\|_{[0,T]} = 0,
\]
where here X(·,x,p) is the unique locally absolutely continuous function such that
\[
(8.2.13)\qquad X(t,x,p) = x + \int_0^t \Bigl( V_0\bigl(X(\tau,x,p)\bigr) + \sum_{k=1}^r V_k\bigl(X(\tau,x,p)\bigr)\,\dot p_k(\tau) \Bigr)\,d\tau.
\]

8.2.14 Theorem. For each N ≥ 0 and p ∈ C([0,∞);R^r), let p^N ∈ C([0,∞);R^r) be determined so that p^N(m2^{-N}) = p(m2^{-N}) and p^N ↾ I_{m,N} ≡ [m2^{-N}, (m+1)2^{-N}] is linear for each m ≥ 0. Then, for all T > 0 and ε > 0,
\[
\lim_{N\to\infty} P^0\Bigl(\bigl\|X(\,\cdot\,,x,p) - X(\,\cdot\,,x,p^N)\bigr\|_{[0,T]} \ge \epsilon\Bigr) = 0,
\]
where X(·,x,p^N) is the unique, absolutely continuous solution to (8.2.13).

Proof: The key to this result is the observation that X^N(m2^{-N},x,p) = X(m2^{-N},x,p^N) for all m ≥ 0. Indeed, for each m ≥ 0, both the maps
\[
t \in \bigl[m2^{-N},(m+1)2^{-N}\bigr] \longmapsto X(t,x,p^N)
\]
and
\[
t \in \bigl[m2^{-N},(m+1)2^{-N}\bigr] \longmapsto
E\Bigl( 2^N\bigl(t - m2^{-N}\bigr)\bigl(2^{-N},\,p\bigl((m+1)2^{-N}\bigr)-p\bigl(m2^{-N}\bigr)\bigr),\ X^N\bigl(m2^{-N},x,p\bigr)\Bigr) \in \mathbb R^n
\]
are solutions to the same ordinary differential equation. Thus, by induction on m ≥ 0, the asserted equality follows from the standard uniqueness theory for the solutions to such equations.

In view of the preceding and the result in Theorem 8.2.12, it remains only to prove that
\[
\lim_{N\to\infty} P^0\Bigl( \sup_{t\in[0,T]} \bigl|X(t,x,p^N) - X([t]_N,x,p^N)\bigr| \vee \bigl|X^N(t,x,p) - X^N([t]_N,x,p)\bigr| \ge \epsilon \Bigr) = 0
\]
for all T ∈ (0,∞) and ε > 0. But
\[
\bigl|X(t,x,p^N) - X([t]_N,x,p^N)\bigr| \le C\Bigl(2^{-N} \vee \bigl|p([t]_N + 2^{-N}) - p([t]_N)\bigr|\Bigr)
\]
and
\[
\bigl|X^N(t,x,p) - X^N([t]_N,x,p)\bigr| \le Ce^{\nu\|p\|_{[0,T]}}\Bigl(2^{-N}\vee\bigl|p(t)-p([t]_N)\bigr|\Bigr),
\]
and so there is nothing more to do. □
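The solvable case dX = X∘dp₁(t) (n = r = 1, V₀ = 0, V₁(x) = x; note that this linear field is not bounded, so it falls outside the standing assumptions, but the computation is instructive) makes the content of Theorem 8.2.14 concrete. On each interval where p^N is linear the equation is a genuine ODE whose solution multiplies X by exp(increment of p), so X(·,x,p^N) at dyadic times equals x·exp(p(m2^{-N})) exactly, and the polygonal approximations therefore converge to the Stratonovich solution x e^{p(t)} rather than to the Itô solution x e^{p(t)−t/2} of dX = X dp. A sketch (plain Python, one simulated Brownian path; all names are our own):

```python
import math
import random

random.seed(1)
N, T, x0 = 10, 1.0, 1.0
n = 2 ** N
dt = T / n

# Brownian increments over the dyadic grid of mesh 2^-N
dp = [random.gauss(0.0, dt ** 0.5) for _ in range(n)]
p_T = sum(dp)

# polygonal approximation: on each dyadic interval dX = X * (slope of p^N) dt,
# an honest ODE whose exact solution multiplies X by exp(increment of p)
X = x0
for inc in dp:
    X *= math.exp(inc)

strat = x0 * math.exp(p_T)            # Stratonovich solution x e^{p(T)}
ito = x0 * math.exp(p_T - T / 2)      # Ito solution of dX = X dp carries extra drift

assert abs(X - strat) < 1e-9 * strat  # the polygonal scheme lands on the Stratonovich solution
assert abs(X - ito) > 0.1 * ito       # and visibly not on the Ito one
```

The first comparison is exact path by path, since the factors exp(inc) telescope; the gap to the Itô solution is the deterministic factor e^{T/2}, which is precisely the Wong–Zakai correction in this example.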

An important dividend of the preceding is best expressed in terms of the support in C([0,∞);R^n) of the solution P_x^L to the martingale problem for L (cf. (8.2.3)) starting from x. Namely, take (cf. Exercise 8.2.16 below for more information)

\[
\mathcal S(x;V_0,\dots,V_r) = \bigl\{ X(\,\cdot\,,x,p) : p \in C^1\bigl([0,\infty);\mathbb R^r\bigr) \bigr\}.
\]

8.2.15 Corollary. Let L be the operator described by (8.2.3), and (cf. Corollary 4.2.6) let P_x^L be the unique solution to the martingale problem for L starting at x. Then the support of P_x^L in C([0,∞);R^n) is contained in the closure there of the set S(x;V_0,…,V_r).

Proof: First observe that P_x^L is the P⁰-distribution of p ↦ X(·,x,p). Thus, by Theorem 8.2.14, we know that

L  0 N  Px S(x; V0,...,Vr) ≥ lim P {p : X( · , x, p ) ∈ S(x; V0,...,Vr)} . N→∞

r N But, for each n ∈ N and p ∈ C [0, ∞); R , it is easy to construct {p :  > ∞ r N 0} ⊆ C [0, ∞); R so that p (0) = p(0) and Z T N N 2 lim p˙ (t) − p˙ (t) dt = 0 &0 0

N N and to show that lim&0 kX( · , x, p ) − X( · , x, p )k[0,T ] = 0 for all T > 0. N n  Hence, X( · , x, p ) ∈ S(x; V0,...,Vr) for all n ∈ N and p ∈ C [0, ∞); R ) . 

8.2.4. Exercises.

Exercise 8.2.16. Except when span{V_1(x),…,V_r(x)} = R^n for all x ∈ R^n, in which case S̄(x; V_0,…,V_r) is the set of p ∈ C([0,∞); R^n) with p(0) = x, it is a little difficult to get a feeling for which paths are and which paths are not contained in S̄(x; V_0,…,V_r). In this exercise we hope to give at least some insight into this question. For this purpose, it will be helpful to introduce the space V(V_0,…,V_r) of bounded V ∈ C^1(R^n; R^n) with the property that for all α ∈ R and x ∈ R^n the integral curve of V_0 + αV starting at x is an element of S̄(x; V_0,…,V_r). Obviously, V_k ∈ V(V_0,…,V_r) for each 1 ≤ k ≤ r.

(i) Given V, W ∈ V(V_0,…,V_r) and (T, x) ∈ (0,∞) × R^n, determine X ∈ C([0,∞); R^n) so that

    X(t) = x + ∫_0^t V(X(τ)) dτ  if t ∈ [0,T]
    X(t) = X(T) + ∫_T^t W(X(τ)) dτ  if t > T,

and show that X ∈ S̄(x; V_0,…,V_r).

(ii) Show that if V, W ∈ V(V_0,…,V_r) and ϕ, ψ ∈ C_b^1(R^n; R), then ϕV + ψW ∈ V(V_0,…,V_r).

Hint: Define X : R^3 × R^n → R^n so that X((0,0,0), x) = x and

    (d/dt) X( t(α,β,γ), x ) = (αV_0 + βV + γW)( X( t(α,β,γ), x ) ).

Given N ∈ N, define Y_N : [0,∞) × R^n → R^n so that Y_N(t, x) equals

    X( (t, 2ϕ(x)t, 0), x )                              for 0 ≤ t ≤ 2^{−N−1},
    X( (t − 2^{−N−1})(1, 0, 2ψ(x)), Y_N(2^{−N−1}, x) )  for 2^{−N−1} ≤ t ≤ 2^{−N},
    Y_N( t − [t]_N, Y_N([t]_N, x) )                     for t ≥ 2^{−N}.

Show that Y_N(·, x) → Y(·, x) in C([0,∞); R^n), where Y(·, x) is the integral curve of V_0 + ϕV + ψW starting at x.

(iii) If V, W ∈ V(V_0,…,V_r) have two bounded continuous derivatives, show that [V, W] ∈ V(V_0,…,V_r).

Hint: Use the notation in the preceding. For N ∈ N, define Y_N : [0,∞) × R^n → R^n so that Y_N(t, x) is equal to

    X( (t, 2^{N/2+2} t, 0), x )                                               for 0 ≤ t ≤ 2^{−N−2},
    X( (t − 2^{−N−2}, 0, 2^{N/2+2}(t − 2^{−N−2})), Y_N(2^{−N−2}, x) )         for 2^{−N−2} ≤ t ≤ 2^{−N−1},
    X( (t − 2^{−N−1}, −2^{N/2+2}(t − 2^{−N−1}), 0), Y_N(2^{−N−1}, x) )        for 2^{−N−1} ≤ t ≤ 3·2^{−N−2},
    X( (t − 3·2^{−N−2}, 0, −2^{N/2+2}(t − 3·2^{−N−2})), Y_N(3·2^{−N−2}, x) )  for 3·2^{−N−2} ≤ t ≤ 2^{−N},
    Y_N( t − [t]_N, Y_N([t]_N, x) )                                           for t ≥ 2^{−N}.

Show that Y_N(·, x) → Y(·, x) in C([0,∞); R^n), where Y(·, x) is the integral curve of V_0 + [V, W] starting at x.

(iv) Suppose that M is a closed submanifold of R^n and that, for each x ∈ M, {V_0(x),…,V_r(x)} is a subset of the tangent space T_xM to M at x. Show that, for each x ∈ M, S̄(x; V_0,…,V_r) ⊆ C([0,∞); M) and therefore that P_x^L( ∀t ∈ [0,∞) p(t) ∈ M ) = 1. Moreover, if {V_1,…,V_r} ⊆ C^∞(R^n; R^n) and Lie(V_1,…,V_r) is the smallest Lie algebra of vector fields on R^n containing {V_1,…,V_r}, show that

    S̄(x; V_0,…,V_r) = { p ∈ C([0,∞); M) : p(0) = x }

if M ∋ x is a submanifold of R^n with the property that

    V_0(y) ∈ T_yM = { V(y) : V ∈ Lie(V_1,…,V_r) }  for all y ∈ M.
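The flow composition in the hint for part (iii) can be seen in miniature with a hypothetical finite-dimensional check (not from the text): following V for time s, then W, then −V, then −W moves a point by s²[V,W](x) + o(s²). Here V(x) = (1, 0) and W(x) = (0, x_1) are chosen because their flows are explicit and their bracket is [V,W](x) = DW(x)V(x) − DV(x)W(x) = (0, 1).

```python
import numpy as np

def flow_V(s, x):
    """Time-s integral curve of V(x) = (1, 0): pure translation."""
    return np.array([x[0] + s, x[1]])

def flow_W(s, x):
    """Time-s integral curve of W(x) = (0, x_1): shear in the second coordinate."""
    return np.array([x[0], x[1] + s * x[0]])

def commutator_loop(s, x):
    """Composite flow exp(-sW) exp(-sV) exp(sW) exp(sV) applied to x."""
    y = flow_V(s, x)
    y = flow_W(s, y)
    y = flow_V(-s, y)
    return flow_W(-s, y)

x = np.array([0.7, -0.3])
s = 1e-3
displacement = (commutator_loop(s, x) - x) / s**2   # should recover [V,W](x) = (0, 1)
```

For these particular linear fields the o(s²) error vanishes identically, so the quotient reproduces the bracket exactly; for general fields it would only do so in the limit s → 0.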

8.3 The Support Theorem

Corollary 8.2.15 is the easier half of the following result, which characterizes the support of the measure P_x^L. In its statement, and elsewhere, we use the notation (cf. Theorem 8.2.14) ‖q‖_{M,I} ≡ ‖q̇^M‖_{L²(I;R^r)} for M ∈ N, q ∈ C([0,∞); R^r), and closed intervals I ⊆ [0,∞).

8.3.1 Theorem. The support of P_x^L in C([0,∞); R^n) is the closure there of (cf. Corollary 8.2.15) S(x; V_0,…,V_r). In fact, if, for each g ∈ C^1([0,∞); R^r), X(·, x, g) is the solution to (8.2.13) with p = g, then

(8.3.2)    lim_{M→∞} lim_{δ↘0} P^0( ‖X(·, x, p) − X(·, x, g)‖_{[0,T]} < ε | ‖p − g‖_{M,[0,T]} ≤ δ ) = 1

for all T ∈ (0,∞) and ε > 0.³

Notice that, because we already know supp(P_x^L) ⊆ S̄(x; V_0,…,V_r), the first assertion will follow as soon as we prove (8.3.2). Indeed, since

    p(0) = 0 ⟹ ‖p − g‖²_{M,[0,T]} ≤ 2^{M+1} Σ_{m=1}^{[2^M T]+1} |p(m2^{−M}) − g(m2^{−M})|²

and, for any ℓ ∈ Z^+, the P^0-distribution of

    p ∈ C([0,∞); R^r) ↦ ( p(2^{−M}),…,p(ℓ2^{−M}) ) ∈ (R^r)^ℓ

has a smooth, strictly positive density, we know that

(8.3.3)    P^0( ‖p − g‖_{M,[0,T]} ≤ δ ) > 0  for all δ > 0.
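The claim that the dyadic marginals have a smooth, strictly positive density comes down to the strict positive-definiteness of the Brownian covariance matrix E^{P^0}[p(u)p(v)] = u ∧ v, which the following small computation (not from the text, with an arbitrary choice of M and ℓ) verifies in one coordinate.

```python
import numpy as np

# Covariance matrix of (p(2^{-M}), ..., p(l 2^{-M})) for a scalar Brownian
# motion: E[p(u)p(v)] = min(u, v).  Strict positive definiteness of this
# matrix is what gives the vector a smooth, strictly positive density.
M, l = 3, 6
u = np.arange(1, l + 1) * 2.0**-M
cov = np.minimum.outer(u, u)              # entries u_i ∧ u_j
eigenvalues = np.linalg.eigvalsh(cov)
positive_definite = bool(np.all(eigenvalues > 0.0))
```

In fact one can see this without eigenvalues: the determinant of the min-covariance matrix is the product of the successive grid increments, which is positive.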

Hence, (8.3.2) is much more than is needed to conclude that

    P^0( ‖X(·, x, p) − X(·, x, g)‖_{[0,T]} < ε ) > 0  for all ε > 0.

³ This sort of statement was proved for the first time in [39]. However, the conditioning there was somewhat different. Namely, the condition was that max_{1≤i≤r} ‖p_i − g_i‖_{[0,T]} < δ. Ikeda and Watanabe [16] follow the same basic strategy in their treatment, although they introduce an observation which not only greatly simplifies the most unpleasant part of the argument in [39] but also allows them to use the more natural condition ‖p − g‖_{[0,T]} < δ. The strategy of the proof followed below was worked out in [38]. Finally, if all that one cares about is the support characterization, [40] shows that one need not assume that the operator L can be expressed in Hörmander's form.

To begin the proof of (8.3.2), first observe that, by Theorem 8.2.12 and (8.3.3),

    P^0( ‖X(·, x, p) − X(·, x, g)‖_{[0,T]} > ε | ‖p − g‖_{M,[0,T]} ≤ δ )
      ≤ lim_{N→∞} P^0( ‖X^N(·, x, p) − X(·, x, g)‖_{[0,T]} > ε | ‖p − g‖_{M,[0,T]} ≤ δ )
      ≤ sup_{N>M} P^0( ‖X^N(·, x, p) − X^M(·, x, p)‖_{[0,T]} > ε/2 | ‖p − g‖_{M,[0,T]} ≤ δ )
        + P^0( ‖X^M(·, x, p) − X(·, x, g)‖_{[0,T]} > ε/2 | ‖p − g‖_{M,[0,T]} ≤ δ ).

Hence, since

    ‖X^M(·, x, p) − X(·, x, g)‖_{[0,T]}
      ≤ ‖X^M(·, x, p) − X(·, x, p^M)‖_{[0,T]} + ‖X(·, x, p^M) − X(·, x, g^M)‖_{[0,T]} + ‖X(·, x, g^M) − X(·, x, g)‖_{[0,T]}
      ≤ C( 2^{−M} + e^{ν‖p‖_{[0,T]}} sup_{0≤s<t≤T, t−s≤2^{−M}} |p(t) − p(s)| ) + ‖X(·, x, p^M) − X(·, x, g^M)‖_{[0,T]},

and since the middle term here can be made small with ‖p − g‖_{M,[0,T]} by an elementary Gronwall-type estimate, the proof of (8.3.2) reduces to showing that, for each T ∈ (0,∞) and ε > 0,

(8.3.4)    lim_{M→∞} sup_{δ∈(0,1]} P^0( sup_{0≤s<t≤T, t−s≤2^{−M}} |p(t) − p(s)| ≥ ε | ‖p − g‖_{M,[0,T]} ≤ δ ) = 0

and

(8.3.5)    lim_{M→∞} sup_{N>M, δ∈(0,1]} P^0( ‖X^N(·, x, p) − X^M(·, x, p)‖_{[0,T]} ≥ ε | ‖p − g‖_{M,[0,T]} ≤ δ ) = 0.

8.3.1. The Support Theorem, Part I. The key to our proof of both (8.3.4) and (8.3.5) is the following lemma about Wiener measure. As the astute reader will recognize, this lemma is a manifestation of the same property of Gaussian measures on which (6.2.5) is based.

8.3.6 Lemma. For M ∈ N and p ∈ C([0,∞); R^r), set p̃^M = p − p^M. Then σ( {p̃^M(t) : t ≥ 0} ) is P^0-independent of B^M ≡ σ( {p(m2^{−M}) : m ∈ N} ). Hence, if, for some Borel measurable F : C([0,∞); R^r) → [0,∞),

    F̃^M(q) ≡ ∫ F(p̃^M + q) P^0(dp),  q ∈ C([0,∞); R^r),

then F̃^M = E^{P^0}[F | B^M] P^0-almost surely. In particular,

    E^{P^0}[ F(p) | ‖p − h‖_{M,[0,T]} ≤ δ ] = E^{P^0}[ F̃^M(p), ‖p − h‖_{M,[0,T]} ≤ δ ] / P^0( ‖p − h‖_{M,[0,T]} ≤ δ ).

Proof: Clearly, the second part of this lemma is an immediate consequence of the first part. Thus (cf. part (i) in Exercise 4.2.39 in [36]), because all elements of span{p(t) : t ≥ 0} are centered Gaussian random variables, it suffices to prove that E^{P^0}[ p̃^M(s) p(m2^{−M}) ] = 0 for all s ∈ [0,∞) and m ∈ N. Equivalently, what we need to show is that

    E^{P^0}[ p(s) p^M(m2^{−M}) ] = E^{P^0}[ p^M(s) p^M(m2^{−M}) ]  for all s ∈ [0,∞) and m ∈ N.

But, after choosing ℓ ∈ N so that ℓ2^{−M} ≤ s ≤ (ℓ+1)2^{−M} and using the fact that E^{P^0}[p(u)p(v)] = u ∧ v, this becomes an elementary computation. □

Knowing Lemma 8.3.6, one gets (8.3.4) easily. Namely, given M ∈ N and (p, h) ∈ C([0,∞); R^r)², set

    p̃^M_h(t) ≡ p̃^M(t) + h^M(t).

Then, after an application of Lemma 8.3.6, we will have proved (8.3.4) once we show that

    sup_{‖h−g‖_{M,[0,T]}≤1} E^{P^0}[ sup_{0≤s<t≤T} |p̃^M_h(t) − p̃^M_h(s)| / (t−s)^{1/4} ] < ∞.

But |h^M(t) − h^M(s)| ≤ |t − s|^{1/2} ‖h‖_{M,[0,T]},

    sup_{0≤s<t≤T} |p̃^M(t) − p̃^M(s)| / (t−s)^{1/4} ≤ 4 sup_{0≤s<t≤T} |p(t) − p(s)| / (t−s)^{1/4},

and the P^0-integrability of this last quantity is a classical fact about Brownian paths, so (8.3.4) follows.

8.3.2. The Support Theorem, Part II. Turning to (8.3.5), set D^{M,N}(t, x, p) ≡ X^N(t, x, p) − X^M(t, x, p) for N > M. Then, in view of Lemma 8.3.6, (8.3.5) will follow once we show that

(8.3.7)    lim_{M→∞} sup_{N>M, ‖h−g‖_{M,[0,T]}≤1} E^{P^0}[ ‖D^{M,N}(·, x, p̃^M_h)‖²_{[0,T]} ] = 0.

Because

    X^N(t, x, p) = X^N( t − [t]_M, X^N([t]_M, x, p), δ_{[t]_M} p )
    and X^M(t, x, p) = X^M( t − [t]_M, X^M([t]_M, x, p), δ_{[t]_M} p ),

where, for each s ≥ 0, δ_s : C([0,∞); R^r) → C([0,∞); R^r) is defined so that δ_s p(t) = p(s + t) − p(s), we have that D^{M,N}(t, x, p) can be decomposed into

    D^{M,N}( t − [t]_M, X^N([t]_M, x, p), δ_{[t]_M} p ) + E( ∆^M(t, p), X^N([t]_M, x, p) ) − E( ∆^M(t, p), X^M([t]_M, x, p) )
      = D^{M,N}([t]_M, x, p) + D^{M,N}( t − [t]_M, X^N([t]_M, x, p), δ_{[t]_M} p )
        + Ē( ∆^M(t, p), X^N([t]_M, x, p) ) − Ē( ∆^M(t, p), X^M([t]_M, x, p) ),

where Ē(ξ, x) ≡ E(ξ, x) − x. Hence, we can write

(8.3.8)    D^{M,N}(t, x, p) = ∫_0^t W_0^{M,N}(τ, x, p) dτ + Σ_{k=1}^r ∫_0^t W_k^{M,N}(τ, x, p) ∘ dp_k(τ) + D̃^{M,N}(t, x, p),

where W_k^{M,N}(t, x, p) is used to denote

    V_k( ∆^M(t, p), X^N([t]_M, x, p) ) − V_k( ∆^M(t, p), X^M([t]_M, x, p) )

and D̃^{M,N}(t, x, p) is given by

    Σ_{m=0}^{[2^M t]} D^{M,N}( (t − m2^{−M}) ∧ 2^{−M}, X^N(m2^{−M}, x, p), δ_{m2^{−M}} p ).

The terms involving the W_k^{M,N}'s are relatively easy to handle. Namely, because (cf. (5.1.25) and use Brownian scaling) for any λ ≥ 0 and T ∈ (0,∞)

(8.3.9)    E^{P^0}[ e^{λ T^{−1/2} ‖p‖_{[0,T]}} ] = E^{P^0}[ e^{λ ‖p‖_{[0,1]}} ] ≡ K(λ) < ∞

and, for any (m, M) ∈ N², (cf. the notation in Theorem 8.2.14)

    ‖∆^M(·, p̃^M_h)‖_{I_{m,M}} ≤ 2 ‖∆^M(·, p)‖_{I_{m,M}} + 2^{−M/2} ‖h‖_{M,I_{m,M}},

we have that

(8.3.10)    E^{P^0}[ e^{λ 2^{M/2} ‖∆^M(·, p̃^M_h)‖_{I_{m,M}}} | B_{m2^{−M}} ] ≤ K(2λ) e^{λ ‖h‖_{M,I_{m,M}}}.

Hence, since (cf. (8.2.4))

    ‖ ∫_0^· W_0^{M,N}(τ, x, p) dτ ‖²_{[0,T]} ≤ T ∫_0^T |W_0^{M,N}(τ, x, p)|² dτ ≤ CT ∫_0^T e^{2ν|∆^M(τ, p)|} |D^{M,N}([τ]_M, x, p)|² dτ,

it is clear that there are finite constants K and K′ such that

    E^{P^0}[ ‖ ∫_0^· W_0^{M,N}(τ, x, p̃^M_h) dτ ‖²_{[0,T]} ]
      ≤ KT ∫_0^T E^{P^0}[ e^{2ν|∆^M(τ, p̃^M_h)|} |D^{M,N}([τ]_M, x, p̃^M_h)|² ] dτ
      ≤ K′ T e^{2ν‖h‖_{M,[0,T]}} ∫_0^T E^{P^0}[ |D^{M,N}([τ]_M, x, p̃^M_h)|² ] dτ.

Next, suppose that 1 ≤ k ≤ r. Then, after converting Stratonovich integrals into their Itô equivalents, we have

    ∫_0^t W_k^{M,N}(τ, x, p̃^M_h) ∘ d(p̃^M_h)_k(τ)
      = ∫_0^t W_k^{M,N}(τ, x, p̃^M_h) dp_k(τ) + ½ ∫_0^t W_{k,k}^{M,N}(τ, x, p̃^M_h) dτ
        − ∫_0^t W_k^{M,N}(τ, x, p̃^M_h) ṗ^M_k(τ) dτ + ∫_0^t W_k^{M,N}(τ, x, p̃^M_h) ḣ^M_k(τ) dτ,

where W_{k,k}^{M,N}(t, x, p) denotes

    V_{k,k}( ∆^M(t, p), X^N([t]_M, x, p) ) − V_{k,k}( ∆^M(t, p), X^M([t]_M, x, p) )

when V_{k,k}(ξ, x) ≡ [∂_k V_k(·, x)](ξ). The second and fourth terms on the right are handled in precisely the same way as the one involving W_0^{M,N} and satisfy estimates of the form

    E^{P^0}[ ‖ ∫_0^· W_{k,k}^{M,N}(τ, x, p̃^M_h) dτ ‖²_{[0,T]} ] ≤ KT e^{2ν‖h‖_{M,[0,T]}} ∫_0^T E^{P^0}[ |D^{M,N}([τ]_M, x, p̃^M_h)|² ] dτ

and

    E^{P^0}[ ‖ ∫_0^· W_k^{M,N}(τ, x, p̃^M_h) ḣ^M_k(τ) dτ ‖²_{[0,T]} ] ≤ K ‖h‖²_{M,[0,T]} e^{2ν‖h‖_{M,[0,T]}} ∫_0^T E^{P^0}[ |D^{M,N}([τ]_M, x, p̃^M_h)|² ] dτ.

In addition, because the first term is an Itô integral, we can apply Doob's inequality followed by the above reasoning to obtain the estimate

    E^{P^0}[ ‖ ∫_0^· W_k^{M,N}(τ, x, p̃^M_h) dp_k(τ) ‖²_{[0,T]} ] ≤ K e^{2ν‖h‖_{M,[0,T]}} ∫_0^T E^{P^0}[ |D^{M,N}([τ]_M, x, p̃^M_h)|² ] dτ.

As for the third term on the right, it is best to first decompose it into

    ∫_0^t W_k^{M,N}([τ]_M, x, p̃^M_h) dp_k(τ) + ∫_0^t ( W_k^{M,N}(τ, x, p̃^M_h) − W_k^{M,N}([τ]_M, x, p̃^M_h) ) ṗ^M_k(τ) dτ.

Since the first of these is an Itô integral, it presents no new problem. To deal with the second, observe that (cf. (8.2.4))

    ‖ ∫_0^· ( W_k^{M,N}(τ, x, p̃^M_h) − W_k^{M,N}([τ]_M, x, p̃^M_h) ) ṗ^M_k(τ) dτ ‖²_{[0,T]}
      ≤ C 4^M T Σ_{m=0}^{[2^M T]} |p_k((m+1)2^{−M}) − p_k(m2^{−M})|²
          ( ∫_{I_{m,M}} e^{2ν|∆^M(τ, p̃^M_h)|} |∆^M(τ, p̃^M_h)|² dτ ) |D^{M,N}(m2^{−M}, x, p̃^M_h)|²,

and pass from this to the same sort of estimate at which we arrived for the other terms. Hence, by combining these, we can now replace (8.3.8) by

(8.3.11)    E^{P^0}[ ‖D^{M,N}(·, x, p̃^M_h)‖²_{[0,T]} ]
      ≤ C( T, ‖h‖_{M,[0,T]} ) ∫_0^T E^{P^0}[ ‖D^{M,N}(·, x, p̃^M_h)‖²_{[0,t]} ] dt + 2 E^{P^0}[ ‖D̃^{M,N}(·, x, p̃^M_h)‖²_{[0,T]} ],

where (s, t) ∈ [0,∞)² ↦ C(s, t) ∈ (0,∞) is non-decreasing in each variable separately.

8.3.3. The Support Theorem, Part III. After applying Gronwall's inequality to (8.3.11), we know that there is a (s, t) ∈ [0,∞)² ↦ K(s, t) ∈ (0,∞), which is non-decreasing in each variable separately, such that

    sup_{‖h−g‖_{M,[0,T]}≤1} E^{P^0}[ ‖D^{M,N}(·, x, p̃^M_h)‖²_{[0,T]} ]
      ≤ K( T, 1 + ‖g‖_{M,[0,T]} ) sup_{‖h−g‖_{M,[0,T]}≤1} E^{P^0}[ ‖D̃^{M,N}(·, x, p̃^M_h)‖²_{[0,T]} ].
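The Gronwall step just invoked can be illustrated with a toy numerical check (not from the text): if a nonnegative f satisfies f(T) ≤ C∫_0^T f(t) dt + B for every T, then f(T) ≤ B e^{CT}, and the extremal case f(t) = B e^{Ct} saturates both inequalities. The constants below are arbitrary choices.

```python
import numpy as np

C, B, T = 2.0, 0.5, 1.0
t = np.linspace(0.0, T, 100_001)
f = B * np.exp(C * t)                     # extremal case of Gronwall's lemma
# trapezoid rule for int_0^T f(t) dt (overestimates, since f is convex)
integral = float(np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(t)))
saturates_hypothesis = bool(f[-1] <= C * integral + B + 1e-6)
meets_gronwall_bound = bool(f[-1] <= B * np.exp(C * T) + 1e-9)
```

Because the trapezoid rule overestimates the integral of a convex function, the first inequality holds without appeal to the tolerance; the second is an identity for this f.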

Hence, what remains is to show that

(8.3.12)    lim_{M→∞} sup_{N>M, ‖h−g‖_{M,[0,T]}≤1} E^{P^0}[ ‖D̃^{M,N}(·, x, p̃^M_h)‖²_{[0,T]} ] = 0,

and this turns out to be quite delicate.

To get started on the proof of (8.3.12), we again break the computation into parts. Namely, because

    X^N( t − [t]_M, X^N([t]_M, x, p), δ_{[t]_M} p ) = X^N( t − [t]_N, X^N([t]_N, x, p), δ_{[t]_N} p ),

D̃^{M,N}(t, x, p̃^M_h) can be written as

    ∫_0^t W̃_0^{M,N}(τ, x, p̃^M_h) dτ + Σ_{k=1}^r ∫_0^t W̃_k^{M,N}(τ, x, p̃^M_h) dp_k(τ) + ½ Σ_{k=1}^r ∫_0^t W̃_{k,k}^{M,N}(τ, x, p̃^M_h) dτ
      − Σ_{k=1}^r ∫_0^t W̃_k^{M,N}(τ, x, p̃^M_h) ṗ^M_k(τ) dτ + Σ_{k=1}^r ∫_0^t W̃_k^{M,N}(τ, x, p̃^M_h) ḣ^M_k(τ) dτ,

where W̃_k^{M,N}(t, x, p) denotes

    V_k( ∆^N(t, p), X^N([t]_N, x, p) ) − V_k( ∆^M(t, p), X^N([t]_M, x, p) )

and W̃_{k,k}^{M,N}(t, x, p) is equal to

    V_{k,k}( ∆^N(t, p), X^N([t]_N, x, p) ) − V_{k,k}( ∆^M(t, p), X^N([t]_M, x, p) ).

With the exception of those involving ṗ^M_k(τ), none of these terms is very difficult to estimate. Indeed,

    E^{P^0}[ ‖∫_0^· W̃_0^{M,N}(τ, x, p̃^M_h) dτ‖²_{[0,T]} ] ≤ T ∫_0^T E^{P^0}[ |W̃_0^{M,N}(τ, x, p̃^M_h)|² ] dτ,

    E^{P^0}[ ‖∫_0^· W̃_k^{M,N}(τ, x, p̃^M_h) dp_k(τ)‖²_{[0,T]} ] ≤ 4 ∫_0^T E^{P^0}[ |W̃_k^{M,N}(τ, x, p̃^M_h)|² ] dτ,

    E^{P^0}[ ‖∫_0^· W̃_{k,k}^{M,N}(τ, x, p̃^M_h) dτ‖²_{[0,T]} ] ≤ T ∫_0^T E^{P^0}[ |W̃_{k,k}^{M,N}(τ, x, p̃^M_h)|² ] dτ,

and

    E^{P^0}[ ‖∫_0^· W̃_k^{M,N}(τ, x, p̃^M_h) ḣ^M_k(τ) dτ‖²_{[0,T]} ] ≤ ‖h‖²_{M,[0,T]} ∫_0^T E^{P^0}[ |W̃_k^{M,N}(τ, x, p̃^M_h)|² ] dτ.

At the same time, by (8.2.4), |W̃_k^{M,N}(τ, x, p̃^M_h)| is dominated by a constant times

    ( |p̃^M_h([τ]_N) − p̃^M_h([τ]_M)| + |X^N([τ]_N, x, p̃^M_h) − X^N([τ]_M, x, p̃^M_h)| ) e^{ν ‖∆^M(·, p̃^M_h)‖_{I_{m,M}}},

and so, in view of (8.3.9) and (8.3.10), all the above terms will have been handled once we check that, for each q ∈ [1,∞), there exists a C_q < ∞ such that

(8.3.13)    E^{P^0}[ ‖X^N(·, x, p̃^M_h) − X^N(m2^{−M}, x, p̃^M_h)‖^q_{I_{m,M}} ]^{1/q} ≤ C_q ( 1 + ‖h‖_{M,I_{m,M}} ) 2^{−M/2}

for all x ∈ R^n, (m, M) ∈ N², and N ≥ M. To this end, first note that, because X^N(· + m2^{−M}, x, p) = X^N(·, X^N(m2^{−M}, x, p), δ_{m2^{−M}} p), it suffices to treat the case when m = 0. Second, write X^N(t, x, p̃^M_h) − x as

    ∫_0^t V_0( ∆^N(τ, p̃^M_h), x ) dτ + Σ_{k=1}^r [ ∫_0^t V_k( ∆^N(τ, p̃^M_h), x ) dp_k(τ) + ½ ∫_0^t V_{k,k}( ∆^N(τ, p̃^M_h), x ) dτ ]
      − Σ_{k=1}^r ∫_0^t ( V_k( ∆^N(τ, p̃^M_h), x ) ṗ^M_k(τ) − V_k( ∆^N(τ, p̃^M_h), x ) ḣ^M_k(τ) ) dτ,

and apply (8.2.4) and standard estimates to conclude from this that

    E^{P^0}[ ‖X^N(·, x, p̃^M_h) − x‖^q_{I_{0,M}} ]^{1/q}

is dominated by an expression of the form on the right-hand side of (8.3.13).

" · 2 # 0 Z P ˜ M,N M  M (8.3.14) lim sup E W τ, x, p˜h p˙k (τ) dτ = 0, M→∞ k N>M 0 0,T ] kh−gkM,[0,T ]≤1 248 8 Stratonovich’s Theory and for this purpose it is best to begin with yet another small reduction.

R · ˜ M,N M  M Namely, dominate Wk τ, x, p˜h p˙k (τ) dτ by 0 [0,T ] Z ˜ M,N M  M max Wk τ, x, p˜h |p˙k (τ)| dτ 0≤m≤2M T Im,M −M Z m2 + max W˜ M,N τ, x, p˜M p˙M (τ) dτ . M k h k 0≤m≤2 T 0 Again, the first of these is easy, since, by lines of reasoning which should be familiar by now:

 !2 0 Z P ˜ M,N M  M E  max Wk τ, x, p˜h |p˙k (τ)| dτ  0≤m≤[2M T ] Im,M

1 [2M T ]−1  2 4 0 X 4νk∆M p˜M k P h Im,M −M  −M ≤ CE  e pk (m + 1)2 − pk(m2 )  m=0

1− M 0 1 2 2 νkhk2 − M ≤ C T 2 e M,[0,T ] 2 2 for appropriate, finite constants C and C0. Hence, the proof of (8.3.14), and therefore (8.3.12), is reduced to showing that

" m2−M 2# 0 Z P ˜ M,N M  M (8.3.15) lim sup E max Wk τ, x, p˜h p˙k (τ) dτ = 0 M→∞ N>M m≤2M T 0 uniformly with respect to h with kh − gkM,[0,T ] ≤ 1. ˜ M,N To carry out the proof of (8.3.15), we decompose Wk (τ, x, p) into the sum

N X  ˆ M,L N  ˆ M,L N  Vk τ, X ([τ]M , x, p), p + Wk τ, X ([τ]M , x, p), p , L=M+1 where

ˆ M,L L L  Vk (τ, x, p) ≡ Vk ∆ (τ, p),X ([τ]L − [τ]M , x, δ[τ]M p) L L−1  − Vk ∆ (τ, p),X ([τ]L − [τ]M , x, δ[τ]M p) . and

ˆ M,L L L−1  Wk (τ, x, p) ≡ Vk ∆ (τ, p),X ([τ]L − [τ]M , x, δ[τ]M p) L−1 L−1  − Vk ∆ (τ, p),X ([τ]L−1 − [τ]M , x, δ[τ]M p) 8.3 The Support Theorem. 249

After making this decomposition, it is clear that (8.3.15) will be proved once we show that

(8.3.16)    lim_{M→∞} sup_{N>M, ‖h−g‖_{M,[0,T]}≤1} E^{P^0}[ ( Σ_{m=0}^{[2^M T]−1} |p_k((m+1)2^{−M}) − p_k(m2^{−M})| F_k^{M,N}(m, x, p̃^M_h) )² ] = 0,

where F_k^{M,N}(m, x, p) is given by

    Σ_{L=M+1}^N 2^M ∫_{I_{m,M}} | V̂_k^{M,L}( τ, X^N(m2^{−M}, x, p), p ) | dτ,

and that

(8.3.17)    lim_{M→∞} sup_{N>M, ‖h−g‖_{M,[0,T]}≤1} E^{P^0}[ max_{m<[2^M T]} | J_k^{M,N}( m, x, (p, h) ) |² ] = 0,

where J_k^{M,N}( m, x, (p, h) ) is equal to

    Σ_{m′=0}^m Σ_{L=M+1}^N ∫_{I_{m′,M}} Ŵ_k^{M,L}( τ, X^N(m′2^{−M}, x, p̃^M_h), p̃^M_h ) ṗ^M_k(τ) dτ.

By Schwarz's inequality, the expectation value in (8.3.16) is dominated by

    2^M T Σ_{m=0}^{[2^M T]−1} E^{P^0}[ |p_k((m+1)2^{−M}) − p_k(m2^{−M})|² F_k^{M,N}(m, x, p̃^M_h)² ]
      ≤ 2T Σ_{m=0}^{[2^M T]−1} E^{P^0}[ F_k^{M,N}(m, x, p̃^M_h)² ].

In addition, by Minkowski's and Jensen's inequalities,

    E^{P^0}[ F_k^{M,N}(m, x, p̃^M_h)² ]^{1/2}
      ≤ sup_{x∈R^n} E^{P^0}[ ( Σ_{L=M+1}^N 2^M ∫_{I_{m,M}} |V̂_k^{M,L}(τ, x, p̃^M_h)| dτ )² ]^{1/2}
      ≤ 2^{M/2} Σ_{L=M+1}^N sup_{x∈R^n} ( ∫_{I_{m,M}} E^{P^0}[ |V̂_k^{M,L}(τ, x, p̃^M_h)|² ] dτ )^{1/2}.

Thus, (8.3.16) will be proved once we show that

(8.3.18)    E^{P^0}[ ∫_{I_{m,M}} |V̂_k^{M,L}(τ, x, p̃^M_h)|² dτ ] ≤ C( ‖h‖_{M,[0,T]} ) 2^{−L−3M/2}

for some non-decreasing r ∈ [0,∞) ↦ C(r) ∈ [0,∞) and all x ∈ R^n and 0 ≤ m < [2^M T]. To this end, first observe that the left-hand side of (8.3.18) is dominated by

    C( ‖h‖_{M,[0,T]} ) sup_{x∈R^n} ∫_{I_{0,M}} E^{P^0}[ |D^{L−1,L}([τ]_L, x, δ_{m2^{−M}} p̃^M_h)|² ] dτ.

With the help of the following lemma, we will be able to use Theorem 8.2.12 to control this last expression.

8.3.19 Lemma. Given 0 ≤ τ < 2^{−M} and a B_τ-measurable, continuous Ψ : C([0,∞); R^r) → [0,∞),

    E^{P^0}[ Ψ( δ_{m2^{−M}} p̃^M_h ) ] ≤ ( 1 / (1 − 2^M τ) )^{r/2q} e^{ ‖h‖²_{M,I_{m,M}} / 2q } ‖Ψ‖_{L^q(P^0)}

for every q ∈ [1,∞].

Proof: First observe that (cf. Lemma 8.3.6)

    E^{P^0}[ Ψ( δ_{m2^{−M}} p̃^M_h ) ] = E^{P^0}[ Ψ( p̃^M_{δ_{m2^{−M}} h} ) ] = E^{P^0}[ Ψ( p̃^M + h^M_m ) ] = Ψ̃^M( h^M_m ),

where h^M_m(t) = (1 ∧ 2^M t)( h((m+1)2^{−M}) − h(m2^{−M}) ). Indeed, the first of these is just the time shift-invariance of Brownian increments, the second comes from the fact that δ_{m2^{−M}} h^M coincides with h^M_m on [0, 2^{−M}], and the last is the definition of Ψ̃^M in Lemma 8.3.6.

The next step is to show that another expression for Ψ̃^M( h^M_m ) is

    ( 1 / (1 − 2^M τ) )^{r/2} e^{ 2^{M−1} |h^M_m(2^{−M})|² } E^{P^0}[ Ψ(p) exp( − 2^M |p(τ) − h^M_m(2^{−M})|² / ( 2(1 − 2^M τ) ) ) ],

and, while doing this, we may and will assume that Ψ is not only continuous and non-negative but also bounded. But, by Lemma 8.3.6, we know that, for any bounded continuous f : R^r → R,

    E^{P^0}[ Ψ̃^M(p) f( p(2^{−M}) ) ] = E^{P^0}[ Ψ(p) f( p(2^{−M}) ) ].

Further, if y ∈ R^r ↦ ℓ^M_y ∈ C([0,∞); R^r) is given by ℓ^M_y(t) = (1 ∧ 2^M t) y, then

    E^{P^0}[ Ψ̃^M(p) f( p(2^{−M}) ) ] = ∫_{R^r} Ψ̃^M( ℓ^M_y ) f(y) γ_{2^{−M}}(y) dy,

while

    E^{P^0}[ Ψ(p) f( p(2^{−M}) ) ] = ∫_{R^r} E^{P^0}[ Ψ(p) γ_{2^{−M}−τ}( y − p(τ) ) ] f(y) dy,

where γ_t(y) = (2πt)^{−r/2} exp( −|y|²/2t ). Hence, because Ψ, and therefore also Ψ̃^M, is bounded and continuous, the asserted equality follows by choosing a sequence of f's which form an approximate identity at h((m+1)2^{−M}) − h(m2^{−M}).

Given the preceding, there are two ways in which the proof can be completed. One is to note that, trivially,

    E^{P^0}[ Ψ( p̃^M + h^M_m ) ] ≤ ‖Ψ‖_{L^∞(P^0)},

whereas, by the preceding,

    E^{P^0}[ Ψ( p̃^M + h^M_m ) ] ≤ ( 1 / (1 − 2^M τ) )^{r/2} e^{ 2^{M−1} |h^M_m(2^{−M})|² } ‖Ψ‖_{L¹(P^0)}.

Hence, interpolation provides the desired estimate. Alternatively, one can apply Hölder's inequality to get

    E^{P^0}[ Ψ(p) exp( − 2^M |p(τ) − h^M_m(2^{−M})|² / ( 2(1 − 2^M τ) ) ) ]
      ≤ ‖Ψ‖_{L^q(P^0)} ( ∫_{R^r} exp( − q′ 2^M |y − h^M_m(2^{−M})|² / ( 2(1 − 2^M τ) ) ) γ_τ(y) dy )^{1/q′},

perform an elementary Gaussian integration, and arrive at the desired result after making some simple estimates. □

If we now apply Lemma 8.3.19 with

    Ψ(p) = |X^L([τ]_L, x, p) − X^{L−1}([τ]_L, x, p)|²

and q = n, we see from Theorem 8.2.12 that

    E^{P^0}[ |X^L([τ]_L, x, p̃^M_h) − X^{L−1}([τ]_L, x, p̃^M_h)|² ] ≤ C( ‖h‖_{M,I_{m,M}} ) ( 2^{−L} τ ) ( 2^M / (1 − 2^M τ) )^{1/2}

for an appropriate choice of non-decreasing r ↦ C(r). In view of the discussion prior to Lemma 8.3.19, (8.3.18), and therefore (8.3.16), is now an easy step.

8.3.5. The Support Theorem, Part V. What remains is to check (8.3.17). For this purpose, we have to begin by rewriting Ŵ_k^{M,L}(τ, x, p) as

    V_k( ∆^L(τ, p), E( ξ^L(τ, p), X^{L−1}([τ]_{L−1} − [τ]_M, x, δ_{[τ]_M} p) ) )
      − V_k( ∆^L(τ, p) + ξ^L(τ, p), X^{L−1}([τ]_{L−1} − [τ]_M, x, δ_{[τ]_M} p) ),

where ξ^L(τ, p) ≡ ( [τ]_L − [τ]_{L−1}, p([τ]_L) − p([τ]_{L−1}) ). Having done so, one sees that Lemma 8.2.11 applies and shows that Ŵ_k^{M,L}(τ, x, p) can be written as

    ½ Σ_{ℓ≠k} ( [V_ℓ, V_k]( X^{L−1}([τ]_{L−1} − [τ]_M, x, δ_{[τ]_M} p) ) ξ^L_ℓ(τ, p) + R_{k,ℓ}^{M,L}(τ, x, p) ),

where

    |R_{k,ℓ}^{M,L}(τ, x, p)| ≤ C e^{ 2ν ‖∆^{L−1}(·, p)‖_{I_{m,M}} } ‖∆^{L−1}(·, p)‖²_{I_{m,M}}  for τ ∈ I_{m,M}.

Thus, since

    E^{P^0}[ ( Σ_{m=0}^{[2^M T]−1} |p_k((m+1)2^{−M}) − p_k(m2^{−M})| Σ_{L=M+1}^N e^{ 2ν ‖∆^{L−1}(·, p̃^M_h)‖_{I_{m,M}} } ‖∆^{L−1}(·, p̃^M_h)‖²_{I_{m,M}} )² ]
      ≤ 2^M T Σ_{m=0}^{[2^M T]−1} ( Σ_{L=M+1}^N E^{P^0}[ |p_k((m+1)2^{−M}) − p_k(m2^{−M})|² e^{ 4ν ‖∆^{L−1}(·, p̃^M_h)‖_{I_{m,M}} } ‖∆^{L−1}(·, p̃^M_h)‖⁴_{I_{m,M}} ]^{1/2} )²
      ≤ 2 T² C ( Σ_{L=M+1}^N E^{P^0}[ e^{ 8ν ‖∆^{L−1}(·, p̃^M_h)‖_{I_{0,M}} } ‖∆^{L−1}(·, p̃^M_h)‖⁸_{I_{0,M}} ]^{1/4} )²
      ≤ C( ‖h‖_{M,[0,T]} ) T² 2^{−M},

we are left with showing that, for each 1 ≤ k ≤ r and ℓ ≠ k,

 m 2 0 P X M,N 0  (8.3.20) lim sup E  max Hk,` m , x, (p, h)  = 0, M→∞ N>M m<[2M T ] m0=0 kh−gkM,[0,T ]≤1 8.3 The Support Theorem. 253

M,N  where Hk,` m, x, (p, h) equals

M  −M  −M  2 pk (m + 1)2 − pk(m2 ) Z N X M,L,N M L M × αk,` ([τ]L−1, x, p˜h )ξ` (τ, p˜h ) dτ Im,M L=M+1 with

M,L,N  L−1 N  αk,` (τ, x, p) ≡ [V`,Vk] X τ − [τ]M ,X ([τ]M , x, p), δ[τ]M p .

Since

    | H_{k,0}^{M,N}( m, x, (p, h) ) | ≤ C 2^{−M} | p_k((m+1)2^{−M}) − p_k(m2^{−M}) |,

the case when ℓ = 0 is easy. Thus, we will restrict our attention to 1 ≤ k ≠ ℓ ≤ r, in which case we decompose H_{k,ℓ}^{M,N}( m, x, (p, h) ) into the sum

    ( p_k((m+1)2^{−M}) − p_k(m2^{−M}) ) A_{k,ℓ}^{M,N}( m, x, (p, h) )
      + ( p_k((m+1)2^{−M}) − p_k(m2^{−M}) )( (h−p)_ℓ((m+1)2^{−M}) − (h−p)_ℓ(m2^{−M}) ) B_{k,ℓ}^{M,N}( m, x, (p, h) )
      + ( p_k((m+1)2^{−M}) − p_k(m2^{−M}) )( (h−p)_ℓ((m+1)2^{−M}) − (h−p)_ℓ(m2^{−M}) ) C_{k,ℓ}^{M,N}( m, x, (p, h) ),

where A_{k,ℓ}^{M,N}( m, x, (p, h) ) equals

    2^M Σ_{L=M+1}^N ∫_{I_{m,M}} α_{k,ℓ}^{M,L,N}( [τ]_{L−1}, x, p̃^M_h ) ( p_ℓ([τ]_L) − p_ℓ([τ]_{L−1}) ) dτ,

B_{k,ℓ}^{M,N}( m, x, (p, h) ) equals

    4^M Σ_{L=M+1}^N ∫_{I_{m,M}} α_{k,ℓ}^{M,L,N}( m2^{−M}, x, p̃^M_h ) ( [τ]_L − [τ]_{L−1} ) dτ,

and C_{k,ℓ}^{M,N}( m, x, (p, h) ) equals

    4^M Σ_{L=M+1}^N ∫_{I_{m,M}} ( α_{k,ℓ}^{M,L,N}( [τ]_{L−1}, x, p̃^M_h ) − α_{k,ℓ}^{M,L,N}( m2^{−M}, x, p̃^M_h ) ) ( [τ]_L − [τ]_{L−1} ) dτ.

Because | C_{k,ℓ}^{M,N}( m, x, (p, h) ) | is dominated by a constant times

    max_{M<L≤N} ‖ X^{L−1}( ·, X^N(m2^{−M}, x, p̃^M_h), δ_{m2^{−M}} p̃^M_h ) − X^N(m2^{−M}, x, p̃^M_h) ‖_{I_{0,M}},

a simple application of (8.3.13) leads to

    E^{P^0}[ max_{m<[2^M T]} | Σ_{m′=0}^m ( p_k((m′+1)2^{−M}) − p_k(m′2^{−M}) )( (h−p)_ℓ((m′+1)2^{−M}) − (h−p)_ℓ(m′2^{−M}) ) C_{k,ℓ}^{M,N}( m′, x, (p, h) ) |² ]
      ≤ 2^M T C sup_{x∈R^n} Σ_{m=0}^{[2^M T]−1} E^{P^0}[ | p_k((m+1)2^{−M}) − p_k(m2^{−M}) |² | (h−p)_ℓ((m+1)2^{−M}) − (h−p)_ℓ(m2^{−M}) |²
          max_{M<L≤N} ‖ X^{L−1}( ·, X^N(m2^{−M}, x, p̃^M_h), δ_{m2^{−M}} p̃^M_h ) − X^N(m2^{−M}, x, p̃^M_h) ‖²_{I_{0,M}} ]
      ≤ ( 1 + ‖h‖_{M,[0,T]} )² C T sup_{x∈R^n, N≥M} E^{P^0}[ ‖X^N(·, x, p̃^M_{h_m}) − x‖⁴_{I_{0,M}} ]^{1/2}
      ≤ C( ‖h‖_{M,[0,T]} )² T 2^{−M}.

The key to handling the other terms is the observation that, because ℓ ≠ k,

    E^{P^0}[ ( p_k((m+1)2^{−M}) − p_k(m2^{−M}) ) A_{k,ℓ}^{M,N}( m, x, (p, h) ) | B_{m2^{−M}} ] = 0

and

    E^{P^0}[ ( p_k((m+1)2^{−M}) − p_k(m2^{−M}) )( (h−p)_ℓ((m+1)2^{−M}) − (h−p)_ℓ(m2^{−M}) ) | B_{m2^{−M}} ] = 0.

Thus, since B_{k,ℓ}^{M,N}( m, x, (p, h) ) is B_{m2^{−M}}-measurable, both

    Σ_{m′=0}^{m−1} ( p_k((m′+1)2^{−M}) − p_k(m′2^{−M}) ) A_{k,ℓ}^{M,N}( m′, x, (p, h) )

and

    Σ_{m′=0}^{m−1} ( p_k((m′+1)2^{−M}) − p_k(m′2^{−M}) )( (h−p)_ℓ((m′+1)2^{−M}) − (h−p)_ℓ(m′2^{−M}) ) B_{k,ℓ}^{M,N}( m′, x, (p, h) )

are P^0-martingales relative to {B_{m2^{−M}} : m ≥ 0}. In particular, by Doob's inequality, the P^0-expected squares of the maxima over m < [2^M T] of these are dominated by the sum from m = 0 to [2^M T] − 1 of, respectively,

    E^{P^0}[ | p_k((m+1)2^{−M}) − p_k(m2^{−M}) |² A_{k,ℓ}^{M,N}( m, x, (p, h) )² ] ≤ C 2^{−M} E^{P^0}[ A_{k,ℓ}^{M,N}( m, x, (p, h) )⁴ ]^{1/2}

and

    E^{P^0}[ | p_k((m+1)2^{−M}) − p_k(m2^{−M}) |² | (h−p)_ℓ((m+1)2^{−M}) − (h−p)_ℓ(m2^{−M}) |² B_{k,ℓ}^{M,N}( m, x, (p, h) )² ]
      ≤ C 4^{−M} ( 1 + ‖h‖²_{M,[0,T]} ) E^{P^0}[ B_{k,ℓ}^{M,N}( m, x, (p, h) )² ].

Because B_{k,ℓ}^{M,N}( m, x, (p, h) ) is uniformly bounded, we will be done if we can show that

    E^{P^0}[ A_{k,ℓ}^{M,N}( m, x, (p, h) )⁴ ]^{1/2} ≤ C 2^{−M}.

To this end, note that

    | A_{k,ℓ}^{M,N}( m, x, (p, h) ) |⁴ ≤ 2^M ∫_{I_{m,M}} | G_{k,ℓ}^{M,N}( τ, x, (p, h) ) |⁴ dτ,

and therefore

    E^{P^0}[ A_{k,ℓ}^{M,N}( m, x, (p, h) )⁴ ] ≤ 2^M ∫_{I_{m,M}} E^{P^0}[ | G_{k,ℓ}^{M,N}( τ, x, (p, h) ) |⁴ ] dτ,

where

    G_{k,ℓ}^{M,N}( τ, x, (p, h) ) ≡ Σ_{L=M+1}^N α_{k,ℓ}^{M,L,N}( [τ]_{L−1}, x, p̃^M_h ) ( p_ℓ([τ]_L) − p_ℓ([τ]_{L−1}) )
      = ∫_{[τ]_M}^{[τ]_N} α̂_{k,ℓ}^{M,N}( σ, τ, x, p̃^M_h ) dp_ℓ(σ)

when α̂_{k,ℓ}^{M,N}( σ, τ, x, p ) = α_{k,ℓ}^{M,L,N}( [τ]_{L−1}, x, p ) for [τ]_{L−1} ≤ σ < [τ]_L. Finally, by Burkholder's inequality,

    E^{P^0}[ | G_{k,ℓ}^{M,N}( τ, x, (p, h) ) |⁴ ] ≤ C E^{P^0}[ ( ∫_{[τ]_M}^{[τ]_N} | α̂_{k,ℓ}^{M,N}( σ, τ, x, p̃^M_h ) |² dσ )² ],

which is dominated by a constant times 4^{−M}. Hence, we have at last completed the proof of Theorem 8.3.1. □
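The invariance phenomenon behind Exercise 8.2.16 (iv), which combines with Theorem 8.3.1 to show that P_x^L lives on paths in a submanifold when the vector fields are tangent to it, can be seen in a hypothetical one-field example (not from the text): for V(x) = (−x_2, x_1), which is tangent to every circle about the origin, the Stratonovich solution of dX = V(X) ∘ dB is the rotation X(t) = R_{B(t)} x by the Brownian angle, so |X(t)| = |x| for every t.

```python
import numpy as np

rng = np.random.default_rng(2)
# Brownian angle B on [0,1], 1024 steps (step count is an arbitrary choice)
B = np.concatenate([[0.0], np.cumsum(rng.normal(0.0, np.sqrt(1.0/1024), 1024))])
x = np.array([1.0, 0.0])
c, s = np.cos(B), np.sin(B)
# X(t) = rotation of x by angle B(t): the exact Stratonovich solution here
X = np.stack([c*x[0] - s*x[1], s*x[0] + c*x[1]], axis=1)
radius_drift = float(np.max(np.abs(np.hypot(X[:, 0], X[:, 1]) - 1.0)))
```

An Itô (Euler–Maruyama) discretization of the same equation would drift off the circle; it is the Stratonovich reading, matching the ODE approximations of Theorem 8.2.14, that respects the geometry.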

8.3.6. Exercises.

Exercise 8.3.21. It should be pointed out that the characterization of supp(P_x^L) is much easier when L is strictly elliptic in the sense that {V_1(x),…,V_r(x)} spans R^n at each x ∈ R^n. Of course, in this case, the statement is that supp(P_x^L) is the space of all continuous, R^n-valued paths which start at x. What follows is an outline giving steps for a simple proof of this statement. For the considerations here, Itô's theory is preferable to Stratonovich's. Thus, we will consider the solution p ↦ X(·, x, p) to the Itô stochastic integral equation (8.2.2). Our ellipticity assumption becomes the assumption that the matrix a(x) = σ(x)σ^⊤(x) is strictly positive definite for each x ∈ R^n.

(i) If (M(t), F_t, P) is an R^n-valued continuous martingale with M(0) = 0 satisfying

    Σ_{j=1}^n ⟨M_j⟩(dt) ≤ K dt

for some K < ∞, show that

    P( ‖M‖_{[0,T]} < R ) ≥ e^{−KT/R²}  for all T ∈ [0,∞) and R ∈ (0,∞).

Hint: Set

    u_R(t, x) = e^{Kt/R²} ( 1 − |x|²/R² ),

and show that ( u_R(t, M(t)), F_t, P ) is a submartingale. In particular, by Doob's Stopping Time Theorem, conclude that

    e^{KT/R²} P( ζ_R > T ) ≥ E^P[ u_R( T ∧ ζ_R, M(T ∧ ζ_R) ) ] ≥ 1,

where ζ_R = inf{ t ≥ 0 : |M(t)| ≥ R }.

(ii) Given h ∈ C^1([0,∞); R^n) with h(0) = x, set ρ = ‖h − x‖_{[0,T]}, and choose a bounded, measurable W : [0,∞) × R^n → R^n so that

    σ(y) W(t, y) = b(y) − ḣ(t)  for (t, y) ∈ [0,∞) × B_{R^n}(x, ρ + 1).

Next, take θ(t, p) = −W( t, X(t, x, p) ) and ζ(p) = inf{ t ≥ 0 : |X(t, x, p) − x| ≥ ρ }, set

    E_θ(t, p) = exp( ∫_0^t ⟨θ(τ, p), dp(τ)⟩_{R^n} − ½ ∫_0^t |θ(τ, p)|² dτ ),

determine (cf. Theorem 6.2.1) the probability measure Q so that dQ ↾ B_t = E_θ(t) dP^0 ↾ B_t for t ≥ 0, and apply Corollary 6.2.2 together with Doob's Stopping Time Theorem to see that ( X(t ∧ ζ_ρ, x, p) − h(t ∧ ζ_ρ), B_t, Q ) is an R^n-valued martingale which satisfies the conditions in (i) with K = sup_{|y−x|≤ρ} Trace( σσ^⊤(y) ).
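The lower bound in part (i) can be sanity-checked by Monte Carlo (not from the text, with arbitrary sample sizes) in the simplest case, where M is a one-dimensional Brownian motion and the quadratic-variation hypothesis holds with K = 1. Time discretization can only under-record the running maximum, which inflates the empirical probability, and that is the favorable direction for checking a lower bound.

```python
import numpy as np

rng = np.random.default_rng(1)
n_paths, n_steps, T, R = 20_000, 256, 1.0, 1.0
# simulate Brownian paths on [0, T] and record max |B| along each path
dB = rng.normal(0.0, np.sqrt(T / n_steps), (n_paths, n_steps))
max_abs = np.max(np.abs(np.cumsum(dB, axis=1)), axis=1)
estimate = float(np.mean(max_abs < R))   # empirical P(||B||_[0,T] < R)
bound = float(np.exp(-1.0 * T / R**2))   # e^{-K T / R^2} with K = 1
```

With these parameters the empirical probability comes out comfortably above e^{−1}, consistent with, though of course not a proof of, the claimed inequality.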

(iii) By combining (i) with (ii), show that P^0( ‖X(·, x, p) − h‖_{[0,T]} < ε ) > 0 for all ε > 0.

Exercise 8.3.22. Without much more effort, it is possible to refine (8.3.2) a little. Namely, given α ∈ (0, ½) and T ∈ (0,∞), set

    ‖p‖^{(α)}_{[0,T]} = sup_{0≤s<t≤T} |p(t) − p(s)| / (t − s)^α  for p ∈ C([0,∞); R^n).

The purpose of this exercise is to show that (8.3.2) can be replaced by the statement that

    lim_{M→∞} lim_{δ↘0} P^0( ‖X(·, x, p) − X(·, x, g)‖^{(α)}_{[0,T]} < ε | ‖p − g‖_{M,[0,T]} ≤ δ ) = 1

for all g ∈ C^1([0,∞); R^r) with g(0) = 0 and T > 0.

(i) Begin by showing that for each q ∈ [1,∞) and T > 0 there is a non-decreasing r ↦ K_q(T, r) such that

    E^{P^0}[ |X^N(t, x, p̃^M_h) − X^N(s, x, p̃^M_h)|^{2q} ]^{1/2q} ≤ K_q( T, ‖h‖_{M,[0,T]} ) (t − s)^{1/2}

for all 0 ≤ s < t ≤ T, N ≥ M, x ∈ R^n, and h ∈ C([0,∞); R^r).

(ii) Using the preceding, Lemma 8.3.6, and Exercise 2.4.17, show that for each α ∈ (0, ½), q ∈ [1,∞), and T ∈ [0,∞),

    sup_{x∈R^n, δ∈(0,1]} E^{P^0}[ ( ‖X(·, x, p)‖^{(α)}_{[0,T]} )^q | ‖p − g‖_{M,[0,T]} ≤ δ ] < ∞.

(iii) Given 0 < α < β, show that

    ‖p‖^{(α)}_{[0,T]} ≤ 2 ‖p‖^{1−α/β}_{[0,T]} ( ‖p‖^{(β)}_{[0,T]} )^{α/β},

and use this, part (ii) above, and (8.3.2) to complete the program.

Exercise 8.3.23. Perhaps the single most important application of Theorem 8.3.1 is to the strong minimum principle for degenerate elliptic operators. Namely, let G be an open subset of R × R^n, and, given (s, x) ∈ G, define G_L(s, x) to be the set of ( t, X(t − s) ) where t ≥ s and X ∈ S(x; V_0,…,V_r) with ( τ, X(τ − s) ) ∈ G for τ ∈ [s, t]. The minimum principle for L is the statement that if u ∈ C^{1,2}(G; R) is a (cf. (8.2.3)) (∂_t + L)-supersolution in G_L(s, x) (i.e., (∂_t + L)u ≤ 0 on G_L(s, x)), then

    u ↾ G_L(s, x) ≥ u(s, x) ⟹ u ↾ G_L(s, x) ≡ u(s, x).

Here are some steps which lead to this conclusion.

(i) Suppose that (t, y) ∈ G_L(s, x) and that u(t, y) > u(s, x). Choose X ∈ S(x; V_0,…,V_r) so that ( τ, X(τ − s) ) ∈ G for τ ∈ [s, t] and y = X(t − s). Next, choose ρ > 0 so that

    { (τ, z) : τ ∈ [s, t] and |z − X(τ − s)| ≤ ρ } ⊆ G

and there exists an ε > 0 such that u(t, z) ≥ u(s, x) + ε when |z − y| ≤ ρ. Set

    ζ(p) = inf{ τ ≥ 0 : |p(τ) − X(τ)| ≥ ρ } ∧ (t − s),

and show that ( u( s + τ ∧ ζ(p), p(τ ∧ ζ(p)) ), B_τ, P_x^L ) is a supermartingale. In particular, conclude that

    u(s, x) ≥ E^{P_x^L}[ u( s + ζ(p), p(ζ(p)) ) ].

(ii) Show that α ≡ P_x^L( ζ = t − s ) > 0, and combine this with the conclusion reached in (i) to arrive at the contradiction u(s, x) ≥ u(s, x) + εα > u(s, x).