Topics in Stochastic Differential Equations and Rough Path Theory

vorgelegt von Dipl.-Math. Joscha Diehl Von der Fakult¨atII – Mathematik und Naturwissenschaften der Technischen Universit¨atBerlin zur Erlangung des akademischen Grades

Doktor der Naturwissenschaften – Dr. rer. nat. – genehmigte Dissertation

Promotionsausschuss Vorsitzender: Prof. Dr. Dietmar H¨omberg Berichter: Prof. Dr. Peter Friz Prof. Dr. Terry Lyons

Tag der wissenschaftlichen Aussprache: 22. M¨arz2012

Berlin 2012 D 83 Acknowledgment I want to express my deep gratitude towards my advisor Prof. Peter Friz, who set me on an interesting collection of problems. He was always ready to discuss any piece of mathematics and his unyielding enthusiasm was a constant source of joy. I want to thank Prof. Terry Lyons for accepting to be co-examiner of this thesis. Financial support by the DFG through the IRTG “Stochastic Models of Complex Processes” as well as the SPP1324 is gratefully acknowledged. Contents

1 Introduction 2

2 Stochastic differential equations with rough drift 5 2.1 Notation ...... 5 2.2 Main results ...... 5 2.3 The Young case ...... 21 2.4 Applications ...... 32 2.4.1 Robust filtering ...... 33 2.4.2 Pathwise stochastic control ...... 40 2.4.3 Mixed stochastic differential equations ...... 45

3 Backward stochastic differential equations with rough driver 46 3.1 Notation ...... 47 3.2 Main results ...... 47 3.3 The Markovian Setting - Connection to rough PDEs ...... 58 3.4 Connection to BDSDEs ...... 64 3.5 Technical results ...... 66 3.5.1 Comparison for BSDEs ...... 66 3.5.2 Properties of rough flows ...... 70 3.5.3 Comparison for PDEs ...... 71

1 1. Introduction

1 Introduction

Rough path theory, introduced by Lyons [52], is a recent development to treat solutions of differential equations driven by paths that are very irregular in time. More precisely, the theory gives meaning to differential equation of the form

Z t Yt = Y0 + V (Yr)dXr, (1) 0 under appropriate assumptions on the deterministic path X and the vector field V , which go 1 d way beyond the case X ∈ C ([0,T ], R ). Heuristically, the theory can deal with α-H¨oldercontinuous paths of some order α > 0. The regime α > 1/2 is the well understood Young theory ([62, 49]). For α ∈ (1/3, 1/2] it turns d out that the information of the R -valued path alone is not sufficient to build a satisfying theory. An α-H¨olderrough path then takes values in a larger metric space which, in addition to the path itself, contains “second order information” which can be though of as the iterated of the path against itself. 1 This information has to be given exogenously, since it is not canonically determined, and has to satisfy a certain algebraic compatibility condition. On the space of all such paths a rough path “norm” 2 (similar to the usual α-H¨oldernorm d of paths in R ) is introduced, and all elements with finite norm are denoted α-H¨olderrough paths. Distance between paths is measured using this “norm”, giving rise to a rough path metric. 3 There are then at least three different approaches to solutions of rough differential equations (RDEs) of the form (1), which lead essentially to the same results. Lyons’ original idea was to define some sort of rough integration first and consider the equation as a fixed point problem. One of the key results is stability: with respect to rough path metric the solution mapping of RDEs is continuous in the driving signal (Lyons’ limit theorem). In particular, if X is a geometric rough path, which by definition means that there exists a sequence of smooth paths Xn converging to X in rough path metric, then in particular the solution to the RDE driven by X is the limit of classical ODE solutions driven by the approximating smooth paths Xn . The approach by Davie [21] defines a solution locally, taking inspiration from (higher order) Euler approximations. In this thesis we shall adopt the approach put forward by Friz and Victoir [33]. It takes Lyons’ limit theorem as starting point, and defines a solution to an RDE driven by a geometric rough path as the limit (if it exists uniquely) of ODE solutions driven by approximating smooth paths. Let us note that the theory can treat many stochastic processes as driving signals, continuous being the prime example (see e.g. [51, 33]). Owing to stability properties of rough path theory, many known results from stochastic analysis can then be easily recovered (existence and regularity of stochastic flows, support theorems, large deviation principles, see e.g. [46, 33]). A plethora of new results have also been made possible. Let us mention new numerical algorithms for SDEs [35], SDEs driven by non-semimartingales [17, 33, 40], H¨ormandertype theorems for SDEs driven by Gaussian (rough) paths [11, 3] and ergodicity for SDEs driven by fractional Brownian motion [39]. As we pointed out, an important (or even defining) feature of rough path theory is, that it establishes continuity in the driving signal for time inhomogeneous differential equations with

1For α < 1/3, more and more additional information (more “iterated integrals”) has to be given a priori, related to the size of 1/α. 2The space is not a vector space. 3We note that one can actually choose from a family of metrics; this shall be of no concern here.

2 1. Introduction respect to a certain metric. This thesis investigates whether the principle of stability can be extended to other objects that are classically well-defined when taking a smooth path η as input. In particular we consider the solution mapping for SDEs with an additional drift Z t Z t Z t St = S0 + a(Sr)dr + b(Sr)dBr + c(Sr)dηr, (2) 0 0 0 BSDEs with an additional driver Z T Z T Z T Yt = ξ + f(r, Yr,Zr)dr + H(Yr)dηr − ZrdWr, (3) t t t and parabolic PDES of the form

2 ∂tu + F (t, u, Du, D u) + H(u, Du)η ˙t = 0. (4)

Note that if η is smooth, these equations possess a unique solution (under appropriate as- sumptions on the coefficients). Section 2 covers our results on equation (2). We establish stability in rough path metric with respect to η, which makes it possible to define a solution to an SDE with rough drift for arbitrary geometric rough paths η. Deterministic rough path estimates do not easily mix with stochastic estimates and we tackle the problem using three different approaches. The first one consist in the application of a flow transform, which converts the equation into one that depends on the driving signal in a harmless way; in particular there is no explicit dependence on the derivative anymore. A similar technique is used in the section on BSDEs with rough driver, but the flow transform leads to more technical problems there. The second approach interprets equation (2) directly as an RDE. For this to work one needs to understand η and the Brownian motion B together as a joint rough path. There is a canonical way of doing this though using stochastic integration and formal integration by parts, at least for the case α > 1/3. The main problem in this approach is showing exponential integrability of solutions, a property that we need for applications. Lastly, in the Young regime α > 1/2 we are able to solve the integral equation directly. The main technical difficulty here stems from the fact that the deterministic estimates available for the Young integral are not easily combined with the stochastic (not pathwise) estimates available for the stochastic integral. The motivation and main application for equations of type (2) is a problem in the theory of nonlinear filtering. As conjectured by Lyons [52], we show that in order to do robust filtering for multidimensional observation, correlated with the signal, one needs to treat the observation as an element of a rough path space. This is shown in Section 2.4.1 using SDEs with rough drift. We refer to the introduction of that section for a thorough explanation and a historical account concerning the problem of robust filtering. In Section 2.4.2 we consider an application to (pathwise) stochastic control. The resulting Hamilton-Jacobi-Bellman equation is of main interest here, since it gives a stochastic repre- sentation for a class of “rough partial differential equations” (4). We note that rough path theory has been applied to the field of partial differential equations (PDEs) in several papers [36, 27, 61]. We work with the approach put forward by Friz et al [9, 10, 32], which treats rough PDEs as the limit of PDEs in which the rough path is replaced by an approximating sequence of smooth paths. The aforementioned stability of rough path theory is also a key feature in the theory of viscosity solutions to PDEs ([18]). It is then natural to interpret solutions to the approximating PDEs in the viscosity sense, and this is excatly what Friz et al did. This also turns out to be convenient for this particular application, since viscosity solutions are the right solution concept for PDEs originating from optimal control. Let us

3 1. Introduction mention that we can, in a Brownian setting (that is, when formally replacing η by a ), recover results by Buckdahn and Ma [8], under somewhat weaker assumptions and significantly shorter proofs. We note that (formally), when plugging in a fractional Brownian motion with Hurst param- eter H > 1/2 for the path η in (2), one gets the solution to a mixed Brownian-fractional Brownian SDE. Such equations are treated for example in [63] (with only one-dimensional Brownian motion and stronger assumptions on the vector fields). This statement is made rigorous in Theorem 33. Backward stochastic differential equations (BSDEs) were introduced by Bismut in 1973. In [4] he applied linear BSDEs to stochastic optimal control. In 1990 Pardoux and Peng [57] then considered non-linear equations. A solution to a BSDE with driver f and random variable 2 ξ ∈ L (FT ) is an adapted pair of processes (Y,Z) in suitable spaces, satisfying

Z T Z T Yt = ξ + f(r, Yr,Zr)ds − ZrdWr, t ≤ T. t t Under appropriate conditions on f and ξ they showed the existence of a unique solution to such an equation. Again we are interested in the (formal) equation (3), when η is a rough path. In the case of η beeing a smooth path, one is back in the classical setting. In Section 3 we will show that under appropriate conditions, if the smooth path converge to a rough path, then the corresponding solutions also converge. We use a flow transformation to prove the limit theorem. As opposed to the case of afore- mentioned SDEs, this leads to technical difficulties. The transformed BSDE falls outside the classical Lipschitz framework, since it is quadratic in the control variable, and we have to use stability results by Kobylanski [43]. We note that formally, when plugging in (the rough path lift of) a Brownian motion for the rough path in (3), one gets the solution to a backward doubly stochastic differential equation (BDSDE), a notion introduced in [55]. This statement is made rigorous in Section 3.4. When the randomness of the parameters (f, ξ) of a BSDE comes from the state of a standard SDE, the system it is called a forward-backward stochastic differential equation (FBSDE). The solutions of FBSDE’s are connected to (viscosity) solutions of partial differential equa- tions (PDEs) in what is sometimes called a nonlinear Feynman-Kac formula, see e.g. [54]. We extend this result to BSDEs with rough drift, thereby arriving at a nonlinear Feynman- Kac formula for a class of rough RDEs in Section 3.3. At the same time we get existence, uniqueness and stability for a class of equations, that has not been treated before in the literature.

4 2. Stochastic differential equations with rough drift

2 Stochastic differential equations with rough drift

The aim of this part is to give meaning to an equation of the form Z t Z t Z t ¯ St = S0 + a(Sr)dr + b(Sr)dBr + c(Sr)dηr, 0 0 0 in the case where η is a rough path. As mentioned in the introduction, we proceed by continuously extending the solution mapping, which is of course classically well defined if η is (the lift of) a smooth path. After introducing some notation, we present and prove the main results in Section 2.2. In Section 2.3 we improve, in the case of a Young driving path, on the regularity assumptions. A specific feature of this case is, that the SDE with rough drift actually satisfies the integral equation. Finally, in Section 2.4 we present some applications.

2.1 Notation

γ 4 m n Lip is the set of γ-Lipschitz functions a : R → R where m and n are chosen according to the context.

[p] dY ∼ d d G (R ) = R ⊕ so(dY ) is the free nilpotent group of step [p] over R . Here [p] denotes the largest integer not larger then p. 5

0,α 0,α−H¨ol [1/α] dY C := C0 ([0, t],G (R )) is the set of geometric α-H¨olderrough paths η : [0, t] → [1/α] d G (R Y ) starting at 0. We shall use the non-homogeneous metric ρα−H¨ol on this space.

0,q−var 0,q−var [q] dY C := C0 ([0, t],G (R )) is the set of geometric q-variation rough paths η : [0, t] → [q] d G (R Y ) starting at 0. ¯ ¯ ¯ ¯ 0 0 ¯ (Ω, F, (Ft)t≥0, P) will be a filtered probability space. Let S = S (Ω) denote the space of d adapted, continuous processes in R S , with the topology of uniform convergence in probability. For q ≥ 1 we denote by Sq = Sq(Ω)¯ the space of processes X ∈ S0 such that

 1/q ¯ q ||X||Sq := E[sup |Xs| ] < ∞. s≤t

2.2 Main results

¯ ¯ ¯ ¯ Let (Ω, F, (Ft)t≥0, P) be a filtered probability space carrying a dB-dimensional Brownian motion B¯ and a bounded dS-dimensional random vector S0 independent of B¯. n d n In the following α ∈ (0, 1]. Let η : [0, t] → R Y be smooth paths, such that η → η in 0,α n α-H¨older,for some η ∈ C and let S be a dS-dimensional process which is the unique solution to the classical SDE Z t Z t Z t n n n ¯ n n St = S0 + a(Sr )dr + b(Sr )dBr + c(Sr )dηr , 0 0 0 4In the sense of E. Stein, i.e. bounded k-th derivative for k = 0,..., bγc and γ − bγc-H¨oldercontinuous bγc-th derivative, where bγc is the largest integer strictly smaller then γ. 5This is the ”correct” state space for a geometric 1/p-H¨olderrough path; the space of such paths subject to 1/p-H¨olderregularity (in rough path sense) yields a complete metric space under 1/p-H¨olderrough path metric. Technical details of geometric rough path spaces can be found e.g. in Section 9 of [33].

5 2. Stochastic differential equations with rough drift where we assume that

1 dS 1 dS γ+2 dS (a1) α ∈ (0, 1], γ > 1/α, a ∈ Lip (R ), b1, . . . , bdB ∈ Lip (R ) and c1, . . . , cdY ∈ Lip (R )

1 dS 1 dS (a1’) α ∈ (0, 1], γ > 1/α, a ∈ Lip (R ), b1, . . . , bdB ∈ Lip (R ) and c1, . . . , cdY ∈ γ+3 d Lip (R S )).

1 dS γ dS (a2) α ∈ (1/3, 1/2), γ > 1/α, a ∈ Lip (R ), b1, . . . , bdB ∈ Lip (R ) and c1, . . . , cdY ∈ γ d Lip (R S )).

Theorem 1. Under assumption (a1), (a1’) or (a2), there exists a dS-dimensional process S∞ ∈ S0 such that

Sn → S∞, in S0.

In addition, the limit Ξ(η) := S∞ only depends on η and not on the approximating sequence. Moreover, for all q ≥ 1, η ∈ C0,α it holds that Ξ(η) ∈ Sq and the corresponding mapping Ξ: C0,α → Sq is locally uniformly continuous (and locally Lipschitz under assumption (a1’) or (a2)).

Following Theorem 1, we say that Ξ(η) is a solution of the SDE with rough drift

Z t Z t Z t ¯ Ξ(η)t = S0 + a(Ξ(η)r)dr + b(Ξ(η)r)dBr + c(Ξ(η)r)dηr. (5) 0 0 0 The following result establishes some of the salient properties of solutions of SDEs with rough ˆ ¯ drift. Let (Ω, F, P0) carry, a dY -dimensional Brownian motion Y and let Ω = Ω × Ω be the ˆ ¯ product space, with product measure P := P0 ⊗ P. Let S be the unique solution on this probability space to the SDE

Z t Z t Z t St = S0 + a(Sr)dr + b(Sr)dB¯r + c(Sr) ◦ dYr. (6) 0 0 0 Denote by Y the rough path lift of Y (i.e. the enhanced Brownian Motion over Y ).

Theorem 2. Under assumption (a1), (a1’) or (a2) we have that

• For every R > 0, q ≥ 1 ¯ sup E[exp(q|Ξ(η)|∞;[0,t])] < ∞. (7) ||η||α−H¨ol

• For P0 − a.e. ω ¯ P[Ss(ω, ·) = Ξ(Y (ω))s(·), s ≤ t] = 1. (8)

We split the proof in two cases. The first case uses a flow transformation, similar to what is done in Section 3 for BSDEs with rough drift, and works for assumption (a1) or (a1’). In the second case, under assumption (a2), we create a joint rough path lift of the Brownian motion B¯ and the deterministic rough path η, and then solve a classical RDE with this driving signal. The major problem here, is to show some integrability properties.

6 2. Stochastic differential equations with rough drift

Proof of Case 1

Proof of Theorem 1. Let η ∈ C0,α be the lift of a smooth path η. Let Sη be the unique solution of the SDE Z t Z t Z t η η η ¯ η St = S0 + a(Sr )dr + b(Sr )dBr + c(Sr )dηr. 0 0 0

˜η η −1 η η Define S := (φ ) (t, St ), where φ is the ODE flow Z t η η φ (t, x) = x + c(φ (r, x))dηr. (9) 0

By Lemma 3, we have that S˜η satisfies the SDE Z t Z t ˜η η ˜η ˜η ˜η ¯ St = S0 + a˜ (r, Sr )dr, + b (r, Sr )dBr (10) 0 0 witha ˜η, ˜bη defined as in Lemma 3. This equation makes sense, even if η is a generic rough path in C0,α, (in which case (9) is now really an RDE). Indeed, since the first two derivatives of φη and its inverse are bounded (Proposition 11.11 in [33]) we have thata ˜η(t, ·), ˜bη(t, ·) are also in Lip1. Hence by Theorem V.7 in [59], there exists a unique strong solution to (10). We define the mapping introduced in Theorem 1 as η ˜η Ξ(η)t := φ (t, St ).

To show continuity of the mapping we restrict ourselves to the case q = 2. Moreover we shall γ+3 dS assume c1, . . . , cdY ∈ Lip (R ), and we will hence prove the local Lipschitz property of the respective maps. 1 2 0,α 1 2 Let η , η ∈ C with |η |α−H¨ol, |η |α−H¨ol < R. By Lemma 6 we have ¯ ˜1 ˜2 2 1/2 1 2 E[sup |Ss − Ss | ] ≤ CLem6(R)ρα−H¨ol(η , η ). s≤t

Hence ¯ 1 2 2 1/2 ¯ 1 ˜1 2 ˜2 2 1/2 E[sup |Ξ(η )s − Ξ(η )s| ] = E[sup |φ (s, Ss ) − φ (s, Ss )| ] s≤t s≤t ¯ 1 ˜1 1 ˜2 2 1/2 ¯ 1 ˜2 2 ˜2 2 1/2 ≤ E[sup |φ (s, Ss ) − φ (s, Ss )| ] + E[sup |φ (s, Ss ) − φ (s, Ss )| ] s≤t s≤t ¯ 1 ˜1 1 ˜2 2 1/2 1 1 ≤ E[sup |φ (s, Ss ) − φ (s, Ss )| ] + sup |φ (s, x) − φ (s, x)| d s≤t s≤t,x∈R S ¯ ˜1 ˜2 2 1/2 1 2 ≤ K(R)E[sup |Ss − Ss | ] + CLem7(R)ρα−H¨ol(η , η ) s≤t 1 2 ≤ C1ρα−H¨ol(η , η ), as desired, where

C1 = KLem7(R)CLem6(R) + CLem7(R), where KLem7(R) and CLem7(R) are the constants from Lemma 7.

7 2. Stochastic differential equations with rough drift

Proof of Theorem 2. In order to show (7), pick k ∈ 1, . . . , dS. We first note that by simply scaling (the coefficients of) Sη it is sufficient to argue for q = 1. And consider the k-th component of Ξ. Then

(k) η  η ˜(k);η  E[exp(|Ξ (η)|∞;[0,t])] ≤ E[exp(|Dψ |∞ |φ (0,S0)| + |S |∞;[0,t] )] η η η ˜(k);η ≤ exp(|Dψ |∞ sup |φ (0, x))E[exp(|Dψ |∞ sup |St |)] |x|≤|S0|L∞ s≤t η η ≤ exp(|Dψ |∞ sup |φ (0, x)) |x|≤|S0|L∞   η ˜(k);η η ˜(k);η × E[exp(|Dψ |∞ sup Ss )] + E[− exp(|Dψ |∞ sup Ss )] s≤t s≤t η η = exp(|Dψ |∞ sup |φ (0, x)) |x|≤|S0|L∞   η ˜(k);η η ˜(k);η × E[sup exp(|Dψ |∞Ss )] + E[− sup exp(|Dψ |∞Ss )] . s≤t s≤t

Now, only the boundedness of the last two terms remains to be shown, for η bounded. By applying Itˆo’s formula we get that

Z t Z t ˜(k);η ˜(k);η ˜(k);η ˜(k);η ˜(k);η exp(St ) = 1 + exp(Sr )dSr + exp(Sr )dhS ir 0 0 Z t dB Z t ˜(k);η η ˜(k);η X ˜(k);η ˜η ˜(k);η ¯i = 1 + exp(Sr )˜ak (Sr )dr + exp(Sr )bki(Sr )dBr 0 i=1 0 dB Z t X ˜(k);η ˜η ˜(k);η 2 + exp(Sr )|bki(Sr )| dr. i=1 0

Hence the process exp(S˜(k);η) satisfies a SDE with Lipschitz coefficients and by an application of Gronwall’s Lemma and Burkholder-Davis-Gundy inequality (see also Lemma V.2 in [59]) one arrives at

¯ η ˜(k);η sup sup E[exp(|Dψ |∞St )] ≤ C exp(C2), (11) |η|α−H¨ol

η ˜η 2 C2 := sup |a˜ |∞ + |b |∞, η:||η||α−H¨ol

¯ η ˜(k);η which is finite because of Lemma 4. One argues analogously for sups≤t E[− exp(|Dψ |∞St )], which then gives (7). Now, for the correspondence to an SDE solution let Ω be the additional probability space as given in the statement. Let S be the solution to the SDE (6). In Section 3 in [8] it was shown (see also Theorem 2 in [1]), that if we let Θ be the stochastic (Stratonovich) flow

Z t Θ(ω; t, x) = x + c(Θ(ω; r, x)) ◦ dYr(ω), 0

8 2. Stochastic differential equations with rough drift

ˆ −1 ˆ then with St := Θ (t, St) we have P-a.s. Z s Z s ˆ B¯ ˆ ˆ ˆ ¯ ˆ B¯ Ss(ω, ω ) = S0 + aˆ(r, Sr)dr + b(r, Sr)dBr, s ∈ [0, t], P a.e. (ω, ω ). (12) 0 0 Here, componentwise,

X 1 X X aˆ(t, x) := ∂ Θ−1(t, Θ(t, x))a (Θ(t, x)) + ∂ Θ−1(t, Θ(t, x)) b (Θ(t, x))b (Θ(t, x)), i xk i k 2 xj xk i jl kl k j,k l ˆ X −1 b(t, x)ij := ∂xk Θi (t, x)bkj(Θ(t, x)). k

Especially, by a Fubini type theorem (e.g. Theorem 3.4.1 in [5]), there exists Ω0 with P0(Ω0) = ¯ 1 such that for ω ∈ Ω0 equation (12) holds true P a.s.. Let Y ∈ C0,α be the enhanced Brownian motion over Y . We can then construct ω-wise the Y(ω) ˜Y(ω) rough flow φ as given in (9). By the very definition of Ξ we know that St (ω) := Y(ω) −1 (φ ) (ω; t, Ξ(ω)t) satisfies the SDE

Z t Z t ˜Y(ω) ˆY(ω) ˜Y(ω) ˆY(ω) ˜Y(ω) ¯ ¯ B¯ St = S0 + b (r, Sr )dr + b (r, Sr )dBr, Pa.e. ω , (13) 0 0 where

Y(ω) X Y(ω) −1 Y(ω) Y(ω) a˜ (t, x)i := ∂xk (φ )i (t, φ (t, x))ak(t, φ (t, x)) k 1 X X + ∂ (φY(ω))−1(t, φY(ω)(t, x)) b (t, φY(ω)(t, x))b (t, φY(ω)(t, x)), 2 xj xk i jl kl j,k l ˜Y(ω) X Y(ω) −1 Y(ω) b (t, x)ij := ∂xk (φ )i (t, x)bkj(t, φ (t, x)). k

It is a classical rough path result (see e.g. section 17.5 in [33]), that there exists Ω1 with Y P (Ω1) = 1 such that for ω ∈ Ω1, we have

φY(ω)(·, ·) = Θ(ω; ·, ·).

Y(ω) Y(ω) Hence for ω ∈ Ω1 we have thata ˆ =a ˜ , ˆb = ˜b . Hence for for ω ∈ Ω0 ∩ Ω1 the processes ˆ ˜Y(ω) ¯ 6 St(ω, ·), St (·) satisfy the same Lipschitz SDE (with respect to P). By strong uniqueness ¯ we hence have for ω ∈ Ω0 ∩ Ω1 that P a.s. ˆ ˜Y(ω) Ss(ω, ·) = Ss (·), s ≤ t.

Hence for ω ∈ Ω0 ∩ Ω1 ¯ Ss(ω, ·) = Ξ(Y(ω))(·)s, s ≤ t, P a.s.

6 Here one has to argue, that fixing ω in equation (13) gives (P0-a.s.) the solution to the respective SDE on ΩB¯ . This is similar to the arguments given in Case 2 below, so we omit them.

9 2. Stochastic differential equations with rough drift

Lemma 3. Let η be a smooth dY -dim path η and S be the solution of the the following classical SDE Z t Z t Z t St = S0 + a(Sr)dr + b(Sr)dB¯r + c(Sr)dηr 0 0 0

¯ R t PdS R t i where B is a dB-dimensional Brownian motion, 0 c(Sr)dηr := i=1 0 ci(Sr)η ˙rdr, a ∈ 1 dS 1 dS 3 dS ∞ ¯ dS Lip (R ), b1, . . . , bdB ∈ Lip (R ), c1, . . . , cdY ∈ Lip (R ), and S0 ∈ L (Ω; R ) inde- pendent of B¯. Consider the flow

Z t φ(t, x) = x + c(φ(r, x))dηr. (14) 0

−1 Then S˜t := φ (t, St) satisfies the following SDE

Z t Z t S˜t = S0 + a˜(r, S˜r)dr + ˜b(r, S˜r)dB¯r, 0 0 where we define componentwise

X 1 X X a˜(t, x) := ∂ φ−1(t, φ(t, x))a (φ(t, x)) + ∂ φ−1(t, φ(t, x)) b (φ(t, x))b (φ(t, x)), i xk i k 2 xj xk i jl kl k j,k l ˜ X −1 b(t, x)ij := ∂xk φi (t, φ(t, x))bkj(φ(t, x)). k

Proof. Denote ψ(t, x) := φ−1(t, x). Then

Z t ψ(r, x) = x − ∂xψ(r, x)c(x)dηr. 0

By Itˆo’sformula

Z t X Z t ψi(t, St) − ψi(0,S0) = ∂tψi(r, Sr)dr + ∂xj ψi(r, Sr)dSj(r) 0 j 0 X 1 Z t + ∂ ψ (r, S )dhS ,S i 2 xj xk i r k j r j,k 0 Z t Z t X X X ¯ = ∂xj ψi(r, Sr)aj(Sr)dr + ∂xj ψi(r, Sr) bjk(Sr)dBk(r) j 0 j 0 k X 1 Z t X + ∂ ψ (S ) b (S )b (S )dr. 2 xj xk i r kl r jl r j,k 0 l

Lemma 4. Consider for a rough path η ∈ C0,α the analogously to Lemma 3 transformed coefficients, a˜η, ˜bη, i.e. consider the rough flow

Z t η φ(t, x) = φ (t, x) = x + c(φ(r, x))dηr. (15) 0

10 2. Stochastic differential equations with rough drift and define

X 1 X X a˜η(t, x) := ∂ φ−1(t, φ(t, x))a (φ(t, x)) + ∂ φ−1(t, φ(t, x)) b (φ(t, x))b (φ(t, x)), i xk i k 2 xj xk i jl kl k j,k l ˜η X −1 b (t, x)ij := ∂xk φi (t, φ(t, x))bkj(φ(t, x)). k

Then for every R > 0 there exists KLem4 = KLem4(R) < ∞ such that

η sup |a˜ |∞ ≤ KLem4, η:|η|α−H¨ol

1 2 1 2 sup |a˜ (t, x) − a˜ (t, x)| ≤ KLem4(R)ρα−H¨ol(η , η ), t,x ˜1 ˜2 1 2 sup |b (t, x) − b (t, x)| ≤ KLem4(R)ρα−H¨ol(η , η ). t,x

Proof. This is a straightforward calculation using Lemma 7 and the properties of a, b.

The following is a standard result for continuous dependence of SDEs on parameters.

Lemma 5. Let a˜i(t, x), ˜bi(t, x), i = 1, 2 be bounded and uniformly Lipschitz in x. Let S˜i be the corresponding unique solutions to the SDEs

Z t Z t ˜i i ˜i ˜i ˜i ¯ St = S0 + a˜ (r, Sr)dr + b (r, Sr)dBr, i = 1, 2. 0 0

Assume

1 1 sup |Da˜ (s, ·)|∞, sup |D˜b (s, ·)|∞ < K < ∞, s≤t s≤t   sup a˜1(r, x) − a˜2(r, x) , sup ˜b1(r, x) − ˜b2(r, x) < ε < ∞. r,x r,x

Then there exists CLem5 = CLem5(K) such that

˜1 ˜2 2 1/2 E[sup |Ss − Ss | ] ≤ CLem5 ε, s≤t

Proof. This is a straightforward application of Ito’s formula and Burkholder-Davis-Gundy inequality.

We now apply the previous Lemma to our concrete setting.

11 2. Stochastic differential equations with rough drift

Lemma 6. Let η1, η2 ∈ C0,α and let S˜1, S˜2 be the corresponding unique solutions to the SDEs

Z t Z t ˜i i ˜i ˜i ˜i ¯ St = S0 + a˜ (r, Sr)dr + b (r, Sr)dBr, i = 1, 2, 0 0 where a˜i, ˜bi are given as in Lemma 4. 1 2 Assume R > max{|η |α−H¨ol, |η |α−H¨ol}. Then there exists CLem6 = CLem6(R). ˜1 ˜2 2 1/2 1 2 E[sup |Ss − Ss | ] ≤ CLem6ρα−H¨ol(η , η ). s≤t

1 2 Proof. Fix R > 0. Let ||η ||α−H¨ol, ||η ||α−H¨ol < R. From Lemma 4 we know that

˜1 ˜2 1 2 sup |b (t, x) − b (t, x)| ≤ KLem4(R)ρα−H¨ol(η , η ), t,x

Analogously, we get

1 2 1 2 sup |a˜ (t, x) − a˜ (t, x)| ≤ L2ρα−H¨ol(η , η ), t,x for a L2 = L2(R).

1 Lemma 7. Let α ∈ (0, 1). Let γ > α ≥ 1, k ∈ {1, 2,... } and assume that V = (V1,...,Vd) γ+k e e is a collection of Lip -vector fields on R . Write n = (n1, . . . , ne) ∈ N and assume |n| := n1 + ··· + ne ≤ k.

Then, for all R > 0 there exist C = C(R, |V |Lipγ+k ),K = K(R, |V |Lipγ+k )) such that if 1 2 α−H¨ol [p] d i x , x ∈ C ([0, t],G (R )) with maxi ||x ||α−H¨ol;[0,t] ≤ R then

1 2 1 2 sup ∂nπ(V )(0, y0; x ) − ∂nπ(V )(0, y0; x ) ≤ Cρα−H¨ol(x , x ), e α−H¨ol;[0,t] y0∈R 1 −1 2 −1 1 2 sup ∂nπ(V )(0, y0; x ) − ∂nπ(V )(0, y0; x ) ≤ Cρα−H¨ol(x , x ), e α−H¨ol;[0,t] y0∈R 1 sup ∂nπ(V )(0, y0; x ) ≤ K, e α−H¨ol;[0,t] y0∈R 1 −1 sup ∂nπ(V )(0, y0; x ) ≤ K. e α−H¨ol;[0,t] y0∈R

Proof. The fact that V ∈ Lipγ+k (instead of just Lipγ+k−1) entails that the derivatives up to γ order k are unique non-explosive solutions to RDEs with Liploc vector fields (see Section 11 in [33]). Localization (uniform for driving paths bounded in α-H¨oldernorm) then yields the desired results.

12 2. Stochastic differential equations with rough drift

Proof of Case 2

Proof of Theorem 1. We will first define the solution mapping and only in the end we will see that it is the limit of classical SDEs.  B¯  Write η = π1 (η) for the projection of η to the first level. We define a path λ = λ η, ω 2 d taking values in G (R ), d := dY + dB. The first level is given as   ¯  λ(1) = η, B¯ ωB

(2);i,j (2);i,j (2);i,j  B¯  The elements λ of the second level are defined as η for i, j ∈ {1, . . . , d1}, B¯ ω for i, j ∈ {d1 + 1, . . . , d1 + d2} and the remaining second-level entries as Z t (2);i,j i ¯j λt = ηudBu , for i ∈ {1, . . . , d1} , j ∈ {1 + d1, . . . , d1 + d2} , 0 Z t (2);i,j j ¯i j ¯i λ = ηt Bt − ηudBu , otherwise. 0 −1 The Campbell-Baker-Hausdorff formula (Theorem 7.24 in [33]) then shows that λs,t := λs ⊗    λt = exp ηs,t, B¯s,t + As,t where

 1 R t i j R t j i   η dηu − ηs,udη , for i, j ∈ (1, ··· , dY )  2 s s,u s u   t j−d t j−d   1 R ¯i−dY ¯ Y R ¯ Y ¯i−dY i,j  2 s Bs,u dBu − s Bs,u dBu , for i, j ∈ (dY + 1, ··· , dY + dB) so (d) 3 As,t :=   R t i ¯j−dY 1 i ¯j−dY  η dBu − η B , for i ∈ (1, ··· , dY ), j ∈ (dY + 1, ··· , dY + dB)  s s,u 2 s,t s,t   t j j i−d   R ¯i−dY 1 ¯ Y  − 0 ηudBu + 2 ηs,tBs,t , for i ∈ (dY + 1, ··· , dY + dB), j ∈ (1, ··· , dY )

2 d  B¯  0,α0−H¨ol 2 d  That is, λt takes values in G (R ). We demonstrate that λ η, ω ∈ C G (R ) 0 (P-a.s.) for α < α by showing that |λ|α0 even has Gaussian tails. It is clear that the first level of λ is α-H¨older; to deal with the second level note that

P ij |Ast| ∼ ij Ast and that 2α−H¨olderregularity already holds for the first two cases, that is i, j ∈ {1, . . . , d1} and i, j ∈ {d1 + 1, . . . , d1 + d2} (in fact even a Gauss tail via the Fernique estimate for rough path norms). Now for the remaining case we have by definition that Z t 1 ij i j i j As,t ≤ ηs,udBu + ηs,tBs,t s 2 j j and since ηi B ≤ |η| B |t − s|2α it just remains to treat R t ηi dB . But s,t s,t α−H¨ol α−H¨ol s s,u u q R t i j R t i 2 since for each s < t, s ηs,udBu has the same distribution as s ηs,u du.Z for some fixed Z ∼ N (0, 1), we also have  r 2  2  t 2   R t i  j  R i  η dB ηs,u du s s,u u  s  exp η  =Law exp ηZ2      2α     2α   (t − s)   (t − s)  

and by the elementary estimate R t ηi 2 du ≤ |η| (t − s)2α+1 we can conclude by s s,u α−H¨ol taking sups

13 2. Stochastic differential equations with rough drift for some κ > 0. By Theorem A.19 in [33] this already yields the desired 2α0-H¨olderregularity R t i j 0 of η dB for any α < α. Putting everything together, we have shown that kλk 0 < s s,u u α −H¨ol ∞. We can actually deduce for every R > 0 the existence of b = b(α0, α, T, R) > 0 such that

2 sup E exp(b|λ|α0−H¨ol) < ∞, (16) |η|α≤R

α0−H¨ol 2 d  0 So indeed λ ∈ C G (R ) , P − a.s. for all α < α. Hence (see e.g. Corollary 8.24 in 0,α0−H¨ol 2 d  0 [33]) λ ∈ C G (R ) , P − a.s. for all α < α. Using α0 < α large enough, such that γ > 1/α0, we can then use standard existence and uniqueness for the solution of the RDE

Z t Z t λ λ λ St = S0 + ˚a(Sr )dr + V (Sr )dλr, (17) 0 0

k k PdB k 7 where V = (c, b) and ˚a (x) := a (x) + i=1hDbi (x), bi(x)i. We then define the mapping λ as Ξ(η) := St .

We proceed to show continuity of the mapping of a rough path η to the lift λ(η). Fix α0 < α 0 0 such that γ > 1/α . We shall use Theorem A.13 in [33]. Let then q ≥ q0(α, α ), as given 0,α in this Theorem. We show that for η, η¯ ∈ C with |η|α−H¨ol, |η¯|α−H¨ol ≤ R we have for the corresponding lifts λ, λ¯ that

¯ q 1/q E[ρα0 (λ, λ) ] ≤ CLipρα(η, η¯),

0 where CLip = CLip(q, R, α, α ). ¯ Denote ε := ρα0 (λ, λ). By (16) exists a constant C1 = C1(q, R) such that

q ¯ ¯ q αq E[|d(λs, λt)| ], E[|d(λs, λt)| ] ≤ C1|t − s| .

Moreover

¯ q q E[|π1(λs,t − λs,t)| ] = |ηs,t − η¯s,t| ≤ εq|t − s|αq. and (again the constants C may only depend on R and q) 1 [|π (λ − λ¯ )|q/2] = [| π (λ − λ¯ ) ⊗ π (λ − λ¯ ) + A − A¯ |q/2] E 2 s,t s,t E 2 1 s,t s,t 1 s,t s,t s,t s,t q/2 αq ¯ q/2 ≤ Cε |t − s| + CE[|As,t − As,t| ]

Also, since n ≥ N

Z t Z t Z t Z t  i j j i i j j i q/2 q/2 αq | ηs,udηu − ηs,udηu − η¯s,udη¯u − η¯s,udη¯u | ≤ ε |t − s| . s s s s

7This is a plus the Stratonovich corrector for the dB¯ integral. Note that we only have ˚a ∈ Lip1 so we have to use results on RDEs with drift (e.g. Theorem 12.6 and Theorem 12.10 in [33]) to get existence of a unique solution.

14 2. Stochastic differential equations with rough drift

And finally

Z t 1 Z t 1  i ¯j−dY i ¯j−dY i ¯j−dY i ¯j−dY q/2 E[| ηs,udBu − ηs,tBs,t − η¯s,udBu − η¯s,tBs,t | ] s 2 s 2 Z t i i ¯j−dY q/2 i ¯j−dY i ¯j−dY q/2 ≤ CE[| ηs,u − η¯s,udBu | ] + CE[|ηs,tBs,t − η¯s,tBs,t | ] s Z t i i 2 q/4 i i q/2 ¯j−dY q/2 ≤ C| |ηs,u − η¯s,u| du| + C|ηs,t − ηs,t| E[|Bs,t | ] s ≤ Cεq/2(t − s)αq/2+q/4 + Cεq/2(t − s)αq/2(t − s)q/4 ≤ Cεq/2(t − s)αq.

Hence

¯ q/2 q/2 αq E[|π2(λs,t − λs,t)| ] ≤ C2ε |t − s|

1/q 1/(2q) So with M = M(R, q) := max{1,C1 ,C2 } we have for all q ≥ 1 ¯ q 1/q α E[|d(λs, λt)| ] ≤ M|t − s| , ¯ q 1/q α E[|π1(λs,t − λs,t)| ] ≤ εM|t − s| , ¯ q/2 2/q 2 2α E[|π2(λs,t − λs,t)| ] ≤ εM |t − s| .

Hence by Theorem A.13 (i) in [33] there exists q large enough and a K = K(α, α0, T, q) such that  ¯ q |π1(λs,t − λs,t| 1/q E[ sup α0 ] ≤ εKM, s

¯ !q ¯ !q q 1/q π1(λs,t − λs,t) 1/q π2(λs,t − λs,t) 1/q 0 ¯ E[ρα −H¨ol(λ, λ) ] ≤ E[ sup α0 ] + E[ sup 2α0 ] ≤ CLipε, s6=t |t − s| s6=t |t − s| as desired. This consideration was for q large enough, but since Lr is Lipschitz continuously embedded in Lq for r > q, the result follows for all q.

We can now proceed to show local Lipschitzness of the solution mapping Ξ. Let still η, η¯ ∈ C0,α, Let α0 < α with γ > 1/α0 and let p0 := 1/α0. By Theorem 8 in [13] we have the deterministic estimate 8, that for some C∗, any β ∈ (0, 1) there exists C = C(C∗, β) such that

¯ ∗ ρp0−var(Ξ(η), Ξ(η¯)) ≤ Cρp0−var(λ, λ) exp(−N(β, C , υ) log(1 − β)), where υ, N are given as in Lemma 8 below.

8This estimate modifies the estimates in [33] in a way that is essential to get integrability in our context.

15 2. Stochastic differential equations with rough drift

Hence, for every q ≥ 1,

q 1/q q 1/q E[sup |Ξ(η)t − Ξ(η¯)t| ] ≤ E[ρp0−var(Ξ(η), Ξ(η¯)) ] t≤T  ¯ 2q 1/(2q) ∗ 2q 1/(2q) ≤ CE[ ρp0−var(λ, λ) ] E[[exp(−N(β, C , υ) log(1 − β))] ]  ¯ 2q 1/(2q) ∗ 2q 1/(2q) ≤ CE[ ρα0−H¨ol(λ, λ) ] E[[exp(−N(β, C , υ) log(1 − β))] ] ∗ 2q 1/(2q) ≤ Cρα0−H¨ol(η, η¯)E[[exp(−N(β, C , υ) log(1 − β))] ] .

By Lemma 8 we know that for every q ≥ 1, R > 0 there exists β0(q, R) > 0 such that for |η|α−H¨ol, |η¯|α−H¨ol < R, we have for β ≤ β0 that

∗ q 1/q E[[exp(−N(β, C , υ) log(1 − β))] ] is bounded. This yields the desired local Lipschitzness of Ξ.

Proof of Theorem 2. Fix R > 0. We show

¯ (k) sup E[exp(qΞ (η)t)] < ∞. (18) ||η||α−H¨ol

By Theorem 10.14 in [33] we have

 p0  |S|p0−var;[s,t] ≤ C ||λ||p0−var;[s,t] ∨ ||λ||p0−var;[s,t] .

Define the control

p0  p0  υ(s, t) := ||λ||p0−var;[s,t] ∨ ||λ||p0−var;[s,t] .

Then for any partition D of [0, t] X ||S||∞;[0,t] ≤ |S0| + ||S − Sti ||∞;[ti,ti+1] ti∈D X ≤ |S0| + ||S||p−var;[ti,ti+1] ti∈D X 1/p0 ≤ |S0| + C υ(ti, ti+1) .

ti∈D Then, using for a > 0 the partition

t0(a) := 0, p0 ti+1(a) := inf{s ≥ ti(a): υ(σi(a), s) ≥ a } ∨ t, we get

p0 ||S||∞;[0,t] ≤ |S0| + Ca (sup{n : tn < t} + 1) .

The exponential integrability (18) then follows as in the proof of Lemma 8. We now show correspondence to the SDE solution. On Ωˆ let EBM(Y, B¯) be the enhanced Brownian motion lift of (Y, B¯). We want to compare it to the mapping λEBM(Y ), where we

16 2. Stochastic differential equations with rough drift plug the enhanced Brownian motion lift of Y into the above constructed mapping. Note, that this composition is in general not a random variable on Ω,ˆ since joint measurability is not clear.

We know by definition that π1(λ(EBM(Y ))) is measurable and coincides with π1(EBM(Y, B¯)) Pˆ-a.s. We now consider only one component of the second level, the others follow similarly. So without loss of generality let i ∈ {1, . . . , dY }, j ∈ {dY + 1, . . . , dY + dB}. Then Z t ¯ (2);i,j i ¯j−dY ˆ EBM(Y, B)t = Yr dBr P − a.s. 0

Now by results on stochastic integrals we have

n t 2 Z  ¯ h ¯ ¯ i i ¯j−dY B X k j−dY B j−dY B Y dB (ω, ω ) = lim Y n (ω) B n (ω ) − B n (ω ) r r n→∞ (k−1)/2 t k/2 t (k−1)/2 t 0 k=1 on Aˆ ⊂ Ωˆ with Pˆ[Aˆ] = 1. By a Fubini type theorem (e.g. Theorem 3.4.1 in [5]), there exists ˚ ˚ ˆ B¯ B¯ ˆ B¯ ˆ a Ω ⊂ Ω such that for ω ∈ Ω we have that Aω := {ω :(ω, ω ) ∈ A} satisfy P [Aω] = 1. On the other hand for fixed ω ∈ Ω Z t (2);i,j i ¯j−d1 λt (EBM(Y )(ω)) = Yr (ω)dBr . 0 Now

n t 2 m Z  ¯ h ¯ ¯ i i ¯j−dY B X k j−dY B j−dY B Y (ω)dB (ω ) = lim Y nm (ω) B n (ω ) − B n (ω ) r r m→∞ (k−1)/2 t k/2 m t (k−1)/2 m t 0 k=1

B Bˆ Bˆ for ω ∈ Dω ⊂ Ω with P [Dω] = 1. Here, Dω as well as the subsequence (nm)m will depend on ω. B So for ω ∈ ˚Ω, ω ∈ Aˆω ∩ Dω we have that

n t 2 Z  ¯ h ¯ ¯ i i ¯j−dY B X k j−dY B j−dY B Y dB (ω, ω ) = lim Y n (ω) B n (ω ) − B n (ω ) r r n→∞ (k−1)/2 t k/2 t (k−1)/2 t 0 k=1 2nm X k h j−dY B¯ j−dY B¯ i = lim Y nm (ω) B n (ω ) − B n (ω ) m→∞ (k−1)/2 t k/2 m t (k−1)/2 m t k=1 t Z  ¯ i ¯j−dY B = Yr (ω)dBr (ω ). 0 Here, the second equality follows from the fact that the sum already converges along n, so it will converge along any subsequence (nm)m. ˚ B¯ ˆ Y Noting that for ω ∈ Ω we have P [Aω ∩ Dω] = 1 we get that for P -a.e. ω ∈ Ω ¯ B¯ EBM(Y, B)· = λ·(EBM(Y )(ω)), P − a.s. (19)

Now by classical rough paths results (see e.g. Section 17.5 in [33]), the RDE solution to (17) driven by EBM(Y, B¯) corresponds Pˆ-a.s. to the SDE solution of (6). Combining with (19) we arrive at (8).

17 2. Stochastic differential equations with rough drift

The results in Section 17.5 in [33] also yield, that if η is (the lift of) a smooth path η, then Ξ(η) actually solves (5) as a classical SDE. Together with continuity of the mapping Ξ this yields the first claim of Theorem 1.

Lemma 8. Let η, η¯ ∈ C0,α, with corresponding lifts (as in the proof of Case 2 above) λ, λ¯. 0 0 0 0 Let |η|α, |η¯|α < R and Let C > 0. Let α < α with γ > 1/α and let p := 1/α . Define p0 ¯ p0 υ(s, t) := ||λ||p0−var;[s,t] + ||λ||p0−var;[s,t], & ' X N(x, y, υ) := f(x, y) sup υ(ti, ti+1) , −1 υ(ti,ti+1)≤f(x,y) i where yp0 f(x, y) := , ψ−1(x)p ψ(x) := x exp(x).

For every q ≥ 1 there exists β0 = β0(R, C, q) > 0,K = K(R, C, q) > 0 such that for β < β0

|| exp(−N(β, C, υ) log(1 − β))||Lq ≤ K.

Proof. Define

−1 −1 M(β, C, υ) := sup{n : τ1(f(β, C) ) + ··· + τn(f(β, C) ) ≤ T }, where

τ0(a) := 0,

τi+1(a) := inf{t ≥ 0 : υ(σi(a), σi(a) + t) ≥ a}, with the convention inf ∅ = ∞ and ∞ − ∞ = ∞, where

i X σi(a) := τj(a). j=1

By the argument in Proposition 4.6 in [12] one sees that

N(β, C, υ) ≤ 2M(β, C, υ) + 1, so we can focus on M. First

−1 −1 P[M(β, C, υ) > n] = P[τ1(f(β, C) ) + ··· + τn(f(β, C) ) ≤ T ] −1 −1 ≤ P[τ1(f(β, C) ) ∧ φ(β) + ··· + τn(f(β, C) ) ∧ φ(β) ≤ T ], where we take φ to be the function given in Lemma 9. Define

−1 mi := mi(β) := E[τi(f(β, C) ) ∧ φ(β)], n 1 X S¯ := τ (f(β, C)−1) ∧ φ(β) − m  . n n i i i=1

18 2. Stochastic differential equations with rough drift then n T 1 X [M(β, C, υ) > n] ≤ [S¯ ≤ − m ]. P P n n n i i=1

1 Pn Now Lemma 9 shows, that 0 < m(β) ≤ n i=1 mi for all n, where m(β) := κφ(β).

Then, if n ≥ n0(T, β) := 2T/m(β), 1 · · · ≤ [S¯ ≤ − m(β)]. P n 2

1 If ε0 = 2 m(β), then ¯ ... ≤ P[Sn ≤ −ε0] 2ε2n2 ≤ 2 exp(− 0 ). nφ(β)2 by Hoeffding’s inequality (see e.g. Theorem 2 in [41]). Now ∞ X log n [exp(−q log(1 − β)M(β, C, υ))] ≤ [M(β, C, υ) ≥ b c] E P q(− log(1 − β)) n=1

Now for

n ≥ n1 := exp((n0 + 1)(− log(1 − β))q), one has by the above,

log n 2ε2 log n [M(β, C, υ) ≥ b c] ≤ 2 exp(− 0 b c) P q(− log(1 − β)) φ(β)2 q(− log(1 − β)) 2ε2 2ε2 log n ≤ 2 exp( 0 ) exp(− 0 ) φ(β)2 φ(β)2 q(− log(1 − β)) 2 2ε2 2ε − 0 = 2 exp( 0 )n φ(β)2q(− log(1−β)) φ(β)2 1 κ 1 − 2 = 2 exp( κ)n q(− log(1−β)) 2

Finally, pick β0 = β0(q) such that

1 κ 2 ≥ 2, q(− log(1 − β)) for β ≤ β0. Then for β ≤ β0 1 [exp(−q log(1 − β)M(β, C, υ))] ≤ n (β) + 2 exp( κ)ζ(2), E 1 2 and n1(β) is bounded for |η|α−H¨ol, |η¯|α−H¨ol bounded, by Lemma 9. This finishes the proof.

19 2. Stochastic differential equations with rough drift

Lemma 9. In the setting of the previous Theorem, there exists φ = φη,η¯ : R+ → R+, such that φ(x) > 0 for x 6= 0, and κ > 0 such that for all n ≥ 1

1 X [τ (f(β, C)−1) ∧ φ(β)] ≥ κφ(β), n E i i=1 and such that for x > 0, φ(x) is bounded away from 0 for |η|α−H¨ol, |η¯|α−H¨ol bounded.

Proof. First for every i

−1 −1 E[τi(f(β, C) ) ∧ φ(β)] ≥ P[τi(f(β, C) ) ≥ φ(β)] · φ(β)  −1  = 1 − P[τi(f(β, C) ) < φ(β)] · φ(β)

 −1  ≥ 1 − P[τi(f(β, C) ) ≤ φ(β)] · φ(β)

 −1 −1 −1  = 1 − P[υ(σi(f(β, C) ), σi(f(β, C) ) + φ(β)) ≥ f(β, C) ] · φ(β).

Now, by Markov inequality,

−1 −1 −1 P[υ(σi(f(β, C) ), σi(f(β, C) ) + φ(β)) ≥ f(β, C) ] −1 −1 ≤ f(β, C)E[υ(σi(f(β, C) ), σi(f(β, C) ) + φ(β))].

Hence

−1  −1 −1  E[τi(f(β, C) ) ∧ φ(β)] ≥ 1 − f(β, C)E[υ(σi(f(β, C) ), σi(f(β, C) ) + φ(β))] · φ(β).

Summing up we get

1 X [τ (f(β, C)−1) ∧ φ(β)] n E i i=1 n 1 X  ≥ 1 − f(β, C) [υ(σ (f(β, C)−1), σ (f(β, C)−1) + φ(β))] · φ(β) n E i i (20) i=1 n  1 X  = 1 − f(β, C) [υ(σ (f(β, C)−1), σ (f(β, C)−1) + φ(β))] · φ(β). n E i i i=1

Now n  1 X  1 − f(β, C) [υ(σ (f(β, C)−1), σ (f(β, C)−1) + φ(β))] n E i i i=1   ≥ 1 − f(β, C) sup E[υ(τ, τ + φ(β))] , τ where the supremum is over all stopping times τ. We define 1 1 φ(β) := sup{a : sup E[υ(τ, τ + a)] ≤ }. 2 τ 2f(β, C)

We claim that φ > 0 for β > 0.

20 2. Stochastic differential equations with rough drift

Indeed, note that

p0 ¯ p0 sup E[υ(τ, τ + a)] = sup E[||λ||p0−var;[τ,τ+a] + ||λ||p0−var;[τ,τ+a]] τ τ p0 ¯ p0 ≤ sup E[a||λ||α0−H¨ol;[0,T ] + a||λ||α0−H¨ol;[0,T ]] τ p0 ¯ p0 = E[||λ||α0−H¨ol;[0,T ] + ||λ||α0−H¨ol;[0,T ]]a, which shows that φ(β) > 0 for β > 0. By (16) we have in particular that φ(β) is bounded away from 0 for |η|α−H¨ol, |η¯|α−H¨ol bounded. Then by (20)

1 X −1   E[τi(f(β, C) ) ∧ φ(β)] ≥ 1 − f(β, C) sup E[υ(τ, τ + φ(β))] · φ(β) n τ i=1 1 ≥ φ(β), 2 so κ := 1/2 does the job.

2.3 The Young case

In Section 2.2 we showed existence, uniqueness and stability of an SDE with rough drift under the assumption that the vector fields in front of the Brownian motion are Lip1 and the vector fields in front of the rough path are Lipγ+2, if the rough path is α-H¨olderand γ > 1/α. The aim of this section is to show that in the case of α > 1/2 actually Lipγ regularity suffices in front of the rough path, which is the regularity one would need when solving a standard RDE. Moreover the vector fields in front of the Brownian motion do not have to be bounded anymore, as one would hope when thinking of the standard assumption for classical SDEs. The results are most conveniently derived when working in variation spaces. In this section [q] d d d q ∈ (1, 2), so of course G (R Y ) = R Y and we are dealing with classical paths in R Y . Let γq > q. Assume

γq dS (a3) a Lipschitz, b1, . . . , bdB Lipschitz and c1, . . . , cdY ∈ Lip (R )

¯ ¯ ¯ ¯ Let as before (Ω, F, (Ft)t≥0, P) be a filtered probability space carrying a dB-dimensional Brownian motion B¯ and a bounded dS-dimensional random vector S0 independent of B¯. Let n d n 0,q−var η : [0, t] → R Y be smooth paths, such that η → η in q-variation for some η ∈ C and n let S be a dS-dimensional process which is the unique solution to the classical SDE Z t Z t Z t n n n ¯ n n St = S0 + a(Sr )dr + b(Sr )dBr + c(Sr )dηr , 0 0 0

∞ 0 Theorem 10. Under assumption (a3) there exists a dS-dimensional process S ∈ S such that

Sn → S∞, in S0.

In addition, the limit Ξ(η) := S∞ only depends on η and not on the approximating sequence.

21 2. Stochastic differential equations with rough drift

Moreover, for all r ≥ 1, η ∈ C0,q−var it holds that Ξ(η) ∈ Sr and the corresponding mapping Ξ: C0,q−var → Sr is locally uniformly continuous. 9 Moreover, S∞ ∈ C0,p−var, for all p > 2 and satisfies the integral equation

Z t Z t Z t ¯ St = S0 + a(Sr)dr + b(Sr)dBr + c(Sr)dηr, (21) 0 0 0 where the last integral is a (pathwise) Young integral. It is the only process that satisfies this equation.

We prove this in four steps. First we show uniqueness of a solution to the integral equation, if it exists (Theorem 12). Then we show weak existence of a solution under the assumption c ∈ Lip1. (Theorem 13). Using standard methods from strong existence for standard SDEs, we can then deduce from these two results the existence of a unique strong solution (Theorem 14). We finally show stability of the solution with respect to the Young path η, which yields also the identification as a limit, as in the statement of the theorem. For notational convenience we assume a ≡ 0.

Remark 11. The main reason why we were not able to employ a contraction mapping argu- ment directly, is the fact that the deterministic Young inequality only gives the locally Lipschitz estimate Z Z ¯ || H(Y )dη − H(Y )dη||q−var;[0,T ] ¯  ¯ ≤ C ||Y ||p−var;[0,T ] + ||Y ||p−var;[0,T ] ||Y − Y ||p−var;[0,T ]||η||q−var;[0,T ].

If we apply the expectation operator to both sides of this inequality it becomes evident that a naive Gronwall argument will fail in a stochastic setting.

Theorem 12 (Pathwise Uniqueness). Let T > 0. Let q ∈ (1, 2) and η ∈ C0,q−var.

γq Let γq > q and c = (c1, . . . , cdY ) be a collection of Lip vector fields, let b = (b1, . . . , bdB ) be a collection of Lipschitz (not necessarily bounded) vector fields. Let ξ be a random variable independent of B. Then there exists at most one solution S on [0,T ] of (21).

Proof. Let p ∈ (2, 3) such that 1/p + 1/q > 1. Let S, S¯ be two solutions of (21) on [0,T ] (if the solutions are only valid up to a stopping time, the following argument shows that they coincide up to this stopping time). For T > 0, n ≥ 0 define the stopping times

T τ0 := 0, T T T τ := inf{t ≥ τ : ||S|| T ≥ 1 or ||S¯|| T ≥ 1} ∧ T ∧ (τ + T). n+1 n p−var;[τn ,t] p−var;[τn ,t] n

Since S, S¯ solve (21), their p-variation is a.s. finite (this follows e.g. from Theorem 14.9 in [33]). Hence for every T > 0, the stopping times exhaust the interval [0,T ].

9 Actually we show that the mapping is continuous into the space of all processes X with norm ||X||p,r = 1/r E[||X||p−var;[0,T ]r ] < ∞ for all p > 2, r ≥ 1.

22 2. Stochastic differential equations with rough drift

γq Assume that S T = S¯ T . First (here it is essential that c ∈ Lip ) τn τn Z Z || c(S)dη − c(S¯)dη|| T T p−var;[τn ,τn+1] Z Z ≤ || c(S)dη − c(S¯)dη|| T T q−var;[τn ,τn+1]   ≤ C 1 + max{||S|| T T , ||S¯|| T T } ||η|| T T ||S − S¯|| T T p−var;[τn ,τn+1] p−var;[τn ,τn+1] q−var;[τn ,τn+1] p−var;[τn ,τn+1]   ¯ ≤ C2 sup ||η||q−var;[t,t+T] ||S − S||p−var;[τ T,τ T ]. t≤T n n+1 Choosing T small enough we get Z · Z · ¯ r 1/r ¯ r 1/r E[|| c(S)dη − c(S)dη||p−var;[τ T,τ T ]] ≤ 1/3E[||S − S||p−var;[τ T,τ T ]] . 0 0 n n+1 n n+1

  On the other hand, define the martingale B˜ := 1 T (t) B − B . Using the BDG t [τn ,∞) t∧τn+1 τn inequality for p-variation (Theorem 14.12 in [33]), we get Z Z ¯ r 1/r ¯ ˜ r 1/r E[|| b(S) − b(S)dB|| T T ] = E[|| b(S) − b(S)dB||p−var;[0,T ]] p−var;[τn ,τn+1] Z T r/2 ¯ 2 1/r ≤ CE[ |b(Sr) − b(Sr)| dr ] 0 n !r/2 Z τi+1 ¯ 2 1/r = CE[ |b(Sr) − b(Sr)| dr ] n τi 1/2 ¯ r 1/r ≤ CT E[ sup |Sv − Sv| ] n n τi ≤v≤τi+1 1/2 ¯ r 1/r ≤ T CE[||S − S|| T T ] , p−var;[τn ,τn+1]

(where we need T ≤ 1 to bound sup with p-var; also we use the fact that S T = S¯ T ). τn τn Choosing T small enough we get Z Z ¯ r 1/r ¯ r 1/r E[|| b(S)dB − b(S)dB|| T T ] ≤ 1/3E[||S − S|| T T ] . p−var;[τn ,τn+1] p−var;[τn ,τn+1]

Hence

¯ r 1/r E[||S − S|| T T ] p−var;[τn ,τn+1] Z · Z · Z · Z · ¯ ¯ r 1/r = E[|| c(Sr)dηr + b(Sr)dBr − c(Sr)dηr − b(Sr)dBr||p−var;[τ T,τ T ]] 0 0 0 0 n n+1 ¯ r 1/r ≤ 2/3E[||S − S|| T T ] . p−var;[τn ,τn+1] Hence (note that the right hand side is finite, because of the stopping time) S = S¯ on T T [τn , τn+1]. This is true for all n, hence S = S¯ on [0,T ].

Theorem 13 (Weak Existence). Let T > 0. Let q ∈ (1, 2) and η ∈ C0,q−var.

γq Let γq > q and c = (c1, . . . , cdY ) be a collection of Lip vector fields, let b = (b1, . . . , bdB ) be a collection of Lipschitz Lip1 (i.e. Lipschitz and bounded) vector fields.

23 2. Stochastic differential equations with rough drift

n Let µ be a probability measure on R . Then there exists a probability space carrying a Brownian motion B˜ and an adapted process S˜ such that on [0,T ]

Z t Z t ˜ ˜ ˜ ˜ ˜ St = S0 + b(Sr)dBr + c(Sr)dηr, (22) 0 0 and S˜0 ∼ µ.

n 4 n Proof. Let c be a sequence of collections of Lip vector fields such that |c − c|Lipγq → 0. Let a Brownian motion B be given. Let ξ be a random variable, independent of B, with ξ ∼ µ. For every n there exists a solution Sn to

Z t Z t n n n n St = ξ + b(Sr )dBr + c (Sr )dηr. 0 0 Indeed, by Theorem 1 there exists a solution to corresponding SDE with rough drift, which so far only formally satisfies the integral equation. But looking at the proof, the solution is the back-transformation of an SDE with transformed coefficients. Applying Theorem 19 then gives, that Sn actually satisfies the integral equation. n 0,p Let p− < p ∈ (2, 3) such that 1/p + 1/q > 1. We will show that S is tight in C . Let T > 0. First (using BDG, exploiting boundedness of b, when applying Kolmogorov for the stochastic integral)

n r 1/r E[ ||S ||p−−var;[iT,(i+1)T] ]  Z r  Z r n n 1/r n 1/r ≤ E[ || c (S )dη||p−−var;[iT,(i+1)T] ] + E[ || b(Sr )dBr||p−−var;[iT,(i+1)T] ]

n r 1/r ≤ E[ C(1 + ||S ||p−−var;[iT,(i+1)T])||η||q−var;[iT,(i+1)T] ]  Z r 1/p− n 1/r + T E[ || b(Sr )dBr||1/p−−H¨ol;[iT,(i+1)T] ]

n r 1/r 1/p− ≤ C||η||q−var;[iT,(i+1)T] + E[ ||S ||p−−var;[iT,(i+1)T]) ] ||η||q−var;[iT,(i+1)T] + CT .

Choosing T so small such that supt ||η||q−var;[t,t+T] < 1/2, we get

n r 1/r 1/p− E[ ||S ||p−−var;[iT,(i+1)T] ] ≤ 2C||η||q−var;[iT,(i+1)T] + 2CT , (23) independent of n. 10 An application of Lemma 17 then yields

n r 1/r sup E[ ||S ||p−−var;[0,T ] ] < ∞. n

n 0,p−var p −var Hence S is tight as random variable in C . Indeed, let KR := {x ∈ C − : |x0| ≤ p−var R, ||x||p− ≤ R}, which is compact in C . Then

n n P[S 6∈ KR] ≤ P[|ξ| > R] + P[||S ||p−−var > R].  r 10Note that, for this reasoning to work, we need to know - a priori - that [ ||Sn||p− ]1/r < E p−−var;[iT,(i+1)T] ∞. But this follows e.g. by using the former inequality for the stopping times σn := inf{t ≥ iT : ||Sn||p− > n} and taking the limit. p−−var;[iT,t]

24 2. Stochastic differential equations with rough drift

Now the first term goes to zero, trivially uniformly in n, as R → ∞. The second term goes to zero, uniformly in n, as R → ∞, by the Chebychev inequality using (23). This gives tightness. Hence there exists a probability space with Brownian motions B˜n and random processes S˜n such that (Sn,B) ∼ (S˜n, B˜n) and (by relabeling a subsequence if necessary)

(S˜n, B˜n) → (S,˜ B˜) a.s. in C0,p−var, for some limit processes (S,˜ B˜). It is easy to see that

Z t Z t ˜n ˜n ˜n ˜ n ˜n St = S0 + b(Sr )dBr + c (Sr )dηr, 0 0

˜n and S0 ∼ µ. But (e.g. by Theorem 10.47 in [33]) Z Z cn(S˜n)dη → c(S˜)dη a.s. in C0,q−var.

Moreover by Lemma 5.2 in [38] (or Theorem 7.10 in [44], where one does not need that b is bounded) we have Z Z b(S˜n)dB˜n → b(S˜)dB˜ in ucp.

Hence Z t Z t ˜ ˜ ˜ ˜ ˜ St = S0 + b(Sr)dBr + c(Sr)dηr, 0 0 and S˜0 ∼ µ, i.e. we have found a weak solution to (22) as desired. Theorem 14 (Strong Existence). Let T > 0. Let q ∈ (1, 2) and η ∈ C0,q−var.

γq Let γq > q and c = (c1, . . . , cdY ) be a collection of Lip vector fields, let b = (b1, . . . , bdB ) be a collection of Lipschitz vector fields. ∞ Let B a Brownian motion on some probability space. Let S0 ∈ L (F0). Then there exists a unique solution S on [0,T ] of (21).

Proof. Assume b is actually a collection of Lip1 (i.e. Lipschitz and bounded) vector fields. Then we get a strong solution using Theorem 12, Theorem 13, and the standard arguments by Yamada/Watanabe that derives, in a classical SDE setting, strong existence from pathwise uniqueness and weak existence (see e.g. Lemma 18.17 in [42]). 11

Now let b just be Lipschitz and not necessarily bounded. Let φn be Lipschitz, with Lipschitz n n constant 1, such that for |x| ≤ n we have φn(x) = x and |φn|∞ ≤ n. Define b (x) := b(φ (x)). Then by the above, there exists a unique solution to

Z t Z t n n n n St = ξ + b (Sr )dBr + c(Sr )dηr, (24) 0 0 11Let us briefly note another approach to arrive at a strong solution, that might be more instructive. Using Lemma 1.1 in [37] and pathwise uniqueness given by Theorem 12 one can show that the sequence used in Theorem 13 to construct a weak solution actually converges to a strong solution.

25 2. Stochastic differential equations with rough drift

Define the stopping time

n τn := inf{t : |St | ≥ n}.

n Then, on [0, τn], S solves Z t Z t n n n St = ξ + b(Sr )dBr + c(Sr )dηr. 0 0

n n+1 Claim: τn ≤ τn+1 and S = S on [0, τn]. This follows from uniqueness for equation (24). Moreover there exists C > 0 such that for all n

n r 1/r r 1/r E[||S ||p−var;[0,T ]] ≤ C + CE[|ξ| ] . (25) Indeed, using the BDG inequality for p-variation (Theorem 14.12 in [33])

n r 1/r E[ ||S ||p−var;[iT,(i+1)T] ]  Z · r  Z · r n 1/r n n 1/r ≤ E[ || c(S )dη||p−var;[iT,(i+1)T] ] + E[ || b (Sr )dBr||p−var;[iT,(i+1)T] ] 0 0 n p r 1/r ≤ E[ C(1 + ||S ||p−var;[iT,(i+1)T]) ||η||q−var;[iT,(i+1)T] ]  Z · r/2 n n 1/r + CE[ h b (Sr )dBri[iT,(i+1)T] ] 0 n r 1/r ≤ C||η||q−var;[iT,(i+1)T] + E[ ||S ||p−var;[iT,(i+1)T]) ] ||η||q−var;[iT,(i+1)T] r/2 Z (i+1)T ! n n 2 1/r + CE[ |b (Sr )| dr ] iT

Now

r/2 r/2 Z (i+1)T ! Z (i+1)T ! n n 2 1/r n n n n 2 1/r E[ |b (Sr )| dr ] ≤ E[ (|b (S0 )| + |Db |∞|S0,r|) dr ] iT iT 1/2 n n n n r 1/r ≤ T E[(|b (S0 )| + |Db |∞|S |0) ] 1/2 n h n r 1/r n r 1/ri ≤ T |Db |∞ E[|S0 | ] + E[|S |0]

1/2 n h n r 1/r n r 1/ri ≤ T |Db |∞ E[|S0 | ] + E[|S |p−var] .

n Choosing T so small (independent of n and SiT), such that

1/2 n sup ||η||q−var;[t,t+T] + T |Db |∞ ≤ 1/2, t≤T we get

n r 1/r n r 1/r E[ ||S ||p−var;[iT,(i+1)T] ] ≤ C + CE[|SiT| ] .

An application of Lemma 17 then yields

n r 1/r r 1/r E[ ||S ||p−var;[0,T ] ] ≤ C + CE[|ξ| ] , (26) where C does not depend on the initial value.

26 2. Stochastic differential equations with rough drift

But then τn → T almost surely. Indeed assume there exists P[A] = ε > 0 such that on A we n have τn ≤ T − δ for all n. But then P[|S |∞ ≥ n] ≥ ε, for all n, which contradicts (25). n Hence S·∧τn converges to a process S that solves (21) on [0,T ], and, by an application of Fatou’s Lemma, S necessarily satisfies (26). Now a (unique, by Theorem 12) solution in the case of ξ ∈ L0, is constructed by solving with initial condition −n ∧ ξ ∨ n and letting n → ∞.

Theorem 15 (Stability). Let q ∈ (1, 2). Assume η, η1, η2, · · · ∈ C0,q−var. Let the assumptions of Theorem 14 be fulfilled. If

ηn → η, in C0,q−var, then for every r ≥ 1 we have for the corresponding solution to SDE with rough drift,

n r 1/r E[||S − S||p−var;[0,T ]] → 0.

Proof. We want to apply Lemma 17 and Lemma 18. 1. For the first one let T > 0. Using the BDG inequality for p-variation (Theorem 14.12 in [33])

n r 1/r E[ ||S ||p−var;[iT,(i+1)T] ]  Z r  Z r n n 1/r n 1/r ≤ E[ || c(S )dη ||p−var;[iT,(i+1)T] ] + E[ || b(Sr )dBr||p−var;[iT,(i+1)T] ]

n p n r 1/r ≤ E[ C(1 + ||S ||p−var;[iT,(i+1)T]) ||η ||q−var;[iT,(i+1)T] ]  Z r/2 n 1/r + CE[ h b(Sr )dBri[iT,(i+1)T] ]

n n r 1/r n ≤ C||η ||q−var;[iT,(i+1)T] + E[ ||S ||p−var;[iT,(i+1)T]) ] ||η ||q−var;[iT,(i+1)T] r/2 Z (i+1)T ! n 2 1/r + CE[ |b(Sr )| dr ] iT

Now

r/2 r/2 Z (i+1)T ! Z (i+1)T ! n 2 1/r n n 2 1/r E[ |b(Sr )| dr ] ≤ E[ (|b(SiT)| + |Db|∞|S0,r|) dr ] iT iT 1/2 n n n r 1/r ≤ T E[(|b(SiT)| + |Db |∞|S |0) ] 1/2 h n r 1/r n r 1/ri ≤ T |Db|∞ E[|SiT| ] + E[|S |0]

1/2 h n r 1/r n r 1/ri ≤ T |Db|∞ E[|SiT| ] + E[||S ||p−var;[iT,(i+1)T]] .

n Choosing T so small (independent of n and SiT!), such that

n 1/2 n sup sup ||η ||q−var;[t,t+T] + T |Db |∞ ≤ 1/2, n t≤T we get

n r 1/r n r 1/r E[ ||S ||p−var;[iT,(i+1)T] ] ≤ C + CE[|SiT| ] .

27 2. Stochastic differential equations with rough drift

An application of Lemma 17 yields

n r 1/r E[ ||S ||p−var;[0,T ] ] ≤ C.

Hence

r 1/r n r 1/r E[||S||p−var;[0,T ]] + sup E[||S ||p−var;[0,T ]] =: R < ∞. n

2. Define the controls

n p n p υ (s, t) := ||S||p;[s,t] + ||S ||p;[s,t].

For i ≥ 1, n ≥ 0, T > 0 define the stopping times (Note, that because of this we have to work in p-var space.)

n τ0 := 0, n n n n n τi+1 := inf{t ≥ τi : υ (τi , t) ≥ 1} ∧ T ∧ (τi + T).

Then

n r 1/r [||S − S || n n ] E p;[τi ,τi+1] Z · Z · n r 1/r n n r 1/r ≤ E[|| c(Sr) − c(Sr )dηr||p;[τ n,τ n ]] + E[|| c(Sr )d [ηr − ηr ] ||p;[τ n,τ n ]] 0 i i+1 0 i i+1 Z · n r 1/r + E[|| b(Sr) − b(Sr )dBr||p;[τ n,τ n ]] . 0 i i+1

Now, as usual (see e.g. the proof of Theorem 12), the last term is estimated with

1/2 n r 1/r 1/2 n r 1/r 1/2 n r 1/r CT [||S − S || n n ] ≤ CT [|Sτ n − S n | ] + CT [||S − S || n n ] . E ∞;[τi ,τi+1] E i τi E p;[τi ,τi+1]

The second term is estimated by

n r 1/r n n C [||S || n n ] ||η − η ||q;[0,T ] ≤ C||η − η ||q;[0,T ]. E p;[τi ,τi+1]

The first term is estimated by (here we use the definition of the stopping times)

n r 1/r n r 1/r C [|S n − S n | ] sup ||η|| + C [||S − S || n n ] sup ||η|| . E τi τ q;[t,t+T] E p;[τ ,τ ] q;[t,t+T] i t≤T i i+1 t≤T

Choosing T small enough we arrive at

n r 1/r [||S − S || n n ] E p;[τi ,τi+1] n r 1/r n ≤ C [|Sτ n − S n | ] + C||η − η || . E i τi q;[0,T ]

Hence

n r 1/r n r 1/r n [||S − S || n n ] ≤ (1 + C) [|Sτ n − S n | ] + C||η − η ||q;[0,T ] E ∞;[τi ,τi+1] E i τi n r 1/r n = C [|Sτ n − S n | ] + C||η − η || . E i τi q;[0,T ]

An application of Lemma 18 then yields the desired convergence (using, that the first part did hold true for all r ≥ 1).

28 2. Stochastic differential equations with rough drift

We now present the lemmas that were necessary for above reasoning.

Lemma 16. Let 0 = t0 < t1 < ··· < tn = T . Let p ≥ 1. Then for every X ∈ C0,p−var

p−1  ||X||p−var;[0,T ] ≤ n ||X||p−var;[t0,t1] + ||X||p−var;[t1,t2] + ··· + ||X||p−var;[tn−1,tn] .

Proof. Straightforward calculation.

Lemma 17. Let S be a in C0,p−var. Let r ≥ 1. i). Let

r 1/r E[|S0| ] = C0.

If there exists T > 0 such that for all i

r 1/r r 1/r E[||S||p−var;[iT,(i+1)T]] ≤ C1 + C1E[|SiT| ] .

Then there exiss C2 = C2(C1, T, T, r, p) such that

r 1/r E[||S||p−var;[0,T ]] ≤ C2, r 1/r E[||S||∞;[0,T ]] ≤ C2. ii). If there exists T > 0 such that for all i

r 1/r E[||S||p−var;[iT,(i+1)T]] ≤ C1, then there exists C3 = C3(C1, T, T, r, p) such that

r 1/r E[||S||p−var;[0,T ]] ≤ C3.

Proof. Without loss of generality assume T = iT, for some i ≥ 1. By Lemma 16 we have

r 1/r p−1 r 1/r p−1 r 1/r E[||S||p−var;[0,iT]] ≤ 2 E[||S||p−var;[0,(i−1)T]] + 2 E[||S||p−var;[(i−1)T,iT]] p−1 r 1/r p−1 h r 1/ri ≤ 2 E[||S||p−var;[0,(i−1)T]] + 2 C1 + C1E[|S(i−1)T| ] p−1 r 1/r ≤ 2 E[||S||p−var;[0,(i−1)T]] p−1 h  r 1/r r 1/ri + 2 C1 + C1 E[|S0| ] + E[||S||p−var;[0,(i−1)T]] .

Iterating up to time 0 yields the desired result.

Lemma 18. Let S, Sn be a sequence of stochastic processes in C0,p−var. Define the controls

n p n p υ (s, t) := ||S||p;[s,t] + ||S ||p;[s,t].

29 2. Stochastic differential equations with rough drift

For i, n ≥ 0, T > 0 define the stopping times

n τ0 := 0, n n n n n τi+1 := inf{t ≥ τi : υ (τi , t) ≥ 1} ∧ T ∧ (τi + T).

Assume that for some T > 0, r > 1 one has for all i, n

n r 1/r n r 1/r [||S − S || n n ] ≤ C [|Sτ n − S n | ] + h(n), E p−var;[τi ,τi+1] E i τi where h(n) → 0 as n → ∞. Assume

r n r E[||S||p−var;[0,T ]] + sup E[||S ||p−var;[0,T ]] =: R < ∞. (27) n

Assume

n r 1/r E[|S0 − S0 | ] → 0.

Then for every 1 ≤ r− < r we have

n r− 1/r− E[||S − S ||p−var;[0,T ]] →n→∞ 0.

Proof. 1. For every ε > 0 there exists M(ε) such that uniformly for all n

n P[τM = T ] ≥ 1 − ε.

Indeed, 1 [τ n < T ] ≤ [υn(0,T ) ≥ M − dT/Te] ≤ [υn(0,T )], P M P M − dT/TeE which can be chosen small, uniformly in n, because of (27). 2. We estimate

n r 1/r [||S − S || n ] E p−var;[0,τm] p−1 n r 1/r p−1 n r 1/r ≤ 2 [||S − S || n ] + 2 [||S − S || n n ] E p−var;[0,τm−1] E p−var;[τm−1,τm] p−1 n r 1/r p−1 h n r 1/r i ≤ 2 [||S − S || n ] + 2 C [||S − S || n ] + h(n) E p−var;[0,τm−1] E ∞;[0,τm−1] p−1 n r 1/r ≤ 2 [||S − S || n ] E p−var;[0,τm−1] p−1 h n r 1/r n r 1/r i + 2 C [|S − S | ] + C [||S − S || n ] + h(n) . E 0 E p−var;[0,τm−1] Iterating yields

n r 1/r h n r 1/r i [||S − S || n ] ] ≤ C(m, p) [|S − S | ] + h(n) . E p−var;[0,τm] E 0

3. We show convergence. Let 1 ≤ r− < r.

30 2. Stochastic differential equations with rough drift

n For every n, m ≥ 1 let An,m := {τm = T }. We have

n r− n r− n r− [||S − S || ] = [||S − S || 1 ] + [||S − S || 1 C ] E p−var;[0,T ] E p−var;[0,T ] An,m E p−var;[0,T ] An,m n r− n r− = [||S − S || n 1An,m ] + [||S − S || 1 C ] E p−var;[0,τm] E p−var;[0,T ] An,m  h ir− n r− 1/r− ≤ C(m, p) E[|S − S0 | ] + h(n) n r 1/r C 1/θ + E[||S − S ||p−var;[0,T ]] P[An ]  h ir− n r− 1/r− 1/r C 1/θ ≤ C(m, p) E[|S − S0 | ] + h(n) + R P[An ] , where θ = r . r−r− Let ε > 0 be given. Pick m so large, such that the second term is smaller then ε (uniformly in n, by Step 1). Then pick n so large that the first term is smaller then ε. This yields convergence.

0,q−var Lemma 19. Let q ∈ (1, 2), let γq > q. Let η ∈ C . Let φ be the Young flow Z t φ(t, y) = y + c(φ(r, y))dηr. 0

γq+2 3 Assume c = (c1, . . . , cdη ) is a collection of Lip vector fields. Then φ is a C diffeomor- phism. Let Y be a continuous with values in C0,p−var, where 1/q + 1/p > 1. Then

Z t Z t 1 Z t φ(t, Yt) − φ(0,Y0) = c(φ(r, Yr))dηr + ∂yφ(r, Yr)dYr. + ∂yyφ(r, Yr)dhY ir. 0 0 2 0 Remark 20. By Proposition 14.9 in [33] every continuous semimartingale has values in C0,p−var for every p > 2.

Proof. For notational convenience we assume dY = 1. 0,p−var 1. Obviously φ(·,Y·) ∈ C so the Young integral is well defined. n −n 2. Let ti := i2 t. Now

2n−1 X n n n n φ(t, Y ) − φ(t, Y ) = φ(t ,Y n ) − φ(t ,Y n ) + φ(t ,Y n ) − φ(t ,Y n ) t 0 i+1 ti i+1 ti+1 i+1 ti i ti i=0

Now classically

2n−1 X n n φ(t ,Y n ) − φ(t ,Y n ) i+1 ti+1 i+1 ti i=0 2n−1 2 X n h i 1 n h i = ∂yφ(t ,Ytn ) Ytn − Ytn + ∂yyφ(t ,Ytn ) Ytn − Ytn i+1 i i+1 i 2 i+1 i+1 i+1 i i=0 3 1 n h i + ∂yyyφ(t , ξtn ) Ytn − Ytn 6 i+1 i+1 i+1 i Z t 1 Z t →n→∞ ∂yφ(r, Yr)dYr + ∂yyφ(r, Yr)dhY ir, 0 2 0

31 2. Stochastic differential equations with rough drift in probability, say. On the other hand

2n−1 X n n φ(t ,Y n ) − φ(t ,Y n ) i+1 ti i ti i=0 2n−1 X n h i  n n n h i = c(φ(t ,Ytn )) η n − η n + φ(t ,Ytn ) − φ(t ,Ytn ) − c(φ(t ,Ytn )) η n − η n i i ti+1 ti i+1 i i i i i ti+1 ti i=0

Let 2 > q0 > q. Now by Theorem 6.8 in [33] we have

n n n h i φ(t ,Ytn ) − φ(t ,Ytn ) − c(φ(t ,Ytn )) η n − η n i+1 i i i i i ti+1 ti n Z ti+1 n h i = c(φ(r, Ytn ))dηr − φ(ti ,Ytn ) ηtn − ηtn n i i i+1 i ti

≤ C|c(φ(·,Y n ))| 0 n n |η| 0 n n ti q −var;[ti ,ti+1] q −var;[ti ,ti+1] 2 2 ≤ C|φ(·,Ytn )| 0 n n + C|η| 0 n n . i q −var;[ti ,ti+1] q −var;[ti ,ti+1]

Now by Theorem 10.14 in [33], we have

2  2 2q0  sup |φ(·, y)|q0−var;[tn,tn ] ≤ C ||η||q0−var;[tn,tn ] + ||η||q0−var;[tn,tn ] y i i+1 i i+1 i i+1  q0 (q0)2  ≤ C ||η|| 0 n n + ||η|| 0 n n , q −var;[ti ,ti+1] q −var;[ti ,ti+1] where the last line holds for n large enough, since q0 < 2. 2 q0 Since also |η| 0 n n ≤ |η| n n for n large enough, we get by Wiener’s charac- q −var;[ti ,ti+1] q−var;[ti ,ti+1] terization of C0,p−var, (Theorem 8.22 (i.2a) in [33])

2n−1 X  n n n h i φ(t ,Ytn ) − φ(t ,Ytn ) − c(φ(t ,Ytn )) η n − η n →n→∞ 0. i+1 i i i i i ti+1 ti i=0

By Exercise 6.9 in [33] we know that

n 2 −1 Z t X n h i c(φ(t ,Ytn )) η n − η n →n→∞ c(φ(r, Yr))dη , i i ti+1 ti r i=0 0

Hence

n 2 −1 Z t X n n φ(t ,Y n ) − φ(t ,Y n ) → c(φ(r, Y ))dη , i+1 ti i ti n→∞ r r i=0 0 which finishes the proof.

2.4 Applications

We present three applications for stochastic differential equations wit drift. The first one concerns problem of robustness in the theory of nonlinear filtering. With the help of SDEs

32 2. Stochastic differential equations with rough drift with rough drift, we solve the case in which the signal and multidimensional observation are correlated. The second application concerns the stochastic representation of certain rough PDEs, a topic on which we will also elaborate in the section on BSDEs with rough drift. Lastly, the results on the Young case enable us to solve mixed Brownian-fractional Brownian SDEs, under significantly weaker assumptions then those that seem to be available in the literature.

2.4.1 Robust filtering

Let (Ω, F, (Ft)t≥0, P) be a filtered probability space on which we have defined a two com- ponent diffusion process (X,Y ) solving a stochastic differential equation driven by a mul- tidimensional Brownian motion. One assumes that the first component X is unobservable and the second component Y is observed. The filtering problem consists in computing the conditional distribution of the unobserved component, called the signal process, given the observation process Y . Equivalently, one is interested in computing

πt(f) = E[f(Xt,Yt)|Yt], where Y = {Yt, t ≥ 0} is the observation filtration and f is a suitably chosen test function. An elementary measure theoretic result tells us12 that there exists a Borel-measurable map f dY θt : C([0, t], R ) → R, such that

f πt(ϕ) = θt (Y·) P-a.s., (28) where dY is the dimension of the observation state space and Y· is the path-valued random variable dY Y· :Ω → C([0, t], R ),Y·(ω) = (Ys(ω), 0 ≤ s ≤ t). f ¯f Of course, θt is not unique. Any other function θt such that

−1 ¯f f  P ◦ Y· θt 6= θt = 0,

−1 dY f where P ◦ Y· is the distribution of Y· on the path space C([0, t], R ) can replace θt in (28). It would be desirable to solve this ambiguity by choosing a suitable representative from the class of functions that satisfy (28). A continuous version, if it exists, would enjoy the −1 following uniqueness property: if the law of the observation P ◦ Y· positively charges all dY f non-empty open sets in C([0, t], R ) then there exists a unique continuous function θt that f satisfies (28). In this case, we call θt (Y·) the robust version of πt(ϕ) and equation (28) is the robust representation formula for the solution of the stochastic filtering problem. The need for this type of representation arises when the filtering framework is used to model and solve ‘real-life’ problems. As explained in a substantial number of papers (e.g. [15, 16, 22, 23, 24, 25, 26, 45]) the model chosen for the “real-life” observation process Y¯ may not be a perfect one. However, as long as the distribution of Y¯· is close in a weak sense to that f ¯ of Y· (and some integrability assumptions hold), the estimate θt (Y·) computed on the actual f ¯ 2 observation will still be reasonable, as E[(f(Xt,Yt) − θt (Y·)) ] is well approximated by the f 2 idealized error E[(f(Xt,Yt) − θt (Y·)) ]. 12See, for example, Proposition 4.9 page 69 in [6].

33 2. Stochastic differential equations with rough drift

Moreover, even when Y and Y¯ actually coincide, one is never able to obtain and exploit a continuous stream of data as modeled by the continuous path Y·(ω). Instead the observation arrives and is processed at discrete moments in time

0 = t0 < t1 < t2 < ··· < tn = t.

ˆ n However the continuous path Y·(ω) obtained from the discrete observations (Yti (ω))i=1 by d linear interpolation is close to Y·(ω) (with respect to the supremum norm on C([0, t], R Y ); f ˆ hence, by the same argument, θt (Y·) will be a sensible approximation to πt(ϕ). In the following, we will assume that the pair of processes (X,Y ) satisfy the equation

X k X j dXt = l0(Xt,Yt)dt + Zk(Xt,Yt)dWt + Lj(Xt,Yt)dBt , (29) k j

dYt = h(Xt,Yt)dt + dWt, (30) with X0 being a bounded random variable and Y0 = 0. In (29) and (30), the process X is the dX -dimensional signal, Y is the dY -dimensional observation, B and W are independent dB-dimensional, respectively, dY -dimensional Brownian motions independent of X0. Suitable dX +dY dX dX +dY assumptions on the coefficients l0,L1,...,LdB : R → R , Z1,...,ZdY : R → d 1 d d +d d R X and h = (h , . . . , h Y ): R X Y → R Y will be introduced later on. This framework covers a wide variety of applications of stochastic filtering (see, for example, [20] and the references therein) and has the added advantage that, within it, πt(ϕ) admits an alternative representation that is crucial for the construction of its robust version. Let us detail this representation first.

Let u = {ut, t > 0} be the process defined by

d ! XY Z t 1 Z t u = exp − hi(X ,Y )dW i − (hi(X ,Y ))2ds . (31) t s s s 2 s s i=1 0 0

Then, under suitable assumptions13, u is a martingale which is used to construct the prob- S ability measure P0 equivalent to P on 0≤t<∞ Ft whose Radon–Nikodym derivative with respect to is given by u, viz P dP0 = ut. d P Ft Under P0, Y is a Brownian motion under independent of B. Moreover the equation for the signal process X becomes

¯ X k X j dXt = l0(Xt,Yt)dr + Zk(Xt,Yt)dYt + Lj(Xt,Yt)dBt . (32) k j

Observe that equation (32) is now written in terms of the pair of Brownian motions (Y,B) ¯ ¯ P and the coefficient l0 is given by l0 = l0 + k Zkhk. Moreover, for any measurable, bounded d +d function f : R X Y → R, we have the following formula called the Kallianpur-Striebel’s formula, pt(f) πt(f) = , pt(f) := E0[f(Xt,Yt)vt|Yt] (33) pt(1)

13 h  1 R t i 2 i For example, if Novikov’s condition is satisfied, that is, if E exp 2 0 kh (Xs,Ys)k ds < ∞ for all t > 0, then u is a martingale. In particular it will be satisfied in our setting, in which h is bounded.

34 2. Stochastic differential equations with rough drift

where v = {vt, t > 0} is the process defined as vt := exp(It), t ≥ 0 and

d XY Z t 1 Z t  I := hi(X ,Y )dY i − (hi(X ,Y ))2dr , t ≥ 0. (34) t r r r 2 r r i=1 0 0 The representation (33) suggests the following three-step methodology to construct a robust f representation formula for πt :

Step 1 . We construct the triplet of processes (Xy,Y y,Iy) 14 corresponding to the pair (y, B) where y is now a fixed observation path y. = {ys, s ∈ [0, t]} belonging to a suitable class of continuous functions and prove that the random variable f(Xy,Y y)exp(Iy) is P0-integrable. f Step 2 . We prove that the function y. → gt (y.) defined as

f y y y gt (y.) = E0 [f(Xt ,Yt ) exp(It )] (35) is continuous.

f Step 3 . We prove that gt (Y.) is a version of pt(f). Then, following (33), the function, f y. → θt (y.) defined as f f gt θt = 1 (36) gt

provides the robust version of of πt(f).

We emphasize that Step 3 cannot be omitted from the methodology. Indeed one has to f f prove that gt (Y.) is a version of pt(f) as this fact is not immediate from the definition of gt . Step 1 is immediate in the particular case when only the Brownian motion B drives X (i.e. the coefficient Z = 0) and X is itself a diffusion, i.e., it satisfies an equation of the form

X j dXt = l0(Xt)dr + Lj(Xt)dBt , (37) j and h does only depend on X. In this case the process (Xy,Y y) can be taken to be the pair (X, y). Moreover, we can define Iy by the formula

d XY  Z t 1 Z t  Iy := hi(X )yi − yi dhi(X ) − (hi(X ,Y ))2dr , t ≥ 0. (38) t t t r r 2 r r i=1 0 0

i R t i i provided the processes h (X) are semi-martingales. In (38), the integral 0 yrdh (Xr) is the Itˆointegral of the non-random process yi with respect to hi(X). Note that the formula for y It is obtained by applying integration by parts to the stochastic integral in (34) Z t Z t i i i i i i h (Xr)dYr = h (Xt)Yr − Yr dh (Xr), (39) 0 0 and replacing the process Y by the fixed path y in (39). This approach has been successfully used to study the robustness property for the filtering problem for the above case in a number of papers ([15, 16, 45]).

14As we shall see momentarily, in the uncorrelated case the choice of Y y will trivially be y. In the correlated case we make it part of the SDE with rough drift, for (notational) convenience.

35 2. Stochastic differential equations with rough drift

The construction of the process (Xy,Y y,Iy) is no longer immediate in the case when Z 6= 0, i.e. when the signal is driven by both B and W (the correlated noise case). In the case when the observation is one dimensional one can solve this problem by using a method akin with Doss-Sussmann’s “pathwise solution” of a stochastic differential equation (see [29],[60]). This approach has been employed by Davis to extend the robustness result to the correlated noise case with scalar observation (see, [22, 23, 24, 26]). In this case one constructs first a diffeomorphism which is a pathwise solution of the equation15

Z t φ(t, x) = x + Z(φ(s, x)) ◦ dYt. (40) 0 The diffeomorphism is used to express the solution X of equation (32) as a composition between the diffeomorphism φ and the solution of a stochastic differential equation driven by B only and whose coefficients depend continuously on Y . As a result, we can make sense of Xy. Iy is then defined by a suitable (formal) integration by parts that produces a pathwise interpretation of the stochastic integral appearing in (34) and Y y is chosen to be y, as before. The robust representation formula is then introduced as per (36). To our knowledge, the correlated noise and multidimensional observation case has not been studied and it is the subject of the current work. In this case in turns out that we cannot hope to have robustness in the sense advocated by Clark. More precisely, there may not f dY exists a map continuous map θt : C([0, t], R ) → R, such that the representation (28) holds almost surely. The following is a simple example that illustrates this.

Example 21. Consider the filtering problem where the signal and the observation process solve the following pair of equations

Z t 1 Z t Xt = X0 + XrdYr + Xrdr 0 2 0 Z t Yt = h(Xr)dr + Wt, 0

1 where Y is 2-dimensional and P(X0 = 0) = P(X0 = 1) = 2 . Then with f, h such that f(0) = h1(0) = h2(0) = 0 one can explicitly compute

f(exp(Yt)) E[f(Xt)|Yt] = . (41)  X Z t 1 Z t  1 + exp − hk(exp(Y ))dY k + ||h(exp(Y )||2dr r r 2 r k=1,2 0 0

Following the findings of rough path theory (see, eg, [33, 49, 50, 51]) the expression on the right hand side of (41) is not continuous in supremum norm (nor in any other metric on path space) because of the stochastic integral. Explicitly, this follows, for example, from Theorem 1.1.1 in [51] by rewriting the exponential term as the solution to a stochastic differential equation driven by Y .

Nevertheless, we can show that a variation of the robustness representation formula still exists in this case. For this we need to ”enhance” the original process Y by adding a second component to it which consists of its iterated integrals (that, knowing the path, is in a one-to-one correspondence with the Lev`yarea process). Explicitly we consider the process

15Here dY = 1 and Y is scalar.

36 2. Stochastic differential equations with rough drift

Y = {Y t, t ≥ 0} defined as

  R t 1 1 R t 1 dY  0 Yr ◦ dYr ··· 0 Yr ◦ dYr Y t = Yt,  ·········  , t ≥ 0. (42) R t dY 1 R t dY dY 0 Yr ◦ dYr ··· 0 Yr ◦ dYr

2 dY ∼ The stochastic integrals in (42) are Stratonovich integrals. The state space of Y is G (R ) = d 16 R Y ⊕ so(dY ), where so(dY ) is the set of anti-symmetric matrices of dimension dY . Over this state space we consider not the space of continuous function, but a subspace C0,α that 2 d d contains paths η : [0, t] → G (R Y ) that are α-H¨olderin the R Y -component and 2α-H¨older in the so(dY )-component, where α is a suitably chosen constant α < 1/2. Note that there exists a modification of Y such that Y (ω) ∈ C0,α for all ω (Corollary 13.14 in [33]). The space C0,α is endowed with the α-H¨olderrough path metric under which C0,α becomes a complete metric space. The main result of the paper is that there exists a continuous map f 0,α θt : C → R, such that f πt(ϕ) = θt (Y ·) P-a.s. (43) Even though the map is defined on a slightly more abstract space, it nonetheless enjoys d the desirable properties described above for the case of a continuous version on C([0, t], R ). −1 0,α 17 Since P ◦ Y positively charges all non-empty open sets of C , the continuous version we construct will be unique. Also, it provides a certain model robustness, in the sense that f ¯ 2 f 2 ¯ E[(ϕ(Xt) − θt (Y·)) ] is well approximated by the idealized error E[(ϕ(Xt) − θt (Y·)) ], if Y· is close in distribution to Y·. The problem of discrete observation is a little more delicate. It is true, that the rough path lift Yˆ calculated from the linearly interpolated Brownian motion Yˆ will converge to the true rough path Y in probability as the mesh goes to zero (Corollary f ˆ f 13.21 in [33]), which implies that θt (Y ) is close in probability to θt (Y ) (we provide local Lipschitz estimates for θf ). In effect, most sensible approximations will do, as is for example shown in Chapter 13 in [33] (although, contrary to the uncorrelated case, not all interpolations that converge in uniform topology will work, see e.g. Theorem 13.24 ibid). But these are probabilistic statements, that somehow miss the pathwise stability that one wants to provide f with θt . If on the other hand, one is able to observe (at discrete time points), instead of only the process itself, also the second level, i.e. the area, one can construct an interpolating rough path by using geodesic interpolation (see e.g. Chapter 13.3.1 in [33]), which is close to the true (lifted) observation path Y in the relevant metric for all realizations Y ∈ C0,α. In the following we will make use of the Stratonovich version of equation (32), i.e. we will consider that the signal satisfies the equation Z t Z t Z t X k X j Xt = X0 + L0(Xr,Yr)dr + Zk(Xr,Yr) ◦ dYr + Lj(Xr,Yr)dBr , 0 0 0 k j (44) Z t Yt = h(Xr,Yr)dr + Wt. 0

j ¯j 1 P P j i 1 P j where L0(x, y) = l0(x, y) − 2 k i ∂xi Zk(x, y)Zk(x, y) − 2 k ∂yk Zk(x, y). We remind that under P0 the observation Y is a Brownian motion independent of B.

16 [1/α] d More generally, G (R ) is the ”correct” state space for a geometric α-H¨olderrough path; the space of such paths subject to α-H¨olderregularity (in rough path sense) yields a complete metric space under α-H¨older rough path metric. Technical details of geometric rough path spaces (as found e.g. in section 9 of [33]) are not required for understanding the results of the present paper. 17This fact is a consequence of the support theorem of Brownian motion in H¨olderrough path topology [34], see also Chapter 13 in [33].

37 2. Stochastic differential equations with rough drift

1 1 We will assume that f is a bounded Lipschitz function and we fix α ∈ ( 3 , 2 ), γ > 1/α, t > 0 and X0 is a bounded random vector independent of B and Y . We will use one of the following assumptions

γ+2 1 dY γ+2 1 (A1) Z1,...,ZdY ∈ Lip , h , . . . , h ∈ Lip and L0,L1,...,LdB ∈ Lip

γ+3 1 dY γ+3 1 (A1’) Z1,...,ZdY ∈ Lip , h , . . . , h ∈ Lip and L0,L1,...,LdB ∈ Lip

γ 1 dY γ 1 γ (A2) Z1,...,ZdY ∈ Lip , h , . . . , h ∈ Lip and L0 ∈ Lip , L1,...,LdB ∈ Lip Remark 22. Assumption (A1), (A1’) and (A2) lead to the existence of a solution of an SDEs with rough driver (Theorem 1). Under (A1) the solution mapping is locally uniformly continuous, under (A1’), (A2) it is locally Lipschitz.

Assume either (A1), (A1’) or (A2). For η ∈ C0,α there exists by Theorem 1 a solution (Xη,Iη) to the following SDE with rough drift

Z t Z t Z t η η η η η X η η ¯j Xt = X0 + L0(Xr ,Yr )dr + Z(Xr ,Yr )dηr + Lj(Xr ,Yr )dBr , 0 0 j 0 Z t η Yt = dηr, (45) 0 Z t 1 X Z t Iη = h(Xη,Y η)dη − D hk(Xη,Y η)dr. t r r r 2 k r r 0 k 0 Remark 23. Note that formally (!) when replacing the rough path η with the process Y , η η η X ,Y yields the solution to the SDE (44) and exp(It ) yields the (Girsanov) multiplicator in (33). This observation is made precise in the statement of Theorem 2.

f 1 0,α We introduce the functions g , g , θ : C → R defined as gf (η) gf (η) := ¯[f(Xη,Y η) exp(Iη)], g1(η) := ¯[exp(Iη)], θ(η) := , η ∈ C0,α. E t t t E t g1(η)

Theorem 24. Assume that (A1) holds, then θ is locally uniformly continuous. If (A1’) or (A2) hold, then θ is locally Lipschitz.

Proof. From Theorem 1 we know that for η ∈ C0,α the SDE with rough drift (45) has a unique solution (Xη,Y η,Iη) belonging to S2. Let now η, η0 ∈ C0,α. Denote X = Xη,Y = Y η,I = Iη and analogously for η0. Then

f f 0 0 |g (η) − g (η)| ≤ E[|f(Xt,Yt) exp(It) − f(Xt,Y ”t) exp(It)|] 0 0 0 0 ≤ E[|f(Xt,Yt)|| exp(It) − exp(It)|] + E[|f(Xt,Yt) − f(Xt,Yt )| exp(It)] 0 0 0 2 1/2 0 2 1/2 ≤ |f|∞E[| exp(It) − exp(It)|] + E[|f(Xt,Yt) − f(Xt,Yt )| ] E[| exp(It)| ] 0 2 1/2 0 1/2 ≤ |f|∞E[| exp(It) + exp(It)| ] E[|It − It|] 0 0 2 1/2 0 2 1/2 + E[|f(Xt,Yt) − f(Xt,Yt )| ] E[| exp(It)| ]

38 2. Stochastic differential equations with rough drift

Hence, using from Theorems 1+2 the continuity statements as well as the boundedness of exponential moments, we see that gf is locally uniformly continuous under (A1) and it is locally Lipschitz under (A1’) or (A2). The same then holds true for g1 and moreover g1(η) > 0. Hence θ is locally uniformly continuous under (A1) and locally Lipschitz under (A1’) or (A2).

0,α Denote by Y ·, as before, the canonical rough path lift of Y to C . We then have

Theorem 25. Assume either (A1), (A1’) or (A2). Then θ(Y ·) = πt(f), P − a.s.

Proof. To prove the statement it is enough to show that

f g (Y ·) = pt(f) P − a.s. which is equivalent to

f g (Y ·) = pt(f) P0 − a.s.

For that, it suffices to show that

f E0[pt(f)Υ(Y·)] = E0[g (Y ·)Υ(Y·)], (46) d for an arbitrary continuous bounded function Υ : C([0, t], R Y ) → R. ¯ ¯ ¯ Let (Ω, F, P) be the auxiliary probability space from before, carrying an dB-dimensional ¯ ˆ ˆ ˆ ¯ ¯ ¯ Brownian motion B. Let (Ω, F, P) := (Ω × Ω, F ⊗ F, P0 ⊗ P). By Y and X0 we denote also ˆ the ’lift’ of Y to Ω, i.e. Y (ω, ω¯) = Y (ω), X0(ω, ω¯) = X0(ω). Then (Y,B) (on Ω under P0) has the same distribution as (Y, B¯) (on Ωˆ under Pˆ). Denote by (X,ˆ Iˆ) the solution on (Ωˆ, F,ˆ Pˆ) to the SDE Z t Z t Z t ˆ ˆ X ˆ k X ˆ ¯j Xt = X0 + L0(Xr,Yr)dr + Zk(Xr,Yr) ◦ dYr + Lj(Xr,Yr)dBr , 0 k 0 j 0 X Z t 1 X Z t Iˆ = hk(Xˆ ,Y ) ◦ dY k − D hk(Xˆ ,Y )dr. t r r r 2 k r r k 0 k 0

Then Z · Z · ! X k k 1 X k (Y, X,ˆ Iˆ) ∼ Y,X, h (Xr,Yr) ◦ dY − D h (Xr,Yr)dr . Pˆ r k 0 2 0 k k P0 Hence, for the left hand side of (46), ! X Z t 1 X Z t [p (f)Υ(Y )] = [f(X ,Y ) exp hk(X ,Y ) ◦ dY k − D hk(X ,Y )dr Υ(Y )] E0 t · E0 t t r r r 2 k r r · k 0 k 0 ˆ ˆ ˆ  = E[f(Xt,Yt) exp It Υ(Y·)]

On the other hand, from Theorem 2 we know that for P0 − a.e ω

Y ·(ω) ˆ ¯ X (¯ω)t = Xt(ω, ω¯) for P − a.e. ω,¯ Y ·(ω) ˆ ¯ Y (¯ω)t = Yt(ω, ω¯) for P − a.e. ω,¯ Y ·(ω) ˆ ¯ I (¯ω)t = It(ω, ω¯) for P − a.e. ω.¯

39 2. Stochastic differential equations with rough drift

Hence, for the right hand side of (46) we get (using Fubini for the last equality)   f ¯ Y · Y · Y · E0[g (Y ·)Υ(Y·)] = E0[E[f(Xt ,Yt ) exp It ]Υ(Y·)] ¯ ˆ ˆ  = E0[E[f(Xt,Yt) exp It ]Υ(Y·)] ˆ ˆ ˆ  = E[f(Xt,Yt) exp It Υ(Y·)], which yields (46).

2.4.2 Pathwise stochastic control

In this section we consider controlled SDEs with rough drift. Similar to classical stochastic optimal control, we show that in this framework, the value function solves a certain PDE, which is now a rough PDE (in the sense of [10]); this is the content of Theorem 29. Without the theory of rough paths, Buckdahn and Ma ([8]) considered pathwise stochastic control in a purely Brownian setting. Heuristically, they fix the path of a component of the multidi- mensional Brownian motion driving the controlled SDEs and only then they optimize. We investigate the connection to our results in Theorem 31.

As in the previous section, let (Ω, F, Ft, P0) be a filtered probability space carrying a dY dimensional Brownian motion Y and and independent dB dimensional Brownian motion B. ¯ ¯ ¯ ¯ As before, (Ω, F, Ft, P) will be an auxiliary probability space, carrying a dB dimensional Brownian motion B¯, which serves as the solution space to the SDEs with rough drift. We shall need a version of Theorem 1 and Theorem 2 with stochastic and time dependent coefficients. The proofs are identical to the proofs under assumption (a1) above, so we omit them. Theorem 26. Let α ∈ (0, 1], γ > 1/α. 1 Let a(ω, t, x), b1(ω, t, x), ··· , bdB (ω, t, x) be adapted and Lip with respect to x, uniformly in γ+2 dS t and ω. Let c1, . . . , cdY ∈ Lip (R ). ∞ n d n Let S0 ∈ L (F0). Let η : [0, t] → R Y be smooth paths, such that η → η in α-H¨older,for 0,α n some η ∈ C and let S be a dS-dimensional process which is the unique solution to the classical SDE Z t Z t Z t n n n ¯ n n St = S0 + a(r, Sr )dr + b(r, Sr )dBr + c(Sr )dηr , 0 0 0

∞ 0 There exists a dS-dimensional process S ∈ S such that Sn → S∞, in S0. In addition, the limit Ξ(η) := S∞ only depends on η and not on the approximating sequence. Moreover, for all q ≥ 1, η ∈ C0,α it holds that Ξ(η) ∈ Sq and the corresponding mapping Ξ: C0,α → Sq is locally uniformly continuous. ˆ ¯ ˆ ¯ Let Ω = Ω × Ω be the product space, with product measure P := P0 ⊗ P. Let S be the unique solution on this probability space to the SDE Z t Z t Z t St = S0 + a(r, Sr)dr + b(r, Sr)dB¯r + c(Sr) ◦ dYr. 0 0 0 Denote by Y the rough path lift of Y (i.e. the enhanced Brownian Motion over Y ).

40 2. Stochastic differential equations with rough drift

Theorem 27. Under the assumptions of the previous theorem

• For every R > 0, q ≥ 1 ¯ sup E[exp(q|Ξ(η)|∞;[0,t])] < ∞. ||η||α−H¨ol

• For P0 − a.e. ω ¯ P[Ss(ω, ·) = Ξ(Y (ω))s(·), s ≤ t] = 1.

Consider for a smooth path η the controlled SDE X := Xs,x,ν

Z t Z t Z t Xt = x + a(r, Xr, νr)dr + b(r, Xr, νr)dB¯r + c(Xr)dηr, s s s where conditions an a, b, c will be given later on. Here ν ∈ U¯ = U¯(Ω¯, a, b), the progressively measurable processes such that

Z T Z T ¯ 2 n E[ |b(r, x, νr)| dr + |a(r, x, νr)|dr] < ∞, x ∈ R . 0 0

We define the value function ¯ s,x v(s, x) := inf E[g(XT )]. ν∈U¯

We have the following standard result in stochastic optimal control (remember that η is assumed to be smooth).

Lemma 28. Let a and b be continuous, bounded and Lipschitz continuous in x, uniformly in l t ∈ [0,T ], u ∈ R . Let c be bounded an Lipschitz. Let g be bounded and uniformly continuous. Then v is a viscosity solution to

n n n n n n n ∂tv (t, x) + H(t, x, v (t, x), ∂xv (t, x), ∂xxv (t, x)) + ∂xv (t, x) · c(x)η ˙t = 0, (t, x) ∈ [0,T ) × R (47) vn(T, x) = g(x). where 1 H(t, x, r, p, A) := inf{a(t, x, u)p + Tr[b(t, x, u)b(t, x, u)T A]}. (48) u 2

n Moreover v ∈ BUC([0,T ] × R , and it is the only solution to (47) in this space.

Proof. By Theorem 4.3.1 and Remark 4.3.4 in [58] we have that v is a (discontinuous) viscosity solution to

n n n n n n n ∂tv (t, x) + H(t, x, v (t, x), ∂xv (t, x), ∂xxv (t, x)) + ∂xv (t, x) · c(x)η ˙t = 0, (t, x) ∈ [0,T ) × R .

Since g is bounded, v is bounded, so the Comparison Theorem 4.4.4 in [58] applies and we get that v is actually continuous and the unique BUC solution to (47) (here we used also Remark 4.3.5 for the terminal condition).

41 2. Stochastic differential equations with rough drift

Now we get the corresponding statement, in the case where η is a rough path. Theorem 29. Let α ∈ (0, 1], γ > 1/α. Let η ∈ C0,α. Let c ∈ Lipγ+2. Let a and b be Lip1 l with respect to x, uniformly in t ∈ [0,T ], u ∈ R . Let ¯ s,x,ν v(s, x) = inf E[g(XT )] ν∈U¯ where X = Xs,x,ν is the solution to the following controlled SDE with rough drift Z t Z t Z t ¯ Xt = x + a(r, Xr, νr)dr + b(r, Xr, νr)dBr + c(Xr)dηr. s s s

Then v is the unique solution to the rough partial differential equation 18

2 [∂tv(t, x) + H(t, x, v(t, x), Dv(t, x),D v(t, x))]dt + Dv(t, x) · c(x)dηt = 0, (49) v(T, x) = g(x). (50)

Proof. By Theorem 1 and Example 5 in [10] there exists a unique solution v0 to the rough HJB equation (49). For the sake of unified notation, write η as η0. Let now ηk, k = 1, 2,... be a sequence of smooth path, such that ηk → η0 in p-variation. By Theorem 1 in [10] we have that v0(t, x) =v ˆ0(t, ψ0(t, x)) where φ0 is the flow (14) corre- sponding to the rough path η0, ψ0 is its x-inverse andv ˆ0 is the unique BUC solution to the PDE

0 0 0 0 2 0 ∂tvˆ (t, x) + Hˆ (t, x, vˆ (t, x),Dvˆ (t, x),D vˆ (t, x)) = 0, vˆ0(T, x) = g(ψ0(T, x)).

Here we define for k ≥ 0

Hˆ k(t, x, r, p, A) n k k k := inf a(t, φ (t, x), u) · hp, ∂xψ (t, φ (t, x))i u 1 + Tr[b(t, φk(t, x), u)b(t, φk(t, x), u)T 2  k k k k k  o × hA, ∂xψ (t, φ (t, x)) ⊗ ∂xψ(t, φ (t, x))i + hp, ∂xxψ (t, φ (t, x))i ] n 1 o = inf ˜b(t, x, u)p + Tr[˜σ(t, x, u)˜σ(t, x, u)T A] , u 2 where ˜b, σ˜ were defined in Lemma 3. By Lemma 28 we have that 0 ¯ 0 ˆ s,x,ν vˆ (s, x) = inf E[g(ψ (T, XT ))] ν∈U¯ where Xˆ s,x,ν is the controlled diffusion Z t Z t ˆ s,x,ν ˆ s,x,ν ˜ ˆ s,x,ν Xt = x + a˜(r, X , νr)dr + b(r, X , νr)dWr. (51) s s 18This type of rough partial differential equation has been studied in [10].

42 2. Stochastic differential equations with rough drift

˜ s,x,ν s,x,ν Now by Lemma 3 we know that Xt := φ(t, Xt ) satsifies the SDE (51), with starting point φ(s, x). Hence we can conclude

0 0 ¯ 0 ˆ s,φ(s,x),ν v (s, x) =v ˆ (s, φ(s, x)) = inf E[g(ψ (T, XT ))] ν∈U¯ ¯ 0 ˜ s,x,ν ¯ s,x,ν = inf E[g(ψ (T, XT ))] = inf E[g(XT )]. ν∈U¯ ν∈U¯

We now investigate the connection to the results in [8]. Denote now Xrp := Xrp,s,x,ν,η, ν ∈ U¯, as the solution to the controlled SDE with rough drift Z t Z t Z t rp rp rp ¯ rp Xt = x + a(r, Xr , νr)dr + b(r, Xr , νr)dBr + c(Xr )dηr. s s s

Let

L(s, x, η) := inf ¯[g(Xrp,s,x,ν,η)]. ν E T Lemma 30. Let α ∈ (0, 1], γ > 1/α. 1 l γ+2 Let a and b be in Lip with respect to x, uniformly in t ∈ [0,T ], u ∈ R . Let c ∈ Lip . Let 1 n g ∈ Lip (R ). For fixed s, x and R > 0 the mapping

0,α C → R η 7→ L(s, x, η) is uniformly continuous on {||η||C0,α < R}.

Proof. By Theorem 26 the mapping that takes a rough path to the corresponding solution of SDE with rough drift is uniformly continuous on bounded sets. Inspection of the proof, 1 l using the fact that a and b are in Lip with respect to x, uniformly in t ∈ [0,T ], u ∈ R , yields that the solution mapping is locally uniform continuous, uniform over all controls ν. The trivial inequality

¯ rp,s,x,ν,η1 ¯ rp,s,x,ν,η2 ¯ rp,s,x,ν,η1 ¯ rp,s,x,ν,η2 | inf E[g(XT )] − inf E[g(XT )]| ≤ sup |E[g(XT )] − E[g(XT )]| ν ν ν then yields the desired continuity.

ˆ ¯ ˆ ¯ Let Ω = Ω × Ω be the product space, with product measure P := P0 ⊗ P. Consider, following s,x,ν ˆ B [8], the controlled SDE X := X on Ω where ν is Ft adapted Z t Z t Z t Xt = x + a(r, Xr, νr)dr + b(r, Xr, νr)dB¯r + c(Xr) ◦ dYr. s s s where conditions an a, b, c will be given later on. Here ν ∈ Uˆ = Uˆ(Ωˆ, a, b), the progressively measurable processes such that

Z T Z T ¯ 2 n E[ |b(r, x, νr)| dr + |a(r, x, νr)|dr] < ∞, x ∈ R . 0 0

43 2. Stochastic differential equations with rough drift

Denote ˆ s,x,ν Y v(s, x) := ess inf E[g(XT )|F ]. ν∈Uˆ

Let Y be the rough path lift of the Brownian motion Y . 1 l Theorem 31. Let σ and b be in Lip with respect to x, uniformly in t ∈ [0,T ], u ∈ R . Let 1 n g ∈ Lip (R ). Y For P0-a.e. ω we have L(s, x, Y(ωY )) = v(s, x)(ωY ). Remark 32. Let us shortly remark, that in [8], a and b are assumed to be uniformly contin- uous (also in the control state). The assumption on c is essentially the same, which comes as no surprise, as they also use a kind of Doss-Sussmann transform in order to deal with the problem. Our approach leads to the same solution, as Theorem 31 shows, but avoids many of the technical problems in [8].

Proof. “≤”: We have seen already in Theorem 25, that for every ν ¯ rp,s,x,ν,Y(ωY ) ˆ s,x,ν Y Y E[g(XT )] = E[g(XT )|Y = ω ], for P0 − a.e. ω . (52) Now by definition Y ¯ rp,s,x,ν,Y(ωY ) Y L(s, x, Y(ω )) ≤ E[g(XT )], ∀ν, ∀ω . Hence for every ν Y ˆ s,x,ν Y Y Y L(s, x, Y(ω )) ≤ E[g(XT )|Y = ω ], for P − a.e. ω . By using Lemma 30 we see that ωY 7→ L(s, x, Y(ωY )) is measurable. Hence by definition of the essential infimum, we have Y Y Y Y L(s, x, Y(ω )) ≤ v(s, x)(ω ), for P − a.e. ω .

“≥”:

Using (52), it is enough to show that for every δ > 0 there exists Ω1, Ω2, · · · ∈ F and ν1, ν2, ··· ∞ such that ∪i=1Ωi = Ω and ¯ s,x,νi,Y E[g(XT ] ≤ L(s, x, Y) + δ, on Ωi.

0,α For all η ∈ C let νη be such that ¯ s,x,νη,η E[g(XT )] ≤ L(s, x, η) + δ/2. Define 0,α ¯ s,x,νη,ξ Aη := {ξ ∈ C : E[g(XT )] ≤ L(s, x, ξ) + δ}. ¯ s,x,ν,η Since η 7→ E[g(XT )] and η 7→ L(s, x, η) are continuous, the Aη give an open cover of 0,α 0,α C . Since C is separable, there exists η1, η2, ··· such that ∞ 0,α ∪i=1Aη1 = C . −1 Then Ωi := Y (Ai) and νi := νηi yield the desired properties.

44 2. Stochastic differential equations with rough drift

2.4.3 Mixed stochastic differential equations

¯ ¯ ¯ ¯ Let (Ω, F, (Ft)t≥0, P) be a filtered probability space carrying a dB dimensional Brownian motion B. Let (Ω, F, (Ft)t≥0, P) be a filtered probability space carrying a dY dimensional fractional Brownian motion Y with Hurst parameter H > 1/2. Denote the product space Ωˆ := Ω × Ω,¯ with corresponding filtration and the product measure Pˆ := P ⊗ P¯. On this space we consider the mixed Brownian-fractional Brownian motion SDE

Z t Z t Z t Xt = X0 + a(Xr)dr + b(Xr)dB¯r + c(Xr)dYr. (53) 0 0 0

Such equations have for example been considered by Zaehle [63], although only for one- dimensional Brownian motion B¯ and essentially Lip2 vector fields. We also mention the monograph [53], which considers applications to financial mathematics for equations of this type. Let us note, that owing to H > 1/2, a joint rough path lift of B,Y¯ can easily be constructed (even without the results of Theorem 1, assumption (a2)). Hence (53) could be solved directly as RDE, if b is essentially Lip2. The main appeal of the following theorem is then, that we get away with a lot less regularity.

Theorem 33. Let γ > 1/H. Assume a Lipschitz, b1, . . . , bdB Lipschitz and c1, . . . , cdY ∈ γ dS ∞ ˆ Lip (R ). Let X0 ∈ L (F0). Then there exists a unique solution to the mixed SDE (53).

Proof. Uniqueness follows from Theorem 12, we show existence. On Ωˆ denote the classical solution to the SDE Z t Z t Z t n n n ¯ n n Xt = X0 + a(Xr )dr + b(Xr )dBr + c(Xr )dYr , (54) 0 0 0 where Y n denotes the piecewise linear interpolation of Y . n ¯ Now for every n there exists Ωn, P[Ωn = 1] such that for ω ∈ Ωn, X (ω, ·) solves (54) on Ω. 0 Let Ω := ∩nΩn. Let p > 2 such that 1/p + H > 1. Define the measurable set

A := {ωˆ ∈ Ω:ˆ Xn(ˆω) converges in C0,p−var}.

n Now by stability of the integrals we have that X := limn X solves (53) on A. We show Pˆ[A] = 1. For every ω ∈ Ω0, Xn solves the SDE with rough drift Y n(ω) on Ω¯ for every n. Hence there n ¯ exists Mω such that X (ω, ω¯) converges forω ¯ ∈ Mω, and P[Mω] = 1. Then

0 {(ω, ω¯): ω ∈ Ω , ω¯ ∈ Mω} ⊂ A, an application of Fubini’s theorem yields Pˆ[A] = 1, as desired.

45 3. Backward stochastic differential equations with rough driver

3 Backward stochastic differential equations with rough driver

We recall that backward stochastic differential equations (BSDEs) are stochastic equations of the type Z T Z T Yt = ξ + f (r, Yr,Zr) dr − ZrdWr. (55) t t   Here, W is an m-dimensional Brownian motion on some filtered probability space Ω, F, (Ft)0≤t≤T , P . m The terminal data ξ is assumed to be FT -measurable, the driver f :Ω × [0,T ] × R×R → R is a predictable random field; a solution to this equation is a (1 + m)-dimensional adapted solution process of the form (Yt,Zt)0≤t≤T ; subject to some integrability properties depend- ing on the framework imposed by the type of assumptions on f. Equation (55) can also be written in differential form

−dYt = f (t, Yt,Zt) dt − ZtdWt.

The aim of this section, partially motivated from the recent progress on partial differential equations driven by rough path [9, 10, 36, 27, 61], is to consider

−dYt = f (t, Yt,Zt) dt + H(Yt)dηt − ZtdWt, where η is (at first) a smooth d-dimensional driving signal - accordingly H = (H1,...,Hd) - followed by a discussion in which we establish rough path stability of the solution process (Y,Z) as a function of η. Note that we do not establish any sort of rough path stability in W . Indeed when f ≡ 0 in (55), BSDE theory reduces to martingale representation, an intrinsically stochastic result which does not seem amenable to a rough pathwise approach. 19 We are able to carry out our analysis in a framework in which the ω-dependence of the terms driven n by η factorizes through an Itˆoprocess. That is, we consider, for fixed (t0, x0) ∈ [0,T ] × R ,

n dXt = b (ω; t) dt + σ (ω; t) dWt, t0 ≤ t ≤ T ; Xt0 = x0 ∈ R , ∞ −dYt = f (ω; t, Yt,Zt) dt + H (Xt,Yt) dη − ZtdW, t0 ≤ t ≤ T ; YT = ξ ∈ L (FT ) .

Our main-result is, under suitable conditions on f and H = (H1,...,Hd), that any sequence (ηn) which is Cauchy in rough path metric gives rise to a solution (Y,Z) of the BSDE with rough driver −dYt = f (ω; t, Yt,Zt) dt + H (Xt,Yt) dη − ZtdWt, (56) where η denotes the (rough path) limit of (ηn) and where indeed (Y,Z) depends only on η and not on the particular approximating sequence. An interesting feature of this result, which somehow encodes the particular structure of the above equation, is that one does not need to construct or understand the iterated integrals of η and W ; but only those of η which is tantamount to speak of the rough path η. This is in strict contrast to the usual theory of rough differential equations in which both dη and dW figure as driving differentials, e.g. in equations of the form dy = V1(y)dη + V2(y)dW .

If we specialize to a fully Markovian setting, say ξ = g (XT ) , σ (ω; t) = σ (t, Xt(ω)) , b (ω; t) = b (t, Xt(ω)) , f (ω; t, y, z) = f (t, Xt (ω) , y, z) ,H = H (Xt,Yt), we find that the solution to

19See however the recent work of Liang et al. [47] in which martingale representation is replaced by an abstract transformation.

46 3. Backward stochastic differential equations with rough driver

(56), evaluated at t = t0, yields a solution to the (terminal value problem of the) rough partial differential equation

−du = (Lu) dt + f (t, x, u, Du σ(t, x)) dt + H (x, u) dη, uT (x) = g (x) , where L denotes the generator of X. If one is interested in the Cauchy problem,u ˜ (t, x) = u (T − t, x) satisfies,

du˜ = (Lu˜) dt + f (x, u,˜ Du˜ σ(t, x)) dt + H (x, u˜) dη,˜ u˜0 (x) = g (x) , (57) where η˜ = η (T − ·). To the best of our knowledge, (56) is the first attempt to introduce rough path methods [49, 51, 50, 33] in the field of backward stochastic differential equations [56, 30, 43]. Of course, there are many hints in the literature towards the possibility of doing so: we mention in particular the Pardoux-Peng [55] theory of backward doubly stochastic differential equations (BDSDEs) which amounts to replacing dη in (56) by another set of Brownian differentials, say dB, independent of W . This theory was then employed by Buckdahn and Ma [7] to construct (stochastic viscosity) solutions to (57) with dη replaced by a Brownian differential and the assumption that the vector fields H1 (x, ·) ,...,Hd (x, ·) commute. In Section 3.2 we state and prove our main result concerning the existence and uniqueness of BSDEs with rough drivers. Section 3.3 specializes the setting to a purely Markovian one. In this context BSDEs with rough drivers are connected to rough partial differential equations, which we analyze in their own right. In Section 3.4 we establish the connection to BDSDEs.

3.1 Notation

We fix once and for all a filtered probability space (Ω, F, (F)t, P), which carries an m- 2 m dimensional Brownian motion W . Let Ft be the usual filtration of W . Denote by H[0,T ](R ) m 2 R T 2 the space of predictable processes X in R such that ||X|| := E[ 0 |X|rdr] < ∞. Denote by ∞ H[0,T ](R) the space of predictable processes that are almost surely bounded. We will say a ∞ sequence converges in H if it converges uniformly on [0,T ], P-a.s. For a random variable ξ we denote by ||ξ||∞ its essential supremum, for a process Y we denote by ||Y ||∞ the essential supremum of sup0≤t≤T |Yt|. n n BC([0,T ] × R ) (resp. BC(R )) denotes the space of bounded continuous functions on n n [0,T ] × R (resp. R ) with the topology of uniform convergence on compacta. Similarly n n BUC([0,T ] × R ) (resp. BUC(R )) denotes the space of bounded uniformly continuous n n functions on [0,T ] × R (resp. R ) with the topology of uniform convergence on compacta.

3.2 Main results

d ∞ For a smooth path η in R and ξ ∈ L (FT ) we consider the BSDE Z T Z T Z T Yt = ξ + f(r, Yr,Zr)dr + H(Xr,Yr)dη(r) − ZrdWr, t ≤ T, (58) t t t n where the R -valued semimartingale X has the form Z t Z t Xt = x + σrdWr + brdr. 0 0

47 3. Backward stochastic differential equations with rough driver

n R T Here, H = (H1,...,Hd) with Hk : R × R → R, k = 1, . . . , d and t H(Xr,Yr)dη(r) := Pd R T k k=1 t Hk(Xr,Yr)η ˙ (r)dr. W is an m-dimensional Brownian motion (hence Z is a row m×1 m m vector taking values in R identified with R ). f :Ω×[0,T ]×R×R → R is a predictable n n×m random function, x ∈ R , σ is a predictable process taking values in R , b is a predictable n process taking values in R . Definition 34. We call equation (58) BSDE with data (ξ, f, H, η). We call (Y,Z) a solution ∞ 2 m if Y ∈ H[0,T ](R),Z ∈ H[0,T ](R ) and (58) is satisfied.

For a vector x we denote the Euclidean norm as usual by |x|. For a matrix X we denote by |X|, depending on the situation, either the 1-norm (operator norm), the 2-norm (Euclidean norm) or the ∞-norm (operator norm of the transpose). This slight abuse of notation will not lead to confusion, as all inequalities will be valid up to multiplicative constants. We introduce the following assumptions:

(A1) There exists a constant Cσ > 0 such that P − a.s. for t ∈ [0,T ]

|σt(ω)| ≤ Cσ.

(A2) There exists a constant Cb > 0 such that P − a.s. for t ∈ [0,T ]

|bt(ω)| ≤ Cb.

m 20 (F1) There exists a constant C1,f > 0 such that P − a.s. for (t, y, z) ∈ [0,T ] × R × R

2 |f(ω; t, y, z)| ≤ C1,f + C1,f |z| ,

|∂zf(ω; t, y, z)| ≤ C1,f + C1,f |z|.

m (F2) There exists a constant C2,f > 0 such that P − a.s. for (t, y, z) ∈ [0,T ] × R × R

∂yf(ω; t, y, z) ≤ C2,f .

For given real numbers γ > p ≥ 1 we have the following assumption:

(Hp,γ) Let H(x, ·) = (H1(x, ·),...,Hd(x, ·)) be a collection of vector fields on R, parameterized n by x ∈ R . Assume that for some CH > 0, we have joint regularity of the form

sup |H | γ+2 n+1 ≤ C . i Lip (R ) H i=1,...,d

As a consequence of Theorem 2.3 and Theorem 2.6 in [43], we get the following

n Lemma 35. Assume (A1), (A2), (F1), (F2) and let H be Lipschitz on R × R. Let ξ ∈ ∞ L (FT ) and a smooth path η be given. Then there exists a unique solution to the BSDE with data (ξ, f, H, η).

20 When we use partial derivatives, we assume implicitly that the function in question is continuously differentiable in the respective variable. In fact, throughout, it would suffice to assume (local) Lipschitzness and bound the Lipschitz constant analogously.

48 3. Backward stochastic differential equations with rough driver

We want to give meaning to equation (58), where the smooth path η is replaced by a gen- 0,p−var [p] d [p] d eral geometric rough path η ∈ C ([0,T ],G (R )), where G (R ) is the free step-[p] d d d [p] nilpotent group over R , realized as subset of R ⊕ R ⊕ · · · ⊕ (R ) , equipped with Carnot- Caratheodory metric We give our main result, the proof of which we present at the end of the section. n d n Theorem 36. Let p ≥ 1, γ > p and η , n = 1, 2,..., be smooth paths in R . Assume η → η 0,p−var [p] d ∞ in p-variation, for a path η ∈ C ([0,T ],G (R )). Let ξ ∈ L (FT ). Let f be a random function satisfying (F1) and (F2). Moreover, assume (A1), (A2) and (Hp,γ). For n ≥ 1 denote by (Y n,Zn) the solutions to the BSDE with data (ξ, f, H, ηn). ∞ 2 Then there exists a process (Y,Z) ∈ H[0,T ] × H[0,T ] such that n Y → Y uniformly on [0,T ] P − a.s., n 2 Z → Z in H[0,T ].

The process is unique in the sense, that it only depends on the limiting rough path η and not on the approximating sequence. We write (formally 21)

Z T Z T Z T Yt = ξ + f(r, Yr,Zr)dr + H(Xr,Yr)dη(r) − ZrdWr. (59) t t t Moreover, the solution mapping 0,p−var [p] d ∞ ∞ 2 C ([0,T ],G (R )) × L (FT ) → H[0,T ] × H[0,T ], (η, ξ) 7→ (Y,Z) is continuous.

The problem in showing convergence of the processes (Y n,Zn) in the statement of the theorem lies in the fact, that in general the Lipschitz constants for the corresponding BSDEs will tend to infinity as n → ∞. It does not seem possible then, to directly control the solutions via a priori bounds, a standard tool in the theory of BSDEs (see e.g. [30]). We will take another approach and transform the BSDEs corresponding to the smooth paths ηn into BSDEs which are easier to analyze. We start by defining the flow (parametrized by x)

Z T d X k φ(t, x, y) = y + Hk(x, φ(r, x, y))dη (r). (60) t k=1 Let φ−1 be the y-inverse of φ, then

Z T d −1 X −1 k φ (t, x, y) = y − ∂yφ (r, x, y)Hk(x, y)dη (r). t k=1

We have the following

21The ”integral” R H(X,Y )dη is not a rough integral defined in the usual rough path theory (e.g. [51] or [33]); regularity issues aside one misses the iterated integrals of X (and thus W ) against those of η. For what it’s worth, in the present context (59) can be taken as an implicit definition of R H(X,Y )dη. (Somewhat similar in spirit: F¨ollmer’sItˆo’sintegral which appears in his Itˆoformula sans probabilit´e.) More pragmatically, notation (59) is justified a posteriori through our uniqueness result; in addition it is consistent with standard BSDE notation when η happens to be a smooth path.

49 3. Backward stochastic differential equations with rough driver

n Lemma 37. Assume (A1), (A2), (F1), (F2) and let H be Lipschitz on R × R. Let ξ ∈ ∞ L (FT ) and a smooth path η be given and let φ be the corresponding flow defined in (60). Let (Y,Z) be the unique solution to the BSDE with data (ξ, f, H, η). Then, the process (Y,˜ Z˜) defined as

˜ −1 ∂xφ(t, Xt, Yt) 1 Y˜t := φ (t, Xt,Yt), Z˜t := − σt + Zt, ∂yφ(t, Xt, Y˜t) ∂yφ(t, Xt, Y˜t) satisfies the BSDE

Z T Z T Y˜t = ξ + f˜(r, Xr, Y˜r, Z˜r)dr − Z˜rdWr, (61) t t where (throughout, φ and all its derivatives will always be evaluated at (t, x, y˜))

˜ 1 n 1  T  f(t, x, y,˜ z˜) := f (t, φ, ∂yφz˜ + ∂xφσt) + h∂xφ, bti + Tr ∂xxφσtσt ∂yφ 2 1 o + hz,˜ (∂ φσ )T i + ∂ φ|z˜|2 . xy t 2 yy Remark 38. This (”Doss-Sussman”) transformation is well known and has been recently applied to BDSDEs [7] and rough partial differential equations [32]. We include details for the reader’s convenience.

−1 Proof. Denoting ψ := φ and θr := (r, Xr,Yr), we have by Itˆoformula

Z T d Z T Z T X ˙k ψ(t, Xt,Yt) = ξ − ∂yψ(θr)Hk(Xr,Yr)η (r)dr − h∂xψ(θr), bridr − h∂xψ(θr), σrdWri t k=1 t t Z T Z T d Z T X ˙k + ∂yψ(θr)f(r, Yr,Zr)dr + ∂yψ(θr)Hk(Xr,Yr)η (r)dr − ∂yψ(θr)ZrdWr t t k=1 t Z T Z T Z T 1  T  1 2 T − Tr ∂xxψ(θr)σrσr i dr − ∂yyψ(θr)|Zr| dr − h∂xyψ(θr), σrZr idr 2 t 2 t t Z T h 1  T  = ξ + ∂yψ(θr)f(r, Yr,Zr) − h∂xψ(θr), bri − Tr ∂xxψ(θr)σrσr t 2 1 i − ∂ ψ(θ )|Z |2 − h∂ ψ(θ ), σ ZT i dr 2 yy r r xy r r r Z T − h∂xψ(θr)σr + ∂yψ(θr)Zr, dWri. t

Now, by deriving the identity ψ(t, x, φ(t, x, y˜)) =y ˜ we get

0 = ∂xψ + ∂yψ∂xφ,

0 = ∂xxψ + ∂yxψ ⊗ ∂xφ + [∂xyψ + ∂yyψ∂xφ] ⊗ ∂xφ + ∂yψ∂xxφ

= ∂xxψ + 2∂xyψ ⊗ ∂xφ + ∂yyψ∂xφ ⊗ ∂xφ + ∂yψ∂xxφ,

1 = ∂yψ∂yφ,

0 = ∂xyψ∂yφ + ∂yyψ∂xφ∂yφ + ∂yψ∂xyφ, 2 0 = ∂yyψ(∂yφ) + ∂yψ∂yyφ.

50 3. Backward stochastic differential equations with rough driver

And hence

∂yyφ ∂xφ ∂yyφ ∂xyφ ∂yyψ = − 3 , ∂xψ = − , ∂xyψ = 3 ∂xφ − 2 , (∂yφ) ∂yφ (∂yφ) (∂yφ)   ∂yyφ ∂xyφ ∂xxφ 1 ∂xxψ = 2 3 ∂xφ − 2 ⊗ ∂xφ + 3 ∂xφ ⊗ ∂xφ − ∂xxφ. (∂yφ) (∂yφ) (∂yφ) ∂yφ

If we define

−1 Y˜t := ψ(t, Xt,Yt) = φ (t, Xt,Yt),

Z˜t := ∂xψ(t, Xt,Yt)σt + ∂yψ(t, Xt,Yt)Zt

∂xφ(t, Xt, Y˜t) 1 = − σt + Zt, ∂yφ(t, Xt, Y˜t) ∂yφ(t, Xt, Y˜t) and (ψ and its derivatives are always evaluated at (t, x, φ(t, x, y˜)), φ and its derivatives are evaluated at (t, x, y˜))   ˜ ∂xφσt 1  T  f(t, x, y,˜ z˜) := ∂yψf t, φ, ∂yφ(˜z + ) − h∂xψ, bti − Tr ∂xxψσtσt ∂yφ 2  T 1 z˜ − ∂xψσt 2 z˜ − ∂xψσt − ∂yyψ| | − h∂xyψ, σt i 2 ∂yψ ∂yψ 1 n 1  T  = f (t, φ, ∂yφz˜ + ∂xφσt) + h∂xφ, bti + Tr ∂xxφσtσt ∂yφ 2 1 o + hz,˜ (∂ φσ )T i + ∂ φ|z˜|2 , xy t 2 yy we therefore obtain Z T Z T Y˜t = ξ + f˜(r, x, Y˜r, Z˜r)dr − Z˜rdWr. t t

Definition 39. We call equation (61) BSDE with data (ξ, f,˜ 0, 0).

The BSDE (58) only makes sense for a smooth path η. On the other hand, equation (60) yields 0,p−var [p] d a flow of diffeomorphisms for a general geometric rough path η ∈ C ([0,T ],G (R )), p ≥ 1. Hence we can, also in this case, consider the function f˜ from the previous lemma. We now record important properties for this induced function.

0,p−var [p] d Lemma 40. Let p ≥ 1, η ∈ C ([0,T ],G (R )) and γ > p. Assume (A1), (A2), (F1), (F2) and (Hp,γ). Let φ be the flow corresponding to equation (60) (now solved as a rough differential equation). Then the function

˜ 1 n 1  T  f(t, x, y,˜ z˜) := f (t, φ, ∂yφz˜ + ∂xφσt) + h∂xφ, bti + Tr ∂xxφσtσt (62) ∂yφ 2 1 o + hz,˜ (∂ φσ )T i + ∂ φ|z˜|2 xy t 2 yy satisfies the following properties:

51 3. Backward stochastic differential equations with rough driver

˜ • There exists a constant C1,f > 0 depending only on Cσ, Cb, C1,f , CH and ||η||p−var;[0,T ] such that

˜ 2 |f(t, x, y,˜ z˜)| ≤ C˜1,f + C˜1,f |z˜| , ˜ |∂z˜f(t, x, y,˜ z˜)| ≤ C˜1,f + C˜1,f |z˜|.

˜ • There exists a constant Cunif > 0 that only depends on Cσ, Cb, C2,f , CH and ||η||p−var;[0,T ] such that for every ε there exists an hε > 0 that only depends on Cσ, Cb, CH and ||η||p−var;[0,T ] such that on [T − hε,T ] we have

˜ 2 ∂y˜f(t, x, y,˜ z˜) ≤ C˜unif + ε|z˜| .

Proof. (i). Note that

˜ 1  1  T  |f(t, x, y,˜ z˜)| ≤ | | |f (t, φ, ∂yφz˜ + ∂xφσt) | + |h∂xφ, bti| + | Tr ∂xxφσtσt | ∂yφ 2 1  + |hz,˜ (∂ φσ )T i| + | ∂ φ||z˜|2 xy t 2 yy 1  2 1 T ≤ | | C1,f + C1,f |∂yφz˜ + ∂xφσt| + |∂xφ||bt| + |∂xxφ||σtσt | ∂yφ 2 1  + |z˜||∂ φσ | + |∂ φ||z˜|2 xy t 2 yy 1  2 T 1 2 ≤ | | C1,f + C1,f 2(|∂yφ| |z˜| + |∂xφ||σt |) + |∂xφ||bt| + |∂xxφ||σt| ∂yφ 2 1  + |z˜||∂ φ||σT | + |∂ φ||z˜|2 xy t 2 yy 2 ≤ C˜1,f + C˜1,f |z˜| .

Here we have used (A1), (A2) and (F1). For the boundedness of the flow and its derivatives we have used Lemma 3.4. Note that C˜1,f hence only depends on Cσ, Cb, C1,f , CH and ||η||p−var;[0,T ]. (ii). Note that 1   |∂z˜f˜(t, x, y,˜ z˜)| = |∂zf (t, φ, ∂yφz˜ + ∂xφσt) + ∂xyφσt + ∂yyφz˜ | ∂yφ ∂xyφ ∂yyφ ≤ C1,f + C1,f (|∂yφ||z˜| + |∂xφ||σt|) + | ||σt| + | ||z˜| ∂yφ ∂yφ

≤ C˜1,f + C˜1,f |z˜|.

Here we have used (A1), (A2) and (F1). For the boundedness of the flow and its derivatives we have used Lemma 3.4. Note that again, C˜1,f hence only depends on Cσ, Cb, C1,f , CH and ||η||p−var;[0,T ]. Without loss of generality we can choose it to be the same constant as in the estimate for (i).

52 3. Backward stochastic differential equations with rough driver

(iii). Note that

˜ ∂yyφ n 1  T  ∂y˜f(t, x, y,˜ z˜) = − 2 f (t, φ, ∂yφz˜ + ∂xφσt) + h∂xφ, bti + Tr ∂xxφσtσt (∂yφ) 2 1 o + hz,˜ (∂ φσ )T i + ∂ φ|z˜|2 xy t 2 yy 1 n 1  T  + ∂yφ∂yf (t, φ, ∂yφz˜ + ∂xφσt) + h∂yxφ, bti + Tr ∂yxxφσtσt ∂yφ 2 1 o + hz,˜ (∂ φσ )T i + ∂ φ|z˜|2 . yxy t 2 yyy Hence using our assumptions on f we get

˜ ∂yyφ n 2 1 2 ∂y˜f(t, x, y,˜ z˜) ≤ | 2 | C2,f + C2,f |∂yφz˜ + ∂xφσt| + |∂xφ||bt| + |∂xxφ||σt| (∂yφ) 2 1 o + |z˜||∂ φ||σ | + |∂ φ||z˜|2 xy t 2 yy 1 n 1 + ∂yf (t, φ, ∂yφz˜ + ∂xφσt) + |∂yxφ||bt| + |∂yxxφ||σt| ∂yφ 2 1 o + (1 + |z˜|2)|∂ φ||σ | + ∂ φ|z˜|2 yxy t op 2 yyy ∂yyφ n 2 2 1 2 o ≤ | 2 | C2,f + C2,f 2|∂xφ| |σt| + |∂xφ||bt| + |∂xxφ||σt| + |∂xyφ||σt| (∂yφ) 2

+ ∂yf (t, φ, ∂yφz˜ + ∂xφσt) 1 n 1 o + |∂yxφ||bt| + |∂yxxφ||σt| + |∂yxyφ||σt| ∂yφ 2

n ∂yyφ 2 ∂yyφ ∂yyφ 1 + | 2 |C2,f 2|∂yφ| + | 2 ||∂xyφ||σt| + | 2 | |∂yyφ| (∂yφ) (∂yφ) (∂yφ) 2 1 T 1 1 o 2 + |∂yxyφ||σt | + ∂yyyφ |z˜| ∂yφ ∂yφ 2 ˜ n ∂yyφ 2 ∂yyφ ∂yyφ 1 ≤ Cunif + | 2 |C2,f 2|∂yφ| + | 2 ||∂xyφ||σt| + | 2 | |∂yyφ| (∂yφ) (∂yφ) (∂yφ) 2 1 T 1 1 o 2 + |∂yxyφ||σt | + ∂yyyφ |z˜| , ∂yφ ∂yφ 2 ˜ where Cunif only depends on Cσ, Cb, CH and ||η||p−var;[0,T ] (here we have used Lemma 3.4 to bound the flow and its derivatives). By (A1), (A2) σ and b are bounded. Then, by the properties of the flow, the term in front of 2 |z˜| goes uniformly to zero as t approaches T . To be specific: using (Hp,γ) we obtain, again by Lemma 3.4, that for every ε > 0 there exists an hε > 0, depending on Cσ, Cb, CH and ||η||p−var;[0,T ] such that on [T − hε,T ] we have

˜ 2 ∂y˜f(t, x, y,˜ z˜) ≤ C˜unif + ε|z˜| .

We are now ready to prove Theorem 36.

Proof of Theorem 36. For the sake of unified notation, we write (Y 0,Z0) for the (rough BSDE) solution (Y,Z); similarly, we write η0 for the rough path η.

53 3. Backward stochastic differential equations with rough driver

1. Existence Let φn, n ≥ 0 be the (ODE, for n ≥ 1 and RDE, when n = 0) solution flow (parametrized by x) Z T φn(t, x, y) = y + H(x, φn(r, x, y))dηn(r). t

n n 3 By Lemma 3.4, we have for all n ≥ 0, x ∈ R , that φ (t, x, ·) is a flow of C -diffeomorphisms. Let ψn(t, x, ·) be its y-inverse. We have that φn(t, ·, ·) and its derivatives up to order three are bounded (Lemma 3.4). The same holds true for ψn(t, ·, ·) and its derivatives up to order three. n Moreover, by Lemma 3.5 we have that locally uniformly on [0,T ] × R × R

n 1 n n n n n 0 1 0 0 0 0 0 (φ , n , ∂yφ , ∂yyφ , ∂xφ , ∂xxφ , ∂yxφ ) → (φ , 0 , ∂yφ , ∂yyφ , ∂xφ , ∂xxφ , ∂yxφ ). ∂yφ ∂yφ (63)

Denote for n ≥ 0 the function

˜n 1 n n n n n 1  n T  f (t, x, y,˜ z˜) := n f (t, φ , ∂yφ z˜ + ∂xφ σt) + h∂xφ , bti + Tr ∂xxφ σtσt ∂yφ 2 1 o + hz,˜ (∂ φnσ )T i + ∂ φn|z˜|2 . xy t 2 yy Now, we have seen above that for n ≥ 1, the process (Y˜ n, Z˜n) := Ln(Y n,Zn) n n −1 n n −1 n ∂xφ (·,X·, (φ ) (·,X·,Y· )) 1 n := ((φ ) (·,X·,Y· ), − n n −1 n σ· + n n −1 n Z· ). ∂yφ (·,X·, (φ ) (·,X·,Y· )) ∂yφ (·,X·, (φ ) (·,X·,Y· )) solves the BSDE with data (ξ, f˜n, 0, 0). Note that although (ξ, f˜n, 0, 0) is a quadratic BSDE, existence and uniqueness of a solution are guaranteed for n ≥ 1 by the fact that the mapping Ln is one to one and by the existence of a unique solution to the untransformed BSDE (Lemma 35). For n = 0, using the good properties of f˜0 demonstrated in Lemma 40, there exists a solution ˜ 0 ˜0 ∞ 2 ˜0 (Y , Z ) ∈ H[0,T ] × H[0,T ] to the BSDE with data (ξ, f , 0, 0) by Theorem 2.3 in [43]. Note that it is a priori not unique, but we will show that it is at least unique on a small time interval up to T . We now construct the process (Y 0,Z0) of the statement on subintervals of [0,T ]. First of all notice that we can choose the constant C˜1,f appearing in Lemma 40 uniformly for all n ≥ 0. Let M := ||ξ||∞ + T C˜1,f . By Corollary 2.2 in [43] we have n ||Y˜ ||∞ ≤ M, n ≥ 0. (64)

Now by Lemma 40

˜ • there exists C1,f > 0 that only depends on Cσ, Cb, C1,f , CH and ||η||p−var;[0,T ] such that ˜0 2 |f (t, x, y, z)| ≤ C˜1,f + C˜1,f |z| , ˜0 |∂zf (t, x, y, z)| ≤ C˜1,f + C˜1,f |z|.

54 3. Backward stochastic differential equations with rough driver

˜ • There exists a constant Cunif > 0 that only depends on Cσ, Cb, C2,f , CH and ||η||p−var;[0,T ] such that for every ε there exists an hε > 0 that only depends on Cσ, Cb, CH and ||η||p−var;[0,T ] such that on [T − hε,T ] we have

˜0 2 ∂yf (t, x, y, z) ≤ C˜unif + ε|z| .

Hence we can choose h = h ˜ , such that for t ∈ [T − h, T ] we have δ(C1,f ,M) ˜ 2 ∂yf(t, x, y, z) ≤ C˜unif + δ(C˜1,f ,M)|z| .

Here δ is the universal function given in the statement of Theorem 3.2. We can then apply Theorem 3.2 to get uniqueness of our solution (Y˜ 0, Z˜0) on [T − h, T ]. Now, as a consequence of (63) we have

f˜n → f˜0 uniformly on compacta.

Hence, by the argument of Theorem 2.8 in [43] we have that

n 0 Y˜ → Y˜ uniformly on [T − h, T ] P − a.s., ˜n ˜0 2 Z → Z in H[T −h,T ]. (65)

Moreover, if we define 0 0 ˜ 0 Yt := φ (t, Xt, Yt ), t ∈ [T − h, T ], " 0 ˜ 0 # 0 0 0 0 ∂xφ (t, Xt, Yt ) Z := ∂yφ (t, Xt, Y˜ ) Z˜ + σt , t ∈ [T − h, T ], t t t 0 ˜ 0 ∂yφ (t, Xt, Yt ) and remembering that by construction

n n ˜ n Yt = φ (t, Xt, Yt ), " n ˜ n # n n n n ∂xφ (t, Xt, Yt ) Z = ∂yφ (t, Xt, Y˜ ) Z˜ + σt , t t t n ˜ n ∂yφ (t, Xt, Yt ) and using (63) we get

n 0 Y → Y uniformly on [T − h, T ] P − a.s., (66) n 0 2 Z → Z in H[T −h,T ].

Let us proceed to the next subinterval. To make the rough path disappear in the BSDE, we will use a similar transformation via a flow as above. As before we need to control the driver of the transformed BSDE, this time near T − h. For this reason we have to start the flow anew. First, we rewrite the BSDEs for n ≥ 1 as Z T −h Z T −h Z T −h n n n n n n n Yt = YT −h + f(r, Yr ,Zr )dr − H(Xr,Yr )dηr − Zr dWr. t t t Then define the flow φn,T −h started at time T − h, i.e. Z T −h φn,T −h(t, x, y) = y + H(x, φn,T −h(r, x, y))dηn(r), t ≤ T − h. t

55 3. Backward stochastic differential equations with rough driver

On [0,T − h] define ˜ n,T −h n,T −h −1 n Y· , := φ ) (·,X·,Y· ), n,T −h n,T −h −1 n ˜n,T −h ∂xφ (·,X·, (φ ) (·,X·,Y· )) 1 n Z· := − n,T −h n,T −h −1 n σ· + n,T −h n,T −h −1 n Z· . ∂yφ (·,X·, (φ ) (·,X·,Y· )) ∂yφ (·,X·, (φ ) (·,X·,Y· ))

Then Z T −h Z T −h ˜ n,T −h n ˜n,T −h ˜ n,T −h ˜n,T −h ˜n,T −h Yt = YT −h + f (r, Xr, Yr , Zr )dr − Zr dWr, t t where

˜n,T −h 1 n  n,T −h n,T −h n,T −h  n,T −h f (t, x, y,˜ z˜) := n,T −h f t, φ , ∂yφ z˜ + ∂xφ σt + h∂xφ , bti ∂yφ 1 h i  T 1 o + Tr ∂ φn,T −hσ σT + hz,˜ ∂ φn,T −hσ i + ∂ φn,T −h|z˜|2 . 2 xx t t xy t 2 yy This BSDE is also defined for n = 0 and as before we get via Lemma 40 for the same h and the same C˜1,f and C˜unif as before (here the explicit dependence of these constants is crucial), that on [T − 2h, T − h] we have ˜0,T −h 2 ∂yf (t, x, y, z) ≤ C˜unif + δ(C˜1,f ,M)|z| .

Hence we can apply Comparison Theorem 3.2 to get uniqueness of our solution (Y˜ 0,T −h, Z˜0,T −h) on [T − 2h, T − h]. Now, also note that for the terminal value we have from (66) and (64)

n 0 YT −h → YT −h P − a.s., n |YT −h| ≤ M, n ≥ 1.

Hence, again by the argument of Theorem 2.8 in [43] 22

n,T −h 0,T −h Y˜ → Y˜ uniformly on [T − 2h, T − h] P − a.s., ˜n,T −h ˜0,T −h 2 Z → Z in H[T −2h,T −h].

Finally, reversing the transformation, we get as above

n 0 Y → Y uniformly on [T − 2h, T − h] P − a.s., n 0 2 Z → Z in H[T −2h,T −h].

Then, we can iterate this procedure on subintervals of length h up to time 0. Without loss of generality we can assume that T = Nh for an N ∈ N. Then, patching the results together we get

N n 0 X n 0 sup |Yt − Yt | ≤ sup |Yt − Yt | → 0 P − a.s. t≤T k=1 (k−1)h≤t≤kh

22Note that Theorem 2.8 in [43] demands convergence in L∞ of the terminal value. A closer look at the proof though, reveals that P-a.s. convergence combined with a uniform deterministic bound (M in our case) is enough. To be specific: the convergence of the terminal value is only used at two instances for Theorem 2.8 and this is in the proof of Proposition 2.4 (which is the main ingredient for Theorem 2.8). Firstly, it is used on p. 568, right before Step 2 where it reads “By Lebesgue’s dominated . . . ”. Secondly, it is used on p. 570, before the end of the proof where it reads “from which we deduce that . . . ”. In both cases, the above stated requirement is enough.

56 3. Backward stochastic differential equations with rough driver and Z T  N "Z kh # n 0 2 X n 0 2 E |Zr − Zr | dr = E |Zr − Zr | dr → 0. 0 k=1 (k−1)h 2. Uniqueness Let ζ¯n, n ≥ 1 be another sequence of smooth paths that converges to η in p-variation. Let (Y¯ n, Z¯n) be the solutions to BSDEs with data (ξ, f, H, ζ¯n). Then, as above ˜ n 0 Y¯ → Y˜ uniformly on [T − h, T ] P − a.s., ¯˜n ˜0 2 Z → Z in H[T −h,T ]. And hence n 0 Y¯ → Y uniformly on [T − h, T ] P − a.s., ¯n 0 2 Z → Z in H[T −h,T ]. Note that the choice of h in the proof of existence only depended on properties of the limiting function f˜0, so we can use the same value here. One can now iterate this argument up to time 0 to get n 0 Y¯ → Y uniformly on [0,T ] P − a.s., ¯n 0 2 Z → Z in H[0,T ], as desired. 3. Continuity of the solution map We note that for a given B > 0, all terminal values ξ such that |ξ| ≤ B and all geometric p-rough paths with ||η||p−var;[0,T ] ≤ B we can choose an h = h(B) > 0 such that the above constructed unique solution (Y 0,Z0) to the BSDE (59) is given by  0,T ˜ T  φ (t, Xt, Yt ), t ∈ [T − h, T ],   φ0,T −h(t, X , Y˜ T −h), t ∈ [T − 2h, T − h], Y 0 = t t t ...   0,h ˜ h  φ (t, Xt, Yt ), t ∈ [0, h],

  0,T ˜ 0,T  0,T ˜ 0,T ˜0,T ∂xφ (t,Xt,Yt )  ∂yφ (t, Xt, Yt ) Zt + 0 0,T σt , t ∈ [T − h, T ],  ∂yφ (t,Xt,Y˜ )  t   0,T −h ˜ 0,T −h   0,T −h 0,T −h 0,T −h ∂xφ (t,Xt,Yt )  ∂yφ (t, Xt, Y˜ ) Z˜ + σt , t ∈ [T − 2h, T − h], 0 t t ∂ φ0,T −h(t,X ,Y˜ 0,T −h) Zt = y t t  ...    0,h ˜ 0,h   ∂ φ0,h(t, X , Y˜ 0,h) Z˜0,h + ∂xφ (t,Xt,Yt ) σ , t ∈ [0, h],  y t t t 0,h ˜ 0,h t ∂yφ (t,Xt,Yt ) where we used the unique solutions to the following BSDEs Z T Z T ˜ 0,T ˜0,T ˜ 0,T ˜0,T ˜0,T Yt = ξ + f (r, Xr, Yr , Zr )dr − Zr dWr, t t Z T −h Z T −h ˜ 0,T −h 0,T ˜ 0,T ˜0,T −h ˜ 0,T −h ˜0,T −h ˜0,T −h Yt = φ (T − h, XT −h, YT −h) + f (r, Xr, Yr , Zr )dr − Zr dWr, t t ... Z h Z h ˜ 0,h 0,2h ˜ 0,2h ˜0,h ˜ 0,h ˜0,h ˜0,h Yt = φ (h, Xh, Yh ) + f (r, Xr, Yr , Zr )dr − Zr dWr. t t

57 3. Backward stochastic differential equations with rough driver

From this representation and stability results on BSDEs (Theorem 2.8 in [43]) it easily follows that the solution map

0,p−var [p] d ∞ ∞ 2 C ([0,T ],G (R )) × L (FT ) → H[0,T ] × H[0,T ]

is continuous in balls of radius B. Since this is true for every B > 0 we get the desired result.

3.3 The Markovian Setting - Connection to rough PDEs

We now specialize to a Markovian model. We are interested in solving the following forward n backward stochastic differential equation for (t0, x0) ∈ [0,T ] × R Z t Z t t0,x0 t0,x0 t0,x0 Xt = x0 + σ(r, Xr )dWr + b(r, Xr )dr, t ∈ [t0,T ], t0 t0 Z T t0,x0 t0,x0 t0,x0 t0,x0 t0,x0 Yt = g(XT ) + f(r, Xr ,Yr ,Zr )dr (67) t Z T Z T t0,x0 t0,x0 t0,x0 + H(Xr ,Yr )dηr − Zr dWr, t ∈ [t0,T ]. t t

n n×m n n n m n Here σ : [0,T ] × R → R , b : [0,T ] × R → R , f : [0,T ] × R × R × R , g : R → R, n d d H = (H1,...,Hd): R × R → R , η : [0,T ] → R are continuous mappings, on which more assumptions will be presented later. Assume for the moment that η is actually a smooth path. Then (67) is connected to the PDE 1 ∂ u(t, x) + Tr[σ(t, x)σ(t, x)T D2u(t, x)] + hb(t, x), Du(t, x)i t 2 n + f(t, x, u(t, x), Du(t, x)σ(t, x)) + H(x, u(t, x))η ˙t = 0, t ∈ [0,T ), x ∈ R , (68) n u(T, x) = g(x), x ∈ R .

We will make this connection explicit after introducing the following adaption (and strength- ening) of previous assumptions:

n (MA1) There exists a constant Cσ > 0 such that for (t, x) ∈ [0,T ] × R

|σ(t, x)| ≤ Cσ,

|∂xi σ(t, x)| ≤ Cσ, i = 1, . . . , n.

n (MA2) There exists a constant Cb > 0 such that for (t, x) ∈ [0,T ] × R

|b(t, x)| ≤ Cb,

|∂xb(t, x)| ≤ Cb.

n m (MF1) There exists a constant C1,f > 0 such that for (t, x, y, z) ∈ [0,T ] × R × R × R

|f(t, x, y, z)| ≤ C1,f ,

|∂zf(t, x, y, z)| ≤ C1,f .

58 3. Backward stochastic differential equations with rough driver

n m (MF2) There exists a constant C2,f > 0 such that such that for (t, x, y, z) ∈ [0,T ]×R ×R×R

∂yf(t, x, y, z) ≤ C2,f .

n m (MF3) There exists a constant C3,f > 0 such that such that for (t, x, y, z) ∈ [0,T ]×R ×R×R

2 ∂xf(t, x, y, z) ≤ C3,f + C3,f |z| ,

and f is uniformly continuous in x, uniformly in (t, y, z).

(MG1) g is bounded and uniformly continuous.

We again consider for a smooth (or rough) path η the flow (parametrized by x)

Z T d X k φ(t, x, y) = y + Hk(x, φ(r, x, y))dη (r). (69) t k=1 Proposition 41. Assume (MA1), (MA2), (MF1), (MF2), (MF3), (MG1) and let H be n Lipschitz on R × R. Let η be a smooth path. Then there exists a unique viscosity solution 23 n to (68) in BUC([0,T ], R ). It is given by

t,x u(t, x) := Yt ,

n t ,x t ,x where for every (t0, x0) ∈ [0,T ] × R the process (Y 0 0 ,Z 0 0 ) is the solution to (67).

Proof. The fact that u is a bounded, uniformly continuous viscosity solution follows from Proposition 2.5 and Theorem 3.4 in [2]. Uniqueness of a bounded viscosity solution to (68) follows from Theorem 3.6. Note, that since ∂yf is bounded, we can choose hε ≡ T in the statement of the theorem and hence get uniqueness on the entire interval (0,T ]. Every n n BUC function on (0,T ] × R has a unique extension to [0,T ] × R . Hence u is unique in n BUC([0,T ], R ). Remark 42. It is also possible to show existence of a (unique) solution to (68) by purely deterministic methods, see e.g. Theorem 2 in [28].

n d n 0 Let now η , n = 1, 2,..., be smooth paths in R . Let γ > p ≥ 1 and assume η → η 0 0,p−var [p] d in p-variation, for a η ∈ C ([0,T ],G (R )). Assume (MA1), (MA2), (MF1), (MF2), (MF3), (MG1) and (Hp,γ), so that especially Theorem 36 holds true. It follows that the corresponding un (as given in Proposition 41) converge pointwise to some function u0, i.e.

n 0 n u (t, x) → u (t, x) t ∈ [0,T ], x ∈ R .

Again, the limiting function u0 does not depend on the approximating sequence, but only on the limiting rough path η0. We could hence define this u0 to be the solution solution to (68). But it is not straightforward, via this approach, to show uniform convergence on compacta as well as continuity of the solution map. We hence work directly on the PDEs, as in [10] and [32]. First we get the respective versions of Lemma 37 and Lemma 40.

23For an introduction to the theory of viscosity solutions we refer the reader to [18].

59 3. Backward stochastic differential equations with rough driver

Lemma 43. Assume (MA1), (MA2), (MF1), (MF2), (MG1) and let H(x, ·) = (H1(x, ·),...,Hd(x, ·)) be a collection of Lipschitz vector fields on R. Let a smooth path η be given. Let u be the unique viscosity solution to (68). Then v(t, x) := φ−1(t, x, u(t, x)) is a viscosity solution to 1 ∂ v(t, x) + Tr[σ(t, x)σ(t, x)T D2v(t, x)] + hb(t, x), Dv(t, x)i t 2 n + f˜(t, x, v(t, x), Dv(t, x)σ(t, x)) = 0, t ∈ [0,T ), x ∈ R , n v(T, x) = g(x), x ∈ R , where (in what follows the φ will always be evaluated at (t, x, y˜))

1 n 1  T  f˜(t, x, y,˜ z˜) = f (t, x, φ, ∂yφz˜ + ∂xφσ(t, x)) + h∂xφ, b(t, x)i + Tr ∂xxφσ(t, x)σ(t, x) ∂yφ 2 1 o + hz,˜ (∂ φσ(t, x))T i + ∂ φ|z˜|2 . xy 2 yy

Proof. This is an application of Lemma 5 in [32].

0,p−var [p] d Lemma 44. Let p ≥ 1, η ∈ C ([0,T ],G (R )) and γ > p. Assume (MA1), (MA2), (MF1), (MF2), (MF3), (MG1) and (Hp,γ). Let φ be the flow corresponding to equation (69) (solved as a rough differential equation). Then

1 n 1  T  f˜(t, x, y,˜ z˜) = f (t, x, φ, ∂yφz˜ + ∂xφσ(t, x)) + h∂xφ, b(t, x)i + Tr ∂xxφσ(t, x)σ(t, x) ∂yφ 2 1 o + hz,˜ (∂ φσ(t, x))T i + ∂ φ|z˜|2 xy 2 yy satisfies:

˜ • There exists a constant C1,f > 0 depending only on Cσ, Cb, C1,f , CH and ||η||p−var;[0,T ] n m such that for (t, x, y,˜ z˜) ∈ [0,T ] × R × R × R ˜ 2 |f(t, x, y,˜ z˜)| ≤ C˜1,f + C˜1,f |z˜| , ˜ |∂z˜f(t, x, y,˜ z˜)| ≤ C˜1,f + C˜1,f |z˜|.

˜ • There exists a constant Cunif > 0 that only depends on Cσ, Cb, C2,f , CH and ||η||p−var;[0,T ] such that for every ε > 0 there exists an hε > 0 that only depends on Cσ, Cb, CH and n m ||η||p−var;[0,T ] such that for (t, x, y,˜ z˜) ∈ [T − hε,T ] × R × R × R

˜ 2 ∂y˜f(t, x, y,˜ z˜) ≤ C˜unif + ε|z˜| .

˜ • There exists a C3,f > 0 that only depends on Cσ, Cb, C2,f , C3,f , CH and ||η||p−var;[0,T ] n m such that for (t, x, y,˜ z˜) ∈ [0,T ] × R × R × R ˜ 2 ∂xf(t, x, y,˜ z˜) ≤ C˜3,f + C˜3,f |z˜| .

60 3. Backward stochastic differential equations with rough driver

Proof. The first three inequalities follow as in Lemma 40. Now for i ≤ n we have ˜ ∂xi f(t, x, y,˜ z˜) 1 ˜ = −∂xiyφ f(t, x, y,˜ z˜) ∂yφ 1 h + ∂yf(t, x, φ, ∂yφz˜ + ∂xφσ(t, x))∂xi φ ∂yφ T + ∂zf(t, x, φ, ∂yφz˜ + ∂xφσ(t, x)) (∂xiyφz˜ + ∂xixφσ(t, x) + ∂xφ∂xi σ(t, x))

+ h∂xixφ, b(t, x)i + h∂xφ, ∂xi b(t, x)i 1 1 1 + Tr ∂ φσ(t, x)σ(t, x)T  + Tr ∂ φ∂ σ(t, x)σ(t, x)T  + Tr ∂ φσ(t, x)∂ σ(t, x)T  2 xixx 2 xx xi 2 xx xi 1 i + hz,˜ (∂ φσ(t, x))T i + hz,˜ (∂ φ∂ σ(t, x))T i + ∂ φ|z˜|2 . xixy xy xi 2 xiyy So ˜ |∂xi f(t, x, y,˜ z˜)| 1 ˜ ≤ |∂xiyφ|| ||f(t, x, y,˜ z˜)| ∂yφ 1 h + | | |∂yf(t, x, φ, ∂yφz˜ + ∂xφσ(t, x))||∂xi φ| ∂yφ

+ |∂zf(t, x, φ, ∂yφz˜ + ∂xφσ(t, x))| (|∂xiyφ||z˜| + |∂xixφ||σ(t, x)| + |∂xφ||∂xi σ(t, x)|)

+ |h∂xixφ||b(t, x)| + |∂xφ||∂xi b(t, x)| 1 + |∂ φ||σ(t, x)|2 + |∂ φ||∂ σ(t, x)||σ(t, x)| 2 xixx xx xi 1 i + |z˜||∂ φ||σ(t, x)| + |z˜||∂ φ||∂ σ(t, x)| + |∂ φ||z˜|2 | xixy xy xi 2 xiyy 2 ≤ C˜3,f + C˜3,f |z˜| ˜ with a constant C3,f only depending on Cσ, Cb, C1,f , C2,f , C3,f , CH and ||η||p−var;[0,T ]. Here we have used the first inequality of the statement to bound f˜, (MF1), (MF2), (MF3) to bound f and its y and z derivatives and Lemma 3.4 to bound the flow and its derivatives. Summing over i then yields the desired result.

n d Theorem 45. Let γ > p ≥ 1 and let η , n = 1, 2,... be smooth paths in R . Assume ηn → η

0,p−var [p] d in p-variation, for a rough path η ∈ C ([0,T ],G (R )). Assume (MA1), (MA2), n n (MF1), (MF2), (MF3), (MG1) and (Hp,γ). Let u ∈ BUC([0,T ]×R ) be the unique solution n n to (68) with driving path η (Proposition 41). Then there exists u ∈ BC([0,T ] × R ), only dependent on η but not on the approximating sequence ηn, such that un → u locally uniformly. We write (formally) 1  du + Tr[σ(t, x)σ(t, x)T D2u(t, x)] + hb(t, x), Du(t, x)i + f(t, x, u(t, x), Du(t, x)σ(t, x)) dt 2 n + H(x, u(t, x))dη(t) = 0, t ∈ [0,T ), x ∈ R , (70) n u(T, x) = g(x), x ∈ R .

61 3. Backward stochastic differential equations with rough driver

Furthermore, the solution map

0,p−var [p] d n n C ([0,T ],G (R )) × BUC(R ) → BC([0,T ] × R ), (η, g) 7→ u is continuous. At last we have the stochastic representation t,x u(t, x) = Yt , where Y t,x is (the Y -component of) the solution to the BSDE (67). Remark 46. Equations like (70) have been considered in [32]. The setting there is more general in the sense that the vector field H in front of the rough path is allowed to also depend on the gradient. On the other hand, their f is independent of the gradient and H is linear. For the proof we apply the same ideas as in the proof of Theorem 1 in [10]. We however mimic our analysis of the BSDE case (Theorem 36) and proceed on small intervals; a similar approach was carried out in Lions-Souganidis [48]. n Remark 47. We suspect the solution to actually lie in BUC([0,T ] × R ). Showing this would involve adapting the comparison theorem 3.6 to directly yield a modulus of continuity for solutions, as it has been done in [28] under different assumptions on the coefficients.

Proof. For the sake of unified notation, we write u0 for the (rough PDE) solution u; similarly, we write η0 for the rough path η. 1. Existence Let φn, n ≥ 0 be the (ODE, for n ≥ 1 and RDE, when n = 0) solution flow (parametrized by x) Z T φn(t, x, y) = y + H(x, φn(r, x, y))dηn(r). t

Then, by Lemma 43, for n ≥ 1, un is a solution to (68) if and only if vn(t, x) := (φn)−1(t, x, un(t, x)) is a solution to 1 ∂ vn(t, x) + Tr[σ(t, x)σ(t, x)T D2vn(t, x)] + hb(t, x), Dvn(t, x)i t 2 n n n n + f˜ (t, x, v (t, x), Dv (t, x)σ(t, x)) = 0, t ∈ [0,T ), x ∈ R , (71) n n v (T, x) = g(x), x ∈ R , where ˜n 1 n n n n n 1  n T  f (t, x, y,˜ z˜) = n f (t, x, φ , ∂yφ z˜ + ∂xφ σ(t, x)) + h∂xφ , b(t, x)i + Tr ∂xxφ σ(t, x)σ(t, x) ∂yφ 2 1 o + hz,˜ (∂ φnσ(t, x))T i + ∂ φn|z˜|2 . xy 2 yy

In the proof of Theorem 36 we have already seen that f˜n → f˜0, locally uniformly. From the method of semi-relaxed limits (Lemma 6.1, Remark 6.2-6.4 in [18]), the pointwise (relaxed) limits ∗ v¯0 := lim sup vn, v0 := lim inf vn, ∗

62 3. Backward stochastic differential equations with rough driver are viscosity (sub resp. super) solutions to the (transformed) PDE (71) for n = 0. Here we have used the fact, thatv ¯0 and v0 are indeed finite, say bounded in norm by M > 0. This follows from the Feynman-Kac representation (Proposition 41) for each un, in combination with bounds (uniform in (t0, x0) and n) on the corresponding BSDEs (Corollary 2.2 in [43]). (Although not completely obvious, such uniform bounds can also be obtained without BSDE arguments; one would need to exploit comparison for (68), and then (71), clearly valid when n ≥ 1, with rough path estimates for RDE solutions which will serve as sub- and super- solutions without spatial structure.) Note also, that the semi-relaxed limiting procedure preserves the terminal value (see e.g. Proposition 5.1 in [31] or Section 10 in [10]). By Lemma 44 the function f˜0 satisfies the conditions of Theorem 3.6. Hence the PDE (71) for n = 0 satisfies comparison on [T − h, T ] for h sufficiently small (as long as T − h > 0), ˜0 and h only depends on M and the constants C˜unif , C˜1,f and C˜2,f for f given by Lemma 44. So v0(t, x) :=v ¯0(t, x) = v0(t, x), t ∈ [T − h, T ] is the unique (and continuous, sincev, ¯ v are respectively upper resp. lower semi-continuous) solution to (71) with n = 0 on [T − h, T ]. Moreover, using a Dini-type argument (Remark 6.4 in [18]), one sees that this limit must be uniform on compact sets. Undoing the transformation, we see that un → u0 locally uniformly on [T − h, T ], where u0(t, x) := φ0(t, x, v0(t, x)), t ∈ [T − h, T ]. We proceed to the next subinterval. We use the same argument as above, we just work with a different transformation. For n ≥ 0 let φn,T −h be the solution flow started at time T − h, i.e. Z T −h φn,T −h(t, x, y) = y + H(x, φn,T −h(r, x, y))dηn(r). t

n Then, for n ≥ 1, u |[0,T −h] is a solution to 1 ∂ un(t, x) + Tr[σ(t, x)σ(t, x)T D2un(t, x)] + hb(t, x), Dun(t, x)i t 2 n n n n + f(t, x, u (t, x), Du (t, x)σ(t, x)) + H(x, u (t, x))η ˙r = 0, t ∈ [0,T − h], x ∈ R , n n n u(T − h, x) = φ (T − h, x, v (T − h, x)), x ∈ R . if and only if vn,T −h(t, x) := (φn,T −h)−1(t, x, un(t, x)) is a solution to 1 ∂ vn,T −h(t, x) + Tr[σ(t, x)σ(t, x)T D2vn,T −h(t, x)] + hb(t, x), Dvn,T −h(t, x)i t 2 n,T −h n,T −h n,T −h n + f˜ (t, x, v (t, x), σ(t, x)Dv (t, x)) = 0, t ∈ (0,T − h), x ∈ R , n,T −h n n n v (T, x) = φ (T − h, x, v (T − h, x)), x ∈ R , where of course f˜n,T −h is defined as f˜n was, with φn replaced by φn,T −h. Now we have already shown that the terminal values of these PDEs converge, e.g.

φn(T − h, ·, vn(T − h, ·)) → φ(T − h, ·, v(T − h, ·)), locally uniformly.

As before, one also shows that f˜n,T −h → f˜0,T −h, locally uniformly. By Theorem 3.6 we again get comparison, now on [T −2h, T −h], and hence again via the method of semi-relaxed limits we arrive at 24

n,T −h 0,T −h n v → v locally uniformly on [T − 2h, T − h] × R .

24Remark 6.3 in [18] does not take into account converging terminal values. But the result is immediate: the relaxed limit is a sub resp. super solution by Lemma 6.3 and by Proposition 2 in [10] their terminal value is exactly the limit of the given converging terminal values.

63 3. Backward stochastic differential equations with rough driver

Hence un → u0 locally uniformly on [T −2h, T −h], where u0(t, x) = φ0,T −h(t, x, v0,T −h(t, x)). Iterating this argument up to time 0 we get

n 0 n u → u locally uniformly on [0,T ] × R , where u0 is defined on intervals of length h as above. 25 2. Uniqueness, Continuity of solution map Uniqueness of the limit and continuity of the solution map now follow by the same arguments as in the proof of Theorem 36, adapted to the PDE setting. 3. Stochastic representation Let ηn → η0 as above. Denote by un the solution to the corresponding PDE (rough PDE for n = 0). Denote by Y n,t,x the solution to the BSDE (67) (BSDE with rough driver for n = 0) corresponding to the path ηn. Then, using the result from step 1, the stochastic representation in the case of a smooth path from Proposition 41 and the convergence of the BSDEs from Theorem 36, we get

u0(t, x) = lim un(t, x) = lim Y n,t,x = Y 0,t,x. n→∞ n→∞ t t

3.4 Connection to BDSDEs

1 d 2 m 1 2 Let Ω = C([0,T ], R ), Ω = C([0,T ], R ), with the respective Wiener measures P , P on 1 2 1 2 1 2 them. Let Ω = Ω × Ω , with the product measure P := P ⊗ P . For (ω , ω ) ∈ Ω let B(ω1, ω2) = ω1 be the coordinate mapping with respect to the first component. Analo- gously W (ω1, ω2) = ω2 is the coordinate mapping with respect to the second component. In particular, B is a d-dimensional Brownian motion and W is an independent m-dimensional Brownian motion. B W B W Define Ft := Ft,T ∨ F0,t , where Ft,T := σ(Br : r ∈ [t, T ]), F0,t := σ(Wr : r ∈ [0, t]). Note that F is not a filtration, since it is neither increasing nor decreasing. In this setting, Pardoux and Peng [55] considered backward doubly stochastic differential equations (BDSDEs). An F-adapted process (Y,Z) is called a solution to the BDSDE

Z T Z T Z T Yt = ξ + f(r, Yr,Zr)dr + H(Xr,Yr) ◦ dBr − ZrdWr, (72) t t t

2 R T 2 if E[supt≤T |Yt| ] < ∞, E[ 0 |Zr| dr] < ∞ and (Y,Z) satisfies P-a.s. (72) for t ≤ T . Here X is again the semimartingale

Z t Z t Xt = x + σrdWr + brdr. 0 0 25The attentive reader will observe that convergence at t = 0 is not immediate, since Theorem 3.6 was not formulated to give comparison at t = 0. But we can argue by extending the coefficients as well as the (rough) paths ηn for t ∈ [−1, 0] as

n n σ(t, x) := σ(0, x), b(t, x) := b(0, x), f(t, x, y, z) := f(0, x, y, z), ηt := η0 and considering the PDEs on the interval [−1,T ].

64 3. Backward stochastic differential equations with rough driver

Under appropriate (essentially Lipschitz) conditions on f and H they were able to show existence and uniqueness of a solution. 26 The connection to BSDEs with rough driver is given by the following

∞ Theorem 48. Let p ∈ (2, 3), γ > p. Let ξ ∈ L (FT ). Let f be a random function satisfying (F1) and (F2). Moreover, assume (A1), (A2), (F1), (F2) and (Hp,γ). Then by Theorem 1.1 in [55] there exists a unique solution (Y,Z) to the BDSDE Z T Z T Z T Yt = ξ + f(r, Yr,Zr)dr + H(Xr,Yr) ◦ dBr − ZrdWr. t t t

27 Let Bt = exp(Bt + At) be the Enhanced Brownian motion (over B) , especially B ∈ 0,p−var 2 d 1 0,p−var 2 d C0 ([0,T ],G (R )) P a.s.. By setting B = 0 on a null set, we get B ∈ C0 ([0,T ],G (R )). By Theorem 36 we can, for every ω1 ∈ Ω1, construct the solution to the BSDE with rough driver Z T Z T rp 1 rp rp rp 1 1 Y (ω , ·)t = ξ(·) + f(r, Yr ,Zr )dr + H(Xr,Y (ω , ·))dBr(ω ) t t Z T rp 1 − Z (ω , ·)dWr(·), t ∈ [0,T ]. t

1 1 2 We then have for P − a.e. ω that P − a.s. 1 rp 1 Yt(ω , ·) = Yt (ω , ·), t ≤ T and

1 rp 1 2 Zt(ω , ·) = Zt (ω , ·), dt ⊗ P a.s..

Proof. As in the proof of Theorem 36, in the BDSDE setting, one can transform the integral belonging to the Brownian motion B away. In [7] it was shown, that if we let φ be the stochastic (Stratonovich) flow Z T 1 1 1 φ(ω ; t, x, y) = y + H(x, φ(ω ; r, x, y)) ◦ dBr(ω ), t then with Y˜ := φ−1(t, X ,Y ), Z˜ := 1 Z we have -a.s. t t t t ∂yφ(t,Xt,Yt) t P Z T Z T 1 2 2 1 2 1 2 1 2 1 2 2 Y˜t(ω , ω ) = ξ(ω ) + f˜(ω , ω ; r, Xr, Y˜r(ω , ω ), Z˜r(ω , ω ))dr − Z˜r(ω , ω )dWr(ω ), t ≤ T. t t (73) Here

˜ 1 2 1 n 2  1  T  f(ω , ω ; t, x, y,˜ z˜) := f ω ; t, φ, ∂yφz˜ + ∂xφσt + h∂xφ, bti + Tr ∂xxφσtσt ∂yφ 2 1 o + hz,˜ (∂ φσ )T i + ∂ φ|z˜|2 , xy t 2 yy 26Pardoux and Peng considered equations, where the was actually a backward integral. But if H is smooth enough, the formulations are equivalent. See also Section 4 in [7]. 27B is precisely d-dimensional Brownian motion enhanced with its iterated integrals in Stratonovich sense; it is in 1 − 1 correspondence with Brownian motion enhanced with L´evy’sarea; exp denotes the exponential d map from the Lie algebra R ⊕so(d) to the group, realized inside the truncated tensor algebra. See e.g. section 13 in [33] for more details.

65 3. Backward stochastic differential equations with rough driver where φ and its derivatives are always evaluated at (ω1; x, y˜). Especially, by a Fubini type 1 1 1 1 1 theorem (e.g. Theorem 3.4.1 in [5]), there exists Ω0 with P (Ω0) = 1 such that for ω ∈ Ω0 2 equation (73) holds true P a.s.. On the other hand we can construct ω1-wise the rough flow Z T rp 1 rp 1 1 φ (ω ; t, x, y) = y + H(x, φ (ω ; r, x, y))dBr(ω ). t Assume for the moment that we have global comparison, so that we can solve the transformed 1 1 2 BSDE uniquely, i.e. for every ω ∈ Ω , we have P a.s. Z T ˜ rp 1 2 2 ˜rp 1 ˜ rp 1 2 ˜rp 1 2 Yt (ω , ω ) = ξ(ω ) + f (ω ; r, Yr (ω , ω ), Zr (ω , ω ))dr t Z T ˜rp 1 2 2 − Zr (ω , ω )dWr(ω ), t ≤ T, t where ˜rp 1 2 1 n 2 rp rp rp  rp 1  rp T  f (ω , ω ; t, x, y,˜ z˜) := rp f ω ; t, φ , ∂yφ z˜ + ∂xφ σt + h∂xφ , bti + Tr ∂xxφ σtσt ∂yφ 2 1 o + hz,˜ (∂ φrpσ )T i + ∂ φrp|z˜|2 , xy t 2 yy where φ and its derivatives are always evaluated at (ω1; x, y˜). It is a classical rough path 1 1 1 1 1 result, that there exists Ω1 with P (Ω1) = 1 such that for ω ∈ Ω1, we have φrp(ω1; ·, ·, ·) = φ(ω1; ·, ·, ·). 1 1 1 ˜ 1 ˜ 1 ˜ rp 1 ˜rp 1 Combining above results we have for ω ∈ Ω0∩Ω1 that (Yt(ω , ·), Zt(ω , ·)) and (Yt (ω , ·), Zt (ω , ·)) satisfy the same BSDE. Hence, we have by uniqueness ˜ 1 ˜ rp 1 2 Yt(ω , ·) = Yt (ω , ·), t ≤ T, P a.s. and ˜ 1 ˜rp 1 2 Zt(ω , ·) = Zt (ω , ·), dt ⊗ P a.s.. By reversing the transformation we get the desired result for Y and Z. Now, since comparison does not necessarily hold globally, we must argue differently. Define k 1 1 1 k A := {ω ∈ Ω : ||B(ω )||p−var ≤ k}. Then on A we have for an h = h(k) > 0 comparison S k on [T − h, T ], and we argue on subsequent intervals as above. Now, since P( k A ) = 1, we get the desired result.

3.5 Technical results

3.5.1 Comparison for BSDEs

∞ Definition 3.1. Let ξ ∈ L (FT ), W an m-dimensional Brownian motion and f a predictable m function on Ω × R+ × R × R . We call an adapted process (Y,Z,C) a supersolution to the BSDE with data (ξ, f) if Y ∈ ∞ 2 H[0,T ], Z ∈ H[0,T ], C is an adapted right continuous increasing process and Z T Z T Z T Yt = ξ + f(r, Yr,Zr)dr − ZrdWr + dCr, t ≤ T. t t t We call (Y,Z,C) a subsolution to the BSDE with data (ξ, f) if (Y,Z, −C) is a supersolution.

66 3. Backward stochastic differential equations with rough driver

The following statement as well as its proof are based on Theorem 2.6 in [43].

2 Theorem 3.2. There exists a (universal) strictly positive function δ : R+ → (0, ∞) such that the following statement is true. Let (Y (1),Z(1),C(1)) be a supersolution to the BSDE with data (ξ(1), f (1)). Let (Y (2),Z(2),C(2)) (2) (2) (1) be a subsolution to the BSDE with data (ξ , f ). Let M ∈ R+ be a bound for Y and Y (2), i.e.

(1) (2) ||Y ||∞, ||Y ||∞ ≤ M.

Assume that P-a.s.

(1) (1) (1) (2) (1) (1) f (t, Yt ,Zt ) ≤ f (t, Yt ,Zt ), ∀t ∈ [0,T ], ξ(1) ≤ ξ(2).

Assume that there exist constants C > 0, L > 0, K > 0 such that for (t, y, z) ∈ [0,T ] × m [−M,M] × R

(2) 2 |f (t, y, z)| ≤ L + C|z| P − a.s., (2) |∂zf (t, y, z)| ≤ K + C|z| P − a.s.

m Assume that there exists a constant N > 0 such that for (t, y, z) ∈ [0,T ] × [−M,M] × R

(2) 2 ∂yf (t, y, z) ≤ N + δ(C,M)|z| P − a.s. (74)

Then P-a.s. (1) (2) Yt ≤ Yt , 0 ≤ t ≤ T. (75)

Remark 3.3. We note that, as in Theorem 2.6 of [43], the assumptions could be weakened 1 2 by replacing the constants L, K, N with deterministic functions lt ∈ L (0,T ), kt ∈ L (0,T ) 1 and nt ∈ L (0,T ). In our application of Theorem 3.2 in the proof of Theorem 36, condition (74) is not satisfied on [0,T ]. But we are able to choose h > 0 small enough, such that it is satisfied on [T −h, T ]. Comparison (75) then holds on [T − h, T ].

Proof. Let λ > 0, B > 1 be constants, to be specified later on. We begin by constructing several functions, whose properties we will rely on later in the proof. Define

1 eλBy˜ + 1 γ(˜y) := γ (˜y) := log − M, y˜ ∈ . λ,B λ B R

Then 1   1 γ−1(y) = log Beλ(y+M) − 1 , γ0(˜y) = B . λB 1 + e−λBy˜

Denote g(y) := e−λ(y+M), then 0 < g ≤ 1, on [−M,M]. Define

w(y) := γ0(γ−1(y)) = B − g(y).

67 3. Backward stochastic differential equations with rough driver

Then

w0(y) = λg(y), w00(y) = −λ2g(y), w0(y) λg(y) w00(y) −λ2g(y) = , = . w(y) B − g(y) w(y) B − g(y)

In particular w > 0 on [−M,M]. Define α(y) := γ−1(y). Then, since (Y (1),Z(1),C(1)) is a supersolution to the BSDE with data (ξ(1), f (1)), Itˆoformula gives

Z t Z t (1) (1) 0 (1) (1) (1) (1) 0 (1) (1) α(Yt ) = α(Y0 ) − α (Yr )f (r, Yr ,Zr )dr + α (Yr )Zr dWr 0 0 Z t Z t 0 (1) 00 (1) (1) 2 − α (Yr )dCr + α (Yr )|Zr | dr. 0 0

Define Z(1) Z(1) Y˜(1) := α(Y (1)), Z˜(1) := = . γ0(Y˜ (1)) w(Y (1)) and 1  1  F (1)(t, y,˜ z˜) := f (1)(t, γ(˜y), γ0(˜y)˜z) + γ00(˜y)|z˜|2 . γ0(˜y) 2

0 ˜ (1) ˜(1) R · 0 (1) (1) Since α > 0 we have that (Y , Z , 0 α (Yr )dCr ) is a supersolution to the BSDE with (1) (1) ˜ (2) ˜(2) R · 0 (2) (2) data (α(ξ ),F ). Analogously we have that (Y , Z , 0 α (Yr )dCr ) is a subsolution to the BSDE with data (α(ξ(2)),F (2)). Since α is increasing, it is now enough to verify that Y˜ (1) ≤ Y˜ (2). For that we will verify, that F (2) satisfies the conditions of Proposition 2.9 in [43]. Especially we will show, that there exist constants A, G > 0 such that

(2) (2) 2 m ∂yf (t, y, z) + A|∂zf (t, y, z)| ≤ G, ∀(t, y, z) ∈ [0,T ] × R × R . (76)

For simplicity denote F := F (2), f := f (2). Denote y = γ(˜y), z = γ0(˜y)˜z = w(y)˜z. For convenience w and its derivatives will always be evaluated at y. Then w0 ∂ F (t, y,˜ z˜) = ∂ f(t, y, z) + z , z˜ z w 1 1  ∂ F (t, y,˜ z˜) = w00|z|2 + w0 (∂ f(t, y, z)z − f(t, y, z)) + ∂ f(t, y, z). y˜ w 2 z y

Hence 1 1  ∂ F (t, y,˜ z˜) ≤ w00|z|2 + w0|z|[K + C|z|] + L + C|z|2 + ∂ f(t, y, z) y˜ w 2 y and  w0 2 |∂ F (t, y,˜ z˜)|2 ≤ K + C|z| + |z| . z˜ w

68 3. Backward stochastic differential equations with rough driver

So, for A > 0 1 w00 w0 w0  w0  w0  (∂ F + A|∂ F |2)(t, y,˜ z˜) ≤ |z|2 + 2C + A(C + )2 + K|z| + 2A C + y˜ z˜ 2 w w w w w w0 + L + ∂ f(t, y, z) + AK2. w y

Note, that for the second term we have w0  w0    w0  K|z| + 2A C + ≤ K|z| (1 + 2A) C + w w w  w0 2 (1 + 2A)2 ≤ A C + |z|2 + K2. w A

Hence 1 w00 w0 w0  (∂ F + A|∂ F |2)(t, y,˜ z˜) ≤ |z|2 + 2C + 2A(C + )2 y˜ z˜ 2 w w w w0  (1 + 2A)2  + L + ∂ f(t, y, z) + A + K2. w y A

Now 1 w00 w0 w0 1 w00 w0 w0 w0 + 2C + 2A(C + )2 = + 2C + 2AC2 + 4AC + 2A( )2 2 w w w 2 w w w w λ2 g(y) λg(y) λ2g(y)2 = − + 2C(1 + 2A) + 2AC2 + 2A 2 B − g(y) B − g(y) (B − g(y))2 g(y) h λ2 = − (B − g(y)) + 2C(1 + 2A)λ(B − g(y)) (B − g(y))2 2 i + 2Aλ2g(y) + 2AC2 g(y) λ2  = ((1 + 4A)g(y) − B) + 2C(1 + 2A)λ(B − g(y)) + 2AC2. (B − g(y))2 2

For all A < 1 we hence have 1 w00 w0 w0 g(y) λ2  + 2C + 2A(C + )2 ≤ (5g(y) − B) + 2C3λ(B − g(y)) + 2AC2. 2 w w w (B − g(y))2 2 Now, choose B = 6. Hence 5g(y)−B ≤ −1, y ∈ [−M,M]. Then choose λ = λ(C) sufficiently large such that the term in square brackets is strictly negative, say smaller then −1 for all y ∈ [−M,M]. This is possible since it is a polynomial in λ and the leading power has a negative coefficient. Then for y ∈ [−M,M]

g(y) λ2  g(y) (5g(y) − 6) + 2C3λ(6 − g(y)) ≤ − (6 − g(y))2 2 (6 − g(y))2 1 ≤ − e−λ2M =: −2δ, 36 where δ depends only M and λ and hence only on M and C, i.e. 1 δ = δ(C,M) = e−λ(C)2M . 72

69 3. Backward stochastic differential equations with rough driver

Now choose A ∈ (0, 1) small enough such that 2AC2 < δ. If then for some N > 0 we have

2 ∂yf(t, y, z) ≤ N + δ(C,M)|z| , it follows that w0  (1 + 2A)2  (∂ F + A|∂ F |2)(t, y,˜ z˜) ≤ L + N + A + K2 y˜ z˜ w A λ  (1 + 2A)2  ≤ L + N + A + K2 B − 1 A =: G.

So we have shown (76) and comparison then follows from Proposition 2.9 in [43].

3.5.2 Properties of rough flows

Consider the solution flow φ to

Z T φ(t, x, y) = y + H(x, φ(r, x, y))dηr, (77) t where H and η will be specified in a moment. We need to control

∂yφ − 1, ∂xφ, ∂xxφ, ∂xyφ, ∂yyφ, ∂yyyφ, ∂xyyφ, ∂xxyφ over a small interval [T − h, T ]. Note that each of the above expressions is 0 when evaluated at t = T .

0,p−var [p] d Lemma 3.4. Let p ≥ 1, η ∈ C ([0,T ],G (R )) and γ > p. Assume that Hi = Hi (x, y) has joint regularity of the form

sup |Hi (·, ·)|Lipγ+2(Rn+1) ≤ c1 i=1,...,d and

kζkp-var;[0,T ] ≤ c2.

3 n Then, the solution to (77) induces a flow of C diffeomorphisms, parametrized by x ∈ R , and n there exists a positive L = L (c1, c2,T ) so that, uniformly over x ∈ R , y ∈ R and t ∈ [0,T ]  1  max ∂xφ, ∂yφ, , ∂xxφ, ∂xyφ, ∂yyφ, ∂yyyφ, ∂xyyφ, ∂xxyφ < L. ∂yφ

Moreover, for every ε > 0 there exists a positive δ = δ (ε, c1, c2) so that, uniformly over x ∈ Rn, y ∈ R and t ∈ [T − δ, T ]

max {∂xφ, ∂yφ − 1, ∂xxφ, ∂xyφ, ∂yyφ, ∂yyyφ, ∂xyyφ, ∂xxyφ} < ε.

Proof. Consider the extended RDE

dξ = 0 −dφ = H (ξ, φ) dζ

70 3. Backward stochastic differential equations with rough driver

with terminal data (ξT , φT ) = (x, y). The assumption on (Hi) implies that (ξ, φ) evolves according to a rough differential equation with Lipγ+2-vector fields. In this case, the ensemble

φˆ = (ξ, φ, ∂xφ, ∂yφ, ∂xxφ, ∂xyφ, ∂yyφ, ∂yyyφ, ∂xyyφ, ∂xxyφ)

28 γ−1 can be seen to be the (unique , non-explosive) solution to an RDE along Liploc vector fields. Thanks to non-explosivity we can, for fixed terminal data ˆ φT = (x, y, 0, 1, 0, 0, 0, 0, 0, 0) , localize the problem and assume without loss of generality that the above ensemble is driven along Lipγ−1 vector fields. Since we want estimates that are uniform in x, y we make another key observations: there is no loss of generality in taking (x, y) = (0, 0) provided H is replaced by Hx,y = H (x + ·, y + ·). This also shifts the derivatives (evaluated at some (x, y)) to derivatives evaluated at (0, 0). As announced, we can now safely localize, and assume that the vector fields required for φ,ˆ obtain by taking formal (x, y) derivatives in dξ = 0 −dφ = H (ξ, φ) dζ, are globally Lipγ−1. A basic estimate (Theorem 10.14 in [33]) for RDE solutions implies that for some C = C (p, γ)   ˆ ˆ ˆ φt − φT ≤ φ = C × ϕp |Hx,y| γ+2 kζkp-var;[T −h,T ] , p-var;[t,T ] Lip p where ϕp (x) = max(x, x ).At last, we note that |Hx,y|Lipγ+2 = |H|Lipγ+2 thanks to invariance of such Lip norms under translation. The proof is then easily finished.

Lemma 3.5. Assume the setting of the previous lemma. Assume that ηn, n ≥ 1 is a sequence of p rough paths that converge to a rough path η0 in p-variation. n Then locally uniformly on [0,T ] × R × R

n 1 n n n n n 0 1 0 0 0 0 0 (φ , n , ∂yφ , ∂yyφ , ∂xφ , ∂xxφ , ∂yxφ ) → (φ , 0 , ∂yφ , ∂yyφ , ∂xφ , ∂xxφ , ∂yxφ ) ∂yφ ∂yφ

Proof. Using enlargment of the state space as in the proof of Lemma 3.4 we can apply the same reasoning as in Theorem 11.14 and Theorem 11.15 in [33] to get the desired result.

3.5.3 Comparison for PDEs

We consider the equation 1 −∂ u − Tr[σ(t, x)σ(t, x)T D2u] − hb(t, x), Dui − f(t, x, u, Duσ(t, x)) = 0, (t, x) ∈ (0,T ) × n, t 2 R (78)

n m n n n n×m where f : [0,T ] × R × R × R → R, b : [0,T ] × R → R and σ : [0,T ] × R → R , are a continuous functions. The following statement as well as its proof are a modification of Theorem 3.2 in [43]. (The statement is not in its most general form, but adjusted to what we need in the main text.)

28 γ This is actual a subtle point since uniqueness in general requires Liploc-regularity. The point is that the RDEs obtained by differentiating the flow have a special structure so that for the final level of derivatives only γ−1 rough integration is need; as is well known, for this it suffices to have Liploc regularity. Chapter 11 in [33] contains a detailed discussion of this.

71 3. Backward stochastic differential equations with rough driver

Theorem 3.6. Assume that there exists a constant Cb > 0 such that for (t, x), (t, y) ∈ n [0,T ] × R

|b(t, x) − b(t, y)| + |σ(t, x) − σ(t, y)| ≤ Cb|x − y|,

|b(t, x)| ≤ Cb.

n Assume that there exists a constant Cσ > 0 such that for (t, x), (t, y) ∈ [0,T ] × R

|σ(t, x) − σ(t, y)| ≤ Cσ|x − y|,

|σ(t, x)| ≤ Cσ.

n m Assume that there exists a constant C1,f > 0 such that for (t, x, y, z) ∈ [0,T ] × R × R × R

2 |f(t, x, y, z)| ≤ C1,f (1 + |z| ),

|∂zf(t, x, y, z)| ≤ C1,f (1 + |z|).

Assume that there exists a constant C2,f such that for every ε > 0 there exists an hε ∈ (0,T ] n m such that for (t, x, y, z) ∈ [T − hε,T ] × R × R × R we have

2 ∂yf(t, x, y, z) ≤ C2,f + ε|z| . (79)

n m Assume that there exists a constant C3,f > 0 such that for (t, x, y, z) ∈ [0,T ] × R × R × R

2 |∂xf(t, x, y, z)| ≤ C3,f (1 + |z| ).

n Let u, v be a bounded semicontinuous sub-(resp. super-)solution to (78) on (0,T ) × R , with ∗ ∗ u(T, ·) ≤ v(T, ·). Then there exists an ε = ε (||u||∞ ∨ ||v||∞,C,C2,f ) > 0 such that for n (t, x) ∈ (T − hε∗ ,T ] × R we have u(t, x) ≤ v(t, x).

Proof. 1. Reduction We will first transform the PDE into a PDE with coefficient that satisfies a certain structure condition (equation (24) in [43]). Set M := max{||u||∞, ||v||∞} + 1. Let λ > 0, A > 1, K > 0 be constants to be chosen later. We begin by constructing several functions, whose properties we will rely on later in the proof. Define

1 eλAy˜ + 1 ln(A) ϕ(˜y) := ln : → (− , ∞). λ A R λ

We will have to choose A ≥ eλ2MeKt , since we shall need later on that {eKt(y − M): y ∈ [−M,M]} is contained in the range of ϕ. Then 1 1   ϕ0(˜y) = A , ϕ−1(y) = ln Aeλy − 1 . 1 + e−λAy˜ λA

Define r(y) := ϕ−1(eKt(y −M)) , its inverse s(˜y) := ϕ(˜y)e−Kt +M and g(y) := e−λeKt(y−M) : [−M,M] → [1, eλ2MeKt ]. Then g0(y) = −λeKtg(y). Define

−Kt 0 −Kt h −λeKt(y−M)i −Kt w(y) := e ϕ (r(y)) = ∂y˜s|y˜=r(y) = e A − e = e [A − g(y)] ,

72 3. Backward stochastic differential equations with rough driver which is non-negative for A ≥ eλ2MeKt . Then

w0(y) = λg(y), w00(y) = −eKtλ2g(y).

Let now u(t, x) be a solution to (78). Letu ˜(t, x) := r(u(t, x)). Then u(t, x) = s(˜u(t, x)), and hence

0 −Kt ∂xi u(t, x) = ϕ (˜u(t, x))e ∂xi u˜(t, x), 00 −Kt 0 −Kt ∂xj xi u(t, x) = ϕ (˜u(t, x))e ∂xj u˜(t, x)∂xi u˜(t, x) + ϕ (˜u(t, x))e ∂xj xi u˜(t, x), i.e.

Du(t, x) = ϕ0(˜u(t, x))e−KtDu˜(t, x), D2u(t, x) = ϕ00(˜u(t, x))e−KtDu˜(t, x) ⊗ Du˜(t, x) + ϕ0(˜u(t, x))e−KtD2u˜(t, x).

Hence 1 ∂ u˜(t, x) = KeKt(u(t, x) − M) + eKt∂ u(t, x) t ϕ0(˜u(t, x)) t 1 = KeKt(u(t, x) − M) ϕ0(˜u(t, x)) 1 1  − eKt Tr[σ(t, x)σ(t, x)T D2u(t, x)] + hb(t, x), Du(t, x)i ϕ0(˜u(t, x)) 2 1 − eKtf(t, x, u(t, x), Du(t, x)σ(t, x)) ϕ0(˜u(t, x)) ϕ(˜u(t, x)) = K ϕ0(˜u(t, x)) 1 ϕ00(˜u(t, x)) 1 − Tr[σ(t, x)σ(t, x)T D2u˜(t, x)] − Tr[σ(t, x)σ(t, x)T Du˜(t, x) ⊗ Du˜(t, x)] 2 ϕ0(˜u(t, x)) 2 1 − hb(t, x),Du˜(t, x)i − eKtf(t, x, s(˜u(t, x)), ϕ0(˜u(t, x))e−KtDu˜(t, x)σ(t, x)). ϕ0(˜u(t, x))

Analogously, by resorting to test functions, one shows, that if u (resp. v) is a viscosity sub- (resp. super-) solution to (78), thenu ˜(t, x) := r(u(t, x)) (resp.v ˜(t, x) := r(v(t, x))) is a viscosity sub-(resp. super-) solution to 1 − ∂ u˜(t, x) − Tr[σ(t, x)σ(t, x)T D2u˜(t, x)] − h(b(t, x),Du˜(t, x)i t 2 n − f˜(t, x, u˜(t, x),Du˜(t, x)σ(t, x)) = 0, t ∈ (0,T ), x ∈ R , (80) where, denoting from now on y = s(˜y), z = w(y)˜z,

ϕ(˜y) ϕ00(˜y) 1 f˜(t, x, y,˜ z˜) = −K + |z˜|2 ϕ0(˜y) ϕ0(˜y) 2 1 + eKtf(t, x, s(˜y), ϕ0(˜y)e−Ktz˜) ϕ0(˜y) y − M 1 1 = −K + w0(y) |z˜|2 + f(t, x, y, w(y)˜z). w(y) 2 w(y)

We also obviously haveu ˜(T, ·) ≤ v˜(T, ·).

73 3. Backward stochastic differential equations with rough driver

We will bound they ˜-derivative of f˜, while at the same time choosing the constants K, λ, A. First w0(y) 1 w00(y) ∂ f˜(t, x, y,˜ z˜) = −K(1 − (y − M) )) + |z|2 y˜ w(y) 2 w(y) w0(y) − f(t, x, y, z) + ∂ f(t, x, y, z) w(y) y w0(y) + ∂ f(t, x, y, z)z w(y) z w0(y) 1 w00(y) ≤ −K(1 − (y − M) )) + |z|2 w(y) 2 w(y) w0(y) + C (1 + |z|2) + ∂ f(t, x, y, z) w(y) 1,f y w0(y) + C (1 + |z|)|z| w(y) 1,f |z|2 1  ≤ w00(y) + C w0(y) + C w0(y) w(y) 2 1,f 1,f w0(y) w0(y) w0(y) − K(1 − (y − M) ) + ∂ f(t, x, y, z) + C + C |z|. w(y) y 1,f w(y) 1,f w(y) Now using w0(y) |z|2 w0(y) C |z| ≤ w0(y) + C2 , 1,f w(y) w(y) w(y) 1,f we get |z|2 1  ∂ f˜(t, x, y,˜ z˜) ≤ w00(y) + (2C + 1)w0(y) (81) y˜ w(y) 2 1,f w0(y) − K + ∂ f(t, x, y, z) + C + K(y − M) + C2  . y w(y) 1,f 1,f

Note that 2 2 C1,f + K(y − M) + C1,f ≤ C1,f − K + C1,f , y ∈ [−(M − 1),M − 1].

Hence we can choose K0 = K0(C1,f ) sufficiently large, such that 2 C1,f + K0(y − M) + C1,f ≤ −1, y ∈ [−(M − 1),M − 1].

Then we have that for all choices of K0 > K, and all choices λ > 0 that the last term in (81),

0 −λeKt(y−M) w (y) 2  e Kt 2  C1,f + K(y − M) + C = λe C1,f + K(y − M) + C , w(y) 1,f A − e−λeKt(y−M) 1,f

λ2MeKt is negative for y ∈ [−(M −1),M −1] as long as A > e . We now fix K = K(C1,f ,C2,f ) = max{K0(C1,f ),C2,f } + 1. Then 1 1 w00(y) + (2C + 1)w0(y) = − eKtλ2g(y) + λ(2C + 1)g(y) 2 1,f 2 1,f 1 ≤ − λ2g(y) + λ(2C + 1)g(y) 2 1,f  1  = g(y)λ (2C + 1) − λ . 1,f 2

74 3. Backward stochastic differential equations with rough driver

So, if we choose λ = λ(C1,f ) = 4C1,f + 4, we have 1 w00(y) + (2C + 1)w0(y) ≤ g(y)(4C + 4)(−1) ≤ −(4C + 4) ≤ −1. 2 1,f 1,f 1,f

λ2MeKT We now fix A = A(λ(C1,f ),M,K(C1,f ,C2,f )) = A(M,C1,f ,C2,f ) = e + 1. Then for the first term in (81)

2   2   |z| 1 00 0 |z| Kt 1 00 0 w (y) + (2C1,f + 1)w (y) = e w (y) + (2C1,f + 1)w (y) w(y) 2 eλ2MeKT + 1 − e−λeKt(y−M) 2 |z|2 ≤ − eKt eλ2MeKT + 1 − e−λeKt(y−M) |z|2 ≤ − eKt eλ2MeKT < −δ|z|2 < 0, with eKt δ = δ(λ(C1,f ),K(C1,f ,C2,f ),M) = δ(M,C1,f ,C2,f ) = > 0. eλ2MeKT + 1

∗ ∗ δ We now set ε = ε (M,C2,f ,C2,f ) := 2 . Then on [T − hε∗ ,T ] we have δ ∂ f(t, x, y, z) ≤ C + |z|2, y 2,f 2 and hence we get that on [T − hε∗ ,T ] (remember that K ≥ C2,f + 1)

|z|2 1  ∂ f˜(t, x, y,˜ z˜) ≤ w00(y) + (2C + 1)w0(y) y˜ w(y) 2 1,f w0(y) − K + ∂ f(t, x, y, z) + C + K(y − M) + C2  y w(y) 1,f 1,f δ ≤ −δ|z|2 − K + C + |z|2 2,f 2 δ ≤ −δ|z|2 + |z|2 − 1 2 δ = − |z|2 − 1 2 δ = − |w(y)|2|z˜|2 − 1 2 ≤ −K˜ (1 + |z˜|2), for y ∈ [−(M − 1),M − 1]. for some K˜ > 0. Moreover by the definition of f˜ and the assumptions on f it is straightforward to bound the other partial derivatives of f˜. So in total we get K,˜ C˜ > 0 such that for t ∈ [T − hε∗ ,T ], y ∈ m [−(M − 1),M − 1], z˜ ∈ R

2 ∂y˜f˜(t, x, y,˜ z˜) ≤ −K˜ (1 + |z˜| ), 2 |∂xf˜(t, x, y,˜ z˜)| ≤ C˜(1 + |z˜| ), (82)

|∂z˜f˜(t, x, y,˜ z˜)| ≤ C˜(1 + |z˜|).

75 3. Backward stochastic differential equations with rough driver

Let M := r(−(M − 1)), M¯ := r(M − 1). Then, since u, v take values in [−(M − 1),M − 1], u,˜ v˜ take values in [M, M¯ ]. We can then define

 ˜ ˜ 2 f(t, x, M, z˜) − K(1 + |z˜| )(˜y − M) , y˜ < M, ˜  f˜(t, x, y,˜ z˜) := f˜(t, x, y,˜ z˜) , y˜ ∈ [M, M¯ ], f˜(t, x, M,¯ z˜) − K˜ (1 + |z˜|2)(˜y − M¯ ) , M¯ < y.˜

˜ 29 ˜ ˜ m This function f then satisfies for some K, C > 0 and for all t ∈ [T − hε∗ ,T ], y˜ ∈ R, z˜ ∈ R

˜ 2 ∂y˜f˜(t, x, y,˜ z˜) ≤ −K˜ (1 + |z˜| ), ˜ 2 |∂xf˜(t, x, y,˜ z˜)| ≤ C˜(1 + |z˜| ), (83) ˜ |∂z˜f˜(t, x, y,˜ z˜)| ≤ C˜(1 + |z˜| + |y˜||z˜|).

˜ andu, ˜ v˜ are also sub-(resp. super-) solution to (80) with f˜ replaced by f˜. 30 We can hence assume the validity of (83) for f˜. 2. Comparison under structure condition Letu ˜,v ˜ be a semicontinuous sub-(resp. super-)solution to 1 − ∂ u˜(t, x) − Tr[σ(t, x)σ(t, x)T D2u˜(t, x)] − h(b(t, x),Du˜(t, x)i t 2 n − f˜(t, x, u˜(t, x),Du˜(t, x)σ(t, x)) = 0, t ∈ (0,T ), x ∈ R

m where f˜ satisfies (83) for t ∈ [0,T ], y˜ ∈ R, z˜ ∈ R . Letu ˜ be bounded above andv ˜ be bounded. n Assumeu ˜(T, ·) ≤ v˜(T, ·). We showu ˜ ≤ v˜ on (0,T ] × R . γ First of all, note thatu ˜γ(t, x) :=u ˜(t, x) − t is also a subsolution. Sinceu ˜ ≤ v˜ follows from u˜γ ≤ v˜ in the limit γ → 0, it suffices to prove comparison under the additional assumption

lim u˜(t, x) = −∞, uniformly on n. t→0 R

Define

L := sup [˜u(t, x) − v˜(t, x)] x∈Rn,t∈(0,T ] and also

L(h) := sup u˜(t, x) − v˜(t, x0) , |x−x0|≤h,t∈(0,T ] L0 := lim L(h). h→0 ˜ 29Note that f˜ is not necessarily continuously differentiable iny ˜ anymore, but, as was noted on page 48, we can directly work with functions that are only (locally) Lipschitz and bound the corresponding Lipschitz constants. 30 The reason one wants bounds globally in y is that is that the proof involvesu ˜γ =u ˜ − γ/t which is unbounded. In fact, it is possible to carry out the comparison proof without penalizing t = 0. It suffices to use the slightly more general version of the parabolic theorem of sums in Theorem 50. Following this approach also leads to comparison at t = 0, if we take into consideration the remarks in [14] on the accessibility of a subsolution.

76 3. Backward stochastic differential equations with rough driver

One has of course L ≤ L0. We will show L0 ≤ 0. Consider

|x − x0|2 ψ (t, x, x0) :=u ˜(t, x) − v˜(t, x0) − − η(|x|2 + |x0|2). ε,η ε2

ˆ0 ˆ0 n Let Lε,η be the maximum of ψε,η and (t,ˆ x,ˆ x ) = (tˆε,η, xˆε,η, x ε,η) ∈ (0,T ] × R a maximizing point, which exists by the assumptions onu ˜ andv ˜. We argue by contradiction. Hence assumeu ˜(s, z) − v˜(s, z) > δ for some (s, z). Then also L0 > δ. We first argue, that for small enough values of ε, η the optimizing time parameter tˆ cannot be T . Indeed, assuming tˆ= T we can estimate

2 δ − 2η|z| = ψε,η(s, z, z) 0 ≤ ψε,η(T, x,ˆ xˆ )  0 2  0 |x − x | 2 0 2 = sup u(T, x) − v(T, x ) − 2 − η(|x| + |x | ) . x,x0 ε

Now by Theorem 3.1 in [18], applied to u(T, x) − η|x|2 and v(T, x) + η|x0|2, we get

0  2 lim ψε,η(T, x,ˆ xˆ ) = sup u(T, x) − v(T, x) − 2η|x| ≤ sup [u(T, x) − v(T, x)] ≤ 0. ε→0 x x

It follows that for ε, η small enough, tˆ 6= T . Also, by assumptionu ˜(s, z) − v˜(s, z) > δ; hence 0 we have for η small enough, thatu ˜(t,ˆ xˆ) − v˜(t,ˆ xˆ ) ≥ Lε,η ≥ δ > 0. We assume to be in this scenario from now on. By applying the parabolic Theorem on Sums (e.g. Theorem 8.3 in [18]), we get

(b, p,ˆ Xˆ) ∈ P¯2,+u˜(t,ˆ xˆ), (b0, pˆ0, Xˆ 0) ∈ P¯2,+v˜(t,ˆ xˆ0),

0 xˆ−xˆ0 0 xˆ−xˆ0 0 such that b − b = 0,p ˆ = 2 ε2 + 2ηxˆ,p ˆ = 2 ε2 − 2ηxˆ and

 Xˆ 0  4  I −I   I 0  ≤ + 4η . (84) 0 −Xˆ 0 ε2 −II 0 I

0 |x−x0|2 2 0 2 Indeed, defining ϕ(t, x, x ) := ε2 + η(|x| + |x | ) we have

2  I −I   I 0  A := D2ϕ(t, x, x0) = + 2η . ε2 −II 0 I

Then  8 η   I −I   I 0  A2 = + 8 + 4η2 . ε4 ε2 −II 0 I

By the Theorem on Sums, for every a > 0 there exist said elements of the jets, such that

 Xˆ 0  ≤ A + aA2. 0 −Xˆ 0

Hence we can choose a so small such that (84) holds.

77 3. Backward stochastic differential equations with rough driver

By the viscosity property,

0 ≤ Tr[σσT (t,ˆ xˆ)Xˆ] − Tr[σσT (t,ˆ xˆ0)Xˆ 0] + hb(t,ˆ xˆ), pˆi − hb(t,ˆ xˆ0), pˆ0i + f˜(t,ˆ x,ˆ u˜(t,ˆ xˆ), pσˆ (t,ˆ xˆ)) − f˜(t,ˆ xˆ0, v˜(t,ˆ xˆ0), pˆ0σ(t,ˆ xˆ0)) = (i) + (ii) + (iii).

Where

(i) := Tr[σσT (t,ˆ xˆ)Xˆ] − Tr[σσT (t,ˆ xˆ0)Xˆ 0], (ii) := hb(t,ˆ xˆ), pˆi − hb(t,ˆ xˆ0), pˆ0i, (iii) := f˜(t,ˆ x,ˆ u˜(t,ˆ xˆ), pσˆ (t,ˆ xˆ)) − f˜(t,ˆ xˆ0, v˜(t,ˆ xˆ0), pˆ0σ(t,ˆ xˆ0)).

T  σ(t,ˆ xˆ)   σ(t,ˆ xˆ)  Multiplying (84) with from the right side, with from the left and σ(t,ˆ xˆ0) σ(t,ˆ xˆ0) then taking the trace, we get

(i) = Tr[σσT (t,ˆ xˆ)Xˆ] − Tr[σσT (t,ˆ xˆ0)Xˆ 0] 4 ≤ ||σ(t,ˆ xˆ) − σ(t,ˆ xˆ0)||2 + 4η ||σ(t,ˆ xˆ)||2 + ||σ(t,ˆ xˆ0)||2 ε2 2 2 2 4 ≤ C |xˆ − xˆ0|2 + 8ηC2. σ ε2 σ

Moreover

(ii) = hb(t,ˆ xˆ0), pˆ0i − hb(t,ˆ xˆ), pˆi xˆ − xˆ0 ≤ |b(t,ˆ xˆ0) − b(t,ˆ xˆ)||2 | + |b(t,ˆ xˆ)||2ηxˆ0 + 2ηxˆ| ε2 xˆ − xˆ0 ≤ C |xˆ − xˆ0||2 | + C 2η(|xˆ0| + |xˆ|) b ε2 b |xˆ − xˆ0|2 = C 2 + C 2η(|xˆ0| + |xˆ|). b ε2 b

We have

(iii) = f˜(t,ˆ x,ˆ u˜(t,ˆ xˆ), pσˆ (t,ˆ xˆ)) − f˜(t,ˆ xˆ0, v˜(t,ˆ xˆ0), pˆ0σ(t,ˆ xˆ0)) Z 1 0 0 0 0 = [∂xf˜((∗))(ˆx − xˆ ) + ∂y˜f˜((∗))(˜u(t,ˆ xˆ) − v˜(t,ˆ xˆ )) + ∂z˜f˜((∗))(ˆpσ(t,ˆ xˆ) − pˆ σ(t,ˆ xˆ ))]dλ, 0 where

(∗) := (t,ˆ λxˆ + (1 − λ)ˆx0, λu˜(t,ˆ xˆ) + (1 − λ)˜v(t,ˆ xˆ0), λpσˆ (t,ˆ xˆ) + (1 − λ)ˆp0σ(t,ˆ xˆ0)).

0 We know |v˜(t,ˆ xˆ )| ≤ ||v˜||∞ < ∞ and by the upper boundedness ofu ˜ and by the definition of 0 the maximizier we get ∞ > C ≥ u˜(t,ˆ xˆ) ≥ u˜(T, 0)−||v˜||∞ > ∞. Hence λu˜(t,ˆ xˆ)+(1−λ)˜v(t,ˆ xˆ ) is always bounded and we can assume that actually

|∂z˜f˜(t, x, y,˜ z˜)| ≤ C˜(1 + |z˜|).

0 Remember moreover that we assume η small enough, such thatu ˜(t,ˆ xˆ) − v˜(t,ˆ xˆ ) ≥ Lε,η ≥ q δ > 0. Especially we have |xˆ − xˆ0| ≤ ε u˜(t,ˆ xˆ) − v˜(t,ˆ xˆ0). Hence we can estimate (let

78 3. Backward stochastic differential equations with rough driver

(∗∗) := λpσˆ (t,ˆ xˆ) + (1 − λ)ˆp0σ(t,ˆ xˆ0)) Z 1 (iii) ≤ [C˜(1 + |(∗∗)|2)|xˆ − xˆ0| − K˜ (1 + |(∗∗)|2)(˜u(t,ˆ xˆ) − v˜(t,ˆ xˆ0)) 0 + C˜(1 + |(∗∗)|)|pσˆ (t,ˆ xˆ) − pˆ0σ(t,ˆ xˆ0)|]dλ Z 1 q ≤ [C˜(1 + |(∗∗)|2)ε u˜(t,ˆ xˆ) − v˜(t,ˆ xˆ0) − K˜ (1 + |(∗∗)|2)(˜u(t,ˆ xˆ) − v˜(t,ˆ xˆ0)) 0 1 + Cϑ˜ (1 + |(∗∗)|)2(˜u(t,ˆ xˆ) − v˜(t,ˆ xˆ0)) + |pσˆ (t,ˆ xˆ) − pˆ0σ(t,ˆ xˆ0)|2]dλ ϑ(˜u(t,ˆ xˆ) − v˜(t,ˆ xˆ0)) Z 1 q ≤ [C˜(1 + |(∗∗)|2)ε u˜(t,ˆ xˆ) − v˜(t,ˆ xˆ0) − K˜ (1 + |(∗∗)|2)(˜u(t,ˆ xˆ) − v˜(t,ˆ xˆ0)) 0 1 + Cϑ˜ 2(1 + |(∗∗)|2)(˜u(t,ˆ xˆ) − v˜(t,ˆ xˆ0)) + |pσˆ (t,ˆ xˆ) − pˆ0σ(t,ˆ xˆ0)|2]dλ. ϑ(˜u(t,ˆ xˆ) − v˜(t,ˆ xˆ0))

√ Choose ϑ = C˜ . We then have for ε2 < C˜ δ 6K˜ 3K˜ K˜ 1 (iii) ≤ − u˜(t,ˆ xˆ) − v˜(t,ˆ xˆ0) + |pσˆ (t,ˆ xˆ) − pˆ0σ(t,ˆ xˆ0)|2 3 ϑ u˜(t,ˆ xˆ) − v˜(t,ˆ xˆ0) ˜ K 0  0 0 2 1 ≤ − u˜(t,ˆ xˆ) − v˜(t,ˆ xˆ ) + (|pˆ|Cσ|xˆ − xˆ | + |pˆ − pˆ |Cσ) 3 ϑ u˜(t,ˆ xˆ) − v˜(t,ˆ xˆ0) ˜ 0 2 K 0  |xˆ − xˆ | 0 0 2 1 ≤ − u˜(t,ˆ xˆ) − v˜(t,ˆ xˆ ) + (2 + 2Cση|xˆ||xˆ − xˆ | + Cσ|2ηxˆ + 2ηxˆ |) . 3 ε2 ϑ u˜(t,ˆ xˆ) − v˜(t,ˆ xˆ0)

Now, Lemma 3.5 in [43] (see also Lemma 2 in [10]) yields h i lim inf lim inf u˜(t,ˆ xˆ) − v˜(t,ˆ xˆ0) = L0, ε→0 η→0 |xˆ − xˆ0| lim sup lim sup = 0, ε→0 η→0 ε lim sup lim sup η(|xˆ|2 + |xˆ0|2) = 0. ε→0 η→0

Hence we get lim supε→0 lim supη→0(i) ≤ 0, lim supε→0 lim supη→0(ii) ≤ 0, and lim supε→0 lim supη→0(iii) ≤ K˜ 0 − 3 L . Combining, we arrive at K˜ 0 ≤ − L0, 3 which is the desired contradiction.

In the proof above we need a version of the classical theorem of sums, that does not seem to be available in the literature. Let us recall the usual statement first.

n 2n Theorem 49 ([19, Thm 7]). Let u1, u2 ∈ USC ((0,T ) × R ) and w ∈ USC (0,T ) × R be given by w (t, x) = u1 (t, x1) + u2 (t, x2) 2n 2n 2n Suppose that s ∈ (0,T ) , z = (z1, z2) ∈ R , b ∈ R, p = (p1, p2) ∈ R ,A ∈ S with (b, p, A) ∈ P2,+w (s, z) . (85)

79 3. Backward stochastic differential equations with rough driver

Assume moreover that there is an r > 0 such that for every M > 0 there is a C such that for i = 1, 2

2,+ bi ≤ C whenever (bi, qi,Xi) ∈ P w (t, xi) , (86)

|xi − zi| + |s − t| < r and |ui (t, xi)| + |qi| + kXik ≤ M.

n Then for each ε > 0 there exists (bi,Xi) ∈ R×S such that

2,+ (bi, pi,Xi) ∈ P¯ u (s, zi) and     1 X1 0 2 − + kAk I ≤ ≤ A + εA and b1 + b2 = b. (87) ε 0 X2

The proof of the above theorem is reduced (cf. Lemma 8 in [19]) to the case b = 0, z = 0, p = 0 and v1 (s, 0) = v2 (s, 0) = 0, where (in order to avoid confusion) we write vi instead of ui. Condition (85) translates than to 1 v (t, x ) + v (t, x ) − hAx, xi ≤ 0 for all (t, x) ∈ (0,T ) × 2n; (88) 1 1 2 2 2 R this also means that the left-hand-side as a function of (t, x1, x2) has a global maximum at n (s, 0, 0). The assertion of the (reduced) theorem is then the existence of (bi,Xi) ∈ R×S 2,+ such that (bi, 0,Xi) ∈ P¯ vi (s, 0) for i = 1, 2 and (87) holds with b = 0.

n Theorem 50. Assume that ui has a finite extension to (0,T ] × R , i = 1, 2, via its semi- continuous envelopes, that is,

ui (T, x) = lim sup ui (t, y) < ∞. n (t,y)∈(0,T )×R : t↑T,y→x

Then the above theorem remains valid at s = T if

2,+ 2,+ P w (s, z) and P¯ u (s, zi) is replaced by 2,+ ¯2,+ PQ w (T, z) and PQ u (T, zi) and the final equality in (87) is replaced by

b1 + b2 ≥ b. (89)

2,+ Remark 51. If we knew (but we don’t!) that the final conclusion is (bi, pi,Xi) ∈ P u (T, zi), ¯2,+ rather than just being an element in the closure PQ u (T, zi), then we could trivially diminish the bi’s such as to have b1 + b2 = b.

Proof. Step 1: We focus on the reduced setting (and thus write vi instead of ui) and (following the proof of Lemma 8 in [19]) redefine vi (ti, xi) as −∞ when |xi| > 1 or ti ∈/ [T/2,T ]. We can also assume that (88) is strict if t < s = T or x 6= 0. For the rest of the proof, we shall abbreviate (t1, t2) , (x1, x2) etc by (t, x). With this notation in mind we set 1 w (t, x) = v (t , x ) + v (t , x ) − hAx, xi . 1 1 1 2 2 2 2

80 3. Backward stochastic differential equations with rough driver

By the extension via semi-continuous envelopes, there exist a sequence (tn, xn) ∈ (0,T )2 × n 2 (R ) , such that (tn, xn) ≡ t1,n, t2,n, x1,n, x2,n → (T,T, 0, 0) .

We now consider w with a penalty term for t1 6= t2 and a barrier at time T for both t1 and t2. ( 2 ) m X 2 ψ (t, x) = w (t, x) − |t − t |2 + T − ti,n / (T − t ) , m,n 2 1 2 i i=1 2 indexed by (m, n) ∈ N , say. By assumption w has a maximum at (T,T, 0, 0) which we may assume to be strict (otherwise subtract suitable forth order terms ...). Define now

 2 2 t,ˆ xˆ ∈ arg max ψm,n over [T − r, T ] × B¯r (0) where r = T/2 (for instance). When we want to emphasize dependence on m, n we write  tˆm,n, xˆm,n . We shall see below (Step 2) that there exists increasing sequences m = m (k) , n = n (k) so that  t,ˆ xˆ |m=m(k),n=n(k) → (T,T, 0, 0) . (90) Using the (elliptic) theorem of sums in the form of [19, Theorem 1] we find that there are

2,+  (bi, pi,Xi) ∈ P¯ vi tˆi, xˆi

(where tˆi → T, xˆi → 0 as k → ∞) such that the first part of (87) holds and     xˆ1 p1 i,ε2 2 A = , bi = m (ti − t3−i) + T − t / (T − ti) . xˆ2 p2 for i = 1, 2. Note that

b1 + b2 = m (t1 − t2) + m (t2 − t1) + (positive terms) ≥ 0; since each bi is bounded above by the assumptions and the estimates on the Xi it follows that the bi lie in precompact sets. Upon passing to the limit k → ∞ we obtain points

2,+ (bi, pi,Xi) ∈ P¯ vi (T, 0) , i = 1, 2; with b1 + b2 ≥ 0. Step 2: We still have to establish (90). We first remark that for arbitrary (strictly) increasing sequences m (k) , n (k), compactness implies that

  2 ¯ 2 tˆm(k),n(k), xˆm(k),n(k) : k ≥ 1 ∈ [T − r, T ] × Br (0) has limit points. Note also tˆ1, tˆ2 ∈ [T − r, T ) thanks to the barrier at time T . The key technical ingredient for the remained of the argument is and we postpone details of these to Step 3 below:

( 2 )   m 2 X i,n2  1 1 w t,ˆ xˆ − ψm,n t,ˆ xˆ = tˆ1 − tˆ2 + T − t / T − tˆi → 0 as << → 0. 2 n m i=1 (91) In particular, for every k > 0 there exists m (k) such that for all m ≥ m (k) 1 lim sup {...} < . n→∞ k

81 3. Backward stochastic differential equations with rough driver

By making m (k) larger if necessary we may assume that m (k) is (strictly) increasing in k. Furthermore there exists n (m (k) , k) = n (k) such that for all n ≥ n (k): {...} < 2/k. Again, we may make n (k) larger if necessary so that n (k) is strictly increasing. Recall t1,n(k) − t2,n(k) → T − T = 0 as k → ∞. For reasons that will become apparent further below, we actually want the stronger statement that

m (k) 2 t1,n(k) − t2,n(k) → 0 as k → ∞ (92) 2 which we can achieve by modifying n (k) such as to run to ∞ even faster. Note that the so-constructed m = m (k) , n = n (k) has the property    w t,ˆ xˆ − ψm,n t,ˆ xˆ |m=m(k),n=(k) = {...} |m=m(k),n=(k) → 0 as k → ∞. (93)

By switching to a subsequence (kl) if necessary we may also assume (after relabeling) that   2 ¯ 2 tˆm(k),n(k), xˆm(k),n(k) → t,˜ x˜ ∈ [T − r, T ] × Br (0) as k → ∞.

In the sequel we think of t,ˆ xˆ as this sequence indexed by k. We have   w t,˜ x˜ ≥ lim sup w t,ˆ xˆ |m=m(k),n=(k) by upper-semi-continuity (94) k→∞  = lim sup ψm,n t,ˆ xˆ |m=m(k),n=(k) thanks to (93). k→∞ On the other hand, thanks to the particular form of our time-T barrier,  ψm,n t,ˆ xˆ n n ≥ ψm,n (t , x ) ( 2 ) n n m 1,n 2,n 2 X i,n = w (t , x ) − t − t + T − t . 2 i=1 Take now m = m (k) , n = n (k) as constructed above. Then  ψm,n t,ˆ xˆ |m=m(k),n=(k)   ≥ w tn(k), xn(k)

( 2 ) m (k) 2 X   − t1,n(k) − t2,n(k) + T − ti,n(k) 2 i=1 The first term in the curly bracket goes to zero (with k → ∞) thanks to (92), the other term goes to zero since ti,n → T with n → ∞, and hence also along n (k). On the other hand (recall xi,n → 0)   1 w tn(k), xn(k) → v (T, 0) + v (T, 0) − hA0, 0i = 0 as k → ∞. 1 2 2

(In the reduced setting v1 (T, 0) = v2 (T, 0) = 0.) It follows that  lim inf ψm,n t,ˆ xˆ |m=m(k),n=(k) = 0. k→∞

Together with (94) we see that w t,˜ x˜ ≥ 0. But w (T,T, 0, 0) = 0 was a strict maximum in 2 2  [T − r, T ] × B¯r (0) and so we must have t,˜ x˜ = (T,T, 0, 0).

82 3. Backward stochastic differential equations with rough driver

Step 3: Set

0 M (h) = sup w (t1, t2, x1, x2) and M = lim M (h) 2 2 h→0 (t,x)∈[T −r,T ) ×B¯r(0) |t1−t2|

It is enough to show

 0  lim sup w t,ˆ xˆ ≤ M ≤ lim inf ψm,n t,ˆ xˆ . (95) 1 1 1 << 1 →0 n << m →0 n m since the claimed

( 2 )   m 2 X i,n2  1 1 w t,ˆ xˆ − ψm,n t,ˆ xˆ = tˆ1 − tˆ2 + T − t / T − tˆi → 0 as << → 0. 2 n m i=1 follows from   lim sup {...} ≤ lim sup w t,ˆ xˆ − lim inf ψm,n t,ˆ xˆ 1 1 1 1 1 << 1 →0 n << m →0 n << m →0 n m ≤ 0 (and hence = 0).

 2 2 Note that w t,ˆ xˆ is bounded on [T − r, T ] × B¯r (0) so that

2  √  tˆ1 − tˆ2 = O (1/m) =⇒ w t,ˆ xˆ ≤ M const/ m .

0 On the other hand, from the very definition of M as limh→0 M (h), there exists a family (th, xh) so that 0 |t1,h − t2,h| ≤ h and w (th, xh) → M as h → 0 (96)

For every m, n we may take (th, xh) as argument of ψm,n (which itself has a maximum at t,ˆ xˆ); hence 2 m X 2 w(t , x ) − h2 − T − ti,n / (T − t ) ≤ ψ t,ˆ xˆ . (97) h h 2 i,h m,n i=1 i,n2 Take now a sequence n = n (h), fast enough increasing as h & such that T − t / (T − ti,h) → 0 with h → 0. It follows that

0 M = lim w(th, xh) h→0 2 ! 2 m 2 X  i,n(h) = lim inf w(th, xh) − h − T − t / (T − ti,h) h →0 2 i=1   ≤ lim inf ψm,n(h) t,ˆ xˆ = lim inf ψm,n t,ˆ xˆ by monotonicity of sup ψm,n in n. h →0 n →∞

i,n (In the last equality we used that t ↑ T ; this shows that sup ψm,n is indeed monotone in n.) The proof is now finished.

83 References

[1] G.B. Arous and F. Castell. Flow decomposition and large deviations. Journal of func- tional analysis, 140(1):23–67, 1996.

[2] G. Barles, R. Buckdahn, and E. Pardoux. Backward stochastic differential equations and integral-partial differential equations. Stochastics An International Journal of Probability and Stochastic Processes, 60(1):57–83, 1997.

[3] F. Baudoin and M. Hairer. A version of h¨ormanderstheorem for the fractional brownian motion. Probability theory and related fields, 139(3):373–395, 2007.

[4] J.M. Bismut. Conjugate convex functions in optimal stochastic control. J. Math. Anal. Appl, 44(1973):384–404, 1973.

[5] V. I. Bogachev. Measure theory. Vol. I. Springer-Verlag, Berlin, 2007.

[6] Leo Breiman. Probability. Classics in Applied Mathematics. SIAM, Philadelphia, PA, 1992.

[7] R. Buckdahn and J. Ma. Stochastic viscosity solutions for nonlinear stochastic partial differential equations. I. Stochastic Process. Appl., 93(2):181–204, 2001.

[8] R. Buckdahn and J. Ma. Pathwise stochastic control problems and stochastic HJB equations. SIAM J. Control Optim., 45(6):2224–2256 (electronic), 2007.

[9] M. Caruana and P. K. Friz. Partial differential equations driven by rough paths. J. Differential Equations, 247(1):140–173, 2009.

[10] Michael Caruana, Peter K. Friz, and Harald Oberhauser. A (rough) pathwise approach to a class of non-linear stochastic partial differential equations. Annales de l’Institut Henri Poincare (C) Non Linear Analysis, 28(1):27 – 46, 2011.

[11] T. Cass and P. Friz. Densities for rough differential equations under h¨ormander’scon- dition. Annals of mathematics, 171(3):2115–2141, 2010.

[12] T. Cass, C. Litterer, and T. Lyons. Integrability estimates for gaussian rough differential equations. Arxiv preprint arXiv:1104.1813, 2011.

[13] T. Cass and T. Lyons. Evolving communities with individual preferences. preprint, 2010.

[14] Y.G. Chen, Y. Giga, and S. Goto. Remarks on viscosity solutions for evolution equations. Proceedings of the Japan Academy. Series A Mathematical sciences, 67(10):323–328, 1991.

[15] J. M. C. Clark. The design of robust approximations to the stochastic differential equa- tions of nonlinear filtering. In Communication systems and random process theory (Proc. 2nd NATO Advanced Study Inst., Darlington, 1977), pages 721–734. NATO Advanced Study Inst. Ser., Ser. E: Appl. Sci., No. 25. Sijthoff & Noordhoff, Alphen aan den Rijn, 1978.

[16] J. M. C. Clark and D. Crisan. On a robust version of the integral representation formula of nonlinear filtering. Probab. Theory Related Fields, 133(1):43–56, 2005.

84 [17] L. Coutin and Z. Qian. Stochastic analysis, rough path analysis and fractional brownian motions. Probability theory and related fields, 122(1):108–140, 2002.

[18] M. G. Crandall, H Ishii, and P.-L. Lions. User’s guide to viscosity solutions of second order partial differential equations. Bull. Amer. Math. Soc. (N.S.), 27(1):1–67, 1992.

[19] M.G. Crandall and H. Ishii. The maximum principle for semicontinuous functions. Dif- ferential Integral Equations, 3(6):1001–1014, 1990.

[20] D. Crisan and B. Rozovski˘ı(eds). The Oxford handbook of nonlinear filtering. Oxford: Oxford University Press. xiv, 1063 p. , 2011.

[21] AM Davie. Differential equations driven by rough paths: An approach via discrete approximation. Applied Mathematics Research eXpress, 2007(2007), 2007.

[22] M. H. A. Davis. On a multiplicative functional transformation arising in nonlinear filtering theory. Z. Wahrsch. Verw. Gebiete, 54(2):125–139, 1980.

[23] M. H. A. Davis. A pathwise solution of the equations of nonlinear filtering. Teor. Veroyatnost. i Primenen., 27(1):160–167, 1982.

[24] M. H. A. Davis and M. P. Spathopoulos. Pathwise nonlinear filtering for nondegenerate diffusions with noise correlation. SIAM J. Control Optim., 25(2):260–278, 1987.

[25] M.H.A. Davis. Pathwise nonlinear filtering. Stochastic systems: the mathematics of filtering and identification and applications, Proc. NATO Adv. Study Inst., Les Arcs/Savoie/ France 1980, 505-528 (1981)., 1981.

[26] M.H.A. Davis. Pathwise nonlinear filtering with correlated noise. Crisan, Dan (ed.) and Rozovski˘ı,Boris (ed.), The Oxford handbook of nonlinear filtering. Oxford: Oxford University Press. 403-424 (2011)., 2011.

[27] A. Deya, M. Gubinelli, and S. Tindel. Non-linear rough heat equations. Arxiv preprint arXiv:0911.0618, 2009.

[28] J. Diehl, P.K. Friz, and H. Oberhauser. Parabolic Comparison Revisited and Applica- tions. Preprint, 2011.

[29] H. Doss. Liens entre ´equationsdiff´erentielles stochastiques et ordinaires. Ann. Inst. H. Poincar´eSect. B (N.S.), 13(2):99–125, 1977.

[30] N. El Karoui, S. Peng, and M. C. Quenez. Backward stochastic differential equations in finance. Math. Finance, 7(1):1–71, 1997.

[31] W.H. Fleming and H.M. Soner. Controlled Markov processes and viscosity solutions. Springer Verlag, 2006.

[32] P. K. Friz and H. Oberhauser. Rough path stability of SPDEs arising in non-linear filtering. Arxiv preprint arXiv:1005.1781, 2010.

[33] P. K. Friz and N. B. Victoir. Multidimensional stochastic processes as rough paths, volume 120 of Cambridge Studies in Advanced Mathematics. Cambridge University Press, Cambridge, 2010. Theory and applications.

85 [34] P.K. Friz. Continuity of the Ito-map for Hoelder rough paths with applications to the support theorem in Hoelder norm. Probability and Partial Differential Equations in Modern Applied Mathematics, IMA Volumes in Mathematics and its Applications, Vol 140 (2005) 117-135, 2005.

[35] JG Gaines and TJ Lyons. Variable step size control in the numerical solution of stochastic differential equations. SIAM Journal on Applied Mathematics, pages 1455–1484, 1997.

[36] M. Gubinelli and S. Tindel. Rough evolution equations. Arxiv preprint arXiv:0803.0552v1, 2008.

[37] I. Gy¨ongyand N. Krylov. Existence of strong solutions for itˆo’sstochastic equations via approximations. Probability theory and related fields, 105(2):143–158, 1996.

[38] I. Gyongy and T. Mart´ınez.On stochastic differential equations with locally unbounded drift. Czechoslovak Mathematical Journal, 51(4):763–783, 2001.

[39] M. Hairer and N.S. Pillai. Ergodicity of hypoelliptic sdes driven by fractional brownian motion. In Annales de l’Institut Henri Poincar´e,Probabilit´eset Statistiques, volume 47, pages 601–628. Institut Henri Poincar´e,2011.

[40] BM Hambly and TJ Lyons. Stochastic area for brownian motion on the sierpinski gasket. The Annals of Probability, 26(1):132–148, 1998.

[41] W. Hoeffding. Probability inequalities for sums of bounded random variables. Journal of the American Statistical Association, pages 13–30, 1963.

[42] O. Kallenberg. Foundations of modern probability. Springer Verlag, 2002.

[43] M. Kobylanski. Backward stochastic differential equations and partial differential equa- tions with quadratic growth. Ann. Probab., 28(2):558–602, 2000.

[44] T.G. Kurtz and P. Protter. Weak limit theorems for stochastic integrals and stochastic differential equations. The Annals of Probability, 19(3):1035–1070, 1991.

[45] H. J. Kushner. A robust discrete state approximation to the optimal nonlinear filter for a diffusion. Stochastics, 3(2):75–83, 1979.

[46] M. Ledoux, Z. Qian, and T. Zhang. Large deviations and support theorem for diffusion processes via rough paths. Stochastic processes and their applications, 102(2):265–283, 2002.

[47] G. Liang, T. J. Lyons, and Z. Qian. Backward stochastic dynamics on a filtered proba- bility space. Arxiv preprint arXiv:0904.0377, 2009.

[48] P.-L. Lions and P. E. Souganidis. Fully nonlinear stochastic pde with semilinear stochas- tic dependence. C. R. Acad. Sci. Paris S´er.I Math., 331(8):617–624, 2000.

[49] T. J. Lyons. Differential equations driven by rough signals. I. An extension of an in- equality of L. C. Young. Math. Res. Lett., 1(4):451–464, 1994.

[50] T. J. Lyons, M. Caruana, and T. L´evy. Differential equations driven by rough paths, volume 1908 of Lecture Notes in Mathematics. Springer, Berlin, 2007. Lectures from the 34th Summer School on Probability Theory held in Saint-Flour, July 6–24, 2004.

86 [51] T. J. Lyons and Z. Qian. System Control and Rough Paths. Oxford University Press, 2002. Oxford Mathematical Monographs.

[52] T.J. Lyons. Differential equations driven by rough signals. Revista Matem´atica Iberoamericana, 14(2):215–310, 1998.

[53] Y. Mishura. for fractional Brownian motion and related processes. Number 1929. Springer, 2008.

[54] E. Pardoux and S. Peng. Backward stochastic differential equations and quasilinear parabolic partial differential equations. Stochastic partial differential equations and their applications, pages 200–217, 1992.

[55] E.´ Pardoux and S. Peng. Backward doubly stochastic differential equations and systems of quasilinear SPDEs. Probability Theory and Related Fields, 98(2):209–227, 1994.

[56] E.´ Pardoux and S. G. Peng. Adapted solution of a backward stochastic differential equation. Systems Control Lett., 14(1):55–61, 1990.

[57] E. Pardoux and SG Peng. Adapted solution of a backward stochastic differential equa- tion. Systems & Control Letters, 14(1):55–61, 1990.

[58] H. Pham. Continuous-time stochastic control and optimization with financial applica- tions, volume 61 of Stochastic Modelling and Applied Probability. Springer-Verlag, Berlin, 2009.

[59] P.E. Protter. Stochastic integration and differential equations. Springer Verlag, 2004.

[60] H. J. Sussmann. On the gap between deterministic and stochastic ordinary differential equations. Ann. Probability, 6(1):19–41, 1978.

[61] J. Teichmann. Another approach to some rough and stochastic partial differential es- quations. Arxiv preprint arXiv:0908.2814v1, 2009.

[62] L.C. Young. An inequality of the h¨oldertype, connected with stieltjes integration. Acta Mathematica, 67(1):251–282, 1936.

[63] M. Zaehle. Forward integrals and stochastic differential equations. Progress in Probabil- ity, pages 293–302, 2002.

87