A brief introduction to rough paths
Giovanni Zanco
September 2, 2016
These notes are based on a short course I taught at the University of Pisa in April 2016, and originated from a course taught by Jan Maas at IST Austria in autumn 2015, for which I was the teaching assistant. They provide a brief introduction to the theory of rough paths of Hölder regularity between 1/3 and 1/2, with some hints at the theory for rough paths of arbitrary regularity. The exposition and the results presented are heavily based on the book [Friz and Hairer, 2014], which I suggest as a reference for a first study of the subject. Some of the material presented here is also adapted from [Baudoin, 2013], [Friz and Victoir, 2010] and from personal handwritten notes by Jan Maas. Further references are given throughout the text. The proofs given here follow the cited references very closely.

These notes have been developed as a handout, with the goal of organising and extending the material presented during the mentioned courses, thus giving a very concise overview of the basic results and of the lines along which the theory of rough paths developed. They are not supposed to be exhaustive, and many important topics are omitted, together with many details. For all sections but the last one, basic knowledge of real analysis and stochastics is required, but nothing more. The last section contains more advanced topics, but it is kept at a very informal level, with few rigorous proofs.

I warmly thank Jan Maas, for his course provided me with a general and comprehensive outline upon which these short notes developed. The current shape of this material has also benefited from the comments given and the questions raised during the course in Pisa; I thank all the participants for their interest and help, and in particular I acknowledge Franco Flandoli for his useful suggestions about the exposition of some topics discussed herein.
Contents
1 Introduction
2 Elements of Young integration
3 Rough Paths
4 Some comments on the general theory
5 Rough integration
6 Rough differential equations
7 Stochastic processes as rough paths
8 Stochastic differential equations
9 Applications to a stochastic partial differential equation
1 Introduction
The central objects of these lectures will be differential equations of the form
\[ dY_t = f(Y_t)\,dX_t \tag{1} \]
where $X\colon[0,T]\to E$ is a driving signal, $Y\colon[0,T]\to V$ is the (unknown) output, $f\colon V\to L(E;V)$ is a smooth function, $E$ and $V$ are Banach spaces and $L(E;V)$ denotes the space of continuous linear maps from $E$ to $V$. A common choice is $E=\mathbb{R}^d$ and $V=\mathbb{R}^n$, so that $f\colon\mathbb{R}^n\to\mathbb{R}^{n\times d}$. We expect $X$ and $Y$ to be continuous functions. The usual way to interpret equation (1) together with some initial datum $y_0$ is in its integral form
\[ Y_t = y_0 + \int_0^t f(Y_s)\,dX_s \tag{2} \]
and a standard scheme to solve (2) is
(i) to give meaning to the integral;
(ii) to apply some fixed point result.
To deal with item (i) we need an integration theory that is satisfactory in the sense that it allows us to work with signals and unknowns of suitable regularity, depending on the problem we are interested in. To deal with item (ii) we need the space of solutions to (2) to have some nice metric structure.

If $f$ is smooth and $X$ is differentiable, then the classical theory applies: equation (2) is interpreted as
\[ Y_t = y_0 + \int_0^t f(Y_s)\,\dot X_s\,ds \]
and the solution $Y$ can be found as the fixed point of the map $M$ defined on continuous functions by
\[ M(Y)_t := y_0 + \int_0^t f(Y_s)\,\dot X_s\,ds\,. \]
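As an illustration of the classical case, the fixed point of the map $M$ can be approximated by iterating $M$ on a time grid. The following is only a minimal sketch: the function name `picard_solve`, the trapezoidal quadrature and the test case $f(y)=y$, $X_t=\sin t$ (for which $Y_t=e^{\sin t}$) are choices made here for illustration and do not come from the references.

```python
import math

def picard_solve(f, x_dot, y0, T=1.0, steps=1000, iterations=30):
    """Iterate M(Y)_t = y0 + int_0^t f(Y_s) x_dot(s) ds on a uniform grid."""
    h = T / steps
    grid = [i * h for i in range(steps + 1)]
    y = [y0] * (steps + 1)  # initial guess: the constant path y0
    for _ in range(iterations):
        integrand = [f(y[i]) * x_dot(grid[i]) for i in range(steps + 1)]
        new_y = [y0]
        for i in range(steps):
            # trapezoidal rule for the integral over [t_i, t_{i+1}]
            new_y.append(new_y[-1] + 0.5 * h * (integrand[i] + integrand[i + 1]))
        y = new_y
    return grid, y

# Illustrative test case (an assumption of this sketch):
# f(y) = y and X_t = sin(t), so that the exact solution is Y_t = exp(sin t).
grid, y = picard_solve(lambda v: v, math.cos, 1.0)
picard_error = abs(y[-1] - math.exp(math.sin(1.0)))
```

The scheme uses the derivative of $X$ explicitly, which is exactly what is no longer available once $X$ is merely Hölder continuous.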
We will be interested here in situations in which $Y$ is an $\alpha$-Hölder function (and hence $t\mapsto f(Y_t)$ is $\alpha$-Hölder as well) and $X$ is a $\beta$-Hölder function. If $\alpha+\beta>1$ we can interpret the integral appearing in (2) as a Young integral (see section 2) and find the solution as the unique fixed point of the map $M$ above in the space $C^\alpha$. If $\alpha+\beta\le 1$ the question is more delicate; rough paths theory provides a convenient answer.

Of course there are well known probabilistic results that allow one to define the so-called stochastic integrals and to solve stochastic differential equations like (2). However they typically do not provide pathwise solutions (if $X$ is a Brownian motion one cannot fix a Brownian path $X(\omega)$ and solve (2) for that particular realisation of $X$), but rather solutions in a probabilistic sense; indeed all stochastic integrals require some probabilistic property of $X$ and $Y$ to be well defined (semimartingale structure, adaptedness, etc.) rather than some regularity property of the typical paths of $X$ and $Y$. We will see that many classical results about stochastic differential equations are recovered in the theory of rough paths. What does the study of rough paths then add to the classical theories of stochastic analysis? Among many interesting answers to this question, we will focus mainly on the following one. A celebrated result by T. Lyons (see [Lyons, 1991]) states the following:
Theorem 1.1. There exists no separable Banach space B ⊂ C ([0, 1]; R) with the properties:
(i) sample paths of Brownian motion belong to B almost surely;
(ii) the map $(g,h)\mapsto\int_0^{\cdot} g(t)\,\dot h(t)\,dt$ extends from smooth functions to a continuous map on $B\times B$ taking values in $C([0,1];\mathbb{R})$.
For example the solution map $B\mapsto Y$ of the Stratonovich differential equation
\[ dY_t = f(Y_t)\circ dB_t \]
is measurable but not continuous, in general, with respect to any reasonable topology (and indeed not all smooth approximations to $B$ give convergence to the Stratonovich solution of the above equation). Rough paths provide a framework in which continuity of the solution map can, to a certain extent, be restored.
Here we will consider driving signals with Hölder regularity $\alpha\in(1/3,1/2]$. The original theory works with functions of finite $p$-variation rather than with Hölder functions and allows one to consider signals of arbitrarily low regularity, but it requires heavy algebraic methods that would take too long to introduce. Restricting to $\alpha\in(1/3,1/2]$ avoids the study of signatures and many algebraic difficulties, and is still rich enough to display many features of the theory and to tackle some interesting problems. Some hints of the general theory will however be given below; the interested reader can refer to [Friz and Victoir, 2010] and [Lyons et al., 2007].
2 Elements of Young integration
We recall the following definitions that will be frequently used in these notes.
Definition 2.1 (Hölder continuous functions). Let $\alpha>0$. A function $X$ defined on an interval $[0,T]\subset\mathbb{R}$ and taking values in a Banach space $E$ is $\alpha$-Hölder continuous (often $\alpha$-Hölder for brevity) if the quantity
\[ \|X\|_\alpha := \sup_{t\neq s}\frac{|X_t-X_s|_E}{|t-s|^\alpha} \]
is finite. The space of all $\alpha$-Hölder continuous functions from $[0,T]$ into $E$ is denoted by $C^\alpha([0,T];E)$.
When no confusion can arise about the domain of the functions at hand we will simply write $C^\alpha(E)$, or even $C^\alpha$ if the co-domain is clear as well. Hölder continuous functions are continuous, and any $\alpha$-Hölder function with $\alpha>1$ is constant. The quantity $\|\cdot\|_\alpha$ is a semi-norm (it vanishes on constant functions); however the quantity
\[ \|X\|_{C^\alpha} := |X_0|_E + \|X\|_\alpha \]
is a norm that makes $C^\alpha(E)$ a Banach space (in general not separable). This norm is equivalent to $\|\cdot\|_\infty+\|\cdot\|_\alpha$. If $\alpha<\beta$ then obviously $C^\beta\subset C^\alpha$.

We say that an $E$-valued function $X$ belongs to $C^{k,\alpha}(E)$ if it is $k$ times differentiable with its $k$-th derivative being $\alpha$-Hölder. The space $C^{k,\alpha}(E)$ endowed with the norm $\|X\|_{C^{k,\alpha}} = \|X\|_{C^k} + \|X^{(k)}\|_\alpha$ is also a Banach space.
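To get a feeling for the Hölder semi-norm, one can evaluate the defining ratio over all pairs of points of a finite grid. This only gives a lower bound for the supremum in Definition 2.1; the helper name `holder_seminorm` and the example $X_t=\sqrt t$ are assumptions made for this sketch.

```python
import math

def holder_seminorm(values, times, alpha):
    """Largest ratio |X_t - X_s| / |t - s|^alpha over all pairs of grid points."""
    best = 0.0
    for i in range(len(times)):
        for j in range(i + 1, len(times)):
            ratio = abs(values[j] - values[i]) / (times[j] - times[i]) ** alpha
            best = max(best, ratio)
    return best

n = 200
times = [k / n for k in range(n + 1)]
values = [math.sqrt(t) for t in times]  # X_t = sqrt(t), an illustrative choice

half = holder_seminorm(values, times, 0.5)   # stays bounded under refinement
steep = holder_seminorm(values, times, 0.9)  # grows as the grid is refined
```

For $X_t=\sqrt t$ the ratio with $\alpha=1/2$ is largest for $s=0$, where it equals 1, while for $\alpha=0.9$ the pair $s=0$, $t=1/n$ already gives $n^{0.4}$: the function is $1/2$-Hölder but not $0.9$-Hölder.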
By partition of an interval $I$, in the sequel, we will mean a finite family $\Pi$ of (essentially) disjoint sub-intervals $[s,t]$ of $I$ such that $\bigcup_{[s,t]\in\Pi}[s,t]=I$. Therefore choosing a partition is equivalent to choosing a finite number of points $t_0=0<t_1<\dots<t_N=T$ and dividing $I$ into the sub-intervals $[t_i,t_{i+1}]$. The one-point overlap between adjacent intervals will cause no trouble. The mesh of a partition $\Pi$ is defined as $|\Pi|:=\max_{[s,t]\in\Pi}|t-s|$.

Definition 2.2 (Finite $p$-variation functions). Let $p>0$. A function $X$ defined on an interval $[0,T]\subset\mathbb{R}$ and taking values in a Banach space $E$ has finite $p$-variation if
\[ \|X\|_{p\text{-var}} := \Bigl(\sup_{\Pi}\sum_{[s,t]\in\Pi}|X_t-X_s|_E^p\Bigr)^{1/p} < \infty \]
where the supremum is taken over all partitions $\Pi$ of $[0,T]$. The space of all continuous functions from $[0,T]$ into $E$ with finite $p$-variation is denoted by $C^{p\text{-var}}([0,T];E)$.
As above we will write $C^{p\text{-var}}(E)$ or even $C^{p\text{-var}}$ when spaces are clear from the context. Any function with finite $p$-variation for some $p<1$ is constant, and functions of finite 1-variation are known as bounded variation (BV) functions. The quantity $\|\cdot\|_{p\text{-var}}$ is a semi-norm on $C^{p\text{-var}}$, but
\[ \|X\|_{C^{p\text{-var}}} := |X_0|_E + \|X\|_{p\text{-var}} \]
is a norm (equivalent to $\|\cdot\|_\infty+\|\cdot\|_{p\text{-var}}$) that turns $C^{p\text{-var}}$ into a Banach space (not separable, in general). If $p<q$ then $C^{p\text{-var}}\subset C^{q\text{-var}}$.
Any $\alpha$-Hölder function is easily seen to have finite $1/\alpha$-variation. Conversely, any continuous function $X$ with finite $p$-variation can be written as $X=Y\circ\tau$ where $Y$ is $1/p$-Hölder and $\tau\colon[0,T]\to[0,1]$ is continuous and increasing.
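For a path known only at finitely many points, the supremum over partitions in Definition 2.2 (restricted to partitions made of sample points) can be computed exactly by dynamic programming. The sketch below is one possible way to do it; the name `p_variation` and the sample sequences are hypothetical.

```python
def p_variation(values, p):
    """p-variation of a finite sequence, maximising over all sub-partitions.

    best[j] is the largest value of sum |X_t - X_s|^p over partitions of
    the index range 0..j (the last interval always ends at index j).
    """
    best = [0.0] * len(values)
    for j in range(1, len(values)):
        best[j] = max(best[i] + abs(values[j] - values[i]) ** p for i in range(j))
    return best[-1] ** (1.0 / p)

zigzag = [0.0, 1.0, 0.0, 1.0, 0.0]
one_var = p_variation(zigzag, 1.0)   # total variation of the zigzag
two_var = p_variation(zigzag, 2.0)   # the finest partition is optimal here
monotone = p_variation([0.0, 1.0, 3.0], 2.0)  # the trivial partition is optimal
```

The two examples show that, for $p>1$, the optimal partition depends on the path: oscillations favour fine partitions, monotone stretches favour coarse ones.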
Smooth functions are dense neither in $C^\alpha$ for any $\alpha<1$ nor in $C^{p\text{-var}}$ for any $p>1$. The closure of the set of smooth functions∗ under the $C^\alpha$ norm is denoted by $C^{0,\alpha}$ and the closure of the set of smooth functions under the $C^{p\text{-var}}$ norm is denoted by $C^{0,p\text{-var}}$. Of course we have that $C^{0,\alpha}\subset C^\alpha$ and $C^{0,p\text{-var}}\subset C^{p\text{-var}}$, and we remark that these inclusions are strict. However it can be easily proved that the difference is tiny, in the sense that for every $\alpha<\beta\le1$ the inclusion $C^\beta\subset C^{0,\alpha}$ holds, and for every $1\le p<q$ the inclusion $C^{p\text{-var}}\subset C^{0,q\text{-var}}$ holds. Moreover the closure of $C^{1\text{-var}}$ in $C^{p\text{-var}}$ is again $C^{0,p\text{-var}}$.
We will now briefly recall some ideas about the construction of Young integral, considering the real-valued case for simplicity. Let X ∈ Cβ ([0, 1]; R) and Y ∈ Cα ([0, 1]; R). A first way to define the integral of Y against X consists in studying a Riemann sum approximation along the dyadic partition of [0, 1]. Set
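\begin{align*}
I_0 &= Y_0\,(X_1-X_0)\,,\\
I_1 &= Y_0\,\bigl(X_{1/2}-X_0\bigr) + Y_{1/2}\,\bigl(X_1-X_{1/2}\bigr)\,,\\
&\;\;\vdots\\
I_n &= \sum_{j=0}^{2^n-1} Y_{\frac{j}{2^n}}\Bigl(X_{\frac{j+1}{2^n}}-X_{\frac{j}{2^n}}\Bigr)\,.
\end{align*}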
We thus have that
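\begin{align*}
|I_{n+1}-I_n| &= \Bigl|\sum_{j=0}^{2^n-1}\Bigl(Y_{\frac{j+1/2}{2^n}}-Y_{\frac{j}{2^n}}\Bigr)\Bigl(X_{\frac{j+1}{2^n}}-X_{\frac{j+1/2}{2^n}}\Bigr)\Bigr|\\
&\le \sum_{j=0}^{2^n-1}\|Y\|_\alpha\,2^{-(n+1)\alpha}\,\|X\|_\beta\,2^{-(n+1)\beta}\\
&= \|X\|_\beta\,\|Y\|_\alpha\,2^n\,2^{-(n+1)(\alpha+\beta)}
\end{align*}
and therefore, if $\alpha+\beta>1$, these bounds are summable in $n$, the sequence $(I_n)$ is Cauchy and we can define the Young integral as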
\[ \int_0^1 Y_s\,dX_s := \lim_{n\to\infty} I_n\,. \]
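The dyadic sums $I_n$ are straightforward to compute. The sketch below uses $Y_t=t^{a}$ and $X_t=t^{b}$ with $a=0.6$, $b=0.7$, a choice made here only because the limit $b/(a+b)$ is known in closed form; these functions are monotone, so the integral also exists classically, and the example merely illustrates how the sums stabilise. The name `dyadic_sum` is hypothetical.

```python
def dyadic_sum(y, x, n):
    """I_n = sum_j Y(j/2^n) * (X((j+1)/2^n) - X(j/2^n)) on [0, 1]."""
    m = 2 ** n
    return sum(y(j / m) * (x((j + 1) / m) - x(j / m)) for j in range(m))

a, b = 0.6, 0.7  # illustrative exponents with a + b > 1

def y(t):
    return t ** a

def x(t):
    return t ** b

exact = b / (a + b)  # value of int_0^1 t^a d(t^b)

sums = [dyadic_sum(y, x, n) for n in range(15)]
# successive differences |I_{n+1} - I_n|, the quantities estimated in the text
gaps = [abs(sums[n + 1] - sums[n]) for n in range(14)]
```

Printing `gaps` shows the decay of $|I_{n+1}-I_n|$ that makes $(I_n)$ a Cauchy sequence.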
An alternative definition using finite p-variation spaces is based on the following Young estimate.
∗by smooth we usually mean functions in C∞; however here using piecewise C1 functions would yield the same result.
Proposition 2.3. Let $X,Y\in C^{1\text{-var}}([0,1];\mathbb{R})$ and choose $p,q\ge1$ such that $1/p+1/q>1$. Then
\[ \Bigl|\int_0^1 Y_s\,dX_s - Y_1\,(X_1-X_0)\Bigr| \le \frac{1}{1-2^{1-1/p-1/q}}\,\|X\|_{p\text{-var}}\,\|Y\|_{q\text{-var}}\,, \tag{3} \]
where the integral above is well defined as a Riemann-Stieltjes integral.
Now for $X\in C^{0,p\text{-var}}$ and $Y\in C^{0,q\text{-var}}$ with $1/p+1/q>1$, there exist two sequences $(X^{(n)})_n$, $(Y^{(n)})_n$ in $C^{1\text{-var}}$ such that $X^{(n)}\to X$ in $p$-variation norm and $Y^{(n)}\to Y$ in $q$-variation norm. The above estimate then yields convergence of the sequence $\int Y^{(n)}\,dX^{(n)}$ and we can define the Young integral of $Y$ against $X$ as
\[ \int_0^1 Y_s\,dX_s := \lim_{n\to\infty}\int_0^1 Y^{(n)}_s\,dX^{(n)}_s\,. \]
The Young integral is easily seen to be independent of the sequences $X^{(n)}$ and $Y^{(n)}$ and to satisfy again inequality (3). The extension to general $X\in C^{p\text{-var}}$ and $Y\in C^{q\text{-var}}$ is obtained through the inclusions stated above.

The first construction shows explicitly the role of the condition $\alpha+\beta>1$; it can actually be shown that there exist sequences $X^{(n)}$, $Y^{(n)}$ of smooth functions such that both $X^{(n)}\to0$ and $Y^{(n)}\to0$ in $C^{1/2}$ but $\int Y^{(n)}\,dX^{(n)}\to\infty$. The second construction instead suggests the general principle that an estimate like (3), comparing the integral with the global increment over the domain of integration, might be of help in defining an integral.
3 Rough Paths
To simplify formulas in the sequel we introduce a convenient shorthand.
Notation 3.1. Given a function $X$ on $[0,T]$ we will denote by $X_{s,t}$ its increment between $s$ and $t$, i.e.
\[ X_{s,t} := X_t - X_s\,. \]

To avoid confusion, the value of a function $F$ of two variables will then be denoted by $F_{(s,t)}$, $(s,t)\in[0,T]^2$.
To define a rough path we start from the following observation: suppose that $X$ belongs to $C^\alpha([0,T];\mathbb{R}^d)$, $\alpha\in(1/3,1/2]$, and that $Y$ takes values in $\mathbb{R}^n$ and solves (1) with $f\colon\mathbb{R}^n\to\mathbb{R}^{n\times d}$ smooth. Then we expect that, at least on small scales, the output is "similar" to the noise, i.e.
\[ Y_{s,t} = f(Y_s)\,X_{s,t} + R_{(s,t)} \tag{4} \]
where $R$ is some remainder that we expect to control (in some sense to be clarified later on), and thus
\[ f(Y)_{s,t} = f(Y_t)-f(Y_s) = Df(Y_s)\,Y_{s,t} + \tilde R_{(s,t)} = Df(Y_s)f(Y_s)\,X_{s,t} + \hat R_{(s,t)} \tag{5} \]
for suitable remainder terms $\tilde R$ and $\hat R$. Then, neglecting the remainder and setting $Z_s = Df(Y_s)f(Y_s)\in L(\mathbb{R}^{d\times d};\mathbb{R}^n)\cong\mathbb{R}^{n\times d\times d}\cong L(\mathbb{R}^d;\mathbb{R}^{n\times d})$, we should have
\[ \int_0^t f(Y_s)\,dX_s = \int_0^t \bigl[f(Y_0) + f(Y_s)-f(Y_0)\bigr]\,dX_s \approx f(Y_0)\,X_{0,t} + Z_0\int_0^t X_{0,s}\otimes dX_s\,; \tag{6} \]
we reduced the problem to calculating $\int X\,dX$. Since $\alpha\le1/2$ this is of course again an ill-posed task; rough paths theory proposes to postulate the value of $\int_0^t X_{0,s}\otimes dX_s$ and to consider as a path not only $X$ but a pair $(X,\mathbb{X})$, where the second component plays the role of the iterated integral of $X$ against itself. This then allows us to define a rough path integral $\int Y\,dX$ for $Y$ in a suitable class of functions.
From the point of view of the solution map introduced above, consider the stochastic differential equation in $\mathbb{R}^2$
\[ dY_t = f(Y_t)\,dB_t\,,\qquad f(x) = \begin{pmatrix} 1 & 0\\ 0 & x^1 \end{pmatrix}, \]
for $B=(W^1,W^2)$ a standard Brownian motion in $\mathbb{R}^2$, that is the system of equations
\[ \begin{cases} dY^1_t = dW^1_t\\ dY^2_t = W^1_t\,dW^2_t\,. \end{cases} \]
The solution map is then
\[ W \mapsto \Bigl(W^1, \int_0^{\cdot} W^1_s\,dW^2_s\Bigr). \]
This is not continuous, but if we add by definition to $B$ its iterated integrals as a second component $\mathbb{B}$, continuity is straightforwardly restored.
The new object $\mathbb{X}$ that we will introduce has in principle nothing to do with integration; in fact such an integral does not exist and, in a sense, we want to define it using $\mathbb{X}$. Still, we would like to recover the classical case when $X$ is smooth, so that its integral against itself is well defined. Therefore suppose for a moment that $X$ is smooth and define the function of two variables $\mathbb{X}$ by
\[ \mathbb{X}_{(s,t)} := \int_s^t X_{s,r}\otimes dX_r\,; \]
then it follows that
\begin{align*}
\mathbb{X}_{(s,t)} - \mathbb{X}_{(s,u)} - \mathbb{X}_{(u,t)} &= \int_s^t X_{s,r}\otimes dX_r - \int_s^u X_{s,r}\otimes dX_r - \int_u^t X_{u,r}\otimes dX_r\\
&= \int_u^t X_{s,u}\otimes dX_r\\
&= X_{s,u}\otimes X_{u,t}\,.
\end{align*}
This suggests the definition below, for which we introduce another convenient notation.
Notation 3.2. Let $F\colon[0,T]^2\to E$. We write $F\in C_2^\alpha([0,T]^2;E)$ if
\[ \|F\|_\alpha = \sup_{s\neq t}\frac{|F_{(s,t)}|_E}{|t-s|^\alpha} < \infty\,. \]
Note that we use the same symbol $\|\cdot\|_\alpha$ with different meanings depending on whether it refers to functions of one or two variables. If $X\in C^\alpha([0,T];E)$ then it is immediate to check that $(s,t)\mapsto X_{s,t}$ belongs to $C_2^\alpha([0,T]^2;E)$.
Definition 3.3. Let $\alpha\in(1/3,1/2]$. The space $\mathscr{C}^\alpha([0,T];E)$ of ($E$-valued) $\alpha$-Hölder rough paths is the space of pairs $(X,\mathbb{X})\in C^\alpha([0,T];E)\oplus C_2^{2\alpha}([0,T]^2;E\otimes E)$ such that the identity
\[ \mathbb{X}_{(s,t)} - \mathbb{X}_{(s,u)} - \mathbb{X}_{(u,t)} = X_{s,u}\otimes X_{u,t} \tag{7} \]
holds true for every $s,u,t\in[0,T]$.
Identity (7) is known as Chen's relation. It implies that $\mathbb{X}_{(t,t)}=0$ for every $t\in[0,T]$. As before we will write $\mathscr{C}^\alpha(E)$ or $\mathscr{C}^\alpha$ if no confusion can arise.
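Chen's relation can be checked by hand on piecewise linear paths, for which the iterated integral is computable exactly segment by segment: on a linear segment with increment $\Delta$ starting at $X_{s}+a$, the integral of $X_{s,r}\otimes dX_r$ equals $a\otimes\Delta+\frac12\Delta\otimes\Delta$. The following sketch (with hypothetical helper names `lift` and `chen_defect`, and an arbitrary test path) evaluates the defect in (7) for every intermediate vertex $u$.

```python
def lift(points):
    """Increment and iterated integral of the piecewise linear path through `points`."""
    d = len(points[0])
    inc = [0.0] * d                      # X_{s,r} at the start of the current segment
    lvl2 = [[0.0] * d for _ in range(d)]
    for p, q in zip(points, points[1:]):
        step = [b - a for a, b in zip(p, q)]
        for i in range(d):
            for j in range(d):
                # exact integral of X_{s,r} (x) dX_r over one linear segment
                lvl2[i][j] += inc[i] * step[j] + 0.5 * step[i] * step[j]
        inc = [r + s for r, s in zip(inc, step)]
    return inc, lvl2

def chen_defect(points, u):
    """Largest entry of X2_(s,t) - X2_(s,u) - X2_(u,t) - X_{s,u} (x) X_{u,t}."""
    _, full = lift(points)
    left_inc, left = lift(points[:u + 1])
    right_inc, right = lift(points[u:])
    d = len(left_inc)
    return max(abs(full[i][j] - left[i][j] - right[i][j] - left_inc[i] * right_inc[j])
               for i in range(d) for j in range(d))

# An arbitrary polygonal path in the plane, chosen only for illustration.
points = [(0.0, 0.0), (1.0, 0.0), (1.0, 2.0), (-1.0, 3.0), (0.0, 5.0)]
defects = [chen_defect(points, u) for u in range(len(points))]
```

All entries of `defects` vanish up to rounding, in agreement with the computation that motivated (7).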
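The space $\mathscr{C}^\alpha$ is a closed subset of $C^\alpha\oplus C_2^{2\alpha}$ but it is not a linear space, due to the nonlinear constraint given by Chen's relation. On $\mathscr{C}^\alpha$ we define the rough path norm
\[ |\!|\!|X,\mathbb{X}|\!|\!|_\alpha := \|X\|_\alpha + \sqrt{\|\mathbb{X}\|_{2\alpha}} \tag{8} \]
which is not a norm in the usual sense (the space is not linear), but is homogeneous with respect to the natural scaling $(X,\mathbb{X})\mapsto(\lambda X,\lambda^2\mathbb{X})$. We also introduce the rough path metric
\[ \rho_\alpha\bigl((X,\mathbb{X}),(Y,\mathbb{Y})\bigr) := |X_0-Y_0|_E + \sup_{s\neq t}\frac{|X_{s,t}-Y_{s,t}|_E}{|t-s|^\alpha} + \sup_{s\neq t}\frac{|\mathbb{X}_{(s,t)}-\mathbb{Y}_{(s,t)}|_{E\otimes E}}{|t-s|^{2\alpha}}\,. \]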
Theorem 3.4. $(\mathscr{C}^\alpha,\rho_\alpha)$ is a complete metric space.

The proof of this fact is not difficult and can be done as an exercise. It is also easy to show that $\mathscr{C}^\beta\subset\mathscr{C}^\alpha$ for any $1/3<\alpha<\beta\le1/2$.

Some remarks on the function $\mathbb{X}$ are required. It is true that, if $\alpha\in(1/3,1/2)$, given $X\in C^\alpha$ there exists an associated rough path $(X,\mathbb{X})\in\mathscr{C}^\alpha$ (this result is known as the Lyons-Victoir theorem, [Lyons and Victoir, 2007]), but the choice of $\mathbb{X}$ is not unique. Indeed if $\mathbb{X}$ could be determined uniquely by $X$ we would be adding no information to our function, so that we could not expect to obtain better results than those already available from classical theories. If $\alpha=1/2$ the Lyons-Victoir theorem applies only when $E$ is finite-dimensional.

Non-uniqueness of $\mathbb{X}$ is easily shown: if we change $\mathbb{X}_{(s,t)}$ to $\tilde{\mathbb{X}}_{(s,t)} := \mathbb{X}_{(s,t)} + g_t - g_s$ for any continuous function $g$ then relation (7) is again satisfied; therefore Chen's relation is surely not enough to define $\mathbb{X}$ given $X$. In particular $\mathbb{X}$ is determined up to increments of a $2\alpha$-Hölder function $g$, and in general there is no canonical choice for $g$. However if $(X,\mathbb{X})$ and $(X,\tilde{\mathbb{X}})$ are both rough paths and we set $\delta_{(s,t)} := \mathbb{X}_{(s,t)} - \tilde{\mathbb{X}}_{(s,t)}$ and $g_t := \delta_{(0,t)}$, we immediately see that $\delta_{(s,t)} - \delta_{(s,u)} - \delta_{(u,t)} = 0$ and $\delta_{(u,t)} = \delta_{(0,t)} - \delta_{(0,u)} = g_t - g_u$. Therefore $\mathbb{X}$ is uniquely determined by Chen's relation and by the one-variable function $\mathbb{X}_{(0,t)}$. This means that knowledge of the paths $t\mapsto X_t$ and $t\mapsto\mathbb{X}_{(0,t)}$ completely identifies the rough path $(X,\mathbb{X})$. In this sense rough paths are really paths and not two-dimensional objects. Therefore we will write $\mathbb{X}_{s,t}$ for $\mathbb{X}_{(s,t)}$ in the sequel.
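The cancellation behind the non-uniqueness remark can be written out in one line. With $\tilde{\mathbb{X}}_{(s,t)} = \mathbb{X}_{(s,t)} + g_t - g_s$ as in the text, the increments of $g$ telescope and only Chen's relation for $\mathbb{X}$ is used:

```latex
\begin{align*}
\tilde{\mathbb{X}}_{(s,t)} - \tilde{\mathbb{X}}_{(s,u)} - \tilde{\mathbb{X}}_{(u,t)}
  &= \mathbb{X}_{(s,t)} - \mathbb{X}_{(s,u)} - \mathbb{X}_{(u,t)}
     + (g_t - g_s) - (g_u - g_s) - (g_t - g_u) \\
  &= X_{s,u} \otimes X_{u,t} .
\end{align*}
```

Note that the Hölder regularity of $g$ plays no role in this algebraic identity; it only matters for $\tilde{\mathbb{X}}$ to belong to $C_2^{2\alpha}$.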
Example 3.5. Let $(X,\mathbb{X})\in\mathscr{C}^\alpha(\mathbb{R}^d)$, $\alpha\in(1/3,1/2]$, and $h\colon[0,T]\to\mathbb{R}^d$. If we translate the path $X$ by the function $h$, is there a canonical way to obtain a new rough path associated to $X+h$? Set $T_h(X,\mathbb{X}) = (X+h,\mathbb{X}^h)$ where $\mathbb{X}^h$ is defined as
\[ \mathbb{X}^h_{s,t} = \mathbb{X}_{s,t} + \int_s^t h_{s,r}\otimes dX_r + \int_s^t X_{s,r}\otimes dh_r + \int_s^t h_{s,r}\otimes dh_r\,. \]

If $h\in W^{1,2}$ the three integrals above are well defined (in the classical sense) and the operator $T_h$ is continuous from $\mathscr{C}^\alpha$ into itself.
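Example 3.6. A typical example of non-standard rough path behaviour is given in $\mathbb{R}^2$ by
\[ X^{(n)}_t = \frac1n\begin{pmatrix}\cos(2\pi n^2 t)\\ \sin(2\pi n^2 t)\end{pmatrix},\qquad \mathbb{X}^{(n)}_{s,t} = \int_s^t X^{(n)}_{s,r}\otimes dX^{(n)}_r\,. \]

For $\alpha<1/2$ it is easily seen that $X^{(n)}\to0$ in $C^\alpha$ but $(X^{(n)},\mathbb{X}^{(n)})\to(0,\mathbb{X})$ in $\mathscr{C}^\alpha$ (i.e. with respect to $\rho_\alpha$), where $\mathbb{X}$ is given by
\[ \mathbb{X}_{s,t} = \pi\,(t-s)\begin{pmatrix}0 & 1\\ -1 & 0\end{pmatrix}. \]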
The iterated integral “remembers” the oscillations of the paths, that therefore appear in the limit object.
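The limit in Example 3.6 can be observed numerically. The sketch below replaces $X^{(n)}$ by a polygon inscribed in the circle (an approximation introduced here, with `steps_per_turn` vertices per revolution) and integrates exactly along each segment over $[0,1]$; the function name `circle_lift` is hypothetical.

```python
import math

def circle_lift(n, steps_per_turn=200):
    """Second level over [0, 1] of a polygonal approximation of X^(n)."""
    total = n * n * steps_per_turn       # the path winds n^2 times around the circle
    x0, y0 = 1.0 / n, 0.0                # X_0
    px, py = x0, y0
    xx = xy = yx = yy = 0.0
    for k in range(1, total + 1):
        angle = 2.0 * math.pi * k / steps_per_turn
        qx, qy = math.cos(angle) / n, math.sin(angle) / n
        dx, dy = qx - px, qy - py        # increment over the current segment
        rx, ry = px - x0, py - y0        # X_{0,r} at the start of the segment
        xx += rx * dx + 0.5 * dx * dx
        xy += rx * dy + 0.5 * dx * dy
        yx += ry * dx + 0.5 * dy * dx
        yy += ry * dy + 0.5 * dy * dy
        px, py = qx, qy
    return [[xx, xy], [yx, yy]]

# The path itself has sup norm 1/n = 0.2 here, yet its second level is of order pi.
level2 = circle_lift(5)
```

The off-diagonal entries of `level2` are close to $\pm\pi$ and the diagonal ones are close to $0$, matching $\mathbb{X}_{0,1}$ above, even though the path itself is uniformly small.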
It is worth noting that $\mathbb{X}$ in the example above is anti-symmetric, and in general we do not expect $\mathbb{X}_{s,t}\in E\otimes E$ to be symmetric. Nevertheless the study of the symmetric part of $\mathbb{X}$ is important in the context of rough paths. Suppose for a while that $X$ is a smooth function taking values in $\mathbb{R}^d$ and set $\mathbb{X}_{s,t} = \int_s^t X_{s,r}\otimes dX_r$. Then the symmetric part of $\mathbb{X}$ is given by
\[ \frac12\bigl(\mathbb{X}^{i,j}_{s,t} + \mathbb{X}^{j,i}_{s,t}\bigr) = \frac12\int_s^t\bigl(X^i_{s,r}\,dX^j_r + X^j_{s,r}\,dX^i_r\bigr) = \frac12\int_s^t d\bigl(X^i_{s,r}X^j_{s,r}\bigr) = \frac12 X^i_{s,t}X^j_{s,t} \]
hence
\[ \mathrm{Sym}(\mathbb{X}_{s,t}) = \frac12\,X_{s,t}\otimes X_{s,t}\,. \tag{9} \]
This equation is false for a general rough path $(X,\mathbb{X})$ but motivates the following definition.

Definition 3.7. The set of geometric rough paths $\mathscr{C}_g^\alpha$ consists of those $(X,\mathbb{X})\in\mathscr{C}^\alpha$ that satisfy equation (9).
Relation (9) holds for smooth paths considered as rough paths with their canonical lift $\mathbb{X}_{s,t} = \int_s^t X_{s,r}\otimes dX_r$. The closure of the set of these canonical lifts of smooth paths in $\mathscr{C}^\alpha$ is a set denoted by $\mathscr{C}_g^{0,\alpha}$ that can be shown to be strictly smaller than $\mathscr{C}_g^\alpha$. Similarly to what happens for Hölder spaces, it can be easily shown that $\mathscr{C}_g^\beta\subset\mathscr{C}_g^{0,\alpha}$ for $1/3<\alpha<\beta\le1/2$.
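Relation (9) can also be tested on a piecewise linear path, whose canonical lift is computable exactly. The sketch below checks (9) for such a lift and then perturbs the second level by the increment of $g_t = t\,\mathrm{Id}$ over an interval of length 1, which preserves Chen's relation but destroys (9). The helper names `lift` and `sym_defect`, the test path and the choice of $g$ are assumptions made for illustration.

```python
def lift(points):
    """Increment and iterated integral of the piecewise linear path through `points`."""
    d = len(points[0])
    inc = [0.0] * d
    lvl2 = [[0.0] * d for _ in range(d)]
    for p, q in zip(points, points[1:]):
        step = [b - a for a, b in zip(p, q)]
        for i in range(d):
            for j in range(d):
                # exact integral of X_{s,r} (x) dX_r over one linear segment
                lvl2[i][j] += inc[i] * step[j] + 0.5 * step[i] * step[j]
        inc = [r + s for r, s in zip(inc, step)]
    return inc, lvl2

def sym_defect(inc, lvl2):
    """Largest entry of Sym(lvl2) - (1/2) inc (x) inc, in absolute value."""
    d = len(inc)
    return max(abs(0.5 * (lvl2[i][j] + lvl2[j][i]) - 0.5 * inc[i] * inc[j])
               for i in range(d) for j in range(d))

points = [(0.0, 0.0), (1.0, 0.0), (1.0, 2.0), (-1.0, 3.0), (0.0, 5.0)]
inc, lvl2 = lift(points)
geometric_defect = sym_defect(inc, lvl2)      # the canonical lift satisfies (9)

# Add g_t - g_s with g_t = t * Id over an interval of length 1 (hypothetical choice).
shifted = [[lvl2[i][j] + (1.0 if i == j else 0.0) for j in range(2)] for i in range(2)]
shifted_defect = sym_defect(inc, shifted)     # (9) now fails on the diagonal
```

The perturbed pair is still a rough path over the same $X$, but it is no longer geometric.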
We end this section with a useful approximation criterion.