arXiv:2006.11544v1 [math.PR] 20 Jun 2020

Diffusive and rough homogenisation in fractional noise field

Johann Gehringer and Xue-Mei Li
Imperial College London
xue-mei.li@imperial.ac.uk
June 23, 2020

Abstract

With recently developed tools, we prove a homogenisation theorem for a random ODE with short and long-range dependent fractional noise. The effective dynamics are not necessarily diffusions, they are given by stochastic differential equations driven simultaneously by stochastic processes from both the Gaussian and the non-Gaussian self-similarity universality classes. A key lemma for this is the ‘lifted’ joint functional central and non-central limit theorem in the rough path topology.

keywords: passive tracer, fractional noise, multi-scale, functional limit theorems, rough differential equations

MSC Subject classification: 34F05, 60F05, 60F17, 60G18, 60G22, 60H05, 60H07, 60H10

Contents

1 Introduction
2 Preliminaries
  2.1 Hermite processes
  2.2 Fractional Ornstein-Uhlenbeck processes
  2.3 Hermite Rank
  2.4 Joint functional CLT / non-CLT
  2.5 Assumptions and Conventions
3 Lifted joint functional limit theorem
  3.1 Relative compactness of iterated integrals
  3.2 Young integral case (functional non-CLT in rough topology)
  3.3 Itô integral case (functional CLT in rough topology)
  3.4 Proof of Theorem B
  3.5 Proof of the conditional integrability of fOU
4 Multi-scale homogenisation theorem
5 Appendix
  5.1 Some rough path theory

1 Introduction

Fractional noise is the ‘derivative’ of a fractional Brownian motion. Its covariance at times separated by a span $s$ is $\tilde\varrho(s)\sim 2H(2H-1)|s|^{2H-2}+2H|s|^{2H-1}\delta_s$, where $H$ is the Hurst parameter taking values in $(0,1)\setminus\{\frac12\}$ and $\delta_s$ is the Dirac measure. The ‘$H=\frac12$’ case is white noise. If $H>\frac12$, $\int_{\mathbb R}\tilde\varrho\,ds=\infty$, which means that the noise has non-integrable long range dependence (LRD). If $H<\frac12$, the process is negatively correlated. Just as white noise is used for modelling noise coming from a large number of independent random components, fractional noise is used for modelling long range dependence. LRDs are observed in nature and in time series data. We study the two scale passive tracer problem, also called the tagged particle problem, with fractional noise.

We consider a slow/fast system in which the slow variables are given by a random ODE $\dot x_t=G(x_t,y_t^\varepsilon)$. This touches on two problems. The first is the passive tracer problem modelling the motion of a tagged particle in a disturbed flow, not necessarily incompressible, which allows simulation of turbulence from the Lagrangian description. The other is the dynamical description for Brownian particles in a liquid at rest. The slow variables evolve in their natural time scale, while the fast random environment evolves in the microscopic scale $\varepsilon$. The aim is to extract a closed effective dynamics which approximates the slow variables when $\varepsilon$ is sufficiently small. This effective dynamics will be obtained from the persistent effects coming from the fast-moving variables through adiabatic transmission. If the environment is stationary strong mixing noise with sufficiently fast rate of convergence, the homogenisation problem is synonymous with ‘diffusion creation’, and is therefore also known as diffusive homogenisation. There have been continuous explorations of the diffusive homogenisation problem, see [Gre51, Has66, Kub57, KV86, LOV00, PK74, Tay21, KLO12] and the references therein.
Recently, long range dependent noises have also been studied in several papers in the context of homogeneous incompressible fluids; however, they inevitably fall within the central limit theorem regimes [FK00, KNR12] and the effective dynamics are either Brownian motions or fractional Brownian motions. We will study a family of vector fields without spatial homogeneity. The resulting dynamics can take the form of a process that locally resembles a fractional Brownian motion, and more generally they comprise a larger class of stochastic dynamical systems of the form

\[
dx_t=\sum_{k=1}^{n} f_k(x_t)\circ dX_t^k+\sum_{k=n+1}^{N} f_k(x_t)\,dX_t^k,\qquad x_0=x_0, \tag{1.1}
\]

where $X_t^k$ is a Wiener process for $k\le n$ and otherwise a Gaussian or a non-Gaussian Hermite process. To the best of our knowledge, this presents a new effective limit class. In these equations, the symbol $\circ$ denotes the Stratonovich integral and the other integrals are in the sense of Young integrals.

The homogenisation problem we consider is:

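\[
\begin{cases}
\dot x_t^\varepsilon=\displaystyle\sum_{k=1}^{N}\alpha_k(\varepsilon)\,f_k(x_t^\varepsilon)\,G_k(y_t^\varepsilon),\\[4pt]
x_0^\varepsilon=x_0,
\end{cases} \tag{1.2}
\]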

where $y_t^\varepsilon=y_{t/\varepsilon}$ and $y_t$ are the short and long range dependent stationary fractional Ornstein-Uhlenbeck processes (fOU) with Hurst parameter $H\in(0,1)\setminus\{\frac12\}$ and one time probability distribution $\mu$; the centred real valued functions $G_k\in L^{p_k}(\mu)$ transform the noise. If $f_k$ are in $\mathcal C_b^1$ and $G_k$ are bounded measurable, the solutions to the equations $\dot x_t^\varepsilon=\sum_{k=1}^N f_k(x_t^\varepsilon)G_k(y_t^\varepsilon)$ will be approximated by the averaged dynamics which, in this case, is the trivial ODE $\dot x_t=0$, c.f. [LH19] and [LS]. A homogenisation theorem will then describe the fluctuation around this average; for this we must rescale the vector fields to arrive at a non-trivial limit. The different scales $\alpha_k(\varepsilon)$ are reflections of the non-strong mixing property of the noise; they tend to $\infty$ as $\varepsilon\to0$ at a speed tailored to the transformations $G_k$. These scales determine the local self-similar property of the limit. If $G$ is an $L^2$ function with Hermite rank $m$, to be defined below, then $m=\frac{1}{2(1-H)}$ is the critical value for the limit to be locally a Brownian motion. If $m$ is smaller, the effective limit is locally a Hermite process of rank $m$, otherwise a Wiener process. Our main theorem is the following. We take $\alpha_k(\varepsilon)$ to be $\alpha(\varepsilon,H^*(m_k))$, the latter is defined by (1.3).

Theorem A Let $H\in(0,1)\setminus\{\frac12\}$, $f_k\in\mathcal C_b^3(\mathbb R^d;\mathbb R^d)$ and $G_k\in L^{p_k}(\mathbb R;\mathbb R,\mu)$ be real valued functions satisfying Assumption 2.10. Then the solutions of (1.2) converge weakly in $\mathcal C^\gamma$, on any finite time interval and for any $\gamma\in\big(\frac13,\frac12-\frac{1}{\min_{k\le n}p_k}\big)$, to the solution of (1.1).

The linear contraction in the Langevin equation and the exponential convergence of the solutions would lead to the belief that it mixes as fast as the Ornstein-Uhlenbeck process. But the auto-correlation function of the increment process, which measures how much the shifted process remembers, exhibits power law decay. For $H>\frac12$, the auto-correlation function is not integrable. Conventional tools are not applicable here; we turn to the theory of rough path differential equations and view (1.2) as rough differential equations driven by stochastic processes with a parameter $\varepsilon$. By the continuity theorem for solutions of rough differential equations, it is then sufficient to prove the convergence of these drivers in the rough path topology. For continuous processes this concerns the scaling limits of the path integrals of the form $\int_0^t G_k(y_s^\varepsilon)\,ds$ together with their canonical lifts. Using rough path theory for stochastic homogenisation is a recent development; in [KM17, BC17], this was used for diffusive homogenisation. Proving and formulating an appropriate functional limit theorem, however, turned out to be one of our main endeavours.

For independent identically distributed random variables, the central limit theorem (CLT) states that $\frac{1}{\sqrt n}\sum_{k=1}^n X_k$ converges to a Gaussian distribution. For correlated random variables, non-Gaussian distributions may appear. One of these was proved by Rosenblatt: Let $Z_n$ be a stationary Gaussian sequence with correlation $\varrho(n)\sim n^{-d}$ where $d\in(0,\frac12)$ and let $Y_n=(Z_n)^2-1$; then $n^{d-1}\sum_{k=1}^n Y_k$ converges to a non-Gaussian distribution. To emphasise the non-Gaussian nature, those limit theorems with non-Gaussian limits are referred to as ‘non-Central Limit Theorems’ (non-CLTs). A functional limit theorem concerns path integrals of functionals of a stochastic process $y_t$. For a centred function $G$, it states that $\sqrt\varepsilon\int_0^{t/\varepsilon}G(y_s)\,ds$ converges to a Brownian motion as $\varepsilon\to0$. Non-CLTs and functional non-CLTs were extensively studied [MT07, BH02, BM83, Taq75]; these were then shown to hold for a larger class of functions [CNN20, NP05] with Malliavin calculus. In a nutshell, for a class of Gaussian processes and for a centred $L^2$ function $G$ with the scaling constant depending on its Hermite rank $m$, the limit of $\alpha(\varepsilon)\int_0^t G(y_s^\varepsilon)\,ds$ will be a BM if the scale is $\frac{1}{\sqrt\varepsilon}$ or $\frac{1}{\sqrt{\varepsilon|\ln(\varepsilon)|}}$; otherwise it is a self-similar Hermite process of degree $m$ with self-similar exponent $H^*(m)=m(H-1)+1$. We will use functional limit theorems for both cases.

Let $\alpha(\varepsilon,H^*(m))$ be positive constants as follows; they depend on $m$, $H$ and $\varepsilon$ and tend to $\infty$ as $\varepsilon\to0$:

\[
\alpha(\varepsilon,H^*(m))=
\begin{cases}
\frac{1}{\sqrt\varepsilon}, & \text{if } H^*(m)<\frac12,\\[4pt]
\frac{1}{\sqrt{\varepsilon|\ln(\varepsilon)|}}, & \text{if } H^*(m)=\frac12,\\[4pt]
\varepsilon^{H^*(m)-1}, & \text{if } H^*(m)>\frac12.
\end{cases} \tag{1.3}
\]

Observe that $H^*$ decreases with $m$ and $H^*(1)=H$. If $H\le\frac12$ we only see the diffusion scale. We state below our key limit theorem, the lifted joint functional limit theorem in the rough path topology, c.f. (5.2), see §3.4. The proof for the main theorem is finalised in §4.

Theorem B (Lifted joint functional CLTs / non-CLTs) Let $H\in(0,1)\setminus\{\frac12\}$ and fix a finite time horizon $T$. Suppose that the $L^2(\mu)$ functions $G_1,\dots,G_N$ satisfy Assumption 2.10. Let $m_k$ denote the Hermite rank of $G_k$. Set

\[
X_t^{k,\varepsilon}=\alpha(\varepsilon,H^*(m_k))\int_0^t G_k(y_s^\varepsilon)\,ds,\qquad X^\varepsilon=(X_t^{1,\varepsilon},X_t^{2,\varepsilon},\dots,X_t^{N,\varepsilon}). \tag{1.4}
\]

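1. Then, for every $\gamma\in\big(\frac13,\frac12-\frac{1}{\min_{k\le n}p_k}\big)$, the canonical rough paths $\mathbf X^\varepsilon:=(X_t^\varepsilon,\mathbb X_{s,t}^\varepsilon)$ converge weakly in the rough topology $\mathscr C^\gamma([0,T],\mathbb R^N)$ and
\[
\lim_{\varepsilon\to0}\mathbf X^\varepsilon=\mathbf X:=\big(X_t,\ \mathbb X_{s,t}+(t-s)A\big).
\]
2. The precise formulation for the stochastic process $X_t$ in the limit is given in Theorem 2.7. It consists of two independent blocks: a Wiener process block and a Hermite process block. For $0\le s\le t\le T$, the limiting second order processes are given by $\mathbb X=(\mathbb X^{i,j})$ and $A=(A^{i,j})$ where

\[
\mathbb X_{s,t}^{i,j}=\int_s^t (X_r^i-X_s^i)\,dX_r^j,\qquad
\begin{cases}
\text{an Itô integral}, & \text{for } i,j\le n,\\
\text{a Young integral}, & \text{otherwise},
\end{cases}
\]

\[
A^{i,j}=
\begin{cases}
\displaystyle\int_0^\infty E\big(G_i(y_s)G_j(y_0)\big)\,ds, & \text{if } i,j\le n,\\[6pt]
0, & \text{otherwise}.
\end{cases}
\]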

The Hermite processes in Theorem B are $Z_t^{H^*(m_k),m_k}$, see §2.1. They have Hölder continuous sample paths up to the order $H^*(m_k)$. For this theorem, we use a basic functional CLT from [GL20] for proving the joint convergence of the integrals and their iterated integrals in an appropriate path space, in finite dimensional distribution. For the Wiener limit part, we employ both ergodic theorems and martingale approximations. In the case where the processes are not strong mixing, proving the $L^2$ boundedness of the martingale approximations is rather involved (this is where we had to exclude functions with Hermite rank falling into the range $\big[\frac{1}{2(1-H)},\frac{1}{1-H}\big)$). We will follow an idea in [Hai05a, LH19] for fractional Brownian motions to develop a locally independent decomposition for the fOU process and use this for estimating the conditional moments. The final hurdle is the relative compactness of the iterated integrals in the rough path topology, for which we use the diagram formula and an upper bound, from [Taq77], on the number of eligible graphs of complete pairings.

Acknowledgement.
1. We would like to thank M. Gubinelli and M. Hairer for very helpful discussions.
2. Previously, we proved the homogenisation theorem for $H>\frac12$. This was posted to the Mathematics arXiv and is otherwise unpublished, see [GL19]. Here we can also include the $H<\frac12$ case. For the presentation, we did not include the basic joint functional limit theorem from [GL19]. Instead, an improved version is presented in [GL20].

Notation

• $(W_t,t\in\mathbb R)$ denotes a two-sided Wiener process.
• $B_t$ is the fBM in the Langevin equation, $H$ is its Hurst parameter, $\mathcal F_t$ denotes its filtration.
• $H^*(m)=m(H-1)+1$.
• $m_k$ is the Hermite rank of $G_k$.
• Convention: $H^*(m_k)\le\frac12$ for $k\le n$; otherwise $H^*(m_k)>\frac12$.
• $\mathcal C_b^r$: bounded continuous functions with bounded continuous derivatives up to order $r$.
• $f\lesssim g$ means that there exists a constant $c$, not depending on $f$ or $g$, such that $f\le cg$.
• $|x|_\alpha:=\sup_{s\neq t}\frac{|x_t-x_s|}{|t-s|^\alpha}$ is the homogeneous Hölder semi-norm, $0<\alpha<1$.
• For a process $x_t$, set $x_{s,t}:=x_t-x_s$.
• We fix a probability space $(\Omega,\mathcal F,P)$. $L^p(\Omega)$ denotes the $L^p$ space on $\Omega$ and its norm is denoted by $\|\cdot\|_{L^p}$.
• $\mu=N(0,1)$ is the standard Gaussian measure, $L^p(\mu)$ denotes the corresponding $L^p$ space.

2 Preliminaries

A fractional Brownian motion is a continuous Gaussian process with stationary increments. We take a normalised fractional Brownian motion $B_t$ so that $B_0=0$ and $E(B_1)^2=1$. Specifically, if $H$ is its Hurst parameter, then

\[
E\big((B_t-B_s)(B_u-B_v)\big)=\frac12\Big(|t-v|^{2H}+|s-u|^{2H}-|t-u|^{2H}-|s-v|^{2H}\Big).
\]

We refer to [PT17, Sam06, CKM03] for details on fractional Brownian motions. Note that

\[
E(B_tB_s)=\frac12\Big(t^{2H}+s^{2H}-|t-s|^{2H}\Big)=H(2H-1)\int_0^t\int_0^s|r_1-r_2|^{2H-2}\,dr_1\,dr_2,
\]

and so $\frac{\partial^2}{\partial t\partial s}E(B_tB_s)=H(2H-1)|t-s|^{2H-2}$, when $H\in(0,1)\setminus\{\frac12\}$. Let $X_n=B_{1+n}-B_n$ denote the increment process of a fBM. Then, the autocorrelation function of $\{X_n\}$ is not summable for $H>\frac12$.

2.1 Hermite processes

Let $W_t$ be a one dimensional standard two-sided Brownian motion. Let $\hat H(m)=\frac1m(H-1)+1$, so $\hat H$ is the inverse of $H^*$.

Definition 2.1 Let $m\in\mathbb N$ with $\hat H(m)>\frac12$. We take a standardised Hermite process of rank $m$ to be the following mean zero process:

\[
Z_t^{H,m}=\frac{K(H,m)}{m!}\int_{\mathbb R^m}\int_0^t\prod_{j=1}^m(s-\xi_j)_+^{-\left(\frac12+\frac{1-H}{m}\right)}\,ds\,dW(\xi_1)\dots dW(\xi_m). \tag{2.1}
\]

The integral over $\mathbb R^m$ is understood as a multiple Wiener-Itô integral (no integration along the diagonals) and the constant $K(H,m)$ is chosen so that its variance is $1$ at $t=1$. The number $H$ is its self-similarity exponent; it is also known as its Hurst parameter.

Since $\hat H(1)=H$, the rank 1 Hermite processes $Z^{H,1}$ are fractional BMs. Indeed (2.1) is exactly the Mandelbrot Van-Ness representation for a fBM. We emphasise this representation:

\[
B_t^H=\int_{\mathbb R}\int_0^t(s-\xi)_+^{H-\frac32}\,ds\,dW_\xi.
\]

The Hermite processes have stationary increments, finite moments of all orders and the following covariance function:

\[
E\big(Z_t^{H,m}Z_s^{H,m}\big)=\frac12\big(t^{2H}+s^{2H}-|t-s|^{2H}\big). \tag{2.2}
\]

Therefore, using Kolmogorov's theorem, one can show that the Hermite processes $Z_t^{H,m}$ have sample paths of Hölder regularity up to $H$. As mentioned before, they are also self-similar stochastic processes:

\[
\lambda^{H}Z^{H,m}_{\frac{\cdot}{\lambda}}\sim Z^{H,m}_{\cdot}.
\]

The process $Z_t^{H,m}$ belongs to the $m$-th Wiener chaos generated by $W$; in particular, two Hermite processes $Z^{H,m}$ and $Z^{H',m'}$, defined by the same Wiener process, are uncorrelated if $m\neq m'$. Further details on Hermite processes can also be found in [MT07].
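The scaling relations of this subsection can be checked numerically. The following is a minimal sketch, assuming only the covariance (2.2) and the definitions of $H^*$ and $\hat H$ given in the text; the function names `hermite_cov`, `h_star` and `h_hat` are hypothetical and chosen for illustration.

```python
def hermite_cov(t: float, s: float, hurst: float) -> float:
    """Covariance (2.2): 0.5 * (t^{2H} + s^{2H} - |t - s|^{2H}) for t, s >= 0."""
    return 0.5 * (t ** (2 * hurst) + s ** (2 * hurst) - abs(t - s) ** (2 * hurst))


def h_star(m: int, hurst: float) -> float:
    """H*(m) = m (H - 1) + 1."""
    return m * (hurst - 1.0) + 1.0


def h_hat(m: int, hurst: float) -> float:
    """H^(m) = (H - 1) / m + 1, the inverse of H* for fixed m."""
    return (hurst - 1.0) / m + 1.0


# Illustrative values (assumptions, not taken from the paper).
H, m, lam, t, s = 0.8, 2, 3.0, 1.5, 0.4

# Self-similarity at the level of second moments:
# Cov(Z_{lam t}, Z_{lam s}) = lam^{2H} Cov(Z_t, Z_s).
lhs = hermite_cov(lam * t, lam * s, H)
rhs = lam ** (2 * H) * hermite_cov(t, s, H)
print(abs(lhs - rhs))            # numerically zero up to rounding

# H^ inverts H*, and the variance is normalised to 1 at t = 1.
print(h_star(m, h_hat(m, H)))    # recovers H up to rounding
print(hermite_cov(1.0, 1.0, H))  # 1.0
```

This only tests the covariance structure; it says nothing about the non-Gaussian law of $Z^{H,m}$ for $m\ge2$.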

Remark 2.2 We note that in some literature, e.g. [MT07], the notation for the Hermite processes is different:

\[
\tilde Z_t^{H,m}=\frac{K(H,m)}{m!}\int_{\mathbb R^m}\int_0^t\prod_{j=1}^m(s-\xi_j)_+^{H-\frac32}\,ds\,dW(\xi_1)\dots dW(\xi_m).
\]

These two are related by

\[
Z_t^{H^*(m),m}=\tilde Z_t^{H,m},\qquad Z_t^{H,m}=\tilde Z_t^{\hat H(m),m}. \tag{2.3}
\]

2.2 Fractional Ornstein-Uhlenbeck processes

We gather in this section some useful facts about the stationary fractional Ornstein-Uhlenbeck process, by which we mean $y_t=\sigma\int_{-\infty}^t e^{-(t-s)}\,dB_s^H$ for $B_t^H$ a two-sided fractional BM and $\sigma$ chosen such that $y_t$ is distributed as $\mu=N(0,1)$. It is the stationary solution of the Langevin equation $dy_t=-y_t\,dt+\sigma\,dB_t^H$ with the initial value $y_0=\sigma\int_{-\infty}^0 e^{s}\,dB_s^H$. We rescale the fOU process to obtain $y_t^\varepsilon$; the latter is the stationary solution of

\[
dy_t^\varepsilon=-\frac1\varepsilon y_t^\varepsilon\,dt+\frac{\sigma}{\varepsilon^H}\,dB_t^H. \tag{2.4}
\]

Observe that $y_\cdot^\varepsilon$ and $y_{\frac{\cdot}{\varepsilon}}$ have the same distributions; furthermore, $y_t^\varepsilon=\frac{\sigma}{\varepsilon^H}\int_{-\infty}^t e^{-\frac1\varepsilon(t-s)}\,dB_s^H$. Let us denote their correlation functions by $\varrho$ and $\varrho^\varepsilon$ respectively:

\[
\varrho(s,t):=E(y_sy_t),\qquad \varrho^\varepsilon(s,t):=E(y_s^\varepsilon y_t^\varepsilon).
\]

Let $\varrho(s)=E(y_0y_s)$ for $s\ge0$, extended to $\mathbb R$ by symmetry, so $\varrho(s,t)=\varrho(t-s)$ and similarly for $\varrho^\varepsilon$. We have, for $u>0$ and $H>\frac12$,

\[
\varrho(u)=\sigma^2H(2H-1)\int_{-\infty}^u\int_{-\infty}^0 e^{-(u-r_1-r_2)}|r_1-r_2|^{2H-2}\,dr_1\,dr_2.
\]

We recall the following correlation decay from [CKM03].

Lemma 2.3 Let $H\in(0,1)\setminus\{\frac12\}$. Then, $\varrho(s)=\sigma^2H(2H-1)s^{2H-2}+O(s^{2H-4})$ as $s\to\infty$. In particular, for any $s\ge0$,

\[
|\varrho(s)|\lesssim 1\wedge|s|^{2H-2}. \tag{2.5}
\]

By Lemma 2.3, $\int_0^\infty\varrho^m(s)\,ds$ is finite if and only if $H^*(m)<\frac12$. We are not interested in $H=\frac12$, as the Ornstein-Uhlenbeck process admits an exponential decay of correlations and $\varrho^m$ is integrable for any $m\ge1$. The following estimates explain how to choose the appropriate scaling constants, see [GL20] for detail.

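Lemma 2.4 Let $H\in(0,1)\setminus\{\frac12\}$ and fix a finite time horizon $T$. Then, for $t\in[0,T]$ the following holds uniformly for $\varepsilon\in(0,\frac12]$:

\[
\left(\int_0^{\frac t\varepsilon}\int_0^{\frac t\varepsilon}|\varrho(u,r)|^m\,dr\,du\right)^{\frac12}\lesssim
\begin{cases}
\sqrt{\frac t\varepsilon\int_0^\infty\varrho^m(s)\,ds}, & \text{if } H^*(m)<\frac12,\\[6pt]
\sqrt{\big(\frac t\varepsilon\big)\big|\ln\frac1\varepsilon\big|}, & \text{if } H^*(m)=\frac12,\\[6pt]
\big(\frac t\varepsilon\big)^{H^*(m)}, & \text{if } H^*(m)>\frac12,
\end{cases} \tag{2.6}
\]

\[
\left(\int_0^t\int_0^t|\varrho^\varepsilon(u,r)|^m\,dr\,du\right)^{\frac12}\lesssim
\begin{cases}
\sqrt{t\varepsilon\int_0^\infty\varrho^m(s)\,ds}, & \text{if } H^*(m)<\frac12,\\[6pt]
\sqrt{t\varepsilon\big|\ln\frac1\varepsilon\big|}, & \text{if } H^*(m)=\frac12,\\[6pt]
t^{H^*(m)}\varepsilon^{1-H^*(m)}, & \text{if } H^*(m)>\frac12.
\end{cases} \tag{2.7}
\]

In particular,

\[
t\int_0^t|\varrho^\varepsilon(s)|^m\,ds\lesssim\frac{t^{(2H^*(m))\vee1}}{\alpha(\varepsilon,H^*(m))^2}. \tag{2.8}
\]

Note, if $H=\frac12$, the bound is always $\sqrt{\frac t\varepsilon\int_0^\infty\varrho^m(s)\,ds}$.

2.3 Hermite Rank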

We take the Hermite polynomials of degree $m$ to be $H_m(x)=(-1)^me^{\frac{x^2}{2}}\frac{d^m}{dx^m}e^{-\frac{x^2}{2}}$. Thus, $H_0(x)=1$, $H_1(x)=x$. The Hermite rank of an $L^2(\mu)$ function with respect to a Gaussian measure is the degree of the lowest non-zero Hermite polynomial term in its Hermite polynomial expansion.

Definition 2.5 Let $G:\mathbb R\to\mathbb R$ be an $L^2(\mu)$ function with chaos expansion

\[
G(x)=\sum_{k=m}^\infty c_kH_k(x),\qquad c_k=\frac{1}{k!}\langle G,H_k\rangle_{L^2(\mu)}. \tag{2.9}
\]

1. The smallest $m$ with $c_m\neq0$ is called the Hermite rank of $G$.
2. Set $H^*(m)=m(H-1)+1$. If $H^*(m)\le\frac12$ we say $G$ has high Hermite rank (relative to $H$), otherwise it is said to have low Hermite rank.
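For concrete functions, the coefficients in (2.9) and the Hermite rank can be approximated by Gauss-Hermite quadrature. The sketch below is illustrative only: it relies on NumPy's probabilists' Hermite module (`numpy.polynomial.hermite_e`), whose polynomials coincide with the $H_m$ above, and the helper names `hermite_coefficients` and `hermite_rank`, the truncation degree and the tolerance are assumptions made for this example. The quadrature is exact for polynomial $G$ of moderate degree and only approximate otherwise.

```python
import math

import numpy as np
from numpy.polynomial.hermite_e import hermegauss, hermeval


def hermite_coefficients(G, max_degree=6, quad_points=32):
    """Approximate c_k = E[G(Z) H_k(Z)] / k! for Z ~ N(0, 1), k = 0..max_degree."""
    nodes, weights = hermegauss(quad_points)
    # hermegauss integrates against exp(-x^2 / 2); normalise to the N(0, 1) law.
    weights = weights / math.sqrt(2.0 * math.pi)
    values = G(nodes)
    coeffs = []
    for k in range(max_degree + 1):
        unit = np.zeros(k + 1)
        unit[k] = 1.0                    # coefficient vector selecting H_k
        h_k = hermeval(nodes, unit)
        coeffs.append(float(np.sum(weights * values * h_k)) / math.factorial(k))
    return coeffs


def hermite_rank(G, tol=1e-8, **kwargs):
    """Smallest k with |c_k| > tol among the computed coefficients, else None."""
    for k, c in enumerate(hermite_coefficients(G, **kwargs)):
        if abs(c) > tol:
            return k
    return None


print(hermite_rank(lambda x: x ** 2 - 1))      # H_2 itself
print(hermite_rank(lambda x: x ** 3))          # x^3 = H_3 + 3 H_1
print(hermite_rank(lambda x: x ** 3 - 3 * x))  # H_3 itself
```

Given the rank $m$, the classification into high or low Hermite rank then only requires comparing $m(H-1)+1$ with $\frac12$.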

2.4 Joint functional CLT / non-CLT

Functional limit theorems for Gaussian processes have been extensively studied. The theorem we will need is from [GL20]; it is tailored for proving the lifted functional limit theorem. We first introduce the notation.

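Convention 2.6 Let $y_t^\varepsilon=y_{\frac t\varepsilon}$ be the rescaled stationary fractional Ornstein-Uhlenbeck process with standard Gaussian distribution $\mu$ and Hurst parameter $H\in(0,1)\setminus\{\frac12\}$. Each $G_k:\mathbb R\to\mathbb R$ is a centred function in $L^2(\mu)$ with Hermite rank $m_k$. Let $\alpha_k(\varepsilon)=\alpha(\varepsilon,H^*(m_k))$. Set

\[
X^\varepsilon:=\big(X^{1,\varepsilon},\dots,X^{N,\varepsilon}\big),\qquad\text{where } X_t^{k,\varepsilon}=\alpha_k(\varepsilon)\int_0^t G_k(y_s^\varepsilon)\,ds. \tag{2.10}
\]

We further define the rough paths $\mathbf X^\varepsilon=(X^\varepsilon,\mathbb X^{i,j,\varepsilon})$, where

\[
\mathbb X_{u,t}^{i,j,\varepsilon}:=\int_u^t\big(X_s^{i,\varepsilon}-X_u^{i,\varepsilon}\big)\,dX_s^{j,\varepsilon}=\alpha_i(\varepsilon)\alpha_j(\varepsilon)\int_u^t\int_u^s G_i(y_r^\varepsilon)G_j(y_s^\varepsilon)\,dr\,ds. \tag{2.11}
\]

The process $\mathbf X^\varepsilon=(X^\varepsilon,\mathbb X^{i,j,\varepsilon})$ is called the canonical lift of $X^\varepsilon$.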
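The scaling constants $\alpha_k(\varepsilon)=\alpha(\varepsilon,H^*(m_k))$ from (1.3) are simple to tabulate. The following minimal sketch transcribes the three regimes; the function names and the floating point tolerance used to detect the critical case $H^*(m)=\frac12$ are assumptions made for illustration.

```python
import math


def h_star(m: int, hurst: float) -> float:
    """H*(m) = m (H - 1) + 1."""
    return m * (hurst - 1.0) + 1.0


def alpha(eps: float, hs: float, tol: float = 1e-12) -> float:
    """Scaling constant alpha(eps, H*(m)) following the three cases of (1.3)."""
    if abs(hs - 0.5) <= tol:             # critical case: logarithmic correction
        return 1.0 / math.sqrt(eps * abs(math.log(eps)))
    if hs < 0.5:                         # diffusive (Wiener) scaling
        return 1.0 / math.sqrt(eps)
    return eps ** (hs - 1.0)             # Hermite scaling


def limit_block(m: int, hurst: float, tol: float = 1e-12) -> str:
    """Which block of the limit the k-th component belongs to (Convention 2.9)."""
    return "Wiener" if h_star(m, hurst) <= 0.5 + tol else "Hermite"


# Illustrative parameters (assumptions): H = 0.8 with ranks 1, 2, 3.
for rank in (1, 2, 3):
    hs = h_star(rank, 0.8)
    print(rank, round(hs, 3), limit_block(rank, 0.8), alpha(0.01, hs))
```

For $H=0.8$ the critical rank is $\frac{1}{2(1-H)}=2.5$, so ranks 1 and 2 fall in the Hermite block and rank 3 in the Wiener block.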

Without any further assumptions on $G_k$, $X^\varepsilon$ can be shown to converge jointly in finite dimensional distributions. For the convergence in a Hölder topology, we assume that $G_k\in L^{p_k}(\mu)$ for $p_k$ sufficiently large. This means $H^*(m_k)-\frac{1}{p_k}>0$ if $G_k$ has low Hermite rank and otherwise $\frac12-\frac{1}{p_k}>0$. This condition is summarised in part (3) of Assumption 2.10.

Theorem 2.7 (Joint Functional CLT/non-CLT) Suppose that $G_k$ are centred and furthermore satisfy Assumption 2.10 (3). Write $G_k=\sum_{l=m_k}^\infty c_{k,l}H_l$ and set

\[
X^{W,\varepsilon}=\big(X^{1,\varepsilon},\dots,X^{n,\varepsilon}\big),\qquad X^{Z,\varepsilon}=\big(X^{n+1,\varepsilon},\dots,X^{N,\varepsilon}\big).
\]

Then, the following holds:

1. There exist stochastic processes $X^W=(X^1,\dots,X^n)$ and $X^Z=(X^{n+1},\dots,X^N)$ such that on every finite interval $[0,T]$,
\[
(X^{W,\varepsilon},X^{Z,\varepsilon})\longrightarrow(X^W,X^Z)
\]
weakly in $\mathcal C^\gamma([0,T],\mathbb R^N)$. We can take $\gamma$ to be any number smaller than $\frac12-\frac{1}{\min_{k\le n}p_k}$ if at least one component converges to a Wiener process; otherwise we can take $\gamma<\min_{k>n}\big(H^*(m_k)-\frac{1}{p_k}\big)$.

2. In particular the following holds,
\[
\sup_{\varepsilon\in(0,\frac12)}\big\|X_{s,t}^{k,\varepsilon}\big\|_{L^{p_k}}\lesssim
\begin{cases}
\sqrt{|t-s|}, & \text{if } H^*(m_k)\le\frac12,\\[4pt]
|t-s|^{H^*(m_k)}, & \text{if } H^*(m_k)>\frac12.
\end{cases}
\]
Furthermore, for any $t>0$,
\[
\lim_{\varepsilon\to0}\big\|X_t^{Z,\varepsilon}-X_t^Z\big\|_{L^2(\Omega)}=0.
\]

3. The limit $X=(X^W,X^Z)$ has the following properties:

(1) $X^W\in\mathbb R^n$ and $X^Z\in\mathbb R^{N-n}$ are independent.

(2) $X_t^W=U\hat W_t$ where $\hat W_t$ is a standard Wiener process and $U$ is a square root of the matrix $(2A^{i,j})_{i,j\le n}$. Let $\varrho(r)=E(y_ry_0)$; then the entries of the matrix are given as follows:
\[
A^{i,j}=\int_0^\infty E\big(G_i(y_s)G_j(y_0)\big)\,ds=\sum_{q=m_i\vee m_j}^\infty c_{i,q}\,c_{j,q}\,(q!)\int_0^\infty\varrho(r)^q\,dr.
\]
In other words, $E\big[X_t^iX_s^j\big]=2(t\wedge s)A^{i,j}$ for $i,j\le n$.

(3) Let $Z_t^{H^*(m_k),m_k}$ be the Hermite processes, represented by (2.1), and
\[
Z_t^k=\frac{m_k!}{K(H^*(m_k),m_k)}Z_t^{H^*(m_k),m_k}. \tag{2.12}
\]
Then,
\[
X^Z=\big(c_{n+1,m_{n+1}}Z_t^{n+1},\dots,c_{N,m_N}Z_t^N\big).
\]
We emphasise that the Wiener process defining the Hermite processes is the same for every $k$, and is in addition independent of $\hat W_t$.

2.5 Assumptions and Conventions

Definition 2.8 A function $G\in L^2(\mu)$, $G=\sum_{l=0}^\infty c_lH_l$, is said to satisfy the fast chaos decay condition with parameter $q\in\mathbb N$, if

\[
\sum_{l=0}^\infty|c_l|\,\sqrt{l!}\,(2q-1)^{\frac l2}<\infty.
\]

For functions $G_1,\dots,G_N$ in $L^2(\mu)$, we write $m_k$ for their Hermite ranks.

Convention 2.9 Given a collection of functions $(G_k\in L^2(\mu),\,k\le N)$, we will label the high rank ones first, so $H^*(m_k)\le\frac12$ for $k=1,\dots,n$, where $n\ge0$, and otherwise $H^*(m_k)>\frac12$.

Assumption 2.10 (CLT rough, $\mathcal C^\gamma$-assumptions) Each $G_k$ belongs to $L^{p_k}(\mu)$ for some $p_k>2$ and has Hermite rank $m_k\ge1$. Furthermore,

(1) Each $G_k$ satisfies the fast chaos decay condition with parameter $q\ge4$.

(2) (Integrability condition) $p_k$ is sufficiently large so the following holds:
\[
\min_{k\le n}\Big(\frac12-\frac{1}{p_k}\Big)+\min_{k>n}\Big(H^*(m_k)-\frac{1}{p_k}\Big)>1. \tag{2.13}
\]

(3) If $G_k$ has low Hermite rank, assume $H^*(m_k)-\frac{1}{p_k}>\frac12$; otherwise assume $\frac12-\frac{1}{p_k}>\frac13$.

(4) Either $H^*(m_k)<0$ or $H^*(m_k)>\frac12$.

Remark 2.11

1. If the functions $G_k$ are polynomial functions, all assumptions stated above are automatically satisfied, except for (4).
2. The moment assumptions arise from the necessity to obtain the convergence, not just in the space of continuous functions but also in a rough path space $\mathscr C^\gamma$ for some $\gamma>\frac13$, which is naturally established by Kolmogorov type arguments, to be able to use the continuity of the solution maps in the rough path setting.
3. Let $\eta$ denote the greatest common Hölder continuity exponent for the first $n$ terms in $X^\varepsilon$, each of which converges to a Wiener process. Let $\tau$ denote the greatest common Hölder continuity exponent for the rest of the components of $X^\varepsilon$. Then condition (2) is used for making sure $\eta+\tau>1$. With this, any iterated integral, in which one term converges to a Wiener process and the other one to a Hermite process, can be interpreted as a Young integral.
4. In Condition (4) we have to assume $H^*(m_k)<0$, leaving a gap $[0,\frac12]$. This restriction is due to Proposition 3.20, where we only obtain the required integrability estimates for $H^*(m_k)<0$.
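To see how conditions (2) and (4) interact, the following sketch evaluates them for given Hermite ranks $m_k$ and integrability exponents $p_k$. It is an illustration under stated assumptions: the exponents $\eta$ and $\tau$ are taken to be $\min_{k\le n}(\frac12-\frac1{p_k})$ and $\min_{k>n}(H^*(m_k)-\frac1{p_k})$ as in (2.13) and Remark 2.11 (3), the condition is treated as vacuous when one of the two blocks is empty, and the function name `check_conditions` is hypothetical.

```python
def h_star(m: int, hurst: float) -> float:
    """H*(m) = m (H - 1) + 1."""
    return m * (hurst - 1.0) + 1.0


def check_conditions(hurst, ranks, p):
    """Evaluate the integrability condition (2.13) and the rank gap condition (4)."""
    pairs = list(zip(ranks, p))
    wiener = [(m, q) for m, q in pairs if h_star(m, hurst) <= 0.5]
    hermite = [(m, q) for m, q in pairs if h_star(m, hurst) > 0.5]

    if wiener and hermite:
        eta = min(0.5 - 1.0 / q for _, q in wiener)
        tau = min(h_star(m, hurst) - 1.0 / q for m, q in hermite)
        integrability = eta + tau > 1.0
    else:
        # Assumption of this sketch: no mixed iterated integrals, nothing to check.
        integrability = True

    rank_gap = all(
        h_star(m, hurst) < 0.0 or h_star(m, hurst) > 0.5 for m in ranks
    )
    return {"integrability": integrability, "rank_gap": rank_gap}


# Illustrative values (assumptions): H = 0.9, one high-rank and one low-rank function.
print(check_conditions(0.9, ranks=[11, 1], p=[20.0, 20.0]))
print(check_conditions(0.9, ranks=[11, 1], p=[4.0, 4.0]))
print(check_conditions(0.9, ranks=[6, 1], p=[20.0, 20.0]))
```

Since $\eta<\frac12$, the condition $\eta+\tau>1$ forces $\tau>\frac12$, which is why large $p_k$ and a Hurst parameter well above $\frac12$ are needed in the mixed case.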

3 Lifted joint functional limit theorem

If $X^{(n)}$ and $Y^{(n)}$ are two sequences of stochastic processes with $X^{(n)}\to X$ and $Y^{(n)}\to Y$ (even if the convergence is almost surely everywhere and even if $X$ and $Y$ are differentiable curves), we may fail to conclude that $\int_0^t X_s^{(n)}\,dY_s^{(n)}\to\int_0^t X_s\,dY_s$. Take for example $X_t^{(n)}=\frac{1}{\sqrt n}\cos(nt)$ and $Y_t^{(n)}=\frac{1}{\sqrt n}\sin(nt)$.

If a sequence of vector valued stochastic processes $(X_1^{(n)},X_2^{(n)})$ together with its canonical lift converge in the rough path topology, the limit of the iterated integrals may not be the same as the iterated integrals of the limit. We give an example for this by modifying the earlier example by pumping randomness into the cos and sin sequences using random variables $\lambda(1),\lambda(2)$ taking values in $\{1,-1\}$. Define a sequence of stochastic processes $\{X_1^{(n)}\}$ as follows:

\[
X_1^{(n)}(t)=
\begin{cases}
\frac{1}{\sqrt n}\cos(nt), & \lambda(1)=1,\\[4pt]
\frac{1}{\sqrt n}\sin(nt), & \lambda(1)=-1,
\end{cases}
\]

and similarly $X_2^{(n)}$. Then, $X_1^{(n)}\to0$ in $\mathcal C^\alpha$ for $\alpha<\frac12$ and the same holds true for $X_2^{(n)}$; however,

\[
\lim_{n\to\infty}\int_0^t X_1^{(n)}(s)\,dX_2^{(n)}(s)=
\begin{cases}
\frac t2, & \lambda(1)=1,\ \lambda(2)=-1,\\[4pt]
0, & \lambda(1)=\lambda(2),\\[4pt]
-\frac t2, & \lambda(1)=-1,\ \lambda(2)=1.
\end{cases}
\]

In this example, $(X_1^{(n)},X_2^{(n)})$ together with its canonical lift converge in the rough path topology. The limit of the iterated integrals depends on $\lambda$. If each of $\lambda(1)$ and $\lambda(2)$ is uniformly distributed, the marginals are always the same, but the joint distribution depends on the correlation between the random variables $\lambda(1)$ and $\lambda(2)$.
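The cos/sin example can be verified numerically. The sketch below approximates the iterated integral by a trapezoidal Riemann-Stieltjes sum for a large but fixed $n$; the function names, the grid size and the choice $n=1000$, $t=1$ are assumptions made for illustration. The paths have sup norm $\frac{1}{\sqrt n}$, yet the iterated integrals stay close to $\pm\frac t2$ or $0$ depending on $(\lambda(1),\lambda(2))$.

```python
import numpy as np


def path(lam: int, n: int, s: np.ndarray) -> np.ndarray:
    """X^{(n)}(s): cos(ns)/sqrt(n) if lam == 1, sin(ns)/sqrt(n) if lam == -1."""
    wave = np.cos(n * s) if lam == 1 else np.sin(n * s)
    return wave / np.sqrt(n)


def iterated_integral(lam1: int, lam2: int, n: int, t: float = 1.0,
                      steps: int = 200_000) -> float:
    """Trapezoidal approximation of int_0^t X_1^{(n)}(s) dX_2^{(n)}(s)."""
    s = np.linspace(0.0, t, steps + 1)
    x1 = path(lam1, n, s)
    x2 = path(lam2, n, s)
    return float(np.sum(0.5 * (x1[1:] + x1[:-1]) * np.diff(x2)))


n = 1000
print("sup norm of each path:", 1.0 / np.sqrt(n))
for lam1, lam2 in [(1, -1), (-1, 1), (1, 1), (-1, -1)]:
    print(lam1, lam2, iterated_integral(lam1, lam2, n))
```

For $(\lambda(1),\lambda(2))=(1,-1)$ the exact value is $\frac t2+\frac{\sin(2nt)}{4n}$, so the deviation from $\frac t2$ is of order $\frac1n$.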

In this section, we show that $\mathbf X^\varepsilon=(X^\varepsilon,\mathbb X^{i,j,\varepsilon})$, the canonical lift of $X^\varepsilon$, converges in the rough path topology. Specifically, we will show in §3.3 that the secondary processes $\mathbb X^{i,j,\varepsilon}$, involving only $i,j\le n$, converge jointly in finite dimensional distributions (which is more involved due to the lack of the strong mixing property). In §3.1, we prove that $\{(X^\varepsilon,\mathbb X^{i,j,\varepsilon}),\varepsilon\in(0,\frac12]\}$ is tight in the rough path topology. The tightness, plus the fact that we can identify the limiting joint probability distributions with stochastic integrals $\int_0^t X_i\,dX_j$, shows that $(X^{W,\varepsilon},\mathbb X^{i,j,\varepsilon},\,i,j\le n)$ converges in the rough path topology to $X^W$ and its lift. Furthermore, we identify the remaining canonical lift parts of $(X^Z,X^W)$ as measurable functions of $(X^W,X^Z)$. The rest follows from Theorem 2.7.

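3.1 Relative compactness of iterated integrals

In this section, we establish moment bounds on the iterated integrals and prove that $\mathbf X^\varepsilon$ is tight in the rough path topology. Let $G_i$ and $G_j$ be two functions in $L^2(\mu)$ with Hermite ranks $m_{G_i}$ and $m_{G_j}$ respectively. Set $\alpha_i(\varepsilon)=\alpha(\varepsilon,H^*(m_{G_i}))$ and $\alpha_j(\varepsilon)=\alpha(\varepsilon,H^*(m_{G_j}))$. Recall that

\[
\mathbb X_{u,t}^{i,j,\varepsilon}=\alpha_i(\varepsilon)\alpha_j(\varepsilon)\int_u^t\int_u^s G_i(y_r^\varepsilon)G_j(y_s^\varepsilon)\,dr\,ds.
\]

To obtain tightness, we assume that the coefficients $c_{i,k}$ in the Hermite expansion of $G_i$ satisfy the decay condition specified in Assumption 2.10 (1). We want to argue by Theorem 3.1 in [FH14], the rough path analogue to Kolmogorov's theorem. Thus, we need to estimate $\|\mathbb X_{u,t}^{i,j,\varepsilon}\|_{L^p(\Omega)}$, where by stationarity we may from now on assume $u=0$.

If $G_i$ and $G_j$ are in a finite chaos of order $Q$, then

\[
E\Big(\mathbb X_{0,t}^{i,j,\varepsilon}\Big)^p=E\left(\alpha_i(\varepsilon)\alpha_j(\varepsilon)\int_0^t\int_0^s G_i(y_r^\varepsilon)G_j(y_s^\varepsilon)\,dr\,ds\right)^p \tag{3.1}
\]

\[
=\alpha_i(\varepsilon)^p\alpha_j(\varepsilon)^p\,E\left(\int_0^t\int_0^s\sum_{k,k'=1}^{Q}c_{i,k}c_{j,k'}H_k(y_r^\varepsilon)H_{k'}(y_s^\varepsilon)\,dr\,ds\right)^p \tag{3.2}
\]

\[
\le\alpha_i(\varepsilon)^p\alpha_j(\varepsilon)^p\sum_{k_1,\dots,k_p=m_{G_i}}^{Q}\ \sum_{k_1',\dots,k_p'=m_{G_j}}^{Q}\ \prod_{l=1}^p|c_{i,k_l}c_{j,k_l'}|\int_0^t\int_0^{s_1}\cdots\int_0^t\int_0^{s_p}E\left(\prod_{l=1}^pH_{k_l}(y_{r_l}^\varepsilon)H_{k_l'}(y_{s_l}^\varepsilon)\right)\prod_{l=1}^p dr_l\,ds_l. \tag{3.3}
\]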

This means we need to estimate the terms $E\big(\prod_{l=1}^pH_{k_l}(y_{r_l}^\varepsilon)H_{k_l'}(y_{s_l}^\varepsilon)\big)$. For convenience, we will relabel the indices so as to write the product in the form $E\big(\prod_{l=1}^{2p}H_{k_l}(y_{s_l}^\varepsilon)\big)$. For a product of two factors, we have the identity $E\big(H_m(y_s^\varepsilon)H_n(y_r^\varepsilon)\big)=\delta_{n,m}\,m!\,\big(E(y_s^\varepsilon y_r^\varepsilon)\big)^m$. For the multiple product, we use the so called diagram formula, see e.g. [BH02] and references therein. The diagram formula states that the expectation we are concerned with can be calculated by summing over products of covariances, similar to Isserlis'/Wick's theorem. This can be linked to graphs. Nodes of these graphs correspond to the $y_{s_l}^\varepsilon$'s and each such node has exactly $k_l$ edges, where no edge may connect a node to itself. Each edge between $y_{s_l}^\varepsilon$ and $y_{s_q}^\varepsilon$ corresponds to a factor $E(y_{s_q}^\varepsilon y_{s_l}^\varepsilon)$. The expectation we are concerned with is then given by summing over all possible graphs of such complete pairings.

For a particular graph $\Gamma$, we denote by $n(l,q)$ the number of edges connecting $l$ to $q$, so it takes values in $\{0,1,\dots,\min(k_l,k_q)\}$, and consider the pairings in an ordered way so that each pairing is counted only once. We thus have $\sum_{q=1}^{2p}n(l,q)=k_l$ and, since edges are only allowed to connect different nodes, $n(q,q)=0$ for every $q$. For any given graph the corresponding product is

\[
\prod_{q=1}^{2p}\prod_{l=q+1}^{2p}E\big(y_{s_q}^\varepsilon y_{s_l}^\varepsilon\big)^{n(l,q)}=\prod_{q=1}^{2p}\ \prod_{\{l:\,l>q,\ l\in\Gamma_q\}}\varrho^\varepsilon(s_l-s_q)^{n(l,q)},
\]

where $\Gamma_q$ denotes the subgraph of nodes connected to $q$. Thus,

\[
E\left(\prod_{l=1}^{2p}H_{k_l}(y_{s_l}^\varepsilon)\right)=\sum_\Gamma\prod_{q=1}^{2p}\ \prod_{\{l:\,l>q,\ l\in\Gamma_q\}}\varrho^\varepsilon(s_l-s_q)^{n(l,q)},
\]

where the sum ranges over all suitable graphs $\Gamma$ given $(k_1,\dots,k_{2p})$.

Lemma 3.1

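1. Let $\Gamma$ denote a complete pairing of $2p$ nodes with a suitable amount of edges $(k_1,\dots,k_{2p})$. Define:

\[
I(\varepsilon,2p,\Gamma):=\overbrace{\int_0^t\cdots\int_0^t}^{2p}\ \prod_{\{(s_q,s_l)\}\in\Gamma}\big(E(y_{s_q}^\varepsilon y_{s_l}^\varepsilon)\big)^{n(q,l)}\,ds_1\dots ds_{2p}.
\]

Then,

\[
I(\varepsilon,2p,\Gamma)\lesssim\prod_{l=1}^{2p}\sqrt{t\int_{-t}^t|\varrho^\varepsilon(s)|^{k_l}\,ds}\lesssim\prod_{l=1}^{2p}\frac{t^{H^*(k_l)\vee\frac12}}{\alpha(\varepsilon,H^*(k_l))}. \tag{3.4}
\]

2. Let $G_i,G_j:\mathbb R\to\mathbb R$ be functions in finite chaos with Hermite ranks $m_{G_i}$ and $m_{G_j}$ respectively. Then,

\[
\big\|\mathbb X_{0,t}^{i,j,\varepsilon}\big\|_{L^p(\Omega)}=\left\|\alpha_i(\varepsilon)\alpha_j(\varepsilon)\int_0^t\int_0^s G_i(y_r^\varepsilon)G_j(y_s^\varepsilon)\,dr\,ds\right\|_{L^p(\Omega)}\lesssim t^{H^*(m_{G_i})\vee\frac12+H^*(m_{G_j})\vee\frac12}.
\]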

Proof. For a general graph, let us start dealing with the first variable $s_1$. We first count forward and observe

\[
\prod_{\{(s_q,s_l)\}\in\Gamma}\big(E(y_{s_q}^\varepsilon y_{s_l}^\varepsilon)\big)^{n(q,l)}=\prod_{q=1}^{2p}\prod_{l=q+1}^{2p}\big(E(y_{s_q}^\varepsilon y_{s_l}^\varepsilon)\big)^{n(q,l)}=\prod_{q=1}^{2p}\ \prod_{\{l:\,l>q,\ l\in\Gamma_q\}}\big(\varrho^\varepsilon(s_l-s_q)\big)^{n(l,q)},
\]

where $\Gamma_q$ denotes the subgraph of nodes connected to $q$. Using Hölder's inequality we obtain

\[
\int_0^t\prod_{\{q:\,q>1,\ q\in\Gamma_1\}}|\varrho^\varepsilon(s_1-s_q)|^{n(1,q)}\,ds_1\le\prod_{\{q:\,q>1,\ q\in\Gamma_1\}}\left(\int_0^t|\varrho^\varepsilon(s_1-s_q)|^{k_1}\,ds_1\right)^{\frac{n(1,q)}{k_1}}\le\int_{-t}^t|\varrho^\varepsilon(s_1)|^{k_1}\,ds_1.
\]

We have used $\sum_{\{q>1:\,q\in\Gamma_1\}}n(1,q)=k_1$, the number of edges at node 1. We then peel off the integrals layer by layer, and proceed with the same technique to the next integration variable. For example, suppose the remaining integrand containing $s_2$ has the combined exponent $\tau_2=\sum_{q=2}^{2p}n(2,q)$ (and $\tau_1=k_1$). By the same procedure as for $s_1$ we obtain a factor

\[
\int_{-t}^t|\varrho^\varepsilon(s_2)|^{\tau_2}\,ds_2.
\]

By induction and putting the estimates for each integral together,

\[
\overbrace{\int_0^t\cdots\int_0^t}^{2p}\ \prod_{q=1}^{2p}\ \prod_{\{l:\,l>q,\ l\in\Gamma_q\}}\big(\varrho^\varepsilon(s_l-s_q)\big)^{n(l,q)}\,ds_1\dots ds_{2p}\lesssim\prod_{q=1}^{2p}\int_{-t}^t|\varrho^\varepsilon(s)|^{\tau_q}\,ds.
\]

Following [BH02], we reverse the procedure in the estimation for the integral kernel. Let $\xi_q$ denote the number of edges connected to the node $q$ in the backward direction, so $\xi_q=\sum_{l=1}^{q}n(l,q)$, and the same reasoning leads to the following estimate:

\[
\overbrace{\int_0^t\cdots\int_0^t}^{2p}\ \prod_{q=1}^{2p}\ \prod_{\{l:\,l<q,\ l\in\Gamma_q\}}\big(\varrho^\varepsilon(s_l-s_q)\big)^{n(l,q)}\,ds_1\dots ds_{2p}\lesssim\prod_{q=1}^{2p}\int_{-t}^t|\varrho^\varepsilon(s)|^{\xi_q}\,ds.
\]

Since $\tau_q+\xi_q=k_q$, by Hölder's inequality,

\[
\int_{-t}^t|\varrho^\varepsilon(s)|^{\tau_q}\,ds\int_{-t}^t|\varrho^\varepsilon(s)|^{\xi_q}\,ds\le2t\int_{-t}^t|\varrho^\varepsilon(s)|^{k_q}\,ds.
\]

Therefore,

\[
\left(\overbrace{\int_0^t\cdots\int_0^t}^{2p}\ \prod_{q=1}^{2p}\ \prod_{\{l:\,l>q,\ l\in\Gamma_q\}}\big(\varrho^\varepsilon(s_l-s_q)\big)^{n(l,q)}\,ds_1\dots ds_{2p}\right)^2\lesssim\prod_{q=1}^{2p}\left(t\int_{-t}^t|\varrho^\varepsilon(s)|^{k_q}\,ds\right).
\]

By Lemma 2.4 we obtain, for each $q\in\{1,\dots,N\}$,

\[
\alpha(\varepsilon,H^*(k_q))^2\ t\int_{-t}^t|\varrho^\varepsilon(s)|^{k_q}\,ds\lesssim t^{2H^*(k_q)\vee1},
\]

hence the first part of the lemma follows.

For $G_i$ and $G_j$ we obtain as in Equation (3.1), using the fact that $\varrho^\varepsilon>0$ and thus that we may enlarge our integration area,

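\[
\begin{aligned}
\big\|\mathbb X_{0,t}^{i,j,\varepsilon}\big\|_{L^p(\Omega)}^p
&\le\alpha_i(\varepsilon)^p\alpha_j(\varepsilon)^p\sum_{k_1,\dots,k_p=m_{G_i}}^{Q}\ \sum_{k_1',\dots,k_p'=m_{G_j}}^{Q}\ \prod_{l=1}^p|c_{i,k_l}c_{j,k_l'}|\int_0^t\int_0^{s_1}\cdots\int_0^t\int_0^{s_p}E\left(\prod_{l=1}^pH_{k_l}(y_{r_l}^\varepsilon)H_{k_l'}(y_{s_l}^\varepsilon)\right)\prod_{l=1}^p dr_l\,ds_l\\
&\lesssim\alpha_i(\varepsilon)^p\alpha_j(\varepsilon)^p\sum_{k_1,\dots,k_p=m_{G_i}}^{Q}\ \sum_{k_1',\dots,k_p'=m_{G_j}}^{Q}\ \prod_{l=1}^p|c_{i,k_l}c_{j,k_l'}|\int_{[0,t]^{2p}}E\left(\prod_{l=1}^pH_{k_l}(y_{r_l}^\varepsilon)H_{k_l'}(y_{s_l}^\varepsilon)\right)\prod_{l=1}^p dr_l\,ds_l\\
&=\alpha_i(\varepsilon)^p\alpha_j(\varepsilon)^p\sum_{k_1,\dots,k_p=m_{G_i}}^{Q}\ \sum_{k_1',\dots,k_p'=m_{G_j}}^{Q}\ \prod_{l=1}^p|c_{i,k_l}c_{j,k_l'}|\sum_\Gamma I(\varepsilon,2p,\Gamma)\\
&\lesssim\prod_{l=1}^p t^{H^*(k_l)\vee\frac12}\,t^{H^*(k_l')\vee\frac12}.
\end{aligned}
\]

By monotonicity of $H^*$ and the fact that $k_l\ge m_{G_i}$ and $k_l'\ge m_{G_j}$,

\[
\left(\prod_{l=1}^p t^{H^*(k_l)\vee\frac12}\,t^{H^*(k_l')\vee\frac12}\right)^{\frac1p}\le t^{H^*(m_{G_i})\vee\frac12+H^*(m_{G_j})\vee\frac12},
\]

concluding the proof.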

For functions not belonging to a finite chaos we must count the number of graphs in the computation and need some assumptions. Let $M(k_1,\dots,k_{2p})$ denote the cardinality of admissible graphs with $2p$ nodes with respectively $(k_1,\dots,k_{2p})$ edges. In [Taq77] it was shown that

\[
M(k_1,k_2,\dots,k_{2p})\le\prod_{l=1}^{2p}(2p-1)^{\frac{k_l}{2}}\sqrt{k_l!}.
\]

This leads to Assumption 2.10 (1), which restricts the $G_i$'s to the class of functions whose coefficients in the Hermite expansion decay sufficiently fast.
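For small examples the number of admissible graphs can be counted by brute force and compared with the bound above. The sketch below is an illustration under stated assumptions: it interprets an admissible graph as a complete pairing of the $k_l$ half-edges attached to each node in which no half-edge is paired within its own node, and the names `count_diagrams` and `taqqu_bound` are hypothetical. With all covariances set to one, the count equals $E\big(\prod_l H_{k_l}(Z)\big)$ for a single standard Gaussian $Z$, which gives an independent check.

```python
import math
from functools import lru_cache


def count_diagrams(degrees) -> int:
    """Count complete pairings of half-edges with no pair inside a single node."""

    @lru_cache(maxsize=None)
    def count(remaining):
        # Locate the first node that still has an unpaired half-edge.
        first = next((i for i, r in enumerate(remaining) if r > 0), None)
        if first is None:
            return 1                      # everything is paired
        total = 0
        # Pair one fixed half-edge of `first` with a half-edge of a later node;
        # earlier nodes are already exhausted by construction.
        for j in range(first + 1, len(remaining)):
            if remaining[j] > 0:
                nxt = list(remaining)
                nxt[first] -= 1
                nxt[j] -= 1
                total += remaining[j] * count(tuple(nxt))
        return total

    return count(tuple(degrees))


def taqqu_bound(degrees) -> float:
    """Bound prod_l (N - 1)^{k_l / 2} sqrt(k_l!) with N = number of nodes (N = 2p)."""
    nodes = len(degrees)
    return math.prod(
        (nodes - 1) ** (k / 2.0) * math.sqrt(math.factorial(k)) for k in degrees
    )


for ks in [(1, 1), (2, 2), (3, 3), (1, 1, 1, 1), (2, 2, 2, 2)]:
    print(ks, count_diagrams(ks), taqqu_bound(ks))
```

For two nodes with $k$ edges each the count is $k!$, matching $E(H_k(Z)^2)=k!$, and for $(2,2,2,2)$ it is $60=E\big((Z^2-1)^4\big)$.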

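Proposition 3.2 Suppose that each $G_k$ satisfies Assumption 2.10. Then, one has for $i,j\in\{1,\dots,N\}$,

\[
\left\|\alpha_i(\varepsilon)\alpha_j(\varepsilon)\int_0^t\int_0^s G_i(y_r^\varepsilon)G_j(y_s^\varepsilon)\,dr\,ds\right\|_{L^p(\Omega)}\lesssim t^{H^*(m_{G_i})\vee\frac12+H^*(m_{G_j})\vee\frac12}.
\]

Consequently, $\mathbf X^\varepsilon$ is tight in $\mathscr C^\gamma$ for $\gamma\in\big(\frac13,\frac12-\frac{1}{\min_{k\le n}p_k}\big)$.

Proof. As above, using $\varrho^\varepsilon>0$ and $\prod_{l=1}^p t^{H^*(k_l)\vee\frac12}\,t^{H^*(k_l')\vee\frac12}\le t^{p\left(H^*(m_{G_i})\vee\frac12+H^*(m_{G_j})\vee\frac12\right)}$,

\[
\begin{aligned}
\big\|\mathbb X_{0,t}^{i,j,\varepsilon}\big\|_{L^p(\Omega)}^p
&\lesssim\alpha_i(\varepsilon)^p\alpha_j(\varepsilon)^p\sum_{k_1,\dots,k_p=m_{G_i}}^{\infty}\ \sum_{k_1',\dots,k_p'=m_{G_j}}^{\infty}\ \prod_{l=1}^p|c_{i,k_l}c_{j,k_l'}|\int_{[0,t]^{2p}}E\left(\prod_{l=1}^pH_{k_l}(y_{r_l}^\varepsilon)H_{k_l'}(y_{s_l}^\varepsilon)\right)\prod_{l=1}^p dr_l\,ds_l\\
&=\alpha_i(\varepsilon)^p\alpha_j(\varepsilon)^p\sum_{k_1,\dots,k_p=m_{G_i}}^{\infty}\ \sum_{k_1',\dots,k_p'=m_{G_j}}^{\infty}\ \prod_{l=1}^p|c_{i,k_l}c_{j,k_l'}|\sum_\Gamma I(\varepsilon,2p,\Gamma)\\
&\lesssim t^{p\left(H^*(m_{G_i})\vee\frac12+H^*(m_{G_j})\vee\frac12\right)}\sum_{k_1,\dots,k_p=m_{G_i}}^{\infty}\ \sum_{k_1',\dots,k_p'=m_{G_j}}^{\infty}\ \prod_{l=1}^p|c_{i,k_l}c_{j,k_l'}|\,M(k_1,\dots,k_p,k_1',\dots,k_p')\\
&\lesssim t^{p\left(H^*(m_{G_i})\vee\frac12+H^*(m_{G_j})\vee\frac12\right)}\sum_{k_1,\dots,k_p=m_{G_i}}^{\infty}\ \sum_{k_1',\dots,k_p'=m_{G_j}}^{\infty}\ \prod_{l=1}^p|c_{i,k_l}c_{j,k_l'}|\sqrt{k_l!\,k_l'!}\,(2p-1)^{\frac{k_l+k_l'}{2}}.
\end{aligned}
\]

By the chaos decay assumption, these sums are finite and this completes the proof for the required moment bounds. Finally, using Theorem 2.7, we can conclude the tightness of $\mathbf X^\varepsilon$ in $\mathscr C^\gamma$, where $\gamma\in\big(\frac13,\frac12-\frac{1}{\min_{k\le n}p_k}\big)$, by an application of Lemma 5.6.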

3.2 Young integral case (functional non-CLT in rough topology)

Lemma 3.3 Assume Assumption 2.10. Then,
\[
\big(X^\varepsilon,\ \mathbb{X}^{i,j,\varepsilon}\big)_{i,j\in\{1,\dots,N\}:\ i\vee j>n} \tag{3.5}
\]
converges in finite dimensional distributions to $(X,\mathbb{X}^{i,j})$, where $\mathbb{X}^{i,j}_t=\int_0^t X^i_s\,dX^j_s$ and these integrals are well defined as Young integrals.

Proof. By Assumption 2.10 and Theorem 2.7, each component of $X^\varepsilon$ converges in a Hölder space. Furthermore, by Assumption 2.10 (2) there exist numbers $\eta$ and $\tau$, with $\eta+\tau>1$, such that the Hölder regularity of the limits corresponding to a Wiener process is bounded below by $\eta$, and that of the limits corresponding to a Hermite process is bounded below by $\tau$. Therefore, taking the integrals
\[
\alpha_i(\varepsilon)\alpha_j(\varepsilon)\int_0^t\!\!\int_0^s G_j(y^\varepsilon_s)\,G_i(y^\varepsilon_r)\,dr\,ds=\int_0^t X^{i,\varepsilon}_s\,dX^{j,\varepsilon}_s
\]
is a continuous and well-defined operation from $\mathcal{C}^\eta\times\mathcal{C}^\tau\to\mathcal{C}^\tau$ or $\mathcal{C}^\tau\times\mathcal{C}^\eta\to\mathcal{C}^\eta$, thus weak convergence in $\mathcal{C}^\eta$ follows. Let $F$ denote the continuous map such that, for $i,j$ with $i\vee j>n$, $\mathbb{X}^{i,j,\varepsilon}=F(X^\varepsilon)^{i,j}$. Now, set $\mathbf{F}=\mathrm{id}\times F$,
\[
\mathbf{F}(X^\varepsilon)=\big(X^\varepsilon,F(X^\varepsilon)\big)=\big(X^\varepsilon,\ \mathbb{X}^{i,j,\varepsilon}\big)_{i,j\in\{1,\dots,N\}:\ i\vee j>n},
\]
which by the above is a continuous function. Thus, by an application of the continuous mapping theorem, we can conclude the lemma.

Remark 3.4 Note that by the moment bounds obtained in Theorem 2.7 and Proposition 3.2 the joint convergence takes place in better Hölder spaces. It remains to deal with the parts of the natural rough path lift involving two Wiener scaling terms; this is carried out in the next section.

3.3 Itô integral case (functional CLT in rough topology)

We proceed to establish the convergence of the iterated integrals where both components belong to the high Hermite rank case.

Remark 3.5 We further assume $H^*(m_k)<0$ for each $k$, which gives rise to a Wiener scaling. Thus, we do not obtain logarithmic terms and therefore work with the $\frac{1}{\sqrt{\varepsilon}}$ scaling from here on. Furthermore, in this case $\alpha(\varepsilon)\int_0^t G(y^\varepsilon_s)\,ds$ equals $\sqrt{\varepsilon}\int_0^{t/\varepsilon} G(y_s)\,ds$ in law, and for simplicity we will work with the latter in this section.

From here onwards in this section, we take $k,i,j\le n$. Thus, both $G_i$ and $G_j$ give rise to Wiener processes. Recall that
\[
X^{k,\varepsilon}_t=\sqrt{\varepsilon}\int_0^{t/\varepsilon} G_k(y_s)\,ds.
\]
By Theorem 2.7, $(X^{i,\varepsilon},X^{j,\varepsilon})\to(W^i,W^j)$ weakly, where $W^i$ and $W^j$ denote Wiener processes with covariances as specified in Theorem 2.7. We now want to show the convergence of the following integral:
\[
\int_0^t X^{i,\varepsilon}_s\,dX^{j,\varepsilon}_s=\varepsilon\int_0^{t/\varepsilon}\!\!\int_0^s G_i(y_r)\,G_j(y_s)\,dr\,ds=I_1(\varepsilon)+I_2(\varepsilon).
\]

We will show that $I_1(\varepsilon)\to\int_0^t W^i_s\,dW^j_s$ weakly, where the integral is understood in the Itô sense, and $I_2(\varepsilon)\to tA^{i,j}$ in probability for some constants $A^{i,j}$. For this we aim to use [KP91, Theorem 2.2]; hence, we need to approximate $X^{k,\varepsilon}$ by a suitable martingale, see also [BC17]. For any $L^2(\mu)$ function $U$, in particular for the $G_k$'s, one would have liked to work with the stationary process
\[
\Phi_U(t)=\int_t^\infty U(y_r)\,dr
\]
and use it to define $L^2(\Omega)$-martingale differences, see [KV86]. Unfortunately, this does not possess good enough integrability properties; thus, as in [BC17], we instead define
\[
\hat U(k):=\int_{k-1}^\infty \mathbf{E}\big(U(y_r)\,|\,\mathcal{F}_k\big)\,dr. \tag{3.6}
\]
Since $y$ is stationary, we do have $(\hat U\circ\tau)(k)=\hat U(k+1)$, where $\tau$ is the shift operator on sequences. Showing that $\hat U$ possesses the desired integrability properties is a bit more involved. We will show that there exists a local independent decomposition of the fractional Ornstein-Uhlenbeck process as follows: for every $t$ there exists a decomposition, $y_t=\bar y^k_t+\tilde y^k_t$, such that the first term $\bar y^k_t$ is $\mathcal{F}_k$ measurable and $\tilde y^k_t$ is independent of $\mathcal{F}_k$, where $\mathcal{F}_k$ is the filtration generated by the driving fractional Brownian motion up to time $k$. Both terms are Gaussian processes. This is given in Section 3.5. To proceed further we also need a couple of lemmas.

Lemma 3.6 For $x,y,a,b\in\mathbb{R}$ such that $a^2+b^2=1$,
\[
H_m(ax+by)=\sum_{j=0}^m\binom{m}{j}a^j b^{m-j}H_j(x)H_{m-j}(y). \tag{3.7}
\]

Lemma 3.7 Let $H\in(0,1)\setminus\{\frac12\}$. Set $a_t=\|\bar y^k_t\|_{L^2(\Omega)}$. Then,
\[
\mathbf{E}\big[H_m(y_t)\,|\,\mathcal{F}_k\big]=(a_t)^m H_m\Big(\frac{\bar y^k_t}{a_t}\Big).
\]
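As a numerical sanity check of Lemma 3.6 and of the mechanism behind Lemma 3.7, the following Python sketch may be helpful. It is only an illustration under stated assumptions: $H_m$ is taken to be the probabilists' Hermite polynomial, the $\mathcal{F}_k$-measurable part is frozen at a deterministic value $a\,x$, and the independent part is modelled as $b\,Z$ with $Z$ standard Gaussian and $a^2+b^2=1$. All function names are hypothetical.

```python
from math import comb, isclose


def hermite(n, x):
    """Probabilists' Hermite polynomial via He_{k+1} = x He_k - k He_{k-1}."""
    h_prev, h = 1.0, x
    if n == 0:
        return h_prev
    for k in range(1, n):
        h_prev, h = h, x * h - k * h_prev
    return h


def hermite_sum(m, a, b, x, y):
    """Right-hand side of identity (3.7)."""
    return sum(
        comb(m, j) * a**j * b ** (m - j) * hermite(j, x) * hermite(m - j, y)
        for j in range(m + 1)
    )


def gaussian_moment(k):
    """E[Z^k] for Z ~ N(0, 1): zero for odd k, (k - 1)!! for even k."""
    if k % 2:
        return 0.0
    result = 1.0
    for odd in range(1, k, 2):
        result *= odd
    return result


def hermite_poly_in_z(n, c, b):
    """Coefficients, in powers of z, of the polynomial He_n(c + b z)."""
    prev, cur = [1.0], [c, b]
    if n == 0:
        return prev
    for k in range(1, n):
        nxt = [0.0] * (len(cur) + 1)
        for p, coef in enumerate(cur):   # multiply He_k by (c + b z)
            nxt[p] += c * coef
            nxt[p + 1] += b * coef
        for p, coef in enumerate(prev):  # subtract k * He_{k-1}
            nxt[p] -= k * coef
        prev, cur = cur, nxt
    return cur


def conditional_expectation(m, a, x):
    """E[He_m(a x + b Z)] with b = sqrt(1 - a^2), integrating out Z exactly."""
    b = (1.0 - a * a) ** 0.5
    coeffs = hermite_poly_in_z(m, a * x, b)
    return sum(coef * gaussian_moment(p) for p, coef in enumerate(coeffs))


if __name__ == "__main__":
    a, b, x, y = 0.6, 0.8, 0.3, -1.2
    for m in range(7):
        print(m, hermite(m, a * x + b * y), hermite_sum(m, a, b, x, y),
              conditional_expectation(m, a, x), a**m * hermite(m, x))
```

For each $m$, the first two printed columns should agree (identity (3.7)), and so should the last two, which compare the integrated polynomial with $a^m H_m(x)$.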

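Proof. Let $y_t=\bar y^k_t+\tilde y^k_t$ denote the local independent decomposition of the fOU from Section 3.5 and set $b_t=\|\tilde y^k_t\|_{L^2(\Omega)}$. By the independence of $\bar y^k_t$ and $\tilde y^k_t$ we obtain
\[
1=\|y_t\|^2_{L^2(\Omega)}=\|\bar y^k_t\|^2_{L^2(\Omega)}+\|\tilde y^k_t\|^2_{L^2(\Omega)}=(a_t)^2+(b_t)^2.
\]
Now we decompose $H_m(y_t)$ using identity (3.7) and obtain
\[
H_m(y_t)=H_m\big(\bar y^k_t+\tilde y^k_t\big)=H_m\Big(a_t\frac{\bar y^k_t}{a_t}+b_t\frac{\tilde y^k_t}{b_t}\Big)=\sum_{j=0}^m\binom{m}{j}a_t^{\,j}\,b_t^{\,m-j}\,H_j\Big(\frac{\bar y^k_t}{a_t}\Big)H_{m-j}\Big(\frac{\tilde y^k_t}{b_t}\Big).
\]
By construction $\frac{\bar y^k_t}{a_t}$ and $\frac{\tilde y^k_t}{b_t}$ are standard Gaussian random variables. Together with the fact that $\bar y^k_t$ is measurable with respect to $\mathcal{F}_k$, this leads to
\[
\mathbf{E}\big[H_m(y_t)\,|\,\mathcal{F}_k\big]=\sum_{j=0}^m\binom{m}{j}(a_t)^j(b_t)^{m-j}H_j\Big(\frac{\bar y^k_t}{a_t}\Big)\,\mathbf{E}\Big[H_{m-j}\Big(\frac{\tilde y^k_t}{b_t}\Big)\,\Big|\,\mathcal{F}_k\Big]=(a_t)^m H_m\Big(\frac{\bar y^k_t}{a_t}\Big),
\]
where we used the fact that $\mathbf{E}\big[H_j\big(\frac{\tilde y^k_t}{b_t}\big)\big]$ vanishes for any $j\ge1$, that $\tilde y^k_t$ is independent of $\mathcal{F}_k$, and that $H_0=1$.

Proposition 3.8 If $U\in L^2(\mu)$ has Hermite rank $m$, then
\[
\|\hat U(k)\|^2_{L^2(\Omega)}\le\|U\|^2_{L^2(\mu)}\int_{k-1}^\infty\!\!\int_{k-1}^\infty\big(\mathbf{E}[\bar y^k_s\,\bar y^k_r]\big)^m\,dr\,ds. \tag{3.8}
\]
In particular $\{\hat U(k)\}_{k\ge1}$ is bounded in $L^2(\Omega)$ if $H\in(0,1)\setminus\{\frac12\}$ and $U$ has Hermite rank $m$ such that $H^*(m)<0$.

Proof. The 'in particular' part of the assertion follows from the statement that if $H\in(0,1)\setminus\{\frac12\}$ and $U$ has Hermite rank $m$ such that $H^*(m)<0$, then $\int_{k-1}^\infty\int_{k-1}^\infty\big(\mathbf{E}[\bar y^k_s\,\bar y^k_r]\big)^m\,dr\,ds<\infty$, see Proposition 3.20. Due to the lack of the strong mixing property, the proof of this is lengthy and independent of the error estimates here, and is therefore postponed to Section 3.5.

We go ahead with proving the estimate (3.8). Starting with the definition of $\hat U$ and the Hermite expansion $U=\sum_{q=m}^\infty c_qH_q$, we compute the $L^2(\Omega)$ norm as follows:
\[
\begin{aligned}
\|\hat U(k)\|^2_{L^2(\Omega)}
&=\sum_{q=m}^\infty\sum_{j=m}^\infty c_q c_j\int_{k-1}^\infty\!\!\int_{k-1}^\infty\mathbf{E}\Big(\mathbf{E}[H_q(y_s)\,|\,\mathcal{F}_k]\;\mathbf{E}[H_j(y_r)\,|\,\mathcal{F}_k]\Big)\,dr\,ds\\
&=\sum_{q=m}^\infty(c_q)^2\int_{k-1}^\infty\!\!\int_{k-1}^\infty\mathbf{E}\Big((a_s)^q(a_r)^q\,H_q\Big(\frac{\bar y^k_s}{a_s}\Big)H_q\Big(\frac{\bar y^k_r}{a_r}\Big)\Big)\,dr\,ds\\
&=\sum_{q=m}^\infty(c_q)^2\,q!\int_{k-1}^\infty\!\!\int_{k-1}^\infty(a_s)^q(a_r)^q\Big(\mathbf{E}\Big[\frac{\bar y^k_s}{a_s}\frac{\bar y^k_r}{a_r}\Big]\Big)^q\,dr\,ds\\
&=\sum_{q=m}^\infty(c_q)^2\,q!\int_{k-1}^\infty\!\!\int_{k-1}^\infty\big(\mathbf{E}[\bar y^k_s\,\bar y^k_r]\big)^q\,dr\,ds
\le\|U\|^2_{L^2(\mu)}\int_{k-1}^\infty\!\!\int_{k-1}^\infty\big(\mathbf{E}[\bar y^k_s\,\bar y^k_r]\big)^m\,dr\,ds.
\end{aligned}
\]
The desired conclusion follows from the summability of $\sum_{q=m}^\infty(c_q)^2\,q!$, which is $\|U\|^2_{L^2(\mu)}$.

With this we may define two families of $L^2(\Omega)$ martingales.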

Corollary 3.9 Given $U,V\in L^2(\mu)$ such that their Hermite ranks $m_U$ and $m_V$ satisfy $H^*(m_U)<0$ and $H^*(m_V)<0$, the process $(M_k,\ k\ge1)$, where
\[
M_k:=\sum_{j=1}^k\Big(\hat U(j)-\mathbf{E}\big(\hat U(j)\,|\,\mathcal{F}_{j-1}\big)\Big),
\]
is an $\mathcal{F}_k$-adapted $L^2(\Omega)$ martingale with shift covariant martingale differences. The same holds for
\[
N_k:=\sum_{j=1}^k\Big(\hat V(j)-\mathbf{E}\big(\hat V(j)\,|\,\mathcal{F}_{j-1}\big)\Big).
\]
We can now formulate the main result of this subsection.

Proposition 3.10 Let $U,V,M$ and $N$ be as in Corollary 3.9. Then there exists a function $Er(\varepsilon)$ converging to zero in probability as $\varepsilon\to0$, such that
\[
\varepsilon\int_0^{t/\varepsilon}\!\!\int_0^s U(y_s)V(y_r)\,dr\,ds=\varepsilon\sum_{k=1}^{[t/\varepsilon]}(M_{k+1}-M_k)N_k+t\gamma+Er(\varepsilon), \tag{3.9}
\]
where
\[
\gamma=\int_0^\infty\mathbf{E}\big(U(y_s)V(y_0)\big)\,ds.
\]
The proof of this is given in the rest of the section. Afterwards we show that $\varepsilon\sum_{k=1}^{[t/\varepsilon]}(M_{k+1}-M_k)N_k$ converges to the relevant Itô integrals of the limits of $\sqrt{\varepsilon}\int_0^{[t/\varepsilon]}U(y_r)\,dr$ and $\sqrt{\varepsilon}\int_0^{[t/\varepsilon]}V(y_r)\,dr$.

Lemma 3.11 The stationary Ornstein-Uhlenbeck process is ergodic.

Proof. A stationary Gaussian process is ergodic if its spectral measure has no atom, see [CFS82, Sam06]. The spectral measure $F$ of a stationary Gaussian process is obtained from Fourier transforming its correlation function, and $\varrho(\lambda)=\int_{\mathbb{R}}e^{i\lambda x}\,dF(x)$. According to [CKM03]:
\[
\varrho(s)=\frac{\Gamma(2H+1)\sin(\pi H)}{2\pi}\int_{\mathbb{R}}e^{isx}\frac{|x|^{1-2H}}{1+x^2}\,dx, \tag{3.10}
\]
so the spectral measure is absolutely continuous with respect to the Lebesgue measure with spectral density $s(x)=c\,\frac{|x|^{1-2H}}{1+x^2}$.

For $k\in\mathbb{N}$, we define the $\mathcal{F}_k$-adapted processes:
\[
I(k)=\int_{k-1}^kU(y_s)\,ds=\Phi_U(k-1)-\Phi_U(k),\qquad
J(k)=\int_{k-1}^kV(y_s)\,ds=\Phi_V(k-1)-\Phi_V(k).
\]

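Remark 3.12 We note the following useful identities. For $k\in\mathbb{N}$,
\[
\hat U(k)=I(k)+\mathbf{E}\big[\hat U(k+1)\,|\,\mathcal{F}_k\big], \tag{3.11}
\]
\[
M_{k+1}-M_k=I(k)+\hat U(k+1)-\hat U(k), \tag{3.12}
\]
\[
\sum_{j=1}^{k-1}I(j)=\int_0^{k-1}U(y_r)\,dr=M_k-\hat U(k)+\hat U(1)-M_1,
\]
and similarly for $V$ and $N$. The last identity follows by summing (3.12) over $j=1,\dots,k-1$.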

Henceforth in this section we set $L=L(\varepsilon)=[\frac{t}{\varepsilon}]$.

Lemma 3.13 There exists a function $Er_1(\varepsilon)$, which converges to zero in probability as $\varepsilon\to0$, such that
\[
\varepsilon\int_0^{t/\varepsilon}\!\!\int_0^sU(y_s)V(y_r)\,dr\,ds=\varepsilon\sum_{k=1}^LI(k)\sum_{l=1}^{k-1}J(l)+t\int_0^1\!\!\int_0^s\mathbf{E}\big(U(y_s)V(y_r)\big)\,dr\,ds+Er_1(\varepsilon). \tag{3.13}
\]

Proof. Let us divide the integration region $0\le r\le s\le\frac{t}{\varepsilon}$ into several parts,
\[
\int_0^L\!\!\int_0^sU(y_s)V(y_r)\,dr\,ds+\int_L^{t/\varepsilon}\!\!\int_0^sU(y_s)V(y_r)\,dr\,ds.
\]
The second term is of order $o(\varepsilon)$ since $\big\|\int_L^{t/\varepsilon}U(y_s)\,ds\big\|_{L^2(\Omega)}$ is bounded by stationarity of $y_r$ and Theorem 2.7, see also [GL20]; furthermore, the term $\big\|\sqrt{\varepsilon}\int_0^{t/\varepsilon}V(y_r)\,dr\big\|_{L^2(\Omega)}$ is bounded by $\frac{t}{\sqrt{\varepsilon}}$. For the remaining part we compute
\[
\begin{aligned}
\int_0^L\!\!\int_0^sU(y_s)V(y_r)\,dr\,ds
&=\sum_{k=1}^L\int_{k-1}^kU(y_s)\Big(\int_0^{k-1}V(y_r)\,dr+\int_{k-1}^sV(y_r)\,dr\Big)\,ds\\
&=\sum_{k=1}^L\int_{k-1}^kU(y_s)\,ds\int_0^{k-1}V(y_r)\,dr+\sum_{k=1}^L\int_{\{k-1\le r\le s\le k\}}U(y_s)V(y_r)\,dr\,ds\\
&=\sum_{k=1}^LI(k)\sum_{l=1}^{k-1}J(l)+\sum_{k=1}^L\int_{\{k-1\le r\le s\le k\}}U(y_s)V(y_r)\,dr\,ds.
\end{aligned}
\]
The stochastic process $Z_k=\int_{\{k-1\le r\le s\le k\}}U(y_s)V(y_r)\,dr\,ds$ is shift invariant and the shift operator is ergodic with respect to the probability distribution on the path space generated by the fOU process. Hence, by Birkhoff's ergodic theorem,
\[
\frac1L\sum_{k=1}^LZ_k\ \xrightarrow{\ (\varepsilon\to0)\ }\ \mathbf{E}Z_1=\int_0^1\!\!\int_0^s\mathbf{E}\big(U(y_s)V(y_r)\big)\,dr\,ds.
\]
This completes the proof.

Lemma 3.14 The following converges in probability:

\[
\lim_{\varepsilon\to0}\Big(\varepsilon\sum_{k=1}^LI(k)\sum_{l=1}^{k-1}J(l)-\varepsilon\sum_{k=1}^L(M_{k+1}-M_k)N_k\Big)=t\int_1^\infty\!\!\int_0^1\mathbf{E}\big(U(y_s)V(y_r)\big)\,dr\,ds.
\]

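Proof. A. Following [BC17] and using the identities of Remark 3.12, we obtain:
\[
\begin{aligned}
&\sum_{k=1}^L\Big(I(k)\sum_{l=1}^{k-1}J(l)-(M_{k+1}-M_k)N_k\Big)\\
&\quad=\sum_{k=1}^L\Big[I(k)\big(N_k-\hat V(k)+\hat V(1)-N_1\big)-\big(I(k)+\hat U(k+1)-\hat U(k)\big)N_k\Big]\\
&\quad=-\sum_{k=1}^LI(k)\hat V(k)+\sum_{k=1}^LI(k)\big(\hat V(1)-N_1\big)-\sum_{k=1}^L\big(\hat U(k+1)-\hat U(k)\big)N_k.
\end{aligned}
\]
Firstly, by the shift invariance of the summands below and Birkhoff's ergodic theorem we obtain
\[
-\varepsilon\sum_{k=1}^LI(k)\hat V(k)\ \longrightarrow\ (-t)\,\mathbf{E}\big[I(1)\hat V(1)\big]=(-t)\,\mathbf{E}\Big(\int_0^1U(y_r)\,dr\int_0^\infty V(y_s)\,ds\Big). \tag{3.14}
\]
Next, since $\hat V(1)-N_1=\mathbf{E}[\hat V(1)\,|\,\mathcal{F}_0]$,
\[
\begin{aligned}
\mathbf{E}\Big(\varepsilon\sum_{k=1}^LI(k)\big(\hat V(1)-N_1\big)\Big)^2
&=\mathbf{E}\Big(\varepsilon\int_0^LU(y_r)\,dr\ \mathbf{E}[\hat V(1)\,|\,\mathcal{F}_0]\Big)^2\\
&\lesssim\varepsilon^2\,\mathbf{E}[\hat V(1)]^2\int_0^L\!\!\int_0^L\mathbf{E}\big[U(y_r)U(y_s)\big]\,ds\,dr,
\end{aligned}
\]
which by Lemma 2.4 and expanding into Hermite polynomials converges to $0$ as $\varepsilon\to0$.

B. It remains to discuss the convergence of
\[
\varepsilon\sum_{k=1}^L\big(\hat U(k+1)-\hat U(k)\big)N_k.
\]
We change the order of summation to obtain the following decomposition:
\[
\begin{aligned}
\sum_{k=1}^L\big(\hat U(k+1)-\hat U(k)\big)N_k
&=\sum_{k=1}^L\big(\hat U(k+1)-\hat U(k)\big)\Big(\sum_{j=1}^{k-1}(N_{j+1}-N_j)+N_1\Big)\\
&=\sum_{j=1}^{L-1}(N_{j+1}-N_j)\sum_{k=j+1}^L\big(\hat U(k+1)-\hat U(k)\big)+\sum_{k=1}^L\big(\hat U(k+1)-\hat U(k)\big)N_1\\
&=\sum_{j=1}^{L-1}(N_{j+1}-N_j)\,\hat U(L+1)-\sum_{j=1}^{L-1}(N_{j+1}-N_j)\,\hat U(j+1)+\big(\hat U(L+1)-\hat U(1)\big)N_1.
\end{aligned}
\]
We may now apply Birkhoff's ergodic theorem to the first term, taking $\varepsilon\to0$,
\[
\lim_{\varepsilon\to0}\varepsilon\sum_{j=1}^{L-1}(N_{j+1}-N_j)\,\hat U(L+1)=0,
\]
in probability. By the same ergodic theorem, the second term, multiplied by $\varepsilon$, converges to $-t\,\mathbf{E}\big[\hat U(2)(N_2-N_1)\big]$ in probability. By Proposition 3.8, $\hat U(j)$ is bounded in $L^2(\Omega)$; hence, for the third term we obtain
\[
\varepsilon\,\big\|\big(\hat U(L+1)-\hat U(1)\big)N_1\big\|_{L^2(\Omega)}\lesssim\varepsilon.
\]
Overall we end up with
\[
\lim_{\varepsilon\to0}(-\varepsilon)\sum_{k=1}^L\big(\hat U(k+1)-\hat U(k)\big)N_k=t\,\mathbf{E}\big[\hat U(2)(N_2-N_1)\big], \tag{3.15}
\]
where the convergence is in probability. Hence,
\[
\lim_{\varepsilon\to0}\Big(\varepsilon\sum_{k=1}^LI(k)\sum_{l=1}^{k-1}J(l)-\varepsilon\sum_{k=1}^L(M_{k+1}-M_k)N_k\Big)=t\,\mathbf{E}\Big[\hat U(2)(N_2-N_1)-I(1)\hat V(1)\Big]. \tag{3.16}
\]

C. We look for a better expression of the limit in (3.16). Firstly, by Corollary 3.9, we have
\[
N_2-N_1=\hat V(2)-\mathbf{E}\big(\hat V(2)\,|\,\mathcal{F}_1\big)=\hat V(2)-\hat V(1)+\int_0^1V(y_s)\,ds.
\]
Using this and $I(1)=\int_0^1U(y_s)\,ds=\hat U(1)-\mathbf{E}[\hat U(2)\,|\,\mathcal{F}_1]$, we compute
\[
\begin{aligned}
&\mathbf{E}\Big[\hat U(2)(N_2-N_1)-I(1)\hat V(1)\Big]\\
&\quad=\int_1^\infty\!\!\int_0^1\mathbf{E}\big(U(y_s)V(y_r)\big)\,dr\,ds+\mathbf{E}\Big[\hat U(2)\big(\hat V(2)-\hat V(1)\big)-\big(\hat U(1)-\mathbf{E}[\hat U(2)\,|\,\mathcal{F}_1]\big)\hat V(1)\Big].
\end{aligned}
\]
Since $\hat V(1)$ is $\mathcal{F}_1$ measurable,
\[
\mathbf{E}\Big[\hat U(2)\big(\hat V(2)-\hat V(1)\big)-\big(\hat U(1)-\mathbf{E}[\hat U(2)\,|\,\mathcal{F}_1]\big)\hat V(1)\Big]=\mathbf{E}\big[\hat U(2)\hat V(2)\big]-\mathbf{E}\big[\hat U(1)\hat V(1)\big]
\]
vanishes by the shift covariance of $\hat U(k)\hat V(k)$. This concludes the proof of the lemma.

Proof of Proposition 3.10. Combining (3.13), Lemma 3.14 and Lemma 3.15, we have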

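\[
\begin{aligned}
&\varepsilon\int_0^{t/\varepsilon}\!\!\int_0^sU(y_s)V(y_r)\,dr\,ds\\
&\quad=\varepsilon\sum_{k=1}^LI(k)\sum_{l=1}^{k-1}J(l)+t\int_0^1\!\!\int_0^s\mathbf{E}\big(U(y_s)V(y_r)\big)\,dr\,ds+Er_1(\varepsilon)\\
&\quad=\varepsilon\sum_{k=1}^L(M_{k+1}-M_k)N_k+t\int_1^\infty\!\!\int_0^1\mathbf{E}\big(U(y_s)V(y_r)\big)\,dr\,ds+t\int_0^1\!\!\int_0^s\mathbf{E}\big(U(y_s)V(y_r)\big)\,dr\,ds+Er_1(\varepsilon)+Er_2(\varepsilon)\\
&\quad=\varepsilon\sum_{k=1}^L(M_{k+1}-M_k)N_k+t\int_0^\infty\mathbf{E}\big(U(y_v)V(y_0)\big)\,dv+Er_1(\varepsilon)+Er_2(\varepsilon),
\end{aligned}
\]
where we used stationarity, $\mathbf{E}(U(y_s)V(y_r))=\mathbf{E}(U(y_{s-r})V(y_0))$, and the change of variables $u=s+r$, $v=s-r$, leading to the identity
\[
\Big(\int_0^1\!\!\int_0^s+\int_1^\infty\!\!\int_0^1\Big)\mathbf{E}\big(U(y_s)V(y_r)\big)\,dr\,ds=\frac12\int_0^\infty\!\!\int_v^{v+2}\mathbf{E}\big(U(y_v)V(y_0)\big)\,du\,dv=\int_0^\infty\mathbf{E}\big(U(y_v)V(y_0)\big)\,dv.
\]
This completes the proof of Proposition 3.10.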

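Proposition 3.15 Let $X^{W,\varepsilon}=(X^{1,\varepsilon},\dots,X^{n,\varepsilon})$. Then
\[
\big(X^{W,\varepsilon},\mathbb{X}^{W,\varepsilon}\big)=\Big(X^{W,\varepsilon},\int_0^tX^{W,\varepsilon}\,dX^{W,\varepsilon}\Big)\ \to\ \Big(X^W,\int_0^tX^W\,dX^W+tA\Big),
\]
jointly in finite dimensional distributions, where the integration is understood in the Itô sense and $A$ is as in Theorem B.

Proof. For each $X^k$ we first define the martingales $M^k$ as in Corollary 3.9, for $U=G_k$. Then, we define the càdlàg martingales as follows:
\[
M^{k,\varepsilon}_t=\sqrt{\varepsilon}\,M^k_{[t/\varepsilon]}.
\]
Using the identity (3.12) we obtain
\[
M^{k,\varepsilon}_t=\sqrt{\varepsilon}\sum_{q=1}^{[t/\varepsilon]}\big(M^k_{q+1}-M^k_q\big)+\sqrt{\varepsilon}\,M^k_1=\sqrt{\varepsilon}\int_0^{[t/\varepsilon]}G_k(y_r)\,dr+\sqrt{\varepsilon}\,\hat G_k\Big(\Big[\frac{t}{\varepsilon}\Big]\Big)-\sqrt{\varepsilon}\,\hat G_k(1)+\sqrt{\varepsilon}\,M^k_1.
\]
Since $\hat G_k$ is $L^2(\Omega)$ bounded, the joint convergence $(M^{1,\varepsilon}_t,\dots,M^{n,\varepsilon}_t)\to X^W$ in finite dimensional distributions follows from Theorem 2.7. Next, by Proposition 3.10,
\[
\Big(\int_0^tX^{W,\varepsilon}_s\,dX^{W,\varepsilon}_s\Big)^{i,j}=\varepsilon\sum_{q=1}^{[t/\varepsilon]}\big(M^j_{q+1}-M^j_q\big)M^i_q+tA^{i,j}+Er(\varepsilon)=\int_0^tM^{i,\varepsilon}_s\,dM^{j,\varepsilon}_s+tA^{i,j}+Er(\varepsilon),
\]
where the integration is understood in the Itô sense and $Er(\varepsilon)\to0$ in probability. The joint convergence of $\big(M^{k,\varepsilon},\int_0^tM^{i,\varepsilon}_s\,dM^{j,\varepsilon}_s\big)_{i,j,k\le n}$ in finite dimensional distributions follows, since for each $k$, $\mathbf{E}\big(M^{k,\varepsilon}_t\big)^2\lesssim t+o(\varepsilon)$, by an application of Theorem 2.2 in [KP91], which states that a sequence of jointly convergent martingales bounded in $L^2(\Omega)$ also converges jointly with its Itô integrals. This concludes the proof.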

We summarise this section with the following more general statement, which follows from the proofs above:

Remark 3.16 Let $y_t$ be a stationary and ergodic stochastic process with stationary measure $\pi$, $G_k\in L^2(\pi)$, and $X^{k,\varepsilon}_t=\sqrt{\varepsilon}\int_0^{t/\varepsilon}G_k(y_s)\,ds$ such that $X^\varepsilon=(X^{1,\varepsilon},\dots,X^{n,\varepsilon})$ converges, as $\varepsilon\to0$, to a Wiener process $X$. Suppose that $\big(\int_{k-1}^\infty\mathbf{E}(G_j(y_r)\,|\,\mathcal{F}_k)\,dr,\ k\ge1\big)$ is $L^2(\Omega)$ bounded. Then $\mathbf{X}^\varepsilon=(X^\varepsilon,\mathbb{X}^\varepsilon)$, the canonical rough path lift of $X^\varepsilon$, converges to $(X,\mathbb{X}+(t-s)A)$, the Stratonovich lift of $X$. (By this we mean that $\mathbb{X}^{i,j}_{0,t}=\int_0^tX^i_s\,dX^j_s$, where the integral is understood in the Itô sense and $A$ denotes the corresponding Stratonovich correction.)

3.4 Proof of Theorem B

Now we are ready to conclude the convergence of $\mathbf{X}^\varepsilon$ weakly in the rough path topology. We assume that $G_k\in L^{p_k}(\mu)$ satisfy the integrability conditions specified in Assumption 2.10. Fix $H\in(0,1)\setminus\{\frac12\}$ and a final time $T$. Let, for $t\in[0,T]$,
\[
X^\varepsilon_t:=\Big(\alpha_1(\varepsilon)\int_0^tG_1(y^\varepsilon_s)\,ds,\ \dots,\ \alpha_N(\varepsilon)\int_0^tG_N(y^\varepsilon_s)\,ds\Big),
\qquad
A^{i,j}=\begin{cases}\int_0^\infty\mathbf{E}\big(G_i(y_s)G_j(y_0)\big)\,ds, & \text{if } i,j\le n,\\[2pt] 0, & \text{otherwise.}\end{cases}
\]
By Theorem 2.7, $X^\varepsilon$ has a limit which we denote by $X$. Set further,
\[
\mathbb{X}^{i,j,\varepsilon}_{u,t}=\alpha_i(\varepsilon)\alpha_j(\varepsilon)\int_u^t\!\!\int_u^sG_i(y^\varepsilon_r)\,G_j(y^\varepsilon_s)\,dr\,ds,
\qquad
\mathbb{X}^{i,j}_{u,t}=\int_u^tX^i_{u,s}\,dX^j_s,
\]
where the second integral is to be understood in the Itô sense if two Wiener processes appear, and in the Young sense otherwise. We will prove below that, as $\varepsilon\to0$,
\[
\mathbf{X}^\varepsilon=(X^\varepsilon,\mathbb{X}^\varepsilon)\ \to\ \mathbf{X}=(X,\mathbb{X}+A(t-s)),
\]
weakly in $\mathscr{C}^\gamma([0,T],\mathbb{R}^N)$ for $\gamma\in\big(\frac13,\ \frac12-\frac{1}{\min_{k\le n}p_k}\big)$.

Proof. Firstly, we recall that by Theorem 2.7 the basic processes converge jointly and the limiting Wiener process, $X^W$, is independent of the limiting Hermite process, $X^Z$. In Proposition 3.15 we showed that the high Hermite rank components, $X^{W,\varepsilon}$, converge jointly with their iterated integrals $\int_0^tX^{W,\varepsilon}\,dX^{W,\varepsilon}$. In Lemma 3.3 we proved that the first order processes together with the lifts for which $i\vee j>n$ converge jointly, and in particular these lifts are continuous functionals of $(X^{W,\varepsilon},X^{Z,\varepsilon})$. By the continuous dependence of $\int_0^tX^{i,\varepsilon}\,dX^{j,\varepsilon}$, where $i\vee j>n$, we may leave out these iterated integrals; it is sufficient to show joint convergence of $\big(X^{W,\varepsilon},\int_0^tX^{W,\varepsilon}\,dX^{W,\varepsilon},X^{Z,\varepsilon}\big)$. This vector is tight in $\mathscr{C}^\gamma\times\mathcal{C}^{\frac12}$ by Theorem 2.7, Proposition 3.2, and an application of Kolmogorov's theorem (see also Lemma 5.6 below); hence we may choose a converging subsequence. We now want to identify the limiting distribution. As we have already established convergence of the marginals $\big(X^{W,\varepsilon},\int_0^tX^{W,\varepsilon}\,dX^{W,\varepsilon}\big)$ and $X^{Z,\varepsilon}$, by independence of $X^W$ and $X^Z$ their limiting distribution is just given by the product measure between $\big(X^W,\int_0^tX^W\,dX^W\big)$ and $X^Z$. The choice of subsequence was arbitrary, hence each subsequence has the same limit and the whole sequence converges. This concludes the proof.

3.5 Proof of the conditional integrability of fOU

The aim of this section is to prove that $\sup_k\int_{k-1}^\infty\int_{k-1}^\infty\big(\mathbf{E}[\bar y^k_s\,\bar y^k_r]\big)^m\,dr\,ds$ is indeed finite, for which we restrict ourselves to the case $H\in(0,1)\setminus\{\frac12\}$ and $H^*(m)<0$. We first compute the conditional expectations $\mathbf{E}(G(y_t)\,|\,\mathcal{F}_k)$ where $G\in L^2(\mu)$.

In [Hai05b] it was shown that a fractional Brownian motion has a locally independent decomposition: for any k

Furthermore, it was shown that the filtration generated by the fractional Brownian motion is the same as the one generated by the two-sided Wiener process $W_t$. We now prove such a decomposition for the fractional Ornstein-Uhlenbeck process.

Lemma 3.17 Let $\mathcal{F}_s$ be the filtration generated by $B^H$. For k

where the first term $\bar y^k_t$ is $\mathcal{F}_k$ measurable and $\tilde y^k_t$ is independent of $\mathcal{F}_k$.

Lemma 3.18 Let $\tau>-1$. For any k

Proof. We may assume that $s>4k+4$; otherwise the integral is finite, as the exponential term can be estimated by $1$ and, as $\tau>-1$, the singularity is integrable. Splitting the integral into two regions $\int_k^{k+1}+\int_{k+1}^s$, we have $\int_k^{k+1}e^{-(s-v)}(v-k)^\tau\,dv\lesssim e^{-(s-k-1)}$, and furthermore, using integration by parts,
\[
\begin{aligned}
\int_{k+1}^se^{-(s-v)}(v-k)^\tau\,dv
&=(s-k)^\tau-e^{-(s-k-1)}-\tau\Big(\int_{k+1}^{\frac s2}+\int_{\frac s2}^s\Big)e^{-(s-v)}(v-k)^{\tau-1}\,dv\\
&\lesssim(s-k)^\tau-e^{-(s-k)}-e^{-\frac s2}\,(v-k)^\tau\Big|_{k+1}^{\frac s2}-(v-k)^\tau\Big|_{\frac s2}^{s}\\
&\lesssim(s-k)^\tau+e^{-\frac s2}+\Big(\frac s2-k\Big)^\tau\lesssim(s-k)^\tau.
\end{aligned}
\]
This gives the required estimate.

Lemma 3.19 For $t\ge k-1$ the following estimate holds: $\|\bar y^k_t\|_{L^2(\Omega)}\lesssim1\wedge|t-k|^{H-1}$.

Proof. Firstly, as $\|y_t\|_{L^2(\Omega)}=1$ we also obtain $\|\bar y^k_t\|_{L^2(\Omega)}\le1$. Thus, it is only left to consider the behaviour when $t$ becomes large. Using the above lemma we obtain
\[
\begin{aligned}
\|\bar y^k_t\|_{L^2(\Omega)}
&=\Big\|e^{-(t-k)}y_k+\int_k^te^{-(t-s)}\,d\bar B^k_s\Big\|_{L^2(\Omega)}
\le\big\|e^{-(t-k)}y_k\big\|_{L^2(\Omega)}+\int_k^te^{-(t-s)}\big\|\dot{\bar B}^k_s\big\|_{L^2(\Omega)}\,ds\\
&\le e^{-(t-k)}+\int_k^te^{-(t-s)}|s-k|^{H-1}\,ds\lesssim|t-k|^{H-1}.
\end{aligned}
\]

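Proposition 3.20 Let $H\in(0,1)\setminus\{\frac12\}$ and suppose that $H^*(m)<0$. Then,
\[
\sup_k\int_{k-1}^\infty\!\!\int_{k-1}^\infty\big(\mathbf{E}[\bar y^k_s\,\bar y^k_t]\big)^m\,dt\,ds<\infty.
\]

Proof. As
\[
\int_{k-1}^\infty\!\!\int_{k-1}^\infty\big(\mathbf{E}[\bar y^k_s\,\bar y^k_t]\big)^m\,dt\,ds\le\int_{k-1}^\infty\!\!\int_{k-1}^\infty\|\bar y^k_s\|^m_{L^2(\Omega)}\,\|\bar y^k_t\|^m_{L^2(\Omega)}\,dt\,ds=\Big(\int_{k-1}^\infty\|\bar y^k_t\|^m_{L^2(\Omega)}\,dt\Big)^2,
\]
it is sufficient to show finiteness of $\int_{k-1}^\infty\big(\mathbf{E}(\bar y^k_t)^2\big)^{\frac m2}\,dt$. By Lemma 3.19 we obtain
\[
\int_{k-1}^\infty\big(\mathbf{E}(\bar y^k_t)^2\big)^{\frac m2}\,dt\lesssim\int_{k-1}^\infty1\wedge|t-k|^{m(H-1)}\,dt.
\]
This expression is finite if $m(H-1)<-1$. As $H^*(m)=m(H-1)+1<0$, this concludes the proof.

4 Multi-scale homogenisation theorem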

For Hermite polynomials we have the hypercontractivity estimate:

\[
\|H_k\|_{L^{2q}(\mu)}\le(2q-1)^{\frac k2}\sqrt{\mathbf{E}(H_k)^2}=(2q-1)^{\frac k2}\sqrt{k!}.
\]
Consequently, if an $L^2(\mu)$ function $G=\sum_{l=0}^\infty c_lH_l$ satisfies the fast chaos decay condition with parameter $q$, then $\|G\|_{L^{2q}(\mu)}\le\sum_{l=0}^\infty|c_l|\,\|H_l\|_{L^{2q}(\mu)}<\infty$. Here we used $\sum_{l=0}^\infty|c_l|\sqrt{l!}\,(2q-1)^{\frac l2}<\infty$. Thus, $G\in L^{2q}(\mu)$. Observe that $\frac12-\frac{1}{2q}>\frac13$, a condition needed for the convergence in $\mathscr{C}^\gamma$, is equivalent to the condition $q>3$. Also, if $G$ satisfies the decay condition with $q>1$, then $G$ has a continuous representation. Indeed, we have
\[
\Big\|e^{-\frac{x^2}{2}}H_k(x)\Big\|_\infty\le1.0865\sqrt{k!},
\]
see [AS84, p. 787]; the polynomials in [AS84] are orthogonal with respect to $e^{-x^2}dx$, so one should take care with the convention. Thus the series $e^{-\frac{x^2}{2}}\sum_{l=0}^\infty c_lH_l$ converges uniformly in $x$, and hence the limit $G$ is continuous.

Remark 4.1 If $G$ satisfies the fast chaos decay condition with parameter $q>1$, then $G$ has a representation in $L^{2q}\cap\mathcal{C}$, with which we will work from here on.
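The hypercontractivity estimate used above can be verified exactly for small indices. The following Python sketch is an illustration only: it assumes that $H_k$ denotes the probabilists' Hermite polynomial and that $\mu$ is the standard Gaussian measure, and it compares $\mathbf{E}[H_k(Z)^{2q}]$ with the $2q$-th power of the bound, $(2q-1)^{kq}(k!)^q$, in integer arithmetic. The helper names are hypothetical.

```python
from math import factorial


def hermite_coeffs(k):
    """Integer coefficients (in powers of x) of the probabilists' Hermite polynomial."""
    prev, cur = [1], [0, 1]
    if k == 0:
        return prev
    for n in range(1, k):
        nxt = [0] + cur                  # x * He_n
        for p, coef in enumerate(prev):  # minus n * He_{n-1}
            nxt[p] -= n * coef
        prev, cur = cur, nxt
    return cur


def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def gaussian_moment(n):
    """E[Z^n] for Z ~ N(0, 1): zero for odd n, (n - 1)!! for even n."""
    if n % 2:
        return 0
    value = 1
    for odd in range(1, n, 2):
        value *= odd
    return value


def hermite_moment(k, power):
    """E[He_k(Z)^power], computed exactly."""
    poly = [1]
    base = hermite_coeffs(k)
    for _ in range(power):
        poly = poly_mul(poly, base)
    return sum(c * gaussian_moment(n) for n, c in enumerate(poly))


def hypercontractive_bound(k, q):
    """The 2q-th power of (2q - 1)^(k/2) * sqrt(k!), an exact integer."""
    return (2 * q - 1) ** (k * q) * factorial(k) ** q


if __name__ == "__main__":
    for k in range(5):
        for q in (1, 2, 3):
            print(k, q, hermite_moment(k, 2 * q), hypercontractive_bound(k, q))
```

For $q=1$ the two columns coincide, which is the normalisation $\mathbf{E}(H_k)^2=k!$; for larger $q$ the exact moments stay below the bound in the cases tried.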

We reformulate the main theorem here, where it is proved.

Theorem 4.2 Let $H\in(0,1)\setminus\{\frac12\}$, $f_k\in\mathcal{C}^3_b(\mathbb{R}^d,\mathbb{R}^d)$, and let $G_k$ satisfy Assumption 2.10. Set $f=(f_1,\dots,f_N)$. Then the following statements hold.

1. The solutions $x^\varepsilon_t$ of (1.2) converge weakly in $\mathcal{C}^\gamma$ on any finite interval and for any $\gamma\in\big(\frac13,\ \frac12-\frac{1}{\min_{k\le n}p_k}\big)$.

2. The limit solves the rough differential equation
\[
dx_t=f(x_t)\,d\mathbf{X}_t,\qquad x_0=x_0. \tag{4.1}
\]
Here $\mathbf{X}=(X,\mathbb{X}_{s,t}+(t-s)A)$ is a rough path over $\mathbb{R}^N$, as specified in Theorem B.

3. Equation (4.1) is equivalent to the stochastic equation below:
\[
dx_t=\sum_{k=1}^nf_k(x_t)\circ dX^k_t+\sum_{l=n+1}^Nf_l(x_t)\,dX^l_t,\qquad x_0=x_0,
\]
where $(X^1,\dots,X^n)$ and $(X^{n+1},\dots,X^N)$ are independent, $\circ$ denotes the Stratonovich integral, and the other integrals are Young integrals.

Proof. We want to formulate our slow/fast random differential equation as a family of rough differential equations such that the drivers converge in the rough path topology. Using the continuity of the solution map, we obtain weak convergence of the solutions to a rough differential equation. Results from rough path theory then relate this rough differential equation to the usual Stratonovich/Young equations; this is explained in §5.1.2, where we introduce the notation for rough differential equations, see also [FV10, FH14, LCL07]. We define $F:\mathbb{R}^d\to\mathbb{L}(\mathbb{R}^N,\mathbb{R}^d)$ as follows:
\[
F(x)(u_1,\dots,u_N)=\sum_{k=1}^Nu_kf_k(x).
\]
If we further set $G^\varepsilon=\big(\alpha_1(\varepsilon)G_1,\dots,\alpha_N(\varepsilon)G_N\big)$, we may then write our slow equation as
\[
\dot x^\varepsilon_t=F(x^\varepsilon_t)\,G^\varepsilon(y^\varepsilon_t).
\]
For the rough path $\mathbf{X}^\varepsilon=(X^\varepsilon,\mathbb{X}^\varepsilon)$ defined by (2.10, 2.11), we may rewrite equation (1.2) as a rough differential equation with respect to $\mathbf{X}^\varepsilon$:
\[
dx^\varepsilon_t=F(x^\varepsilon_t)\,d\mathbf{X}^\varepsilon(t).
\]
By Theorem B, $\mathbf{X}^\varepsilon$ converges to $\mathbf{X}=(X,\mathbb{X}+(t-s)A)$, with covariance as specified in Theorem 2.7 and Theorem B, in $\mathscr{C}^\gamma$ where $\gamma\in\big(\frac13,\ \frac12-\frac{1}{\min_{k\le n}p_k}\big)$, on every finite interval. Since $\gamma>\frac13$ by Assumption 2.10, we may apply the continuity theorem for rough differential equations, Theorem 5.3, to conclude that the solutions converge to the solution of the rough differential equation
\[
dx_t=F(x_t)\,d\mathbf{X}_t.
\]
Since $F$ belongs to $\mathcal{C}^3_b$, this is well posed as a rough differential equation. This completes the proof of the convergence. To show the independence of $X^k$ for $k\le n$ from the other processes, we observe that by Assumption 2.10 the terms of $\mathbb{X}^{i,j}$ for which $i>n$ or $j>n$ do not contribute in the limit; hence, we conclude the proof by Theorem B.

5 Appendix

The purpose of the appendix is to explain the notation we used from rough path theory. We include the theorems needed for proving the tightness theorem and the homogenisation theorem. Finally we explain how to interpret the effective rough differential equations (1.1) with Itô integrals and Young integrals, and hope this self-contained material will be useful for those not familiar with the rough path theory.

5.1 Some rough path theory

If $X$ and $Y$ are Hölder continuous functions on $[0,T]$ with exponents $\alpha$ and $\beta$ respectively, such that $\alpha+\beta>1$, the Young integration theory enables us to define $\int_0^TY\,dX$ via limits of Riemann sums $\sum_{[u,v]\in\mathcal{P}}Y_u(X_v-X_u)$, where $\mathcal{P}$ denotes a partition of $[0,T]$. Furthermore, $(X,Y)\mapsto\int_0^TY\,dX$ is a continuous map. Thus, for $X\in\mathcal{C}^{\frac12+}$, one can make sense of a solution $Y$ to the Young integral equation $dY_s=f(Y_s)\,dX_s$, given enough regularity on $f$. If $f\in\mathcal{C}^2_b$, the solution is continuous with respect to both the driver $X$ and the initial data, see [You36]. In the case of $X$ having Hölder continuity less than or equal to $\frac12$, this fails and one can no longer define a pathwise integral $\int X\,dX$ by the above Riemann sum. Rough path theory provides a machinery to treat less regular functions by enhancing the process with a second order process, giving a better local approximation, which can then be used to enhance the Riemann sum and show that it converges.

If $X_s$ is a Brownian motion and one takes a dyadic approximation, then the usual Riemann sum converges in probability to the Itô integral. The enhanced Riemann sum, however, provides a better approximation and defines a pathwise integral agreeing with the Itô integral provided the integrand belongs to both domains of integration. The two domains of integration are quite different: the first uses an additional adaptedness condition and requires arguably less regularity than the second.

We restrict ourselves to the case where $X_t$ is a continuous path over $[0,T]$ which takes values in $\mathbb{R}^d$. A rough path of regularity $\alpha\in(\frac13,\frac12)$ is a pair of processes $\mathbf{X}=(X_t,\mathbb{X}_{s,t})$, where $(\mathbb{X}_{s,t})\in\mathbb{R}^{d\times d}$ is a two parameter stochastic process satisfying the following algebraic condition: for $0\le s$
\[
\mathbb{X}_{s,t}-\mathbb{X}_{s,u}-\mathbb{X}_{u,t}=X_{s,u}\otimes X_{u,t},\qquad\text{(Chen's relation)}
\]
where $X_{s,t}=X_t-X_s$ and $(X_{s,u}\otimes X_{u,t})^{i,j}=X^i_{s,u}X^j_{u,t}$, as well as the following analytic conditions,
\[
\|X_{s,t}\|\lesssim|t-s|^\alpha,\qquad\|\mathbb{X}_{s,t}\|\lesssim|t-s|^{2\alpha}. \tag{5.1}
\]
The set of such paths will be denoted by $\mathscr{C}^\alpha([0,T];\mathbb{R}^d)$. The so-called second order process $\mathbb{X}_{s,t}$ can be viewed as a possible candidate for the iterated integral $\int_s^tX_{s,u}\,dX_u$.

Remark 5.1 Using Chen's relation for $s=0$ one obtains

\[
\mathbb{X}_{u,t}=\mathbb{X}_{0,t}-\mathbb{X}_{0,u}-X_{0,u}\otimes X_{u,t},
\]
thus one can reconstruct $\mathbf{X}$ by knowing the path $t\mapsto(X_{0,t},\mathbb{X}_{0,t})$.

Given a path $X$ which is regular enough to define its iterated integral, for example $X\in\mathcal{C}^1([0,T];\mathbb{R}^d)$, we define its natural rough path lift to be given by
\[
\mathbb{X}_{s,t}:=\int_s^tX_{s,u}\,dX_u.
\]
It is now an easy exercise to verify that $\mathbf{X}=(X,\mathbb{X})$ satisfies the algebraic and analytic conditions (depending on the regularity of $X$), by which we mean Chen's relation and (5.1). Note that given any function $F\in\mathcal{C}^{2\alpha}(\mathbb{R}^{d\times d})$, setting $\tilde{\mathbb{X}}_{s,t}=\mathbb{X}_{s,t}+F_t-F_s$, $\tilde{\mathbb{X}}$ would also be a possible choice for the rough path lift. Given two rough paths $\mathbf{X}$ and $\mathbf{Y}$ we may define, for $\alpha\in(\frac13,\frac12)$, the distance
\[
\varrho_\alpha(\mathbf{X},\mathbf{Y})=\sup_{s\ne t}\frac{\|X_{s,t}-Y_{s,t}\|}{|t-s|^\alpha}+\sup_{s\ne t}\frac{\|\mathbb{X}_{s,t}-\mathbb{Y}_{s,t}\|}{|t-s|^{2\alpha}}. \tag{5.2}
\]
This defines a complete metric on $\mathscr{C}^\alpha([0,T];\mathbb{R}^d)$, called the inhomogeneous $\alpha$-Hölder rough path metric. We are also going to make use of the norm-like object

\[
\|\mathbf{X}\|_\alpha=\sup_{s\ne t\in[0,T]}\frac{\|X_{s,t}\|}{|t-s|^\alpha}+\sup_{s\ne t\in[0,T]}\frac{\|\mathbb{X}_{s,t}\|^{\frac12}}{|t-s|^\alpha}, \tag{5.3}
\]
where we denote for any two parameter process $\mathbb{X}$ a semi-norm:
\[
\|\mathbb{X}\|_{2\alpha}:=\sup_{s\ne t\in[0,T]}\frac{\|\mathbb{X}_{s,t}\|}{|t-s|^{2\alpha}}.
\]
Given a path $X$, as the second order process $\mathbb{X}$ takes the role of an iterated integral, another sensible condition to impose is the chain rule (or integration by parts formula), leading to the following definition.

Definition 5.2 A rough path $\mathbf{X}$ satisfying the condition
\[
\mathrm{Sym}(\mathbb{X}_{s,t})^{i,j}=\frac12\big(\mathbb{X}^{i,j}_{s,t}+\mathbb{X}^{j,i}_{s,t}\big)=\frac12\,X^i_{s,t}\,X^j_{s,t} \tag{5.4}
\]
is called a geometric rough path. The space of all geometric rough paths of regularity $\alpha$ is denoted by $\mathscr{C}^\alpha_g([0,T];\mathbb{R}^d)$ and forms a closed subspace of $\mathscr{C}^\alpha([0,T];\mathbb{R}^d)$.
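Chen's relation and the symmetry condition (5.4) can be checked by hand for the natural lift of a piecewise linear path, where the iterated integral is explicit on each segment. The following Python sketch is an illustration only; the grid-index convention and the function names (`increments`, `iterated_integral`, `chen_defect`, `symmetry_defect`) are hypothetical choices made for this example.

```python
def increments(path, a, b):
    """Increment X_b - X_a of a path given as a list of points in R^d."""
    return [xb - xa for xa, xb in zip(path[a], path[b])]


def iterated_integral(path, a, b):
    """Second order process of a piecewise linear path between grid indices a <= b.

    Entry (i, j) is the integral over [a, b] of (X^i_r - X^i_a) dX^j_r.
    """
    d = len(path[0])
    lift = [[0.0] * d for _ in range(d)]
    for k in range(a, b):
        start = increments(path, a, k)     # increment from a to the segment start
        step = increments(path, k, k + 1)  # increment over the k-th segment
        for i in range(d):
            for j in range(d):
                # On a linear segment the integral equals
                # start^i * step^j + step^i * step^j / 2.
                lift[i][j] += start[i] * step[j] + 0.5 * step[i] * step[j]
    return lift


def chen_defect(path, s, u, t):
    """Largest absolute entry of X2_{s,t} - X2_{s,u} - X2_{u,t} - X_{s,u} (x) X_{u,t}."""
    st, su, ut = (iterated_integral(path, s, t), iterated_integral(path, s, u),
                  iterated_integral(path, u, t))
    x_su, x_ut = increments(path, s, u), increments(path, u, t)
    d = len(path[0])
    return max(abs(st[i][j] - su[i][j] - ut[i][j] - x_su[i] * x_ut[j])
               for i in range(d) for j in range(d))


def symmetry_defect(path, s, t):
    """Largest absolute deviation from the geometric condition (5.4)."""
    lift, x_st = iterated_integral(path, s, t), increments(path, s, t)
    d = len(path[0])
    return max(abs(0.5 * (lift[i][j] + lift[j][i]) - 0.5 * x_st[i] * x_st[j])
               for i in range(d) for j in range(d))


if __name__ == "__main__":
    path = [[0.0, 0.0], [1.0, 0.5], [1.5, 2.0], [0.5, 2.5], [-1.0, 1.0]]
    print(chen_defect(path, 0, 2, 4), symmetry_defect(path, 0, 4))
```

Both defects should vanish up to floating point rounding, which reflects the fact that the natural lift of a smooth (here piecewise linear) path is a geometric rough path.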

Furthermore, one can show that if a sequence of $\mathcal{C}^1([0,T],\mathbb{R}^d)$ paths $X^n$ converges in the rough path metric to $\mathbf{X}$, then $\mathbf{X}$ is a geometric rough path. To obtain a geometric rough path from a Wiener process, as $\int_0^tW_s\circ dW_s=\frac{W_t^2}{2}$, one has to enhance it with its Stratonovich integral, $\mathbb{W}_{s,t}=\int_s^t(W_r-W_s)\circ dW_r$, up to an antisymmetric part.

Given a rough path $\mathbf{X}\in\mathscr{C}^\alpha([0,T],\mathbb{R}^d)$, we may define the integral $\int_0^TY\,d\mathbf{X}$ for suitable paths $Y\in\mathcal{C}^\alpha([0,T],\mathbb{L}(\mathbb{R}^d,\mathbb{R}^m))$ which admit a Gubinelli derivative $Y'\in\mathcal{C}^\alpha([0,T],\mathbb{L}(\mathbb{R}^{d\times d},\mathbb{R}^m))$ with respect to $X$, meaning $Y_{s,t}=Y'_sX_{s,t}+R_{s,t}$, where the two parameter function $R$ satisfies $\|R\|_{2\alpha}<\infty$. The pair $\mathbf{Y}:=(Y,Y')$ is said to be a controlled rough path; their collection is denoted by $\mathcal{D}^{2\alpha}_X$. For $Y=f(X)$ with $f$ smooth, the remainder $R$ is the remainder term in the Taylor expansion. The integral is defined by showing that the enhanced Riemann sums $\sum_{[s,t]\in\mathcal{P}}Y_sX_{s,t}+Y'_s\mathbb{X}_{s,t}$ converge as the partition size goes to zero; the limit is defined to be $\int\mathbf{Y}\,d\mathbf{X}$. Given $\mathbf{Y}\in\mathcal{D}^{2\alpha}_X$, we have $\big(\int\mathbf{Y}\,d\mathbf{X},Y\big)\in\mathcal{D}^{2\alpha}_X$, and the map $(\mathbf{X},\mathbf{Y})\mapsto\big(\int\mathbf{Y}\,d\mathbf{X},Y\big)$ is continuous with respect to $\mathbf{X}\in\mathscr{C}^\alpha$ and $\mathbf{Y}\in\mathcal{D}^{2\alpha}_X$. With this theory of integration one can study the equation
\[
dY=f(Y)\,d\mathbf{X}.
\]
However, unlike in the theory of stochastic differential equations, one now has continuous dependence on the noise $\mathbf{X}$. We now state the precise theorem for our application, see also [Lyo94].

Theorem 5.3 [FH14] Let $Y_0\in\mathbb{R}^m$, $\beta\in(\frac13,1)$, $f\in\mathcal{C}^3_b(\mathbb{R}^m,\mathbb{L}(\mathbb{R}^d,\mathbb{R}^m))$ and $\mathbf{X}\in\mathscr{C}^\beta([0,T],\mathbb{R}^d)$. Then, the differential equation
\[
Y_t=Y_0+\int_0^tf(Y_s)\,d\mathbf{X}_s \tag{5.5}
\]
has a unique solution which belongs to $\mathcal{C}^\beta$. Furthermore, the solution map $\Phi_f:\mathbb{R}^m\times\mathscr{C}^\beta([0,T],\mathbb{R}^d)\to\mathcal{D}^{2\beta}_X([0,T],\mathbb{R}^m)$, where the first component is the initial condition and the second component the driver, is continuous.

As continuous maps preserve weak convergence, to show weak convergence of solutions to rough differential equations $dY^\varepsilon=f(Y^\varepsilon)\,d\mathbf{X}^\varepsilon$ it is enough to establish weak convergence of the rough paths $\mathbf{X}^\varepsilon$ in the topology defined by the rough path metric. Convergence in this topology follows from the convergence of the finite dimensional distributions of the rough paths $\mathbf{X}^\varepsilon$ together with tightness in the space of rough paths with respect to that topology.

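5.1.1 Tightness of rough paths

The following lemma can be obtained via an Arzelà-Ascoli argument; for details see [FH14, FV10].

Lemma 5.4 Let $\mathbf{0}$ denote the rough path obtained from the $0$ path enhanced with a $0$ second order process. Then, for $\gamma>\gamma'>\frac13$, the sets $\{\mathbf{X}\in\mathscr{C}^{\gamma'}:\ \varrho_\gamma(\mathbf{X},\mathbf{0})<R,\ X(0)=0\}$ are compact in $\mathscr{C}^{\gamma'}$.

Lemma 5.5 Let $\theta\in(0,1)$, $\gamma\in(\frac13,\theta-\frac1p)$ and $\mathbf{X}^\varepsilon=(X^\varepsilon,\mathbb{X}^\varepsilon)$ be such that
\[
\|X^\varepsilon_{s,t}\|_{L^p(\Omega)}\lesssim|t-s|^\theta,\qquad\|\mathbb{X}^\varepsilon_{s,t}\|_{L^{\frac p2}(\Omega)}\lesssim|t-s|^{2\theta}.
\]
Then,
\[
\sup_{\varepsilon\in(0,1]}\mathbf{E}\big(\|\mathbf{X}^\varepsilon\|^p_\gamma\big)<\infty.
\]

Proof. The proof is based on a Besov-Hölder embedding; for details we refer to [FV10, CFK+19].

Lemma 5.6 Let $\mathbf{X}^\varepsilon$ be a sequence of rough paths and $\gamma\in(\frac13,\frac12-\frac1p)$, such that $X^\varepsilon(0)=0$ and
\[
\sup_{\varepsilon\in(0,1]}\mathbf{E}\big(\|\mathbf{X}^\varepsilon\|^p_\gamma\big)<\infty.
\]
Then $\mathbf{X}^\varepsilon$ is tight in $\mathscr{C}^{\gamma'}$ for every $\frac13<\gamma'<\gamma$.

Proof. Choose $\alpha\in(\gamma',\gamma)$. As $\varrho_\alpha(\mathbf{X},\mathbf{0})\le\|\mathbf{X}\|_\alpha+\|\mathbf{X}\|^2_\alpha$, we obtain
\[
\mathbf{P}\big(\varrho_\alpha(\mathbf{X}^\varepsilon,\mathbf{0})>R\big)\le\frac{\mathbf{E}\big(\varrho_\alpha(\mathbf{X}^\varepsilon,\mathbf{0})\big)^{\frac p2}}{R^{\frac p2}}\le\frac{\mathbf{E}\big(\|\mathbf{X}^\varepsilon\|_\alpha+\|\mathbf{X}^\varepsilon\|^2_\alpha\big)^{\frac p2}}{R^{\frac p2}}\lesssim\frac{C}{R^{\frac p2}}.
\]
This proves the claim by Lemma 5.4.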

5.1.2 Interpreting the effective dynamics by classical equations We now explain what the limiting equation means in the classical sense. Our set up is the following.

W Z W Assumption 5.7 Let Xt = (Xt ,Xt ), where Xt is a n-dimensional possibly correlated Wiener process and XZ a N n-dimensional Hermite process. The two components XW and XZ are independent, we set t − t t Cov(XW ) 0 2A := . 0 0  

28 We write Ai,j for the components of A. We are concerned with the classical interpretation of the rough X d L N d 3 X X differential equation x˙ t = F (xt)d t, where F : R (R , R ) is a b map, = (X, + (t s)A) X Xi,j Xi,j t i j → C − and = ( ) is given by 0,t = 0 Xs dXs interpreted as Itô integrals if i, j n, otherwise as Young integrals. ≤ R We show that the rough differential equation (4.1) is really the same as the equations given in part 3 of Theorem 4.2. Without loss of generality, we will assume our solution is defined on the interval [0, 1]. According to Theorem 8.4 in [FH14], see also [Lyo94, FV10], there exists a unique solution to our rough 2α d 1 differential equation in the controlled rough path space DX ([0, 1]; R ), where α > 3 . The solution exists 1 global in time and the full controlled process is given by (xs, F (xs)), which means xs,t = F (xs)Xs,t + Rs,t, 1 2α d where R 2α < . By Lemma 7.3 in [FH14] given a controlled rough path (Y, Y ′) DX ([0, 1]; R ) and k k 2∞ 2α ∈ d a function ϕ b , then (ϕ(Y ), ϕ(Y )′) is also a controlled rough path in DX ([0, 1]; R ), where ϕ(Y )′ = ∈ C 3 2α d Dϕ(Y )Y ′. In our case F , thus, (F (xs), DF (xs)F (xs)) D ([0, 1]; R ) and ∈Cb ∈ X t xt x = F (xs)dXs − 0 Z0 = x0 + lim F (xu)Xu,v + DF (xu)F (xu)Xu,v + DF (xu)F (xu)(v u)A. 0 − |P|→ [u,v] X∈P In components these are just, for l =1,...,d,

\[
x^l_t = x^l_0 + \lim_{|\mathcal{P}|\to 0} \sum_{[u,v]\in\mathcal{P}} \Big( \sum_{k=1}^{N} F^l_k(x_u)X^k_{u,v}
+ \sum_{l'=1}^{d}\sum_{i,j=1}^{N} \big( DF^{l,l',i}(x_u)F^{l'}_j(x_u)\mathbb{X}^{i,j}_{u,v} + DF^{l,l',i}(x_u)F^{l'}_j(x_u)(v-u)A^{i,j} \big) \Big).
\]

By Assumption 2.10 (2) the terms containing $\mathbb{X}^{i,j}$, where $i \vee j > n$, do not contribute to the limit, hence we may neglect them, see also Lemma 4.2 in [FH14]. We will drop these terms and use that $A^{i,j} = 0$ whenever $i \vee j > n$. Let

\[
I_1(\mathcal{P}) = \sum_{[u,v]\in\mathcal{P}} \Big( \sum_{k=1}^{n} F^l_k(x_u)X^k_{u,v}
+ \sum_{l'=1}^{d}\sum_{i,j=1}^{n} \big( DF^{l,l',i}(x_u)F^{l'}_j(x_u)\mathbb{X}^{i,j}_{u,v} + DF^{l,l',i}(x_u)F^{l'}_j(x_u)(v-u)A^{i,j} \big) \Big),
\]

\[
I_2(\mathcal{P}) = \sum_{[u,v]\in\mathcal{P}} \sum_{k=n+1}^{N} F^l_k(x_u)X^k_{u,v}.
\]

Now $I_2(\mathcal{P})$ gives rise to the classical Young integrals $\int F^l_k(x_r)\,dX^k_r$. For $I_1$ we write $X^W = (X^1,\dots,X^n)$ as a linear combination of a standard $n$-dimensional Wiener process $W$. Let $U$ be given such that $U^T U = 2A$, so that $X^W = UW$. Then $\mathbb{X}^{i,j}_{u,v} = 2A^{i,j}\,\mathbb{W}^{i,j}_{u,v}$, where $\mathbb{W}^{i,j}$ denotes the Itô lift of $W$ (that is, $\mathbb{W}^{i,j}_{u,v} = \int_u^v W^i_{u,r}\,dW^j_r$). We obtain

\[
I_1(\mathcal{P}) = \sum_{[u,v]\in\mathcal{P}} \Big( \sum_{k=1}^{n} F^l_k(x_u) \sum_{q=1}^{n} U^{k,q}W^q_{u,v}
+ \sum_{l'=1}^{d}\sum_{i,j=1}^{n} \big( DF^{l,l',i}(x_u)F^{l'}_j(x_u)\,2A^{i,j}\,\mathbb{W}^{i,j}_{u,v} + DF^{l,l',i}(x_u)F^{l'}_j(x_u)(v-u)A^{i,j} \big) \Big).
\]

Now, by Proposition 3.5 and Theorem 9.1 in [FH14], $\lim_{|\mathcal{P}|\to 0} I_1(\mathcal{P})$ coincides almost surely with the proclaimed Stratonovich integrals, as the term $DF^{l,l',i}(x_u)F^{l'}_j(x_u)(v-u)A^{i,j}$ corresponds exactly to the Stratonovich correction. We may now conclude our explanation.
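The role of the correction term can be checked numerically in the simplest setting. The following is a minimal sketch and not part of the argument above: it assumes the scalar case $d = N = n = 1$, the hypothetical vector field $F(x) = x$, a driver $X = uW$ for a scalar Brownian motion $W$, and $A = u^2/2$ (so that $U^T U = 2A$ with $U = u$). The function name `compensated_sum` and all parameters are illustrative. With the $(v-u)A$ term the compensated sums should approach the Stratonovich solution $x_0 e^{uW_1}$; without it they should approach the Itô solution $x_0 e^{uW_1 - u^2/2}$.

```python
import math
import random


def compensated_sum(u=1.0, steps=20000, seed=0, with_correction=True):
    """Compensated Riemann sum for dx = F(x) dX on [0, 1] with F(x) = x.

    Illustrative assumptions (not from the paper): d = N = n = 1,
    X = u * W for a scalar Brownian motion W, and A = u**2 / 2.
    Returns the terminal value of x (started at 1) and W_1.
    """
    rng = random.Random(seed)
    h = 1.0 / steps
    a = 0.5 * u * u
    x, w = 1.0, 0.0
    for _ in range(steps):
        dw = rng.gauss(0.0, math.sqrt(h))
        dx_path = u * dw  # first-level increment X_{u,v}
        # Ito iterated integral of X over one step: (X_{u,v}^2 - u^2 (v - u)) / 2
        second_level = 0.5 * (dx_path * dx_path - u * u * h)
        if with_correction:
            second_level += h * a  # the (v - u) A term
        # For F(x) = x we have DF(x) F(x) = x
        x += x * dx_path + x * second_level
        w += dw
    return x, w


x_strat, w1 = compensated_sum(with_correction=True)
x_ito, _ = compensated_sum(with_correction=False)
print(x_strat / math.exp(w1))      # ratio to the Stratonovich solution exp(u W_1)
print(x_ito / math.exp(w1 - 0.5))  # ratio to the Ito solution exp(u W_1 - u^2 / 2)
```

Both printed ratios should be close to 1 for a fine partition. In this scalar case the corrected second-level term reduces to $X_{u,v}^2/2$, which is the familiar Stratonovich form of the Milstein step.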

Finally, we conclude the paper with a question.

Open Problem. For Theorems A and B to hold, the only restriction on the Hermite rank of the functions $G_k$ comes from the lack of the integral bound (3.8). We can only prove this bound when $H^*(m) \in [0, \frac{1}{2}]$. Our question is: can one lift the restriction $H^*(m) < 0$ and still obtain the bound (3.8)? A proposal for obtaining this is to depart from the Hölder path approach used here and take on the $p$-variation rough path formulation instead. In [CFK+19], the authors have improved the regularity assumption of their previous work by using the $p$-variation rough path formulation instead of the Hölder one. They were studying the diffusive homogenisation problem, and for this they managed to include $p = \frac{1}{6}$.

References

[AS84] Milton Abramowitz and Irene A. Stegun, editors. Handbook of mathematical functions with formulas, graphs, and mathematical tables. A Wiley-Interscience Publication. John Wiley & Sons, Inc., New York, 1984. Reprint of the 1972 edition, Selected Government Publications.

[BC17] I. Bailleul and R. Catellier. Rough flows and homogenization in stochastic turbulence. J. Differential Equations, 263(8):4894–4928, 2017.

[BH02] Samir Ben Hariz. Limit theorems for the non-linear functional of stationary Gaussian processes. J. Multivariate Anal., 80(2):191–216, 2002.

[BM83] Peter Breuer and Péter Major. Central limit theorems for nonlinear functionals of Gaussian fields. J. Multivariate Anal., 13(3):425–441, 1983.

[CFK+19] Ilya Chevyrev, Peter K. Friz, Alexey Korepanov, Ian Melbourne, and Huilin Zhang. Multiscale systems, homogenization, and rough paths. In Probability and Analysis in Interacting Physical Systems, 2019.

[CFS82] I. P. Cornfeld, S. V. Fomin, and Ya. G. Sinai. Ergodic theory, volume 245 of Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences]. Springer-Verlag, New York, 1982. Translated from the Russian by A. B. Sosinskii.

[CKM03] Patrick Cheridito, Hideyuki Kawaguchi, and Makoto Maejima. Fractional Ornstein-Uhlenbeck processes. Electron. J. Probab., 8:no. 3, 14, 2003.

[CNN20] Simon Campese, Ivan Nourdin, and David Nualart. Continuous Breuer–Major theorem: Tightness and nonstationarity. Ann. Probab., 48(1):147–177, 2020.

[FH14] Peter K. Friz and Martin Hairer. A course on rough paths. Universitext. Springer, Cham, 2014. With an introduction to regularity structures.

[FK00] Albert Fannjiang and Tomasz Komorowski. Fractional Brownian motions in a limit of turbulent transport. Ann. Appl. Probab., 10(4):1100–1120, 2000.

[FV10] Peter K. Friz and Nicolas B. Victoir. Multidimensional stochastic processes as rough paths, volume 120 of Cambridge Studies in Advanced Mathematics. Cambridge University Press, Cambridge, 2010. Theory and applications.

[GL19] J. Gehringer and Xue-Mei Li. Homogenization with fractional random fields. arXiv:1911.12600. This is now improved and split into ‘Functional limit theorem for fractional OU’ and ‘Diffusive and rough homogenisation in fractional noise field’, 2019.

[GL20] J. Gehringer and Xue-Mei Li. Functional limit theorem for fractional OU. This is an improved version of arXiv:1911.12600, 2020.

[Gre51] Melville S. Green. Brownian motion in a gas of noninteracting molecules. J. Chem. Phys., 19:1036–1046, 1951.

[Hai05a] Martin Hairer. Ergodicity of stochastic differential equations driven by fractional Brownian motion. Ann. Probab., 33(2):703–758, 2005.

[Hai05b] Martin Hairer. Ergodicity of stochastic differential equations driven by fractional Brownian motion. Ann. Probab., 33(2):703–758, 2005.

[Has66] R. Z. Hasminskii. Certain limit theorems for solutions of differential equations with a random right side. Dokl. Akad. Nauk SSSR, 168:755–758, 1966.

[KLO12] Tomasz Komorowski, Claudio Landim, and Stefano Olla. Fluctuations in Markov processes, volume 345 of Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences]. Springer, Heidelberg, 2012. Time symmetry and martingale approximation.

[KM17] David Kelly and Ian Melbourne. Deterministic homogenization for fast-slow systems with chaotic noise. J. Funct. Anal., 272(10):4063–4102, 2017.

[KNR12] Tomasz Komorowski, Alexei Novikov, and Lenya Ryzhik. Evolution of particle separation in slowly decorrelating velocity fields. Commun. Math. Sci., 10(3):767–786, 2012.

[KP91] Thomas G. Kurtz and Philip Protter. Weak limit theorems for stochastic integrals and stochastic differential equations. Ann. Probab., 19(3):1035–1070, 1991.

[Kub57] Ryogo Kubo. Statistical-mechanical theory of irreversible processes. I. General theory and simple applications to magnetic and conduction problems. J. Phys. Soc. Japan, 12:570–586, 1957.

[KV86] C. Kipnis and S. R. S. Varadhan. Central limit theorem for additive functionals of reversible Markov processes and applications to simple exclusions. Comm. Math. Phys., 104(1):1–19, 1986.

[LCL07] Terry J. Lyons, Michael Caruana, and Thierry Lévy. Differential equations driven by rough paths, volume 1908 of Lecture Notes in Mathematics. Springer, Berlin, 2007. Lectures from the 34th Summer School on Probability Theory held in Saint-Flour, July 6–24, 2004. With an introduction concerning the Summer School by Jean Picard.

[LH19] Xue-Mei Li and Martin Hairer. Averaging dynamics driven by fractional Brownian motion. arXiv:1902.11251, to appear in the Annals of Probability, 2019.

[LOV00] C. Landim, S. Olla, and S. R. S. Varadhan. Asymptotic behavior of a tagged particle in simple exclusion processes. Bol. Soc. Brasil. Mat. (N.S.), 31(3):241–275, 2000.

[LS] Xue-Mei Li and J. Sieber. Slow/fast system with fractional environment and dynamics. In preparation.

[Lyo94] Terry Lyons. Differential equations driven by rough signals. I. An extension of an inequality of L. C. Young. Math. Res. Lett., 1(4):451–464, 1994.

[MT07] Makoto Maejima and Ciprian A. Tudor. Wiener integrals with respect to the Hermite process and a non-central limit theorem. Stoch. Anal. Appl., 25(5):1043–1056, 2007.

[NP05] David Nualart and Giovanni Peccati. Central limit theorems for sequences of multiple stochastic integrals. Ann. Probab., 33(1):177–193, 2005.

[PK74] G. C. Papanicolaou and W. Kohler. Asymptotic theory of mixing stochastic ordinary differential equations. Comm. Pure Appl. Math., 27:641–668, 1974.

[PT17] Vladas Pipiras and Murad S. Taqqu. Long-range dependence and self-similarity. Cambridge Series in Statistical and Probabilistic Mathematics, [45]. Cambridge University Press, Cambridge, 2017.

[Sam06] Gennady Samorodnitsky. Long range dependence. Found. Trends Stoch. Syst., 1(3):163–257, 2006.

[Taq75] Murad S. Taqqu. Weak convergence to fractional Brownian motion and to the Rosenblatt process. Z. Wahrscheinlichkeitstheorie und Verw. Gebiete, 31:287–302, 1975.

[Taq77] Murad S. Taqqu. Law of the iterated logarithm for sums of non-linear functions of Gaussian variables that exhibit a long range dependence. Z. Wahrscheinlichkeitstheorie und Verw. Gebiete, 40(3):203–238, 1977.

[Tay21] G. I. Taylor. Diffusion by Continuous Movements. Proc. London Math. Soc. (2), 20(3):196–212, 1921.

[You36] L. C. Young. An inequality of the Hölder type, connected with Stieltjes integration. Acta Math., 67(1):251–282, 1936.
