arXiv:2101.05222v1 [math.PR] 13 Jan 2021

Rough invariance principle for delayed regenerative processes

Tal Orenshtein

Technische Universität Berlin and Weierstrass Institute
Strasse des 17. Juni 136, 10623 Berlin and Mohrenstrasse 39, 10117 Berlin
Email: [email protected]

January 14, 2021

Abstract

We derive an invariance principle for the lift to the rough path topology of stochastic processes with delayed regenerative increments under an optimal moment condition. An interesting feature of the result is the emergence of area anomaly, a correction term in the second level of the limiting rough path which is identified as the average stochastic area on a regeneration interval. The applications include random walks in random environment and additive functionals of recurrent Markov chains. The result is formulated in the p-variation settings, where a rough path Donsker Theorem is available under the second moment condition. The key renewal theorem is applied to obtain an optimal moment condition.

Keywords: Invariance principle, rough paths, p-variation, area anomaly, regenerative processes, key renewal theorem, random walks in random environment.

A.M.S. subject classification: 60F17, 60K37, 82C41, 82B43

1 Introduction

Donsker's invariance principle states that a diffusively rescaled centered random walk on $\mathbb R^d$ with jumps of finite variance converges in distribution to a Brownian motion in the Skorohod topology. Regenerative processes are a more general class: they are assumed to contain an infinite subsequence of times on which the induced process is a random walk. The natural strategy to prove an invariance principle for these processes is to first prove it for that subsequence, and then to show that the fluctuations of the approximating sequence around the original one vanish in the limit. However, a surprising feature appears when the process is lifted naturally to the rough path space. The first level of the limit coincides with the Brownian motion defined by the covariance matrix achieved in the classical case, whereas the second level (see Section 1.1 for more details) does not coincide with the iterated integral of the Brownian motion, but should be corrected by a deterministic process which is linear in time and is identified in terms of (5) below. The latter is called the 'area anomaly' and is the average stochastic area on a regeneration interval. This provides non-trivial and new information on the limiting path which is not captured in the classical invariance principle. This information is crucial in order to describe the limit of stochastic differential equations (SDE) of the form

$$Y^{(n)}_t = Y^{(n)}_0 + \int_0^t b(Y^{(n)}_s)\,ds + \int_0^t \sigma(Y^{(n)}_s)\,dX^{(n)}_s, \qquad t \in [0,T],$$

where the driver $X^{(n)}$ is a linearly interpolated rescaled regenerative process, $b$ and $\sigma$ are smooth functions and the integral is in the sense of Riemann–Stieltjes. Even though $X^{(n)}$ converges weakly to a Brownian motion $B$, the limit of $Y^{(n)}$ does not satisfy the SDE with $b$ and $\sigma$ driven by $B$; instead there is an additional drift which is explicit in terms of the area correction of the second level of the limiting rough path (denoted by $\Gamma$ in (5) below), see e.g. [6], line (1.1) and the discussion below it.

In this work we optimize the moment condition on regeneration intervals that was assumed in [21]. The main obstacle of [21] is that it relies on the rough path extension of Donsker's Theorem [2], which is based on Kolmogorov's tightness criterion on Hölder rough paths. As mentioned already in [2], this was costly and therefore the assumed moment condition was not optimal. Instead of considering the somewhat heavy algebraic framework of the Hölder rough path formalism, we consider the parametrization-free p-variation setting, which fits better to jump processes and discrete-time processes.

Recently, a new machinery was introduced to deal with regularity for jump processes in the rough path topology. The main tool is the Lépingle Burkholder–Davis–Gundy (BDG) inequality lifted to the p-variation rough path setting. The latter provides an equivalence between the $(2q)$-th moment of the p-variation norm of a local martingale and the $q$-th moment of its quadratic variation, and allows one to obtain the rough version of Donsker's Theorem under the second moment condition, as in the classical case. The proof is by now standard; for completeness the details are included in Appendix C.

In order to optimize the moment condition on a regeneration interval for processes with regenerative increments, so that the rough path version of Donsker's Theorem is used under not more than the second moment, we apply the Key Renewal Theorem. This theorem roughly says that the mass function of the regeneration interval around a fixed time (also called 'age' in the renewal theory jargon) approaches a density which is proportional to the uniform measure of an independent copy of the interval, namely, its size-biased version. This result is sharp and in particular guarantees that in the limit, as time goes to infinity, the $m$-th moment of a regeneration interval around a deterministic time and the $(m+1)$-st moment of a fixed regeneration interval are equal up to a constant. The result extends the classical Donsker Theorem for these processes with no extra regularity assumption.

There has been some progress related to random walks in the rough path topology in the past two decades. The closest works, generalized in this paper, are [21, 23, 22]. In the context of rough paths with jumps, see [3, 8, 4]; for CLTs on nilpotent covering graphs and crystal lattices, see [15, 24, 16]; for additive functionals of Markov processes and random walks in random environment, see [6]. For homogenization in the continuous settings, see [5, 18, 19], and for additive functionals of fractional random fields, see [11, 12, 13].

In the remaining part of this section we shall present the model and the main result, Theorem 1.5, after introducing the necessary rough path theory elements in Section 1.1. In Chapter 2 we shall mention some examples to which our main result applies, whereas

its proof is given in Chapter 3. We also included three short appendices which might be useful in other contexts.

1.1 Preliminaries on rough paths

For two families $(a_i)_{i\in I}$, $(b_i)_{i\in I}$ of real numbers indexed by $I$, we write $a_i \lesssim b_i$ if there is a positive constant $c$ so that $a_i \le c\,b_i$ for all $i \in I$. We write $a_n \approx b_n$ whenever $a_n - b_n \to 0$ as $n \to \infty$. Set $\mathbb N = \{1, 2, \dots\}$, $\mathbb N_0 = \mathbb N \cup \{0\}$ and $\Delta_T := \{(s,t) : 0 \le s \le t \le T\}$, $T > 0$. For a function $X : [0,T] \to \mathbb R^d$ we set $X_{s,t} := X_t - X_s$. We interpret $X$ also as a function on $\Delta_T$ given by $(s,t) \mapsto X_{s,t}$. For a metric space $E$ we write $C([0,T], E)$ resp. $D([0,T], E)$ for the $E$-valued continuous resp. càdlàg functions on $[0,T]$. We write $C(\Delta_T, E)$ resp. $D(\Delta_T, E)$ for the space of $E$-valued functions $X : \Delta_T \to E$ so that $t \mapsto X_{s,t}$ is continuous resp. càdlàg on $[s,T]$, for every $s \in [0,T]$. By convention, whenever $E = \mathbb R^d$, we write $|\cdot|$ for the $d$-dimensional Euclidean norm. Also, the expectation of a vector (or matrix) valued random variable is understood coordinate- (or entry-) wise. For $x, y \in \mathbb R^d$ we write $x \otimes y \in \mathbb R^{d\times d}$ for the tensor product $(x\otimes y)_{i,j} = x_i y_j$, $i,j = 1,\dots,d$, and $x^{\otimes 2}$ for $x \otimes x$. Whenever $(x_n)_n$ is a sequence of elements in $\mathbb R^d$ we write $x_n^i$, $i = 1,\dots,d$, for their components.

For brevity, we shall focus on the necessary objects needed for introducing our results. We follow closely Chapter 2 of [6]. The reader is referred to Section 5 of [8] for details on Itô p-variation rough paths with jumps and to [9] and [10] for an extensive account of the theory of rough paths.

For a normed space $(E, \|\cdot\|_E)$ and a function $X \in C([0,T], E)$ or $X \in C(\Delta_T, E)$ we write $\|X\|_{\infty,[0,T]} := \sup_{(s,t)\in\Delta_T} \|X_{s,t}\|_E$ to denote the uniform norm of $X$, and for any $p \in (0,\infty)$ we write $\|X\|_{p,[0,T]} := \big(\sup_{\mathcal P} \sum_{[s,t]\in\mathcal P} \|X_{s,t}\|_E^p\big)^{1/p}$, where the supremum is over all finite partitions $\mathcal P$ of $[0,T]$, to denote its p-variation norm. A continuous rough path is a pair of functions $(X, \mathbb X) \in C([0,T], \mathbb R^d) \times C(\Delta_T, \mathbb R^{d\times d})$ satisfying Chen's relation, that is
$$\mathbb X_{s,t} - \mathbb X_{s,u} - \mathbb X_{u,t} = X_{s,u} \otimes X_{u,t} \qquad \text{for all } 0 \le s \le u \le t \le T. \qquad (1)$$
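The following minimal numerical sketch (not part of the original text) illustrates the two objects just introduced: the p-variation norm, computed over a finite grid by dynamic programming, and Chen's relation for a second-level lift of a discrete path. For simplicity the left-point (Itô-type) iterated sums are used here; all function names and the toy data are ours.

```python
import numpy as np

def p_variation(increment_norm, n_points, p):
    """||X||_{p,[0,T]} over a finite grid, by dynamic programming:
    V[j] = max_{i<j} V[i] + |X_{t_i,t_j}|^p, so V[-1] is the sup over partitions."""
    V = np.zeros(n_points)
    for j in range(1, n_points):
        V[j] = max(V[i] + increment_norm(i, j) ** p for i in range(j))
    return V[-1] ** (1.0 / p)

# Toy path: a simple random walk in R^2, diffusively rescaled as in Donsker's theorem.
rng = np.random.default_rng(0)
steps = rng.choice([-1.0, 1.0], size=(200, 2))
X = np.vstack([np.zeros((1, 2)), np.cumsum(steps, axis=0)]) / np.sqrt(200)

print("2.5-variation of the first level:",
      p_variation(lambda i, j: np.linalg.norm(X[j] - X[i]), len(X), 2.5))

# Second level via left-point iterated sums; it satisfies Chen's relation (1).
def second_level(s, t):
    inc = X[s + 1:t + 1] - X[s:t]                                        # X_{k-1,k}
    partial = np.cumsum(np.vstack([np.zeros((1, 2)), inc]), axis=0)[:-1]  # X_{s,k-1}
    return np.einsum('ki,kj->ij', partial, inc)

s, u, t = 0, 80, 200
lhs = second_level(s, t) - second_level(s, u) - second_level(u, t)
rhs = np.outer(X[u] - X[s], X[t] - X[u])
print("Chen's relation (1) holds:", np.allclose(lhs, rhs))
```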

A continuous rough path $(X, \mathbb X)$ is said to have finite p-variation if
$$|\!|\!|(X, \mathbb X)|\!|\!|_{p,[0,T]} := |X_0| + \|X\|_{p,[0,T]} + \|\mathbb X\|_{p/2,[0,T]} < \infty. \qquad (2)$$
We write $C_p([0,T], \mathbb R^d \oplus \mathbb R^{d\times d})$ for the set of continuous rough paths of finite p-variation.

The (uniform) p-variation distance $\sigma_{p,[0,T]}$ is defined by taking the norm of the path defined by the differences increment-wise:

$$\sigma_{p,[0,T]}\big((X,\mathbb X),(Y,\mathbb Y)\big) := |\!|\!|(X - Y,\ \mathbb X - \mathbb Y)|\!|\!|_{p,[0,T]}.$$
We refer to $X$ as the first level of the rough path $(X, \mathbb X)$ and to $\mathbb X$ as its second level.

We shall now state a simple and useful sufficient condition for convergence in p-variation, based on convergence in the uniform topology together with tightness of the p-variation norms; see Lemma 2.3 in [6], which is based on Theorem 6.1 of [8].

Lemma 1.2 (Sufficient condition for convergence in p-variation) Assume that $(Z_n, \mathbb Z_n)_{n\in\mathbb N}$ is a sequence of continuous rough paths and let $p_0 \in (2,3)$. Assume also

that there exists a continuous rough path $(Z, \mathbb Z)$ such that $(Z_n, \mathbb Z_n) \to (Z, \mathbb Z)$ in distribution in the uniform topology and that the family of real valued random variables $(|\!|\!|(Z_n, \mathbb Z_n)|\!|\!|_{p_0,[0,T]})_{n\in\mathbb N}$ is tight. Then $(Z_n, \mathbb Z_n) \to (Z, \mathbb Z)$ in distribution in the p-variation uniform topology $C_p([0,T], \mathbb R^d \oplus \mathbb R^{d\times d})$ for all $p \in (p_0, 3)$.

1.2 Main result

Let $X = (X_n)_{n\in\mathbb N_0}$ be a discrete time stochastic process on $\mathbb R^d$ defined on a probability space $(S, \mathcal F, P)$ and let $E$ be the corresponding expectation. Assume that $X$ has delayed regenerative increments, that is, there exists a sequence $0 =: \tau_0 < \tau_1 < \tau_2 < \dots$ of $\mathcal F$-measurable $\mathbb N_0$-valued random variables so that $(\{T_k, X_{\tau_k,\tau_k+m},\ 0 \le m \le T_k\})_{k\in\mathbb N}$ is an i.i.d. family independent of $\{T_0, X_{0,m},\ 0 \le m \le \tau_1\}$ under $P$, where $T_k = \tau_{k+1} - \tau_k$ are the regeneration intervals and $X_{\ell,k} := X_k - X_\ell$ are the increments. Assume that $E[X_{\tau_1,\tau_2}] = 0$ and that $\gcd\{j : p_j > 0\} = d$ for some $d \in \mathbb N$, where $p_j = P(T_1 = j)$, that is, $T_1$ is $d$-arithmetic. For any sequence $(X_k)_{k\in\mathbb N_0}$ of elements in $\mathbb R^d$ we define

$$X^{(n)}_t := \frac{1}{\sqrt n}\,X_{\lfloor nt\rfloor} + \frac{nt - \lfloor nt\rfloor}{\sqrt n}\big(X_{\lfloor nt\rfloor+1} - X_{\lfloor nt\rfloor}\big) \quad\text{and}$$
$$\mathbb X^{(n)}_{s,t} := \frac1n I^{\mathrm{Str}}_{\lfloor ns\rfloor,\lfloor nt\rfloor}(X) + \frac{n(t-s) - \lfloor nt\rfloor + \lfloor ns\rfloor}{n}\Big(I^{\mathrm{Str}}_{\lfloor ns\rfloor,\lfloor nt\rfloor+1}(X) - I^{\mathrm{Str}}_{\lfloor ns\rfloor,\lfloor nt\rfloor}(X)\Big), \qquad (3)$$
where for positive integers $M < N$

$$I^{\mathrm{Str}}_{M,N}(X) := \sum_{M+1\le k\le N}\Big(X_{M,k-1}\otimes X_{k-1,k} + \tfrac12\,X_{k-1,k}\otimes X_{k-1,k}\Big).$$
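As a quick illustration (ours, not from the paper), the following sketch evaluates $I^{\mathrm{Str}}_{M,N}(X)$ for a simulated path and checks the identity $\mathrm{sym}(I^{\mathrm{Str}}_{M,N}(X)) = \tfrac12 X_{M,N}^{\otimes 2}$ used later in the proof of Theorem 1.5; the antisymmetric part is the discrete signed (Lévy) area $A_{M,N}(X)$.

```python
import numpy as np

def I_str(X, M, N):
    """Discrete Stratonovich iterated sum I^Str_{M,N}(X):
    sum over M < k <= N of X_{M,k-1} (x) X_{k-1,k} + (1/2) X_{k-1,k} (x) X_{k-1,k}."""
    d = X.shape[1]
    acc = np.zeros((d, d))
    for k in range(M + 1, N + 1):
        inc = X[k] - X[k - 1]
        acc += np.outer(X[k - 1] - X[M], inc) + 0.5 * np.outer(inc, inc)
    return acc

rng = np.random.default_rng(1)
X = np.vstack([np.zeros((1, 2)), np.cumsum(rng.standard_normal((50, 2)), axis=0)])

S = I_str(X, 0, 50)
total = X[50] - X[0]
print("sym(I^Str) equals (1/2) X_{M,N}^{(x)2}:",
      np.allclose(0.5 * (S + S.T), 0.5 * np.outer(total, total)))
print("signed area A_{0,50}^{1,2} =", 0.5 * (S - S.T)[0, 1])
```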

Remark 1.3 Note that

$$\mathbb X^{(n)}_{s,t} := \int_{(s,t]}\int_{(s,u)} dX^{(n)}_v \otimes dX^{(n)}_u,$$

where the integration with respect to $dX^{(n)}_v$ is in the sense of Riemann–Stieltjes.

We shall now formulate the main regularity assumption. We remind the reader that for $Y \in \mathbb R^d$ we write $Y^i$ for the $i$-th component of $Y$, $i \in \{1,\dots,d\}$.

Assumption 1.4 For all $i \in \{1,\dots,d\}$, $m \in \{0,1\}$ and $p \in \{0,2\}$,
$$0 < E\big[(\Xi^i_m)^p\, T_m\big] < \infty,$$
where $\Xi^i_m := \sup\{|X^i_{\tau_m,\tau_m+k}| : 0 \le k \le T_m\}$.

Theorem 1.5 Let $X$ be a discrete time stochastic process satisfying Assumption 1.4. Assume that $E[X_{\tau_k,\tau_{k+1}}] = 0$ for every $k \in \mathbb N_0$. Then $(X^{(n)}, \mathbb X^{(n)})_{n\in\mathbb N}$ converges in distribution to $(B, \mathbb B + \Gamma\,\cdot)$ in $C_p([0,T], \mathbb R^d \oplus \mathbb R^{d\times d})$ for every $p > 2$, where $B$ is a centered Brownian motion with covariance
$$[B,B]_t = \frac{E[X_{\tau_1,\tau_2}^{\otimes 2}]}{E[T_1]}\,t, \qquad (4)$$
$\mathbb B$ is the Stratonovich iterated integral of $B$, that is $\mathbb B_{s,t} = \int_{(s,t]} B_{s,u} \otimes{\circ}dB_u$, and $\Gamma \in \mathbb R^{d\times d}$ is given by
$$\Gamma = \frac{E[A_{\tau_1,\tau_2}(X)]}{E[T_1]}, \qquad (5)$$
where $A_{M,N}(X) = \mathrm{Antisym}(\mathbb X^{(1)}_{M,N})$ is the antisymmetric part of the matrix $\mathbb X^{(1)}_{M,N}$. The notation $\mathbb B + \Gamma\,\cdot$ above is for $\big(\mathbb B_{s,t} + (t-s)\Gamma\big)_{(s,t)\in\Delta_T}$.

Remark 1.6 Note that the positivity condition for the moment in Assumption 1.4 is assumed in order to avoid degeneracies. Indeed, it can be omitted if we accept a degenerate formulation of the invariance principle. More accurately, if this condition is violated, then $X^i_n = 0$ for all times for some coordinate $i$, for which one can say that the invariance principle holds with a singular covariance matrix. Note also that Assumption 1.4 holds whenever $\tau_1$ has a third moment and $|X_{\ell,k}|/(k-\ell)$, $\ell < k$, is uniformly bounded from above by a constant a.s., for example, whenever the process $X$ has values in $\mathbb Z^d$ with nearest-neighbor jumps.

Remark 1.7 (Optimality of the result) Let $X_n = \sum_{k=1}^{n}\xi_k$ be a centered random walk on $\mathbb R^d$, that is $E[\xi^i_k] = 0$, $i = 1,\dots,d$, where $\xi_k = (\xi^1_k,\dots,\xi^d_k)$ are $\mathbb R^d$-valued i.i.d. random variables. Then $X = (X_n)_{n\in\mathbb N_0}$ has, trivially, delayed regenerative increments: here $T_k = 1$ and $\Xi^i_k = |\xi^i_k|$, $i = 1,\dots,d$, $k = 0,1,\dots$ Therefore Assumption 1.4 is equivalent in this case to $E\big[\sum_{i=1}^d |\xi^i_1|^2\big] < \infty$, which is a necessary and sufficient condition for the classical central limit theorem, cf. Theorem 4 in Chapter 7 of [14].
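The following toy Monte Carlo sketch (the block construction is invented here purely for illustration and does not appear in the paper) shows how the area anomaly $\Gamma$ of (5) can be estimated from i.i.d. regeneration blocks. Each block consists of a Geometric(1/2) number of counter-clockwise unit loops, which carry signed area but no net displacement, followed by one symmetric unit step, so that $E[X_{\tau_1,\tau_2}] = 0$ while $\Gamma^{1,2} = E[A_{\tau_1,\tau_2}]/E[T_1] = 2/9 > 0$.

```python
import numpy as np

rng = np.random.default_rng(2)
LOOP = np.array([[1, 0], [0, 1], [-1, 0], [0, -1]], dtype=float)  # unit square, ccw

def one_block():
    """One regeneration block: G ~ Geometric(1/2) unit loops (signed area 1 each,
    zero net displacement) followed by one symmetric unit step."""
    g = rng.geometric(0.5)
    extra = rng.choice([-1.0, 1.0]) * np.eye(2)[rng.integers(2)]
    steps = np.vstack([np.tile(LOOP, (g, 1)), extra[None, :]])
    return np.vstack([np.zeros((1, 2)), np.cumsum(steps, axis=0)])

def signed_area(path):
    """(1,2) entry of A_{M,N}(X) = Antisym(X^(1)_{M,N}) over one block."""
    rel, inc = path[:-1] - path[0], np.diff(path, axis=0)
    return 0.5 * np.sum(rel[:, 0] * inc[:, 1] - rel[:, 1] * inc[:, 0])

areas, lengths = [], []
for _ in range(20000):
    block = one_block()
    areas.append(signed_area(block))
    lengths.append(len(block) - 1)           # block length T_k

print("Monte Carlo estimate of Gamma^{1,2}:", np.mean(areas) / np.mean(lengths))
print("exact value E[G]/E[4G+1] = 2/9     :", 2 / 9)
```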

2 Examples

Positive recurrent countable Markov chains. Assume that $(Y_k)_{k\in\mathbb N}$ is a positive recurrent irreducible Markov chain taking values in some measurable space $S$, that is,

$$P\big(T_x^+ < \infty \mid Y_0 = x\big) = 1 \quad\text{for some } x \in S,$$
where $T_x^+ = \inf\{k \in \mathbb N : Y_k = x\}$. Assume moreover that $E[(T_x^+)^3 \mid Y_0 = x] < \infty$. Then the conditions of Theorem 1.5 hold under $P(\,\cdot\mid Y_0 = x)$ for the sequence
$$X_n := \sum_{k=0}^{n} f(Y_k) - n\,\frac{E[D \mid Y_0 = x]}{E[T_x^+ \mid Y_0 = x]}, \qquad n \in \mathbb N_0,$$

where $D = \sum_{k=0}^{T_x^+ - 1} f(Y_k)$, and $f : S \to \mathbb R^d$ is any bounded measurable function.

All the examples considered in Section 5 of [21] are applicable here. In particular, the result applies to Random walks in periodic environment ([21], Section 5.2), where the periodicity assumption can be easily relaxed. For example, it can include i.i.d. impurities, as long as the regenerative structure is kept. It applies also to Random walks on covering graphs and hidden Markov chains (Chapter 5.2 of [21]). The examples there immediately satisfy Assumption 1.4. Moreover, these can be extended, allowing infinite modulating systems and covering graphs with infinite structure. Another application is to the so-called Ballistic random walks in random environment (Section 5.1 of [21]). Lastly, for the example of Random walks in Dirichlet environments,

the rough invariance principle, Theorem 5.5 of [21], is shown in the ballistic regime (more accurately, whenever a condition denoted by $(T)_\gamma$ holds for some $\gamma \in (0,1)$; see Chapter 6.1 of [26] for the definition and more details) only whenever the trap parameter $\kappa$ satisfies $\kappa > 8$, in the $\alpha$-Hölder rough path topology for all $\alpha < \frac12 - \frac{1}{(\kappa/2)^*}$, where $(\kappa/2)^* = \min\{\lfloor\kappa/2\rfloor, 2\lfloor\kappa/4\rfloor\}$. The improvement in the present work is that the rough invariance principle, Theorem 1.5, applies already whenever the trap parameter satisfies $\kappa > 3$, and moreover it holds for all $p > 2$, which corresponds to all $\alpha < 1/2$ in the $\alpha$-Hölder settings.
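To make the first example above concrete, here is a small simulation sketch (the chain, the function $f$, and all names are ours and purely illustrative). It builds the centered additive functional $X_n$ from a 3-state positive recurrent chain, using the empirical excursion estimate of $E[D \mid Y_0 = x]/E[T_x^+ \mid Y_0 = x]$ as the centering constant, and checks that the increments over the regeneration times (the successive returns to $x$) are centered.

```python
import numpy as np

rng = np.random.default_rng(3)
P = np.array([[0.2, 0.5, 0.3],                # transition matrix of a 3-state chain
              [0.6, 0.1, 0.3],
              [0.4, 0.4, 0.2]])
f = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])   # a bounded f : S -> R^2
x = 0                                                    # the regeneration state

n_steps = 50000
Y = np.empty(n_steps + 1, dtype=int)
Y[0] = x
for k in range(n_steps):
    Y[k + 1] = rng.choice(3, p=P[Y[k]])

returns = np.flatnonzero(Y == x)              # regeneration times tau_0 < tau_1 < ...

# Empirical centering E[D | Y_0=x] / E[T_x^+ | Y_0=x] from the excursions.
D_total, T_total = np.zeros(2), 0
for a, b in zip(returns[:-1], returns[1:]):
    D_total += f[Y[a:b]].sum(axis=0)          # D over one excursion
    T_total += b - a
drift = D_total / T_total

# Centered additive functional of the example: X_n = sum_{k<=n} f(Y_k) - n * drift.
X = np.cumsum(f[Y], axis=0) - np.arange(len(Y))[:, None] * drift
block_increments = X[returns[1:]] - X[returns[:-1]]
print("mean increment over regeneration blocks (~ 0):", block_increments.mean(axis=0))
```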

3 Proof of Theorem 1.5

Set $Z_k := X_{\tau_k} = \sum_{\ell=0}^{k-1} Y_\ell$ for $k \ge 0$, where $Y_\ell := X_{\tau_\ell,\tau_{\ell+1}}$. Then $Z = (Z_k)_{k\in\mathbb N_0}$ is a random walk with square integrable jumps. In particular,

$$(Z^{(n)}, \mathbb Z^{(n)}) \text{ converges in distribution to } (B^Z, \mathbb B^Z) \text{ in } C_p([0,T], \mathbb R^d \oplus \mathbb R^{d\times d}) \qquad (6)$$
for all $p > 2$, where $B^Z$ is a centered Brownian motion with covariance

$$[B^Z, B^Z]_t = E[Y_2 \otimes Y_2]\,t$$
and $\mathbb B^Z$ is the Stratonovich iterated integral of $B^Z$. By Lemma 1.2 it is enough to prove first convergence in distribution in the uniform topology and then to show tightness for the sequence of p-variation norms. We shall work on each level separately. Let us first identify the limit by proving the convergence of the finite-dimensional distributions. For ease of notation we shall only show the one-dimensional distributions; proving the convergence for higher dimensions is done similarly.

For any $u \ge 0$ let $\kappa(u)$ be the unique random integer $k$ so that $\tau_k \le u < \tau_{k+1}$. Note that $\kappa(u)$ is measurable with respect to $\sigma(T_k : k \in \mathbb N_0)$.

Next, note that as $\tau_{\kappa(nt)} \le nt < \tau_{\kappa(nt)+1}$,
$$\frac{\kappa(nt)}{\kappa(nt)+1}\cdot\frac{\kappa(nt)+1}{\tau_{\kappa(nt)+1}} \le \frac{\kappa(nt)}{nt} \le \frac{\kappa(nt)}{\tau_{\kappa(nt)}}.$$

The weak law of large numbers for $(\tau_k)_{k\in\mathbb N_0}$ implies that

$$\frac{\kappa(nt)}{nt} \xrightarrow[n\to\infty]{} E[T_1]^{-1} =: \beta \qquad (7)$$
in probability with respect to $P$, where we used the fact that $1 \le E[T_1] < \infty$ by assumption. In particular, $P(\kappa(n) > 2\beta n) \to 0$ as $n\to\infty$, and therefore for every $\varepsilon > 0$
$$P\big(\|X^{(n),i} - Z^{(n),i}_{\kappa(n\cdot)/n}\|_{\infty,[0,T]} > \varepsilon\big) \le P\big(\|\Xi^i_{\kappa(n\cdot)}\|_{\infty,[0,T]} > \varepsilon\sqrt n\big) \le P\Big(\max_{0\le m\le 2\beta nT}\Xi^i_m > \varepsilon\sqrt n\Big) + o(1).$$

But since the maximum of order $n$ i.i.d. random variables with a finite second moment is sub-diffusive in probability, the first term also vanishes in probability. Indeed,

$$P\Big(\max_{0\le m\le cn}\Xi^i_m > \varepsilon\sqrt n\Big) = 1 - \big(1 - P(\Xi^i_0 > \varepsilon\sqrt n)\big)\big(1 - P(\Xi^i_1 > \varepsilon\sqrt n)\big)^{\lfloor cn\rfloor} \approx 1 - \big(1 - P(\Xi^i_1 > \varepsilon\sqrt n)\big)^{\lfloor cn\rfloor} \approx 1 - \exp\big(-cn\,P(\Xi^i_1 > \varepsilon\sqrt n)\big),$$
which tends to $0$ as $n\to\infty$. This holds since $P(|\Xi^i_1|^2 > \varepsilon^2 n) \le P(|\Xi^i_1|^2 > \varepsilon^2 j)$ for every $j \le n$ and so
$$(n-k)\,P\big(|\Xi^i_1|^2 > \varepsilon^2 n\big) \le \sum_{j=k+1}^{n} P\big(|\Xi^i_1|^2 > \varepsilon^2 j\big), \qquad k \le n.$$
Therefore,

$$\limsup_{n\to\infty} n\,P\big(\Xi^i_1 > \varepsilon\sqrt n\big) = \limsup_{n\to\infty}(n-k)\,P\big(|\Xi^i_1|^2 > \varepsilon^2 n\big) \le \sum_{j=k+1}^{\infty} P\big(\varepsilon^{-2}|\Xi^i_1|^2 > j\big) \xrightarrow[k\to\infty]{} 0,$$
since the right hand side is summable, as $E[|\Xi^i_1|^2] < \infty$.

Next, since the maximum of linear interpolations of any finite sequence on any bounded interval is obtained at the end points, we have

$$\big\|Z^{(n),i}_{\kappa(n\cdot)/n} - Z^{(n),i}_{\cdot\beta}\big\|_{\infty,[0,T]} \le \frac{1}{\sqrt n}\max_{0\le m\le T\beta n}\big|Z^i_{\kappa(m/\beta)} - Z^i_m\big|$$

1 i 6 max Ξm max κ(m/β) m . √n 06m6T nβ | | 06m6T nβ | − | Thus for R> 0 P (n),i (n),i ( Zκ(n )/n Z β ,[0,T ] >ε) k · − · k∞ P i i 6 max Zκ(m/β) Zm >ε√n, max κ(m/β) m 6 R√n 06m6T βn | − | 06m6T nβ | − |  P i i + max Zκ(m/β) Zm >ε√n, max κ(m/β) m > R√n 06m6T βn | − | 06m6T nβ | − |  P i i P 6 max Zk Zm >ε√n + max κ(m/β) m > R√n 06k,m6Tβn, k m 6R√n | − |  06m6T nβ | − |  | − | P i P 6 3Tβ√n max Zk >ε√n + max κ(m/β) m > R√n . k6R√n | |  06m6T nβ | − |  To deal with the first term we use a standard two pairs estimate (see Example 10.1 of Billingsley [1]), to find some K > 0 and Ψ(λ) 0 so that for any fixed R> 0 −−−−→λ →∞ Ψ ε√n P i K √R √n max Zk >ε√n 6 √n 4 + √n   2 (8) k6R√n | |  εn1/4/√R εn1/4/ 2√R      KR2 4R ε√n = + Ψ 0. 4 2 n ε √n ε  √R  −−−−→→∞

7 By the central limit theorem for renewal processes with finite variance

$$P\Big(\max_{0\le m\le Tn\beta}|\kappa(m/\beta) - m| > R\sqrt n\Big) \xrightarrow[n\to\infty]{} P\Big(\max_{0\le t\le T} c\,|W_t| > R\Big),$$
where $W$ is a Brownian motion and $c > 0$ is some constant. As $R$ can be chosen arbitrarily large, $\limsup_{n\to\infty} P\big(\|Z^{(n),i}_{\kappa(n\cdot)/n} - Z^{(n),i}_{\cdot\beta}\|_{\infty,[0,T]} > \varepsilon\big) = 0$. Therefore,
$$\big\|X^{(n),i} - Z^{(n),i}_{\cdot\beta}\big\|_{\infty,[0,T]} \xrightarrow[n\to\infty]{} 0 \quad\text{in probability}.$$
Applying Slutsky's Theorem in the Skorohod topology, Theorem B.1, to

$$X^{(n)} = \big(X^{(n)} - Z^{(n)}_{\cdot\beta}\big) + Z^{(n)}_{\cdot\beta}, \qquad (9)$$
we deduce that $X^{(n)} \to B^Z_{\cdot\beta} =: B$ in distribution with respect to $P$, so that $B$ is a $d$-dimensional Brownian motion with covariance matrix given in (4), as desired.

Next, we shall show tightness of the p-variation norms for $p > 2$. Note that $\kappa(u) \le u$ for any $u \ge 0$. Indeed, $T_i \in \mathbb N$ for all $i \in \mathbb N$ by assumption and therefore $\kappa(u) \le \tau_{\kappa(u)} \le u$. Now, since
$$\big|X^{(n),i}_{s,t} - Z^{(n),i}_{\kappa(ns)/n,\kappa(nt)/n}\big| \le \frac{1}{\sqrt n}\big(|\Xi^i_{\kappa(ns)}| + |\Xi^i_{\kappa(nt)+1}|\big)$$
for any $(s,t) \in \Delta_T$, estimating the 2-variation over each regeneration block separately and taking expectation gives

$$E\big[\|X^{(n)} - Z^{(n)}_{\kappa(n\cdot)/n}\|^2_{2,[0,T]}\big] \lesssim \frac1n\Big((n-1)\,E\big[T_1|\Xi_1|^2\big] + E\big[T_0|\Xi_0|^2\big]\Big) \lesssim 1.$$

Next, since $\kappa(nT) \le nT$, we have that $\big(\tfrac{1}{\sqrt n}Z^i_{\kappa(\ell)}\big)_{\ell\le nT}$ is a subsequence of $\big(\tfrac{1}{\sqrt n}Z^i_k\big)_{k\le nT}$ and therefore $E\big[\|Z^{(n),i}_{\kappa(n\cdot)/n}\|^2_{p,[0,T]}\big] \le E\big[\|Z^{(n),i}\|^2_{p,[0,T]}\big] \lesssim 1$, see (13). Using the triangle inequality together with the fact that the p-variation norms are monotonically decreasing in $p$,

$$E\big[\|X^{(n),i}\|^2_{p,[0,T]}\big] \lesssim E\big[\|X^{(n),i} - Z^{(n),i}_{\kappa(n\cdot)/n}\|^2_{2,[0,T]}\big] + E\big[\|Z^{(n),i}_{\kappa(n\cdot)/n}\|^2_{p,[0,T]}\big] \lesssim 1,$$
as desired.

We shall now treat the second level. Let us show first convergence in the Skorohod topology. Note that for $\ell < k$ we have the following decomposition, which is a consequence of Chen's relation (1):

$$I^{\mathrm{Str}}_{\tau_\ell,\tau_k}(X) = I^{\mathrm{Str}}_{\ell,k}(Z) + \sum_{\ell+1\le u\le k} A_{\tau_{u-1},\tau_u}(X). \qquad (10)$$

Indeed, $I^{\mathrm{Str}}_{\tau_{u-1},\tau_u}(X) = \mathrm{sym}\big(I^{\mathrm{Str}}_{\tau_{u-1},\tau_u}(X)\big) + A_{\tau_{u-1},\tau_u}(X)$, and by a direct computation $\mathrm{sym}\big(I^{\mathrm{Str}}_{\tau_{u-1},\tau_u}(X)\big) = \frac12 Z_{u-1,u}\otimes Z_{u-1,u}$. Therefore,
$$I^{\mathrm{Str}}_{\tau_\ell,\tau_k}(X) = I^{\mathrm{Str}}_{\ell,k}(Z) + \sum_{\ell+1\le u\le k}\Big(I^{\mathrm{Str}}_{\tau_{u-1},\tau_u}(X) - \tfrac12 Z_{u-1,u}\otimes Z_{u-1,u}\Big) = I^{\mathrm{Str}}_{\ell,k}(Z) + \sum_{\ell+1\le u\le k} A_{\tau_{u-1},\tau_u}(X).$$
By Remark 2.2 of [6], it is enough to prove the convergence of $\mathbb X^{(n)}_t := \mathbb X^{(n)}_{0,t}$. By (10), we have for any $t \ge 0$

$$\frac1n\Big|\mathbb X^{(1)}_{nt} - \mathbb Z^{(1)}_{\kappa(nt)} - \sum_{1\le u\le\kappa(nt)} A_{\tau_{u-1},\tau_u}(X)\Big| = \frac1n\Big|\mathbb X^{(1)}_{\tau_{\kappa(nt)},nt} + X_{0,\tau_{\kappa(nt)}}\otimes X_{\tau_{\kappa(nt)},nt}\Big|$$

$$\le \frac1n\big|\Xi^{\otimes 2}_{\kappa(nt)}\big| + \frac1n\big|Z_{\kappa(nt)}\big|\otimes\big|\Xi_{\kappa(nt)}\big|.$$
Therefore

$$\sup_{0\le s,t\le T}\frac1n\Big|\mathbb X^{(1)}_{ns,nt} - \mathbb Z^{(1)}_{\kappa(ns),\kappa(nt)} - \sum_{\kappa(ns)<u\le\kappa(nt)} A_{\tau_{u-1},\tau_u}(X)\Big|$$

$$\le 2\sup_{0\le t\le T}\frac1n\Big(\big|\Xi^{\otimes 2}_{\kappa(nt)}\big| + \big|Z_{\kappa(nt)}\big|\otimes\big|\Xi_{\kappa(nt)}\big|\Big).$$

But $\sup_{0\le t\le T}\frac1n\big(|\Xi^{\otimes 2}_{\kappa(nt)}| + |Z_{\kappa(nt)}|\otimes|\Xi_{\kappa(nt)}|\big) \to 0$ in probability as $n\to\infty$. Indeed, we have already seen that $P\big(\|\Xi^i_{\kappa(n\cdot)}\|_{\infty,[0,T]} > \varepsilon\sqrt n\big) \to 0$ for any $\varepsilon > 0$, and for the second term

$$P\Big(\sup_{0\le t\le T}\big|Z_{\kappa(nt)}\otimes\Xi_{\kappa(nt)}\big| > \varepsilon n\Big) \le P\big(\|Z_{n\cdot}\|_{\infty,[0,T]} > R\sqrt n\big) + P\Big(\max_{0\le k\le nT}|\Xi_k| > \varepsilon\sqrt n/R\Big).$$
But for any $R > 0$ the right term vanishes as $n\to\infty$, while the left term converges to $P(\max_{0\le t\le T} c|W_t| > R)$, where $W$ is a Brownian motion and $c > 0$ is a constant, which vanishes as $R\to\infty$. By Slutsky's Theorem the convergence of $\mathbb X^{(n)}$ in distribution holds if
$$\frac1n\,\mathbb Z^{(1)}_{\kappa(n\cdot)} + \frac1n\sum_{1\le u\le\kappa(n\cdot)} A_{\tau_{u-1},\tau_u}(X) \xrightarrow[n\to\infty]{} \mathbb B_{0,\cdot} + \Gamma\,\cdot$$
in distribution in the uniform topology. To achieve the last convergence, first note that

$(A_{\tau_{u-1},\tau_u}(X))_{u\in\mathbb N}$ are independent random matrices with the same law for $u \ge 2$. Note also that $|A_{\tau_{u-1},\tau_u}(X)| \le 4\,|\Xi^{\otimes 2}_{u-1}|\,T_{u-1}$, which implies that $E[|A_{\tau_{u-1},\tau_u}(X)|] < C$ for $u = 1, 2$ by Assumption 1.4. Hence the weak law of large numbers for the sum yields

$$\frac1n\sum_{1\le k\le n} A_{\tau_{k-1},\tau_k}(X) \xrightarrow[n\to\infty]{} E\big[A_{\tau_1,\tau_2}(X)\big]$$

in probability. Together with the convergence in probability $\frac{\kappa(nt)}{nt} \to E[T_1]^{-1}$, which is a consequence of Assumption 1.4 with $p = 0$, we deduce that
$$\frac1n\sum_{1\le k\le\kappa(nt)} A_{\tau_{k-1},\tau_k}(X) \xrightarrow[n\to\infty]{} t\,\frac{E[A_{\tau_1,\tau_2}(X)]}{E[T_1]} = t\,\Gamma \quad\text{in probability}.$$
Moreover, since $E[|A_{\tau_1,\tau_2}(X)|]/E[T_1] < \infty$ we have

$$\Big\|\frac1n\sum_{1\le k\le\kappa(n\cdot)} A_{\tau_{k-1},\tau_k}(X) - \Gamma\,\cdot\Big\|_{\infty,[0,T]} \xrightarrow[n\to\infty]{} 0 \quad\text{in probability}.$$

Using Slutsky’s Theorem again together with (6) it is left to show that

$$\big\|\mathbb Z^{(n)}_{\kappa(n\cdot)/n} - \mathbb Z^{(n)}_{\cdot\beta}\big\|_{\infty,[0,T]} \xrightarrow[n\to\infty]{} 0 \quad\text{in probability}.$$
As for the case of the first level, we use (3) to reduce the last convergence to showing that
$$\big\|\mathbb Z^{(n)}_{\kappa(n\cdot)/n} - \mathbb Z^{(n)}_{\cdot\beta}\big\|_{\infty,[0,T]}\,\mathbf 1_{\{\max_{0\le m\le Tn}|\kappa(m)-m\beta|\le R\sqrt n\}} \xrightarrow[n\to\infty]{} 0 \quad\text{in probability}.$$
Fix $\varepsilon > 0$. Then
$$P\Big(\max_{0\le k,m\le T\beta n,\ |k-m|\le R\sqrt n}\big|\mathbb Z^{i,j}_{0,m} - \mathbb Z^{i,j}_{0,k}\big| > \varepsilon n\Big) \lesssim P\Big(\max_{0\le k,m\le T\beta n,\ |k-m|\le R\sqrt n}\big|Z^i_{k,m}\big|\,\big|Z^j_{0,k}\big| > \varepsilon n\Big)$$
$$\le P\Big(\max_{0\le k,m\le T\beta n,\ |k-m|\le R\sqrt n}\big|Z^i_{k,m}\big| > \sqrt{\varepsilon n}/R\Big) + P\Big(\max_{0\le k\le nT}\big|Z^j_{0,k}\big| > \sqrt{\varepsilon n}\,R\Big)$$
$$\le \frac{\sqrt n}{R}\,P\Big(\max_{0\le k\le R\sqrt n}\big|Z^i_{0,k}\big| > \sqrt{\varepsilon n}/R\Big) + P\Big(\max_{0\le k\le nT}\big|Z^j_{0,k}\big| > \sqrt{\varepsilon n}\,R\Big).$$
Now, for any fixed $R$ the first term converges to zero by (8), while the limsup of the right term is bounded by $P(\max_{0\le t\le T} c|W_t| > \sqrt\varepsilon R)$, which tends to zero as $R\to\infty$. Therefore we have proved the convergence in the uniform topology.

To end, we shall now prove tightness of the p/2-variation norms, $p > 2$. As in the estimate for the first level, we observe that for $0 = t_0 < t_1 < \dots < t_m = T$,

$$\sum_{r=1}^{m}\Big|\sum_{\kappa(nt_{r-1})<u\le\kappa(nt_r)} A^{i,j}_{\tau_{u-1},\tau_u}(X)\Big| \le \sum_{0<u\le\kappa(nT)}\big|A^{i,j}_{\tau_{u-1},\tau_u}(X)\big|,$$

which implies that $E\Big[\frac1n\big\|\sum_{\kappa(ns)<u\le\kappa(nt)} A_{\tau_{u-1},\tau_u}(X)\big\|_{1,[0,T]}\Big] \lesssim 1$. Also,

$$\Big|\mathbb X^{(1)}_{ns,nt} - \mathbb Z^{(1)}_{\kappa(ns),\kappa(nt)} - \sum_{\kappa(ns)<u\le\kappa(nt)} A_{\tau_{u-1},\tau_u}(X)\Big| \le \big|\Xi^{\otimes 2}_{\kappa(ns),\kappa(nt)}\big| + \big|Z_{\kappa(ns),\kappa(nt)}\big|\otimes\big|\Xi_{\kappa(ns),\kappa(nt)}\big|,$$
and so

$$E\Big[\frac1n\Big\|\,\mathbb X^{(1)}_{ns,nt} - \mathbb Z^{(1)}_{\kappa(ns),\kappa(nt)} - \sum_{\kappa(ns)<u\le\kappa(nt)} A_{\tau_{u-1},\tau_u}(X)\Big\|_{1,[0,T]}\Big] \lesssim 1.$$

Observe that $E\big[\|\mathbb Z^{(n)}\|_{p/2,[0,T]}\big] \lesssim 1$, see (14). Using the triangle inequality together with the fact that the p-variation norms are monotonically decreasing in $p$, $E\big[\|\mathbb X^{(n)}\|_{p/2,[0,T]}\big] \lesssim 1$, as desired. ✷

Acknowledgements. The author is grateful to Tommaso Cornelis Rosati and Willem van Zuijlen for numerous valuable discussions. This work was supported by the German Research Foundation (DFG) via Research Unit FOR2402.

A Key renewal theorem

In this section we show that the moment condition of Assumption 1.4 is asymptotically equivalent to a second moment condition on Ξκ(n).

Theorem A.1 (Key renewal theorem) Assume that $p_j \ge 0$, $j \in \mathbb N_0$, $p_0 = 0$, $\sum_{j\in\mathbb N_0} p_j = 1$, so that $\gcd\{j : p_j > 0\} = d \in \mathbb N$. If $(b_n)_{n\in\mathbb N_0}$ is a summable sequence of non-negative real numbers, then the equation

$$a_n = \sum_{m=0}^{n} b_{n-m}\,u_m \qquad (11)$$

b m N0 m lim an = ∈ , (12) n P jp →∞ j N0 j P ∈ k where um = k N p∗ (m) is the k-fold convolution of p evaluated at m, that is ∈ 0 k P P k p∗ (m)= j=1 Tj = m , where (Tk)k>2 is a sequence of independent random   variables so thatP Tk, k > 2 all have the same probability mass function p. Moreover, This to be understood even when N jpj = , in which case the limit on the right side of j 0 ∞ (12) is 0. P ∈ (To note about delay: T1 has a different distribution.)

Lemma A.2 Let $\Xi_k$ be as defined in Assumption 1.4. Then for $r, \ell \in \mathbb N_0$,
$$E\big[|\Xi_{\kappa(n)}^{\otimes r}|\,T_{\kappa(n)}^{\ell}\big] \xrightarrow[n\to\infty]{} \frac{E\big[|\Xi_2^{\otimes r}|\,T_2^{\ell+1}\big]}{E[T_2]},$$
whenever the right hand side is finite.

Proof Let $b_n = E\big[|\Xi_2^{\otimes r}|\,T_2^{\ell}\,\mathbf 1_{T_2>n}\big]$. Then
$$\sum_{n\in\mathbb N_0} b_n = \sum_{n\in\mathbb N_0}\sum_{k>n} E\big[|\Xi_2^{\otimes r}|\,k^{\ell}\,\mathbf 1_{T_2=k}\big] = \sum_{k\in\mathbb N} k\,E\big[|\Xi_2^{\otimes r}|\,k^{\ell}\,\mathbf 1_{T_2=k}\big] = E\big[|\Xi_2^{\otimes r}|\,T_2^{\ell+1}\big].$$

By the Key Renewal Theorem (Theorem A.1) there is a unique solution $(a_n)_{n\in\mathbb N_0}$ to the equation (11), and it satisfies the limit in (12). By the last computation the right hand side of (12) is

exactly the one in the wanted assertion. It is therefore enough to show that $a_n = E\big[|\Xi_{\kappa(n)}^{\otimes r}|\,T_{\kappa(n)}^{\ell}\big]$. Indeed,

$$E\big[|\Xi_{\kappa(n)}^{\otimes r}|\,T_{\kappa(n)}^{\ell}\big] = \sum_{k\in\mathbb N_0} E\big[|\Xi_k^{\otimes r}|\,T_k^{\ell}\,\mathbf 1_{\kappa(n)=k}\big] = \sum_{k\in\mathbb N_0} E\big[|\Xi_k^{\otimes r}|\,T_k^{\ell}\,\mathbf 1_{\tau_k\le n,\ \tau_k+T_k>n}\big]$$
$$= \sum_{k\in\mathbb N_0}\sum_{m=0}^{n} E\big[|\Xi_k^{\otimes r}|\,T_k^{\ell}\,\mathbf 1_{T_k>n-m}\,\mathbf 1_{\tau_k=m}\big] = \sum_{k\in\mathbb N_0}\sum_{m=0}^{n} E\big[|\Xi_k^{\otimes r}|\,T_k^{\ell}\,\mathbf 1_{T_k>n-m}\big]\,P[\tau_k=m]$$
$$= \sum_{m=0}^{n} E\big[|\Xi_2^{\otimes r}|\,T_2^{\ell}\,\mathbf 1_{T_2>n-m}\big]\sum_{k\in\mathbb N_0} P[\tau_k=m] = \sum_{m=0}^{n} b_{n-m}\,u_m = a_n,$$
where in the fourth equality we used independence. ✷

Note that one could use the argument for $b_n$ of the form $E\big[f\big(\{X_{\tau_1,\tau_1+k}\}_{0\le k\le T_1}\big)\,g(T_1)\big]$, where $f, g$ are real functions, as long as $(b_n)$ is an absolutely summable sequence (of well-defined finite elements).
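As an illustration of the size-biasing phenomenon behind Lemma A.2 (the case $r = 0$, $\ell = 1$), the following sketch (ours; the inter-arrival law is an arbitrary aperiodic choice, and there is no delay, which does not affect the limit) compares the empirical mean length of the renewal interval covering a large fixed time with the size-biased limit $E[T^2]/E[T]$.

```python
import numpy as np

rng = np.random.default_rng(4)
support = np.array([1, 2, 3])
p_mass = np.array([0.5, 0.3, 0.2])      # aperiodic: gcd of the support is 1

def interval_covering(n):
    """Length T_{kappa(n)} of the renewal interval covering the fixed time n."""
    T = rng.choice(support, size=2 * n, p=p_mass)   # surely enough inter-arrivals
    tau = np.cumsum(T)
    k = np.searchsorted(tau, n, side='right')       # first k with tau_k > n
    return T[k]

n, reps = 1000, 20000
samples = np.array([interval_covering(n) for _ in range(reps)])

ET, ET2 = support @ p_mass, (support ** 2) @ p_mass
print("empirical E[T_{kappa(n)}]    :", samples.mean())
print("size-biased limit E[T^2]/E[T]:", ET2 / ET)   # Lemma A.2 with r = 0, l = 1
```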

B Slutsky’s Theorem in the Skorohod topology

Theorem B.1 Assume that $X^n, Y^n \in D([0,T], \mathbb R)$, the Skorohod space of càdlàg functions, or that $X^n, Y^n \in C([0,T], \mathbb R)$, so that $X^n \to X$ in distribution in $D([0,T], \mathbb R)$, where $X \in C([0,T], \mathbb R)$, and $\|Y^n - f\|_{\infty,[0,T]} \to 0$ in probability, where $f$ is a deterministic continuous function. Then $X^n + Y^n \to X + f$ in distribution in $D([0,T], \mathbb R)$, or in $C([0,T], \mathbb R)$, respectively.

Proof We shall cover the case $D = D([0,T], \mathbb R)$; the case of the uniform topology is similar. Let $\Phi \in C_b(D, \mathbb R)$ and fix $\eta > 0$. The proof is completed if $\limsup_{n\to\infty}\big|E[\Phi(X^n + Y^n) - \Phi(X + f)]\big| < \eta$. Define $\tilde\Phi(g) := \Phi(g + f)$. Note that $\tilde\Phi(g)$ is bounded and continuous in $D$ by the continuity of $f$. Now

$$\limsup_{n\to\infty}\big|E[\Phi(X^n+Y^n) - \Phi(X+f)]\big| \le \limsup_{n\to\infty}\big|E[\Phi(X^n+Y^n) - \Phi(X^n+f)]\big| + \limsup_{n\to\infty}\big|E[\Phi(X^n+f) - \Phi(X+f)]\big|$$
$$= \limsup_{n\to\infty}\big|E[\Phi(X^n+Y^n) - \Phi(X^n+f)]\big| + \limsup_{n\to\infty}\big|E[\tilde\Phi(X^n) - \tilde\Phi(X)]\big| = \limsup_{n\to\infty}\big|E[\Phi(X^n+Y^n) - \Phi(X^n+f)]\big|.$$
However, since the Skorohod distance $d$ on $D$ is controlled by the uniform distance, which is homogeneous, so that $d(g+h, f+h) \le \|f - g\|_{\infty,[0,T]}$ for any $h$, we have
$$\big|E[\Phi(X^n+Y^n) - \Phi(X^n+f)]\big| = \big|E\big[(\Phi(X^n+Y^n) - \Phi(X^n+f))\big(\mathbf 1_{d(X^n+Y^n,X^n+f)\ge\varepsilon} + \mathbf 1_{d(X^n+Y^n,X^n+f)<\varepsilon}\big)\big]\big|$$
$$\le 2\|\Phi\|_\infty\,P\big[d(X^n+Y^n, X^n+f) \ge \varepsilon\big] + \big|E\big[(\Phi(X^n+Y^n) - \Phi(X^n+f))\,\mathbf 1_{d(X^n+Y^n,X^n+f)<\varepsilon}\big]\big|$$
$$\le 2\|\Phi\|_\infty\,P\big[\|Y^n - f\|_{\infty,[0,T]} \ge \varepsilon\big] + \big|E\big[(\Phi(X^n+Y^n) - \Phi(X^n+f))\,\mathbf 1_{d(X^n+Y^n,X^n+f)<\varepsilon}\big]\big| \simeq \big|E\big[(\Phi(X^n+Y^n) - \Phi(X^n+f))\,\mathbf 1_{d(X^n+Y^n,X^n+f)<\varepsilon}\big]\big|.$$

By tightness of $X^n$ we have a compact $K_\eta$ so that $P(X^n \notin K_\eta) < \frac{\eta}{4\|\Phi\|_\infty}$. Note that also $\tilde K_\eta := K_\eta + f := \{g + f : g \in K_\eta\}$ is compact. Since $\Phi$ is continuous it is uniformly continuous on $\tilde K_\eta$. Therefore, there exists some $\varepsilon_0 > 0$ so that if $v \in \tilde K_\eta$ and $u \in D$ are so that $d(u,v) < \varepsilon_0$, then $|\Phi(u) - \Phi(v)| < \eta/2$. Note that $X^n \in K_\eta$ if and only if $X^n + f \in \tilde K_\eta$ and therefore, if we choose $\varepsilon \le \varepsilon_0$, we have
$$\big|E\big[(\Phi(X^n+Y^n) - \Phi(X^n+f))\,\mathbf 1_{d(X^n+Y^n,X^n+f)<\varepsilon}\big]\big| \le 2\|\Phi\|_\infty\,P(X^n \notin K_\eta) + \big|E\big[(\Phi(X^n+Y^n) - \Phi(X^n+f))\,\mathbf 1_{d(X^n+Y^n,X^n+f)<\varepsilon,\ X^n+f\in\tilde K_\eta}\big]\big|$$
$$< \eta/2 + (\eta/2)\,P\big(d(X^n+Y^n, X^n+f) < \varepsilon,\ X^n+f \in \tilde K_\eta\big) < \eta. \qquad ✷$$

C Rough Donsker Theorem

Let $Z_k = \sum_{\ell=1}^{k} Y_\ell$, $k \in \mathbb N_0$, be a delayed random walk on $\mathbb R^d$ starting at $Z_0 = 0$, that is, $(Y_\ell)_{\ell\in\mathbb N}$ are independent and have the same distribution for $\ell \ge 2$. Assume that $E[Y_\ell] = 0$ and $E[|Y_\ell|^2] < \infty$ for $\ell = 1, 2$. Let $Z^{(n)}$ and $\mathbb Z^{(n)}$ be defined as in (3); then (6) holds.

(n) (n) i,i (Zˆ Z ) E[Y2 Y2] · 0 in probability. − − ⊗ 2 ,[0,T ] −−−−→n ∞ →∞ The off-diagonal terms then follow by polarization. It is straight-forward to conclude (n) Z(n) Z BZ Z BItˆo 1 Z Z then the joint convergence of (Zt , t )06t6T to (B , )= B , + 2 [B ,B ]t d d d in the Skorohod topology on the space C([0,T ]; R R × ).  By Lemma 1.2 together with (2) we only need to show× the tightness of the sequence of (n) (n) random variables Z p,[0,T ] and Z p/2,[0,T ], for all p> 2. For the first level note first that k k k k E[ Z(n) 2 ] . [[Z(n),Z(n)] ] . T (E[Y 2]+ E[Y 2]) (13) k kp,[0,T ] T 2 1

13 by L´epingle’s p-variation inequality [20] combined with the BDG inequality. For tightness of the second level, we claim first that for the Itˆosum we have

E[ Zˆ(n) ] . [[Z(n),Z(n)] ] . T (E[Y 2]+ E[Y 2]). k kp/2,[0,T ] T 2 1 Indeed, this follows from Theorem 1.1 of [27], which is an off-diagonal version to the L´epingle p-variation BDG inequality (one can also use [25] or Proposition 3.8 of [6]). Since T E[ Z(n) Zˆ(n) ] 6 ([Y 2]+ E[Y 2]) < , k t − t k1,[0,T ] 2 1 2 ∞ we conclude that

$$E\big[\|\mathbb Z^{(n)}\|_{p/2,[0,T]}\big] \lesssim E\big[\|\hat{\mathbb Z}^{(n)}\|_{p/2,[0,T]}\big] + E\big[\|\mathbb Z^{(n)} - \hat{\mathbb Z}^{(n)}\|_{p/2,[0,T]}\big] \lesssim T\big(E[|Y_2^{\otimes 2}|] + E[|Y_1^{\otimes 2}|]\big), \qquad (14)$$
and the proof is complete. ✷
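The relation between the Stratonovich-type and Itô-type discrete second levels used above can be checked numerically; the following sketch (ours, purely illustrative) verifies that their difference at time $T$ concentrates around $\tfrac T2 E[Y\otimes Y]$ for i.i.d. standard normal jumps.

```python
import numpy as np

rng = np.random.default_rng(5)
n, T, d = 20000, 1.0, 2
Y = rng.standard_normal((n, d))                  # i.i.d. centered jumps
Z = np.vstack([np.zeros((1, d)), np.cumsum(Y, axis=0)])

# Ito and Stratonovich second-level sums at t = T (no interpolation term at integer nt).
ito = np.einsum('ki,kj->ij', Z[:-1], Y) / n
strat = ito + 0.5 * np.einsum('ki,kj->ij', Y, Y) / n

print("(Z^(n) - Zhat^(n))_T       :\n", strat - ito)
print("target (T/2) * E[Y (x) Y]  :\n", 0.5 * T * np.eye(d))
```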

References

[1] Patrick Billingsley. Convergence of probability measures, second edition. John Wiley & Sons, New York, 1999.
[2] Emmanuel Breuillard, Peter K. Friz, and Martin Huesmann. From random walks to rough paths. Proceedings of the American Mathematical Society, 137(10):3487–3496, 2009.
[3] Ilya Chevyrev. Random walks and Lévy processes as rough paths. Probability Theory and Related Fields, 170(3-4):891–932, 2018.
[4] Ilya Chevyrev and Peter K. Friz. Canonical RDEs and general semimartingales as rough paths. The Annals of Probability, 47(1):420–463, 2019.
[5] Laure Coutin and Antoine Lejay. Semi-martingales and rough paths theory. Electronic Journal of Probability, 10:761–785, 2005.
[6] Jean-Dominique Deuschel, Tal Orenshtein, and Nicolas Perkowski. Additive functionals as rough paths. To appear in The Annals of Probability. ArXiv preprint arXiv:1912.09819, 2019.
[7] Monroe D. Donsker. An invariance principle for certain probability limit theorems. AMS, 1951.
[8] Peter K. Friz and Huilin Zhang. Differential equations driven by rough paths with jumps. J. Differential Equations, 264(10):6226–6301, 2018.
[9] Peter K. Friz and Martin Hairer. A course on rough paths: with an introduction to regularity structures. Springer, 2014.
[10] Peter K. Friz and Nicolas B. Victoir. Multidimensional stochastic processes as rough paths: theory and applications, volume 120. Cambridge University Press, 2010.

[11] Johann Gehringer and Xue-Mei Li. Homogenization with fractional random fields. ArXiv preprint arXiv:1911.12600, 2019.
[12] Johann Gehringer and Xue-Mei Li. Functional limit theorems for the fractional Ornstein–Uhlenbeck process. Journal of Theoretical Probability, pages 1–31, 2020.
[13] Johann Gehringer and Xue-Mei Li. Rough homogenisation with fractional dynamics. ArXiv preprint arXiv:2011.00075, 2020.
[14] Boris Vladimirovich Gnedenko. Limit distributions for sums of independent random variables. 1968.
[15] Satoshi Ishiwata, Hiroshi Kawabi, and Ryuya Namba. Central limit theorems for non-symmetric random walks on nilpotent covering graphs: part I. ArXiv preprint arXiv:1806.03804, 2018.
[16] Satoshi Ishiwata, Hiroshi Kawabi, and Ryuya Namba. Central limit theorems for non-symmetric random walks on nilpotent covering graphs: part II. ArXiv preprint arXiv:1808.08856, 2018.
[17] Thomas G. Kurtz and Philip Protter. Weak limit theorems for stochastic integrals and stochastic differential equations. Ann. Probab., 19(3):1035–1070, 1991.
[18] Antoine Lejay and Terry Lyons. On the importance of the Lévy area for studying the limits of functions of converging stochastic processes. Application to homogenization. In Current Trends in Potential Theory, volume 7. The Theta Foundation/American Mathematical Society, 2003.
[19] Antoine Lejay. On the convergence of stochastic integrals driven by processes converging on account of a homogenization property. Electronic Journal of Probability, 7, 2002.
[20] Dominique Lépingle. La variation d'ordre p des semi-martingales. Z. Wahrscheinlichkeitstheorie und Verw. Gebiete, 36(4):295–316, 1976.
[21] Olga Lopusanschi and Tal Orenshtein. Ballistic random walks in random environment as rough paths: convergence and area anomaly. ArXiv preprint arXiv:1812.01403, 2018.
[22] Olga Lopusanschi and Damien Simon. Area anomaly and generalized drift of iterated sums for hidden Markov walks. ArXiv preprint arXiv:1709.04288, 2017.
[23] Olga Lopusanschi and Damien Simon. Lévy area with a drift as a renormalization limit of Markov chains on periodic graphs. Stochastic Processes and their Applications, 2017.
[24] Ryuya Namba. A remark on a central limit theorem for non-symmetric random walks on crystal lattices. ArXiv preprint arXiv:1701.08488, 2017.
[25] Peter K. Friz and Pavel Zorin-Kranich. Rough semimartingales and p-variation estimates for martingale transforms. ArXiv preprint arXiv:2008.08897, 2020.

[26] Christophe Sabot and Laurent Tournier. Random walks in Dirichlet environment: an overview. Annales de la Faculté des sciences de Toulouse: Mathématiques, Ser. 6, 26(2):463–509, 2017.
[27] Pavel Zorin-Kranich. Weighted Lépingle inequality. Bernoulli, 26(3):2311–2318, 2020.
