Area anomaly and generalized drift of iterated sums for hidden Markov walks Olga Lopusanschi, Damien Simon

To cite this version:

Olga Lopusanschi, Damien Simon. Area anomaly and generalized drift of iterated sums for hidden Markov walks. 2017. hal-01586794

HAL Id: hal-01586794 https://hal.archives-ouvertes.fr/hal-01586794 Preprint submitted on 13 Sep 2017


Distributed under a Creative Commons Attribution - NonCommercial - NoDerivatives 4.0 International License

Area anomaly and generalized drift of iterated sums for hidden Markov walks

Olga Lopusanschi & Damien Simon September 13, 2017

Abstract. Following our result from [13], we study the convergence in rough path topology of a certain class of discrete processes, the hidden Markov walks, to a Brownian motion with an area anomaly. This area anomaly, which is a new object, keeps track of the time-correlations of the discrete models and brings to light the question of embeddings of discrete processes into continuous time. We also identify an underlying combinatorial structure in the hidden Markov walks, which turns out to be a generalization of the occupation time from the classical ergodic theorem, in the spirit of rough paths. Finally, we construct a rough path out of the iterated sums of a hidden Markov walk, compare it to the classical canonical lift of the piecewise linear interpolation of the process, and analyse its convergence to a non-geometric rough path.

Contents

1 Introduction
  1.1 On the importance of discrete models in the rough paths theory
  1.2 Structure of the article and notations

2 Main results and map of the paper
  2.1 Rough paths models from microscopic models
    2.1.1 Main theorems: rough paths from hidden Markov walks
    2.1.2 An easy example: the diamond and round-point models
  2.2 Combinatorial structure behind the iterated sums of HMW
    2.2.1 Iterated sums and non-geometric rough paths
    2.2.2 Iterated occupation times of HMW as underlying combinatorial structures of rough paths

3 From Hidden Markov Walks to rough paths
  3.1 Theory of pseudo-excursions for hidden Markov walks
  3.2 Elements of rough paths theory
  3.3 Embeddings
  3.4 Getting back to theorem 2.3

4 Iterated structures behind discrete time and discrete space Markov chains
  4.1 Shuffle and quasi-shuffle products
  4.2 From geometric to non-geometric rough paths through hidden Markov walks
    4.2.1 Geometric rough paths and shuffle products
    4.2.2 A discrete construction for non-geometric rough paths
  4.3 Getting back to proposition 2.1

5 Open questions

1 Introduction

1.1 On the importance of discrete models in the rough paths theory.

From continuous to discrete setting. Rough paths theory was introduced by T. Lyons in 1997 (see, for example, [15]) in order to provide a deterministic setting for stochastic differential equations (SDEs) of the type

dy_s = f(y_s)[dx_s]    (1)

where (y_s)_s is a path in ℝ^{d'}, (x_s)_s is a path in ℝ^d of Hölder regularity α < 1 (which is often the case for stochastic processes) and f : ℝ^{d'} → End(ℝ^d, ℝ^{d'}). Whenever the classical Young integration [16] fails (which is the case for α < 1/2), paths may be lifted (in a non-unique way) to a larger, more abstract space, the space of rough paths, for which existence and uniqueness of a solution map become easier to prove.

The idea behind this theory relies on the Taylor expansion: if we want to suitably approximate (1), the trajectory level is not enough to register all the relevant information when passing to the limit. Thus, we first have to make an "expansion" of the path (x_t)_t by constructing, following certain rules, the associated rough path (𝐱_t)_t, which is a continuous path in V ⊕ V^{⊗2} ⊕ ... ⊕ V^{⊗⌊1/α⌋}, and then to suitably approximate it.

A particularity of the rough paths theory is that it has been developed in a continuous setting (we approximate continuous processes by other continuous processes that are smoother). Since the main goal of the rough paths theory is to give a general method for solving SDEs, the exploration of the discrete setting may seem irrelevant other than for SDE approximation (see [3]). And yet new directions of research can be found in this domain, as appears, for example, in [12], where the authors study discrete rough paths. We believe that developing a theory of discrete models in the rough path setting can enrich the classical theory with new tools and results. The present paper concentrates on two issues related to this field: the ways of constructing a rough path out of a discrete model, and the explicit construction of new objects arising

at the limit in rough path topology, as well as the role they may play in classical stochastic calculus.

Analysing rough paths issues through discrete models. A finite-variation path γ : [0, T] → V can be canonically lifted to a rough path using the sequence of iterated integrals:

S_{γ,n}(t) = ∫_{0<s_1<s_2<...<s_n<t} dγ(s_1) ⊗ dγ(s_2) ⊗ ... ⊗ dγ(s_n) ∈ V^{⊗n},  t ∈ [0, T]    (2)

For any N ≥ 2, the corresponding canonical lift coincides with the step-N signature of γ given by

S_N(γ)_{0,t} = (1, S_{γ,1}(t), ..., S_{γ,N}(t)) ∈ T_1^{(N)}(V),  t ∈ [0, T]    (3)

where T_1^{(N)}(V) = {(a_0, ..., a_N) ∈ ⊕_{k=0}^{N} V^{⊗k} : a_0 = 1}, with the convention V^{⊗0} = ℝ. For α-Hölder paths with α < 1, such integrals may not be well-defined, and some of the values of the vectors S_{γ,n}(t) may be postulated, provided that they satisfy certain algebraic relations.

The absence of a canonical lift of Hölder paths to a larger space leads to some natural questions at the basis of the present paper. The first one, answered since the very beginning of rough paths theory, deals with how classical stochastic integrals (Itô and Stratonovich) fit within this theory. The set of rough paths separates into two categories, one for each type of integration:

∙ geometric rough paths, whose components satisfy the chain rule, for Stratonovich integration;

∙ non-geometric rough paths, whose components do not satisfy it, for Itô integration.

Based on discrete sequences, we give explicit constructions for both types of integrals as limit processes and discuss the link with geometric and non-geometric rough paths. In particular, we will see how discrete constructions can give rise to non-geometric rough paths as limit processes.

The second issue consists in understanding the nature of the non-trivial objects in the space of rough paths; for example, the two-dimensional area bubbles [11] obtained as the limit of the paths:

γ^{(n)}(t) = ( (1/√n) cos(nt), (1/√n) sin(nt) )

have a well-defined rough path limit different from the constant path γ(t) = (0, 0) (which would be the limit in the uniform topology). In this case, S_{γ,1}(t) = 0 but S_{γ,2}(t) = (t/2) e_1 ∧ e_2, with e_1 and e_2 vectors of the canonical basis of ℝ².
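This computation is easy to check numerically. The sketch below (plain numpy; all helper names are ours, and the Lévy area is discretized with a midpoint rule) shows that the signed area of γ^{(n)} up to time t stays close to t/2 while the path itself shrinks like 1/√n:

```python
import numpy as np

def levy_area(path):
    """Signed area (1/2) * integral(x dy - y dx) of a discretized planar path."""
    x, y = path[:, 0], path[:, 1]
    dx, dy = np.diff(x), np.diff(y)
    xm, ym = x[:-1] + dx / 2, y[:-1] + dy / 2   # midpoint rule
    return 0.5 * np.sum(xm * dy - ym * dx)

def bubble(n, t, steps=100_000):
    """The path gamma^(n)(s) = (cos(ns)/sqrt(n), sin(ns)/sqrt(n)), s in [0, t]."""
    s = np.linspace(0.0, t, steps)
    return np.column_stack((np.cos(n * s), np.sin(n * s))) / np.sqrt(n)

for n in (10, 100, 1000):
    path = bubble(n, t=1.0)
    # increments shrink like 1/sqrt(n), while the area stays close to t/2
    print(n, np.linalg.norm(path[-1] - path[0]), levy_area(path))
```

This makes the "area bubble" phenomenon concrete: the first-level information vanishes at the limit while the second-level (area) information does not.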

The third (and central) question is whether such non-trivial limits can coexist with the classical structure of stochastic integrals. We give a positive answer to this question by studying well-chosen discrete models with suitable time-correlations. In [13], we have constructed a class of processes on a finite-dimensional vector space V (Markov chains on periodic graphs) that converge, under certain conditions, in rough path topology (see section 3.2 for details) to

(S_{B,1}(t), S_{B,2}(t) + Γt)_{t∈[0,1]}    (4)

where (B_t)_t is a standard Brownian motion on the vector space V and Γ a deterministic antisymmetric matrix called the area anomaly. While S_{B,1}(t) and S_{B,2}(t) are obtained with classical stochastic calculus, Γ is a new object. The present paper provides a much more general construction of such limits. The main idea relies on the following: if one wants to be able to control Γ in a limit of the type (4), one has to modify the underlying discrete models in the following way:

∙ one may try to create area-bubble-like corrections to the trajectories;

∙ these corrections have to remain small so that S_{B,1} does not change;

∙ the corrected trajectories have to satisfy the Markov property.

Our solution consists in

∙ introducing time-correlations in the discrete processes, tuned so that they affect only S_{B,2} after renormalization, through the framework of the hidden Markov walks, of which the Markov chains on periodic graphs are an example, and

∙ showing how the area anomaly depends on the way we embed the discrete paths in the set of continuous paths.

This last point brings to light the different role played by embeddings in the uniform and rough path topologies. In the first case, an embedding is merely a way to make sense of the convergence of a discrete path, and one usually uses linear interpolation by default. In the second case, it is a means to influence and even change the nature of the limit through the non-trivial object which is the area anomaly. We also show how the convergence of any embedding can be studied through the convergence of a discrete sequence in T_1^{(2)}(V) (more precisely, in one of its subgroups: G²(V)). This allows, in particular, a classification of the embeddings of a given discrete process based on the Γ they provide at the limit.

Another important question concerns the combinatorial structures that can be derived from rough paths built out of hidden Markov walks. While the algebraic structure of rough paths, based on shuffle products, provides nice combinatorial interpretations of iterated integrals, the iterated sums have more intricate multiplication rules, which can be interpreted using quasi-shuffle products. This subject is discussed in section 4.

Acknowledgements. D. S. is partially funded by the Grant ANR-14CE25-0014 (ANR GRAAL). D. S. and O. L. thank Lorenzo Zambotti for the stimulating discussions and useful suggestions.

1.2 Structure of the article and notations.

General structure of the article. The present article is divided into five main sections. The first two sections give a general presentation of our work as follows:

∙ in the present section 1, we describe the motivations and the main goals of our work;

∙ in section 2, we state our main results and give some additional comments and illustrations.

The next two sections are more technical:

∙ section 3 is dedicated to the convergence of rough paths constructed out of hidden Markov walks: we give a detailed description of the framework and a proof of our main theorem (section 3.4);

∙ in section 4, we explore the combinatorial properties of hidden Markov chains and the associated rough paths through iterated occupation times and their link with shuffle and quasi-shuffle products.

The last section 5 is dedicated to further research perspectives and open questions.

Notations. Throughout this paper, the following objects will often appear.

∙ 풞^{1-var}([0, T], B) is the space of functions with bounded variation valued in a Banach space B.

∙ V denotes a finite-dimensional vector space.

∙ E denotes a finite state space (for Markov chains).

∙ T_1^{(k)}(V) is the graded vector space:

T_1^{(k)}(V) = V ⊕ ... ⊕ V^{⊗k}

and G²(V) a subgroup of T_1^{(2)}(V) defined by:

∀(a, b) ∈ G²(V),  Sym(b) = (1/2)(a ⊗ a)    (5)

(the symmetric part of the level V ⊗ V depends entirely on the level V of the object).

∙ Instead of (1, a_1, ..., a_k) ∈ T_1^{(k)}(V), we will write (a_1, ..., a_k) ∈ ⊕_{l=1}^{k} V^{⊗l}.

∙ The operator "⊗" can denote:
  - a tensor product of two elements of V (in the formula V ⊗ V, for example);
  - an operation on T_1^{(2)}(V): (a, b) ⊗ (a', b') = (a + a', b + b' + a ⊗ a'), where the operation a ⊗ a' is in the sense of a tensor product.

∙ The operator "∧" is an antisymmetric law on G²(V): (a, b) ∧ (a', b') = (a + a', b + b' + (1/2)(a ⊗ a' − a' ⊗ a)).

∙ δ_ε: the standard dilatation operator on G²(V): δ_ε(a, b) = (εa, ε²b).
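For V = ℝ^d these operations are plain matrix formulas. The following minimal numpy sketch (our own encoding, not from the paper) implements them and checks two facts used implicitly throughout: the product "⊗" preserves the constraint (5) defining G²(V), and the dilatation δ_ε is a group homomorphism:

```python
import numpy as np

def g2_tensor(x, y):
    """(a, b) (x) (a', b') = (a + a', b + b' + a (x) a') on T_1^(2)(V)."""
    (a, b), (ap, bp) = x, y
    return (a + ap, b + bp + np.outer(a, ap))

def dilate(eps, x):
    """delta_eps(a, b) = (eps a, eps^2 b)."""
    a, b = x
    return (eps * a, eps ** 2 * b)

def sym(m):
    return (m + m.T) / 2

def lift(a):
    """Canonical lift (a, a (x) a / 2) of a straight segment; it satisfies (5)."""
    return (a, np.outer(a, a) / 2)

rng = np.random.default_rng(0)
x, y = lift(rng.normal(size=3)), lift(rng.normal(size=3))
z = g2_tensor(x, y)

# the product of two G^2(V) elements still satisfies Sym(b) = (a (x) a)/2:
assert np.allclose(sym(z[1]), np.outer(z[0], z[0]) / 2)

# delta_eps(x (x) y) = delta_eps(x) (x) delta_eps(y):
l, r = dilate(0.3, z), g2_tensor(dilate(0.3, x), dilate(0.3, y))
assert np.allclose(l[0], r[0]) and np.allclose(l[1], r[1])
```

Note that the antisymmetric law "∧" deliberately discards the symmetric second-level information, which is why the paper uses it when only the signed area matters.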

2 Main results and map of the paper

2.1 Rough paths models from microscopic models.

2.1.1 Main theorems: rough paths from hidden Markov walks.

Definition of hidden Markov walks. The Donsker theorem, generalized to rough paths theory in [2], shows that random walks do not leave any room for area anomaly: area bubbles are killed by the independence between random variables at each time step. Following [13], we introduce the framework of hidden Markov walks (HMW), whose time-correlations, as has already been mentioned, allow additional area accumulation on the second level of the corresponding rough paths. We first give a definition of hidden Markov chains (HMC), which are the "bricks" from which hidden Markov walks are constructed:

Definition 2.1 (the hidden Markov chain). Let E be a finite state space and V a finite-dimensional vector space. A hidden Markov chain is a discrete process (R_n, F_n)_n on E × V such that R = (R_n)_n is a Markov chain on E and, under the conditional law P(·|σ(R)), the variables F_n are independent and

∀k ≥ 1, ∀u ∈ E,  P(F_k | R_k = u) = P(F_1 | R_1 = u)

Hidden Markov walks are constructed out of hidden Markov chains just as simple random walks can be constructed as sums of i.i.d. variables:

Definition 2.2 (the hidden Markov walk). Let E be a finite state space, V a finite-dimensional vector space and (R_n, F_n)_n a hidden Markov chain on E × V. For all n ≥ 1, set X_n = Σ_{k=1}^{n} F_k. The discrete process (R_n, X_n)_n on E × V is called a hidden Markov walk.

Remarks.

∙ Throughout the paper, the transition matrix of the process (푅푛) will be written 푄 and the conditional law of 퐹푘 = 푋푘 −푋푘−1 knowing 푅푘 = 푢 ∈ 퐸 will be written 휈(∙|푢) and expectations under this law E휈 [∙|푢].

∙ The process (X_n)_n may take values in a more general space than V (and we will encounter such examples throughout the paper). The only thing that changes in this case is the definition of the increments of the process (see property 3.1).
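As a concrete illustration of definitions 2.1 and 2.2, here is a minimal simulation sketch. The transition matrix Q and the conditional laws ν(·|u) below are illustrative choices of ours, not taken from the paper; for brevity ν(·|u) is a point mass, so F_k is a deterministic function of R_k:

```python
import numpy as np

rng = np.random.default_rng(1)
E = 4                                      # state space {0, 1, 2, 3}
Q = np.full((E, E), 1.0 / E)               # transition matrix of (R_n)
steps = np.array([[1, 0], [0, 1], [-1, 0], [0, -1]])   # nu(.|u) = delta at steps[u]

def hidden_markov_walk(n, r0=0):
    """Simulate (R_k, F_k, X_k), k = 1..n, as in definitions 2.1-2.2."""
    R = np.empty(n, dtype=int)
    R[0] = r0
    for k in range(1, n):
        R[k] = rng.choice(E, p=Q[R[k - 1]])
    F = steps[R]                 # F_k depends only on R_k (definition 2.1)
    X = np.cumsum(F, axis=0)     # X_n = F_1 + ... + F_n (definition 2.2)
    return R, F, X

R, F, X = hidden_markov_walk(1000)
```

With a genuinely random ν(·|u) one would draw F_k conditionally on R_k instead of reading it off a table; the structure of the simulation is unchanged.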

A Donsker-type result. The following theorem is a result on convergence of rough paths constructed by piecewise linear interpolation (like in the classical Donsker theorem) out of HMW. It is a generalization of our result from [13] to a larger class of processes. We define the excursion times of the process (푅푘)푘 as:

푇0 = 0 (6)

∀푛 ≥ 1, 푇푛 = inf{푘 > 푇푛−1 : 푅푘 = 푅0} (7)

Theorem 2.1. Let (R_n, X_n)_n be a hidden Markov walk on E × V such that (R_n)_n is irreducible and the increments of (X_n)_n are a.s. uniformly bounded, i.e.

∃K > 0, ∀j ∈ ℕ, |F_j|_V ≤ K a.s.    (8)

Set β = E[T_1]. Under the conditions E[X_{T_1}] = 0 and E[X_{T_1}^{⊗2}] = C I_d for a certain C > 0, the rough path of the piecewise linear interpolation of (X_k)_{k≤N}, renormalized by δ_{(β^{-1}NC)^{-1/2}}, converges in the rough path topology 풞^α([0, 1], G²(V)) for α < 1/2 to a rough path whose levels are given by:

S_{B,1}(t) = B_t

S_{B,2}(t) = ∫_{0<s_1<s_2<t} ∘dB_{s_1} ⊗ ∘dB_{s_2} + Γt

where (B_t)_t is a standard d-dimensional Brownian motion and Γ is a deterministic antisymmetric matrix, the area anomaly, given by

Γ = (1/2) C^{-1} E[ Σ_{1≤k_1<k_2≤T_1} (F_{k_1} ⊗ F_{k_2} − F_{k_2} ⊗ F_{k_1}) ]    (9)

Remarks.

∙ The condition (8) requires an a.s. uniform bound for the F_j s, whereas the classical Donsker theorem (and its rough path variant from [2]) requires a uniform bound for the moments. On one hand, most of the concrete models to which theorem 2.1 applies satisfy (8); on the other hand, as we will see further on, the variables whose role is equivalent to that of the i.i.d. random variables from the classical Donsker theorem are, in our case, the pseudo-excursions.

∙ The definition of β implies that it may depend on the initial law of (R_n)_n, whereas the limit does not. The reason for this is simple: β registers the slowing down in time of the limit process which, as we will see, depends on the length of an excursion of (R_n)_n.

∙ We can actually always suppose the covariance matrix of 𝐗^{(1)}_{T_1} to be equal to C I_d for a certain C > 0 in theorem 2.3 (idem for the covariance matrix of X_{T_1} in theorems 2.1 and 2.2). The reasons for this are similar to those explained in the remark immediately following corollary 2.1 from [13]: we can embed the process (𝐗^{(1)}_n)_n in a smaller space than V, and afterwards use a linear transformation to get the identity matrix modulo a constant.

∙ We can still get a result similar to theorem 2.1 if the covariance matrix of X_{T_1} is not diagonal. The difference is that (B_t)_t will have a covariance matrix depending on X_{T_1} and will not be a standard Brownian motion anymore (see proposition 2.1).

∙ The second-level drift is actually the stochastic area of an excursion modulo a multiplicative constant, i.e. Γ = C^{-1} E[A_{T_1}(X)].

∙ S_{B,2}(t) can be decomposed into a symmetric part, given by (1/2)(B_t ⊗ B_t), and an antisymmetric part, given by 풜_t + Γt, where (풜_t)_t is the Lévy area of (B_t)_t, i.e.

풜_t = (1/2) ∫_{0<s_1<s_2<t} (dB_{s_1} ⊗ dB_{s_2} − dB_{s_2} ⊗ dB_{s_1})

This decomposition highlights the fact that (S_{B,1}(t), S_{B,2}(t))_t is a geometric rough path. Note that the Lévy area does not depend on the choice of integration (Itô or Stratonovich). If we endow the space of rough paths with the antisymmetric tensor product ∧ instead of the ordinary one ⊗, S_{B,2}(t) will only keep track of the antisymmetric part of the limit: this is the approach we adopted in our previous article [13].

Examples to which theorem 2.1 applies.

∙ In the case of a simple random walk (E = {0}) with a piecewise linear embedding, we get the Donsker theorem for rough paths from [2] (in particular, Γ = 0), with slightly different moment conditions.

∙ A less trivial example is the sum of rotating Bernoulli i.i.d.r.v. which has been presented in our previous article [13]: X_n = Σ_{k=1}^{n} i^k U_k, where the U_k s are i.i.d. Bernoulli variables. In this case, E = {1, 2, 3, 4} and Γ ≠ 0.

∙ More generally, theorem 2.1 applies to any Markov chain on periodic graphs as the ones described in [13].
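For the rotating Bernoulli example, Γ can be evaluated directly from formula (9): the driving chain is deterministic (R_k cycles through the four directions), so an excursion consists of 4 consecutive steps and the expectation can be taken exactly over the 2⁴ equally likely excursions. The sketch below does this, under the assumed normalization U_k ∈ {0, 1} with P(U_k = 1) = 1/2 and the identification i^k ↔ a unit vector of ℝ²; with other conventions the constant may differ:

```python
import itertools
import numpy as np

dirs = np.array([[0, 1], [-1, 0], [0, -1], [1, 0]])   # i^1, i^2, i^3, i^4

A = np.zeros((2, 2))   # E[ sum_{k1<k2} F_{k1}(x)F_{k2} - F_{k2}(x)F_{k1} ]
C = 0.0                # from E[X_{T_1} (x) X_{T_1}] = C * Id
for u in itertools.product((0, 1), repeat=4):         # one excursion
    F = dirs * np.array(u)[:, None]                   # F_k = i^k U_k
    for k1 in range(4):
        for k2 in range(k1 + 1, 4):
            A += (np.outer(F[k1], F[k2]) - np.outer(F[k2], F[k1])) / 16
    C += F.sum(axis=0)[0] ** 2 / 16

Gamma = 0.5 * A / C    # formula (9)
print(Gamma)           # antisymmetric and nonzero: the area anomaly
```

The same loop run over simulated excursions instead of the exact enumeration gives a Monte Carlo estimator of Γ for models where E is too large to enumerate.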

The question of embeddings. If we want to build a rough path associated to a sequence (x_n)_n ∈ V^ℕ, we first need to associate to it a continuous path in V for which building a rough path makes sense. In other words, we need to embed the discrete sequence in the space of continuous, sufficiently regular paths in V.

Definition 2.3. We define an embedding of (x_n)_n ∈ V^ℕ as a sequence ρ = (ρ_N)_N of continuous paths in V such that:

∙ ∀N ≥ 1, ρ_N : [0, N] → V;

∙ ∀N ≥ 1, ∀i ∈ {1, ..., N}, ρ_N(i) = x_i;

∙ ∀k < N, ∀t ∈ [0, k], ρ_k(t) = ρ_N(t).

We can construct an embedding by curve concatenation in the following way: we start by connecting, for all n ∈ ℕ, x_n and x_{n+1} by a continuous curve β_n such that:

β_n : [0, 1] → V,  β_n(0) = x_n,  β_n(1) = x_{n+1}    (10)

The embedding is then given by

ρ_N = β_1 · ... · β_N    (11)

where · is the operation of path concatenation in V. In order to pass to the limit, we define ı_N : [0, 1] → V as ı_N(t) = f(N) ρ_N(Nt), where f : ℕ → ℝ_+ is the renormalization function (for details on embeddings see section 3.3).

In theorem 2.1 we have used the piecewise linear embedding, i.e. we have connected two consecutive points by linear interpolation. While this is the most common embedding, it is not the only one. In order to define a more general embedding for a hidden Markov walk, we proceed as follows.

Lemma 2.1 (construction of an embedding for HMW). Consider a Markov chain (R_n, F_n)_n on E × V. We construct a sequence of processes (ρ_N)_N as follows:

∙ for any 푢 ∈ 퐸, denote by 푉푢 ⊂ 푉 the set of all possible realizations of 퐹1 under the law P (∙|푅1 = 푢);

∙ for u ∈ E, to every y ∈ V_u we associate in a measurable way, for the usual Borel σ-algebras, a curve f^y : [0, 1] → V with bounded variation, and we denote the set of these curves by ℬ_u;

∙ for all k ≥ 1, we associate to F_k a random variable β_k in the sense that, for u ∈ E and under the law P(·|R_k = u), β_k gives for every realization y of F_k from V_u the corresponding element f^y from ℬ_u;

∙ for all 푁 ≥ 1, we set 휌푁 = 훽1 · ... · 훽푁 .

Then (휌푁 )푁 is an embedding of (푋푛)푛 in the sense of definition 3.4.

Remark. Lemma 2.1 relies on the following idea: we can "enlarge" a hidden Markov chain (R_n, F_n)_n in E × V to the sequence of triplets (R_n, F_n, β_n)_n in E × V × 풞^{1-var}([0, 1], V), and construct an embedding (ρ_N)_N out of (β_k)_k as in (11). This leads to the following generalization of theorem 2.1:

Theorem 2.2. Let (R_n, F_n)_n be a hidden Markov chain on E × V such that (R_n)_n is irreducible and condition (8) is satisfied, and denote by (R_n, X_n)_n the corresponding hidden Markov walk. Denote by ρ = (ρ_N)_N an embedding for (X_n)_n with bounded variation constructed as in lemma 2.1.

Set β = E[T_1]. Under the conditions E[X_{T_1}] = 0 and E[X_{T_1}^{⊗2}] = C I_d for a certain C > 0, the rough path canonically constructed out of ρ_N, renormalized through the dilation operator δ_{(β^{-1}NC)^{-1/2}}, converges in the rough path topology 풞^{α-Höl}([0, 1], G²(V)) with α < 1/2 to a rough path given by:

S_{B,1}(t) = B_t

S_{B,2}(t) = ∫_{0<s_1<s_2<t} ∘dB_{s_1} ⊗ ∘dB_{s_2} + Γ_ρ t

where (B_t)_t is a standard d-dimensional Brownian motion and Γ_ρ is an antisymmetric deterministic matrix depending on ρ as follows:

Γ_ρ = E[ (1/2) ∫_{0<s_1<s_2<T_1} (dρ_{T_1}(s_1) ⊗ dρ_{T_1}(s_2) − dρ_{T_1}(s_2) ⊗ dρ_{T_1}(s_1)) ]    (12)

Remarks.

∙ Γ_ρ is the stochastic area of the curve ρ_{T_1} between the times 0 and T_1.

∙ For the ρ-embedding, we have chosen curves of finite variation. This is a sufficient but not a necessary condition for the result of theorem 2.1 to be true. We keep it for convenience; a generalization to a weaker condition on the embedding (for example, curves of finite p-variation with a suitable p) is a matter of some additional computations.

Examples to which theorem 2.2 applies: All the models to which theorem 2.1 applies and for which we connect consecutive points by curves of finite variation: the round-point model (see figure 2), models for traffic with road intersections, random walks on deformed networks and deformed periodic graphs etc.

The general theorem. In both theorems 2.1 and 2.2, we have lifted paths in V to geometric rough paths in G²(V). As we will see in section 3.3 (proposition 3.3), any sufficiently regular embedding ρ of a sequence in V can be encoded in a rough path in G²(V), so that we can start directly with a sequence in G²(V) and associate to it a rough path, i.e. a sufficiently regular embedding in G²(V). The embeddings for sequences in G²(V) can be defined by analogy with the ones on V (see the remark after definition 2.3). In particular, since G²(V) is a geodesic space (see the comment after definition 3.4), we can define for sequences valued in G²(V) the geodesic embedding, i.e. the embedding that connects two consecutive points by a curve whose length equals the distance between them.

In figure 1, examples of geodesic curves corresponding to different elements of G²(ℝ²) are presented. In the case of ℝ^d, the geodesic embedding corresponds to linear interpolation.

Remark. The geodesic embedding of a sequence in 퐺2(푉 ) can be represented using the increments of the sequence, just like the linear interpolation of a sequence in R푑.

Figure 1: Examples of geodesic curves corresponding to elements of 퐺2(푉 ).

Thus, theorem 2.2 becomes a corollary of the following, more abstract, result:

Theorem 2.3. Let (R_n)_n be an irreducible Markov chain on E and (R_n, 𝐗_n)_n a hidden Markov walk on E × G²(V) such that

∃K > 0, ∀j ∈ ℕ, ||𝐗_j^{-1} ⊗ 𝐗_{j+1}|| < K a.s.    (13)

We denote by 𝐗^{(1)} the first component of 𝐗 and set β = E[T_1]. Let ρ = (ρ_N)_N be an embedding that can be encoded by the sequence (𝐗_n)_n (as in proposition 3.3).

Under the conditions E[𝐗^{(1)}_{T_1}] = 0 and E[(𝐗^{(1)}_{T_1})^{⊗2}] = C I_d, the geodesic embedding of (𝐗_k)_{k≤N}, renormalized through the dilation δ_{(β^{-1}NC)^{-1/2}}, converges in the rough path topology 풞^{α-Höl}([0, 1], G²(V)) with α < 1/2 to the rough path given by:

S_{B,1}(t) = B_t

S_{B,2}(t) = ∫_{0<s_1<s_2<t} ∘dB_{s_1} ⊗ ∘dB_{s_2} + Γ_ρ t

where (B_t)_t is a d-dimensional Brownian motion and Γ_ρ is a deterministic antisymmetric matrix defined as in (12).

Proof. See section 3.4.

Since theorems 2.1 and 2.2 are particular cases of theorem 2.3, we shall not prove them separately.

Remarks.

∙ Setting

𝐅_j = ( F_j, ∫∫_{0<s_1<s_2<1} dβ_j(s_1) ⊗ dβ_j(s_2) )

where the β_j s are the curves that define ρ as in (11), we have 𝐗_n = ⊗_{j=1}^{n} 𝐅_j, and condition (13) becomes similar to (8):

∃K > 0, ∀j ∈ ℕ, ||𝐅_j|| < K a.s.

∙ Γ_ρ can be decomposed into two parts: the first one is the Γ that emerges from applying theorem 2.1 to the HMW (R_n, 𝐗^{(1)}_n)_n; the second one comes from applying the law of large numbers to (𝐗^{(2)}_n)_n (see section 3.4).

∙ If we endow the space of rough paths with the antisymmetric tensor product ∧, we can state an antisymmetric version of theorem 2.3.

2.1.2 An easy example: the diamond and round-point models.

Let us now see an example that shows what kind of processes correspond to each of the three theorems.

Consider the irreducible Markov chain (R_n)_n on E = {1, 2, ..., 8} as represented in figure 2. The stochastic matrix Q corresponding to it is given by

    ⎛ 0    p    0    1−p  0    0    0    0  ⎞
    ⎜ p    0    0    0    0    1−p  0    0  ⎟
    ⎜ 0    p    0    1−p  0    0    0    0  ⎟
Q = ⎜ 0    0    0    0    1−p  0    p    0  ⎟    (14)
    ⎜ p    0    0    0    0    1−p  0    0  ⎟
    ⎜ 0    0    1−p  0    0    0    0    p  ⎟
    ⎜ 0    0    1−p  0    0    0    0    p  ⎟
    ⎝ 0    0    0    0    1−p  0    p    0  ⎠
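As a quick sanity check, one can verify numerically that Q is stochastic and irreducible for 0 < p < 1. The sketch below encodes the nonzero entries of (14), with states numbered 1–8 as in figure 2 (the dictionary encoding is our own):

```python
import numpy as np

def Q(p):
    """Transition matrix (14); 'p' marks entries equal to p, 'q' those equal to 1-p."""
    edges = {
        (1, 2): 'p', (1, 4): 'q', (2, 1): 'p', (2, 6): 'q',
        (3, 2): 'p', (3, 4): 'q', (4, 5): 'q', (4, 7): 'p',
        (5, 1): 'p', (5, 6): 'q', (6, 3): 'q', (6, 8): 'p',
        (7, 3): 'q', (7, 8): 'p', (8, 5): 'q', (8, 7): 'p',
    }
    m = np.zeros((8, 8))
    for (i, j), w in edges.items():
        m[i - 1, j - 1] = p if w == 'p' else 1 - p
    return m

M = Q(0.3)
assert np.allclose(M.sum(axis=1), 1.0)                        # rows sum to 1
assert (np.linalg.matrix_power(np.eye(8) + M, 8) > 0).all()   # irreducible
```

The second assertion uses the standard criterion that a chain on n states is irreducible iff (I + Q)^{n-1} has all entries positive.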

By associating to each state of E a vector of ℝ² as in the same figure, we define a HMC (R_n, F_n)_n, out of which we can construct a HMW (R_n, X_n)_n.

If we apply the piecewise linear embedding to (X_n)_n, we get the diamond model, shown in the lower left corner of figure 2. It is obvious that theorem 2.1 applies to this model. We can obtain a round-point model by applying to (X_n)_n an embedding of round-arched openings (lower right corner of figure 2). This model can thus be derived from the diamond model by a change of embedding.

We will now rewrite both models in the rough path setting, to which theorem 2.3 applies. Since in both cases we construct piecewise smooth embeddings, the rough path resulting from the canonical lift is a geometric rough path, thus taking values in G²(V). For convenience, we choose to endow the set

12 2 3 1

88 4 8 5

7 6 The hidden Markov chain 6

The underlying Markov chain

Figure 2: Construction of diamond model and round-point model out of the same HMC (푅푛, 퐹푛)푛.

G²(V) with the antisymmetric law ∧, so that we keep only the antisymmetric part (the signed area) of the second-level component. In the case of the diamond model, the use of the piecewise linear (geodesic) embedding together with the condition (5) on G²(V) elements implies that (𝐗_n)_n is given by

𝐗_N = ⋀_{k=1}^{N} (F_k, 0)

In the case of the round-point model, for k ≥ 1, each round-arched opening can be encoded in an element F_k and a certain a_k ∈ ℝ, where the a_k s are the signed areas added by the round-arched openings when passing from the piecewise linear embedding of the diamond model to the new embedding. In particular, the a_k s give the antisymmetric part of the second-level component. The sequence (𝐗_n)_n is then defined as

𝐗_N = ⋀_{k=1}^{N} (F_k, a_k)

Here (R_n, (F_n, a_n))_n is a hidden Markov chain on E × G²(V), with the a_n s given by the signed areas of the corresponding arcs of circle. In order to take the limit, we then construct a rough path by applying the geodesic embedding to (𝐗_n)_n.

2.2 Combinatorial structure behind the iterated sums of HMW.

2.2.1 Iterated sums and non-geometric rough paths.

From iterated integrals to iterated sums. We have already seen that iterated integrals of the type S_{γ,k} from (2) (or algebraic objects satisfying their properties) are what rough paths are composed of. Let us now pass from continuous objects to discrete ones by analogy. For a sequence x = (x_n)_n in V, we define the iterated sums as

S̃_{x,l}(N) = Σ_{1≤j_1<...<j_l≤N} Δx_{j_1} ⊗ ... ⊗ Δx_{j_l}    (15)

where Δx_i = x_i − x_{i−1}; these objects can be interpreted as a discrete analogue of the iterated integrals. We will compare some algebraic properties of the iterated sums S̃_{x,l} with those of the iterated integrals S_{ρ,l}, where ρ = (ρ_N)_N is an embedding of x, and see how the discrete setting of iterated sums allows us to isolate a particular combinatorial structure, the iterated occupation times. Moreover, using the iterated sums S̃_{x,l}(N), we can construct for the sequence x a discrete analogue of the step-l signature S_l(ρ_N) of ρ_N constructed as in (3) using iterated integrals.

Definition 2.4. For a sequence (x_k)_k in V, we define the step-l discrete signature as

S̃_l(x)_{1,N} = ( S̃_{x,1}(N), ..., S̃_{x,l}(N) ) ∈ T_1^{(l)}(V)    (16)
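Levels 1 and 2 of the discrete signature (16) are straightforward to compute, and they already exhibit the phenomenon studied in section 2.2.2: the product of two first-level components equals the symmetrized second level plus a diagonal correction term. A sketch (helper names are ours):

```python
import numpy as np

def discrete_signature_2(x):
    """Levels S~_{x,1}(N) and S~_{x,2}(N) of (16) for a path x[0..N] in R^d."""
    dx = np.diff(x, axis=0)                     # increments Delta x_j
    S1 = dx.sum(axis=0)
    partial = np.cumsum(dx, axis=0)
    # sum over j1 < j2 of Delta x_{j1} (x) Delta x_{j2}:
    S2 = sum(np.outer(partial[k - 1], dx[k]) for k in range(1, len(dx)))
    return S1, S2

rng = np.random.default_rng(2)
x = rng.normal(size=(50, 3)).cumsum(axis=0)
S1, S2 = discrete_signature_2(x)
dx = np.diff(x, axis=0)

# quasi-shuffle identity: S1^i S1^j = S2^{ij} + S2^{ji} + sum_k dx_k^i dx_k^j
assert np.allclose(np.outer(S1, S1), S2 + S2.T + dx.T @ dx)
```

The extra term `dx.T @ dx` comes from the coinciding indices j_1 = j_2, which are absent in the continuous chain rule (25); this is exactly the quasi-shuffle correction discussed below.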

One may wonder what the relation between (S_l(ρ_N))_N and (S̃_l(x)_{1,N})_N is, and what a study of its convergence may reveal. We will partially answer these questions below in the setting of HMW.

Non-geometric paths as limits of iterated sums of HMW. Consider a HMW (R_n, X_n)_n on E × V and set

𝐘_n = S̃_2(X)_{1,n} = ( X_n, Σ_{1≤k_1<k_2≤n} F_{k_1} ⊗ F_{k_2} ) ∈ T_1^{(2)}(V)    (17)

The following proposition shows that (Y푛)푛 converges in rough path topology to a non-geometric rough path with a second-level deterministic drift.

Proposition 2.1. Let (R_n, X_n)_n be a HMW on E × V such that the corresponding F_k s satisfy (8). Choose (X_t^{(N)})_t to be the piecewise linear embedding of (X_n)_n, i.e.

∀N ∈ ℕ*, ∀t ∈ [0, 1],  X_t^{(N)} = Σ_{k=1}^{⌊Nt⌋} F_k + (Nt − ⌊Nt⌋) F_{⌊Nt⌋+1}

and suppose X_0 = 0 a.s. Consider the geodesic embedding of Z_k = Σ_{j=1}^{k} F_j^{⊗2} given by

∀N ∈ ℕ*, ∀t ∈ [0, 1],  Z_t^{(N)} = Σ_{k=1}^{⌊Nt⌋} F_k^{⊗2} + (Nt − ⌊Nt⌋) F_{⌊Nt⌋+1}^{⊗2}

and set

∀N ∈ ℕ*, ∀t ∈ [0, 1],  𝐘_t^{(N)} = (0, Z_t^{(N)})^{-1} ⊗ S_2(X^{(N)})_{0,t}

Suppose in addition that E[X_{T_1}] = 0, where T_1 is as in (6), and that the covariance matrix E[X_{T_1}^{⊗2}] is equal to C I_d for some C > 0. We also set β = E[T_1].

Then (𝐘_t^{(N)})_t is a well-defined embedding for (𝐘_n)_n and we have the following convergence in the 풞^{α-Höl}([0, 1], T_1^{(2)}(V)) topology for α < 1/2:

( δ_{(NCβ^{-1})^{-1/2}} 𝐘_t^{(N)} )_{t∈[0,1]}  →_{N→∞}  ( 𝐁_t^{It} ⊗ (0, Mt) )_{t∈[0,1]}

where (𝐁_t^{It})_t is the standard Brownian motion on V enhanced with iterated integrals in the Itô sense and M is a deterministic matrix given by

M = C^{-1} E[ Σ_{1≤k_1<k_2≤T_1} F_{k_1} ⊗ F_{k_2} ]    (18)

In particular, (𝐁_t^{It} ⊗ (0, Mt))_t is a non-geometric rough path.

Proof. The proof can be found in section 4.2.

2.2.2 Iterated occupation times of HMW as underlying combinatorial structures of rough paths.

Definition and combinatorial nature of iterated occupation times. In the case of a HMW (R_n, X_n)_n on E × V ((R_n)_n is irreducible as before), the conditional measure ν(·|u) is important for determining the trajectory on V. However, when it comes to algebraic properties, it actually plays only a small role; most of the algebraic properties of rough paths are already present at the level of the discrete Markov chain (R_n)_n. They are concentrated in combinatorial objects contained in the HMW, the iterated occupation times. The definition of a HMW implies that

E[X_n | σ(R)] = Σ_{u∈E} f(u) Σ_{k=1}^{n} 1_{R_k=u}    (19)

where R = (R_n)_n and f(u) = E_ν[X_1 − X_0 | u]. The occupation time of u by (R_n)_n up to time n,

L_{u;n}(r) = Σ_{k=1}^{n} 1_{r_k=u}    (20)

is a natural random variable at the heart of ergodic-type theorems. We can generalize the notion of occupation time (20) as follows.

Definition 2.5 (iterated occupation times). Let (R_n)_n be an irreducible Markov chain on E. For any k ∈ ℕ* and any elements (u_1, ..., u_k) ∈ E^k, the iterated occupation time of (u_1, ..., u_k) by (R_n)_n at time n ∈ ℕ is defined as:

L_{u_1,...,u_k;n}(R) = card{(n_1, ..., n_k) ∈ Δ_k(n); R_{n_1} = u_1, ..., R_{n_k} = u_k}    (21)

where Δ_k(n) is the set

Δ_k(n) = {(n_1, ..., n_k) ∈ ℕ^k; 0 ≤ n_1 < n_2 < ... < n_k ≤ n}    (22)

This cardinal can be written as an iterated sum of products of indicator functions:

L_{u_1,...,u_k;N}(R) = Σ_{1≤l_1<...<l_k≤N} 1_{R_{l_1}=u_1} ... 1_{R_{l_k}=u_k}    (23)

Further on, we can easily generalize relation (19) through the following decomposition.

Property 2.1.

E[S̃_{X,l}(N) | σ(R)] = E[ Σ_{1≤j_1<...<j_l≤N} ΔX_{j_1} ⊗ ... ⊗ ΔX_{j_l} | σ(R) ]
                    = Σ_{(u_1,...,u_l)∈E^l} (f(u_1) ⊗ ... ⊗ f(u_l)) L_{u_1,...,u_l;N}(R)    (24)

where R = (R_n)_n, X = (X_n)_n and f(u) = E_ν[X_1 − X_0 | u] as before.
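Formula (23) can be checked directly against the definition (21) by brute-force enumeration on a short trajectory. The dynamic-programming recursion below is a standard way to evaluate such iterated sums in linear time in N (function names are ours):

```python
import itertools
import numpy as np

rng = np.random.default_rng(4)
r = rng.integers(0, 3, size=30)        # a trajectory of a chain on E = {0, 1, 2}

def L_recursive(word):
    """Iterated occupation time via the iterated sum (23), in O(N * k)."""
    c = [1] + [0] * len(word)          # c[m] = #{l_1 < ... < l_m matching word[:m]}
    for s in r:
        for m in range(len(word), 0, -1):
            if s == word[m - 1]:
                c[m] += c[m - 1]
    return c[-1]

def L_bruteforce(word):
    """card{(n_1 < ... < n_k); r_{n_1} = u_1, ..., r_{n_k} = u_k}, as in (21)."""
    return sum(all(r[i] == u for i, u in zip(idx, word))
               for idx in itertools.combinations(range(len(r)), len(word)))

for word in [(0,), (1, 2), (0, 0, 1)]:
    assert L_recursive(word) == L_bruteforce(word)
```

The recursion updates the counters in decreasing order of m so that each trajectory index is used at most once per tuple, which is exactly the strict ordering l_1 < ... < l_k in (23).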

Remark. Formula (24) tells us that, considered under the law $\mathbb{P}(\cdot|\sigma(R))$, the expectations of iterated sums of a HMW are linear combinations of iterated occupation times of the underlying Markov chain. Conversely, a sum of the type
$$\sum_{(u_1,\dots,u_l)\in E^l} v\, L_{u_1,\dots,u_l;N}(R)$$
with $v\in V^{\otimes l}$ corresponds (under the same probability law) to the expectation of an iterated sum of order $l$ of a certain HMW depending on $(R_n)_n$.
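The iterated sum (23) can be computed directly on any finite trajectory. The following sketch (all helper names are ours, not from the paper) computes $L_{u_1,\dots,u_k;N}(R)$ in two ways: by brute-force enumeration over $\Delta_k(N)$, and by a linear-time recursion over the trajectory.

```python
# Iterated occupation times of a trajectory r, as in definition 2.5 /
# formula (23).  Two equivalent computations (hypothetical helper names).
from itertools import combinations

def occupation_brute(r, word):
    """card{ l_1 < ... < l_k : r[l_1]=u_1, ..., r[l_k]=u_k } by enumeration."""
    k = len(word)
    return sum(all(r[l] == u for l, u in zip(ls, word))
               for ls in combinations(range(len(r)), k))

def occupation_dp(r, word):
    """Same quantity via an iterated-sum recursion: c[j] counts strictly
    increasing index tuples in the processed prefix matching word[:j]."""
    c = [1] + [0] * len(word)
    for x in r:
        for j in range(len(word), 0, -1):   # backwards: each index used once
            if x == word[j - 1]:
                c[j] += c[j - 1]
    return c[len(word)]

r = "abbabaab"
assert occupation_dp(r, "ab") == occupation_brute(r, "ab") == 8
assert occupation_dp(r, "aba") == occupation_brute(r, "aba")
```

The recursion is just formula (23) read left to right: each new letter $R_l$ extends every partially matched tuple ending before time $l$.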

From shuffle to quasi-shuffle products. A particularity of the Stratonovich iterated integrals is that they satisfy the chain rule; for example, the product of two components of the first level satisfies:

$$S_{\gamma,1}(t)_i\, S_{\gamma,1}(t)_j = S_{\gamma,2}(t)_{ij} + S_{\gamma,2}(t)_{ji} \qquad (25)$$
where $x_i$ is the $i$-th coordinate of a vector $x\in V$. Formula (25) is actually an illustration of the fact that the multiplication of Stratonovich iterated integrals is a shuffle product, introduced in [4]. This product can be identified with a particular set of permutations giving all the ways of interlacing two ordered sets while preserving the original order of the components of each of them. The case of Itô integrals and iterated sums is different. We have seen that the combinatorial properties of the latter are concentrated in the iterated occupation times, which may themselves be understood, through formula (23), as basis vectors of $\mathbb{R}^E$. This formal identification establishes a relation between iterated integrals/sums of rough paths and the present iterated occupation times. In particular, one checks easily, for any elements $u\ne v$:

$$L_u(n)\, L_v(n) = L_{u,v}(n) + L_{v,u}(n) \qquad (26)$$
which is very reminiscent of (25). However, for $u = v$, the formula is modified to:
$$L_u(n)\, L_u(n) = 2 L_{u,u}(n) + L_u(n) \qquad (27)$$
The emergence of the last term is due to the discrete nature of the sums (in the context of stochastic iterated integrals, it is due to the extra drift of the Itô integral). This extra term implies that we need a more general product than the shuffle product to characterize the multiplication of the combinatorial structures behind HMW.

Property 2.2. The product of two iterated sums/occupation times of a HMW can be identified with a quasi-shuffle product, where the quasi-shuffle product is as in definition 4.1.

The quasi-shuffle products, introduced in [8] and extended, for example, in [9], are a generalization of the shuffle product: while the latter supposes that components from two different sets can never coincide when interlacing the sets, the former allows the sets to overlap (following certain rules).
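The quasi-shuffle identities (26) and (27) hold deterministically for every trajectory, so they can be checked numerically. A small sketch (trajectory, seed and helper names are ours):

```python
# Numerical check of the quasi-shuffle identities (26)-(27) for occupation
# times on an arbitrary trajectory.
import random

def L1(r, u):                       # L_{u;n}, formula (20)
    return sum(1 for x in r if x == u)

def L2(r, u, v):                    # L_{u,v;n}, formula (21)/(23)
    return sum(1 for i in range(len(r)) for j in range(i + 1, len(r))
               if r[i] == u and r[j] == v)

random.seed(0)
r = [random.choice("ab") for _ in range(50)]
# shuffle-like relation (26) for u != v: the diagonal i = j never matches,
assert L1(r, "a") * L1(r, "b") == L2(r, "a", "b") + L2(r, "b", "a")
# quasi-shuffle correction (27) for u = v: the diagonal survives as L_u.
assert L1(r, "a") ** 2 == 2 * L2(r, "a", "a") + L1(r, "a")
```

The extra term in (27) is exactly the diagonal $i=j$ of the double sum, which is empty when $u\ne v$: this is the discrete analogue of the Itô correction mentioned above.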

Remark: Given a HMW on $E\times V$ and an embedding for it, the structure given by the shuffle product to the space of the corresponding (Stratonovich) iterated integrals, on one hand, and the structure given to the space of the corresponding iterated sums/occupation times by the quasi-shuffle product, on the other, give rise to an interesting illustration of the formal algebraic definitions given in [7] for geometric and non-geometric rough paths.

Asymptotics of the combinatorial structure of HMW. Whereas the convergence of iterated sums of a HMW $(R_n, X_n)_n$ on $E\times V$ cannot be directly expressed through the convergence of their combinatorial structure, there is an interesting link between the two. Following theorem 2.1 and the classical ergodic theorem, we want to study the large-time values and dynamics of $L_{u_1,\dots,u_k;n}(R)$ using the results presented above. We provide two types of estimates: almost sure limits of renormalized iterated occupation times, and convergence in law, in the rough path topology, of the corrections to these almost sure limits towards an anomalous Brownian motion. First, the iterated occupation times satisfy the following scaling limit.

Proposition 2.2 (almost sure convergence). Let $(R_n)_{n\in\mathbb{N}}$ be an irreducible Markov chain on a finite set $E$ and $L_{u_1,\dots,u_k;n}(R)$ as in definition 2.5. Then, for any $k\in\mathbb{N}$, the following convergence holds almost surely:
$$\frac{L_{u_1,\dots,u_k;\lfloor Nt\rfloor}(R)}{N^k} \xrightarrow[N\to\infty]{} \frac{t^k}{k!}\,\pi(u_1)\cdots\pi(u_k) \qquad (28)$$
where $\pi$ is the invariant measure of $(R_n)_n$.

Proof. We give only the main idea of the proof: using the theory of pseudo-excursions, we prove that $L_{u_1,\dots,u_k;\lfloor Nt\rfloor}(R)$ can be approximated by a polynomial of degree $k$ in $N$, and the leading coefficient is identified through the a.s. convergence, obtained by applying the classical ergodic theorem,
$$N^{-k}\, L_{u_1;N}(R)\cdots L_{u_k;N}(R) \xrightarrow[N\to\infty]{} \pi(u_1)\cdots\pi(u_k),$$
combined with $\binom{N}{k} N^{-k} \to 1/k!$.

Remarks.

∙ Proposition 2.2 can be seen as a generalized version of the classical ergodic theorem, which is also proved using a decomposition into excursions (see, for example, [10]).

∙ The stochastic matrix 푄 of (푅푛)푛 appears here through the invariant measure 휋.
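The scaling limit (28) for $k=2$, $t=1$ can be illustrated with a quick simulation. In the sketch below, the two-state chain, its stochastic matrix, the seed and the tolerance are all our own hypothetical choices; the invariant measure $\pi = (4/7, 3/7)$ solves $\pi Q = \pi$ for the chosen $Q$.

```python
# Monte-Carlo sanity check of proposition 2.2 (k = 2, t = 1):
# L_{u1,u2;N}(R) / N^2  should approach  (1/2!) * pi(u1) * pi(u2).
import random

P = {0: [0.7, 0.3], 1: [0.4, 0.6]}      # hypothetical stochastic matrix Q
pi = [4 / 7, 3 / 7]                     # its invariant measure: pi Q = pi

random.seed(1)
N, state, count01, seen0 = 200_000, 0, 0, 0
for _ in range(N):
    state = 0 if random.random() < P[state][0] else 1
    if state == 1:
        count01 += seen0                # pairs l1 < l2 with R_{l1}=0, R_{l2}=1
    else:
        seen0 += 1

limit = 0.5 * pi[0] * pi[1]             # (1/k!) pi(u1) pi(u2) with k = 2
assert abs(count01 / N**2 - limit) < 0.02
```

The one-pass update (`count01 += seen0`) is again the recursion behind formula (23), so the whole check runs in linear time.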

We now present the results on the convergence in law. We set $L_n(r) = (L_{u;n}(r))_{u\in E}$; in this case, we get an integer-valued vector $L_n(r)\in\mathbb{N}^{\mathrm{Card}(E)}$. We then have the following result, which brings to light a universality class of convergence for HMW:

Proposition 2.3. Let $(R_n)_n$ be an irreducible Markov chain on the finite state space $E$. For $n\ge 1$, set $\tilde L_n(R) = (L_{u;n}(R) - n\pi(u))_{u\in E}\in\mathbb{R}^{\mathrm{Card}(E)}$, where $\pi$ is the invariant probability measure of $(R_n)_n$. Then, modulo the appropriate renormalization and a multiplicative constant, the piecewise linear embedding from proposition 4.3 of $(\tilde L_n(R))_n$ and the associated iterated integrals $S_{\tilde L_n(R),k}(t)$ with $k\ge 2$ converge in the rough path topology of $\mathcal{C}^{\alpha\text{-Hl}}([0,1], T_1^{(2)}(V))$ with $\alpha<1/2$ towards

$$S_{B,1}(t) = B_t, \qquad S_{B,2}(t) = \int_{0<s_1<s_2<t} dB_{s_1}\otimes dB_{s_2} + Mt$$
where $(B_t)_t$ is a $\mathrm{Card}(E)$-dimensional Brownian motion and $M$ is the deterministic matrix defined as in (18).

Proof. Follows directly from proposition 2.1.

Remark. Notice that the Brownian motion is not standard here. This comes from the fact that the covariance matrix of an excursion is not diagonal. We have the following immediate corollary, which can be viewed as a kind of central limit theorem for iterated sums:

Corollary 2.1. Let $(R_n)_n$ be an irreducible Markov chain on the finite state space $E$. For any $u, v\in E$, set
$$\tilde L_{u,v;N}(R) = \sum_{1\le l_1<l_2\le N} \big(\mathbf{1}_{R_{l_1}=u} - \pi(u)\big)\big(\mathbf{1}_{R_{l_2}=v} - \pi(v)\big)$$
Then for any $u, v\in E$, we have the following convergence in law in the uniform topology:
$$N^{-1}\,\tilde L_{u,v;N}(R) \xrightarrow[N\to\infty]{} \int_{0<s_1<s_2<1} dB^{(u)}_{s_1}\, dB^{(v)}_{s_2} + M_{u,v}$$
where $B$ is a $\mathrm{Card}(E)$-dimensional Brownian motion and $M_{u,v}$ is the $(u,v)$ entry of the $\mathrm{Card}(E)\times\mathrm{Card}(E)$ matrix $M$ from proposition 2.3.

Remark. We can generalize the result of corollary 2.1 to iterated occupation times of higher order by setting, for any $u_1,\dots,u_k\in E$,
$$\tilde L_{u_1,\dots,u_k;N}(R) = \sum_{1\le l_1<\dots<l_k\le N} \big(\mathbf{1}_{R_{l_1}=u_1} - \pi(u_1)\big)\cdots\big(\mathbf{1}_{R_{l_k}=u_k} - \pi(u_k)\big)$$
However, in this case the additional terms in the limit (all the terms aside from the iterated integral of the Brownian motion) are more difficult to express explicitly, even though they depend on $M$ and $(B_t)_t$.

3 From Hidden Markov Walks to rough paths

3.1 Theory of pseudo-excursions for hidden Markov walks

Another definition of HMW. Definitions 2.1 and 2.2 can be summed up in the following definition-property, which also covers the HMW taking values in $E\times G^2(V)$, where $G^2(V)$ is as defined in the notations of section 1.2 or in definition 3.3.

Property 3.1. A process $(R_n, X_n)_n$ is a hidden Markov walk on $E\times V$ (resp. on $E\times G^2(V)$) in the sense of definition 2.2 if and only if there exists a sequence of r.v. $(F_n)_n$ with values in $V$ (resp. $G^2(V)$) such that

∙ under P (∙|휎(푅)), with 푅 = (푅푛)푛, (퐹푛)푛 is a sequence of independent r.v.;

∙ the distribution of $F_n$ knowing $R_n = u$ is $\nu(\cdot|u)$ for all $n\in\mathbb{N}^*$;

∙ $\forall n\in\mathbb{N}^*$, $X_n = \sum_{k=1}^{n} F_k$ (resp. $X_n = \bigotimes_{k=1}^{n} F_k$ in $G^2(V)$).

In particular, we have $F_k = X_k - X_{k-1}$ (resp. $F_k = X_{k-1}^{-1}\otimes X_k$ in $G^2(V)$).

Theory of pseudo-excursions. As has already been mentioned in the introduction, HMW are a way of generalizing simple random walks: instead of being i.i.d. variables, the increments depend on a Markov chain on a finite state space $E$: to each state $u\in E$ we associate a supplementary object (a vector, a curve, etc.) which gives the increment. However, we can derive a sum of i.i.d. r.v. from a HMW using the theory of pseudo-excursions. For convenience, we state it for HMW on $E\times V$, but it is equally valid for those on $E\times G^2(V)$. If $(R_n)_n$ is an irreducible Markov chain on a finite state space $E$, we can apply to it all the excursion theory available for Markov chains. Thus, even though the hidden Markov walk $(R_n, X_n)_n$ may not itself be a Markov chain, we can construct a theory of pseudo-excursions for it based on the results for $(R_n)_n$. We start by defining a sequence of stopping times for the Markov chain $(R_n)_n$ as in (6). We then have the following definition, which is at the same time a property.

Definition 3.1 (pseudo-excursions). Let $(R_n, X_n)_n$ be a HMW on $E\times V$ and $(T_k)_k$ the sequence defined in (6). We call $(X_{T_k+1},\dots,X_{T_{k+1}})$ the $k$-th pseudo-excursion of $(X_n)_n$ and $X_{T_{k+1}} - X_{T_k}$ the $k$-th pseudo-excursion increment.

Property 3.2. We have the following basic properties of pseudo-excursions:

∙ The variables $X_{T_{k+1}} - X_{T_k}$ are i.i.d. random variables.

∙ For $i\ne j$, the trajectories $(X_{T_i+1},\dots,X_{T_{i+1}})$ and $(X_{T_j+1},\dots,X_{T_{j+1}})$ are independent and have the same law.

Proof. Let $k\ne m$. The pseudo-excursion increments $X_{T_{k+1}} - X_{T_k}$ and $X_{T_{m+1}} - X_{T_m}$ are independent knowing $(R_{T_k+1},\dots,R_{T_{k+1}})$ and $(R_{T_m+1},\dots,R_{T_{m+1}})$ respectively (this follows from property 3.1). Since $k\ne m$, the excursions $(R_{T_k+1},\dots,R_{T_{k+1}})$ and $(R_{T_m+1},\dots,R_{T_{m+1}})$ are i.i.d., and thus $X_{T_{k+1}} - X_{T_k}$ and $X_{T_{m+1}} - X_{T_m}$ are independent and have the same law.

Remark. Let us briefly justify the choice of the term "pseudo-excursions".

The excursion part comes from the fact that $X_{T_{k+1}} - X_{T_k}$ depends on the excursion $(R_{T_k+1},\dots,R_{T_{k+1}})$. Why "pseudo"? Because an ordinary excursion is the trajectory of a process before returning to its starting point, whereas the increments $X_{T_{k+1}} - X_{T_k}$ are non-trivial, i.e. we do not return to our starting point. The pseudo-excursions are particularly interesting when it comes to the convergence of a hidden Markov walk, as they tell us that it can be studied almost as the convergence of a sum of well-chosen i.i.d. r.v.
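As a toy illustration of this decomposition, one can simulate a small hidden Markov walk and cut it at the return times of the chain. The chain, the increment function and the seed below are hypothetical choices of ours:

```python
# Toy pseudo-excursion decomposition: a two-state chain R_n and a walk X_n
# whose increments f(R_n) depend only on the current hidden state.
import random

random.seed(2)
f = {0: 1.0, 1: -2.0}                    # hypothetical increment per state
R, X = [0], [0.0]
for _ in range(1000):
    s = 0 if random.random() < 0.5 else 1
    R.append(s)
    X.append(X[-1] + f[s])

T = [k for k, s in enumerate(R) if s == 0]          # return times to state 0
incr = [X[T[k + 1]] - X[T[k]] for k in range(len(T) - 1)]
# By property 3.2 the increments incr are i.i.d.; they telescope back to X:
assert abs(sum(incr) - (X[T[-1]] - X[T[0]])) < 1e-9
```

The telescoping identity is what makes the reduction to sums of i.i.d. r.v. in proposition 3.1 possible.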

Proposition 3.1. Let $(R_n, X_n)_n$ be a hidden Markov walk as in definition 2.2. Suppose that there exists $M>0$ such that $|X_{n+1} - X_n|\le M$ a.s. Then there exists a sequence $(V_n)_n$ of i.i.d. r.v. such that, setting, for all $n\ge 1$,

$$S_n = \sum_{k=1}^{\lfloor n\,\mathbb{E}[T_1]^{-1}\rfloor} V_k$$
we have
$$\forall\epsilon>0,\qquad \mathbb{P}\big(|X_n - S_n| > \epsilon\big) \to 0$$

Proof. Let 휅(푛) be an integer such that 푇휅(푛) ≤ 푛 < 푇휅(푛)+1. We decompose

$$X_n = \sum_{i=1}^{\kappa(n)} \big(X_{T_i} - X_{T_{i-1}}\big) + \sum_{i=T_{\kappa(n)}+1}^{n} \big(X_i - X_{i-1}\big)$$

The first sum contains $\kappa(n)$ i.i.d. variables, and moreover $\kappa(n)/n \to \mathbb{E}[T_1]^{-1}$ a.s. by the ergodic theorem. We now prove that the second sum converges to zero in probability. For $\epsilon>0$, we have:
$$\mathbb{P}\Big(\Big|\sum_{i=T_{\kappa(n)}+1}^{n} (X_i - X_{i-1})\Big| > \epsilon\Big) \le \mathbb{P}\Big(\sum_{i=T_{\kappa(n)}+1}^{T_{\kappa(n)+1}} |X_i - X_{i-1}| > \epsilon\Big) \le \mathbb{P}\big(M(T_{\kappa(n)+1} - T_{\kappa(n)}) > \epsilon\big) \le M\epsilon^{-1}\,\mathbb{E}[T_1] \to 0$$
We then obtain the desired result by applying Slutsky's theorem (for reference, see [1], theorem 3.1).

Proposition 3.1 is very useful in the sense that it allows us to adapt the convergence theorems available for sums of i.i.d. random variables to the class of hidden Markov walks.

3.2 Elements of rough paths theory

The reader who is already familiar with rough paths may skip this part; those who would like a more detailed exposition may refer, for example, to [6] or to [5].

Some definitions. In general, a rough path is defined as a continuous path on $T_1^{(N)}(V)$ for $N\ge 1$. When we want to associate a rough path to a $V$-valued path $\gamma$ of regularity $\alpha<1$, we choose $N = \lfloor 1/\alpha\rfloor$ and fill each level following certain rules.

Remark. The way of lifting a path in $V$ to a rough path is not unique. In particular, the regularity of the rough path has to be coherent with the regularity of the initial path (for details on regularity issues, see, for example, [14] or [6]). Throughout this article, we are interested in rough paths that converge towards the Brownian motion, whose regularity is $(1/2)^-$, so the rough paths we work with correspond to the following definition.

Definition 3.2. Let $1/3 < \alpha < 1/2$. An $\alpha$-Hölder rough path $(\mathbf{x}_t)_t$ is an element of $\mathcal{C}^{\alpha\text{-Hl}}([0,1], T_1^{(2)}(V))$. Moreover, if $(\mathbf{x}_t)_t$ is an element of $\mathcal{C}^{\alpha\text{-Hl}}([0,1], G^2(V))$, it is an $\alpha$-Hölder geometric rough path.

A more informal way to state definition 3.2 is the following:

∙ a geometric rough path takes values in $G^2(V)$ endowed with the law $\otimes$ inherited from $T_1^{(2)}(V)$: $(a,b)\otimes(a',b') = (a+a',\ b+b'+a\otimes a')$;

∙ alternatively, a geometric rough path can be seen in $G^2(V)$ endowed with the antisymmetric law $\wedge$, which eliminates the symmetric part of the second level (it is seen as redundant since it depends entirely on the first level): $(a,b)\wedge(a',b') = \big(a+a',\ b+b'+\frac{1}{2}(a\otimes a' - a'\otimes a)\big)$;

∙ a non-geometric rough path takes values in $T_1^{(2)}(V)$ endowed with the law $\otimes$ (we do not consider the antisymmetric law $\wedge$ since the symmetric part of the second level contains elements that do not depend on the first level).

An important property of rough paths is that they satisfy Chen's relation, i.e.

$$\forall\, 0\le s<u<t\le 1,\qquad \mathbf{x}_{s,u}\otimes\mathbf{x}_{u,t} = \mathbf{x}_{s,t} \qquad (29)$$
or, alternatively, in the antisymmetric setting (for $G^2(V)$),

$$\forall\, 0\le s<u<t\le 1,\qquad \mathbf{x}_{s,u}\wedge\mathbf{x}_{u,t} = \mathbf{x}_{s,t} \qquad (30)$$

where $\forall\, 0\le s<u$, $\mathbf{x}_{s,u} = \mathbf{x}_s^{-1}\otimes\mathbf{x}_u$. This reflects, in particular, the fact that $\mathbf{x}_t$ can be decomposed into a "sum" of increments of $\mathbf{x}$. The Brownian motion rough path is the most important practical example of a rough path in the probabilistic setting. Since stochastic calculus allows us to

define integrals with respect to the Brownian motion (Itô or Stratonovich), we can directly construct the rough path as:
$$\mathbf{B}_{s,t} = \Big(B_t - B_s,\ \int_{s<u<v<t} \circ\, dB_u\otimes\circ\, dB_v\Big) \qquad (31)$$
In the antisymmetric setting of $G^2(V)$, we only keep the antisymmetric part of (31) and we obtain the enhanced Brownian motion:

$$\mathbf{B}_{s,t} = (B_t - B_s,\ \mathcal{A}_{s,t}) \qquad (32)$$
where $\mathcal{A}$ is the stochastic signed area of $B$, called the Lévy area. The enhanced Brownian motion is a Brownian motion on $G^2(V)$.
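Chen's relation (29) for the law $\otimes$ can be checked numerically on the level-2 iterated-sum lift of a discrete path: multiplying the signatures of two consecutive pieces reproduces the signature of the whole path. A small sketch in $V=\mathbb{R}^2$ (helper names are ours, not from the paper):

```python
# Chen's relation for level-2 signatures of a discrete path in R^2.
import random

def otimes(g, h):
    """Group law on T_1^(2)(R^2): (a,b) (x) (a',b') = (a+a', b+b'+a (x) a')."""
    (a, b), (ap, bp) = g, h
    a2 = [x + y for x, y in zip(a, ap)]
    b2 = [[b[i][j] + bp[i][j] + a[i] * ap[j] for j in range(2)]
          for i in range(2)]
    return (a2, b2)

def sig2(incs):
    """Level-2 signature (iterated sums) of a path given by its increments."""
    g = ([0.0, 0.0], [[0.0, 0.0], [0.0, 0.0]])
    for dx in incs:
        g = otimes(g, (list(dx), [[0.0, 0.0], [0.0, 0.0]]))
    return g

random.seed(3)
incs = [(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(10)]
whole = sig2(incs)
glued = otimes(sig2(incs[:4]), sig2(incs[4:]))   # Chen: x_{s,u} (x) x_{u,t}
assert all(abs(x - y) < 1e-9 for x, y in zip(whole[0], glued[0]))
assert all(abs(whole[1][i][j] - glued[1][i][j]) < 1e-9
           for i in range(2) for j in range(2))
```

Only associativity of $\otimes$ is used, which is why the same check works verbatim for the Itô-type non-geometric lifts of this paper.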

The group $G^2(V)$: alternative definition and topology. Since our main convergence theorems (2.1, 2.2, 2.3) deal with geometric rough paths, we give more details about the group $G^2(V)$. In the introduction, we stated that $G^2(V)$ is a subgroup of $T_1^{(2)}(V)$ whose elements satisfy condition (5) (i.e. the symmetric part of the second level depends entirely on the first level). We now give an alternative, more analytical definition.

Definition 3.3. The free nilpotent group $G^2(V)$ is defined as:
$$G^2(V) = \big\{S_2(\gamma)_{0,1} :\ \gamma\in\mathcal{C}^{1\text{-var}}([0,1], V)\big\}$$
where $S_2(\gamma)_{0,1}$ is as in (3).

The topology of $G^2(V)$ is induced by the Carnot-Carathéodory norm, which gives the length of the shortest path corresponding to a given signature:
$$\forall g\in G^2(V),\qquad \|g\| := \inf\Big\{\int_0^1 |d\gamma| :\ \gamma\in\mathcal{C}^{1\text{-var}}([0,1], V) \text{ and } S_2(\gamma)_{0,1} = g\Big\} \qquad (33)$$
where $|\cdot|_V$ is the restriction to $V$ of the Euclidean norm. The norm thus defined is homogeneous ($\|\delta_\lambda g\| = |\lambda|\,\|g\|$ for $\lambda\in\mathbb{R}$), symmetric ($\|g\| = \|g^{-1}\|$) and sub-additive ($\|g\otimes h\| \le \|g\| + \|h\|$); it induces a continuous metric $d$ on $G^2(V)$ through the map
$$d:\ G^2(V)\times G^2(V)\to\mathbb{R}_+,\qquad (g,h)\mapsto \|g^{-1}\otimes h\| \qquad (34)$$
In this case, $(G^2(V), d)$ is a geodesic space (in the sense of definition 5.19 from [6]) and a Polish space. Another useful norm on $G^2(V)$, homogeneous but neither symmetric nor sub-additive, is given by:

$$\forall (a,b)\in G^2(V),\qquad |||(a,b)||| = \max\big\{|a|_V,\ |b|_{V\otimes V}^{1/2}\big\}$$
Since all homogeneous norms are equivalent on $G^2(V)$, this new norm gives us a rather easy way to estimate the Carnot-Carathéodory norm:

Property 3.3. There exist two positive constants $c_1$ and $c_2$ such that
$$\forall (a,b)\in G^2(V),\qquad c_1\,|||(a,b)||| \le \|(a,b)\| \le c_2\,|||(a,b)||| \qquad (35)$$
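The homogeneity of $|||\cdot|||$ under the dilation $\delta_\lambda(a,b) = (\lambda a, \lambda^2 b)$ is elementary and can be verified directly; the sketch below (names are ours) does so for $V=\mathbb{R}^2$ with Euclidean/Frobenius norms.

```python
# The homogeneous norm |||(a,b)||| = max(|a|_V, |b|_{VxV}^(1/2)) and the
# dilation delta_lambda on G^2(R^2): |||delta_lam g||| = |lam| * |||g|||.
import math

def hnorm(a, b):
    na = math.sqrt(sum(x * x for x in a))                    # |a|_V
    nb = math.sqrt(sum(x * x for row in b for x in row))     # |b|_{VxV}
    return max(na, math.sqrt(nb))

def dilate(lam, a, b):
    """delta_lambda acts by lam on level 1 and lam^2 on level 2."""
    return [lam * x for x in a], [[lam * lam * x for x in row] for row in b]

a, b = [3.0, 4.0], [[1.0, 2.0], [0.5, 0.25]]
for lam in (0.5, 2.0, -3.0):
    da, db = dilate(lam, a, b)
    assert abs(hnorm(da, db) - abs(lam) * hnorm(a, b)) < 1e-9
```

By property 3.3 the same homogeneity then transfers, up to constants, to the Carnot-Carathéodory norm itself.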

A convergence criterion for rough paths. We usually prove the convergence of a sequence of rough paths using pointwise convergence plus a tightness criterion. However, in some cases, we can also use the following result, deduced from exercise 2.9 in [5]:

Proposition 3.2. Consider a sequence of rough paths $(\mathbf{X}^{(N)})_N$ taking values in $\mathcal{C}^{\alpha\text{-Hl}}([0,1], G^2(V))$ with $\alpha<1/2$, such that we have the uniform bound
$$\sup_N\ \mathbb{E}\Big[\sup_{s\ne t} \frac{\big\|\mathbf{X}^{(N)}_{s,t}\big\|}{|t-s|^{1/2}}\Big] < \infty$$
and the pointwise convergence (in probability)

$$\forall t\in[0,1],\qquad \mathbf{X}^{(N)}_{0,t} \xrightarrow[N\to\infty]{} \mathbf{X}_{0,t}$$
for a certain rough path $\mathbf{X}$ in $\mathcal{C}^{\alpha\text{-Hl}}([0,1], G^2(V))$. Then $\mathbf{X}^{(N)}$ converges in probability to $\mathbf{X}$ in the $\mathcal{C}^{\alpha\text{-Hl}}([0,1], G^2(V))$ topology with $\alpha<1/2$.

3.3 Embeddings

Definition and equivalence classes in the case of a finite-dimensional vector space. In order to study the limit of a discrete process, we need to properly define this convergence in the continuous space where the limit process lives. This is when embeddings come on the scene. In the introduction, we gave the general definition 2.3 of an embedding and a way of constructing an embedding for a given sequence of points (11). The following statement highlights the fact that this method is consistent with the definition:

Definition 3.4. Let $(x_n)_n$ be a sequence with values in a finite-dimensional vector space $V$. Consider a sequence $(\beta_n)_n$ of continuous paths $\beta_n:[0,1]\to V$ such that $\beta_n(0) = x_n$ and $\beta_n(1) = x_{n+1}$. Let $\cdot$ be the operator of path concatenation given by

$$\forall\gamma:[s,t]\to V,\ \forall\gamma':[s',t']\to V,\ \forall u\in[s,\,t+t'-s'],\qquad (\gamma\cdot\gamma')(u) = \begin{cases} \gamma(u) & \text{if } u\in[s,t] \\ \gamma'(u-t+s') & \text{if } u\in[t,\,t+t'-s'] \end{cases} \qquad (36)$$

Then the sequence 휌푁 = 훽1 · ... · 훽푁 is an embedding of (푥푛)푛.

The most common embedding is the geodesic embedding, i.e. the one that connects two consecutive points (in a geodesic space) by a curve minimizing the distance between them (in $\mathbb{R}^d$, it consists simply in connecting consecutive points of $(x_n)_n$ by straight lines). The advantage of this notion is that it applies to embeddings in $V$ as well as in $G^2(V)$. In some cases, we can study the convergence of the embedding thus obtained by shrinking the space scale of the paths, i.e. by rescaling the embedding. We then have the following definition:

Definition 3.5. Let $V$ be a finite-dimensional vector space. Consider a $V$-valued embedding $\rho_N = \beta_1\cdot\ldots\cdot\beta_N$ as defined in 3.4. Let $f:\mathbb{N}\to\mathbb{R}_+$ be an increasing function. The sequence $\imath_N = f(N)\rho_N$ defines a rescaled embedding.

Even if, for a given discrete process, we have a large choice of embeddings, we should be aware of the fact that the limit we get depends on the embedding we adopt (some embeddings may not even admit a limit). For a sequence $(x_n)_n$ with values in a vector space $V$, we can divide the embeddings into equivalence classes by building rough paths out of our embeddings (when possible) and considering convergence in the rough path topology:

Definition 3.6. Let $(x_n)_n$ be a sequence with values in a (finite-dimensional) vector space $V$. We say that two embeddings $\rho = (\rho_N)_N$ and $\rho' = (\rho'_N)_N$ of $(x_n)_n$ are equivalent (in the rough path sense) if there exists a function $f:\mathbb{N}\to\mathbb{R}_+$ such that the rough paths $\boldsymbol{\imath}_N$ and $\boldsymbol{\imath}'_N$ corresponding to $\imath_N = f(N)\rho_N$ and $\imath'_N = f(N)\rho'_N$ respectively converge in distribution to the same limit $l$ in the rough path topology.

We shall now see how this definition is important for generalizing the convergence of HMW in the rough path topology.

From sequences in $G^2(V)$ to rough paths. We now consider a sequence $(g_k)_k = (g_k^{(1)}, g_k^{(2)})_k$ in $G^2(V)$. As in the case of a sequence in $V$, we can pass to the continuous framework by associating a set of embeddings to $(g_k)_k$ as follows:

Definition 3.7. Consider a sequence $(g_k)_k\in G^2(V)^{\mathbb{N}}$. Consider a sequence of rough paths $(\rho_N)_N$ with $\rho_N:[0,N]\to G^2(V)$ which satisfy

$$\forall\, 1\le k\le N,\qquad \rho_N(k-1)^{-1}\otimes\rho_N(k) = g_k$$

We say that $(\rho_N)_N$ is an embedding of $(g_k)_k$. Just as in the case of a vector space $V$, a particularly important embedding is the geodesic embedding, which, in this case, consists in connecting the elements of the sequence by geodesic curves minimizing the distance between them:

Definition 3.8 (the geodesic embedding in $G^2(V)$). Consider a sequence $(g_k)_k\in G^2(V)^{\mathbb{N}}$. We call geodesic embedding of $(g_k)_k$ the sequence of rough paths

$$\varrho_N:\ [0,N]\to G^2(V),\qquad t\mapsto \Big(\bigotimes_{k=1}^{\lfloor t\rfloor} g_k\Big)\otimes\delta_{t-\lfloor t\rfloor}\big(g_{\lfloor t\rfloor+1}\big) \qquad (37)$$
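Formula (37) is straightforward to implement once the group operations on the step-2 lift are available. The sketch below (in the full tensor group over $\mathbb{R}^2$; all helper names are ours) builds the embedding and checks that its increments at integer times recover the $g_k$:

```python
# A sketch of the geodesic-type embedding (37) for a sequence (g_k),
# using the step-2 group law, inverse and dilation.
def otimes(g, h):
    (a, b), (ap, bp) = g, h
    return ([x + y for x, y in zip(a, ap)],
            [[b[i][j] + bp[i][j] + a[i] * ap[j] for j in range(2)]
             for i in range(2)])

def inverse(g):
    a, b = g                      # (a,b)^{-1} = (-a, -b + a (x) a)
    return ([-x for x in a],
            [[-b[i][j] + a[i] * a[j] for j in range(2)] for i in range(2)])

def dilate(lam, g):
    a, b = g
    return ([lam * x for x in a],
            [[lam * lam * x for x in row] for row in b])

IDENTITY = ([0.0, 0.0], [[0.0, 0.0], [0.0, 0.0]])

def embedding(gs, t):
    """rho_N(t) = g_1 (x) ... (x) g_{floor(t)} (x) delta_{t - floor(t)}(g_{floor(t)+1})."""
    k = int(t)
    acc = IDENTITY
    for g in gs[:k]:
        acc = otimes(acc, g)
    if t > k and k < len(gs):
        acc = otimes(acc, dilate(t - k, gs[k]))
    return acc

gs = [([1.0, 0.0], [[0.0, 0.5], [-0.5, 0.0]]),
      ([0.0, 2.0], [[0.0, -1.0], [1.0, 0.0]])]
g1 = otimes(inverse(embedding(gs, 0)), embedding(gs, 1))
assert g1[0] == gs[0][0] and g1[1] == gs[0][1]
assert embedding(gs, 0.5)[0] == [0.5, 0.0]
```

At fractional times the dilation $\delta_{t-\lfloor t\rfloor}$ interpolates the current increment, which is exactly the geodesic interpolation used in the proof of proposition 3.3 below.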

The universality of the geodesic embedding in $G^2(V)$ for HMW. We can now explain what we mean by the universality of the geodesic embedding. The geodesic embedding on $G^2(V)$ allows us to describe, beyond the equivalence classes, the convergence of different embeddings of a hidden Markov walk. More precisely, the limit of all the embeddings that are equivalent in the rough path sense is the same as the limit of the geodesic embedding of a well-chosen sequence in $G^2(V)$:

Proposition 3.3. Consider a process $(X_n)_n$ on $V$ and suppose that there exists an embedding $(\rho_N)_N$ of $(X_n)_n$ with bounded variation (constructed, for example, as in lemma 2.1) such that $(N^{-1/2}\rho_N(Nt))_{t\in[0,1]}$ converges in law in the $\mathcal{C}^{\alpha\text{-Hl}}([0,1], G^2(V))$ topology for $\alpha<1/2$. Then there exists a sequence $(g_n)_n = (g_n^{(1)}, g_n^{(2)})_n\in G^2(V)^{\mathbb{N}}$ such that its geodesic embedding, rescaled by the operator $\delta_{N^{-1/2}}$, converges in the $\mathcal{C}^{\alpha\text{-Hl}}([0,1], G^2(V))$ topology to the same limit as any embedding equivalent to $\rho_N$ in the sense of definition 3.6.

Proof. Denote by $\boldsymbol{\rho}_N$ the geometric rough path corresponding to $\rho_N$. Set
$$\forall k\ge 1,\qquad g_k = \boldsymbol{\rho}_k(k-1)^{-1}\otimes\boldsymbol{\rho}_k(k)$$
By definition 3.4 of the embedding, this means that
$$\forall k\ge 1,\ \forall N\ge k,\qquad g_k = \boldsymbol{\rho}_N(k-1)^{-1}\otimes\boldsymbol{\rho}_N(k)$$

The geodesic embedding for the sequence (푔푘)푘 is given by

$$\forall t\in[0,1],\qquad \boldsymbol{\varrho}_N(Nt) = \Big(\bigotimes_{k=1}^{\lfloor Nt\rfloor} g_k\Big)\otimes\delta_{Nt-\lfloor Nt\rfloor}\big(g_{\lfloor Nt\rfloor+1}\big) = \boldsymbol{\rho}_N(\lfloor Nt\rfloor)\otimes\delta_{Nt-\lfloor Nt\rfloor}\big(\boldsymbol{\rho}_N(\lfloor Nt\rfloor)^{-1}\otimes\boldsymbol{\rho}_N(\lfloor Nt\rfloor+1)\big)$$
It is now left to prove that $(\delta_{N^{-1/2}}\boldsymbol{\varrho}_N(Nt))_{t\in[0,1]}$ and $(\delta_{N^{-1/2}}\boldsymbol{\rho}_N(Nt))_{t\in[0,1]}$ converge towards the same limit in the rough path topology. As before, we denote by $\|\cdot\|$ the Carnot-Carathéodory norm defined in (33). Since $\boldsymbol{\rho}_N$ is $\alpha$-Hölder, the equivalence of norms (35) tells us that there exists $c>0$ such that
$$\forall N\ge 1,\ \forall\, 0\le u<v\le N,\qquad \big\|\boldsymbol{\rho}_N(u)^{-1}\otimes\boldsymbol{\rho}_N(v)\big\| \le c\,|v-u|^\alpha$$
Consequently, we have
$$\sup_{t\in[0,1]}\big\|\delta_{N^{-1/2}}\boldsymbol{\rho}_N(Nt)^{-1}\otimes\delta_{N^{-1/2}}\boldsymbol{\varrho}_N(Nt)\big\| \le N^{-1/2}\Big(\sup_{t\in[0,1]}\big\|\boldsymbol{\rho}_N(Nt)^{-1}\otimes\boldsymbol{\rho}_N(\lfloor Nt\rfloor)\big\| + \sup_{t\in[0,1]}\big\|\delta_{Nt-\lfloor Nt\rfloor}\big(\boldsymbol{\rho}_N(\lfloor Nt\rfloor)^{-1}\otimes\boldsymbol{\rho}_N(\lfloor Nt\rfloor+1)\big)\big\|\Big) \le c\,N^{-1/2}\Big(\sup_{t\in[0,1]}(Nt-\lfloor Nt\rfloor)^\alpha + \sup_{t\in[0,1]}(Nt-\lfloor Nt\rfloor)\Big) \xrightarrow[N\to\infty]{} 0
$$

which completes the proof.

Proposition 3.3 shows the interest of theorem 2.3: rather than operating with concatenations of continuous paths in each particular case, we can manipulate a more general object in $G^2(V)$. The advantage is a general result for an entire equivalence class and a more convenient setting for computations.

Embeddings for differential and difference equations. As has already been mentioned, given a differential equation $dY_t = f(Y_t)\,dX_t$, its solution map $\xi(Y_0, X) = Y$ is not always continuous in the uniform topology, which means that we have to choose an appropriate sequence of continuous paths $(X^{(n)})_n$ approaching $X$ in order for $\xi(Y_0, X^{(n)}) = Y^{(n)}$ to suitably approach $Y$. The rough paths theory gives us a universal recipe for making this choice: a suitable sequence $(X^{(n)})_n$ is one for which the corresponding sequence of rough paths $(\mathbf{X}^{(n)})_n$ converges to the rough path $\mathbf{X}$ of $X$ in the rough path topology. Even if the rough paths theory deals only with continuous objects, the above statement has surprising and important consequences in the discrete setting. A crucial difference is that, in the continuous case, the limit is given, and we have to find a suitable sequence that converges towards it. On the contrary, if we have a sequence of difference equations, the limit is not known a priori and will thus be a function of the embedding. Otherwise said, in the first case we have a given limit and several possible approximations, whereas in the second one we have a given discretization and several possible limits. Let us now see how this fact applies to the analysis of the discrete analogue of differential equations, the difference equations. While solving these equations directly can have useful applications, it is also of great interest to consider their convergence after renormalization, i.e. the convergence of

$$\Delta Y_n = \epsilon\sum_{i=1}^{d} f^{(i)}(Y_{n-1})\,(\Delta X_n)^{(i)} \qquad (38)$$
as $\epsilon\to 0$. This is where the embedding comes on the scene.

Property 3.4. Consider a difference equation as in (38) and suppose that $(X_n)_n$ allows two different embeddings of finite variation $\rho_N$ and $\rho'_N$. Then, if $\rho_N$ and $\rho'_N$ are equivalent in the rough path sense, they define the same limit equation in the rough path topology (if it exists).

In particular, this means that, in the case of rough paths in $G^2(V)$, the difference between the corresponding limit equations will depend on the second-level component of the rough path, which contains the area anomaly and the square bracket of the process. Let us now see how this translates into the setting of classical stochastic calculus in the case where $(X_n)_n$ is a hidden Markov walk. In our article [13], we discussed the convergence of a difference equation driven by the sum of Bernoulli "turning" variables. In that example, we implicitly assumed a piecewise linear, geodesic embedding. What happens if

we change it? The additional term we obtain in the limit by choosing a hidden Markov walk as driving process is a drift consisting of two components. The first one, featuring the constant $K$, depends on the square bracket of the process, i.e. it is a familiar term in stochastic calculus. The second one, containing the area anomaly $\Gamma$, is a new term that can only be brought up by means of rough path analysis. However, both of them depend on the embedding: the first one, through the construction of the square bracket as a limit; the second one, because different embeddings may generate different stochastic area anomalies in the limit, as mentioned earlier. Thus, the choice of an embedding is a problem that has consequences for classical stochastic calculus even independently of the rough paths theory.

3.4 Getting back to theorem 2.3

Notations: For $g\in G^2(V)$, we denote by $g^{(1)}$ its first-level component and by $g^{(2)}$ its second-level one. We denote by $\delta$ the standard dilation operator on $G^2(V)$ (i.e. $\delta_\epsilon(g^{(1)}, g^{(2)}) = (\epsilon g^{(1)}, \epsilon^2 g^{(2)})$). We start with two preliminary lemmas.

Lemma 3.1. Let $(\xi_n)_n$ be a sequence of i.i.d. $G^2(V)$-valued centred random variables with bounded moments of all orders, i.e.
$$\forall p\ge 1,\qquad \mathbb{E}\big[|\xi_n^{(1)}|^p\big] < \infty$$

Furthermore, let (푘푛)푛 be a sequence of N-valued r.v. such that

∙ 푘0 = 0 a.s.

∙ P (∀푛 ≥ 0, 푘푛+1 ∈ {푘푛, 푘푛 + 1}) = 1

∙ $\dfrac{k_n}{n} \xrightarrow[n\to\infty]{} a\in\mathbb{R}_+^*$ a.s.

Set $\Xi_n = \bigotimes_{k=1}^{k_n}\xi_k$ and, for $t\in[0,1]$, $\Xi^n_t = \Xi_{\lfloor nt\rfloor}\otimes\delta_{nt-\lfloor nt\rfloor}\big(\xi_{k_{\lfloor nt\rfloor}+1}\big)$. We then have the following convergence:

$$\big(\delta_{(n\sigma^2)^{-1/2}}(\Xi^n_t)\big)_{t\in[0,1]} \xrightarrow[n\to\infty]{(d)} (\mathbf{B}_{at})_{t\in[0,1]}$$

where $(\mathbf{B}_t)_{t\in[0,1]}$ is the enhanced Brownian motion and $\sigma^2 = \mathbb{E}\big[|\xi_0^{(1)}|^2\big]$.

Remark. The difference with the Donsker-type theorem from [2] is that the sums of i.i.d. r.v. are a function of $k_n$ and not simply of $n$.

Proof. We proceed by the classical method, proving first the convergence of the finite-dimensional marginals of the process and then its tightness. For the first part, we use the result from [2], which states that a (renormalized) sum of i.i.d. r.v. in $G^2(V)$ converges in law to the Brownian rough path.

Since $k_n/n$ converges a.s. and, moreover, $(k_n)_n$ is a.s. increasing, a sum of $k_n$ i.i.d. r.v. yields the same type of convergence, but with a change of time depending on $a$. Otherwise said, for $t\in[0,1]$, we have
$$\delta_{(\sigma^2 n)^{-1/2}}\,\Xi^n_t = \delta_{(k_n/n)^{1/2}}\,\delta_{(\sigma^2 k_n)^{-1/2}}\,\Xi^n_t \xrightarrow[n\to\infty]{(d)} \delta_{\sqrt a}\,\mathbf{B}_t$$
and we can then generalize this convergence to any finite-dimensional marginals using the independence of the variables and Slutsky's theorem (theorem 3.1 in [1]), as in the classical Donsker theorem. In order to prove the tightness of the process in $\mathcal{C}^{\alpha}([0,1], G^2(V))$ for $\alpha<1/2$, we use Kolmogorov's criterion (as already explained in the proof of theorem 1.1 from [13]), i.e. we have to prove that, for any $p>1$, there exists $c>0$ such that

$$\mathbb{E}\big[n^{-2p}\, d(\Xi^n_t, \Xi^n_s)^{4p}\big] \le c\,|t-s|^{2p}$$
for any $s, t\in[0,1]$. The definition of $d(\cdot,\cdot)$ implies that

$$d(\Xi^n_t, \Xi^n_s) = \Big\|\big(\delta_{ns-\lfloor ns\rfloor}\xi_{k_{\lfloor ns\rfloor}+1}\big)^{-1}\otimes\Xi_{\lfloor ns\rfloor}^{-1}\otimes\Xi_{\lfloor nt\rfloor}\otimes\delta_{nt-\lfloor nt\rfloor}\xi_{k_{\lfloor nt\rfloor}+1}\Big\| \qquad (39)$$

If 푠, 푡 ∈ [푘/푛, (푘 + 1)/푛[, then this expression becomes

$$d(\Xi^n_t, \Xi^n_s) = n(t-s)\,\big\|\xi_{k_{\lfloor ns\rfloor}+1}\big\|$$
which further gives us

$$\mathbb{E}\big[n^{-2p}\, d(\Xi^n_t, \Xi^n_s)^{4p}\big] \le M_p\, n^{2p}\,|t-s|^{4p} \le M_p\,|t-s|^{2p}$$

where $M_p$ is such that $\mathbb{E}\big[\|\xi_0\|^{4p}\big]\le M_p$. If $s\in[i/n,(i+1)/n[$ and $t\in[(i+1)/n,(i+2)/n[$, the expression (39) becomes:

$$d(\Xi^n_t, \Xi^n_s) = \Big\|\big(\delta_{ns-\lfloor ns\rfloor}\xi_{k_{\lfloor ns\rfloor}+1}\big)^{-1}\otimes\Xi_{\lfloor ns\rfloor}^{-1}\otimes\Xi_{\lfloor ns\rfloor+1}\otimes\delta_{nt-\lfloor nt\rfloor}\xi_{k_{\lfloor nt\rfloor}+1}\Big\| = \Big\|\delta_{1-(ns-\lfloor ns\rfloor)}\xi_{k_{\lfloor ns\rfloor}+1}\otimes\delta_{nt-\lfloor nt\rfloor}\xi_{k_{\lfloor nt\rfloor}+1}\Big\| \le (\lfloor nt\rfloor - ns)\,\big\|\xi_{k_{\lfloor ns\rfloor}+1}\big\| + (nt-\lfloor nt\rfloor)\,\big\|\xi_{k_{\lfloor nt\rfloor}+1}\big\| \le 2n(t-s)\,\max\big\{\big\|\xi_{k_{\lfloor ns\rfloor}+1}\big\|,\ \big\|\xi_{k_{\lfloor nt\rfloor}+1}\big\|\big\}$$
and we can conclude as in the first case. Finally, for $s\in[i/n,(i+1)/n[$ and $t\in[(i+l)/n,(i+l+1)/n[$ with $l\ge 2$, we use the properties of $d(\cdot,\cdot)$ to get

$$d(\Xi^n_t, \Xi^n_s)^{4p} \le 2^{4p}\Big(\big\|\delta_{1-(ns-\lfloor ns\rfloor)}\xi_{k_{\lfloor ns\rfloor}+1}\big\|^{4p} + \Big\|\bigotimes_{j=k_{\lfloor ns\rfloor}+2}^{k_{\lfloor nt\rfloor}}\xi_j\Big\|^{4p} + \big\|\delta_{nt-\lfloor nt\rfloor}\xi_{k_{\lfloor nt\rfloor}+1}\big\|^{4p}\Big)$$

and consequently
$$\mathbb{E}\big[n^{-2p}\, d(\Xi^n_t, \Xi^n_s)^{4p}\big] \le 2^{4p}\Big(2\,\mathbb{E}\big[n^{-2p}\,\|\xi_0\|^{4p}\big] + \mathbb{E}\Big[n^{-2p}\,\Big\|\bigotimes_{j=k_{\lfloor ns\rfloor}+2}^{k_{\lfloor nt\rfloor}}\xi_j\Big\|^{4p}\Big]\Big)$$

The first part of the right-hand side can be bounded using the fact that there exists $M_p>0$ such that $\mathbb{E}\big[n^{-2p}\|\xi_0\|^{4p}\big]\le M_p\, n^{-2p} \le M_p\,|t-s|^{2p}$, since $|t-s| > 1/n$. We transform the second one using the independence of the variables, and then what is left to prove is that there exists $c'>0$ such that

$$\mathbb{E}\Big[n^{-2p}\,\Big\|\bigotimes_{j=k_{\lfloor ns\rfloor}+2}^{k_{\lfloor nt\rfloor}}\xi_j\Big\|^{4p}\Big] = \mathbb{E}\Big[n^{-2p}\,\Big\|\bigotimes_{j=1}^{k_{\lfloor nt\rfloor}-k_{\lfloor ns\rfloor}-1}\xi_j\Big\|^{4p}\Big] \le c'\,|t-s|^{2p}$$

We use the bound $\mathbb{E}\big[\big\|\bigotimes_{k=1}^{m}\xi_k\big\|^{4p}\big] = O(m^{2p})$ proven in [2] and the inequality $k_{j+l} - k_j \le l$ to get:

$$\mathbb{E}\Big[\Big\|\bigotimes_{j=1}^{k_{\lfloor nt\rfloor}-k_{\lfloor ns\rfloor}-1}\xi_j\Big\|^{4p}\Big] = \mathbb{E}\Big[\Big\|\bigotimes_{j=1}^{k_{i+l}-k_i-1}\xi_j\Big\|^{4p}\Big] = O\big((k_{i+l}-k_i-1)^{2p}\big) = O\big((l-1)^{2p}\big) = O\big((n|t-s|)^{2p}\big)$$

We thus get a uniform bound, which completes the proof.

Lemma 3.2. Consider a sequence of r.v. $(k_n)_n$ as in lemma 3.1. Let $((0, C_n))_n$ be a sequence of uniformly bounded r.v. taking values in the centre of the group $G^2(V)$ and such that we have the following a.s. convergence for any $t\in[0,1]$:

$$\delta_{n^{-1/2}}\bigotimes_{i=1}^{k_{\lfloor nt\rfloor}}(0, C_i) \xrightarrow[n\to\infty]{} (0,\, atM) \qquad (40)$$

where $M$ is a deterministic matrix and $a\in\mathbb{R}_+^*$ is as in lemma 3.1. Then we have the following convergence in probability

$$\Big(\delta_{n^{-1/2}}\bigotimes_{i=1}^{k_{\lfloor nt\rfloor}}(0, C_i)\Big)_{t\in[0,1]} \xrightarrow[n\to\infty]{} \big((0,\, atM)\big)_{t\in[0,1]}$$
in the rough path topology $\mathcal{C}^{\alpha\text{-Hl}}([0,1], G^2(V))$ for $\alpha<1/2$.

Proof. We use proposition 3.2. The pointwise convergence is a hypothesis of the lemma, so what is left to prove is the uniform bound for the

Carnot-Carathéodory norm. On one hand, the upper bound deduced from the norm equivalence (35) implies that

$$\forall (0,y)\in G^2(V),\qquad \|(0,y)\| \le |y|_{V\otimes V}^{1/2}$$

Using the fact that the sequence $(k_n)_n$ satisfies $k_n - k_m\le n - m$ for $m\le n$, we obtain
$$\Big\|\bigotimes_{i=k_{\lfloor ns\rfloor}+1}^{k_{\lfloor nt\rfloor}}(0, C_i)\Big\|^2 \le \Big|\sum_{i=k_{\lfloor ns\rfloor}+1}^{k_{\lfloor nt\rfloor}} C_i\Big|_{V\otimes V} \le (\lfloor nt\rfloor - \lfloor ns\rfloor)\,\xi$$
where $\xi$ is such that $|C_i|_{V\otimes V}\le\xi$ for all $i\in\mathbb{N}$. We then have:
$$\forall n\ge 1,\qquad \mathbb{E}\Big[\sup_{s\ne t}\frac{\big\|\delta_{n^{-1/2}}\bigotimes_{i=k_{\lfloor ns\rfloor}+1}^{k_{\lfloor nt\rfloor}}(0, C_i)\big\|}{|t-s|^{1/2}}\Big] \le \sup_{s\ne t}\frac{(\lfloor nt\rfloor - \lfloor ns\rfloor)^{1/2}\,\xi^{1/2}}{(n|t-s|)^{1/2}}$$
which is uniformly bounded, as the $(0, C_i)$ are supposed to be uniformly bounded.

We will now give the proof of theorem 2.3. Before doing so, we will give a slightly different formulation of it and explain why we do so. As before, 퐸 denotes a finite state space and 푉 a finite-dimensional vector space.

Theorem (theorem 2.3 for non-centred variables). Let $(R_n, \mathbf{X}_n)_n$ be a hidden Markov walk on $E\times G^2(V)$ that satisfies condition (13). As in property 3.1, we can decompose
$$\forall n\ge 1,\qquad \mathbf{X}_n = \bigotimes_{k=1}^{n} F_k$$
where $F_k = \mathbf{X}_{k-1}^{-1}\otimes\mathbf{X}_k$. Furthermore, set $\beta = \mathbb{E}[T_1]$, $v = \big(\delta_{\beta^{-1}}\mathbb{E}\big[\mathbf{X}_{T_1}^{(1)}\big],\, 0\big)$, $\tilde F_k = v^{-1}\otimes F_k$ and $\tilde{\mathbf{X}}_n = \bigotimes_{k=1}^{n}\tilde F_k$. In this case, the geodesic embedding of $(\tilde{\mathbf{X}}_n)_n$ in $G^2(V)$ will be

⌊푛푡⌋ ˜ 푛 ਁ ˜ ˜ ∀푡 ∈ [0, 1], X푡 = 퐹푘 ⊗ 훿푛푡−⌊푛푡⌋퐹⌊푛푡⌋+1 푘=1

Let 휌 = 휌 be an embedding encoded by (˜ ) as in proposition 3.3. If (1) 푁 푁 X푛 푛 X푇1 is non-degenerate, we can suppose the covariance matrix of (1) is 퐶퐼 without X푇1 푑 loss of generality and we have the following convergence in 풞훼([0, 1], 퐺2(푉 )) for 훼 < 1/2:

(︀ ˜ 푛)︀ (푑) (︀ 푆푡푟푎푡 )︀ 훿(푛퐶훽−1)−1/2 푡 −→ 푡 ⊗ (0, 푡Γ휌) X 푡∈[0,1] 푛→∞ B 푡∈[0,1]

31 where B푆푡푟푎푡 is the Brownian motion enhanced with second-level Stratonovich in- tegrals, i.e. as the second-level limit from theorem 2.3, and Γ휌 is the deterministic antisymmetric matrix from theorem 2.3. Alternatively, Γ휌 can be represented as

⎡ ⎤ [︃ 푇 ]︃ 1 ∑︁ ∑︁1 Γ = 퐹˜(1) ⊗ 퐹˜(1) − 퐹˜(1) ⊗ 퐹˜(1) + 푎 (41) 휌 E ⎣2 푝 푚 푚 푝 ⎦ E 푝 1≤푝<푚≤푇1 푝=1

Remarks.
∙ $(R_n, \tilde{\mathbf{X}}_n)_n$ is a HMW in the sense of definition 2.2, and it satisfies the conditions from theorem 2.3: in particular, $\mathbb{E}\big[ \tilde{\mathbf{X}}_{T_1}^{(1)} \big] = 0$. Thus, the interest of the present version of theorem 2.3 resides in the fact that it allows one to treat more general HMW on $G^2(V)$ by recentring their excursions.
∙ If we compare the statement of theorem 2.2 given in the introduction to the present one, a fundamental difference is that the first one was more analytic, whereas the present one is more algebraic. We avoided the "heavy" rough path formulation in the introduction, as the rough path setting appears only further in the paper (section 3.2).
∙ In particular, instead of presenting the first and second levels as random processes in the uniform topology ($S_{B,i}(t)$), the limit is presented here as the rough path process $\big( \mathbf{B}_t^{Strat} \otimes (0, t\Gamma_\rho) \big)_{t \in [0,1]}$. By doing so, we stress the area anomaly obtained at the limit, given by $(0, t\Gamma_\rho)$.

Proof. Since the $F_k$ are in $G^2(V)$, the particular form of the elements of this group stated in (5) gives the decomposition:
\[ \forall k \geq 1, \quad F_k^{(2)} = \frac{1}{2} F_k^{(1)} \otimes F_k^{(1)} + a_k \tag{42} \]
where $a_k = Anti(F_k^{(2)})$ is the antisymmetric part of $F_k^{(2)}$. Thus, using (42), we have the following decomposition for the $\tilde F_k$:
\[ \forall k \geq 1, \quad \tilde F_k = \Big( \tilde F_k^{(1)}, \frac{1}{2} \tilde F_k^{(1)} \otimes \tilde F_k^{(1)} \Big) \otimes (0, a_k) \tag{43} \]
where $\big( \tilde F_k^{(1)}, \frac{1}{2} \tilde F_k^{(1)} \otimes \tilde F_k^{(1)} \big) \in G^2(V)$ is an element with a symmetric second component and $(0, a_k) \in G^2(V)$ is a "pure area" element (i.e. its first component is zero and its second component antisymmetric) which lies in the centre of the group.
We denote by $\kappa(n)$ the rank of the excursion to which $n$ belongs, i.e. the unique integer such that $T_{\kappa(n)} \leq n < T_{\kappa(n)+1}$, where the $T_i$ are as defined in (6). Based on the fact that the elements of type $(0, a) \in G^2(V)$ are in the

centre of the group and thus commute with all the others, we have the following decomposition of $\tilde{\mathbf{X}}_t^n$:
\begin{align*}
\tilde{\mathbf{X}}_t^n &= \bigotimes_{k=1}^{\lfloor nt \rfloor} \Big( \big( \tilde F_k^{(1)}, \tfrac{1}{2} \tilde F_k^{(1)} \otimes \tilde F_k^{(1)} \big) \otimes (0, a_k) \Big) \otimes \delta_{nt - \lfloor nt \rfloor} \tilde F_{\lfloor nt \rfloor + 1} \\
&= \bigotimes_{k=1}^{\kappa(\lfloor nt \rfloor)} \Bigg( \sum_{p = T_{k-1}+1}^{T_k} \tilde F_p^{(1)},\ \frac{1}{2} \sum_{p = T_{k-1}+1}^{T_k} \tilde F_p^{(1)} \otimes \tilde F_p^{(1)} + \sum_{T_{k-1}+1 \leq p < m \leq T_k} \tilde F_p^{(1)} \otimes \tilde F_m^{(1)} \Bigg) \\
&\qquad \otimes \Bigg( \bigotimes_{k=1}^{\kappa(\lfloor nt \rfloor)} \bigotimes_{p = T_{k-1}+1}^{T_k} (0, a_p) \Bigg) \otimes \Bigg( \bigotimes_{k = T_{\kappa(\lfloor nt \rfloor)}+1}^{\lfloor nt \rfloor} \tilde F_k \Bigg) \otimes \delta_{nt - \lfloor nt \rfloor} \tilde F_{\lfloor nt \rfloor + 1} \\
&= \bigotimes_{k=1}^{\kappa(\lfloor nt \rfloor)} \Bigg( \sum_{p = T_{k-1}+1}^{T_k} \tilde F_p^{(1)},\ \frac{1}{2} \Big( \sum_{p = T_{k-1}+1}^{T_k} \tilde F_p^{(1)} \Big)^{\otimes 2} \Bigg) \\
&\qquad \otimes \Bigg( \bigotimes_{k=1}^{\kappa(\lfloor nt \rfloor)} \Big( 0,\ \frac{1}{2} \sum_{T_{k-1}+1 \leq p < m \leq T_k} \tilde F_p^{(1)} \otimes \tilde F_m^{(1)} - \tilde F_m^{(1)} \otimes \tilde F_p^{(1)} + \sum_{p = T_{k-1}+1}^{T_k} a_p \Big) \Bigg) \\
&\qquad \otimes \Bigg( \bigotimes_{k = T_{\kappa(\lfloor nt \rfloor)}+1}^{\lfloor nt \rfloor} \tilde F_k \Bigg) \otimes \delta_{nt - \lfloor nt \rfloor} \tilde F_{\lfloor nt \rfloor + 1} \\
&= \mathcal{P}_t^n \otimes \mathcal{A}_t^n \otimes \mathcal{R}_t^n
\end{align*}
where
\[ \mathcal{P}_t^n = \bigotimes_{k=1}^{\kappa(\lfloor nt \rfloor)} \Bigg( \sum_{p = T_{k-1}+1}^{T_k} \tilde F_p^{(1)},\ \frac{1}{2} \Big( \sum_{p = T_{k-1}+1}^{T_k} \tilde F_p^{(1)} \Big)^{\otimes 2} \Bigg) \]
is the term that concatenates the excursions,

\[ \mathcal{A}_t^n = \bigotimes_{k=1}^{\kappa(\lfloor nt \rfloor)} \Bigg( 0,\ \frac{1}{2} \sum_{T_{k-1}+1 \leq p < m \leq T_k} \tilde F_p^{(1)} \otimes \tilde F_m^{(1)} - \tilde F_m^{(1)} \otimes \tilde F_p^{(1)} + \sum_{p = T_{k-1}+1}^{T_k} a_p \Bigg) \]
is a "pure area" process that takes into account the antisymmetric parts of the $F_k^{(2)}$, as well as the stochastic areas of the pseudo-excursions, and

\[ \mathcal{R}_t^n = \Bigg( \bigotimes_{k = T_{\kappa(\lfloor nt \rfloor)}+1}^{\lfloor nt \rfloor} \tilde F_k \Bigg) \otimes \delta_{nt - \lfloor nt \rfloor} \tilde F_{\lfloor nt \rfloor + 1} \]
is the rest left from the geodesic embedding.
We will now compute separately the limit of each of the three terms. Let us first consider the term $(\delta_{n^{-1/2}} \mathcal{P}_t^n)_{t \in [0,1]}$. By construction, the variables

\[ \Bigg( \sum_{p = T_{k-1}+1}^{T_k} \tilde F_p^{(1)},\ \frac{1}{2} \Big( \sum_{p = T_{k-1}+1}^{T_k} \tilde F_p^{(1)} \Big)^{\otimes 2} \Bigg) \]
are i.i.d. and centred. Moreover, $\kappa(\lfloor nt \rfloor)/n \to t\beta^{-1}$ a.s. as $n \to \infty$, since $\kappa(n)$ is the number of full excursions accomplished up to time $n$. Thus, since the function $\kappa$ satisfies the conditions of lemma 3.1, we can deduce that:
\[ \Bigg( \delta_{(nC\beta^{-1})^{-1/2}} \bigotimes_{k=1}^{\kappa(\lfloor nt \rfloor)} \Big( \sum_{p = T_{k-1}+1}^{T_k} \tilde F_p^{(1)},\ \frac{1}{2} \Big( \sum_{p = T_{k-1}+1}^{T_k} \tilde F_p^{(1)} \Big)^{\otimes 2} \Big) \Bigg)_{t \in [0,1]} \xrightarrow[n \to \infty]{(d)} \big( \mathbf{B}_t^{Strat} \big)_{t \in [0,1]} \]
in the rough path topology.
We now have to study the convergence of $(\delta_{n^{-1/2}} \mathcal{A}_t^n)_{t \in [0,1]}$. We notice that the two tensor products containing $\kappa(\lfloor nt \rfloor)$ terms are "sums" of i.i.d.r.v., since each term depends entirely on a different excursion. The law of large numbers thus applies to both of them, and we get that, a.s., for $t \in [0,1]$ fixed:

\[ \delta_{(n\beta^{-1})^{-1/2}} \bigotimes_{k=1}^{\kappa(\lfloor nt \rfloor)} \Bigg( 0,\ \frac{1}{2} \sum_{T_{k-1}+1 \leq p < m \leq T_k} \tilde F_p^{(1)} \otimes \tilde F_m^{(1)} - \tilde F_m^{(1)} \otimes \tilde F_p^{(1)} \Bigg) \xrightarrow[n \to \infty]{} (0, t\Gamma) \]
where
\[ \Gamma = \mathbb{E}\Bigg[ \frac{1}{2} \sum_{1 \leq p < m \leq T_1} \tilde F_p^{(1)} \otimes \tilde F_m^{(1)} - \tilde F_m^{(1)} \otimes \tilde F_p^{(1)} \Bigg] \]
is the area anomaly we recover. For the second part of the sum, we have, for $t \in [0,1]$ fixed:
\[ \delta_{(n\beta^{-1})^{-1/2}} \bigotimes_{k=1}^{\kappa(\lfloor nt \rfloor)} \sum_{p = T_{k-1}+1}^{T_k} (0, a_p) \xrightarrow[n \to \infty]{} (0, t\Gamma_0) \]
where
\[ \Gamma_0 = \mathbb{E}\Bigg[ \sum_{p=1}^{T_1} a_p \Bigg] \]
is generated by the antisymmetric parts of the second-level components of the $F_k$. Since $\kappa$ is as the function from lemma 3.2, the $\tilde F_p$ are a.s. uniformly bounded under condition (13) and $T_1$ has finite moments of all orders, the conditions of lemma 3.2 are satisfied, and we deduce the following convergence in probability in the rough path topology:
\[ (\delta_{n^{-1/2}} \mathcal{A}_t^n)_{t \in [0,1]} \xrightarrow[n \to \infty]{} (0, t\beta^{-1}\Gamma_\rho)_{t \in [0,1]} \tag{44} \]
where $\Gamma_\rho = \Gamma + \Gamma_0$.
We still have to deal with the residue $(\delta_{n^{-1/2}} \mathcal{R}_t^n)_{t \in [0,1]}$. Its first part contains $\lfloor nt \rfloor - T_{\kappa(\lfloor nt \rfloor)} \leq (T_{\kappa(\lfloor nt \rfloor)+1} - 1) - T_{\kappa(\lfloor nt \rfloor)}$ terms, and $\big\| \delta_{nt - \lfloor nt \rfloor} \tilde F_{\lfloor nt \rfloor + 1} \big\| \leq \big\| \tilde F_{\lfloor nt \rfloor + 1} \big\|$ adds one more term. Moreover, since by (13) the $\tilde F_k$ are uniformly bounded, there exists $K > 0$ such that $\sup_k \big\| \tilde F_k \big\| < K$. Therefore, we get:
\[ \sup_{s \neq t} \mathbb{E}\Bigg[ \frac{\big\| \delta_{n^{-1/2}} \big( (\mathcal{R}_s^n)^{-1} \otimes \mathcal{R}_t^n \big) \big\|}{|t - s|^{1/2}} \Bigg] \leq K \sup_{s \neq t} \mathbb{E}\Bigg[ \frac{T_{\kappa(\lfloor nt \rfloor)} - T_{\kappa(\lfloor ns \rfloor)} + \lfloor nt \rfloor - \lfloor ns \rfloor}{(n |t - s|)^{1/2}} \Bigg] \]

This quantity is bounded since $\lfloor nt \rfloor = \lfloor ns \rfloor$ for $s, t \in [i/n, (i+1)/n]$ for any $i = 1, \ldots, n-1$. We also have the convergence in probability
\[ \forall t \in [0,1], \quad \delta_{n^{-1/2}} (\mathcal{R}_t^n) \xrightarrow[n \to \infty]{} 0 \]
We can thus use once again proposition 3.2 to conclude to the following convergence in probability in the rough path topology:
\[ (\delta_{n^{-1/2}} (\mathcal{R}_t^n))_{t \in [0,1]} \xrightarrow[n \to \infty]{} 0 \]
Finally, putting together the convergences of $(\delta_{n^{-1/2}} \mathcal{P}_t^n)_{t \in [0,1]}$, of $(\delta_{n^{-1/2}} \mathcal{A}_t^n)_{t \in [0,1]}$ and of $(\delta_{n^{-1/2}} \mathcal{R}_t^n)_{t \in [0,1]}$, our result follows from Slutsky's theorem:
\[ \big( \delta_{(nC\beta^{-1})^{-1/2}}\, \mathcal{P}_t^n \otimes \mathcal{A}_t^n \otimes \mathcal{R}_t^n \big)_{t \in [0,1]} \xrightarrow[n \to \infty]{(d)} \big( \mathbf{B}_t^{Strat} \otimes (0, t\Gamma_\rho) \big)_{t \in [0,1]} \]
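The mechanism behind the area anomaly can be seen on a minimal toy example (ours, not from the paper): a walk whose hidden state cycles deterministically through four states, each emitting one side of a counterclockwise unit square. The rescaled walk converges to the zero path, yet its Lévy area grows linearly in the number of excursions, which is exactly the kind of linear-in-time second-level residue captured by $(0, t\Gamma_\rho)$. The state space, emission map and function names below are our own.

```python
# Toy "hidden Markov walk": hidden state k mod 4 emits one side of a
# counterclockwise unit square. Each 4-step excursion returns to the origin
# but contributes +1 to the Levy area, so the area grows linearly in time
# even though the walk itself stays bounded.
EMIT = {0: (1, 0), 1: (0, 1), 2: (-1, 0), 3: (0, -1)}

def walk_and_levy_area(n_steps):
    x = y = 0.0
    area = 0.0  # A_n = (1/2) sum_{k1<k2} (F_{k1}^x F_{k2}^y - F_{k1}^y F_{k2}^x)
    for k in range(n_steps):
        fx, fy = EMIT[k % 4]             # increment emitted by hidden state k mod 4
        area += 0.5 * (x * fy - y * fx)  # incremental update of the Levy area
        x, y = x + fx, y + fy
    return (x, y), area

pos, area = walk_and_levy_area(400)  # 100 full excursions
print(pos, area)                     # → (0.0, 0.0) 100.0
```

Each excursion contributes an area of exactly $1$: after 100 excursions the walk is back at the origin while the area equals $100$, so in the scaling limit the first level vanishes but the pure-area part survives as a drift of the second level.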

4 Iterated structures behind discrete time and discrete space Markov chains.

In this section we study more thoroughly the algebraic properties of HMW through the corresponding iterated occupation times.

4.1 Shuffle and quasi-shuffle products.
The definitions and properties of this section are mainly inspired by [9].

Definitions.
Definition 4.1 (the quasi-shuffle product). Let $(A, \cdot)$ be an algebra, $a, b \in A$, and $x, y$ obtained by concatenation of a finite number of elements of $A$. Then the quasi-shuffle product $*$ is defined recursively by:
\[ ax * by = [a, b](x * y) + a(x * by) + b(ax * y) \]
where $[\cdot, \cdot]$ is a commutative operation on $A$. The shuffle product can be viewed as a particular case of the quasi-shuffle product. More specifically, it is the case when $a \cdot b = 0$ for any $a, b \in A$. Contrary to the shuffle product, the quasi-shuffle product is not always commutative.

Definition 4.2 (shuffle product). Let $(A, \cdot)$ be an algebra, $a, b \in A$, and $x, y$ obtained by concatenation of a finite number of elements of $A$. Then the shuffle product is defined recursively by:
\[ ax \shuffle by = a(x \shuffle by) + b(ax \shuffle y) \]

A more informal way of putting it is as follows:

Definition 4.3 (and corollary of 4.2). Consider the set of permutations $\Sigma_{m,n} = \{ \sigma \in \Sigma_{m+n} : \sigma(1) < \ldots < \sigma(m),\ \sigma(m+1) < \ldots < \sigma(m+n) \}$. For $x = x_1 \ldots x_m$ and $y = y_1 \ldots y_n$ with $x_i, y_j \in A$, the shuffle product of $x$ and $y$ is given by the sum of the images of the concatenated word $xy$ under all the permutations of $\Sigma_{m,n}$: $x \shuffle y = \sum_{\sigma \in \Sigma_{m,n}} \sigma(xy)$ (for example, $ab \shuffle c = abc + acb + cab$).
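As a sanity check of definition 4.3, the recursion of definition 4.2 can be implemented directly on words represented as tuples of letters. This short sketch is ours, not part of the paper; it keeps multiplicities by returning a list of words.

```python
def shuffle(x, y):
    """Shuffle product of two words (tuples of letters).

    Implements the recursion ax ⧢ by = a(x ⧢ by) + b(ax ⧢ y);
    the formal sum of words is represented as a list (with multiplicities).
    """
    if not x:
        return [y]
    if not y:
        return [x]
    a, b = x[0], y[0]
    return [(a,) + w for w in shuffle(x[1:], y)] + \
           [(b,) + w for w in shuffle(x, y[1:])]

# Example from definition 4.3: ab ⧢ c = abc + acb + cab
print(sorted(shuffle(('a', 'b'), ('c',))))
# → [('a', 'b', 'c'), ('a', 'c', 'b'), ('c', 'a', 'b')]
```

The number of words in $x \shuffle y$ is the binomial coefficient $\binom{m+n}{m}$, in agreement with the description through the permutation set $\Sigma_{m,n}$.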

Quasi-shuffle structure of iterated sums. One of the main examples of sets we can endow with a shuffle product is the set of iterated integrals on a vector space $V$. Denote $\mathbf{e}^{i_1,\ldots,i_n} = e_{i_1} \otimes \ldots \otimes e_{i_n}$ and $\mathbf{e}^{j_1,\ldots,j_m} = e_{j_1} \otimes \ldots \otimes e_{j_m}$, where the $e_k$ are vectors from the canonical basis of $V$. We then define, for an integrable path $y$ in $V$,
\[ S_{y,n;i_1,\ldots,i_n}(t) = \langle \mathbf{e}^{i_1,\ldots,i_n}, S_{y,n}(t) \rangle = \int_{0 < t_1 < \ldots < t_n < t} dy_{t_1}^{i_1} \ldots dy_{t_n}^{i_n} \]
and idem for $\mathbf{e}^{j_1,\ldots,j_m}$, with $\langle \cdot, \cdot \rangle$ the induced scalar product on $V^{\otimes l}$. The shuffle product of two iterated integrals depending both on the same path $y$ is then given by:
\[ S_{y,n;i_1,\ldots,i_n}(t)\, S_{y,m;j_1,\ldots,j_m}(t) = S_{y,n+m;(i_1,\ldots,i_n) \shuffle (j_1,\ldots,j_m)}(t) \tag{45} \]
showing that the shuffle product of two iterated integrals is a linear combination of iterated integrals. If we now consider a sequence $x = (x_n)_n$ in $V$, the components of the iterated sum $\tilde S_{x,n}(N)$ from (24), given by
\[ \tilde S_{x,n;i_1,\ldots,i_n}(N) = \langle \mathbf{e}^{i_1,\ldots,i_n}, \tilde S_{x,n}(N) \rangle = \sum_{1 \leq k_1 < \ldots < k_n \leq N} x_{k_1}^{i_1} \ldots x_{k_n}^{i_n} \tag{46} \]
are discrete versions of (45). However, the product of two elements of this kind is not a shuffle product anymore, as shown below:
\[ \tilde S_{x,n;i_1,\ldots,i_n}(N)\, \tilde S_{x,m;j_1,\ldots,j_m}(N) = \tilde S_{x,n+m;(i_1,\ldots,i_n) \shuffle (j_1,\ldots,j_m)}(N) + r_{x,n,m;i_1,\ldots,i_n;j_1,\ldots,j_m}(N) \tag{47} \]
where $r_{x,n,m;i_1,\ldots,i_n;j_1,\ldots,j_m}(N)$ is a rest that comes from the fact that, when multiplying two iterated sums, we get sets of indices that are not necessarily strictly ordered. Moreover, the result of the multiplication cannot be put in the form of a function $\tilde S$, so we cannot apply the new product exclusively to $\mathbf{e}^{i_1,\ldots,i_n}$ and $\mathbf{e}^{j_1,\ldots,j_m}$. We thus need a more general product that better keeps track of this rest. The product of two iterated sums of the type $\tilde S_{x,n;i_1,\ldots,i_n}(N)$ is a quasi-shuffle product, as already stated in property 2.2.
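Identity (47) can be verified numerically in the simplest case $n = m = 1$, where the rest $r$ is the diagonal sum $\sum_k x_k^i x_k^j$. The following sketch is our own illustration (names are not from the paper); it implements (46) directly:

```python
import itertools
import random
from math import prod

def iterated_sum(x, word):
    """Discrete iterated sum (46): sum over 1 <= k_1 < ... < k_n <= N
    of x_{k_1}^{i_1} ... x_{k_n}^{i_n}, for word = (i_1, ..., i_n)."""
    return sum(
        prod(x[k][i] for k, i in zip(ks, word))
        for ks in itertools.combinations(range(len(x)), len(word))
    )

random.seed(0)
x = [[random.uniform(-1, 1) for _ in range(2)] for _ in range(10)]  # increments in R^2

# Identity (47) for n = m = 1: the product of two first-order iterated sums
# is the shuffle part S(i,j) + S(j,i) plus the diagonal rest r = sum_k x_k^i x_k^j.
i, j = 0, 1
lhs = iterated_sum(x, (i,)) * iterated_sum(x, (j,))
shuffle_part = iterated_sum(x, (i, j)) + iterated_sum(x, (j, i))
rest = sum(xk[i] * xk[j] for xk in x)
assert abs(lhs - (shuffle_part + rest)) < 1e-12
```

The diagonal rest is exactly the quasi-shuffle correction term $[a, b]$ of definition 4.1, here realized as the pointwise product of increments.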

4.2 From geometric to non-geometric rough paths through hidden Markov walks

4.2.1 Geometric rough paths and shuffle products.

We know that a piecewise linear embedding of $(X_n)_n$ is canonically represented by a (geometric) rough path as follows:
Property 4.1. Denote by $(X_t^{(N)})_t$ a smooth embedding of $(X_n)_n$. The canonical rough path in $G^2(V)$ corresponding to this embedding is given by the 2-step signature
\[ \forall t \in [0,1], \quad S_2(X^{(N)})_{0,t} = \Big( X_t^{(N)}, \int_{0 < s_1 < s_2 < t} dX_{s_1}^{(N)} \otimes dX_{s_2}^{(N)} \Big) \]
The path $t \mapsto S_2(X^{(N)})_{0,t}$ is in $\mathcal{C}^{1\text{-}var}([0,1], G^2(V))$.
We thus obtain a sequence of geometric rough paths and can study the convergence in law of $(\delta_{N^{-1/2}} S_2(X^{(N)})_{0,t})_t$ in the topology of $\mathcal{C}^{\alpha\text{-H\"ol}}([0,1], G^2(V))$ for $\alpha < 1/2$. The limit (if it exists) will be a geometric $\alpha$-H\"older rough path, i.e. an element of $\mathcal{C}^{\alpha\text{-H\"ol}}([0,1], G^2(V))$, for $\alpha < 1/2$. Since $G^2(V)$ is the space in which geometric rough paths take their values, an element $(g^{(1)}, g^{(2)}) \in G^2(V)$ needs to satisfy the shuffle product relation in the sense of the following property.
Property 4.2. An element $(g^{(1)}, g^{(2)}) \in T_1^{(2)}(V)$ is in $G^2(V)$ if and only if it satisfies the relation
\[ \langle g^{(1)}, e_i \rangle \langle g^{(1)}, e_j \rangle = \langle g^{(2)}, e_i \shuffle e_j \rangle \tag{48} \]
where $e_i, e_j$ are in the canonical basis of $V$ and $e_i \shuffle e_j = e_i \otimes e_j + e_j \otimes e_i$.
Proof. We can decompose the second component into a symmetric and an antisymmetric part:
\[ g^{(2)} = \mathrm{Sym}(g^{(2)}) + \mathrm{Antisym}(g^{(2)}) \]
On one hand, we have
\[ \langle \mathrm{Sym}(g^{(2)}), e_i \shuffle e_j \rangle = \frac{1}{2} \langle g^{(1)} \otimes g^{(1)}, e_i \otimes e_j + e_j \otimes e_i \rangle = \langle g^{(1)}, e_i \rangle \langle g^{(1)}, e_j \rangle \]
On the other hand, for the antisymmetric term we get
\[ \langle \mathrm{Antisym}(g^{(2)}), e_i \shuffle e_j \rangle = \langle \mathrm{Antisym}(g^{(2)}), e_i \otimes e_j \rangle - \langle \mathrm{Antisym}(g^{(2)}), e_j \otimes e_i \rangle = 0 \]
which completes the proof.

Remark. The condition (48) is a version of the relation which is part of the abstract definition of geometric rough paths given in [7].
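Relation (48) is easy to test numerically on the canonical lift of a piecewise linear path. The sketch below is ours; it assumes the standard level-2 signature of a concatenation of linear segments, $g^{(2)} = \sum_{k_1 < k_2} F_{k_1} \otimes F_{k_2} + \frac{1}{2} \sum_k F_k^{\otimes 2}$, and checks the shuffle identity coordinate by coordinate:

```python
import random
random.seed(1)

d, n = 3, 20
F = [[random.gauss(0, 1) for _ in range(d)] for _ in range(n)]  # increments F_k

# Level-2 signature of the piecewise linear path with increments F_k:
# g2 = sum_{k1<k2} F_{k1} (x) F_{k2} + (1/2) sum_k F_k (x) F_k
g1 = [sum(Fk[i] for Fk in F) for i in range(d)]
g2 = [[0.0] * d for _ in range(d)]
for k2 in range(n):
    for k1 in range(k2):
        for i in range(d):
            for j in range(d):
                g2[i][j] += F[k1][i] * F[k2][j]
for k in range(n):
    for i in range(d):
        for j in range(d):
            g2[i][j] += 0.5 * F[k][i] * F[k][j]

# Shuffle relation (48): <g1, e_i><g1, e_j> = <g2, e_i (x) e_j + e_j (x) e_i>
for i in range(d):
    for j in range(d):
        assert abs(g1[i] * g1[j] - (g2[i][j] + g2[j][i])) < 1e-9
print("shuffle relation (48) holds for the signature")
```

The check succeeds because the symmetric part of the second level of the signature is exactly $\frac{1}{2} g^{(1)} \otimes g^{(1)}$, as used in the proof of property 4.2.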

4.2.2 A discrete construction for non-geometric rough paths.
We now present a way of constructing non-geometric rough paths out of a hidden Markov walk $(R_n, X_n)_n$. Instead of first constructing a continuous path $(X_t^{(N)})_t$ in $V$ and then associating a rough path to it, we will first construct a sequence $(\mathbf{Y}_n)_n$ in $T_1^{(2)}(V)$ and then associate an embedding to it. Let us consider the sequence $(\mathbf{Y}_n)_n$ as in (17), i.e.
\[ \mathbf{Y}_n = \Big( X_n, \sum_{1 \leq k_1 < k_2 \leq n} F_{k_1} \otimes F_{k_2} \Big) \in T_1^{(2)}(V) \]
and explain how to connect the non-geometric nature of the corresponding rough path to the quasi-shuffle product. In the case of the sequence $(\mathbf{Y}_k)_k$, we have:
\begin{align}
\langle X_n, e_i \rangle \langle X_n, e_j \rangle &= \sum_{k_1, k_2 = 1}^{n} F_{k_1}^{(i)} F_{k_2}^{(j)} \nonumber \\
&= \sum_{1 \leq k_1 < k_2 \leq n} F_{k_1}^{(i)} F_{k_2}^{(j)} + \sum_{1 \leq k_2 < k_1 \leq n} F_{k_1}^{(i)} F_{k_2}^{(j)} + \sum_{k=1}^{n} F_k^{(i)} F_k^{(j)} \tag{49} \\
&= \Big\langle \sum_{1 \leq k_1 < k_2 \leq n} F_{k_1} \otimes F_{k_2}, e_i \shuffle e_j \Big\rangle + \sum_{k=1}^{n} \langle F_k^{\otimes 2}, e_i \otimes e_j \rangle \tag{50} \\
&= \Big\langle \sum_{1 \leq k_1 < k_2 \leq n} F_{k_1} \otimes F_{k_2} + \frac{1}{2} \sum_{k=1}^{n} F_k^{\otimes 2},\ e_i \shuffle e_j \Big\rangle \nonumber
\end{align}
Once again, we see that (50) does not coincide with (48). Moreover, if we identify $e_i$ as corresponding to the index $k_1$ and $e_j$ as corresponding to the index $k_2$, the line (49) in the computation above can be identified with the quasi-shuffle product: the sums over $\{k_1 < k_2\} \cup \{k_2 < k_1\}$ give the shuffle product part, whereas the last sum over $\{k_1 = k_2\}$ gives the quasi-shuffle rest. From all the above, we can draw the following conclusion:

Property 4.3. The sequence $(\mathbf{Y}_n)_n$ takes values in $T_1^{(2)}(V) \setminus G^2(V)$. In particular, any embedding $(\mathbf{Y}_t^{(N)})_t$ of $(\mathbf{Y}_n)_n$, no matter how smooth, will be a non-geometric rough path.
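Property 4.3 can likewise be checked numerically: for $(\mathbf{Y}_n)_n$ the defect in the shuffle relation (48) is exactly the diagonal sum $\sum_k F_k \otimes F_k$, which is nonzero. A small sketch (ours, not from the paper):

```python
import random
random.seed(2)

d, n = 2, 15
F = [[random.gauss(0, 1) for _ in range(d)] for _ in range(n)]  # increments F_k

X = [sum(Fk[i] for Fk in F) for i in range(d)]  # first level X_n
# Second level of Y_n: strictly ordered iterated sum sum_{k1<k2} F_{k1} (x) F_{k2}
Y2 = [[sum(F[k1][i] * F[k2][j] for k2 in range(n) for k1 in range(k2))
       for j in range(d)] for i in range(d)]

# The defect in the shuffle relation (48) equals the diagonal term
# sum_k F_k (x) F_k, which is nonzero: hence Y_n is not in G^2(V).
for i in range(d):
    for j in range(d):
        defect = X[i] * X[j] - (Y2[i][j] + Y2[j][i])
        diagonal = sum(F[k][i] * F[k][j] for k in range(n))
        assert abs(defect - diagonal) < 1e-9
assert sum(F[k][0] ** 2 for k in range(n)) > 0  # defect at (i, j) = (0, 0) is positive
```

Compared with the signature check above, the only difference is the missing $\frac{1}{2} \sum_k F_k^{\otimes 2}$ in the second level, which is precisely the quasi-shuffle rest of (49).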

4.3 Getting back to proposition 2.1.
We want to study the convergence of (17) and the convergence of the rough paths $(\mathbf{Y}_t^{(N)})_{t \in [0,1]}$ which are obtained from the sequence $(\mathbf{Y}_n)_n$ by a continuous embedding in the sense of definition 3.4.
The first thing we need to notice is that the sequence $(\mathbf{Y}_n)_n$ is not in $G^2(V)$. In the present case, we have
\[ \mathrm{Sym}\Big( \sum_{1 \leq k_1 < k_2 \leq n} F_{k_1} \otimes F_{k_2} \Big) = \frac{1}{2} \Big( \sum_{1 \leq k_1 < k_2 \leq n} F_{k_1} \otimes F_{k_2} + \sum_{1 \leq k_1 < k_2 \leq n} F_{k_2} \otimes F_{k_1} \Big) = \frac{1}{2} X_n \otimes X_n - \frac{1}{2} \sum_{k=1}^{n} F_k^{\otimes 2} \]
and thus we have an extra term that does not depend on the first level $X_n$. In order to study the convergence of the non-geometric rough paths $(\mathbf{Y}_t^{(N)})_t$ associated to $(\mathbf{Y}_n)_n$ in property 4.3, we need to answer the following questions:

∙ What embedding do we choose for $(\mathbf{Y}_n)_n$?
∙ What is the topology to consider when studying the convergence?
∙ What will be the nature of the limit rough path (if it exists)?

We start with a few preliminary results. First, we want to show that property 4.1 can be used to construct a suitable embedding for $(\mathbf{Y}_n)_n$. This is a consequence of the following lemma:

Lemma 4.1. Choose $(X_t^{(N)})_t$ to be the geodesic embedding of $(X_n)_n$ and suppose $X_0 = 0$ a.s. For all $N \geq 1$ and $1 \leq i \leq N$, we have:
\[ S_2(X^{(N)})_{0,i/N} = \mathbf{Y}_i \otimes \Big( 0, \sum_{k=1}^{i} F_k^{\otimes 2} \Big) \]

Proof. For any $N \geq 1$ and $t \in [0,1]$, we can decompose the second component of $S_2(X^{(N)})_{0,t}$ into a symmetric and an antisymmetric part as follows:
\[ \int_{0 < s_1 < s_2 < t} dX_{s_1}^{(N)} \otimes dX_{s_2}^{(N)} = \frac{1}{2} \int_{0 < s_1 < s_2 < t} dX_{s_1}^{(N)} \otimes dX_{s_2}^{(N)} + dX_{s_2}^{(N)} \otimes dX_{s_1}^{(N)} + \frac{1}{2} \int_{0 < s_1 < s_2 < t} dX_{s_1}^{(N)} \otimes dX_{s_2}^{(N)} - dX_{s_2}^{(N)} \otimes dX_{s_1}^{(N)} \]
Since $(X_t^{(N)})_t$ is piecewise linear, the symmetric part becomes
\[ \frac{1}{2} \int_{0 < s_1 < s_2 < t} dX_{s_1}^{(N)} \otimes dX_{s_2}^{(N)} + dX_{s_2}^{(N)} \otimes dX_{s_1}^{(N)} = \frac{1}{2} (X_t^{(N)})^{\otimes 2} \]
The antisymmetric part corresponds to the stochastic area of the process $(X_t^{(N)})_t$, $A_{i/N}(X^{(N)})$, which, for a piecewise linear process, can be expressed through second iterated sums, i.e.
\[ \forall i \in \{1, \ldots, N\}, \quad A_i(X) := A_{i/N}(X^{(N)}) = \frac{1}{2} \int_{0 < s_1 < s_2 < i/N} dX_{s_1}^{(N)} \otimes dX_{s_2}^{(N)} - dX_{s_2}^{(N)} \otimes dX_{s_1}^{(N)} = \frac{1}{2} \sum_{1 \leq k_1 < k_2 \leq i} F_{k_1} \otimes F_{k_2} - F_{k_2} \otimes F_{k_1} \]
Since we have
\[ \forall i \in \{1, \ldots, N\}, \quad \sum_{1 \leq k_1 < k_2 \leq i} F_{k_1} \otimes F_{k_2} + F_{k_2} \otimes F_{k_1} = (X_{i/N}^{(N)})^{\otimes 2} - \sum_{k=1}^{i} F_k^{\otimes 2} \]
it is straightforward to show
\[ \sum_{1 \leq k_1 < k_2 \leq i} F_{k_1} \otimes F_{k_2} = \int_{0 < s_1 < s_2 < i/N} dX_{s_1}^{(N)} \otimes dX_{s_2}^{(N)} - \sum_{k=1}^{i} F_k^{\otimes 2} \tag{51} \]
We thus conclude that:
\[ \forall N \geq 1, \ \forall 1 \leq i \leq N, \quad S_2(X^{(N)})_{0,i/N} = \Big( X_i, \sum_{1 \leq k_1 < k_2 \leq i} F_{k_1} \otimes F_{k_2} \Big) \otimes \Big( 0, \sum_{k=1}^{i} F_k^{\otimes 2} \Big) \]

We can now give an answer to the two questions concerning the choice of the embedding and of the topology:

Lemma 4.2. Choose $(X_t^{(N)})_t$ to be the geodesic embedding of $(X_n)_n$ and suppose $X_0 = 0$ a.s. Consider the geodesic embedding of $Z_k = \sum_{j=1}^{k} F_j^{\otimes 2}$ given by
\[ \forall N \in \mathbb{N}^*, \ \forall t \in [0,1], \quad Z_t^{(N)} = \sum_{k=1}^{\lfloor Nt \rfloor} F_k^{\otimes 2} + (Nt - \lfloor Nt \rfloor) F_{\lfloor Nt \rfloor + 1}^{\otimes 2} \]
and set
\[ \forall N \in \mathbb{N}^*, \ \forall t \in [0,1], \quad \mathbf{Y}_t^{(N)} = (0, Z_t^{(N)})^{-1} \otimes S_2(X^{(N)})_{0,t} \]

Then $t \mapsto \mathbf{Y}_t^{(N)}$ is an embedding for $(\mathbf{Y}_n)_n$. Furthermore, $t \mapsto \mathbf{Y}_t^{(N)}$ is in $\mathcal{C}^{1\text{-}var}([0,1], T_1^{(2)}(V))$ and is a non-geometric rough path.
Proof. The path $t \mapsto \mathbf{Y}_t^{(N)}$ is constructed by concatenation of two rough paths, $(S_2(X^{(N)})_{0,t})_t$ being a smooth rough path and $(0, Z_t^{(N)})_t$ a smooth path in $T_1^{(2)}(V)$, which implies that $t \mapsto \mathbf{Y}_t^{(N)}$ is an element of $\mathcal{C}^{1\text{-}var}([0,1], T_1^{(2)}(V))$. Moreover, property 4.3 implies that $t \mapsto \mathbf{Y}_t^{(N)}$ is a non-geometric rough path.
The fact that $t \mapsto \mathbf{Y}_t^{(N)}$ is an embedding for $(\mathbf{Y}_n)_n$ follows from lemma 4.1, which tells us that:
\[ \forall N \geq 1, \ \forall 1 \leq i \leq N, \quad \mathbf{Y}_{i/N}^{(N)} = (0, Z_{i/N}^{(N)})^{-1} \otimes S_2(X^{(N)})_{0,i/N} \]

Remark. Another way of expressing $t \mapsto \mathbf{Y}_t^{(N)}$ is through the rough path bracket, which concentrates the symmetric part of the second-level component (see definition 5.5 from [5]), and implies:
\[ \forall t \in [0,1], \quad \mathbf{Y}_t^{(N)} = (X_t^{(N)}, A_t(X^{(N)})) \otimes \Big( 0, \frac{1}{2} [\mathbf{Y}^{(N)}]_t \Big) \]
where
\[ \forall t \in [0,1], \quad [\mathbf{Y}^{(N)}]_t = X_t^{(N)} \otimes X_t^{(N)} - Z_t^{(N)} \]
and $t \mapsto A_t(X^{(N)})$ is, as before, the stochastic area of $t \mapsto X_t^{(N)}$.
We can now proceed to the main proof of proposition 2.1.
Proof. Theorem 2.1 implies the following convergence in law in the space $\mathcal{C}^{\alpha\text{-H\"ol}}([0,1], G^2(V))$ for $\alpha < 1/2$:
\[ \big( \delta_{(NC\beta^{-1})^{-1/2}} S_2(X^{(N)})_{0,t} \big)_{t \in [0,1]} \xrightarrow[N \to \infty]{} \big( \mathbf{B}_t^{Strat} \otimes (0, \Gamma t) \big)_{t \in [0,1]} \tag{52} \]
where $\Gamma$ is the area anomaly from theorem 2.1. Next, we apply the decomposition in pseudo-excursions to $Z_n$, i.e.

\[ \forall n \geq 1, \quad Z_n = \sum_{j=0}^{\kappa(n)-1} \sum_{i = T_j + 1}^{T_{j+1}} F_i^{\otimes 2} + \sum_{i = T_{\kappa(n)} + 1}^{n} F_i^{\otimes 2} \]
and, by the law of large numbers, we deduce the convergence of the finite-dimensional marginals
\[ \forall t \in [0,1], \quad \delta_{(NC\beta^{-1})^{-1/2}} (0, Z_t^{(N)}) \xrightarrow[N \to \infty]{} \Bigg( 0, tC^{-1} \mathbb{E}\Bigg[ \sum_{i=1}^{T_1} F_i^{\otimes 2} \Bigg] \Bigg) \]

We can now prove the tightness of the sequence of processes $((0, Z_t^{(N)}))_t$ in $\mathcal{C}^{\alpha\text{-H\"ol}}([0,1], T_1^{(2)}(V))$ for $\alpha < 1/2$ by using a version of the Kolmogorov criterion. Theorem 3.10 from [5] implies that it is enough to prove
\[ \exists K > 0, \ \forall s, t \in [0,1], \ \forall N \geq 1, \quad N^{-1} \mathbb{E}\big[ |Z_{s,t}^{(N)}|_{V \otimes V} \big] \leq K |t - s| \]
We use the fact that the $F_k$ are uniformly bounded:
\[ N^{-1} \mathbb{E}\big[ |Z_{s,t}^{(N)}|_{V \otimes V} \big] = N^{-1} \mathbb{E}\Bigg[ \Bigg| \sum_{i = \lfloor Ns \rfloor + 1}^{\lfloor Nt \rfloor} F_i^{\otimes 2} \Bigg|_{V \otimes V} \Bigg] \leq K |t - s| \]
where $K$ comes from the condition (8). We thus conclude to the following convergence in probability in the $\mathcal{C}^{\alpha\text{-H\"ol}}([0,1], T_1^{(2)}(V))$ topology for $\alpha < 1/2$:
\[ \big( \delta_{(NC\beta^{-1})^{-1/2}} (0, Z_t^{(N)}) \big)_{t \in [0,1]} \xrightarrow[N \to \infty]{} \Bigg( 0, tC^{-1} \mathbb{E}\Bigg[ \sum_{i=1}^{T_1} F_i^{\otimes 2} \Bigg] \Bigg)_{t \in [0,1]} \tag{53} \]

Using Slutsky's theorem (theorem 3.1 from [1]) and combining (52) and (53), we get
\[ \big( \delta_{(NC\beta^{-1})^{-1/2}} \mathbf{Y}_t^{(N)} \big)_{t \in [0,1]} \xrightarrow[N \to \infty]{} \Bigg( \mathbf{B}_t^{Strat} \otimes \Big( 0, t \Big( \Gamma - C^{-1} \mathbb{E}\Big[ \sum_{i=1}^{T_1} F_i^{\otimes 2} \Big] \Big) \Big) \Bigg)_{t \in [0,1]} \tag{54} \]
where $\mathbf{B}^{Strat}$ is the standard Brownian motion $(B_t)_t$ on $V$ enhanced with its iterated integrals in the Stratonovich sense. At the same time, we can compute
\begin{align}
\Gamma - C^{-1} \mathbb{E}\Bigg[ \sum_{i=1}^{T_1} F_i^{\otimes 2} \Bigg] &= C^{-1} \mathbb{E}\Bigg[ \frac{1}{2} \sum_{1 \leq k_1 < k_2 \leq T_1} F_{k_1} \otimes F_{k_2} - F_{k_2} \otimes F_{k_1} \Bigg] - C^{-1} \mathbb{E}\Bigg[ \frac{1}{2} \sum_{i=1}^{T_1} F_i^{\otimes 2} \Bigg] \tag{55} \\
&= C^{-1} \mathbb{E}\Bigg[ \sum_{1 \leq k_1 < k_2 \leq T_1} F_{k_1} \otimes F_{k_2} \Bigg] - \frac{1}{2} C^{-1} \mathbb{E}\big[ X_{T_1}^{\otimes 2} \big] \nonumber \\
&= C^{-1} \mathbb{E}\Bigg[ \sum_{1 \leq k_1 < k_2 \leq T_1} F_{k_1} \otimes F_{k_2} \Bigg] - \frac{1}{2} I_d = M - \frac{1}{2} I_d \tag{56}
\end{align}
where $M = C^{-1} \mathbb{E}\big[ \sum_{1 \leq k_1 < k_2 \leq T_1} F_{k_1} \otimes F_{k_2} \big]$ (as stated in (18)) and $\mathbb{E}\big[ X_{T_1}^{\otimes 2} \big]$ is none other than the covariance matrix of $X_{T_1}$, since the excursions are centred. The expression (55) shows that $(0, Mt)$ is a non-geometric rough path: while the first term of the sum is antisymmetric, the second one is symmetric and cannot be expressed through the first component of the rough path (which is zero). This shows that the limit of the $(\mathbf{Y}_t^{(N)})_t$ is a non-geometric rough path. The expression (56) shows that we can rewrite the limit using It\^o integration, as
\[ \forall\, 0 \leq s < t \leq 1, \quad \mathbf{B}_{s,t}^{Strat} = \mathbf{B}_{s,t}^{It\hat o} \otimes \Big( 0, \frac{1}{2}(t - s) I_d \Big) \]
We thus obtain the convergence:
\[ \big( \delta_{(NC\beta^{-1})^{-1/2}} \mathbf{Y}_t^{(N)} \big)_{t \in [0,1]} \xrightarrow[N \to \infty]{} \big( \mathbf{B}_t^{It\hat o} \otimes (0, tM) \big)_{t \in [0,1]} \]
which completes the proof.

Remark. We find in the expression of the limit (54) the decomposition of a non-geometric rough path into a geometric rough path plus an element in the centre of $T_1^{(2)}(V)$, highlighted in particular in [5] and expressed by
\[ \forall\, 1/3 < \alpha < 1/2, \quad \mathcal{C}^{\alpha}([0,1], T_1^{(2)}(V)) \simeq \mathcal{C}^{\alpha}([0,1], G^2(V)) \oplus \mathcal{C}^{2\alpha}([0,1], V \otimes V) \]

5 Open questions

We have seen that hidden Markov walks are a natural generalization of the class of Markov chains on periodic graphs when one wants to study the area anomaly using techniques related to the theory of excursions, in particular because they allow one to analyse the convergence of a HMW as that of a sum of i.i.d. variables. It would therefore be interesting to know whether we can obtain results on the area anomaly for other processes which present only weak time-correlations, correlations that could be ignored when passing to the limit in the uniform convergence topology (for example, the $\alpha$-mixing processes described in [1]).

A further study of the iterated occupation times $\mathcal{L}_{u_1,\ldots,u_k;N}(R)$ (definition 2.5) of a Markov chain $(R_n)_n$ could also prove fruitful. Several directions can be considered: a generalization of some results of ergodic theory (in particular, the connection with the invariant measure), their study as random combinatorial objects (using their representation as cardinals), the construction and analysis of an abstract vector space generated by these objects (which would in particular allow an abstract representation of a HMW depending on $(R_n)_n$), etc. Furthermore, using the example of HMW, we would like to see if there is a connection between the iterated occupation times as combinatorial structures and the Hopf algebras which describe the combinatorics of (abstract) rough paths (as in [7]).
We have seen that, in the case of HMW, iterated sums allow, for example, to obtain a decomposition of rough paths that isolates the area anomaly, or to propose an interesting construction of non-geometric rough paths. It may prove useful to continue studying these objects (possibly for more general processes), as they may contribute on one hand to developing the discrete setting in rough paths theory, and on the other hand to constructing concrete examples which allow explicit computations, something that is often difficult when it comes to rough paths.
It could also be interesting to use the framework we present in this paper to study finite difference equations in a new way, in particular their asymptotics and continuous limits; or, conversely, to ask whether new discrete models could be designed to discretize efficiently classical (stochastic) differential equations.
Overall, we hope that the present paper will provide the reader with good arguments for taking an interest in the role of the area anomaly in classical stochastic calculus and in that of discrete processes in the classical rough paths setting.

References

[1] P. Billingsley. Convergence of Probability Measures. Wiley, 1999.

[2] E. Breuillard, P. Friz, and M. Huesmann. From random walks to rough paths. Proc. Amer. Math. Soc., 137(10):3487–3496, 2009.

[3] A. Davie. Differential equations driven by rough paths: an approach via discrete approximation. Appl. Math. Res. Express, pages 009–40, 2007.

[4] S. Eilenberg and S. Mac Lane. On the groups H(Π, n), I. Annals of Mathematics, 58(1):55–106, 1953.

[5] P. Friz and M. Hairer. A Course on Rough Paths - With an introduction to regularity structures. Springer, 2014.

[6] P. Friz and N. Victoir. Multidimensional Stochastic Processes as Rough Paths: Theory and Applications. Cambridge University Press, 2010.

[7] M. Hairer and D. Kelly. Geometric versus non-geometric rough paths. Annales de l'Institut Henri Poincaré, 51(1):207–251, 2015.

[8] M. Hoffman. Quasi-shuffle products. Journal of Algebraic Combinatorics, 11:49–68, 2000.

[9] M. Hoffman and K. Ihara. Quasi-shuffle products revisited. Journal of Algebra, 481:293–326, 2017.

[10] J. Norris. Markov Chains. Cambridge University Press, 1997.

[11] A. Lejay and T. Lyons. On the importance of the Lévy area for studying the limits of functions of converging stochastic processes. Application to homogenization. Current Trends in Potential Theory, pages 63–84, 2003.

[12] Y. Liu and S. Tindel. Discrete rough paths and limit theorems. arXiv:1707.01783, 2017.

[13] O. Lopusanschi and D. Simon. Lévy area with a drift as a renormalization limit of Markov chains on periodic graphs. Accepted for publication in Stochastic Processes and Applications, 2017.

[14] T. Lyons, M. Caruana, and Th. Lévy. Differential Equations Driven by Rough Paths (École d'Été de Probabilités de Saint-Flour XXXIV). Springer, 2004.

[15] T. Lyons and Z. Qian. Flow equations on spaces of rough paths. Journal of Functional Analysis, 149:135–159, 1997.

[16] L. Young. The Theory of Integration. The University Press, 1927.
