A selection of stochastic processes emanating from the natural sciences

submitted by

Diplom-Mathematikerin Maite Isabel Wilke Berenguer, Berlin

Dissertation approved by Faculty II - Mathematics and Natural Sciences of the Technische Universität Berlin in fulfillment of the requirements for the academic degree of

Doktor der Naturwissenschaften (Dr. rer. nat.)

Doctoral committee:
Chair: Prof. Dr. math. Jörg Liesen
Reviewer: Prof. Dr. rer. nat. Michael Scheutzow
Reviewer: Prof. Dr. rer. nat. Frank Aurzada
Date of the scientific defense: 7 October 2016

Berlin 2016

Jack of all trades, master of none,...

... though oftentimes better than master of one.

Contents

I Percolation

1 Lipschitz Percolation
  1.1 (Classic) Lipschitz Percolation
    1.1.1 Applications and related fields
  1.2 Lipschitz Percolation above tilted planes
  1.3 Asymptotic bounds on the critical probability
    1.3.1 A dual notion: λ-paths
    1.3.2 Proofs of lower bounds
    1.3.3 Proofs of upper bounds

II Population Genetics

2 A novel seed-bank model
  2.1 A famous model by Fisher and Wright and Kingman's dual
  2.2 Modelling a seed-bank
  2.3 The Wright-Fisher model with geometric seed-bank
    2.3.1 A forward scaling limit
    2.3.2 The dual of the seed-bank frequency process
    2.3.3 Long-term behavior and fixation probabilities
  2.4 The seed-bank coalescent
    2.4.1 Related coalescent models
  2.5 Properties of the seed-bank coalescent
    2.5.1 Coming down from infinity
    2.5.2 Bounds on the time to the most recent common ancestor
    2.5.3 Recursions for important values
  2.6 Technical results
    2.6.1 Convergence of Generators
    2.6.2 Proofs of recursions


III Random Dynamical Systems

3 Volterra Stochastic Operators
  3.1 Quadratic Stochastic Operators
    3.1.1 Biological origins, related developments and enhancements of the model
  3.2 Polynomial Stochastic Operators
  3.3 Randomization of the model
  3.4 A Martingale Lemma

4 A Random Dynamical System
  4.1 Introduction to Random Dynamical Systems and Attractors
  4.2 Evolution of the RDS forward in time
    4.2.1 Pullback attractors of our RDS
    4.2.2 Considering forward convergence
  4.3 Evolution backward in time
  4.4 Delta attractors - refining an established concept
  4.5 Auxiliary observations
    4.5.1 Observations related to measurability
    4.5.2 A helpful Markov chain
    4.5.3 A suitable metric
    4.5.4 Derivatives of Volterra PSOs

Introduction

The history of probability theory is no doubt closely intertwined with its application in diverse areas, which has been its boon and bane. One could say that mathematical probability started with the interest of Pascal and Fermat in the description of gambling games common at their time. With this same motivation in mind, first Bernoulli and De Moivre and later Laplace continued the development of the mathematical theory of games of chance and began applying it to statistics of human populations and insurance questions, without regard for the comparability of these to the original problem. As a result of these examples, the field of applications of probability theory expanded rapidly throughout the 19th century (creating new areas like statistical mechanics or actuarial mathematics), while mathematical probability itself experienced a period of stagnation as the weakness of its conceptual foundation was neglected. As unfounded applications in social and moral questions began to emerge and Bertrand presented a series of paradoxes, the need for an axiomatization became evident. This was fully accomplished by Kolmogorov in the early 1930s. It still took some time for probability theory to shake its image as a non-rigorous discipline (outside of the Soviet Union), but nowadays it is a flourishing, legitimate branch of mathematics, in addition to being applied in fields as diverse as genetics, economics, psychology and engineering.¹

This description suggests that probability theory has a one-way profitable connection to other sciences. However, credit should also go to the 'applications', since the flow of inspiration often goes the other way round, as new probabilistic, mathematically fascinating objects arise from an applied model. Both profit flows are present in this thesis.

This work consists of three independent parts that can be read in any order. They all draw a link between the theory of stochastic processes indexed by discrete sets with a structure of independence and models in science. In this context, Part I: Percolation differs from the remainder of the text as here the index in question represents not time, but space.

¹ This information can be found in [12] and [74].

Parts II: Population Genetics and III: Random Dynamical Systems are more closely connected, as the independence structure results in Markovian processes (mostly) with discrete time-index, and their joint origin lies in biological population genetics. However, they differ strongly in their present relation to biology. The results of Part II do indeed parallel current developments in the field of biological population genetics and have actual applications in that area (cf. [6]). Part III, on the other hand, has the other type of relation to 'applications'. That is, it focuses on the mathematical relevance of problems that arose from biological population genetics, to the extent of introducing a new mathematical object in the end, which serves as vindication for the different titles of the parts.

Since the three parts do differ in spite of their commonalities, detailed introductions are left to the individual chapters. Part of this thesis has already been published in [6], [9], [22], [56]. This is detailed in the following, together with the structure of this work:

Part I: Percolation. The first part of this thesis consists of the chapter on Lipschitz Percolation. It begins with a thorough introduction to (classic) Lipschitz percolation, outlining previous results on this model from [19] and [47], and discusses related models. Section 1.2 then introduces our novel contribution of Lipschitz percolation above tilted planes, summarizing preliminary results. Section 1.3 is then devoted to the main result of this chapter: Theorem 1.14. Here, first, the exact bounds obtained are detailed in a series of propositions, separated into upper and lower asymptotic bounds. The important notion of λ-paths, in a sense dual to Lipschitz percolation, is given in Section 1.3.1. For better readability, all proofs are grouped into Sections 1.3.2 and 1.3.3. The complete content of Sections 1.2 and 1.3 is joint work with Prof. M. Scheutzow and Prof. A. Drewitz and has been published in [22].

Part II: Population Genetics. The second part is made up of the chapter on a novel seed-bank model. This begins with a basic introduction to the two most important models in population genetics: the classic Wright-Fisher model and the Kingman coalescent. The purpose of Section 2.1 is to familiarize the reader new to this field with the type of questions asked in population genetics, together with their corresponding answers that are classic results in this branch of mathematics. This also serves to highlight the parallels between our seed-bank model and the classic theory. Section 2.2 then provides an overview of the types of extended Wright-Fisher models modified to include the seed-bank phenomenon and the results known so far,

explaining also their limitations. Our own addition to this family of models is presented in Section 2.3, which also collects first results on it. In Section 2.3.1 we obtain a frequency process as a scaling limit of our model, which is the solution to a two-dimensional system of SDEs. We identify its dual process in Section 2.3.2 and apply it in Section 2.3.3 to determine the diffusion's long-time behaviour. In Section 2.4, we then define the most important object of this chapter, which is a new seed-bank coalescent corresponding to the previously derived dual block-counting process. We explain how it describes the ancestry of the Wright-Fisher geometric seed-bank model and conclude the section with a comparison to other similar coalescents. Section 2.5 summarizes the most important properties of the seed-bank coalescent. We are able to prove that the seed-bank coalescent does not come down from infinity (cf. Section 2.5.1). In the following, we obtain that the expected time to the most recent common ancestor of a sample of k individuals is of asymptotic order log log k as k gets large. The section is concluded with recursions for selected quantities of interest of our model, needed in the scope of applications in order to execute simulations. Here, we also give a brief discussion of an extension of the seed-bank model that we studied in [6] by adding mutation. The last section of this chapter serves the purpose of an appendix, containing the technical details of calculations in Sections 2.3 and 2.5.

We shall remark here that the results obtained in Sections 2.3 - 2.6 have previously been published in [9], with the exception of 2.5.3 and in part also 2.6.2, which are part of [6]. These two publications were joint work with Profs. N. Kurt and J. Blath, Dr. A. González Casanova and, in the case of [6], also with Dr. B. Eldon, and the results also feature in the dissertation of Dr. A. González Casanova. However, this thesis has added details and more precise references in several instances. Among other things, we have corrected (and proved) what is Proposition 2.56 in this work and laid out the proofs in Section 2.6.2 in detail.

Part III: Random Dynamical Systems. This last part of the thesis is composed of two chapters: Chapter 3 on Volterra Stochastic Operators and Chapter 4 on a random dynamical system.

Chapter 3 focuses on a generalization of the notion of Quadratic Stochastic Operators (QSOs), which we have called Polynomial Stochastic Operators (PSOs). QSOs are operators on the simplex S^{m−1} ∶= {x ∈ [0, 1]^m ∣ ∑_{i=1}^m x_i = 1} introduced by Bernstein in [4] to describe the evolution of a large population in discrete generations with a given law of heredity. Their derivation from

the biological motivation is very natural, which is utilized in the beginning of this chapter to set up an intuitive framework for so-called Volterra QSOs and showcase known results for later reference. The first section concludes with a discussion of the rich literature on related and extended models. Although of less biological relevance, it is mathematically intuitive to generalize the notion of Volterra QSOs to Volterra PSOs, to which we dedicate Section 3.2. This section also derives properties and estimates for Volterra PSOs necessary for the subsequent work. We finally leave the deterministic framework in Section 3.3, considering a randomization of the heredity mechanism. The main result of this chapter is Theorem 3.15, which proves almost sure convergence of the dispersal of species to Λ = {e_1, . . . , e_m}, in stark contrast to the deterministic heredity mechanisms, which need not converge at all. Again, the last section of the chapter serves the purpose of an appendix and contains a separate martingale-type convergence theorem needed for the proof of the main result, but interesting in its own right.

The randomized heredity mechanism is then embedded in the context of random dynamical systems in Chapter 4 in order to profit from their richer structure compared to Markov chains. Naturally, the chapter begins with an introductory section on the central notions of a random dynamical system (RDS) and different types of attractors (of such an RDS), in the context of which the concrete RDS resulting from the set-up in Chapter 3 is defined. Section 4.2 is generally concerned with the evolution of this RDS in its natural (forward) direction. More specifically, convergence in the pullback sense is considered in Section 4.2.1 and the minimal strong pullback point-attractor of the RDS is identified as Λ. Section 4.2.2 on the other hand focuses on forward convergence. We define sets M^i as the (random) sets of points converging to e_i ∈ Λ and consider their properties. In particular, we prove that they are open and path-connected in the context of Volterra QSOs. The restriction to Volterra QSOs carries over to Section 4.3, where we consider in essence our RDS backwards in time, which corresponds to its inverse. For this system we then prove the existence of a strong global pullback attractor Ā on the interior of S^{m−1}. With additional assumptions we can prove synchronization, i.e. that ∣Ā∣ = 1 holds for this attractor.

We return to the original RDS in Section 4.4, where we introduce a novel concept, so-called ∆-attractors. As opposed to the standard cases, where attractors are assumed to either attract the set of all points or the set of all compact sets (sometimes also all bounded sets), we consider attractors that attract all compact sets of Hausdorff dimension up to ∆. The main result of this section is Theorem 4.38, in which we prove that Λ is the minimal strong forward ∆-attractor for a ∆ > 0. The chapter concludes, as before, with a section serving the function of an appendix, as it contains technical

results and references for the previous sections.

Sections 3.3 in Chapter 3 and 4.2.1 in Chapter 4 are the analogous results for Volterra polynomial stochastic operators to those obtained in Sections 3 and 4 of [56] for Volterra quadratic stochastic operators. Section 3.4 in Chapter 3 is Section 5 in [56]. Any other result in these two chapters has not yet been published. The publication [56] was joint work with Prof. M. Scheutzow and Dr. U.U. Jamilov; the remaining work was joint work with Prof. M. Scheutzow only.

We now give a short collection of the basic notation used throughout this work.

Notation 0.1 In contradiction to DIN-Norm 5473 we adhere to the notation traditional in probability theory and use N to denote the natural numbers without zero, i.e. N ∶= {1, 2, . . .} and N_0 ∶= N ∪ {0}. Z is the set of integers and some subsets of it are given through Z^+ ∶= N, Z^− ∶= −N and Z_0^+ ∶= N_0. In the same manner, R denotes the set of reals and R_0^+ ∶= [0, ∞[. e_1, . . . , e_d denote the standard basis of R^d (or Z^d). If x ∈ R^d, we write x_1, . . . , x_d for its components. The symbol 0 can have different meanings throughout this work. It is used to denote the integer 0 ∈ Z, but also as an abbreviation for a vector 0 = (0, . . . , 0) ∈ Z^d. However, its meaning should always be clear from the context. We use ∥ ⋅ ∥_1 to denote the 1-norm, i.e. ∥x∥_1 ∶= ∑_{i=1}^d ∣x_i∣ for x ∈ R^d, and merely ∥ ⋅ ∥ for the Euclidean norm, i.e. ∥x∥ ∶= (∑_{i=1}^d x_i^2)^{1/2}. For any two values a, b ∈ R, we sometimes abbreviate a ∧ b ∶= min{a, b} and a ∨ b ∶= max{a, b}. For any two sets E and I, E^I denotes the set of all maps from I to E. Id_E is the identity map on a set E. The concatenation of two maps is symbolized by ○. The cardinality of any set A is given by ∣A∣. If the reference space E is clear from context, for A ⊆ E we use A^c ∶= E ∖ A to denote the complement of A in E.

For a topological space (E, T ), B(E) is the Borel-σ-algebra on E generated by the topology T. If (E, d) is a metric space, we always tacitly assume the topology to be the one induced by the metric d. Throughout this thesis, (Ω, F, P) will denote a probability space, which might be further specified in the different chapters. For a measurable space (E, E), a random variable on (Ω, F, P) is a map X ∶ Ω → E that is F-E measurable. For a family of random variables (X_i)_{i∈I} on (Ω, F, P) we define σ(X_i ∣ i ∈ I) to be the smallest σ-algebra such that for every i ∈ I, X_i is σ(X_i ∣ i ∈ I)-E measurable. If X is a random variable on (Ω, F, P), L(X) ∶= P ○ X^{−1} denotes its distribution on (E, E). If ν is a probability measure on (E, E), we write X ∼ ν if L(X) = ν.
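As a quick worked illustration of the two norms just introduced (an added example; the vector is chosen purely for illustration): for x = (1, −2, 2) ∈ R^3 one has

∥x∥_1 = ∣1∣ + ∣−2∣ + ∣2∣ = 5,   ∥x∥ = (1^2 + (−2)^2 + 2^2)^{1/2} = 3,

and in general ∥x∥ ≤ ∥x∥_1 ≤ √d ∥x∥ on R^d.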


Part I

Percolation


Chapter 1

Lipschitz Percolation

The birth of mathematical percolation theory is widely acknowledged to be the paper by Broadbent and Hammersley published in 1957 called "Percolation Processes: I. Crystals and Mazes" [11], with the aim of providing a model for a class of phenomena of which the following is a representative example:

Imagine we have a large stone made of rather porous material that we submerge in an (equally huge) bucket of water, such that all of its surface is wet. What level of permeability of the rock ensures that its center is also eventually reached by water?

It also explains the origin of the name, as in physics and chemistry 'percolation' (from Latin percolare, to trickle through) refers to the movement of fluids through porous materials, although admittedly in the mathematical model the fluid usually flows from the center of the object to (hopefully) infinity. This area has been a fruitful ground for research in physics given its usefulness in describing various phenomena in statistical mechanics. On the other hand, "it is the source of fascinating problems of the best kind a mathematician can hope for: problems which are easy to state with a minimum of preparation, but whose solutions are (apparently) difficult and require new methods", as Kesten¹ says in the preface to [62], so it comes as no surprise that it is also a strongly active research area for mathematicians and one of the most studied models for a disordered medium (see [45] for a comprehensive account).

In a basic mathematical set-up considering the lattice Z^{d+1}, where the vertices are assigned one of two states called 'open' and 'closed' with probability p ∈ [0, 1] (resp. 1 − p) independently at random, the question is to determine

¹ Grimmett seconds this in the preface of [45].

for which values of the parameter p there exists an infinite self-avoiding random walk on the open sites (P-a.s.). In contrast to this, the model studied in this work does not consider a connected cluster of open sites spreading in branches into any dimension, but requires the open cluster to be the graph of a Lipschitz function.

Before we go into further details about this construction we introduce some basic notation used in this chapter.

Notation 1.1 We have taken the practice in this chapter of marking elements of Z^d with a bar, as in x̄ ∈ Z^d, in order to distinguish them from canonical elements x ∈ Z^{d+1}. In the same vein, for x = (x_1, . . . , x_{d+1}) ∈ Z^{d+1}, we use x̄ to refer to (x_1, . . . , x_d) as well as (x̄, x_{d+1}) to denote x. In addition, by a slight abuse of notation we use 0 to denote the origin of Z, Z^d and Z^{d+1}, but no confusion is to be feared due to the context. We write f(s) ≍ g(s) as s → s̄ for two functions f and g if there exist positive and finite constants c, C such that liminf_{s→s̄} f(s)/g(s) ≥ c and limsup_{s→s̄} f(s)/g(s) ≤ C. Similarly we use f(s) ≲ g(s) as s → s̄ if limsup_{s→s̄} f(s)/g(s) ≤ 1, respectively f(s) ≳ g(s) as s → s̄ if liminf_{s→s̄} f(s)/g(s) ≥ 1, and asymptotic equivalence is denoted by f(s) ∼ g(s), s → s̄ (i.e., if f(s) ≲ g(s) and f(s) ≳ g(s) as s → s̄).

The structure of this chapter is as follows: The following Section 1.1 contains the main definitions and results related to (classic) Lipschitz percolation from [19] and [47] as well as a subsection on related topics. All necessary definitions for Lipschitz percolation above tilted planes are then introduced in Section 1.2 together with some preliminary results. The main result of this chapter is Theorem 1.14 in Section 1.3, which is followed by a breakdown of the exact asymptotic bounds obtained and concludes with the series of proofs of these results. All results from Sections 1.2 and 1.3 have been published in [22].
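Before proceeding, a small added example illustrating the asymptotic notation of Notation 1.1 (f and g are chosen purely for illustration): take f(s) = 2s + log s and g(s) = s. Then, as s → ∞,

liminf_{s→∞} f(s)/g(s) = limsup_{s→∞} f(s)/g(s) = 2,

so f(s) ≍ g(s) (with c = C = 2) and f(s) ≲ 3g(s), but f(s) ∼ g(s) fails since the ratio tends to 2 ≠ 1; instead f(s) ∼ 2s.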

1.1 (Classic) Lipschitz Percolation

The topic of Lipschitz percolation was first introduced in a paper with precisely this title by Dirr, Dondl, Grimmett, Holroyd and Scheutzow in 2010 ([19]). The set-up is that of so-called site-percolation in Z^{d+1} with a parameter p ∈ [0, 1] for an integer d ≥ 1 that we will refer to as 'dimension'. That is, our probability space has a specific structure:

• Ω = {0, 1}^{Z^{d+1}} is the set of maps from Z^{d+1} to {0, 1}, referred to as configurations of the space Z^{d+1},

• F is the corresponding product-σ-algebra and


• P = P_p is the product measure on (Ω, F) of Bernoulli-measures with parameter p.

A site x ∈ Z^{d+1} is called open (with respect to ω) if ω(x) = 1, and closed if ω(x) = 0. (Note that this term is not related to the topological notions of 'open' and 'closed', but is rather a common choice of names for 'two options' in percolation theory - common alternatives are 'occupied' and 'empty', 'good' and 'bad', or 'wet' and 'dry', just to name a few. Our choice is the same as the one used in [19].)

As mentioned in the introduction above, the classic object of study is an infinite (open) connected cluster spreading through a lattice in all possible directions - where 'connected' and 'possible' are described by the (possibly oriented) bonds of the choice of lattice. For Lipschitz percolation, though, we do not consider such a 'spread-out' percolation, but impose a different constraint on the infinite open cluster under consideration: we want it to be the graph of a Lipschitz function, hence the name. We now give the precise definition.

Definition 1.2. A function F ∶ Z^d → Z is called a Lipschitz function if for any x̄, ȳ ∈ Z^d the implication

∥x̄ − ȳ∥_1 = 1 ⟹ ∣F(x̄) − F(ȳ)∣ ≤ 1

is satisfied. Denote by Λ the set of all such Lipschitz functions. We will

refer to the graph (x̄, F(x̄))_{x̄∈Z^d} of such a Lipschitz function F as a Lipschitz surface, and in an abuse of notation sometimes also denote the surface by F only.

Remark 1.3. Note that the pointwise minimum of a family of Lipschitz functions is always a Lipschitz function, if one includes the degenerate cases of F ≡ ±∞ in the notion of a Lipschitz function. These degenerate cases can appear, for example, if one considers the set of all Lipschitz functions, resp. the empty set (with the convention of inf ∅ = ∞). Only the latter will be of relevance for our work. Such a pointwise minimum will be called the minimal Lipschitz function of this family. Its graph will be referred to as the minimal Lipschitz surface (of this family).

Definition 1.4. We will say that F is an open Lipschitz surface (or function) in ω ∈ Ω, if for all x̄ ∈ Z^d we have ω((x̄, F(x̄))) = 1, i.e. all elements of the graph of the function are 'open' in the percolation setting. See Figure 1.1 for a visualization of these concepts.

Denote by LIP the event that there exists an open Lipschitz surface contained in the upper half space Z^d × N. As was proven in [19], this undergoes a phase transition and the critical probability is non-trivial:


Figure 1.1: Example of an open Lipschitz surface: The circles give the configuration ω ∈ Ω, where a white circle stands for an open site, a black circle for a closed site. If the hatched site is open, then the red line visualizes an open Lipschitz surface. Note that this is not the only possible one in this set-up, but it is the minimal open Lipschitz surface in Z^d × N. If the hatched site is closed, then there is no open Lipschitz surface in Z^d × N (in this segment of the Z^d × Z lattice).

Theorem 1.5 ([19]). There exists a p_L(d) ∈ ]0, 1[ such that

P_p(LIP) = { 0, if p < p_L(d),
             1, if p > p_L(d).

Furthermore, it was proven that for p > p_L(d) there exists an open Lipschitz surface such that the random field (F(x̄), x̄ ∈ Z^d) is stationary and ergodic. In addition, an upper bound for p_L(d) and tail estimates for the height of the minimal open Lipschitz surface contained in Z^d × N were established for p sufficiently close to 1. The topic was explored in depth in [47], using and expanding the strategies applied in [19]. Through different representations of the minimal Lipschitz surface (e.g. with so-called local covers, or mountains), exponential tails for the height of the minimal open Lipschitz surface were established for all p > p_L(d) and the upper bound on the critical probability p_L(d) was improved. The complementing lower bound was obtained through comparison to oriented percolation.

Theorem 1.6 ([19, 47]). For the critical probability p_L(d) of Lipschitz percolation in Z^d × N we obtain

1 − p_L(d) ≥ (8d)^{−1}, for all d ∈ N,
1 − p_L(d) ≲ (2d)^{−1}, as d → ∞,

and in particular 1 − p_L(d) ≍ 1/d, d → ∞.


The two-dimensional case was even proven to be equivalent to the alternative model of oriented percolation (one direction of which was independently proven in the author's Diplom thesis, cf. [47]). We will refer to this set-up as classic Lipschitz percolation, as opposed to the set-up studied in the following sections.
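To make the minimal surface of Remark 1.3 and Figure 1.1 concrete, here is a small added computational sketch (our own illustration; the restriction to d = 1, the torus, and all parameter values are assumptions made only to keep the example finite, and do not come from [19] or [47]). Starting from the constant function 1, it alternately climbs past closed sites and enforces the Lipschitz constraint; both operations keep the current function below any open Lipschitz surface dominating it, so the fixpoint is the minimal one.

    import random

    def minimal_lipschitz_surface(n=20, p=0.9, seed=0, max_rounds=10_000):
        # Illustrative sketch: minimal open Lipschitz surface contained in
        # Z x N for d = 1 site percolation on an n-site torus; each site is
        # open with probability p, sampled lazily and independently.
        rng = random.Random(seed)
        state = {}  # (x, h) -> True if the site (x, h) is open

        def is_open(x, h):
            if (x, h) not in state:
                state[(x, h)] = rng.random() < p
            return state[(x, h)]

        F = [1] * n  # start from the lowest admissible height
        for _ in range(max_rounds):
            changed = False
            for x in range(n):
                h = F[x]
                while not is_open(x, h):  # climb past closed sites
                    h += 1
                # enforce |F(x) - F(y)| <= 1 against both torus neighbours
                h = max(h, F[(x - 1) % n] - 1, F[(x + 1) % n] - 1)
                if h != F[x]:
                    F[x], changed = h, True
            if not changed:
                return F  # fixpoint: the minimal open Lipschitz surface
        raise RuntimeError("did not stabilize within max_rounds")

    print(minimal_lipschitz_surface())

For p well above the critical value the iteration stabilizes after a few rounds; as p decreases towards p_L(1), the heights (and the number of rounds) blow up, mirroring the phase transition of Theorem 1.5.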

1.1.1 Applications and related fields

Interestingly, it seems that the topic of Lipschitz percolation arose independently in two quite different backgrounds.

On one hand, Dirr, Dondl and Scheutzow were working on a model for the movement of an interface through a time-independent field of random obstacles described by the so-called Edwards-Wilkinson equation. Here, what we now call an open Lipschitz function can describe the random obstacles such that its existence allows the construction of a stationary supersolution to this equation, as was then done in [20]. On the other hand, Grimmett and Holroyd looked at 'lattice embeddings' and the general question of existence of a Lipschitz injection of Z^d into the open cluster of site percolation in Z^D (for some D ≥ d), of which Lipschitz percolation is a special case, as described in [48]. The two different proofs of existence of such an open Lipschitz surface (and first additional results on it) were then jointly published in [19].

With the aim of considering so-called comb percolation in [54], the existence of a family of stacked, disjoint Lipschitz surfaces was proven. Extending the question of Lipschitz embeddings of one (random) structure into another, this topic is also related to the question of existence of a Lipschitz embedding of one Bernoulli sequence into another, treated in [2]. On the other hand, the results of [19, 47] were paralleled in [46] in order to analyse the existence of a sphere containing the origin in a plaquette percolation model, and resumed in [49]. It should be mentioned that the related results in [46, 48, 49, 54] were proven using (non-obvious) extensions of a notion of duality between a random surface and suitably chosen random paths used in [19, 47], which we will also apply in our proofs in Section 1.3.

1.2 Lipschitz Percolation above tilted planes

As reported in the previous section, the investigation of Lipschitz percolation has, up to now, been focused on Lipschitz surfaces that stay above the

hyperplane L ∶= Z^d × {0}. We, however, are interested in the effect of 'tilting' this plane. Let us specify what we mean by tilted planes.

Definition 1.7. For any d ∈ N, α ∈ [0, 1[ and η ∈ {−1, 0, +1}^d we define the tilted plane

L_η^{α,d} ∶= {(x_1, . . . , x_{d+1}) ∈ Z^{d+1} ∣ x_{d+1} = ⌊α ∑_{i=1}^d η_i x_i⌋}.   (1.1)

Note that this is indeed a d-dimensional hyperplane of Z^{d+1}, but for simplicity we will refer to these objects as 'planes'. We do not use an angle γ ∈ [−π/4, π/4) to describe the inclination along coordinate axes, but rather introduce the parameter(s) α ∈ [0, 1) (and η ∈ {−1, 0, +1}^d), as this results in the clean and simple representation of the planes given in (1.1) and is convenient for computations, without any drawbacks given the natural one-to-one correspondence between α and γ. Observe also that the case α = 0 as well as the case η = 0 correspond to γ = 0 and thus to classic Lipschitz percolation. The restriction to α ∈ [0, 1), resp. γ < π/4, is natural once one realizes that for α ≥ 1 (and η ≠ 0), resp. γ ≥ π/4, for any p < 1 there cannot exist a Lipschitz surface above the plane L_η^{α,d} due to the Lipschitz property (P_p-a.s. for the case γ = π/4).

In the study of Lipschitz percolation above tilted planes, the related concept of Lipschitz percolation above inverted pyramids is helpful to obtain basic observations.
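For concreteness, a small added example (the specific values are ours, chosen for illustration): for d = 1, η = (1) and α = 1/2, definition (1.1) yields the discrete staircase

L_{(1)}^{1/2,1} = {(x_1, x_2) ∈ Z^2 ∣ x_2 = ⌊x_1/2⌋},

which contains e.g. the sites (−1, −1), (0, 0), (1, 0) and (2, 1); the inclination corresponds to the angle γ with tan γ = 1/2.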

Definition 1.8. For any d ∈ N, α ∈ [0, 1[ and η ∈ {−1, 0, +1}^d define the inverted pyramid ∇_η^{α,d} as

∇_η^{α,d} ∶= {(x_1, . . . , x_{d+1}) ∈ Z^{d+1} ∣ x_{d+1} = max_{η′ ∈ {−1,0,+1}^d ∶ ∥η′∥_1 = ∥η∥_1} {⌊α ∑_{i=1}^d η′_i x_i⌋}}.

Before introducing the events under consideration, let us give some helpful notational conventions.

Notation 1.9 Denote by L_{η,>}^{α,d} the upper half space strictly above L_η^{α,d}, i.e.,

L_{η,>}^{α,d} ∶= {(x_1, . . . , x_{d+1}) ∈ Z^{d+1} ∣ x_{d+1} > ⌊α ∑_{i=1}^d η_i x_i⌋}.

Recall that Λ is the set of all Lipschitz functions F ∶ Z^d → Z.


Definition 1.10. Let LIP_η^{α,d} denote the event that there exists an open Lipschitz surface contained in L_{η,>}^{α,d}, i.e.,

LIP_η^{α,d} ∶= {ω ∈ Ω ∣ ∃F ∈ Λ ∶ ∀x̄ ∈ Z^d ∶ ω((x̄, F(x̄))) = 1 and F(x̄) > ⌊α ∑_{i=1}^d η_i x̄_i⌋}.

Similarly to the case of planes, we use LIP(∇_η^{α,d}) to denote the event of existence of a Lipschitz surface above the inverted pyramid ∇_η^{α,d}, i.e.,

LIP(∇_η^{α,d}) ∶= {ω ∈ Ω ∣ ∃F ∈ Λ ∶ ∀x̄ ∈ Z^d ∶ ω((x̄, F(x̄))) = 1 and F(x̄) > max_{η′ ∈ {−1,0,+1}^d ∶ ∥η′∥_1 = ∥η∥_1} {⌊α ∑_{i=1}^d η′_i x̄_i⌋}}.

We proceed with the first step typical in any percolation set-up by proving the existence of a phase transition as well as the non-triviality of the corresponding critical probabilities, together with further preliminary results on these.

Proposition 1.11. For any d ≥ 1, α ∈ [0, 1[ and η ∈ {−1, 0, +1}^d, there exists a critical probability p_L(α, d, η) ∈ (0, 1) such that

P_p(LIP(∇_η^{α,d})) = P_p(LIP_η^{α,d}) = { 0, p ∈ [0, p_L(α, d, η)),
                                           1, p ∈ (p_L(α, d, η), 1].   (1.2)

In fact, for any η′ ∈ {−1, 0, +1}^d with ∥η∥_1 = ∥η′∥_1,

p_L(α, d, η) = p_L(α, d, η′).   (1.3)

Therefore, p_L(α, d, η) depends on η only through the number of nonzero entries.

This means that there exists a phase transition for both Lipschitz percolation above tilted planes and above inverted pyramids, and their critical probabilities coincide.

Notation 1.12 Due to (1.3) it is convenient to define p_L(α, d, k) ∶= p_L(α, d, η) for any η ∈ {−1, 0, +1}^d such that ∥η∥_1 = k ∈ {0, . . . , d}. Furthermore, we set

q_L(α, d, k) ∶= 1 − p_L(α, d, k).   (1.4)

For notational convenience we will formulate most of our results for qL instead of pL since the latter usually tends to 1 and hence the former to 0.


Proof of Proposition 1.11. First observe that due to the symmetries of Z^d and the i.i.d.-product structure of P_p, the quantity P_p(LIP_η^{α,d}) depends on η only through ∥η∥_1. Thus, if the postulated critical probabilities exist, then they must fulfill (1.3).

We now start with showing the second equality in (1.2) for some p_L(α, d, η) ∈ [0, 1]. Since LIP_η^{α,d} is an increasing event, it is immediate that P_p(LIP_η^{α,d}) is nondecreasing in p. Therefore, it is sufficient to show that it takes values in {0, 1} only. Define the shift θ ∶ ω ↦ ω(⋅, . . . , ⋅, ⋅ + 1) in the (d+1)-st coordinate. Then θ is measure preserving for P_p and ergodic with respect to P_p. As a consequence, since θ^{−1}(LIP_η^{α,d}) ⊂ LIP_η^{α,d} and P_p(θ^{−1}(LIP_η^{α,d})) = P_p(LIP_η^{α,d}), the event LIP_η^{α,d} is P_p-a.s. invariant with respect to θ, i.e., P_p(LIP_η^{α,d} △ θ^{−1}(LIP_η^{α,d})) = 0, and by Proposition 6.15 in [10] this already implies

P_p(LIP_η^{α,d}) ∈ {0, 1}.

This establishes the second equality in (1.2) for some p_L(α, d, η) ∈ [0, 1]. In order to obtain the first equality of (1.2), due to the second equality in (1.2) and LIP(∇_η^{α,d}) ⊆ LIP_η^{α,d}, it remains to show that P_p(LIP_η^{α,d}) = 1 implies P_p(LIP(∇_η^{α,d})) = 1. By symmetries, P_p(LIP_η^{α,d}) = 1 already yields

P_p( ⋂_{η′ ∈ {−1,0,+1}^d ∶ ∥η′∥_1 = ∥η∥_1} LIP_{η′}^{α,d} ) = 1.

Note that the pointwise maximum of Lipschitz functions is again a Lipschitz function and thus

⋂_{η′ ∈ {−1,0,+1}^d ∶ ∥η′∥_1 = ∥η∥_1} LIP_{η′}^{α,d} ⊆ LIP(∇_η^{α,d}).

Thus (1.2) holds true. It remains to show the nontriviality of the phase transition, i.e., that p_L(α, d, η) ∈ (0, 1). Proposition 1.15 below in particular shows that p_L(α, d, d) < 1 for all α ∈ [0, 1) and d ≥ 1; hence, using (1.7) below, we deduce p_L(α, d, k) < 1 for all 0 ≤ k ≤ d. On the other hand, p_L(α, d, k) > 0 for all 0 ≤ k ≤ d follows from the fact that the critical probability for the existence of an infinite connected component in the 1-norm in (d+1)-dimensional Bernoulli site-percolation (which is a lower bound for p_L(α, d, k)) is strictly positive.

Using the above result one can obtain some simple but helpful monotonicity results for the critical probabilities.


Lemma 1.13. For all d ∈ N and α, α′ ∈ [0, 1) such that α ≤ α′, we have

∀ k = 0, . . . , d ∶ p_L(α, d, k) ≤ p_L(α′, d, k),   (1.5)

∀ k = 0, . . . , d ∶ p_L(α, d, k) ≤ p_L(α, d + 1, k),   (1.6)

and

∀ k = 0, . . . , d − 1 ∶ p_L(α, d, k) ≤ p_L(α, d, k + 1).   (1.7)

Proof. We start by proving the monotonicity in α, which is best seen considering Lipschitz surfaces above inverted pyramids. Note that for α′ ≥ α one has ∇_η^{α′,d} ≥ ∇_η^{α,d}, in the sense that for any (ȳ, y_{d+1}^{α′}) ∈ ∇_η^{α′,d} and (ȳ, y_{d+1}^{α}) ∈ ∇_η^{α,d} we have y_{d+1}^{α′} ≥ y_{d+1}^{α}. Hence LIP(∇_η^{α′,d}) ⊆ LIP(∇_η^{α,d}), which implies (1.5).

On the other hand, to prove (1.6) choose η ∈ {−1, 0, +1}^{d+1} with ∥η∥_1 = k, and let 1 ≤ j ≤ d + 1 be such that η_j = 0. Then (1.6) follows directly from the fact that the cross section of a Lipschitz surface in L_{η,>}^{α,d+1} with Z^{j−1} × {0} × Z^{d−j+1} mapped to Z^d by eliminating the j-th coordinate is again a Lipschitz surface contained in L_{η^{(j)},>}^{α,d}, for η^{(j)} ∶= (η_1, . . . , η_{j−1}, η_{j+1}, . . . , η_{d+1}), combined with the fact that ∥η^{(j)}∥_1 = k and (1.3).

Lastly, (1.7) follows from the fact that for any 1 ≤ j ≤ d, ∇_{η_{j→0}}^{α,d} ≥ ∇_η^{α,d} in the above sense and thus LIP(∇_{η_{j→0}}^{α,d}) ⊃ LIP(∇_η^{α,d}), where η_{j→0} is obtained from η by replacing the j-th coordinate by 0.

1.3 Asymptotic bounds on the critical probability

As was done for classic Lipschitz percolation in [19] and [47], we aim at characterizing the asymptotic behavior of the critical probabilities through suitable bounds, only now we have more parameters to consider: besides the dimension d, also the inclination given by α and even the number k of dimensions along which we consider the inclination.

For the reader's convenience we summarize the principal asymptotics for q_L(α, d, d) = 1 − p_L(α, d, d) in the following theorem. The actual asymptotics obtained are more precise and are detailed in individual propositions below.

Theorem 1.14.

q_L(α, d, d) ≍ d^{−1/(1−α)}, as d → ∞ (and α fixed),   (1.8)

and

q_L(α, d, d) ≍ (1 − α)^d, as α → 1 (and d fixed).   (1.9)


Proof. (1.8) follows from Propositions 1.15 and 1.20, while (1.9) is obtained combining Propositions 1.18 and 1.21.

We now list the series of results, postponing their proofs to the next section in order to facilitate their comparison and maintain a cleaner structure.

———————— Lower Bounds for q_L(α, d, k) ————————

We start with a 'general' bound in the sense that it holds true for all d ≥ 1 and α ∈ [0, 1).

Proposition 1.15 (General bound). For any d ≥ 1 and α ∈ [0, 1) one has

q_L(α, d, d) ≥ (1/2)(4d)^{−1/(1−α)}.

Note that for α = 0 this is exactly the lower bound of [47] (cf. Theorem 1.6 in this work). In a similar manner we find bounds for the critical probability in the case that the number k of axes along which the plane is tilted has a more general dependence on the dimension d:

Proposition 1.16. Consider a function φ ∶ N → N_0 with φ(d) ≤ d for all d ∈ N.

(a) If for some α ∈ [0, 1) one has that φ(d) ∈ o(d^{1−α}) as d → ∞, then

q_L(α, d, φ(d)) ≳ (1/8) d^{−1}, as d → ∞.

(b) If for some α ∈ [0, 1) and c ∈ [0, 1] one has φ(d) ∼ cd^{1−α} as d → ∞, then there exists a constant C(c, α) > 0 such that

q_L(α, d, φ(d)) ≳ C(c, α) d^{−1}, as d → ∞.

(c) If for some c ∈ (0, 1] one has φ(d) ∼ cd as d → ∞, then for α ∈ (0, 1),

q_L(α, d, φ(d)) ≳ (1/4)(1 − α)(cd)^{−1/(1−α)}, as d → ∞.

Remark 1.17. The constant in Proposition 1.16 (b) satisfies C(c, 0) = C(0, α) = 1/8 for any c ∈ [0, 1], α ∈ [0, 1), which is what one would hope for, given that these cases correspond to standard Lipschitz percolation (cf. [47], resp. Theorem 1.6 in this work). The bound in Proposition 1.16 (c) is an improvement compared to Proposition 1.15, at the expense of being of asymptotic nature only.


On the other hand, fixing d and k, we can improve the lower bounds in α:

Proposition 1.18. For each d ≥ 1 and each k = 1, . . . , d there exists a constant C(k, d) > 0 such that for all α ∈ [0, 1) one has

q_L(α, d, k) ≥ C(k, d)(1 − α)^k.

And for d = k = 1 we can be more precise with the constant in the asymptotic case:

Proposition 1.19. For d = 1 one has q_L(α, 1, 1) ≳ (1 − α) as α → 1, which together with Proposition 1.21 below yields

q_L(α, 1, 1) ∼ (1 − α), as α → 1.

———————— Upper Bounds for q_L(α, d, k) ————————

For the asymptotic behavior in d, we obtain an upper bound complementing the lower bound in Proposition 1.15.

Proposition 1.20. For every α ∈ [0, 1) there exists a constant C(α) such that

q_L(α, d, d) ≲ C(α) d^{−1/(1−α)}, as d → ∞.

More precisely, C(α) = θ^{1/(1−α)}/(e^θ − 1), where θ is the unique solution to θe^θ/(e^θ − 1) = 1/(1 − α), and C(0) = 1.
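As an added consistency check (our own verification, not taken from [22]): for α = 0 the defining equation reads θe^θ/(e^θ − 1) = 1; since θe^θ/(e^θ − 1) = θ/(1 − e^{−θ}) is increasing in θ and tends to 1 as θ ↓ 0, the solution degenerates to θ → 0, and correspondingly

C(0) = lim_{θ↓0} θ/(e^θ − 1) = 1,

so Proposition 1.20 recovers an upper bound of order d^{−1} for 1 − p_L(d), consistent with Theorem 1.6.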

And finally we have an upper bound matching Proposition 1.18.

Proposition 1.21 (General bound). For any α ∈ [0, 1) and d ∈ N

q_L(α, d, d) ≤ d!(1 − α)^d / (1 + d!(1 − α)^d) ≤ d!(1 − α)^d.

Remark 1.22. Since q_L(α, d, k) ≤ q_L(α, k, k) by Lemma 1.13, Proposition 1.21 immediately yields upper bounds for q_L(α, d, k) for any k = 1, . . . , d also.
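For instance (a worked special case added for illustration), for d = 1 Proposition 1.21 reads

q_L(α, 1, 1) ≤ (1 − α)/(1 + (1 − α)) = (1 − α)/(2 − α) ≤ 1 − α,

which gives the bound q_L(α, 1, 1) ≲ (1 − α) as α → 1 invoked in Proposition 1.19.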

1.3.1 A dual notion: λ-paths

The proofs of all propositions above rest upon the observation, made precise in Lemma 1.24 below, that a Lipschitz surface can be constructed as a blocking surface to suitably defined random paths, called λ-paths. In this sense, the


Lipschitz surface is 'dual' to the set of λ-paths. (Note that this is not related to the notion of duality considered in Chapter 2.3.2.) This Peierls-type argument of considering and cleverly counting suitable sets of paths is the same strategy employed in [19] and [47] for classic Lipschitz percolation. Recall that we denote by e_1, . . . , e_{d+1} ∈ Z^{d+1} the standard basis vectors of Z^{d+1}.

Definition 1.23. For x, y ∈ Z^{d+1} a λ-path from x to y is any finite sequence x = u_0, . . . , u_n = y of distinct sites in Z^{d+1} such that for all i = 1, . . . , n

u_i − u_{i−1} ∈ {e_{d+1}} ∪ {−e_{d+1} ± e_j ∣ j = 1, . . . , d}.

Such a path will be called admissible (with respect to ω) if for all i = 1, . . . , n the following implication holds:

If u_i − u_{i−1} = e_{d+1}, then u_i is closed (with respect to ω).

For any x, y ∈ Z^{d+1} denote by x ↣ y the event that there exists an admissible λ-path from x to y. We then define for all x̄ ∈ Z^d, α ∈ [0, 1), d ∈ N and η ∈ {−1, 0, +1}^d the function

F_η^{α,d}(x̄) ∶= sup{n ∈ Z ∣ ∃y ∈ L_η^{α,d} ∶ y ↣ (x̄, n)} + 1.   (1.10)

Observe that the graph of F_η^{α,d} is contained in L_{η,>}^{α,d}. As in [19] and [47], the graph of this function corresponds to the minimal open Lipschitz surface (above L_η^{α,d}):

Lemma 1.24. The function defined in (1.10) describes a Lipschitz function whose graph consists of open sites if and only if it is finite for all x̄ ∈ Z^d. This in turn holds true if and only if it is finite at x̄ = 0.

Thus, in the analysis of the existence of an open Lipschitz surface we can focus on the behavior of F_η^{α,d}(0).

Proof. Of course, for each of the two equivalences stated, only one implication needs a detailed proof, and in this the choice of the definition of a λ-path, i.e. the steps allowed to them, will become clear.

Assume, for example, that F_η^{α,d}(x̄)(ω) = ∞ for some ω ∈ Ω and x̄ ∈ Z^d. Then, in particular, for any n ∈ N this implies the existence of an admissible λ-path starting in L_η^{α,d} (in this ω) reaching (x̄, ∥x̄∥_1 + n). Since downward-diagonal steps of the type −e_{d+1} ± e_j for j = 1, . . . , d are always permitted (i.e., do not depend on ω), this admissible λ-path can be extended by ∥x̄∥_1 suitable steps to reach (0, n). Since n ∈ N was arbitrary, this means that F_η^{α,d}(0)(ω) = ∞, proving the second equivalence.


To address the first, note that, by definition, (x̄, F_η^{α,d}(x̄) − 1) is the "highest" site reachable by some admissible λ-path from L_η^{α,d} for any x̄ ∈ Z^d. If (x̄, F_η^{α,d}(x̄)) were closed in ω, this path could be extended by an upward step to an admissible λ-path reaching (x̄, F_η^{α,d}(x̄)), contradicting the definition of F_η^{α,d}. In addition, note that for any x̄, ȳ ∈ Z^d such that ∥x̄ − ȳ∥_1 = 1, as above, (x̄, F_η^{α,d}(x̄) − 1) is reachable by an admissible λ-path in ω. Again, using that the downward-diagonal step −e_{d+1} + (ȳ − x̄) is always permitted, we can extend this to an admissible λ-path reaching (ȳ, F_η^{α,d}(x̄) − 2), which implies F_η^{α,d}(ȳ) ≥ F_η^{α,d}(x̄) − 1. By symmetry this yields ∣F_η^{α,d}(x̄) − F_η^{α,d}(ȳ)∣ ≤ 1 and thus proves that F_η^{α,d} is Lipschitz in ω.

Notation 1.25 It will be useful to define L_η^{α,d}(h) ∶= L_η^{α,d} + h e_{d+1} and to denote by ℒ_η^{α,d}(h) the random set of sites in L_η^{α,d}(h) reachable by an admissible λ-path started in the origin. We remind the reader of the practice of writing x̄ ∈ Z^d vs. x ∈ Z^{d+1} with x = (x̄, x_{d+1}) = (x_1, . . . , x_d, x_{d+1}), and of using 0 to denote the origin of Z, Z^d and Z^{d+1}. In addition, due to the symmetries of Z^d and the product structure of P_p, we will w.l.o.g. from now on assume that for any k = 1, . . . , d, the vector η is of the form η = (1, . . . , 1, 0, . . . , 0) with k ones followed by d − k zeros.
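The dual quantity (1.10) is directly computable on finite windows, which may help intuition for the proofs that follow. The sketch below is our own illustration (the truncation to a finite window and all parameter names are assumptions, not part of the text): for d = 1 and η = (1) it performs a breadth-first search over admissible λ-paths started from the plane sites and reads off a finite-window approximation of F_η^{α,1}(0).

    import math
    import random
    from collections import deque

    def f_at_zero(alpha=0.5, p=0.85, width=40, hmax=60, seed=1):
        # Finite-window approximation of F_eta^{alpha,1}(0) from (1.10):
        # lambda-paths take steps e_2 (only onto closed sites) or
        # -e_2 +/- e_1 (always permitted); sites are open w.p. p.
        rng = random.Random(seed)
        closed = {}

        def is_closed(x, h):
            if (x, h) not in closed:
                closed[(x, h)] = rng.random() >= p
            return closed[(x, h)]

        def plane(x):
            return math.floor(alpha * x)

        start = [(x, plane(x)) for x in range(-width, width + 1)]
        seen, queue = set(start), deque(start)
        while queue:
            x, h = queue.popleft()
            for nx, nh in [(x, h + 1), (x - 1, h - 1), (x + 1, h - 1)]:
                if abs(nx) > width or abs(nh - plane(nx)) > hmax:
                    continue
                if nh == h + 1 and not is_closed(nx, nh):
                    continue  # upward steps require a closed target site
                if (nx, nh) not in seen:
                    seen.add((nx, nh))
                    queue.append((nx, nh))
        # sup{ n : some plane site reaches (0, n) } + 1, within the window
        return max(n for x, n in seen if x == 0) + 1

    print(f_at_zero())

By Lemma 1.24, finiteness of this quantity in the untruncated model is exactly the existence of an open Lipschitz surface; for q = 1 − p small the returned height stabilizes as the window grows, while near criticality it is dominated by the truncation.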

1.3.2 Proofs of lower bounds

We begin with a criterion ensuring the existence of an open Lipschitz surface by providing suitable conditions for the P_p-a.s. finiteness of F_η^{α,d} (as defined in (1.10)), which will be applied in all but one of the following proofs of lower bounds.

Lemma 1.26 (Criterion for existence of an open Lipschitz surface). Let F_η^{α,d} be defined as in (1.10). Then, for any x̄ ∈ Z^d and h ∈ N,

P_p(F_η^{α,d}(x̄) − ⌊α ∑_{i=1}^d η_i x̄_i⌋ ≥ h) ≤ E_p[∣ℒ_η^{α,d}(h − 2)∣].   (1.11)

In particular, if

lim_{h→∞} E_p[∣ℒ_η^{α,d}(h)∣] = 0,   (1.12)

then

P_p(LIP_η^{α,d}) = 1.   (1.13)


Proof of Lemma 1.26. In order to prove (1.11) we start by observing that for every x̄ ∈ Z^d,

the random variable F_η^{α,d}(0) + 1 stochastically dominates F_η^{α,d}(x̄) − ⌊α ∑_{i=1}^d η_i x̄_i⌋,   (1.14)

where the +1 stems from lattice effects. Now we estimate

P_p(F_η^{α,d}(0) ≥ h + 1) = P_p(∃z ∈ L_η^{α,d} ∶ z ↣ (0, h)) ≤ ∑_{z∈L_η^{α,d}} P_p(z ↣ (0, h))
  ≤ ∑_{z∈L_η^{α,d}(h)} P_p(0 ↣ z) = E_p[∣ℒ_η^{α,d}(h)∣].

In combination with (1.14), this supplies us with (1.11), which finishes the proof. Note that we used the fact that if a site x = (x̄, h) with h ≥ ⌊α ∑_{i=1}^d η_i x̄_i⌋ is reachable from L_η^{α,d} by an admissible λ-path, then so is any site x = (x̄, i) with ⌊α ∑_{i=1}^d η_i x̄_i⌋ ≤ i ≤ h. This stems from the observation that if we remove the last step the admissible λ-path took in the upward direction and then trace it, we obtain again an admissible λ-path reaching the site right below x.

The fact that (1.12) implies (1.13) follows immediately from (1.11) in combination with the observation below (1.10).

The common core of the proofs of Propositions 1.15 and 1.16 can be summarized in the following, somewhat technical lemma.

Lemma 1.27 (A general lower bound). Let α ∈ [0, 1), d ∈ N and k = 0, . . . , d. Then for any choice of

p_1, p_2, p_3, p_4 ∈ (0, 1) such that ∑_{i=1}^4 p_i = 1   (1.15)

we obtain

q_L(α, d, k) ≥ min { (1/k)√(p_1^2 p_2 p_3),  p_1 (p_3/k)^{1/(1−α)},  p_1 p_4/(2(d − k)) }.   (1.16)

Note that the above holds true for all possible choices of our parameters – in particular for k ∈ {0, d} – if we use the convention 1/0 = ∞. This somewhat inelegant agreement may be justified in this case as it avoids the need of repeating analogous computations without the respective terms.


Proof of Lemma 1.27. In order to obtain the existence of an open Lipschitz surface and thus the lower bound through Lemma 1.26, we will show the following estimate under appropriate assumptions on q = 1 − p: for d ≥ 1, α ∈ [0, 1), k = 1, . . . , d and q smaller than the right-hand side of (1.16), there exist constants δ ∈ (0, 1) and C > 0 such that for all h ∈ N,

E_p[∣ℒ_η^{α,d}(h)∣] ≤ C δ^{h−1}.   (1.17)

We will say that the j-th step of a λ-path (u_n) is positive downward if u_j − u_{j−1} ∈ {−e_{d+1} + e_l ∣ l = 1, . . . , k}, and negative downward if u_j − u_{j−1} ∈ {−e_{d+1} − e_l ∣ l = 1, . . . , k}, and use D^+ = D^+(u), resp. D^− = D^−(u), to denote the number of these steps. In analogy, D = D(u) will denote the number of (neutral) downward steps such that u_j − u_{j−1} ∈ {−e_{d+1} ± e_l ∣ l = k + 1, . . . , d} and U = U(u) will be the number of upward steps, i.e., those for which u_j − u_{j−1} = e_{d+1}. Now for any natural numbers U, D^+, D^− and D, the number of λ-paths starting in the origin with U upward steps as well as D^+ positive, D^− negative and D neutral downward steps, respectively, can be estimated from above by

(U + D^+ + D^− + D)! / (U! D^+! D^−! D!) · k^{D^+ + D^−} (2(d − k))^D.

Thus the expected number of such paths which are admissible can be upper bounded by

(U + D^+ + D^− + D)! / (U! D^+! D^−! D!) · k^{D^+ + D^−} (2(d − k))^D q^U.   (1.18)

In addition, due to the multinomial theorem, for any p_1, p_2, p_3, p_4 chosen as in (1.15) we have

(U + D^+ + D^− + D)! / (U! D^+! D^−! D!) · p_1^U p_2^{D^+} p_3^{D^−} p_4^D ≤ 1,

and hence

(U + D^+ + D^− + D)! / (U! D^+! D^−! D!) ≤ (1/p_1)^U (1/p_2)^{D^+} (1/p_3)^{D^−} (1/p_4)^D.   (1.19)

In order to simplify notation, note that the ‘best strategy’ for admissible λ-paths is to go for the negative orthant in the first d coordinate axes, in the sense that

∑_{y∈L_η^{α,d}(h)} P_p(0 ↣ y) ≤ 2^d ∑_{y∈L_η^{α,d}(h)∩((−N_0)^d×Z)} P_p(0 ↣ y).


Since at each downward step of a λ-path the (d+1)-st coordinate of the path is decreased by one, the total number U(u) of upward steps of a λ-path (u_n) starting in 0 and ending in L_η^{α,d}(h) ∩ ((−N_0)^d × Z) fulfills

U(u) = D^+(u) + D^−(u) + D(u) + ⌊α(D^+(u) − D^−(u))⌋ + h  and  D^+(u) − D^−(u) ≤ 0.
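To see this identity at work, here is an added toy computation (values chosen by us for illustration): take d = k = 1, α = 1/2, h = 1 and a path with D^+ = 0, D^− = 2 and D = 0. Then

U = 0 + 2 + 0 + ⌊(1/2)(0 − 2)⌋ + 1 = 2,

which matches the direct count: the two negative downward steps move the path to x̄ = −2, where the shifted plane L_{(1)}^{1/2,1}(1) has height ⌊(1/2)(−2)⌋ + 1 = 0, and the net vertical displacement U − (D^+ + D^− + D) = 2 − 2 = 0 indeed lands the path at height 0.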

Using (1.18) and (1.19) and choosing q < p_1 we can thus estimate

∑_{y∈L_η^{α,d}(h)∩((−N_0)^d×Z)} P_p(0 ↣ y)
  ≤ ∑_{D^+,D^−,D ≥ 0 ∶ D^−−D^+ ≥ 0, U = h+D^++D^−+D+⌊α(D^+−D^−)⌋} (q/p_1)^U (k/p_2)^{D^+} (k/p_3)^{D^−} (2(d−k)/p_4)^D
  = ∑_{n≥0} ∑_{∆≥0} ∑_{m≥0} (q/p_1)^{n+∆+n+m+⌊−α∆⌋+h} (k/p_2)^n (k/p_3)^{∆+n} (2(d−k)/p_4)^m
  = (q/p_1)^h ∑_{n≥0} (q^2k^2/(p_1^2 p_2 p_3))^n ∑_{∆≥0} (q/p_1)^{∆+⌊−α∆⌋} (k/p_3)^∆ ∑_{m≥0} (2(d−k)q/(p_1p_4))^m
  ≤ (q/p_1)^h ∑_{n≥0} (q^2k^2/(p_1^2 p_2 p_3))^n ∑_{∆≥0} (q/p_1)^{∆(1−α)−1} (k/p_3)^∆ ∑_{m≥0} (2(d−k)q/(p_1p_4))^m
  = (q/p_1)^{h−1} ∑_{n≥0} (q^2k^2/(p_1^2 p_2 p_3))^n ∑_{∆≥0} (q^{1−α}k/(p_1^{1−α}p_3))^∆ ∑_{m≥0} (2(d−k)q/(p_1p_4))^m,   (1.20)

where in the second step we substituted n = D^+, ∆ = D^− − D^+ and m = D, and in the penultimate step we used ⌊−α∆⌋ ≥ −α∆ − 1 together with q/p_1 < 1. Now note that if

q < min { (1/k)√(p_1^2 p_2 p_3),  p_1 (p_3/k)^{1/(1−α)},  p_1 p_4/(2(d − k)) },   (1.21)

then all sums in (1.20) converge and q/p_1 < 1. Thus

E_p[∣ℒ_η^{α,d}(h)∣] = ∑_{y∈L_η^{α,d}(h)} P_p(0 ↣ y)
  ≤ 2^d (q/p_1)^{h−1} · 1/(1 − q^2k^2/(p_1^2 p_2 p_3)) · 1/(1 − q^{1−α}k/(p_1^{1−α} p_3)) · 1/(1 − 2(d−k)q/(p_1 p_4))

and with

δ = δ(q, p_1) ∶= q/p_1

and

C = C(α, d, k, q, p_1, p_2, p_3, p_4) ∶= 2^d · p_1^2 p_2 p_3/(p_1^2 p_2 p_3 − q^2k^2) · p_1^{1−α} p_3/(p_1^{1−α} p_3 − q^{1−α}k) · p_1 p_4/(p_1 p_4 − 2(d − k)q)

we obtain the claim in (1.17). Lemma 1.26 then guarantees the existence of an open Lipschitz surface for q as in (1.21), which completes the proof.

Depending on our choice of the parameters p_1, p_2, p_3, p_4 we now obtain different bounds for the critical probability, leading to the results of Propositions 1.15 and 1.16.

Proof of Proposition 1.15. In order to obtain Proposition 1.15 set

p_1 = 1/2 and p_2 = p_3 = 1/4 − p_4/2.

Note that since we consider the case k = d, the last term on the right-hand side of (1.16) is infinite and hence irrelevant. Comparing the first two terms on the right-hand side of (1.16), one can easily see that the second is the dominating one. Thus, taking p_4 ↓ 0, from (1.16) we can deduce the validity of Proposition 1.15.
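To make the comparison of the two terms explicit (an added verification of the last step): with these choices, k = d and p_4 ↓ 0, we have p_2 = p_3 = 1/4, so the first term of (1.16) becomes

(1/d)√((1/2)^2 · (1/4) · (1/4)) = 1/(8d),

while the second becomes p_1(p_3/d)^{1/(1−α)} = (1/2)(4d)^{−1/(1−α)} ≤ 1/(8d), with equality exactly at α = 0 (since 1/(1 − α) ≥ 1 and 4d > 1). Hence the minimum in (1.16) is attained by the second term, which is the bound claimed in Proposition 1.15.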

Proof of Proposition 1.16. (a) Assume φ(d) ∈ o(d^{1−α}) as d → ∞. Then for any fixed choice of p_1, . . . , p_4, as d → ∞ the last term on the right-hand side of (1.16) is the minimal one and thus determines the lower bound for the critical probability given in (1.16). For every ε > 0, choosing p_1 = 1/2, p_2 = p_3 = ε/2 and p_4 = 1/2 − ε, we get

liminf_{d→∞} q_L(α, d, φ(d)) d ≥ (1/2)(1/2 − ε)(1/2) = (1/4)(1/2 − ε).

Since this is true for any ε > 0, the claim follows.

(b) Now consider the case that for some c ∈ [0, 1] and α > 0 one has φ(d) ∼ cd^{1−α} as d → ∞. Then the second and third term on the right-hand side of (1.16) are of the same order and smaller than the first term. Hence, they dictate the bound. The claim then holds for any feasible choice of p_1, . . . , p_4 and

C(α, c) ∶= min { p_1 (p_3/c)^{1/(1−α)},  (1/2) p_1 p_4 }.


For α = 0 we have to take into consideration all three terms of the right-hand side of (1.16), and thus obtain the claim with

C(0, c) ∶= min { √(p_1^2 p_2 p_3)/c,  p_1 p_3/c,  (1/2) p_1 p_4/(1 − c) }.

(c) Now assume that for some c ∈ (0, 1] one has φ(d) ∼ cd as d → ∞. In this case, the second term on the right-hand side of (1.16) is the asymptotically decisive contribution. Again, for any ε > 0, choosing

p_1 = (1 − α)/(2 − α) − 2ε,  p_2 = p_4 = ε,  and  p_3 = 1 − (1 − α)/(2 − α) = 1/(2 − α)

yields

liminf_{d→∞} q_L(α, d, φ(d)) d^{1/(1−α)} ≥ ((1 − α)/(2 − α) − 2ε) (1/((2 − α)c))^{1/(1−α)}.

Since ε was arbitrary,

liminf_{d→∞} q_L(α, d, φ(d)) d^{1/(1−α)} ≥ (1 − α)/(2 − α) · (1/((2 − α)c))^{1/(1−α)}
  = (1 − α)(1/(2 − α))^{(2−α)/(1−α)} (1/c)^{1/(1−α)} ≥ (1/4)(1 − α)(1/c)^{1/(1−α)}.

The next step is to prove Proposition 1.18, again, using Lemma 1.26.

Proof of Proposition 1.18. In order to derive an upper bound for the expectation in (1.12) of Lemma 1.26, instead of directly looking at λ-paths, we will consider a coarse-grained version of them and estimate the probability of these paths reaching a certain height. The reason for coarse-graining is the following: if q is approximately equal to q_L(α, d, k), then an admissible λ-path starting in 0 (say) will on average pick up at most 1 − α closed sites per horizontal step, and if q is slightly above q_L(α, d, k), then such a path will certainly exist. When α is very close to one, then the average number of sites which such a path visits between two successive visits of closed sites will be of the order (1 − α)^{−1} (which is large). If d ≥ 2, then there will automatically be lots of admissible λ-paths visiting exactly the same closed sites (in the same order) but taking different routes in between successive visits to closed sites, the factor increasing to infinity as α approaches 1. This means that estimating the probability that there exists an admissible λ-path (with


a certain property) by the expected number of such paths (via Markov's inequality) becomes very poor when α is close to 1. Therefore, we will define larger boxes in Z^{d+1} and define equivalence classes of paths by just observing the sequence of larger boxes they visit. The boxes will then be tuned such that the number of closed sites inside a box is of order one.

Figure 1.2: An illustration of the coarse-grained lattice. L_η^{α,d} is marked by the black dots and the corresponding coarse-grained boxes are hatched. B_0^{α,d,η} is double hatched.

Recall that w.l.o.g. we assume η_i = 1, i = 1, . . . , k and η_i = 0, i = k + 1, . . . , d. To facilitate reading, we have structured the proof into three steps.

Step 1: Coarse-grained λ-paths. In order to define the above-mentioned paths we partition Z^{d+1} by dividing R^{d+1} into boxes as illustrated in Figure 1.2. Define

B_0^{α,d,η} ∶= {r ∈ R^{d+1} ∣ ∀i = 1, . . . , k ∶ r_i ∈ [0, (1 − α)^{−1}[, ∀i = k + 1, . . . , d ∶ r_i ∈ [0, 1[,
  and r_{d+1} ∈ (α ∑_{i=1}^d η_i r_i − 1, α ∑_{i=1}^d η_i r_i]}

and likewise for a ∈ Z^{d+1} set B_a^{α,d,η} ∶= B_0^{α,d,η} + v(a), where

v(a) ∶= ∑_{i=k+1,...,d} a_i e_i + ∑_{i=1,...,k} a_i (1/(1 − α))(e_i + αη_i e_{d+1}) + a_{d+1} e_{d+1}
     = ∑_{i=k+1,...,d} a_i e_i + ∑_{i=1,...,k} a_i (1/(1 − α)) e_i + ( ∑_{i=1,...,k} a_i η_i (α/(1 − α)) + a_{d+1} ) e_{d+1}.

Note that these boxes are translations of B_0^{α,d,η} shifted either in the direction of e_{d+1} or parallel to the inclination of L_η^{α,d}, and are such that Z^{d+1} = ⋃_{a∈Z^{d+1}} (B_a^{α,d,η} ∩ Z^{d+1}), where the union is over disjoint sets. For any y ∈ Z^{d+1}

the coordinates of the box it is contained in are given by a(y) ∈ Z^{d+1} as

a_i(y) ∶= { y_i,                               i = k + 1, . . . , d,
            ⌊(1 − α)y_i⌋,                      i = 1, . . . , k,
            y_{d+1} − ⌊α ∑_{j=1}^d η_j y_j⌋,   i = d + 1.

We will refer to these as the coarse-grained coordinates. Note that they describe the position of the boxes relative to L_η^{α,d}. For y ∈ Z^{d+1} the (d+1)-st coordinate of its coarse-grained coordinates a(y) gives its height (or distance in the (d+1)-st coordinate) relative to L_η^{α,d}. Since α, d and η are fixed for this proof, we will often drop the superscripts for the sake of better readability.

With the above partition of Z^{d+1} at hand, we can now define coarse-grained λ-paths. If we sample a standard λ-path only on the boxes B_a, a ∈ Z^{d+1}, it visits, we obtain a path on these boxes. Such a path on B_a, a ∈ Z^{d+1}, that is the trace of a standard λ-path will be called a coarse-grained λ-path. These paths can take a step from a box B_a to B_{a′} only if

a′ − a ∈ {e_{d+1}} ∪ {−e_i ∣ i = 1, . . . , k} ∪ {−e_{d+1}} ∪ {−2e_{d+1}} ∪ {±e_i − e_{d+1} ∣ i = 1, . . . , d} ∪ {e_i − 2e_{d+1} ∣ i = 1, . . . , k}.   (1.22)

(But note that not all paths consisting of the type of steps described in (1.22) are traces of standard λ-paths and thus coarse-grained λ-paths.) We call a box B_a closed (with respect to ω) if and only if ω(x) = 0 for at least one x ∈ B_a. Similarly to the case of λ-paths, we will call a coarse-grained λ-path admissible if for each of its upward steps, i.e., those steps for which a′ − a = e_{d+1}, the box B_{a′} is closed. A moment's thought reveals that the above sampling procedure maps admissible λ-paths to admissible coarse-grained λ-paths, and thus the existence of an admissible λ-path from some x ∈ Z^{d+1} to y ∈ Z^{d+1} implies the existence of an admissible coarse-grained λ-path from B_{a(x)} to B_{a(y)}. We therefore investigate the behavior of these coarse-grained λ-paths more closely.

Step 2: An estimate for coarse-grained λ-paths. Recalling (1.22), note that there is only one kind of step in a coarse-grained λ-path that will not change its height relative to L_η^{α,d}, i.e., its coarse-grained coordinate in the (d+1)-st dimension, namely those of the form −e_i with i = 1, . . . , k. Use CG(M) to denote the set of all coarse-grained λ-paths starting with B_0 of length M ∈ N whose endpoint, i.e., its last box, is above or intersects L_η^{α,d}. For π ∈ CG(M), use U = U(π) to denote the number of its 'up'-steps, i.e., those steps that increase the (d+1)-st coarse-grained coordinate. Similarly, use D = D(π) to denote the number of steps that decrease the (d+1)-st coarse-grained coordinate (possibly by more than 1) and D_0^i = D_0^i(π) the number of steps

in each dimension i = 1, . . . , d that do not alter the (d+1)-st coarse-grained coordinate. Due to the natural restrictions on the movements of the standard and thus the coarse-grained λ-paths, D_0^i = 0 for any i = k + 1, . . . , d. We can now make the following observation: in order for π to end in a box above or intersecting L_η^{α,d}, we necessarily have

U ≥ D.

In addition, observe that due to the length of the boxes in the corresponding directions being 1/(1 − α), between two steps of type D_0^i (for the same i) there needs to be at least one step of type D or U (not D_0^j, j ≠ i). This implies that D_0^i ≤ D + U + 1.

Therefore, for a coarse-grained λ-path π ∈ CG(M), recalling that it ends above or intersecting L_η^{α,d},

M = U + D + ∑_{i=1}^d D_0^i ≤ 2U + ∥η∥_1(2U + 1) = 2U(k + 1) + k  ⟺  U ≥ (M − k)/(2(k + 1)).   (1.23)

Thus, we will now estimate the probability of the event on the right-hand side of the above display. Write m(π) for the number of distinct boxes visited by a path π ∈ CG(M). Then the exponential Chebyshev inequality yields for any β > 0 and γ ∈ (0, 1) that

P_p(there exists π ∈ CG(M) whose boxes contain at least γM closed sites)
  ≤ ∑_{π∈CG(M)} P_p(boxes of π contain at least γM closed sites)
  ≤ ∑_{π∈CG(M)} exp(−βγM) E_p[exp(β (# of closed sites in boxes of π))]
  = ∑_{π∈CG(M)} exp(−βγM) E_p[exp(β (# of closed sites in m(π) distinct boxes))]
  = ∑_{π∈CG(M)} exp(−βγM) (E_p[exp(β (# of closed sites in B_0))])^{m(π)}
  = ∑_{π∈CG(M)} exp(−βγM) (exp(β)q + (1 − q))^{⌈1/(1−α)⌉^k m(π)}
  ≤ ∑_{π∈CG(M)} exp(−βγM) (exp(β)q + (1 − q))^{⌈1/(1−α)⌉^k M}


\[
\le (2(2d+1))^M \frac{1}{\exp(\beta\gamma M)} \big(\exp(\beta)q + (1-q)\big)^{\lceil\frac{1}{1-\alpha}\rceil^k M} \le \exp\Big(M\Big(\log(4d+2) - \beta\gamma + q(\exp(\beta)-1)\Big(\frac{2-\alpha}{1-\alpha}\Big)^k\Big)\Big), \tag{1.24}
\]
using $\exp(\beta)q + (1-q) \le \exp(q(\exp(\beta)-1))$ in the last step, and where in the penultimate inequality we estimated the total number of coarse-grained $\lambda$-paths of length $M$ by $(2(2d+1))^M$. Observe that, choosing $\beta = \frac{1+\epsilon}{\gamma}\log(4d+2)$ for some $\epsilon > 0$, the expression inside the exponential is negative if, and only if,
\[
-\epsilon\log(4d+2) + q\Big(\exp\Big(\frac{1+\epsilon}{\gamma}\log(4d+2)\Big)-1\Big)\Big(\frac{2-\alpha}{1-\alpha}\Big)^k < 0
\iff q < \frac{\epsilon\log(4d+2)}{\exp\big((1+\epsilon)\gamma^{-1}\log(4d+2)\big)-1}\Big(\frac{1-\alpha}{2-\alpha}\Big)^k. \tag{1.25}
\]
Step 3: Returning to $\lambda$-paths. In order to apply Lemma 1.26 we need to estimate the probability of reaching a site $y \in L^{\alpha,d}_\eta(h)$ with an admissible $\lambda$-path. Recall that coarse-grained $\lambda$-paths were defined in such a way that the existence of an admissible $\lambda$-path from $0 \in \mathbb{Z}^{d+1}$ to $y \in \mathbb{Z}^{d+1}$ implies the existence of an admissible coarse-grained $\lambda$-path from $B_0$ to $B_{a(y)}$. This path then has length $M$ at least $\|a(y)\|_1$ and thus

\begin{align*}
M \ge \|a(y)\|_1 &\ge \sum_{i=1}^{d} |a_i(y)| + h \\
&= \sum_{i=k+1,\dots,d} |y_i| + \sum_{i=1,\dots,k} \big|\lfloor(1-\alpha)y_i\rfloor\big| + h \\
&\ge \sum_{i=k+1,\dots,d} |y_i| + \sum_{i=1,\dots,k} \big((1-\alpha)|y_i| - 1\big) + h \\
&\ge (1-\alpha)\|\bar y\|_1 - k + h.
\end{align*}

Therefore, for any $h \in \mathbb{N}$ and $y \in L^{\alpha,d}_\eta(h)$, using (1.23) in the third step,

\begin{align*}
P_p(0 \rightarrowtail y) &\le P_p\big(\text{there exists an admissible coarse-grained } \lambda\text{-path from } B_0 \text{ to } B_{a(y)}\big)\\
&\le P_p\big(\text{there exists } \pi \in CG\big((1-\alpha)\|\bar y\|_1 - k + h\big) \text{ admissible}\big)\\
&\le P_p\Big(\text{there exists } \pi \in CG\big((1-\alpha)\|\bar y\|_1 - k + h\big) \text{ whose boxes contain at least } \frac{(1-\alpha)\|\bar y\|_1 - k + h - k}{2(k+1)} \text{ closed sites}\Big)\\
&\le \exp\Big(\big((1-\alpha)\|\bar y\|_1 - k + h\big)\Big(-\epsilon\log(4d+2) + q\big(\exp\big((1+\epsilon)4(k+1)\log(4d+2)\big) - 1\big)\Big(\frac{2-\alpha}{1-\alpha}\Big)^k\Big)\Big),
\end{align*}
where we choose $h \ge 3k$ and set $\gamma := \frac{1}{4(k+1)}$ to apply (1.24) for the last inequality. Assuming

\[
q < \underbrace{\frac{\epsilon\log(4d+2)}{\exp\big((1+\epsilon)4(k+1)\log(4d+2)\big) - 1}\cdot\frac{1}{2^k}}_{=:\,C(k,d,\epsilon)}\,(1-\alpha)^k
\]
holds, (1.25) is satisfied and, combining the observations above, we can estimate (1.12) by

\begin{align*}
\sum_{y\in L^{\alpha,d}_\eta(h)} P_p(0 \rightarrowtail y) &\le \sum_{y\in L^{\alpha,d}_\eta(h)} \exp\Big(\big((1-\alpha)\|\bar y\|_1 - k + h\big)\underbrace{\Big(-\epsilon\log(4d+2) + q\Big(\exp\Big(\frac{1+\epsilon}{\gamma}\log(4d+2)\Big) - 1\Big)\Big\lceil\frac{1}{1-\alpha}\Big\rceil^k\Big)}_{=:\,\bar c(k,d,\epsilon,\alpha,q)\,=\,\bar c\,<\,0}\Big)\\
&= \exp\big((-k+h)\bar c\big) \sum_{y\in L^{\alpha,d}_\eta(h)} \exp\big((1-\alpha)\|\bar y\|_1\,\bar c\big)\\
&\le \exp\big((-k+h)\bar c\big) \underbrace{\sum_{i=1}^{\infty} \exp\big((1-\alpha)i\,\bar c\big)(2d+1)^i}_{<\,\infty}.
\end{align*}
Thus

\[
\lim_{h\to\infty} E_p\big[\big|L^{\alpha,d}_\eta(h)\big|\big] = \lim_{h\to\infty} \sum_{y\in L^{\alpha,d}_\eta(h)} P_p(0 \rightarrowtail y) = 0.
\]
Therefore, the assumptions of Lemma 1.26 hold, which implies the existence of an open Lipschitz surface. Hence,

\[
q_L(\alpha,k,d) \ge C(k,d,\epsilon)(1-\alpha)^k.
\]

Note that for our result, any $\epsilon > 0$ is sufficient. However, the optimal $\epsilon$ is given by
\[
\epsilon = \frac{1+h}{4(k+1)\log(4d+2)},
\]
where $h$ is such that $h\exp(h) = -\exp\big(-1 - 4(k+1)\log(4d+2)\big)$.
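For concreteness, this optimizer can be evaluated with the Lambert W function. The following small Python sketch is our own illustration (it assumes NumPy and SciPy; the function names are ours) and computes the optimal $\epsilon$ together with the constant $C(k,d,\epsilon)$ from the display above:

import numpy as np
from scipy.special import lambertw

def optimal_eps(k, d):
    # h solves h*exp(h) = -exp(-1 - 4(k+1)log(4d+2)); the principal
    # branch W_0 gives the solution with h > -1, hence eps > 0
    c = 4 * (k + 1) * np.log(4 * d + 2)
    h = lambertw(-np.exp(-1.0 - c), k=0).real
    return (1.0 + h) / c

def C(k, d, eps):
    # the constant C(k, d, eps) defined in the assumption above
    num = eps * np.log(4 * d + 2)
    den = np.expm1((1.0 + eps) * 4 * (k + 1) * np.log(4 * d + 2)) * 2 ** k
    return num / den

# e.g. eps_star = optimal_eps(1, 2) maximizes eps -> C(1, 2, eps)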

Proof of Proposition 1.19. In order to prove the lower bound for $q_L(\alpha,1,1)$ we show the existence of an open Lipschitz surface for sufficiently small $q$ by analyzing the existence of an admissible $\lambda$-path starting in $L^{\alpha,1}_{(1)}$ reaching the site $(0,h)$ for large $h \in \mathbb{N}_0$. Writing $x \overset{A}{\rightarrowtail} y$ for the event of existence of an

admissible $\lambda$-path from $x \in \mathbb{Z}^2$ to $y \in \mathbb{Z}^2$ that only uses sites in the set $A \subseteq \mathbb{Z}^2$, and defining $L^{\alpha,1}_{(1),\ge} := L^{\alpha,1}_{(1),>} \cup L^{\alpha,1}_{(1)}$, we observe that

\[
P_p\big(L^{\alpha,1}_{(1)} \rightarrowtail (0,h)\big) = P_p\big(L^{\alpha,1}_{(1)} \overset{L^{\alpha,1}_{(1),\ge}}{\rightarrowtail} (0,h)\big) \le 2\, P_p\Big(\bigcup_{n\in\mathbb{N}_0} \big\{(n,\lfloor\alpha n\rfloor) \overset{L^{\alpha,1}_{(1),\ge}}{\rightarrowtail} (0,h)\big\}\Big) \le 2\sum_{n=0}^{\infty} P_p\big((n,\lfloor\alpha n\rfloor) \overset{L^{\alpha,1}_{(1),\ge}}{\rightarrowtail} (0,h)\big), \tag{1.26}
\]
for any $h \in \mathbb{N}_0$. Therefore we need to find suitable upper bounds for the summands.
A first helpful bound, albeit without the restriction on the space, can be obtained similarly to (1.18). Observe that any $\lambda$-path from $(n,\lfloor\alpha n\rfloor)$ to $(0,h)$ must have made a total of $4k + \lceil(2-\alpha)n\rceil + h$ steps for some $k \in \mathbb{N}_0$: $n+k$ to the downward left, $k$ to the downward right and $n - \lfloor\alpha n\rfloor + h + 2k$ upwards. Then, counting the number of admissible $\lambda$-paths under consideration

\[
P_p\big((n,\lfloor\alpha n\rfloor) \overset{L^{\alpha,1}_{(1),\ge}}{\rightarrowtail} (0,h)\big) \le P_p\big((n,\lfloor\alpha n\rfloor) \rightarrowtail (0,h)\big) \le \sum_{k\in\mathbb{N}_0} \binom{2n + h - \lfloor\alpha n\rfloor + 4k}{n+k,\ k,\ n-\lfloor\alpha n\rfloor+h+2k}\, q^{\,n-\lfloor\alpha n\rfloor+h+2k}. \tag{1.27}
\]

This upper bounds the terms for small n in (1.26), but can also be used to obtain an adequate estimate for large n. This is, however, more elaborate: For n ∈ N0 define

\[
A_n := \{-n, -(n-1), \dots, -1, 0, 1, \dots\} \times \mathbb{Z}, \qquad Y_n := \max\big\{r \in \mathbb{Z} \mid (0,0) \overset{A_n}{\rightarrowtail} (-n,r)\big\}.
\]

$Y_n$ is defined as the height of the highest site above $-n$ reachable by an admissible $\lambda$-path started in 0 under the restriction of using only the sites in $A_n$, but in fact also $Y_n = \max\{r \in \mathbb{Z} \mid \{0\}\times\mathbb{Z}_- \overset{A_n}{\rightarrowtail} (-n,r)\}$. Now denote by $\bar Y_0$ a copy of $Y_0$, independent of $(Y_n)_{n\in\mathbb{N}_0}$. Then, for any $n \in \mathbb{N}_0$, if $Y_n < \infty$ $P_p$-a.s.,
\[
\bar Y_0 \text{ stochastically dominates } Y_{n+1} - (Y_n - 1) \text{ under } P_p(\,\cdot \mid Y_i,\, i \le n), \tag{1.28}
\]
since the conditioning can be seen as discarding those paths in the construction using any site visited by the previous paths leading to the $(-i, Y_i)$, $i \le n$. Therefore, a closer study of the distribution of $\bar Y_0$ seems advisable. The observations below will, in particular, prove $Y_0$ to be finite $P_p$-a.s., which by


induction with (1.28) assures the necessary finiteness of all $Y_n$, $n \in \mathbb{N}_0$. Using (1.27),
\[
P_p(\bar Y_0 \ge m) \le P_p\big((0,0) \rightarrowtail (0,m)\big) \le q^m + \sum_{k\in\mathbb{N}} 3^{m+4k} q^{m+2k} \le q^m + (3q)^m\,\frac{(9q)^2}{1-(9q)^2}, \tag{1.29}
\]
for $q < 1/9$. Hence, we can upper bound the expectation

\[
E_p[\bar Y_0] \le q + \sum_{m=2}^{\infty} q^m + \frac{(9q)^2}{1-(9q)^2}\sum_{m=1}^{\infty} (3q)^m \le q + Cq^2
\]
for a suitable $C > 0$ and small $q$. As a consequence, assuming $q$ sufficiently small for

\[
q + Cq^2 - 1 < -\alpha \tag{1.30}
\]
to hold, (1.28) and a large deviation principle (the required exponential moments exist due to (1.29)) yield the existence of $c_1, c_2 > 0$ such that

\[
P_p(Y_n \ge -\alpha n) \le c_1\exp(-nc_2).
\]
Observe that an admissible $\lambda$-path started in some $(n,\lfloor\alpha n\rfloor)$ and reaching $\{0\}\times\mathbb{N}_0$ going only through $L^{\alpha,1}_{(1),\ge}$ has only used sites to the right of $\{0\}\times\mathbb{Z}$ until the first time it hits $\{0\}\times\mathbb{N}_0$. Hence,

\[
P_p\big((n,\lfloor\alpha n\rfloor) \overset{L^{\alpha,1}_{(1),\ge}}{\rightarrowtail} (0,h)\big) \le P_p\big((n,\lfloor\alpha n\rfloor) \overset{L^{\alpha,1}_{(1),\ge}}{\rightarrowtail} \{0\}\times\mathbb{N}_0\big) \le P_p(Y_n \ge -n\alpha) \le c_1\exp(-nc_2).
\]
This is the last component needed to estimate (1.26) as it allows us to choose $N \in \mathbb{N}$ such that for any $h \in \mathbb{N}$

\[
\sum_{n=N}^{\infty} P_p\big((n,\lfloor\alpha n\rfloor) \overset{L^{\alpha,1}_{(1),\ge}}{\rightarrowtail} (0,h)\big) \le \frac{1}{8}.
\]
On the other hand, using (1.27) again, we may now choose $H$ sufficiently large such that for all $h \ge H$,

\[
\sum_{n=0}^{N-1} P_p\big((n,\lfloor\alpha n\rfloor) \overset{L^{\alpha,1}_{(1),\ge}}{\rightarrowtail} (0,h)\big) \le \frac{1}{8}.
\]


Hence, by (1.26) choosing q as in (1.30) implies

\[
P_p\big(L^{\alpha,1}_{(1)} \rightarrowtail (0,h)\big) \le \frac{1}{2}
\]

for all $h \ge H$. Recalling Lemma 1.24 this means that $P_p(F^{\alpha,d}_\eta(0) = \infty) \le 1/2$ and thus $1 - P_p(\mathrm{LIP}^{\alpha,d}_\eta) = P_p(F^{\alpha,d}_\eta(0) = \infty) = 0$, yielding $q < q_L(\alpha,1,1)$. The corresponding upper bound is given by Proposition 1.21.

1.3.3 Proofs of upper bounds

In this section it will be useful to consider what we call reversed $\lambda$-paths.

Definition 1.28. A sequence of sites $x_0, x_1, \dots, x_n \in \mathbb{Z}^{d+1}$ is called an (admissible) reversed $\lambda$-path, if $x_n, x_{n-1}, x_{n-2}, \dots, x_0$ is an (admissible) $\lambda$-path in the sense of Definition 1.23.

Furthermore, the proof of Proposition 1.20 will take advantage of a comparison to so-called $\rho$-percolation, see e.g. [77] and [63]. Here the setting is that of oriented site-percolation in $\mathbb{Z}^d$, i.e., where in addition to our standard setting of Bernoulli site percolation we assume the nearest neighbor edges of $\mathbb{Z}^d$ to be oriented in the direction of the positive coordinate vectors (which is the sense of orientation for the rest of this section).

Definition 1.29. We say that $\rho$-percolation occurs for $\omega \in \{0,1\}^{\mathbb{Z}^d}$ if there exists an oriented nearest neighbor path $0 = \bar x_0, \bar x_1, \dots$ in $\mathbb{Z}^d$ starting in the origin, such that

\[
\liminf_{n\to\infty} \frac{1}{n}\sum_{i=1}^{n}\big(1-\omega(\bar x_i)\big) \ge \rho.
\]
Any such path is called a $\rho$-path.

As observed in [77] the probability of the existence of such a path exhibits a phase transition in the parameter $q$ and the corresponding critical probability is denoted by $q_c(\rho,d)$.

Theorem 1.30 (Theorem 2 in [63]). For every ρ ∈ (0, 1],

\[
\lim_{d\to\infty} d^{1/\rho}\, q_c(\rho,d) = \Big(\frac{\theta}{e^\theta - 1}\Big)^{1/\rho} =: R(\rho), \tag{1.31}
\]
where $\theta$ is the unique solution to $\theta e^\theta/(e^\theta - 1) = 1/\rho$, and $R(1) = 1$.
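For orientation, $R(\rho)$ is easy to evaluate numerically. The sketch below is our own code (assuming SciPy's brentq root-finder; it is not part of [63]) and solves the defining equation for $\theta$ by bisection:

import numpy as np
from scipy.optimize import brentq

def R(rho):
    # theta solves theta*e^theta/(e^theta - 1) = 1/rho; for rho = 1 the
    # solution degenerates to theta = 0 and R(1) = 1
    if rho >= 1.0:
        return 1.0
    f = lambda t: t * np.exp(t) / np.expm1(t) - 1.0 / rho
    theta = brentq(f, 1e-12, 100.0)  # bracket valid unless rho is tiny
    return (theta / np.expm1(theta)) ** (1.0 / rho)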


Note that we have interchanged the roles of 'open' and 'closed' (and thus $p$ and $q$) with respect to [63] in order to adapt the result to its application in this work. Before turning to the proof of Proposition 1.20, we observe a useful property of the critical probability of $\rho$-percolation.

Lemma 1.31 (Continuity of qc). The critical probability of ρ-percolation is continuous in ρ, i.e., for any d ∈ N the map

\[
[0,1) \ni \rho \mapsto q_c(\rho,d) \tag{1.32}
\]
is continuous.

Proof of Lemma 1.31. Since $d$ is fixed and we only consider $\mathbb{Z}^d$ in this proof, the index is dropped for better readability. It is easy to see that the event of $\rho$-percolation also undergoes a phase-transition in $\rho$ (for fixed $q$) and thus we define

\[
\rho_c(q) := \sup\{\rho \mid P_{1-q}(\rho\text{-percolation occurs}) = 1\}.
\]

Note that strict monotonicity of $\rho_c(q)$ for $q \in [0,\bar q]$, where $\bar q := \sup\{q \mid \rho_c(q) < 1\}$, would imply the desired continuity of $q_c(\rho)$ on $[0,1)$. In order to prove this strict monotonicity, we will, however, first consider a different quantity: Still in the setting of oriented percolation in $\mathbb{Z}^d$, for any $\omega \in \{0,1\}^{\mathbb{Z}^d}$ let

\[
Y_{0,n}(\omega) := \max\Big\{r \in \mathbb{N}_0 \;\Big|\; \exists \text{ directed nearest neighbor path } 0 = x_0, x_1, \dots, x_n : \sum_{i=1}^{n}\big(1-\omega(x_i)\big) = r\Big\},
\]
and denote by $\hat X_n$ the site with the lowest lexicographical order that is the endpoint of such a directed nearest neighbor path on which the value of $Y_{0,n}$ is attained. Then, for $m \ge n$ define

\[
Y_{n,m}(\omega) := \max\Big\{r \in \mathbb{N}_0 \;\Big|\; \exists \text{ directed nearest neighbor path } \hat X_n = x_0, x_1, \dots, x_{m-n} : \sum_{i=1}^{m-n}\big(1-\omega(x_i)\big) = r\Big\}.
\]

Then $(-Y_{n,m})_{m\ge n\in\mathbb{N}}$ fulfils the assumptions of the Subadditive Ergodic Theorem (see e.g. [23], Theorem 6.6.1), namely

1. $-Y_{0,n} - Y_{n,m} \ge -Y_{0,m}$,

2. $(-Y_{nk,(n+1)k})_{n\in\mathbb{N}_0}$ is stationary and ergodic for every $k \in \mathbb{N}$,


3. the distribution of $(-Y_{k,k+m})_{m\in\mathbb{N}}$ does not depend on $k \in \mathbb{N}$,

4. $E_{1-q}[(-Y_{0,1})^+] < \infty$ and for each $n \in \mathbb{N}$, $E_{1-q}[-Y_{0,n}] \ge -n$.

Thus the sequence $(Y_{0,n}/n)_{n\in\mathbb{N}}$ converges $P_{1-q}$-a.s. and in $L^1(P_{1-q})$ to a (deterministic) limit that we denote by $\gamma(q)$. In fact,

γ(q) = ρc(q). (1.33)

To see this, fix $q \in (0,1)$ and choose $\rho < \rho_c(q)$. Then for $P_{1-q}$-almost any $\omega \in \{0,1\}^{\mathbb{Z}^d}$ there exists an oriented nearest neighbor path $X_1(\omega), X_2(\omega), \dots$ such that
\[
\rho \le \liminf_{n\to\infty} \frac{1}{n}\sum_{i=1}^{n}\big(1-\omega(X_i(\omega))\big).
\]
Since by definition $\sum_{i=1}^{n}(1-\omega(X_i(\omega))) \le Y_{0,n}(\omega)$ for $P_{1-q}$-almost all $\omega \in \{0,1\}^{\mathbb{Z}^d}$ and $n \in \mathbb{N}$, taking the limit inferior on both sides gives $\rho \le \gamma(q)$, which implies $\rho_c(q) \le \gamma(q)$. To prove the converse inequality, choose, for any $\varepsilon > 0$, an $N \in \mathbb{N}$ such that $\frac{1}{N}E_{1-q}[Y_{0,N}] \ge \gamma(q) - \varepsilon$. For any $\omega \in \{0,1\}^{\mathbb{Z}^d}$ let $X_1(\omega), \dots, X_N(\omega)$ be an (oriented nearest neighbor) path such that $Y_{0,N} = \sum_{i=1}^{N}(1-\omega(X_i(\omega)))$. Using i.i.d. copies of $(X_1,\dots,X_N)$, one can construct an infinite oriented nearest neighbor path $(\tilde X_i)_{i\in\mathbb{N}_0}$ with the property that by the law of large numbers

\[
\lim_{n\to\infty}\frac{1}{n}\sum_{i=1}^{n}\big(1-\omega(\tilde X_i)\big) = \frac{1}{N}E_{1-q}[Y_{0,N}] \ge \gamma(q) - \varepsilon \quad P_{1-q}\text{-a.s.}
\]

Thus $\gamma(q) - \varepsilon \le \rho_c(q)$ and since $\varepsilon$ was arbitrary, $\gamma(q) \le \rho_c(q)$, which in combination with the above establishes (1.33).
The strict monotonicity of $\gamma(\cdot)$ (and thus $\rho_c(\cdot)$) can now be proven through a suitable coupling argument. Denote by $U_{[0,1]}$ the uniform measure on the interval $[0,1]$ and define $\mu := U_{[0,1]}^{\otimes\mathbb{Z}^d}$ as the product measure on the space $\mathcal{W} := [0,1]^{\mathbb{Z}^d}$. For any $w \in \mathcal{W}$, $q \in (0,1)$ and $n \in \mathbb{N}_0$ define

\[
Y_n^q(w) := \max\Big\{r \in \mathbb{N}_0 \;\Big|\; \exists \text{ directed nearest neighbor path } 0 = x_0, x_1, \dots, x_n : \sum_{i=1}^{n}\mathbb{1}_{[0,q]}(w(x_i)) = r\Big\}.
\]
Observe that $\mathcal{L}_\mu((Y_n^q)_{n\in\mathbb{N}_0}) = \mathcal{L}_{P_{1-q}}((Y_{0,n})_{n\in\mathbb{N}_0})$, where $\mathcal{L}_\nu$ denotes the law with respect to the measure $\nu$. Therefore

\[
\lim_{n\to\infty}\frac{1}{n}Y_n^q = \gamma(q) \quad \mu\text{-a.s. and in } L^1(\mu).
\]


As before, for any $q \in (0,1)$, $w \in \mathcal{W}$ and $n \in \mathbb{N}_0$, let $X_1^{q,n}(w), \dots, X_n^{q,n}(w)$ be an oriented nearest neighbor path such that $Y_n^q = \sum_{i=1}^{n}\mathbb{1}_{[0,q]}(w(X_i^{q,n}(w)))$. Choose $0 \le q < q' \le \bar q$, then
\[
Y_n^{q'} = \sum_{i=1}^{n}\mathbb{1}_{[0,q']}\big(w(X_i^{q',n}(w))\big) \ge \sum_{i=1}^{n}\mathbb{1}_{[0,q']}\big(w(X_i^{q,n}(w))\big) = Y_n^q + \sum_{i=1}^{n}\mathbb{1}_{[q,q']}\big(w(X_i^{q,n}(w))\big). \tag{1.34}
\]
Set $\mathcal{F}_q := \sigma\big(w \mapsto \mathbb{1}_{[0,q]}(w(x)) \mid x \in \mathbb{Z}^d\big)$. Then, obviously, the $Y_n^q$ are $\mathcal{F}_q$-measurable and the $\mathbb{1}_{[q,q']}(w(X_i^{q,n}(w)))$, $1 \le i \le n$, are independent given $\mathcal{F}_q$. In addition,

\[
\mu\big(\mathbb{1}_{[q,q']}(w(X_i^{q,n}(w))) = 1 \,\big|\, \mathcal{F}_q\big) = \frac{q'-q}{1-q}\,\mathbb{1}_{\{w(X_i^{q,n}(w)) > q\}}.
\]
Thus using (1.34) we obtain

\[
E_\mu\big[Y_n^{q'} - Y_n^q \,\big|\, \mathcal{F}_q\big] \ge E_\mu\Big[\sum_{i=1}^{n}\mathbb{1}_{[q,q']}\big(w(X_i^{q,n}(w))\big) \,\Big|\, \mathcal{F}_q\Big] = \big(n - Y_n^q\big)\frac{q'-q}{1-q},
\]
where $E_\mu$ denotes the expectation with respect to $\mu$. Using the $L^1(\mu)$ convergence

\begin{align*}
\gamma(q') - \gamma(q) &= \lim_{n\to\infty} E_\mu\Big[E_\mu\Big[\frac{1}{n}\big(Y_n^{q'} - Y_n^q\big) \,\Big|\, \mathcal{F}_q\Big]\Big]\\
&\ge \lim_{n\to\infty} E_\mu\Big[\frac{1}{n}\big(n - Y_n^q\big)\frac{q'-q}{1-q}\Big] = \big(1-\gamma(q)\big)\frac{q'-q}{1-q}
\end{align*}
and the right-hand side is positive, since $\gamma(q) = \rho_c(q) < 1$ for $q < \bar q$. This shows the strict monotonicity of the function $\rho_c$ on $[0,\bar q]$ and hence implies (1.32).

Proof of Proposition 1.20. Note that the projection of a reversed $\lambda$-path onto its first $d$ coordinates is in fact a (lazy) nearest neighbor path in $\mathbb{Z}^d$. We will use this to compare $\rho$-paths in $\mathbb{Z}^d$ with reversed admissible $\lambda$-paths in $\mathbb{Z}^{d+1}$. To this end define for any $\omega \in \{0,1\}^{\mathbb{Z}^{d+1}}$ and $\bar x \in \mathbb{Z}^d$ the quantity

\[
H_\omega(\bar x) := \min\Big\{h \in \mathbb{N}_0 \;\Big|\; \exists \text{ an oriented nearest neighbor path } 0 = \bar x_0, \dots, \bar x_m = \bar x \in \mathbb{Z}^d \text{ and a sequence } 0 = h_0, \dots, h_m = h \in \mathbb{N}_0 \text{ s.t. } h_{i+1} = \begin{cases} h_i, & \text{if } \omega(\bar x_i, h_i) = 0, \\ h_i + 1, & \text{otherwise} \end{cases}\Big\}.
\]


A second's thought reveals that this map is defined in such a way that the sequence $0 = (\bar x_0, h_0), \dots, (\bar x_m, h_m) = (\bar x, H_\omega(\bar x))$ from the definition actually is an admissible $\lambda$-path from $(\bar x, H_\omega(\bar x))$ to the origin, which takes advantage of many closed sites in the configuration $\omega$. (It is, however, not optimal, as it does not make use of consecutive 'piled up' closed sites in one step.) In addition, this $\lambda$-path is oriented in the sense that its projection onto $\mathbb{Z}^d$, i.e. the sequence $0 = \bar x_0, \dots, \bar x_m = \bar x$, is oriented. With this we can then define a map $T: \{0,1\}^{\mathbb{Z}^{d+1}} \to \{0,1\}^{\mathbb{Z}^d}$ as
\[
(T(\omega))(\bar x) := \begin{cases} \omega(\bar x, H_\omega(\bar x)), & \text{if } \bar x \in \mathbb{N}_0^d, \\ \omega(\bar x, 0), & \text{otherwise.} \end{cases}
\]

The purpose of $T$ is to map a configuration $\omega \in \{0,1\}^{\mathbb{Z}^{d+1}}$ to a configuration $\bar\omega \in \{0,1\}^{\mathbb{Z}^d}$, for which there exists an oriented path picking up almost as many closed sites as the oriented reversed admissible $\lambda$-path in $\omega$ with lowest $(d+1)$-st coordinate. In order to be more precise, we add an index to the probability measure used to indicate the space it is defined on, i.e., $P_{p,d}$ will denote the Bernoulli product-measure on $\mathbb{Z}^d$ with parameter $p$. Since the value of $H(\bar x)$ only depends on the state of the sites $\bar y \in \mathbb{N}_0^d$ with $\|\bar y\|_1 < \|\bar x\|_1$, $P_{p,d+1}\circ T^{-1} = P_{p,d}$. Thus, if $q > q_c(\rho,d)$, we have that

\[
1 = P_{p,d}(\rho\text{-percolation occurs}) \le P_{p,d+1}\Big(\text{there exists an admissible reversed } \lambda\text{-path } 0 = (\bar x_0,h_0),(\bar x_1,h_1),\dots \text{ s.t. } \limsup_{n\to\infty}\frac{1}{n}h_n \le 1-\rho\Big). \tag{1.35}
\]
Now choose $\rho > 1-\alpha$ and set $\delta := 1-\rho + (\alpha-(1-\rho))/2 \in (1-\rho,\alpha)$. Then (1.35) implies the existence of a (deterministic) $N \in \mathbb{N}$ such that for all $n \ge N$,

\[
P_{p,d+1}\Big(\text{there exists an admissible reversed } \lambda\text{-path } 0 = (\bar x_0,h_0),(\bar x_1,h_1),\dots,(\bar x_n,h_n) \text{ s.t. } h_n \le \delta n\Big) \ge \frac{1}{2}.
\]
Recalling our choice of $\eta = (1,\dots,1)$, note that if there exists an admissible reversed $\lambda$-path from the origin to some $(\bar x_n, h_n)$ with $h_n \le \delta n$, then there actually exists an admissible $\lambda$-path from $L^{\alpha,d}_\eta - \lfloor(\alpha-\delta)n\rfloor e_{d+1}$ to the origin. Thus, by translation invariance of $P_{p,d+1}$, we obtain that
\[
\forall n \ge N: \quad P_{p,d+1}\big(L^{\alpha,d}_\eta \rightarrowtail (0, \lfloor(\alpha-\delta)n\rfloor)\big) \ge \frac{1}{2}
\]

which, since $\alpha - \delta > 0$, implies

\[
P_{p,d+1}\big((\mathrm{LIP}^{\alpha,d}_\eta)^c\big) = \lim_{n\to\infty} P_{p,d+1}\big(L^{\alpha,d}_\eta \rightarrowtail (0, \lfloor(\alpha-\delta)n\rfloor)\big) \ge \frac{1}{2}.
\]

By Proposition 1.11 we deduce that $P_p(\mathrm{LIP}^{\alpha,d}_\eta) = 0$ and hence $q \ge q_L(\alpha,d,d)$. We have thus shown that for any $\rho > 1-\alpha$ one has $q_c(\rho,d) \ge q_L(\alpha,d,d)$. Since $q_c(\rho,d)$ is continuous in $\rho$ by Lemma 1.31, the claim follows from (1.31).

Again, using the notion of reversed λ-paths we provide a criterion for non-existence of a Lipschitz surface to be used in the proof of Proposition 1.21.

Lemma 1.32 (Criterion for non-existence of an open Lipschitz surface). For any α > 0, and d ∈ N define

\[
T := \inf\{m \in \mathbb{N}_0 \mid \exists \bar x \in \mathbb{N}_0^d : \|\bar x\|_1 = m \text{ and } (\bar x, \|\bar x\|_1) \text{ is closed}\}.
\]

If for p ∈ (0, 1) one has

\[
E_p[T] < \frac{1}{1-\alpha}, \tag{1.36}
\]
then $P$-a.s. there exists no open Lipschitz surface and $q = 1-p \ge q_L(\alpha,d,d)$.

Condition (1.36) has an intuitive interpretation: 1/(1−α) is the number of ‘downward-diagonal’ steps a λ-path can take before decreasing its distance to the plane with inclination α by one. Ep[T ] on the other hand is the expected number of such steps an admissible λ-path must take before encountering a closed site and thus being able to take an upwards step. (1.36) therefore means that this path will – on average – encounter a closed site strictly before decreasing its distance to the plane by one, thus increasing the distance in the long run and preventing the existence of an open Lipschitz surface above it.

Proof. As in the proof of Proposition 1.20, the idea is to construct admissible reversed $\lambda$-paths starting in 0 such that their endpoints (i.e., the starting points of the respective $\lambda$-paths) are arbitrarily far below $L^{\alpha,d}_\eta$. With a simple shifting argument we can then see that the Lipschitz surface would, with probability bounded away from 0, have to have arbitrarily large height in 0 and can therefore almost surely not exist.


We begin with the construction of the reversed $\lambda$-paths. To this end, set $X_0 := Y_0 := 0$. Let $(\bar z_i)_{i\in\mathbb{N}_0}$ be an ordering of $\mathbb{N}_0^d$ compatible with $\|\cdot\|_1$ in the sense that $\|\bar z_{i+1}\|_1 \ge \|\bar z_i\|_1$ for all $i \in \mathbb{N}_0$. Then define for any $n \in \mathbb{N}_0$,

\begin{align*}
\iota_{n+1} &:= \inf\{i \in \mathbb{N}_0 \mid (\bar z_i, \|\bar z_i\|_1) + Y_n \text{ is closed}\},\\
X_{n+1} &:= (\bar z_{\iota_{n+1}}, \|\bar z_{\iota_{n+1}}\|_1),\\
Y_{n+1} &:= Y_n + X_{n+1} - e_{d+1}.
\end{align*}

By construction, there always exists an admissible λ-path from any Yn to 0.

Note also that $(\iota_n)_{n\in\mathbb{N}}$ and $(X_n)_{n\in\mathbb{N}}$ are i.i.d. sequences, where $\iota_1$ is geometric on $\mathbb{N}_0$ with parameter $q$ and $\|\bar X_1\|_1 = X_1\cdot e_{d+1}$ is distributed as $T$.
We are now interested in the height of the starting points of these $\lambda$-paths relative to $L^{\alpha,d}_\eta$. This is given by
\begin{align*}
H(n) &:= \big\lfloor\alpha\|\bar Y_n\|_1\big\rfloor - Y_n\cdot e_{d+1} = \Big\lfloor\alpha\sum_{j=1}^{n}\|\bar X_j\|_1\Big\rfloor - \sum_{j=1}^{n}(X_j - e_{d+1})\cdot e_{d+1}\\
&= \Big\lfloor\alpha\sum_{j=1}^{n}\|\bar X_j\|_1\Big\rfloor - \sum_{j=1}^{n}\|\bar X_j\|_1 + n.
\end{align*}
The law of large numbers then yields
\[
\lim_{n\to\infty}\frac{1}{n}H(n) = (\alpha-1)E_p[T] + 1 \quad P_p\text{-a.s.}
\]
and the right-hand side is strictly negative by assumption. Thus with $\Delta := -\big((\alpha-1)E_p[T]+1\big)/2 > 0$ we have in particular the existence of a deterministic $N \in \mathbb{N}$ such that
\[
\forall n \ge N: \quad P_p\big(H(n) \le -\Delta n\big) \ge \frac{1}{2}.
\]
Now note that on the event $\{H(n) \le -\Delta n\}$ there exists an admissible $\lambda$-path starting in $L^{\alpha,d}_\eta - \Delta n e_{d+1}$ and reaching 0, since $Y_n$ is below the plane $L^{\alpha,d}_\eta - \Delta n e_{d+1}$. Hence, by translation invariance of $P_p$ we have that
\[
\forall n \ge N: \quad P_p\big(L^{\alpha,d}_\eta \rightarrowtail (0, \Delta n)\big) \ge \frac{1}{2}
\]
which implies

\[
P_p\big((\mathrm{LIP}^{\alpha,d}_\eta)^c\big) = \lim_{n\to\infty} P_p\big(L^{\alpha,d}_\eta \rightarrowtail (0, \Delta n)\big) \ge \frac{1}{2}.
\]
By Proposition 1.11, $P_p(\mathrm{LIP}^{\alpha,d}_\eta) = 0$ and $p \le p_L(\alpha,d,d)$, i.e., $q \ge q_L(\alpha,d,d)$.


Proof of Proposition 1.21. Recall the ordering $(\bar z_i)_{i\in\mathbb{N}_0}$ of $\mathbb{N}_0^d$ compatible with $\|\cdot\|_1$ from the proof of Lemma 1.32 and define the random variable

\[
\iota_1 := \inf\{i \in \mathbb{N}_0 \mid (\bar z_i, \|\bar z_i\|_1) \text{ is closed}\},
\]
which has a geometric distribution on $\mathbb{N}_0$ with parameter $q$. With $B(j) := \{\bar x \in \mathbb{N}_0^d \mid \|\bar x\|_1 \le j\}$ denoting the ball with radius $j \in \mathbb{N}_0$, define the function

\[
r(i) := \inf\{j \in \mathbb{N}_0 \mid |B(j)| - 1 \ge i\}
\]
that gives the radius of the smallest ball such that its cardinality (without the origin) is larger than or equal to a given $i \in \mathbb{N}_0$. Note that $r(\iota_1)$ is distributed as $T$, for $T$ defined in Lemma 1.32. Using

\[
|B(j)| = \binom{j+d}{d} \ge \frac{(j+1)^d}{d!}
\]
we obtain
\[
i \ge |B(r(i)-1)| \ge \frac{r(i)^d}{d!}
\]
and can thus upper bound the expectation

\[
E_p[T] = E_p[r(\iota_1)] \le \big(d!\,E_p[\iota_1]\big)^{1/d} \le \Big(d!\,\Big(\frac{1}{q}-1\Big)\Big)^{1/d},
\]
where we used Jensen's inequality in the first inequality. The right-hand side is strictly smaller than $1/(1-\alpha)$ if and only if

\[
q > \frac{d!\,(1-\alpha)^d}{1 + d!\,(1-\alpha)^d}.
\]

Lemma 1.32 then implies that for such values of $q$ no open Lipschitz surface can exist, i.e., $q \ge q_L(\alpha,d,d)$, and the claim follows.
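As a quick numerical illustration (our own code, not part of the proof), the threshold on $q$ and the Jensen bound on $E_p[T]$ can be evaluated as follows:

import math

def q_threshold(alpha, d):
    # q above this value forces E_p[T] < 1/(1 - alpha), so Lemma 1.32 applies
    c = math.factorial(d) * (1.0 - alpha) ** d
    return c / (1.0 + c)

def ET_upper_bound(q, d):
    # (d! * (1/q - 1))**(1/d), the bound derived via Jensen's inequality
    return (math.factorial(d) * (1.0 / q - 1.0)) ** (1.0 / d)

# e.g. for d = 1 and alpha = 1/2: q_threshold(0.5, 1) = 1/3, and for any
# q > 1/3 one checks ET_upper_bound(q, 1) < 2 = 1/(1 - alpha)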


Part II

Population Genetics


Chapter 2

A novel seed-bank model

Population genetics is the study of the genetic composition of populations, including the analysis of distributions of genotypes (and phenotypes) and their changes in frequency, and comprising also the evaluation of ancestral relations (through graphs) and their evolution in response to the evolutionary forces of natural selection, genetic drift, mutation and gene flow. The description of this area of the biological sciences alludes to a strong mathematical component. Indeed, Population Genetics is peculiar as it gives rise to mathematical structures sapid as abstract (mathematical) objects that at the same time find direct application in science.
This interplay of mathematics and biology can be traced back to (at least as far as) Mendel, 1866 [76], but came into prominence in the western hemisphere1 through the works of Fisher [29] and Wright [110] in the 1930s that laid the foundation of the model that today bears their name, the so-called Wright-Fisher model. Even in its simplest form it already comprises the range of mathematical phenomena that arise in this context, in particular an intriguing moment duality between the Wright-Fisher diffusion and the (block-counting process of the) Kingman coalescent, which reflects the interlaced relation between the evolution of a population forward in time and the structure of its genealogy backward in time. This classic model has been extended and enhanced to include other evolutionary forces mentioned above, such as mutation and selection, or even spatial structures (thinking of different colonies), always obtaining a picture analogous in spirit, of a duality between the forward evolution and a backward coalescent ([24]).
There is, however, a force not mentioned in our recollections so far: the notion of a seed-bank or, equivalently, a state of dormancy in populations.

1It seems that the works by Bernstein [4] were not readily available due to historical reasons. They represent another branch of Population Genetics which developed into a deterministic approach resulting in Volterra Quadratic Operators, see Chapter 3.


The most intuitive example for this is indeed the production of seeds as offspring by plants. As opposed to, say, mammals, where the offspring is 'of the same nature' as the parent immediately, plants can store their genetic information in seeds for a potentially longer time. An extreme example of this is a date tree from Judea nick-named Methuselah, as it sprouted in 2005 from a roughly 2000 year old seed that had been found during the excavation of a Herodian fortress in Masada, Israel in the 1960s. Genetic tests proved that, although this 'new' tree was closest to a present-day Egyptian and an Iraqi cultivar, the differences were prominent (for details see [97]). Thus the sprouting of this tree has potentially brought back genetic material that was presumed lost around the time of the last crusades, clearly indicating the relevance of this dormant form of a plant for the genetic diversity of a species.
If this example seems artificial though, similar effects are easily found at different time-scales: The Atacama desert is the driest place on earth2, yet from time to time, after an unusual year with sufficient precipitation, we see the phenomenon of the desierto florido – the blooming desert – where seeds that have been lying dormant in the desert will germinate and produce a new generation of (plants and then) seeds to lie in wait for the next unusual year. And on an even shorter scale, we observe that the seed-state allows species of plants such as Tropaeolum3 to survive the winter.
If we take a step back we observe a more general form of the notion of a seed in nature: According to Lennon and Jones [69] dormancy is defined as "any rest period or reversible interruption of the phenotypic development of an organism". In this same publication they conclude that up to 80% of the bacteria found in the soil are in a latent or dormant state. It is indeed used by a variety of organisms as an evolutionary strategy to overcome unfavorable environmental conditions such as drought or fire (see [108] for an overview or [17] for an example in extreme aridity) and leads to significantly increased genetic variability (see, e.g., [105], [70], [89], [106]).
Given its broad presence and observable influence on ecology and adaptive and genetic evolution, it is natural to try to incorporate and investigate seed-bank effects through probabilistic models, leading to a new extension of the Wright-Fisher mechanism. We will begin with a basic introductory section on the classic Wright-Fisher model. This Section 2.1 should serve as an overview of questions asked and corresponding results classic in this mathematical branch of population genetics. Section 2.2 will then give an overview of the extended Wright-Fisher models reformed to include the seed-bank phenomenon

2It is such that NASA uses it as a test-site for their life-detecting instruments sent to Mars, cf. [103], [84]. 3In German: Kapuzinerkresse.

and known results so far, before moving on to our own model and corresponding results and observations in Sections 2.3-2.6. A more detailed description of the content of these sections will be given at the end of Section 2.2 once the necessary terminology has been introduced.
We shall remark here that the results obtained in Sections 2.3-2.6 have previously been published in [9], with the exception of 2.5.3 and in part also 2.6.2, which are part of [6]. However, we have added details and more precise references in several instances as compared to the publications. In particular, we have corrected what is Proposition 2.56 in this work and laid out the proofs in Section 2.6.2 in detail.

Biological terms

We give here a short superficial glossary of explanations of some biological terms recurring in the remainder of the chapter. They are not crucial for the mathematical comprehension, but should help understand the picture behind the models and are certainly not for use in a less mathematical context.

Population – A population is a summation of all the organisms of the same group or species, which live in a particular geographical area, and have the capability of interbreeding.

DNA – Deoxyribonucleic acid (DNA) is a molecule that carries the genetic instructions used in the growth, development, functioning and reproduction of all known living organisms and many viruses.

Chromosome – A chromosome is a packaged and organized structure containing most of the DNA of a living organism.

Haploid – A cell is called haploid if it has a single set of chromosomes.

Diploid – A cell is called diploid if it has two sets of chromosomes. Most mammals are diploid organisms, where one chromosome comes from the mother and one comes from the father. The transmitting cells, i.e. the egg and the sperm, are haploid.

Gene – A gene is a region of DNA which is made up of nucleotides and is the molecular unit of heredity. Most biological traits are under the influence of many different genes as well as the gene-environment interactions. Some genetic traits are instantly visible, such as eye color, while some, such as blood type, are not.

Allele – An allele is one of a number of alternative forms of a gene. Sometimes, different alleles will lead to different phenotypes, such as different colored petals, for example. See Section 3.1.1 in Chapter 3.


Genotype – The genotype of an individual is the piece of DNA, which determines a specific characteristic (phenotype) of that individual. The genotype is one of three factors that determine the phenotype, the other two being inherited epigenetic factors, and non-inherited environmental factors. An example of how genotype determines a characteristic is petal color in a pea plant.

Phenotype – The phenotype of an individual is the composite of an organism’s observable characteristics or traits, such as its morphology, development, biochemical or physiological properties, phenology, be- havior, and products of behavior.

Notation 2.1 Before we move on to mathematical population genetics let us make a short remark on notation in this chapter. Due to the nature of the subject, most notation will be introduced locally when it is needed, as it is often not relevant for other parts of the chapter. Also, the difference between notation and definition is not strictly differentiated in mathematics in general but often a matter of personal perspective and is even less prominent in this part of the thesis. Nevertheless, every notation used will be introduced.

Remark 2.2. We should add that in several instances in this chapter contin- uous time Markov chains will have to be introduced. We will always define them by giving all strictly positive transition rates between distinct states, tacitly assuming the ones not specifically mentioned to be 0 and the diago- nal elements of the Q-matrix to be of negative value such that the generator is conservative. Since in the cases concerned the Markov chains can all be coupled to a copy of themselves with finite state-space, there is no explosion and using for example Theorem 2.8.1 in [87] we conclude that they have the strong Markov property.

2.1 A famous model by Fisher and Wright and Kingman’s dual

This section is an overview of well-known results related to a very famous model in population genetics - the Wright-Fisher model. The idea in presenting this 'simplest' version that has led to many enhancements is to give readers new to this topic an impression of the concepts and questions asked as well as the phenomena typically observed, in order to expose the 'standard backbone' and the striking differences of our own results. It is by no means an in-depth treatment of the topic, but merely a selection relevant for this

work and will therefore not contain any proofs. Details can be found in [24] and [3].

The Wright-Fisher model

We begin by outlining the assumptions on our population.

I The population has a constant finite size $N \in \mathbb{N}$.

II We consider the population to be haploid and assume that each individual carries a genetic type from some type space $E$.

III Reproduction takes place in non-overlapping4 discrete generations.

IV Each individual has one parent in the previous generation and inherits its genetic type.

V There is neither mutation nor any selection mechanism or the like.

In order to obtain a functioning model, we need to specify the reproduction mechanism. We will call the parent generation 'generation 0' and the generation of children likewise 'generation 1'. Each generation has $N$ individuals, labeled with numbers $1,\dots,N$ (in addition to the label of their generation). Since we want to keep the population size constant, a backwards point of view proves handy: In order to determine the 'parent-child' relationships between the individuals in generation 0 and generation 1, we assume that each of the individuals present at generation 1 chooses their parent uniformly among all possible individuals of generation 0, independently of the choices of the other individuals of his generation 1, as depicted in Figure 2.1. This procedure is then repeated independently for each generation. To formalize this we introduce

\[
F_r := (F_r^1, \dots, F_r^N), \quad r \in \mathbb{Z},
\]

where $F_r^1 = n$ means that the individual of generation $r$ with the label 1 has chosen as his parent the individual (in the generation $r-1$) with the label $n \in \llbracket N\rrbracket := \{1,\dots,N\}$. For these random variables we know

\[
(F_r^n)_{r\in\mathbb{Z},\,n\in\llbracket N\rrbracket} \text{ are i.i.d. and } F_1^1 \sim U_{\llbracket N\rrbracket}, \tag{2.1}
\]
where $U_{\llbracket N\rrbracket}$ is the uniform distribution on the set $\llbracket N\rrbracket$. The reader might have detected the use of $\mathbb{Z}$ for time. Surely, this is not remarkable from

4Each individual belongs to exactly one generation.

a mathematical point of view, but it does also have a biological meaning: both directions in time are areas of study for biologists. Going forward in time one might wonder about the future evolution given a certain type and ask questions about the survival of all species or the emergence of new types. A look back in time is just as interesting, inquiring about the genealogy and the existence of a common ancestor of a sample of individuals. Note that the definition of the Wright-Fisher mechanism allows for a very natural consideration of forward and backward questions. We will indeed consider both directions in this section for the Wright-Fisher model, as well as making analogous considerations for the seed-bank model in the following sections.
We thus make a short remark about the usage of 'time' and its 'direction', as this sometimes leads to confusion in the mathematical models. In the few critical instances, we will speak of real-life time to express that time is running in the same direction as our own lives, i.e. from the Big-Bang, passing through the present to the Hoverboard in the future.5 This is the direction we refer to whenever we speak of backward or forward in time. The mathematical models on the other hand will always let time go to $+\infty$ (or $t \mapsto t+1$), which basically corresponds to turning 180° on the natural time-axis when considering backwards models. Once the reader is aware of this fact, no difficulties in comprehension should arise.

Figure 2.1: Schematic representation of the Wright-Fisher reproduction mechanism. The N individuals in the offspring generation (generation 1) choose their parents independently, uniformly among the N individuals of the parent generation (generation 0).

Observe that (for any $r \in \mathbb{Z}$) the coordinates of $F_r$ are exchangeable random variables, since for any permutation $\sigma$ of $\llbracket N\rrbracket$ we have

\[
(F_r^1, \dots, F_r^N) \overset{d}{=} (F_r^{\sigma(1)}, \dots, F_r^{\sigma(N)}),
\]
which means that the actual position or label in $\llbracket N\rrbracket$ of the child (and as a matter of fact also of the parent) is not relevant due to the symmetries inherent in the model, as there is no form of selection6 added. We say that the model is neutral.

5These terms are strictly non-scientific and simply for orientation. Thus we believe it is justified to use the 'Big Bang' as $-\infty$ and the 'Hoverboard' as $+\infty$ although their real distance to 0 (the present) might be finite.
6As a biological term: Reproductive advantage of one individual or type over another.


This fact allows for a different formulation of the reproduction mechanism, particularly appealing to those who find the notion of a 'child choosing its parent' unnatural: We introduce the offspring distribution $W_r^n$ of each individual in the Wright-Fisher model, defined as
\[
W_r^n := \big|\{i \in \llbracket N\rrbracket \mid F_{r+1}^i = n\}\big| \quad \text{for every } n \in \llbracket N\rrbracket,\ r \in \mathbb{Z}.
\]
Note that (for any $r \in \mathbb{Z}$) $W_r^1, \dots, W_r^N$ are identically distributed, but not independent, as they sum up to $N$: $\sum_{n=1}^{N} W_r^n = N$. Setting
\[
W_r := (W_r^1, \dots, W_r^N), \quad r \in \mathbb{Z},
\]
the $(W_r)_{r\in\mathbb{Z}}$ are i.i.d. with the symmetric multinomial distribution
\[
P\big(W_r^1 = m_1, \dots, W_r^N = m_N\big) = \binom{N}{m_1, \dots, m_N}\Big(\frac{1}{N}\Big)^N = \frac{N!}{m_1!\cdots m_N!}\Big(\frac{1}{N}\Big)^N
\]
for any choice of $m_1, \dots, m_N \in \llbracket N\rrbracket_0$ such that $\sum_{n=1}^{N} m_n = N$. In the formulation of this forward view we say that $N$ individuals produced $N$ offspring by multinomial sampling7.
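The equivalence of the two formulations is immediate in a simulation. The following minimal Python sketch is our own illustration (assuming NumPy) and draws one generation both ways:

import numpy as np

rng = np.random.default_rng(0)
N = 8

# backward view: each of the N children picks a parent label uniformly in [N]
F = rng.integers(1, N + 1, size=N)

# forward view: the offspring numbers W^1, ..., W^N of the N parents
W = np.bincount(F, minlength=N + 1)[1:]
assert W.sum() == N  # the population size is conserved
# W has the symmetric Multinomial(N, (1/N, ..., 1/N)) distribution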

The Wright-Fisher frequency process for two alleles

We now want to add the genetic types of the individuals to the consideration and explore their evolution under the mechanism introduced above. We restrict the observations to the classic bi-allelic case. Therefore every individual has one of two types of the type space $E = \{a, A\}$ (which it has inherited from its parent).

Definition 2.3 (The Wright-Fisher frequency process). In the set-up described above we define the genetic-type configuration process $(\xi_r)_{r\in\mathbb{N}_0} = (\xi_r^1, \dots, \xi_r^N)_{r\in\mathbb{N}_0}$ of the bi-allelic Wright-Fisher model given by some initial distribution $\mathcal{L}(\xi_0)$ on $\mathcal{P}(E)$8 and transitions
\[
\xi_{r+1}^i(\omega) = \xi_r^n(\omega) \in E \quad \text{if } \omega \in \{F_{r+1}^i = n\},
\]
which models the assumption that the unique genetic type of an individual is inherited from its unique parent. This allows us to introduce the frequency process $(X_r^N)_{r\in\mathbb{N}_0}$ of the Wright-Fisher model as
\[
X_r^N := \frac{1}{N}\sum_{n=1}^{N}\mathbb{1}_{\{\xi_r^n = a\}},
\]

7In general, the number of parents and the number of offspring need not coincide, see Definition 2.11.
8$\mathcal{P}(E)$ denotes the power-set of $E$.


tracing the fraction of alleles of type a ∈ E at time r ∈ N0 in a population of size N.

An important question in biology is to understand the fluctuations of this frequency process in $[0,1]$. These fluctuations and changes in the frequency of an allele caused merely by the randomness in the reproductive mechanism were investigated by Wright [110] and termed random genetic drift. It is considered one of the evolutionary forces (next to, for example, selection or mutations), but its actual influence on genetic variability was immediately criticized by Fisher [29] and is to this day highly disputed among the biological community. For us mathematicians though, it generates an interesting object, as we will see in the next paragraph.
Observe that $(X_r^N)_{r\in\mathbb{N}_0}$ is a bounded martingale and a time-homogeneous Markov chain, as which it has two absorbing states 0 and 1; thus it will converge almost surely to one of these states. If the frequency process reaches 1, we say that the allele $a$ has fixated (or allele $A$ has become extinct) and vice-versa if it reaches 0. We thus know that $a$ either fixates or becomes extinct almost surely. Therefore the model predicts that even if both types are equivalent for reproduction, one will overpower the other completely, i.e. one will become extinct, while the other fixates almost surely. The probability of success of the allele $a$ in fixation is given by its initial fraction. For this simple model it is easy to see that fixation not only occurs in finite time almost surely, but that even the expected time until fixation is finite.
Now, one is typically interested in very large populations, but if we just let $N \to \infty$ we would obtain a deterministic model. Hence, the question arises at what time-scale the randomness would be retained. A hint in this direction is the following observation for the frequency process: Since, given $X_r^N$, the random variable $NX_{r+1}^N$ is binomial with parameters $N$ and $X_r^N$, calculating the conditional variance we obtain
\[
\mathbb{V}\big(X_{r+1}^N \mid X_r^N\big) = \frac{X_r^N(1-X_r^N)}{N} \quad P\text{-a.s.}
\]
and can suspect that non-trivial fluctuations should stay visible on a time-scale of order $N$. This time-scale is identified as the evolutionary time-scale relevant in numerous instances, as we will see not only in the next paragraph, but later in this section as well as in Sections 2.3.1 and 2.4.
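For intuition, the frequency process is just an iterated binomial sampling in a simulation; the following sketch is our own illustration (assuming NumPy), not part of the text:

import numpy as np

rng = np.random.default_rng(1)
N, x = 100, 0.3
traj = [x]
while 0.0 < x < 1.0:
    # given X_r = x, the next frequency is Binomial(N, x)/N
    x = rng.binomial(N, x) / N
    traj.append(x)
# the loop ends at fixation (x = 1) or extinction (x = 0),
# which happens in finite time almost surely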

The Wright-Fisher diffusion

Indeed, if we rescale time (as we are doing with space) in the frequency process by the evolutionary scaling, we obtain an interesting limiting object.


Theorem 2.4. Assuming $X_0^N \to x \in [0,1]$ a.s. for $N \to \infty$, we have that

\[
\big(X^N_{\lfloor Nt\rfloor}\big)_{t\ge 0} \Rightarrow (X_t)_{t\ge 0}
\]
on $D_{[0,\infty)}([0,1])$ as $N \to \infty$, where $(X_t)_{t\ge 0}$ is a 1-dimensional diffusion solving
\[
dX_t = \sqrt{X_t(1-X_t)}\,dB_t, \qquad X_0 = x, \tag{2.2}
\]
for $(B_t)_{t\ge 0}$ standard Brownian motion.

Definition 2.5 (The Wright-Fisher diffusion). We call the diffusion process

$(X_t)_{t\ge 0}$ from Theorem 2.4 the Wright-Fisher diffusion.

Note that existence and uniqueness of the solution of (2.2) do not follow from classic results, as these tend to assume Lipschitz conditions for the coefficients, violated in our case by the square root. Instead one needs to refer to [55], Theorem 3.2 and [111], Theorem 1. We will return to this diffusion, but first move on to analyze the Wright-Fisher model backwards in time.
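As an aside, (2.2) is straightforward to discretize; the following Euler-Maruyama sketch is our own illustration (the naive scheme can overshoot the boundary, hence the clipping; it is not a statement about the exact diffusion):

import numpy as np

rng = np.random.default_rng(2)
dt, steps, x = 1e-3, 5000, 0.3
path = [x]
for _ in range(steps):
    # dX = sqrt(X(1 - X)) dB, discretized with Gaussian increments
    x += np.sqrt(max(x * (1.0 - x), 0.0)) * rng.normal(0.0, np.sqrt(dt))
    x = min(max(x, 0.0), 1.0)  # keep the scheme inside [0, 1]
    path.append(x)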

The Kingman coalescent

The intention is now to trace the genealogies of a sample of a fixed number $k \ll N$ of individuals backwards in (real-life) time. The idea is to use partitions of $\{1,\dots,k\}$ to explain which individuals have a common ancestor in each generation. To this end, denote by $\mathcal{P}_k$ the set of partitions of $\{1,\dots,k\}$ and, for later use, by $\mathcal{P}$ the set of partitions of $\mathbb{N}$.
Recall the Wright-Fisher mechanism given in (2.1). In order to describe the ancestral process $(A^{(k,N)}_{-r})_{r\in\mathbb{N}_0}$ of $k$ individuals (without loss of generality labelled $\{1,\dots,k\}$ for simplicity), we begin with the partition of singletons $A^{(k,N)}_0 := \{\{1\},\dots,\{k\}\}$ and proceed as follows: whenever individuals choose the same parent in the preceding generation, their sets are merged, see Figure 2.2 for clarification. We observe from (2.1) that the probability of two given (fixed) individuals choosing the same parent is given by $1/N$; likewise the probability that three given individuals choose the same parent is $1/N^2$, etc., i.e.
\[
P(F_r^1 = F_r^2) = \frac{1}{N}, \qquad P(F_r^1 = F_r^2 = F_r^3) = \frac{1}{N^2}, \dots
\]
and this determines the transition probabilities of the ancestral process from one partition to another. Notice that any event involving more than one



Figure 2.2: Schema of the iteration of the Wright-Fisher reproduction mechanism. The genealogy of 5 individuals chosen in generation r = 0 is enhanced through hatching. The black arrow on the left indicates the direction of real-life time, whereas the dashed arrow indicates the mathematical time-index r ∈ N0 of the process. The blue double-hatched individual is called the most recent common ancestor of the 5 original individuals.

merger in this model has a probability of order (at least) $N^{-2}$. Indeed, if we parallel Theorem 2.4 and run this process at the evolutionary time-scale for $N \to \infty$, i.e. if we consider $(A^{(k,N)}_{-\lceil tN\rceil})_{t\ge 0}$, all events but those involving exactly one merger of sets become irrelevant and we obtain in the (weak) limit an object describing the genealogy. Before we define it, we need some additional notation: For two partitions $\pi, \pi' \in \mathcal{P}_k$, we write $\pi \succ \pi'$ if $\pi'$ can be constructed by merging exactly 2 blocks of $\pi$. For example

{{1, 3}, {2}, {4, 5}} ≻ {{1, 3, 4, 5}, {2}}.

Definition 2.6 (The Kingman k-coalescent). For $k \ge 2$ we define the Kingman k-coalescent $(\tilde\Pi_t^{(k)})_{t\ge 0}$ to be the continuous time Markov chain with


values in $\mathcal{P}_k$, characterized by the transition rates
\[
\pi \mapsto \pi' \quad \text{at rate } 1 \text{ if } \pi \succ \pi',
\]
and all other rates equal to 0 such that the generator is conservative. Figure 2.3 shows a typical realization of the Kingman k-coalescent.

Figure 2.3: A graphical representation of a typical realization of the Kingman k-coalescent representing the lineages of k = 35 individuals. The dashed arrow indicates the direction of the time-index t of the process. Since the rate of a coalescence event in a population of n individuals is equal to the number of pairs, i.e. to $\binom{n}{2}$, we see many coalescences in rapid succession in the beginning and then observe a deceleration as time goes on. The black arrow on the left indicates real-life time.
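The waiting-time structure visible in Figure 2.3 is easy to simulate: while $n$ blocks remain, the next merger arrives after an exponential time with rate $\binom{n}{2}$. The sketch below is our own illustration (assuming NumPy):

import numpy as np

rng = np.random.default_rng(3)
k = 35
t, merger_times = 0.0, []
for n in range(k, 1, -1):
    # with n blocks present the total coalescence rate is n(n-1)/2
    t += rng.exponential(2.0 / (n * (n - 1)))
    merger_times.append(t)
# merger_times[-1] is one sample of the time to the most recent common ancestor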

Due to the Markov-property the laws of the k-coalescents defined above are a consistent family: If we restrict the Kingman (k + 1)-coalescent to


$\mathcal{P}_k$, which essentially amounts to 'ignoring' the $(k+1)$-st particle, we obtain the distribution of the Kingman k-coalescent, since the behavior of the $(k+1)$-st particle is independent of that of the first $k$. Thus we can consider its projective limit using Kolmogorov's extension theorem. The details of this argument can be found in [65].

Definition 2.7 (The Kingman coalescent). We define the Kingman coalescent $(\tilde\Pi_t)_{t\ge 0} = (\tilde\Pi_t^{(\infty)})_{t\ge 0}$ as the unique Markov process distributed according to the projective limit of the laws of the Kingman k-coalescents as $k$ goes to infinity.

This is one of the instances where 'mathematical time' runs in the opposite direction of 'real-life' time, as the reader might have detected.
The Kingman coalescent is one of the most important objects in population genetics and the lion's share of its relevance is due to its universality as a scaling limit of different reproductive models. The simplest of this kind might be the Moran model, a continuous-time analogue of the Wright-Fisher model, but the Kingman coalescent also appears in the limit for more general models, e.g. so-called Cannings models, and even when we add a 'seed-bank' as will be discussed in the following Section 2.2. But before we get to this, we discuss some relevant properties of the Kingman coalescent that will be paralleled for our seed-bank model in the following sections.

The time to the most recent common ancestor

When contemplating the genealogy of a sample of individuals it is natural to ask whether, if one goes far enough back in time, it is possible to find one single individual that is the ancestor of all those individuals in the present-day sample, and if so, how long we have to 'wait' to find it. This is called the time to the most recent common ancestor of a sample of $k$ individuals and is defined using the Kingman k-coalescent as
\[
T_{\mathrm{MRCA}}[k] := \inf\big\{t > 0 \mid \tilde\Pi_t^{(k)} = \{\{1,\dots,k\}\}\big\}. \tag{2.3}
\]
Since the Kingman k-coalescent behaves much like a (pure) death process (see next paragraph), a simple calculation yields the following significant result.

˜ (k) TMRCA[k] ∶= inf{t > 0 ∣ Πt = {1, . . . , k}}. (2.3) Since the Kingman k-coalescent behaves much like a (pure) death process (see next paragraph), a simple calculation yields the following significant result. Theorem 2.8. 1 [T [k]] = 2 (1 − ) , (2.4) E MRCA k

and hence in particular

∀k ∈ N ∶ E[TMRCA[k]] ≤ 2.
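The 'simple calculation' behind (2.4) is a telescoping sum over the exponential waiting times of the death-chain description given below (see the block-counting process in the next subsection):
\[
E\big[T_{\mathrm{MRCA}}[k]\big] = \sum_{n=2}^{k}\binom{n}{2}^{-1} = \sum_{n=2}^{k}\Big(\frac{2}{n-1} - \frac{2}{n}\Big) = 2\Big(1 - \frac{1}{k}\Big).
\]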

This is remarkable as it means that for every finite set of individuals observed 'nowadays', you only have to go back a (random, but finite even in expectation) time to conclude that they all derive from the same individual.
As visualized in Figure 2.3, most coalescences occur very quickly (since their rate is of the order of the square of the number of individuals), but the last coalescences slow down, which can be seen also in equation (2.4), as the expected time to the most recent common ancestor is largely given by the time of coalescence of the final two lineages.

Coming down from infinity

Extending the question posed (and answered) above, one wonders if similar observations hold true if one starts not with a finite, but an infinite number of individuals. This refines to the concept of coming down from infinity discussed simultaneously by Schweinsberg [100] and Pitman [90] in the more general context of exchangeable coalescent processes9.

To concretize we define the block-counting process (Nt)t≥0 of the Kingman coalescent as precisely what its name says:

\[
N_t := |\tilde\Pi_t|,
\]
where for $\pi \in \mathcal{P}$, $|\pi|$ denotes the cardinality of the partition, i.e. the number of blocks. This is a continuous time Markov chain, which is indeed a pure death process, with the specific transition rate $\binom{n}{2}$ for $n \mapsto n-1$, $n \ge 2$, since $\binom{n}{2}$ is precisely the number of possible mergers of two blocks given $n$ blocks are present in a partition. Note that in the literature the block-counting process of the Kingman coalescent is often defined as the death process on $\mathbb{N}_0$ with these rates before introducing the actual Kingman coalescent, and its extension to include $\infty$.
The Kingman coalescent being such an exchangeable coalescent, Schweinsberg and Pitman's definition translates to the following: The Kingman coalescent (started in a partition of $\mathbb{N}$ with infinitely many blocks) comes down from infinity, if its block-counting process is finite almost surely for each $t > 0$. Otherwise it is said to stay infinite. The astounding fact is contained in the following theorem.

9For an overview on exchangeable coalescent processes see for example [3].


Theorem 2.9. The Kingman coalescent comes down from infinity, i.e. if we start the Kingman coalescent in a partition of N with infinitely many blocks and denote by (Nt)t≥0 its block-counting process then

∀t > 0 ∶ P(Nt < ∞) = 1.
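Quantitatively, this is again the telescoping series from (2.4): the expected time to go from any number of blocks down to $m$ blocks is at most $2/m$, uniformly in the starting value. A small simulation sketch (our own illustration, assuming NumPy) makes this visible:

import numpy as np

rng = np.random.default_rng(4)
n, t = 10**5, 0.0
while n > 10:
    # with n blocks, wait an Exp(n(n-1)/2) time until the next merger
    t += rng.exponential(2.0 / (n * (n - 1)))
    n -= 1
print(t)  # concentrates near 2*(1/10 - 1/10**5), roughly 0.2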

Duality of the frequency and the block-counting process

We observed earlier that going backwards in time, for any finite sample we need only wait a finite time to find their common ancestor. Now, if we were to add types to this deliberation, all individuals would, of course, have the same type as this one common ancestor. This is strongly redolent of the subject of fixation and extinction when going forward in time. Indeed, there is a strong interplay of the forward and backward in time models in the form of a mathematical duality: the block-counting process of the Kingman coalescent is a moment dual of the frequency process of the Wright-Fisher model. To substantiate this claim let us denote by $P^n$ the distribution under which the block-counting process $(N_t)_{t\ge 0}$ of the Kingman coalescent is started in $n \in \mathbb{N}$ $P^n$-a.s., and similarly use $P_x$ for the distribution such that $P_x(X_0 = x) = 1$, for $x \in [0,1]$.

Theorem 2.10. For all $n \in \mathbb{N}$ and all $x \in [0,1]$:
\[
\forall t \ge 0: \quad E^n\big[x^{N_t}\big] = E_x\big[X_t^n\big].
\]
This type of duality is often very useful because a certain property of one of the processes might be easier to calculate for the other process. A general overview of types and advantages of dualities of Markov processes is given in [57].
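A Monte Carlo sanity check of this duality (our own sketch, reusing the two simulations above; agreement holds only up to sampling and discretization error):

import numpy as np

rng = np.random.default_rng(5)
n, x, t = 5, 0.3, 1.0

def sample_N_t():
    # block-counting process of the Kingman coalescent started from n blocks
    s, m = 0.0, n
    while m > 1:
        tau = rng.exponential(2.0 / (m * (m - 1)))
        if s + tau > t:
            break
        s, m = s + tau, m - 1
    return m

def sample_X_t(dt=1e-3):
    # Euler-Maruyama for the Wright-Fisher diffusion started from x
    y = x
    for _ in range(int(t / dt)):
        y += np.sqrt(max(y * (1.0 - y), 0.0)) * rng.normal(0.0, np.sqrt(dt))
        y = min(max(y, 0.0), 1.0)
    return y

lhs = np.mean([x ** sample_N_t() for _ in range(10000)])  # E^n[x^{N_t}]
rhs = np.mean([sample_X_t() ** n for _ in range(1000)])   # E_x[X_t^n]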

2.2 Modelling a seed-bank

As explained in the introduction to this chapter, seed-banks have a pivotal role in the genetic evolution of a population, counteracting other evolutionary forces such as random genetic drift or selection while buffering environmental changes. Their impact as an evolutionary force becomes particularly obvious when considering the classic concept of fixation respectively extinction of alleles, which necessarily becomes more intricate in the presence of a seed-bank, because an allele may have seemingly become extinct, disappearing from the active population only to emerge after a potentially very long time. Since probabilistic models, in particular the Kingman coalescent and its relatives, have proven to be very useful tools to understand basic principles of

population genetics and the interaction of evolutionary forces [109], it is self-evident to try to incorporate seed-bank effects in such a probabilistic modeling framework. However, the classic Wright-Fisher model cannot allow for a seed-bank, since it assumes that each individual chooses his (unique) parent from the previous generation. If we assume the existence of a seed-bank though, this is not the case anymore – some seeds sprout immediately (say, in the next season), but we might have others that lay for several generations before germinating. Thus the 'parent' might have lived several generations ago and 'siblings' do not necessarily belong to the same generation anymore.
This phenomenon was incorporated in an extension of the classic Wright-Fisher model by Kaj, Krone and Lascoux [58], thus pioneering the area of seed-bank models. In their model, each generation consists of a fixed amount of N individuals as in the Wright-Fisher population. However, now each individual chooses its parent a random number of generations in the past, where the number of generations separating the parent and offspring is understood as the time that the offspring has spent as a seed or dormant form. To specify this mechanism let $\mu$ denote a measure on $\mathbb{N}$ that we will call the seed-bank age distribution. Then every individual chooses first a generation in the past according to $\mu$ and then an individual of that generation uniformly at random to be its parent, as illustrated in Figure 2.4 (a minimal simulation sketch of one ancestral line follows below). Of course, the choice of generation and parent, as well as the choices of different individuals, are assumed independent. Note that the case $\mu = \delta_1$ is just the classical Wright-Fisher model.
Kaj, Krone and Lascoux then prove that if the seed-bank age distribution $\mu$ is restricted to finitely many generations $\{1, 2, \dots, m\}$, where $m$ does not depend on $N$, then the ancestral process induced by the seed-bank model converges, after the usual (evolutionary) scaling of time by a factor $N$, to a time changed (delayed) Kingman coalescent, where the coalescent rates are multiplied by $\beta^2 := 1/\hat\mu^2$. Here $\hat\mu$ is the expectation of the seed-bank age distribution $\mu$. Thus this kind of a seed-bank decelerates the coalescence process, but still falls in the universality class of the Kingman coalescent. Since the overall coalescent tree structure is retained, this leaves the relative allele frequencies within a sample unchanged [108]. In this scenario we thus speak of a weak seed-bank effect. Observe that the seed-bank should indeed be 'weak' by intuition: we already know that the evolutionary time scale is of order $N$, and since we consider $N \to \infty$ it should not be of great significance whether we jump by 1 or some other finite number of generations.
This result is generalized in [8], where it is shown that a sufficient condition for convergence to the Kingman coalescent (with the same scaling and delay) in this setup is that the finiteness-condition holds (only) in expectation, i.e. $\hat\mu < \infty$, if the seed-bank age does not depend on $N$.
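One ancestral line in this model is a simple jump chain into the past. The sketch below is our own illustration (the choice of $\mu$ and $N$ is purely for demonstration):

import numpy as np

rng = np.random.default_rng(6)
N = 100
support, weights = np.array([1, 2, 3]), np.array([0.7, 0.2, 0.1])  # a toy mu

def ancestor(generation):
    # jump a mu-distributed number of generations into the past, then
    # pick the parent label uniformly among the N individuals there
    jump = rng.choice(support, p=weights)
    label = rng.integers(1, N + 1)
    return generation - jump, label

# here mu_hat = 0.7 + 2*0.2 + 3*0.1 = 1.4, so the delayed Kingman limit
# would rescale the coalescence rates by beta^2 = 1/mu_hat^2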


Figure 2.4: Illustration of the seed-bank model introduced by Kaj, Krone and Lascoux in [58]. In each generation (blue boxes), each individual first chooses the generation of its parent according to the distribution µ (black arrow) and then chooses its parent uniformly among the individuals of the chosen generation (dotted arrow).

A different extension of the model by Kaj, Krone and Lascoux is considered in [108], where the authors combine the seed-bank model of [58] with fluctuations in population size. They point out that substantial germ banks with small germination rates may buffer or enhance the effect of the demography. This indicates that the seed-bank affects the interplay of evolutionary forces and may have important consequences.
This strong seed-bank effect is also investigated in [8], proving that suitable $\mu$ can lead to a behavior radically different from the Kingman coalescent. In particular, if the seed-bank age distribution is 'heavy-tailed', say, $\mu(\{k, k+1, k+2, \dots\}) = L(k)k^{-\alpha}$, for $k \in \mathbb{N}$, where $L$ is a slowly varying function, then if $\alpha < 1$ the expected time to the most recent common ancestor is infinite, and if $\alpha < 1/2$ two randomly sampled individuals do not have a common ancestor at all with positive probability. Hence this will not only delay, but actually completely alter the effect of random genetic drift.
If such extreme behavior of the seeds seems artificial, one can instead turn to the case $\mu = \mu_N$ and scale the seed-bank age distribution $\mu$ with $N$ in order to understand its interplay with other evolutionary forces on similar scales. For example, in [7] the authors study a seed-bank model with $\mu = \mu_N = (1-\varepsilon)\delta_1 + \varepsilon\delta_{N^\beta}$, $\beta > 0$, $\varepsilon \in (0,1)$. This models a scenario where almost all seeds sprout during the next season, but very few stay dormant for a very long time. For $\beta < 1/3$ the ancestral process converges to the Kingman coalescent only after rescaling the time by the non-classical factor $N^{1+2\beta}$, so that the expected time to the most recent common ancestor is highly elevated. This was used in [44] to discuss the effect of seed-banks in

evolutionary bacteria.

While there are substantial mathematical results in the weak seed-bank regime, it seemed that the 'stronger' seed-bank models had not been identified. This is particularly apparent in the absence of a new limiting coalescent structure, in contrast to many other population genetic models, where the interplay of suitably scaled evolutionary forces (such as mutation, genetic drift, selection and migration) often leads to elegant limiting objects, like the ancestral selection graph [86], or the structured coalescent [52, 88]. The intrinsic problem of the model is the loss of the Markov property in Wright-Fisher models with long genealogical 'jumps', which also impedes the formulation of a forward model.

In our work we thus propose a new Markovian Wright-Fisher type seed-bank model that allows for a clear forward and backward process and corresponding scaling limit interpretation, which will be presented in the following order:

We begin, of course, with the introduction of our Wright-Fisher model with geometric seed-bank in Section 2.3, consider the frequency process (in the bi-allelic case) and prove its convergence to the solution of a two-dimensional system of SDEs. We derive its dual (block-counting) process and employ this duality to compute the fixation probabilities of the system as t → ∞. In Section 2.4, we then define the new seed-bank coalescent corresponding to the previously derived dual block-counting process. We argue how it describes the ancestry of the Wright-Fisher model with geometric seed-bank and conclude the section with a comparison to other similar coalescents, like the structured coalescent, the coalescent with freeze and, most notably, the peripatric coalescent, discovered independently by Lambert and Ma in [68], which, although arising from a very different pre-scaling model, turns out to be the same as the seed-bank coalescent, substantiating our claim of universality for this new coalescent structure. Its properties are then elaborated in Section 2.5, where we first prove that the seed-bank coalescent does not come down from infinity. Then we show that the expected time to the most recent common ancestor of a sample of k individuals is of asymptotic order log log k as k gets large, which interestingly agrees with the scale identified for the Bolthausen-Sznitman coalescent by Goldschmidt and Martin [43]. The section concludes with recursions for some quantities of interest of our model, needed by biologists to run simulations, and a brief discussion of the extension of the seed-bank model by mutation that we studied in [6]. The technical details of some calculations in Sections 2.3 and 2.5 are outsourced to Section 2.6 for better readability.


2.3 The Wright-Fisher model with geometric seed-bank

In order to introduce a discrete-time forward model similar to the Wright-Fisher model from Section 2.1, we begin with the assumptions we make on the population under consideration.

I. The individuals of the population can be in one of two states: they can be active or dormant. We assume that at any point in time we have N ∈ ℕ active and M ∈ ℕ dormant individuals, so that the total population size N + M is fixed. Motivated by the application, we will frequently refer to the active individuals as plants and the dormant individuals as seeds; the set of dormant individuals will be called the seed-bank.

II. We consider the population to be haploid and assume that each individual carries a genetic type from some type space E.

III. Reproduction takes place in non-overlapping discrete generations (for the moment indexed by ℕ₀, later by ℤ) according to the mechanism described in Definition 2.11.

IV. Each individual has one parent and inherits its genetic type.

V. There is no mutation nor any selection mechanism or the like.

In order to define our reproduction mechanism, we need to explain how active, respectively dormant, individuals give birth to active, respectively dormant, individuals. As in the Wright-Fisher model, this can be done 'backwards' by pretending the children choose their parents, in order to preserve the fixed population size. However, the notation in this case would become much more complicated, and since we have learned how to translate this into a forward picture in Section 2.1, we will only use the latter here.

Definition 2.11. Given N, M ∈ ℕ, let ε ∈ [0, 1] such that εN ≤ M and set δ := εN/M, and assume for convenience that εN = δM is a natural number (otherwise replace it by ⌊εN⌋ everywhere). Recall that ⟦N⟧ := {1, . . . , N} and ⟦N⟧₀ := ⟦N⟧ ∪ {0}. The dynamics of our Wright-Fisher model with strong seed-bank component are as follows:

1. The N active individuals (plants) from generation 0 produce (1 − ε)N active individuals in generation 1 by multinomial sampling with equal weights (see Section 2.1).


2. Additionally, δM = εN seeds, sampled uniformly (without replacement) from the seed-bank of size M in generation 0, 'germinate', that is, each turns into exactly one active individual in generation 1 and leaves the seed-bank.

⇒ The active individuals from generation 0 are thus replaced by these (1 − ε)N + δM = N new active individuals, forming the population of plants in generation 1.

3. In addition, the N active individuals from generation 0 produce δM = εN seeds by multinomial sampling with equal weights, filling the vacant slots of the seeds that were activated.

4. The remaining (1 − δ)M seeds from generation 0 remain inactive and stay in the seed-bank (or, equivalently, produce exactly one offspring each, replacing the parent).

⇒ The M dormant individuals of generation 1 are thus made up of the δM seeds produced by the plants of generation 0 and the (1 − δ)M seeds that stayed in the seed-bank.

5. Throughout reproduction, all offspring inherit the genetic type of the parent.

Thus, in generation 1, we have again N plants and M seeds. Figure 2.5 illustrates steps 1-4, Figure 2.6 depicts the inheritance of types from step 5. This probabilistic mechanism is then repeated independently to produce all following generations k ≥ 1.

Remark 2.12.

1. The time that a given seed stays in the seed-bank before becoming active is geometric with success parameter δ. This means we can place our model in the framework considered for previous seed-bank models in Section 2.2 by taking a seed-bank age distribution µ with a geometric component: µ = (1 − ε)δ₁ + ε ∑_{k∈ℕ} Geo_δ(k) δ_{k+1}, where Geo_δ(k) = (1 − δ)^{k−1}δ. This observation makes the difference between our model and previous ones clearer, as it is the memorylessness of the geometric distribution that corresponds to the Markov property forward in time.

2. The probability that a given plant originates from a seed is ε.

3. Note also that the offspring distribution of active individuals (both for the number of plants and for the number of seeds) is exchangeable within their respective sub-population.


4. Observe that, as for the classical Wright-Fisher mechanism described in Section 2.1, this mechanism can also be traced forward and backward in time.
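A minimal Python sketch of one generation of the mechanism in Definition 2.11 may help fix ideas (function name and type labels are ours; multinomial sampling with equal weights is realized as independent uniform parent choices):

```python
import random

def step(plants, seeds, c, rng=random.Random(1)):
    """One generation of the Wright-Fisher model with geometric seed-bank
    (Definition 2.11), where c = epsilon*N = delta*M; offspring inherit
    the genetic type of their parent (step 5)."""
    N, M = len(plants), len(seeds)
    germ = set(rng.sample(range(M), c))                            # step 2: c seeds germinate
    new_plants = [plants[rng.randrange(N)] for _ in range(N - c)]  # step 1: plant offspring
    new_plants += [seeds[j] for j in germ]                         # germinated seeds become plants
    new_seeds = [plants[rng.randrange(N)] for _ in range(c)]       # step 3: plants produce new seeds
    new_seeds += [seeds[j] for j in range(M) if j not in germ]     # step 4: the rest stay dormant
    return new_plants, new_seeds

plants, seeds = ['a'] * 6 + ['A'] * 4, ['a'] * 3 + ['A'] * 2  # N = 10, M = 5, two types
print(step(plants, seeds, c=2))
```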

Figure 2.5: Illustration of steps 1-4 of the Wright-Fisher mechanism with geometric seed-bank, each panel showing the transition from generation 0 to generation 1 in the two sub-populations (N plants, M seeds). The hatching highlights the individuals involved in the reproduction in the respective step.

Figure 2.6: Illustration of step 5 of the Wright-Fisher mechanism with geometric seed-bank. The offspring inherit their type from their parent. Here we have three types: white, hatched gray and double-hatched blue.


Figure 2.7 visualizes the ancestral relations resulting from iterating this mechanism.

Figure 2.7: Sketch of a realization of ancestral relationships in a Wright-Fisher model with geometric seed-bank component (columns: N plants, M seeds). It exemplifies how a genetic type (highlighted in grey) is lost for two generations in the plant population, but then reappears, illustrating the buffering activity of the seed-bank against genetic drift, maintaining genetic variability. As usual, the arrow on the left indicates the direction of real-life time.

Following the example of the classic Wright-Fisher story from Section 2.1, we now want to consider the type-process. In order to avoid disproportionate notational complexity, we abstain from defining the Wright-Fisher process with geometric seed-bank and instead move straight to the mathematical definition of the Wright-Fisher type-configuration with geometric seed-bank:

Definition 2.13. Fix population size N ∈ ℕ, seed-bank size M, genetic type space E and δ, ε as before. Given initial type configurations ξ₀ ∈ E^N and η₀ ∈ E^M, denote by

ξ_k := (ξ_k(i))_{i∈⟦N⟧}

the random genetic type configuration in E^N of the plants in generation k ∈ ℕ₀ obtained from the mechanism given in Definition 2.11, and denote by

η_k := (η_k(j))_{j∈⟦M⟧}

correspondingly the genetic type configuration of the seeds in E^M. We call the discrete-time Markov chain (ξ_k, η_k)_{k∈ℕ₀} with values in E^N × E^M the type configuration process of the Wright-Fisher model with geometric seed-bank component.

2.3.1 A forward scaling limit

When observing evolution forward in time, we are interested in the change in frequency of the different types in the type space E. Will a type become extinct or, on the contrary, dominate the others? Such a question is also asked (and partially answered) for another model in Section 3.3. Here, we will restrict ourselves to the bi-allelic case with type space E := {a, A}.

Definition 2.14. For the type configuration process given in Definition 2.13 and type space E := {a, A}, define

X_k^N := (1/N) ∑_{i∈⟦N⟧} 1_{{ξ_k(i)=a}} and Y_k^M := (1/M) ∑_{j∈⟦M⟧} 1_{{η_k(j)=a}}, k ∈ ℕ₀.

We call (X_k^N)_{k∈ℕ₀} and (Y_k^M)_{k∈ℕ₀} the frequency processes of a alleles in the active population, respectively in the seed-bank (for parameters N and M).
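Given type configurations as plain sequences (for instance those produced by the step sketch above), the frequencies of Definition 2.14 are one-liners; a small sketch (names ours):

```python
def frequencies(plants, seeds, allele='a'):
    """Frequency of the allele among plants (X) and among seeds (Y), cf. Definition 2.14."""
    return (sum(1 for t in plants if t == allele) / len(plants),
            sum(1 for t in seeds if t == allele) / len(seeds))

print(frequencies(['a', 'a', 'A', 'a'], ['A', 'a']))  # (0.75, 0.5)
```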

Obviously, (X_k^N, Y_k^M)_{k∈ℕ₀} is a Markov chain taking values in

I_N × I_M := {0, 1/N, 2/N, . . . , 1} × {0, 1/M, 2/M, . . . , 1} ⊂ [0, 1] × [0, 1].

Notation 2.15 Denote by P_{x,y} the distribution for which (X_k^N, Y_k^M)_{k∈ℕ₀} starts in (x, y) ∈ I_N × I_M P_{x,y}-a.s., i.e.

P_{x,y}( · ) := P( · | X_0^N = x, Y_0^M = y) for any (x, y) ∈ I_N × I_M,

and analogously use E_{x,y}, V_{x,y} for the expectation and variance w.r.t. P_{x,y}.

We can now calculate the (time-homogeneous) transition probabilities of the allele frequency Markov chain.

Proposition 2.16. Let c := εN = δM and assume c ∈ ⟦N⟧₀. With the above notation we have, for (x, y) resp. (x̄, ȳ) ∈ I_N × I_M,

P_{x,y}(X_1^N = x̄, Y_1^M = ȳ) = ∑_{i=0}^{c} P_{x,y}(Z = i) P_{x,y}(U = x̄N − i) P_{x,y}(V = (ȳ − y)M + i),

where Z, U, V are independent under P_{x,y} with distributions

L_{x,y}(Z) = Hyp_{M,c,yM}, L_{x,y}(U) = Bin_{N−c,x}, L_{x,y}(V) = Bin_{c,x}.

Here, Hyp_{M,c,yM} denotes the hypergeometric distribution with parameters M, c, yM, and Bin_{c,x} is the binomial distribution with parameters c and x.

Figure 2.8: Illustration of the random variables Z, U and V as explained in Remark 2.17. We see three times the same step of the Wright-Fisher mechanism with geometric seed-bank in the bi-allelic case. The individuals of type a are double-hatched blue. Each time, the individuals that are part of the definition of the random variable under consideration are highlighted by a black outline. Thus the values of the random variables in this example are given by the number of double-hatched individuals with a black outline in generation 1: Z = 1, U = 1 and V = 2.

Remark 2.17. The random variables introduced in Proposition 2.16 have a simple interpretation, readily visible in Figure 2.8:


Z is the number of plants in generation 1 that are offspring of a seed of type a in generation 0. This corresponds to the number of seeds of type a that germinate, i.e. become active, in the next generation (noting that, in contrast to plants, the 'offspring' of a germinating seed is always precisely one plant, and the seed vanishes).

U is the number of plants in generation 1 that are offspring of plants of type a in generation 0.

V is the number of seeds in generation 1 that are produced by plants of type a in generation 0.

Thus, the number of seeds that are ‘offspring’ of seeds, i.e. the number of seeds that did not germinate, is given by yM − Z.

Proof of Proposition 2.16. With the interpretation of Z, U and V given in Remark 2.17, their distributions are immediate from Definition 2.11. By construction we then have X_1^N = (U + Z)/N and Y_1^M = y + (V − Z)/M, and thus the claim follows. □
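Proposition 2.16 also yields a fast way to simulate the frequency chain directly, without tracking individuals: sample Z, U and V from their hypergeometric and binomial distributions. A sketch using numpy (name ours; we assume yM is integral, as in the proposition):

```python
import numpy as np

def one_step(x, y, N, M, c, rng=np.random.default_rng(0)):
    """One transition of the frequency chain via Proposition 2.16:
    X_1 = (U + Z)/N and Y_1 = y + (V - Z)/M."""
    a_seeds = round(y * M)                           # number of type-a seeds, assumed integral
    Z = rng.hypergeometric(a_seeds, M - a_seeds, c)  # type-a seeds among the c germinating ones
    U = rng.binomial(N - c, x)                       # type-a offspring among the N - c new plants
    V = rng.binomial(c, x)                           # type-a offspring among the c new seeds
    return (U + Z) / N, y + (V - Z) / M

print(one_step(0.5, 0.2, N=100, M=50, c=2))
```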

In many modeling scenarios in population genetics, parameters describing evolutionary forces such as mutation, selection and recombination are scaled in terms of the population size N in order to reveal a non-trivial limiting structure, an example of which we have seen in Section 2.1. In our case, the interesting regime is reached by letting ε, δ and M(!) scale with N. More precisely, assume that there exist c, K ∈ (0, ∞) such that

ε = ε(N) = c/N and M = M(N) = N/K. (2.5)

Again, for notational convenience we consider c ∈ ⟦N⟧0 as N → ∞. Under assumption (2.5), the seed-bank age distribution is geometric with parameter

δ = δ(N) = c/M(N) = cK/N,

and recalling Definition 2.11 of the mechanism, one sees that c is the number of seeds that become active in each generation, respectively the number of individuals that move to the seed-bank. In Figures 2.5 and 2.7, for example, we had c = 2. The parameter K determines the relative size of the seed-bank with respect to the active population. With the right scaling we obtain a non-trivial limit of the frequency process of a alleles.


Proposition 2.18. In the set-up of Definition 2.14, assume ε, δ and M as in (2.5). Consider test functions f ∈ C^(3)([0, 1]²). For any (x, y) ∈ I_N × I_M, we define the discrete generator A^N = A^N_{(ε,δ,M)} of the frequency Markov chain (X_k^N, Y_k^M)_{k∈ℕ₀} by

(A^N f)(x, y) := N E_{x,y}[f(X_1^N, Y_1^M) − f(x, y)].

Then for all (x, y) ∈ [0, 1]²,

lim_{N→∞} (A^N f)(x, y) = (Af)(x, y),

where A is defined by

(Af)(x, y) := c(y − x) ∂f/∂x (x, y) + cK(x − y) ∂f/∂y (x, y) + (1/2) x(1 − x) ∂²f/∂x² (x, y).

Proof. This follows straightforwardly from Proposition 2.59, which we have outsourced to Section 2.6.1 together with the lengthy technical calculations. Using D_{N,M} = N it states

(A^N f)(x, y) = N [ (c/N)(y − x) ∂f/∂x (x, y) + (cK/N)(x − y) ∂f/∂y (x, y) + (1/N)·(1/2) x(1 − x) ∂²f/∂x² (x, y) + R(N) ],

where the remainder term R(N) satisfies that there exists a constant C₁ = C₁(c, f) ∈ (0, ∞), independent of N, such that

|R(N)| ≤ C₁ (N^{−3/2} + K² N^{−2} + N^{−2} K³ + K N^{−4}). □

From this we can obtain the main result of this section: The limiting object is indeed a generator of a diffusion process given as the strong solution of an SDE using Theorem 3.2 in [101]. The convergence then follows from the corresponding convergence of the generators using for example Theorem 19.28 in [59].

Corollary 2.19. Under the conditions of Proposition 2.18, if X_0^N → x and Y_0^{M(N)} → y a.s. as N → ∞, we have

(X^N_{⌊Nt⌋}, Y^{M(N)}_{⌊Nt⌋})_{t≥0} ⇒ (X_t, Y_t)_{t≥0}

on D_{[0,∞)}([0, 1]²) as N → ∞, where (X_t, Y_t)_{t≥0} is a two-dimensional diffusion solving

dX_t = c(Y_t − X_t) dt + √(X_t(1 − X_t)) dB_t,
dY_t = cK(X_t − Y_t) dt,     (2.6)

with X_0 = x, Y_0 = y, where (B_t)_{t≥0} is a standard Brownian motion.

Definition 2.20. We call the diffusion process (Xt,Yt)t≥0 from Corollary 2.19 the Wright-Fisher diffusion with seed-bank component, but will also refer to it as the seed-bank frequency process.
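Although the convergence statement is exact, a crude Euler-Maruyama discretization of (2.6) already illustrates the qualitative behavior. The following sketch (name ours) comes with the usual caveats of discretizing degenerate diffusions; the boundary projection is a pragmatic numerical fix, not part of the theory:

```python
import numpy as np

def seedbank_diffusion(x0, y0, c, K, T=10.0, dt=1e-3, rng=np.random.default_rng(7)):
    """Euler-Maruyama sketch of dX = c(Y-X)dt + sqrt(X(1-X))dB, dY = cK(X-Y)dt."""
    steps = int(T / dt)
    X, Y = np.empty(steps + 1), np.empty(steps + 1)
    X[0], Y[0] = x0, y0
    for k in range(steps):
        dB = rng.normal(0.0, np.sqrt(dt))
        X[k+1] = X[k] + c * (Y[k] - X[k]) * dt + np.sqrt(max(X[k] * (1 - X[k]), 0.0)) * dB
        X[k+1] = min(max(X[k+1], 0.0), 1.0)         # crude projection back onto [0, 1]
        Y[k+1] = Y[k] + c * K * (X[k] - Y[k]) * dt  # the seed component carries no noise
    return X, Y

X, Y = seedbank_diffusion(0.3, 0.6, c=1.0, K=2.0)
print(X[-1], Y[-1])
```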

Remark 2.21. If we abandon the assumption N = KM, there are situations in which we can still obtain meaningful scaling limits. If we assume N/M → 0 and rescale the generator as before by measuring time in units of size N, we obtain (cf. Proposition 2.59)

lim_{N→∞} (A^N f)(x, y) = c(y − x) ∂f/∂x (x, y) + (1/2) x(1 − x) ∂²f/∂x² (x, y).

This shows that the limiting process is purely one-dimensional: the seed-bank frequency Y_t is constantly equal to y for any t ≥ 0, and the process (X_t)_{t≥0} is a Wright-Fisher diffusion with migration (with migration rate c and reverting to the mean y). The seed-bank, which in this scaling regime is much larger than the active population, thus acts as a reservoir with constant allele frequency y, with which the plant population interacts.

The case M/N → 0 leads to a simpler limit: if we rescale the generator by measuring time in units of size M, we obtain

lim_{M→∞} (A^M f)(x, y) = c(x − y) ∂f/∂y (x, y)

and constant frequency X_t = x for all t ≥ 0 in the plant population, which tells us that if the seed-bank is of smaller order than the active population, the genetic configuration of the seed-bank will converge to the genetic configuration of the active population, in a deterministic way.

The above results can be extended to more general genetic type spaces E in a standard way using the theory of measure-valued, respectively Fleming-Viot, processes. This will be treated elsewhere. Before we investigate some properties of the limiting system (indeed, in order to investigate said properties), we first derive its dual process.


2.3.2 The dual of the seed-bank frequency process

As we saw in Section 2.1, the classic Wright-Fisher diffusion is known to be dual to the block-counting process of the Kingman coalescent, and similar duality relations hold for other models in population genetics, see for example [67] or [25]. Such dual processes are often extremely useful for the analysis of the underlying system, an example of which we will see in Section 2.3.3. Indeed, our Wright-Fisher diffusion with geometric seed-bank component also has a nice dual, which we now define.

Definition 2.22. We define the block-counting process of the seed-bank coalescent (N_t, M_t)_{t≥0} to be the continuous-time Markov chain taking values in ℕ₀ × ℕ₀ with transitions

(n, m) ↦ (n − 1, m + 1) at rate cn,
(n, m) ↦ (n + 1, m − 1) at rate cKm,     (2.7)
(n, m) ↦ (n − 1, m) at rate n(n − 1)/2.
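The rates in (2.7) translate directly into a Gillespie-type simulation; the following sketch (name ours) runs the block-counting process until a single lineage remains and returns the absorption time:

```python
import random

def simulate_block_counting(n, m, c, K, rng=random.Random(3)):
    """Simulate (N_t, M_t) with rates (2.7), started in (n, m);
    returns the time at which a single lineage remains."""
    t = 0.0
    while n + m > 1:
        rates = [c * n, c * K * m, n * (n - 1) / 2]
        total = sum(rates)
        t += rng.expovariate(total)      # exponential holding time
        u = rng.random() * total
        if u < rates[0]:
            n, m = n - 1, m + 1          # an active lineage becomes dormant
        elif u < rates[0] + rates[1]:
            n, m = n + 1, m - 1          # a dormant lineage becomes active
        else:
            n -= 1                       # two active lineages coalesce
    return t

print(simulate_block_counting(10, 0, c=1.0, K=1.0))
```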

Notation 2.23 As done for the frequency process for finite population sizes N and M, denote by P_{x,y} the distribution for which (X_t, Y_t)_{t≥0} from Definition 2.20 starts in (x, y) ∈ [0, 1]² P_{x,y}-a.s., and use E_{x,y}, V_{x,y} for the expectation and variance w.r.t. P_{x,y}. In the same manner, denote now by P^{n,m} the distribution for which (N_0, M_0) = (n, m) holds P^{n,m}-a.s., and the corresponding expected value by E^{n,m}.

Remark 2.24. Let us record some simple observations about this process that we will need in different instances later. It is easy to see that, eventually, N_t + M_t = 1 (as t → ∞), P^{n,m}-a.s. for all (n, m) ∈ ℕ₀ × ℕ₀ ∖ {(0, 0)}. We can even say that for such a pair (n, m) the stopping time T := inf{t > 0 | N_t + M_t = 1} has finite expectation, i.e. E^{n,m}[T] < ∞ holds true. In addition, the distribution of (N_t, M_t)_{t≥0} converges weakly to the distribution

K/(1 + K) δ_{(1,0)} + 1/(1 + K) δ_{(0,1)},

which is the invariant distribution of a single individual jumping between the states plant and seed at rate c, respectively cK.

We now show that (Nt,Mt)t≥0 is the moment dual of (Xt,Yt)t≥0.

Theorem 2.25. Let (X_t, Y_t)_{t≥0} be as in Definition 2.20 and (N_t, M_t)_{t≥0} as in Definition 2.22. For every (x, y) ∈ [0, 1]², (n, m) ∈ ℕ₀ × ℕ₀ and t ≥ 0,

E_{x,y}[X_t^n Y_t^m] = E^{n,m}[x^{N_t} y^{M_t}]. (2.8)


For the reader new to this topic, the process defined in Definition 2.22 completing the duality result might seem like a lucky guess, but given the diffusion it is actually rather easy to find: note that its three possible transitions correspond respectively to the drift of the X-component, the drift of the Y-component, and the diffusion part of the system (2.6). Though void of meaning for now, we have already used the term block-counting process of the seed-bank coalescent, foreshadowing the deeper meaning of this duality. As we will see in Section 2.4, the process defined in Definition 2.22 does indeed count the 'blocks' of another process (cf. Definition 2.30) that corresponds to our Wright-Fisher type mechanism from Definition 2.11, but traced backwards in time, much like the Wright-Fisher type diffusion resulted from following this model forwards in time.

Proof of Theorem 2.25. Let f(x, y; n, m) := f^{(n,m)}(x, y) := f_{(x,y)}(n, m) := x^n y^m. Applying the generator A of (X_t, Y_t)_{t≥0} for fixed n, m ∈ ℕ₀ to f acting as a function of x and y, i.e. to f^{(n,m)}, gives

(Af)(x, y) = (A f^{(n,m)})(x, y)
= c(y − x) ∂f/∂x (x, y) + (1/2) x(1 − x) ∂²f/∂x² (x, y) + cK(x − y) ∂f/∂y (x, y)
= c(y − x) n x^{n−1} y^m + (1/2) x(1 − x) n(n − 1) x^{n−2} y^m + cK(x − y) m x^n y^{m−1}
= cn (x^{n−1} y^{m+1} − x^n y^m) + (n(n − 1)/2) (x^{n−1} y^m − x^n y^m) + cKm (x^{n+1} y^{m−1} − x^n y^m)
= cn f_{(x,y)}(n − 1, m + 1) + cKm f_{(x,y)}(n + 1, m − 1) + (n(n − 1)/2) f_{(x,y)}(n − 1, m) − (cn + cKm + n(n − 1)/2) f_{(x,y)}(n, m)
= (Q f_{(x,y)})(n, m) = (Qf)(n, m),

where Q is the generator of (N_t, M_t)_{t≥0} applied to f acting as a function of n and m, for fixed x, y ∈ [0, 1]. Hence the duality follows from standard arguments, see e.g. [57], Proposition 1.2. □
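The generator identity at the heart of this proof is mechanical enough to be checked symbolically; a small sympy sketch (our own verification aid, not part of the argument):

```python
import sympy as sp

x, y, c, K = sp.symbols('x y c K', positive=True)
n, m = sp.symbols('n m', positive=True)
f = x**n * y**m

# generator A of the diffusion (2.6) applied to f as a function of (x, y)
Af = (c*(y - x)*sp.diff(f, x) + sp.Rational(1, 2)*x*(1 - x)*sp.diff(f, x, 2)
      + c*K*(x - y)*sp.diff(f, y))

# generator Q of the block-counting process (2.7) applied to f as a function of (n, m)
coal = n*(n - 1)/2
Qf = (c*n*(x**(n - 1)*y**(m + 1) - f) + c*K*m*(x**(n + 1)*y**(m - 1) - f)
      + coal*(x**(n - 1)*y**m - f))

print(sp.simplify(sp.expand(Af - Qf)))  # prints 0, i.e. (Af)(x,y) = (Qf)(n,m)
```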

2.3.3 Long-term behavior and fixation probabilities

The long-term behavior of our system (2.6) is not obvious. While we saw in Section 2.1 that the classical Wright-Fisher diffusion (Z_t)_{t≥0}, given by

dZ_t = √(Z_t(1 − Z_t)) dB_t, Z_0 = z ∈ [0, 1],

will get absorbed at the boundaries in finite time a.s. (in fact with finite expectation), hitting 1 with probability z, this is more involved for our frequency process in the presence of a strong seed-bank. We use the duality observed in the previous section to prove the crucial observation for the long-term behavior.

Proposition 2.26. All mixed moments of (Xt,Yt)t≥0 solving (2.6) converge to the same finite limit depending only on x, y and K. More precisely, for each fixed (n, m) ∈ N0 × N0 ∖ {(0, 0)}, we have

lim_{t→∞} E_{x,y}[X_t^n Y_t^m] = (y + xK)/(1 + K). (2.9)

Proof. Let (N_t, M_t)_{t≥0} be as in Definition 2.22, started in (n, m) ∈ ℕ₀ × ℕ₀ ∖ {(0, 0)}. Let T be the first time at which there is only one particle left in the system (N_t, M_t)_{t≥0}, that is,

T := inf{t > 0 : N_t + M_t = 1}.

As discussed in Remark 2.24, for any finite initial configuration (n, m) ∈ ℕ₀ × ℕ₀ ∖ {(0, 0)} the stopping time T has finite expectation. Now, applying Theorem 2.25 in the first equality, we obtain

lim_{t→∞} E_{x,y}[X_t^n Y_t^m] = lim_{t→∞} E^{n,m}[x^{N_t} y^{M_t}]
= lim_{t→∞} E^{n,m}[x^{N_t} y^{M_t} | T ≤ t] P^{n,m}(T ≤ t) + lim_{t→∞} E^{n,m}[x^{N_t} y^{M_t} | T > t] P^{n,m}(T > t)
= lim_{t→∞} ( x P^{n,m}((N_t, M_t) = (1, 0), T ≤ t) + y P^{n,m}((N_t, M_t) = (0, 1), T ≤ t) )
= lim_{t→∞} ( x P^{n,m}((N_t, M_t) = (1, 0)) + y P^{n,m}((N_t, M_t) = (0, 1)) )
= xK/(1 + K) + y/(1 + K),

where the second summand in the second line vanishes since the conditional expectation is bounded by 1 while P^{n,m}(T > t) → 0, and the last equality holds by the convergence observed in Remark 2.24. The limit is independent of the choice of the starting point (n, m). □

We can now consider the long-term behavior of the diffusion. Obviously, (0, 0) and (1, 1) are absorbing states for the system (2.6). They are also the only absorbing states, since absence of drift requires x = y, and for the fluctuations to disappear it is necessary to have x ∈ {0, 1}.


Corollary 2.27. In the set-up of Definition 2.14, for any initial value (x, y) ∈ [0, 1]² the diffusion (X_t, Y_t)_{t≥0} converges P_{x,y}-a.s. as t → ∞ to a two-dimensional random variable (X_∞, Y_∞), whose distribution is given by

L_{x,y}(X_∞, Y_∞) = (y + xK)/(1 + K) δ_{(1,1)} + (1 − y + (1 − x)K)/(1 + K) δ_{(0,0)}, (2.10)

where δ_{(x̃,ỹ)} is the Dirac measure on (x̃, ỹ).

Note that this is in line with the classical results for the Wright-Fisher diffusion: as K → ∞ (that is, the seed-bank becomes small compared to the plant population), the fixation probability of a alleles approaches x. Further, if K becomes small (so that the seed-bank population dominates the plant population), the fixation probability is governed by the initial fraction y of a alleles in the seed-bank.

Proof. We first prove convergence in distribution. It is easy to see that the only two-dimensional distribution on [0, 1]² for which all mixed moments equal (y + xK)/(1 + K) is given by

(y + xK)/(1 + K) δ_{(1,1)} + (1 + (1 − x)K − y)/(1 + K) δ_{(0,0)}.

Indeed, uniqueness follows from the moment problem, which is uniquely solvable on [0, 1]², cf. [53]. Convergence in law follows from convergence of all moments due to the Portmanteau Theorem (cf. for example Theorem 3.3.1 in [26]) and the Stone-Weierstraß Theorem.

On the other hand, we observe that (KX_t + Y_t)_{t≥0} is a bounded martingale and thus converges a.s. to some limit Z_∞, which by (2.10) has the distribution

(y + xK)/(1 + K) δ_{K+1} + (1 + (1 − x)K − y)/(1 + K) δ_0.

Now we are left to conclude the convergence of the summands, which we can easily do due to the shape of the limiting law. With the elementary observation that for any (x̃, ỹ) ∈ [0, 1]² and a, b ∈ ℝ

Kx̃ + ỹ ∈ [a, b] ⇒ Kx̃ ∈ [0 ∨ (a − 1), b ∧ K] and ỹ ∈ [0 ∨ (a − K), b ∧ 1],

when we consider the events

A_X := {∀n ∈ ℕ ∃T ≥ 0 ∀t ≥ T : 0 ∨ (Z_∞ − 1/n − 1) ≤ KX_t ≤ (Z_∞ + 1/n) ∧ K},
A_Y := {∀n ∈ ℕ ∃T ≥ 0 ∀t ≥ T : 0 ∨ (Z_∞ − 1/n − K) ≤ Y_t ≤ (Z_∞ + 1/n) ∧ 1},


we know P(A_X) = P(A_Y) ≥ P(KX_t + Y_t → Z_∞) = 1. Now, splitting along the possible values of Z_∞, we observe

A_X ∩ {Z_∞ = 0} = {X_t → 0} and A_X ∩ {Z_∞ = K + 1} = {X_t → 1},

and likewise for (Y_t)_{t≥0}. Thus we have the almost sure convergence of (X_t)_{t≥0} and (Y_t)_{t≥0} to X_∞ = Y_∞ = 1_{{Z_∞ = K+1}}, which proves the claim. □
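As a quick plausibility check of (2.10): for x = 0.3, y = 0.6 and K = 2 the a allele fixes with probability (0.6 + 0.3·2)/3 = 0.4, whereas for K = 1/2 it fixes with probability (0.6 + 0.15)/1.5 = 0.5. Shrinking K, i.e. enlarging the seed-bank, moves the fixation probability away from the plant frequency x = 0.3 and towards the seed-bank frequency y = 0.6, in line with the K → ∞ and K → 0 limits noted above.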

Remark 2.28. Recall that (X_t)_{t≥0} and (Y_t)_{t≥0} describe the fraction of type a alleles in the plant, respectively the seed-bank population. Our result thus shows that fixation of one of the two alleles a or A, i.e. the disappearance of the respective other type, will occur almost surely in the limit. However, this will not happen in finite time! To understand this from (2.6), we can compare the seed-component (Y_t)_{t≥0} to the solution of the deterministic equation

dy_t = −cK y_t dt,

corresponding to a situation where the drift towards 0 is maximal (or to dy_t = cK(1 − y_t) dt, where the drift towards 1 is maximal), since we have removed the 'helping' impact of the plants. Given that (y_t)_{t≥0}, with y_t = y_0 e^{−cKt}, does not reach 0 in finite time if y_0 > 0, neither does (Y_t)_{t≥0}. This will also become visible later in Section 2.5.1, where we will show that the block-counting process (N_t, M_t)_{t≥0}, started from an infinite initial state, does not come down from infinity, which means that the whole (infinite) population does not have a most recent common ancestor. Thus, in finite time, initial genetic variability should never be completely lost. We expect that, with some extra work, this intuitive reasoning could be made rigorous in an almost sure sense with the help of a 'look-down construction', which shall be treated in future work.

2.4 The seed-bank coalescent

So far, we have been concerned with the evolution of our model forward in time, so our time index evolved in natural real-life time, i.e. from now to the 'hoverboard-future'. Much as was done for the Wright-Fisher model in Section 2.1, it is just as natural to ask for the evolution up to now, i.e. for the genealogical tree in our model. As we do this, remember that from this point of view, when we speak of 'evolving' and let t → ∞, we are actually going backwards in time, i.e. in the 'direction of the Big Bang'. We will want to trace the lines of ancestors of a set of 'present-day' individuals and observe, for example, when they move from plant-state to seed-bank

or merge (i.e. when the individuals find a common ancestor). To formalize this, we introduce the space of marked partitions.

Notation 2.29 (The space of marked partitions) For k ∈ ℕ, let P_k be the set of partitions of ⟦k⟧ = {1, . . . , k}. For π ∈ P_k let |π| be the number of blocks of the partition π. We define the space of marked partitions of ⟦k⟧ to be

P_k^{{p,s}} := {(ζ, u⃗) | ζ ∈ P_k, u⃗ ∈ {s, p}^{|ζ|}}

and use P^{{p,s}} for the analogous set of (marked) partitions of ℕ. This enables us to attach a flag to each block of the partition, which can be either 'plant' or 'seed' (p or s), so that we can trace whether an ancestral line is currently in the active or dormant part of the population. Recall that all elements of a block are interpreted to have a common ancestor.

For example, for k = 5, an element π of P_k^{{p,s}} is the marked partition π = {{1, 3}^p, {2}^s, {4, 5}^p}. This means that the individuals 1 and 3 have a common ancestor who is a plant, the individuals 4 and 5 also have a common ancestor who is a plant, and the ancestor of 2 is a seed.

Consider two marked partitions π, π′ ∈ P_k^{{p,s}}; we say π ≻ π′ if π′ can be constructed by merging exactly two blocks of π carrying the p-flag, where the resulting block in π′ obtained from this merger again carries a p-flag. For example,

{{1, 3}^p, {2}^s, {4, 5}^p} ≻ {{1, 3, 4, 5}^p, {2}^s}.

We use the notation π & π′ if π′ can be constructed by changing the flag of precisely one block of π, for example

{{1, 3}^p, {2}^s, {4, 5}^p} & {{1, 3}^s, {2}^s, {4, 5}^p}.
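Marked partitions and the relations ≻ and & are straightforward to encode; a small Python sketch (our representation: a list of (block, flag) pairs) enumerates the possible transitions of the example above:

```python
from itertools import combinations

# the marked partition {{1,3}^p, {2}^s, {4,5}^p} as a list of (block, flag) pairs
pi = [(frozenset({1, 3}), 'p'), (frozenset({2}), 's'), (frozenset({4, 5}), 'p')]

def coalescences(pi):
    """All marked partitions reachable from pi by merging two p-blocks (the relation >-)."""
    out = []
    p_idx = [i for i, (_, flag) in enumerate(pi) if flag == 'p']
    for i, j in combinations(p_idx, 2):
        rest = [(b, f) for k, (b, f) in enumerate(pi) if k not in (i, j)]
        out.append(rest + [(pi[i][0] | pi[j][0], 'p')])  # merged block keeps the p-flag
    return out

def flag_flips(pi):
    """All marked partitions reachable from pi by flipping the flag of one block (the relation &)."""
    return [[(b, 's' if f == 'p' else 'p') if k == i else (b, f)
             for k, (b, f) in enumerate(pi)] for i in range(len(pi))]

print(len(coalescences(pi)), len(flag_flips(pi)))  # 1 possible merger, 3 possible flips
```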

In view of the form of the block-counting process in Definition 2.22, it is now easy to guess the stochastic process describing the limiting gene genealogy of a sample taken from the Wright-Fisher model with seed-bank component.

Definition 2.30 (The seed-bank k-coalescent). For k ≥ 2 and c, K ∈ (0, ∞) we define the seed-bank k-coalescent (Π_t^{(k)})_{t≥0} with seed-bank intensity c and relative seed-bank size 1/K to be the continuous-time Markov chain with values in P_k^{{p,s}}, characterized by the following transitions:

π ↦ π′ at rate 1 if π ≻ π′,
π ↦ π′ at rate c if π & π′ and one p is replaced by one s,     (2.11)
π ↦ π′ at rate cK if π & π′ and one s is replaced by one p.


If c = K = 1, we speak of the standard seed-bank k-coalescent.

Comparing (2.11) to (2.7), it becomes evident that (N_t, M_t)_{t≥0}, introduced in Definition 2.22, is indeed the block-counting process of the seed-bank k-coalescent, given a suitable starting point; see Figure 2.9 for a visual representation of the seed-bank k-coalescent. (N_t, M_t) then gives the number of lineages that are active, respectively dormant, at time t ≥ 0.

Definition 2.31 (The seed-bank coalescent). We may define the seed-bank coalescent (Π_t)_{t≥0} = (Π_t^{(∞)})_{t≥0} with seed-bank intensity c and relative seed-bank size 1/K as the unique Markov process distributed according to the projective limit of the laws of the seed-bank k-coalescents (with seed-bank intensity c and relative seed-bank size 1/K) as k goes to infinity. In analogy to Definition 2.30, we call the case c = K = 1 the standard seed-bank coalescent.

Remark 2.32.

1. Note that the seed-bank coalescent is a well-defined object. Indeed, for the projective limiting procedure to make sense, we need to show consistency and then apply the Kolmogorov extension theorem. This can be roughly sketched as follows. Define the process (Π⃗_t^{(k)})_{t≥0} as the projection of (Π_t^{(k+1)})_{t≥0}, the seed-bank (k+1)-coalescent, to the space P_k^{{p,s}}. Mergers and flag-flips involving the singleton {k + 1} are only visible in (Π_t^{(k+1)})_{t≥0}, but do not affect (Π⃗_t^{(k)})_{t≥0}. Indeed, by the Markov property, a change involving the singleton {k + 1} does not affect any of the other transitions. Hence, if Π⃗_0^{(k)} = Π_0^{(k)}, then

(Π⃗_t^{(k)})_{t≥0} = (Π_t^{(k)})_{t≥0}

holds in distribution. By the Kolmogorov extension theorem, the projective limit exists and is unique.

2. We have observed that the Markov chain (N_t, M_t)_{t≥0} defined in Definition 2.22 does indeed count the blocks of a seed-bank k-coalescent if started suitably. In the same manner, we will want to consider the block-counting process of the seed-bank coalescent and use the same notation (N_t, M_t)_{t≥0} for it, which is a slight abuse, but an excusable one given the set-up and the gain in readability. Note that this block-counting process is not restricted to ℕ₀ × ℕ₀ anymore, but may take values in (ℕ₀ ∪ {ℵ₀}) × (ℕ₀ ∪ {ℵ₀}).


Figure 2.9: A realization of the seed-bank k-coalescent (for k = 10). The black lines indicate active lineages, the dashed lines dormant lineages. At the time of the blue horizontal line the configuration of the process is {{1, 2}^s, {3, 4, 5, 6, 7}^p, {8, 9, 10}^s}. The dashed arrow indicates the direction of time t of the process. The black arrow on the left indicates the direction of real-life time.

Further, it is not hard to see that the seed-bank coalescent does indeed appear as the limiting genealogy of a sample taken from the Wright-Fisher model with geometric seed-bank component, in the same way as the Kingman coalescent describes the limiting genealogy of a sample taken from the classical Wright-Fisher model. Here we merely sketch a proof, which is entirely standard, but notationally exuberant. Indeed, consider the genealogy of a sample of k ≪ N individuals, sampled from present generation 0. We proceed backward in real-life time, keeping track in each generation of the ancestors of the original sample among the active individuals (plants) and among the seeds. To this end, denote by Π_i^{(N,k)} ∈ P_k^{{p,s}} the configuration of the genealogy at generation −i, where


two individuals belong to the same block of the partition Π_i^{(N,k)} if and only if their ancestral lines have met by generation −i, which means that all individuals of a block have exactly one common ancestor in this generation, and the flag s or p indicates whether it is a plant or a seed in generation −i. According to our forward-in-time population model described in Definition 2.11, the following transitions from one generation to the previous one are possible for this process:

• One (or several) plants become seeds in the previous generation.

• One (or several) seeds become plants in the previous generation.

• Two (or more) individuals have the same ancestor in the previous gen- eration (which by construction is necessarily a plant), meaning that their ancestral lines merge.

• Any possible combination of these three events.

A possible realization of the seed-bank genealogy can be found in Figure 2.10. It turns out that only three of the possible transitions play a role in the limit as N → ∞, whereas the others have a probability that is of smaller order.

Proposition 2.33. In the setting of Proposition 2.18, fix k ≪ N and assume that Π_0^{(N,k)} = {{1}^p, . . . , {k}^p} P-a.s. Then for π, π′ ∈ P_k^{{p,s}},

P(Π_{i+1}^{(N,k)} = π′ | Π_i^{(N,k)} = π) =
  1/N + O(N^{−2}) if π ≻ π′,
  c/N + O(N^{−2}) if π & π′ and a p is replaced by an s,
  cK/N + O(N^{−2}) if π & π′ and an s is replaced by a p,
  O(N^{−2}) otherwise,

for all i ∈ ℕ₀.

Figure 2.10: An illustration of the Wright-Fisher mechanism with geometric seed-bank (with N plants, M seeds and c = 2) iterated for several generations. The genealogy of k = 3 individuals chosen in generation 0 is highlighted through hatching; its configuration is given on the right. The most recent common ancestor of the sample is double-hatched blue.

Proof. According to the definition of the forward-in-time population model, exactly c out of the N plants become seeds, and exactly c out of the M = N/K seeds become plants. Thus, whenever the current state Π_i^{(N,k)} of the genealogical process contains at least one p-block, the probability that a given p-block changes flag to s at the next time step is equal to c/N. If there is at least one s-block, then the probability that any given s-block changes flag to p is given by cK/N, and the probability that a given p-block chooses a fixed plant ancestor is equal to (1 − c/N)·(1/N) (where 1 − c/N is the probability that the ancestor of the block in question is a plant, and 1/N is the probability to choose one particular plant among the N). From this we conclude that the probability of a coalescence of two given p-blocks in the next step is

P(two given p-blocks merge) = (1 − c/N)² · (1/N).

Since we start with k blocks, and the blocks move independently, the probability that two or more blocks change flag at the same time is of order at most N^{−2}. Similarly, the probability of any combination of merger or block-flip events other than single blocks flipping or binary mergers is of order N^{−2} or smaller, since the number of possible events (coalescence or change of flag) involving at most k blocks is bounded by a constant depending on k but not on N. □

Corollary 2.34. For any k ∈ N, under the assumptions of Proposition 2.18,


(Π_{⌊Nt⌋}^{(N,k)})_{t≥0} converges weakly as N → ∞ to the seed-bank k-coalescent (Π_t^{(k)})_{t≥0} started with k plants.

Proof. From Proposition 2.33 we see that the generator of (Π_{⌊Nt⌋}^{(N,k)})_{t≥0} converges to the generator of (Π_t^{(k)})_{t≥0}, which is defined via the rates given in (2.11). Using, for example, Theorem 19.28 in [59], we obtain the weak convergence of the processes. □

2.4.1 Related coalescent models

In the plethora of coalescent models, see for example [3] for an overview, there are a few with striking similarities to our seed-bank coalescent, which we briefly discuss here. With the exception of the last model mentioned, despite their similarities in the set-up, they are clearly different from our seed-bank coalescent.

The structured coalescent The seed-bank coalescent is reminiscent of the structured coalescent arising from a two-island population model (see e.g. [110, 104, 88, 51, 52]). Consider two Wright-Fisher type (sub-)populations of fixed relative size evolving on separate 'islands', where individuals (when going forward in real-life time, resp. ancestral lineages for backward real-life time) may migrate between the two locations with a rate of order reciprocal of the total population size (the so-called 'weak migration regime'). Reproduction only takes place among the individuals on the same island (and the offspring is placed on the same island); thus mergers between two ancestral lineages are only allowed if both are currently on the same island. This set-up also gives rise to a coalescent process defined on 'marked partitions', with the marks indicating the location of the ancestral lines among the two islands. Coalescences are only allowed for lines carrying the same mark at the same time, and marks are switched according to the scaled migration rates, see [109] for an overview. In our Wright-Fisher model with geometric seed-bank component, we consider a similar 'migration' regime between the two sub-populations, called 'plants' and 'seeds'. However, in the resulting seed-bank coalescent, coalescences can only happen in the plant population. This asymmetry leads to a behavior that is qualitatively different from the usual two-island scenario. For example, for the structured coalescent the expected time to the most recent common ancestor is uniformly bounded in the number of simultaneously considered individuals, and thus in particular finite even when started with infinitely many individuals, whereas we will show in Theorem 2.43 that this is not true for our seed-bank coalescent.


The coalescent with freeze Another related model is the coalescent with freeze, see [21], which behaves much like our model, but blocks become completely inactive at a fixed rate and can then never be activated again. Hence, deactivated blocks will never coalesce at all, while in our case they just have to wait to become active again to have a chance at coalescence. Clearly, this leads to a very different long-time behavior, since in particular one cannot expect to see a most recent common ancestor in such a coalescent.

The peripatric coalescent This is the name used by Lambert and Ma for the coalescent structure that arose in [68] (2015), which turns out to be our seed-bank coalescent. However, they obtained it starting from a different 'base model'. They considered a continuous-time setting of a very large 'main' population with a well-studied reproduction mechanism known as the Moran model, where, in addition, the individuals independently send out offspring to found new 'colonies'. These colonies are of smaller size than the original population and will, independently of any other event, merge back into the main population at a fixed rate. Going backwards in time, the ancestral lines can either lie in the main population (where they can merge) or in the colonies, where they are prevented from merging, and the parallels to the seed-bank coalescent become obvious.

2.5 Properties of the seed-bank coalescent

As we just saw in the previous Section 2.4.1, the seed-bank coalescent also arises in other contexts. Indeed, we believe the seed-bank coalescent to be universal in the sense that it is likely to appear as a scaling limit of various other models. Therefore, the study of its properties becomes even more important. Notation 2.35 As opposed to the point of view used in Section 2.3 it will now also become convenient to make the starting points parts of the defini- tion of the respective Markov processes and not of the measures considered. (n,m) (n,m) We will use (Nt ,Mt )t≥0 for the block-counting process from Remark 2.32,2 started in the initial conditions (n, m) ∈ N0 ∪{ℵ0}×N0 ∪{ℵ0} P-almost (n,m) (n,m) (n,m) (n,m) surely. We will often write Nt ,Mt = ∞ instead of Nt ,Mt = ℵ0 for simplicity.

2.5.1 Coming down from infinity

The notion of coming down from infinity was discussed by Pitman [90] and Schweinsberg [100]. They say that an exchangeable coalescent process comes down from infinity if the corresponding block-counting process of an infinite sample (!) has finitely many blocks immediately after time 0, i.e. if the number of blocks is finite almost surely for each t > 0. On the contrary, the coalescent is said to stay infinite if the number of blocks is infinite a.s. for all t ≥ 0. In [100], Schweinsberg gives necessary and sufficient conditions for so-called 'Λ-coalescents' to come down from infinity. In particular, the Kingman coalescent does come down from infinity. However, the seed-bank coalescent does not belong to the class of Λ-coalescents, so Schweinsberg's result does not immediately apply. Indeed, we will now see that the seed-bank coalescent does not come down from infinity.

Theorem 2.36. The seed-bank coalescent does not come down from infinity. In fact, if started in an infinite configuration, its block-counting process stays infinite for all t > 0, P-a.s. To be precise, for each starting configuration (n, m) where n + m is (countably) infinite,

P(∀t > 0 : M_t^{(n,m)} = ∞) = 1.

The proof of this theorem is based on a coupling with a dominated, simplified colored seed-bank coalescent process introduced below. In essence, the colored seed-bank coalescent behaves like the normal seed-bank coalescent, except that we mark the individuals with a color to indicate whether they have (entered and) left the seed-bank at least once. This will be useful in order to obtain a process where the number of plant-blocks is non-increasing. We will then prove that even if we consider only those individuals that have never made a transition from seed to plant (but possibly from plant to seed), the corresponding block-counting process stays infinite. This will be achieved by proving that infinitely many particles enter the seed-bank before any positive time; since particles leave the seed-bank at a linear rate, emptying it again takes an infinite amount of time. We summarize the whole set-up for the colored seed-bank coalescent in one definition.

Definition 2.37 (A colored seed-bank coalescent). In analogy to the construction of the seed-bank coalescent, we first define the set of colored, marked partitions as

P_k^{{p,s}×{w,b}} := {(π, u⃗, v⃗) | (π, u⃗) ∈ P_k^{{p,s}}, v⃗ ∈ {w, b}^k}, k ∈ ℕ,
P^{{p,s}×{w,b}} := {(π, u⃗, v⃗) | (π, u⃗) ∈ P^{{p,s}}, v⃗ ∈ {w, b}^ℕ}.

It corresponds to the marked partitions introduced earlier, where now each element of ⟦k⟧, resp. N, has an additional flag indicating its color: w for white

89 Chapter 2. A novel seed-bank model and b for blue. It is important to note that the p- or s-flags are assigned to blocks, the color-flags to individuals. We write π ≻c π′, if π′ can be constructed from π by merging two blocks with a p-flag in π that result into a block with a p-flag in π′, while each individual retains its color. We use π ⋉c π′, to denote that π′ results from π by changing the flag of a block from p to s and leaving the colors of all individuals unchanged and π ⋊c π′, if π′ is obtained from π, by changing the flag of a block from s to p and coloring all the individuals in this block blue, i.e. setting their individual flags to b. In other words, after leaving the seed-bank, individuals are always colored blue. For k ∈ N and c, K ∈ (0, ∞) we now define the colored seed-bank k- coalescent with seed-bank intensity c and seed-bank size 1/K, denoted by {p,s}×{w,b} (Πt)t≥0, as the continuous time Markov chain with values in Pk and transition rates given by

⎪⎧1, if π ≻ π′, ⎪ c ′ π ↦ π at rate ⎨c, if π ⋉c π′, (2.12) ⎪ ⎩⎪cK, if π ⋊c π′.

Figure 2.11 gives a possible realization. The colored seed-bank coalescent with seed-bank intensity c and seed-bank size 1/K is then the unique Markov process on P^{{p,s}×{w,b}} given by the projective limit of the distributions of the colored seed-bank k-coalescents, as k goes to infinity.

Remark 2.38.

1. Note that the colored seed-bank coalescent is well-defined. Since the color of an individual only depends on its own path and does not depend on the color of other individuals (not even those that belong to the same block), the consistency of the laws of the k-colored seed-bank coalescents boils down to the consistency of the seed-bank k-coalescents discussed in Remark 2.32.1. In much the same way we then obtain the existence and uniqueness of the colored seed-bank coalescent from Kolmogorov’s Extension Theorem.

2. The normal seed-bank (k-)coalescent can be obtained from the colored seed-bank (k-)coalescent by omitting the flags indicating the coloring of the individuals. However, if we only consider those blocks containing at least one white individual, we obtain a coalescent similar to the seed-bank coalescent, where lineages are discarded once they leave the seed-bank.


Figure 2.11: A realization of the colored seed-bank k-coalescent (for k = 10). The continuous lines indicate active lineages, the dashed lines dormant lineages. In the beginning, all individuals are colored white. When a lineage leaves the dormant state, all individuals in it are colored blue. When lineages merge, each individual retains its color. The dashed arrow indicates the direction of time t of the process, the black arrow that of real-life time.

Notation 2.39 For t ≥ 0 define N̄_t to be the number of white active lineages and M̄_t the number of white dormant lineages in Π_t. We will always start in a configuration where all individual labels are set to w, i.e. with only white particles. Note that our construction is such that (N̄_t)_{t≥0} is non-increasing.

Proposition 2.40. For any n, m ∈ ℕ ∪ {ℵ₀}, the processes (N_t^{(n,m)}, M_t^{(n,m)})_{t≥0} and (N̄_t^{(n,m)}, M̄_t^{(n,m)})_{t≥0} can be coupled such that

P(∀t ≥ 0 : N_t^{(n,m)} ≥ N̄_t^{(n,m)} and M_t^{(n,m)} ≥ M̄_t^{(n,m)}) = 1.

Proof. This result is immediate if we consider the coupling through the colored seed-bank coalescent and the observations in Remark 2.38. □


Proof of Theorem 2.36. Proposition 2.40 implies that it suffices to prove the statement for (M̄_t^{(n,m)})_{t≥0} instead of (M_t^{(n,m)})_{t≥0}. In addition, we only have to consider the case m = 0, since starting with more (possibly infinitely many) seeds only contributes towards the desired result. For n ∈ ℕ ∪ {ℵ₀} let

τ_j^n := inf{t ≥ 0 | N̄_t^{(n,0)} = j}, 1 ≤ j ≤ n − 1, j < ∞,

be the first time that the number of active blocks of an n-sample reaches j. Note that (N̄_t^{(n,0)})_{t≥0} behaves like the block-counting process of a Kingman coalescent where, in addition to the coalescence events, particles may 'disappear' at a rate proportional to the number of particles alive. Since the corresponding values for a Kingman coalescent are finite P-a.s., it is easy to see that the τ_j^n are, too. Clearly, for any n, τ_{j−1}^n − τ_j^n has an exponential distribution with parameter

λ_j := j(j − 1)/2 + cj.

At each transition time τ_j^n we distinguish between two events: coalescence and deactivation of an active block, where by deactivation we mean a transition of (N̄_t^n, M̄_t^n)_{t≥0} of type (j + 1, l) ↦ (j, l + 1) (for suitable l ∈ ⟦n⟧), i.e. the transition of a plant to a seed. Then

P(deactivation at τ_{j−1}^n) = cj / (j(j − 1)/2 + cj) = 2c / (j + 2c − 1), (2.13)

independently of the number of inactive blocks. Thus

X_j^n := 1_{{deactivation at τ_{j−1}^n}}, j = 2, . . . , n, j < ∞,

are independent Bernoulli random variables with respective parameters 2c/(j + 2c − 1), j = 2, . . . , n. Note that X_j^n depends on j, but the random variable is independent of the random variable τ_{j−1} due to the memorylessness of the exponential distribution. Now define A_t^n as the (random) number of deactivations up to time t ≥ 0, that is, for n ∈ ℕ ∪ {ℵ₀},

A_t^n := ∑_{j=2}^{n} X_j^n 1_{{τ_{j−1}^n ≤ t}}. (2.14)

For n ∈ ℕ, since λ_j ≥ j(j − 1)/2, it follows from a comparison with the block-counting process of the Kingman coalescent, denoted by (|Π̃_t^n|)_{t≥0} if started in n blocks, that for all t ≥ 0,

lim_{n→∞} P(τ_{⌊log n − 1⌋}^n ≤ t) ≥ lim_{n→∞} P(|Π̃_t^n| ≤ ⌊log n − 1⌋) ≥ lim_{n→∞} P(|Π̃_t| ≤ log n − 1) = 1,

where the last equality follows from the fact that the Kingman coalescent (Π̃_t)_{t≥0} comes down from infinity, cf. Theorem 2.9. For t ≥ 0,

P(A_t^n ≥ ∑_{j=log n}^{n} X_j^n) ≥ P(1_{{τ_{log n − 1}^n ≤ t}} ∑_{j=log n}^{n} X_j^n ≥ ∑_{j=log n}^{n} X_j^n) (2.15)
≥ P(τ_{log n − 1}^n ≤ t), (2.16)

and hence

lim_{n→∞} P(A_t^n ≥ ∑_{j=log n}^{n} X_j^n) = 1. (2.17)

Note that, due to (2.13),

E[∑_{j=log n}^{n} X_j^n] = ∑_{j=log n}^{n} 2c/(j + 2c − 1) = 2c(log n − log log n) + R(c, n), (2.18)

where R(c, n) converges to a finite value depending on the seed-bank intensity c as n → ∞. Since the X_j^n are independent Bernoulli random variables, we obtain for the variance

V[∑_{j=log n}^{n} X_j^n] = ∑_{j=log n}^{n} V[X_j^n] = ∑_{j=log n}^{n} 2c/(j + 2c − 1) (1 − 2c/(j + 2c − 1)) ≤ 2c log n. (2.19)

For any ε > 0 we can choose n large enough such that E[∑_{j=log n}^{n} X_j^n] ≥ (2c − ε) log n holds, which yields

P(∑_{j=log n}^{n} X_j^n < c log n) ≤ P(∑_{j=log n}^{n} X_j^n − E[∑_{j=log n}^{n} X_j^n] < −(c − ε) log n)
≤ P(|∑_{j=log n}^{n} X_j^n − E[∑_{j=log n}^{n} X_j^n]| > (c − ε) log n)
≤ 2c / ((c − ε)² log n), (2.20)

by Chebyshev's inequality. In particular, for any κ ∈ ℕ,

lim_{n→∞} P(∑_{j=log n}^{n} X_j^n < κ) = 0,

and together with (2.17) we obtain for any t > 0

lim_{n→∞} P(A_t^n < κ) = 0. (2.21)

Since the (A_t^n)_{t≥0} are coupled by construction for any n ∈ ℕ ∪ {ℵ₀}, we know in particular that P(A_t^∞ < κ) ≤ P(A_t^n < κ) for any n ∈ ℕ, t ≥ 0, κ ≥ 0, and therefore P(A_t^∞ < κ) = 0, which yields

∀t ≥ 0 : P(A_t^∞ = ∞) = 1. (2.22)

Since, in addition, (A_t^∞)_{t≥0} is non-decreasing in t, we can even conclude

P(∀t ≥ 0 : A_t^∞ = ∞) = 1. (2.23)

Thus we have proven that, for any time t ≥ 0, there have been infinitely many movements to the seed-bank P-a.s. It remains to show that this also implies the presence of infinitely many lineages in the seed-bank, i.e. that a sufficiently large proportion is saved from moving back to the plants, where it would be 'instantaneously' reduced to a finite number by the coalescence mechanism. Define B_t to be the set of blocks that visited the seed-bank at some point before a fixed time t ≥ 0 and were visible in the 'white' seed-bank coalescent, i.e.

B_t := {B ⊆ ℕ | ∃ 0 ≤ r ≤ t : B^{s} ∈ Π_r^{(ℵ₀,0)} and B contains at least one white lineage}.

Since we started our colored coalescent in (ℵ₀, 0), the cardinality of B_t is at least equal to A_t^∞ and therefore we know P(|B_t| = ∞) = 1. Since B_t is countable, we can enumerate its elements as B_t = ⋃_{n∈ℕ} {B_t^n} and use this to define the finite sets B_t^{(n)} := {B_t^1, . . . , B_t^n} for all n ∈ ℕ. Since B_t is infinite P-a.s., these B_t^{(n)} exist for any n, P-a.s. Now observe that the following inequalities hold even pathwise by construction:

M̄_t^{(ℵ₀,0)} ≥ ∑_{B∈B_t} 1_{{B^{s} ∈ Π_t^{(ℵ₀,0)}}} ≥ ∑_{B∈B_t^{(n)}} 1_{{B^{s} ∈ Π_t^{(ℵ₀,0)}}},

and therefore the following holds for any κ ∈ ℕ:

P(M̄_t^{(ℵ₀,0)} ≤ κ) ≤ P(∑_{B∈B_t^{(n)}} 1_{{B^{s} ∈ Π_t^{(ℵ₀,0)}}} ≤ κ) ≤* ∑_{i=0}^{κ} \binom{n}{i} (e^{−cKt})^i (1 − e^{−cKt})^{n−i} → 0 as n → ∞,


which in turn implies P(M̄_t^{(ℵ₀,0)} = ∞) = 1. In ∗ we used that for each of the n blocks in B_t^{(n)} we know P(B^{s} ∈ Π_t^{(ℵ₀,0)}) ≥ e^{−cKt}, and that they leave the seed-bank independently of each other, which implies that the sum stochastically dominates a Binomial random variable with parameters n and e^{−cKt}. Since the probability on the left does not depend on n, and the above holds for any κ ∈ ℕ, we obtain P(M̄_t^{(ℵ₀,0)} = ∞) = 1 for all t > 0. Note that this also implies P(M̄_t^{(ℵ₀,0)} + N̄_t^{(ℵ₀,0)} = ∞) = 1 for all t > 0, from which, through the monotonicity of the sum, we can immediately deduce the stronger statement

P(∀t > 0 : M̄_t^{(ℵ₀,0)} + N̄_t^{(ℵ₀,0)} = ∞) = 1.

On the other hand, we have seen that P(N̄_t^{(ℵ₀,0)} < ∞) = 1 for all t > 0, which, again using monotonicity, yields P(∀t > 0 : N̄_t^{(ℵ₀,0)} < ∞) = 1. Putting these two results together, we obtain P(∀t > 0 : M̄_t^{(ℵ₀,0)} = ∞) = 1. □

2.5.2 Bounds on the time to the most recent common ancestor

In view of the previous subsection, it is now quite obvious that the seed-bank causes a relevant delay in the time to the most recent common ancestor of finite samples. Throughout this section, we will again use the notation (N_t^{(n,m)}, M_t^{(n,m)})_{t≥0} to indicate that the initial condition of the block-counting process is (n, m) ∈ ℕ₀ × ℕ₀ (in this section we consider only finite n and m).

Definition 2.41. We define the time to the most recent common ancestor of a sample of n plants and m seeds to be

T_MRCA[(n, m)] := inf{t > 0 | (N_t^{(n,m)}, M_t^{(n,m)}) = (1, 0)}.

This notion obviously only makes sense for (n, m) ≠ (0, 0), and thus we agree to assume this for our initial conditions even when not explicitly mentioned. Since coalescence only happens in the plants, T_MRCA[(n, m)] = inf{t > 0 | N_t^{(n,m)} + M_t^{(n,m)} = 1}, unless we start in (0, 1).

Notation 2.42 We will mostly be interested in the case where the sample is drawn from plants only, and write T_MRCA[n] := T_MRCA[(n, 0)].

The main results of this section are asymptotic logarithmic bounds on the expectation of T_MRCA[n].

Theorem 2.43. For all c, K ∈ (0, ∞), the seed-bank coalescent satisfies

E[T_MRCA[n]] ≍^w log log n. (2.24)


Here, the symbol ≍^w denotes weak asymptotic equivalence of sequences, meaning that we have

liminf_{n→∞} E[T_MRCA[n]] / log log n > 0, (2.25)

and

limsup_{n→∞} E[T_MRCA[n]] / log log n < ∞. (2.26)

The proof of Theorem 2.43 will be given in Proposition 2.46 and Proposition 2.51. The intuition behind this result is the following. The time until a seed gets involved in a coalescence event is much longer than the time it takes for a plant to be involved in a coalescence, since a seed has to become a plant first. Thus the time to the most recent common ancestor of a sample of n plants is governed by the number of individuals that become seeds before coalescence, and by the time to coalescence of a sample of seeds. Due to the quadratic coalescence rates, it is clear that the time until the ancestral lines of all sampled plants have either coalesced into one or have entered the seed-bank at least once is finite almost surely. The number of lines that enter the seed-bank until that time is a random variable that is asymptotically of order log n, due to considerations similar to (2.18). Thus we need to control the time to the most recent common ancestor of a sample of O(log n) seeds. The linear rate of migration then leads to the second log. Turning this reasoning into bounds requires some more work, in particular for the upper bound.

Notation 2.44 As in the proof of Theorem 2.36, let X_k, k = 1, . . . , n, denote independent Bernoulli random variables with parameters 2c/(k + 2c − 1). Similar to (2.14), define

A^n := ∑_{k=2}^{n} X_k. (2.27)
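To get a feeling for the log log n scale in (2.24), one can estimate E[T_MRCA[n]] by Monte Carlo, reusing the simulate_block_counting sketch given after Definition 2.22 (an illustration only, of course, not a proof; the snippet assumes that sketch is in scope):

```python
import math

# assumes simulate_block_counting from the sketch in Section 2.3.2 is in scope
for n in (10, 100, 1000):
    est = sum(simulate_block_counting(n, 0, c=1.0, K=1.0) for _ in range(500)) / 500
    print(n, round(est, 2), round(est / math.log(math.log(n)), 2))
```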

Lemma 2.45. Under our assumptions, for any ϵ > 0,

n lim P(A ≥ (2c + ϵ) log n) = 0 n→∞ and n lim P(A ≤ (2c − ϵ) log n) = 0. n→∞ Proof. As in the proof of Theorem 2.36 before we have n 2c [An] = ∑ = 2c log n + R′(c, n), E k + 2c − 1 k=2 96 2.5. Properties of the seed-bank coalescent where R′(c, n) converges to a finite value depending on c as n → ∞, and

V(A^n) ∼ 2c log n as n → ∞. Thus, again by Chebyshev's inequality, for sufficiently large n (and recalling that c is our model parameter),

P(A^n ≥ (2c + ε) log n) ≤ P(A^n − E[A^n] ≥ ε log n)
≤ P(|A^n − E[A^n]| ≥ ε log n)
≤ 2c / (ε² log n).

This proves the first claim. The second statement follows similarly, cf. (2.20). □

Recall the process (N̄_t, M̄_t)_{t≥0} from Notation 2.39. The coupling of Proposition 2.40 leads to the lower bound in Theorem 2.43.

Proposition 2.46. For all c, K ∈ (0, ∞), the seed-bank coalescent satisfies

liminf_{n→∞} E[T_MRCA[n]] / log log n > 0. (2.28)

Proof. The coupling with (N̄_t, M̄_t)_{t≥0} yields

T_MRCA[n] ≥ T̄_MRCA[n],

where T̄_MRCA[n] denotes the time until (N̄_t, M̄_t)_{t≥0}, started at (n, 0), has reached a state with only one block left. By definition, A^n of the previous lemma gives the number of individuals that at some point become seeds in the process (N̄_t, M̄_t)_{t≥0}. Thus T̄_MRCA[n] is bounded from below by the time it takes until these A^n seeds migrate to plants (and then disappear). Since the seeds disappear independently of each other, we can bound T̄_MRCA[n] stochastically from below by the extinction time of a pure death process with death rate cK started with A^n individuals. For such a process started at A^n = l ∈ ℕ individuals, the expected extinction time is of order log l as l → ∞. Thus, for ε > 0, there exists C > 0 such that

E[T_MRCA[n]] ≥ E[T_MRCA[n] 1_{{A^n ≥ (2c−ε) log n}}] ≥ C log log n · P(A^n ≥ (2c − ε) log n),

and the claim follows since, by Lemma 2.45, P(A^n ≥ (2c − ε) log n) → 1 as n → ∞. □

97 Chapter 2. A novel seed-bank model

To prove the corresponding upper bound, we couple (Nt,Mt)t≥0 to a func- tional of another type of colored process.

Definition 2.47. Let (N t, M t)t≥0 be the continuous time Markov process taking values in N0 × N0, characterized by the transition rates:

⎪⎧(n − 1, m + 1) at rate cn, ⎪ (n, m) ↦ ⎨(n + 1, m − 1) at rate cKm, ⎪ ⎪ n ⎪(n − 1, m) at rate ( ) ⋅ 1 √ (n, m). ⎩ 2 {(n,m)∈N0×N0∣n≥ n+m}

This means that (N t, M t)t≥0 has the same transitions as (Nt,Mt)t≥0, but coalescence is suppressed if there are too few plants relative to the number√ of seeds. The effect of this choice of rates is that for (N t, M t)t≥0, if n ≳ m, then coalescence happens at a rate which is of the same order as the rate of migration from seed to plant.

Lemma 2.48. The processes (N t, M t)t≥0 and (Nt,Mt)t≥0 can be coupled such that

(n,m) (n,m) (n,m) (n,m) P (∀t ≥ 0 ∶ Nt ≤ N t and Mt ≤ M t ) = 1.

Proof. We construct both processes from the same system of blocks, again using an additional flag of color. However, this time the flag will be attached to the blocks, not the individuals. The following strategy is reminiscent of an alternative description of a coalescent using Poisson processes to mark the random events of coalescence, moving in and out of the seed-bank. Start with n + m blocks labeled from {1, ..., n + m}, and with n of them carrying an s-flag, the others a p-flag. Let Si,P i, i = 1, ..., n+m and V i,j, i, j = 1, ..., n + m, i < j be independent Poisson processes, Si with parameter cK, P i with parameter c, and V i,j with parameter 1. Moreover, let each block carry a color flag, blue or white. At the beginning, all blocks are assumed to be blue. The blocks evolve as follows: At an arrival of Si, if block i carries an s-flag, this flag is changed to p irrespective of its color. Similarly, at an arrival of P i, if block i carries a p-flag, this is changed to an s-flag irrespective of its color. At an arrival of V i,j, and if blocks i and j both carry a p-flag, one observes the whole system, and proceeds as follows:

(i) If the total number of p-flags in the system is greater or equal tothe square root of the total number of blocks, then blocks i and j coalesce, which we encode by saying that the block with the higher label (i or j) is discarded. If the coalescing blocks have the same color, this color is kept. Note that here the blocks carry the color, unlike in the colored

98 2.5. Properties of the seed-bank coalescent

process of the previous sections, where the individuals were colored. If the coalescing blocks have different colors, then the color after the coalescence is blue.

(ii) If the condition on the number of flags in (i) is not satisfied, then there is no coalescence, but if both blocks were colored blue, then the block (i or j) with the higher label is colored white (this can be seen as a “hidden coalescence” in the process where colors are disregarded).

A representation of this process is given in Figure 2.12. It is then clear by observing the rates that (Nt,Mt)t≥0 is given by the process which counts at any time t the number of blue blocks with p-flags resp. s-flags only, and

(N t, M t)t≥0 is obtained by counting the number of blocks with p-flags resp. block with s-flags of any color. By construction we obviously have N t ≥ Nt and M t ≥ Mt for all t P-a.s. as in the claim.

Notation 2.49 Define now

(0,m) (0,m) T MRCA[m] ∶= inf {t ≥ 0 ∣ (N t , M t ) = (1, 0)}.

Lemma 2.50. There exists a finite constant C independent of m such that

E[T MRCA[m]] ≤ C log m.

Proof. Define for every k ∈ 1, 2, ..., m − 1 the hitting times

Hk ∶= inf{t > 0 ∶ N t + M t = k}. (2.29)

0,m C 0,m C We aim at proving that [Hm 1] ≤ √ and [Hj 1 − Hj] ≤ for j ≤ E − m E − j−1 m−1, for some 0 < C < ∞. Here and throughout the proof, C denotes a generic positive constant (independent of m) which may√ change√ from instance to instance. To simplify notation, we will identify j with ⌈ j⌉, or equivalently assume that all occuring square roots are natural numbers. Moreover, we will only provide the calculations in the case of the standard seedbank-coalescent, that is, c = K = 1. The reader is invited to convince herself that the argument can also be carried out in the general case.

99 Chapter 2. A novel seed-bank model

Figure 2.12: A realization of the colored k-seed-bank coalescent (for k = 10). The continuous lines indicate active lineages, the dashed lines dormant lineages. Arrival times of the Poisson processes of type P are marked by a black circle, those of type S by a white circle and those of V by stars. White stars indicate instances that could not result in a coalescence due to the additional restrictions on the relative number of plants. All lineages are colored blue in the beginning. After such an event, though, the lineage with the higher label will be colored white. Thus this event looks like a coalescence to the process tracing only the blue lines. When lineages merge, they obtain the darker color. The dashed arrow indicates the direction of time t of the process, the black arrow that of real-life time.

We write λt for the total jump rate of the process (N t, M t)t≥0 at time t, that is, N t λt = ( )1 √ + N t + Mt, 2 {N t≥ Nt+Mt} and set N ( t)1 √ 2 {N t≥ N t+M t} N t M t αt ∶= , βt ∶= , γt ∶= λt λt λt

100 2.5. Properties of the seed-bank coalescent for the probabilities that the first jump after time t is a coalescence, a migra- tion from plant to seed or a migration from seed to plant, respectively. Even though all these rates are now random, they are well-defined conditional on the state of the process. The proof will√ be carried out in three steps. Step 1: Bound on the time to reach m plants. Let

(0,m) √ Dm ∶= inf{t > 0 ∣ N t ≥ m} (2.30) √ denote the first time the number of plants is at least m. Due to the re- 0,m 0,m striction in the coalescence rate, the process (N ( ), M ( )) has to first √ t t t≥0 reach a state with at least m plants before being able to coalesce, hence D < H a.s. Therefore, for any t ≥ 0, conditional on t ≤ D we have m m−1 √ √ m λt = m and N t < m. Thus Mt > m− m a.s. and we note that at each jump time of (N t, M t)t≥0 for t ≤ Dm √ m − m 1 γ ≥ = 1 − √ a.s. ∀s ≤ t s m m and 1 β ≤ √ a.s. ∀s ≤ t. s m

The expected number of jumps of the process (N t, M t)t≥0 until Dm is there- fore bounded from above by the expected time it takes a discrete√ time asym- metric simple random√ walk started at 0 with probability 1 − √1/ m for an upward jump and 1/ m for a downward jump to reach level m − 1. It is a well-known fact√ (see for example [28], Ch. XIV.3) that this expectation is bounded by C m for some C ∈ (0, ∞). Since the time between each of the jumps of (N t, M t), for t < Dm, is exponential with rate λt = m, we get

√ 1 C 0,m[D ] ≤ C m ⋅ = √ . (2.31) E m m m √ Step 2: Bound on the time to the first coalescence after reaching m √m √ √ plants. At time t = Dm, we have λ = ( 2 ) + m + m − m, and thus √ √ m 2 m C β = = √ ≤ √ a.s., t √m 3m − m m ( 2 ) + m and √ m − m 1 1 α = √ ≥ (1 − √ ) a.s. t 3m − m 3 m

101 Chapter 2. A novel seed-bank model

Denote by Jm the time of the first jump after time Dm. At Jm there is either a coalescence taking place (thus reaching a state with m − 1 individuals and hence in that case Hm−1 = Jm), or a migration. In order to obtain an upper bound on Hm−1, as a “worst-case scenario”, we can assume that if there is no coalescence at Jm, the process is restarted from√ state (0, m), and then run again until the next time that there are at least m plants (hence after Jm, the time until this happens is again equal in distribution to Dm). If we proceed like this, we have that the number of times that the process is restarted is stochastically dominated by a geometric random variable with 1 ( − 1 ) parameter 3 1 √m , and since 1 C 0,m[J − D ] = λ−1 = ≤ , E m m Dm √m m ( 2 ) + m we can conclude (using (2.31)) that √ 3 m 0,m[H ] ≤ 0,m[J ]√ E m−1 E m m − 1 √ 3 m = ( 0,m[D ] + 0,m[J − D ])√ E m E m m m − 1 C ≤ √ . (2.32) m Step 3: Bound on the time between two coalescences. Now we want to estimate 0,m[H − H ] for j ≤ m − 1. Obviously at time H −, for j ≤ m − 1, E j−1√ j j there are at least j + 1 plants, since Nt + Mt can decrease only through a coalescence. Therefore (keeping in mind our√ convention√ that Gauß-brackets are applied if necessary, and hence N ≥ j + 1−1 ≥ j −1 holds) we obtain √ Hj √ N Hj ≥ j − 1. Hence either we have N Hj ≥ j and coalescence is possible in √ √ j− j 1 the first jump after H , or N = j − 1, in which case γ ≥ = 1 − √ , j Hj Hj j j meaning that if coalescence is not allowed at Hj, with probability at least − 1 1 √j it will be possible after the first jump after reaching Hj. Thus the probability that coalescence is allowed either at the first or the second jump − 1 after time Hj is bounded from below by 1 √j . Assuming that coalescence is possible at the first or second √jump after Hj, denote by Lj the time to either the first jump after Hj if N Hj ≥ j, or the time of the second jump after Hj otherwise. Then in the same way as ≥ − C before, we see that αLj 1 √j . Thus the probability that Hj−1 is reached ( − C )2 no later than two jumps after Hj is at least 1 √j . Otherwise, in the case where there was no coalescence at either the first or the second jump after

102 2.5. Properties of the seed-bank coalescent

Hj, we can obtain an upper bound on Hj−1 by restarting the process from state (0, j). The probability that the process is restarted is thus bounded C ( ) from above by √j . We know from equation (2.32) that if started in 0, j , 0,j[ ] ≤ C ≥ there is E Hj−1 √j . Noting that λHj j, and we need to make at most 0,m two jumps, we have that E [Lj] ≤ 2/j. Thus we conclude

0,m 0,m C 2 C 0,j E [Hj 1 − Hj] ≤ E [Lj](1 − √ ) + √ E [Hj 1] − j j − 2 C C 2 ≤ (1 − √ ) + (√ ) j − 1 j j C ≤ . (2.33) j − 1

These three bounds allow us to finish the proof, since when starting (N t, M t)t≥0 in state (0, m) our calculations show that

m−1 0,m E[T MRCA[m]] = E [H1] = E[Hm−1] + ∑ E[Hj−1 − Hj] j=2 C m−1 1 ≤ √ + C ∑ ∼ C log m (2.34) j − 1 m j=2 as m → ∞. This allows us to prove the upper bound corresponding (qualitatively) to the lower bound in (2.28).

Proposition 2.51. For c, K ∈ (0, ∞), the seed-bank coalescent satisfies

E[TMRCA[n]] lim sup < ∞. (2.35) n→∞ log log n Proof. Assume that the initial n individuals in the sample of the process (n) (Πt )t≥0 are labelled 1, ..., n. Let

Sr ∶= {k ∈ ⟦n⟧ ∶ ∃0 ≤ t ≤ r ∶ k belongs to an s-block at time t} denote those lines that visit the seed-bank at some time up to t. Let

n c ϱ ∶= inf{r ≥ 0 ∶ ∣Sr ∣ = 1} be the first time that all those individuals which so far had not entered the seed-bank have coalesced. Note that ϱn is a stopping time for the process

103 Chapter 2. A novel seed-bank model

(n) (n,0) (n,0) (Πt )t≥0, and Nϱn and Mϱn are well-defined as the number of plant (n) n blocks, resp. seed blocks of Πϱn . By a comparison of ϱ to the time to the most recent common ancestor of Kingman’s coalescent cf. [109], E[ϱn] ≤ 2 for any n ∈ N, and thus

(n,0) (n,0) E[TMRCA[(n, 0)]] ≤ 2 + E[TMRCA[(Nϱn ,Mϱn )]] (n,0) (n,0) ≤ 2 + E[TMRCA[(0,Nϱn + Mϱn )]], (2.36) where the last inequality follows from the fact that every seed has to become a plant before coalescing. Recall An from (2.27) and observe that

(n,0) (n,0) n Nϱn + Mϱn ≤ A + 1 stochastically. (2.37)

This follows from the fact that for every individual, the rate at which it is involved in a coalescence is increased by the presence of other individuals, while the rate of migration is not affected. Thus by coupling (Nt,Mt)t≥0 to a system where individuals, once having jumped to the seed-bank, remain n there forever, we see that Nϱn + Mϱn is at most A + 1. By the monotonicity of the coupling with (N t, M t), we thus see from (2.36), for ϵ > 0,

n E[TMRCA[n]] ≤ 2 + E[T MRCA[A + 1]] n 1 = 2 + E[T MRCA[A + 1] {An≤(2c+ϵ) log n}] n 1 + E[T MRCA[A + 1] {An>(2c+ϵ) log n}]. (2.38)

From Lemma 2.50 we obtain

n 1 E[T MRCA[A + 1] {An≤(2c+ϵ) log n}] ≤ C log(2c − ϵ) log n ≤ C log log n, and since An ≤ n in any case, we get

n 1 n E[T MRCA[A + 1] {An>(2c+ϵ) log n}] ≤ C log n ⋅ P(A > (2c + ϵ) log n) ≤ C.

This completes the proof.

Remark 2.52. In the same manner as in the proof of Theorem 2.43, one can show that for any a, b ≥ 0,

w E[TMRCA[an, bn]] ≍ log ( log(an) + bn).

104 2.5. Properties of the seed-bank coalescent

2.5.3 Recursions for important values

Although the seed-bank coalescent is surely an interesting structure from a mathematical point of view, its motivation comes from real-life applications, as explained in the introduction. However, the aim is not to only take inspi- ration from biological questions for new mathematical concepts, but to then in return contribute to answers providing a model that suitably describes the biological phenomena. A quality feature for a model from the point of view of applications is the suitability of its parameters for simulations, for example, if they can be described by recursive formulae. Several such recursions for quantities of interest were given in the context of an even more general model that includes mutation and death in the seed-bank in [6] together with an extended analysis of important statistical features. However, these are less interesting from an analytical point of view and would require a lengthy introduction. Therefore we will mention here only the results directly ap- plicable to the seed-bank coalescent as given in Definition 2.31 with a short explication of the notions when required. The basic proofs were moved to 2.6.2 for better readability.

Time to the most recent common ancestor

One obvious candidate for simulations is the time to the most recent com- mon ancestor as defined in Definition 2.41. We saw in Theorem 2.43 that the expected time to the most recent common ancestor for the seedbank coalescent, if started in a sample of active individuals of size n, is of order O(log log n), in stark contrast to the corresponding quantity for the classical Kingman coalescent, which is bounded by 2, uniformly in n, cf. Theorem 2.8. This is relevant, as it already indicates that one should expect elevated levels of (old) genetic variability under the seed-bank coalescent, matching the intuition that a seed-bank helps preserve genetic variability. Here, it should also be noted that the time to the most recent common ancestor of the Bolhausen-Sznitman coalescent is also O(log log n) [43]. The Bolthausen- Sznitman coalescent is often used as a model for selection12, cf. e.g. [85]. While the above result shows the asymptotic behavior of the time to the most recent common ancestor for large numbers n of individuals, it does not give precise information for the exact absolute value, in particular for ‘small to medium’ n. Thus it is important that we provide recursions for its expected value and variance that can be computed efficiently. Recall

12Consult footnote 6 on page 56 for an explanation of the term.

105 Chapter 2. A novel seed-bank model

Definition 2.41: For an initial value (n, m) ∈ N0 × N0 ∖ {(0, 0)}

(n,m) (n,m) TMRCA[(n, m)] = inf{t > 0 ∣ (Nt ,Mt ) = (1, 0)}.

Notation 2.53 To make the structure in the formulae more visible, we abbreviate

tn,m ∶= E[TMRCA[(n, m)]], vn,m ∶= V(TMRCA[(n, m)]) as well as n λ ∶= ( ) + cn + cKm, (2.39) n,m 2 and

n (2) cn cKm αn,m ∶= , βn,m ∶= , γn,m ∶= (2.40) λn,m λn,m λn,m and set λn,m = αn,m = βn,m = γn,m = 0 for any of the cases with −1 ∈ {n, m} or (n, m) = (0, 0). The reader will have recognized in 2.40 the transition probabilities of the (n,m) (n,m) jump-chain corresponding to the block-counting process (Nt ,Mt )t≥0.

Proposition 2.54. Let n, m ∈ N0 such that n + m ≥ 2. Then we have the following recursive representations for the expectation and the variance of the time to the most recent common ancestor of a sample of size (n, m)

−1 tn,m = λn,m + αn,mtn−1,m + βn,mtn−1,m+1 + γn,mtn+1,m−1,

−2 vn,m = λn,m + αn,mvn−1,m + βn,mvn−1,m+1 + γn,mvn+1,m−1 + α t2 + β t2 + γ t2 n,m n−1,m n,m n−1,m+1 n,m n+1,m−1 2 − (αn,mtn−1,m + βn,mtn−1,m+1 + γn,mtn+1,m−1) ,

1 2 with initial conditions t1,0 = v1,0 = 0 and t0,1 = (cK)− , v0,1 = (cK)− . Before we move to the next recursion, let us discuss an immediate obser- vation. Solving the linear system, we obtain 2 1 t = 1 + + . (2.41) 2,0 K K2 From this equation we observe that the time to the most recent common ancestor t2,0 of a sample of two plants (and no seeds) is constant in the rate of

106 2.5. Properties of the seed-bank coalescent exchange c between the plant and the seed population. In particular, for c → 0 it does not converge to 1 - the corresponding value for the Kingman coalescent (Theorem 2.8), which describes the case of no seed-bank. This effect can also be observed in the structured coalescent13 with two islands if the migration rate tends to 0, cf. [83]. On the other hand, we do indeed recover the value of the Kingman case when the relative seed-bank size tends to 0, i.e. for K → ∞. As illustrated in Table 2.1, this latter effect also appears for larger sample sizes. Here we also find confirmation of the following intuition: Lineages that are in the seed-bank cannot coalesce, thus a larger seed-bank (i.e. a smaller K) leads no an increased tn,0. At the same time, for the same reason, a larger c meaning a faster exchange rate between the states, has an attenuating effect on this increment.

We can also give a heuristic explanation for the fact that t2,0 = 4 in case K = 1 and c very large: Then transitions between the active and the dormant state occur very fast. Therefore, at any given time, the probability of a line being active is around 1/2, whence we obtain the probability of the two lineages being active and thus able to coalesce to be 1/4. We therefore conjecture that for K = 1 and c → ∞ the genealogy of a sample is given by a time-change by a factor 4 of the Kingman coalescent.

Table 2.1: Values of the expected time to most recent common ancestor tn,0 of the seed-bank coalescent, obtained from Proposition 2.54 for relative seed- bank size K, sample size n and dormancy initiation rates c. For comparison, the respective values for the Kingman coalescent are added. The values in the first table with K = 0.01 need to be multiplied by 104.

K = 0.01, ×104 K = 1 K = 100 sample size n sample size n sample size n c 2 10 100 2 10 100 2 10 100 0.01 1.02 2.868 5.185 4 10.21 17.18 1.02 1.846 2.052 0.1 1.02 2.731 4.487 4 9.671 14.97 1.02 1.838 2.026 1 1.02 2.187 2.666 4 8.071 10.02 1.02 1.836 2.02 10 1.02 1.878 2.085 4 7.317 8.221 1.02 1.836 2.02 100 1.02 1.84 2.026 4 7.212 7.954 1.02 1.836 2.02

The values in the Kingman case (K = ∞, no c): 1 1.80 1.98

13Explained in Section 2.4.1

107 Chapter 2. A novel seed-bank model

Length of the genealogical tree

In a model with mutation the number of segregating sites – positions showing differences between related genes – and the number of singletons – mutations that can only be found in one individual – are relevant indicators of genetic variability. In order to calculate them, we need the total tree length and the length of external branches of the genealogical tree, which is the graph of the seed-bank coalescent up to the time of the most recent common ancestor, as we considered, for example, in Figure 2.9 on page 84. Using common graph- terminology, the outer branches of the tree are the bonds between a leaf and an inner node (where the root counts as inner node). Inner branches are all others. See Figure 2.13 for an explanation.

Figure 2.13: This is the realization of a k-seed-bank coalescent from Figure 2.9. The outer branches are highlighted in red while the inner branches are left black.

Notation 2.55 Let L(p) and L(s) be the total length of all branches (n,m) (n,m) while they are in the active, resp. the dormant state given the coalescent was

108 2.5. Properties of the seed-bank coalescent

started with n ∈ N0 active and m ∈ N0 dormant individuals and abbreviate

lp ∶= [L(p) ], ls ∶= [L(s) ], n,m E (n,m) n,m E (n,m)

wp ∶= [L(p) ], ws ∶= [L(s) ], wp,s ∶= [L(p) ,L(s) ]. n,m V (n,m) n,m V (n,m) n,m E (n,m) (n,m)

The following Proposition was stated incorrectly in [6].

Proposition 2.56. Let n, m ∈ N0 such that n + m ≥ 2. We then have the following recursive representations for the expectation, the variance and the second mixed moments of the total tree length of a sample of size (n, m)

lp = nλ−1 + α lp + β lp + γ lp (2.42) n,m n,m n,m n−1,m n,m n−1,m+1 n,m n+1,m−1

ls = mλ−1 + α ls + β ls + γ ls , (2.43) n,m n,m n,m n−1,m n,m n−1,m+1 n,m n+1,m−1 and

wp = n2λ−2 + α wp + β wp + γ wp n,m n,m n,m n−1,m n,m n−1,m+1 n,m n+1,m−1 + α (lp )2 + β (lp )2 + γ (lp )2 n,m n−1,m n,m n−1,m+1 n,m n+1,m−1 2 − (α lp + β lp + γ lp ) , n,m n−1,m n,m n−1,m+1 n,m n+1,m−1

ws = m2λ−2 + α ws + β ws + γ ws n,m n,m n,m n−1,m n,m n−1,m+1 n,m n+1,m−1 + α (ls )2 + β (ls )2 + γ (ls )2 n,m n−1,m n,m n−1,m+1 n,m n+1,m−1 2 − (α ls + β ls + γ ls ) , n,m n−1,m n,m n−1,m+1 n,m n+1,m−1

wp,s = 2nmλ−2 + mλ−1 (α lp + β lp + γ lp ) n,m n,m n,m n,m n−1,m n,m n−1,m+1 n,m n+1,m−1 + nλ−1 (α ls + β ls + γ ls ) n,m n,m n−1,m n,m n−1,m+1 n,m n+1,m−1 + α wp,s + β wp,s + γ wp,s , n,m n−1,m n,m n−1,m+1 n,m n+1,m−1 p p p p s s −1 s with initial conditions l1,0 = l1,0 = l0,1 = 0, l0,1 = (cK) , w1,0 = w1,0 = w0,1 = 0, s p,s p,s w0,1 = (cK)−2 and w1,0 = w0,1 = 0. Observe that equations (2.42) and (2.43) differ in the factor (n resp. m) −1 before λn,m. From quantities calculated above, we recover the expected total tree length given the coalescent is started with n plants and m seeds as p s n,m p s p,s p s ln,m + ln,m. Moreover, Cov (L ,L ) = wn,m − wn,mwn,m.

109 Chapter 2. A novel seed-bank model

We calculate 2 2 2 lp = 1 + , ls = + . 2,0 K 2,0 K K2

p s and observe that l2,0 and l2,0 are independent of c as also seen for t2,0 in (2.41). Our next goal is to derive recursions for the expected length of external branches in the active, respectively in the dormant state. This is a little more involved, since obviously a coalescence now may happen between either two external active branches, two internal active branches, or an external and an internal active branch. Notation 2.57 We use indices (n, n′, m, m′) ∈ N0 ×N0 ×N0 ×N0 to denote the number of external active branches, internal active branches, external dormant branches, and internal dormant branches, respectively. Abbreviate

n n′ λ ′ ′ ∶= ( ) + ( ) + nn′ + cn + cn′ + cKm + cKm′, n+n ,m+m 2 2

n n′ ( ) ( ) ′ (1) 2 (2) 2 (3) nn α ′ ′ ∶= , α ′ ′ ∶= , α ′ ′ ∶= , n,n ,m,m ′ ′ n,n ,m,m ′ ′ n,n ,m,m ′ ′ λn+n ,m+m λn+n ,m+m λn+n ,m+m

′ (1) cn (2) cn β ′ ′ ∶= , β ′ ′ ∶= , n,n m,m ′ ′ n,n m,m ′ ′ λn+n ,m+m λn+n ,m+m

′ (1) cKm (2) cKm γ ′ ′ ∶= , γ ′ ′ ∶= n,n m,m ′ ′ n,n ,m,m ′ ′ λn+n ,m+m λn+n ,m+m and set their values to 0, if −1 ∈ {n, n′, m, m′}. Let E(p) denote the total lenght of external branches in the plant state, (n,m) and E(s) the total lenght of external branches in the seed state and set (n,m)

ep ∶= [E(p) ] es ∶= [E(s) ]. n,0,m,0 E (n,m) n,0,m,0 E (n,m) Since the situation here is more involved we have summarized the possible events together with their probabilities in Table 2.2.

Proposition 2.58. Let n, n′, m, m′ ∈ N0 such that n+n′ +m+m′ ≥ 2. We then have the following recursive representation for the expectation of the length

110 2.5. Properties of the seed-bank coalescent

Table 2.2: Overview of the notation for Porposition 2.58. We track the seed- bank coalescent differentiating between internal and external lineages. The possible events and respective transitions of the corresponding jump-chain are explained on the right, while their probability is given on the left.

n (1) (2) α ′ ′ = 2 external (active) lineages coalesce n,n ,m,m λn+n′,m+m′ (and thus become an internal lineage) (n, n′, m, m′) ↦ (n − 2, n′ + 1, m, m′) ′ n (2) ( 2 ) α ′ ′ = 2 internal (active) lineages coalesce n,n ,m,m λn+n′,m+m′ (n, n′, m, m′) ↦ (n, n′ − 1, m, m′) ′ (3) nn α ′ ′ = 1 external and 1 internal (active) lineage coalesce n,n ,m,m λn+n′,m+m′ (n, n′, m, m′) ↦ (n − 1, n′, m, m′)

(1) cn β ′ ′ = 1 external active lineages becomes inactive n,n ,m,m λn+n′,m+m′ (n, n′, m, m′) ↦ (n − 1, n′, m + 1, m′) ′ (2) cn β ′ ′ = 1 internal active lineages becomes inactive n,n ,m,m λn+n′,m+m′ (n, n′, m, m′) ↦ (n, n′ − 1, m, m′ + 1)

(1) cKm γ ′ ′ = 1 external inactive lineages becomes active n,n ,m,m λn+n′,m+m′ (n, n′, m, m′) ↦ (n + 1, n′, m − 1, m′) ′ (2) cKm γ ′ ′ = 1 internal inactive lineages becomes active n,n ,m,m λn+n′,m+m′ (n, n′, m, m′) ↦ (n, n′ + 1, m, m′ − 1)

of the external branches of a sample of size (n, n′, m, m′)

p 1 e ′ ′ = nλ− ′ ′ n,n ,m,m n+n ,m+m (1) p (2) p (3) p + α ′ ′ e ′ ′ + α ′ ′ e ′ ′ + α ′ ′ e ′ ′ n,n ,m,m n−2,n +1,m,m n,n ,m,m n,n −1,m,m n,n ,m,m n−1,n ,m,m (1) p (2) p + β ′ ′ e ′ ′ + β ′ ′ e ′ ′ n,n ,m,m n−1,n ,m+1,m n,n ,m,m n,n −1,m,m +1 (1) p (2) p + γ ′ ′ e ′ ′ + γ ′ ′ e ′ ′ n,n ,m,m n+1,n ,m−1,m n,n ,m,m n,n +1,m,m −1 and

s 1 e ′ ′ = mλ− ′ ′ n,n ,m,m n+n ,m+m (1) s (2) s (3) s + α ′ ′ e ′ ′ + α ′ ′ e ′ ′ + α ′ ′ e ′ ′ n,n ,m,m n−2,n +1,m,m n,n ,m,m n,n −1,m,m n,n ,m,m n−1,n ,m,m (1) s (2) s + β ′ ′ e ′ ′ + β ′ ′ e ′ ′ n,n ,m,m n−1,n ,m+1,m n,n ,m,m n,n −1,m,m +1

111 Chapter 2. A novel seed-bank model

(1) s (2) s + γ ′ ′ e ′ ′ + γ ′ ′ e ′ ′ n,n ,m,m n+1,n ,m−1,m n,n ,m,m n,n +1,m,m −1

p p p s s −1 with initial values e1,0,0,0 = e1,0,0,0 = e0,0,1,0 = 0, e0,0,1,0 = (cK) and e0,1,0,0 = s p s e0,1,0,0 = e0,0,0,1 = e0,0,0,1 = 0.

p s ′ ′ Observe that, in particular, e0,n′,0,m′ = e0,n′,0,m′ = 0 for all n , m ∈ N0. As mentioned above, these recursions are specifically relevant in the bio- logical context, as they allow for easy simulations. The recursions presented here were used in [6] to derive further recursions for values of importance to observe the genetic variability. In [6] also deeper statistical investigations were undertaken, but we will not go into further detail about the statistical aspects as they were not done by the author of this thesis.

2.6 Technical results

2.6.1 Convergence of Generators

Proposition 2.59. Assume c = εN = δM and M → ∞, N → ∞. Let

(DN,M )N,M∈N be an array of positive real numbers. Then the discrete genera- N M tor of the allele frequency process (X ,Y )t + on time-scale DN,M ⌊DN,M t⌋ ⌊DN,M t⌋ ∈R is given by

c ∂f c ∂f (AN f)(x, y) = D [ (y − x) (x, y) + (x − y) (x, y) N,M N ∂x M ∂y 1 1 ∂2f + x(1 − x) (x, y) + R(N,M)], N 2 ∂x2 where the remainder term R(N,M) satisfies that there exists a constant C1(c, f) ∈ (0, ∞), independent of N and M, such that

−3/2 −2 −1 −1 −3 ∣R(N,M)∣ ≤ C1(N + M + N M + NM ).

In particular, in the situation where M = O(N) as N → ∞ and DN,M = N we immediately obtain Proposition 2.18.

N M Proof. We calculate the generator of (Xk ,Yk )k≥0 depending on the scaling 3 2 (DN,M )N,M∈N. For f ∈ C ([0, 1] ) we use Taylor expansion in two dimensions

112 2.6. Technical results to obtain

N 1 ∂f N ∂f M (A f)(x, y) = [ (x, y)Ex,y [X1 − x] + (x, y)Ex,y [Y1 − y] DN,M ∂x ∂y 1 ∂2f 2 + (x, y) [(XN − x) ] 2 ∂x2 Ex,y 1 1 ∂2f 2 + (x, y) [(Y M − y) ] 2 ∂y2 Ex,y 1 ∂2f + (x, y) [(XN − x)(Y M − y)] ∂x∂y Ex,y 1 α,β N N M α M β + Ex,y[ ∑ R (X1 ,Y1 )(X1 − x) (Y1 − y) ]] ∈ α,β N0 α+β=3 where the remainder is given by α + β 1 ∂3f Rα,β(x,¯ y¯) ∶= ∫ (1 − t)α+β−1 (x − t(x¯ − x), y − t(y¯ − y))dt α!β! 0 ∂xα∂yβ for anyx, ¯ y¯ ∈ [0, 1]. In order to prove the convergence, we thus need to calculate or bound all the moments involved in this representation. Given Px,y the following holds: By Proposition 2.16 1 XN = (U + Z), 1 N 1 Y M = (yM − Z + V ), 1 M in distribution where U, V and Z are independent random variables such that U ∼ Bin(N − c, x), V ∼ Bin(c, x), Z ∼ Hyp(M, c, yM). Thus we have

Ex,y[U] = Nx − cx, Ex,y[V ] = cx, Ex,y[Z] = cy, and moreover Vx,y(U) = (N − c)x(1 − x). One more observation is that as 0 ≤ V ≤ c and 0 ≤ Z ≤ c, it follows that ∣Z − cX∣ ≤ c and ∣V − Z∣ ≤ c, which implies that for every α ∈ N α α ∣Ex,y[(Z − cX) ]∣ ≤ c , α α ∣Ex,y[(Z − V ) ]∣ ≤ c ,

113 Chapter 2. A novel seed-bank model and for every α, β ∈ N

α β α+β ∣Ex,y[(Z − cX) (V − Z) ]∣ ≤ c (2.44)

We are now prepared to calculate all the mixed moments needed.

1 [XN − x] = [U + Z − Nx] Ex,y 1 N Ex,y 1 1 = [U − Nx + cx] + [Z − cx] N Ex,y N Ex,y c = (y − x) N

Here we used (2.6.1), in particular Ex,y[U − Nx + cx] = Ex,y[U − Ex,y[U]] = 0. Similarly,

1 [Y M − y] = [My + V − Z − My] Ex,y 1 M Ex,y 1 = [V − Z] M Ex,y c = (x − y). M

N 1 1 Noting X1 − x = N (U − Nx + cx) + N (Z − cx) leads to 1 [(XN − x)2] = [(U − Nx + cx)2] Ex,y 1 N 2 Ex,y 2 + [U − Nx + cx] [Z − cx] N 2 Ex,y Ex,y 1 + [(Z − cx)2] N 2 Ex,y 1 1 = [U] + [(Z − cx)2] N 2 Vx,y N 2 Ex,y 1 c 1 = x(1 − x) − x(1 − x) + [(Z − cx)2], N N 2 N 2 Ex,y where c 1 c2 ∣ − x(1 − x) + [(Z − cx)2]∣ ≤ . N 2 N 2 Ex,y N 2 Moreover we have 1 c2 ∣ [(Y M − y)2]∣ = ∣ [(V − Z)2∣ ≤ . Ex,y 1 M 2 Ex,y M 2

114 2.6. Technical results

Using Equation (2.44) we get

1 ∣ [(XN − x)(Y M − y)]∣ ≤∣ [U − xN + cx] [V − Z]∣ Ex,y 1 1 NM Ex,y Ex,y 1 + ∣ [(Z − cx)(V − Z)]∣ NM Ex,y c2 ≤ . NM

We are thus left with the task of bounding the remainder term in the Taylor expansion. Since f ∈ C3([0, 1]2), we can define

∂3f C˜f ∶= max{ (x,¯ y¯) ∣ α, β ∈ , α + β = 3, x,¯ y¯ ∈ [0, 1]} ∂xα∂yβ N0 which yields a uniform estimate for the remainder in the form of

1 ∣Rα,β(x,¯ y¯)∣ ≤ , α!β!C¯f which in turn allows us to estimate

α,β N N N α M β ∣Ex,y[ ∑ R (X1 ,Y1 )(X1 − x) (Y1 − y) ]∣ ∈ α,β N0 α+β=3 1 ≤ ∑ [∣(XN − x)α(Y M − y)β∣]. ¯f Ex,y 1 1 α!β!C ∈ α,β N0 α+β=3

Thus the claim follows if we show that the third moments are all of small enough order in N and M. Observe that for α ∈ {0, 1, 2} we have

α Ex,y[∣(U − Nx + cx)∣ ] ≤ N. (2.45)

For α = 0 this is trivially true, for α = 1 it is due to the fact that the binomial random variable U is supported on 0, ..., N −c and Nx−cx is its expectation, and for α = 2 it follows from the fact that (U − Nx + cx)2 = ∣(U − Nx + cx)2∣ and the formula for the variance of a binomial random variable. For α = 3 it follows e.g. from Lemma 3.1 in [27] that

3 3/2 Ex,y[∣(U − Nx + cx)∣ ] = O(N ). (2.46)

115 Chapter 2. A novel seed-bank model

Thus we get for any 0 ≤ α, β ≤ 3 such that α + β = 3 that

α M β Ex,y[∣(X1 − x) (Y1 − y) ∣] 1 α α = ∑ ( ) [∣(U − Nx + cx)i(Z − cx)α−i(V − Z)β∣] N αM β i Ex,y i=0 1 α α ≤ ∑ ( ) [∣(U − Nx + cx)i∣] [∣(Z − cx)α−i(V − Z)β∣] N αM β i Ex,y Ex,y i=0 1 α α 3(2c)3 α−i+β ≤ ∑ ( )N(2c) 1 1,2,3 (α) + 1 3 (α) N αM β i { } N 3/2 { } i=0 1 1 1 N ≤ C( + + ), NM M 2 N 3/2 M 3 from (2.44), (2.45) and (2.46), where the constant C depends only on c. This completes the proof.

2.6.2 Proofs of recursions For the proofs of the recursions it will be more convenient to return to the habit of thinking of the initial values as given by the measure and not at- tached to the process itself, similar to Notation 2.23. Let us recall the parameters introduced in Notation 2.53:

n λ ∶= ( ) + cn + cKm, n,m 2 and

n (2) cn cKm αn,m ∶= , βn,m ∶= , γn,m ∶= . λn,m λn,m λn,m

Recall the block-counting process of the seed-bank coalescent. It is the continuous-time Markov chain (Nt,Mt)t≥0 introduced in Definition 2.22 on page 77 and the transition probabilities of its jump-chain are given precisely by the parameters above. For the proofes it is sufficient and more convenient to work with this block-counting process instead of the seed-bank coalescent itself.

Proof of Proposition 2.54. Let τ1 be the time of the first jump of this process. Given the block-counting process is started in (n, m) ∈ N0 × N0 ∖ {(0, 0)}, τ1 is an exponential random variable with parameter λn,m. The strong Markov property then allows us to calculate

116 2.6. Technical results

n,m n,m n,m tn,m = E [TMRCA] = E [E [τ1 + TMRCA − τ1 ∣ Fτ1 ]] n,m n,m Nτ ,Mτ = E [τ1] + E [E 1 1 [TMRCA]] −1 n−1,m = λn,m + αn,mE [TMRCA] n−1,m+1 n+1,m−1 + βn,mE [TMRCA] + γn,mE [TMRCA] −1 = λn,m + αn,mtn−1,m + βn,mtn−1,m+1 + γn,mtn+1,m−1, which is the first recursion. Similarly, since we also know that τ1 is indepen- dent of (Nτ1 ,Mτ1 ) together with the law of total variance yields

n,m vn,m = V (TMRCA) n,m n,m Nτ ,Mτ n,m Nτ ,Mτ = V (τ1) + E [V 1 1 (TMRCA)] + V (E 1 1 [TMRCA])

−2 n,m Nτ1 ,Mτ1 n,m Nτ1 ,Mτ1 = λn,m + E [V (TMRCA)] + V (E [TMRCA]).

At the same time

n,m Nτ1 ,Mτ1 E [V (TMRCA)] = αn,mvn−1,m + βn,mvn−1,m+1 + γn,mvn+1,m−1 and

2 n,m Nτ ,Mτ n,m Nτ ,Mτ 2 n,m Nτ ,Mτ V (E 1 1 [TMRCA]) = E [E 1 1 [TMRCA] ] − E [E 1 1 [TMRCA]] = α t2 + β t2 + γ t2 n,m n−1,m n,m n−1,m+1 n,m n+1,m−1 2 − (αn,mtn−1,m + βn,mtn−1,m+1 + γn,mtn+1,m−1) .

Combining the observations we obtain the second recurrence and conmplete the proof.

Proof of Proposition 2.56. Observe that each stretch of time with length t during which the number of n active and m dormant blocks stay constant, contributes with nt to the total length of the active branches, and with mt to the total length of dormant branches. As in the previous proof, let τ1 be the time of the first jump of (Nt,Mt)t≥0. Using the observation and the strong Markov property, we see

p n,m (p) ln,m = E [L ] n,m n,m Nτ ,Mτ (p) = nE [τ1] + E [E 1 1 [L ]] = nλ−1 + α lp + β lp + γ lp , n,m n,m n−1,m n,m n−1,m+1 n,m n+1,m−1

117 Chapter 2. A novel seed-bank model and in exact analogy we also obtain the second recurrence. As in the previous proof

p n,m (p) wn,m = V (L ) n,m n,m Nτ ,Mτ (p) n,m Nτ ,Mτ (p) = V (nτ1) + E [V 1 1 (L )] + V (E 1 1 [L ])

2 −2 n,m Nτ1 ,Mτ1 (p) n,m Nτ1 ,Mτ1 (p) = n λn,m + E [V (L )] + V (E [L ]).

Since also

n,m N ,M p p p [ τ1 τ1 (L( ))] = α w + β w + γ w E V n,m n−1,m n,m n−1,m+1 n,m n+1,m−1 and

n,m N ,M p n,m N ,M p 2 n,m N ,M p 2 V (E τ1 τ1 [L( )]) = E [E τ1 τ1 [L( )] ] − E [E τ1 τ1 [L( )]] = α (lp )2 + β (lp )2 + γ (lp )2 n,m n−1,m n,m n−1,m+1 n,m n+1,m−1 2 − (α lp + β lp + γ lp ) n,m n−1,m n,m n−1,m+1 n,m n+1,m−1

s and the calculations can be copied exactly for wn,m we have proven the second p,s pair of recursions, too. Now we are only left to consider wn,m. Once more, we use the strong Markov property and the independence of τ1 and (Nτ1 ,Mτ1 ) and add more details for clarity.

p,s n,m (p) (s) wn,m = E [L L ] n,m n,m (p) (s) = E [E [(nτ1 + L − nτ1)(mτ1 + L − mτ1) ∣ Fτ1 ]] n,m n,m n,m (s) = E [nτ1mτ1] + E [nτ1E [(L − mτ1) ∣ Fτ1 ]] n,m n,m (p) + E [mτ1E [(L − nτ1) ∣ Fτ1 ]] n,m n,m (p) (s) + E [E [(L − nτ1)(L − mτ1) ∣ Fτ1 ]]

n,m 2 n,m n,m Nτ1 ,Mτ1 (p) = nmE [τ1 ] + mE [τ1]E [E [L ]] n,m n,m Nτ ,Mτ (s) n,m Nτ ,Mτ (p) (s) + nE [τ1]E [E 1 1 [L ]] + E [E 1 1 [L L ]] = 2nmλ−2 + mλ−1 (α lp + β lp + γ lp ) n,m n,m n,m n−1,m n,m n−1,m+1 n,m n+1,m−1 + nλ−1 (α ls + β ls + γ ls ) n,m n,m n−1,m n,m n−1,m+1 n,m n+1,m−1 + α wp,s + β wp,s + γ wp,s , n,m n−1,m n,m n−1,m+1 n,m n+1,m−1 thus the last recursion is proven.

Before the last proof of this section, let us briefly recall the notation introduced in Notation 2.57 on page 110, also explained in Table 2.2.

118 2.6. Technical results

The indices (n, n′, m, m′) denote the number of external active branches, internal active branches, external dormant branches, and internal dormant branches, respectively.

n n′ λ ′ ′ ∶= ( ) + ( ) + nn′ + cn + cn′ + cKm + cKm′, n+n ,m+m 2 2

n n′ ( ) ( ) ′ (1) 2 (2) 2 (3) nn α ′ ′ ∶= , α ′ ′ ∶= , α ′ ′ ∶= , n,n ,m,m ′ ′ n,n ,m,m ′ ′ n,n ,m,m ′ ′ λn+n ,m+m λn+n ,m+m λn+n ,m+m

′ (1) cn (2) cn β ′ ′ ∶= , β ′ ′ ∶= , n,n m,m ′ ′ n,n m,m ′ ′ λn+n ,m+m λn+n ,m+m

′ (1) cKm (2) cKm γ ′ ′ ∶= , γ ′ ′ ∶= n,n m,m ′ ′ n,n ,m,m ′ ′ λn+n ,m+m λn+n ,m+m The proof of the last proposition follows the lines of the above. How- ever, the knowledge of only the process (Nt,Mt)t≥0 is not sufficient, since it cannot differentiate between internal and external branches. This is, of course, possible in seed-bank coalescent (Πt)t≥0 itself (see Definition 2.30, p.82) used to define the variables Ed and Es. Thus, we define a new process ˜ ′ ˜ ′ (Nt,Nt , Mt,Mt )t≥0 in the following way: For any t ≥ 0 ˜ Nt is the number of active singletons in Πt, ′ Nt is the number of active blocks of size greater or equal to two in Πt, ˜ Mt is the number of dormant singletons in Πt, ′ Mt is the number of dormant blocks of size greater or equal to two in Πt. An active block is one with the flag p while a dormant block is one with the s ˜ ′ ˜ ′ flag . Obviously, Nt = Nt + Nt and Mt = Mt + Mt for all t ≥ 0. Again, it is ˜ ′ ˜ ′ easy to see that (Nt,Nt , Mt,Mt )t≥0 is a continuous-time Markov chain with transitions described by the parameters introduced above, whence we can proceed as before.

Proof of Proposition 2.58. Again, the proof is based on considering τ1 as the ˜ ′ ˜ ′ time of the first jump of (Nt,Nt , Mt,Mt )t≥0 (which coincides with that of (Nt,Mt)t≥0) and the strong Markov property. Similar to the proof above, we make use of the observation that each stretch of time with length t during which the number of n active external, n′ active internal, m dormant external and m′ dormant internal blocks stays constant, contributes with nt to the

119 Chapter 2. A novel seed-bank model total length of the active external branches (and likewise with n′t to the length of active internal, with mt to the length of dormant external and with m′t to the length of dormant internal branches). Hence, using the strong Markov property

′ ′ ′ ′ ′ ′ p n,n ,m,m (p) n,n ,m,m n,n ,m,m (p) en,n′,m,m′ = E [E ] = E [E [nτ1 + E − nτ1 ∣ Fτ1 ]] ′ ′ ′ ′ ˜ ′ ˜ ′ n,n ,m,m n,n ,m,m Nτ1 ,Nτ ,Mτ1 ,Mτ (p) = nE [τ1] + E [E 1 1 [E ]] 1 = nλ− ′ ′ n+n ,m+m (1) p (2) p (3) p + α ′ ′ e ′ ′ + α ′ ′ e ′ ′ + α ′ ′ e ′ ′ n,n ,m,m n−2,n +1,m,m n,n ,m,m n,n −1,m,m n,n ,m,m n−1,n ,m,m (1) p (2) p + β ′ ′ e ′ ′ + β ′ ′ e ′ ′ n,n ,m,m n−1,n ,m+1,m n,n ,m,m n,n −1,m,m +1 (1) p (2) p + γ ′ ′ e ′ ′ + γ ′ ′ e ′ ′ . n,n ,m,m n+1,n ,m−1,m n,n ,m,m n,n +1,m,m −1 The second recursion can be proven in the exact same way; details are om- mitted.

120 Part III

Random Dynamical Systems

121

Chapter 3

Volterra Stochastic Operators

Quadratic Stochastic Operators (QSOs) were introduced by Bernstein in [4] (English translation: [5]) to describe the evolution of a large population in discrete generations with a given law of heredity. Since quadratic operators arise rather naturally in the biological context, there is a vast number of similar models. Bernstein’s work though sparked the interest in QSOs as mathematical objects leading to a research area still very active today (see Section 3.1.1 for details). As the definition of QSOs and their relation tobi- ological considerations is rather intuitive, we begin with the introduction of so-called Volterra QSOs directly and introduce one known result about this class of operators that we will resort to in the next chapter. The first section concludes with a discussion of related models and further developments. Sec- tion 3.2 is dedicated to generalizing the notion of Volterra quadratic stochas- tic operators to Volterra polynomial stochastic operators of arbitrary degree and a discussion of their properties. Up to this point the use of the term ‘stochastic’ comes from the use of frequencies, but no randomness is involved so far. This changes in Section 3.3 where we consider a random heredity mechanism and in the main result of this chapter prove almost sure con- vergence of the distribution of species, contrasting analogous deterministic observations. Though this chapter roughly corresponds to Sections 1-3 of [56], note that the results obtained here are slightly more general, since they are for polynomial stochastic operators of arbitrary degree. In particular, most observations from Section 3.2 were not part of [56]. This chapter is finalized in Section 3.4 with a separate martingale-type convergence theorem needed for the proof of the main result, but interesting in its own right, which corresponds to Section 5 in [56].

123 Chapter 3. Volterra Stochastic Operators

3.1 Quadratic Stochastic Operators

In order to formulate his model of heredity, Bernstein made a few assumptions on the population under consideration: I Reproduction takes place in non-overlapping (i.e. discrete) generations: Each individual belongs to exactly one generation and breeding does not take place between generations. II Each individual has two parents from the previous generation.

III There are m ∈ N possible types1 of individuals in the population and each individual belongs to exactly one of these types. In particular, there is no mutation, i.e. no new types can appear. IV The size of the population is taken to be sufficiently large for random fluctuations to have no impact on the frequencies of the types under consideration. V The population is in a state of panmixia, i.e. the types are assumed sta- tistically independent for breeding. In particular, there is no selection and no sexual differentiation between the ‘parents’. Note that this model differs from the one considered in Chapter 2.For example here an individual is assumed to have two, not one parent(s). We will call ⟦m⟧ ∶= {1, . . . , m} the type-space. According to IV we will always observe the distribution of types in our population and thus work on

m m−1 m S ∶= {x = (x1, . . . , xm) ∈ [0, 1] ∣ ∑ xi = 1} i=1 the simplex of probability distributions on ⟦m⟧.A quadratic stochastic op- erator is a map V ∶ Sm−1 → Sm−1 given by m m i,j (V x)k ∶= ∑ ∑pk xixj, (3.1) i=1 j=1 ∀x ∈ Sm−1, k ∈ ⟦m⟧, where

i,j j,i ∀i, j, k ∈ ⟦m⟧ ∶ pk = pk ≥ 0 (3.2) m i,j and ∀i, j ∈ ⟦m⟧ ∶ ∑ pk = 1. (3.3) k=1 1Note that ‘type’ is not a biological term. In this context it may refer to a specific allele of a gene or a combination thereof, a genotype or a phenotype and can model haploid or polyploid populations. The reader might also think of ‘species’.

124 3.1. Quadratic Stochastic Operators

i,j For i, j, k ∈ ⟦m⟧ the quantities pk , called heredity coefficients, give the con- ditional probabilities, that an individual of type i and an individual of type j interbreed to produce an individual of type k, given that the individuals of type i and j meet. By postulate III they must sum up to one as in (3.3). The non-differentiation between ‘mother’ and ‘father’ results in (3.2), which is not a restriction for the mathematical model and can be assumed without i,j i,j j,i loss of generality: otherwise set qk = (pk + pk )/2. However, this does not mean, that postulate V can be dropped, as in general a distinction between the sexes will lead to different quadratic operators, see for example [60] and the discussion in Section 3.1.1 below. In addition, due to the presumed sta- tistical independence of types, the term xixj indeed gives the probability of an individual of type i and an individual of type j meeting (and breed- ing). Thus (V x)k clearly describes the fraction of individuals of type k in the next generation, given we start with x ∈ Sm−1, for any k ∈ ⟦m⟧. Due to its interpretation such a map is sometimes also referred to as evolutionary operator. A certain subclass of such QSOs, so called Volterra operators is of special mathematical and biological relevance as we explain in Section 3.1.1. They are characterized by the property:

i,j ∀ i, j, k ∈ ⟦m⟧ ∶ k ∈/ {i, j} ⇒ pk = 0. (3.4)

In the biological context this corresponds to the seemingly reasonable as- sumption that the offspring be of the type of one of its parents. We will discuss part of the ample literature available on this topic in the following section, but state here one result, which is of particular relevance as it will be needed for the work in Chapter 4.

Theorem 3.1 ([41]). A Volterra quadratic stochastic operator V ∶ Sm−1 → Sm−1 as defined by (3.1)-(3.4) is a homeomorphism.

3.1.1 Biological origins, related developments and en- hancements of the model Though Bernstein was the first to introduce QSOs as formal objects forthe mathematical study of the evolution of a population, it appears already in Mendel’s “Versuche ¨uber Pflanzenhybriden”, [76], his famous study of inher- itance laws of peas, considered the birth of what today we call “Genetics”. We give a very brief summary of the key relevant point as it also helps to exemplify a caveat on the biological interpretations we have done above. He studied, for example, the color of the flowers of the peas for which he

125 Chapter 3. Volterra Stochastic Operators observed two options: white and pink. From his experiments he drew the revolutionary conclusion that this trait in a plant was characterized by two parameters, with two options for each: w for white and P for pink, such that in total there were three possible types of plants

ww wP = P w P P 1 2 3.

He assumed that in each parent one of the two parameters was chosen with equal probability and inherited to the offspring, independently of the other parent. For example, crossing a type wP with a type ww will give a child of 1,2 type wP with probability p2 = 1/2. Thus, if we number the types as above and begin with a given distribution of types x ∈ S2 the fraction of each type in the next generation is given by x′ with 1 1 1 x′ = 1x x + x x + ( + ) x x 1 1 1 4 2 2 2 2 1 2 1 1 1 1 1 x′ = x x + (1 + 1)x x + ( + ) x x + ( + ) x x 2 2 2 2 1 3 2 2 1 2 2 2 3 2 1 1 1 x′ = 1x x + x x + ( + ) x x 3 3 3 4 2 2 2 2 3 2 which is indeed as in (3.1)-(3.3). Here we see the influence of the specific def- inition of type for the application. This representation was possible because we were sorting the individuals by their genotype (as opposed to the ‘visible’ sorting in white and pink, i.e. their phenotype). Note that this Mendelian operator is not a Volterra operator. Indeed, the offspring of to parents of 2,2 type wP may well be ww, i.e. p1 = 1/4. Thus the seemingly ‘natural’ as- sumption is not always sensible in the biological context and non-Volterra operators are of relevance for biologists, too. As we just saw, quadratic operators arise very naturally in the biological context and thus the plethora of similar quadratical models comes as no sur- prise. In [30] a model with distinction between the sexes is considered, but with equal numbers of each, thus leading to an object as given by (3.1)-(3.3). However, a large number of such quadratical models differs from the defini- tion of a QSO used in this work in that the coefficients are not necessarily probabilities or do not sum up to 1. This happens, among other reasons, when the number of individuals can vary for example by adding survival probabilities for the offspring as done in [64] or [110] such that (3.1) needs to be normalized by a constant depending on the whole previous generation. Similarly, in [60] and [61] a pair of entangled quadratic operators is analyzed originating from a model with explicit sex differentiation, but these articles

126 3.2. Polynomial Stochastic Operators are also recommended for further references and a discussion of quadratic operators (here called quadratic transformations) in biology. The works named above make no reference to the original work by Bern- stein, possibly due to the language barrier and historical circumstances, but his work was followed by Lyubich in [72] and [73]. Although still heavily moti- vated by the biological origin, it becomes apparent, that QSOs are interesting not only for their potential application, but as mathematical objects in their own right. Indeed, Stein, Ulam and Menzel pioneered in applying computers to mathematical problems in order to analyze (and plotting!) the trajecto- ries of each member of a special family of QSOs called binary relations (cf. [91], [102]). The analytical study of QSOs focuses on their trajectories and related questions such as fixed points, ω-limit sets, periodicity ([113]) and ergodicity properties ([33], [35], [36], [112]). Often, Lyapunov functions are applied in order to obtain these results ([39], [41]). Also generalizations of the notion of a QSO we introduced above are investigated, for example con- sidering a larger type-space ([32], [34], [80], [94]) or indeed sex-differentiation ([40], [95]). The reader is refered to [38] for an overview of current results and open problems and directed towards the recent monograph [81] for more detailed information, especially, since some of the results mentioned above are only available in Russian. We only point out two publications, as they exemplify the mathematical relevance of Volterra operators: In [93] it was shown that certain (general- ized) QSOs can be ‘decomposed’ into Volterra QSOs and in [42] it was proven that any automorphism on the simplex Sm−1 is in a sense a permutation of a Volterra QSO. Needless to say, there is also a vast field in the analogous observations in continuous time – indeed, Volterra’s original work [107], that became the namegiver inspite being preceeded by Lotka [71] – was in continuous time, but we omit any deeper discussion of this for lack of relevance for this thesis.

3.2 Polynomial Stochastic Operators

As the reader might have noticed, the focus of the publications mentioned in Section 3.1.1 was on quadratic stochastic operators of different kinds moti- vated by the biological idea of the set of ‘parents’ of an individual consisting of two individuals. But as mathematicians like to generalize, cubic Volterra operators were considered numerically in [91] using ‘experimental mathemat- ics’ taking polaroid pictures of an oscilloscope to analyze the trajectories of said operators. More recent results in the area of cubic stochastic operators were obtained in [18], [75], [92]. Paralleling this, progress in genetics has lead

127 Chapter 3. Volterra Stochastic Operators to the development of a technique called Mitochondrial Manipulation Tech- nology, a special form of in vitro fertilization, where the future embryo has three sources of DNA – the two parents and a donor – used in cases where the mother carries a mitochondrial disease. However, the child does not always retain trace of this third DNA and the procedure is highly controversial from an ethical point of view. Fortunately this thesis is in the area of math, not biology, and as a math- ematician we have the pleasure to consider objects for the pure joy of work- ing with them. Thus, without attempting to predict the future of genetic engineering and leaving moral predicaments aside, from the mathematical point of view, it is natural to generalize the notion of a quadratic (or cubic) Stochastic Operator, to a polynomial Stochastic Operator. Recall that we denote by Sm−1 the simplex of dimension m − 1.

Definition 3.2. For d ∈ N let V (d) ∶ Sm−1 → Sm−1 be such that for any x ∈ Sm−1 and k ∈ ⟦m⟧

m m (d) i1,...,id (V x)k ∶= ∑ ⋯ ∑ pk xi1 ⋯xid (3.5) id=1 i1=1

( i1,...,id ) where the coefficients pk i1,...,id,k∈⟦m⟧ are such that

i1,...,id iπ(1),...,iπ(d) (a) for any permutation π of {1, . . . , d}: pk = pk ≥ 0 and

(b) m pi1,...,id = 1 ∑k=1 k

d for all i1, . . . , id, k ∈ ⟦m⟧. Then V ( ) is called a Polynomial Stochastic (evo- lutionary) Operator (PSO) of degree d. For d = 2 and d = 3 we say quadratic, respectively cubic, stochastic operator.

Although maybe of less biological relevance, the definitions above still retain their biological interpretation. Condition (a) is again the lack of dif- ferentiation between the parents (‘mother’, ‘father’, ‘other’...) and (b) comes from the assumption of no mutation. Likewise, the product in (3.5) corre- sponds to the assumption of statistical independence of types for breeding.

Remark 3.3.

1. The equality in condition (a) of the definition of a PSO can again be assumed without loss of generality as you can otherwise simply replace i ,...,i the coefficients by qi1,...,id = 1 ∑m p π(1) π(d) , where Π is the set of k d! π∈Πd k d permutations of {1, . . . , d}.

128 3.2. Polynomial Stochastic Operators

2. Every PSO of degree d is a PSO of degree D ≥ d (d, D ∈ N). This is easy to see for D = d + 1 (and then follows for all D ≥ d by induction): Let i ,...,i (d) ( 1 d ) V be a PSO of degree d given by the coefficients pk i1,...,id,k∈⟦m⟧. Define 1 p¯i1,...,id+1 ∶= (pi1,...,id + pi1,...,id−1,id+1 + ... + pi1,i3,...,id+1 + pi2,...,id+1 ) k d + 1 k k k k ¯ for all i1, . . . , id+1, k ∈ ⟦m⟧ and let V be the map given by the ( i1,...,id+1 ) p¯k i1,...,id+1,k∈⟦m⟧. Since (a) and (b) of Definition 3.2 clearly hold, V¯ is a PSO of degree d + 1. At the same time we have

m m ¯ i1,...,id+1 (V x)k = ∑ ⋯ ∑ p¯k xi1 ⋯xid+1 id+1=1 i1=1 1 m m = ∑ ⋯ ∑ pi1,...,id x ⋯x d + 1 k i1 id+1 id+1=1 i1=1 1 m m + ∑ ⋯ ∑ pi1,...,id−1,id+1 x ⋯x d + 1 k i1 id+1 id+1=1 i1=1 1 m m + ... + ∑ ⋯ ∑ pi2,...,id+1 x ⋯x d + 1 k i1 id+1 id+1=1 i1=1 1 m m m = ∑ ⋯ ∑ pi1,...,id x ⋯x ∑ x d + 1 k i1 id id+1 id=1 i1=1 id+1=1 ´¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¸¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¶ =1 1 m m m m + ∑ ∑ ⋯ ∑ pi1,...,id−1,id+1 x ⋯x x ∑ x d + 1 k i1 id−1 id+1 id id+1=1 id−1=1 i1=1 id=1 ´¹¹¹¹¹¸¹¹¹¹¹¹¶ =1 1 m m m + ... + ∑ ⋯ ∑ pi2,...,id+1 x ⋯x ∑ x d + 1 k i2 id+1 1 id+1=1 i2=1 i1=1 ² =1 d + 1 m m = ∑ ⋯ ∑ pj1,...,jd x ⋯x = (V x) d + 1 k j1 jd k jd=1 j1=1

for all x ∈ Sm−1 and k ∈ ⟦m⟧, therefore V (d) = V¯ . 3. Note that the PSOs are continuous and that the simplex over a finite set is compact and convex, such that by the Brouwer Fixed-Point Theorem there always exists at least one fixed point (where we call a point x ∈ Sm−1 a fixed point of V (d) if V (d)(x) = x). Further, if a trajectory generated by the PSO V (d) converges to some x then, by continuity, x is a fixed point.

129 Chapter 3. Volterra Stochastic Operators

As in the case of QSOs we are interested in a special subclass describing the assumption that the offspring be of the type of one of its parents.

Definition 3.4. Let V (d) be a PSO of degree d, such that

i1,...,id k ∉ {i1, . . . , id} ⇒ pk = 0 (3.6) for all i1, . . . , id, k ∈ ⟦m⟧. Then said operator is called a Volterra Polynomial Stochastic Operator (VPSO). Remark 3.5. A Volterra PSO of degree d has the general form m (d) k,...,k d−1 d−2 i1,k,...,k (V x)k = xk [ pk xk + xk d ∑ pk xi1 ² i1=1 =1 i1≠k d m m + xd−3( ) ∑ ∑ pi1,i2,k,⋯,kx x k 2 k i1 i2 i2=1 i1=1 i2≠k i1≠k d m m + ⋯ + x ( ) ∑ ⋯ ∑ pi1,...,id−2,k,kx ⋯x k d − 2 k i1 id−2 id−2=1 i1=1 id−2≠k i1≠k m m i1,...,id−1,k + d ∑ ⋯ ∑ pk xi1 ⋯xid−1 ] id−1=1 i1=1 id−1≠k i1≠k for any x ∈ Sm−1 and k ∈ ⟦m⟧. Remark 3.6 (A word on different notions of Volterra). 1. In [79] it is observed, that a QSO is a Volterra QSO if and only if for all x ∈ Sm−1: (V x) ≪ x, where for two measures µ and ν, µ ≪ ν means that µ is absolutely continuous with respect to ν. This characterization is then used to define the notion of Volterra for quadratic stochastic operators on a more general space of probability measures. The analo- gous equivalence also holds for PSOs, i.e. a PSO V is of Volterra type according to Definition 3.4 if and only if for all x ∈ Sm−1: (V x) ≪ x. From the representation in Remark 3.5 the ‘only if’ is immediately clear. For the ‘if’, let V be a PSO of some degree d such that the as- 1 sumption on absolute continuity holds. Set x( ) ∶= (e2 +...+em)/(m−1). (1) Since x1 = 0, by assumption m m 0 = (V x(1)) = ∑ ⋯ ∑ pi1,...,id x(1)⋯x(1) 1 1 i1 id id=1 i1=1 m m 1 d = ∑ ⋯ ∑ pi1,...,id ( ) . 1 m − 1 id=2 i1=2

130 3.2. Polynomial Stochastic Operators

i1,...,id Thus p1 = 0 for all i1, . . . , id ∈ ⟦m⟧∖{1} and repeating the procedure with similar x(2), . . . , x(m) we obtain (3.6).

2. A similar notion is that of Lotka-Volterra operators, characterized in m 1 the following way: For any I ⊆ ⟦m⟧ we call ΓI ∶= {x ∈ S − ∣ k ∈/ I ⇒ m 1 xk = 0} a face of the simplex S − and riΓI ∶= {x ∈ ΓI ∣ k ∈ I ⇒ xk > 0} its relative interior. A map V ∶ Sm−1 → Sm−1 is called Lotka-Volterra operator, if it is continuous and maps the relative interior of each face into itself, i.e. V (riΓI ) ⊆ riΓI , see for example [82] or [96]. Note that this is equivalent to saying that for all x ∈ Sm−1 V x ≪ x and V x ≫ x. This is a priori a stronger assumption than in 1. However, again, from the representation in Remark 3.5 it is easy to see, that a PSO which is Volterra according to Definition 3.4, is indeed a Lotka-Volterra opera- tor. And since this clearly implies the condition of absolute continuity, we have equivalence also in this case. Unfortunately, not much seems to be known for this class of operators in general. For the results we obtained in [56], we needed to consider special sub- classes of VQSOs that essentially corresponded to heredity mechanisms in which one type was a pure blood, i.e. the offspring would only be of this type, if all, so in this case both, its parents were of this same type. For VPSOs this is too strict, therefore we need a more refined definition.

Definition 3.7. Given a PSO V (d) of degree d with heredity coefficients ( i1,...,id ) ∈ ⟦ ⟧ ≤ pk i1,...,id,k∈⟦m⟧ we will call a type k m purebred of purity level l d in the heredity mechanism given by V (d), if this type is not possible to breed if more than l parents are of a different type. In formulae, this condition is described by

i1,...,id ∣ {j ∈ {1, . . . , d} ∣ ij ≠ k} ∣ > l ⇒ pk = 0 (3.7) for all i1, . . . , id, k ∈ ⟦m⟧. If l = 0, then k is a pure blood. In V (d) all types k ∈ ⟦m⟧ are purebred of level l = d − 1, if and only if V (d) is Volterra (see (3.6)). Remark 3.8. For every l ≤ d−1 and any k ∈ ⟦m⟧ there exists a PSO for which k is purebred of level l. Indeed, there always exists a Volterra PSO with this = ( i1,...,id ) property: Without loss of generality consider k 1. Let p¯ȷ i1,...,id,¯ȷ∈⟦m⟧ 1,...,1 be the heredity coefficients of a Volterra operator (in particular p1 = 1) and

i1,...,id 1 p¯ȷ ∶= , ∣{j ∈ {1, . . . , d} ∣ ij ≠ 1}∣

131 Chapter 3. Volterra Stochastic Operators

for all ¯ȷ ∈ ⟦m⟧∖{1}, i1, . . . , id ∈ ⟦m⟧. Since by assumption (b) in the Definition 3.2 of a PSO the quantities have to sum up to 1, this means (3.7) holds. Since every type that is purebred of some level l is also purebred for any l′ ≥ l, this proves the claim.

It is quite obvious from a biological point of view that being purebred is a disadvantage when it comes to survival of the type. This is also clearly seen in the mathematical expressions. Take for example the extreme case and let V (d) be a VPSO of degree d such that type 1 is a pure blood. Then for any m 1 d initial distribution x ∈ S − after one evolutionary step we obtain (V ( )x)1 = d x1 (see Remark 3.5), hence the fraction of individuals of type 1 diminished drastically. The problem for this type (that we will take advantage of in Section 3.3) is that no VPSO is capable of neutralizing such a strong blow to the numbers, as we can deduce from the following, seemingly crude estimate.

Proposition 3.9. Let V (d) ∶ Sm−1 → Sm−1 be a PSO of degree d ≥ 2 such that type k ∈ ⟦m⟧ is purebred of purity level l < d. Then for all x ∈ Sm−1

d (V (d)x) ≤ xd−l( ). (3.8) k k d − l

For l ≤ d − 2 we can even write

d d (V (d)x) ≤ min {xd−l( ), x2( )} . (3.9) k k d − l k 2

Which one of the upper bounds given in (3.9) is better depends on the proximity of xk to 0, resp. 1. Together with the last remark in Definition 3.7 we immediately obtain the following estimate for Volterra PSOs.

Corollary 3.10. For any Volterra PSO V (d) of degree d ≥ 2, every type k ∈ ⟦m⟧ and any initial distribution x ∈ Sm−1:

(d) (V x)k ≤ dxk.

This was observed for VQSOs in [56].

Proof of Proposition 3.9. We begin with the proof of (3.8). Let V (d) ∶ Sm−1 → Sm−1 be a PSO of degree d such that type k ∈ ⟦m⟧ is purebred of purity level l < d and let x ∈ Sm−1. Using the condition (3.7) in the second equality we

132 3.2. Polynomial Stochastic Operators obtain

m m (d) i1,...,id (V x)k = ∑ ⋯ ∑ pk xi1 ⋯xid id=1 i1=1 m d m m = xd + xd−1d ∑ pi1,k,...,kx + xd−2( ) ∑ ∑ pi1,i2,k,⋯,kx x k k k i1 k 2 k i1 i2 i1=1 i2=1 i1=1 i1≠k i2≠k i1≠k − ¹¹¹d l¹¹¹ d m m ³ · µ + ⋯ + xd−l( ) ∑ ⋯ ∑ pi1,...,il,k,...,kx ⋯x k l k i1 il il=1 i1=1 il≠k i1≠k d d ≤ xd + dxd−1(1 − x ) + ( )xd−2(1 − x )2 + ⋯ + ( )xd−l(1 − x )l k k k 2 k k l k k where we estimate all heredity coefficients by 1 and use the observation that for any n ≤ l

m m m m n ∑ ⋯ ∑ xi1 ⋯xin = ∑ xil ⋯ ∑ xi1 = (1 − xk) . in=1 i1=1 in=1 i1=1 in≠k i1≠k in≠k i1≠k To prove our claim it will thus be sufficient to prove that the function f ∶ [0, 1] → R given by

d l d f(s) ∶= ( )sd−l − ∑ ( )sd−n(1 − s)n d − l n n=0 is nonnegative for all s ∈ [0, 1]. Now, obviously f(0) = 0 and in addition

df d (s) = (d − l)( )sd−l−1 ds d − l d−(n+1) ( + )− l ¹¹¹¹¹ ¹¹¹¹¹ n 1 1 d ³ · µ © d + ∑ { − ( )(d − n) s d−n−1 (1 − s)n + ( )nsd−n(1 − s)n−1} − dsd−1 n n n=1 ´¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¸¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¶ d (n+1)(n+1) d d = (d − l)( )sd−l−1 − ( )(d − l)sd−l−1(1 − s)l d − l l d = (d − l)( )sd−l−1 {1 − (1 − s)l} ≥ 0 d − l for all s ∈ [0, 1], thus the claim is proven. (3.9) now follows from (3.8) and observation, that any type that is pure- bred of level l is also purebred of level l′ for any l′ ≥ l.

133 Chapter 3. Volterra Stochastic Operators

Remark 3.11. The classes of VQSOs we considered in [56] were clearly disjoint, since in every population only (at most) one species can be of pure blood (recall that every type will breed with every type since there is no selection). As we have seen, this is not true in general for the notion of purebred of any purity level l, since, for example, the extreme case of being purebred of degree d−1 for all types is possible and, indeed, holds true for all PSOs we will consider, since it is equivalent to being Volterra. The precise description for any purity level is as follows: A Volterra PSO V (d) of degree d ≥ 2 can be purebred of purity level l for at most ⌈d/(d − l − 1)⌉ − 1 species at the same time. To see this, make the following observation: If a type k is purebred of level l, it means, that it can be bred if at most l parents are of a different type, which is equivalent to saying, that at least d − l parents need to be of this type k. Thus for κ types to be purebred of this level l we need κ(d−l−1) < d. Otherwise assume κ(d − l − 1) ≥ d (thus necessarily l < d − 1) and the types in question are without loss of generality the types 1, . . . , κ. Choose the parents i1, . . . , id such that ij = ⌈j/(d − l − 1)⌉ for j = 1, . . . , d, i.e. the first (d − l − 1) parents are of type 1, the next (d − l − 1) parents are of type 2 etc. until we have filled all d parents (which is possible by our assumption). Then d−l−1 d−l−1 ­ ­ i1,...,id 1,...,1,2,...,2,... ∀k ∈ ⟦m⟧ ∶ pk = pk = 0

either by the condition of being Volterra, if k ∈/ {i1, . . . , id} or by the assump- tion that the type is purebred of level l, since there are at most l − 1 of each type represented in the parents. But this is a contradiction to condition (b) in Definition 3.2 that these coefficients sum up toone.

3.3 Randomization of the model

So far, in spite of the frequent occurrence of the word stochastic, no random- ness was involved in the model. Indeed, the vast majority of references given up to now is concerned with determinisic evolutionary models. However, it seems natural to randomize the hereditary mechanism. As a first step into this direction, we want to investigate the trajectories of a sequence of in- dependent and identically distributed VPSOs. Therefore we introduce the notion of random VPSOs along the lines of the definition of random QSOs introduced in [37]. For this we need a suitable measurable space. Notation 3.12 Let V(d) be the set of all Volterra PSOs of degree d ≥ 2. Since d ≥ 2 is fixed throughout the rest of the chapter and notation might become more complicated, we will often drop the indication of the dimension

134 3.3. Randomization of the model and simply write V. Likewise, we will drop the superscript (d) and only write V for a VPSO (of degree d). Note that the correspondence between the × × ( i1,...,id ) PSOs V and ‘matrices’ of dimensions m ... m (d times) pk i1,...,id,k∈⟦m⟧ d+1 d+1 gives a natural embedding ι ∶ V ↪ R(m ). Denote by B(R(m )) the Borel- d+1 σ-algebra on R(m ) and use it to define

1 md+1 Υ ∶= ι− ({B ∩ ι(V) ∣ B ∈ B(R( ))})

(which is nothing but the Borel-σ-algebra on V if one uses ι to translate the d+1 topology on R(m ) into V). As always, let (Ω, F, P) be a probability space.

Definition 3.13. For a given probability space (Ω, F, P) any measurable map T ∶ Ω → V, i.e. any map such that T −1Υ ⊆ F is called random Volterra Polynomial Stochastic Operator (RVPSO). We will need to consider special subclasses of VPSOs.

l Definition 3.14. For any type k ∈ ⟦m⟧ and any level l, let Vk be the set of all Volterra PSOs (of degree d) V such that the type k is purebred of level l d−2 in V . As it will be the most commonly used, we abbreviate Vk ∶= Vk . These sets are all non-empty, but might, however, not be disjoint (see Remarks 3.8 and 3.11) and are clearly measurable with respect to Υ. The relevance of Vk stems from the following observation: Any V ∈ Vk describes an evolutionary situation in which the species k is in strong disadvantage. From a biological point of view it might not seem to be a heavy drawback if you need at least two of your potentially many parents to be of the same type k, but the picture is more clear in the math. As we see in Proposition 3.9, such a V ∈ Vk implies the fraction of individuals of type k to be at least squared in the next step. On the other hand, Corollary 3.10 means that any other V¯ ∈ V will only be able to increase this fraction by at most a factor d, thus making it hard for the type k to recover. This explains the importance of the sets Vk when analyzing extinction in the long-term behaviour. We now come to the main result of this Chapter, which was published in [56] for RVQSOs.

Theorem 3.15. Let ν be a probability distribution on (V, Υ) such that ν(Vk) > 0 for all k ∈ ⟦m⟧. Furthermore, consider a sequence T1,T2,... of indepen- dent and identically distributed RVPSOs in V of degree d ≥ 2 such that the m 1 distribution of T1 is given by ν. Then, for any x ∈ S −

P( lim (Tn ○ ... ○ T1)(x) ∈ {e1, ..., em}) = 1. (3.10) n→∞ 135 Chapter 3. Volterra Stochastic Operators

Remark 3.16. 1. Theorem 3.15 states that for any initial point from the simplex of prob- ability distributions x ∈ Sm−1 the random trajectory converges almost surely to one of the vertices of the simplex. This is far from being obvious since the set of Volterra PSOs considered may well contain op- erators that do not have this property. In general for Volterra QSOs it is only known, that for any initial point x ∈ Sm−1 the ω-limit set, i.e. the set of all limit points of the trajectory, consists of either one or infinitely many points and that, if the initial point x is not a fixed point, the ω-limit set must be a subset of the boundary of the sim- plex (cf. [41]). The set of Volterra PSOs considered in Theorem 3.15 may contain operators that do not converge at all, indeed, are not even ergodic (cf. [35], [112]). 2. Note that for the biological interpretation our results show that such a mechanism does not allow for coexistence, but yields almost sure extinction of all but one type. As mentioned above, the corresponding results in the deterministic setting on the other hand cannot generally rule out coexistence in the long run. Indeed, some of the PSOs included in the set we consider for the random setting, e.g. those studied in [112], model a very distinct deterministic behaviour. They describe a population where a species will come to the verge of extinction only to recover to the point where all other species are almost annihilated, after which the cycle repeats indefinitely, not yielding a stable situation. We should point out that the conditions in Theorem 3.15 do not ask for the existence of operators resulting in purebred species. This property is obviously an extreme disatvantage for a species and thus extinction (some time) after such a mechanism is applied might not come as a big surprise. But here we only ask for operators modeling the existence of species of purity level d-2, the lowest possible level before being merely a Volterra operator. One would think that the restriction that at least 2 of your d many parents be of your type should not be a strong drawback for a species (for large d especially), but at least in mathematical terms, it is. In order to proceed we will need some additional notation. Notation 3.17 For ε > 0 we denote by

ε m−1 Ui = {x ∈ S ∣ ∀j ∈ ⟦m⟧ ∖ {i} ∶ xj < ε} ∈ ⟦ ⟧ = ε the ε-neighbourhood of the vertex ei, i m and set Uε ⋃i∈⟦m⟧ Ui . Fur- m 1 thermore, we define Λ ∶= {e1, ..., em} to be the set of vertices of S − .

136 3.3. Randomization of the model

Before we move on to the proof of Theorem 3.15 we will need some preparational propositions. Let us thus briefly discuss the strategy of the proof: The following Proposition 3.18 states that for any ε > 0 there is a deterministic number N ∈ N of steps after which the probabil- ity of being close to one of the ver- tices, i.e. in Uε, is bounded away from 0 uniformly with respect to the starting point x ∈ Sm−1. Proposi- tion 3.21 then in particular implies a Figure 3.1: Visualization of the strat- positive probability of the trajectory egy of proof of Thm. 3.15. Prop. 3.18 converging to the corresponding ver- implies positive probability of being in tex on this event. Since we can thus U ε after N steps, uniform in the start- bound the probability of reaching U ε ing point x. Prop. 3.21 yields positive and subsequently converging to the probability of subsequently converging closest vertex in Λ away from 0 uni- to the respective corner. formly, the main result given in The- orem 3.15 then follows with a Borel- Cantelli-type argument.

Proposition 3.18 ([56]: for RVQSOs). Under the assumptions of Theorem 3.15, for each ε > 0 there are N ∈ N and q > 0 such that for every point x ∈ Sm−1

P(TN ○ TN−1 ○ ... ○ T1(x) ∈ Uε) ≥ q.

Remark 3.19. The values of N and q can indeed be quantified: Recall that we assumed ν to be such that νk ∶= ν(Vk) > 0 for all k ∈ ⟦m⟧ and define ν ∶= min{ν1, . . . , νm}. For any r sufficiently large for

rm log(d) − 2r log(2) < log(ε) to hold, we can set

N ∶= r(m − 1) and q ∶= νN .

Proof. Let ε > 0 and choose r ∈ N sufficiently large for rm log(d) − 2r log(2) < log(ε) to hold. For a fixed starting point x ∈ Sm−1 we will first construct a deterministic sequence V¯ (1),..., V¯ (r),..., V¯ (1) ,..., V¯ (r) ∈ V such that V¯ (r) ○ 1 1 m−1 m−1 m−1

137 Chapter 3. Volterra Stochastic Operators

... ○ V¯ (1) ○ ... ○ V¯ (r) ○ ... ○ V¯ (1)(x) ∈ U and then prove the probability of such m−1 1 1 ε an event to be bounded away from 0 uniformly in x. Define j1 = j1(x) ∈ ⟦m⟧ as the smallest index of a vertex corresponding to the maximal distance of x to Λ, i.e.

∥x − ej1 ∥ = max ∥x − ej∥ j∈⟦m⟧ and j1 is the smallest value for which this holds. We now recursively de- r k 1 fine a family of maps jk ∶ V ( − ) → ⟦m⟧ for k = 2, . . . , m − 1: Set Jk =

Jk(V1,...,Vr(k−1)) ∶= {j1, j2(V1,...,Vr), . . . , jk(V1,...,Vr(k−1))}. Then define jk+1(V1,...,Vrk) ∈ ⟦m⟧ ∖ Jk to be the smallest index such that ∥ ○ ○ ( ) − ∥ = ∥ ○ ○ ( ) − ∥ Vrk ... V1 x ejk+1(V1,...,Vrk) max Vrk ... V1 x ej . j∈⟦m⟧∖Jk(V1,...,Vr(k−1)) ( ○ ○ ( )) ≤ / Note that by construction we then have Vr(k−1) ... V1 x jk(V1,...,Vr(k−1)) 1 2 for all k = 2, . . . , m−1 and any choice of V1,...,Vr(k−1) ∈ V. Of course, like j1, these functions depend on x, too, but since x is fixed throughout the proof, we have ommitted any indication since the notation is tedious already. Proceed to choose V¯ (1),..., V¯ (r),..., V¯ (1) ,..., V¯ (r) ∈ V such that 1 1 m−1 m−1

¯ (1) ¯ (r) Vk ,..., Vk ∈ V ¯ (1) ¯ (r) , k ∈ ⟦m − 1⟧. (3.11) jk(V1 ,...,Vk−1) With this choice we obtain the following estimates for every k ∈ ⟦m − 1⟧:

¯ (r) ¯ (1) ¯ (r) ¯ (1) (Vm 1 ○ ... ○ Vm 1 ○ ... ○ V1 ○ ... ○ V1 (x)) ¯ (1) ¯ (r) − − jk(V1 ,...,Vk−1) ¯ (r) ¯ (1) ¯ (r) ¯ (1) ¯ (r) ¯ (1) = (Vm 1 ○ ... ○ Vk 1(Vk ○ ... ○ Vk (Vk 1 ○ ... ○ V1 (x)))) ¯ (1) ¯ (r) − + − jk(V1 ,...,Vk−1) r(m−1−k) ¯ (r) ¯ (1) ¯ (r) ¯ (1) ≤ d (Vk ○ ... ○ Vk (Vk 1 ○ ... ○ V1 (x))) ¯ (1) ¯ (r) − jk(V1 ,...,Vk−1) r d r ≤ dr(m−1−k)( ) (V¯ (r) ○ ... ○ V¯ (1)(x))(2 ) k−1 1 ¯ (1) ¯ (r) 2 jk(V1 ,...,Vk−1) r 2r d 1 ( ) r r ≤ dr(m−1−k)( ) ( ) ≤ dr(m−1−k+2)2−2 ≤ drm2−2 < ε 2 2 where we used Proposition 3.10 in the first inequality and (3.9) from Propo- sition 3.9 in the second. This implies

V¯ (r) ○ ... ○ V¯ (1)(x) ∈ U j∗ ⊂ U , m−1 1 ε ε if we denote by j the unique element of ⟦m⟧∖J (V¯ (1),..., V¯ (r) ). Observe ∗ m−1 1 m−1 that the assumption of independence allows us to estimate the probability

138 3.3. Randomization of the model of such a suitable choice of operators satisfying (3.11), which in turn is a lower bound for the value we want to bound away from 0: Without loss of generality assume j1 = j1(x) = 1. With ν ∶= min{ν1, . . . , νm} > 0 we see

P(Tr(m−1) ○ ... ○ T1(x) ∈ Uε) m m ≥ ( ∈ V ( ) = ∑ ... ∑ P T(m−1)r,...,T(m−2)r+1 im−1 , jm−1 T1,...,T(m−2)r im−1, i2=1 im−1=1

...,T2r,...,Tr+1 ∈ Vi2 , j2(T1,...,Tr) = i2,Tr,...,T1 ∈ V1) m m = ( ∈ V ∣ ( ) = ∑ ... ∑ P T(m−1)r,...,T(m−2)r+1 im−1 jm−1 T1,...,T(m−2)r im−1, i2=1 im−1=1

...,T2r,...,Tr+1 ∈ Vi2 , j2(T1,...,Tr) = i2,Tr,...,T1 ∈ V1) × ( ( ) = ∈ V P jm−1 T1,...,T(m−2)r im−1,...,T2r,...,Tr+1 i2 ,

j2(T1,...,Tr) = i2,Tr,...,T1 ∈ V1) m m = ( ∈ V ) ∑ ... ∑ P T(m−1)r,...,T(m−2)r+1 im−1 i2=1 im−1=1 ´¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¸¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¶ νr νr = im−1 ≥ × ( ( ) = ∈ V P jm−1 T1,...,T(m−2)r im−1,...,T2r,...,Tr+1 i2 ,

j2(T1,...,Tr) = i2,Tr,...,T1 ∈ V1) m m ≥ r ( ∈ V ∈ V ν ∑ ... ∑ P T(m−2)r,...,T(m−3)r+1 im−2 ,...,T2r,...,Tr+1 i2 , i2=1 im−2=1

j2(T1,...,Tr) = i2,Tr,...,T1 ∈ V1).

Iterating this argument yields

r(m−1) P(Tr(m−1) ○ ... ○ T1(x) ∈ Uε) ≥ ν > 0.

Since this lower bound does not depend on our initial choice of x anymore we have proven the claim with q = ν(m−1)r and N = r(m − 1).

Notation 3.20 We let intSm−1 denote the interior and ∂Sm−1 the boundary of Sm−1, i.e.

m−1 m−1 m−1 m−1 m−1 intS ∶= {x ∈ S ∶ x1x2⋯xm > 0}, and ∂S ∶= S ∖ intS .

139 Chapter 3. Volterra Stochastic Operators

In order to analyze the convergence now consider a sequence (Tn)n∈N of random VPSOs as in Theorem 3.15 and let X denote a random variable taking values in int Sm−1 that is independent of the sequence and such that

E[∣ log(X)∣] < ∞. Define a filtration (Fn)n∈N0 by Fn ∶= σ(X,T1,...,Tn) for ˆ n ∈ N0. We introduce the abbreviation Tn ∶= Tn ○...○T1 and use this to define

i ˆ Zn ∶= log((TnX)i). (3.12)

ˆ m 1 Note that, by Remark 3.6, 2 (or Remark 3.5 directly) TnX ∈ int S − for all n ∈ N and thus (3.12) is well-defined. We would like the increments of this process to be (at least) integrable, i but since this is not necessarily the case we define a new process (Yn)n∈N0 in the following way: Choose

⎧ ⎧ d ⎫⎫ ⎪ ⎪log(d) + log ( )⎪⎪ c > max ⎨log(m), max ⎨ 2 ⎬⎬ (3.13) ⎪ i m ⎪ ν ⎪⎪ ⎩⎪ ∈⟦ ⟧ ⎩⎪ i ⎭⎪⎭⎪ and set

i i Y0 ∶= Z0 = log(Xi), (3.14) ⎧ ⎪Zi − Zi , if Zi − Zi ≥ −c, Y i − Y i ∶= ⎨ n+1 n n+1 n (3.15) n+1 n ⎪ ⎩⎪−c, otherwise.

i i Then we know that for all ω ∈ Ω: Zn(ω) ≤ Yn(ω).

d Proposition 3.21 ([56]: for VQSOs). For C ∶= mini∈⟦m⟧{νic−log(d)−log (2)} by our choice of c in (3.13) we have C > 0 and for every j ∈ ⟦m⟧

1 i i P(∀i ∈ ⟦m⟧ ∖ {j} ∶ lim inf − Yn ≥ C ∣ ∀i ∈ ⟦m⟧ ∖ {j} ∶ ∀n ∈ N ∶ Yn ≤ −c) = 1. n→∞ n (3.16)

Moreover for every θ > 0 and every b ∈ R there exists an s > 0, such that

i ¯ P(∃j ∈ ⟦m⟧∀i ∈ ⟦m⟧ ∖ {j} ∀n ∈ N ∶ Yn < b ∣ F0) ≥ 1 − θ on {X ∈ Us} P-a.s. (3.17)

¯ m 1 where Us ∶= {x ∈ S − ∣ ∃j ∈ ⟦m⟧∀i ≠ j ∶ xi ≤ s}.

An illustration of the somewhat technical-looking statements of Proposi- tion 3.21, is given in Figure 3.2 below.

140 3.3. Randomization of the model

Figure 3.2: Illustration of Proposition 3.21. (3.17) es- sentially states that given the

process (Xn)n∈N is started with X in the blue area, it will never leave the hatched area with large probability. (3.21) implies that, given the process stays in the hatched area for- ever, it converges towards the closest corner P-a.s. lalal lala lala lala la

i Proof. Note that the increments of (Yn)n∈N0 are integrable. Thus we can calculate

T (Tˆ (X)) [Y i − Y i ∣ F ] = [log ( n+1 n i ) ∨ (−c) ∣ F ] E n+1 n n E ˆ n Tn(X)i V (Tˆ (X)) = log ( n i ) ∨ (−c)dν(V ) ∫ ˆ V Tn(X)i V (Tˆ (X)) = log ( n i ) ∨(−c)dν(V ) ∫ ˆ Vi Tn(X)i ´¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¸¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¶ d ˆ ≤(2)Tn(X)i V (Tˆ (X)) + log ( n i ) ∨(−c)dν(V ) ∫ ˆ V∖Vi Tn(X)i ´¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¸¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¶ ≤d by Cor. 3.10 d ≤ ν ({log ( ) + log (Tˆ (X) )} ∨ (−c)) + log(d) i 2 n i d = ν ({log ( ) + Zi } ∨ (−c)) + log(d) i 2 n d ≤ ν (Zi ∨ (−c)) + ν (log ( ) ∨ (−c)) + log(d) i n i 2 d ≤ −ν c + log ( ) + log(d) ≤ −C on {Zi ≤ −c} i 2 n

i and therefore also on {Yn ≤ −c} and

[(Y i − [Y i ∣ F ])2 ∣ F ] E n+1 E n+1 n n

141 Chapter 3. Volterra Stochastic Operators

= [(Y i − Y i − [Y i − Y i ∣ F ])2 ∣ F ] E n+1 n E n+1 n n n = [(Y i − Y i)2 ∣ F ] − [( [Y i − Y i ∣ F ])2 ∣ F ] E n+1 n n E E n+1 n n n i i + 2 i i − 2 ≤ E[((Yn 1 − Yn) ) ∣ Fn] + E[((Yn 1 − Yn) ) ∣ Fn] ´¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¸¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¶+ ´¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¸¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¶+ d ≤c ≤log (2) d ≤ (log ( ))2 + c2 -a.s. 2 P This allows us to apply Lemma 3.22 in Section 3.4 yielding

1 i i P(lim inf − Yn ≥ C ∣ ∀n ∈ N ∶ Yn ≤ −c) = 1 (3.18) n→∞ n and that for every θ > 0 and every b ∈ R there exists an ri ∈ R, such that 1 (∀n ∈ ∶ Y i < b ∣ F ) ≥ 1 − θ on {Y i ≤ r } = {log(Xi) ≤ r }. (3.19) P N n 0 m − 1 0 i i From (3.18) we obtain for every j ∈ ⟦m⟧

1 i i P(∀i ∈ ⟦m⟧ ∖ {j} ∶ lim inf − Yn ≥ C ∣ ∀i ∈ ⟦m⟧ ∖ {j} ∀n ∈ N ∶ Yn ≤ −c) = 1. n→∞ n

With s ∶= mini=1,...,n{exp(ri)} for any j ∈ ⟦m⟧ (3.19) implies i ¯ j i P(∀i ∈ ⟦m⟧ ∖ {j} ∀n ∈ N ∶ Yn < b ∣ F0) ≥ 1 − θ on {X ∈ Us } = ⋂ {X ≤ s} i∈⟦m⟧∖{j} and thus i ¯ P(∃j ∈ ⟦m⟧ ∀i ∈ ⟦m⟧ ∖ {j} ∶ ∀n ∈ N ∶ Yn < b ∣ F0) ≥ 1 − θ on {X ∈ Us}.

Now we finally come to the proof of the main theorem of this chapter. Proof of Theorem 3.15. Recall the definitions of C and c from above. Note that by Remark 3.6,2 for any k ∈ ⟦m⟧ and every Volterra operator V xk ≠ 0 if, and only if (V x)k ≠ 0. Thus, by disregarding the zero-entries, starting on ∂Sm−1 can be interpreted as starting and considering the same problem on the interior of a lower-dimensional simplex. Therefore, without loss of generality, we can assume x ∈ int Sm−1. Let θ ∈ (0, 1) be arbitrary and setting 1 b ∶= −c choose s as in Proposition 3.21. For ε ∶= min{s, m } let N and q be as in Proposition 3.18. We begin by defining the objects we will need for the proof. Define the stopping time i ˆ τ1 ∶= inf{nN ∣ ∃j ∈ ⟦m⟧ ∀i ∈ ⟦m⟧ ∖ {j} ∶ ZnN < log(ε)} = inf{nN ∣ TnN (x) ∈ Uε}.

142 3.3. Randomization of the model

Proposition 3.18 shows that τ1 is almost surely finite. Set J1 ∶= min{j ∈ ⟦m⟧ ∣ ˆ j τ1 Tτ1 (x) ∈ Uε }. Now for every index i ≠ J1 we start the cut-off version (Yn )n∈N0 of our process given by

τ1,i ∶= ( ) ≥ i Y0 log ε Zτ1 , ⎧ ⎪Zi − Zi , if Zi − Zi ≥ −c Y τ1,i − Y τ1,i ∶= ⎨ τ1+n+1 τ1+n τ1+n+1 τ1+n n+1 n ⎪ ⎩⎪−c, otherwise for all n ∈ N0 and use this to define the stopping time

τ1,i σ1 ∶= inf{n > τ1 ∣ ∃i ≠ J1 ∶ Yn ≥ −c}.

J1 and σ1 are well-defined since τ1 < ∞ P-a.s. Recursively then define

i τk+1 ∶= inf{nN > σk ∣ ∃j ∈ ⟦m⟧ ∶ ∀i ∈ ⟦m⟧ ∖ {j} ∶ ZnN < log(ε)}, ˆ = inf{nN > σk ∣ TnN (x) ∈ Uε}, ∶= { ∈ ⟦ ⟧ ∣ ˆ ( ) ∈ j} Jk+1 min j m Tτk+1 x τk+1 Uε , Y τk+1,i ∶= log(s) ≥ Zi , 0 τk+1 ⎪⎧ i i i i τ + ,i ⎪Z − Z , if Z − Z ≥ −c Y k 1 − Y τk+1,i ∶= ⎨ τk+1+n+1 τk+1+n τk+1+n+1 τk+1+n n+1 n ⎪ ⎩⎪−c, otherwise, τk+1,i σk+1 ∶= inf{n > τk+1 ∣ ∃i ≠ Jk+1 ∶ Yn ≥ −c, } for i ≠ Jk+1, n ∈ N0. Note that on {σk = ∞} we have the existence of a j ∈ ⟦m⟧ such that τk,i for all other i ∈ ⟦m⟧ ∖ {j} ∶ Yn < −c holds, which by Proposition 3.21 and i its definition implies thatn lim →∞ Zn = −∞. This, however, is equivalent to ˆ limn→∞ T (x) ∈ Λ, our desired result. Of course, since some of the above are only well-defined when the corre- sponding stopping times are finite, we begin by considering the probabilities ( < ∞ ∣ F ) = of these events. Again, by Proposition 3.18 we know that P τk+1 σk ˆ 1 on {σk < ∞}. Furthermore, since {τk < ∞} ⊂ {Tτk (x) ∈ Uε} we know that

τk,i P(σk = ∞ ∣ Fτk ) = P(∃j ∈ ⟦m⟧ ∶ ∀i ∈ ⟦m⟧ ∖ {j} ∶ ∀n ∈ N0 ∶ Yn < −c ∣ Fτk ) ≥ 1 − θ

on {τk < ∞} P-a.s. by Proposition 3.21 and therefore P(σk < ∞ ∣ Fτk ) ≤ θ on {τk < ∞}. Combining the results above we see that for every k ∈ N we have

143 Chapter 3. Volterra Stochastic Operators

( < ∞ ∣ F ) ≤ { < ∞} P σk σk−1 θ on σk−1 which we can use to conclude

P(σk < ∞) = P(σk < ∞, . . . , σ1 < ∞) = ( ( < ∞ ∣ F ) 1 ⋯1 ) E P σk σk−1 {σk−1<∞} {σ1<∞} ´¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¸¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¶ ≤θ k ≤ θP(σk−1 < ∞, . . . , σ1 < ∞) ≤ θ ( < ∞) < ∞ iterating the argument used in the last step. Therefore ∑k∈N P σk which by the first Borel-Cantelli lemma implies that P(∃k ∈ N ∶ σk = ∞) = 1. i Since we chose the {∀i ∈ ⟦m⟧ ∖ {j} ∶ ∀n ∈ N ∶ Yn < −c}j∈⟦m⟧ to be pairwise disjoint by Proposition 3.21 we know that

τk,i P(∃j ∈ ⟦m⟧ ∀i ∈ ⟦m⟧ ∖ {j} ∶ lim Yn = −∞ ∣ σk = ∞) = 1 n→∞ and can conclude that

τk,i 1 = P(∃k ∈ N ∃j ∈ ⟦m⟧ ∀i ∈ ⟦m⟧ ∖ {j} ∶ lim Yn = −∞) n→∞ i ≤ P(∃j ∈ ⟦m⟧ ∀i ∈ ⟦m⟧ ∖ {j} ∶ lim Zn = −∞) n→∞ ˆ ≤ P( lim Tn(x) ∈ Λ), n→∞ which completes the proof of Theorem 3.15.

3.4 A Martingale Lemma

1 Lemma 3.22 ([56]). Consider a real-valued process (Yn)n∈N0 that is in L (P) and adapted to a filtration (Fn)n∈N0 such that for some a ∈ R and A, B > 0 we have that for all n ∈ N0 ∶

1. E[Yn+1 ∣ Fn] ≥ Yn + A and

2 2. E[(Yn+1 − E[Yn+1 ∣ Fn]) ∣ Fn] ≤ B on {Yn ≥ a} P-a.s. Then 1 P(lim inf Yn ≥ A ∣ ∀n ∈ N ∶ Yn ≥ a) = 1. (3.20) n→∞ n Moreover for every θ > 0 and every b ∈ R there exists an S ∈ R, such that

P(∀n ∈ N ∶ Yn > b ∣ F0) ≥ 1 − θ (3.21) on {Y0 ≥ S} P-a.s.

144 3.4. A Martingale Lemma

Proof. The proof of (3.21) follows an idea of Rajchman used to prove a strong law of large numbers, see [66, Theorem 2.14]. A similar result with stronger assumptions is given in [99, Lemma 2.6]. We begin with the proof of the first statement and define τ ∶= inf{n ∈ N ∣ Yn < a} as the first time our process jumps below the level a.

We will want to apply Theorem 2.19 from [50] to the sequence ((Yn+1 − [ ∣ F ])1 ) E Yn+1 n {τ>n} n∈N0 . Therefore let Ξ be a random variable such that P(Ξ ≤ 1) = 0 and P(Ξ > 1 + x) = x2 for x > 1. Then E[Ξ log Ξ] < ∞ and since 1 (∣Y − [Y ∣ F ]∣1 > x) ≤ ( [(Y − [Y ∣ F ])21 ] ) ∧ 1 P n+1 E n+1 n {τ>n} E n+1 E n+1 n {τ>n} x2 1 ≤ (B ) ∧ 1 ≤ (B ∨ 1) (Ξ > x) x2 P for all x > 0 and n ∈ N0 the assumptions of the theorem hold and we have 1 n ∑ (Y − [Y ∣ F ]) 1 = (3.22) n i+1 E i+1 i {τ>i} i=1 n 1 n ∑((Y − [Y ∣ F ])1 − [(Y − [Y ∣ F ])1 ∣ F ]) ÐÐ→→∞ 0 n i+1 E i+1 i {τ>i} E i+1 E i+1 i {τ>i} i i=1

P-almost surely. Now observe that

n 1 1 1 lim inf Yn∧τ = lim inf ∑(Yi − Yi−1) {τ>i−1} n→∞ n n→∞ n i=1 n 1 1 1 = lim inf ∑((Yi − Yi−1) {τ>i−1} − E[(Yi − Yi−1) {τ>i−1} ∣ Fi−1] n→∞ n i=1 1 + E[(Yi − Yi−1) {τ>i−1} ∣ Fi−1]) n 1 1 = lim inf ∑((Yi − E[Yi ∣ Fi−1]) {τ>i−1} n→∞ n i=1 1 + E[(Yi − Yi−1) {τ>i−1} ∣ Fi−1]) n (3.22) 1 1 = lim inf ∑ E[(Yi − Yi−1) {τ>i−1} ∣ Fi−1] n→∞ n i=1 n 1 1 = lim inf ∑ (E[Yi ∣ Fi−1] − Yi−1) {τ>i−1} n→∞ n i=1 ´¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¸¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¶ ≥A n ∧ τ ≥ lim inf A n→∞ n

145 Chapter 3. Volterra Stochastic Operators and therefore 1 lim inf Yn ≥ A on {τ = ∞} n→∞ n which proves the first statement. ¯ To prove the second statement we start by considering a process (Yn)n∈N with the same properties as (Yn)n∈N0 , but without the restriction on the size of the predecessor, i.e. such that for all n ∈ N0 ¯ ¯ (1’) E[Yn+1 ∣ Fn] ≥ Yn + A and

¯ ¯ 2 (2’) E[(Yn+1 − E[Yn+1 ∣ Fn]) ∣ Fn] ≤ B

P-almost surely. With this define

i ¯ ¯ hi+1 ∶= Yi+1 − E[Yi+1 ∣ Fi],Si ∶= ∑ hj j=1 for all i ∈ N0. Note that due to (2’) we know

2 E[hi ∣ F0] ≤ B and E[hihj ∣ F0] = 0 holds for every i, j ∈ N0, i ≠ j P-a.s. For arbitrary constants c1 ≥ c2 ≥ 0 and α1 > α2 > 0 we can then estimate

P(∃ m ∈ N ∶ Sm ≤ −c1 − α1m ∣ F0) 2 ≤ P(∃ n ∈ N ∶ Sn2 ≤ −c2 − α2n ∣ F0) 2 2 2 + P(∃ n ∈ N ∃ m ∈ [n , (n + 1) − 1] ∶ Sm − Sn2 ≤ −(c1 − c2) − (α1 − α2)n ∣ F0) 2 ≤ ∑ P(Sn2 ≤ −c2 − α2n ∣ F0) n∈N 2 (n+1) −1 2 + ∑ ∑ P(Sm − Sn2 ≤ −(c1 − c2) − (α1 − α2)n ∣ F0) n∈N m=n2 ´¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¸¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¶ m [ 2∣F ] ∑ E h 0 i=n2+1 i ≤ 2 2 ((c1−c2)+(α1−α2)n ) Bn2 Bn(2n + 1) ≤ ∑ + ∑ (c + α n2)2 ((c − c ) + (α − α )n2)2 n N 2 2 n N 1 2 1 2 ´¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¸¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¶∈ ´¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¸¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¶∈ =∶f(c2,α2) =∶g(c1−c2,α1−α2)

P-a.s.,where limc2→∞ f(c2, α2) = 0 and limc→∞ g(c, α1 − α2) = 0. This means that for every θ > 0 (and every choice of α1 > α2 > 0) choosing c2 large enough

146 3.4. A Martingale Lemma

θ θ for f(c2, α2) ≤ 2 and then c1 large enough such that g(c1 − c2, α1 − α2) < 2 we have

P(∃ m ∈ N ∶ Sm ≤ −c1 − α1m ∣ F0) ≤ f(c2, α2) + g(c1 − c2, α1 − α2) ≤ θ ¯ P-a.s. Using α1 ∶= A we obtain the following for our process (Yn)n∈N0 : For every θ > 0 and every point b choosing S ∶= c1 + b for c1 as above we see that ¯ on {Y0 ≥ S}

m ¯ ¯ ¯ ¯ P(∃ m ∈ N ∶ Ym ≤ b ∣ F0) = P(∃ m ∈ N ∶ Y0 + ∑(Yi − Yi−1) ≤ b ∣ F0) i=1 m ¯ ¯ ≤ P(∃ m ∈ N ∶ S + ∑(Yi − Yi−1) ≤ b ∣ F0) i=1 m ¯ ¯ = P(∃ m ∈ N ∶ ∑(Yi − Yi−1) ≤ b − S ∣ F0) i=1 m m ¯ ¯ ≤ P(∃ m ∈ N ∶ ∑ hi ≤ b − S − ∑ E[Yi − Yi−1 ∣ Fi−1] ∣ F0) i=1 i=1 ´¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¸¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¶ ≥A by (1’) m ≤ (∃ m ∈ ∶ ∑ h ≤ b − S −m A ∣ F ) ≤ θ, P N i ± ® 0 i=1 =−c1 =α1 which means that for every θ > 0 and b ∈ R we can find an S such that ¯ ¯ P(∀ m ∈ N ∶ Ym > b ∣ F0) ≥ 1 − θ on {Y0 ≥ S}. ¯ Coming back to (Yn)n∈N0 use it to define such a process (Yn)n∈N0 through ¯ Y0 ∶= Y0 and ⎧ ⎪Y − Y , if τ > n Y¯ − Y¯ ∶= ⎨ n+1 n n+1 n ⎪ ⎩⎪A, otherwise. This process has the stronger properties (1’) and (2’) and since we also have ¯ {∀n ∈ N0 ∶ Yn ≥ a} = {∀n ∈ N0 ∶ Yn ≥ a} the above observation yields the second statement.

147

Chapter 4

A Random Dynamical System

As the name suggests the field of Random Dynamical Systems lies at the intersection of dynamical systems and probability theory and is concerned with a specific kind of non-autonomous dynamical systems. A great source of interest comes from the observation that stochastic differential equations (SDEs) do not only provide a family of stochastic processes each solving the SDE, but naturally generate these random dynamical systems. Their richer structure in turn makes it possible to improve classical results previously based on the Markov transition probabilities. In particular, it can trace (virtually) any set of initial conditions simultaneously instead of being limited to one-point motions. The simplest such random dynamical system is a product of random map- pings: At each discrete time-step n a random selection mechanism chooses a map φn (out of a given set of maps) and the point xn is mapped to xn+1 ∶= φn(xn) and the procedure is iterated, always with the same (ran- dom) selection mechanism. This mechanism is clearly reminiscent of the set-up in Chapter 3, Section 3.3 and, indeed, as we will see in Definition 4.4 below, this is precisely the random dynamical system at the focus of this chapter. It begins with a brief introduction of the main notions of random dynam- ical system and attractor as well as the definition of the random dynamical system to be worked on in this chapter. Section 4.2 focuses on the evolution of this RDS forward in time. Here, we conclude the existence of a strong minimal pullback point-attractor in Section 4.2.1. Section 4.2.2 is concerned with forward convergence and the properties of the (random) sets M i of points converging to a given ei ∈ Λ i ∈ ⟦m⟧. We prove that these sets are φ- invariant and measurable. Restricting the system to d = 2, we can even prove that they are open and path connected. Section 4.3 then considers what is essentially the inverse of RDS and to this end, is restricted to d = 2. We

149 Chapter 4. A Random Dynamical System can show the existence of a strong global pullback attractor in this system. With additional restrictions on the operators allowed, we can then conclude that this attractor is indeed a singleton, and hence, that synchronization oc- curs. The chapter concludes with the introduction of so-called ∆-attractors, that attract compact sets of Hausdorff-dimension ∆. The main result in this Section 4.4 then states that Λ is indeed a ∆-attractor for some ∆ > 0. We summarize here some notation needed in the remainder of the chapter. Notation 4.1 Let (E, d) be a metric space. For any x ∈ E and any subset ∅ ≠ B ⊆ E define

d(x, B) ∶= inf d(x, y) y∈B and for any other set ∅ ≠ A ⊆ E

d(A, B) ∶= sup d(x, B) = sup inf d(x, y) y B x∈A x∈A ∈ the Hausdorff semi-distance induced by d. Using the same letter for the metric and for the Hausdorff semi-distance induced by it should not cause confusion and will be done so consistently for any arising metric. Note that the Hausdorf semi-distance is not symmetric. The Hausdorff metric is then defined by dh(A, B) ∶= max{d(A, B), d(B,A)}, but we will not make further use of it. For a set A ⊆ E and a δ > 0, set Aδ ∶= {x ∈ E ∣ d(x, A) < δ}. From the previous chapter, recall ⟦m⟧ ∶= {1, . . . , m} and the definition of the simplex

m m−1 m S ∶= {x ∈ [0, 1] ∣ ∑ xi = 1} i=1

m 1 m 1 as well as the notation for its interior int S − ∶= {x ∈ S − ∣ ∀i ∈ ⟦m⟧ ∶ xi > 0} and its boundary ∂Sm−1 ∶= Sm−1 ∖ int Sm−1. In this chapter, we consider the metric space (Sm−1, d) of Sm−1 equipped with the euclidian metric d in Rm, i.e. d(x, y) ∶= ∥x − y∥ for any x, y ∈ Sm−1 (or indeed x, y ∈ Rm).

4.1 Introduction to Random Dynamical Sys- tems and Attractors

The purpose of this section is to provide the reader with the necessary ba- sic notions and assertions related to random dynamical systems as needed for this thesis. For a comprehensive account (and reference for the general

150 4.1. Introduction to Random Dynamical Systems and Attractors content of this section), we refer the reader to [1]. In addition, the specific random dynamical system this chapter is concerned with is introduced in Definition 4.4.

+ Definition 4.2. Let (Ω, F, P) be a probability space, T1 ∈ {R0 , R, N0, Z} endowed with its Borel-σ-algebra B(T1). Denote by (ϑt)t∈T1 a family of maps such that

i (ω, t) ↦ ϑt(ω) is F ⊗ B(T1)-F-measurable,

−1 ii ∀ t ∈ T1: P ○ ϑt = P

iii ϑ0 = IdΩ and ∀ t, s ∈ T1 ∶ ϑt+s = ϑt ○ ϑs.

Then (Ω, F, P, (ϑt)t∈T1 ) is called a metric dynamical system. + Now let (E, d) be a metric space and T2 ∈ {R0 , R, N0, Z} such that T2 ⊆ T1, endowed with their Borel-σ-algebras B(E) and B(T2) respectively. Consider a map

φ ∶ T2 × Ω × E → E with the properties

iv φ is B(T2) ⊗ F ⊗ B(E)-B(E)-measurable and using the notation φ(t, ω) ∶= φ(t, ω, ⋅) ∶ E → E for t ∈ T2 and ω ∈ Ω

v ∀ ω ∈ Ω ∶ φ(0, ω) = IdE and

vi ∀ t, s ∈ T2 ∶ φ(t + s, ω) = φ(t, ϑsω) ○ φ(s, ω).

Such a φ is called a cocycle and the ensemble (Ω, F, P, (ϑt)t∈T1 , φ) is a Randon Dynamical System (RDS). It is called continuous RDS, if in addition

vii ∀ω ∈ Ω ∶ φ( ⋅ , ω, ⋅ ) ∶ T2 × X → X is continuous. Remark 4.3.

1. We will often refer to the property vi as the cocycle property of φ. If the metric dynamical system is clear from the context we will often refer to only φ as the random dynamical system.

t 2. If T1 ∈ {N0, Z}, then ϑt = ϑ1 for any t ∈ T1. Hence, condition i reduces to the F-F measurability of ϑ ∶= ϑ1. In the same way, ii reduces to the P-invariance of ϑ.

151 Chapter 4. A Random Dynamical System

3. If T1 ∈ {R, Z}, it automatically implies that ϑt is measurably invertible −1 −t with ϑt = ϑ , for any t ∈ T1.

4. If T2 ∈ {N0, Z}, condition iv is equivalent to (ω, x) ↦ φ(1, ω, x) being F ⊗ B(E)-B(E) measurable. Likewise, vii is replaced by the condition of continuity of x ↦ φ(1, ω, x) (for every ω ∈ Ω).

5. If T2 ∈ {R, Z}, for all t ∈ T2 and ω ∈ Ω φ(t, ω) is a bimeasurable bijection of X and

−1 φ(t, ω) = φ(−t, ϑtω).

1 Moreover, the mapping (t, ω, x) ↦ φ(t, ω)− x is B(T2) ⊗ F ⊗ B(E)- B(E)-measurable. (See Theorem 1.1.6 in [1].) 6. The condition of continuity of the RDS needs to be handled with care when consulting the literature, as sometimes, only continuity of φ(t, ω, ⋅ ) ∶ X → X (for every ω ∈ Ω, t ∈ T2) is required. Now that we have prepared the general definition, we introduce the RDS that is at the core of this Chapter and arises in the set-up considered in the previous chapter, more precisely in Section 3.3. Recall the set-up of Theorem 3.15: V(d) is the set of all Volterra Poly- nomial Stochastic Operators of degree d on the simplex Sm−1 (although we often simply write V) and Υ the corresponding Borel-σ-algebra. Vk(d) is the set of all VPSOs, that are pure-bred of level at least d − 2. ν is a probability measure on (V(d), Υ) such that for all k ∈ ⟦m⟧ ν(Vk(d)) > 0. Definition 4.4. Let d ≥ 2. Set

Ω ∶= V(d)Z, F ∶= Υ⊗Z, P ∶= ν⊗Z. (4.1) Define ϑ ∶ Ω → Ω by

∀ ω ∈ Ω, ∀ i ∈ Z ∶ (ϑω)i ∶= ωi+1

z and with it the family ϑz ∶= ϑ , z ∈ Z. For d ≥ 3, define

m−1 m−1 φd ∶ N0 × Ω × S → S as ⎧ ⎪ω ○ ⋯ ○ ω x, for n ∈ , φ (n, ω, x) ∶= ⎨ n 1 N (4.2) d ⎪ ⎩⎪x, for n = 0.

152 4.1. Introduction to Random Dynamical Systems and Attractors

For d = 2, define ⎧ ⎪ωz ○ ⋯ ○ ω1x, for z ∈ N, ⎪ φ2(z, ω, x) ∶= ⎨x, for z = 0, (4.3) ⎪ ⎪ω−1 ○ ⋯ ○ ω−1x, for z ∈ − . ⎩ z+1 0 N Remark 4.5.

1. (Ω, F, P, (ϑz)z∈Z) is clearly a metric dynamical system (with T1 = Z). In addition, since P is an ‘i.i.d.’ product measure and ϑ is the shift in the sequences, it is ergodic, i.e. ϑ1 is ergodic with respect to P.

2. Sm−1 is a metric space with the euclidian metric d. As such it is even Polish, i.e. separable and complete.

3. The construction in (4.2) and (4.3) is the standard construction for RDS with discrete one-sided, respectively two-sided time T2, cf. Sec- tion 2.1 in [1]. However, for this we need to assure that (ω, x) ↦ m 1 m 1 φd(1, ω, x) is F ⊗B(S − )-B(S − ) measurable. A priori we know that m 1 ω ↦ φd(1, ω, x) is F-F measurable, for every x ∈ S − . Since VPSOs are continuous, we also have the continuity of x ↦ φd(1, ω, x) for any ω ∈ Ω. Lemma 1.1 in [14] (repeated as Proposition 4.46 in Section 4.5.1) then yields the desired joint measurability. Note also, that the continuity of VPSOs implies that our RDS φd is, indeed, a continuous RDS for any d ≥ 2.

4. Due to the structure of P, the RDS defined above is a product of random mappings, i.e. it has independent increments, cf. [1] Section 2.1.3.

Hence, for any n ∈ N, the σ-algebras Fn ∶= σ(φ(k, ⋅ ) ∣ k ≤ n) and ϑ−nF are independent.

5. Recall Theorem 3.1 stating that Volterra quadratic stochastic opera- tors, i.e. for d = 2, are homeomorphisms on Sm−1. The differentiation between (4.2) and (4.3) is necessary, because we do not have such a result for general d ≥ 3. Whenever possible, we have formulated (and proven) the results in this chapter for general d ≥ 2. The exceptions to this are Propositions 4.26 and 4.28 and the entire Section 4.3.

6. We will often write φd(z, ω)x instead of φd(z, ω, x) (for every z ∈ T2, ω ∈ Ω and x ∈ Sm−1) emphazising the underlying idea of random mappings.

7. If we define Tn(ω) ∶= φd(n, ω), for every n ∈ T2 and ω ∈ Ω, we obtain a

sequence (Tn)n∈N0 of i.i.d. random VSPOs (of degree d) with T1 ∼ ν as

153 Chapter 4. A Random Dynamical System

considered in Theorem 3.15. For d = 2 we even obtain a double-sided

sequence (Tz)z∈Z. As mentioned in the introduction, one of the interesting aspects of em- bedding a Markovian structure into a random dynamical system is that one steps out of the boundaries of considering only one-point motions to tracing the behaviour of (almost) arbitrary sets. In Theorem 3.15 we saw that in the set-up of our RDS the one-point motions converge P-almost surely to Λ = {e1, . . . , em}. In order to embed this result in the context of RDS (and extend it), we need to introduce the notion of an attractor. This is one of the basic concepts in the theory of random dynamical systems. However, there are several different approaches to this term. They typically have in common that the attractor should be a ‘compact random set’ and ‘invariant’ under the actions of the flow (the terminology will be explained below). They then differ mainly in the types of sets they attract (points, bounded or compact sets, for example) and in how these sets are attracted. Before that, we need some preliminary definitions. Definition 4.6. 1. Let (E, d) be a Polish space. Recall that P(E) is used to denote the power-set of E. A set-valued map C ∶ Ω → P(E) is said to be measurable, if for each x ∈ E, the map

ω ↦ d(x, C(ω))

is F-B(E) measurable. Then C is called a random set. Such a map is called a compact random set, if, in addition, for all ω ∈ Ω, C(ω) is a compact subset of E.

2. Let (Ω, F, P, (ϑt)t∈T1 , φ) be a random dynamical system. A random set C is said to be (forward) invariant for φ, if for every t ∈ T2

φ(t, ω)B(ω) ⊂ B(ϑtω) for P-almost all ω ∈ Ω. It is called strictly φ-invariant, if

φ(t, ω)B(ω) = B(ϑtω) for P-almost all ω ∈ Ω.

We are now ready to define the notion of an attractor of anRDS.

Definition 4.7. Let (Ω, F, P, (ϑt)t∈T, φ) be a random dynamical system over the metric space (E, d). Let C ⊆ P(E) be an arbitrary subset of the power-set of E and A a compact random set that is strictly φ-invariant.

154 4.1. Introduction to Random Dynamical Systems and Attractors

1. A is called a strong forward C-attractor, if for every C ∈ C

lim d(φ(t, ω)C,A(ϑtω)) = 0 P-a.s. t→∞ 2. A is called a strong pullback C-attractor, if for every C ∈ C

lim d(φ(t, ϑ−tω)C,A(ω)) = 0 P-a.s. t→∞ 3. A is called a weak C-attractor, if for every C ∈ C

lim d(φ(t, ω)C,A(ϑtω)) = 0 in probability. t→∞ Remark 4.8. The notions of forward and pullback attractor coincide when the convergence is considered to be in probability only. (See also Remark 4.14.) Hence the absence of such a denomination in the definition of a weak attractor. For a comparison of different concepts of attractors we recom- mend [98] as well as [13]. Oftentimes strong pullback attractors are simply referred to as strong attractors, since forward attractors are less prominent in the literature. However, we will encounter examples of both notions in this work, thus we prefer the usage of strong pullback attractor to avoid misunderstandings. Definition 4.9. In the context of Definition 4.7: If C = K ∶= {K ∈ B(E) ∣ K ≠ ∅ is compact }, then A is called global attractor. If C = {{x} ∣ x ∈ Sm−1}, then A is called point attractor. Remark 4.10. It was proven in [31], Lemma 1.3, that the weak global attractor of a continuous RDS is P-almost surely unique. Hence, if they exist, the strong global forward attractor and the strong global pullback attractor must coincide and they are unique P-almost surely. (However, they need not exist.) Since uniquenes does not necessarily hold for other than global attractors, we introduce the notion of a minimal attractor.

Definition 4.11. Let (Ω, F, P, (ϑt)t∈T, φ) be a random dynamical system over the metric space (E, d) and C ⊆ P(E) be an arbitrary subset of the power-set of E. A (strong/weak forward/pullback) C-attractor for φ is called minimal, if it is minimal among all such (strong/weak forward/pullback) attractors for φ with respect to inclusion. It was recently proven in [16], that such a minimal attractor exists, if any such attractor exists and all sets in C are compact. We are now ready to explore the attractors in the random dynamical system defined in 4.4.

155 Chapter 4. A Random Dynamical System

4.2 Evolution of the RDS forward in time

As we saw in Definition 4.4, the set-up from Chapter 3.3 can be naturally embedded in the context of random dynamical systems.

4.2.1 Pullback attractors of our RDS When considering Volterra PSOs of degree d = 2 we also immediately obtain a very simple example of an (strong pullback) attractor, which we will treat with slightly more details than we ususally would, for it serves as our first example of this concept. Recall that by Theorem 3.1 (Theorem 5 in [41]), quadratic Volterra operators are homeomorphisms.

m 1 m 1 Remark 4.12. For φ2 on S − , S − itself is the strong global pullback attractor. The conditions are easy to verify:

1. Sm−1 is a compact set and since it is non-random the measurability is clear.

m 1 m 1 2. For all ω ∈ Ω and z ∈ Z we have φ2(z, ω)S − = S − , since all elements of V are homeomorphisms, so in particular surjective.

3. For all B ∈ K and all ω ∈ Ω

m−1 lim d(φ2(z, ϑ−zω)B,S ) = 0. z→∞ ´¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¸¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¶ =0

As it turns out though, Sm−1 is not the strong minimal pullback point attractor for any d ∈ N. The following result was published in [56] as Theorem 4.3 for VQSOs.

m 1 Theorem 4.13. Let d ≥ 2. For the RDS φd on S − the strong minimal pullback point attractor is the set Λ ∶= {e1, ..., em}.

Proof. Again, the measurability and compactness properties of a point at- tractor clearly hold for Λ. Observing for example Corollary 3.10, we see that e1, . . . , em are fixed points for any Volterra operator, thus the invariance prop- erty holds, too. This also implies that each point attractor has to contain Λ. Therefore, it only remains to show that for each x ∈ Sm−1, we have

lim d(φ(n, ϑ−nω)x, Λ) = 0, P-a.s. n→∞

156 4.2. Evolution of the RDS forward in time

We obtain this convergence in probability from Theorem 3.15: Using the notation in the theorem it straighforward implies

∀ε > 0 lim P(d(Tn ○ ... ○ T1(x), Λ) > ε) = 0. n→∞

Recalling the Definition 4.4 of our RDS we observe that forany n ∈ N0

P ({ω ∈ Ω ∣ d(Tn(ω) ○ ... ○ T1(ω)(x), Λ) > ε}) = P ({ω ∈ Ω ∣ d(φ(n, ω)x, Λ) > ε})

= P ({ω ∈ Ω ∣ d(φ(n, ϑ−nω)x, Λ) > ε}) where we used that P is ϑ−1-invariant in the last equality. In order to infer almost sure convergence from this, it suffices to show that convergence in probability happens sufficiently quickly. In fact, thanks to the first Borel- Cantelli Lemma, it suffices to prove that for each ε > 0, we have

∞ ∞ c ∑ P(d(φ(n, ϑ−nω)x, Λ) > ε) = ∑ P(Tn ○ ... ○ T1(x) ∈ Uε ) < ∞. n=1 n=1 Observe that Propositions 3.18 and 3.21 together show that the summands on the right-hand side converge to zero exponentially quickly and therefore the assertion follows.

Remark 4.14. Let us briefly comment on the difference between Theorem 3.15 and Theorem 4.13, which amounts to a difference between strong forward and pullback attractors. Written in the terminology of RDS the assertion in Theorem 3.15 reads

lim d(φ(n, ω)x, Λ) = 0 P-a.s. (4.4) n→∞ whereas the (essential) content of Theorem 4.13 is

lim d(φ(n, ϑ−nω)x, Λ) = 0 P-a.s. (4.5) n→∞ As we saw in the proof above we can relate the two directions using the ϑ−1-invariance of P and for example obtain

P ({ω ∈ Ω ∣ d(φ(n, ω)x, Λ) > ε}) = P (ϑn {ω ∈ Ω ∣ d(φ(n, ω)x, Λ) > ε}) (4.6)

= P ({ω ∈ Ω ∣ d(φ(n, ϑ−nω)x, Λ) > ε})

157 Chapter 4. A Random Dynamical System for any ε > 0 and x ∈ Sm−1 and therefore the expressions in (4.4) and in (4.5) are equivalent if we replace the type of convergence by convergence in probability. Indeed, the same argument actually yields

d φ(n, ⋅)x = φ(n, ϑ−n⋅)x d or Tn ○ ... ○ T1(x) = T0 ○ ... ○ T1−n(x) thus yielding the equivalence of the notions of (any) forward and pullback attractor in probability. But note that we can only use this correspondence for arbitrary but fixed n ∈ N0 as we use the shift corresponding to the time or number of steps taken, cf. (4.6). Thus we cannot use this to relate the sequences as a total indicating that the notions differ in the case of almost sure convergence. See also [98].

4.2.2 Considering forward convergence

As we observed for the pullback-case in Remark 4.14, Sm−1 is indeed also the m 1 strong global forward attractor for φ2 on S − . On the other hand rewriting the claim of Theorem 3.15 in the terminology of random dynamical systems, we obtain that (for any d ≥ 2)

m−1 ∀ x ∈ S ∶ P ( lim φd(n, ⋅ )x ∈ Λ) = 1. n→∞ Since we already observed that the measurability and invariance properties hold for the set Λ, the convergence implies that Λ is indeed also the minimal strong forward point attractor. In this context one can also examine the (random) sets of sites converging to a specific vertex in Λ:

Definition 4.15. For any i ∈ ⟦m⟧ and ω ∈ Ω define

i m−1 M (ω) ∶= {x ∈ S ∣ lim φd(n, ω)x = ei} . n→∞

Since every vertex in Λ is a fixed point for any operator in V, these sets are non-empty. Proposition 3.21, however, in addition allows to conclude i the a.s. existence of points x ∈ M ∖ {ei} for every i ∈ ⟦m⟧. We will see in Proposition 4.24 that the sets M i even contain balls of positive radii.

Remark 4.16. For any d ≥ 2 the sets M i, i ∈ ⟦m⟧ are forward invariant. To see this, fix ω ∈ Ω and let x ∈ M i for an i ∈ ⟦m⟧ of your choice. Then, using

158 4.2. Evolution of the RDS forward in time the cocycle-property of the RDS

ei = lim φd(n, ω)x n→∞ = lim φd(n − 1, ϑ1ω) φd(1, ω)x n→∞ ´¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¸¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¶ =∶y

= lim φd(n − 1, ϑ1ω)y n→∞

i i i and thus φd(1, ω)x = y ∈ M (ϑ1ω) implying φd(1, ω)M (ω) ⊆ M (ϑ1ω), i.e. the forward invariance. For d = 2, since we have two-sided time, the sets M i, i i ∈ ⟦m⟧ are even strictly invariant. Again, fixing ω ∈ Ω, choosing y ∈ M (ϑ1ω), the cocycle-property implies

ei = lim φ2(n, ϑ1ω)y n→∞

= lim φ2(n − 1, ϑ−1ϑ1ω) φ2(−1, ϑ1ω) y n→∞ ´¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¸¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¶ −1 =(φ2(1,ω)) −1 = lim φ2(n, ω)(φ2(1, ω)) y. n→∞

1 i i Hence, (φ2(1, ω))− M (ϑ1ω) ⊆ M (ω) and together with the more general i i observation above we obtain φ2(1, ω)M (ω) = M (ϑ1ω) for all ω ∈ Ω and i ∈ ⟦m⟧. Note that we did not make use of any properties specific to our RDS.

Notation 4.17 For our probability space (Ω, F, P), we denote by F 0 the completion of F with respect to P, i.e. the σ-algebra given by

0 0 0 F ∶= {A ∪ N ∣ A ∈ F and ∃ N ∈ F ∶ P(N) = 0 and N ⊆ N} .

Proposition 4.18. For any d ≥ 2 the sets M i, i ∈ ⟦m⟧, are measurable random sets in the sense that

m 1 i 0 ∀ x ∈ S − ∶ ω ↦ d(x, M (ω)) is F -B(R) measurable. Proof. Without loss of generality, consider i = 1. We first observe, that the graph of the random set M 1 is measurable with respect to F ⊗ B(Sm−1):

graphM 1 ∶ = {(ω, x) ∈ Ω × Sm−1 ∣ x ∈ M 1(ω)} m−1 m−1 = {(ω, x) ∈ Ω × S ∣ lim φd(n, ω)x = e1} ∈ F ⊗ B(S ) n→∞

m 1 m 1 m 1 m 1 since φd(n, ⋅ )⋅ ∶ Ω × S − → S − is F ⊗ B(S − )-B(S − ) measurable for every n ∈ N. Additionally, note that the continuity of d ∶ (Sm−1)2 → R

159 Chapter 4. A Random Dynamical System implies that the map (ω, y) ↦ d(x, y) is F ⊗ B(Sm−1)-measurable for any fixed x ∈ Sm−1. Then Corollary 2.13 in [14] (see Proposition 4.47 in Section 4.5.1) implies the measurability of

ω ↦ − sup −d(x, y) = d(x, M 1(ω)) y∈M 1(ω) for any x ∈ Sm−1 with respect to F 0. In order to further explore these sets, we investigate the forward behavior of our RDS in more depth. The following observation will prove useful not only in this section, but also in the following Section 4.3. Recall that Vk, k ∈ ⟦m⟧, was defined in Definition 3.14 as the set ofall PSOs (of degree d ≥ 2) for which the type k was purebred of purity level d−2, which amounts to say that at least two of its parents have to be of type k for the offspring to be of this type. As we saw in Proposition 3.9, thisis a strong disadvantage for this type k, since such an operator will essentially square its fraction of the population. We will use this to show that sets that are sufficiently close to the boundary of Sm−1 will converge to it uniformly with arbitrarily high probability. Notation 4.19 For any i ∈ ⟦m⟧ and h ∈ [0, 1] we define the sets

i m−1 Dh ∶= {x ∈ S ∣ xi ≤ h} , i Dh ∶= ⋃ Dh, i∈⟦m⟧

m 1 and for any set C ⊆ S − define hi(C) ∶= sup{xi ∣ x ∈ C}.

Recall that the measure ν was such that ν ∶= min{ν1, . . . , νm} > 0. Lemma 4.20. For every d ≥ 2 and h ∈ [0, 1],

i −α1 α1 ∀ i ∈ ⟦m⟧ ∶ P ( lim hi(φd(n, ⋅ )Dh) = 0) ≥ 1 − κ h (4.7) n→∞ for any

⎧ 1 ⎫ ⎪ α1 ⎪ − log(1 − ν) ⎪⎛1 − (1 − ν)dα1 ⎞ ⎪ −1 0 < α1 < and κ ∶= min ⎨ α , d ⎬ . log(d) ⎪⎝ ν(d) 1 ⎠ ⎪ ⎩⎪ 2 ⎭⎪ Since ⟦m⟧ is finite, observing that (4.7) translates to

i m−1 −α1 α1 ∀ i ∈ ⟦m⟧ ∶ P ( lim d(φd(n, ⋅ )Dh, ∂S ) = 0) ≥ 1 − κ h , n→∞ immediately implies the following corollary.

160 4.2. Evolution of the RDS forward in time

Figure 4.1: Illustration of the sets defined for Lemma 4.20 and Corollary 4.21 for the case of m = 3.

Corollary 4.21. For every d ≥ 2 and h ∈ [0, 1]

m−1 −α1 α1 P ( lim d(φd(n, ⋅)Dh, ∂S ) = 0) ≥ 1 − mκ h . n→∞

Proof of Lemma 4.20. Recall that by Proposition 3.9 and Corollary 3.10 in Chapter 3, Section 3.2, for any set C ⊆ Sm−1 and every k ∈ ⟦m⟧

V ∈ V ⇒ hk(VC) ≤ dhk(C) d and V ∈ V ⇒ h (VC) ≤ ( )h (C)2. k k 2 k

Without loss of generality assume k = 1. Fix h ∈ [0, 1] and define for ω ∈ Ω and n ∈ N0

1 H0(ω) ∶= h1(Dh) ⎧ ⎪(d)H2(ω) ∧ 1, if ω ∈ V H (ω) ∶= ⎨ 2 n n+1 1 n+1 ⎪ ⎩⎪dHn(ω) ∧ 1, otherwise. This is a time-homogeneous Markov chain with transition probabilities

d (H = ( )h¯2 ∧ 1 ∣ H = h¯) = 1 − (H = dh¯ ∧ 1 ∣ H = h¯) = ν(V ) > 0 P 1 2 0 P 1 0 1

¯ d for any h ∈ [0, 1]. By Proposition 4.51 in Section 4.5.2, with a = (2), b = d and β = 2 we can conclude that

−α1 α1 lim P ( lim Hn = 0 ∣ H0 = h) ≥ 1 − κ h . h↓0 n→∞

161 Chapter 4. A Random Dynamical System

1 − d 2 d 2 −1 Note that, since (2) ≤ d , also (2) ≥ d and therefore

1 1 ⎧ 1 ⎫ ⎧ ⎫ ⎪ α1 ⎪ ⎪ α1 ⎪ ⎪⎛1 − (1 − ν)dα1 ⎞ d − 2 ⎪ ⎪⎛1 − (1 − ν)dα1 ⎞ ⎪ −1 −1 κ = min ⎨ α , d , ( ) ⎬ = min ⎨ α , d ⎬ ⎪⎝ ν(d) 1 ⎠ 2 ⎪ ⎪⎝ ν(d) 1 ⎠ ⎪ ⎩⎪ 2 ⎭⎪ ⎩⎪ 2 ⎭⎪

At the same time (Hn)n∈N0 is coupled to our RDS such that

1 ∀ω ∈ Ω ∀n ∈ N0 ∶ h1(φd(n, ω)Dh) ≤ Hn(ω). Thus

1 −α1 α1 P( lim h1(φd(n, ⋅)Dh = 0) ≥ P ( lim Hn = 0 ∣ H0 = h) ≥ 1 − κ h . n→∞ n→∞

Lemma 4.20 now allows us to see that around each vertex of Λ there must be a (random) neighborhood consisting only of points converging to this vertex.

Definition 4.22. We call a set C ⊆ Sm−1 monochromatic in ω ∈ Ω, if there exists an i ∈ ⟦m⟧ such that all the elements of C converge to ei under the action of our RDS, i.e. if

∀ x ∈ C ∶ lim φd(n, ω)x = ei. n→∞

In this case we might also say that C is monochrome of the color of ei. By definition, the sets M i, i ∈ ⟦m⟧ are monochromatic. m 1 Notation 4.23 For any r > 0 and x ∈ S − we denote by Br(x) the open ball of radius r around x, i.e.

m−1 Br(x) ∶= {y ∈ S ∣ d(x, y) < r}.

Proposition 4.24. For any d ≥ 2 our RDS φd is such that:

P(∃ r > 0 ∀ i ∈ ⟦m⟧ ∶ Br(ei) is monochrome) = 1.

Of course, Br(ei) is then necessarily monochrome of the color of ei. Proof. According to Lemma 4.20 since ⟦m⟧ is finite, similar to the observation in Corollary 4.21, we know

i lim P (∀ i ∈ ⟦m⟧ ∶ lim hi(φd(n, ⋅ )Dh) = 0) = 1. h→0 n→∞

162 4.2. Evolution of the RDS forward in time

But since these sets are increasing in h > 0, i.e. for any h′ ≥ h

i {ω ∈ Ω ∣ ∀ i ∈ ⟦m⟧ ∶ lim hi(φd(n, ω)Dh′ ) = 0} n→∞ i ⊆ {ω ∈ Ω ∣ ∀ i ∈ ⟦m⟧ ∶ lim hi(φd(n, ω)Dh) = 0} n→∞ we can write

i 1 = lim P (∀ i ∈ ⟦m⟧ ∶ lim hi(φd(n, ⋅ )D0) = 0) h→0 n→∞ i = P( ⋃ {∀ i ∈ ⟦m⟧ ∶ lim hi(φd(n, ⋅ )Dh) = 0}) n→∞ h∈[0,1]∩Q i = P ({∃ h > 0 ∀ i ∈ ⟦m⟧ ∶ lim hi(φd(n, ⋅ )Dh) = 0}) . n→∞ m 1 Now observe that for any x ∈ S − , if d(x, ei) < h, then xj < h for all other j ∈ ⟦ ⟧∖{ } ∈ j ( ( ) ) = m i and thus x ⋂j∈⟦m⟧∖{i} Dh. At the same time, if limn→∞ φd n, ω x j 0 for all j ∈ ⟦m⟧ ∖ {i}, then limn→∞ φd(n, ω)x = ei. Hence we obtain

i 1 = P (∃ h > 0 ∀ i ∈ ⟦m⟧ ∶ lim hi(φd(n, ⋅ )Dh) = 0) n→∞ ≤ P (∃ h > 0 ∀ i ∈ ⟦m⟧ ∀ x ∈ Bh(ei) ∶ lim φd(n, ⋅ )x = ei) n→∞ i = P (∃ h > 0 ∀ i ∈ ⟦m⟧ ∶ Bh(ei) ⊆ M ( ⋅ )) which proves the claim. If we restrict to considering our RDS for d = 2, we can specify the state- ments about the largest radius, such that a ball around a vertex of Λ with this radius is still monochrome. Notation 4.25 For any i ∈ ⟦m⟧ and any set C ⊆ Sm−1 define

i r (C) ∶= sup{r > 0 ∣ Br(ei) ⊆ C}, with the convention sup ∅ = −∞.

Proposition 4.26. In the set-up of our RDS from Definition 4.4 let d = 2. Then there exists R = (R1,...,Rm) ∶ Ω → Rm satisfying 1. R is F-B(Rm) measurable,

i 2. P (mini∈⟦m⟧ R > 0) = 1, and such that in addition, for (Rz)z∈Z defined by i i ∀z ∈ Z ∀ω ∈ Ω ∶ Rz(ω) ∶= R (ϑzω) for every i ∈ ⟦m⟧, the following holds true:

163 Chapter 4. A Random Dynamical System

3. (Rz)z∈Z is a stationary and ergodic sequence of random variables on (Ω, F, P),

i i i 4. P(∀ i ∈ ⟦m⟧ ∀ z ∈ Z ∶ Rz = r (φ(z, ⋅ )M ( ⋅ ))) = 1 . Proof. For any ω ∈ Ω, i ∈ ⟦m⟧ we set

R¯i(ω) ∶= ri(M i(ω)) and R¯(ω) ∶= (R¯1(ω),..., R¯m(ω)).

Recall that by Proposition 4.18 we know that all ω ↦ d(x, M i(ω)) are F 0- B(R) measurable and note that then also all

ω ↦ d(C,M i(ω)) = sup d(x, M i(ω)) x∈C are F 0-B(R) measurable for any set C ⊆ Sm−1. In particular the maps

↦ ( ) ∶= 1 i ( ) ω fr ω r {ω¯∈Ω∣d(Br(ei),M (ω¯))=0} ω are F 0-B(R) measurable for any r > 0. This allows us to conclude that R¯ is F 0-B(Rm) measurable since for any i ∈ ⟦m⟧ we can write

R¯i(ω) = ri(M i(ω)) i = sup { r > 0 ∣ Br(ei) ⊆ M (ω)} i = sup { r > 0 ∣ d(Br(ei),M (ω)) = 0}

= sup fr(ω) r>0 and B(Rm) = B(R)⊗m. Lemma 1.2 in [14] (Proposition 4.49) then yields the existence of a map R ∶ Ω → R that is F-B(R) measurable and a P-nullset ¯ ∈ F ∈ ¯ c ¯( ) = ( ) ∶= ¯ N such that for all ω N we have R ω R ω . Set N ⋃z∈Z ϑzN. Then we still have N ∈ F and P(N) = 0 by the bimeasurability and P- invariance of ϑ respectively. In addition we preserved that for all ω ∈ N c ⊆ N¯ c ¯ c c R(ω) = R(ω) and obtained that for all ω ∈ N , also ϑzω ∈ N for any z ∈ Z. Hence

c ¯ ∀ ω ∈ N ∀ z ∈ Z ∶ R(ϑzω) = R(ϑzω).

If we define

c i i ∀ i ∈ ⟦m⟧ ∀ω ∈ N ∀z ∈ Z ∶ Rz(ω) ∶= R (ϑzω) (4.8)

164 4.2. Evolution of the RDS forward in time

then (Rz)z∈Z is a sequence of random variables on (Ω, F, P) that is stationary and ergodic by construction, since the properties follow from the measurabil- ity, P-invariance and ergodicity of ϑ respectively. Hence, 3 holds. In addition, observe that for all i ∈ ⟦m⟧, ω ∈ Ω, z ∈ Z

i i i i r (φ2(z, ω)M (ω)) = r (M (ϑzω)) ¯i = R (ϑzω), (4.9)

i where we used the strict φ2-invariance of the sets M , cf. Remark 4.16 in the first equality. But (4.8) and (4.9) together imply that forall i ∈ ⟦m⟧, z ∈ Z and ω ∈ N c:

i i i i Rz(ω) = R (ϑzω) = r (φ2(z, ω)M (ω)) (4.10) and thus 4. Lastly, we are left to prove 2. But this follows immediatly from Proposition 4.24, since (4.10) in particular implies Ri = ri(M i) = sup{r > 0 ∣ i Br(ei) ⊆ M }, P-a.s. ¯ ¯ ¯i Remark 4.27. It is easy to see that indeed R and (Rz)z∈Z given by Rz ∶= i i r (φ2(z, ⋅ )M ( ⋅ )) have all the properties from Proposition 4.26 already, ex- cept for the measurability with respect to F: Property 4 holds by definition (so even ω-wise) and 2 is proven in the same way as for R. The station- arity and ergodicity follow from (4.9) together with the observation that if (Ω, F, P, ϑ) is ergodic, then so is (Ω, F 0, P, ϑ) (in particular the measurability of ϑ w.r.t. F 0 holds, too).

The knowledge about (Rz)z∈Z now allows us to make some interesting observations about properties of the random sets M i, i ∈ ⟦m⟧, that we sum up in the following Proposition. Proposition 4.28. In the set-up of our RDS let d = 2. Then the random sets M i, i ∈ ⟦m⟧, are open and path-connected P-almost surely. Proof. Without loss of generality, assume i = 1. The proof of both properties 1 1 makes use of a simple observation for (Rz)z∈Z. Since (Rz)z∈Z is (stationary and) ergodic

c 1 1 ∃ N1 ∈ F ∶ P(N1) = 0 and ∀ ω ∈ N1 ∶ lim sup Rz(ω) ≥ E[R ] =∶ r.¯ (4.11) z→∞ (Otherwise we would obtain a contradiction to Birkhoff’s Ergodic Theorem stating that lim 1 z−1 R1 = [R1] -almost surely.) Also, Proposition z→∞ z ∑n=0 n E 0 P 4.26, 4 implies

c 1 i i ∃ N2 ∈ F ∶ P(N2) = 0 and ∀ ω ∈ N2 ∀z ∈ Z ∶ Rz(ω) = r (φ2(z, ω)M (ω)).

165 Chapter 4. A Random Dynamical System

i c c We first prove that the M are open P-almost surely. Let ω ∈ N1 ∩ N2 and choose any x ∈ M 1(ω). (Note that x does also, in a sense, depend on ω.) By 1 definition of M we have limn→∞ φ2(n, ω)x = e1. Hence, in particular r¯ ∃ n¯ = n¯(ω, x, r¯) ∈ ∀ n ≥ n¯ ∶ d(φ (n, ω)x, e ) < . N0 2 1 2 At the same time (4.11) implies

1 ∀ n ∈ N0 ∃ n¯ ≥ n ∶ Rn¯(ω) ≥ r¯ and combining the two we obtain r¯ ∃ n¯ = n¯(ω, x, r¯) ∈ ∶ d(φ (n, ω)x, e ) < < r¯ ≤ R1 (ω). N0 2 1 2 n¯ This in turn means that for thisn ¯

1 Br¯/4(φ(n,¯ ω)x) ⊆ B3¯r/4(e1) ⊆ φ2(n,¯ ω)M (ω) (4.12)

1 1 since 3¯r/4 < Rn¯ = sup{r > 0 ∣ Br(e1) ⊆ φ2(n, ω)M (ω)}. Given that φ2 m−1 −1 is a homeomorphism on S , (φ2(n,¯ ω)) Br¯/4(φ(n,¯ ω)x) is an open set (containing x) and by (4.12)

−1 −1 1 1 (φ2(n,¯ ω)) Br¯/4(φ(n,¯ ω)x) ⊂ (φ2(n,¯ ω)) φ2(n, ω)M (ω) = M (ω). Hence, M 1(ω) is an open set and we have proven the claim. The argument to prove the path-connectedness is very similar. Again, let 1 ω ∈ N1 ∩ N2 and choose x, y ∈ M (ω). As before we observe that r¯ ∃ n¯ = n¯(ω, x, y, r¯) ∶ d(φ (n,¯ ω)x, e ) < , 2 1 2 r¯ d(φ (n,¯ ω)y, e ) < and 2 1 2 1 Rn¯(ω) ≥ r.¯

Then φ2(n,¯ ω)x, φ2(n,¯ ω)y ∈ Br¯/2(e1) and since this open ball is path-connected, there exists a (continuous) curve γ ∶ [0, 1] → Br¯/2(e1) such that γ(0) = 1 φ2(n, ω)x and γ(1) = φ2(n, ω)y. In addition, sincer ¯/2 < Rn¯(ω) = sup{r > 0 ∣ 1 Br(e1) ⊆ φ2(n, ω)M (ω)}

γ([0, 1]) = {γ(s) ∣ s ∈ [0, 1]}

⊂ Br¯/2(e1) 1 ⊂ φ2(n,¯ ω)M (ω).

166 4.3. Evolution backward in time

m 1 Thus, since φ2 is a homeomorphism on S − , we have a continuous map 1 m 1 (φ2(n,¯ ω))− ○ γ ∶ [0, 1] → S − such that

−1 −1 ((φ2(n,¯ ω)) ○ γ)(0) = (φ2(n,¯ ω)) φ2(n,¯ ω)x = x −1 −1 ((φ2(n,¯ ω)) ○ γ)(1) = (φ2(n,¯ ω)) φ2(n,¯ ω)y = y −1 −1 1 1 and ((φ2(n,¯ ω)) ○ γ)([0, 1]) ⊂ (φ2(n,¯ ω)) φ2(n,¯ ω)M (ω) ⊂ M (ω) which means we have found a path in M 1(ω) connecting x and y and since they were arbitrary, we have proven P-almost sure path-connectedness of M 1.

4.3 Evolution backward in time

We have up to now traced the evolution of our RDS in its natural time direction. However, as we saw in Definition 4.4 for the case d = 2, i.e. when our RDS acts applying quadratic stochastic operators, the time parameter in the RDS φ can actually be taken to be two-sided, so T = Z, thus opening the question of evolution backwards in time. To abbreviate notation in this section and for more clarity we introduce the backwards version of our RDS. In all of this section we take d = 2, for we need time to be two-sided.

m−1 Definition 4.29. For the RDS (Ω, F, P, (ϑz)z∈Z, φ2) on S introduced in Definition 4.4 we set

¯ −1 ϑz ∶= ϑ−z = ϑz −1 φ¯2(z, ω) ∶= φ2(−z, ω) = (φ2(z, ϑ−zω)) (4.13)

¯ m−1 and call (Ω, F, P, (ϑz)z∈Z, φ¯2) on S the backward version of our RDS. Remark 4.30.

¯ m−1 1. By Remark 4.5, 3, (Ω, F, P, (ϑz)z∈Z, φ¯2) is indeed an RDS on S itself. It corresponds to an i.i.d. sequence of operators in the same way our RDS φ2 does, except instead of applying VQSOs V ∈ V(2), we apply their inverses V −1.

2. With the same argument used in Remark 4.14 for our RDS φ2, we can also immediately observe that Sm−1 is the strong global pullback m 1 attractor for φ2 on S − . (Again, it is also the strong global forward attractor, but we will not further pursue the notion of forward attractors in this section.)

167 Chapter 4. A Random Dynamical System

When contemplating the behavior ofφ ¯2, recall that for φ2, the sets Vk, k ∈ ⟦m⟧, of ‘special’ VQSOs played an important role. Any such V ∈ Vk when applied will give a set a strong uniform ‘push’ towards the boundary m 1 of S − in the direction away from the corresponding ek, k ∈ ⟦m⟧, while any VQSO V ′ ∈ V can only weakly counteract this effect (see Proposition 3.9 and Corollary 3.10). We made positive use of this fact for example in Theorem 3.15, but it was very visible in Lemma 4.20. Consequently, their inverses show the opposite behavior: For any x ∈ Sm−1 and k ∈ ⟦m⟧

V ∈ V(2) ∶ ⇒ (V −1x) ≥ 2−1x k √ k −1 and V ∈ Vk(2) ∶ ⇒ (V x)k ≥ xk.

While the boundary of Sm−1 remains invariant1, this observation implies that m 1 the inverse of any V ∈ Vk(2) will give a set in the interior of S − a strong push towards the ‘inside’ suggesting the investigation of the RDS defined in Definition 4.29 restricted to int Sm−1. Indeed in this case, we obtain the existence of a strong global pullback attractor on the interior of Sm−1:

m 1 Theorem 4.31. The RDS φ¯2 on int S − has a strong global pullback at- tractor A¯.

For the proof we will apply a criterion for the existence of such a strong global pullback attractors from [15] which we paraphrase here for our set-up.

Theorem 4.32 (Theorem 3.2 in [15]). The following are equivalent:

m 1 • φ¯2 has a strong global pullback attractor (in int S − ).

m 1 • For every ε > 0 there exists a compact subset Cε ⊂ int S − such that for each δ > 0 and each compact subset K of int Sm−1 it holds that

¯ δ P ( ⋃ ⋂ {φ¯2(z, ϑ−zω)K ⊆ Cε }) ≥ 1 − ε. s∈N0 z≥s

Here Cδ ∶= {x ∈ int Sm−1 ∣ d(x, C) < δ} for any C ⊆ Sm−1.

Proof of Theorem 4.31. As we saw in Corollary 4.21 for our RDS φ2 for any ε > 0 there exists an h = h(ε) such that in particular

○ m−1 P ( lim d(φ2(n, ⋅)Dh, ∂S ) = 0) ≥ 1 − ε n→∞ 1Indeed, as observed for VPSOs in Remark 3.62, also these inverses map the relative interior of the faces of Sm−1 into themselves.

168 4.3. Evolution backward in time

○ m−1 m−1 where Dh ∶= {x ∈ S ∣ d(x, ∂S ) < h}. Rewriting the definition of the limit, this implies that for all ε > 0 there exists an h = h(ε) such that for all δ > 0

○ m−1 P( ⋃ ⋂ {d(φ2(z, ⋅)Dh, ∂S ) < δ} ) ≥ 1 − ε. s∈N0 z≥s ´¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¸¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¶ ○ m−1 δ ={ φ2(z,⋅)Dh ⊆ (∂S ) }

m 1 Now define Cε ∶= S − ∖ D○ for any ε > 0. These sets are compact and h(ε)

K

Figure 4.2: An illustration of the idea of the proof of Theorem 4.31 for m = 3. As D○ fades into the boundary under the actions of φ2 (i.e. forward h(ε) in time), its complement Cε expands and will eventually cover any compact set K. since our operators are invertible

φ (z, ω)D○ ⊆ (∂Sm−1)δ ⇔ Sm−1 ∖ (∂Sm−1)δ ⊆ φ (z, ω)C 2 h(ε) 2 ε (for every z ∈ Z and ω ∈ Ω). Thus for all ε > 0 and all δ > 0

m−1 m−1 δ P( ⋃ ⋂ {S ∖ (∂S ) ⊆ φ2(z, ω)Cε}) ≥ 1 − ε. (4.14) s∈N0 z≥s

Observe also that for any compact K ⊂ int Sm−1 there must exist a δ = δ(K) such that

K ⊆ int Sm−1 ∖ (∂Sm−1)δ(K).

Adding this to (4.14) we obtain that for every ε > 0 and every compact K ⊂ int Sm−1

P( ⋃ ⋂ {K ⊆ φ2(z, ω)Cε}) ≥ 1 − ε. s∈N0 z≥s

169 Chapter 4. A Random Dynamical System

¯ −1 Since by definition φ2(z, ω) = (φ¯2(n, ϑ−zω)) (see (4.13)) this yields: For m 1 every ε > 0 there exists a compact Cε ⊂ int S − (as defined above) such that for each compact K ⊂ int Sm−1

¯ P( ⋃ ⋂ {φ¯2(z, ϑ−zω)K ⊆ Cε}) ≥ 1 − ε s∈N0 z≥s and the claim follows with Theorem 4.32.

We have proven the existence of a (strong pullback) attractor A¯ relying heavily on the forces exerting a strong inward drift, ‘pushing everything to- gether’ towards the center. The attractor however, is by definition (strictly) invariant under the actions of the RDSφ ¯2 and thus also under these strong forces. By intuition, it should thus be a set that cannot be ‘further com- pressed’, i.e. a point. The phenomenon of the existence of a (weak) attractor that is a singleton P-almost surely is called synchronization. We refer the reader to [31], in particular Section 1.1 and references therein. Unfortunately, we are only able to prove synchronization for a very special case. Let m = 3. Notation 4.33 For i ∈ ⟦m⟧ define ⎧ ⎪x2, if k = i, (V x) ∶= ⎨ i i k ⎪ ⎩⎪xk(1 + xi), else.

Then Vi ∈ Vi, since Vi strongly diminishes type i but it has no preferences among the other two types.

m 1 Theorem 4.34. Consider the RDS φ¯2 on int S − with the additional re- striction on the measure ν (see Definition 4.4), that ν({V1,V2,V3}) = 1. Then synchronization occurs, i.e. the strong global pullback attractor A¯ sat- isfies

∣A¯∣ = 1 P-a.s.

Proof. For x, y ∈ int Sm−1 define

3 dl(x, y) ∶= ∑ ∣ log(xi) − log(yi)∣, i=1 and for any C ⊆ int Sm−1 set diam (C) ∶= sup d (x, y). By Proposition l x,y∈C l m 1 4.52 in Section 4.5.3 dl is a metric on int S − such that for any i, j ∈ ⟦3⟧ with i ≠ j:

170 4.4. Delta attractors - refining an established concept

m−1 −1 −1 1. For all x, y ∈ S : dl(Vi x, Vi y) ≤ dl(x, y).

2. for any compact set K ⊂ int Sm−1

−1 −1 diaml(Vi Vj K) < diaml(K) or diaml(K) = 0.

¯ Now consider the map Ω ∋ ω ↦ diaml(A(ω)). Then ¯ ¯ ¯ diaml(A(ϑ1ω)) = diaml(φ¯(1, ω)A(ω)) 1. ¯ ≤ diaml(A(ω)) for P-almost all ω ∈ Ω, where we used that the attractor A¯ isφ ¯-invariant in the first inequality. Since ϑ1 is ergodic with respect to P, this implies that the ¯ ¯ map diaml(A( ⋅ )) is constant P-almost surely, say diaml(A( ⋅ )) = c ∈ [0, ∞[. −1 −1 Assume c ≠ 0. Since there is a positive probability thatφ ¯2(2, ⋅ ) = V2 V1 (or ¯ any other combination of distinct choices of V1,V2,V3), and A is compact, 2. above implies ¯ ¯ diaml(φ¯(2, ⋅ )A( ⋅ )) < diaml(A( ⋅ )) with positive probability. (4.15)

Resorting to the strictφ ¯-invariance of the attractor A¯ on the other hand, we see ¯ ¯ ¯ ¯ diaml(φ¯2(2, ω)A(ω)) = diaml(A(ϑ2ω)) = r = diaml(A(ω)) ¯ for P-almost all ω ∈ Ω contradicting (4.15). Hence, diaml(A( ⋅ )) = c = 0 and ∣A¯∣ = 1 P-almost surely.

The metric dl is tailored to the needs of the RDS considered in Theorem 4.34. See Remark 4.53 in Section 4.5.3 for a more detailed discussion of this challenge.

4.4 Delta attractors - refining an established concept

As we saw in the previous sections the attractors generally considered are either point or global attractors, i.e. they either attract only points or all compact subsets of the space. However, one might want to refine this concept. Consider the following example for m = 3. Theorem 3.15 shows that every point in Sm−1 is attracted by Λ and we defined the (monochromatic) random i sets M as the sets of all points converging to ei, i ∈ ⟦m⟧. On the other

171 Chapter 4. A Random Dynamical System

m 1 hand, at least for d = 2, the global forward attractor of φ2 is S − itself, as discussed in the beginning of Section 4.2. Assume there is a curve γ in Sm−1 that intersects, say M 1 and M 2 (with positive probability). The endpoints of 1 2 this curve will converge to e1 and e2 by definition of M and M . Since φ2 is continuous, the curve cannot be attracted by Λ, as it would have to become ‘disconnected’. At the same time, as we saw for example in Lemma 4.20, φd has the tendency to push everything to ∂Sm−1 and a very intuitive picture m 1 suggestst that the curve should be pushed to the face of S − connecting e1 and e2, see Figure 4.3. Of course, one needs to treat this with utter most care: a space-filling curve, for example, will not be attracted by ∂Sm−1. But the observation does spark an interest considering attractors for other ‘sizes’ of sets and wonders if one could maybe find smaller attractors, for ‘smaller’ sets. In order to measure the ‘size’ of a set, we will use the Hausdorff dimension defined below. A very natural question in the context of our RDS defined in Definition 4.4 isthe relation between the attractor for sets of a certain Hausdorff dimension ∆ and the faces of Sm−1 of dimension ⌊∆⌋. The bold hope is to find a set-up – maybe with a more restrictive measure ν – such that the faces of Sm−1 of dimension κ are indeed the minimal attractors for all sets of Hausdorff dimension in [κ, κ + 1[. But as can be con- cluded from the lack of precision in the for- mulation, this is still far from viable. In Figure 4.3: Illustration of the this section, however, we take a first step set M i, i ∈ ⟦m⟧ for m = 2. The in this direction and, after formally intro- curve γ intersects the (random) ducing the concept of ∆-attractors, prove set M 1 and M 2. The drift to that Λ is not only the (strong forward) the boundary of the (forward) point-attractor, as proven in Theorem 4.13, RDS φ (as represented by the 2 but also a strong forward ∆-attractor for a blue arrows) suggests that the ∆ > 0. curve could be absorbed in the In order to proceed, we need the defi- boundary between e and e . 1 2 nition of the Hausdorff dimension, see for example Chapter 4 in [78]. Definition 4.35. Let (E, d) be a metric space and H ⊂ E. A sequence E1,E2,... of subsets of E is called a cover of H, if

H ⊆ ⋃ Ei. i∈N

172 4.4. Delta attractors - refining an established concept

For every δ ≥ 0 and ε > 0 define

δ δ Hε ∶= inf {∑ diam(Ei) ∣ E1,E2,... cover of H, ∀ i ∈ N ∶ diam(Ei) < ε} i∈N and with it

δ δ H (H) ∶= sup Hε(E). ε>0 This is called the δ-Hausdorff measure of the set H and

δ dimH (H) ∶= inf{δ ∣ H (H) = 0} is the Hausdorff dimension of H.

Remark 4.36.

m−1 δ • Note that for any set E ⊂ S diam(E) = infδ>0 diam(E ). Hence, we may, without loss of generality assume that the covers used in the Definition of the Hausdorff dimension consist of open sets.

• For later reference observe: For a set H with dimH (H) = ∆ > 0, the definition of the Hausdorff dimension implies that for every δ > 0, ε1 > 0 and ε2 > 0, there exists a cover (of open sets) E1,E2,... of H such that

∆+δ ∑ diam(Ei) < ε1 i∈N and ∀ i ∈ N ∶ diam(Ei) < ε2.

We can now introduce the concept of ∆-attractors.

Definition 4.37. For any ∆ ≥ 0, define

C∆ ∶= {H ∈ K ∣ dimH (H) ≤ ∆}.

Let (Ω, F, P, (ϑt)t∈T, φ) be a random dynamical system over a metric space (E, d). A random compact set A ⊂ E (that is strictly φ-invariant), is called a strong, resp. weak forward, resp. pullback ∆-attractor, if it is a strong, resp. weak forward resp. pullback C∆-attractor in the sense of Definition 4.7. With this, we can present the main result of this section.

Theorem 4.38. There exists a ∆ > 0 such that Λ = {e1, . . . , em} is the m 1 minimal strong forward ∆-attractor for the RDS φd on S − .

173 Chapter 4. A Random Dynamical System

Remark 4.39. Theorem 4.38 holds for every ∆ < β, for β given by (4.37) in Notation 4.42. Let us briefly discuss the strategy of the proof before indulging inthe preliminary results. Note that a set H must be attracted by Λ, if we can find a cover of H that is attracted by Λ. Since these covers are made up of (very small) sets, the bulk of work is in proving that such small sets converge to Λ with high probability. This is done in two steps: Proposition 4.41 shows that a small neighborhood of Λ will converge uniformly to Λ. We then have to invest additional work to guarantee that the small cover sets reach this neighborhood and stay sufficiently small to be completely contained in it,cf. Corollary 4.45. Notation 4.40 For any i ∈ ⟦m⟧ and h ∈ [0, 1] we define

¯ i j ¯ ¯ i Uh ∶= ⋂ Dh Uh ∶= ⋃ Uh, j∈⟦m⟧∖{i} i∈⟦m⟧

i where the sets Dh are as defined in Notation 4.19. Note that these are indeed i the closures of the sets Uh, resp. Uh defined in Notation 3.17. Using previous results we can then estimate the probability of these sets converging to Λ uniformly. Recall that the measure ν was such that ν ∶= min{ν1, . . . , νm} > 0. Proposition 4.41. For any h ∈ [0, 1]

¯ −α1 α1 P( lim d(φd(n, ⋅ )Uh, Λ) = 0) ≥ 1 − mκ h n→∞ if we choose

⎧ 1 ⎫ ⎪ α1 ⎪ − log(1 − ν) ⎪⎛1 − (1 − ν)dα1 ⎞ ⎪ −1 0 < α1 < and κ ∶= min ⎨ α , d ⎬ . log(d) ⎪⎝ ν(d) 1 ⎠ ⎪ ⎩⎪ 2 ⎭⎪ Proof. This proposition is a straight consequence of Lemma 4.20, with a simple observation. Recall that for any set C ⊆ Sm−1 and i ∈ ⟦m⟧ we defined hi(C) ∶= sup{xi ∣ x ∈ C}. By Lemma 4.20

i −α1 α1 P (∀ i ∈ ⟦m⟧ ∶ lim hi(φd(n, ⋅ )Dh) = 0) ≥ 1 − mκ h . n→∞

1 Let h < κ/m α1 and choose

i ω ∈ ⋂ { lim hi(φd(n, ⋅ )Dh) = 0} ≠ ∅. n→∞ i∈⟦m⟧

174 4.4. Delta attractors - refining an established concept

Figure 4.4: Illustration of the

Partition (Ql)l∈N0 as defined in (4.16).

¯ j j Since Uh ⊂ Dh for all j ∈ ⟦m⟧, this implies ¯ j ∀ j ∈ ⟦m⟧ ∀ i ∈ ⟦m⟧ ∖ {j} ∶ lim hi(φd(n, ω )Uh) = 0 n→∞ and thus ¯ j ∀ j ∈ ⟦m⟧ ∀ i ∈ ⟦m⟧ ∖ {j} ∶ lim φd(n, ω )Uh = ej. n→∞ Therefore ¯ P( lim d(φd(n, ⋅ )Uh, Λ) = 0) n→∞ i −α1 α1 ≥ P (∀ i ∈ ⟦m⟧ ∶ lim hi(φd(n, ⋅ )Dh) = 0) ≥ 1 − mκ h n→∞ which completes the proof. Now that we have a result on the uniform convergence of sets sufficiently ‘close to’ Λ, we are left to assure that the RDS reaches these sets ‘fast enough’. To this end, consider the following partition of Sm−1 illustrated in Figure 4.4:

m−1 ¯ Q0 ∶= S ∖ Ud−l0 d−1 , ¯ ¯ Ql ∶= Ud−l0 d−l ∖ Ud−l0 d−(l+1) , l ∈ N, (4.16) where the value of l0 is assigned in Notation 4.42 below. In a slight abuse of m 1 notation, define a map l ∶ S − → N0 by

l(x) = l ⇔ x ∈ Ql. Note that then ¯ x ∈ Ud−l0 d−l ⇔ l(x) ≥ l.

The key idea is to characterize the convergence of (φd(n, ⋅ )x)n∈N0 through the behavior of (l(φd(n, ⋅ )x))n∈N0 , since

lim φd(n, ⋅ )x ∈ Λ ⇔ lim l(φd(n, ⋅ )x) = ∞. n→∞ n→∞

175 Chapter 4. A Random Dynamical System

For this purpose, we construct a Markov chain (Ln)n∈N0 that dominates m−1 (l(φd(n, ⋅ )x))n∈N0 in the following sense: Fix γ > 0 For x ∈ S , let N ∈ N set ¯ σγ(x, N) ∶ = inf {n ∈ N0 ∣ φd(n, ⋅ )x ∈ Ud−l0 d−γN } = inf {n ∈ N0 ∣ l(φd(n, ⋅ )x) ≥ γN} , and similarly

τγ(N) ∶ = inf {n ∈ N0 ∣ Ln ≥ γN} .

Then (Ln)n∈N0 is intended to be such that

P(σγ(x, N) ≤ N) ≥ P(τγ(N) ≤ N ∣ L0 = 0) (4.17)

m−1 for every x ∈ S and N ∈ N. To this end, we need to construct (Ln)n∈N0 with a weaker drift towards ∞ than (l(φd(n, ⋅ )x))n∈N0 , but sufficiently strong to still reach high levels in a ‘short’ amount of steps. In the following we give an exhaustive list of the parameters (and con- ditions on them) appearing in the forthcoming deliberations. From the list itself, the purpose of the paramters will not be apparent. The reader is there- fore invited to skip it and only return to it for reference, when the parameters are used further on. Notation 4.42 Recall the measure ν on V from Theorem 3.15 and

ν ∶= min{ν1, . . . , νm} > 0. (4.18) With this then choose − log(1 − ν) α ∈ ] 0, [ (4.19) 1 log(d) and set ⎧ 1 ⎫ ⎪ α1 ⎪ ⎪⎛1 − (1 − ν)dα1 ⎞ ⎪ −1 κ ∶= min ⎨ α , d ⎬ . (4.20) ⎪⎝ ν(d) 1 ⎠ ⎪ ⎩⎪ 2 ⎭⎪ Then define

p ∶= νm−1(> 0). (4.21)

If m = 2, let µ2 ∶= 0, otherwise choose any

µ2 ∈]0, 1[ (4.22)

176 4.4. Delta attractors - refining an established concept and subsequently − log(1 − p) λ ∈ ]0 , [ . (4.23) m − 1 + µ2 Note that this is such that

eλ(m−1+µ2)(1 − p) < 1. (4.24)

Choose

1 1 − eλ(m−1+µ2)(1 − p) l > − log ( )(> 0) (4.25) 1 λ p and note that this implies

eλ(m−1+µ2)(1 − p) + e−λl1 p < 1. (4.26)

Define

l0 ∶= ⌈l1 − 1 + µ2 + 2(m − 1)⌉. (4.27) Define 2 2 log(d) M = (m − 1)⌈ log ( max {m, l + 1})⌉ − 1 (4.28) log(2) log(2)2 0 as well as

q ∶= νM+1 (4.29) and then choose 1 µ ∈ ]0 , − log (1 − q + e−λq)[( ⊂ ]0, 1[) . (4.30) λ Note that this is such that

eλµ(1 − q) + e−λ(1−µ)q < 1. (4.31)

Set

λµ A ∶= e− M (< 1), B ∶= eλµ(1 − q) + e−λ(1−µ)q < 1, by (4.31), λ µ2 C ∶= e− m−2 (< 1) D ∶= eλ(m−1+µ2)(1 − p) + e−λl1 p < 1, by (4.26), E ∶= max{A, B, C, D}(< 1).

177 Chapter 4. A Random Dynamical System

For m = 2 we assume C = 0 in the above definition. Choose − log(E) γ ∈ ]0, [ (4.32) λ and observe that then λγ + log(E) < 0. (4.33) Set

α2 ∶= −(λγ + log(E)) (> 0) (4.34) and

α3 ∶= min{α1, α2}(> 0). (4.35) Finally, define

−α3l0 log(d) log(m) (1+γ+ ) c ∶= e log(d) (4.36) and α β ∶= 3 (> 0). (4.37) (1 + γ + log(m) ) log(d)

Now we are ready to define the Markov chain that is to, in a sense, dominate our RDS.

Definition 4.43. For the parameter choice given in Notation 4.42, let (Ln)n∈N0 be the time-homogeneous Markov chain (on the probability space (Ω, F, P)) with state space µ µ { i ∣ i = 0,...,M } ∪ ⋃ { l + i 2 ∣i = 0, . . . , m − 2} − M l∈N m 2 (using the convention of 0 ⋅ ∞ = 0) defined by the following transition proba- bilities: For m ≥ 3: ⎧ µ ⎪1, if b − a = (and a < 1), ⎪ M ⎪q, if a = µ and b = 1, ⎪ ⎪1 − q, if a = µ and b = 0, P(L1 = b ∣ L0 = a) = ⎨ ⎪1, if b − a = µ2 (and a ≥ 1, b ∉ ), ⎪ m−2 N ⎪ ⎪p, if a − µ2 ∈ N and b = 2(a − µ2) − 2(m − 1) + l0 ⎪ ⎩⎪1 − p, if a − µ2 ∈ N and b = max{a − µ2 − (m − 1), 0}.

178 4.4. Delta attractors - refining an established concept

For m = 2:

⎧ µ ⎪1, if b − a = M (and a < 1), ⎪ ⎪q, if a = µ and b = 1, ⎪ (L = b ∣ L = a) = ⎨1 − q, if a = µ and b = 0, P 1 0 ⎪ ⎪ ⎪p, if a ∈ N and b = 2a − 2(m − 1) + l0 ⎪ ⎩⎪1 − p, if a ∈ N and b = a − (m − 1).

Note that, since we chose µ2 = 0 in the case of m = 2 the two cases differ only in the existence of a transition of a step of positive size less than 1 for m ≥ 3. Figures 4.5 and 4.6 illustrate the corresponding transition graphs.

Figure 4.5: The transition graph of the Markov chain (Ln)n∈N0 defined in Definition 4.43 for m ≥ 3.

Figure 4.6: The transition graph of the Markov chain (Ln)n∈N0 defined in Definition 4.43 for m = 2.

Proposition 4.44. Let (Ln)n∈N0 be the Markov chain defined above. For every N ∈ N we then have

−α2N P(LN ≥ γN ∣ L0 = 0) ≥ 1 − e .

The connection between this Markov chain (Ln)n∈N0 and (l(φd(n, ⋅ )x))n∈N0 is best explained in two steps, each corresponding to one area of the state space of (Ln)n∈N0 .

179 Chapter 4. A Random Dynamical System

Step 1: Behavior on ⋃ { l + i µ2 ∣i = 0, . . . , m − 2}. l∈N m−2 ¯ Let x ∈ Ud−l0 d−l , i.e. l(x) ≥ l for some l ∈ N arbitrary but fixed. Without m loss of generality assume x ∈ U¯ − , i.e. d l0 d−l

−l0 −l ∀ i = 1, . . . , m − 1 ∶ xi ≤ d d . Recall from Chapter 3, Section 3.2 (Proposition 3.9 and Corollary 3.10), that for any x ∈ Sm−1 and every k ∈ ⟦m⟧

V ∈ V ⇒ (V x)k ≤ dxk d and V ∈ V ⇒ (V x) ≤ ( )x2. k k 2 k

Now choose Vi ∈ Vi for each i = 1, . . . , m − 1. If we apply these (in any order) to x, each of the first m − 1 coordinates will be multiplied by d m − 2 times and squared once. The worst case occurs, when the squaring step is the last one, hence we can estimate

d 2 (V ○ ⋯ ○ V x) ≤ ( ) (d−l0 d−ldm−2) m−1 1 i 2 ≤ d2d−2(l0+l−(m−2)) = d−l0 d−(l0−2(m−1)+2l). Therefore, we know

¯ − −( − ( − )+ ) Vm−1 ○ ⋯ ○ V1x ∈ Ud l0 d l0 2 m 1 2l , which is equivalent to l(Vm−1 ○ ⋯ ○ V1x) ≥ l0 − 2(m − 1) + 2l. The probability of such a sequence of operators appearing consecutively is at least νm−1. Of course, (φd(n, ⋅ )x))n∈N0 might also move closer to em with other combina- tions of operators (in particular with any permutation of the above), and thus the probability for (l(φd(n, ⋅ )x))n∈N0 to jump from l to a value above l0 − 2(m − 1) + 2l in (m − 1) steps is greater that the probability of (Ln)n∈N0 to jump from l to (exactly) l0 − 2(m − 1) + 2l. On the other hand, we also have the following estimate for any sequence of V1,...,Vm−1 ∈ V:

−l0 −l m+1 −l0 −(l−(m−1)) (Vm−1 ○ ⋯ ○ V1x)i ≤ d d d = d d for every i = 1, . . . , m−1. This implies l(Vm−1 ○⋯○V1x) ≥ l−(m−1) which the reader will recognize as the complementary jump of (Ln)n∈N. Hence, (Ln)n∈N0 will also jump farther to the left than (l(φd(n, ⋅ )x))n∈N0 in the case of the complementary event. Together this implies that (l(φd(n, ⋅ )x))n∈N0 has a stronger drift to the right (towards higher values) than (Ln)n∈N0 on this part of the state space. Note that this reasoning holds true for all m ≥ 2.

180 4.4. Delta attractors - refining an established concept

µ Step 2: Behavior on { i M ∣ i = 0,...,M }. The important observation for this part is that, as explained below, M and q were chosen to fit Proposition 3.18 and thus are such that

m−1 ¯ ∀ x ∈ S ∶ P(φd(M + 1, ⋅ )x ∈ Ud−l0 d−1 ) ≥ q.

m−1 Hence, for any x ∈ S , the probability for (l(φd(n, ⋅ )x))n∈N0 to reach the value 1 in (M + 1) steps is greater or equal to the probability of (Ln)n∈N0 reaching 1 in (M +1) steps. Since (φd(n, ⋅ )x)n∈N0 is Markovian, we see again that also in this area it has a stronger drift to the right, i.e. the larger values, than (Ln)n∈N0 . (The same argument is detailed in the proof of Theorem 3.15.) To see that M and q were indeed suitably chosen, note that by Re- mark 3.19, we need r ∶= M/(m − 1) to be such that rm log(d) − 2r log(2) < l 1 log(d− 0 d− ) = −(l0 + 1) log(d). The definition of M in equation (4.28) is a sufficient condition for this to hold. This can be concluded from the following claim.

Claim 1. Fix m, d ≥ 2, l ∈ N. If

2 2 log(d) r > log ( max {m, l}) , log(2) log(2) log(2) then

rm log(d) − 2r log(2) < −l log(d).

Proof. First observe that

rm log(d) − 2r log(2) < −l log(d) log(d) log(d) ⇔ er log(2) = 2r > m r + l . log(2) log(2)

Abbreviate c ∶= log(d) max {m, l}. It is sufficient for the claim to find r such log(2) that 2r > cr + c ≥ m log(d) r + l log(d) . log(2) log(2) Let Y ∼ Poi(r). Then we can write, for any λ > 0

r r λY λ rc + c = ce P(Y ≤ 1) = ce P(e− ≥ e− ) r λ λY log c λ reλ ≤ ce e E[e ] = e ( )+ + .

181 Chapter 4. A Random Dynamical System

Hence, it suffices to find r such that log(c) + λ + re−λ < r log(2). For λ > − log(log(2)) this is equivalent to log(c) + λ log(2) r > then choose λ ∶= − log ( ) log(2) − e−λ 2 log(2) log(c) − log ( 2 ) = 1 log(2) − 2 log(2) 2 2 2 2 log(d) = log ( c) = log ( max {m, l}) . log(2) log(2) log(2) log(2) log(2) Hence the claim is proven. Combining the observations in Step 1 and Step 2, we conclude that

(φd(n, ⋅ )x)n∈N0 has overall a stronger drift to the right than (Ln)n∈N0 and (4.17) holds. Thus Proposition 4.44 yields the following corollary:

Corollary 4.45. Let γ and α2 be as defined in Notation 4.42. Then for every N ∈ N and every x ∈ Sm−1:

¯ −α2N P( ∃ k ≤ N ∶ φd(k, ⋅ )x ∈ Ud−l0 d−γN ) = P(σγ(x, N) ≤ N) ≥ 1 − e . Note that with more care in the choice of parameters one could even cou- ple the processes to obtain a stronger relationship between the two processes that the one given by (4.17). However, this is an intricate endeavour and ommitted here, since it is not needed for our pursposes.

Proof of Proposition 4.44. Abbreviate P0 ∶= P( ⋅ ∣ L0 = 0). We want to apply a large-deviation-type argument. Let all parameters be as in Notation 4.42. Let Gn ∶= σ(Lk ∣ k ≤ n) for any n ∈ N0. For any N ∈ N, we then have

−λLN −λγN P0(LN < γN) = P0(e > e ) γλN −λLN ≤ e E0[e ] N γλN −λ n=1(Ln−Ln−1) = e E0[e ∑ ] N γλN −λ(Ln−Ln−1) = e E0[E0[∏ e ∣ GN−1]] n=1 N−1 γλN −λ(Ln−Ln−1) −λ(LN −LN−1) = e E0[ ∏ e E0[e ∣ GN−1]] n=1 We can estimate the conditional expectation making use of the Markov prop- erty of (Ln)n∈N0 : Let n ∈ N. On {Ln−1 < µ}: µ − − − −λ(Ln−Ln 1) −λ(Ln 1+ M −Ln 1 E0[e ∣ Gn−1] = e ) λ µ = e− M =∶ A < 1.

182 4.4. Delta attractors - refining an established concept

On {Ln−1 = µ}:

−λ(Ln−Ln−1) λµ −λ(1−µ) E0[e ∣ Gn−1] = e (1 − q) + e q =∶ B < 1 by (4.31).

On {Ln−1 ∈ [l, l + µ2[} for some l ∈ N:

µ − − 2 − −λ(Ln−Ln 1) −λ(Ln 1+ m−2 −Ln 1) E0[e ∣ Gn−1] = e λ µ2 = e− m−2 =∶ C < 1.

On {Ln−1 = l + µ2} for some l ∈ N

−λ(Ln−Ln−1) −λ(max{0,l−(m−1)}−(l+µ2)) −λ(l0−2(m−1)+2l) E0[e ∣ Gn−1] = e (1 − p) + e p ≤ e−λ(l−(m−1)−(l+µ2))(1 − p) + e−λ(l0−2(m−1)+2l)p ≤ eλ(m−1+µ2 (1 − p) + e−λ(l0−2(m−1)+2)p ≤ eλ(m−1+µ2 (1 − p) + e−λl01p =∶ D < 1 by (4.26).

Hence, if we define E ∶= max{A, B, C, D} we have

−λ(Ln−Ln−1) E0[e ∣ Gn−1] ≤ E P-almost surely. Now we can go back and estimate

N−1 γλN −λ(Ln−Ln−1) −λ(LN −LN−1) P0(LN < γN) ≤ e E0[ ∏ e E0[e ∣ GN−1]] n=1 ´¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¸¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¶ ≤E N−1 γλN −λ(Ln−Ln−1) ≤ e EE0[ ∏ e ] n=1 and iterating this argument we obtain

γλN N γλN+log(E)N −α2N P0(LN < γN) ≤ e E = e = e .

Note that this line of arguments holds for all m ≥ 2. For m ≥ 3 this is clear. In the case of m = 2, µ2 = 0, hence [l, l + µ2[= ∅, which reflects the fact, that the Markov-chain does not take these additional steps for m = 2. In this case, C = 0 by definition and hence correctly does not influence the valueof E.

We have now completed all the preliminary work and move on to the proof of Theorem 4.38.

Proof of Theorem 4.38. First note that, if Λ is a strong forward ∆-attractor for some ∆ > 0, then it is already the minimal (strong forward) ∆-attractor,

183 Chapter 4. A Random Dynamical System

since Λ is (strictly) invariant under φd. Since Λ is also compact, in order to prove the Theorem, we need to show that for all H ∈ C∆:

lim d(φd(n, ⋅ )H, Λ) = 0 P-a.s. n→∞

Let H ∈ C∆. As explained after introducing the theorem, the aim is to prove the convergence of suitable coverings with arbitrarily large probability. The reason is that this implies the convergence of H itself, which is easy to see. Let E1,E2,... be such a covering of H with open sets. Since H is compact, only a finite number of sets is needed for the covering and without loss of generality, we assume these to be E1,...,Eg, i.e. g H ⊆ ⋃ Ei. i=1 Now, if ω ∈ Ω is such that

∀ i = 1, . . . , g ∶ lim d(φd(n, ω)Ei, Λ) = 0, n→∞ then of course also g lim d(φd(n, ω)H, Λ) ≤ lim d(φd(n, ω) ⋃ Ei, Λ) n n →∞ →∞ i=1 ≤ lim max d(φd(n, ω)Ei, Λ) = 0. n→∞ i=1,...,g Corollary 4.45 yields the uniform convergence of a set ‘sufficiently close’ to Λ, hence, any ball in that set will also converge, but we still have to assure that the ball will reach this set. To this end, we have to translate the results obtained in Proposition 4.41 for points x into a result for (small) balls around m 1 x. Recall that we denote by Bh(x) the open ball around x ∈ S − with radius h > 0 (cf. Notation 4.23). An important observation in this context is, that such a ball has bounded growth under the action of the RDS φd. As detailed in Section 4.5.4, Proposition 4.56, for every ω ∈ Ω, x ∈ Sm−1 and h > 0,

φd(1, ω)Bh(x) ⊆ Bdmh(φd(1, ω)x). In addition, note that for any i ∈ ⟦m⟧, ¯ i ¯ i x ∈ Uh ⇒ Bh(x) ⊆ U2h, ¯ ¯ hence x ∈ Uh ⇒ Bh(x) ⊆ U2h.

Combining these two considerations, we note that for every ω ∈ Ω, x ∈ Sm−1, h > 0, and N ∈ N,

( ) ∈ ¯ ⇒ ( ) −N ( ) ⊆ ( ( ) ) ⊆ ¯ φd N, ω x Uh φd N, ω Bh(md) x Bh φd N, ω x U2h.

184 4.4. Delta attractors - refining an established concept

This allows us to conclude

( ∃ k ≤ N ∶ φ (k, ⋅ )B −l −γN −N (x) ∈ U¯ −l −γN ) P d d 0 d (dm) 2d 0 d ¯ ≥ P( ∃ k ≤ N ∶ φd(k, ⋅ )x ∈ Ud−l0 d−γN ) ≥ 1 − e−α2N (4.38) by Corollary 4.45. Recall α1 > 0 and κ > 0, cf. (4.19) and (4.20). We can adapt the result of Proposition 4.41 to this set-up and obtain ¯ P( lim d(φd(n, ⋅ )U2d−l0 d−γN , Λ) = 0) n→∞ ≥ 1 − mκ−α1 (2d−l0 d−γN )α1 = 1 − mκ−α1 (2d−l0 )α1 d−γNα1 = 1 − mκ−α1 (2d−l0 )α1 e−γα1 log(d)N . (4.39)

These two results can be combined with the following observation:

{ω ∈ Ω ∣ lim d(φ (n, ω)B −l −γN −N , Λ) = 0} d d 0 d (dm) N→∞

⊇ {ω ∈ Ω ∣σγ(x, N)(ω) ≤ N}

∩ { ∈ ∣ ( ( − ( )( ) ) ¯ −l −γN ) = } ω Ω lim d φd n σγ x, N ω , ϑσγ (x,N)(ω)ω U2d 0 d , Λ 0 n→∞ ¯ Note that we have to shift ω in the second set, since we want U2d−l0 d−γN to converge only once the ball has reached it. Recall that the increments of φd are independent under P and we have that for any k ∈ N0 Fk and ϑ−kF are independent, cf. Remark 4.5,4. In a Markovian manner we argue that our ¯ RDS ‘restarts’ independently of the past once it reaches the set U2d−l0 d−γN . However, we are not just tracing a one-point motion (or a finite set of points), and therefore proceed with more care:

{ω ∈ Ω ∣σγ(x, N)(ω) ≤ N}

∩ { ∈ ∣ ( ( − ( )( ) ) ¯ −l −γN ) = } ω Ω lim d φd n σγ x, N ω , ϑσγ (x,N)(ω)ω U2d 0 d , Λ 0 n→∞ N = ⋃ ({ω ∈ Ω ∣σγ(x, N)(ω) = k} k=1 ¯ ∩ {ω ∈ Ω ∣ lim d(φd(n − k, ϑkω)U2d−l0 d−γN , Λ) = 0}) n→∞ N = ⋃ ({ω ∈ Ω ∣σγ(x, N)(ω) = k} k=1

¯ − − ∩ {ϑ−kω ∈ Ω ∣ lim d(φd(n, ω)U2d l0 d γN , Λ) = 0}) n→∞ 185 Chapter 4. A Random Dynamical System

N = ⋃ ({ω ∈ Ω ∣σγ(x, N)(ω) = k} k=1 ´¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¸¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¶ ∈Fk ¯ ∩ ϑ k {ω ∈ Ω ∣ lim d(φd(n, ω)U −l0 −γN , Λ) = 0} ). − n 2d d ´¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¸¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¶→∞ ∈ϑ−kF

Since the union is disjoint,

P({ω ∈ Ω ∣σγ(x, N)(ω) ≤ N}

∩ { ∈ ∣ ( ( − ( )( ) ) ¯ −l −γN ) = }) ω Ω lim d φd n σγ x, N ω , ϑσγ (x,N)(ω)ω U2d 0 d , Λ 0 n→∞ N = ∑ P({ω ∈ Ω ∣σγ(x, N)(ω) = k} k=1 ´¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¸¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¶ ∈Fk ¯ ∩ ϑ k {ω ∈ Ω ∣ lim d(φd(n, ω)U −l0 −γN , Λ) = 0} ) − n 2d d ´¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¸¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¶→∞ ∈ϑ−kF N = ∑ (P({ω ∈ Ω ∣σγ(x, N)(ω) = k}) k=1

¯ − − × P(ϑ−k {ω ∈ Ω ∣ lim d(φd(n, ω)U2d l0 d γN , Λ) = 0} )) n→∞ N = ∑ (P({ω ∈ Ω ∣σγ(x, N)(ω) = k}) k=1 ¯ × P({ω ∈ Ω ∣ lim d(φd(n, ω)U2d−l0 d−γN , Λ) = 0}) n→∞ ¯ = P({ω ∈ Ω ∣ lim d(φd(n, ω)U2d−l0 d−γN , Λ) = 0}) n→∞ N × ∑ P({ω ∈ Ω ∣σγ(x, N)(ω) = k}) k=1 ¯ = P({ω ∈ Ω ∣ lim d(φd(n, ω)U2d−l0 d−γN , Λ) = 0}) n→∞

× P({ω ∈ Ω ∣σγ(x, N)(ω) ≤ N}),

where we used the independence of Fk and ϑ−kF in the second equality. We

186 4.4. Delta attractors - refining an established concept can now estimate the desired propability as

( lim d(φ (n, ω)B −l −γN −N , Λ) = 0) P d d 0 d (dm) N→∞ ≥ P(σγ(x, N) ≤ N and

( ( − ( )( ) ) ¯ −l −γN ) = ) lim d φd n σγ x, N ω , ϑσγ (x,N)(ω)ω U2d 0 d , Λ 0 n→∞ = P(σγ(x, N) ≤ N)

× ( ( ( − ( )( ) ) ¯ −l −γN ) = ) P lim d φd n σγ x, N ω , ϑσγ (x,N)(ω)ω U2d 0 d , Λ 0 n→∞ ≥ (1 − e−α2N )(1 − mκ−α1 (2d−l0 )α1 e−γα1 log(d)N ) by (4.38) and (4.39). Hence, we can write

α3N ( lim d(φ (n, ω)B −l −γN −N , Λ) = 0) ≥ 1 − e− P d d 0 d (dm) N→∞ for sufficiently large N ∈ N, if we set

α3 ∶= min{α1, α2}.

In order to apply this to a covering, it is more useful to express the probability in terms of the radius of the ball itself. To this end, we rewrite

( ) γ 1 log m N r = d−l0 d−γN (dm)−N = d−l0 d(− − − log(d) ) log(rd−l0 ) ⇔ N = − . (1 + γ + log(m) ) log(d) Then, for sufficiently small r,

− log(rd l0 ) α3 ( ) ( + + log m ) 1 γ ( ) P( lim d(φd(n, ω)Br, Λ) = 0) ≥ 1 − e log d N→∞ −α3l0 log(d) α3 log(m) log(r) log(m) (1+γ+ ) (1+γ+ ) = 1 − e log(d) e log(d)

−α3l0 log(d) α3 log(m) log(m) (1+γ+ ) (1+γ+ ) = 1 − e log(d) r log(d) ´¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¸¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¶ =∶c = 1 − crβ, (4.40) with

−α3l0 log(d) log(m) (1+γ+ ) α3 c ∶= e log(d) and β ∶= > 0. (1 + γ + log(m) ) log(d)

187 Chapter 4. A Random Dynamical System

Now we are ready to combine and prove the assertion. The remainder of the proof is, as life, a matter of right choices: Recall that we chose H ∈ C∆. Let β ε > 0 and choose δ ∶= β−∆. In the notation of Remark 4.36, set ε1 ∶= ε/(c2− ). β Then choose r in (4.40) sufficiently small such that cr < ε1 (and (4.40) holds). Then set ε2 ∶= 2r. As described in Remark 4.36, there exists a covering (with open sets) E1,E2,... such that

∀i ∈ N ∶ diam(Ei) < ε2 ∆+δ and ∑ diam(Ei) < ε1. i∈N

Then for every i ∈ N

diam(E ) β i −β β P( lim d(φd(Ei, Λ) = 0) ≥ 1 − c ( ) = 1 − c2 diam(Ei) n→∞ 2 since we chose their diameters sufficiently small for diam(Ei) < 2r. But this then implies

P(∀ i ∈ N ∶ lim d(φd(Ei, Λ) = 0) n→∞ ≥ 1 − ∑ (1 − P( lim d(φd(Ei, Λ) = 0)) n→∞ i∈N −β β ≥ 1 − ∑ c2 diam(Ei) i∈N −β ≥ 1 − c2 ε1 = 1 − ε, and with the introductory obersvations

P( lim d(φd(H, Λ) = 0) ≥ P(∀ i ∈ N ∶ lim d(φd(Ei, Λ) = 0) ≥ 1 − ε. n→∞ n→∞ Since this can be conducted for any ε > 0, we have proven the claim.

4.5 Auxiliary observations

4.5.1 Observations related to measurability The following results are all from [14] and listed here as a reference, outside of the main text in order to avoid interrupting the reader’s flow. They are here listed in the order in which they are used in Sections 4.1 – a permutation of their order in [14], which should not be hindering, as their content is independent of each other.

188 4.5. Auxiliary observations

Proposition 4.46 (Lemma 1.1 in [14]). Suppose that Y is a separable metric space, that (Ω, F) is a measurable space, and let Z be another metric space. Let B(Y ) and B(Z) be the respective Borel-σ-algebras of Y and Z. Suppose that f ∶ Y × Ω → Z satisfies

1. ω ↦ f(y, ω) is measurable for each y ∈ Y ,

2. y ↦ f(y, ω) is continuous for each ω ∈ Ω.

Then f is B(Y ) ⊗ F-B(Z) measurable.

Proposition 4.47 (Corollary 2.13 in [14]). Suppose that f ∶ Ω × X → R is measurable, where (Ω, F, P) is a probability space and X is a Polish space considered with its Borel-σ-algebra B(X). Let ω ↦ C(ω) be any set-valued mapping such that

graph(C) ∶= {(ω, x) ∈ Ω × X ∣ x ∈ C(ω)} ∈ F ⊗ B(X).

Then

ω ↦ sup f(x, ω) x∈C(ω) is measurable with respect to the completion of F with respect to P.

Remark 4.48. The original Corollary 2.13 is indeed a bit stronger, stating measurability with respect to the universally completed σ-algebra of F, which u is given by F ∶= ⋂µ Fµ where Fµ denotes the completion of F with respect to a (positive) measure µ on (Ω, F) and the intersection is taken over all positive finite measures µ on (Ω, F).

Proposition 4.49 (Lemma 1.2 in [14]). Suppose that (Ω, F, P) is a proba- bility space, and let F 0 be the completion of F with respect to P. Let Y be a separable metric space. Then for any F 0-measurable map f 0 ∶ Ω → Y there exists an F-measurable map f ∶ Ω → Y with f = f 0 almost surely with respect to (the completion of) P.

Remark 4.50. We can, without loss of generality assume that the P-nullset in Proposition 4.49 is F-measurable. Assume the statement holds with some N 0 ∈ F 0. By definition of F 0 there exists some N ∈ F such that P(N) = 0 and N 0 ⊆ N. For all ω ∈ N c ⊆ (N 0)c then still f 0(ω) = f(ω), hence the assertion holds for the P-nullset N ∈ F.

189 Chapter 4. A Random Dynamical System

4.5.2 A helpful Markov chain This section is built around an auxiliary convergence result about a Markov chain constructed in order to describe the behavior of specific sets under the action of the RDS defined in 4.4, cf. Lemma 4.20.

Proposition 4.51. Let a, b ≥ 1, β > 1 and (Hn)n∈N0 be a time-homogeneous Markov chain (on some probability space (Ω, F, P)) taking values in the in- terval [0, 1] whose transition probabilities are given by

β P(H1 = ah ∧ 1 ∣ H0 = h) ∶= 1 − P(H1 = bh ∧ 1 ∣ H0 = h) ∶= p ∈]0, 1] for any h ∈ [0, 1] such that ahβ ∧ 1 ≠ bh ∧ 1 and

β P(H1 = ah ∧ 1 ∣ H0 = h) ∶= 1. otherwise. Then

−α1 α1 P ( lim Hn = 0 ∣ H0 = h) ≥ 1 − κ h . n→∞ for any

1 1 ⎧ α ( − ) ⎫ − log(1 − p) ⎪ 1 − (1 − p)b 1 β 1 α1 1 1 β ⎪ 0 < α1 < and κ ∶= min ⎨( ) , , ( ) ⎬ log(b) ⎪ paα1 b a ⎪ ⎩⎪ ⎭⎪

Proof. Let α1 and κ be as above. Observe that they were chosen such that, for any h ≤ κ, we have ahβ ∧ 1 = ahβ, bh ∧ 1 = bh and paα1 hα1 + (1 − p)bα1 ≤ 1. 1 α ≤ ( 1−(1−p)b 1 ) (β−1)α1 Indeed, for any h paα1 :

(β−1)α1 α ( − ) 1 − (1 − p)b 1 β 1 α1 paα1 h(β−1)α1 + (1 − p)bα1 ≤ paα1 ( ) + (1 − p)bα1 paα1 = 1 − (1 − p)bα1 + (1 − p)bα1 = 1. (4.41) ˜ Define τ ∶= inf{n ∈ N0 ∣ Hn > κ} and with it Hn ∶= min{Hn∧τ , κ} for all n ∈ N0. ˜ (Hn)n∈N0 is then a time-homogeneous Markov chain. Furthermore, define α1 ˜ α1 v ∶ [0, 1] → [0, 1] as v(h) ∶= h . Then E[v(Hn+1) ∣ Fn] = κ on {τ ≥ n} and ˜ ˜ β ˜ E[v(Hn+1) ∣ Fn] = pv(aHn ) + (1 − p)v(bHn) α1 ˜ βα1 α1 ˜ α1 = pa Hn + (1 − p)b Hn

˜ α1 α1 ˜ (β−1)α1 α1 = Hn (pa Hn + (1 − p)b ) ´¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¸¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¶ ≤1 by (4.41) ˜ α1 ˜ ≤ Hn = v(Hn)

190 4.5. Auxiliary observations on {τ > n} for any n ∈ N. (Note that these calculations include the case of ˜ ¯ ¯ ¯β ¯ {Hn = h} with h such that ah = bh.) ˜ Hence, (v(Hn))n∈N0 is a bounded supermartingale and converges P-almost ˜ surely to some (random variable) v∞. This implies that (Hn)n∈N0 then con- 1/α1 verges to H∞ = v∞ ∈ {0, κ}. Therefore

α1 P(v∞ = 0) = 1 − P(v∞ = κ ).

˜ Since (v(Hn))n∈N0 is a supermartingale we can estimate (for any h ≤ κ)

α1 ˜ h = E[v(H0) ∣ H0 = h]

≥ E[v∞ ∣ H0 = h] α1 α1 = 0 P(v∞ = 0 ∣ H0 = h) + κ P(v∞ = κ ∣ H0 = h) α1 = κ (1 − P(v∞ = 0 ∣ H0 = h)) which is equivalent to

−α1 α1 P(v∞ = 0 ∣ H0 = h) ≥ 1 − κ h and the claim is proven.

4.5.3 A suitable metric For x, y ∈ int S2 define

3 dl(x, y) ∶= ∑ ∣ log(xi) − log(yi)∣, (4.42) i=1 and for any C ⊆ int S2 set

diaml(C) ∶= sup dl(x, y). x,y∈C

2 2 Proposition 4.52. The map dl ∶ int S × int S → [0, ∞[ is a metric. Recall 2 V1,V2,V3 from Notation 4.33. Then for any i ∈ ⟦3⟧ and all x, y ∈ S

−1 −1 dl(Vi x, Vi y) ≤ dl(x, y).

In addition, for any i, j ∈ ⟦3⟧ with i ≠ j and any compact set K ⊂ int S2

−1 −1 diaml(Vi Vj K) < diaml(K) or diaml(K) = 0.

191 Chapter 4. A Random Dynamical System

Proof. dl is well-defined, since we have restricted ourselves to the interior of 2 S . Clearly, dl is positive definite and symmetric. In addition it is easyto see that for any x, y, z ∈ int S2

3 dl(x, z) = ∑ ∣ log(xi) − log(zi)∣ i=1 3 = ∑ ∣ log(xi) − log(yi) + log(yi) − log(zi)∣ i=1 3 3 ≤ ∑ ∣ log(xi) − log(yi)∣ + ∑ ∣ log(yi) − log(zi)∣ = dl(x, y) + dl(y, z). i=1 i=1

For V1,V2,V3 as in Notation 4.33, the inverses are easily calculated to be ⎧√ ⎪ xi, if k = i, (V −1x) = ⎨ i k xk ⎪ √ , otherwise ⎩ 1+ xi for every x ∈ int S2 and i ∈ ⟦3⟧. Without loss of generality consider i = 1. Then, for any x, y ∈ int Sm−1

d (V −1x, V −1y) l 1 1 √ √ = ∣ log( x1) − log( y1)∣ x y + ∣ log ( √2 ) − log ( √2 ) ∣ 1 + x1 1 + y1 x y + ∣ log ( √3 ) − log ( √3 ) ∣ 1 + x1 1 + y1 1 = ∣ log(x1) − log(y1)∣ 2 √ √ + ∣ log(x ) − log(y ) + log(1 + x ) − log(1 + y )∣ 2 2 √ 1 √ 1 + ∣ log(x3) − log(y3) + log(1 + x1) − log(1 + y1)∣

≤ dl(x, y) 1 √ √ − ∣ log(x ) − log(y )∣ + 2∣ log(1 + x ) − log(1 + y )∣. (4.43) 2 1 1 1 1 ´¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¸¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¶ =∶f(x,y) (The last inequality is actually an equality for any x, y ∈ int S2 such that x1 ≥ y1, x2 ≤ y2 and x3 ≤ y3, hence the estimate is sharp.) By Claim 2 below, we know that f(x, y) ≤ 0 for all x, y ∈ int S2 and f(x, y) = 0 if and only if x1 = y1. Hence

−1 −1 dl(V1 x, V1 y) ≤ dl(x, y)

192 4.5. Auxiliary observations for all x, y ∈ int S2 and

−1 −1 dl(V1 x, V1 y) = dl(x, y) ⇔ x1 = y1. Now we want to prove the statement about compact sets. Without loss of generality, consider i = 1 and j = 2 and let x, y ∈ int S2. The calculations as above yield

−1 −1 −1 −1 dl(V2 V1 x, V2 V1 y) 1 √ √ ≤ d (x, y) − ∣ log(x ) − log(y )∣ + 2∣ log(1 + x ) − log(1 + y )∣ l 2 1 1 1 1 ´¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¸¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¶ =∶fI (x,y) 1 x y − ∣ log ( √2 ) − log ( √2 ) ∣ (4.44) 2 1 + x1 1 + y1 √ √ x y +2∣ log (1 + √2 ) − log (1 + √2 ) ∣. 1 + x1 1 + y1 ´¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¸¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¶ =∶fII (x,y)

Again, Claim 2 yields fI (x, y) ≤ 0 and fII (x, y) ≤ 0 as well as

fI (x, y) < 0 ⇔ x1 ≠ y1 x2 y2 fII (x, y) < 0 ⇔ √ ≠ √ 1 + x1 1 + y1

2 and thus for f(x, y) ∶= fI (x, y)+fII (x, y), f(x, y) ≤ 0 for any x, y ∈ int S and

f(x, y) < 0 ⇔ x ≠ y. (4.45)

2 Consider now a compact set K ⊂ int S and assume diaml(K) ≠ 0. Then applying (4.44)

−1 −1 −1 −1 −1 −1 diaml(V2 V1 K) = sup dl(V2 V1 x, V2 V1 y) x,y∈K (4.44) ≤ sup {dl(x, y) + f(x, y)} x,y∈K

K K Since dl and f are continuous, there exist x , y ∈ K such that

K K K K sup {dl(x, y) + f(x, y)} = dl(x , y ) + f(x , y ). x,y∈K

K K K K K K Now, if x = y , then dl(x , y ) + f(x , y ) = 0 by (4.45) and indeed −1 −1 diaml(V2 V1 K) = 0 < diaml(K).

193 Chapter 4. A Random Dynamical System

On the other hand, if xK ≠ yK we can estimate

K K K K K K dl(x , y ) + f(x , y ) ≤ sup dl(x, y) + f(x , y ) < diaml(K). x,y∈K ´¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¸¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¶ < 0 by 4.45 Hence we have proven the last assertion of the Proposition. Claim 2. For any s, t ∈ ]0, 1[ 1 √ √ − ∣ log(s) − log(t)∣ + 2∣ log(1 + s) − log(1 + t)∣ ≤ 0 2 with equality if and only if s = t. Proof. The proof consists of straighforward not very elegant calculations. Without loss of generality assume s ≥ t. 1 √ √ 0 ≥ − ∣ log(s) − log(t)∣ + 2∣ log(1 + s) − log(1 + t)∣ 2 1 1 √ √ = − log(s) + log(t) + 2 log(1 + s) − 2 log(1 + t) 2 √ 2 √ √ √ = − log( s) + log( t) + 2 log(1 + s) − 2 log(1 + t) ⇔ 1 √ 1 √ √ √ 0 ≥ − log( s) + log( t) + log(1 + s) − log(1 + t) 2 2 1 1 √ √ = − log(s 4 ) + log(t 4 ) + log(1 + s) − log(1 + t) ⇔ 0 ≥ − log(s) + log(t) + log(1 + s2) − log(1 + t2) 1 + s2 t = log ( ) s 1 + t2 where we simply substituted the roots in the last equivalence. The last assertion is equivalent to 1 + s2 t 1 ≥ s 1 + t2 ⇔ 0 ≤ s(1 + t2) − t(1 + s2) = s − t − st(s − t) = (s − t)(1 − st) ´¹¹¹¹¹¹¹¹¹¸¹¹¹¹¹¹¹¹¹¶ > 0 which is true if and only if s ≥ t proving the first part of the claim. Observe that all equivalences still hold if we replace ≤ with < and thus we have proven the complete claim.

194 4.5. Auxiliary observations

Remark 4.53. The reason to introduce the metric dl instead of working with d comes from the structure of the operators V1,V2,V3 (indeed any VQSOs) or rather their inverses. We cannot expect an estimate like (4.42) to hold 2 for d for x and y ∈ int S close to the boundary. This stems from the simple√ √observation, that for small s, t ∈ ]0, 1[ (indeed any x, y ≤ 1/4) we have ∣ s − t ∣ ≥ ∣s−t∣. The metric dl annihilates this effect. Unfortunately, it is tailored to the case treated. On one hand, we are not able to apply this directly to larger dimensions of the simplex m. Of course, the estimate in (4.43) can be parallelized and would amount to

−1 −1 dl(V1 x, V1 y) 1 √ √ ≤ d (x, y) − ∣ log(x ) − log(y )∣ + (m − 1)∣ log(1 + x ) − log(1 + y )∣. l 2 1 1 1 1 ´¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¸¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¶ =∶fm(x,y) But for any m > 3 we can find xm, ym ∈ int Sm−1 such that f(xm, ym) > 0. (The same holds true if we replace the coefficient of the first summand byan α < 1/2 in the definition of fm.) At the same time, the estimate in (4.43) is also sharp for higher dimensions m, as equality holds for any x, y ∈ int Sm−1 such that x1 ≤ y1 and xi ≥ yi for all other i = 2, . . . , m. (The example in this discussion refers without loss of generality to the application of V1.) On the other hand, although simulations suggest, that the estimate ( −1 −1 ) ≤ ( ) ∈ 2 ∈ V d V x, V y d x, y holds true for all x, y int S and any V ⋃k∈⟦m⟧ k, it does certainly not hold for all V ∈ V. Consider the following example:

(V x)1 = x1(1 − 0.7x2 − 0.9x3),

(V x)2 = x2(1 + 0.7x1 − 0.1x3),

(V x)3 = x3(1 + 0.1x2 + 0.9x1), together with x = ( 0.04 ; 0.76 ; 0.2 ) and y = ( 0.03 ; 0.95 ; 0.02 ). Then

dl(x, y) = 2.81341 ≥ 2.71937 = dl(V x, V y) (4.46) and thus we have a counterexample given by V (or rather V −1) andx ¯ ∶= V x andy ¯ ∶= V y, since (4.46) is equivalent to

−1 −1 dl(V x,¯ V y¯) ≥ dl(x,¯ y¯).

Therefore at least some alteration of dl and/or some specifications of the measure ν should be necessary in order to extend these observations to higher dimensional simplices or more general Volterra operators.

195 Chapter 4. A Random Dynamical System

4.5.4 Derivatives of Volterra PSOs The purpose of this section is to provide an estimate for the maximal ex- pansion of a ball under the action of a Volterra PSO (of degree d). In order to provide this in Corollary 4.56, we contemplate the derivative of such a VPSO. Recall from Remark 3.5 that a general Volterra PSO of degree d as defined in Definition 3.4 in Section 3.2 is of the form

m (d) d−1 d−2 i1,k,...,k (V x)k = xk [ xk + xk d ∑ pk xi1 i1=1 i1≠k d m m + xd−3( ) ∑ ∑ pi1,i2,k,⋯,kx x k 2 k i1 i2 i2=1 i1=1 i2≠k i1≠k d m m + ⋯ + x ( ) ∑ ⋯ ∑ pi1,...,id−2,k,kx ⋯x k d − 2 k i1 id−2 id−2=1 i1=1 id−2≠k i1≠k m m i1,...,id−1,k + d ∑ ⋯ ∑ pk xi1 ⋯xid−1 ] id−1=1 i1=1 id−1≠k i1≠k for any x ∈ Sm−1 and k ∈ ⟦m⟧. For the purpose of this Section, we will consider such a VPSO as a map

d m m V ( ) ∶ R → R .

Since VPSOs are essentially polynomials, this extension is natural and with- out difficulty. In addition, it is immediate that for anyVPSO V (d) all partial derivatives exist and are continuous and V (d) is differentiable and we denote its (total) derivative by DV (d).

Proposition 4.54. For any Volterra PSO V (d) of degree d ≥ 2 its partial derivatives are bounded by d on Sm−1, i.e.

∂ m−1 ∀ x ∈ S ∀ k, j ∈ ⟦m⟧ ∶ 0 ≤ (V x)k ≤ d. ∂xj

Proof. Observe that for any k ≤ d

d d! (d − 1)! d − 1 k( ) = k = d = d( ). (4.47) k (d − k)!k! (d − k)!(k − 1)! k − 1

196 4.5. Auxiliary observations

Then we can calculate (for any x ∈ Rm) ∂ m (V (d)x) = dxd−1 + d(d − 1)xd−2 ∑ pi1,k,...,kx ∂x k j k k i1 k i1=1 i1≠k d m m + ( )(d − 2)xd−3 ∑ ∑ pi1,i2,k,⋯,kx x 2 k k i1 i2 i2=1 i1=1 i2≠k i1≠k d m m + ⋯ + ( )2x ∑ ⋯ ∑ pi1,...,id−2,k,kx ⋯x d − 2 k k i1 id−2 id−2=1 i1=1 id−2≠k i1≠k m m i1,...,id−1,k + d ∑ ⋯ ∑ pk xi1 ⋯xid−1 . id−1=1 i1=1 id−1≠k i1≠k

If we restrict ourselves to x ∈ Sm−1 and recall, that all inheritance coefficients are bounded by 1, we obtain

1 x ³¹¹¹¹¹·¹¹¹¹¹¹µ= − k ∂ m (V (d)x) ≤ dxd−1 + d(d − 1)xd−2 ∑ x ∂x k k k i1 k i1=1 i1≠k d m m + ( )(d − 2)xd−3 ∑ ∑ x x 2 k i1 i2 i2=1 i1=1 i2≠k i1≠k d m m + ⋯ + ( )2x ∑ ⋯ ∑ x ⋯x d − 2 k i1 id−2 id−2=1 i1=1 id−2≠k i1≠k m m

+ d ∑ ⋯ ∑ xi1 ⋯xid−1 id−1=1 i1=1 id−1≠k i1≠k d − 1 = dxd−1 + d( )xd−2(1 − x ) k 1 k k d − 1 + d( )xd−3(1 − x )2 2 k k d − 1 + ⋯ + d( )x (1 − x )d−2 d − 2 k k d−1 + d(1 − xk) d−1 = d(xk + 1 − xk) = d

= ∈ m−1 where we used (4.47) and the fact that ∑i∈⟦m⟧ xi 1 for x S in the second step.

197 Chapter 4. A Random Dynamical System

Now, note that for j ≠ k ∂ m m ( ∑ ⋯ ∑ pi1,...,il,k,...,kx ⋯x ) ∂x k i1 il j il=1 i1=1 il≠k i1≠k m m i1,...,il−1,j,k,...,k = l ∑ ⋯ ∑ pk xi1 ⋯xil−1 il−1=1 i1=1 il−1≠k i1≠k il−1≠j i1≠j l m m + ( ) ∑ ⋯ ∑ pi1,...,il−2,j,j,k,...,kx ⋯x 2x 2 k i1 il−2 j il−2=1 i1=1 il−2≠k i1≠k il−2≠j i1≠j m i1,j,...,j,k,...,k l−2 + ⋯ + l ∑ pk xi1 (l − 1)xj i1=1 i1≠k i1≠j j,...,j,k,...,k l−1 + pk lxj (for any x ∈ Rm) and using that the heredity coefficients are bounded by1, x ∈ Sm−1 and (4.47) ∂ m m ( ∑⋯ ∑ pi1,...,il,k,...,kx ⋯x ) ∂x k i1 il j il=1 i1=1 il≠k i1≠k l − 1 ≤ l(1 − x − x )l−1 + l( )(1 − x − x )l−2x k j 2 k j j l − 1 + ⋯ + l( )(1 − x − x )xl−2 + lxl−1 1 k j j j l−1 l−1 = l(1 − xk − xj + xj) = l(1 − xk) . (4.48) Now we can calculate ∂ (d) d−1 j,k,...,k (V x)k = 0 + dxk pk ∂xj ⎛ ⎞ d ∂ m m + xd−2( ) ⎜ ∑ ∑ pi1,i2,k,⋯,kx x ⎟ k 2 ∂x ⎜ k i1 i2 ⎟ j ⎝i2=1 i1=1 ⎠ i2≠k i1≠k ⎛ ⎞ d ∂ m m 2 ⎜ i1,...,id−2,k,k ⎟ + ⋯ + xk( ) ⎜ ∑ ⋯ ∑ pk xi1 ⋯xid−2 ⎟ d − 2 ∂xk ⎝id−2=1 i1=1 ⎠ id−2≠k i1≠k ⎛ ⎞ ∂ m m ⎜ i1,...,id−1,k ⎟ + dxk ⎜ ∑ ⋯ ∑ pk xi1 ⋯xid−1 ⎟ ∂xk ⎝id−1=1 i1=1 ⎠ id−1≠k i1≠k

198 4.5. Auxiliary observations

(4.48) d ≤ dxd−1 + xd−2( )2(1 − x ) k k 2 k d + ⋯ + x2( )(d − 2)(1 − x )d−3 k d − 2 k d−2 + dxk(d − 1)(1 − xk) d−1 d−1 = d ((1 − xk + xk) − (1 − xk) ) ≤ d. Hence, the claim is proven. We can now use this to bound the expansion under any such VPSO.

Proposition 4.55. For any Volterra PSO V (d) of degree d ≥ 2:

∀ x, y ∈ Sm−1 ∶ ∥V (d)(x) − V (d)(y)∥ ≤ dm∥x − y∥.

From this we can immediately conclude the following bound on the ex- d pansion of a Ball under the action of V ( ). Recall that we defined Bh(x) ∶= {y ∈ Sm−1 ∣ ∥x − y∥ ≤ h} for any x ∈ Sm−1 and h ≥ 0.

Corollary 4.56. Let V (d) be a Volterra PSO of degree d ≥ 2. For any x ∈ Sm−1 and h ≥ 0

(d) (d) V Bh(x) ⊆ Bdmh(V x). Proof of Proposition 4.55. The mean-value theorem for vector-valued func- tions yields, that for any x, y ∈ Sm−1

1 V (x) − V (y) = ∫ DV (d)(tx + (1 − t)y)dt(x − y) 0 where the integral of the matrix is to be understood componentwise. Note that for these components, since tx + (1 − t)y ∈ Sm−1 for all t ∈ [0, 1] 1 1 ∂ (d) (d) ∣ (∫ DV (tx + (1 − t)y)dt) ∣ = ∣ ∫ (V x)idt∣ ≤ d 0 i,j 0 ∂xj for any i, j ∈ ⟦m⟧ by Proposition 4.54 and therefore,

1 ∥V (d)(x) − V (d)(y)∥ = ∥ ∫ DV (d)(tx + (1 − t)y)dt(x − y)∥ 0 1 ≤ max ∥ ∫ DV (d)(tx + (1 − t)y)dtz∥∥x − y∥ z∈Rm,∥z∥=1 0 ≤ dm∥x − y∥.


Acknowledgment

As is customary, there are a few people I should and want to thank for their roles and involvement in these past six years and beyond.

Of course, the first person on this list is Professor Michael Scheutzow, my supervisor. His outstanding lecture on measure and integration theory fortunately lured me to probability theory, a subject I have come to love. I am thankful for the interesting choice of topics he introduced me to - Lipschitz Percolation and Random Dynamical Systems - and it was a pleasure and a joy to work as his teaching assistant.

I am deeply indebted to Professor Frank Aurzada, not only for agreeing to review this thesis; I doubt that it would have been completed without his repeated, active interest in my progress. Furthermore, I would like to thank Professor Noemi Kurt and Professor Jochen Blath for including me in their projects and welcoming me into their groups, as well as Professor Wolfgang König for his honest interest in and support of my advancement. I am deeply grateful to Professor Gerhard Franz and Dr. Melene Bahner for long and insightful discussions.

I was also fortunate to be accompanied by three very different great men during this endeavour: Diplom-Mathematiker René Kehl, Dr. Matti Leimbach and Dr. Sebastian Riedel (note that the order is simply chronological). Words cannot express how much these three have supported me - in questions of life and math - each in their unique way. But I can assure you that without them I would not be who, and certainly not where, I am today.

I would like to include in this list the service personnel of the math building, whose smiles and kind greetings at any hour of the day (or night) always brought a smile to my face and heart. Just like these small gestures, it is often the unexpected words of encouragement that have the greatest impact. Hence I conclude this list by expressing my gratitude and addressing these words to VP Professor Christine Ahrend and, in particular, to Dr.-Ing. Martin Steiof:

hopefully.

Berlin, October 2016
Maite Isabel Wilke Berenguer
