CONTINUUM SPACE LIMIT OF THE GENEALOGIES OF INTERACTING FLEMING-VIOT PROCESSES ON Z

ANDREAS GREVEN 1, RONGFENG SUN 2, ANITA WINTER 3

Abstract. We study the evolution of genealogies of a population of individuals, whose type frequencies result in an interacting Fleming-Viot process on Z. We construct and analyze the genealogical structure of the population in this genealogy-valued Fleming- Viot process as a marked metric measure space, with each individual carrying its spatial location as a mark. We then show that its time evolution converges to that of the ge- nealogy of a continuum-sites stepping stone model on R, if space and time are scaled diffusively. We construct the genealogies of the continuum-sites stepping stone model as functionals of the Brownian web, and furthermore, we show that its evolution solves a martingale problem. The generator for the continuum-sites stepping stone model has a singular feature: at each time, the resampling of genealogies only affect a set of in- dividuals of measure 0. Along the way, we prove some negative correlation inequalities for coalescing Brownian motions, as well as extend the theory of marked metric measure spaces (developed recently by Depperschmidt, Greven and Pfaffelhuber [DGP12]) from the case of probability measures to measures that are finite on bounded sets. AMS 2010 subject classification: 60K35, 60J65, 60J70, 92D25. Keywords: Brownian web, continuum-sites stepping stone model, evolving genealogies, interacting Fleming-Viot process, marked metric measure space, martingale problems, negative correlation inequalities, spatial continuum limit.

Contents

1. Introduction and main results 2 1.1. Background and Overview 2 1.2. Marked Metric Measure Spaces 5 1.3. Interacting Fleming-Viot (IFV) Genealogy Processes 8 1.4. Genealogies of Continuum-sites Stepping Stone Model (CSSM) on R 13 1.5. Convergence of Rescaled IFV Genealogies 16 1.6. Martingale Problem for CSSM Genealogy Processes 17 1.7. Outline 20 2. Proofs of the properties of CSSM Genealogy Processes 20 3. Proof of convergence of Rescaled IFV Genealogies 23 3.1. Convergence of Finite-dimensional Distributions 24 3.2. Tightness 27 3.3. Proof of convergence of rescaled IFV processes (Theorem 1.30) 34 4. Martingale Problem for CSSM Genealogy Processes 34 4.1. Generator Action on Regular Test Functions 35 4.2. Generator Action on General Test Functions 44 4.3. Proof of Theorem 1.38 48 Appendix A. Proofs on marked metric measure spaces 48 Appendix B. The Brownian web 50 Appendix C. Correlation Inequalities for Coalescing Brownian motions 52 Appendix D. Proof of Theorems 1.11 and 1.16 56 References 61

Date: May 12, 2015 script/greven/gsw/gswN34.tex. 1Universit¨atErlangen-N¨urnberg, Germany. Email: [email protected] 2National University of Singapore, Singapore. Email: [email protected] 3Universit¨atDuisburg-Essen, Germany. Email: [email protected] 1 2 ANDREAS GREVEN 1, RONGFENG SUN 2, ANITA WINTER 3

1. Introduction and main results In the study of spatial population models on discrete geographic spaces (for example d Z ), such as branching processes, voter models, or interacting Fisher-Wright diffusions (Fleming-Viot models), the technique of passing to the spatial continuum limit has proven to be useful in gaining insight into the qualitative behaviour of these processes. A key d example is branching random walks on Z , leading to the Dawson-Watanabe process d [Daw77] on R and Fisher-Wright diffusions; catalytic branching and mutually catalytic + + branching on Z, leading to SPDE on R [KS88, EF96, DP98, DEF 02b, DEF 02a]. The goal of this paper is to carry out this program at the level of genealogies, rather than just type or mass configurations. We focus here on interacting Fleming-Viot models on Z.

1.1. Background and Overview. We summarize below the main results of this paper, recall some historical background, as well as state some open problems. Summary of results. The purpose of this paper is twofold. On the one hand, we want to understand the formation of large local one-family clusters in Fleming-Viot populations 1 on the geographic space Z , by taking a space-time continuum limit of the genealogical configurations equipped with types. On the other hand we use this example to develop the theory of tree-valued dynamics via martingale problems in some new directions. In particular, this is the first study of a tree-valued dynamics on an unbounded geograph- ical space with infinite sampling measure, which requires us to extend both the notion of marked metric measure spaces in [GPW09, DGP11] and the martingale problem for- mulations in [GPW13, DGP12] to marked metric measure spaces with infinite sampling measures that are boundedly finite (i.e., finite on bounded sets). Here is a summary of our main results: (1) We extend the theory of marked metric measure spaces [GPW09, DGP12] from probability sampling measures to infinite sampling measures that are boundedly finite, which serve as the state space of marked genealogies of spatial population models. See Section 1.2. (2) We characterize the evolution of the genealogies of interacting Fleming-Viot (IFV) models by well-posed martingale problems on spaces of marked ultrametric measure spaces. See Section 1.3. (3) We give a graphical construction of the spatial continuum limit of the IFV genealogy process, which is the genealogy process of the so-called Continuum Sites Stepping- stone Model (CSSM), taking values in the space of ultrametric measure spaces with spatial marks and an infinite total population. The graphical construction is based on the (dual) Brownian web [FINR04]. The CSSM genealogy process has the peculiar feature that, as soon as t > 0, the process enters a regular subset of the state space that is not closed under the topology. This leads to complications for the associated martingale problem and for the study of continuity of the process at time 0. See Section 1.4. (4) We prove that under suitable scaling, the IFV genealogy processes converge to the CSSM genealogy process. The proof is based on duality with spatial coales- cents, together with a novel approach of controlling the genealogy structure using a weaker convergence result on the corresponding measure-valued processes, with measures on the geographic and type space (with no genealogies). See Section 1.5. (5) We show that the CSSM genealogy process solves a martingale problem with a singular generator. More precisely, its diffusion term carries coefficients of a singu- lar nature, which only acts on a part of the geographical space with zero measure. CONTINUUM SPACE LIMIT OF IFV GENEALOGIES 3

In particular, the generator is only defined on a regular subset of the state space. See Section 1.6. (6) We prove some negative correlation inequalities for coalescing Brownian motions, which are of independent interest. See Appendix C. Besides the description of the genealogies of the current population, we also prepare the ground for the treatment of all individuals ever alive, i.e. fossils, moving from the state space of marked ultrametric measure spaces to the state space of marked measure R-trees, which will be carried out elsewhere.

History of the problem: Why are we particularly interested in one-dimensional ge- ographic spaces for our scaling results? Many interacting spatial systems that model G evolving populations, i.e., Markov processes with state spaces I (I = R, N, [0, 1], etc., d and G = Z or the hierarchical group ΩN ) that evolve by a migration mechanism be- tween sites and a stochastic mechanism acting locally at each site, exhibit a dichotomy d in their longtime behavior. For example, when G = Z and the migration is induced by the simple symmetric : in dimension d ≤ 2, one observes convergence to laws concentrated on the traps of the dynamic; while in d ≥ 3, nontrivial equilibrium states are approached and the extremal invariant measures are spatially homogeneous ergodic measures characterized by the intensity of the configuration. Typical examples include the , branching random walks, spatial Moran models, or systems of interact- ing diffusions (e.g., Feller, Fisher-Wright or Anderson diffusions). One obtains universal dimension-dependent scaling limits for these models if an additional continuum spatial limit is taken, resulting in, for example, super (see Liggett [Lig85] or Dawson [Daw93]). In the low-dimension regime, the cases d = 1 and d = 2 are very different. In d = 2, one observes for example in the voter model the formation of mono-type clusters on spatial scales tα/2 with a random α ∈ [0, 1], a phenomenon called diffusive clustering (see Cox and Griffeath [CG86]). In the one-dimensional voter model, the clusters have an extension of a fixed order of√ magnitude but exhibit random factors in that scale. More precisely, in space- time scales ( t, t) for t → ∞, we get annihilating Brownian motions. Similar results have been obtained for low-dimensional branching systems (Klenke [Kle00], Winter [Win02]), systems of interacting Fisher-Wright diffusions (Fleischmann and Greven [FG94], [FG96] and subsequently [Zho03], [DEF+00] ) and for the Moran model in d = 2 (Greven, Limic, Winter [GLW05]). In all these models, one can go further and study the complete space-time genealogy structure of the cluster formation and describe this phenomenon asymptotically by the spatial continuum limit. In particular, the description for the one-dimensional voter model can be extended to the complete space-time genealogy structure, obtaining as scaling limit the Brownian web [Arr, TW98, FINR04]. More precisely, the Brownian web is defined by considering instantaneously coalescing one-dimensional Brownian motions starting from every space-time point in R × R. It arises as the diffusive scaling limit of continuous time coalescing simple symmetric random walks starting from every space-time point in Z×R, which represent the space-time genealogies of the voter model. This is analogous to the study of historical process for branching processes, which approximates the ancestral paths of branching random walks by that of super Brownian motion (see e.g. [DP91, FG96, GLW05]). The basic idea behind all this is that, we can identify the genealogical relationship between the individuals of the population living at different times and different locations. This raises the question of whether one can obtain a description of the asymptotic behavior of the complete genealogical structure of the process on large space-time scales, which will 4 ANDREAS GREVEN 1, RONGFENG SUN 2, ANITA WINTER 3 in turn allow for asymptotic descriptions of interesting genealogical that are not expressible in a natural way in terms of the configuration process. These observations on the genealogical structure goes back to the graphical construction of the voter model due to Harris, and continues up to the historical process of Dawson and Perkins for branching models [DP91], or representation by contour processes [GJ98, Ald93]. To better describe genealogies, the notion of R-trees, marked R-trees or marked measure R- trees were developed as a framework [Eva97, EPW06]. These objects contain the relevant information abstracted from the labeled genealogy tree, where every individual is coded with its lifespan and its locations at each time. Such a coding means in particular that all members of the population are distinguished, which information is mostly not needed. In the large population limit, it suffices to consider the statistics of the population via sampling. For this purpose, one equips the population with a metric (genealogical distance), a probability measure (the so-called sampling measure, which allows to draw typical finite samples from the population), and a mark function (specifying types and locations). This description in terms of random marked metric measure spaces (in fact, the metric defines a tree) is the most natural framework to discuss the asymptotic analysis of population models, since it comprises exactly the information one wants to keep for the analysis in the limits of populations with even locally infinitely many individuals. The evolution of a process with such a state space is described by martingale problems [GPW13, DGP12].

Open problems and conjectures. We show in this paper that the spatial continuum limit of the one-dimensional interacting Fleming-Viot genealogy process solves a martin- gale problem. However, due to the singular nature of the generator, the uniqueness of this martingale problem is non-standard, and we leave it for a future paper. Instead of Z, resp. R, as geographic spaces, one could consider the hierarchical group ΩN = ⊕NZN , with ZN being the cyclical group of order N, reps. the continuum hierarchical ∞ L group ΩN = ZN . Brownian motion on R can be replaced by suitable L´evy processes ∞ Z on ΩN and the program of this paper can then be carried out. The Brownian web would have to be replaced by an object based on L´evyprocesses as studied in [EF96], [DEF+00]. We conjecture that the analogues of our theorems would hold in this context. A further ∞ challenge would be to give a unified treatment of these results on R, ΩN . Another direction is to consider the genealogy processes of interacting Feller diffusions, catalytic or mutually catalytic diffusions, interacting logistic Feller diffusions, and derive their genealogical continuum limits. A more difficult extension would be to include ances- tral paths as marks, which raises new challenges related to topological properties of the state space.

Outline of Section 1. The remainder of the introduction is organized as follows. In Subsection 1.2, we recall and extend the notion of marked metric measure spaces needed to describe the genealogies. In Subsection 1.3, we define the interacting Fleming-Viot (IFV) genealogy process via a martingale problem and give a dual representation in terms of a spatial coalescent. In Subsection 1.4, we give, based on the Brownian web, a graphical construction for the continuum-sites stepping stone model (CSSM) on R and its marked genealogy process, which is the continuum analogue and scaling limit of the interact- ing Fleming-Viot genealogy process on Z, under diffusive scaling of space and time and normalizing of measure, a fact which we state in Subsection 1.5. In Subsection 1.6, we formulate a martingale problem for the CSSM genealogy process. In Subsection 1.7 we outline the rest of the paper. CONTINUUM SPACE LIMIT OF IFV GENEALOGIES 5

1.2. Marked Metric Measure Spaces. To formulate our main results, we first need to introduce the space in which the genealogies of interacting Fleming-Viot processes take their value. We will regard the genealogies of all individuals in the population as a marked metric measure space. Here is the precise definition of marked metric measure spaces, extending the one introduced in [DGP11], which considered probability measures.

Definition 1.1 (V -mmm-spaces). Let (V, rV ) be a complete separable metric space with metric rV , and let o be a distinguished point in V . 1. We call (X, r, µ) a V -marked metric measure space (V -mmm space for short) if: (i)( X, r) is a complete separable metric space, (ii) µ is a σ-finite measure on the Borel σ-algebra of X ×V , with µ(X ×Bo(R)) < ∞ for each ball Bo(R) ⊂ V of finite radius R centered at o. 2. We say two V -mmm spaces (X, rX , µX ) and (Y, rY , µY ) are equivalent if there exists a measureable map ϕ : X → Y , such that

(1.1) rX (x1, x2) = rY (ϕ(x1), ϕ(x2)) for all x1, x2 ∈ supp(µX (· × V )),

and if ϕe : X × V → Y × V is defined by ϕe(x, v) := (ϕ(x), v), then −1 (1.2) µY = µX ◦ ϕe . In other words, ϕ is an isometry between supp(µX (· × V )) and supp(µY (· × V )), and the induced map ϕe is mark and measure preserving. We denote the equivalence class of (X, r, µ) by (1.3) (X, r, µ). 3. The space of (equivalence classes of) V -mmm spaces is denoted by V (1.4) M := {(X, r, µ):(X, r, µ) is a V -mmm space}. 4. The subspace of (equivalence classes of) V -mmm spaces which admit a mark func- tion is denoted by V V (1.5) Mfct := {(X, r, µ) ∈ M : ∃κ : X → V s.t. µ(d(x, ·)) = µ(dx × V ) ⊗ δκ(x)}. V Note that M depends both on the set V and the metric rV since the latter defines the sets on which the measure must be finite. Marked metric measure spaces were introduced in [DGP11], which extends the notion of metric measure spaces studied earlier in [GPW09]. Definition 1.1 is exactly the analogue of [DGP11, Def. 2.1], where µ is a probability measure. The basic interpretation in our context is that: X is the space of individuals; r(x1, x2) measures the genealogical distance between two individuals x1 and x2 in X; the measure µ is a measure on the individuals and the marks they carry (which can be spatial location as well as type, or even ancestral paths up to now), allowing us to draw samples from individuals with marks in a bounded set. V To define a topology on M that makes it a Polish space, we will make use of the marked Gromov-weak topology introduced in [DGP11, Def. 2.4] for V -mmm spaces equipped with probability measures. The basic idea is that, given our assumption on µ in Defini- tion 1.1.1.(ii), we can localize µ to finite balls in V to reduce µ to a finite measure. We can then apply the marked Gromov-weak topology (which also applies to finite measures instead of probability measures) to require convergence for each such localized version of the V -mmm spaces. We will call such a topology V -marked Gromov-weak# topology, replacing weak by weak#, following the terminology in [DVJ03, Section A2.6] for the con- vergence of measures that are bounded on bounded subsets of a complete separable metric 6 ANDREAS GREVEN 1, RONGFENG SUN 2, ANITA WINTER 3 space. Note that vague convergence is for measures that are finite on compact rather than bounded subsets. Both notions agree, however, on Heine-Borel spaces (compare, [ALW]). Definition 1.2 (V -marked Gromov-weak# Topology). Fix a sequence of continuous func- tions ψk : V → [0, 1], k ∈ N, such that ψk = 1 on Bo(k), the ball of radius k centered at o ∈ c V , and ψk = 0 on Bo(k +1). Let χ := (X, r, µ) and χn := (Xn, rn, µn), n ∈ N, be elements V of M . Let ψk · µ be the measure on X × V defined by (ψk · µ)(d(x, v)) := ψk(v)µ(d(x, v)), # and let ψk ·µn be defined similarly. We say that χn −→ χ in the V -marked Gromov-weak n→∞ topology if and only if:

(Xn, rn, ψk · µn) =⇒ (X, r, ψk · µ) in the Gromov-weak topology for each k ∈ . n→∞ N d When V = R , we may choose ψk to be infinitely differentiable. # Remark 1.3 (Dependence on o and (ψk)k∈N). Note that the V -marked Gromov-weak topology does not depend on the choice o ∈ V and the sequence (ψk)k∈N, as long as ψk has bounded support and Ak := {v : ψk(v) = 1} increases to V as k → ∞.

V V N V Remark 1.4 (M as a subspace of (Mf ) ). Let Mf denote the space of (equivalent classes of) V -mmm spaces with finite measures, equipped with the V -marked Gromov- V weak topology as introduced in [DGP11, Def. 2.4]. Then each element (X, r, µ) ∈ M can V N be identified with a sequence ((X, r, ψ1 · µ), (X, r, ψ2 · µ),...) in the product space (Mf ) , equipped with the product topology. This identification allows us to easily deduce many V V properties of M from properties of Mf that were established in [DGP11]. In particular, # V we can metrize the V -marked Gromov-weak topology on M by introducing a metric (which can be called V -marked Gromov-Prohorov# metric) (1.6) ∞ X dMGP ((X1, r1, ψk · µ1), (X2, r2, ψk · µ2)) ∧ 1 d # ((X , r , µ ), (X , r , µ )) := , MGP 1 1 1 2 2 2 2k k=1 V where dMGP is the marked Gromov-Prohorov metric on Mf , which was introduced in [DGP11, Def. 3.1] and metrizes the marked Gromov-weak topology. The proof of the following result combines the corresponding results in [DGP12] and [ALW], and will be deferred to Appendix A.

V # Theorem 1.5 (Polish space). The space M , equipped with the V -marked Gromov-weak topology, is a Polish space.

V V Points in M , as well as weak convergence of M -valued random variables, can be V determined by the so-called polynomials on M , which are defined via sampling a finite subset on the V -mmm space.

n Definition 1.6 (Polynomials). Let n ∈ N. Let g ∈ Cbb(V , R), the space of real-valued n bounded continuous function on V with bounded support. For k ∈ N ∪ {0, ∞}, let φ ∈ n n k (2) (2) Cb (R+ , R), the space of k-times continuously differentiable real-valued functions on R+ n,φ,g V with bounded derivatives up to order k. We call the function Φ : M → R defined by Z Z (1.7) Φn,φ,g((X, r, µ)) := ··· φ(r)g(v)µ⊗n(d(x, v)), a monomial of order n, where v := (v1, . . . , vn), x := (x1, . . . , xn), r = r(x) := (r(xi, xj))1≤i

n k n,φ,g k (2) n (a) Let Πn := {Φ : φ ∈ Cb (R+ , R), g ∈ Cbb(V , R)}, which we call the space k of monomials of order n (with differentiability of order k). Let Π0 be the set of constant functions. We then denote by k k Π := ∪n∈N0 Πn the set of all monomials (with differentiability of order k). d (b) For V = R and k, l ∈ N ∪ {0, ∞}, we define n k,l n,φ,g k (2) l n Πn := {Φ : φ ∈ Cb (R+ , R), g ∈ Cbb(V , R)}, k,l k,l and Π := ∪n∈N0 Πn . (c) We call the linear spaces (1.8) Πe k,` generated by Πk,`, the polynomials (with differentiability of φ, resp. g, of order k, resp. `). Remark 1.7. Note that the polynomials form an algebra of bounded continuous func- tions. Theorem 1.8 (Convergence determining class). We have the following properties for Πk, for each k ∈ N ∪ {0, ∞}: k V V (i)Π is convergence determining in M . Namely, (Xn, rn, µn) → (X, r, µ) in M if k and only if Φ((Xn, rn, µn)) → Φ((X, r, µ)) as n → ∞ for all Φ ∈ Π . k V (ii)Π is also convergence determining in M1(M ), the space of probability measures V V on M . Namely, a sequence of M -valued random variables (Xn)n∈N converges V weakly to a M -valued X if and only if E[Φ(Xn)] → E[Φ(X )] as n → ∞ for all Φ ∈ Πk. d k,l (iii) For V = R and each k, l ∈ N ∪ {0, ∞}, Π is also convergence determining in V V M and M1(M ). We defer the proof of Theorems 1.5 and 1.8, as well as some additional properties of V -mmm spaces, to Appendix A. V V Recall Mfct from Definition 1.1, and notice that Mfct is not complete, and that we there- V V fore choose the bigger space M as the state space. The space M allows an individual x ∈ X to carry a set of marks, equipped with the conditional measure of µ on V given x ∈ X. If each individual carries only a single mark which we can identify via a mark function κ : X → V , the corresponding marked metric measure space is an element of V Mfct. This will be the case for the interacting Fleming-Viot process that we will study. V It can be shown that every element in Mfct is an element of the closed space (1.9) V [ \  V 0 0 0 Mh := (X, r, µ) ∈ M : ∃µA ∈ Mf (X × A): µ ≤ µ(· × A), kµ − µ(· × A)kTV ≤ hA(δ), Abdd δ>0 0 ⊗2 (µA) {r(x1, x2) < δ, d(v1, v2) > hA(δ)} = 0 , for some hA ∈ H, where

(1.10) H := {h : R+ → R+ ∪ {∞} : h is continuous and increasing, and h(0) = 0} ([KL, Lemma 2.8]). Remark 1.9. For the models we consider, the genealogies lie in Polish spaces which V arise as closed subspaces of M . Note that the current population alive corresponds to the leaves of a genealogical tree, and hence the associated V -mmm space is ultrametric. We V will denote the space of V -marked ultrametric measure spaces by U . They form a closed 8 ANDREAS GREVEN 1, RONGFENG SUN 2, ANITA WINTER 3

V V subspace of M and hence U is Polish. The same holds when we consider the subset of V M , whose associated measures all follow a prescribed measure when projected onto the mark space V .

1.3. Interacting Fleming-Viot (IFV) Genealogy Processes. We now study the ge- nealogies of the measure-valued interacting Fleming-Viot (IFV) processes on a countable geographic space and with allelic types, typically taken from the type space [0, 1] (see [DGV95] for details on IFV), which is motivated by the following individual-based model, the so-called Moran model. Consider a population of individuals with locations indexed by a countable additive group V (for us this later will be Z). The individuals migrate independently according to rate one continuous time random walks with transition probability kernel a(·, ·),

(1.11) a(v1, v2) = a(0, v2 − v1) for all v1, v2 ∈ V. We denote the transition kernel of the time reversed random walks by

(1.12)a ¯(v1, v2) := a(v2, v1) = a(0, v1 − v2). Individuals furthermore reproduce by resampling, where every pair of individuals at the same site die together at exponential rate γ > 0, and with equal probability, one of the two individuals is chosen to give birth to two new individuals at the same site with the same type as the parent. This naturally induces a genealogical structure. The genealogical distance of two individuals at time t is 2 min{t, T } plus the distance of the ancestors at time 0, where T is the time it takes to go back to the most recent common ancestor. Imposing the counting measure on the population with each individual carrying its location as a mark, we obtain a V -mmm space and its equivalence class is the state of the genealogy process. Letting now the number of individuals per site tend to infinity and normalizing the measure such that each site carries population mass of order one, we obtain a diffusion model, the interacting Fleming-Viot (IFV) genealogy process with

V (1.13) state space U1 , the space of V -marked ultrametric measure spaces with the property that the projection of µ onto the countable geographic space V is counting measure, the 1 indicating that the measure restricted to each colony is a probability measure. This (see Remark 1.9) is again a Polish space. (For the diffusion limit in the case of a finite geographic space V , see [GPW13], [DGP12]).

Remark 1.10. If we introduce as marks (besides locations from a countable geographic space G) also allelic types from some set K (typically taken as a closed subset of [0, 1]), then the type is inherited at reproduction and V = K × G is the product of type space and geographic space. In this case the localization procedure in Definition 1.2 applies to the geographic space, since K is compact. 1.3.1. The genealogical IFV martingale problem. We now define the interacting Fleming- Viot (IFV) genealogy process via a martingale problem for a linear operator LFV on V Cb(U1 , R), acting on polynomials. For simplicity we first leave out allelic types, which we introduce later in Remark 1.12. FV V 1,0 The linear operator L on Cb(U1 , R), with domain Π as introduced in Defini- tion 1.6 (b), consists of three terms, corresponding respectively to aging, migration, and reproduction by resampling. With X = (X, r, µ) and Φ = Φn,φ,g ∈ Π1,0, CONTINUUM SPACE LIMIT OF IFV GENEALOGIES 9

Z X ∂ LFVΦ(X ) = 2 µ⊗n(d(x, v)) g(v) φ(r) ∂rk,` (X×V )n 1≤k<`≤n Z n ⊗n X X 0 0 (1.14) + µ (d(x, v)) φ(r) a¯(vj, v )(Mvj ,v g − g)(v) j=1 0 (X×V )n v ∈V Z ⊗n X + 2 γ µ (d(x, v)) g(v) 1{vk=v`}(θk,`φ − φ)(r), (X×V )n 1≤k<`≤n where x = (x1, ··· , xn), v = (v1, ··· , vn), r := (rk,`)1≤k<`≤n := (r(xk, x`)1≤k<`≤n), and 0 0 (1.15) (Mvj ,v g)(v1, ··· , vn) := g(v1, ··· , vj−1, v , vj+1, ··· , vn) encodes the replacement by migration of the j-th sampled individual corresponding to a 0 jump from location vj to v , while θk,` encodes the replacement of the `-th individual by the k-th individual (both at the same site), more precisely,  r(xi, xj) if i, j 6= `,  (1.16) (θk,`φ)(r) := φ(θk,`r), with (θk,`r)i,j := r(xi, xk) if j = `,   r(xk, xj) if i = `. The following results cover a particular case of evolving genealogies of interacting Λ- Fleming-Viot diffusions which will be studied in [GKW]. For the sake of completeness we V present all proofs in Appendix D. The first result states that there is a unique U1 -valued diffusion process associated with this operator. It is a special case of the martingale problem characterization of interacting Λ-Fleming- Viot genealogy processes which are studied in detail in forthcoming work [GKW]. We therefore defer the proof to Appendix D. Theorem 1.11 (Martingale problem characterization of IVF Genealogy processes). For V any X0 = (X0, r0, µ0) ∈ U1 , we have: FV 1,0 V (i) The (L , Π , δX0 )-martingale problem is well-posed, i.e. there exists a U1 -valued FV FV process X := (Xt )t≥0, unique in its distribution, which has initial condition 1,0 X0 and c`adl`agpath, such that for all Φ ∈ Π and w.r.t. the natural filtration FV generated by (Xt )t≥0, t Z  FV FV FV FV  (1.17) Φ(Xt ) − Φ(X0 ) − (L Φ)(Xs )ds is a martingale. t≥0 0 (ii) The solutions (for varying initial conditions) define a strong Markov, and with continuous path. (iii) If the initial state admits a mark function, then so does the path for all t > 0 almost surely. Remark 1.12. If we add the type of an individual as an additional mark, i.e., V := G×K with geographic and type space respectively, then the same result holds if we modify as follows. We require the states to satisfy the constraint that the projection of µ on the geographic space V is still the counting measure. The test functions Φ should be modified n so that we multiply g : G → R, acting on the locations of the n sampled individuals, by n FV another factor gtyp : K → R, acting on the types of the individuals. The generator L 10 ANDREAS GREVEN 1, RONGFENG SUN 2, ANITA WINTER 3 should be modified accordingly, so that gtyp changes at resampling from gtyp to gtyp ◦ θek,`, with θek,` replacing the `-th sampled individual by the k-th one, see [DGP12]).

FV Remark 1.13. The process Xt = (Xt, rt, µt) has the property that the measure- valued process Xbt given by the collection {µt(Xt × {i} × ·), i ∈ G}, is the IFV process on (M1(K))G. Remark 1.14. From Section 1.4 onward, we will choose V = Z. However in the subsequent analysis, it is important to observe that we can embed Z into R and view UZ as a closed subspace of UR, and view the IFV genealogy process as UR-valued process. 1.3.2. Duality. The IFV genealogy process X FV can be characterized by a dual process, the spatial coalescent. The formulation given here can also incorporate mutation and selection (see [DGP12]). The dual process is driven by a system of n coalescing random walks for some n ∈ N, labeled from 1 to n starting at ξ0,1, ··· , ξ0,n ∈ V at time 0. Each walk i is identified with the partition element {i} for a partition of the set {1, 2, ··· , n}. We order partition elements n (2) by their smallest element they contain. We also fix two functions, φ ∈ Cb(R+ , R), which takes as its arguments the genealogical distances between the n coalescing walks, and n g ∈ Cbb(V , R), which takes as its arguments the initial positions ξ0 := (ξ0,1, . . . , ξ0,n) of the walks. The dynamics of the dual process is as follows. Partition elements migrate independently on V till they merge, when they follow the same walk, according to rate 1 continuous time random walks with transition kernela ¯. Independently, every pair of partition elements at the same location in V merge at rate γ. At time t, we define the genealogical distance rt(i, j) of two individuals i and j in {1, 2, ··· , n} as 2 min{t, Ti,j}, where Ti,j is the first time the two walks labeled by i and j coalesce, i.e., i and j are in the same partition element. The positions of the n walks are denoted by ξt := (ξt,1, . . . , ξt,n). 0 0 This way we obtain a process (Kt)t≥0 with states (π, ξ , r ) ∈ Sn, where π is a partition of the set {1, ··· , n} with cardinality |π|, ξ0 records the position of the walk at the present 0 0 time, and r := (r (i, j))1≤i

V 0 0 where X = (X, r, µ) ∈ U1 , K = (π, ξ , r ) ∈ S, from X an independent sample (x, v) ∈ X × V is taken according to µ for for each partition element in π, where all individuals π in the partition element are assigned the same value (x, v), and r = (rπ(k),π(`))1≤k<`≤n π , v = (vπ(i))i=1,··· ,n, with π(i) being the partition element of π containing i. In other words, we concatenate, compare Remark 1.20, X and K to obtain a new V -mmm space and then apply φ and g. Remark 1.15. n ∞ (2) n Note that {H(·, K): K ∈ Sn, φ ∈ Cb (R+ , R), g ∈ Cbb(V , R), n ∈ N} is law-determining V and convergence-determining on U1 . FV The IFV genealogy process (Xt )t≥0 is dual to the coalescent (Kt)t≥0, and its law and behavior as t → ∞ can be determined as follows. CONTINUUM SPACE LIMIT OF IFV GENEALOGIES 11

Theorem 1.16 (Duality and longtime behaviour). The following properties hold for the FV IFV genealogy process (Xt )t≥0: FV V (a) For every X0 ∈ U1 and K0 ∈ S, we have FV FV (1.19) E[H(Xt , K0)] = E[H(X0 , Kt)], t ≥ 0. 1 (b) If ba(·, ·) = 2 (a(·, ·) +a ¯(·, ·)) is recurrent, then FV V (1.20) L[Xt ]=⇒ Γ ∈ M1(U1 ), t→∞ FV V where Γ is the unique invariant measure of the process X on U1 . FV Remark 1.17. If ba is transient, then we can decompose Xt = (Xt, rt, µt) in such i i a way that Xt is the countable union of disjoint sets Xt , µt is the sum of measures µt, i i i 0 j and {(X , rt| i i , µ )}i∈ are V -mmm spaces such that for x ∈ X , x ∈ X with i 6= j, t Xt ×Xt t N t t 0 i i rt(x, x ) diverges in probability as t → ∞, and each (X , rt| i i , µ ) converges in law to t Xt ×Xt t a limiting V -mmm space. Alternatively we can transform distances r : r → 1 − e−r and obtain a unique equilibrium in that case. See also [GM]. A further consequence of relation (1.19) is that we can specify the finite dimensional FV distributions of (Xt )t≥0 completely in terms of the spatial coalescent as follows. Fix a time horizon t > 0. The finite dimensional distributions are determined by the expectations of 1,0 Φ,e where for some n ∈ N, 0 ≤ t1 < t2 < ··· < t` = t,Φk ∈ Π of order nk and defined by φk and gk for each k = 1, ..., `, we define (1.21) Φ((X FV) ) := Πn Φ (X FV). e t t≥0 k=1 k tk

The dual is the spatial coalescent with frozen particles (Kes)s∈[0,t] , for which the time index s ∈ [0, t] runs in the opposite direction from X FV. Namely, looking backward from time t, for each 1 ≤ k ≤ `, we start n particles at time t−t at locations ξ1 := (ξ1 , . . . , ξ1 ) ∈ k k tk tk,1 tk,nk k V , each forming its own partition element in the partition π of {1, 2, . . . , n1 + ··· + n`}. The particles perform the usual dynamics of the spatial coalescent, with the particles starting at time t − tk frozen before then. At time s, the genealogical distance rs(i, j) between two individuals i and j, started respectively at times t − ti, t − tj ≤ s, is defined to be 2 min{s, Ti,j} − (t − ti) − (t − tj), where Ti,j is the first time the two individuals coalesce. If s < t − ti or t − tj, we simply set rs(i, j) = 1. Denote this new spatial coalescent process with frozen particles by (Ket)t≥0. The state space Se is the set of all 0 0k  (1.22) π, r , (ξ , tk, nk)1≤k≤n 0k for all n ∈ N,(nk)1≤k≤n ∈ N, and 0 ≤ t1 < ··· < t`, where ξ records the present positions of the particles starting at time tk. Let us label the individuals from time t1 0 0 0 to time tk successively by 1, 2, . . . , m := n1 + ··· + n`, and let ξ := (ξ1, . . . , ξm) denote present positions. We can then rewrite the state in (1.22) as 0 0 (1.23) π, ξ , r , (tk, nk)1≤k≤`). V Let (φk, gk)1≤k≤` be as in the definition of Φe in (1.21). Let X = (X, r, µ) ∈ U1 , and let V Ke ∈ Se be as in (1.22) and (1.23). We then define the duality function He(X , Ke): U1 ×Se → R which determines the finite dimensional distributions of X for varying Ke by

Z m  O  Y π 0 (1.24) H(X , K) = µ (d(x, v)) 1 0 φ(r + r ), e e {vj =ξj } π j=1 (X×R)|π| 12 ANDREAS GREVEN 1, RONGFENG SUN 2, ANITA WINTER 3 with ` Y (1.25) φ(r) := φk((ri,j)n1+···+nk−1

FV V Corollary 1.18 (Space-time duality). Let X0 ∈ U1 , and Ke0 ∈ Se be as in (1.23) with tk ≤ t for all 1 ≤ k ≤ `. Let Φe and He be defined as in (1.21) and (1.24). Then we have FV FV (1.26) E[Φ((e Xs )s∈[0,t])] = E[He(X0 , Ket)]. FV In words, the genealogical distances of n1+···+n` individuals sampled from (Xs )s∈[0,t], with nk individuals sampled at time tk at specified locations, can be recovered by letting these individuals evolve backward in time as a spatial coalescent until time 0, at which FV point we sample from X0 according to the location of each partition element in the spatial coalescent. Remark 1.19. (Strong duality) We can obtain a strong dual representation, which FV represents the state Xt in terms of the entrance law of the tree-valued spatial coalescent starting with countably many individuals at each site of V realised both on the same probability space and its equivalence class w.r.t. isometry. The marked partition-valued case is constructed in [GLW05]. With this object we can associate just as above a marked tree ultrametric measure space Kt , which pasted to X0, has the same law as Xt. We need now the concept of subordination of two marked ultrametric measure spaces, allowing to concatinate a marked metric measure tree, prescribed as initial state with the one of the time t coalescent starting with countably many individuals. Remark 1.20. (Pasting) Let V be a countable mark space. We have two random marked ultrametric measure spaces with mark function, U = (U, r, κ, µ) and U 0 realized 0 0 0 independently, where U is generated by completing (N×V, r , κ ), with continuous continu- ation of the mark function and with sampling measure µ0 derived from the equidistribution and finally assume that it has diameter 2t. Furthermore there exists a map from the set of all open 2t-balls of U 0 into the geographic space V , say χ and a random i.i.d. assigned label on the elements of N × V in [0, 1]. Then we paste U 0 to U (written U ` U 0) by drawing from U an infinite sample at each location according to µ and then consider as distance matrix rU 0 + rsample(ψ(·), ψ(·))(U) evaluated for each pair of indices from N × V and denoted re. Here ψ : N × V −→ sample (U) with κ(ψ(i)) = χ(i), ψ(i) is constant on the open 2t-ball around i. This space 0 (N × V, r,e κ ) with local equidistribution as sampling measure we complete and take the equivalence class w.r.t. isomorphy to get the pasting of U 0 to U. Note that every marked metric measure tree we can obtain via the sampling procedure with subsequent completion. See Appendix A for a more formal definition. We choose χ now as the location of the partition element at time t of i and ψ as the local rank of i ∈ N × V at χ(i). For that purpose we use the i.i.d. [0, t]-distributed label. Then we can conclude as in [GPW13] that expectations of polynomials of Xt and dual Kt subordinated to X0 agree and by inspection that the law of X equals the law of the tree-valued coalescent entrance law for the coalescent started at time 0 run for time t and started with countably many individuals at each site of the finite geographic space V pasted to the state X0 if the migration kernel is doubly stochastic: CONTINUUM SPACE LIMIT OF IFV GENEALOGIES 13

(1.27) L[Xt] = L[X0 ` Kt]. For countably infinite geographic space we can construct the coalescence tree as well using the techniques in [GLW05], which shows that the frequencies of individuals located in a finite set A of the geographic space which are in finitely many partition elements at a positive time t can be made arbitrarily close to one. Hence the entrance law defines still a marked ultrametric measure space, see Section D. As a consequence (1.27) still holds.

1.4. Genealogies of Continuum-sites Stepping Stone Model (CSSM) on R. If we rescale space and time diffusively, the measure-valued interacting Fleming-Viot process on Z converges to a continuum space limit, the so-called continuum-sites stepping stone model (CSSM) . Formally, CSSM is a measure-valued process ν := (νt)t≥0 on R × [0, 1], where R is the geographical space and [0, 1] is the type space. We might think of individuals in the population which undergo independent Brownian motions, and whenever two individuals meet, one of the two individuals is chosen with equal probability and switches its type to that of the second individual. Provided that ν0(· × [0, 1]) is the Lebesgue measure on R, CSSM was rigorously constructed in [EF96, Eva97, DEF+00, Zho03] via a moment duality with coalescing Brownian motions. In particular, νt(· × [0, 1]) is the Lebesgue measure on R, for all t > 0. In this subsection we will construct the evolving genealogies of the CSSM based on dual- ity to the (dual) Brownian web, and establish properties (Proposition 1.24, Theorem 1.26). In [FINR04], the Brownian web W is constructed as a random variable where each realization of

(1.28) W is a closed subset of Π := ∪s∈RC([s, ∞), R), the space of continuous paths in R with any starting time s ∈ R. We equip Π with the topology of local uniform convergence. For each z := (x, t) ∈ R, we will let W(z) := W(x, t) denote the subset of paths in W with starting position x and starting time t. We 2 can construct W by first fixing a countable dense subset D ⊂ R , and then construct a collection of coalescing Brownian motions {W(z): z ∈ D}, with one Brownian motion starting from each z ∈ D. The Brownian web W is then obtained by taking a suitable closure of {W(z): z ∈ D} in Π, which gives rise to a set of paths starting from every point 2 in the space-time plane R . It can be shown that the law of W does not depend on the choice of D (see Theorem B.1 in Appendix B). The Brownian web W has a graphical dual called the dual Brownian web, which we denote by Wc. Formally,

(1.29) Wc is a random closed subset of Πb := ∪s∈RC((−∞, s], R), the space of continuous paths in R starting at any time s ∈ R and running backward in time as coalescing Brownian motions. The joint distribution of Wc and W is uniquely determined by the requirement that the paths of Wc never cross paths of W (see,y Theorem B.3 in Appendix B). Thus, jointly, the Brownian web and its dual is a random variable taking values in a Polish space, with (1.30) (W, Wc) ∈ Π × Πb. Interpreting coalescing Brownian motions in the (dual) Brownian web as ancestral lines specifying the genealogies, we can then give an almost sure graphical construction of the CSSM, instead of relying on moment duality relations similar to (1.24) as in [EF96, Eva97, DEF+00, Zho03], which nevertheless we get as corollary of the graphical construction. The CS classical measure-valued CSSM process can be recovered from (Xt )t≥0 by projecting the 14 ANDREAS GREVEN 1, RONGFENG SUN 2, ANITA WINTER 3

CS sampling measure (µt )t≥0 to the mark space V , if V is chosen to be the product of geographical space R and type space [0, 1]. In what follows, we will take V to be only the geographical space R, since types have no influence on the evolution of genealogies. We next explicitly construct a version of the CSSM genealogy process

CS CS CS CS CS CS (1.31) X := (Xt )t≥0, Xt = (Xt , rt , µt ), t ≥ 0,

R R as a functional of the (dual) Brownian web (W, Wc) and taking values in U1 ⊆ U , the space of R-marked ultrametric measure spaces introduced in Remark ??, where the pro- CS jection of µt on R is the Lebesgue measure. To avoid a disruption of the flow of presentation, background details on the (dual) Brownian web we will need are collected in Appendix B. We proceed in three steps: CS Step 1 (Initial states). Assume that X0 belongs to the following closed subspace of UR: R R (1.32) U1 := {(X, r, µ) ∈ U : µ(X × ·) is the Lebesgue measure on R}. R In other words, U1 is the set of R-marked ultrametric measure spaces where the projection of the measure on the mark space (geographic space) R is the Lebesgue measure. This is necessary for the duality between CSSM and coalescing Brownian motions. We will see CS R that almost surely Xt ∈ U1 for all t ≥ 0. CS CS Step 2 (The time-t genealogy as a metric space). To define (Xt , rt ) for every t > 0, let us fix a realization of (W, Wc), (compare, Appendix B). For each t > 0, let ˆ (1.33) At := {v ∈ R : Wc(v, t) contains a single path f(v,t)},Et := R\At.

2 By Lemma B.4 on the classification of points in R w.r.t. W and Wc, almost surely, Et ˆ is a countable set for each t > 0. For each v ∈ At, we interpret f(v,t) as the genealogy line of the individual at the space-time coordinate (v, t). Genealogy lines of individuals at different space-time coordinates evolve backward in time and coalesce with each other. At time 0, each genealogy line traces back to exactly one spatial location in the set ˆ ˆ (1.34) Eb := {f(v,s)(0) : v ∈ R, s > 0, f ∈ W}c , where we note that Eb is almost surely a countable set, because by Theorem B.1 and Lemma B.2, paths in Wc can be approximated in a strong sense by a countable subset of paths in Wc. For each v ∈ Eb, we then identify a common ancestor ξ(v) for all the individuals whose genealogy lines trace back to spatial location v at time 0, by sampling an individual CS CS CS ξ(v) ∈ X0 according to the conditional distribution of µ0 on X0 , conditioned on the CS spatial mark in the product space X0 × R being equal to v. We next characterize individuals by points in space. Note that there is a natural ge- ˆ ˆ nealogical distance between points in At. For individuals x, y ∈ At, if f(x,t) and f(y,t) ˆ ˆ coalesce at timeτ ˆ < t, then denoting u := f(x,t)(0) and v := f(y,t)(0), we define the distance between x and y by ( 2(t − τˆ) ifτ ˆ ≥ 0, (1.35) rt(x, y) := CS 2t + r0 (ξ(u), ξ(v)) ifτ ˆ < 0.

CS CS First define (Xt , rt ) as the closure of At w.r.t. the metric rt defined in (1.35). Note CS CS that (Xt , rt ) is ultrametric, and by construction Polish. CONTINUUM SPACE LIMIT OF IFV GENEALOGIES 15

Remark 1.21. We may even extend the distance rt to a distance r between (x, t) with x ∈ At and t > 0, and (y, s) with y ∈ As and s > 0. More precisely, let ( s + t − 2ˆτ ifτ ˆ ≥ 0, (1.36) r((x, t), (y, s)) := CS s + t + r0 (ξ(u), ξ(v)) ifτ ˆ < 0.

CS CS CS CS Step 3 (Adding the sampling measure). We now define Xt := (Xt , rt , µt ). CS For that purpose, we will represent Xt as an enriched copy of R (see (1.37) below). ˆ CS By identifying each x ∈ At with the path f(x,t) ∈ Wc, we can also identify Xt with the ˆ closure of {f(x,t) ∈ Wc : x ∈ At} in Π,b because a sequence xn ∈ At is a Cauchy sequence ˆ w.r.t. the metric rt if and only if the sequence of paths f(xn,t) is a Cauchy sequence when the distance between two paths is measured by the time to coalescence, which by Lemma B.2, ˆ is also equivalent to (f(xn,t))n∈N being a Cauchy sequence in Π.b ˆ When we take the closure of {f(x,t) ∈ Wc : x ∈ At} in Π,b only a countable number of paths in Wc are added, which are precisely the leftmost and rightmost paths in Wc(x, t), when Wc(x, t) contains more than one path. + − Namely for each x ∈ Et (recall from (1.33)), let x , x denote the two copies of x ± obtained by taking limits of xn ∈ At with either xn ↓ x or xn ↑ x in R, and let Et := ± {x : x ∈ Et}. We can then take CS + − (1.37) Xt := At ∪ Et ∪ Et , CS CS equipped with a metric rt , which is the extension of rt from At to its closure Xt , giving CS CS a Polish space (Xt , rt ). CS CS Next to get a sampling measure, note that each finite ball in (Xt , rt ) with radius + − less than t can be identified with an interval in R (modulo a subset of Et ∪ Et ∪ Et ), and hence can be assigned the Lebesgue measure of this interval. We then obtain a Borel CS CS CS measure µt on (Xt , rt ). This completes our construction of the CSSM genealogy CS process (Xt )t≥0.

A key feature is again duality. We can replace in the Definition of the process (Kt)t≥0 Br from Subsubsection 1.3.2 the random walks by Brownian motions to obtain (Kt )t≥0, CS which is dual to the process (Xt )t≥0 by construction: Corollary 1.22. (H-duality) The tree-valued CS-process and marked tree-valued coalesc- ing Brownian motions are in H-duality, i.e., for each H of the form (1.18), CS Br CS Br (1.38) E[H(Xt , K0 )] = E[H(X0 , Kt )]. Furthermore by the above construction the strong duality (1.27) holds.  Remark 1.23. Note in the continuum case the function Φn,g,ϕ(·, κBr) is not a poly- nomial since we fix the locations we consider. In order to get a polynomial we need to 2 consider a function g on mark space with g ∈ Cbb(R, R) over which we integrate w.r.t. the sampling measure.

We collect below some basic properties for the CSSM genealogy processes that we just constructed.

CS CS Proposition 1.24 (Regularity of states). Let X := (Xt )t≥0 be the CSSM genealogy CS R process constructed from the dual Brownian web Wc, with X0 ∈ U1 . Then almost surely, for every t > 0: 16 ANDREAS GREVEN 1, RONGFENG SUN 2, ANITA WINTER 3

CS CS (a) There exists a continuous (mark) function κt : Xt → R, i.e., µt (dxdv) = CS µt (dx × R)δκt(x)(dv); CS l (b) For each l ∈ (0, t), Xt is the disjoint union of balls (Bi)i∈Z of radius l. Fur- l thermore, there exists a locally finite set Et := {vi}i∈Z ⊂ R with vi−1 < vi for all l CS l i ∈ Z, such that {κ(x): x ∈ Bi} = [vi−1, vi] and µt (Bi) = vi − vi−1. We can l identify Et from Wc by l ˆ+ ˆ− (1.39) Et := {x ∈ Et : f(x,t) and f(x,t) coalesce at some time s ≤ t − l}, + − where f(x,t) and f(x,t) are respectively the rightmost and leftmost path in Wc(x, t); CS CS CS CS R CS CS (c) Xt = (Xt , rt , µt ) ∈ U1 , i.e., µt (Xt × ·) is the Lebesgue measure on R. Remark 1.25. Using the duality between the Brownian web W and Wc, as characterized in Appendix B, it is easily seen that we can also write l (1.40) Et = {f(t): f ∈ W(x, t − s) for some x ∈ R, s ≥ l}. Theorem 1.26 (Markov property, path continuity, Feller property). Let X CS be as in Proposition 1.24. Then CS R (a)( Xt )t≥0 is a U1 -valued Markov process; CS (b) Almost surely, Xt is continuous in t ≥ 0; CS,(m) CS,(m) R (c) For each m ∈ N, let X be a CSSM genealogy process with X0 ∈ U1 . If CS,(m) CS R CS,(m) X0 → X0 in U1 , then for any sequence tm → t ≥ 0, we have Xtm ⇒ CS Xt . R CS (d) For each initial state in U1 , L[Xt ] converges to the unique equilibrium distribution R on U1 as t → ∞. CS R Remark 1.27. Proposition 1.24 shows that even though X0 can be any state in U1 , CS R for t > 0, Xt can only take on a small subset of the state space U1 . This introduces complications in establishing the continuity of the process at t = 0, and it will also be an important point when we discuss the generator of the associated martingale problem. Remark 1.28. We note that if we allow types as well, we enlarge the mark space from R to R × [0, 1], where each individual carries a type in [0, 1] that is inherited upon resampling. Theorem 1.26 still holds in this case. We will consider such an extended mark space in Theorem 1.30 below. 1.5. Convergence of Rescaled IFV Genealogies. In this subsection, we establish the convergence of the interacting Fleming-Viot genealogy processes on Z to that of the CSSM, where we view the states as elements of UR (see Remark 1.14). We assume that the transition probability kernel a(·, ·) in the definition of the IFV process satisfies X X X (1.41) a(0, x)x = 0 and σ2 := a(0, x)x2 = a¯(0, x)x2 ∈ (0, ∞). x∈Z x∈Z x∈Z σ R R For each  > 0, we then define a scaling map S = S : U → U (depending on σ) as follows. Let X = (X, r, µ) ∈ UR. Then

(1.42) SX := (X,Sr, Sµ), 2 where (Sr)(x, y) :=  r(x, y) for all x, y ∈ X, and Sµ is the measure on X × R induced −1 by µ and the map (x, v) ∈ X × R → (x, σ v), and then the mass rescaled by a factor of σ−1. More precisely, −1 −1 (1.43) (Sµ)(F ) = σ µ{(x,  σv):(x, v) ∈ F } for all measurable F ⊂ X × R. CONTINUUM SPACE LIMIT OF IFV GENEALOGIES 17

We have the following convergence result for rescaled IFV genealogy processes.

FV, FV, Theorem 1.29 (Convergence of Rescaled IFV Genealogies). Let X := (Xt )t≥0 FV, Z be an IFV genealogy process on Z with initial condition X0 ∈ U1 , indexed by  > 0. FV, CS CS R Assume that a(·, ·) satisfies (1.41), and SX0 → X0 for some X0 ∈ U1 as  → 0. R FV, Then as C([0, ∞), U )-valued random variables, (SX−2t )t≥0 converges weakly to the CS CS CSSM genealogy process X := (Xt )t≥0 as  → 0. To prove Theorem 1.29, we will need an auxiliary result of interest in its own on the convergence of rescaled measure-valued IFV processes to the measure-valued CSSM. The FV FV IFV process with mark space R × [0, 1] is a measure-valued process (Xbt )t≥0, where Xbt is a measure on R × [0, 1], and its projection on R is the counting measure on Z. Similarly, CS CS the CSSM with mark space R × [0, 1] is a measure-valued process (Xbt )t≥0, where Xbt is a measure on R × [0, 1], and its projection on R is the Lebesgue measure on R. Define XbFV and XbCS respectively as the projection of the measure component of X FV and X CS, projected onto the mark space R × [0, 1]. We then have FV, FV, Theorem 1.30 (Convergence of Rescaled IFV Processes). Let Xb := (Xbt )t≥0 be a measure-valued IFV process on Z, indexed by  > 0. Assume that a(·, ·) satisfies (1.41), FV, CS CS and SXb0 converges to Xb0 w.r.t. the vague topology for some Xb0 as  → 0. FV, Then as C([0, ∞), M(R×[0, 1]))-valued random variables, (SXb−2t )t≥0 converges weakly CS CS to the CSSM process Xb := (Xbt )t≥0 as  → 0. A similar convergence result has previously been established for the voter model in [AS11]. Remark 1.31. As can be seen from the above convergence results and the regularity properties of the limit process in Proposition 1.24, on a macroscopic scale, there are only locally finitely many individuals with descendants surviving for a macroscopic time of δ or more. This phenomenon leads in the continuum limit to a dynamic driven by a thin subset of hotspots only. For similar effects in other population models, see for example [DEF+02b, DEF+02a]. 1.6. Martingale Problem for CSSM Genealogy Processes. In this section, we show that the CSSM genealogy processes solves a martingale problem with a singular generator. CS CS To identify the generator L for the CSSM genealogy process (Xt )t≥0, we note that for CS all t > 0, Xt satisfies the regularity properties established in Proposition 1.24. We will CS 1,2 see that L is only well-defined on Φ ∈ Π evaluated at points X ∈ UR with suitable regularity properties for Φ and X . R We now formalize the subset of regular points in U1 as follows, which satisfy exactly the properties in Proposition 1.24.

R R R Definition 1.32 (Regular class of states Ur ). Let Ur denote the set of X = (X, r, µ) ∈ U satisfying the following regularity properties: R (a) X ∈ U1 , i.e., µ(X × ·) is the Lebesgue measure on R; (b) there exists a mark function κ : X → R, with µ(dxdv) = µ(dx × R)δκ(x)(dv); (c) there exists δ > 0 such that for each l ∈ (0, δ), X is the disjoint union of balls l l (Bi)i∈Z of radius l. Furthermore, there exists a locally finite set E := {vi}i∈Z ⊂ R l l with vi−1 < vi for all i ∈ Z, such that {κ(x): x ∈ Bi} = [vi−1, vi] and µ(Bi) = vi − vi−1.

R By Proposition 1.24, Ur is a dynamically closed, separable, metric subset of the Polish R CS space U1 under the dynamics of X . However, it is not complete. 18 ANDREAS GREVEN 1, RONGFENG SUN 2, ANITA WINTER 3

R Remark 1.33. Similar to the discussion leading to (1.37), for X ∈ Ur , we can give a representation on an enriched copy of R as follows. Property (c) in Definition 1.32 implies that any two disjoint balls in X are mapped by κ to two intervals, which overlap at at most a single point in El for some l > 0. Therefore κ−1(x) must contain a single point for all x ∈ R with x not in l (1.44) E := ∪l>0E , −1 and κ (x) containing two or more points implies that x is in κ(B1) ∩ κ(B2) for two disjoint balls in X. By the same reasoning, for each x ∈ E, κ−1(x) must contain exactly two points, which we denote by x±, where x+ is a limit point of {κ−1(w): w > x} and x− is a limit point of {κ−1(w): w < x}. Similar to (1.37), we can then identify X with + − ± ± (R\E) ∪ E ∪ E , where E := {x : x ∈ E}. With this identification, we can simplify + − our notation (with a slight abuse) and let µ be the measure on (R\E) ∪ E ∪ E , which ± assigns no mass to E and is equal to the Lebesgue measure on R\E. We now introduce a regular subset of Π1,2 needed to define the generator LCS. 1,2 1,2 Definition 1.34 (Regular class of functions Πr ). Let Πr denote the set of regular test functions Φn,φ,g ∈ Π1,2, defined as in (1.7), with the property that: ∂φ (1.45) ∃ δ > 0 s.t. ∀ i 6= j ∈ {1, 2, ··· , n}, ((rk,l)1≤k

CS CS CS CS (1.46) L Φ(X ) := Ld Φ(X ) + La Φ(X ) + Lr Φ(X ), with the component for the massflow (migration) of the population on R given by Z CS 1 (1.47) Ld Φ(X ) := φ(r)∆g(x) dx, 2 Xn and the component for aging of individuals given by Z CS X ∂φ (1.48) La Φ(X ) := 2 g(x) (r) dx. n ∂ri,j X 1≤i

R These operators are linear operators on the space of bounded continuous functions, Cb(U1 , R), with domain Π1,2, and maps polynomials to polynomials of the same order. The component for resampling is, with θk,lφ defined as in (1.16), given by Z CS X ∗ ∗ Y (1.49) Lr Φ(X ) := 1{κ(xk)=κ(xl)} g(x)(θk,lφ − φ)(r) µ (dxk)µ (dxl) dxi, n 1≤k6=l≤n X 1≤i≤n i6=k,l with effective resampling measure and mark functions

∗ X X (1.50) µ := δx+ + δx− , x∈E x∈E

(1.51) κ(x±) = x for x ∈ E, where E, and x± for x ∈ E, are defined as in Remark 1.33. CONTINUUM SPACE LIMIT OF IFV GENEALOGIES 19

CS Remark 1.35. Note that Lr is singular. First because the effective resampling measure µ∗ is supported on a countable subset of X and is singular w.r.t. the sampling measure µ on X. Secondly, µ∗ is locally infinite because E ∩ (a, b) contains infinitely many points for any a < b. Therefore the r.h.s. in (1.49) is now in principle a sum of countably many monomials of order n − 2. + − Indeed, as we partition X = (R\E) ∪ E ∪ E into balls of radius l, with l ↓ 0, the balls must correspond to smaller and smaller intervals on R so as not contradict the fact that CS each point in X is assigned one value in R. Nevertheless, Lr Φ(X ) in (1.49) is well-defined R 1,2 at least on Ur because by our assumption that Φ ∈ Πr and condition (1.45) on φ, we + − have θk,lφ = φ if the resampling is carried out between two individuals x and x for some x ∈ E, with r(x+, x−) ≤ δ. Thus only resampling involving x ∈ E with r(x+, x−) > δ remains in the integration w.r.t. µ∗, and such x are contained in the locally finite set Eδ introduced in Definition 1.32 (c). Together with the assumption that g has bounded support, this implies that the integral in (1.49) is finite.

CS 1,2 Remark 1.36. The operator Lr , defined on functions in Πr evaluated at regular R R states in Ur ⊆ U1 , is still a linear operator, mapping polynomials to generalized poly- 1,2 nomials of degree reduced by two and with domain Πr . Here, generalized polynomial means that they are no longer bounded and continuous, and are only measurable functions R R CS defined on the subset of points Ur ⊆ U1 . Hence Lr differs significantly from generators associated with Feller semigroups on Polish spaces.

CS 1,2 1,2 For Lr Φ(X ) to be well-defined for any Φ ∈ Π , instead of Φ ∈ Πr , we need to place further regularity assumption on the point X at which we evaluate Φ. These assumptions are satisfied by typical realizations of the CSSM at a fixed time, as we shall see below.

R R R Definition 1.37 (Regular subclass of states Urr). Let Urr be the set of X = (X, r, µ) ∈ Ur with the further property that

X + − (1.52) r(x , x ) < ∞ for all n ∈ N, x∈E∩[−n,n]

+ − where we have identified X with (R\E) ∪ E ∪ E as in Remark 1.33. Theorem 1.38 (Martingale problem for CSSM genealogy processes). R CS 1,2 (i) If X0 ∈ U1 , then the (L , Π , δX0 )-martingale problem has a solution, i.e., there exists a process X := (Xt)t≥0 with initial condition X0, almost sure continuous R 1,2 sample path with Xt ∈ Ur for every t > 0, such that for all Φ ∈ Π , and w.r.t. the natural filtration, Z t  CS  (1.53) Φ(Xt) − Φ(X0) − (L Φ)(Xs)ds is a martingale. 0 t≥0 (ii) The CSSM genealogy process X CS constructed in Sec. 1.4 is a solution to the above martingale problem. Apart from the properties established in Prop. 1.24, for each CS R t > 0, almost surely Xt ∈ Urr.

CS R Remark 1.39. Whether almost surely, Xt ∈ Urr for all t > 0, remains open. Remark 1.40. Note that different from the usual martingale or prob- lems, as for example in [EK86], the test functions here are only defined on a (topologically not closed, only dynamically closed) subset of the state space. 20 ANDREAS GREVEN 1, RONGFENG SUN 2, ANITA WINTER 3

We conjecture that the martingale problems above are in fact well-posed. A proof could be attempted by using the duality between the CSSM genealogy process and the Brownian web. There are however subtle technical complications due to the fact that the generator of the martingale problem is highly singular. We leave this for a future paper.

1.7. Outline. We provide here an outline of the rest of the paper. In Section 2 we construct the CSSM genealogy process, and establish in Section 3 the convergence of the IFV genealogies to those of the CSSM, and in Section 4 results on the martingale problem for the CSSM genealogy processes. In Appendix A, we collect further facts and proofs concerning marked metric measure spaces. In Appendix B, we recall the construction of the Brownian web and its dual, and collect some basic properties of the Brownian web and coalescing Brownian motions. In Appendix C we prove some results on coalescing Brownian motions needed in our estimates to derive the martingale problem for the CSSM. In Appendix D we prove of the results on the IFV genealogy processes.

2. Proofs of the properties of CSSM Genealogy Processes In this section, we prove Proposition 1.24 and Theorem 1.26 by using properties of the double Brownian web (W, Wc), which was used to construct the CSSM genealogy process X CS in Section 1.4.

CS Proof of Prop. 1.24. (a): The existence of a mark function κ : Xt → R follows by construction. The continuity of κ follows from the property of the dual Brownian web CS CS Wc. More precisely, if xn → x in (Xt , rt ), then identifying xn and x with points on R, ˆ ˆ ˆ ˆ it follows that f(xn,t) → f(x,t) in Πb for some path f(xn,t) ∈ Wc(xn, t) and f(x,t) ∈ Wc(x, t), which implies that xn → x in R. CS (b): By (1.37), we identify Xt with R, where a countable subset Et is duplicated. The CS distance between x, y ∈ Xt is defined to be twice of the time to coalescence between the ˆ ˆ dual Brownian web paths f(x,t) ∈ Wc(x, t) and f(y,t) ∈ Wc(y, t), if the two paths coalesce l above time 0. Therefore for l ∈ (0, t), each ball Bi of radius l correspond to a maximal ˆ interval [vi−1, vi] ⊂ R, where all paths in {f(x,t) ∈ Wc(x, t): x ∈ (vi−1, vi)} coalesce into a single path by time t − l. The collection of such maximal intervals (vi−1, vi) form a l l partition of R\Et, where Et = {vi : i ∈ Z} is exactly the set defined in (1.39). l l (c): Fix an l ∈ (0, t). By construction, for each ball Bi of radius l, κ(Bi) = [vi−1, vi] is CS CS assigned the Lebesgue measure on R. Together with (b), it implies that µt (Xt × ·) is the Lebesgure measure on R.

Proof of Theorem 1.26. Let us fix a realization of the Brownian web W and its dual Wc, and let X CS be constructed from Wc as just before Proposition 1.24. By (1.37), for each CS + − t > 0, we can identify Xt with At ∪ Et ∪ Et , where At and Et are defined as in (1.33). To simplify notation, we will drop the superscript CS in the remainder of the proof.

(a): The Markov property of (Xt)t≥0 follows from the Markov property of W and Wc. t More precisely, if we denote by f|s the restriction of a path f ∈ Π to the time interval t t s ∞ [s, t], and K|s := {f|s : f ∈ K} for a set of paths K ⊂ Π, then W|0 is independent of W|s s s for each s ≥ 0. The same is true for Wc since W|0 and W|c 0 a.s. uniquely determine each t other by Theorem B.3. The Markov process (Xt)t≥0 is time homogeneous because W|0 is s+t equally distributed with W|s , apart from a time shift. CONTINUUM SPACE LIMIT OF IFV GENEALOGIES 21

R (b): Let X0 ∈ U1 . We first prove that (Xt)t≥0 is a.s. continuous in t > 0. To accomplish 1,2 this, since Π is convergence determining in UR as shown in Theorem 1.8, it suffices to show that for any Φ := Φn,φ,g ∈ Π1,2, the evaluated polynomial Z (2.1) Φ(Xt) = φ(rt)g(x)dx Rn is continuous in t, where x := (x1, . . . , xn), rt := (rt(xi, xj))1≤i 0, Wc(x, t) contains a single path for all but a countable ˆ number of x ∈ R. For such x, by Lemma B.2, the time of coalescence between f(x,s) and ˆ f(x,t) tends to t as s → t, and hence lims→t r((x, s), (x, t)) = 0, where r((x, s), (x, t)) is defined in (1.36) and extends the definition of rt(x, y) to individuals at different times. Since

(2.3) |rs(xi, xj) − rt(xi, xj)| ≤ r((xi, s), (xi, t)) + r((xj, s), (xj, t)), it follows that when t > 0, for Lebesgue a.e. xi, xj ∈ R, 1 ≤ i < j ≤ n, we have

(2.4) lim rs(xi, xj) = rt(xi, xj). s→t We can then apply the dominated convergence theorem in (2.1) to deduce that, for each t > 0, a.s.

(2.5) lim Φ(Xs) = Φ(Xt). s→t

This verifies that (Xt)t≥0 is a.s. continuous in t > 0. Proving the a.s. continuity of (Xt)t≥0 at t = 0 poses new difficulties because X0 can be R any state in U1 , while for any t > 0, Xt is a regular state as shown in Proposition 1.24. We get around this by showing that X admits a c`adl`agversion. More precisely, we invoke a part of the proof of the convergence Theorem 1.29 that is independent of the current R FV, Z proof. Note that for any X0 ∈ U1 , we can find a sequence X0 ∈ U1 , indexed by  > 0, FV, such that SX0 → X0. Indeed, we only need to approximate the mark space R by Z FV, in order to construct SX0 from X0. In the proof of Theorem 1.29, it is shown that FV, the corresponding sequence of interacting Fleming-Viot genealogy process (SX−2t )t≥0 is a tight family of D([0, ∞), UR)-valued random variables, where D([0, ∞), UR) denotes the space of c`adl`agpaths on UR equipped with the Skorohod topology. Furthermore, FV, (SX−2t )t≥0 converges in finite-dimensional distribution to the CSSM genealogy process (Xt)t≥0. Therefore, (Xt)t≥0 must admit a version which is a.s. c`adl`ag,with Xt → X0 as t ↓ 0. Since we have just shown that the version of (Xt)t≥0 constructed in Sec. 1.4 is a.s. continuous in t > 0, it follows that the same version must also be a.s. c`adl`ag,and hence continuous at t = 0, which concludes the proof of part (b).

(m) R (c): To prove the Feller property, let X0 → X0 in U1 , and let tm → t ≥ 0. To show (m) Xtm ⇒ Xt, by Theorem 1.8, it suffices to show (m) n,φ,g 1,2 (2.6) lim [Φ(X )] = [Φ(Xt)] ∀ Φ = Φ ∈ Π . m→∞ E tm E 22 ANDREAS GREVEN 1, RONGFENG SUN 2, ANITA WINTER 3

We claim that the convergence in

(2.7) lim [Φ(Xs)] = [Φ(Xt)] is uniform w.r.t. the initial condition X0. s→t E E In particular, as m → ∞, (m) (m) E[Φ(Xtm )] − E[Φ(Xt )] → 0. To prove (2.6), it then suffices to show that (m) (2.8) lim [Φ(X )] = [Φ(Xt)]. m→∞ E t E We prove (2.7) and (2.8) next. Proof of (2.7). By the Markov property of X , it suffices to show that as t ↓ 0,

(2.9) E[Φ(Xt)] − E[Φ(X0)] → 0 uniformly in X0. Note that we can write Z (2.10) E[Φ(X0)] = E[ g(x) φ({r0(ξ(xi), ξ(xj))}1≤i

(2.12) rt(xi, xj) = 2t + r0(ξ(xi(0)), ξ(xj(0))).

Let Fe(x, t) denote the event that (xi(s))s≤t, 1 ≤ i ≤ n, do not intersect during the time interval [0, t]. Then using (2.12), we can rewrite (2.11) as (2.13) Z Z E[Φ(Xt)] = g(x)E[ φ(rt)1F c(x,t)] dx + g(x)E[ φ({2t + r0(ξ(xi(0)), ξ(xj(0)))}1≤i

ˆ ˆ Proof of (2.8). For each xi ∈ R, 1 ≤ i ≤ n, let us denote f(xi,t) ∈ Wc(xi, t) by fi. Then Z (m) (m) (2.15) E[Φ(Xt )] = g(x)E[φ(rt )]dx, Rn (m) (m) ˆ where rt = {rt (xi, xj)}1≤i 0, the genealogical distances among individuals with spatial locations in [−N,N] are determined by coalescing Brownian motions in the dual Brownian web Wc. CS The initial condition X0 affects the genealogical distances only on the event that the backward coalescing Brownian motions starting from [−N,N] at time t do not coalesce into a single path by time 0. As t → ∞, the probability of this event tends to 0, and CS therefore (Xt+s)s≥0 converges in distribution to the CSSM genealogy process constructed from Wc as in Section 1.4, with initial condition being at time −∞.

3. Proof of convergence of Rescaled IFV Genealogies In this section, we prove Theorem 1.29, that under diffusive scaling of space and time as well as rescaling of measure, the genealogies of the interacting Fleming-Viot process converges to those of a CSSM. In Section 3.1 we prove f.d.d.-convergence, and in Section 3.2 tightness in path space. In Section 3.3, we prove Theorem 1.30 on the measure-valued process, which is needed to prove tightness in Section 3.2. 24 ANDREAS GREVEN 1, RONGFENG SUN 2, ANITA WINTER 3

FV, FV, FV, FV, As in Theorem 1.29, let (Xt )t≥0 = (Xt , rt , µt )t≥0 be the family of IFV FV, CS R genealogy processes on Z indexed by  > 0, such that SX0 → X0 ∈ U1 as  → 0, and CS CS CS CS let (Xt )t≥0 = (Xt , rt , µt )t≥0 be the CSSM genealogy process with initial condition CS X0 . 3.1. Convergence of Finite-dimensional Distributions. In this subsection we prove FV, CS the convergence (SX−2t )t≥0 ⇒ (Xt )t≥0 in finite-dimensional distribution, i.e., FV, FV, CS CS (3.1) (SX −2 ,...,SX −2 ) =⇒ (Xt ,..., Xt ) ∀ 0 ≤ t1 < t2 < ··· < tk,  t1  tk →0 1 k k where ⇒ denotes weak convergence of (UR) -valued random variables. By [EK86, Prop. 3.4.6] on convergence determining class for product spaces, it suffices to show that for any n ,φ ,g 1,2 Φi := Φ i i i ∈ Π , 1 ≤ i ≤ k, we have (recall (1.7)) k k h Y FV, i h Y CS i (3.2) Φi(SX −2 ) =⇒ Φi(Xt ) . E  ti →0 E i i=1 i=1 For notational convenience we assume first that the initial tree is the trivial one (all distances are zero) and we shall see at the end of the argument that this easily generalizes. We will first rewrite both sides of this convergence relation in terms of the dual coalescents and then apply the invariance principle for coalescing random walks. Step 1 (Claim rephrased in terms of coalescents). We can by the definition of the polynomial in (2.1) rewrite the left hand side of (3.2) as k h Y FV, i Φi(SX −2 ) E  ti i=1 (3.3) h k i Y −1 ni X −1 i  2 FV,  i  i  = (σ ) gi σ (x )1≤a≤n φi  (r −2 (ξ (x ), ξ (x )))1≤a

( 2(s − τˆ) ifτ ˆ ≥ 0, FV, (3.5) rs (x, y) := FV,   2s + r0 (ξ0(u), ξ0(v)) ifτ ˆ < 0, whereτ ˆ denotes the time of coalescence between the two coalescing walks starting at x, y ∈ Z at time s, while u, v are the positions of the two walks at time 0. CONTINUUM SPACE LIMIT OF IFV GENEALOGIES 25

+ − We observe next that the continuum population is represented by At ∪ Et ∪ Et which is a version of R marked on Et by +, −, the geographic marks are the reals and since the sampling measure is Lebesgue measure, we can write polynomials based on integration over R instead of X × V as: Z Z Z (3.6) µ⊗n(d(x, u))g(u)ϕ(r(x)) = λ⊗n(du)g(u)ϕ(r(u)).

X V R We can rewrite using the duality of Corollary 1.22, see also (1.36), the R.H.S. of (3.2) in the same form as in (3.3): k k Z h Y CS i h Y i  CS i i  ii (3.7) E Φi(Xti ) = E gi (ya)1≤a≤ni φi (rti (ya, yb))1≤a

k Y −1 ni X −1 i  (3.8) (σ ) gi σ (xa)1≤a≤ni δxi ··· δxi 1 ni i=1 xi ,...,xi ∈ 1 ni Z as a finite signed sampling measure (recall that g has bounded support), which is easily seen to converge weakly to the finite signed sampling measure appearing in (3.7), namely

k Y i  i i (3.9) gi (ya)1≤a≤ni dy1 ··· dyni . i=1 To prove (3.2), it then suffices to show that (having (3.8)-(3.9) in mind): If for each i, −1 i, i 1 ≤ a ≤ ni and 1 ≤ i ≤ k, xa ∈ Z and σ xa → ya as  → 0, then (3.10) k k h Y 2 FV,  i,  i, i h Y CS i i i φi ( r −2 (ξt (xa ), ξt (x )))1≤a

Step 2 (Invariance principle for coalescents). We next prove (3.10) by means of an invariance principle for coalescing random walks. This invariance principle reads as follows. Given a collection of backward coalescing i, −2 random walks starting at -dependent positions xa ∈ Z at time  ti, 1 ≤ a ≤ ni, −1 i, i 1 ≤ i ≤ k, such that σ xa → ya as  → 0, the collection of coalescing random i, −1 i, −2 −2 walks (xa (s))s≤ ti , rescaled diffusively as (σ xa ( s))s≤ti , converges in distribution i to the collection of coalescing Brownian motions (ya(s))s≤ti evolving backward in time. Furthermore, the times of coalescence between the coalescing random walks, scaled by 2, converge in distribution to the times of coalescence between the corresponding Brownian motions. The proof of such an invariance principle can be easily adapted from [NRS05, Section 5], which considered discrete time random walks with instantaneous coalescence. We will omit the details. Let δ > 0 be small. Note that the collection of rescaled coalescing random walks −1 i, −2 (σ xa ( s)) restricted to the time interval s ∈ [δ, tk], together with their times of 26 ANDREAS GREVEN 1, RONGFENG SUN 2, ANITA WINTER 3 coalescence, converge in joint distribution to the collection of coalescing Brownian mo- i tions (ya(s)) restricted to the time interval s ∈ [δ, tk], together with their times of co- alescence. Using Skorohod’s representation theorem (see e.g. [Bil89]), we can couple −1 i, −2 i (σ xa ( s))s∈[δ,tk] and (ya(s))s∈[δ,tk] such that the paths and their times of coales- cence converge almost surely. Let us assume such a coupling from now on. By the same argument as in the proof of Theorem 1.26 (c), we can rewrite the expecta- i, tions in (3.10) in terms of the backward coalescing random walks xa ∈ Z and coalescing i Brownian motions ya. Furthermore, we can condition on the coalescing random walks i, −2 −2 xa (s) on the time interval [δ ,  tk] and condition on the coalescing Brownian motions i ya(s) on the time interval [δ, tk], coupled as above.   Given the locations u1, . . . , ul ∈ Z of the remaining coalescing random walks at time δ−2, we now make an approximation and replace them by independent random walks on the remaining time interval [0, δ−2], and make a similar replacement for the coalescing Brownian motions. Note that the error we introduce to the two sides of (3.10) is bounded by a constant (determined only by |φi|∞, 1 ≤ i ≤ k) times the probability that there is a coalescence among the random walks (resp. Brownian motions) in the time interval [0, δ−2] (resp. [0, δ]), which tends to 0 as δ ↓ 0 uniformly in  by the properties of Brownian motion and the invariance principle. Therefore to prove (3.10), it suffices to prove its analogue where we make such an approximation for a fixed δ > 0, replacing coalescing random walks (resp. Brownian motions) on the time interval [0, δ−2] (resp. [0, δ]) by independent ones. Let us fix such a δ > 0 from now on. By conditioning on the coalescing random walks and the coalescing Brownian motions on the macroscopic time interval [δ, tk] and using the a.s. coupling between them, we note that the analogue of (3.10) discussed above follows readily if we show:

  −1  Lemma 3.1. If u1, . . . , ul ∈ Z satisfy σ ui → ui as  → 0, then for any bounded (l) continuous function ψ : R 2 → R, we have X   2 FV,    gδ(x1, . . . , xl)E ψ ( r0 (ξ0(xi), ξ0(xj)))1≤i

3.2. Tightness. In this subsection we prove the tightness of the family of rescaled IFV FV, genealogy processes, (SX )>0, regarded as C([0, ∞), UR)-valued random variables. FV, First we note that it is sufficient to prove the tightness of (SX )>0 as random variables taking values in the Skorohod space D([0, ∞), UR). Indeed, the tightness of FV, FV, CS (SX )>0 in the Skorohod space, together with the convergence of SX to X FV, CS in finite-dimensional distributions, imply that SX ⇒ X as D([0, ∞), UR)-valued CS random variables. In particular, (Xt )t≥0 admits a version which is a.s. c`adl`ag.Together CS with the fact that Xt is a.s. continuous in t > 0, which was established in the proof of CS Theorem 1.26 (b), it follows that Xt must be a.s. continuous in t ≥ 0. Note that this concludes the proof of Theorem 1.26 (b). FV, Using Skorohod’s representation theorem (see e.g. [Bil89]) to couple (SX )>0 and CS X such that the convergence in D([0, ∞), UR) is almost sure, and using the a.s. continuity CS FV, CS R of (Xt )t≥0, we can then easily conclude that SX → X a.s. in C([0, ∞), U ), which FV, implies the tightness of (SX )>0 as C([0, ∞), UR)-valued random variables. FV, By Jakubowski’s criterion (see e.g. [Daw93, Theorem 3.6.4]), to show that (SX )>0 is a tight family of random variables in the Skorohod space D([0, ∞), UR), it suffices to show that the following two conditions are satisfied: (J1) (Compact Containment) For each T > 0 and δ > 0, there exists a compact set KT,δ ⊂ UR such that for all  > 0, FV,  (3.13) P SX−2t ∈ KT,δ ∀ 0 ≤ t ≤ T ≥ 1 − δ; 1,2 FV, (J2) (Tightness of Evaluations) For each f ∈ Πe ,(f(SX−2t ))t≥0, indexed by  > 0, is a tight family of D([0, ∞), R)-valued random variables. 1,2 Note that Πe (recall from (1.8)) separates points in UR by Theorem 1.8, and is closed under addition. We first prove (J2), following an approach used in [AS11], where a family of rescaled measure-valued processes induced by the voter model on Z is shown to be tight and con- verge weakly to the measure-valued CSSM as  → 0. We will verify (J2) via Aldous’ tightness criterion (see e.g. [Daw93, Theorem 3.6.5]), reducing (J2) to the following con- ditions: FV, (A1) For each rational t ≥ 0, {f(SX−2t )}>0 is a tight family of R-valued random variables. (A2) Given T > 0 and any τ ≤ T , if δ ↓ 0 as  ↓ 0, then for each η > 0, FV, FV,  (3.14) lim P f(SX −2 ) − f(SX −2 ) > η = 0. ↓0  (τ+δ)  τ

1,2 Pk Proof of (J2) via (A1)–(A2). Note that each f ∈ Πe can be written as f = i=1 ciΦi 1,2 for finitely many Φi ∈ Π and ci ∈ R. It then follows by the triangle inequality that to verify (A1)–(A2), it suffices to consider f = Φ ∈ Π1,2. We abbreviate for Φ ∈ Π1,2,  FV, Φt := Φ(SX−2t ). n,φ,g 1,2  (A1). Note that for Φ = Φ ∈ Π , (Φt)>0 is uniformly bounded, because φ is FV, bounded, g is bounded with bounded support, and the projection of Sµ−2t on the mark space R converges to the Lebesgue measure as  ↓ 0, giving (A1). 28 ANDREAS GREVEN 1, RONGFENG SUN 2, ANITA WINTER 3

(A2). We will use the duality between interacting Fleming-Viot processes and coalesc-  FV, ing random walks (see Theorem 1.16 and Corollary 1.18). Recall that Φt = Φ(SX−2t ). First, we bound

  1   2 1    2 FV,  P(|Φτ +δ − Φτ | > η) ≤ E[(Φτ +δ − Φτ ) ] = E E[(Φτ +δ − Φτ ) | X −2 ]    η2    η2     τ 2   FV,  2   FV,  2 (3.15) ≤ E Var Φτ +δ | X −2 + E E[Φτ +δ | X −2 ] − Φτ , η2    τ η2    τ   FV,   where in the second inequality, we added and subtracted [Φ | X −2 ] from Φ −Φ E τ+δ  τ τ+δ τ and used (a + b)2 ≤ 2a2 + 2b2. We treat the two terms on the r.h.s. separately.  FV,  First term in (3.15). This term we bound by bounding Var Φ | X −2 uniformly τ+δ  τ FV, FV, in X −2 . First note that X is a strong Markov process by Theorem 1.11. Therefore  τ FV, FV, −2 X −2 can be seen as the IFV genealogy process X at time  δ with initial  (τ+δ) FV,  condition X −2 . In particular, it suffices to bound Var(Φ ) uniformly in the initial  τ δ FV, condition X0 , which we can assume to be deterministic. n,φ,g n n Let Φ = Φ , and denote x := (x1, . . . , xn) ∈ Z , y := (y1, . . . , yn) ∈ Z . Then by the definition of the scaling map S in (1.42), we have  n,φ,g FV,   2  2 Var(Φ ) = Var Φ (SX −2 ) = [(Φ ) ] − [Φ ] δ  δ E δ E δ −1 2n X −1 −1 2 FV, 2 FV,  (3.16) = (σ ) g(σ x)g(σ y) Cov φ( r −2 (x)), φ( r −2 (y)) ,  δ  δ x,y∈Zn FV, FV, where r −2 (x) denotes the distance matrix r −2 (ξ(xi), ξ(xj))1≤i

xi yi −2 −2 G δ (x, y) := {none of (X )1≤i≤n coalesces with any (X )1≤i≤n before time  δ},

yi xi FV, y such that (X )1≤i≤n is independent of (X )1≤i≤n. Let r (X −2 ) be the associated et t 0 e δ FV, x distance matrix, which is independent of r (X −2 ). By the duality relation (see The- 0  δ orem 1.16), we have (3.17) 2 FV, 2 FV,  2 FV, x 2 FV, y  Cov φ( r −2 (x)), φ( r −2 (y)) = Cov φ( r (X −2 )), φ( r (X −2 ))  δ  δ 0  δ 0  δ  2 FV, x 2 FV, y   2 FV, x   2 FV, y  = φ( r (X −2 ))φ( r (X −2 )) − φ( r (X −2 )) φ( r (X −2 )) E 0  δ 0  δ E 0  δ E 0  δ  2 FV, x  2 FV, y 2 FV, y  = φ( r (X −2 )) φ( r (X −2 )) − φ( r (X −2 )) E 0  δ 0  δ 0 e δ 2 c 2 X −2 −2 ≤ 2|φ|∞P(G δ (x, y) ) ≤ 2|φ|∞ P(τxi,yj ≤  δ), 1≤i,j≤n CONTINUUM SPACE LIMIT OF IFV GENEALOGIES 29

y xi j where τxi,yi denotes the time it takes for the two walks X· and X· to meet. Note that FV, this bound is uniform w.r.t. X0 . Substituting it into (3.16) then gives  2 X −1 2n X −1 −1 −2 Var(Φδ ) ≤ |φ|∞ (σ ) g(σ x)g(σ y)P(τxi,yj ≤  δ) 1≤i,j≤n x,y∈Zn 2 X X −1 2n −2 −1 −1 (3.18) = |φ|∞ (σ ) g(˜x)g(˜y)P(τ σx˜i, σyj ≤  δ), 1≤i,j≤n x˜,y˜∈σ−1Zn wherex ˜ := σ−1x andy ˜ := σ−1y. We claim that the r.h.s. of (3.18) tends to 0 as  ↓ 0. Indeed, the measure

−1 2n X X (σ ) δx˜i δy˜j g(˜x)g(˜y) 1≤i,j≤n x˜,y˜∈σ−1Zn 2n converges weakly to the finite measure g(˜x)g(˜y)d˜xd˜y on R as  ↓ 0. By Donsker’s invariance principle and the fact that δ → 0 as  ↓ 0, we note that for any λ > 0, −2 (τ−1σx˜ ,−1σy ≤  δ) −→ 0 P i j →0 uniformly inx ˜i andy ˜j with |x˜i − y˜j| > λ. It follows that when restricted tox ˜i andy ˜j with |x˜i − y˜j| > λ, the inner sum in (3.18) tends to 0 as  ↓ 0. On the other hand, when restricted tox ˜i andy ˜j with |x˜i − y˜j| ≤ λ, the inner sum in (3.18) can be bounded from above by replacing P(·) with 1, which then converges to the integral of the finite measure 2n g(˜x)g(˜y)d˜xd˜y over the subset of R with |x˜i − y˜j| ≤ λ, and can be made arbitrarily small by choosing λ > 0 small. This proves that Var(Φ ) tends to 0 uniformly in X FV, as  ↓ 0, and hence the first δ 0 term in (3.15) tends to 0 as  ↓ 0. Second term in (3.15). By the strong Markov property of X FV,, it suffices to bound [Φ ] − Φ uniformly in the (deterministic) initial condition X FV,. E δ 0 0 xi FV, x FV, xi Let (X )1≤i≤n, r (X −2 ), and r (x) be defined as before (3.17). Let (X )1≤i≤n t 0  δ t et xi xi be a collection of independent random walks, such that (Xet )1≤i≤n coincides with (Xt )1≤i≤n −2 up to time  δ on the event

xi −2 −2 G δ (x) := {no coalescence has taken place among (X )1≤i≤n before time  δ}. Then we have   E[Φδ ] − Φ0 −1 n X −1  2 FV,  X −1  2 FV,  = (σ ) g(σ x) φ( r −2 (x)) − g(σ y) φ( r (y)) E  δ E 0 x∈Zn y∈Zn −1 n X −1  2 FV, x  X −1  2 FV,  = (σ ) g(σ x) φ( r (X −2 )) − g(σ y) φ( r (y)) E 0  δ E 0 x∈Zn y∈Zn −1 n X −1  2 FV, x  X −1  2 FV,  ≤ (σ ) g(σ x) φ( r (X −2 )) − g(σ y) φ( r (y)) E 0 e δ E 0 x∈Zn y∈Zn hn o i −1 n X −1 2 FV, x 2 FV, x + (σ ) g(σ x)E φ( r0 (X −2 )) − φ( r0 (Xe −2 )) 1{Gc (x)} ,  δ  δ −2δ x∈Zn Note that the second term in the bound above tends to 0 as  ↓ 0 by the same argument as the one showing that the bound for Var(Φ ) in (3.18) tends to 0 as  ↓ 0. To bound δ the first term in the bound above, we decompose according to the positions of the random 30 ANDREAS GREVEN 1, RONGFENG SUN 2, ANITA WINTER 3 walks and rewrite it as follows, where pt(x) denotes the transition probability kernel of 0 Xet :

−1 n X −1  2 FV, x  X −1  2 FV,  (σ ) g(σ x) φ( r (X −2 )) − g(σ y) φ( r (y)) E 0 e δ E 0 x∈Zn y∈Zn n −1 n X X Y −1  2 FV,  X −1  2 FV,  = (σ ) p −2 (y − x )g(σ x) φ( r (y) − g(σ y) φ( r (y))  δ i i E 0 E 0 y∈Zn x∈Zn i=1 y∈Zn n X −1 n X Y −1 −1  ≤ |φ| (σ ) p −2 (y − x ) g(σ x) − g(σ y) ∞  δ i i y∈Zn x∈Zn i=1 n X −1 n X Y n −1 −1 = |φ| (σ ) p −2 (y − x ) σ hx − y, ∇g(σ y)i ∞  δ i i y∈Zn x∈Zn i=1 (σ−1)2 o + hx − y, ∇2g(σ−1u(x, y))(x − y)i 2 n n 2 −1 n+2 X Y  X 2 −2 −1 −1 ≤ C|φ|∞|∇ g|∞(σ ) p δ (yi − xi) 1{σ x∈supp(g)} + 1{σ y∈supp(g)} (yi − xi) x,y∈Zn i=1 i=1 2 −1 n X ≤ 2nC|φ|∞|∇ g|∞(σ ) 1{σ−1y∈supp(g)} δ, y∈Zn where C is a constant depending only on n. In the derivation above, we Taylor expanded g(σ−1x) around σ−1y when either σ−1x or σ−1y is not in the support of g, ∇g and ∇2g denote the first and second derivatives of g, and u(x, y) is some point on the line segment connecting x and y. Lastly, we used the fact that P zp (z) = 0 and P z2p (z) = tσ2. z∈Z t z∈Z t 0 Since g has bounded support, the bound we obtained above is bounded by C δ for some C0 depending only on n, φ and g, and hence tends to 0 as  ↓ 0. This verifies that the second term in (3.15) also tends to 0 as  ↓ 0, which concludes the proof of (A2).

FV, We have verified (J2) above and hence to conclude the proof of tightness of (SX )>0 as a family of random variables in the Skorohod space D([0, ∞), UR), it only remains to verify the compact containment condition (J1). Some technical difficulties arise. Because the geographical space is unbounded, truncation in space is needed. We also need to control how the sizes of different families fluctuate over time, as well as how the population flux across the truncation boundaries affect the family sizes. Our strategy is to enlarge the mark space by assigning types to different families. Using a weaker convergence result, Theorem 1.30, for measure-valued IFV processes with types (but no genealogies), we can control the evolution of family sizes as well as their dispersion in space, which can then be strengthened to control the genealogical structure of the population. As we will point out later, Theorem 1.30 can be proved by adapting what we have done so far, because condition (J1) is trivial in that context. Therefore invoking Theorem 1.30 to prove (J1) for the genealogy processes is justified.

R R N Proof of (J1). As noted in Remark 1.4, we can regard U as a subset of (Uf ) , endowed with the product R-marked Gromov-weak topology. Therefore to prove (J1), it suffices FV, to show that for each k ∈ N, the restriction of (SX−2t )t≥0,>0 to the subset of marks CONTINUUM SPACE LIMIT OF IFV GENEALOGIES 31

(−k, k) ⊂ R, i.e., (k) FV, FV, FV, FV, S X−2t := (X−2t ,Sr−2t , 1{|v| 0, satisfies the compact containment condition (3.13). More precisely, it suffices to show that for each T > 0 (which we will assume to be 1 for simplicity), and for each δ > 0, there R exists a compact Kδ ⊂ Uf , such that for all  > 0 sufficiently small, (k) FV,  (3.19) P S X−2t ∈ Kδ ∀ 0 ≤ t ≤ 1 ≥ 1 − δ, where k ∈ N will be fixed for the rest of the proof. 1 2 3 R We will construct Kδ ,Kδ ,Kδ ⊂ Uf , which satisfy respectively conditions (i)–(iii) in R Theorem A.1 for the relative compactness of subsets of Uf . We can then take Kδ := 1 2 3 R Kδ ∩ Kδ ∩ Kδ , which is a compact subset of Uf . To prove (3.19), it then suffices to prove i the same inequality but with Kδ replaced by Kδ for each 1 ≤ i ≤ 3, which we do in (1)–(3) below. 2 3 Later when we construct Kδ and Kδ , we will keep track of the mass of different indi- viduals having some specified properties. The way to do this is to introduce additional marks. We will enlarge the mark space from R to R × [0, 1], where [0, 1] is the space of the FV, CS additional types, and Xt and Xt then become random variables taking values in the ×[0,1] ×[0,1] space UR . For each (X, r, µ) ∈ UR , µ(dxdvdτ) is then a measure on X ×R×[0, 1]. FV, CS The types of individuals in X0 and X0 will be assigned later as we see fit. 1 R 1 (1) First let Kδ be the subset of Uf , such that for each (X, r, µ) ∈ Kδ , µ(X × ·) is supported on [−k, k], with total mass bounded by 4k. Since the family of measures on [−k, k], with total mass bounded by 4k, is relatively compact w.r.t. the weak topology, 1 (k) FV, 1 Kδ satisfies condition (i) in Theorem A.1. We further note that a.s., S X−2t ∈ Kδ for 1 all t ≥ 0, and hence (3.19) holds with Kδ replaced by Kδ . 2,n R (2) For each n ∈ N, we will find below L(n) such that if Kδ denotes the subset of Uf with ZZ 1 2,n (3.20) 1{r(x,y)>L(n)}µ(dxdu)µ(dydv) < for each (X, r, µ) ∈ Kδ , (X×R)2 n then uniformly in  > 0 sufficiently small, we have δ (3.21) S(k)X FV, ∈/ K2,n for some 0 ≤ t ≤ 1 ≤ =: δ . P  −2t δ 2n n 2 2,n We can then take Kδ := ∩n∈NKδ , which clearly satisfies condition (ii) in Theorem A.1, 2 while (3.21) implies that (3.19) holds with Kδ replaced by Kδ . In order to find L(n), we proceed in two steps. First we find an analogue of L(n) for the limiting CSSM genealogies, and then in a second step, we use the convergence of the measure-valued IFV to obtain L(n). Fix n ∈ N. To find L(n) such that (3.21) holds, we first prove an analogue of (3.21) for the continuum limit X CS by utilizing the types of the individuals. Given η ≥ 0 and γ ∈ [0, 1], let  R×[0,1] (3.22) Gη,γ := (X, r, µ) ∈ Uf : µ(X × [−k, k] × [0, γ]) ≤ η . CS We claim that we can find A sufficiently large, such that if all individuals in X0 with spatial mark outside [−A, A] are assigned type 0, and all other individuals are assigned type 1, then δ (3.23) X CS ∈ G ∀ 0 ≤ t ≤ 1 ≥ 1 − n . P t 0,0 4 32 ANDREAS GREVEN 1, RONGFENG SUN 2, ANITA WINTER 3

In other words, with probability at least 1 − δn/4, the following event occurs: for all CS 0 ≤ t ≤ 1, no individual in Xt with spatial mark in [−k, k] can trace its genealogy back to some individual at time 0 with spatial mark outside [−A, A]. This will allow us to restrict our attention to descendants of the population in [−A, A] at time 0. Indeed, by the construction of the CSSM (Section 1.4) using the Brownian web, the CS CS measure-valued process Xbt , which is the measure µt projected on the geographic and type space, is given by (3.24) X CS(dvdτ) = δ (dτ)1 + 1 dv + δ (dτ)1 dv, bt 0 {v>f(A,0)(t)} {v 0 be chosen such that (3.23) holds. Since X0 = (X0 , r0 , µ0 ) ∈ U , CS the population in X0 with spatial marks in [−A, A] can be partitioned into a countable 1 CS collection of disjoint balls of radius n . Let us label these balls by Bn,1,Bn,2,... ⊂ X0 in 1 decreasing order of their total measure, and all individuals in Bi is given type i ∈ [0, 1], while all individuals with spatial mark outside [−A, A] are given type 0. Note that by CS 1 choosing M large, the total measure of individuals in X0 with types in (0, M ] can be made arbitrarily small. It then follows by the Feller continuity of the measure-valued process XbCS stated in Remark 1.28, that as M → ∞, the total measure of individuals in CS 1 Xt , t ∈ [0, 1], with spatial marks in [−k, k] and types in (0, M ], converge in probability to the which is identically 0 on the time interval [0, 1]. Combined with (3.23), this implies that we can choose M large, such that

CS  δn (3.25) P Xt ∈/ G 1 , 1 for some 0 ≤ t ≤ 1 ≤ . 4kn M 2

Let Le(n) denote the maximal distance between the balls Bn,1,...,Bn,M . Note that on CS the event that Xt ∈ G 1 1 for all 0 ≤ t ≤ 1, for each t ∈ [0, 1], the set of individuals 4kn , M CS 1 1 in Xt with spatial marks in [−k, k] and types in [0, M ] have total measure at most 4kn ; while the rest of the population with spatial marks in [−k, k] trace their genealogies back at time 0 to Bn,1,...,Bn,M , and their mutual distance is bounded by Le(n) + 2. Letting CS CS CS 2,n L(n) = Le(n) + 3 in (3.20) then readily implies that (Xt , rt , 1{|v|≤k}µt (dxdv)) ∈ Kδ CS on the event Xt ∈ G 1 1 , and (3.25) implies that 4kn , M δ (3.26) (XCS, rCS, 1 µCS(dxdv)) ∈/ K2,n for some 0 ≤ t ≤ 1 ≤ n , P t t {|v|≤k} t δ 2 which is the analogue of (3.21) for X CS. Next we turn to X FV,ε and we will deduce (3.21) from (3.26) by exploiting Theorem 1.30 FV, CS on the convergence of the measure-valued processes SXb to Xb . Let A be the same CS 1 as above. Let individuals in X0 be assigned types in {0} ∪ { i : i ∈ N} as before (3.25), CS where Le(n) denotes the maximal distance between any pair of individuals in X0 with 1  FV, CS R types in [ M , 1]. The assumption that S X0 converges to X0 in U as  ↓ 0 allows FV, CS FV, a coupling between SX0 and X0 , such that for most of the individuals in SX0 with spatial marks in [−A, A], their genealogical distances and spatial marks are close CS FV, to their counterparts in X0 . We can then assign types to individuals in SX0 in such a way that: individuals with spatial marks outside [−A, A] are assigned type 0 while 1 those with spatial marks in [−A, A] are assigned types in { i : i ∈ N}. The distance CONTINUUM SPACE LIMIT OF IFV GENEALOGIES 33

1 between any pair of individuals with types in [ M , 1] is bounded by Le(n) + 1, and the FV, CS measure on geographic and type space, SXb0 , converges vaguely to Xb0 as  ↓ 0. By FV, CS Theorem 1.30, (SXb−2t )0≤t≤1 converges weakly to (Xbt )0≤t≤1 as random variables in C([0, 1], M(R × [0, 1])). Applying this weak convergence result to (3.25) then implies that for  > 0 sufficiently small, we have the following analogue of (3.25): FV,  (3.27) P SX −2 ∈/ G 1 1 for some 0 ≤ t ≤ 1 ≤ δn.  t 2kn , M FV, Note that on the event SX −2 ∈ G 1 1 for all 0 ≤ t ≤ 1, for each 0 ≤ t ≤ 1, the  t 2kn , M FV, 1 set of individuals in SX−2t with spatial marks in [−k, k] and types in [0, M ] have total 1 measure at most 2kn ; while the rest of the population with spatial marks in [−k, k] trace 1 their genealogies back to an individual at time 0 with type in [ M , 1], and hence their (k) FV, 2,n pairwise distance is bounded by L(n) = Le(n) + 3. It follows that S X−2t ∈ Kδ on the FV, event SX −2 ∈ G 1 1 . Together with (3.27), this implies (3.21).  t 2kn , M 3 2 (3) Our procedure for constructing Kδ is similar to that of Kδ . For each n ∈ N, we will 3,n R find M = M(n) such that if Kδ denotes the subset of Uf with the property that for each 3,n 1 M (X, r, µ) ∈ Kδ , we can find M balls of radius n in X, say B1,...,BM with B := ∪i=1Bi, 1 such that µ(X\B × R) < n , then uniformly in  > 0 sufficiently small, we have δ (3.28) S(k)X FV, ∈/ K3,n for some 0 ≤ t ≤ 1 ≤ =: δ . P  −2t δ 2n n 3 2,n We can then take Kδ := ∩n∈NKδ , which clearly satisfies condition (iii) in Theorem A.1, 3 while (3.28) implies that (3.19) holds with Kδ replaced by Kδ . i−1 i To find M(n) such that (3.28) holds, we partition the time interval [0, 1] into [ 2n , 2n ] for 1 ≤ i ≤ 2n. It then suffices to show that for each 1 ≤ i ≤ 2n, we can find Mi(n) such 3,n that if Kδ is defined using Mi, then uniformly in  > 0 small, i − 1 i δ δ (3.29) S(k)X FV, ∈/ K3,n for some ≤ t ≤  ≤ n = . P  −2t δ 2n 2n 2n 2n2n Again we first determine M(n) for CSSM and then use the convergence of measure- valued IFV to measure-valued CSSM. We now prove an analogue of (3.29) for X CS. Since CS R×[0,1] X i−1 ∈ U almost surely, we can condition on its realization and partition the pop- 2n CS 1 ulation in X i−1 with spatial marks in [−A, A] into disjoint balls of radius 3n , B1,B2,..., 2n as we did in the argument leading to (3.25). Repeating the same argument there and 1 assigning type j to individuals in Bj, we readily obtain the following analogue of (3.25): we can choose Mi large enough such that

CS i − 1 i  δn (3.30) P Xt ∈/ G 1 , 1 for some ≤ t ≤ ≤ . 2n Mi 2n 2n 4n CS i−1 i i−1 i Note that on the event Xt ∈ G 1 , 1 for all 2n ≤ t ≤ 2n , for each t ∈ [ 2n , 2n ], the set 2n Mi of individuals in X CS with spatial marks in [−k, k] and types in [0, 1 ] have total measure t Mi 1 CS at most 2n ; while the rest of the individuals in Xt with spatial marks in [−k, k] trace i−1 their genealogies back to an individual at time 2n in B1 ∪ B2 · · · ∪ BMi , and hence they 1 5 CS CS CS are contained in Mi balls of radius 3n +t ≤ 6n . Therefore (Xt , rt , 1{|v|≤k}µt (dxdv)) ∈ 3,n CS Kδ , and the analogue of (3.29) holds for X . 34 ANDREAS GREVEN 1, RONGFENG SUN 2, ANITA WINTER 3

To establish (3.29) uniformly in small  > 0, we again apply the convergence result in FV, Theorem 1.30. Note that by the f.d.d. convergence established in Section 3.1, SX −2 i−1  2n CS converges in distribution to X i−1 as  ↓ 0. Following the same argument as those leading 2n FV, to (3.27), we can then assign types to individuals in SX −2 i−1 such that the associated  2n FV, CS measure-valued process, (SXb −2 ) i−1 i converges weakly to (Xbt ) i−1 i , and in-  t 2n ≤t≤ 2n 2n ≤t≤ 2n FV, 1 dividuals in S X i−1 with spatial marks in [−A, A] and types in [ , 1] are contained in  −2 Mi  2n 1 Mi balls with radius at most 2n . Applying this convergence result to (3.30) then implies that for  > 0 sufficiently small, we have the following analogue of (3.30):

FV, i − 1 i  δn (3.31) P SXt ∈/ G 1 , 1 for some ≤ t ≤ ≤ . n Mi 2n 2n 2n This is then easily seen to imply (3.29). Combining parts (1)-(3) concludes the proof of (3.19) and hence establishes the compact containment condition (J1).

3.3. Proof of convergence of rescaled IFV processes (Theorem 1.30). In [AS11, Theorem 1.1], a convergence result similar to Theorem 1.30 was proved for the voter model on Z, where the type space consists of only {0, 1}, and a special initial condition was considered, where the population to the left of the origin all have type 1, and the rest of the population have type 0. The proof consists of two parts: proof of tightness, and convergence of finite-dimensional distribution. In [AS11], the proof of tightness for the voter model does not depend on the initial condition, and is based on the verification of Jakubowski’s criterion and Aldous’ criterion as we have done in Sections 3.1 and 3.2 for the genealogical process. Because the IFV process ignores the genealogical distances, the verification of the compact containment condition (J1) in Jakubowski’s criterion is trivial, as in the case for the voter model. Using the duality between the IFV process and coalescing random walks with delayed coalescence, Aldous’ criterion on the tightness of evaluations can be verified by exactly the same calculations as that for the voter model in [AS11], which uses the duality between the voter model and coalescing random walks with instantaneous coalescence. Recall here that in the rescaling the difference between instantaneous and delayed coalescence disappears because of recurrence of the difference walk. Lastly, the convergence of the finite-dimensional distribution for rescaled IFV process follows the same calculations as in Section 3.1, where we can simply enlarge the mark space to R × [0, 1] and suppress the genealogical distances by choosing φi ≡ 1.

4. Martingale Problem for CSSM Genealogy Processes In this section, we show that the CSSM genealogy process constructed in Section 1.4 solves the martingale problem formulated in Theorem 1.38. We will first identify the generator action on regular test functions evaluated at regular states, and then extend it to more general test functions and verify the martingale property. Complications arise mainly from the singular nature of the resampling component of the generator, which are only well-defined a priori on regular test functions evaluated at regular states. Fortunately by Proposition 1.24, the CSSM genealogy process enters these regular states as soon as t > 0, even though the initial state may not be regular. CONTINUUM SPACE LIMIT OF IFV GENEALOGIES 35

4.1. Generator Action on Regular Test Functions. In this section, we identify the CS 1,2 generator of the CSSM genealogy process X , acting on Φ ∈ Πr and evaluated at CS R 1,2 R Xt ∈ Ur , where Πr and Ur are introduced in Section 1.6. The advantage of working CS with such regular Φ and Xt is that, the relevant resamplings only occur at the boundary points of balls of radius at least δ, which is a locally finite set. In Section 4.2, we will extend it to the case Φ ∈ Π1,2, where we need to consider the boundaries of all balls in CS Xt , which is a locally infinite set. Proposition 4.1. [Generator action on regular test functions] CS CS CS R n,φ,g Let X := (Xt )t≥0 be the CSSM genealogy process with X0 ∈ Ur . Let Φ = Φ ∈ 1,2 CS CS CS CS Πr , defined as in Definition 1.34. Let L = Ld + La + Lr be defined as in (1.46)– (1.49). Then we have CS CS E[Φ(Xt )] − Φ(X0 ) CS CS (4.1) lim = L Φ(X0 ). t↓0 t Furthermore, Z t CS CS CS CS (4.2) E[Φ(Xt )] = Φ(X0 ) + E[L Φ(Xs )]ds, 0 CS CS where E[L Φ(Xt )] is continuous in t ≥ 0. The proof is fairly long and technical and will be broken into parts, with (4.1) and (4.2) proved respectively in Sections 4.1.1 and 4.1.2.

CS CS CS CS 4.1.1. Proof of (4.1) in Proposition 4.1. For each t ≥ 0, denote Xt = (Xt , rt , µt ). Let L > 0 be chosen such that the support of g(x) is contained in (−L, L)n. Let δ > 0 be determined by φ as in (1.45), so that φ((ri,j)1≤i0Et, l where Et is the set of points in R that lie at the boundary of two disjoint balls of radius CS in l in Xt , which is consistent with the definition in (1.39). Denote δ 4 (4.5) {y1 < y2 < ··· < ym} := E0 ∩ [−2L, 2L], δ 4 and let y0, resp. ym+1, be the point in E0 adjacent to y1, resp. ym. Note that the intervals + − + − δ CS [yi , yi+1] := (yi, yi+1) ∪ {yi , yi+1} form disjoint open balls of radius 4 in X0 . Therefore + − + − ij CS for all x1 ∈ [yi , yi+1] and x2 ∈ [yj , yj+1] with i 6= j, d0 := r0 (x1, x2) is constant, and ij ii d0 := (d0 )0≤i≤j≤m forms a distance matrix, where we set d0 := 0 for 0 ≤ i ≤ m. We can then write CS n,φ,g CS X k×k (4.6) Φ(X0 ) = Φ (X0 ) = φ(d0 ) G(y, k), k∈{0,··· ,m}n 36 ANDREAS GREVEN 1, RONGFENG SUN 2, ANITA WINTER 3

k×k k×k kikj where y = (y0, ··· , ym+1), d0 is the distance matrix with d0,ij = d0 , 1 ≤ i < j ≤ n, and Z Z (4.7) G(y, k) = ··· g(x) dx.

y

CS Step 2. We next write Φ(Xt ), for t > 0, in terms of coalescing Brownian motions running forward in time (in contrast to the spatial genealogies which run backward in CS time), which determine the evolution of boundaries between disjoint balls in Xt . Let {yi+Bi}0≤i≤m+1 be independent Brownian motions starting from {yi}0≤i≤m+1, from which we construct a family of coalescing Brownian motions {yi + Bei}0≤i≤m+1. Namely, let y0 + Be0 := y0 + B0 for all time, and let y1 + Be1(s) := y1 + B1(s) until the first time y1 +B1(s) hits y0 +Be0. From that time onward, define y1 +Be1 to coincide with y0 +Be0. In the same way, we successively define {yi + Bei}2≤i≤m+1 from {yi + Bi}2≤i≤m+1 by adding one path at a time. Without loss of generality, we may assume that yi + Bei is the a.s. unique path in the Brownian web W starting from (yi, 0). CS To write Φ(Xt ) in terms of the forward coalescing Brownian motions Be, we observe that m+1 δ [ nδ o (4.8) E ∩ [y0 + Be0(s), ym+1 + Bem+1(s)] ⊂ {yi + Bei(s)} ∀ 0 ≤ s < min , t , s 4 i=0 CS since by our construction of Xs in Section 1.4, (yi + Bei(s))0≤i≤m+1 are boundaries of CS + − disjoint balls in Xs , and [(yi + Bei(s)) , (yi+1 + Bei+1(s)) ], 0 ≤ i ≤ m, consists of either δ CS empty sets if yi + Bei(s) = yi+1 + Bei+1(s), or disjoint open balls of radius 4 + s in Xs . Let {meet}t denote the event that either y0 + B0(s) reaches level −L before time t, or ym+1 +Bm+1(s) reaches level L before time t, or one of the pair (yi +Bi(s), yi+1 +Bi+1(s)) c meet before time t. On the complementary event {meet}t , Bei = Bi for all i. Therefore, δ for all 0 ≤ t < 4 , we can write CS CS CS (4.9) Φ(X ) = 1 Φ(X ) + 1 c Φ(X ) t {meet}t t {meet}t e t with CS X k×k (4.10) Φ(e Xt ) = φ(dt ) G(y + B(t), k), k∈{0,··· ,m}n

k×k k×k kikj where dt is the distance matrix with dt,ij = d0 + 2t(1 − δki,kj ) for 1 ≤ i ≤ j ≤ n, and B(t) = (B0(t), ··· ,Bm+1(t)). Since CS CS CS CS (4.11) |Φ(Xt ) − Φ(e Xt )| = 1{meet}t |Φ(Xt ) − Φ(e Xt )| ≤ CΦ1{meet}t for some CΦ depending on Φ, and the probability of the event {meet}t decays exponentially CS CS −1 E[Φ(Xt )]−E[Φ(e Xt )] in t by elementary estimates for Brownian motions, we have limt→0 t = CS CS 0. Thus, we may replace Φ(Xt ) by Φ(e Xt ) up to error o(t) as t → 0. By (4.6) and (4.10), we can write CS CS Φ(e Xt ) − Φ(X0 )(4.12) X k×k k×k  X k×k  = φ(dt ) − φ(d0 ) G(y + B(t), k) + φ(d0 ) G(y + B(t), k) − G(y, k) . k∈{0,··· ,m}n k∈{0,··· ,m}n CONTINUUM SPACE LIMIT OF IFV GENEALOGIES 37

CS Step 3. We first identify the aging term La in the generator, defined in (1.48). For each k×k term in the first sum in (4.12), by Taylor expanding φ(dt ) in t, it is easy to see that

−1 k×k k×k  X ∂φ k×k (4.13) lim t E[ φ(dt ) − φ(d0 ) G(y + B(t), k)] = 2 (d0 ) G(y, k), t→0 ∂rij 1≤i

1 X h k×k k×k  i lim E φ(dt ) − φ(d ) G(y + B(t), k) t→0 t 0 k∈{0,··· ,m}n (4.14) Z X ∂φ = 2 g(x) (r) dx, n ∂rij R 1≤i

CS CS where r := (r0 (xi, xj))1≤i,j≤n. This gives the aging term La .

CS Step 4. We next identify the resampling term Lr , defined in (1.49). For the second sum in (4.12), we need to compute

1 X k×k lim φ(d )E[G(y + B(t), k) − G(y, k)] t→0 t 0 k∈{0,··· ,m}n (4.15) X k×k 1 = φ(d ) lim E[G(y + B(t), k) − G(y, k)]. 0 t→0 t k∈{0,··· ,m}n

For each k ∈ {0, ··· , m}n, (4.16) n n Z  Y Y  G(y + B(t), k) − G(y, k) = g(x) 1[yk +Bk (t),yk +1+Bk +1(t)](xi) − 1[yk ,yk +1](xi) dx. n i i i i i i R i=1 i=1

We can rewrite the difference of the product of indicators as (4.17) n n Y  Y 1 (xi) − 1 (xi) + 1 (xi) − 1 (xi). [yki ,yki+1] [yki ,yki +Bki (t)] [yki+1,yki+1+Bki+1(t)] [yki ,yki+1] i=1 i=1

We can expand the first product above and sort the result into three groups of terms, (G1), (G2) and (G3), depending on whether each term contains one, two, or more factors of the form 1[y ,y +B (t)] for some 0 ≤ i ≤ m + 1. If h(x) denotes a term in (G3), then R i i i necessarily n g(x)h(x) ≤ Cg|Bi1 (t)Bi2 (t)Bi3 (t)| for some Cg depending only on g and R 3 2 some i1, i2, i3 ∈ {0, ··· , m + 1}. Since E|Bi1 (t)Bi2 (t)Bi3 (t)| ≤ ct , terms in (G3) do not contribute to the limit in (4.15). 38 ANDREAS GREVEN 1, RONGFENG SUN 2, ANITA WINTER 3

Each term in (G2) is of the following form, where 1 ≤ i 6= j ≤ n, Y 1 (xi)1 (xj) 1 (xτ ), [yki ,yki +Bki (t)] [ykj ,ykj +Bkj (t)] [ykτ ,ykτ +1] 1≤τ≤n τ6=i,j Y 1 (xi)1 (xj) 1 (xτ ), [yki+1,yki+1+Bki+1(t)] [ykj +1,ykj +1+Bkj +1(t)] [ykτ ,ykτ +1] 1≤τ≤n τ6=i,j (4.18) Y −1 (xi)1 (xj) 1 (xτ ), [yki+1,yki+1+Bki+1(t)] [ykj ,ykj +Bkj (t)] [ykτ ,ykτ +1] 1≤τ≤n τ6=i,j Y −1 (xi)1 (xj) 1 (xτ ). [yki ,yki +Bki (t)] [ykj +1,ykj +1+Bkj +1(t)] [ykτ ,ykτ +1] 1≤τ≤n τ6=i,j

(1) (2) (3) (4) Denote the four terms in (4.18) respectively by hij (x), hij (x), hij (x) and hij (x). Then (4.19)

yk +Bk (t) yk +Bk (t) Z Z Z i Z i j Z j (1) hij (x)g(x)dx = ··· g(x1, ··· , yki , ··· , ykj , ··· , xn) + o(1))dx, Rn y

(2) (3) (4) We obtain similar results for hij , hij and hij . (·) For a fixed pair i 6= j, when we sum over k and all contributions from hij in (4.15), we n obtain an integral for φ(r)g(x), where xτ , τ 6= i, j, are still integrated over R , however the integration for xi and xj are replaced by summation over {yσ}1≤σ≤m. Contributions only come from xi = xj, and is positive when ki = kj, and negative when ki = kj + 1 or CS kj = ki + 1. Writing everything in terms of X0 , we easily verify that the contribution of terms in (G2) to the limit in (4.15) is exactly X Z X Y (4.21) 1{xi=xj }g(x)(θijφ − φ)(r) dxτ , n−2 1≤i6=j≤n R x ,x ∈{y+,y−:1≤σ≤m} 1≤τ≤n i j σ σ τ6=i,j

CS where θijφ is defined as in (1.16), and r := (r0 (xi, xj))1≤i,j≤n. This gives the resampling CS term Lr defined in (1.49). CS Step 5. Lastly we identify the diffusion (migration) term Ld , defined in (1.47). We note that each term in group (G1) is of the following form, where 1 ≤ j ≤ n, Y −1 (xj) 1 (xi), [ykj ,ykj +Bkj (t)] [yki ,yki+1] 1≤i≤n i6=j (4.22) Y 1 (xj) 1 (xi), [ykj +1,ykj +1+Bkj +1(t)] [yki ,yki+1] 1≤i≤n i6=j CONTINUUM SPACE LIMIT OF IFV GENEALOGIES 39

(1) (2) Denote the two terms respectively by hj (x) and hj (x). Then Z (1) hj (x)g(x)dx Rn yk +Bk (t) Z Z j Z j  ∂g (4.23)= − ··· g(x x =y ) + (xj − ykj ) (x x =y ) + o(|xj − ykj |) dx. j kj ∂xj j kj y

Therefore, the contribution from terms in (G1) to the limit in (4.15) gives precisely the CS migration term Ld defined in (1.47). This establishes the generator formula (4.1).

4.1.2. Proof of (4.2) in Proposition 4.1. The complications in proving (4.2) arise from trying to prove uniform integrability for various quantities. We proceed in three steps. First we show that, for each t > 0, CS CS E[Φ(Xt+h)] − E[Φ(Xt )] CS (4.26) lim = E[LΦ(Xt )]. h↓0 h CS By the Markov property of (Xt )t≥0, CS CS " CS CS CS # E[Φ(Xt+h)] − E[Φ(Xt )] E[Φ(Xt+h)|Xt ] − Φ(Xt ) (4.27) lim = lim E . h↓0 h h↓0 h

[Φ(X CS )|X CS]−Φ(X CS) CS R E t+h t t CS Since Xt ∈ Ur a.s. by Proposition 1.24, limh↓0 h = LΦ(Xt ) almost surely. Therefore, the first step is to interchange limit and expectation in (4.27) and to deduce (4.26). We need to show that (4.28) CS CS CS E[Φ(X )|Xt ] − Φ(Xt ) δ ∧ t t+h is uniformly integrable for h > 0 small, say 0 < h < . h h>0 2 CS Once the uniform integrability has been verified, in Step 2 prove that E[LΦ(Xt )] is continuous in t, and then in Step 3 put things together and prove (4.2). Step 1 (Uniform integrability). This step constitutes the bulk of the proof of (4.2). CS Let us fix Xt and examine the error terms in our earlier calculation of the generator CS CS CS formula (4.1) with X0 replaced by Xt . Instead of partitioning Xt into disjoint open balls of radius δ/4 as done in (4.5), we set δ∧t 4 (4.29) {y1 < ··· < ymt } := Et ∩ [−2L, 2L], which is determined by the dual Brownian web Wc as seen from the definition in (1.39). CS This choice of partitioning of Xt removes the dependence of y = (y1, . . . , ymt ) on the CS CS initial condition X0 . Note that mt depends on Xt . 40 ANDREAS GREVEN 1, RONGFENG SUN 2, ANITA WINTER 3

CS Let Et[·] denote the conditional expectation E[·|Xt ]. Following the arguments leading to (4.11), −1 CS CS h Et[Φ(Xt+h)] − Φ(Xt )

−1 CS CS −1 CS CS = h Et[Φ(Xt+h) − Φ(e Xt+h)] + h Et[Φ(e Xt+h) − Φ(Xt )] −1  CS CS  −1 CS CS (4.30) ≤ h Et 1{meet}h |Φ(Xt+h) − Φ(e Xt+h)| + h Et[Φ(e Xt+h) − Φ(Xt )] , where {meet}h is the event that either y0 + B0(s) hits level −L before time h, or ymt+1 +

Bmt+1(s) hits level L before time h, or one of the pair (yi +Bi(s), yi+1 +Bi+1(s)) coalesces before time h. We now estimate the two terms in (4.30). (i) We start with the second term in (4.30). Based on the decomposition (4.12), CS CS Φ(e Xt+h) − Φ(Xt ) = H(h, B(h)) for some function H(h, z0, ··· , zmt+1) which is con- tinuously differentiable in h and three times continuously differentiable in z0, . . . , zmt+1 with uniformly bounded derivatives. Since the generator formula (4.1) is derived by Taylor δ∧t expanding H(h, B(h)), it is not hard to see that uniformly in h ∈ (0, 4 ), (4.31)

CS CS CS CS CS CS Et[Φ(e Xt+h)−Φ(Xt )]−hL Φ(Xt ) = Et[H(h, B(h))]−hL Φ(Xt ) ≤ Cg,φ(mt +3)h for some Cg,φ depending on g and φ, and mt+3 is the number of variables in H(h, z0, . . . , zmt+1). Therefore −1 CS CS CS CS (4.32) h |Et[Φ(e Xt+h) − Φ(Xt )]| ≤ Cg,φ(mt + 3) + |L Φ(Xt )| δ∧t CS uniformly for h ∈ (0, 4 ). From the definition of L Φ, we note that CS CS (4.33) |L Φ(Xt )| ≤ Cg,φ(1 + mt) < ∞. By the definition of m in (4.29) and by Lemma B.5, we have [m ] ≤ 4L < ∞. This t E t q δ∧t π 4 δ∧t implies the uniform integrability of the second term in (4.30) for h ∈ (0, 4 ).

(ii) We now consider the first term in (4.30). For 0 ≤ i ≤ mt, let τi,i+1 be the first hitting time between yi + Bi(s) and yi+1 + Bi+1(s). Let τ0 be the first hitting time of level

−L by y0 + B0(s), and τmt+1 the first hitting time of level L by ymt+1 + Bmt+1(s). Then (4.34) −1  CS CS  h Et 1{meet}h |Φ(Xt+h)−Φ(e Xt+h)| mt −1   −1 hX CS CS i ≤ 2|Φ|∞h Et 1{τ0≤h} +1{τm+1≤h} + h Et 1{τi,i+1≤h} Φ(Xt+h)−Φ(e Xt+h) . i=0

Since y0 < −2L and ymt+1 > 2L, the probability of the events {τ0 ≤ h} and {τmt+1 ≤ h} −1 CS decay exponentially fast in h and the events are independent of Xt . Therefore the first term in (4.34) is uniformly bounded in h > 0. Bounding the second term in (4.34) is more delicate, because it remains of order 1 as h ↓ 0. We will need to use negative correlation inequalities for coalescing Brownian motions established in Appendix C. CS Let us recall the definition of Φ(e Xt+h) from (4.10), where we replace the pair {0, t} by CS {t, t + h}. By integrating over the population at time t + h, we note that both Φ(Xt+h) CS and Φ(e Xt+h) can be written as integrals of g(x)φ(r(x)) integrated over x = (x1, ··· , xn) ∈ (−L, L)n, except that: for a given x, the distance matrix r(x) may be different for Φ and Φ,e and for Φ,e the same point x may be integrated over several times with different distance CONTINUUM SPACE LIMIT OF IFV GENEALOGIES 41

matrix r(x) due to the fact that {yi + Bi(h)}0≤i≤mt+1 may not have the same order as {yi}0≤i≤m+1. However, for x not in

n n [ o (4.35) D := x ∈ R : xi ∈ {yj + Bj(s)} for some 1 ≤ i ≤ n and 0 ≤ j ≤ mt + 1 , s∈[0,h] x is integrated over exactly once in Φ,e and the associated distance matrix r(x) is the same CS CS for both Φ and Φ.e Therefore, contributions from x ∈/ D cancel out in |Φ(Xt+h)−Φ(e Xt+h)|. CS CS Since g has compact support, the contribution from x ∈ D to |Φ(Xt+h)−Φ(e Xt+h)|, includ- Pmt+1 ing multiple integrations over the same x by Φ,e is at most Cφ,gn|g|∞|φ|∞ i=0 Ri(h), where Ri(h) := (sup0≤s≤h Bi(s) − inf0≤s≤h Bi(s)). Therefore,

mt mt mt+1 −1 hX CS CS i −1 h X X i h Et 1{τi,i+1≤h} Φ(Xt+h)−Φ(e Xt+h) ≤ Cn,φ,gh EB 1{τi,i+1≤h} Rj(h) i=0 i=0 j=0

mt mt+1 −1 X X 1 2 1 ≤ Cn,φ,gh PB(τi,i+1 ≤ h) 2 EB[Rj(h) ] 2 i=0 j=0

mt 0 − 1 X 1 (4.36) ≤ Cn,φ,gh 2 (mt + 2) PB(τi,i+1 ≤ h) 2 , i=0 where denotes expectation w.r.t. the Brownian motions B = (B ,...,B ), and we EB √ 0 mt+1 2 1 applied H¨olderinequality and the fact that E[Rj(h) ] 2 = c h for some c > 0. It only CS remains to prove the uniform integrability of the r.h.s. of (4.36) w.r.t. the law of Xt for δ∧t 0 < h < 4 . Note that the r.h.s. of (4.36) depends on y0 and ymt+1, which lie outside [−2L, 2L]. To control the dependence on y0 and ymt+1, we enlarge the interval and let {z1, ··· , zM+1} := δ∧t 4 Et ∩ (−2L − 1, 2L + 1) as in (4.29), which contains {y1, ··· , ymt } as a subset. Denote

Z ∞ 2 1 yi+1 − yi  1  − x  2 ψ √ := (τ ≤ h) 2 = 2 e 2 dx , P i,i+1 y −y h i+1√ i 2h we can then replace the r.h.s. of (4.36) by

M CS M X zi+1 − zi  (4.37) Fh(Xt ) := √ ψ √ , h i=1 h 0 CS because Cn,φ,gFh(Xt ) dominates the r.h.s. of (4.36) except for possible missing terms y −y √1 ψ y1√−y0 (m + 2), resp. √1 ψ mt+1√ mt (m + 2), on the event y ≤ −2L − 1, resp. h h t h h t 0 ymt+1 ≥ 2L + 1. Since y1 ≥ −2L and ymt ≤ 2L by definition, both missing terms are bounded by C1mt + C2 uniformly for h > 0, and is thus uniformly integrable. CS δ∧t It only remains to prove the uniform integrability of Fh(Xt ) uniformly in 0 < h < 4 . We achieve this by bounding its second moment. Note that by Cauchy-Schwarz,

" M # 1  X zi +1 − zi 2 [F (X CS)2] = M 2 ψ 3 √ 3  E h t h E i=1 h " M #  M  1 3 X zi3+1 − zi3 2 1 X zi4+1 − zi4 2 (4.38) ≤ E M ψ √ = E  ψ √  . h h h h i=1 i1,i2,i3,i4=1 42 ANDREAS GREVEN 1, RONGFENG SUN 2, ANITA WINTER 3

We can rewrite the r.h.s. of (4.38) in terms of the correlation functions of the translation δ∧t 4 invariant simple ξ := Et ⊂ R. More precisely, it can be written as Z Z 1 x5 − x4 2 c (4.39) ··· ψ √ K (x1, x2, x3, x4, x5)dx1dx2dx3dx4dx5, h h 4,5 x4

−1 2 K(0) = lim  P [x, x + ] ∩ ξ 6= ∅) = √ for all x ∈ R, ↓0 πδ ∧ t and c −2  K (x, y) = lim  P [x, x + ] ∩ ξ 6= ∅, (x + , y) ∩ ξ = ∅, [y, y + ] ∩ ξ 6= ∅ for x < y. ↓0 By Lemma C.6, for x < y, c K (x, y) ≤ Cδ,t(y − x) ∧ 1 for some Cδ,t > 0 depending only on δ and t. Substituting the above bounds into (4.39), using the definition of ψ√, and separating√ the integration into two regions depending on whether 0 < x5 − x4 < h or x5 − x4 ≥ h, it is easily seen that the integral in (4.39) δ∧t CS is uniformly bounded for 0 < h < 4 . This implies the uniform integrability of Fh(Xt ) CS CS δ∧t Et[Φ(Xt+h)]−Φ(Xt ) for 0 < h < 4 , and hence that of h .

CS CS CS CS Step 2 (Continuity of E[L Φ(Xt )]). Recall from (1.46)–(1.49) that L = Ld + CS CS La + Lr . We will prove the continuity for each component of the generator. CS CS By our assumptions on g and φ, |Ld Φ|∞, |La Φ|∞ ≤ CΦ < ∞. It was shown in (2.4) n CS that for any t > 0, for Lebesgue a.e. x = (x1, ··· , xn) ∈ R , the distance matrix rs (xi, xj) CS converges a.s. to rt (xi, xj) as s → t, and this conclusion is also easily seen to hold for CS R CS t = 0 by the assumption X0 ∈ Ur . Therefore, almost surely w.r.t. (Xs )s≥0, CS CS CS CS CS CS CS CS (4.41) lim L Φ(Xs ) = L Φ(Xt ) and lim La Φ(Xs ) = La Φ(Xt ). s→t d d s→t CS CS CS CS Therefore E[Ld Φ(Xt )] and E[La Φ(Xt )] are continuous in t ≥ 0 by the bounded convergence theorem. CS CS We now turn to the continuity of E[Lr Φ(Xt )]. We first prove the continuity at t = 0 δ 4 CS and later point out how it extends to t > 0. Let E0 ⊂ R be defined from X0 as in (1.39), δ CS which are the boundaries of balls of radius 4 in X0 . Let us follow paths in the Brownian δ 4 web W starting from points in E0 ⊂ R at time 0, which determine the evolution of these CONTINUUM SPACE LIMIT OF IFV GENEALOGIES 43

δ 4 boundaries, and denote the point set on R generated at time s > 0 by E0 (s). Then δ δ 4 +s 4 Es ⊂ E0 (s). By our regularity assumption on φ, only resampling at the boundaries of CS balls of radius δ or more has an effect on Φ(Xs ). Therefore for s ∈ (0, δ/4), we have (4.42) Z CS CS X X Y Lr Φ(Xs ) = 1{xi=xj }g(x)(θijφ − φ)(r) dxτ , n−2 1≤i6=j≤n R + − δ/4 1≤τ≤n xi,xj ∈{y ,y :y∈E0 (s)} τ6=i,j CS where θijφ is defined as in (1.16), and r := (rs (xi, xj))1≤i,j≤n. By our assumptions on g CS R CS and φ, the fact X0 ∈ Ur , and our construction of Xs in terms of the (dual) Brownian web, it is then easily seen by dominated convergence that CS CS (4.43) lim LrΦ(Xs ) = LrΦ(X0 ) almost surely. s↓0 CS CS To prove the continuity of E[L Φ(Xt )] at t = 0, it only remains to verify the uniform CS CS integrability of Lr Φ(Xs ) for s close to 0, say s ∈ [0, δ/4]. We will achieve this by CS CS showing that Lr Φ(Xs ) has uniformly bounded second moments. Note that because g is assumed to be supported on [−L, L]n, we have CS CS δ/4 (4.44) |Lr Φ(Xs )| ≤ Cn,φ,g E0 (s) ∩ (−L, L) . δ/4 By Lemma C.2, E0 (s) is negatively correlated, and hence by Lemma C.5, we have h δ/4 2i h δ/4 i h δ/4 i2 (4.45) E E0 (s) ∩ (−L, L) ≤ 2E E0 (s) ∩ (−L, L) + E E0 (s) ∩ (−L, L) .  δ/4  Thus it suffices to bound E E0 (s) ∩ (−L, L) uniformly in s ∈ [0, δ/4]. δ/4 δ/4 Since E0 (s) is obtained by evolving coalescing Brownian motions starting from E0 , we can bound  δ/4  δ/4 X  [i,i+1]×{0}  (4.46) E E0 (s) ∩ (−L, L) ≤ E0 ∩ [−2L, 2L] + 2 E ξs ∩ (−L, L) , i≥2L [i,i+1]×{0} where ξs ⊂ R is the point set generated at time s by coalescing Brownian motions in the Brownian web W starting from everywhere in [i, i + 1] at time 0. Note that the first term in (4.46) is finite and independent of s ≥ 0. We now treat the [i,i+1]×{0} i second term. For each i ≥ 2L, let us denote ξs ∩ (−L, L) by ξs,L, which is also a point process on (−L, L) satisfying negative correlation. In particular, ∞ ∞ ∞ Z ∞ 2 n i X i X i n X  1 − x  (4.47) E[|ξs,L|] = P(|ξs,L| ≥ n) ≤ P(|ξs,L ≥ 1) ≤ √ e 2s dx , n=1 n=1 n=1 2πs i−L i where we used Lemma C.4 and the observation that, ξs,L 6= ∅ implies that the Brownian motion starting at (i, 0) in the Brownian web must be to the left of L at time s. Since 2 1 R ∞ − x L is fixed and large, √ e 2s dx ≤ α for some α ∈ (0, 1) uniformly in i ≥ 2L and 2πs i−L s ∈ [0, δ/4], and hence

x2 x2 R ∞ − R ∞ − i−L e 2 dx e 2s dx √ i i−L s (4.48) E[|ξs,L|] ≤ √ = √ . (1 − α) 2πs (1 − α) 2π P i It is then clear that i≥2L E[|ξs,L|] tends to 0 as s ↓ 0 and is uniformly bounded for CS s ∈ [0, δ/4], which concludes the proof of the uniform integrability of LrΦ(Xs ) for s ∈ CS CS [0, δ/4]. Therefore E[Lr Φ(Xt )] is continuous at t = 0. 44 ANDREAS GREVEN 1, RONGFENG SUN 2, ANITA WINTER 3

CS CS δ∧t δ∧t The continuity of E[Lr Φ(Xt )] at t > 0 can be proved similarly. For s ∈ [t− 4 , t+ 4 ], δ∧t we can use the same representation as in (4.42) by treating t − 4 as the starting time 0 CS CS in (4.42). The a.s. convergence lims→t LrΦ(Xs ) = LrΦ(Xt ) then follows as before. The CS uniform integrability of LrΦ(Xs ) follows easily from the density estimate, Lemma B.5, δ∧t δ∧t if we restrict to s ∈ [t − 8 , t + 8 ], since only resampling at locations occupied at time δ∧t CS s by Brownian web paths starting from R at time t − 4 has an effect on Φ(Xs ).

CS Step 3 (Proof of (4.2)). We have thus far shown that E[Φ(Xt )] has a continuous right CS CS CS derivative E[L Φ(Xt )] on [0, ∞). Note that E[Φ(Xt )] is also continuous on [0, ∞) CS since (Xs )s≥0 has a.s. continuous sample paths by Theorem 1.26. Equation (4.2) then follows from a variant of the fundamental theorem of calculus, formulated as Lemma 4.2 below. This finally completes the proof of (4.2) and that of Prop. 4.1.

Lemma 4.2. [Fundamental Theorem of Calculus] + 1 Let f : R → R be continuous, with a continuous right derivative D f(x) := limh↓0 h (f(x+ h) − f(x)). Then Z x + f(x) = f(0) + D f(y)dy for all x ∈ R. 0 R x + Proof. Let g(x) := f(0) + 0 D f(y)dy, which is clearly differentiable with continuous derivative D+f. Then h := f −g is continuous with h(0) = 0 and right derivative D+h ≡ 0. It suffices to show that h ≡ 0. + For  > 0, let h(x) = h(x) + x. Then D h ≡  > 0. This implies that h is non-decreasing on R, since otherwise if h(x) > h(y) for some x < y, then at the point + x0 ∈ [x, y] where h achieves its maximum on [x, y], we have D h(x0) ≤ 0, contradicting + D h ≡ . Letting  ↓ 0, the monotonicity of h then implies that h is also non-decreasing. Applying the same argument to −h shows that h is also non-increasing. Thus h ≡ 0.

4.2. Generator Action on General Test Functions. In this section, we extend the 1,2 CS R validity of the integral equation (4.2) in Prop. 4.1 to general Φ ∈ Π , assuming X0 ∈ Ur . The complication is that the generator identified in (4.1), which acts on regular test 1,2 CS R functions Φ ∈ Πr and evaluated at regular states Xt ∈ Ur , can only be extended to general test functions Φ ∈ Π1,2 provided that we restrict to the regular subclass of CS R states Xt ∈ Urr introduced in Definition 1.37. Fortunately for each t > 0, almost surely CS R 1,2 1,2 Xt ∈ Urr, which makes it possible to extend from Φ ∈ Πr to Φ ∈ Π . Proposition 4.3. [Generator action on general test functions] CS CS R n,φ,g 1,2 Let X be the CSSM genealogy process with X0 ∈ Ur , and let Φ = Φ ∈ Π . Then CS CS (a) The integral equation (4.2) still holds, and E[L Φ(Xt )] is continuous in t > 0. CS R (b) If X0 ∈ Urr, defined as in Definition 1.37, then the generator equation (4.1) still CS CS holds, and E[L Φ(Xt )] is continuous in t ≥ 0. n,φ,g 1,2 1,2 Proof. We will approximate Φ = Φ ∈ Π by Φl ∈ Πr as follows. Let ρ : [0, ∞) → R be continuously differentiable with ρ(x) = 0 for x ∈ [0, 2], ρ0(x) > 0 for x ∈ (2, 3], and ρ(x) = x for x ∈ [3, ∞). For each l > 0, denote ρl(x) = lρ(x/l). Then ρl is a smooth 0 truncation with supx≥0 |ρl(x) − x| → 0 and ρl(x) → 1 for each x > 0 as l ↓ 0. Given φ = φ((rij)1≤i

(a) We will take the limit l ↓ 0 in the integral equation (4.2). In Step 1, we will show CS CS that (4.2) also holds for Φ. In Step 2, we will prove the continuity of E[L Φ(Xt )] 1,2 CS R CS CS in t > 0. Note that without assuming Φ ∈ Πr or X0 ∈ Urr, L Φ(X0 ) may not be well-defined.

Step 1 First note that |Φl − Φ|∞ → 0 as l ↓ 0 by our assumption on g and φ and the fact that supx≥0 |ρl(x) − x| → 0 as l ↓ 0. Therefore,

CS CS CS CS (4.49) lim E[Φl(Xt )] = E[Φ(Xt )] and lim Φl(X0 ) = Φ(X0 ). l↓0 l↓0

CS CS CS CS CS Recall the decomposition of L = Ld + La + Lr in Prop. 4.1. Note that |Ld Φ|∞, CS CS CS |Ld Φl|∞, |La Φ|∞, |La Φl|∞ are all uniformly bounded by some CΦ < ∞ independent of l. Also, by our assumptions on g, φ and ρl, for each s > 0 and a.s. every realization of CS Xs , we have

CS CS CS CS CS CS CS CS (4.50) Ld Φl(Xs ) → Ld Φ(Xs ) and La Φl(Xs ) → La Φ(Xs ) as l ↓ 0.

Therefore, by the bounded convergence theorem,

Z t Z t CS CS CS CS CS CS (4.51) lim E[(Ld Φl + La Φl)(Xs )]ds = E[(Ld Φ + La Φ)(Xs )]ds. l↓0 0 0

CS For the resampling generator Lr , note that for any s > 0 and a.s. every realization of CS Xs , (4.52) Z CS CS X X CS Y |Lr Φ(Xs )| ≤ 1{xi=xj }|g(x)| (θijφ − φ)(r ) dxτ n−2 s R + − 1≤τ≤n 1≤i6=j≤n xi,xj ∈{x ,x :x∈Es} τ6=i,j X ≤ Cn,g,φ dx ∧ 1,

x∈(−M,M)∩Es where Es is defined as in (4.4), which is the subset of R that are points of multiplicity in the CS + − CS dual Brwonian web Wc, dx = rs (x , x ) is the distance between the two points in Xs n with the same spatial location as x ∈ Es, M > 0 is chosen such that supp(g) ⊂ (−M,M) , and we used the assumption that φ has bounded derivative. Such a bound holds for φl uniformly in l > 0. If P d ∧ 1 < ∞, then by dominated convergence, we x∈(−M,M)∩Es x have

CS CS CS CS (4.53) lim Lr Φl(Xs ) = Lr Φ(Xs ). l↓0

CS To prove the analogue of (4.51) for Lr Φ, by dominated convergence, it only remains to show

Z t h X i (4.54) E dx ∧ 1 ds < ∞. 0 x∈(−M,M)∩Es 46 ANDREAS GREVEN 1, RONGFENG SUN 2, ANITA WINTER 3

Note that for each s ∈ (0, t), h X i E dx ∧ 1 x∈(−M,M)∩Es h X Z 1 i Z 1 h i = 1 du = {x ∈ (−M,M) ∩ E : d > u} du E {dx∧1>u} E s x 0 0 x∈(−M,M)∩Es 2s∧1 1 Z h i Z h i

≤ E {x ∈ (−M,M) ∩ Es : dx > u} du + E {x ∈ (−M,M) ∩ Es : dx > 2s} du 0 2s∧1 Z 2s∧1 h i Z 1 h i R×{s−u/2} R×{0} = E ξs ∩ (−M,M) du + E ξs ∩ (−M,M) du 0 2s∧1 √ 1 − 2s ∧ 1  1  (4.55)= CM 2s ∧ 1 + CM √ ≤ CM 1 + √ , s s A where ξs denotes the point set on R generated at time s by the collection of paths in the 2 R×{s−u} Brownian web W starting from the space time set A ⊂ R , and we used E[ |ξs ∩ √b−a [a, b]| ] = πu by Lemma B.5. Inequality (4.54) then follows. To summarize, we have thus shown that (4.2) is also valid for a general polynomial Φ ∈ Π1,2. CS CS CS Step 2 To prove the continuity of E[L Φ(Xt )] in t > 0, we again decompose L into CS CS CS CS its three summands. First note that the continuity of E[Ld Φ(Xt )] and E[La Φ(Xt )] CS CS CS CS on [0, ∞) follow by the same arguments as that for E[Ld Φ(Xt )] and E[La Φ(Xt )] in the proof of Prop. 4.1. CS CS To prove the continuity of E[Lr Φ(Xt )] in t > 0, we fix t > 0 and a truncation param- CS CS <+s−t CS eter  ∈ (0, t). For each s ∈ (t − , t + ), we decompose Lr Φ(Xs ) into Lr Φ(Xs ) ≥+s−t CS <+s−t CS ≥+s−t CS and Lr Φ(Xs ), where both Lr Φ(Xs ) and Lr Φ(Xs ) are defined as in (4.42), except that resampling therein is carried out by summing over X ≥+s−t CS 1{xi=xj } for Lr Φ(Xs ), + − +s−t xi,xj ∈{y ,y :y∈Es } (4.56) X <+s−t CS 1{xi=xj } for Lr Φ(Xs ), + − +s−t xi,xj ∈{y ,y :y∈Es\Es } δ where Es and Es are defined as in (4.4). CS CS The same argument as in the proof of the continuity of E[Lr Φ(Xt )] in Prop. 4.1 shows that ≥+s−t CS ≥+s−t CS (4.57) lim [|Lr Φ(Xs ) − Lr Φ(Xt )|] = 0. s→t E   On the other hand, for s ∈ [t − 2 , t + 2 ], by the same calculations as those leading to (4.55), we have

<+s−t CS h X i E[|Lr Φ(Xs )|] ≤ Cn,g,ϕE dx ∧ 1 +s−t x∈(−M,M)∩Es\Es h X i √ 0 √ (4.58) = Cn,g,ϕE dx ∧ 1 ≤ CM  + s − t ≤ C M . x∈(−M,M)∩Es dx≤2(+s−t) Since  > 0 can be chosen arbitrarily small, (4.57) and (4.58) together imply that CS CS CS CS lim [Lr Φ(Xs )] = [Lr Φ(Xt )] when t > 0. s→t E E CONTINUUM SPACE LIMIT OF IFV GENEALOGIES 47

CS CS CS CS (b) We now verify the continuity of E[Lr Φ(Xt )] (and hence of E[L Φ(Xt )]) at CS R t = 0 under the additional assumption that X0 ∈ Urr. Together with (4.2), this also implies that the generator equation (4.1) holds for a general polynomial Φ ∈ Π1,2. CS CS ≥+s CS <−s CS As before, we separate Lr Φ(Xs ) into Lr Φ(Xs ) and Lr Φ(Xs ), where the truncation parameter  > 0 is fixed and small. Equation (4.57) also holds with t = 0 CS CS by the same argument as that for the continuity of E[Lr Φ(Xt )] at t = 0, when Φ ∈ 1,2 Πr . Indeed, in both cases, only resampling between individuals with sufficiently large genealogical distance contribute to the generator action on Φ. It remains to show that <+s CS (4.59) sup E[|Lr Φ(Xs )|] −→ 0. 0≤s≤/2 ↓0 For 0 < s < /2, we can separate the resampling terms according to whether the genealogies of the two resampled individuals merge above or below time 0, and write

<+s CS h X i h X X i E[|Lr Φ(Xs )|] ≤ Cn,g,ϕE dx = Cn,g,ϕE dx + dx x∈(−M,M)∩Es x∈(−M,M)∩Es x∈(−M,M)∩Es dx≤2(+s) dx∈(0,2s] dx∈(2s,2(+s)) √ h X i (4.60) ≤ Cn,g,ϕM s + Cn,g,ϕ E dx . x∈(−M,M)∩Es dx∈(2s,2(+s)) It remains to bound the expectation on the r.h.s. above, which originate from resampling between individuals whose genealogical distance depend on the distance of their ancestors CS ˆ[−M,M]×{s} in X0 at time 0. Let ξ0 be the point set on R generated by the collection of paths in the dual Brownian web Wc starting from the space-time set [−M,M] × {s}. Let ˆ[−M,M]×{s} us order the points in ξ0 and denote them by z1 < z2 ··· < zMs+1, with a total CS of Ms + 1 points. Then by our construction of Xs , we have

Ms h X i h X CS i dx = 1 CS (2s + r (zi, zi+1)) E E {r0 (zi,zi+1)<2} 0 x∈(−M,M)∩Es i=1 dx∈(2s,2(+s))

Ms √ h X CS i (4.61) ≤ CM s + 1 CS r (zi, zi+1) , E {r0 (zi,zi+1)<2} 0 i=1

√2M where we used E[Ms] = πs because by duality between the Brownian web and its dual, Ms is exactly the number of points in the interval [−M,M] occupied at time s by Brownian web paths starting from R at time 0, and hence Lemma B.5 can be applied. For each pair CS CS (zi, zi+1) with r0 (zi, zi+1) < 2, by the properties of X0 , there must exist a yi ∈ E0 CS with zi < yi < zi+1 such that r0 (zi, zi+1) = dyi . Note that the yi, 1 ≤ i ≤ Ms, are all distinct with dyi < 2. We can now separate the contribution to the second term in (4.61) into two groups. The first group consists of contributions from pairs (zi, zi+1) with −2M ≤ zi < zi+1 ≤ P 2M. The total contribution from these terms is uniformly dominated by x∈[−2M,2M]∩E0 dx, dx≤2 CS R which tends to 0 as  ↓ 0 by the assumption X0 ∈ Urr. The second group consists of contributions from pairs (zi, zi+1) with either zi ≤ −2M or zi+1 ≥ 2M. We bound these terms by 2 times the expected cardinality of such pairs. By the duality between W and Wc, the cardinality of such pairs of (zi, zi+1) is bounded by {x∈R:|x|≥2M}×{0} the cardinality of ξs ∩ (−M,M), the point set on (−M,M) generated at time s by paths in W starting from the space-time set{x ∈ R : |x| ≥ 2M}×{0}. As shown 48 ANDREAS GREVEN 1, RONGFENG SUN 2, ANITA WINTER 3

{x∈R:|x|≥2M}×{0} in (4.47), the expected cardinality of ξs ∩ (−M,M) is uniformly bounded in s ∈ [0, l] for any l > 0. Therefore, uniformly for s ∈ [0, /2], the second term in (4.61) is also bounded by a function of , which tends to 0 as  ↓ 0. This concludes the proof of (4.59) and Proposition 4.3.

4.3. Proof of Theorem 1.38. We now prove Theorem 1.38. Note that the various path properties listed in Theorem 1.38 has been verified for the CSSM genealogy process X CS CS R in Prop. 1.24 and Theorem 1.26, except for the claim that for each t > 0, Xt ∈ Urr almost surely. To verify this last claim, we need to show that a.s., X CS + − (4.62) rt (x , x ) < ∞ for each M > 0. x∈Et∩(−M,M) where Et is defined as in (4.4). Since for each  > 0, the set {x ∈ Et ∩ (−M,M): CS + − CS R rt (x , x ) ≥ } is a finite set because Xt ∈ Ur , (4.62) is reduced to showing that a.s., X CS + − 1 CS + − r (x , x ) < ∞ for each M > 0. {rt (x ,x )<} t x∈Et∩(−M,M) The l.h.s. above has been shown in (4.55) to have finite expectation, and hence is finite CS R a.s. Therefore Xt ∈ Urr a.s. To conclude the proof of Theorem 1.38, it only remains to verify the martingale property, 1,2 CS R namely that given Φ ∈ Π and X0 ∈ U1 , for each 0 ≤ s < t, we have a.s. h Z t i CS CS CS CS CS (4.63) E Φ(Xt ) − Φ(Xs ) − (L Φ)(Xu )du (Xu )0≤u≤s = 0. s CS R Consider first the case s > 0. Then a.s. Xs ∈ Ur . We can use the Markov property of CS 1,2 CS X and apply the integral equation (4.2) for Φ ∈ Π and initial state Xs , as established in Prop. 4.3. Equation (4.63) then follows provided that we can apply Fubini’s Theorem to interchange the integral with expectation in Z t Z t h CS CS i CS CS (4.64) E (L Φ)(Xu )du = E[(L Φ)(Xu )]du. s s CS CS CS CS As in the proof of Prop. 4.3, we can decompose L as La + Ld + Lr . The aging and CS diffusion part of the generator action on Φ can be uniformly bounded in time and in Xu , while the resampling part can be bounded as in (4.52), which is shown to be integrable R t w.r.t. E[ 0 ·] in (4.54). Therefore Fubini can be applied, and (4.63) holds for s > 0 a.s. To treat the case s = 0, we take expectation in (4.63) for s > 0, i.e., Z t h CS CS CS CS i E Φ(Xt ) − Φ(Xs ) − (L Φ)(Xu )du = 0, s CS CS CS and then let s ↓ 0. By the a.s. continuity of Xs in s ≥ 0, we have E[Φ(Xs )] → Φ(X0 ).  R t CS CS  The convergence of E s (L Φ)(Xu )du as s ↓ 0 follows by dominated convergence, using the same argument as that for (4.64). Therefore (4.63) also holds for s = 0, which proves the desired martingale property.

Appendix A. Proofs on marked metric measure spaces In this section, we prove Theorems 1.5 and 1.8, derive a relative compactness condition V V for subsets of M , and a tightness criterion for laws on M . Furthermore we formally treat V pasting of trees. Our starting point for the first of these points is Remark 1.4, that M can V N be identified as a subspace of (Mf ) , endowed with the product V -marked Gromov-weak topology. CONTINUUM SPACE LIMIT OF IFV GENEALOGIES 49

V Proof of Theorem 1.5. As noted in Remark 1.4, we can identify M as a subspace V N of (Mf ) , endowed with the product V -marked Gromov-weak topology. Furthermore, V V N under this identification, M is a closed subspace of (Mf ) . It was shown in [DGP11, V Theorem 2] that M1 , the space of V -mmm spaces with probability measures, equipped with the V -marked Gromov-weak topology, is a Polish space. The same conclusion is V V N easily seen to hold for Mf . Therefore (Mf ) is also Polish, which implies that any closed V subspace, including M , is also Polish.

Proof of Theorem 1.8. For each k ∈ N ∪ {0, ∞}, let n k k  n,φ k (2) n Πe := ∪n∈N0 Πe n := ∪n∈N0 Φ : φ ∈ C (R+ × V , R) , n n k (2) n (2) where C (R+ ×V , R) is the space of bounded continuous real-valued functions on R+ × n n V that are k times continuously differentiable in the first 2 coordinates, and Z Z n,φ ⊗n V Φ ((X, r, µ)) := ··· φ(r, v)µ (d(x, v)) for each (X, r, µ) ∈ M1 , with r := (r(xi, xj))1≤i

V The following relative compactness criterion for subsets of M follow easily from the V V N identification of M as a subspace of (Mf ) , and the relative compactness criterion for V subsets of Mf formulated in [DGP11, Thm. 3] and [GPW13, Prop. 6.1]. V V Theorem A.1 (Relative compactness of subsets of M ). Let Γ ⊂ M , and let o be any point in V . For each k ∈ N, let Bk(o) denote the open ball of radius k centered at o. Then # Γ is relatively compact w.r.t. the V -marked Gromov-weak topology if for each k ∈ N, V {(X, r, 1{v∈Bk(o)}µ(dxdv)) : (X, r, µ) ∈ Γ} is a relatively compact subset of Mf , i.e., (i) The family of finite measures on V ,

Λk := {µ(X × dv)1{v∈Bk(o)} : (X, r, µ) ∈ Γ}, is relatively compact w.r.t. the weak topology; (ii) For each  > 0, there exists L > 0 such that uniformly in (X, r, µ) ∈ Γ, ZZ

1{r(x,y)>L}1{u,v∈Bk(o)}µ(dxdu)µ(dydv) ≤ ; (X×V )2

(iii) For each  > 0, there exists M ∈ N such that uniformly in (X, r, µ) ∈ Γ, we can M find M balls of radius  in X, say Bε,1,...,Bε,M ⊂ X with B = ∪i=1Bε,i, such that µ((X\B) × Bk(o)) ≤ . V By the tightness criteria formulated in [DGP11, Thm. 4] for Mf -valued random vari- V ables, we obtain the following tightness criteria for M -valued random variables. V Theorem A.2 (Tightness of M -valued random variables). Let o be any point in V , and V let Bk(o) denote the open ball of radius k centered at o. A family of M -valued random variables {Xi = (Xi, ri, µi)}i∈I is tight if for each k ∈ N, {Xi, ri, 1{v∈Bk(o)}µi(dxdv)}i∈I is V a tight family of Mf -valued random variables, i.e., 50 ANDREAS GREVEN 1, RONGFENG SUN 2, ANITA WINTER 3

(i) {1{v∈Bk(o)}µi(X × dv)}i∈I is a tight family of random variables taking values in the space of finite measures on V (equipped with the weak topology); (ii) {Xi, ri, µi(dx × Bk(o))}i∈I is a tight family of random variables taking values in the space of metric measure spaces (equipped with the Gromov-weak topology). V Using the characterization of relatively compact sets in M given in Theorem A.1, one V can also formulate more concrete conditions for the tightness of a family of M -valued random variables, using concrete conditions for the tightness of a family of random metric measure spaces formulated in [GPW09, Thm. 3].

Pasting Let (Kt)t≥0 be the marked metric measure space generated by the entrance law of the spatial coalescent on V starting with countably many individuals per site, each forming their own partition element and run for time t. We claim the dual representation theorem (strong duality):

(A.1) L[Xt] = L[X0 ` Kt], where ` denotes the subordination of the second marked ultrametric measure space to the first w.r.t. the location of partition elements at time t which on the r.h.s. are realized independently. Here pasting of Y to X means with respect to a function χ : Y → V without reference to graphical devices that we have the following relation. n n (2) n n Let g ∈ Cbb(V , R), ϕ ∈ Cb(R+ , R) and n ∈ N. Define ge : V × V → R, ϕe : n n (2) (2) R+ × R+ −→ R given by n 0 Y (A.2) g(v, v ) = g(v) δ 0 , e vi,vi i=1

0 0 (A.3) ϕe(r, r ) = ϕ(r + r ). Then we should have the following relation: 0 0 Let X = (x, r, µ) and Y = (Y, r , µ ) both in UV. Then we require for the new element of UV for all ϕ, g that: Z Z n,g,ϕ  0 0 0 0 0  (A.4) Φ (X ` Y) = Z µ⊗n(d(x, v))µ ⊗n(d(x , v ))ge(v, χ(x ))ϕe(r, r ) . (X×V )n (Y ×V )n If an ultrametric measure space exists, it is automatically uniquely determined. We can construct a solution by taking a sample sequence and pasting by coagulating roots and leaves for the same sites, if they have the same index as we explained earlier in Remark 1.20.

Appendix B. The Brownian web In this section, we recall the construction and basic properties of the Brownian web. Re- call from Subsection 1.4 the random variable (W, Wc) as constructed in [FINR04, FINR06], and in particular from (1.28)-(1.30), the state spaces of Π of W and Πb of Wc. It has been shown [FINR04, Theorem 2.1] that the Brownian web W can be characterized as follows: Theorem B.1 (Characterization of the Brownian web). The Brownian web W is a ran- dom closed subset of Π, whose law is uniquely determined by the following properties: 2 (i) For every z ∈ R , almost surely W(z) contains a unique path. CONTINUUM SPACE LIMIT OF IFV GENEALOGIES 51

2 (ii) For every finite n and deterministic points z1, ··· , zn ∈ R , {W(zi): i = 1, ··· , n} are distributed as n coalescing Brownian motions starting from z1, ··· , zn. 2 (iii) For every deterministic countable dense subset D ⊂ R , W is almost surely the closure of {W(z); z ∈ D} in Π. The following result shows that every path in W can be approximated by the countable set of paths {W(z); z ∈ D} in a very strong sense (see e.g. [SS08, Lemma 3.4]).

Lemma B.2 (Convergence of Paths in W). Almost surely, if (fn)n∈N and f are paths in W, starting respectively at times (sn)n∈N and s, and fn → f in Π, then sup{t : fn(t) 6= f(t)} → s as n → ∞.

The dual Brownian web Wc can be characterized as follows [FINR06, Theorem 3.7]: Theorem B.3 (Characterization of the dual Brownian web). Let W be a Brownian web. Then there exists an almost surely uniquely determined Πb-valued random variable Wc de- fined on the same probability space as W, called the dual Brownian web, such that: (i) Almost surely, paths in W and Wc do not cross, i.e., there exist no f ∈ W and fˆ ∈ Wc and s 6= t such that (f(s) − fˆ(s))(f(t) − fˆ(t)) < 0; (ii) RWc has the same law as W, where R denotes the reflection map that maps each 2 fˆ ∈ Wc to an f ∈ Π such that the graph of f in R is the reflection of the graph of fˆ with respect to the origin. Below we collect some basic properties of the Brownain web which we will use. For further details, see [FINR04, FINR06, SS08]. The first property concerns the configuration 2 2 of paths in W and Wc entering and leaving a point z ∈ R . For each z = (x, t) ∈ R , let mout(z) denote the cardinality of W(z). We will let min(z) denote the number of equivalence classes of paths in W entering z, where a path f ∈ W is said to enter z if it starts before time t and f(t) = x, while two paths f, g ∈ W entering z are called equivalent if they coalesce before time t. Note that min(z) = 2 if and only if z is a point of coalescence between two paths in W. Similarly, we can definem ˆ in(z) andm ˆ out(z), based on the configuration of paths in Wc. The pair (min, mout) is called the type of z in W. We cite the following result from [TW98, Prop. 2.4] and [FINR06, Thm. 3.11-3.14].

Lemma B.4 (Special points for the Brownian web). Let W and Wc be a Brownian web and its dual. Almost surely: 2 (1) The set of z ∈ R with (min(z), mout(z)) = (m ˆ in(z), mˆ out(z)) = (0, 1) has full 2 Lebesgue measure on R . (2) For each t ∈ R, the set of z ∈ R × {t} with mˆ out(z) ≥ 2 is a countable set, with min(z) ≥ 1 for each such z, i.e., z lies on the graph of some path in W starting before time t. Next we cite a result on the decay of the density of coalescing paths started at time 0.

R×{0} Lemma B.5 (Density for the Brownian web). For t > 0, let ξt := {f(t): f ∈ ∪x∈RW(x, 0)} denote the point set on R generated at time t by the collection of coalescing paths in the Brownian web W started at time 0. Then for any a < b,

R×{0} b − a (B.1) E[|ξt ∩ [a, b]|] = √ . πt This result can be easily derived by using the duality between W and Wc, namely that ξt ∈ (x, x + ) 6= ∅ if and only if the two paths in Wc starting from (x, t) and (x + , t) do not collide on the time interval [0, t]. See e.g. [SS08, Prop. 1.12], where such a density 52 ANDREAS GREVEN 1, RONGFENG SUN 2, ANITA WINTER 3 calculation is carried out for a generalization of the Brownian web known as the Brownian net, which in addition allows branching of paths.

Appendix C. Correlation Inequalities for Coalescing Brownian motions In this section, we prove some negative correlation inequalities for a collection of coalesc- ing Brownian motions, which are used in Section 4. One such inequality has been observed in [NRS05, Remark 7.5]. Here we deduce more general negative correlation inequalities from Reimer’s inequality applied to coalescing random walks. In van den Berg and Kesten [vdBK02], Reimer’s inequality was applied to continuous time coalescing random walks with a generalized coalescing rule. Since we are interested in coalescing Brownian motions, discrete space-time coalescing random walks with instan- taneous coalescing already provide an adequate approximation, and Reimer’s inequality can be applied without any complication to the latter. First we recall Reimer’s inequality [Rei00]. For each i ∈ I := {1, ··· , n}, let Si be a finite set with a probability measure µi on Si. Let Ω = S1 × S2 · · · × Sn and µ = µ1 × · · · × µn. 0 0 For K ⊂ I and ω = (ωi)i∈I , define the cylinder set C(K, ω) := {ω ∈ Ω: ωi = ωi ∀ i ∈ K}. Given two events A, B ⊂ Ω, we say A and B occur disjointly for a configuration ω ∈ Ω if there exists K ⊂ I such that C(K, ω) ⊂ A and C(I\K, ω) ⊂ B. The set of ω ∈ Ω for which A and B occur disjointly, which we call the disjoint intersection of A and B, is denoted by

AB := {ω ∈ Ω: ∃ K ⊂ I s.t. C(K, ω) ⊂ A and C(I\K, ω) ⊂ B}. Then Reimer’s inequality asserts that, for any two events A, B ⊂ Ω,

(C.1) µ(AB) ≤ µ(A)µ(B). Now we apply this inequality to coalescing random walks. We recall first the con- 2 2 struction of discrete space-time coalescing random walks. Let Zeven = {(x, t) ∈ Z : x + t is even}. Let {ω } 2 be i.i.d. random variables taking values in {±1}. A di- z z∈Zeven 2 rected edge is drawn from each z = (x, t) ∈ Zeven, which ends at (x + 1, t + 1) if ωz = 1, and ends at (x − 1, t+ 1) if ωz = −1. This provides a graphical construction of a collection 2 of coalescing random walks, where the random walk path starting from each z ∈ Zeven is 2 constructed by following the directed edges in Zeven drawn according to ω. (xi,ti) To see how Reimer’s inequality is applied, let (Xj )j≥ti , 1 ≤ i ≤ n, be a collection 2 of coalescing random walks constructed as above with starting points (xi, ti) ∈ Zeven, and assume for simplicity ti ≤ 0 for all 1 ≤ i ≤ n. For t ∈ N, let ξt = {x ∈ Z : x = (xi,ti) Xt for some 1 ≤ i ≤ n}. Let O1, ··· ,Ok be disjoint subsets of Z, and let Ai = {ω : ξt ∩ Oi 6= ∅}. The crucial observation is that, if the events (Ai)1≤i≤n occur simultaneously, then they must occur disjointly w.r.t. (ω ) 2 because of the coalescence. Namely, z z∈Zeven

k \ (C.2) Ai = A1A2 ··· Ak. i=1 Reimer’s inequality (C.1) then gives the negative correlation inequality

k k \  Y (C.3) P Ai ≤ P(Ai). i=1 i=1 CONTINUUM SPACE LIMIT OF IFV GENEALOGIES 53

The same reasoning allows us to choose each Ai to be an increasing event of the occupation 0 configuration ξt∩Oi, i.e., given ω and ω with respective occupation configurations ξt∩Oi ⊂ 0 0 ξt ∩ Oi, if ω ∈ Ai, then also ω ∈ Ai. Using Reimer’s inequality as illustrated above, together with the invariance principle for coalescing random walks, we will deduce a host of negative correlation inequalities for coalescing Brownian motions, which we formulate next. Definition C.1 (Negatively correlated point processes). We say a simple point process ξ on R is negatively correlated, if for any n ∈ N and any disjoint open intervals O1, ··· ,On, n Qn we have P(∩i=1{ξ ∩ Oi 6= ∅}) ≤ i=1 P(ξ ∩ Oi 6= ∅). Lemma C.2 (Negative correlation for colaescing Brownian motions). Let A ⊂ R × A (−∞, 0], and let ξt denote the point set on R generated at time t > 0 by the collec- A tion of coalescing Brownian motions in the Brownian web W starting from A. Then ξt is negatively correlated. Proof. By monotone convergence, it suffices to consider the case when A consists of a A finite number of points {z1, ··· , zk}. The fact that ξt is negatively correlated then follows directly from the negative correlation inequality (C.3) for coalescing random walks, and the distributional convergence of coalescing random walks to coalescing Brownian motions in the local uniform topology (see e.g. [NRS05, Section 5]).

A Lemma C.3 (Decoupling of correlation functions). Let ξt be as in Lemma C.2. Let a1 < b1 ≤ a2 < b2 · · · ≤ an < bn. Then for any 1 ≤ j ≤ n − 1, (C.4) n  \ A A  P {ξt ∩ (ai, bi) 6= ∅} ∩ {ξt ∩ (bj, aj+1) = ∅} i=1  A A A  Y A ≤ P ξt ∩ (aj, bj) 6= ∅, ξt ∩ (bj, aj+1) = ∅, ξt ∩ (aj+1, bj+1) 6= ∅ P(ξt ∩ (ai, bi) 6= ∅). 1≤i≤n i6=j,j+1 Proof. As before, this follows from approximation by discrete space-time coalescing random walks and Reimer’s inequality. Note that for the discrete analogue of the events in the second line of (C.4), if they all occur, then they must occur disjointly.

A Lemma C.4 (Negative correlation for occupation number). Let ξt be as in Lemma C.2, and let B ⊂ R have finite Lebesgue measure. Then for any k ∈ N, A A k (C.5) P(|ξt ∩ B| ≥ k) ≤ P(|ξt ∩ B| ≥ 1) . Proof. By monotone convergence and approximation by open sets, it suffices to consider the case when A ⊂ R × (−∞, 0] consists of a finite number of points, and B is the finite union of disjoint open intervals. In fact, the argument is the same if B is a bounded open interval, say (0, 1). We proceed by discrete approximation. Let (Zt)t∈N be the subset of Z occupied at 2 time t ∈ N by a collection of coalescing random walks on Zeven starting from z1 = 2 (x1, t1), ··· , zn = (xn, tn) ∈ Zeven with ti ≤ 0 for all 1 ≤ i ≤ n. Given O ⊂ Z and for k ∈ , let A = {ω ∈ Ω: |Z ∩ O| ≥ k}, where ω = (ω ) 2 are the i.i.d. {±1}- N k t z z∈Zeven valued random variables underlying the graphical construction of the coalescing random walks. We note that k z }| { (C.6) Ak ⊂ A1 ··· A1 . 54 ANDREAS GREVEN 1, RONGFENG SUN 2, ANITA WINTER 3

Indeed, if Ak occurs, then we can find k disjoint random walk paths, each of which occupies k a distinct site in O at time t. Reimer’s inequality (C.1) then implies P(Ak) ≤ P(A1) . Inequality (C.5) then follows by the distributional convergence of coalescing random walks to coalescing Brownian motions in the local uniform topology.

Lemma C.5 (Moment bounds for occupation number). Let ξ be a simple point process on R with a locally finite intensity measure µ, which is absolutely continuous w.r.t. Lebesgue measure on R. If ξ is negatively correlated, then for any Lebesgue measurable B ⊂ R with µ(B) < ∞, and for any k ∈ N, we have k X  k  (C.7) |ξ ∩ B|k ≤ mk−mµ(B)m. E m m=1 −n n Proof. Let B be an open interval, say (0, 1). For n ∈ N, let Dn = {i2 : 0 ≤ i ≤ 2 }, and let D = S D . By our assumption that the intensity measure µ is absolutely n∈N n continuous w.r.t. Lebesgue measure, ξ ∩ D = ∅ almost surely. For 1 ≤ i ≤ 2n, let (n) Ii (ξ) = 1{ξ∩((i−1)2−n,i2−n)6=∅}. By the assumption that ξ is a simple point process, and by monotone convergence, (C.8) k h X (n) ki h X (n) ki X h Y (n)i |ξ∩(0, 1)|k = lim I  = lim I  = lim I . E E n→∞ i n→∞ E i n→∞ E ij n n n 1≤i≤2 1≤i≤2 1≤i1,··· ,ik≤2 j=1 P2n (n) ~ Note that µ(0, 1) = E[|ξ ∩ (0, 1)|] = limn→∞ j=1 E[Ij ]. Given i := (i1, ··· , ik) ∈ n k {1, ··· , 2 } , let m(~i) = |{i1, ··· , ik}|, and denote the m(~i) distinct indices in {i1, ··· , ik} by σ1(~i) < ··· < σm(~i). Then by the negative correlation assumption, (C.9) h i P E Qk I(n) j=1 ij n 1≤i1,··· ,ik≤2 P Qm(~i) (n) Pk P Qm (n) ≤ 1≤i ,··· ,i ≤2n j=1 E[I ] = m=1 1≤σ <···<σ ≤2n f(k, m)m! j=1 E[Ij ] 1 k σj (~i) 1 m  n m ≤ Pk f(k, m) P2 [I(n)] −→ Pk f(k, m)µ(0, 1)m, m=1 j=1 E j n→∞ m=1 n k where f(k, m)m! = |{(i1, ··· , ik) ∈ {1, ··· , 2 } : {i1, ··· , ik} = {σ1, ··· , σm}}|, which is n easily seen to be independent of the choice of 1 ≤ σ1 < ··· < σm ≤ 2 . By first picking m indices out of {i1, ··· , ik} and assign them values σ1 < ··· < σm respectively, we easily k  k−m verify that f(k, m) ≤ m m , which proves (C.7) for B = (0, 1). By the same argument, (C.7) holds for finite unions of disjoint open intervals, and by monotone convergence, for open sets as well. Since any Lebesgue measurable set can be approximated from outside by open sets, again by monotone convergence and the fact ξ ∩E = ∅ a.s. for a given E ⊂ R with zero Lebesgue measure, (C.7) also holds for any Lebesgue measurable B.

We also need the following estimate on the constrained two-point correlation function for the Brownian web. R×{0} Lemma C.6 (Constrained two point function for the Brownian web). Let ξt be as R×{0} in Lemma B.5. Let t > 0. For a < b, let I[a,b] denote the event that ξ ∩ [a, b] 6= ∅. Then for any x1 < x2 with ∆ := x2 − x1, we have 2 − ∆ 1 ∆e 4t (C.10) Kc(x , x ) := Kc(∆) = lim I ∩Ic ∩I  = . t 1 2 t 2 P [x1,x1+δ] [x1+δ,x2] [x2,x2+δ] √ 3  δ↓0 δ 2 πt 2 CONTINUUM SPACE LIMIT OF IFV GENEALOGIES 55

×{0} Proof. By translation invariance of ξR , we may assume x1 = 0, and let x2 − x1 = ∆ > 0. Letw ˆ(0,t),w ˆ(δ,t),w ˆ(∆,t) andw ˆ(∆+δ,t) be the dual coalescing Brownian motions in Wc starting at respectively (0, t), (δ, t), (∆, t) and (∆ + δ, t). Letτ ˆa,b be the time of coalescence betweenw ˆ(a,t) andw ˆ(b,t). Then by the duality between W and Wc, almost surely, the event I[0,δ] ∩ I[δ,∆] occurs if and only ifτ ˆ0,δ < 0 andτ ˆδ,∆ < 0, and the event I[0,δ] ∩ I[δ,∆+δ] occurs if and only ifτ ˆ0,δ < 0 andτ ˆδ,∆+δ < 0. Since I[δ,∆] ⊂ I[δ,∆+δ] and c I[δ,∆+δ]\I[δ,∆] = I[δ,∆] ∩ I(∆,∆+δ], we have

 c  P I[x1,x1+δ] ∩ I[x +δ,x ] ∩ I[x2,x2+δ] (C.11) 1 2 B1,B2,B3 B1,B2,B3 = P0,δ,∆ (τ0,δ > t, τδ,∆+δ > t) − P0,δ,∆+δ (τ0,δ > t, τδ,∆ > t),

where we have reversed time and replacedw ˆ(a,t) for a = 0, δ, ∆, ∆ + δ by independent standard Brownian motions B1,B2 and B3 starting from 0, δ, and ∆ (or ∆ + δ), and τa,b is the first hitting time between the two Brownian motions starting from a and b. By the Karlin-McGregor formula [KM59], the transition density for three one-dimensional Brownian motions B1,B2 and B3 starting from x1 < x2 < x3 to end at locations y1 < y2 < y3 at time t without ever intersecting along the way is given by the determinant (y−x)2 − Det(p (x , y ) ), where p (x, y) = e √ 2t . Therefore we define t i j 1≤i,j≤3 t 2πt

B1,B2,B3 B1,B2,B3 (C.12) D := P0,δ,∆ (τ0,δ > t, τδ,∆+δ > t) − P0,δ,∆+δ (τ0,δ > t, τδ,∆ > t).

The r.h.s. can be written as:

ZZZ pt(0, y1) pt(0, y2) pt(0, y3) pt(0, y1) pt(0, y2) pt(0, y3)

= pt(δ, y1) pt(δ, y2) pt(δ, y3) − pt(δ, y1) pt(δ, y2) pt(δ, y3) d~y. p (∆ + δ, y ) p (∆ + δ, y ) p (∆ + δ, y ) p (∆, y ) p (∆, y ) p (∆+, y ) y1

After some elementary manipulation, we obtain that D is given by the expression: (C.13)

y2+y2+y2+δ2+∆2 1 1 1 − 1 2 3 ZZZ e 2t δy1 δy2 δy3 D = e t − 1 e t − 1 e t − 1 d~y, 3 (2πt) 2 ∆y1 ∆y2 ∆y3 y1

2 2(yi−∆)δ−δ where f(t, yi, δ, ∆) = e 2t − 1. We then Taylor expand in δ, and note that the 2 − kyk −2 factor e 2t allows us to take Dδ and pass the limit δ ↓ 0 inside the integral to obtain (C.14) y2+y2+y2+∆2 ZZZ − 1 2 3 1 1 1 c D e 2t y y y K (∆) = lim = 3 7 1 2 3 d~y. t 2 ∆y ∆y ∆y δ↓0 δ 2 2 1 2 3 (2π) t t t t y1

ZZZ (x −∆)¯ 2+x2+x2 − 1 2 3 I1 = e 2 (x1 − ∆)(¯ x3 − x2)d~x,

x1

x1

x1

x2

Appendix D. Proof of Theorems 1.11 and 1.16 In this section we present the proof of the results on the evolving genealogies for the interacting Fleming-Viot diffusions. This model is a special case of evolving genealogies for the interacting Λ-Fleming-Viot diffusions which are studied in [GKW]. Proof of Theorem 1.11. We will proceed in five steps: (1) We show the result on the martingale problem and the duality for finite geographic spaces V . (2) To prepare for the general case where V is countable, we define an approximation procedure with specific finite geographic space dynamics. (3) We then show the convergence in path space, as the finite spaces approach V . (4) We verify the claimed properties of the solution for general V by a direct argument based on the duality and an explicit look-down construction. (5) Finally, we show that the process admits a mark function. We will use several known facts on measure-valued Fleming-Viot diffusions. For that we refer to [Daw93, Chapter 5] in the non-spatial case and to [DGV95] for the spatial case.

Step 1 (V finite) The case where V is finite is very similar to the non-spatial case. We therefore just have to modify the arguments of the proof of Theorem 1 in [GPW13] or Theorem 1 in [DGP12]. As usual we will conclude uniqueness of the solution of the martingale problem from a duality relation. Note that in contrast to the (non-spatial) interacting Fleming-Viot model CONTINUUM SPACE LIMIT OF IFV GENEALOGIES 57 with mutation considered in [DGP12], in our spatial interacting Fleming-Viot model, resampling takes place only locally, that is, for individuals at the same site. The tree- valued dual will therefore be based on the spatial coalescent considered in [GLW05]. As for existence of a solution of the martingale problem we consider the martingale problems for the evolving genealogies of the approximating spatial Moran models. By consistency of the spatial coalescent, we get the uniform convergence of generators for free. Thus we only have to show the compact containment condition. Here we can rely on the general criterion for population dynamics given in Proposition 2.22 in [GPW13]. As V is finite, all arguments given in [GPW13] to verify this criterion simply go through here as well. Step 2 (A coupled family of approximating finite systems) Let now V be count- able, and consider a sequence (Vn)n∈N of finite sets with Vn ⊆ V , and Vn ↑ V . Put for each n ∈ N, and for all v1, v2 ∈ V ,

( a(v1,v2)1V ×V (v1,v2) P n n , if v ∈ V ,  a(v1,v3) 1 n (D.1) an v1, v2 := v3∈Vn δ(v1, v2), if v1 6∈ Vn. Denote then by X FV,Vn a solution of the martingale problem associated with the operator (restricted to Vn) Z X ∂ LFV,Vn ΦX  = 2 µ⊗n(d(x, v)) g(v) φ(r) n ∂rk,` (X×V ) 1≤k<`≤n Z n ⊗n X X 0 0 (D.2) + µ (d(x, v)) φ(r) a¯n(vj, v )(Mvj ,v g − g)(v) n (X×V ) j=1 v0∈V Z ⊗n X  + 2 γ µ (d(x, v)) g(v) 1{vk=v`} θk,`φ − φ (r), n (X×V ) 1≤k<`≤n

Vn and K the spatial coalescent on Vn with migration ratea ¯n(·, ·) rather thana ¯(·, ·). Notice that an is not necessarily double stochastic anymore, which turns the duality with the spatial coalescent into a Feynman-Kac duality where the Feynman-Kac term converges to 1, as n → ∞, on every finite time horizon. The Feynman-Kac duality reads V as follows (compare, for example, [Sei14, Proposition 3.11]): for all X0 ∈ U1 , Z t  FV,Vn Vn   Vn Vn  (D.3) E H(Xt , K0 ) = E H(X0, Kt ) exp A(Ks )ds , 0 where 0  X X 0 0  (D.4) A (π, ξ , r) := an(v , ξi) − 1 . i∈π v0∈V which is bounded along the path by |π|· Const, for all t ≥ 0. Establishing the Feynman-Kac duality requires to check that (compare, Section 4.4 in [EK86, Theorem 4.4.11]): (D.5) LFVH·, K)(X ) = LdualHX , ·(K) + A(K) · HX , K. This can be immediately verified by explicit calculation (compare, [GPW13, Section 4] for the generator calculation for the resampling part, and [Sei14, Proposition 3.11] for the generator calculation for Markov chains - here migration - whose transition matrix is not double stochastic). As for given n ∈ N, our dynamics consists of independent components outside Vn, we can apply Step 1, here with the Feynman-Kac duality, to conclude the well-posedness of the martingale problem with respect to LFV,Vn . 58 ANDREAS GREVEN 1, RONGFENG SUN 2, ANITA WINTER 3

V Step 3 (V countable) Fix X0 ∈ U1 . In this step we want to show that the family FV,Vn FV 1,2 {X ; n ∈ N} is tight, and that every limit satisfies the (L , Π , δX0 )-martingale problem. Observe first that the family of laws of the projection of the measures on mark space are tight since the localized state w.r.t. a fixed finite subset A of V has only finitely many marks and weight |A| (uniformly in n). We therefore will here ignore the marks and show tightness in Gromov-weak#-topology. For that we want to apply [EK86, Corollary 4.5.2]. FV,Vn Since an(v1, v2) → a(v1, v2) for all v1, v2 ∈ V , we clearly have that L Φ converges to FV FV,Vn FV L Φ, as Vn ↑ V , i.e., supX ∈ V |L Φ(X ) − L Φ(X )| −→ 0, for Φ depending only on U n→∞ finitely many sites, it remains to verify the compact containment condition, i.e., to show V that for every T > 0, and ε > 0 we can find a compact set KT,ε ⊆ U such that for all n ∈ N, FV,Vn  (D.6) P Xt ∈ KT,ε; for all t ∈ [0,T ] ≥ 1 − ε. For that purpose we will once more rely on the criterion for the compact containment condition which was developed in [GPW13, Proposition 2.22] for population dynamics. To see first that the criterion applies, notice that the evolving genealogies of interacting Fleming-Viot diffusions can be read off as a functional of the look-down construction given in [GLW05]. Thus the countable representation of the look-down defines a population dynamics. In particular, for each t ≥ 0, we can read off a representative (Xt, rt, µt) of FV Xt in the look-down graph such that ancestor-descendant relationship is well-defined. Denote for all t ≥ 0, s ∈ [0, t] and x ∈ Xt by At(x, s) ∈ Xt−s the ancestor of x ∈ Xt back at time s, and for J ⊆ Xs by Dt(J , s) ⊆ Xt the set of descendants of a point in J at time t. In the following we refer for each finite A ⊆ V to (D.7) X FV,·,A as the restriction of X FV,· to marks in A, i.e., obtained by considering the sampling measure A µ (dxdv) := 1Aµ(dxdv). Fix T > 0. We then have two show that the following are true for all A ⊆ V , • Tightness of number of ancestors. For all t ∈ [0,T ] and ε ∈ (0, t), the family Vn Vn {S2ε (Xt, rt, µt); n ∈ N} is tight, where S2ε (Xt, rt, µt) denotes the minimal number of balls of radius 2ε needed to cover Xt up to a set of µt-measure ε. • Bad sets can be controlled. For all ε ∈ (0,T ), there exists a δ = δ(ε > 0) FV,Vn such that for all s ∈ [0,T ), n ∈ N and σ(Xu ; u ∈ [0, s])-measurable drandom Vn Vn subsets J ⊆ Xs × A with µs(J ) ≤ δ,

Vn Vn   (D.8) lim sup P sup µt Dt (J , s) × A ≥  ≤ . n∈N t∈[s,T ]

(i) Fix t ∈ [0,T ]. W.l.o.g. we assume that Vn is a subgroup of V with addition +n ˜ Vn for each n ∈ N. We consider for each n ∈ N another spatial coalescent K on Vn with migration kernel a˜¯n(·, ·) rather thana ¯n(·, ·) where 0 X (D.9)a ˜n(v, v ) := a(v, y), 0 y∈V ;y∼nv

FV,Vn where ∼n denotes equivalence modulo +n. Further, denote by X˜ the evolving ge- nealogies of the interacting Fleming-Viot diffusions whose migration kernel isa ˜n(·, ·) rather Vn than an(·, ·). We prefer to work with K˜ . For this spatial coalescent it was verified in the proof of Proposition 3.4 in [GLW05] that for any time t > 0 the total number of partition ˜ Vn elements of Kt which are located in A is stochastically bounded uniformly in n ∈ N. As CONTINUUM SPACE LIMIT OF IFV GENEALOGIES 59

FV,Vn Vn the kernela ˜n(·, ·) is double stochastic, X˜ and K˜ are dual (without a Feynman-Kac ˜FV,Vn FV,Vn potential), and as dGP# (X , X ) → 0, as n → ∞, the claim follows. FV,Vn (ii) Fix T > 0, A ⊆ V finite, ε ∈ (0,T ), s ∈ [0,T ), n ∈ N and a σ(Xu ; u ∈ [0, s])- Vn measurable sets random subset J ⊆ Xs × A. From the generator characterization of FV,Vn Vn X , we can conclude that the process {µt(Dt(J , s)); t ≥ s} is a V -indexed system of interacting (measure-valued) Fisher-Wright diffusions. We have to find δ = δ(ε) such Vn that (D.8) holds if µs(J ) ≤ δ. Vn Vn Notice that {µt(Dt (J , s)×A); t ≥ s} is a semi-martingale and given by a martingale with continuous paths due to resampling plus a deterministic flow in and out of the set A due to migration. Therefore we have to control the fluctuation of the martingale part and the maximal flow out of the set A over a time interval of length t − s. The martingale part is estimated from below with Doob’s maximum inequality (the is bounded uniformly in the state and in n by a constant ·|A|, details are left to the reader). The deterministic out flow occurs at most at a finite rate c × |A| since the total mass of every site is one. This estimate is uniform in the parameter n (recall the random walk kernel is perturbed by restricting it to Vn). Similarly the flow into A can be bounded by c · |A| independently of n but the flow out of the set A occurs with a maximal rate c independently of n with n ≥ n0(A), and (D.10) c = 2 · max  max a(v0, v), max a(v0, v) . v0∈Ac,v∈A v0∈Ac,v∈A

Step 4 (Feller property). We next proof that X FV has the Feller property. From here it is standard to conclude that X FV satisfies the strong Markov property (see, for exam- (n) V (n) ple, [EK86, Theorem 4.2.7]). Consider a sequence (X0 )n∈N in U1 such that X −→ X0, n→∞ (n) # V FV,X FV,X0 Gromov-weak ly, for some X0 ∈ U1 . Denote by X 0 and X the evolving ge- (n) nealogies of the interacting Fleming-Viot diffusions started in X0 and X0, respectively, and let K be our tree-valued dual spatial coalescent, and H as in (1.18). Then for each given t ≥ 0,  (n)      (D.11) E H X0 , Kt Kt −→ E H X0, Kt Kt , a.s. n→∞ Thus, by our duality relation, (n)  FV,X0   (n)  E H(X , K0) = E H(X , Kt (D.12) t 0    FV,X0  −→ E H X0, Kt = E H Xt , K0) . n→∞ n n,φ (2) Recall from Theorem 1.8(i) that the family {H (·, K); n ∈ N, φ ∈ Cb(R+ ); K ∈ Sn} is (n) FV,X0 FV,X0 convergence determining. Thus it follows that Xt =⇒ Xt , for all t ≥ 0. n→∞ V Step 5 (Mark function) Fix T ≥ 0, and X0 ∈ Ufct. For the proof we will rely once more on the approximation of the solution of the FV 1,0 V (L , Π , δX0 )-martingale problem by Ufct-valued evolving genealogies of Moran mod- els, X M,ρ, where ρ > 0 is the local intensity of individuals. By the look-down construction given in [GLW05], we can define the family {X M,ρ; ρ > 0} on one and the same probability FV 1,0 space. Moreover, as the solution of the (L , Π , δX0 )-martingale problem has continu- M,ρ FV ous paths, due to Skorohod representation theorem we may assume that Xt → Xt , as ρ → ∞, uniformly for all t ∈ [0,T ], almost surely. FV 1,0 We will rely on Theorem 3.9 in [KL]. That is, as the solutions of the (L , Π , δX0 )– martingale problem have continuous paths almost surely, we have to construct for each 60 ANDREAS GREVEN 1, RONGFENG SUN 2, ANITA WINTER 3

finite A ⊆ V , for each t ∈ [0,T ], and , δ, ρ > 0 a function ht,,A ∈ H (recall from (1.10)) ρ ρ and a random measurable set Yt,,δ,A ⊆ Xt such that ρ ρ (D.13) lim sup lim sup P {µt(X \ Yt,,δ,A) ≤ ht,,A(δ), ∀t ∈ [0,T ]} ≥ 1 − . δ↓0 ρ→∞

Assume first that the geographic space V is finite. For the construction of such a   ρ function h,V = h ∈ H and a random measurable set Yρ,δ,V = Yρ,δ ⊆ Xt , we can proceed exactly as in the proof of Theorem 4.3 of [KL] where the statement is shown with mutation rather than migration in the non-spatial rather than the finite geographic space.

Let now V be countable, and consider a sequence (Vn)n∈N of finite sets with Vn ⊆ V and FV,Vn FV,Vn 1,0 Vn ↑ V . Consider for each n ∈ N a solution, X , of the (L , Π , δX0 )-martingale FV,Vn FV,Vn Vn problem with L as defined in (D.2). As we have seen above, Xt ∈ Ufct for all t ≥ 0, almost surely. Moreover, we have shown in Step 3 that each solution X FV of FV 1,0 FV,Vn the (L , Π , δX0 )-martingale problem on V can be obtained as the limit of X as FV V n → ∞. To conclude from here that also Xt ∈ Ufct for all t ≥ 0, almost surely, fix a FV,Vn,A Vn Vn finite set A ⊆ V . As done before we denote by X = (Xt, rt , µt (· ∩ (Xt ∩ A)))t≥0 FV,A FV,Vn FV and X = (Xt, rt, µt(· ∩ (X × A)))t≥0 the restrictions of X and X , respectively, to marks in A (compare (D.7)). For each m > n we couple X FV,Vn and X FV,Vm through the graphical lookdown con- struction by using the same Poisson point processes and marking every path which leaves Vn in the Vm dynamics by a 1. Moreover, we impose the rule that the 1 is inherited upon lookdown in the sense that both new particles carry type 1. The sampling measure of types then follows an interacting Fleming-Viot (in fact two-type Fisher-Wright) diffusion with selection. The corresponding Moran models are coupled and converge in the many particle per site limit to a limit evolution, which is the coupling on the finite geographic spaces and the additional types act upon resampling as under selection. 0 FV,Vn FV,Vm By construction, if x, x supp(µt(·×A)), their distance is the same in X and X if both carry type 0. Thus for suitably large n (depending on  > 0) such that A ⊆ Vn at any location in A the relative frequencies of types 1 at time t can be made less than any b given ε > 0 with probability ≥ 1 − ε by simple random walk estimates. Namely, if (Zt )t≥0 0 is a a(·, ·)-random walk starting in b ∈ Vn ⊆ Vm and b ∈ CVn, b b P (Zt ∈/ Vn for some t ∈ [0,T ],ZT ∈ A)(D.14) b0 b0 +P (Zt ∈ Vn for some t ∈ [0,T ],ZT ∈ A) ≤ δn → 0 as n → ∞ ∀ m ≥ n.

Then the expected frequency of type 1 in locations in A is bounded by F (δm) with F (δ) → 0 as δ → 0, which follows from the properties of the Fisher-Wright diffusion with selection easily via duality. As a consequence the supremum along the path of the difference in variational norm of the distance-mark distributions for the Vn and the Vm-evolution for types in the set A can be bounded by a sequence converging to 0 as n, m → ∞. Therefore also the limit dynamics on countable V has a mark function.

Proof of Theorem 1.16. Let X FV be the evolving genealogies of interacting Fleming-Viot diffusions where we have assumed that the symmetrized migration is recurrent. In order to prove we proceed in two steps: (1) We start with constructing the limiting FV object which is tree-valued spatial coalescent. (2) We then prove convergence of Xt to this V tree-valued spatial coalescent, at t → ∞, for any initial state X0 ∈ U1 . This immediately implies uniqueness of the invariant distribution. CONTINUUM SPACE LIMIT OF IFV GENEALOGIES 61

Step 1 (Tree-valued spatial coalescent) Recall from [GLW05] the spatial coalescent started with infinitely many partition elements per site and with migration mechanism a¯(v, v0). If the symmetrized migration is recurrent, we can assign to each realization a marked ultrametric space, K = (K, r), which admits a mark function. In order to equip it with a locally finite measure on the leaves, we consider a coupled family of sub-coalescents {K%, ρ > 0} such that the number of points of a given mark is Poisson with intensity ρ. If we now assign each point in Kρ mass ρ−1, then it follows from [GLW05, Theorem 3] that there exist a measure µ on K × V such that for each v ∈ V , µ(K × {v}) = 1. This reflects the spatial dust-free property. Thus, we can use the same arguments used in [GPW09, ρ −1 P Theorem 4] to show that the family {(K , ρ (x,v)∈Kρ×V δ(x,v)} is tight, and in fact has exactly one limit point, (D.15) K↓ := (K, r↓, µ↓)

V Step 2 (Convergence into the tree-valued spatial coalescent) For all X0 ∈ U1 and n,φ K0 ∈ S, by our duality relation, using the functions H = H from (1.18),

 FV    E H Xt , K0 = E H X0, Kt

K0  ↓  (D.16) −→ E φ (r (i, j))1≤i

n n,φ (2) Once more, as the family {H (·, K); n ∈ N, φ ∈ Cbb(R+ ), K ∈ S} is convergence FV determining by Theorem 1.8(i) , we can conclude that for all initial conditions, Xt converges Gromov-#-weakly to the tree-valued spatial coalescent.

Acknowledgements. R. Sun is supported by AcRF Tier 1 grant R-146-000-185-112. A. Greven was supported by the Deutsche Forschungsgemeinschaft DFG, Grant GR 876/15-1 und 15-2. A. Winter is supported by DFG SPP 1590.

References [Ald93] D. Aldous. The continuum random tree III. Ann. Probab., 21:248–289, 1993. [ALW] S. Athreya, W. L¨ohr,and A. Winter. The gap between Gromov-vague and Gromov-Hausdorff- vague topology. arXiv:1407.6309. [Arr] R. Arratia. Coalescing Brownian motions and the voter model on Z. Unpublished partial manuscript (1981). [AS11] S. Athreya and R. Sun. One-dimensional Voter model interface revisited. Electron. Commun. Probab., 16:792–800, 2011. [Bil89] P. Billingsley. Convergence of Probability Measures. John Wiley & Sons, 2nd edition, 1989. [CG86] J.T. Cox and D. Griffeath. Diffusive clustering in the two dimensional voter model. Ann. Probab., 14:347–370, 1986. [Daw77] D. A. Dawson. The critical measure diffusion process. Z. Wahrscheinlichkeitstheorie verw. Gebiete, 40:125–145, 1977. [Daw93] D.A. Dawson. Measure-valued Markov processes, volume 1541. Springer-Verlag, 1993. [DEF+00] P. Donnelly, S.N. Evans, K. Fleischmann, T.G. Kurtz, and X. Zhou. Continuum-sites stepping- stone models, coalescing exchangeable partitions and random trees. Ann. Probab., 28:1063– 1110, 2000. [DEF+02a] D. A. Dawson, A.M. Etheridge, K. Fleischmann, E.A. Perkins L. Mytnik, and J. Xiong. Mutually catalytic branching in the plane: Infinite measure states. Electron. Commun. Probab., 7(15):1–61, 2002. 62 ANDREAS GREVEN 1, RONGFENG SUN 2, ANITA WINTER 3

[DEF+02b] D. A. Dawson, A.M. Etheridge, K. Fleischmann, L. Mytnik, E.A. Perkins, and J. Xiong. Mutually catalytic branching in the plane: Finite measure states. Ann. of Probab., 30:1681– 1762, 2002. [DGP11] A. Depperschmidt, A. Greven, and P. Pfaffelhuber. Marked metric measure spaces. Electron. Commun. Probab., 16:174–188, 2011. [DGP12] A. Depperschmidt, A. Greven, and P. Pfaffelhuber. Tree-valued Fleming-Viot dynamics with mutation and selection. Ann. Appl. Probab., 22:2560–2615, 2012. [DGV95] D.A. Dawson, A. Greven, and J. Vaillancourt. Equilibria and quasi-equilibria for infinite sys- tems of Fleming-Viot processes. Trans. Amer. Math. Soc., 347:2277–2360, 1995. [DP91] D.A. Dawson and E. Perkins. Historical processes. Mem. Amer. Math. Soc., 93(454), 1991. [DP98] D. A. Dawson and E. Perkins. Longtime behavior and coexistence in a mutually catalytic branching model. Ann. Probab., 26(3):1088–1138, 1998. [DVJ03] D.J. Daley and D. Vere-Jones. An introduction to the theory of point processes. Vol. I. Ele- mentary theory and methods. Springer, New York, 2nd edition, 2003. [EF96] S. Evans and K. Fleischmann. Cluster formation in a stepping stone model with continuous, hierarchically structured sites. Ann. Probab., 24:1926–1952, 1996. [EK86] S.N. Ethier and T. Kurtz. Markov processes: characterization and convergence. John Wiley, New York, 1986. [EPW06] S.N. Evans, J. Pitman, and A. Winter. Rayleigh processes, real trees, and root growth with re-grafting. Probab. Theory Relat. Fields, 134:81–126, 2006. [Eva97] S. Evans. Coalescing markov labelled partitions and a continuous sites genetics model with infinitely many types. Ann. Inst. H. Poincar´eProbab. Statist., 33:339–358, 1997. [FG94] K. Fleischmann and A. Greven. Diffusive clustering in an infinite system of hierarchically interacting Fisher-Wright diffusions. Probab. Theory Related Fields, 98:517–566, 1994. [FG96] K. Fleischmann and A. Greven. Time–space analysis of the cluster formation in interacting diffusions. Electron. J. Probab., 1:1–46, 1996. [FINR04] L.R.G. Fontes, M. Isopi, C.M. Newman, and K. Ravishankar. The Brownian web: characteri- zation and convergence. Ann. Probab., 32:2857–2883, 2004. [FINR06] L.R.G. Fontes, M. Isopi, C.M. Newman, and K. Ravishankar. Coarsening, nucleation, and the marked Brownian web. Ann. Inst. H. Poincar´eProbab. Stat., 42:37–60, 2006. [GJ98] J.F. Le Gall and Y. Le Jan. Branching processes in L´evyprocesses: Laplace functionals of snakes and . Ann. Probab., 26:1407–1432, 1998. [GKW] A. Greven, A. Klimovsky, and A. Winter. Evolving genealogies of spatial Λ-Fleming-Viot processes with mutation. in preparation. [GLW05] A. Greven, V. Limic, and A. Winter. Representation theorems for interacting Moran mod- els, interacting Fisher–Wright diffusions and applications. Electron. J. Probab., 10:1286–1358, 2005. [GM] A. Greven and C. Mukherjee. Genealogies of spatial logistic branching models. in preparation. [GPW09] A. Greven, P. Pfaffelhuber, and A. Winter. Convergence in distribution of random metric measure spaces (the Lambda-coalescent measure tree). Probab. Theory Related Fields, 145:285– 322, 2009. [GPW13] A. Greven, P. Pfaffelhuber, and A. Winter. Tree-valued resampling dynamics: martingale problems and applications. Probab. Theory Related Fields, 155(3-4):789–838, 2013. [KL] S. Kliem and W. L¨ohr. Existence of mark functions in marked metric measure spaces. arXiv:1412.2039. 2 [Kle00] A. Klenke. Diffusive clustering of interacting Brownian motions on Z . Stochastic Process. Appl., 89:261–268, 2000. [KM59] S. Karlin and J. McGregor. Coincidence probabilities. Pacific J. Math., 9:1141–1164, 1959. [KS88] N. Konno and T. Shiga. Stochastic differential equations for some measure-valued diffusions. Probab. Theory Related Fields, 79:201–225, 1988. [Lig85] T.M. Liggett. Interacting Particle Systems. Springer, New York, 1985. Reprint of the 1985 original. Classics in Mathematics. Springer-Verlag, Berlin, 2005. [NRS05] C. M. Newman, K. Ravishankar, and R. Sun. Convergence of coalescing nonsimple random walks to the Brownian web. Electron. J. Probab., 10:21–60, 2005. [Rei00] D. Reimer. Proof of the van den Berg-Kesten conjecture. Combin. Probab. Comput., 9:27–32, 2000. [Sei14] P. Seidel. The historical process of the spatial Moran model with selection and mutation. Uni- versit¨atErlangen-N¨urnberg, Erlangen, 2014. Dissertation. [Spi76] F. Spitzer. Principles of Random Walk. Springer-Verlag, 2nd edition, 1976. CONTINUUM SPACE LIMIT OF IFV GENEALOGIES 63

[SS08] R. Sun and J.M. Swart. The Brownian net. Ann. Probab., 36:1153–1208, 2008. [TW98] B. Toth and W. Werner. The true self-repelling motion. Probab. Theory Related Fields, 111:375–452, 1998. [vdBK02] J. van den Berg and H. Kesten. Randomly coalescing random walk in dimension ≥ 3. In and out of equilibrium. Progr. Probab., 51:1–45, 2002. [Win02] A. Winter. Multiple scale analysis of spatial branching processes under the palm distribution. Electron. J. Probab., 7:1–74, 2002. [Zho03] X. Zhou. Clustering behavior of a continuous-sites stepping-stone model with Brownian mi- gration. Electron. J. Probab., 8:1–15, 2003.