
UP-DOWN ORDERED CHINESE RESTAURANT PROCESSES WITH TWO-SIDED IMMIGRATION, EMIGRATION AND DIFFUSION LIMITS

BY QUAN SHI1 AND MATTHIAS WINKEL2

1Mathematical Institute, University of Mannheim, Mannheim D-68131, Germany. [email protected]

2Department of Statistics, University of Oxford, 24–29 St Giles’, Oxford OX1 3LB, UK. [email protected]

We introduce a three-parameter family of up-down ordered Chinese restaurant processes $\mathrm{PCRP}^{(\alpha)}(\theta_1,\theta_2)$, α ∈ (0, 1), θ1, θ2 ≥ 0, generalising the two-parameter family of Rogers and Winkel. Our main result establishes self-similar diffusion limits, $\mathrm{SSIP}^{(\alpha)}(\theta_1,\theta_2)$-evolutions, generalising existing families of interval partition evolutions. We use the scaling limit approach to extend stationarity results to the full three-parameter family, identifying an extended family of Poisson–Dirichlet interval partitions. Their ranked sequence of interval lengths has Poisson–Dirichlet distribution with parameters α ∈ (0, 1) and θ := θ1 + θ2 − α ≥ −α, including for the first time the usual range of θ > −α rather than being restricted to θ ≥ 0. This has applications to Fleming–Viot processes, nested interval partition evolutions and tree-valued Markov processes, notably relying on the extended parameter range.

arXiv:2012.15758v1 [math.PR] 31 Dec 2020

MSC2020 subject classifications: Primary 60J80; secondary 60C05, 60F17.
Keywords and phrases: Poisson–Dirichlet distribution, interval partition, Chinese restaurant process, integer composition, self-similar process, branching with immigration and emigration.

1. Introduction. The primary purpose of this work is to study the weak convergence of a family of properly rescaled continuous-time Markov chains on integer compositions [22] and the limiting diffusions. Our results should be compared with the scaling limits of natural up-down Markov chains on branching graphs, which have received substantial focus in the literature [8, 35, 36]. In this language, our models take place on the branching graph of integer compositions and on its boundary, which was represented in [22] as a space of interval partitions. This paper establishes a proper scaling limit connection between discrete models [44] and their continuum analogues [14, 15, 18] in the generality of [47].

We consider a class of ordered Chinese restaurant processes with departures, parametrised by α ∈ (0, 1) and θ1, θ2 ≥ 0. This model is a continuous-time Markov chain (C(t), t ≥ 0) on vectors of positive integers, describing customer numbers of occupied tables arranged in a row. Suppose that at time zero there are k ∈ N = {1, 2, ...} occupied tables and that, for each i ≤ k, the i-th occupied table enumerated from left to right has $n_i$ ∈ N customers; then the initial state is $C(0) = (n_1, \ldots, n_k)$. New customers arrive as time proceeds, either taking a seat at an existing table or starting a new table, according to the following rule, illustrated in Figure 1:
• for each occupied table, say with m ∈ N customers, a new customer comes to join this table at rate m − α;
• at rate θ1, a new customer enters to start a new table to the left of the leftmost table;
• at rate θ2, a new customer begins a new table to the right of the rightmost table;
• between each pair of two neighbouring occupied tables, a new customer enters and begins a new table there at rate α.
We refer to the arrival of a customer as an up-step. Furthermore, each customer leaves at rate 1 (a down-step). By convention, the chain jumps from the null vector ∅ to state (1) at rate θ := θ1 + θ2 − α if θ > 0, and ∅ is an absorbing state if θ ≤ 0. At every time t ≥ 0, let C(t) be the vector of customer numbers at occupied tables, listed from left to right. In this way we have defined a continuous-time Markov chain (C(t), t ≥ 0). This process is referred to as a Poissonised up-down ordered Chinese restaurant process (PCRP) with parameters α, θ1 and θ2, denoted by $\mathrm{PCRP}^{(\alpha)}_{C(0)}(\theta_1,\theta_2)$.

FIG 1. The rates at which new customers arrive in a $\mathrm{PCRP}^{(\alpha)}(\theta_1,\theta_2)$.

This family of Markov chains is closely related to the well-known Chinese restaurant processes due to Dubins and Pitman (see e.g. [38]) and their ordered variants studied in [26, 39]. When θ2 = α, a $\mathrm{PCRP}^{(\alpha)}(\theta_1,\alpha)$ is studied in [44]. Notably, our generalisation includes cases θ = θ1 + θ2 − α ∈ (−α, 0), which did not arise in [26, 39, 44]. Though we focus on the range α ∈ (0, 1) in this paper, our model is clearly well-defined for α = 0 and it is straightforward to deal with this case; we include a discussion in Section 4.7.

To state our first main result, we represent PCRPs in a space of interval partitions. For M ≥ 0, an interval partition $\beta = \{U_i, i \in I\}$ of [0, M] is a (finite or countably infinite) collection of disjoint open intervals $U_i = (a_i, b_i) \subseteq (0, M)$, such that the (compact) set of partition points $G(\beta) := [0,M] \setminus \bigcup_{i\in I} U_i$ has zero Lebesgue measure. We refer to the intervals U ∈ β as blocks, to their lengths Leb(U) as their masses. We similarly refer to $\|\beta\| := \sum_{U\in\beta} \mathrm{Leb}(U)$ as the total mass of β. We denote by $\mathcal{I}_H$ the set of all interval partitions of [0, M] for all M ≥ 0. This space is equipped with the metric $d_H$ obtained by applying the Hausdorff metric to the sets of partition points: for every $\gamma, \gamma' \in \mathcal{I}_H$,

$$ d_H(\gamma,\gamma') := \inf\Big\{ r \ge 0 \colon G(\gamma) \subseteq \bigcup_{x\in G(\gamma')} (x-r, x+r),\ G(\gamma') \subseteq \bigcup_{x\in G(\gamma)} (x-r, x+r) \Big\}. $$
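For interval partitions with finitely many blocks, the metric just defined reduces to the Hausdorff distance between two finite sets of endpoints. The following Python sketch illustrates this special case only; the representation of a partition as a left-to-right list of `(a, b)` pairs and the function names are assumptions made for illustration, not notation from the paper.

```python
def partition_points(blocks):
    """Sorted partition points G(beta) of a finite interval partition.

    `blocks` lists the open intervals (a, b) from left to right; they are
    assumed to tile [0, M] up to their endpoints. The empty partition of
    [0, 0] has the single partition point 0.
    """
    points = {0.0}
    for a, b in blocks:
        points.add(float(a))
        points.add(float(b))
    return sorted(points)


def hausdorff_distance(points_1, points_2):
    """Hausdorff distance between two finite non-empty sets of reals."""
    def one_sided(xs, ys):
        # Smallest r such that every x lies within distance r of some y.
        return max(min(abs(x - y) for y in ys) for x in xs)

    return max(one_sided(points_1, points_2), one_sided(points_2, points_1))


def d_H(blocks_1, blocks_2):
    """d_H between two finite interval partitions given as block lists."""
    return hausdorff_distance(partition_points(blocks_1),
                              partition_points(blocks_2))
```

For instance, `d_H([(0, 2), (2, 3)], [(0, 1), (1, 3)])` compares the endpoint sets {0, 2, 3} and {0, 1, 3}. The quadratic-time double loop is chosen for readability, not efficiency.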

Although $(\mathcal{I}_H, d_H)$ is not complete, the induced topological space is Polish [16, Theorem 2.3]. For c > 0 and $\beta \in \mathcal{I}_H$, we define a scaling map by $c\beta := \{(ca, cb) \colon (a, b) \in \beta\}$.
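The PCRP dynamics described at the start of this introduction, together with the scaling map, can be explored numerically. Below is a minimal Gillespie-style sketch in Python, assuming a state is stored as a list of table sizes from left to right; the function names `jump_rates`, `simulate_pcrp` and `rescaled_pcrp`, and the choice of returning only the state at the final time, are illustrative assumptions rather than constructions from the paper.

```python
import random


def jump_rates(state, alpha, theta1, theta2):
    """All possible jumps from a non-empty state, with their rates."""
    k = len(state)
    events, rates = [], []
    for i, m in enumerate(state):
        events.append(("join", i))      # up-step at table i, rate m - alpha
        rates.append(m - alpha)
        events.append(("leave", i))     # each of the m customers leaves at rate 1
        rates.append(float(m))
    for j in range(k + 1):              # new table in gap j (0 = far left, k = far right)
        events.append(("new", j))
        rates.append(theta1 if j == 0 else theta2 if j == k else alpha)
    return events, rates


def simulate_pcrp(state, alpha, theta1, theta2, t_max, rng):
    """Run the chain up to time t_max and return the state at time t_max."""
    state, t = list(state), 0.0
    theta = theta1 + theta2 - alpha
    while True:
        if not state:
            if theta <= 0:
                return state            # the empty state is absorbing
            t += rng.expovariate(theta)
            if t > t_max:
                return state
            state = [1]                 # re-entry from the empty state at rate theta
            continue
        events, rates = jump_rates(state, alpha, theta1, theta2)
        t += rng.expovariate(sum(rates))
        if t > t_max:
            return state
        kind, i = rng.choices(events, weights=rates)[0]
        if kind == "join":
            state[i] += 1
        elif kind == "leave":
            state[i] -= 1
            if state[i] == 0:
                del state[i]            # an emptied table disappears
        else:
            state.insert(i, 1)


def rescaled_pcrp(state, n, t, alpha, theta1, theta2, rng):
    """Block masses of (1/n) C(2nt), the space-time scaling used for diffusion limits."""
    final = simulate_pcrp(state, alpha, theta1, theta2, 2 * n * t, rng)
    return [m / n for m in final]
```

A call such as `rescaled_pcrp([50, 50], 100, 1.0, 0.5, 0.2, 0.7, random.Random(1))` returns one sample of the rescaled block masses. Note that the rates in a state with n customers sum to 2n + θ, whatever the number of tables.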

We shall regard a PCRP (C(t), t ≥ 0) as a càdlàg process in $(\mathcal{I}_H, d_H)$, by identifying any vector of positive integers $(n_1, \ldots, n_k)$ with an interval partition of $[0, n_1 + \cdots + n_k]$:

$$ (n_1, \ldots, n_k) \longleftrightarrow \{(s_{i-1}, s_i),\ 1 \le i \le k\}, \quad\text{where } s_i = n_1 + \cdots + n_i \text{ (so that } s_0 = 0\text{)}. $$

We are now ready to state our main result, which is a limit theorem in distribution in the space of $\mathcal{I}_H$-valued càdlàg functions $D(\mathbb{R}_+, \mathcal{I}_H)$ with $\mathbb{R}_+ := [0, \infty)$, endowed with the $J_1$-Skorokhod topology (see e.g. [6] for background).

THEOREM 1.1. Let α ∈ (0, 1) and θ1, θ2 ≥ 0. For n ∈ N, let $(C^{(n)}(t), t \ge 0)$ be a $\mathrm{PCRP}^{(\alpha)}(\theta_1,\theta_2)$ starting from $C^{(n)}(0) = \gamma^{(n)}$. Suppose that the initial interval partitions $\frac{1}{n}\gamma^{(n)}$ converge in distribution to $\gamma \in \mathcal{I}_H$ as n → ∞, under $d_H$. Then there exists an $\mathcal{I}_H$-valued path-continuous process (β(t), t ≥ 0) starting from β(0) = γ, such that

(1) $$ \Big(\tfrac{1}{n} C^{(n)}(2nt),\ t \ge 0\Big) \underset{n\to\infty}{\longrightarrow} (\beta(t),\ t \ge 0), \quad\text{in distribution in } D(\mathbb{R}_+, \mathcal{I}_H). $$

Moreover, set $\zeta^{(n)} = \inf\{t \ge 0 \colon C^{(n)}(t) = \emptyset\}$ and $\zeta = \inf\{t \ge 0 \colon \beta(t) = \emptyset\}$ to be the respective first hitting times of ∅. If γ ≠ ∅, then (1) holds jointly with $\zeta^{(n)}/2n \to \zeta$, in distribution.

We call the limiting diffusion (β(t), t ≥ 0) on $\mathcal{I}_H$ an (α, θ1, θ2)-self-similar interval partition evolution, denoted by $\mathrm{SSIP}^{(\alpha)}(\theta_1,\theta_2)$. These processes are indeed self-similar with index 1, in the language of self-similar Markov processes [31], see also [29, Chapter 13]: if (β(t), t ≥ 0) is an $\mathrm{SSIP}^{(\alpha)}(\theta_1,\theta_2)$-evolution, then $(c\beta(c^{-1}t), t \ge 0)$ is an $\mathrm{SSIP}^{(\alpha)}(\theta_1,\theta_2)$-evolution starting from cβ(0), for any c > 0.

While most SSIP-evolutions have been constructed before [14, 15, 18, 47], in increasing generality, Theorem 1.1 is the first scaling limit result with an SSIP-evolution as its limit. In special cases, this was conjectured in [44]. In the following, we state some consequences and further developments. We relate to the literature in more detail in Section 1.1. We refer to Section 1.3 for applications particularly of the generality of the three-parameter family.

Our interval partition evolutions have multiple connections to squared Bessel processes. More precisely, a squared Bessel process Z = (Z(t), t ≥ 0) starting from Z(0) = m ≥ 0 and with “dimension” parameter δ ∈ R is the unique strong solution of the following equation:

$$ Z(t) = m + \delta t + 2\int_0^t \sqrt{|Z(s)|}\, dB(s), $$

where (B(t), t ≥ 0) is a standard Brownian motion. We refer to [23] for general properties of squared Bessel processes. Let $\zeta(Z) := \inf\{t \ge 0 \colon Z(t) = 0\}$ be the first hitting time of zero. To allow Z to re-enter (0, ∞) where possible after hitting 0, we define the lifetime of Z by

(2) $$ \overline{\zeta}(Z) := \begin{cases} \infty, & \text{if } \delta > 0, \\ \zeta(Z), & \text{if } \delta \le 0. \end{cases} $$

We write $\mathrm{BESQ}_m(\delta)$ for the law of a squared Bessel process Z with dimension δ starting from m, in the case δ ≤ 0 absorbed in ∅ at the end of its (finite) lifetime $\overline{\zeta}(Z)$. When δ ≤ 0, by our convention $\mathrm{BESQ}_0(\delta)$ is the law of the constant zero process.

In an $\mathrm{SSIP}^{(\alpha)}(\theta_1,\theta_2)$-evolution, informally speaking, each block evolves as BESQ(−2α), independently of other blocks [14, 15]. Meanwhile, there is always immigration of rate 2α between “adjacent blocks”, rate 2θ1 on the left [18] and rate 2θ2 on the right [47]. Moreover, the total mass process $(\|\beta(t)\|, t \ge 0)$ of any $\mathrm{SSIP}^{(\alpha)}(\theta_1,\theta_2)$-evolution (β(t), t ≥ 0) is $\mathrm{BESQ}_{\|\beta(0)\|}(2\theta)$ with θ := θ1 + θ2 − α. We discuss this more precisely in Section 4.3. We refer to |2θ| as the total immigration rate if θ > 0, and as the total emigration rate if θ < 0.

There are pseudo-stationary $\mathrm{SSIP}^{(\alpha)}(\theta_1,\theta_2)$-evolutions, which have fluctuating total mass but stationary interval length proportions, in the sense [15] of the following proposition, for a family $\mathrm{PDIP}^{(\alpha)}(\theta_1,\theta_2)$, α ∈ (0, 1), θ1, θ2 ≥ 0, of Poisson–Dirichlet interval partitions on the space $\mathcal{I}_{H,1} \subset \mathcal{I}_H$ of partitions of the unit interval. This family notably extends the subfamilies of [21, 39, 47], whose ranked sequences of interval lengths in the Kingman simplex

$$ \nabla_\infty := \Big\{ (x_1, x_2, \ldots) \colon x_1 \ge x_2 \ge \cdots \ge 0,\ \sum_{i\ge 1} x_i = 1 \Big\} $$

are members of the two-parameter family $\mathrm{PD}^{(\alpha)}(\theta)$, α ∈ (0, 1), θ ≥ 0, of Poisson–Dirichlet distributions. Here, we include new cases of interval partitions, for which θ ∈ (−α, 0), completing the usual range of the two-parameter family of $\mathrm{PD}^{(\alpha)}(\theta)$ with α ∈ (0, 1) of [38, Definition 3.3]. The further case θ1 = θ2 = 0 is degenerate, with $\mathrm{PDIP}^{(\alpha)}(0,0) = \delta_{\{(0,1)\}}$.

PROPOSITION 1.2 (Pseudo-stationarity). For α ∈ (0, 1) and θ1, θ2 ≥ 0, consider independently $\bar\gamma \sim \mathrm{PDIP}^{(\alpha)}(\theta_1,\theta_2)$ and a BESQ(2θ)-process (Z(t), t ≥ 0) with any initial distribution and parameter θ = θ1 + θ2 − α. Let (β(t), t ≥ 0) be an $\mathrm{SSIP}^{(\alpha)}(\theta_1,\theta_2)$-evolution starting from $\beta(0) = Z(0)\bar\gamma$. Fix any t ≥ 0; then β(t) has the same distribution as $Z(t)\bar\gamma$.
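The BESQ(2θ) total mass appearing in Proposition 1.2 can be motivated at the discrete level. The following LaTeX sketch is a heuristic generator computation for the total number of customers in a PCRP, using only the arrival and departure rates from the introduction; it is an informal consistency check under a second-order Taylor expansion, not a replacement for the rigorous argument referred to in Section 4.3, and the symbols $M$ and $X^{(n)}$ are introduced here for illustration.

```latex
% Total up- and down-rates of a PCRP in a state (n_1,\dots,n_k) with n = n_1+\dots+n_k \ge 1:
\[
  \underbrace{\sum_{i=1}^{k}(n_i-\alpha)}_{\text{join a table}}
  \;+\; \underbrace{\theta_1+\theta_2+(k-1)\alpha}_{\text{start a new table}}
  \;=\; n+\theta,
  \qquad \text{total down-rate} \;=\; n .
\]
% Hence M(t) := \|C(t)\| is a birth-death chain. For X^{(n)}(t) := n^{-1} M(2nt),
% a smooth test function f and x > 0 with nx \in \mathbb{N}, Taylor expansion gives
\[
  2n\Big[(nx+\theta)\big(f(x+\tfrac1n)-f(x)\big) + nx\big(f(x-\tfrac1n)-f(x)\big)\Big]
  \;=\; 2\theta f'(x) + 2x f''(x) + O(n^{-1}),
\]
% and 2\theta f' + 2x f'' is the generator of dZ = 2\theta\,dt + 2\sqrt{Z}\,dB, i.e. of BESQ(2\theta).
```

In particular, the number of tables k cancels from the total up-rate, which is why the total mass evolves autonomously.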

We refer to Definition 2.6 for a description of the probability distribution $\mathrm{PDIP}^{(\alpha)}(\theta_1,\theta_2)$ on unit interval partitions. We prove in Proposition 2.9 that $\mathrm{PDIP}^{(\alpha)}(\theta_1,\theta_2)$ gives the limiting block sizes in their left-to-right order of a new three-parameter family of composition structures in the sense of [22]. For the special case θ2 = α, which was introduced in [21, 39], we also write $\mathrm{PDIP}^{(\alpha)}(\theta_1) := \mathrm{PDIP}^{(\alpha)}(\theta_1,\alpha)$ and recall a construction in Definition 2.4. As in the case θ2 = α studied in [15, 18], we define an associated family of $\mathcal{I}_{H,1}$-valued evolutions via time-change and renormalisation (“de-Poissonisation”).

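DEFINITION 1.3 (De-Poissonisation and $\mathrm{IP}^{(\alpha)}(\theta_1,\theta_2)$-evolutions). Consider $\gamma \in \mathcal{I}_{H,1}$, let β := (β(t), t ≥ 0) be an $\mathrm{SSIP}^{(\alpha)}(\theta_1,\theta_2)$-evolution starting from γ and define a time-change function $\tau_\beta$ by

(3) $$ \tau_\beta(u) := \inf\Big\{ t \ge 0 \colon \int_0^t \|\beta(s)\|^{-1}\, ds > u \Big\}, \quad u \ge 0. $$

Then the process on $\mathcal{I}_{H,1}$ obtained from β via the following de-Poissonisation

$$ \overline{\beta}(u) := \big\|\beta(\tau_\beta(u))\big\|^{-1} \beta(\tau_\beta(u)), \quad u \ge 0, $$

is called a Poisson–Dirichlet (α, θ1, θ2)-interval partition evolution starting from γ, abbreviated as $\mathrm{IP}^{(\alpha)}(\theta_1,\theta_2)$-evolution.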
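The time-change in Definition 1.3 can be made concrete on a piecewise-constant path, where the integral in (3) is piecewise linear and can be inverted segment by segment. The Python sketch below is only an illustration under assumptions that are not part of the definition: the path is given by jump times and block-mass lists (as one might record from a simulated discrete approximation), every recorded state has positive total mass, and the last recorded state is taken to persist forever. The function name `depoissonise` is hypothetical.

```python
def depoissonise(times, states, u):
    """Evaluate the time-change tau(u) and the normalised state at time tau(u).

    times[j] is the start of the j-th constancy interval (times[0] == 0.0),
    states[j] the list of block masses held on [times[j], times[j + 1]).
    Assumption: the final state is held forever and all total masses are > 0.
    """
    accumulated = 0.0                          # integral of 1/mass up to times[j]
    for j, (start, state) in enumerate(zip(times, states)):
        mass = float(sum(state))
        if mass <= 0.0:
            raise ValueError("time-change needs strictly positive total mass")
        if j + 1 < len(times):
            segment = (times[j + 1] - start) / mass
            if accumulated + segment <= u:     # level u is not yet exceeded here
                accumulated += segment
                continue
        tau = start + (u - accumulated) * mass  # solve accumulated + (t - start)/mass = u
        return tau, [block / mass for block in state]
    raise ValueError("empty path")
```

For example, with `times = [0.0, 1.0, 3.0]` and `states = [[2], [1, 1, 2], [1]]`, the integral reaches 0.5 at time 1 and 1.0 at time 3, so `depoissonise(times, states, 0.75)` lands in the middle segment. Heavier states are traversed more slowly in the new time scale, as the factor `mass` in the last step shows.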

THEOREM 1.4. Let α ∈ (0, 1), θ1, θ2 ≥ 0. An $\mathrm{IP}^{(\alpha)}(\theta_1,\theta_2)$-evolution is a path-continuous Hunt process on $(\mathcal{I}_{H,1}, d_H)$, is continuous in the initial state and has a stationary distribution $\mathrm{PDIP}^{(\alpha)}(\theta_1,\theta_2)$.

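Define $\mathcal{H}$ to be the commutative unital algebra of functions on $\nabla_\infty$ generated by $q_k(x) = \sum_{i\ge 1} x_i^{k+1}$, k ≥ 1, and $q_0(x) = 1$. For every α ∈ (0, 1) and θ > −α, define an operator $\mathcal{B}_{\alpha,\theta} \colon \mathcal{H} \to \mathcal{H}$ by

$$ \mathcal{B}_{\alpha,\theta} := \sum_{i\ge 1} x_i \frac{\partial^2}{\partial x_i^2} - \sum_{i,j\ge 1} x_i x_j \frac{\partial^2}{\partial x_i \partial x_j} - \sum_{i\ge 1} (\theta x_i + \alpha) \frac{\partial}{\partial x_i}. $$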

It has been proved in [35] that there is a Markov process on $\nabla_\infty$ whose (pre-)generator on $\mathcal{H}$ is $\mathcal{B}_{\alpha,\theta}$, which shall be referred to as the Ethier–Kurtz–Petrov diffusion with parameter (α, θ), for short EKP(α, θ)-diffusion; moreover, $\mathrm{PD}^{(\alpha)}(\theta)$ is the unique invariant probability measure for EKP(α, θ). In [17], the following connection will be established.
• Let α ∈ (0, 1), θ1, θ2 ≥ 0 with θ1 + θ2 > 0. For an $\mathrm{IP}^{(\alpha)}(\theta_1,\theta_2)$-evolution $(\overline{\beta}(u), u \ge 0)$, list the lengths of intervals of $\overline{\beta}(u)$ in decreasing order in a sequence $W(u) \in \nabla_\infty$. Then the process (W(u/2), u ≥ 0) is an EKP(α, θ)-diffusion with θ := θ1 + θ2 − α > −α.

1.1. Connections with interval partition evolutions in the literature. The family of $\mathrm{SSIP}^{(\alpha)}(\theta_1,\theta_2)$-evolutions generalises the two-parameter model of [18]:
• for α ∈ (0, 1) and θ1 > 0, an $\mathrm{SSIP}^{(\alpha)}(\theta_1,\alpha)$-evolution is an (α, θ1)-self-similar interval partition evolution in the sense of [18], which we will also refer to as an $\mathrm{SSIP}^{(\alpha)}(\theta_1)$-evolution. The properties of these limiting processes have been proved in [18].
For this smaller class with θ2 = α, [44] provides a study of the family $\mathrm{PCRP}^{(\alpha)}(\theta_1,\alpha)$ and conjectures the existence of diffusion limits, which is thus confirmed by our Theorem 1.1. We conjecture that the convergence in Theorem 1.1 can be extended to the case where $G(\frac{1}{n}\gamma^{(n)})$ converges in distribution, with respect to the Hausdorff metric, to a compact set of positive Lebesgue measure; then the limiting process is a generalized interval partition evolution in the sense of [18, Section 4].

For the two-parameter case (θ2 = α), [43] obtains the scaling limits of a closely related family of discrete-time up-down ordered Chinese restaurant processes, in which at each time exactly one customer arrives according to probabilities proportional to the up-rates of $\mathrm{PCRP}^{(\alpha)}(\theta_1,\alpha)$, see Definition 2.1, and then one customer leaves uniformly, such that the number of customers remains constant. The method in [43] is to analyse the generator of the Markov processes, which is quite different from this work, and neither limit theorem implies the other. It is conjectured that the limits of [43] are $\mathrm{IP}^{(\alpha)}(\theta_1,\alpha)$-evolutions, and we further conjecture that this extends to the three-parameter setting of Definition 1.3.

An $\mathrm{SSIP}^{(\alpha)}(\theta_1,\theta_2)$-evolution β = (β(t), t ≥ 0) killed at its first hitting time ζ(β) of ∅ has been constructed in [47]. We denote this killed process by $\mathrm{SSIP}^{(\alpha)}_{\dagger}(\theta_1,\theta_2)$. For the $\mathrm{SSIP}^{(\alpha)}(\theta_1,\theta_2)$-evolution itself, there are three different phases, according to the parameter θ := θ1 + θ2 − α ≥ −α.
• θ ≥ 1: ζ(β) = ∞ a.s. In this case, $\mathrm{SSIP}^{(\alpha)}(\theta_1,\theta_2)$-evolutions have been constructed in [47], including a proof of Proposition 1.2 that is the key to the proof of Theorem 1.4.
• θ ∈ (0, 1): ζ(β) is a.s. finite with $\zeta(\beta) \overset{d}{=} \|\beta(0)\|/2G$, where G ∼ Gamma(1−θ, 1), the Gamma distribution with shape parameter 1 − θ and rate parameter 1. The construction in [47] does not cover this case in full. We will construct in Section 4.5 an $\mathrm{SSIP}^{(\alpha)}(\theta_1,\theta_2)$-evolution as a recurrent extension of $\mathrm{SSIP}^{(\alpha)}_{\dagger}(\theta_1,\theta_2)$-evolutions and study its properties.
• θ ∈ [−α, 0]: ∅ is an absorbing state, and hence an $\mathrm{SSIP}^{(\alpha)}_{\dagger}(\theta_1,\theta_2)$-evolution coincides with an $\mathrm{SSIP}^{(\alpha)}(\theta_1,\theta_2)$-evolution. In [47], we were unable to establish the pseudo-stationarity stated in Proposition 1.2 for this case. Our proof of Proposition 1.2 relies crucially on the convergence in Theorem 1.1.
Note that the law of ζ(β) and the phase transitions can be observed directly from the boundary behaviour at zero of the total mass process BESQ(2θ), see e.g. [23, Equation (13)].

1.2. $\mathrm{SSIP}^{(\alpha)}(\theta_1,\theta_2)$-excursions. When θ = θ1 + θ2 − α ∈ (0, 1), a $\mathrm{PCRP}^{(\alpha)}(\theta_1,\theta_2)$ is reflected at ∅. When θ = θ1 + θ2 − α ≤ 0, a $\mathrm{PCRP}^{(\alpha)}(\theta_1,\theta_2)$ is absorbed at ∅, and if the initial interval partitions $\frac{1}{n}\gamma^{(n)}$ converge in distribution to $\emptyset \in \mathcal{I}_H$ as n → ∞ under $d_H$, then the limiting process in Theorem 1.1 is the constant process that stays in ∅. In both cases we refine the discussion and establish the convergence of rescaled PCRP excursions to a non-trivial limit in the following sense.

THEOREM 1.5. Let α ∈ (0, 1), θ1, θ2 ≥ 0 and suppose that θ = θ1 + θ2 − α ∈ (−α, 1). Let (C(t), t ≥ 0) be a $\mathrm{PCRP}^{(\alpha)}(\theta_1,\theta_2)$ starting from state (1) and denote by $\mathrm{P}^{(n)}$ the law of the process $\big(C^{(n)}(t) := \frac{1}{n} C(2nt \wedge \zeta(C)),\ t \ge 0\big)$, where $\zeta(C) := \inf\{t \ge 0 \colon C(t) = \emptyset\}$. Then the following convergence holds vaguely under the Skorokhod topology:

$$ \frac{\Gamma(1+\theta)}{1-\theta}\, n^{1-\theta}\, \mathrm{P}^{(n)} \underset{n\to\infty}{\longrightarrow} \Theta, $$

where the limit Θ is a σ-finite measure on the space of continuous excursions on $\mathcal{I}_H$.

A description of the limit Θ is given in Section 4.4. We refer to Θ as the excursion measure of an $\mathrm{SSIP}^{(\alpha)}(\theta_1,\theta_2)$-evolution, which plays a crucial role in the construction of recurrent extensions mentioned above (when θ ∈ (0, 1)), as well as in the study of nested interval partition evolutions (when θ ∈ (−α, 0)) in Section 5.3.

1.3. Further applications. A remarkable feature of the three-parameter family is that it includes the emigration case with θ < 0; this cannot happen in the two-parameter case with θ2 = α where θ = θ1 ≥ 0, but it is naturally included by Petrov [35] in the unordered setting. The discrete approximation method developed in this work in particular permits us to understand pseudo-stationarity and SSIP-excursions in the emigration case, which has further interesting applications. We discuss a few in this paper, listed as follows.

1.3.1. Measure-valued processes with θ ∈ [−α, 0). In [19], we introduced a family of Fleming–Viot processes parametrised by α ∈ (0, 1), θ ≥ 0, taking values on the space $(\mathcal{M}^a_1, d_{\mathcal{M}})$ of all purely atomic probability measures on [0, 1] endowed with the Prokhorov distance. Our construction in [19] is closely related to our construction of interval partition evolutions. We can now extend this model to the case θ ∈ [−α, 0) and identify the desired stationary distribution, the two-parameter Pitman–Yor process, here exploiting the connection with an $\mathrm{SSIP}^{(\alpha)}(\theta + \alpha, 0)$-evolution. This is discussed in more detail in Section 5.1.

1.3.2. Nested interval partition evolutions. Let us recall a well-known identity [38, (5.24)] involving the two-parameter family $\mathrm{PD}^{(\alpha)}(\theta)$ and associated fragmentation operators. For $0 \le \bar\alpha \le \alpha < 1$ and $\bar\theta > 0$, let $(A_i, i \ge 1)$ be a random variable on $\nabla_\infty$ with distribution $\mathrm{PD}^{(\bar\alpha)}(\bar\theta)$, and let $(A'_{i,j}, j \ge 1)$, i ≥ 1, be an i.i.d. sequence of $\mathrm{PD}^{(\alpha)}(-\bar\alpha)$, further independent of $(A_i, i \ge 1)$. Then the decreasing rearrangement of the collection $A_i A'_{i,j}$, i, j ≥ 1, counted with multiplicities, has distribution $\mathrm{PD}^{(\alpha)}(\bar\theta)$. In other words, a $\mathrm{PD}^{(\bar\alpha)}(\bar\theta)$ fragmented by $\mathrm{PD}^{(\alpha)}(-\bar\alpha)$ has distribution $\mathrm{PD}^{(\alpha)}(\bar\theta)$. In Section 5.2, we extend this to the setting of our three-parameter family $\mathrm{PDIP}^{(\alpha)}(\theta_1,\theta_2)$ of interval partitions.

In Section 5.3, we study nested interval partition evolutions $(\beta_c, \beta_f)$, such that at every time t ≥ 0, the endpoints of the intervals in the coarse partition $\beta_c(t)$ form a subset of those in the fine partition $\beta_f(t)$. A particular case of our results establishes a dynamical and ordered version of this identity. Informally speaking, for $0 < \bar\alpha \le \alpha < 1$ and $\bar\theta, \theta_1, \theta_2 \ge 0$ with $\theta_1 + \theta_2 - \alpha = -\bar\alpha < 0$, we can find a coupling of stationary $\mathrm{IP}^{(\bar\alpha)}(\bar\theta)$- and $\mathrm{IP}^{(\alpha)}(\bar\theta)$-evolutions $\beta_c$ and $\beta_f$, such that at each time u ≥ 0, $\beta_f(u) \sim \mathrm{PDIP}^{(\alpha)}(\bar\theta)$ can be regarded as fragmenting each interval of $\beta_c(u) \sim \mathrm{PDIP}^{(\bar\alpha)}(\bar\theta)$ according to $\mathrm{PDIP}^{(\alpha)}(\theta_1,\theta_2)$. Finally, Section 5.4 extends Theorem 1.1 to the setting of nested PCRPs and nested interval partitions.

1.3.3. Connections with random trees. An initial motivation of this work was from studies of diffusions on a space of continuum trees. Aldous [1] introduced a Markov chain on the space of binary trees with n leaves, by removing a leaf uniformly and reattaching it to a random edge. This Markov chain has the uniform distribution as its stationary distribution. As n → ∞, Aldous conjectured that the limit of this Markov chain is a diffusion on continuum trees with stationary distribution given by the Brownian continuum random tree (CRT), i.e. the universal scaling limit of random discrete trees with finite vertex degree variance. Among different approaches [32, 13] investigating this problem, [13] describes the evolution via a consistent system of spines endowed with lengths and subtree masses, which relies crucially on interval partition evolutions.

This motivates us to construct stable Aldous diffusions, with stationary distributions given by stable Lévy trees with parameter ρ ∈ (1, 2) [12, 11], which are the infinite variance analogues of the Brownian CRT. The related Markov chains on discrete trees have been studied in [49]. A major obstacle for the study in the continuum is that the approaches in [32, 13] cannot be obviously extended from binary to multifurcating trees with unbounded degrees. The current work provides tools towards overcoming this difficulty; with these techniques one could further consider more general classes of continuum fragmentation trees, including the alpha-gamma models [9] or a two-parameter Poisson–Dirichlet family [24].

To demonstrate a connection of our work to continuum trees, recall the spinal decompositions developed in [24]. In a stable Lévy tree with parameter ρ ∈ (1, 2), there is a natural diffuse “uniform” probability measure on the leaves of the tree. Sample a leaf uniformly at random and consider its path to the root, called the spine. To decompose along this spine, we say that two vertices x, y of the tree are equivalent, if the paths from x and y to the root first meet the spine at the same point. Then equivalence classes are bushes of spinal subtrees rooted at the same spinal branch point; by deleting the branch point on the spine, the subtrees in an equivalence class form smaller connected components. The collection of equivalence classes is called the coarse spinal decomposition, and the collection of all subtrees is called the fine spinal decomposition. With $\bar\alpha = 1 - 1/\rho$ and α = 1/ρ, it is known [24, Corollary 10] that the decreasing rearrangement of masses of the coarse spinal decomposition has distribution $\mathrm{PD}^{(\bar\alpha)}(\bar\alpha)$; moreover, the mass sequence of the fine spinal decomposition is a $\mathrm{PD}^{(\alpha)}(\alpha - 1)$-fragmentation of the coarse one and has $\mathrm{PD}^{(\alpha)}(\bar\alpha)$ distribution.

Some variants of our aforementioned nested interval partition evolutions can be used to represent the mass evolution of certain spinal decompositions in a conjectured stable Aldous diffusion. The order structure provided by the interval partition evolutions also plays a crucial role: at the coarse level, the equivalence classes of spinal bushes are naturally ordered by the distance of the spinal branch points to the root; at the fine level, a total order of the subtrees in the same equivalence class aligns with the semi-planar structure introduced in [49], which is used to explore the evolutions of sizes of subtrees in a bush at a branch point. In Section 5.5, we give a broader example of a Markov chain related to random trees that converges to our nested SSIP-evolutions. The study of stable Aldous diffusions is, however, beyond the scope of the current paper and will be further investigated in future work.

1.4. Organisation of the paper. In Section 2 we generalise the two-parameter ordered Chinese restaurant processes to a three-parameter model and establish their connections to interval partitions and composition structures. In Section 3, we prove Theorem 1.1 in the two-parameter setting, building on [14, 15, 18, 44]. In Section 4, we study the three-parameter setting, both for processes absorbed in ∅ and for recurrent extensions, which we obtain by constructing excursion measures of $\mathrm{SSIP}^{(\alpha)}(\theta_1,\theta_2)$-evolutions. This section concludes with proofs of all results stated in this introduction. We finally turn to applications in Section 5.

2. Poisson–Dirichlet interval partitions. Throughout the rest of the paper, we fix a parameter α ∈ (0, 1). In this section we will introduce the three-parameter family of random interval partitions $\mathrm{PDIP}^{(\alpha)}(\theta_1,\theta_2)$, with θ1, θ2 ≥ 0, as the limit of a family of discrete-time ordered Chinese restaurant processes (without customers leaving).

2.1. Ordered Chinese restaurant processes in discrete time. For n ∈ N, let [n] := {1, 2, . . . , n}. Let $\mathcal{C} := \{(n_1, n_2, \ldots, n_k) \in \mathbb{N}^k,\ k \ge 1\} \cup \{\emptyset\}$. We view $\mathcal{C}$ as a space of integer compositions: for any n ≥ 0, the subset $\mathcal{C}_n := \{(n_1, \ldots, n_k) \in \mathcal{C} \colon n_1 + \cdots + n_k = n\}$ is the set of compositions of n. Recall that we view $\mathcal{C}$ as a subset of the space $\mathcal{I}_H$ of interval partitions introduced in the introduction. We still consider the metric $d_H$, and all operations and functions defined on $\mathcal{I}_H$ shall be inherited by $\mathcal{C}$. We also introduce the concatenation of a family of interval partitions $(\beta_a)_{a\in A}$, indexed by a totally ordered set $(A, \preceq)$:

$$ \mathop{\bigstar}_{a\in A} \beta_a := \big\{ (x + S_\beta(a-),\ y + S_\beta(a-)) \colon a \in A,\ (x, y) \in \beta_a \big\}, \quad\text{where } S_\beta(a-) := \sum_{b \prec a} \|\beta_b\|. $$

When A = {1, 2}, we denote this by $\beta_1 \star \beta_2$. Then each composition $(n_1, n_2, \ldots, n_k) \in \mathcal{C}$ is identified with the interval partition $\mathop{\bigstar}_{i=1}^{k} \{(0, n_i)\} \in \mathcal{I}_H$.

DEFINITION 2.1 ($\mathrm{oCRP}^{(\alpha)}(\theta_1,\theta_2)$ in discrete time). Let θ1, θ2 ≥ 0. We start with a customer 1 sitting at a table and new customers arrive one-by-one. Suppose that there are already n customers, then the (n + 1)-st customer is located by the following (α, θ1, θ2)-seating rule:
• If a table has m customers, then the new customer comes to join this table with probability (m − α)/(n + θ), where θ := θ1 + θ2 − α.
• The new customer may also start a new table: with probability θ1/(n + θ), the new customer begins a new table to the left of the leftmost table; with probability θ2/(n + θ), she starts a new table to the right of the rightmost table. Between each pair of two neighbouring occupied tables, a new table is started there with probability α/(n + θ).
At each step n ∈ N, the numbers of customers at the tables (ordered from left to right) form a composition $C(n) \in \mathcal{C}_n$. We will refer to (C(n), n ∈ N) as an ordered Chinese restaurant process with parameters α, θ1 and θ2, or $\mathrm{oCRP}^{(\alpha)}(\theta_1,\theta_2)$. We also denote the distribution of C(n) by $\mathrm{oCRP}^{(\alpha)}_n(\theta_1,\theta_2)$.
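The seating rule of Definition 2.1 translates directly into a sampler. The following Python sketch is a minimal illustration, assuming a composition is stored as a list of table sizes from left to right; the names `seating_weights` and `ocrp` are hypothetical and chosen only for this example.

```python
import random


def seating_weights(comp, alpha, theta1, theta2):
    """Moves available to the next customer, with unnormalised probabilities.

    The weights sum to n + theta with theta = theta1 + theta2 - alpha,
    where n is the current number of customers.
    """
    k = len(comp)
    events = [("join", i) for i in range(k)] + [("new", j) for j in range(k + 1)]
    weights = [m - alpha for m in comp]                  # join table with m customers
    weights += [theta1] + [alpha] * (k - 1) + [theta2]   # left end, inner gaps, right end
    return events, weights


def ocrp(n, alpha, theta1, theta2, rng):
    """Sample C(n), started from the composition (1)."""
    comp = [1]
    for _ in range(n - 1):
        events, weights = seating_weights(comp, alpha, theta1, theta2)
        kind, i = rng.choices(events, weights=weights)[0]
        if kind == "join":
            comp[i] += 1
        else:
            comp.insert(i, 1)    # a new table with one customer in gap i
    return comp
```

For example, `ocrp(1000, 0.5, 0.3, 0.8, random.Random(0))` returns one sample of C(1000); dividing its entries by 1000 gives approximate block masses of the limiting interval partition discussed in Section 2.2.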

In the degenerate case θ1 = θ2 = 0, an $\mathrm{oCRP}^{(\alpha)}(0,0)$ is simply a deterministic process C(n) = (n), n ∈ N. If we do not distinguish the location of the new tables, but build the partition of N that has i and j in the same block if the i-th and j-th customer sit at the same table, then this gives rise to the well-known (unordered) (α, θ)-Chinese restaurant process with θ := θ1 + θ2 − α; see e.g. [38, Chapter 3].

When θ1 = α, this model encompasses the family of composition structures studied in [21, Section 8] and [39]. In particular, they show that an $\mathrm{oCRP}^{(\alpha)}(\alpha,\theta_2)$ is a regenerative composition structure in the following sense.

DEFINITION 2.2 (Composition structures [21, 22]). A Markovian sequence of random compositions (C(n), n ∈ N) with $C(n) \in \mathcal{C}_n$ is a composition structure, if the following property is satisfied:
• Weak sampling consistency: for each n ∈ N, if we first distribute n + 1 identical balls into an ordered series of boxes according to C(n + 1) and then remove one ball uniformly at random (deleting an empty box if one is created), then the resulting composition $\widetilde{C}(n)$ has the same distribution as C(n).
Moreover, a composition structure is called regenerative, if it further satisfies
• Regeneration: for every n ≥ m, conditionally on the first block of C(n) having size m, the remainder is a composition in $\mathcal{C}_{n-m}$ with the same distribution as C(n − m).
For n ≥ m, let r(n, m) be the probability that the first block of C(n) has size m. Then (r(n, m), 1 ≤ m ≤ n) is called the decrement matrix of (C(n), n ∈ N).

LEMMA 2.3 ([39, Proposition 6]). For θ2 ≥ 0, an $\mathrm{oCRP}^{(\alpha)}(\alpha,\theta_2)$ (C(n), n ∈ N) starting from $(1) \in \mathcal{C}$ is a regenerative composition structure with decrement matrix

$$ r_{\theta_2}(n, m) := \binom{n}{m} \frac{(n-m)\alpha + m\theta_2}{n}\, \frac{\Gamma(m-\alpha)\,\Gamma(n-m+\theta_2)}{\Gamma(1-\alpha)\,\Gamma(n+\theta_2)}, \quad 1 \le m \le n. $$

For every $(n_1, n_2, \ldots, n_k) \in \mathcal{C}_n$, we have

$$ \mathrm{P}\big(C(n) = (n_1, n_2, \ldots, n_k)\big) = \prod_{i=1}^{k} r_{\theta_2}(N_{i:k}, n_i), \quad\text{where } N_{i:k} := \sum_{j=i}^{k} n_j. $$

Moreover, $\frac{1}{n} C(n)$ converges a.s. to a random interval partition $\bar\gamma \in \mathcal{I}_H$, under the metric $d_H$, as n → ∞.
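As a numerical sanity check on the formulas in Lemma 2.3, one can verify that each row of the decrement matrix sums to one and that the product formula defines a probability distribution on compositions of n. The Python sketch below does this under the assumption θ2 > 0 (so that every Gamma function is evaluated at a positive argument); the function names are illustrative only.

```python
from math import comb, exp, lgamma


def decrement(n, m, alpha, theta2):
    """r_{theta2}(n, m) from Lemma 2.3, assuming theta2 > 0 and 1 <= m <= n."""
    log_gamma_ratio = (lgamma(m - alpha) + lgamma(n - m + theta2)
                       - lgamma(1.0 - alpha) - lgamma(n + theta2))
    return comb(n, m) * ((n - m) * alpha + m * theta2) / n * exp(log_gamma_ratio)


def composition_probability(comp, alpha, theta2):
    """Product formula: peel off blocks from the left, using N_{i:k} as remaining size."""
    prob, remaining = 1.0, sum(comp)
    for block in comp:
        prob *= decrement(remaining, block, alpha, theta2)
        remaining -= block
    return prob


def compositions(n):
    """All compositions of n as tuples (the empty tuple for n = 0)."""
    if n == 0:
        yield ()
        return
    for first in range(1, n + 1):
        for rest in compositions(n - first):
            yield (first,) + rest
```

Summing `decrement(n, m, alpha, theta2)` over m from 1 to n, or `composition_probability` over `compositions(n)`, should return 1 up to floating-point error. Working with `lgamma` rather than `gamma` avoids overflow for larger n.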

The limit γ¯ is called a regenerative (α, θ2) interval partition in [21] and [39]. For β ∈ IH , the left-right reversal of β is

(4) $$ \mathrm{rev}(\beta) := \{ (\|\beta\| - b,\ \|\beta\| - a) \colon (a, b) \in \beta \} \in \mathcal{I}_H. $$

DEFINITION 2.4 ($\mathrm{PDIP}^{(\alpha)}(\theta_1)$). For θ1 ≥ 0, let $\bar\gamma$ be a regenerative (α, θ1) interval partition. Then the left-right reversal $\mathrm{rev}(\bar\gamma)$ is called a Poisson–Dirichlet (α, θ1) interval partition, whose law on $\mathcal{I}_H$ is denoted by $\mathrm{PDIP}^{(\alpha)}(\theta_1)$.

By the left-right reversal, it follows clearly from Lemma 2.3 that $\mathrm{PDIP}^{(\alpha)}(\theta_1)$ also describes the limiting proportions of customers at tables in an $\mathrm{oCRP}^{(\alpha)}(\theta_1,\alpha)$. We record from [15, Proposition 2.2(iv)] a decomposition for future usage: with independent B ∼ Beta(α, 1−α) and $\bar\gamma \sim \mathrm{PDIP}^{(\alpha)}(\alpha)$, we have

(5) $$ \{(0, 1-B)\} \star B\bar\gamma \sim \mathrm{PDIP}^{(\alpha)}(0). $$

To understand the distribution of $(C(n), n \in \mathbb{N}) \sim \mathrm{oCRP}^{(\alpha)}(\theta_1,\theta_2)$, let us present a decomposition as follows. Recall that there is an initial table with one customer at time 1. Let us distinguish this initial table from other tables. At time n ∈ N, we record the size of the initial table by $N_0^{(n)}$, the composition of the table sizes to the left of the initial table by $C_1^{(n)}$, and the composition to the right of the initial table by $C_2^{(n)}$. Then there is the identity

$$ C(n) = C_1^{(n)} \star \{(0, N_0^{(n)})\} \star C_2^{(n)}. $$

Let $(N_1^{(n)}, N_2^{(n)}) := (\|C_1^{(n)}\|, \|C_2^{(n)}\|)$. Then $(N_1^{(n)}, N_0^{(n)}, N_2^{(n)})$ can be described as a Pólya urn model with three colours. More precisely, when the current numbers of balls of the three colours are $(n_1, n_0, n_2)$, we next add a ball whose colour is chosen according to probabilities proportional to $n_1 + \theta_1$, $n_0 - \alpha$ and $n_2 + \theta_2$. Starting from the initial state (0, 1, 0), we get $(N_1^{(n)}, N_0^{(n)}, N_2^{(n)})$ after adding n − 1 balls. In other words, the vector $(N_1^{(n)}, N_0^{(n)} - 1, N_2^{(n)})$ has Dirichlet-multinomial distribution with parameters n − 1 and (θ1, 1 − α, θ2); i.e. for $n_1, n_0, n_2 \in \mathbb{N}_0$ with $n_0 \ne 0$ and $n_1 + n_0 + n_2 = n$,

(6) $$ p_n(n_1, n_0, n_2) := \mathrm{P}\big( (N_1^{(n)}, N_0^{(n)}, N_2^{(n)}) = (n_1, n_0, n_2) \big) = \frac{\Gamma(1-\alpha+\theta_1+\theta_2)}{\Gamma(1-\alpha)\,\Gamma(\theta_1)\,\Gamma(\theta_2)} \cdot \frac{(n-1)!\ \Gamma(n_0-\alpha)\,\Gamma(n_1+\theta_1)\,\Gamma(n_2+\theta_2)}{\Gamma(n-\alpha+\theta_1+\theta_2)\,(n_0-1)!\; n_1!\; n_2!}. $$

Furthermore, conditionally given $(N_1^{(n)}, N_0^{(n)}, N_2^{(n)})$, the compositions $C_1^{(n)}$ and $C_2^{(n)}$ are independent with distribution $\mathrm{oCRP}^{(\alpha)}_{N_1^{(n)}}(\theta_1,\alpha)$ and $\mathrm{oCRP}^{(\alpha)}_{N_2^{(n)}}(\alpha,\theta_2)$ respectively, for which there is an explicit description in Lemma 2.3, up to an elementary left-right reversal.

PROPOSITION 2.5. Let θ1, θ2 ≥ 0 and (C(n), n ∈ N) an $\mathrm{oCRP}^{(\alpha)}(\theta_1,\theta_2)$. Then for every $(n_1, n_2, \ldots, n_k) \in \mathcal{C}$ with $n = n_1 + n_2 + \cdots + n_k$, we have

$$ \mathrm{P}\big(C(n) = (n_1, \ldots, n_k)\big) = \sum_{i=1}^{k} \Big( p_n\big(N_{1:i-1}, n_i, N_{i+1:k}\big) \prod_{j=1}^{i-1} r_{\theta_1}\big(N_{1:j}, n_j\big) \prod_{j=i+1}^{k} r_{\theta_2}\big(N_{j:k}, n_j\big) \Big), $$

where $N_{i:j} = n_i + \cdots + n_j$, $p_n$ is given by (6) and $r_{\theta_1}$, $r_{\theta_2}$ are as in Lemma 2.3. Furthermore, (C(n), n ∈ N) is a composition structure in the sense that it is weakly sampling consistent.

PROOF. The distribution of C(n) is an immediate consequence of the decomposition explained above. To prove the weak sampling consistency, let us consider the decomposition $C(n+1) = C_1^{(n+1)} \star \{(0, N_0^{(n+1)})\} \star C_2^{(n+1)}$. By removing one customer uniformly at random (a down-step), we obtain in the obvious way a triple $(\widetilde{C}_1^{(n)}, \widetilde{N}_0^{(n)}, \widetilde{C}_2^{(n)})$, with the exception of the case when $N_0^{(n+1)} = 1$ and this customer is removed by the down-step: in the latter situation, to make sure that $\widetilde{N}_0^{(n)}$ is strictly positive, we choose the new marked table to be the nearest to the left with probability proportional to $\|C_1^{(n+1)}\|$, and the nearest to the right with the remaining probability, proportional to $\|C_2^{(n+1)}\|$, and we further decompose according to this new middle table to define $(\widetilde{C}_1^{(n)}, \widetilde{N}_0^{(n)}, \widetilde{C}_2^{(n)})$.

Therefore, for $n_1, n_0, n_2 \in \mathbb{N}_0$ with $n_0 \ne 0$ and $n_1 + n_0 + n_2 = n$, the probability of the event $\{ (\|\widetilde{C}_1^{(n)}\|, \widetilde{N}_0^{(n)}, \|\widetilde{C}_2^{(n)}\|) = (n_1, n_0, n_2) \}$ is

$$ p_{n+1}(n_1{+}1, n_0, n_2)\frac{n_1+1}{n+1} + p_{n+1}(n_1, n_0, n_2{+}1)\frac{n_2+1}{n+1} + p_{n+1}(n_1, n_0{+}1, n_2)\frac{n_0+1}{n+1} $$
$$ {} + p_{n+1}(n_1{+}n_0, 1, n_2)\frac{n_1+n_0}{n(n+1)}\, r_{\theta_1}(n_1{+}n_0, n_0) + p_{n+1}(n_1, 1, n_2{+}n_0)\frac{n_0+n_2}{n(n+1)}\, r_{\theta_2}(n_0{+}n_2, n_0), $$

where the meaning of each term should be clear. Straightforward calculation shows the sum is equal to $p_n(n_1, n_0, n_2)$.

The triple description above shows that, conditionally on $\|C_1^{(n+1)}\| = n_1$, $C_1^{(n+1)}$ has distribution $\mathrm{oCRP}^{(\alpha)}_{n_1}(\theta_1,\alpha)$. Conditionally on $(\|\widetilde{C}_1^{(n)}\|, \widetilde{N}_0^{(n)}, \|\widetilde{C}_2^{(n)}\|) = (n_1, n_0, n_2)$, we still have that $\widetilde{C}_1^{(n)} \sim \mathrm{oCRP}^{(\alpha)}_{n_1}(\theta_1,\alpha)$. This can be checked by looking at each situation: in the down-step, if the customer is removed from the marked table (with size ≥ 2) or the right part, then $\widetilde{C}_1^{(n)} = C_1^{(n+1)}$, which has distribution $\mathrm{oCRP}^{(\alpha)}_{n_1}(\theta_1,\alpha)$; if the customer is removed from the left part, then this is a consequence of the weak sampling consistency of $C_1^{(n+1)}$ given by Lemma 2.3; if the marked table has one customer and she is removed, then the claim holds because Lemma 2.3 yields that $C_1^{(n+1)}$ is regenerative. By similar arguments we have $\widetilde{C}_2^{(n)} \sim \mathrm{oCRP}^{(\alpha)}_{n_2}(\alpha,\theta_2)$. Summarising, we find that $(\widetilde{C}_1^{(n)}, \widetilde{N}_0^{(n)}, \widetilde{C}_2^{(n)})$ has the same distribution as $(C_1^{(n)}, N_0^{(n)}, C_2^{(n)})$. This completes the proof.

2.2. The three-parameter family $\mathrm{PDIP}^{(\alpha)}(\theta_1, \theta_2)$. Our next aim is to study the asymptotics of an $\mathrm{oCRP}^{(\alpha)}_n(\theta_1, \theta_2)$, as $n \to \infty$. Recall that Lemma 2.3 gives the result for the case $\theta_1 = \alpha$ with the limit distributions forming a two-parameter family, whose left-right reversals give the corresponding result for $\theta_2 = \alpha$. The latter limits were denoted by $\mathrm{PDIP}^{(\alpha)}(\theta_1)$, $\theta_1 \ge 0$. Let us now introduce a new three-parameter family of random interval partitions, generalising $\mathrm{PDIP}^{(\alpha)}(\theta_1)$. To this end, we extend the parameters of Dirichlet distributions to every $\alpha_1, \ldots, \alpha_m \ge 0$ with $m \in \mathbb{N}$: say $\alpha_{i_1}, \ldots, \alpha_{i_k} > 0$ and $\alpha_j = 0$ for any other $j \le m$, let $(B_{i_1}, \ldots, B_{i_k}) \sim \mathrm{Dir}(\alpha_{i_1}, \ldots, \alpha_{i_k})$ and $B_j := 0$ for all other $j$. Then we define $\mathrm{Dir}(\alpha_1, \ldots, \alpha_m)$ to be the law of $(B_1, \ldots, B_m)$. By convention $\mathrm{Dir}(\alpha) = \delta_1$ for any $\alpha \ge 0$.
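The extension of Dirichlet distributions to zero parameters can be sketched in a few lines. The following is a minimal illustration, not part of the paper: the function name `extended_dirichlet` is hypothetical, the sampler uses the standard normalised-Gamma construction for the strictly positive parameters, and raising an error when $m \ge 2$ and all parameters vanish is an assumption, since the text does not cover that case.

```python
import random


def extended_dirichlet(params, rng=random):
    """Sample (B_1, ..., B_m) from Dir(a_1, ..., a_m), allowing some a_i = 0.

    Hypothetical helper for illustration only. Coordinates with a zero
    parameter are set to 0; the remaining ones are Dirichlet distributed.
    """
    if not params or any(a < 0 for a in params):
        raise ValueError("parameters must be non-negative and non-empty")
    sample = [0.0] * len(params)
    if len(params) == 1:
        # Convention from the text: Dir(a) is the point mass at 1 for any a >= 0.
        sample[0] = 1.0
        return sample
    positive = [i for i, a in enumerate(params) if a > 0]
    if not positive:
        # Assumption: this case is not defined in the text, so we refuse it.
        raise ValueError("need at least one positive parameter when m >= 2")
    # Normalised independent Gamma(a_i, 1) variables are Dirichlet distributed.
    gammas = {i: rng.gammavariate(params[i], 1.0) for i in positive}
    total = sum(gammas.values())
    for i, g in gammas.items():
        sample[i] = g / total
    return sample
```

For instance, `extended_dirichlet([0.0, 0.5, 0.0])` puts all mass on the middle coordinate, which mirrors the degenerate case $\theta_1 = \theta_2 = 0$ of the triple $(B_1, B_0, B_2)$ used in the next definition.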

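DEFINITION 2.6 ($\mathrm{PDIP}^{(\alpha)}(\theta_1, \theta_2)$). For $\theta_1, \theta_2 \ge 0$, let $(B_1, B_0, B_2) \sim \mathrm{Dir}(\theta_1, 1-\alpha, \theta_2)$, $\bar\gamma_1 \sim \mathrm{PDIP}^{(\alpha)}(\theta_1)$, and $\bar\gamma_2 \sim \mathrm{PDIP}^{(\alpha)}(\theta_2)$, independent of each other. Let $\bar\gamma = B_1 \bar\gamma_1 \star \{(0, B_0)\} \star \mathrm{rev}(B_2 \bar\gamma_2)$. Then we call $\bar\gamma$ an $(\alpha, \theta_1, \theta_2)$-Poisson–Dirichlet interval partition with distribution $\mathrm{PDIP}^{(\alpha)}(\theta_1, \theta_2)$.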

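The case $\theta_1 = \theta_2 = 0$ is degenerate with $\mathrm{PDIP}^{(\alpha)}(0, 0) = \delta_{\{(0,1)\}}$. When $\theta_2 = \alpha$, we see e.g. from [39, Corollary 8] that $\mathrm{PDIP}^{(\alpha)}(\theta_1, \alpha)$ coincides with $\mathrm{PDIP}^{(\alpha)}(\theta_1)$ in Definition 2.4.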

LEMMA 2.7. Let $(C(n), n \in \mathbb{N}) \sim \mathrm{oCRP}^{(\alpha)}(\theta_1, \theta_2)$. Then $\frac{1}{n} C(n)$ converges a.s. to a random interval partition $\bar\gamma \sim \mathrm{PDIP}^{(\alpha)}(\theta_1, \theta_2)$, under the metric $d_H$, as $n \to \infty$.

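PROOF. We use the triple-description of $C(n)$ in the proof of Proposition 2.5. Consider independent $(R_1(n), n \in \mathbb{N}) \sim \mathrm{oCRP}^{(\alpha)}(\theta_1, \alpha)$, $(R_2(n), n \in \mathbb{N}) \sim \mathrm{oCRP}^{(\alpha)}(\alpha, \theta_2)$, and a Pólya urn model with three colours $(N_1^{(n)}, N_0^{(n)}, N_2^{(n)})$, $n \in \mathbb{N}$. Then we can write $C(n) = R_1(N_1^{(n)}) \star \{(0, N_0^{(n)})\} \star R_2(N_2^{(n)})$.

The asymptotics of a Pólya urn yield that $\frac{1}{n}(N_1^{(n)}, N_0^{(n)}, N_2^{(n)})$ converges a.s. to some $(B_1, B_0, B_2) \sim \mathrm{Dir}(\theta_1, 1-\alpha, \theta_2)$. By Lemma 2.3, there exist $\bar\gamma_1 \sim \mathrm{PDIP}^{(\alpha)}(\theta_1)$ and $\bar\gamma_2 \sim \mathrm{PDIP}^{(\alpha)}(\theta_2)$, independent of each other, such that $\frac{1}{n} R_1(n) \to \bar\gamma_1$ and $\frac{1}{n} R_2(n) \to \mathrm{rev}(\bar\gamma_2)$; both convergences hold a.s. as $n \to \infty$. Therefore, we conclude that $\frac{1}{n} C(n)$ converges a.s. to $(B_1 \bar\gamma_1) \star \{(0, B_0)\} \star (B_2\, \mathrm{rev}(\bar\gamma_2))$, which has distribution $\mathrm{PDIP}^{(\alpha)}(\theta_1, \theta_2)$ by definition.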
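The three-colour Pólya urn in this proof can be simulated directly. The sketch below rests on an assumption that is our reading of the seating rule rather than a statement from the proof: with $N_1, N_0, N_2$ customers in the left part, the marked table and the right part, the next customer joins them with weights $N_1 + \theta_1$, $N_0 - \alpha$ and $N_2 + \theta_2$ respectively, starting from $(0, 1, 0)$. The function name `polya_urn_proportions` is hypothetical.

```python
import random


def polya_urn_proportions(n, alpha, theta1, theta2, rng=random):
    """Simulate (N_1, N_0, N_2) / n for a three-colour Polya urn.

    Assumed weights (our reading of the decomposition, not quoted from the
    paper): left part N_1 + theta1, marked table N_0 - alpha, right part
    N_2 + theta2. The first customer sits at the marked table.
    """
    counts = [0, 1, 0]
    offsets = [theta1, -alpha, theta2]
    for _ in range(n - 1):
        weights = [c + o for c, o in zip(counts, offsets)]
        colour = rng.choices(range(3), weights=weights)[0]
        counts[colour] += 1
    return [c / n for c in counts]
```

Under this assumption the initial weights are $(\theta_1, 1-\alpha, \theta_2)$, so for large `n` the output should be close to a sample from $\mathrm{Dir}(\theta_1, 1-\alpha, \theta_2)$; averaging many runs gives a quick empirical check of the mean $\theta_1 / (\theta_1 + 1 - \alpha + \theta_2)$ of the first coordinate.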

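We now give some decompositions of $\mathrm{PDIP}^{(\alpha)}(\theta_1, \theta_2)$, extending [39, Corollary 8] to the three-parameter case. With independent $B \sim \mathrm{Beta}(1-\alpha+\theta_1, \theta_2)$, $\bar\gamma \sim \mathrm{PDIP}^{(\alpha)}(\theta_1, 0)$, and $\bar\beta \sim \mathrm{PDIP}^{(\alpha)}(\alpha, \theta_2)$, it follows readily from Definition 2.6 that
\[
B \bar\gamma \star (1-B) \bar\beta \sim \mathrm{PDIP}^{(\alpha)}(\theta_1, \theta_2).
\]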

When θ1 ≥ α, we also have a different decomposition as follows.

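COROLLARY 2.8. Suppose that $\theta_1 \ge \alpha$. With independent $B' \sim \mathrm{Beta}(\theta_1 - \alpha, \theta_2)$, $\bar\gamma \sim \mathrm{PDIP}^{(\alpha)}(\theta_1, 0)$, and $\bar\beta \sim \mathrm{PDIP}^{(\alpha)}(\alpha, \theta_2)$, we have
\[
B' \bar\gamma \star (1-B') \bar\beta \sim \mathrm{PDIP}^{(\alpha)}(\theta_1, \theta_2), \tag{7}
\]
and $\bar\gamma \overset{d}{=} V' \bar\gamma_1 \star \{(0, 1-V')\}$ for independent $V' \sim \mathrm{Beta}(\theta_1, 1-\alpha)$ and $\bar\gamma_1 \sim \mathrm{PDIP}^{(\alpha)}(\theta_1)$.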

PROOF. Consider an $\mathrm{oCRP}^{(\alpha)}(\theta_1, \theta_2)$. For the initial table, we colour it in red with probability $(\theta_1 - \alpha)/(\theta_1 - \alpha + \theta_2)$. If it is not coloured in red, then each time a new table arrives to the left of the initial table, we flip an unfair coin with success probability $1 - \alpha/\theta_1$ and colour the new table in red at the first success. In this way, we separate the composition at every step into two parts: the tables to the left of the red table (with the red table included), and everything to the right of the red table. It is easy to see that the sizes of the two parts follow a Pólya urn such that the asymptotic proportions follow $\mathrm{Dir}(\theta_1 - \alpha, \theta_2)$. Moreover, conditionally on the sizes of the two parts, they are independent $\mathrm{oCRP}^{(\alpha)}(\theta_1, 0)$ and $\mathrm{oCRP}^{(\alpha)}(\alpha, \theta_2)$ respectively. Now the claim follows from Lemma 2.7 and Definition 2.6.

An ordered version [37] of Kingman's paintbox processes is described as follows. Let $\gamma \in \mathcal{I}_{H,1}$ and $(Z_i, i \in \mathbb{N})$ be i.i.d. uniform random variables on $[0, 1]$. Then customers $i$ and $j$ sit at the same table if and only if $Z_i$ and $Z_j$ fall in the same block of $\gamma$. Moreover, the tables are ordered by their corresponding intervals. For any $n \in \mathbb{N}$, the first $n$ variables $(Z_i, i \in [n])$ give rise to a composition of the set $[n]$, i.e. an ordered family of disjoint subsets of $[n]$:
\[
C^*_\gamma(n) = \{ B_U(n) \colon B_U(n) \neq \emptyset,\ U \in \gamma \}, \quad \text{where } B_U(n) := \{ j \le n \colon Z_j \in (\inf U, \sup U) \}. \tag{8}
\]
Let $P_{n,\gamma}$ be the distribution of the random composition of $n$ induced by $C^*_\gamma(n)$. The following statement shows that the composition structure induced by an ordered CRP is a mixture of ordered Kingman paintbox processes.

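PROPOSITION 2.9. The probability measure $\mathrm{PDIP}^{(\alpha)}(\theta_1, \theta_2)$ is the unique probability measure on $\mathcal{I}_H$ such that there is the identity
\[
\mathrm{oCRP}^{(\alpha)}_n(\theta_1, \theta_2)(A) = \int_{\mathcal{I}_H} P_{n,\gamma}(A)\, \mathrm{PDIP}^{(\alpha)}(\theta_1, \theta_2)(d\gamma), \quad \forall n \in \mathbb{N},\ \forall A \subseteq \mathcal{C}_n.
\]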

PROOF. Since an $\mathrm{oCRP}^{(\alpha)}(\theta_1, \theta_2)$ is a composition structure by Proposition 2.5 and since the renormalised $\mathrm{oCRP}^{(\alpha)}_n(\theta_1, \theta_2)$ converges weakly to $\mathrm{PDIP}^{(\alpha)}(\theta_1, \theta_2)$ as $n \to \infty$ by Lemma 2.7, the statement follows from [22, Corollary 12].
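The ordered paintbox construction (8) is easy to express in code. The following is a minimal sketch for a finite interval partition; the function name `ordered_paintbox`, the representation of $\gamma$ as a list of `(left, right)` pairs, and the optional `samples` argument (to fix the uniforms $Z_i$) are all illustrative choices rather than notation from the paper.

```python
import random


def ordered_paintbox(gamma, n, samples=None, rng=random):
    """Return the composition of the set {1, ..., n} induced by gamma.

    gamma: list of disjoint open intervals (left, right) inside [0, 1].
    samples: optional list of the uniform variables Z_1, ..., Z_n.
    Blocks are listed in the left-to-right order of their intervals.
    """
    if samples is None:
        samples = [rng.random() for _ in range(n)]
    blocks = []
    for left, right in sorted(gamma):
        # B_U(n) = { j <= n : Z_j in (inf U, sup U) }, customers labelled from 1.
        block = [j + 1 for j, z in enumerate(samples[:n]) if left < z < right]
        if block:
            blocks.append(block)
    return blocks
```

Taking block sizes, e.g. `[len(b) for b in ordered_paintbox(gamma, n)]`, gives the induced integer composition of $n$ whose law is $P_{n,\gamma}$.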

REMARK. If we label the customers by $\mathbb{N}$ in an $\mathrm{oCRP}^{(\alpha)}(\theta_1, \theta_2)$ defined in Definition 2.1, then we also naturally obtain a composition of the set $[n]$ when $n$ customers have arrived. However, it does not have the same law as the $C^*_{\bar\gamma}(n)$ obtained from the paintbox in (8) with $\bar\gamma \sim \mathrm{PDIP}^{(\alpha)}(\theta_1, \theta_2)$, though we know from Proposition 2.9 that their induced integer compositions of $n$ have the same law. Indeed, $C^*_{\bar\gamma}(n)$ obtained by the paintbox is exchangeable [22], but it is easy to check that an $\mathrm{oCRP}^{(\alpha)}(\theta_1, \theta_2)$ with general parameters is not, the only exception being $\theta_1 = \theta_2 = \alpha$.

Recall that $\mathrm{PD}^{(\alpha)}(\theta)$ denotes the Poisson–Dirichlet distribution on the Kingman simplex $\nabla_\infty$.

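PROPOSITION 2.10. Let $\theta_1, \theta_2 \ge 0$ with $\theta_1 + \theta_2 > 0$. The ranked interval lengths of a $\mathrm{PDIP}^{(\alpha)}(\theta_1, \theta_2)$ have $\mathrm{PD}^{(\alpha)}(\theta)$ distribution on $\nabla_\infty$ with $\theta := \theta_1 + \theta_2 - \alpha$.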

PROOF. Let $C^{(n)} \sim \mathrm{oCRP}^{(\alpha)}_n(\theta_1, \theta_2)$, then it follows immediately from its construction that $C^{(n)}$ ranked in decreasing order is an unordered $(\alpha, \theta)$-Chinese restaurant process with $\theta := \theta_1 + \theta_2 - \alpha > -\alpha$. As a consequence of Lemma 2.7, the ranked interval lengths of a $\mathrm{PDIP}^{(\alpha)}(\theta_1, \theta_2)$ have the same distribution as the limit of $(\alpha, \theta)$-Chinese restaurant processes, which is known to be $\mathrm{PD}^{(\alpha)}(\theta)$.
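The reduction to an unordered $(\alpha, \theta)$-Chinese restaurant process can be checked by adding up the up-step weights of the ordered model from the introduction: a table with $m$ customers has weight $m - \alpha$, the leftmost and rightmost gaps have weights $\theta_1$ and $\theta_2$, and each gap between neighbouring tables has weight $\alpha$. The sketch below uses hypothetical function names (`upstep_weights`, `up_step`) and is only meant to illustrate this bookkeeping.

```python
import random


def upstep_weights(composition, alpha, theta1, theta2):
    """Up-step weights for a composition (n_1, ..., n_k), k >= 1.

    Returns (join, new_table): join[i] is the weight of joining table i,
    new_table[g] the weight of opening a table in gap g, with gaps
    numbered 0 (far left) to k (far right).
    """
    k = len(composition)
    join = [m - alpha for m in composition]
    new_table = [theta1] + [alpha] * (k - 1) + [theta2]
    return join, new_table


def up_step(composition, alpha, theta1, theta2, rng=random):
    """Perform one up-step and return the new composition as a tuple."""
    join, new_table = upstep_weights(composition, alpha, theta1, theta2)
    k = len(composition)
    move = rng.choices(range(2 * k + 1), weights=join + new_table)[0]
    comp = list(composition)
    if move < k:
        comp[move] += 1          # join an existing table
    else:
        comp.insert(move - k, 1)  # open a new table in gap move - k
    return tuple(comp)
```

With $n$ customers at $k$ tables, the total weight of opening a new table is $\theta_1 + \theta_2 + (k-1)\alpha = \theta + k\alpha$ and the overall total is $n + \theta$, which are the familiar seating weights of the unordered $(\alpha, \theta)$ model once the order of tables is forgotten.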

3. Proof of Theorem 1.1 when $\theta_2 = \alpha$. We first recall in Section 3.1 the construction and some basic properties of the two-parameter family of $\mathrm{SSIP}^{(\alpha)}(\theta_1)$-evolutions from [14, 15, 18], and then prove that they are the diffusion limits of the corresponding $\mathrm{PCRP}^{(\alpha)}(\theta_1, \alpha)$, in Sections 3.4 and 3.5 for $\theta_1 = 0$ and for $\theta_1 \ge 0$ in general, respectively, thus proving Theorem 1.1 for the case $\theta_2 = \alpha$. The proofs rely on a representation of $\mathrm{PCRP}^{(\alpha)}(\theta_1, \alpha)$ by Rogers and Winkel [44] that we recall in Section 3.2 and an in-depth investigation of a positive-integer-valued Markov chain in Section 3.3.

3.1. Preliminaries: $\mathrm{SSIP}^{(\alpha)}(\theta_1)$-evolutions. In this section, we recall the scaffolding-and-spindles construction and some basic properties of an $(\alpha, \theta_1)$ self-similar interval partition evolution, $\mathrm{SSIP}^{(\alpha)}(\theta_1)$. The material is collected from [14, 15, 18].

Let $\mathcal{E}$ be the space of non-negative càdlàg excursions away from zero. Then for any $f \in \mathcal{E}$, we have $\zeta(f) := \inf\{t > 0 \colon f(t) = 0\} = \sup\{t \ge 0 \colon f(t) > 0\}$. We will present the construction of SSIP-evolutions via the following skewer map introduced in [14].

DEFINITION 3.1 (Skewer). Let $N = \sum_{i \in I} \delta(t_i, f_i)$ be a point measure on $\mathbb{R}_+ \times \mathcal{E}$ and $X$ a càdlàg process such that
\[
\sum_{\Delta X(t) > 0} \delta(t, \Delta X(t)) = \sum_{i \in I} \delta(t_i, \zeta(f_i)).
\]
The skewer of the pair $(N, X)$ at level $y$ is (when well-defined) the interval partition
\[
\mathrm{SKEWER}(y, N, X) := \{ (M^y(t-), M^y(t)) \colon M^y(t-) < M^y(t),\ t \ge 0 \}, \tag{9}
\]
where $M^y(t) = \int_{[0,t] \times \mathcal{E}} f\big(y - X(s-)\big)\, N(ds, df)$. Denote the process by
\[
\mathrm{SKEWER}(N, X) := (\mathrm{SKEWER}(y, N, X),\ y \ge 0).
\]
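For a point measure with finitely many atoms, the skewer at a fixed level can be computed by a single pass over the spindles. The sketch below is illustrative only: it assumes each spindle is supplied as a pair `(start_level, f)`, where `start_level` plays the role of $X(t_i-)$ and `f` is a Python function that vanishes outside the open lifetime interval, and it assumes the pairs are listed in increasing order of the times $t_i$. The names `skewer` and `bump` are hypothetical.

```python
def bump(lifetime):
    """A toy spindle: f(z) = z * (lifetime - z) on (0, lifetime), else 0."""
    def f(z):
        return z * (lifetime - z) if 0.0 < z < lifetime else 0.0
    return f


def skewer(y, spindles):
    """Interval partition at level y for finitely many spindles.

    spindles: list of (start_level, f) in increasing time order, where
    start_level stands for X(t_i-) and f for the excursion f_i.
    Each spindle alive at level y contributes an interval whose length
    is f(y - start_level), laid out from left to right.
    """
    intervals = []
    left = 0.0
    for start_level, f in spindles:
        mass = f(y - start_level)
        if mass > 0:
            intervals.append((left, left + mass))
            left += mass
    return intervals
```

For example, `skewer(1.0, [(0.0, bump(2.0)), (1.5, bump(1.0)), (0.5, bump(3.0))])` skips the second spindle, which starts above level 1, and returns two adjacent intervals for the other two.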

FIG 2. A scaffolding with marks (atom size evolutions as spindle-shapes and allelic types from a colour spectrum coded by [0, 1]) and the skewer and superskewer (see Definition 5.1) at level y, not to scale.

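Let $\theta \in (-1, 1)$. We know from [23] that $\mathrm{BESQ}(2\theta)$ has an exit boundary at zero. Pitman and Yor [40, Section 3] construct a $\sigma$-finite excursion measure $\Lambda^{(2\theta)}_{\mathrm{BESQ}}$ associated with $\mathrm{BESQ}(2\theta)$ on the space $\mathcal{E}$, such that
\[
\Lambda^{(2\theta)}_{\mathrm{BESQ}}(\zeta > y) := \Lambda^{(2\theta)}_{\mathrm{BESQ}}\{ f \in \mathcal{E} \colon \zeta(f) > y \} = \frac{2^{\theta-1}}{\Gamma(2-\theta)}\, y^{-1+\theta}, \quad y > 0, \tag{10}
\]
and under $\Lambda^{(2\theta)}_{\mathrm{BESQ}}$, conditional on $\{\zeta = y\}$ for $0 < y < \infty$, the excursion is a squared Bessel bridge from 0 to 0 of length $y$, see [42, Section 11.3]. [40, Section 3] offers several other equivalent descriptions of $\Lambda^{(2\theta)}_{\mathrm{BESQ}}$; see also [14, Section 2.3].

For $\alpha \in (0, 1)$, let $\mathbf{N}$ be a Poisson random measure on $\mathbb{R}_+ \times \mathcal{E}$ with intensity $c_\alpha \mathrm{Leb} \otimes \Lambda^{(-2\alpha)}_{\mathrm{BESQ}}$, denoted by $\mathrm{PRM}(c_\alpha \mathrm{Leb} \otimes \Lambda^{(-2\alpha)}_{\mathrm{BESQ}})$, where
\[
c_\alpha := 2\alpha(1+\alpha)/\Gamma(1-\alpha). \tag{11}
\]
Each atom of $\mathbf{N}$, which is an excursion function in $\mathcal{E}$, shall be referred to as a spindle, in view of the illustration of $\mathbf{N}$ in Figure 2. We pair $\mathbf{N}$ with a scaffolding function $\xi_{\mathbf{N}} := (\xi_{\mathbf{N}}(t), t \ge 0)$ defined by
\[
\xi_{\mathbf{N}}(t) := \lim_{z \downarrow 0} \left( \int_{[0,t] \times \{ g \in \mathcal{E} \colon \zeta(g) > z \}} \zeta(f)\, \mathbf{N}(ds, df) - \frac{(1+\alpha)\, t}{(2z)^{\alpha}\, \Gamma(1-\alpha)\Gamma(1+\alpha)} \right). \tag{12}
\]
This is a spectrally positive stable Lévy process of index $(1+\alpha)$, with Lévy measure $c_\alpha \Lambda^{(-2\alpha)}_{\mathrm{BESQ}}(\zeta \in dy)$ and Laplace exponent $(2^{-\alpha} q^{1+\alpha}/\Gamma(1+\alpha),\ q \ge 0)$.

For $x > 0$, let $f \sim \mathrm{BESQ}_x(-2\alpha)$, independent of $\mathbf{N}$. Write $\mathrm{Clade}_x(\alpha)$ for the law of a clade of initial mass $x$, which is a random point measure on $\mathbb{R}_+ \times \mathcal{E}$ defined by
\[
\mathrm{CLADE}(f, \mathbf{N}) := \delta(0, f) + \mathbf{N}\big|_{(0, T_{-\zeta(f)}(\xi_{\mathbf{N}})] \times \mathcal{E}}, \quad \text{where } T_{-y}(\xi_{\mathbf{N}}) := \inf\{ t \ge 0 \colon \xi_{\mathbf{N}}(t) = -y \}.
\]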

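DEFINITION 3.2 ($\mathrm{SSIP}^{(\alpha)}(0)$-evolution). For $\gamma \in \mathcal{I}_H$, let $(\mathbf{N}_U, U \in \gamma)$ be a family of independent clades, with each $\mathbf{N}_U \sim \mathrm{Clade}_{\mathrm{Leb}(U)}(\alpha)$. An $\mathrm{SSIP}^{(\alpha)}(0)$-evolution starting from $\gamma \in \mathcal{I}_H$ is a process distributed as $\beta = (\beta(y), y \ge 0)$ defined by
\[
\beta(y) := \mathop{\star}_{U \in \gamma} \mathrm{SKEWER}(y, \mathbf{N}_U, \xi(\mathbf{N}_U)), \quad y \ge 0.
\]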

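We now turn to the case $\theta_1 > 0$. Let $\mathbf{N}$ be a $\mathrm{PRM}(c_\alpha \mathrm{Leb} \otimes \Lambda^{(-2\alpha)}_{\mathrm{BESQ}})$ and $X_\alpha = \xi_{\mathbf{N}}$ its scaffolding. Define the modified scaffolding process
\[
X_{\theta_1}(t) := X_\alpha(t) + (1 - \alpha/\theta_1)\, \ell(t), \quad \text{where } \ell(t) := -\inf_{u \le t} X_\alpha(u) \text{ for } t \ge 0. \tag{13}
\]
For any $y \ge 0$, let
\[
T_\alpha^{-y} := T_{-y}(X_\alpha) = \inf\{ t \ge 0 \colon X_\alpha(t) = -y \} = \inf\{ t \ge 0 \colon \ell(t) \ge y \}.
\]
Notice that $\inf_{u \le t} X_{\theta_1}(u) = -(\alpha/\theta_1)\, \ell(t)$, so we have the identity
\[
T_{\theta_1}^{-y} := T_{-y}(X_{\theta_1}) = \inf\{ t \ge 0 \colon X_{\theta_1}(t) = -y \} = T_\alpha^{-(\theta_1/\alpha) y}. \tag{14}
\]
For each $j \in \mathbb{N}$, define an interval-partition-valued process
\[
\overleftarrow{\beta}_j(y) := \mathrm{SKEWER}\Big( y,\ \mathbf{N}\big|_{[0, T_{\theta_1}^{-j})},\ j + X_{\theta_1}\big|_{[0, T_{\theta_1}^{-j})} \Big), \quad y \in [0, j].
\]
For any $z > 0$, the shifted process $(z + X_\alpha(T_\alpha^{-z} + t),\ t \ge 0)$ has the same distribution as $X_\alpha$, by the strong Markov property of $X_\alpha$. As a consequence, $(-z + \ell(t + T_\alpha^{-z}),\ t \ge 0)$ has the same law as $\ell$. Combining this and (14), we deduce that, for any $k \ge j$, the following two pairs have the same law:
\[
\Big( \mathbf{N} \circ L_{T_{\theta_1}^{j-k}} \big|_{[0,\, T_{\theta_1}^{-k} - T_{\theta_1}^{j-k})},\ k + X_{\theta_1} \circ L_{T_{\theta_1}^{j-k}} \big|_{[0,\, T_{\theta_1}^{-k} - T_{\theta_1}^{j-k})} \Big) \overset{d}{=} \Big( \mathbf{N}\big|_{[0, T_{\theta_1}^{-j})},\ j + X_{\theta_1}\big|_{[0, T_{\theta_1}^{-j})} \Big),
\]
where $L$ stands for the shift operator and we have also used the Poisson property of $\mathbf{N}$. This leads to $(\overleftarrow{\beta}_j(y), y \in [0, j]) \overset{d}{=} (\overleftarrow{\beta}_k(y), y \in [0, j])$. Thus, by Kolmogorov's extension theorem, there exists a process $(\overleftarrow{\beta}(y), y \ge 0)$ such that
\[
\big( \overleftarrow{\beta}(y),\ y \in [0, j] \big) \overset{d}{=} \big( \overleftarrow{\beta}_j(y),\ y \in [0, j] \big) \quad \text{for every } j \in \mathbb{N}. \tag{15}
\]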

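DEFINITION 3.3 ($\mathrm{SSIP}^{(\alpha)}(\theta_1)$-evolution). For $\theta_1 > 0$, let $(\overleftarrow{\beta}(y), y \ge 0)$ be as in (15) and $(\overrightarrow{\beta}(y), y \ge 0)$ an independent $\mathrm{SSIP}^{(\alpha)}(0)$-evolution starting from $\gamma \in \mathcal{I}_H$. Then $(\beta(y) = \overleftarrow{\beta}(y) \star \overrightarrow{\beta}(y),\ y \ge 0)$ is called an $\mathrm{SSIP}^{(\alpha)}(\theta_1)$-evolution starting from $\gamma$.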

In [18], an $\mathrm{SSIP}^{(\alpha)}(\theta_1)$-evolution is defined in a slightly different way that more explicitly handles the Poisson random measure of excursions of $X_\alpha$ above the minimum. Indeed, the passage from $\alpha$ to $\theta_1$ in [18] is by changing the intensity by a factor of $\theta_1/\alpha$. The current correspondence can be easily seen to have the same effect.

PROPOSITION 3.4 ([18, Theorem 1.4 and Proposition 3.4]). For $\theta_1 \ge 0$, an $\mathrm{SSIP}^{(\alpha)}(\theta_1)$-evolution is a path-continuous Hunt process and its total mass process is a $\mathrm{BESQ}(2\theta_1)$.

We refer to [18] for the transition kernel of an $\mathrm{SSIP}^{(\alpha)}(\theta_1)$-evolution.

3.2. Poissonised ordered up-down Chinese restaurant processes. For $\theta > -1$, let $Z := (Z(t), t \ge 0)$ be a continuous-time Markov chain on $\mathbb{N}_0$, whose non-zero transition rates are
\[
Q_{i,j}(\theta) = \begin{cases} i + \theta, & i \ge 1,\ j = i + 1; \\ i, & i \ge 1,\ j = i - 1; \\ \theta \vee 0, & i = 0,\ j = 1. \end{cases}
\]
In particular, 0 is an absorbing state when $\theta \le 0$. For $k \in \mathbb{N}_0$, we define
\[
\pi_k(\theta) \colon \text{the law of the process } Z \text{ starting from } Z(0) = k. \tag{16}
\]
Let $\zeta(Z) := \inf\{ t > 0 \colon Z(t) = 0 \}$ be its first hitting time of zero.

Let $\alpha \in (0, 1)$ and $\theta_1, \theta_2 \ge 0$. Recall from the introduction that the law of a Poissonised ordered up-down Chinese restaurant process (PCRP) with parameters $\alpha$, $\theta_1$ and $\theta_2$, starting from $C \in \mathcal{C}$, is denoted by $\mathrm{PCRP}^{(\alpha)}_C(\theta_1, \theta_2)$.
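The chain $Z$ with rates $Q_{i,j}(\theta)$ can be simulated with the standard Gillespie scheme (exponential holding times, then a jump chosen proportionally to the rates). This is a generic simulation sketch rather than a construction used in the paper; the function name `simulate_updown_chain` and the time horizon `t_max` are illustrative.

```python
import random


def simulate_updown_chain(k, theta, t_max, rng=random):
    """Simulate Z with Z(0) = k up to time t_max.

    Rates: i -> i + 1 at rate i + theta and i -> i - 1 at rate i for
    i >= 1, and 0 -> 1 at rate max(theta, 0). Returns the list of
    (jump_time, new_state) pairs, starting with (0.0, k).
    """
    if theta <= -1:
        raise ValueError("theta must be greater than -1")
    t, state = 0.0, k
    path = [(t, state)]
    while True:
        if state == 0:
            up, down = max(theta, 0.0), 0.0
        else:
            up, down = state + theta, float(state)
        total = up + down
        if total == 0.0:
            break  # absorbed at 0, which happens when theta <= 0
        t += rng.expovariate(total)
        if t > t_max:
            break
        state += 1 if rng.random() * total < up else -1
        path.append((t, state))
    return path
```

If the returned path ends in state 0 with $\theta \le 0$, the last jump time is the hitting time $\zeta(Z)$; rescaling space by $1/n$ and time by $1/(2n)$ gives paths of the kind considered in Section 3.3.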

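When $\theta_2 = \alpha$, a $\mathrm{PCRP}^{(\alpha)}(\theta_1, \alpha)$ is well-studied by Rogers and Winkel [44]. They develop a representation of a PCRP by using scaffolding and spindles, in a similar way to the construction of an $\mathrm{SSIP}^{(\alpha)}(\theta_1)$-evolution. Their approach draws on connections with splitting trees and results on the latter object developed in [20, 30].

Let $\mathbf{D} \sim \mathrm{PRM}(\alpha \cdot \mathrm{Leb} \otimes \pi_1(-\alpha))$ and define its scaffolding function by
\[
J_{\mathbf{D}}(t) := -t + \int_{[0,t] \times \mathcal{E}} \zeta(f)\, \mathbf{D}(ds, df), \quad t \ge 0. \tag{17}
\]
Let $Z \sim \pi_m(-\alpha)$ with $m \in \mathbb{N}$, independent of $\mathbf{D}$. Then a discrete clade with initial mass $m$ is a random point measure on $\mathbb{R}_+ \times \mathcal{E}$ defined by
\[
\mathrm{CLADE}^D(Z, \mathbf{D}) := \delta(0, Z) + \mathbf{D}\big|_{(0,\, T_{-\zeta(Z)}(J_{\mathbf{D}})] \times \mathcal{E}}, \tag{18}
\]
where $T_{-y}(J_{\mathbf{D}}) = \inf\{ t \ge 0 \colon J_{\mathbf{D}}(t) = -y \}$. Write $\mathrm{Clade}^D_m(\alpha)$ for the law of $\mathrm{CLADE}^D(Z, \mathbf{D})$.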

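LEMMA 3.5 ([44, Theorem 1.2]). For $\gamma \in \mathcal{C}$, let $(\mathbf{D}_U, U \in \gamma)$ be an independent family with each $\mathbf{D}_U \sim \mathrm{Clade}^D_{\mathrm{Leb}(U)}(\alpha)$. Then the process $\big( \mathop{\star}_{U \in \gamma} \mathrm{SKEWER}(y, \mathbf{D}_U, J_{\mathbf{D}_U}),\ y \ge 0 \big)$ is a $\mathrm{PCRP}^{(\alpha)}_\gamma(0, \alpha)$.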

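To construct a $\mathrm{PCRP}^{(\alpha)}(\theta_1, \alpha)$ with $\theta_1 > 0$, define for $t \ge 0$
\[
J_{\theta_1, \mathbf{D}}(t) := J_{\mathbf{D}}(t) + \Big( 1 - \frac{\alpha}{\theta_1} \Big) \ell(t), \quad \text{where } \ell(t) := -\inf_{u \le t} J_{\mathbf{D}}(u). \tag{19}
\]
Then $\inf\{ t \ge 0 \colon J_{\mathbf{D}}(t) = -z \} = \inf\{ t \ge 0 \colon J_{\theta_1, \mathbf{D}}(t) = -(\alpha/\theta_1) z \} =: T_{\theta_1}^{-(\alpha/\theta_1) z}$ for $z \ge 0$. Set
\[
\overleftarrow{C}_j(y) := \mathrm{SKEWER}\Big( y,\ \mathbf{D}\big|_{[0, T_{\theta_1}^{-j})},\ j + J_{\theta_1, \mathbf{D}}\big|_{[0, T_{\theta_1}^{-j})} \Big), \quad y \in [0, j],\ j \in \mathbb{N}.
\]
Then, for any $k > j$, we have
\[
\Big( \mathbf{D}\big|_{[T_{\theta_1}^{-(k-j)},\, T_{\theta_1}^{-k})},\ k + J_{\theta_1, \mathbf{D}}\big|_{[T_{\theta_1}^{-(k-j)},\, T_{\theta_1}^{-k})} \Big) \overset{d}{=} \Big( \mathbf{D}\big|_{[0, T_{\theta_1}^{-j})},\ j + J_{\theta_1, \mathbf{D}}\big|_{[0, T_{\theta_1}^{-j})} \Big).
\]
As a result, $(\overleftarrow{C}_k(y), y \in [0, j]) \overset{d}{=} (\overleftarrow{C}_j(y), y \in [0, j])$. Then by Kolmogorov's extension theorem there exists a càdlàg process $(\overleftarrow{C}(y), y \ge 0)$ such that
\[
(\overleftarrow{C}(y),\ y \in [0, j]) \overset{d}{=} (\overleftarrow{C}_j(y),\ y \in [0, j]) \quad \text{for all } j \in \mathbb{N}. \tag{20}
\]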

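THEOREM 3.6 ([44, Theorem 2.5]). For $\theta_1 > 0$, let $(\overleftarrow{C}(y), y \ge 0)$ be the process defined in (20). For $\gamma \in \mathcal{C}$, let $(\overrightarrow{C}(y), y \ge 0)$ be a $\mathrm{PCRP}^{(\alpha)}(0, \alpha)$ starting from $\gamma$. Then the $\mathcal{C}$-valued process $(C(y) := \overleftarrow{C}(y) \star \overrightarrow{C}(y),\ y \ge 0)$ is a $\mathrm{PCRP}^{(\alpha)}(\theta_1, \alpha)$ starting from $\gamma$.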

3.3. Study of the up-down chain on positive integers. For $\theta > -1$ and $n, k \in \mathbb{N}$, define a probability measure:
\[
\pi_k^{(n)}(\theta) \text{ is the law of the process } \big( n^{-1} Z(2ny),\ y \ge 0 \big), \text{ where } Z \sim \pi_k(\theta) \text{ as in (16)}. \tag{21}
\]
In preparation for proving Theorem 1.1, we present the following convergence concerning scaffoldings and spindles.

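PROPOSITION 3.7. For $n \in \mathbb{N}$, let $\mathbf{N}^{(n)}$ be a Poisson random measure on $\mathbb{R}_+ \times \mathcal{E}$ with intensity $\mathrm{Leb} \otimes (2\alpha n^{1+\alpha} \cdot \pi_1^{(n)}(-\alpha))$, and define its scaffolding $\xi^{(n)} := (\xi^{(n)}(t))_{t \ge 0}$, where
\[
\xi^{(n)}(t) := J^{(n)}_{\mathbf{N}^{(n)}}(t) := -n^{\alpha} t + \int_{[0,t] \times \mathcal{E}} \zeta(f)\, \mathbf{N}^{(n)}(ds, df), \quad t \ge 0. \tag{22}
\]
Write $\ell^{(n)} := \big( \ell^{(n)}(t) := -\inf_{s \in [0,t]} \xi^{(n)}(s),\ t \ge 0 \big)$. Let $\mathbf{N} \sim \mathrm{PRM}(c_\alpha \cdot \mathrm{Leb} \otimes \Lambda^{(-2\alpha)}_{\mathrm{BESQ}})$, where $\Lambda^{(-2\alpha)}_{\mathrm{BESQ}}$ is the excursion measure associated with $\mathrm{BESQ}(-2\alpha)$ introduced in Section 3.1 and $c_\alpha = 2\alpha(1+\alpha)/\Gamma(1-\alpha)$ as in (11). Define its scaffolding $\xi_{\mathbf{N}}$ as in (12), and $\ell_{\mathbf{N}} = (\ell_{\mathbf{N}}(t) = -\inf_{s \in [0,t]} \xi_{\mathbf{N}}(s),\ t \ge 0)$. Then the triple $(\mathbf{N}^{(n)}, \xi^{(n)}, \ell^{(n)})$ converges to $(\mathbf{N}, \xi_{\mathbf{N}}, \ell_{\mathbf{N}})$ in distribution, under the product of vague and Skorokhod topologies.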

Note that, for $n \in \mathbb{N}$, we can obtain $\mathbf{N}^{(n)}$ from $\mathbf{D} \sim \mathrm{PRM}(\mathrm{Leb} \otimes \pi_1(-\alpha))$, such that for each atom $\delta(s, f)$ of $\mathbf{D}$, $\mathbf{N}^{(n)}$ has an atom $\delta(n^{-(1+\alpha)} s,\ n^{-1} f(2n \,\cdot\,))$. This suggests a close relation between $\mathbf{N}^{(n)}$ and a rescaled PCRP, which will be specified later on.

The up-down chain defined in (21) plays a central role in the proof of Proposition 3.7. Let us first record a convergence result obtained in [44, Theorem 1.3–1.4]. Similar convergence in a general context of discrete-time Markov chains converging to positive self-similar Markov processes has been established in [5].

LEMMA 3.8 ([44, Theorem 1.3–1.4]). Fix $a > 0$ and $\theta > -1$. For every $n \in \mathbb{N}$, let $Z^{(n)} \sim \pi^{(n)}_{\lfloor na \rfloor}(\theta)$. Then the following convergence holds in the space $D(\mathbb{R}_+, \mathbb{R}_+)$ of càdlàg functions endowed with the Skorokhod topology:
\[
Z^{(n)} \underset{n \to \infty}{\longrightarrow} Z \sim \mathrm{BESQ}_a(2\theta) \quad \text{in distribution}.
\]
Moreover, if $\theta \in (-1, 0]$, then the convergence holds jointly with the convergence of first hitting times of 0.

For our purposes, we study this up-down chain in more depth and obtain the following two convergence results.

LEMMA 3.9. In Lemma 3.8, the joint convergence of first hitting time of 0 also holds when θ ∈ (0, 1).

Recall that for θ ≥ 1, the first hitting time of 0 by BESQ(2θ) is infinite.

PROOF. We adapt the proof of [5, Theorem 3(i)], which establishes such convergence of hitting times in a general context of discrete-time Markov chains converging to positive self-similar Markov processes. This relies on Lamperti's representation for $Z \sim \mathrm{BESQ}_a(2\theta)$,
\[
Z(t) = \exp(\xi(\sigma(t))), \quad \text{where } \sigma(t) = \inf\Big\{ s \ge 0 \colon \int_0^s e^{\xi(r)}\, dr > t \Big\},
\]
for a Brownian motion with drift $\xi(t) = \log(a) + 2B(t) - 2(1-\theta)t$, and corresponding representations
\[
Z_n(t) = \exp(\xi_n(\sigma_n(t))), \quad \text{where } \sigma_n(t) = \inf\Big\{ s \ge 0 \colon \int_0^s e^{\xi_n(r)}\, dr > t \Big\},
\]
for continuous-time Markov chains $\xi_n$, $n \ge 1$, with increment kernels
\[
L^n(x, dy) = 2n e^x \Big( (n e^x + \theta)\, \delta_{\log(1 + 1/(n e^x))}(dy) + n e^x\, \delta_{\log(1 - 1/(n e^x))}(dy) \Big), \quad x \ge -\log n.
\]
We easily check that for all $x \in \mathbb{R}$
\[
\int_{y \in \mathbb{R}} y\, L^n(x, dy) \to -2(1-\theta) \quad \text{and} \quad \int_{y \in \mathbb{R}} y^2 L^n(x, dy) \to 4,
\]
as well as
\[
\sup_{\{x \colon |x| \le r\}} \int_{y \in \mathbb{R}} y^2\, 1_{\{|y| > \varepsilon\}}\, L^n(x, dy) = 0 \quad \text{for } n \text{ sufficiently large}.
\]
To apply [25, Theorem IX.4.21], we further note that all convergences are locally uniform, and we extend the increment kernel $\widetilde L^n(x, dy) := L^n(x, dy)$, $x \ge \log(2) - \log(n)$, by setting
\[
\widetilde L^n(x, dy) := L^n(\log(2) - \log(n), dy), \quad x < \log(2) - \log(n),
\]
to be definite. With this extension of the increment kernel, we obtain $\widetilde \xi_n \to \xi$ in distribution on $D([0, \infty), \mathbb{R})$. This implies $\xi_n \to \xi$ in distribution also for the process $\xi_n$ that jumps from $-\log(n)$ to $-\infty$, but only if we stop the processes the first time they exceed any fixed negative level.

Turning to extinction times $\tau_n$ of $Z_n$, we use Skorokhod's representation $\xi_n \to \xi$ almost surely. Then we want to show that also
\[
\tau_n = \int_0^\infty e^{\xi_n(s)}\, ds \to \tau = \int_0^\infty e^{\xi(s)}\, ds \quad \text{in probability}.
\]
We first establish some uniform bounds on the extinction times when $Z_n$, $n \ge 1$, are started from sufficiently small initial states. To achieve this, we consider the generator $L^n$ of $Z_n$ and note that for $g(x) = x^\beta$, we have
\[
L^n g(x) = 2n\big( (nx + \theta) g(x + 1/n) + nx\, g(x - 1/n) - (2nx + \theta) g(x) \big) \le -C/x
\]
for all $n \ge 1$ and $x \ge K/n$ if and only if
\[
\frac{g(1+h) - 2g(1) + g(1-h)}{h^2} + \theta\, \frac{g(1+h) - g(1)}{h} \le -C/2 \quad \text{for all } h \le 1/K.
\]
But since $g''(1) + \theta g'(1) = \beta(\beta - 1 + \theta) < 0$ for $\beta \in (0, 1-\theta)$, the function $g$ is a Foster–Lyapunov function, and [34, Corollary 2.7], applied with $q = p/2 = \beta/(1+\beta)$ and $f(x) = x^{(1+\beta)/2}$, yields
\[
\exists C' > 0\ \forall n \ge 1,\ \forall x \ge K/n \colon \quad \mathbb{E}_x\big( (\tau_n^{(K)})^q \big) < C' x^\beta,
\]
where $\tau_n^{(K)} = \inf\{ t \ge 0 \colon M_n(t) \le K/n \}$. An application of Markov's inequality yields $\mathbb{P}_x(\tau_n^{(K)} > t) \le C' x^\beta t^{-\beta/(1+\beta)}$. In particular,
\[
\forall \varepsilon > 0\ \forall t > 0\ \exists \eta > 0\ \forall n \ge 1\ \forall K + 1 \le i \le \eta n \colon \quad \mathbb{P}_{i/n}(\tau_n^{(K)} > t/6) \le \varepsilon/8.
\]
Furthermore, there is $n_0$ such that for $n \ge n_0$, the probability that $M_n$ starting from $K/n$ takes more than time $t/6$ to get from $K/n$ to 0 is smaller than $\varepsilon/8$. Now choose $R$ large enough so that
\[
\mathbb{P}(\exp(\xi(R)) < \eta/2) \ge 1 - \varepsilon/8 \quad \text{and} \quad \mathbb{P}\Big( \int_R^\infty e^{\xi(s)}\, ds > t/3 \Big) \le \varepsilon/4.
\]
We can also take $n_1 \ge n_0$ large enough so that
\[
\mathbb{P}(|\exp(\xi_n(R)) - \exp(\xi(R))| < \eta/2) \ge 1 - \varepsilon/8 \quad \text{for all } n \ge n_1.
\]
Then, considering $\exp(\xi_n(R))$ and applying the Markov property at time $R$,
\[
\mathbb{P}(\exp(\xi_n(R)) > \eta) < \varepsilon/4 \quad \text{and} \quad \mathbb{P}\Big( \int_R^\infty e^{\xi_n(s)}\, ds > t/3 \Big) \le \varepsilon/2, \quad \text{for all } n \ge n_1.
\]
But since $\xi_n \to \xi$ almost surely, uniformly on compact sets, we already have
\[
\int_0^R e^{\xi_n(s)}\, ds \to \int_0^R e^{\xi(s)}\, ds \quad \text{almost surely}.
\]
Hence, we can find $n_2 \ge n_1$ so that for all $n \ge n_2$
\[
\mathbb{P}\Big( \Big| \int_0^R e^{\xi_n(s)}\, ds - \int_0^R e^{\xi(s)}\, ds \Big| > t/3 \Big) < \varepsilon/4.
\]
We conclude that, for any given $t > 0$ and any given $\varepsilon > 0$, we found $n_2 \ge 1$ such that for all $n \ge n_2$
\[
\mathbb{P}\Big( \Big| \int_0^\infty e^{\xi_n(s)}\, ds - \int_0^\infty e^{\xi(s)}\, ds \Big| > t \Big) < \varepsilon,
\]
as required.

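PROPOSITION 3.10. Let $\theta \in (-1, 1)$ and $Z \sim \pi_1(\theta)$. Denote by $\widetilde\pi_1^{(n)}(\theta)$ the distribution of $\big( \frac{1}{n} Z(2nt \wedge \zeta(Z)),\ t \ge 0 \big)$. Then the following convergence holds vaguely
\[
\frac{\Gamma(1+\theta)}{1-\theta}\, n^{1-\theta} \cdot \widetilde\pi_1^{(n)}(\theta) \underset{n \to \infty}{\longrightarrow} \Lambda^{(2\theta)}_{\mathrm{BESQ}}
\]
on the space of càdlàg excursions equipped with the Skorokhod topology.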

PROOF. Denote by $A(f) = \sup |f|$ the supremum of a càdlàg excursion $f$. In using the term "vague convergence" on spaces that are not locally compact, but are bounded away from a point (here bounded on $\{A > a\}$ for all $a > 0$), we follow Kallenberg [28, Section 4.1]. Specifically, it follows from his Lemma 4.1 that it suffices to show for all $a > 0$

1. $\Lambda^{(2\theta)}_{\mathrm{BESQ}}(A = a) = 0$,
2. $(\Gamma(1+\theta)/(1-\theta))\, n^{1-\theta} \cdot \widetilde\pi_1^{(n)}(\theta)(A > a) \underset{n \to \infty}{\longrightarrow} \Lambda^{(2\theta)}_{\mathrm{BESQ}}(A > a)$,
3. $\widetilde\pi_1^{(n)}(\theta)(\,\cdot\, | A > a) \underset{n \to \infty}{\longrightarrow} \Lambda^{(2\theta)}_{\mathrm{BESQ}}(\,\cdot\, | A > a)$ weakly.

See also [10, Proposition A2.6.II].

1. is well-known. Indeed, we have chosen to normalise $\Lambda^{(2\theta)}_{\mathrm{BESQ}}$ so that $\Lambda^{(2\theta)}_{\mathrm{BESQ}}(A > a) = a^{-(1-\theta)}$. See e.g. [40, Section 3]. Cf. [14, Lemma 2.8], where a different normalisation was chosen.

2. can be proved using scale functions. Let us compute a scale function $s$ for the birth-death chain with up-rates $i + \theta$ and down-rates $i$ from state $i \ge 1$. Set $s(0) = 0$ and $s(1) = 1$. For $s$ to be a scale function, we need
\[
(k + \theta)(s(k+1) - s(k)) + k(s(k-1) - s(k)) = 0 \quad \text{for all } k \ge 1.
\]
Let $d(k) = s(k) - s(k-1)$, $k \ge 1$. Then
\[
d(k+1) = \frac{k}{k + \theta}\, d(k) = \frac{\Gamma(k+1)\Gamma(1+\theta)}{\Gamma(k+1+\theta)} \sim \Gamma(1+\theta)\, k^{-\theta} \quad \text{as } k \to \infty,
\]
and therefore
\[
s(k) = \sum_{i=1}^{k} d(i) = \sum_{i=0}^{k-1} \frac{\Gamma(i+1)\Gamma(1+\theta)}{\Gamma(i+1+\theta)} \sim \frac{\Gamma(1+\theta)}{1-\theta}\, k^{1-\theta}.
\]
Then the scale function applied to the birth-death chain is a martingale. Now let $p(k)$ be the probability of hitting $k$ before absorption in 0, when starting from 1. Applying the optional stopping theorem at the first hitting time of $\{0, k\}$, we find $p(k) s(k) = 1$, and hence
\[
\frac{\Gamma(1+\theta)}{1-\theta}\, n^{1-\theta}\, p(\lceil na \rceil) = \frac{\Gamma(1+\theta)\, n^{1-\theta}}{(1-\theta)\, s(\lceil na \rceil)} \underset{n \to \infty}{\longrightarrow} a^{-(1-\theta)},
\]
as required.

3. can be proved by using the First Description of $\Lambda^{(2\theta)}_{\mathrm{BESQ}}$ given in [40, (3.1)], which states, in particular, that the excursion under $\Lambda^{(2\theta)}_{\mathrm{BESQ}}(\,\cdot\, | A > a)$ is a concatenation of two independent processes, an $\uparrow$-diffusion starting from 0 and stopped when reaching $a$ followed by a 0-diffusion starting from $a$ and run until absorption in 0. In our case, the 0-diffusion is $\mathrm{BESQ}(2\theta)$, while [40, (3.5)] identifies the $\uparrow$-diffusion as $\mathrm{BESQ}(4 - 2\theta)$. Specifically, straightforward Skorokhod topology arguments adjusting the time-changes around the concatenation times, see [25, VI.1.15], imply that it suffices to show:

(a) The birth-death chain starting from 1 and conditioned to reach $\lceil na \rceil$ before 0, rescaled, converges to a $\mathrm{BESQ}(4 - 2\theta)$ stopped at $a$, jointly with the hitting times of $\lceil na \rceil$.
(b) The birth-death chain starting from $\lceil na \rceil$ run until first hitting 0, rescaled, converges to a $\mathrm{BESQ}(2\theta)$ starting from $a$ stopped when first hitting 0, jointly with the hitting times.

(b) was shown in [44, Theorem 1.3]. See also Lemmas 3.8–3.9 here, completing the convergence of the hitting time. For (a), we adapt that proof. But first we need to identify the conditioned birth-death process. Note that the holding rates are not affected by the conditioning. An elementary argument based purely on the jump chain shows that the conditioned jump chain is Markovian, and its transition probabilities are adjusted by factors $s(i \pm 1)/s(i)$ so that the conditioned birth-death process has up-rates $(i + \theta) s(i+1)/s(i)$ and down-rates $i\, s(i-1)/s(i)$ from state $i \ge 1$.

Rescaling, our processes are instances of $\mathbb{R}$-valued pure-jump Markov processes with jump intensity kernels $\widetilde K^n(x, dy) = 0$ for $x \le 0$ and, for $x > 0$,
\[
\widetilde K^n(x, dy) = 2n \left( (\lceil nx \rceil + \theta)\, \frac{s(\lceil nx + 1 \rceil)}{s(\lceil nx \rceil)}\, \delta_{1/n}(dy) + \lceil nx \rceil\, \frac{s(\lceil nx - 1 \rceil)}{s(\lceil nx \rceil)}\, \delta_{-1/n}(dy) \right).
\]
We now check the drift, diffusion and jump criteria of [25, Theorem IX.4.21]: for $x > 0$
\[
\begin{aligned}
\int_{\mathbb{R}} y\, \widetilde K^n(x, dy) &= 2 \lceil nx \rceil\, \frac{s(\lceil nx + 1 \rceil) - s(\lceil nx - 1 \rceil)}{s(\lceil nx \rceil)} + 2\theta\, \frac{s(\lceil nx + 1 \rceil)}{s(\lceil nx \rceil)} \to 4 - 4\theta + 2\theta = 4 - 2\theta, \\
\int_{\mathbb{R}} y^2\, \widetilde K^n(x, dy) &= \frac{2 \lceil nx \rceil}{n}\, \frac{s(\lceil nx + 1 \rceil) + s(\lceil nx - 1 \rceil)}{s(\lceil nx \rceil)} + \frac{2\theta}{n}\, \frac{s(\lceil nx + 1 \rceil)}{s(\lceil nx \rceil)} \to 4x + 0 = 4x, \\
\int_{\mathbb{R}} y^2\, 1_{\{|y| \ge \varepsilon\}}\, \widetilde K^n(x, dy) &= 0 \quad \text{for } n \text{ sufficiently large},
\end{aligned}
\]
all uniformly in $x \in (0, \infty)$, as required for the limiting $(0, \infty)$-valued $\mathrm{BESQ}(4 - 2\theta)$ diffusion with infinitesimal drift $4 - 2\theta$ and diffusion coefficient $4x$. The convergence of hitting times of $\lfloor na \rfloor$, which is the first passage time above level $\lfloor na \rfloor$, follows from the regularity of the limiting diffusion after the first passage time above level $a$. See e.g. [44, Lemma 3.3].
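The scale function used in step 2 of this proof can be computed numerically from its defining recursion, which gives a quick sanity check of the stated growth rate. The sketch below is illustrative only; the function names `scale_function` and `asymptotic_scale` are hypothetical, and the comparison at a finite `k` is a numerical check, not a proof.

```python
import math


def scale_function(k_max, theta):
    """Return [s(0), ..., s(k_max)] for the chain with up-rates i + theta
    and down-rates i, normalised by s(0) = 0 and s(1) = 1 (k_max >= 1).

    Uses d(k + 1) = d(k) * k / (k + theta) with d(k) = s(k) - s(k - 1).
    """
    s = [0.0, 1.0]
    d = 1.0  # d(1) = s(1) - s(0)
    for k in range(1, k_max):
        d *= k / (k + theta)  # this is now d(k + 1)
        s.append(s[-1] + d)
    return s


def asymptotic_scale(k, theta):
    """Leading-order term Gamma(1 + theta) / (1 - theta) * k^(1 - theta)."""
    return math.gamma(1.0 + theta) / (1.0 - theta) * k ** (1.0 - theta)
```

Since $p(k) s(k) = 1$, the quantity `1 / scale_function(k, theta)[k]` is the probability that the chain started from 1 reaches `k` before being absorbed at 0; for $\theta = 0$ the recursion gives $s(k) = k$ exactly.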

PROOF OF PROPOSITION 3.7. Proposition 3.10 shows that the intensity measure of the Poisson random measure $\mathbf{N}^{(n)}$ converges vaguely as $n \to \infty$. Then the weak convergence of $\mathbf{N}^{(n)}$, under the vague topology, follows from [28, Theorem 4.11]. The weak convergence $\xi^{(n)} \to \xi_{\mathbf{N}}$ has already been proved by Rogers and Winkel [44, Theorem 1.5].

Therefore, both sequences $(\mathbf{N}^{(n)}, n \in \mathbb{N})$ and $(\xi^{(n)}, n \in \mathbb{N})$ are tight (see e.g. [25, VI 3.9]), and the latter implies the tightness of $(\ell^{(n)}, n \in \mathbb{N})$. We hence deduce immediately the tightness of the triple-valued sequence $((\mathbf{N}^{(n)}, \xi^{(n)}, \ell^{(n)}), n \in \mathbb{N})$. As a result, we only need to prove that, for any subsequence $(\mathbf{N}^{(n_i)}, \xi^{(n_i)}, \ell^{(n_i)})$ that converges in law, the limiting distribution is the same as $(\mathbf{N}, \xi_{\mathbf{N}}, \ell_{\mathbf{N}})$. By Skorokhod representation, we may assume that $(\mathbf{N}^{(n_i)}, \xi^{(n_i)}, \ell^{(n_i)})$ converges a.s. to $(\mathbf{N}, \widetilde\xi, \widetilde\ell)$, and it remains to prove that $\widetilde\xi = \xi_{\mathbf{N}}$ and $\widetilde\ell = \ell_{\mathbf{N}}$ a.s.

For any $\varepsilon > 0$, since a.s. $\mathbf{N}$ has no spindle of length equal to $\varepsilon$, the vague convergence of $\mathbf{N}^{(n_i)}$ implies that, a.s. for any $t \ge 0$, we have the following weak convergence of finite point measures:
\[
\sum_{s \le t} 1\{ |\Delta \xi^{(n_i)}(s)| > \varepsilon \}\, \delta\big( s, \Delta \xi^{(n_i)}(s) \big) \Longrightarrow \sum_{s \le t} 1\{ |\Delta \xi_{\mathbf{N}}(s)| > \varepsilon \}\, \delta\big( s, \Delta \xi_{\mathbf{N}}(s) \big).
\]
The subsequence above also converges a.s. to $\sum_{s \le t} 1\{ |\Delta \widetilde\xi(s)| > \varepsilon \}\, \delta\big( s, \Delta \widetilde\xi(s) \big)$, since $\xi^{(n_i)} \to \widetilde\xi$ in $D(\mathbb{R}_+, \mathbb{R})$. By the Lévy–Itô decomposition, this is enough to conclude that $\widetilde\xi = \xi_{\mathbf{N}}$ a.s.

For any $t \ge 0$, since $\xi^{(n_i)} \to \xi_{\mathbf{N}}$ a.s. and $\xi_{\mathbf{N}}$ is a.s. continuous at $t$, we have $(\xi^{(n_i)}(s), s \in [0, t]) \to (\xi_{\mathbf{N}}(s), s \in [0, t])$ in $D([0, t], \mathbb{R})$ a.s. Then $\inf_{s \in [0,t]} \xi^{(n_i)}(s) \to \inf_{s \in [0,t]} \xi_{\mathbf{N}}(s)$ a.s., because it is a continuous functional (w.r.t. the Skorokhod topology). In other words, $\widetilde\ell(t) = \ell_{\mathbf{N}}(t)$ a.s. By the continuity of the process $\ell_{\mathbf{N}}$ we have $\widetilde\ell = \ell_{\mathbf{N}}$ a.s., completing the proof.

LEMMA 3.11 (First passage over a negative level). Suppose that $(\mathbf{N}^{(n)}, \xi^{(n)})$ as in Proposition 3.7 converges a.s. to $(\mathbf{N}, \xi_{\mathbf{N}})$ as $n \to \infty$. Define
\[
T^{(n)}_{-y} := T_{-y}(\xi^{(n)}) := \inf\{ t \ge 0 \colon \xi^{(n)}(t) = -y \}, \quad y > 0, \tag{23}
\]
and similarly $T_{-y} := T_{-y}(\xi_{\mathbf{N}})$. Let $(h^{(n)})_{n \in \mathbb{N}}$ be a sequence of positive numbers with $\lim_{n \to \infty} h^{(n)} = h > 0$. Then $T^{(n)}_{-h^{(n)}}$ converges to $T_{-h}$ a.s.

PROOF. Since the process $\xi^{(n)}$ is a spectrally positive Lévy process with some Laplace exponent $\Phi^{(n)}$, we know from [3, Theorem VII.1] that $(T^{(n)}_{(-y)+}, y \ge 0)$ is a subordinator with Laplace exponent $(\Phi^{(n)})^{-1}$. On the one hand, the convergence of $\Phi^{(n)}$ leads to $(T^{(n)}_{(-y)+}, y \ge 0) \to (T_{(-y)+}, y \ge 0)$ in distribution under the Skorokhod topology. Since $\xi_{\mathbf{N}}$ is a.s. continuous at $T_{-h}$, we have $T^{(n)}_{-h^{(n)}} \to T_{-h}$ in distribution.

On the other hand, we deduce from the convergence of the process $\xi^{(n)}$ in $D(\mathbb{R}_+, \mathbb{R})$ that, for any $\varepsilon > 0$, a.s. there exists $N_1 \in \mathbb{N}$ such that for all $n > N_1$,
\[
|\xi^{(n)}(T_{-h}) - (-h)| < \varepsilon.
\]
We may assume that $|h^{(n)} - h| < \varepsilon$ for all $n \in \mathbb{N}$. As a result, a.s. for any $n > N_1$ and $y' < h - 2\varepsilon$, we have $T^{(n)}_{-y'} < T_{-h}$. Hence, by the arbitrariness of $\varepsilon$ and the left-continuity of $T^{(n)}_{-y}$ with respect to $y$, we have $\limsup_{n \to \infty} T^{(n)}_{-h^{(n)}} \le T_{-h}$ a.s. Since $T^{(n)}_{-h^{(n)}} \to T_{-h}$ in distribution, it follows that $T^{(n)}_{-h^{(n)}} \to T_{-h}$ a.s.

3.4. The scaling limit of a $\mathrm{PCRP}^{(\alpha)}(0, \alpha)$.

THEOREM 3.12 (Convergence of $\mathrm{PCRP}^{(\alpha)}(0, \alpha)$). For $n \in \mathbb{N}$, let $(C^{(n)}(y), y \ge 0)$ be a $\mathrm{PCRP}^{(\alpha)}(0, \alpha)$ starting from $C^{(n)}(0) \in \mathcal{C}$ and $(\beta(y), y \ge 0)$ be an $\mathrm{SSIP}^{(\alpha)}(0)$-evolution starting from $\beta(0) \in \mathcal{I}_H$. Suppose that the interval partition $\frac{1}{n} C^{(n)}(0)$ converges in distribution to $\beta(0)$ as $n \to \infty$, under $d_H$. Then the rescaled process $(\frac{1}{n} C^{(n)}(2ny), y \ge 0)$ converges in distribution to $(\beta(y), y \ge 0)$ as $n \to \infty$ in the Skorokhod sense and hence uniformly.

We start with the simplest case.

LEMMA 3.13. The statement of Theorem 3.12 holds if $\beta(0) = \{(0, b)\}$ and $C^{(n)}(0) = \{(0, b^{(n)})\}$, where $\lim_{n \to \infty} n^{-1} b^{(n)} = b > 0$.

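To prove this lemma, we first give a representation of the rescaled PCRP. Let $(\mathbf{N}^{(n)}, \xi^{(n)})$ be as in Proposition 3.7. For each $n \in \mathbb{N}$, we define a random point measure on $\mathbb{R}_+ \times \mathcal{E}$ by
\[
\mathbf{N}^{(n)}_{\mathrm{cld}} := \delta(0, f^{(n)}) + \mathbf{N}^{(n)}\big|_{(0,\, T_{-\zeta(f^{(n)})}(\xi^{(n)})] \times \mathcal{E}},
\]
where $f^{(n)} \sim \pi^{(n)}_{b^{(n)}}(-\alpha)$, independent of $\mathbf{N}^{(n)}$, and $T_{-y}(\xi^{(n)}) := \inf\{ t \ge 0 \colon \xi^{(n)}(t) = -y \}$. Then we may assume that $\mathbf{N}^{(n)}_{\mathrm{cld}}$ is obtained from $\mathbf{D}^{(n)} \sim \mathrm{Clade}^D_{b^{(n)}}(\alpha)$ defined in (18) such that for each atom $\delta(s, f)$ of $\mathbf{D}^{(n)}$, $\mathbf{N}^{(n)}_{\mathrm{cld}}$ has an atom $\delta(n^{-(1+\alpha)} s,\ n^{-1} f(2n \,\cdot\,))$. Let $\xi^{(n)}_{\mathrm{cld}}$ be the scaffolding associated with $\mathbf{N}^{(n)}_{\mathrm{cld}}$ as in (17). As a consequence, we have the identity
\[
\beta^{(n)}(y) := \mathrm{SKEWER}\big( y, \mathbf{N}^{(n)}_{\mathrm{cld}}, \xi^{(n)}_{\mathrm{cld}} \big) = \frac{1}{n}\, \mathrm{SKEWER}\big( 2ny, \mathbf{D}^{(n)}, \xi_{\mathbf{D}^{(n)}} \big), \quad y \ge 0,
\]
where $\xi_{\mathbf{D}^{(n)}}$ is defined as in (17). By Lemma 3.5, we may assume that $\beta^{(n)}(y) = \frac{1}{n} C^{(n)}(2ny)$ with $C^{(n)}$ a $\mathrm{PCRP}^{(\alpha)}(0, \alpha)$ starting from $C^{(n)}(0) = \{(0, b^{(n)})\}$.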

PROOF OF LEMMA 3.13. With notation as above, we shall prove that the rescaled process $\beta^{(n)} := (\beta^{(n)}(y), y \ge 0)$ converges to an $\mathrm{SSIP}^{(\alpha)}(0)$-evolution $\beta := (\beta(y), y \ge 0)$ starting from $\{(0, b)\}$. By Definition 3.2 we can write $\beta = \mathrm{SKEWER}(\mathbf{N}_{\mathrm{cld}}, \xi_{\mathrm{cld}})$, with $\mathbf{N}_{\mathrm{cld}} = \mathrm{CLADE}(f, \mathbf{N})$ and $\xi_{\mathrm{cld}}$ its associated scaffolding, where $f \sim \mathrm{BESQ}_b(-2\alpha)$ and $\mathbf{N}$ is a Poisson random measure on $[0, \infty) \times \mathcal{E}$ with intensity $c_\alpha \mathrm{Leb} \otimes \Lambda^{(-2\alpha)}_{\mathrm{BESQ}}$, independent of $f$. Using Proposition 3.7 and Lemma 3.8, we have $(\mathbf{N}^{(n)}, \xi^{(n)}) \to (\mathbf{N}, \xi)$ and $(f^{(n)}, \zeta(f^{(n)})) \to (f, \zeta(f))$ in distribution, independently. Then it follows from Lemma 3.11 that this convergence also holds jointly with $T_{-\zeta(f^{(n)})}(\xi^{(n)}) \to T_{-\zeta(f)}(\xi)$. As a consequence, we have $(\mathbf{N}^{(n)}_{\mathrm{cld}}, \xi^{(n)}_{\mathrm{cld}}) \to (\mathbf{N}_{\mathrm{cld}}, \xi_{\mathrm{cld}})$ in distribution.

Fix any subsequence $(m_j)_{j \in \mathbb{N}}$. With notation as above, consider the subsequence of triple-valued processes $(\mathbf{N}^{(m_j)}_{\mathrm{cld}}, \xi^{(m_j)}_{\mathrm{cld}}, \|\beta^{(m_j)}\|)_{j \in \mathbb{N}}$. For each element in the triple, we know its tightness from Proposition 3.7 and Lemma 3.8, so the triple-valued subsequence is tight. Therefore, we can extract a further subsequence $\big( \mathbf{N}^{(n_i)}_{\mathrm{cld}}, \xi^{(n_i)}_{\mathrm{cld}}, \|\beta^{(n_i)}\| \big)_{i \in \mathbb{N}}$ that converges in distribution to a limit process $(\mathbf{N}_{\mathrm{cld}}, \xi_{\mathrm{cld}}, \widetilde M)$. Using Skorokhod representation, we may assume that this convergence holds a.s. We shall prove that $\beta^{(n_i)}$ converges to $\beta$ a.s., from which the lemma follows.

We stress that the limit $\widetilde M$ has the same law as the total mass process $\|\beta\|$, but at this stage it is not clear if they are indeed equal. We will prove that $\widetilde M = \|\beta\|$ a.s.

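To this end, let us consider the contribution of the spindles with lifetime longer than $\rho > 0$. On the space $\mathbb{R}_+ \times \{ f \in \mathcal{E} \colon \zeta(f) > \rho \}$, $\mathbf{N}_{\mathrm{cld}}$ has a.s. a finite number of atoms, say enumerated in chronological order by $(t_j, f_j)_{j \le K}$ with $K \in \mathbb{N}$. Since $\mathbf{N}_{\mathrm{cld}}$ has no spindle of length exactly equal to $\rho$, by the a.s. convergence $\mathbf{N}^{(n_i)}_{\mathrm{cld}} \to \mathbf{N}_{\mathrm{cld}}$, we may assume that each $\mathbf{N}^{(n_i)}_{\mathrm{cld}}$ also has $K$ atoms $(t_j^{(n_i)}, f_j^{(n_i)})_{j \le K}$ on $\mathbb{R}_+ \times \{ f \in \mathcal{E} \colon \zeta(f) > \rho \}$, and, for every $j \le K$, that
\[
\lim_{i \to \infty} t_j^{(n_i)} = t_j, \quad \lim_{i \to \infty} \sup_{t \ge 0} \big| f_j^{(n_i)}(t) - f_j(t) \big| = 0, \quad \text{and} \quad \lim_{i \to \infty} \zeta(f_j^{(n_i)}) = \zeta(f_j) \quad \text{a.s.} \tag{24}
\]
Note that $\zeta(f_j^{(n_i)}) = \Delta \xi^{(n_i)}_{\mathrm{cld}}(t_j^{(n_i)})$. Since $\xi^{(n_i)}_{\mathrm{cld}} \to \xi_{\mathrm{cld}}$ in $D(\mathbb{R}_+, \mathbb{R})$, we deduce that
\[
\lim_{i \to \infty} \xi^{(n_i)}_{\mathrm{cld}}(t_j^{(n_i)}-) = \xi_{\mathrm{cld}}(t_j-) \quad \text{a.s.} \tag{25}
\]
By deleting all spindles whose lifetimes are smaller than $\rho$, we obtain from $\beta^{(n_i)}$ an interval partition evolution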

(ni) n (ni) (ni)  o β>ρ (y) := Mk−1 (y, ρ),Mk (y, ρ) , 1 ≤ k ≤ K , y ≥ 0,   (ni) P (ni) (ni) (ni) where Mk (y, ρ) = j∈[k] fj y − ξcld (tj −) . Define similarly Mk(y, ρ) and β>ρ(y) from β. By (25) and (24), for all k ≤ K,

(ni) lim sup Mk (y, ρ) − Mk(y, ρ) = 0 a.s.. n→∞ y≥0 It follows that

 (ni)  (26) lim sup dH β>ρ (y), β>ρ(y) = 0 a.s.. i→∞ y≥0 In particular, for all y, ρ > 0,

(ni) (ni) Mf(y) = lim kβ (y)k ≥ lim kβ>ρ (y)k = kβ>ρ(y)k, a.s.. i→∞ i→∞ Then monotone convergence leads to, for all y > 0,

Mf(y) ≥ lim kβ>ρ(y)k = kβ(y)k, a.s.. ρ↓0

Moreover, since $\widetilde{M}$ and $\|\beta\|$ also have the same law, we conclude that $\widetilde{M}$ and $\|\beta\|$ are indistinguishable.

Next, we shall show that $\beta_{>\rho}$ approximates $\beta$ arbitrarily closely as $\rho \to 0$. Write $M_{\le\rho} := \|\beta\| - \|\beta_{>\rho}\|$. For any $\varepsilon > 0$, we can find $\rho > 0$ such that $\sup_{y \ge 0} M_{\le\rho}(y) < \varepsilon$. Indeed, suppose by contradiction that this is not true; then there exist two sequences $\rho_i \downarrow 0$ and $y_i \ge 0$ such that $M_{\le\rho_i}(y_i) \ge \varepsilon$ for each $i \in \mathbb{N}$. Since the extinction time of a clade is a.s. finite, we may assume (by extracting a subsequence) that $\lim_{i \to \infty} y_i = y \ge 0$. By the continuity of $\beta$, this yields that $\liminf_{i \to \infty} M_{\le\rho_i}(y) \ge \varepsilon$. This is absurd, since we know by monotone convergence that $\lim_{i \to \infty} M_{\le\rho_i}(y) = 0$. With this specified $\rho$, since $d_H(\beta(y), \beta_{>\rho}(y)) \le M_{\le\rho}(y)$ for each $y \ge 0$, we have

$$\sup_{y \ge 0} d_H \big( \beta(y), \beta_{>\rho}(y) \big) < \varepsilon. \tag{27}$$

Using (26) and the uniform convergence $\|\beta^{(n_i)}\| \to \widetilde{M} = \|\beta\|$, we deduce that the process $M^{(n_i)}_{\le\rho} := \|\beta^{(n_i)}\| - \|\beta^{(n_i)}_{>\rho}\|$ converges to $M_{\le\rho}$ uniformly. Then, for all $i$ large enough, we also have

$$\sup_{y \ge 0} d_H \big( \beta^{(n_i)}(y), \beta^{(n_i)}_{>\rho}(y) \big) \le \sup_{y \ge 0} M^{(n_i)}_{\le\rho}(y) < 2\varepsilon.$$

Combining this inequality with (26) and (27), we deduce that

$$\limsup_{i \to \infty} \sup_{y \ge 0} d_H \big( \beta^{(n_i)}(y), \beta(y) \big) \le 3\varepsilon.$$

As $\varepsilon$ is arbitrary, we conclude that $\beta^{(n_i)}$ converges to $\beta$ a.s. under the uniform topology, completing the proof.
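The last combination in the proof of Lemma 3.13 is a triangle inequality for $d_H$. The display below is a sketch of that step only; it uses nothing beyond (26), (27) and the $2\varepsilon$ bound on the discrete small-spindle mass, with the same $\rho$ and $\varepsilon$ as in the proof.

```latex
\begin{align*}
\sup_{y \ge 0} d_H\bigl(\beta^{(n_i)}(y), \beta(y)\bigr)
 &\le \sup_{y \ge 0} d_H\bigl(\beta^{(n_i)}(y), \beta^{(n_i)}_{>\rho}(y)\bigr)
  + \sup_{y \ge 0} d_H\bigl(\beta^{(n_i)}_{>\rho}(y), \beta_{>\rho}(y)\bigr)
  + \sup_{y \ge 0} d_H\bigl(\beta_{>\rho}(y), \beta(y)\bigr) \\
 &< 2\varepsilon + o(1) + \varepsilon
  \qquad \text{for all } i \text{ large enough, a.s.}
\end{align*}
```

The first term is controlled by the uniform convergence of $M^{(n_i)}_{\le\rho}$, the middle term vanishes by (26), and the last term is bounded by (27).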

To extend to a general initial state, let us record the following result that characterises the convergence under dH .

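LEMMA 3.14 ([18, Lemma 4.3]). Let $\beta, \beta_n \in \mathcal{I}_H$, $n \ge 1$. Then $d_H(\beta_n, \beta) \to 0$ as $n \to \infty$ if and only if

$$\forall_{(a,b) \in \beta}\ \exists_{n_0 \ge 1}\ \forall_{n \ge n_0}\ \exists_{(a_n, b_n) \in \beta_n} \quad a_n \to a \ \text{ and } \ b_n \to b \tag{28}$$

and

$$\forall_{(n_k)_{k \ge 1} :\ n_k \to \infty}\ \forall_{(c_k, d_k) \in \beta_{n_k},\ k \ge 1 :\ d_k \to d \in (0, \infty],\ c_k \to c \ne d} \quad (c, d) \in \beta. \tag{29}$$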

PROOF OF THEOREM 3.12. For the case $\beta(0) = \emptyset$, by convention $\beta(y) = \emptyset$ for every $y \ge 0$. Then the claim is a simple consequence of the convergence of the total mass processes. So we may assume that $\beta(0) \ne \emptyset$. By Definition 3.2, we can write $\beta = \mathop{\bigstar}_{U \in \beta(0)} \beta_U$, where each process $\beta_U := (\beta_U(y), y \ge 0) \sim \mathrm{Clade}^{D}_{\mathrm{Leb}(U)}(\alpha)$, independent of the others. For any $\varepsilon > 0$, a.s. we can find at most a finite number of intervals, say $U_1, U_2, \ldots, U_k \in \beta(0)$, listed from left to right, such that

$$\sup_{y \ge 0} R_k(y) < \varepsilon, \quad \text{where } R_k(y) := \|\beta(y)\| - \sum_{i=1}^{k} \|\beta_{U_i}(y)\|. \tag{30}$$

Indeed, since the process $\|\beta\| \sim \mathrm{BESQ}_{\|\beta(0)\|}(0)$, $\|\beta_{U_i}\| \sim \mathrm{BESQ}_{\mathrm{Leb}(U_i)}(0)$ for each $i \in [k]$, and $R_k$ is independent of the family $\{\beta_{U_i}, i \in [k]\}$, we deduce from Proposition 3.4 that $R_k \sim \mathrm{BESQ}_{r_k}(0)$ with $r_k := \|\beta(0)\| - \sum_{i=1}^{k} \mathrm{Leb}(U_i)$. Hence, by letting $r_k$ be small enough, we have (30).

For each $n \in \mathbb{N}$, we similarly assume that $C^{(n)}(y) = \mathop{\bigstar}_{U \in C^{(n)}(0)} C^{(n)}_U(y)$, $y \ge 0$, where each process $(C^{(n)}_U(y), y \ge 0) \sim \mathrm{PCRP}^{(\alpha)}_{\mathrm{Leb}(U)}(0, \alpha)$, independent of the others. Due to the convergence of the initial state $\frac{1}{n} C^{(n)}(0) \to \beta(0)$ and by Lemma 3.14 we can find for each $i \le k$ a sequence $U^{(n)}_i = (a^{(n)}_i, b^{(n)}_i) \in C^{(n)}(0)$, $n \in \mathbb{N}$, such that $a^{(n)}_i / n \to \inf U_i$ and $b^{(n)}_i / n \to \sup U_i$. In particular, we have $\mathrm{Leb}(U^{(n)}_i)/n \to \mathrm{Leb}(U_i)$ for every $i \le k$. Then we may assume by Lemma 3.13 that, for all $i \le k$,

$$\lim_{n \to \infty} \sup_{y \ge 0} d_H \Big( \tfrac{1}{n} C^{(n)}_{U^{(n)}_i}(2ny), \beta_{U_i}(y) \Big) = 0 \quad \text{a.s.} \tag{31}$$

Moreover, it is easy to see that the total mass of a $\mathrm{PCRP}^{(\alpha)}(0, \alpha)$ is a Markov chain described by $\pi(0)$ in (16). By independence, the rescaled process

$$R^{(n)}_k(y) := \frac{1}{n} \big\| C^{(n)}(2ny) \big\| - \frac{1}{n} \sum_{i=1}^{k} \big\| C^{(n)}_{U^{(n)}_i}(2ny) \big\|, \quad y \ge 0,$$

has the law of $\pi^{(n)}_{r^{(n)}_k}(0)$ as in (21), where $r^{(n)}_k := \|C^{(n)}(0)\| - \sum_{i=1}^{k} \mathrm{Leb}(U^{(n)}_i)$. By Lemma 3.8 and Skorokhod representation, we may also assume $\sup_{y \ge 0} |R^{(n)}_k(y) - R_k(y)| \to 0$ a.s.

An easy estimate shows that

$$d_H \Big( \tfrac{1}{n} C^{(n)}(2ny), \beta(y) \Big) \le 2 R^{(n)}_k(y) + 2 R_k(y) + \sum_{i=1}^{k} d_H \Big( \tfrac{1}{n} C^{(n)}_{U^{(n)}_i}(2ny), \beta_{U_i}(y) \Big).$$

As a result, combining (30) and (31), we have

$$\limsup_{n \to \infty} \sup_{y \ge 0} d_H \Big( \tfrac{1}{n} C^{(n)}(2ny), \beta(y) \Big) \le 4\varepsilon \quad \text{a.s.}$$

By the arbitrariness of $\varepsilon$ we deduce the claim.

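3.5. The scaling limit of a $\mathrm{PCRP}^{(\alpha)}(\theta_1, \alpha)$.

PROPOSITION 3.15 (Convergence of a $\mathrm{PCRP}^{(\alpha)}(\theta_1, \alpha)$). Let $\theta_1 \ge 0$. For $n \in \mathbb{N}$, let $(C^{(n)}(y), y \ge 0)$ be a $\mathrm{PCRP}^{(\alpha)}(\theta_1, \alpha)$ starting from $C^{(n)}(0) \in \mathcal{C}$. Suppose that the interval partition $\frac{1}{n} C^{(n)}(0)$ converges in distribution to $\beta(0) \in \mathcal{I}_H$ as $n \to \infty$, under $d_H$. Then the process $(\frac{1}{n} C^{(n)}(2ny), y \ge 0)$ converges in distribution to an $\mathrm{SSIP}^{(\alpha)}(\theta_1)$-evolution starting from $\beta(0)$, as $n \to \infty$, in $D(\mathbb{R}_+, \mathcal{I}_H)$ under the Skorokhod topology.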

PROOF. We only need to prove the case when $\theta_1 > 0$ and $C^{(n)}(0) = \emptyset$ for every $n \in \mathbb{N}$; then combining this special case and Theorem 3.12 leads to the general result. The arguments are very similar to those in the proof of Lemma 3.13; we only sketch the strategy here and omit the details.

Fix $j \in \mathbb{N}$. Let $(\mathbf{N}^{(n)}, \xi^{(n)}, \ell^{(n)})_{n \in \mathbb{N}}$ be the sequence given in Proposition 3.7. For each $n \in \mathbb{N}$, by using Theorem 3.6, we may write

$$\beta^{(n)}(y) := \frac{1}{n} C^{(n)}(2ny) = \mathrm{SKEWER} \Big( y,\ \mathbf{N}^{(n)} \big|_{[0, T^{(n)}_{-j\theta_1/\alpha})},\ j + \xi^{(n)}_{\theta_1} \big|_{[0, T^{(n)}_{-j\theta_1/\alpha})} \Big), \quad y \in [0, j],$$

where $\xi^{(n)}_{\theta_1} := \xi^{(n)} + (1 - \alpha/\theta_1) \ell^{(n)}$ and $T^{(n)}_{-j\theta_1/\alpha} := T_{-j\theta_1/\alpha}(\xi^{(n)}) = T_{-j}(\xi^{(n)}_{\theta_1})$. By Proposition 3.7 and Skorokhod representation, we may assume that $(\mathbf{N}^{(n)}, \xi^{(n)}, \xi^{(n)}_{\theta_1})$ converges a.s. to $(\mathbf{N}, X_\alpha, X_{\theta_1})$. Then it follows from Lemma 3.11 that $T^{(n)}_{-j\theta_1/\alpha} \to T_{-j\theta_1/\alpha}(X_\alpha) = T_{-j}(X_{\theta_1})$, cf. (14). Next, in the same way as in the proof of Lemma 3.13, we consider for any $\rho > 0$ the interval partition evolution $\beta^{(n)}_{>\rho}$ associated with the spindles of $\beta^{(n)}$ with lifetime longer than $\rho$. By proving that for any $\rho > 0$, $(\beta^{(n)}_{>\rho}(y), y \in [0, j]) \to (\beta_{>\rho}(y), y \in [0, j])$ as $n \to \infty$, and that $\|\beta^{(n)}\| - \|\beta^{(n)}_{>\rho}\| \to 0$ as $\rho \downarrow 0$ uniformly for all $n \in \mathbb{N}$, we deduce the convergence of $(\beta^{(n)}(y), y \in [0, j])$. This leads to the desired statement.

4. Convergence of the three-parameter family. In this section we consider the general three-parameter family $\mathrm{PCRP}^{(\alpha)}(\theta_1, \theta_2)$ with $\theta_1, \theta_2 \ge 0$. In Section 4.1 we establish a related convergence result, Theorem 4.3, for the processes killed upon hitting $\emptyset$, with the limiting diffusion being an $\mathrm{SSIP}_\dagger$-evolution introduced in [47]. Using Theorem 4.3, we obtain a pseudo-stationary distribution for an $\mathrm{SSIP}_\dagger$-evolution in Proposition 4.4, which enables us to introduce an excursion measure and thereby construct an SSIP-evolution from excursions, for suitable parameters, in Sections 4.4 and 4.5 respectively. In Section 4.6, we finally complete the proofs of Theorem 1.1 and the other results stated in the introduction.

4.1. Convergence when $\emptyset$ is absorbing. If we choose any table in a $\mathrm{PCRP}^{(\alpha)}(\theta_1, \theta_2)$, then its size evolves as a $\pi(-\alpha)$-process until the first hitting time of zero; before the deletion of this table, the tables to its left form a $\mathrm{PCRP}^{(\alpha)}(\theta_1, \alpha)$ and the tables to its right a $\mathrm{PCRP}^{(\alpha)}(\alpha, \theta_2)$. This observation suggests making such decompositions and using the convergence results obtained in the previous section. A similar idea has been used in [47] for the construction of an SSIP-evolution with absorption in $\emptyset$, abbreviated as $\mathrm{SSIP}_\dagger$-evolution, which we shall now recall. Specifically, define a function $\phi \colon \mathcal{I}_H \to \big( \mathcal{I}_H \times (0, \infty) \times \mathcal{I}_H \big) \cup \{(\emptyset, 0, \emptyset)\}$ by setting $\phi(\emptyset) := (\emptyset, 0, \emptyset)$ and, for $\beta \ne \emptyset$,

$$\phi(\beta) := \big( \beta \cap (0, \inf U),\ \mathrm{Leb}(U),\ \beta \cap (\sup U, \|\beta\|) - \sup U \big), \tag{32}$$

where $U$ is the longest interval in $\beta$; we take $U$ to be the leftmost one if this is not unique.
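The map $\phi$ in (32) can be illustrated on a finite interval partition. The following is a minimal sketch, assuming the partition is stored as a list of disjoint `(left, right)` pairs; the name `phi` and this list representation are illustrative choices and not notation from [47].

```python
from typing import List, Tuple

Interval = Tuple[float, float]


def phi(beta: List[Interval]):
    """Split a finite interval partition at its longest block (leftmost on ties).

    Returns (left part, length of the longest block, right part shifted to start at 0).
    The empty partition is mapped to ([], 0.0, []).
    """
    if not beta:
        return ([], 0.0, [])
    blocks = sorted(beta)
    # max() returns the first maximal element, so ties go to the leftmost block.
    u = max(blocks, key=lambda ab: ab[1] - ab[0])
    left = [(a, b) for (a, b) in blocks if b <= u[0]]
    # Blocks to the right of U are shifted left by sup U, as in (32).
    right = [(a - u[1], b - u[1]) for (a, b) in blocks if a >= u[1]]
    return (left, u[1] - u[0], right)
```

For example, `phi([(0, 1), (1, 3), (3, 5), (5, 6)])` selects the block `(1, 3)` rather than `(3, 5)`, since both have length 2 and the leftmost is chosen.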

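DEFINITION 4.1 (SSIP-evolution with absorption in $\emptyset$, Definition 1.3 of [47]). Consider $\theta_1 \ge 0$, $\theta_2 \ge 0$ and $\gamma \in \mathcal{I}_H$. Set $T_0 := 0$ and $\beta(0) := \gamma$. For $k \ge 0$, suppose by induction that we have obtained $(\beta(t), t \le T_k)$.

• If $\beta(T_k) = \emptyset$, then we stop and set $T_i := T_k$ for every $i \ge k$ and $\beta(t) := \emptyset$ for $t \ge T_k$.
• If $\beta(T_k) \ne \emptyset$, denote $(\beta^{(k)}_1, m^{(k)}, \beta^{(k)}_2) := \phi(\beta(T_k))$. Conditionally on the history, let $f^{(k)} \sim \mathrm{BESQ}_{m^{(k)}}(-2\alpha)$ and $\gamma^{(k)}_i = (\gamma^{(k)}_i(s), s \ge 0)$ an $\mathrm{SSIP}^{(\alpha)}(\theta_i)$-evolution starting from $\beta^{(k)}_i$, $i = 1, 2$, with $f^{(k)}, \gamma^{(k)}_1, \gamma^{(k)}_2$ independent. Set $T_{k+1} := T_k + \zeta(f^{(k)})$. We define
$$\beta(t) := \gamma^{(k)}_1(t - T_k) \star \big\{ \big( 0, f^{(k)}(t - T_k) \big) \big\} \star \mathrm{rev} \big( \gamma^{(k)}_2(t - T_k) \big), \quad t \in (T_k, T_{k+1}].$$

We refer to $(T_k)_{k \ge 1}$ as the renaissance levels and $T_\infty := \sup_{k \ge 1} T_k \in [0, \infty]$ as the degeneration level. If $T_\infty < \infty$, then by convention we set $\beta(t) := \emptyset$ for all $t \ge T_\infty$. Then the process $\beta := (\beta(t), t \ge 0)$ is called an $\mathrm{SSIP}^{(\alpha)}_\dagger(\theta_1, \theta_2)$-evolution starting from $\gamma$.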

Note that $\emptyset$ is an absorbing state of an $\mathrm{SSIP}^{(\alpha)}_\dagger(\theta_1, \theta_2)$-evolution by construction. Let us summarise a few results obtained in [47, Theorem 1.4, Corollary 3.7].

THEOREM 4.2 ([47]). For $\theta_1, \theta_2 \ge 0$, let $(\beta(t), t \ge 0)$ be an $\mathrm{SSIP}^{(\alpha)}_\dagger(\theta_1, \theta_2)$-evolution, with renaissance levels $(T_k, k \ge 0)$ and degeneration level $T_\infty$. Set $\theta = \theta_1 + \theta_2 - \alpha$.

(i) (Hunt property) $(\beta(t), t \ge 0)$ is a Hunt process with continuous paths in $(\mathcal{I}_H, d_H)$.
(ii) (Total-mass) $(\|\beta(t)\|, t \ge 0)$ is a $\mathrm{BESQ}_{\|\beta(0)\|}(2\theta)$ killed at its first hitting time of zero.
(iii) (Degeneration level) If $\theta \ge 1$ and $\beta(0) \ne \emptyset$, then a.s. $T_\infty = \infty$ and $\beta(t) \ne \emptyset$ for every $t \ge 0$; if $\theta < 1$, then a.s. $T_\infty < \infty$ and $\lim_{t \uparrow T_\infty} d_H(\beta(t), \emptyset) = 0$.
(iv) (Self-similarity) For any $c > 0$, the process $(c\beta(c^{-1}t), t \ge 0)$ is an $\mathrm{SSIP}^{(\alpha)}_\dagger(\theta_1, \theta_2)$-evolution starting from $c\beta(0)$.
(v) When $\theta_2 = \alpha$, the $\mathrm{SSIP}^{(\alpha)}_\dagger(\theta_1, \alpha)$-evolution $(\beta(t), t \ge 0)$ is an $\mathrm{SSIP}^{(\alpha)}(\theta_1)$-evolution killed at its first hitting time at $\emptyset$.

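THEOREM 4.3. Let $\theta_1, \theta_2 \ge 0$ and $\theta = \theta_1 + \theta_2 - \alpha$. For $n \in \mathbb{N}$, let $(C^{(n)}(t), t \ge 0)$ be a $\mathrm{PCRP}^{(\alpha)}(\theta_1, \theta_2)$ starting from $C^{(n)}(0) = \gamma^{(n)}$ and killed at $\zeta^{(n)} = \inf\{t \ge 0 \colon C^{(n)}(t) = \emptyset\}$. Let $(\beta(t), t \ge 0)$ be an $\mathrm{SSIP}^{(\alpha)}_\dagger(\theta_1, \theta_2)$-evolution starting from $\gamma$ and $\zeta = \inf\{t \ge 0 \colon \beta(t) = \emptyset\}$. Suppose that $\frac{1}{n} \gamma^{(n)}$ converges in distribution to $\gamma$ as $n \to \infty$, under $d_H$. Then the following convergence holds in $D(\mathbb{R}_+, \mathcal{I}_H)$:

$$\Big( \frac{1}{n} C^{(n)} \big( (2nt) \wedge \zeta^{(n)} \big),\ t \ge 0 \Big) \underset{n \to \infty}{\longrightarrow} (\beta(t), t \ge 0) \quad \text{in distribution.}$$

Moreover, $\zeta^{(n)}/2n$ converges to $\zeta$ in distribution jointly.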

PROOF OF THEOREM 4.3. We shall construct a sequence of $\mathrm{PCRP}^{(\alpha)}(\theta_1, \theta_2)$ on a sufficiently large probability space by using $\mathrm{PCRP}^{(\alpha)}(\theta_1, \alpha)$, $\mathrm{PCRP}^{(\alpha)}(\alpha, \theta_2)$ and up-down chains of law $\pi_k(-\alpha)$ defined in (16); the idea is similar to Definition 4.1. Then the convergences obtained in Proposition 3.15 and Lemmas 3.8–3.9 permit us to conclude.

By assumption, $\frac{1}{n} C^{(n)}(0)$ converges in distribution to $\gamma \in \mathcal{I}_H$ under the metric $d_H$. Except for some degenerate cases when $\gamma = \emptyset$, we may use Skorokhod representation and Lemma 3.14 to find $\big( C^{(n)}_1(0), m^{(n)}(0), C^{(n)}_2(0) \big)$ for all $n$ sufficiently large, with $m^{(n)}(0) \ge 1$ and $C^{(n)}_1(0), C^{(n)}_2(0) \in \mathcal{C}$, such that $C^{(n)}_1(0) \star \{(0, m^{(n)}(0))\} \star C^{(n)}_2(0) = C^{(n)}(0)$, and that, as $n \to \infty$,

$$\Big( \frac{1}{n} C^{(n)}_1(0),\ \frac{1}{n} m^{(n)}(0),\ \frac{1}{n} C^{(n)}_2(0) \Big) \to (\gamma_1, m, \gamma_2) := \phi(\gamma) \quad \text{a.s.,} \tag{33}$$

where $\phi$ is the function defined by (32).

For every $n \in \mathbb{N}$, let $f^{(n,0)} \sim \pi_{m^{(n)}(0)}(-\alpha)$ be as in (16), $\gamma^{(n,0)}_1$ a $\mathrm{PCRP}^{(\alpha)}(\theta_1, \alpha)$ starting from $C^{(n)}_1(0)$ and $\gamma^{(n,0)}_2$ a $\mathrm{PCRP}^{(\alpha)}(\alpha, \theta_2)$ starting from $C^{(n)}_2(0)$; the three processes $\gamma^{(n,0)}_1$, $f^{(n,0)}$ and $\gamma^{(n,0)}_2$ are independent. By Proposition 3.15, Lemma 3.8 and Skorokhod representation, we may assume that a.s.

$$\Big( \frac{1}{n} \gamma^{(n,0)}_1(2n\,\cdot),\ \frac{1}{n} f^{(n,0)}(2n\,\cdot),\ \frac{\zeta(f^{(n,0)})}{2n},\ \frac{1}{n} \gamma^{(n,0)}_2(2n\,\cdot) \Big) \to \big( \gamma^{(0)}_1, f^{(0)}, \zeta(f^{(0)}), \gamma^{(0)}_2 \big). \tag{34}$$

The limiting triple process $(\gamma^{(0)}_1, f^{(0)}, \gamma^{(0)}_2)$ starting from $(\gamma_1, m, \gamma_2)$ can serve as that in the construction of $\beta$ in Definition 4.1. Write $T_1 = \zeta(f^{(0)})$ and $T_{n,1} = \zeta(f^{(n,0)})$, then

$$\frac{1}{n} \gamma^{(n,0)}_1(T_{n,1}) \star \frac{1}{n} \gamma^{(n,0)}_2(T_{n,1}) \to \gamma^{(0)}_1(T_1) \star \gamma^{(0)}_2(T_1) =: \beta(T_1) \quad \text{a.s.}$$

With $\phi$ the function defined in (32), set

$$\big( C^{(n,1)}_1, m^{(n,1)}, C^{(n,1)}_2 \big) := \phi \big( \gamma^{(n,0)}_1(T_{n,1}) \star \gamma^{(n,0)}_2(T_{n,1}) \big).$$

Since $T_1$ is independent of $(\gamma^{(0)}_1, \gamma^{(0)}_2)$, $\beta(T_1)$ a.s. has a unique largest block. By this observation and (34) we have $\frac{1}{n} (C^{(n,1)}_1, m^{(n,1)}, C^{(n,1)}_2) \to \phi(\beta(T_1))$, since $\phi$ is continuous at any interval partition whose longest block is unique.

For each $n \ge 1$, if $(C^{(n,1)}_1, m^{(n,1)}, C^{(n,1)}_2) = (\emptyset, 0, \emptyset)$, then for every $i \ge 1$, we set $T_{n,i} := T_{n,1}$ and $\big( \gamma^{(n,i)}_1, f^{(n,i)}, \gamma^{(n,i)}_2 \big) :\equiv (\emptyset, 0, \emptyset)$. If $(C^{(n,1)}_1, m^{(n,1)}, C^{(n,1)}_2) \ne (\emptyset, 0, \emptyset)$, then conditionally on the history, let $f^{(n,1)} \sim \pi_{m^{(n,1)}}(-\alpha)$, and consider $\gamma^{(n,1)}_1$, a $\mathrm{PCRP}^{(\alpha)}(\theta_1, \alpha)$ starting from $C^{(n,1)}_1$, and $\gamma^{(n,1)}_2$, a $\mathrm{PCRP}^{(\alpha)}(\alpha, \theta_2)$ starting from $C^{(n,1)}_2$, independent of each other. Set $T_{n,2} = T_{n,1} + \zeta(f^{(n,1)})$. Again, by Proposition 3.15, Lemma 3.8 and Skorokhod representation, we may assume that a similar a.s. convergence as in (34) holds for $(\gamma^{(n,1)}_1, f^{(n,1)}, \zeta(f^{(n,1)}), \gamma^{(n,1)}_2)$.

By iterating the arguments above, we finally obtain for every $n \ge 1$ a sequence of processes $(\gamma^{(n,i)}_1, f^{(n,i)}, \gamma^{(n,i)}_2)_{i \ge 0}$ with renaissance levels $(T_{n,i})_{i \ge 0}$, such that, inductively, for every $k \ge 0$, the following a.s. convergence holds:

$$\Big( \frac{1}{n} \gamma^{(n,k)}_1(2n\,\cdot),\ \frac{1}{n} f^{(n,k)}(2n\,\cdot),\ \frac{\zeta(f^{(n,k)})}{2n},\ \frac{1}{n} \gamma^{(n,k)}_2(2n\,\cdot) \Big) \to \big( \gamma^{(k)}_1, f^{(k)}, \zeta(f^{(k)}), \gamma^{(k)}_2 \big). \tag{35}$$

Using the limiting processes $\big( \gamma^{(k)}_1, f^{(k)}, \gamma^{(k)}_2 \big)_{k \ge 0}$, we build according to Definition 4.1 an $\mathrm{SSIP}^{(\alpha)}_\dagger(\theta_1, \theta_2)$-evolution $\beta = (\beta(t), t \ge 0)$, starting from $\gamma$, with renaissance levels $T_k = \sum_{i=0}^{k-1} \zeta(f^{(i)})$ and $T_\infty = \lim_{k \to \infty} T_k$.

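Then for every $t \ge 0$ and $k \in \mathbb{N}$, on the event $\{T_k > t\}$ we have by (35) the a.s. convergence of the process $(\frac{1}{n} C^{(n)}(2ns), s \le t) \to (\beta(s), s \le t)$. When $\theta \ge 1$, since the event $\{T_\infty = \infty\} = \bigcap_{t \in \mathbb{N}} \bigcup_{k \in \mathbb{N}} \{T_k > t\}$ has probability one by Theorem 4.2, the convergence in Theorem 4.3 holds a.s.

We now turn to the case $\theta < 1$, where we have by Theorem 4.2 that $T_\infty < \infty$ a.s. and that, for any $\varepsilon > 0$, there exists $K \in \mathbb{N}$ such that

$$P \Big( \sup_{t \ge T_K} \|\beta(t)\| > \varepsilon \Big) < \varepsilon \quad \text{and} \quad P \big( T_\infty > T_K + \varepsilon \big) < \varepsilon. \tag{36}$$

For each $n \in \mathbb{N}$, consider the concatenation

$$C^{(n)}(t) = \begin{cases} \gamma^{(n,i)}_1(t - T_{n,i}) \star \big\{ \big( 0, f^{(n,i)}(t - T_{n,i}) \big) \big\} \star \gamma^{(n,i)}_2(t - T_{n,i}), & t \in [T_{n,i}, T_{n,i+1}),\ i \le K - 1, \\ \widetilde{C}^{(n)}(t - T_{n,K}), & t \ge T_{n,K}, \end{cases}$$

where $\widetilde{C}^{(n)}$ is a $\mathrm{PCRP}^{(\alpha)}(\theta_1, \theta_2)$ starting from $C^{(n)}(T_{n,K}-)$ and killed at $\emptyset$, independent of the history. Then $C^{(n)}$ is a $\mathrm{PCRP}^{(\alpha)}(\theta_1, \theta_2)$ starting from $C^{(n)}(0)$. We shall next prove that its rescaled process converges to $(\beta(t), t \ge 0)$ in probability, which completes the proof. By Lemmas 3.8–3.9, under the locally uniform topology

$$\Big( \frac{1}{n} \big\| \widetilde{C}^{(n)}(2n\,\cdot) \big\|,\ \frac{1}{2n} \zeta(\widetilde{C}^{(n)}) \Big) \underset{n \to \infty}{\longrightarrow} \Big( \|\beta(\cdot + T_K)\|,\ \zeta \big( \beta(\cdot + T_K) \big) \Big) \quad \text{in distribution.}$$

By the convergence (35), there exists $N \in \mathbb{N}$ such that for every $n > N$, we have

$$P \Big( \sup_{s \in [0, T_K]} d_H \Big( \frac{1}{n} C^{(n)}(2ns), \beta(s) \Big) > \varepsilon \Big) < \varepsilon \quad \text{and} \quad P \Big( \Big| \frac{1}{2n} T_{n,K} - T_K \Big| > \varepsilon \Big) < \varepsilon. \tag{37}$$

Furthermore, by the convergence of $\frac{1}{n} \|\widetilde{C}^{(n)}\|$, there exists $\widetilde{N} \in \mathbb{N}$ such that for every $n > \widetilde{N}$,

$$P \Big( \sup_{s \ge 0} \frac{1}{n} \big\| \widetilde{C}^{(n)}(s) \big\| > \varepsilon \Big) < \varepsilon \quad \text{and} \quad P \Big( \Big| \frac{1}{2n} \zeta(\widetilde{C}^{(n)}) - \zeta \big( \beta(\cdot + T_K) \big) \Big| > \varepsilon \Big) < \varepsilon. \tag{38}$$

Summarising (36) and (38), for every $n > \widetilde{N}$, we have

$$P \Big( \sup_{s \in [0, \infty)} d_H \Big( \frac{1}{n} \widetilde{C}^{(n)}(2ns), \beta(s + T_K) \Big) > 3\varepsilon \Big) \le 3\varepsilon.$$

Together with (37), this leads to the desired convergence in probability.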

4.2. Pseudo-stationarity of SSIP†-evolutions.

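PROPOSITION 4.4 (Pseudo-stationary distribution of an $\mathrm{SSIP}^{(\alpha)}_\dagger(\theta_1, \theta_2)$-evolution). For $\theta_1, \theta_2 \ge 0$ and $\theta := \theta_1 + \theta_2 - \alpha$, let $(Z(t), t \ge 0)$ be a $\mathrm{BESQ}(2\theta)$ killed at zero with $Z(0) > 0$, independent of $\bar\gamma \sim \mathrm{PDIP}^{(\alpha)}(\theta_1, \theta_2)$. Let $(\beta(t), t \ge 0)$ be an $\mathrm{SSIP}^{(\alpha)}_\dagger(\theta_1, \theta_2)$-evolution starting from $Z(0)\bar\gamma$. Fix any $t \ge 0$, then $\beta(t)$ has the same distribution as $Z(t)\bar\gamma$.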

Analogous results for $\mathrm{SSIP}^{(\alpha)}(\theta_1)$-evolutions have been obtained in [14, 18]; however, the strategy used in their proofs does not easily apply to our three-parameter model. We shall use a completely different method, which crucially relies on the discrete approximation by $\mathrm{PCRP}^{(\alpha)}(\theta_1, \theta_2)$ in Theorem 4.3. It is easy to see that the total mass of a $\mathrm{PCRP}^{(\alpha)}(\theta_1, \theta_2)$ evolves according to a Markov chain defined by $\pi(\theta)$ as in (16), with $\theta = \theta_1 + \theta_2 - \alpha$. Conversely, given any $C(0) \in \mathcal{C}$ and $Z \sim \pi_{\|C(0)\|}(\theta)$, we can embed a process $(C(t), t \ge 0) \sim \mathrm{PCRP}^{(\alpha)}(\theta_1, \theta_2)$, starting from $C(0)$, such that its total-mass evolution is $Z$. More precisely, in the language of the Chinese restaurant process, at each jump time when $Z$ increases by one, add a customer according to the seating rule in Definition 2.1; and whenever $Z$ decreases by one, perform a down-step, i.e. one uniformly chosen customer leaves. It is easy to check that this process indeed satisfies the definition of $\mathrm{PCRP}^{(\alpha)}(\theta_1, \theta_2)$ in the introduction. Recall the probability law $\mathrm{oCRP}^{(\alpha)}_m(\theta_1, \theta_2)$ from Definition 2.1.
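The claim about the total mass can be checked on the jump rates recalled in the introduction: join rates $m - \alpha$, new-table rates $\theta_1$, $\theta_2$ and $\alpha$, and departure rate 1 per customer. The sketch below is illustrative only; the function names, the list representation of a composition and the restriction to non-empty compositions are assumptions made here, and the behaviour at the empty state is not modelled.

```python
import random
from typing import List, Tuple


def events(tables: List[int], alpha: float, theta1: float, theta2: float):
    """List ((kind, position), rate) for every possible jump of a non-empty composition."""
    k = len(tables)
    ev = [(("join", i), tables[i] - alpha) for i in range(k)]   # join table i
    ev.append((("new", 0), theta1))                             # new leftmost table
    ev.append((("new", k), theta2))                             # new rightmost table
    ev += [(("new", i), alpha) for i in range(1, k)]            # new table in a gap
    ev += [(("leave", i), float(tables[i])) for i in range(k)]  # each customer leaves at rate 1
    return ev


def total_up_rate(tables: List[int], alpha: float, theta1: float, theta2: float) -> float:
    """Total rate of up-steps; equals (number of customers) + theta1 + theta2 - alpha."""
    return sum(r for (kind, _), r in events(tables, alpha, theta1, theta2) if kind != "leave")


def jump(tables: List[int], alpha: float, theta1: float, theta2: float,
         rng: random.Random) -> List[int]:
    """Sample one transition of the embedded jump chain and return the new composition."""
    ev = events(tables, alpha, theta1, theta2)
    kind, i = rng.choices([e for e, _ in ev], weights=[r for _, r in ev])[0]
    new = list(tables)
    if kind == "join":
        new[i] += 1
    elif kind == "new":
        new.insert(i, 1)
    else:
        new[i] -= 1
        if new[i] == 0:       # an emptied table is removed
            del new[i]
    return new
```

Summing the up-rates over $k$ tables with $m$ customers in total gives $(m - k\alpha) + \theta_1 + \theta_2 + (k-1)\alpha = m + \theta$, so the total number of customers goes up at rate $m + \theta$ and down at rate $m$, whatever the arrangement of the tables.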

LEMMA 4.5 (Marginal distribution of a $\mathrm{PCRP}^{(\alpha)}(\theta_1, \theta_2)$). Consider a $\mathrm{PCRP}^{(\alpha)}(\theta_1, \theta_2)$ $(C(t), t \ge 0)$ starting from $C(0) \sim \mathrm{oCRP}^{(\alpha)}_m(\theta_1, \theta_2)$ with $m \in \mathbb{N}_0$. Then, at any time $t \ge 0$, $C(t)$ has a mixture distribution $\mathrm{oCRP}^{(\alpha)}_{\|C(t)\|}(\theta_1, \theta_2)$, where the total number of customers has distribution $(\|C(t)\|, t \ge 0) \sim \pi_m(\theta)$ with $\theta := \theta_1 + \theta_2 - \alpha$.

PROOF. Let $Z \sim \pi_{\|C(0)\|}(\theta)$ and consider $(C(t), t \ge 0) \sim \mathrm{PCRP}^{(\alpha)}(\theta_1, \theta_2)$, starting from $C(0)$, as a process embedded in $Z$ in the way explained above. Before the first jump time $J_1$ of $Z$, $C(t) = C(0) \sim \mathrm{oCRP}^{(\alpha)}_m(\theta_1, \theta_2)$ by assumption. At the first jump time $J_1$ of $Z$, it follows from Proposition 2.5 that, given $Z(J_1)$, $C(J_1)$ has conditional distribution $\mathrm{oCRP}^{(\alpha)}_{Z(J_1)}(\theta_1, \theta_2)$. The proof is completed by induction.

PROOF OF PROPOSITION 4.4. For $n \in \mathbb{N}$, consider a process $C^{(n)} \sim \mathrm{PCRP}^{(\alpha)}(\theta_1, \theta_2)$, starting from $C^{(n)}(0) \sim \mathrm{oCRP}^{(\alpha)}_{\lfloor nZ(0) \rfloor}(\theta_1, \theta_2)$ and killed at $\emptyset$. It follows from Lemma 4.5 that, for every $t \ge 0$, $C^{(n)}(t)$ has the mixture distribution $\mathrm{oCRP}^{(\alpha)}_{N^{(n)}(t \wedge \zeta(N^{(n)}))}(\theta_1, \theta_2)$ with $(N^{(n)}(t), t \ge 0) \sim \pi_{\lfloor nZ(0) \rfloor}(\theta)$. By Lemma 2.7, $\frac{1}{n} C^{(n)}(0)$ converges in distribution to $Z(0)\bar\gamma$ under $d_H$. For any fixed $t \ge 0$, it follows from Theorem 4.3 that $\frac{1}{n} C^{(n)}(2nt)$ converges in distribution to $\beta(t)$. Using Lemmas 3.8–3.9 and 2.7 leads to the desired statement.

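4.3. SSIP-evolutions. Let $\alpha \in (0, 1)$ and $\theta_1, \theta_2 \ge 0$. Recall that the state $\emptyset$ has been defined to be a trap of an $\mathrm{SSIP}^{(\alpha)}_\dagger(\theta_1, \theta_2)$-evolution. In this section, we will show that, for certain cases, depending on the value of $\theta := \theta_1 + \theta_2 - \alpha$, we can include $\emptyset$ as an initial state such that it leaves $\emptyset$ continuously.

More precisely, consider independent $(Z(t), t \ge 0) \sim \mathrm{BESQ}_0(2\theta)$ and $\bar\gamma \sim \mathrm{PDIP}^{(\alpha)}(\theta_1, \theta_2)$. Define for every $t \ge 0$ a probability kernel $K_t$ on $\mathcal{I}_H$: for $\beta_0 \in \mathcal{I}_H$ and measurable $A \subseteq \mathcal{I}_H$,

$$K_t(\beta_0, A) = P \big( \beta(t) \in A,\ t < \zeta(\beta) \big) + \int_0^t P \big( Z(t - r)\bar\gamma \in A \big)\, P(\zeta(\beta) \in dr), \tag{39}$$

where $\beta = (\beta(t), t \ge 0)$ is an $\mathrm{SSIP}^{(\alpha)}_\dagger(\theta_1, \theta_2)$-evolution starting from $\beta_0$, and $\zeta(\beta)$ is the first hitting time of $\emptyset$ by $\beta$. Note that [42, Corollary XI.(1.4)] yields, for fixed $s \ge 0$, that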

(40) (Z(t), t ≥ 0) ∼ BESQ0(2θ), θ > 0, ⇒ Z(s) ∼ Gamma(θ, 1/2s).

When β0 = ∅, we have by convention ζ(β) = 0 and the first term in (39) vanishes.

THEOREM 4.6. Let θ1, θ2 ≥ 0. The family (Kt, t ≥ 0) defined in (39) is the transition semigroup of a path-continuous Hunt process on the Polish space IH .

DEFINITION 4.7 ($\mathrm{SSIP}^{(\alpha)}(\theta_1, \theta_2)$-evolutions). For $\theta_1, \theta_2 \ge 0$, a path-continuous Markov process with transition semigroup $(K_t, t \ge 0)$ is called an $\mathrm{SSIP}^{(\alpha)}(\theta_1, \theta_2)$-evolution.

PROPOSITION 4.8. For $\theta_1, \theta_2 \ge 0$, let $(\beta(t), t \ge 0)$ be a Markov process with transition semigroup $(K_t, t \ge 0)$. Then the total mass $(\|\beta(t)\|, t \ge 0)$ is a $\mathrm{BESQ}(2\theta)$ with $\theta = \theta_1 + \theta_2 - \alpha$.

PROOF. We know from Theorem 4.2 that the total mass of an $\mathrm{SSIP}^{(\alpha)}_\dagger(\theta_1, \theta_2)$-evolution evolves according to a $\mathrm{BESQ}(2\theta)$ killed at zero. Therefore, the description in (39) implies that $(\|\beta(t)\|, t \ge 0)$ has the semigroup of $\mathrm{BESQ}(2\theta)$.

The proof of Theorem 4.6 is postponed to Section 4.5. We distinguish three phases:

• $\theta \in [-\alpha, 0]$: by convention, $Z \sim \mathrm{BESQ}_0(2\theta)$ is the constant zero process and thus the second term in (39) vanishes; then $(K_t, t \ge 0)$ is just the semigroup of an $\mathrm{SSIP}^{(\alpha)}_\dagger(\theta_1, \theta_2)$-evolution. In this case Theorem 4.6 is encompassed by Theorem 4.2.
• $\theta \in (0, 1)$: by Theorem 4.2(ii) and [23, Equation (13)] we deduce that $\zeta(\beta)$ is a.s. finite in (39), with $\zeta(\beta) \overset{d}{=} \|\beta(0)\| / 2G$, where $G \sim \mathrm{Gamma}(1 - \theta, 1)$. In this case, we will construct an $\mathrm{SSIP}^{(\alpha)}(\theta_1, \theta_2)$-evolution as a recurrent extension of $\mathrm{SSIP}^{(\alpha)}_\dagger(\theta_1, \theta_2)$-evolutions, by using an excursion measure that will be introduced in Section 4.4.
• $\theta \ge 1$: since $\zeta(\beta) = \infty$ a.s., the second term in (39) vanishes unless $\beta(0) = \emptyset$, and $\emptyset$ is an entrance boundary with an entrance law $K_t(\emptyset, \cdot\,) = P(Z(t)\bar\gamma \in \cdot\,)$, by Proposition 4.4. See also [47, Proposition 5.11], where this was shown using a different construction and a different formulation of the entrance law, which is seen to be equivalent to (39) by writing $\bar\gamma = B' \big( V' \bar\gamma_1 \star \{(0, 1 - V')\} \big) \star (1 - B') \bar\beta \sim \mathrm{PDIP}^{(\alpha)}(\theta_1, \theta_2)$ as in Corollary 2.8.
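The three phases depend on the parameters only through $\theta = \theta_1 + \theta_2 - \alpha$. The following is a minimal sketch of that case distinction, together with a sampler for the hitting time $\zeta(\beta)$ in the middle phase based on the identity $\zeta(\beta) \overset{d}{=} \|\beta(0)\|/2G$ quoted above; the function names and the returned labels are illustrative assumptions, not terminology fixed by the paper.

```python
import random


def boundary_phase(alpha: float, theta1: float, theta2: float) -> str:
    """Classify the behaviour at the empty partition by theta = theta1 + theta2 - alpha."""
    theta = theta1 + theta2 - alpha
    if theta <= 0:
        return "absorbing"            # the evolution coincides with the killed one
    if theta < 1:
        return "recurrent extension"  # built from the excursion measure
    return "entrance boundary"        # the empty state is never hit from outside


def sample_hitting_time(mass: float, alpha: float, theta1: float, theta2: float,
                        rng: random.Random) -> float:
    """Sample zeta = mass / (2 G) with G ~ Gamma(1 - theta, 1), for theta in (0, 1) only."""
    theta = theta1 + theta2 - alpha
    if not 0 < theta < 1:
        raise ValueError("the sampler is only set up for theta in (0, 1)")
    g = rng.gammavariate(1 - theta, 1.0)  # shape 1 - theta, scale 1
    return mass / (2 * g)
```

For instance, with $\alpha = 0.5$ the parameter pairs $(0, 0)$, $(0.5, 0.5)$ and $(1, 0.5)$ fall into the first, second and third phase respectively.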

4.4. The excursion measure of an SSIP-evolution when θ ∈ (−α, 1). In this section, we (α) fix θ1, θ2 ≥ 0 and suppose that −α < θ=θ1+θ2−α<1. We shall construct an SSIP (θ1, θ2) (α) excursion measure Θ := Θ (θ1, θ2), which is a σ-finite measure on the space C([0, ∞), IH ) of continuous functions in (IH , dH ), endowed with the uniform metric and the Borel σ- algebra. Our construction is in line with Pitman and Yor [40, (3.2)], by the following steps.

• For each t > 0, define a measure Nt on IH by

h θ−1 i (41) Nt(A) := E (Z(t)) 1A(Z(t)¯γ) , measurable A ⊆ IH \ {∅},

Nt(∅) := ∞, (α) where Z = (Z(t), t ≥ 0) ∼ BESQ0(4 − 2θ) and γ¯ ∼ PDIP (θ1, θ2) are independent. As θ−1 1−θ 4 − 2θ > 2, the process Z never hits zero. We have Nt(IH \ {∅}) = t /2 Γ(2 − θ). (α) Then (Nt, t ≥ 0) is an entrance law for an SSIP† (θ1, θ2)-evolution (β(t), t ≥ 0). In- deed, with notation above, we have by Proposition 4.4 that, for every s, t ≥ 0 and f non- negative measurable, Z h θ−1 i E [f(β(s)) | β(0) = γ] Nt(dγ) = E (Z(t)) EZ(t)¯γ [f(β(s))]

h 0 θ−1  0 i = E (Z (0)) EZ0(0) f(Z (s)¯γ) , where (Z0(s), s ≥ 0) is a BESQ(2θ) killed at zero with Z0(0) = Z(t). Since we know from the duality property of BESQ(2θ), see e.g. [40, (3.b) and (3.5)], that 0 θ−1  0   θ−1 (Z (0)) EZ0(0) g(Z (s)) = EZ0(0) g(Ze(s))(Ze(s)) , ∀s > 0, where Ze ∼ BESQ(4 − 2θ) starting from Z0(0), it follows from the Markov property that

h 0 θ−1 0 i h h θ−1 ii E (Z (0)) EZ0(0)[f(Z (s)¯γ)] = E EZ0(0) (Ze(s)) f(Ze(s)¯γ) Z h θ−1 i = E (Z(t + s)) f(Z(t + s)¯γ) = f(γ)Nt+s(dγ). 30 Q. SHI AND M. WINKEL

We conclude that Z Z   E f(β(s)) | β(0) = γ Nt(dγ) = f(γ)Nt+s(dγ), ∀s, t ≥ 0.

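• As a consequence, there exists a unique $\sigma$-finite measure $\Theta$ on $C((0, \infty), \mathcal{I}_H)$ such that for all $t > 0$ and $F$ bounded measurable functional, we have the identity
$$\Theta[F \circ L_t] = \int E \big[ F(\beta(s), s \ge 0) \mid \beta(0) = \gamma \big] N_t(d\gamma), \tag{42}$$
where $(\beta(s), s \ge 0)$ is an $\mathrm{SSIP}^{(\alpha)}_\dagger(\theta_1, \theta_2)$-evolution and $L_t$ stands for the shift operator. See [45, VI.48] for details. In particular, for each $t > 0$ and measurable $A \subseteq \mathcal{I}_H \setminus \{\emptyset\}$, we have the identity $\Theta\{(\beta(s), s > 0) \in C((0, \infty), \mathcal{I}_H) \colon \beta(t) \in A\} = N_t(A)$. Consequently,
$$\Theta(\zeta > t) = \Theta \big\{ (\beta(s), s > 0) \in C((0, \infty), \mathcal{I}_H) \colon \beta(t) \ne \emptyset \big\} = t^{\theta - 1} / \big( 2^{1 - \theta} \Gamma(2 - \theta) \big). \tag{43}$$
• The image of $\Theta$ by the mapping $(\beta(t), t > 0) \mapsto (\|\beta(t)\|, t > 0)$ is equal to the push-forward of $\Lambda^{(2\theta)}_{\mathrm{BESQ}}$ from $C([0, \infty), \mathcal{I}_H)$ to $C((0, \infty), \mathcal{I}_H)$ under the restriction map, where $\Lambda^{(2\theta)}_{\mathrm{BESQ}}$ is the excursion measure of $\mathrm{BESQ}(2\theta)$ as in (10). In particular, we have for $\Theta$-almost every $(\beta(t), t > 0) \in C((0, \infty), \mathcal{I}_H)$
$$\limsup_{t \downarrow 0} \|\beta(t)\| = 0 \implies \lim_{t \downarrow 0} d_H(\beta(t), \emptyset) = 0. \tag{44}$$
Therefore, we can “extend” $\Theta$ to $C([0, \infty), \mathcal{I}_H)$, by defining $\beta(0) = \emptyset$ for $\Theta$-almost every $(\beta(t), t > 0) \in C((0, \infty), \mathcal{I}_H)$, and we also set
$$\Theta \big\{ \beta \in C([0, \infty), \mathcal{I}_H) \colon \beta \equiv \emptyset \big\} = 0. \tag{45}$$

Summarising, we have the following statement.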

PROPOSITION 4.9. Let $\theta_1, \theta_2 \ge 0$ and suppose that $-\alpha < \theta = \theta_1 + \theta_2 - \alpha < 1$. Then there is a unique $\sigma$-finite measure $\Theta = \Theta^{(\alpha)}(\theta_1, \theta_2)$ on $C([0, \infty), \mathcal{I}_H)$ that satisfies (42) and (45). Moreover, the image of $\Theta$ by the mapping $(\beta(t), t \ge 0) \mapsto (\|\beta(t)\|, t \ge 0)$ is $\Lambda^{(2\theta)}_{\mathrm{BESQ}}$, the excursion measure of $\mathrm{BESQ}(2\theta)$.

For the case $\theta_1 = \theta_2 = 0$, the law $\mathrm{PDIP}^{(\alpha)}(\theta_1, \theta_2)$ coincides with the Dirac mass $\delta_{\{(0,1)\}}$. As a consequence, the $\mathrm{SSIP}^{(\alpha)}(0, 0)$ excursion measure is just the pushforward of $\Lambda^{(2\theta)}_{\mathrm{BESQ}}$ by the map $x \mapsto \{(0, x)\}$ from $[0, \infty)$ to $\mathcal{I}_H$. When $\theta_1 = 0$ and $\theta_2 = \alpha$, it is easy to check using [18, Proposition 2.12(i), Lemma 3.5(ii), Corollary 3.9] that $2\alpha \Theta^{(\alpha)}(0, \alpha)$ is the push-forward via the mapping SKEWER in Definition 3.1 of the measure $\nu^{(\alpha)}_{\perp\mathrm{cld}}$ studied in [18, Section 2.3].
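The total mass of $N_t$ used in the construction of $\Theta$ can be checked directly. The following is a sketch of the computation, assuming (40) is applied with $\theta$ replaced by $2 - \theta$, so that $Z(t) \sim \mathrm{Gamma}(2 - \theta, 1/2t)$ for $Z \sim \mathrm{BESQ}_0(4 - 2\theta)$, with $1/2t$ read as a rate parameter.

```latex
\begin{align*}
N_t\bigl(\mathcal{I}_H \setminus \{\emptyset\}\bigr)
 = E\bigl[(Z(t))^{\theta-1}\bigr]
 &= \int_0^\infty x^{\theta-1}\,
    \frac{(2t)^{-(2-\theta)}}{\Gamma(2-\theta)}\, x^{1-\theta} e^{-x/2t}\, dx \\
 &= \frac{(2t)^{\theta-2}}{\Gamma(2-\theta)} \int_0^\infty e^{-x/2t}\, dx
  = \frac{(2t)^{\theta-1}}{\Gamma(2-\theta)}
  = \frac{t^{\theta-1}}{2^{1-\theta}\,\Gamma(2-\theta)} .
\end{align*}
```

The first equality uses that $Z(t) > 0$ a.s., so that $Z(t)\bar\gamma \ne \emptyset$ a.s.; the resulting expression is the one appearing in (43).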

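4.5. Recurrent extension when $\theta \in (0, 1)$. Consider the $\mathrm{SSIP}^{(\alpha)}(\theta_1, \theta_2)$ excursion measure $\Theta := \Theta^{(\alpha)}(\theta_1, \theta_2)$ and suppose that $\theta = \theta_1 + \theta_2 - \alpha \in (0, 1)$. It is well-known [46] in the theory of Markov processes that excursion measures such as $\Theta$ can be used to construct a recurrent extension of a Markov process. To this end, let $\mathbf{G} \sim \mathrm{PRM}(\mathrm{Leb} \otimes b_\theta \Theta)$, where $b_\theta = 2^{1 - \theta} \Gamma(1 - \theta) / \Gamma(\theta)$.

For every $s \ge 0$, set $\sigma_s = \int_{[0, s] \times \mathcal{I}_H} \zeta(\gamma)\, \mathbf{G}(dr, d\gamma)$. As the total mass process under $\Theta$ is the $\mathrm{BESQ}(2\theta)$ excursion measure with $\theta \in (0, 1)$, the process $(\sigma_s, s \ge 0)$ coincides with the inverse local time at zero of a $\mathrm{BESQ}(2\theta)$, which is well-known to be a subordinator. We define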

(46) β(t) = ? γs(t − σs−), t ≥ 0. points (s,γs) of G, σs−

This “concatenation” consists of at most one interval partition since $(\sigma_s, s \ge 0)$ is increasing.

PROPOSITION 4.10. The process (β(t), t ≥ 0) of (46) is a path-continuous Hunt process with transition semigroup (Kt, t ≥ 0). Its total mass process (kβ(t)k, t ≥ 0) is a BESQ(2θ).

PROOF. We can use [46, Theorem 4.1], since we have the following properties:

• an $\mathrm{SSIP}^{(\alpha)}_\dagger(\theta_1, \theta_2)$-evolution is a Hunt process;
• $\Theta$ is concentrated on $\{\gamma \in C([0, \infty), \mathcal{I}_H) \colon 0 < \zeta(\gamma) < \infty,\ \gamma(t) = \emptyset \text{ for all } t \ge \zeta(\gamma)\}$;
• for any $a > 0$, we have $\Theta\{\gamma \in C([0, \infty), \mathcal{I}_H) \colon \sup_{t \ge 0} \|\gamma(t)\| \ge a\} < \infty$;
• $\int (1 - e^{-\zeta(\gamma)})\, b_\theta \Theta(d\gamma) = 1$;
• (42) holds;
• $\Theta$ is infinite and $\Theta\{\gamma \in C([0, \infty), \mathcal{I}_H) \colon \gamma(0) \ne \emptyset\} = 0$.

It follows that $(\beta(t), t \ge 0)$ is a Borel right Markov process with transition semigroup $(K_t, t \ge 0)$. Moreover, the total mass process $(\|\beta(t)\|, t \ge 0)$ evolves according to a $\mathrm{BESQ}(2\theta)$ by Proposition 4.9.

In fact, $(\beta(t), t \ge 0)$ a.s. has continuous paths. Fix any path on the almost sure event that the total mass process $(\|\beta(t)\|, t \ge 0)$ and all excursions $\gamma_s$ are continuous. For any $t \ge 0$, if $\sigma_{s-} < t < \sigma_s$ for some $s \ge 0$, then the continuity at $t$ follows from that of $\gamma_s$. For any other $t$, we have $\beta(t) = \emptyset$ and the continuity at $t$ follows from the continuity of the $\mathrm{BESQ}(2\theta)$ total mass process. This completes the proof.

We are now ready to give the proof of Theorem 4.6, which claims that (Kt, t ≥ 0) defined in (39) is the transition semigroup of a path-continuous Hunt process.

PROOF OF THEOREM 4.6. When $\theta \in (0, 1)$, this is proved by Proposition 4.10. When $\theta \le 0$, the state $\emptyset$ is absorbing, and an $\mathrm{SSIP}^{(\alpha)}(\theta_1, \theta_2)$-evolution coincides with an $\mathrm{SSIP}^{(\alpha)}_\dagger(\theta_1, \theta_2)$-evolution. For $\theta \ge 1$, the state $\emptyset$ is inaccessible, but an entrance boundary of the $\mathrm{SSIP}^{(\alpha)}(\theta_1, \theta_2)$-evolution, see also [47, Proposition 5.11]. For these cases, the proof is completed by Theorem 4.2; the only modification is when starting from $\emptyset$. Specifically, the modified semigroup is still measurable. Right-continuity starting from $\emptyset$ follows from the continuity of the total mass process, and this entails the strong Markov property by the usual approximation argument.

4.6. Proofs of results in the introduction. We will first prove Theorem 1.1 and identify the limiting diffusion in Theorem 1.1 with an SSIP-evolution as defined in Definition 4.7. Then we complete proofs of the other results in the introduction.

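LEMMA 4.11. Let $\alpha \in (0, 1)$, $\theta_1, \theta_2 \ge 0$ and $\theta = \theta_1 + \theta_2 - \alpha$. For $n \in \mathbb{N}$, let $(C^{(n)}(t), t \ge 0)$ be a $\mathrm{PCRP}^{(\alpha)}(\theta_1, \theta_2)$ starting from $C^{(n)}(0) = \gamma^{(n)}$. If $\frac{1}{n} \gamma^{(n)}$ converges to $\emptyset$ under $d_H$, then for any $t \ge 0$,

$$\frac{1}{n} C^{(n)}(2nt) \to Z(t)\bar\gamma \quad \text{in distribution,}$$

where $(Z(t), t \ge 0) \sim \mathrm{BESQ}_0(2\theta)$ and $\bar\gamma \sim \mathrm{PDIP}^{(\alpha)}(\theta_1, \theta_2)$ are independent. In particular, this limit is constant $\emptyset$ when $\theta \le 0$.

PROOF. We start with the case when $\theta < 1$. Let $\zeta^{(n)}$ be the hitting time of $\emptyset$ for $C^{(n)}$. For any $\varepsilon > 0$, for all $n$ large enough, $\zeta^{(n)}$ is stochastically dominated by the hitting time of zero of an up-down chain $Z^{(n)} \sim \pi(2\theta)$ starting from $\lfloor n\varepsilon \rfloor$, which, rescaled by $1/2n$, converges in distribution by Lemmas 3.8 and 3.9 to $\varepsilon/2G$ with $G$ a Gamma variable. Letting $\varepsilon \to 0$, we conclude that $\zeta^{(n)}/2n \to 0$ in probability as $n \to \infty$.

For any $t > 0$ and any bounded continuous function $f$ on $\mathcal{I}_H$, we have

$$E \Big[ f \Big( \frac{1}{n} C^{(n)}(2nt) \Big) \Big] = E \Big[ \mathbf{1}\{\zeta^{(n)} \le 2nt\}\, f \Big( \frac{1}{n} \widetilde{C}^{(n)} \big( 2nt - \zeta^{(n)} \big) \Big) \Big] + E \Big[ \mathbf{1}\{\zeta^{(n)} > 2nt\}\, f \Big( \frac{1}{n} C^{(n)}(2nt) \Big) \Big], \tag{47}$$

where $\widetilde{C}^{(n)}(s) = C^{(n)}(s + \zeta^{(n)})$, $s \ge 0$. As $n \to \infty$, since $\zeta^{(n)}/2n \to 0$ in probability, the second term tends to zero. By the strong Markov property and Lemma 4.5, $\widetilde{C}^{(n)}(s)$ has the mixture distribution $\mathrm{oCRP}^{(\alpha)}_{\|\widetilde{C}^{(n)}(s)\|}(\theta_1, \theta_2)$. Since $\|C^{(n)}(2nt)\|/n \to Z(t)$ in distribution by Lemma 3.8, we deduce by Lemma 2.7 that the first term tends to $E[f(Z(t)\bar\gamma)]$, as desired.

For $\theta \ge 1$, at least one of $\theta_1 \ge \alpha$ or $\theta_2 \ge \alpha$ holds. Say, $\theta_1 \ge \alpha$. We may assume that $C^{(n)}(t) = C^{(n)}_1(t) \star C^{(n)}_0(t) \star C^{(n)}_2(t)$ for independent $(C^{(n)}_1(t), t \ge 0) \sim \mathrm{PCRP}^{(\alpha)}(\theta_1, 0)$ starting from $\emptyset$, $(C^{(n)}_0(t), t \ge 0) \sim \mathrm{PCRP}^{(\alpha)}(\alpha, 0)$ starting from $C^{(n)}(0)$, and $(C^{(n)}_2(t), t \ge 0) \sim \mathrm{PCRP}^{(\alpha)}(\alpha, \theta_2)$ starting from $\emptyset$. For the middle term $C^{(n)}_0$, the $\theta \le 0$ case yields that $\frac{1}{n} C^{(n)}_0(2nt) \to \emptyset$ in distribution. For the other two, applying (40) and Lemmas 4.5, 3.8 and 2.7 yields $\frac{1}{n} C^{(n)}_1(2nt) \to Z_1(t)\bar\gamma_1$ in distribution, with $Z_1(t) \sim \mathrm{Gamma}(\theta_1 - \alpha, 1/2t)$ and $\bar\gamma_1 \sim \mathrm{PDIP}^{(\alpha)}(\theta_1, 0)$, and $\frac{1}{n} C^{(n)}_2(2nt) \to Z_2(t)\bar\gamma_2$ in distribution, with $Z_2(t) \sim \mathrm{Gamma}(\theta_2, 1/2t)$ and $\bar\gamma_2 \sim \mathrm{PDIP}^{(\alpha)}(\alpha, \theta_2)$. We complete the proof by applying the decomposition (7).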

(α) PROOFOF THEOREM 1.1. When θ ≤ 0, the state ∅ is absorbing, and an SSIP (θ1, θ2)- (α) evolution coincides with an SSIP† (θ1, θ2)-evolution. For this case the proof is completed by Theorem 4.3. So we shall only consider θ > 0 and prove that the limiting diffusion is given by an (α) SSIP (θ1, θ2)-evolution β = (β(t), t ≥ 0) with ζ(β) = inf{t ≥ 0: β(t) = ∅} as defined in Definition 4.7. It suffices to prove the convergence in D([0,T ], IH ) for a fixed T > 0. The convergence of finite-dimensional distributions follows readily from Theorem 4.3, Lemma 4.11 and the description in (39). Specifically, for θ ∈ (0, 1), we proceed as in the proof of Lemma 4.11 and see the first term in (47) converge to E[1{ζ(β) ≤ t}f(Z0(t − ζ(β))¯γ)] (α) where Z0 ∼ BESQ0(2θ) and γ¯ ∼ PDIP (θ1, θ2) are independent and jointly independent of β, while the second term converges to E[1{ζ(β) > t}f(β(t))]. For θ ≥ 1 and β(0) = ∅, con- vergence of marginals holds by Lemma 4.11 and (39). Theorem 4.3 then establishes finite- dimensional convergence, also when β(0) 6= ∅. (n) (n) 1 (n) It remains to check tightness. Let β = (β (t), t ≥ 0) := n C (2n · ). Since we al- ready know from Lemma 3.8 that the sequence of total mass processes kβ(n)k, n ≥ 1, con- verges in distribution, it is tight. For h > 0, define the modulus of continuity by

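\[ \omega\big(\|\beta^{(n)}\|, h\big) = \sup\Big\{ \big| \|\beta^{(n)}(s)\| - \|\beta^{(n)}(t)\| \big| \colon |s - t| \le h \Big\}. \]

For any $\varepsilon > 0$, the tightness implies that there exists $\Delta'$ such that for any $h \le 2\Delta'$,

\[ \limsup_{n\to\infty} \mathbb{E}\Big[ \omega\big(\|\beta^{(n)}\|, h\big) \wedge 1 \Big] < \varepsilon; \]

this is an elementary consequence of [25, Proposition VI.3.26]. See also [27, Theorem 16.5]. For $1 \le i \le \lfloor T/\Delta' \rfloor$, set $t_i = i\Delta'$ and let $\beta_i^{(n)}$ be the process obtained by shifting $\beta^{(n)}$ to start from $t_i$, killed at $\emptyset$. The convergence of the finite-dimensional distributions yields that each $\beta^{(n)}(t_i)$ converges weakly to $\beta(t_i)$. Since $\beta(t_i) \ne \emptyset$ a.s., by Theorem 4.3 each sequence $\beta_i^{(n)}$ converges in distribution as $n \to \infty$. So the sequence $(\beta_i^{(n)}, n \in \mathbb{N})$ is tight, as the space $(\mathcal{I}_H, d_H)$ is Polish. By tightness there exists $\Delta_i$ such that for any $h < \Delta_i$,

\[ \limsup_{n\to\infty} \mathbb{E}\Big[ \omega\big(\beta_i^{(n)}, h\big) \wedge 1 \Big] < 2^{-i}\varepsilon. \]

Now let $\Delta = \min(\Delta', \Delta_0, \Delta_1, \ldots, \Delta_{\lfloor T/\Delta' \rfloor})$. For any $s \le t \le T$ with $t - s \le \Delta$, consider $i$ such that $t_i \le s < t_{i+1}$, then $t - t_i < \Delta' + t - s \le 2\Delta'$. If $\zeta(\beta_i^{(n)}) \le t - t_i$, then $\beta^{(n)}$ touches $\emptyset$ during the time interval $[t_i, t]$ and thus $\max\big(\|\beta^{(n)}(s)\|, \|\beta^{(n)}(t)\|\big) \le \omega\big(\|\beta^{(n)}\|, 2\Delta'\big)$. Therefore, we have

\[ d_H\big(\beta^{(n)}(s), \beta^{(n)}(t)\big) \le d_H\big(\beta_i^{(n)}(s), \beta_i^{(n)}(t)\big) + 2\,\omega\big(\|\beta^{(n)}\|, 2\Delta'\big). \]

It follows that for $h < \Delta$,

\[ \mathbb{E}\Big[ \omega\big(\beta^{(n)}, h\big) \wedge 1 \Big] \le 2\,\mathbb{E}\Big[ \omega\big(\|\beta^{(n)}\|, 2\Delta'\big) \wedge 1 \Big] + \sum_{i=0}^{\lfloor T/\Delta' \rfloor} \mathbb{E}\Big[ \omega\big(\beta_i^{(n)}, \Delta_i\big) \wedge 1 \Big]. \]

So we have $\limsup_{n\to\infty} \mathbb{E}\big[ \omega(\beta^{(n)}, h) \wedge 1 \big] \le 4\varepsilon$. This leads to the tightness, e.g. via [27, Theorem 16.10].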
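The modulus of continuity used in this tightness argument is easy to evaluate on a discretised real-valued path, such as a sampled total mass process. The following Python sketch is only an illustration under our own assumptions: the function name `modulus_of_continuity` and the representation of a path by increasing sample times and the corresponding values are hypothetical and play no role in the proof.

```python
def modulus_of_continuity(times, values, h):
    """Largest |values[j] - values[i]| over sample points with |times[j] - times[i]| <= h.

    Assumes `times` is sorted in increasing order and has the same length as `values`.
    This only sees the sampled points, so it is a lower bound for the true modulus.
    """
    best = 0.0
    for i in range(len(times)):
        for j in range(i + 1, len(times)):
            if times[j] - times[i] > h:
                break  # later samples are even further away in time
            best = max(best, abs(values[j] - values[i]))
    return best
```

For a path sampled at times 0, 1, 2, 3 with values 0, 2, 1, 5, the sketch returns 4 for h = 1 (attained between times 2 and 3) and 5 for h = 3.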

PROOF OF PROPOSITION 1.2. This follows from Proposition 4.4 and the semigroup description in (39).

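THEOREM 4.12. Let $\alpha \in (0,1)$, $\theta_1, \theta_2 \ge 0$ and $\gamma_n \in \mathcal{I}_H$ with $\gamma_n \to \gamma \in \mathcal{I}_H$. Let $\beta_n$, $n \ge 1$, and $\beta$ be $\mathrm{SSIP}^{(\alpha)}(\theta_1,\theta_2)$-evolutions starting from $\gamma_n$, $n \ge 1$, and $\gamma$, respectively. Then $\beta_n \to \beta$ in distribution in $C(\mathbb{R}_+, \mathcal{I}_H)$ equipped with the locally uniform topology.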

PROOF. It follows easily from Lemma 3.14 that we may assume that $\gamma_n = \beta_{n,1}^{(0)} \star \{(0, m_n^{(0)})\} \star \beta_{n,2}^{(0)}$ with $m_n^{(0)} \to m^{(0)}$, $\beta_{n,i}^{(0)} \to \beta_i^{(0)}$, $i = 1, 2$, and $\varphi(\gamma) = (\beta_1^{(0)}, m^{(0)}, \beta_2^{(0)})$. We will now couple the constructions in Definition 4.1 and use the notation from there.

Given $(\beta_{n,1}^{(k)}, m_n^{(k)}, \beta_{n,2}^{(k)}) \to (\beta_1^{(k)}, m^{(k)}, \beta_2^{(k)})$ a.s., for some $k \ge 0$, the Feller property of [18, Theorem 1.8] allows us to apply [27, Theorem 19.25] and, by Skorokhod representation, we obtain $\gamma_{n,i}^{(k)} \to \gamma_i^{(k)}$ a.s. in $C(\mathbb{R}_+, \mathcal{I}_H)$, $i = 1, 2$, as $n \to \infty$. For $f^{(k)} \sim \mathrm{BESQ}_{m^{(k)}}(-2\alpha)$ and

\[ f_n^{(k)}(s) := \frac{m_n^{(k)}}{m^{(k)}}\, f^{(k)}\Big( \frac{m^{(k)}}{m_n^{(k)}}\, s \Big), \qquad s \ge 0, \]

we find $f_n^{(k)} \sim \mathrm{BESQ}_{m_n^{(k)}}(-2\alpha)$. As $n \to \infty$, $(f_n^{(k)}, \zeta(f_n^{(k)})) \to (f^{(k)}, \zeta(f^{(k)}))$ a.s. And as $\gamma_1^{(k)}(\zeta(f^{(k)})) \star \gamma_2^{(k)}(\zeta(f^{(k)}))$ has a.s. a unique longest interval,

\[ \varphi\big( \gamma_{n,1}^{(k)}(\zeta(f_n^{(k)})) \star \gamma_{n,2}^{(k)}(\zeta(f_n^{(k)})) \big) \to \varphi\big( \gamma_1^{(k)}(\zeta(f^{(k)})) \star \gamma_2^{(k)}(\zeta(f^{(k)})) \big) \quad \text{a.s.} \]

Inductively, the convergences stated in this proof so far hold a.s. for all $k \ge 0$. When $\theta := \theta_1 + \theta_2 - \alpha \ge 1$, this gives rise to coupled $\beta_n$ and $\beta$. When $\theta < 1$, arguments as at the end of the proof of Theorem 4.3 allow us to prove the convergence until the first hitting time of $\emptyset$ jointly with the convergence of the hitting times. When $\theta \le 0$, we extend the constructions by absorption in $\emptyset$. When $\theta \in (0,1)$, we extend by the same $\mathrm{SSIP}^{(\alpha)}(\theta_1,\theta_2)$ starting from $\emptyset$. In each case, we deduce that $\beta_n \to \beta$ a.s., locally uniformly.

PROOF OF THEOREM 1.4. For an SSIP-evolution, we have established the pseudo-stationarity (Proposition 1.2), self-similarity, path-continuity, Hunt property (Theorem 1.1) and the continuity in the initial state (Theorem 4.12). With these properties in hand, we can easily prove this theorem by following the same arguments as in [15, proof of Theorem 1.6]. Details are left to the reader.

We now prove Theorem 1.5, showing that when $\theta = \theta_1 + \theta_2 - \alpha \in (-\alpha, 1)$, the excursion measure $\Theta := \Theta^{(\alpha)}(\theta_1,\theta_2)$ of Section 4.4 is the limit of rescaled PCRP excursion measures. Recall that the total mass process of $\mathrm{PCRP}^{(\alpha)}(\theta_1,\theta_2)$ has distribution $\pi_1(\theta)$. We have already obtained the convergence of the total mass process from Proposition 3.10.

PROOF OF THEOREM 1.5. Recall that $\zeta(\gamma) = \inf\{t > 0\colon \gamma(t) = \emptyset\}$ denotes the lifetime of an excursion $\gamma \in D([0,\infty), \mathcal{I}_H)$. To prove vague convergence, we proceed as in the proof of Proposition 3.10. In the present setting, we work on the space of measures on $D([0,\infty), \mathcal{I}_H)$ that are bounded on $\{\zeta > t\}$ for all $t > 0$. We denote by $\mathrm{P}^{(n)}$ the distribution of $C^{(n)}$, a killed $\mathrm{PCRP}^{(\alpha)}(\theta_1,\theta_2)$ starting from $(1)$. It suffices to prove for fixed $t > 0$,
1. $\Theta(\zeta = t) = 0$,
2. $\big(\Gamma(1+\theta)/(1-\theta)\big)\, n^{1-\theta} \cdot \mathrm{P}^{(n)}(\zeta > t) \to \Theta(\zeta > t)$ as $n \to \infty$,
3. $\mathrm{P}^{(n)}(\,\cdot \mid \zeta > t) \to \Theta(\,\cdot \mid \zeta > t)$ weakly as $n \to \infty$.

1. This follows from (43).

2. Since the total-mass process $\|C^{(n)}\|$ is an up-down chain of law $\widetilde{\pi}_1^{(n)}(\theta)$, Proposition 3.10 implies the following weak convergence of finite measures on $(0,\infty)$:

\[ \frac{\Gamma(1+\theta)}{1-\theta}\, n^{1-\theta}\, \mathbb{P}\big( \|C^{(n)}(t)\| \in \cdot\,;\ \zeta(C^{(n)}) > t \big) \underset{n\to\infty}{\longrightarrow} \Lambda^{(2\theta)}_{\mathrm{BESQ}}\big\{ f \in C([0,\infty),[0,\infty))\colon f(t) \in \cdot\,;\ \zeta(f) > t \big\} = N_t\big\{ \gamma \in \mathcal{I}_H\colon \|\gamma\| \in \cdot \big\}, \tag{48} \]

where $N_t$ is the entrance law of $\Theta$ given in (41). This implies the desired convergence.

3. For any $t > 0$, given $(\|C^{(n)}(r)\|, r \le 2nt)$, we know from Lemma 4.5 that the conditional distribution of $C^{(n)}(t) = \frac{1}{n} C(2nt)$ is the law of $\frac{1}{n} C_n$, where $C_n$ is $\mathrm{oCRP}^{(\alpha)}_m(\theta_1,\theta_2)$ with $m = \|C(2nt)\|$. By Lemma 2.7, we can strengthen (48) to the following weak convergence on $\mathcal{I}_H \setminus \{\emptyset\}$:

\[ \mathbb{P}\big( C^{(n)}(t) \in \cdot \,\big|\, \zeta(C^{(n)}) > t \big) \underset{n\to\infty}{\longrightarrow} N_t\big(\,\cdot \,\big|\, \mathcal{I}_H \setminus \{\emptyset\}\big). \]

Next, by the Markov property of a PCRP and the convergence result Theorem 4.3, we deduce that, conditionally on $\{\zeta(C^{(n)}) > t\}$, the process $(C^{(n)}(t+s), s \ge 0)$ converges weakly to an $\mathrm{SSIP}^{(\alpha)}(\theta_1,\theta_2)$-evolution $(\beta(s), s \ge 0)$ starting from $\beta(0) \sim N_t(\,\cdot \mid \mathcal{I}_H \setminus \{\emptyset\})$. By the description of $\Theta$ in (42), this implies the convergence of finite-dimensional distributions for times $t \le t_1 < \cdots < t_k$. For $t > t_1$, this holds under $\mathrm{P}^{(n)}(\,\cdot \mid \zeta > t_1)$ and $\Theta(\,\cdot \mid \zeta > t_1)$ and can be further conditioned on $\{\zeta > t\}$, by 1. and 2.

It remains to prove tightness. For every $n \ge 1$, let $\tau_n$ be a stopping time with respect to the natural filtration of $C^{(n)}$ and $h_n$ a positive constant. Suppose that the sequence $\tau_n$ is bounded and $h_n \to 0$. By Aldous's criterion [27, Theorem 16.11], it suffices to show that for any $\delta > 0$,

\[ \lim_{n\to\infty} \mathbb{P}\Big( d_H\big( C^{(n)}(\tau_n + h_n), C^{(n)}(\tau_n) \big) > \delta \,\Big|\, \zeta(C^{(n)}) > t \Big) = 0. \tag{49} \]

By the total mass convergence in Proposition 3.10, for any $\varepsilon > 0$, there exists a constant $s > 0$, such that

\[ \limsup_{n\to\infty} \mathbb{P}\Big( \sup_{r \le 2s} \|C^{(n)}(r)\| > \delta/3 \,\Big|\, \zeta(C^{(n)}) > t \Big) \le \varepsilon. \tag{50} \]

Moreover, since $(C^{(n)}(s+z), z \ge 0)$ conditionally on $\{\zeta(C^{(n)}) > s\}$ converges weakly to a continuous process, by [25, Proposition VI.3.26] we have for any $u > s$,

\[ \lim_{n\to\infty} \mathbb{P}\Big( \sup_{r \in [s,u]} d_H\big( C^{(n)}(r + h_n), C^{(n)}(r) \big) > \delta/3 \,\Big|\, \zeta(C^{(n)}) > t \Big) = 0. \tag{51} \]

Then (49) follows from (50) and (51). This completes the proof.

4.7. The case α = 0. In a PCRP model with $\alpha = 0$, the size of each table evolves according to an up-down chain $\pi(0)$ as in (16), and new tables are only started to the left or to the right, but not between existing tables. We can hence build a $\mathrm{PCRP}^{(0)}(\theta_1,\theta_2)$ starting from $(n_1, \ldots, n_k) \in \mathcal{C}$ by a Poissonian construction. Specifically, consider independent $f_i \sim \pi_{n_i}(0)$, $i \in [k]$, as size evolutions of the initial tables, $F_1 \sim \mathrm{PRM}(\theta_1 \mathrm{Leb} \otimes \pi_1(0))$ whose atoms describe the birth times and size evolutions of new tables added to the left, and $F_2 \sim \mathrm{PRM}(\theta_2 \mathrm{Leb} \otimes \pi_1(0))$ for new tables added to the right. For $t \ge 0$, set

\[ C_1(t) = \mathop{\star}_{\text{atoms } (s,f) \text{ of } F_1 \downarrow\colon s \le t} \{(0, f(t-s))\}, \qquad C_0(t) = \mathop{\star}_{i \in [k]} \{(0, f_i(t))\}, \qquad C_2(t) = \mathop{\star}_{\text{atoms } (s,f) \text{ of } F_2\colon s \le t} \{(0, f(t-s))\}, \]

where $\downarrow$ means that the concatenation is from larger $s$ to smaller. Then $(C(t) = C_1(t) \star C_0(t) \star C_2(t),\ t \ge 0)$ is a $\mathrm{PCRP}^{(0)}(\theta_1,\theta_2)$ starting from $(n_1, \ldots, n_k)$.
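The Poissonian construction above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not part of the argument: the helper names `simulate_table`, `poisson_times` and `pcrp0_state` are hypothetical, each table is taken to gain a customer at rate m and lose one at rate m when its size is m (the rates m − α and m from the introduction with α = 0), and tables that have emptied are dropped from the composition.

```python
import random

def simulate_table(n0, duration, rng):
    """Size after `duration` of a table started with n0 customers (alpha = 0).

    A table of size m jumps up at rate m and down at rate m; size 0 is absorbing.
    """
    m, t = n0, 0.0
    while m > 0:
        t += rng.expovariate(2.0 * m)  # total jump rate m (up) + m (down)
        if t > duration:
            break
        m += 1 if rng.random() < 0.5 else -1
    return m

def poisson_times(rate, horizon, rng):
    """Arrival times of a homogeneous Poisson process of intensity `rate` on [0, horizon]."""
    times, t = [], 0.0
    if rate <= 0:
        return times
    while True:
        t += rng.expovariate(rate)
        if t > horizon:
            return times
        times.append(t)

def pcrp0_state(initial, theta1, theta2, t, seed=0):
    """Table sizes at time t, from left to right, for the alpha = 0 construction."""
    rng = random.Random(seed)
    middle = [simulate_table(n, t, rng) for n in initial]
    left_births = poisson_times(theta1, t, rng)
    right_births = poisson_times(theta2, t, rng)
    # Left side: concatenate from larger birth time to smaller (newest table is leftmost).
    left = [simulate_table(1, t - s, rng) for s in sorted(left_births, reverse=True)]
    # Right side: increasing birth time (newest table is rightmost).
    right = [simulate_table(1, t - s, rng) for s in right_births]
    return [m for m in left + middle + right if m > 0]
```

For instance, `pcrp0_state([5, 1, 2], 1.5, 0.5, 2.0, seed=1)` returns one sample of the composition at time 2; sampling only the state at a single time keeps the sketch short, at the cost of not recording the whole path.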

PROPOSITION 4.13. The statement of Theorem 1.1 still holds when α = 0.

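PROOF. We only prove this for the case when the initial state is $C^{(n)}(0) = \{(0, b^{(n)})\}$ with $\lim_{n\to\infty} b^{(n)}/n = b > 0$. Then we can extend to a general initial state in the same way as we passed from Lemma 3.13 to Theorem 3.12.

For each $n \ge 1$, we may assume $C^{(n)}$ is associated with $F_1 \sim \mathrm{PRM}(\theta_1 \mathrm{Leb} \otimes \pi_1(0))$, $F_2 \sim \mathrm{PRM}(\theta_2 \mathrm{Leb} \otimes \pi_1(0))$ and $f_0^{(n)} \sim \pi_{b^{(n)}}(0)$. Replacing each atom $\delta(s, f)$ of $F_1$ and $F_2$ by $\delta(s/2n, f(2n\,\cdot\,)/n)$, we obtain $F_1^{(n)} \sim \mathrm{PRM}(2n\theta_1 \mathrm{Leb} \otimes \pi_1^{(n)}(0))$ and $F_2^{(n)} \sim \mathrm{PRM}(2n\theta_2 \mathrm{Leb} \otimes \pi_1^{(n)}(0))$. Note that $(\frac{1}{n} C^{(n)}(2nt), t \ge 0)$ is associated with $(F_1^{(n)}, F_2^{(n)}, f^{(n)}(2n\,\cdot\,)/n)$.

Since Proposition 3.10 shows that $n \pi_1^{(n)}(0) \to \Lambda^{(0)}_{\mathrm{BESQ}}$, by [28, Theorem 4.11], we deduce that $F_1^{(n)}$ and $F_2^{(n)}$ converge in distribution respectively to $F_1^{(\infty)} \sim \mathrm{PRM}(2\theta_1 \mathrm{Leb} \otimes \Lambda^{(0)}_{\mathrm{BESQ}})$ and $F_2^{(\infty)} \sim \mathrm{PRM}(2\theta_2 \mathrm{Leb} \otimes \Lambda^{(0)}_{\mathrm{BESQ}})$. By Lemma 3.8, $f^{(n)}(2n\,\cdot\,)/n \to f^{(\infty)} \sim \mathrm{BESQ}_b(0)$ in distribution. As a result, we can deduce that $(\frac{1}{n} C^{(n)}(2nt), t \ge 0)$ converges to an $\mathcal{I}_H$-valued process $(\beta(t), t \ge 0)$ defined by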

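\[ \beta(t) = \Big( \mathop{\star}_{\text{atoms } (s,f) \text{ of } F_1^{(\infty)} \downarrow\colon s \le t} \{(0, f(t-s))\} \Big) \star \{(0, f^{(\infty)}(t))\} \star \Big( \mathop{\star}_{\text{atoms } (s,f) \text{ of } F_2^{(\infty)}\colon s \le t} \{(0, f(t-s))\} \Big). \tag{52} \]

A rigorous argument can be made as in the proof of Lemma 3.13.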

The limiting process in (52) can be viewed as an $\mathrm{SSIP}^{(0)}(\theta_1,\theta_2)$-evolution, which is closely related to the construction of measure-valued processes in [48]. See also [19, Section 7.1].

5. Applications.

5.1. Measure-valued processes. In [19], we introduced a two-parameter family of superprocesses taking values in the space $(\mathcal{M}^a, d_{\mathcal{M}})$ of all purely atomic finite measures on a space of allelic types, say $[0,1]$. Here $d_{\mathcal{M}}$ is the Prokhorov distance. Our construction is closely related to that of SSIP-evolutions, here extracting from scaffolding and spindles via the following superskewer mapping. See Figure 2 on page 13 for an illustration.

DEFINITION 5.1 (Superskewer). Let $V = \sum_{i \in I} \delta(t_i, f_i, x_i)$ be a point measure on $\mathbb{R} \times \mathcal{E} \times [0,1]$ and $X$ a càdlàg process such that

\[ \sum_{\Delta X(t) > 0} \delta(t, \Delta X(t)) = \sum_{i \in I} \delta(t_i, \zeta(f_i)). \]

The superskewer of the pair $(V, X)$ at level $y$ is the atomic measure

\[ \mathrm{SSKEWER}(y, V, X) := \sum_{i \in I\colon X(t_i-) \le y} f_i\big( y - X(t_i-) \big)\, \delta(x_i). \tag{53} \]
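To make formula (53) concrete, here is a small Python sketch for finitely many atoms. It is an illustration under our own assumptions: each atom is encoded as a triple (level, spindle, allele) with level standing for $X(t_i-)$, each spindle is a callable that vanishes outside its lifetime, and the names `superskewer` and `tent` are hypothetical.

```python
def tent(s):
    """Toy spindle with lifetime 2: rises linearly to 1 at s = 1, then falls back to 0."""
    return max(0.0, 1.0 - abs(s - 1.0))

def superskewer(y, atoms):
    """Atomic measure at level y, returned as a dict {allelic type: mass}.

    `atoms` is an iterable of (level, spindle, allele), where `level` plays the role
    of X(t_i-) and `spindle` is a function returning 0 outside its lifetime.
    """
    measure = {}
    for level, spindle, allele in atoms:
        if level <= y:
            mass = spindle(y - level)
            if mass > 0:  # spindles that have died or not yet grown carry no mass
                measure[allele] = measure.get(allele, 0.0) + mass
    return measure
```

With atoms `[(0.0, tent, 0.3), (1.5, tent, 0.7), (5.0, tent, 0.9)]`, level 1.0 sees only type 0.3 with mass 1.0, while level 2.0 sees only type 0.7 with mass 0.5, because the first spindle has just died and the third has not yet been born.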

For $\alpha \in (0,1)$, $\theta \ge 0$, recall the scaffolding-and-spindles construction of an $\mathrm{SSIP}^{(\alpha)}(\theta)$-evolution starting from $\gamma \in \mathcal{I}_H$; in particular, for each $U \in \gamma$, there is an initial spindle $f_U \sim \mathrm{BESQ}_{\mathrm{Leb}(U)}(-2\alpha)$. For any collection $x_U \in [0,1]$, $U \in \gamma$, we can construct a self-similar $\mathrm{SSSP}^{(\alpha)}(\theta)$ starting from $\pi = \sum_{U \in \gamma} \mathrm{Leb}(U)\, \delta(x_U)$ as follows. We mark each initial spindle $f_U$ by the allelic type $x_U$ and all other spindles in the construction by i.i.d. uniform random variables on $[0,1]$. Then we obtain the desired superprocess by repeating the construction of an $\mathrm{SSIP}^{(\alpha)}(\theta)$-evolution in Definitions 3.2–3.3, with skewer replaced by superskewer, and concatenation replaced by addition. We refer to [19] for more details.

We often write $\pi \in \mathcal{M}^a$ in canonical representation $\pi = \sum_{i \ge 1} b_i \delta(x_i)$ with $b_1 \ge b_2 \ge \cdots$ and $x_i < x_{i+1}$ if $b_i = b_{i+1}$. We write $\|\pi\| := \pi([0,1]) = \sum_{i \ge 1} b_i$ for the total mass of $\pi$.

DEFINITION 5.2. Let $\alpha \in (0,1)$ and $\theta \in [-\alpha, 0)$. We define a process $(\pi(t), t \ge 0)$ starting from $\pi(0) \in \mathcal{M}^a$ by the following construction.
• Set $T_0 = 0$. For $\pi(0) = \sum_{i \ge 1} b_i \delta(x_i)$ in canonical representation, consider $x^{(0)} := x_1$ and independent $f^{(0)} \sim \mathrm{BESQ}_{b_1}(-2\alpha)$ and $\lambda^{(0)} \sim \mathrm{SSSP}^{(\alpha)}(\theta+\alpha)$ starting from $\sum_{i \ge 2} b_i \delta(x_i)$.
• For $k \ge 1$, suppose by induction we have obtained $(\lambda^{(i)}, f^{(i)}, x^{(i)}, T_i)_{0 \le i \le k-1}$. Then we set $T_k = T_{k-1} + \zeta(f^{(k-1)})$ and
\[ \pi(t) = \lambda^{(k-1)}(t - T_{k-1}) + f^{(k-1)}(t - T_{k-1})\, \delta(x^{(k-1)}), \qquad t \in [T_{k-1}, T_k]. \]
Write $\pi(T_k) = \sum_{i \ge 1} b_i^{(k)} \delta(x_i^{(k)})$, with $b_1^{(k)} \ge b_2^{(k)} \ge \cdots$, for its canonical representation. Conditionally on the history, construct independent $\lambda^{(k)} \sim \mathrm{SSSP}^{(\alpha)}(\theta+\alpha)$ starting from $\sum_{i \ge 2} b_i^{(k)} \delta(x_i^{(k)})$ and $f^{(k)} \sim \mathrm{BESQ}_{b_1^{(k)}}(-2\alpha)$. Let $x^{(k)} = x_1^{(k)}$.
• Let $T_\infty = \lim_{k \to \infty} T_k$ and $\pi(t) = 0$ for $t \ge T_\infty$.
The process $\pi := (\pi(t), t \ge 0)$ is called an $(\alpha, \theta)$ self-similar superprocess, $\mathrm{SSSP}^{(\alpha)}(\theta)$.

For any $\pi(0) = \sum_{i \ge 1} b_i \delta(x_i) \in \mathcal{M}^a$ consider $\beta(0) = \{(s(i-1), s(i)), i \ge 1\} \in \mathcal{I}_H$, where $s(i) = b_1 + \cdots + b_i$, $i \ge 0$. Consider an $\mathrm{SSIP}^{(\alpha)}(\theta+\alpha, 0)$-evolution starting from $\beta(0)$, built in Definition 4.1, and use the notation therein. As illustrated in Figure 2, we may assume that each interval partition evolution is obtained from the skewer of marked spindles. Therefore, we can couple each $\mathrm{SSIP}^{(\alpha)}(\theta+\alpha)$-evolution $\gamma_1^{(k)}$ with a $\lambda_1^{(k)} \sim \mathrm{SSSP}^{(\alpha)}(\theta+\alpha)$, such that the atom sizes of the latter correspond to the interval lengths of the former. Similarly, each $\mathrm{SSIP}^{(\alpha)}(0)$-evolution $\gamma_2^{(k)}$ corresponds to a $\lambda_2^{(k)} \sim \mathrm{SSSP}^{(\alpha)}(0)$. Then $\lambda^{(k)} = \lambda_1^{(k)} + \lambda_2^{(k)}$ is an $\mathrm{SSSP}^{(\alpha)}(\theta+\alpha)$ by definition. Let $f^{(k)}$ be the middle (marked) spindle in Definition 4.1, which is a $\mathrm{BESQ}(-2\alpha)$, and $x^{(k)}$ be its type. In this way, we obtain a sequence $(\lambda^{(k)}, f^{(k)}, x^{(k)})_{k \ge 0}$ and thus $\pi = (\pi(t), t \ge 0) \sim \mathrm{SSSP}^{(\alpha)}(\theta)$ as in Definition 5.2. It is coupled with $\beta = (\beta(t), t \ge 0) \sim \mathrm{SSIP}^{(\alpha)}(\theta+\alpha, 0)$ as in Definition 4.1, such that atom sizes and interval lengths are matched, and the renaissance levels $(T_k)_{k \ge 0}$ are exactly the same.

The next theorem extends [19, Theorem 1.2] to $\theta \in [-\alpha, 0)$.

THEOREM 5.3. Let α ∈ (0, 1), θ ∈ [−α, 0). An SSSP(α)(θ) is a Hunt process with BESQ(2θ) total mass, paths that are total-variation continuous, and its finite-dimensional marginals are continuous along sequences of initial states that converge in total variation.

PROOF. For any $\pi(0) = \sum_{i \ge 1} b_i \delta(x_i) \in \mathcal{M}^a$ consider $(\pi(t), t \ge 0) \sim \mathrm{SSSP}^{(\alpha)}(\theta)$ and $(\beta(t), t \ge 0) \sim \mathrm{SSIP}^{(\alpha)}(\theta+\alpha, 0)$ coupled, as above. By [47, Theorem 1.4], their (identical) total mass processes are $\mathrm{BESQ}(2\theta)$. Moreover, by this coupling and [47, Corollary 3.7],

\[ T_\infty < \infty, \quad \text{and} \quad \lim_{t \to T_\infty} \|\pi(t)\| = 0 \quad \text{a.s.}, \tag{54} \]

which implies the path-continuity at $T_\infty$. Since an $\mathrm{SSSP}^{(\alpha)}(\theta+\alpha)$ has continuous paths [19, Theorem 1.2], we conclude the path-continuity of $\mathrm{SSSP}^{(\alpha)}(\theta)$ by the construction in Definition 5.2, both in the Prokhorov sense and in the stronger total variation sense.

To prove the Hunt property, we adapt the proof of [47, Theorem 1.4] and apply Dynkin's criterion to a richer Markov process that records more information from the construction. Specifically, in the setting of Definition 5.2, let

\[ (\lambda(t), f(t), x(t)) := \big( \lambda^{(k-1)}(t - T_{k-1}),\ f^{(k-1)}(t - T_{k-1}),\ x^{(k-1)} \big), \qquad t \in [T_{k-1}, T_k),\ k \ge 1, \]

and $(\lambda(t), f(t), x(t)) := (0, 0, 0)$ for $t \ge T_\infty$. We shall refer to this process as a triple-valued $\mathrm{SSSP}^{(\alpha)}(\theta)$ with values in $\widetilde{\mathcal{J}} := (\mathcal{M}^a \times (0,\infty) \times [0,1]) \cup \{(0,0,0)\}$. This process induces the $\mathcal{M}^a$-valued $\mathrm{SSSP}^{(\alpha)}(\theta)$ as $\pi(t) = \lambda(t) + f(t)\delta(x(t))$. Since each $(\lambda^{(k)}, f^{(k)}, x^{(k)})$ is Hunt and is built conditionally on the previous ones according to a probability kernel, $(\lambda(t), f(t), x(t))$, $t \ge 0$, is a Borel right Markov process by [2, Théorème II 3.18].

To use Dynkin's criterion to deduce that $(\pi(t), t \ge 0)$ is Borel right Markovian, and hence Hunt by path-continuity, we consider any $(\lambda_1(0), f_1(0), x_1(0)), (\lambda_2(0), f_2(0), x_2(0)) \in \widetilde{\mathcal{J}}$ with $\lambda_1(0) + f_1(0)\delta(x_1(0)) = \lambda_2(0) + f_2(0)\delta(x_2(0))$. It suffices to couple triple-valued $\mathrm{SSSP}^{(\alpha)}(\theta)$ from these two initial states whose induced $\mathcal{M}^a$-valued $\mathrm{SSSP}^{(\alpha)}(\theta)$ coincide. First note that (unless they are equal) the initial states are such that for $t = 0$ and $i = 1, 2$,

\[ \lambda_1(t) = \mu(t) + f_2(t)\delta(x_2(t)) \quad \text{and} \quad \lambda_2(t) = \mu(t) + f_1(t)\delta(x_1(t)) \tag{55} \]

for some $\mu(t) \in \mathcal{M}^a$. We follow similar arguments as in the proof of [47, Lemma 3.3], via a quintuple-valued process $(\mu(t), f_1(t), x_1(t), f_2(t), x_2(t))$, $0 \le t < S_N$, that captures two marked types. Let $S_0 := 0$. For $j \ge 0$, suppose we have constructed the process on $[0, S_j]$.
• Conditionally on the history, consider an $\mathrm{SSSP}^{(\alpha)}(\theta+2\alpha)$-evolution $\mu^{(j)}$ starting from $\mu(S_j)$, and $f_i^{(j)} \sim \mathrm{BESQ}_{f_i(S_j)}(-2\alpha)$, $i = 1, 2$, independent of each other. Let $\Delta_j := \min\{\zeta(f_1^{(j)}), \zeta(f_2^{(j)})\}$ and $S_{j+1} := S_j + \Delta_j$. For $t \in [S_j, S_{j+1})$, define
\[ (\mu(t), f_1(t), x_1(t), f_2(t), x_2(t)) := \big( \mu^{(j)}(t - S_j),\ f_1^{(j)}(t - S_j),\ x_1(S_j),\ f_2^{(j)}(t - S_j),\ x_2(S_j) \big). \]
• Say $\Delta_j = \zeta(f_1^{(j)})$. If $f_2^{(j)}(\Delta_j)$ exceeds the size of the largest atom in $\mu^{(j)}(\Delta_j)$, let $N = j + 1$. The construction is complete. Otherwise, let $(f_2(S_{j+1}), x_2(S_{j+1})) := (f_2^{(j)}(\Delta_j), x_2(S_j))$ and decompose $\mu^{(j)}(\Delta_j) = \mu(S_{j+1}) + f_1(S_{j+1})\delta(x_1(S_{j+1}))$ by identifying its largest atom, giving rise to the five components. Similar operations apply when $\Delta_j = \zeta(f_2^{(j)})$.

For $t \in [0, S_N)$, define $\lambda_i(t)$, $i = 1, 2$, by (55). In general, we may have $N \in \mathbb{N} \cup \{\infty\}$. On the event $\{N < \infty\}$, we further continue with the same triple-valued $\mathrm{SSSP}^{(\alpha)}(\theta)$ starting from the terminal value $(\mu^{(N)}(\Delta_{N-1}), f_i^{(N)}(\Delta_{N-1}), x_i(\Delta_N))$, with $i \in \{1, 2\}$ being the index such that $f_i^{(N)}(\Delta_{N-1}) > 0$.

By [19, Corollary 5.11 and remark below] and the strong Markov property of these processes applied at the stopping times $S_j$, we obtain two coupled triple-valued $\mathrm{SSSP}^{(\alpha)}(\theta)$, which induce the same $\mathcal{M}^a$-valued $\mathrm{SSSP}^{(\alpha)}(\theta)$, as required. Indeed, the construction of these two processes is clearly complete on $\{N < \infty\}$. On $\{N = \infty\}$, by (54) one has $\{S_\infty < \infty\}$ and the total mass tends to zero as $t \uparrow S_\infty$, and hence the construction is also finished.

For the continuity in the initial state, suppose that $\pi_n(0) = f_n^{(0)}(0)\delta(x_1) + \lambda_n^{(0)}(0) \to \pi(0) = f^{(0)}(0)\delta(x_1) + \lambda^{(0)}(0)$ in total variation. First note that a slight variation of the proof of [19, Proposition 3.6] allows us to couple $\lambda_n^{(0)}$ and $\lambda^{(0)}$ so that $\lambda_n^{(0)}(t_n) \to \lambda^{(0)}(t)$ in total variation a.s., for any fixed sequence $t_n \to t$. Also coupling $f_n^{(0)}$ and $f^{(0)}$, we can apply this on $\{\zeta(f^{(0)}) > t\}$ to obtain $\pi_n(t) \to \pi(t)$ for any fixed $t$, and on $\{\zeta(f^{(0)}) < t\}$ to obtain $\zeta(f_n^{(0)}) \to \zeta(f^{(0)})$ and $\pi_n(\zeta(f_n^{(0)})) \to \pi(\zeta(f^{(0)}))$ in total variation a.s. By induction, this establishes the convergence of one-dimensional marginals on $\{T_\infty > t\}$, and trivially on {T∞

For $\alpha \in (0,1)$, $\theta \in [-\alpha, 0)$, let $(B_1, B_2, \ldots) \sim \mathrm{PD}^{(\alpha)}(\theta)$ be a Poisson–Dirichlet sequence in the Kingman simplex and $(X_i, i \ge 1)$ i.i.d. uniform random variables on $[0,1]$, further independent of $(B_1, B_2, \ldots)$. Define $\mathrm{PDRM}^{(\alpha)}(\theta)$ to be the distribution of the random probability measure $\pi := \sum_{i \ge 1} B_i \delta(X_i)$ on $\mathcal{M}^a_1 := \{\mu \in \mathcal{M}^a\colon \|\mu\| = 1\}$. If $\theta = -\alpha$, then $\pi = \delta(X_1)$.

PROPOSITION 5.4. Let (Z(t), t ≥ 0) be a BESQ(2θ) killed at zero with Z(0) > 0, inde- pendent of π ∼ PDRM(α)(θ). Let (π(t), t ≥ 0) be an SSSP(α)(θ) starting from Z(0)π. Fix any t ≥ 0, then π(t) has the same distribution as Z(t)π.

PROOF. By the coupling between an $\mathrm{SSIP}^{(\alpha)}(\theta+\alpha, 0)$-evolution and an $\mathrm{SSSP}^{(\alpha)}(\theta)$ mentioned above, the claim follows from Proposition 4.4 and the definition of $\mathrm{PDRM}^{(\alpha)}(\theta)$.

For $\theta \ge -\alpha$, let $\pi := (\pi(t), t \ge 0)$ be an $\mathrm{SSSP}^{(\alpha)}(\theta)$ starting from $\mu \in \mathcal{M}^a_1$. Define an associated $\mathcal{M}^a_1$-valued process via the de-Poissonisation as in Definition 1.3:

\[ \bar\pi(u) := \big\| \pi(\tau_\pi(u)) \big\|^{-1}\, \pi(\tau_\pi(u)), \qquad u \ge 0, \]

where $\tau_\pi(u) := \inf\big\{ t \ge 0\colon \int_0^t \|\pi(s)\|^{-1}\, ds > u \big\}$. The process $(\bar\pi(u), u \ge 0)$ on $\mathcal{M}^a_1$ is called a Fleming–Viot $(\alpha, \theta)$-process starting from $\mu$, denoted by $\mathrm{FV}^{(\alpha)}(\theta)$. Using Proposition 5.4, we easily deduce the following statement by the same arguments as in the proof of [19, Theorem 1.7], extending [19, Theorem 1.7] to the range $\theta \in [-\alpha, 0)$.

THEOREM 5.5. Let $\alpha \in (0,1)$ and $\theta \ge -\alpha$. An $\mathrm{FV}^{(\alpha)}(\theta)$-evolution is a total-variation path-continuous Hunt process on $(\mathcal{M}^a_1, d_{\mathcal{M}})$ and has a stationary distribution $\mathrm{PDRM}^{(\alpha)}(\theta)$.

5.2. Fragmenting interval partitions. We define a fragmentation operator for interval partitions, which is associated with the random interval partition $\mathrm{PDIP}^{(\alpha)}(\theta_1,\theta_2)$ defined in Section 2. Fragmentation theory has been extensively studied in the literature; see e.g. [4].

DEFINITION 5.6 (A fragmentation operator). Let $\alpha \in (0,1)$ and $\theta_1, \theta_2 \ge 0$. We define a Markov transition kernel $\mathrm{Frag} := \mathrm{Frag}^{(\alpha)}(\theta_1,\theta_2)$ on $\mathcal{I}_H$ as follows. Let $(\gamma_i)_{i \in \mathbb{N}}$ be i.i.d. with distribution $\mathrm{PDIP}^{(\alpha)}(\theta_1,\theta_2)$. For $\beta = \{(a_i, b_i), i \in \mathbb{N}\} \in \mathcal{I}_H$, with $(a_i, b_i)_{i \in \mathbb{N}}$ enumerated in decreasing order of length, we define $\mathrm{Frag}(\beta, \cdot)$ to be the law of the interval partition obtained from $\beta$ by splitting each $(a_i, b_i)$ according to the interval partition $\gamma_i$, i.e.

\[ \big\{ (a_i + (b_i - a_i) l,\ a_i + (b_i - a_i) r)\colon i \in \mathbb{N},\ (a_i, b_i) \in \beta,\ (l, r) \in \gamma_i \big\}. \]
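The deterministic part of this operator, namely the splitting of each interval of $\beta$ by a rescaled copy of $\gamma_i$, can be sketched in Python for finitely many intervals. This is only an illustration under our own assumptions: the function name `fragment` is hypothetical, interval partitions are encoded as lists of (left, right) pairs, and the partitions `gammas` of $[0,1]$ are supplied by the caller rather than sampled from $\mathrm{PDIP}^{(\alpha)}(\theta_1,\theta_2)$.

```python
def fragment(beta, gammas):
    """Split the i-th longest interval of `beta` according to gammas[i].

    `beta` is a list of intervals (a, b); `gammas` is a list of interval partitions
    of [0, 1], one for each interval of `beta`. Returns the fragmented partition
    sorted from left to right.
    """
    # Rank the intervals by decreasing length; ties keep their left-to-right order.
    order = sorted(range(len(beta)), key=lambda i: beta[i][0] - beta[i][1])
    pieces = []
    for rank, i in enumerate(order):
        a, b = beta[i]
        for l, r in gammas[rank]:
            pieces.append((a + (b - a) * l, a + (b - a) * r))
    return sorted(pieces)
```

For example, with `beta = [(0.0, 1.0), (1.0, 4.0)]` the longest interval (1, 4) is split by the first partition, so `gammas = [[(0.0, 0.5), (0.5, 1.0)], [(0.0, 1.0)]]` yields the intervals (0, 1), (1, 2.5) and (2.5, 4).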

FIG 3. Clusters are divided by dashed lines. A new customer starts a new table in an existing cluster (solid arrow) or a new cluster (dashed arrow), with probability proportional to the indicated weights. In the continuous-time model studied in Section 5.4, the weights correspond to the rates at which customers arrive.

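LEMMA 5.7. For $\alpha, \bar\alpha \in (0,1)$ and $\theta_1, \theta_2, \bar\theta_1, \bar\theta_2 \ge 0$, suppose that

\[ \theta_1 + \theta_2 + \bar\alpha = \alpha. \]

Let $\beta_c \sim \mathrm{PDIP}^{(\bar\alpha)}(\bar\theta_1, \bar\theta_2)$ and $\beta_f$ a random interval partition whose regular conditional distribution given $\beta_c$ is $\mathrm{Frag}^{(\alpha)}(\theta_1,\theta_2)$. Then $\beta_f \sim \mathrm{PDIP}^{(\alpha)}(\theta_1 + \bar\theta_1, \theta_2 + \bar\theta_2)$.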

A similar result for the particular case with parameter $\bar\alpha = \bar\theta_2 = 0$, $\theta_1 = 0$ and $\theta_2 = \alpha$ is included in [21, Theorem 8.3]. Lemma 5.7 is also an analogue of [38, Theorem 5.23] for $\mathrm{PD}(\alpha, \theta)$ on the Kingman simplex.

To prove Lemma 5.7, we now consider a pair of nested ordered Chinese restaurant processes $(C_c(n), C_f(n))_{n \in \mathbb{N}}$. The coarse one $C_c \sim \mathrm{oCRP}^{(\bar\alpha)}(\bar\theta_1, \bar\theta_2)$ describes the arrangement of customers in a sequence of ordered clusters. We next obtain a composition of each cluster of customers by further seating them at ordered tables, according to the $(\alpha, \theta_1, \theta_2)$-seating rule. These compositions are concatenated according to the cluster order, forming the fine process $C_f$. Then, as illustrated in Figure 3, we can easily check that $C_f \sim \mathrm{oCRP}^{(\alpha)}(\theta_1 + \bar\theta_1, \theta_2 + \bar\theta_2)$, due to the identity $\theta_1 + \theta_2 + \bar\alpha = \alpha$. Nested (unordered) Chinese restaurant processes have been widely applied in nonparametric Bayesian analysis of the problem of learning topic hierarchies from data, see e.g. [7]. Lemma 5.7 follows immediately from the following convergence result.

LEMMA 5.8 (Convergence of nested oCRP). Let $(C_c(n), C_f(n))_{n \in \mathbb{N}}$ be a pair of nested ordered Chinese restaurant processes constructed as above. Then $(n^{-1} C_c(n), n^{-1} C_f(n))$ converges a.s. to a limit $(\beta_c, \beta_f)$ for the metric $d_H$ as $n \to \infty$; furthermore, we have $\beta_c \sim \mathrm{PDIP}^{(\bar\alpha)}(\bar\theta_1, \bar\theta_2)$, $\beta_f \sim \mathrm{PDIP}^{(\alpha)}(\theta_1 + \bar\theta_1, \theta_2 + \bar\theta_2)$, and the regular conditional distribution of $\beta_f$ given $\beta_c$ is $\mathrm{Frag}^{(\alpha)}(\theta_1,\theta_2)$.

PROOF. By Lemma 2.7, we immediately deduce that $(n^{-1} C_c(n), n^{-1} C_f(n))$ converges a.s. to a limit $(\beta_c, \beta_f)$. It remains to determine the joint distribution of the limit.

Consider a sequence of i.i.d. $\gamma_i \sim \mathrm{PDIP}^{(\alpha)}(\theta_1,\theta_2)$, $i \ge 1$, and an independent $\beta'_c := \{(a_i, b_i), i \in \mathbb{N}\} \sim \mathrm{PDIP}^{(\bar\alpha)}(\bar\theta_1, \bar\theta_2)$. Let $\beta'_f$ be obtained from $\beta'_c$ by splitting each $(a_i, b_i)$ according to $\gamma_i$, then the regular conditional distribution of $\beta'_f$ given $\beta'_c$ is $\mathrm{Frag}^{(\alpha)}(\theta_1,\theta_2)$. We will show that $(\beta_c, \beta_f) \overset{d}{=} (\beta'_c, \beta'_f)$.

To this end, apply the paintbox construction described before Proposition 2.9 to the nested $\beta'_c$ and $\beta'_f$, by using the same sequence of i.i.d. uniform random variables $(Z_j, j \in \mathbb{N})$ on $[0,1]$. For each $n \in \mathbb{N}$, let $C^*_c(n)$ and $C^*_f(n)$ be the compositions of the set $[n]$ obtained as in (8), associated with $\beta'_c$ and $\beta'_f$ respectively. Write $(C'_c(n), C'_f(n))$ for the integer compositions associated with $(C^*_c(n), C^*_f(n))$, then $(n^{-1} C'_c(n), n^{-1} C'_f(n))$ converges a.s. to $(\beta'_c, \beta'_f)$ by [22, Theorem 11].

Note that each $(a_i, b_i) \in \beta'_c$ corresponds to a cluster of customers in $C^*_c(n)$, which are further divided into ordered tables in $C^*_f(n)$. This procedure can be understood as a paintbox construction, independent of other clusters, by using $\gamma_i \sim \mathrm{PDIP}^{(\alpha)}(\theta_1,\theta_2)$ and i.i.d. uniform random variables $\{(Z_j - a_i)/(b_i - a_i)\colon Z_j \in (a_i, b_i), j \in \mathbb{N}\}$ on $[0,1]$. By Proposition 2.9, it has the same effect as an $\mathrm{oCRP}^{(\alpha)}(\theta_1,\theta_2)$. For each $n \in \mathbb{N}$, as we readily have $C'_c(n) \overset{d}{=} C_c(n)$ by Proposition 2.9, it follows that $(C'_c(n), C'_f(n)) \overset{d}{=} (C_c(n), C_f(n))$. As a result, we deduce that the limits also have the same law, i.e. $(\beta_c, \beta_f) \overset{d}{=} (\beta'_c, \beta'_f)$.

5.3. Coarse-fine interval partition evolutions. We consider the space

\[ \mathcal{I}^2_{\mathrm{nest}} := \big\{ (\gamma_c, \gamma_f) \in \mathcal{I}_H \times \mathcal{I}_H \colon G(\gamma_c) \subseteq G(\gamma_f),\ \|\gamma_c\| = \|\gamma_f\| \big\}. \]

In other words, for each element $(\gamma_c, \gamma_f)$ in this space, the interval partition $\gamma_f$ is a refinement of $\gamma_c$ such that each interval $U \in \gamma_c$ is further split into intervals in $\gamma_f$, forming an interval partition $\gamma_f|_U$ of $[\inf U, \sup U]$. We also define the shifted interval partition $\gamma_f|_U^{\leftarrow} := \{(a, b)\colon (a + \inf U, b + \inf U) \in \gamma_f|_U\}$ of $[0, \mathrm{Leb}(U)]$ and note that $\gamma_f|_U^{\leftarrow} \in \mathcal{I}_H$. We equip $\mathcal{I}^2_{\mathrm{nest}}$ with the product metric

\[ d^2_H\big( (\gamma_c, \gamma_f), (\gamma'_c, \gamma'_f) \big) = d_H(\gamma_c, \gamma'_c) + d_H(\gamma_f, \gamma'_f). \]
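For interval partitions with finitely many intervals, membership in this space and the shifted restriction can be checked directly. The Python sketch below is an illustration under simplifying assumptions that are ours: partitions are lists of (left, right) pairs that tile $[0, \|\gamma\|]$ without gaps, so that the set $G(\gamma)$ reduces to the finite set of endpoints, and the names `is_nested` and `restrict_and_shift` are hypothetical.

```python
def restrict_and_shift(gamma_f, U):
    """Intervals of gamma_f inside U = (a, b), shifted so that they partition [0, b - a]."""
    a, b = U
    return [(l - a, r - a) for (l, r) in gamma_f if a <= l and r <= b]

def is_nested(gamma_c, gamma_f):
    """Check that gamma_f refines gamma_c and that both have the same total mass.

    Only valid for finite partitions without gaps, where the closed set G(gamma)
    is just the set of interval endpoints.
    """
    def endpoints(gamma):
        return {x for interval in gamma for x in interval}

    def mass(gamma):
        return sum(r - l for l, r in gamma)

    return endpoints(gamma_c) <= endpoints(gamma_f) and mass(gamma_c) == mass(gamma_f)
```

With `gamma_c = [(0, 2), (2, 5)]` and `gamma_f = [(0, 1), (1, 2), (2, 5)]`, the pair is nested, and restricting the fine partition to the coarse interval (2, 5) and shifting gives the single interval (0, 3).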

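LEMMA 5.9. For each $n \ge 1$, let $(\beta_n, \gamma_n) \in \mathcal{I}^2_{\mathrm{nest}}$. Suppose that $(\beta_n, \gamma_n)$ converges to $(\beta_\infty, \gamma_\infty)$ under the product metric $d^2_H$. Then $(\beta_\infty, \gamma_\infty) \in \mathcal{I}^2_{\mathrm{nest}}$.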

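PROOF. This requires us to prove $G(\beta_\infty) \subseteq G(\gamma_\infty)$. As $G(\gamma_\infty)$ is closed, it is equivalent to show that, for any $x \in G(\beta_\infty)$, the distance $d(x, G(\gamma_\infty))$ from $x$ to the set $G(\gamma_\infty)$ is zero. For any $y_n \in G(\beta_n) \subseteq G(\gamma_n)$, we have $d(x, G(\gamma_\infty)) \le d(x, y_n) + d_H(\gamma_n, \gamma_\infty)$. It follows that

\[ d(x, G(\gamma_\infty)) \le \inf_{y_n \in G(\beta_n)} d(x, y_n) + d_H(\gamma_n, \gamma_\infty) \le d_H(\beta_\infty, \beta_n) + d_H(\gamma_n, \gamma_\infty). \]

As $n \to \infty$, the right-hand side converges to zero. So we conclude that $d(x, G(\gamma_\infty)) = 0$ for every $x \in G(\beta_\infty)$, completing the proof.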

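We shall now construct a coarse-fine interval partition evolution in the space $\mathcal{I}^2_{\mathrm{nest}}$. To this end, let us first extend the scaffolding-and-spindles construction in Section 3.1 to the setting where each spindle is an interval-partition-valued excursion. Denote by $\mathcal{E}_{\mathcal{I}}$ the space of continuous $\mathcal{I}_H$-valued excursions. Given a point measure $W$ on $\mathbb{R}_+ \times \mathcal{E}_{\mathcal{I}}$ and a scaffolding function $X\colon \mathbb{R}_+ \to \mathbb{R}$, we define the following variables, if they are well-defined. The coarse skewer of $(W, X)$ at level $y \in \mathbb{R}$ is the interval partition

\[ \mathrm{cSKEWER}(y, W, X) := \Big\{ \big( M^y_{W,X}(t-), M^y_{W,X}(t) \big)\colon t \ge 0,\ M^y_{W,X}(t-) < M^y_{W,X}(t) \Big\}, \]

where $M^y_{W,X}(t) := \int_{[0,t] \times \mathcal{E}_{\mathcal{I}}} \big\| \gamma\big( y - X(s-) \big) \big\|\, W(ds, d\gamma)$, $t \ge 0$. Let $\mathrm{cSKEWER}(W, X) := (\mathrm{cSKEWER}(y, W, X), y \ge 0)$. The fine skewer of $(W, X)$ at level $y \in \mathbb{R}$ is the interval partition

\[ \mathrm{fSKEWER}(y, W, X) := \mathop{\star}_{\text{points } (t, \gamma_t) \text{ of } W\colon M^y_{W,X}(t-) < M^y_{W,X}(t)} \gamma_t\big( y - X(t-) \big). \]

Let $\mathrm{fSKEWER}(W, X) := (\mathrm{fSKEWER}(y, W, X), y \ge 0)$.

Let $\theta_1, \theta_2 \ge 0$. Suppose that $\theta = \theta_1 + \theta_2 - \alpha \in [-\alpha, 0)$, then we have an $\mathrm{SSIP}^{(\alpha)}(\theta_1,\theta_2)$-excursion measure $\Theta := \Theta^{(\alpha)}(\theta_1,\theta_2)$ defined as in Section 4.4. Write $\bar\alpha := -\theta \in (0, \alpha]$ and let $\mathbf{W}$ be a Poisson random measure on $\mathbb{R}_+ \times \mathcal{E}_{\mathcal{I}}$ with intensity $c_{\bar\alpha} \mathrm{Leb} \otimes \Theta$, where $c_{\bar\alpha} := 2\bar\alpha(1 + \bar\alpha)/\Gamma(1 - \bar\alpha)$. We pair $\mathbf{W}$ with a (coarse) scaffolding $(\xi^{(\bar\alpha)}_{\mathbf{W}}(t), t \ge 0)$ defined by

\[ \xi^{(\bar\alpha)}_{\mathbf{W}}(t) := \lim_{z \downarrow 0} \bigg( \int_{[0,t] \times \{\gamma \in \mathcal{E}_{\mathcal{I}}\colon \zeta(\gamma) > z\}} \zeta(\gamma)\, \mathbf{W}(ds, d\gamma) - \frac{(1 + \bar\alpha)\, t}{(2z)^{\bar\alpha}\, \Gamma(1 - \bar\alpha)\, \Gamma(1 + \bar\alpha)} \bigg). \tag{56} \]

This is a spectrally positive stable Lévy process of index $1 + \bar\alpha$. Let $\beta$ be an $\mathrm{SSIP}^{(\alpha)}_{\dagger}(\theta_1,\theta_2)$-evolution starting from $\gamma_0 \in \mathcal{I}_H$ with first hitting time $\zeta(\beta)$ of $\emptyset$. We denote by $\mathbf{Q}^{(\alpha)}_{\gamma_0}(\theta_1,\theta_2)$ the law of the following random point measure on $[0,\infty) \times \mathcal{E}_{\mathcal{I}}$:

\[ \delta(0, \beta) + \mathbf{W}\big|_{(0, T_{-\zeta(\beta)}] \times \mathcal{E}_{\mathcal{I}}}, \qquad \text{where } T_{-y} := \inf\big\{ t \ge 0\colon \xi^{(\bar\alpha)}_{\mathbf{W}}(t) = -y \big\}. \tag{57} \]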

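DEFINITION 5.10 (Coarse-fine SSIP-evolutions). Let $\alpha \in (0,1)$, $\theta_1, \theta_2 \ge 0$ with $\theta_1 + \theta_2 < \alpha$. Let $\bar\alpha := \alpha - \theta_1 - \theta_2 \in (0, \alpha]$. For $(\gamma_c, \gamma_f) \in \mathcal{I}^2_{\mathrm{nest}}$, let $W_U \sim \mathbf{Q}^{(\alpha)}_{\gamma_f|_U^{\leftarrow}}(\theta_1,\theta_2)$, $U \in \gamma_c$, be an independent family with scaffolding $\xi^{(\bar\alpha)}_{W_U}$ as in (56). Then the pair-valued process

\[ \Big( \mathop{\star}_{U \in \gamma_c} \mathrm{cSKEWER}\big( y, W_U, \xi^{(\bar\alpha)}_{W_U} \big),\ \mathop{\star}_{U \in \gamma_c} \mathrm{fSKEWER}\big( y, W_U, \xi^{(\bar\alpha)}_{W_U} \big) \Big), \qquad y \ge 0, \]

is called a coarse-fine $(\alpha, \theta_1, \theta_2, 0)$-self-similar interval partition evolution, starting from $(\gamma_c, \gamma_f)$, abbreviated as $\mathrm{cfSSIP}^{(\alpha,\theta_1,\theta_2)}(0)$-evolution.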

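Roughly speaking, it is a random refinement of an $\mathrm{SSIP}^{(\bar\alpha)}(0)$-evolution according to $\mathrm{SSIP}^{(\alpha)}(\theta_1,\theta_2)$-excursions. To add immigration to this model, let $\mathbf{W} \sim \mathrm{PRM}(c_{\bar\alpha} \mathrm{Leb} \otimes \Theta)$ and consider its coarse scaffolding $\xi^{(\bar\alpha)}_{\mathbf{W}}$ as in (56). For $\bar\theta \ge 0$, as in (13), define the process

\[ X_{\bar\theta}(t) := \xi^{(\bar\alpha)}_{\mathbf{W}}(t) + \big( 1 - \bar\alpha/\bar\theta \big)\, \ell(t), \quad t \ge 0, \qquad \text{where } \ell(t) := -\inf_{u \le t} \xi^{(\bar\alpha)}_{\mathbf{W}}(u). \tag{58} \]

For each $j \in \mathbb{N}$, set $T^{-j}_{\bar\theta} := \inf\{t \ge 0\colon X_{\bar\theta}(t) = -j\}$ and define nested processes

\[ \overleftarrow{\beta}_{c,j}(y) := \mathrm{cSKEWER}\Big( y,\ \mathbf{W}\big|_{[0, T^{-j}_{\bar\theta})},\ j + X_{\bar\theta}\big|_{[0, T^{-j}_{\bar\theta})} \Big), \qquad y \in [0, j], \]
\[ \overleftarrow{\beta}_{f,j}(y) := \mathrm{fSKEWER}\Big( y,\ \mathbf{W}\big|_{[0, T^{-j}_{\bar\theta})},\ j + X_{\bar\theta}\big|_{[0, T^{-j}_{\bar\theta})} \Big), \qquad y \in [0, j]. \]

As in Section 3.1, we find that $\big( (\overleftarrow{\beta}_{c,j}(y), \overleftarrow{\beta}_{f,j}(y)), y \in [0, j] \big) \overset{d}{=} \big( (\overleftarrow{\beta}_{c,k}(y), \overleftarrow{\beta}_{f,k}(y)), y \in [0, j] \big)$ for all $k \ge j$. Thus, by Kolmogorov's extension theorem, there exists a process $(\overleftarrow{\beta}_c, \overleftarrow{\beta}_f)$ such that $\big( (\overleftarrow{\beta}_c(y), \overleftarrow{\beta}_f(y)), y \in [0, j] \big) \overset{d}{=} \big( (\overleftarrow{\beta}_{c,j}(y), \overleftarrow{\beta}_{f,j}(y)), y \in [0, j] \big)$ for every $j \in \mathbb{N}$.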

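DEFINITION 5.11 (Coarse-fine SSIP-evolutions with immigration). Let $\bar\theta, \theta_1, \theta_2 \ge 0$, $\alpha \in (0,1)$, $\bar\alpha = \alpha - \theta_1 - \theta_2 \in (0, \alpha]$ and $(\gamma_c, \gamma_f) \in \mathcal{I}^2_{\mathrm{nest}}$. Let $(\overleftarrow{\beta}_c, \overleftarrow{\beta}_f)$ be defined as above and $(\overrightarrow{\beta}_c, \overrightarrow{\beta}_f)$ an independent $\mathrm{cfSSIP}^{(\alpha,\theta_1,\theta_2)}(0)$-evolution starting from $(\gamma_c, \gamma_f)$. Then we call $\big( \overleftarrow{\beta}_c(y) \star \overrightarrow{\beta}_c(y),\ \overleftarrow{\beta}_f(y) \star \overrightarrow{\beta}_f(y) \big)$, $y \ge 0$, a coarse-fine $(\alpha, \theta_1, \theta_2)$-self-similar interval partition evolution with immigration rate $\bar\theta$, starting from $(\gamma_c, \gamma_f)$, or a $\mathrm{cfSSIP}^{(\alpha,\theta_1,\theta_2)}(\bar\theta)$-evolution.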

By construction, the coarse process of a $\mathrm{cfSSIP}^{(\alpha,\theta_1,\theta_2)}(\bar\theta)$-evolution is an $\mathrm{SSIP}^{(\bar\alpha)}(\bar\theta)$-evolution. For the special case $\theta_1 = \theta_2 = 0$, the fine process coincides with the coarse one.

REMARK. By combining Definition 5.11 and Definition 4.1, one can further construct a coarse-fine SSIP-evolution with the coarse process being an $\mathrm{SSIP}^{(\bar\alpha)}(\bar\theta_1, \bar\theta_2)$-evolution.

5.4. Convergence of nested PCRPs. For $\bar\alpha \in (0,1)$ and $\bar\theta \ge 0$, let $(C_c(t), t \ge 0)$ be a Poissonised Chinese restaurant process $\mathrm{PCRP}^{(\bar\alpha)}(\bar\theta, \bar\alpha)$ as in Section 3.2. Recall that for each cluster of $C_c$, the mass evolves according to a Markov chain of law $\pi(-\bar\alpha)$ as in (16). Let $\alpha \in (\bar\alpha, 1)$ and $\theta_1, \theta_2 \ge 0$. Suppose that there is the identity

\[ \theta = \theta_1 + \theta_2 - \alpha = -\bar\alpha < 0, \]

then the total mass evolution of a $\mathrm{PCRP}^{(\alpha)}(\theta_1,\theta_2)$ also has distribution $\pi(-\bar\alpha)$. Therefore, we can fragment each cluster of $C_c$ into a $\mathrm{PCRP}^{(\alpha)}(\theta_1,\theta_2)$ as follows. In each cluster, customers are further attributed into an ordered sequence of tables: whenever a customer joins this cluster, they choose an existing table or add a new table according to the $(\alpha, \theta_1, \theta_2)$-seating rule; whenever the cluster size reduces by one, a customer is chosen uniformly to leave. As a result, we embed a $\mathrm{PCRP}^{(\alpha)}(\theta_1,\theta_2)$ into each cluster of $C_c$, independently of the others. The rates at which customers arrive are illustrated in Figure 3. For each time $t \ge 0$, by concatenating the compositions of ordered table sizes of the clusters, from left to right according to the order of clusters, we obtain a composition $C_f(t)$, representing the numbers of customers at all tables. Then $C_f(t)$ is a refinement of $C_c(t)$. One can easily check that $(C_f(t), t \ge 0)$ is a $\mathrm{PCRP}^{(\alpha)}(\theta_1 + \bar\theta, \theta_2 + \bar\alpha)$. We refer to the pair $((C_c(t), C_f(t)), t \ge 0)$ as a pair of nested PCRPs.
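The claim about the law of the fine process can be checked by tallying up-step rates. The Python sketch below does this with exact rational arithmetic; it is only an illustration under our own conventions: a coarse state is a list of clusters, each a list of table sizes, the function names `direct_rates` and `nested_rates` are hypothetical, and the up-step rates are those recalled in the introduction (rate m − α to join a table of size m, rate α between neighbouring tables, and the two immigration rates at the ends).

```python
from fractions import Fraction

def direct_rates(tables, alpha, left, right):
    """Up-step rates of a PCRP with parameters (alpha; left, right).

    Returns (join, new): join[i] is the rate of joining table i, and new[j] is the
    rate of opening a table in gap j (gap 0 is the far left, the last gap the far right).
    """
    join = [m - alpha for m in tables]
    new = [left] + [alpha] * (len(tables) - 1) + [right]
    return join, new

def nested_rates(clusters, alpha, theta1, theta2, alpha_bar, left_bar, right_bar):
    """Fine-level up-step rates induced by the nested construction."""
    join, new = [], [left_bar]  # far left: a new cluster arrives at rate left_bar
    for idx, cluster in enumerate(clusters):
        m = sum(cluster)
        total = m + theta1 + theta2 - alpha  # total seating weight inside the cluster
        scale = (m - alpha_bar) / total      # coarse joining rate per unit of seating weight
        if idx > 0:
            new[-1] += alpha_bar             # new cluster between two clusters
        new[-1] += scale * theta1            # new leftmost table of this cluster
        for j, size in enumerate(cluster):
            join.append(scale * (size - alpha))
            if j < len(cluster) - 1:
                new.append(scale * alpha)    # new table between two tables of the cluster
        new.append(scale * theta2)           # new rightmost table of this cluster
    new[-1] += right_bar                     # far right: a new cluster at rate right_bar
    return join, new
```

For example, with α = 1/2, θ1 = θ2 = 1/8 (so that ᾱ = 1/4), θ̄ = 3/2 and clusters `[[2, 1], [3]]`, the nested rates agree with `direct_rates([2, 1, 3], alpha, theta1 + theta_bar, theta2 + alpha_bar)`; in particular the gap between the two clusters receives θ2 + ᾱ + θ1 = α.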

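THEOREM 5.12 (Convergence of nested PCRPs). For each $n \in \mathbb{N}$, let $(C_c^{(n)}, C_f^{(n)})$ be a pair of nested PCRPs as defined above, starting from $(\gamma_c^{(n)}, \gamma_f^{(n)}) \in \mathcal{I}^2_{\mathrm{nest}}$. Suppose that $\frac{1}{n}(\gamma_c^{(n)}, \gamma_f^{(n)})$ converges to $(\gamma_c^{(\infty)}, \gamma_f^{(\infty)}) \in \mathcal{I}^2_{\mathrm{nest}}$ under the product metric $d^2_H$. Then the following convergence holds in distribution in the space of càdlàg functions on $\mathcal{I}_H \times \mathcal{I}_H$ endowed with the Skorokhod topology,

\[ \Big( \frac{1}{n} \big( C_c^{(n)}(2nt), C_f^{(n)}(2nt) \big),\ t \ge 0 \Big) \underset{n\to\infty}{\longrightarrow} \big( (\beta_c(t), \beta_f(t)),\ t \ge 0 \big), \]

where the limit $(\beta_c, \beta_f) = ((\beta_c(t), \beta_f(t)), t \ge 0)$ is a $\mathrm{cfSSIP}^{(\alpha,\theta_1,\theta_2)}(\bar\theta)$.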

PROOF. The arguments are very similar to those in the proof of Theorem 3.12 and Proposition 3.15, with an application of Theorem 1.5, which replaces the role of Proposition 3.10. Let us sketch the main steps:
• Let $\mathbf{W}^{(n)}$ be a Poisson random measure of rescaled excursions of $\mathrm{PCRP}^{(\alpha)}(\theta_1,\theta_2)$ with intensity $2\bar\alpha n^{1+\bar\alpha} \mathrm{P}^{(n)}$, where $\mathrm{P}^{(n)}$ is as in Theorem 1.5. Write $\xi^{(n)}$ for the associated scaffolding of $\mathbf{W}^{(n)}$ defined as in (17) and $M^{(n)}$ for the total mass of the coarse skewer. Since by Theorem 1.5 the intensity measure converges vaguely to the $\mathrm{SSIP}^{(\alpha)}(\theta_1,\theta_2)$-excursion measure $c_{\bar\alpha}\Theta$, in analogy with Proposition 3.7, the sequence $(\mathbf{W}^{(n)}, \xi^{(n)}, M^{(n)})$ can be constructed such that it converges a.s. to $(\mathbf{W}, \xi, M)$, where $\xi$ and $M$ are the scaffolding defined as in (56) and the coarse skewer total mass of $\mathbf{W} \sim \mathrm{PRM}(c_{\bar\alpha} \mathrm{Leb} \otimes \Theta)$, respectively.
• Using similar methods as in Theorem 3.12 proves the convergence when $\bar\theta = 0$. More precisely, using the sequence $\mathbf{W}^{(n)}$ obtained in the previous step, we give a scaffolding-and-spindles construction for each rescaled nested pair $\big( \frac{1}{n} C_c^{(n)}(2n\,\cdot\,), \frac{1}{n} C_f^{(n)}(2n\,\cdot\,) \big)$, as in the description below Lemma 3.13 and in Section 3.2. We first study the case when the initial state of the coarse process is a single interval as in Lemma 3.13, and then extend to any initial state by coupling the large clades and controlling the total mass of the remainder.
• When $\bar\theta > 0$, we proceed as in the proof of Proposition 3.15: we prove that the modified scaffolding converges and then the skewer process also converges.
Summarising, we deduce the convergence of nested PCRPs to the coarse-fine skewer processes, as desired.

Having Theorem 5.12, we can now identify the fine process by Theorem 1.1.

PROPOSITION 5.13 (Nested SSIP-evolutions). Let $\alpha \in (0,1)$, $\theta_1, \theta_2, \bar\theta \ge 0$ and suppose that $\theta = \theta_1 + \theta_2 - \alpha < 0$. In a $\mathrm{cfSSIP}^{(\alpha,\theta_1,\theta_2)}(\bar\theta)$-evolution, the coarse and fine processes are $\mathrm{SSIP}^{(\bar\alpha)}(\bar\theta)$- and $\mathrm{SSIP}^{(\alpha)}(\theta_1 + \bar\theta, \theta_2 + \bar\alpha)$-evolutions respectively, where $\bar\alpha = -\theta$.

PROOF. We may assume this cfSSIP is the limit of a sequence of nested PCRPs. Since the coarse processes form a sequence of $\mathrm{PCRP}^{(\bar\alpha)}(\bar\theta)$ that converges in its own right, Theorem 1.1 shows that the limit is an $\mathrm{SSIP}^{(\bar\alpha)}(\bar\theta)$-evolution. Similarly, since the fine process is the limit of a sequence of $\mathrm{PCRP}^{(\alpha)}(\theta_1 + \bar\theta, \theta_2 + \bar\alpha)$, it is an $\mathrm{SSIP}^{(\alpha)}(\theta_1 + \bar\theta, \theta_2 + \bar\alpha)$-evolution.

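PROPOSITION 5.14 (Pseudo-stationarity). Let $\alpha\in(0,1)$, $\theta_1,\theta_2\ge 0$ with $\bar\alpha:=\alpha-\theta_1-\theta_2\in(0,\alpha]$, and $\bar\theta\ge 0$. Let $Z\sim\mathrm{BESQ}(2\bar\alpha)$ and $\bar\gamma_c\sim\mathrm{PDIP}^{(\bar\alpha)}(\bar\theta,\bar\alpha)$ be independent and $\bar\gamma_f\sim\mathrm{Frag}^{(\alpha)}(\theta_1,\theta_2)(\bar\gamma_c,\,\cdot\,)$. Let $((\beta_c(t),\beta_f(t)),\,t\ge 0)$ be a $\mathrm{cfSSIP}^{(\alpha,\theta_1,\theta_2)}(\bar\theta)$-evolution starting from $(Z(0)\bar\gamma_c,Z(0)\bar\gamma_f)$. Then $(\beta_c(t),\beta_f(t))\overset{d}{=}(Z(t)\bar\gamma_c,Z(t)\bar\gamma_f)$ for each $t\ge 0$.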

PROOF. We may assume this cfSSIP-evolution is the limit of a sequence of nested PCRPs $(C_c^{(n)},C_f^{(n)})$, with $(C_c^{(n)},C_f^{(n)})$ starting from nested compositions of $[n]$ with distribution as in Lemma 5.8. By similar arguments as in Lemma 4.5, we deduce that, given the total number of customers $m:=\|C_c^{(n)}(t)\|=\|C_f^{(n)}(t)\|$ at time $t\ge 0$, the conditional distribution of $(C_c^{(n)}(t),C_f^{(n)}(t))$ is given by nested $\mathrm{oCRP}_m^{(\bar\alpha)}(\bar\theta,\bar\alpha)$ and $\mathrm{oCRP}_m^{(\alpha)}(\theta_1+\bar\theta,\theta_2+\bar\alpha)$ described above Lemma 5.8. The claim then follows from Lemma 5.8 and Theorem 5.12.

PROPOSITION 5.15 (Markov property). A $\mathrm{cfSSIP}^{(\alpha,\theta_1,\theta_2)}(\bar\theta)$-evolution is a Markov process on $(\mathcal{I}^2_{\mathrm{nest}},d_H^2)$ with continuous paths.

To prove Proposition 5.15, we first give a property of the excursion measure $\Theta^{(\alpha)}(\theta_1,\theta_2)$. For any $\mathcal{I}_H$-valued process $\gamma=(\gamma(y),\,y\ge 0)$ and $a>0$, let $H^a(\gamma):=\inf\{y\ge 0\colon\|\gamma(y)\|>a\}$.

LEMMA 5.16. For $a>0$, let $\beta=(\beta(y),\,y\ge 0)\sim\Theta^{(\alpha)}(\theta_1,\theta_2)(\,\cdot\,|\,H^a<\infty)$. Conditionally on $(\beta(r),\,r\le H^a(\beta))$, the process $(\beta(H^a(\beta)+z),\,z\ge 0)$ is an $\mathrm{SSIP}^{(\alpha)}(\theta_1,\theta_2)$-evolution starting from $\beta(H^a(\beta))$.

PROOF. For $k\in\mathbb{N}$, let $H^a_k:=2^{-k}\lceil 2^kH^a\rceil\wedge 2^k$. Then $H^a_k$ is a stopping time that a.s. only takes a finite number of possible values and eventually decreases to $H^a$. By (42), the desired property is satisfied by each $H^a_k$. Then we deduce the result for $H^a$ by approximation, using the path-continuity and Hunt property of $\mathrm{SSIP}^{(\alpha)}(\theta_1,\theta_2)$-evolutions of Theorem 4.6.
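The dyadic approximation $H^a_k=2^{-k}\lceil 2^kH^a\rceil\wedge 2^k$ used in this proof is easy to tabulate. The following minimal sketch (the function name and the sample values are illustrative assumptions, not from the paper) shows why the approximation only "eventually" decreases to the limit: for small $k$ the cap $2^k$ may lie below $H^a$, but once $2^k\ge H^a$ the values lie on the grid $2^{-k}\mathbb{N}$, dominate $H^a$ and are non-increasing in $k$.

```python
import math


def dyadic_upper_approximation(h, k):
    """Return min(2^-k * ceil(2^k * h), 2^k) for a value h in [0, infinity]."""
    cap = 2.0 ** k
    if math.isinf(h):
        # On the event that the level is never exceeded, only the cap remains.
        return cap
    return min(math.ceil(h * 2 ** k) / 2 ** k, cap)


if __name__ == "__main__":
    for h in (0.3, 10.0):  # hypothetical values of the hitting time
        print(h, [dyadic_upper_approximation(h, k) for k in range(1, 8)])
```

For a fixed $k$ the approximation takes at most $4^k+1$ values, which is the finiteness used to apply the simple Markov property at $H^a_k$.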

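For $(\gamma_c,\gamma_f)\in\mathcal{I}^2_{\mathrm{nest}}$, let $(W_U,\,U\in\gamma_c)$ be a family of independent clades, with each $W_U\sim\mathrm{Q}^{(\alpha)}_{\gamma_f|^{\leftarrow}_U}(\theta_1,\theta_2)$. Let $\xi^{(\bar\alpha)}_U$ be the scaffolding associated with $W_U$ as in (56) and write $\mathrm{len}(W_U):=\inf\{s\ge 0\colon\xi^{(\bar\alpha)}_U(s)=0\}$ for its length, which is a.s. finite. Then we define the concatenation of $(W_U,\,U\in\gamma_c)$ by
$$\mathop{\star}_{U\in\gamma_c}W_U:=\sum_{U\in\gamma_c}\int\delta(g(U)+t,\beta)\,W_U(dt,d\beta),\quad\text{where } g(U)=\sum_{V\in\gamma_c,\ \sup V\le\inf U}\mathrm{len}(W_V).$$
Write $\mathrm{Q}^{(\alpha)}_{(\gamma_c,\gamma_f)}(\theta_1,\theta_2)$ for the law of $\mathop{\star}_{U\in\gamma_c}W_U$. We next present a Markov-like property for such point measures of interval partition excursions, analogous to [14, Proposition 6.6].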

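LEMMA 5.17. For $(\gamma_c,\gamma_f)\in\mathcal{I}^2_{\mathrm{nest}}$, let $W\sim\mathrm{Q}^{(\alpha)}_{(\gamma_c,\gamma_f)}(\theta_1,\theta_2)$ and $X=\xi^{(\bar\alpha)}_W$. For $y\ge 0$, set
$$\mathrm{cutoff}^{\ge y}_W=\sum_{\text{points }(t,\gamma_t)\text{ of }W}\Big(\mathbf{1}\{X(t-)\ge y\}\,\delta(\sigma^y(t),\gamma_t)+\mathbf{1}\{y\in(X(t-),X(t))\}\,\delta(\sigma^y(t),\hat\gamma^y_t)\Big),$$
where $\sigma^y(t)=\mathrm{Leb}\{u\le t\colon X(u)>y\}$ and $\hat\gamma^y_t=(\gamma_t(y-X(t-)+z),\,z\ge 0)$. Similarly define $\mathrm{cutoff}^{\le y}_W$. Given $(\beta_c(y),\beta_f(y))=(\mathrm{cSKEWER}(y,W,X),\mathrm{fSKEWER}(y,W,X))$, $\mathrm{cutoff}^{\ge y}_W$ is conditionally independent of $\mathrm{cutoff}^{\le y}_W$ and has conditional distribution $\mathrm{Q}^{(\alpha)}_{(\beta_c(y),\beta_f(y))}(\theta_1,\theta_2)$.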

PROOF. Recall that the construction of the nested processes is a modification of the scaffolding-and-spindles construction of the coarse component, with the same scaffolding and the $\Lambda^{(-2\bar\alpha)}_{\mathrm{BESQ}}$-excursions being replaced by the interval-partition excursions under $\Theta$. In view of this, we can follow the same arguments as in the proof of [14, Proposition 6.6], with an application of Lemma 5.16.

PROOF OF PROPOSITION 5.15. The path-continuity follows directly from that of an SSIP-evolution. As in [14, Corollary 6.7], Lemma 5.17 can be translated to the skewer process under $\mathrm{Q}^{(\alpha)}_{(\gamma_c,\gamma_f)}(\theta_1,\theta_2)$, thus giving the Markov property for $\mathrm{cfSSIP}^{(\alpha,\theta_1,\theta_2)}(0)$-evolutions.

When the immigration rate is $\bar\theta>0$, we introduce an excursion measure $\Theta_{\mathrm{nest}}$ on the space of continuous $\mathcal{I}^2_{\mathrm{nest}}$-excursions, such that the coarse excursion is a $\Theta^{(\bar\alpha)}(0,\bar\alpha)$, and each of its $\mathrm{BESQ}(-2\bar\alpha)$-excursions is split into a $\Theta^{(\alpha)}(\theta_1,\theta_2)$-excursion. More precisely, for $y>0$, it has the following properties:
1. $\Theta_{\mathrm{nest}}(\zeta>y)=\Theta^{(\bar\alpha)}(0,\bar\alpha)(\zeta>y)=y^{-1}$.
2. If $(\beta_c,\beta_f)\sim\Theta_{\mathrm{nest}}(\,\cdot\,|\,\zeta>y)$, then $(\beta_c(y),\beta_f(y))\overset{d}{=}\mathrm{Gamma}(1-\bar\alpha,1/2y)(\bar\gamma_c,\bar\gamma_f)$, where $\bar\gamma_c\sim\mathrm{PDIP}^{(\bar\alpha)}(0)$ and the conditional distribution of $\bar\gamma_f$ given $\bar\gamma_c$ is $\mathrm{Frag}^{(\alpha)}(\theta_1,\theta_2)$. Moreover, conditionally on $(\beta_c(y),\beta_f(y))$, the process $((\beta_c(y+z),\beta_f(y+z)),\,z\ge 0)$ is a $\mathrm{cfSSIP}^{(\alpha,\theta_1,\theta_2)}(0)$-evolution.

Having obtained the pseudo-stationarity (Proposition 5.14) and the Markov property of $\mathrm{cfSSIP}^{(\alpha,\theta_1,\theta_2)}(0)$-evolutions, the construction of $\Theta_{\mathrm{nest}}$ can be made by a similar approach as in Section 4.4.

Using $F\sim\mathrm{PRM}(\bar\theta\,\mathrm{Leb}\otimes\Theta_{\mathrm{nest}})$, by the construction in [18, Section 3], the following process has the same law as a $\mathrm{cfSSIP}^{(\alpha,\theta_1,\theta_2)}(\bar\theta)$-evolution starting from $(\emptyset,\emptyset)$: for $y\ge 0$,
$$\beta_c(y)=\mathop{\star}_{\text{points }(s,\gamma_c,\gamma_f)\text{ of }F\colon s\in[0,y]\downarrow}\gamma_c(y-s),\qquad\beta_f(y)=\mathop{\star}_{\text{points }(s,\gamma_c,\gamma_f)\text{ of }F\colon s\in[0,y]\downarrow}\gamma_f(y-s).$$

The Markov property of $\mathrm{cfSSIP}^{(\alpha,\theta_1,\theta_2)}(\bar\theta)$-evolutions is now a consequence of this Poissonian construction and the form of $\Theta_{\mathrm{nest}}$; see the proof of [18, Lemma 3.10] for details.

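THEOREM 5.18. For any $\theta\ge 0$ and pairwise nested $\gamma_\alpha\in\mathcal{I}_H$, $\alpha\in(0,1)$, there exists a nested family $(\beta_\alpha,\,\alpha\in(0,1))$ of $\mathrm{SSIP}^{(\alpha)}(\theta)$-evolutions, in the following sense:
1. each $\beta_\alpha$ is an $\mathrm{SSIP}^{(\alpha)}(\theta)$-evolution starting from $\gamma_\alpha$;
2. for any $0<\bar\alpha<\alpha<1$, $(\beta_{\bar\alpha},\beta_\alpha)$ takes values in $\mathcal{I}^2_{\mathrm{nest}}$.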

PROOF. For $0<\bar\alpha<\alpha<1$, let $(\beta_c,\beta_f)=((\beta_c(y),\beta_f(y)),\,y\ge 0)$ be a $\mathrm{cfSSIP}^{(\alpha,0,\alpha-\bar\alpha)}(\theta)$-evolution starting from $(\gamma_{\bar\alpha},\gamma_\alpha)\in\mathcal{I}^2_{\mathrm{nest}}$. Then by Proposition 5.13, the coarse process $\beta_c$ is an $\mathrm{SSIP}^{(\bar\alpha)}(\theta)$-evolution and the fine process $\beta_f$ is an $\mathrm{SSIP}^{(\alpha)}(\theta)$-evolution. This induces a kernel $\kappa_{\alpha,\bar\alpha}$ from the coarse process to the fine process. Arguing by approximation as in Theorem 5.12, we can prove that $\kappa_{\alpha_1,\alpha_2}\circ\kappa_{\alpha_2,\alpha_3}=\kappa_{\alpha_1,\alpha_3}$ for all $0<\alpha_1<\alpha_2<\alpha_3<1$. More generally, for any finitely many $0<\alpha_1<\alpha_2<\cdots<\alpha_n<1$, we can find nested $(\beta_{\alpha_i},\,1\le i\le n)$ that are consistently related by these kernels. We can thus construct the full family by using Kolmogorov's extension theorem.

Let $\mathcal{I}^2_{\mathrm{nest},1}:=\{(\gamma_c,\gamma_f)\in\mathcal{I}^2_{\mathrm{nest}}\colon\|\gamma_c\|=\|\gamma_f\|=1\}$ be the space of nested partitions of $[0,1]$.

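THEOREM 5.19. For any $\theta\ge 0$ and pairwise nested $\bar\gamma_\alpha\in\mathcal{I}_{H,1}$, $\alpha\in(0,1)$, there exists a family of processes $(\beta_\alpha,\,\alpha\in(0,1))$ on $\mathcal{I}_{H,1}$, such that
1. each $\beta_\alpha$ is an $\mathrm{IP}^{(\alpha)}(\theta)$-evolution starting from $\bar\gamma_\alpha$;
2. for any $0<\bar\alpha<\alpha<1$, $(\beta_{\bar\alpha},\beta_\alpha)$ takes values in $\mathcal{I}^2_{\mathrm{nest},1}$.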

PROOF. Build a family of SSIP-evolutions (βα, α ∈ (0, 1)) as in Theorem 5.18 on the same probability space. In particular, they have the same total mass process and thus the same de-Poissonisation. So the de-Poissonised family (βα, α ∈ (0, 1)) is still nested.

5.5. An application to alpha-gamma trees. For $n\ge 1$, let $\mathbb{T}_n$ be the space of all (non-planar) trees without degree-2 vertices, with a root vertex of degree 1 and exactly $n$ further degree-1 vertices, the leaves, labelled by $[n]=\{1,\ldots,n\}$. For $\alpha\in(0,1)$ and $\gamma\in[0,\alpha]$, we construct random trees $T_n$ by using the following $(\alpha,\gamma)$-growth rule [9]: $T_1$ and $T_2$ are the unique elements in $\mathbb{T}_1$ and $\mathbb{T}_2$. Given $T_k$ with $k\ge 2$, assign weight $1-\alpha$ to each of the $k$ edges adjacent to a leaf, weight $\gamma$ to each of the other edges, and weight $(d-2)\alpha-\gamma$ to each branch point with degree $d\ge 3$. To create $T_{k+1}$ from $T_k$, choose an edge or a branch point with probability proportional to its weight, and insert the leaf $k+1$ at the chosen edge or branch point. This generalises Rémy's algorithm [41] for the uniform tree (when $\alpha=\gamma=1/2$) and Marchal's recursive construction [33] of $\rho$-stable trees with $\rho\in(1,2]$ (when $\alpha=1-1/\rho$ and $\gamma=1-\alpha$).

For each $T_n$, consider its spinal decomposition as discussed in the introduction, the spine being the path connecting the leaf 1 and the root. Let $C_c(n)$ be the sizes of bushes at the spinal branch points, ordered from left to right in decreasing order of their distances to the root. Then the $(\alpha,\gamma)$-growth rule implies that $(C_c(n),\,n\in\mathbb{N})$ is an $\mathrm{oCRP}^{(\gamma)}(1-\alpha,\gamma)$. Similarly to the semi-planar $(\alpha,\gamma)$-growth trees in [49], we further equip each spinal branch point with a left-to-right ordering of its subtrees, such that the sizes of the subtrees in each bush follow the $(\alpha,0,\alpha-\gamma)$-seating rule. By concatenating the subtree configurations of all bushes according to the order of bushes, we obtain the composition $C_f(n)$ of sizes of subtrees. Then $(C_f(n),\,n\in\mathbb{N})$ is an $\mathrm{oCRP}^{(\alpha)}(1-\alpha,\alpha)$ nested in $(C_c(n),\,n\in\mathbb{N})$, as in Figure 3.

Let us introduce a continuous-time Markov chain $(T(s),\,s\ge 0)$ on $\mathbb{T}=\bigcup_{n\ge 1}\mathbb{T}_n$, the space of labelled rooted trees without degree-2 vertices. Given $T(s)$, assign weights to its branch points and edges as in the $(\alpha,\gamma)$-growth model, such that for each branch point or edge, a new leaf arrives and is attached to this position at the rate given by its weight. Moreover, fix the root and the leaf 1, and delete any other leaf at rate one, together with the edge attached to it; in this operation, if the degree of a branch point is reduced to two, we also delete it and merge the two edges attached to it.

For each $n\ge 1$, consider such a continuous-time up-down Markov chain $(T^{(n)}(s),\,s\ge 0)$ starting from a random tree $T_n$ built by the $(\alpha,\gamma)$-growth rule. At each time $s\ge 0$, with the spine being the path connecting the leaf 1 and the root, we similarly obtain a nested pair $(C_c^{(n)}(s),C_f^{(n)}(s))$, representing the sizes of spinal bushes and subtrees respectively. Then it is clear that $(C_c^{(n)}(s),\,s\ge 0)$ is a $\mathrm{PCRP}^{(\gamma)}(1-\alpha)$ and that $(C_f^{(n)}(s),\,s\ge 0)$ is a $\mathrm{PCRP}^{(\alpha)}(1-\alpha,\alpha)$ nested within $C_c^{(n)}$, such that the size evolution of the subtrees in each bush gives a $\mathrm{PCRP}^{(\alpha)}(0,\alpha-\gamma)$.
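The $(\alpha,\gamma)$-growth rule and the coarse spinal composition can be sketched in a few lines. This is an illustration under stated assumptions rather than the construction of [9]: the data layout (dictionaries `parent` and `children`, root stored as node 0, leaf 1 stored as node 1) and all function names are hypothetical, and the left-to-right ordering of subtrees within a bush, needed for the fine composition, is not modelled. A count of edges and degrees shows that the weights of a tree with $k$ leaves add up to $k-\alpha$, which the sketch lets one check numerically.

```python
import random


def insertion_weights(parent, children, alpha, gamma):
    """List ((kind, node), weight) for every edge and every branch point.

    A non-root node v stands for the edge joining v to parent[v].
    """
    out = []
    for v in parent:
        if not children[v]:
            out.append((("edge", v), 1.0 - alpha))   # edge adjacent to a leaf
        else:
            out.append((("edge", v), gamma))         # any other edge
            degree = len(children[v]) + 1
            # max(...) only guards against floating-point round-off below zero
            out.append((("vertex", v), max(0.0, (degree - 2) * alpha - gamma)))
    return out


def grow_tree(n, alpha, gamma, rng):
    """Grow T_1, ..., T_n and return (parent, children) of T_n."""
    parent, children = {1: 0}, {0: [1], 1: []}       # T_1: root 0 joined to leaf 1
    next_id = 2
    for _ in range(n - 1):
        options = insertion_weights(parent, children, alpha, gamma)
        kind, v = rng.choices([o for o, _ in options],
                              weights=[w for _, w in options])[0]
        leaf = next_id
        next_id += 1
        children[leaf] = []
        if kind == "vertex":                         # attach the new leaf to branch point v
            parent[leaf] = v
            children[v].append(leaf)
        else:                                        # subdivide the edge (parent[v], v)
            b, p = next_id, parent[v]
            next_id += 1
            children[p][children[p].index(v)] = b
            parent[b], children[b] = p, [v, leaf]
            parent[v] = parent[leaf] = b
    return parent, children


def spinal_bush_sizes(parent, children):
    """Bush sizes at spinal branch points, in decreasing distance to the root."""
    def leaves(v):
        return 1 if not children[v] else sum(leaves(c) for c in children[v])
    sizes, v = [], 1
    while parent[v] != 0:                            # walk from leaf 1 towards the root
        p = parent[v]
        sizes.append(sum(leaves(c) for c in children[p] if c != v))
        v = p
    return sizes


if __name__ == "__main__":
    parent, children = grow_tree(30, 0.6, 0.25, random.Random(0))
    print(spinal_bush_sizes(parent, children))
```

The returned list is a composition of $n-1$, one part per spinal branch point, listed from the branch point nearest to leaf 1 to the one nearest to the root.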

PROPOSITION 5.20. For each $n\ge 1$, let $\big((C_c^{(n)}(t),C_f^{(n)}(t)),\,t\ge 0\big)$ be a pair of nested PCRPs defined as above, associated with a tree-valued process $(T^{(n)}(s),\,s\ge 0)$ starting from $T_n$. As $n\to\infty$, $\big(\frac{1}{n}\big(C_c^{(n)}(2nt),C_f^{(n)}(2nt)\big),\,t\ge 0\big)$ converges in distribution to a $\mathrm{cfSSIP}^{(\alpha,0,\alpha-\gamma)}(1-\alpha)$-evolution starting from $(\gamma_c,\gamma_f)$, where $\gamma_c\sim\mathrm{PDIP}^{(\gamma)}(1-\alpha,\gamma)$ and $\gamma_f\sim\mathrm{Frag}^{(\alpha)}(0,\alpha-\gamma)(\gamma_c,\,\cdot\,)$.

PROOF. We characterised the limiting initial distribution in Lemma 5.8 and deduce the convergence of the rescaled process by Theorem 5.12.
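The scaling in Proposition 5.20 can be explored numerically on the simplest functional of the tree-valued chain, the number of leaves. The sketch below is an illustration and not the paper's argument; the function name is hypothetical. It relies only on two rates read off from the description of the chain: with $k$ leaves, a new leaf arrives at total rate $k-\alpha$ (the sum of all weights) and one of the $k-1$ leaves other than leaf 1 is deleted at total rate $k-1$. The common total mass $(k-1)/n$ of the coarse and fine compositions is recorded in the rescaled time $t=s/(2n)$. From these rates the mean of the rescaled mass at time $t$ is $(n-1)/n+2(1-\alpha)t$.

```python
import random


def rescaled_leaf_mass(n, alpha, t_max, rng):
    """Simulate ((K(2nt) - 1) / n, 0 <= t <= t_max), where K counts leaves.

    Returns a list of (t, mass) pairs, one per jump, in rescaled time t.
    """
    s, k = 0.0, n                       # real time s; the chain starts with n leaves
    path = [(0.0, (k - 1) / n)]
    while True:
        up = k - alpha                  # total insertion rate of a tree with k leaves
        down = k - 1                    # every leaf except leaf 1 is deleted at rate one
        s += rng.expovariate(up + down)
        if s > 2 * n * t_max:
            return path
        k += 1 if rng.random() < up / (up + down) else -1
        path.append((s / (2 * n), (k - 1) / n))


if __name__ == "__main__":
    path = rescaled_leaf_mass(100, 0.5, 1.0, random.Random(0))
    print(len(path), "jumps; final rescaled mass", path[-1][1])
```

When $k=1$ the deletion rate vanishes, so the leaf count never drops below one and the rescaled mass stays non-negative.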

Acknowledgements. QS was partially supported by SNSF grant P2ZHP2_171955.

REFERENCES

[1] ALDOUS, D. J. (2000). Mixing time for a Markov chain on cladograms. Combin. Probab. Comput. 9 191–204. MR1774749 (2001f:05129)
[2] BECT, J. (2007). Processus de Markov diffusifs par morceaux: outils analytiques et numériques. Thèse de doctorat, Université Paris-Sud XI, 171 p., https://tel.archives-ouvertes.fr/tel-00169791.
[3] BERTOIN, J. (1996). Lévy processes. Cambridge Tracts in Mathematics 121. Cambridge University Press, Cambridge. MR1406564 (98e:60117)
[4] BERTOIN, J. (2006). Random fragmentation and coagulation processes. Cambridge Studies in Advanced Mathematics 102. Cambridge University Press, Cambridge. MR2253162
[5] BERTOIN, J. and KORTCHEMSKI, I. (2016). Self-similar scaling limits of Markov chains on the positive integers. Ann. Appl. Probab. 26 2556–2595.
[6] BILLINGSLEY, P. (1999). Convergence of probability measures, second ed. Wiley Series in Probability and Statistics: Probability and Statistics. John Wiley & Sons, Inc., New York. A Wiley-Interscience Publication. MR1700749 (2000e:60008)
[7] BLEI, D. M., GRIFFITHS, T. L. and JORDAN, M. I. (2010). The Nested Chinese Restaurant Process and Bayesian Nonparametric Inference of Topic Hierarchies. J. ACM 57.
[8] BORODIN, A. and OLSHANSKI, G. (2009). Infinite-dimensional diffusions as limits of random walks on partitions. Probab. Theory Related Fields 144 281–318. MR2480792
[9] CHEN, B., FORD, D. and WINKEL, M. (2009). A new family of Markov branching trees: the alpha-gamma model. Electron. J. Probab. 14 no. 15, 400–430 (electronic).
[10] DALEY, D. J. and VERE-JONES, D. (2003). An introduction to the theory of point processes. Vol. I, second ed. Probability and its Applications (New York). Springer-Verlag, New York. Elementary theory and methods. MR1950431
[11] DUQUESNE, T. and LE GALL, J.-F. (2002). Random trees, Lévy processes and spatial branching processes. Astérisque 281 vi+147. MR1954248
[12] DUQUESNE, T. and LE GALL, J.-F. (2005). Probabilistic and fractal aspects of Lévy trees. Probab. Theory Related Fields 131 553–603.
[13] FORMAN, N., PAL, S., RIZZOLO, D. and WINKEL, M. (2018). Aldous diffusion I: a projective system of continuum k-tree evolutions. arXiv:1809.07756 [math.PR].
[14] FORMAN, N., PAL, S., RIZZOLO, D. and WINKEL, M. (2020). Diffusions on a space of interval partitions: construction from marked Lévy processes. Electron. J. Probab. 25 46 pp.
[15] FORMAN, N., PAL, S., RIZZOLO, D. and WINKEL, M. (2020+). Diffusions on a space of interval partitions: Poisson–Dirichlet stationary distributions. To appear in Ann. Probab. Preprint available as arXiv:1910.07626 [math.PR].
[16] FORMAN, N., PAL, S., RIZZOLO, D. and WINKEL, M. (2020). Metrics on sets of interval partitions with diversity. Electron. Commun. Probab. 25 16 pp.
[17] FORMAN, N., PAL, S., RIZZOLO, D. and WINKEL, M. (2020). Interval partition diffusions: connection with Petrov's Poisson–Dirichlet diffusions. Work in progress.
[18] FORMAN, N., RIZZOLO, D., SHI, Q. and WINKEL, M. (2020). Diffusions on a space of interval partitions: The two-parameter model. arXiv:2008.02823 [math.PR].
[19] FORMAN, N., RIZZOLO, D., SHI, Q. and WINKEL, M. (2020). A two-parameter family of measure-valued diffusions with Poisson–Dirichlet stationary distributions. arXiv:2007.05250 [math.PR].
[20] GEIGER, J. and KERSTING, G. (1997). Depth-first search of random trees, and Poisson point processes. In Classical and modern branching processes (Minneapolis, MN, 1994). IMA Vol. Math. Appl. 84 111–126. Springer, New York. MR1601713
[21] GNEDIN, A. and PITMAN, J. (2005). Regenerative composition structures. Ann. Probab. 33 445–479. MR2122798
[22] GNEDIN, A. V. (1997). The representation of composition structures. Ann. Probab. 25 1437–1450. MR1457625
[23] GÖING-JAESCHKE, A. and YOR, M. (2003). A survey and some generalizations of Bessel processes. Bernoulli 9 313–349. MR1997032 (2004g:60098)
[24] HAAS, B., PITMAN, J. and WINKEL, M. (2009). Spinal partitions and invariance under re-rooting of continuum random trees. Ann. Probab. 37 1381–1411.
[25] JACOD, J. and SHIRYAEV, A. N. (2003). Limit theorems for stochastic processes, second ed. Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences] 288. Springer-Verlag, Berlin. MR1943877
[26] JAMES, L. F. (2006). Poisson calculus for spatial neutral to the right processes. Ann. Statist. 34 416–440.
[27] KALLENBERG, O. (2002). Foundations of modern probability, second ed. Probability and its Applications (New York). Springer-Verlag, New York. MR1876169 (2002m:60002)
[28] KALLENBERG, O. (2017). Random measures, theory and applications. Probability Theory and Stochastic Modelling 77. Springer, Cham. MR3642325
[29] KYPRIANOU, A. E. (2014). Fluctuations of Lévy processes with applications, second ed. Universitext. Springer, Heidelberg. Introductory lectures. MR3155252
[30] LAMBERT, A. (2010). The contour of splitting trees is a Lévy process. Ann. Probab. 38 348–395. MR2599603 (2011b:60344)
[31] LAMPERTI, J. (1972). Semi-stable Markov processes. I. Z. Wahrsch. Verw. Gebiete 22 205–225.
[32] LÖHR, W., MYTNIK, L. and WINTER, A. (2020). The Aldous chain on cladograms in the diffusion limit. Ann. Probab. 48 2565–2590.
[33] MARCHAL, P. (2008). A note on the fragmentation of a stable tree. In Fifth Coll. Math. Comp. Sci. 489–499. Assoc. Discrete Math. Theor. Comput. Sci., Nancy. MR2508809
[34] MENSHIKOV, M. and PETRITIS, D. (2014). Explosion, implosion, and moments of passage times for continuous-time Markov chains: a semimartingale approach. Stoch. Process. Appl. 124 2388–2414.
[35] PETROV, L. A. (2009). A two-parameter family of infinite-dimensional diffusions on the Kingman simplex. Funktsional. Anal. i Prilozhen. 43 45–66. MR2596654
[36] PETROV, L. A. (2013). sl(2) operators and Markov processes on branching graphs. Journal of Algebraic Combinatorics 38 663–720.
[37] PITMAN, J. (1997). Partition structures derived from Brownian motion and stable subordinators. Bernoulli 3 79–96. MR1466546
[38] PITMAN, J. (2006). Combinatorial stochastic processes. Lecture Notes in Mathematics 1875. Springer-Verlag, Berlin. Lectures from the 32nd Summer School on Probability Theory held in Saint-Flour, July 7–24, 2002. MR2245368 (2008c:60001)
[39] PITMAN, J. and WINKEL, M. (2009). Regenerative tree growth: binary self-similar continuum random trees and Poisson–Dirichlet compositions. Ann. Probab. 37 1999–2041. MR2561439
[40] PITMAN, J. and YOR, M. (1982). A decomposition of Bessel bridges. Z. Wahrsch. Verw. Gebiete 59 425–457. MR656509
[41] RÉMY, J.-L. (1985). Un procédé itératif de dénombrement d'arbres binaires et son application à leur génération aléatoire. RAIRO Inform. Théor. 19 179–195. MR803997
[42] REVUZ, D. and YOR, M. (1999). Continuous martingales and Brownian motion, third ed. Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences] 293. Springer-Verlag, Berlin. MR1725357 (2000h:60050)
[43] RIVERA-LOPEZ, K. and RIZZOLO, D. (2020). Diffusive limits of two-parameter ordered Chinese Restaurant Process up-down chains. arXiv:2011.06577 [math.PR].
[44] ROGERS, D. and WINKEL, M. (2020). A Ray–Knight representation of up-down Chinese restaurants. arXiv:2006.06334 [math.PR].
[45] ROGERS, L. C. G. and WILLIAMS, D. (1994). Diffusions, Markov processes, and martingales. Vol. 1, second ed. Wiley Series in Probability and Mathematical Statistics: Probability and Mathematical Statistics. John Wiley & Sons Ltd., Chichester. Foundations. MR1331599 (96h:60116)
[46] SALISBURY, T. S. (1986). Construction of right processes from excursions. Probab. Theory Related Fields 73 351–367. MR859838
[47] SHI, Q. and WINKEL, M. (2020). Two-sided immigration, emigration and symmetry properties of self-similar interval partition evolutions. arXiv:2011.13378.
[48] SHIGA, T. (1990). A stochastic equation based on a Poisson system for a class of measure-valued diffusion processes. Journal of Mathematics of Kyoto University 30 245–279. MR1068791
[49] SØRENSEN, F. (2020). A down-up chain with persistent labels on multifurcating trees. arXiv:2008.02761.