Interval partition evolutions related to Poisson–Dirichlet distributions

Quan Shi

Universit¨atMannheim

[email protected]

THU-PKU-BNU Webinar, 02.12.2020 Joint work with

Noah Forman (McMaster) Douglas Rizzolo(Delaware)

Matthias Winkel (Oxford) I Chinese restaurant processes I self-similar Markov processes I Bessel processes I stable processes, Local times I branching processes I random trees I population genetics models, Wright–Fisher diffusions

A family of interval partition evolutions related to: I Poisson–Dirichlet distributions A family of interval partition evolutions related to: I Poisson–Dirichlet distributions I Chinese restaurant processes I self-similar Markov processes I Bessel processes I stable processes, Local times I branching processes I random trees I population genetics models, Wright–Fisher diffusions Overview

1. Poisson–Dirichlet distributions on interval partitions

2. Interval partition evolutions

3. Self-similar interval partition evolutions as limits of Chinese restaurants 1. Poisson–Dirichlet distributions on interval partitions I zero points Z of a on (0, 1): interval components of the open set (0, 1) \Z form an interval partition β of [0, 1]

Brownian Motion and its zeros value −0.5 0.0 0.5 1.0

Brownian Motion

0.0 0.2 0.4 0.6 0.8 1.0

time

I The space I of all interval partitions is endowed with the Hausdorff metric dH (between the endpoint sets [0, L] \ β).

Interval partitions

L ≥ 0. We say β is an interval partition of the interval [0, L], if

I β = {(ai , bi ) ⊂ (0, L): i ≥ 1} a collection of disjoint open intervals (blocks) P I the total mass (sum of lengths) of β is kβk := i≥1(bi − ai ) = L. Interval partitions

L ≥ 0. We say β is an interval partition of the interval [0, L], if

I β = {(ai , bi ) ⊂ (0, L): i ≥ 1} a collection of disjoint open intervals (blocks) P I the total mass (sum of lengths) of β is kβk := i≥1(bi − ai ) = L. I zero points Z of a Brownian motion on (0, 1): interval components of the open set (0, 1) \Z form an interval partition β of [0, 1]

Brownian Motion and its zeros value −0.5 0.0 0.5 1.0

Brownian Motion

0.0 0.2 0.4 0.6 0.8 1.0

time

I The space I of all interval partitions is endowed with the Hausdorff metric dH (between the endpoint sets [0, L] \ β). Poisson–Dirichlet distribution PD(α, θ)

Poisson–Dirichlet distribution with parameters (α, θ)(PD(α, θ)): I on the Kingman simplex P ∇∞ = {(x1, x2,...): x1 ≥ x2 ≥ · · · ≥ 0, k≥1 xk = 1} I α ∈ [0, 1], θ > −α I independent Wi ∼ Beta(1 − α, θ + iα), i ≥ 1; W i = 1 − Wi I stick breaking: Pn = W 1 ··· W n−1Wn, n ≥ 1 I PD(α, θ): the law of the decreasing rearrangement of (Pn, n ≥ 1).

P1 = W1 W 1

P1 P2 = W 1W2 W 1W 2

P1 P2 P3 = W 1W 2W3 W 1W 2W 3

P1 P2 P3 P4 Chinese restaurant processes (CRP) Fix α ∈ [0, 1], θ > −α.A (unordered) Chinese restaurant process (CRP) with parameter (α, θ): I customers labeled by N = {1, 2,...} I Customer 1 sits at table 1 I Given that n customers have been seated at k different tables, with each table i having ni customers, the customer n + 1: I sits at table i of ni customer with probability (ni − α)/(n + θ) for i ∈ [k] I sits at a new table k + 1 with probability (kα + θ)/(n + θ).

I The state after n customers have arrived: a partition of n. 1 ↓ I n (n1,..., nk ) ⇒ PD(α, θ) in distribution as n → ∞. I I β ∼ PDIP(α, θ) a.s. has the α-diversity property: the following limit exists for every t ∈ [0, ∞]:

α Dα(t) := Γ(1 − α) lim h #{(a, b) ∈ β : |b − a| > h, b ≤ t}. h↓0

Poisson–Dirichlet Interval Partition PDIP(α, θ) Poisson–Dirichlet interval partition with parameters (α, θ) [GnedinPitman05] (PDIP(α, θ)): I α ∈ [0, 1], θ ≥ 0 I independent Wi ∼ Beta(1 − α, θ + iα), i ≥ 1; W i = 1 − Wi I stick breaking sequence: Pn = W 1 ··· W n−1Wn, n ≥ 1

I PDIP(α, θ): the law of a random interval partition of [0, 1] with block lengths Pn, n ≥ 1, built by proceeding inductively: I I β ∼ PDIP(α, θ) a.s. has the α-diversity property: the following limit exists for every t ∈ [0, ∞]: α Dα(t) := Γ(1 − α) lim h #{(a, b) ∈ β : |b − a| > h, b ≤ t}. h↓0

Poisson–Dirichlet Interval Partition PDIP(α, θ) Poisson–Dirichlet interval partition with parameters (α, θ) [GnedinPitman05] (PDIP(α, θ)): I α ∈ [0, 1], θ ≥ 0 I independent Wi ∼ Beta(1 − α, θ + iα), i ≥ 1; W i = 1 − Wi I stick breaking sequence: Pn = W 1 ··· W n−1Wn, n ≥ 1

I PDIP(α, θ): the law of a random interval partition of [0, 1] with block lengths Pn, n ≥ 1, built by proceeding inductively:

I I I β ∼ PDIP(α, θ) a.s. has the α-diversity property: the following limit exists for every t ∈ [0, ∞]: α Dα(t) := Γ(1 − α) lim h #{(a, b) ∈ β : |b − a| > h, b ≤ t}. h↓0

Poisson–Dirichlet Interval Partition PDIP(α, θ) Poisson–Dirichlet interval partition with parameters (α, θ) [GnedinPitman05] (PDIP(α, θ)): I α ∈ [0, 1], θ ≥ 0 I independent Wi ∼ Beta(1 − α, θ + iα), i ≥ 1; W i = 1 − Wi I stick breaking sequence: Pn = W 1 ··· W n−1Wn, n ≥ 1

I PDIP(α, θ): the law of a random interval partition of [0, 1] with block lengths Pn, n ≥ 1, built by proceeding inductively:

I I I β ∼ PDIP(α, θ) a.s. has the α-diversity property: the following limit exists for every t ∈ [0, ∞]: α Dα(t) := Γ(1 − α) lim h #{(a, b) ∈ β : |b − a| > h, b ≤ t}. h↓0

Poisson–Dirichlet Interval Partition PDIP(α, θ) Poisson–Dirichlet interval partition with parameters (α, θ) [GnedinPitman05] (PDIP(α, θ)): I α ∈ [0, 1], θ ≥ 0 I independent Wi ∼ Beta(1 − α, θ + iα), i ≥ 1; W i = 1 − Wi I stick breaking sequence: Pn = W 1 ··· W n−1Wn, n ≥ 1

I PDIP(α, θ): the law of a random interval partition of [0, 1] with block lengths Pn, n ≥ 1, built by proceeding inductively:

I I I β ∼ PDIP(α, θ) a.s. has the α-diversity property: the following limit exists for every t ∈ [0, ∞]: α Dα(t) := Γ(1 − α) lim h #{(a, b) ∈ β : |b − a| > h, b ≤ t}. h↓0

Poisson–Dirichlet Interval Partition PDIP(α, θ) Poisson–Dirichlet interval partition with parameters (α, θ) [GnedinPitman05] (PDIP(α, θ)): I α ∈ [0, 1], θ ≥ 0 I independent Wi ∼ Beta(1 − α, θ + iα), i ≥ 1; W i = 1 − Wi I stick breaking sequence: Pn = W 1 ··· W n−1Wn, n ≥ 1

I PDIP(α, θ): the law of a random interval partition of [0, 1] with block lengths Pn, n ≥ 1, built by proceeding inductively:

I Poisson–Dirichlet Interval Partition PDIP(α, θ) Poisson–Dirichlet interval partition with parameters (α, θ) [GnedinPitman05] (PDIP(α, θ)): I α ∈ [0, 1], θ ≥ 0 I independent Wi ∼ Beta(1 − α, θ + iα), i ≥ 1; W i = 1 − Wi I stick breaking sequence: Pn = W 1 ··· W n−1Wn, n ≥ 1

I PDIP(α, θ): the law of a random interval partition of [0, 1] with block lengths Pn, n ≥ 1, built by proceeding inductively:

I I β ∼ PDIP(α, θ) a.s. has the α-diversity property: the following limit exists for every t ∈ [0, ∞]:

α Dα(t) := Γ(1 − α) lim h #{(a, b) ∈ β : |b − a| > h, b ≤ t}. h↓0 An (ordered) CRP with parameter (α, θ)

I The same seating rule as a (unordered) CRP. I The tables are ordered in a line. When the customer starts a new table: I The new table is placed to the left of the leftmost table, at probability θ/(kα + θ); I Between each pair of two neighbouring occupied tables, or to the right of the rightmost table, we place the new table with at probability α/(kα + θ). An (ordered) CRP with parameter (α, θ) (cont.)

I After n customers arrived, list the numbers of customers of occupied tables, from left to right, by a vector of positive integers (composition): C(n) = (n1, n2, n3,... nk ). the process (C(n), n ≥ 0): an ordered CRP with parameter (α, θ). I We regard each C(n) as an interval partition of total mass n ∈ N: I (n1, n2,..., nk ) (a total-ordered family of positive real numbers of sum n) Pk−1 I {(0, n1), (n1, n1 + n2),..., ( i=1 ni , n)} (a family of disjoint open intervals with length sum up to n) Asymptotics of an ordered CRP

Scaling the interval lengths of an IP by a > 0:

a {n1, n2,... nk } := {an1, an2,... ank }.

Theorem ([PitmanWinkel 2009]) In the space of interval partitions, we have the convergence: 1 C(n)−→PDIP(α, θ) in distribution. n Examples of PDIPs

PDIP(1/2, 1/2): zero points of a PDIP(1/2, 0): zero points of a Brownian bridge on [0, 1] from Brownian motion on [0, 1]; the zero to zero. left-right reversal.

left-right revesed Brownian Motion and its zeros Brownian bridge and its zeros 0.5 value value 0.0 0.5 1.0

Brownian bridge Brownian Motion − 1.0 0.5 0.0 0.0 0.2 0.4 0.6 0.8 1.0 −1.0−0.8−0.6−0.4−0.2 0.0 time I I time PDIPs in continuum random trees I ρ ∈ (1, 2], ρ-stable continuum random tree [Duquesne LeGall 2002]: α = 1 − 1/ρ

masses of spinal bushes (Mi )i≥1 β = {Ui , i ≥ 1} ∼ PDIP(α, α), ⇐⇒law distances to the root (`i )i≥1 α-diversity (Dα(inf Ui ), i ≥ 1). 0 1 Mi

Mi

root

`i a random leaf

Figure: (coarse) spinal decomposition of a 1.5-stable tree @ Kortchemski

I PDIP(α, 1 − α): Ford’s tree growth model [Ford 2010] I PDIP(α, θ): binary self-similar trees [Pitman and Winkel 2009] 2. Interval partition evolutions Question: construct an interval-partition-valued diffusion with stationary distribution PDIP(α, θ).

Motivation: to study continuum-tree-valued diffusions. Does there exist a continuum-tree-valued diffusion stationary with the law of α-stable tree? (in the Brownian case: Aldous’ conjectured diffusion; see [FPRW 2018] and [L¨ohr,Mytnik and Winter 2020+]) Construction in two steps: 1. Construct an I-valued self-similar interval partition evolution with parameters (α, θ)(SSIP(α, θ))(β(y), y ≥ 0). 2. Renormalize by its total mass and use Lamperti type time-change  R y u 7→ τ(u) = inf y ≥ 0: 0 kβ(z)kdz > u :

β¯(u) := kβ(τ(u))k−1 β(τ(u)), u ≥ 0.

I I: the space of interval partitions I I1: the space of interval partitions of [0, 1] Theorem (FRSW 2020+) Let α ∈ (0, 1) and θ ≥ 0. There exists a path-continuous strong Markov process (β¯(u), u ≥ 0) on the space I1 with stationary distribution PDIP(α, θ). This generalizes the cases θ = 0 and θ = α studied in [FPRW2019]. I I: the space of interval partitions I I1: the space of interval partitions of [0, 1] Theorem (FRSW 2020+) Let α ∈ (0, 1) and θ ≥ 0. There exists a path-continuous strong Markov process (β¯(u), u ≥ 0) on the space I1 with stationary distribution PDIP(α, θ). This generalizes the cases θ = 0 and θ = α studied in [FPRW2019]. Construction in two steps: 1. Construct an I-valued self-similar interval partition evolution with parameters (α, θ)(SSIP(α, θ))(β(y), y ≥ 0). 2. Renormalize by its total mass and use Lamperti type time-change  R y u 7→ τ(u) = inf y ≥ 0: 0 kβ(z)kdz > u :

β¯(u) := kβ(τ(u))k−1 β(τ(u)), u ≥ 0. Construction of an SSIP evolution

A skewer of a spindle-marked spectrally positive L´evyprocess Animation by Forman et. al. (scaffolding) [FPRW 2019] An (α, 0)-SSIP evolution starting from {(0, b)} I A squared with dimension −2α (BESQ(−2α)) starting from b > 0, i.e. the unique strong solution Z of the SDE driven by a Brownian motion B p dZt = δdt + 2 |Zt |dBt . It is killed at first hitting time of zero, denoted by ζ. I Scaffolding: a spectrally positive stable (1 + α) L´evyprocess X starting from X0 = ζ, killed at zero, with Laplace exponent q1+α [e−qXt ] = exp(ψ(q)t), ψ(q) = , q ≥ 0. E 2αΓ(1 + α) I Each jump of size x := ∆Xt > 0 is decorated with a BESQ(−2α) excursion of length x [PitmanYor 1982]; i.e. a BESQ(4 + 2α) bridge over [0, x] from zero to zero.

ζ

BESQb (−2α)

 b - t Figure: Simulation by [PitmanWinkel2018] An (α, 0)-SSIP evolution starting from a general γ ∈ I

I an family of independent( α, 0)-SSIP evolutions (βU (y), y ≥ 0)U∈γ : for each U = (a, b) ∈ γ,(βU (y), y ≥ 0) starts from {(0, b − a)}. I the concatenation

β(y) := ? βU (y) := {(SU (y)+a, SU (y)+b): U ∈ γ, (a, b) ∈ βU (y)}, U∈γ P where SU (y) := U0=(u0,v 0)∈γ : u0 0

An (α, θ)-SSIP evolution starting from γ ∈ I. I An (α, 0)-SSIP evolution (β(y), y ≥ 0) starting from γ ∈ I. I An independent immigration. Excursions of reflected marked L´evyprocesses

I Stable (1 + α) process X reflected at infimum: Yt := Xt − infs≤t Xs I A of c`adl`agexcursions   X E := δ y, Yt , `−y ,r −y  y≥0 : `−y

Y X

−y `−y r −y An (α, θ)-SSIP evolution An (α, θ)-SSIP evolution( β(y), y ≥ 0) starting from γ ∈ I. I An (α, 0)-SSIP evolution (β0(y), y ≥ 0) starting from γ ∈ I. ∗ P ∗ θ I Eα,θ = i≥1 δ(yi , ei ): Poisson point process with characteristic α ν¯.

∗ the immigrants ei arrives at time (level) yi , from the left.

∗ ∗ I Each atom (yi , ei ) of Eα,θ gives rise to an interval partition valued processes (βi (y), y ≥ 0) via the skewer operation. I Concatenation from left to right in decreasing order of (yi , i ≥ 0) with y0 = 0: β(y) := ? βi (y − yi ), y ≥ 0. yi ↓

e∗ ⇒ β i i y yi

yi Summary

Theorem (FRSW 2020+) For α ∈ (0, 1) and θ ≥ 0, let (β(y), y ≥ 0) be (α, θ)-SSIP (self-similar interval partition) evolution starting from γ ∈ I.

I it is a path-continuous strong Markov process on (I, dH ); I For any y > 0, β(y) has the α-diversity property: for every interval t ≥ 0, the following limit exists

α Dα(t) := Γ(1 − α) lim h #{(a, b) ∈ β : |b − a| > h, b ≤ t}. h↓0

I the total mass (k(β(y)k, y ≥ 0) evolves according to a BESQ(2θ); I (self-similar with index 1) for c > 0, the space-time rescaled process (c β(y/c), y ≥ 0) is also an (α, θ)-SSIP evolution;

I (pseudo-stationary distribution) Let (Zt , t ≥ 0) ∼ BESQ(2θ) and γ ∼ PDIP(α, θ) be independent. If β(0) ∼ Z0 γ, then, for any y ≥ 0, β(y) ∼ Zy γ. De-Poissonized process

I Introduce a Lamperti-type time-change  Z y  τ(u) := inf y ≥ 0: kβ(y)kdz > u , u ≥ 0. 0 The de-Poissonized process

β¯(u) := kβ(τ(u))k−1 β(τ(u)), u ≥ 0

is a continuous Markov process on the space of unit interval partitions, with stationary distribution PDIP(α, θ). I Write W (u) for the ranked length of β¯(u). Then [FPRW in preparation] proves that the process (W (u/2), u ≥ 0) is a diffusion on the Kingman simplex introduced by [Ethier and Kurtz 1981] (infinitely-many-neutral-alleles model α=0) and [Petrov 2009]. 3. Self-similar interval partition evolutions as limits of Chinese restaurants I Leaving (down-step): Each customer leaves at rate1.

Ordered Chinese restaurant processes with leaving

I The tables are ordered in a line. I Fix α ∈ (0, 1) and θ ≥ 0. We construct a continuous-time . I Arriving (Up-step): I for each occupied table, say there are m ∈ N customers, a new customer comes to join this table at rate m − α; I at rate θ, a new customer enters to start a new table to the left of the leftmost table; I between each pair of two neighbouring occupied tables, or to the right of the rightmost table, a new customer enters and begins a new table there at rate α. Ordered Chinese restaurant processes with leaving

I The tables are ordered in a line. I Fix α ∈ (0, 1) and θ ≥ 0. We construct a continuous-time Markov chain. I Arriving (Up-step): I for each occupied table, say there are m ∈ N customers, a new customer comes to join this table at rate m − α; I at rate θ, a new customer enters to start a new table to the left of the leftmost table; I between each pair of two neighbouring occupied tables, or to the right of the rightmost table, a new customer enters and begins a new table there at rate α.

I Leaving (down-step): Each customer leaves at rate1. At each time t ≥ 0, list the numbers of customers of occupied tables, from left to right, by a vector of positive integers (composition):

C(t) = (n1, n2, n3,..., nk ).

The process (C(t), t ≥ 0) is a Poissonized Chinese restaurant process (PCRP) with parameters (α, θ). Theorem (S. and Winkel, in preparation) Let α ∈ (0, 1) and θ ≥ 0. Let C (n) be a sequence of PCRP(α, θ) started from C (n)(0). Suppose that

1 (n) C (0) −→ γ in (I, dH ). n n→∞ Then under the uniform topology 1 ( C (n)(2ny), y ≥ 0) −→ (β(y), y ≥ 0) in distribution. n n→∞ The limit process is an (α, θ)-SSIP evolution. The key step of the proof I In a PCRP(α, θ), a new table has one customer initially, and this table is removed when all customers of this table have left. I the number of customers at a single table evolves according to a birth-death process X , started from 1 and absorbed at zero, with birth rate Pi→i+1 = i − α and death rate Pi→i−1 = i. Lemma 1 (n) Denote the law of the process ( n X (2ny), y ≥ 0) by π . The following convergence holds vaguely:

2αn1+α · π(n) −→ µ. n→∞

The limit µ is a sigma-finite measure of the space of positive continuous excursions. (BESQ(−2α) excursion measure) I Define the lifetime of a positive excursion f by ζ(f ) := sup{s ≥ 0: f (s) > 0}. Then µ(ζ(f ) ∈ ·) coincides with the L´evymeasure of scaffolding stable (1 + α) precess. I Under µ(· | ζ(f ) = x), the excursion is a BESQ(−2α) excursion of length x. Theorem (S. and Winkel in preparation) For α ∈ (0, 1) and θ ∈ (−α, 0), let C (n) be a sequence of such modified (n) 1 (n) PCRP started from C (0). If n C (0) converges, then under the uniform topology 1 ( C (n)(2ny), y ≥ 0) −→ (β(y), y ≥ 0) in distribution, n n→∞ The limiting process is a path-continuous Markov process with pseudo-stationary distribution “BESQ(2θ) PDIP(α, θ)”.

Interval partition evolution with emigration I PD(α, θ), unordered Chinese restaurant process θ > −α I PDIP(α, θ), ordered Chinese restaurant process θ ≥ 0 Question: the case when θ ∈ (−α, 0)? Theorem (S. and Winkel in preparation) For α ∈ (0, 1) and θ ∈ (−α, 0), let C (n) be a sequence of such modified (n) 1 (n) PCRP started from C (0). If n C (0) converges, then under the uniform topology 1 ( C (n)(2ny), y ≥ 0) −→ (β(y), y ≥ 0) in distribution, n n→∞ The limiting process is a path-continuous Markov process with pseudo-stationary distribution “BESQ(2θ) PDIP(α, θ)”.

Interval partition evolution with emigration I PD(α, θ), unordered Chinese restaurant process θ > −α I PDIP(α, θ), ordered Chinese restaurant process θ ≥ 0 Question: the case when θ ∈ (−α, 0)? Interval partition evolution with emigration I PD(α, θ), unordered Chinese restaurant process θ > −α I PDIP(α, θ), ordered Chinese restaurant process θ ≥ 0 Question: the case when θ ∈ (−α, 0)?

Theorem (S. and Winkel in preparation) For α ∈ (0, 1) and θ ∈ (−α, 0), let C (n) be a sequence of such modified (n) 1 (n) PCRP started from C (0). If n C (0) converges, then under the uniform topology 1 ( C (n)(2ny), y ≥ 0) −→ (β(y), y ≥ 0) in distribution, n n→∞ The limiting process is a path-continuous Markov process with pseudo-stationary distribution “BESQ(2θ) PDIP(α, θ)”. N. Forman, D. Rizzolo, Q. Shi and M. Winkel. Diffusions on a space of interval partitions: the two-parameter model. arXiv:2008.02823. Q. Shi and M. Winkel. Two-sided immigration, emigration and symmetry properties of self-similar interval partition evolutions. arXiv:2011.13378. Q. Shi and M. Winkel. Up-down ordered Chinese restaurant processes with two-sided immigration, diffusion limits and emigration. in progress.