
Scaling limits of a model for selection at two scales

2017 Nonlinearity 30 1682 (http://iopscience.iop.org/0951-7715/30/4/1682)



Nonlinearity 30 (2017) 1682–1707 https://doi.org/10.1088/1361-6544/aa5499

© 2017 IOP Publishing Ltd & London Mathematical Society

Shishi Luo1,3 and Jonathan C Mattingly2

1 Computer Science Division, University of California - Berkeley, Berkeley, CA 94720, United States of America
2 Department of Mathematics and Department of Statistical Science, Duke University, Durham, NC 27708, United States of America

E-mail: [email protected]

Received 3 July 2015, revised 24 August 2016
Accepted for publication 19 December 2016
Published 15 March 2017

Recommended by Professor Leonid Bunimovich

Abstract
The dynamics of a population undergoing selection is a central topic in evolutionary biology. This question is particularly intriguing in the case where selective forces act in opposing directions at two population scales. For example, a fast-replicating virus strain outcompetes slower-replicating strains at the within-host scale. However, if the fast-replicating strain causes host morbidity and is less frequently transmitted, it can be outcompeted by slower-replicating strains at the between-host scale. Here we consider a stochastic ball-and-urn process which models this type of phenomenon. We prove the weak convergence of this process under two natural scalings. The first scaling leads to a deterministic nonlinear integro-partial differential equation on the interval [0, 1] with dependence on a single parameter, λ. We show that the fixed points of this differential equation are Beta distributions and that their stability depends on λ and the behavior of the initial data around 1. The second scaling leads to a measure-valued Fleming–Viot process, an infinite dimensional stochastic process that is frequently associated with models of population genetics.

Keywords: Markov chains, limiting behavior, evolutionary dynamics, Fleming–Viot process, scaling limits
Mathematics Subject Classification numbers: 35F55, 35Q92, 37L40, 60J28, 60J68, 60J70
(Some figures may appear in colour only in the online journal)

3 Author to whom any correspondence should be addressed.


1. Introduction

We study the model, introduced in [15], of a trait that is advantageous at a local or individual level but disadvantageous at a larger scale or group level. For example, an infectious virus strain that replicates rapidly within its host will outcompete other virus strains in the host. However, if infection with a heavy viral load is incapacitating and prevents the host from transmitting the virus, the rapidly replicating strain may not be as prevalent in the overall host population as a slow replicating strain.

A simple mathematical formulation of this phenomenon is as follows. Consider a population of m ∈ N groups. Each group contains n ∈ N individuals. There are two types of individuals: type I individuals are selectively advantageous at the individual (I) level and type G individuals are selectively advantageous at the group (G) level. Replication and selection occur concurrently at the individual and group level according to the Moran process [8] and are illustrated in figure 1. Type I individuals replicate at rate 1 + s, s ⩾ 0, and type G individuals at rate 1. When an individual gives birth, another individual in the same group is selected uniformly at random to die. To reflect the antagonism at the higher level of selection, groups replicate at a rate which increases with the number of type G individuals they contain. As a simple case, we take this rate to be w(1 + r k/n), where k/n is the fraction of individuals in the group that are type G, r ⩾ 0 is the selection coefficient at the group level, and w > 0 is the ratio of the rate of group-level events to the rate of individual-level events. As with the individual level, the population of groups is maintained at m by selecting a group uniformly at random to die whenever a group replicates. The offspring of groups are assumed to be identical to their parent.
As illustrated in figure 1, this two-level process is equivalent to a ball-and-urn or particle process, where each particle represents a group and its position corresponds to the number of type G individuals that are in it. We note that similar, though more general, particle models of evolutionary and ecological dynamics at multiple scales have been studied and we mention several particularly relevant works here. Dawson and Hochberg [6] also consider a population at multiple levels, albeit with one type per level, not two. In [4], Dawson and Greven consider a more general model, allowing for infinitely many hierarchical levels and for migration, selection, and mutation. Méléard and Roelly [16], in a study also inspired by host-pathogen interactions, investigate a model that allows for non-constant host and pathogen populations as well as mutation. In these more general contexts, determining long-term behavior is less straightforward than in our specific setting.

We now define the stochastic process that is the focus of this work. Let $X_t^i$ be the number of type G individuals in group i at time t. Then

\[
\mu_t^{m,n} := \frac{1}{m}\sum_{i=1}^{m} \delta_{X_t^i/n}
\]

is the empirical measure at time t for a given number of groups m and individuals per group n. Here $\delta_x(y) = 1$ if $x = y$ and zero otherwise. The $X_t^i$ are divided by n so that $\mu_t^{m,n}$ is a probability measure on $E_n := \{0, \tfrac{1}{n}, \ldots, 1\}$.

For fixed T > 0, $\mu_t^{m,n} \in D([0, T], \mathcal{P}(E_n))$, the set of càdlàg processes on [0, T] taking values in $\mathcal{P}(E_n)$, where $\mathcal{P}(S)$ is the set of probability measures on a set S. With the particle process described above, $\mu_t^{m,n}$ has generator

\[
L^{m,n}\psi(v) = \sum_{i,j} (R_1 + wR_2)(v, ij)\,\big[\psi(v_{ij}) - \psi(v)\big] \tag{1}
\]

1683 Nonlinearity 30 (2017) 1682 S Luo and J C Mattingly

Figure 1. Schematic of the particle process. (a) Left: a population of m = 3 groups, each with n = 3 individuals of either type G (filled small circles) or type I (open small circles). Middle: a type I individual replicates in group 3 and a type G individual is chosen uniformly at random from group 3 to die. Right: group 1 replicates and produces group 2′. Group 2 is chosen uniformly at random to die. (b) The states in (a) mapped to a particle process. Left: group 2 has no type G individuals, represented by ball 2 in urn 0. Similarly, group 3 is represented by ball 3 in urn 2 and group 1 by ball 1 in urn 3. Middle: the number of type G individuals in group 3 decreases from two to one, therefore ball 3 moves to urn 1. Right: a group with zero type G individuals dies, while a group with three type G individuals is born. Therefore ball 2 leaves urn 0 and appears in urn 3 as ball 2′.

where $v_{ij} := v + \frac{1}{m}(\delta_{j/n} - \delta_{i/n})$, $\psi \in C_b(\mathcal{P}([0, 1]))$ is a bounded continuous function, and $v \in \mathcal{P}(E_n) \subset \mathcal{P}([0, 1])$. The transition rates $(R_1 + wR_2)$ are given by

\[
R_1(v, ij) =
\begin{cases}
m\, v\big(\tfrac{i}{n}\big)\, \tfrac{i}{n}\big(1 - \tfrac{i}{n}\big)(1+s) & \text{if } j = i-1,\ i > 0,\\[4pt]
m\, v\big(\tfrac{i}{n}\big)\, \tfrac{i}{n}\big(1 - \tfrac{i}{n}\big) & \text{if } j = i+1,\ i < n,\\[4pt]
0 & \text{otherwise}
\end{cases}
\]

and

\[
R_2(v, ij) = m\, v\big(\tfrac{i}{n}\big)\, v\big(\tfrac{j}{n}\big)\Big(1 + r\,\tfrac{j}{n}\Big).
\]

R1 represents individual-level events while R2 represents group-level events.
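The rates above translate directly into a simulation. The following Gillespie-style sketch of the ball-and-urn process is our own illustration, not from the paper: the function name, the parameter defaults, and the uniform initial condition are arbitrary choices, and the rates are the per-group forms of R1 and wR2.

```python
import random

def simulate(m=50, n=50, s=0.1, r=0.5, w=1.0, t_max=5.0, seed=1):
    """Gillespie simulation of the two-level ball-and-urn process.

    State: X[i] = number of type-G individuals in group i.
    Individual level: in group i the G-count drops at rate
    (n - X[i]) * (1+s) * X[i] / n  (a type-I replicates, a type-G dies)
    and rises at rate X[i] * (n - X[i]) / n  (a type-G replicates, a type-I dies).
    Group level: group i replicates at rate w * (1 + r * X[i] / n) and its
    offspring replaces a uniformly chosen group.
    """
    rng = random.Random(seed)
    X = [rng.randrange(n + 1) for _ in range(m)]  # arbitrary initial condition
    t = 0.0
    while True:
        down = [(n - x) * (1 + s) * x / n for x in X]   # G-count decreases
        up = [x * (n - x) / n for x in X]               # G-count increases
        rep = [w * (1 + r * x / n) for x in X]          # group replication
        total = sum(down) + sum(up) + sum(rep)
        t += rng.expovariate(total)                     # exponential waiting time
        if t > t_max:
            return X
        u = rng.uniform(0, total)                       # pick one event
        for i in range(m):
            if u < down[i]:
                X[i] -= 1
                break
            u -= down[i]
            if u < up[i]:
                X[i] += 1
                break
            u -= up[i]
            if u < rep[i]:
                X[rng.randrange(m)] = X[i]  # offspring replaces a random group
                break
            u -= rep[i]
```

Running `simulate()` returns the vector of G-counts per group at time `t_max`; the empirical measure of the paper is the normalized histogram of `X[i] / n`.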

2. Main results

We prove the weak convergence of this measure-valued process as m, n → ∞ under two natural scalings. The first scaling leads to a deterministic partial differential equation. We derive a closed-form expression for the solution of this equation and study its steady-state behavior. The second scaling leads to an infinite dimensional stochastic process, namely a Fleming–Viot process.

Let us briefly introduce some notation. By m, n → ∞ we mean a sequence $\{(m_k, n_k)\}_k$ such that for any N there is an $n_0$ such that if $k \geqslant n_0$, then $m_k, n_k \geqslant N$. We define $\langle f, v\rangle = \int_0^1 f(x)\, v(dx)$

1684 Nonlinearity 30 (2017) 1682 S Luo and J C Mattingly

where f is a test function and v a measure. Lastly, $\delta_x$ will denote the delta measure for both continuous and discrete state spaces.

To provide intuition for the two scalings and the corresponding limits, take ψ to be of the form $\psi(v) = F(\langle g, v\rangle)$, where g is some suitable function on [0, 1], and apply the generator in (1) to it:

\[
\begin{aligned}
L^{m,n}\psi(\mu) ={}& F'(\langle g,\mu\rangle)\cdot\Bigg\{ \sum_i \Big[\tfrac{1}{n}\, g''\big(\tfrac{i}{n}\big) - s\, g'\big(\tfrac{i}{n}\big)\Big]\,\tfrac{i}{n}\Big(1-\tfrac{i}{n}\Big)\,\mu\big(\tfrac{i}{n}\big) \\
&\quad + wr\Big[\sum_i g\big(\tfrac{i}{n}\big)\tfrac{i}{n}\,\mu\big(\tfrac{i}{n}\big) - \sum_i g\big(\tfrac{i}{n}\big)\mu\big(\tfrac{i}{n}\big)\sum_j \tfrac{j}{n}\,\mu\big(\tfrac{j}{n}\big)\Big]\Bigg\} \\
&+ \frac{w}{m}\,F''(\langle g,\mu\rangle)\cdot\Bigg\{ \sum_i g\big(\tfrac{i}{n}\big)^2\mu\big(\tfrac{i}{n}\big) - \Big(\sum_i g\big(\tfrac{i}{n}\big)\mu\big(\tfrac{i}{n}\big)\Big)^2 \\
&\quad + \frac{r}{2}\sum_{i,j}\Big(g\big(\tfrac{i}{n}\big)-g\big(\tfrac{j}{n}\big)\Big)^2 \tfrac{j}{n}\,\mu\big(\tfrac{i}{n}\big)\mu\big(\tfrac{j}{n}\big)\Bigg\} + o\Big(\tfrac{1}{m}\Big) + o\Big(\tfrac{1}{n}\Big) \tag{2}
\end{aligned}
\]

This suggests two natural scalings. The first is to take m, n → ∞ without rescaling any parameters. The g″ and F″ terms vanish and we have a deterministic process. The second is to let s = σ/n, r = ρ/m, and n/m → θ. The terms F″ and g″ no longer vanish and the process converges to a limit that is stochastic. The precise statement of the weak convergence of the finite state space system to the deterministic limit is in terms of a weak measure-valued solution to a partial differential equation:

Theorem 1. Suppose the particles in the system described by $\mu_t^{m,n}$ are initially independently and identically distributed according to the measure $\mu_0^{m,n}$, where $\mu_0^{m,n} \to \mu_0 \in \mathcal{P}([0, 1])$ as m, n → ∞. Then, as m, n → ∞, $\mu_t^{m,n} \to \mu_t \in D([0, T], \mathcal{P}([0, 1]))$ weakly, where $\mu_t$ solves the differential equation

\[
\frac{d}{dt}\langle f, \mu_t\rangle = -\langle x(1-x)f', \mu_t\rangle + \lambda\big[\langle xf, \mu_t\rangle - \langle x, \mu_t\rangle\langle f, \mu_t\rangle\big] \tag{3}
\]

for any positive-valued test function $f \in C^1([0, 1])$ and with initial condition $\langle f, \mu_0\rangle$. Here, $\lambda := wr/s$ and time has been sped up by a factor of s.

Throughout we will denote the measure-valued solutions to (3) by $\mu_t(dx)$. We note that strong, density-valued solutions, denoted by $\eta_t(x)$, solve:

\[
\frac{\partial}{\partial t}\eta_t = \frac{\partial}{\partial x}\big[x(1-x)\,\eta_t\big] + \lambda\,\eta_t\cdot\Big(x - \int_0^1 y\,\eta_t(y)\,dy\Big) \tag{4}
\]

with initial density $\eta_0(x)$. In this more transparent form one can see that the first term on the right is a flux term that transports density towards x = 0, whereas the second term is a forcing term that increases the density at values of x above the mean of the density. The flux corresponds to the individual-level moves: nearest neighbor moves in the particle system. The forcing term corresponds to group-level moves: moves to occupied sites in the particle system. We will see that if we start with an initial measure $\mu_0$ which is the sum of delta measures, then the solution $\mu_t$ retains the same form. More explicitly, if

1685 Nonlinearity 30 (2017) 1682 S Luo and J C Mattingly

\[
\mu_0(dx) = \sum_i a_i(0)\,\delta_{x_i(0)}(dx)
\]

where $x_i(0) \in [0, 1]$, $a_i(0) > 0$, and $\sum_i a_i(0) = 1$, then we will see (from lemma 5) that the solution $\mu_t$ to (3) has the form

\[
\mu_t(dx) = \sum_i a_i(t)\,\delta_{x_i(t)}(dx).
\]

Moreover, the parameters $(a_i(t), x_i(t))$ satisfy the following set of coupled equations

\[
\begin{cases}
\dfrac{dx_i}{dt} = -x_i(1 - x_i)\\[8pt]
\dfrac{da_i}{dt} = \lambda\, a_i\big(x_i - \langle y, \mu_t\rangle\big) = \lambda\, a_i\Big(x_i - \displaystyle\sum_j a_j x_j\Big)
\end{cases} \tag{5}
\]

Notice that the positions of the delta masses change according to a negative logistic function, independently of the other masses and the density. The weight $a_i$ increases at time t if the position of the particle $x_i$ is above the mean, $\sum_j a_j x_j$, and decreases if it is below the mean. To build intuition, it is instructive to consider some simple examples of this form.
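As a quick numerical illustration (our own sketch, not from the paper), the system (5) can be integrated with a forward Euler step. The renormalization of the weights simply counters discretization drift, since the exact flow preserves $\sum_i a_i = 1$:

```python
def evolve_deltas(a, x, lam, dt=1e-3, t_max=10.0):
    """Forward Euler for the coupled system (5):
       dx_i/dt = -x_i (1 - x_i),
       da_i/dt = lam * a_i * (x_i - sum_j a_j x_j)."""
    a, x = list(a), list(x)
    for _ in range(int(t_max / dt)):
        mean = sum(ai * xi for ai, xi in zip(a, x))          # <y, mu_t>
        a_new = [ai * (1.0 + dt * lam * (xi - mean)) for ai, xi in zip(a, x)]
        x = [xi - dt * xi * (1.0 - xi) for xi in x]          # negative logistic
        total = sum(a_new)                                   # counter Euler drift
        a = [ai / total for ai in a_new]
    return a, x
```

For instance, starting from equal weights at positions 0.3 and 1.0, the mass at 1 is stationary and sits above the mean, so it absorbs essentially all of the weight; the other position drains towards 0. This is the mechanism behind example 1.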

Example 1. According to (5), if $\mu_0 = \delta_1$, then $\mu_t = \mu_0$. This can also be seen directly from (3). In the case of an initial condition containing some delta mass at 1, all of the rest of the mass will migrate towards zero. Eventually all of the mass will be below the mean, as the mass at one will not move and will keep increasing its weight because it is always above the mean. Once this happens it is clear that all of the mass will drain from all of the points not at one, and hence $\mu_t \to \delta_1$ as t → ∞. This reasoning holds in a more general setting and is included in theorem 3.

Example 2. According to (5), if $\mu_0 = \delta_0$, then $\mu_t = \mu_0$. This too can be seen directly from (3). In the case of an initial condition containing no mass at one and only a finite number of masses in total, the mass will eventually all move towards zero and hence $\mu_t \to \delta_0$ as t → ∞. If an infinite number of masses are allowed, the situation is not as simple. Theorem 3 hints at the possible complications by giving an example of a density which is invariant.

These simple examples correspond to similarly straightforward biological scenarios: once the population is entirely composed of type I individuals or of type G individuals, it stays in that state.

Though $\delta_0$ is a fixed point of the system attracting many initial configurations, it is not Lyapunov stable. This means that even small perturbations of $\delta_0$ can lead to an arbitrarily large excursion away from $\delta_0$, even though the system eventually returns to $\delta_0$. Rather than making a precise statement, which would require quantifying the size of a perturbation, consider the example of $\mu_0 = (1 - \varepsilon)\delta_0 + \varepsilon\delta_{1-\alpha}$. As ε → 0, the distance between $\mu_0$ and $\delta_0$ goes to zero in any reasonable metric. If we write $\mu_t = (1 - a_t)\delta_0 + a_t\delta_{x_t}$, then as α → 0 one can ensure that the system spends an arbitrarily long time with $x_t > 1/2$, and hence $a_t$ will grow as close to one as one wants in this time. Thus the system could be described as making an arbitrarily big excursion away from $\delta_0$ even though $\mu_t \to \delta_0$ as t → ∞.

It is natural to ask if there are other fixed points beyond $\delta_0$ and $\delta_1$.

Lemma 2 (Fixed points). The measures $\delta_0$, $\delta_1$, and the densities in the Beta(λ − α, α) family of distributions:

\[
\frac{1}{B(\lambda-\alpha, \alpha)}\, x^{\lambda-\alpha-1}(1-x)^{\alpha-1}
\]

1686 Nonlinearity 30 (2017) 1682 S Luo and J C Mattingly

with α ∈ (0, λ), are fixed points of (3). B(λ − α, α) is the normalizing constant that makes the density integrate to 1 over the interval [0, 1].

For measure-valued initial data, we show that the basins of attraction of the fixed points are determined by whether they charge the point x = 1 and by their Hölder exponent around x = 1.
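Lemma 2 can be sanity-checked numerically: plugging the Beta(λ − α, α) density into the right-hand side of the weak equation (3) should give zero for any test function f. The quadrature check below is our own illustration (the function names and the choices λ = 3, α = 3/2 are arbitrary):

```python
import math

def beta_pdf(x, a, b):
    """Density of Beta(a, b), normalized via the gamma function."""
    B = math.gamma(a) * math.gamma(b) / math.gamma(a + b)
    return x ** (a - 1) * (1 - x) ** (b - 1) / B

def simpson(g, n=2000):
    """Composite Simpson's rule on [0, 1]."""
    h = 1.0 / n
    s = g(0.0) + g(1.0)
    for k in range(1, n):
        s += (4 if k % 2 else 2) * g(k * h)
    return s * h / 3.0

def rhs(f, df, lam, alpha):
    """Right-hand side of (3), evaluated at the Beta(lam-alpha, alpha) density:
       -<x(1-x)f', v> + lam * (<x f, v> - <x, v><f, v>)."""
    v = lambda x: beta_pdf(x, lam - alpha, alpha)
    flux = simpson(lambda x: x * (1 - x) * df(x) * v(x))
    xf = simpson(lambda x: x * f(x) * v(x))
    mx = simpson(lambda x: x * v(x))
    mf = simpson(lambda x: f(x) * v(x))
    return -flux + lam * (xf - mx * mf)
```

For f(x) = x² and (λ, α) = (3, 3/2) the two terms are each 0.1875 and cancel exactly, so the returned value is zero up to quadrature error.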

Theorem 3 (Steady state behavior). Consider the measure-valued solution $\mu_t(dx)$ to (3) with initial probability measure $\mu_0(dx)$. If $\mu_0(\{1\}) > 0$ then

\[
\mu_t \to \delta_1 \quad \text{as } t \to \infty
\]

and if $\mu_0([1 - \varepsilon, 1]) = 0$ for some ε > 0 then

\[
\mu_t \to \delta_0 \quad \text{as } t \to \infty.
\]

Alternatively, suppose that for some α > 0 and C > 0,

\[
x^{-\alpha}\,\mu_0([1 - x, 1]) \to C \quad \text{as } x \to 0.
\]

If α < λ, then

\[
\mu_t(dx) \to \mathrm{Beta}(\lambda - \alpha, \alpha) \quad \text{as } t \to \infty.
\]

Otherwise, if α ⩾ λ,

\[
\mu_t(dx) \to \delta_0(dx) \quad \text{as } t \to \infty.
\]

The α < λ case in the above theorem is particularly relevant to theoretical evolutionary biology because it implies that coexistence of the two types is possible in this infinite population limit. The results of theorem 3 should be contrasted with the original Markov chain before taking the limit m, n → ∞. In the Markov chain, all individuals eventually become either entirely type G or entirely type I. These two homogeneous states are absorbing states for the individual-level dynamics. The population-level state made of individuals that are all either homogeneous of type G or of type I is absorbing for the group-level dynamics. Hence, the state of the system eventually becomes composed entirely of homogeneous groups of solely G or I and stays in that state for all future times. These two absorbing states of the Markov chain, with finite m and n, correspond to the states $\delta_0$ and $\delta_1$ in the scaling limit. Hence the natural discretization of the Beta distribution to the lattice $\{k/n : 0 < k < n\}$ is not invariant for the finite system, but one expects it to be nearly invariant when m and n are large.


We will not pursue a rigorous proof of this near or quasi-invariance here. Nonetheless, we now briefly sketch the argument as we understand it, giving the central points. If the distribution of the Markov chain is close to a product of discretized Beta distributions, then the empirical mean will be highly concentrated around the mean of the continuous Beta when m and n are large. Hence the generator projected onto any $X^i$ is nearly decoupled from the other particles and close to being Markovian. More precisely, the dynamics of any fixed $X^i$ is well approximated in this setting by the one-dimensional Markov chain obtained by replacing the mean of the empirical measure in the full generator with the mean of the Beta distribution. It is straightforward to see that for m and n large the discretized Beta distribution is an approximate left-eigenfunction of this one-dimensional generator with an eigenvalue which goes to zero as m, n → ∞. All of these observations can be combined to show that if the system starts in the product discretized Beta distribution then it will stay close to the product discretized Beta distribution for a long time if m and n are large.

We now turn to the second scaling. Let s = σ/n, r = ρ/m, and n/m → θ, and let $\nu_t^{m,n}$ denote the empirical measure under this scaling. The terms F″ and g″ in the expansion (2) of the generator no longer vanish and the process converges to a limit that is stochastic. Our weak convergence result is stated and proved in terms of a martingale problem.

Theorem 4. Suppose n/m → θ, w = O(1), s = σ/n, r = ρ/m, and we speed up time by a factor of n. Suppose the particles in the rescaled process $\nu_t^{m,n}$ are initially independently and identically distributed according to the measure $\nu_0^{m,n}$, where $\nu_0^{m,n} \to \nu_0$ as m, n → ∞. Then the rescaled process converges weakly to $\nu_t$ as m, n → ∞, where $\nu_t$ satisfies the following martingale problem:

\[
N_t(f) = \langle f, \nu_t\rangle - \langle f, \nu_0\rangle - \int_0^t \langle Af, \nu_z\rangle\,dz - w\theta\rho \int_0^t \bigg\{\int_0^1\!\!\int_0^1 f(x)\,V(z, \nu_z, y)\,Q(\nu_z; dx, dy)\bigg\}\,dz \tag{6}
\]

is a martingale with conditional quadratic variation

\[
\langle N(f)\rangle_t = 2w\theta \int_0^t \bigg[\int_0^1\!\!\int_0^1 f(x)\,f(y)\,Q(\nu_\tau; dx, dy)\bigg]\,d\tau \tag{7}
\]

where

\[
Af(x) = x(1 - x)\Big[\frac{d^2}{dx^2}f(x) - \sigma\frac{d}{dx}f(x)\Big]
\]
\[
V(t, \nu, x) = x, \qquad Q(\nu; dx, dy) = \nu(dx)\big(\delta_x(dy) - \nu(dy)\big)
\]

and $f \in C^2([0, 1])$.

The drift part of the martingale (6) comprises a second order partial differential operator A and the centering term from the global jump dynamics (the expression in curly brackets). Note that A is in fact the generator of the Wright–Fisher diffusion with selection [8]. The entire process is a Fleming–Viot process [11]. Fleming–Viot processes frequently arise in models of population genetics (for example [3, 12]; see [10] for a review). In these contexts,

1688 Nonlinearity 30 (2017) 1682 S Luo and J C Mattingly

the variable x can represent the geographical location of an individual or, as in the original paper of Fleming and Viot [11], the genotype of an individual (where genotype is a continuous instead of a discrete variable).

As an aside, it may be helpful to mention an alternative characterization of this Fleming–Viot process as an infinite system of stochastic differential equations. Instead of a martingale problem, where both m and n have been taken to ∞, we consider the n → ∞ limit first. In this case, we have a finite collection of delta masses (each of mass 1/m) moving on the interval [0, 1]. The positions of these delta masses can be represented by a coupled system of stochastic differential equations (SDEs). From the generator expansion (2), one can see that each SDE comprises a diffusion part (corresponding to the individual-level dynamics) and a jump part (corresponding to the group-level dynamics). Specifically, a delta mass jumps to the position of another delta mass according to a Poisson process with rates dependent on the positions of the delta masses. Donnelly and Kurtz [7] characterize a population process in terms of such a system of SDEs and show that the infinite population limit corresponds to a martingale problem for the Fleming–Viot process.

We briefly discuss other scalings one might obtain from the particle system and their biological significance. The two scalings studied here correspond, respectively, to what is called 'strong' selection and 'weak' selection occurring at both levels. (In the field of theoretical evolutionary biology, strong selection is defined as the selection parameter being constant in the population size, whereas weak selection has the selection parameter scaling with the inverse of the population size.) One can also use the same techniques to characterize the limiting system when selection is strong at one level and weak at the other. The dynamics of these limiting systems are more straightforward.
For example, if selection is weak at the individual level (s = O(1/n)) and strong at the group level (no rescaling of r), one can see from the generator expansion (2) that the highest order term corresponds to selection at the group level. The limit is therefore deterministic, and the steady state is a population homogeneous in the group type with the largest proportion of type G individuals present in the initial state. Note that it is possible, by further rescaling w, to obtain a limit from a mixture of weak and strong selection. A biological interpretation of these observations is that for selection to manifest itself at two biological levels, the selective forces must be comparable in some sense: either both levels undergo the same type of selection (weak or strong), or if weak selection acts on one level but not the other, this weak selection must be compensated by a faster timescale.

The dynamical properties of the deterministic partial differential equation (3) are the focus of the next section. The proofs of weak convergence (theorems 1 and 4) are deferred to section 4.
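To complement the martingale-problem description, here is a rough Euler–Maruyama sketch of the prelimit system of SDEs described above: m masses on [0, 1], each following the Wright–Fisher diffusion with selection generated by A, plus occasional group-level replacement jumps. This is our own illustrative simplification; in particular, the jump rate `w * m`, the selection weight `1 + rho * x`, and the boundary clipping are ad hoc choices, not the paper's exact prelimit rates.

```python
import math
import random

def fleming_viot_sketch(m=100, sigma=1.0, rho=1.0, w=1.0,
                        dt=1e-3, t_max=1.0, seed=2):
    """m delta masses on [0, 1]; each follows
         dx = -sigma * x(1-x) dt + sqrt(2 x(1-x)) dW
       (the diffusion generated by A), and at rate ~ w*m a uniformly chosen
       mass jumps to the position of a parent chosen with probability
       proportional to 1 + rho * x (group-level selection)."""
    rng = random.Random(seed)
    x = [rng.random() for _ in range(m)]
    for _ in range(int(t_max / dt)):
        for i in range(m):
            xi = x[i]
            drift = -sigma * xi * (1.0 - xi) * dt
            noise = math.sqrt(max(2.0 * xi * (1.0 - xi) * dt, 0.0)) * rng.gauss(0.0, 1.0)
            x[i] = min(max(xi + drift + noise, 0.0), 1.0)  # clip to [0, 1]
        # thinned jump events; assumes w * m * dt < 1
        if rng.random() < w * m * dt:
            parent = rng.choices(range(m), weights=[1.0 + rho * xj for xj in x])[0]
            x[rng.randrange(m)] = x[parent]
    return x
```

The empirical measure of the returned positions is a finite-m stand-in for the Fleming–Viot limit; the diffusion coefficient $\sqrt{2x(1-x)}$ is read off from the second-order term of A.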

3. Properties of the deterministic limit

We begin with a closed-form expression for solutions to the deterministic partial differential equation (3).

Lemma 5. The solution to the deterministic partial differential equation (3) with initial measure $\mu_0$ is given by

\[
\mu_t(dx) = (G_t\mu_0)(dx) = (\mu_0 \circ \varphi_t^{-1})(dx)\cdot w_t(x) \tag{8}
\]

where


\[
\varphi_t^{-1}(x) = \frac{x}{e^{-t} + x(1 - e^{-t})}
\]

\[
w_t(x) = \Big[\big(e^{-t} + x(1 - e^{-t})\big)\, e^{\,t - \int_0^t h(z)\,dz}\Big]^{\lambda}
\]

and h(t) satisfes ht()= ⟨⟩x, µt −−1 1 Remark 1. (µφ0 tt )(d:xx)(= µφ0 ()d ) captures the changes in the initial data that are sole- ly due to the fux term. This expression is also known as the push-forward measure of µ0 under the dynamics of φ. As we will see in the proof, φt()x is precisely the characteristic curve for the spatial variable x and includes a normalizing constant. The multiplication by wt(x) captures the changes in the initial data that are due to the forcing term in (3) and includes a normalizing factor. Remark 2. Density-valued solutions are given by

\[
\eta_t(x) = \eta_0\big(\varphi_t^{-1}(x)\big)\,\partial_x \varphi_t^{-1}(x)\cdot w_t(x)
= \eta_0\!\left(\frac{x}{e^{-t} + x(1 - e^{-t})}\right)\big[e^{-t} + x(1 - e^{-t})\big]^{\lambda-2}\, e^{(\lambda-1)t - \lambda\int_0^t h(z)\,dz} \tag{9}
\]

To see this, suppose $\mu_0(dx) = \eta_0(x)\,dx$. Then for any test function f,

\[
\int_0^1 f(x)\,(\mu_0 \circ \varphi_t^{-1})(dx) = \int_0^1 f(\varphi_t(x))\,\mu_0(dx) = \int_0^1 f(y)\,\eta_0\big(\varphi_t^{-1}(y)\big)\,\partial_y \varphi_t^{-1}(y)\,dy
\]

The first equality follows from the change-of-variable property of push-forward measures and the second from a standard change of variables. The limits of integration do not change because 0 and 1 are fixed points of both $\varphi_t$ and $\varphi_t^{-1}$.

Proof of lemma 5. We apply the method of characteristics (see for example [17]) to obtain a formula for a density-valued solution. We then prove that the weak, measure-valued analog of this solution satisfies (3). Consider the following modification of (4):

\[
\frac{\partial}{\partial t}\xi(t, x) = \frac{\partial}{\partial x}\big[x(1-x)\,\xi(t, x)\big] + \lambda\,\xi(t, x)\,[x - h(t)] \tag{10}
\]

where h(t) is a general function of time and $\xi_0 \in C^1([0, 1])$. Note that when $h(t) = \int_0^1 y\,\xi(t, y)\,dy$, this differential equation is equivalent to (4). To be clear about which equation we are solving, we use $\xi(t, x)$ to denote solutions when h(t) is unspecified. Rewriting (10):

\[
\Big(\frac{\partial \xi}{\partial t},\ \frac{\partial \xi}{\partial x},\ -1\Big)\cdot\Big(1,\ -x(1-x),\ \big[(1-2x) + \lambda(x - h(t))\big]\xi\Big) = 0
\]

The second vector is therefore tangent to the solution surface and gives the rates of change for the t, x, and ξ coordinates. Let the initial condition be parameterized as $(0, x, \xi_0(x)) = (0, p, \xi_0(p))$. The t, x, and ξ coordinates change according to the characteristic equations

1690 Nonlinearity 30 (2017) 1682 S Luo and J C Mattingly

\[
\begin{aligned}
\frac{dt}{dq} &= 1, & t(0, p) &= 0\\
\frac{dx}{dq} &= -x(1-x), & x(0, p) &= p\\
\frac{d\xi}{dq} &= \big[(1 - 2x(q, p)) + \lambda\big(x(q, p) - h(t(q, p))\big)\big]\,\xi, & \xi(0, p) &= \xi_0(p)
\end{aligned}
\]

where q is the parameter as we move through the solutions in time. The first two ordinary differential equations have solutions

\[
t(q, p) = q, \qquad x(q, p) = \frac{p}{p - (p-1)e^{q}} =: \varphi_q(p) \tag{11}
\]

From this, the third differential equation can be solved exactly:

\[
\begin{aligned}
\frac{d\xi}{dq} &= \Big[1 + \frac{p}{p - (p-1)e^{q}}(\lambda - 2) - \lambda h(q)\Big]\,\xi\\
\xi(q, p) &= \xi_0(p)\,\exp\Big\{q - \lambda\int_0^q h(z)\,dz + (\lambda - 2)\int_0^q \frac{p}{p - (p-1)e^{z}}\,dz\Big\}\\
&= \xi_0(p)\,e^{\,q - \lambda\int_0^q h(z)\,dz}\,\big[p(e^{-q} - 1) + 1\big]^{-(\lambda-2)}
\end{aligned}
\]

Next, make the substitutions $q = t$ and $p = \varphi_t^{-1}(x)$ from (11) to obtain ξ in terms of t and x:

\[
\begin{aligned}
\xi(t, x) &= \xi_0\big(\varphi_t^{-1}(x)\big)\,\big[e^{-t} + x(1 - e^{-t})\big]^{\lambda-2}\, e^{(\lambda-1)t - \lambda\int_0^t h(z)\,dz}\\
&= (\xi_0 \circ \varphi_t^{-1})(x)\,\partial_x \varphi_t^{-1}(x)\cdot w_t(x) \tag{12}
\end{aligned}
\]

1 If h(t) satisfes ht = ytξ ,dyy, then by defnition, ξ tx, solves the partial differential () ∫0 () () equation (4). Conversely, if ξ()tx, solves the partial differential equation (4), it also solves the 1 differential equation (10) with ht = ytξ ,dyy. Therefore this above expression, along () ∫0 () 1 with the condition ht = ytξ ,dyy, are equivalent to solutions of (4). () ∫0 () To extend this result to measures, suppose we have a strong solution ηt()x with initial condi- tion η0: −−11 ηt()xx=∂()ηφ0 t ()xφt ()xw⋅ t ()x

Using a calculation similar to that in remark 2, the measure $\mu_t(dx)$ corresponding to $\eta_t(x)$ is given by

\[
\mu_t(dx) = (\mu_0 \circ \varphi_t^{-1})(dx)\cdot w_t(x)
\]

It remains to check that this satisfies the weak deterministic partial differential equation (3), with $h(t) = \langle x, \mu_t\rangle$. The left-hand side of the equation is


\[
\frac{d}{dt}\langle f, \mu_t\rangle = \frac{d}{dt}\int_0^1 f(x)\,w_t(x)\,(\mu_0 \circ \varphi_t^{-1})(dx) = \frac{d}{dt}\int_0^1 f(\varphi_t(x))\,w_t(\varphi_t(x))\,\mu_0(dx)
\]

Differentiating under the integral sign, expanding the expressions for $\partial_t\varphi_t$ and $\partial_t(w_t(\varphi_t(x)))$, and applying the change of variables for push-forward measures again, we obtain

\[
\frac{d}{dt}\langle f, \mu_t\rangle = -\int_0^1 x(1-x)\,f'(x)\,w_t(x)\,(\mu_0 \circ \varphi_t^{-1})(dx) + \lambda\int_0^1 [x - h(t)]\,f(x)\,w_t(x)\,(\mu_0 \circ \varphi_t^{-1})(dx)
\]

This matches the right-hand side of the weak deterministic partial differential equation (3). □

In practice, the condition $h(t) = \langle x, \mu_t\rangle$ is difficult to use. The following provides an equivalent and simpler condition.

Lemma 6 (Conservation of measure condition). Suppose ξ is a weak measure-valued solution to the deterministic partial differential equation (10) with initial condition $\int_0^1 \xi_0(dx) = 1$. Then

\[
h(t) = \int_0^1 y\,\xi(t, dy)\ \ \forall t > 0 \quad \text{if and only if} \quad \int_0^1 \xi(t, dy) = 1\ \ \forall t > 0.
\]

Proof. (⇒ direction) Suppose $h(t) = \int_0^1 y\,\xi(t, dy)$. Then ξ is a weak measure-valued solution to (3). Taking the test function f ≡ 1, we obtain

\[
\frac{d}{dt}\langle 1, \xi\rangle = 0 + \lambda\big[\langle x, \xi\rangle - \langle 1, \xi\rangle\langle x, \xi\rangle\big] = \lambda\,\langle x, \xi\rangle\,\big[1 - \langle 1, \xi\rangle\big]
\]

Thus, if the initial data has total measure 1, $\langle 1, \xi\rangle$ remains constant at 1 for all t ⩾ 0.

(⇐ direction) Suppose $\int_0^1 \xi(t, dx) = 1$ for all t > 0. Again take the test function f ≡ 1, but this time with unspecified h(t):

\[
0 = \frac{d}{dt}\langle 1, \xi\rangle = 0 + \lambda\big[\langle x, \xi\rangle - h(t)\langle 1, \xi\rangle\big] = \lambda\big[\langle x, \xi\rangle - h(t)\big]
\]

For this to hold, we must have $h(t) = \int_0^1 x\,\xi(t, dx)$. □

The above lemmas imply that solutions $\mu_t(dx)$ to (3) can be obtained by using formula (8) from lemma 5 and imposing the conservation of measure condition $\langle 1, \mu_t\rangle \equiv 1$ from lemma 6. We illustrate this with some exactly solvable examples for special choices of initial data. We will see that the long-time behavior of the examples is consistent with the results stated in theorem 3.

Example 3. Initial measure concentrated at $x_0 \in [0, 1]$, i.e. $\mu_0 = \delta_{x_0}$. Using formula (8),

\[
\int f(x)\,\mu_t(dx) = \int f(x)\,w_t(x)\,(\delta_{x_0} \circ \varphi_t^{-1})(dx) = f\big(\varphi_t(x_0)\big)\,w_t\big(\varphi_t(x_0)\big) = \int f(x)\,w_t(x)\,\delta_{\varphi_t(x_0)}(dx)
\]

Thus $\mu_t(dx) = w_t(x)\,\delta_{\varphi_t(x_0)}(dx)$. Imposing the conservation of measure condition gives

$\mu_t(dx) = \delta_{\varphi_t(x_0)}(dx)$. In other words, an initial delta measure at $x_0$ moves as a delta measure


along the x axis with position given by $\varphi_t(x_0)$, the solution to the negative logistic equation with initial position $x_0$.

Example 4. Initial uniform density: $\eta_0(x) = 1$, i.e. $\mu_0(dx) = dx$. Using formula (9),

\[
\eta_t(x) = e^{(\lambda-1)t - \lambda\int_0^t h(z)\,dz}\,\big[e^{-t} + x(1 - e^{-t})\big]^{\lambda-2}
\]

Imposing conservation of measure:

\[
e^{(\lambda-1)t - \lambda\int_0^t h(z)\,dz} = \bigg[\int_0^1 \big[e^{-t} + x(1 - e^{-t})\big]^{\lambda-2}\,dx\bigg]^{-1}
= \begin{cases}
\dfrac{(\lambda-1)(1 - e^{-t})}{1 - e^{-(\lambda-1)t}} & \text{if } \lambda \neq 1\\[8pt]
\dfrac{1 - e^{-t}}{t} & \text{if } \lambda = 1
\end{cases}
\]

Thus,

\[
\eta_t(x) = \begin{cases}
\dfrac{(\lambda-1)(1 - e^{-t})}{1 - e^{-(\lambda-1)t}}\,\big[e^{-t} + x(1 - e^{-t})\big]^{\lambda-2} & \text{if } \lambda \neq 1\\[8pt]
\dfrac{1 - e^{-t}}{t}\,\big[e^{-t} + x(1 - e^{-t})\big]^{-1} & \text{if } \lambda = 1
\end{cases}
\]
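The closed-form density for uniform initial data is easy to evaluate. The following snippet (our own check, written for the case λ ≠ 1) verifies that the density integrates to one, as lemma 6 requires, and that for large t it approaches the Beta(λ − 1, 1) density $(\lambda - 1)x^{\lambda-2}$:

```python
import math

def eta_uniform(x, t, lam):
    """Density from example 4 for uniform initial data (case lam != 1)."""
    pref = (lam - 1.0) * (1.0 - math.exp(-t)) / (1.0 - math.exp(-(lam - 1.0) * t))
    return pref * (math.exp(-t) + x * (1.0 - math.exp(-t))) ** (lam - 2.0)

def total_mass(t, lam, n=2000):
    """Composite Simpson's rule on [0, 1]; should return ~1 by lemma 6."""
    h = 1.0 / n
    s = eta_uniform(0.0, t, lam) + eta_uniform(1.0, t, lam)
    for k in range(1, n):
        s += (4 if k % 2 else 2) * eta_uniform(k * h, t, lam)
    return s * h / 3.0
```

For λ = 3 the limit density is 2x, so `eta_uniform(0.5, t, 3.0)` tends to 1 as t grows.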

Note that $\eta_0 \equiv 1$ corresponds to an initial condition satisfying the hypothesis of theorem 3 with α = 1. As predicted, when λ > 1 we obtain $\eta(t, x) \to (\lambda - 1)x^{\lambda-2} = \mathrm{Beta}(\lambda - 1, 1)$ as t → ∞. The following is an example with α > 1.

Example 5. If $\eta_0(x) = 2(1 - x)$, i.e. $\mu_0([1 - x, 1]) = x^2$, then the corresponding α from theorem 3 is α = 2. Using formula (9),

\[
\eta_t(x) = 2\,e^{(\lambda-2)t - \lambda\int_0^t h(z)\,dz}\,(1 - x)\,\big[e^{-t} + x(1 - e^{-t})\big]^{\lambda-3}
\]

Imposing the condition in lemma 6 to solve for the h(z) term

\[
e^{(\lambda-2)t - \lambda\int_0^t h(z)\,dz} = \bigg[2\int_0^1 (1 - x)\big[e^{-t} + x(1 - e^{-t})\big]^{\lambda-3}\,dx\bigg]^{-1}
= \begin{cases}
\dfrac{(1 - e^{-t})^2}{2}\bigg[\dfrac{1}{(\lambda-1)(\lambda-2)} - \dfrac{e^{-(\lambda-2)t}}{\lambda-2} + \dfrac{e^{-(\lambda-1)t}}{\lambda-1}\bigg]^{-1} & \text{if } \lambda \neq 2\\[12pt]
\dfrac{(1 - e^{-t})^2}{2\,(t + e^{-t} - 1)} & \text{if } \lambda = 2
\end{cases}
\]

As predicted by theorem 3, for λ > 2 = α,


\[
\eta_t(x) \to (\lambda-2)(\lambda-1)\,x^{\lambda-3}(1 - x) = \mathrm{Beta}(\lambda-2,\, 2)
\]

as t → ∞.

Example 6. $\eta_0(x) = \frac{1}{c}\,\mathbf{1}_{[0,c]}(x)$ with c < 1. Using formula (9),

\[
\eta_t(x) = \frac{1}{c}\,\mathbf{1}_{\{x \leqslant \varphi_t(c)\}}\,\partial_x \varphi_t^{-1}(x)\cdot w_t(x)
\]

Since $\varphi_t(c) = \frac{ce^{-t}}{1 - c + ce^{-t}} \to 0$ as t → ∞, we have $\eta_t(x) \to 0$ for any x > 0. Since $\eta_t$ must have total mass 1, it follows that, regardless of the value of λ, $\eta_t(x)\,dx \to \delta_0(dx)$ for any c < 1. This can also be seen by applying theorem 3 and noting that $\mu_0([1 - c, 1]) = \int_{1-c}^1 \eta_0(x)\,dx = 0$.

We end these examples with solutions for $\mu_0$ that are mixtures of delta measures and densities. First, note that it is straightforward to extend example 3 to the case where

$\mu_0(dx) = \sum_i a_i\,\delta_{x_i}(dx)$ is a linear combination of delta measures with $a_i > 0$ for all i. Applying (8), we obtain

\[
\mu_t(dx) = \sum_i a_i\,w_t(x)\,\delta_{\varphi_t(x_i)}(dx) = \sum_i a_i(t)\,\delta_{x_i(t)}(dx)
\]

where $x_i(t) = \varphi_t(x_i)$ and $a_i(t) = a_i\,w_t(x)\big|_{x = x_i(t)}$. Our earlier system of equations (5) is obtained from this and the definitions of $\varphi_t(x)$ and $w_t(x)$. Second, we consider a combination of a delta measure and a density

\[
\mu_0(dx) = a\,\delta_{x_0}(dx) + (1 - a)\,v_0(x)\,dx
\]

Notice that the formula for the solution (8) at first seems linear in the initial condition:

\[
\begin{aligned}
\int f(x)\,\mu_t(dx) &= \int f(x)\,(G_t\mu_0)(dx)\\
&= \int f(x)\,w_t(x)\Big[a\,\delta_{\varphi_t(x_0)}(dx) + (1 - a)\,v_0\big(\varphi_t^{-1}(x)\big)\,\partial_x\varphi_t^{-1}(x)\,dx\Big]\\
&= \int f(x)\Big[a\,(G_t\delta_{x_0})(dx) + (1 - a)\,(G_tv_0)(dx)\Big]
\end{aligned}
\]

This gives $(G_t\mu_0)(dx) = a\,(G_t\delta_{x_0})(dx) + (1 - a)\,(G_tv_0)(dx)$. However, this notation is misleading because implicit in the operator $G_t$ is the function h(t), the mean of the overall process over time. Here, h(t) involves both the delta measure and the density, so the solution operator $G_t$ is not linear. Nevertheless, we can still use this formula to obtain expressions for solutions. We illustrate this with a concrete example.

Example 7. Take x_0 = 0 and v_0(x) the density function for Beta(λ−α, α) with α ∈ (0, λ). Using the solution formula and direct calculation, we obtain

  μ_t(dx) = a w_t(0) δ_0(dx) + (1−a) w_t(x) v_0(φ_t^{−1}(x)) ∂_x φ_t^{−1}(x) dx
    = e^{−λ∫_0^t h(z)dz} { a δ_0(dx) + (1−a) e^{(λ−α)t} v_0(x) dx }


Note in particular that μ_t remains a linear combination of δ_0 and the Beta distribution. The Beta distribution ultimately dominates because λ > α.

We now use lemma 5 to show that Beta distributions, δ_0, and δ_1 are fixed points of the deterministic partial differential equation, and thus provide a proof of lemma 2 announced earlier in this note.

Proof of lemma 2. Note that we could prove this lemma by substituting δ_0, δ_1, and the Beta distribution into the deterministic partial differential equation (3) and showing that the right-hand side equals zero. Instead, we will show that these distributions are fixed points of the solution operator. Let v be the density of the Beta distribution,

  v(x) = (1 / B(λ−α, α)) x^{λ−α−1} (1−x)^{α−1}.

The mean of v is (λ−α)/λ. Using (9),

  (G_t v)(x) = v( x / (e^{−t} + x(1 − e^{−t})) ) e^{(α−1)t} [e^{−t} + x(1 − e^{−t})]^{λ−2} = v(x)

v is therefore a fixed point of the solution operator and hence a fixed point of the deterministic partial differential equation.
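The fixed-point computation for the Beta density can be spot-checked numerically. The sketch below (Python; names are illustrative, and the algebraic form of the weight is an assumption consistent with the definitions of φ_t and w_t) verifies that the map v ↦ v(x/b) e^{(α−1)t} b^{λ−2}, with b = e^{−t} + x(1 − e^{−t}), leaves the Beta(λ−α, α) density unchanged pointwise:

```python
import math

lam, alpha, t = 5.0, 2.0, 0.7
B = math.gamma(lam - alpha) * math.gamma(alpha) / math.gamma(lam)

def v(x):
    # Beta(lam - alpha, alpha) density
    return x ** (lam - alpha - 1) * (1 - x) ** (alpha - 1) / B

def Gt_v(x):
    # solution-operator action: v(x / b) * e^{(alpha - 1) t} * b^{lam - 2},
    # with b = e^{-t} + x (1 - e^{-t})
    b = math.exp(-t) + x * (1 - math.exp(-t))
    return v(x / b) * math.exp((alpha - 1) * t) * b ** (lam - 2)

for k in range(1, 20):
    x = k / 20.0
    assert abs(Gt_v(x) - v(x)) < 1e-9
```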

For δ_0 and δ_1, we use example 3 above to obtain (G_t δ_{x_0})(dx) = δ_{φ_t(x_0)}(dx). Since x_0 = 0 and x_0 = 1 are fixed points of φ_t, it follows that δ_0 and δ_1 are fixed points of G_t. □
We now prove when the fixed points are stable. We begin with a lemma which gives more general conditions than those given in theorem 3 for the delta measure at zero to attract a given initial condition.

Lemma 7. If for some α ⩾ λ > 0,

  lim_{x→0} x^{−α} μ_0([1−x, 1]) < ∞

then μ_t → δ_0 as t → ∞. In particular, this condition holds if μ_0([1−ε, 1]) = 0 for some ε > 0.
To prove this and subsequent results, we will need the following technical lemma.

Lemma 8. Setting h(t) = ⟨x, μ_t⟩, the following two implications hold:

  ∫_0^∞ h(t) dt < ∞ ⟹ h(t) → 0 as t → ∞,
  ∫_0^∞ [1 − h(t)] dt < ∞ ⟹ h(t) → 1 as t → ∞.

Proof of lemma 8. Since h(t) ⩾ 0 and 1 − h(t) ⩾ 0, the only obstruction to the implication is that h(t) (or 1 − h(t)) could have ever shorter intervals where it returns to an order one value before returning to a value close to zero. This would require h(t) to have unbounded derivatives. However, this is not possible since

  dh/dt = −( ⟨x, μ_t⟩ − ⟨x², μ_t⟩ ) + λ( ⟨x², μ_t⟩ − h²(t) )


from which one easily sees that −1 ⩽ dh/dt ⩽ λ, since 0 ⩽ ⟨x, μ_t⟩ − ⟨x², μ_t⟩ ⩽ 1 and 0 ⩽ ⟨x², μ_t⟩ − h² ⩽ 1. □
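The moment bounds behind this derivative estimate, 0 ⩽ ⟨x − x², μ⟩ ⩽ 1 and 0 ⩽ ⟨x², μ⟩ − ⟨x, μ⟩² ⩽ 1, can be checked against arbitrary discrete probability measures on [0, 1]. A minimal sketch (Python; all names illustrative):

```python
import random

random.seed(0)
lam = 3.0
for _ in range(1000):
    pts = [random.random() for _ in range(10)]   # support points in [0, 1]
    w = [random.random() for _ in range(10)]
    tot = sum(w)
    w = [u / tot for u in w]                     # probability weights
    m1 = sum(p * x for p, x in zip(w, pts))      # h = <x, mu>
    m2 = sum(p * x * x for p, x in zip(w, pts))  # <x^2, mu>
    dh = -(m1 - m2) + lam * (m2 - m1 ** 2)       # right-hand side for dh/dt
    assert 0.0 <= m1 - m2 <= 1.0 and 0.0 <= m2 - m1 ** 2 <= 1.0
    assert -1.0 <= dh <= lam
```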

Proof of lemma 7. As usual, let h(t) = ⟨x, μ_t⟩. We begin by observing that if

  ∫_0^∞ h(t) dt < ∞

then h(t) → 0 as t → ∞ by lemma 8, and μ_t → δ_0 as we wish to prove. Thus, we henceforth assume that ∫_0^∞ h(t)dt = ∞. Under this assumption, we will show that for any continuous function f

  ∫_0^1 f(x) μ_t(dx) → f(0) as t → ∞.

Since f is continuous, given any ε > 0, there exists a δ > 0 so that |f(x) − f(0)| < ε whenever x ⩽ δ. Hence

  | ∫_0^1 f(x) μ_t(dx) − f(0) | ⩽ ∫_0^1 |f(x) − f(0)| μ_t(dx) ⩽ ε + ∫_δ^1 |f(x) − f(0)| μ_t(dx)  (13)

Now, using the representation in (8),

  ∫_δ^1 |f(x) − f(0)| μ_t(dx) = ∫_{φ_t^{−1}(δ)}^1 |f(φ_t(x)) − f(0)| w_t(φ_t(x)) μ_0(dx)
    ⩽ 2∥f∥_∞ ∫_{φ_t^{−1}(δ)}^1 w_t(φ_t(y)) μ_0(dy).

Since for all y ∈ [φ_t^{−1}(δ), 1] and t > 0, we have

  w_t(φ_t(y)) ⩽ e^{λt − λ∫_0^t h(s)ds}

we see that

  ∫_δ^1 |f(x) − f(0)| μ_t(dx) ⩽ 2∥f∥_∞ e^{λt − λ∫_0^t h(s)ds} μ_0([φ_t^{−1}(δ), 1]).

Now using the assumptions on μ_0 and the fact that φ_t^{−1}(δ) ⩾ 1 − De^{−t} for some D > 0 and all t > 0, one has that

  e^{λt − λ∫_0^t h(s)ds} μ_0([φ_t^{−1}(δ), 1]) ⩽ D̂ e^{−(α−λ)t − λ∫_0^t h(s)ds}

for some constant D̂ and all t > 0. Since α ⩾ λ and ∫_0^∞ h(s)ds = ∞, this bound converges to zero as t → ∞, and the proof is complete as the ε in (13) was arbitrary. □

Proof of theorem 3. We start with the setting where μ_0({1}) > 0 and begin by writing μ_t(dx) = a_t δ_1(dx) + (1 − a_t) ν_t(dx) for some time dependent process a_t ∈ [0, 1] with a_0 > 0 and some probability measure valued process ν_t(dx). As usual we define h(t) = ⟨x, μ_t⟩, and using the representation given in (8), one sees that a_t solves


  da_t/dt = λ a_t (1 − h(t)) ⟹ a_t = a_0 exp( λ ∫_0^t [1 − h(s)] ds ).
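The closed-form expression for a_t can be checked by direct differentiation; the sketch below (Python) does so numerically for an illustrative mean trajectory h(s) = 1 − e^{−s} (an assumption purely for testing, not derived from the model):

```python
import math

lam, a0 = 2.5, 0.3

def h(s):
    # illustrative mean trajectory; for this choice int_0^t (1 - h(s)) ds = 1 - e^{-t}
    return 1.0 - math.exp(-s)

def a(t):
    # proposed solution a_t = a_0 exp(lam * int_0^t (1 - h(s)) ds)
    return a0 * math.exp(lam * (1.0 - math.exp(-t)))

eps = 1e-6
for t in [0.5, 1.0, 2.0]:
    deriv = (a(t + eps) - a(t - eps)) / (2 * eps)   # central difference
    assert abs(deriv - lam * a(t) * (1.0 - h(t))) < 1e-5
```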

Since 1 − h(t) ⩾ 0, we know that ∫_0^t [1 − h(s)]ds converges as t → ∞. If it converges to ∞ then a_t also converges to ∞ since a_0 > 0. However, this is impossible since a_t ∈ [0, 1] for all t ⩾ 0. Thus, we conclude that ∫_0^∞ [1 − h(s)]ds < ∞, so by lemma 8, h(t) → 1 and hence μ_t → δ_1 as t → ∞. We now turn to the setting where μ_0({1}) = 0 and x^{−α} μ_0([1−x, 1]) → C as x → 0. The case when λ ⩽ α is already handled by lemma 7, leaving only the case when λ > α > 0 to be proven. For x ∈ [0, 1], define U(x) = μ_0([0, x]). Since μ_0 is a probability measure, we know that U has finite variation and is regular in the sense that both the right limit U(x+) and the left limit U(x−) exist, where U(x±) = lim U(y) as y → x±. At the extreme points, only the limit obtained by staying in [0, 1] is defined. Now for any smooth function f on [0, 1], we have from (8) that

  ∫_0^1 f(x) μ_t(dx) = Z_t ∫_0^1 f(φ_t(x)) g_t(φ_t(x)) μ_0(dx) = Z_t ∫_0^1 [(f g_t) ∘ φ_t](x) μ_0(dx)

where w_t(x) has been written as the product of g_t(x) = (e^{−t} + x(1 − e^{−t}))^λ and Z_t, some positive, time dependent normalizing constant. It is enough to show that, for some positive, time dependent constant K_t,

  K_t ∫_0^1 [(f g_t) ∘ φ_t](x) μ_0(dx) → ∫_0^1 f(x) x^{λ−α−1}(1−x)^{α−1} dx as t → ∞.  (14)

Since x ↦ f(x) g_t(x) is continuous on [0, 1], even if U(x) has discontinuities the integration by parts formula for Lebesgue–Stieltjes integrals produces

  ∫_0^1 [(f g_t) ∘ φ_t](x) μ_0(dx) = (f g_t)(1) U(1−) − (f g_t)(0) U(0+) − ∫_0^1 ∂_x[(f g_t) ∘ φ_t](x) U(x) dx
    = [(f g_t)(1)][U(1−) − 1] + [(f g_t)(0)][1 − U(0+)] + ∫_0^1 ∂_x[(f g_t) ∘ φ_t](x) [1 − U(x)] dx.

Here we have used that φ_t is continuous with φ_t(1) = 1 and φ_t(0) = 0. First observe that 1 − U(1−) = 0 since μ_0([1−x, 1]) → 0 as x → 0 by assumption, and that g_t(0) = e^{−λt}. Hence

  [(f g_t)(1)][U(1−) − 1] + [(f g_t)(0)][1 − U(0+)] = [1 − U(0+)] f(0) e^{−λt}.  (15)

Now turning to the integral term, applying the chain rule and changing variables to y = φ_t(x) produces

  ∫_0^1 ∂_x[(f g_t) ∘ φ_t](x) [1 − U(x)] dx = ∫_0^1 (∂_x[f g_t])(φ_t(x)) [1 − U(x)] ∂_x φ_t(x) dx
    = ∫_0^1 (∂_x[f g_t])(y) [1 − U(φ_t^{−1}(y))] dy


For any fixed x ∈ (0, 1), by direct calculation and use of the assumption on μ_0, one sees that

  ∂_x[f g_t](x) → ∂_x( x^λ f(x) ),
  e^{αt}(1 − U(φ_t^{−1}(x))) = e^{αt} μ_0([φ_t^{−1}(x), 1]) → C ((1−x)/x)^α

as t → ∞.

Combining these facts with (15) and the fact that e^{−(λ−α)t} → 0 as t → ∞ since λ > α produces

  e^{αt} ∫_0^1 [(f g_t) ∘ φ_t](x) μ_0(dx) → C ∫_0^1 ∂_x( x^λ f(x) ) ((1−x)/x)^α dx as t → ∞

for some new positive constant C. Now since integration by parts implies that

  ∫_0^1 ∂_x( x^λ f(x) ) ((1−x)/x)^α dx = α ∫_0^1 f(x) x^{λ−α−1} (1−x)^{α−1} dx

the last part of the proof is complete. □

4. Proofs of weak convergence

The proofs of theorems 1 and 4 follow a standard procedure [3, 12, 13]. Both proofs require: (i) tightness of the sequence of stochastic processes, which implies the existence of a subsequential limit, and (ii) uniqueness of this limit. For the tightness of {μ_t^{m,n}} on D([0, T], P([0, 1])), it is sufficient, by theorem 14.26 in Kallenberg [14], to show that {⟨f, μ_t^{m,n}⟩} is tight on D([0, T], R) for any test function f from a countably dense subset of continuous, positive functions on [0, 1]. For the uniqueness of solutions to the partial differential equation in theorem 1, we apply Gronwall's inequality. For uniqueness of solutions to the martingale problem in theorem 4, we apply a Girsanov-type transform due to Dawson [5].

4.1. Semimartingale property of the multilevel selection process

It will be useful for what follows to treat ⟨f, μ_t^{m,n}⟩ as a semimartingale. Below, D_x^+ f is the first order difference quotient of f taken from the right, D_x^− f is the first order difference quotient of f taken from the left, and D_{xx} f is the second order difference quotient.

Lemma 9. For f ∈ C²([0, 1]) and μ_t^{m,n} with generator L^{m,n} defined in (1),

  ⟨f, μ_t^{m,n}⟩ − ⟨f, μ_0^{m,n}⟩ = A_t^{m,n}(f) + M_t^{m,n}(f)  (16)

t where Amn, f is a process of fnite variation, Amn, fa:d= mn, fz, with t () t ()∫0 z ()

  a_t^{m,n}(f) = Σ_i μ_t^{m,n}(i/n) (i/n)(1 − i/n) [ (1/n) D_{xx} f(i/n) − s D_x^− f(i/n) ]
    + wr { Σ_j μ_t^{m,n}(j/n) (j/n) f(j/n) − Σ_i μ_t^{m,n}(i/n) f(i/n) Σ_j μ_t^{m,n}(j/n) (j/n) }  (17)


and M_t^{m,n}(f) is a càdlàg martingale with (conditional) quadratic variation

  ⟨M^{m,n}(f)⟩_t = ∫_0^t { (1/mn) Σ_i μ_z^{m,n}(i/n) (i/n)(1 − i/n) [ (D_x^+ f(i/n))² + (1+s)(D_x^− f(i/n))² ]
    + (w/m) Σ_{i,j} μ_z^{m,n}(i/n) μ_z^{m,n}(j/n) (j/n)(1+r) ( f(i/n) − f(j/n) )² } dz  (18)

Proof. By Dynkin's formula (see, for example, lemma 17.21 in [14]),

  ψ(μ_t^{m,n}) − ψ(μ_0^{m,n}) − ∫_0^t (L^{m,n} ψ)(μ_s^{m,n}) ds

where ψ ∈ dom(L^{m,n}), is a càdlàg martingale. In particular, this is true for

  ψ(μ_t^{m,n}) = F(⟨f, μ_t^{m,n}⟩)

where f ∈ C²([0, 1]) and F : R → R. Setting F(x) = x and plugging this ψ into (1):

  (L⟨f, ·⟩)(v) = Σ_i v(i/n) (i/n)(1 − i/n) [ (1/n) D_{xx} f(i/n) − s D_x^− f(i/n) ]
    + wr { Σ_j v(j/n) (j/n) f(j/n) − Σ_i v(i/n) f(i/n) Σ_j v(j/n) (j/n) }

Thus,

  ⟨f, μ_t^{m,n}⟩ − ⟨f, μ_0^{m,n}⟩ − ∫_0^t a_z^{m,n}(f) dz = M_t^{m,n}(f)  (19)

where M_t^{m,n}(f) is some martingale and a_t^{m,n}(f) = (L⟨f, ·⟩)(μ_t^{m,n}). A_t^{m,n}(f) is a process of finite variation because, for a given f, a_t^{m,n}(f) is uniformly bounded in t. Next, setting F(x) = x² and plugging this ψ into (1):

  (L⟨f, ·⟩²)(v) = 2⟨f, v⟩ (L⟨f, ·⟩)(v) + (1/mn) Σ_i v(i/n) (i/n)(1 − i/n) [ (D_x^+ f(i/n))² + (1+s)(D_x^− f(i/n))² ]
    + (w/m) Σ_{i,j} v(i/n) v(j/n) (j/n)(1+r) ( f(i/n) − f(j/n) )²

Thus,

  ⟨f, μ_t^{m,n}⟩² − ⟨f, μ_0^{m,n}⟩² − ∫_0^t c_z^{m,n}(f) dz = martingale  (20)

where c_t^{m,n}(f) = (L⟨f, ·⟩²)(μ_t^{m,n}). Alternatively, take Y_t = ⟨f, μ_t^{m,n}⟩ and apply Itô's formula (for example, p 78 in [18]) to Y_t² to obtain


  ⟨f, μ_t^{m,n}⟩² − ⟨f, μ_0^{m,n}⟩² = ∫_0^t 2⟨f, μ_z^{m,n}⟩ a_z^{m,n}(f) dz + [M^{m,n}(f)]_t + martingale  (21)

where [M^{m,n}(f)]_t is the quadratic variation process of M_t^{m,n}. Since ⟨M^{m,n}(f)⟩_t is the compensator of [M^{m,n}(f)]_t,

  [M^{m,n}(f)]_t − ⟨M^{m,n}(f)⟩_t

is a martingale. Thus,

  ⟨f, μ_t^{m,n}⟩² − ⟨f, μ_0^{m,n}⟩² − ∫_0^t 2⟨f, μ_z^{m,n}⟩ a_z^{m,n}(f) dz − ⟨M^{m,n}(f)⟩_t = martingale  (22)

The compensator ⟨M^{m,n}(f)⟩_t is a predictable process of finite variation (see p 118 in [18]). By the uniqueness of the Doob–Meyer decomposition (p 103 in [18]), the martingale in (22) is the same as the martingale in (20). Equating these martingale parts we obtain

  ∫_0^t 2⟨f, μ_z^{m,n}⟩ a_z^{m,n}(f) dz + ⟨M^{m,n}(f)⟩_t = ∫_0^t c_z^{m,n}(f) dz.  (23)

Substituting in the expressions for a_z^{m,n} and c_z^{m,n} then gives the explicit expression for the conditional quadratic variation (18) in the statement of the lemma. □

4.2. Proof of deterministic limit

To prove theorem 1, we need the two following lemmas. The first uses criteria in Billingsley [2] to show tightness of the sequence of processes ⟨f, μ_t^{m,n}⟩. The second uses Gronwall's inequality to show uniqueness of solutions to the limiting system.

Lemma 10. The processes ⟨f, μ_t^{m,n}⟩, as a sequence in {(m, n)}, are tight for all positive-valued test functions f ∈ C¹([0, 1]).

Proof. By theorem 13.2 in [2], a sequence of probability measures {P_n} on D([0, T], R) is tight if and only if (i) for all η > 0, there exists an a such that

  P_n( x : sup_{t∈[0,T]} |x(t)| ⩾ a ) ⩽ η for n ⩾ 1

and (ii) for all ε > 0 and η > 0, there exist δ ∈ (0, 1) and n_0 such that

  P_n( x : w′_x(δ) ⩾ ε ) ⩽ η for all n > n_0

where w′ is the modulus of continuity for càdlàg processes and is defined

  w′_x(δ) := inf_{{t_i}} max_i sup_{s,t ∈ [t_{i−1}, t_i)} |x(s) − x(t)|

where {t_i} is a partition of [0, T] such that max_i {t_i − t_{i−1}} ⩽ δ and x ∈ D([0, T], R) is distributed according to P_n.
First, note that since μ_t^{m,n} is a probability measure, we have

  |⟨f, μ_t^{m,n}⟩| ⩽ ∥f∥_∞


for all t, m, and n. Thus, (i) holds. For (ii), we have by Markov's inequality:

  P^{m,n}( w′(δ) ⩾ ε ) ⩽ (1/ε) E^{m,n}( w′(δ) )  (24)

where w′(δ) := w′_{⟨f, μ_t^{m,n}⟩}(δ). We will use the fact that ⟨f, μ_t^{m,n}⟩ is a pure jump process to bound the right-hand side. The process ⟨f, μ_t^{m,n}⟩ has two types of jumps: nearest-neighbor and occupied-site jumps. Nearest-neighbor jumps occur at rate

  Σ_i m μ_t^{m,n}(i/n) i (1 − i/n)(2 + s) ⩽ (mn/4)(2 + s)

and have magnitude

  |⟨f, μ_t^{m,n}⟩ − ⟨f, μ_{t−}^{m,n}⟩| = |⟨f, μ_{t−}^{m,n} + (1/m)(δ_{(i±1)/n} − δ_{i/n})⟩ − ⟨f, μ_{t−}^{m,n}⟩| ⩽ (1/mn) max_i |D_x^− f(i/n)|

Occupied-site jumps occur at rate

  Σ_{i,j} m μ_t^{m,n}(i/n) μ_t^{m,n}(j/n) (j/n)(1 + r) ⩽ m(1 + r)

and have magnitude

  |⟨f, μ_t^{m,n}⟩ − ⟨f, μ_{t−}^{m,n}⟩| = |⟨f, μ_{t−}^{m,n} + (1/m)(δ_{j/n} − δ_{i/n})⟩ − ⟨f, μ_{t−}^{m,n}⟩| ⩽ (2/m) ∥f∥_∞

Putting this together,

  E^{m,n}( w′(δ) ) ⩽ E^{m,n}[ number of nearest-neighbor jumps in time δ ] · (1/mn) max_i |D_x^− f(i/n)|
    + E^{m,n}[ number of occupied-site jumps in time δ ] · (2/m) ∥f∥_∞
    ⩽ ((2+s)/4) mn δ · (1/mn) max_i |D_x^− f(i/n)| + m(1+r) δ · (2/m) ∥f∥_∞

  = { ((2+s)/4) max_i |D_x^− f(i/n)| + 2(1+r) ∥f∥_∞ } δ

Because f ∈ C¹([0, 1]), the expression in curly brackets is uniformly bounded by C_f, a constant that depends on f but not on m or n. Substituting the above into (24), we get that for δ < εη/C_f,

  P^{m,n}( w′(δ) ⩾ ε ) ⩽ η

for all m and n. Thus, both conditions for tightness are satisfied and ⟨f, μ_t^{m,n}⟩ is tight. □

Lemma 11. The integro-partial differential equation (3) in theorem 1 has a unique solution.


Proof. Suppose μ_t satisfies (3). Fix t ⩾ 0 and let ψ_t(x) be a function of time t and space x. By the chain rule and the differential equation (3),

  d/dt ⟨ψ_t, μ_t⟩ = [ d/dz ⟨ψ_z, μ_t⟩ ]_{z=t} + [ d/dz ⟨ψ_t, μ_z⟩ ]_{z=t}

    = ⟨∂ψ_t/∂t, μ_t⟩ − ⟨s x(1−x) ∂ψ_t/∂x, μ_t⟩ + wr [ ⟨x ψ_t, μ_t⟩ − ⟨ψ_t, μ_t⟩⟨x, μ_t⟩ ]

so that

  ⟨ψ_t, μ_t⟩ = ⟨ψ_0, μ_0⟩ + ∫_0^t ⟨ (∂ψ_z/∂z)(x) + (G ψ_z)(x), μ_z ⟩ dz + wr ∫_0^t [ ⟨x ψ_z, μ_z⟩ − ⟨ψ_z, μ_z⟩⟨x, μ_z⟩ ] dz  (25)

where G f = −s x(1−x) ∂f/∂x. Let P_t be the semigroup operator associated with G. In fact, using the method of characteristics (or lemma 5 with λ = 0),

  P_t f(x) = f( x e^{−st} / (1 − x + x e^{−st}) )  (26)
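As a quick check of (26), one can verify by finite differences that u(t, x) = (P_t f)(x) satisfies the transport equation ∂_t u + s x(1−x) ∂_x u = 0. A minimal sketch (Python; f is an arbitrary smooth test function chosen here for illustration):

```python
import math

s = 0.8

def f(y):
    # arbitrary smooth test function
    return math.sin(3 * y) + y * y

def u(t, x):
    # (P_t f)(x) from (26)
    a = math.exp(-s * t)
    return f(x * a / (1 - x + x * a))

eps = 1e-5
for t in [0.3, 1.0]:
    for x in [0.2, 0.5, 0.8]:
        du_dt = (u(t + eps, x) - u(t - eps, x)) / (2 * eps)
        du_dx = (u(t, x + eps) - u(t, x - eps)) / (2 * eps)
        # transport equation residual should vanish up to discretization error
        assert abs(du_dt + s * x * (1 - x) * du_dx) < 1e-6
```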

Now, set ψ_z(x) = P_{t−z} f(x) for 0 ⩽ z ⩽ t, where f ∈ C¹([0, 1]) is some test function. Substituting this into (25), we have

  ⟨P_0 f, μ_t⟩ = ⟨P_t f, μ_0⟩ + ∫_0^t ⟨ (∂/∂z) P_{t−z} f (x) + (G P_{t−z} f)(x), μ_z ⟩ dz
    + wr ∫_0^t [ ⟨x P_{t−z} f, μ_z⟩ − ⟨P_{t−z} f, μ_z⟩⟨x, μ_z⟩ ] dz

  ⟨f, μ_t⟩ = ⟨P_t f, μ_0⟩ + wr ∫_0^t [ ⟨x P_{t−z} f, μ_z⟩ − ⟨P_{t−z} f, μ_z⟩⟨x, μ_z⟩ ] dz  (27)

since (∂/∂z) P_{t−z} f = −G P_{t−z} f. Thus, any μ_t that satisfies (3) also satisfies (27). We show that (27) has a unique solution, which in turn implies that (3) has a unique solution. Suppose μ_t and ν_t both satisfy (27), with μ_0 = ν_0. Let t ⩾ 0.

  ∥μ_t − ν_t∥_TV = sup_{∥f∥_∞ ⩽ 1} ⟨f, μ_t⟩ − ⟨f, ν_t⟩
    = sup_{∥f∥_∞ ⩽ 1} { wr ∫_0^t ( ⟨x P_{t−z} f, μ_z − ν_z⟩ − [ ⟨x, μ_z⟩⟨P_{t−z} f, μ_z⟩ − ⟨x, ν_z⟩⟨P_{t−z} f, ν_z⟩ ] ) dz }  (28)

We can bound the first term in the integrand by

  wr |⟨x P_{t−z} f, μ_z − ν_z⟩| ⩽ wr ∥μ_z − ν_z∥_TV

because ∥x P_{t−z} f∥_∞ ⩽ ∥P_{t−z} f∥_∞ ⩽ ∥f∥_∞ ⩽ 1, where the first inequality follows from x ∈ [0, 1] and the second from (26). For the second term in the integrand of (28), add and subtract

  ⟨x, ν_z⟩⟨P_{t−z} f, μ_z⟩:


  wr | ⟨x, μ_z⟩⟨P_{t−z} f, μ_z⟩ − ⟨x, ν_z⟩⟨P_{t−z} f, ν_z⟩ |
    = wr | ⟨x, μ_z − ν_z⟩⟨P_{t−z} f, μ_z⟩ + ⟨x, ν_z⟩⟨P_{t−z} f, μ_z − ν_z⟩ |
    ⩽ wr ( ∥P_{t−z} f∥_∞ ∥μ_z − ν_z∥_TV + ∥μ_z − ν_z∥_TV )
    ⩽ wr ( ∥f∥_∞ + 1 ) ∥μ_z − ν_z∥_TV

again, the inequalities follow from x ∈ [0, 1], ∥P_t f∥_∞ ⩽ ∥f∥_∞, and the fact that μ_z and ν_z are probability measures. Substituting this back into (28),

  ∥μ_t − ν_t∥_TV ⩽ 3 wr ∫_0^t ∥μ_z − ν_z∥_TV dz

By Gronwall's inequality, ∥μ_t − ν_t∥_TV = 0, so we have uniqueness. □

Proof of theorem 1. The uniqueness of the limit is given by lemma 11 and the tightness of the process by lemma 10. It remains to show that {⟨f, μ_t^{m,n}⟩}_{m,n} converges to the solution of (3). Recall from lemma 9 that

  ⟨f, μ_t^{m,n}⟩ − ⟨f, μ_0^{m,n}⟩ = A_t^{m,n}(f) + M_t^{m,n}(f)

Since tightness implies relative compactness (Prohorov's theorem), there exists a subsequence of μ_t^{m,n} that converges to a limit, call it μ_t. Thus, ⟨f, μ_t^{m,n}⟩ → ⟨f, μ_t⟩. We also have ⟨f, μ_0^{m,n}⟩ → ⟨f, μ_0⟩ by assumption. In addition,

  A_t^{m,n}(f) = ∫_0^t { Σ_i μ_z^{m,n}(i/n) (i/n)(1 − i/n) [ (1/n) D_{xx} f(i/n) − s D_x^− f(i/n) ]
    + wr [ Σ_j μ_z^{m,n}(j/n) (j/n) f(j/n) − Σ_i μ_z^{m,n}(i/n) f(i/n) Σ_j μ_z^{m,n}(j/n) (j/n) ] } dz
  → ∫_0^t { ⟨ −(df/dx) s x(1−x), μ_z ⟩ + wr [ ⟨x f(x), μ_z⟩ − ⟨f(x), μ_z⟩⟨x, μ_z⟩ ] } dz =: A_t(f)

The factor of 1/m in the quadratic variation (18) implies that M_t^{m,n} → 0 as m, n → ∞. Therefore,

  ⟨f, μ_t⟩ − ⟨f, μ_0⟩ = A_t(f)

or,

  d/dt ⟨f, μ_t⟩ = −⟨ (df/dx) s x(1−x), μ_t ⟩ + wr [ ⟨x f(x), μ_t⟩ − ⟨f(x), μ_t⟩⟨x, μ_t⟩ ]. □

4.3. Proof of Fleming–Viot limit

The elementary proof of tightness in theorem 1 does not easily carry over to the setting of theorem 4. We thus use a criterion by Aldous [1] to prove tightness for the martingale part of the stochastic process.


First, consider the semimartingale formulation (16) of ⟨f, μ_t^{m,n}⟩ with the rescaled parameters s = σ/n and r = ρ/m. Let E_t^{m,n}(f) := ∫_0^t e_z^{m,n}(f) dz and N_t^{m,n}(f) denote the drift and martingale parts of ⟨f, ν_t^{m,n}⟩, the rescaled process. Then

  E_t^{m,n}(f) = ∫_0^t { Σ_i ν_z^{m,n}(i/n) (i/n)(1 − i/n) [ D_{xx} f(i/n) − σ D_x^− f(i/n) ]
    + wρ (n/m) { Σ_j ν_z^{m,n}(j/n) (j/n) f(j/n) − Σ_i ν_z^{m,n}(i/n) f(i/n) Σ_j ν_z^{m,n}(j/n) (j/n) } } dz  (29)

and

  ⟨N^{m,n}(f)⟩_t = ∫_0^t { (n/m²) Σ_i ν_z^{m,n}(i/n) (i/n)(1 − i/n) [ (D_x^+ f(i/n))² + (1 + σ/n)(D_x^− f(i/n))² ]
    + (n/m) w Σ_{i,j} ν_z^{m,n}(i/n) ν_z^{m,n}(j/n) (j/n)(1 + (ρ/m)(j/n)) ( f(i/n) − f(j/n) )² } dz  (30)

Lemma 12. The processes ⟨f, ν_t^{m,n}⟩, as a sequence in {(m, n)}, are tight for all f ∈ C²([0, 1]).

Proof. Since ⟨f, ν_t^{m,n}⟩ = E_t^{m,n}(f) + N_t^{m,n}(f), it suffices, by the triangle inequality applied to Billingsley's tightness criterion (theorem 13.2 in [2]), to show tightness of E^{m,n}(f) and N^{m,n}(f) separately. For the tightness of the finite variation term E_t^{m,n}(f):

  |e_t^{m,n}(f)| ⩽ (1/4) Σ_i ν_z^{m,n}(i/n) [ |D_{xx} f(i/n)| + σ |D_x^− f(i/n)| ]
    + wρ (n/m) { Σ_j ν_z^{m,n}(j/n) (j/n) |f(j/n)| + Σ_i ν_z^{m,n}(i/n) |f(i/n)| Σ_j ν_z^{m,n}(j/n) (j/n) }

For a given γ > 0, we can choose n and m sufficiently large such that n/m ∈ (θ − γ, θ + γ), |D_{xx} f(i/n)| ⩽ ∥f″∥_∞ + γ, and |D_x^− f(i/n)| ⩽ ∥f′∥_∞ + γ. We thus obtain

  |e_t^{m,n}(f)| ⩽ (1/4) [ ∥f″∥_∞ + γ + σ( ∥f′∥_∞ + γ ) ] + 2 wρ (θ + γ) ∥f∥_∞

There are only a finite number of m and n for which this condition is not satisfied. Taking the maximum of the right-hand side of the above equation with the values of |e_t^{m,n}(f)| for such m and n, we obtain that for all m and n,

  |e_t^{m,n}(f)| ⩽ G_f

and therefore

  sup_{t∈[0,T]} |E_t^{m,n}(f)| ⩽ G_f T


where G_f is a constant that depends on f. Using the same conditions for tightness as in the proof of lemma 10, condition (i) is satisfied because E_t^{m,n}(f) is bounded uniformly in t, m, and n. Condition (ii) is satisfied because |E_{t+δ}^{m,n} − E_t^{m,n}| ⩽ δ G_f for all t, m, and n, and therefore we can always choose δ sufficiently small so that |E_{t+δ}^{m,n} − E_t^{m,n}| ⩽ ε for any prescribed ε.
We will show tightness for the martingale part N_t^{m,n}(f) using Aldous' tightness condition (we use the result as stated in [9]). First, note that by equation (30),

  ⟨N^{m,n}(f)⟩_t ⩽ J_f t

for f ∈ C²([0, 1]), where J_f is a constant that depends on f. Thus for fixed t,

  P^{m,n}( |N_t^{m,n}(f)| > a ) ⩽ (1/a) E^{m,n} |N_t^{m,n}(f)|
    ⩽ (1/a) ( E^{m,n}( (N_t^{m,n}(f))² ) )^{1/2} = (1/a) ( E^{m,n} ⟨N^{m,n}(f)⟩_t )^{1/2} ⩽ (J_f t)^{1/2} / a

Given ε > 0, choose a > (J_f t)^{1/2}/ε and we have that N_t^{m,n}(f) is tight for each t. Next, let τ be a stopping time, bounded by T, and let ε > 0. For κ > 0,

  P^{m,n}( |N_{τ+κ}^{m,n}(f) − N_τ^{m,n}(f)| ⩾ ε ) ⩽ (1/ε) E^{m,n} | N_{τ+κ}^{m,n}(f) − N_τ^{m,n}(f) |

Now (suppressing the superscripts on N for clarity),

  E | N_{τ+κ}(f) − N_τ(f) | ⩽ [ E( ( N_{τ+κ}(f) − N_τ(f) )² ) ]^{1/2}
    = [ E( N_{τ+κ}(f)² + N_τ(f)² − 2 N_τ(f) N_{τ+κ}(f) ) ]^{1/2}
    = [ E( ⟨N(f)⟩_{τ+κ} − ⟨N(f)⟩_τ ) ]^{1/2} ⩽ (J_f κ)^{1/2}

Hence,

  P^{m,n}( |N_{τ+κ}^{m,n}(f) − N_τ^{m,n}(f)| ⩾ ε ) ⩽ (J_f κ)^{1/2} / ε

By taking κ < ε⁴/J_f, we satisfy the conditions of Aldous' stopping time criterion. □

Lemma 13. The martingale problem (6) and (7) has a unique solution.

Proof. The martingale problem with V(t, x, ν) = 0 corresponds to a neutral Fleming–Viot process with a linear mutation operator. Its uniqueness has previously been established (see for example [5]). To show uniqueness for nontrivial V, we use a Girsanov-type transform by Dawson [5]. It suffices to check that

  sup_{t,x,μ} |V(t, x, μ)| ⩽ a_0 (a constant)  (31)

In our case, V(t, x, μ) = x and since x ∈ [0, 1], the condition is satisfied and the martingale problem has a unique solution. □


Proof of theorem 4. The uniqueness of the limit is given by lemma 13 and the tightness of the process by lemma 12. To see that the limit is the martingale problem stated in theorem 4, note that for a fixed t,

  E_t^{m,n}(f) → ∫_0^t { ∫_0^1 x(1−x) [ (∂²/∂x²) f(x) − σ (∂/∂x) f(x) ] ν_z(dx)
    + wρθ [ ∫_0^1 x f(x) ν_z(dx) − ∫_0^1 f(x) ν_z(dx) ∫_0^1 x ν_z(dx) ] } dz

as n, m → ∞, and

  ⟨N^{m,n}(f)⟩_t → wθ ∫_0^t ∫_0^1 ∫_0^1 ( f(x) − f(y) )² ν_z(dx) ν_z(dy) dz

Finally, notice that

  ∫_0^1 ∫_0^1 ( f(x) − f(y) )² ν_z(dx) ν_z(dy)
    = 2 ∫_0^1 ∫_0^1 f(x)² ν_z(dx) ν_z(dy) − 2 ∫_0^1 ∫_0^1 f(x) f(y) ν_z(dx) ν_z(dy)
    = 2 ∫_0^1 ∫_0^1 f(x) f(y) ν_z(dx) [ δ_x(dy) − ν_z(dy) ]

and

  ∫_0^1 x f(x) ν_z(dx) − ∫_0^1 f(x) ν_z(dx) ∫_0^1 x ν_z(dx) = ∫_0^1 ∫_0^1 x f(y) [ δ_x(dy) − ν_z(dy) ] ν_z(dx)

so the limit satisfies the form of the martingale problem in the theorem. □
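The two integral identities used at the end of the proof are elementary covariance rewritings, and can be confirmed on an arbitrary discrete measure. A minimal sketch (Python; all names illustrative):

```python
import random

random.seed(1)

def f(y):
    # arbitrary test function
    return y ** 3 - 2 * y

pts = [random.random() for _ in range(8)]
w = [random.random() for _ in range(8)]
tot = sum(w)
w = [u / tot for u in w]   # probability weights: a discrete measure nu

Ef  = sum(p * f(x) for p, x in zip(w, pts))
Ef2 = sum(p * f(x) ** 2 for p, x in zip(w, pts))
Ex  = sum(p * x for p, x in zip(w, pts))
Exf = sum(p * x * f(x) for p, x in zip(w, pts))

# (f(x) - f(y))^2 identity
lhs1 = sum(p * q * (f(x) - f(y)) ** 2 for p, x in zip(w, pts) for q, y in zip(w, pts))
assert abs(lhs1 - (2 * Ef2 - 2 * Ef ** 2)) < 1e-12

# covariance identity: int x f dnu - int f dnu * int x dnu
rhs2 = sum(p * x * (f(x) - Ef) for p, x in zip(w, pts))
assert abs((Exf - Ef * Ex) - rhs2) < 1e-12
```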

Acknowledgments

The authors would like to thank Mike Reed and Katia Koelle for their roles in the collaboration out of which this paper's central model grew. We would also like to thank Rick Durrett for a number of useful discussions. SL would further like to thank Don Dawson, who sent her an early version of his notes on this topic [5], and Sylvie Méléard, who took time out of a conference to explain to her the elements of the proof of weak convergence. JCM would like to thank the NSF for its support through DMS-08-54879. SL gratefully acknowledges support from the NSF (grants NSF-EF-08-27416 and DMS-0942760), the NIH (grant R01-GM094402), and the Simons Institute for the Theory of Computing.

References

[1] Aldous D 1978 Stopping times and tightness Ann. Probab. 6 335–40 [2] Billingsley P 1999 Convergence of Probability Measures 2nd edn (New York: Wiley-Interscience) [3] Champagnat N, Ferrière R and Méléard S 2006 Unifying evolutionary dynamics: from individual stochastic processes to macroscopic models Theor. Population Biol. 69 297–321


[4] Dawson D A and Greven A 1999 Hierarchically interacting Fleming–Viot processes with selection and mutation: multiple space time scale analysis and quasi-equilibria Electron. J. Probab. 4 1–81
[5] Dawson D A 2010 Introductory lectures on stochastic population systems Technical Report Series of the Laboratory for Research in Statistics and Probability Technical Report 451, Carleton University–University of Ottawa
[6] Dawson D A and Hochberg K J 1991 A multilevel branching model Adv. Appl. Probab. 23 701–15
[7] Donnelly P and Kurtz T G 1999 Genealogical processes for Fleming–Viot models with selection and recombination Ann. Appl. Probab. 9 1091–148
[8] Durrett R 2008 Probability Models for DNA Sequence Evolution 2nd edn (New York: Springer)
[9] Etheridge A 2000 An Introduction to Superprocesses (Providence, RI: American Mathematical Society)
[10] Ethier S N and Kurtz T G 1993 Fleming–Viot processes in population genetics SIAM J. Control Optim. 31 345–86
[11] Fleming W H and Viot M 1979 Some measure-valued Markov processes in population genetics theory Indiana Univ. Math. J. 28 817–43
[12] Fournier N and Méléard S 2004 A microscopic probabilistic description of a locally regulated population and macroscopic approximations Ann. Appl. Probab. 14 1880–919
[13] Joffe A and Metivier M 1986 Weak convergence of sequences of semimartingales with applications to multitype branching processes Adv. Appl. Probab. 18 20–65
[14] Kallenberg O 1997 Foundations of Modern Probability 1st edn (New York: Springer)
[15] Luo S 2014 A unifying framework reveals key properties of multilevel selection J. Theor. Biol. 341 41–52
[16] Méléard S and Roelly S 2011 A host-parasite multilevel interacting process and continuous approximations (arXiv:1101.4015)
[17] Pinchover Y and Rubinstein J 2005 An Introduction to Partial Differential Equations (Cambridge: Cambridge University Press)
[18] Protter P E 2004 Stochastic Integration and Differential Equations 2nd edn (New York: Springer)
