Journal of Theoretical 394 (2016) 182–196

Contents lists available at ScienceDirect

Journal of Theoretical Biology

journal homepage: www.elsevier.com/locate/yjtbi

A monoecious and diploid Moran model of random mating

Ola Hössjer a,n, Peder A. Tyvand b a Department of Mathematics, Stockholm University, SE-106 91 Stockholm, Sweden b Department of Mathematical Sciences and Technology, Norwegian University of Life Sciences, N-1432 Ås, Norway

HIGHLIGHTS

A diploid monoecious Moran model is proposed for random mating with possible selfing. The Moran model is compared with a diploid and monoecious Wright–Fisher model. Diffusion approximations are derived on two time scales. Genotype frequencies oscillate as an Ornstein–Uhlenbeck process on the scale.

Fixation index fIS oscillates as an Ornstein–Uhlenbeck process around a fixed point. article info abstract

Article history: An exact is developed for a Moran model of random mating for monoecious diploid Received 17 May 2015 individuals with a given probability of self-fertilization. The model captures the dynamics of genetic Received in revised form variation at a biallelic locus. We compare the model with the corresponding diploid Wright–Fisher (WF) 23 November 2015 model. We also develop a novel diffusion approximation of both models, where the genotype frequency Accepted 20 December 2015 distribution dynamics is described by two partial differential equations, on different time scales. The first Available online 22 January 2016 equation captures the more slowly varying allele frequencies, and it is the same for the Moran and WF Keywords: models. The other equation captures departures of the fraction of heterozygous genotypes from a large Diploid population equilibrium curve that equals Hardy–Weinberg proportions in the absence of selfing. It is the Diffusion approximation distribution of a continuous time Ornstein–Uhlenbeck process for the Moran model and a discrete time Markov chain autoregressive process for the WF model. One application of our results is to capture dynamics of the Moran model fi Random mating degree of non-random mating of both models, in terms of the xation index fIS. Although fIS has a stable fixed point that only depends on the degree of selfing, the normally distributed oscillations around this fixed point are stochastically larger for the Moran than for the WF model. & 2016 Elsevier Ltd. All rights reserved.

1. Introduction very small populations. In this paper we develop an exact Markov process for a simpler monoecious (one-sex) diploid Moran model The classical haploid Wright–Fisher model of population with possible selfing, at a biallelic locus. It has a simpler state genetics describes random mating in discrete generations without space, where only two genotype frequencies are needed to analyze overlap (Fisher, 1922; Wright, 1931). As a contrast the haploid the dynamics of the population, in discrete or continuous time. – Moran model (Moran, 1958a) of has a higher There is also a monoecious and diploid Wright Fisher (WF) degree of continuity between consecutive generations, as it model, see for instance Moran (1958c), Crow and Denniston (1988), Tyvand (1993) and references therein. We will compare the exchanges one individual at discrete or continuous time points. A exact Markov chains for the monoecious and diploid Moran and diploid and dioecious (two-sex) version of the discrete time Moran WF models, and in particular check the validity of the standard model was introduced by Moran (1958b). It has a more compli- equivalence-time scaling that connects the two models. In order to cated state space, with genotype frequencies for males and facilitate this comparison, we develop diffusion approximations females, so that exact computation becomes unfeasible for all but for both models that are increasingly accurate for large popula- tions. It is well-known (Watterson, 1964, Ethier and Nagylaki,

n 1980, 1988) that such approximations work on two different time Corresponding author. E-mail addresses: [email protected] (O. Hössjer), scales in terms of a system of two partial differential equations [email protected] (P.A. Tyvand). instead of one. The first equation is the same for the Moran and http://dx.doi.org/10.1016/j.jtbi.2015.12.028 0022-5193/& 2016 Elsevier Ltd. All rights reserved. O. Hössjer, P.A. Tyvand / Journal of Theoretical Biology 394 (2016) 182–196 183

WF models. It describes the slower allele frequencies dynamics in each instant can be formulated in discrete or continuous time. We the same way as for the haploid Moran and WF models (Kimura, will mainly focus on the discrete version, and briefly mention the 1955). The second equation operates on a more local time scale, continuous time extension in Section 5. Time t¼0 represents the and it is different for the Moran and WF models. It captures the founder generation, and the composition of the founder popula- more rapidly varying departure of the fraction of heterozygous tion is given. The constant number of diploid and monoecious genotypes from a large population equilibrium curve that corre- individuals is n. We take into account the probability s of self- – sponds to Hardy Weinberg (HW) proportions when there is no fertilization. fi sel ng. The Markov chain for the discrete time diploid Moran model Previous work for diploid models have shown that the allele starts with a given founder population. At each of the following frequency process dominates for large populations, and that the time steps ðt ¼ 1; 2; ‥Þ we work with a probability distribution over oscillations around the equilibrium curve are asymptotically neg- all possible populations. In an input (parental) population (time ligible. Here we rescale the latter and obtain a nondegenerate limit step t) there are nðtÞ ¼ n individuals of genotype AA, nðtÞ ¼ n that has the more slowly varying allele frequency process as a 1 1 2 2 genotypes of type Aa, and nðtÞ ¼ n individuals of genotype aa. The fixed parameter. This makes the frequency oscillations of the 3 3 heterozygous genotypes locally time invariant and normally dis- total number of input individuals is n ¼ n1 þn2 þn3. In each output ~ ¼ ðt þ 1Þ ~ ¼ tributed around the equilibrium curve. It is a stationary Gaussian population there are n1 n1 individuals of genotype AA, n2 ðt þ 1Þ ~ ðt þ 1Þ (Ornstein–Uhlenbeck, OU) process in continuous time for the n2 individuals of genotype Aa, and n3 ¼ n3 individuals of Moran model, and an autoregressive process in discrete time for genotype aa. We will not call the output population an offspring the WF model. population, because there is just one new member appearing at A similar autoregressive limit process has previously been each time step t. It is assumed that the total number of diploid obtained by Korolyuk and Korolyuk (1995), and Coad (2000) in a individuals n remains constant for all t, although extensions to a different context, to model allele frequency fluctuations of a variable population size are discussed in Section 8. This implies in – ~ ~ ~ diploid Wright Fisher model with balancing selection. However, particular that n1 þn2 þn2 ¼ n. these oscillations are one-dimensional and do not surround an At each time step t we first pick one mating individual at equilibrium curve, but rather a stable fixed point. Norman (1975) random, for reproduction. After that, we pick one more mating fl also obtains normally distributed uctuations of allele frequencies, individual, with probability s for self-fertilization. These two but in the context of a monoecious diploid WF model or a dioe- mating individuals produce an offspring individual, which is put cious diploid Moran model, for which the deterministic forces of aside for a moment. We can say that the newly formed offspring selection and/or are stronger than the stochastic genetic individual is in exile, temporarily. The next step is that we select drift. In more detail, the allele frequency dynamics in Norman's one individual at random for removal from the genotype pool. The paper is dominated by a deterministic function that solves an last thing we do, is to bring the offspring individual in from its ordinary differential equation according to Haldane's theory (see exile and put it into the genotype pool to replace the removed for instance Chapter 4 of Cavalli-Sforza and Bodmer, 1971). The individual. random fluctuations around this deterministic allele frequency Each time step t thus comprises four substeps: (i) the random curve are smaller, but in contrast to our results, they occur on the same time scale as the variations of the deterministic curve. These picking of two diploid parental individuals, one after the other, stochastic fluctuations are described by a Gaussian diffusion in the with a given probability for self-mating. These parental individuals limit of large populations. Norman's results are formulated are put back into the genotype pool after their mating. (ii) The mathematically in a more general setting though, and we will formation of one offspring individual, by combining two haploid apply them to the genotype frequency process of the monoecious gametes at random from the parental individuals. This single and diploid Moran model, in order to prove its weak convergence newly formed diploid offspring individual is put temporarily into on both time scales simultaneously. exile. (iii) The random removal of one diploid individual from the

An implication of our results is that the fixation index fIS of genotype pool. (iv) The waiting offspring individual is put into the Wright (1943) oscillates around a stable fixed point that is only a genotype pool, where it replaces the individual that has been function of the selfing probability. These oscillations constitute an eliminated. OU process for the Moran model and an autoregressive process for The net effect during one time step is that one diploid indivi- the WF model. Both these processes have a marginal normal dis- dual dies and is replaced by one newly composed individual. This tribution, whose and bias are larger for the Moran than new genotype is composed through random mating in the full for the WF model. genotype pool, including the one that will soon be picked to die fi fi Our paper is organized as follows. We rst de ne the diploid and be replaced by the newly composed one. This is an important Moran model in Section 2 in terms of a Markov chain. In Section 3 distinction. The formation of the offspring takes place before an we define some important , such as the expected hetero- individual is removed from the genotype pool. A different type of zygosity, effective population size and fixation index. The diffusion Moran model could be constructed if one assumed the reverse approximation of the Moran model is introduced in Section 4,and order of removal and replication. A stronger would its continuous time version in Section 5.InSection 6 we introduce result if the dying individual had been excluded from having more briefly the diploid WF model, and derive its diffusion offspring. approximation. Simulation results are presented in Section 7,and The present model will not take mutation and selection into extensions are discussed in Section 8. The mathematical derivations fi have been collected in the appendix. account, but it could be modi ed to do so. A probability s of self- fertilization is taken into account though. The case s ¼ 1=n is of particular interest because it lets the mating take place in a 2. Formulation of monoecious diploid Moran model gamete pool instead of in a genuine genotype pool. It is thus a reference case which can be called standard self-fertilization, We consider one locus and assume two versions or alleles A where the diploid model effectively degenerates into a haploid and a of the gamete. Moran (1958a) noted that a model with model where the heterozygosity is no longer an essential property, overlapping generations, where only one individual is replaced at since no heterozygosity can be identified in a haploid gamete pool. 184 O. Hössjer, P.A. Tyvand / Journal of Theoretical Biology 394 (2016) 182–196

2.1. Offspring and elimination probabilities that we already know. Once we know n2 and n3,wefind n1 ¼ nn2 n3, since the total number of diploid individuals n has In the next two subsections, we will look at the transition a given constant value. The inverse transformation from the ~ matrix of the Moran model. The probabilities for the different counting parameter j of Eq. (5) back to the genotype numbers ðn1 ~ ~ offspring genotypes AA, Aa and aa in the output population are ; n2; n3Þ for the output population is analogous to (8). denoted by P , P and P , respectively. Following Tyvand (1993), M dðnÞ 1 2 3 We will now construct the transition matrix ¼ðMijÞi;j ¼ 1 these probabilities are between all the possible input populations i and the correspond- 2 ing output populations j. We must give seven different expressions ð1sÞðn1 þn2=2Þ þðsn1Þðn1 þn2=4Þ P ¼ ; for seven different cases. 1 nðn1Þ Define frequencies Q ðtÞ ¼ Q , Q ðtÞ ¼ Q and Q ðtÞ ¼ Q at time t, ð1sÞðn n þn n þ2n n þn2=2Þþðsn1Þðn =2Þ 1 1 2 2 3 3 P ¼ 1 2 3 2 1 3 2 2 ; for the three genotypes AA; Aa; aa, respectively. They are given by 2 nðn1Þ 2 ðn1; n2; n3Þ ð1sÞðn3 þn2=2Þ þðsn1Þðn3 þn2=4Þ ðQ ; Q ; Q Þ¼ ; ð9Þ P ¼ : ð1Þ 1 2 3 n 3 nðn1Þ and equal the elimination probabilities in Step (iii). We introduce We will see that there is one notable difference between our them here in the interest of compacting formulas for the transition diploid Moran model and the diploid Wright–Fisher model in matrix. Tyvand (1993): the case of standard self-fertilization ðs ¼ 1=nÞ There is elimination with full replacement, since the eliminated makes the diploid Wright–Fisher model degenerate into its con- genotype is replaced by the newly formed offspring genotype. ventional haploid version. In contrast, the case of standard self- Thus the sum of offspring probabilities adds up to one fertilization makes our diploid Moran model degenerate into a ðP1 þP2 þP3 ¼ 1Þ. The sum of elimination probabilities must also haploid but modified Moran model. This modified haploid Moran add up to one ðQ þQ þQ ¼ 1Þ. Now we will combine these model will have two eliminated gametes replaced by two newly 1 2 3 probabilities of offspring and elimination for each genotype to formed gametes at each time step. In the conventional haploid construct the elements in the transition matrix, where there are Moran model, only one gamete is eliminated and replaced by a seven distinct cases. newly formed gamete. (i) Genotype Aa replaces genotype aa: This is the case where the Tyvand (1993) allowed the population size to be different from relationships between the numbers of output and input genotypes one time step to the next, involving rectangular transition matri- are ces. The transition matrix is always a square matrix for the Moran ~ ; ~ ; ~ : model, since we assume that the number of individuals n in the n1 ¼ n1 n2 ¼ n2 þ1 n3 ¼ n3 1 ð10Þ parental population is the same as the number of individuals in The associated elements M in the transition matrix are given by ; ij the offspring population. Assuming that all possible states ðn2 n3Þ the product of the probability of reproducing one genotype Aa and are ordered as the removal of one genotype aa fð0; 0Þ¼1; ð0; 1Þ¼2; ð1; 0Þ¼3; ð0; 2Þ¼4; …; ðn; 0Þ¼dðnÞg; ð2Þ Mij ¼ P2Q 3: ð11Þ the running indices in the Markov chain algebra are the counting (ii) Genotype aa replaces genotype Aa: This is the case where the parameter i for all different possible input populations. It is relationships between the numbers of output and input genotypes defined as are ðn2 þn3Þðn2 þn3 þ1Þ ~ ~ ~ i ¼ 1þn þ ; ð3Þ n1 ¼ n1; n2 ¼ n2 1; n3 ¼ n3 þ1: ð12Þ 3 2 where i runs from 1 to a value d(n) that represents the number of The associated elements in the transition matrix Mij are given by different population compositions (where permutations are the product of the probability of reproducing one genotype aa and excluded). This number of different populations is given by the removal of one genotype Aa : ðnþ1Þðnþ2Þ Mij ¼ P3Q 2 ð13Þ dðnÞ¼ : ð4Þ 2 (iii) Genotype aa replaces genotype AA: This is the case where The similar counting parameter j for all possible output popula- the relationships between the numbers of output and input gen- tions is otypes are ðn~ þn~ Þðn~ þn~ þ1Þ n~ ¼ n 1; n~ ¼ n ; n~ ¼ n þ1: ð14Þ j ¼ 1þn~ þ 2 3 2 3 ; ð5Þ 1 1 2 2 3 3 3 2 The associated elements in the transition matrix Mij are given by ~ where j runs from 1 to d(n). Even though n1 and n1 are not the product of the probability of reproducing one genotype aa and included in (3) and (5), these equations are essentially transfor- the removal of one genotype AA mations from triplets to single natural numbers M ¼ P Q : ð15Þ ~ ~ ~ ij 3 1 ðn1; n2; n3Þ-i; ðn1; n2; n3Þ-j: ð6Þ (iv) Genotype AA replaces genotype aa: This is the case where It is desirable also to establish inverse transformations from nat- the relationships between the numbers of output and input gen- ural numbers back to triplets otypes are ~ ~ ~ i-ðn1; n2; n3Þ; j-ðn1; n2; n3Þ; ð7Þ ~ ~ ~ n1 ¼ n1 þ1; n2 ¼ n2; n3 ¼ n3 1: ð16Þ and for the input population these are given by the formulas The associated elements in the transition matrix M are given by "#pffiffiffiffiffiffiffiffiffiffiffiffi ij 8i71 ðn þn Þðn þn þ1Þ the product of the probability of reproducing one genotype AA and n þn ¼ ; n ¼ i1 2 3 2 3 ; ð8Þ 2 3 2 3 2 the removal of one genotype aa M ¼ P Q : ð17Þ where ½x denotes the integer part of x, which is the largest integer ij 1 3 that is smaller or equal to x. From the relationships (8) we first (v) Genotype AA replaces genotype Aa: This is the case where the determine the sum n2 þn3, thereafter n3 and then n2 from the sum relationships between the numbers of output and input genotypes O. Hössjer, P.A. Tyvand / Journal of Theoretical Biology 394 (2016) 182–196 185 are form it is given by ~ ~ ~ n1 ¼ n1 þ1; n2 ¼ n2 1; n3 ¼ n3: ð18Þ XdðnÞ V ðt þ 1Þ ¼ V ðtÞM : ð28Þ The associated elements in the transition matrix M are given by j i ij ij i ¼ 1 the product of the probability of reproducing one genotype AA and the removal of one genotype Aa The above-mentioned equation (28) is to be applied recursively for t ¼ 0; 1; 2; …, starting with the founder population t¼0. Since we M ¼ P Q : ð19Þ ij 1 2 will choose a specific founder population, the initial vector V ð0Þ (vi) Genotype Aa replaces genotype AA: This is the case where will by definition only have one nonzero component, according to 8 the relationships between the numbers of output and input gen- < ; ð0Þ; ð0Þ ; 1 i ¼ðn2 n3 Þ otypes are V ð0Þ ¼ ð29Þ i : ; a ð0Þ; ð0Þ : ~ ~ ~ 0 i ðn2 n3 Þ n1 ¼ n1 1; n2 ¼ n2 þ1; n3 ¼ n3: ð20Þ The associated elements in the transition matrix are given by the product of the probability of reproducing one genotype Aa and the removal of one genotype AA 3. Statistics obtained from the Moran model

Mij ¼ P2Q 1: ð21Þ The exact is represented in our Markov chain (vii) All genotype numbers AA, Aa, aa remain unchanged: This is computations. We start with choosing a constant population size the case where the relationships between the numbers of output of n diploid individuals, and we choose the probability s of self- and input genotypes are fertilization. Next we choose the founder population t¼0tobe ~ ; ~ ; ~ ; n1 ¼ n1 n2 ¼ n2 n3 ¼ n3 ð22Þ composed of the numbers of ðn1; n3; n3Þ individuals of the geno- types ðAA; Aa; aaÞ, respectively. Thereby an initial population dis- which means that the input and output population numbers are tribution vector V ð0Þ with only one nonzero component (29) is equal. The associated elements along the main diagonal in the fi transition matrix are given as speci ed. At each time step t we calculate recursively the inherited : Mij ¼ P1Q 1 þP2Q 2 þP3Q 3 ð23Þ population distribution vector V ðt þ 1Þ from its parental distribution vector V ðtÞ. The value of t does not represent a generation, as in the The sum of all these seven expressions for Mij is one, when i is fixed and j varies. This is a constraint that can be used as a check. Wright–Fisher model, but a much smaller time unit, since only one All combinations of i and j that do not belong to one of these seven individual out of n individuals is exchanged at each step. Actually categories will give Mij ¼ 0. there are no distinct generations in the Moran model, since there is only a slow gradual replacement of parental individuals with 2.2. The transition matrix their offspring. Through the common diffusion approximation we can link a Moran model to its corresponding Wright–Fisher model, It is helpful to combine the seven formulas above into one see below. ðtÞ common expression that is valid for all components in the tran- The population distribution vector V contains information sition matrix of the Markov chain. In order to do this, we introduce which may be summarized in a number of ways. We are interested f ðtÞ ðtÞ; ðtÞ; ðtÞ the following notation in the average genotype frequencies ¼ðf 1 f 2 f 3 Þ at time t for the three genotypes ðAA; Aa; aaÞ, respectively. These average gen- δðα; βÞ¼δαβ ð24Þ otype frequencies at a given time step t are given by for the Kronecker delta δαβ, so that δðα; βÞ¼1 when α ¼ β and XdðnÞ ; ; δ α; β αaβ ðtÞ ðn1 n2 n3Þ ðtÞ ð Þ¼0 when . The transition matrix can then be f ¼ V ; ð30Þ n i expressed by one formula i ¼ 1 δ ~ ; δ ~ ; δ ~ ; Mij ¼ P2Q 3 ðn1 n1Þ ðn2 n2 þ1Þ ðn3 n3 1Þ where the values of n1, n2 and n3 have to vary with i in this sum, in ~ ~ ~ þP3Q 2δðn1; n1Þδðn2; n2 1Þδðn3; n3 þ1Þ a way specified by the relationship (3) or its inverse relationship δ ~ ; δ ~ ; δ ~ ; ðtÞ þP1Q 3 ðn1 n1 þ1Þ ðn2 n2Þ ðn3 n3 1Þ (8). The frequency f 2 of the heterozygous genotype Aa deserves δ ~ ; δ ~ ; δ ~ ; þP3Q 1 ðn1 n1 1Þ ðn2 n2Þ ðn3 n3 þ1Þ special attention, since it is a distinguishing feature of the diploid ~ ~ ~ model. Only when s ¼ 1=n, the diploid model degenerates to þP1Q 2δðn1; n1 þ1Þδðn2; n2 1Þδðn3; n3Þ ~ ~ ~ having a haploid gamete pool, where heterozygosity is no longer a þP2Q 1δðn1; n1 1Þδðn2; n2 þ1Þδðn3; n3Þ ~ ~ ~ property of the model. It is shown in the appendix that þðP1Q 1 þP2Q 2 þP3Q 3Þδðn1; n1Þδðn2; n2Þδðn3; n3Þ: ð25Þ ðtÞ 1s = A probability distribution vector over all possible population ¼ ð0Þð ð0ÞÞ t ðnneÞ f 2 2x 1 x s e compositions of time steps t ¼ 0; 1; ‥ is given by the vector 1  2 2 3 V ðtÞ ¼ V ðtÞ; …; V ðtÞ ; ð26Þ 1 dðnÞ 6 1s 7 þ4Q ð0Þ 2xð0Þð1xð0ÞÞ 5e ð1 s=2Þt=n 2 s where 1 2 hi  ðtÞ ¼ ð ðtÞ; ðtÞÞ¼ð ; Þ ; ¼ ; ; …; ð Þ; ð Þ s V i P n2 n3 n2 n3 i 1 2 d n 27 ð1sÞ 3 hi 1 s ð0Þ ð0Þ 2 t=ðnneÞ ð1 Þt=n þ x ð1x Þe e 2 d(n) is the population number of formula (4), and ðn ; n Þ is related n s 2 s 1 2 3 1 1 to i as in (3). We may refer to (26) as the population distribution 2 2 ne vector at time t. It represents the probabilities of each of the dif- 1 ; ferent populations after t steps of random mating. The Markov þoðn Þ ð31Þ chain algebra produces the output population distribution vector where V ðt þ 1Þ from the input population distribution vector V ðtÞ by the ðtÞ ðtÞ ðt þ 1Þ ðtÞ ðtÞ 1 Kolmogorov–Chapman equation V ¼ V M. In component x ¼ 2 Q 2 þQ 3 186 O. Hössjer, P.A. Tyvand / Journal of Theoretical Biology 394 (2016) 182–196 is the frequency of allele a at time t, that  s 2 sð1sÞ ne ¼ n 1 ð32Þ Q ðxÞ¼ð1sÞð1xÞ þsð1xÞxð1xÞ ; 2 1 2s fi 1s is the effective population size for sel ng, see Pollak (1997), ð Þ¼ ð Þ ; Q 2 x 2x 1 x = Nordborg and Donnelly (1987) and Hössjer et al. (2015), and 1s 2 oðn 1Þ a remainder term of smaller order than n1. The first term sð1sÞ Q ðxÞ¼ð1sÞx2 þsxxð1xÞ : ð37Þ on the right-hand side of (31) will dominate when t is large, so 3 2s = that exp½1 ðnneÞ is the multiplicative rate by which the expec- It follows from (37) that one single parameter x suffices to ted heterozygosity decreases per time step. We will interpret n as describe the system when the population is large, and an average generation length, since only one randomly chosen fðQ 1ðxÞ; Q 2ðxÞ; Q 3ðxÞÞ; 0oxo1g constitutes an equilibrium curve individual is replaced at each time point, and therefore the of genotype frequencies that equal Hardy Weinberg propor- expected life length of each individual is n time steps. The more tions in the absence of selfing (s¼0). Inserting (37) into (35) we fi sel ng there is, the smaller is ne, and the faster is the decay rate get a large population fixed point expð1=neÞ of heterozygosity per generation, 2xð1xÞQ ðxÞ s The standard deviation for the frequency of genotype α ¼ 1; 2; 3 f fix ¼ 2 ¼ ð38Þ IS ð Þ is given by 2x 1 x 2 s vffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi u of the fixation index if 0oxo1. It only depends on the selfing uXdðnÞ 2 ðtÞ t nα ðtÞ ðtÞ 2 rate, not on the allele frequency x. σα ¼ V ðf α Þ ð33Þ n i i ¼ 1 4.2. Diffusion approximations ϕðtÞ ϕðtÞ at time point t. We also introduce the probabilities A and a for fi xation at time t in the gametes A and a respectively. These are In this section we derive a large population approximation of given by the genotype probabilities V ðtÞ that relies on a diffusion fluctuation P i ðtÞ ðtÞ 3 ðtÞ fi ϕ ; around (37). Since α ¼ 1 Q α ¼ 1, it suf ces to consider the last A ¼ V 1 two genotypes α ¼ 2; 3. We first rewrite (27) in terms of the fre- ϕðtÞ ¼ V ðtÞ : ð34Þ a dðnÞ quencies of these two genotypes; Under Hardy–Weinberg (HW) equilibrium the frequency of Aa ðtÞ ðtÞ ; ðtÞ ð0Þ π ; ð0Þ π ; V i ¼ PðQ 2 ¼ q2 Q 3 ¼ q3 j Q 2 ¼ 2 Q 3 ¼ 3Þ ð39Þ would be 2xðtÞð1xðtÞÞ. In order to quantify departure from random ; ; = mating in terms of HW-equilibrium, we use the fixation index where i and ðq2 q3Þ¼ðn2 n3Þ n are related through (3). Conse- 8 quently, the genotype frequencies are defined on a lattice > ðtÞ ðtÞ ðtÞ  < 2x ð1x ÞQ 2 ðtÞ ðtÞ ; 0ox o1; 1 n1 f ¼ 2xðtÞð1xðtÞÞ ð35Þ q ; q A 0; ; …; ; 1 ; IS :> 2 3 n n NaN; xðtÞ Af0; 1g; r þ r ðtÞ and satisfy 0 q2 q3 1. We will treat two cases separately, where NaN (Not a Number) means that f is undefined. ; IS depending on whether ðq2 q3Þ corresponds to an absorbing state ðtÞ or not. That is, we give separate approximations of V i depending on whether any of the two alleles A or a has been fixed in the 4. Approximations for large populations whole population at time t or not.

4.1. Genotype frequency equilibrium curve 4.2.1. Non-absorbing states If ðq ; q Þ2fð= 0; 0Þ; ð0; 1Þg is a nonabsorbing state we approximate ðtÞ ðt þ 1Þ ~ 2 3 Recall that nα ¼ nα and nα ¼ nα for α ¼ 1; 2; 3 denote the (39) by three genotype counts at time points t and t þ1. Since n~ α can at ðtÞ 2 ðtÞ ð0Þ ð ; j π ; π Þ: most differ from nα by one unit, the genotype counts will have a V i n f Q j Q q2 q3 2 3 systematic drift Q ðtÞ ¼ð ðtÞ; ðtÞÞ  This is accurate for large n, viewing Q 2 Q 3 as a two- dimensional absolutely continuous random vector with condi- Enα j n1; n2; n3 nα ¼ Pnα ¼ nα þ1 Pnα ¼ nα 1 tional density function f ðtÞ ð0Þ for t 40, given the starting geno- : Q j Q ¼ Pα Qα ð0Þ type configuration Q ¼ðπ2; π3Þ. In order to find this density, we A genotype frequency fixed point is obtained by putting Pα ¼ Q α need to look a two time scales simultaneously. The first one α ; ; for ¼ 1 2 3. We will derive the large population limit of this t fi T ¼ ð40Þ xed point, assuming that the two parents, in the case of no n selfing, are drawn with replacement in step (i) of the reproduction is a generation counter, and the second one cycle between time points t and t þ1. This corresponds to a n ¼1 t limit in (1), with simpler formulas τ ¼ ð41Þ nn n ¼1 = 2 = ðtÞ 2 ðtÞ 1 ; e P1 ¼ ð1sÞðQ 1 þQ 2 2Þ þsðQ 1 þQ 2 4Þ¼ð1sÞð1x Þ þsð1x Þ4 sQ 2 is the time scale (31) for genetic drift. It turns out that allele fre- n ¼1 ðtÞ ðtÞ 1 ðtÞ; quencies change according to the slower time scale τ, and it is P2 ¼ 1Q 1 Q 3 ¼ 2ð1sÞx ð1x Þþ2 sQ 2 therefore convenient to introduce the time transformed allele frequency process n ¼1 2 ðtÞ 2 ðtÞ = = 1 ðnneτÞ P3 ¼ ð1sÞðQ 3 þQ 2 2Þ þsðQ 3 þQ 2 4Þ¼ð1sÞðx Þ þsx 4 sQ 2 XðτÞ¼x : ð42Þ ð36Þ A process that changes more quickly, on time scale T, is the excess ðtÞ ðtÞ fi for all three offspring probabilities. Let QαðxÞ¼Qα ¼ Pα be the fraction Q 2 Q 2ðx Þ of heterozygots compared to an in nitely large population fixed points of the frequencies of genotypes α ¼ large population with a given selfing rate s. This excess fraction pffiffiffi 1; 2; 3 when the allele frequency is xðtÞ ¼ x. By solving (36),wefind gets smaller as n increases, at rate 1= n. We introduce the O. Hössjer, P.A. Tyvand / Journal of Theoretical Biology 394 (2016) 182–196 187 standardized heterozygosity excess process function pffiffiffi 0 1 YðTÞ¼ n½Q ðnTÞ Q ðxðnTÞÞ; ð43Þ cðxÞ 2 2 2 B y pffiffiffi C 1 B n C which has a non-vanishing limit for large populations. f τ ðyj xÞ¼pffiffiffiffiffiffi expB C ð51Þ YðTÞjXð Þ πσ @ 2σ2ðxÞ A With these definitions, we can rewrite the genotype fre- 2 ðxÞ quencies as ffiffiffi ffiffiffi p ðtÞ p is normal with mean cðxÞ= n and variance Q ¼ Q 1ðÞXðτÞ YðTÞ=ð2 nÞ; 1 pffiffiffi ðtÞ τ = ; 2 bðxÞxð1xÞ Q ¼ Q 2ðÞþXð Þ YðTÞ n σ ðxÞ¼ : ð52Þ 2 pffiffiffi 2s Q ðtÞ ¼ Q ðÞXðτÞ YðTÞ=ð2 nÞ; ð44Þ 3 3 In the appendix we use weak convergence results of Norman with T and τ as in (40) and (41). Denote the last two components (1975), Ethier and Nagylaki (1980, 1988) and Ethier and Kurz of (44) as Q ðtÞ ¼ GðÞXðτÞ; YðTÞ , where G : R2-R2 is nonlinear in X (1986) in order to motivate that the joint distribution of the two but linear in Y. Using transformation rules for probability densities, processes (42) and (43) converge to (47) and (51) on the local time we verify scale as n-1. pffiffiffi It follows from (38), (44) and (51) that the fixation index (35) is f ðtÞ ð0Þ ðq ; q j π ; π Þ nf τ ðxj pÞf τ ðyj xÞð45Þ Q j Q 2 3 2 3 Xð Þj Xð0Þ YðTÞj Xð Þ approximately normally distributed ! in the appendix, where p ¼ π =2þπ is the initial frequency of 2 3 s cðxÞ 1 σðxÞ2 1 allele a, and f ðtÞ j XðτÞ¼x N ; IS 2s 2xð1xÞ n ð2xð1xÞÞ2 n ÀÁpffiffiffi 0 1 ð ; Þ¼G 1ð ; Þ¼ = þ ; ½ ð Þ  x y q2 q3 q2 2 q3 n q2 Q 2 x s B ð1sÞ 3 C B s 2 1 bðxÞ 1C the inverse transformation from genotype frequencies to allele ¼ N@  ; A; 2s s 3 n 4ð2sÞxð1xÞ n frequency and heterozygosity excess. It is also shown in the 21 appendix that 2 Âà ð53Þ EXðτþδÞjXðτÞ¼x ¼ x; Âà conditionally on an allele frequencies before fixation ð0oxo1Þ.Of Var XðτþδÞjXðτÞ¼x ¼ xð1xÞδþoðδÞ; ð46Þ particular interest is the case of no selfing s¼0. Then as the time increment δ-0, so that XðτÞ satisfies the same sto- bðxÞ¼8xð1xÞ, so that the fixation index, to a first approximation, chastic differential equation as the haploid Wright–Fisher and has a distribution  Moran models. The Kolmogorov forward equation gives a partial 3 1 f ðtÞ j XðτÞ¼x N ; ð54Þ differential equation associated with (46), for the allele frequency IS 2n n density. Kimura (1955) derived the solution of this equation in terms of an infinite series independently of the allele frequency x. X 1 λ τ=2 f τ ðxj pÞ¼pð1pÞ mðmþ1Þð2mþ1Þe m Xð ÞjXð0Þ m ¼ 1 4.2.2. Absorbing states ; ; ; ; ; ; ; ; A ; ; 2F1ð1m mþ2 2 pÞ2F1ð1m mþ2 2 xÞ ð47Þ The probabilities (39) for the two absorbing states ðq2 q3Þ fð0 0Þ ð0; 1Þg correspond fixation of A and a respectively, as defined in (34). see also Section 8.4 of Crow and Kimura (1970). Here λ ¼ mðmþ m Reaching any of these two states is equivalent to the allele frequency 1Þ is the mth eigenvalue of the partial differential operator that process xðtÞ hitting any of its two absorbing barriers 0 or 1. In Section transforms f τ ðxj pÞ to 2∂f τ ðxj pÞ=∂τ, and F is the Xð Þj Xð0Þ Xð Þj Xð0Þ 2 1 4.2.1 we found that the allele frequency process has the same diff- hypergeometric function, defined in detail for instance in Abra- usion limit as the haploid Wright Fisher and Moran models, on a time mowitz and Stegun (1964). scale (41).Thefixation probability for this process at the upper The diffusion representation of the Y-process is conditional on boundary 1 is the allele frequency process X. It is proved in the appendix that  Φðτj pÞ¼PðXðτÞ¼1j Xð0Þ¼pÞ ÂÃs cðxÞ X δ τ ; ffiffiffi δ δ ; 1 λ τ= EYðT þ ÞYðTÞjXð Þ¼x YðTÞ¼y ¼ 1 y p þoð Þ ¼ pþpð1pÞ ð2mþ1Þð1Þme m 2 ÂÃ2 n m ¼ 1 ; ; ; ; Var YðT þδÞjXðτÞ¼x; YðTÞ¼y ¼ bðxÞxð1xÞδþoðδÞð48Þ 2F1ð1m mþ2 2 pÞ ð55Þ as δ-0, with see for instance Sections 8.4 and 8.8 of Crow and Kimura (1970),or 2 3 McKane and Waxman (2007). Since the two alleles are treated sym- 3 ð1sÞ s1 metrically, this yields approximations 2ð1sÞ 6 2 7 bðxÞ¼61þð2x1Þ2 7; ð49Þ 2 4 5 ðtÞ ϕðtÞ Φ τ ; 1 1 V 1 ¼ A ð j 1pÞ 1 s 1 s 2 2 ðtÞ ϕðtÞ Φ τ V dðnÞ ¼ a ð j pÞð56Þ and fi  of the exact xation probabilities at both boundaries at any time s ð1sÞ 3 point t. cðxÞ¼xð1xÞ 2 : ð50Þ s 3 1 2 5. Continuous time Moran model We notice from (48) that the Y-process is mean-reverting, with a pffiffiffi systematic drift towards cðxÞ= n. Since this number converges to It is possible to define a continuous time version of the diploid 0asn grows, a large population drifts Y back to 0 and hence tries Moran model with time parameter T Z0. Individuals are born as in to maintain heterozygosity balance (37). For large populations, the Section 2. But this does not happen at equidistant time points, but X-process changes much more slowly than the Y-process, so that x rather according to a Poisson process with rate n, at time points becomes a time invariant constant in (48). This corresponds to a 0oTð1Þ oTð2Þ o⋯ . This implies in particular that TðtÞ Tðt 1Þ are stationary Ornstein–Uhlenbeck process, whose marginal density independent and exponentially distributed random variables with 188 O. Hössjer, P.A. Tyvand / Journal of Theoretical Biology 394 (2016) 182–196 mean 1=n. Since the expected number of generations t before an frequencies, we define the Moran model generation number as individual dies is n, the expected life time is n 1=n ¼ 1. The con- r T ¼ ; tinuous time Moran model can be represented as a Markov pro- 2 cess and let hi ðTÞ; ðTÞ ðTÞ; ðTÞ pffiffiffi ðn2 n3 Þ¼nðQ 2 Q 3 Þ ð2TÞ ð2TÞ YðTÞ¼ n Q 2 Q 2ðx Þ ð58Þ for T Z0, where nðTÞ ¼ nðtÞ is the genotype frequency of Aa for 2 2 be the time transformed and normalized excess of heterozygots TðtÞ rT oTðt þ 1Þ, with Tð0Þ ¼ 0. Similarly, nðTÞ ¼ nðtÞ is the genotype 3 3 process at time T. An important distinction from the Moran model frequency of aa during the same time interval. This implies in is that T Af0; 1=2; 1; 3=2; …g is defined on the same lattice for each particular that the discrete time Moran model ðnðtÞ; nðtÞÞ, 2 3 n, and it is not a continuous parameter in the limit of large t ¼ 0; 1; 2; …, is the embedded Markov chain of the continuous populations. time Moran model, obtained by recording values of the latter Notice that the Aa and aa genotype frequencies ðQ ðrÞ; Q ðrÞÞ of process after each jump. 2 3 generation r still satisfy (44), provided we change the Moran time Since the number of births from 0 to T is Poisson distributed step index t to r on the left-hand side of this equation. Likewise, with mean nT, it follows from the theory of Markov processes (see the joint distribution of these two genotypes is approximated by for instance Grimmett and Stirzaker, 2001) that the transition (45). In order to find this approximate distribution, we prove in matrix over a time interval of length T has entries  the appendix that XðτÞ satisfies the same diffusion equation (46) as ðTÞ; ðTÞ ð0Þ; ð0Þ for the Moran model, with an asymptotic distribution given by PijðTÞ¼P ðn2 n3 Þ¼jjðn2 n3 Þ¼i (47). The distribution of (58) is on the other hand different from X1 ðnTÞk ¼ e nT MðkÞ ¼ expðtAÞ ; (51). We prove in the appendix that asymptotically for large k! ij ij k ¼ 0 population, (58) is an autoregressive process   r ; r ðkÞ Mk 1 c~ðxÞ s c~ðxÞ for 1 i j dðnÞ. Here Mij denotes element (i,j)of , and YTþ pffiffiffi ¼ YðTÞ pffiffiffi þεðTÞ; ð59Þ A dðnÞ 2 n 2 n ¼ðAijÞi;j ¼ 1 is the intensity matrix, with elements 8 of order 1, with a bias term that involves < nM ; jai; Pij  A ¼ ; ; s ij : Aik ¼n j ¼ i ð1sÞ 2 k;k a i c~ðxÞ¼xð1xÞ 2 ; ð60Þ s 3 1 that equal the rates between all pairs of different states, and minus 2 ÀÁ the rate to leave states along the diagonal. and an innovation term εðTÞN 0; bðxÞxð1xÞ=2 that is normally By the for the Poisson distribution, the distributed conditionally on XðτÞ¼x, with b(x)asin(49). Com- number of births up to time T is nTð1þoð1ÞÞ when n is large. The bining (59) with time theory analysis (see for instance Brockwell ðTÞ; ðTÞ; ðTÞ triplet ðQ 1 Q 2 Q 3 Þ of genotype frequencies for the continuous and Davis, 1991), we find that the normal marginal distribution time Moran model will therefore be well approximated by the 0 1 τ c~ðxÞ 2 right-hand side of (44), with Xð Þ and Y(T)asin(47) and (51), and B y pffiffiffi C ðTÞ B C τ ¼ = ð ; ffiffiffiffiffiffi1 n T ne. Moreover, the joint distribution of the frequencies Q 2 f YðTÞj XðτÞ¼xðyj xÞ¼p expB C ðTÞ 2πσ~ ðxÞ @ 2σ~ 2ðxÞ A Q 3 Þ of genotypes Aa and aa, is well approximated by the right- hand side of (45). of Y(T) conditionally on the allele frequency process, with variance ε τ ~ 2 Var½ðTÞjXð Þ¼x bðxÞxð1xÞ 6. Monoecious diploid Wright–Fisher model σ ðxÞ¼  ¼ : ð61Þ s 2 s2 1 2 2 2 For the monoecious diploid Wright–Fisher model we let Comparing (60) and (61) with (50) and (52), we notice that c~ðxÞ ð Þ ð Þ ð Þ ð Þ ð Þ ð Þ r ; r ; r r ; r ; r ~ 2 2 ðn0 n1 n2 Þ¼nðQ 0 Q 1 Q 2 Þ ocðxÞ and σ ðxÞrσ ðxÞ, with equality if and only if s¼0. This implies that the frequency of heterozygots Aa on average departs be the number of AA, Aa and aa genotypes in generation less from the stable fixed point (37) for the WF model than for the r ¼ 0; 1; 2; …. This is a discrete time Markov chain with the same Moran model. state space (2) as for the Moran model. If coding (3)–(5) is used for For the fixation index we obtain an approximate normal dis- these states, we get multinomial transition probabilities tribution n! ~ ~ ~ 0 1 M ¼ Pn1 Pn2 Pn3 ; ð57Þ  ij n~ !n~ !n~ ! 1 2 3 s 1 2 3 B ð1sÞ 2 C ð Þ B s 1 bðxÞ 1C ðrÞ ~ ðr þ 1Þ f t j XðτÞ¼x NB 2 ;  C; with Pα as in (1), nα ¼ nα and nα ¼ nα , see Tyvand (1993).In IS @2s s 3 n s2 nA 21 42 xð1xÞ the haploid case, it is well-known that one WF generation corre- 2 2 sponds to half a Moran generation. We will prove that the same is true in the diploid case. To this end, we introduce a time scale similarly as in (53). This expression simplifies a lot when there is no selfing (s¼0). Then the fixation index process has independent r τ ¼ and identically distributed components 2ne  1 1 of genetic drift, let xðrÞ ¼ Q ðrÞ=2þQ ðrÞ be the frequency of allele a in f ðrÞ j XðτÞ¼x N ; ð62Þ 2 3 IS n n WF generation r, and τ for r ¼ 1; 2; …, independently of the allele frequency x. This Levene XðτÞ¼xð2ne Þ effect can more easily be obtained by a direct argument, see for the corresponding time transformed process. In order to model instance Levene (1949) and Section 2.10 of Crow and Kimura oscillations around the equilibrium curve (37) of genotype (1970). It confirms a well-known fact that the fraction of O. Hössjer, P.A. Tyvand / Journal of Theoretical Biology 394 (2016) 182–196 189

fi ð0Þ; ð0Þ : ; : Fig. 1. Simulation results for a diploid Moran population of size n¼40 with no sel ng (s¼0) that starts with genotype frequencies ðQ 2 Q 3 Þ¼ð0 5 0 25Þ. Upper left is a ðtÞ; ðtÞ ; …; ; ; ; r r scatterplot from one ðQ 2 Q 3 Þ, t ¼ 1 N ¼ 10 000, together with the limit (solid) curve fðQ 2ðxÞ Q 3ðxÞÞ 0 x 1g for large populations. The remaining three ðtÞ ðtÞ= ðtÞ subplots give aggregated results from ten time series of length N, conditionally on the allele frequency process x ¼ Q 2 2þQ 3 ¼ x. The upper right subplot depicts the average of ðQ ðtÞ; Q ðtÞÞ for each x. The two lower subplots show for each x estimates of the mean and standard deviation of the standardized heterozygosity excess process 2 3 pffiffiffi ðtÞ Yst ¼ YðTÞ=σðx Þ, with Y(T)asin(43). The corresponding predictions from the diffusion approximation in (51) are cðxÞ=ð nσðxÞÞ (thick line, left) and 1 (thin line, right).

fi ð0Þ; ð0Þ : ; : Fig. 2. Simulation results for a diploid Moran population of size n¼40 with sel ng probability s¼0.8 that starts with genotype frequencies ðQ 2 Q 3 Þ¼ð0 15 0 4Þ. See Fig. 1 for a description of the four subplots. heterozygots of a randomly mating WF population has a slight frequencies over 10 realizations follow the curve closely. The bias and upward bias compared to HW equilibrium. standard deviation of this distance are estimated in the lower sub- plots, and they agree well with the mean and standard deviation of the normal distribution in (51). 7. Numerical results Fig. 4 illustrates the accuracy of the diffusion approximation for the allele frequency distribution at interior points (47) and at the two We will show some numerical results from computing the boundaries (56) when the initial allele frequency p¼0.5. It can be exact Markov chain of the diploid discrete time Moran model, as seen that the approximation is good already for n¼40, and even well as the diffusion approximation. better for n¼80. Notice also that the conditional distribution of the Scatterplots of ðQ ðtÞ; Q ðtÞÞ are displayed in the upper left graphs of 2 3 fi Figs. 1–3, for two population sizes n¼40 and n¼100, and two selfing allele frequency distribution conditional on non- xation is close to rates s¼0ands¼0.8. The distance between the realized genotype uniform when τ¼2, but not for τ¼0.5. The reason is that the first - ; ; ; fi frequencies and the equilibrium curve (37) is sometimes quite sub- eigenfunction x 2F1ð0 3 2 xÞ in (47) is uniform, and its coef cient τ=2 ; ; ; : fi τ stantial for the smaller population, although the average genotype 6e 2F1ð0 3 2 0 5Þ will dominate the in nite series when ¼2. 190 O. Hössjer, P.A. Tyvand / Journal of Theoretical Biology 394 (2016) 182–196

fi ð0Þ; ð0Þ : ; : Fig. 3. Simulation results for a diploid Moran population of size n¼100 with no sel ng (s¼0) that starts with genotype frequencies ðQ 2 Q 3 Þ¼ð0 15 0 4Þ. See Fig. 1 for a description of the four subplots.

Fig. 4. Exact (crosses) and approximate (dots) distribution of allele frequency x for a diploid Moran model of size n with no selfing (s¼0) at standardized time τ,defined in = ; …; ð2nτÞ (41). It is assumed that x ¼ v ð2nÞ, with v ¼ 0 2n. The density f XðτÞðxÞ in (47) that solves the diffusion equation, and the standardized probability 2nPðx ¼ xÞ, are plotted for interior allele frequencies, whereas the approximate and exact fixation probabilities (56) are displayed at the two boundary points xAf0; 1g.

In order to check the accuracy of the diffusion approximation of Without this correction, the ratio deviates much more from unity the distribution of YðTÞjXðτÞ, we plot the ratio of (results not shown). pffiffiffi ^ n f YðTÞjXðτÞðyj xÞ¼ 2 y y 8. Conclusions P ðnðtÞ; nðtÞÞ¼nQðxÞþpffiffiffi ; Q ðxÞpffiffiffiffiffiffi 2 3 2 n 3 2n ; ð63Þ PnðtÞ þ2nðtÞ ¼ 2nx In this paper we introduced an exact monoecious diploid 2 3 Moran model and compared it with the monoecious diploid and (51) in Fig. 5, for different values of x and s, when n¼40. The Wright–Fisher model. We also derived novel diffusion approx- first term on the right-hand side of (63) takes into account that imations of both models on two time scales, where the more pffiffiffi Yj XðτÞ¼x varies on a lattice of width 2= n when x is fixed. It is slowly varying allele frequency process is asymptotically the same seen from the figure that the ratio of the two densities is quite for both models, whereas the more rapid oscillations of genotype close to 1, indicating that the diffusion approximation is accurate. frequencies around an equilibrium curve, are described by differ- pffiffiffi It is crucial in this context to include in (51) the bias term cðxÞ= n. ent stochastic processes for the two models. O. Hössjer, P.A. Tyvand / Journal of Theoretical Biology 394 (2016) 182–196 191

τ =σ τ Fig. 5. Ratio of exact and asymptotic approximations, (63) and f YðTÞj XðτÞðyj xÞ, of the conditional distribution of Y(T) given Xð Þ. It is plotted as a function of Yst ¼ YðTÞ ðXð ÞÞ τ fi τ fi ð0Þ; ð0Þ when Xð Þ¼x is xed. The population size is n¼40 in all four subplots, the standardized time in (41) is ¼ 2, and the starting con guration ðQ 2 Q 3 Þ of the genotype frequencies of Aa and aa,isð0:5; 0:25Þ in the upper subplots (s¼0) and ð0:325; 0:325Þ in the lower subplots (s¼0.5).

The results of this paper can be extended in several ways. First, frequencies of males and females. It is well-known that this pro- it is possible to define a more general hybrid model that includes cess has small oscillations around a one-dimensional curve of the Moran model and the Wright–Fisher model as special cases. Hardy–Weinberg proportions when there is no selfing (Ethier and Such a hybrid model can have a given number of n diploid indi- Nagylaki, 1980, 1988). The position along this curve depends on viduals, and in each generation it lets a number m of offsprings be the allele frequency, and the allele frequency process is asympto- generated. At the next time step, m diploid individuals are elimi- tically the same as for a haploid model (46)–(47). We conjecture nated and replaced by the m newly formed offspring. The present that the genotype frequency oscillations around this curve follow a Moran model represents m¼1, and if all offspring choose their multivariate OU-process for the Moran model and multivariate parents independently, the diploid Wright–Fisher model corre- sponds to m¼n. But it is also possible with more skewed offspring autoregressive process for the WF model, on a local time scale. distributions, where, for instance, one single couple is the parent Fourth, it is of interest to work out the dynamics of a biallelic of all m offspring. gene in a haploid subdivided population with non-overlapping When s¼1/n, we get a haploid hybrid model where 2m generations and strong migration between the S subpopulations. gametes are replaced in each generation. If the new gametes are Nagylaki (1980) has shown that a reproductively weighted (Fisher, chosen randomly with replacement from the previous generation 1958) average of the subunits allele frequencies gives a global and m¼n, we get an ordinary haploid Wright Fisher model of size allele frequency that determines genetic drift according to (46)– ðtÞ ðtÞ 2n. A proper rescaling of the number of a-alleles n2 þ2n3 at time (47), on a slowly varying time scale determined by the variance ; ; ; … t ¼ 0 1 2 for this haploid model converges to the same effective population size. We conjecture that the S-dimensional asymptotic diffusion limits (47) and (55) as for the diploid allele vector of local allele frequencies for all subpopulations will oscil- frequency process, for any value of m. On the other hand, if all 2m late around the line fðx; …; xÞ; 0oxo1g on a local time scale, as a gametes have the same parents, we obtain the model of Eldon and multivariate autoregressive process of order 1. An implication of Wakeley (2006). These haploid reproduction scenarios are all fi instances of the exchangeable Cannings model of a homogeneous this result would be that the xation index fST of Wright (1943), fi population (Cannings, 1974). which quanti es the amount of subpopulation differentiation, will Second, it is not immediately clear how the Moran model exhibit oscillations around a quasi-equilibrium fixed point should be generalized to take a varying population size into obtained explicitly in Hössjer et al. (2013) for the island model, account. The most natural extension is to allow the number of and by Hössjer and Ryman (2014) for more general models. individuals mðtÞ and qðtÞ that are born or die, to vary with time. If qðtÞ differs from mðtÞ, the population size will change from nðtÞ to nðtÞ þmðtÞ qðtÞ between time points t and t þ1. Third, we believe the diffusion approximation approach with Acknowledgements two time scales is even more important for higher dimensional processes. Whereas our genotype process is two-dimensional, fi exact Markov chain computation becomes infeasible for all but Ola Hössjer was nancially supported by the Swedish Research very small populations in higher dimensions. This includes the Council, Contract no. 621-2013-4633. The authors want to thank diploid two-sex Moran and WF models of a biallelic gene. It one anonymous reviewer for helpful comments that considerably requires four-dimensional processes, with separate genotype improved our paper, in particular the reference to Norman (1975). 192 O. Hössjer, P.A. Tyvand / Journal of Theoretical Biology 394 (2016) 182–196

Appendix A. Mathematical proofs and motivations point (37), so that

Pα ¼ QαðxÞþoð1Þ; A.1. Motivation of (45) Qα ¼ Q αðxÞþoð1Þ; ðA:8Þ

It follows from transformation rules of densities that for α ¼ 1; 2; 3asn-1. This gives ÀÁ ; π ; π J ; 1 ; Var vðt þ 1Þ j vðtÞ ¼ 2nx; wðtÞ ¼ w f Q ðtÞ j Q ð0Þ ðq2 q3 j 2 3Þ¼j ðx yÞj f XðτÞj Xð0Þ;Yð0Þðxj p y0Þ hi ; ; ðt þ 1Þ ðtÞ 2 ðtÞ ; ðtÞ f YðTÞj XðτÞ;Xð0Þ;Yð0Þðyj x p y0Þ ¼ E ðv v Þ j v ¼ 2nx w ¼ w pffiffiffi ¼ nf τ ; ðxj p; y Þ 2 2 Xð Þj Xð0Þ Yð0Þ 0 ¼ P2Q 3 þP3Q 2 þ2 P3Q 1 þ2 P1Q 3 þP1Q 2 þP2Q 1 ; ; f YðTÞj XðτÞ;Xð0Þ;Yð0Þðyj x p y0Þ ¼ 2Q ðxÞð1Q ðxÞÞþ8Q ðxÞQ ðxÞþoð1Þ pffiffiffi 2 2 1 3 ; : 4xð1xÞ nf XðτÞj Xð0Þðxj pÞf YðTÞj XðτÞðyj xÞ ðA 1Þ ¼ þoð1Þ; ðA:9Þ ffiffiffi 1 s p π 2 where y0 ¼ n½2 Q 2ðpÞ is the value of the heterozygous excess process at time 0. of (A.1) we used that the determinant of the as was to be proved. The last step of (A.9) follows after some Jacobian equals calculations from the definitions of Q αðxÞ in (37).Wefinally deduce the second part of (A.4) by averaging over wðtÞ, similarly as 2ð1sÞ 1 sð1sÞ ð Þ ð Þ þ ð Þ 1 2x = 2 1 s x s 1 2x = in (A.7).□ dGðx; yÞ 1s 2 2 1s 2 1 j Jðx; yÞj ¼ ¼ ¼ pffiffiffi; dðx; yÞ 1 1 pffiffiffi pffiffiffi n n 2 n A.3. Proof of (48) and in the last step that asymptotically for large time points and n, We will prove that it is only the allele frequency p at time 0 that influences the joint  ÀÁs w C distribution of XðτÞ and Y(T). This will be motivated further in the Ewðt þ 1Þ wðtÞ j xðtÞ ¼ x; wðtÞ ¼ w ¼ 1 þ þoðn 1Þ; weak convergence proof below. ÀÁ2 n n Var wðt þ 1Þ j xðtÞ ¼ x; wðtÞ ¼ w ¼ bðxÞxð1xÞþoð1ÞðA:10Þ

A.2. Proof of (46) as n-1, with  1s ″ xð1xÞ 1s 2xð1xÞð1sÞ s C ¼ 2xð1xÞ Q ðxÞ ¼ xð1xÞ þ ≕C þC ¼ cðxÞ 1 Let 2s 2 s s s 2 1 2 2 21 1 1  2 2 2 t vðtÞ ¼ nðtÞ þ2nðtÞ ¼ 2nxðtÞ ¼ 2nX ðA:2Þ ðA:11Þ 2 3 nn e ″ a constant, Q ðxÞ refers to the second derivative of the middle and 2  equation in (37) and c(x) is the function defined in (50). In order to pffiffiffi t verify the first part of (A.10), we Taylor expand Q ðÞ around x and wðtÞ ¼ nðtÞ nQ ðxðtÞÞ¼ nY ðA:3Þ 2 2 2 n use (A.2)–(A.4) to deduce that ÀÁ refer to the number of a-alleles and excess number of Aa geno- ðt þ 1Þ ðtÞ ðtÞ ; ðtÞ ðt þ 1Þ ðtÞ ðtÞ ; ðtÞ Ew w j x ¼ x w ¼ w ¼ En2 n2 j x ¼ x w ¼ w fi types (compared to xed point (37)) at time point t. 1 ÀÁ n Q 0 ðxÞEvðt þ 1Þ vðtÞ j xðtÞ ¼ x; wðtÞ ¼ w It follows from (32) and (42) that Eq. (46) is equivalent to 2n 2 hi ÀÁ 1 ″ ðt þ 1Þ ðτÞ ðt þ 1Þ ðtÞ 2 ðtÞ ; ðtÞ Ev j v ¼ v ¼ v; n Q 2ðxÞE ðv v Þ j x ¼ x w ¼ w 2ð2nÞ2  C ÀÁ2 ÀÁ ¼ Enðt þ 1Þ nðtÞ j xðtÞ ¼ x; wðtÞ ¼ w þ 2 þoðn 1Þ ðt þ 1Þ ðtÞ ð2nÞ 1 4xð1xÞ 2 2 n Var v j v ¼ v ¼ 2nx ¼ xð1xÞþonne ¼ þoð1Þ nne 1s=2 C2 1 ¼ P2Q 3 P3Q 2 þP2Q 1 P1Q 2 þ þoðn Þ : n ðA 4Þ = C2 1 n 1 n1 s C2 1 ðtÞ ðtÞ ¼ P2 Q 2 þ þoðn Þ¼ ð1sÞ2xð1xÞ þ Q 2s Q 2 þ þoðn Þ as n-1. In order to prove (A.4),define Qα ¼ Qα and Pα ¼ Pα for n n1 2 n1 n α ; ;  ¼ 1 2 3, and recall that the genotype frequency transition n 1s w 1 s1 C ¼ð1sÞ2xð1xÞ þ 2xð1xÞ þ s1þ þ 2 þoðn 1Þ between time points t and t þ1 was divided into seven cases (i)– n1 1s=2 n 2 2ðn1Þ n  (vii) in Section 2.1. Only the first six of these, (i)–(vi), give a non- s w C C ¼ 1 þ 1 þ 2 þoðn 1Þ zero value of vðt þ 1Þ vðtÞ that equals either of 2; 1; 1; 2. Sum- 2 n n n  ðt þ 1Þ ðtÞ s w C ming over all possible v v , weighted by their probabilities ¼ 1 þ þoðn 1Þ to occur, we find that 2 n n ÀÁ ðt þ 1Þ ðtÞ ðtÞ ðtÞ Ev v j v ¼ v; w ¼ w ¼P2Q 3 þP3Q 2 þ2P3Q 1 2P1Q 3 P1Q 2 þP2Q 1 where in the fourth step we used (A.6), in the fifth step the defi- ¼ P3 P1 ðQ 3 Q 1Þ nition of P2 in (1) and in the sixth step we combined (37) and (A.3) ð1sÞðn þn þn Þðn n Þþðsn1Þðn n Þ n n ¼ 1 2 3 3 1 3 1 3 1 ¼ 0; ðA:5Þ to deduce that nðn1Þ n 1 s w: where in the second step we made some rearrangements of terms Q 2 ¼ 2xð1xÞ s þ 1 n and used 2 ; : P1 þP2 þP3 ¼ Q 1 þQ 2 þQ 3 ¼ 1 ðA 6Þ In order to verify the second part of (A.10), we rewrite the con- ðt þ 1Þ ðtÞ fi ditional variance of w w as and in the third step we inserted the de nitions of P1 and P3 in (1). hi ðtÞ ÀÁ Averaging over w we then find that Var wðt þ 1Þ wðtÞ j xðtÞ ¼ x; wðtÞ ¼ w ¼ E ðwðt þ 1Þ wðtÞÞ2 j xðtÞ ¼ x; wðtÞ ¼ w þoð1Þ ÀÁÂÃÀÁ ðt þ 1Þ ðtÞ ðtÞ ðt þ 1Þ ðtÞ ðtÞ ðtÞ () Ev v j v ¼ v ¼ EEv v j v ¼ v; w ¼ w ¼ 0: 2 1 ¼ Enðt þ 1Þ nðtÞ Q 0 ðxÞðvðt þ 1Þ vðtÞÞ j xðtÞ ¼ x; wðtÞ ¼ w þoð1Þ ðA:7Þ 2 2 2 2   2 2 In order to verify the second part of (A.4), we notice that the 1 0 1 0 ¼ðP1Q 2 þP2Q 1Þ 1 Q ðxÞ þðP1Q 3 þP3Q 1Þ 0þ2 Q ðxÞ genotype frequencies for large populations are close to the fixed 2 2 2 2 O. Hössjer, P.A. Tyvand / Journal of Theoretical Biology 394 (2016) 182–196 193

 2 1 0 a marginal density (A.14), and a transition density þðP2Q 3 þP3Q 2Þ 1þ Q 2ðxÞ þoð1Þ ÀÁÂÃ 2 ″ ″ ″ OUðU ; σ2ÞjOUðU0; σ2Þ¼y NygðU U0Þ; σ2 1g2ðU U0Þ ¼ P Q þP Q þP Q þP Q þðP Q þP Q P Q P Q ÞQ 0 ðxÞ 1 2 2 1 2 3 3 2 2 3 3 2 1 2 2 1 2 : 0 2 ðA 16Þ Q 2ðxÞ ÀÁ þðP1Q 2 þP2Q 1 þP2Q 3 þP3Q 2 þ4P1Q 3 þ4P3Q 1Þ þoð1Þ 0 o ″ s 4 for any pair U U of time points, with gðuÞ¼exp ð1 2Þu .It 0 ¼ Q 1ð1P2ÞþP2ð1Q 2Þþ½Q 2ðP3 P1ÞþP2ðQ 3 Q 1Þ Q 2ðxÞþ½2Q 2ðxÞð1Q 2ðxÞÞ corresponds to a diffusion solution (48) with n ¼1. In order to prove (A.15), we introduce a second time point τ0 0 ð Þ2 Q 2 x ¼ τþ = o o þ8Q 1ðxÞQ 3ðxÞ þoð1Þ U0 ne on the slowly varying time scale (41), with U0 U1 0 4 fixed. Then we introduce the process 4xð1xÞ Q 0 ðxÞ2  ¼ 2Q ðxÞð1Q ðxÞÞþ2Q ðxÞðÞQ ðxÞQ ðxÞ Q 0 ðxÞþ 2 þoð1Þ: 0 2 2 2 3 1 2 s pffiffiffi U 1 4 Z ð 0Þ¼ τ þ ðτ Þ ; ð þ 0Þ ð : Þ 2 n U n Xn 0 Xn 0 Yn T0 U A 17 ne ¼ bðxÞxð1xÞþoð1Þ; of allele frequency oscillations and heterozygous excess around τ0, fi 0 where in the rst step we used the upper part of (A.10), and in the on a time scale U ¼ U U0 Z0, with T0 ¼ τ0ne ¼ T þU0. We will second step the definition of wðtÞ in (A.3) and a first order Taylor prove functional weak convergence fi expansion of Q 2ðÞ. In the fth step we used (A.6), in the sixth step ÈÉ L ÈÉ fi Z ðU0Þ; U0 Z0 j X ðτ Þ¼x ; Y ðT Þ¼y ⟶ ZðU0Þ; U0 Z0 (A.8) and (A.9), and the last step, nally, follows after some com- n82 n 0 0 n 0 30 9 vffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi putations from the definition of b(x)in(49), by inserting formulas <> u => 6ux ð1x Þ 7 (37) for QαðxÞ and its derivative ¼ 4t 0 0 BðU0Þ; OUðU0; σ2ðx ÞÞ5; U0 Z0 j OUð0; σ2ðx ÞÞ ¼ y :> s 0 ;> 0 0 2ð12xÞð1sÞ 1 0 ð Þ¼ 2 Q 2 x s 1 ðA:18Þ 2 of this process as n-1, towards a limit process Z whose first for α ¼ 2. component involves a standard Brownian motion B and whose To finalize the proof of (48), we notice that its first part follows second component is an Ornstein–Uhlenbeck process (A.16), by combining (42), (A.3), the upper part of equation of (A.10), and defined on the non-negative real line. (A.11); ÂÃ Before proving (A.18), we will first show how to use it in order δ τ ; EYðT þ ÞYðTÞjXð Þ¼x YðTÞ¼y to verify (A.15). To this end, conditionally on Xnð0Þ¼p and Ynð0Þ 1 ÀÁpffiffiffi ¼ yð0Þ we rewrite the left-hand side of (A.15) as ¼ δn pffiffiffiEwðt þ 1Þ wðtÞ j xðtÞ ¼ x; wðtÞ ¼ ny þoð1Þ   n U U pffiffiffi X τþ ; Y ðT þUÞ ¼ X ðτÞþ X ðτþ ÞX ðτÞ ; Y ðT þUÞ 1 s ny cðxÞð1 sÞ n n n n n n n n ¼ δn pffiffiffi ð1 Þ þ 2 þoðn 1Þ þoð1Þ e e n 2 n n      U U0 U0 s cðxÞ ¼ XnðτÞþ Xn τ0 þ Xnðτ0ÞðXn τ0 ¼ 1 y pffiffiffi δþoð1Þ: ðA:12Þ ne ne 2 n X ðτ ÞÞ; Y ðT þU U ÞÞ n 0 nsffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi0 0 The second part of (48) follows similarly from the lower equation L τ τ 1ffiffiffi Xnð 0Þð1Xnð 0ÞÞ of (A.10).□ XnðτÞþp ½BðU U0ÞBðU0Þ ; n 1 s ÂÃ 2 ÂÃ OU U U ; σ2ðX ðτ ÞÞ OU 0; σ2ðX ðτ ÞÞ ¼ Y ðT Þ A.4. Proof of weak convergence of discrete time Moran process on 0 n 0 n 0 n 0 L ÀÃÂÁ ÂÃ local time scale 2 2 XnðτÞ; OU U U0; σ ðXnðτ0ÞÞ OU 0; σ ðXnðτ0ÞÞ ¼ YnðT0Þ

We start by indexing the a allele frequency process XðτÞ¼XnðτÞ L ÀÃÂÁ Âà 2 2 X1ðτÞ; OU U U0; σ ðX1ðτÞÞ OU 0; σ ðX1ðτÞÞ ¼ YnðT0Þ in (42), and the heterozygous excess process YðTÞ¼YnðTÞ in (43), by population size n. We will prove weak convergence L ÀÃÂÁ 2 X1ðτÞ; OU U U0; σ ðX1ðτÞÞ τ ; ; ⟶L τ ; : ÀÃÂÁ ðXnð Þ YnðTÞÞj Xnð0Þ¼p Ynð0Þ¼yð0Þ ðX1ð Þ Y1ÞðA 13Þ L 2 ¼ X1ðτÞ; OU U; σ ðX1ðτÞÞ ; ðA:19Þ when τ is fixed and n-1, and hence T ¼ τne-1. The first r r component X1ðτÞ on the right-hand side of (A.13) has a distribu- for all U1 U U2. In the third step of (A.19) we introduced the L tion (47) that depends on p but not on yð0Þ. The second component notation An Bn, which refers to two sequences An and Bn of has a normal distribution L random elements that have the same weak limit A, i.e. An⟶A and ÀÁ L 2 ⟶ fi Y1 j X1ðτÞ¼x N 0; σ ðxÞ ðA:14Þ Bn A. In the fth step of (A.19) we used that the allele frequency process converges weakly conditionally on the first, with a variance (52) that depends on x, L but on neither p nor yð0Þ. This distribution corresponds to the limit XnðτÞjXnð0Þ¼p; Ynð0Þ¼yð0Þ⟶X1ðτÞðA:20Þ of (51) as n-1. We will actually prove a result more general than at rescaled time point τ,asn-1. This follows from the weak (A.13); functional weak convergence  convergence theory of Ethier and Nagylaki (1980, 1988) and Ethier U and Kurz (1986). In the sixth step of (A.19) we used (A.16) and Xn τþ ; YnðT þUÞ ; U1 rU rU2 j Xnð0Þ¼p; Ynð0Þ¼yð0Þ ne chose U0 40 so large that conditioning on the value of OU at U0 L ÀÁÂà ; ⟶f X ðτÞ; OU U; σ2ðX ðτÞÞ ; U rU rU gðA:15Þ has no impact on its behaviour over the intervalÂýU1 U2. Finally, in 1 1 1 2 2 the last step of (A.19) we used that U-OU U; σ ðX1ðτÞÞ is a of a stochastic process defined over an interval with fixed end mixture of stationary Ornstein–Uhlenbeck processes, and hence points U1 o0oU2, using a standardized local time scale U such stationary itself. Formula (A.15) follows since its left- and right- that U¼0 corresponds to (40). The second component OU on the hand sides coincides with the left- and right-hand sides of (A.19). right-hand side of (A.15) is a stationary Ornstein–Uhlenbeck pro- Hence it remains to prove (A.18). We will do this by applying a cess conditionally on X1ðτÞ. This Ornstein–Uhlenbeck process has weak convergence result of Norman (1975), henceforth denoted as 194 O. Hössjer, P.A. Tyvand / Journal of Theoretical Biology 394 (2016) 182–196

N75. Theorem 3 of N75 provides functional weak convergence for Eqs. (A.27)–(A.28) follow from formulas (A.22), (A.4) and (A.10) for a one-dimensional process towards a Gaussian diffusion limit, but the mean and variance of the increments of vðtÞ and wðtÞ in (A.2) we will need the multivariate extension of this result, as outlined and (A.3), and the corresponding formula  in Section 9 of N75. To this end, we will first show that (A.17) is Âà equivalent to E ðvðt þ 1Þ vðtÞÞðwðt þ 1Þ wðtÞÞjvðtÞ ¼ 2nx; wðtÞ ¼ w ¼ E ðvðt þ 1Þ vðtÞÞ ffiffiffihi p 0 0 ðt þ nU0Þ Z 0 Rðt0 þ nU Þ γðt0 þ nU Þ ; γ 0 ; :   nðU Þ¼ n ð n n Þþð0 n2 Þ ðA 21Þ ðt þ 1Þ ðtÞ 1 0 ðt þ 1Þ ðtÞ ðtÞ ðtÞ n n Q ðxÞðv v Þ v ¼ 2nx; w ¼ w þoð1Þ 2 2 2 2 where t0 ¼ τ0=ðnneÞ is the discrete time step corresponding to τ0, and ¼ P2Q 3 ð1Þ1þP3Q 2 1 ð1ÞþP3Q 1 2 0  ðtÞ ðtÞ þP1Q 3 ð2Þ0þP1Q 2 ð1Þð1ÞþP2Q 1 1 1 RðtÞ ðtÞ ; ðtÞ ðtÞ; ðtÞ ðtÞ v ; w ; : hi n ¼ðRn1 Rn2Þ¼ðx Q 2 Q 2ðx ÞÞ ¼ ðA 22Þ 1 2n n Q 0 ðxÞE ðvðt þ 1Þ vðtÞÞ2 j vðtÞ ¼ 2nx; wðtÞ ¼ w þoð1Þ 2 2 is a vector that for each time step t Zt has one a allele frequency 0 1 0 4xð1xÞ ðtÞ ¼ 2½Q ðxÞQ ðxÞQ ðxÞQ ðxÞ Q ðxÞ þoð1Þ component x , and one heterozygous excess component 1 2 3 2 2 2 s ðtÞ ðtÞ ðtÞ ðtÞ fi 1 Q 2 Q 2ðx Þ, with v and w as de ned in (A.2) and (A.3). The 2 function γðtÞ ¼ðγðtÞ ; γðtÞ Þ corresponds to the mean drift of RðtÞ.Itis ~ n n1 n2 pffiffiffi n ¼ 2bðxÞxð1xÞþoð1ÞðA:29Þ fi Z γðt0Þ ; = de ned for t t0, with initial value n ¼ðx0 y0 nÞ obtained from the conditioning in (A.18). The recursion for the subsequent for the covariance of the increments of these two processes. In the second step of (A.29), the first six terms correspond to cases (i)– time steps t ¼ t0 þ1; t0 þ2; …,is (vi) of Section 2.1, with probabilities determined by the Markov γðtÞ RðtÞ Rðt 1Þ γðt 1Þ : ðt þ 1Þ ðtÞ n ¼ Eð n j n ¼ n Þ transition matrix, with v v attaining one of the values 2, 1, 1, 2, and nðt þ 1Þ nðtÞ one of the values 1, 0, 1. In the third step It follows from (A.4) and (A.10) that 2 2 of (A.29) we used (A.8) and (A.9), and in the last step the definition 0 γðt0 þ nU Þ ; n1 ¼ x0 of Q αðxÞ in (37). Z 0  The process nðU Þ in (A.21) corresponds to the random poly- 0  0 nU ðt0 þ nU Þ y0 s 1 gonal line process on p. 229 of N75, except that we added a term γ ¼ pffiffiffi ∏ 1 1 ð1þoð1ÞÞ pffiffiffi 0 n2 n 2 n nγðt0 þ nU Þ-y gðU0Þ to the second component. In order to see this, u ¼ 1 n2 0 y s ¼ p0ffiffiffi exp 1 U0 ð1þoð1ÞÞ we notice that the normalizing factor n 2 = pffiffiffi 1=n 1 2 y0ffiffiffi 0 n ¼ ¼ p gðU Þð1þoð1ÞÞ ðA:23Þ = 2 n 1 n as n-1. In the second equation of (A.23) we used (A.22) and the of (A.21) is obtained from the two multiplicative factors 1=n and ÀÁ= 2 upper part of (A.10) to conclude that γðt þ 1Þ ¼ γðtÞ 1ð1s=2Þ=n þ 1 n in (A.25) and (A.26), and the factor 1/n of the time scale t ¼ n2 n2 0= OðCn 2Þþoðn 2Þ. From (42), (43) and the upper part of (A.23) we t0 þU n is also derived from (A.25), just as in N75. The limiting Z deduce that (A.17) and (A.21) are equivalent. process in (A.18) is a Gaussian diffusion, whose distribution is fi In order to prove (A.18), we need to establish certain regularity fully speci ed by its mean and covariance functions. These are W S conditions for the increment obtained by inserting the two functions and in (A.27) and (A.28) into formulas (2.15)–(2.17) of N75, and then adding the ΔRðtÞ ¼ Rðt þ 1Þ RðtÞ ðA:24Þ 0 n n n asymptotic limit y0gðU Þ of the second term of (A.21). After some of (A.22) over one time step. These regularity conditions can be elementary calculations, it follows that the resulting Gaussian Z phrased in terms of its first two moments diffusion corresponds to the process on the right-hand side of  (A.18), with one Brownian motion and one Ornstein–Uhlenbeck ðtÞ ðtÞ 1 ðtÞ ðtÞ ~ ΔR R W R e ; : component. Since bðr1Þa0, these two processes will not be E n j n ¼ ð n Þþ 1n ðA 25Þ n independent. and In order to complete the proof of (A.18), some regularity con-  1 ditions of Theorem 3 in N75 need to be checked. It can easily be Var ΔRðtÞ j RðtÞ ¼ SðRðtÞÞþeðtÞ ; ðA:26Þ n n 2 n 2n shown that the differentiability conditions (2.3)–(2.7) on Wðr1; r2Þ n 0 and Sðr1; r2Þ hold. The remaining regularity conditions ð2:9 Þ, (2.10) and the martingale difference : 0 eðtÞ  and ð2 12 Þ of N75 involve the three remainder terms in , for eðtÞ ΔRðtÞ ΔRðtÞ RðtÞ : i ¼ 1; 2; 3. They can be written as 3n ¼ n E n j n = 1 3 = eðtÞ eðtÞ eðtÞ EðjeðtÞ j 3Þ ¼ oðn 3 2Þ; ðA:30Þ We regard 1n, 2n and 3n as remainder terms, whereas the 1n ΔRðtÞ leading terms of the and variance of n include  ðtÞ s Eðje jÞ¼oðn 2ÞðA:31Þ Wðr ; r Þ¼ 0; 1 r ðA:27Þ 2n 1 2 2 2 and and  1=3 0 1 eðtÞ 3 1 : : 1 ~ Eðj 3n j Þ ¼ Oðn Þ ðA 32Þ 1 s bðr1Þ Sðr ; r Þ¼r ð1r Þ@ 2 A; ðA:28Þ 1 2 1 1 ~ Formulas (A.30)–(A.31) follow by looking more closely at the bðr1Þ bðr1Þ remainder terms of (A.4), (A.10) and (A.29), with third moment where bðr1Þ is the function defined in (49), and estimates for the remainder terms of expected values of incre-  s ments, and first moment estimates for remainder terms of var- ð1sÞ 1þ ~ 2 iance and covariance of increments. Finally, (A.32) follows from bðr1Þ¼ ð12r1Þ: s 2 that j ΔxðtÞ j and j ΔQ ðtÞ j are less or equal to 1/n, and the fact that 2 1 0 □ 2 Q 2 is bounded. O. Hössjer, P.A. Tyvand / Journal of Theoretical Biology 394 (2016) 182–196 195

A.5. Proof of (31) for all 0oxo1 and w, with  r Y Formulas (37) and (44) imply ðrÞ ðrÞ ðrÞ 2ffiffiffi : w ¼ n2 Q 2ðx Þ¼ p ðA 37Þ 2ð1sÞ Âà n ðtÞ ¼ ðtÞð ðtÞÞ þμð Þ; ð : Þ f 2 s Ex 1 x T A 33 1 the excess number of heterozygots. In order to prove the upper 2 pffiffiffi part of (A.36), we use that the expected value of the multinomial with μðTÞ¼ nEðYðTÞÞ. Starting with the first term, we condition distribution in (57) is on what happens at time step t, and find from (32) and (A.2)–(A.4) ~ ~ ~ Eðn1; n2; n3 j n1; n2; n3Þ¼nðP1; P2; P3Þ: that ðr þ 1Þ ~ ~ ÂÃhiSince x ¼ðn2 þ2n3Þ=ð2nÞ, it follows that Exðt þ 1Þð1xðt þ 1ÞÞjxðtÞ ¼ x ¼ xð1xÞE ðxðt þ 1Þ xðtÞÞ2 j xðtÞ ¼ x ÀÁ ðr þ 1Þ ðrÞ ðrÞ 1 1 Eðx j x ¼ x; w ¼ wÞx ¼ P2 þP3 Q 2 þQ 3 1 4xð1xÞ 2 2 ¼ xð1xÞ þoðn 2Þ 1 ; : 2 s ¼ 2 ½ðP3 P1ÞðQ 3 Q 1Þ ¼ 0 ðA 38Þ ð2nÞ 1 2 where the last equality is a consequence of (A.5). For the lower 1 1 ¼ xð1xÞ 1 þo : part of (A.36) we use that the covariance matrix of the multi- nne nne nomial distribution in (57) has entries Taking expectations on both sides of the last displayed equation, ~ ~ Covðnα; nβ j n1; n2; n2Þ¼nðPαδαβ PαPβÞ; we obtain  rα; βr ÂÃÂà for 1 3. This yields ðt þ 1Þ ðt þ 1Þ ðtÞ ðtÞ 1 1 :  Ex ð1x Þ ¼ Ex ð1x Þ 1 þo 1 1 nne nne Varðxðr þ 1Þ j xðrÞ ¼ x; wðrÞ ¼ wÞ¼ P ð1P ÞþP ð1P ÞP P n 4 2 2 3 3 2 3  Iterating this last relation t times, we get 1 1  ¼ P2ð1P2ÞþP1P3 Âà t n 4 ðtÞ ðtÞ ð0Þ ð0Þ 1 1  Ex ð1x Þ ¼ x ð1x Þ 1 þo 1 1 1 nn nn ¼ Q 2ðxÞð1Q 2ðxÞÞþQ 1ðxÞQ 3ðxÞ þoðn Þ e e n 4 nnet=ðnneÞ 1 1 1 xð1xÞ ¼ ð0Þð ð0ÞÞ þ  1 ; : x 1 x 1 o ¼ s þoðn Þ ðA 39Þ nne nne n 21 2 ð0Þ ð0Þ t : : ¼ x ð1x Þ exp þoð1Þ ðA 34Þ □ nne where in the last two steps we used (A.8) and (A.9). The first term of (31) is obtained by inserting (A.34) into (A.33). A.7. Proof of (58) As for the last two terms of (31),wefirst take expectation with respect to XðτÞ in the upper equation of (48) and then let δ-0. fi We will prove that This gives a linear and inhomogeneous rst order ordinary dif-   1 s c~ðxÞ ferential equation EY Tþ yj XðτÞ¼x; YðTÞ¼y ¼ 1 y pffiffiffi þoðn 1=2Þ;   2 2 n s 1 s μ0ðTÞ¼ 1 μðTÞþ 1 cE½ XðτÞð1XðτÞÞ  2 n 2 1 1   Var YTþ j XðτÞ¼x; YðTÞ¼y ¼ bðxÞxð1xÞ: ðA:40Þ s 1 s = ¼ 1 μðTÞþ 1 cxð0Þð1xð0ÞÞe T ne ðA:35Þ 2 2 2 n 2 It can be seen that this is equivalent to (58),ifεðTÞ has a normal for μðTÞ, with  distribution. This follows by applying a to s ð1sÞ 3 the multinomial distribution in (57) when n is large. 2 ðrÞ c ¼ ; Let w be the excess numberffiffiffi of heterozygots of generation r, s 3 p 1 defined in (A.37), and w ¼ ny. We have that 2  pffiffiffi 1 and in the last step of (A.35) we made use of (A.34). The solution of nEY Tþ yj XðτÞ¼x; YðTÞ¼y 2 (A.35) is ¼ Eðwðr þ 1Þ wðrÞ j xðrÞ ¼ x; wðrÞ ¼ wÞ ð1 sÞT μðTÞ¼μð0Þe 2 Z ¼ Eðnðr þ 1Þ nðrÞ j xðrÞ ¼ x; wðrÞ ¼ wÞ T  2 2 0 0 hi 1 s ð0Þ ð0Þ T =n ð1 sÞðT T Þ 0 þ 1 cx ð1x Þe e e 2 dT : n ″ ðr þ 1Þ ðrÞ 2 ðrÞ ; ðrÞ n 2 Q 2ðxÞE ðx2 x Þ j x ¼ x w ¼ w 0 2 0 1 After evaluating the integral and making some other rearrange- nB 4ð1sÞC xð1xÞ ments, it can be seen that this expression equals the sum of the ¼ nðP Q Þ @ A ÀÁþoð1Þ 2 2 2 s s 1 2n 1 2 last two terms of (31), since 2  s C2 ð0Þ 1s ¼ 1 wþC þ þoð1Þ; ðA:41Þ μð Þ¼ ð0Þð ð0ÞÞ :□ 1 0 Q 2 2x 1 x s 2 2 1 2 where the third step follows from (A.38) and a Taylor expansion of

Q 2ðÞ, in the fourth step we used (A.39) and the expected value of A.6. Proof of (46) for the Wright–Fisher model the multinomial distribution in (57). In the last step we defined C1 and C2 as in (A.11) and used an expression for P2 Q 2 from the Since the argument is analogous to the proof of (46) for the proof of (48). But (A.41) is equivalent to the upper part of (A.40), Moran model, we will be more brief. It suffices to prove ~ since C1 þC2=2 ¼ð1s=2ÞcðxÞ. Eðxðr þ 1Þ j xðrÞ ¼ x; wðrÞ ¼ wÞ¼x; For the second part of (A.40) we use that  xð1xÞ 1 ðr þ 1Þ ðrÞ ; ðrÞ ÀÁ 1 ; : 1 τ ; Varðx j x ¼ x w ¼ wÞ¼ s þoðn Þ ðA 36Þ Var YTþ j Xð Þ¼x YðTÞ¼y 21 2 n 2 196 O. Hössjer, P.A. Tyvand / Journal of Theoretical Biology 394 (2016) 182–196

hi 1 Ethier, S.N., Kurz, T.G., 1986. Markov Processes, Characterization and Convergence. ¼ E ðnðr þ 1Þ nðrÞÞ2 j xðrÞ ¼ x; wðrÞ ¼ w n 2 hi2 John Wiley & Sons, Hoboken, New Jersey. Fisher, R.A., 1922. On the dominance ratio. Proc. R. Soc. Edinb. 42, 321–431. 0 ðr þ 1Þ ðrÞ ðr þ 1Þ ðrÞ ðrÞ ; ðrÞ 2Q 2ðxÞE ðn2 n2 Þðx x Þjx ¼ x w ¼ w Fisher, R.A., 1958. The General Theory of , 2nd rev. ed. Dover, New hiYork. 0 2 ðr þ 1Þ ðrÞ 2 ðrÞ ; ðrÞ þnQ 2ðxÞ E ðx x Þ j x ¼ x w ¼ w þoð1Þ Grimmett, G., Stirzaker, D., 2001. Probability and Random Processes, 3rd ed. Oxford 0 University Press, Oxford. ¼ P2ð1P2ÞQ 2ðxÞ½P2ð1P2Þ2P2P3 Hössjer, O., Jorde, P.E., Ryman, N., 2013. Quasi equilibrium approximations of the fixation index under neutrality: The finite and infinite island model. Theor. 1 0 2 þ Q 2ðxÞ ½P2ð1P2Þ4P2P3 þ4P3ð1P3Þ þoð1Þ Popul. Biol. 84, 9–24. 4 Hössjer, O., Ryman, N., 2014. Quasi equilibrium, variance effective population size ¼ Q 2ðxÞð1Q 2ðxÞÞ and fixation index for models with spatial structure. J. Math. Biol. 69 (5), 0 1057–1128. http://dx.doi.org/10.1007/s00285-013-0728-9. þQ 2ðxÞQ 2ðxÞ½Q 3ðxÞQ 1ðxÞ 1 Hössjer, O., Olsson, F., Laikre, L., Ryman, N., 2015. Metapopulation inbreeding 0 2 dynamics, effective size and subpopulation differentiation—a general analytical þ Q 2ðxÞ ½Q 2ðxÞð1Q 2ðxÞÞþ4Q 1ðxÞQ 3ðxÞ þoð1Þ 4 approach for diploid organisms. Theor. Popul. Biol. 102, 40–59. http://dx.doi. 1 org/10.1016/j.tpb.2015.03.006. ¼ bðxÞxð1xÞþoð1Þ; 2 Kimura, M., 1955. Solution of a process of random genetic drift with a continuous model. Proc. Natl. Acad. Sci. U.S.A. 41, 141–150. where the last step follows as in the proof of (48).□ Korolyuk, V.S., Korolyuk, D., 1995. Diffusion approximation of stochastic Markov models with persistent regression. Ukr. Math. J. 47 (7), 1065–1073. McKane, A.J., Waxman, D., 2007. Singular solutions of the diffusion equation of population genetics. J. Theor. Biol. 247, 849. References Levene, H., 1949. On a matching problem arising in genetics. Ann. Math. Stat. 20, 91–94. Moran, P.A.P., 1958a. Random processes in genetics. Proc. Camb. Philos. Soc. 54, 60. Abramowitz, M., Stegun, I.A. 1964. Handbook of Mathematical Functions. Nat. Moran, P.A.P., 1958b. A general theory of the distribution of gene frequencies. I. Bureau Stand. Overlapping generations. Proc. Camb. Philos. Soc. B149, 102–112. Brockwell, P.J., Davis, R.A., 1991. Time Series: Theory and Methods, 2nd ed. Moran, P.A.P., 1958c. A general theory of the distribution of gene frequencies. II. Springer-Verlag, New York. Non-overlapping generations. Proc. Camb. Philos. Soc. B149, 113–116. Cannings, C., 1974. The latent roots of certain Markov chains arising in genetics: a Nagylaki, T., 1980. The strong migration limit in geographically structured popu- – new approach, I. Haploid models. Adv. Appl. Probab. 6, 260 290. lations. J. Math. Biol. 9, 101–114. Cavalli-Sforza, L.L., Bodmer, W.F., 1971. The Genetics of Human Populations. Free- Nordborg, M., Donnelly, P., 1987. The coalescent process with selfing. Genetics 146, man, San Francisco. 1185–1195. – Coad, R.W., 2000. Diffusion approximation of the Wright Fisher model of popu- Norman, F., 1975. Approximation of stochastic processes by Gaussian diffusion and – lation genetics: single-locus two alleles. Ukr. Math. J. 3 (52), 388 399. application to Wright–Fisher genetic models. SIAM J. Appl. Math. 29 (2), Crow, J.F., Denniston, C., 1988. Inbreeding and variance effective population sizes. 225–242. – Evolution 42 (3), 482 495. Pollak, E., 1997. On the theory of partially inbreeding finite populations. I. Partial Crow, J.F., Kimura, M., 1970. An Introduction to Population Genetics Theory. The selfing. Genetics 117, 353–360. Blackburn Press, Caldwell, New Jersey. Tyvand, P.A., 1993. An exact algebraic theory of genetic drift in finite diploid Eldon, B., Wakeley, J., 2006. Coalescent process when the distribution of number of populations with random mating. J. Theor. Biol. 163, 315–331. – offspring is highly skewed. Genetics 172, 2621 2633. Watterson, G.A., 1964. The application of diffusion theory to two population genetic Ethier, S.N., Nagylaki, T., 1980. Diffusion approximations of Markov chains with two models of Moran. J. Appl. Probab. 1, 233–246. time scales and applications to population genetics. Adv. Appl. Probab. 12, Wright, S., 1931. Evolution in Mendelian populations. Genetics 16, 97–159. – 14 49. Wright, S., 1943. Isolation by distance. Genetics 16, 114–138. Ethier, S.N., Nagylaki, T., 1988. Diffusion approximations of Markov chains with two time scales and applications to population genetics, II. Adv. Appl. Probab. 20, 525–545.