<<

Copyight 0 1994 by the Genetics Society of America

Substitution Processes in Molecular . 111. Deleterious Alleles

John H. Gillespie Section of Evolution and Ecology, University of California, Davis, California 9561 6 Manuscript received February 10, 1994 Accepted for publication July 22, 1994

ABSTRACT The substitution processes for various models of deleterious alleles are examined using computer simulations and mathematical analyses. Mostof the work focuses on the house-ofcards model,which is a popular modelof deleterious allele evolution. The ofrate substitution is shown to bea concave function of the strengthof selection as measuredby a = 2Nu, where N is the populationsize and u is the standard deviation of . Fora < 1, the house-ofcards modelis essentially a neutral model; fora > 4,the model ceases to evolve. The stagnation for largea may be understood by appealing to the theoryof records. The house-of-cards model evolvesto a state where the vast majority of all are deleterious, but precisely one-half of those mutations that fix are deleterious(the other half are advantageous). Thus, the model is not a model of exclusively deleterious evolution asis frequently claimed. Itis argued that there are no biologically reasonable models of where the vast majority of all substitutions are deleterious. Other models examined include the exponential and gamma shift models, the Hartl- Dykhuizen-Dean (HDD) model, and the optimum model. Of all those examined, only the optimum and HDD models appear to be reasonable candidates for silent evolution. Noneof the models are viewedas good candidates for protein evolution, as none are both biologically reasonable and exhibit the variability in substitutions commonly observed in protein sequence data.

N the early 1970s, OHTA(1972, 1977) put forth the majority of “borderline mutations” arealso very slightly del- I bold hypothesis that most substitutions eterious. involve deleterious mutations. In onestroke she was able Borderline mutations are those with deleterious ef- to account for the(then new) observations that protein fects on the order of the reciprocal of the population evolution is generally slower than silent evolution and size, whichis many orders of magnitude smaller than the that proteins do notexhibit the generation-time effect smallest effect that can be measured experimentally. of silent evolution. Today, OHTA’Smechanism-the sub- One could reasonable question whether mutations of stitution of mildly deleterious -is often in- large effect tell us anything at all about borderline mu- voked to account for lowered substitution rates in spe- tations. In fact, it has been argued from biological prin- cific regions of the .For example, KIMURA (1983) ciples that mutations ofvery small effect should be has argued that both aminoacid substitutions and silent equally distributed between advantageous and disadvan- substitutions in highly biased coding regions are mildly tageous effects (FISHER1958; GILLESPIE1987). deleterious. The problem may be approached from another di- A mechanism of evolution involving the fixation of rection entirely. Natural populations evolve to the point less fit alleles must necessarily require very special cir- where most mutations are deleterious through thesub- cumstances in order to work. At the very least, the vast stitution of advantageous mutations. At such time when majority of mutations must be deleterious and popula- all mutationally accessible advantageous alleles are ex- tion sizes cannot be toolarge. The early deleterious mu- hausted, all newly arising mutations will be deleterious. tation models (KIMURA 1979; OHTA1976) assumed that This scenario is surely behind the view that the “great all mutations aredeleterious. Such models have come to majority of new mutations are deleterious.” But is this be called shift models (OHTA1992) as the distribution of how evolution works?Will populations evolve to the selection coefficients must continually shift to assure point where the great majority of borderline mutations that all new mutations areless fit than the most recently are deleterious?The surprising answer to be developed fixed mutation. here is that while the (not vast) majority of borderline The assumption that the vast majority ofmutations are mutations are deleterious, of those that fix, precisely deleterious has received little critical scrutiny. In her de- half are deleterious and half are advantageous. At velopment of the model OHTA(1977) simply asserted that: the same time, the vast majority of all mutations (in- Since it is generally accepted that a great majority of new cluding those of both large and small effect) are, in mutations are deleterious,it is natural to assume that a great fact, deleterious. Thus, there is a strong theoretical

Genetics 138 943-952 (November, 1994) 944 J. H. Gillespie argumentthat calls intoquestion the common centered on the value of their parent allele. HDD is practice of using experimental observations basedon meant tomodel the “evolution of selectiveneutrality” of mutations of large effect to infer the properties of enzymes. The value represents enzyme activity, which is mutations-and moreimportantly, substitutions-of supposed to evolve to such redundancy that all subse- very small effect. quent substitutions are neutral.As will be shown, this is These conclusions will come out of a more general precisely how the model behaves. study of nonexchangeable substitution processes. This The main concern of this paper is with a description study is the third ina series examining the classic models of thesubstitution processes for these five models. Recall of in the context of WATTEFSON’S that there are two substitution processes. The origina- (1975) infinite-sites, no-recombination model of the tion process is the pointprocess of the times ofentry into . The previous articles (GIUESPIE1993, 1994) dealt the population of those mutations that ulti- with the symmetrical overdominance, underdomi- mately fix. The fixation process is the point process of nance, SASCFF, and TIM models. The present paper the times when nucleotide mutations actually fix. The deals with nonexchangeable models. That is, withmod- two processes are tightly coupled butdiffer dramatically els where the alleles in the population are not dynami- in their complexity. The origination process tends to be cally equivalent. a renewal-like process that is sometimes more uniform Five models will be examined. The first is the house- than a Poisson process (for overdominant, SASCFF and of-cards model (KINGMAN 1978). For this model, absolute TIM models in rapidly changing environments) and fitnesses are assigned to additive alleles at the time of sometimes more clustered (for underdominance and their creation. The fitnesses are chosen from a normal TIM models in slowly changing environments). The distribution and are independent of the fitness of the fixation process tends to be episodic, with bursts of parent allele. The house-of-cards model has been substitutionsoccurring at intervals thatare more adopted as a model of molecular evolution and exam- evenly spaced than a Poisson process. The statistical ined in some detail by OHTAand TACHIDA(1990) and properties inferred from sequence data corresponds TACHIDA(1991). Manyof the results presented here to the origination process. The fixation process can- build on these papers. OHTA(1992) calls her version of not be observed. the house-of-cards model the fixed model to emphasize Most of the results will come from computer simula- its distinctness from shift models. However, there seems tions. Where possible, some approximate mathematical little in favor ofusing “fured”over the venerable “house- investigations will be employed. of-cards,” so the latter will be used here. The second model is the optimum model. Here each THE MODELS AND THEIR SIMULATION allele is assigned a value at the time of its creation. The fitness of a diploid individual is determined by using a The simulation program has been described in de- tail in the two previous papers in this series. Briefly, quadratic deviation fitness function centeredat zero on the simulation is made up of components, one for the the sum of the values of the two alleles. As with the two allele frequency dynamicsand one for the allelic geneal- house-ofcards model, the value of an allele is chosen ogy. The latter is independent of the model of selection so from a normal distribution and is independent of that has remained unchanged throughout this seriesof papers. of the parent allele. Unlike the house-ofcards model, The former is changed with each new model. fitnesses are bounded (by one). The optimum model The state of the population is represented by the fre- has frequently been used as a model of molecular evo- quencies of segregating alleles in the population. The lution (LATTER 1970; KIMURA 1981; FOLEY1987). Kimura, number of segregating alleles at generation t is the ran- in particular, has emphasized its suitability asa model of dom quantity K( t). The frequency of the ith of K( t) codon usage evolution. alleles will be written as either x,( t) or x,. Allele frequencies The next two models are shift models. These are the will change due to selection, , and mutation. classical deleterious allele models that instantiate the as- The simulation ofgenetic drift and mutation is the same sumption that all mutations are deleterious. Two shift as in the previous papers in this series.The population size models are widely used in discussions of molecular evolu- will be called Nand the mutation rate to new alleles, u. The tion. The first, due to OHTA (1977), assumes that the se- parameter 8 = 4Nu will be used throughout. lection coefficientsof new mutations are exponential ran- Under the house-of-cards model, each new allele is dom variables. The second, due to KIMURA (1979), assumes assigned a chosen from a normal that they are gamma random variables. distribution with mean zero and variance m‘. The selec- The final model is the HARTL-DWUIZEN-DEAN(HDD) tion coefficient for the ith allele is called Y,. The addi- model (HARTL et al. 1985). Under this model, thefitness tivity assumption dictates that the fitness of the AiAj function is a monotonically increasing bounded func- genotype is tion of the value. The values are assigned from a normal distribution but, unlike the other two models, they are 1 + (y, + y,)/2. Substitution Processes Substitution 945

The change in the frequency of the ith allele is given by and variance u2.For the gamma shift model, the distri- bution is the gamma distribution, 1 Xi(# y, - Y) Axi(t) = - 2 1+Y ' vwrr+e-2ax, where which has mean CT = a/(2N) and variance 24. This particular form of the gamma distribution is the basis of M4 KIMURA'S (1979,1983) theory of molecular evolution by Y = xt(t)y,. x mildly deleterious mutations. The mean of the gamma i= 1 is the same as for the exponential, but the variance is Rather thanusing u2to parameterize the house-of-cards twice as large. model, it is more convenient touse a = 2Nu. With this The selection coefficient of the ith allele is st = vi - notation, the variance in fitness is (a/2N)'. v,, where vi is the value of the ith allele and v, is the For the optimum model, each new mutation is as value of the root allele. As vi > v,, sj < 0. The fitness of signed a value from a standardized normal distribution a genotype is with mean zero and variance one. A fitness function, wq = 1 + (st + sp2 +(r) = 1 - uy' (1) and the allele frequency changes are calculated using is used to map the value of a genotype into absolute Equation 2. fitness. The fitness of the AiAj genotype is given by For the HDD model, the value ofan allele is obtained by adding a standardized normal deviate to thevalue of w. = +(y, + y,). 9 the parentallele. The value ofa genotype is mapped into The change in the frequency of the ithallele is given by fitness by the function

Ax,(t) = xi(t)(Uri - G)/W, (2) +(J) = ?4 + arctan(y)rr. where This function has the property thatit is positive, mono- M 4 tonically increasing, and bounded above by one. The w;= 2 X,Wq expressions for the changein allele frequencies are the j= 1 same as for the optimummodel. Note that theHDD has is the marginal fitness of the ith allele and no parameter reflecting the strength of selection. The model is intended to, in effect, evolve its own parameters M 8 by moving to high values where fitness differences be- w = xiw, x come small. i= 1 One simulation is run for each model and set of pa- is the mean fitness of the population.As with the house- rameters. The simulation is initialized by running it until ofcards model, the optimum model will be parameter- 100 substitutions have occurred. Initial tests indicate ized by a = 2Nu. that stationarity is attained in less time than is required Under thetwo shift models, fitnesses must be adjusted for ten substitutions. Thus, all of the properties to be so that all mutations are less fit than the most recently described are for stationary point processes. fixed mutation. There is a great deal of flexibilityin set- ting up models with this property. The approach taken RATES OF SUBSTITUTION here is to assign each allele a random value that is used to calculate its selection coefficient. The value is set The rate of substitution is defined as equal to thevalue of the parental allele minus either an k = lim-Wt) = -EWt) exponential or a gamma random deviate. As all alleles t' are descendantsof the rootallele, the values ofmost of t+m t the mutations in the population will be exponential or where X( t) is the numberof originations or fixations in gamma random variables. However,some mutations will an interval of t generations. The rates will usually be be descendants of these alleles. Their values will be the expressed relative to the neutralsubstitution rate, which value of their grandparent allele minus the sum of two is equal to the mutation rate u. exponential or gamma random deviates (one coming Substitution rates as a function of a for all but the from their parent allele). It is possible to have mutations HDD model are illustrated in Figure 1. For all four mod- that are even more distantly related with values that are els, the rateof substitution is a decreasing function of the sums of three or more random variables, but this will be strength of selection as measured by a. Thus, all behave rare. as models of deleterious alleles should. For the exponential shift model, the exponentialdis- The most striking aspect of the figure is the contrast tribution used to assign the value has mean u = a/ (2N) between the house-of-cards model and the others. For 946 J. H. Gillespie

8III/#I,, 0.9 -House of Cards D"-o Optimum M Exponential Shift 0.8 -Gamma Shift P 3 0.7 sE: 2 0.6

F$! 0.5 2 .$ 0.4

0.3 5In 0.2

0. I

0.0 0.0 1.0 2.0 3.0 4.0 5.0 6.0 7.0 8.0 9.0 10.0 a a FIGURE1.-The rate of substitutionas a function of the FIGURE2.-Mean values of Z andembedded Z for the strength of selection, a, for five models with 8 = 1.0 and house-of-cards model andthe Markov approximation of N = 1000. Each point is based on 5,000 events. the house-of-cards model with 8 = 0.2 and N = 1000. Each point is based on 5,000 events. the house-of-cards model, therate is a concave function able is gamma-distributed with parameter P = Yz. of a; for the othersit is a convex function. The decline Thus, the distributionof selection coefficients for the in the rate for the house-of-cards model is so extreme optimum model is very similar to that of the gamma that simulations with a > 4 effectively run forever. shift model. Clearly, the initial problem is to understand why the rate There is a certain irony in this observation. I have on drops off so rapidly for the house-ofcardsmodel and not two occasions argued that thegamma shift model is en- for the others. tirely ad hoc, with no conceivable biological justification There is an extensive literature on the rate of substitu- (GILLESPIE1987, 1991). Now it appears that the gamma tion for the shift and optimum models. KIMURA (1983) shift model could be viewed as an approximation to an summarized the results for the shiftmodels with the optimum model. However, while the approximation of formula the rateof substitution for large a is quite good, the next section will show that other aspects of the substitution processes are not at all like the optimum model. where (() is Reimann's Zeta function. (This formula is Much less is known about the rate of substitution for actually minor variant of KIMURA'S (GILLESPIE1987).) The the house-of-cards model. The essential complication is parameter P is one for the exponential shift model and that there is no easy way to discover the absolute fitnessof one-half for the gamma shiftmodel. In the former case, the the common genotypes inthe population. By contrast, the rate is proportional to 1/a; in the latter, to 1/m.~0th fitness of the reference genotype under shift models is are convex functions of a in agreement with the curves in always one. The success ofmutants obviously depends criti- Figure 1. The numerical agreement between equation 3 cally on the average absolute fitnessesof the resident gene and the simulation resultsis not impressive. For example, types. Thus, our investigation will begin with an investiga- for a = 4, the predicted rate for the exponential shift tion of the fitnesses of resident genotypes. model is 0.2, while the simulated rate is 0.34. For the As a increases, so does the standard deviation in fit- gamma shift model the numbers are 0.32 and 0.45, re- ness of mutants, which increases the effectiveness of spectively. The poor performanceis to be expected as a twe at bothfixing advantageous mutations allele modelwas used to obtain equation3 whereas theSiu- and preventing the fixation of deleterious mutations. lated populationstypically contain seven or eight alleles. This should be reflected in the mean fitness of the alleles FOLEY(1987) showed that the rateunder the optimum in the population. There aretwo ways to investigate this. model is, like the gamma shift model, asymptotically One way is to examine equal tol/fi. In fact, Figure 1 shows that the rate of z= Y/a substitution for thetwo models converge to one another as a - 00. There is a simple explanation for this. If the at regular intervals of time.(Recall that 1 + Y is the most common allele under the optimum model has a mean fitness of the population.) The division by u is a con- value near zero (as it will for large a),then Equation venient scaling as the values of Yare very small.The mean 1 shows that theselection coefficientof mutant alleles value of 2, as a function of a, is illustrated in Figure 2. will be approximately - uy2,where y is a standardized As is apparent, the values of Z do increase with in- normal deviate. The square of a normal random vari- creasing a. As an example, when a = 2.5 the meanvalue Substitution Processes 947 of Z is also about 2.5. In this case, the mean selection If the house-ofcards model is to be a model of mo- coefficient of segregating alleles is about 2.5 standard lecular evolution wherethe vast majorityof substitutions deviations abovethe mean selection coefficientsof newly are deleterious, then 2, must decrease in value much arising mutations. Reference a to table of normal deviates more than it increases. However, as Zi is a stationary shows that about99.3% of the new mutations will be less process, this would seem unlikely. In fact, the when simu- fit than the average fitness ofthe segregating alleles. Thus, lations are examined, the fraction of steps where Zi de- the population has evolvedas expected and the vast ma- creases is,in all cases,not significantly dif€erent from one- jority of all mutations are, in fact, deleterious. half, even though the ktmajority” of all mutations are What aboutborderline mutations? Ifwe define deleterious. Thus, the house-ofcards model is, in fact, a borderline mutations to be those with selection CO- model that yields the same numberof advantageous sub efficients within plus or minus 1/2Nof the mean se- stitutions as deleterious substitutions, a result that was lection coefficient, then the fraction of all mutations noted in passing by TACHIDA(1991). that are deleterious is The simulation results on the embedded 2 process uncovered two simple properties-EZ, a/2 and the 1 - e-y2/20p dy = - fraction of decreases is one-half-which suggest that a eJSU su-l/2N fi Lae-r2/2 dz mathematical analysis of the process might be possible. = 0.01165 This is the case, but only as an approximation when8 is small and a is either small or large. where s = 2.5 is the mean value of Z. Similarly, the The method of approximation is the standard two- fraction of all borderline mutations that are advan- allele approach under weak mutation: Set the rate of tageous is 0.00434. Thus, 72% of all borderline mu- substitution equal to the mean number of mutations tations are deleterious. Not a “vast” majority to be entering the population each generation, 2Nu, times sure. If all advantageous mutations are included-not the probability of fixation of any one of them. The twist just theborderline ones-then the fraction deleterious for the house-of-cards model, asnoted and exploited by decreases to 65%. Natural selection has moved the TACHIDA(1991), is the recognition that the fixation population to a point where thegreat majority probability depends on the selection coefficient of the (99.3%)of all newmutations are deleterious but only currently ‘fixed’ allele. 65% of those with fitnesses greater than the mean A second twist, alsonoted by TACHIDA,that is thefitness minus 1/2N are deleterious. of the currently fixed allele formsa discrete-time Markov The otherway to investigate the fitness of newly aris- process when each unit of time corresponds to the sub- ing mutation is to focuson sites that ultimately fw in the stitution of a new mutation. Thus, theMarkov processap population rather than segregating alleles. Whena new proximates the embeddedZ process of the simulations. mutant enters the population, it does so as a new allele Consider a two-allele population with the selection with a mutation at a randomly chosen site. This mutationcoefficient of the currently fixed allele beingYfand that is called the defining mutation of the new allele. Simi- of a newly arising mutation X. The fixation probability larly, the allele will be called the defining allele of the of the new mutation is mutation. The fixation of a site does not imply fixation 1 - ,-(X-Y,) of its defining allele. Rather, it implies that, at the moment of fixation ofthe site,the population is composed of only 1 - e-2N(X-Yt) the defining alleleand its descendent alleles. Usually, when (KIMuRA 1962). If the distribution of fitnesses of new a is moderate and/or 8 is small, the defining alleleis the mutants is Gaussian with mean zero and variance 4, most common allele in the population at the momentof then the probability of fixation of a new mutation in any fixation of its defining mutation. Thus, it should be infor- particular generation is mative to followthe fitnesses of the alleles whosedefining mutations ultimately fix in the population. Now let 2, = YJu, where Yjis now the selection co- efficient of the allelewhose defining mutation ulti- mately fixes in the population. Z, may be viewed as a The scalings y = x/a and z, = Y,/otransform Equation discrete-time stochastic process with time measured in 4 into units of originations. Thus, 2, would be the scaled se- lection coefficientof the defining allele for the first site to fix,Z, for the defining allele for the second site to fix, J -m and so forth. This processwill be called the embedded Z where process. The mean of the embedded Zprocess as a func- tion of a is illustrated in Figure2. Remarkably, the mean 1 1 - ,-Ocv-d value is very close to 42. 948 J. H. Gillespie

Thus, with probability G(z,) an origination will occur in the current generation. The next step is to find the distribution of zf+l.An argument paralleling that for theprobability of change shows that the probability density of y = z,+ I, given that a substitution occurred, is

The densityf( y, z,) is the transition probabilityfor the embedded Markov chain. Although the chain is analyti- cally intractable, it does yield to an asymptotic approach for small and large a. The small a approximation leads to a number of un- -1.0 ' 0.0 10.0 20.0 30.0 40.0 50.0 60.0 70.0 80.0 90.0 100.0 expected results. When a is small, so is cr = a/ (2N), Substitution which motivates a search for the leading term of an as- FIGURE3.-A typical realizationof the embeddedZ process ymptotic expansion off(y, z,) as (+ + 0. Use for the house-of-cards model.

e--LT(Y-LI) - 1 - (+(y - house-of-cards model is not a model forwhich the vast majority of all substitutions are deleterious. Rather, it to approximate g with is a model with an even mixture of positive and nega- tive substitutions. Figure 3 gives a typical realization of z,, illustrating that the process is extraordinarily erratic as would be ex- and to approximate f( y, z,) with pected from the large even moments. The primary goal ofthe asymptotic analysisfor large a is to discover why the process effectively stagnates.The ex- planation follows from a simple applicationof the theory We now require the asymptotic expansions of the ex- of records. The starting point is to note that pectations of Az and (Az)', where Az = ( y - z). Use 11 1 J-%, --+-(y-z,)+-(y-z)'a 1 - a 2 12 (which I obtained from Maple, a symbolic mathematics which implies that the transition probability becomes computer program) to obtain a EAz - - z O(d) - 2 + E(Az)' - 1 + z2 + O(a). (9) Examination of fm( y, z,) shows that, for large a, the only mutations that fix are ones whose fimesses exceed z, In The expression for Ehzshows that, at stationarity,the mean other words, when selection isstrong, the fixation of del- value of z, is approximately a/2. Figure 2 shows that Ez, - eterious alleles essentially stops. But,the simulations show 42in the simulations of the full processas well as for the that the substitution of more fit alleles also stops.All evo- approximating Markov process.Thus, our result that Ez, = lution stops! a/2 appears to be valid even when a is not small. The reason is apparent from the form offm(y, z,). In As selection gets weaker, the embeddedprocess does order for a substitution to occur, a mutation must ap- not converge to a diffusion, as might be expected. The pear whose fitness exceeds z,. If the fitness of the first higher-order odd moments, when z = 0, are all oforder allele is assigned at random, the theory of records tells a. The even moments, on the other hand,grow rapidly us that the average time until a mutation appears with with their order. For example, E(&)", when z = 0, y > zo is infinite (GLICK1978). Of course, the appearance equals 1, 3, 15, and 105 when n equals 2, 4, 6, and 8, of a more fit allele does not guarantee its fixation. The respectively. A consequence of the domination of the rate of substitution will be somewhat less than the rate dynamics by the even-ordered moments is that the dis- of appearance of records. tribution of Az is very nearly symmetrical. Positive and The theory of records is concerned with the number negative values ofAz should occur about equally likely. of record values in a sequence of independent draws This symmetry was seen in both the Markov process from a probability distribution. A record occurs on a and in the simulations of the full process. Clearly, the specific draw if the value of the random variable ob- Substitution Processes 949 Processes Substitution tained on that draw exceeds the values of the random 3.0 r variables obtained on all previous draws. The mean number of records in n draws is

n 2.0 l/i 0.5772 ln(n) 2.5i - + i= 1 (GLICK1978). For example, the expected number of records in a million draws is only14.39. If the sample is extended to another million draws, the expected num- ber of records is only 0.69 greater than for thefirst mil- lion. When 6 = 2, one mutation will enter the popula- tion, on average, eachgeneration. In two million generations, there can be no more than about 15 sub- 4.5 1 0.0 1.0 2.0 3.0 4.0 5.0 6.0 7.0 8.0 9.0 stitutions, on average. A typical simulation in this paper Lag runs until 5,000 substitutions have occurred. On aver- FIGURE4.-The covariance divided by thesquare of the age, this would require more than about e5,Ooogenera- mean time between originations for three models. The results tions for the large a cases. Such simulations will appear are based on 10,000 events with 0 = 1.0, N = 1000, and with to run forever. a = 10 for the gamma shift and optimum models and a = 2 for the house-of-cards model. The results for the HDD model aresimple: k EJ u. The population quickly evolves to a position where the se- used to describe the variability in rates of substitution lective differences between alleles is smaller than 1/N. (KIMURA, 1983). It is this connectionthat makes %k All subsequent evolution is indistinguishable from neu- more convenient than ck. tral evolution. There is little point in documenting this Here we will investigate only the origination pro- behavior as not a single statistic indicated any deviation cess. As was shown in the previous paper, the fixation from neutrality with k = u. process is considerably more complicated. Thesecom- plications are irrelevant to the analysis of sequence TIMESBETWEEN SUBSTITUTIONS data as the countingprocesses for the origination and fixation processes converge rapidly with the number The times between originations or fixations form a of substitutions. stationary time series in discrete time, . . . , T-,,To, Figure 4 illustrates for the origination process for T,, . . . , whose properties may be investigated using Xk the gamma shift, optimum, and house-ofcards models. standard time series methods. In the previous two pa- All three models have the property that (ek EJ 0 when pers, we discovered that themost valuable statistic is the k > 1. In other words, the origination processes behave autocovariance function divided by the square of the very nearly like renewal processes with the intervals be- mean time between events. tween originations being nearly uncorrelated. The ex-

‘k ponential shift and HDD models also share this property Tk= - (ET,)‘ ’ (data not given). The previous papers in this series showed that this behavior is also characteristic of the where TIM, SASCFF,and overdominance models. As the origi- ck = Cov(T,, Tifk) nation process for the neutral models is a Poisson pro- cess, it has uncorrelated intervals between successive is the covariance of two intervals separated by a lag of k originations. In fact, the underdominance model has originations or fixations. been the only model of the ten examined in this series The function (ek is closely connected to the index of of papersthat exhibits any significant correlation dispersion of the counting process, between successive origination intervals. Var X( t) The values of%o do vary significantly betweenthe vari- I(t) = - EX(t) . ous models. For the HDD and gamma and exponential shift models, %o EJ 1. For these models, the origination For large t, process is indistinguishable from a Poisson process (and from the neutralorigination process). By contrast, R = lim I(t) (10) hrn %o > 1 for the house-ofcardsand optimum models. Re- calling that R = %o for renewal processes (equation 11 m with %,= 0, i > 0), we conclude that forthese two models = %o + 2 x (ei. (11) molecular evolution should appear moreepisodic than i= 1 it does for the neutral model. Clearly, this property is R will be recognized as the quantity that has been widely worth closer scrutiny as the neutral model’s failure to 950 J. H. Gillespie

shares the property that thetime until the next mutation 14.0 - House of Cards to enter the population depends on the fitness of the -D"--13 Optimum currently fixed allele. This is the origin of the clustering. 12.0 - However, the optimum model differs from the house- of-cards model in that themaximum fitness of anygeno- type is one. Thus, there cannot be the extremely long waits for more fit alleles to appear in the population. The two shift models do not exhibit any clustering because the fitnesses offixed alleles are always scaled to one. Infact, the origination process for theshift models are indistinguishable from a Poisson process. Similarly,

2.0 the origination process for the HDD model is indistin- guishable from a Poisson process. 0.0 0.0' 2.0 4.0 6.0 8.0 10.0 a DISCUSSION FIGURE5.4, as a function of a for the houseafcards and The results presented here have a direct bearing on optimummodels. Results are basedon 5,000 eventswith the hypothesis that most molecular evolution involves 0 = 1 and N = 1000. the fixation of deleterious alleles. Three classes of delete- account for theobserved episodic evolution of proteins riousallele models have been proposed:shift models has been its greatest shortcoming. (OHTA1972, 1976; KIMURA1983), house-ofcards models Figure 5 illustrates soas a function of a for the house- (TACHTDA1991; OHTA 199'4, and optimum models (UTTER of-cards and optimum models. The house-ofcard model 1970; KIMURA1981; FOLEY1987). Each hasat least one ma- is extraordinary in that %,, increases dramatically with jor shortcoming as a model of deleterious evolution. increasing a. For a = 3.5, so= 84. Such extreme clus- The shift models are the least appealing of the three. tering of substitutions has not been observed in any The assumptions that all mutations are deleterious and other model examined to date. The reasons for theclus- that thedistribution of fitnesses relativeto the common tering may be understood from the previous discussion allele is immutable are difficult to justify. The usual jus- on the rates of substitution. tification is to infer from the observation that as most The house-ofcards model tends to evolve to a state mutations of large effect are deleterious that most mu- where the currently fixed allele is sufficiently fit that the tations of small effect mustbe deleterious as well. The fal- waiting time until a more fit allele appears is very long. lacy ofthis argument is shownby the house-ofcards model The substitution process would effectively stagnate, as which evolves to the point where most mutationsof large we saw from the theory of records argument, were it not effect are deleterious yet onlyabout 65%of those of small possible for the fixation of less fit alleles. This too re- effect are deleterious. Shift modelsare clearly inappropri- quires a long waiting time (for a > 2.5),but shoulda less ate as models of molecular evolution.OHTA (1992) herself fit allele fix, there is an immediate burst of both advan- has argued against them based on the biological implau- tageous and deleterious substitutions that ends when a sibility of the shifting distributionof relative fitnesses. particularly fit allele fixes. Of the two remaining models, neither can be said to These dynamics may be better understood by refer- be a model of evolution by exclusively deleterious sub- ring again to Figure 3. Consider the final ten substitu- stitutions. For both, precisely one-half of the substitu- tions in the figure. The tenth from the end had anab- tions are advantageous and one-half are deleterious. I solute fitness of 0.48 and remained in the population for would go so far as to claim that there are no biologically 5,772 generations before being replaced by the nextmu- realistic models in which mostof the substitutions of mu- tation whose fitness was 0.49. That allele remained in the tations of very small effectare deleterious. Itis difficult to population for only 973 generations before being re- imagine a model that allows the evolution to less fit states placed. Jumping ahead, the sixth allele from the end without the return evolution tomore fit states when fitness had a fitness of 3.91 and remained in the populationfor differences are on the order of the reciprocal of the popu- 1,752,804 generations. The next four alleles had fit- lation size. This conclusion seems inescapable despitethe nesses of 3.1, 1.9, 2.4 and 1.6 and remained in the popu- huge investment our field has in the notion that most of lation for 801961,159874,658 and 10755 generations, re- the sites fixed in evolution are deleterious. spectively. This extraordinary variation in the residence But even if the house-ofcards and optimum models times of alleles is responsible for the high values of %,,. are not exclusively models of deleterious evolution, they The optimum model also shows a tendency for clus- may still be viable models of molecular evolution. Of the tering but the effect is much less pronounced than for two, the house-of-cards models appears to be the least the house-ofcardsmodel. The maximum value of%,, was worthy. The house-ofcards model suffers from extreme 1.6 which occurred when a = 50. The optimum model parameter dependence as illustrated in Figure 1. The Substitution Processes 95 1 strength of selection must be such that 1 < a < 4. If (Y < lowered ratesof substitution inspecific regions of the 1 the modelis essentiallythe same as the strictly neutral genome. model. If a > 4 the model stagnates. Given that a is the A modified version of the house-of-cards might be a product of the population size times the strength of se- viable candidate for molecular evolution. TWOsimple lection, there is no conceivable reason why it should be modifications should be considered. The first is the ad- restricted to such a small range of values. dition of fluctuating population sizes. OHTA’S“slightly This objection is not new. OHTA(1992) has tried to deleterious mutation theory” is, in its current incarna- deflate it by introducing population subdivision. The tion, a house-ofcards modelwith fluctuating population essential idea is that if fitnesses are determined inde- sizes. The dynamics of this model have not been exam- pendently in m patches, then the effective standard devia- ined in detail, but two crucial conjectures can be made. tion in fitness is u/m. While this does make the depen- The first concerns the time scale of the fluctuations. dence on u itself less extreme, it doesn’t help to explain If the time scale ofpopulation size fluctuations are much why the new a = 2Nu/m should fall in the narrow range less than that of molecular evolution, then the model (1,4).In fact, the new a is now the product of three pa- will behave like a constant populationsize model with a rameters and might be expected to be even more variable. suitable average population size playingthe roleof N. As The rateof substitution under theoptimum modelis the time scale of molecular evolution is on the orderof much less dependent ona. In this regard, the optimum millions of years, we must assume that populations stay model is a much more acceptable model of molecular very small for millions of years and then very large for evolution than is the house-ofcards model. It captures millions of years in order for the fluctuations to affect the spirit of deleterious model in that the rate of sub- the dynamics. Some may find this assumption demo- stitution is a decreasing function of the strength of se- graphically implausible ( GILLESPIE1988). lection. The essential difference between it and the The second concerns the nature of the substitutions house-of-cards model is that fitnesses are bounded un- should the population size fluctuate on a time scale of der the optimum model and unbounded under the millions of years.During periods of small size( a < 1) ,the house-of-cards model. The compression offitnesses substitutions that occur will follow neutral dynamics. near one allows for continuing evolution. When the population expands, those substitutionswill be If we were to promote oneof these models as a model almost entirely advantageous.The fixation of a slightly del- of molecular evolution, itwould have to be the optimum eterious alleles (1 < a < 4) will almost never occur. model. As the dynamics of protein and silent substitu- The second modification of the house-ofcards model tions are dramatically different (GILLESPIE1986, 1991), keeps its biological motivation intact yet allows for con- the optimum model can only apply to one type of sub- tinuing evolution. All that need be done is to allow the stitution. The essential difference between silent and re- fitnesses to change very slowly through time. For long placement substitution dynamics is that the former ex- stretches of time, the relative fitnesses of alleles in the hibit a strong generation-time effect and a relatively population will remain fixed. Thus,the segregating small value ofR while the latter has a small generation- variation will have properties essentially the same as un- time effect and large values of R. The generation-time der the house-of-cards model. Occasionally, the fitness effect is to be expectedfor any substitution process that of the most fit allele will change, leading to a burst of is mutation-limited, as isthe optimummodel. Moreover, substitutions similar to that described in this paper. The the optimum model exhibits a small value of R. Thus, occasional change in fitnesses is the force driving mo- it is an acceptable model of silent evolution. The case for lecular evolution. This is precisely the model that was the optimum model would be strengthened consider- explored in the first paper in this series. There it was ably if the rate of silent substitution in coding regions shown that as the persistence time of fitnesses (the av- were lower than that in truly “non-functional,” hence erage time between changes) increases, the rate of sub- neutral, DNA. As of this writing, it is not clear which of stitution first increases then decreases to match the rate the two rates are higherso we must wait for this support. of change of the environment. At the same time, the IMURA(1981) originally proposedthe optimum clustering of substitution increases. The addition of fluc- model as a model for silent evolution in coding regions. tuating fitnesses is quite naturalas the fitness of a geno- Although he now appears to be backing away from this typeis determined by its success in its environment, view in favor ofthe strictly neutral model (KJMURA1991), which is in a constant state of change. In fact, the basic it may well transpire that the optimum model may be assumption of the house-of-cards model that fitnesses needed to explain lowered rates of silent evolution in remain fixed in perpetuity is patently absurd. coding regions. The HDD model is also a viable candi- While theproposed modification of the house-of- date for silentevolution as it evolves to a state where cards model is simple and biologically compelling, it it is, in essence, a neutral model. It thereforeexhibits must be emphasized that the resulting model changes a strong generation-time effect and R = 1. Like the qualitatively in one important regard. Under thehouse- neutral model, it cannot provide an explanation for of-cards model, genetic drift is the force responsible 952 J. H. Gillespie for the fixation of alleles. Under the modified model, award) and DEB9119463. I would like to thank MICHAELTURELLI for which is actually the TIM model (TAKAHATAet al. 19'75; reminding me of FOLEY'S work on the optimum model. TAKAHATAand KIMURA 1979), naturalselection is primar- ily responsible forthe fixation of alleles. Thus,the LITERATURE CITED house-ofcards dynamics are structurally unstable to the FISHER,R. A., 1958 The Genetical Theory ofNatural Selection.Dover, assumption that fitnesses do not change. If this modi- New York. FOLEY, P., 1987 Molecular clockrates at loci under stabilizing selec- fication is judged to yield a more plausible model, it is tion. Proc. Natl. Acad. Sci. USA 84 7996-8000. most assuredly not a model of neutral evolution. KIMURA GILLESPIE,J. H., 1986 Natural selection and the . Mol. (1983) has defined neutralmodels to be those for which Biol. Evol. 3: 138-155. GILLESPIE,J. H., 1987Molecular evolution and the neutral allele drift is the force responsible for the fixation of sites. theory. Oxf. SUIT. Evol. Biol. 4 10-37. This study is the third in a series aimed at examining GILLESPIE,J. H., 1988 More on the overdispersed molecular clock. substitution processes in the context ofWATTEMON'S Genetics 118: 385-386. GILLESPIE,J. H., 1991 TheCauses of MolecularEvolution. Oxford infinite-sitesno-recombination model of the gene.With University Press, New York. this study, all but one of the classical models of popu- GILLESPIE,J. H., 1993 Substitution processes in molecular evolution. lation genetics have been examined. (The remaining I. Uniform and clustered substitutions in a haploid model. Genetics 134: 971-981. model is a directional evolution model that will be taken GILLESPIE,J. H., 1994 Substitution processes in molecular evolution. up in a future report.) Theinvestigation was motivated 11. Exchangeable models from population genetics. Evolution (in by the need toexplain the high values of R observed in press). GLICK,N., 1978 Breaking records and breaking boards. Am. Math. protein evolution. One explanation for the clustering of Month. 85: 2-26. substitutions involves strong selection (CY>> 1) and weak HARTL, D. L., D. D. DYKHUIZEN,and k M. DEAN,1985 Limits of adap mutation (8 << 1) (GILLESPIE1991). In the current in- tation. The evolution of selective neutrality. Genetics111: 655-674. KIMURA,M., 1962 On the probability of fixation of mutant in vestigation, I wanted to see if models with moderate se- a population. Genetics 47: 713-719. lection (a- 1) and mutation (8 - 1) could exhibit high KIMURA,M., 1979 Model ofeffectively neutral mutations in which valuesof R. Surprisingly, we discovered that several selectiveconstraint is incorporated. Proc. Natl. Acad. Sci.USA 76: 3440-3444. models-TIM in a rapidly changing environment, KIMURA, M., 1981 Possibility of extensive neutral evolution under SASCFF, and overdominance-actually have substitu- stabilizing selection with special reference to nonrandom us- tions that are moreregular than random. Thesemodels age of synonymous codons. Proc. Natl. Acad.Sci. USA 78 5773-5777. may only be used as models of protein evolution if some KIMURA,M., 1983 The Neutral Allele Theory of Molecular Evolution. element is included that changeson a time scale that is Cambridge University Press, Cambridge. commensurate with that of molecular evolution. These KIMURA, M., 1991 The neutral allele theory of molecular evolution. A review of some recent evidence. Jpn. J. Genet. 66: 367-386. elements have been called allelic constrictions and are KINGMAN, J. F. C., 1978 A simple model for the balance between se- identified with extreme environmental changes, bottle- lection and mutation. J. Appl. Prob. 15: 1-12. necks, or linked hitchhiking events (GILLESPIE1991). In LATTER,B. D. H., 1970 Selection in finite populations with multiple alleles. 11. Centripetal selection, mutation, and isoallelicvariation. a subsequent report in this series, we will see whether Genetics 66: 165-186. allelic constrictions can elevate R sufficientlyin OHTA,T., 1972 Evolutionary rate of cistrons and DNA divergence. moderately-selected models to account for the patterns J. Mol. Evol. 1: 150-157. OHTA,T., 1976 Role of very slightly deleterious mutations in molecular in protein evolution. evolution and . Theor. Popul. Biol. 10: 254-275. Three moderately selected models were seen to el- OHTA,T., 1977 Extension of the drift hypothesis, evate R: the underdominance, house-ofcards, and op- pp. 148-167 in MolecularEvolution and Polymorphism,edited by M. KIMURA. National Institute of Genetics, Mishima. timum models. In the underdominance model,which is OHTA,T., 1992 The nearly neutral theory of molecular evolution. exchangeable, the elevation of R is due to the coopera- Annu. Rev. Ecol. Syst. 23 263-286. tive dynamics of alleles. The probability of a new allele OHTA,T., and H. TACHIDA,1990 Theoretical study of near neutrality. I. Heterozygosity and rate of mutant substitution. Genetics 126 entering the population is increased when another al- 219-229. lele is moving through the population and,as a conse- TACHIDA,H., 1991 A study on a nearly neutral mutation model in quence, lowering the homozygosity. The reason for the finite populations. Genetics 128 183-192. TAKAHATA, N.,1987 On the overdispersed molecular clock. Genetics elevation of R under the house-of-cards and optimum 116 169-179. models is quite different. For these, the rate of substi- TAKAHATA,N., and M. KIMURA,1979 Genetic variability maintained in tution depends on the fitness of the common allele, a finite population under mutation and autocorrelated random fluctuation of selection intensity. Proc. Natl. Acad. Sci. USA 76: which changes randomly through time. It is as if the 5813-5817. substitution process is a randomized Poisson process.In TAKAKATA, N.,K. ISHIIand H. MATSUDA,1975 Effect of temporal fluc- this sense, these models arenot unlike TAKAHATA'S tuation of selection coefficient on gene frequency in a popula- tion. Proc. Natl. Acad. Sci. USA 72: 4541-4545. (1987) fluctuating neutral space model. WA~WN,G. A, 1975 On the number of segregating sites in genetic models without recombination. Theor. Popul. Biol. 7: 256-276. The research reported here was funded inpart by National Science Foundation grants BIR-9212381 (Academic Research Infrastructure Communicating editor: G. B. GOLDING