
Copyright © 2000 by the Genetics Society of America

The Role of Population Size, Pleiotropy and Fitness Effects of Mutations in the Evolution of Overlapping Gene Functions

Andreas Wagner

Department of Biology, University of New Mexico, Albuquerque, New Mexico 87131-1091 and The Santa Fe Institute, Santa Fe, New Mexico 87501

Manuscript received February 24, 1999
Accepted for publication November 23, 1999

ABSTRACT
Sheltered from deleterious mutations, genes with overlapping or partially redundant functions may be important sources of novel gene functions. While most partially redundant genes originated in gene duplications, it is much less clear why genes with overlapping functions have been retained, in some cases for hundreds of millions of years. A case in point is the many partially redundant genes in vertebrates, the result of ancient gene duplications in primitive chordates. Their persistence and ubiquity become surprising when it is considered that duplicate and original genes often diversify very rapidly, especially if the action of natural selection is involved. Are overlapping gene functions perhaps maintained because of their protective role against otherwise deleterious mutations? There are two principal objections against this hypothesis, which are the main subject of this article. First, because overlapping gene functions are maintained in populations by a slow process of “second order” selection, population sizes need to be very high for this process to be effective. It is shown that even in small populations, pleiotropic mutations that affect more than one of a gene’s functions simultaneously can slow the mutational decay of functional overlap after a gene duplication by orders of magnitude. Furthermore, brief and transient increases in population size may be sufficient to maintain functional overlap. The second objection regards the fact that most naturally occurring mutations may have much weaker fitness effects than the rather drastic “knock-out” mutations that lead to detection of partially redundant functions. Given weak fitness effects of most mutations, is selection for the buffering effect of functional overlap strong enough to compensate for the diversifying force exerted by mutations? It is shown that the extent of functional overlap maintained in a population is not only independent of the mutation rate, but also independent of the average fitness effects of mutation.
These results are discussed with respect to experimental evidence on redundant genes in organismal development.

Address for correspondence: Department of Biology, University of New Mexico, 167A Castetter Hall, Albuquerque, NM 87131-1091. E-mail: [email protected]

Genetics 154: 1389–1401 (March 2000)

OVERLAPPING or partially redundant gene functions are ubiquitous in genomes. Their presence spans the entire taxonomic range from unicellular organisms to mammals. They are observed in genes of any function, be it enzymatic, structural, or regulatory (Basson et al. 1986; Hoffman 1991; Lundgren et al. 1991; Higashijima et al. 1992; Goldstein 1993; Cadigan et al. 1994; Gonzáles-Gaitán et al. 1994; Li and Noll 1994a,b; Condie and Capecchi 1995; Goodson and Spudich 1995; Hanks et al. 1995). Weak or no phenotypic effects of a knock-out mutation in a gene thought to play an important biological role are often the first indicators of such redundancy. Further analysis then typically reveals one or more related genes with functions similar to that of the mutated gene. With some potential exceptions, partially overlapping gene functions can be traced to (often old) gene duplications.

While the evolutionary origins of most partially redundant gene functions are obvious, it is much less obvious why genes with overlapping functions have been retained, in some cases for hundreds of millions of years. This holds especially for many partially redundant genes in vertebrates, the result of ancient gene duplications in primitive chordates (Sharman and Holland 1996; Bailey et al. 1997). This is even more puzzling given that duplicate and original genes can diversify very rapidly after duplication (Cirera and Aguade 1998; Tsaur et al. 1998; Zhang et al. 1998), especially if the action of natural selection is involved. Thus, are overlapping gene functions merely the transient remnants of past gene duplication events, a snapshot of a diversification process that will eventually be complete? Or are they perhaps maintained because of their protective role against otherwise deleterious mutations? These questions are attracting increasing interest (Tautz 1992; Thomas 1993; Cooke et al. 1997), and for good reasons. Sheltered from some deleterious mutations, partially redundant genes may be important sources of evolutionary novelties on the biochemical level.

This article deals with the two perhaps most serious objections to the persistence of overlapping gene functions as a buffer against mutation. The first objection concerns the population size required to stably maintain overlapping gene functions in a population. Even neutral mutations in genes with overlapping functions are likely to lead to diversification of the gene’s functions. Counteracting this evolutionary force is a much slower process of “second order” selection for maintaining overlap (Wagner 1999), a process that acts on differences in functional overlap among genes. These differences are selectively neutral, but genes with greatly overlapping functions are less likely to undergo deleterious mutations than genes with distinct functions. Over time, genes with greater functional overlap will be preferentially retained in a population. Because of the indirect nature of this process, populations have to be large to sustain the amount of variation in overlap on which selection can act. A past contribution has shown that the required population sizes are not impossibly large (Wagner 1999). Here, it is shown that even in small populations, pleiotropic mutations that affect more than one of a gene’s functions simultaneously can delay functional divergence by orders of magnitude. Moreover, in contrast to the maintenance of genetic variation in fluctuating populations, the evolution of functional overlap may be predominantly influenced not by occasional population bottlenecks but by occasional population size bursts.

The second potential objection regards the fact that most naturally occurring mutations are likely to have much weaker fitness effects than the rather drastic “knock-out” mutations that lead to detection of partially redundant functions in the laboratory. Given weak fitness effects of many mutations, is the selection process outlined above strong enough to compensate for the diversifying forces exerted by mutations? It is shown that the evolution of mean fitness and functional overlap are effectively decoupled. This implies that the average effect of deleterious mutations on fitness does not greatly influence the evolution of functional overlap and vice versa. This also has implications for the genetic load associated with overlapping gene functions, as discussed below.

To address the above questions, this contribution utilizes a conceptually simple mathematical model that relies on three main assumptions (Wagner 1999). First, the greater the functional overlap among genes, the greater the fraction of mutations in either gene that are phenotypically neutral. Second, both neutral and nonlethal deleterious mutations cause gene functions to diverge. A third assumption, already implicit in the above discussion, is that overlap among gene functions is a quantifiable variable, at least in principle.

Figure 1. (a) A group of genes regulated by a transcription factor (TF). (b) After duplication of the transcription factor gene, the same genes are now regulated by both original and duplicate. (c) Over time, each factor may lose the ability to regulate some of the genes, such that only a subset of genes (hatched) are regulated by both factors.

While the first two assumptions are unproblematic, the third requires further comment. Two examples illustrate its validity. Overlapping gene functions are especially prominent among regulatory developmental genes, because these genes often have multiple functions. (Most housekeeping genes, many of which catalyze specific enzymatic reactions, may be restricted in their ability to evolve functional overlap. The specificity of their function may be part of the reason why housekeeping genes are rarer among the documented cases of functional overlap.) Consider the case of a transcription factor (Figure 1a), regulating the expression of multiple genes involved in a developmental process. Transcription factors occupy a prominent role in the many examples of genes with overlapping functions (Joyner et al. 1991; Weintraub 1993; Maconochie et al. 1996). After duplication (Figure 1b) and diversification (Figure 1c), the functional overlap among duplicate and original can be viewed as the subset of genes regulated by both factors. Microarray technology (Schena 1996) has made the quantification of such overlaps a realistic goal: perturb the activity of each factor in separate experiments, measure the resulting change in gene expression patterns, and assess which subset of genes is affected similarly. A second example concerns genes with identical biochemical functions but spatiotemporal expression
patterns that are not completely congruent (Li and Noll 1994a,b; Hanks et al. 1995). Here, the extent of spatiotemporal overlap in two expression domains can be taken as a measure of functional overlap. Recently developed automated image processing technology to quantify the concentration of multiple gene products in a developing embryo at single cell resolution enables such quantification (Kosman et al. 1998).

MODEL AND RESULTS

The model is concerned with a population of haploid, randomly mating organisms with nonoverlapping generations. The assumption of haploidy is chosen merely because it exposes most clearly the key evolutionary principles at work. The genic mutation rate is denoted as $\mu$ ($\mu \ll 1$). Neutral mutations that do not affect any aspect of the function of a gene product are not considered.

The central concept of the model is the notion of partially redundant or overlapping gene functions, and that the overlap in two genes’ functions can be quantified. Specifically, let the variable $r$, $0 \le r \le 1$, denote a measure of the functional overlap of two genes. If $r = 1$, then two genes have completely identical functions; if $r = 0$, there is no overlap in the genes’ functions.

If two genes have overlapping functions, then some mutations that eliminate one gene’s function will be neutral because the function is covered by the other gene. This is conceptualized in the following way: if $r = 1$ (identical functions), all mutations are neutral; if $r = 0$, no mutation is neutral; and if $0 < r < 1$, the rate of neutral mutations is some function $f(r)$. The simplest case is that of a linear relation, in which the rate of neutral mutations per two genes is given by $2\mu r$ (neglecting terms of order $\mu^2$). While this linear relation is used throughout most of this section, the consequences of relaxing this assumption are explored.

Not only can mutations affect fitness, but also they may change the functional overlap among two genes, which will in general lead to a divergence in gene functions. This is modeled by a conditional probability density $m_r(r^* \mid r)$, where $m_r(r^* \mid r)\,dr^*$ is the probability that the functional overlap $r^*$ after mutation of two genes with overlap $r$ before mutation lies in the interval $(r^*, r^* + dr^*)$. To leave the formalism as general as possible, only the minimal assumption that mutation reduces $r$ on average by a factor $\lambda_r$ is made, i.e.,

$$\int_0^1 r^*\, m_r(r^* \mid r)\,dr^* = \lambda_r r, \tag{1}$$

where $0 < \lambda_r < 1$. Because a value of $\lambda_r = 0.5$ means that each mutation reduces $r$ by 1/2 on average (a very rapid divergence rate), $\lambda_r > 0.5$ is used here. Complete loss-of-function mutations could be modeled as an important special case, but are not considered here, because their evolutionary dynamics have been extensively modeled by others (Ohno 1970; Nei and Roychoudhury 1973; Kimura and King 1979; Maruyama and Takahata 1981; Watterson 1983; Ohta 1987; Marshall et al. 1994; Walsh 1995). Also, empirical data suggest that the complete elimination of one gene’s function may be much less frequent than commonly assumed (Allendorf et al. 1975; Ferris and Whitt 1976, 1979; Nadeau and Sankoff 1997).

Summary of previous results: Because results from a previous contribution (Wagner 1999) are used below, they are briefly reviewed here. In this earlier article, it was assumed that all nonneutral mutations, occurring at a rate of $2\mu(1 - r)$ per two genes with overlap $r$, are effectively lethal. The evolution of the distribution of functional overlap, $p_t(r)$, in a population was studied under this assumption. The recurrence equation

$$p_{t+1}(r) = \frac{(1 - 2\mu)\,p_t(r) + 2\mu \int_0^1 z\,p_t(z)\,m_r(r \mid z)\,dz}{(1 - 2\mu) + 2\mu \int_0^1 z\,p_t(z)\,dz} \tag{2}$$

describes the distribution $p_t(r)$ of $r$ in generation $t$ of a large population ($N\mu > 50$; Wagner 1999). It can be shown that the mean functional overlap among two genes, $\bar{r} = \int_0^1 r\,p(r)\,dr$, reaches a nonzero mutation-selection equilibrium independent of the initial condition and approximated by

$$\bar{r}_\infty = \sqrt{\frac{\sigma_{mr}^2\,\lambda_r}{2\lambda_r^3 - 3\lambda_r^2 + 1}}. \tag{3}$$

Here, $\sigma_{mr}^2$ is the variance of the effect of mutations on $r$, defined as $\sigma_{mr}^2 = \int_0^1 (r^* - \lambda_r r)^2\, m_r(r^* \mid r)\,dr^*$. This approximation holds for values of $\bar{r}_\infty$ in the interior of (0,1), requires no further assumptions about the distribution of mutational effects, and is independent of the mutation rate. It is in good agreement with numerical results (Wagner 1999). When this model was extended to $n$ genes with overlapping functions, it was found that (i) the extent of overlap maintained in mutation-selection balance is statistically indistinguishable from that for the two-gene case, and (ii) linkage relations are not likely to favor or disfavor the evolution of overlap. These results motivate the restriction to the two-gene case with tight linkage analyzed below.

For small populations ($N\mu \ll 1$), it was found that functional overlap diminishes (functions diverge) at a rate approximated by

$$\frac{d\langle r_t \rangle}{dt} = \ln \lambda_r\, \frac{2\mu \langle r_t \rangle^2}{(1 - 2\mu) + 2\mu \langle r_t \rangle}. \tag{4}$$

Here, $\langle r_t \rangle$ denotes the mean functional overlap in an ensemble of small populations. In the absence of selection, mutations (1) would lead to an exponential decline in $\langle r_t \rangle$ from an initial value of one immediately after gene duplication. The most important feature of (4) is that the decay in $r$ is much slower than that: the lower the functional overlap between two genes, the lower the rate at which functions continue to diverge. The reason is that most mutations that affect $r$ are neutral for large $r$ but deleterious for small $r$. This leads to a reduction in $r$ following a polynomial rather than an exponential rate.

In sum, if all nonneutral mutations have severely deleterious effects, then (i) functional overlap and thus an increased rate of nondeleterious mutations can be maintained among genes in large populations, and (ii) functions diverge, albeit very slowly, in small populations.

Evolution of functional overlap and of mean fitness are decoupled: In this article the assumption of lethality of nonneutral mutations is relaxed. Specifically, fitness of an organism $w$ is interpreted as a probability of survival, and nonneutral mutations in an organism with fitness $w$ reduce fitness by a factor $\lambda_w$:

$$\int_0^1 w^*\, m_w(w^* \mid w)\,dw^* = \lambda_w w. \tag{5}$$

This formula, completely analogous to (1), allows for an increase in fitness provided that the variance of mutational effects on fitness $\sigma_{mw}^2 = \int_0^1 (w^* - \lambda_w w)^2\, m_w(w^* \mid w)\,dw^*$ is large, but also ensures that mutations reduce mean
fitness on average by a factor $\lambda_w$ if $\lambda_w < 1$. Of interest is the evolution of the joint distribution of functional overlap and fitness $p_t(r,w)$, as well as the moments $\langle r^i w^j \rangle_t = \int_0^1 \int_0^1 r^i w^j\, p_t(r,w)\,dr\,dw$. In a large population, $p_t(r,w)$ will evolve according to

$$p_{t+1}(r,w) = \frac{1}{D_t}\Big[(1 - 2\mu)\,w\,p_t(r,w) + 2\mu w \int_0^1 z_r\,p_t(z_r,w)\,m_r(r \mid z_r)\,dz_r + 2\mu w \int_0^1\!\!\int_0^1 (1 - z_r)\,p_t(z_r,z_w)\,m_w(w \mid z_w)\,m_r(r \mid z_r)\,dz_r\,dz_w\Big]. \tag{6}$$

The denominator

$$D_t = (1 - 2\mu)\langle w \rangle_t + 2\mu \langle rw \rangle_t + 2\mu\lambda_w\big(\langle w \rangle_t - \langle rw \rangle_t\big) \tag{7}$$

is a normalization factor representing the fraction of individuals surviving from one generation to the next.

In the numerator, $(1 - 2\mu)\,w\,p_t(r,w)$ is the fraction of individuals that do not undergo mutation in generation $t$ and that survive into the next generation.
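The normalization factor (7) can be read as the population average of a per-individual survival probability, with one contribution for each of the three terms of (6). The following minimal sketch illustrates this reading. It assumes that a mutation occurs with probability $2\mu$, is neutral with probability $r$, and otherwise reduces fitness to $\lambda_w w$ on average, as in (5). The function names and the example population are hypothetical and serve only as illustration.

```python
def survival_probability(r, w, mu, lam_w):
    """Expected survival probability of one individual with overlap r and fitness w."""
    unmutated = (1.0 - 2.0 * mu) * w                # no mutation; survives with probability w
    neutral = 2.0 * mu * r * w                      # mutation masked by the overlap; fitness unchanged
    nonneutral = 2.0 * mu * (1.0 - r) * lam_w * w   # mean fitness after mutation is lam_w * w (Eq. 5)
    return unmutated + neutral + nonneutral


def normalization_from_moments(population, mu, lam_w):
    """D_t of Eq. 7, computed from the moments <w> and <rw> of a list of (r, w) pairs."""
    n = len(population)
    mean_w = sum(w for _, w in population) / n
    mean_rw = sum(r * w for r, w in population) / n
    return ((1.0 - 2.0 * mu) * mean_w
            + 2.0 * mu * mean_rw
            + 2.0 * mu * lam_w * (mean_w - mean_rw))


if __name__ == "__main__":
    # Made-up (r, w) pairs, purely for illustration.
    population = [(0.1, 0.95), (0.3, 0.80), (0.5, 1.00), (0.9, 0.60)]
    mu, lam_w = 0.01, 0.9
    direct = sum(survival_probability(r, w, mu, lam_w) for r, w in population) / len(population)
    print(direct, normalization_from_moments(population, mu, lam_w))
```

Both numbers printed by the script agree up to rounding, because averaging the per-individual expression only involves the moments $\langle w \rangle_t$ and $\langle rw \rangle_t$.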

Figure 2. Mean fitness evolution and mean functional overlap in mutation-selection balance. (a) Mean functional overlap at mutation-selection equilibrium for $\lambda_w = 0.9$, i.e., nonneutral mutations reduce fitness on average by 10%. Dots and bars represent mean fitness and one standard deviation as obtained from Monte Carlo simulations. Mean equilibrium overlap scales linearly with $\sigma_{mr}/(1 - \lambda_r)$, where $\lambda_r$ is the mean reduction in functional overlap caused by mutation, and $\sigma_{mr}$ is the standard deviation of this reduction. The solid line represents the analytical prediction (3) for $\lambda_w = 0$, i.e., where all nonneutral mutations are lethal. Note that equilibrium overlap maintained is statistically indistinguishable from that in the lethal case (solid line). (b) One minus mean fitness (dots and bars) and correlation between fitness and functional overlap (solid line) in mutation-selection equilibrium. The dashed line represents $\sqrt{\mu}$. Note that despite the high mutation rate $\mu = 10^{-2}$ used here for reasons of computational feasibility, both correlation and deviation of mean fitness from one are still of order $\sqrt{\mu}$, as estimated analytically. (c and d) Identical to a and b, respectively, but for more severe effects of mutations on fitness ($\lambda_w = 0.3$). $N = 5 \times 10^4$, $\sigma_{mw} = 0.05$, $10^4$ generations.
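The linear scaling with $\sigma_{mr}/(1 - \lambda_r)$ noted in the caption follows from factoring the denominator of (3), since $2\lambda_r^3 - 3\lambda_r^2 + 1 = (1 - \lambda_r)^2(2\lambda_r + 1)$. The short sketch below evaluates (3) in both forms. The function names are hypothetical, and the formula is only meaningful where it yields a value in the interior of (0,1).

```python
import math


def equilibrium_overlap(lam_r, sigma_mr):
    """Eq. 3: approximate mean overlap at mutation-selection balance."""
    denom = 2.0 * lam_r**3 - 3.0 * lam_r**2 + 1.0
    return math.sqrt(sigma_mr**2 * lam_r / denom)


def equilibrium_overlap_factored(lam_r, sigma_mr):
    """Same quantity, using 2*lam^3 - 3*lam^2 + 1 = (1 - lam)^2 * (2*lam + 1)."""
    return sigma_mr / (1.0 - lam_r) * math.sqrt(lam_r / (2.0 * lam_r + 1.0))


if __name__ == "__main__":
    # lam_r = 0.9 and sigma_mr = 0.05 are values used in several of the simulations.
    print(equilibrium_overlap(0.9, 0.05))
    print(equilibrium_overlap_factored(0.9, 0.05))
```

For $\lambda_r = 0.9$ and $\sigma_{mr} = 0.05$ both forms give an equilibrium overlap of about 0.28. The mutation rate does not appear as an argument, in line with the statement that (3) is independent of $\mu$.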

The term $2\mu w \int_0^1 z_r\,p_t(z_r,w)\,m_r(r \mid z_r)\,dz_r$ is the fraction of individuals with fitness $w$ that undergo a neutral mutation changing $r$ from some value $z_r$ to $r$ and that survive into the next generation. Finally, $2\mu w \int_0^1\!\int_0^1 (1 - z_r)\,p_t(z_r,z_w)\,m_w(w \mid z_w)\,m_r(r \mid z_r)\,dz_r\,dz_w$ is the fraction of individuals that undergo a mutation affecting both fitness ($z_w \to w$) and functional overlap ($z_r \to r$) and that survive into the next generation. In both these terms, the factor $w$ in front of the integral represents the effect of selection, i.e., the probability that an individual of fitness $w$ survives into the next generation. Implicit in these equations are the assumptions that functional overlap influences the probability of a deleterious mutation, but not its severity, and that mutations occur before selection.

This rather formidable equation, while perhaps too complicated to solve analytically, can be used to obtain a rough estimate for the correlation $\rho(r,w)$ between fitness and overlap in mutation-selection balance. The estimate, derived in the appendix, suggests that this correlation is small; i.e.,

$$\rho(r,w) = \frac{\mathrm{Cov}(r,w)}{\sqrt{\mathrm{Var}(w)\,\mathrm{Var}(r)}} \ \text{is of order}\ \sqrt{\mu}. \tag{8}$$

This crude approximation (see appendix) is fully supported by numerical results, exemplified in Figures 2 and 3. Figure 2 shows that both in cases where mutations have mildly (Figure 2, a and b; $\lambda_w = 0.9$) and severely deleterious effects (Figure 2, c and d; $\lambda_w = 0.3$), the correlation between $r$ and $w$ is of order $\sqrt{\mu}$. This holds even in spite of the high mutation rates of $10^{-2}$ that were used for reasons of computational feasibility. Thus, in mutation-selection balance, there is only a weak correlation between fitness and functional overlap among genes. Moreover, the mean overlap observed both for strongly (Figure 2c) and for weakly (Figure 2a) deleterious mutations is statistically indistinguishable from the theoretical prediction (3) for the case where all mutations are lethal. Also, mean functional overlap at equilibrium is independent of the mutation rate when this rate is varied over two orders of magnitude (Figure 3b).

Figure 3. (a) Dependency of mean fitness at mutation-selection equilibrium, plotted as $1 - \bar{w}$, on $\lambda_w$, the average factor by which mutations reduce fitness. $N = 5 \times 10^4$, $\sigma_{mw} = 0.05$, $10^4$ generations simulated. Dots represent population means, bars one standard deviation. (b) Dependency of mean functional overlap at mutation-selection equilibrium on the mutation rate. Note the logarithmic scale for the mutation rate. Computational feasibility alone motivated the choice of range for $\mu$. Note that even for high mutation rates, equilibrium overlap is statistically indistinguishable from the analytical prediction (3), represented by the solid horizontal line. $\lambda_w = \lambda_r = 0.9$, $\sigma_{mr} = 0.05$, $\sigma_{mw} = 0.01$, $N = 500/\mu$ generations simulated.

Figure 4. Population size fluctuations and evolution of overlap. Shown are mean and standard error of overlap in an ensemble of 50 populations after 5000 generations of evolution. The simulation was started at a mean overlap of 0.28, which is the mean overlap at mutation-selection balance in a very large population, as predicted by (3) for the parameters used ($\lambda_r = 0.9$, $\lambda_w = 0$, $\sigma_{mr} = 0.05$, $\mu = 0.25$). Solid bars correspond to a population of constant size ($N_e = N$) such that $N_e\mu$ has the value indicated on the abscissa. Open bars correspond to populations of fluctuating size with the same value of $N_e\mu$ as for the adjacent solid bar, where each population spent $t_L = 250$ generations in a regime of large population size ($N_L = 10{,}000$) and $t_S = 250$ generations in a regime of small population size, $N_S$. $N_S$ was calculated from (9) as $N_S = fN_eN_L/(N_L - N_e(1 - f))$, with $f = t_S/(t_L + t_S)$. Mean overlap for $N_e\mu = 5$ and constant population size was equal to $1.5 \times 10^{-5}$, a value too small to appear on the plot.
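The small population size $N_S$ quoted in the Figure 4 caption can be reproduced from the harmonic-mean form of the inbreeding effective population size, $1/N_e = f/N_S + (1 - f)/N_L$, where $f$ is the fraction of time spent at $N_S$. The sketch below implements this relation and its inversion for $N_S$. The function names are hypothetical, and the example values are those given in the caption.

```python
def effective_size(n_small, n_large, f):
    """Harmonic-mean effective size for a fraction f of time at n_small and 1 - f at n_large."""
    return n_small * n_large / (f * n_large + n_small * (1.0 - f))


def small_size_for(n_e, n_large, f):
    """Invert the relation above for N_S, as in the Figure 4 caption."""
    denom = n_large - n_e * (1.0 - f)
    if denom <= 0:
        # N_e cannot reach n_large / (1 - f), however large N_S becomes.
        raise ValueError("no positive N_S yields this N_e")
    return f * n_e * n_large / denom


if __name__ == "__main__":
    t_small = t_large = 250
    f = t_small / (t_large + t_small)
    mu = 0.25
    n_e = 5 / mu  # N_e * mu = 5
    n_small = small_size_for(n_e, 10_000, f)
    print(n_small, effective_size(n_small, 10_000, f))
```

With these values the small regime has only about 10 individuals, which illustrates how strongly the harmonic mean is dominated by the small population size.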
Further, the difference between $\bar{w}$ and one is of the order $\mu$ regardless of the values of $\lambda_r$ (Figure 2, c and d) or $\lambda_w$ (Figure 3a). In sum, functional overlap at equilibrium is independent of the mutation rate and of the severity of mutational effects on fitness ($\lambda_w$). Conversely, mean equilibrium fitness is not sensitive to changes in the rate at which mutations lead to a decay of redundancy ($\lambda_r$). In this sense, evolution of redundancy and fitness are decoupled. The reasons for this are discussed below.

Transient population size bursts may be sufficient to maintain redundancy: While the maintenance of functional overlap by means of natural selection requires large populations ($N\mu \gg 1$), overlap may decay relatively slowly in small populations (Wagner 1999). This prompts a question as to what fraction of time a population of fluctuating size has to spend in a large $N\mu$ regime for overlap to be maintained by selection. It might appear that the concept of effective population size (Hartl and Clark 1997) from population genetic theory holds the answer to this question. For populations that spend a fraction $f$ of their time in a regime of small $N\mu$ by having a small population size $N_S$, and a fraction $1 - f$ in a regime of large $N\mu$ by having a large $N_L$, one calculates the (inbreeding) effective population size as

$$N_e = \frac{N_S N_L}{fN_L + N_S(1 - f)}. \tag{9}$$

A population of fluctuating size with a given $N_e$ should then behave in exactly the same manner as a population of constant size $N = N_e$. Numerical results from Figure 4 indicate that this is not the case for the evolutionary process studied here, because fluctuating and constant populations with the same $N_e$ show different mean overlap.

Figure 5 shows more extensive numerical results for the evolution of overlap under fluctuating $N\mu$, where a population spends alternatingly $t_S$ generations in a regime of small $N\mu$, in which redundancy will decay under the influence of drift, and $t_L$ generations in a regime of large $N\mu$, where selection will maintain functional overlap. The figure shows mean functional overlap in a population ensemble after $10^5$ generations, as a function of the ratio $t_S/t_L$ of time spent in the small $N\mu$ regime. For comparison, a dashed line shows $N_e\mu$
as calculated via (9), where $f = t_S/(t_L + t_S)$. Note that $N_e\mu$ is smaller than one for all the values of $t_S$ and $t_L$, such that drift should dominate the dynamics of overlap (Figure 7 in Wagner 1999). However, for many of the values of $t_S$ and $t_L$ in Figure 5, populations maintain levels of overlap indistinguishable from or close to those in mutation-selection balance for infinite populations, despite the fact that they spend only a small fraction of time in the large $N\mu$ regime. Whereas the absolute amount of time $t_L$ spent per cycle in the large $N\mu$ regime is clearly important in determining how much overlap is maintained, it is equally clear from Figure 5 that substantial overlap is maintained even for transient bursts in $N\mu$. For instance, Figure 5a shows that functional overlap close to that observed in mutation-selection equilibrium (infinite population size) is maintained if the population spends only one-hundredth of its time, or 100 out of 10,000 generations, in the large $N\mu$ regime. A caveat to these results is that computational limitations in simulating large populations necessitated modulating $\mu$ rather than $N$ to simulate changes in $N\mu$. Note also that the specific values of $t_S$ and $t_L$ necessary to maintain overlap will depend on multiple factors, such as the location of the mutation-selection equilibrium, the variance in overlap introduced by mutation, and probably even the selection regime (soft selection vs. hard selection; Hartl and Clark 1997). Unfortunately, an analytical approach to this problem, which could both circumvent computational limitations and account for these factors, is currently not at hand.

Figure 5. Influence of fluctuations in $N\mu$ on functional overlap. Dots and bars correspond to mean and standard deviation of functional overlap $\langle r \rangle$ among two genes in an ensemble of 50 populations after $10^5$ generations of simulated evolution with fluctuating population size. The simulation was started at a mean overlap corresponding to the upper horizontal line, which is the mean overlap at mutation-selection balance for a very large population from (3). The lower horizontal line is the value to which functional overlap would decline from this initial value after $10^5$ generations if the population ensemble was under the sole influence of drift for the parameters used here. In the simulation, the population ensemble spent alternatingly $t_S$ generations in a regime of low $N\mu$ (dominated by drift) and $t_L$ generations in a regime of high $N\mu$ (dominated by selection). This cycling was repeated until $10^5$ generations had elapsed and was continued after that until the end of the next period of small $N\mu$ was reached. At that time, ensemble overlap statistics were evaluated. The dashed curve represents the value of $N_e\mu$ as a function of $t_S$ and $t_L$, where $N_e$ is the inbreeding effective population size calculated from $t_S$ and $t_L$. For the entire range of this plot, the values of $N_e\mu$ suggest that the evolutionary dynamics should be dominated by drift, such that the mean ensemble overlap should be close to the lower horizontal line (Wagner 1999, Figure 7). However, as long as >50 generations are spent in each cycle of large population size ($t_L \ge 50$), the influence of selection continues to be appreciable, even if only a small fraction of time is spent in the large population regime. Only for smaller $t_L$ does drift dominate if $t_S/t_L > 50$. For reasons of computational feasibility, it was the mutation rate and not the population size that was varied. $N = 1000$, $\mu_S = 0.0005$, $\mu_L = 0.5$, $\lambda_w = 0$, $\lambda_r = 0.9$, $\sigma_{mr} = 0.05$.

Why is $N_e$ not a reliable indicator for the evolution of overlap (Figure 4) in fluctuating populations? Even in the absence of analytical predictions for the change of $r$ in fluctuating populations, it is safe to say that this must have to do with the different evolutionary dynamics of genetic variation and functional overlap. Genetic variation gets lost much faster in small populations than in large populations. I surmise that this is not true in the case of functional overlap. Whereas the amount of genetic variation maintained in a fluctuating population may be most heavily influenced by the population bottlenecks, the amount of functional overlap may be most heavily influenced by the largest population sizes.

Pleiotropic effects slow down divergence of gene functions: Among the many possible relations of functional overlap $r$ and the probability $f(r)$ that a mutation leading to a loss of one or more functions in one gene is phenotypically neutral, only $f(r) = r$ has been explored so far. The following scenario illustrates the role that pleiotropic mutations may have in this relation and motivates a more general mathematical form for $f$. Consider the duplicate and the original copy of a gene shortly after a gene duplication, where both copies carry out each of $k$ functions. These could be $k$ spatiotemporal expression domains, in which each of the two genes is involved in a developmental process, or $k$ downstream genes regulated by each of two genes encoding a transcription factor (see Figure 1). Assume that, over time, mutations randomly eliminate the capability to carry out some of these functions in each of the genes, such that only a fraction $r = k_r/k$ of the original $k$ functions is still performed by both genes. In this context, mutations with pleiotropic effect can be viewed as those that eliminate more than one, say $l$, function. If $k$ is sufficiently large, then the probability that such a pleiotropic mutation is neutral is approximated by $r^l$. As a consequence, for a given overlap $r$, the probability $f(r)$ that a mutation eliminating $l$ functions is neutral is given by $f(r) = r^l$. In other words, the more extensive pleiotropic effects are, the smaller the protective effect against mutations afforded by a given overlap $r$. Only in the absence of pleiotropic effects does the simple relation $r = f(r)$ hold. In practice, mutations span a range of pleiotropic effects, suggesting that $f(r) < r$, under the constraints that $f(1) = 1$ and $f(0) = 0$. Below, the effect on the evolution of redundancy of a moderate degree of pleiotropy ($l = 2, 3$) via the relation $f(r) = r^l$ is explored (Figure 6a).

The evolution of $\langle r_t \rangle$, the mean functional overlap in a population ensemble evolving under the influence of genetic drift, can be approximated by an ordinary differential equation

$$\frac{d\langle r_t \rangle}{dt} = \ln \lambda_r\, \frac{2\mu \langle r_t \rangle^{1+l}}{(1 - 2\mu) + 2\mu \langle r_t \rangle^{l}}, \tag{10}$$

which is a simple extension of (4) to $l \ge 1$. Here, one unit of time corresponds to one (discrete) generation, and $l \ge 1$ is the exponent in the function $f(r) = r^l$. A solution to this equation can be obtained as an implicit function by a separation of variables, yielding

$$\frac{1}{\ln \lambda_r}\left[\left(\frac{1 - 2\mu}{2\mu}\right)\left(\frac{1}{l}\right)\left(1 - \frac{1}{\langle r_t \rangle^{l}}\right) + \ln \langle r_t \rangle\right] - t = 0$$

for the initial condition $\langle r_0 \rangle = 1$. From this form of the solution, one can directly determine the time $\tau_x$ necessary for $\langle r \rangle$ to decrease from one shortly after a gene duplication to a value of $x$. Let us consider the special case of the times $t_k = \tau_{(1/2)^k}$ necessary to reduce $\langle r_t \rangle$ to $(1/2)^k$. One can show that

$$t_{k+1} - t_k = -\frac{1}{\ln \lambda_r}\left[\left(\frac{1 - 2\mu}{2\mu}\right)\left(\frac{2^{lk}}{l}\right)\left(2^l - 1\right) + \ln 2\right] \propto \frac{2^{lk}\left(2^l - 1\right)}{l}. \tag{11}$$

The times $t_{k+1} - t_k$ necessary for successive halving of $\langle r_t \rangle$ not only scale as $2^k$, i.e., each consecutive halving of functional overlap takes twice as long, they also scale as $2^l$. Thus, how fast functional overlap will decay depends critically on the extent of pleiotropy of mutations. Figure 6b shows the decrease of functional overlap from $\langle r_0 \rangle = 1$ obtained from a numerical solution of (10) (solid lines) and from Monte Carlo simulations (dots and bars) for realistic mutation rates ($\mu = 2.5 \times 10^{-6}$) and small populations ($N = 100$). The decrease in mean ensemble overlap clearly slows down as the extent of pleiotropy increases. However, over longer evolutionary timescales than those depicted here, differences among the different degrees of pleiotropy are more dramatic than Figure 6b suggests. Figure 6c shows, on a linear-log scale, the times $\tau_{1/2}$, $\tau_{1/4}$, and $\tau_{1/8}$ obtained from (11) as a function of $l$. It shows that (i) for an increase of $l$ from 1 to only 3, each of these times increases by a factor 10 to 100 and (ii) for $l = 3$, the time to decrease $\langle r \rangle$ by each additional factor of 1/2 increases by approximately a factor 10.

In sum, even moderate degrees of pleiotropy lead to a severe constraint on how much genetic drift can reduce functional overlap among genes. The intuitive explanation is simple: mutations with pleiotropic effects are more and more likely to eliminate nonredundant functions as the functional overlap between two genes decreases. Because such mutations are deleterious and are eliminated from the population, the decay of overlap via neutral mutations and drift is slowed down. If most mutations have effects on many of a gene’s functions, one may observe significant functional overlap (congruent expression patterns, etc.) maintained over long evolutionary timescales. This is because most mutations will also affect the few unique functions of a gene and thus be deleterious. In this case, extensive functional overlap may not indicate effective buffering against mutations.

DISCUSSION

The model explored here rests on three simple assumptions: (i) genes have functions that overlap in a quantifiable way; (ii) overlapping functions affect the probability of mutations being neutral; and (iii) mutations, on average, reduce functional overlap. As in any mathematical model, simplifications and abstractions were made. A haploid system has been modeled to illustrate the issues discussed below most clearly.

Purifying and positive selection: The model does not incorporate advantageous mutations leading to novel functions. Instead, its restriction to purifying selection and neutral mutations reflects an implicit assumption that genes diversify mostly by (i) loss of some of their functions and, thus, functional specialization, or (ii) acquisition of new functions by neutral mutations. Whether this is the case may well depend on the type of genes considered. For instance, genes involved in self-recognition or reproductive functions are more likely to be subject to positive selection for diversification (Cirera and Aguade 1998; Ting et al. 1998; Tsaur et al. 1998). On the other hand, many of the most striking examples of partial redundancies come from genes embedded in highly conserved developmental pathways, such as that of muscle determination (Weintraub 1993). In these cases, there may be a limited potential for the evolution of novel functions, such that divergence by specialization may be prevalent (Force et al. 1999). Corroborating evidence comes from experiments where a gene from an invertebrate (before duplication) can substitute for the function of one of its duplication products in very distantly related vertebrates, or vice versa. If the vertebrate gene had evolved radically new functions, such substitution might not be possible. Examples include nautilus and decapentaplegic from Drosophila melanogaster and their respective counterparts MyoD and BMP in mammals (Venuti et al.

evolved (Crow and Kimura 1965; Kondrashov and Crow 1991; Maynard Smith 1978; Perrot et al. 1991; Otto and Goldstein 1992). On the other hand, diploids carry a higher mutational load (mean fitness reduction due to deleterious mutations) than haploids, because they carry twice as many genes (e.g., Crow and Kimura 1965). The higher mutational load counteracts the effects of masking, raising the question of which of these two forces wins out in the evolution of ploidy levels.

Figure 6. Pleiotropy slows the decay of functional overlap. (a) The function $f(r) = r^l$ illustrates how the probability of neutral mutations for two genes of a given functional overlap $r$ is influenced by the number of functions $l$ affected per mutation. (b) Evolution of functional overlap $\langle r \rangle$ in a population ensemble (20 populations) under the influence of drift as a function of the degree of pleiotropy $l$. Dots and bars represent means and standard deviations as obtained from a Monte Carlo simulation. The solid lines are numerical solutions of (10); $\lambda_r = 0.5$, $\sigma_{mr} = 0.05$, $\lambda_w = 0$, $\mu = 2.5 \times 10^{-6}$. As $l$ is increased from 1 to 3, the decay of $r$ slows. (c) Times $\tau_x$ until $\langle r \rangle$ decreases from 1 to $x$ as a function of the degree of pleiotropy $l$ as calculated from (10). A linear increase in $l$ leads to an exponential increase in $\tau_x$; $\lambda_r = 0.9$, $\sigma_{mr} = 0.05$, $\lambda_w = 0$, $\mu = 2.5 \times 10^{-6}$, $N = 100$.
The question has been studied by various authors, 1991; Padgett et al. 1993). However, the restriction to mostly in the framework of modifier theory (Perrot et divergence by neutral mutations should not be read as al. 1991; Bengtsson 1992; Otto and Goldstein 1992). an exclusive endorsement of one evolutionary scenario. The answer depends on details of parameters such as It is rather an acknowledgment that this scenario is recombination rates and degree of . relevant for many developmental genes. There are key differences in the questions studied Mutational load and deleterious mutations: For the here from those at issue in existing models of the evolu- model studied here, mean fitness and mean functional tion of ploidy levels. First, the increase in frequency of overlap among genes are only weakly correlated at muta- original gene and duplicate (corresponding to the rise tion-selection balance (Figure 2). The reason is the fol- of diploidy from haploid organisms) is not at issue here. lowing. If organisms in a population have different fit- Fixation of gene duplications must be a ubiquitous phe- nesses, selection acts immediately on these differences, nomenon, because a vast number of known proteins e.g., via differential survival probabilities. In contrast, if fall into a small number of gene families. Theoretical some organisms in a population have genes with lower work suggests that such fixation is easily accomplished functional overlap than other organisms, these differ- by neutral evolution (even without any selective advan- ences are selectively neutral until mutations occur at tage through “masking”) provided that duplications oc- these genes, which takes of the order of 1/␮ generations cur at a finite rate (Clark 1994). At issue is rather the per individual. In this case, the genes with the higher question if and how functional overlap is maintained, functional overlap r are less likely to undergo deleteri- once established. 
A second distinction stems from the ous mutations (which are then rapidly eliminated by notion of deleterious mutations. It is best illustrated selection), and they therefore accumulate slowly in the with the example of a set of genes regulated by one population. Thus, one might think of the evolution of whose gene has undergone gene mean fitness and of mean overlap as occurring on differ- duplication and subsequent loss of some functions be- ent timescales, and this is what causes them to be effec- tween original and duplicate (Figure 1). As long as all tively decoupled. genes are properly regulated by at least one of the two As a consequence, mean functional overlap is inde- factors (Figure 1c), no fitness disadvantage will result. pendent of the fitness effects of deleterious mutations However, from the perspective of a classical population (Figure 2). Whether deleterious mutations have only genetic model, all transcription factor genes that have moderate fitness effects, or whether they are invariably lost the ability to regulate at least one gene in the set lethal, they are eliminated at a rate much faster than would harbor “deleterious” mutations, because if the that at which the evolution of overlap occurs. Con- complementing their missing functions (at a dif- versely, mean fitness at equilibrium differs from 1 only ferent in the case of duplicate genes, at the same by an amount that is of the order of the mutation rate, locus on a sister in the case of diploidy) similar to what would be expected from a haploid two- did not occur in the same organism, the organism would locus model with recurrent mutations. This is because have reduced fitness. This creates the somewhat para- overlap influences only the rate of deleterious mutations doxical situation where an entire gene pool may consist (1 Ϫ r)␮. Whether r is large or small, the reduction in of deleterious alleles that nevertheless complement mean fitness from 1 is thus still of the order of ␮. 
An each other functionally, so that no organism with re- important corollary is that overlapping gene functions duced fitness exists in the population. Thus, where are not maintained because they convey a higher mean multifunctional proteins may undergo specialization fitness on a population with greater functional overlap. after gene duplications via loss of some of their func- In this sense maintenance of functional overlap is not tions, the notion of intrinsically deleterious mutations an adaptive phenomenon (Futuyma 1998). may not be a natural one. It may be more expedient to Evolution of diploidy and redundant gene functions: call a mutation deleterious only if it actually affects fit- A phenomenon superficially related to the evolution of ness. However, even if the notion of “intrinsically” dele- redundancy is the evolution of diploidy. Diploids are terious mutations is abandoned, the question remains able to “mask” the fitness effects of deleterious alleles, whether overlapping gene functions influence mean which may have been an important reason why diploidy fitness (mutational load) in mutation-selection balance. 1398 A. Wagner

The answer is yes, but to no significant extent. In the simple case where all deleterious mutations are lethal, mean equilibrium fitness is $1 - 2\mu$ if two genes have completely diverged in function; it is $1 - 2\mu(1 - r)$ if mean equilibrium overlap is equal to $r$. Thus, differences in mutational load are small. This result will not be greatly affected if deleterious mutations have less severe effects, because mean equilibrium fitness is still of order $1 - \mu$.

Decoupling of the evolution of overlap and redundancy shows that $\lambda_r$ and its associated standard deviation of mutational effects $\sigma_{mr}$ are the only parameters determining the evolution of functional overlap. Although functional overlap can be measured, as discussed above, such measurements are not available, and thus it is difficult to estimate these parameters from experimental data. As far as $\sigma_{mr}$ is concerned, it is perhaps most important that some mutations must increase functional overlap among genes for selection to sustain overlap (Wagner 1999). Ample indirect evidence that this is the case comes from the many examples of functionally convergent proteins in various organisms (Doolittle 1994). Measurement of $\lambda_r$ is hampered by two additional factors, the significant stochastic component in the evolution of overlap (see below) and its dependency on the nature of the genes considered.

Population size fluctuations and redundancy: Natural populations differ from the constructs of theoretical population genetics in many ways, e.g., mating does not occur at random, individuals show a clumped spatial distribution, and populations fluctuate in size. Such deviations can be dealt with effectively by calculating an effective population size $N_e$ according to standard formulas (Hartl and Clark 1997). A recent survey on the relation of $N_e$ and $N$ in a large number of vertebrate and invertebrate populations suggests that fluctuations in population size are by far the most important factor influencing $N_e$ (Frankham 1995).

Population size is a key factor in determining whether selection can maintain functional overlap among genes. Only if the influence of drift is weak, i.e., in populations with large $N\mu$, can overlap be maintained indefinitely, or even be built up from disjoint functions (Wagner 1999). Perhaps surprisingly, $N_e$ does not seem to be sufficient to predict the dynamics of overlap in fluctuating populations (Figure 4). With some caveats due to computational limitations and the absence of analytical predictions, the results shown in Figure 5 suggest that short spikes (e.g., 50–100 generations every 10,000 generations) in $N\mu$ may sometimes be sufficient to sustain high levels of functional overlap. The reason for the inadequacy of $N_e$ to predict the dynamics of overlap has to do with the different evolutionary dynamics of genetic variation (for which $N_e$ was designed) and functional overlap. Whereas genetic variation gets lost much faster in small populations than in large populations, this may not be true for functional overlap, which may be lost slowly in small populations. It would clearly be desirable to have a measure of effective population size suitable to predict the decay of redundancy in fluctuating populations.

Population sizes with $N\mu \gg 1$ are realistic for most microorganisms and some invertebrates. However, even in higher vertebrates, census population sizes are sometimes well within this realm, although population sizes may undergo frequent bottlenecks. For example, a study by Avise et al. (1988) reports a discrepancy between current population sizes and historical population sizes in three vertebrate species including catfish (Arius felis) and redwing blackbirds (Agelaius phoeniceus). Census population sizes suggest a number of $10^7$ and $2 \times 10^7$ breeding females, respectively. However, data on mitochondrial DNA variation suggest that past population sizes must have been substantially smaller (Avise et al. 1988). Nevertheless, even if large population sizes are sustained over only short periods of time, functional overlap among genes might be maintained in such populations. More data on population size history would be required to determine whether population bursts are frequent enough even in organisms normally characterized by small $N$.

An observation unrelated to the significance of population fluctuations can be made from the results shown here. "Error" bars shown in Figure 5 represent the standard deviation of functional overlap among populations within a population ensemble. Because these populations spend a significant amount of time in the small $N\mu$ regime, they are monomorphic most of the time. This means that the standard deviation shown is a good measure of differences in functional overlap among populations evolving in parallel. These differences are obviously large (Figure 5), which is due to variation in the number of mutations reducing overlap that go to fixation. In some populations, several such mutations may have gone to fixation, in others only a few. This holds not only for populations evolving in parallel, but would also apply to multiple gene pairs evolving in parallel in one lineage. The importance of stochastic effects may help explain why some developmental genes duplicated early during chordate evolution have preserved greater functional overlap than others. Examples might include members of the MyoD gene family (Weintraub 1993), where significant overlap in functions has been maintained, vs. members of the hedgehog gene family, which have almost completely diverged in expression pattern (Bitgood and McMahon 1995).

Pleiotropy slows the decay of overlap in small populations: A large fraction of an organism's genomic gene content may be expressed during the development of each organ system (Thaker and Kankel 1992; Datta et al. 1993). This suggests that each developmental gene participates in many developmental processes, in line with evidence from many well-studied genes, which are expressed during more than one developmental stage.

Such developmental multifunctionality is complemented by biochemical multifunctionality of proteins. Transcription factors exemplify this well (Figure 1). If each gene regulated by a transcription factor is viewed as one function of that factor, then most transcription factors are multifunctional. The abundance of multifunctionality in regulatory genes, both developmentally and biochemically, suggests that pleiotropic effects of mutations should be pervasive, as Wright (1968) already foresaw. In other words, most mutations may affect more than one function of a developmental gene. Now what does this have to do with the evolution of functional overlap?

Envision a scenario such as that shown in Figure 1, where a transcription factor regulating the expression of multiple genes undergoes gene duplication, and subsequently the original and the copy lose some of their functions. In small populations, functional overlap will decay according to (4). The key feature of this equation is that the reduction of overlap proceeds more and more slowly as more and more functions get lost. This is because as the number of functions not covered by the other genes increases, the likelihood that a loss-of-function mutation affects a unique function and is deleterious increases as well. This slowing down in evolution is quite drastic. Pleiotropic mutational effects contribute further to this phenomenon. A simple argument presented above suggests that if a mutation affects $l$ functions at a time, then the probability of a mutation in one of two genes with overlap $r$ being neutral is of order $r^l$. Thus, if pleiotropic mutations affect a sufficiently large number $l$ of functions, then mutations will often have deleterious effects even if $r$ is large. This should lead to a further slowing down of the decay of redundancy, an intuition confirmed by analytical and simulation results (Figure 6). As the number of functions affected by a mutation increases linearly, the time until overlap decays from one to a given value increases exponentially (Figure 6c).

Knock-out mutations (a severe kind of genetic perturbation) of many developmental genes in vertebrates have weak phenotypic effects. This suggests that significant functional overlap has been maintained among the genes in question, many of which are the product of ancient gene duplications having occurred sometime before the origin of tetrapods approximately 400 million years ago (Wray et al. 1996). Such preservation of functional overlap is not the "default" fate of duplicated genes, as several recent studies on the rapid functional diversification of duplicated genes show (Cirera and Aguade 1998; Tsaur et al. 1998; Zhang et al. 1998). Thus, it is likely that some evolutionary force is responsible for its preservation. Natural selection is likely to play an important role in this process, either actively maintaining overlap in large populations or passively delaying its decay in small ones. Even if selection only fulfills this second role, its importance must not be underestimated given the vast amount of time that has elapsed since these duplications occurred.

LITERATURE CITED

Allendorf, F. W., F. M. Utter and B. P. May, 1975 Gene duplication within the family Salmonidae. II. Detection and determination of the genetic control of duplicate loci through inheritance studies and the examination of populations, pp. 415–432 in Isozymes IV: Genetics and Evolution, edited by C. L. Markert. Academic Press, New York.
Avise, J. C., R. M. Ball and J. Arnold, 1988 Current versus historical population sizes in vertebrate species with high gene flow: a comparison based on mitochondrial DNA lineages and inbreeding theory for neutral mutations. Mol. Biol. Evol. 5: 331–344.
Bailey, W. J., J. Kim, G. P. Wagner and F. H. Ruddle, 1997 Phylogenetic reconstruction of vertebrate Hox cluster duplications. Mol. Biol. Evol. 14: 843–853.
Basson, M. E., M. Thorsness and J. Rine, 1986 Saccharomyces cerevisiae contains two functional genes encoding 3-hydroxy-3-methylglutaryl-coenzyme A reductase. Proc. Natl. Acad. Sci. USA 83: 5563–5567.
Bengtsson, B. O., 1992 Deleterious mutations and the origin of the meiotic ploidy cycle. Genetics 131: 741–744.
Bitgood, M. J., and A. P. McMahon, 1995 Hedgehog and Bmp genes are coexpressed at many diverse sites of cell-cell interaction in the mouse embryo. Dev. Biol. 172: 126–138.
Cadigan, K. M., U. Grossniklaus and W. J. Gehring, 1994 Functional redundancy: the respective roles of the two sloppy paired genes in Drosophila segmentation. Proc. Natl. Acad. Sci. USA 91: 6324–6328.
Cirera, S., and M. Aguade, 1998 Molecular evolution of a duplication: the sex-peptide (Acp70A) gene region of Drosophila subobscura and Drosophila madeirensis. Mol. Biol. Evol. 15: 988–996.
Clark, A. G., 1994 Invasion and maintenance of a gene duplication. Proc. Natl. Acad. Sci. USA 91: 2950–2954.
Condie, B. G., and M. R. Capecchi, 1995 Mice with targeted disruptions in the paralogous Hox genes Hoxa-3 and Hoxd-3 reveal genetic interactions. J. Cell. Biochem. Suppl. O: A7–A109.
Cooke, J., M. A. Nowak, M. Boerlijst and J. Maynard-Smith, 1997 Evolutionary origins and maintenance of redundant gene expression during metazoan development. Trends Genet. 13: 360–364.
Crow, J. F., and M. Kimura, 1965 Evolution in sexual and asexual populations. Am. Nat. 99: 439–450.
Datta, S., K. Stark and D. R. Kankel, 1993 Enhancer detector analysis of the extent of genomic involvement in nervous system development in Drosophila melanogaster. J. Neurobiol. 24: 824–841.
Doolittle, R. F., 1994 Convergent evolution: the need to be explicit. Trends Biochem. Sci. 19: 15–18.
Ferris, S. D., and G. S. Whitt, 1976 Loss of duplicate gene expression after polyploidization. Nature 265: 258–260.
Ferris, S. D., and G. S. Whitt, 1979 Evolution of the differential regulation of duplicate genes after polyploidization. J. Mol. Evol. 12: 267–317.
Force, A., M. Lynch, F. B. Pickett, A. Amores, Y. L. Yan et al., 1999 Preservation of duplicate genes by complementary, degenerative mutations. Genetics 151: 1531–1545.
Frankham, R., 1995 Effective population size/adult population size ratios in wildlife: a review. Genet. Res. 66: 95–107.
Futuyma, D., 1998 Evolutionary Biology, Ed. 3. Sinauer, Sunderland, MA.
Goldstein, L. S. B., 1993 Functional redundancy in mitotic force generation. J. Cell Biol. 120: 1–3.
Gonzáles-Gaitán, E. M., M. Rothe, E. A. Wimmer, H. Taubert and H. Jäckle, 1994 Redundant functions of the genes knirps and knirps-related for the establishment of anterior Drosophila head structures. Proc. Natl. Acad. Sci. USA 91: 8567–8571.
Goodson, H. V., and J. A. Spudich, 1995 Identification and molecular characterization of a yeast myosin I. Cell Motil. Cytoskel. 30: 73–84.
Hanks, M., W. Wurst, L. Anson-Cartwright, A. Auerbach and A. L. Joyner, 1995 Rescue of the En-1 mutant phenotype by replacement of En-1 with En-2. Science 269: 679–682.

Hartl, D. L., and A. G. Clark, 1997 Principles of Population Genetics, Ed. 3. Sinauer, Sunderland, MA.
Higashijima, S., T. Michiue, Y. Emori and K. Saigo, 1992 Subtype determination of Drosophila embryonic external sensory organs by redundant homeo box genes BarH1 and BarH2. Genes Dev. 6: 1005–1018.
Hoffmann, F. M., 1991 Drosophila abl and genetic redundancy in signal transduction. Trends Genet. 7: 351–355.
Joyner, A. L., K. Herrup, B. A. Auerbach, C. A. Davis and J. Rossant, 1991 Subtle cerebellar phenotype in mice homozygous for a targeted deletion of the en-2 homeobox. Science 251: 1239–1243.
Kimura, M., and J. L. King, 1979 Fixation of a deleterious allele at one of two "duplicate" loci by mutation pressure and random drift. Proc. Natl. Acad. Sci. USA 76: 2858–2861.
Kondrashov, A. S., and J. F. Crow, 1991 Haploidy or diploidy: Which is better? Nature 351: 314–315.
Kosman, D., J. Reinitz and D. H. Sharp, 1998 Automated assay of gene expression at cellular resolution, in Pacific Symposium on Biocomputing 1998, edited by R. B. Altman, K. Dunker, L. Hunter and T. E. Klein. World Scientific, Singapore.
Li, X., and M. Noll, 1994a Compatibility between enhancers and promoters determines the transcriptional specificity of gooseberry and gooseberry neuro in the Drosophila embryo. EMBO J. 13: 400–406.
Li, X., and M. Noll, 1994b Evolution of distinct developmental functions of three Drosophila genes by acquisition of different cis-regulatory regions. Nature 367: 83–87.
Lundgren, K., N. Walworth, R. Booher, M. Dembski, M. Kirschner et al., 1991 mik1 and wee1 cooperate in the inhibitory tyrosine phosphorylation of cdc2. Cell 64: 1111–1112.
Maconochie, M., S. Nonchev, A. Morrison and R. Krumlauf, 1996 Paralogous Hox genes: function and regulation. Annu. Rev. Genet. 30: 529–556.
Marshall, C. R., E. C. Raff and R. A. Raff, 1994 Dollo's law and the death and resurrection of genes. Proc. Natl. Acad. Sci. USA 91: 12283–12287.
Maruyama, T., and N. Takahata, 1981 Numerical studies of the frequency trajectories in the process of fixation of null genes at duplicated loci. Heredity 46: 49–57.
Maynard Smith, J., 1978 The Evolution of Sex. Cambridge University Press, New York.
Nadeau, J. H., and D. Sankoff, 1997 Comparable rates of gene loss and functional divergence after genome duplications early in vertebrate evolution. Genetics 147: 1259–1266.
Nei, M., and A. K. Roychoudhury, 1973 Probability of fixation of nonfunctional genes at duplicate loci. Am. Nat. 107: 362–372.
Ohno, S., 1970 Evolution by Gene Duplication. Springer, New York.
Ohta, T., 1987 Simulating evolution by gene duplication. Genetics 115: 207–213.
Otto, S. P., and D. B. Goldstein, 1992 Recombination and the evolution of diploidy. Genetics 131: 745–751.
Padgett, R. W., J. M. Wozney and W. M. Gelbart, 1993 Human BMP sequences can confer normal dorsal-ventral patterning in the Drosophila embryo. Proc. Natl. Acad. Sci. USA 90: 2905–2909.
Perrot, V., S. Richerd and M. Valero, 1991 Transition from haploidy to diploidy. Nature 351: 315–317.
Schena, M., 1996 Genome analysis with gene expression microarrays. Bioessays 18: 427–431.
Sharman, A. C., and P. W. H. Holland, 1996 Conservation, duplication, and divergence of developmental genes during chordate evolution. Neth. J. Zool. 46: 47–67.
Tautz, D., 1992 Redundancies, development and the flow of information. BioEssays 14: 263–266.
Thaker, H. M., and D. R. Kankel, 1992 Mosaic analysis gives an estimate of the extent of genomic involvement in the development of the visual system in Drosophila melanogaster. Genetics 131: 883–894.
Thomas, J. H., 1993 Thinking about genetic redundancy. Trends Genet. 9: 395–398.
Ting, C. T., S. C. Tsaur, M. L. Wu and C. I. Wu, 1998 A rapidly evolving homeobox at the site of a hybrid sterility gene. Science 282: 1501–1504.
Tsaur, S. C., C. T. Ting, and C. I. Wu, 1998 Positive selection driving the evolution of a gene of male reproduction, Acp26Aa, of Drosophila: II. Divergence versus polymorphism. Mol. Biol. Evol. 15: 1040–1046.
Venuti, J. M., L. Goldberg, T. Chakraborty, E. N. Olson and W. H. Klein, 1991 A myogenic factor from sea urchin embryos capable of programming muscle differentiation in mammalian cells. Proc. Natl. Acad. Sci. USA 88: 6219–6223.
Wagner, A., 1999 Redundant gene functions and natural selection. J. Evol. Biol. 12: 1–16.
Walsh, J. B., 1995 How often do duplicated genes evolve new functions? Genetics 139: 421–428.
Watterson, G. A., 1983 On the time for gene silencing at duplicate loci. Genetics 105: 745–766.
Weintraub, H., 1993 The MyoD family and myogenesis: redundancy, networks, and thresholds. Cell 75: 1241–1244.
Wray, G. A., J. S. Levinton and L. H. Shapiro, 1996 Molecular evidence for deep precambrian divergences among metazoan phyla. Science 274: 568–573.
Wright, S., 1968 Evolution and the Genetics of Populations. University of Chicago Press, Chicago.
Zhang, J., H. F. Rosenberg and M. Nei, 1998 Positive Darwinian selection after gene duplication in primate ribonuclease genes. Proc. Natl. Acad. Sci. USA 95: 3708–3713.

Communicating editor: A. G. Clark

APPENDIX: AN ORDER-OF-MAGNITUDE ESTIMATION FOR THE CORRELATION OF FITNESS AND FUNCTIONAL OVERLAP

One can derive from (6) a recurrence equation for $\bar{r}_t$ and solve for $\bar{r}$ at equilibrium by setting $\bar{r}_t = \bar{r}_{t+1}$. This yields

$$\bar{r} = \frac{(1 - 2\mu)\,\overline{rw} + 2\mu\left(\lambda_r \overline{r^2 w}\right) + 2\mu \lambda_r \lambda_w \left(\overline{rw} - \overline{r^2 w}\right)}{(1 - 2\mu)\,\bar{w} + 2\mu\,\overline{rw} + 2\mu \lambda_w \left(\bar{w} - \overline{rw}\right)}, \tag{A1}$$

which can be rewritten as

$$\mathrm{Cov}(r,w) = \overline{rw} - \bar{r}\,\bar{w} = \frac{2\mu}{1 - 2\mu}\left[\overline{rw}\,\bar{r} + \lambda_w \bar{r}\left(\bar{w} - \overline{rw}\right) - \lambda_r \overline{r^2 w} - \lambda_r \lambda_w \left(\overline{rw} - \overline{r^2 w}\right)\right]. \tag{A2}$$

Each of the terms in the square brackets on the right-hand side is of order unity or less. (Note that $1 > \int_0^1\!\int_0^1 r^i w^j p(r,w)\,dr\,dw \to 0$ as $i, j \to \infty$.) It follows that the covariance of $r$ and $w$ in mutation-selection balance is of order of the mutation rate, i.e., very small. By calculating the equilibrium value of $\overline{r^k}$ in the same way, it can be shown that at mutation-selection equilibrium $\mathrm{Cov}(r^k, w) = \overline{r^k w} - \overline{r^k}\,\bar{w}$ is of order $\mu$ for all $k > 1$. In other words, $\overline{r^k w} \approx \overline{r^k}\,\bar{w}$. Substituting this approximation into (A1), $\bar{w}$ and $\mu$ cancel out, and one easily obtains

$$\bar{r} \approx \frac{\lambda_r \overline{r^2} + \lambda_r \lambda_w \left(\bar{r} - \overline{r^2}\right)}{\bar{r} + \lambda_w (1 - \bar{r})}. \tag{A3}$$

This can be rewritten to yield $\mathrm{Var}(r) > \lambda_r \overline{r^2} - \bar{r}^2 = \lambda_w (1 - \lambda_r)\,\bar{r}/(1 - \lambda_w)$.

Using (6) to establish an equation for the mean fitness $\bar{w}$ at equilibrium, one can show that the variance of $w$ is given by

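$$\mathrm{Var}(w) = \overline{w^2} - \bar{w}^2 = \frac{2\mu}{1 - \mu}\left[(1 - \lambda_w)\,\overline{rw} - \lambda_w \overline{rw^2} + \lambda_w \bar{w}^2 - \lambda_w^2\,\overline{w^2} + \sigma_{mw}^2 (\bar{r} - 1)\right]. \tag{A4}$$

Summarizing, both $\mathrm{Cov}(r,w)$ and $\mathrm{Var}(w)$ are of order $\mu$, whereas the above form for $\mathrm{Var}(r)$ suggests that $\mathrm{Var}(r)$ is of order unity. Taken together, this indicates that

$$\rho_{(r,w)} = \frac{\mathrm{Cov}(r,w)}{\sqrt{\mathrm{Var}(w)\,\mathrm{Var}(r)}} \quad \text{is of order } \sqrt{\mu}. \tag{A5}$$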
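To give a feel for the magnitude implied by (A5), the following sketch inserts the mutation rate used in the Figure 6 simulations, $\mu = 2.5 \times 10^{-6}$. Setting every order-of-magnitude constant to exactly one is an assumption made here purely for illustration, so the result should be read as a rough scale and not as a predicted value.

```latex
% Illustrative only: Cov(r,w) ~ mu, Var(w) ~ mu, Var(r) ~ 1,
% with all proportionality constants assumed equal to one.
\rho_{(r,w)} \sim \frac{\mu}{\sqrt{\mu \cdot 1}} = \sqrt{\mu},
\qquad
\sqrt{2.5 \times 10^{-6}} \approx 1.6 \times 10^{-3}.
```

Under these assumptions the correlation between fitness and functional overlap would be of order $10^{-3}$, which is in line with the weak correlation at mutation-selection balance described in the Discussion.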