
A general population genetic theory for the evolution of developmental interactions

Sean H. Rice*

Department of Ecology and Evolutionary Biology, Osborn Memorial Laboratories, Yale University, New Haven, CT 06520

Communicated by Joseph Felsenstein, University of Washington, Seattle, WA, October 11, 2002 (received for review October 7, 2001)

The development of most phenotypic traits involves complex interactions between many underlying factors, both genetic and environmental. To study the evolution of such processes, a set of mathematical relationships is derived that describe how selection acts to change the distribution of phenotypes given arbitrarily complex developmental interactions and any distribution of genetic and environmental variation. The result is illustrated by using it to derive models for the evolution of dominance and for the evolutionary consequences of asymmetry in the distribution of genetic variation.

During development of a multicellular organism, gene products interact in highly nonadditive ways with one another and with environmental factors (1). Despite this, much evolutionary theory assumes additive contributions of gene products to phenotype (2). Specific models involving epistasis have been studied since the inception of population genetics, and they have attracted increased interest in recent years (3–8). The complexity introduced by epistasis, however, has meant that only special cases have been studied.

Below, I construct a theory for the evolution of developmental interactions that requires no simplifying assumptions about the number of underlying genetic and environmental factors involved, how they interact, or the distribution of variation in a population. The theory is framed in terms of movement of a population over a phenotype landscape, a plot of a phenotypic trait as a function of the underlying genetic and environmental factors that contribute to its development. I first discuss the idea of a generalized phenotype landscape, then introduce some notation that facilitates mathematical analysis of such landscapes, and finally use this notation to derive an equation for evolution of a population on an arbitrarily complex phenotype landscape.

Generalized Phenotype Landscapes

Consider a phenotypic trait, φ, influenced by a set of underlying factors, u1, u2, ..., uk. An underlying factor may represent the degree of expression of some gene product, a quantitative character in development (e.g., concentration of an inducer), or an environmental factor such as salinity or temperature. There is no distinction here between environmental and genetic underlying factors; each underlying factor is assigned a heritability, and even those with zero heritability can influence the evolution of other, heritable, factors (7). Although I will treat the underlying factors as continuous variables, their distribution need not be continuous. For example, uj may represent the enzymatic activity of a gene product (a continuous value), whereas the distribution of values of uj in a population may be discrete, corresponding to distinct alleles.

Plotting phenotype as a function of the k underlying factors yields a k-dimensional surface, the phenotype landscape (Fig. 1). Contours on this landscape are sets of values for the underlying factors that produce the same phenotypic value (6). The standard quantities used in quantitative genetics, such as additive and epistatic variance, can be calculated from the shape of the phenotype landscape and the distribution of underlying variation (9).

The geometry of this surface is determined by how the underlying factors interact to influence phenotype, in other words, by development. There is a straightforward relation between the terminology of gene interaction and the geometry of the phenotype landscape. If u1 and u2 are genetic (heritable) factors, then the degree to which the value of u1 influences the phenotypic consequences of changing u2 (i.e., the epistatic interaction between them) is ∂²φ/∂u1∂u2, which measures the curvature of the landscape in one direction. Similarly, ∂²φ/∂u1² measures the nonlinear effects of changing u1 and thus (when u1 is genetic) provides a measure of dominance (10). If u1 is a genetic factor and u2 an environmental factor, then ∂²φ/∂u1∂u2 measures the genotype-by-environment (G × E) interaction. If we hold all of the environmental underlying factors constant and consider only the genetic factors, then we have the "genotype–phenotype map." Conversely, if we take a "slice" of the phenotype landscape along one of the environmental axes, then we have a "norm of reaction," a plot of phenotype as a function of that particular environmental variable. The slope of the landscape at a particular point determines how sensitive phenotype is to underlying variation at that point; it thus provides a measure of "phenotypic robustness" or "canalization."

The real value of the phenotype landscape is that we can solve analytically for how selection will act on a population placed on this surface. To do this, we write the selection differential (the change due only to selection, prior to other processes such as recombination) in terms of a set of vectors, each of which gives us insight into the evolutionary consequences of a different kind of developmental interaction or aspect of the distribution of genetic variation.

Fig. 1a shows an uncurved phenotype landscape, corresponding to a case in which all underlying factors contribute additively to the trait (i.e., no epistasis or dominance), and a symmetrical distribution of underlying variation. In this case, the selection differential is described by a single vector (called Q1; the notation will be explained below), pointing in the direction of the optimal phenotype contour and influenced by the variance and covariance values of the distribution of underlying variation. The formula for the vector Q1 contains only first derivatives of phenotype with respect to the underlying factors (∂φ/∂u), which are the only derivatives that exist on an uncurved landscape.

In Fig. 1b, the landscape is curved (here a quadratic function), indicating epistatic interaction between the underlying factors. The vector Q1 still exists and has the same interpretation as before, but now there is another vector, Q1,2, that points in the direction in which the slope of the landscape changes most quickly. This vector contains terms involving the product of the first and second derivatives of phenotype with respect to the underlying factors (∂φ/∂u·∂²φ/∂u²) and thus does not exist on an uncurved landscape.

Previously (6), I showed that evolution on a quadratic phenotype landscape (all derivatives higher than second are zero, meaning that we are considering only first-order epistatic interactions) is given by the sum of Q1 and Q1,2, one vector moving

*E-mail: [email protected].

15518–15523 | PNAS | November 26, 2002 | vol. 99 | no. 24 | www.pnas.org/cgi/doi/10.1073/pnas.202620999

Fig. 1. Three phenotype landscapes and the evolution vectors that are relevant on them. The contours are lines of equal value of the trait, with the solid contour in each case being the optimal value of the trait. (a) An additive landscape (φ = u1 + u2). (b) A quadratic landscape. It is assumed here that ∂²w/∂φ² < 0; otherwise, Q1,2 would point in the opposite direction. (c) A cubic landscape.

the population toward the optimal phenotype contour and the other moving it toward a point of minimum slope along that contour. Below, I show how to extend this approach to arbitrarily complex phenotype landscapes. Not surprisingly, more vectors appear, corresponding to ways of changing the population distribution that are not possible on an uncurved or quadratic surface.

One such new vector is shown in Fig. 1c. On a landscape with nonzero third derivatives, moving in the direction of the vector Q3 causes the distribution of φ to become positively skewed. This new vector contains third derivatives of the phenotype landscape, which correspond to cases in which, for example, one gene influences the epistatic interaction between two others. This vector thus does not exist on an uncurved or a quadratic surface.

As developmental interactions become more complex, new evolution vectors appear. The theory presented below provides a way to solve for any particular vector, given the set of phenotypic derivatives that underlie it. First, I present a couple of mathematical concepts that greatly simplify analysis of evolution on phenotype landscapes.

Some Useful Notation

Define xi = ui − ūi, so that xi measures each value of underlying factor i by its deviation from the population mean. Let w be the fitness function associated with the phenotypic trait, φ. For now, we assume that the underlying factors influence fitness only through their influence on φ (later, I will discuss the general case in which fitness is influenced by multiple traits, possibly including direct effects of the us). Like φ, w is treated here as a smooth function, though nonlinear partial regressions can be substituted for derivatives to yield an equivalent result (see Appendix).

The derivations that follow are framed in terms of tensors, which for our purposes are simply arrays of elements. The "rank" of a tensor is just the number of subscripts necessary to identify all of its elements. Any first-rank tensor can be written as a vector, with elements Vi identified by a single subscript, and any second-rank tensor can be written as a matrix, with elements Mij requiring two subscripts. (Strictly, to be a tensor, the array must transform under rotation of the coordinate axes in such a way that its actions are unchanged; all of the tensors discussed in this paper have this property, but it will not be relevant to our analysis.)

Let Dⁿ be a tensor of rank n with elements defined as

Dⁿ(i, j, ..., n) = ∂ⁿφ/(∂ui ∂uj ··· ∂un) |ū.  [1]

In words, Dⁿ contains all of the nth derivatives of phenotypic trait φ with respect to the underlying factors, evaluated at the population mean value of the underlying factors. For example, D²(1, 1) = ∂²φ/∂u1² and D²(1, 2) = ∂²φ/∂u1∂u2. D¹ is just the gradient vector, ∇φ, that points in the direction of maximum slope on the landscape. D² is a matrix of second partial derivatives of phenotype with respect to the underlying factors (this is the matrix E in refs. 6 and 7).

(This derivation assumes that the landscape is differentiable. The Appendix shows a parallel derivation based on regressions, which exist even for a kinked or discontinuous surface.)

Let Pⁿ be an nth-rank tensor with elements being the nth central moments of the distribution of underlying factors. Each element is then defined as

Pⁿ(i, j, ..., n) = E[xi xj ··· xn],  [2]

where E[·] denotes expected value. Because xi = (ui − ūi), P¹ is zero. P² is the covariance matrix, with elements P²(i, j) = E(xi xj) = Cov(ui, uj). P³ is a third-order array of third moments.

Higher moments of the distribution will come up repeatedly in the discussion that follows. The important thing to keep in mind is that the even moments measure symmetrical spread about the mean, with higher-order even moments being increasingly sensitive to outliers; so the fourth moment is more sensitive to outliers than is the variance. Odd moments measure asymmetry, with the higher-order odd moments being increasingly sensitive to asymmetrical outliers.

We will be using two operations from tensor analysis, the inner product, designated ⟨ , ⟩, and the outer product, designated ⊗. The outer product of two tensors of rank n and m is a new tensor of rank (n + m) with elements defined as the products of each of the elements of the initial tensors. For example, D¹ ⊗ D² is a third-rank tensor with elements defined as

(D¹ ⊗ D²)(i, j, k) = D¹(i)·D²(j, k) = ∂φ/∂ui · ∂²φ/(∂uj ∂uk).  [3]

The outer product as defined here is sometimes referred to as the direct or tensor product.

The inner product of two tensors of rank n and m (chosen so that n ≥ m) is a new tensor of rank (n − m). If n = m, then the result is a number (a tensor of rank zero). The elements of the new tensor are formed by summing over the products of each element in the lower-rank tensor with the corresponding element in the higher-rank tensor, like this:

⟨P⁴, D³⟩(i) = Σj Σk Σm P⁴(i, j, k, m)·D³(j, k, m).  [4]

Note that taking the inner product of a rank-two tensor (a matrix) and a rank-one tensor (a vector) is just the same as premultiplying the vector by the matrix. The generalized inner product defined here is sometimes referred to as a contraction.

On a curved landscape, the mean value of the phenotype may differ from the phenotype value associated with the mean of the underlying factors. Using the tensor notation presented above, we can write the mean value of a phenotypic trait φ as

φ̄ = φ(ū) + Σ_(i=2 to ∞) (1/i!) ⟨Pⁱ, Dⁱ⟩  [5]

(see Appendix for derivation). The summation starts with i = 2 because P¹ contains all zeros.

The Theory

The details of the derivation are presented in the Appendix. We wish to derive the evolutionary consequences of some combination of interactions between underlying factors, described mathematically by a set of one or more derivatives of phenotype with respect to the underlying factors. For the general case, consider a set of derivatives like this:

(∂^(a1)φ/∂u^(a1)) · (∂^(a2)φ/∂u^(a2)) ··· (∂^(an)φ/∂u^(an)).  [6]

Here, ai denotes the degree of differentiation, which is all that matters for our analysis; thus ∂²φ/∂u² includes both ∂²φ/∂uj² and ∂²φ/∂uj∂uk. For example, the set of derivatives ∂φ/∂u · ∂²φ/∂u² corresponds to a1 = 1, a2 = 2 (the evolutionary consequences of which are shown in Fig. 1b as Q1,2).

Every such set of derivatives that is nonzero has a corresponding vector in the space of underlying factors; this vector shows the contribution of that combination of developmental interactions to the total selection differential (the change attributable only to selection). To solve for these vectors, we need to specify the number of phenotypic derivatives of each type. Define bj as the number of derivatives of order j. For example, if we are concerned with a set of derivatives of the form ∂φ/∂ui · ∂φ/∂uj · ∂²φ/∂uk², then b1 = 2 and b2 = 1, because there are two first derivatives and one second derivative; Σb is the total number of derivatives that we are considering. Given this definition, the general equation for the vector of selection differentials corresponding to the set of derivatives in Eq. 6 is

Q_(a1···an) = (γ/w̄) (∂^(Σb)w/∂φ^(Σb)) ⟨P^(1+Σa), D^(a1) ⊗ ··· ⊗ D^(an)⟩,  [7]

where γ = 1/(Π ai! Π bj!). (See Appendix for details.)

In Eq. 7 the rank of the P tensor is always one greater than the sum of the ranks of the D tensors. The inner product is thus a rank-one tensor, which we can write as a vector, Q. (In general, Dⁱ ⊗ Dʲ ≠ Dʲ ⊗ Dⁱ, but the symmetry of the P tensors makes it so that the order does not matter in Eq. 7.)

We can read some important results directly from Eq. 7. First, note that the relevant moments of the distribution of underlying variation are of order Σa + 1, which is the summed order of the derivatives plus one. For example, the vector corresponding to ∂φ/∂ui · ∂φ/∂uj involves the third-order moments (measuring skewness) of the distribution of underlying factors; it will thus not be relevant to evolution unless the distribution of underlying variation is skewed. The second thing that we can read directly from Eq. 7 is that the degree of the fitness derivative (∂ʲw/∂φʲ) is equal to the total number of phenotypic derivatives that we are considering. Thus, the degree to which ∂φ/∂ui · ∂φ/∂uj (two first derivatives) contributes to the selection differential is determined by the second derivative of fitness with respect to phenotype, ∂²w/∂φ².

The terms ∂ⁱw/∂φⁱ can be replaced by the ith-order nonlinear regression of w on φ if fitness is not a continuous function of phenotype; the theory then applies even when the regression is due to stochastic factors (i.e., drift) rather than selection.

The vectors defined by Eq. 7 give the components of the selection differential. The response, in the next generation, of the population mean to selection is a function of the selection differential and the specific patterns of inheritance of the underlying factors. If the value of each factor, ui, in offspring is related to the values of all of the underlying factors in their parents by a linear function with random noise, then the response to selection is the sum of the Q vectors multiplied by a heritability matrix (more complicated patterns of inheritance will be treated in another paper). The total change in the vector of mean values of the underlying factors over one generation is then

Δū = H ΣQ + δ(ū),  [8]

where ΣQ is the sum of all of the Q vectors and H is a heritability matrix with elements h(i, j) representing the partial regression of the expected value of ui in offspring on the parental value of uj. The new term, δ(ū), represents change in the mean resulting from processes other than differential reproduction, such as recombination. For underlying factors that are not heritable, such as some (but not all) "environmental" factors, we simply enter zeros in the appropriate places in the H matrix.

So far, we have considered how selection on a single trait changes the means of the underlying factors. Below, I briefly discuss the extensions to multiple traits and to change in higher moments of the distribution; details of these issues will be elaborated on elsewhere.

Because change in the mean of the distribution of underlying factors is a function of the other moments of the distribution (the P tensors), we would also like to know how these other moments change as well. When the notation introduced here is used, this requires only a slight modification of Eq. 7. The change attributable to selection alone (the selection differential) in the kth moment of the distribution of underlying factors is simply

Qᵏ_(a1···an) = (γ/w̄) (∂^(Σb)w/∂φ^(Σb)) ⟨P^(k+Σa), D^(a1) ⊗ ··· ⊗ D^(an)⟩.  [9]

The only difference between Eq. 9 and Eq. 7 is the rank of the P tensor, which is now greater than the summed ranks of the D tensors by an amount k. Setting k = 1 in Eq. 9 yields Eq. 7, as it must, because the mean is the first moment. Note that (from the inner product rule given above) Qᵏ is a tensor of rank k. This makes sense; the change in the kth moments attributable to selection is written in the same form as we are using to describe the set of kth moments before selection. Thus, Q² is the change in the variance/covariance matrix, P².

The total response to selection for the kth moment can now be written as

ΔPᵏ = ⟨⊗ᵏH, Σ Qᵏ⟩ + δ(Pᵏ),  [10]

where ⊗ᵏH stands for H ⊗ H ⊗ ··· k times. The term δ(Pᵏ), the change in the kth moments introduced by the process of reproduction, is likely to take on a more significant role in the evolution of higher moments than it does for the change in the mean (k = 1) because processes such as segregation, recombination, mutation, etc. generally have greater power to introduce variation than to change the mean.
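The tensor machinery of Eqs. 1–5 can be checked numerically. The sketch below is not part of the paper: the landscape φ = u1·u2 and all parameter values are hypothetical, chosen only to illustrate the outer product (Eq. 3), the contraction (Eq. 4), and the mean-phenotype correction (Eq. 5) using numpy's `multiply.outer` and `tensordot`:

```python
import numpy as np

# Hypothetical quadratic landscape phi(u1, u2) = u1 * u2 (pure epistasis),
# with a hypothetical population mean u_bar.
u_bar = np.array([1.0, 2.0])

D1 = np.array([u_bar[1], u_bar[0]])   # Eq. 1: gradient (dphi/du1, dphi/du2) at u_bar
D2 = np.array([[0.0, 1.0],
               [1.0, 0.0]])           # Eq. 1: second partials; off-diagonal = epistasis

# Eq. 2: P2 is the covariance matrix of underlying variation (hypothetical values).
P2 = np.array([[0.25, 0.10],
               [0.10, 0.25]])

# Eq. 3: the outer product D1 (x) D2 is a rank-3 tensor with elements D1_i * D2_jk.
D1xD2 = np.multiply.outer(D1, D2)
assert D1xD2.shape == (2, 2, 2)

# Eq. 4: the inner product contracts matching indices; for a matrix and a
# vector it reduces to premultiplying the vector by the matrix.
Q1_core = np.tensordot(P2, D1, axes=1)   # <P2, D1> = P2 @ D1

# Eq. 5 truncated at i = 2 (exact for a quadratic landscape):
# phi_bar = phi(u_bar) + (1/2) <P2, D2>.
phi_bar = u_bar[0] * u_bar[1] + 0.5 * np.tensordot(P2, D2, axes=2)
print(phi_bar)  # 2.1: for phi = u1*u2, E[u1*u2] = u1_bar*u2_bar + Cov(u1, u2)
```

The same operations extend directly to Eq. 7: each Q vector is a contraction of a moment tensor P^(1+Σa) with an outer product of D tensors, scaled by the matching fitness derivative.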

From Eq. 9 we can quickly see that purely directional selection (i.e., only considering the first derivative of fitness with respect to phenotype) on an uncurved landscape will not change the variance if the initial distribution is symmetrical, because Q²1 contains the distribution tensor P³, which is zero for a symmetrical distribution. However, the same selection will introduce skewness in the next generation, because Q³1 goes with P⁴, which is nonzero for a symmetrical distribution. Note that, as expected, stabilizing or disruptive selection will change the variance on an uncurved landscape, because Q²1,1 contains P⁴.

The discussion above has focused on selection acting on a single trait (φ). The extension to selection acting jointly on multiple traits is straightforward; all of the Q vectors defined above exist unchanged, but there are now more possible combinations. For example, if selection acts jointly on two traits, φ1 and φ2, and some underlying factors influence both of these traits, then there will be new vectors corresponding to terms that involve derivatives of both traits, such as ∂φ1/∂u · ∂²φ2/∂u². The Qᵏ vector corresponding to a set of derivatives like that in Eq. 6 but containing n derivatives of trait φ1 and m derivatives of trait φ2 is

Qᵏ_(a1;a2) = (γ1γ2/w̄) (∂^(Σb1+Σb2)w / ∂φ1^(Σb1)∂φ2^(Σb2)) ⟨P^(k+Σa), ⊗D⟩,  [11]

where ⊗D stands for the outer product over all of the derivatives that we are considering. For example, the vector corresponding to ∂φ1/∂u · ∂²φ2/∂u² is

Q_(1;2) = (1/2w̄) (∂²w/∂φ1∂φ2) ⟨P⁴, D1¹ ⊗ D2²⟩.  [12]

The equations presented above give a general description of evolution by selection, requiring no simplifying assumptions about the distribution of genetic and environmental factors, about how these factors influence phenotype, or about how phenotype maps to fitness (although we did make an assumption about inheritance). Given a developmental system, we could use this result to model evolution of that system under any fitness scheme and with any distribution of underlying variation.

Alternatively, we can investigate the evolutionary consequences of a particular kind of developmental interaction or a particular distribution of underlying variation by looking at the corresponding vectors. I consider four such special cases below. The first two cases show how previous results can be quickly derived from Eq. 7; the final two cases, concerning dominance and the effects of asymmetric genetic variation, are new treatments of these subjects.

Case 1: Additive Landscape, Directional Selection. The simplest Q vector, corresponding to ∂φ/∂u, is

Q1 = (1/w̄)(∂w/∂φ) ⟨P², D¹⟩ = (1/w̄)(∂w/∂φ) P²∇φ.  [13]

[On an uncurved (additive) landscape with a quadratic fitness function, a symmetrical distribution of underlying factors, and δ(ū) = 0, the response to selection is described completely by Δū = HQ1 (the "breeder's equation"). Substituting Eq. 13 into Eq. 8, measuring fitness relative to w̄, and noting that ∂w/∂φ·∇φ = ∇w and that H·P² is equal to the genetic covariance matrix, G (11), yields Δū = G∇w.]

Q1 captures the effects of directional selection (Fig. 1), describing the direction of evolution toward an optimal contour on the phenotype landscape. Even on a more complex landscape, Q1 is still often the principal determinant of evolution for a population far from selective equilibrium (6). Once the surface curves, however, other vectors come into play as well.

Case 2: Quadratic Landscape, Canalizing Selection. The sensitivity of φ to underlying variation is determined by the slope of the landscape, measured by the length of the vector ∇φ (written ‖∇φ‖). The gradient of this slope is made up of terms of the form ∂φ/∂u · ∂²φ/∂u² (6). Here, a1 = 1 and a2 = 2, so to study the evolution of phenotypic sensitivity (or canalization) we use these values in Eq. 7 to obtain

Q1,2 = (1/2w̄)(∂²w/∂φ²) ⟨P⁴, D¹ ⊗ D²⟩.  [14]

This is a generalization of a vector previously described in ref. 6 under more restrictive assumptions; it describes how development evolves so as to increase or decrease the sensitivity of phenotype to variation in the underlying factors (i.e., the evolution of canalization) (6). Q1,2 contains both the terms E∇φ and ∇φ∇²φ from ref. 6.

On a quadratic phenotype landscape with a multivariate normal distribution and a quadratic fitness function, we need only consider Q1 and Q1,2. Q1 moves the population to a contour corresponding to the local optimum phenotype; Q1,2 then moves the population along that contour to a point of locally minimum slope, at which point the phenotype is maximally buffered against underlying genetic or environmental variation.

To understand what Q1,2 does, note that ∂²w/∂φ², when evaluated on an optimal phenotype contour, measures how fast fitness drops off as we move away from that contour, thus capturing the strength of stabilizing selection (11). In addition to reducing the amount of genetic variation, stabilizing selection tends to change development, by pushing the population toward less steep parts of the landscape, such that remaining genetic or environmental variation translates into minimal phenotypic variation (6, 7).

Case 3: Cubic Landscape, Evolution of Dominance. As noted above, the second derivative with respect to a particular underlying factor, ∂²φ/∂ui², measures the degree to which that factor exhibits dominance (or recessiveness) in its effects on phenotype (11). The sum of these dominance effects, Σi(∂²φ/∂ui²), is known as the Laplacian, written as ∇²φ. If dominance is to evolve, then these second derivatives, ∂²φ/∂ui², must themselves change as some underlying factor changes. We are thus concerned with the evolutionary effects of third-order interactions, measured by ∂³φ/∂u³ (recall that this includes ∂³φ/∂ui∂uj∂uk as well as ∂³φ/∂ui³). To derive the appropriate evolution vector, we thus set a = 3 and b = 1 in Eq. 7 to get

Q3 = (1/6w̄)(∂w/∂φ) ⟨P⁴, D³⟩.  [15]

For two underlying factors, normally distributed with equal variances (σ²) and no covariance, this becomes

Q3 = (σ⁴/2w̄)(∂w/∂φ) [ ∂³φ/∂u1³ + ∂³φ/∂u1∂u2² ;  ∂³φ/∂u1²∂u2 + ∂³φ/∂u2³ ] = (σ⁴/2w̄)(∂w/∂φ) ∇(∇²φ).  [16]

The vector in Eq. 16 is the gradient of the Laplacian (the sum of the dominance effects). Thus, the vector Q3 points in the direction in which the sum of dominance effects increases most rapidly.

Note that Q3 contains the first derivative of fitness with respect to phenotype. It thus shows that directional selection can alter development to change the net degree of dominance. To understand this effect, note that the net degree of dominance also relates to the degree to which a symmetric distribution of

underlying factors translates into an asymmetric phenotype distribution. Increasing ∇²φ causes the distribution of phenotypic values to become positively skewed. This can be confirmed with Eq. 5 by setting all variances equal to 1 [P²(i, i) = 1] and all covariances equal to 0 [P²(i, j) = 0 for i ≠ j]; then ⟨P², D²⟩ = ∇²φ. Thus, in addition to shifting the mean of the phenotype distribution, directional selection alters development to skew the distribution in the direction of increasing fitness. Note that this is different from the direct effect of selection on skewness discussed after Eq. 10. There, directional selection was reshaping the distribution of underlying variation. Eq. 16 shows that, on a cubic or higher-order landscape, directional selection also shifts the population to regions of the landscape that induce skewness in the distribution of φ.

Case 4: Asymmetric Underlying Variation. The assumption that the distribution of underlying variation is multivariate normal (ensuring a symmetrical distribution) is central to much of quantitative genetics and is sometimes even taken as a defining characteristic of that field (12). This assumption is not always realistic (13), so it is worth looking at what we are missing by insisting on normality.

Fig. 2. The vectors Q1 and Q1,1 for a skewed distribution of two underlying factors and an additive phenotype landscape. The solid diagonal is the optimum phenotype, with fitness dropping off symmetrically away from the optimum.

To investigate the effects of asymmetry in the distribution of underlying variation, we look for Q vectors that contain the third moment of the distribution of us. Recall that the relevant moment for a particular vector is Σa + 1, or 1 greater than the summed order of the derivatives, so the only two vectors that include the third moment are Q1,1 and Q2, corresponding to the terms ∂φ/∂ui·∂φ/∂uj and ∂²φ/∂ui∂uj, respectively. To see one consequence of asymmetry, consider the case of complete additivity (no epistasis). Here, ∂²φ/∂ui∂uj = 0, so we need to consider only the vector Q1,1. From Eq. 7 we find this to be

Q1,1 = (1/2w̄)(∂²w/∂φ²) ⟨P³, D¹ ⊗ D¹⟩.  [17]

To see the effects of a skewed distribution, consider a case of two underlying factors with equal variances and no covariance, but with u1 being positively skewed. Here, P³(1, 1, 1) = E[x1³] is nonzero but all of the other elements of P³ are zero, so we can write

Q1,1 = (1/2w̄)(∂²w/∂φ²) [ E(x1³)(∂φ/∂u1)² ;  0 ].  [18]

The only other vector that we need worry about on an uncurved landscape with quadratic fitness is Q1. Under the assumptions given above, Q1 is

Q1 = (1/w̄)(∂w/∂φ) [ E(x1²) ∂φ/∂u1 ;  E(x2²) ∂φ/∂u2 ].  [19]

Q1,1 does not point in the same direction as Q1 (Fig. 2). Furthermore, it is nonzero even if there is no epistasis or dominance (because it depends only on ∂φ/∂ui) and even when the population is at phenotypic selective equilibrium (because it is multiplied by ∂²w/∂φ²). Thus, with a skewed distribution of underlying variation, the underlying factors may continue to evolve even in a completely additive system that is at phenotypic equilibrium.

Fig. 2 illustrates this result. Because more individuals are far from the mean on one side than on the other, and fitness drops off nonlinearly around the optimum, there is a net selective pressure pushing the population mean away from the optimum. This is countered by directional selection pushing the mean toward the optimum (Q1). If these vectors do not point in exactly opposite directions, then there is a component of evolution along the optimum phenotype contour. A similar phenomenon occurs when the fitness function is asymmetrical and Var(u1) ≠ Var(u2), as can be confirmed by examining Q1,1,1.

Applications

In addition to providing general rules for how selection acts to change development, the theory described above could be used to determine how any given selection regime would influence a particular trait if we have either (i) a theoretical model for development of the trait or (ii) data on what genetic factors contribute to variation in the trait and estimates of their direct and interaction effects.

Examples of developmental models that allow us to define the entire phenotype landscape include reaction-diffusion models of morphogenesis (10) and mechanical growth models (14, 15). In these cases, the underlying factors are the model parameters, such as diffusion rate or growth rate of a tissue.

Quantitative trait locus (QTL) analysis (16) could be used to estimate the local geometry of the phenotype landscape. Each QTL would be treated as a different underlying factor, and the direct, dominance, and pairwise epistatic effects of these would provide estimates of the first and second derivatives (or regressions) of phenotype with respect to the different factors. Just given first- and second-order terms, one could construct the local quadratic approximation to the phenotype landscape. Such an approximation could be used to, among other things, determine whether the population is at or near a point of local maximum canalization, or phenotypic robustness to underlying variation. If a point of maximum canalization [at which the angle between the vectors ⟨P⁴, D² ⊗ D¹⟩ and ⟨P², D¹⟩ is either 0° or 180° (6)] occurs within the range of variation found in a population, it would indicate that the developmental process had been structured, in large part, by stabilizing selection.

Appendix

Derivation of Eq. 5. Define p(x1, ..., xn) as the probability density at the point (x1, ..., xn). The mean of φ is

Fig. 3. The expanded equations for S(u1) and S(u2) aligned to show how all of the elements with a particular combination of derivatives form a vector, Q.

$$\bar{\phi} = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} \sum_{k=0}^{\infty} \frac{1}{k!} \sum_k \frac{\partial^k \phi}{\partial u_{i_1} \cdots \partial u_{i_k}}\, x_{i_1} \cdots x_{i_k} \cdot p(x_1, \ldots, x_n)\, dx_{i_1} \cdots dx_{i_k}, \tag{A1}$$

where $\sum_k = \sum_{i_1} \sum_{i_2} \cdots \sum_{i_k}$. Integrating the $[x_{i_1} \cdots x_{i_k}\, p(x_1, \ldots, x_n)]$ terms converts them into moments, so we can write

$$\bar{\phi} = \phi(\bar{u}) + \sum_{k=1}^{\infty} \frac{1}{k!} \sum_k \frac{\partial^k \phi}{\partial u_{i_1} \cdots \partial u_{i_k}}\, E[x_{i_1} \cdots x_{i_k}]. \tag{A2}$$

The second summation on the right side is just ⟨Pᵏ, Dₖ⟩, yielding Eq. 5.

Derivation of Eq. 7. The selection differential, S, for a particular underlying factor, uᵢ, is given by (1/w̄)Cov(w, xᵢ) (17), which we can write as an integral as

$$S(u_i) = \frac{1}{\bar{w}} \int \cdots \int x_i\, w(\phi)\, p(x_1, \ldots, x_n)\, dx_1 \cdots dx_n. \tag{A3}$$

Expanding both w(φ) and φ(u) as Taylor series (6) [letting δφ = φ − φ(ū)] yields

$$w(\phi) = \sum_{k=0}^{\infty} \frac{1}{k!} \frac{\partial^k w}{\partial \phi^k} (\delta\phi)^k \tag{A4}$$

and

$$\delta\phi = \sum_{j=1}^{\infty} \frac{1}{j!} \sum_j \frac{\partial^j \phi}{\partial u_{k_1} \partial u_{k_2} \cdots \partial u_{k_j}}\, x_{k_1} x_{k_2} \cdots x_{k_j}. \tag{A5}$$

Substituting Eq. A5 into Eq. A4 into Eq. A3 and integrating converts the products of xs into moments, as above, but now the moments are 1 greater than the degree of the derivatives because of the extra xᵢ (underlined) in Eq. A3. This gives the selection differential for underlying factor uᵢ. Aligning all of these functions shows that the vector of selection differentials can be decomposed into a set of vectors corresponding to different combinations of phenotypic derivatives (Fig. 3). Note that all we are doing is expanding the numerator of Eq. A3 and collecting terms. All of the resulting terms will be divided by w̄, which is a scalar and need not be expanded in the same manner.

The combinatorial term γ in Eqs. 7 and 8 combines the coefficients of the Taylor expansion of φ [= 1/∏a!] and of w [= 1/(∑b)!] and the number of terms that involve a particular combination of derivatives [= (∑b)!/∏b!].

Change in Higher Moments and Multiple Traits. Change in Cov(uᵢ, uⱼ) is just change in the mean value of xᵢxⱼ, so to calculate the selection differential for the second moments, we simply substitute xᵢxⱼ for the underlined xᵢ in Eq. A3. The same approach works for any moment.

Eq. 11 is derived by replacing Eq. A4 with an expansion of w(φ₁, φ₂) in terms of both traits. The rest of the derivation is the same as for Eq. 7.

Using Regressions Instead of Derivatives. If we replace Eq. A4 and Eq. A5 above with regressions we get

$$w(\phi) = w(\bar{\phi}) + \sum_{i=1}^{\infty} \beta_{w,\phi^i}\, \delta\phi^i, \qquad \delta\phi = \sum_{j=1}^{\infty} \sum_j \beta_{\phi,\, x_{k_1} \cdots x_{k_j}}\, x_{k_1} \cdots x_{k_j}, \tag{A6}$$

where β_{y,x} represents the partial regression coefficient of y on x. The rest of the derivation proceeds as above, except that we replace the combinatorial term γ with γ′ = (∑b)!/∏b!.

I thank M. G. Rice, P. M. Magwene, and two anonymous reviewers for greatly improving the quality of this paper.
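The covariance form of the selection differential in Eq. A3 can be checked numerically. The sketch below is a toy example constructed for this check (the quadratic phenotype map and Gaussian fitness are arbitrary choices, not the paper's): it confirms Price's identity (17) that (1/w̄)Cov(w, xᵢ) equals the change in the mean of uᵢ obtained by reweighting individuals by relative fitness, here with an explicitly epistatic map (∂²φ/∂u₁∂u₂ ≠ 0).

```python
import math
import random
import statistics

random.seed(2)

n = 50_000
# Two heritable underlying factors.
u1 = [random.gauss(0.0, 1.0) for _ in range(n)]
u2 = [random.gauss(0.0, 1.0) for _ in range(n)]

# A toy epistatic map: d^2(phi)/(du1 du2) = 0.5, so u1 and u2 interact.
phi = [a + 0.5 * a * b for a, b in zip(u1, u2)]

w = [math.exp(-p * p / 2.0) for p in phi]   # stabilizing fitness around phi = 0
wbar = statistics.fmean(w)

# S(u1) via the covariance form of Eq. A3 ...
cov_form = (statistics.fmean(a * f for a, f in zip(u1, w))
            - statistics.fmean(u1) * wbar) / wbar

# ... and via the fitness-weighted change in the mean of u1.
direct_form = sum(a * f for a, f in zip(u1, w)) / sum(w) - statistics.fmean(u1)

print(abs(cov_form - direct_form) < 1e-9)  # True: the two forms agree
```

The agreement is exact up to floating-point error because Price's identity is algebraic: it holds for any fitness function and any distribution of underlying variation, which is what licenses the Taylor-series expansion of Eq. A3.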

1. Wolf, J. B., Brodie, E. D., III, & Wade, M. J., eds. (2000) Epistasis and the Evolutionary Process (Oxford Univ. Press, New York).
2. Falconer, D. S. & Mackay, T. F. C. (1996) Introduction to Quantitative Genetics (Longman, Essex, U.K.).
3. Cowley, D. E. & Atchley, W. R. (1992) Evolution 46, 495–518.
4. Cheverud, J. M. & Routman, E. J. (1996) Evolution 50, 1042–1051.
5. Wagner, G. P., Booth, G. & Bagheri-Chaichian, H. (1997) Evolution 51, 329–347.
6. Rice, S. H. (1998) Evolution 52, 647–656.
7. Rice, S. H. (2000) in Epistasis and the Evolutionary Process, eds. Wolf, J. B., Brodie, E. D., III, & Wade, M. J. (Oxford Univ. Press, New York), pp. 82–98.
8. Hansen, T. F. & Wagner, G. P. (2001) Theor. Popul. Biol. 59, 61–86.
9. Wolf, J. B., Frankino, W. A., Agrawal, A. F. & Brodie, E. D., III (2001) Evolution 55, 232–245.
10. Gilchrist, M. A. & Nijhout, H. F. (2001) Genetics 159, 423–432.
11. Lande, R. & Arnold, S. J. (1983) Evolution 37, 1210–1226.
12. Roff, D. A. (1997) Evolutionary Quantitative Genetics (Chapman & Hall, New York).
13. Turelli, M. & Barton, N. H. (1994) Genetics 138, 913–941.
14. Nijhout, H. F. & Emlen, D. J. (1998) Proc. Natl. Acad. Sci. USA 95, 3685–3689.
15. Rice, S. H. (1998) Paleobiology 24, 133–149.
16. Shimomura, K., Low-Zeddies, S. S., King, D. P., Steeves, T. D., Whiteley, A., Kushla, J., Zemenides, P. D., Lin, A., Vitaterna, M. H., Churchill, G. A. & Takahashi, J. S. (2001) Genome Res. 11, 959–980.
17. Price, G. R. (1970) Nature 227, 520–521.

Rice | PNAS | November 26, 2002 | vol. 99 | no. 24 | 15523