Analytical Results for Directional and Quadratic Selection Gradients for Log
Total Page:16
File Type:pdf, Size:1020Kb
bioRxiv preprint doi: https://doi.org/10.1101/040618; this version posted February 22, 2016. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY 4.0 International license. 1 Analytical results for directional and quadratic selection 2 gradients for log-linear models of fitness functions 3 Michael B. Morrissey1 and I. B. J. Goudie2 4 February 22, 2016 5 1School of Biology, University of St Andrews 6 contact email: [email protected] phone: +44 (0) 1334 463738 fax: +44 (0) 1334 463366 post: Dyers Brae House School of Biology, University of St Andrews St Andrews, Fife, UK, KY16 9TH 7 2School of Mathematics and Statistics, University of St Andrews 8 contact email: [email protected] phone: +44 (0) 1334 463705 fax: +44 (0) 1334 463748 post: School of Mathematics and Statistics, University of St Andrews St Andrews, Fife, UK, KY16 9SS 9 Keywords: natural selection, selection gradients, fitness, generalised linear model, capture- 10 mark-recapture, survival analysis 1 bioRxiv preprint doi: https://doi.org/10.1101/040618; this version posted February 22, 2016. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY 4.0 International license. Morrissey and Goudie, log-scale fitness models and selection gradients 2 11 Abstract 12 1. Established methods for inference about selection gradients involve least-squares regression 13 of fitness on phenotype. While these methods are simple and may generally be quite robust, 14 they do not account well for distributions of fitness. 15 2. Some progress has previously been made in relating inferences about trait-fitness rela- 16 tionships from generalised linear models to selection gradients in the formal quantitative 17 genetic sense. These approaches involve numerical calculation of average derivatives of 18 relative fitness with respect to phenotype. 19 3. We present analytical results expressing selection gradients as functions of the coefficients 20 of generalised linear models for fitness in terms of traits. The analytical results allow 21 calculation of univariate and multivariate directional, quadratic, and correlational selection 22 gradients from log-linear and log-quadratic models. 23 4. The results should be quite generally applicable in selection analysis. They apply to any 24 generalised linear model with a log link function. Furthermore, we show how they apply to 25 some situations including inference of selection from (molecular) paternity data, capture- 26 mark-recapture analysis, and survival analysis. Finally, the results may bridge some gaps 27 between typical approaches in empirical and theoretical studies of natural selection. bioRxiv preprint doi: https://doi.org/10.1101/040618; this version posted February 22, 2016. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY 4.0 International license. Morrissey and Goudie, log-scale fitness models and selection gradients 3 28 1 Introduction 29 The characterisation of natural selection, especially in the wild, has long been a major research 30 theme in evolutionary ecology and evolutionary quantitative genetics (Endler, 1986; Kingsolver 31 et al., 2001; Lande & Arnold, 1983; Manly, 1985; Weldon, 1901). In recent decades, regression- 32 based approaches have been used to obtain direct selection gradients (especially following Lande 33 & Arnold 1983), which represent the direct effects of traits on fitness. These, and related, 34 measures of selection have an explicit justification in quantitative genetic theory (Lande, 1979; 35 Lande & Arnold, 1983), which provides the basis for comparison among traits, taxa, etc., 36 and ultimately allows meta-analysis (e.g., Kingsolver et al. 2001). Selection gradients can 37 characterise both directional selection and aspects of non-linear selection, and so are a very 38 powerful concept in evolutionary quantitative genetics. 39 Formally, the selection gradient is the vector of partial derivatives of relative fitness with 40 respect to phenotype, averaged over the distribution of phenotype observed in a population. 41 Given an arbitrary function W(z) for expected fitness of a (multivariate) phenotype z, a general 42 expression for the directional selection gradient β is Z @W(z) β = W¯ −1 p(z)dz (1) @z 43 where p(z) is the probability density function of phenotype, and W¯ is mean fitness. Mean fitness R 44 can itself be obtained by W(z)p(z)dz. A quadratic selection gradient can also be defined as the 45 average curvature (similarly standardised), rather than the average slope, of the relative fitness 46 function, Z @2W(z) γ = W¯ −1 p(z)dz: (2) @z2 47 The directional selection gradient has a direct relationship to evolutionary change, assuming 48 that breeding values (the additive genetic component of individual phenotype, Falconer 1960) 49 are multivariate normally-distributed, following the Lande (1979) equation ∆z¯ = Gβ (3) bioRxiv preprint doi: https://doi.org/10.1101/040618; this version posted February 22, 2016. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY 4.0 International license. Morrissey and Goudie, log-scale fitness models and selection gradients 4 50 where ∆z¯ is per-generation evolutionary change, and G is the additive genetic covariance matrix, 51 i.e., the (co)variances among individuals of breeding values. The quadratic selection gradient 52 matrix has direct relationships to the change in the distribution of breeding values due to 53 selection, but not with such simple relationships between generations as for the directional 54 selection gradient and the change in the mean (Lande & Arnold, 1983). 55 The primary method for obtaining selection gradient estimates has been a simple and robust 56 approach justified in Lande & Arnold (1983). The method involves least-squares multiple 57 regression of relative fitness, i.e., absolute fitness divided by the mean observed in any comparable 58 group of individuals over a specific period of the life cycle, potentially the entire life cycle, on 59 measures of phenotype. Fitness, or any component of fitness, will typically have highly non- 60 normal residuals in such a regression. Nonetheless, the simple least-squares methods are unbiased 61 (see Geyer & Shaw 2010). However, methods that account for distributions of residuals that 62 arise in regressions involving fitness as a response variable may provide better precision and 63 more reliable statements about uncertainty (i.e., standard errors, p-values, etc.). 64 Some progress has been made at developing generalised regression model methods for 65 inference of selection gradients. Janzen & Stern (1998) proposed a method for binomial responses 66 (e.g., per-interval survival, mated vs. not mated). The Janzen & Stern (1998) method provides 67 estimates of β, and requires fitting a logistic model with linear terms only, calculating the 68 average derivatives at each phenotypic value observed in a sample, and then standardising 69 to the relative fitness scale. Morrissey & Sakrejda (2013) expanded Janzen & Stern's (1998) 70 basic approach to arbitrary fitness functions (i.e., not necessarily linear) and arbitrary response 71 variable distributions, retaining the basic idea of numerically averaging the slope (and curvature) 72 of the fitness function over the distribution of observed phenotype. Shaw & Geyer (2010) 73 developed a framework for characterising the distributions of fitness (and fitness residuals) that 74 arise in complex life cycles, and also showed how the method could be applied to estimate 75 selection gradients by averaging the slope or curvature of the fitness function over the observed 76 values of phenotype in a sample. 77 Of the many forms regression analyses of trait-fitness relationships might take, log-linear 78 or log-quadratic models of the relationship between traits and expected absolute fitness may bioRxiv preprint doi: https://doi.org/10.1101/040618; this version posted February 22, 2016. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY 4.0 International license. Morrissey and Goudie, log-scale fitness models and selection gradients 5 79 be particularly useful. In generalised linear models, the log link function is often useful and 80 pragmatic. Fitness is a necessarily non-negative quantity, and expected fitness will typically best 81 be modelled as a strictly positive quantity. This will indeed be the case if expected fitness is an 82 exponential function of the sum of the predictors of the regression model, or, equivalently, a log 83 link is used. Also, a log link function is compatible with generalised linear models with various 84 distributions that could be useful for modelling fitness or fitness components. For example, 85 it can be used with the Poisson distribution (counts, e.g., number of mates or offspring), 86 the negative binomial distribution (for counts that are overdispersed relative to the Poisson 87 distribution, potentially including lifetime production of offspring), and the exponential and 88 geometric distributions (e.g., for continuous and discrete measures of longevity). The purpose 89 of this short paper is to investigate the relationships between log-linear and log-quadratic models 90 of fitness functions, and selection gradients. 91 2 Log-linear and log-quadratic fitness functions, and se- 92 lection gradients 93 Selection gradients turn out to have very simple relationships to the coefficients of log- 94 linear regression models predicting expected fitness from (potentially multivariate) phenotype.