Methods in Ecology and Evolution 2016 doi: 10.1111/2041-210X.12707

REVIEW

Measuring and interpreting sexual selection metrics: evaluation and guidelines

Nils Anthes*,1, Ines K. Häderer1, Nico K. Michiels1 and Tim Janicke2

1Animal Evolutionary Ecology Group, Institute for Evolution and Ecology, University of Tübingen, Auf der Morgenstelle 28, 72076 Tübingen, Germany; and 2Centre d’Ecologie Fonctionnelle et Évolutive, UMR 5175, CNRS, University of Montpellier, 1919 Route de Mende, 34293 Montpellier Cedex 05, France

Summary

1. Routine assessments of overall sexual selection, including comparisons of its direction and intensity between sexes or species, rely on summary metrics that capture the essence of sexual selection. Nearly all currently employed metrics require population-wide estimates of individual mating success and reproductive success. The resulting sexual selection metrics, however, can heavily and systematically vary with the chosen approaches in terms of sampling, measurement, and analysis.

2. Our review illustrates this variation, using the Bateman gradient, a particularly prominent sexual selection metric. It represents the selection gradient on mating success and – given the latter’s pivotal role in defining sexual selection – reflects a trait-independent integrative proxy for the maximum strength of sexual selection. Drawing from a recent meta-analysis, we evaluate potential biases arising from study design, data collection and parameter estimation, and provide suggestions to mitigate such biases in future studies.

3. With respect to study design, we argue that currently almost inexistent manipulative studies must complement the dominating correlative studies to inform us about causality in sexual selection. With respect to data collection, we outline how different measures of mating and reproductive success affect the components of sexual (and natural) selection that are reflected in standard summary metrics. With respect to parameter estimation, we show the potential impact of decisions about data inclusion and the chosen quantitative approach on inferences of sexual selection and its sex difference.

4. We expect this meta-analytical review to aid future studies in providing less biased and more informative estimates of sexual selection.

Key-words: Bateman gradient, causality, selection gradient, sex difference, sexual selection

Introduction

Quantifying sexual selection and its effects on trait evolution is central to contemporary research. The vastly dominating approach rests on Angus J. Bateman’s (1948) idea to interpret variances in, and the linear relationship between, mating success and reproductive success as ‘signs’ and ‘causes’ of sexual selection, respectively. Later work has formalized his ideas within selection theory and clarified the degree to which Bateman’s proxies reflect opportunities for, rather than actual, selection (Wade 1979; Wade & Arnold 1980; Arnold & Duvall 1994; Jones 2009; Table 1). In essence, Bateman metrics echo Darwin’s (1871) conception that sexual selection comprises those components of total selection that are mediated through mating success (Arnold 1994). Modern definitions of sexual selection also include post-copulatory competition for access to gametes. For this review, we stick to ‘mating’ as the anchor that distinguishes sexually selected fitness components from other forms of selection, accepting a focus on precopulatory (or pre-spawning) selection episodes (but see section ‘Mating success: definition and meaning’).

Previous empirical work has been dominated by summary metrics proposed to reflect the opportunity for, and strength of, sexual selection (Table 1). Recent overviews (Jones 2009; Mobley 2014; Henshaw, Kahn & Fritzsche 2016) accessibly summarize their conceptual basis and calculation, and recapitulate a lively debate on the validity of variance-based as well as trait-based metrics to capture sexual selection (refs. in Table 1), none of which we intend to reiterate or evaluate. Yet, while compiling a meta-analysis on sex differences in sexual selection (Janicke et al. 2016), we realized that the existing empirical work varies remarkably in key aspects of study design and analysis. Some approaches can generate inappropriate interpretations of sexual selection metrics, or estimates that are not comparable between sexes, populations, or species. While earlier work addresses several individual challenges in isolation, we lack a comprehensive and quantitative assessment of the methodological pitfalls associated with the quantification of routine sexual selection metrics.

*Correspondence author. E-mail: [email protected]

© 2016 The Authors. Methods in Ecology and Evolution © 2016 British Ecological Society

Table 1. Quantitative metrics commonly used to characterize sexual selection and mating systems independent of specific traits, and references discussing their calculation and interpretation (modified and extended from Klug et al. 2010 and Mobley 2014)

Metric | Description | Key references
Opportunity for selection (I) | Intra-sexual variance in relative reproductive success. Its square root captures the upper limit of total linear selection on standardized traits | Crow (1958), Wade (1979), Arnold & Wade (1984b) and Jones (2009)
Opportunity for sexual selection (Is) | Intra-sexual variance in relative mating success. Its square root captures the upper limit for mating differentials m’, i.e. the covariance between standardized traits and mating success | Wade (1979), Wade & Arnold (1980), Jones (2009); recent reviews in Klug et al. (2010), Krakauer et al. (2011), Jennions, Kokko & Klug (2012) and Evans & Garcia-Gonzalez (2016)
Bateman gradient (βss) | Linear slope of a least squares regression of relative reproductive success on relative mating success, describing the average fitness gain associated with each additional mating. Technically represents a selection gradient on mating success, often seen to reflect the average ‘strength’ of directional sexual selection | Arnold & Duvall (1994), Andersson & Iwasa (1996) and Jones (2009)
Jones index (s’max) | Calculated as βss √Is. Defines the upper limit for sexual selection differentials in units of phenotypic standard deviations (= maximum standardized sexual selection differential) | Jones (2009) and Henshaw, Kahn & Fritzsche (2016)
Morisita index (Id) | Observed variance in mating (or reproductive) success relative to the expected variance under uniform mate acquisition probabilities | Morisita (1962)
Index of resource monopolization (Q) | Ratio of observed to maximum variance in mating (or reproductive) success | Ruzzante et al. (1996)
Upper limit to the opportunity for selection | Estimated maximum gain in reproductive success in idealized mating interactions, independent of mate fecundity | Lorch (2005)
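To make the definitions in Table 1 concrete, the following minimal sketch computes I, Is, the Bateman gradient and the Jones index for the individuals of one sex. It is an illustration under stated assumptions, not a published implementation: the function name `bateman_metrics`, the toy data, and the use of the sample variance (`ddof=1`) are choices made here.

```python
import numpy as np


def bateman_metrics(mating_success, reproductive_success):
    """Summary metrics of Table 1 for the individuals of one sex.

    Both arguments hold one value per individual, in the same order.
    """
    ms = np.asarray(mating_success, dtype=float)
    rs = np.asarray(reproductive_success, dtype=float)

    # Relative success: divide by the mean of the sex under study.
    rel_ms = ms / ms.mean()
    rel_rs = rs / rs.mean()

    # Opportunity for selection (I) and for sexual selection (Is):
    # variances in relative reproductive and relative mating success.
    # Using the sample variance (ddof=1) is an assumption of this sketch.
    opp_selection = rel_rs.var(ddof=1)
    opp_sexual_selection = rel_ms.var(ddof=1)

    # Bateman gradient: least-squares slope of relative RS on relative MS.
    beta_ss = np.cov(rel_ms, rel_rs, ddof=1)[0, 1] / opp_sexual_selection

    # Jones index: Bateman gradient times the square root of Is.
    jones_index = beta_ss * np.sqrt(opp_sexual_selection)

    return {
        "I": float(opp_selection),
        "Is": float(opp_sexual_selection),
        "beta_ss": float(beta_ss),
        "s_max": float(jones_index),
    }


# Hypothetical toy data: five individuals of one sex.
metrics = bateman_metrics([0, 1, 2, 3, 4], [0, 2, 4, 6, 8])
```

Because both variables are divided by their mean before the regression, the resulting slope is unit-free and can be compared between sexes, which is the property the sex-difference analyses discussed in this review rely on.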

We scrutinize the quantification of (sex differences in) sexual selection metrics with respect to three components: the underlying study design (section ‘Study design’), data collection (section ‘Data collection’), and parameter estimation (section ‘Parameter estimation’). For each component, we discuss pitfalls during data acquisition, analysis, and interpretation, where possible quantify their empirical prevalence and statistical consequences using our meta-analysis database (Janicke et al. 2016), and propose guidelines to help prevent these problems in future studies.

Throughout, we exemplify those issues with a focus on the Bateman gradient (βss), representing a widely used integrative proxy of sexual selection (Klug et al. 2010). Contrary to purely variance-based proxies such as I and Is (Table 1), Bateman gradients are considered to reliably capture the overall direction and intensity of, and sex difference in, sexual selection (Jones et al. 2005; Janicke et al. 2016). One proxy directly derived from the Bateman gradient, the Jones index s’max, shows particularly good performance (Henshaw, Kahn & Fritzsche 2016; Table 1) and captures upper limits to sexual selection differentials on any given trait (Jones 2009). We wish to stress that our methodological arguments extend to all other recently advocated metrics of sexual selection (Table 1) that share a necessity to quantify individual reproductive success and individual mating success in a population. This includes integrative metrics that capture ‘potential’ for sexual selection just as metrics that quantify selection on specific candidate traits (reviewed by Kingsolver et al. 2012).

Study design

This section treats pitfalls that researchers encounter when designing a study to quantify sexual selection. We discuss the degree to which Bateman gradients reflect causality, the need to estimate sexual selection across a meaningful range of mating frequencies, and the significance of field vs. laboratory studies.

CAUSALITY VS. CORRELATION IN BATEMAN GRADIENTS

Arnold & Duvall’s (1994) path diagrammatic view of sexual selection focuses on two multiplicative fitness components. The mating differential, m’ (as defined by Jones 2009), quantifies the covariance between candidate traits and mating success, and thus identifies traits that aid individuals in increasing access to mates. The sexual selection gradient or Bateman gradient, βss, quantifies the expected fitness gain associated with each additional mating, averaged across the investigated mating success range. Their product, m’ βss, describes the standardized sexual selection differential on a given trait (Jones 2009). While both fitness components are routinely quantified in correlative studies, we argue that they capture sexual (vs. other forms of) selection only if they signify causality. We focus our argumentation on the Bateman gradient, but note that the same logic extends to the mating differential.

The interpretation of a positive Bateman gradient that – on average – individuals benefit from mating more often rests on

the assumption that we describe a causal relationship between mating success and reproductive success, not just a correlation (Klug et al. 2010; Fig. 1a). Empirical assessment of causality requires that the predictor trait – mating success – is under experimental control, so that confounding traits cannot impose spurious relationships between mating and reproductive success (Mitchell-Olds & Shaw 1987; Walker 2014). Even though this problem has been spelled out repeatedly (Ketterson et al. 1997; Parker & Tang-Martinez 2005; Gerlach et al. 2012), quantifications of Bateman gradients routinely leave mating frequencies free to vary, meaning that estimated Bateman gradients are purely correlational.

Correlative Bateman slopes can be confounded by any trait that directly affects both mating success and reproductive success (Collet et al. 2014), generating Bateman gradients that do not represent a causal link (Fig. 1c). One such trait can be body size (Ketterson et al. 1997; Gerlach et al. 2012). Fecundity often scales with size, so that larger females produce more eggs. Large females may therefore be particularly attractive to males as mating partners. Females that mate frequently will then also produce more offspring, resulting in a positive Bateman gradient that indicates – apparent – sexual selection. In reality, however, the positive slope arises from a budget effect (van Noordwijk & de Jong 1986) that simultaneously generates fecundity selection on females and a male preference to mate with larger females. In this scenario, female reproductive success does not increase because of elevated mating success, but female mating success and reproductive success both increase as a result of the underlying confounding variation in body size. Fecund females here attract more males as a result, not as a cause, of their higher reproductive potential (Ketterson et al. 1997), turning the meaning of the Bateman gradient upside down (Fitzpatrick 2015). Similar scenarios can be developed for males. For example, phenotypic condition can directly affect both mating success and sperm competitive ability, and thus generate a spurious positive Bateman gradient. Similarly, larger size in sex-role reversed brooding species (e.g. pipefishes) likely allows males to store more egg batches and therefore seek more mating bouts, without the resulting positive Bateman slope originating from sexual selection.

Analytical treatment is problematic

Many studies add potential confounding factors (or their residuals) as covariates to multiple regression analyses to statistically ‘control’ for their effect. For example, an initially positive female Bateman gradient in northern water snakes became non-significant after adding female body size as a covariate (Prosser et al. 2002). We warn against this approach to infer causality (rather than to just predict; see also Walker 2014).

First, the ‘corrected’ Bateman gradient cannot reveal hidden causality. In the water snake example, the non-significant Bateman gradient after ‘controlling’ for body size does not show the lack of a causal relationship between mating and reproductive success – it just illustrates that correlational studies cannot distinguish causality from confounding.

Second, it will be almost impossible to be aware of, and measure, all possible confounding factors, leaving us with an incomplete control at best. Body size represents just one of many candidate traits, most of which are difficult to identify, measure, and deal with, such as condition, health, foraging efficiency, or resource acquisition (Ashman & Morgan 2004; Parker & Tang-Martinez 2005; Gerlach et al. 2012). Hence, in correlative analyses, it will be hard – if not impossible – to

Fig. 1. Sources of confounding when estimating Bateman gradients. (a) Sexual selection assumes that a trait has a direct and causal effect on mating success (MS), which in turn has a direct and causal effect on reproductive success (RS). (b) In the absence of sexual selection, we correctly identify a trait as not sexually selected when either the mating differential, m’, or the Bateman gradient, βss, reveal no correlation. (c) False conclusions arise when statistically significant correlations for m’ or βss falsely imply sexual selection, even though actual selection is purely through non-sexual components (s’). [Path diagrams not reproduced. Panel titles: (a) Trait is sexually selected; (b) Trait is correctly identified as not sexually selected; (c) Trait is under exclusive natural selection, yet classic Bateman gradient analyses would misleadingly detect ‘sexual selection’ on this trait. Arrow legend: causal link; correlation falsely implying causality; uncorrelated.]

disentangle the extent to which Bateman gradients arise from sexual vs. natural (or fecundity) selection (Jennions & Kokko 2010).

Avoid confounding effects by design

Given the necessarily incomplete analytical treatments of confounding outlined above, we call for complementary studies that experimentally manipulate mating success as the causal trait of interest. A direct manipulation of mating success is challenging, of course, but recent work in freshwater snails (Anthes et al. 2010) largely successfully imposed preassigned mating frequencies with randomly assigned partners drawn from replicate populations that were themselves exposed to defined mating schedules. This scenario allows measuring the expected average returns of mating more often against a biologically meaningful standardized background. One may also just control mating frequencies while allowing mate choice among a set of ‘background partners’ during each allocated mating opportunity. In both scenarios, it is crucial that the experimenter (rather than the study subjects) determines individual mating success. Mating and reproductive success of the study subject is then scored within, and expressed relative to, its allocated background population that experienced unrestricted mating.

Experimental control of mating success clearly faces limitations. First, the approach only works for behaviour-based definitions of mating success (e.g. the number of copulations, or partners) but not when estimating mating success via parentage analysis (see section ‘Mating success: definition and meaning’). Second, the experimenter may have full control over mating opportunities, but not over actual matings. Hence, the realized mating frequencies of the study subjects will often be lower than intended. Laborious observation sessions with repeated exposure of study subjects may therefore be necessary to impose the a priori defined mating frequencies and even then fail in some individuals. Third – when the results should reflect a natural scenario – the chosen mating frequencies must reflect a range that is relevant to natural populations, an issue to which we return in section ‘Investigated range of mating frequencies’. Finally, in specific systems, artificial mate limitation imposed through ‘staged matings’ may not realistically capture sexual selection. For example, if mate availability is generally high and alternative strategies allow achieving similar reproductive success independent of mating frequencies, staged matings may reveal a degree of sexual selection that natural populations never experience.

Only two recent Bateman gradient studies applied controlled mating frequencies. In red-backed spiders, Latrodectus hasselti, focal individuals were subjected to no, one, or two copulations in equal proportions, afterwards correcting for the representation of those specific mating rates in the wild (Andrade & Kasumovic 2005). These spiders usually copulate with virgins, making the establishment of a common sperm competition background unnecessary. A study in the pond snail, Biomphalaria glabrata, provides a first comparison of Bateman gradients under manipulated and unmanipulated mating frequencies (I.K. Häderer & N. Anthes, unpublished data, see also Anthes et al. 2010). It found a more pronounced sex difference in Bateman gradients caused by a more positive male gradient when mating frequencies were manipulated, highlighting the relevance of establishing causality when assessing benefits of mating.

INVESTIGATED RANGE OF MATING FREQUENCIES

Bateman gradients are context dependent (Ashman & Morgan 2004) so that the slope of the regression of reproductive success on mating success can vary with the distribution of investigated mating frequencies (Arnold 1994; Kokko & Wong 2007; Kokko, Klug & Jennions 2012). Empirical studies are only rarely able to measure reproductive success across full lifetimes and thus often do not capture the ultimate currency for selection (Arnold & Wade 1984a; Arnold & Duvall 1994). 80% of the studies included in our meta-analysis (57 out of 72, Janicke et al. 2016) follow cohorts across a single breeding season or parts thereof. Generalizations about sexual selection must then assume that the observed qualitative patterns (e.g. the Bateman slope or its sex difference) hold across non-measured components of reproductive lifetime (Arnold & Duvall 1994). This is a strong assumption given that the distribution of, and the link between, mating success and reproductive success may vary with systematic shifts at the individual level (cumulative effects of mating, J.N.A. Hoffer & J.M. Koene, pers. comm.; sex-specific senescence, Escobar et al. 2008), the population demographic level (temporal changes in sex ratios, Forsgren et al. 2004; Baena & Macías-Ordóñez 2012; Wacker et al. 2014), or the environmental level (e.g. food availability, Janicke, David & Chapuis 2015; Morimoto, Pizzari & Wigby 2016; see also Siepielski, DiBattista & Carlson 2009; Evans & Garcia-Gonzalez 2016). Therefore, sexual selection metrics must always be interpreted in the context of the timeframe and conditions under which they had been established.

Choose an appropriate range of mating success

Empirical work is usually conducted over short periods and may thus miss the full range of mating success. Where feasible, studies should then assess sexual selection at different episodes within or across seasons to further our understanding of within-population variation in sexual selection. In manipulative studies (section ‘Causality vs. correlation in Bateman gradients’), each mating success category should ideally be represented in proportion to its natural frequency (Kokko, Klug & Jennions 2012). An experiment that represents each chosen mating success category equally automatically inflates the statistical contribution of the rare mating success categories. When the goal of the study is to make specific predictions about the ‘benefit of mating more often’ in a defined population type, researchers can weight the contribution of each mating success category based on its proportional representation in free-ranging populations, as implemented in Andrade & Kasumovic (2005).
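The weighting idea described above can be sketched as a weighted least-squares slope, in which each individual's weight is the natural frequency of its mating success category divided by the number of individuals the experiment placed in that category. This is an illustrative sketch only and not necessarily the procedure of Andrade & Kasumovic (2005); the function name `weighted_bateman_slope`, the natural frequencies and the toy data are hypothetical.

```python
import numpy as np


def weighted_bateman_slope(mating_success, reproductive_success, natural_freq):
    """Weighted least-squares slope of reproductive on mating success.

    natural_freq maps each imposed mating success category to its
    (assumed) proportion in free-ranging populations.
    """
    ms = np.asarray(mating_success, dtype=float)
    rs = np.asarray(reproductive_success, dtype=float)

    # Number of experimental individuals per mating success category.
    categories, counts = np.unique(ms, return_counts=True)
    n_in_category = dict(zip(categories, counts))

    # Individual weight: natural frequency of the category, shared equally
    # among the individuals that the experiment assigned to it.
    w = np.array([natural_freq[m] / n_in_category[m] for m in ms])

    ms_bar = np.average(ms, weights=w)
    rs_bar = np.average(rs, weights=w)
    cov = np.sum(w * (ms - ms_bar) * (rs - rs_bar))
    var = np.sum(w * (ms - ms_bar) ** 2)
    return float(cov / var)


# Hypothetical balanced design with diminishing returns of mating:
# two individuals each with 0, 1 or 2 matings.
ms = [0, 0, 1, 1, 2, 2]
rs = [0, 0, 10, 10, 12, 12]

balanced = weighted_bateman_slope(ms, rs, {0: 1 / 3, 1: 1 / 3, 2: 1 / 3})
natural = weighted_bateman_slope(ms, rs, {0: 0.6, 1: 0.3, 2: 0.1})
```

In this toy example the fitness gain from the first mating exceeds that from the second, so down-weighting the (assumed) rare two-mating category steepens the slope relative to the balanced analysis. The slope is computed on raw rather than relative values to keep the sketch short.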

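The confounding argument of section ‘Causality vs. correlation in Bateman gradients’ can be illustrated with a small simulation. In this sketch, reproductive success depends on body size only, so the causal effect of mating success is zero by construction. All parameter values, the seed and the function names are arbitrary assumptions chosen for illustration.

```python
import numpy as np


def ols_slope(x, y):
    """Ordinary least-squares slope of y on x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1))


def simulate_confounding(n=5000, seed=1):
    """Return (correlative slope, manipulated slope) for one simulated sex."""
    rng = np.random.default_rng(seed)
    body_size = rng.normal(0.0, 1.0, n)

    # Reproductive success is driven by body size alone:
    # mating success has no causal effect on it in this simulation.
    rs = 10.0 + 3.0 * body_size + rng.normal(0.0, 1.0, n)

    # Correlative design: larger individuals also obtain more matings.
    ms_free = np.clip(np.round(2.0 + body_size + rng.normal(0.0, 0.5, n)), 0, None)

    # Manipulative design: the experimenter assigns 0 to 4 matings at random,
    # independent of body size.
    ms_assigned = rng.integers(0, 5, n).astype(float)

    return ols_slope(ms_free, rs), ols_slope(ms_assigned, rs)


correlative_slope, manipulated_slope = simulate_confounding()
```

Under these assumptions the correlative design yields a clearly positive slope although mating has no effect on reproductive success, whereas the slope under random assignment stays close to zero.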

FIELD VS. LABORATORY STUDIES

Most empirical work on Bateman gradients to date rests on field studies (55 out of 72 studies in Janicke et al. 2016). These provide measures under the often complex social and environmental conditions that ultimately shaped contemporary selection regimes. Yet, any conclusions remain necessarily restricted to the specific and uncontrolled conditions that prevailed during the study period. Field studies make it almost impossible to obtain complete information on offspring and their parents – sires in particular (Schlicht & Kempenaers 2013; Jones 2015; but see Rodriguez-Munoz et al. 2010). Moreover, field-derived Bateman gradients often remain correlative because mating success is almost inaccessible for experimental manipulation (section ‘Causality vs. correlation in Bateman gradients’). Finally, intraspecific variation in Bateman gradients among years and populations can be substantial (e.g. Mobley & Jones 2007, 2009; Apakupakul & Rubenstein 2015), highlighting the sensitivity of sexual selection to environmental variation. In an attempt to understand this variation, several field studies added manipulative components with respect to initial adult sex ratio (Mills et al. 2007; Fitze & Le Galliard 2008; Williams & DeWoody 2009), population density (Levitan 2008), or nest site availability (Fleming & Gross 1994). These approaches elegantly combine benefits of field and laboratory work, even though they may disrupt established social relationships that represent a prime benefit of working in the field.

Bateman’s (1948) own work rested on a laboratory experiment, but only few studies to date followed his approach. As a major critique, laboratory studies measure sexual selection in ignorance of relevant environmental factors such as predation or infection risks that may contribute to, and modify, sexual selection (e.g. Taylor, Price & Wedell 2014). Even though some of these can be integrated (e.g. Turnell & Shaw 2015), researchers will usually be naïve to relevant environmental factors. Hence, Bateman gradients measured under laboratory conditions do not necessarily translate to natural populations. However, laboratory studies allow controlling environmental variation within and among study populations and thus provide better comparable sexual selection estimates. They also offer easier access to complete information on mating behaviour (i.e. mating success) as well as offspring and their parents (i.e. reproductive success). Moreover, opposing predictions regarding sexual selection can be directly contrasted through experimental manipulation. Recent examples testing how environmental factors affect the strength of sexual selection include manipulations of sex ratio (Jones et al. 2000, 2005; Jones, Arguello & Arnold 2004; Aronsen et al. 2013), population density (Aronsen et al. 2013; Wacker et al. 2013), mating frequency (Andrade & Kasumovic 2005), nest availability (Singer et al. 2006), or food availability (Janicke, David & Chapuis 2015; Morimoto, Pizzari & Wigby 2016).

Within-species comparisons of field and laboratory quantifications of Bateman metrics are rare to date. The pipefish, Syngnathus scovelli, exhibited similarly female-biased opportunities for sexual selection, Is, in both field (Jones, Walker & Avise 2001) and laboratory (Rose, Paczolt & Jones 2013). In rough-skinned newts, Taricha granulosa, field- (Jones, Arguello & Arnold 2002) and laboratory-based (Jones, Arguello & Arnold 2004) assessments of the three Bateman parameters were strikingly similar. Interestingly, the best match occurred for laboratory estimates obtained under (more realistic) male-biased compared to equal sex ratios, emphasizing that laboratory-based estimates should closely mimic field conditions.

Take the best of both worlds

The path diagrammatic view of sexual selection helps to solve the apparent dilemma between field and laboratory studies: Ideally, the Bateman gradient establishes the direct causal effect of mating success on fitness, requiring experimental manipulation (section ‘Causality vs. correlation in Bateman gradients’). Whether individuals can achieve a given mating success under the prevailing environmental conditions – predation or infection risk, food restriction, etc. – is addressed at the stage of the mating differential m’, i.e. the relationship between trait expression and mating success, and this should then optimally be established under field conditions (again, best by manipulating the underlying traits; Travis & Reznick 1998). Notwithstanding, the choice between a laboratory or field study will often be set by the specific study system. Where feasible, we consider a combination of both as particularly fruitful, to obtain a comprehensive picture of causality in sexual selection (laboratory component) and the relevance that this relationship exhibits in nature (field studies).

Data collection

Measures of mating success and reproductive success are central to the calculation of almost all current proxies of sexual selection and mating systems (Table 1). Recent work, however, differs strikingly in how each of these traits has been defined and measured, affecting the calculation and biological interpretation of resulting estimates. This section draws attention to problems associated with divergent definitions, and calls for explicit and precise descriptions of the chosen measurement in published research.

MATING SUCCESS: DEFINITION AND MEANING

‘Mating success’ has been an ambiguous term since its early use, with its ascribed meaning spanning from directly observed copulations via successful copulations in the sense of gamete transfer to successful mating in the sense of fertilization, with the latter applicable also to broadcast spawning organisms (Levitan 2008).

Early empirical work on Bateman gradients almost universally applied the number of genetic partners (i.e. the number of partners with whom a focal individual shares offspring) as the proxy of mating success. Bateman (1948, pp. 353–354) also applied this measure, but remained unsatisfied because approximating the number of mating events with the number

of genetic mates cannot detect multiple inseminations or inseminations that failed to result in fertilization. To date, only a handful of studies applied measurements that relate more directly to the act of gamete exchange (e.g. Bjork & Pitnick 2006; Anthes et al. 2010; Rodriguez-Munoz et al. 2010).

What are the conceptual underpinnings for alternative choices? Ultimately, the definition of mating success determines the selection episode that is captured by the Bateman gradient (Anthes et al. 2010). For example, the number of copulation partners captures the effect of all traits that influence an individual’s ability to locate and persuade a conspecific to engage in at least one copulation, or to resist unwanted mating attempts. When focusing on copulation numbers, we additionally capture traits that enable multiple copulations with the same partner, or avoid repetitive copulations with familiar mates. This accounts also for cases where repeated copulations with the same partner provide fertilization benefits similar to those acquired through copulations with a novel partner. These two definitions probably come closest to what Darwin (1871) envisioned under (precopulatory) sexual selection, but risk including behavioural interactions with primarily social rather than sexual purposes, for example, in primates. When counting the number of genetic partners, we additionally include those components of postcopulatory sexual selection that determine whether a given male achieves any fertilization (or hatching) success. These three measures just exemplify the different options, with recent work moving towards even more fine-grained differentiation of pre and postcopulatory selection episodes (e.g. Marie-Orleach et al. 2016).

Our definition of mating success can also have quantitative consequences. As we move towards proxies of mating success that are based on shared offspring, we increasingly confound response and predictor variables (Anthes et al. 2010), potentially generating spurious Bateman gradients. With the Bateman gradient traditionally seen as a measure of the intensity of precopulatory rather than postcopulatory sexual selection, the usage of reproductive success to infer mating bears the risk of overestimating (Turnell & Shaw 2015) as well as underestimating (Danielsson 2001) precopulatory sexual selection.

Using our meta-analysis data set (Janicke et al. 2016), we tested for systematic changes in the opportunity for sexual selection (Is) and the Bateman gradient (βss) arising from the use of genetic mating success (gMS) instead of copulatory mating success (cMS). For this we computed effect sizes for ΔIs as the natural logarithm of the ratio between the coefficients of variation in mating success (lnCVR) when using gMS and cMS following Nakagawa et al. (2015), with higher values indicating that larger estimates were obtained when using gMS. Similarly, the effect size for Δβss was defined as Hedges d of the slope difference between the Bateman gradient obtained from using gMS and cMS, with larger values indicating a steeper gradient when using gMS. In total, we extracted effect sizes from eight studies, three of which provided estimates for both sexes. Given the small sample size (half of which originates from snails) we refrained from adding a phylogenetic correction but note the necessarily exploratory nature of this analysis.

As predicted (Fig. 2), gMS was associated with significantly larger estimates of Is (ΔIs: lnCVR ± SE: 0.432 ± 0.094; z-test: z = 4.613, K = 12, P < 0.001) and βss (Δβss: Hedges d ± SE: 0.263 ± 0.071; z-test: z = 3.721, K = 12, P < 0.001). Within the limited sample size for between-sex comparisons, these effects were not sex-specific (ΔIs: QM = 0.001, d.f. = 1, P = 0.994; Δβss: QM = 0.526, d.f. = 1, P = 0.468).

Match mating success and your targeted selection episode

One may be inclined to favour gMS on rather technical grounds: This proxy is dominant in existing estimates of Bateman parameters and therefore facilitates comparison with other studies. Moreover, retrospective sampling of offspring and their assignment to potential fathers and mothers likely remains the most feasible option for field studies, in particular.

[Forest plot. Rows: Anthes et al. 2010; Collet et al. 2014; Devost & Turgeon 2015; Häderer et al. in prep.; Janicke et al. 2015; Marie-Orleach et al. 2016; Pélissié et al. 2012; Turnell & Shaw 2015; Global effect size. Symbols distinguish females and males. x-axes: ΔIs (lnCVR ± 95% CI) and Δβss (Hedges' g ± 95% CI).]

Fig. 2. Changes in estimates for (left) the opportunity for sexual selection (ΔIs) and (right) the Bateman gradient (Δβss) when using genetic mating success (gMS) relative to a baseline using copulatory mating success (cMS). Forest plots show effect sizes (lnCVR and Hedges d) with their 95% confidence intervals (CIs). Positive values indicate higher estimates when using gMS.
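As a reading aid for forest plots of this kind, the sketch below shows how a global estimate and its standard error translate into a z statistic, a two-sided P value and a 95% confidence interval. It assumes a normal approximation and uses the rounded global ΔIs estimate reported in the text (0.432 ± 0.094); it is not the authors' meta-analytic model, and a z recomputed from rounded inputs need not match the reported z exactly.

```python
import math


def z_test(estimate, se):
    """Return z, the two-sided P value and the 95% CI (normal approximation)."""
    z = estimate / se
    # Two-sided tail probability of the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    half_width = 1.96 * se
    return z, p, (estimate - half_width, estimate + half_width)


# Rounded global effect size for ΔIs (lnCVR ± SE) as reported in the text.
z, p, (ci_low, ci_high) = z_test(0.432, 0.094)
print(f"z = {z:.2f}, P = {p:.2g}, 95% CI = [{ci_low:.3f}, {ci_high:.3f}]")
```

A confidence interval that excludes zero corresponds to P < 0.05 in this approximation.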


[Bar chart. y-axis: Number of studies (0 to 50); x-axis: Stage (Fertilized eggs, Embryos, Hatchlings, Immatures, F1 adults).]

Fig. 3. Frequency distribution of stages at which reproductive success has been scored in previous studies reporting Bateman metrics (derived from the database for Janicke et al. 2016).

On biological grounds, however, we favour copulation (or behaviour)-based over genetic estimates of mating success for the reasons outlined above. Copulation-based measures have recently also proven instrumental in identifying mutualistic or antagonistic episodes of sexual selection (Collet et al. 2012; Pischedda & Rice 2012; Pélissié et al. 2014; Turnell & Shaw 2015; Marie-Orleach et al. 2016). This refers to the relative contributions of pre- and postcopulatory selection to total variation in reproductive success (Evans & Garcia-Gonzalez 2016), but also to the relevance of mating order, copulation duration, or nuptial gifts in precopulatory episodes of selection. These studies also exemplify the importance of detailing the proxy of mating success that has been applied, the components of sexual selection this measure likely captures, and possible confounders of the relationship between mating success and reproductive success.

REPRODUCTIVE SUCCESS: ‘SEXUAL’ OR ‘NATURAL’ SELECTION?

What we outline above regarding the stage at which mating success is measured applies in similar terms to the stage at which reproductive success is scored. Previous studies span a continuum from the number of fertilized eggs to the number of mature or recruiting adults, with embryos and hatchlings clearly dominating (Fig. 3). Whether this choice indeed introduces systematic biases to the estimated Bateman parameters remains difficult to judge. Only three studies represented in our meta-analysis provided reproductive success estimates for different stages within the same data set. Two of those found that the Bateman gradients became steeper when offspring were counted at a later stage (hatchlings vs. immatures: Gerlach et al. 2012; embryos vs. hatchlings: Walker et al. 2014), and one found the opposite (eggs vs. hatchlings vs. recruiting offspring: Fitze & Le Galliard 2011).

Scoring reproductive success at a late stage (e.g. the number of mature recruiting offspring) has the advantage of capturing a reasonable end-point on which selection acts. This is because the entire set of direct (non-genetic) and indirect (genetic) effects of pre- and postcopulatory mate choice has had a chance to materialize, including genetic effects that may affect the offspring's reproductive performance rather than their early development or survival. Such ‘late’ measures, however, have been criticized for conflating sexual selection acting on the parents with direct selection on the offspring's genotype or phenotype ([…] & Wade 2001; Arnqvist 2013; Bergeron et al. 2013). To maintain connection to equations for evolutionary responses to selection, some researchers therefore advocated ‘early’ measures of reproductive success (e.g. Arnold & Wade 1984a). Reconciliation comes from a framework that extends the path diagrammatic view of sexual selection to incorporate mate and offspring quality up to offspring maturation while staying within a single breeding cycle (Fitzpatrick 2015). This allows integrating components of sexual selection that only materialize at post-zygotic stages, such as genetic or non-genetic benefits (e.g. in terms of parental care) to offspring survival and growth.

Match reproductive success and the targeted selection episode

Again, we cannot derive a fixed rule about the best possible stage to score reproductive success. Importantly, authors must detail and evaluate which selection episodes (including those commonly attributed to natural selection) have potentially acted up until their chosen reproductive success estimate. Where feasible, measuring reproductive success (and offspring performance) at multiple stages offers scope to better understand consecutive selection episodes (Fitzpatrick 2015).

REPRODUCTIVE SUCCESS: INCOMPLETE OFFSPRING SAMPLING

Particularly in open field populations, and most strongly in the non-caring sex(es) in species with uniparental or no parental care, we are typically able to sample only a subset of offspring produced. About one-third of our surveyed reproductive success estimates (27 out of 81 in Janicke et al. 2016) derived from incomplete offspring sampling, whether due to restrictions in field sampling, phenotypic marker expression, or molecular genotyping.

Incomplete sampling can massively affect the derived sexual selection metrics. Using simulated data, Mobley & Jones (2013) found that the direction of the bias varied with the true values of the mating system metrics. Selection opportunities (I and Is) calculated after incomplete offspring sampling were overestimated when small in reality but underestimated when large. The Bateman gradient was overestimated when the

actual slope was shallow, but reliable when the real slope was steep. These effects can also bias our interpretation of sex differences in sexual selection metrics, most strongly if incomplete offspring sampling is sex-specific (examples in Tatarenkov et al. 2008; Pélissié, Jarne & David 2012; Mobley & Jones 2013).

Correcting bias from incomplete sampling

Depending on the origin of the sampling bias and the targeted sexual selection proxies, three alternative approaches have been developed to take incomplete offspring sampling into account.

First, Mobley & Jones (2013) offer a maximum likelihood procedure to estimate mating system variables in open and incompletely sampled populations. The procedure has recently been implemented in a user-friendly interface called Batemanater, and provides estimates and bootstrap-based confidence intervals for I, Is and βss (Jones 2015). Second, focusing on between-sex comparisons of the variance in reproductive success (I), Pélissié, Jarne & David (2012) developed a procedure to estimate the expected value of the (binomial) sampling error that results from incomplete offspring sampling, and to adjust the estimated variances accordingly. Finally, Schlicht & Kempenaers (2013) developed an approach for open populations to check the sensitivity of sexual selection metrics to undetected sires (via sibship analysis) and to compare observed values against expected sexual selection metrics under random mating (similar approach in Baena & Macías-Ordóñez 2012).

Parameter estimation

This final section addresses potential issues arising during data analyses with respect to the influence of non-mating individuals, estimation errors in species with small clutch sizes, and the calculation of fitness.

NON-MATING INDIVIDUALS

Most Bateman gradient data sets contain individuals that did not achieve any mating success, forming the so-called zero mating success category. Whether to include these individuals into calculations of sexual selection metrics remains a matter of debate that already bothered Bateman (1948, p. 361).

From a conceptual point of view, the origin of the ‘zeroes’ affects the type of selection we consider (Anthes et al. 2010; Klug, Lindström & Kokko 2010; Fitze & Le Galliard 2011). For example, if zeroes arise from individuals of a given cohort that did not reach sexual maturity, we include components of natural selection (unless the reasons for mortality are themselves under sexual selection). In contrast, mature individuals that failed during precopulatory competition and obtained no access to the mating pool represent an essential component of precopulatory sexual selection.

From a methodological point of view, the frequency of individuals with zero mating success can depend on the study design and the chosen definition of ‘mating’. For example, inferring mating success from parentage assignment (genetic mating success) in completely sampled populations inflates the zero mating success category, because all individuals that copulated without achieving detectable fertilization success are allocated to this group (cf. section ‘Mating success: definition and meaning’). The reverse occurs when inferring the set of potential parents exclusively from offspring genotypes (often the only feasible strategy when studying open populations). Here, the sampling technique is blind to unsuccessful parents and the mating success of all individuals is at least 1 (e.g. Munroe & Koprowski 2011; Poesel, Gibbs & Nelson 2011). Moreover, when studies are initiated with virgin individuals and the study period is short, far more individuals than realistic under field conditions may end up in the ‘unsuccessful’ group with zero reproductive success, irrespective of the chosen proxy of mating success.

From a mechanistic point of view, the inclusion of the zero mating success category merges two selection episodes, namely participation in reproduction (by mating at least once) and the benefits of remating, and it will depend on the study system to what degree each of these should be attributed to natural or sexual selection (e.g. Klug, Lindström & Kokko 2010). Under the definition that sexual selection represents selection mediated through mating success, the zero mating success category was explicitly included into the original derivations of Bateman gradients by Wade & Arnold (1980, p. 452), following the argument that excluding non-mating individuals would underestimate sexual selection.

Finally, from a statistical point of view, we expect inclusion of the zeroes to increase estimated sexual selection metrics, most so in the sex that shows no or diminishing fitness returns from remating (e.g. Klug, Lindström & Kokko 2010; Fitze & Le Galliard 2011). To test this predicted bias, we extracted male and female estimates of Is and βss from 28 studies in our meta-analysis data set (cf. section ‘Mating success: definition and meaning’) that allowed us to include and exclude individuals with zero mating success. We then computed ΔIs and Δβss, with higher values indicating that the inclusion of individuals with zero mating success yields higher estimates.

As predicted (Fig. 4), inclusion of individuals with zero mating success led to higher estimates of the opportunity for sexual selection, Is (ΔIs: lnCVR ± SE: 0.485 ± 0.049; z-test: z = 9.928, K = 58, P < 0.001). This systematic difference was unaffected by sex (QM = 0.912, d.f. = 1, P = 0.340), but increased with the proportion of individuals showing zero mating success in the population (QM = 249.567, d.f. = 1, P < 0.001). For the Bateman gradient, βss, we found neither an overall difference (Δβss: Hedges' d ± SE: 0.050 ± 0.034; z-test: z = 1.485, K = 34, P = 0.138) nor a clear sex-specific effect (QM = 2.605, d.f. = 1, P = 0.107). Nevertheless, post hoc tests indicate that, when excluding the zero mating success category, Bateman gradients tend to become shallower in females (Δβss: Hedges' d ± SE: 0.104 ± 0.047; z-test: z = 2.199, K = 16, P = 0.028) but not in males (Δβss: Hedges' d ± SE: 0.001 ± 0.046; z-test: z = 0.029, K = 16, P = 0.977). Again, estimates of Δβss were positively correlated


with the proportion of individuals showing zero mating success (QM = 6.387, d.f. = 1, P = 0.012).

Taken together, the decision whether to include the zero mating success category potentially affects the magnitude of the estimated Bateman gradients and the degree to which they show a sex difference. More precisely, inclusion of zero MS typically generates higher estimated female but not male Bateman gradients and thus weaker differences in Bateman slopes between the sexes.

[Forest plot. Rows: Andrade & Kasumovic 2005; Anthes et al. 2010; Becher & Magurran 2004; Bergeron et al. 2010; Borgerhoff Mulder 2009; Courtiol et al. 2013; Croshaw 2010; Dickinson 1988; Emlen & Wrege 2004; Fincke 1982; Gagnon et al. 2012; Gopurenko et al. 2007; Hafernik & Garrison 1986; Janicke et al. 2015; Jones et al. 2000; Jones et al. 2004; Mangold et al. 2015; McCauley 1983; McLain & Boromisa 1987; Mobley & Jones 2012; Pélissié et al. 2012; Pongratz & Michiels 2003; Schulte-Hostedde et al. 2004; Serbezov et al. 2013; Ursprung et al. 2011; Walker et al. 2014; Williams & DeWoody 2009; Woolfenden et al. 2002; Global effect size. Symbols distinguish females and males. x-axes: ΔIs (lnCVR ± 95% CI) and Δβss (Hedges' d ± 95% CI).]

Fig. 4. Changes in (left) the opportunity for sexual selection (ΔIs) and (right) the Bateman gradient (Δβss) when excluding the zero mating success category. Forest plots show effect sizes (lnCVR and Hedges d) with their 95% confidence intervals (CIs). Positive values indicate higher estimates when including individuals with zero mating success.

Consider the biology behind zero mating success

Given the considerations above, there is no a priori reason that zero mating success must be generally included or excluded in estimates of sexual selection. The decision depends, first, on the selection episode one is interested in, i.e., whether one wishes to delineate between natural and sexual selection, or between fertility insurance and remating benefits as outlined above. Second, researchers need to evaluate, within their specific experimental paradigm, why certain individuals obtained zero matings, i.e. whether they are of biologically relevant origin (and then attributable to natural or sexual selection, cf. section ‘Investigated range of mating frequencies’) or mere by-products of the study conditions. To facilitate between-study comparisons, we recommend routinely reporting Bateman gradients with and without the zero mating success category, and discussing the biological implications of any qualitative differences between both approaches.

BIAS IN SMALL BROODS

As we have seen, many studies infer mating success indirectly via genetic parentage rather than through direct observation (section ‘Mating success: definition and meaning’). In these cases, fecundity variation can substantially confound the number of inferred mates, for females in particular, because it is

more likely to detect multiple genetic mates in broods of females that produce more offspring (Ketterson et al. 1997; Parker & Tang-Martinez 2005; Anthes et al. 2010; Gerlach et al. 2012; Arnqvist 2013).

While we expect this sampling artefact to be negligible in species with large offspring numbers relative to the number of potential contributors, it can introduce substantial bias in species that routinely produce few offspring such as […], many reptiles, […], and others. Within our meta-analysis data set (Janicke et al. 2016), species with average individual offspring numbers <3 make up one quarter of all studies (12 out of 48 studies for which data on the original distribution of RS was available), indicating that this problem is potentially widespread. The consequences can be dramatic because our response variable (reproductive success) is bound to be equal to or larger than the predictor (mating success), in extreme cases generating strong but spurious positive Bateman gradients for males and females even in cases where (re-)mating provides no or only limited fertility benefits.

Analytical treatment: apply with care

Several a posteriori approaches have been proposed to correct for these sampling artefacts. The first adds covariates underlying presumed fecundity effects (e.g. body size, Prosser et al. 2002) to the regression analysis in order to statistically ‘control’ for the sampling artefact. As outlined in section ‘Causality vs. correlation in Bateman gradients’, this approach is insufficient and may even introduce novel, unwanted biases if the added covariates are not the causal agent underlying the variation in fecundity.

Second, Bergeron et al. (2012) subtracted the observed number of matings from individual offspring numbers and recalculated Bateman gradients with this adjusted estimate of reproductive success, RSadj. This procedure, however, does not correct the sampling artefact. RSadj counts the number of offspring born or sired beyond the number of partners with whom a male or female shared offspring. Individuals thus automatically obtain high RSadj when siring offspring with as few partners as possible, whereas ‘successful multiple mating’ is heavily penalized. For example, males that are good in sperm competition obtain high RSadj irrespective of their mating success, whereas poor sperm competitors stay close to zero. It also becomes unclear what the resulting Bateman gradients mean (‘the number of offspring, beyond those needed to detect mating, gained per genetic mate’): stronger sexual selection does not necessarily mean that this gradient becomes steeper.

Third, data on expected rates of multiple mating (or extra-pair fertilization) could help to correct observed mating success values for detection probabilities depending on brood size, at least in females (for which brood size is often known) (Burley & Parker 1998; parameterized in Parker & Tang-Martinez 2005). Though feasible, this approach requires extensive data on actual copulations and their relationship with estimates of ‘gMS’. With these data at hand, we consider it more straightforward to directly use a copulation-based definition of mating success (section ‘Mating success: definition and meaning’) rather than performing necessarily unsatisfying a posteriori corrections of the predictor variable.

Finally, Gagnon, Duchesne & Turgeon (2012) offered a permutation solution. Here, the correlation coefficient between reproductive success and mating success is calculated on the original data and evaluated against the probability of obtaining this or more extreme coefficients in permuted combinations of the observed RS and MS values under the restriction that RS ≥ MS. This procedure indeed appropriately tests whether observed Bateman slopes, and their sex differences, are larger than expected by chance under the restriction of genetic mate sampling.

Experimental exclusion: recommended

In order to avoid a posteriori solutions that face the restrictions outlined above, biases introduced by sampling artefacts are best avoided at the study design stage. First, by scoring mating success at the level of observed copulations (section ‘Mating success: definition and meaning’) we remove the origin of the bias, because it is a problem inherent to parentage-based estimates of mating success. Second, given that the latter approach still faces problems with confounding, we propose that controlled mating frequencies (section ‘Causality vs. correlation in Bateman gradients’) offer the most suitable approach to remove bias, because this entirely disentangles estimated mating success from variation in egg laying rates, fecundity, condition, or size.

RELATIVE VS. ABSOLUTE FITNESS

Sexual selection metrics are typically calculated from measurements of absolute mating success and reproductive success. We argue, however, that regression analyses, and in particular comparisons among sexes (or populations, consecutive breeding seasons, species), must be performed on relative values (cf. Wilson 2004; Jones 2009), i.e., individual observations on mating and offspring numbers divided by their respective population mean.

Absolute measures of mating and reproductive success appear convenient because they allow intuitive data interpretation. However, they become problematic when Bateman gradients (= regression slopes) are compared, because this comparison, for example via ANCOVA t-tests on the interaction between sex and mating success, is sensitive to the scale at which the slopes are measured. Absolute vs. relative measures generate identical slope comparisons as long as the samples are fully congruent (i.e. they are based on full mating and fitness assignments from the same population at a balanced sex ratio so that mean mating success and mean reproductive success are identical, as required by the Fisher condition). In many empirical studies, however, these requirements are violated, because the sex ratio deviates from unity, there are sex differences in how completely mating and reproductive success are estimated (e.g. in Bateman 1948; see Snyder & Gowaty 2007), mean mating success differs between time periods or data subsets, or offspring numbers vary among years due to differential mortality (e.g. Byers & Dunn


2012; Bergeron et al. 2013). In all these situations, Bateman gradients (or other sexual selection metrics) are established over different data ranges and at different scales, generating scaling effects that are not informative about sexual selection.

To exemplify this effect, we extracted (from Janicke et al. 2016) all 15 studies that report, or allowed calculating, Bateman gradients on absolute and relative mating and reproductive success, and display their dependency on mean offspring number, i.e. the scale at which the Bateman gradients had been measured (Fig. 5). Absolute Bateman gradients become substantially steeper with larger mean offspring number, but this dependency vanishes (as intended) when relative values or standardized effect sizes are applied.

Use relative, not absolute fitness proxies to compare sexes or populations

The recommendation is straightforward: Whenever calculating Bateman metrics, and in particular when statistically comparing data subsets, e.g. regarding Bateman slopes, use relativized measures of mating success and reproductive success.

COMBINING ESTIMATES FROM POPULATION REPLICATES

Several earlier studies recorded mating success and reproductive success for all individuals in replicated small populations (e.g. Bateman 1948; Fitze & Le Galliard 2011; Collet et al. 2012; Pélissié, Jarne & David 2012; Fritzsche & Arnqvist 2013). Each ‘population’ then yields a Bateman gradient for each sex. Comparisons across replicate populations allow an assessment of the repeatability of a sex difference in selection gradients (e.g. in Collet et al. 2012), providing a particularly powerful tool to statistically evaluate slope estimates. The argument can be extended to comparisons across multiple seasons, varying environmental conditions, or different local populations of a species.

Analysing all individuals of a replicate population is often unfeasible, for example due to the lack of sufficiently diverse genetic markers to assess complete parentage. Researchers can then study multiple replicated experimental populations and monitor mating success and reproductive success for just a single focal individual per replicate (e.g. in Bjork & Pitnick 2006; Anthes et al. 2010; Marie-Orleach et al. 2016). After standardizing focal measurements of mating success and reproductive success for their respective population means, Bateman gradients (or other sexual selection proxies) are calculated across focal individuals, meaning that each replicate population yields a single independent observation.

These designs share the need to combine data from different replicate populations. Almost universally, average mating success and average reproductive success will differ among replicate populations. It is important to assess the degree to which these differences arise due to differences in genetic quality between individuals of different populations (likely if replicate populations are small, unlikely if large and well randomized) vs. differences in ambient condition or maintenance. In the latter case, appropriate correction for the population average mating success and reproductive success is needed.

Correct relative fitness for population means across study replicates

There are several ways to combine estimates from replicate populations. The simplest is to first express mating success and reproductive success relative within each replicate group (either for each individual, or for the single focal individual, assuming corresponding summary data are available for the whole group). Second, these values are relativized across all populations. It is important to relativize within replicate populations before expressing fitness relative among focal individuals from different populations. If values are exclusively relativized across populations (as done for males in Fritzsche

[Scatter plots. Panels: (a) Males, (b) Females. x-axis: Mean offspring number (log10-scale); left y-axis: absolute βss (log-scaled); right y-axis: relative βss.]

Fig. 5. Scaling effects on Bateman gradients in empirical studies. Graphs plot the relationship between Bateman gradients and mean reproductive success for males (a) and females (b). Bateman gradients established on absolute values (βss, filled dots, left axis) increase with average reproductive success (hatched linear regression lines with their 95% CI, both P < 0.001). This bias vanishes when using standardized Bateman gradients (βss’, open dots, right axis, solid regression lines, both P > 0.1) or transforming raw Bateman gradients into standardized effect sizes (not shown).
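The scaling effect summarized in Fig. 5 can be reproduced with a minimal sketch. It assumes the Bateman gradient is the ordinary least-squares slope of reproductive success on mating success, and that relative values are obtained by dividing each observation by its population mean; the data and function names are hypothetical. Two data sets that differ only in the scale of offspring numbers yield different absolute gradients but the same relative gradient.

```python
from statistics import mean


def ols_slope(x, y):
    """Ordinary least-squares slope of y on x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def relativize(values):
    """Divide each observation by the population mean."""
    m = mean(values)
    return [v / m for v in values]


def bateman_gradient(ms, rs, relative=True):
    """Slope of reproductive success (rs) on mating success (ms)."""
    if relative:
        ms, rs = relativize(ms), relativize(rs)
    return ols_slope(ms, rs)


# Hypothetical data: same mating pattern, offspring counted on two scales.
ms = [0, 1, 1, 2, 3, 3]
rs_few = [0, 2, 3, 4, 7, 8]           # e.g. a species with small broods
rs_many = [r * 100 for r in rs_few]   # same pattern, 100-fold more offspring

abs_few = bateman_gradient(ms, rs_few, relative=False)
abs_many = bateman_gradient(ms, rs_many, relative=False)
rel_few = bateman_gradient(ms, rs_few)
rel_many = bateman_gradient(ms, rs_many)

print(f"absolute: {abs_few:.3f} vs {abs_many:.3f}")
print(f"relative: {rel_few:.3f} vs {rel_many:.3f}")
```

The absolute gradient scales with mean offspring number, whereas the relative gradient is unchanged, which is why comparisons among sexes or populations are better made on relative values.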


& Arnqvist 2013), individuals from overly productive groups obtain larger relative fitness values than justified.

REPORT ALL RELEVANT ESTIMATES, STATISTICS, SAMPLE SIZES

As a final word, we wish to encourage researchers to consciously assemble the data to be reported in a study on Bateman metrics. When compiling data for our meta-analysis (Janicke et al. 2016), a large proportion of studies suffered from rather incomplete presentation. The minimal set of data required in studies of Bateman gradients and related metrics includes:
1 Clear definition of the chosen proxies for mating success and reproductive success, and judgement how this choice affects the degree to which the derived metrics capture sexual selection.
2 Sample size for each measure, split by sex.
3 Mean and variance of sex-specific mating success and reproductive success.
4 Statement whether sexual selection metrics were calculated on absolute or relative values.
5 Measurement error (variance or standard error) for all sexual selection metrics.
6 For Bateman gradients, the slope estimate and its standard error.
7 Statistical evaluation of sex differences in all measured selection metrics.
We further encourage deposition of original data underlying the calculation of sexual selection metrics in public repositories such as Dryad.

Authors' contributions

N.A. conceived the study and led manuscript preparation. N.A., T.J. and I.K.H. designed the methodology, collected and analysed the data, and with N.K.M. developed the final guidelines proposed here. All authors contributed critically to the drafts and gave final approval for publication.

Acknowledgements

We wish to thank Patrice David, Karen de Jong, Philippe Jarne, Joris Koene, Hanna Kokko, Kenyon Mobley, Steven Ramm and Lukas Schärer for judgement of some of the ideas presented in this review, and three anonymous referees for their very valuable and constructive reviews. We further thank all those researchers who provided additional information or data on their Bateman gradient studies as listed in Janicke et al. (2016). This study was funded by postdoctoral fellowships from the Swiss National Science Foundation to T.J. (SNSF grant no: PBBSP3-135985 and PA00P3-145375/1) and the Deutsche Forschungsgemeinschaft to N.A. (DFG grant no: AN549/3-1).

Data accessibility

Data underlying the analyses in this manuscript are deposited in the Dryad repository for Figures 2 and 4 (http://datadryad.org/resource/doi:10.5061/dryad.rs69k) (Anthes et al. 2016) and all other Figures (http://datadryad.org/resource/doi:10.5061/dryad.780d6) (Janicke et al. 2016).

References

Andersson, M. & Iwasa, Y. (1996) Sexual selection. Trends in Ecology & Evolution, 11, 53–58.
Andrade, M.C.B. & Kasumovic, M.M. (2005) Terminal investment strategies and male mate choice: extreme tests of Bateman. Integrative and Comparative Biology, 45, 838–847.
Anthes, N., David, P., Auld, J.R. et al. (2010) Bateman gradients in hermaphrodites: an extended approach to quantify sexual selection. The American Naturalist, 176, 249–263.
Anthes, N., Häderer, I.K., Michiels, N.K. & Janicke, T. (2016) Data from: Measuring and interpreting sexual selection metrics – evaluation and guidelines. Dryad Digital Repository, http://datadryad.org/resource/doi:10.5061/dryad.rs69k
Apakupakul, K. & Rubenstein, D.R. (2015) Bateman's principle is reversed in a cooperatively breeding […]. Biology Letters, 11, 20150034.
Arnold, S.J. (1994) Bateman's principles and the measurement of sexual selection in plants and animals. American Naturalist, 144, S126–S149.
Arnold, S.J. & Duvall, D. (1994) Animal mating systems: a synthesis based on selection theory. American Naturalist, 143, 317–348.
Arnold, S.J. & Wade, M.J. (1984a) On the measurement of natural and sexual selection: applications. Evolution, 38, 720–734.
Arnold, S.J. & Wade, M.J. (1984b) On the measurement of natural and sexual selection: theory. Evolution, 38, 709–719.
Arnqvist, G. (2013) Comment on ‘Bateman in nature: predation on offspring reduces the potential for sexual selection’. Science, 340, 549.
Aronsen, T., Berglund, A., Mobley, K.B., Ratikainen, I.I. & Rosenqvist, G. (2013) Sex ratio and density affect sexual selection in a sex-role reversed fish. Evolution, 67, 3243–3257.
Ashman, T.-L. & Morgan, M.T. (2004) Explaining phenotypic selection on plant attractive characters: male function, gender balance or ecological context? Proceedings of the Royal Society of London B, 271, 553–559.
Baena, M.L. & Macías-Ordóñez, R. (2012) Phenology of scramble polygyny in a wild population of chrysolemid […]: the opportunity for and the strength of sexual selection. PLoS ONE, 7, e38315.
Bateman, A.J. (1948) Intra-sexual selection in Drosophila. Heredity, 2, 349–368.
Bergeron, P., Montiglio, P.-O., Réale, D., Humphries, M. & Garant, D. (2012) Bateman gradients in a promiscuous mating system. […] and Sociobiology, 66, 1125–1130.
Bergeron, P., Martin, A.M., Garant, D. & Pelletier, F. (2013) Comment on ‘Bateman in nature: predation on offspring reduces the potential for sexual selection’. Science, 340, 549.
Bjork, A. & Pitnick, S. (2006) Intensity of sexual selection along the […]-isogamy continuum. Nature, 441, 742–745.
Burley, N.T. & Parker, P.G. (1998) Emerging themes and questions in the study of avian reproductive tactics. Ornithological Monographs, 49, 1–20.
Byers, J. & Dunn, S. (2012) Bateman in nature: predation on offspring reduces the potential for sexual selection. Science, 338, 802–804.
Collet, J., Richardson, D.S., Worley, K. & Pizzari, T. (2012) Sexual selection and the differential effect of […]. Proceedings of the National Academy of Sciences, 109, 8641–8645.
Collet, J.M., Dean, R.F., Worley, K., Richardson, D.S. & Pizzari, T. (2014) The measure and significance of Bateman's principles. Proceedings of the Royal Society B: Biological Sciences, 281, 20132973.
Crow, J.F. (1958) Some possibilities for measuring selection intensities in man. Human Biology, 30, 1–13.
Danielsson, I. (2001) Antagonistic pre- and post-copulatory sexual selection on male body size in a water strider (Gerris lacustris). Proceedings of the Royal Society of London. Series B: Biological Sciences, 268, 77–81.
Darwin, C. (1871) The Descent of Man and Selection in Relation to Sex. Princeton University Press, Princeton, NJ, USA.
Escobar, J.S., Jarne, P., Charmantier, A. & David, P. (2008) Outbreeding alleviates senescence in hermaphroditic snails as expected from the mutation-accumulation theory. Current Biology, 18, 906–910.
Evans, J.P. & Garcia-Gonzalez, F. (2016) The total opportunity for sexual selection and the integration of pre- and post-mating episodes of sexual selection in a complex world. Journal of Evolutionary Biology, doi:10.1111/jeb.12960.
Fitze, P.S. & Le Galliard, J.-F. (2008) Operational sex ratio, sexual conflict and the intensity of sexual selection. Ecology Letters, 11, 432–439.
Fitze, P.S. & Le Galliard, J.F. (2011) Inconsistency between different measures of sexual selection. The American Naturalist, 178, 256–268.
Fitzpatrick, C.L. (2015) Expanding sexual selection gradients: a synthetic refinement of sexual selection theory. Ethology, 121, 207–217.
Fleming, I.A. & Gross, M.R. (1994) Breeding competition in a Pacific salmon (coho: Oncorhynchus kisutch): measures of natural and sexual selection. Evolution, 48, 637–657.
Forsgren, E., Amundsen, T., Borg, A.A. & Bjelvenmark, J. (2004) Unusually dynamic sex roles in a fish. Nature, 429, 551–554.

© 2016 The Authors. Methods in Ecology and Evolution © 2016 British Ecological Society, Methods in Ecology and Evolution

Fritzsche, K. & Arnqvist, G. (2013) Homage to Bateman: sex roles predict sex differences in sexual selection. Evolution, 67, 1926–1936.
Gagnon, M.-C., Duchesne, P. & Turgeon, J. (2012) Sexual conflict in Gerris gillettei (Insecta: Hemiptera): influence of effective mating rate and morphology on reproductive success. Canadian Journal of Zoology, 90, 1297–1306.
Gerlach, N.M., McGlothlin, J.W., Parker, P.G. & Ketterson, E.D. (2012) Reinterpreting Bateman gradients: multiple mating and selection in both sexes of a songbird species. Behavioral Ecology, 23, 1078–1088.
Henshaw, J.M., Kahn, A.T. & Fritzsche, K. (2016) A rigorous comparison of sexual selection indexes via simulations of diverse mating systems. Proceedings of the National Academy of Sciences, 113, E300–E308.
Janicke, T., David, P. & Chapuis, E. (2015) Environment-dependent sexual selection: Bateman’s parameters under varying levels of food availability. The American Naturalist, 185, 756–768.
Janicke, T., Häderer, I.K., Lajeunesse, M.J. & Anthes, N. (2016) Darwinian sex roles confirmed across the animal kingdom. Science Advances, 2, e1500983.
Jennions, M.D. & Kokko, H. (2010) Sexual selection. Evolutionary Behavioral Ecology (eds D.F. Westneat & C.W. Fox), pp. 343–364. Oxford University Press, Oxford, UK.
Jennions, M.D., Kokko, H. & Klug, H. (2012) The opportunity to be misled in studies of sexual selection. Journal of Evolutionary Biology, 25, 591–598.
Jones, A.G. (2009) On the opportunity for sexual selection, the Bateman gradient and the maximum intensity of sexual selection. Evolution, 63, 1673–1684.
Jones, A.G. (2015) batemanater: a computer program to estimate and bootstrap mating system variables based on Bateman’s principles. Molecular Ecology Resources, 15, 1396–1402.
Jones, A.G., Arguello, J.R. & Arnold, S.J. (2002) Validation of Bateman’s principles: a genetic study of sexual selection and mating patterns in the rough-skinned newt. Proceedings of the Royal Society of London-B. Biological Sciences, 269, 2533–2539.
Jones, A.G., Arguello, J.R. & Arnold, S.J. (2004) Molecular parentage analysis in experimental newt populations: the response of mating system measures to variation in the operational sex ratio. American Naturalist, 164, 444–456.
Jones, A.G., Walker, D. & Avise, J.C. (2001) Genetic evidence for extreme polyandry and extraordinary sex-role reversal in a pipefish. Proceedings of the Royal Society of London B: Biological Sciences, 268, 2531–2535.
Jones, A.G., Rosenqvist, G., Berglund, A., Arnold, S.J. & Avise, J. (2000) The Bateman gradient and the cause of sexual selection in a sex-role-reversed pipefish. Proceedings of the Royal Society of London-B. Biological Sciences, 267, 677–680.
Jones, A.G., Rosenqvist, G., Berglund, A. & Avise, J.C. (2005) The measurement of sexual selection using Bateman’s principles: an experimental test in the sex-role-reversed pipefish Syngnathus typhle. Integrative and Comparative Biology, 45, 874–884.
Ketterson, E.D., Parker, P.G., Raouf, S.A., Nolan Jr, V., Ziegenfus, C. & Chandler, C.R. (1997) The relative impact of extra-pair fertilizations on variation in male and female reproductive success in dark-eyed juncos (Junco hyemalis). Ornithological Monographs, 49, 81–101.
Kingsolver, J.G., Diamond, S.E., Siepielski, A.M. & Carlson, S.M. (2012) Synthetic analyses of phenotypic selection in natural populations: lessons, limitations and future directions. Evolutionary Ecology, 26, 1101–1118.
Klug, H., Lindström, K. & Kokko, H. (2010) Who to include in measures of sexual selection is no trivial matter. Ecology Letters, 13, 1094–1102.
Klug, H., Heuschele, J., Jennions, M.D. & Kokko, H. (2010) The mismeasurement of sexual selection. Journal of Evolutionary Biology, 23, 447–462.
Kokko, H., Klug, H. & Jennions, M.D. (2012) Unifying cornerstones of sexual selection: operational sex ratio, Bateman gradient and the scope for competitive investment. Ecology Letters, 15, 1340–1351.
Kokko, H. & Wong, B.B.M. (2007) What determines sex roles in mate searching? Evolution, 61, 1162–1175.
Krakauer, A.H., Webster, M.S., Duval, E.H., Jones, A.G. & Shuster, S.M. (2011) The opportunity for sexual selection: not mismeasured, just misunderstood. Journal of Evolutionary Biology, 24, 2064–2071.
Levitan, D.R. (2008) Gamete traits influence the variance in reproductive success, the intensity of sexual selection, and the outcome of sexual conflict among congeneric sea urchins. Evolution, 62, 1305–1316.
Lorch, P.D. (2005) Using upper limits of ‘Bateman gradients’ to estimate the opportunity for sexual selection. Integrative and Comparative Biology, 45, 924–930.
Marie-Orleach, L., Janicke, T., Vizoso, D.B., David, P. & Schärer, L. (2016) Quantifying episodes of sexual selection: insights from a transparent worm with fluorescent sperm. Evolution, 70, 314–328.
Mills, S.C., Grapputo, A., Koskela, E. & Mappes, T. (2007) Quantitative measure of sexual selection with respect to the operational sex ratio: a comparison of selection indices. Proceedings of the Royal Society of London B, 274, 143–150.
Mitchell-Olds, T. & Shaw, R.G. (1987) Regression analysis of natural selection: statistical inference and biological interpretation. Evolution, 41, 1149–1161.
Mobley, K.B. (2014) Mating systems and the measurement of sexual selection. Animal Behaviour: How and Why Animals Do the Things They Do (ed K. Yasukawa), pp. 99–144. Praeger, Santa Barbara, CA, USA.
Mobley, K.B. & Jones, A.G. (2007) Geographical variation in the mating system of the dusky pipefish (Syngnathus floridae). Molecular Ecology, 16, 2596–2606.
Mobley, K.B. & Jones, A.G. (2009) Environmental, demographic, and genetic mating system variation among five geographically distinct dusky pipefish (Syngnathus floridae) populations. Molecular Ecology, 18, 1476–1490.
Mobley, K.B. & Jones, A.G. (2013) Overcoming statistical bias to estimate genetic mating systems in open populations: a comparison of Bateman’s principles between the sexes in a sex-role-reversed pipefish. Evolution, 67, 646–660.
Morimoto, J., Pizzari, T. & Wigby, S. (2016) Developmental environment effects on sexual selection in male and female Drosophila melanogaster. PLoS ONE, 11, e0154468.
Morisita, M. (1962) Ir-Index, a measure of dispersion of individuals. Researches on Population Ecology, 4, 1–7.
Munroe, K.E. & Koprowski, J.L. (2011) Sociality, Bateman’s gradients, and the polygynandrous genetic mating system of round-tailed ground squirrels (Xerospermophilus tereticaudus). Behavioral Ecology and Sociobiology, 65, 1811–1824.
Nakagawa, S., Poulin, R., Mengersen, K., Reinhold, K., Engqvist, L., Lagisz, M. & Senior, A.M. (2015) Meta-analysis of variation: ecological and evolutionary applications and beyond. Methods in Ecology and Evolution, 6, 143–152.
van Noordwijk, A.J. & de Jong, G. (1986) Acquisition and allocation of resources: their influence on variation in life history tactics. The American Naturalist, 128, 137.
Parker, P.G. & Tang-Martinez, Z. (2005) Bateman gradients in field and laboratory studies: a cautionary tale. Integrative and Comparative Biology, 45, 895–902.
Pelissie, B., Jarne, P. & David, P. (2012) Sexual selection without sexual dimorphism: Bateman gradients in a simultaneous . Evolution, 66, 66–81.
Pelissie, B., Jarne, P., Sarda, V. & David, P. (2014) Disentangling precopulatory and postcopulatory sexual selection in polyandrous species. Evolution, 68, 1320–1331.
Pischedda, A. & Rice, W.R. (2012) Partitioning sexual selection into its mating success and fertilization success components. Proceedings of the National Academy of Sciences, 109, 2049–2053.
Poesel, A., Gibbs, H.L. & Nelson, D.A. (2011) Extrapair fertilizations and the potential for sexual selection in a socially monogamous songbird. The Auk, 128, 770–776.
Prosser, M.R., Weatherhead, P.J., Gibbs, H.L. & Brown, G.P. (2002) Genetic analysis of the mating system and opportunity for sexual selection in northern water snakes (Nerodia sipedon). Behavioral Ecology, 13, 800–807.
Rodriguez-Munoz, R., Bretman, A., Slate, J., Walling, C.A. & Tregenza, T. (2010) Natural and sexual selection in a wild population. Science, 328, 1269–1272.
Rose, E., Paczolt, K.A. & Jones, A.G. (2013) The contributions of premating and postmating selection episodes to total selection in sex-role-reversed gulf pipefish. The American Naturalist, 182, 410–420.
Ruzzante, D.E., Hamilton, D.C., Kramer, D.L. & Grant, J.W.A. (1996) Scaling of the variance and the quantification of resource monopolization. Behavioral Ecology, 7, 199–207.
Schlicht, E. & Kempenaers, B. (2013) Effects of social and extra-pair mating on sexual selection in blue tits (Cyanistes caeruleus). Evolution, 67, 1420–1434.
Siepielski, A.M., DiBattista, J.D. & Carlson, S.M. (2009) It’s about time: the temporal dynamics of phenotypic selection in the wild. Ecology Letters, 12, 1261–1276.
Singer, A., Kvarnemo, C., Lindström, K. & Svensson, O. (2006) Genetic mating patterns studied in pools with manipulated nest site availability in two populations of Pomatoschistus minutus. Journal of Evolutionary Biology, 19, 1641–1650.
Snyder, B.F. & Gowaty, P.A. (2007) A reappraisal of Bateman’s classic study of intrasexual selection. Evolution, 61, 2457–2468.
Tatarenkov, A., Healey, C.I.M., Grether, G.F. & Avise, J.C. (2008) Pronounced reproductive skew in a natural population of green swordtails, Xiphophorus helleri. Molecular Ecology, 17, 4522–4534.


Taylor, M.L., Price, T.A. & Wedell, N. (2014) Polyandry in nature: a global analysis. Trends in Ecology & Evolution, 29, 376–383.
Travis, J. & Reznick, D. (1998) Experimental approaches to the study of evolution. Experimental Ecology: Issues and Perspectives (eds W.J. Resetarits & J. Bernardo), pp. 437–459. Oxford University Press, New York, NY, USA.
Turnell, B.R. & Shaw, K.L. (2015) High opportunity for postcopulatory sexual selection under field conditions. Evolution, 69, 2094–2104.
Wacker, S., Mobley, K., Forsgren, E., Myhre, L.C., de Jong, K. & Amundsen, T. (2013) Operational sex ratio but not density affects sexual selection in a fish. Evolution, 67, 1937–1949.
Wacker, S., Amundsen, T., Forsgren, E. & Mobley, K.B. (2014) Within-season variation in sexual selection in a fish with dynamic sex roles. Molecular Ecology, 23, 3587–3599.
Wade, M.J. (1979) Sexual selection and variance in reproductive success. American Naturalist, 114, 742–747.
Wade, M.J. & Arnold, S.J. (1980) The intensity of sexual selection in relation to male sexual behaviour, female choice, and sperm precedence. Animal Behaviour, 28, 446–461.
Walker, J.A. (2014) The effect of unmeasured confounders on the ability to estimate a true performance or selection gradient (and other partial regression coefficients). Evolution, 68, 2128–2136.
Walker, L., Ewen, J., Brekke, P. & Kilner, R. (2014) Sexually selected dichromatism in the hihi Notiomystis cincta: multiple colours for multiple receivers. Journal of Evolutionary Biology, 27, 1522–1535.
Williams, R.N. & DeWoody, J.A. (2009) Reproductive success and sexual selection in wild eastern (Ambystoma t. tigrinum). Evolutionary Biology, 36, 201–213.
Wilson, D.S. (2004) What is wrong with absolute individual fitness? Trends in Ecology & Evolution, 19, 245–248.
Wolf, J.B. & Wade, M.J. (2001) On the assignment of fitness to parents and offspring: whose fitness is it and when does it matter? Journal of Evolutionary Biology, 14, 347–356.

Received 18 August 2016; accepted 9 November 2016
Handling Editor: Holger Schielzeth
