Genet. Res., Camb. (1998), 71, pp. 193–212. With 1 figure. Printed in the United Kingdom © 1998 Cambridge University Press

Evolution of animal genitalia: patterns of phenotypic and genotypic variation and condition dependence of genital and non-genital morphology in water strider (Heteroptera: Gerridae: Insecta)

" # GO= RAN ARNQVIST *  RANDY THORNHILL " Department of Animal Ecology, UniŠersity of Umeab , S-901 87 Umeab , Sweden # Department of Biology, UniŠersity of New Mexico, Albuquerque, NM 87131-1091, USA (ReceiŠed 29 September 1997 and in reŠised form 27 January 1998)

Summary

Rapid and divergent evolution of male genitalia represents one of the most general evolutionary patterns in animals with internal fertilization, but the causes of this evolutionary trend are poorly understood. Several hypotheses have been proposed to account for genitalic evolution, most prominent of which are the lock-and-key, sexual selection and pleiotropy hypotheses. However, insights into the evolutionary mechanisms of genitalic evolution are hindered by a lack of relevant in-depth studies of genital morphology. We used a biparental progenies breeding design to study the effects of food stress during ontogeny on phenotypic expression of a suite of genital and non-genital morphological traits, both linear traits and multivariate shape indices, in a natural population of the water strider Gerris incognitus. In general, genitalic traits were as variable as non-genital traits, both phenotypically and genotypically. Average narrow-sense heritability of genital traits was 0.47 (SE = 0.05). Further, while food stress during development had a large impact on adult morphology, and expression of genitalic traits exhibited significant levels of condition dependence, different genotypes did not significantly differ in their ability to cope with food stress. Genitalic conformation was also both phenotypically and genetically correlated with general morphological traits. These patterns are in disagreement with certain predictions generated by the long-standing lock-and-key hypothesis, but are in general agreement with several other hypotheses of genital evolution. We failed to find any additive genetic components in fluctuating asymmetry of any bilaterally symmetrical traits and the effects on fluctuating asymmetry of food stress during development were very low and insignificant. Some methodological implications of our study are discussed, such as the bias introduced by the non-negativity constraint in restricted maximum likelihood estimation of variance components.

1. Introduction

In animals with internal fertilization, diversification of genitalia represents one of the most general and striking forms of evolutionary trends (Eberhard, 1985). Animal genitalia evolve rapidly and divergently, and genitalic morphology typically differs even between closely related species. Though one of the most general evolutionary trends, it remains one of the most poorly understood (Scudder, 1971; Eberhard, 1985, 1990, 1993, 1996; Shapiro & Porter, 1989; Andersson, 1994; Alexander et al., 1997; Arnqvist, 1997b). A number of different hypotheses have been proposed to account for the evolution of animal genitalia, but empirical data are very scarce. In particular, in-depth studies of quantitative genetics and patterns of selection on genitalic traits are largely lacking, rendering discussions of genitalic evolution speculative (Eberhard, 1993, 1996; Andersson, 1994; Arnqvist, 1997b).

Arnqvist (1997b) suggested that relevant data from single species may be used to illuminate the patterns and processes of genitalic evolution, in much the same way that data from single species studies are key in other adaptational domains. Studies of genitalia should address patterns of morphological variation and inheritance of genitalic traits on the one hand, and patterns of phenotypic selection on, and fitness consequences of, genitalic variation on the other. The current study represents the former line of investigation.

The three main contending hypotheses for genitalic evolution – the lock-and-key, the sexual selection and the pleiotropy hypotheses – all make a number of predictions of the pattern of genitalic variation expected within species (Eberhard, 1985, 1990, 1993; Shapiro & Porter, 1989; Alexander et al., 1997), many of which are relative and non-rigorous (Arnqvist, 1997b). The lock-and-key hypothesis states that genitalia evolve via selection for pre-insemination mechanical reproductive isolation, so that male genitalia evolve to be specific and unique (the key) in order to fit appropriately in female genitalia (the lock). Under this hypothesis, development of male genitalia is predicted to be canalized and under stabilizing selection, leading to relatively low degrees of phenotypic and genotypic variation in genital morphology (cf. Pomiankowski & Møller, 1995) and relatively little condition dependence in expression of genitalic traits (Arnqvist, 1997b).

Intromittent male genitalia may also evolve via post-mating sexual selection (Eberhard, 1985, 1993, 1996; Arnqvist, 1997b). Non-random fertilization success among males, based on genital morphology, may be brought about by differences in stimulatory/titillating ability (Thornhill, 1983; Eberhard, 1985, 1990, 1993, 1996), by differences in ability to coerce/control fertilization events (Lloyd, 1979; Arnqvist & Rowe, 1995; Rice, 1996; Alexander et al., 1997) or by differences in ability to displace/dislocate sperm of competing males (Smith, 1984; Waage, 1984; Birkhead & Hunter, 1990). Traits under sexual selection tend to exhibit relatively high levels of phenotypic and genotypic variation (Pomiankowski & Møller, 1995), and genitalia should be no exception (Arnqvist, 1997b). Phenotypic expression of genitalia would also be expected to evolve to be condition dependent (Andersson, 1994; Johnstone, 1995; Rowe & Houle, 1996), typically leading to significant phenotypic correlations between genital and general morphological traits.

The pleiotropy hypothesis holds that genitalic evolution is an indirect result of evolution of genetically correlated characters, via pleiotropic effects of genes that code for both genitalic and general morphology (Mayr, 1963; Eberhard, 1985, 1990). Genitalic variation per se is essentially neutral, and pleiotropic effects on genitalia are not directly selected against and can thus accumulate. A key assumption of this hypothesis, which is very difficult to refute with empirical data, is that genital and non-genital morphological traits are genetically correlated (Arnqvist, 1997b).

The current contribution represents the first extensive study to explicitly address questions about the degree and nature of genetic and environmental control of intraspecific variability in genital morphology, and to relate this to other components of morphology. As a model system, we use a natural population of water striders (see below). First, we quantify the extent of phenotypic variation in genitalic traits relative to other traits. Secondly, we characterize the patterns of inheritance of genitalic versus general morphological traits. Thirdly, we experimentally assess the degree of condition dependence in phenotypic expression of genital and general morphological traits, by inducing varying levels of food stress during ontogeny.

Recent research has indicated that differences in morphology between the left and right sides of bilaterally symmetrical traits may be revealing of the phenotypic and genotypic condition of individuals (Møller & Pomiankowski, 1993; Swaddle et al., 1994; Watson & Thornhill, 1994). In particular, fluctuating asymmetry (FA) results from the inability of individuals to undergo identical development on both sides of the body, and is thus believed to reflect the ability of individuals to cope with environmental stress (Mather, 1953; Van Valen, 1962; Palmer & Strobeck, 1986). The degree of FA of traits has thus been put forth as a potentially integrative measure of individual phenotypic and genotypic quality (Møller & Pomiankowski, 1993; Watson & Thornhill, 1994). In the current study, we use measures of FA as an additional tool for studying patterns of condition dependence and phenotypic quality across traits, by experimentally assessing how food stress affects the degree of FA in a suite of genital and general morphological traits.

The study of small-scale complex morphological variation, such as that of genitalic variation in arthropods, has long been hindered by methodological problems. Recent developments in methods of capturing data (digital techniques) have reduced the problem of measuring small-scale traits accurately. More importantly, new statistical methods designed to describe two- or three-dimensional variation in shape among specimens have opened up novel possibilities to work with complex morphologies (Rohlf & Marcus, 1993; Marcus et al., 1996). These multivariate methods potentially provide much more synthetic and integrative measures of morphological variation than traditional linear measures (Liu et al., 1996). We use both standard linear traits and multivariate measures of shape variation, and are thus able to compare directly the relative merits and problems of the two approaches in this context.

* Corresponding author. Fax: +46-90-167665. e-mail: goran.arnqvist@animecol.umu.se.

2. Materials and methods

(i) The organisms

Water striders (Insecta; Heteroptera: Gerridae) form an ecologically rather homogeneous family of true bugs. They inhabit water surfaces of various aquatic habitats both as juveniles and as adults, and are predators/scavengers feeding mainly on arthropods trapped at the water surface (Andersen, 1982; Spence & Andersen, 1994; Rowe et al., 1994; Arnqvist, 1997a). In terms of morphological divergence within the group, water striders are typical insects in the sense that relatively rapid evolutionary radiation of male genitalia is a particularly striking and consistent pattern. Hence, characteristics of male genitalia are very important both in grouping higher-order taxa and in distinguishing among closely related congeneric species (Andersen, 1982, 1993; Arnqvist, 1992, 1997a).

The genitalia of water striders consist of a proximal cylindrical segment (i.e. 1st genital segment) containing a boat-shaped structure representing the pygophore and the proctiger. These structures hold the intromittent phallic organ, which is inflated/extended and inserted into the female genital tract during copulation. Apart from membranous tissue, the phallus consists of a proximal sclerotized phallotheca, and an apical capsule, the vesica, which carries an armature of sclerites (for illustrations see Andersen, 1982, 1993, Fig. 1).

This study was performed on Gerris incognitus (Drake and Hottes), a member of the primarily Holarctic and relatively species-rich genus Gerris (42 species). In this genus, characters of male genitalia are of great taxonomic and phylogenetic importance. In particular, the set of sclerites in the apical part of the intromittent phallus (the vesical armature) has evolved rapidly and divergently, and their morphology is a critical species character within the genus (Andersen, 1993). Though little is known of the functional morphology of these structures in Gerris, the vesical sclerites could play an important part during insertion and/or positioning of the male phallus in the female genital tract during copulation (Andersen, 1982; Heming-van Battum & Heming, 1989).

(ii) Experimental design

Parental adult water striders (G. incognitus) were captured from a monomorphic apterous natural population at the Bosque del Apache Wildlife Refuge, New Mexico, USA, 6 March 1994. Males and females were isolated in unisexual tanks in the laboratory, and fed frozen crickets ad libitum. Since some females laid fertile eggs at the time of capture, the following procedure was applied to enable determination of paternity of offspring. Females were kept separated from males and the fertilization rate of their eggs was monitored continuously. Average fertilization rates among females were 5% after 12 d, 2% after 16 d and 0 after 19 d of isolation. Sperm longevity in congeneric species is known to range between 12 and 20 d (Kaitala, 1987; Arnqvist, 1988, 1989). The laboratory rearing experiment described below was initiated after 21 d of isolation of the sexes in the laboratory, when it can safely be assumed that females did not carry viable sperm. Additional confidence in ascribing paternity to assigned sires is provided by the fact that last male sperm precedence has been established in this group of insects (Arnqvist, 1988; Rubenstein, 1989).

For the breeding programme, we used a biparental progenies design (BIPs), which has the advantage of using a relatively large sample of parental individuals, and thus is powerful when one wishes to characterize the genetic structure of a natural population rather than a set of inbred lines (Kearsey, 1965). Males and females were introduced in pairs (n = 61) into containers provided with a Styrox floater to serve as resting and oviposition substrate. The parental individuals were fed field crickets ad libitum for 10 days, after which they were frozen for subsequent morphometric analysis.

Twelve randomly selected (upon hatching) full-sibling offspring from each family were reared individually in circular 9 cm diameter containers, each provided with a Styrox floater. All offspring were fed frozen Daphnia zooplankton ad libitum during the first 2 d. To assess the effects of food stress during ontogeny on morphology (the degree of condition dependence of phenotypic traits), offspring were thereafter randomly assigned to each of two food treatments within families (n = 6 offspring per family and food treatment). Offspring were fed frozen Drosophila fruitflies, larval Gryllus field crickets and Tenebrio beetle larvae, according to the scheme in Table 1. Offspring in food treatment I (aimed at providing a sufficient food supply) were never completely starved (deprived of all food), and received a larger and more varied food supply compared with offspring in food treatment II (aimed at creating food stress). Water was changed and the rearing containers cleaned once every 4 d. Offspring were individually preserved by freezing when sclerotized, 24–48 h after moulting into the adult stage, for subsequent morphometric analysis.

(iii) Morphometric analysis

A landmark-based approach was used to acquire morphometric measurements. Two-dimensional digital morphometric landmark maps were attained by placing a digitizing tablet (Summasketch III) under a

Table 1. Summary of the feeding regimes for the two offspring food treatments

Offspring      Treatment I                            Treatment II
larval stage   (food per individual)                  (food per individual)

I–II           1 Drosophila(a) every 1 d              1 Drosophila every 2 d

III            1 Drosophila every 1 d                 1 Drosophila every 2 d
               1 Gryllus (1 week old)(b) every 2 d    1 Gryllus (1 week old) every 3 d
               No starvation                          Starved for 24 h in late III instar

IV–V           1 Drosophila every 1 d                 1 Drosophila every 2 d
               1 Gryllus (2 weeks old)(c) every 2 d   1 Gryllus (1 week old) every 3 d
               1 Tenebrio(d) every 3 d                (none)
               No starvation                          Starved for 24 h in IV instar

Mean weight per food item (n = 12): (a) 0.86 mg (SD = 0.42), (b) 3.90 mg (SD = 0.34), (c) 18.38 mg (SD = 6.48), (d) 15.36 mg (SD = 4.03).
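As a rough illustration of how different the two regimes in Table 1 are, the schedule can be turned into an approximate mean food mass offered per day. This is a sketch only and not a calculation reported in the paper: it assumes that every item is fully available to the animal, uses the mean item weights from the table footnote, and ignores the 24 h starvation periods of treatment II. The dictionary keys and the function name are hypothetical labels chosen for this example.

```python
# Mean weight per food item in mg, taken from the footnote of Table 1.
ITEM_WEIGHT_MG = {
    "drosophila": 0.86,
    "gryllus_1wk": 3.90,
    "gryllus_2wk": 18.38,
    "tenebrio": 15.36,
}

# Days between single food items, per (treatment, larval stage), from Table 1.
# The 24 h starvation periods in treatment II are NOT modelled (assumption).
FEEDING_INTERVAL_DAYS = {
    ("I", "I-II"): {"drosophila": 1},
    ("II", "I-II"): {"drosophila": 2},
    ("I", "III"): {"drosophila": 1, "gryllus_1wk": 2},
    ("II", "III"): {"drosophila": 2, "gryllus_1wk": 3},
    ("I", "IV-V"): {"drosophila": 1, "gryllus_2wk": 2, "tenebrio": 3},
    ("II", "IV-V"): {"drosophila": 2, "gryllus_1wk": 3},
}


def mean_daily_food_mg(treatment, stage):
    """Approximate mean food mass offered per day (mg) for one offspring."""
    schedule = FEEDING_INTERVAL_DAYS[(treatment, stage)]
    # One item every `days` days contributes weight / days per day.
    return sum(ITEM_WEIGHT_MG[item] / days for item, days in schedule.items())


if __name__ == "__main__":
    for stage in ("I-II", "III", "IV-V"):
        well_fed = mean_daily_food_mg("I", stage)
        stressed = mean_daily_food_mg("II", stage)
        print(f"instars {stage}: I = {well_fed:.2f} mg/d, II = {stressed:.2f} mg/d")
```

Under these assumptions the gap between treatments is smallest in the early instars and widens considerably in instars IV–V, where treatment I adds older Gryllus and Tenebrio larvae.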

Fig. 1. Locations of landmarks on the body (dorsal view) and the intromittent genitalic capsule (the vesica) with its armature of internal sclerites (dorsal view). Landmarks used are indicated with circles.

side-mounted camera lucida attached to a dissecting microscope (Wild M5). For each individual water strider, several separate landmark maps were digitized (two for females and four for males): one of the body in dorsal view (12 landmarks, Fig. 1), one of the appendages (28 landmarks), one of the proximal parts of the genitalia (8 landmarks) and one of the armature of sclerites in the intromittent genitalic capsule (the vesica) in dorsal view (10 landmarks, Fig. 1). From each of these landmark maps, we used a number of distances between pairs of landmarks, selected a priori because of the very large number of potential distances that could be extracted from such maps, to acquire linear measures of size of a number of traits. We also used multivariate geometric shape analysis to acquire a more synthetic quantification of morphological variation in shape of genitalia and body among individuals (Rohlf & Marcus, 1993). Landmark maps were subjected to thin-plate spline relative warp analysis, a flexible, powerful and interpretable multivariate technique for analysis of morphometric shape variation (Bookstein, 1991; Rohlf, 1993; Marcus et al., 1996). In short, the method involves fitting a function to the landmark coordinates of each specimen, where shape variation within the population is manifested as variance in the parameters of the fitted function. Relative warps represent principal component vectors in the multivariate shape space, and each relative warp can be thought of as representing a unique multivariate shape dimension, orthogonal to all other such dimensions. The dimensionality of each relative warp can be visualized as a displacement of landmarks relative to the average configuration of landmarks. For a more detailed description and discussion of thin-plate spline relative warp analysis see Bookstein (1991), Rohlf (1993) and Marcus et al. (1996).

Landmark maps of both the dorsal view of the body and the sclerites of the genital capsule (Fig. 1) were analysed with thin-plate spline relative warp analysis. We used the following procedure. Each landmark map from each specimen was divided into two separate images prior to analysis by duplicating all landmarks along the mid-line, one image representing the left and one the right side of the specimen. The left and right sides of all specimens were treated as separate landmark maps, and both sides of all individuals (parental as well as offspring) were pooled into one single data set (i.e. the total number of images included in the analysis equalled the total number of individuals times 2). This analytic strategy has the benefit of producing a number of unique orthogonal shape dimensions, the scores of which are directly comparable not only between the left and right sides of a given individual, but also across all specimens. First, all images were optimally translated, rotated and scaled to the same position and size, using the generalized least-squares superimposition in the morphometric computer program GRF-ND (Slice, 1994). This procedure removes isometric size variation, and generates an average consensus configuration of landmarks as well as an integrative measure of the relative size of each landmark map (the centroid size). Secondly, pure shape variation among the scaled and aligned images was analysed with thin-plate spline relative warp analysis, using the consensus configuration as reference in the analysis. The computer software TPSRW (Rohlf, 1993) was used for estimating relative warps, as proposed by Bookstein (1991) (i.e. using a value of α = 1). The first six relative warps were retained and used in subsequent analysis. These were concluded to collectively account for virtually all appreciable shape variation among specimens. However, the relative warps summarize only non-uniform (localizable) shape variation. To quantify uniform shape variation (non-localizable), the two uniform shape components of Bookstein (1996) were estimated separately. Using the analogy of a grid, the relative warps measure shape change which causes certain lines to be displaced or non-linearized relative to other lines, whereas the pair of uniform shape components parameterizes shape variation that leaves parallel lines parallel throughout the grid after the shape change. Thus, three groups of variables were extracted from the shape analysis to describe multivariate variation in shape among specimens: (a) centroid size, representing an integrative measure of overall size, (b) relative warp scores and (c) uniform shape component scores.

(iv) Fluctuating asymmetry

We estimated FA for both linear distances and multivariate shape scores (cf. Auffray et al., 1996; Smith et al., 1997). The average size/shape of the right and left side ($(R + L)/2$) was used to characterize a particular individual for all morphometric variables that were potentially bilaterally symmetrical (e.g. length of appendages, shape scores of left and right sides of the body or genitalia). For these variables, we also calculated the signed degree of asymmetry, by subtracting the size/shape of the left side from that of the right side ($R - L$). FA, as opposed to directional asymmetry or antisymmetry, is characterized by a normal distribution centred around zero within populations (Palmer & Strobeck, 1986; Swaddle et al., 1994). All our measures of morphological asymmetry were thus tested for normality and a mean of zero (data on parental individuals; see Appendix A), and variables failing to comply with these basic assumptions of FA were excluded from further analysis.

For traits exhibiting FA, the absolute value of FA scores ($|R - L|$) was used to characterize the degree of asymmetry of individuals (i.e. the directionality of the bilateral asymmetry was ignored). These variables will in theory be approximately half-normally distributed. To stabilize the variances of the models used for subsequent statistical analysis, all measures of FA were transformed prior to analysis as (cf. Sokal & Rohlf, 1981; Swaddle et al., 1994):

$$x' = \log(1 + 10^{c} x),$$

where $c$ is a factor (> 1) specific for each variable, chosen to minimize the sum of the absolute values of skewness and kurtosis of the distribution of the variable (Berry, 1987). These transformations also tended to optimize linearity in normal probability plots of the variables describing FA (Swaddle et al., 1994).

(v) Repeatability

Taking repeated measures and estimating the repeatability of morphological variation is key in any thorough quantitative genetics morphometric study, for two related reasons (Arnold, 1994). First, the repeatability provides an approximate upper limit on the heritability of any given character (Falconer & Mackay, 1996). Secondly, the repeatability provides a reliability assessment of the quantification of morphological variation, since it represents a measure of the relative magnitudes of measurement error and true between-individual phenotypic variance (Lessels & Boag, 1987; Arnqvist & Mårtensson, 1998). This is particularly important for measures of FA, since measurement error alone can cause apparent FA (Palmer & Strobeck, 1986; Swaddle et al., 1994; Merilä & Björklund, 1995). To enable estimation of the repeatability of our morphological variables, three repeated measures of all measurements were taken from each of 40 separate individuals (20 individuals of each sex), collected simultaneously with the parental individuals from the field (see above). We followed the procedure of Lessels & Boag (1987), by which data are analysed in one-way analyses of variance. The repeatability, or the intraclass correlation coefficient ($r_I$), of a given variable equals the ratio between the between-individuals variance component and the sum of between- and within-individuals components of variance. For bilaterally symmetrical traits, we based our repeatability estimates of size/shape on the average between sides ($(R + L)/2$), and of asymmetry in size/shape on the signed difference between sides ($R - L$), for each repeated measure respectively, since these were the metrics used in other parts of the study (cf. Palmer & Strobeck, 1986; Swaddle et al., 1994; Merilä & Björklund, 1995).

(vi) Full-sib analyses

Due to differential mortality and uneven sex ratios within families, the full-sibling data for estimation of quantitative genetic parameters were unbalanced. Thus, variance component estimations were made using restricted maximum likelihood (REML) fitting. This iterative method is already widely used for estimating the parameters of the genetic variance–covariance matrix (G) in the domain of animal breeding (Henderson, 1985; Searle et al., 1992), and its merits have more recently been appreciated in evolutionary genetics (Shaw, 1987; Shaw et al., 1995; Falconer & Mackay, 1996). In contrast to other methods, REML offers methods for estimating relatively unbiased variance components under a range of circumstances even when the data are unbalanced (which is typically the case with biological data on sets of relatives), as well as appropriate estimates of fixed effects. Variance components were estimated by fitting the mixed model, $y = X\beta + Zv + \varepsilon$, where $X$ represents the fixed effects model matrix (sex, food treatment and their interaction), $Z$ represents the random effects model matrix (family and family × food treatment interaction) and $\varepsilon$ represents a random error vector. In our case, the variance component associated with $\varepsilon$ will include all components of unattributable environmental variation, but also higher-order interactions. REML estimation uses a regular least-squares based approach to fit fixed effects only, and then estimates relevant variance components of random factors by maximizing the likelihood for the remaining residuals (for discussions of the merits and problems of REML estimation in quantitative genetics see Shaw, 1987; Searle et al., 1992). We restricted the above analyses to include only families with four or more surviving offspring, limiting the number of families to n = 47.

The degree of condition dependence in expression of phenotypic traits was assessed directly from the F-value of the fixed effect of the food treatment. Since the degrees of freedom are identical across traits, this F-statistic provides an unbiased measure of the magnitude (effect size) of difference in morphology under food-stressed and non-food-stressed conditions that is directly comparable across traits.

Estimates of broad-sense heritabilities were derived from the analysis of variance components, as

$$V_G / V_P = 2\sigma_g^2 / (\sigma_g^2 + \sigma_{ge}^2 + \sigma_e^2),$$

where $\sigma^2$ represent estimated variance components due to family ($\sigma_g^2$), family × food treatment interaction ($\sigma_{ge}^2$) and residual error ($\sigma_e^2$), respectively (Falconer & Mackay, 1996). The family × food treatment interaction term provides a direct estimate of the genotype × environment interaction ($V_{GE}$). In our case, this component ($\sigma_{ge}^2$) can be said to test the heritability of condition dependence, i.e. the extent to which different genotypes respond differently to environmental stress during ontogeny (Falconer & Mackay, 1996).

Estimations and significance tests of variance components assume constant variances and normality, irrespective of which method of fitting is used. While it has been suggested that parameter estimates produced by REML are relatively robust to deviations from the normality assumption (Harville, 1977; Smith, 1980; Searle et al., 1992), this is not necessarily the case in tests of significance (Shaw, 1987). We examined all fitted mixed models for compliance with the assumptions of statistical inference by visual inspection of plots of residuals versus fitted values (McCullagh & Nelder, 1983; Searle et al., 1992). Unsatisfactory residual variance distribution was indicated in one case only, for the distance between tips of abdominal spines, in which we were able to stabilize the variance by logarithmic transformations of the original data. Any cells identified as having extreme residuals were removed (three cases in total), and the parameters re-estimated with reduced data.

(vii) Parent–offspring analyses

In general, estimates of genetic parameters based on parent–offspring resemblance are more accurate and reliable than those derived from full-sib analyses (Falconer, 1973; Falconer & Mackay, 1996). Further, while our full-sib analyses allow estimation of fixed effects and genotype–environment interactions, estimates of heritability may be biased (Shaw, 1987) and the analysis does not allow separation of additive genetic variance and other components of $V_G$ (e.g. dominance effects). Thus, we used conventional midparent–midoffspring regression to estimate narrow-sense heritabilities for the traits included in the analysis.

Since offspring were subjected to different food regimes, and since the sexes typically differ in size, we fitted a standard fixed effects linear model (including sex, food treatment and their interaction) to all offspring values. By analogy with the REML method of estimating variance components, these models were used to generate residual offspring values to be related to parental values. The logic behind this procedure is to examine the influence of parental phenotypic values on deviations in offspring values from their expectation based on the food regime and sex of each offspring, assuming that the heritability is equal in both laboratory environments. This assumption (that the response to food stress is equal across genotypes) is directly tested by the genotype × environment interaction in the full-sib analysis (see above).

For each family and morphological trait, we calculated average offspring residual value and midparental value. Prior to the regression analyses, all variables were standardized to a mean of zero and unit variance. Standardizing the variables is necessary in order to attain offspring and parental distributions that are comparable, i.e. are of equal scales. While this exercise does not affect the precision and significance testing of narrow-sense heritability estimates, it may introduce bias in their magnitude. This will, however, only be the case if the phenotypic and additive genotypic variances are not equal in the parental and offspring generations (Coyne & Beecham, 1987; Lande, 1987; Falconer & Mackay, 1996). Phenotypic variances may be higher in natural populations compared with those reared in the laboratory (Coyne & Beecham, 1987). Giving both generations equal phenotypic variance (standardizing the phenotypic distributions) may thus increase offspring variance relative to parental variance, which would cause a general inflation of the magnitude of the estimates of narrow-sense heritabilities. Thus, we directly compared the phenotypic variances in both generations across traits. Since the repeatabilities offer upper limits on narrow-sense heritabilities, we were also able to assess potential systematic biases by comparing these parameters. In addition, the genotype × environment interactions in the full-sib analysis to a certain extent reflect how robust genetic variances are to environmental variations, which could potentially cause differences between parental (field) and offspring (laboratory) generations in additive genetic variances (Lande, 1987).

Narrow-sense heritabilities were estimated in regular linear regressions of mean offspring on midparental (sire for genitalic traits) values, where narrow-sense heritability was derived directly from the regression coefficient ($\beta$), as $h^2 = \beta$ ($h^2 = 2\beta$ for genitalic traits) (Falconer & Mackay, 1996). Again, these analyses were restricted to include only families with four or more surviving offspring. Homogeneity of variance was assessed by visual inspection of residual plots.

We also estimated phenotypic and genetic correlations between general and genital morphology, based on sire–offspring data. Due to the very large number of potential correlations between traits, we restricted our analyses by choosing a subset of traits a priori. General traits were chosen on the basis of their functional significance: five size-related traits (body and legs) known to be under selection in congeneric species (Spence & Andersen, 1994; Arnqvist, 1997a) were included. Genital traits were chosen on the basis of their repeatabilities; only the seven traits exhibiting a repeatability > 0.8 were included. Genetic correlations were estimated as:

$$r_a = \frac{\mathrm{COV}_{x_1 z_2} + \mathrm{COV}_{x_2 z_1}}{2\sqrt{\mathrm{COV}_{x_1 z_1}\,\mathrm{COV}_{x_2 z_2}}},$$

where $\mathrm{COV}_{x_1 z_2}$ denotes the covariance of trait 1 in sires and trait 2 in offspring, $\mathrm{COV}_{x_2 z_1}$ the covariance of trait 2 in sires and trait 1 in offspring, $\mathrm{COV}_{x_1 z_1}$ the covariance of trait 1 in sires and trait 1 in offspring and $\mathrm{COV}_{x_2 z_2}$ the covariance of trait 2 in sires and trait 2 in offspring (Pirchner, 1983; Becker, 1984; Falconer & Mackay, 1996). Standard errors of the genetic correlations were approximated as:

$$\mathrm{SE}(r_a) = \frac{1 - r_a^2}{\sqrt{2}} \sqrt{\frac{\mathrm{SE}(h_1^2)\,\mathrm{SE}(h_2^2)}{h_1^2 h_2^2}},$$

where $h_1^2$ denotes the heritability of trait 1 and $h_2^2$ the heritability of trait 2 (Falconer & Mackay, 1996). Genetic correlations were tested individually with t-tests, and a conservative overall assessment of $H_0$ (all genetic correlations are equal to zero) was made with a t-test of the mean of the genetic correlations.

(viii) Statistical error rates

Since this study is concerned with multiple and complex aspects of morphology, we report a large number of statistical estimates, primarily for quantitative genetic parameters. Each of these estimates is accompanied by a measure of dispersion and a test of significance. When multiple statistical significance tests are reported, a monotonic increase in group-wide statistical type I error rate will automatically occur (Rice, 1989). Compensation for such an increase in type I error rate is often made in articles focusing on single estimates, e.g. by the sequential Bonferroni method (Holm, 1979). However, as such compensation is made, a loss of statistical power and an increased type II error rate are inevitable. This is one of the most serious general problems of conventional inferential statistics: the inability simultaneously to control type I and type II statistical errors (Cohen, 1988; Arnqvist & Wooster, 1995). In the current paper, we face this dilemma by group-wide comparisons. Our main goal was to compare patterns of inheritance and condition dependence of genitalic versus general morphological traits. We base the interpretation of our results, and found our conclusions, on group-wide comparisons of overall patterns of the magnitudes of parameter estimates. Thus, we report non-adjusted significance levels in our tables, but in no case are our conclusions based on significances of these single estimates.

The statistical analyses reported in this paper were performed with SAS (1992) (REML estimation of variance components in mixed model analyses of variance) and SYSTAT (1992) (other procedures).

3. Results

Differences in the degree of phenotypic variation exhibited by general and genital morphological traits were assessed by comparing coefficients of variation for all linear scalar traits in parental individuals (Lande, 1977). Average coefficients of variation in males were 3.94% (range 2.7–7.3) for general morphological traits and 5.91% for genital traits (range 3.2–16.5) (Appendix C).

[...] (mean = 42.9%, SD = 22.7) (paired t-test on arcsine transformed survival rates; t = 5.48, d.f. = 57, P < 0.001). Offspring experiencing food stress during ontogeny also had a longer development time from hatching to adulthood (mean development time: 36.05 days, SD = 2.90, versus 32.37 days, SD = 4.12) (only offspring where exact times of both hatching and moulting into the adult stage were recorded were included in analysis; t = 7.34, d.f. = 234, P < 0.001).

Food stress during development had a cascade of effects on adult morphology (Tables 2, 3). The largest effect occurred on measures of adult body size (range
Thus, the degree of pheno- of F-values: 206n3–270n3) and on length of appendages typic variation was of similar magnitude for both (range of F-values: 45n4–62n8), whereas more variable types of traits (t l 1n10, d.f. l 13, P " 0n25). effects were recorded for measures of body shape Data on offspring growth and survival confirmed (range of F-values: 1n0–85n6). Measures of genitalic that the experimental food treatment had the desired size showed somewhat milder effect sizes (range of F- effect. Food-stressed offspring had a lower survivor- values: 0n7–33n0). Shape of genitalia did not appear to ship (mean l 25n9%, SDl20n3) than offspring pro- be affected by food stress (range of F-values: 0n0–5n2). vided with a more adequate food supply Contrary to our expectations based on theory of

Table 2. The degree of condition dependence, measured as the magnitude of effect of food stress, of a number of general (non-genitalic) morphological traits

Trait/trait group                               F-value (d.f. = 1,46)   P value

Linear measures
  Body length                                   246.39                  < 0.001
  Thorax width                                  206.32                  < 0.001
  Length of abdominal spines                     62.02                  < 0.001
  Distance between tips of abdominal spines       9.34                  0.004
  Elevation angle of abdominal spines            17.54                  < 0.001
  Length of 1st antennal segment                 62.77                  < 0.001
  Forefemur length                               62.52                  < 0.001
  Midleg length                                  45.43                  < 0.001
  Hindleg length                                 58.56                  < 0.001
  FA in abdominal spine length                    0.08                  > 0.5
  FA in length of 1st antennal segment            0.80                  0.375
  FA in forefemur length                          0.03                  > 0.5
  FA in midleg length                             2.91                  0.095
  FA in hindleg length                            1.16                  0.286
Shape analysis of body
  Centroid size                                 270.30                  < 0.001
  Score relative warp number 1                   85.61                  < 0.001
  Score relative warp number 2                   47.91                  < 0.001
  Score relative warp number 3                    4.42                  0.041
  Score relative warp number 4                   38.94                  < 0.001
  Score relative warp number 5                    2.15                  0.149
  Score relative warp number 6                    6.46                  0.014
  Score uniform shape component number 1         44.26                  < 0.001
  Score uniform shape component number 2          0.96                  0.333
  FA in score relative warp number 1              3.01                  0.089
  FA in score relative warp number 2              1.46                  0.233
  FA in score relative warp number 3              0.03                  0.857
  FA in score relative warp number 4              1.94                  0.170
  FA in score relative warp number 6              1.31                  0.259

F-values represent the effect of food treatment in multivariate mixed model analyses of variance (see text).

Table 3. The degree of condition dependence, measured as the magnitude of effect of food stress, of a number of male genitalic morphological traits

Trait/trait group                               F-value (d.f. = 1,46)   P value

Linear measures
  Length of 1st genitalic segment                17.45                  < 0.001
  Proctiger length                                6.60                  0.013
  Length of phallotheca                          11.29                  0.001
  Length of lateral sclerites                     7.74                  0.008
  Distance between lateral sclerites              0.67                  0.416
  Length of ventral sclerite                      1.13                  0.298
  Length of dorsal sclerite                      32.97                  < 0.001
  FA in length of lateral sclerites               0.16                  0.687
Shape analysis of genital capsule
  Centroid size                                  13.17                  < 0.001
  Score relative warp number 1                    1.83                  0.182
  Score relative warp number 2                    5.24                  0.027
  Score relative warp number 3                    0.00                  > 0.5
  Score relative warp number 4                    0.23                  > 0.5
  Score relative warp number 5                    2.23                  0.142
  Score relative warp number 6                    0.40                  > 0.5
  Score uniform shape component number 1          3.54                  0.066
  Score uniform shape component number 2          0.01                  > 0.5
  FA in centroid size                             1.93                  0.171
  FA in score relative warp number 1              1.88                  0.177
  FA in score relative warp number 3              5.02                  0.030
  FA in score relative warp number 4              1.45                  0.235
  FA in score uniform shape component number 1    0.05                  0.827

F-values represent the effect of food treatment in multivariate mixed model analyses of variance (see text).

condition dependence and FA, food stress did not cause any measurable effects on the degree of FA in any traits, either in size-related traits or for measures of shape. Effect sizes for effects of food stress on FA were generally very low and insignificant (range of F-values: 0.0–5.0). To assess the validity of this interpretation of the pattern, we performed a three-way analysis of variance of log-transformed F-values, using three different dichotomous characteristics of each trait as factors (genital/non-genital, FA/non-FA and shape/size). In accordance with our interpretation, this analysis indicated that genitalic traits showed somewhat lower effects of food stress compared with non-genitalic traits ($F_{1,42}$ = 8.09, P = 0.007) and that measures of FA showed much lower degrees of condition dependence compared with size and shape traits per se ($F_{1,42}$ = 25.75, P < 0.001), whereas measures of size and shape were affected by food stress to a similar degree ($F_{1,42}$ = 1.09, P > 0.25).

In theory, the lack of effects of food stress on FA could in part result from non-random mortality, with respect to FA, in the food-stressed group. This possibility was assessed by comparing the variances in FA in the two offspring treatment groups, across all 16 morphological measures of FA. Variance in measures of FA did not differ in general between the two groups of offspring (Wilcoxon signed rank Z = 0.310, P = 0.756; paired t-test of log-transformed variances, t = 1.211, d.f. = 15, P = 0.245). Offspring provided with an adequate food supply exhibited higher variance in nine cases, and the opposite was true for the remaining seven cases.

We found no evidence of a correlation between average trait size/magnitude and the absolute value of FA in any of the traits exhibiting FA, either in parental or offspring males (Table 7). The average correlation coefficient across all traits was −0.012 (SE = 0.028, range −0.175 to 0.155) in parental males and 0.002 (SE = 0.023, range −0.138 to 0.161) in offspring males. Correlation coefficients did not differ in parental and offspring males (paired t-test; t = 0.411, d.f. = 15, P > 0.5), and were not correlated with each other (r = 0.102, n = 16, P > 0.5).

Broad-sense and narrow-sense heritability estimates were positively correlated across traits (r = 0.443, n = 50, P < 0.001). Contrary to our expectations, however, narrow-sense heritability estimates were overall slightly higher than broad-sense heritability estimates (paired t-test, n = 50, t = 4.36, P < 0.01), but both were considerably lower than the estimated

Table 4. General (non-genitalic) morphology

Trait/trait group                             $\sigma^2_g$ (SE)   $\sigma^2_{ge}$ (SE)   $\sigma^2_e$ (SE)^a   $V_G/V_P$   $h^2$ (SE)

Linear measures
  Body length                                 0.072 (0.048)       0^b                    0.928 (0.081)         0.14        0.34 (0.14)*
  Thorax width                                0.090 (0.066)       0.029 (0.076)          0.881 (0.085)         0.18        0.37 (0.14)**
  Length of abdominal spines                  0.082 (0.048)†      0                      0.918 (0.079)         0.16†       0.21 (0.15)
  Distance between tips of abdominal spines   0.123 (0.053)*      0                      0.877 (0.075)         0.25*       0.14 (0.15)
  Elevation angle of abdominal spines         0.034 (0.053)       0.054 (0.075)          0.911 (0.087)         0.07        0
  Length of 1st antennal segment              0.355 (0.102)***    0.015 (0.053)          0.630 (0.062)         0.71***     0.63 (0.12)***
  Forefemur length                            0.239 (0.080)**     0.018 (0.067)          0.743 (0.074)         0.48**      0.63 (0.12)***
  Midleg length                               0.205 (0.070)**     0                      0.795 (0.069)         0.41**      0.62 (0.12)***
  Hindleg length                              0.281 (0.086)***    0.001 (0.055)          0.718 (0.070)         0.56***     0.49 (0.13)***
  FA in abdominal spine length                0.002 (0.033)       0                      0.998 (0.086)         0.00        0.15 (0.15)
  FA in length of 1st antennal segment        0                   0                      1.000 (0.082)         0           0
  FA in forefemur length                      0.002 (0.047)       0.013 (0.069)          0.985 (0.092)         0           0.09 (0.15)
  FA in midleg length                         0.019 (0.039)       0                      0.981 (0.086)         0.04        0.15 (0.14)
  FA in hindleg length                        0.086 (0.048)†      0                      0.914 (0.079)         0.17†       0
Shape analysis of body
  Centroid size                               0.056 (0.046)       0                      0.944 (0.082)         0.11        0.33 (0.14)*
  Score relative warp number 1                0.188 (0.079)*      0.040 (0.075)          0.772 (0.075)         0.38*       0.51 (0.13)***
  Score relative warp number 2                0.234 (0.078)**     0                      0.765 (0.066)         0.47**      0.34 (0.14)*
  Score relative warp number 3                0.125 (0.072)†      0.090 (0.071)          0.785 (0.073)         0.25†       0.48 (0.13)***
  Score relative warp number 4                0.006 (0.046)       0.011 (0.068)          0.983 (0.091)         0.01        0.16 (0.15)
  Score relative warp number 5                0.105 (0.053)*      0                      0.895 (0.078)         0.21*       0.50 (0.13)***
  Score relative warp number 6                0.126 (0.069)†      0.051 (0.076)          0.823 (0.079)         0.25†       0.24 (0.14)
  Score uniform shape component number 1      0.118 (0.052)*      0                      0.882 (0.076)         0.24*       0.30 (0.14)*
  Score uniform shape component number 2      0.136 (0.059)*      0                      0.864 (0.074)         0.27*       0.42 (0.14)**
  FA in score relative warp number 1          0.021 (0.032)       0                      0.979 (0.083)         0.04        0.01 (0.15)
  FA in score relative warp number 2          0.016 (0.053)       0.074 (0.073)          0.910 (0.083)         0.03        0
  FA in score relative warp number 3          0.019 (0.044)       0.043 (0.064)          0.938 (0.085)         0.04        0.03 (0.15)
  FA in score relative warp number 4          0                   0                      1.000 (0.080)         0           0
  FA in score relative warp number 6          0                   0                      1.000 (0.079)         0           0

Estimates of variance components and their standard errors derived from full-sib analyses, corresponding to genotypic, genotype × environment and environmental sources of phenotypic variance. All variance components have been scaled to a sum of unity for each trait. Given are also estimates of broad-sense heritabilities ($V_G/V_P$) based on full-sib resemblance, and estimates of narrow-sense heritabilities $h^2$ ($V_A/V_P$) based on parent–offspring resemblance. †P < 0.1, *P < 0.05, **P < 0.01, ***P < 0.001.
a No significance levels are given for the residual variance component.
b Negative estimates of variance components and narrow-sense heritabilities are denoted with '0'.

repeatabilities, across all traits. The average difference between the estimated heritability and repeatability was 0.55 (SD = 0.19) for broad-sense estimates (paired t-test, n = 50, t = 20.37, P < 0.001) and 0.40 (SD = 0.23) for narrow-sense estimates (paired t-test, n = 50, t = 12.27, P < 0.001). We assessed the possibility that our narrow-sense heritability estimates were systematically inflated as a result of consistently higher phenotypic variances in the parental generation (see Section 2) in two ways. First, we performed $F_{max}$-tests of equality of variances among the three groups involved (parents, high food offspring and low food offspring) separately for the two sexes across all traits (excluding measures of FA) (Sokal & Rohlf, 1981). The mean maximum variance ratio ($F_{max}$) was 1.51 (n = 52). The number of cases where the variance ratio exceeded the upper 5% point of the $F_{max}$ distribution ($F_{max[crit.]}$ = 1.85) did not differ from the expected number under the null hypothesis of equal variances across all traits (Fisher's exact test, P = 0.319). Secondly, to test for directionality, we performed a contingency table test of the frequencies of occurrence of highest variance in the three groups. Across all traits, the three groups did not differ in frequency of occurrence of highest variance among groups ($\chi^2$ = 1.653, d.f. = 2, P > 0.25). Thus, the phenotypic variances in parental and offspring generations did not differ in general across traits (cf. Coyne & Beecham, 1987), and none of the groups had consistently higher phenotypic variance than any other.

Overall, positive single estimates of heritability were frequently observed for both size-related traits and multivariate shape scores, in both genital and non-genital traits (Tables 4, 5). However, this was not the case for measures of FA. This interpretation of the pattern was again confirmed in three-way analyses of variance of heritability of traits. For both broad- and

Table 5. Genital morphology

Trait/trait group                               $\sigma^2_g$ (SE)   $\sigma^2_{ge}$ (SE)   $\sigma^2_e$ (SE)^a   $V_G/V_P$   $h^2$ (SE)

Linear measures
  Length of 1st genital segment                 0.055 (0.129)       0.143 (0.169)          0.802 (0.121)         0.11        0.51 (0.29)
  Proctiger length                              0.101 (0.110)       0.169 (0.123)          0.730 (0.098)         0.20        0.76 (0.28)**
  Length of phallotheca                         0^b                 0.116 (0.098)          0.884 (0.124)         0           0.56 (0.29)
  Length of lateral sclerites                   0.234 (0.128)†      0.017 (0.113)          0.749 (0.105)         0.47†       0.73 (0.28)*
  Distance between lateral sclerites            0                   0.036 (0.072)          0.964 (0.124)         0           0.26 (0.30)
  Length of ventral sclerite                    0.184 (0.087)*      0                      0.816 (0.100)         0.37*       0.28 (0.30)
  Length of dorsal sclerite                     0.164 (0.089)†      0                      0.836 (0.104)         0.33†       0.90 (0.27)**
  FA in length of lateral sclerites             0.075 (0.071)       0                      0.925 (0.114)         0.15        0
Shape analysis of genital capsule
  Centroid size                                 0.179 (0.127)       0.065 (0.147)          0.756 (0.109)         0.36        0.41 (0.29)
  Score relative warp number 1                  0                   0.035 (0.077)          0.965 (0.127)         0           0.40 (0.29)
  Score relative warp number 2                  0                   0                      1.000 (0.109)         0           0.25 (0.29)
  Score relative warp number 3                  0.028 (0.051)       0                      0.972 (0.115)         0.06        0.03 (0.30)
  Score relative warp number 4                  0.037 (0.082)       0.031 (0.111)          0.932 (0.126)         0.07        0
  Score relative warp number 5                  0.074 (0.070)       0                      0.926 (0.114)         0.15        0.84 (0.27)**
  Score relative warp number 6                  0                   0                      1.000 (0.109)         0           0.68 (0.28)*
  Score uniform shape component number 1        0                   0.064 (0.081)          0.936 (0.124)         0           0.53 (0.29)
  Score uniform shape component number 2        0                   0.006 (0.073)          0.994 (0.129)         0           0.44 (0.29)
  FA in centroid size                           0.070 (0.103)       0.019 (0.142)          0.911 (0.139)         0.14        0
  FA in score relative warp number 1            0.093 (0.073)       0                      0.907 (0.112)         0.19        0
  FA in score relative warp number 3            0.044 (0.066)       0                      0.956 (0.118)         0.09        0.09 (0.30)
  FA in score relative warp number 4            0.010 (0.059)       0                      0.990 (0.122)         0.02        0.72 (0.28)*
  FA in score uniform shape component number 1  0.070 (0.069)       0                      0.930 (0.114)         0.14        0.27 (0.30)

Estimates of variance components and their standard errors derived from full-sib analyses, corresponding to genotypic, genotype × environment and environmental sources of phenotypic variance. All variance components have been scaled to a sum of unity for each trait. Given are also estimates of broad-sense heritabilities ($V_G/V_P$) based on full-sib resemblance, and estimates of narrow-sense heritabilities $h^2$ ($V_A/V_P$) based on parent–offspring resemblance. †P < 0.1, *P < 0.05, **P < 0.01, ***P < 0.001.
a No significance levels are given for the residual variance component.
b Negative estimates of variance components and narrow-sense heritabilities are denoted with '0'.

narrow-sense heritabilities (log-transformed) the single significant factor was whether the trait measured FA or not ($F_{1,42}$ = 6.10, P = 0.018 for broad-sense heritability, and $F_{1,42}$ = 24.83, P < 0.001 for narrow-sense heritability). Thus, excluding measures of FA, average broad-sense heritability for non-genital traits (n = 18) was 0.29 (SE = 0.04) and average narrow-sense heritability was 0.37 (SE = 0.05). For genital traits (n = 16), average broad-sense heritability was 0.13 (SE = 0.03) and average narrow-sense heritability was 0.47 (SE = 0.05). The average broad-sense heritability of measures of size (n = 16) was 0.28 (SE = 0.04) and average narrow-sense heritability was 0.46 (SE = 0.05). For measures of shape (n = 18), average broad-sense heritability was 0.16 (SE = 0.04) and average narrow-sense heritability was 0.38 (SE = 0.05). Thus, in conclusion, measures of genital and general morphology both exhibited overall weak to intermediate heritability, irrespective of whether they measured components of size or shape. In contrast, the magnitude of heritability of measures of FA differed from that of other traits, and did not show non-zero heritabilities.

We found no evidence of genotype × environment interactions for any traits. Most estimates of $V_{GE}$ were very low (scaled $\sigma^2_{ge}$ < 0.1 for 47 of 50 traits; Tables 4, 5). Thus, while food stress had a large impact on adult morphology (see above), different genotypes did not seem to differ in their morphological response to this stress, i.e. they did not differ appreciably in their ability to cope with a stressful environment.

There was a positive relationship between the magnitude of the estimates of phenotypic and genetic correlations between general morphological and genital traits (r = 0.59, n = 35, P < 0.001) (Table 6). Overall, phenotypic correlations were slightly larger (mean = 0.25, SE = 0.03) than genetic correlations (mean = 0.19, SE = 0.05), but not significantly so (paired t-test, t = 1.52, n = 35, P = 0.138). There was evidence of non-zero overall genetic correlation between genital and general morphology, even if the fact that non-positive genetic correlations are potentially informative is disregarded (t-test of $H_0$: all genetic correlations are equal to zero; t = 3.84, P < 0.001). Of particular interest is the apparent lack of phenotypic correlation, but a consistently negative

Table 6. Estimates of phenotypic correlations (top part) and genetic correlations (bottom part) between non-genital morphological traits (horizontally) and genital traits (vertically)

                                        Thorax width      Body length       Forefemur length   Midleg length     Hindleg length

Phenotypic correlations
  Length of 1st genital segment         0.55 (0.12)***    0.60 (0.12)***    0.55 (0.12)***     0.60 (0.12)***    0.52 (0.13)***
  Proctiger length                      0.53 (0.13)***    0.49 (0.13)***    0.39 (0.14)**      0.40 (0.14)**     0.37 (0.14)*
  Length of phallotheca                 0.12 (0.15)       0.15 (0.15)       0.15 (0.15)        0.03 (0.15)       0.01 (0.15)
  Length of lateral sclerites           0.26 (0.14)       0.20 (0.15)       0.30 (0.14)*       −0.25 (0.14)      0.34 (0.14)*
  Length of ventral sclerite            0.31 (0.14)*      0.25 (0.14)       0.27 (0.14)        0.21 (0.15)       0.17 (0.15)
  Length of dorsal sclerite             0.33 (0.14)*      0.30 (0.14)*      0.23 (0.14)        0.25 (0.14)       0.26 (0.14)
  Score relative warp number 5 (shape)  0.02 (0.15)       0.06 (0.15)       0.02 (0.15)        0.06 (0.15)       −0.16 (0.15)

Genetic correlations
  Length of 1st genital segment         0.17 (0.32)       0.67 (0.19)**     0.61 (0.15)***     0.71 (0.12)***    0.51 (0.20)*
  Proctiger length                      0.33 (0.24)       0.53 (0.20)**     0.39 (0.16)*       0.44 (0.15)**     0.60 (0.14)***
  Length of phallotheca                 0.40 (0.26)       0.35 (0.29)       0.50 (0.17)**      0.30 (0.21)       0.24 (0.24)
  Length of lateral sclerites           0.05 (0.27)       0.26 (0.26)       0.26 (0.18)        0.19 (0.19)       0.25 (0.21)
  Length of ventral sclerite            0.00 (0.45)       0.34 (0.41)       −0.19 (0.31)       −0.28 (0.30)      −0.14 (0.36)
  Length of dorsal sclerite             −0.07 (0.24)      0.08 (0.25)       0.08 (0.17)        0.17 (0.17)       0.16 (0.19)
  Score relative warp number 5 (shape)  −0.12 (0.24)      −0.19 (0.25)      −0.32 (0.16)*      −0.26 (0.17)      −0.29 (0.19)

Numbers within brackets represent estimated standard errors. *P < 0.05, **P < 0.01, ***P < 0.001.
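
The genetic correlations and standard errors of the kind reported in Table 6 rest on the sire–offspring cross-covariance estimator and the approximate standard error described in Section 2. The following is a minimal sketch of those two formulas, assuming NumPy and one value per family for each trait; the function and argument names are hypothetical and are not taken from the original analysis, which was run in SAS and SYSTAT.

```python
import numpy as np


def sample_cov(a, b):
    """Sample covariance (n - 1 denominator) between two equally long 1-D arrays."""
    return float(np.cov(a, b, ddof=1)[0, 1])


def genetic_correlation(sire_x1, sire_x2, off_z1, off_z2):
    """r_a = (COV_x1z2 + COV_x2z1) / (2 * sqrt(COV_x1z1 * COV_x2z2)).

    x1, x2: traits 1 and 2 measured in sires.
    z1, z2: family means of traits 1 and 2 in offspring (hypothetical layout).
    """
    cross = sample_cov(sire_x1, off_z2) + sample_cov(sire_x2, off_z1)
    c11 = sample_cov(sire_x1, off_z1)
    c22 = sample_cov(sire_x2, off_z2)
    if c11 <= 0 or c22 <= 0:
        # The estimator is undefined unless both direct covariances are positive.
        raise ValueError("direct sire-offspring covariances must be positive")
    return cross / (2.0 * np.sqrt(c11 * c22))


def se_genetic_correlation(r_a, h2_1, se_h2_1, h2_2, se_h2_2):
    """SE(r_a) = (1 - r_a^2) / sqrt(2) * sqrt(SE(h1^2) * SE(h2^2) / (h1^2 * h2^2))."""
    return (1.0 - r_a ** 2) / np.sqrt(2.0) * np.sqrt(
        (se_h2_1 * se_h2_2) / (h2_1 * h2_2)
    )
```

Because the denominator uses the square root of the product of the two direct covariances, the estimator is only defined when both are positive, which is why the sketch raises an error otherwise rather than returning a value.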

Table 7. Phenotypic correlations between trait size and the absolute value of fluctuating asymmetry, for all traits exhibiting fluctuating asymmetry. Separate estimates are given for parental (n = 61) and offspring (n = 198) males

                                            Parental males   Offspring males

General morphology
  Abdominal spine length                     0.021           −0.083
  Length of 1st antennal segment            −0.175           −0.060
  Forefemur length                          −0.070            0.073
  Midleg length                             −0.029            0.161
  Hindleg length                             0.128            0.074
  Score relative warp number 1              −0.017            0.060
  Score relative warp number 2              −0.161           −0.138
  Score relative warp number 3               0.115           −0.061
  Score relative warp number 4              −0.017            0.097
  Score relative warp number 6               0.121           −0.074
Genital morphology
  Length of lateral sclerites               −0.137           −0.116
  Centroid size                              0.039            0.050
  Score relative warp number 1               0.155           −0.085
  Score relative warp number 3               0.040            0.098
  Score relative warp number 4              −0.095            0.062
  Score uniform shape component number 1    −0.141           −0.019

The repeatabilities of linear measures of size were consistently very high, for both general and genital traits (mean = 0.91, SD = 0.07, range 0.75–1.00) (see Appendixes A and B). Repeatabilities of multivariate shape scores were somewhat lower and more variable, for both types of traits (mean = 0.70, SD = 0.19, range 0.13–0.92). Measures of asymmetry showed still lower, and also variable, repeatabilities (mean = 0.54, SD = 0.18, range 0.10–0.86). Again, the results of a three-way ANOVA of the repeatabilities (square-root transformed) were in agreement with this interpretation. Traits measuring multivariate shape were less repeatable compared with linear size measures ($F_{1,42}$ = 6.10, P = 0.018), irrespective of whether they measured genital or non-genital traits ($F_{1,42}$ = 2.55, P > 0.1), and measures of FA were less repeatable than other traits ($F_{1,42}$ = 12.63, P < 0.001). For general morphological traits, the repeatabilities estimated for males and females were highly correlated (r = 0.89, n = 32, P < 0.01). Further, as expected, the degree of repeatability was positively related to the estimated heritability across all traits. The correlation coefficient with repeatability was 0.52 for broad-sense heritability and 0.55 for narrow-sense heritability (n = 50, P < 0.001 in both cases).
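
The two kinds of heritability estimate compared with the repeatabilities above come from different calculations (Section 2): the broad-sense value is $2\sigma^2_g/(\sigma^2_g + \sigma^2_{ge} + \sigma^2_e)$ from the full-sib variance components, and the narrow-sense value is the slope of standardized offspring means on standardized parental values, doubled when only the sire is used. A minimal sketch of both follows, assuming NumPy; the function names and the toy inputs are hypothetical, and the variance components are taken as already estimated (the REML step itself is not reproduced).

```python
import numpy as np


def broad_sense_h2(var_g, var_ge, var_e):
    """V_G/V_P = 2*sigma2_g / (sigma2_g + sigma2_ge + sigma2_e), full-sib design."""
    return 2.0 * var_g / (var_g + var_ge + var_e)


def standardize(values):
    """Rescale to mean zero and unit (sample) variance."""
    v = np.asarray(values, dtype=float)
    return (v - v.mean()) / v.std(ddof=1)


def narrow_sense_h2(parent_values, offspring_means, single_parent=False):
    """Slope of standardized offspring means on standardized parental values.

    h^2 = beta for mid-parent values; h^2 = 2*beta when only one parent
    (here the sire, as for genitalic traits) is available.
    """
    x = standardize(parent_values)
    y = standardize(offspring_means)
    beta = np.polyfit(x, y, 1)[0]  # least-squares slope
    return 2.0 * beta if single_parent else beta


# Scaled components for body length as listed in Table 4 (sigma2_ge reported as 0).
print(round(broad_sense_h2(0.072, 0.0, 0.928), 2))  # 0.14
```

Since both variables are standardized before the regression, the slope equals the parent–offspring correlation, which is the reason the text checks separately that phenotypic variances are similar in the two generations.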

genetic correlation, between genitalic shape (score of relative warp number 5) and general morphology (Table 6).

4. Discussion

This is the first in-depth study of the intraspecific patterns of inheritance and phenotypic plasticity of genitalic morphology in any species. Our data are novel in many ways, and several new insights can be gained. We put our results to five tasks. First, we summarize our main findings. Secondly, we assess how our overall results agree with the various hypotheses for genitalic evolution. Third, we compare the relative merits of linear and multivariate morphometrics. Fourth, we compare different measures of condition dependence, and discuss their implications. Fifth, we discuss some of the methodological implications of our results.

(i) General conclusions

Genitalic morphology in our species exhibited moderate levels of phenotypic variation but, more importantly, genitalia were as phenotypically variable as general morphological traits. Further, both size and shape of genitalia exhibited sizable additive genetic components of phenotypic trait variation which, again, were of similar magnitudes to those of general morphological traits. Thus, we may conclude that male genitalia in G. incognitus were as variable, both phenotypically and genotypically, as the other types of traits measured. This is, to our knowledge, the first quantification of intraspecific genetic variation in genitalic morphology. Interestingly enough, our conclusions are largely in agreement with previous comparative work on Drosophila, where the genetic basis for interspecific differences in genitalic conformation has been shown to be polygenic and largely additive (Coyne, 1983, 1985; Coyne & Kreitman, 1986; Liu et al., 1996).

In general, genital traits showed somewhat lower levels of condition dependence compared with general morphological traits. However, two lines of evidence show that phenotypic expression of genitalia was indeed condition dependent. First, the effect sizes of our experimental food stress on genitalic conformation were in several cases high, especially for measures of size of genitalia, showing that food stress indeed affected genitalic morphology. Secondly, the fairly high phenotypic correlations between genitalic size and general size of body and appendages also indicate condition dependence. We conclude that development of genitalia was neither highly canalized nor invariant to environmental conditions, but rather exhibited moderate levels of condition dependence.

Measures of genitalic morphology, both size and shape, were genetically correlated with measures of body size and leg length. Actually, our estimates of genetic correlations were of similar magnitude to those of phenotypic correlations (cf. Cheverud, 1988; Roff, 1996). Assuming that transient linkage plays a minor role, we can thus conclude that genitalic and general morphology are to a certain extent influenced by the same set of pleiotropic genes in G. incognitus (Falconer & Mackay, 1996).

(ii) Assessment of hypotheses for genitalic evolution

On the basis of comparative studies, several previous authors have concluded that the lock-and-key hypothesis, the most widespread and long-standing hypothesis, is in poor agreement with the patterns of genitalic function and divergence across animal taxa (Mayr, 1963; Scudder, 1971; Eberhard, 1985, 1990, 1993; Shapiro & Porter, 1989). We found that genitalia are as phenotypically variable as are other traits, and that this variation is equally additive genetic in nature. Further, genitalic morphology is phenotypically plastic and responds to differences in environmental conditions. This general picture is in contrast to the image of the invariant 'key' envisioned by the lock-and-key hypothesis (Eberhard, 1985; Arnqvist, 1997b), and our results in terms of the patterns of inheritance and condition dependence of genitalia are hence not compatible with the lock-and-key hypothesis.

Although it is notoriously difficult to distinguish empirically between various models of sexual selection (Kirkpatrick & Ryan, 1991; Andersson, 1994; Johnstone, 1995; Andersson & Iwasa, 1996), traits under sexual selection have been shown generally to exhibit relatively high levels of phenotypic and genetic variance (Pomiankowski & Møller, 1995) and several mechanisms may generate condition dependence in trait expression (Andersson, 1994; Rowe & Houle, 1996). Thus, in a very general sense, our results are in agreement with the sexual selection hypothesis of genital evolution (cf. Arnqvist, 1997b).

Eberhard (1993) suggested that genitalic elaboration conveys no costs, and that genitalia should thus be particularly prone to evolve rapidly by a run-away process generated by a sensory exploitation mechanism (cryptic female choice). Under this particular 'Fisherian' sexual selection scenario, phenotypic evolution of genitalic traits would be halted only by a lack of genetic variance, rather than by any antagonistic natural selection. Additive genetic variance would thus rapidly be exhausted. Our results do not support this somewhat unrealistic version of the sexual selection hypothesis, since we found fairly high levels of genetic variation in genitalic morphology.

Our results are also in general agreement with the pleiotropy hypothesis of Mayr (1963) (Arnqvist, 1997b). The pleiotropy hypothesis is based on the assumption that the set of genes determining genital morphology also affects other traits by pleiotropic effects. Our finding of genetic correlations between genital shape/size and general morphology indicates that such a pattern exists in G. incognitus. Thus, if components of general morphology (such as body size or leg length) evolve, genitalia will also tend to evolve as a correlated response. Moreover, since genetic correlations are primarily caused by the basic functional genetic 'architecture' of an organism (Houle, 1991; Falconer & Mackay, 1996), which is likely to be shared between closely related species, this pattern of inheritance could potentially contribute to evolutionary diversification of genitalia within the genus Gerris.

In conclusion, the patterns of variation, inheritance and condition dependence of genitalia in G. incognitus presented here are in disagreement with the long-standing lock-and-key hypothesis. They are, however, in general agreement with the sexual selection hypothesis and the pleiotropy hypothesis. In a related and complementary study, Arnqvist et al. (1997) studied the patterns of phenotypic selection on genitalic traits in G. incognitus. Their results were similarly in disagreement with the lock-and-key hypothesis, and hence yielded conclusions very similar to those drawn in the current study. Further discrimination between the various hypotheses based on these data alone is difficult, since many of the predictions are relatively weak (for a discussion see Arnqvist, 1997b).

(iii) Morphometric methods

Our study is one of the very first to address the quantitative genetics of multivariate shape (but see Liu et al., 1996). Even though our linear measures of size of various traits and our multivariate shape scores measure in part wholly different components of morphology, they exhibited a similar pattern of phenotypic and genotypic variation. Morphological shape measures were generally somewhat less affected by food stress compared with measures of size, but the amount of additive genetic components in phenotypic trait variation was similar for both types of morphological measures.

The repeatabilities of our multivariate shape scores were reasonably high (average 0.70), but were, nevertheless, lower than the repeatabilities of linear measures of size, a pattern that is expected and may be general (see Arnqvist & Mårtensson, 1998). Multivariate measures of shape potentially capture more complex and comprehensive aspects of morphology than do simple linear measures of size (Rohlf & Marcus, 1993). In the light of this, it is encouraging that shape measures and size measures gave essentially the same result when patterns of genital and general morphological variation were compared, even though the greatest benefit in intraspecific studies from this characteristic of multivariate shape analysis may be reached in studies where morphology is related to performance in various ways (Arnqvist et al., 1997). In either case, we have shown that it is possible to measure and capture complex morphological variation accurately even in very small traits by means of landmark-based multivariate shape analysis.

(iv) Condition dependence

In the domain of sexual selection, much attention is currently being given to condition-dependent expression of sexual characters (Andersson, 1994; Rowe & Houle, 1996), especially with regard to biological handicap models (Johnstone, 1995). However, much of this research is limited to correlative studies, which are weakened by potential confounding effects, and very few have assessed the degree of condition dependence in a suite of sexual traits by experimentally altering the environmental conditions (Johnstone, 1995). Several insights can be gained from our experiment. First, manipulating environmental conditions experimentally allows for unambiguous assessments of the degree of condition dependence in a suite of traits for a given magnitude of a known source of stress. This is essential when any causal relationships are sought. Secondly, virtually all traits are condition dependent to some extent. Thus, we believe it is very important to study multiple traits and to focus on the magnitude of condition dependence rather than the mere existence of condition dependence. The fact that a potential morphological signal/ornament (such as genital size or shape) is 'significantly' revealing of phenotypic condition is not necessarily particularly informative. Other traits (for example body size) may be much more revealing of phenotypic quality, in which case we would expect female perception systems to evolve to home in on the latter rather than the former.

Thirdly, provided females receive no direct benefits from males (Price et al., 1993), an indicator trait (a handicap) should be revealing of genotypic, as opposed to phenotypic, quality for a good-genes/handicap sexual selection process to operate (Johnstone, 1995). In other words, condition dependence is assumed to be heritable (Grafen, 1990; Rowe & Houle, 1996). A major benefit of controlled laboratory experiments, such as that reported in this paper, is that they allow direct estimations of the extent to which different male genotypes differ in their condition dependence by investigating the genotype × environment interaction terms. Body size in G. incognitus may serve as an example. Size was strongly revealing of environmental condition experienced, and also found to be heritable. Thus, choosing large mates would on average select for mates of high phenotypic condition. However, different male genotypes did not differ in their response to food stress. In other words, there was no evidence of any variance in genotypic quality, in the sense that different genotypes did not differ in their ability to cope with harsh environmental conditions (food stress).

Fourthly, there is currently much discussion about measures of FA in bilaterally symmetrical traits, and its utility as an indicator of individual condition and quality (Møller & Pomiankowski, 1993; Swaddle et al., 1994; Watson & Thornhill, 1994). First, we wish to stress the importance of quantifying measurement error, by estimating repeatabilities of measures of FA, which are often not presented (Swaddle et al., 1994; Merilä & Björklund, 1995). In our case, measures of FA were in some cases to a large extent composed of measurement error. Overall, almost 50% of the between-individual variation in FA was due to measurement error, though the repeatability varied greatly between traits (range 14–90%). Obviously, this illustrates the difficulty of measuring low levels of FA, and that information on repeatability is critical in interpreting any relationships between FA of any given trait and individual performance in empirical studies.

generally most reliable (Falconer & Mackay, 1996). Thus, we suggest that our broad-sense heritability estimates are underestimates of the true values of $V_G/V_P$. This could in part result from founding our broad-sense estimates on full-sib data, which are less reliable and in some cases can render depressed estimates of genetic parameters (Arnold, 1994). However, an even more important factor is undoubtedly constraints imposed by the computational methods involved. REML estimators of heritabilities are known to be potentially negatively biased, as a result of the inability to handle negative estimates of variance components in maximum likelihood estimation (the non-negativity constraint) (Shaw, 1987). A comparison across traits supports this
interpretation: average difference between estimates Despite our relatively dramatic food treatment, of narrow- and broad-sense heritabilities was 0n18 which had effects on survivorship and growth rate as (n l 32, SE l 0n04) for traits in which the REML well as many other components of morphology, we models produced negative estimates of variance failed to find any effects of food stress on FA. components, but only 0n08 (n l 18, SE l 0n04) for Considering that other studies have shown effects of traits that were not affected by the non-negativity stress on FA (Palmer & Strobeck, 1986; Møller & constraint, suggesting a negative bias in the order of at Pomiankowski, 1993; Swaddle et al., 1994; Watson & least 10%. Our results hence strongly indicate that Thornhill, 1994), this result is surprising. It could this bias is systematic and can be considerable. Thus, certainly in part be due to the influence of measure- while REML estimation allows for a very flexible and ment error in our measures of FA, but this can not relatively accurate analysis of variance components account for the measures of FA that showed high even in cases where the data are unbalanced, we repeatabilities but no detectable effects of food stress. believe that interpretation of REML estimations of We conclude that FA in the traits measured was variance components when REML estimation apparently not affected by food stress in G. incognitus. involves negative estimates should be done with While this certainly does not eliminate the possibility caution (see also Shaw, 1987). that other forms of stress (e.g. parasites) may affect We also wish to stress the utility and importance of FA, it illustrates the fact that measures of FA can not estimating repeatabilities of morphological characters. be assumed to be universal and integrative measures These set upper limits to the heritabilities (see above) of phenotypic quality. 
Experimental manipulations, and hold quantitative information about the influence such as those presented here, are necessary to truly of measurement error in trait variation. This is key for reveal the causal factors (if any) of individual variation traits prone to be associated with high degrees of in FA (Møller, 1992). measurement error (e.g. FA or small-scale traits such as genitalia), but is also imortant for more complex measures of morphology such as multivariate (v) Methodological implications measures of shape. For such measures, certain shape Our estimates of narrow-sense heritability were components obviously reflect measurement error more systematically somewhat higher than the estimates of than others. For example, while most of our multi- broad-sense heritability. This result is, of course, variate measures of shape captured primarily true puzzling, since true narrow-sense heritabilities can not between-individual variation in shape, this was not exceed the true value of broad-sense heritabilities. In always the case. For at least one measure of shape theory, this could be due either to inflated narrow- (genitalia; score of relative warp number 2), pheno- sense estimates or to depressed broad-sense estimates. typic variance consisted almost wholly of measure- Four facts strongly suggest that the former is not the ment error. case. First, phenotypic variances did not differ across The current contribution represents the first ex- generations (see Section 3). Secondly, our narrow- tensive intraspecific study of the genetic control of sense estimates were generally much lower than the genital morphology (Eberhard, 1985). The data upper limit set by the repeatabilities (less than half on presented here allowed us tentatively to assess various average). Thirdly, we found no genotypei hypotheses for the evolution of animal genitalia, but environment interactions across laboratory environ- also yielded a number of additional insights. In ments (cf. 
Lande, 1987). Fourthly, narrow-sense particular, our results do not support the long- estimates based on parent–offspring resemblance are standing lock-and-key hypothesis and our study thus G. ArnqŠist and R. Thornhill 208 illustrates the utility and importance of in-depth more empirical studies will follow, allowing for a studies of the pattern of variability and inheritance of future consensus on the evolutionary processes re- genital versus general morphology. We hope that sponsible for genitalic evolution.
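The repeatabilities referred to in the Discussion, and tabulated in Appendices A and B, can be illustrated with a minimal sketch. It assumes that repeatability is computed as the intraclass correlation from a one-way ANOVA with individuals as groups, the approach described by Lessels & Boag (1987); whether the estimates in this paper were obtained in exactly this way is an assumption. The function name and the measurement values are hypothetical.

```python
def repeatability(measurements):
    """Intraclass correlation from a balanced one-way ANOVA.

    measurements: one inner list of repeated measurements per individual,
    all inner lists of equal length (hypothetical data layout).
    """
    k = len(measurements)          # number of individuals
    n = len(measurements[0])       # repeated measurements per individual
    means = [sum(m) / n for m in measurements]
    grand_mean = sum(means) / k

    # Among-individual and within-individual (measurement error) mean squares
    ss_among = n * sum((mu - grand_mean) ** 2 for mu in means)
    ss_within = sum((x - mu) ** 2
                    for m, mu in zip(measurements, means) for x in m)
    ms_among = ss_among / (k - 1)
    ms_within = ss_within / (k * (n - 1))

    # Among-individual variance component, then the intraclass correlation
    var_among = (ms_among - ms_within) / n
    return var_among / (var_among + ms_within)


# Hypothetical example: three individuals, each measured twice (mm)
data = [[10.0, 10.2], [12.0, 11.8], [14.0, 14.2]]
r = repeatability(data)
print(round(r, 3))
print(round(1.0 - r, 3))  # share of variance attributed to measurement error
```

Under this definition, 1 - r is the share of between-individual variation attributable to measurement error, which is why a repeatability sets an upper limit to the corresponding heritability.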

Appendix A

Estimates of repeatability for general (non-genitalic) morphological traits, estimated separately for females and males. The critical limit of a significant (P < 0.05) between-individuals variation (unadjusted) equals a repeatability ≥ 0.23. Given also are P-values of t-tests of H0: µ = 0 and of Kolmogorov–Smirnov tests of distributional normality for all measures of asymmetry.

Trait/trait group                                   Repeatability       H0: µ = 0    Normality
                                                    Females   Males     (t-test)     (K-S test)
Linear measures
Body length                                         0.99      0.99      -            -
Thorax width                                        0.90      0.93      -            -
Length of abdominal spines                          0.91      0.83      -            -
Distance between tips of abdominal spines           0.93      0.88      -            -
Elevation angle of abdominal spines                 0.57      0.72      -            -
Length of 1st antennal segment                      0.95      0.93      -            -
Forefemur length                                    0.96      0.96      -            -
Midleg length                                       0.95      0.96      -            -
Hindleg length                                      0.96      0.97      -            -
Asymmetry in abdominal spine length                 0.68      0.56      P > 0.1      P > 0.6
Asymmetry in length of 1st antennal segment         0.47      0.61      P > 0.2      P > 0.2
Asymmetry in forefemur length                       0.86      0.64      P > 0.6      P > 0.1
Asymmetry in midleg length                          0.69      0.60      P > 0.2      P > 0.2
Asymmetry in hindleg length                         0.65      0.70      P > 0.4      P > 0.1
Shape analysis of body
Centroid size                                       1.00      0.99      -            -
Score relative warp number 1                        0.90      0.89      -            -
Score relative warp number 2                        0.92      0.90      -            -
Score relative warp number 3                        0.82      0.66      -            -
Score relative warp number 4                        0.89      0.68      -            -
Score relative warp number 5                        0.83      0.75      -            -
Score relative warp number 6                        0.57      0.63      -            -
Score uniform shape component number 1              0.62      0.61      -            -
Score uniform shape component number 2              0.78      0.82      -            -
Asymmetry in centroid size                          0.58      0.69      P < 0.001    P > 0.05
Asymmetry in score relative warp number 1           0.63      0.51      P > 0.05     P > 0.4
Asymmetry in score relative warp number 2           0.62      0.67      P > 0.05     P > 0.1
Asymmetry in score relative warp number 3           0.77      0.57      P > 0.1      P > 0.1
Asymmetry in score relative warp number 4           0.56      0.50      P > 0.2      P > 0.05
Asymmetry in score relative warp number 5           0.12      0.31      P < 0.001    P < 0.001
Asymmetry in score relative warp number 6           0.23      0.33      P > 0.05     P > 0.05
Asymmetry in score uniform shape comp. number 1     0.41      0.46      P < 0.001    P < 0.05
Asymmetry in score uniform shape comp. number 2     0.56      0.58      P < 0.001    P > 0.2

Appendix B

Estimates of repeatability for genital morphological traits in males. The critical limit for a significant (P < 0.05) between-individuals variation (unadjusted) equals a repeatability ≥ 0.23. Given also are P-values of t-tests of H0: µ = 0 and of Kolmogorov–Smirnov tests of distributional normality for all measures of asymmetry.

Trait/trait group                                     Repeatability  H0: µ = 0    Normality
                                                                     (t-test)     (K-S test)
Linear measures
Length of 1st genital segment                         0.94           -            -
Proctiger length                                      0.94           -            -
Length of phallotheca                                 0.91           -            -
Length of lateral sclerites                           0.82           -            -
Distance between lateral sclerites                    0.79           -            -
Length of ventral sclerite                            0.86           -            -
Length of dorsal sclerite                             0.84           -            -
Asymmetry in length of 1st genital segment            0.83           P < 0.001    P > 0.4
Asymmetry in length of lateral sclerites              0.54           P > 0.8      P > 0.8
Shape analysis of genital capsule
Centroid size                                         0.75           -            -
Score relative warp number 1                          0.74           -            -
Score relative warp number 2                          0.13           -            -
Score relative warp number 3                          0.35           -            -
Score relative warp number 4                          0.46           -            -
Score relative warp number 5                          0.82           -            -
Score relative warp number 6                          0.62           -            -
Score uniform shape component number 1                0.67           -            -
Score uniform shape component number 2                0.76           -            -
Asymmetry in centroid size                            0.44           P > 0.5      P > 0.2
Asymmetry in score relative warp number 1             0.55           P > 0.05     P > 0.3
Asymmetry in score relative warp number 2             0.10           P < 0.001    P > 0.1
Asymmetry in score relative warp number 3             0.20           P > 0.4      P > 0.3
Asymmetry in score relative warp number 4             0.52           P > 0.1      P > 0.9
Asymmetry in score relative warp number 5             0.61           P < 0.001    P > 0.4
Asymmetry in score relative warp number 6             0.33           P < 0.001    P > 0.4
Asymmetry in score uniform shape component number 1   0.80           P > 0.6      P > 0.1
Asymmetry in score uniform shape component number 2   0.44           P < 0.001    P > 0.8

Appendix C

Summary of phenotypic variation in linear scalar general and genital morphometric traits. Data shown are based on parental individuals (n = 61 in all cases). Mean represents the arithmetic mean and CV represents the coefficient of variation.

                                                Males                               Females
Trait/trait group                               Mean     SEmean   CV (%)   SECV     Mean     SEmean   CV (%)   SECV
General morphology
Body length (mm)                                7.06     0.029    3.2      0.29     8.00     0.034    3.3      0.30
Thorax width (mm)                               2.25     0.008    2.7      0.21     2.65     0.010    3.0      0.28
Length of abdominal spines (mm)                 1.00     0.007    5.3      0.48     1.35     0.005    4.8      0.44
Distance between tips of abdominal spines (mm)  0.78     0.007    7.3      0.66     0.62     0.016    20.6     1.86
Elevation angle of abdominal spines (degrees)   5.62     0.599    -        -        29.82    0.634    -        -
Length of 1st antennal segment (mm)             1.22     0.005    3.2      0.29     1.29     0.007    4.2      0.38
Forefemur length (mm)                           2.07     0.008    3.0      0.26     2.14     0.008    3.0      0.27
Midleg length (mm)                              8.51     0.036    3.3      0.30     9.11     0.039    3.4      0.30
Hindleg length (mm)                             6.81     0.032    3.6      0.32     7.25     0.038    4.0      0.36
Genital morphology
Length of 1st genital segment (mm⁻¹)            3.86     0.017    3.5      0.32
Proctiger length (mm⁻¹)                         2.23     0.009    3.5      0.31
Length of phallotheca (mm⁻¹)                    1.50     0.006    3.2      0.29
Length of lateral sclerites (mm⁻¹)              0.78     0.005    4.9      0.44
Distance between lateral sclerites (mm⁻¹)       0.28     0.006    16.5     1.49
Length of ventral sclerite (mm⁻¹)               0.89     0.007    6.0      0.54
Length of dorsal sclerite (mm⁻¹)                1.19     0.006    3.8      0.35
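As a worked illustration of the summary statistics in Appendix C, the sketch below computes a coefficient of variation and an approximate standard error for it. The formula SE(CV) ≈ CV / sqrt(2n) is a standard large-sample approximation and is an assumption here, since the appendix does not state how the standard errors of CV were obtained; with n = 61 it gives values close to most of those tabulated. The function names are hypothetical.

```python
import math
import statistics


def cv_percent(values):
    """Coefficient of variation in percent: 100 * SD / mean."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)


def se_cv(cv, n):
    """Approximate standard error of a CV (assumed formula: CV / sqrt(2n))."""
    return cv / math.sqrt(2 * n)


# Male body length in Appendix C: CV = 3.2 %, n = 61
print(round(se_cv(3.2, 61), 2))  # 0.29
```

The approximation ignores higher-order terms in CV, so it is best read as a plausibility check on the tabulated values rather than as a reconstruction of the original calculation.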

We wish to thank J. F. Rohlf for modifying his thin-plate spline relative warp analysis computer program TPSRW to meet our requirements, and for allowing us to use it. Similarly, D. Slice kindly allowed us access to his morphometric program GRF-ND. R. G. Shaw gave invaluable statistical advice. We are most grateful to Stephanie Weinstein for digitizing all morphometric data. We thank N. Barton, L. Rowe, P. Watson and an anonymous reviewer for constructive comments on earlier versions of this paper. J. Elmberg provided laboratory assistance. This study was made possible by financial support from The Swedish Natural Science Research Council, The Crafoord Foundation (The Royal Swedish Academy of Sciences), The Fulbright Commission and The Swedish Institute. Logistic support was received from the Department of Biology, University of New Mexico, which is gratefully acknowledged.

References

Alexander, R. D., Marshall, D. C. & Cooley, J. R. (1997). Evolutionary perspectives on insect mating. In Social Competition and Cooperation in Insects and Arachnids, vol. I, Evolution of Mating Systems (ed. J. C. Choe & B. J. Crespi), pp. 4–31. Cambridge: Cambridge University Press.
Andersen, N. M. (1982). The Semiaquatic Bugs (Hemiptera, Gerromorpha): Phylogeny, Adaptations, Biogeography and Classification. Entomonograph 3. Klampenborg, Denmark: Scandinavian Science Press.
Andersen, N. M. (1993). Classification, phylogeny, and zoogeography of the pond skater genus Gerris Fabricius (Hemiptera: Gerridae). Canadian Journal of Zoology 71, 2473–2508.
Andersson, M. (1994). Sexual Selection. Princeton, NJ: Princeton University Press.
Andersson, M. & Iwasa, Y. (1996). Sexual selection. Trends in Ecology and Evolution 11, 53–58.
Arnold, S. J. (1994). Multivariate inheritance and evolution: a review of concepts. In Quantitative Genetic Studies of Behavioral Evolution (ed. C. R. B. Boake), pp. 17–48. Chicago: University of Chicago Press.
Arnqvist, G. (1988). Mate guarding and sperm displacement in the water strider Gerris lateralis Schumm. (Heteroptera: Gerridae). Freshwater Biology 19, 269–274.
Arnqvist, G. (1989). Multiple mating in a water strider: mutual benefits or intersexual conflict? Animal Behaviour 38, 749–756.
Arnqvist, G. (1992). Spatial variation in selective regimes: sexual selection in the water strider, Gerris odontogaster. Evolution 46, 914–929.
Arnqvist, G. (1997a). The evolution of water strider mating systems: causes and consequences of sexual conflicts. In Social Competition and Cooperation in Insects and Arachnids, vol. I, Evolution of Mating Systems (ed. J. C. Choe & B. J. Crespi), pp. 146–163. Cambridge: Cambridge University Press.
Arnqvist, G. (1997b). The evolution of animal genitalia: distinguishing between hypotheses by single species studies. Biological Journal of the Linnean Society 60, 365–379.
Arnqvist, G. & Mårtensson, T. (1998). Measurement error in geometric morphometrics: empirical strategies to assess and reduce its impact on measure of shape. Acta Zoologica (in press).
Arnqvist, G. & Rowe, L. (1995). Sexual conflict and arms races between the sexes: a morphological adaptation for control of mating in a female insect. Proceedings of the Royal Society of London, Series B 261, 123–127.
Arnqvist, G. & Wooster, D. (1995). Meta-analysis: synthesizing research findings in ecology and evolution. Trends in Ecology and Evolution 10, 236–240.
Arnqvist, G., Thornhill, R. & Rowe, L. (1997). Evolution of animal genitalia: morphological correlates of fitness components in a water strider. Journal of Evolutionary Biology 10, 613–640.
Auffray, J.-C., Alibert, P., Renaud, S., Orth, A. & Bonhomme, F. (1996). Fluctuating asymmetry in Mus musculus subspecific hybridization: traditional and Procrustes comparative approach. In Advances in Morphometrics (ed. L. F. Marcus, M. Corti, A. Loy, G. Naylor & D. Slice), pp. 275–284. New York: Plenum Press.
Becker, W. A. (1984). Manual of Quantitative Genetics, 4th edn. Pullman, Wash.: Academic Enterprises.
Berry, D. A. (1987). Logarithmic transformations in ANOVA. Biometrics 43, 439–456.
Birkhead, T. R. & Hunter, F. M. (1990). Mechanisms of sperm competition. Trends in Ecology and Evolution 5, 48–52.
Bookstein, F. L. (1991). Morphometric Tools for Landmark Data. Cambridge: Cambridge University Press.
Bookstein, F. L. (1996). A standard formula for the uniform shape component in landmark data. In Advances in Morphometrics (ed. L. F. Marcus, M. Corti, A. Loy, G. Naylor & D. Slice), pp. 153–168. New York: Plenum Press.
Cheverud, J. M. (1988). A comparison of genetic and phenotypic correlations. Evolution 42, 958–968.
Cohen, J. (1988). Statistical Power Analysis for the Behavioral Sciences. Hillsdale, NJ: Lawrence Erlbaum.
Coyne, J. A. (1983). Genetic basis of differences in genital morphology among three sibling species of Drosophila. Evolution 37, 1101–1118.
Coyne, J. A. (1985). Genetic studies of three sibling species of Drosophila with relationship to theories of speciation. Genetical Research 46, 169–192.
Coyne, J. A. & Beecham, E. (1987). Heritability of two morphological characters within and among natural populations of Drosophila melanogaster. Genetics 117, 727–737.
Coyne, J. A. & Kreitmann, M. (1986). Evolutionary genetics of two sibling species, Drosophila simulans and D. sechellia. Evolution 40, 673–691.
Eberhard, W. G. (1985). Sexual Selection and the Evolution of Animal Genitalia. Cambridge, Mass.: Harvard University Press.
Eberhard, W. G. (1990). Animal genitalia and female choice. American Scientist 78, 134–141.
Eberhard, W. G. (1993). Evaluating models of sexual selection: genitalia as a test case. American Naturalist 142, 564–571.
Eberhard, W. G. (1996). Female Control: Sexual Selection by Cryptic Female Choice. Princeton, NJ: Princeton University Press.
Falconer, D. S. (1973). Replicated selection for body weight in mice. Genetical Research 22, 291–321.
Falconer, D. S. & Mackay, T. F. C. (1996). Introduction to Quantitative Genetics. Essex: Longman.
Grafen, A. (1990). Sexual selection unhandicapped by the Fisher process. Journal of Theoretical Biology 144, 473–516.
Harville, D. A. (1977). Maximum likelihood approaches to variance component estimation and to related problems. Journal of the American Statistical Association 72, 320–340.
Heming-van Battum, K. E. & Heming, B. S. (1989). Structure, function, and evolutionary significance of the reproductive system in males of Hebrus ruficeps and H. pusillus (Heteroptera, Gerromorpha, Hebridae). Journal of Morphology 202, 281–323.
Henderson, C. R. (1985). MIVQUE and REML estimation of additive and nonadditive genetic variances. Journal of Animal Science 61, 113–121.
Holm, S. (1979). A simple sequentially rejective multiple test procedure. Scandinavian Journal of Statistics 6, 65–70.
Houle, D. (1991). Genetic covariance of fitness correlates: what genetic correlations are made of and why it matters. Evolution 45, 630–648.
Johnstone, R. A. (1995). Sexual selection, honest advertisement and the handicap principle: reviewing the evidence. Biological Reviews 70, 1–65.
Kaitala, A. (1987). Dynamic life-history strategy of the water strider Gerris thoracicus as an adaptation to food and habitat variation. Oikos 48, 125–131.
Kearsey, M. J. (1965). Biometrical analysis of a random mating population: a comparison of five experimental designs. Heredity 20, 205–235.
Kirkpatrick, M. & Ryan, M. J. (1991). The evolution of mating preferences and the paradox of the lek. Nature 350, 33–38.
Lande, R. (1977). On comparing coefficients of variation. Systematic Zoology 26, 214–216.
Lande, R. (1987). Appendix to Coyne and Beecham. Genetics 117, 737.
Lessels, C. M. & Boag, P. T. (1987). Unrepeatable repeatabilities: a common mistake. Auk 104, 116–121.
Liu, J., Mercer, J. M., Stam, L. F., Gibson, G. C., Zeng, Z.-B. & Laurie, C. C. (1996). Genetic analysis of a morphological shape difference in the male genitalia of Drosophila simulans and D. mauritana. Genetics 142, 1129–1145.
Lloyd, J. E. (1979). Mating behavior and natural selection. Florida Entomologist 62, 17–23.
Marcus, L. F., Corti, M., Loy, A., Naylor, G. J. P. & Slice, D. E. (eds.) (1996). Advances in Morphometrics. New York: Plenum Press.
Mather, K. (1953). Genetical control of stability in development. Heredity 7, 297–336.
Mayr, E. (1963). Animal Species and Evolution. Cambridge, Mass.: Harvard University Press.
McCullagh, P. & Nelder, J. A. (1983). Generalized Linear Models. New York: Chapman and Hall.
Merilä, J. & Björklund, M. (1995). Fluctuating asymmetry and measurement error. Systematic Biology 44, 97–101.
Møller, A. P. (1992). Parasites differentially increase the degree of fluctuating asymmetry in secondary sexual characters. Journal of Evolutionary Biology 5, 691–699.
Møller, A. P. & Pomiankowski, A. (1993). Fluctuating asymmetry and sexual selection. Genetica 89, 267–279.
Palmer, A. R. & Strobeck, C. (1986). Fluctuating asymmetry: measurement, analysis and pattern. Annual Review of Ecology and Systematics 17, 391–421.
Pirchner, F. (1983). Population Genetics in Animal Breeding. New York: Plenum Press.
Pomiankowski, A. & Møller, A. P. (1995). A resolution of the lek paradox. Proceedings of the Royal Society of London, Series B 260, 21–29.
Price, T., Schluter, D. & Heckman, N. E. (1993). Sexual selection when the female directly benefits. Biological Journal of the Linnean Society 48, 187–211.
Rice, W. R. (1989). Analyzing tables of statistical tests. Evolution 43, 223–225.
Rice, W. R. (1996). Sexually antagonistic male adaptation triggered by experimental arrest of female evolution. Nature 381, 232–234.
Roff, D. A. (1996). The evolution of genetic correlations: an analysis of patterns. Evolution 50, 1392–1403.
Rohlf, F. J. (1993). Relative warp analysis and an example of its application to mosquito wings. In Contributions to Morphometrics (ed. L. F. Marcus, E. Bello & A. Garcia-Valdecasas), pp. 131–159. Madrid, Spain: Monografias del Museo Nacional de Ciencias Naturales.
Rohlf, F. J. & Marcus, L. F. (1993). A revolution in morphometrics. Trends in Ecology and Evolution 8, 129–132.
Rowe, L. & Houle, D. (1996). The lek paradox and the capture of genetic variance by condition dependent traits. Proceedings of the Royal Society of London, Series B 263, 1415–1421.
Rowe, L., Arnqvist, G., Sih, A. & Krupa, J. J. (1994). Sexual conflict and the evolutionary ecology of mating patterns: water striders as a model system. Trends in Ecology and Evolution 9, 289–293.
Rubenstein, D. I. (1989). Sperm competition in the water strider Gerris remigis. Animal Behaviour 38, 631–636.
SAS (1992). Software: Changes and Enhancements, Release 6.07. Technical Report P-229. Cary, NC: SAS Institute.
Scudder, G. G. E. (1971). Comparative morphology of insect genitalia. Annual Review of Entomology 16, 379–406.
Searle, S. R., Casella, G. & McCulloch, C. E. (1992). Variance Components. New York: Wiley.
Shapiro, A. M. & Porter, A. H. (1989). The lock-and-key hypothesis: evolutionary and biosystematic interpretation of insect genitalia. Annual Review of Entomology 34, 231–245.
Shaw, R. G. (1987). Maximum-likelihood approaches applied to quantitative genetics of natural populations. Evolution 41, 812–826.
Shaw, F. H., Shaw, R. G., Wilkinson, G. S. & Turelli, M. (1995). Changes in genetic variances and covariances: G whiz! Evolution 49, 1260–1267.
Slice, D. (1994). Generalized Rotational Fitting of n-Dimensional Landmark Data. Stony Brook, New York: State University of New York.
Smith, C. A. B. (1980). Estimating genetic correlations. Annals of Human Genetics 43, 265–284.
Smith, D. R., Crespi, B. J. & Bookstein, F. L. (1997). Fluctuating asymmetry in the honey bee, Apis mellifera: effects of ploidy and hybridization. Journal of Evolutionary Biology 10, 551–574.
Smith, R. L. (ed.) (1984). Sperm Competition and the Evolution of Animal Mating Systems. Orlando, Fla.: Academic Press.
Sokal, R. R. & Rohlf, F. J. (1981). Biometry. San Francisco: W. H. Freeman.
Spence, J. R. & Andersen, N. M. (1994). Biology of water striders: interactions between systematics and ecology. Annual Review of Entomology 39, 101–128.
Swaddle, J. P., Witter, M. S. & Cuthill, I. C. (1994). The analysis of fluctuating asymmetry. Animal Behaviour 48, 986–989.
SYSTAT (1992). SYSTAT for Windows: Statistics, Version 5 Edition. Evanston, Ill.: SYSTAT.
Thornhill, R. (1983). Cryptic female choice and its implications in the scorpionfly Harpobittacus nigriceps. American Naturalist 122, 765–788.
Van Valen, L. (1962). A study of fluctuating asymmetry. Evolution 16, 125–142.
Waage, J. K. (1984). Sperm competition and the evolution of Odonate mating systems. In Sperm Competition and the Evolution of Animal Mating Systems (ed. R. L. Smith), pp. 251–290. Orlando, Fla.: Academic Press.
Watson, P. J. & Thornhill, R. (1994). Fluctuating asymmetry and sexual selection. Trends in Ecology and Evolution 9, 21–25.