Dioecy Is Associated with High Genetic Diversity and Adaptation Rates in the Genus Silene Aline Muyle *,1,2 Hele`ne Martin*,3 ,4 Niklaus Zemp,5,6 Maeva Mollion,7 Sophie Gallina,3 Raquel Tavares,1 Alexandre Silva,8 Thomas Bataillon,7 Alex Widmer,5 Sylvain Glemin,9,10 Pascal Touzet*,3 and Gabriel A.B. Marais†,1

1Laboratoire de Biometrie et Biologie Evolutive (UMR 5558), CNRS/Universite Lyon 1, Villeurbanne, France 2Department of Ecology and Evolutionary Biology, UC Irvine, Irvine, CA 3University of Lille, CNRS, UMR 8198—Evo-Eco-Paleo, F-59000 Lille, France 4

Departement de Biologie, Institut de Biologie Integrative et des Syste`mes, Universite Laval, Quebec, QC, Canada Downloaded from https://academic.oup.com/mbe/article/38/3/805/5905496 by Beurlingbiblioteket user on 17 June 2021 5Institute of Integrative Biology, ETH Zurich, Zurich, Switzerland 6Genetic Diversity Centre (GDC), ETH Zurich, Zurich, Switzerland 7Bioinformatics Research Centre, Aarhus University, Aarhus C, Denmark 8Centro de Interpretac¸~ao da Serra da Estrela (CISE), Seia, Portugal 9CNRS, ECOBIO [(Ecosyste`mes, Biodiversite, Evolution)]—UMR 6553, University of Rennes, Rennes, France 10Department of Ecology and Genetics, Evolutionary Biology Center and Science for Life Laboratory, Uppsala University, Uppsala, Sweden †Present address: LEAF- Linking Landscape, Environment, Agriculture and Food, Instituto Superior de Agronomia, Universidade de Lisboa, Portugal All the RNA-seq reads generated for this study can be downloaded from the European Nucleotide Archive accession PRJEB39526. These authors contributed equally as co-first authors to this work. These authors contributed equally as co-senior authors to this work. *Corresponding authors: E-mails: [email protected];[email protected]; [email protected]; gabriel.- [email protected]. Associate editor: Wright Stephen

Abstract About 15,000 angiosperm species (6%) have separate sexes, a phenomenon known as dioecy. Why dioecious taxa are so rare is still an open question. Early work reported lower species richness in dioecious compared with nondioecious sister clades, raising the hypothesis that dioecy may be an evolutionary dead-end. This hypothesis has been recently challenged by macroevolutionary analyses that detected no or even positive effect of dioecy on diversification. However, the possible Article genetic consequences of dioecy at the population level, which could drive the long-term fate of dioecious lineages, have not been tested so far. Here, we used a population genomics approach in the Silene genus to look for possible effects of dioecy, especially for potential evidence of evolutionary handicaps of dioecy underlying the dead-end hypothesis. We collected individual-based RNA-seq data from several populations in 13 closely related species with different sexual systems: seven dioecious, three hermaphroditic, and three gynodioecious species. We show that dioecy is associated with increased genetic diversity, as well as higher selection efficacy both against deleterious mutations and for beneficial mutations. The results hold after controlling for phylogenetic inertia, differences in species census population sizes and geographic ranges. We conclude that dioecious Silene species neither show signs of increased mutational load nor genetic evidence for extinction risk. We discuss these observations in the light of the possible demographic differences between dioecious and self-compatible hermaphroditic species and how this could be related to alternatives to the dead-end hypothesis to explain the rarity of dioecy. Key words: sexual systems, selection efficacy, Allee effect, population genetics, RNA-seq, angiosperms.

Introduction which individuals carry bisexual flowers. Nonetheless, dioecy can be found in 40% of angiosperm families. The current Out of 261,750 angiosperm species, 15,600 (6%) are di- view is that many independent (871 to 5,000) and mostly oecious (Renner 2014). Dioecy is thus rare compared with the recent transitions occurred from hermaphroditism to dioecy dominant sexual system in angiosperms, hermaphroditism, in

ß The Author(s) 2020. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http:// creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact [email protected] Open Access Mol. Biol. Evol. 38(3):805–818 doi:10.1093/molbev/msaa229 Advance Access publication September 14, 2020 805 Muyle et al. . doi:10.1093/molbev/msaa229 MBE in angiosperms (Renner 2014). These events probably fol- dioecy was not associated with an increased extinction rate lowed two main evolutionary paths: one through a gynodioe- (Sabath et al. 2016; Duan et al. 2018). Another hypothesis to cious intermediate (with female and hermaphrodite explain the rareness of dioecy in angiosperms posits that re- individuals), called the gynodioecy pathway, and another version from dioecy to other sexual systems is frequent and one through a monoecious intermediate (with a single indi- that dioecy is more evolutionary labile than previously vidual carrying separate female and male flowers) called the thought (Kafer et al. 2017). This hypothesis was supported monoecy or paradioecy pathway (Barrett 2002). In few cases, by a study of 2,145 species from 40 genera in which the dioecy might have evolved through alternative pathways, authors found evidence of frequent reversions from dioecy such as heterostyly (where individuals have different style to other breeding systems (Goldberg et al. 2017). However, and stamen lengths, Barrett 2002). these works relied on macroevolutionary and comparative

Species richness was originally found to be lower in dioe- approaches and the population functioning of dioecious ver- Downloaded from https://academic.oup.com/mbe/article/38/3/805/5905496 by Beurlingbiblioteket user on 17 June 2021 cious compared with nondioecious sister clades, suggesting susnondioeciousspecieshasso far been poorly explored. that dioecy might lead to an increased extinction rate in Investigating consequences both at the macroevolutionary angiosperms and defining the evolutionary dead-end hypoth- and population scale have been fruitful in the study of the esis (Heilbuth 2000). Theoretical work has explored this idea evolution of selfing from outcrossing, another frequent breed- and shown that an increased extinction rate could result from ing system transition in angiosperms. Selfing was also pro- two main evolutionary handicaps (reviewed in Muyle and posed to be an evolutionary dead-end, which has been Marais 2016). First, in dioecious species, only females contrib- supported by several macroevolutionary analyses (Goldberg ute to seed production and dispersal, which could result in et al. 2010; de Vos et al. 2014; Zenil-Ferguson et al. 2019). In individuals being less dispersed spatially and competing more parallel, various negative genetic consequences of selfing (i.e., for local resources compared with populations with her- low genetic diversity, accumulation of deleterious mutations) maphrodites (Heilbuth et al. 2001). This should cause dioe- have been reported in several species, even in those of recent cious species to have smaller geographic distributions and origin (see Glemin et al. 2019 for a review). Interestingly, neg- species abundance compared with nondioecious species. ative genetic consequences have been detected despite the Second, in dioecious species with insect pollination, male short-term advantage of selfing provided by reproductive as- intrasexual competition would result in males being more surance, allowing higher colonization potential and larger attractive than females for pollinators, making reproduction species range (Randle et al. 2009; Grossenbacher et al. 2015). and population size sensitive to variation in pollinator density If the seed-dispersal handicap of dioecy operates, it should (Vamosi and Otto 2002). This should lead to fluctuating pop- reduce the ability of dioecious to spread and increase ulation sizes in dioecious species and in turn affect levels of local competition, which should lead to smaller geographic genetic diversity. It has been proposed that these handicaps distributions and lower species abundance in dioecious com- could be the reason why dioecy is associated with a specific pared with nondioecious species. These expectations can be suite of traits in angiosperms (fleshy fruits, wind-pollination, tested using estimates of geographic distributions and species inconspicuous flowers, and woody growth) as well as ecolog- abundance in dioecious and nondioecious species. If the pol- ical preferences because dioecious species are preferentially linator handicap of dioecy exists, it should cause fluctuations found in the tropics and on islands (Vamosi et al. 2003). in dioecious population sizes. The effective population size Assuming that pollination mode evolves more slowly than (Ne) is given by the harmonic mean of the varying population dioecy, dioecy could be more frequent among wind- sizes over time (Wright 1938). The harmonic mean tends to pollinated species because wind-pollinated species are not be dominated by the smallest population sizes that the pop- subject to the pollinator handicap. Similarly, assuming fruit ulation goes through. Fluctuating population sizes are there- type evolves more slowly than dioecy, dioecy could be more fore expected to decrease the long-term Ne of a species. If the prevalent in species with fleshy fruits because seeds can be pollinator and the seed-dispersal handicap of dioecy exist, dispersed over longer distances by fruit-eating animals, thus they should cause dioecious species to have smaller Ne com- limiting the seed-dispersal handicap. pared with nondioecious species. This expectation can be The evolutionary dead-end hypothesis has recently been tested by studying genetic diversity to estimate the long- challenged (Kafer et al. 2017), in part because the sister clade term Ne of dioecious and nondioecious species. A reduced analysis used in early work assumed that speciation events effective population size should then lead to a decrease in the and trait transitions coincide (Heilbuth 2000). However, if efficacy of selection. This expectation can be tested by com- dioecy evolved after the split between the two diverging paring closely related dioecious and nondioecious species. A lineages and was not coincident with the split, then species study that used the ratio of nonsynonymous to synonymous richness is not expected to be equal between sister clades substitution rate (dN/dS) as a proxy to selection efficacy found even when diversification rates are similar (Kafer and Mousset that one out of two dioecious Silene clade had a reduced 2014). Using a corrected version of the sister clade analysis on efficacy of selection for some genes, but positive selection an updated data set, dioecious clades have been found to increased in other genes (Kafer et al. 2013). However, transi- have actually higher species richness than nondioecious sister tions to dioecy can happen after speciation, making dN/dS clades in angiosperms (Kafer et al. 2014). Using a phylogenetic biased because it is computed on entire branches. We there- method to study trait evolution and species diversification on fore decided to use instead a population genetics approach to several genera with dioecious species (Maddison et al. 2007), confront predictions of the dead-end hypothesis in Silene.We

806 Population Genomic Consequences of Separate Sexes in Plants . doi:10.1093/molbev/msaa229 MBE Downloaded from https://academic.oup.com/mbe/article/38/3/805/5905496 by Beurlingbiblioteket user on 17 June 2021

FIG.1.Sampled species and their phylogenetic relationships. The species breeding system is indicated by the corresponding set of greek symbols. Sample sizes and sex are indicated. Bootstrap support values were generated using 100 replicates. Pictures of Silene latifolia, S. dioica, S. nutans W, S. nutans E, and S. otites are from Maarten Strack van Schijndel. Pictures of S. marizii, S. vulgaris, S. paradoxa, and S. viscosa are from Oxelman et al. (2013). Picture of S. heuffelii is from Dr Daniel L. Nickrent.

relied on statistics that allow quantifying the strength of nat- they live in temperate regions. The Silene genusisthusade- ural selection, such as rates of adaptive and deleterious non- quate to test for the dead-end hypothesis of dioecy. synonymous substitutions. If dioecy is not a dead-end, as We sampled dioecious species from two groups in which suggested by macroevolutionary analyses, we expect no effect dioecy evolved independently (five species in the Melandrium (or even a positive effect) of dioecy on genetic diversity and section and two species in the Otites section) along with the efficacy of selection. Note that if extinction due to the several hermaphroditic and gynodioecious close relatives for handicaps of dioecy is too rapid to allow for populations to each group and one outgroup (Dianthus chinensis). Silene survive long enough to be sampled, our approach would not species can have very large genome sizes (Bernasconi et al. be able to detect the genetic footprints expected under the 2009) and no complete reference genome is available in the dead-end hypothesis. However, as mentioned earlier, similar genus; only a fourth of the S. latifolia genome has been as- approaches in selfing species have consistently detected the sembled to date (Papadopulos et al. 2015). We therefore used predicted effects, even in very recent species such as Capsella a population genomics approach relying on RNA-seq data, rubella that originated 20,000 years ago from the self- which has proven successful in organisms that lack a refer- incompatible C. grandiflora (Brandvain et al. 2013; Slotte ence genome (Cahais et al. 2012; Gayral et al. 2013; Romiguier et al. 2013). et al. 2014). Multiple individuals were sampled across the In this study, we focused on 13 species of the large genus species’ geographic ranges and sequenced by RNA-seq. A Silene (700 species) in the family, in which pipeline specifically designed for such data was applied to dioecy has evolved independently at least twice (Desfeux et al. perform SNP-calling (Tsagkogeorga et al. 2012; Gayral et al. 1996). The Silene genus is considered a model system to study 2013; Nabholz et al. 2014).TheSNPswerethenanalyzedto the evolution of sexual systems in angiosperms, thanks to its estimate genetic diversity and the efficacy of selection in each reasonably well-described sexual systems and phylogeny species. (Bernasconi et al. 2009). Dioecy in Silene is likely to have evolved through the gynodioecy pathway (Fruchard and Results Marais 2017). If dioecy is associated with the handicaps de- scribed above, dioecious Silene species should suffer from Phylogeny, Geographic Range, Species Abundance, them because they do not have any of the traits named and Population Genetics Characterization of the previously that likely compensate or prevent these handicaps. Sampled Species Indeed, Silene are short-lived herbs, many of them (but not Our species sampling (13 species in total, see fig. 1)includes all) are insect pollinated, they do not have fleshy fruits, and two hermaphroditic species (S. viscosa and S. paradoxa), three

807 Muyle et al. . doi:10.1093/molbev/msaa229 MBE gynodioecious species (S. vulgaris, S. nutans W, and S. nutans and Kyrgyzstan. We treated these lineages as two differenti- E), and seven dioecious species (section Melandrium: ated taxa and analyzed them separately hereafter. Unlike in S. heuffelii, S. dioica, S. marizii, S. diclinis,andS. latifolia and previous work (Slancarova et al. 2013), S. vulgaris did not section Otites: S. pseudotites and S. otites) from subgenera group with S. viscosa and the section Melandrium,butinstead Behenantha and Silene, the main subgenera in the Silene ge- could not be reliably grouped with either subgenus (fig. 1). nus (Jenkins and Keller 2011). The criteria for selecting the A total of 1,088,269 variable nucleotides were called (ta- nondioecious species are detailed in Materials and Methods. ble 1) using an RNA-seq pipeline (see Materials and Briefly, we selected 1) close relatives of both dioecious sec- Methods). Mean heterozygosity levels and Weir and tions based on existing phylogenies, 2) species for which the Cockerham FIT statistics were computed for each species. sexual system is well described, 3) species that had similar FIT indicates deviations from Hardy–Weinberg equilibrium

geographic ranges, and 4) hermaphroditic species not exhib- caused by processes such as selfing or population structure. Downloaded from https://academic.oup.com/mbe/article/38/3/805/5905496 by Beurlingbiblioteket user on 17 June 2021 iting high levels of selfing. The latter criterion was meant to FIT values were low to moderate (<0.2) for most sampled perform a fair comparison between dioecious and hermaph- species. This indicates that neither population structure nor roditic species, not confounded by strong differences in out- selfing is creating strong among-species differences in genetic crossing rates. diversity, as already observed (Moyle 2006). However, For each species, between 4 and 34 individuals were sam- S. viscosa E exhibited a high FIT, a low heterozygosity level, pled from different populations covering the geographic and low genetic diversity, which could be due to recent de- range of the species (see fig. 1 for the number of individuals mographic shifts or a high selfing rate (fig. 2). sampled in every species and supplementary fig. S5, Proxies for species abundance were calculated based on Supplementary Material online, for sample locations). the number of occurrences of each species retrieved from the Previous work has shown that reliable estimates of popula- Global Biodiversity Information Facility (GBIF) database. The tion genetics statistics can be obtained when individuals from geographic range of each species was extracted from the Atlas populations spanning the whole geographic distribution of a Florae Europaeae (see Materials and Methods, supplementary species are sampled, even if the number of sampled individ- fig. S6, Supplementary Material online, and fig. 3). Dioecious uals is small, that is, two or three (Cahais et al. 2012; Gayral and nondioecious species had similar geographic ranges et al. 2013; Romiguier et al. 2014). (622.6 vs. 784 Atlas dots on an average, Wilcoxon one-sided Seeds were collected in the wild and plants were grown in rank sum test P value ¼ 0.52) and GBIF species abundance controlled greenhouse conditions. Flower buds (and in a few (30,777 vs. 23,968.5 on an average, Wilcoxon one-sided rank cases, leaves) were sampled for 130 individuals in total and sum test P value ¼ 0.43). This is not consistent with the dead- sequenced by RNA-seq (see supplementary table S1, end hypothesis which predicts smaller geographic distribu- Supplementary Material online, for details on samples). De tions and population sizes for dioecious species. However, novo reference transcriptomes were assembled separately for some Silene dioecious species have very limited geographic each species and ORFs and orthologs were annotated (see ranges, making the study of their genetic diversity informative Materials and Methods and supplementary table S1, to test whether they are genetically endangered. Supplementary Material online). All dioecious Silene species sampled here have sex chromosomes. Merging autosomal Genetic Diversity Is Higher in Dioecious Silene Species, and sex-linked markers can be problematic in population Regardless of Census Population Size Differences genetics studies as they have different effective population In order to test whether dioecious and nondioecious Silene sizes and selection dynamics (Gautier 2014). Therefore, only species differ in their effective population size (Ne), synony- autosomal ORFs—as predicted by the SEX-DETector method mous genetic diversity (pS) was used as a proxy for Ne with (Muyle et al. 2016)—were kept for downstream analyses (see pS ¼ 4Nel (the mutation rate l was assumed constant Materials and Methods). among species). Ne is a direct measure of the long-term A phylogenetic tree was built in order to check for con- mean intensity of drift in populations. Under the evolutionary gruence with published phylogenies of the Silene genus dead-end hypothesis, dioecious species are expected to have (Rautenberg et al. 2010; Jenkins and Keller 2011). lower Ne and therefore lower values of pS compared with Orthologous ORFs were aligned among Silene species and nondioecious species (see Introduction). Table 1 and figure 2 D. chinensis, which were used to root the tree. A total of showed that dioecious species have higher pS values than 55,337 variable positions, spread over 1,294 contigs, were nondioecious species (Wilcoxon one-sided rank sum test used to build the tree (see Materials and Methods). All sam- W ¼ 35, P value ¼ 0.026), which does not support the ples from a given species grouped together as expected dead-end predictions. (fig. 1). The two lineages of S. nutans formed two clearly However,the13speciesstudiedherehaveverydifferent distinct groups that should be treated as distinct species as census population sizes (N) with some species being cosmo- shown in previous studies (Martin et al. 2016, 2017). Silene politan (e.g., S. latifolia, S. vulgaris), whereas others are rare or viscosa samples also formed two highly divergent lineages, even endemic (e.g., S. diclinis). Genetic diversity can be af- even more divergent than the two lineages of S. nutans:a fected by N, although the relationship can be complex such western lineage (S. viscosa W), which includes samples from that N and Ne are quite different (Kalinowski and Waples Sweden and the Czech Republic, and an eastern lineage 2002). In order to make a fair comparison of dioecious and (S. viscosa E), which includes samples from Bulgaria, Russia, nondioecious species, we accounted for the effect of N when

808 Population Genomic Consequences of Separate Sexes in Plants . doi:10.1093/molbev/msaa229 MBE S p / N p Downloaded from https://academic.oup.com/mbe/article/38/3/805/5905496 by Beurlingbiblioteket user on 17 June 2021 in Focal Species (%) N p in Focal Species (%) S p IT F 0.065] 0.934 [0.886; 0.982] 0.179 [0.170; 0.187] 0.191 [0.182; 0.200] 0.035] 2.698 [2.628; 2.771] 0.342 [0.331; 0.353] 0.127 [0.123; 0.131] 2 2 0.113; 0.051; 0.005; 0.030] 1.637 [1.573; 1.703]0.007; 0.035] 0.246 [0.235; 0.256] 1.081 [1.016; 0.150 1.150] [0.144; 0.156] 0.180 [0.167; 0.194] 0.167 [0.154; 0.180] 2 2 2 2 0.089 [ 0.043 [ 2 2 of SNPs Weir and Cockerham Number Number of Clean Contigs FIG.2.Levels of individual heterozygosity (proportion of heterozygote positions), Weir and Cockerham FIT statistic, and pS within each spe- cies. Dots show the mean value and vertical lines mark the 95% con- 3 2,729 23,774 4 3,999 98,957 48 2,339 893 24,698 11,364 0.024 [0.006; 0.041] 0.014 [ 1.116 [1.074; 1.157] 0.195 [0.185; 0.204] 0.174 [0.166; 0.183] 4 2,878 44,868 0.013 [ fidence interval (obtained by 1,000 bootstraps on contigs). 17 3,730 143,297 0.181 [0.176; 0.186] 2.073 [2.013; 2.135] 0.242 [0.234; 0.251] 0.117 [0.113; 0.121] 3414 4,132 6,248 214,64612 281,300 0.165 [0.161; 0.169] 0.205 [0.200; 0.209] 5,955 2.005 [1.950; 2.062] 2.685 [2.639; 2.732] 52,675 0.244 [0.236; 0.252] 0.351 [0.343; 0.360] 0.116 0.122 [0.108; [0.118; 0.124] 0.126] 0.131 [0.128; 0.134] 0.589 [0.568; 0.611] 0.102 [0.098; 0.105] 0.172 [0.166; 0.179] Number of Samples looking at the effect of sexual systems on genetic diversity. Two different proxies for N were used: 1) the species abun- dance in the GBIF data set, and 2) the geographic range of EW 3 2 6,413 5,959 13,712 24,305 0.723 [0.691; 0.754] 0.068 [0.038; 0.098] 0.291 [0.277; 0.305] 0.723 [0.693; 0.757] 0.056 [0.053; 0.059] 0.134 [0.127; 0.142] 0.194 [0.184; 0.204] 0.186 [0.178; 0.193] EW 11 11 5,100 5,059 65,820 88,853 0.124 [0.106; 0.140] 0.201 [0.182; 0.219] 0.861 [0.829; 0.895] 1.048 [1.011; 1.084] 0.130 [0.125; 0.136] 0.170 [0.165; 0.176] 0.151 [0.146; 0.157] 0.163 [0.157; 0.169] each species according to the Atlas Florae Europaeae. We used a linear model to jointly assess the effects of N and sexual systems on pS. GBIF data and Atlas geographic Silene diclinis Silene dioica Silene heuffelli Silene latifolia S .vulgaris Silene viscosa Silene viscosa Silene pseudotites Silene otites Silene nutans Silene nutans Silene paradoxa Silene marizii range (estimates of N) were significantly positively associated with pS (P values ¼ 0.030 and 0.012, t values ¼ 2.529 and 3.054, R2 ¼ 0.390 and 0.483, respectively, and see fig. 3). This shows that genetic diversity is strongly influenced by N in our Population Genetics Parameters of the Studied Species. Behenantha Otites Silene sample set. This also suggests that, in the case of the Silene genus, GBIF abundance and Atlas geographic ranges are Table 1. Sexual SystemSection Dioecious Focal Species Gynodioecious Hermaphrodite Section Dioecious Gynodioecious Hermaphrodite sensible proxies for N. After controlling for the effect of N on

809 Muyle et al. . doi:10.1093/molbev/msaa229 MBE Downloaded from https://academic.oup.com/mbe/article/38/3/805/5905496 by Beurlingbiblioteket user on 17 June 2021

FIG.3.Relationship between synonymous polymorphism (pS) and (A) census population size estimated from the Global Biodiversity Information Facility (GBIF) and (B) geographic range estimated from the Atlas Florae Europaeae. For each species, dots represent the mean pS and vertical lines mark the 95% confidence interval (obtained by 1,000 bootstraps on contigs). Census population sizes estimated from GBIF refer to the number of occurrences, and geographic distributions from the Atlas Florae Europaeae refer to the number of dots (native occurrences) counted on the Atlas maps (see Materials and Methods). genetic diversity, dioecious species had a significantly higher Selection Is More Efficient in Dioecious Silene Species, pS than nondioecious species (linear model pairwise t-test P Regardless of Differences in Ne values ¼ 0.142 and 0.048, t values ¼ 1.595 and 2.247 using We then tested whether selection efficacy differs in dioecious GBIF data and Atlas geographic range estimates, respectively). and nondioecious Silene species. Selection should be more Adding FIT to the linear model correlating genetic diversity efficient in species with larger Ne.Theratioofnonsynony- and N did not significantly improve our linear model mous to synonymous substitution rate, dN/dS,couldnotbe (ANOVA P ¼ 0.102 and P ¼ 0.097 for GBIF and Atlas geo- used here because the studied species are closely related and graphic range estimates, respectively, see Materials and some shifts in sexual systems might have occurred too re- Methods for details). This suggests that the increase in genetic cently to allow for fixed differences to evolve (Desfeux et al. diversity observed under dioecy is not attributable to inbreed- 1996; Bernasconi et al. 2009). Rather, the ratio of nonsynon- ing evasion. ymous over synonymous polymorphism, pN/pS,wasused, These effects remained after correcting for phylogenetic because it is a more appropriate statistic to study the efficacy relationships among species using generalized least square of selection on a short evolutionary timescale (Mugal et al. models in the R package ape (Paradis and Schliep 2019): N 2014). One expects increased pN/pS values when purifying was significantly positively correlated to pS (P values ¼ 0.012 selection is reduced and hence the dead-end hypothesis and 0.03, t values ¼ 3.05 and 2.53 for Atlas and GBIF, respec- would predict elevated pN/pS in dioecious species. tively) and dioecious species had a significantly higher pS However, we found no statistical association between sexual compared with nondioecious species when using Atlas geo- systems and pN/pS (Wilcoxon rank sum test P ¼ 0.295, ta- graphic range estimates (P value ¼ 0.048, t value ¼ 2.25), but ble 1), perhaps reflecting the opposing effects that purifying not when using GBIF estimates (P value ¼ 0.14, t value ¼ 1.6). and positive selection have on pN/pS, or perhaps reflecting In this analysis, different sets of genes were used in different the fact that breeding systems do not have an effect on pN/pS. species. As this might affect the results, the analysis was re- We thus turned to a more sophisticated approach to peated on two smaller sets of genes, one shared by every quantify selection efficacy and used a modified version of species in the subgenus Behenantha and the second shared the McDonald–Kreitman framework, Implemented in the by every species in the subgenus Silene. Similar results were method polyDFE (Tataru et al. 2017; Tataru and Bataillon found using these shared gene subsets, that is, GBIF data and 2020). This method estimates the distribution of fitness Atlas geographic range have a significant positive effect on pS effects (DFE) of both deleterious and beneficial mutations (P values ¼ 0.021 and 0.012, t values ¼ 2.725 and 3.074, using polymorphism data only, thus avoiding the problem R2 ¼ 0.429 and 0.488, respectively). And, dioecious species of shifts in population sizes. Silene species underwent a post- had higher pS compared with other species after correcting glacial age population expansion in Europe and for this rea- for N (linear model pairwise t-test P values ¼ 0.127 and 0.045, son, it would be biased to use methods that also rely on t values ¼ 1.662 and 2.283 using GBIF data and Atlas geo- divergence (Keightley and Eyre-Walker 2007; Galtier 2016). graphic range, respectively, see supplementary fig. S1, Indeed, other methods to estimate the DFE use both poly- Supplementary Material online). morphism and divergence data and usually return estimates

810 Population Genomic Consequences of Separate Sexes in Plants . doi:10.1093/molbev/msaa229 MBE

FIG.4.Relationship between synonymous polymorphisms (pS) and (A) the proportion of adaptive amino acid substitutions, a,(B) the rate of Downloaded from https://academic.oup.com/mbe/article/38/3/805/5905496 by Beurlingbiblioteket user on 17 June 2021 adaptive nonsynonymous substitutions (xA), and (C) the rate of nonadaptive nonsynonymous substitutions (xNA). Dots show the mean value, horizontal lines mark the 95% pS confidence interval (obtained by 1,000 bootstraps on contigs), and vertical lines show the y axis 95% confidence interval (obtained on 500 bootstraps on the SFS SNPs).

of proportion of adaptive substitution that are assuming that negatively correlated to xNA (P value ¼ 0.0015 and t population size remained constant since divergence with the value ¼4.49). outgroup. Silene viscosa E was not included because its diver- Estimates of a, xA,andxNA were then compared between sity was too low due to selfing and the polyDFE model did not dioecious and nondioecious species, after controlling for the apply. PolyDFE was used to estimate the DFE (supplementary effect of Ne (using pS as a proxy). Nondioecious species had a fig. S2, Supplementary Material online) for each species with a significantly lower a and xA and a significantly higher xNA gamma distribution for deleterious mutations and exponen- compared with dioecious species (linear model pairwise t-test tial distribution for beneficial mutations (see estimated pa- P values ¼ 0.015, 0.00031, and 0.019, t values ¼2.99, 5.65, rameter ranges in supplementary fig. S3, Supplementary and 2.87 for a, xA,andxNA, respectively). These effects Material online). From this, the predicted proportion of adap- remained significant after correcting for phylogenetic rela- tive substitutions (a), the predicted rate of adaptive substi- tionships among species using gls models in the R package tutions (xA), and the predicted rate of deleterious ape (P values ¼ 0.032, 0.0007, and 0.058, t values ¼2.53, substitutions (xNA) were estimated. Under the dead-end hy- 5.03, and 2.17 for a, xA,andxNA, respectively). Dioecious pothesis, dioecious species are expected to have a higher xNA, species thus experience fewer fixations of slightly deleterious alowera,andalowerxA because of a faster accumulation of mutations due to genetic drift and more adaptive evolution, deleterious mutations and a lower rate of adaptation due to which is opposite to what we expect under the dead-end genetic drift, when compared with nondioecious species. hypothesis. The analyses were repeated using orthologous Because Ne is known to influence the efficacy of selection, genes shared within the subgenus Behenantha or within we correct for differences in Ne (using pS as a proxy) between the subgenus Silene as different sets of genes in different species in order to make a fair comparison of dioecious and species could affect the results. The conclusions remained nondioecious species. unchanged; differences were still in the same direction al- A linear model was used to jointly assess the effects of Ne though statistical power was reduced in some cases (see sup- (using pS asaproxy)andofthesexualsystemona. pS was plementary fig. S4, Supplementary Material online, legend for significantly positively correlated to a (P value ¼ 0.003, t val- details). ue ¼ 4.07, fig. 4). The same result has previously been ob- served in animal species (Galtier 2016)wherethe Discussion correlation between a and pS was solely due to the link be- In the set of Silene species studied here, dioecy is associated tween pS and xNA. Because a ¼ xA/(xAþxNA), when selec- with higher genetic diversity, more efficient purifying selec- tion is more efficient (higher Ne and higher pS),therateof tion, and increased adaptive evolution. These results contra- deleterious substitutions xNA is reduced, which in turn dict the dead-end hypothesis and are consistent with increases a. However, in the present data set, the correlation macroevolutionary analysis that rejected the dead-end from between a and pS was both due to higher adaptive evolu- different lines of evidence. High levels of genetic diversity in tionary rates and lower deleterious evolutionary rates when dioecious species had previously been observed in S. dioica effective population size is high. Indeed, pS was significantly (Giles and Goudet 1997), S. chlorantha (Lauterbach et al. positively linked to xA (linear model P value ¼ 0.011, t val- 2011), S. otites (Lauterbach et al. 2012), Ficus (Nazareno ue ¼ 3.19) and significantly negatively correlated to xNA (lin- et al. 2013), Antennaria dioica (Rosche et al. 2018), and a ear model P value ¼ 0.00017, t value ¼6.15, see fig. 4). comparison of 12 monoecious and 11 dioecious plants These effects remained after correcting for phylogenetic rela- (Nazareno et al. 2013). Some of these studies have shown tionships among species using gls models in the R package that the sex ratio strongly influences genetic diversity through ape (Paradis and Schliep 2019): pS was significantly positively its influence on the outcrossing rate. We do not have access correlated to a and xA (P values ¼ 0.026 and 0.0054, t val- to sex ratios of the species sampled here but their study ues ¼ 4.11 and 3.64 for a and xA, respectively) and pS was would be insightful to better understand patterns of diversity

811 Muyle et al. . doi:10.1093/molbev/msaa229 MBE in Silene species in the future. Our results clearly show that A fourth limitation of our study is that, if the extinction dioecious Silene species are not at risk of extinction given their dynamics of dioecious species is extremely fast, there might high genetic diversity. Observed rates of adaptive nonsynon- not be enough time for deleterious genetic footprints to ac- ymous substitution patterns suggest that dioecious species cumulate in dioecious species before they go extinct. The undergo more adaptation than nondioecious species, how- microevolutionary approach taken here can only detect ever, the underlying causes of this pattern remain to be handicaps of dioecy that do not lead to immediate extinction. determined. Such a study, as others testing evolutionary dead-end hypoth- The Silene genus is a good model system to test for the eses, suffers from a “survivor bias” as only nonextinct species dead-end hypothesis because dioecious Silene species do not can be analyzed. However, some of the dioecious species have any of the ecological traits thought to possibly compen- studied here have small geographic distributions and one is

sate for the handicaps of dioecy (see Introduction). In other even endemic. These species should be strongly affected by Downloaded from https://academic.oup.com/mbe/article/38/3/805/5905496 by Beurlingbiblioteket user on 17 June 2021 genera, in which dioecy is ancient, it is likely that enough time the handicaps of dioecy, if there are any. But even these spe- has passed for species to evolve compensating mechanisms cies exhibited high levels of diversity and selection efficacy. to counteract the handicaps of dioecy, if these truly exist. This In the Silene genus, all hermaphrodites are self-compatible is less likely to be an issue in the Silene genus where dioecy has and therefore able to self-fertilize. Outcrossing promotes ge- evolved fairly recently: 11 Ma old in the Melandrium section netic diversity by increasing effective population size and min- (Krasovec et al. 2018) and even more recently in section Otites imizing the effects of linkage (Nordborg 2000; Glemin et al. (Slancarova et al. 2013). 2019). Dioecy could be beneficial in Silene by preventing the Our approach, however, also has its limitations. First, our negative effects of selfing that occur in hermaphroditic and set of species represents only two independent events of di- gynodioecious species (Charlesworth and Charlesworth oecy evolution and a small sample of the whole Silene genus 1978). One selfing hermaphrodite subspecies (S. viscosa E) (700 species). However, we did not find a significant phy- could not be included in the analysis of selection efficacy logenetic signal on genetic diversity in our species set (Pagel’s because its genetic diversity was too low. However, two her- k ¼ 0.848, P ¼ 0.133), suggesting that each species can be maphrodite species that are not heavy selfers (fig. 2)were considered as independent even though some species share included in our comparison of dioecious and nondioecious the same transition event to dioecy. Nonetheless, other gen- species. However, the fact that significant differences in terms era should be studied in the future to assess whether our of selection efficacy between dioecious and nondioecious spe- findings can be generalized to other angiosperm species, es- cies remain, even without this selfing species, highlights that pecially to groups in which dioecy evolved through different the advantages of dioecy that we observe do not result from routes (see Introduction). comparing selfing and outcrossing species. What is more, the A second limitation is that sampling was limited by the increased genetic diversity observed in dioecious species number of Silene species for which the sexual system is well could not be explained by differences in inbreeding levels in described and for which the geographic distribution is similar our data set. among species, especially for hermaphrodites (see Materials However, the main difference between dioecious species and Methods for detailed criteria used to select species). and hermaphrodites is the ability to “occasionally” self- None of the Silene hermaphrodite species has geographic fertilize, which can strongly affect the demography of a spe- distributions as wide as the widest dioecious species range. cies. Indeed, in hermaphroditic self-compatible species, a sin- However, dioecious species that have geographic ranges com- gle individual can start a new population (Schoen and Brown parable with those of the hermaphroditic species we sampled 1991). This reproductive assurance provides an ecological ad- still have higher diversity and selection efficacy compared vantage to hermaphroditic self-compatible species over dioe- with hermaphrodites (figs. 3 and 4), which is conservatively cious species. However, the other side of the coin is that in disagreement with the dead-end hypothesis. For 2 species hermaphroditic self-compatible species can experience recur- out of 13, namely S. otites (dioecious) and S. paradoxa (non- rent bottlenecks, even if the mean selfing rate is close to zero. dioecious), we were not able to sample individuals across the This reduces genetic diversity and Ne and makes selection less whole geographic range of the species and their genetic di- efficient. On the other hand, dioecious populations require versity might be underestimated compared with others. both males and females to be present in sufficient density in a A third limitation comes from the comparison of dioecious location in order to reproduce. Therefore, dioecious popula- and nondioecious species which implies comparing species tion sizes cannot decline below a given threshold at risk of with sex chromosomes to species with only autosomes. Sex- declining to extinction, a phenomenon called the Allee effect linked genes, which are located on the nonrecombining re- (Courchamp et al. 2008). A counter-intuitive prediction is gion of sex chromosomes, are known to suffer from lower Ne, thus that, “conditioning on survival,” the Allee effect pro- which could bias comparisons of dioecious to nondioecious motes genetic diversity in dioecious populations, especially species. Because sex-linked genes in Silene represent <10% of during colonization (Roques et al. 2012; Romiguier et al. genes (discussed in Muyle et al. 2012), we chose to subset our 2014). Silene species are known to have colonized Europe study to autosomal genes only. Our results therefore overlook northward after the last ice age and have experienced the potential negative effects of sex chromosomes on selec- population expansion. The Allee effect could have played a tion efficacy in dioecious species. strong role during this expansion with self-compatible

812 Population Genomic Consequences of Separate Sexes in Plants . doi:10.1093/molbev/msaa229 MBE hermaphrodites colonizing new habitats fast with as few as a species was done using different criteria. First, close relatives single individual and undergoing strong bottlenecks and di- of the two dioecious sections were selected based on available versity loss. Dioecious species that survived, on the other phylogenies (Desfeux et al. 1996; Jenkins and Keller 2011; hand, would have expended slowly, maintaining sufficiently Slancarova et al. 2013). We focused on those for which the dense populations throughout the expansion, resulting in the sexual system is well described. The Silene genusisalarge maintenance of genetic diversity. In established dioecious spe- genus comprising 700 species but only 60 have a well- cies, the Allee effect could provide a long-term safeguard described sexual system (dioecious species included, Ju¨rgens against genetic depletion. et al. 2002). Highly selfing species were avoided by focusing on Our results suggest that established dioecious Silene spe- hermaphroditic species that are perennial, insect pollinated cies are genetically healthy. Our work revisits the fascinating (or at least not autogamous/cleistogamous), and with a high

question of why dioecy is rare in angiosperms, since the dead- pollen/ovule ratio (Ju¨rgens et al. 2002). These criteria left us Downloaded from https://academic.oup.com/mbe/article/38/3/805/5905496 by Beurlingbiblioteket user on 17 June 2021 end hypothesis does not seem to apply, at least to our sample with only a handful of hermaphroditic species, of which of species. However, our data set might suffer from a “survivor S. viscosa and S. paradoxa were selected for seed availability. bias” and we cannot exclude that potential handicaps of di- Seeds were sampled across the geographic distribution of oecy could lead to very early extinction during the emergence each species (supplementary fig. S5, Supplementary Material of new dioecious populations from hermaphrodite lineages. online) and were sown and grown in controlled greenhouse At a macroevolutionary scale, this rapid dead-end would lead conditions. In case plants did not flower, young leaves were to an observed low rate of transition toward dioecy. However, sampled instead (details on individual names, sex, origin, and current estimates do not show low transition rates to dioecy sampled tissue can be found in supplementary table S1, in angiosperms (Goldberg et al. 2017), which goes against the Supplementary Material online). Total RNA was extracted possibility of an early dead-end of dioecy. through the Spectrum Plant Total RNA kit (Sigma, Inc.) fol- An alternative hypothesis to explain the rarity of dioecy is lowing the manufacturer’s protocol and treated with DNAse. that rates of transition to dioecy could be low in angiosperms Libraries were prepared with the TruSeq RNA sample and/or dioecy could be more evolutionary labile than previ- Preparation v2 kit (Illumina Inc.). Each 2-nM cDNA library ously thought and easily lost and reversed to nondioecious was sequenced using a paired-end protocol on an Illumina states depending on environmental conditions (Goldberg HiSeq2000 sequencer, producing 100-bp reads. et al. 2017; Kafer et al. 2017). The combination of a low tran- Demultiplexing was performed using CASAVA 1.8.1 sition rate to dioecy and a high reversibility to nondioecy (Illumina) to produce paired-sequence files containing reads together would lead to a low probability to observe dioecious for each sample in Illumina FASTQ format. RNA extraction angiosperms in nature, as they would only be transiently and sequencing were done by the sequencing platform in the existing. This would make dioecy rare without the need for AGAP laboratory, Montpellier, France (http://umr-agap.cirad. it to be an evolutionary dead-end. Larger and fully resolved fr/). The Melandrium and Dianthus samples were sequenced angiosperm phylogenies with annotated breeding systems as described in Zemp et al. (2016, p. 2016) at the Quantitative will be required to precisely estimate more precisely transition Genomics Facility (QGF, ETH Zu¨rich, Switzerland) and by and reversion rates to and from dioecy. GATC Biotech AG (Konstanz, Germany).

Materials and Methods Reference Transcriptome Assemblies Reference transcriptomes were assembled de novo for all Sampling and Sequencing species (except for S. marizii, S. heuffelli, S. diclinis,and We investigated patterns of polymorphism and divergence of S. dioica). As assemblies were carried in different labs, they three sexual systems (hermaphroditism, gynodioecy, and di- differ slightly for the two Silene taxa. For the subgenus oecy) in two subgenera of the Silene genus, subgenus Behenantha, reads from all individuals were pooled for each Behenantha and subgenus Silene (sample sizes and sexes are species. Steps in the assembly are represented in supplemen- indicated in fig. 1). For subgenus Behenantha, we sampled five tary figure S3, Supplementary Material online. About 100% dioecious species (S. latifolia, S. diclinis, S. dioica, S. marizii,and identical reads, assumed to be PCR duplicates, were filtered S. heuffelli), one gynodioecious species (S. vulgaris), and one using the ConDeTri trimming software (Smeds and Ku¨nstner hermaphroditic species (S. viscosa). For the subgenus Silene, 2011). Illumina reads were then filtered for sequencing adapt- we sampled two dioecious species (S. otites and ers and low read quality using the ea-utils FASTQ processing S. pseudotites), two gynodioecious subspecies (S. nutans utilities (Aronesty 2013). Filtered reads were assembled using W—Western lineage, and S. nutans E—Eastern lineage, as trinity with default settings (Haas et al. 2013). Poly-A tails defined in Martin et al. [2016]), and one hermaphroditic spe- were removed using PRINSEQ (Schmieder and Edwards cies (S. paradoxa). Two samples of D. chinensis were added to 2011) with parameters -trim_tail_left 5 -trim_tail_right 5. root the phylogeny. Some of the individuals were already used rRNA-like sequences were removed using riboPicker version in previous studies. 0.4.3 (Schmieder et al. 2012) with parameters -i 90 -c 50 -l 50 The two known dioecious sections in the genus Silene were and the following databases: SILVA Large subunit reference sampled: section Melandrium (all five species of the section database, SILVA Small subunit reference database, the were selected) and section Otites (2 closely related dioecious GreenGenes database and the Rfam database. Obtained tran- species were selected). The selection of the nondioecious scripts were further assembled, inside of trinity components,

813 Muyle et al. . doi:10.1093/molbev/msaa229 MBE using the program Cap3 (Huang and Madan 1999)withpa- using the GTRGAMMA model in RaxML (Stamatakis 2006). rameter -p 90 and custom perl scripts. Bootstrap support values were generated using 100 Raw data of the five taxa of subgenus Silene were filtered replications. for sequencing adapters using Cutadapt (Martin 2011), low Phylogenetic signal and its significance were computed read quality, and poly-A tails with PRINSEQ (Schmieder and using the R function phylosig in the phytools R package Edwards 2011). For each focal species of the subgenus, refer- (Revell 2012). ence transcriptomes were assembled de novo by combining filtered reads from two individuals from the same population Data Filtering (the two individuals with the highest number of reads) using Analyses were carried out on autosomal genes only, in order default settings of Trinity (Haas et al. 2013) and an additional to avoid potential confounding effects of sex chromosomes

Cap3 (Huang and Madan 1999) run. Finally, in both subge- and sexual systems on estimates of selection efficiency. Downloaded from https://academic.oup.com/mbe/article/38/3/805/5905496 by Beurlingbiblioteket user on 17 June 2021 nera, coding sequences were predicted using TransDecoder Autosomal genes were predicted for each dioecious species (Haas et al. 2013)withPFAMdomainsearchesasORFreten- separately by SEX-DETector (Muyle et al. 2016)thankstothe tion criteria (see website: http://transdecoder.github.io, last study of a cross (parents and progeny) sequenced by RNA- accessed October 13, 2020). Assembly statistics are presented seq (Martin et al. 2019;A.Muyle,N.Zemp,A.Widmer,G.A.B in supplementary table S2, Supplementary Material online. Marais, in preparation). About 11,951 contigs were inferred as autosomal in S. latifolia, 7,695 for S. diclinis, 11,675 for Mapping, Genotyping, and Alignment S. heuffelli,8,997forS. marizii, 10,698 for S. dioica, 2,035 in Reads from each species were mapped onto their respective S. otites,and6,234inS. pseudotites. reference transcriptome using BWA (Li and Durbin 2009) The data were further filtered in order to remove ORFs version 0.7.12 with parameter -n 5 for the subgenus with frameshifts. Only biallelic sites were kept for analyses Behenantha and Bowtie2 (Langmead and Salzberg 2012)for (sites with two or less alleles). Only ORFs longer than 100 the subgenus Silene. Mapping statistics can be found in sup- codons and with a maximum of 20% of missing individuals of plementary table S3, Supplementary Material online. For the the focal species in the alignment were retained, after remov- Melandrium section (S. latifolia, S. marizii, S. heuffelli, S. diclinis, ing incomplete codons and gaps. We obtained final data sets and S. dioica), all reads were mapped to the S. latifolia refer- ranging from 893 retained ORFs for S. otites up to 6,413 ence transcriptome. SAM files were compressed and sorted retained ORFs for S. viscosa E(seetable 1). Finally, to verify using SAMtools (Li et al. 2009). Individuals were genotyped that observed trends are not due to differences in studied using reads2snp_2.0 (Tsagkogeorga et al. 2012; Gayral et al. ORFs among species, we performed the same analyses on a 2013) with parameters -par 1 -min 3 (ten for the subgenus smaller set of orthologous ORFs present in all species within a Silene) -aeb -bqt 20 -rqt 10 that is to say: stringent filtering of subgenus, so that parameters were estimated on the same set paralogous positions, minimum coverage of three (or ten) of ORFs. For these tests, 188 orthologous ORFs were analyzed reads for calling a genotype, accounting for allelic expression for the subgenus Silene, and 298 for the subgenus Behenantha. bias, minimum read mapping quality threshold of 10 and minimum base quality threshold of 20. The output obtained was two unphased sequences for each individual and each Polymorphism, Heterozygosity, and FIT ORF. Population statistics were computed on the filtered data set Orthologous ORFs were identified using OrthoMCL (Li foreachORFandaveragedusingthePopPhyltool et al. 2003) Basic Protocol 2 with default parameters. Best dNdSpiNpiS_1.0 (Gayral et al. 2013)withparameter reciprocal BLAST hits were considered orthologs among spe- kappa ¼ 2. The following statistics were computed for each cies and orthologous ORFs were aligned for all individuals contig: the mean fixation index FIT (Weir and Cockerham using MACSE (Ranwez et al. 2011). Species from the subgenus 1984), per-individual heterozygosity (proportion of heterozy- Behenantha were aligned separately with S. paradoxa,consid- gous positions), per-site synonymous polymorphism (pS), ered as an outgroup for this section and species from the per-site nonsynonymous polymorphism (pN), and the ratio subgenus Silene were aligned with S. viscosa considered as an of nonsynonymous to synonymous polymorphism (pN/pS). outgroup for this section. Note that, we could not compute FIS or FST as we do not have the required two individuals per population. Confidence Phylogenetic Reconstruction and Phylogenetic Signal intervals were obtained by 1,000 bootstraps on contigs. Orthologous ORFs among species were obtained using best reciprocal BLAST hits among the 14 species (all species from Estimation of Geographic Distribution and Population the Silene genus and D. chinensis). For each orthologous ORF, Size the nine species were added one by one into the alignment Geographic distributions of the Silene species were estimated using MACSE (Ranwez et al. 2011).Thedatawerefilteredso by counting the black dots (native occurrence) on the Atlas that a maximum of 20% missing data per site and per species maps by Jalas and Suominen (1986). This was partly achieved was allowed, for a total of 55,337 variable nucleotides across using the Dot Count program (http://reuter.mit.edu/soft- 1,294 ORFs. We used the maximum number of ORFs available ware/dotcount, last accessed October 13, 2020). As no map and did not restrict our analysis to autosomal ORFs identified was available for S. pseudotites,dotswerecountedforItaly in dioecious species. The phylogenetic tree was reconstructed and adjacent regions of France and the Balkans using the

814 Population Genomic Consequences of Separate Sexes in Plants . doi:10.1093/molbev/msaa229 MBE

S. vulgaris map (the Atlas indicates these are the regions were used to estimate a (the proportion of adaptive amino where this plant is abundant). acid substitutions) in each species. We estimated a according Census population sizes of the Silene species were esti- to a modified version of the Eyre-Walker and Keightley mated using raw species occurrence numbers from GBIF method (Keightley and Eyre-Walker 2007), which relies on (http://www.gbif.org January 17, 2017 Download of S. dioica: polymorphism data only to estimate a (Tataru et al. 2017). http://doi.org/10.15468/dl.fim8f2, S. marizii: http://doi.org/10. polyDFE (https://github.com/paula-tataru/polyDFE, last 15468/dl.yywbrk, S. diclinis: http://doi.org/10.15468/dl.vizuiv accessed October 13, 2020) was used to estimate the distri- and December 1, 2016 Download of S. pseudotites: http:// bution of fitness effects of mutations (DFEM), a,the(per doi.org/10.15468/dl.yrvenq, S. otites: http://doi.org/10.15468/ synonymous substitution) rate of adaptive nonsynonymous dl.fxoocf, S. paradoxa: http://doi.org/10.15468/dl.wizsb6, substitution xA¼a dN/dS,andtherateofnonadaptivenon-

S. nutans: http://doi.org/10.15468/dl.poqxc1, S. viscosa: synonymous substitutions xNA ¼ (1a) dN/dS.Foreachspe- Downloaded from https://academic.oup.com/mbe/article/38/3/805/5905496 by Beurlingbiblioteket user on 17 June 2021 http://doi.org/10.15468/dl.bwdu9z, S. vulgaris: http://doi.org/ cies separately, four different models were fitted to the SFS: 10.15468/dl.vmvebr, S. latifolia: http://doi.org/10.15468/dl. • mfvyds, all previous links were last accessed October 13, Model 1: full DFEM with adaptive mutations (p_b>0), 2020). Some countries contribute poorly to this database, estimation of an orientation error rate for ancestral alleles however, we assumed that this affected all Silene species in in the SFS (eps_an), estimation of demography (r_i¼6 1) the same way. The GBIF database was filtered for showing for a total of six parameters plus the number of demog- occurrence based on human observation and literature. Only raphy parameters r_i which depends on the sample size data from European countries and Russia, which represent of each species. • the native range of those species, were kept. For each species, Model 2: same as Model 1 but without ancestral allele ori- all existing names of the species were included. entation error rate (eps_an ¼ 0), for a total of five param- Both estimates (geographic range and population size) eters plus the number of demography parameters r_i. • were divided by two when two subspecies were identified Model 3: same as Model 1 but without adaptive muta- with our phylogeny (i.e., for S. viscosa and S. nutans). Atlas tions (p_b ¼ 0), for a total of five parameters plus the geographic ranges and GBIF species abundance were com- number of demography parameters r_i. • pared among dioecious and nondioecious species using a Model 4: same as Model 1 but without adaptive muta- one-sided Wilcoxon rank sum test in R (RCoreTeam tions (p_b ¼ 0) and without ancestral allele orientation 2016). Geographic range and GBIFdatawerecorrelatedto error rate (eps_an ¼ 0), for a total of four parameters plus the number of demography parameters r_i. pS and sexual systems or FIT using R linear model function lm as follows: Each model provided estimates for a, xA,andxNA.These lmðpS Atlas þ sexual systemÞ values were averaged using Akaike weights as follows: P 4 a e1=2DAICm lmðpS GBIF þ sexual systemÞ Pm¼1 m aaverage ¼ 4 ; 1=2DAICk k¼1 e lmðpS Atlas þ FITÞ where m stands for model number m, am stands for the estimated a using model m,and lmðpS GBIF þ FITÞ: DAICm ¼ AICm AICmin; The significancy of the effect of sexual system and FIT were tested thanks to a comparison to a nested model using the with AIC the Akaike criterion: ANOVA function in R. The nested models are: AICm ¼ 2nm 2logðÞ Lm ; lmðpS AtlasÞ where nm stands for the number of parameters and Lm for the likelihood of model m. Model averaged values for xA and lmðpS GBIFÞ: xNA were obtained using a similar procedure. For each species, 500 bootstrapped data sets were ran- Thesamemodelswereusedandcorrectedforphyloge- domly generated by random sampling SNPs with replace- netic relationships among species using gls models with ment from the original SFS. Average values among models Martins and Hansen’s correlation structure (corMartins) in were computed as explained previously for each bootstrap. the R package ape (Paradis and Schliep 2019). The distribution of values for the 500 bootstraps was used to compute approximate 95% confidence intervals for a, xA, Selection Efficacy and xNA. For this, 2.5% of values were removed from each The unfolded synonymous and nonsynonymous site fre- side of the bootstrap distribution (Efron and Tibshirani 1993). quency spectra (SFS, the distribution of derived allele counts a, xA,andxNA values were analyzed in a linear model using R across SNPs) were computed for each species with the sfs lm function (R Core Team 2016). The model tested the effect program provided by Nicolas Galtier (Galtier 2016). These SFS of pS and sexual system (dioecious, gynodioecious,

815 Muyle et al. . doi:10.1093/molbev/msaa229 MBE hermaphroditic) as fixed effect, whereas taking into account References the variance of the estimations with weights computed as Aronesty E. 2013. Comparison of Sequencing Utility Programs. Open follow: Bioinforma J. 7:1–8. Available from: https://expressionanalysis. github.io/ea-utils/. Accessed October 13, 2020. lmða pS þ sexual system; weightsÞ Barrett SCH. 2002. The evolution of plant sexual diversity. Nat Rev Genet. 3(4):274–284. Bernasconi G, Antonovics J, Biere A, Charlesworth D, Delph LF, Filatov D, lmðxA pS þ sexual system; weightsÞ Giraud T, Hood ME, Marais GAB, McCauley D, et al. 2009. Silene as a model system in ecology and evolution. Heredity 103(1):5–14. Brandvain Y, Slotte T, Hazzouri KM, Wright SI, Coop G. 2013. Genomic lmðxNA pS þ sexual system; weightsÞ: identification of founding haplotypes reveals the history of the self- ing species Capsella rubella. PLoS Genet. 9:e1003754.

The same models (without weights) were used and cor- Cahais V, Gayral P, Tsagkogeorga G, Melo-Ferreira J, Ballenghien M, Downloaded from https://academic.oup.com/mbe/article/38/3/805/5905496 by Beurlingbiblioteket user on 17 June 2021 rected for phylogenetic relationships among species using gls Weinert L, Chiari Y, Belkhir K, Ranwez V, Galtier N. 2012. Reference- models with Martins and Hansen’s correlation structure free transcriptome assembly in non-model animals from next- (corMartins) in the R package ape (Paradis and Schliep 2019). generation sequencing data. Mol Ecol Resour. 12(5):834–845. Charlesworth B, Charlesworth D. 1978. A model for the evolution of dioecy and gynodioecy. Am Nat. 112(988):975–997. Courchamp F, Berec L, Gascoigne J. 2008. Allee effects in ecology and Supplementary Material conservation.NewYork:OxfordUniversityPress. Supplementary data areavailableatMolecular Biology and de Vos JM, Hughes CE, Schneeweiss GM, Moore BR, Conti E. 2014. Evolution online. Heterostyly accelerates diversification via reduced extinction in primroses. Proc R Soc B. 281(1784):20140075. Desfeux C, Maurice S, Henry JP, Lejeune B, Gouyon PH. 1996. Evolution of reproductive systems in the genus Silene. Proc Biol Sci. Acknowledgments 263(1369):409–414. We would like to thank Jos Kafer for his help with fieldwork Duan T, Deng X, Chen S, Luo Z, Zhao Z, Tu T, Khang NS, and a number of colleagues for sending us seeds: J.J. Wieringa, Razafimandimbison SG, Zhang D. 2018. Evolution of sexual systems and growth habit in Mussaenda (Rubiaceae): insights into the evo- Deborah Charlesworth, Bohuslav Janousek, Tatiana Giraud, lutionary pathways of dioecy. Mol Phylogenet Evol. 123:113–122. Honor C. Prentice, Salvatore Cozzolino, Delphine Griveau, Efron B, Tibshirani RJ. 1993. An introduction to the bootstrap. 1st ed. Henk Schat, Cristina Gonnelli, Alessia Guggisberger, Sophie New York: Chapman and Hall/CRC. Karrenberg, Adrien Favre, Maria Domenica Moccia, Aria Fruchard C, Marais G. 2017. The evolution of sex determination in plants. Minder, Carmen Rothenbuehler, Massimiliano Gnesotto, In:NunodelaRosaL,Mu¨ller G, editors. Evolutionary developmental biology: a reference guide. Springer. p. 1–14. and Kerstin Kustas. A.M., R.T., and G.A.B.M. thank Elise Galtier N. 2016. Adaptive protein evolution in animals and the effective Lacroix technician of the University of Lyon 1 greenhouse population size hypothesis. PLoS Genet. 12(1):e1005774. and H.M., S.Ga., and P.T. thank Eric Schmitt for taking care Gautier M. 2014. Using genotyping data to assign markers to their chro- of the Silene plants. We are grateful to Sylvain Santoni and the mosome type and to infer the sex of individuals: a Bayesian model- AGAP platform staff for the RNA extraction and RNA-seq. based classifier. Mol Ecol Resour. 14(6):1141–1159. Gayral P, Melo-Ferreira J, Glemin S, Bierne N, Carneiro M, Nabholz B, The numerical results presented in this article were carried Lourenco JM, Alves PC, Ballenghien M, Faivre N, et al. 2013. out using the computing cluster of Laboratoire Biometrie et Reference-free population genomics from next-generation transcrip- Biologie-Poˆle Rhone-Alpin de Bioinformatique (LBBE-PRABI) tome data and the vertebrate-invertebrate gap. PLoS Genet. at University of Lyon 1 and the HPC service of the Centre de 9(4):e1003457. Ressources Informatiques (CRI) of University of Lille 1, for Giles B, Goudet G. 1997. Genetic differentiation in Silene dioica meta- populations: estimation of spatiotemporal effects in a successional which, we thank Stephane Delmotte and Bruno Spataro. plant species. Am Nat. 149(3):507–526. We thank Nicolas Galtier for help with his program sfs. Glemin S, Franc¸ois CM, Galtier N. 2019. Genome evolution in outcross- H.M., S.Ga., and P.T. are grateful to Jonathan Aceituno for ing vs. selfing vs. asexual species. Methods Mol Biol. 1910:331–369. technical support in R and to Celine Poux for discussion on Goldberg EE, Kohn JR, Lande R, Robertson KA, Smith SA, Igic B. 2010. phylogenetic inference. We are thankful to Brandon Gaut for Species selection maintains self-incompatibility. Science 330(6003):493–495. commenting on the article. This work was supported by the Goldberg EE, Otto SP, Vamosi JC, Mayrose I, Sabath N, Ming R, Ashman Agence Nationale de la Recherche (ANR-11-BSV7-013-03, T-L. 2017. Macroevolutionary synthesis of sexual TRANS) to S.Gl., G.A.B.M., and P.T., the Swiss National systems. Evolution 71(4):898–912. Science Foundation (31003A-182675) to A.W., a PhD fellow- Grossenbacher D, Briscoe Runquist R, Goldberg EE, Brandvain Y. 2015. ship from the French Research Ministry to H.M and two Geographic range size is predicted by plant mating system. Ecol Lett. 18(7):706–713. Postdoctoral fellowships to A.M (EMBO ALTF 775-2017 Haas BJ, Papanicolaou A, Yassour M, Grabherr M, Blood PD, Bowden J, and HFSPO LT000496/2018-L). Couger MB, Eccles D, Li B, Lieber M, et al. 2013. De novo transcript sequence reconstruction from RNA-seq using the Trinity platform for reference generation and analysis. Nat Protoc. 8(8):1494–1512. Data availability Heilbuth JC. 2000. Lower species richness in dioecious clades. Am Nat. 156(3):221–241. All the RNA-seq reads generated for this study can be down- Heilbuth JC, Ilves KL, Otto SP. 2001. The consequences of dioecy for seed loaded from the European Nucleotide Archive Accession dispersal: modeling the seed-shadow handicap. Evolution PRJEB39526. 55(5):880–888.

816 Population Genomic Consequences of Separate Sexes in Plants . doi:10.1093/molbev/msaa229 MBE

Huang X, Madan A. 1999. CAP3: a DNA sequence assembly program. Mugal CF, Wolf JBW, Kaj I. 2014. Why time matters: codon evolution Genome Res. 9(9):868–877. and the temporal dynamics of dN/dS. MolBiolEvol. 31(1):212–231. Jalas J, Suominen J, editors. 1986. Distribution of vascular plants in Muyle A, K€afer J, Zemp N, Mousset S, Picard F, Marais GA. 2016. SEX- Europe. 7. Caryophyllaceae (Silenoideae). In: Atlas Florae DETector: a probabilistic approach to study sex chromosomes in Europaeae. The Committee for Mapping the Flora of Europe. non-model organisms. Genome Biol Evol. 8(8):2530–2543. Jenkins C, Keller SR. 2011. A phylogenetic comparative study of pread- MuyleA,MaraisG.2016.Genomeevolutionandmatingsystemsin aptation for invasiveness in the genus Silene (Caryophyllaceae). Biol plants. In: Kliman RM, editor. Encyclopedia of evolutionary biology. Invasions. 13(6):1471–1486. Vol. 2. Oxford: Academic Press. p. 480–492. Ju¨rgens A, Witt T, Gottsberger G. 2002. Pollen grain numbers, ovule MuyleA,ZempN,DeschampsC,MoussetS,WidmerA,MaraisGAB. numbers and pollen-ovule ratios in Caryophylloideae: correlation 2012. Rapid de novo evolution of X chromosome dosage compen- with breeding system, pollination, life form, style number, and sexual sation in Silene latifolia, a plant with young sex chromosomes. PLoS system. Sex Plant Reprod. 14(5):279–289. Biol. 10(4):e1001308. KaferJ,deBoerHJ,MoussetS,KoolA,DufayM,MaraisGAB.2014. Nabholz B, Sarah G, Sabot F, Ruiz M, Adam H, Nidelet S, Ghesquie`re A, Downloaded from https://academic.oup.com/mbe/article/38/3/805/5905496 by Beurlingbiblioteket user on 17 June 2021 Dioecy is associated with higher diversification rates in flowering Santoni S, David J, Glemin S. 2014. Transcriptome population geno- plants. JEvolBiol. 27(7):1478–1490. mics reveals severe bottleneck and domestication cost in the African Kafer J, Marais GAB, Pannell JR. 2017. On the rarity of dioecy in flowering rice (Oryza glaberrima). Mol Ecol. 23(9):2210–2227. plants. Mol Ecol. 26(5):1225–1241. Nazareno AG, Alzate-Marin AL, Pereira RA. 2013. Dioecy, more than Kafer J, Mousset S. 2014. Standard sister clade comparison fails when monoecy, affects plant spatial genetic structure: the case study of testing derived character states. Syst Biol. 63(4):601–609. Ficus. Ecol Evol. 3:n/a–3508. Kafer J, TalianovaM,BigotT,MichuE,Gueguen L, Widmer A, zluvov˚ aJ, Nordborg M. 2000. Linkage disequilibrium, gene trees and selfing: an Glemin S, Marais GAB. 2013. Patterns of molecular evolution in ancestral recombination graph with partial self-fertilization. Genetics dioecious and non-dioecious Silene. JEvolBiol. 26(2):335–346. 154(2):923–929. Kalinowski ST, Waples RS. 2002. Relationship of effective to census size in Oxelman B, Rautenberg A, Thollesson M, Larsson A, Frajman B, Eggens F, fluctuating populations. Conserv Biol. 16(1):129–136. Petri A, Aydin Z, To¨pel M, Brandtberg-Falkman A. 2013. Sileneae Keightley PD, Eyre-Walker A. 2007. Joint inference of the distribution of and systematics. Available from: http://www.sileneae. fitness effects of deleterious mutations and population demography info/. Accessed October 13, 2020. based on nucleotide polymorphism frequencies. Genetics Papadopulos AST, Chester M, Ridout K, Filatov DA. 2015. Rapid Y de- 177(4):2251–2261. generation and dosage compensation in plant sex chromosomes. Krasovec M, Chester M, Ridout K, Filatov DA. 2018. The mutation rate Proc Natl Acad Sci U S A. 112(42):13021–13026. and the age of the sex chromosomes in Silene latifolia. Curr Biol. Paradis E, Schliep K. 2019. ape 5.0: an environment for modern phyloge- 28(11):1832–1838.e4. netics and evolutionary analyses in R. Bioinformatics 35(3):526–528. Langmead B, Salzberg SL. 2012. Fast gapped-read alignment with Bowtie R Core Team. 2016. R: a language and environment for statistical com- 2. Nat Methods. 9(4):357–359. puting. Vienna (Austria): R Foundation for Statistical Computing Lauterbach D, Ristow M, Gemeinholzer B. 2011. Genetic population [Internet]. structure, fitness variation and the importance of population history Randle AM, Slyder JB, Kalisz S. 2009. Can differences in autonomous in remnant populations of the endangered plant Silene chlorantha selfing ability explain differences in range size among sister-taxa pairs (Willd.) Ehrh. (Caryophyllaceae). Plant Biol. 13(4):667–777. of Collinsia (Plantaginaceae)? An extension of Baker’s Law. New Lauterbach D, Ristow M, Gemeinholzer B. 2012. Population genetics Phytol. 183(3):618–629. and fitness in fragmented populations of the dioecious and Ranwez V, Harispe S, Delsuc F, Douzery EJP. 2011. MACSE: Multiple endangered Silene otites (Caryophyllaceae). Plant Syst Evol. Alignment of Coding SEquences accounting for frameshifts and 298(1):155–164. stop codons. PLoS One 6(9):e22594. Li H, Durbin R. 2009. Fast and accurate short read alignment with Rautenberg A, Hathaway L, Oxelman B, Prentice HC. 2010. Geographic Burrows-Wheeler transform. Bioinformatics 25(14):1754–1760. and phylogenetic patterns in Silene section Melandrium LiH,HandsakerB,WysokerA,FennellT,RuanJ,HomerN,MarthG, (Caryophyllaceae) as inferred from chloroplast and nuclear DNA Abecasis G, Durbin R, 1000 Genome Project Data Processing sequences. Mol Phylogenet Evol. 57(3):978–991. Subgroup. 2009. The Sequence Alignment/Map format and Renner SS. 2014. The relative and absolute frequencies of angiosperm SAMtools. Bioinformatics 25(16):2078–2079. sexual systems: dioecy, monoecy, gynodioecy, and an updated online Li L, Stoeckert CJ, Roos DS. 2003. OrthoMCL: identification of ortholog database. Am J Bot. 101(10):1588–1596. groups for eukaryotic genomes. Genome Res. 13(9):2178–2189. Revell LJ. 2012. phytools: an R package for phylogenetic com- Maddison WP, Midford PE, Otto SP. 2007. Estimating a binary character’s parative biology (and other things). Methods Ecol Evol. effect on speciation and extinction. Syst Biol. 56(5):701–710. 3(2):217–223. Martin H, Carpentier F, Gallina S, GodeC,SchmittE,MuyleA,Marais Romiguier J, Gayral P, Ballenghien M, Bernard A, Cahais V, Chenuil A, GA, Touzet P. 2019. Evolution of young sex chromosomes in two Chiari Y, Dernat R, Duret L, Faivre N, et al. 2014. Comparative pop- dioecious sister plant species with distinct sex determination sys- ulation genomics in animals uncovers the determinants of genetic tems. Genome Biol Evol. 11(2):350–361. diversity. Nature 515(7526):261–263. Martin H, Touzet P, Dufay M, Gode C, Schmitt E, Lahiani E, Delph L, Van Roques L, Garnier J, Hamel F, Klein EK. 2012. Allee effect promotes Rossum F. 2017. Lineages of Silene nutans developed rapid, strong, diversity in traveling waves of colonization. Proc Natl Acad Sci U S asymmetric postzygotic reproductive isolation in allopatry. Evolution A. 109(23):8828–8833. 71(6):1519–1531. Rosche C, Schrieber K, Lachmuth S, Durka W, Hirsch H, Wagner V, Martin H, Touzet P, Van Rossum F, Delalande D, Arnaud J-F. 2016. Schleuning M, Hensen I. 2018. Sex ratio rather than population Phylogeographic pattern of range expansion provides evidence for size affects genetic diversity in Antennaria dioica. Plant Biol J. cryptic species lineages in Silene nutans in Western Europe. Heredity 20(4):789–796. 116(3):286–294. Sabath N, Goldberg EE, Glick L, Einhorn M, Ashman T-L, Ming R, Otto SP, Martin M. 2011. Cutadapt removes adapter sequences from high- Vamosi JC, Mayrose I. 2016. Dioecy does not consistently accelerate throughput sequencing reads. EMBnet J. 17(1):10–12. or slow lineage diversification across multiple genera of angiosperms. Moyle LC. 2006. Correlates of genetic differentiation and isolation New Phytol. 209(3):1290–1300. by distance in 17 congeneric Silene species. Mol Ecol. Schmieder R, Edwards R. 2011. Quality control and preprocessing of 15(4):1067–1081. metagenomic datasets. Bioinformatics 27(6):863–864.

817 Muyle et al. . doi:10.1093/molbev/msaa229 MBE

Schmieder R, Lim YW, Edwards R. 2012. Identification and removal of Tataru P, Mollion M, Glemin S, Bataillon T. 2017. Inference of distribu- ribosomal RNA sequences from metatranscriptomes. Bioinformatics tion of fitness effects and proportion of adaptive substitutions from 28(3):433–435. polymorphism data. Genetics 207(3):1103–1119. Schoen DJ, Brown AH. 1991. Intraspecific variation in population Tsagkogeorga G, Cahais V, Galtier N. 2012. The population genomics of a gene diversity and effective population size correlates with fast evolver: high levels of diversity, functional constraint, and mo- the mating system in plants. Proc Natl Acad Sci U S A. lecular adaptation in the tunicate Ciona intestinalis. Genome Biol 88(10):4494–4497. Evol. 4(8):740–749. Slancarova V, Zdanska J, Janousek B, Talianova M, Zschach C, Zluvova J, Vamosi JC, Otto SP. 2002. When looks can kill: the evolution of sexually Siroky J, Kovacova V, Blavet H, Danihelka J, et al. 2013. Evolution of dimorphic floral display and the extinction of dioecious plants. Proc sex determination systems with heterogametic males and females in RSocLondB. 269(1496):1187–1194. Silene. Evolution 67(12):3669–3677. Vamosi JC, Otto SP, Barrett SCH. 2003. Phylogenetic analysis of the Slotte T, Hazzouri KM, A˚grenJA,KoenigD,MaumusF,GuoY-L,SteigeK, ecological correlates of dioecy in angiosperms. JEvolBiol. Platts AE, Escobar JS, Newman LK, et al. 2013. The Capsella rubella 16(5):1006–1018. Downloaded from https://academic.oup.com/mbe/article/38/3/805/5905496 by Beurlingbiblioteket user on 17 June 2021 genome and the genomic consequences of rapid mating system Weir BS, Cockerham CC. 1984. Estimating F-statistics for the analysis of evolution. Nat Genet. 45(7):831–835. population structure. Evolution 38(6):1358–1370. Smeds L, Ku¨nstner A. 2011. ConDeTri – a content dependent read Wright S. 1938. Size of a population and breeding structure in relation to trimmer for Illumina data. PLoS One 6(10):e26314. evolution. Science 87:430–431. Stamatakis A. 2006. RAxML-VI-HPC: maximum likelihood-based phylo- Zemp N, Tavares R, Muyle A, Charlesworth D, Marais GAB, Widmer A. genetic analyses with thousands of taxa and mixed models. 2016. Evolution of sex-biased gene expression in a dioecious plant. Bioinformatics 22(21):2688–2690. Nat Plants. 2:16168. Tataru P, Bataillon T. 2020. polyDFE: inferring the distribution of fitness Zenil-Ferguson R, Burleigh JG, Freyman WA, Igic B, Mayrose I, Goldberg effects and properties of beneficial mutations from polymorphism EE. 2019. Interaction among ploidy, breeding system and lineage data. Methods Mol Biol. 2090:125–146. diversification. New Phytol. 224(3):1252–1265.

818