
EVOLUTION AND GENETICS OF PLANT SEX CHROMOSOMES

by

Josh Hough

A thesis submitted in conformity with the requirements for the degree of Doctor of Philosophy Department of Ecology and Evolutionary Biology University of Toronto

© Copyright by Josh Hough 2016

EVOLUTION AND GENETICS OF PLANT SEX CHROMOSOMES

Josh Hough

Doctor of Philosophy

Department of Ecology and Evolutionary Biology

University of Toronto

2016

ABSTRACT

Sex chromosomes have evolved multiple times independently in different lineages, and the parallel changes that have occurred during their formation suggest that there are general processes driving their evolution. In this thesis I used a variety of approaches, including genetic crossing experiments, mathematical modeling, and statistical analysis of DNA sequence data, to study the evolution and genetics of sex chromosomes, focusing on the recently evolved X and Y chromosomes in the plant Rumex hastatulus.

I developed methods to identify sex-linked genes using segregation analysis and transcriptome sequencing, and found that both ancestral and neo-Y chromosomes in R. hastatulus have started to genetically degenerate, causing ∼28% and ∼8% hemizygosity of ancestral and derived X chromosomes, respectively. Genes remaining on Y chromosomes also accumulated more amino acid replacements, contained more unpreferred changes in codon use, and exhibited significantly reduced gene expression compared with their X-linked alleles. This genetic degeneration is consistent with theoretical predictions of reduced Y-linked selection efficacy caused by suppressed recombination. My results indicate that the magnitude of genetic degeneration depends on the time since X-Y recombination became suppressed. I also found that diversity on the Y was 40-fold lower than on the X, and nearly 50-fold lower than on autosomes, indicating that selective interference has played a significant role in reducing nucleotide polymorphism during the early stages of X-Y divergence.

I also developed a theoretical model to investigate the interactions between sex chromosomes, haploid-phase selection, and selection on the sex ratio. The model’s results indicate that biased sex ratios can be evolutionarily stable when there is a trade-off between Fisherian selection on the sex ratio and selection for purging deleterious mutations in the haploid phase. This finding provides a novel evolutionary explanation for biased sex ratios in dioecious plants, where haploid-phase selection is widespread.

In conclusion, my analyses indicate that the evolution of sex chromosomes from autosomes can result in significant changes to effective population size, recombination rate, and patterns of gene expression, all of which have important implications for DNA sequence evolution, including the effectiveness of natural selection, rates of molecular evolution, and patterns of genetic diversity.

ACKNOWLEDGMENTS

I would first like to thank my supervisors, Spencer Barrett and Stephen Wright, whose mentorship, encouragement, and supportive engagement with my research made my time as a graduate student both intellectually stimulating and personally enjoyable. I am grateful to Spencer for sharing his extensive knowledge of evolutionary biology, for helping me to become a better writer, and for the incredible wealth of advice, both scientific and professional, that he has shared with me over the years. I thank Stephen also for his support and mentorship, and for the countless arguments and debates that we have had. I will look back on these fondly, and I have learnt a lot from you. My experience working with both of you has been fantastic, and I cannot thank you enough.

I also thank the members of my supervisory committee, Aneil Agrawal and Asher Cutter, for providing critical advice, criticism, and valuable discussion throughout the development of my thesis. Their expertise in evolutionary genetics, thoughtful suggestions, and encouragement have been inspiring and have helped me to improve the work presented here. I am also grateful to Locke Rowe and Judith Mank for serving on my examination committee and for providing valuable comments on my research.

I thank all of my colleagues and peers in the Department of Ecology and Evolutionary Biology at the University of Toronto, especially Arvid Ågren, Nathaniel Sharp, Alison Wardlaw, Alethea Wang, Jesse Hollister, Lucia Kwan, Emily Josephs, Robert Williamson, and the many members of the Agrawal, Barrett, Cutter, Stinchcombe, and Wright labs. You have all made my experience at U of T memorable, and I am grateful to have had the pleasure of learning with you, and also from you. I also extend my thanks to Deborah Charlesworth, who provided me with a wealth of research advice, answered many of my questions, and helped to clarify much of my thinking about the evolution and genetics of sex chromosomes.

Finally, I thank Professor Mark O. Johnston (Dalhousie University), who taught my first undergraduate course in evolutionary biology, allowed me to tinker in his laboratory and read his books during my undergraduate years, and was a critical inspiration leading to my ambition to conduct research in the field of evolutionary genetics.

TABLE OF CONTENTS

ABSTRACT
ACKNOWLEDGEMENTS
TABLE OF CONTENTS
LIST OF TABLES
LIST OF FIGURES
LIST OF APPENDICES

CHAPTER 1: INTRODUCTION

Summary
Origin of sex chromosomes
Effects of recombination suppression on Y-chromosome evolution
Rates of molecular evolution and patterns of genetic diversity
Gene movement between X chromosomes and autosomes
Sex chromosomes and sexual dimorphism
Young plant sex chromosomes: the early stages of X-Y divergence
Rumex hastatulus: a model system for sex chromosome evolution
Thesis outline

CHAPTER 2: PATTERNS OF SELECTION IN PLANT GENOMES

Summary
Introduction
Genome-wide selection in model systems
Positive and negative selection in plants
Effective population size
Population structure, selection on standing variation, and polygenic adaptation
Recombination rate
Mating system
Ploidy
Conclusions and future directions
Acknowledgments

CHAPTER 3: GENETIC DEGENERATION OF OLD AND YOUNG Y CHROMOSOMES IN THE DIOECIOUS PLANT RUMEX HASTATULUS

Summary
Introduction
Results and Discussion
Phylogenetic relationships and evolutionary divergence of sex-linked genes
Y chromosome gene loss and loss of expression
Molecular evolutionary tests for deleterious mutations and codon usage bias
Conclusions
Methods
RNA sequencing
Assembly of R. hastatulus transcriptomes
SNP segregation analysis and ascertaining sex-linkage
Comparisons of sex-linked gene expression
Consensus contigs for molecular evolutionary analysis
ORF identification, sequence alignment, and phylogenetic reconstruction
Analysis of evolutionary rates
Accession numbers
Acknowledgements

CHAPTER 4: REDUCED NUCLEOTIDE DIVERSITY ON THE Y CHROMOSOME FOLLOWING SUPPRESSION OF RECOMBINATION IN RUMEX HASTATULUS

Summary
Introduction
Methods
Population samples and sex-linked genes
Autosomal genes
Phasing X and Y alleles
Estimating polymorphism on sex chromosomes and autosomes
Neutral predictions and testing the effect of a sex ratio bias
Results and Discussion
Y-chromosome diversity is very low in the XY system
X- and Y-chromosome diversity in the XY1Y2 system
Conclusions
Acknowledgements

CHAPTER 5: CHROMOSOMAL DISTRIBUTION OF CYTO-NUCLEAR GENES IN A DIOECIOUS PLANT WITH SEX CHROMOSOMES

Summary
Introduction
Methods
Gene identification and functional annotation
Statistical analyses
Results and discussion
Acknowledgements

CHAPTER 6: EVOLUTIONARILY STABLE SEX RATIOS AND MUTATION LOAD

Summary
Introduction
The models
Model 1: Early-acting sex-ratio modifier
Model 2: Seed production with late-acting sex-ratio modifier
Model 3: Seed production with modifier of gametophytic selection
Incorporating deleterious mutations into the modifier model of gametophytic selection
Discussion
Mutation load and haploid selection in plants
Implications for understanding observed patterns of sex-ratio bias
Acknowledgements

CHAPTER 7: SEXUAL DIMORPHISM IN FLOWERING PLANTS

Summary
Introduction
Traits distinguishing the sexes in dioecious populations
Vegetative traits
Reproductive traits
Ecology of sexual dimorphism
Evolution and genetics of sexual dimorphism
Quantitative genetic models
Differences between plants and animals
Sex chromosomes and sexual dimorphism
Future studies
Acknowledgements

CHAPTER 8: CONCLUDING DISCUSSION

General summary
Summary of chapters
Future directions

BIBLIOGRAPHY

LIST OF TABLES

Table 3.1. Numbers of identified sex-linked genes in Rumex hastatulus

Table A3.1.1. Number of sex-linked genes with Y homologues as a function of the minimum number of SNPs required to identify sex-linked genes

Table A3.1.2. Polymorphism screen in XX/XY1Y2 system

Table A3.1.3. Polymorphism screen in XX/XY system

Table A3.1.4. Identification of autosomal genes and filtering of sex-linked genes with autosomal SNPs

Table A3.1.5. Identification of hemizygous genes as a function of the minimum number of SNPs used in cutoff

Table A3.1.6. Results of statistical tests of differences between male and female expression for different gene sets

Table A3.1.7. Chromosome-specific PAML estimates of the per-site synonymous substitution rate (Ks) and the ratio of nonsynonymous to synonymous substitutions (ω) in sex-linked genes

Table A4.1.1. Population identities and location information for Rumex hastatulus samples from Texas (XY) and North Carolina (XY1Y2)

Table 5.1. Numbers of sex-linked and autosomal genes used in analysis

Table 6.1. Notation used in the models

LIST OF FIGURES

Figure 2.1. Estimates of the percent of the genome under selective constraint and the percent of selectively constrained bases that are noncoding in model species

Figure 2.2. Estimate of the proportion of effectively neutral sites (Nes < 1) in various plant taxa calculated using the methods of Eyre-Walker and Keightley (2009) plotted against estimates of their effective population size (Ne)

Figure 3.1. Synonymous site divergence in sex-linked genes of the XY1Y2 system of Rumex hastatulus

Figure 3.2. Y/X gene expression of old and young sex-linked genes in Rumex hastatulus

Figure 3.3. Average normalized gene expression in male vs. female progeny from the XY1Y2 system

Figure 3.4. Synonymous and nonsynonymous substitutions in X and Y genes, and changes in codon usage

Figure A3.1.1. Histogram of the simulated distribution of the LRTd statistic from 10,000 simulated parent/progeny SNP segregation patterns for either true X/Y variants or a segregating X variant on the male X chromosome

Figure A3.1.2. Plot of occurrence of Type I (blue) and Type II (red) error from 10,000 simulations of true X/Y or segregating X variants for values of the LRTd statistic between -20 and 20

Figure A3.1.3. The Y/X expression ratio distribution in males for: 1) sex-linked genes from the XY1Y2 system shared with the XY system, 2) the full set from the XY system, and 3) unique to the XY1Y2 system, compared to the expression ratio for alternate to reference alleles at heterozygous sites in autosomes

Figure A3.1.4. The number of parsimony-estimated lineage-specific substitutions (synonymous or nonsynonymous) on the X and Y sequences from the XY and XY1Y2 systems

Figure A3.1.5. Changes in codon usage for X and Y genes

Figure A3.1.6. DeSeq normalization coefficients (‘scaling factors’) from all genes compared with normalization using just autosomal genes for the XY1Y2 system progeny data

Figure A3.1.7. Distribution of average autosomal gene expression in males divided by average expression in females for the XY1Y2 system progeny data

Figure 4.1. The relationship between the relative effective population size and sex ratio bias for genes on autosomes, X chromosomes, and Y chromosomes

Figure 4.2. The predicted relationship between sex ratio bias and sex-chromosome-to-autosome ratios of polymorphism

Figure 4.3. Evolutionary relationships of sex chromosome races in Rumex hastatulus, inferred using the Neighbor-Joining method

Figure A4.1.1. Relative X:A and Y:A nucleotide diversity for XY and XY1Y2 chromosome races

Figure A4.1.2. Autosomal nucleotide diversity for the TX, SC, and FL clades

Figure A4.1.3. The relationship between variance in male reproductive success (fitness) and the sex-linked effective population size relative to autosomes

Figure 5.1. Representation of the chromosomal location of cyto-nuclear genes in Rumex hastatulus

Figure A5.1.1. Power to detect a significant difference in the proportion of cytoplasmic genes between autosomes and sex chromosomes as a function of the true proportion on sex chromosomes

Figure 6.1. Modifier evolution at different stages of the plant life cycle

Figure 6.2. The evolutionarily stable sex ratio in the face of conflicting selection pressures, with Fisherian sex-ratio selection favoring no gametophytic selection and purging of deleterious mutations favoring the expansion of the gametophytic phase

Figure 7.1. Sexual dimorphism in Leucadendron (Proteaceae)

Figure 7.2. The relation between sexual dimorphism in ramification (branching) and the age of the oldest cone, an index of the degree of serotiny in Leucadendron

Figure 7.3. Sexual dimorphism in flower size and daily display size in Sagittaria latifolia

Figure 7.4. Variation among populations of Sagittaria latifolia in the degree of sexual dimorphism

Figure 7.5. A hypothetical scenario in which females and males have different optima for the same trait, causing sex-biased selection

LIST OF APPENDICES

Appendix 3.1. Supporting information for Chapter 3
Identification of sex-linked genes with Y-linked homologues
Population screen
Identification of autosomal genes
Identification of putative hemizygous genes
Final filtered gene sets used in molecular evolution and expression analyses
Generation of consensus X/Y sequences
ORF identification, sequence alignment, and phylogenetic reconstruction
Analysis of evolutionary rates

Appendix 4.1. Supporting information for Chapter 4

Appendix 5.1. Supporting information for Chapter 5
Power analysis to detect biased distribution of cyto-nuclear genes

Appendix 6.1. Derivation of recursion equations for sex ratio modifier models

Appendix 6.2. Derivation of recursion equations for gametophytic selection modifier model

CHAPTER 1

INTRODUCTION

SUMMARY

Sex chromosomes have evolved multiple times independently and play a prominent role in many evolutionary processes, including speciation, adaptation, and genetic conflict. The parallel changes that have occurred during sex chromosome evolution in different lineages suggest that there may be general processes involved in their evolution, and understanding the mechanism of this change remains a key challenge. The evolution of sex chromosomes from autosomes may result in changes to effective population size, mutation rates, and the dominance characteristics of new mutations, all of which can have important implications for patterns of DNA sequence evolution. These include differences between sex chromosomes and autosomes in the effectiveness of natural selection on beneficial and deleterious mutations, patterns of genetic diversity, and rates of inter-chromosomal gene movement. In addition, sex chromosomes are predicted to influence the factors driving evolutionary change in males versus females, potentially resulting in sexual dimorphism and changes to the sex ratio. In this chapter, I provide a brief synopsis of X and Y chromosome evolution, outline a theoretical context for the molecular evolution of these chromosomes, and highlight some key unanswered questions. I end by providing a motivation for studying sex chromosomes in dioecious plant species, where recent sex chromosome formation permits a unique glimpse of the mechanisms driving their evolution. I also highlight the dioecious plant Rumex hastatulus as an emerging model system for studying the evolution and genetics of plant sex chromosomes.

ORIGIN OF SEX CHROMOSOMES

Sex chromosomes are thought to have evolved from a pair of homologous autosomes and are one of the commonly understood mechanisms of genetic sex determination (Bull 1983). A prominent feature of sex chromosome evolution appears to be the transition from homologous recombining proto-sex chromosomes to morphologically and genetically distinct chromosomes that have ceased recombination. This transition to sex chromosome heteromorphism has occurred in a diverse set of taxa, including mammals (Lahn and Page 1999; Matsubara et al. 2006; Wilson Sayres et al. 2014), birds (Handley et al. 2004; Pigozzi 2011; Zhou et al. 2014), amphibians (Evans et al. 2012), as well as in plants (Ming et al. 2011; Charlesworth 2015). In some cases, recently formed sex chromosomes have not fully differentiated and continue to recombine over part of their length (Charlesworth 2012). Such systems provide an interesting opportunity to investigate the early stages of sex chromosome divergence, and the unique evolutionary dynamics of partially sex-linked regions (referred to as pseudo-autosomal regions or PARs; Otto et al. 2011; Jordan and Charlesworth 2012; Qiu et al. 2013).

Sex chromosomes may occur in both male-heterogametic (XY) and female-heterogametic (ZW) systems, and both systems have evolved in a diverse range of species. For the remainder of this chapter, I shall consider only male-heterogametic systems, though many of the considerations apply to female heterogamety (but see Van Doorn and Kirkpatrick 2010; Ellegren 2011; Mank 2012 for consideration of the differences between XY and ZW systems).

The origin of morphologically distinct X and Y chromosomes is closely associated with the emergence of genetically determined separate sexes (dioecy), and it is therefore important, before considering patterns of evolution on these chromosomes, to discuss the potential evolutionary paths by which dioecy might have been established. In the standard model (Charlesworth and Charlesworth 1978), separate sexes are suggested to have evolved from an ancestral state in which both male and female functions occur in one individual (cosexuality or hermaphroditism).
This is a common sexual system in the angiosperms (Barrett 2002), but also occurs in several animal species, including invertebrates and fish (Bull 1983). The simplest transition to separate sexes with proto-sex chromosomes from this state involves the fixation of sterility mutations at two linked loci: one affecting male function, and the other affecting female function. The successive fixation of these mutations is predicted to result in a transition from hermaphroditism to dioecy through an intermediate stage where females co-occur with hermaphrodites, a sexual system referred to as gynodioecy, or, less frequently, an intermediate stage where males co-occur with hermaphrodites, referred to as androdioecy (Lewis 1942; Westergaard 1958; Charlesworth and Charlesworth 1978; Charlesworth 1991; Charlesworth and Guttman 1999; Ainsworth 2000).

The selective pressures driving the evolution of separate sexes have been discussed extensively in recent years (e.g., see Spigler and Ashman 2011), and alternative pathways to the model above are now well understood. These include pathways involving gradual changes to sex-specific investment, disruptive selection on quantitative genetic variation for gender allocation, and the spread of cytoplasmic sterility mutations (Charlesworth and Charlesworth 1978; Spigler and Ashman 2011). Although the standard two-step model is likely an oversimplification, its generic features are consistent with evidence from sexual systems in flowering plants. For example, there is considerable evidence that the evolution of dioecy in many plant groups involved an intermediate gynodioecious stage (e.g., Barrett 1992; Dorken and Barrett 2004), as well as the emergence of male-heterogametic (XY) sex chromosome systems (Ming et al. 2011), both of which are explained well by this model.

Despite the independent origins of X and Y chromosomes, and the multiple pathways by which they may have evolved, a common feature of their evolution appears to be the gradual suppression of recombination on the proto-Y chromosome (Charlesworth 1996; Charlesworth and Charlesworth 2000). This is thought to arise as a result of selection for linkage between sexually antagonistic mutations (that is, mutations that benefit one sex but are deleterious in the other) and the sex-determining region. Such linkage is expected to be beneficial because it creates an association between genes determining femaleness or maleness, and genes affecting sex-specific fitness.
Evidence from mammalian sex chromosomes suggests that this recombination arrest might have occurred in stages through the fixation of multiple chromosomal inversions (Lahn and Page 1999), which are known to suppress recombination in heterozygotes (Kirkpatrick 2010). However, evidence from stickleback (Natri et al. 2013) and Silene latifolia (Bergero et al. 2007) suggests that recombination suppression may have spread more gradually in these species, possibly through recombination rate modifiers, as only weak signals of evolutionary strata have been found.


There is now a large body of theory on sexually antagonistic (SA) selection (Van Doorn and Kirkpatrick 2010; Jordan and Charlesworth 2012; Blaser et al. 2013), and although it is generally accepted to play a central role in sex chromosome evolution, it remains largely untested empirically. In particular, there are no established cases where SA polymorphisms have been identified and directly implicated in suppressing recombination. Recent studies using molecular evolutionary approaches have investigated footprints of SA selection in the PARs of the young sex chromosomes in the plant Silene latifolia, finding evidence for balancing selection in such regions, but SA polymorphisms have not yet been identified (Qiu et al. 2013). Similarly, potentially antagonistic alleles have been implicated in cichlids (Roberts et al. 2009) and sticklebacks (Kitano et al. 2009), but such polymorphisms have not been demonstrated to be involved in suppressing recombination. Identifying SA polymorphisms and establishing their role in recombination suppression therefore remains a major challenge in the field.

Although suppressed recombination is a characteristic feature of sex chromosome evolution, it does not appear to be the only mechanism for resolving the conflict arising from sexually antagonistic variation. This was highlighted in recent work in emus (Dromaius novaehollandiae), which showed that sexually antagonistic variation can instead result in the evolution of sex-specific gene expression (Vicoso et al. 2013). This suggests that the transition to sex chromosome heteromorphism is not inevitable despite its apparent prevalence (Mank 2009, 2013; Bachtrog et al. 2011, 2014). Nevertheless, the evolutionary and genetic consequences of recombination suppression remain a key issue in sex chromosome evolution (Bergero and Charlesworth 2009; Bachtrog 2013), and for the remainder of this chapter I shall focus on cases where it has evolved.

EFFECTS OF RECOMBINATION SUPPRESSION ON Y-CHROMOSOME EVOLUTION

When it evolves, suppressed recombination may cause Y-linked genes to deteriorate in function, lose expression, become enriched in repetitive sequence, and/or become pseudogenized (Bachtrog 2013). These patterns of genetic deterioration are collectively referred to as Y-chromosome degeneration, and this is a well-established feature of ancient Y chromosomes, including those in humans (Hellborg and Ellegren 2004; Wilson Sayres et al. 2014) and Drosophila melanogaster (Carvalho 2002; Carvalho et al. 2003; Singh et al. 2014), each of whose Y chromosomes exhibit a highly heterochromatic chromatin structure consisting largely of repetitive and ampliconic DNA, and carry few remaining protein-coding genes.

Several evolutionary theories have been suggested to explain the degeneration of the Y chromosome (see Bachtrog 2013 for a recent review). These include models based on positive selection and the fixation of linked deleterious variants (“selective sweeps”; Maynard Smith and Haigh 1974; Rice 1987), as well as purifying selection and the increased vulnerability of the Y chromosome to genetic drift (“background selection”; Charlesworth 1994; McVean and Charlesworth 2000). In addition, Y chromosomes have been suggested to undergo a reduced rate of adaptive evolution owing to the fact that the majority of favorable mutations arising on the Y suffer a reduced probability of fixation due to linkage with deleterious mutations (referred to as the “ruby-in-the-rubbish” model; Orr and Kim 1998).

A unifying theme of these models of Y-chromosome degeneration is the idea that recombination suppression reduces the efficiency of natural selection. This can be understood in terms of the Hill–Robertson (HR) effect, in which the effective population size (Ne) for a given genomic region depends strongly on the rate of recombination, such that sites linked to selected variants are not independent, and thus collectively experience a reduction in Ne (Hill and Robertson 1966). This reduced Ne in turn results in a lowered efficacy of selection, which is proportional to the product of Ne and the selection coefficient, s (Charlesworth 2009). Thus, compared to X chromosomes or autosomes, where recombination breaks up allelic associations, nonrecombining Y chromosomes are predicted to accumulate more deleterious mutations and to fix fewer advantageous ones.
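The dependence of selection efficacy on the product of Ne and s can be illustrated numerically. The short Python sketch below is an illustration only and is not part of the analyses in this thesis: it uses Kimura's standard diffusion approximation for the fixation probability of a new mutation, and it assumes a diploid population with additive selection. The function name `fixation_probability` and the parameter values (census size, effective sizes, and selection coefficients) are hypothetical choices made for the example.

```python
import math


def fixation_probability(s, ne, n):
    """Fixation probability of a single new mutation (Kimura's diffusion approximation).

    Assumptions (illustrative): diploid population of census size n and
    effective size ne, additive selection with coefficient s on the new allele.
    """
    if s == 0:
        return 1.0 / (2 * n)  # neutral case: initial frequency of the new mutation
    # (1 - exp(-2*s*ne/n)) / (1 - exp(-4*ne*s)), written with expm1 for precision
    return math.expm1(-2.0 * s * ne / n) / math.expm1(-4.0 * ne * s)


# Hypothetical values: the same census size, with Ne reduced tenfold to mimic
# the effect of selective interference in a non-recombining region.
census_size = 10_000
for ne in (10_000, 1_000):
    p_deleterious = fixation_probability(-1e-4, ne, census_size)
    p_beneficial = fixation_probability(1e-4, ne, census_size)
    print(f"Ne={ne}: deleterious {p_deleterious:.3e}, beneficial {p_beneficial:.3e}")
```

Under these assumed values, lowering Ne raises the fixation probability of the mildly deleterious mutation and lowers that of the mildly beneficial one, which is the qualitative pattern predicted for a non-recombining Y chromosome.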
Although the general properties of selective interference models of Y-chromosome degeneration are well understood theoretically, their relative contributions to patterns of Y-chromosome degeneration observed in natural populations are poorly understood. This is, in part, because the models have overlapping predictions, including chromosome-wide reductions in neutral polymorphism, and also because they depend (in different ways) on evolutionary parameters that are not well characterized, including the distribution of selection coefficients and rates of beneficial and deleterious mutations.


While several studies have directly implicated the role of selective sweeps or background selection as key processes driving patterns of evolution on Y chromosomes and in genomic regions with low recombination (Hellborg and Ellegren 2004; Andolfatto 2007; McGaugh et al. 2012), developing statistical tests to distinguish between the molecular signatures of these processes remains a great challenge (Huber et al. 2015). Moreover, the magnitudes of HR effects are predicted to change over the course of Y-chromosome evolution (Bachtrog 2008), suggesting that the time since recombination suppression is a key parameter determining the expected rate and extent of Y-chromosome degeneration, though few studies have investigated this prediction (but see Chapter 3).

RATES OF MOLECULAR EVOLUTION AND PATTERNS OF GENETIC DIVERSITY

A fundamental difference between sex chromosomes and autosomes is the number of these chromosome types in the population, and this may give rise to differences in genetic diversity and rates of molecular evolution. Autosomes in diploid species are present in two copies in each sex, whereas the X chromosome is present in two copies in females and only one copy in males. As a consequence, the effective population size (Ne) of the X chromosome is predicted to be 3/4 that of the autosomes, whereas the Ne of the male-limited Y chromosome is expected to equal 1/4 that of autosomes (assuming an equal number of reproducing females and males). Such differences in Ne should directly affect levels of neutral polymorphism on these chromosomes, which are proportional to the product of Ne and the neutral mutation rate (Kimura 1983).

A key prediction of theories of Y-chromosome degeneration is a reduction in the Ne of Y-linked genes beyond the neutrally predicted level. This prediction is typically tested by normalizing X- and Y-linked diversity by autosomal diversity (A), and determining whether observed X:A and Y:A ratios can be explained by a single model (Ellegren 2009). However, the X chromosome may be subject to evolutionary forces that cause diversity to differ dramatically from autosomes, making direct comparisons of diversity on X-linked and autosomal genes difficult (e.g., see Schaffner 2004). For example, when the X chromosome has become fully or partially hemizygous due to gene loss from the Y chromosome, purifying selection is predicted to become more effective on X-linked genes due to the exposure of recessive alleles in males. This may result in deleterious mutations being maintained at lower frequencies on this chromosome, resulting in a larger number of X chromosomes free of deleterious mutations compared to autosomes, and consequently, a higher level of neutral polymorphism. On the other hand, diversity may be reduced on the X chromosome because the increased efficiency of selection might also result in an increased frequency of selective sweeps (Charlesworth 1996; Ellegren 2009).

It is clear that several factors can modulate or accentuate differences in polymorphism between sex chromosomes and autosomes, and empirical estimates reflect this variation. For example, the estimated X:A diversity ratio in humans ranges between 0.60 and 1.05, and between 0.49 and 1.47 in Drosophila melanogaster (Ellegren 2009), and non-selective factors have been implicated in shaping X:A diversity in both species. These include population subdivision and sex-biased dispersal (Keinan et al. 2009; and see Bentley et al. 2008), deviations from a 1:1 breeding sex ratio (Hammer et al. 2008; Bustamante and Ramachandran 2009; Ellegren 2009), and high variance in male or female reproductive success (Caballero 1994, 1995; Charlesworth 2009; Ellegren 2009). In Drosophila melanogaster, studies have found higher X:A ratios particularly among African populations, suggesting an ‘out of Africa’ scenario in which a population bottleneck may have disproportionately affected X chromosome polymorphism (Andolfatto 2001).

The different numbers of X and Y chromosomes in males versus females and the chromosomal heteromorphism that may result from sex chromosome evolution can therefore affect selection pressure, the dominance of mutations, and the effective population size of sex chromosomes versus autosomes, and these parameters are predicted to interact to affect rates of evolution and patterns of diversity.
Recent studies in model species have started to reveal the factors contributing to sex-linked and autosomal diversity, and further studies examining sex chromosome variability from a range of species with different levels of sex chromosome heteromorphism will likely provide novel insights into this issue.
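
The neutral expectations against which X:A and Y:A ratios are compared can be made concrete with a short calculation. The sketch below is illustrative only: it uses the standard expressions for effective population size with unequal numbers of breeding females and males, and it assumes random mating, Poisson-distributed offspring number, equal mutation rates across chromosomes, and no selection. The function names and the example breeding numbers are hypothetical and are not taken from the analyses in this thesis.

```python
def neutral_ne(n_females: float, n_males: float) -> dict:
    """Expected effective sizes of autosomes (A), X and Y chromosomes.

    Assumes random mating and Poisson offspring number; n_females and
    n_males are the numbers of breeding individuals of each sex.
    """
    nf, nm = n_females, n_males
    return {
        "A": 4 * nf * nm / (nf + nm),
        "X": 9 * nf * nm / (4 * nm + 2 * nf),
        "Y": nm / 2,
    }


def expected_ratios(n_females: float, n_males: float) -> tuple:
    """Expected X:A and Y:A diversity ratios under strict neutrality."""
    ne = neutral_ne(n_females, n_males)
    return ne["X"] / ne["A"], ne["Y"] / ne["A"]


if __name__ == "__main__":
    # Hypothetical breeding numbers: equal, female-biased, male-biased.
    for nf, nm in [(500, 500), (600, 400), (400, 600)]:
        xa, ya = expected_ratios(nf, nm)
        print(f"Nf={nf} Nm={nm}  X:A={xa:.3f}  Y:A={ya:.3f}")
```

With equal breeding numbers the expected ratios are 3/4 and 1/4. Under these assumptions a female-biased breeding sex ratio raises the expected X:A ratio and lowers the expected Y:A ratio, so a reduction in Y-linked diversity can only be attributed to selective interference after this purely demographic effect has been taken into account.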

GENE MOVEMENT BETWEEN X CHROMOSOMES AND AUTOSOMES


In addition to differences in patterns of molecular evolution and genetic variability, theory also predicts that differences in gene content may evolve between sex chromosomes and autosomes. For example, genes with different selective effects in males versus females may disproportionately accumulate on the sex chromosomes compared to autosomes, thus affecting the chromosomal distribution of genes involved in sexual dimorphism (discussed in the next section). In addition, because X-linked genes have a higher probability of co-transmission with mitochondrial genes (2/3) compared to autosomal genes (1/2), selection on beneficial epistatic interactions has been predicted to lead to an overrepresentation of mitochondria-interacting genes on the X chromosome (Rand et al. 2004; Drown et al. 2012). On the other hand, because mitochondrial mutations may spread when their effects are favorable to females, even if they are mildly deleterious to males, the long-term accumulation of male-detrimental mitochondrial mutations may result in selection for the movement of mitochondria-interacting genes off the X chromosome to the autosomes (see Rand et al. 2004; Drown et al. 2012). Such gene movement would reduce the male mitochondrial mutation load, and result in an underrepresentation of mito-nuclear genes on the X chromosome relative to the autosomes. Although co-adaptation and sexual conflict within the genome are widely investigated subjects (e.g., see Rice 2013), their influence on mito-nuclear gene movement has only recently been explored (Drown et al. 2012; Hill and Johnson 2013; Dean et al. 2014; Rogell et al. 2014; Hough et al. 2014; Dean et al. 2015), and evidence for the two hypotheses has been mixed. For example, Drown et al. (2012) investigated the chromosomal distribution of mitochondria-interacting nuclear (N-mt) genes in 16 vertebrates and found a strong underrepresentation of such genes on the X chromosomes relative to autosomes in 14 mammal species, but not in two avian species. Dean et al. (2014) added seven species to this analysis, with independently derived sex chromosomes and phylogenetic correction, and found that the underrepresentation of N-mt genes on the X chromosome was restricted to therian mammals and Caenorhabditis elegans. Given the uncertainty surrounding the empirical results above, there is a need to explore the chromosomal distributions of cyto-nuclear genes in a broader set of taxa, and to determine whether movement of such genes followed the formation of sex chromosomes (providing evidence for the sexual conflict hypothesis), or whether the observed chromosomal distribution of cyto-nuclear genes is an artifact of the gene content of the ancestral chromosome pair that evolved into sex chromosomes. The latter possibility is suggested in Chapter 5, where no evidence was found for a biased chromosomal distribution of mito-nuclear genes on the recently evolved sex chromosomes in the plant R. hastatulus. Subsequently, this idea has been tested further in therian mammals (Dean et al. 2015), revealing that random biases in the gene content of the ancestral X chromosome could indeed explain the observed chromosomal distributions of mito-nuclear genes. Thus, whether sexual conflict or co-adaptation drives mito-nuclear gene movement is not yet known, and investigating the factors affecting the distributions of genes on sex chromosomes remains an important area of research.
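
One simple way to ask whether N-mt genes are underrepresented on the X chromosome is to compare the observed count with the count expected if N-mt genes were distributed across chromosomes in proportion to gene number. The sketch below computes the expected count and a one-sided hypergeometric tail probability using only the Python standard library. It is a generic illustration, not necessarily the test used in the studies cited above or in Chapter 5; it ignores phylogenetic non-independence, and all gene counts in the example are hypothetical.

```python
from math import comb


def expected_on_x(n_mt_total: int, n_x_genes: int, n_genes: int) -> float:
    """Expected number of N-mt genes on the X if placement is proportional."""
    return n_mt_total * n_x_genes / n_genes


def hypergeom_lower_tail(k: int, n_mt_total: int, n_x_genes: int, n_genes: int) -> float:
    """P(count on X <= k) when n_x_genes are drawn without replacement
    from n_genes, of which n_mt_total are N-mt genes."""
    lowest = max(0, n_x_genes - (n_genes - n_mt_total))
    total = comb(n_genes, n_x_genes)
    favourable = sum(
        comb(n_mt_total, i) * comb(n_genes - n_mt_total, n_x_genes - i)
        for i in range(lowest, k + 1)
    )
    return favourable / total


if __name__ == "__main__":
    # Hypothetical genome: 20,000 genes, 1,000 N-mt genes, 800 X-linked genes,
    # of which 25 are N-mt.
    print("expected on X:", expected_on_x(1000, 800, 20000))
    print("P(X count <= 25):", hypergeom_lower_tail(25, 1000, 800, 20000))
```

A small tail probability indicates a deficit of N-mt genes on the X relative to a random distribution, but it cannot by itself distinguish gene movement after sex chromosome formation from a pre-existing bias in the gene content of the ancestral chromosome pair.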

SEX CHROMOSOMES AND SEXUAL DIMORPHISM

The occurrence of sexual conflict is of particular interest when it involves the X chromosome, as genes on this chromosome have been shown in theory to preferentially evolve sex-biased fitness effects relative to autosomal genes (Rice 1987), and may therefore disproportionately affect the evolution of sexual dimorphism. Consider, for example, a sexually antagonistic mutation on an autosome that has a significant effect on the fitness of heterozygotes. When the fitness effects of such a mutation are positive in females and negative in males, the mutation can spread under positive selection only when the beneficial effects in females outweigh the deleterious effects in males (Vicoso and Charlesworth 2006). If this mutation occurs on the X chromosome, however, its deleterious effects will be expressed only 1/3 of the time, because an X chromosome spends only one-third of its time in males. Thus, the probability of such a mutation spreading to fixation, or reaching high frequency, is greater when it occurs on an X chromosome than on an autosome (Rice 1987). It follows that the X chromosome can accumulate such female-benefit genes at a faster rate, and the “feminization” of this chromosome might make it an evolutionary hot spot for genes involved in sexual dimorphism (Gibson et al. 2002). Empirical work has attempted to determine whether sex chromosomes do indeed have an effect on sexual dimorphism that is disproportionate to their size (reviewed in Mank 2009), but studies to date have revealed mixed results. In Drosophila melanogaster, there are reports that genes with sex-biased expression have nonrandom genomic distributions, with X chromosomes harboring fewer genes with male-biased expression (Parisi et al. 2003, but see Fitzpatrick 2004). On the other hand, comparative studies in birds do not support an association between sex chromosomes and sexually selected dimorphic traits (Mank 2009). In plants, sexually dimorphic gene expression has been detected in both vegetative (Zluvova et al. 2010) and floral (Muyle et al. 2012) characters in Silene latifolia, and there is evidence that some genes on the X chromosome of this species are male-biased in their expression (Muyle et al. 2012), though formal tests of the association between sex chromosomes and sexually dimorphic trait variation have not been done. There is still much work to do on determining the influence of sex chromosomes on patterns of sexual dimorphism. Theory has established an important role of sex chromosomes, but empirical data have yet to unambiguously confirm this. With sexually dimorphic variation occurring across a range of species harboring sex chromosomes, genetic mapping and gene expression studies aimed at testing the association between sexual dimorphism and sex chromosomes are likely to provide novel insights into this issue. In Chapter 7 I discuss these issues further and also provide a comprehensive review of sexual dimorphism in plants.
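
The verbal argument about the spread of a female-beneficial, male-detrimental mutation can be illustrated with a weak-selection approximation for a rare allele. The sketch below is a simplification made for illustration, not a model from this thesis: it weights the fitness effect in each sex by the fraction of time the allele spends there (1/2 in each sex for an autosome; 2/3 in females and 1/3 in males for the X), and it assumes that a rare allele occurs only in heterozygous females and hemizygous males. The symbols s_f (benefit to females), s_m (cost to males) and h (dominance in heterozygotes) are assumptions introduced here.

```python
def autosomal_growth(s_f: float, s_m: float, h: float) -> float:
    """Approximate per-generation growth rate of a rare autosomal allele.

    The allele is heterozygous in both sexes, so both the female benefit
    and the male cost are scaled by the dominance coefficient h.
    """
    return 0.5 * h * s_f - 0.5 * h * s_m


def x_linked_growth(s_f: float, s_m: float, h: float) -> float:
    """Approximate per-generation growth rate of a rare X-linked allele.

    Two-thirds of X copies are in females (heterozygous, benefit h * s_f);
    one-third are in males (hemizygous, full cost s_m).
    """
    return (2.0 / 3.0) * h * s_f - (1.0 / 3.0) * s_m


if __name__ == "__main__":
    # Hypothetical dominant allele whose cost to males exceeds its benefit to females.
    s_f, s_m, h = 0.02, 0.03, 1.0
    print("autosome:", autosomal_growth(s_f, s_m, h))
    print("X chromosome:", x_linked_growth(s_f, s_m, h))
```

A positive value means the rare allele is expected to increase in frequency. Under these assumptions the advantage of X linkage for female-beneficial alleles depends on dominance: it is greatest for dominant alleles, whereas recessive female-beneficial alleles are largely hidden from selection in heterozygous females.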

YOUNG PLANT SEX CHROMOSOMES: THE EARLY STAGES OF X-Y DIVERGENCE

While the evolution of sex chromosomes has been investigated extensively in several animal species, including humans (Skaletsky et al. 2003), fish (Almeida-Toledo et al. 2000), platypus (Watson et al. 1991), birds (Matsubara et al. 2006; Pigozzi 2011), and Drosophila (Bachtrog 2005), much less is known about sex chromosome evolution in dioecious flowering plants. Unlike animals, most flowering plants are hermaphroditic, with female and male functions found within a single individual. In species that have evolved separate sexes, however, the emergence of heteromorphic sex chromosomes appears to have occurred relatively recently (Charlesworth 2015). Studying plant sex chromosomes therefore provides an interesting contrast with those in animals, where sex chromosomes in the most well-studied species are either relatively old, as with the sex chromosomes of humans or Drosophila melanogaster (Lahn and Page 1999, Bachtrog 2005), or contain highly degenerate Y chromosomes (as in D. miranda; Bachtrog 2004). Data from young sex chromosome systems in plants will allow one to address many of the outstanding questions in sex chromosome evolution that have been highlighted in this chapter. These include questions concerning the role of sexually antagonistic polymorphisms in driving suppression of recombination (e.g., see Qiu et al. 2013), the timescales of Y-chromosome degeneration and the temporal dynamics of the HR effect (Chapters 3 and 4), and the role of sexual conflict in driving inter-chromosomal gene movement among nascent sex chromosomes (Chapter 5). Recent studies investigating plant sex chromosome evolution have shed light on some of these issues, and an interesting general conclusion from this work is that there are many features of sex chromosome evolution in plants that appear to be shared with animals. These include genetic degeneration of Y chromosomes (Bergero and Charlesworth 2011; Chibalina and Filatov 2011; Hough et al. 2014a), the pattern of evolutionary strata (Bergero et al. 2007), and the evolution of dosage compensation (Muyle et al. 2012; Papadopulos et al. 2015). This highlights that general evolutionary mechanisms, rather than species- or kingdom-specific processes, are likely involved in driving these features of sex chromosomes. In addition, however, research on plant sex chromosomes has also revealed that there are unique features of plant sex chromosome evolution.
For example, there is evidence that the extensive gene expression that occurs across plant genomes at the haploid gametophytic phase may cause a slower rate of gene decay on Y chromosomes in plants due to purging of deleterious mutations (Chibalina and Filatov 2011), and such efficient haploid-phase selection may also impact the evolution of sex ratios and mutational load in plants with sex chromosomes, as described in more detail in Chapter 6. Thus, recent research on the evolution and genetics of plant sex chromosomes is beginning to shed new light on both the general features of sex chromosome evolution, as well as on how the unique characteristics of plants can affect the forces driving sex chromosome evolution. In the next section, I introduce the plant species Rumex hastatulus as an emerging model system that has played an important role in informing this understanding.


RUMEX HASTATULUS: A MODEL SYSTEM FOR SEX CHROMOSOME EVOLUTION

The plant genus Rumex (Polygonaceae) includes both dioecious and nondioecious species, and sex chromosomes in this group are estimated to have evolved ~15 million years ago (Navajas-Pérez et al. 2005), making them evolutionarily much younger than mammalian sex chromosomes, but older than those in other plant species whose sex chromosomes have been studied [e.g. papaya: ~2 mya (Yu et al. 2008); Silene latifolia: ~10 mya (Nicolas et al. 2005, Mrackova et al. 2008)]. Phylogenetic evidence indicates that the dioecious species of Rumex likely evolved from hermaphroditic ancestors via gynodioecious intermediates (Navajas-Pérez et al. 2005), making the genus well suited for comparative analysis aimed at investigating the origin of sex chromosomes and transitions in sexual system. X and Y chromosomes in Rumex are also morphologically differentiated (Smith 1963), and in some species, Y chromosomes are highly heterochromatic (e.g. Rumex acetosa: Clark et al. 1993; Réjon et al. 1994; Vyskot and Lengerova 2001; Rumex hastatulus: del Bosque et al. 2011; Grabowska-Joachimiak et al. 2014), suggesting that Y chromosome degeneration has occurred independently within the genus. Rumex hastatulus, a dioecious wind-pollinated annual weed native to the southern United States, is of particular interest for studies of sex chromosome evolution, as it is recognized as having a unique within-species karyotype polymorphism (Smith 1964). Individuals found in the western (Texas) range of the species distribution have a diploid chromosome number 2n = 8 + XX in females and 2n = 8 + XY in males, whereas individuals in the eastern (North Carolina) range have diploid chromosome numbers 2n = 6 + XX in females and 2n = 6 + XY1Y2 in males. In addition, studies have described the occurrence of sexual dimorphism in R. hastatulus (Pickup and Barrett 2002; Teitel et al. 2015), suggesting that the genomes of males and females may experience sexually antagonistic selection. Thus, there is evidence in this species for sexual dimorphism, karyotypically and geographically distinct 'races' that differ in autosome number and sex chromosome complement, and the presence of a neo-Y (Y2) chromosome that originated independently of the ancestral Y chromosome (Y1). In common with several other Rumex species (reviewed in Barrett et al. 2010), R. hastatulus populations are also characterized by female-biased sex ratios, in which females are on average 20% more frequent than males throughout their range (Pickup and Barrett 2013). It is unknown whether the occurrence of female-biased sex ratios in this genus is related to sex chromosome evolution, but several authors have noted this possibility (Conn and Blum 1981; Stehlik and Barrett 2006; Stehlik et al. 2008; Chapter 6). It has been hypothesized that female-biased sex ratios in several Rumex species may be linked to certation, a process involving gametophytic competition between X- and Y-carrying microgametophytes in styles of females (Lloyd 1974; Charlesworth 2002; Stehlik and Barrett 2005, 2006; Stehlik et al. 2008; Pickup and Barrett 2014). The uniovulate condition in Rumex certainly makes this hypothesis plausible, but direct evidence linking sex ratio bias to selection during the haploid phase of the life cycle has not yet been obtained. In Chapter 6 I discuss the relation between sex ratio bias and selection during the haploid phase of the life cycle in plants.

The XY1Y2 sex chromosome system in R. hastatulus is thought to have originated through an X-autosome fusion (Smith 1964) involving the ancestral third chromosome in the Texas race. Evidence supporting this hypothesis was recently obtained by Grabowska-Joachimiak et al. (2014), who reported that the ancestral third chromosome in the Texas race carries the 5S rDNA locus, which is now found on both the neo-X and the Y2 sex chromosomes in the derived North Carolina race. That this fusion is likely to have been a recent event, and subsequent to the divergence of the ancestral Y chromosome, is suggested by C-banding/DAPI experiments showing that the ancestral Y chromosome is highly heterochromatic, while the translocated neo-Y chromosome still consists largely of transcriptionally active euchromatin (Ester et al. 2011; Grabowska-Joachimiak et al. 2014) and, as I show in Chapter 3, has less sequence divergence compared with the ancestral chromosome pair. Moreover, while independently sequenced transcriptomes from males and females of each race found male-specific variants in ∼80% of the genes ascertained as sex-linked in the XY populations, this was found for only 28% of the neo-Y genes, suggesting a very recent suppression of recombination, or that the Y2 still contains a large recombining, pseudo-autosomal region (Chapter 3). The occurrence of multiple recently evolved sex chromosome systems in R. hastatulus thus provides a unique opportunity to investigate the early stage of sex chromosome divergence, and to compare patterns of degeneration on Y chromosomes that evolved de novo from autosomes with those that have originated through fusion events (‘neo-sex chromosomes’). The geographical structuring of chromosomal races in R. hastatulus also raises questions about the evolutionary relationships of these races, the extent to which they are genetically or ecologically divergent, and whether the XY1Y2 system may have arisen multiple times independently. Investigating such relationships may shed new light on the factors driving transitions in sex chromosome system and the evolutionary dynamics of chromosomal rearrangements. Finally, the occurrence of female-biased sex ratios in this species allows for a unique investigation of the relationship between sex chromosome heteromorphism and the evolution of sex ratios, and a test of whether sex ratio bias has reduced sex-linked effective population size and genetic diversity, as predicted by theory.

THESIS OUTLINE

The research presented in this thesis is guided by the broader effort to investigate the mechanisms contributing to evolutionary change across the genome. I therefore begin with a broad analysis and review of this subject (Chapter 2), in which I describe recent data from studies investigating patterns of genome-wide selection in plant species. In the chapter, I consider how different population genetic factors, including variation in demographic structure, recombination rate, mating system, and ploidy level can affect patterns of molecular evolution and selection across the genome. This chapter serves as a broad motivating context for those that follow, which investigate a range of questions concerning the patterns of molecular evolution and genetic diversity on sex chromosomes in the plant Rumex hastatulus (Chapters 3 and 4), as well as some of the broader implications of plant sex chromosome evolution mentioned in this introduction, including genetic conflict and inter-chromosomal gene movement (Chapter 5), the evolution of the sex ratio (Chapter 6), and sexual dimorphism (Chapter 7). These chapters involve a variety of approaches, including genetic crossing experiments, mathematical modeling, and statistical analysis of genetic sequence data. Because each chapter was written as a self-contained research article or literature review for publication, there is some inevitable repetition between them. Each chapter begins with a summary describing the key results, and I provide the citation for those articles that have been published.

CHAPTER 2

PATTERNS OF SELECTION IN PLANT GENOMES

This chapter resulted from collaboration with Robert J. Williamson and Stephen I. Wright. It is published in Annual Review of Ecology, Evolution, and Systematics, 2013, 44:41–39.

SUMMARY

Plants show a wide range of variation in mating system, ploidy level, and demographic history, allowing for unique opportunities to investigate the evolutionary and genetic factors affecting genome-wide patterns of positive and negative selection. In this review, we highlight recent progress in our understanding of the extent and nature of selection on plant genomes. We discuss differences in selection as they relate to variation in demography, recombination, mating system and ploidy. We focus on the population genetic consequences of these factors and argue that, although variation in the magnitude of purifying selection is well documented, quantifying rates of positive selection and disentangling the relative importance of recombination, demography, and ploidy are ongoing challenges. Large-scale comparative studies that examine the relative and joint importance of these processes, combined with explicit models of population history and selection, are key and feasible goals for future work.

INTRODUCTION

Population genetics theory suggests that several key factors should be important in influencing patterns of genome-wide selection, including effective population size, population structure, recombination, and ploidy. The extent to which variation in these factors can drive variation in selection within and between species is a key question that connects diverse study systems and research questions into a unifying framework. Plants, with their unique life cycle characteristics and diversity of mating systems, ploidy levels, and demographic histories, provide an ideal testing ground for exploring empirically the importance of different population genetic processes in shaping patterns of selection. For example, the frequent occurrence of mating system transitions in plants makes it possible to test how changes in effective population size and recombination rates can affect the extent and efficacy of selection across the genome (Haudry et al. 2008, Hazzouri et al. 2012), how elevated homozygosity can influence the genetics of adaptive evolution (Glémin and Ronfort 2013), and how self-fertilization can affect population structure and demographic patterns (Platts et al. 2010). In addition, the prevalence of ploidy transitions in plants (Otto and Whitton 2000) enables investigations into the effects of genome duplication and genetic redundancy on the rates of fixation of deleterious and advantageous alleles (Otto and Yong 2002), and results to date highlight the importance of gene dosage and dominance as evolutionary factors affecting patterns of selection in polyploid lineages (Otto 2007). Finally, widespread gene expression and selection during the haploid stage of plant life cycles (Borg et al. 2009, Walbot and Evans 2003) offers insight into the unique characteristics of plants and may explain some of the differences between plant and animal species in the patterns of genome-wide variation and selection. Here, we review recent developments in our understanding of the extent and nature of selection in plant genomes. We begin by documenting recent results regarding the extent of selection in both coding and non-coding DNA, and discuss how and why the patterns in plants differ from those in other organisms. We then review the expected and observed differences in the patterns of selection in plants as they relate to differences in effective population size and structure, recombination rate, mating system and ploidy level.

GENOME-WIDE SELECTION IN MODEL SYSTEMS

With the increasing availability of whole-genome sequence data, several methods have been applied to infer the genome-wide extent of positive and negative selection. These methods involve a variety of approaches, including analyses of selective constraint across species (Siepel et al. 2005), comparisons of polymorphism and divergence at selected and neutral sites (McDonald and Kreitman 1991, Boyko et al. 2008, Eyre-Walker and Keightley 2009), and levels of neutral diversity surrounding fixed functional mutations (Sattath et al. 2011). These methods have been reviewed recently (Sella et al. 2009, Zhen and Andolfatto 2012) and here we focus on the patterns that have emerged when they have been applied to estimate the extent and strength of positive and negative selection.


FIGURE 2.1. Estimates of the percent of the genome under selective constraint (blue) and the percent of selectively constrained bases that are noncoding (red) in model species from comparative genomics approaches (analysis of between-species constraint). Species are shown in decreasing order of genome size. For humans and Drosophila, light bars indicate the increase in the estimates when population genomics approaches (analysis of within-species diversity and allele frequencies) were used. Comparative genomics estimates were obtained for humans (Lindblad-Toh et al. 2011), Drosophila melanogaster, Caenorhabditis elegans, Saccharomyces cerevisiae (Siepel et al. 2005), Arabidopsis thaliana, and Arabidopsis lyrata (Haudry et al. 2013); population genomics estimates were obtained for humans (Ward and Kellis 2012) and Drosophila (Sella et al. 2009).

In Drosophila, two particular results have become apparent from recent studies that are in marked contrast to previously held views about the nature of molecular evolution (Sella et al. 2009). First, comparative and population genomics analyses of selective constraint have found that a large fraction of the Drosophila genome, including noncoding regions, is subject to purifying selection (Andolfatto 2005, Sella et al. 2009). Second, analyses of genome-wide polymorphism and divergence suggest that positive selection in coding and noncoding DNA is occurring at a remarkably high rate (Eyre-Walker and Keightley 2009, Sella et al. 2009, Sattath et al. 2011). In contrast, and more in line with expectations of neutral theory, such high rates of species-wide positive selection from new mutations have not been detected in the human genome (Eyre-Walker and Keightley 2009, Hernandez et al. 2011), which also appears to contain a larger proportion of DNA subject to little or no selective constraint (Eory et al. 2009, Eyre-Walker and Keightley 2009, Lindblad-Toh et al. 2011; Figure 2.1). This latter result is in apparent contrast to recent results from the ENCODE project (The ENCODE Project Consortium 2013), which suggest that a very high proportion of sites are involved in biochemical activity. However, while integrating ENCODE results with population genetics approaches has resulted in increased estimates of contemporary constraint that went undetected by comparative approaches (Ward and Kellis 2012; Figure 2.1), many sites falling under the ENCODE definition of function may not be subject to purifying selection (Graur et al. 2013). This does not imply that the human genome has few functional noncoding regions, however, as it may simply contain more neutral or nearly neutral DNA. Indeed, estimates of the amount of functional noncoding sequence in humans are higher than those in Drosophila (Figure 2.1). Nevertheless, analyses of protein-coding and noncoding sequences in the human genome indicate that there is weaker purifying selection, more neutral sites, and lower rates of positive selection than in Drosophila (Eory et al. 2009, Eyre-Walker and Keightley 2009, Hernandez et al. 2011). In plants, early analyses of protein-coding sequences in Arabidopsis found little evidence of positive selection on amino acid substitutions, but did indicate a large proportion of slightly deleterious amino acid substitutions segregating in populations (Bustamante et al. 2002). More recent larger scale analyses have confirmed this result (Foxe et al. 2008, Gossmann et al.
2010, Slotte et al. 2011). Indeed, approximately 20% of new amino acid mutations are effectively neutral (Figure 2.2), and estimates of species-wide positive selection on amino acid variants are approximately zero. Although evidence for purifying selection on noncoding sites has been obtained from several plant genomes (Wright and Andolfatto 2008), the first studies of conservation in such regions suggested that there are considerably fewer conserved noncoding sites in plants than in animal genomes (Lockton and Gaut 2005). This initial conclusion has been confirmed by a recent study that estimated whole-genome constraint in Arabidopsis from comparisons of nine species in the Brassicaceae (Haudry et al. 2013; Figure 2.1). This suggests that plant genomes in the Brassicaceae contain far fewer selectively constrained noncoding sites than the genomes of mammals (Lindblad-Toh et al. 2011), Drosophila (Andolfatto 2005, Sella et al. 2009), and nematodes (Siepel et al. 2005), though estimates are higher in plants than in yeast (Siepel et al. 2005; Figure 2.1). If this lower amount of functional noncoding sequence proves common in plant genomes, it may signal fewer or smaller regulatory regions and/or relaxed selection due to large-scale whole genome duplication; modulation of gene expression in plants may less often involve complex noncoding regulatory regions, and may occur more often through gene duplication (see Lockton and Gaut 2005).

FIGURE 2.2. Estimates of the proportion of effectively neutral sites (Ne s < 1) in various plant taxa, calculated using the methods of Eyre-Walker and Keightley (2009) and plotted against estimates of their effective population size (Ne). Symbols indicate (filled symbols) predominantly outcrossing species, (empty symbols) predominantly selfing species, (squares) species with a recent whole-genome duplication (synonymous substitution rate among duplicates Ks < 0.3), (triangles) species with ancient whole-genome duplications (Ks > 0.3), (circles) species for which the age of the last whole-genome duplication is unknown, and (gray symbols) species in which α (the proportion of fixations driven by positive selection) is significantly greater than zero. The regression line shown was calculated excluding the data from Medicago truncatula (not expressed). Estimates of the proportion of effectively neutral sites are from the following for the various species indicated: Paape et al. (2013) for M. truncatula; Slotte et al. (2010) for Capsella grandiflora and Arabidopsis thaliana; Eckert et al. (2013) for Pinus spp.; Strasburg et al. (2011) for all Helianthus; and Gossmann et al. (2010) for the remaining species. Estimates of Ne were obtained from Gossmann et al. (2010), Gossmann et al. (2012), and Strasburg et al. (2011), with the following exceptions: Ne estimates for Pinus species were calculated using the diversity levels reported in Eckert et al. (2013), a per-year mutation rate of 10⁻⁹ (Willyard et al. 2007), and assuming a generation time of 75 years (based on arguments in Eckert et al. 2013). For Arabidopsis lyrata and Boechera stricta, Ne estimates were updated using direct estimates of the per-generation substitution rate in Arabidopsis thaliana following Gossmann et al. (2012). For M. truncatula, we used the estimate of per-generation substitution rate from Young et al. (2011) and synonymous diversity estimates from Branca et al. (2011). M. truncatula (not expressed) reflects data from sites in genes that were not expressed in any of six tissues sampled, whereas M. truncatula (expressed) reflects data in genes that were expressed in at least one tissue (Paape et al. 2013). Duplication-age estimates were obtained from Blanc and Wolfe (2004), Barker et al. (2008), and Sterck et al. (2005).

Like humans, but in contrast with Drosophila, a significant proportion of sites in plant genomes appear to evolve effectively neutrally. In particular, only 6% of the Arabidopsis noncoding sequence is estimated to be under purifying selection (Haudry et al. 2013). This proportion of constraint is expected to decrease in larger plant genomes with more repetitive elements (Tenaillon et al. 2002). Indeed, the A. lyrata genome has considerably more repetitive elements (Hu et al. 2011) and estimates of the proportion of constrained sites are lower than in A. thaliana (Figure 2.1). In contrast, approximately two-thirds of Drosophila noncoding sequence is estimated to be under purifying selection, implying a much lower proportion of selectively neutral sites in this group (Andolfatto 2005, Sella et al. 2009).

POSITIVE AND NEGATIVE SELECTION IN PLANTS

With many large datasets from a variety of plant species now available, it is becoming possible to assess the extent to which the patterns of selection in Arabidopsis are representative of other species. In Figure 2.2 we show estimates of the proportion of effectively neutral amino acid mutations, and indicate those species with significantly positive estimates of the proportion of amino acid divergence fixed by positive selection. These estimates are based on extensions of the McDonald-Kreitman test (McDonald and Kreitman 1991), and we have restricted our dataset to include only results obtained using the likelihood-based approach of Eyre-Walker and Keightley (2009). Although parameter estimates in many of these species are similar to those in Arabidopsis, there is considerable variation in the proportion of neutral amino acid sites across species (Figure 2.2), ranging from ~5% to ~30%. In addition, three species, Helianthus annuus (Strasburg et al. 2011), Capsella grandiflora (Slotte et al. 2010), and Helianthus petiolaris (Strasburg et al. 2011), show evidence for significant genome-wide positive selection (Figure 2.2). Interestingly, estimates of the proportion of selectively neutral sites are in some cases strongly divergent across close relatives (e.g. in Populus). This highlights that there is significant variation across plant species in the patterns and strength of selection. Do differences in estimates of selection across plant species reflect biological differences or statistical noise? In some cases, uncertainty in parameter estimates can be very high, particularly in systems with relatively small numbers of sampled loci, and differences in the choice of loci can also contribute to this uncertainty if selection parameters vary strongly across genes. For example, recent population genomic comparisons of genes with and without evidence of gene expression in Medicago truncatula have shown strong heterogeneity in the strength of purifying selection (Figure 2.2; Paape et al. 2013). Furthermore, differences in divergence time from the outgroup used can also affect estimates of positive selection (Keightley and Eyre-Walker 2012). However, comparisons among species at identical (Strasburg et al. 2011) or comparable (Slotte et al. 2010) loci, and with similar divergence times, have found significant between-species differences in positive and negative selection. We now consider several possible factors that may be contributing to this variation.
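
To make the quantity α more concrete, the sketch below computes the simple point estimate that follows directly from a McDonald-Kreitman table of nonsynonymous and synonymous divergence (Dn, Ds) and polymorphism (Pn, Ps). This is deliberately not the likelihood-based approach of Eyre-Walker and Keightley (2009) on which the estimates in Figure 2.2 are based; the function name and the counts are hypothetical and serve only to illustrate the logic.

```python
def mk_alpha(dn: int, ds: int, pn: int, ps: int) -> float:
    """Simple McDonald-Kreitman estimate of the proportion of
    nonsynonymous substitutions fixed by positive selection.

    Under strict neutrality Dn/Ds is expected to equal Pn/Ps, so
    alpha = 1 - (Ds * Pn) / (Dn * Ps) measures the excess of
    nonsynonymous divergence over that expectation.
    """
    if dn == 0 or ps == 0:
        raise ValueError("Dn and Ps must be non-zero")
    return 1.0 - (ds * pn) / (dn * ps)


if __name__ == "__main__":
    # Hypothetical counts pooled across loci.
    print("excess nonsynonymous divergence:", mk_alpha(dn=60, ds=100, pn=30, ps=100))
    print("excess nonsynonymous polymorphism:", mk_alpha(dn=30, ds=100, pn=45, ps=100))
```

Slightly deleterious amino acid variants that segregate in populations inflate Pn relative to Dn and therefore bias this simple estimate downward, sometimes to negative values. This is one reason why methods that jointly model the distribution of fitness effects are preferred for the comparisons made here.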

EFFECTIVE POPULATION SIZE

Effective population size (Ne) is a key parameter involved in determining between-species differences in the strength of selection. In particular, Ne describes the extent to which evolutionary change is caused by genetic drift; a lower Ne implies a greater effect of drift (Charlesworth 2009). Smaller populations will thus experience greater effects of drift relative to selection, and if a significant proportion of sites are subject to weak selection, as predicted by nearly neutral theory (Ohta 1992), slightly deleterious mutations will be more likely to segregate at higher frequencies and fix in small populations (Akashi et al. 2012). In addition, populations with small Ne are expected to have a lower probability of fixing beneficial mutations (Gossmann et al. 2011, Akashi et al. 2012, Gossmann et al. 2012). Differences in Ne have thus been suggested to have had important effects on the strength of positive and negative selection in many genomes, including humans, Drosophila, and mice (Eyre-Walker and Keightley 2009, Langley et al. 2012). In plants, there is considerable variation in levels of neutral diversity (Leffler et al. 2012), range size (Brown et al. 1996), and census population size (Ness et al. 2010), suggesting that there is also variation in Ne and that this may have important effects on patterns of selection in this group.
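The dependence of fixation on Ne described above can be illustrated with a short numerical sketch. It uses Kimura's diffusion approximation for the fixation probability of a new semidominant mutation, expressed relative to that of a neutral mutation. The sketch assumes that census and effective population sizes are equal and that |s| is small; the function name and the example values of Ne and s are illustrative assumptions, not estimates from the studies cited here.

```python
import math


def relative_fixation_rate(ne, s):
    """Fixation probability of a new mutation relative to a neutral one.

    Assumes Kimura's diffusion approximation with census size equal to Ne,
    semidominance, and small |s|, which gives S / (1 - exp(-S)) with
    S = 4 * Ne * s. Very large negative S would overflow math.expm1.
    """
    scaled_s = 4.0 * ne * s
    if scaled_s == 0.0:
        return 1.0  # neutral mutations fix at the neutral rate by definition
    # -expm1(-S) equals 1 - exp(-S), computed without loss of precision
    return scaled_s / -math.expm1(-scaled_s)


# Hypothetical comparison of a small and a large population
for ne in (1_000, 100_000):
    deleterious = relative_fixation_rate(ne, -1e-4)
    beneficial = relative_fixation_rate(ne, 1e-4)
    print(f"Ne={ne}: deleterious={deleterious:.3g}, beneficial={beneficial:.3g}")
```

Under these assumptions, a mutation with the same selection coefficient behaves almost neutrally in the smaller population but is strongly selected in the larger one, which is the nearly neutral expectation referred to in the text.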

Estimates of Ne from neutral polymorphism data have been shown to negatively correlate with estimates of the proportion of unconstrained sites across diverse organisms, including comparisons across plant species (Figure 2.2; Wright and Andolfatto 2008, Gossmann et al. 2010, Akashi et al. 2012). This implies that species with larger Ne generally experience stronger purifying selection. Although broad-scale comparisons may be complicated by changes in the strength of selection and assumptions about mutation rates and generation times used to estimate Ne (particularly for trees such as Pinus and Populus, Figure 2.2), comparisons across species within genera have also demonstrated an effect of Ne on the strength of purifying selection in sunflowers (Strasburg et al. 2011) and soft pines (Eckert et al. 2013; Figure 2.2). In addition, comparisons of different A. thaliana populations have shown that major-effect amino acid changes occur at a high frequency in smaller populations, consistent with an effect of increased genetic drift (Cao et al. 2011). While the number of species investigated is still small, the general pattern emerging from these studies is that variation in Ne has a considerable effect on the strength of purifying selection (Figure 2.2).

Similarly, studies have assessed the effects of Ne on the extent of positive selection (as estimated by α, the proportion of amino acid divergence fixed by positive selection) and have found evidence for more positive selection in species with large Ne. This was found, for example, in pairwise comparisons in Capsella and Arabidopsis (Slotte et al. 2010), across six sunflower and four lettuce species (Strasburg et al. 2011), and across seven diverse plant taxa (Gossmann et al. 2012). In contrast, no signs of positive selection, and no correlation with Ne, were found in soft pine species (Eckert et al. 2013).

While there is growing evidence for a significant effect of Ne on patterns of positive selection, there are several concerns regarding the interpretation of the correlation between Ne and α. First, the vast majority of point estimates of positive selection are negative, counter-intuitively implying fewer than zero positively selected substitutions. However, α is estimated by fitting a model of purifying selection to polymorphism data and calculating the excess nonsynonymous divergence above the model’s expectations (Eyre-Walker and Keightley 2009). A negative value therefore indicates that fewer than expected amino acid mutations have fixed, which could be caused by various factors including local adaptation on amino acids, balancing selection, and/or inaccurately accounting for purifying selection by assuming either an incorrect distribution of selective effects (Kousathanas and Keightley 2013) or an inaccurate demographic history. Given the large number of negative estimates, part of the correlation between α and Ne could thus result from model mis-specification (Eckert et al. 2013).

Second, it is notable that species in which there is evidence for genome-wide positive selection also have the lowest proportions of effectively neutral sites (Figure 2.2). Although this is expected if both are governed by differences in Ne, there is also an automatic effect from the measure of α; populations with more effectively neutral mutations will experience a higher fixation rate of non-adaptive mutations, bringing down the proportion of positively selected fixations without necessarily changing the rate of positive selection (Gossmann et al. 2010). However, studies that correct for this by measuring the rate of positive selection, rather than α, have reached similar conclusions (Strasburg et al. 2011, Gossmann et al. 2012). Nevertheless, the signal of positive selection can still be masked by high numbers of slightly deleterious mutations (Eyre-Walker and Keightley 2009, Gossmann et al. 2012) because the signal of excess between-species divergence due to positive selection can be ‘swamped out’ by neutral and nearly neutral fixations. Thus, there is an unfortunate anti-conservative loss of power to detect positive selection with decreasing Ne, and negative estimates of α may in part arise because of unaccounted-for purifying selection.

Third, estimates of positive selection using methods such as that of Eyre-Walker and Keightley (2009) are dependent on assumptions of constant population size since species divergence (McDonald and Kreitman 1991, Eyre-Walker 2002). Smaller past population sizes could have led to greater amino acid divergence than expected, possibly inflating estimates of α in species with large present-day Ne. However, evidence for greater signals of selective sweeps in genomic regions experiencing higher rates of protein evolution in Capsella grandiflora provides independent support for genome-wide positive selection (Josephs, R. Williamson and S. Wright, unpublished results).

Thus, there is consistent evidence for an effect of Ne on the strength of purifying selection, as well as evidence that some species with large Ne experience high rates of positive selection (Figure 2.2). However, inferences regarding the role of Ne on rates of positive selection should be treated with caution, as they have been limited by our ability to quantify adaptive substitution rates, particularly in species with small Ne.
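Two of the concerns raised in this section, negative point estimates and the automatic dependence of the adaptive proportion on the rate of non-adaptive fixation, can be illustrated with a minimal sketch. The first function is the simple count-based McDonald-Kreitman estimator, not the likelihood-based approach of Eyre-Walker and Keightley (2009) used for Figure 2.2. The second expresses the adaptive proportion as a share of the total nonsynonymous substitution rate. The function names, the symbols omega_a and omega_na, and all input values are illustrative assumptions.

```python
def mk_alpha(dn, ds, pn, ps):
    """Count-based MK estimate: alpha = 1 - (Ds * Pn) / (Dn * Ps).

    dn, ds: nonsynonymous and synonymous divergence counts.
    pn, ps: nonsynonymous and synonymous polymorphism counts.
    """
    if dn <= 0 or ps <= 0:
        raise ValueError("Dn and Ps must be positive")
    return 1.0 - (ds * pn) / (dn * ps)


def alpha_from_rates(omega_a, omega_na):
    """Adaptive proportion given adaptive (omega_a) and non-adaptive
    (omega_na) nonsynonymous substitution rates, both scaled by the
    synonymous rate."""
    return omega_a / (omega_a + omega_na)


# Hypothetical counts: an excess of nonsynonymous polymorphism relative
# to divergence produces a negative estimate.
print(mk_alpha(dn=50, ds=100, pn=20, ps=80))   # 0.5
print(mk_alpha(dn=20, ds=100, pn=40, ps=80))   # -1.5

# Same adaptive rate, more effectively neutral fixations: the proportion falls.
print(alpha_from_rates(0.05, 0.05))
print(alpha_from_rates(0.05, 0.15))
```

The second pair of calls holds the adaptive rate fixed and changes only the non-adaptive rate, which is why comparing rates of adaptive substitution rather than proportions avoids this automatic effect.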

POPULATION STRUCTURE, SELECTION ON STANDING VARIATION, AND POLYGENIC ADAPTATION

Another important contributor to variation in selection in plant populations is population subdivision. Plants vary widely in their extent of between-population and between-species gene flow (Morjan and Rieseberg 2004, Renaut et al. 2013), and this can have several important effects on selection. In addition to effects on Ne (Whitlock 2003), strong subdivision can slow the spread of advantageous mutations, and thereby weaken the effects of selective sweeps (Barton 2000, Kim and Maruki 2011). With strong subdivision, adaptive mutations may also be more likely to occur independently across distinct geographic locations, even when there is a global selection pressure (Ralph and Coop 2010). With environmental heterogeneity, adaptation may also often be local, as shown in plants repeatedly through reciprocal transplant experiments (Kawecki and Ebert 2004). Thus, we expect that species with more subdivided populations will have lower rates of range-wide positive selection. Given that the estimates of positive and negative selection shown in Figure 2.2 have been obtained primarily using ‘scattered’ samples across populations, there could be considerable undetected local positive selection due to population subdivision (Keller et al. 2012). In addition, local adaptive events may be wrongly inferred to be slightly deleterious variants in scattered population samples, because they will generate an excess of nonsynonymous polymorphism (Gossmann et al. 2010). One way to assess this is to apply McDonald-Kreitman tests to local population samples. Such tests have typically found estimates of positive and negative selection that are similar to species-wide samples (Foxe et al. 2008, Gossmann et al. 2010). However, migration events from other populations may still be eroding the signal of positive selection and inflating estimates of slightly deleterious mutations in within-population samples using this approach. A more powerful approach is to examine the distribution of between-population differentiation across genes, with the expectation that loci subject to local adaptation will have elevated between-population differentiation. Such studies have successfully identified loci and genomic regions involved in local adaptation, and the results have highlighted that signals of local adaptation can be considerably stronger than signals of species-wide selection (Turner et al. 2010, Hufford et al. 2012, Keller et al. 2012, Chen et al. 2012). However, these studies typically rely on comparisons of candidate loci with assumed ‘neutral’ loci or take an outlier approach, and are thus typically more useful for identifying underlying candidate targets of selection, such as flowering time genes (Keller et al. 2012, Chen et al. 2012) and genes important for heavy metal tolerance (Turner et al. 2010), than for quantifying and comparing rates of local adaptation.

Another possible source of undetected positive selection is adaptation involving many loci each with small effects on a phenotype and other forms of selection from standing genetic variation (“soft selective sweeps”; Chevin and Hospital 2008, Pritchard et al. 2010). Under such scenarios, the footprint of positive selection on neutral variation can be weak or absent even within populations, and may involve subtle allele frequency changes across many loci rather than high rates of differentiation and strong selective sweeps (Pritchard and Di Rienzo 2010, Le Corre and Kremer 2012). Such selection may be much more common than species-wide selection from new mutations and may be the predominant source of adaptive evolution in humans (Pritchard et al. 2010).
A potentially powerful approach for assessing signals of local adaptation that may arise from standing variation or polygenic selection is to test for correlations of single nucleotide polymorphism (SNP) frequencies with environment, under the prediction that consistent environmental changes will drive significant allele frequency changes (Coop et al. 2010). This has recently been applied to whole-genome polymorphism data from worldwide A. thaliana samples (Hancock et al. 2011) and correlations with several environmental variables were found. Additionally, loci with signals of environmental adaptation have been recently shown to exhibit excess neutral diversity, as expected if environmental heterogeneity is maintaining variation across the geographic range (Lee and Mitchell-Olds 2012). Significant environmental correlations have also been recently observed in other systems, including black spruce (Prunier et al. 2012) and loblolly pine (Eckert et al. 2010). Thus, for several plants showing little sign of species-wide selective sweeps, the potential for considerable local adaptation remains, although estimation and comparison of genome-wide rates of local adaptation is difficult partly because of possible residual confounding effects of population history (Coop et al. 2009), and because highly polygenic adaptation will be unlikely to show sufficient allele frequency differentiation to be detected (Le Corre and Kremer 2012).
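The differentiation-based scans discussed in this section can be sketched in a few lines. The example below computes a simple heterozygosity-based FST for biallelic loci in two equally weighted populations and flags loci in the upper tail of the empirical distribution. This is a deliberately naive outlier approach: it makes no correction for demographic history or sample size, which is one of the limitations noted above. The function names, the quantile cutoff, and the allele frequencies are hypothetical.

```python
def fst_two_pops(p1, p2):
    """FST = (HT - HS) / HT for one biallelic locus, two equal-sized demes.

    p1, p2: frequency of the same allele in each population.
    """
    p_bar = (p1 + p2) / 2.0
    h_total = 2.0 * p_bar * (1.0 - p_bar)
    if h_total == 0.0:
        return 0.0  # locus is monomorphic overall, no differentiation
    h_within = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0
    return (h_total - h_within) / h_total


def fst_outliers(freq_pairs, quantile=0.95):
    """Indices of loci whose FST is at or above an empirical quantile."""
    values = [fst_two_pops(p1, p2) for p1, p2 in freq_pairs]
    ranked = sorted(values)
    cutoff = ranked[min(len(ranked) - 1, int(quantile * len(ranked)))]
    return [i for i, v in enumerate(values) if v >= cutoff and v > 0.0]


# Hypothetical allele frequencies at four loci in two populations
loci = [(0.5, 0.5), (0.4, 0.6), (0.1, 0.9), (0.45, 0.55)]
print([round(fst_two_pops(a, b), 3) for a, b in loci])
print(fst_outliers(loci, quantile=0.75))
```

Because the cutoff is taken from the same loci being tested, a scan of this kind always returns some candidates, which is why such results are better suited to nominating targets of selection than to comparing rates of local adaptation.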

RECOMBINATION RATE

Recombination rate variation within and between species is another well-recognized factor influencing selection (Nachman 2002). High recombination rates are expected to cause selection to act more independently across sites in the genome, whereas low recombination can cause interference between both selected and neutral sites due to increased linkage (Hill and Robertson 1966). Neutral diversity and the efficacy of selection are thus expected to be reduced when recombination is low, and this can be caused by the fixation of beneficial mutations (“selective sweeps”; Maynard Smith and Haigh 2007), and the elimination of deleterious mutations (“background selection”; Charlesworth et al. 1993). The effects of selective sweeps and background selection thus depend strongly on the rate of recombination and are expected to generate a positive correlation between recombination rate and both nucleotide diversity and the efficacy of natural selection. In agreement with this expectation, diversity has been found to be significantly reduced in genomic regions with low rates of recombination, including centromeres (Carneiro et al. 2008), telomeres (Savage et al. 2005), and Y chromosomes in animals (Hellborg 2003, Bachtrog and Charlesworth 2002). Indeed, variation in the rate of recombination is probably a major determinant of diversity throughout Drosophila genomes, where there is evidence that polymorphism increases with crossing over rates in D. melanogaster (Andolfatto 2007), D. simulans (Begun et al. 2007), and D. pseudoobscura (Noor 2008). Similarly, there is evidence for a positive correlation between recombination rate and nucleotide diversity across the human genome (Lercher and Hurst 2002). Furthermore, several studies in Drosophila have provided evidence for a reduced efficacy of both purifying and positive selection in regions of low recombination, as expected if these regions experience more selective interference (Betancourt and Presgraves 2002, Campos et al. 2012).

Unlike in animals, a major effect of variation in recombination rate on sequence diversity was initially difficult to establish in plants, despite much interest and study (Baudry et al. 2001, Tenaillon et al. 2002, Nordborg et al. 2005, Wright et al. 2006, Haudry et al. 2008). For example, there is little evidence for a correlation between recombination rate and neutral diversity in A. thaliana or its outcrossing relatives, and low-recombining regions close to centromeres in fact show evidence for elevated diversity (Nordborg et al. 2005, Wright et al. 2006). In wild tomato, there is some evidence for a positive correlation between recombination and polymorphism, although the relationship is weak (Baudry et al. 2001). Although early work in maize did not indicate a positive correlation between recombination and diversity (Tenaillon et al. 2002), more recent whole-genome polymorphism studies found a significant positive correlation (Gore et al. 2009). Diversity is also significantly positively correlated with recombination rates in Medicago truncatula (Branca et al. 2011). Finally, recombination has been shown to correlate positively with the efficacy of purifying selection in the Triticeae (Escobar et al. 2010), and strongly reduced diversity and relaxed selection have been observed in the young Y chromosomes of a few plant species (Bergero and Charlesworth 2011, Chibalina and Filatov 2011). Thus, while a growing number of studies have seen an effect of recombination rate on diversity and selection, this pattern is not universal.
There are several possible explanations for the lack of association between recombination and diversity in some plant genomes, and an emerging picture is that gene density plays an important role. For example, in self-fertilizing A. thaliana, both recombination rates and gene density vary across chromosomes (Singer et al. 2006), and although there is little evidence that diversity correlates with recombination rate (Nordborg et al. 2005, Wright et al. 2006), it has been found to correlate significantly with gene density (Schmid 2004, Nordborg et al. 2005). Similarly, using SNP data from domesticated and wild rice, Flowers et al. (2012) did not find a positive correlation between nucleotide diversity and recombination rate, but they did find a significant negative correlation between diversity and gene density. Evidence for lower recovery of polymorphism following domestication in gene-rich regions of maize (Hufford et al. 2012) is also consistent with this effect. Finally, whereas recombination and diversity are significantly correlated in Medicago, there is a stronger negative correlation with gene density (Branca et al. 2011). The studies above highlight that, in addition to recombination rate, the density of selected mutations has important effects on the strength of interference. This predicts that linked selective effects should most strongly affect diversity in regions with both high gene density and low recombination. Consistent with this, recent studies in Caenorhabditis have found that selection in gene-dense, low-recombination regions significantly affects nucleotide diversity, which is both positively correlated with recombination rate and negatively correlated with gene density (Cutter and Choi 2010). In contrast, regions of low recombination in Arabidopsis are particularly gene-poor, which likely erodes the correlation between recombination and diversity (Nordborg et al. 2005, Wright et al. 2006).
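The joint effect of gene density and recombination described above can be made concrete with a standard background-selection approximation from the theoretical literature, which is not one of the analyses reported in this chapter. For a neutral site in the middle of a region with uniform recombination and uniformly distributed deleterious mutations, expected diversity relative to the neutral expectation is approximately exp(-U / (2sh + R)), where U is the diploid deleterious mutation rate of the region, sh is the heterozygous selection coefficient, and R is the map length of the region in Morgans. Treating U as a proxy for gene density is an assumption of this sketch, as are the function name and all parameter values.

```python
import math


def background_selection_b(u_diploid, sh, map_length):
    """Relative neutral diversity pi/pi0 = exp(-U / (2*sh + R)).

    u_diploid: deleterious mutation rate of the region per diploid genome.
    sh: selection coefficient against heterozygotes (must be positive).
    map_length: total map length of the region in Morgans.
    """
    if u_diploid < 0 or sh <= 0 or map_length < 0:
        raise ValueError("require U >= 0, sh > 0 and R >= 0")
    return math.exp(-u_diploid / (2.0 * sh + map_length))


# Hypothetical regions: U stands in for gene density, R for recombination
for label, u, r in [
    ("gene-poor, low recombination", 0.01, 0.01),
    ("gene-dense, high recombination", 0.10, 1.00),
    ("gene-dense, low recombination", 0.10, 0.01),
]:
    print(f"{label}: B = {background_selection_b(u, 0.01, r):.3f}")
```

Under these assumptions a low-recombination region loses little diversity if it carries few selected sites, which is the pattern suggested for the gene-poor pericentromeric regions of Arabidopsis.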

MATING SYSTEM

In addition to recombination rate heterogeneity across the genome, the extensive mating system variability in plants (Barrett 2002) can lead to important between-species differences in effective rates of recombination. In particular, inbreeding populations are expected to have higher homozygosity and therefore lower effective rates of recombination (Nordborg 2000). This is expected to cause elevated levels of linkage disequilibrium, which should in turn reduce genetic diversity and the efficacy of selection (reviewed in Glémin et al. 2006). An effective way to study the consequences of inbreeding on selection and diversity patterns is therefore to compare DNA sequences from closely related species that differ in rates of self-fertilization, and studies that have used this approach have found clear evidence for reduced diversity in selfing species. For example, studies have found reduced nucleotide diversity in selfing lineages of Arabidopsis (Nordborg et al. 2005, Ross-Ibarra et al. 2008), Capsella (Foxe et al. 2009), Leavenworthia (Liu et al. 1998, Charlesworth and Yang 1998), Eichhornia (Ness et al. 2010), Mimulus (Sweigart and Willis 2003), Lycopersicum (Baudry et al. 2001, Roselius 2005), Collinsia (Hazzouri et al. 2012), and Clarkia (Pettengill and Moeller 2012). The patterns that emerge generally suggest a greater reduction in diversity than can be explained simply by including high levels of inbreeding in a neutral model, indicating that selective interference may play an important role, although demographic factors are also likely contributing.

In contrast with what is seen with regard to patterns of diversity, early studies using small numbers of loci and comparisons of relative rates of nucleotide substitution provided little sign of reduced efficacy of natural selection in selfing species (Wright et al. 2002, Haudry et al. 2008, Escobar et al. 2010). However, more recent genome-wide studies have found evidence for reduced efficacy of selection on nonsynonymous sites in selfing Capsella (Slotte et al. 2013), Eichhornia (Ness et al. 2012), and Collinsia (Hazzouri et al. 2012). Furthermore, recent analyses have suggested a decline in the efficacy of selection on codon usage bias in selfing Arabidopsis and Capsella compared with their outcrossing congeners (Qiu et al. 2011). Given that these comparisons include very recently derived selfing lineages (e.g., in Eichhornia and Capsella), this indicates that detectable shifts in the efficacy of selection can happen rapidly. There is less evidence for a reduced efficacy of positive selection in selfing lineages, however, and this may not be expected if a large proportion of beneficial mutations are recessive, which can fix more rapidly in selfing populations (Glémin 2007), as discussed below. Nevertheless, the plant species in which there is evidence for positive selection are all highly outcrossing (Figure 2.2), and a pairwise comparison of A. thaliana with the closely related obligately outcrossing C.
grandiflora indicates a significant decline in the efficacy of positive selection in selfing species (Slotte et al. 2010).

The above patterns suggest that recombination rate variation caused by mating system evolution is playing an important role in driving heterogeneity in diversity and selection, but it is difficult to distinguish the roles of linked selection from demographic history and population structure. For example, in colonizing selfing species such as A. thaliana and E. paniculata, population bottlenecks and increased subdivision associated with mating system differences may be a major cause of reduced effective population sizes (Ness et al. 2010). Furthermore, founder events associated with shifts to selfing (Foxe et al. 2009) can also be important factors reducing diversity and the efficacy of selection. While this does not rule out an important role for genetic hitchhiking, it highlights that demographic effects may also be major contributors to patterns of selection in selfing species.

In addition to the effects of selfing on recombination rate and demographic history, the dynamics of positive selection in selfers are also expected to be influenced by high levels of homozygosity (Glémin and Ronfort 2013). In particular, selfers are more likely to fix recessive beneficial mutations and less likely to experience selection on standing variation (Charlesworth 1992, Glémin 2007, Glémin and Ronfort 2013). Recent evidence that self-fertilizing crops have higher numbers of recessive QTL contributing to phenotypic evolution is consistent with this prediction (Ronfort and Glémin 2012).
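The neutral expectations for selfing that serve as the baseline in this section can be written down directly. With selfing rate sigma, the equilibrium inbreeding coefficient is F = sigma / (2 - sigma), the effective recombination rate is approximately r(1 - F), and the effective population size is N / (1 + F). The sketch below simply evaluates these standard results; the function names and example values are illustrative assumptions.

```python
def inbreeding_coefficient(selfing_rate):
    """Equilibrium F under partial selfing: F = sigma / (2 - sigma)."""
    if not 0.0 <= selfing_rate <= 1.0:
        raise ValueError("selfing rate must lie between 0 and 1")
    return selfing_rate / (2.0 - selfing_rate)


def effective_recombination(r, selfing_rate):
    """Effective recombination rate, approximately r * (1 - F)."""
    return r * (1.0 - inbreeding_coefficient(selfing_rate))


def effective_size(n, selfing_rate):
    """Neutral effective population size under selfing, N / (1 + F)."""
    return n / (1.0 + inbreeding_coefficient(selfing_rate))


# Hypothetical outcrosser, mixed mater, and near-complete selfer
for sigma in (0.0, 0.5, 0.98):
    print(
        f"sigma={sigma}: F={inbreeding_coefficient(sigma):.3f}, "
        f"r_eff/r={effective_recombination(1.0, sigma):.3f}, "
        f"Ne/N={effective_size(1.0, sigma):.3f}"
    )
```

Inbreeding alone can at most halve Ne, whereas effective recombination approaches zero as selfing becomes complete. Reductions in diversity well beyond twofold therefore point to linked selection, demography, or both, as argued above.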

PLOIDY

Another major source of variation expected to have important consequences for both positive and negative selection is differences in ploidy across species and across the genome. Because of changes in both the effective population size and the extent of masking of mutations, differences in ploidy are predicted to have important effects on rates of positive selection, the efficacy of purifying selection, and the mutation load, as discussed by Otto and Whitton (2000). For example, species with higher ploidy levels may have lower mutation loads following polyploidization, but over time deleterious mutations in their genomes may reach higher frequencies due to the greater effects of masking. Similarly, if most beneficial mutations are recessive, masking can cause the rate of positive selection to be reduced in species with higher ploidy levels. On the other hand, dominant beneficial mutations and selection from standing variation could lead to a greater extent of positive selection. The effects of ploidy on positive and negative selection thus depend in important ways on parameters such as dominance coefficients and the time since changes in ploidy.

In plants, both species-level and gene-level differences in ploidy are likely to be important factors driving differences in selection for two reasons. First, extensive gametophytic expression (Borg et al. 2009) means that a subset of genes will be expressed in the haploid phase, potentially contributing to differences in the efficacy of positive and negative selection across the genome. Evidence for greater selection on intron length in genes expressed in pollen (Seoighe et al. 2005) is consistent with this hypothesis. Furthermore, there is evidence that Y-chromosome degeneration may be significantly reduced at genes expressed in the haploid phase, perhaps contributing to a lower extent of relaxed selection in plant compared with animal Y chromosomes (Chibalina and Filatov 2011). In addition, recent genome-wide estimates of positive and negative selection in C. grandiflora suggest that pollen-expressed genes show higher rates of both positive and negative selection (Arunkumar et al. 2013). Second, polyploidy is common in plants (Bowers et al. 2003) and is associated with 15% of angiosperm and 30% of fern speciation events (Wood et al. 2009). Indeed, many of the model plant groups that have been used to study selection at the molecular level have undergone ancient polyploidization at various time points (Jiao et al. 2011), including Arabidopsis (The Arabidopsis Genome Initiative 2000, Ku et al. 2000), wheat (Brenchley et al. 2012), maize (Messing 2009), and several species in the grass family (Levy and Feldman 2002). Furthermore, patterns of duplicate gene divergence suggest that most plants have undergone multiple rounds of whole genome duplication in their past (Blanc and Wolfe 2004). This means that differences in the extent and timing of polyploidization events may be important contributors to differences among species in the efficacy of selection.

To date, most studies examining selection associated with polyploidization have focused on cases of disomic inheritance caused by ancient allopolyploidization between species. In this case, polyploidization has led to gene duplication across the entire genome.
In these systems, one signal of relaxed purifying selection following polyploidization is the ongoing silencing and loss of gene duplicates (Lynch and Conery 2000). Evidence from several systems suggests that gene duplicate retention and loss is not random with respect to biological function. In particular, retained gene duplicates are often dosage-dependent and involved in transcription and signal transduction. This pattern is seen across independent polyploidization events, including Arabidopsis (Blanc and Wolfe 2004), yeast (Seoighe and Wolfe 1999), and Paramecium (Aury et al. 2006), indicating that gene dosage is an important factor determining the extent to which genes experience relaxed selection following duplication (Veitia 2004, Freeling and Thomas 2006, Birchler and Veitia 2009, Hakes et al. 2007; but see Barker et al. 2008). The extent of gene retention and loss in polyploid lineages is expected to be an ongoing process reflecting the timing of genome duplication. For example, approximately 72% of duplicate genes have been retained over 11 MY following polyploid formation in maize (Gaut and Doebley 1997). In contrast, in Arabidopsis, three older rounds of whole genome duplication events have occurred (>50 MYA; Beilstein et al. 2010), with considerably fewer retained duplicates (less than 17%; Maere et al. 2005). This suggests that the maize genome may still be in the early stages of diploidization, with many duplicate genes experiencing relaxed selection prior to loss. This example indicates that differences across species in the timing of the most recent polyploidization event can have major effects on the proportion of effectively neutral substitutions and on rates of positive selection. This could possibly contribute to a lack of evidence for positive selection and less purifying selection in maize than other outcrossing species with large Ne (Figure 2.2).

It is also likely that, in addition to the direct effects of dominance, rates of beneficial mutation are affected by changes in ploidy. Following polyploidization, there is clear evidence that extensive genome-wide structural rearrangements can occur (Song et al. 1995), and this may lead to changes in gene expression and increased genomic variability in polyploid populations (Osborn et al. 2003). With the relaxed selective constraint following genome duplication combined with a greater mutational input of beneficial mutations, positive selection for novel functions in gene duplicates may be increased (Walsh 1995, Otto and Whitton 2000, Otto and Yong 2002). In A. thaliana, there is evidence for extensive divergence and asymmetric rates of protein sequence evolution among ancient duplicate gene pairs (Blanc and Wolfe 2004, Hu et al. 2012), suggesting selection has been involved in driving functional divergence. In some cases, studies have also identified a significant signal of positive selection on duplicates (Wendel 2000, Ho-Huu et al. 2012), and these results are consistent with a model in which the initial relaxation of selection caused by gene duplication is followed by functional divergence driven by positive selection (Innan and Kondrashov 2010, Ho-Huu et al. 2012).


Much less has been documented about the dynamics of positive and negative selection in autotetraploids with tetrasomic inheritance. In tetraploid maize, no effect of ploidy on either the effective population size or selection was found (Tiffin and Gaut 2001). In the autotetraploid A. arenosa, the effective population size and the efficiency of selection appear to be higher than those of its diploid relatives (Hazzouri et al. 2008; Hollister et al. 2012), and this species appears to have experienced high rates of positive selection on genes involved in meiosis, possibly as a consequence of the genetic challenges associated with the shift in ploidy (Hollister et al. 2012).

CONCLUSIONS AND FUTURE DIRECTIONS

The results to date suggest that plant populations vary greatly in the strength and genetic basis of positive and negative selection, and that differences in effective population size, subdivision, mating system, recombination, and ploidy are playing important roles in determining this variation. However, recent results have also highlighted the difficulty of empirically distinguishing between the effects of different population genetic parameters in driving patterns of selection. In particular, the strongest signals of purifying and positive selection in plant genomes have been found in predominantly outcrossing species that have large effective population sizes, limited population subdivision, and no recent whole-genome duplication events, making the relative importance of each of these factors unclear. Furthermore, characterizing rates of positive selection in species with small Ne and strong population subdivision remains challenging. To address some of these concerns, future large-scale genus and family-wide studies of positive and negative selection will be important. By acquiring a rich comparative dataset of carefully chosen closely related species, the relative and joint importance of changes in mating system, effective population size, ploidy, and subdivision can be examined more quantitatively. Furthermore, examining how changes such as shifts to selfing and polyploidy influence the strength of selection as a function of time since the transition will be particularly important. Future studies utilizing model-based approaches that integrate phylogenetics and population genomics will be important to quantify and compare changes in positive and negative selection across different lineages. Although such comparisons of selection across species are difficult at present, recent work suggests that progress might be made by integrating large-scale comparative population genomics datasets with models that incorporate the effects of changes in demography, ploidy, and recombination.

ACKNOWLEDGMENTS

We are very grateful to A. Eckert, T. Gossmann, P. Tiffin, L. Rieseberg, and S. Renaut for sharing their unpublished manuscripts, T. Gossmann for discussion, and S.P. Otto for helpful comments on the manuscript.

CHAPTER 3

GENETIC DEGENERATION OF OLD AND YOUNG Y CHROMOSOMES IN THE FLOWERING PLANT RUMEX HASTATULUS

This chapter resulted from collaboration with Jesse D. Hollister, Wei Wang, Spencer C.H. Barrett, and Stephen I. Wright. It is published in Proceedings of the National Academy of Sciences, USA, 2014, 111:7713–7718.

SUMMARY

Heteromorphic sex chromosomes have originated independently in many species, and a common feature of their evolution is the degeneration of the Y chromosome, characterized by a loss of gene content and function. Despite being of broad significance to our understanding of sex chromosome evolution, the genetic changes that occur during the early stages of Y-chromosome degeneration are poorly understood, especially in plants. Here, we investigate sex chromosome evolution in the dioecious plant Rumex hastatulus, in which X and Y chromosomes have evolved relatively recently, and occur in two distinct systems: an ancestral XX/XY system, and a derived XX/XY1Y2 system. This provides a unique opportunity to investigate the effect of sex chromosome age on patterns of divergence and gene degeneration within a species. Despite recent suppression of recombination and low X-Y divergence in both systems, we find evidence that Y-linked genes have started to undergo gene loss, causing ~28% and ~8% hemizygosity of the ancestral and derived X chromosomes, respectively. Furthermore, genes remaining on Y chromosomes have accumulated more amino acid replacements, contain more unpreferred changes in codon usage, and exhibit significantly reduced gene expression compared to their X-linked alleles, with the magnitude of these effects greatest for older sex-linked genes. Our results provide evidence for reduced selection efficiency and ongoing Y-chromosome degeneration in a flowering plant, and the comparison of old and young sex-linked genes within a species indicates that Y-degeneration can occur soon after recombination suppression between sex chromosomes.


INTRODUCTION

Systems of sex determination involving X and Y chromosomes have evolved multiple times in both plants and animals, with Y chromosomes in many species having lost much of their genetic function (Charlesworth 1996; Charlesworth and Charlesworth 2000; Bachtrog 2013). Evidence of DNA sequence homology between X- and Y-linked gene pairs in flowering plants (Liu et al. 2004; Filatov 2005; Yin et al. 2008; Spigler et al. 2008) and fish (Peichel et al. 2004) supports the idea that sex chromosomes have evolved from autosomes and subsequently diverged following the suppression of recombination between genes involved in sex determination. Evolutionary models predict that when regions of suppressed recombination evolve on Y chromosomes, the associated reduction in the effectiveness of selection should lead to a pattern of Y-chromosome degeneration in which genes carried on the Y become impaired in function and are eventually lost (Charlesworth 1996; Charlesworth and Charlesworth 2000; Bachtrog 2013). The well-studied Y chromosomes in humans and Drosophila melanogaster, for example, show clear signs of degeneration: they almost completely lack homology to the X chromosome, exhibit a highly heterochromatic chromatin structure consisting largely of repetitive and ampliconic DNA, and carry few remaining protein-coding genes (Kaminker et al. 2002; Carvalho et al. 2003; Hughes et al. 2010, 2012). Recent genomic studies of sex chromosomes in humans, rhesus macaques, and chimpanzees (Hughes et al. 2010, 2012) have provided detailed information regarding the genetic structure and gene content of Y chromosomes, shedding light on the processes contributing to their deterioration. However, we still know very little about the changes that characterize the early stages of Y-chromosome degeneration, or the time scales over which these occur.
This is because sex chromosomes in these well-studied mammalian species evolved >200 million years ago (Lahn and Page 1999; Ross et al. 2005), and therefore provide few clues about their early evolutionary history. Genomic studies of younger plant Y chromosomes (Bergero and Charlesworth 2011; Chibalina and Filatov 2011; Gschwend et al. 2012; Wang et al. 2012) and Drosophila neo-Y chromosomes (Bachtrog 2006; Kaiser et al. 2011; Zhou and Bachtrog 2012a, b), where degeneration is in progress, thus provide excellent opportunities to gain insight into the early processes involved in sex chromosome divergence.


Here, we investigate X and Y chromosome evolution in the annual, dioecious plant Rumex hastatulus (Polygonaceae). Sex chromosomes in R. hastatulus represent an interesting case of the recent evolution of sex chromosome heteromorphism, with age estimates based on nuclear and chloroplast phylogenies suggesting that sex chromosomes evolved within the last 15-16 million years (Navajas-Pérez 2005). The presence of a neo-Y sex chromosome system (XX/XY1Y2), recently derived from an XX/XY system following a fusion of the X chromosome and a former autosome (Smith 1964), provides a unique opportunity to contrast patterns of sex chromosome evolution between different sex chromosome systems, and to investigate the effect of sex chromosome age on patterns of divergence and degeneration within a species. We used high-throughput transcriptome sequencing of multiple parent-offspring families, and an analysis of single nucleotide polymorphism (SNP) segregation patterns, to identify and compare the expression and molecular evolution of sex-linked genes, with the aim of determining whether Y-linked genes are accumulating deleterious mutations, exhibit reduced expression, or have undergone gene loss.

RESULTS AND DISCUSSION

We identified genes linked to sex chromosomes by tracing the inheritance of SNPs from parents to F1 progeny in two crosses, one from each sex chromosome system (XX/XY and XX/XY1Y2). In particular, we identified genes in which SNPs segregated in a manner characteristic of sex linkage, with Y-alleles transmitted from fathers strictly to sons, and X-alleles from fathers strictly to daughters, a method that has been validated in previous studies (Bergero and Charlesworth 2011; Chibalina and Filatov 2011). This approach allowed us to identify 698 genes with ≥ 4 sex-linked SNPs in XX/XY populations, and 1298 such genes in XX/XY1Y2 populations (Table 3.1; Appendix 3.1, Table A3.1.1). Approximately 70% of sex-linked genes from the XY system were also identified in the XY1Y2 system, whereas about 40% of genes in the XY1Y2 system were shared with the XY system. This suggests that the XY1Y2 system has acquired many new sex-linked genes since the fusion event, and our analysis allowed us to identify a set of 488 ‘old’ sex-linked genes that are shared between the systems, as well as 607 ‘young’ genes unique to the XY1Y2 system.


Cytological measurements of X chromosome size in R. hastatulus suggest that the X is approximately 20% of the diploid female genome for the XY system, and about 30% of the genome in the XY1Y2 system (Smith 1964). Using the estimated number of genes reported in other dicotyledonous plants [e.g., 28,000 in Arabidopsis thaliana (Yamada et al. 2003)], we obtained a rough estimate of the expected number of sex-linked genes of 5600 and 8400 for the XY and XY1Y2 systems, respectively. Our screen for sex-linked genes using segregating polymorphisms in expressed genes therefore captures approximately 13% and 15% of the total number of sex-linked genes for the XY and XY1Y2 systems, respectively. Because some of our candidate sex-linked genes may be in a pseudoautosomal region and therefore partially recombining with the sex-determining region, we independently sequenced transcriptomes from a single male and female from each of six populations per sex chromosome system of R. hastatulus and checked for the presence of fixed differences between males and females (Table 3.1; Appendix 3.1, Table A3.1.2 and Table A3.1.3). This approach led to the validation of approximately 80% of the sex-linked genes from the XY system, 90% of the sex-linked genes from the XY1Y2 system shared with the XY system, but only 28% of the ‘young’ XY1Y2 genes. This suggests that fewer variants have fixed between the neo-sex chromosomes, potentially due to ongoing recombination or its very recent suppression. For subsequent analyses, we excluded genes that did not show fixed differences between X and Y, as well as a small number of genes showing one or more SNPs displaying autosomal segregation (Appendix 3.1, Table A3.1.4).
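The rough capture-rate estimate described in this passage (the X chromosome's share of the genome multiplied by an assumed total gene count) can be reproduced with a few lines of Python. This is only an illustrative sketch: the 28,000-gene figure is the Arabidopsis thaliana value borrowed in the text, and the function and variable names are hypothetical.

```python
# Rough expected number of sex-linked genes and the fraction captured by the screen.
GENOME_GENES = 28_000  # assumed total gene count (Arabidopsis thaliana value used in the text)

# system name -> (X chromosome fraction of the genome, sex-linked genes identified)
SYSTEMS = {
    "XY": (0.20, 698),
    "XY1Y2": (0.30, 1298),
}


def capture_rate(x_fraction, identified, genome_genes=GENOME_GENES):
    """Return (expected sex-linked genes, fraction of them identified)."""
    expected = x_fraction * genome_genes
    return expected, identified / expected


for name, (fraction, found) in SYSTEMS.items():
    expected, rate = capture_rate(fraction, found)
    print(f"{name}: expected ~{expected:.0f} sex-linked genes, captured {rate:.1%}")
```

Because the expected counts scale linearly with the assumed genome-wide gene number, a different gene count for R. hastatulus would shift both capture rates proportionally.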

TABLE 3.1: Numbers of identified sex-linked genes in Rumex hastatulus

Gene set            Sex-linked genes with      Hemizygous    Percent
                    Y-linked copies [3]        genes         hemizygous [4]
XY system           698 (565)                  119           24%
XY1Y2 shared [1]    510 (460)                  100           28%
XY1Y2 unique [2]    788 (223)                  44            8%

1. Shared genes represent genes in the XX/XY1Y2 system that were also identified in the XX/XY system.

2. Unique genes represent genes identified as unique to the XX/XY1Y2 system.


3. Numbers indicate genes with at least four supporting SNPs showing sex-linked segregation and having no SNPs with autosomal segregation. Values in parentheses identify the numbers of genes with at least one fixed X-Y difference in the population sample.

4. Estimates of percent hemizygous genes were calculated by comparing the number of hemizygous genes to the number of X/Y genes that had at least

Phylogenetic relationships and evolutionary divergence of sex-linked genes

To investigate relatedness and levels of divergence of sex-linked genes, we obtained additional transcriptome sequence data and identified orthologous sequences from the closest known non-dioecious outgroup that lacks sex chromosomes, Rumex bucephalophorus (Navajas-Pérez 2005). For each gene, we developed a novel maximum likelihood approach to infer the phased X and Y sequences from both sex chromosome systems. We confirmed the reliability of our phasing method using simulations (Appendix 3.1, Figure A3.1.1 and Figure A3.1.2), and constructed phylogenetic trees of these sequences, including the outgroup (see Methods, Appendix 3.1). Of 354 ‘old’ sex-linked genes, 150 (42%) X-alleles were monophyletic from the two sex chromosome systems, while 179 (51%) Y-alleles were monophyletic (Fisher’s exact test, P<0.04), consistent with the origins of these Y-linked genes predating the divergence of the two sex chromosome systems. Overall, only 78 (22%) exhibited complete reciprocal monophyly for both X and Y between the systems, highlighting the very recent divergence between the sex chromosome systems and indicating that a significant proportion of even the ‘old’ genes may have experienced a recent restriction of recombination. Consistent with this, maximum likelihood estimates of synonymous substitution rates (Ks) for both young and old X- and Y-linked genes (Figure 3.1) suggest that the majority have low levels of nucleotide divergence, implying that these genes are in an early stage of divergence.
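As an illustration of the comparison of monophyly counts reported above (150 of 354 X-alleles versus 179 of 354 Y-alleles), the following sketch implements a two-sided Fisher's exact test using only the Python standard library. The function name and the choice to sum all tables no more probable than the observed one are assumptions of this sketch; the software actually used for the test is not stated in the text.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell with margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lowest, highest = max(0, col1 - row2), min(row1, col1)
    p_observed = prob(a)
    # Sum the probabilities of every table no more likely than the observed one
    # (a small tolerance guards against floating-point ties).
    return sum(
        p
        for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )


# Rows: X-alleles, Y-alleles; columns: monophyletic, not monophyletic.
p_value = fisher_exact_two_sided(150, 354 - 150, 179, 354 - 179)
print(f"two-sided P = {p_value:.3f}")
```

With equal row totals the hypergeometric distribution is symmetric, so the two-sided value is simply twice the one-sided tail.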


FIGURE 3.1: Synonymous site divergence in sex-linked genes of the XY1Y2 system of Rumex hastatulus. Maximum likelihood estimates of lineage-specific rates of per-site synonymous substitution are shown for the (A) X chromosome, and (B) Y chromosome. ‘Old’ sex-linked genes refer to genes that are shared between the ancestral XY system and the derived XY1Y2 system. ‘Young’ sex-linked genes refer to those that are unique to the derived XY1Y2 system.

It is difficult to infer from these data whether there is a pattern of ‘evolutionary strata’, which has been found in animal and plant sex chromosomes (e.g., Lahn and Page 1999; Bergero et al. 2007) and is characterized by a stratified increase in divergence of X/Y genes with increasing distance from the pseudo-autosomal region. We did find a range of Ks values for sex-linked genes within each system, which may reflect that recombination suppression occurred at different times for different genes (which is thought to be the underlying cause of strata). However, we do find strong and significant differences in average branch-specific Ks when comparing old versus young X-linked genes (0.00870 and 0.00276, respectively; P << 10^-10), and old versus young Y-linked genes (0.0120 and 0.00297, respectively; P << 10^-10), with the younger sets showing more left-shifted Ks distributions and much lower average Ks (Figure 3.1). Overall, these results highlight that there has been very little sequence divergence for ‘young’ sex-linked genes, whereas the older genes likely include both genes that have experienced recent restricted recombination, either prior to or following the divergence between sex-chromosome systems, and those that have been non-recombining for a much longer period.

Y chromosome gene loss and loss of expression

The relatively recent evolution of recombination suppression and generally low sequence divergence between many genes on R. hastatulus sex chromosomes raises the question of whether Y-linked genes have been lost, or have lost expression relative to X-linked genes. Gene loss has occurred extensively on human and Drosophila Y chromosomes (reviewed in Bachtrog 2013), and might be driven by adaptive silencing of Y-linked genes to mask their deleterious effects (Zhou and Bachtrog 2012; Orr and Kim 1998), or more passively as a consequence of harmful mutations occurring in regions that are essential for gene function (Hill and Robertson 1966; Charlesworth 1978). We inferred the amount of gene loss in R. hastatulus by quantifying the percentage of X-linked genes in which SNP segregation patterns indicated hemizygosity in males (Table 3.1 and see Appendix 3.1). Estimates of hemizygosity based only on mRNA sequence data will include genes that have been lost, genes with non-functional (non-expressed) Y-linked copies, and genes that have moved from autosomes to the X chromosome but do not have homologous copies on the Y. We note that hemizygosity could conceivably be incorrectly inferred using our RNAseq-based approach in cases where X-linked genes have Y-linked copies whose expression is too low to be detected. Such genes would indicate partial Y-degeneration rather than genuine gene loss. By comparing the number of hemizygous genes with the number of X/Y genes with equivalent segregating X-linked polymorphisms (see Appendix 3.1), we estimate

that the percentage of genes lost from the R. hastatulus Y chromosome is as high as 28% (Table 3.1; Appendix 3.1, Table A3.1.5). We also found that estimates of hemizygosity in XY1Y2 males were much lower (8%) than in XY males (Table 3.1), which is expected since the XX/XY1Y2 sex chromosome system has acquired additional X/Y gene pairs, with very little time for gene degeneration and loss. Our estimates of percent hemizygosity, although low in comparison to mammalian sex chromosomes (where ~97% of the X is hemizygous in males; Skaletsky et al. 2003; Ross et al. 2005), are somewhat higher than other estimates from plants (~20% in Silene latifolia; Bergero and Charlesworth 2011; Chibalina and Filatov 2011), and suggest that Y chromosomes in R. hastatulus may already have undergone some gene loss despite their relatively recent origin.

We tested for a reduction in expression of young and old Y-linked genes by comparing the ratio of Y/X gene expression in males. Expression was estimated by counting the number of mRNA transcript reads mapping to X/Y SNPs in contigs with ≥ 4 such SNPs segregating in F1 offspring. Because Y-linked alleles in our segregation analysis are identified as alternate alleles at heterozygous sites (with the X-allele as reference), it is important to evaluate the extent of the reduction in the Y/X expression ratio by comparing it to the expression ratio of alternate-to-reference alleles at heterozygous sites throughout the genome. This is important because there is an inherent bias toward mapping more reference than alternate alleles (Degner et al. 2009), and not controlling for this would generate a false signal of lower Y expression, or exaggerate signals of a truly reduced expression. We therefore tested for reductions in Y/X expression ratios by using the alternate/reference expression ratio in autosomes as the null expectation.
Our analyses indicated an overall trend of reduced Y-expression relative to X-linked alleles for both old and young categories (and similar results were obtained when comparing to the full set of genes from the XY system; Appendix 3.1, Figure A3.1.3), with the effect being markedly stronger for older Y-linked genes (median = 0.79; Wilcoxon test W = 1093796, P << 10^-10; Figure 3.2) than for the younger category (median = 0.90; Wilcoxon test W = 495511.5, P = 0.0267; Figure 3.2; Appendix 3.1, Figure A3.1.3). The overall pattern suggests that Y-linked genes that spend more time in the non-recombining regions are more likely to show functional deterioration. That the younger sex-linked genes also show a significant reduction in Y/X ratio indicates that reduced expression on the Y is probably one of the initial changes that occur following the evolution of X-Y recombination suppression. However, it is also possible that X-linked alleles have been up-regulated to some extent in males (partial dosage compensation; Mank 2013; Muyle et al. 2012), thus contributing to the observed lower Y expression relative to X (see below). Our results also suggest the possibility that some genes have elevated Y-linked expression relative to X-linked alleles (Figure 3.2), although evidently this is less common.
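The logic of using the autosomal alternate/reference ratio as a null expectation for the Y/X ratio can be sketched as follows. The read counts below are invented purely for illustration, the function names are hypothetical, and the sketch pools reads per gene within a single sample rather than averaging across males as described in the text.

```python
from statistics import median


def allele_ratio(alt_reads, ref_reads):
    """Alternate/reference read ratio for one gene, pooling reads over its SNPs."""
    return sum(alt_reads) / sum(ref_reads)


def median_ratio(genes):
    """Median per-gene alternate/reference ratio over a list of (alt, ref) read lists."""
    return median(allele_ratio(alt, ref) for alt, ref in genes)


# Hypothetical read counts at four heterozygous SNPs per gene: (alt reads, ref reads).
# For sex-linked genes the Y allele is the alternate and the X allele the reference.
sex_linked = [
    ([40, 35, 50, 45], [60, 50, 55, 65]),
    ([20, 25, 30, 22], [30, 28, 35, 31]),
]
autosomal = [
    ([48, 52, 47, 50], [52, 50, 51, 53]),
    ([30, 29, 33, 31], [32, 30, 33, 34]),
]

yx_ratio = median_ratio(sex_linked)    # observed Y/X ratio
null_ratio = median_ratio(autosomal)   # mapping-bias baseline from autosomes
print(f"Y/X = {yx_ratio:.2f}, autosomal alt/ref = {null_ratio:.2f}")
print(f"Y/X relative to the null = {yx_ratio / null_ratio:.2f}")
```

A Y/X ratio is only interpreted as reduced Y expression if it falls below the autosomal baseline, since reference-mapping bias alone pushes alternate/reference ratios below one.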

FIGURE 3.2: Y/X gene expression of old and young sex-linked genes in Rumex hastatulus. The Y/X expression ratio distribution in males for 230 ‘young’ sex-linked genes from the XY1Y2 system (not shared with the XY system) and 459 ‘old’ sex-linked genes (shared with the XY system), compared to the expression ratio for alternate to reference alleles at heterozygous sites in autosomes. Expression of Y relative to X alleles was estimated per gene in males (i.e. within individual samples) by counting the numbers of mRNA reads covering sex-linked SNPs in sex-linked genes, and these relative estimates were averaged across all males. Expression estimates for reference and alternative alleles at heterozygous sites in autosomes were obtained similarly, using the numbers of mRNA reads covering SNPs in contigs where at least 4 such SNPs segregated as autosomal. The dotted line shows the expectation when X and Y alleles (or ref and alt alleles in autosomes) are equally expressed. Error bars show 1.5 times the interquartile range, approximately corresponding to two standard deviations, and notches correspond approximately to a 95% confidence interval for the medians.

Disruption of normal expression levels and gene loss could both negatively affect the fitness of males, potentially leading to selective pressure to up-regulate X-linked genes, a process known as dosage compensation (Charlesworth 1996; Mank 2013). To investigate this, we analyzed the expression of X-linked genes that were ascertained to be hemizygous in males (but present in two active copies in females) to determine whether such genes were hyper-expressed in males. Our analysis of 119 hemizygous genes revealed that relatively few hemizygous genes in males showed evidence for a compensatory increase in gene expression when compared to X-linked genes in females. The majority of X-linked genes with missing Y-copies were expressed approximately two-fold lower when compared to females (Figure 3.3A). In particular, a high proportion of these genes (94/119, 79%) showed significantly lower expression in males than females (Appendix 3.1, Table A3.1.6), while only seven out of 119 (6%) showed significantly higher expression in males when compared to half of total (X+X) expression in females. This suggests that dosage compensation is incomplete in R. hastatulus, and is evidently not mediated by a chromosome-wide mechanism that affects all X-linked genes in a similar manner. In contrast, we did not find a consistent reduction in male-specific expression for either ‘old’ (Figure 3.3B) or ‘young’ (Figure 3.3C) X/Y genes (Appendix 3.1, Table A3.1.6) when compared to total X expression in females. This implies that the observed loss of expression of Y-linked alleles does not cause total levels of sex linked gene expression in males to be reduced, potentially reflecting up-regulation of the male X allele to compensate for the loss in expression on the Y. However, it remains unclear whether this compensatory increase in expression of the X allele in males is adaptive in the sense that it was selected because of a degenerating Y. 
It may instead have arisen as a consequence of existing mechanisms of gene expression regulation that are activated in the presence of small perturbations in expression or gene dosage (e.g., Malone et al. 2012). Autosomal genes tended to show the lowest level of differential gene expression between males and females (Figure 3.3D; Appendix 3.1, Table A3.1.6), suggesting that most of the differential gene expression is driven by sex chromosome evolution. Similar conclusions were obtained when examining expression differences in the XY system, as well as from our independent population samples (Appendix 3.1, Table A3.1.6). Overall, we conclude that the majority of hemizygous genes are not dosage compensated, while genes with retained Y copies have lower Y-expression but no overall differential expression between the sexes.

FIGURE 3.3: Average normalized gene expression in male vs. female progeny (6 of each sex) from the XY1Y2 system. (A) hemizygous genes, (B) sex-linked genes with Y homologues, shared with the XY race (‘old’), (C) sex-linked genes with Y homologues, not shared with the XY system (‘young’), (D) autosomal genes. The solid line shows the expectation under equal male and female expression, and the dashed line shows the expectation for male expression being equal to half of female expression. Median differential expression normalization was conducted using DESeq (see Methods for details).


Molecular evolutionary tests for deleterious mutations and codon usage bias

We also tested whether Y-linked genes have a reduced extent of purifying selection relative to X-linked genes, and whether they have accumulated deleterious mutations or changes in codon usage. This is expected because of the lower rate of recombination for Y-linked genes, which is predicted to reduce the efficacy of purifying selection (Hill and Robertson 1966; Charlesworth 1978). However, given that recombination suppression was so recent for many sex-linked genes in R. hastatulus, with low divergence between the majority of sex-linked genes, extensive deterioration of Y-linked genes may not be expected. Using our phased X and Y sequences, we used two approaches to test whether Y-linked sequences are accumulating deleterious changes. First, we used parsimony to estimate the total numbers of changes across sex-linked genes on the X and the Y lineage, using orthologous sequences from R. bucephalophorus as the outgroup. The number of synonymous changes on the X versus the Y for the ‘old’ gene set is nearly equal, providing no evidence for elevated mutation rates along the Y chromosome (Figure 3.4; Appendix 3.1, Figure A3.1.4). In contrast, nearly twice as many nonsynonymous changes have occurred on the Y lineage (1646 vs. 835), implying a strong relaxation of selection since the suppression of recombination. This difference is highly significant (Fisher’s exact test P<0.001). For the ‘young’ gene set a weaker trend was apparent (Figure 3.4; 339 vs. 215 nonsynonymous changes on the Y vs. the X, Fisher’s exact test P<0.001).


FIGURE 3.4: Synonymous and nonsynonymous substitutions in X and Y genes. The number of parsimony-estimated lineage-specific substitutions (A) and changes in codon use (B) on the X and Y sequences from the XY1Y2 system are shown, using orthologous sequences from R. bucephalophorus to polarize changes along the X and Y lineages separately. Old genes represent those shared with the XY system, whereas young genes represent those that are not shared.

We also generated maximum-likelihood estimates of the ω (dN/dS) ratio for each lineage, including the X and Y sequences of both races and the outgroup. Consistent with the parsimony approach, we found that old Y-linked genes in the XY1Y2 system had a higher ratio of nonsynonymous to synonymous substitutions per site, ω, compared to X-linked genes (average ωY_old = 0.401 and ωX_old = 0.156; Wilcoxon test P << 10^-10), but the difference was much smaller and not significant for younger Y-linked genes (average ωY_young = 0.209 and ωX_young = 0.145; P = 0.114). Further, we found that old and young X sequences did not have significantly different ω values (P<0.399), but the comparison of old versus young Y genes revealed a significant difference (P < 4 x 10^-8). As expected, analysis of substitution rates in the XY chromosome system gave comparable results to the ‘old’ gene set in the XY1Y2 system (Appendix 3.1, Table A3.1.7). Together, these results indicate that elevated ωY_old is not due to changes on the X, but is instead caused by a significantly higher substitution rate on the Y.

Finally, we also tested whether Y-linked genes have undergone more changes toward unpreferred codons than X-linked genes. Here, we used a parsimony approach to examine changes in codon usage along the X- vs. Y-lineages, using the outgroup sequence to polarize changes on X- and Y-branches. To count the number of changes from preferred to unpreferred codons and vice versa, we assumed shared codon preferences from Arabidopsis thaliana (Wright et al. 2004). Old Y-linked genes had significantly more preferred-to-unpreferred changes in codon usage relative to unpreferred-to-preferred changes compared with X-linked genes (Figure 3.4B; Appendix 3.1, Figure A3.1.5; Fisher’s exact test P<0.01). However, no significant difference was observed in the ratio of codon changes for the young Y-linked genes (Fisher’s exact test P>0.05). Collectively, these molecular evolutionary comparisons of X and Y-linked sequences support the hypothesis that deleterious changes are accumulating in the Y lineages as a result of a reduction in the efficacy of selection, with the magnitude of the effects depending strongly on the time since recombination suppression.

CONCLUSIONS

Our segregation-based analysis using RNAseq has led to the identification of hundreds of sex-linked genes in a non-model dioecious plant species with a neo-Y sex chromosome system. This has allowed us to compare the changes in expression and sequence evolution that have occurred following recombination suppression between X and Y chromosomes. Our results indicate that the majority of X/Y genes in R. hastatulus have become non-recombining fairly recently and exhibit low X-Y sequence divergence; however, the older Y-linked genes that are shared between XX/XY and XX/XY1Y2 systems show clear signs of degeneration, and many of the oldest sex-linked genes are likely in our hemizygous set. The older Y-linked genes have undergone gene loss, are accumulating nonsynonymous substitutions likely to impair gene function, contain significantly more unpreferred changes in codon usage, and show a significant loss of expression compared to X-linked genes. In contrast, we find that these features of Y-degeneration are either significantly reduced or absent in the younger X/Y genes that are unique to the XX/XY1Y2 system. Our contrast between young and old sex-linked genes, made possible because of the unusual occurrence in R. hastatulus of intraspecific polymorphism in sex chromosome system, provides a unique glimpse into the early stages and chronology of Y-chromosome degeneration in a flowering plant.


METHODS

RNA sequencing

To identify sex-linked genes in R. hastatulus, we sequenced transcriptomes from parents and F1 progeny from two within-population crosses, one from a population with XY males sampled from Many, Louisiana (LA-MAN), and one from a population with XY1Y2 males from Branchville, South Carolina (SC-BRA). We extracted RNA from leaf tissue from all individuals using Spectrum® Plant Total RNA kits, and the isolation of mRNA and cDNA synthesis were conducted according to standard Illumina RNA-seq procedures. Sequencing was conducted on the Illumina GAII platform for XX/XY parental samples with 80-bp paired-end reads at the Center for the Analysis of Genome Evolution and Function (University of Toronto) and on the Illumina HiSeq platform by the Genome Quebec Innovation Center (GQIC) with 150-bp paired-end reads for XX/XY1Y2 parental samples. F1 samples were sequenced by multiplexing and barcoding 6 male and 6 female samples from each cross on a separate Illumina HiSeq lane with 150-bp paired-end reads at GQIC. Samples used for validation (see below; SNP segregation analysis and ascertaining sex linkage) were sequenced by barcoding and multiplexing on an Illumina HiSeq lane with 150-bp paired-end reads at GQIC. We also obtained 150-bp paired-end RNAseq data, sequenced at GQIC, for the transcriptome of one Rumex bucephalophorus individual. This species has no sex chromosomes, providing an outgroup for X- and Y-linked homologs in R. hastatulus. We have submitted raw sequence data to NCBI under accession numbers SRP041588 (Rumex hastatulus) and SRP041613 (Rumex bucephalophorus).

Assembly of R. hastatulus transcriptomes

We assembled a reference transcriptome de novo using Velvet [version 1.2.07; (Zerbino et al. 2008)] and Oases [version 0.2.08; (Schulz et al. 2012)] with pooled paired-end reads from 6 F1 females belonging to the XY1Y2 system. Using this as the reference transcriptome facilitated our identification of sex-linked genes that were shared between the XY and XY1Y2 systems (see next section). Prior to assembly, we trimmed the data to remove reads <50bp, and VelvetOptimiser (version 2.2.4) was used to choose the best kmer size for each individual transcript. To avoid missing low coverage transcripts, the final total number of bases in each assembly was used to evaluate the best kmer size, which was 43. Oases (version 0.2.08) was then run under default parameters. For each set of transcript isoforms, the longest was chosen as the final transcript. This reference assembly yielded 38828 contigs (N50=2089; total length=44585937 bp). For the outgroup, Rumex bucephalophorus, the assembly was run with the same pipeline, yielding a best kmer length=43, and 35525 contigs (N50=1923; total length=38120382 bp).
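Two of the assembly summary steps mentioned above, keeping the longest isoform per transcript set and reporting N50, can be sketched in Python. This is an illustrative sketch only: the function names and the (locus, sequence) input format are hypothetical and do not reflect the actual output format of Oases.

```python
def longest_isoform_per_locus(isoforms):
    """Keep the longest sequence for each locus.

    isoforms: iterable of (locus_id, sequence) pairs (hypothetical input format).
    """
    best = {}
    for locus, seq in isoforms:
        if locus not in best or len(seq) > len(best[locus]):
            best[locus] = seq
    return best


def n50(lengths):
    """Length L such that contigs of length >= L contain at least half of all bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("no contigs supplied")


# Toy example with made-up sequences.
toy = [("locus1", "ACGT"), ("locus1", "ACGTAC"), ("locus2", "AC")]
contigs = longest_isoform_per_locus(toy)
print(contigs)
print(n50([len(seq) for seq in contigs.values()]))
```

Choosing the longest isoform gives one representative contig per locus, which avoids reads from the same gene being split across alternative isoforms during mapping.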

SNP segregation analysis and ascertaining sex linkage

To assign sex linkage to assembled contigs in which nucleotide variants were identified, we mapped reads from both XX/XY and XX/XY1Y2 samples to the reference transcriptome that was assembled using reads from females belonging to the XY1Y2 system. Mapping was conducted using the Burrows-Wheeler Aligner [0.6.2-r126; (Li and Durbin 2009)], followed by Stampy [1.0.20; (Lunter and Goodson 2011)] for mapping more divergent reads. We used Picard tools (1.78, http://picard.sourceforge.net) to convert the mapping output into the format required by the Genome Analysis Toolkit [GATK, 2.1-11; (McKenna et al. 2010)] variant calling software. Segregation analysis was then conducted on both systems separately (see Appendix 3.1), and the results were compared to obtain the set of sex-linked genes shared between the XY and XY1Y2 systems (referred to as the “old” sex-linked genes), and those that were unique to the XY1Y2 system (“young”). The number of sex-linked genes identified as a function of the number of diagnostic polymorphisms is shown in Appendix 3.1 (Table A3.1.1) for each system, along with the number shared between them. We required contigs to have ≥ 4 high quality (Phred-scaled SNP quality score > 60) SNPs, with genotype calls made for all parents and progeny from both sex-chromosome systems, and segregation patterns indicating sex linkage. Such sex-linked SNPs were identified based on either: (i) the presence of a segregating Y-linked variant, where fathers and all sons were heterozygous while mothers and all daughters were homozygous, or (ii) the presence of a segregating X-linked variant, where fathers and all daughters were heterozygous, but mothers and all sons were homozygous. To ensure that such X/Y contig assignments were reliable, we further filtered our list of putative sex-linked contigs to include only those in which a segregating Y-linked variant was ascertained and showed the expected sex-specific genotypes in twelve population samples (see Appendix 3.1). Such sites represent fixed differences between the X and Y. Similar approaches were used to identify hemizygous and autosomal genes (see Appendix 3.1). All data parsing was done using bash, R, or Perl. Scripts are available on request.
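The two segregation patterns described above can be illustrated with a minimal Python sketch. It is not the authors' script (their parsing was done in bash, R, or Perl). The function names, the "hom"/"het" genotype encoding, and the tuple layout are hypothetical choices made for illustration; only the patterns and the thresholds (≥ 4 SNPs, quality > 60) come from the text.

```python
def classify_snp(mother, father, daughters, sons):
    """Classify one SNP from a single family by its segregation pattern.

    Each genotype is "hom" (homozygous) or "het" (heterozygous).
    Returns "Y-linked", "X-linked", or None if neither pattern fits.
    """
    # Both patterns need a homozygous mother, a heterozygous father,
    # and genotype calls for at least one daughter and one son.
    if mother != "hom" or father != "het" or not daughters or not sons:
        return None
    sons_het = all(g == "het" for g in sons)
    sons_hom = all(g == "hom" for g in sons)
    daughters_het = all(g == "het" for g in daughters)
    daughters_hom = all(g == "hom" for g in daughters)
    if sons_het and daughters_hom:
        return "Y-linked"  # pattern (i): variant passed from father to all sons
    if daughters_het and sons_hom:
        return "X-linked"  # pattern (ii): variant passed from father to all daughters
    return None


def is_sex_linked_contig(snps, min_snps=4, min_quality=60):
    """snps: list of (quality, mother, father, daughters, sons) tuples.

    A contig qualifies when at least min_snps SNPs with quality above
    min_quality show one of the two sex-linked patterns.
    """
    supporting = sum(
        1
        for quality, mother, father, daughters, sons in snps
        if quality > min_quality
        and classify_snp(mother, father, daughters, sons) is not None
    )
    return supporting >= min_snps
```

The sketch treats a single family; the additional requirement that calls exist for all parents and progeny in both systems, and the population-sample validation, would be separate filters.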

Comparisons of sex-linked gene expression

The numbers of mRNA reads covering sex-linked SNPs in sex-linked contigs were counted from the SNP output from GATK to obtain estimates of the relative expression of X- and Y-linked alleles in males. This enabled us to compare young and old sex-linked genes by determining their respective Y/X expression ratio distributions (Figure 3.2). Because the relative expression of X- and Y-alleles was estimated per gene in males (i.e. within individual samples), it was unnecessary to normalize the counts across samples, and these relative estimates were averaged across all males. Expression estimates for reference and alternative alleles at heterozygous sites in autosomes were obtained similarly using the numbers of mRNA reads covering SNPs across all samples in contigs where at least 4 such SNPs segregated as autosomal. For gene-level (rather than allele-specific) expression comparisons of sex-linked and autosomal genes across the sexes, we estimated expression in coding sequences using HTSeq (Anders et al. 2014) with the ‘intersection-nonempty’ option. We focused on coding sequences and excluded putative untranslated regions due to observed high variance in read counts in these regions. Following HTSeq, we used DESeq (Anders et al. 2010) to conduct median differential coverage normalization and test for differential expression using the beta binomial distribution. Genes with a maximum total read count across samples of less than 20 were removed to eliminate loci with little power to test for differential expression. The possibility of chromosome-wide differences in gene expression may complicate normalized expression tests in this system; however, we found that normalization using just autosomes gave nearly identical results, with no consistent bias by sex (Appendix 3.1, Figure A3.1.6). Significant expression differences between the sexes were assessed using both a 5% cut-off and a 10% false-discovery rate (FDR) correction (Appendix 3.1 Table A3.1.6).
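As a rough illustration of the allele-specific calculation and the low-count filter described above, the following Python sketch averages within-male Y/X read-count ratios for one gene and applies the 20-read threshold. The function names and input layout are hypothetical, and the handling of males with zero X reads is an assumption not stated in the text.

```python
def yx_ratio_per_gene(counts_by_male):
    """Mean Y/X expression ratio for one gene.

    counts_by_male: list of (x_reads, y_reads) pairs, one per male, where
    each count is the number of reads covering sex-linked SNPs in the gene.
    Ratios are computed within each male and then averaged, so no
    between-sample normalization is applied.
    """
    # Assumption: males with no X-allele reads are skipped.
    ratios = [y / x for x, y in counts_by_male if x > 0]
    if not ratios:
        return None
    return sum(ratios) / len(ratios)


def keep_gene(total_reads_per_sample, min_max_count=20):
    """Keep a gene only if its maximum total read count across samples
    reaches the threshold (genes below 20 were removed in the text)."""
    return max(total_reads_per_sample) >= min_max_count
```

A ratio below 1 from `yx_ratio_per_gene` would indicate lower expression of the Y-linked allele relative to the X-linked allele in males.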


Consensus contigs for molecular evolutionary analysis

To analyze the molecular evolution of sex-linked genes, we generated X and Y consensus sequences based on parent and progeny genotypes using a phasing algorithm implemented in an R script (available upon request). For each nucleotide position within candidate sex-linked loci, we used sequencing coverage/quality information from parental samples to call sites that were identical on both X and Y copies. Sites were accepted as identical if both parental strains were called as homozygous, and both had ≥ 8x sequencing coverage and genotype quality scores ≥ 60. Otherwise sites were annotated as missing data. Candidate X/Y variants were initially identified as sites homozygous in the female parent and heterozygous in the male parent. Our method then used a likelihood-ratio approach to evaluate the relative support for the heterozygous site representing a true X/Y variant (male: XAYa, female: XAXA) vs. a segregating X variant in the male (male: XaYA, female: XAXA). To test the performance of this method, we implemented a simulation that calculated likelihood ratio tests for simulated parent/progeny genotype arrays in which variants were either heterozygous X variants in the male parent or true X/Y variants (see Appendix 3.1, Figure A3.1.1 and Figure A3.1.2 for more details and simulation results).

ORF identification, sequence alignment, and phylogeny reconstruction

We identified Open Reading Frames (ORFs) from all X and Y consensus sequences, and from orthologous R. bucephalophorus sequences (identified using a three-way reciprocal BLAST of contigs from each sex chromosome system plus the outgroup) using the getorf program from the EMBOSS suite Version 6.3.1 (Rice et al. 2000). For each locus, the X and Y ORFs from the XX/XY and XX/XY1Y2 systems, as well as the orthologs from the outgroup sequence, were aligned using MUSCLE Version 3.8.31 (Edgar 2004). We then used ORF alignments to guide nucleotide alignments with in-frame gaps using a custom Perl script (available upon request). Maximum likelihood phylogenetic trees were then produced from each nucleotide alignment using RAxML version 7.0.4 (Stamatakis 2006).


Analysis of evolutionary rates

Phylogenies were used as starting trees for the analysis of evolutionary rate at synonymous and non-synonymous sites using PAML Version 4.6 (Yang 2007). For each locus, we fit a “free-ratio” model (model=1), allowing dN/dS to vary across branches. Branch-specific silent site divergence, dN/dS ratios, and tree topologies were then extracted and analyzed in R using the “phytools” package (Revell 2012). For loci in which X and Y sequences were monophyletic across the two sex chromosome systems, we estimated dN/dS as the average of the population-specific and ancestral branches, weighted by the corresponding dS values. For all other comparisons, only values at terminal branches were considered. For each alignment, we also used a modified version of Polymorphurama (Andolfatto 2007) to count the number of parsimony-estimated lineage-specific changes (synonymous, nonsynonymous, preferred->unpreferred, unpreferred->preferred) on the X and Y sequences, using the outgroup sequence to polarize changes. We analyzed the two sex chromosome systems separately for this analysis in a 3-way alignment of X, Y, and the outgroup.
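The dS-weighted average of branch-specific dN/dS values can be written out as a short Python sketch. The function name and the example values are hypothetical; the weighting rule is the only part taken from the text, and the original analysis was carried out in R.

```python
def weighted_dnds(branches):
    """dS-weighted mean of branch-specific dN/dS estimates.

    branches: list of (dnds, ds) pairs, e.g. the population-specific
    branch and the ancestral branch for one X or Y lineage.
    """
    total_ds = sum(ds for _, ds in branches)
    if total_ds == 0:
        return None  # no synonymous divergence to weight by
    return sum(dnds * ds for dnds, ds in branches) / total_ds


# Hypothetical values: a terminal branch (dN/dS 0.2, dS 0.01) and an
# ancestral branch (dN/dS 0.4, dS 0.03) combine to a value close to 0.35.
example = weighted_dnds([(0.2, 0.01), (0.4, 0.03)])
```

Weighting by dS gives more influence to branches with more synonymous divergence, where the ratio is estimated from more substitutions.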

Accession Numbers

GenBank accession numbers for R. hastatulus and R. bucephalophorus sequences reported in this article are: SRP041588 (Rumex hastatulus) and SRP041613 (Rumex bucephalophorus).

ACKNOWLEDGEMENTS

We thank Deborah Charlesworth for helpful advice and discussion, and two anonymous reviewers for their comments on an earlier version of the manuscript. This research was funded by NSERC Discovery Grants to SCHB and SIW.

APPENDIX 3.1

SUPPORTING INFORMATION FOR CHAPTER 3

Identification of sex-linked genes with Y-linked homologues

To identify sex-linked genes from the two sex chromosome systems, we filtered our genotype data (both SNPs and indels from both races) and required 1) all individuals to have a genotype call at the site, and 2) the SNP quality score of the site to be >=60. After these filters, we obtained a total of 730,957 polymorphic sites (SNPs or indels). We identified SNPs showing XY segregation patterns separately in the XX/XY system and the XX/XY1Y2 system. In total, this led to the identification of 10,420 sex-linked SNPs from 1383 genes in the former system, and 16,967 SNPs from 2839 genes in the latter. The number of sex-linked genes identified, as a function of how many diagnostic polymorphisms are required, is shown in Table A3.1.1 for each system separately, and also shared between them (where the ‘shared’ criterion requires both systems to have the same minimum number of segregating SNPs). In general, regardless of SNP cutoffs, roughly 70% of sex-linked genes found in the XX/XY system are also identified as such in the XX/XY1Y2 system, while about 40% of genes found in XX/XY1Y2 are shared with the XX/XY system. This is consistent with the neo-Y system having acquired many new sex-linked genes since the autosomal fusion.

Table A3.1.1: Number of sex-linked genes with Y homologues as a function of the minimum number of SNPs required to identify sex-linked genes

Minimum number of SNPs   XY system   XY1Y2 system   Shared sex-linked   Percent of shared sex-linked
with XY segregation                                 genes               genes (XY/XY1Y2)
1                        1383        2839           1005                73/35
2                        1033        2043           747                 72/37
3                        838         1599           592                 71/37
4                        698         1298           510                 73/39
5                        616         1065           451                 73/42

Population screen

We used population data (12 males, 12 females of each system, with one male and one female from each of six populations per sex chromosome system) to validate the ascertained sex-linked genes from crosses. Of particular interest was to distinguish candidate pseudoautosomal loci, which show linkage in the cross but not the population, from genes that are definitively in the sex-linked region. Note, however, that requiring fixed differences between the X and Y will also exclude very recent sex-linked loci that have not had time to accumulate fixed differences. The results of the polymorphism analyses are shown in Tables A3.1.2 and A3.1.3.

Table A3.1.2: Polymorphism screen in XX/XY1Y2 system

Minimum number of   Number of genes with    Overlap with crossing   Overlap with crossing
SNPs used for       sex-linked patterns     data with same          data for ≥ 4
identification      from XX/XY1Y2           minimum number of       supporting SNPs
                    population data         SNPs
1                   1210                    954 (79%)               679
2                   891                     726 (81%)               609
3                   723                     592 (82%)               548
4                   606                     493 (81%)               493
5                   530                     432 (82%)               450

Table A3.1.3: Polymorphism screen in XX/XY system

Minimum number of   Number of genes with      Overlap with crossing   Overlap with crossing
SNPs used for       sex-linked segregation    data with same          data with ≥ 4
identification      in XX/XY population       minimum number of       supporting SNPs
                    data                      SNPs
1                   1294                      946 (73%)               611
2                   1000                      746 (75%)               595
3                   828                       626 (76%)               572
4                   723                       550 (76%)               550
5                   639                       495 (77%)               518

Identification of autosomal genes

To screen for autosomal genes, we identified SNPs where the mother was homozygous, the father heterozygous, and at least one son and at least one daughter were heterozygous (i.e. both sons and daughters inherit a focal allele from the father). Because this is a much less stringent filter for genotypes than that used for sex-linked SNPs (e.g. all males being heterozygous and all females being homozygous), this category is more susceptible to genotyping errors. Indeed, the average coverage and genotype quality scores are lower for this set of SNPs than for the sex-linked SNPs (e.g. one focal Texas female has an average coverage of 32 and genotype quality of 54 for autosomal SNPs, but average coverage of 47 and quality of 84 for sex-linked SNPs). Because a major use of the autosomal SNPs was to provide a further filter for sex-linked genes and normalize gene expression comparisons, we filtered the autosomal SNPs, requiring all individuals to have a minimum genotype quality score of 50, and removing SNPs that showed significant departures from Mendelian expectation in their genotype ratios. Following filtering, we were left with 890 genes with at least 4 high-confidence autosomal SNPs in the XX/XY system, and 1195 with at least 4 high-confidence autosomal SNPs in the XX/XY1Y2 system. The numbers of identified autosomal genes as a function of the minimum number of SNPs used in the cutoff are shown in Table A3.1.4.
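The autosomal screen and the Mendelian-ratio filter might look roughly like the Python sketch below. The text does not say which test was used to detect departures from Mendelian expectation, so the chi-square goodness-of-fit test against a 1:1 heterozygote:homozygote ratio (the expectation for a homozygous × heterozygous cross) and the 5% critical value are assumptions, as are all function names.

```python
# Upper 5% point of the chi-square distribution with 1 degree of freedom.
CHI2_CRITICAL_1DF_5PCT = 3.841


def shows_autosomal_pattern(mother, father, daughters, sons):
    """Mother homozygous, father heterozygous, and at least one son and
    at least one daughter heterozygous ("hom"/"het" encoding)."""
    return (
        mother == "hom"
        and father == "het"
        and any(g == "het" for g in sons)
        and any(g == "het" for g in daughters)
    )


def mendelian_chi2(n_het, n_hom):
    """Chi-square statistic for progeny counts against a 1:1 ratio."""
    n = n_het + n_hom
    if n == 0:
        raise ValueError("no progeny genotypes")
    expected = n / 2
    return ((n_het - expected) ** 2 + (n_hom - expected) ** 2) / expected


def passes_mendelian_filter(n_het, n_hom):
    """True when the departure from 1:1 is not significant at 5%."""
    return mendelian_chi2(n_het, n_hom) <= CHI2_CRITICAL_1DF_5PCT
```

With small families a chi-square approximation is crude; an exact binomial test would be a reasonable alternative under the same assumption about the expected ratio.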

Table A3.1.4: Identification of autosomal genes and filtering of sex-linked genes showing autosomal SNPs

Number of      Number of         Number of         Number of autosomal     Number of autosomal
SNPs used as   autosomal loci,   autosomal loci,   genes in XX/XY system   genes in XX/XY1Y2 system
cutoff         XX/XY system      XX/XY1Y2 system   that overlap with       that overlap with XY1Y2
                                                   XX/XY sex-linked        sex-linked genes
                                                   genes (4 SNP cutoff)    (polymorphism validated,
                                                                           4 SNP cutoff)
1              3909              4005              26                      22
2              2289              2566              17                      15
3              1382              1750              10                      8
4              890               1195              7                       6

Identification of putative hemizygous genes

To search for genes showing hemizygous segregation, we looked for two types of segregation patterns, where A and B represent alternative SNPs or indels:

a) Maternal genotype AA, paternal genotype called BB. All daughters AB, all sons called AA

b) Maternal genotype AB, paternal genotype called AA (or BB), some daughters AB, no sons heterozygous, the set of sons showing BOTH AA and BB calls
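The two hemizygous patterns listed above translate directly into genotype checks. The Python sketch below is illustrative only: the function name and the two-letter genotype strings are hypothetical, and the caller is assumed to have labelled the alleles so that pattern (a) has the maternal allele as A.

```python
def hemizygous_pattern(mother, father, daughters, sons):
    """Return "a" or "b" if the family matches one of the two hemizygous
    segregation patterns, otherwise None.

    Genotypes are "AA", "AB", or "BB". A hemizygous male carries a single
    X-linked allele, so he is always called homozygous.
    """
    if not daughters or not sons:
        return None
    # Pattern (a): mother AA, father called BB, all daughters AB,
    # all sons called AA (sons receive only the maternal X).
    if (
        mother == "AA"
        and father == "BB"
        and all(g == "AB" for g in daughters)
        and all(g == "AA" for g in sons)
    ):
        return "a"
    # Pattern (b): mother AB, father called AA or BB, some daughters AB,
    # no heterozygous sons, and both AA and BB calls among the sons.
    if (
        mother == "AB"
        and father in ("AA", "BB")
        and any(g == "AB" for g in daughters)
        and "AB" not in sons
        and {"AA", "BB"} <= set(sons)
    ):
        return "b"
    return None
```

Both patterns rest on X-linked polymorphism alone, which is why fewer supporting SNPs are expected per gene than for divergent X-Y homologues.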

Because these hemizygous segregation patterns rely only on X-linked polymorphisms and will have fewer SNPs than divergent X-Y homologues, we reduced the stringency of our criterion for defining hemizygous genes shared between the sex chromosome systems. In particular, we defined hemizygous genes in the XX/XY1Y2 system that are shared with the XX/XY system as those with at least 4 supporting SNPs in the XX/XY1Y2 system, and at least 1 supporting hemizygous SNP in the XX/XY system. Furthermore, to estimate the percent of sex-linked genes that are hemizygous, we identified the number of XY sex-linked genes with an equivalent number of segregating X-polymorphisms. The number of hemizygous genes identified as a function of the minimum numbers of SNPs used in the cutoffs is shown in Table A3.1.5.

Table A3.1.5: Identification of hemizygous genes as a function of the minimum number of SNPs used in cutoff

Minimum number    Hemizygous   Hemizygous   Shared       Number of XY genes     Estimated       Number of XY genes     Estimated
of SNPs showing   genes in     genes in     hemizygous   with X-linked          percent         with X-linked          percent
hemizygous        XX/XY        XX/XY1Y2     genes        maternal segregation   hemizygosity,   maternal segregation   hemizygosity,
segregation       system       system                    pattern, XX/XY         XX/XY system    pattern, XX/XY1Y2      XX/XY1Y2
                                                         system                                 system                 system
1                 571          496          209          540                    49              1140                   30
2                 246          262          158          481                    34              1018                   20
3                 159          192          125          426                    27              915                    17
4                 119          144          100          373                    24              819                    15
5                 79           112          82           334                    19              757                    13

Final filtered gene sets used in molecular evolution and expression analysis

To exclude putatively pseudoautosomal loci and other genes possibly misidentified as sex linked, we filtered our sex-linked genes to those that showed at least one sex-linked SNP from polymorphism data and at least 4 segregating sex-linked SNPs in crossing data. Furthermore, to exclude genes that were possibly erroneously identified as sex linked and/or represent chimeric assemblies, we also excluded any sex-linked genes showing any autosomal segregation patterns. This led to the following gene sets for expression and molecular evolution analyses:

• XX/XY1Y2 shared: for the XX/XY1Y2 system, we found a total of 460 genes that show XY segregation in the family data (>=4 SNPs), are shared with the XX/XY system (where the XX/XY1Y2 system has >=4 supporting polymorphisms), have no autosomal SNPs, and have at least one fixed difference between X and Y in the population data. None of these overlap with the set of hemizygous genes identified in the XX/XY1Y2 system.

• XX/XY1Y2 unique: for XX/XY1Y2 samples, we found a total of 231 genes that show XY segregation in the family data, are not shared with the other system, have no autosomal SNPs, and have at least one fixed difference between X and Y in the population data. None of these overlap with the set of hemizygous genes identified in the XX/XY1Y2 system.

• XX/XY: for the XX/XY system, we found a total of 585 genes that show XY segregation in the family data, have no autosomal SNPs, and are polymorphism validated.

• Autosomal in XX/XY1Y2 system: after removing autosomal genes with signs of sex linkage, we were left with a total of 1167 confidently autosomal genes with >=4 supporting SNPs in this system.

• Hemizygous in XX/XY1Y2 system: after removing any genes with 1 or more autosomal or XY SNP segregation patterns, and removing those with fewer than 4 supporting SNPs, we were left with 122 hemizygous genes in this system.

• Hemizygous in XX/XY system: after removing any genes with 1 or more autosomal or XY SNP segregation patterns, and removing those with fewer than 4 supporting SNPs, we were left with 106 genes.

Generation of consensus X/Y sequences

To analyze the molecular evolution of sex-linked genes, we generated X and Y consensus sequences. Sequences were generated based on the parent/progeny genotype information, using a novel phasing algorithm implemented in an R script (available upon request). For each nucleotide position within candidate sex-linked loci, we used sequencing coverage and quality information from parental strains to call sites that were identical on both X and Y copies. Sites were accepted as identical if both parental strains were called as homozygous, and both had ≥ 8x sequencing coverage and genotype quality scores ≥ 60. Otherwise sites were annotated as missing data. Candidate X/Y variants were initially identified as sites homozygous in the female parent but heterozygous in the male parent. Our method then utilized a likelihood-ratio approach to evaluate the relative support for the heterozygous site representing a true X/Y variant (male: XAYa, female: XAXA) vs. a segregating X variant in the male (male: XaYA, female: XAXA). This method evaluated the probability that a given male or female progeny was heterozygous based on the binomial density function in R:

P(Het) = dbinom(x2,(x1+x2),0.5)

where x1 was the read support for the reference allele and x2 was the read support for the alternative allele. The likelihood of homozygosity was calculated using the dbinom function:

P(Hom) = dbinom(x2,(x1+x2),e) + dbinom(x2,(x1+x2),1-e),

where e was equal to 1/3 the overall error rate at homozygous reference sites (calculated from errors in progeny sequence data at sites with high confidence homozygous parental genotypes). The likelihood of all n males (or females) being heterozygous was then

$$L_{Het} = \prod_{i=1}^{n} P(Het)_i$$

and the likelihood of all males (or females) being homozygous was

$$L_{Hom} = \prod_{i=1}^{n} P(Hom)_i$$

where P(Het)_i and P(Hom)_i represented the above probabilities for the i-th male (or female). The relative support for heterozygosity vs homozygosity was evaluated using separate likelihood ratio tests for males:

$$LRT_m = 2\,[\log(L_{Het}) - \log(L_{Hom})]$$

and females:

$$LRT_f = 2\,[\log(L_{Het}) - \log(L_{Hom})]$$

Finally, we evaluated support for the site being an X/Y variant by calculating the difference:

$$LRT_d = LRT_m - LRT_f$$

The LRTd statistic takes on large values when all males have strong support for being heterozygous and all females have strong support for being homozygous, as expected for true X/Y variants (male: XAYa, female: XAXA). To test the performance of this method, we implemented a simulation that calculated LRTd for simulated parent/progeny genotype arrays in which variants were either heterozygous X variants in the male parent or true X/Y variants. We simulated gene expression levels by drawing from an exponential distribution fitted to the average coverage among individuals and across genes in our transcriptome dataset. We allowed a uniform distribution of missing data among male and female progeny, and assigned total coverage of both alleles for single individuals by sampling from a Poisson distribution with mean equal to the randomly drawn expression level. Based on the individual expression level, we then sampled alleles according to a binomial distribution where the probability of sampling the alternative allele was 0.5 for heterozygotes and e for homozygotes. We simulated 10,000 true X/Y variants (male: XAYa, female: XAXA) and 10,000 male X variants (male: XaYA, female: XAXA). Figure A3.1.1 shows the simulated distribution of LRTd based on each of the above scenarios.
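A minimal Python sketch of the LRTd calculation follows, as a rough counterpart to the R `dbinom` expressions above rather than the authors' script. The function names are hypothetical, and the choice to work with log probabilities (to avoid numerical underflow at high coverage) is an implementation assumption; the value of e in the example is also hypothetical.

```python
import math


def log_binom_pmf(k, n, p):
    """Log of the binomial probability of k successes in n trials
    (the log of R's dbinom(k, n, p)), for 0 < p < 1."""
    return (
        math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
        + k * math.log(p) + (n - k) * math.log(1 - p)
    )


def log_add(a, b):
    """log(exp(a) + exp(b)) computed without underflow."""
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def lrt(read_counts, e):
    """2 * [log(L_Het) - log(L_Hom)] for one sex.

    read_counts: list of (x1, x2) pairs per individual, where x1 is read
    support for the reference allele and x2 for the alternative allele.
    """
    log_l_het = 0.0
    log_l_hom = 0.0
    for x1, x2 in read_counts:
        n = x1 + x2
        log_l_het += log_binom_pmf(x2, n, 0.5)                 # P(Het)
        log_l_hom += log_add(log_binom_pmf(x2, n, e),          # P(Hom)
                             log_binom_pmf(x2, n, 1 - e))
    return 2 * (log_l_het - log_l_hom)


def lrt_d(male_counts, female_counts, e):
    """LRTd = LRTm - LRTf; large when males look heterozygous and
    females look homozygous."""
    return lrt(male_counts, e) - lrt(female_counts, e)
```

For instance, with three sons each showing 10 reference and 10 alternative reads, three daughters each showing 20 reference reads only, and e = 0.001, `lrt_d` is strongly positive, as expected for a true X/Y variant.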


Figure A3.1.1: Histogram of the simulated distribution of the LRTd statistic from 10,000 simulated parent/progeny SNP segregation patterns for either true X/Y variants (red) or a segregating X variant on the male X chromosome (blue). The distribution of LRTd results from the strength of support for an X/Y variant in the presence of random sampling of gene expression level, read support for alleles, and proportion of missing data.

In addition, we quantified the Type I and Type II error rates for a given LRTd cutoff, shown in Figure A3.1.2. We determined that an LRTd cutoff of ≥4 would constrain the Type I error rate to <1% and the Type II error rate to <5%. These are likely overestimates of the rates, however, as our simulation overestimated the occurrence of missing data by sampling that parameter from a uniform distribution.
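The simulation and the error-rate calculation described above can be sketched in Python under stated assumptions. The family size, mean coverage, error rate e, and the upper bound on the missing-data proportion are hypothetical values, not those of the original R simulation. The Poisson and binomial samplers are simple standard-library stand-ins, and the cap on the Poisson mean is a numerical safeguard added for this sketch.

```python
import math
import random


def log_binom_pmf(k, n, p):
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log(1 - p))


def lrt(counts, e):
    """2 * [log(L_Het) - log(L_Hom)] over (ref, alt) read-count pairs."""
    total = 0.0
    for x1, x2 in counts:
        n = x1 + x2
        a, b = log_binom_pmf(x2, n, e), log_binom_pmf(x2, n, 1 - e)
        m = max(a, b)
        log_hom = m + math.log(math.exp(a - m) + math.exp(b - m))
        total += log_binom_pmf(x2, n, 0.5) - log_hom
    return 2 * total


def poisson(rng, lam):
    """Knuth's multiplication method; adequate for modest means."""
    threshold, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def simulate_lrtd(rng, true_xy, n_per_sex=6, mean_cov=30.0, e=0.001,
                  max_missing=0.5):
    """Simulate one variant and return its LRTd.

    true_xy=True: X/Y variant (sons het, daughters hom).
    true_xy=False: male X variant (daughters het, sons hom).
    """
    expr = min(rng.expovariate(1.0 / mean_cov), 500.0)  # expression level
    missing = rng.uniform(0.0, max_missing)             # missing-data rate

    def sample(is_het):
        counts = []
        for _ in range(n_per_sex):
            if rng.random() < missing:
                continue                                # individual missing
            depth = poisson(rng, expr)
            if depth == 0:
                continue
            p_alt = 0.5 if is_het else e
            alt = sum(rng.random() < p_alt for _ in range(depth))
            counts.append((depth - alt, alt))
        return counts

    return lrt(sample(true_xy), e) - lrt(sample(not true_xy), e)


def error_rates(true_vals, false_vals, cutoff=4.0):
    """Type I: male X variants accepted; Type II: true X/Y variants rejected."""
    type1 = sum(v >= cutoff for v in false_vals) / len(false_vals)
    type2 = sum(v < cutoff for v in true_vals) / len(true_vals)
    return type1, type2
```

Calling `simulate_lrtd(random.Random(1), True)` repeatedly and passing the collected values to `error_rates` gives rates under these assumed parameters; they should not be expected to reproduce the figures reported in the text.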


Figure A3.1.2: Plot of occurrence of Type I (blue) and Type II (red) error from 10,000 simulations of true X/Y or segregating X variants for values of the LRTd statistic between -20 and 20.

Using this approach, consensus X/Y sequences were generated based on .vcf files separately for NC and TX populations. Because individuals from both populations were genotyped based on mapping to a North Carolina reference female, the North Carolina X/Y consensus sequences necessarily included a random sample of derived and ancestral states for segregating sites. To produce a similar outcome in the Texas sequences, we randomly assigned the non-reference base to the Texas X/Y consensus sequences 50% of the time. We identified putative fixed non-reference variants on the Texas X chromosome based on a (male: XaYA, female: XaXa) pattern, and assigned them to the Texas X consensus if they had sufficient LRTd support.

ORF identification, sequence alignment, and phylogeny reconstruction


Open Reading Frames (ORFs) were identified from all X and Y consensus sequences, and from orthologous R. bucephalophorus sequences, using the getorf program from the EMBOSS suite Version 6.3.1. For each locus, the X and Y ORFs from Texas and North Carolina, as well as the outgroup sequence, were aligned using MUSCLE Version 3.8.31. The ORF alignments were then used to guide nucleotide alignments with in-frame gaps using a custom Perl script (available upon request). Maximum likelihood phylogenetic trees were then produced from each nucleotide alignment using RAxML version 7.0.4.

Analysis of evolutionary rates

Phylogenies were used as starting trees for analysis of evolutionary rate at synonymous and non-synonymous sites using PAML Version 4.6. For each locus, we fit a “free-ratio” model (model=1), allowing dN/dS to vary across branches. Branch-specific silent site divergence, dN/dS ratios, and tree topologies were then extracted and analyzed in R using the “phytools” package. For loci in which X and Y sequences, respectively, were monophyletic across the two populations, we estimated dN/dS as the average of the population-specific and the ancestral branches, weighted by the corresponding dS values. For all other comparisons, only values at terminal branches were considered.


Figure A3.1.3: The Y/X expression ratio distribution in males for: 1) sex-linked genes from the XY1Y2 system shared with the XY system, 2) the full set from the XY system, and 3) unique to the XY1Y2 system, compared to the expression ratio for alternate to reference alleles at heterozygous sites in autosomes. Relative expression of X and Y alleles was estimated per gene by counting the numbers of mRNA reads covering sex-linked SNPs (or autosomal SNPs for autosomes). The dotted line shows the expectation when X and Y alleles (or ref and alt alleles in autosomes) are equally expressed. Error bars show 1.5 times interquartile range, approximately corresponding to two standard deviations, and notches correspond approximately to 95% confidence intervals for the medians.

Figure A3.1.4: The number of parsimony-estimated lineage-specific substitutions (synonymous or nonsynonymous) on the X and Y sequences from the XY and XY1Y2 systems, using orthologous sequences from the outgroup, R. bucephalophorus, to polarize the changes along the X and Y lineages separately. ‘Shared’ genes represent those shared with the XY system, while ‘unique’ genes represent those that are not shared.


Figure A3.1.5: Changes in codon usage for X and Y genes. Total numbers of parsimony-estimated lineage-specific changes from preferred->unpreferred and unpreferred->preferred for the XY and XY1Y2 systems. ‘Shared’ genes represent those shared with the XY system, while ‘unique’ genes represent those that are not shared.

Figure A3.1.6: DESeq normalization coefficients (‘scaling factors’) from all genes compared with normalization using just autosomal genes for the XY1Y2 system progeny data. Males are shown as squares and females as circles. No clear bias is observed using scaling factors from either the total gene set or autosomal genes alone.


Figure A3.1.7: Distribution of average autosomal gene expression in males divided by average expression in females for the XY1Y2 system progeny data (6 males; 6 females; 1167 autosomal genes). The dotted line shows a male/female ratio of 1, indicating no sex-specific change. We normalized each sample by the total number of mapped reads to make the 12 biological replicates comparable.

Table A3.1.6: Results of statistical tests of differences between male and female expression for different gene sets. Statistical tests were performed throughout using the beta binomial test (see methods), with median differential expression normalization.

Gene set¹                              Total    Number         Number         Percent        Percent        Number of genes     Number of genes,
                                       number   significant,   significant,   significant,   significant,   significant at 5%   10% FDR, that
                                       tested   5%             10% FDR        5%             10% FDR        that show female    show female
                                                                                                            overexpression      overexpression
Autosomal, XY1Y2 system, progeny       1167     32             2              2.7            0.2            9                   1
Hemizygous, XY1Y2 system, progeny      119      94             47             79             39.5           94                  47
‘Old’, XY1Y2 system, progeny           458      72             21             15.7           4.6            32                  5
‘Young’, XY1Y2 system, progeny         167      15             6              9.0            3.6            6                   2
Autosomal, XY1Y2 system, population    1166     179            48             15.4           4.1            62                  14
Hemizygous, XY1Y2 system, population   119      92             52             77.4           43.7           91                  52
‘Old’, XY1Y2 system, population        458      97             22             21.2           4.8            37                  5
‘Young’, XY1Y2 system, population      167      31             8              18.6           4.8            15                  3
‘Old’, XY system, population           584      64             5              11.0           0.9            25                  0
Hemizygous, XY system, population      105      68             5              64.8           4.8            68                  5
Autosomal, XY system, population       889      23             1              2.6            0.1            14                  1
Hemizygous, XY system, progeny         104      57             24             54.8           23.1           55                  24
‘Old’, XY system, progeny              578      123            39             21.2           6.7            48                  15
Autosomal, XY system, progeny          888      156            46             17.6           5.2            59                  17

¹ Gene sets include male and female progeny from crosses (‘progeny’), and data from 12 population samples (‘population’). In all cases six males were compared with six females using a global normalization procedure for all 12 samples. ‘Old’ genes in the XY1Y2 system represent those shared with the XY system, while ‘young’ genes are not shared. ‘Old’ genes for the XY system represent the entire complement of genes identified in this system. In all cases only genes with a maximum expression of at least 20 reads across samples were retained for analysis.

Table A3.1.7: Chromosome-specific PAML estimates of the per-site synonymous substitution rate (Ks) and the ratio of nonsynonymous to synonymous substitutions (ω) in sex-linked genes.

Sex chromosome system   Gene set   Average Ks (Standard Error)   Average ω (Standard Error)
XY1Y2                   Old X      0.00870 (0.00411)             0.156 (0.0379)
XY1Y2                   Old Y      0.0120 (0.00160)              0.401 (0.0533)
XY1Y2                   Young X    0.00276 (0.00116)             0.145 (0.0553)
XY1Y2                   Young Y    0.00297 (0.00109)             0.209 (0.0730)
XY                      Old X      0.00661 (0.00116)             0.169 (0.0367)
XY                      Old Y      0.0122 (0.00185)              0.381 (0.0494)

CHAPTER 4

REDUCED NUCLEOTIDE DIVERSITY ON THE Y CHROMOSOME FOLLOWING RECENT SUPPRESSION OF RECOMBINATION IN RUMEX HASTATULUS

This chapter resulted from collaboration with Wei Wang, Spencer C.H. Barrett, and Stephen I. Wright.

SUMMARY

X and Y chromosomes are expected to differ in effective population size (Ne), rates of recombination, and patterns of positive and negative selection, all of which can have important effects on relative levels of neutral variability. In non-recombining Y chromosome regions, selective interference among linked loci, including the removal of deleterious mutations and linked variants by purifying selection (background selection), and the fixation of linked variants due to positive selection (selective sweeps), is predicted to reduce the Ne of Y-linked genes, resulting in lower levels of standing variation on the Y chromosome compared to X chromosomes or autosomes. Furthermore, female-biased sex ratios and high variance in male reproductive success may also reduce Y-linked diversity below neutral expectations. To test whether recent suppression of recombination in the dioecious plant Rumex hastatulus is associated with a reduction in Y-linked diversity, we compared levels of neutral polymorphism of X- and Y-linked genes with that of autosomal genes in the two chromosomal races (XY and XY1Y2) that occur in this species. We found that Y-chromosome diversity was 40-fold lower than on the X chromosome, and nearly 50-fold lower than on autosomes. Our analysis indicates that this severe reduction cannot be explained by a reduced Ne of Y-linked genes caused by the observed occurrence of female-biased sex ratios, and is unlikely due to mutation rate differences between the sex chromosomes or high variance in male reproductive success. Rather, in combination with recent results indicating an accumulation of deleterious mutations on the Y chromosome, our finding suggests that selective interference has likely played an important role in reducing diversity during the early stages of Y-chromosome degeneration. In addition, the recent fusion event resulting in the formation of the XY1Y2 sex chromosome system was associated with a significant reduction in genetic diversity on the X chromosome in XY1Y2 compared to XY populations, suggesting the possibility of a selective sweep or a disproportionate loss of X-diversity following the origin of the XY1Y2 from the XY race.

INTRODUCTION Morphologically distinct X and Y chromosomes have evolved many times independently in both plant and animal kingdoms (Bull 1983). The parallel changes that have occurred during sex chromosome evolution, including the suppression of recombination and the genetic degeneration of the Y-chromosome, provide an informative context for investigating the evolutionary forces that affect levels of genetic diversity within the genome. A fundamental difference between sex chromosomes and autosomes is the number of these chromosome types in the population; whereas autosomes in diploid species are present in two copies in each sex, the X chromosome is present in two copies in females and only one copy in males. As a consequence, the effective population size

(Ne) of the X chromosome is predicted to be 3/4 that of the autosomes, whereas the Ne of the Y chromosome is expected to equal 1/4 that of autosomes (assuming an equal number of reproducing females and males). Such differences in Ne are predicted to directly affect relative levels of neutral polymorphism maintained on these chromosomes, which is proportional to the product of Ne and the neutral mutation rate (Kimura 1983). In turn, variation in polymorphism levels within the genome may have important consequences on patterns of DNA sequence evolution, including the effectiveness of natural selection on weakly adaptive or mildly deleterious mutations (Charlesworth 2009). Several factors can modulate or accentuate the differences in effective population size of sex chromosomes and autosomes predicted by a neutral model. These include population subdivision and sex-biased dispersal, deviations from a 1:1 breeding sex ratio, and high variance in reproductive success (Caballero 1994 1995; Charlesworth 2009; Ellegren 2009). For example, studies comparing levels of variability on the X chromosome and autosomes in humans have found evidence that sex-biased dispersal has likely reduced current levels of X-to-autosome (X:A) nucleotide polymorphism (Keinan et al. 2009; and see Bentley et al. 2008), though there is considerable variation in the estimation of X:A ratios in humans, and other demographic models have also been

CHAPTER 4. REDUCED NUCLEOTIDE DIVERSITY ON THE Y CHROMOSOME 71 suggested, including a historical excess of breeding females (Hammer et al. 2008; Bustamante and Ramachandran 2009; Ellegren 2009). Recent theoretical work also indicates that population size reductions can lead to more severe decreases in sex-linked than autosomal variation (Pool and Nielsen 2007), a prediction that has been supported in a diverse array of taxa, including chimpanzees and orangutans (Kaessmann et al. 2001; Fischer et al. 2006), the house mouse Mus musculus (Baines and Harr 2007), Drosophila (Andolfatto 2001), and humans (Keinan et al. 2009). Genetic variability on the sex chromosomes is also predicted to be affected by the evolution of suppressed recombination on the Y chromosome, which can result in a lowered effective population size of Y-linked genes and an increased vulnerability to selective sweeps (Maynard Smith and Haigh 1974) and background selection (Charlesworth 1994). Both of these processes are examples of the Hill–Robertson (HR) effect, in which the effective population size for a given chromosomal region depends strongly on the rate of recombination, such that selection in that region is not independent of selection at nearby sites (Hill and Robertson 1966). Although HR effects can occur throughout the genome (McVean and Charlesworth 2000), large genomic regions in which recombination is suppressed, such as the Y chromosome, are expected to experience the most severe reductions in effective population size (Charlesworth 1996; Charlesworth and Charlesworth 2000; Bachtrog 2013). Consistent with these predictions, studies examining Y chromosome variability in humans (Wilson Sayres et al. 2014), Drosophila (Bachtrog et al. 2008; Singh et al. 
2014) and several mammalian species (Hellborg and Ellegren 2004) have found evidence that selective interference has been an important factor reducing levels of Y-chromosome polymorphism, though much less is known about the relative effects of selective sweeps (but see Bachtrog 2004). Despite strong interest in determining the effects of recombination rate on patterns of genetic diversity (e.g., Wright et al. 2006; Haudry et al. 2008; Kulathinal et al. 2008) and the forces contributing to sex chromosome evolution (Ellegren 2011; Otto et al. 2011; Bachtrog 2013; Charlesworth 2015), we still know very little about the influence of HR effects on patterns of nucleotide polymorphism, especially in young sex chromosome systems, where recombination was suppressed only recently. Young sex chromosome systems provide an opportunity to investigate the evolutionary mechanisms

that operate in the early stages of sex chromosome evolution, and can be particularly informative in cases where the sex chromosomes evolved de novo from autosomes, as in dioecious plants (Charlesworth 2012). Recent studies of young plant sex chromosome systems have found that several features of X and Y chromosome evolution are shared between plant and animal kingdoms, including the genetic deterioration of the Y chromosome (Bergero and Charlesworth 2011; Chibalina and Filatov 2011; Hough et al. 2014) and the evolution of dosage compensation (Muyle et al. 2012; Papadopulos et al. 2015). However, the changes in effective population size and genetic diversity predicted from evolutionary models of Y-chromosome degeneration have not been widely studied (but see Filatov et al. 2001; Qiu et al. 2010). Given that HR interference is predicted to have the greatest effects during the early stages of Y-chromosome evolution (Bachtrog 2008), studying recently evolved sex chromosomes provides an important test of the temporal dynamics of the HR effect.

To investigate the early stages of plant sex chromosome evolution and the effects of different recombination environments on patterns of nucleotide diversity, we analyzed sequence polymorphism on sex chromosomes and autosomes in the dioecious plant Rumex hastatulus (Polygonaceae). This species is a dioecious annual with highly heteromorphic (del Bosque et al. 2011; Grabowska-Joachimiak et al. 2014) and recently diverged X and Y chromosomes (Navajas-Pérez et al. 2005). Significantly, it possesses a polymorphic sex chromosome system consisting of both XY and XY1Y2 males that occur in geographically distinct populations (‘chromosomal races’; see Smith 1963; Hough et al. 2014). In common with several other Rumex species (reviewed in Barrett et al. 2010), R. hastatulus populations are characterized by female-biased sex ratios, in which females are on average 20% more frequent than males throughout their range in the southern USA (Pickup and Barrett 2013). In evaluating any potential differences in X- and Y-linked nucleotide diversity in this species as evidence of selection affecting the Y chromosome, it is therefore important to assess the extent to which sex ratio bias has resulted in chromosome-specific variation in Ne. We assess this by comparing observed estimates of polymorphism to theoretical predictions of the effect of sex ratio bias on levels of diversity on sex chromosomes and autosomes (Figures 4.1, 4.2). In addition, we investigate whether the recent origin of the XY1Y2 sex chromosome system in R. hastatulus is associated with a shift in the relative effective population size of the X chromosome, as would be predicted if the autosomal fusion that led to the formation of this system was associated with a selective sweep (Charlesworth 1996).

METHODS

Population samples and sex-linked genes

We analyzed sex-linked and autosomal genes identified from Illumina RNA sequence data from 24 individuals (12 males and 12 females, with 1 male and 1 female from each of 12 different populations). Samples of both sex chromosomal races were collected in 2010 from throughout the native range of R. hastatulus (locations in Table A4.1.1), and plants were grown in the glasshouse from seeds collected from open-pollinated females. We extracted RNA from leaf tissue using Spectrum Plant Total RNA kits (Sigma-Aldrich). The isolation of mRNA and cDNA synthesis were conducted according to standard Illumina RNAseq procedures, with sequencing conducted on two Illumina HiSeq lanes with 150-bp end reads at the Genome Quebec Innovation Center. Reads from these 24 samples were mapped to the R. hastatulus reference transcriptome described in Hough et al. (2014), and sequences are available from the GenBank Short Read Archive under accession no. SRP041588. Reads were mapped using the Burrows–Wheeler Aligner (release 0.6.2-r126; Li and Durbin 2009), followed by Stampy (release 1.0.20; Lunter and Goodson 2011). We used Picard tools (release 1.78, http://picard.sourceforge.net) to convert the mapping output into the format required by the Genome Analysis Toolkit (GATK version 2.1-11; McKenna et al. 2010) variant calling software, and subsequently removed genes with low coverage (<10x) and low Phred quality scores (<20). These sequences were previously reported in Hough et al. (2014), where they were used to validate the ascertainment of sex-linked contigs identified through segregation analysis. From the list of sex-linked genes described in Hough et al. (2014) that were shared between the Texas (XY) and North Carolina (XY1Y2) populations, we extracted a total of 460 sex-linked genes for subsequent analysis.

Autosomal genes


In evaluating evidence for nucleotide diversity differences between X and Y chromosomes, it is important to distinguish between reduced Y-linked diversity and the possibility that X-linked diversity is elevated above the levels predicted from a neutral model. To do this, we normalized our sex-linked diversity estimates by mean autosomal diversity, and compared empirical X:A and Y:A nucleotide diversity ratios to those predicted from a neutral model. However, because the criterion for identifying autosomal loci in Hough et al. (2014) was the occurrence of four segregating SNPs per locus, that autosomal set is likely higher in diversity than the average autosomal gene. We therefore used the broader set of non-sex-linked (putatively autosomal) genes from our transcriptome data for this analysis. We filtered this set to remove genes that may be sex-linked but were not identified as such by the conservative ascertainment criteria in Hough et al. (2014). In particular, we removed: 1) any genes in which there was evidence for at least one sex-linked SNP, 2) any genes with fixed heterozygosity in males and fixed homozygosity in females, 3) any genes with less than 10X coverage or greater than 100X coverage (to filter out duplicate genes or those with highly repetitive sequences), using independently obtained genomic coverage data for this species, and 4) any genes containing SNPs with large allele frequency differences between males and females (>0.4). Finally, we also removed any genes with fewer than 100 synonymous sites in order not to bias results toward genes that may have been particularly short due to assembly problems. This filtering resulted in a set of 12,356 and 11,350 autosomal genes in the Texas and North Carolina races, respectively.
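The filtering criteria listed above can be read as a single pass/fail test applied to each putatively autosomal gene. The Python sketch below is illustrative only: the `GeneSummary` record and all of its field names are hypothetical stand-ins for per-gene summaries computed upstream, and only the thresholds are taken from the text.

```python
from dataclasses import dataclass


@dataclass
class GeneSummary:
    # Hypothetical per-gene summary; field names are assumptions for illustration.
    name: str
    has_sex_linked_snp: bool          # criterion 1
    fixed_male_het_female_hom: bool   # criterion 2
    genomic_coverage: float           # criterion 3 (mean depth, in X-fold)
    max_sex_freq_diff: float          # criterion 4 (largest male-female allele frequency difference)
    synonymous_sites: float           # final length filter


def is_putatively_autosomal(g: GeneSummary) -> bool:
    """Return True only if the gene passes every filter stated in the text."""
    if g.has_sex_linked_snp:
        return False
    if g.fixed_male_het_female_hom:
        return False
    # Keep genes with 10X to 100X genomic coverage.
    if g.genomic_coverage < 10 or g.genomic_coverage > 100:
        return False
    # Remove genes with SNPs whose allele frequencies differ by >0.4 between sexes.
    if g.max_sex_freq_diff > 0.4:
        return False
    # Remove genes with fewer than 100 synonymous sites.
    if g.synonymous_sites < 100:
        return False
    return True
```

Writing the filters as independent early returns keeps each criterion separately auditable, which matters when the retained gene count is reported per race.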

Phasing X and Y alleles

To estimate polymorphism for the X and Y sequences separately, it is necessary to infer the phase of SNPs in sex-linked transcripts in males. In previous work, alleles on R. hastatulus sex chromosomes were phased using segregation analysis from a genetic cross (Hough et al. 2014). Here, to phase SNPs from population samples, for which such segregation data were unavailable, we used HAPCUT (Bansal and Bafna 2008), a maximum-cut-based algorithm that reconstructs haplotypes using sequenced fragments (Illumina read data) from the two homologous chromosomes in a diploid individual and outputs a list of phased haplotype blocks containing the variants on each chromosome. Because the haplotype blocks produced by HAPCUT contained SNPs that were phased relative to each other but not assigned to either the X or the Y chromosome, we assigned individual variants to X or Y by identifying fixed X-Y differences within each haplotype block (i.e., sites at which all females were homozygous and all males were heterozygous). Identifying such fixed differences within phased haplotype blocks enabled us to infer the correct phase (X or Y) of the polymorphisms from HAPCUT’s output. This was done by matching the phase of fixed X-Y differences with their neighboring polymorphic sites: when a fixed X-Y difference occurred in the same phased haplotype block as a polymorphic site, the polymorphic variants (alleles) in that block were assigned to either X or Y based on the known phase of the fixed difference with which they were matched. SNPs that were identified outside of phased blocks, or in blocks without fixed X-Y differences, were recorded as missing data. In addition, we filtered SNPs on coverage (>60) and QUAL score (>60), and removed those within 10 bp of indels. This procedure was conducted using custom Perl scripts (available from GitHub), and allowed us to produce fasta alignments of X and Y sequences for 372 sex-linked genes from the 24 individuals in our study.

We further validated the results of HAPCUT’s allele phasing by comparing the accuracy of this method with the phasing-by-segregation method that was conducted using data from a genetic cross in Hough et al. (2014). To do this, we first phased the sequence data from parents and their progeny using HAPCUT’s algorithm (using the same parameters as above). Since the phase of alleles in the progeny could be derived genetically from the parental genotypes in the cross, we then calculated HAPCUT’s error rate for these data by determining the proportion of genes in which there was discordance between the phase inferred by HAPCUT and that of the phasing-by-segregation method. In particular, we looked for cases where single nucleotide polymorphisms were called on the Y chromosome from the family data, where the true rate of polymorphism was zero. We found that 7% of sex-linked genes had either genotyping or phasing errors that resulted in false SNP calls on the Y chromosome within the family data, leading to a corresponding SNP error rate estimate of 1.7 × 10⁻⁴. This rate is very low relative to the population-based estimates of polymorphism on the X and autosomes (Table 4.1), and therefore should have minimal effects on our estimation of X polymorphism. However, the rate is high relative to the expected level of true polymorphism on the Y chromosome; we therefore manually filtered genes in which we found evidence for false-positive Y polymorphisms arising from phasing error caused by gene duplicates (more than two haplotypes), polymorphisms around indels, or genotyping errors caused by low Y expression. This filtering was done by inspecting sequences in IGV (Thorvaldsdóttir et al. 2013) and examining each putative polymorphism on the Y chromosome individually.
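The step that orients each phased block, using fixed X-Y differences to decide which of the two haplotypes is the X and which is the Y, can be sketched as follows. This is a minimal Python illustration and not the custom Perl scripts used in the study. The function and type names are hypothetical, the X allele at a fixed difference is assumed to be the one for which all females are homozygous, and treating a block with contradictory fixed differences as missing data is an added assumption.

```python
from typing import Dict, Optional, Tuple

Haplotype = Dict[int, str]  # maps site position -> allele on that haplotype


def assign_block_to_x_and_y(
    hap_a: Haplotype,
    hap_b: Haplotype,
    fixed_diffs: Dict[int, Tuple[str, str]],  # position -> (X allele, Y allele)
) -> Optional[Tuple[Haplotype, Haplotype]]:
    """Return (x_haplotype, y_haplotype) for one phased block, or None.

    None means the block cannot be oriented: it contains no informative
    fixed X-Y difference, or its fixed differences disagree (assumption).
    SNPs in such blocks would be recorded as missing data.
    """
    votes_a_is_x = 0
    votes_b_is_x = 0
    for pos, (x_allele, y_allele) in fixed_diffs.items():
        if pos not in hap_a or pos not in hap_b:
            continue  # this fixed difference lies outside the block
        if hap_a[pos] == x_allele and hap_b[pos] == y_allele:
            votes_a_is_x += 1
        elif hap_a[pos] == y_allele and hap_b[pos] == x_allele:
            votes_b_is_x += 1
    if votes_a_is_x > 0 and votes_b_is_x == 0:
        return hap_a, hap_b
    if votes_b_is_x > 0 and votes_a_is_x == 0:
        return hap_b, hap_a
    return None
```

Once a block is oriented, every polymorphic site in it inherits the X or Y label of the haplotype it sits on, which is the matching step described in the text.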

Estimating polymorphism on sex chromosomes and autosomes

For each locus in our analysis, we calculated Watterson’s (1975) estimator of the population parameter θ = 4Neµ, where Ne is the effective population size and µ is the mutation rate, using a modified version of the Perl program Polymorphurama (Bachtrog and Andolfatto 2006; modified script available on GitHub). To compare sex-linked and autosomal loci we calculated the average value of θ, weighted by the number of synonymous sites in each gene (Figures 4.1, 4.2; Table 4.1). We obtained confidence intervals (95%) for our estimates of the X:A and Y:A diversity ratios by bootstrapping using the BCa method (Efron 1987) implemented in the boot package in R (Canty and Ripley 2012; R Core Team 2011), calculating the X:A and Y:A diversity ratios on each iteration for 20,000 replicates each. Bootstrapping was conducted on the final filtered set of 173 sex-linked and 12,355 autosomal genes from the Texas race, and separately for the 176 sex-linked and 11,349 autosomal genes from the North Carolina race. We also tested whether estimates of diversity on the X chromosome calculated from female (XX) sequence data were consistent with estimates from the phased X sequence from males in order to exclude the possibility of residual errors due to phasing. As no significant difference was observed, we report only results from the phased X sequences from females.
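The estimation steps above can be illustrated with a short Python sketch. It is not the modified Polymorphurama script: the function names are hypothetical, the per-gene input format (θ per site, number of synonymous sites) is an assumption, and the confidence interval shown is a simple percentile bootstrap rather than the BCa method used in the study.

```python
import random
from typing import List, Sequence, Tuple

Gene = Tuple[float, float]  # (theta per synonymous site, number of synonymous sites)


def watterson_theta(segregating_sites: int, n_sequences: int, n_sites: float) -> float:
    """Watterson's (1975) estimator per site: S / (a_n * L)."""
    a_n = sum(1.0 / i for i in range(1, n_sequences))
    return segregating_sites / (a_n * n_sites)


def weighted_mean_theta(genes: Sequence[Gene]) -> float:
    """Mean theta across genes, weighted by synonymous sites per gene."""
    total_sites = sum(sites for _, sites in genes)
    return sum(theta * sites for theta, sites in genes) / total_sites


def bootstrap_ratio_ci(
    sex_linked: List[Gene],
    autosomal: List[Gene],
    replicates: int = 2000,
    alpha: float = 0.05,
    seed: int = 1,
) -> Tuple[float, float]:
    """Percentile bootstrap CI for the sex-linked:autosomal diversity ratio.

    Genes are resampled with replacement within each class. This is a
    simpler stand-in for BCa. It assumes each resampled autosomal set
    has non-zero mean diversity.
    """
    rng = random.Random(seed)
    ratios = []
    for _ in range(replicates):
        s = rng.choices(sex_linked, k=len(sex_linked))
        a = rng.choices(autosomal, k=len(autosomal))
        ratios.append(weighted_mean_theta(s) / weighted_mean_theta(a))
    ratios.sort()
    lower = ratios[int((alpha / 2) * replicates)]
    upper = ratios[int((1 - alpha / 2) * replicates) - 1]
    return lower, upper
```

Weighting by synonymous sites means that long genes contribute proportionally more to the mean, so the weighted average is equivalent to pooling sites across genes.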

Neutral predictions and testing the effect of a sex ratio bias

Wright's (1931) formula Ne = 4NfNm/(Nf + Nm) gives the effective population size of a randomly mating population with separate sexes and a Poisson distribution of offspring numbers, where Nf and Nm are the female and male effective population sizes, respectively. The predicted effective population size for X-linked genes relative to autosomes is given by NeX/NeA = 9(Nf + Nm)/[8(Nf + 2Nm)] (Wright 1931). In a neutral model with equal sex ratios, NeX/NeA = 0.75. For populations with female-biased sex ratios, Nm becomes small relative to Nf and this ratio can become larger than 1, approaching 1.125 at the limit (Caballero 1995; Figures 4.1, 4.2). For the Y chromosome, the effective population size is half the male effective size, Nm/2, which equals 1/4 of the total Ne when the sex ratio is equal. Under neutrality, the level of nucleotide polymorphism maintained in a population is proportional to the product of the mutation rate and the effective population size: θ = 4Neµ (Watterson 1975; Kimura 1983). Assuming equal neutral mutation rates for all genes and an equal number of reproducing males and females, Y-linked genes should therefore have 1/3 the polymorphism of X-linked genes, and 1/4 that of autosomal genes. Figure 4.1 shows the predicted relative effective population size ratios for autosomal, X-linked, and Y-linked genes (left) and the corresponding X:A and Y:A ratios of diversity (right) for a sex ratio bias ranging from Nf/(Nf+Nm)=0 to Nf/(Nf+Nm)=1.
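The neutral expectations above are easy to evaluate numerically. The following Python sketch uses hypothetical function names, and one choice here is an assumption: the ratios are expressed relative to autosomes at the same sex ratio, as in the NeX/NeA expression in the text. It computes the expected X:A and Y:A ratios for a given proportion of females.

```python
from typing import Tuple


def ne_autosome(nf: float, nm: float) -> float:
    """Wright's (1931) autosomal effective size with separate sexes."""
    return 4.0 * nf * nm / (nf + nm)


def ne_x(nf: float, nm: float) -> float:
    """Effective size for X-linked genes."""
    return 9.0 * nf * nm / (2.0 * nf + 4.0 * nm)


def ne_y(nf: float, nm: float) -> float:
    """Effective size for Y-linked genes: half the male effective size."""
    return nm / 2.0


def neutral_ratios(prop_female: float, n_total: float = 1000.0) -> Tuple[float, float]:
    """Return (X:A, Y:A) expected under neutrality.

    prop_female is Nf/(Nf+Nm) and must lie strictly between 0 and 1.
    n_total is arbitrary because it cancels in both ratios.
    """
    nf = prop_female * n_total
    nm = (1.0 - prop_female) * n_total
    a = ne_autosome(nf, nm)
    return ne_x(nf, nm) / a, ne_y(nf, nm) / a
```

With an equal sex ratio, `neutral_ratios(0.5)` recovers the 0.75 and 0.25 expectations. With a proportion of females of 0.62, the Y:A expectation is about 0.20.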

FIGURE 4.1: A: The relationship between the relative effective population sizes and sex ratio bias for genes on autosomes (A), X chromosomes (X), and Y chromosomes (Y). The sex ratio is shown as the proportion of females, Nf/(Nf+Nm), ranging from 0 to 1, where Nm and Nf are the effective number of breeding males and females, respectively. Calculations assume a constant population size, nonoverlapping generations, and a Poisson distribution of offspring number (Wright 1931; Charlesworth 2009). The Ne for each chromosome type is calculated relative to the predicted Ne for genes on autosomes with an equal sex ratio. B: The corresponding expected relationship between sex ratio bias and the sex-chromosome-to-autosome diversity ratio, assuming θ = 4Neµ and equal neutral mutation rates for all genes.

RESULTS AND DISCUSSION

Y-chromosome diversity is very low in the XY system

Our analysis of polymorphism demonstrates that diversity on the Y chromosome in R. hastatulus is severely reduced, with estimates for the Texas race indicating a 50-fold reduction compared to mean autosomal diversity, and a 40-fold reduction compared to X chromosome diversity (Figure 4.2; Table 4.1). Such a low level of Y chromosome variability is predicted by selective interference models of Y-chromosome degeneration (Charlesworth and Charlesworth 2000), but could conceivably also arise because of a lowered Y chromosome mutation rate, or because of high variance in male reproductive success leading to reduced Y chromosome effective population size. Recent analysis of substitution rates on R. hastatulus sex chromosomes, however, revealed that the numbers of synonymous substitutions on X and Y chromosomes, estimated by both parsimony and maximum likelihood methods, were not significantly different (see Figure 3.4A in Chapter 3; Appendix 3.1, Figure A3.1.4), suggesting that the observed reduction in diversity on the Y chromosome is unlikely to be caused by mutation rate differences between X and Y chromosomes. In addition, by normalizing X- and Y-chromosome diversity estimates by autosomal diversity, our results indicate that the X-Y difference is also not due to some process specifically elevating species-wide diversity for X-linked genes (Figure 4.1).

Table 4.1: Observed estimates of neutral diversity on R. hastatulus chromosomes

                 Texas (TX), XY             South Carolina (SC), XY1Y2   Florida (FL), XY1Y2
Chromosome       Average      Relative to   Average      Relative to     Average      Relative to
                 weighted θ   autosomes     weighted θ   autosomes       weighted θ   autosomes
A (♀)            0.0055                     0.0058                       0.0048
A (♂)            0.0065                     0.0061                       0.0052
A (average)      0.0060       1             0.0060       1               0.0050       1
X                0.0047       0.85          0.0019       0.33            0.002        0.37
Y                0.00012      0.021         0.00012      0.020           0.00010      0.019


The occurrence of female-biased sex ratios in R. hastatulus populations is also predicted to lower Y diversity, and empirical estimates of sex ratio bias in this species commonly range from Nf/(Nf+Nm)=0.6 to Nf/(Nf+Nm)=0.65, with a mean sex ratio of 0.62 (n=46 populations; Pickup and Barrett 2013). This level of female bias would predict a Y:A diversity ratio of approximately 0.2 (Figures 4.1, 4.2). Even using the upper bound of our Y:A diversity estimate from the bootstrapped confidence interval, Y:A=0.07, this would still leave a more than 2-fold reduction that could not be explained by the sex ratio effect alone. The extent of female bias required to explain our observed reduction in Y:A diversity would need to be approximately Nf/(Nf+Nm)=1, which implies a population composed almost entirely of females and is thus biologically unrealistic. Furthermore, this would result in an X:A diversity ratio that is 30% larger than what we found for the Texas race (X:A = 0.85), suggesting that this is unlikely. Another factor that could contribute to the reduced Y diversity that we observed is high reproductive variance among male plants. This is commonplace in most annual plant species, which commonly exhibit striking phenotypic plasticity in plant size and thus male flower production. Plastic differences in condition among plants in populations owing to spatial heterogeneity in resources could potentially influence the intensity of male-male competition. Male plants in this wind-pollinated species produce copious amounts of pollen and there is likely to be strong competition to fertilize the uniovulate flowers of females. However, the Y:A diversity ratio that we observed (0.02) is too low to be explained by a model of variance in male reproductive success; the Y:A ratio that would be predicted from such a model is 0.19, which is significantly higher than we observed (Appendix 4.1; Figure A4.1.3).

X and Y chromosome diversity in the XY1Y2 system

Our analysis revealed that Y-chromosome diversity was considerably higher in the North Carolina (XY1Y2) race than in the Texas (XY) race (Appendix 4.1; Figure A4.1.1). One possible explanation for this difference could be the occurrence of cryptic substructure in the North Carolina race resulting in elevated Y diversity in the pooled sample. To test this hypothesis, we conducted phylogenetic analysis of concatenated sex-linked sequences from all populations in our study. Our analysis revealed strong support for the Y sequences from the XY1Y2 race being paraphyletic (98% bootstrap support; Figure 4.3). Specifically, the samples from two populations from South Carolina (hereafter the SC sub-clade) formed a monophyletic group, whereas those from Florida and Georgia (hereafter the FL sub-clade) were more closely related to the XY race. Although our sampling is limited, these results suggest that introgression may have occurred between the ancestral Texas (XY) race and the derived North Carolina (XY1Y2) race, leading to a second, derived, XY1Y2 sub-clade.

Backcrossing of a female from the XY1Y2 clade harboring the X-autosome fusion with a male from the XY race would generate a second origin of XY1Y2 males. This could explain the two lineages of Y sequences in this race. Notably, when we separated samples by the two apparent sub-clades of Y sequences, the pattern of reduced Y-chromosome diversity was consistent across all three clades, suggesting that this effect is not population specific (Appendix 4.1, Figure A4.1.1).

FIGURE 4.2: The predicted relationship between sex ratio bias (shown as the proportion of females, Nf/(Nf+Nm)) and sex-chromosome-to-autosome ratios of polymorphism, assuming equal mutation rates of sex-linked and autosomal genes (see Methods). We calculated point estimates for X:A and Y:A polymorphism for Rumex hastatulus as the mean θ across genes, weighted by the number of synonymous sites per gene. Confidence intervals for X:A and Y:A estimates were calculated by bootstrapping (20,000 replicates) using the BCa method (Efron 1987) implemented in the boot package in R (Canty and Ripley 2012; R Core Team 2011).

Our analysis revealed that the observed X:A ratio of polymorphism in the Texas race was not significantly different from what would be expected from a model that accounts for the variance in reproductive success caused by the empirically estimated female biased sex ratio of Nf/(Nf+Nm)=0.6 (Figure 4.2). This further supports the suggestion that selective interference rather than variance in male reproductive success is responsible for the reduction in Y-polymorphism. In particular, selective interference should affect the Y chromosome independently of the X chromosome, whereas a high variance in male reproductive success is predicted to lead to a correspondingly higher X chromosome effective population size and elevated X:A diversity, which we did not observe (Figure 4.2).


FIGURE 4.3: Evolutionary relationships of sex chromosome races in Rumex hastatulus, inferred using the Neighbor-Joining method (Saitou and Nei 1987). The percentage of replicate trees in which the associated sequences clustered together in the bootstrap test (1000 replicates) are shown next to the branches (Felsenstein 1985). The tree is drawn to scale, with branch lengths in the same units as those of the evolutionary distances used to infer the phylogenetic tree. The evolutionary distances were computed using the Maximum Composite Likelihood method (Tamura et al. 2004) and are in the units of the number of base substitutions per site. The analysis was conducted on an alignment of X- and Y-linked genes from Rumex hastatulus, with orthologous autosomal sequences from the non-dioecious but closely related outgroup species Rumex bucephalophorus used to root the tree. ‘N’ designates the North Carolina race, and ‘T’ the Texas race, with numbers 1-6 corresponding to each of the 6 populations sampled. The inferred SC (South Carolina), and FL (Florida) sub-clades are indicated.

Interestingly, we observed a significant reduction in X:A polymorphism in the derived SC and FL sub-clades of the North Carolina race (X:A = 0.33 in SC and X:A = 0.37 in FL) compared to the Texas race (X:A = 0.85). We suggest that this reduction may be associated with the recent origin of the XY1Y2 sex chromosome system that occurs in these two sub-clades. The XY1Y2 sex chromosome system in R. hastatulus is thought to have originated through an X-autosome fusion (Smith 1964) involving the ancestral third chromosome in the Texas race. Evidence supporting this autosomal origin was recently obtained by Grabowska-Joachimiak et al. (2014), who reported that the ancestral third chromosome in the Texas race carries the 5S rDNA locus, which is now found on both the neo-X and the Y2 sex chromosomes in the derived North Carolina race. That this fusion is likely to have been recent is suggested by C-banding/DAPI experiments showing that while the ancestral Y chromosome is highly heterochromatic, the translocated neo-Y is still euchromatic (Grabowska-Joachimiak et al. 2014), and also shows less genetic degeneration than the ancestral Y chromosome (Hough et al. 2014). Therefore, if recent and strong positive selection was responsible for driving the evolution of this fusion, then the formerly autosomal segment on the X chromosome in the XY1Y2 sub-clades (the neo-X) will likely have experienced a selective sweep. This could explain the reduction in diversity that we observed in the derived XY1Y2 sub-clades.

Another important factor that may reduce genetic variation on the X chromosome is the partial hemizygosity of this chromosome in males. Because beneficial recessive mutations are immediately exposed to selection in males at such hemizygous X-linked loci, selective sweeps are predicted to disproportionately affect the X chromosome (Charlesworth 1996; Ellegren 2009). This should result in reduced X chromosome polymorphism relative to autosomes. Recent estimates of the number of X-hemizygous genes for the ancestral and derived X chromosomes were 114 and 119, respectively, indicating only a slightly increased opportunity for selective sweeps to have reduced X polymorphism in the derived XY1Y2 clades.

Although selective sweeps associated with the recent X-autosome fusion are likely to have reduced X:A variation, another possibility is that the emergence of the XY1Y2 race was also associated with a significant population size reduction, which can disproportionately affect X-linked compared to autosomal variation (Pool and Nielsen 2007). However, our comparison of autosomal variation in the SC and FL sub-clades with that in the Texas race did not reveal a strong genetic signal of a population bottleneck, though we did find that autosomal variation in the FL sub-clade was significantly lower than in both the Texas race and the SC sub-clade (Appendix 4.1; Figure A4.1.2). Although a significant population size reduction is unlikely to have occurred simultaneously with the origin of the XY1Y2 clades, our results suggest that a subsequent bottleneck specifically affecting the FL clade may have occurred. However, this cannot explain the low X diversity in the SC sub-clade, suggesting that a selective sweep rather than a recent population size reduction is the more likely cause of reduced X-linked diversity.

CONCLUSIONS

The recent suppression of recombination between X and Y chromosomes in R. hastatulus provides an opportunity to analyze nucleotide polymorphism of sex chromosomes and to infer the evolutionary forces involved in the early stages of Y-chromosome degeneration. Our analyses suggest that the reduced level of diversity of Y-linked genes is consistent with strong selective interference associated with ongoing Y-chromosome degeneration. If this reduction in Y diversity was caused by the occurrence of female-biased sex ratios, then X:A variation should have been significantly higher than the neutral prediction of 0.75, which we did not observe. Similarly, the extent of variance in male reproductive success required to explain our observed level of Y:A diversity also predicted an X:A ratio that was significantly higher than our empirical estimate. We therefore conclude that neither biased sex ratios nor high variance in male reproductive success are likely explanations for the reduced Y-chromosome variation in R. hastatulus. Given our evidence for the recent accumulation of deleterious mutations on the Y chromosome caused by suppression of recombination (Hough et al. 2014), our findings instead suggest that background selection plays a more important role in the early stages of Y-chromosome degeneration. Finally, our analysis of X:A variation revealed a significant reduction in X:A nucleotide diversity in derived XY1Y2 sub-clades, suggesting that a chromosome-wide selective sweep may have occurred concomitant with the origin of the XY1Y2 sex chromosome system.

ACKNOWLEDGEMENTS

We thank Felix Baudry for providing genomic coverage data for quality filtering. This research was funded by Natural Sciences and Engineering Research Council of Canada Discovery grants (to S.C.H.B. and S.I.W.) and by an OGS scholarship to J.H.

APPENDIX 4.1

SUPPORTING INFORMATION FOR CHAPTER 4

Table A4.1.1: Population identities (ID) and location information for Rumex hastatulus samples from Texas (XY race) and North Carolina (XY1Y2 race).

Texas race (XY)
Population ID   Location                       Altitude   Latitude   Longitude
TX-MTP          Mount Pleasant, Texas          130        33.17453   94.98799
OK-RAT          Rattan, Oklahoma               138        34.15755   95.41325
TX-LIV          Livingston, Texas              83         30.69947   94.79981
LA-DER          De Ridder, Louisiana           67         30.8941    93.3143
TX-ATH          Athens, Texas                  145        32.18471   95.8032
OK-WIL          Willis, Oklahoma               211        33.89663   96.83533

North Carolina race (XY1Y2)
Population ID   Location                       Altitude   Latitude   Longitude
SC-PRO          Prosperity, South Carolina     126        34.10792   81.43711
GA-BEL          Belfast, Georgia               15         31.84293   81.28405
GA-STA          Statesboro, Georgia            78         32.45237   81.84849
SC-BRA          Branchville, South Carolina    34         33.25082   80.80761
FL-HAM          Hammock, Florida               3          29.06816   82.64664
GA-GLA          Gladys, Georgia                97         31.48198   83.23783


Figure A4.1.1: Relative X:A and Y:A nucleotide diversity for XY and XY1Y2 races of Rumex hastatulus. In (A), the South Carolina and Florida sub-clades are collapsed into a single race (‘NC’), and in (B), they are separated into the inferred ‘SC’ and ‘FL’ sub-clades.

Figure A4.1.2: Autosomal nucleotide diversity for the TX, SC, and FL clades of Rumex hastatulus, with confidence intervals calculated from bootstrapping 20000 replicates using the BCa method (Efron 1987) implemented in the Boot package in R (Canty and Ripley 2012; R Core Team 2011).


Figure A4.1.3. The relationship between variance in male reproductive success (male fitness) and sex-linked effective population sizes, relative to autosomes. The effective population size ($N_e$) as a function of the variance in offspring number is given by $N_e = N(k-1) / [(k-1) + V_k/k]$, where $k$ and $V_k$ are the mean and variance in offspring number, respectively (Kimura and Crow 1964). Using sex-specific values of $k$ and $V_k$ to calculate male and female $N_e$ ($N_m$ and $N_f$, respectively) and assuming a constant population size ($k_f = k_m = 2$), non-overlapping generations, and no variance in female fitness ($V_{kf} = 0$), the relative $N_e$ of X- and Y-linked genes as a function of variance in male fitness were calculated using $N_{eX}/N_{eA} = [9 N_f N_m / (2N_f + 4N_m)] / N_{eA}$ and $N_{eY}/N_{eA} = (N_m/2) / N_{eA}$, where $N_{eA}$ is the effective population size for autosomal genes in a randomly mating population with non-overlapping generations, given by $4 N_f N_m / (N_f + N_m)$ (Wright 1931). The solid black horizontal lines correspond to our observed values for X:A (upper line) and Y:A (lower line) diversity ratios, with the middle line showing the level of Y/A diversity predicted from this model of male reproductive variance.
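The expressions in this caption can be evaluated directly. The following is a minimal Python sketch, not the code used to draw the figure: it applies the formulas exactly as they are transcribed in the caption, and the function names and the census size of 1,000 individuals per sex are hypothetical choices made for illustration (the ratios do not depend on that size).

```python
def ne_from_offspring_variance(n, k, vk):
    """Effective size as transcribed in the caption: N(k-1) / ((k-1) + Vk/k)."""
    return n * (k - 1) / ((k - 1) + vk / k)


def relative_ne(n_f, n_m):
    """Return (NeX/NeA, NeY/NeA) given female and male effective sizes."""
    ne_a = 4 * n_f * n_m / (n_f + n_m)          # autosomes (Wright 1931)
    ne_x = 9 * n_f * n_m / (2 * n_f + 4 * n_m)  # X-linked genes
    ne_y = n_m / 2                              # Y-linked genes
    return ne_x / ne_a, ne_y / ne_a


def ratios_for_male_variance(vk_m, n_per_sex=1000.0, k=2.0):
    """Relative Ne of X and Y for a given variance in male offspring number.

    Assumes kf = km = k and no variance in female fitness (Vkf = 0),
    as in the caption. n_per_sex is an arbitrary illustrative value.
    """
    n_f = ne_from_offspring_variance(n_per_sex, k, 0.0)
    n_m = ne_from_offspring_variance(n_per_sex, k, vk_m)
    return relative_ne(n_f, n_m)


if __name__ == "__main__":
    for vk_m in (0, 2, 10, 50):
        x_a, y_a = ratios_for_male_variance(vk_m)
        print(f"Vkm = {vk_m:>2}: NeX/NeA = {x_a:.3f}, NeY/NeA = {y_a:.3f}")
```

With no variance in male fitness the sketch returns the familiar values of 3/4 for X:A and 1/4 for Y:A; increasing the male variance raises X:A and lowers Y:A.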

CHAPTER 5

CHROMOSOMAL DISTRIBUTION OF CYTO-NUCLEAR GENES IN A DIOECIOUS PLANT WITH SEX CHROMOSOMES

This chapter resulted from collaboration with Arvid Ågren and Stephen I. Wright. It is published in Genome Biology and Evolution, 2014, 6:2439–2443.

SUMMARY

The coordination between nuclear and organellar genes is essential to many aspects of eukaryotic life, including basic metabolism, energy production, and ultimately, organismal fitness. Whereas nuclear genes are bi-parentally inherited, mitochondrial and chloroplast genes are almost exclusively maternally inherited, and this asymmetry may lead to a bias in the chromosomal distribution of nuclear genes whose products act in the mitochondria or chloroplasts. In particular, because X-linked genes have a higher probability of co-transmission with organellar genes (2/3) compared to autosomal genes (1/2), selection for co-adaptation has been predicted to lead to an over-representation of nuclear-mitochondrial (N-mt) and nuclear-chloroplast (N-cp) genes on the X chromosome relative to autosomes. In contrast, the occurrence of sexually antagonistic organellar mutations might lead to selection for movement of cyto-nuclear genes from the X chromosome to autosomes to reduce male mutation load. Recent broad-scale comparative studies of N-mt distributions in animals have found evidence for these hypotheses in some species, but not others. Here, we use transcriptome sequences to conduct the first study of the chromosomal distribution of cyto-nuclear interacting genes in a plant species with sex chromosomes (Rumex hastatulus; Polygonaceae). We found no evidence of under- or over-representation of either N-mt or N-cp genes on the X chromosome, and thus no support for either the co-adaptation or the sexual-conflict hypothesis. We discuss how our results from a species with recently evolved sex chromosomes fit into an emerging picture of the evolutionary forces governing the chromosomal distribution of nuclear-mitochondrial and nuclear-chloroplast genes.

INTRODUCTION


The intimate relationships between nuclear and organellar genomes in eukaryotes represent some of the most striking examples of co-evolved mutualisms (Gillham 1994; Lane 2005; Aanen et al. 2014). The long co-evolutionary history of nuclear and mitochondrial genomes is perhaps best illustrated by the finding that the vast majority of mitochondrial genes in animals have been transferred to the nuclear genome (Adams and Palmer 2003; Rand et al. 2004; Burt and Trivers 2006). Indeed, animal mitochondria now encode only a few proteins after having lost the majority of their original genes (Berg and Kurland 2000; Ridley 2000; Bar-Yaacov et al. 2012). Moreover, almost one fifth of the Arabidopsis thaliana nuclear genome is of chloroplast origin (Martin 2003), suggesting that organellar-to-nuclear gene movement has played a crucial role in the evolution of plant genetic systems. The evolution of cyto-nuclear interactions and the chromosomal distribution of the genes involved should be influenced by the contrasting modes of inheritance of organellar genes (maternal inheritance) and autosomal genes (bi-parental inheritance). This difference may, for example, result in conflict between nuclear and organellar genes over sex determination and sex ratio (Cosmides and Tooby 1981; Werren and Beukeboom 1998), and several mitochondrial genes in plants are known to cause male sterility (Burt and Trivers 2006; Touzet and Meyer 2014). In systems with XY sex determination, where males are the heterogametic (XY) and females the homogametic sex (XX), genes on the X chromosome spend 2/3 of their time in females (Rand et al. 2001) and therefore share a female-biased inheritance pattern relative to Y-linked or autosomal genes, which may result in inter-genomic co-adaptation or conflict. 
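The 2/3 figure follows from counting gene copies. A brief worked version is given here, assuming equal numbers of females and males ($N$ of each), an assumption added for illustration: females carry two X chromosomes and males one, whereas both sexes carry two copies of every autosome.

```latex
\[
\frac{\text{X copies in females}}{\text{all X copies}}
  = \frac{2N}{2N + N} = \frac{2}{3},
\qquad
\frac{\text{autosomal copies in females}}{\text{all autosomal copies}}
  = \frac{2N}{2N + 2N} = \frac{1}{2}.
\]
```

Because organellar genomes are transmitted through females, these fractions are also the probabilities of co-transmission with organellar genes for X-linked and autosomal genes, respectively.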
A potential consequence of inter-genomic conflict or co-adaptation between nuclear genes whose products interact with mitochondria or chloroplasts (mito-nuclear and chloro-nuclear genes, respectively) and other regions of the genome is a shift in the chromosomal location of such genes, which may become either more or less abundant on the X chromosome. Several molecular mechanisms have been suggested to be involved in driving gene movement, including gene duplication followed by fixation and subsequent gene loss (Wu and Xu 2003), and autosomal gene duplications followed by the evolution of sex-biased gene expression (Connallon and Clark 2011). The evolutionary mechanisms of this gene movement have also been explored by several recent studies (Drown et al. 2012; Hill and Johnson 2013; Dean et al. 2014; Rogell et al. 2014), and two main processes have been proposed to account for the movement of genes to or from the X chromosome. The co-adaptation hypothesis predicts that the co-transmission of X-linked and organellar genes should result in their co-adaptation, in which selection on beneficial epistatic interactions results in an over-representation of cyto-nuclear genes on the X chromosome relative to autosomes (Rand et al. 2004; Drown et al. 2012). In contrast, the sexual conflict hypothesis predicts the opposite chromosomal distribution, with more cyto-nuclear genes occurring on autosomes to alleviate mutation load in males.

To date, empirical evidence for these hypotheses is mixed. Drown et al. (2012) used previously published reference genomes to examine the chromosomal distribution of N-mt genes in 16 vertebrates and found a strong under-representation of such genes on the X chromosomes relative to autosomes in 14 mammal species, but not in two avian species with ZW sex-determining systems; note that the co-adaptation hypothesis does not predict that ZW systems should show a bias in the distribution of cyto-nuclear genes. Dean et al. (2014) included seven additional species with independently derived sex chromosomes in their analysis and found that the under-representation of N-mt genes on the X chromosome was restricted to therian mammals and Caenorhabditis elegans.

Here, we use sex-linked and autosomal transcriptome sequences to investigate the chromosomal distributions of cyto-nuclear interactions in the dioecious annual plant Rumex hastatulus (Polygonaceae). Examining cyto-nuclear interactions within a plant species is of interest for several reasons (see Sloan 2014). First, plants carry an additional maternally inherited organellar genome that is absent in animals, the chloroplast genome.
This provides an opportunity to compare the chromosomal distribution of two independent kinds of cyto-nuclear interacting genes: nuclear-mitochondrial and nuclear-chloroplast. Second, whereas animal sex chromosomes evolved hundreds of millions of years ago (180 MYA in mammals and 140 MYA in birds; Cortez et al. 2014), the origin of plant sex chromosomes is a more recent event (Charlesworth 2013). In R. hastatulus, sex chromosomes are thought to have evolved approximately 15-16 MYA (Navajas-Perez et al. 2005) and genes on the Y chromosome show evidence of degeneration, resulting in a considerable proportion of genes that are hemizygous on the X chromosome (Hough et al. 2014). Rumex hastatulus therefore provides an opportunity to test whether the early changes involved in sex chromosome evolution are associated with a concomitant shift in the chromosomal location of N-mt or N-cp genes. Moreover, the presence in this system of X-linked genes that have recently become hemizygous provides an opportunity to compare the chromosomal distributions of X-linked genes that are hemizygous versus those that have retained Y-linked alleles (X/Y genes). Hemizygous genes are particularly good candidates for evaluating evidence for co-adaptation and/or sexual conflict because of their relatively older age (Hough et al. 2014), and because beneficial mutations in such genes are exposed to positive selection regardless of dominance and may therefore spread more rapidly.

METHODS

Gene identification and functional annotation

We used sex-linked and autosomal transcriptome sequence data for R. hastatulus reported in Hough et al. (2014; GenBank Sequence Read Archive accession no. SRP041588), and obtained three sets of genes with which to test for an over- or under-representation of nuclear-mitochondrial or nuclear-chloroplast genes. In total our analyses included 1167 autosomal genes, 624 X-linked genes, and 107 hemizygous X-linked genes. The X-linked and hemizygous X-linked genes were shared between sex chromosome systems in this species (see Hough et al. 2014; Methods and Appendix 3.1 for full details regarding the identification of such genes from transcriptome sequence data). For autosomal genes, we included those previously identified as confidently autosomal in both R. hastatulus sex chromosome systems, as well as those uniquely identified in the XYY system. For each gene set, we queried the sequences translated in all reading frames against the A. thaliana protein database using the BLASTx homology search implemented in Blast2GO (Conesa et al. 2005), with a significance threshold (BLAST ExpectValue) of $1 \times 10^{-3}$, above which matches were not reported. We limited our searches to the A. thaliana protein database because sequence matches to this database returned more detailed functional information than is available for most other species in the NCBI plant database. We obtained BLASTx results for 1073 autosomal genes (92%), 567 X-linked genes (91%), and 95 hemizygous genes (89%). Gene Ontology (GO) terms associated with the hits from BLASTx queries were then retrieved using the ‘Mapping’ function in Blast2GO, which used BLAST accessions to link the queried sequences to functional information stored in the GO database (The Gene Ontology Consortium 2008). Gene names were retrieved using the NCBI mapping files ‘gene_info’ and ‘gene2accession’, and GO terms were assigned to query sequences using the ‘Annotation’ function with an E-Value-Hit-Filter of $1 \times 10^{-6}$ and an annotation cut-off of 55 (default parameters). Finally, we ran InterProScan (Quevillon et al. 2005) to retrieve sequence domain/motif information and merged the corresponding annotations with previously identified GO terms. This procedure generated output files containing GO IDs and functional descriptions for each gene in our data set. The numbers of genes in our final data set with functional annotations and N-mt and N-cp GO annotations are summarized in Table 5.1.

Statistical analyses

We used an approach similar to that of Drown et al. (2012) and Dean et al. (2014): we estimated the number of N-mt and N-cp genes on the X chromosome and autosomes, and then compared each of these estimates to an expected number. The expected number of N-mt genes was obtained by calculating the product of the proportion of all genes in the data set with mitochondrial annotations (matching GO:0005739) and the number of annotated genes in a given gene set. The expected numbers of N-cp genes were calculated similarly, using GO:0009507. We then calculated the ratios of the observed-to-expected numbers for both N-mt and N-cp genes in each gene set. The observed-to-expected ratio is expected to equal one when there is no under- or over-representation, to be greater than one when there is an over-representation, and to be less than one when there is an under-representation. We note that, unlike for X-linked genes, we did not have information regarding the particular chromosome locations for autosomal genes, and therefore could not obtain the expected numbers of N-mt and N-cp genes per autosome as in previous studies (Drown et al. 2012; Dean et al. 2014). The expected numbers were thus calculated assuming that the set of autosomal genes represented a random sample of the autosomal chromosomes in this species, which is likely a valid assumption given that the sequences were obtained using whole transcriptome shotgun sequencing (Hough et al. 2014). Calculating the observed-to-expected ratios across X-linked, autosomal, and X-hemizygous genes thus allowed us to determine whether any of these gene sets contained an under- or over-representation of N-mt and N-cp genes compared to the expectation based on the proportion of such genes in the full data set.

We tested the significance of over- or under-representation using Fisher’s exact tests, and calculated 95% confidence intervals for the numbers of N-mt or N-cp genes using 10,000 replicate bootstrapped samples. Given our sample sizes of genes with annotations (Table 5.1), Fisher’s exact tests allowed us to test for differences in the proportions of cyto-nuclear genes on autosomes versus the X chromosome that were on the order of 5% with ~80% power, whereas power was reduced for smaller differences (Appendix 5.1). Similarly, for hemizygous X-linked genes, we calculated that differences of approximately 10% could be detected with ~80% power. All data analysis was done in R (R Development Core Team 2013; scripts are available for download from GitHub: https://github.com/houghjosh/Cytonuclear).

TABLE 5.1: Numbers of sex-linked and autosomal genes used in analysis

Gene sets          Autosomal  X-linked  X-hemizygous
Original data set  1167       624       107
With annotation    1073       567       95
With N-mt GO ID    194        94        13
With N-cp GO ID    222        102       22
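The observed-to-expected ratios described under Statistical analyses can be computed from the counts in Table 5.1. The Python sketch below is an illustration rather than the authors' R scripts. It assumes that "all genes in the data set" means the three annotated gene sets pooled, it holds the expected number fixed while resampling, and it uses a percentile interval; none of these choices is stated explicitly in the text. All function names are hypothetical.

```python
import random

# Counts from Table 5.1: (annotated genes, N-mt genes, N-cp genes)
TABLE_5_1 = {
    "Autosomal": (1073, 194, 222),
    "X-linked": (567, 94, 102),
    "X-hemizygous": (95, 13, 22),
}


def observed_to_expected(counts):
    """counts: {gene_set: (n_annotated, n_observed)}.

    Returns {gene_set: (expected, observed/expected)}, where the expected
    number is the pooled proportion times the set's annotated gene count.
    """
    total_annotated = sum(n for n, _ in counts.values())
    total_observed = sum(obs for _, obs in counts.values())
    pooled = total_observed / total_annotated
    return {
        name: (pooled * n, obs / (pooled * n))
        for name, (n, obs) in counts.items()
    }


def bootstrap_ratio_ci(n, obs, expected, n_boot=10_000, seed=1):
    """Percentile 95% CI for observed/expected.

    Resampling n genes with replacement from a set containing `obs`
    annotated genes is equivalent to n Bernoulli draws with p = obs / n.
    """
    rng = random.Random(seed)
    p = obs / n
    ratios = sorted(
        sum(rng.random() < p for _ in range(n)) / expected
        for _ in range(n_boot)
    )
    return ratios[int(0.025 * n_boot)], ratios[int(0.975 * n_boot) - 1]


if __name__ == "__main__":
    for label, column in (("N-mt", 1), ("N-cp", 2)):
        counts = {k: (v[0], v[column]) for k, v in TABLE_5_1.items()}
        for name, (expected, ratio) in observed_to_expected(counts).items():
            n, obs = counts[name]
            # 1,000 replicates keep the demo fast; the text reports 10,000.
            lo, hi = bootstrap_ratio_ci(n, obs, expected, n_boot=1000)
            print(f"{label} {name}: O/E = {ratio:.3f} ({lo:.3f}, {hi:.3f})")
```

By construction the expected numbers sum to the total observed number across the three gene sets, so the ratios measure each set's departure from the pooled proportion.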

RESULTS AND DISCUSSION

It has been suggested that cyto-nuclear genes may be either over- or under-represented on the X chromosome compared to autosomes, depending on whether their interactions are driven by co-adaptation or sexual conflict (Rand et al. 2001; Drown et al. 2012; Hill and Johnson 2013; Dean et al. 2014; Rogell et al. 2014). We annotated sex-linked and autosomal transcriptome sequences to test these predictions in the dioecious plant R. hastatulus. We found that neither mitochondria- nor chloroplast-interacting nuclear genes were under- or over-represented on the X chromosome (Fisher’s exact test, P = 0.4947 and P = 0.3074, respectively; Figure 5.1). This pattern indicates that neither the co-adaptation nor the sexual conflict hypothesis alone is sufficient to explain the chromosomal distribution of cyto-nuclear genes in R. hastatulus.


FIGURE 5.1. Representation of the chromosomal location of cyto-nuclear genes in Rumex hastatulus. Dots represent the observed to expected ratio of mito-nuclear (N-mt) and chloro-nuclear (N-cp) genes on autosomes, the X chromosome, and hemizygous X genes, with the 95% confidence intervals estimated by bootstrapping (10,000 replicates). The vertical dotted line at 1 represents no over- or under-representation.

There are several factors that are expected to be important in determining cyto-nuclear gene distributions, and these may explain the lack of bias in R. hastatulus. For example, under both the co-adaptation and sexual conflict hypotheses, the age of the sex chromosomes will determine the extent to which selection (either for co-adaptation, or sexual antagonism) has had time to operate, which depends on the rate of gene movement onto and off of the sex chromosomes. Whereas previous studies of cyto-nuclear genes in animals have focused almost exclusively on ancient sex chromosome systems (Drown et al. 2012; Dean et al. 2014; Rogell et al. 2014), our study focused on a dioecious plant species in which sex chromosomes evolved more recently (~15 MYA; Navajas-Perez et al. 2005), and many genes likely stopped recombining much more recently (Hough et al. 2014). The lack of bias in the chromosomal distribution of cyto-nuclear genes may therefore reflect the recent time scale of sex chromosome evolution rather than the absence of biased gene movement. The relatively young age of sex chromosomes may also have played a role in the lack of bias reported in the sex and neo-sex chromosomes in three-spined stickleback, which evolved ~10 MYA (Kondo et al. 2004) and ~2 MYA, respectively (Natri et al. 2013). Comparative studies of sex chromosomes of different age will be central for understanding the rate at which organellar gene movement occurs.


In addition to being evolutionarily older, X-linked hemizygous genes are expected to show a greater effect of over- or under-representation than genes with both X- and Y-alleles because recessive mutations (involved in either co-adaptation or sexual conflict) will be exposed to selection instead of being masked by an alternate allele in a heterozygous genotype. We detected a slightly greater under-representation of X-hemizygous N-mt genes compared to autosomes or X-genes with retained Y-alleles, but the effect was not statistically significant (P = 0.4947). The opposite pattern was evident for N-cp genes, which were slightly over-represented among hemizygous genes, but again this effect was not significant (P = 0.3074). A larger sample of hemizygous genes would be required to more confidently assess whether such genes are in fact more often involved in cyto-nuclear interactions than other genes on the X chromosome, and to test whether the opposite pattern for N-mt and N-cp hemizygous genes is a result of a different rate of nuclear gene transfer between mitochondrial and chloroplast genomes. In particular, the smaller number of hemizygous X-linked genes in our data set implies that power was reduced for this comparison, such that a ~5% difference could only be detected with ~60% power (see Appendix 5.1).

Another factor that will affect the chromosomal distribution of cyto-nuclear genes is the number of N-mt and N-cp genes that were located on the autosome from which the sex chromosomes evolved. Since the origins of mitochondria and chloroplasts both vastly predate that of sex chromosomes (1.5-2 BYA compared to < 200 MYA; Dyall et al. 2004; Timmis et al. 2004; Cortez et al. 2014), gene transfer from organellar genomes to the nuclear genome began long before the evolution of sex chromosomes. A bias in the chromosomal distribution of cyto-nuclear genes in either direction may therefore arise if the ancestral autosome was particularly rich or poor in cyto-nuclear genes.
Indeed, it is striking that autosomes in the animal species previously examined exhibited extensive variation in the relative number of N-mt genes (see Drown et al. 2012, Figure 1, and Dean et al. 2014, Figures 1 and 2). That the ancestral number of N-mt and N-cp genes is likely to be important is highlighted by the fact that the majority of genes involved in mitochondrial DNA and RNA metabolism in A. thaliana are found on chromosome III (Elo et al. 2003). If such a biased autosomal distribution of organellar variation is representative of the ancestral sex chromosomes, the X chromosome could carry significantly more N-mt or N-cp genes because of this ancestral gene number rather than a biased rate of gene movement. This effect is likely exacerbated in early sex chromosome systems, where the majority of genes may not have experienced opportunities for movement. Genetic mapping and comparative genomic studies of genes that have transferred from organellar genomes after the origin of sex chromosomes may provide a means to control for ancestral differences in gene number and provide a better test of biases in organellar-nuclear gene movement.

To conclude, our study is the first investigation of the extent to which co-adaptation and sexual conflict have shaped the chromosomal distribution of cyto-nuclear genes in a plant species with sex chromosomes. We found no sign of under- or over-representation of either N-mt or N-cp genes on the X chromosome, implying that neither co-adaptation nor sexual conflict alone can explain the chromosomal distributions of these genes. Instead, we suggest that additional factors, including the age of sex chromosomes and the time that has elapsed since X-Y recombination became suppressed, are likely to have been important determinants of the patterns we observed. To determine whether the lack of under-representation of mito-nuclear genes on the X chromosome reflects an absence of gene movement, future studies should focus on quantifying rates of gene movement after sex chromosome origination, and consider the extent to which neutral processes, including the number of mito-nuclear genes on ancestral sex chromosomes, have played an important role in shaping the current chromosomal distributions of such genes. Cyto-nuclear conflict and co-evolution have undoubtedly played a major role in many aspects of genome evolution in both plant and animal systems, and the previously reported evidence from therian mammals and C. elegans (Drown et al. 2012; Dean et al. 2014) suggests that sexual conflict and co-adaptation might represent important mechanisms driving chromosomal gene movement; however, it remains unclear whether these processes have also shaped the chromosomal distribution of cyto-nuclear genes in plants.

ACKNOWLEDGEMENTS

We thank Rebecca Dean and Devin M Drown for comments on the manuscript. This research was supported by Discovery Grants to SCHB and SIW from the Natural Sciences and Engineering Research Council of Canada. JH was supported by an Ontario Graduate Fellowship and JAÅ by a Junior Fellowship from Massey College.

APPENDIX 5.1

SUPPORTING INFORMATION FOR CHAPTER 5

Power Analysis

To determine the extent to which a biased distribution of cyto-nuclear genes could be detected given our sample sizes of annotated autosomal, X-linked, and X-hemizygous genes, we calculated the power to detect significant differences based on a Fisher’s exact test. Here, power refers to the probability of correctly rejecting the null hypothesis of no difference in the proportion of cyto-nuclear genes among the gene sets, and we used the hypergeometric distribution to calculate the probability of getting the observed data under the null hypothesis that the proportions were the same (with an alpha significance level of 0.05). To better visualize the difference in power for the two main comparisons of interest (autosomal genes vs. X/Y genes, and autosomal genes vs. X-hemizygous genes), Figure A5.1.1 shows an example in which the true proportion of cyto-nuclear genes on autosomes is assumed to be 0.25 (which is approximately the empirical proportion in our data). Power is then shown as a function of the true proportion of cyto-nuclear genes on the X chromosome, ranging from 0.1 to 0.5. As discussed in the main text, our sample sizes of annotated X/Y genes (n=567; see Table 5.1 in main text) were large enough to detect differences between autosomes and X/Y genes with ~80% power given a true difference of ~5%, and as this difference becomes smaller, the power decreases. Similarly, for hemizygous X-linked genes (with n=95), differences of approximately 10% could be detected with ~80% power, and power was reduced for smaller differences. This analysis was done using G*Power (Faul et al. 2007) and R (R Development Core Team 2013).


Figure A5.1.1: Power to detect a significant difference in the proportion of cytoplasmic genes between autosomes and sex chromosomes as a function of the true proportion on sex chromosomes. Power was calculated assuming the true proportion of cytoplasmic genes on autosomes was 25%.
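The kind of power calculation described in this appendix can be approximated by simulation. The Python sketch below is not the G*Power or R analysis reported here: it swaps in a Monte Carlo estimate, drawing gene counts from binomial distributions and applying a two-sided Fisher's exact test built from the hypergeometric distribution. The function names, the number of simulations, and the example proportions are assumptions chosen for illustration.

```python
import math
import random


def _log_comb(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P value for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more probable than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    log_total = _log_comb(row1 + row2, col1)

    def log_prob(x):
        return _log_comb(row1, x) + _log_comb(row2, col1 - x) - log_total

    log_obs = log_prob(a)
    p = 0.0
    for x in range(max(0, col1 - row2), min(row1, col1) + 1):
        lp = log_prob(x)
        if lp <= log_obs + 1e-7:  # small tolerance so ties are included
            p += math.exp(lp)
    return min(1.0, p)


def simulate_power(n1, p1, n2, p2, n_sim=500, alpha=0.05, seed=1):
    """Fraction of simulated data sets in which the test rejects at alpha."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sim):
        a = sum(rng.random() < p1 for _ in range(n1))  # e.g. autosomal hits
        c = sum(rng.random() < p2 for _ in range(n2))  # e.g. X-linked hits
        if fisher_exact_two_sided(a, n1 - a, c, n2 - c) < alpha:
            rejections += 1
    return rejections / n_sim


if __name__ == "__main__":
    # Annotated gene counts from Table 5.1; autosomal proportion fixed at 0.25.
    for label, n_x in (("X/Y", 567), ("X-hemizygous", 95)):
        for p_x in (0.25, 0.30, 0.35):
            power = simulate_power(1073, 0.25, n_x, p_x, n_sim=200)
            print(f"{label}: true X proportion {p_x:.2f}, power ~ {power:.2f}")
```

Simulated power is subject to Monte Carlo error, so estimates from a few hundred replicates should be read as rough guides rather than as replacements for the values reported in this appendix.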

CHAPTER 6

EVOLUTIONARILY STABLE SEX RATIOS AND MUTATION LOAD

This chapter resulted from collaboration with Sarah P. Otto, Simone Immler, and Spencer C.H. Barrett. It is published in Evolution, 2014, 67:1915-1925.

SUMMARY

Frequency-dependent selection should drive dioecious populations toward a 1:1 sex ratio, but biased sex ratios are widespread, especially among plants with sex chromosomes. Here, we develop population genetic models to investigate the relationships between evolutionarily stable sex ratios, haploid selection, and deleterious mutation load. We confirm that when haploid selection acts only on the relative fitness of X and Y-bearing pollen and the sex ratio is controlled by the maternal genotype, seed sex ratios evolve toward 1:1. When we also consider haploid selection acting on deleterious mutations, however, we find that biased sex ratios can be stably maintained, reflecting a balance between the advantages of purging deleterious mutations via haploid selection, and the disadvantages of haploid selection on the sex ratio. Our results provide a plausible evolutionary explanation for biased sex ratios in dioecious plants, given the extensive gene expression that occurs across plant genomes at the haploid stage.

INTRODUCTION

The evolution of the sex ratio is a frequency-dependent process in which the least frequent sex obtains fitness benefits proportional to its rarity in the population (Maynard Smith 1974; Charnov 1982). This is referred to as negative frequency-dependent selection and its effects on sex-ratio evolution were discussed in early work by Darwin (1871) and formalized mathematically by Düsing (1884). Fisher (1930) showed that sex ratios evolve toward 1:1 when parents invest equally in the two sexes (see Trivers 1972). Selection for an even sex ratio is referred to here as Fisherian sex-ratio selection and is described in explicit genetic terms by Shaw and Mohler (1953) and reviewed in Karlin and Lessard (1987).

Populations of dioecious plants commonly exhibit deviations from the 1:1 sex ratio and these biases can involve an excess of females or males (Barrett et al. 2010; Sinclair et al. 2012). A recent survey of sex ratios in 243 dioecious angiosperm species, including 123 genera and 61 families, found significantly biased sex ratios in 49.8% of species (Field et al. 2012a). Of these, 76 exhibited male-biased sex ratios and 45 were female-biased, with a median male percentage of 63% and 36%, respectively. The frequent occurrence of sex-ratio bias in plants raises questions about the proximate and ultimate causes of this phenomenon, especially when it involves biased primary (seed) sex ratios. Several mechanisms have been proposed to explain bias in seed sex ratios, including competition between male- versus female-determining gametophytes (pollen tubes), a process known as ‘certation’ (Correns 1922), X-linked meiotic drive (Taylor and Ingvarsson 2003), and local mate competition (de Jong et al. 2005). Other mechanisms that could conceivably affect primary sex ratios include selective abortion of ovules (e.g. Stephenson et al. 1986; Casper 1988) and maternally induced selection among pollen tubes (Bachelier and Friedman 2011).

The possibility that competition among male gametophytes is a mechanism causing biased sex ratios is suggested by the observation that the amount of pollen deposited on stigmas (pollination intensity) affects seed sex ratios. Correns (1922, 1928) demonstrated experimentally that increasing pollen loads in Silene and Rumex species, two groups with sex chromosomes, was associated with more female-biased sex ratios, whereas sparse pollination resulted in sex ratios closer to unity. This pattern has been subsequently confirmed in several additional Rumex species (Rychlewski and Zarzycki 1975; Conn and Blum 1981; Stehlik and Barrett 2006; Field et al. 2012b).
These observations are consistent with the certation hypothesis involving poor performance of Y-bearing pollen, which in turn may be due to the degeneration of the Y chromosome caused by suppressed X-Y recombination (Smith 1963; Lloyd 1974; Charlesworth and Charlesworth 2000). In Silene latifolia, the Y chromosome is known to have partially degenerated and ~20% of the genes on this chromosome have either impaired function or severely reduced expression (Bergero and Charlesworth 2011; Chibalina and Filatov 2011). In Rumex acetosa and R. hastatulus, sex chromosomes are heteromorphic, and Y chromosomes have accumulated repetitive sequences (Mariotti et al. 2006; Ester et al. 2011). Such degeneration could contribute to mutation load and therefore to fitness differences between female- and male-determining pollen. In the absence of dosage compensation, this accumulating load could also affect the fitness of heterogametic (XY) males in the diploid sporophytic stage because of the strong overlap in gene expression between the haploid and diploid phases of the life cycle in plants (Mascarenhas 1999; Borg et al. 2009). Indeed, a higher mutation load among male diploids has been suggested in R. nivalis, in which the sex ratio becomes progressively more female biased from the seed to the flowering stage (Stehlik et al. 2007).

Several studies, including those mentioned above, have emphasized mechanisms that operate during the progamic phase (from pollination to fertilization) to account for biased seed sex ratios in plants. These are proximate explanations, however, and ultimately we want to know why plants do not evolve sex ratios closer to unity in the face of Fisherian sex-ratio selection, for example by weakening selection during the haploid stage. This issue is the main focus of the present study.

The role that gametophytic selection may play in purging the deleterious mutation load and its effects on sex-ratio evolution have not been previously considered. Mutation load refers to the reduction in individual fitness (relative to a mutation-free genotype) caused by segregating deleterious alleles, and its effects on fitness can be substantial (Agrawal and Whitlock 2012). With thousands of genes expressed during the haploid stage of plant life cycles (Borg et al. 2009), strong selection against deleterious mutations during this stage could have pervasive genome-wide effects on the mutation load of plants (Klekowski 1984; Walbot and Evans 2003).
In animals, haploid expression is much less extensive, but some genes, particularly those involved in spermatogenesis, also show potential for strong haploid selection (Joseph and Kirkpatrick 2004). Here, we develop population genetic models to investigate sex-ratio evolution in a dioecious plant with sex chromosomes. We assume that males are heterogametic (XY), both because this form of sex determination is predominant among known plant sex chromosome systems (Ming et al. 2011), and because selection among pollen does not affect the sex ratio in ZW species (since all pollen is Z-bearing). We consider genes that act in the maternal plant either to modify the sex ratio directly, or to modify the strength

of selection among X- and Y-bearing pollen, and ask how these genes evolve in the face of Fisherian sex-ratio selection. We then add recurrent deleterious mutations throughout the genome to assess their effects on modifiers that alter the strength of haploid selection. Finally, we determine the evolutionarily stable sex ratio and the level of mutation load that result in the presence of these conflicting selective pressures. We discuss the implications of our theoretical results for explaining empirical observations of sex-ratio bias in dioecious plants and for the evolution of life cycles with extensive haploid gene expression.

THE MODELS

In the presence of Fisherian sex-ratio selection, we consider the evolution of modifier alleles that affect the sex ratio in a dioecious plant population in which the haploid gametophytic phase is contained within female sporophytes. We evaluate the conditions under which such modifiers can spread and determine the evolutionarily stable sex ratio (ESS) under three scenarios that differ with regard to the stage at which female sporophytes influence the pool of pollen used at fertilization (Figure 6.1): (1) an early-acting sex ratio modifier that influences the frequency (cij) of Y-bearing pollen that germinates on the stigma before gametophytic selection, (2) a late-acting sex ratio modifier that influences the frequency (cij) of Y-bearing pollen tubes entering ovules after gametophytic selection, and (3) a modifier that alters the strength of selection in females (cij) among haploid male gametophytes, without directly selecting among them.

In each case, we assume that cij depends on the genotype at the modifier locus, which bears two alleles (M and m) that have no direct fitness effects. Because sex chromosomes may not segregate randomly during meiosis in males, we assume that males produce a fraction, α , of Y-bearing pollen, and a fraction, 1−α , of X-bearing pollen such that in the absence of sex ratio or gametophytic selection, the male to female ratio among seeds would be α :1−α . Selection among pollen tubes during their growth in the style causes the frequency of Y-bearing gametophytes to change by an amount proportional to 1−γ relative to X-bearing gametophytes, where we define γ as the

maximal strength of gametophytic selection given other constraints (e.g., constraints on style length or resources supporting pollen-tube growth).

FIGURE 6.1: Modifier evolution at different stages of the plant life cycle. Our models track the evolution of modifier alleles that affect either the sex ratio or the strength of gametophytic selection at different life cycle stages: (i) sex ratio regulation at the stage of pollen receipt before gametophytic selection (model 1), (ii) sex ratio regulation after gametophytic selection (model 2), and (iii) regulation of the strength of gametophytic selection (model 3). We census maternal genotypes after meiosis.

Model 1: Early-acting sex-ratio modifier

In the first model (Figure 6.1, Model 1), females exert control over the sex ratio at the stigma, during the stage at which pollen tube growth is initiated (see recursions in Appendix 6.1). Depending on the female genotype ij at the modifier locus, a fraction cij of Y-bearing pollen tubes and 1 − cij of X-bearing pollen tubes enter the style on average.

The cij values that are possible depend on the genetic variation that could arise to alter the ratio of pollen tube types entering the style. For example, if all of the pollen produced is X-bearing (α = 0), then cij must be zero. Although modifiers may be more abundant for certain cij values (e.g., for cij nearer α), we assume for now that the ratio of X- to Y-carrying pollen could be modified to any level as long as 0 < α < 1. We first calculate the sex ratio when the modifier allele M is fixed in the population (i.e., before allowing evolution to adjust the sex ratio). Because sex-ratio control occurs before gametophytic selection, the frequency of male seeds, ψ, reflects both female sex-ratio control (cMM) and the relative fitness of Y-bearing pollen grains (described by γ):

$$\psi = \frac{(1-\gamma)\,c_{MM}}{1-\gamma c_{MM}}. \tag{1}$$

A new modifier allele, m, that alters the ratio of X- to Y-bearing pollen tubes entering the stigma is predicted to spread when the leading eigenvalue, λ , is greater than one, where λ is calculated from the local stability matrix describing the dynamics when m is rare, as derived from the recursions in Appendix 6.1. We show in the Sup. Mat. Mathematica file that:

$$\lambda \approx 1 + \frac{(c_{Mm}-c_{MM})(1-2c_{MM}+\gamma c_{MM})}{4c_{MM}(1-c_{MM})(1-\gamma c_{MM})}, \tag{2}$$

where we assume that the effect of the new modifier is small (cMm near cMM) to simplify the solution (qualitatively, the results are similar for large-effect modifiers; see Sup. Mat.). Thus, if cMM < 1/(2 − γ), modifier alleles will invade if they increase the fraction of germinating pollen that are Y-bearing (cMm > cMM). With small modifier effects, the system thus evolves toward:

$$c^{*} = \frac{1}{2-\gamma}, \tag{3}$$

meaning that c* is a convergence stable strategy (Eshel et al. 1997). According to (2), c* also cannot be invaded by any other strategy (implying that c* is also an ESS). Note that c* is the sex ratio before gametophytic selection, and inserting equation (3) into (1) indicates that the ESS sex ratio among seeds after gametophytic selection (ψ*) is 1:1. Thus, when females exert control over the initial growth of male- versus female-determining pollen tubes, they evolve to do so in a manner that counterbalances gametophytic selection.
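The Model 1 result can be checked numerically. The following is a minimal sketch, assuming only equations (1) to (3) as written above; the function names and the value γ = 0.3 are illustrative choices, not part of the original analysis.

```python
def male_seed_fraction_model1(c_MM, gamma):
    """Equation (1): frequency of male seeds when allele M is fixed."""
    return (1 - gamma) * c_MM / (1 - gamma * c_MM)


def invasion_eigenvalue_model1(c_Mm, c_MM, gamma):
    """Equation (2): weak-modifier approximation to the leading eigenvalue."""
    numerator = (c_Mm - c_MM) * (1 - 2 * c_MM + gamma * c_MM)
    denominator = 4 * c_MM * (1 - c_MM) * (1 - gamma * c_MM)
    return 1 + numerator / denominator


def ess_model1(gamma):
    """Equation (3): convergence-stable fraction of Y-bearing pollen tubes."""
    return 1 / (2 - gamma)


gamma = 0.3  # illustrative strength of selection against Y-bearing pollen
c_star = ess_model1(gamma)
psi_star = male_seed_fraction_model1(c_star, gamma)
print(c_star, psi_star)
```

With these definitions, a modifier that raises c invades below c* and fails above it, and the seed sex ratio at c* is 1:1 for any γ below one.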

Model 2: Seed production with late-acting sex-ratio modifier

Similar results are obtained if females exert control over the sex ratio by discriminating among pollen tubes that have survived gametophytic selection and have reached the ovary (Figure 6.1, Model 2), with cij now describing the fraction of Y-bearing pollen tubes involved in ovule fertilization (recursions in Appendix 6.1). Here, sex-ratio control is assumed to occur immediately before fertilization with no other selective events following, and the seed sex ratio is thus given by ψ = cMM when the modifier allele, M, is fixed. The invasion of a new modifier is now determined by:

$$\lambda \approx 1 + \frac{(c_{Mm}-c_{MM})(1-2c_{MM})}{4c_{MM}(1-c_{MM})}, \tag{4}$$

where again we have assumed that the modifier is weak. Neither biased production of X and Y pollen (described by α) nor gametophytic selection (described by γ) affects the dynamics of rare sex-ratio modifiers, because these forces are neutralized when females can manipulate whether female- or male-determining pollen tubes are allowed to enter ovules. Instead, modifiers invade (cMm > cMM) whenever they increase the fraction of the rarer sex in the population. With small modifier effects, the system therefore evolves toward the ESS c* = 1/2 such that the sex ratio among seeds is again 1:1.
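The convergence toward c* = 1/2 can be illustrated with a simple adaptive walk. This is a hedged sketch that assumes equation (4) holds for each small step; the step size, number of rounds, and function names are hypothetical choices made for illustration.

```python
def invasion_eigenvalue_model2(c_Mm, c_MM):
    """Equation (4): leading eigenvalue for a weak late-acting modifier."""
    return 1 + (c_Mm - c_MM) * (1 - 2 * c_MM) / (4 * c_MM * (1 - c_MM))


def adaptive_walk(c, step=0.01, rounds=200):
    """Repeatedly replace the resident c by a nearby mutant that can invade.

    Each round tries a mutant slightly above and slightly below the resident
    value and adopts the first one whose eigenvalue exceeds one.
    """
    for _ in range(rounds):
        for mutant in (c + step, c - step):
            if 0 < mutant < 1 and invasion_eigenvalue_model2(mutant, c) > 1:
                c = mutant
                break
    return c


print(adaptive_walk(0.2), adaptive_walk(0.9))
```

Both starting points end within one step of 1/2, and neither α nor γ appears anywhere in the calculation, in line with the statement that these forces are neutralized under late-acting control.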


TABLE 6.1: Notation used in the models

Symbol   Definition
M        Modifier locus, with alleles M (resident) and m (rare)
A        Selected locus, with alleles A (wildtype) and a (deleterious)
cij      Maternal control of the sex ratio in a female carrying modifier alleles i and j
α        The frequency of Y-bearing pollen produced by males
γ        Selection against Y-bearing pollen at the haploid stage (after receipt on the stigma)
ψ        The frequency of male seeds
ψ*       The ESS frequency of male seeds
sk       Selection against the mutant a allele in diploid individuals of sex k
hk       Dominance of the mutant a allele in diploid individuals of sex k (fitness of heterozygotes being 1 − hk sk)
tk       Selection against the mutant a allele in haploids of sex k

Model 3: Seed production with modifier of gametophytic selection

The preceding models assume that females can detect X- or Y-bearing pollen tubes and manipulate their growth, but this may be mechanistically unrealistic. An alternative possibility is that females alter the strength of selection experienced by pollen tubes (Figure 6.1, Model 3). Females could, for example, modify the length, shape, or structure of the style in a manner that indirectly influences the intensity of pollen-tube competition (Lankinen and Skogsmyr 2001), or females could alter the amount or type of resource provisioning for growing pollen tubes, which could conceivably accentuate or mute fitness differences among the pollen. Regardless of the exact mechanism, we assume that there is genetic variation for the strength of gametophytic selection in females, with this strength given by cij for a female of modifier genotype ij. Specifically, Y-bearing pollen now have fitness 1 − γ cij relative to X-bearing pollen (recursions in Appendix 6.1). Because we define γ as the strength of selection when gametophytic selection is maximal (given any possible constraints), we consider cij values between 0 and 1. With modifier allele M fixed, the frequency of male seeds becomes:

$$\psi = \frac{\alpha(1-\gamma c_{MM})}{1-\alpha\gamma c_{MM}}. \tag{5}$$

A new modifier allele, m, that alters the strength of gametophytic selection can then spread when λ >1, where:

$$\lambda \approx 1 + \frac{(c_{Mm}-c_{MM})\,\gamma\,(2\alpha-1-\alpha\gamma c_{MM})}{4(1-\gamma c_{MM})(1-\alpha\gamma c_{MM})}. \tag{6}$$

Thus, when Y-bearing pollen is less fit (γ > 0), modifier alleles increasing the strength of gametophytic selection (cMm > cMM) invade if Y-bearing pollen is produced in excess, with α > 1/(2 − γ cMM), and otherwise weaker gametophytic selection evolves. With small modifier effects, the strength of gametophytic selection evolves toward the ESS:

$$c^{*} = \frac{2\alpha-1}{\alpha\gamma}. \tag{7}$$

Thus, when females receive equal proportions of X- and Y-bearing pollen (α = 1/ 2 ), they evolve to minimize selection among gametophytes ( c* = 0 ), keeping the sex ratio even. More generally, if we insert equation (7) into (5), the sex ratio among seeds at this ESS is ψ * = 1/ 2 , so that again the system evolves toward a 1:1 sex ratio among seeds. These calculations assume, however, that sex-ratio selection is the only factor influencing the evolution of gametophytic selection, an assumption relaxed in the next section.
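A short numerical check of Model 3 follows. It is a sketch under the assumption that equations (5) to (7) are as stated above; the function names and the values α = 0.6 and γ = 0.5 are illustrative only. Because cij is restricted to lie between 0 and 1, the interior ESS of equation (7) is attainable only when 0 ≤ (2α − 1)/(αγ) ≤ 1, which the example parameters satisfy.

```python
def male_seed_fraction_model3(c_MM, alpha, gamma):
    """Equation (5): frequency of male seeds when allele M is fixed."""
    return alpha * (1 - gamma * c_MM) / (1 - alpha * gamma * c_MM)


def invasion_eigenvalue_model3(c_Mm, c_MM, alpha, gamma):
    """Equation (6): eigenvalue for a weak modifier of gametophytic selection."""
    numerator = (c_Mm - c_MM) * gamma * (2 * alpha - 1 - alpha * gamma * c_MM)
    denominator = 4 * (1 - gamma * c_MM) * (1 - alpha * gamma * c_MM)
    return 1 + numerator / denominator


def ess_model3(alpha, gamma):
    """Equation (7): ESS strength of gametophytic selection (unconstrained)."""
    return (2 * alpha - 1) / (alpha * gamma)


alpha, gamma = 0.6, 0.5  # illustrative: excess Y-bearing pollen, less fit Y
c_star = ess_model3(alpha, gamma)
psi_star = male_seed_fraction_model3(c_star, alpha, gamma)
print(c_star, psi_star)
```

In this example females evolve gametophytic selection that is just strong enough to offset the excess of Y-bearing pollen, returning the seed sex ratio to 1:1.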

Incorporating deleterious mutations into the modifier model of gametophytic selection


The above results imply that females evolve to neutralize any process that perturbs the sex ratio among seeds from 1:1, a result consistent with Fisherian sex-ratio theory. This assumes, however, that there are no costs to doing so. In particular, altering the strength of gametophytic selection is likely to have major consequences for purging deleterious alleles from the genome, assuming that pollen with a high mutation load has low gametophytic fitness (Charlesworth and Charlesworth 1992). We thus seek to determine how sex-ratio selection and the benefits of purging together affect the evolution of modifiers controlling the strength of gametophytic selection. We assume that all selected loci are loosely linked, autosomal, and non-epistatic so that we can ignore genetic associations among selected loci and between each selected locus and the sex-determining region. In this case, the strength of indirect selection acting on a modifier of weak effect can be approximated as the sum of indirect selective forces arising in models with the modifier locus plus each other locus, considered in turn. Specifically, invasion of a rare modifier depends on:

$$\lambda_{net} = 1 + \sum_{l}(\lambda_{l}-1), \tag{8}$$

where λl − 1 measures the asymptotic strength of indirect selection acting on a rare modifier allele, m, due to interactions with locus l, once the system has approached the eigenvector associated with the leading eigenvalue (e.g., see Appendix in Otto and Bourguet 1999). In the previous section, we obtained λl, as given by equation (6), when locus l is the sex-determining gene. Here, we calculate λl for a selected locus, A, subject to recurrent deleterious mutations, with mutation occurring from allele A to a at rate µ.

From these calculations we obtain the net evolutionary force acting on a modifier, λnet, and use this to predict the sex ratio and mutation load when the strength of gametophytic selection has reached the ESS. Again, the modifier genotype of a female determines the strength of gametophytic selection, cij (see Table 6.1), with stronger selection reducing the frequency of the mutant allele among seeds (see recursions in Appendix 6.2). Assuming that selection coefficients acting on allele a are small (but large relative to the inverse of the population size) and that the mutation rate is even smaller, the equilibrium frequency of allele a averaged across the sexes at the lth selected locus is:

$$q_{l} = \frac{\mu}{(h^{♀}s^{♀}+h^{♂}s^{♂})/2 + (c_{MM}\,t^{♂})/2} \tag{9}$$

(to simplify the notation, we have suppressed the locus-specific subscript, l, on the selection coefficients, which are defined in Table 6.1). The difference in allele frequency between the sexes is of lower order and does not appreciably influence the spread of the modifier. Observe that equation (9) reduces to the classic mutation-selection balance, ql = µ/(hs), when gametophytic selection is absent (t♂ = 0) and selection is the same in both sexes (h♂s♂ = h♀s♀ = hs). Recurrent deleterious mutations thus reduce the mean fitness in diploids of sex k by an amount ≈ 2 hk sk ql (the ‘mutation load’). A new modifier allele, m, that alters the strength of selection among haploid pollen can then spread when λ > 1, where:

$$\lambda \approx 1 + \left\{\frac{(c_{Mm}-c_{MM})\,\mu\,t^{♂}\,(2h^{♀}s^{♀}+2h^{♂}s^{♂}+c_{MM}\,t^{♂})}{2(h^{♀}s^{♀}+h^{♂}s^{♂}+c_{MM}\,t^{♂})}\right\}. \tag{10}$$
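Equations (9) and (10) can be evaluated directly. The sketch below assumes the forms given above; the function and argument names (for example `h_f`, `s_f`, `t_m` for the female, female, and male-haploid coefficients) and the numerical values are hypothetical and chosen only for illustration.

```python
def equilibrium_mutant_frequency(mu, h_f, s_f, h_m, s_m, t_m, c_MM):
    """Equation (9): mutation-selection balance averaged across the sexes."""
    return mu / ((h_f * s_f + h_m * s_m) / 2 + c_MM * t_m / 2)


def per_locus_load(h_k, s_k, q):
    """Reduction in mean diploid fitness of sex k caused by one locus."""
    return 2 * h_k * s_k * q


def invasion_eigenvalue_purging(c_Mm, c_MM, mu, h_f, s_f, h_m, s_m, t_m):
    """Equation (10): indirect selection on the modifier from one locus."""
    H = h_f * s_f + h_m * s_m
    return 1 + (c_Mm - c_MM) * mu * t_m * (2 * H + c_MM * t_m) / (
        2 * (H + c_MM * t_m))


mu, h, s = 1e-5, 0.1, 0.2  # illustrative per-locus values
q_no_haploid = equilibrium_mutant_frequency(mu, h, s, h, s, t_m=0.0, c_MM=1.0)
q_haploid = equilibrium_mutant_frequency(mu, h, s, h, s, t_m=0.2, c_MM=1.0)
print(q_no_haploid, q_haploid, per_locus_load(h, s, q_haploid))
```

Without haploid selection the first value equals µ/(hs), and adding haploid selection lowers the equilibrium frequency and hence the per-locus load.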

As expected, a rare modifier experiences no indirect selection if there is no selection in the haploid phase ( t♂ = 0 ), or no genetic variation at the A locus ( µ = 0 ), or no effect of the modifier ( cMm = cMM ). Otherwise, when allele a is deleterious, modifier alleles invade if they increase the strength of gametophytic selection ( cMm > cMM ), thereby purging mutations more efficiently from the male gametes involved in fertilization. We now combine the indirect selection on a modifier that alters the strength of gametophytic selection arising from sex-ratio selection (equation 6), and from each of L loci at mutation-selection balance (equation 9). Overall, the leading eigenvalue describing the spread of a modifier is:


$$\lambda_{net} \approx 1 + (c_{Mm}-c_{MM})\left\{\frac{\gamma(2\alpha-1-\alpha\gamma c_{MM})}{4(1-\gamma c_{MM})(1-\alpha\gamma c_{MM})} + \frac{U\,t^{♂}\,(2h^{♀}s^{♀}+2h^{♂}s^{♂}+c_{MM}\,t^{♂})}{2(h^{♀}s^{♀}+h^{♂}s^{♂}+c_{MM}\,t^{♂})}\right\}, \tag{11}$$

where U = Lµ is the rate of deleterious mutations per haploid genome given L loci subject to selection in the haploid phase. Modifiers increasing the strength of gametophytic selection (cMm > cMM) spread when the term in braces is positive. The ESS level of gametophytic selection is thus obtained by setting this term to zero and solving for c* = cMM. As this is cubic in c*, the solution is not presented but is instead manipulated numerically. In Figure 6.2, we plot the ESS frequency of male seeds, ψ* (obtained by inserting c* into equation (5)), and the mutation load experienced by a diploid sporophyte (obtained by inserting c* into equation (9) and then ql into the load). To calculate the genome-wide mutation load, we assume that the fitness effects of each locus are similar and independent, so that they multiply together to give an overall diploid fitness of:

$$W^{k}(\mathrm{diploid}) = \prod_{l=1}^{L}\left(1-2h^{k}s^{k}q_{l}\right) \approx e^{-4h^{k}s^{k}U/(h^{♀}s^{♀}+h^{♂}s^{♂}+c_{MM}\,t^{♂})} \tag{12}$$

for individuals of sex k. When the mutation rate U is high, the advantages of strengthening gametophytic selection through purging can be much greater than the disadvantages arising from skewed sex ratios, especially when the relative fitness of Y-bearing pollen is high (γ near 0). The ESS value of c* predicted by equation (11) can then reach or even exceed its maximal value (recall that c* = 1 is the maximal strength of gametophytic selection that can evolve, given other constraints on floral structure). In such cases, we set c* = 1, assuming that these constraints are sufficiently strong to prevent higher levels of gametophytic selection from evolving (dashed curves in Figure 6.2).
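The numerical step described above can be sketched as follows. This is not the original Mathematica analysis; it is a minimal bisection that assumes γ < 1 and a single sign change of the braces term of equation (11) on the interval from 0 to 1, which holds for the parameter values tried here. The names (`braces_term`, `ess_strength`, `H`, `t_m`) are hypothetical, and the parameters h = 0.1, s = 0.2, α = 0.5 are those quoted in the Figure 6.2 caption.

```python
import math


def male_seed_fraction(c, alpha, gamma):
    """Equation (5)."""
    return alpha * (1 - gamma * c) / (1 - alpha * gamma * c)


def braces_term(c, alpha, gamma, U, H, t_m):
    """Term in braces in equation (11), with H = h_f*s_f + h_m*s_m."""
    sex_ratio = gamma * (2 * alpha - 1 - alpha * gamma * c) / (
        4 * (1 - gamma * c) * (1 - alpha * gamma * c))
    purging = U * t_m * (2 * H + c * t_m) / (2 * (H + c * t_m))
    return sex_ratio + purging


def ess_strength(alpha, gamma, U, H, t_m, tol=1e-12):
    """Bisection for c* on [0, 1]; assumes gamma < 1 and one sign change."""
    if braces_term(1.0, alpha, gamma, U, H, t_m) >= 0:
        return 1.0  # stronger selection favored throughout: constrained maximum
    if braces_term(0.0, alpha, gamma, U, H, t_m) <= 0:
        return 0.0  # weaker selection favored throughout
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if braces_term(mid, alpha, gamma, U, H, t_m) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def diploid_load(c, U, h_k, s_k, H, t_m):
    """One minus the mean diploid fitness of equation (12)."""
    return 1 - math.exp(-4 * h_k * s_k * U / (H + c * t_m))


h, s, alpha = 0.1, 0.2, 0.5  # values from the Figure 6.2 caption
H = 2 * h * s                # h_f*s_f + h_m*s_m with equal sexes
c_star = ess_strength(alpha, gamma=0.5, U=0.1, H=H, t_m=0.05)
psi_star = male_seed_fraction(c_star, alpha, 0.5)
load = diploid_load(c_star, 0.1, h, s, H, 0.05)
print(c_star, psi_star, load)
```

For this illustrative parameter set the ESS is interior, the seed sex ratio is female biased, and the load is lower than it would be with no gametophytic selection (c = 0).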


FIGURE 6.2. Evolutionarily stable sex ratios and mutation load. Panels A (U = 0.1) and B (U = 1) illustrate the evolutionarily stable sex ratio in the face of conflicting selection pressures, with Fisherian sex-ratio selection favoring no gametophytic selection and purging of deleterious mutations favoring the expansion of the gametophytic phase. Where the solid curves enter the shaded regions (at diamonds for t♂ = 0.01, circles for t♂ = 0.05, and squares for t♂ = 0.2), the ESS gametophytic selection becomes as strong as possible given other possible constraints (cij = 1), and the frequency of males is then constrained to the dashed curve. Panels C (U = 0.1) and D (U = 1) represent the mutation load in the sporophytic phase (one minus the mean fitness in diploids), assuming multiplicative selection and independent loci, with the dashed curves representing the load once gametophytic selection is maximized. Gametophytic selection can lead to substantial reductions in the mutation load, which would be 0.18 (panel C) and 0.86 (panel D) in the absence of gametophytic selection. On the other hand, as γ rises, the mutation load rises because sex-ratio selection becomes stronger and favors reduced gametophytic selection. Other parameters: h♂ = h♀ = 0.1, s♂ = s♀ = 0.2, α = 0.5.

The equilibrium sex ratios in Figure 6.2 reflect a balance between selective pressures favoring the removal of deleterious mutations through gametophytic selection and the countervailing pressures of Fisherian sex-ratio selection. As expected, when either U or t♂ is zero, the sex ratio at equilibrium evolves towards 1:1, and this occurs regardless of the value of α or γ. Increasing the genome-wide deleterious mutation rate (U) or the strength of gametophytic selection (t♂) causes an increase in the advantages of purging, leading to a more biased ESS sex ratio (Figure 6.2A, B) but a lower mutation load in the diploid phase compared to the load expected in the absence of gametophytic selection (Figure 6.2C, D). On the other hand, greater differences in the relative fitness of X- and Y-bearing sperm (γ) favor weaker gametophytic selection because of stronger Fisherian sex-ratio selection, which leads to a heavier burden of mutations among the diploid offspring (Figure 6.2C, D).

DISCUSSION

Considering haploid selection on the sex chromosomes, we find that when sex ratio adjustment is controlled by maternal genotype (Figure 6.1), sex ratios at the end of parental care in plants (i.e., among seeds) should ultimately evolve toward 1:1, as predicted from Fisherian sex ratio theory. Thus, the several proposed hypotheses for biased sex ratios in plant populations, such as the certation hypothesis or meiotic drive, represent proximate explanations and assume that the sex ratio either is not under maternal control or has not had time to reach an evolutionarily stable frequency. By contrast, when we also consider selection against deleterious mutations, we find that a biased sex ratio can be maintained at an evolutionarily stable equilibrium. This striking result arises because females are under conflicting evolutionary pressures: to reduce gametophytic selection within their styles to decrease the extent of sex-ratio bias in their offspring, but also to increase the intensity of gametophytic selection to purge deleterious mutations. The extent of sex-ratio bias at the ESS depends on the strength of selection among male- and female-determining gametophytes and the rate at which deleterious mutations occur (Figure 6.2). As expected, increasing the genome-wide mutation rate or the strength of selection resulted in a stronger sex-ratio bias at the ESS due to the increased advantages of purging. Indeed, over much of the parameter space that we explored, gametophytic selection was so favorable because of purging that it evolved to its maximal possible strength (cij = 1; dashed curve in Figure 6.2), despite the resulting skew in the offspring sex ratio (Figure 6.2, panels A and B). Below we discuss the implications of these findings for understanding observed patterns of mutation load and sex-ratio bias in dioecious plants and, more generally, for the evolution of life cycles with extensive gene expression in both haploid and diploid phases.


Mutation load and haploid selection in plants

Mutation load is known to have a large effect on fitness (Muller 1950; Crow 1970; Charlesworth and Charlesworth 1998; Agrawal and Whitlock 2012) and has consequently been included in evolutionary explanations for a variety of phenomena, including ploidy level (Otto and Goldstein 1992), recombination and sex (Keightley and Otto 2006), mating-system evolution (Lande and Schemske 1985; Barrett and Charlesworth 1991; Charlesworth and Charlesworth 1999), and sexual selection (Whitlock and Agrawal 2009). Our models demonstrate that the benefits of reducing genome-wide deleterious mutation load through haploid selection can also influence the evolution of sex ratios for organisms with extensive overlap in gene expression between haploid and diploid phases of the life cycle. Our models confirm that the effects of purging on the mutation load through haploid selection may be particularly important in plants, where widespread gene expression in the haploid stage has been demonstrated (e.g., up to 60% of expression overlap with the diploid stage, according to some studies; Mascarenhas 1990; Borg et al. 2009) and haploid selection appears to be widespread (e.g., Searcy and Mulcahy 1985; Sari-Gorla et al. 1989; Chibalina and Filatov 2011). In particular, our finding that gametophytic selection can evolve to be maximal in the presence of recurrent deleterious mutations, despite the fitness cost associated with biasing the sex ratio, suggests that purging may be an important factor contributing to the maintenance of the haploid phase in plants. It is thought that the diploid sporophytic phase in plants has expanded over evolutionary time because diploids, which carry two copies of every gene, are able to mask deleterious recessive mutations, giving them an advantage over haploids (Valero et al. 1992; Orr 1995).
However, an advantage to haploidy is that it enables purging of deleterious mutations (Otto and Marks 1996; reviewed in Mable and Otto 1998). To the extent that there is overlap in gene expression between haploid and diploid life cycle phases, the haploid phase may therefore act to screen against poorly functioning genomes, allowing only the most metabolically vigorous gametophytes to contribute genes to future generations (Mulcahy 1979). This is consistent with the finding that

gametophytic selection can increase progeny fitness (e.g. Marshall et al. 2007). Thus, while flowering plant life cycles are physically and temporally dominated by the diploid phase, viewed from the perspective of selection, the haploid phase is also of fundamental importance, potentially more so than the diploid phase at some fraction of the genome.

Implications for understanding observed patterns of sex-ratio bias

Our results suggest that the benefits of selection against deleterious mutations during the haploid phase can also contribute to the maintenance of sex-ratio bias in dioecious plants, at least among species with male heterogamety whose X- and Y-bearing pollen differ in fitness. This finding is particularly relevant for species in which an association between pollination intensity and the degree of sex-ratio bias has been established, as this suggests that gametophytic selection may be involved in causing the bias (Correns 1928; Conn and Blum 1981; Stehlik and Barrett 2006; Field et al. 2012b). Previous suggestions that gametophytic selection can account for observed sex-ratio biases have not considered, however, that such bias would generate strong sex-ratio selection in females to equalize the representation of X- and Y-bearing gametophytes during fertilization. Our results confirm that sex ratios will tend toward 1:1 in the absence of opposing forces acting to maintain selection in the haploid phase. However, with recurrent deleterious mutations, our analysis (Model 3) indicates that strong gametophytic selection can be maintained, preventing the population from evolving a 1:1 sex ratio. Indeed, for realistic genome-wide mutation rates and gametophytic selection coefficients (e.g., U = 0.1, t♂ = 0.2), our analysis illustrates that the trade-off between gametophytic and sex-ratio selection results in patterns of bias similar to those observed in dioecious plant populations (Barrett et al. 2010; Field et al. 2012a). A mechanism that may cause increased gametophytic selection and sex-ratio bias involves the suppression of recombination between sex-determining loci, which can lead to the accumulation of rearrangements, transposable elements, and deleterious mutations on Y chromosomes and hence to sex chromosome heteromorphism (Charlesworth et al. 2005).
There is evidence that this has occurred in dioecious plants to varying degrees (Charlesworth 2012), and a recent comparative analysis reports an association between the possession of heteromorphic sex chromosomes and female-biased sex ratios in

angiosperm species (Field et al. 2012a, and see Lloyd 1974). Indeed, to the extent that Y chromosome degeneration reduces the fitness of Y-bearing pollen relative to X-bearing pollen, then our model of gametophytic selection against deleterious mutations predicts this pattern. Testing the predictions of our model quantitatively should become increasingly possible as genomic studies provide markers to distinguish X- and Y-bearing pollen and improve our understanding of sex chromosome evolution in plants (Bergero and Charlesworth 2011; Chibalina and Filatov 2011). In many species, however, sex ratios are male biased. Indeed, in the survey by Field et al. (2012), 63% of the cases with sex ratios significantly different from 1:1 exhibited male-biased sex ratios. In some cases, male-biased sex ratios could result from Y-bearing pollen being positively selected in the gametophytic phase. This is expected early in the evolution of dioecy, before degeneration, because genes that are favorable in the pollen of males, but disadvantageous to females or at other stages, can preferentially accumulate on the Y chromosome. That is, with sexually antagonistic and/or ploidally-antagonistic selection, genes linked to the sex-determining region on the Y experience proportionately more selection in the male gametophytic stage and can thus accumulate alleles enhancing pollen fitness (Immler et al. 2011). Once sex-linked markers become readily available in plants, future empirical studies comparing the growth rates of pollen bearing different sex chromosomes could confirm whether species with male-biased sex ratios have high Y-bearing pollen fitness. Several caveats should be considered when comparing our models to empirical data on plant sex ratios. First, our models only consider the sex ratios of seeds, and very few studies have estimated these in natural populations (but see Taylor 1999; Stehlik and Barrett 2005).
Instead, the vast majority of empirical work has focused on the sex ratios of reproductively mature plants, which are considerably easier to measure (Field et al. 2012). Second, we have assumed that a large pool of pollen is available to each female, and we have not taken into account stochasticity in pollen dispersal and the consequences of pollen limitation for the parameters in our model. Finally, our models do not consider the possibility that males are under countervailing selection pressures to mask the deleterious mutations in the pollen they produce. Such masking occurs in animals, where gene expression in the sperm largely reflects the paternal diploid genome, with both

homologous chromosomes contributing gene products to the haploid sperm (Joseph and Kirkpatrick 2004). It may, however, be that continuous protein synthesis for pollen tube growth during the haploid phase constrains the extent to which male plants can evolve mechanisms to mask deleterious mutations in their haploid gametes. While many questions remain to be addressed, our study has demonstrated that incorporating selection against deleterious mutations in the haploid gametophyte phase provides a plausible evolutionary explanation for biased sex ratios in dioecious plants when X- and Y-bearing pollen differ in fitness and deleterious mutations are widespread. Future empirical studies aimed at estimating the strength and direction of gametophytic selection on sex chromosomes and the ways in which females might manipulate this selection would help strengthen our understanding of the proximate mechanisms causing sex-ratio bias and, perhaps more importantly, the ultimate causes of this variation.

ACKNOWLEDGEMENTS

We thank Deborah Charlesworth and two anonymous reviewers for comments on the manuscript. Discovery Grants to SCHB and SPO from the Natural Sciences and Engineering Research Council of Canada and a grant from the Swedish Research Council to SI provided financial support for this research.

APPENDIX 6.1

DERIVATION OF RECURSION EQUATIONS FOR SEX RATIO MODIFIER MODELS

The following recursions describe the per generation change in the frequency of each modifier genotype i (i being MM, Mm, or mm) among female plants, Fi , and the frequency of each modifier allele j (j being M or m) among X and Y bearing pollen grains deposited on stigmas (pXj and pYj, respectively). We assume throughout that ample pollen is received on each stigma and ignore pollen limitation and stochastic sampling. We first derive the frequency of seeds that inherited haplotype Xk from the ovule and Xj or Yj from the pollen (freq(XkXj) or freq(XkYj), respectively), where k and j represent the allele at the modifier locus (M or m). To account for transmission from a maternal parent of diploid genotype i at the modifier locus to an ovule of genotype k, we define Ti→k , where, for example, TMM→M = 1, TMm→M = 0.5 , and Tmm→M = 0 .

Model 1: Seed production with early-acting sex-ratio modifier

Females adjust the pollen received so that a fraction ci are Y-bearing. Gametophytic selection then occurs, followed by syngamy (Figure 6.1). Immediately after fertilization, the frequency of each seed genotype is given by:

$$\mathrm{freq}(X_kX_j) = \sum_{i\in\{MM,Mm,mm\}} F_i\,(1-c_i)\left(\frac{(1-\alpha)\,p_{Xj}}{(1-\alpha)(p_{XM}+p_{Xm})}\right)\frac{1}{N_i}\,T_{i\to k}$$

$$\mathrm{freq}(X_kY_j) = \sum_{i\in\{MM,Mm,mm\}} F_i\,c_i\left(\frac{\alpha\,p_{Yj}}{\alpha(p_{YM}+p_{Ym})}\right)\frac{(1-\gamma)}{N_i}\,T_{i\to k}$$

where

$$N_i = (1-c_i)(p_{XM}+p_{Xm}) + c_i(1-\gamma)(p_{YM}+p_{Ym})$$

is a normalization factor ensuring that the frequencies of pollen available for fertilization (after gametophytic selection) sum to one. Observe that α cancels out because females choose among pollen bearing a particular sex chromosome.
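The Model 1 seed frequencies can be written out as a small function. This is a sketch under the assumption that pXj and pYj each sum to one across the two modifier alleles; the dictionary layout and the name `seeds_model1` are hypothetical. The factor α is left out because it cancels from both ratios.

```python
# T[i][k]: probability that a mother of genotype i passes allele k to an ovule.
T = {"MM": {"M": 1.0, "m": 0.0},
     "Mm": {"M": 0.5, "m": 0.5},
     "mm": {"M": 0.0, "m": 1.0}}


def seeds_model1(F, c, pX, pY, gamma):
    """Seed frequencies for Model 1, keyed by (ovule allele k, pollen allele j).

    F and c are keyed by maternal genotype; pX and pY by modifier allele.
    Returns two dicts: female (XX) seeds and male (XY) seeds.
    """
    sX = pX["M"] + pX["m"]
    sY = pY["M"] + pY["m"]
    xx, xy = {}, {}
    for k in "Mm":
        for j in "Mm":
            xx[k, j] = xy[k, j] = 0.0
            for i in F:
                # Normalization after female choice and gametophytic selection.
                N = (1 - c[i]) * sX + c[i] * (1 - gamma) * sY
                xx[k, j] += F[i] * (1 - c[i]) * (pX[j] / sX) * T[i][k] / N
                xy[k, j] += F[i] * c[i] * (pY[j] / sY) * (1 - gamma) * T[i][k] / N
    return xx, xy


# With allele M fixed, the male fraction should match equation (1).
F = {"MM": 1.0, "Mm": 0.0, "mm": 0.0}
c = {"MM": 0.4, "Mm": 0.45, "mm": 0.5}
fixed = {"M": 1.0, "m": 0.0}
xx, xy = seeds_model1(F, c, fixed, fixed, gamma=0.3)
psi = sum(xy.values())
print(psi)
```

The printed male fraction agrees with (1 − γ)cMM/(1 − γcMM) for the chosen values, and the female and male seed frequencies together sum to one.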

Model 2: Seed production with late-acting sex-ratio modifier

Females adjust the pollen received after gametophytic selection, choosing among the Y-bearing sperm that survive. Immediately after fertilization, the frequency of each seed genotype is:

$$\mathrm{freq}(X_k X_j) = \sum_{i \in \{MM,Mm,mm\}} F_i\,(1-c_i)\,\frac{(1-\alpha)\,p_{Xj}}{N_{Xi}}\,T_{i \to k}$$

$$\mathrm{freq}(X_k Y_j) = \sum_{i \in \{MM,Mm,mm\}} F_i\,c_i\,\frac{\alpha(1-\gamma)\,p_{Yj}}{N_{Yi}}\,T_{i \to k}$$

where

$$N_{Xi} = (1-\alpha)(p_{XM}+p_{Xm})$$

$$N_{Yi} = \alpha(1-\gamma)(p_{YM}+p_{Ym})$$

are normalization constants that ensure that the frequency of pollen bearing an X or Y, respectively, each sum to one after gametophytic selection. Observe that α and γ both cancel out upon normalization, because females choose amongst the pollen bearing a particular sex chromosome after gametophytic selection.
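Writing out the cancellation may make the Model 2 result easier to follow. The worked step below uses only the symbols defined above and substitutes the two normalization constants into the recursions; it is offered as an illustration rather than as part of the original derivation.

```latex
\begin{align*}
\mathrm{freq}(X_k X_j)
  &= \sum_{i} F_i\,(1-c_i)\,\frac{(1-\alpha)\,p_{Xj}}{(1-\alpha)(p_{XM}+p_{Xm})}\,T_{i\to k}
   = \sum_{i} F_i\,(1-c_i)\,\frac{p_{Xj}}{p_{XM}+p_{Xm}}\,T_{i\to k},\\
\mathrm{freq}(X_k Y_j)
  &= \sum_{i} F_i\,c_i\,\frac{\alpha(1-\gamma)\,p_{Yj}}{\alpha(1-\gamma)(p_{YM}+p_{Ym})}\,T_{i\to k}
   = \sum_{i} F_i\,c_i\,\frac{p_{Yj}}{p_{YM}+p_{Ym}}\,T_{i\to k}.
\end{align*}
```

Summing the second line over k and j (with the transmission probabilities summing to one over k) gives a frequency of male seeds equal to the sum of F_i c_i over maternal genotypes, so under this model the seed sex ratio depends on the modifier alone and not on α or γ.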

Model 3: Seed production with modifier of gametophytic selection

In this model, females alter the strength of gametophytic selection but do not directly choose the type of pollen grain used for fertilization. Immediately after fertilization, the frequency of each seed genotype is given by:


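$$\mathrm{freq}(X_k X_j) = \sum_{i \in \{MM,Mm,mm\}} F_i\,\frac{(1-\alpha)\,p_{Xj}}{N_i}\,T_{i \to k}$$

$$\mathrm{freq}(X_k Y_j) = \sum_{i \in \{MM,Mm,mm\}} F_i\,\frac{\alpha(1-\gamma c_i)\,p_{Yj}}{N_i}\,T_{i \to k}$$

where

$$N_i = (1-\alpha)(p_{XM}+p_{Xm}) + \alpha(1-\gamma c_i)(p_{YM}+p_{Ym})$$

normalizes the frequencies of pollen surviving gametophytic selection.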
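To see how the modifier shifts the seed sex ratio in Model 3, the XY recursion can be summed over k and j for a single maternal genotype. The sketch below does this; the function name `model3_male_fraction` and the default of 1.0 for the summed pollen frequencies are assumptions made for illustration.

```python
def model3_male_fraction(c_i, alpha, gamma, sum_x=1.0, sum_y=1.0):
    """Fraction of male (XY) seeds from a mother with modifier value c_i under Model 3.

    sum_x = pXM + pXm and sum_y = pYM + pYm (both assumed to equal one by default).
    """
    x_weight = (1 - alpha) * sum_x                   # X-bearing pollen after selection
    y_weight = alpha * (1 - gamma * c_i) * sum_y     # Y-bearing pollen after selection
    N_i = x_weight + y_weight                        # normalization factor
    return y_weight / N_i


# Hypothetical values: equal X and Y pollen (alpha = 0.5), gamma = 0.2
fractions = [model3_male_fraction(c_i, alpha=0.5, gamma=0.2) for c_i in (0.0, 0.5, 1.0)]
```

With these illustrative values the male fraction starts at α when c_i = 0 and falls as c_i increases, because a larger c_i strengthens selection against Y-bearing pollen.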

All Models: Sporophyte and pollen production

Assuming that the modifier does not directly affect survival, the frequency of each genotype among the adult females in the next generation becomes:

$$F'_{MM} = \frac{\mathrm{freq}(X_M X_M)}{1-\psi}$$

$$F'_{Mm} = \frac{\mathrm{freq}(X_M X_m) + \mathrm{freq}(X_m X_M)}{1-\psi}$$

$$F'_{mm} = \frac{\mathrm{freq}(X_m X_m)}{1-\psi}$$

where 1−ψ equals the frequency of females among the seeds. To determine the frequency of the pollen haplotypes produced by fathers, we must account for recombination between the modifier and the hemizygous sex-determining locus:

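$$p'_{XM} = \frac{\mathrm{freq}(X_M Y_M) + \mathrm{freq}(X_M Y_m)(1-r) + \mathrm{freq}(X_m Y_M)\,r}{\psi}$$

$$p'_{Xm} = \frac{\mathrm{freq}(X_m Y_m) + \mathrm{freq}(X_M Y_m)\,r + \mathrm{freq}(X_m Y_M)(1-r)}{\psi}$$

$$p'_{YM} = \frac{\mathrm{freq}(X_M Y_M) + \mathrm{freq}(X_M Y_m)\,r + \mathrm{freq}(X_m Y_M)(1-r)}{\psi}$$

$$p'_{Ym} = \frac{\mathrm{freq}(X_m Y_m) + \mathrm{freq}(X_M Y_m)(1-r) + \mathrm{freq}(X_m Y_M)\,r}{\psi}$$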

where ψ equals the frequency of males among the seeds. Different survival rates for female and male sporophytes have not been explicitly included, but they would not affect the dynamics because each female or male seed would be multiplied by a sex-specific fitness, which would then cancel out when dividing by the total female frequency or the total male frequency after sporophytic selection.
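The sporophyte and pollen step shared by all three models can be sketched as a single update function. This is an illustrative implementation under stated assumptions: the name `next_generation` and the seed frequencies are hypothetical, each key (k, j) pairs the ovule allele k with the pollen allele j, and the second term of the recursion for X-bearing M pollen is taken to come from XMYm fathers.

```python
def next_generation(freq_xx, freq_xy, r):
    """One step of the sporophyte and pollen recursions.

    freq_xx[(k, j)], freq_xy[(k, j)]: seed frequencies (ovule allele k, pollen allele j),
    assumed to sum to one across both dicts.
    r: recombination rate between the modifier and the sex-determining locus.
    """
    psi = sum(freq_xy.values())      # frequency of males among the seeds
    females = 1.0 - psi              # frequency of females among the seeds
    F_next = {
        "MM": freq_xx[("M", "M")] / females,
        "Mm": (freq_xx[("M", "m")] + freq_xx[("m", "M")]) / females,
        "mm": freq_xx[("m", "m")] / females,
    }
    # Father genotypes: first allele is on the X, second on the Y
    MM, Mm = freq_xy[("M", "M")], freq_xy[("M", "m")]
    mM, mm = freq_xy[("m", "M")], freq_xy[("m", "m")]
    pX_next = {
        "M": (MM + Mm * (1 - r) + mM * r) / psi,
        "m": (mm + Mm * r + mM * (1 - r)) / psi,
    }
    pY_next = {
        "M": (MM + Mm * r + mM * (1 - r)) / psi,
        "m": (mm + Mm * (1 - r) + mM * r) / psi,
    }
    return F_next, pX_next, pY_next


# Hypothetical seed frequencies, chosen only for illustration
freq_xx = {("M", "M"): 0.2, ("M", "m"): 0.1, ("m", "M"): 0.1, ("m", "m"): 0.1}
freq_xy = {("M", "M"): 0.2, ("M", "m"): 0.1, ("m", "M"): 0.05, ("m", "m"): 0.15}
F_next, pX_next, pY_next = next_generation(freq_xx, freq_xy, r=0.1)
```

Each returned set of frequencies sums to one, which is a quick consistency check on the normalization by ψ and 1−ψ.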

APPENDIX 6.2

DERIVATION OF RECURSION EQUATIONS FOR GAMETOPHYTIC SELECTION MODIFIER MODEL

The following recursions describe the change across a generation in the frequency of a modifier of the strength of gametophytic selection (like Model 3 above), but where the second locus is not the sex-determining region but a locus A at mutation-selection balance.

Model 3: Incorporating deleterious mutations in a modifier model of gametophytic selection

In this model, we keep track of ovule and pollen haplotypes, oij and pkl, respectively, where i and k denote the modifier allele while j and l denote the selected allele. Because the loci are now assumed autosomal, we do not separately track X- and Y-bearing pollen. In the recursions, we keep track of the maternal genotype for each ovule because her genotype determines the strength of gametophytic selection experienced by the pollen. We do so by specifying the “type” of ovule, which denotes whether the ovule is carried by a homozygous (type = hom) or heterozygous (type = het) mother at the modifier locus.

For example, oMA,hom represents the frequency of ovules with haplotype MA that are carried by homozygous mothers (which must be MM). Females alter the strength of gametophytic selection but do not directly choose the type of pollen grain used for fertilization. After fertilization and meiosis, the frequency of each haplotype gh (g being M or m, h being A or a) among the ovules ($\mathrm{freq}(gh)_{\text{type}} = o_{gh,\text{type}}$ and sex = f) or among the pollen grains ($\mathrm{freq}(gh)_{\text{hom or het}} = p_{gh}$ and sex = m) is given by:

$$\mathrm{freq}(gh)_{\text{type}} = \sum_{\substack{i,k \in \{M,m\} \\ j,l \in \{A,a\}}} \left( o_{ij,\text{hom}}\,\frac{(1-\delta_l\, t^{\text{♂}} c_{ii})\,p_{kl}}{N_{ii}} + o_{ij,\text{het}}\,\frac{(1-\delta_l\, t^{\text{♂}} c_{Mm})\,p_{kl}}{N_{Mm}} \right) \frac{W^{\text{sex}}_{ij,kl}}{\bar{W}^{\text{sex}}}\; T_{ij,kl \to gh,\text{type}}$$

where

$$N_{xy} = p_{MA} + p_{mA} + (1 - t^{\text{♂}} c_{xy})(p_{Ma} + p_{ma})$$

normalizes the pollen pool after gametophytic selection (xy is MM or mm in “hom” mothers when i = M or m, respectively, but always Mm in “het” mothers), δl is one if the pollen carries the a allele and zero otherwise, $W^{\text{sex}}_{ij,kl}$ represents the sporophytic fitness of the resulting diploid of a particular sex, and $\bar{W}^{\text{sex}}$ represents the mean diploid fitness of that sex. We assume that sex is determined elsewhere in the genome and track the frequencies of gametes within each sex.

The transmission coefficient, Tij,kl→gh,type , now specifies the probability that a sporophyte produced from an ovule of haplotype ij and a pollen grain of haplotype kl produces a gamete of genotype gh, as well as the probability that the sporophyte was of the correct type (i.e., if type = hom, then i must equal k or else the transmission probability is zero). In addition to recombination at rate R between the M and A locus, the transmission coefficient also accounts for mutation at the A locus, with A mutating to a at rate µ (back mutation is assumed rare and is ignored). For example, TMA,ma→Ma,hom is zero (the maternal sporophyte is Mm and not homozygous), while TMA,ma→Ma,het is (1− µ)(R / 2) + µ / 2 .
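The transmission coefficient can be made concrete with a short function. This is a sketch under assumptions: the name `transmission` and its argument layout are hypothetical, haplotypes are written as two-character strings (modifier allele, then selected allele), and mutation is applied after recombination.

```python
def transmission(parent_haps, gamete, ovule_type, R, mu):
    """Probability that a sporophyte formed from haplotypes (ij, kl) yields gamete gh
    and is of the stated type ("hom" or "het" at the modifier locus)."""
    (i, j), (k, l) = parent_haps
    # A sporophyte of the wrong type contributes nothing
    if (ovule_type == "hom") != (i == k):
        return 0.0
    # Haplotype frequencies among gametes before mutation:
    # two parental classes and two recombinant classes
    pre_mutation = {}
    for hap, prob in ((i + j, (1 - R) / 2), (k + l, (1 - R) / 2),
                      (i + l, R / 2), (k + j, R / 2)):
        pre_mutation[hap] = pre_mutation.get(hap, 0.0) + prob
    total = 0.0
    for hap, prob in pre_mutation.items():
        if hap[0] != gamete[0]:
            continue  # modifier allele does not match
        if hap[1] == gamete[1]:
            # A stays A with probability 1 - mu; a always stays a (no back mutation)
            total += prob * ((1 - mu) if hap[1] == "A" else 1.0)
        elif hap[1] == "A" and gamete[1] == "a":
            total += prob * mu  # A mutates to a
    return total


# Hypothetical rates, chosen only for illustration
R, mu = 0.1, 0.001
t_het = transmission(("MA", "ma"), "Ma", "het", R, mu)
t_hom = transmission(("MA", "ma"), "Ma", "hom", R, mu)
```

For the example in the text, `t_het` equals (1 − µ)(R / 2) + µ / 2 and `t_hom` is zero; the four possible gametes from the heterozygous sporophyte also sum to one.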

CHAPTER 7

SEXUAL DIMORPHISM IN FLOWERING PLANTS

This chapter resulted from collaboration with Spencer C.H. Barrett. It is published in Journal of Experimental Botany 2013, 64:67–82.

SUMMARY

Among dioecious flowering plants, females and males often differ in a range of morphological, physiological and life-history traits. This is referred to as sexual dimorphism, and understanding why it occurs is a central question in evolutionary biology. Our review documents a range of sexually dimorphic traits in angiosperm species, discusses their ecological consequences, and details the genetic and evolutionary processes that drive divergence between female and male phenotypes. We consider why sexual dimorphism in plants is generally less well developed than in many animal groups, and also the importance of sexual and natural selection in contributing to differences between the sexes. Many sexually dimorphic characters, including both vegetative and flowering traits, are associated with differences in the costs of reproduction, which are usually greater in females, particularly in longer-lived species. These differences can influence the frequency and distribution of females and males across resource gradients and within heterogeneous environments, causing niche differences and the spatial segregation of the sexes. The interplay between sex-specific adaptation and the breakdown of between-sex genetic correlations allows for the independent evolution of female and male traits, and this is influenced in some species by the presence of sex chromosomes. We conclude by providing suggestions for future work on sexual dimorphism in plants, including investigations of the ecological and genetic basis of intra-specific variation, and genetic mapping and expression studies aimed at understanding the genetic architecture of sexually dimorphic trait variation.

INTRODUCTION

The majority (~90%) of flowering plants exhibit hermaphroditic sex expression, with individuals functioning as both female and male parents. More rarely, populations are reproductively subdivided into two sexes (females and males), a condition known as dioecy. Although the incidence of dioecy is relatively uncommon (~6-7%; Renner and Ricklefs 1995), it is reported from close to half of all angiosperm families (Heilbuth 2000) and may have originated on at least 100 occasions from hermaphroditic ancestors (Charlesworth 2002). There has been considerable interest since Darwin’s (1877) early work on understanding why dioecy occurs in plants, and the selective mechanisms responsible for its origin and maintenance (reviewed in Charlesworth 1999). The origin of separate sexes is commonly associated with the evolution of sexual dimorphism and this has occurred to varying degrees in many dioecious plants (Correns 1928; Lloyd and Webb 1977; Geber et al. 1999). The goal of this article is to review the nature of sexual dimorphism in angiosperms, discuss how and why it arises, and consider its ecological and evolutionary consequences.

Sexual dimorphism describes differences between the sexes in primary and secondary sex characters. The former relate directly to male (androecium) and female (gynoecium) sexual organs, and the latter to differences between the sexes in structures other than sex organs themselves, including any aspect of morphology or physiology. The term gender dimorphism is sometimes used in the plant literature synonymously with sexual dimorphism but we restrict its usage here to refer to populations in which there are distinct genders (females, males, or hermaphrodites) that differ in their relative contribution to fitness as pollen or seed parents. This perspective follows the concept of gender developed by David Lloyd, in which the functional gender of a plant refers to the relative contribution to fitness an individual makes from maternal versus paternal investment (Lloyd 1979; Lloyd and Bawa 1984). For a more extended discussion of these terms and their usage see Sakai and Weller (1999).
The important issue for the purpose of this review is that once dioecy evolves from gender monomorphism, the sexual morphs have different roles and are usually selected to diverge in their characteristics, resulting in sexual dimorphism. Darwin (1871) described many striking examples in which females and males of animal species differ dramatically in morphology, coloration, size and behavior. He proposed that sexual selection resulting from variation among individuals in mating success could explain the evolution of sexual dimorphism and distinguished two fundamentally different types: intrasexual competition among individuals of one sex for mates, and intersexual selection or ‘mate choice’ resulting from the preferences of one sex for traits of the other. In most instances the former involves males and the latter females. This difference was later explained by Bateman’s principle


(Bateman 1948), which states that male reproductive success will be most often limited by the availability of mating partners, whereas female reproductive success is more likely to be limited by the availability of resources. This should result in greater variance in male than female mating success. A vast literature now exists on the concepts and measurement of sexual selection (reviewed in Andersson 1994; Shuster and Wade 2003) and this topic is a dynamic area in modern evolutionary and behavioral ecology. Darwin (1871) largely neglected the possibility that sexual selection might also operate in plants. This may have been because of their non-sentient habit, primarily hermaphroditic sexual condition, and less conspicuous sexual dimorphism in dioecious species. Now it is generally appreciated, although not without some controversy, that the concepts of sexual selection and Bateman’s principle can be applied to flowering plants, regardless of their particular sexual system (reviewed in Charlesworth et al. 1987; Arnold 1994; Wilson et al. 1994; Delph and Ashman 2006; Moore and Pannell 2011). This advance has helped to explain various facets of pollination and mating biology, particularly the function of reproductive traits (Willson 1979; Willson and Burley 1983; Bell 1985; Queller 1987; Harder and Barrett 1996; Moore and Pannell 2011). An important question that arises is the extent to which sexual selection rather than other forms of natural selection (e.g. viability and fecundity selection) can explain the patterns of sexual dimorphism in dioecious species. The different reproductive requirements of females and males cause sex-specific selection pressures on traits that influence viability and fertility, and both natural and sexual selection could influence the adaptive evolution of such traits. We address these issues below. 
Despite growing interest in sexual dimorphism in flowering plants, there are still many unresolved issues concerning the genetic architecture and evolution of trait differences between the sexes. A fundamental question that arises when considering the genetic basis of sexually dimorphic traits is the extent to which shared genetic control of traits constrains the evolution of sexual dimorphism. Understanding how different genetic architectures influence evolutionary trajectories of phenotypic change in the sexes is important for inferring how dimorphism evolves and is maintained, and also for understanding adaptive evolution more generally (Lande 1980). Recent interest in sexual dimorphism has focused on the constraints and conflicts that arise when the evolutionary interests of females and males diverge. An important result from these studies is that the independent evolution of the sexes is often constrained by high intersexual genetic correlations (Poissant et al. 2010). Most of this work has been conducted on animals (e.g. Bonduriansky and Rowe 2005; Chenoweth et al. 2010), although several studies have explored the genetics of sexual dimorphism in plant species (reviewed in Meagher 1999, and more recently Ashman 2003; McDaniel 2005; Steven et al. 2007; Delph et al. 2010, 2011). We review these studies and consider their implications for understanding how and why sexual dimorphism evolves.

Theory indicates that the divergence of traits in females or males is facilitated by the presence of sex chromosomes, as these are the only genomic regions that differ between the sexes (Rice 1984; Mank 2009). Because genes on sex chromosomes spend different amounts of evolutionary time in females and males, they are expected to obtain fitness benefits disproportionately through one sex or the other. Indeed, there is evidence that genes involved in sex-specific adaptation have non-random genomic distributions and are located on the sex chromosomes (Gibson et al. 2002; Zhou and Bachtrog 2012). Moreover, because of the selective benefit of linkage between genes involved in sex determination and those involved in sex-specific functions, it is expected that genes with sexually antagonistic effects (i.e. beneficial in one sex but deleterious in the other) should be overrepresented on the sex chromosomes (Rice 1984; Charlesworth 2005). We consider the evidence for this and review recent progress in understanding the relations between sex chromosomes and the evolution of sexual dimorphism.

We begin by documenting traits that distinguish female and male plants in the context of life history, including vegetative and reproductive characters and patterns of resource allocation. We highlight contrasts between life histories since they provide insight into how differences in the timing and costs of reproduction influence other aspects of dimorphism.
We next consider the ecology of sexual dimorphism and the extent to which differences between females and males influence their distribution and frequency across environmental gradients. A particular focus of this section involves evaluating evidence for niche partitioning of the sexes and whether they respond differently to environmental stress. We then review what is known about the genetic architecture of sexual dimorphism and how this might influence divergence of traits in females and males. Insights from the genetics of sexual dimorphism are used to address the question of why plants generally exhibit less exaggerated dimorphism compared to animals. We also evaluate the role that sex chromosomes play in the evolution of sexual dimorphism and consider whether they have a disproportionate number of genes involved in sex-specific functions. We conclude by identifying several topics that would benefit from future study.

TRAITS DISTINGUISHING THE SEXES IN DIOECIOUS POPULATIONS

In contrast to many animal groups, the sex of an individual cannot usually be determined in plants before flowering (but see García and Antor 1995) without sex-specific genetic markers (e.g. Eppley et al. 1998; Stehlik and Barrett 2005; Shelton 2010). As a result, more information is available for differences between the sexes in reproductive features than for vegetative traits. Nevertheless, the sexes can differ prior to reproduction in a range of characters, although these differences are rarely sufficiently obvious for females and males to be reliably distinguished solely on the basis of these traits. There are few reports of differences between the sexes at the seed or seedling stage. In Rumex nivalis, male seeds are heavier and germinate earlier than female seeds, but overall levels of germination do not differ between the sexes (Stehlik and Barrett 2005). Male seeds are also heavier than female seeds in Spinacia oleracea (Freeman et al. 1994). Sexual dimorphism in dormancy and survivorship occurs in Silene latifolia (Purrington and Schmitt 1995), and environment-dependent differences between the sexes have been reported in seed germination in Distichlis spicata (Eppley 2001). In contrast, there are numerous reports of differences in the size, morphology (e.g. leaf shape, stem characteristics), growth rate, and physiology of the sexes that are manifested during the vegetative phase of growth (reviewed in Lloyd and Webb 1977; Dawson and Geber 1999). Sexual dimorphism in these traits is associated with contrasting strategies of the sexes, particularly in growth and reproductive expenditure.

Vegetative traits

In long-lived species males often exceed females in vigour, shoot size, and in their capacity for clonal propagation, although exceptions do occur (e.g. Populus tremuloides, Sakai and Burris 1985). Repeated bouts of maternal investment in fruits and seeds can lead to higher rates of mortality in females (e.g. Allen and Antos 1993) and may also exacerbate death by herbivory and disease (e.g. Ward 2007). Reproductive costs result in physiological trade-offs in resource distributions and these can influence future vegetative growth and reproduction. Females are expected to show stronger trade-offs with other life-history traits because of their typically higher investment in reproduction (Delph and Meagher 1995; Table 1 in Delph 1999), although this is not necessarily always expressed through higher somatic costs, because various compensatory mechanisms can offset differences between the sexes in the costs of reproduction (see Table 2 in Delph 1999). Moreover, in some wind-pollinated plants male reproductive costs may match or exceed those of females because of the high investment in nitrogen-rich pollen (Delph et al. 1993; Harris and Pannell 2008).

Determining the appropriate resource currencies is a major challenge for evaluating reproductive expenditure in dioecious plants. A recent study by Van Drunen and Dorken (2012) of the clonal aquatic Sagittaria latifolia detected a 1:1 trade-off between biomass investment in female function and clonal reproduction (ramet and corm production). In contrast, male investment had no apparent effect on the production of ramets and corms. Instead, the nitrogen content of corms produced by males was considerably lower than that of corms produced by females, indicating that the type of trade-off between the two reproductive modes differs between the sexes. In females the trade-off thus involves the quantity of clonal propagules produced, whereas in males it appears to involve their quality. This study is informative because it highlights the fact that life-history trade-offs can involve different resource currencies in females and males (and see Sánchez-Vilas and Pannell 2011), and also because it demonstrates that resource-based trade-offs are manifested not only at the ramet level, at which most studies of trade-offs have been performed in clonal plants, but also at the genet level, which is more relevant to fitness.

FIGURE 7.1: Sexual dimorphism in Leucadendron (Proteaceae): A. Leucadendron rubrum, a wind-pollinated species with striking sexual dimorphism in vegetative and reproductive traits. The female is on the left and the male on the right. B. Leucadendron xanthocomus, an insect-pollinated species in which males (right) exhibit much larger floral displays than females (left) and this can lead to viability selection against males with the largest displays.


A particularly striking example of how the costs of reproduction influence sexual dimorphism involves differences between the sexes in plant architecture in Leucadendron (Figure 7.1). This genus of fire adapted shrubs endemic to the fynbos of the Cape region of South Africa exhibits variation in the degree of serotiny (cones that release their seeds after fire), and also in sexual dimorphism for leaf and branching traits, with males typically possessing more branches and smaller and more abundant leaves than females (Midgley 2010). Harris and Pannell (2010) conducted a comparative analysis of 49 species and found that the degree of serotiny was strongly associated with the degree of sexual dimorphism: females in species with well developed serotiny were less highly branched (showed less ramification) than males (Figure 7.2). These findings suggest that the reproductive burden of maintaining cones over years in females involves a significant physiological cost and that this in turn influences patterns of growth in females in ways not experienced by males. This reproductive cost in females has also been invoked to account for the occurrence of Rensch’s rule, the evolutionary allometry of size dimorphism (Rensch 1960), in three lineages of New Zealand plants in which sexual size dimorphism decreases with body size when females are the larger sex, but increases when males are larger (Kavanagh et al. 2011 and see Obeso 2002).

FIGURE 7.2: The relation between sexual dimorphism in ramification (branching) and the age of the oldest cone, an index of the degree of serotiny in Leucadendron. Each point is a different species. After Harris and Pannell (2010). Published with permission of the Journal of Ecology.

In contrast to long-lived species, females are often larger than males in short-lived polycarpic and monocarpic species (Lloyd and Webb 1977; Obeso 2002; Delph 1999), although this difference can depend on when during the life history comparisons are made. Dynamic patterns of sex-specific growth and resource allocation are particularly evident in short-lived dioecious plants, highlighting the importance of making comparisons at various life cycle stages. For example, in the short-lived perennial Silene latifolia allocation patterns to vegetative growth are similar between the sexes prior to flowering, although there is evidence of sexual dimorphism in gene expression long before this occurs (Zluvova et al. 2010). Once reproduction commences, however, dimorphism develops rapidly with females growing larger and living longer than males, which produce up to 16 times as many flowers (reviewed in Delph 1999). This difference is consistent with the idea that there are contrasting sex-specific optima for traits affecting longevity (e.g. Delph and Herlihy 2012), and this can favour a “live fast, die young” strategy in males (Bonduriansky et al. 2008). This strategy seems likely in other dioecious plants in which males senesce earlier than females.

Experimental studies involving the manipulation of reproductive expenditure (by bud removal) and nitrogen resources provide insight on the causes of size dimorphism in the annual Mercurialis annua, a species in which males have less biomass than females (Harris and Pannell 2008). The two sexes differed in allocation patterns, with males investing proportionately more in root growth, presumably to provide nitrogen for pollen production, thus restricting above-ground vegetative growth, and females investing more in producing photosynthetic leaves capable of supplying carbon for fruits and seeds.
The timing of resource deployment and the relative versus absolute differences in the sizes of below- versus above-ground sources and sinks that supply different resources can thus explain the observed patterns of size dimorphism (and see Sánchez-Vilas and Pannell 2011). Sex-specific allocation patterns in M. annua vary temporally and also respond to environmental heterogeneity. Hesse and Pannell (2011) found that the sexes differentially adjusted their reproductive allocation in response to resource availability and plant competition. In particular, males reduced their reproductive expenditure when grown in poor soils, whereas females increased theirs, especially when competing with other females. However, there was relatively little effect of resources on the degree of size dimorphism so that the relative size disparities were maintained across environmental treatments and through time.

In the wind-pollinated annual Rumex hastatulus, height dimorphism changes predictably during the life cycle, with males taller than females at flowering and the reverse pattern occurring during seed maturation (Pickup and Barrett 2012). In this species both pollen and seeds are wind-dispersed, and the temporal changes in plant height during the life cycle seem likely to be adaptive, optimizing first pollen and then seed dispersal, because greater height increases the distance that propagules are dispersed by wind.

Reproductive traits

There are numerous examples of sexual dimorphism in reproductive traits of dioecious species, and these have been well summarized in the reviews of Delph (1999) and Eckhart (1999). Sex-specific differences include flowering phenology and periodicity (e.g. Thomas and LaFrankie 1993), bud abortion (Abe 2002), flower size (e.g. Delph et al. 1996), flower number per plant (Delph et al. 2005), floral longevity (e.g. Primack 1985), nutrient content of flowers (e.g. Carroll and Delph 1996), nectar production (e.g. Bawa and Opler 1975), floral fragrances (Ashman 2009), floral defense against herbivory (e.g. Cornelissen and Stiling 2005), and various inflorescence characteristics including total flower number (e.g. Barrett 1992), daily display size (e.g. Yakimowski et al. 2009), and inflorescence architecture (e.g. Rourke 1989). In animal-pollinated species these differences can have important consequences for pollinator visitation, competition for mates, and the evolution of sexual dimorphism (Vaughton and Ramsey 1998; Ashman 2000; Case and Barrett 2004; Glaettli and Barrett 2008). There are obviously constraints on how different the reproductive traits of females and males in animal-pollinated species can become; too much divergence could interfere with mating success if pollinators are more attracted to one sex than the other, or if the sexes attract different pollinators. Such constraints are absent from wind-pollinated plants, and the contrasting biophysical requirements for pollen dispersal and pollen capture have led to striking cases of sexual dimorphism in plant architecture and flower production in some species (e.g. Leucadendron rubrum, Figure 7.1A). In some cases the direction of difference between females and males is quite consistent (e.g. in long-lived species males commonly flower at a younger age and more often than females, Delph 1999), whereas for other traits this is not the case (e.g. flower size, Delph et al. 1996).
Below we discuss recent examples that were not available when earlier reviews were conducted, and we consider hypotheses to account for the patterns observed.

A common observation in long-lived dioecious plants is that males flower more regularly than females (Lloyd and Webb 1977; Bawa et al. 1982; Nicotra 1998). This pattern is generally interpreted as resulting from greater female than male reproductive expenditure (e.g. Ågren 1988; Queenborough et al. 2007). However, the extent to which variation in environmental factors might also influence patterns of flowering in the sexes of dioecious species is less well understood. A recent study of sexual differences in year-to-year flowering in Lindera triloba, a multi-stemmed understory shrub of temperate forests in Japan, provides useful insights in this regard (Matsushita et al. 2011). The authors monitored sunshine hours and flowering patterns of the sexes over five consecutive years at both the ramet and genet level. Flowering fluctuated annually and was positively correlated with the number of sunshine hours during the preceding summer. Although, as expected, annual flowering intensity was greater in males than females, inter-annual variation in ramet flowering and inflorescence production was also more pronounced in males, with ramets more sensitive to light conditions and the growth status and size of the genets to which they belonged. This observation suggests that the extent of modular integration of ramets within genets differs between the sexes. Evidence from girdling experiments indicated that female ramets are capable of earlier physiological independence than male ramets (Isogimi et al. 2011). As yet the physiological mechanisms by which ramet flowering may be differently coordinated in the two sexes are unclear, but this issue could be investigated by tracing patterns of carbon translocation using 13C labeling (e.g. Ida et al. 2012).

Comparative studies of animal-pollinated dioecious species indicate that they commonly possess flowers that are less showy than those of outcrossing hermaphrodites, with small flowers that are often white, pale yellow, or green in colour (Charlesworth 1993; Renner and Ricklefs 1995; Vamosi et al. 2003).
Nevertheless, the aggregation of these flowers can result in large floral displays that often show sexual dimorphism in floral and inflorescence traits. Following Bateman’s principle, floral and inflorescence traits that increase pollinator attraction would be expected to evolve under stronger pollinator-mediated selection in male than female plants. This leads to the prediction that large floral displays evolve primarily to increase male fertility and is based on the assumption that male outcrossed siring success increases with more pollinator visits, with only a few visits required to maximize female fertility. However, there is now evidence for widespread pollen limitation of seed production in flowering plants (e.g. Burd 1994; Larson and Barrett 2000; Ashman et al. 2004), indicating that the assumption that male but not female fertility is limited by access to mates is not always true. Indeed, there is evidence that the strength of selection on attractive traits can increase with greater pollen limitation of seed set (Ashman and Morgan 2004), and that this may lead to the evolution of diverse reproductive adaptations (Harder and Aizen 2010). These findings, and the recent appreciation of the context-dependent nature of selection on floral traits, suggest that determining the relative importance of both natural and sexual selection will be critical for explaining patterns of sexually dimorphic trait variation.

A recent experimental study of Silene latifolia by Delph and Herlihy (2012) exposes the complexity of selection on flower size and number. In this species, a flower size-number trade-off occurs within each sex, and floral traits are genetically correlated with leaf physiology (Delph et al. 2005). The authors used experimental arrays composed of selection lines of small- versus large-flowered plants to increase the phenotypic variation on which selection acts. Because they measured both pollen production and siring success (with genetic markers), they were able to distinguish fecundity selection from sexual selection in males. In females they found evidence for both fecundity and viability selection favouring large-flowered plants but no evidence for sexual selection. In contrast, sexual selection favoured small-flowered and early-flowering males, but viability selection opposed this and instead favoured large-flowered males, thus producing a ‘tug-of-war’ between the two forms of selection. An important conclusion from this study is that the relative importance and direction of the different forms of selection can be highly dependent on environmental conditions. In its native Europe S. latifolia occurs over a wide geographical range, experiencing widely different levels of precipitation, and this may contribute to the considerable variation among populations in flower size and number.

Sexual selection is most obvious in animals when it favours the evolution of extravagant male displays that enhance mating success at the expense of reduced viability.
The larger floral displays of males in many animal-pollinated dioecious species are usually interpreted as resulting from male-male competition for mates, but few cases are known where this is associated with the reduced survival of male plants as a result of viability selection (but see Delph and Herlihy 2012). A striking example involves Leucadendron xanthoconus (Figure 7.1B) in which males can produce up to 20 times more flowers than females. Bond and Maze (1999) found that the number of insect visits to male plants increased linearly with floral display size but that increasing display size was associated with a higher probability of plant death. In contrast, the seed set and survival of females were not associated with display size. The ultimate cause of death in male plants appears to be the high maintenance cost of the abundant yellow non-photosynthetic display leaves (Figure 7.1B) that attract pollinators, but which cause considerable shading of photosynthetic leaves. More recently, Hemborg and Bond (2005) have proposed that the striking sexual dimorphism in L. xanthoconus has been promoted by the activities of its pollinator, a nitidulid beetle (Pria cinerascens), which depends entirely on the species for egg-laying sites and food for adults and larvae. Based on field observations and manipulative experiments, they proposed that the different resources available from the two sexes have driven the evolution of sexual dimorphism. Males provide food and egg-laying sites whereas the nectarless females, because of the particular cup-shaped morphology of their flower heads, provide only shelter for the beetles from rain, which is frequent during flowering. This idea involving ‘specialized female rewards’ is novel as it challenges the assumption that ‘rewardless’ females necessarily function only by deceit. Future work on this system should determine whether the number of insect visits to male plants is positively associated with male siring success, as assumed. Phenotypic selection analysis would also be useful to investigate the extent to which mating success and viability in males are counterbalanced to produce the optimal display size.

The key functional component of floral display size is the number of flowers in anthesis on a given day, rather than total flower production, as only the former should determine pollinator attraction and mating success (Harder and Barrett 1996). In common with most dioecious species, flower size and the total number of flowers per inflorescence in Sagittaria latifolia are greater in males than females; however, daily display size is larger in females (Figure 7.3, Yakimowski et al. 2011). This difference results from the more synchronized opening of flowers within female inflorescences, in contrast to males (Figure 7.4C,D).
The cause of this difference in flowering strategy probably resides in the different reproductive roles of females and males. More protracted flowering in males is likely to have been shaped by sexual selection to increase the number and variety of mating partners. This pattern of flowering serves to restrict the diminishing returns commonly associated with male function by presenting pollen gradually over time and maximizing the number of different insect visitors that participate in cross-pollination (Lloyd 1984; Harder and Thomson 1989). A study of siring success in mating arrays of Sagittaria latifolia supports this hypothesis as male fertility increased linearly with flower production in a manner that is consistent with a linear gain curve (Perry and Dorken 2011). In females the larger daily floral displays may function to compensate for the smaller size of female flowers, and also for the absence of pollen as a reward. More work is needed on the sex-specific flowering schedules of dioecious plants, as these not only determine the scope for sexual selection but also the intensity of frequency-dependent selection. Also, the wide variation among populations of Sagittaria latifolia in degree of sexual dimorphism (Figure 7.4) raises questions regarding the relative importance of ecological and genetic factors in governing this variation.

FIGURE 7.3: Sexual dimorphism in flower size and daily display size in Sagittaria latifolia (Alismataceae). The female is on the left and the male on the right. After Yakimowski et al. (2011). Published with permission of Annals of Botany.

There is a growing literature on the role of floral fragrance in pollinator attraction, raising the question of whether sexual dimorphism in floral scent occurs in dioecious species. Based on a survey of 33 gender dimorphic species, Ashman (2009) found that in the majority of species male plants emitted more volatiles per flower than females, a result consistent with sexual selection. However, several alternative hypotheses also predict this outcome as well as other patterns (see Table 2 in Ashman 2009). For example, higher volatile amounts in males could be a simple allometric consequence of larger flower size, but this cannot be the case in Silene latifolia because flowers in females are considerably larger than in males. Waelti et al. (2009) found that male plants emitted significantly larger amounts of scent and that naïve pollinating male moths preferred male over female flowers. Female moths showed no such preference, possibly because only female flowers are used for oviposition and are preferred sites for larval development. Future experimental studies on the chemical ecology of scent dimorphism seem likely to yield new insights into the pollination biology of dioecious plants.


FIGURE 7.4: Variation among populations of Sagittaria latifolia in the degree of sexual dimorphism in: (A) total number of flowers per inflorescence; (B) flower size; (C) number of flowers in anthesis per inflorescence per day (daily floral display); (D) proportion of total flowers open per inflorescence per day. All values are means and standard errors; the right-hand panels for each trait provide the grand means. After Yakimowski et al. (2011), published with permission of Annals of Botany.

ECOLOGY OF SEXUAL DIMORPHISM

The cases of sexual dimorphism described in the preceding section reflect contrasting functional roles for the sexes with implications for their frequency and distribution. In particular, differences in life history, physiology, and reproductive expenditure may influence the frequency of sexes across broad environmental gradients resulting in geographical variation in sex ratios, and at finer spatial scales, the occupation of different environmental niches in heterogeneous environments. In this section we review evidence that sexual dimorphism can have ecological consequences reinforcing, and perhaps also promoting, secondary sexual divergence. Our focus here is on the response of the sexes to abiotic factors and plant competition, although there are numerous other potential ecological consequences of sexual dimorphism. To mention one example, a recent meta-analysis of sex-biased herbivory (Cornelissen and Stiling 2005) found that male plants exhibited significantly higher levels of herbivory than female plants (and see Ågren et al. 1999). This may have been because, in the species surveyed, males were generally larger but possessed lower concentrations of secondary compounds and other plant defenses.


The sex ratios of dioecious populations commonly deviate from the equilibrium expectation of the 1:1 primary sex ratio predicted by Fisherian theory (Delph 1999; de Jong and Klinkhamer 2005; Sinclair et al. 2012). A recent survey of flowering sex ratios among angiosperm families revealed that about half showed significant deviations from equality, with male bias almost twice as common as female bias (Barrett et al. 2010). The frequent occurrence of male-biased sex ratios is likely to be associated with the greater reproductive investment of females, as this commonly results in the earlier onset and more frequent flowering of males and the greater mortality of females (Lloyd and Webb 1977; Delph 1999; Obeso 2002). This observation leads to the prediction that long-lived dioecious species that experience repeated episodes of reproduction should be more likely to develop male-biased sex ratios. Such effects may be especially strong in species with a large maternal investment in costly fleshy fruits, a common correlate of dioecy (Vamosi et al. 2003). These predictions were recently confirmed in comparative analyses of the life-history correlates of dioecy, in which male-biased sex ratios were associated with woody growth forms and fleshy fruits (Sinclair et al. 2012; D.L. Field, M. Pickup and S.C.H. Barrett, MS in review). Differences in the degree of sexual dimorphism in reproductive expenditure may therefore be influential in shaping patterns of sex-ratio variation among angiosperm species.

Sexual dimorphism in the costs of reproduction may also be expected to influence sex-ratio variation among populations of dioecious species, especially if they occupy a broad range of environmental conditions. Higher reproductive expenditure and/or greater sensitivity to stress in females should result in more male-biased sex ratios along gradients of resource availability and growing season length.
There is some evidence from studies of sex ratios along environmental gradients to support this hypothesis (e.g. Grant and Mitton 1979; Fox and Harrison 1981; Marques et al. 2002; Pickering and Hill 2002; Li et al. 2007). Growing season length may also differentially affect the sexes, especially in northern latitudes where a shorter growing season may limit opportunities for females to successfully mature seed. A latitudinal survey of sex ratios of Sagittaria latifolia in eastern N. America (S. B. Yakimowski and S.C.H. Barrett, unpubl. data) revealed patterns consistent with the hypothesis that females are more sensitive to conditions that limit their reproductive activities. Based on a survey of 116 populations at the northern range limit of this species, the authors found a significant decline in the frequency of females with increasing latitude. Because of the clonal nature of S. latifolia it is

unclear whether this result simply reflects a reduction in the flowering of female ramets at range limits, or whether genet sex ratios are also male-biased.

Similar processes causing among-population variation in sex ratios can result in the segregation of sexes in spatially heterogeneous environments. Indeed, there is considerable evidence for the ‘spatial segregation of the sexes’ (SSS) in populations of dioecious plants, and in some cases the physiological mechanisms causing habitat segregation have been investigated (reviewed in Dawson and Geber 1999). Spatial segregation of the sexes has been reported in over 30 dioecious species from 20 families and in the vast majority of cases male-biased sex ratios are reported in more stressful sites (Bierzychudek and Eckhart 1988; Mercer and Eppley 2010). Extreme SSS could influence successful mating if the sexes become too spatially isolated, and there has been interest in the mechanisms causing habitat differentiation and why some species exhibit this phenomenon and not others.

A variety of adaptive and non-adaptive hypotheses have been proposed to explain SSS. Several seem unlikely (e.g. habitat selection, sex choice (gender diphasy), maternal control of sex ratio, and sex-differential germination) as they are either unknown in plants, e.g. habitat selection, or are of limited occurrence, e.g. gender diphasy (see Lloyd and Bawa 1984). Early work suggested that SSS results from niche partitioning that has evolved as an adaptive response to reduce competition between the sexes (Freeman et al. 1976; Onyekwelu and Harper 1979; Cox 1981). However, in a critique of this hypothesis, Bierzychudek and Eckhart (1988) proposed that SSS is more likely to be a simple non-adaptive outcome of differential mortality between the sexes as a result of sexual dimorphism in reproductive expenditure.
Of course the occurrence of niche differences between the sexes does not necessarily indicate that intersexual competition is the cause of SSS. Thus, determining the nature of sex-specific competitive effects is crucial for understanding the ultimate causes of niche segregation of the sexes. Surprisingly few studies have investigated sexual differences in competitive ability in dioecious species (reviewed in Ågren et al. 1999 and see Sánchez-Vilas et al. 2011). Recent studies of the N. American clonal salt marsh grass Distichlis spicata by Eppley and colleagues provide valuable clues on the role of competition in potentially contributing towards SSS. In this species, sex ratios vary widely within salt marshes, varying from female to male predominance along gradients of elevation and nutrients (Eppley et al. 1998; Eppley 2001). Sex-specific genetic markers confirmed that SSS is evident at the genet level and is not simply a result of sex-specific

differences in the flowering of ramets. The availability of markers has also allowed juveniles to be sexed and used in competition and reciprocal transplant experiments (Eppley 2006; Mercer and Eppley 2010). These experiments have demonstrated that females are stronger competitors than males, at least in some environments, and that competition between females and males is significantly more intense than competition between plants of the same sex, an assumption of the niche-partitioning hypothesis. Collectively these results suggest that environment-dependent differences in competitive ability during the seedling stage help to maintain patterns of niche segregation in D. spicata. They also provide the best evidence to date for niche partitioning in dioecious plants. However, because these studies have focused only on competitive interactions among seedlings of D. spicata, it is not possible at this stage to rule out the contribution of sexual dimorphism in physiology and reproductive expenditure to the SSS. Indeed, it seems probable that both niche partitioning and features of sexual dimorphism play a role in this system.

Sexual dimorphism in physiology and reproduction can result in the sexes requiring different resources from the environment (e.g. Dudley 2006; Harris and Pannell 2008), a phenomenon known as the ‘Jack Sprat effect’ (Onyekwelu and Harper 1979; Cox 1981). The possibility that the sexes modify their ecological niches had not been considered in any detail until recently, especially with regard to future offspring performance. By growing plants of Mercurialis annua in soil previously occupied by females or males, Sánchez-Vilas and Pannell (2010) found that plants grown in soil in which females had previously grown were significantly smaller in terms of total biomass than those grown in soil previously occupied by males.
As discussed earlier, in this species females are larger than males and therefore they may have depleted more resources from the soil than males. This form of ‘niche construction’ may occur in other cases of sexual size dimorphism in dioecious plants.

EVOLUTION AND GENETICS OF SEXUAL DIMORPHISM

When dioecy evolves from hermaphroditism, females and males are expected to diverge and specialize to their respective unisexual conditions. This is because hermaphroditic plants cannot be simultaneously optimized for both female and male function. Therefore, when separate sexes evolve, constraints to gender specialization are relieved and the establishment of unisexuality is expected to be associated with sex-specific adaptation, particularly in reproductive traits. However, when the sexes have different optimal values for such traits, a shared genetic architecture can constrain them from evolving toward their respective trait optima (Figure 7.5). Nevertheless, sexual dimorphism can still evolve when such trade-offs exist, and this can involve sex-limited gene expression and the breakdown of strong intersexual genetic correlations (Rhen 2000). Such divergence is facilitated by both natural and sexual selection (Lande 1980). Indeed, these are the primary evolutionary processes responsible for the evolution and maintenance of sexual dimorphism, and although non-adaptive processes including drift and mutation can affect genetic variability in sexual dimorphism, they cannot by themselves explain its persistence (Lande 1981; Kirkpatrick 1982). Rather, sexual dimorphism results from the interplay between sex-specific adaptation and the breakdown of genetic correlations that constrain the independent evolution of traits subject to asymmetric (sex-biased) selection in females and males (Lande 1979, 1980). Thus, the rate and extent of evolutionary change in sexually dimorphic traits will be strongly influenced by their underlying genetic architecture and the patterns of genetic variation and covariation available to selection. In this section, we review quantitative genetic approaches to the study of sexual dimorphism in dioecious plants (and see Geber 1999; Meagher 1999).

FIGURE 7.5: A hypothetical scenario in which females (dashed lines) and males (solid lines) have different optima for the same trait, causing sex-biased selection (long arrows). A shared genetic architecture results in intersexual genetic correlations (rMF) that constrain the independent divergence of females and males (short arrows) and cause their respective trait distributions (shaded curves) to be suboptimal. With rMF < 1, conflict imposed by genetic constraint may be resolved through the evolution of sexual dimorphism (modified from Bedhomme and Chippindale 2008).


Quantitative genetic models

Consider two homologous traits, $Z_f$ and $Z_m$, which affect fitness in both females and males. The standard expression for the change in the mean of a single trait, $Z$, is given by $\Delta Z = R = h^2 s$, where the response to selection ($R$) is equal to the trait’s heritability ($h^2$) multiplied by the selection differential ($s$) for the trait, and $s$ is the difference in the mean of the trait after ($Z^*$) and before ($Z$) selection. Heritability is given in the narrow sense as the proportion of phenotypic variance ($V_P$) attributable to additive genetic variance ($V_A$), i.e. $h^2 = V_A / V_P$. This equation can be extended to predict the change in the means of two homologous traits in females and males, $Z_f$ and $Z_m$, giving the following two expressions:

$$\Delta Z_f = \frac{1}{2}\left(h_f^2 V_{P_f} i_f + h_f h_m r_{MF} V_{P_f} i_m\right)$$

$$\Delta Z_m = \frac{1}{2}\left(h_m^2 V_{P_m} i_m + h_m h_f r_{MF} V_{P_m} i_f\right),$$

where the 1/2 accounts for the fact that autosomal traits receive equal contributions from each parent, $h_f^2$ and $h_m^2$ are the heritabilities for each sex, and $i$ represents a standardized measure of sex-specific selection intensity. Of particular interest is the quantity $r_{MF}$, which describes the between-sex genetic correlation for traits $Z_m$ and $Z_f$; it is given by:

$$r_{MF} = \frac{\mathrm{Cov}(M,F)}{\sqrt{V_{A_M} V_{A_F}}}$$

and determines the extent to which selection in one sex will cause a correlated response in the other: $r_{MF} = 1$ implies an exact correlated response.

Because between-sex genetic correlations constrain the independent evolution of female and male traits, a negative relationship has been predicted between the extent of sexual dimorphism and $r_{MF}$ (Slatkin 1984; Reeve and Fairbairn 2001; Bonduriansky and Chenoweth

2009). Studies of animal species report consistent negative relationships between rMF and phenotypic sexual dimorphism (e.g. Bonduriansky and Rowe 2005; Poissant et al. 2008;


Chenoweth et al. 2010), and recent meta-analyses by Poissant et al. (2010) and Wyman (2012) suggest that this pattern holds for a variety of taxa. In contrast, there are few empirical estimates of rMF in plants, and our understanding of the influence of genetic constraint on the evolution of sexual dimorphism is limited. Delph et al. (2004) reported that, in S. latifolia, the trait with the highest intersexual genetic correlations (petal-limb length) also exhibited the lowest levels of sexual dimorphism; however, this may, in part, be caused by petal-limb length not being highly correlated with flower number (Delph et al. 2004). Similarly, Ashman (2003) found that in gender dimorphic Fragaria virginiana, a gynodioecious wild strawberry, flower number was the least sexually dimorphic trait and highly genetically correlated between the sexes, suggesting a shared genetic architecture and constraint on the sex-specific divergence of this trait. Selection experiments in S. latifolia have also found that the between-sex genetic correlations for flower number were close to 1, and when this trait was selected in females it resulted in a significant and nearly equivalent change in both sexes (Meagher 1999; Delph et al. 2004; 2010). The results of these studies provide evidence that genetic correlations can indeed cause coupled evolutionary responses in the sexes of dioecious plants. Sexual dimorphism is prevalent among dioecious species and this raises the question of how sexually dimorphic traits can diverge despite the constraints imposed by genetic correlations. One problem in conceptualizing the evolution of sexual dimorphism, using the framework described by the univariate breeder’s equation described above, is that the constraints imposed by rMF are based on the assumption that genetic variances between the sexes are equivalent, whereas the available data suggest that this is often not the case (e.g. Cheverud et al. 1985; Reeve and Fairbairn 2001; Poissant et al. 
2010; Bonduriansky and Chenoweth 2009). Furthermore, it ignores the fact that traits within each sex can be genetically correlated, so that selection on a focal trait can cause a correlated response in a second trait, which may also be involved in sex-specific adaptation. These problems are partially overcome by conceptualizing the evolution of sexual dimorphism using the framework developed by Lande (1980), who investigated the evolution of female and male traits using a modified version of the multivariate breeder’s equation and showed that the change in the mean of quantitative characters subject to sex-specific selection can be described by:


$$\Delta Z = \frac{1}{2}\begin{pmatrix} G_m & B \\ B^{T} & G_f \end{pmatrix}\begin{pmatrix} \beta_m \\ \beta_f \end{pmatrix}$$

where $G_m$ and $G_f$ represent male and female genetic variance and covariance matrices, $B$ is the between-sex covariance matrix, and $\beta_m$ and $\beta_f$ are vectors of selection gradients for males and females. In this formulation, as in the univariate case, between-sex genetic covariance can constrain the independent evolution of traits in males and females, and $B$ can be thought of as the multivariate analog of $r_{MF}$. When $B$ is similar to $G$, little sex-specific divergence is possible. An illuminating difference between the univariate and multivariate formulations is that because the matrix $B$ has both magnitude and direction, a positive between-sex genetic covariance can either increase or decrease the efficacy with which sexual dimorphism evolves depending on the orientation of $B$ with respect to $G$. Thus, mechanisms that change the shape of either $B$ or $G$ can influence how dimorphism evolves, and moreover, the mode of selection (e.g. intersexual selection, natural selection) can affect the evolution of male and female traits by determining how selection gradients are specified for each sex. For example, sexual selection can alter male or female selection gradients by causing fitness to become a function of the distribution of phenotypes in the population (i.e. by causing fitness to become frequency-dependent), and this can cause the sexes to respond asymmetrically to selection on traits that are correlated between them.

Studies that have used this multivariate framework for investigating the evolution of sexual dimorphism in animals have found that G matrices are often dimorphic (e.g. Rolff et al. 2005; McGuigan and Blows 2007; Lewis et al. 2011; Gosden et al. 2012), a pattern that seems to hold in several plant species as well (e.g. Ashman 2003; Steven et al. 2007; Campbell et al. 2011). In Silene latifolia, Steven et al.
(2007) estimated G matrices for females and males and found that although most traits were highly correlated, sex-specific G matrices differed in both magnitude and orientation, implying that even if the sexes were subject to similar selection regimes, they could exhibit different evolutionary responses. In wind-pollinated Schiedea adamantis, Campbell and colleagues (2011) reported significant between-sex differences in G matrices, which were likely due, in part, to lower genetic variation for flower number in females. These studies provide evidence against the assumption of equal female and male genetic

variances and suggest that sex-specific responses to selection are often possible, even in the presence of intersexual genetic correlations. They also highlight how multivariate quantitative genetic approaches, which consider the interactions between intersexual covariances, sex-specific genetic variances, and selection gradients, can provide a more complete understanding of the evolution of sexual dimorphism.
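
As a rough numerical illustration of the multivariate expression of Lande (1980) given above, the following Python sketch assembles the block matrix from the male and female G matrices and the between-sex matrix B, and returns the predicted per-generation change in male and female trait means. The function name `lande_response` and every parameter value are hypothetical choices made for illustration only; they are not estimates from any of the studies cited.

```python
import numpy as np

def lande_response(G_m, G_f, B, beta_m, beta_f):
    """Predicted change in male and female trait means (after Lande 1980).

    G_m, G_f       : (k, k) additive genetic (co)variance matrices per sex
    B              : (k, k) between-sex additive genetic covariance matrix
    beta_m, beta_f : length-k vectors of selection gradients
    """
    G_m, G_f, B = (np.asarray(a, dtype=float) for a in (G_m, G_f, B))
    # Assemble the 2k x 2k block matrix [[G_m, B], [B^T, G_f]].
    G_full = np.block([[G_m, B], [B.T, G_f]])
    beta = np.concatenate([np.atleast_1d(np.asarray(beta_m, dtype=float)),
                           np.atleast_1d(np.asarray(beta_f, dtype=float))])
    delta = 0.5 * G_full @ beta
    k = G_m.shape[0]
    return delta[:k], delta[k:]

# Hypothetical single-trait case with sexually antagonistic selection:
# equal genetic variances of 1.0 and opposing gradients of +0.2 and -0.2.
# With unit variances the between-sex covariance equals r_MF.
for r_mf in (0.0, 0.9):
    dz_m, dz_f = lande_response([[1.0]], [[1.0]], [[r_mf]], [0.2], [-0.2])
    # Per-generation change in dimorphism (male mean minus female mean).
    print(r_mf, dz_m[0] - dz_f[0])
```

Under these assumed values, dimorphism increases by about 0.2 per generation when the sexes are genetically uncorrelated, but by only about 0.02 when the between-sex correlation is 0.9, which is the constraint described in the text. Supplying unequal `G_m` and `G_f`, or an asymmetric `B`, shows how sex-specific responses can arise even under identical selection gradients.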

Differences between plants and animals

Earlier reviews discussed quantitative genetic models in the context of sexual selection in animals and emphasized the need to consider three essential elements in conceptualizing the evolution of sexual dimorphism by sexual selection: female preference, mate choice, and the genetic correlation between them (Arnold 1987; Bradbury and Andersson 1987; Maynard Smith 1987). In principle there is no difficulty in applying these models to dioecious plants; however, there are important differences in the biology of plants and animals that may provide insights into why and how the evolution of sexual dimorphism differs between these groups. The extent to which mate choice operates in plants also warrants special consideration because of the indirect ways in which plants reproduce as a consequence of their immobility (Charlesworth et al. 1987; Moore and Pannell 2011). Indeed, the indirect nature of plant sexual interactions may be part of the explanation for why most plants show less extreme sexual dimorphism than animals.

Intersexual interaction in flowering plants is necessarily indirect and is mediated by the vectors of dispersal through which pollen is transferred and received (e.g. animals, wind, and water). This can cause the strength of the relation between secondary sexual characters (e.g. flower size) and mating success to be reduced because of the uncertainties involved in pollen delivery and receipt, and this will reduce the strength of sexual selection (i.e. skew the shape of βm or βf) and result in a lower optimal trait value for the character in question. Further, because pollinators may often select for similar traits in both sexes, this should weaken the strength of between-sex disruptive selection and limit the divergence of attractive characters or floral rewards. Opportunities for mate choice and male-female competitive interactions may still exist, however, and these could become more important once pollen grains are deposited on stigmas. However, micro- and megagametophytes are necessarily dimorphic and so the effects of inter- or

intra-sexual selection at this stage might not be expected to cause secondary sexual dimorphism in traits expressed in sporophytes. Nevertheless, owing to significant overlap in gene expression between the sporophytic and gametophytic stages of the life cycle in plants (~60%, Mascarenhas 1999), gametophytic selection may have direct effects on pollen characteristics and could indirectly influence the evolution of male sporophyte characters and hence sexual dimorphism. Thus, to the extent that mate choice occurs in plants, such male-female interactions must occur primarily in the post-pollination stage of the life cycle where the interaction may be less likely to produce exaggerated sexual dimorphism.

The typically weaker sexual dimorphism in plants than animals may also be explained by the recent evolutionary origins of dioecy in most lineages. When dioecy evolves from hermaphroditism and the sexes are initially monomorphic with respect to homologous characters, intersexual genetic correlations should be quite strong and could interfere with female and male trait divergence. Indeed, much of the genetic variation available for the evolution of sexual dimorphism is likely to have been initially shared by the sexes after the establishment of dioecy, and the recent origins of unisexuality in some lineages may mean that there has not been sufficient time for selection to break down intersexual genetic correlations. This hypothesis predicts a relation between the degree of dimorphism and the age of dioecious lineages. As discussed earlier, other factors including the pollination system of dioecious species (e.g. animal versus wind pollination) may also influence the degree of morphological divergence, and comparative studies would be useful to investigate further the factors responsible for the patterns and degree of sexual dimorphism in plants.

Sex chromosomes and sexual dimorphism

The transition to dioecy has been associated with the evolution of sex chromosomes in some dioecious plant species (Charlesworth et al. 2005), and theory suggests that they can facilitate the evolution of sexual dimorphism (Rice 1987). This can occur by several mechanisms. One is through the influence of sex chromosomes on the genetic variance and covariance structures of females and males. The multivariate equation for the evolution of sexual dimorphism developed by Lande (1980) assumes that the genes involved in sexual dimorphism are located on autosomes, but sex linkage can change the structure of female and male genetic variances because of the different number of sex-linked genes in each sex. For example, when the heterogametic sex (e.g.


XY males or ZW females) contains only one allele per sex-linked locus (i.e. when it is hemizygous for genes on the sex chromosomes), it can express the genes regardless of dominance or recessivity, whereas the homogametic sex (e.g. XX females or ZZ males) contains two alleles for each sex-linked locus and can therefore be heterozygous or homozygous. It follows that the genetic variance contributed by sex-linked genes is asymmetric between the sexes and can be as much as twofold higher in XY males or ZW females (Lynch and Walsh 1998). Such differences can translate into sex-specific responses to selection. Similarly, when genes linked to sex chromosomes or autosomes are sex-biased in expression, as might occur as a result of sex-specific selection, this can select for regulatory mechanisms that further limit expression levels in females or males (Ellegren and Parsch 2007). In particular, when expression levels are simultaneously beneficial in one sex but deleterious in the other (i.e. sexually antagonistic), then, depending on the magnitude of gene expression and how deleterious the effects are, sex-limited expression can be selectively favored to offset the negative fitness consequences of expressing these genes in both sexes. Sex-linked genes with sexually antagonistic effects play an important role in theoretical models of plant sex-chromosome evolution by causing selection against recombination between loci on newly evolving sex chromosomes (Charlesworth 2005). In addition, they play an important role in many aspects of sexual conflict theory (Arnqvist and Rowe 2005) and can be particularly important in contributing to sex-specific responses to selection because they decouple the genetic architecture of female and male traits, thus helping to resolve the conflict that arises when the sexes have different fitness optima but high intersexual genetic correlations.
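
The twofold asymmetry in sex-linked genetic variance noted above can be made concrete with a minimal worked example. It rests on assumptions that are not stated in the text: a single biallelic X-linked locus with allele frequencies $p$ and $q = 1 - p$ that are equal in the two sexes, additive allelic effects, Hardy-Weinberg proportions in XX females, and full dosage compensation, such that a hemizygous male expresses the same genotypic value as the corresponding female homozygote. With genotypic values $-a$, $0$ and $+a$ in females and $-a$ and $+a$ in males, both sexes have mean $a(p - q)$, and the variances are

$$V_{A,\mathrm{XX}} = a^2(p^2 + q^2) - a^2(p - q)^2 = 2pq\,a^2$$

$$V_{A,\mathrm{XY}} = a^2 - a^2(p - q)^2 = 4pq\,a^2$$

so that $V_{A,\mathrm{XY}} / V_{A,\mathrm{XX}} = 2$. If instead a single copy in males had half the effect of two copies in females (no dosage compensation, with male values $\pm a/2$), the male variance would be $pq\,a^2$, half the female value. Under these assumptions the twofold excess is therefore best read as an upper bound, consistent with the phrase "as much as twofold".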
Rice (1987) used a genetic model to investigate the relation between sexual dimorphism and sex chromosomes, and predicted that genes with sexually antagonistic effects should be disproportionately located on sex chromosomes. His prediction stems from the conclusion that alleles of sex-linked genes can spread through dioecious populations even when the deleterious effects to one sex outweigh the benefits to the other. Subsequent theory on the evolution of sexual dimorphism through sexual selection has focused on this kind of sexual antagonism, but many of these models have assumed that traits under sexual selection are under the control of just one or a few genes, and often these genes are assumed to be located on the autosomes (Chapman et al. 2003). Nonetheless, sexually dimorphic phenotypes are often associated with

genes on the sex chromosomes because such genes have sex-biased transmission and genomic distributions. The occurrence of sexual antagonism is of particular interest when it involves the X chromosome, as genes on this chromosome can preferentially evolve sex-biased fitness effects relative to autosomal genes (Rice 1987). Imagine, for example, a sexually antagonistic mutation on an autosome that has a significant effect on the fitness of heterozygotes. When the fitness effects of such a mutation are positive in females and negative in males, the mutation can spread under positive selection only when the beneficial effects in females outweigh the deleterious effects in males (Vicoso and Charlesworth 2006). However, if this mutation occurs on the X chromosome, its deleterious effects will only be expressed 1/3 of the time (i.e. when the X chromosome is carried by a male) and hence the probability of such a mutation reaching fixation is greater when it occurs on an X chromosome than on an autosome (Rice 1987). It follows that the X chromosome can accumulate such female-benefit genes at a faster rate, and the “feminization” of this chromosome might make it an evolutionary hot spot for genes involved in sexual dimorphism (Gibson et al. 2002). Empirical work has attempted to determine whether sex chromosomes do indeed influence sexual dimorphism by an effect disproportionate to their size (reviewed in Mank 2009). In D. melanogaster, there are reports that genes with sex-biased expression have nonrandom genomic distributions, with X chromosomes harboring fewer genes with male-biased expression (Parisi et al. 2003, but see Fitzpatrick 2004). On the other hand, comparative studies in birds do not support an association between sex chromosomes and sexually selected dimorphic traits (Mank et al. 2006). In plants, sexually dimorphic gene expression has been detected in both vegetative (Zluvova et al. 2010) and floral (Muyle et al. 2012) characters in Silene latifolia and there is evidence that some genes on the X chromosome of this species are male-biased in their expression (Muyle et al. 2012), suggesting that these genes may be involved in sexual antagonism. Genetic mapping in this species also suggests that genes in the recombining pseudoautosomal regions (PARs) may be involved in the evolution of sexually dimorphic and sexually antagonistic traits (Scotti and Delph 2006; Delph et al. 2010; Otto et al. 2011). In particular, sex-specific quantitative trait loci map disproportionately to PARs, suggesting that in this species genes involved in sex-specific functions that recombine between sex chromosomes might have evolved sex-limited expression. There is still much work to do on the influence of

sex-linked genes on sexual dimorphism, but recent evidence suggests that the presence of sex chromosomes can have important effects on sex-specific divergence and this can in turn influence the expression and distribution of genes underlying dimorphic traits.
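The argument that an X-linked, female-benefit mutation can spread where an otherwise identical autosomal mutation cannot may be easier to follow with a small deterministic sketch. The recursion below is a simplified illustration rather than Rice's (1987) model itself: it assumes random mating, discrete generations, a rare allele that raises female fitness by `s_f` and lowers male fitness by `s_m`, and a dominance coefficient `h` applied to heterozygotes. All names and parameter values are hypothetical, and the outcome depends on dominance (here the allele is treated as fully expressed in heterozygotes, h = 1).

```python
def next_generation(p_f, p_m, s_f, s_m, h, x_linked):
    """One generation of selection on a female-benefit, male-cost allele.

    p_f, p_m: allele frequency in eggs and in sperm (X-bearing sperm if
    x_linked). Returns the frequencies in the next generation's gametes.
    """
    # Genotype frequencies among diploid zygotes (random union of gametes).
    hom = p_f * p_m
    het = p_f * (1 - p_m) + p_m * (1 - p_f)
    ref = (1 - p_f) * (1 - p_m)

    # Females are diploid at the locus in both cases.
    w_f = hom * (1 + s_f) + het * (1 + h * s_f) + ref
    new_p_f = (hom * (1 + s_f) + 0.5 * het * (1 + h * s_f)) / w_f

    if x_linked:
        # Males are hemizygous and inherit their X from their mother.
        new_p_m = p_f * (1 - s_m) / (p_f * (1 - s_m) + (1 - p_f))
    else:
        w_m = hom * (1 - s_m) + het * (1 - h * s_m) + ref
        new_p_m = (hom * (1 - s_m) + 0.5 * het * (1 - h * s_m)) / w_m
    return new_p_f, new_p_m


def final_frequency(s_f, s_m, h, x_linked, p0=0.001, generations=2000):
    """Iterate the recursion from a rare starting frequency p0."""
    p_f = p_m = p0
    for _ in range(generations):
        p_f, p_m = next_generation(p_f, p_m, s_f, s_m, h, x_linked)
    # X-linked copies spend 2/3 of their time in females, autosomal copies 1/2.
    return (2 * p_f + p_m) / 3 if x_linked else (p_f + p_m) / 2


# Hypothetical case in which the male cost exceeds the female benefit.
on_x = final_frequency(s_f=0.05, s_m=0.08, h=1.0, x_linked=True)
on_autosome = final_frequency(s_f=0.05, s_m=0.08, h=1.0, x_linked=False)
print(on_x, on_autosome)
```

With these illustrative values the allele increases from rarity when X-linked but is lost when autosomal, which is the qualitative pattern described in the text. Setting `s_f` above `s_m` lets the autosomal allele spread as well.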

FUTURE STUDIES Our review has identified several topics that warrant further investigation. To conclude, we highlight three research themes that we believe would be especially profitable in providing new insights on the evolution of sexual dimorphism in plants. To our knowledge there have been no comparative or phylogenetic studies of variation in plant sexual dimorphism. These could involve examining the ecological and life history correlates of sexual dimorphism to address a range of unanswered questions. Do the degree and direction of sexual dimorphism differ between short- and long-lived dioecious species? Is dimorphism more strongly developed in species that occupy benign versus stressful environments? Do wind- and animal-pollinated species differ in the extent of sexual dimorphism? Is dimorphism more strongly developed in older dioecious lineages? The increasing appearance of phylogenies of dioecious groups (e.g. Soza et al. 2012) should enable these types of questions to be addressed. Future microevolutionary investigations are needed on the extent of intra-specific variation in sexual dimorphism. Most studies of sexual differences have involved a limited sample of populations, and little is known about the patterns of geographical variation in sexual dimorphism, especially in species that occupy a wide range of environments. Where population differentiation in sexual dimorphism has been reported (e.g. Barrett 1992; Kohorn 1995), it has been at the phenotypic level, and it is not known if the differences simply represent plastic responses to local environmental conditions, or whether there is a significant genetic component to the observed differences among populations (Delph et al. 2002; Delph and Bell 2008). 
Identifying genetic differentiation among populations in the degree of sexual dimorphism, and determining the local environmental conditions in which populations occur, could provide important insights into the relative importance of natural and sexual selection in shaping patterns of dimorphism. Finally, there is a considerable amount of work to be done on understanding the genetic architecture of sexual dimorphism. Progress has been made in understanding the genes involved

in the expression of dimorphic traits in a few model systems (e.g. Delph et al. 2010), and the use of next-generation sequencing now enables such analyses in non-model organisms. Future studies would benefit from both quantitative genetic studies, which estimate the influence of sex-specific genetic variance and covariance structures, and also from genetic mapping and gene expression experiments to investigate the role of sex-linked and sexually antagonistic genes in sexual dimorphism. Although we have emphasized the role of sex chromosomes in the evolution of sexual dimorphism, most dioecious species do not possess sex chromosomes and yet are sexually dimorphic. Future studies of the genetic architecture of sexual dimorphism in species without sex chromosomes would be valuable for understanding how rapidly sexual dimorphism can evolve in lineages where dioecy is of recent origin.

ACKNOWLEDGMENTS We thank Locke Rowe and John Pannell for discussion and comments on the manuscript, Adam Chippindale, John Pannell, and Sarah Yakimowski for permission to reproduce figures from their articles, and the Natural Sciences and Engineering Research Council of Canada for a Discovery Grant to SCHB which provided student support for JH.

CHAPTER 8

CONCLUDING DISCUSSION

GENERAL SUMMARY In this thesis, I investigated several features of the evolution and genetics of plant sex chromosomes, and provided new insights into the changes that have occurred during sex chromosome evolution in the dioecious plant Rumex hastatulus. These include changes to the effectiveness of natural selection, patterns of gene expression, rates of molecular evolution, patterns of genetic diversity, and rates of inter-chromosomal gene movement. In addition, I developed a novel theoretical model on the interactions between sex chromosomes and selection on the sex ratio, and examined the relationship between sex chromosomes and the evolution of sexual dimorphism. Below, I provide a summary of key results from each chapter and highlight their implications. I end by suggesting future research directions.

SUMMARY OF CHAPTERS In Chapter 2, I reviewed evidence from the literature on broad-scale patterns of selection in plant genomes and found that differences in plant mating system, ploidy level, and demographic history have significant and detectable effects on natural selection and molecular evolution, and that there is extensive between-species variation in estimates of both positive and purifying selection. Changes to population structure, including those resulting from inbreeding, were recognized as key factors determining effective population size and therefore the strength and extent of selection. Recent studies also suggested that genomic regions with low recombination and a high density of selective mutations were strongly affected by linked selection, including selective sweeps and background selection (e.g., Cutter and Choi 2010). Finally, studies of whole genome duplications have found that differences across species in the time since polyploidization events may explain differences among polyploid species in rates of purifying and positive selection. Thus, variation in the strength and extent of natural selection across plant genomes is well documented, and there are several population genetic factors that interact to determine this variation. Quantifying the relative importance and joint effects of recombination rate variation,

effective population size, and changes to dominance associated with polyploidy are ongoing challenges. In Chapter 3, I investigated the consequences of suppression of recombination between the recently evolved sex chromosomes in the dioecious plant R. hastatulus. I further developed recently established methods (Bergero and Charlesworth 2011; Chibalina and Filatov 2011) to identify sex-linked genes using segregation analysis and transcriptome sequencing of multiple parent-offspring groups, and the data were used to estimate evolutionary rates at nonsynonymous and synonymous sites in X- and Y-linked genes, and to test for changes in gene expression, selection efficacy, and codon usage on ancestral and neo-Y chromosomes. Despite low X-Y divergence in both sex chromosome systems, I found evidence that Y-linked genes have started to undergo gene loss, causing ∼28% and ∼8% hemizygosity of the ancestral and derived X chromosomes, respectively. Moreover, I found that genes remaining on Y chromosomes have accumulated more amino acid replacements, contained more unpreferred changes in codon use, and exhibited significantly reduced gene expression compared with their X-linked alleles. Significantly, these results demonstrated that the magnitudes of these degenerative effects were greatest for sex-linked genes that had been non-recombining for a longer period of time. In Chapter 4, I tested whether the selective interference effects suggested by the ongoing degeneration of Y chromosomes had also led to a reduction in the effective population size (Ne) of Y-linked genes, resulting in lower levels of standing variation on the Y chromosome compared to X chromosomes or autosomes. To do this I compared levels of neutral polymorphism on X- and Y-linked genes with that of autosomal genes in the two chromosomal races (XY and XY1Y2) of R. hastatulus, focusing in particular on the ancestral (Y1) chromosome pair. 
I found that Y-chromosome diversity was 40-fold lower than on the X chromosome, and nearly 50-fold lower than on autosomes. This analysis indicated that the severe reduction in Y diversity could not be explained by a reduced Ne of Y-linked genes caused by the occurrence of female biased sex ratios in R. hastatulus, and was unlikely due to mutation rate differences between the sex chromosomes, or high variance in male reproductive success. Rather, in combination with results from Chapter 3 indicating an accumulation of deleterious mutations on the Y chromosome, this study suggests that selective interference has likely played a significant role in reducing nucleotide diversity during the early stages of Y-chromosome degeneration. In addition, my analyses provided evidence that the recent fusion event resulting in the formation of

the XY1Y2 sex chromosome system was associated with a significant reduction in genetic diversity on the X chromosome in XY1Y2 compared to XY populations, suggesting the possibility of a selective sweep or a disproportionate loss of X-linked diversity following the origin of the XY1Y2 from the XY race. In Chapter 5, I used transcriptome sequence data to conduct the first study of the chromosomal distribution of cyto-nuclear interacting genes in a plant species with sex chromosomes (Rumex hastatulus). This study was motivated by recent studies suggesting that the female biased transmission pattern of X chromosomes and mitochondria may result in selection for coadaptation between X-linked genes and genes whose protein products interact in the mitochondria, leading to an overrepresentation of nuclear-mitochondrial genes on the X chromosome relative to autosomes (Rand et al. 2004; Drown et al. 2012). I found no evidence of overrepresentation of nuclear-mitochondrial or nuclear-chloroplast genes on the X chromosome in R. hastatulus, and thus no support for this coadaptation hypothesis. I also found no evidence for the hypothesis that selection had resulted in a deficit of mito-nuclear X-linked genes to reduce male mitochondrial mutational load (e.g., see Drown et al. 2012; Hill and Johnson 2013; Dean et al. 2014; Rogell et al. 2014; Dean et al. 2015). I discussed how these results from a species with recently evolved sex chromosomes fit into an emerging picture of the evolutionary forces governing the chromosomal distribution of cyto-nuclear genes. In particular, I highlighted the importance of testing a neutral model where the observed biases in the chromosomal distributions of cyto-nuclear genes are explained by the gene content of the ancestral chromosome pair that evolved into sex chromosomes, rather than adaptive evolution arising from genomic conflict or cooperation. 
In Chapter 6, I developed population genetic models to investigate the relationship between sex chromosome heteromorphism and selection on the sex ratio in plant populations. In particular, I asked whether the extensive overlap between haploid and diploid-phase gene expression, which is common in plants (Borg et al. 2009), could in theory alter the outcome of Fisherian sex ratio selection and affect the load of mutations in plant populations. I confirmed classical results indicating that when haploid selection acts only on the relative fitness of X- and Y-bearing gametophytes and the sex ratio is controlled by the maternal genotype, sex ratios evolve toward unity. However, when considering selection acting on deleterious mutations in the haploid phase, I found that biased sex ratios could be stably maintained, reflecting a trade off

between the advantages of purging deleterious mutations, and the disadvantages of haploid selection on the sex ratio. These results provide a plausible and novel evolutionary explanation for the frequent observation of biased sex ratios in dioecious plants with heteromorphic sex chromosomes (reviewed in Field et al. 2014). In Chapter 7, I provided a comprehensive review of sexual dimorphism in dioecious plants and considered the evidence that sex chromosomes play a disproportionate role in the evolution of sexual dimorphism. I documented a range of sexually dimorphic traits in angiosperm species, discussed their ecological and reproductive consequences, and considered the evolutionary forces driving divergence between female and male phenotypes. I concluded that sexual dimorphism in plants is generally less well developed than in many animal groups, and that this may relate to differences in the relative importance of sexual and natural selection in plants versus animals. I also suggested that the asymmetry between males and females in the genetic variance contributed by sex-linked genes might contribute to the divergence of female and male traits, and this may be further facilitated by the breakdown of intersexual genetic correlations. I concluded that although theory has established an important role of sex chromosomes in the evolution of sexual dimorphism, empirical data has yet to unambiguously confirm this.
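The Chapter 4 conclusion that a female-biased sex ratio cannot account for a 40-fold reduction in Y-linked diversity can be checked against standard neutral expectations. The sketch below is an illustration under stated assumptions, not the analysis performed in Chapter 4: it uses the textbook effective population size formulas for autosomal, X-linked, and Y-linked loci with Poisson-distributed reproductive success, assumes equal mutation rates across chromosomes, and treats the sex ratios as hypothetical values rather than estimates for R. hastatulus.

```python
def effective_sizes(n_females, n_males):
    """Neutral effective sizes for autosomal, X-linked and Y-linked loci."""
    ne_auto = 4 * n_females * n_males / (n_females + n_males)
    ne_x = 9 * n_females * n_males / (4 * n_males + 2 * n_females)
    ne_y = n_males / 2
    return ne_auto, ne_x, ne_y


def expected_fold_reduction(female_fraction, n_total=10_000):
    """Expected X/Y and autosome/Y diversity ratios for a given sex ratio."""
    n_females = female_fraction * n_total
    n_males = n_total - n_females
    ne_auto, ne_x, ne_y = effective_sizes(n_females, n_males)
    return ne_x / ne_y, ne_auto / ne_y


# Hypothetical sex ratios, from unbiased to strongly female biased.
for female_fraction in (0.5, 0.6, 0.75, 0.9):
    x_over_y, a_over_y = expected_fold_reduction(female_fraction)
    print(f"females={female_fraction:.2f}  X/Y={x_over_y:.2f}  A/Y={a_over_y:.2f}")
```

With an equal sex ratio the expected ratios are 3 and 4, and even when 90% of individuals are female they remain below 10, far short of the 40-fold and nearly 50-fold differences reported above.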

FUTURE DIRECTIONS My thesis has covered a broad set of issues concerning the evolution and genetics of plant sex chromosomes, and throughout I have highlighted some of the additional questions that this research has raised. A significant implication of my research is that the time scale over which sex chromosomes evolved is an important consideration for understanding the patterns of X- and Y-chromosome divergence. This is relevant both to the mechanisms underlying Y-chromosome degeneration and to the processes involved in the evolution of suppressed recombination. In particular, theoretical models of Y chromosome evolution are dynamic, and the predicted rate of Y-chromosome degeneration is a non-linear function of the time since suppression of recombination (Bachtrog 2008). I have highlighted this fact in the context of comparing patterns of degeneration on ancestral and neo-Y chromosomes in Rumex (Chapter 3), but ultimately large-scale comparative analyses involving a range of species whose sex chromosomes stopped recombining at different times will be needed. Such a comparative

approach will provide a more detailed evaluation of the time scales over which Y chromosomes degenerate, as well as shed light on the relative importance of the factors driving and opposing such degenerative evolution. There is also significant scope for theoretical development of models of Y chromosome evolution. In particular, a clearer understanding is needed regarding the relative importance of the evolutionary processes that operate in regions of suppressed recombination. Orr and Kim’s (1998) investigation of the “ruby-in-the-rubbish effect”, for example, provides a strong argument that the rate of fitness decline on the Y chromosome caused by a ruby-in-the-rubbish process is quantitatively on par with, or even exceeds, that predicted by the more commonly invoked background selection or selective sweep models, and greatly exceeds degeneration by Muller’s Ratchet (Muller 1964). These arguments were based on calculations of the maximum rate of fitness decline, but it is not yet clear whether the ruby-in-the-rubbish effect might diminish over time, as appears to be true for other mechanisms of Y-degeneration (Bachtrog 2008). Studying these mechanisms together with explicit models of X chromosome dosage compensation will also be of interest, as the rate of Y chromosome degeneration might be expected to change as dosage compensation evolves. The processes involved in the evolution of suppressed recombination are also largely untested empirically, and this represents a major gap in our understanding of sex chromosome evolution. There is now a large body of theory on sexually antagonistic (SA) selection (Van Doorn and Kirkpatrick 2010; Jordan and Charlesworth 2012; Blaser et al. 2013), but empirical data has yet to establish the role of SA polymorphisms in suppressing recombination. Recently evolved sex chromosomes in plants (Charlesworth 2012; Qiu et al. 2013) and fish (Shikano et al. 2011; Natri et al. 2013) may be useful systems for testing the role of sexual antagonism. For example, recent studies using molecular evolutionary approaches in the plant Silene latifolia have investigated footprints of SA selection in the partially sex-linked pseudoautosomal region, finding evidence for balancing selection in such regions (Qiu et al. 2013). This is an expected outcome of SA selection (Kirkpatrick 2014), but SA polymorphisms have not yet been identified. Identifying SA polymorphisms and establishing their role in recombination suppression therefore remains a major challenge in the field. Another unanswered question in sex chromosome evolution that is highlighted by my research on Rumex concerns the mechanisms driving the evolution of fusions between autosomes and sex chromosomes. Fusion and translocation events involving sex chromosomes are known to

have occurred in many species (reviewed in Westergaard 1958; Ming et al. 2011), and the resulting formation of neo-sex chromosomes has provided some of the clearest examples of degenerative evolution following suppressed recombination (Bachtrog 2004, 2013; Bachtrog et al. 2008). However, it is not known what drives such chromosomal rearrangements and a major empirical question is whether sex chromosomes are involved in chromosomal fusions or translocations more frequently than other regions of the genome. A recent study has compared the establishment rates of fusions between sex chromosomes and autosomes, finding that they more frequently involve the Y chromosome than other sex chromosomes in fishes and squamate reptiles (Pennell et al. 2015), but whether the sex chromosomes are disproportionately involved in such events is not known. Answering this question may provide clues regarding the mechanisms driving such events, and enable a test of the theory that sexually antagonistic selection is a key driver of sex chromosome-autosome fusions (Charlesworth and Charlesworth 1980). Finally, the suggestion in Chapter 7 that sexual dimorphism in plants is typically weaker than in animals is also underexplored, and more studies of the evolutionary and genetic aspects of plant sexual dimorphism are needed. In particular, studies aimed at testing the extent of sexual selection in plants (e.g., see Moore and Pannell 2011) may shed light on the differences between plants and animals in the magnitude and extent of sexually dimorphic trait expression. One hypothesis for the weaker sexual dimorphism in plants, as described in Chapter 7, is that this might be explained by the recent evolutionary origins of dioecy in most lineages. 
In particular, when dioecy evolves from hermaphroditism and the sexes are initially monomorphic with respect to their homologous characters, intersexual genetic correlations should be quite strong and could interfere with female and male trait divergence. Indeed, much of the genetic variation available for the evolution of sexual dimorphism is likely to have been initially shared by the sexes after the establishment of dioecy, and it may be that there has not been sufficient time in many dioecious lineages for selection to break down intersexual genetic correlations. This hypothesis predicts a relationship between the degree of dimorphism and the age of dioecy. Other factors, including the pollination system of dioecious species (e.g. animal versus wind pollination) may also influence the degree of sex-specific divergence, and comparative studies would be useful to investigate further the factors responsible for the patterns of sexual dimorphism in plants.

Like many areas of evolutionary genetics, our understanding of the mechanisms driving the evolution of sex chromosomes is largely based on theory, and empirical data is only starting to confirm or reject theoretical predictions. However, a wealth of genomic data from sex chromosomes across a variety of taxa now exists (Ashman et al. 2014), including data from an increasing number of plants with young sex chromosome systems (Bergero et al. 2007; Qiu et al. 2010; Ming et al. 2011; Charlesworth 2012; Muyle et al. 2012). In combination with large-scale comparative analyses and the tools of molecular population genetics, these data should enable future studies to perform powerful empirical tests and provide new insight into the evolutionary mechanisms driving sex chromosome evolution.

BIBLIOGRAPHY

Aanen DK, Spelbrink JN, Beekman M. 2014. What cost mitochondria? The maintenance of functional mitochondrial DNA within and across generations. Philos Trans R Soc Lond B 369. doi:10.1098/rstb.2013.0438.

Abe T. 2002. Flower bud abortion influences clonal growth and sexual dimorphism in the understorey dioecious shrub Aucuba japonica (Cornaceae). Ann Bot 89:675–681.

Adams KL, Palmer JD. 2003. Evolution of mitochondrial gene content: gene loss and transfer to the nucleus. Mol Phylo Evol 29:380–395.

Agrawal AF and Whitlock MC. 2012. Mutation load: the fitness of individuals in populations where deleterious mutations are abundant. Ann Rev Ecol Evol Syst 43:115–135.

Ågren J. 1988. Sexual differences in biomass and nutrient allocation in the dioecious Rubus chamaemorus. Ecology 69:962–973.

Ågren J, Danell K, Elmqvist T, Ericson L, Hjältén J. 1999. Sexual dimorphism and biotic interactions. In: Geber MA, Dawson TE, Delph LF, eds. Gender and sexual dimorphism in flowering plants. Berlin: Springer-Verlag, 217–246.

Akashi H, Osada N, Ohta T. 2012. Weak selection and protein evolution. Genetics 192(1):15–31.

Allen GA, Antos JA. 1993. Sex ratio variation in the dioecious shrub Oemleria cerasiformis. Am Nat 141:537–553.

Anders S and Huber W. 2010. Differential expression analysis for sequence count data. Genome Biol 11:R106.

Anders S, Pyl PT, Huber W. 2014. HTSeq — A Python framework to work with high-throughput sequencing data. bioRxiv doi:10.1101/002824.

Andersson M. 1994. Sexual Selection. Princeton: Princeton University Press.

Andolfatto P. 2001. Contrasting patterns of X-linked and autosomal nucleotide variation in Drosophila melanogaster and Drosophila simulans. Mol Biol Evol 18:279–290.

Andolfatto P. 2005. Adaptive evolution of non-coding DNA in Drosophila. Nature 437(7062):1149–1152.

Andolfatto P. 2007. Hitchhiking effects of recurrent beneficial amino acid substitutions in Drosophila melanogaster. Genome Res 17:1755–1762.

Arnold SJ. 1987. Quantitative genetic models of sexual selection: a review. In: Stearns S, ed. The evolution of sex and its consequences. Basel: Birkhäuser, 283–315.

Arnold SJ. 1994. Bateman’s principle and the measurement of sexual selection in plants and animals. Am Nat 144:126–149.

Arnqvist G, Rowe L. 2005. Sexual Conflict. Princeton: Princeton University Press.

Ashman T-L, Morgan MT. 2004. Explaining phenotypic selection on plant attractive characters: male function, gender balance or ecological context. Proc Roy Soc Lond B 271:553–559.

Ashman T-L. 2000. Pollinator selectivity and its implications for the evolution of dioecy and sexual dimorphism. Ecology 81:2577–2591.

Ashman T-L. 2003. Constraints on the evolution of males and sexual dimorphism: field estimates of genetic architecture of reproductive traits in three populations of gynodioecious Fragaria virginiana. Evolution 57:2012–2025.

Ashman T-L. 2009. Sniffing out patterns of sexual dimorphism in floral scent. Functional Ecology 23:852–862.

Aury JM, Jaillon O, Duret L, Noel B, Jubin C, Porcel BM, et al. 2006. Global trends of whole-genome duplications revealed by the ciliate Paramecium tetraurelia. Nature 444:171–178.

Bachelier JB and Friedman WE. 2011. Female gamete competition in an ancient angiosperm lineage. Proc Nat Acad Sci USA 108:12360–12365.

Bachtrog D. 2004. Evidence that positive selection drives Y-chromosome degeneration in Drosophila miranda. Nat Genet 36:518–522.

Bachtrog D. 2006. Expression profile of a degenerating neo-Y chromosome in Drosophila. Curr Biol 16:1694–1699.

Bachtrog D. 2008. The temporal dynamics of processes underlying Y chromosome degeneration. Genetics 179:1513–1525.

Bachtrog D. 2013. Y-chromosome evolution: emerging insights into processes of Y-chromosome degeneration. Nat Rev Genet 14:113–124.

Bachtrog D, Charlesworth B. 2002. Reduced adaptation of a non-recombining neo-Y chromosome. Nature 416:323–326.

Bachtrog D and Andolfatto P. 2006. Selection, recombination and demographic history in Drosophila miranda. Genetics 174:2045–59.

Bachtrog D, Hom E, Wong KM, Maside X, de Jong P. 2008. Genomic degradation of a young Y chromosome in Drosophila miranda. Genome Biol 9:R30.

Bachtrog D, Kirkpatrick M, Mank JE, McDaniel SF, Pires JC, Rice W, Valenzuela N. 2011. Are all sex chromosomes created equal? Trends Genet 27:350–357.

Bachtrog D, Mank JE, Peichel CL, Kirkpatrick M, Otto SP, Ashman T-L, Hahn MW, Kitano J, Mayrose I, Ming R, Perrin N, Ross L, Valenzuela N, Vamosi JC. 2014. Sex determination: why so many ways of doing it? PLoS Biol 12:e1001899.

Baines JF and Harr B. 2007. Reduced X-linked diversity in derived populations of house mice. Genetics 175:1911–21.

Bansal V and Bafna V. 2008. HapCUT: An efficient and accurate algorithm for the haplotype assembly problem. Bioinformatics 24:153–159.

Bar-Yaacov D, Blumberg A, Mishmar D. 2012. Mitochondrial-nuclear co-evolution and its effects on OXPHOS activity and regulation. Biochim Biophys Acta 1819:1107–1111.

Barker MS, Kane NC, Matvienko M, Kozik A, Michelmore RW, Knapp SJ, Rieseberg LH. 2008. Multiple paleopolyploidizations during the evolution of the Compositae reveal parallel patterns of duplicate gene retention after millions of years. Mol Biol Evol 25(11):2445–2455.

Barrett SCH. 1992. Gender variation and the evolution of dioecy in Wurmbea dioica (Liliaceae). J Evol Biol 5:423–444.

Barrett SCH. 2002. The evolution of plant sexual diversity. Nat Rev Genet 3:274–284.

Barrett SCH and Charlesworth D. 1991. Effect of a change in the level of inbreeding on the genetic load. Nature 352:522–524.

Barrett SCH, Yakimowski SB, Field DL, Pickup M. 2010. Ecological genetics of sex ratios in plant populations. Philosophical Transactions of the Royal Society London Series B 365:2549–2557.

Barton N. 2000. Genetic hitchhiking. Phil Trans Roy Soc B 355:1553–1562.

Bateman AJ. 1948. Intra-sexual selection in Drosophila. Heredity 2:349–368.

Baudry E, Kerdelhué C, Innan H, Stephan W. 2001. Species and recombination effects on DNA variability in the tomato genus. Genetics 158(4):1725–1735.

Bawa KS, Keegan CR, Voss RH. 1982. Sexual dimorphism in Aralia nudicaulis (Araliaceae). Evolution 36:371–378.

Bawa KS, Opler PA. 1975. Dioecism in tropical trees. Evolution 29:167–179.

Bedhomme S, Chippindale AK. 2008. Irreconcilable differences: when sexual dimorphism fails to resolve sexual conflict. In: Fairbairn DJ, Blanckenhorn WU, Szekely T, eds. Sex, size and gender roles: evolutionary studies of sexual size dimorphism. Oxford, UK: Oxford University Press, 185–194.

Begun DJ, Holloway AK, Stevens K, Hillier LW, Poh Y-P, Hahn MW, Nista PM, et al. 2007. Population Genomics: Whole-Genome Analysis of Polymorphism and Divergence in Drosophila simulans. PLoS Biol 5(11):e310.

Beilstein MAM, Nagalingum NSN, Clements MDM, Manchester SRS, Mathews SS. 2010. Dated molecular phylogenies indicate a Miocene origin for Arabidopsis thaliana. Proc Natl Acad Sci USA 107(43):18724–18728.

Bell G. 1985. On the function of flowers. Proc Roy Soc Lond B 224:233–265.

Bentley DR, Balasubramanian S, Swerdlow HP, Smith GP, Milton J, et al. 2008. Accurate whole human genome sequencing using reversible terminator chemistry. Nature 456:53–9.

Berg OG, Kurland CG. 2000. Why mitochondrial genes are most often found in nuclei. Mol Biol Evol 17:951–961.

Bergero R, Charlesworth D. 2009. The evolution of restricted recombination in sex chromosomes. Trends Ecol Evol 24:94–102.

Bergero R, Charlesworth D. 2011. Preservation of the Y transcriptome in a 10-million-year-old plant sex chromosome system. Curr Biol 21:1470–1474.

Bergero R, Forrest A, Kamau E, Charlesworth D. 2007. Evolutionary strata on the X chromosomes of the dioecious plant Silene latifolia: evidence from new sex-linked genes. Genetics 175:1945–1954.

Betancourt AJ, Presgraves DC. 2002. Linkage limits the power of natural selection in Drosophila. Proc Natl Acad Sci USA 99(21):13616–13620.

Bierzychudek P, Eckhart V. 1988. Spatial segregation of the sexes in dioecious plants. Am Nat 132:34–43.

Birchler JA, Veitia RA. 2009. The gene balance hypothesis: implications for gene regulation, quantitative traits and evolution. New Phytol 186(1):54–62.

Blanc G, Wolfe KH. 2004. Widespread paleopolyploidy in model plant species inferred from age distributions of duplicate genes. Plant Cell 16(7):1667–1678.

Blaser O, Grossen C, Neuenschwander S, Perrin N. 2013. Sex-chromosome turnovers induced by deleterious mutation load. Evolution 67:635–645.

Bond WJ, Maze KE. 1999. Survival costs and reproductive benefits of floral display in a sexually dimorphic dioecious shrub. Evol Ecol 13:1–18.

Bonduriansky R, Chenoweth S. 2009. Intralocus sexual conflict. Trends Ecol Evol 24:280–288.

Bonduriansky R, Maklakov A, Zajitschek F, Brooks R. 2008. Sexual selection, sexual conflict and the evolution of ageing and life span. Functional Ecology 22:443–453.

Bonduriansky R, Rowe L. 2005. Intralocus sexual conflict and the genetic architecture of sexually dimorphic traits in Prochyliza xanthostoma (Diptera: Piophilidae). Evolution 59:1965–1975.

Borg M, Brownfield L, Twell D. 2009. Male gametophyte development: a molecular perspective. J Exp Bot 60(5):1465–1478.

Bowers JE, Chapman BA, Rong J, Paterson AH. 2003. Unravelling angiosperm genome evolution by phylogenetic analysis of chromosomal duplication events. Nature 422(6930):433–438.

Boyko AR, Williamson SH, Indap AR, Degenhardt JD, Hernandez RD, Lohmueller KE, Adams MD, et al. 2008. Assessing the evolutionary impact of amino acid mutations in the human genome. PLoS Genet 4(5):e1000083.

Bradbury JW, Anderson MB. 1987. Sexual selection: testing the alternatives. J Evol Biol 2:72–74.

Branca A, Paape TD, Zhou P, Briskine R, Farmer AD, Mudge J, Bharti AK, et al. 2011. Whole-genome nucleotide diversity, recombination, and linkage disequilibrium in the model legume Medicago truncatula. Proc Natl Acad Sci USA 108(42):E864–E870.

Brenchley R, Spannagl M, Pfeifer M, Barker GLA, D'Amore R, Allen AM, McKenzie N, et al. 2012. Analysis of the bread wheat genome using whole-genome shotgun sequencing. Nature 491(7426):705–710.

Brown JH, Stevens GC, Kaufman DM. 1996. The geographic range: size, shape, boundaries, and internal structure. Annu Rev Ecol Syst 27:597–623.

Bull JJ. 1983. Evolution of sex determining mechanisms. Benjamin/Cummings Pub. Co.

Burd M. 1994. Bateman’s principle and plant reproduction: the role of pollen limitation in fruit and seed set. Bot Rev 60:83–139.

Burt A, Trivers R. 2006. Genes in conflict: the biology of selfish genetic elements. Cambridge, MA: Belknap Press of Harvard University Press.

Bustamante CD, Nielsen R, Sawyer SA, Olsen KM, Purugganan MD, Hartl DL. 2002. The cost of inbreeding in Arabidopsis. Nature 416(6880):531–534.

Bustamante CD and Ramachandran S. 2009. Evaluating signatures of sex-specific processes in the human genome. Nat Genet 41:8–10.

Caballero A. 1994. Developments in the prediction of effective population size. Heredity 73:657–679.

Caballero A. 1995. On the effective size of populations with separate sexes, with particular reference to sex-linked genes. Genetics 139:1007–1011.

Campbell DR, Weller SG, Sakai AK, Culley TM, Dang PN, Dunbar-Wallis AK. 2011. Genetic variation and covariation in floral allocation of two species of Schiedea with contrasting levels of sexual dimorphism. Evolution 65:757–770.

Campos JL, Charlesworth B, Haddrill PR. 2012. Molecular evolution in nonrecombining regions of the Drosophila melanogaster genome. Genome Biol Evol 4(3):278–288.

Canty A, Ripley B. 2012. boot: Bootstrap R (S-Plus) functions. R package version 1.3-17.

Cao J, Schneeberger K, Ossowski S, Günther T, Bender S, Fitz J, Koenig D, et al. 2011. Whole-genome sequencing of multiple Arabidopsis thaliana populations. Nat Genet 43(10):956–963.

Carneiro M, Ferrand N, Nachman MW. 2008. Recombination and Speciation: Loci Near Centromeres Are More Differentiated Than Loci Near Telomeres Between Subspecies of the European Rabbit (Oryctolagus cuniculus). Genetics 181(2):593–606.

Carroll SB, Delph LF. 1996. The effect of gender and plant architecture on allocation to flowers in dioecious Silene latifolia (Caryophyllaceae). Int J Plant Sci 157:493–500.

Carvalho AB. 2002. Origin and evolution of the Drosophila Y chromosome. Curr Opin Genet Dev 12:664–668.

Carvalho AB, et al. 2003. Y chromosome and other heterochromatic sequences of the Drosophila melanogaster genome: how far can we go? Genetica 117:227–237.

Case AL, Barrett SCH. 2004. Floral biology of gender monomorphism and dimorphism in Wurmbea dioica (Colchicaceae) in Western Australia. Int J Plant Sci 165:289–301.

Casper B. 1988. Evidence for selective embryo abortion in Cryptantha flava. Am Nat 132:318–326.

Chapman T, Arnqvist G, Bangham J, Rowe L. 2003. Sexual conflict. Trends Ecol Evol 18:41–47.

Charlesworth B. 1978. Model for evolution of Y chromosomes and dosage compensation. Proc Natl Acad Sci USA 75:5618–5622.

Charlesworth B. 1992. Evolutionary rates in partially self-fertilizing species. Am Nat 140(1):126–148.

Charlesworth B. 1994. The effect of background selection against deleterious mutations on weakly selected, linked variants. Genet Res 63:213–227.

Charlesworth B. 1996. The evolution of chromosomal sex determination and dosage compensation. Curr Biol 6:149–162.

Charlesworth B. 2009. Effective population size and patterns of molecular evolution and variation. Nat Rev Genet 10(3):195–205.

Charlesworth B, Charlesworth D. 1978. A model for the evolution of dioecy and gynodioecy. Am Nat 112:975–997.

Charlesworth B, Charlesworth D. 1998. Some evolutionary consequences of deleterious mutations. Genetica 102:3–19.

Charlesworth B, Charlesworth D. 1999. The genetic basis of inbreeding depression. Genet Res 74:329–340.

Charlesworth B, Charlesworth D. 2000. The degeneration of Y chromosomes. Philos Trans R Soc Lond B Biol Sci 355:1563–1572.

Charlesworth B, Morgan MT, Charlesworth D. 1993. The effect of deleterious mutations on neutral molecular variation. Genetics 134(4):1289–1303.

Charlesworth D. 1993. Why are unisexual flowers associated with wind pollination and unspecialized pollinators? Am Nat 141:481–490.

Charlesworth D. 1999. Theories on the evolution of dioecy. In: Geber MA, Dawson TE, Delph LF, eds. Gender and sexual dimorphism in flowering plants. Berlin: Springer-Verlag, 33–60.

Charlesworth D. 2002. Plant sex determination and sex chromosomes. Heredity 88:94–101.

Charlesworth D. 2013. Plant sex chromosome evolution. J Exp Bot 64:405–420.

Charlesworth D. 2015. Plant contributions to our understanding of sex chromosome evolution. New Phytol 208:52–65.

Charlesworth D and Charlesworth B. 1992. The effects of selection in the gametophyte stage on mutational load. Evolution 46:703–720.

Charlesworth D, Charlesworth B, Marais G. 2005. Steps in the evolution of heteromorphic sex chromosomes. Heredity 95:118–128.

Charlesworth D, Schemske DW, Sork VL. 1987. The evolution of plant reproductive characters: sexual versus natural selection. In: Stearns SC, ed. The evolution of sex and its consequences. Basel: Birkhäuser, 317–335.

Charlesworth D, Yang Z. 1998. Allozyme diversity in Leavenworthia populations with different inbreeding levels. Heredity 81:453–461.

Charnov EL. 1982. The theory of sex allocation. Princeton University Press, Princeton, NJ.

Chen J, Källman T, Ma X, Gyllenstrand N, Zaina G, Morgante M, Bousquet J, Eckert A, Wegrzyn J, Neale D, Lagercrantz U, Lascoux M. 2012. Disentangling the roles of history and local selection in shaping clinal variation of allele frequencies and gene expression in Norway spruce (Picea abies). Genetics 191(3):865–881.

Chenoweth SF, Rundle HD, Blows MW. 2010. The contribution of selection and genetic constraints to phenotypic divergence. Am Nat 175:186–196.

Cheverud JM, Dow MM, Leutenegger W. 1985. The quantitative assessment of phylogenetic constraints in comparative analyses – sexual dimorphism in body weight among primates. Evolution 39:1335–1351.

Chevin L-M, Hospital F. 2008. Selective sweep at a quantitative trait locus in the presence of background genetic variation. Genetics 180(3):1645–1660.

Chibalina MV, Filatov DA. 2011. Plant Y chromosome degeneration is retarded by haploid purifying selection. Curr Biol 21:1475–1479.

Conesa A, et al. 2005. Blast2GO: a universal tool for annotation, visualization and analysis in functional genomics research. Bioinformatics 21:3674–3676.

Conn JS and Blum U. 1981. Sex ratio of Rumex hastatulus: the effect of environmental factors and certation. Evolution 35:1108–1116.

Connallon T, Clark AG. 2011. The resolution of sexual antagonism by gene duplication. Genetics 187:919–937.

Coop G, Pickrell JK, Novembre J, Kudaravalli S, Li J, Absher D, Myers RM, Cavalli-Sforza LL, Feldman MW, Pritchard JK. 2009. The role of geography in human adaptation. PLoS Genet 5(6):e1000500.

Coop G, Witonsky D, Di Rienzo A, Pritchard JK. 2010. Using environmental correlations to identify loci underlying local adaptation. Genetics 185(4):1411–1423.

Cornelissen T, Stiling P. 2005. Sex-biased herbivory: a meta-analysis of the effects of gender on plant-herbivore interactions. Oikos 111:488–500.

Correns C. 1922. Sex determination and numerical proportion of genders in common sorrel (Rumex acetosa). Biol Zentralblatt 42:465–480.

Correns C. 1928. Bestimmung, Vererbung und Verteilung des Geschlechtes bei den höheren Pflanzen. Berlin: Bornträger.

Cortez D, et al. 2014. Origins and functional evolution of Y chromosomes across mammals. Nature 508:488–493.

Cosmides LM, Tooby J. 1981. Cytoplasmic inheritance and intragenomic conflict. J Theor Biol. 89:83–129.

Cox PA. 1981. Niche partitioning between sexes of dioecious plants. Am Nat 117:295–307.

Crow JF. 1970. Genetic loads and the cost of natural selection. In: Kojima K, ed. Mathematical topics in population genetics. Berlin: Springer-Verlag, 128–177.

Cutter AD, Choi JY. 2010. Natural selection shapes nucleotide polymorphism across the genome of the nematode Caenorhabditis briggsae. Genome Res 20(8):1103–1111.

Darwin CR. 1871. The descent of man and selection in relation to sex. London: John Murray.

Darwin CR. 1877. The different forms of flowers on plants of the same species. London: John Murray.

Dawson TE, Geber MA. 1999. Sexual dimorphism in physiology and morphology. In: Geber MA, Dawson TE, Delph LF, eds. Gender and sexual dimorphism in flowering plants. Berlin: Springer-Verlag, 176–215.

de Jong TJ, Klinkhamer PGL. 2005. Evolutionary ecology of plant reproductive strategies. Cambridge, UK: Cambridge University Press.

Dean R, Zimmer F, Mank JE. 2014. The potential role of sexual conflict and sexual selection in shaping the genomic distribution of mito-nuclear genes. Genome Biol Evol 6:1096–1104.

Dean R, Zimmer F, Mank JE. 2015. Deficit of mitonuclear genes on the human X chromosome predates sex chromosome formation. Genome Biol Evol 7:636–641.

Degner JF, Marioni JC, Pai AA, Pickrell JK, Nkadori E, Gilad Y, Pritchard JK. 2009. Effect of read-mapping biases on detecting allele-specific expression from RNA-sequencing data. Bioinformatics 25:3207–3212.

Delph LF, Arntz MA, Scotti-Saintagne C, Scotti I. 2010. The genomic architecture of sexual dimorphism in the dioecious plant Silene latifolia. Evolution 64:2873–2886.

Delph LF, Ashman T-L. 2006. Trait selection in flowering plants: how does sexual selection contribute? Integrative and Comparative Biology 46:465–472.

Delph LF, Frey FM, Steven JC, Gehring JL. 2004. Investigating the independent evolution of the size of floral parts via G-matrix estimation and artificial selection. Evolution and Development 6:438–448.

Delph LF, Galloway LF, Stanton ML. 1996. Sexual dimorphism in flower size. Am Nat 148:299–320.

Delph LF, Gehring JL, Arntz AM, Levri M, Frey FM. 2005. Genetic correlations with floral display lead to sexual dimorphism in the cost of reproduction. Am Nat 166:31–41.

Delph LF, Herlihy CR. 2012. Sexual, fecundity, and viability selection on flower size and number in a sexually dimorphic plant. Evolution 66:1154–1166.

Delph LF, Knapczyk FN, Taylor DR. 2002. Among-population variation and correlations in sexually dimorphic traits of Silene latifolia. J Evol Biol 15:1011–1020.

Delph LF, Lu Y, Jayne LD. 1993. Patterns of resource allocation in a dioecious Carex (Cyperaceae). Am J Bot 80:607–615.

Delph LF, Meagher TR. 1995. Sexual dimorphism masks life history trade-offs in the dioecious plant Silene latifolia. Ecology 76:775–785.

Delph LF, Steven JC, Anderson IA, Herlihy CR, Brodie III ED. 2011. Elimination of a genetic correlation between the sexes via artificial correlational selection. Evolution 65:2872–2880.

Delph LF. 1999. Sexual dimorphism in life history. In: Geber MA, Dawson TE, Delph LF, eds. Gender and sexual dimorphism in flowering plants. Berlin: Springer-Verlag, 149–174.

Delph LF and Bell DL. 2008. A test of the differential-plasticity hypothesis for variation in the degree of sexual dimorphism in Silene latifolia. Evol Ecol Res 10:61–75.

Delph LF, Gehring JL, Frey FM, Arntz AM, Levri M. 2004. Genetic constraints on floral evolution in a sexually dimorphic plant revealed by artificial selection. Evolution 58:1936–1946.

Drown DM, Preuss KM, Wade MJ. 2012. Evidence of a paucity of genes that interact with the mitochondrion on the X in mammals. Genome Biol Evol. 4:763–768.

Dudash M, Johnston MO, Mazer SJ, Mitchell RJ, Morgan MT, Wilson W. 2004. Pollen limitation of plant reproduction: ecological and evolutionary causes and consequences. Ecology 85:2408–2421.

Dudley LS. 2006. Ecological correlates of secondary sexual dimorphism in Salix glauca (Salicaceae). Am J Bot 93:1775–1783.

Düsing C. 1884. Die Regulierung des Geschlechtsverhältnisses. Jenasche Zeitschr. für Naturw. 17:593–940. Translated in: Edwards A. 2000. Carl Düsing (1884) on the Regulation of the Sex-Ratio. Theor Popul Biol 58:255–257.

Dyall SD, Brown MT, Johnson PJ. 2004. Ancient invasions: from endosymbionts to organelles. Science 304:253–257.

Eckert AJ, van Heerwaarden J, Wegrzyn JL, Nelson CD, Ross-Ibarra J, González-Martínez SC, Neale DB. 2010. Patterns of population structure and environmental associations to aridity across the range of loblolly pine (Pinus taeda L., Pinaceae). Genetics 185(3):969–982.

Eckhart VM. 1999. Sexual dimorphism in flowers and inflorescences. In: Geber MA, Dawson TE, Delph LF, eds. Gender and sexual dimorphism in flowering plants. Berlin: Springer-Verlag, 123–148.

Edgar RC. 2004. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Research 32:1792–1797.

Efron B. 1987. Better bootstrap confidence intervals. J Am Stat Assoc 82:171–185.

Ellegren H, Parsch J. 2007. The evolution of sex-biased genes and sex-biased gene expression. Nat Rev Genet 8:689–698.

Ellegren H. 2009. The different levels of genetic diversity in sex chromosomes and autosomes. Trends Genet 25:278–284.

Ellegren H. 2011. Sex-chromosome evolution: recent progress and the influence of male and female heterogamety. Nat Rev Genet 12:157–166.

Elo A, Lyznik A, Gonzalez DO, Kachman SD, Mackenzie SA. 2003. Nuclear genes that encode mitochondrial proteins for DNA and RNA metabolism are clustered in the Arabidopsis genome. Plant Cell 15:1619–1631.

Engen S, Ringsby TH, Saether B-E, Lande R, Jensen H, Lillegard M, Ellegren H. 2007. Effective size of fluctuating populations with two sexes and overlapping generations. Evolution 61:1873–1885.

Eory L, Halligan DL, Keightley PD. 2009. Distributions of selectively constrained sites and deleterious mutation rates in the hominid and murid genomes. Mol Biol Evol 27(1):177–192.

Eppley SM, Stanton ML, Grosberg RK. 1998. Intrapopulation sex ratio variation in the saltmarsh grass Distichlis spicata. Am Nat 152:659–670.

Eppley SM. 2001. Gender-specific selection during early life history stages in the dioecious grass Distichlis spicata. Ecology 82:2022–2031.

Eppley SM. 2006. Females make tough neighbors: sex-specific competitive effects in seedlings of a dioecious grass. Oecologia 146:549–554.

Escobar JS, Cenci A, Bolognini J, Haudry A, Laurent S, David J, Glémin S. 2010. An integrative test of the dead-end hypothesis of selfing evolution in Triticeae (Poaceae). Evolution 64(10):2855–2872.

Eshel I, Motro U, Sansone E. 1997. Continuous stability and evolutionary convergence. J Theor Biol 185:333–343.

Ester M, del Bosque Q, Navajas-Pérez R, Panero JL, Fernández-González A, Garrido-Ramos MA. 2011. A satellite DNA evolutionary analysis in the North American endemic dioecious plant Rumex hastatulus (Polygonaceae). Genome 54:253–260.

Evans BJ, Pyron RA, Wiens JJ. 2012. Polyploidization and sex chromosome evolution in amphibians. Polyploidy and Genome Evolution 385–410.

Eyre-Walker A, Keightley PD. 2009. Estimating the rate of adaptive molecular evolution in the presence of slightly deleterious mutations and population size change. Mol Biol Evol 26(9):2097–2108.

Eyre-Walker A. 2002. Changing effective population size and the McDonald-Kreitman test. Genetics 162(4):2017–2024.

Faul F, Erdfelder E, Lang A-G, Buchner A. 2007. G*Power 3: A flexible statistical power analysis program for the social, behavioral, and biomedical sciences. Behavior Research Methods 39:175–191.

Felsenstein J. 1985. Confidence limits on phylogenies: an approach using the bootstrap. Evolution 39:783–791.

Field DL, Pickup M, Barrett SCH. 2012a. Comparative analyses of sex-ratio variation in dioecious flowering plants. Evolution doi:10.1111/evo.12001.

Field DL, Pickup M, Barrett SCH. 2012b. The influence of pollination intensity on fertilization success, progeny sex ratio, and fitness in a wind-pollinated, dioecious plant. Int J Plant Sci 173:184–191.

Filatov DA. 2005. Evolutionary history of Silene latifolia sex chromosomes revealed by genetic mapping of four genes. Genetics 170:975–979.

Filatov DA, Laporte V, Vitte C, Charlesworth D. 2001. DNA diversity in sex-linked and autosomal genes of the plant species Silene latifolia and Silene dioica. Mol Biol Evol 18:1442–1454.

Fischer AJ, Pollack O, Thalmann B, Nickel B, Pääbo S. 2006. Demographic history and genetic differentiation in apes. Curr Biol 16:1133–1138.

Fisher RA. 1930. The genetical theory of natural selection. Oxford University Press, Oxford, UK.

Fitzpatrick MJ. 2004. Pleiotropy and the genomic location of sexually selected genes. Am Nat 163:800–808.

Flowers JM, Molina J, Rubinstein S, Huang P, Schaal BA, Purugganan MD. 2012. Natural selection in gene-dense regions shapes the genomic pattern of polymorphism in wild and domesticated rice. Mol Biol Evol 29(2):675–687.

Fox JF, Harrison AT. 1981. Habitat assortment of sexes and water balance in a dioecious grass. Oecologia 49:233–235.

Foxe JP, Dar VUN, Zheng H, Nordborg M, Gaut BS, Wright SI. 2008. Selection on amino acid substitutions in Arabidopsis. Mol Biol Evol 25(7):1375–1383.

Foxe JP, Slotte T, Stahl EA, Neuffer B, Hurka H, Wright SI. 2009. Recent speciation associated with the evolution of selfing in Capsella. Proc Natl Acad Sci USA 106(13):5241–5245.

Freeling M, Thomas BC. 2006. Gene-balanced duplications, like tetraploidy, provide predictable drive to increase morphological complexity. Genome Res 16(7):805–814.

Freeman DC, Klikoff LG, Harper KT. 1976. Differential resource utilization of the sexes of dioecious plants. Science 193:597–599.

Freeman DC, Waschocki BA, Stender MJ, Goldschlag DE, Michaels HJ. 1994. Seed size and sex ratio in spinach: application of the Trivers-Willard hypothesis to plants. Ecoscience 1:54–63.

García MB, Antor RJ. 1995. Sex ratio and sexual dimorphism in the dioecious Bordera pyrenaica (Dioscoreaceae). Oecologia 101:59–67.

Gaut BS, Doebley JF. 1997. DNA sequence evidence for the segmental allotetraploid origin of maize. Proc Natl Acad Sci USA 94(13):6809–6814.

Geber MA, Dawson TE, Delph LF. 1999. Gender and sexual dimorphism in flowering plants. Berlin: Springer-Verlag.

Geber MA. 1999. Theories of the evolution of sexual dimorphism. In: Geber MA, Dawson TE, Delph LF, eds. Gender and sexual dimorphism in flowering plants. Berlin: Springer-Verlag, 97–122.

Gibson JR, Chippindale AK, Rice WR. 2002. The X chromosome is a hot spot for sexually antagonistic fitness variation. Proc Roy Soc Lond B 269:499–505.

Gillham NW. 1994. Organelle genes and genomes. New York: Oxford University Press.

Glaettli M, Barrett SCH. 2008. Pollinator responses to variation in floral display and flower size in dioecious Sagittaria latifolia (Alismataceae). New Phytol 179:1193–1201.

Glémin S, Bazin E, Charlesworth D. 2006. Impact of mating systems on patterns of sequence polymorphism in flowering plants. Proc Roy Soc B: Biol Sci 273(1604):3011–3019.

Glémin S, Ronfort J. 2013. Adaptation and maladaptation in selfing and outcrossing species: new mutations versus standing variation. Evolution 67(1):225–240.

Glémin S. 2007. Mating systems and the efficacy of selection at the molecular level. Genetics 177(2):905–916.

Gore MA, Chia J-M, Elshire RJ, Sun Q, Ersoz ES, Hurwitz BL, Peiffer JA, et al. 2009. A first-generation haplotype map of maize. Science 326(5956):1115–1117.

Gosden TP, Shastri K-L, Innocenti P, Chenoweth SF. 2012. The B-matrix harbours significant and sex-specific constraints on the evolution of multi-character sexual dimorphism. Evolution 66:2106–2116.

Gossmann TI, Keightley PD, Eyre-Walker A. 2012. The effect of variation in the effective population size on the rate of adaptive molecular evolution in eukaryotes. Genome Biol Evol 4(5):658–667.

Gossmann TI, Song B-H, Windsor AJ, Mitchell-Olds T, Dixon CJ, Kapralov MV, Filatov DA, Eyre-Walker A. 2010. Genome wide analyses reveal little evidence for adaptive evolution in many plant species. Mol Biol Evol 27(8):1822–1832.

Gossmann TI, Woolfit M, Eyre-Walker A. 2011. Quantifying the variation in the effective population size within a genome. Genetics 189(4):1389–1402.

Grabowska-Joachimiak A, Kula A, Książczyk T, Chojnicka J, Sliwinska E, Joachimiak AJ. 2014. Chromosome landmarks and autosome-sex chromosome translocations in Rumex hastatulus, a plant with XX/XY1Y2 sex chromosome system. Chromosome Res doi:10.1007/s10577-014-9446-4.

Grant MC, Mitton JB. 1979. Elevation gradients in adult sex ratios and sexual differentiation in vegetative growth of Populus tremuloides. Evolution 33:914–918.

Graur D, Zheng Y, Price N, Azevedo RB, Zufall RA, Elhaik E. 2013. On the immortality of television sets: "function" in the human genome according to the evolution-free gospel of ENCODE. Genome Biol Evol (in press).

Gschwend A, et al. 2012. Rapid divergence and expansion of the X chromosome in papaya. Proc Natl Acad Sci USA 109:13716–13721.

Hakes L, Pinney JW, Lovell SC, Oliver SG, Robertson DL. 2007. All duplicates are not equal: the difference between small-scale and genome duplication. Genome Biol 8(10):R209.

Hammer MF, Mendez FL, Cox MP, Woerner AE, Wall JD. 2008. Sex-biased evolutionary forces shape genomic patterns of human diversity. PLoS Genet 4:e1000202.

Hancock AM, Brachi B, Faure N, Horton MW, Jarymowycz LB, Sperone FG, Toomajian C, Roux F, Bergelson J. 2011. Adaptation to climate across the Arabidopsis thaliana genome. Science 334(6052):83–86.

Handley LJL, Ceplitis H, Ellegren H. 2004. Evolutionary strata on the chicken Z chromosome: Implications for sex chromosome evolution. Genetics 167:367–376.

Harder LD, Aizen MA. 2010. Floral adaptation and diversification under pollen limitation. Phil Trans Roy Soc Lond B 365:529–543.

Harder LD, Barrett SCH. 1996. Pollen dispersal and mating patterns in animal-pollinated plants. In: Lloyd DG, Barrett SCH, eds. Floral biology: studies on floral evolution in animal-pollinated plants. London: Chapman and Hall, 140–190.

Harder LD, Thomson JD. 1989. Evolutionary options for maximizing pollen dispersal of animal-pollinated plants. Am Nat 133:323–344.

Harris MS, Pannell JR. 2008. Roots, shoots and reproduction: sexual dimorphism in size and costs of reproductive allocation in an annual herb. Proc Roy Soc Lond B 275:2595–2602.

Harris MS, Pannell JR. 2010. Canopy seed storage is associated with sexual dimorphism in the woody dioecious genus Leucadendron. J Ecol 98:509–515.

Haudry A, Cenci A, Guilhaumon C, Paux E, Poirier S, Santoni S, David J, Glémin S. 2008. Mating system and recombination affect molecular evolution in four Triticeae species. Genet Res (Camb) 90(1):97–109.

Haudry A, Platts AE, Vello E, Hoen D, Leclercq M, Williamson R, et al. 2013. An atlas of over 90,000 conserved non-coding sequences yields a detailed map of crucifer regulatory regions. Nat Genet 45:891–898.

Hazzouri KM, Escobar JS, Ness RW, Newman LK, Randle AM, Kalisz S, Wright SI. 2012. Comparative population genomics in Collinsia sister species reveals evidence for reduced effective population size, relaxed selection and evolution of biased gene conversion with an ongoing mating shift. Evolution doi:10.1111/evo.12027.

Hazzouri KM, Mohajer A, Dejak SI, Otto SP, Wright SI. 2008. Contrasting patterns of transposable-element insertion polymorphism and nucleotide diversity in autotetraploid and allotetraploid Arabidopsis species. Genetics 179(1):581–592.

Heilbuth J. 2000. Lower species richness in dioecious clades. Am Nat 156:221–241.

Hellborg L. 2003. Low levels of nucleotide diversity in mammalian Y chromosomes. Mol Biol Evol 21(1):158–163.

Hellborg L and Ellegren H. 2004. Low levels of nucleotide diversity in mammalian Y chromosomes. Mol Biol Evol 21:158–163.

Hemborg AM, Bond WJ. 2005. Different rewards in female and male flowers can explain the evolution of sexual dimorphism in plants. Biological Journal of the Linnean Society 85:97–109.

Hernandez RD, Kelley JL, Elyashiv E, Melton SC, Auton A, McVean G, 1000 Genomes Project, Sella G, Przeworski M. 2011. Classic selective sweeps were rare in recent human evolution. Science 331(6019):920–924.

Hesse E, Pannell JR. 2011. Sexual dimorphism in a dioecious population of the wind-pollinated herb Mercurialis annua: the interactive effects of resource availability and competition. Annals of Botany 107:1039–1045.

Hill GE, Johnson JD. 2013. The mitonuclear compatibility hypothesis of sexual selection. Proc R Soc B. 280:20131314.

Hill WG, Robertson A. 1966. The effect of linkage on limits to artificial selection. Genet Res 9:269–294.

Ho-Huu J, Ronfort J, De Mita S, Bataillon T, Hochu I, Weber A, Chantret N. 2012. Contrasted patterns of selective pressure in three recent paralogous gene pairs in the Medicago genus. BMC Evol Biol 12(1):195.

Hollister JD, Arnold BJ, Svedin E, Xue KS, Dilkes BP, Bomblies K. 2012. Genetic adaptation associated with genome-doubling in autotetraploid Arabidopsis arenosa. PLoS Genet 8(12):e1003093.

Hough J, Hollister JD, Wang W, Barrett SCH, Wright SI. 2014a. Genetic degeneration of old and young Y chromosomes in the flowering plant Rumex hastatulus. Proc Natl Acad Sci USA 111:7713–7718.

Hough J, Ågren JA, Barrett SCH, Wright SI. 2014b. Chromosomal distribution of cytonuclear genes in a dioecious plant with sex chromosomes. Genome Biol Evol 6:2439–2443.

Hough J, Immler S, Barrett SCH, Otto SP. 2013. Evolutionarily stable sex ratios and mutation load. Evolution 67:1915–1925.

Hu C, Lin S-Y, Chi W-T, Charng Y-Y. 2012. Recent gene duplication and subfunctionalization produced a mitochondrial GrpE, the nucleotide exchange factor of the Hsp70 complex, specialized in thermotolerance to chronic heat stress in Arabidopsis. Plant Physiol 158(2):747–758.

Hu TT, Pattyn P, Bakker EG, Cao J, Cheng J-F, Clark RM, Fahlgren N, et al. 2011. The Arabidopsis lyrata genome sequence and the basis of rapid genome size change. Nat Genet 43(5):476–481.

Huber CD, DeGiorgio M, Hellmann I, Nielsen R. 2015. Detecting recent selective sweeps while controlling for mutation rate and background selection. Mol Ecol doi: 10.1111/mec.13351.

Hufford MB, Xu X, van Heerwaarden J, Pyhäjärvi T, Chia J-M, Cartwright RA, Elshire RJ, et al. 2012. Comparative population genomics of maize domestication and improvement. Nat Genet 44(7):808–811.

Hughes JF, et al. 2010. Chimpanzee and human Y chromosomes are remarkably divergent in structure and gene content. Nature 463:536–539.

Hughes JF, et al. 2012. Strict evolutionary conservation followed rapid gene loss on human and rhesus Y chromosomes. Nature 483:82–86.

Ida TY, Harder LD, Kudo G. 2012. Effects of defoliation and shading on the physiological cost of reproduction in silky locoweed (Oxytropis sericea). Ann Bot 109:237–246.

Immler S, Arnqvist G, Otto SP. 2011. Ploidally antagonistic selection maintains stable genetic polymorphism. Evolution 66:55–65.

Innan H, Kondrashov F. 2010. The evolution of gene duplications: classifying and distinguishing between models. Nat Rev Genet 11(2):97–108.

Isogimi T, Matsushita M, Watanabe Y, Nakagawa M. 2011. Sexual differences in physiological integration in the dioecious shrub Lindera triloba: a field experiment using girdling manipulation. Ann Bot 107:1029–1037.

Jiao Y, Wickett NJ, Ayyampalayam S, Chanderbali AS, Landherr L, Ralph PE, Tomsho LP, et al. 2011. Ancestral polyploidy in seed plants and angiosperms. Nature 473(7345):97–100.

Jordan C and Charlesworth D. 2012. The potential for sexually antagonistic polymorphism in different genome regions. Evolution 66:505–516.

Joseph SB, Kirkpatrick M. 2004. Haploid selection in animals. Trends Ecol Evol 19:592–597.

Kaessmann H, Wiebe V, Weiss G, Pääbo S. 2001. Great ape DNA sequences reveal a reduced diversity and an expansion in humans. Nat Genet 27:155–156.

Kaiser V, Zhou Q, Bachtrog D. 2011. Non-random gene loss from the Drosophila miranda neo-Y chromosome. Genome Biol Evol 3:1329–1337.

Kaminker JS, et al. 2002. The transposable elements of the Drosophila melanogaster euchromatin: A genomics perspective. Genome Biol 3:research0084.1–0084.20.

Karlin S and Lessard S. 1986. Theoretical studies on sex ratio evolution. Princeton University Press. Princeton, USA.

Kavanagh PH, Lehnebach CA, Shea MJ, Burns KC. 2011. Allometry of sexual size dimorphism in dioecious plants: Do plants obey Rensch’s rule? Am Nat 178:596–601.

Kawecki TJ, Ebert D. 2004. Conceptual issues in local adaptation. Ecol Lett 7:1225–1241.

Keightley PD, Eyre-Walker A. 2012. Estimating the rate of adaptive molecular evolution when the evolutionary divergence between species is small. J Mol Evol 74(1–2):61–68.

Keightley PD and Otto SP. 2006. Interference among deleterious mutations favours sex and recombination in finite populations. Nature 443:89–92.

Keinan A, Mullikin JC, Patterson N, Reich D. 2009. Accelerated genetic drift on chromosome X during the human dispersal out of Africa. Nat Genet 41:66–70.

Keller SR, Levsen N, Olson MS, Tiffin P. 2012. Local adaptation in the flowering-time gene network of balsam poplar, Populus balsamifera. Mol Biol Evol 29(10):3143–3152.

Kim Y, Maruki T. 2011. Hitchhiking effect of a beneficial mutation spreading in a subdivided population. Genetics 189(1):213–226.

Kimura M. 1983. The Neutral Theory of Molecular Evolution. Cambridge, UK: Cambridge University Press.

Kimura M, Crow J. 1964. The number of alleles that can be maintained in a finite population. Genetics 49:725–738.

Kirkpatrick M. 1982. Sexual selection and the evolution of female choice. Evolution 36:1–12.

Kirkpatrick M. 2010. How and why chromosome inversions evolve. PLoS Biol. 8:e1000501.

Klekowski EJ Jr. 1984. Mutational load in clonal plants: A study of two fern species. Evolution 38:417–426.

Kohorn LU. 1995. Geographical variation in the occurrence and extent of sexual dimorphism in a dioecious shrub, Simmondsia chinensis. Oikos 74:137–145.

Kondo M, Nanda I, Hornung U, Schmid M, Schartl M. 2004. Evolutionary origin of the medaka Y chromosome. Curr Biol 14:1664–1669.

Kousathanas A, Keightley PD. 2013. A comparison of models to infer the distribution of fitness effects of new mutations. Genetics in press doi: 10.1534/genetics.112.148023

Ku HM, Vision T, Liu J, Tanksley SD. 2000. Comparing sequenced segments of the tomato and Arabidopsis genomes: large-scale duplication followed by selective gene loss creates a network of synteny. Proc Natl Acad Sci USA 97(16):9121–9126.

Kulathinal RJ, Bennett SM, Fitzpatrick CL, Noor MAF. 2008. Fine-scale mapping of recombination rate in Drosophila refines its correlation to diversity and divergence. Proc Natl Acad Sci USA 105:10051–6.

Lahn BT and Page DC. 1999. Four evolutionary strata on the human X chromosome. Science 286:964–967.

Lande R. 1979. Quantitative genetic analysis of multivariate evolution, applied to brain:body size allometry. Evolution 33:402–416.

Lande R. 1980. Sexual dimorphism, sexual selection, and adaptation in polygenic characters. Evolution 34:292–305.

Lande R. 1981. Models of speciation by sexual selection on polygenic traits. Proc Natl Acad Sci USA 78:3721–3725.

Lande R and Schemske DW. 1985. The evolution of self-fertilization and inbreeding depression in plants. I. Genetic models. Evolution 39:24–40.

Lane N. 2005. Power, sex, suicide: mitochondria and the meaning of life. Oxford, UK: Oxford University Press.

Langley CH, Stevens K, Cardeno C, Lee YCG, Schrider DR, Pool JE, Langley SA, et al. 2012. Genomic variation in natural populations of Drosophila melanogaster. Genetics 192(2):533–598.

Lankinen A and Skogsmyr I. 2001. Evolution of pistil length as a choice mechanism for pollen quality. Oikos 92:81–90.

Larson BMH, Barrett SCH. 2000. A comparative analysis of pollen limitation in flowering plants. Biological Journal of the Linnean Society 69:503–520.

Le Corre V, Kremer A. 2012. The genetic differentiation at quantitative trait loci under local adaptation. Mol Ecol 21(7):1548–1566.

Lee C-R, Mitchell-Olds T. 2012. Environmental adaptation contributes to gene polymorphism across the Arabidopsis thaliana genome. Mol Biol Evol 29(12):3721–3728.

Leffler EM, Bullaughey K, Matute DR, Meyer WK, Ségurel L, Venkat A, Andolfatto P, Przeworski M. 2012. Revisiting an old riddle: what determines genetic diversity levels within species? PLoS Biol 10(9):e1001388.

Lercher MJ, Hurst LD. 2002. Human SNP variability and mutation rate are higher in regions of high recombination. Trends Genet 18(7):337–340.

Levy A, Feldman M. 2002. The impact of polyploidy on grass genome evolution. Plant Physiol 130(4):1587–1593.

Lewis Z, Wedell N, Hunt J. 2011. Evidence for strong intralocus sexual conflict in the Indian meal moth, Plodia interpunctella. Evolution 65:2085–2097.

Li C, Xu G, Zang R, Korpelainen H, Berninger F. 2007. Sex-related differences in leaf morphological and physiological response in Hippophae rhamnoides along an altitudinal gradient. Tree Physiology 27:399–406.

Li H, Durbin R. 2009. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25:1754–1760.

Lindblad-Toh K, Garber M, Zuk O, Lin MF, Parker BJ, Washietl S, Kheradpour P, et al. 2011. A high-resolution map of human evolutionary constraint using 29 mammals. Nature 478:476–482.

Liu F, Zhang L, Charlesworth D. 1998. Genetic diversity in Leavenworthia populations with different inbreeding levels. Proc Roy Soc B: Biol Sci 265(1393):293–301.

Liu Z, et al. 2004. A primitive Y chromosome in papaya marks incipient sex chromosome evolution. Nature 427:348–352.

Lloyd DG, Bawa KS. 1984. Modification of gender of seed plants in varying conditions. Evolutionary Biology 17:255–338.

Lloyd DG, Webb CJ. 1977. Secondary sex characters in plants. Bot Rev 43:177–216.

Lloyd DG. 1979. Parental strategies in angiosperms. New Zealand Journal of Botany 17:595–606.

Lloyd DG. 1984. Gender allocations in outcrossing cosexual plants. In: Dirzo R, Sarukhán J, eds. Perspectives on plant population ecology. Sunderland, MA: Sinauer Associates, 277–300.

Lloyd DG. 1974. Female-predominant sex ratios in angiosperms. Heredity 32:35–44.

Lockton S, Gaut BS. 2005. Plant conserved non-coding sequences and paralogue evolution. Trends Genet 21(1):60–65.

Lunter G and Goodson M. 2011. Stampy: A statistical algorithm for sensitive and fast mapping of Illumina sequence reads. Genome Res. 21:936–939.

Lynch M and Conery JS. 2000. The evolutionary fate and consequences of duplicate genes. Science 290(5494):1151–1155.

Lynch M and Walsh B. 1998. Genetics and Analysis of Quantitative Traits. Sunderland, MA: Sinauer Associates.

Mable B. and Otto SP. 1998. The evolution of life cycles with haploid and diploid phases. BioEssays 20:453–462.

Maere S, De Bodt S, Raes J, Casneuf T, Van Montagu M, Kuiper M, Van de Peer Y. 2005. Modeling gene and genome duplications in eukaryotes. Proc Natl Acad Sci USA 102(15):5454–5459.

Malone JH, Cho DY, Mattiuzzo NR, Artieri CG, Jiang L, Dale RK, Smith HE, McDaniel J, Munro S, Salit M, Andrews J, Przytycka TM, Oliver B. 2012. Mediation of Drosophila autosomal dosage effects and compensation by network interactions. Genome Biol 13:R28.

Mank JE, Hall DW, Kirkpatrick M, Avise JC. 2006. Sex chromosomes and male ornaments: a comparative evaluation in ray-finned fishes. Proc Roy Soc Lond B 273:233–236.

Mank JE. 2009. Sex chromosomes and the evolution of sexual dimorphism: lessons from the genome. Am Nat 173:141–150.

Mank JE. 2012. Small but mighty: the evolutionary dynamics of W and Y sex chromosomes. Chromosome Res 20:21–33.

Mank JE. 2013. Sex chromosome dosage compensation: Definitely not for everyone. Trends Genet 29:677–683.

Mariotti M, Navajas-Pérez R, Lozano R, Parker JS, de la Herrán R, Ruiz Rejón C, Ruiz Rejón M, Garrido-Ramos MA, Jamilena M. 2006. Cloning and characterization of dispersed repetitive DNA derived from microdissected sex chromosomes of Rumex acetosa. Genome 49:114–121.

Marques AR, Fernandes GW, Reis IA, Assunção RM. 2002. Distribution of adult male and female Baccharis concinna (Asteraceae) in the rupestrian fields of Serra Do Cipó, Brazil. Plant Biology 4:94–103.

Marshall DL, Reynolds J, Abrahamson NJ, Simpson HL, Barnes MG, Medeiros JS, et al. 2007. Do differences in plant and flower age change mating patterns and alter offspring fitness in Raphanus sativus (Brassicaceae)? Am J Bot 94:409–418.

Martin W. 2003. Gene transfer from organelles to the nucleus: frequent and in big chunks. Proc Natl Acad Sci USA 100:8612–8614.

Mascarenhas JP. 1990. Gene activity during pollen development. Annu Rev Plant Physiol Plant Mol Biol 41:317–338.

Matsubara K, Tarui H, Toriba M, Yamada K, et al. 2006. Evidence for different origin of sex chromosomes in snakes, birds, and mammals and step-wise differentiation of snake sex chromosomes. Proc Natl Acad Sci USA 103:18190–18195.

Matsushita M, Nakagawa M, Tomaru N. 2011. Sexual differences in year-to-year flowering trends in dioecious multi-stemmed shrub Lindera triloba: effects of light and clonal integration. J Ecol 99:1520–1530.

Maynard Smith J and Haigh J. 1974. The hitch-hiking effect of a favorable gene. Genet Res 23:23–35.

Maynard Smith J. 1991. Theories of sexual selection. Trends Ecol Evol 6:146–151.

Maynard Smith J. 1974. The theory of games and the evolution of animal conflicts. J Theor Biol 47:209–221.

McDaniel SF. 2005. Genetic correlations do not constrain the evolution of sexual dimorphism in the moss Ceratodon purpureus. Evolution 59:2353–2361.

McDonald JH, Kreitman M. 1991. Adaptive protein evolution at the Adh locus in Drosophila. Nature 351(6328):652–654.

McGaugh SE, Heil CSS, Manzano-Winkler B, Loewe L, et al. 2012. Recombination modulates how selection affects linked sites in Drosophila. PLoS Biol. 10:e1001422.

McGuigan K, Blows MW. 2009. Asymmetry of genetic variation in fitness-related traits: apparent stabilizing selection on Gmax. Evolution 63:2838–2847.

McKenna A, Hanna M, Banks E, Sivachenko A, et al. 2010. The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res 20:1297–1303.

McVean GA and Charlesworth B. 2000. The effects of Hill-Robertson interference between weakly selected mutations on patterns of molecular evolution and variation. Genetics 155:929–44.

Meagher TR. 1999. The quantitative genetics of sexual dimorphism. In: Geber MA, Dawson TE, Delph LF, eds. Gender and sexual dimorphism in flowering plants. Berlin: Springer-Verlag, 275–294.

Mercer CA, Eppley SM. 2010. Intersexual competition in a dioecious grass. Oecologia 164:657–664.

Messing J. 2009. The polyploid origin of maize. The Maize Handbook: Domestication, Genetics, and Genome: 221–238.

Midgley JJ. 2010. Causes of secondary sexual differences in plants – Evidence from extreme leaf dimorphism in Leucadendron (Proteaceae). South African Journal of Botany 76:588–592.

Ming R, Bendahmane A, Renner SS. 2011. Sex chromosomes in land plants. Annu Rev Plant Biol 62:485–514.

Moore JC, Pannell JR. 2011. Sexual selection in plants. Curr Biol 21:176–181.

Morjan CL, Rieseberg LH. 2004. How species evolve collectively: implications of gene flow and selection for the spread of advantageous alleles. Mol Ecol 13(6):1341–1356.

Mulcahy DL. 1979. The rise of the angiosperms: a genecological factor. Science 206:20–23.

Muller HJ. 1950. Our load of mutations. Am J Hum Genet 2:111–176.

Muyle A, et al. 2012. Rapid de novo evolution of X chromosome dosage compensation in Silene latifolia, a plant with young sex chromosomes. PLoS Biol 10:e1001308.

Nachman M. 2002. Variation in recombination rate across the genome: evidence and implications. Curr Opin Genet Dev 12(6):657–663.

Nam K, Ellegren H. 2008. Scrambled eggs: the chicken (Gallus gallus) Z chromosome contains at least three non-linear evolutionary strata. Genetics 180:1131–1136.

Natri HM, Shikano T, Merilä J. 2013. Progressive recombination suppression and differentiation in recently evolved neo-sex chromosomes. Mol Biol Evol 30:1131–1144.

Navajas-Pérez R, et al. 2005. The evolution of reproductive systems and sex-determining mechanisms within Rumex (Polygonaceae) inferred from nuclear and chloroplastidial sequence data. Mol Biol Evol 22:1929–1939.

Ness RW, Siol M, Barrett SCH. 2012. Genomic consequences of transitions from cross- to self-fertilization on the efficacy of selection in three independently derived selfing plants. BMC Genomics 13:611.

Ness RW, Wright SI, Barrett SCH. 2010. Mating-system variation, demographic history and patterns of nucleotide diversity in the tristylous plant Eichhornia paniculata. Genetics 184(2):381–392.

Nicotra AB. 1998. Sex ratio variation and spatial distribution of Siparuna grandiflora, a tropical dioecious shrub. Oecologia 115:102–113.

Noor MAF. 2008. Connecting recombination, nucleotide diversity, and species divergence in Drosophila. Fly 2(5):255–256.

Nordborg M, Hu TT, Ishino Y, Jhaveri J, Toomajian C, Zheng H, Bakker E, et al. 2005. The pattern of polymorphism in Arabidopsis thaliana. PLoS Biol 3(7):e196.

Nordborg M. 2000. Linkage disequilibrium, gene trees and selfing: an ancestral recombination graph with partial self-fertilization. Genetics 154(2):923–929.

Obeso JR. 2002. The costs of reproduction in plants. New Phytol 155:321–348.

Ohta T. 1992. The nearly neutral theory of molecular evolution. Annu Rev Ecol Syst 23:263–286.

Onyekwelu SS, Harper JL. 1979. Sex ratio and niche differentiation in spinach Spinacia oleracea. Nature 282:609–611.

Orr HA and Kim Y. 1998. An adaptive hypothesis for the evolution of the Y chromosome. Genetics 150:1693–1698.

Orr HA. 1995. Somatic mutation favors the evolution of diploidy. Genetics 139:1441–1447.

Osborn TC, Pires JC, Birchler JA, Auger DL, Chen ZJ, Lee H-S, Comai L, et al. 2003. Understanding mechanisms of novel gene expression in polyploids. Trends Genet 19(3):141–147.

Otto SP. 2007. The evolutionary consequences of polyploidy. Cell 131(3):452–462.

Otto SP and Bourguet D. 1999. Balanced polymorphisms and the evolution of dominance. Am Nat 153:561–574.

Otto SP and Goldstein DB. 1992. Recombination and the evolution of diploidy. Genetics 131:745–751.

Otto SP and Marks J. 1996. Mating systems and the evolutionary transition between haploidy and diploidy. Biol J Linn Soc 57:197–218.

Otto SP, Pannell J, Peichel CL, Ashman T-L, Charlesworth D, Chippindale AK, Delph LF, Guerrero RF, Scarpino SV, McAllister BF. 2011. About PAR: the distinct evolutionary dynamics of the pseudoautosomal region. Trends Genet 27:358–367.

Otto SP and Whitton J. 2000. Polyploid incidence and evolution. Annu Rev Genet 34:401–437.

Otto SP and Yong P. 2002. The evolution of gene duplicates. Adv Genet 46:451–483.

Papadopulos AST, Chester M, Ridout K, Filatov DA. 2015. Rapid Y degeneration and dosage compensation in plant sex chromosomes. Proc Natl Acad Sci 112:201508454.

Parisi M, Nuttall R, Naiman D, Bouffard G, Malley J, Andrews J, Eastman S, Oliver B. 2003. Paucity of genes on the D. melanogaster X chromosome showing male-biased expression. Science 299:697–700.

Peichel CL, et al. 2004. The master sex-determination locus in threespine sticklebacks is on a nascent Y chromosome. Curr Biol 14:1416–1424.

Perry LE, Dorken ME. 2011. The evolution of males: support for predictions from sex allocation theory using mating arrays of Sagittaria latifolia (Alismataceae). Evolution 65:2782–2791.

Pettengill JB, Moeller DA. 2012. Tempo and mode of mating system evolution between incipient Clarkia species. Evolution 66(4):1210–1225.

Pickering CM, Hill W. 2002. Reproductive ecology and the effect of altitude on sex ratios in the dioecious herb Aciphylla simplicifolia (Apiaceae). Australian Journal of Botany 50:289–300.

Pickup M, Barrett SCH. 2012. Reversal of height dimorphism promotes pollen and seed dispersal in a wind-pollinated dioecious plant. Biology Letters 8:245–248.

Pickup M, Barrett SCH. 2013. The influence of demography and local mating environment on sex ratios in a wind-pollinated dioecious plant. Ecol Evol 3:629–639.

Pigozzi MI. 2011. Diverse stages of sex-chromosome differentiation in tinamid birds: evidence from crossover analysis in Eudromia elegans and Crypturellus tataupa. Genetica 139:771–7.

Platt A, Horton M, Huang YS, Li Y, Anastasio AE, Mulyati NW, Ågren J, et al. 2010. The scale of population structure in Arabidopsis thaliana. PLoS Genet 6(2):e1000843.

Poissant J, Wilson AJ, Coltman DW. 2010. Sex-specific genetic variance and the evolution of sexual dimorphism: a systematic review of cross-sex genetic correlations. Evolution 64:97–107.

Poissant J, Wilson AJ, Festa-Bianchet M, Hogg JT, Coltman DW. 2008. Quantitative genetics and sex-specific selection on sexually dimorphic traits in bighorn sheep. Proc Roy Soc B 275:623–628.

Pool JE and Nielsen R. 2007. Population size changes reshape genomic patterns of diversity. Evolution 61:3001–3006.

Primack RB. 1985. Longevity of individual flowers. Annu Rev Ecol Syst 16:15–37.

Pritchard JK, Di Rienzo A. 2010. Adaptation - not by sweeps alone. Nat Rev Genet 11(10):665–667.

Pritchard JK, Pickrell JK, Coop G. 2010. The genetics of human adaptation: hard sweeps, soft sweeps, and polygenic adaptation. Curr Biol 20(4):R208–R215.

Prunier J, Gérardi S, Laroche J, Beaulieu J, Bousquet J. 2012. Parallel and lineage-specific molecular adaptation to climate in boreal black spruce. Mol Ecol 21(17):4270–4286.

Purrington CB, Schmitt J. 1995. Sexual dimorphism of dormancy and survivorship in buried seeds of Silene latifolia. Journal of Ecology 83:795–800.

Qiu S, Zeng K, Slotte T, Wright SI, Charlesworth D. 2011. Reduced efficacy of natural selection on codon usage bias in selfing Arabidopsis and Capsella species. Genome Biol Evol 3:868–880.

Qiu S, Bergero R, Forrest A, Kaiser VB, Charlesworth D. 2010. Nucleotide diversity in Silene latifolia autosomal and sex-linked genes. Proc Biol Sci 277:3283–90.

Qiu S, Bergero R, Charlesworth D. 2013. Testing for the footprint of sexually antagonistic polymorphisms in the pseudoautosomal region of a plant sex chromosome pair. Genetics 194:663–672.

Queenborough SA, Burslem DFRP, Garwood NC, Valencia R. 2007. Determinants of biased sex ratios and inter-sex cost of reproduction in dioecious tropical trees. Am J Bot 94:67–78.

Queller DC. 1987. Sexual selection in flowering plants. In: Bradbury JW, Andersson MB, eds. Sexual selection: testing the alternatives. New York: John Wiley, 165–179.

Quevillon E, et al. 2005. InterProScan: protein domains identifier. Nucleic Acids Res 33:W116–W120.

R Development Core Team. 2013. R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing.

Ralph P, Coop G. 2010. Parallel adaptation: one or many waves of advance of an advantageous allele? Genetics 186(2):647–668.

Rand DM, Clark AG, Kann LM. 2001. Sexually antagonistic cytonuclear fitness interactions in Drosophila melanogaster. Genetics 159:173–187.

Rand DM, Haney RA, Fry AJ. 2004. Cytonuclear coevolution: the genomics of cooperation. Trends Ecol Evol 19:645–653.

Reeve JP, Fairbairn DJ. 1996. Sexual size dimorphism as a correlated response to selection on body size: An empirical test of the quantitative genetic model. Evolution 50:1927–1938.

Reeve JP, Fairbairn DJ. 2001. Predicting the evolution of sexual size dimorphism. J Evol Biol 14:244–254.

Renner SS, Ricklefs RE. 1995. Dioecy and its correlates in the flowering plants. Am J Bot 82:596–606.

Rensch B. 1960. Evolution above the species level. New York: Columbia University Press.

Revell LJ. 2012. phytools: An R package for phylogenetic comparative biology (and other things). Methods Ecol Evol 3:217–223.

Rhen T. 2000. Sex-limited mutations and the evolution of sexual dimorphism. Evolution 54:37–43.

Rice P, Longden I, Bleasby A. 2000. EMBOSS: the European Molecular Biology Open Software Suite. Trends Genet 16:276–277.

Rice WR. 1984. Sex chromosomes and the evolution of sexual dimorphism. Evolution 38:735–742.

Rice WR. 2013. Nothing in genetics makes sense except in light of genomic conflict. Annu Rev Ecol Evol Syst 44:217–237.

Ridley M. 2000. Mendel's demon: gene justice and the complexity of life. London: Weidenfeld & Nicolson.

Rogell B, Dean R, Lemos B, Dowling DK. 2014. Mito-nuclear interactions as drivers of gene movement on and off the X chromosome. BMC Genomics 15:330.

Rolff J, Armitage SAO, Coltman DW. 2005. Genetic constraints and sexual dimorphism in immune defense. Evolution 59:1844–1850.

Ronfort J, Glémin S. 2012. Mating system, Haldane's sieve, and the domestication process. Evolution in press doi:10.1111/evo.12025.

Roselius K. 2005. The relationship of nucleotide polymorphism, recombination rate and selection in wild tomato species. Genetics 171(2):753–763.

Ross MT, et al. 2005. The DNA sequence of the human X chromosome. Nature 434:325–337.

Ross-Ibarra J, Wright SI, Foxe JP, Kawabe A, DeRose-Wilson L, Gos G, Charlesworth D, Gaut BS. 2008. Patterns of polymorphism and demographic history in natural populations of Arabidopsis lyrata. PLoS One 3(6):e2411.

Rourke JP. 1989. The inflorescence morphology and systematics of Aulax (Proteaceae). South African Journal of Botany 53:464–480.

Rychlewski J and Zarzycki K. 1975. Sex ratio in seeds of Rumex acetosa as a result of sparse or abundant pollination. Acta Biologica Cracoviensia Series Botanica 18:101–114.

Saitou N and Nei M. 1987. The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol Biol Evol 4:406–25.

Sakai AK, Burris TA. 1985. Growth in male and female aspen clones: a twenty-five-year longitudinal study. Ecology 66:1921–1927.

Sakai AK, Weller SG. 1999. Gender and sexual dimorphism in flowering plants: a review of terminology, biogeographic patterns, ecological correlates, and phylogenetic approaches. In: Geber MA, Dawson TE, Delph LF, eds. Gender and sexual dimorphism in flowering plants. Berlin: Springer-Verlag, 1–31.

Sánchez-Vilas J, Pannell JR. 2010. Differential niche modification by males and females of a dioecious herb: extending the Jack Sprat effect. J Evol Biol 23:2262–2266.

Sánchez-Vilas J, Pannell JR. 2011. Sexual dimorphism in resource acquisition and deployment: both size and timing matter. Ann Bot 107:119–126.

Sánchez-Vilas J, Turner A, Pannell JR. 2011. Sexual dimorphism in intra- and interspecific competitive ability of the dioecious herb Mercurialis annua. Plant Biol 13:218–222.

Sari-Gorla M, Ottaviano E, Frascaroli E, Landi P. 1989. Herbicide-tolerant corn by pollen selection. Sex Plant Reprod 2:65–69.

Sattath S, Elyashiv E, Kolodny O, Rinott Y, Sella G. 2011. Pervasive adaptive protein evolution apparent in diversity patterns around amino acid substitutions in Drosophila simulans. PLoS Genet 7(2):e1001302.

Savage SA, Stewart BJ, Eckert A, Kiley M, Liao JS, Chanock SJ. 2005. Genetic variation, nucleotide diversity, and linkage disequilibrium in seven telomere stability genes suggest that these genes may be under constraint. Hum Mutat 26(4):343–350.

Schaffner SF. 2004. The X chromosome in population genetics. Nat Rev Genet 5:43–51.

Schmid KJ. 2004. A multilocus sequence survey in Arabidopsis thaliana reveals a genome-wide departure from a neutral model of DNA sequence polymorphism. Genetics 169(3):1601–1615.

Schulz MH, Zerbino DR, Vingron M, Birney E. 2012. Oases: robust de novo RNA-seq assembly across the dynamic range of expression levels. Bioinformatics 28:1086–1092.

Scotti I, Delph LF. 2006. Selective trade-offs and sex chromosome evolution in Silene latifolia. Evolution 60:1793–1800.

Searcy K and Mulcahy D. 1985. Pollen selection and the gametophytic expression of metal tolerance in Silene dioica (Caryophyllaceae) and Mimulus guttatus (Scrophulariaceae). Am J Bot 72:1700–1702.

Sella G, Petrov DA, Przeworski M, Andolfatto P. 2009. Pervasive natural selection in the Drosophila genome? PLoS Genet 5(6):e1000495.

Seoighe C, Gehring C, Hurst LD. 2005. Gametophytic selection in Arabidopsis thaliana supports the selective model of intron length reduction. PLoS Genet 1(2):e13.

Seoighe C, Wolfe KH. 1999. Yeast genome evolution in the post-genome era. Curr Opin Microbiol 2(5):548–554.

Shaw RF and Mohler JD. 1953. The selective significance of the sex ratio. Am Nat 87:735–742.

Shelton AO. 2010. The origin of female-biased sex ratios in intertidal seagrasses (Phyllospadix spp.). Ecology 91:1380–1390.

Shuster SM, Wade MJ. 2003. Mating systems and strategies. Monographs in Behavior and Ecology. Princeton: Princeton University Press.

Siepel A, Bejerano G, Pedersen JS, Hinrichs AS, Hou M, Rosenbloom K, Clawson H, et al. 2005. Evolutionarily conserved elements in vertebrate, insect, worm, and yeast genomes. Genome Res 15(8):1034–1050.

Sinclair JP, Emlen J, Freeman DC. 2012. Biased sex ratios in plants: theory and trends. Botanical Rev 78:63–86.

Singer T, Fan Y, Chang H-S, Zhu T, Hazen SP, Briggs SP. 2006. A high-resolution map of Arabidopsis recombinant inbred lines by whole-genome exon array hybridization. PLoS Genet 2(9):e144.

Singh ND, Koerich LB, Carvalho AB, Clark AG. 2014. Positive and purifying selection on the Drosophila Y Chromosome. Mol Biol Evol 31:1–12.

Skaletsky H, Kuroda-Kawaguchi T, Minx PJ, Cordum HS, Hillier L, Brown LG, Repping S, Pyntikova T, Ali J, Bieri T, et al. 2003. The male-specific region of the human Y chromosome is a mosaic of discrete sequence classes. Nature 423:825–837.

Slatkin M. 1984. Ecological causes of sexual dimorphism. Evolution 38:622–630.

Sloan DB. 2014. Using plants to elucidate the mechanisms of cytonuclear co-evolution. New Phytol. doi:10.1111/nph.12835.

Slotte T, Bataillon T, Hansen TT, St Onge K, Wright SI, Schierup MH. 2011. Genomic determinants of protein evolution and polymorphism in Arabidopsis. Genome Biol Evol 3:1210–1219.

Slotte T, Foxe JP, Hazzouri KM, Wright SI. 2010. Genome-wide evidence for efficient positive and purifying selection in Capsella grandiflora, a plant species with a large effective population size. Mol Biol Evol 27(8):1813–1821.

Slotte T, Hazzouri KM, Ågren JA, Koenig D, Maumus F, Guo Y, et al. 2013. The Capsella rubella genome and the genomic consequences of rapid mating system evolution. Nat Genet 45:831–835.

Smith BW. 1963. Mechanism of sex determination in Rumex hastatulus. Genetics 48:1265–1288.

Smith BW. 1964. The evolving karyotype of Rumex hastatulus. Evolution 18:93–104.

Song K, Lu P, Tang K, Osborn TC. 1995. Rapid genome change in synthetic polyploids of Brassica and its implications for polyploid evolution. Proc Natl Acad Sci USA 92(17):7719–7723.

Soza VL, Brunet J, Liston A, Smith PS, Di Stilio VS. 2012. Phylogenetic insights into the correlates of dioecy in meadow-rues, Thalictrum (Ranunculaceae). Molecular Phylogenetics and Evolution 63:180–192.

Spigler RB, Lewers KS, Main DS, Ashman TL. 2008. Genetic mapping of sex determination in a wild strawberry, Fragaria virginiana, reveals earliest form of sex chromosome. Heredity 101:507–517.

Spigler RB and Ashman TL. 2011. Gynodioecy to dioecy: Are we there yet? Ann Bot 109:531–543.

Stamatakis A. 2006. RAxML-VI-HPC: Maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics 22:2688–2690.

Stehlik I and Barrett SCH. 2005. Mechanisms governing sex-ratio variation in dioecious Rumex nivalis. Evolution 59:814–825.

Stehlik I and Barrett SCH. 2006. Pollination intensity influences sex ratios in dioecious Rumex nivalis, a wind-pollinated plant. Evolution 60:1207–1214.

Stehlik I, Friedman J, and Barrett SCH. 2008. Environmental influence on primary sex ratio in a dioecious plant. Proc Natl Acad Sci USA 105:10847–10852.

Stehlik I, Kron P, Barrett SCH, Husband BC. 2007. Sexing pollen reveals female bias in a dioecious plant. New Phytol 175:185–194.

Stephenson AG and Winsor JA. 1986. Lotus corniculatus regulates offspring quality through selective fruit abortion. Evolution 40:453–458.

Sterck L, Rombauts S, Jansson S, Sterky F, Rouzé P, Van de Peer Y. 2005. EST data suggest that poplar is an ancient polyploid. New Phytol 167(1):165–170.

Steven JC, Delph LF, Brodie III ED. 2007. Sexual dimorphism in the quantitative-genetic architecture of floral, leaf, and allocation traits in Silene latifolia. Evolution 61:42–57.

Strasburg JL, Kane NC, Raduski AR, Bonin A, Michelmore R, Rieseberg LH. 2011. Effective population size is positively correlated with levels of adaptive divergence among annual sunflowers. Mol Biol Evol 28(5):1569–1580.

Sweigart AL, Willis JH. 2003. Patterns of nucleotide diversity in two species of Mimulus are affected by mating system and asymmetric introgression. Evolution 57(11):2490–2506.

Tamura K, Nei M, Kumar S. 2004. Prospects for inferring very large phylogenies by using the neighbor-joining method. Proc Natl Acad Sci USA 101:11030–5.

Taylor DR. 1999. Genetics of sex ratio variation among natural populations of a dioecious plant. Evolution 53:55–62.

Taylor DR and Ingvarsson PK. 2003. Common features of segregation distortion in plants and animals. Genetica 117:27–35.

Tenaillon MI, Sawkins MC, Anderson LK, Stack SM, Doebley J, Gaut BS. 2002. Patterns of diversity and recombination along chromosome 1 of maize (Zea mays ssp. mays L.). Genetics 162(3):1401–1413.

The Arabidopsis Genome Initiative. 2000. Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature 408:796–815.

The ENCODE Project Consortium. 2013. An integrated encyclopedia of DNA elements in the human genome. Nature 489(7414):57–74.

The Gene Ontology Consortium. 2008. The Gene Ontology project in 2008. Nucleic Acids Res 36:D440–D444.

Thomas SC, LaFrankie JV. 1993. Sex, size and interyear variation in flowering among dioecious trees of the Malayan rain forest. Ecology 74:1529–1537.

Thorvaldsdóttir H, Robinson JT, Mesirov JP. 2013. Integrative Genomics Viewer (IGV): high-performance genomics data visualization and exploration. Brief Bioinform 14:178–192.

Tiffin P, Gaut BS. 2001. Sequence diversity in the tetraploid Zea perennis and the closely related diploid Z. diploperennis: insights from four nuclear loci. Genetics 158(1):401–412.

Timmis JN, Ayliffe MA, Huang CY, Martin W. 2004. Endosymbiotic gene transfer: organelle genomes forge eukaryotic chromosomes. Nat Rev Genet 5:123–135.

Touzet P, Meyer EH. 2014. Cytoplasmic male sterility and mitochondrial metabolism in plants. Mitochondrion doi:10.1016/j.mito.2014.04.009.

Trivers R. 1972. Parental investment and sexual selection. In: Sexual selection and the descent of man 1871–1971. pp. 139–179. Aldine Press. Chicago, USA.

Turner TL, Bourne EC, von Wettberg EJ, Hu TT, Nuzhdin SV. 2010. Population resequencing reveals local adaptation of Arabidopsis lyrata to serpentine soils. Nat Genet 42(3):260–263.

Valero M, Richerd S, Perrot V, Desombe C. 1992. Evolution of alternation of haploid and diploid phases in life cycles. Trends Ecol Evol 7:25–29.

Vamosi JC, Otto SP, Barrett SCH. 2003. Phylogenetic analysis of the ecological correlates of dioecy in angiosperms. J Evol Biol 16:1006–1018.

Van Doorn GS and Kirkpatrick M. 2010. Transitions between male and female heterogamety caused by sex-antagonistic selection. Genetics 186:629–645.

Van Drunen WE, Dorken ME. 2012. Trade-offs between clonal and sexual reproduction in Sagittaria latifolia (Alismataceae) scale up to affect fitness of entire clones. New Phytol 196:606–616.

Vaughton G and Ramsey M. 1998. Floral display, pollinator visitation and reproductive success in the dioecious perennial herb Wurmbea dioica (Liliaceae). Oecologia 115:93–101.

Veitia RA. 2004. Gene dosage balance in cellular pathways: implications for dominance and gene duplicability. Genetics 168(1):569–574.

Vicoso B and Charlesworth B. 2006. Evolution on the X chromosome: unusual patterns and processes. Nat Rev Genet 7:645–653.

Vicoso B, Kaiser VB, Bachtrog D. 2013. Sex-biased gene expression at homomorphic sex chromosomes in emus and its implication for sex chromosome evolution. Proc Natl Acad Sci USA 110:6453–6458.

Waelti MO, Page PA, Widmer A, Schiestl FP. 2009. How to be an attractive male: floral dimorphism and attractiveness to pollinators in a dioecious plant. BMC Evol Biol 9:190.

Walbot V, Evans MMS. 2003. Unique features of the plant life cycle and their consequences. Nat Rev Genet 4(5):369–379.

Walsh JB. 1995. How often do duplicated genes evolve new functions? Genetics 139(1):421–428.

Wang J, et al. 2012. Sequencing papaya X and Yh chromosomes reveals molecular basis of incipient sex chromosome evolution. Proc Natl Acad Sci USA 109:13710–13715.

Ward LD and Kellis M. 2012. Evidence of abundant purifying selection in humans for recently acquired regulatory functions. Science 337(6102):1675–1678.

Ward LK. 2007. Lifetime sexual dimorphism in Juniperus communis. Plant Species Biology 22:11–21.

Watterson GA. 1975. On the number of segregating sites in genetical models without recombination. Theor Popul Biol 7:256–276.

Wendel JF. 2000. Genome evolution in polyploids. Plant Mol Biol 42(1):225–249.

Werren JH, Beukeboom LW. 1998. Sex determination, sex ratios, and genetic conflict. Annu Rev Ecol Syst 29:233–261.

Whitlock MC. 2003. Fixation probability and time in subdivided populations. Genetics 164(2):767–779.

Whitlock MC and Agrawal AF. 2009. Purging the genome with sexual selection: reducing mutation load through selection on males. Evolution 63:569–582.

Willson MF. 1979. Sexual selection in plants. Am Nat 113:777–790.

Willson MF, Burley N. 1983. Mate choice in plants: tactics, mechanisms and consequences. Monographs in Population Biology 19, Princeton: Princeton University Press.

Willyard A, Syring J, Gernandt DS, Liston A, Cronn R. 2007. Fossil calibration of molecular divergence infers a moderate mutation rate and recent radiations for Pinus. Mol Biol Evol 24(1):90–101.

Wilson DJ, Hernandez RD, Andolfatto P, Przeworski M. 2011. A population genetics-phylogenetics approach to inferring natural selection in coding sequences. PLoS Genet 7(12):e1002395.

Wilson P, Thomson JD, Stanton ML, Rigney LP. 1994. Beyond floral Batemania: gender biases in selection for pollination success. Am Nat 143:283–296.

Wilson Sayres MA, Lohmueller KE, Nielsen R. 2014. Natural selection reduced diversity on human Y chromosomes. PLoS Genet 10:e1004064.

Wood TE, Takebayashi N, Barker MS, Mayrose I, Greenspoon PB, Rieseberg LH. 2009. The frequency of polyploid speciation in vascular plants. Proc Natl Acad Sci USA 106(33):13875–13879.

Wright S. 1931. Evolution in Mendelian populations. Genetics 16:97–159.

Wright SI, Andolfatto P. 2008. The impact of natural selection on the genome: emerging patterns in Drosophila and Arabidopsis. Annu Rev Ecol Evol Syst 39(1):193–213.

Wright SI, Foxe JP, DeRose-Wilson L, Kawabe A, Looseley M, Gaut BS, Charlesworth D. 2006. Testing for effects of recombination rate on nucleotide diversity in natural populations of Arabidopsis lyrata. Genetics 174(3):1421–1430.

Wright SI, Lauga B, Charlesworth D. 2002. Rates and patterns of molecular evolution in inbred and outbred Arabidopsis. Mol Biol Evol 19(9):1407–1420.

Wright SI, Yau CBK, Looseley M, Meyers B. 2004. Effects of gene expression on molecular evolution in Arabidopsis thaliana and Arabidopsis lyrata. Mol Biol Evol 21:1719–1726.

Wu C-I, Xu EY. 2003. Sexual antagonism and X inactivation – the SAXI hypothesis. Trends Genet 19:243–247.

Wyman MJ. 2012. Genetic considerations in the evolution of sexual dimorphism. Ph.D. Thesis, University of Toronto, Ontario, Canada.

Yakimowski SB, Glaettli M, Barrett SCH. 2011. Floral dimorphism in plant populations with combined versus separate sexes. Ann Bot 108:765–776.

Yamada K, Lim J, Dale JM, Chen H, Shinn P, Palm CJ, Southwick AM, Wu HC, Kim C, Nguyen M, et al. 2003. Empirical analysis of transcriptional activity in the Arabidopsis genome. Science 302:842–846.

Yang Z. 2007. PAML 4: a program package for phylogenetic analysis by maximum likelihood. Mol Biol Evol 24:1586–1591.

Yin T, et al. 2008. Genome structure and emerging evidence of an incipient sex chromosome in Populus. Genome Res 18:422–430.

Zerbino DR, Birney E. 2008. Velvet: algorithms for de novo short read assembly using de Bruijn graphs. Genome Res 18:821–829.

Zhen Y and Andolfatto P. 2012. Methods to detect selection on non-coding DNA. Methods Mol Biol 856:141–159.

Zhou Q and Bachtrog D. 2012a. Chromosome-wide gene silencing initiates Y degeneration in Drosophila. Curr Biol 22:522–525.

Zhou Q and Bachtrog D. 2012b. Sex-specific adaptation drives early sex chromosome evolution in Drosophila. Science 337:341–345.

Zhou Q, Zhang J, Bachtrog D, An N, Huang Q, et al. 2014. Complex evolutionary trajectories of sex chromosomes across bird taxa. Science 346:1246338.

Zluvova J, Zak J, Janousek B, Vyskot B. 2010. Dioecious Silene latifolia plants show sexual dimorphism in the vegetative stage. BMC Plant Biol 10:208.