Utah State University DigitalCommons@USU

All Graduate Theses and Dissertations Graduate Studies

5-2016

Hybridization, Genetic Structure and Expression in the Genus

Martin Peter Schilling Utah State University

Follow this and additional works at: https://digitalcommons.usu.edu/etd

Part of the Commons, Commons, and the Commons

Recommended Citation Schilling, Martin Peter, "Hybridization, Population Genetic Structure and Gene Expression in the Genus Boechera" (2016). All Graduate Theses and Dissertations. 5192. https://digitalcommons.usu.edu/etd/5192

This Dissertation is brought to you for free and open access by the Graduate Studies at DigitalCommons@USU. It has been accepted for inclusion in All Graduate Theses and Dissertations by an authorized administrator of DigitalCommons@USU. For more information, please contact [email protected]. HYBRIDIZATION, POPULATION GENETIC STRUCTURE AND GENE EXPRESSION IN THE GENUS BOECHERA

by

Martin Peter Schilling

A dissertation submitted in partial fulfillment of the requirements for the degree

of

DOCTOR OF PHILOSOPHY

in

Ecology

Approved:

Paul G. , Ph.D. Zachariah Gompert, Ph.D. Major Professor Committee Member

Michael D. Windham, Ph.D. Karen E. Mock, Ph.D. Committee Member Committee Member

John Carman, Ph.D. Mark R. McLellan, Ph.D. Committee Member Vice President for Research and Dean of the School of Graduate Studies

UTAH STATE UNIVERSITY Logan, Utah

2016 ii

Copyright © Martin Peter Schilling 2016

All Rights Reserved iii

ABSTRACT

Hybridization, population genetic structure and gene expression in the genus Boechera

by

Martin Peter Schilling, Doctor of Philosophy

Utah State University, 2016

Major Professor: Paul G. Wolf, Ph.D. Department: Biology

In the , we can observe an astounding diversity of life forms. However, we still lack a thorough understanding of how different and even what constitutes a species.

In some groups of organisms, especially in , hybridization between species is very common.

Hybridization has the potential to create new combinations of traits, which can lead to , but hybridization can also lead to or demographic decline of natural . Within and between species, is ubiquitous in natural populations, which can be observed on various levels. Here, I present several ways to assess such variation in the genus Boechera

(). First, I present the evolutionary history and patterns of admixture within a subgroup in the genus, the B. puberula clade. I further show through admixture analyses, that that there is variation in the extent of admixture within the group, but especially when it comes to admixture with two widespread congeners, not part of the B. puberula clade. In a second study, I assessed levels of and patterns of population genetic structure within a member of the B. puberula clade, the montane endemic diploid B. lasiocarpa from Utah. After excluding all hybrids based on estimated admixture proportions, I show that levels of genetic diversity based on genomic data from this rare species are higher than in the widespread B. stricta. Additionally, I assessed common and rare SNVs and show that populations of B. lasiocarpa exhibit variation in genetic diversity and population genetic structure, which seem to be consistent with assumed population sizes. Further,

I investigated patterns of gene expression in an experiment under drought-stress in four Boechera species, which exhibit differences in reproductive mode and level. Across all four species, iv

I did not find a uniform response to drought-stress, and I present evidence for apomixis-specific gene expression, for differential expression associated with ploidy as well as differentially-expressed under a model of potential reversal from apomixis to in B. lignifera.

(139 pages) v

PUBLIC ABSTRACT

Hybridization, population genetic structure and gene expression in the genus Boechera

Martin Peter Schilling

When we look at life on earth, we can see a lot of different life forms, but we still do not fully understand how these different life forms came to be and at which points in time these life forms began to be different enough from each other so we could call them by different names, or species.

Some groups of species on earth, especially plants, seem to reproduce with each other, even though they are already very different from each other so that we call them different species. This process is called hybridization, and it can stir up the dynamics of these species, so that their basic building plans are changed very much. These changes can be either positive or negative for individual populations in such species, and in my research, I looked at a group of plant species in North America, which seem to reproduce very wildly among each other. For this, I traveled across the Western US and collected just a few leaves from many plants. With these leaves I managed to get a part of their basic building plan, which is also called DNA. This building plan carries the instructions to make the proteins that a cell needs, and these instructions are called genes. In different individuals of one species, the same gene can have very small changes. All of these changes in a group of individuals taken together are called genetic variation. This variation is good most of the time, but can also be pretty bad sometimes, because the proteins that are made do not work properly. When we look at the genetic variation between species, we can see how these species are related to each other. With the leaves I collected in the Western US, I did exactly that. I could show that a specific group of six different species, which someone called the Boechera puberula group, are related to each other more than to any other species out there. After that, I looked some more into one of these species, called Wasatch rockcress or Boechera lasiocarpa (which is its’ fancy name for the scientists). When

I looked at the DNA of this rockcress, I found that it actually has a lot of genetic variation, even though it only grows in a few different spots in the mountains of Northern Utah. I also saw that some of the populations I found had more variation than others. I then took four other Boechera species, vi that are very closely related to the ones I looked at earlier. One of the four species reproduces sexually, but the other three reproduce in a special asexual way, called apomixis. I grew these plants in the greenhouse, where some of them had just enough water, but others did not get enough water, so they were under drought-stress. I took their flowers, but this time, I did not look at the DNA, but I counted all of the proteins I could find in those flowers. I then compared the protein counts between the four species and found several interesting results. All of the four species did react in a different way to drought stress, and there were also lots of changes that were connected to how these species reproduce. The work that I briefly described here shows us different ways for how we can measure genetic variation between and within species. Taken together, my findings suggest that it is really important to not only look at different species when we want to measure genetic variation, but to also look at different populations, because these populations can be very different from each other.

Further, when we try to make sense of the different reproductive modes in Boechera, there are a lot of little changes happening, that we have to take into account. vii

ACKNOWLEDGMENTS

Firstly, I would like to express my sincere gratitude to my advisors, Paul G. Wolf and Zachariah

Gompert for the continuous support of my Ph.D. study and related research. I thank Paul for his extraordinary patience, motivation and knowledge. His guidance, teaching skills as well as diplomatic skills helped and motivated me immensely during my time at Utah State University. I also thank Zachariah for his enormous patience and positivity, which provided me with the ultimate role model of a tremendously motivated scholar and researcher. Additionally, I feel extremely lucky for having had numerous fruitful and inspiring conversations with Zach. Without the guidance and continuous help of these two brilliant persons, this dissertation would not have been possible. I would also like to thank my committee members, Dr. Karen E. Mock, Dr. Michael D. Windham, and John G. Carman, for serving as committee members and providing me with extraordinarily helpful comments and suggestions. Karen, without that delicious cheese, I might not have made it in this culinary desert of the intermountain west. On that note, I’d like to thank all the great people in the bug lab (especially Joe Kotynek, Dave Axford, and Dan Zamecnik), that have enriched my dietary habits by making sure that I would eat sufficient amounts of food during the latter stages of writing my dissertation. My sincere thanks also go to Dr. Marcel Robischon, who ignited and fueled my passion for genetics, and to Dr. Jürgen Bauhus, Dr. Christian Wirth, Dr. Helge Bruelheide, and Dr. Karin Nadrowski for supporting me during my time as an intern in the BEF China project.

Further, my thanks goes to the USU Ecology Center for their financial support of my PhD project.

Further, I thank the Pryer and Mitchell-Olds labs for hosting me at Duke University and letting me utilize their lab spaces, as well as James Beck, Fay-Wei Li, and Catherine Rushworth for providing me with access to their microsatellite data. The Intermountain Herbarium, and specifically Michael

Piep, provided me with help regarding acquisition of herbarium records and with additional help in regards to submission of voucher specimens. I also want to thank my good friend Ralf Tauscher for ultimately laying the foundation for my strong interest in open-source software development and for all of the inspiring and illuminating conversations that we’ve had over the years. I also want to thank the members of the Wolf and Gompert labs, for all the help, support, and inspiring conversations. viii

Finally, I want to thank my parents and my brothers for all the support they have given me throughout working on this project and in my life in general. ix

CONTENTS

Page ABSTRACT ...... iii PUBLIC ABSTRACT ...... v ACKNOWLEDGMENTS ...... vii LIST OF TABLES ...... xi LIST OF FIGURES ...... xii ACRONYMS ...... xiii CHAPTER 1 INTRODUCTION AND LITERATURE REVIEW ...... 1 and the nature of species boundaries ...... 1 Hybridization and in plants ...... 2 A brief history of molecular markers for the study of hybridization and speciation . . . .3 The genus Boechera ...... 5 Research scope ...... 6 2 ADMIXTURE, EVOLUTIONARY RELATIONSHIPS, AND VARIATION IN REPRO- DUCTIVE ISOLATION IN THE BOECHERA PUBERULA CLADE ...... 15 Introduction ...... 15 Methods ...... 18 Results ...... 21 Discussion ...... 24 Figures & tables ...... 29 Literature cited ...... 42 3 GENETIC DIVERSITY AND POPULATION STRUCTURE OF RARE VS. COMMON IN THE MONTANE ENDEMIC BOECHERA LASIOCARPA ...... 43 Introduction ...... 43 Methods ...... 45 Results ...... 48 Discussion ...... 50 Figures & tables ...... 55 Literature cited ...... 65 4 DIFFERENTIAL GENE EXPRESSION IN DIPLOID SEXUAL, DIPLOID APOMICTIC AND TRIPLOID APOMICTIC SPECIES OF THE GENUS BOECHERA ...... 66 Introduction ...... 66 Methods ...... 70 Results and discussion ...... 73 Conclusions ...... 80 x

Figures & tables ...... 82 Literature cited ...... 93 5 CONCLUSIONS ...... 94 Introduction ...... 94 Contribution ...... 95 Future work ...... 97 APPENDICES ...... 99 Appendix A Supplementary figures and tables for chapter 2 ...... 100 Appendix B Supplementary materials for chapter 3 ...... 105 Appendix C Supplementary figures for chapter 4 ...... 119 CURRICULUM VITAE ...... 121 xi

LIST OF TABLES

Table Page

2.1 Locality information for Fig. 2.1, sample numbers for Fig. 2.3 and Fig. 2.2 as well as ploidy and nominal taxa ...... 31

3.1 Sample ids for admixture proportions in Figures 3.2 and 3.3, locality information for 105 B. lasiocarpa individuals from Utah ...... 56

4.1 Accession, voucher and location information for Boechera species and hybrids used in this study ...... 82

4.2 Model classification for Bayesian model comparison of differential expression . . . 83

A.1 Locality information for Boechera individuals considered in this chapter ...... 101

B.1 Locality information, sample ids, ids for Figures 3.2 and 3.3 as well as geographical information for each B. lasiocarpa individual included ...... 108

B.2 Number of individuals, variants, and sequencing coverage for respective variant calling runs ...... 111

B.3 Locality information, sample ids, ids for Figures 3.2 and 3.3 as well as geographical information for each Boechera individual included ...... 112 xii

LIST OF FIGURES

Figure Page

2.1 Sampling locations and distribution ranges of select Boechera taxa in western United States ...... 29

2.2 Admixture proportions for eight taxa and neighbor-joining tree of B. puberula group members as well as B. stricta and B. retrofracta based on common variants (n= 14,815) ...... 30

2.3 Admixture proportions based on 14,815 common variants ...... 33

2.4 Statistical summary of genetic variation based on principal component analysis of 14,815 common variants ...... 34

3.1 Sampling locations across the western US and Utah ...... 55

3.2 Admixture proportions based on 13,099 common variants and 105 individuals . . 57

3.3 Admixture proportions based on 2,051 rare variants and 105 individuals . . . . . 58

3.4 Statistical summary of genetic variation based on principal component analysis of common and rare variants ...... 59

3.5 Neighbor-joining tree of pairwise FST matrices across all six localities ...... 60

4.1 Biplots of PCAs performed on expression values normalized by respective library size, and presence and absence of gene expression ...... 84

4.2 Distribution of posterior probabilities across all nine models of expression . . . . 85

A.1 Admixture proportions based on 14,815 common variants ...... 100

B.1 Boechera lasiocarpa in the Wasatch Mountains of Utah, specifically the locality of Logan Canyon Sinks (lcs)...... 107

B.2 Neighbor-joining tree of pairwise FST matrices across all six localities for common and rare variants ...... 118

C.1 Barplots of DIC microscopy results for all four species ...... 119

C.2 Venn diagram of expressed genes by species ...... 120 xiii

ACRONYMS

ABA Absciscic acid

AFLP Amplified Fragment Length

AGL19 AGAMOUS-like 19

AGL22 AGAMOUS-like 22

AGL48 AGAMOUS-like 48

AGO Argonaute

AGO9 Argonaute 9

AIC Aposporous initial cell

ATA20 lipid transfer protein Anther 20

ATA7 lipid transfer protein Anther 7

AtBG1 abscisate beta-glucosidase 18

ATP Adenosine triphosphate

BLAST Basic Local Alignment Search Tool blo Ben Lomond trail

CTAB Cetyl trimethylammonium bromide

DAPI Diamidino-2-phenylindole

DE Differential expression

DCD Development and Cell

DIC Deviance information criteria

DNA Deoxyribonucleic acid

DRM1 Domains Rearranged Methylase 1

GATK Genome Analysis Toolkit

GBS Genotyping-by-Sequencing

GCN2 General control nonderepressible 2 xiv

GMCo Glucose-methanol-choline oxidoreductase family protein

GO Gene ontology

GPS Global Positioning System

GRP17 Glycine-rich protein 17

GRP19 Glycine-rich protein 19

IBD isolation by distance

ITS Internal transcribed spacer

JGI Joint Genome Institute lcs Logan canyon sinks

LBD20 LOB-domain containing protein 20

LD

LDA Linear discriminant analysis

LOB Lateral boundaries

LRR Leucine-rich repeat

MAF Minor frequency

MAP Mitogen-activated protein

MEG Maternally expressed gene

ML Maximum-Likelihood

MMC Megaspore mother cell

MPK11 MAP Kinase 11 mRNA Messenger RNA

NBS Nucleotide-binding site

NCBI National Center for Information

NJ Neighbour-joining

PCA Principal Component Analysis

PC Principal Component

PCR Polymerase chain reaction pow Powder Mountain xv pro Provo Peak

RAPD Randomly Amplified Polymorphic DNA red Red Butte Emigration divide

RFLP Restriction fragment length polymorphism

RNA Ribonucleic acid

SBS Sequencing-by-synthesis

SI Self-incompatibility

SKIP6 SKP1-interacting partner 6 skn Skyline north trail

SKP1 S-Phase Kinase-Associated Protein 1

SNV single nucleotide variants

TAIR The Arabidopsis information resource

TNF Tumor necrosis factor

TOP1ALPHA DNA topoisomerase 1 alpha

TRAF TNF receptor associated factors

UPGMA Unweighted Pair Group Method with Arithmetic Mean

WGD Whole genome duplication CHAPTER 1

INTRODUCTION AND LITERATURE REVIEW

Biodiversity and the nature of species boundaries

In the history of life, we can observe an astounding diversity of life forms. When attempting to explain these high levels of biodiversity, however, biologists have held conflicting views about what constitutes a species, where geographical barriers and the occurrence of gene flow in speciation have been heavily debated (e.g. Dobzhansky, 1940; Mayr, 1963; Mishler and Donoghue, 1982; Mayden,

1999; Mallet, 2007; De Queiroz, 2007). Recently, it has been argued that former approaches to categorize the barriers between species had failed to acknowledge the fact that, despite the occurrence of gene flow between populations, divergence of lineages could still be maintained (Wu,

2001; Mallet, 2008b,a).

It has been acknowledged that speciation was not purely restricted to geographically isolated

(allopatric) populations, but could also occur in (i.e. in overlapping geograpic distribu- tions). Indeed, examples of speciation with gene flow seem to be plentiful (e.g. Arnold, 1992; Via,

2001; Llopart et al., 2005; Feder et al., 2014; Kenney et al., 2016). It is now widely recognized that is not occurring uniformly across individual genomes, but seems to be continuous, with some parts of the genome exhibiting isolation between populations, and permeable regions in other parts of the genome (Key, 1968; Wu, 2001; Turner et al., 2005). Different isolat- ing barriers were typically characterized to occur before (prezygotic) or after mating (postzygotic), and it was argued that prezygotic barriers were more prevalent than postzygotic barriers (Coyne et al., 2004). However, there are few studies that examine the relative strengths of different forms of isolating barriers or potential interactions between such isolating barriers on total reproductive isolation (Lowry et al., 2008; Barton and De Cara, 2009; Stelkens et al., 2010; Harrison and Larson,

2014). With modern large-scale sequencing efforts, it is now possible to assess different selective pressures and the effects of genomic architecture on differentiation of populations or lineages. It is assumed that single loci can experience divergent selection to specific environments, which can create "genomic islands" of divergence (Turner et al., 2005; Feder and Nosil, 2010). Depending on effective population sizes and rates of recombination, such sites can be under weak or strong 2 linkage disequilibrium (LD), and thus, the size of such isolated regions can vary enormously (e.g.

Hawthorne and Via, 2001; Payseur et al., 2004; Turner et al., 2005; Nachman and Payseur, 2012;

Nosil, 2012). The creation of genomic islands through divergent selection has been called divergence hitchhiking (Feder and Nosil, 2010), and it has been acknowledged to be a strong driver of population differentiation with ongoing gene flow (Feder and Nosil, 2010; Ellegren et al., 2012; Flaxman et al.,

2013; Feder et al., 2014).

Hybridization and polyploidy in plants

We could now assume that the process of speciation under divergent selection would always proceed relatively slowly, since it would take time to reach complete reproductive isolation between diverging taxa with ongoing gene flow. However, speciation can happen relatively quickly when selective pressures are strong (e.g. Anderson et al., 2014; Franks et al., 2016). Additionally, speciation does not solely have to involve the same taxa. Hybridization, the crossing between species or genetically divergent populations within a species, is a common process in plants (Stebbins, 1950;

Grant, 1981; Arnold, 1996; Hegarty and Hiscock, 2005). Furthermore, hybrid speciation is well- documented in plants (e.g. Hegarty and Hiscock, 2005; Rieseberg and Willis, 2007; Soltis and Soltis,

2009; Harrison and Larson, 2014), where it has the potential to enable populations to newly gain traits or regain traits that have been lost in the past. Additionally, damaged alleles can possibly be replaced by functional copies from related species (Rieseberg, 2009), giving those populations the opportunity to adopt certain genes to maintain vital physiological functions that have been present before. However, unless newly formed hybrids maintain their taxonomic identity and are reproductively isolated from both of their parental taxa, they are not considered true species. Such isolation could come about either genetically or ecologically (Hegarty and Hiscock, 2005; Ungerer et al., 1998).

There are two ways in which speciation via hybridization can occur: 1) Homoploid hybrid speciation, where offspring have the same ploidy level, (i.e. the same number of homologous ) as both parental species, and 2) polyploid hybrid speciation, which involves the duplication of one or both parental genomes. Homoploid hybrid speciation appears to be much rarer 3 than polyploidy, because homoploid hybrid offspring are not protected from with their parental species. Additionally, homoploids have strongly reduced fitness in early generation hybrids

(Rieseberg and Willis, 2007). Another reason for rare occurrences of homoploid hybrid speciation might be, however, that the detection of homoploidy is not straightforward, as the numbers are the same when compared to progenitors (Gross and Rieseberg, 2005). In polyploids, genome duplication restricts backcrossing with parental species, which makes the formation of new species more likely. Polypoids are typically classified according to the origins of the genomes that are being combined. Allopolyploidy involves the polyploid hybridization of two separate species, whereas in autopolyploidy, hybridization with duplication of the genome occurs in either the same or separate populations of one single species. In contrast to homoploidy, genome duplication in allopolyploidy protects the genetic integrity of the newly created hybrids. Moreover, if the hybrid is partially fertile, it can undergo into one of its’ parental species, resulting in admixture.

In the past, it has been argued that speciation through hybridization via polyploidization should be an evolutionary dead-end (Mayr, 1963; Hennig, 1965). However, Grant (1981) estimated the frequency of polyploidy in angiosperms to be around 50%. Furthermore, analyses of leaf guard sizes in fossil and extant angiosperm taxa yielded that about 70% of all angiosperm species experienced polyploidy in their evolutionary history (Masterson, 1994) and recent genomic data suggest that all plant nuclear genomes sequenced to date show signs of ancient genome duplication (Soltis et al.,

2009), which strongly suggests hybridization as a significant force in the speciation of plants.

A brief history of molecular markers for the study of hybridization and speciation

The study of hybridization in natural populations has a long history (Lotsy, 1916; Anderson,

1953; Barton and Hewitt, 1985; Harrison, 1990, 1993), with attempts to discern patterns in hy- bridization and its evolutionary role using a diverse array of techniques. Initially, morphological and behavioral characters were used to describe hybrid zones in (Mecham, 1960; Yang and

Selander, 1968; Turner, 1971; Atchley and Hensleigh, 1974) as well as plants (e.g. Heiser, 1949;

Rollins, 1983), based on the indicator that these characters were phenotypically intermediate be- tween parental species. With progress in molecular techniques, it became advantageous to combine 4 morphological features with biochemical data, such as allozyme markers (Moran et al., 1980) as well as cytogenetic methods (Hunter and Markert, 1957; Feder, 1979; Baker et al., 1985). Analyses using the aforementioned methods not only showed that genetic variation in natural populations was widespread (Prakash et al., 1969), but subsequently revealed many additional instances of hy- bridization in nature (e.g. Feder, 1979; Moran et al., 1980; Baker et al., 1985; Broyles et al., 1994).

With further development in molecular biology, and particularly the development of the polymerase chain reaction (PCR) (Saiki et al., 1985), a suite of different methods would emerge. Additional information on hybridization came from molecular data, including microsatellites (Edwards et al.,

1991), restriction fragment length polymorphisms (RFLP) (Botstein et al., 1980), randomly amplified polymorphic DNA (RAPD) (Williams et al., 1990), and amplified fragment length polymorphisms

(AFLP) (Vos et al., 1995). These approaches allowed students of speciation to create maps, which would provide information on the inheritance of chromosomal segments through the association at different loci in hybrid individuals (see also Arnold and Emms, 1998; Rieseberg, 1998, for excellent reviews on the subject). However, the use of pre-PCR markers greatly underestimates levels of polymorphism in populations and thus, molecular markers were found to be increasingly useful for the study of population differentiation and admixture (e.g. Lanzaro et al., 1995; Lehmann et al., 1996; Estoup et al., 1998; Djè et al., 1999).

All of the aforementioned methods brought forth new ways to capture genetic diversity in natural populations, and thus provided grounds for the development of new analyses to a wide range of disciplines. With the recent emergence of high-throughput sequencing methods, and more specifically the development of Genotyping-by-Sequencing (GBS) methods, the study of hybridization and speciation is once more experiencing a strong resurgence (e.g Elshire et al., 2011;

Parchman et al., 2012; Gompert et al., 2014). We are now able to assess patterns of introgression on a genome-wide scale. The number of loci across individual genomes has largely increased, giving us the opportunity to study admixture at a scale that greatly exceeds the genomic resolution and statistical power obtained using previous marker methods (Brumfield et al., 2003; Liu et al., 2005;

Xing et al., 2005; Ball et al., 2010; Buerkle and Gompert, 2013; Gompert and Buerkle, 2013; Jeffries et al., 2015). We are now able to obtain more accurate estimates of admixture, detect more limited 5 introgression, and measure variation in introgression among regions of the genome (Li, 2011; Skotte et al., 2013; Gompert et al., 2014).

The genus Boechera

Certain groups of organisms are more prone to hybridization than others, and the genus Boechera

(Brassicaceae) is extraordinary in this regard. Boechera was included in Arabis (Rollins, 1981) but recent molecular studies have shown that these species are much more closely related to Pennellia and Haliomolobus (Bailey et al., 2002; Al-Shehbaz, 2003). There are currently 110 Boechera species listed as occurring in North America, Greenland and the Russian Far East, with the highest number of species found in the Western United States (Al-Shehbaz and Windham, 2010). Most

Boechera species still occur in their natural environments without major anthropogenic disturbances

(Rushworth et al., 2011), with members occurring in desert, subalpine and alpine environments

(Schranz et al., 2005). Early studies of this group (as Arabis in Böcher, 1951; Rollins, 1981;

Roy, 1995) suggested that hybridization was common. More recent work has shown that Boechera comprises one of the most extensive and complex hybrid networks known, including 80+ sexual diploid taxa that have interacted to form over 400 distinct hybrid derivatives containing two, three, or even four distinct genomes (Li et al., 2016). Hybridization spans the entire genus (Alexander et al., 2013) and has occurred repeatedly and independently among many diploid species, resulting in hybrid taxa with high genetic diversity (Koch et al., 2003; Dobeš et al., 2004b,a). Furthermore, hybridization in Boechera appears to be strongly linked to the occurrence of gametophytic apomixis

(-type diplosporous apomixis (Nogler, 1984)) (Böcher, 1951; Naumova et al., 2001;

Aliyu et al., 2010)), where I fails to complete, and meiosis II results in the formation of two megaspores. One of these two cells degenerates, leaving the other to undergo three mitotic divisions to form the megagametophyte (Nogler, 1984; Schranz et al., 2005; Dobeš et al., 2007; Beck et al.,

2011). Apomictic individuals are highly heterozygous, whereas sexual diploids were found to have low levels of heterozygosity, with high levels of (Roy, 1995; Dobeš et al., 2004a; Song et al., 2006; Song and Mitchell-Olds, 2007).

The genus comprises allopolyploid triploid as well as many diploid hybrids (Schranz et al., 6

2005; Beck et al., 2011). Progeny of Boechera hybrids showed fixed heterozygosity, which could point to apomictic reproduction (Schranz et al., 2005). Even though apomictic Boechera specimens are able to bypass sexual reproduction, a portion of the offspring may be sexually derived, which may lead to introgression of apomictic individuals into sexual populations. This means that there appears to be an option for individuals to choose between sexual reproduction and apomixis (Kiefer and Koch, 2012; Schranz et al., 2005). However, the details of these processes are not fully understood. Boechera appears to be the only well-documented example of diploid apomixis in angiosperms (Rushworth et al., 2011; Dobeš et al., 2007), since apomicts in Taraxacum were found to be triploid only (Kirschner and Stepainek, 1996; Menken et al., 1995). This makes apomixis in the genus Boechera interesting for applications in (Savidan et al., 2001; Sharbel et al.,

2009), as it would be desirable to genetically fix beneficial traits (Toenniessen, 2001). Moreover, by investigating the control and functioning of apomixis in conjunction with processes on the level of natural populations, we can improve our understanding of the evolution of reproductive modes and the maintenance of genetic variation in natural populations.

Research scope

Although Boechera poses an excellent study system in many ways, its evolutionary history has not been fully resolved, which is mostly due to the widespread patterns of hybridization and apomixis in the entire genus (Beck et al., 2011; Alexander et al., 2015). To address questions of speciation, hybridization and apomixis in Boechera, it is critical to clarify the evolutionary relationships between species. In chapter 2, I will address this issue by focusing on a subgroup within the genus, the B. puberula clade. The B. puberula clade was chosen for this project, as it contains several taxa that have not received much attention concerning questions of hybrid speciation within this genus.

Additionally, the clade represents a very interesting system for the study of hybrid speciation, as the taxa therein occur in several different scenarios of geographic distribution and range size, where we can observe relatively widespread and geographically restricted taxa in sympatry as well as parapatry. As originally defined by Alexander et al. (2013), the puberula clade included five sexual diploid species: B. lasiocarpa, B. puberula, B. retrofracta (these collections have subsequently 7 been reassigned to B. exilis), B. subpinnatifida, and B. serpenticola. I will present analyses of admixture and patterns of reproductive isolation between members of the puberula clade, and two additional congenerics, B. retrofracta and B. stricta, that are distributed widely across North

America and which are known to hybridize extensively within the genus. Further, I will present the most extensive evolutionary history of the puberula clade to date and present evidence for a split of B. puberula into two distinct , B. puberula puberula and B. puberula arida, In

chapter 3, I will change the scope from relationships between species to levels of genetic diversity

and population genetic structure of one single species in the puberula clade, B. lasiocarpa. B.

lasiocarpa was chosen because it represents the only species in the puberula group with a small

range size and sufficiently high numbers of populations to inform us about patterns of gene flow

and population genetic differentiation of a rare endemic in the puberula system. Endemic to the mountains of northern Utah, B. lasiocarpa occurs only on rocky outcrops in subalpine and alpine environments. I will use admixture analyses to separate individuals of B. lasiocarpa from hybrids in this complicated genus. Further, population genomic analyses will be presented to investigate levels of genetic diversity on a species-wide level, as well as to portray differences in genetic diversity between multiple populations across Utah. Further, I will assess the population genetic structure, and population differentiation in B. lasiocarpa, both between and within populations. In chapter 4, I will address patterns of gene expression in four species in the Boechera genus (B. stricta, B. lignifera, B. cf. gunnisoniana, and the hybrid B. exilis x retrofracta) with both differing reproductive modes and ploidy levels. I will address on a transcriptomic level whether these species exhibit a uniform gene expression response to drought-stress, whether expressed genes can be found to be associated with apomictic reproduction, whether triploid Boechera apomicts show differences in gene expression when compared to their diploid congenerics, and finally, if the apomictic B. lignifera shows signs of a reversal from apomeiosis to meiosis. Finally, in chapter 5, I will synthesize the findings of the preceding chapters and present a conclusion concerning the specific questions addressed in this dissertation and further highlight potential future work in this system. 8

LITERATURE CITED

Al-Shehbaz, I. A., 2003. Transfer of most North American species of Arabis to Boechera (Brassi- caceae). Novon 13:381–391.

Al-Shehbaz, I. A. and M. D. Windham, 2010. Boechera. Pp. 347–412, in Flora of North America Committee, ed. Flora of North America, North of Mexico, 7th ed. Oxford University Press., New York and Oxford.

Alexander, P. J., M. D. Windham, J. B. Beck, I. A. Al-Shehbaz, and C. D. Bailey, 2013. Molec- ular and of the genus Boechera and related genera (Brassicaceae: Boechereae). Syst Bot 38:192–209.

———, 2015. Weaving a tangled web: Divergent and reticulate speciation in Boechera fendleri sensu lato (Brassicaceae: Boechereae). Syst Bot 40:572–596.

Aliyu, O. M., M. E. Schranz, and T. F. Sharbel, 2010. Quantitative variation for apomictic repro- duction in the genus Boechera (Brassicaceae). Am J Bot 97:1719–1731.

Anderson, E., 1953. Introgressive hybridization. Biological Reviews 28:280–307.

Anderson, J. T., C.-R. Lee, and T. Mitchell-Olds, 2014. Strong selection genome-wide enhances fitness trade-offs across environments and episodes of selection. Evolution 68:16–31.

Arnold, M., 1992. Natural hybridization as an evolutionary process. Annu Rev Ecol Syst 23:237–261.

Arnold, M. L., 1996. Natural hybridization and evolution. Oxford Univ Press, New York.

Arnold, M. L. and S. K. Emms, 1998. Molecular markers, gene flow, and . Pp. 442–458, in Molecular of plants II: DNA sequencing. Springer Science + Business Media, New York.

Atchley, W. R. and D. A. Hensleigh, 1974. The congruence of morphometric shape in relation to genetic divergence in four races of morabine grasshoppers (Orthoptera: Eumastacidae). Evolution 28:416–427.

Bailey, C. D., R. A. Price, and J. J. Doyle, 2002. Systematics of the halimolobine Brassicaceae: Evidence from three loci and morphology. Syst Bot 27:318–332.

Baker, R. J., J. W. Bickham, and M. L. Arnold, 1985. Chromosomal evolution in Rhogessa (Chiroptera: Vespertilionidae): Possible speciation by centric fusions. Evolution 39:233–243.

Ball, A. D., J. Stapley, D. a. Dawson, T. R. Birkhead, T. Burke, and J. Slate, 2010. A comparison of SNPs and microsatellites as linkage mapping markers: lessons from the zebra finch (Taeniopygia guttata). BMC Genomics 11:218.

Barton, N. H. and M. A. R. De Cara, 2009. The evolution of strong reproductive isolation. Evolution 63:1171–1190.

Barton, N. H. and G. M. Hewitt, 1985. Analysis of hybrid zones. Annu Rev Ecol Syst 16:113–148. 9

Beck, J. B., P. J. Alexander, L. Allphin, I. A. Al-Shehbaz, C. Rushworth, C. D. Bailey, and M. D. Windham, 2011. Does hybridization drive the transition to asexuality in diploid Boechera? Evolution 66:985–95.

Böcher, T. W., 1951. Cytological and embryological studies in the amphi-apomictic Arabis holboellii complex. Kongel.Danske Vidensk.-Selskab.Biol.Skr. 6:1–59.

Botstein, D., R. L. White, M. Skolnick, and R. W. Davis, 1980. Construction of a genetic linkage map in man using restriction fragment length polymorphisms. Am J Hum Genet 32:314–31.

Broyles, S. B. ., A. Schnabel, and R. Wyatt, 1994. Evidence for long-distance dispersal in milkweeds (Asclepias exaltata). Evolution 48:1032–1040.

Brumfield, R. T., P. Beerli, D. a. Nickerson, and S. V. Edwards, 2003. The utility of single nucleotide polymorphisms in inferences of population history. Trends Ecol Evol 18:249– 256.

Buerkle, C. A. and Z. Gompert, 2013. Population genomics based on low coverage sequencing: how low should we go? Mol Ecol 22:3028–3035.

Coyne, J. A., H. A. Orr, et al., 2004. Speciation, vol. 37. Sinauer Associates Sunderland, MA.

De Queiroz, K., 2007. Species concepts and species delimitation. Syst Biol 56:879–86.

Djè, Y., D. Forcioli, M. Ater, C. Lefèbvre, and X. Vekemans, 1999. Assessing population genetic structure of Sorghum landraces from North-western Morocco using allozyme and microsatellite markers. Theor Appl Genet 99:157–163.

Dobeš, C., T. Mitchell-Olds, and M. A. Koch, 2004a. Intraspecific diversification in North American Boechera stricta (= Arabis drummondii), Boechera x divaricarpa, and Boechera holboellii (Bras- sicaceae) inferred from nuclear and molecular markers - An integrative approach. Am J Bot 91:2087–2101.

Dobeš, C., T. F. Sharbel, and M. Koch, 2007. Towards understanding the dynamics of hybridization and apomixis in the evolution of the genus Boechera (Brassicaceae). Syst Biodivers 5:321–331.

Dobeš, C. H., T. Mitchell-Olds, and M. A. Koch, 2004b. Extensive chloroplast haplotype variation indicates Pleistocene hybridization and radiation of North American Arabis drummondii, A. x divaricarpa, and A. holboellii (Brassicaceae). Mol Ecol 13:349–370.

Dobzhansky, T., 1940. Speciation as a stage in evolutionary divergence. Am Nat 74:312–321.

Edwards, A., A. Civitello, H. A. Hammond, and C. T. Caskey, 1991. DNA typing and genetic mapping with trimeric and tetrameric tandem repeats. Am J Hum Genet 49:746–756.

Ellegren, H., L. Smeds, R. Burri, P. I. Olason, N. Backström, T. Kawakami, A. Künstner, H. Mäkinen, K. Nadachowska-Brzyska, A. Qvarnström, S. Uebbing, and J. B. W. Wolf, 2012. The genomic landscape of species divergence in Ficedula flycatchers. Nature Pp. 1–5.

Elshire, R. J., J. C. Glaubitz, Q. Sun, J. a. Poland, K. Kawamoto, E. S. Buckler, and S. E. Mitchell, 2011. A robust, simple genotyping-by-sequencing (GBS) approach for high diversity species. PLoS ONE 6:e19379. 10

Estoup, A., F. Rousset, Y. Michalakis, J. M. Cornuet, M. Adriamanga, and R. Guyomard, 1998. Comparative analysis of microsatellite and allozyme markers: A case study investigating micro- geographic differentiation in (Salmo trutta). Mol Ecol 7:339–353.

Feder, J. H. ., 1979. Natural hybridization and genetic divergence between the toads Bufo boreas and Bufo punctatus. Evolution 33:1089–1097.

Feder, J. L. and P. Nosil, 2010. The efficacy of divergence hitchhiking in generating genomic islands during . Evolution 64:1729–1747.

Feder, J. L., P. Nosil, A. C. Wacholder, S. P. Egan, S. H. Berlocher, and S. M. Flaxman, 2014. Genome-wide congealing and rapid transitions across the speciation continuum during speciation with gene flow. J Hered 105:810–820.

Flaxman, S. M., J. L. Feder, and P. Nosil, 2013. Genetic hitchhiking and the dynamic buildup of genomic divergence during speciation with gene flow. Evolution 67:2577–2591.

Franks, S. J., N. C. Kane, N. B. O’Hara, S. Tittes, and J. S. Rest, 2016. Rapid genome-wide evolution in Brassica rapa populations following drought revealed by sequencing of ancestral and descendant gene pools. Mol Ecol Pp. 3622–3631.

Gompert, Z. and C. A. Buerkle, 2013. Analyses of genetic ancestry enable key insights for molecular ecology. Mol Ecol 22:5278–94.

Gompert, Z., L. K. Lucas, C. A. Buerkle, M. L. Forister, J. A. Fordyce, and C. C. Nice, 2014. Admixture and the organization of genetic diversity in a butterfly revealed through common and rare genetic variants. Mol Ecol 23:4555–4573.

Grant, V., 1981. Plant Speciation. 2nd ed. Columbia University Press, New York, USA.

Gross, B. L. and L. H. Rieseberg, 2005. The of homoploid hybrid speciation. J Hered 96:241–52.

Harrison, R. G., 1990. Hybrid zones: window on evolutionary process. Oxf Surv Evol Biol 7:69–128.

———, 1993. Hybrid zones and the evolutionary process. Oxford University Press.

Harrison, R. G. and E. L. Larson, 2014. Hybridization, introgression, and the nature of species boundaries. J Hered 105:795–809.

Hawthorne, D. J. and S. Via, 2001. Genetic linkage of ecological specialization and reproductive isolation in pea aphids. Nature 412:904–907.

Hegarty, M. J. and S. J. Hiscock, 2005. Hybrid speciation in plants: new insights from molecular studies. New Phytol 165:411–23.

Heiser, C. B., 1949. Study in the evolution of the sunflower species Helianthus annuus and H. bolanderi. Univ Calif Publ Bot.

Hennig, W., 1965. Phylogenetic systematics. Annu Rev Entomol 10:97–116. 11

Hunter, R. . L. . and C. . L. . Markert, 1957. Histochemical demonstration of enzymes separated by zone electrophoresis in starch gels. Science 125:1294–1295.

Jeffries, D. L., G. H. Copp, L.-J. Lawson Handley, H. Olsén, C. D. Sayer, and B. Hänfling, 2015. Comparing RADseq and microsatellites to infer complex phylogeographic patterns, a real data informed perspective in the Crucian carp, Carassius carassius, L. bioRxiv P. 025973.

Kenney, A. M., A. L. Sweigart, and A. M. Kenney, 2016. Reproductive isolation and introgression between sympatric Mimulus species. Mol Ecol .

Key, K. H. L., 1968. The concept of stasipatric speciation. Syst Zool 17:14–22.

Kiefer, C. and M. A. Koch, 2012. A continental-wide perspective: the genepool of nuclear encoded ribosomal DNA and single-copy gene sequences in North American Boechera (Brassicaceae). PLoS ONE 7:e36491.

Kirschner, J. and J. Stepainek, 1996. Modes of speciation and evolution of the sections in Taraxacum. Folia Geobot. Phytotax. 31:415–426.

Koch, M., I. A. Al-Shehbaz, and K. Mummenhoff, 2003. Molecular systematics, evolution, and population biology in the mustard family (Brassicaceae). Ann. Missouri Bot. Gard. 90:151–171.

Lanzaro, G. C., L. Zheng, Y. T. Touré, S. F. Traore, F. C. Kafatos, and K. D. Vernick, 1995. Microsatellite DNA and isozyme variability in a west African population of Anopheles gambiae. Insect Mol Biol 4:105–112.

Lehmann, T., W. A. Hawley, L. Kamau, D. Fontenille, F. Simard, and F. H. Collins, 1996. Genetic differentiation of Anopheles gambiae populations from East and west Africa: comparison of microsatellite and allozyme loci. 77:192–208.

Li, F.-W., C. A. Rushworth, J. B. Beck, and M. D. Windham, 2016. Boechera Microsatellite Website: an online portal for species identification and determining hybrid parentage. Database: in press .

Li, H., 2011. A statistical framework for SNP calling, discovery, association mapping and population genetical parameter estimation from sequencing data. Bioinformatics 27:2987–93.

Liu, N., L. Chen, S. Wang, C. Oh, and H. Zhao, 2005. Comparison of single-nucleotide polymor- phisms and microsatellites in inference of population structure. BMC Genet 6:26.

Llopart, A., D. Lachaise, and J. A. Coyne, 2005. Multilocus analysis of introgression between two sympatric sister species of : Drosophila yakuba and D. santomea. Genetics 171:197–210.

Lotsy, J. P., 1916. Evolution by means of hybridization. The Hague, Nijhoff.

Lowry, D. B., J. L. Modliszewski, K. M. Wright, C. a. Wu, and J. H. Willis, 2008. The strength and genetic basis of reproductive isolating barriers in flowering plants. Phil Trans R Soc Lond B Biol Sci 363:3009–3021.

Mallet, J., 2007. Species, Concepts of. 12

———, 2008a. Hybridization, ecological races and the nature of species: empirical evidence for the ease of speciation. Phil Trans R Soc Lond B Biol Sci 363:2971–2986.

———, 2008b. Mayr’s view of Darwin: was Darwin wrong about speciation? Biol J Linn Soc 95:3–16.

Masterson, J., 1994. Stomatal size in fossil plants: evidence for polyploidy in majority of an- giosperms. Science 264:421–424.

Mayden, R. L., 1999. Consilience and a hierarchy of species concepts: advances toward closure on the species puzzle. J Nematol 31:95–116.

Mayr, E., 1963. species and evolution, vol. 10. Harvard University Press.

Mecham, J. S., 1960. Introgressive hybridization between two southeastern treefrogs. Evolution 14:445–457.

Menken, S. B. J., E. Smit, and H. J. C. M. D. Nus, 1995. Genetical Population Structure in Plants: between diploid sexual and triploid asexual Dandelions (Taraxacum Section ruderalia). Evolution 49:1108–1118.

Mishler, B. D. and M. J. Donoghue, 1982. Species concepts: a case for pluralism. Syst. Zool. 31:491–503.

Moran, C., P. Wilkinson, and D. Shaw, 1980. Allozyme variation across a narrow in the grasshopper, Caledia captiva. Heredity 44:69–81.

Nachman, M. W. and B. A. Payseur, 2012. Recombination rate variation and speciation: theoretical predictions and empirical results from rabbits and mice. Phil Trans R Soc B 367:409–421.

Naumova, T. N., J. Van der Laak, J. Osadtchiy, F. Matzk, A. Kravtchenko, J. Bergervoet, K. S. Ramulu, and K. Boutilier, 2001. Reproductive development in apomictic populations of Arabis holboellii (Brassicaceae). Sex Plant Reprod 14:195–200.

Nogler, G. A., 1984. Gametophytic apomixis. chap. 10, Pp. 475–518, in B. M. Johri, ed. Embryology of Angiosperms. Springer.

Nosil, P., 2012. Ecological Speciation.

Parchman, T. L., Z. Gompert, J. Mudge, F. D. Schilkey, C. W. Benkman, and C. A. Buerkle, 2012. Genome-wide association genetics of an adaptive trait in lodgepole . Mol Ecol 21:2991–3005.

Payseur, B. A., J. G. Krenz, and M. W. Nachman, 2004. Differential Patterns of Introgression Across the X Chromosome in a Hybrid Zone Between Two Species of House Mice. Evolution 58:2064–2078.

Prakash, S., R. Lewontin, and J. Hubby, 1969. A molecular approach to the study of genic het- erozygosity in natural populations IV. patterns of genic variation in central, marginal and isolated populations of drosophila pseudoobscura. Genetics 61:841–858. 13

Rieseberg, L. H., 1998. Genetic Mapping as a Tool for Studying Speciation. chap. 16, Pp. 459–487, in Molecular Systematics of Plants II: DNA sequencing. Springer Science + Business Media, New York.

———, 2009. Evolution: Replacing genes and traits through hybridization. Curr Biol 19:R119–22.

Rieseberg, L. H. and J. H. Willis, 2007. Plant Speciation. Science 317:910–4.

Rollins, R. C., 1981. Studies on Arabis (Cruciferae) of western North America. Syst Bot 6:55–64.

———, 1983. Interspecific hybridization and uniformity in Arabis (Cruciferae). Am J Bot 70:625–634.

Roy, B. A., 1995. The breeding systems of six species of Arabis (Brassicaceae). Am J Bot 82:869– 877.

Rushworth, C. A., B.-H. Song, C.-R. Lee, and T. Mitchell-Olds, 2011. Boechera, a model system for ecological genomics. Mol Ecol 20:4843–57.

Saiki, R. K., S. Scharf, F. Faloona, K. B. Mullis, G. T. Horn, H. A. Erlich, and N. Arnheim, 1985. Enzymatic amplification of b-globin genomic sequences and restriction site analysis for diagnosis of sickle cell anemia. Science 230:1350–1354.

Savidan, Y., J. G. Carman, and T. Dresselhaus, 2001. The flowering of apomixis: from mechanisms to . 15. Cimmyt.

Schranz, M. E., C. Dobeš, M. A. Koch, and T. Mitchell-Olds, 2005. Sexual reproduction, hy- bridization, apomixis, and polyploidization in the genus Boechera (Brassicaceae). Am J Bot 92:1797–1810.

Sharbel, T. F., M.-L. Voigt, J. M. Corral, T. Thiel, A. Varshney, J. Kumlehn, H. Vogel, and B. Rotter, 2009. Molecular signatures of apomictic and sexual ovules in the Boechera holboellii complex. Plant Journal 58:870–82.

Skotte, L., T. S. Korneliussen, and A. Albrechtsen, 2013. Estimating individual admixture propor- tions from Next-generation sequencing data. Genetics Pp. 1–28.

Soltis, D. E., V. a. Albert, J. Leebens-Mack, C. D. Bell, A. H. Paterson, C. Zheng, D. Sankoff, C. W. DePamphilis, P. K. Wall, and P. S. Soltis, 2009. Polyploidy and angiosperm diversification. Am J Bot 96:336–348.

Soltis, P. S. and D. E. Soltis, 2009. The role of hybridization in plant speciation. Annu Rev Plant Biol 60:561–88.

Song, B.-H., M. J. Clauss, A. Pepper, and T. Mitchell-Olds, 2006. Geographic patterns of mi- crosatellite variation in Boechera stricta, a close relative of Arabidopsis. Mol Ecol 15:357–69.

Song, B.-H. and T. Mitchell-Olds, 2007. High genetic diversity and population differentiation in Boechera fecunda, a rare relative of Arabidopsis. Mol Ecol 16:4079–88.

Stebbins, G. L., 1950. Variation and evolution in plants. Columbia University Press, New York City. 14

Stelkens, R. B., K. A. Young, and O. Seehausen, 2010. The accumulation of reproductive incom- patibilities in African fish. Evolution 64:617–633.

Toenniessen, G. H., 2001. Feeding the world in the 21st century: , biotechnology, and the potential role of apomixis. Pp. 1–7, in The flowering of apomixis: From mechanisms to genetic engineering. Cimmyt.

Turner, J. R. G., 1971. Two thousand generations of hybridisation in a butterfly. Evolution 25:471–482.

Turner, T. L., M. W. Hahn, and S. V. Nuzhdin, 2005. Genomic islands of speciation in Anopheles gambiae. PLoS Biology 3:1572–1578.

Ungerer, M. C., S. J. Baird, J. Pan, and L. H. Rieseberg, 1998. Rapid hybrid speciation in wild sunflowers. Proc Natl Acad Sci USA 95:11757–62.

Via, S., 2001. in animals: The ugly duckling grows up. Trends Ecol Evol 16:381–390.

Vos, P., R. Hogers, B. M, M. Reijans, and L. T, 1995. A new technique for DNA fingerprinting. Nucleic Acids Res 44:388–396.

Williams, J. G., A. R. Kubelik, K. J. Livak, J. A. Rafalski, and S. V. Tingey, 1990. DNA poly- morphisms amplified by arbitrary primers are useful as genetic markers. Nucleic Acids Res 18:6531–6535.

Wu, C. I., 2001. The genic view of the process of speciation. J Evol Biol 14:851–865.

Xing, C., F. R. Schumacher, G. Xing, Q. Lu, T. Wang, and R. C. Elston, 2005. Comparison of microsatellites, single-nucleotide polymorphisms (SNPs) and composite markers derived from SNPs in linkage analysis. BMC Genet 6 Suppl 1:S29.

Yang, S. and R. K. Selander, 1968. Hybridization in the grackle Quiscalus quiscula in Louisiana. Syst Zool 17:107–143. 15

CHAPTER 2

ADMIXTURE, EVOLUTIONARY RELATIONSHIPS,

AND VARIATION IN REPRODUCTIVE ISOLATION

IN THE BOECHERA PUBERULA CLADE

Introduction

Among organismal biologists, hybridization is generally defined as the interbreeding of indi- viduals or populations that are distinguished by heritable characters (Harrison, 1990, 1993). If the parents of a hybrid are a poor genetic match, the offspring will show decreased fitness, and the genetic effects of the hybridization event are likely to be limited both temporally and spatially. However, hybridization also has the potential to transfer alleles into new genetic contexts through introgression and admixture (Rieseberg et al., 2007; Rieseberg and Willis, 2007; Rieseberg, 2009; Kenney et al.,

2016). Such introgressed alleles represent a potential source to increase standing genetic variation, and thus, test combinations that have worked in other contexts. On the other hand, with high rates of gene flow and low levels of reproductive isolation, introduced alleles could be transmitted in such a way that the genomes of locally adapted species with small population sizes (e.g. in alpine glacial refugia) could ultimately become extinct by being swamped by a different lineage (Levin et al.,

1995; Allendorf et al., 2001). Thus, the incorporation of new alleles into existing lineages through hybridization (i.e. admixture) has varying effects on the process of speciation, either slowing or accelerating reproductive isolation between populations through gene flow and recombination.

The study of admixture in natural populations has a long history (Lotsy, 1916; Anderson, 1953;

Barton and Hewitt, 1985; Harrison, 1990, 1993), with a strong emphasis on discerning patterns in hybridization and its evolutionary role using a diverse array of techniques. Initially, morphological and behavioral characters were used to describe hybrid zones in animals (Mecham, 1960; Yang and

Selander, 1968; Turner, 1971; Atchley and Hensleigh, 1974) as well as plants (e.g. Heiser, 1949;

Rollins, 1983), based on the indicator of phenotypic intermediacy between parental species. As new techniques emerged, it became possible to combine morphological features with biochemical data, such as allozyme markers (Moran et al., 1980) as well as cytogenetic methods (Hunter and

Markert, 1957; Feder, 1979; Baker et al., 1985). The pace of discovery in this field of inquiry 16 increased significantly with the development of molecular markers (Edwards et al., 1991; Botstein et al., 1980; Williams et al., 1990; Vos et al., 1995) (see Arnold and Emms, 1998; Rieseberg, 1998, for excellent reviews on the subject). With the recent emergence of high-throughput sequencing methods, and more specifically the development of Genotyping-by-Sequencing (GBS) methods, the study of hybridization and speciation is once more experiencing a strong resurgence (e.g Elshire et al., 2011; Parchman et al., 2012; Gompert et al., 2014). We are now able to assess patterns of introgression on a genome-wide scale. The number of loci across individual genomes has largely increased, giving us the opportunity to study admixture at a scale that greatly exceeds the genomic resolution and statistical power obtained using previous marker methods (Brumfield et al., 2003; Liu et al., 2005; Xing et al., 2005; Ball et al., 2010; Buerkle and Gompert, 2013; Gompert and Buerkle,

2013; Jeffries et al., 2015). We are now able to obtain more accurate estimates of admixture, detect more limited introgression, and measure variation in introgression among regions of the genome

(Li, 2011; Skotte et al., 2013; Gompert et al., 2014).

Certain groups of organisms are more prone to hybridization than others, and the flowering plant genus Boechera (Brassicaceae) is extraordinary in this regard. Early studies of this group

(as Arabis in Böcher, 1951; Rollins, 1981; Roy, 1995) suggested that hybridization was common.

More recent work has shown that Boechera comprises one of the most extensive and complex hybrid

networks known, including 80+ sexual diploid taxa that have interacted to form over 400 distinct

hybrid lineages containing two, three, or even four distinct genomes (Li et al., 2016). Hybridization

spans the entire genus (Alexander et al., 2013) and has occurred repeatedly and independently among

many diploid species, resulting in hybrid taxa with high genetic diversity (Koch et al., 2003; Dobeš

et al., 2004b,a). Sexual diploid Boechera species were found to be self-compatible, with low levels of

heterozygosity and high levels of inbreeding (Roy, 1995; Dobeš et al., 2004a; Song et al., 2006; Song

and Mitchell-Olds, 2007). Hybrids were mostly found to be apomictic (reproducing via unfertilized

seeds), highly heterozygous and it has been suggested that both obligate and facultative apomixis

exist in this system (Böcher, 1951; Naumova et al., 2001; Aliyu et al., 2010).

Hybridization in Boechera appears to be strongly linked to the occurrence of gametophytic

apomixis (Taraxacum-type diplosporous apomixis (Nogler, 1984)), where meiosis I fails, and meiosis 17

II results in the formation of two (rather than four) megaspores that are genetically very similar (often identical) to the sporophyte that produced them. One of these two cells degenerates, leaving the other to undergo three mitotic divisions to form a megagametophyte (Nogler, 1984; Schranz et al.,

2005; Dobeš et al., 2007; Roy, 1995; Dobeš et al., 2004a; Song et al., 2006; Song and Mitchell-Olds,

2007; Beck et al., 2011; Windham et al., 2016).

Over the last two decades, the genus Boechera has been intensively studied in regards to patterns of genomic architecture (Windsor et al., 2006; Schranz et al., 2007; Kantama et al., 2007; Mandáková et al., 2015), local and speciation (Knight et al., 2006; Anderson et al., 2011; Lee and

Mitchell-Olds, 2013; Anderson et al., 2013, 2014, 2015), hybridization and polyploidy as well as the origin and control of apomixis (Carman, 1997; Koch et al., 2003; Schmidt et al., 2015; Schranz et al., 2005; Dobeš et al., 2007; Aliyu et al., 2010; Sharbel et al., 2010; Beck et al., 2011), and population genetic differentiation of natural populations (Song and Mitchell-Olds, 2007; Song et al.,

2009). The breadth of ongoing work, coupled with known high levels of inbreeding in Boechera species and its relatively close relationship to Arabidopsis, have made the genus a valuable model system for studies of evolution and ecology (see also Rushworth et al., 2011). But to fully realize the potential of this model system, we need to better understand the patterns of admixture and selective reproductive isolation that have contributed to its evolution.

My goal is to build on this foundation by assessing genome-wide patterns of hybridization and resulting admixture while clarifying the evolutionary history of one well-supported clade, the

Boechera puberula group. As originally defined by Alexander et al. (2013), the puberula clade included five sexual diploid species: B. lasiocarpa, B. puberula, B. retrofracta, B. subpinnatifida, and B. serpenticola. The single specimen referred to as B. retrofracta in the Alexander et al. (2013) phylogenetic analysis has subsequently been reassigned to B. exilis, with the epithet retrofracta transferred to a different clade (Windham et al. unpubl.). In this study, I attempt to 1) assess evolutionary relationships of the Boechera puberula group, a monophyletic clade within the large genus Boechera (Brassicaceae), and 2) estimate admixture proportions within these species to assess patterns of gene flow and levels of reproductive isolation on a genome-wide scale. These analyses will form the basis for future work on speciation in the group. 18

Methods

Data collection and DNA extraction

I extracted DNA from leaf tissue of 107 individuals from 47 localities in spring and early summer of 2013 (see Figure 2.1 and Table 2.1 for a list of taxa and sampling localities). Leaf tissue collections made by the senior author (identified by the prefix "MS") were immediately stored in silica gel and voucher specimens are being submitted to the Intermountain Herbarium (UTC).

The 23 samples without the prefix "MS" came from air-dried herbarium specimens deposited at the herbaria indicated in Table A.1. DNA was extracted from leaf tissue following the CTAB protocol described in Beck et al. (2011).

Microsatellite markers for determination of ploidy and nominal taxa

I generated microsatellite data to determine ploidy and assign plants to nominal species. Mi- crosatellite markers were genotyped at 15 loci using the multiplex polymerase chain reaction (PCR) protocol described in Beck et al. (2011). I then determined the size of amplicons on an Applied

Biosystems 3730xl DNA Analyzer, and alleles were scored using GeneMarker version 2.6.2 (Soft- genetics, State College, PA, USA). I further inferred the ploidy level of each sample by determining the maximum number of microsatellite alleles at each locus, which has been shown to be an accurate proxy for chromosome counts in Boechera (Beck et al., 2011; Li et al., 2016). I then annotated nominal taxa based on a dataset containing roughly 4400 individuals, representing all currently known sexual diploid Boechera species (Li et al., 2016; Beck et al., 2011; Alexander et al., 2015).

Genotyping-by-Sequencing to estimate evolutionary relationships and admixture proportions

I generated Genotyping-by-Sequencing (GBS) data to resolve the evolutionary relationships

within Boechera and infer admixture proportions. Reduced- complexity, double-digest restriction

fragment-based DNA libraries were prepared for the same DNA samples, following Gompert et al.

(2014). The restriction-fragment library preparation method generally yields high numbers of loci

through the use of high-throughput sequencing platforms, as compared to traditional molecular 19 markers. For studies of admixture, we can thus expect to achieve a higher resolution across individuals’ genomes. The GBS libraries were sequenced in one lane at the University of Texas

Genomic Sequencing and Analysis Facility (Austin, TX, USA) on the Illumina HiSeq 2500 platform.

I used custom python and perl scripts (github.com/schimar/hts_tools) to parse the sequences for individual barcodes and split them by individual. Each individual was aligned to the B. stricta genome assembly (v1.2, DOE-JGI, http://phytozome.jgi.doe.gov/) using bwa aln & samse version

0.7.5 (Li and Durbin, 2009). I allowed for a maximum edit distance of 5, with a read trimming parameter of 10, the seed set to 20 and a maximum edit distance in the seed of 2. I further used samtools version 0.1.19 (Li et al., 2009) to create bam files from the resulting alignments.

Polyploids were excluded, because subsequent analyses only permit the use of diploid individuals.

In total, I considered 79 diploid individuals for further analyses (see Table 2.1).

I identified SNVs using GATK version 3.5 (McKenna et al., 2010) with ploidy set to diploid.

Using the Unified Genotyper in GATK, I set heterozygosity for prior likelihood calculation per locus to 0.001, and ignored sequences with mapping quality <20. I further set the minimum phred-scaled confidence threshold for variants to be called to 50. The resulting variants were further filtered to contain only variants with at least 128 sequences, at least 4 sequences with the alternative allele, and

I only kept the genetic variants at nucleotide sites where I had data for at least 80% of the sampled individuals and only one alternative allele. Additionally, only variants with minimum phred-scaled mapping quality of 30 and a minor allele frequency < 0.05 were retained, in order to keep only common variants.

Evolutionary history and genetic structure of the B. puberula group

To assess the evolutionary history of the B. puberula group, I performed clustering of individuals based on the posterior probabilities of the common SNVs. I estimated mean from the posterior genotype probabilities for eight putative source taxa, as obtained by entropy (see below). A mean genotype is the mean of the posterior distribution and as such is a non-integer point estimate of the number of alleles at a given locus, ranging from zero to two (with 0: homozygous for reference allele; 1: heterozygous, and 2: homozygous for the alternative allele). Because I 20 am using SNVs, we are not dealing with continuous or contiguous stretches of DNA sequences.

On the contrary, I use only variable sites that were concatenated into a string of variants for each individual. Since assuming a standard model of sequence evolution might not be accurate under these circumstances, I used distance methods instead. Consequently, branch lengths of the resulting trees can not be directly related to substitution rates, as they represent the distance matrix across individuals and SNVs, and I could not infer the timing of diversification between these taxa (see also Franzke et al., 2016). I created a neighbour-joining (NJ) tree based on a matrix of pairwise distances of the number of sites that differ between each pair of concatenated SNV sequences. The

NJ tree was constructed by using the ape package (version 3.4) (Paradis et al., 2004) in R and the distance matrix was constructed with the dist.dna function.

The common SNVs were analysed for population genetic structure using entropy, which is described in Gompert et al. (2014). This model is very similar to the correlated allele frequency admixture model in structure (Falush et al., 2003), but here, sequence coverage, sequencing error, and alignment error are explicitly included in the model. Such a procedure has been demonstrated to decrease bias when compared to called genotypes (Skotte et al., 2013). The output of entropy includes admixture proportions, genotype probabilities for all individuals at all loci and credible intervals for all estimated parameters. I performed the analysis with entropy for numbers of k of 2 to 16 putative clusters, with 6 chains for each k. In order to minimize the computional time required for entropy runs, I estimated initial mean genotypes for each individual and locus from the genotype likelihoods by using the expectation-maximization algorithm described in Li (2011).

These mean genotype estimates were used to calculate starting values of admixture proportions with the discriminant analysis of principal components (dapc) (Jombart et al., 2010) function in the

R package adegenet (Jombart, 2008; Jombart and Ahmed, 2011) for each respective number of clusters. I assessed convergence and mixing of chains using the Gelman-Rubin diagnostic (Gelman and Rubin, 1992) in coda (Plummer et al., 2006) and created the corresponding barplots of admixture proportions for all k clusters in R (R Development Core Team, 2015). The optimal number of clusters was determined by comparing the deviance information criteria (DIC) for respective chains across all k clusters. I extracted the estimated posterior genotype probabilities for eight clusters along with 21

95% confidence intervals. These genotype probabilities were used to construct a covariance matrix of mean genotypes across all common SNVs (14,815) for all 79 diploid individuals. I performed an unscaled but zero-centered principal component analysis (PCA) using the prcomp function in

R. Additionally, the estimated mean genotypes, obtained from entropy, were used to perform the

assessment of evolutionary relationships as mentioned above.

Results

Microsatellite markers for determination of ploidy and nominal taxa

Based on the maximum number of microsatellite alleles at each locus, I inferred that 79 of the

sampled individuals from 39 localities were diploid, and 28 individuals from eight localities were

triploid (see Figure 2.1 and Table 2.1). Using the analytical tools and comparative data provided

by Li et al. (2016), I determined that 64 of the diploid samples represented known sexual taxa. My

sampling included all species assigned to the B. puberula group by Alexander et al. (2013) as well as

the two most widely distributed Boechera species, B. retrofracta and B. stricta (Fig. 2.1). The other 15

diploid individuals were inferred to be hybrids, produced by crosses between B. retrofracta and three

other taxa (B. exilis, B. stricta, and B. subpinnatifida). All 28 triploid individuals showed evidence

of hybrid origins involving two or, more often, three genomes (9 and 19 samples respectively; see

Tables 2.1 and A.1).

Genotyping-by-Sequencing to determine evolutionary relationships and estimate admixture

proportions

My data comprised 57.8x106 reads from 79 individuals, with a median of 677,138 reads per

individual. I detected a total of 141,846 variants, and after quality filtering, I obtained 14,815

common high-quality variants (mean coverage per SNV per individual = 16.41, sd = 11.97). A

randomly chosen set of 10% of the samples were replicated in the GBS library, which were further

checked for consistency, and no deviations could be detected. 22

Evolutionary history and genetic structure of the B. puberula group

Trees inferred from the GBS data were generally well-resolved; the neighbor-joining tree obtained from the estimated posterior genotype probabilities for eight source taxa showed clear differentiation of all putative sexual taxa (see Fig. 2.2). As suggested by the microsatellite data (Li et al., 2016), Boechera puberula as currently circumscribed comprises two distinct monophyletic lineages, with the typical taxon ("p. puberula") occupying the northern part of the range and "p. arida" replacing it to the south (Fig. 2.1). The clade formed by these two taxa is, in turn, sister to a lineage comprising B. serpenticola and B. subpinnatifida (Fig. 2.2). The latter form distinct, monophyletic groups. A monophyletic assemblage consisting of all 17 samples of B. lasiocarpa is sister to the puberula/serpenticola/subpinnatifida lineage, and this larger clade is, in turn, sister to B. exilis. Although hybrids are not well accommodated by the bifurcating tree model, their inclusion in the phylogenetic analyses reveals an interesting pattern, where the individuals identified as hybrids are placed between the respective parental species. The next sexual diploid lineage proximal to the lineage outlined above is a monophyletic grouping of all ten samples of B. retrofracta (Fig. 2.2).

These are separated on the tree by a grade consisting of ten accessions, all of which represent hybrids between B. retrofracta and members of the larger clade. Similarly, the branch between B. retrofracta and the proximal sexual diploid B. stricta is occupied by a grade of four samples, all of which are identified as retrofracta x stricta hybrids.

The common SNVs were analyzed for population genetic structure using entropy (Gompert et al., 2014) with the number of putative source populations ranging from 2 to 16 (k = 6 to 10 shown in Fig. 2.3). Genetic variation of these SNVs was best explained by an admixture model with eight source populations (DIC = 1.348105 compared to 1.357105 with k = 7). Gelman-Rubin diagnostics

across all estimated admixture proportions and eight source populations indicated convergence of

chains (median scale reduction factor = 1.057, mean = 1.082). The absolute difference between

the lower and upper credible intervals of estimated admixture proportions across all individuals and

source populations, as obtained from entropy, had a median of 5.6510−6 (mean = 0.0102). Thus,

we can assume that the spread between these 95% credible intervals is low enough to simply use

the mean posterior estimates of admixture proportions to explain the genetic variation found in the 23 data. Figure 2.3 shows the admixture proportions for putative source populations ranging from 6 to

10, and Figure A.1 depicts the admixture proportions for all putative source populations considered in this study, with k ranging from 2 to 16.

Admixture proportions derived from the 14,815 variants indicated that there were multiple individuals with mixed ancestry (see Figure 2.3). In particular, most of the admixed individuals showed similar levels of admixture regardless of the number of source taxa assumed. In the lower ranges of k in Fig. 2.3, I found that clusters differentiated individuals into the nominal taxa of B.

stricta, B. retrofracta and B. exilis, with considerable admixture between those taxa. The puberula

group was differentiated into three groups at k = 6, comprising B. lasiocarpa, B. puberula in the broad

sense, and B. subpinnatifida/B. serpenticola. Comparing the admixture proportions at k = 8 (Fig. 2.3)

with the nominal taxa based on microsatellite data (Table 2.1), I find congruence between the two

datasets. Of the 79 samples included in both analyses, 62 (79%) showed admixture proportions

deviating from expectations based on microsatellites by no more than 5%. In another 12 samples

(15%), the taxa identified as contributing genetic material to a particular accession coinciding

with microsatellite-based identifications constituted 70-94% of admixture proportions. Only five

samples (6%) yielded admixture proportions that were strongly at-odds with microsatellite-based

identifications.

When considering the model-free approach to describing genetic variation across samples, the

majority (95.2%) of genetic variation was explained by the first three principal components (PCs),

with the first two PCs accounting for 83.2% of the variation (see Fig. 2.4). Interestingly, the first

principal component (with 69.7% explained variation) separated the two lineages B. retrofracta and

B. stricta from the remaining lineages considered here (B. puberula puberula, B. puberula arida, B.

exilis, B. serpenticola, B. subpinnatifida, and B. lasiocarpa), with admixed individuals positioned between the two groups. Principal component 2 (13.5% explained variation) separated B. stricta from B. retrofracta, with admixed individuals having intermediate scores on this PC. Additionally,

PC 2 separated the puberula group lineages from each other, with B. exilis located between the other members of the puberula group. On principal component 3, we can see a similar pattern, where the spread between the members of the puberula group was wider, yet B. exilis was positioned between 24 those. Furthermore, on PC 3, B. stricta and B. retrofracta were placed on opposite ends of the spectrum. None of the admixed individuals showed extreme PC scores on any of the 3 PCs, but they all showed intermediate values. On PC 2, and more strongly on PC 3, I found B. lasiocarpa individuals to be somewhat distinct from the remainder of the puberula group taxa.

Discussion

In this study, I attempted to clarify the evolutionary history and patterns of admixture among eight of the 80+ sexual diploid members of the genus Boechera. In order to achieve high resolution

on a genome-wide scale, I used a GBS approach which allowed us to examine genetic variation

across 14,815 common high-quality SNVs in the taxa studied. My results support the monophyly

of a cluster of six taxa, closely approximating the B. puberula species group first identified by

Alexander et al. (2013). In the neighbour-joining tree (Fig. 2.2). B. puberula (which forms two

discrete clusters referred to as puberula puberula and puberula arida) is sister to a clade with two

somewhat more divergent taxa, B. subpinnatifida and B. serpenticola. Sister to these core puberula

taxa are the more distantly related members of the group, B. lasiocarpa and B. exilis (Fig. 2.2). Our results are congruent with the parsimony analysis of DNA sequences from seven nuclear loci (Fig.

4 in Alexander et al. (2013)), but show much improved resolution of species relationships within the group .

Two nomenclatural adjustments are necessary to allow direct comparison between my NJ tree and the previously published cladogram of the B. puberula group. With reference to Fig. 4 of

Alexander et al. (2013), a single accession, identified as "B. subpinnatifida", has been shown, based on recent microsatellite analyses, to represent B. puberula puberula, and the two specimens called

"B. puberula" are now classified as B. puberula arida (Li et al., 2016). With these annotations, the two evolutionary trees of the B. puberula species group are seen to be consistent at the species level.

The more extensive sampling achieved in this study (incorporating more loci, more individuals, and all six taxa) significantly improves our understanding of relationships within the group. In the

Alexander et al. (2013) tree, the only resolution within the B. puberula group involved the strong association between the two B. serpenticola accessions, the equally strong association of the two B. 25 puberula (now puberula arida) samples, and a sister relationship between the latter and what is now referred to as puberula puberula. These two clades formed a polytomy with B. lasiocarpa and B. exilis. My tree, on the other hand, is fully resolved (Fig. 2.2), with an arida/puberula clade sister to a serpenticola/subpinnatifida clade, B. lasiocarpa sister to this core group, and B. exilis sister to the rest. The discovery that B. puberula consists of two lineages (arida and puberula) which seem to be able to hybridize, yet maintain a clear distinction regarding their genetic variation, is a novel finding. This distinction was apparent in the PCA, the admixture analyses as well as the evolutionary placement within the puberula group (see Figures 2.3, 2.4, and 2.2). Based on the microsatellite dataset presented by Li et al. (2016), typical B. puberula puberula occurs in Oregon,

Utah, Idaho, and Nevada, whereas B. puberula arida occurs in Oregon, Utah, Idaho and Nevada

(Fig. 2.1). I have not observed any mixed populations but, based on their known distributions and habitat requirements, they are likely to be sympatric somewhere near the 42nd parallel in Nevada and Oregon.

When assessing admixture among the presented taxa and clustering into groups, the placement of most individuals corresponded well with the nominal taxa obtained from microsatellites. I did, however, discover apparent admixture among members of the puberula group and the more distantly- related taxa included in this study. I was able to find signatures of admixture extending beyond the

B. puberula group. Individual admixture proportions were found that seem to be the result of both ancient and more recent admixture events. For instance, formerly identified hybrids between

B. stricta and B. retrofracta (based on the microsatellite search heuristic used by e.g. Beck et al.

(2011); Alexander et al. (2015)) as well as B. retrofracta and B. exilis, showed admixture proportions of approximately 50% from each of the pairs (see Figure 2.3). These hybrids could likely represent recently admixed individuals, with more or less equal admixture proportions. Recent admixture in these individuals was further supported by the intermediate placement between the two taxa in the

PCA analysis (see Fig. 2.4A), without deviation of hybrids from the axis formed by the two lineages

(Gompert and Buerkle, 2016). I found extensively admixed individuals among the members of the puberula group, which deviate strongly from 50/50 admixture proportions of putative parental taxa and could represent older admixture events (see Figure 2.3). 26

I further found that individuals of B. stricta, B. retrofracta and B. exilis experienced admixture from all members of the puberula group. More generally, I observed that all lineages considered in this study were involved in admixture events. Interestingly, no individuals in the puberula group

(group 2 in Fig. 2.3) shared roughly equal admixture proportions in a pairwise manner, suggesting

no signs of recent admixture within this clade, given the data presented here. In contrast, most

individuals in this group predominantly adhered to one taxon. Thus, based on my results, it appears

that gene flow is reduced between these closely-related taxa. This would seem plausible for the

taxa that do not occur in sympatry, such as B. lasiocarpa and B. serpenticola (see also Kiefer and

Koch, 2012), but seems rather surprising given the wide geographic distribution of B. p. puberula,

B. p. arida, and B. subpinnatifida, as well as flowering times overlap among taxa belonging to the

puberula group (Al-Shehbaz and Windham, 2010; Li et al., 2016), combined with otherwise clear

signs of admixture in more distantly related taxa (group 1 in Fig. 2.3). Widespread hybridization has

repeatedly been reported in the genus (e.g. Roy, 1995; Schranz et al., 2005; Beck et al., 2011), yet

it seems as if some lineages are more prone to viable hybridization and introgression than others. I

see signs of differential admixture, both in magnitude and directionality. Thus, the accumulation of

incompatibilities during speciation (Abbott et al., 2013; Harrison and Larson, 2014) does not seem

to occur linearly with time since divergence in this system. Several models for the accumulation

of incompatibilities through pre- and postmating isolation have been proposed, such as 1) linear

accumulation of incompatibilities with time, 2) increasing speed of the accumulation with time (or the

"snowball" effect), which is suggested under the DM-incompatibility theory (Orr and Turelli, 2001),

or 3) a "slowdown" model (Gourbière and Mallet, 2010), where reinforcement results in decelerated

speed of accumulation of RI, due to as a result of preference and mating

system variation (Rettelbach et al., 2016). Results of studies investigating the temporal patterns of

the accumulation of different types of reproductive isolating barriers are mixed in animals (Barton

and Charlesworth, 1984; Orr, 1995; Presgraves, 2003; Presgraves et al., 2003; Fitzpatrick, 2004;

Mendelson and Inouye, 2004; Edwards et al., 2005; Bolnick and Near, 2005; Gourbière and Mallet,

2010; Stelkens et al., 2010; Städler et al., 2012; Presgraves, 2013; Muirhead and Presgraves, 2016),

and plants (Stelkens et al., 2010; Städler et al., 2012; Burkart-Waco et al., 2012; Lowry et al., 2008). 27

It would seem plausible that the members of the puberula group suffered low effective population sizes, thus leading to high population-differentiation within each of the lineages due to .

Fixation of alleles due to genetic drift in small populations, in combination with reported low levels of heterozygosity in the genus due to inbreeding (Song and Mitchell-Olds, 2007; Schranz et al.,

2005; Beck et al., 2011; Lovell et al., 2013) could likely lead to strong genetic differentiation of

Boechera lineages. On the other hand, genomic rearrangements, which are known to occur in the genus (Kantama et al., 2007; Mandáková et al., 2015), in combination with high levels of inbreeding, could have led to very rapid and variable accumulation of isolating barriers in select species, which could explain the patterns we observed in this system. In triploids that were included in this study,

I detected admixture of members of the B. puberula group with other taxa, and I assume that those represent apomictic individuals, resulting from hybridization events, as it has been shown that other triploid Boechera hybrids predominantly reproduce asexually (Schranz et al., 2005; Lovell et al.,

2013; Alexander et al., 2015). This assumption, however, will have to be tested to ascertain whether

it applies to the B. puberula group in particular. Additionally, it is not clear whether apomictic

individuals introgress into sexual lineages through facultative sexuality (Asker and Jerling, 1992;

Sharbel and Mitchell-Olds, 2001), or whether these apomicts remain mostly isolated from sexual

lineages (Mau et al., 2015). We have yet to develop reasonable estimates of divergence times within

the family Brassicaceae, let alone the genus Boechera (Franzke et al., 2016). Despite the lack of

temporal resolution, my results suggest variable rates of the accumulation of incompatibilities in the

genus Boechera.

Further research is needed, such as formal tests for the accumulation of reproductive barriers

between the presented lineages, the examination of individual lineages in this group on a population

genetic level, crossing experiments to gain an understanding of pre- and postzygotic isolating barriers

and their relative contribution to overall reproductive isolation, determination of reproductive modes

(sexual vs. asexual) as well as the (selfing vs. outcrossing) of sexually reproducing

individuals (Barrett, 2014). Currently, analyses of B. lasiocarpa populations are underway, which

will further shed light on the maintenance of genetic variation in this lineage, as well as patterns

of admixture with other taxa in this complex genus. Members of the B. puberula group represent 28 feasible models for the study of speciation in sympatry, allopatry, and parapatry, adaptation to specific soils (e.g. calciferous and serpentenoid), the roles of reproductive modes and mating system in the maintenance of genetic diversity, as well as chromosomal rearrangements after polyploidization events and whole-genome duplications (Lien et al., 2016; Mandakova et al., 2016). Studying these components will bring us closer to understanding the processes that drive speciation as well as the maintenance of genetic diversity in structured populations with differential reproductive modes. 29

Figures & tables

Figure 2.1. Sampling locations and distribution ranges of select Boechera taxa in western United States. A) Sampling locations of diploid and triploid Boechera populations including enlargement of UT (see locality numbers and nominal taxa in Table 2.1). Note that I added a small amount of noise to GPS coordinates, in order to make all of the locality numbers visible. Exact geographic coordinates are provided in Table 2.1. B) Documented distributions of Boechera species across the western United States. Maps include only sexual diploids identified by their epithets. Note that these county-level distributions are based solely on specimens whose identification has been confirmed by microsatellite studies (Li et al., 2016). Both B. retrofracta and B. stricta have wider distributions across North America. 30 B. retrofracta and B. stricta group members as well as B. puberula Figure 2.2. Admixture proportionsbased on for common eight variants taxa (n= and 14,815) neighbor-joining tree of 31 (sexual) (sexual) (sexual) (sexual) B. stricta B. stricta B. stricta B. stricta B. stricta B. stricta B. stricta B. retrofracta x stricta B. retrofracta x subpinnatifida B. retrofracta B. retrofracta B. retrofracta retrofracta B. retrofracta retrofracta B. retrofracta retrofracta B. retrofracta retrofracta B. exilis x retrofracta B. exilis x retrofracta B. exilis B. exilis B. exilis B. exilis B. puberula puberula B. puberula B. puberula puberula B. puberula B. puberula puberula B. puberula arida Continued on next page locality samples12 locality3 1-3,84 45 56 6 Grizzly7 Peak, UT 78 99 La Plata, longitude CO 1010 Weber, UT -111.97 latitude 11,13,14,2511 Nye, NV Steep ploidy 12,24,26 Canyon,12 UT Custer, ID 41.41 15,18,19 nominal taxon 13 Teton, WY 16,21,47,75 Deadfall Madison, Lake, MO14 2 CA Hat Little -108.02 17 Creek, Volcano, CA CA15 -111.6 20 -111.5916 37.44 2217 -122.52 41.97 23 41.41 -117.35 -120.89 2 18 -114.65 Park, 27,31,33,34 -111.96 WY 41.33 -110.5219 -121.41 2 2 38.95 39.86 Bear Mineral, 28,29,30,32 Lake MO Summit, 43.86 UT20 45.56 Wells, Deschutes, 35 2 43.85 NV OR 40.7 2 21 2 Humboldt, -111.47 2 36 CA 2 22 2 37 2 23 41.93 3824 -115.7 Elko, -110.57 39 NV -121.56 2 25 Summit, 40,41,48,49 UT -123.6526 44.41 -114.57 Water Nye, 47.45 42 43.67 Canyon, NV NV27 Millard, 43,45,46 40.48 UT 2 41.08 2 2 Baker, 44 OR 2 Lye 50 Creek, 2 -116.71 NV Box -115.08 Elder, -111.4078 UT 40.64 40.7753 40.68 2 Humboldt, NV -117.54 -112.27 2 Box Elder, UT -117.54 2 -117.11 38.97 38.95 -113.94 41.69 44.7 2 2 41.77 -117.55 2 2 -113.69 2 41.67 41.53 2 2 Table 2.1. Locality information fordetermined Fig. from 2.1, microsatellite sample data. numbers Complete for locality Fig. and 2.3 sample and information can Fig. be 2.2 found as in well Table as A.1 ploidy and nominal taxa. The latter two were both 32 (2:1) (2:1) (2:1) (holotype) B. puberula arida B. puberula arida B. serpenticola B. subpinnatifida B. lasiocarpa B. lasiocarpa B. lasiocarpa B. lasiocarpa B. lasiocarpa B. lasiocarpa B. lasiocarpa B. lasiocarpa B. exilis x puberula xB. retrofracta retrofracta x sparsiflora B. exilis x retrofracta x sparsiflora B. retrofracta x lasiocarpa B. puberula arida x subpinnatifida B. exilis x puberula xB. retrofracta lasiocarpa x retrofracta x stricta B. lasiocarpa x lemmonii x stricta Continued from previous page Table 2.1 – locality samples2829 locality30 5131 5232 53-5833 59-6234 Mono, 63 CA35 Bully Lander, 64,65 Choop NV Mtn, CA36 Rogue 66-71 River, longitude OR37 72-74 latitude -122.9438 Cache, Rich, 76 UT UT ploidy39 Logan 40.65 77 Canyon Sinks, nominal UT taxon 40 -123.53 -119.13 Steam 78 Mill Peak, -111.48 -117.37 UT41 2 79 42.55 38.3642 Box 41.93 39.24 - Elder, UT43 -111.61 James - 2 2 Peak, UT 2 -111.66 2 Salt - Lake, -111.46 UT 41.95 Tooele, - UT 41.91 41.92 2 Angel -111.98 Lake, NV 2 -111.78 Frenchman 2 Lake, CA 41.39 Indian Creek, -111.72 41.38 NV James Peak, UT -120.18 2 -112.62 40.63 2 -115.07 39.87 40.48 -117.55 2 41.02 3 -111.78 2 41.65 3 41.38 3 3 44 - Peavine Peak, NV -119.93 39.59 3 454647 - - - Shoshone Mtns, NV Steam Mill Peak, UT Willard Peak, UT -116.86 -111.61 40.42 41.95 -111.97 3 3 41.39 3 33 Figure 2.3. Admixture proportionseach based respective individual, on and 14,815 thus common theare variants. proportion shown here, of with inheritance Each k of bar =of each represents 8 2 genome being Bayesian through to the point 16 the best are estimates respective model shown species. of based in on Results admixture Figure DIC of proportions A.1. values. 6-10 for presumed Groups 1 source & species 2 denote groups with differential admixture patterns. Results for k 34 Figure 2.4. Statistical summary of geneticB) variation PC2 based against on PC3, principal with component colors analysisboth assigned of figures. 14,815 to common Hybrids individuals variants, with with with posterior less A)particular estimated than PC1 species admixture 95% against proportions are admixture PC2, drawn above proportions and 95%, as are where colored drawn the as circles legend crosses, in in and the B hybrids respective is with color used more with for than a 50% cross. admixture proportions from a 35

LITERATURE CITED

Abbott, R., D. Albach, S. Ansell, J. W. Arntzen, S. J. E. Baird, N. Bierne, J. Boughman, A. Brelsford, C. A. Buerkle, R. Buggs, R. K. Butlin, U. Dieckmann, F. Eroukhmanoff, A. Grill, S. H. Cahan, J. S. Hermansen, G. Hewitt, a. G. Hudson, C. Jiggins, J. Jones, B. Keller, T. Marczewski, J. Mallet, P. Martinez-Rodriguez, M. Möst, S. Mullen, R. Nichols, a. W. Nolte, C. Parisod, K. Pfennig, a. M. , M. G. Ritchie, B. Seifert, C. M. Smadja, R. Stelkens, J. M. Szymura, R. Väinölä, J. B. W. Wolf, and D. Zinner, 2013. Hybridization and speciation. J Evol Biol 26:229–46.

Al-Shehbaz, I. A. and M. D. Windham, 2010. Boechera. Pp. 347–412, in Flora of North America Committee, ed. Flora of North America, North of Mexico, 7th ed. Oxford University Press., New York and Oxford.

Alexander, P. J., M. D. Windham, J. B. Beck, I. A. Al-Shehbaz, and C. D. Bailey, 2013. Molec- ular phylogenetics and taxonomy of the genus Boechera and related genera (Brassicaceae: Boechereae). Syst Bot 38:192–209.

———, 2015. Weaving a tangled web: Divergent and reticulate speciation in Boechera fendleri sensu lato (Brassicaceae: Boechereae). Syst Bot 40:572–596.

Aliyu, O. M., M. E. Schranz, and T. F. Sharbel, 2010. Quantitative variation for apomictic repro- duction in the genus Boechera (Brassicaceae). Am J Bot 97:1719–1731.

Allendorf, F. W., R. F. Leary, P. Spruell, and J. K. Wenburg, 2001. The problems with hybrids: Setting conservation guidelines. Trends Ecol Evol 16:613–622.

Anderson, E., 1953. Introgressive hybridization. Biological Reviews 28:280–307.

Anderson, J. T., C.-R. Lee, and T. Mitchell-olds, 2011. Life history and natural selection on flowering time in Boechera stricta, a perennial relative of Arabidopsis. Evolution 65:771–787.

Anderson, J. T., C.-R. Lee, and T. Mitchell-Olds, 2014. Strong selection genome-wide enhances fitness trade-offs across environments and episodes of selection. Evolution 68:16–31.

Anderson, J. T., C.-R. Lee, C. A. Rushworth, R. I. Colautti, and T. Mitchell-Olds, 2013. Genetic trade-offs and conditional neutrality contribute to local adaptation. Mol Ecol 22:699–708.

Anderson, J. T., N. Perera, B. Chowdhury, and T. Mitchell-Olds, 2015. Microgeographic pat- terns of genetic divergence and adaptation across environmental gradients in Boechera stricta (Brassicaceae). Am Nat 186.

Arnold, M. L. and S. K. Emms, 1998. Molecular markers, gene flow, and natural selection. Pp. 442–458, in Molecular systematics of plants II: DNA sequencing. Springer Science + Business Media, New York.

Asker, S. and L. Jerling, 1992. Apomixis in Plants. CRC Press, Boca Raton and London.

Atchley, W. R. and D. A. Hensleigh, 1974. The congruence of morphometric shape in relation to genetic divergence in four races of morabine grasshoppers (Orthoptera: Eumastacidae). Evolution 28:416–427. 36

Baker, R. J., J. W. Bickham, and M. L. Arnold, 1985. Chromosomal evolution in Rhogessa (Chiroptera: Vespertilionidae): Possible speciation by centric fusions. Evolution 39:233–243.

Ball, A. D., J. Stapley, D. a. Dawson, T. R. Birkhead, T. Burke, and J. Slate, 2010. A comparison of SNPs and microsatellites as linkage mapping markers: lessons from the zebra finch (Taeniopygia guttata). BMC Genomics 11:218.

Barrett, S. C. H., 2014. Evolution of mating systems: Outcrossing versus selfing. Pp. 356–362, in The Princeton Guide to Evolution. Princeton University Press.

Barton, N. H. and B. Charlesworth, 1984. Genetic revolutions, founder effects, and speciation. Annu Rev Ecol Syst 15:133–164.

Barton, N. H. and G. M. Hewitt, 1985. Analysis of hybrid zones. Annu Rev Ecol Syst 16:113–148.

Beck, J. B., P. J. Alexander, L. Allphin, I. A. Al-Shehbaz, C. Rushworth, C. D. Bailey, and M. D. Windham, 2011. Does hybridization drive the transition to asexuality in diploid Boechera? Evolution 66:985–95.

Böcher, T. W., 1951. Cytological and embryological studies in the amphi-apomictic Arabis holboellii complex. Kongel.Danske Vidensk.-Selskab.Biol.Skr. 6:1–59.

Bolnick, D. I. and T. J. Near, 2005. Tempo of in centrarchid fishes (Teleostei: Centrarchidae). Evolution 59:1754–1767.

Botstein, D., R. L. White, M. Skolnick, and R. W. Davis, 1980. Construction of a genetic linkage map in man using restriction fragment length polymorphisms. Am J Hum Genet 32:314–31.

Brumfield, R. T., P. Beerli, D. a. Nickerson, and S. V. Edwards, 2003. The utility of single nucleotide polymorphisms in inferences of population history. Trends Ecol Evol 18:249– 256.

Buerkle, C. A. and Z. Gompert, 2013. Population genomics based on low coverage sequencing: how low should we go? Mol Ecol 22:3028–3035.

Burkart-Waco, D., C. Josefsson, B. Dilkes, N. Kozloff, O. Torjek, R. Meyer, T. Altmann, and L. Comai, 2012. in Arabidopsis is determined by a multiple-locus genetic network. Plant Physiol 158:801–12.

Carman, J., 1997. Asynchronous expression of duplicate genes in angiosperms may cause apomixis, bispory, tetraspory, and polyembryony. Biol J Linn Soc 61:51–94.

Dobeš, C., T. Mitchell-Olds, and M. A. Koch, 2004a. Intraspecific diversification in North American Boechera stricta (= Arabis drummondii), Boechera x divaricarpa, and Boechera holboellii (Bras- sicaceae) inferred from nuclear and chloroplast molecular markers - An integrative approach. Am J Bot 91:2087–2101.

Dobeš, C., T. F. Sharbel, and M. Koch, 2007. Towards understanding the dynamics of hybridization and apomixis in the evolution of the genus Boechera (Brassicaceae). Syst Biodivers 5:321–331.

Dobeš, C. H., T. Mitchell-Olds, and M. A. Koch, 2004b. Extensive chloroplast haplotype variation indicates Pleistocene hybridization and radiation of North American Arabis drummondii, A. x divaricarpa, and A. holboellii (Brassicaceae). Mol Ecol 13:349–370. 37

Edwards, A., A. Civitello, H. A. Hammond, and C. T. Caskey, 1991. DNA typing and genetic mapping with trimeric and tetrameric tandem repeats. Am J Hum Genet 49:746–756.

Edwards, S. V., S. B. Kingan, J. D. Calkins, C. N. Balakrishnan, W. B. Jennings, W. J. Swanson, and M. D. Sorenson, 2005. Speciation in birds: genes, geography, and . Proc Natl Acad Sci USA 102 Suppl:6550–6557.

Elshire, R. J., J. C. Glaubitz, Q. Sun, J. a. Poland, K. Kawamoto, E. S. Buckler, and S. E. Mitchell, 2011. A robust, simple genotyping-by-sequencing (GBS) approach for high diversity species. PLoS ONE 6:e19379.

Falush, D., M. Stephens, and J. K. Pritchard, 2003. Inference of population structure using multilocus genotype data: linked loci and correlated allele frequencies. Genetics 164:1567–87.

Feder, J. H. ., 1979. Natural hybridization and genetic divergence between the toads Bufo boreas and Bufo punctatus. Evolution 33:1089–1097.

Fitzpatrick, B. M., 2004. Rates of evolution of hybrid inviability in birds and . Evolution 58:1865–1870.

Franzke, A., M. A. Koch, and K. Mummenhoff, 2016. Turnip time travels: Age estimates in Brassicaceae. Trends Plant Sci .

Gelman, A. and D. B. Rubin, 1992. lnference from iterative simulation using multiple sequences. Statistical Science 7:457–472.

Gompert, Z. and C. A. Buerkle, 2013. Analyses of genetic ancestry enable key insights for molecular ecology. Mol Ecol 22:5278–94.

———, 2016. What, if anything, are hybrids: enduring truths and challenges associated with population structure and gene flow. Evol Appl .

Gompert, Z., L. K. Lucas, C. A. Buerkle, M. L. Forister, J. A. Fordyce, and C. C. Nice, 2014. Admixture and the organization of genetic diversity in a butterfly species complex revealed through common and rare genetic variants. Mol Ecol 23:4555–4573.

Gourbière, S. and J. Mallet, 2010. Are species real? The shape of the species boundary with exponential failure, reinforcement, and the "missing snowball". Evolution 64:1–24.

Harrison, R. G., 1990. Hybrid zones: window on evolutionary process. Oxf Surv Evol Biol 7:69–128.

———, 1993. Hybrid zones and the evolutionary process. Oxford University Press.

Harrison, R. G. and E. L. Larson, 2014. Hybridization, introgression, and the nature of species boundaries. J Hered 105:795–809.

Heiser, C. B., 1949. Study in the evolution of the sunflower species Helianthus annuus and H. bolanderi. Univ Calif Publ Bot.

Hunter, R. . L. . and C. . L. . Markert, 1957. Histochemical demonstration of enzymes separated by zone electrophoresis in starch gels. Science 125:1294–1295. 38

Jeffries, D. L., G. H. Copp, L.-J. Lawson Handley, H. Olsén, C. D. Sayer, and B. Hänfling, 2015. Comparing RADseq and microsatellites to infer complex phylogeographic patterns, a real data informed perspective in the Crucian carp, Carassius carassius, L. bioRxiv P. 025973.

Jombart, T., 2008. adegenet: a R package for the multivariate analysis of genetic markers. Bioinfor- matics 24:1403–5.

Jombart, T. and I. Ahmed, 2011. adegenet 1.3-1: new tools for the analysis of genome-wide SNP data. Bioinformatics 27:3070–1.

Jombart, T., S. Devillard, and F. Balloux, 2010. Discriminant analysis of principal components: a new method for the analysis of genetically structured populations. BMC Genet 11:94.

Kantama, L., T. F. Sharbel, M. E. Schranz, T. Mitchell-Olds, S. de Vries, and H. de Jong, 2007. Diploid apomicts of the Boechera holboellii complex display large-scale chromosome substitutions and aberrant chromosomes. Proc Natl Acad Sci USA 104:14026–14031.

Kenney, A. M., A. L. Sweigart, and A. M. Kenney, 2016. Reproductive isolation and introgression between sympatric Mimulus species. Mol Ecol .

Kiefer, C. and M. A. Koch, 2012. A continental-wide perspective: the genepool of nuclear encoded ribosomal DNA and single-copy gene sequences in North American Boechera (Brassicaceae). PLoS ONE 7:e36491.

Knight, C. A., H. Vogel, J. Kroymann, A. Shumate, H. Witsenboer, and T. Mitchell-Olds, 2006. Ex- pression profiling and local adaptation of Boechera holboellii populations for water use efficiency across a naturally occurring water stress gradient. Mol Ecol 15:1229–1237.

Koch, M., I. A. Al-Shehbaz, and K. Mummenhoff, 2003. Molecular systematics, evolution, and population biology in the mustard family (Brassicaceae). Ann. Missouri Bot. Gard. 90:151–171.

Lee, C.-R. and T. Mitchell-Olds, 2013. Complex trait divergence contributes to environmental niche differentiation in ecological speciation of Boechera stricta. Mol Ecol .

Levin, D. A., J. Francisco-Ortega, and R. K. Jansen, 1995. Hybridization and the extinction of rare plant species. Conserv Biol 10:10–16.

Li, F.-W., C. A. Rushworth, J. B. Beck, and M. D. Windham, 2016. Boechera Microsatellite Website: an online portal for species identification and determining hybrid parentage. Database: in press .

Li, H., 2011. A statistical framework for SNP calling, mutation discovery, association mapping and population genetical parameter estimation from sequencing data. Bioinformatics 27:2987–93.

Li, H. and R. Durbin, 2009. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25:1754–60.

Li, H., B. Handsaker, A. Wysoker, T. Fennell, J. Ruan, N. Homer, G. Marth, G. Abecasis, and R. Durbin, 2009. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25:2078– 9. 39

Lien, S., B. F. Koop, S. R. Sandve, J. R. Miller, P. Matthew, J. S. Leong, D. R. Minkley, A. Zimin, F. Grammes, H. Grove, A. Gjuvsland, B. Walenz, R. A. Hermansen, K. V. Schalburg, E. B. Rondeau, A. D. Genova, J. K. A. Samy, and J. O. Vik, 2016. The Atlantic salmon genome provides insights into rediploidization. Nature 6020.

Liu, N., L. Chen, S. Wang, C. Oh, and H. Zhao, 2005. Comparison of single-nucleotide polymor- phisms and microsatellites in inference of population structure. BMC Genet 6:26.

Lotsy, J. P., 1916. Evolution by means of hybridization. The Hague, Nijhoff.

Lovell, J. T., O. M. Aliyu, M. Mau, M. E. Schranz, M. Koch, C. Kiefer, B.-H. Song, T. Mitchell-Olds, and T. F. Sharbel, 2013. On the origin and evolution of apomixis in Boechera. Plant Reprod 26:309–15.

Lowry, D. B., J. L. Modliszewski, K. M. Wright, C. a. Wu, and J. H. Willis, 2008. The strength and genetic basis of reproductive isolating barriers in flowering plants. Phil Trans R Soc Lond B Biol Sci 363:3009–3021.

Mandakova, T., A. D. Gloss, N. K. Whiteman, and M. A. Lysak, 2016. How diploidization turned a tetraploid into a pseudotriploid. Am J Bot 103:1–10.

Mandáková, T., M. E. Schranz, T. F. Sharbel, H. de Jong, and M. A. Lysak, 2015. Karyotype evolution in apomictic Boechera and the origin of the aberrant chromosomes. Plant Journal 82:785–793.

Mau, M., J. T. Lovell, J. M. Corral, C. Kiefer, M. a. Koch, O. M. Aliyu, and T. F. Sharbel, 2015. Hybrid apomicts trapped in the ecological niches of their sexual ancestors. Proc Natl Acad Sci USA P. 201423447.

McKenna, A., M. Hanna, E. Banks, A. Sivachenko, K. Cibulskis, A. Kernytsky, K. Garimella, D. Altshuler, S. Gabriel, M. Daly, and M. a. DePristo, 2010. The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res 20:1297–303.

Mecham, J. S., 1960. Introgressive hybridization between two southeastern treefrogs. Evolution 14:445–457.

Mendelson, T. C. and B. D. Inouye, 2004. Quantifying Patterns in the Evolution of Reproductive Isolation. Evolution 58:1424–1433.

Moran, C., P. Wilkinson, and D. Shaw, 1980. Allozyme variation across a narrow hybrid zone in the grasshopper, Caledia captiva. Heredity 44:69–81.

Muirhead, C. A. and D. C. Presgraves, 2016. Hybrid incompatibilities, local adaptation, and the genomic distribution of natural introgression between species. Am Nat 187:249–261.

Naumova, T. N., J. Van der Laak, J. Osadtchiy, F. Matzk, A. Kravtchenko, J. Bergervoet, K. S. Ramulu, and K. Boutilier, 2001. Reproductive development in apomictic populations of Arabis holboellii (Brassicaceae). Sex Plant Reprod 14:195–200. 40

Nogler, G. A., 1984. Gametophytic apomixis. chap. 10, Pp. 475–518, in B. M. Johri, ed. Embryology of Angiosperms. Springer.

Orr, H. A., 1995. The of speciation: The evolution of hybrid incompatibilities. Genetics 139:1805–1813.

Orr, H. A. and M. Turelli, 2001. The evolution of postzygotic isolation: Accumulating Dobzhansky- Muller incompatibilities. Evolution 55:1085–1094.

Paradis, E., J. Claude, and K. Strimmer, 2004. APE: Analyses of Phylogenetics and Evolution in R language. Bioinformatics 20:289–290.

Parchman, T. L., Z. Gompert, J. Mudge, F. D. Schilkey, C. W. Benkman, and C. A. Buerkle, 2012. Genome-wide association genetics of an adaptive trait in lodgepole pine. Mol Ecol 21:2991–3005.

Plummer, M., N. Best, K. Cowles, and K. Vines, 2006. CODA: convergence diagnosis and output analysis for MCMC. R News 6:7–11.

Presgraves, D. C., 2003. A fine-scale genetic analysis of hybrid incompatibilities in Drosophila. Genetics 163:955–972.

———, 2013. Hitchhiking to speciation. PLoS Biology 11:1–3.

Presgraves, D. C., L. Balagopalan, S. M. Abmayr, and H. A. Orr, 2003. Adaptive evolution drives divergence of a hybrid inviability gene between two species of Drosophila. Nature 423:715–719.

R Development Core Team, 2015. An Introduction to R. R Foundation for Statistical Computing, Vienna, Austria. URL http://www.r-project.org.

Rettelbach, A., M. Servedio, and J. Hermisson, 2016. Speciation in peripheral populations: Effects of drift load and mating systems. J Evol Biol .

Rieseberg, L. H., 1998. Genetic Mapping as a Tool for Studying Speciation. chap. 16, Pp. 459–487, in Molecular Systematics of Plants II: DNA sequencing. Springer Science + Business Media, New York.

———, 2009. Evolution: Replacing genes and traits through hybridization. Curr Biol 19:R119–22.

Rieseberg, L. H., S.-C. Kim, R. a. Randell, K. D. Whitney, B. L. Gross, C. Lexer, and K. Clay, 2007. Hybridization and the colonization of novel habitats by annual sunflowers. Genetica 129:149–65.

Rieseberg, L. H. and J. H. Willis, 2007. Plant Speciation. Science 317:910–4.

Rollins, R. C., 1981. Studies on Arabis (Cruciferae) of western North America. Syst Bot 6:55–64.

———, 1983. Interspecific hybridization and taxon uniformity in Arabis (Cruciferae). Am J Bot 70:625–634.

Roy, B. A., 1995. The breeding systems of six species of Arabis (Brassicaceae). Am J Bot 82:869– 877.

Rushworth, C. A., B.-H. Song, C.-R. Lee, and T. Mitchell-Olds, 2011. Boechera, a model system for ecological genomics. Mol Ecol 20:4843–57. 41

Schmidt, A., M. W. Schmid, and U. Grossniklaus, 2015. Plant germline formation: common concepts and developmental flexibility in sexual and . Development 142:229–241.

Schranz, M. E., C. Dobeš, M. A. Koch, and T. Mitchell-Olds, 2005. Sexual reproduction, hy- bridization, apomixis, and polyploidization in the genus Boechera (Brassicaceae). Am J Bot 92:1797–1810.

Schranz, M. E., B.-H. Song, A. J. Windsor, and T. Mitchell-Olds, 2007. Comparative genomics in the Brassicaceae: a family-wide perspective. Curr Opin Plant Biol 10:168–75.

Sharbel, T. F. and T. Mitchell-Olds, 2001. Recurrent polyploid origins and chloroplast phylogeog- raphy in the Arabis holboellii complex (Brassicaceae). Heredity 87:59–68.

Sharbel, T. F., M.-L. Voigt, J. M. Corral, G. Galla, J. Kumlehn, C. Klukas, F. Schreiber, H. Vogel, and B. Rotter, 2010. Apomictic and sexual ovules of Boechera display heterochronic global gene expression patterns. Plant Cell 22:655–71.

Skotte, L., T. S. Korneliussen, and A. Albrechtsen, 2013. Estimating individual admixture propor- tions from Next-generation sequencing data. Genetics Pp. 1–28.

Song, B.-H., M. J. Clauss, A. Pepper, and T. Mitchell-Olds, 2006. Geographic patterns of mi- crosatellite variation in Boechera stricta, a close relative of Arabidopsis. Mol Ecol 15:357–69.

Song, B.-H. and T. Mitchell-Olds, 2007. High genetic diversity and population differentiation in Boechera fecunda, a rare relative of Arabidopsis. Mol Ecol 16:4079–88.

Song, B.-H., A. J. Windsor, K. J. Schmid, S. Ramos-Onsins, M. E. Schranz, A. J. Heidel, and T. Mitchell-Olds, 2009. Multilocus patterns of nucleotide diversity, population structure and linkage disequilibrium in Boechera stricta, a wild relative of Arabidopsis. Genetics 181:1021–33.

Städler, T., A. M. Florez-Rueda, and M. Paris, 2012. Testing for ’snowballing’ hybrid incompati- bilities in Solanum: Impact of ancestral polymorphism and divergence estimates. Mol Biol Evol 29:31–34.

Stelkens, R. B., K. A. Young, and O. Seehausen, 2010. The accumulation of reproductive incom- patibilities in African cichlid fish. Evolution 64:617–633.

Turner, J. R. G., 1971. Two thousand generations of hybridisation in a Heliconius butterfly. Evolution 25:471–482.

Vos, P., R. Hogers, B. M, M. Reijans, and L. T, 1995. A new technique for DNA fingerprinting. Nucleic Acids Res 44:388–396.

Williams, J. G., A. R. Kubelik, K. J. Livak, J. A. Rafalski, and S. V. Tingey, 1990. DNA poly- morphisms amplified by arbitrary primers are useful as genetic markers. Nucleic Acids Res 18:6531–6535.

Windham, M. D., J. B. Beck, F.-W. Li, L. Allphin, J. G. Carman, D. A. Sherwood, C. A. Rushworth, E. Sigel, P. J. Alexander, C. D. Bailey, et al., 2016. Searching for diamonds in the apomictic rough: a case study involving Boechera lignifera (Brassicaceae). Syst 40:1031–1044. 42

Windsor, A. J., M. E. Schranz, N. Formanova, S. Gebauer-Jung, J. G. Bishop, D. Schnabelrauch, J. Kroymann, and T. Mitchell-Olds, 2006. Partial shotgun sequencing of the Boechera stricta genome reveals extensive microsynteny and promoter conservation with Arabidopsis. Plant Physiol 140:1169–1182.

Xing, C., F. R. Schumacher, G. Xing, Q. Lu, T. Wang, and R. C. Elston, 2005. Comparison of microsatellites, single-nucleotide polymorphisms (SNPs) and composite markers derived from SNPs in linkage analysis. BMC Genet 6 Suppl 1:S29.

Yang, S. and R. K. Selander, 1968. Hybridization in the grackle Quiscalus quiscula in Louisiana. Syst Zool 17:107–143. 43

CHAPTER 3

GENETIC DIVERSITY AND POPULATION STRUCTURE

OF RARE VS. COMMON ALLELES

IN THE MONTANE ENDEMIC BOECHERA LASIOCARPA

Introduction

Natural populations evolve in response to a complex array of biotic and abiotic conditions.

Especially in subalpine and alpine environments, populations with limited mobility are exposed to varying climatic conditions that create clines, which are similar to those seen in latitudinal gradients, yet which are realized over shorter geographic distances (Etterson and Shaw, 2001; Dunne et al.,

2003; Knowles et al., 2006). With changing climates, such populations must respond by dispersing to higher elevations or latitudes, react through plastic responses, adapt to changing conditions

(Grabherr et al., 1994; Walther et al., 2002; Kelly and Goulden, 2008; Hoffmann and Sgrò, 2011;

Jay et al., 2012; Anderson et al., 2015). The adaptive potential of populations, however, could be altered by hybridization dynamics between species. Through hybridization, plant populations may either be able to survive changing conditions (Becker et al., 2013), or their genomes might be swamped by closely-related species, which in turn can lead to extinction or demographic decline through (Ellstrand, 1992; Levin et al., 1995; Balao et al., 2015; Gómez et al., 2015). Whether or not populations are affected by hybridization, however, they must harbor population genetic variation, and particularly functional genetic variation, so they can react to strong environmental pressures by adaptation. The maintenance of genetic variation is therefore essential for local adaptation and speciation processes, as well as for conservation and the prediction of future change in natural populations.

While theory suggests that rare species would have lower levels of genetic diversity than widespread congeners (Barrett et al., 1991; Ellstrand and Elam, 1993), empirical data suggest otherwise. Levels of polymorphism can be both higher or lower in rare versus widespread species

(e.g. Karron, 1987; Gitzendanner and Soltis, 2000; Cole, 2003; Lovell and Mckay, 2015). Moreover, some congeners have different levels of polymorphism, independent of their distribution (e.g. Young 44 and Brown, 1996; Lovell and Mckay, 2015). Therefore, it is crucial in population and to assess levels of population structure in rare endemic species beyond model systems.

With the advent of new technologies, it has become possible to investigate how genetic variation is shaped on a genome-wide level (e.g. Ellegren et al., 2012; Gompert et al., 2012; Nosil et al.,

2012; Parchman et al., 2013; Gompert et al., 2014). By using large numbers of variants across organisms’ genomes, we are now able to shed light on different types of variants, and thus, we can ask whether these types of variants show different demographic and evolutionary histories and different population boundaries. Distinctions between rare and common variants have been made predominantly in human population genetic studies (e.g. Li, 2011; Mathieson and McVean, 2012;

Nelson et al., 2012), and in Lycaeides butterflies (Gompert et al., 2014). In both systems, it was found that rare variants have higher levels of population genetic structure, yet so far, this has not been investigated in rare endemic plant populations with variable population sizes.

The genus Boechera (Brassicaceae) offers a singular opportunity to investigate these topics.

This genus includes about 83 distinct sexual diploid taxa (Li et al., 2016), ranging from some of the most common native mustard species in North America (i.e. B. retrofracta, B. stricta) to some of the most endangered (e.g. B. hoffmannii, B. serotina, B. perstellata). Several of the more widespread

Boechera species have been shown to be self-compatible, with low levels of heterozygosity and high levels of inbreeding (Roy, 1995; Dobeš et al., 2004; Song et al., 2006; Song and Mitchell-

Olds, 2007). However, few of the rare species have been investigated in regards to their population genetic structure and genetic variation. In the only such study to date, B. fecunda, a rare endemic to southwestern Montana, was found to exhibit high levels of genetic diversity, roughly equal to the widespread B. stricta (McKay et al., 2001; Song and Mitchell-Olds, 2007; Anderson et al., 2011,

2013, 2014). Additionally, no previous studies have used genome-wide nucleotide polymorphism data to investigate genetic structure among the rare endemic taxa of Boechera.

The plant species Boechera lasiocarpa in the B. puberula group (Alexander et al., 2013, and chapter 2) is one such example of rare . The species has been found to occur only in the mountains of northern Utah, predominantly on rocky ridges and slopes between elevations of 1800 to

2800m (Al-Shehbaz and Windham, 2010) (see also Figure B.1). Here, I investigate levels of genetic 45 diversity and population genetic structure of the rare sexual diploid species Boechera lasiocarpa through common and rare variants. Additionally, analyses of genetic differentiation are presented for six localities, and I further examine levels of genetic differentiation within a B. lasiocarpa population, with a population size higher than any other population of this species encountered so far.

Methods

Data collection and DNA extraction

Leaf samples were collected from six localities across the Wasatch Range of Utah in spring and summer of 2014. Leaf tissue was immediately stored in silica gel for desiccation. I extracted DNA from leaf tissue following the CTAB protocol described in Beck et al. (2011), only differing in that

I used 3mm solid glass beads (Fischer Scientific) in a Retsch MM400 mixer mill. I further added data from 10 individuals determined to be B. lasiocarpa from chapter 2, and I added additional individuals obtained from air-dried herbarium specimens to use as reference samples representing taxa known to hybridize with B. lasiocarpa specimens (Li et al., 2016)

Genotyping-by-Sequencing library preparation

I generated reduced-complexity, double-digest restriction fragment-based DNA libraries for all individual samples, a method also known as Genotyping-by-Sequencing (GBS). Library preparation was performed following Gompert et al. (2014). The GBS libraries were sequenced in one lane at the

University of Texas Genomic Sequencing and Analysis Facility (Austin, TX, USA) on the Illumina

HiSeq 2500 platform. Custom python and perl scripts (github.com/schimar/hts_tools) were used to parse sequences for individual barcodes and split individuals. I aligned reads for each individual to the Boechera stricta genome-assembly (v1.2, DOE-JGI, http://phytozome.jgi.doe.gov/) using bwa aln & samse version 0.7.5 (Li and Durbin, 2009). I allowed for a maximum edit distance of 5, with a read trimming parameter of 10, the seed set to 20 and a maximum edit distance in the seed of 2. I further used samtools version 0.1.19 (Li et al., 2009) to create bam files from the resulting alignments for subsequent analyses. 46

Estimation of genetic diversity in B. lasiocarpa

Since hybridization is very common in the Boechera genus, and morphology can be considered

unreliable for the determination of nominal taxa, I excluded all polyploid samples and individuals

with evidence of considerable admixture from other Boechera species (see supplementary methods

in Appendix B). 114 individuals were assigned to the B. lasiocarpa cluster. Of these, 105 individuals were collected in localities with at least eight individuals remaining in the data set (see Table B.1 and

Fig. 3.1). One exception was made, namely three individuals out of the 105 individuals came from the Provo Peak (pro) locality, as they are part of the southern-most sampled locality of B. lasiocarpa and thus potentially mark the Southern border of the distribution for B. lasiocarpa.

I called variants anew for these 105 samples, using GATK version 3.5 (McKenna et al., 2010), with heterozygosity for prior likelihood calculation per locus of 0.001, and I ignored sequences with mapping quality < 20. The minimum phred-scaled confidence threshold for variants to be called was set to 50 and the resulting variants were then filtered to contain only variants with at least 128 sequences, at least 4 sequences with the alternative allele, and I only kept the genetic variants at nucleotide sites where I had data for at least 80% of the sampled individuals, only one alternative allele, with a minimum phred-scaled mapping quality of 30. For the 15,150 resulting variants, I calculated metrics of genetic diversity, namely Watterson’s θ (θW ) (Watterson, 1975) and nucleotide diversity (π) (Tajima, 1983) using ANGSD (Nielsen et al., 2012), by first obtaining crude estimates of the folded site frequency spectrum (SFS), followed by Maximum-Likelihood (ML) estimation of the folded SFS, and finally calculating the estimators. For the calculation of locality-specific estimators of θW and π, I used all available diploid individuals per locality (see Table 3.1). For the species-wide estimators, I randomly chose between eight and 12 individuals at each locality. I excluded the three individuals from Provo Peak for both locality-specific and species-wide estimators, due to the low number of remaining diploid B. lasiocarpa individuals from this locality. Further, in addition to variant filtering as mentioned above, I separated variants into common and rare sets with a minor allele frequency (MAF) threshold of 0.05 (see Table B.2). 47

Population genetic structure among and within populations

For both rare and common variants (with 2,051 and 13,099 variants, respectively), I used entropy to obtain admixture proportions, genotype estimates and levels of genetic differentiation

(FST ) from a hypothetical common ancestor as calculated in the software (Gompert et al., 2014). For the rare variants, I used 32,000 sampling iterations, a burn-in period of 8,000, and a thinning interval of 6, for numbers of putative source populations of two to 16. For the common variants, I used 22,000 sampling iterations, a burn-in period of 6,000, and a thinning interval of 8, with two through 14 putative source populations. For each respective run, I used 6 chains to achieve proper mixing and convergence. Starting values of admixture proportions were calculated in the R package

adegenet, as mentioned above. Individuals in both rare and common variant runs were sorted

according to sampling locality, respective cluster with highest admixture proportions, and admixture

proportion for said cluster based on the entropy run for common variants, to then plot admixture

proportions in R. From the resulting posterior genotype probabilities, I estimated mean genotypes,

which were further used for downstream analyses. I then performed unscaled and zero-centered

Principal Component Analyses (PCA) with the prcomp function in R. Additionally, I calculated the

proportion of heterozygous sites for the 105 individuals for both rare and common variants. I then

estimated values of total genetic differentiation (FST ) following Weir and Cockerham (1984) for the 105 individuals and both variant types from the mean genotype estimates.

In order to illustrate differentiation between localities, I calculated matrices of pairwise FST values for the six localities (see Fig. 3.1 and Table B.1) for both variant types according to Weir and

Cockerham (1984). I then wanted to contrast these two locality estimators with the clusters obtained from the entropy runs for both common and rare variants, and calculated pairwise FST matrices from the entropy allele frequencies for common and rare variants (with values of k = 11 and 14, respectively). I produced unrooted neighbor-joining trees from all pairwise FST matrices using the ape package in R. All of these calculations were performed in R (R Development Core Team, 2015). 48

Results

Estimation of genetic diversity in B. lasiocarpa

For 105 B. lasiocarpa individuals, I obtained 13,099 and 2,051 single nucleotide variants

(SNVs) for common and rare sets, with mean coverage per SNV and individual of 12.59 (sd = 18.26) and 12.73 (sd = 21.64), respectively. For the full set of 15,150 variants, genome-wide measures of genetic diversity across the five localities (excluding pro) showed considerable variance (see

Table 3.1), with θW ranging between 0.0026 and 0.0051 (with mean = 0.0048, sd = 0.0178), and π ranging from 0.0038 and 0.0055 (with mean = 0.0051, sd = 0.03) (note that the mean values come from a set of 54 individuals with roughly equal proportions of individuals per locality, see Methods)

When considering common and rare variants separately, proportions of heterozygous sites for all

105 individuals showed very little variation within variant types, but differed for common (mean

= 0.1842, sd = 0.023) and rare variants (mean = 0.059, sd = 0.029). Conversely, total FST values across all six localities were much lower in common than in rare variants, with 0.174 and 0.519, respectively.

Population genetic structure among and within populations

I investigated the population genetic structure between B. lasiocarpa populations by first es- timating admixture proportions for both common and rare variants, and then performing PCAs on both variant types. To further contrast the population genetic structure between localities and the substructure we find within localities, I compared how genetic variation is partitioned among localities. I first present results of the admixture analyses and PCAs for common and rare variants.

Further, I present measures of genetic differentiation for localities, and then lead into a more detailed investigation of the locality of Logan Canyon Sinks (lcs), which is known to harbor a substantial number of individuals, when compared to the remainder of B. lasiocarpa localities encountered so

far.

For estimation of admixture proportions of B. lasiocarpa samples, the potential scale reduction factors indicated convergence of all chains, with medians of 1.19 (mean = 1.7, sd = 1.63) and 1.12

(mean = 1.54, sd = 1.62), across all respective numbers of putative source populations for entropy 49 runs of common and rare variants. The highest DIC support for common variants was represented by a k of 11, with a median scale reduction factor of 1.15 (mean = 1.18, sd = 0.18). Conversely, rare variants were explained best by a k of 14, with a median scale reduction factor of 1.23 (mean

= 1.8, sd = 2.52). For both common and rare variants, the general pattern of admixture proportions did not change considerably across numbers of putative source populations (see Figures 3.2 and

3.3, respectively). In common variants, I did find extensive substructure within the Logan Canyon

Sinks (lcs) population, with four main clusters spread across the locality. One of these clusters from lcs (samples 39-52) mostly adhered to an independent block, distinct from the remainder of individuals sampled. In the rare variants, substructure was less extensive, with most of the samples from lcs adhering to one cluster, except for samples 39-52, which showed a similar picture than what I observed in common variants. In both variant types, two localities (pow and pro) substantially shared clusters with lcs. Additionally, skn exhibited further substructure, and Ben

Lomond Trail (blo) was found to be mostly isolated, with very little shared variation from other localities nor strong substructure within the population. In rare variants, the localities of blo, red and skn formed independent clusters, with red showing signs of further substructure unlike what was seen in common variants. Additionally, the skn locality was clustered into two predominantly distinct groups, a pattern that was more pronounced than in common variants. Perhaps most notably,

I did not observe strong admixture between localities of B. lasiocarpa beyond the aforementioned shared structure between lcs, skn and pro.

In the model-free approach to summarize genetic variation in B. lasiocarpa individuals, I encountered distinct patterns between common and rare variants (see Fig. 3.4). For the common variants, approximately 82% of the genetic variation was explained by the first three principal components (PCs), with the two first PCs explaining 70.5%. In Fig. 3.4 A, the lcs population was split into two clusters on PC 1, with all individuals from pow and pro adhering to one of the two lcs clusters, and the skn locality adhering to the second lcs cluster. Both red and blo localities formed distinct clusters, that were differentiated on PC 2, with the remaining individuals taking an intermediate position between the two clusters. In Fig. 3.4 B, I found three groups of individuals separated on PC 2, that were spread out on PC 3, forming three somewhat distinct bands. For 50 the rare variants (Fig. 3.4 C & D), the first three PCs explained roughly 90% of genetic variation, with PC 1 and PC 2 comprising 82.6%. Similar to the patterns seen for the common variants, lcs, pro and pow formed a cluster, which was slightly separated on PC 1 and 2. skn individuals formed two clusters that were distinct from the remaining individuals, with one of the two clusters being closer to the lcs/pow/pro cluster than to the other skn cluster. Individuals from the red

locality were strongly differentiated on PC 1. On PC 3 (Fig. 3.4 D), the blo locality was distinct

from the remaining clusters. On all three PCs for the rare variants, individuals appear much less

spread-out than in common variants. Notably, I did not find strong correlations between sequencing

coverage and principal components (common: PC1 = 0.175, PC2 = 0.183; rare: PC1 = 0.174, PC2

= 0.08). Conversely, proportions of heterozygous sites for individuals showed significant negative

correlations for common (PC1 = -0.367, p = 0.0001; PC3 = -0.28, p = 0.003) and rare variants (PC2

= -0.345, p = 0.0003).

As mentioned above, I found shared population genetic structure among localities as well as

extensive substructure within populations, especially in the lcs locality. Consistent with substructure

found between lcs, pow and pro, these three localities shared genetic variation in pairwise FST values (see Fig. 3.5). The remaining localities (red, blo and skn) were much more differentiated, both to

the aforementioned localities and to each other (see also Fig. B.2). Values of FST were generally lower in common than in rare variants, but did show very similar patterns for both variant types (see

Fig. 3.5).

Discussion

Estimation of genetic diversity in B. lasiocarpa

Based on the restricted geographic distribution of B. lasiocarpa (Al-Shehbaz and Windham,

2010, and chapter 2), I expected levels of genetic diversity to be low, compared to widespread

congeners. However, I examined genetic diversity in the rare endemic and found that levels of

genetic diversity were not lower than traditionally expected (Barrett et al., 1991; Ellstrand and Elam,

1993). In fact, my estimates were higher than in the widespread B. stricta, with average π = 0.003

and mean θW = 0.0035 (Song et al., 2009), compared to 0.0048 and 0.0051 in B. lasiocarpa. Despite 51 the notion that rare endemic species would in general have lower genetic diversity than common ones, examples of opposite patterns are known in ferns (Ranker, 1994), legumes (Young and Brown,

1996), sunflowers (Ellis et al., 2006) (see also Gitzendanner and Soltis, 2000), and we do not lack evidence in the genus Boechera. Song and Mitchell-Olds (2007) found that the rare B. fecunda, endemic to Montana, had higher levels of observed heterozygosity than the common B. stricta.

However, this pattern does not seem to be a general one in this genus. Lovell and Mckay (2015) examined multiple species therein, and they reached the conclusion that range size was not correlated with measures of genetic variation. In said study, the rare congener B. vivariensis had higher levels of neutral genetic diversity than B. stricta, with the opposite to be true when comparing another member of the genus, B. crandallii, with the widespread B. stricta.

In this study I show that B. lasiocarpa has lower levels of population genetic differentiation than

B.fecunda and B. stricta (with 0.57 in Song et al. (2009) and 0.56 in Song et al. (2006), respectively), when considering common variants (0.174). However, total FST based on rare variants (0.519) was comparable to the values obtained from B. fecunda and B. stricta. In recent studies, levels of population differentiation were higher in rare variants of humans and butterflies, where rare variants were found to convey more recent and geographically localized processes (Li et al., 2010; Mathieson and McVean, 2012; Nelson et al., 2012; Tennessen et al., 2012; Gompert et al., 2014). Furthermore, an excess of rare variants was found in Arabidopsis (Nordborg et al., 2005), in Lycaeides butterflies

(Gompert et al., 2014) and in humans (Tennessen et al., 2012). Consistent with this pattern, I found a high proportion of rare variants in B. lasiocarpa, considering that I observed roughly 14% of the total variants in the lowest 5% bin of the frequency spectrum. My findings in natural populations of the montane endemic B. lasiocarpa are thus consistent with the observed patterns of population differentiation and the occurrence of rare variants in other systems.

Population genetic structure among and within populations

I found considerable population genetic structure among different populations of B. lasiocarpa across the Wasatch Range of Utah. For the common variants, several populations shared variation, namely populations in Cache County (lcs and pow) and the pro locality at the assumed southern 52 distribution edge of the species (see Fig. 3.2). In close geographical proximity to the pow locality

(see Fig. 3.1), both blo as well as skn localities were found to be differentiated from the remainder of localities considered here. Similarly, the locality of red was strongly differentiated from any other locality, based on admixture proportions (Fig. 3.2 and 3.3), in the PCA (Fig. 3.4) and pairwise FST estimates (Fig. 3.5).

Between populations, I found varying degrees of substructure, which were conveyed through the differentiation of variant types. I found strong population substructure, especially in lcs, where I observed at least three different clusters in both admixture proportions and PCA of common variants.

Comparing rare to common variants, I found several contrasting results. In rare variants, I found similar patterns regarding population substructure, which, however, seem less cluttered than in com- mon SNVs. The blo population was shown to be differentiated by estimated admixture proportions, but not in the PCA, where it occurred as part of a cluster with the lcs locality. Additionally, both red

and skn localities showed higher levels of substructure within the respective locality than what was

observed in common variants.

The differences in population structure between localities could be affected by several processes.

Different localities of B. lasiocarpa exhibit dramatic differences in population size, with most

localities having small population sizes and one locality (i.e. lcs) with a much higher number of

individuals encountered, compared to the remainder of populations. Population genetic structure

of localities is likely to be reflected by these differences in population size. Small populations

(e.g. skn, red, and blo) could either be affected by a bottleneck, where genetic drift causes loss of

variation, or these populations could be adapting locally. Rapid local adaptation and strong selective

pressures have been found repeatedly in natural plant populations (e.g. Anderson et al., 2014; Franks

et al., 2016). However, the efficacy of divergent selection can be limited by genetic drift in locally

adapted populations with small effective population sizes (Travisano et al., 1995; Leimu and Fischer,

2008; Hereford, 2009), which can lead to an accumulation of deleterious (Lynch et al.,

1999). It is notable that the localities of skn red and blo can be considered peripheral, as they

occur on mountain ridges with limited potential to disperse into higher elevations, whereas the

remainder of localities considered here are occurring on rocky slopes that are connected to higher 53 elevations. Such small peripheral populations could be at a higher pressure to adapt to changing climatic conditions. Additionally, the differences in population structure among localities could be caused by variation in the mating system. Self-fertilization is known to decrease effective population size, which in turn can lower levels of neutral genetic diversity (Barrett et al., 1991; Ellstrand and

Elam, 1993; Mau et al., 2015). Further, the mating system in a population can determine whether the population is affected by genetic drift or if it is subject to natural selection (Charlesworth and Wright,

2001). For instance, the extent of linkage disequilibrium (LD) between sites across the genome can be increased by reductions in recombination rates in selfing populations, which can reduce the efficacy of selection in selfing populations (e.g. Nordborg, 2000; Qiu et al., 2011). The levels of genetic diversity encountered for individual populations of B. lasiocarpa seem to be associated with

respective population size (i.e. higher values of θW and π for the lcs locality, with lower values in the remainder of localities; see table 3.1), but whether this is caused by differences in the mating system of individual populations is currently not clear. Finally, as hybridization is very common in the genus, the importance of admixture can not be stressed enough. Genetic variation in natural plant populations can be strongly affected by hybridization (see chapter 1 and 2). Prior to the assessment of genetic diversity and population structure in B. lasiocarpa, I included a wide set of congeners, so as to exclude hybrids from subsequent analyses and to focus on the montane endemic to Utah.

Nevertheless, biodiversity and population genetic structure can be strongly affected by hybridization, either leading to outbreeding depression and even potential extinction (Ellstrand, 1992; Levin et al.,

1995; Balao et al., 2015; Gompert and Buerkle, 2016; Gómez et al., 2015) or by helping endemic species survive rapid environmental change (Becker et al., 2013). Thus, especially in this complex genus, we should not underestimate hybridization as a potential driving force for the maintenance of genetic variation.

With rapidly changing environments, it is crucial to further investigate whether specific popula- tions of B. lasiocarpa are affected by genetic drift or whether selection is acting in these populations to remove deleterious mutations and drive local adaptation. In order to determine patterns of genetic diversity of rare versus widespread species, it would be useful to not only focus on species-wide estimates, but rather investigate variability between populations, as the genetic structure of natural 54 populations can vary strongly within species (see also Gitzendanner and Soltis, 2000; Ross-Ibarra et al., 2008). I suggest future work in this system to determine whether different populations of B. lasiocarpa exhibit differential levels of risk of extinction as well as varying selective pressures, given their differences in local population genetic structure. It would also be beneficial to discern whether this sexually diploid species exhibits differences in levels of self-incompatibility within and between populations. Finally, since hybridization is a steady companion in this genus (Schranz et al., 2005;

Beck et al., 2011; Alexander et al., 2015), the role of historical and current admixture events of different Boechera congeners with the montane endemic B. lasiocarpa in shaping population genetic structure has to be clarified. 55

Figures & tables

Figure 3.1. Sampling locations across A) the western US where red colors indicate individuals with ≥ 90% admixture proportions from the B. lasiocarpa cluster, and blue colors indicate the remainder of Boechera taxa., and B) cutout of Utah, with the six localities considered here (see Table 3.1 and Table B.1 for detailed information on these localities and individuals). 56 individuals from Utah (for GPS B. lasiocarpa ) for common (cmn) and rare variants. π π & W θ W θ Utah 2544-2646 – – Cache 2309-2396 0.0051 0.0055 Cache 2512-2534 0.0038 0.0047 Weber 2326-2347 0.0033Weber 0.0043 2387-2511 0.0039 0.0044 Salt Lake 2266-2271 0.0026 0.0038 lcs blo red skn pro pow 1-12 Ben Lomond Trail 13-6667-8485-8788-97 Logan Canyon Sinks Powder Red Mountain Butte Emigration Divide Provo Peak 98-105 Skyline North Trail Admix-id Locality locality (abbr.) County Elev. range (m) Table 3.1. Sample idscoordinates for of admixture individual proportions samples in see Figures Table B.1), 3.2 and and estimates 3.3, of locality genetic diversity information ( for 105 57

Figure 3.2. Admixture proportions based on 13,099 common variants and 105 individuals. Each bar represents Bayesian point estimates of admixture proportions for each respective individual, and thus the proportion of inheritance of each genome to the respective species. Results of 5-14 presumed source species are shown here, with k = 11 being the best model based on DIC values. 58

Figure 3.3. Admixture proportions based on 2,051 rare variants and 105 individuals. Each bar represents Bayesian point estimates of admixture proportions for each respective individual, and thus the proportion of inheritance of each genome to the respective species. Results of 5-16 presumed source species are shown here, with k = 14 being the best model based on DIC values. 59

Figure 3.4. Statistical summary of genetic variation based on principal component analysis of common and rare variants, with A) PC1 against PC2, and B) PC2 against PC3 for common variants (n = 13,099), and C) PC1 against PC2 and D) PC2 against PC3 for rare variants (n = 2,051). Individual samples are colored by locality, where the legend in A applies to all subplots; lcs = Logan Canyon Sinks, skn = Skyline North Trail, red = Red Butte Emigration Divide, blo = Ben Lomond Trail, pro = Provo Peak, and pow = Powder Mountain (see also Figure 3.1 and Table 3.1). 60

Figure 3.5. Neighbor-joining tree of pairwise FST matrices across all six localities, for A) common variants and B) rare variants, with scale bars of trees shown in the lower right corner of the respective plot. 61

LITERATURE CITED

Al-Shehbaz, I. A. and M. D. Windham, 2010. Boechera. Pp. 347–412, in Flora of North America Committee, ed. Flora of North America, North of Mexico, 7th ed. Oxford University Press., New York and Oxford. Alexander, P. J., M. D. Windham, J. B. Beck, I. A. Al-Shehbaz, and C. D. Bailey, 2013. Molec- ular phylogenetics and taxonomy of the genus Boechera and related genera (Brassicaceae: Boechereae). Syst Bot 38:192–209. ———, 2015. Weaving a tangled web: Divergent and reticulate speciation in Boechera fendleri sensu lato (Brassicaceae: Boechereae). Syst Bot 40:572–596. Anderson, J. T., C.-R. Lee, and T. Mitchell-olds, 2011. Life history and natural selection on flowering time in Boechera stricta, a perennial relative of Arabidopsis. Evolution 65:771–787. Anderson, J. T., C.-R. Lee, and T. Mitchell-Olds, 2014. Strong selection genome-wide enhances fitness trade-offs across environments and episodes of selection. Evolution 68:16–31. Anderson, J. T., C.-R. Lee, C. A. Rushworth, R. I. Colautti, and T. Mitchell-Olds, 2013. Genetic trade-offs and conditional neutrality contribute to local adaptation. Mol Ecol 22:699–708. Anderson, J. T., N. Perera, B. Chowdhury, and T. Mitchell-Olds, 2015. Microgeographic pat- terns of genetic divergence and adaptation across environmental gradients in Boechera stricta (Brassicaceae). Am Nat 186. Balao, F., R. Casimiro-Soriguer, J. L. García-Castaño, A. Terrab, and S. Talavera, 2015. Big thistle eats the little thistle: does unidirectional introgressive hybridization endanger the conservation of Onopordum hinojense? New Phytol 206:448–458. Barrett, S. C., J. R. Kohn, D. Falk, K. Holsinger, et al., 1991. Genetic and evolutionary consequences of in plants: implications for conservation. Genetics and Conservation of Rare Plants Pp. 3–30. Beck, J. B., P. J. Alexander, L. Allphin, I. A. Al-Shehbaz, C. Rushworth, C. D. Bailey, and M. D. Windham, 2011. Does hybridization drive the transition to asexuality in diploid Boechera? Evolution 66:985–95. Becker, M., N. Gruenheit, M. Steel, C. Voelckel, O. Deusch, P. B. Heenan, P. A. McLenachan, O. Kardailsky, J. W. Leigh, and P. J. Lockhart, 2013. Hybridization may facilitate in situ survival of endemic species through periods of . Nat Clim Change 3:1039–1043. Charlesworth, D. and S. I. Wright, 2001. Breeding systems and genome evolution. Curr Opin Genet Dev 11:685–90. Cole, C. T., 2003. Genetic variation in rare and common plants. Annu Rev Ecol Evol Syst 34:213– 237. Dobeš, C., T. Mitchell-Olds, and M. A. Koch, 2004. Intraspecific diversification in North American Boechera stricta (= Arabis drummondii), Boechera x divaricarpa, and Boechera holboellii (Bras- sicaceae) inferred from nuclear and chloroplast molecular markers - An integrative approach. Am J Bot 91:2087–2101. 62

Dunne, J. A., J. Harte, and K. J. Taylor, 2003. Subalpine meadow flowering phenology responses to climate change: Integrating experimental and gradient methods. Ecol Monogr 73:69–86.

Ellegren, H., L. Smeds, R. Burri, P. I. Olason, N. Backström, T. Kawakami, A. Künstner, H. Mäkinen, K. Nadachowska-Brzyska, A. Qvarnström, S. Uebbing, and J. B. W. Wolf, 2012. The genomic landscape of species divergence in Ficedula flycatchers. Nature Pp. 1–5.

Ellis, J., C. Pashley, D. McCauley, et al., 2006. High genetic diversity in a rare and endangered sunflower as compared to a common congener. Mol Ecol 15:2345–2355.

Ellstrand, N. C., 1992. Gene flow by pollen: Implications for plant conservation genetics. Oikos Pp. 77–86.

Ellstrand, N. C. and D. R. Elam, 1993. Population genetic consequences of small population size: Implications for plant conservation. Annu Rev Ecol Syst 24:217–242.

Etterson, J. R. and R. G. Shaw, 2001. Constraint to adaptive evolution in response to global warming. Science 294:151.

Franks, S. J., N. C. Kane, N. B. O’Hara, S. Tittes, and J. S. Rest, 2016. Rapid genome-wide evolution in Brassica rapa populations following drought revealed by sequencing of ancestral and descendant gene pools. Mol Ecol Pp. 3622–3631.

Gitzendanner, M. A. and P. S. Soltis, 2000. Patterns of Genetic Variation in rare and widespread plant congeners. Am J Bot 87:783–792.

Gómez, J. M., A. González-Megías, J. Lorite, M. Abdelaziz, and F. Perfectti, 2015. The silent extinction: climate change and the potential hybridization-mediated extinction of endemic high- mountain plants. Biodivers Conserv 24:1843–1857.

Gompert, Z. and C. A. Buerkle, 2016. What, if anything, are hybrids: enduring truths and challenges associated with population structure and gene flow. Evol Appl .

Gompert, Z., L. K. Lucas, C. A. Buerkle, M. L. Forister, J. A. Fordyce, and C. C. Nice, 2014. Admixture and the organization of genetic diversity in a butterfly species complex revealed through common and rare genetic variants. Mol Ecol 23:4555–4573.

Gompert, Z., L. K. Lucas, C. C. Nice, J. A. Fordyce, M. L. Forister, and C. A. Buerkle, 2012. Genomic regions with a history of divergent selection affect fitness of hybrids between two butterfly species. Evolution 66:2167–2181.

Grabherr, G., M. Gottfried, and H. Pauli, 1994. Climate effects on mountain plants. Nature 369:448–448.

Hereford, J., 2009. A quantitative survey of local adaptation and fitness trade-offs. Am Nat 173:579– 88.

Hoffmann, A. A. and C. M. Sgrò, 2011. Climate change and evolutionary adaptation. Nature 470:479–485. 63

Jay, F., S. Manel, N. Alvarez, E. Y. Durand, W. Thuiller, R. Holderegger, P. Taberlet, and O. François, 2012. Forecasting changes in population genetic structure of alpine plants in response to global warming. Mol Ecol 21:2354–2368.

Karron, J. D., 1987. A comparison of levels of genetic polymorphism and self-compatibility in geographically restricted and widespread plant congeners. Evol Ecol 1:47–58.

Kelly, A. E. and M. L. Goulden, 2008. Rapid shifts in plant distribution with recent climate change. Proc Natl Acad Sci USA 105:11823–11826.

Knowles, N., M. D. Dettinger, and D. R. Cayan, 2006. Trends in snowfall versus rainfall in the western United States. J Climate 19:4545–4559.

Leimu, R. and M. Fischer, 2008. A meta-analysis of local adaptation in plants. PLoS ONE 3:1–8.

Levin, D. A., J. Francisco-Ortega, and R. K. Jansen, 1995. Hybridization and the extinction of rare plant species. Conserv Biol 10:10–16.

Li, F.-W., C. A. Rushworth, J. B. Beck, and M. D. Windham, 2016. Boechera Microsatellite Website: an online portal for species identification and determining hybrid parentage. Database: in press .

Li, H., 2011. A statistical framework for SNP calling, mutation discovery, association mapping and population genetical parameter estimation from sequencing data. Bioinformatics 27:2987–93.

Li, H. and R. Durbin, 2009. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25:1754–60.

Li, H., B. Handsaker, A. Wysoker, T. Fennell, J. Ruan, N. Homer, G. Marth, G. Abecasis, and R. Durbin, 2009. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25:2078– 9.

Li, Y., N. Vinckenbosch, G. Tian, E. Huerta-Sanchez, T. Jiang, H. Jiang, A. Albrechtsen, G. Ander- sen, H. Cao, T. Korneliussen, N. Grarup, Y. Guo, I. Hellman, X. Jin, Q. Li, J. Liu, X. Liu, T. Sparsø, M. Tang, H. Wu, R. Wu, C. Yu, H. Zheng, A. Astrup, L. Bolund, J. Holmkvist, T. Jørgensen, K. Kristiansen, O. Schmitz, T. W. Schwartz, X. Zhang, R. Li, H. Yang, J. Wang, T. Hansen, O. Pedersen, R. Nielsen, and J. Wang, 2010. Resequencing of 200 human exomes identifies an excess of low-frequency non-synonymous coding variants. Nature Genet 42:969–972.

Lovell, J. T. and J. K. Mckay, 2015. Ecological genetics of range size variation in Boechera spp. (Brassicaceae). Ecol Evol 5:4962–4975.

Lynch, M., J. Blanchard, D. Houle, T. Kibota, S. Schultz, L. Vassilieva, and Willis John, 1999. Perspective: Spontaneous deleterious mutations. Evolution 53:645–663.

Mathieson, I. and G. McVean, 2012. Differential confounding of rare and common variants in spatially structured populations. Nature Genet 44:243–6.

Mau, M., J. T. Lovell, J. M. Corral, C. Kiefer, M. a. Koch, O. M. Aliyu, and T. F. Sharbel, 2015. Hybrid apomicts trapped in the ecological niches of their sexual ancestors. Proc Natl Acad Sci USA P. 201423447. 64

McKay, J. K., J. G. Bishop, J. Z. Lin, J. H. Richards, a. Sala, and T. Mitchell-Olds, 2001. Local adaptation across a climatic gradient despite small effective population size in the rare sapphire rockcress. Proc R Soc Lond [Biol] 268:1715–1721. McKenna, A., M. Hanna, E. Banks, A. Sivachenko, K. Cibulskis, A. Kernytsky, K. Garimella, D. Altshuler, S. Gabriel, M. Daly, and M. a. DePristo, 2010. The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res 20:1297–303. Nelson, M. R., D. Wegmann, M. G. Ehm, D. Kessner, P. St. Jean, C. Verzilli, J. Shen, Z. Tang, S.-a. Bacanu, D. Fraser, L. Warren, J. Aponte, M. Zawistowski, X. Liu, H. Zhang, Y. Zhang, J. Li, Y. Li, L. Li, P. Woollard, S. Topp, M. D. Hall, K. Nangle, J. Wang, G. Abecasis, L. R. Cardon, S. Zollner, J. C. Whittaker, S. L. Chissoe, J. Novembre, and V.Mooser, 2012. An abundance of rare functional variants in 202 drug target genes sequenced in 14,002 people. Science 337:100–104. Nielsen, R., T. Korneliussen, A. Albrechtsen, Y. Li, and J. Wang, 2012. SNP calling, genotype calling, and sample allele frequency estimation from New-Generation Sequencing data. PLoS ONE 7:e37558. Nordborg, M., 2000. Linkage disequilibrium, gene trees and selfing: an ancestral recombination graph with partial self-fertilization. Genetics 154:923–929. Nordborg, M., T. T. Hu, Y. Ishino, J. Jhaveri, C. Toomajian, H. Zheng, E. Bakker, P. Calabrese, J. Gladstone, R. Goyal, M. Jakobsson, S. Kim, Y. Morozov, B. Padhukasahasram, V. Plagnol, N. a. Rosenberg, C. Shah, J. D. Wall, J. Wang, K. Zhao, T. Kalbfleisch, V. Schulz, M. Kreitman, and J. Bergelson, 2005. The pattern of polymorphism in Arabidopsis thaliana. PLoS Biology 3:e196. Nosil, P., Z. Gompert, T. E. Farkas, a. a. Comeault, J. L. Feder, C. a. Buerkle, and T. L. Parchman, 2012. Genomic consequences of multiple speciation processes in a stick insect. Proc R Soc Lond [Biol] Pp. 5058–5065. Parchman, T. L., Z. Gompert, M. J. Braun, R. T. Brumfield, D. B. McDonald, J. A. C. Uy, G. Zhang, E. D. Jarvis, B. A. Schlinger, and C. A. Buerkle, 2013. The genomic consequences of adaptive divergence and reproductive isolation between species of manakins. Mol Ecol 22:3304–3317. Qiu, S., K. Zeng, T. Slotte, S. Wright, and D. Charlesworth, 2011. Reduced efficacy of natural selection on codon usage bias in selfing Arabidopsis and Capsella species. Genome Biol Evol 3:868–880. R Development Core Team, 2015. An Introduction to R. R Foundation for Statistical Computing, Vienna, Austria. URL http://www.r-project.org. Ranker, T. A., 1994. Evolution of high genetic variability in the rare hawaiian fern adenophorus periens and implications for conservation management. Biol Conserv 70:19–24. Ross-Ibarra, J., S. I. Wright, J. P. Foxe, A. Kawabe, L. DeRose-Wilson, G. Gos, D. Charlesworth, and B. S. Gaut, 2008. Patterns of polymorphism and demographic history in natural populations of Arabidopsis lyrata. PLoS ONE 3:e2411. Roy, B. A., 1995. The breeding systems of six species of Arabis (Brassicaceae). Am J Bot 82:869– 877. 65

Schranz, M. E., C. Dobeš, M. A. Koch, and T. Mitchell-Olds, 2005. Sexual reproduction, hy- bridization, apomixis, and polyploidization in the genus Boechera (Brassicaceae). Am J Bot 92:1797–1810.

Song, B.-H., M. J. Clauss, A. Pepper, and T. Mitchell-Olds, 2006. Geographic patterns of mi- crosatellite variation in Boechera stricta, a close relative of Arabidopsis. Mol Ecol 15:357–69.

Song, B.-H. and T. Mitchell-Olds, 2007. High genetic diversity and population differentiation in Boechera fecunda, a rare relative of Arabidopsis. Mol Ecol 16:4079–88.

Song, B.-H., A. J. Windsor, K. J. Schmid, S. Ramos-Onsins, M. E. Schranz, A. J. Heidel, and T. Mitchell-Olds, 2009. Multilocus patterns of nucleotide diversity, population structure and linkage disequilibrium in Boechera stricta, a wild relative of Arabidopsis. Genetics 181:1021–33.

Tajima, F., 1983. Evolutionary relationship of DNA sequences in finite populations. Genetics 105:437–460.

Tennessen, J. A., A. W. Bigham, T. D. O’Connor, W. Fu, E. E. Kenny, S. Gravel, S. McGee, R. Do, X. Liu, G. Jun, H. M. Kang, D. Jordan, S. M. Leal, S. Gabriel, M. J. Rieder, G. Abecasis, D. Altshuler, D. A. Nickerson, E. Boerwinkle, S. Sunyaev, C. D. Bustamante, M. J. Bamshad, and J. M. Akey, 2012. Evolution and functional impact of rare coding variation from deep sequencing of human exomes. Science 337:64–9.

Travisano, M., J. A. Mongold, A. F. Bennett, and R. E. Lenski, 1995. Experimental tests of the roles of adaptation, chance, and history in evolution. Science 267:87–90.

Walther, G.-r., E. Post, P. Convey, A. Menzel, C. Parmesan, T. J. C. Beebee, J.-m. Fromentin, O. H.-g. I, and F. Bairlein, 2002. Ecological Responses to Recent Climate Change. Nature 416:389–395.

Watterson, G. A., 1975. On the Number of Segregating Sites in Genetical Models without Recom- bination. Theor Popul Biol 7:256–276.

Weir, B. S. and C. C. Cockerham, 1984. Estimating F-Statistics for the Analysis of Population Structure. Evolution 38:1358–1370.

Young, A. G. and A. H. D. Brown, 1996. Comparative population genetic structure of the rare woodland shrub Daviesia suaveolens and its common congener D. mimosoides. Conserv Biol 10:1220–1228. 66

CHAPTER 4

DIFFERENTIAL GENE EXPRESSION IN DIPLOID

SEXUAL, DIPLOID APOMICTIC AND TRI-

PLOID APOMICTIC SPECIES OF THE

GENUS BOECHERA

Introduction

Reproduction and thus the transmission of genes can be seen as one of the most fundamental processes in the history of life. Sexual reproduction, while ubiquitous across all kingdoms of life, is not the only form of reproduction; it is not even the most common one. Prokaryotes tend to reproduce asexually, although they possess many features that were co-opted during the evolution of and eukaryotic sex (Adl et al., 2005; Cavalier-Smith, 2010; Adl et al., 2012). In eukaryotes, whereas sex is the predominant form of reproduction, a diverse array of reproductive modes can be found.

Not only do different sexual modes exist among eukaryotes, but sexual and asexual reproductive modes can exist simultaneously. Among plants, one particular asexual reproductive mode, apomixis, has been puzzling botanists since the 1850s, when dioecious female plants were observed producing seeds with no male plants (Asker and Jerling, 1992; Böcher, 1951; Stebbins, 1950; van Dijk and

Vijverberg, 2005). Apomixis results in a copy of the maternal genome, without the occurrence of recombination and crossing-over. Two main forms of apomixis are known. In sporophytic apomixis, an embryo forms directly from somatic cells, whereas in gametophytic apomixis, the embryo forms from an unreduced embryo sac (gametophyte). Within gametophytic apomixis, we can distinguish between apospory, wherein the aposporous initial cell (AIC), usually from the ovule wall, becomes the embryo sac, and diplospory, in which a megaspore mother cell (MMC) forms the embryo sac (Nogler, 1984; Koltunow, 1993). Despite their differences in the ameiotic formation of the embryo, all currently known forms of apomixis share three common developmental mechanisms:

1) Meiosis is bypassed during the formation of an embryo sac (apomeiosis), 2) embryo development is independent of fertilization (), and 3) the formation of viable endosperm happens either with (pseudogamously) or without (autonomously) fertilization (Koltunow, 1993; Carman,

1997; Koltunow and Grossniklaus, 2003; Schmidt et al., 2015). 67

Apomixis has become a sought-after trait for agricultural seed production, as it could simply

fix beneficial traits. However, the study of apomixis hinges on the fact that most apomicts are considered allopolyploids (Kearney, 2005), and there are only a few genera where apomixis occurs in diploids (Müntzing, 1931; Böcher, 1951; Narayan, 1962; Weimarck, 1967; Evans and Knox,

1969; Siena et al., 2008). Of these genera, only Boechera is known to express diploid apomixis consistently. Gametophytic apomixis in Boechera is of Taraxacum-type (diplosporous apomixis;

Meiosis I fails to complete, meiosis II creates two cells, one of which degenerates; three mitotic divisions form the megagametophyte (Nogler, 1984))(Schranz et al., 2005). Several genetic factors have been identified that seem to be associated with apomixis in Boechera, such as the APOLLO apomixis-specific allele (Corral et al., 2013) and the BspUPG-2 (UPGRADE-2) locus (Mau et al.,

2013). Additionally, heterochronic transcriptional differences have been found between sexual and apomictic ovules (Sharbel et al., 2009, 2010). Furthermore, besides differences in cell cycle and gene expression between sexual and apomictic ovules, epigenetic and hormonal changes have been reported as well (Schmidt et al., 2014). Although advances have been made regarding the control and expression of apomixis, we have yet to understand the evolution of this developmentally complex trait. It has been hypothesized that the occurrence of apomixis in Boechera was associated with gene duplication events resulting from hybridization and polyploidy (Carman, 1997; Beck et al., 2011).

For most of the 20th century, polyploidy was considered an evolutionary dead-end on longer time-scales. It was argued that polyploids had limited evolutionary potential (Stebbins, 1950;

Wagner, 1970). Polyploidy, it was assumed, would not contribute to lineage diversification over long evolutionary time periods and divergence of new taxa over deep evolutionary time would be happening "chiefly on the diploid level", whereas polyploidy itself would merely be producing "in- numerable variations on old themes but not originating any major new departures" (Stebbins, 1950)

Polyploidization events were seen as evolutionary noise (Wagner, 1970), and they were predomi- nantly thought to be associated with hybridization between closely related species (allopolyploidy), as opposed to polyploidization as a result of hybridization within species (autopolyploidy) (Steb- bins, 1950; Grant, 1981). Levin (1983) argued that autopolyploidy as well as allopolyploidy could introduce new sources of variation and novelty on an organisational level. He also suggested that 68 autopolyploids were not as rare in nature as initially presumed. With the development of new molecular techniques in the past two decades, a more refined picture of polyploidy has begun to emerge. Although initially assumed to be rare, it is now believed that about 90% of all flowering plants have undergone one or more genome duplication events (Cui et al., 2006; Wood et al., 2009) with autopolyploidy being common (e.g. Ramsey and Schemske, 1998; Soltis et al., 2014)). Poly- ploidization leads to whole genome duplication (WGD), which is then followed by rapid genome reorganization and gene loss (Comai et al., 2003; Hegarty and Hiscock, 2008; Leitch and Leitch,

2008; Doyle et al., 2008; Soltis et al., 2014). Besides genetic changes, polyploidization events can bring about quick transitions on epigenetic levels (reviewed in (Comai et al., 2003) and (Song and

Chen, 2015)) and further, gene dosage effects can create tremendous changes in gene expression patterns (Birchler et al., 2005; Song and Chen, 2015).

The genus Boechera is not only an interesting model system for the study of apomixis, but also a compelling example for the importance of polyploidy in plant evolution; hybridization and allopolyploidy are common in this genus (Böcher, 1951; Roy, 1995; Koch et al., 2003; Dobeš et al.,

2004; Schranz et al., 2005). Beck et al. (2011) found that diploid apomicts of Boechera display high levels of heterozygosity, from which it it was inferred that apomixis was more likely to be associated with hybridization rather than polyploidy (but see Lovell et al., 2013). It is difficult to distinguish between the effects of hybridization and polyploidy, however, especially in a genus that is known to be rather "promiscuous" in respect to inter-specific hybridization (Rollins, 1983; Schranz et al.,

2005; Windham and Al-Shehbaz, 2006, 2007a,b; Alexander et al., 2015), and given the history of

WGD events and in Brassicaceae (Bowers et al., 2003; Schranz and Mitchell-Olds,

2006; Beilstein et al., 2010).

The evolution of apomixis therefore still represents a conundrum, as we do not understand how this trait can be maintained over evolutionary time (Stebbins, 1950; van Dijk and Vijverberg,

2005; Hörandl et al., 2008), since it is assumed that the resulting offspring would be clonal in nature. However, if we assume that subsequent allopolyploidization events and resulting genome reorganizations take place over many generations, paired with the idea that facultative apomixis allows for multiple reproductive modes (Aliyu et al., 2010), then it might appear plausible that 69 facultative apomicts can have a selective advantage when it comes to adaptation to novel environments

(Carman, 1997; Grimanelli et al., 2001; Hörandl et al., 2008; Molins et al., 2014). In fact, apomicts have been shown to have larger geographic distributions than their sexual counterparts (a phenomenon termed geographic parthenogenesis (Vandel,1928)) and they frequently colonize previously glaciated areas in higher latitudes and elevations (Bierzychudek, 1985; Kearney, 2005; Hörandl, 2006, 2010).

In order for apomictic species to colonize novel environments, however, they need to be able to respond to environmental stress rapidly. Drought stress, for example, can induce rapid as well as long-term responses in plants (Su et al., 2013) and the genus Boechera is no exception. Drought tolerance and adaption to water-deficient environments have been documented in the genus (McKay et al., 2001; Knight et al., 2006). Furthermore, genetic variation for drought response in B. stricta and

inter-specific variation in the effects of drought have been shown (Haugen et al., 2008). Furthermore,

Schmidt et al. (2015) found that abiotic stress can change characteristics of meiotic cell division,

which leads us to question whether stress has the potential to alter the expression of reproductive

modes.

In this study, I focus on gene expression data of four Boechera species, including the diploid

sexual B. stricta, the diploid apomicts B. lignifera and B. exilis x retrofracta and the triploid

apomict B. cf. gunnisoniana. I expect to see changes in gene expression between the sexual and

apomictic species, as well as between diploid and triploid species considered here. The expression

of genes among the aforementioned species should further inform us about patterns of species

differentiation and responses to drought stress. I specifically address the following questions: 1)

Can we find a uniform response to drought-stress among the four species in question? 2) Are there

genes, associated with apomixis, that can be found across diploid and triploid species of apomictic

Boechera? 3) Do triploid Boechera apomicts exhibit differences in gene expression, compared to

their diploid congenerics? and finally, 4) Can we find genes showing differential expression specific

to reversals from apomeiosis to meiosis in B. lignifera (which was suggested by DIC microscopy

results)? 70

Methods

Plant material and stress treatments

Live plants of three apomictic species, B. cf. gunnisoniana, B. lignifera, B. exilis x retrofracta, and the sexual B. stricta, were collected from natural populations (see Table 4.1) and established in a controlled-environment greenhouse as described in Mateo de Arias (2015).

Single seed flow cytometry

In order to determine ploidy levels, nuclei were isolated from individual mature seeds, collected from control and treated plants. Flow cytometry was performed using Partec buffer (Partec North

America, Inc., Swedesboro, NJ) on a Partec I flow cytometer as described in Mateo de Arias (2015).

Nuclear fluorescence values for each sample were determined using a Partec I flow cytometer per the manufacturer’s instructions. Embryo and endosperm ploidy levels were determined from 50 individually-analyzed seeds per genotype. Sexual seeds were identified by a 2:3 C embryo to endosperm ratio. Apomictic seeds were identified by a 2:5, 2:6, 2:7 or 3:9 C embryo to endosperm ratio (Matzk et al., 2000).

RNA extraction and library prep

Expression profiling by qRT-PCR and RNASeq was restricted to pistils that contained ovules in the MMC to early 8-nucleate embryo sac stages. Pistil lengths corresponding to these stages were identified cytologically. Pistil length intervals for each species were divided evenly into four pistil-length categories, and 15 pistils for each interval category were collected for a total of 60 pistils per sample. Three replicate samples were obtained for the control and drought-stressed treatments of each species. To reduce bias from circadian rhythm effects, pistils were collected between 9:30

A.M. and 12:30 P.M. The pistils were immediately placed in RNALater (Qiagen), and stored at

-80 ◦C.

Total RNA was isolated from frozen tissue using the Qiagen RNeasy kit (Hilden, Germany).

Sequencing libraries from the resulting total RNAs were prepared using the TruSeq mRNA-Seq kit

and protocol from Illumina, Inc. (San Diego, CA). Briefly, mRNAs were isolated via attachment 71 to oligo(dT) beads, chemically fragmented and then reverse transcribed into cDNA via random hexamer priming. Resulting double stranded cDNA fragments were end-repaired to create blunt end fragments, 3-prime A-tailed, ligated with Illumina indexed TruSeq adapters and PCR amplified using

Illumina TruSeq primers. Purified PCR amplified libraries were checked for quality and quantity on the Agilent Bioanalyzer DNA 7500 chip before normalization and sample pooling. Sample pools were clustered and sequenced on the Illumina HiSeq 2500 system with Illumina TruSeq SBS Rapid v1 chemistry as per vendor protocols. Samples selected for gene expression analysis were sequenced single end, fifty cycles to a target depth of ten million reads per sample.

Transcriptome alignment, annotation and gene expression counts

I used a python script to truncate Illumina reads at the first base with a quality score of less than

16 and remaining reads shorter than 24bp were stripped. I then aligned each of the 24 libraries to the

Boechera stricta genome assembly (v1.2, DOE-JGI, http://phytozome.jgi. doe.gov) using bowtie version 2.1.0 (Langmead et al., 2009) and tophat version 2.0.9 (Trapnell et al., 2009) with the maximum read mismatch number of 2, a read gap length of 2, and a read edit distance of 2. Further, maximum insertion and deletion lengths were set to 3. I aligned the reads with mismatch number, read gap length and read edit distance of 4, respectively. Further, samtools version 0.1.19 (Li et al.,

2009) was used to sort the resulting alignments. I then counted expressed genes with htseq-count version 0.6.1p1 using the publicly available B. stricta annotations (http://phytozome.jgi.doe.gov).

Note that I counted on the gene level and therefore did not take splice variants into account.

General expression analyses

In order to assess general patterns of gene expression in the four Boechera species, I performed principal component analyses (PCA) over the covariance matrices of 1) expression data normalized by library size and 2) the presence and absence of expression for each gene in every replicate of the respective treatment and species (24 libraries) using the R base function prcomp. I excluded all genes that had a variance of zero in the presence-absence matrix. Biplots were produced using the ggbiplot package (https://github.com/vqv/ggbiplot). I used the R packages gplots and plyr for further visualization. 72

Bayesian model comparison of expression

We used the R (R Development Core Team, 2015) package baySeq (Hardcastle and Kelly,

2010; Hardcastle, 2014) for analysis of differential expression using generalized empirical Bayesian estimation. This package enabled us to differentiate between the effects of several biologically meaningful models. In baySeq, posterior probabilities are estimated for each gene, corresponding to a scenario of differential expression. Every gene thus is being assigned a posterior probability of being differentially expressed under a given model, with a sum of one per gene across all models. In order to take more complex situations into account, I defined eight scenarios of expression for the 24 samples (see Table 4.2), including no differential expression (nde), stress-related expression (str),

apomixis-specific expression (apo), ploidy-specific DE (plo), reversal of B. lignifera from apomeiosis

to meiosis (rev), reversal of B. lignifera including stress-response (revs), within-species DE (wsd),

apomixis and species-specific DE (asd), and ploidy-specific DE of apomicts (pda). Each number in

table 4.2 refers to a unique state of expression for a given genomic event (e.g. in the simplest case of

no differential expression(nde), all values are 1; see table 4.2). Note, that in table 4.2, single entries

for control and treatment for each species are shown. However, I used a total of 24 libraries with

three library-replicates per treatment and species. Every entry therefore represents three replicates

of the same species and treatment with the same categorical value. Prior distributions of count

data were calculated using the quasi-likelihood method given a negative binomial distribution and a

sample size of 27,416 (the number of genes found to be present in the data). It was further assumed

that dispersion of the counts was identical for all group structures, which is the default in baySeq.

For the estimation of posterior probabilities over the seven model prior distributions, I did not use the empirical estimation approach. Instead, I assumed that 95% of the data were not differentially expressed (i.e. in the nde group), with the remaining 5% distributed equally among the seven groups that represent differential expression (i.e. excluding nde). For each gene, I thus obtained the posterior

probability of matching one of the eight biological scenarios. Genes with posterior probabilities of

greater than 0.95 were deemed differentially expressed for each respective scenario. I then parsed

the respective groups of genes for each model that contained differentially-expressed genes in R.

Fold changes for the presented genes were calculated after dividing each library with its respective 73 library size.

Results and discussion

Transcriptome assembly and annotation

The 24 libraries contained a total of 306 million reads, with an average number of 12.76 million reads (sd= 1,943381). The 24 transcriptome alignments had a mean mapping rate of 0.9025 (sd=

0.024), with species-specific mean mapping rates of 0.943 for B. stricta, 0.89 for B. lignifera, 0.887

for B. cf. gunnisoniana, and 0.89 for B. exilis x retrofracta.

Gene expression patterns across species

In this study, I attempted to test for a uniform stress-response across all species. Additionally, I wanted to differentiate between the gene expression patterns of sexual and apomictic reproduction, the effects of polyploidy in the genus Boechera as well as signs of putative reversals from apomeiosis to meiosis in B. lignifera.

Before attempting to answer the above given questions, however, I want to mention several important points. The evolutionary history of the genus Boechera has not been fully clarified to date

(Al-Shehbaz et al., 2006; Kiefer et al., 2009b,a; Alexander et al., 2013). I am therefore not able to draw conclusions regarding patterns of lineage diversification within this genus. Additionally, as I obtained gene expression data by sampling whole-flower tissue, paired with the fact that variation in gene expression has been shown to be partially stochastic (Kaern et al., 2005), I can not make exact inferences regarding the presence and absence of specific genes. Thus, I performed a PCA on presence-absence of genes between the four species presented here, to assess patterns of presence and absence of gene expression. Further, sampling from pistils might be confounding my analyses through allele- and tissue-specific gene expression (Adams et al., 2004; Adams and Wendel, 2005).

Finally, as I aligned the RNA-Seq reads to the B. stricta genome assembly, I possibly might have omitted species- as well as apomixis-specific reads in my analyses. The analyses presented here can therefore be seen as a rather conservative estimate of differential expression between the presented biological scenarios. 74

I found 27,416 genes to be represented in the data, and of those, 26,072 genes were expressed in at least one of the four species. From here on, I will refer to the list of informative genes (26,072) as the total number of genes considered in further analyses. Of these, 21,013 were expressed in all species (Fig. C.2). 789 (3%) of the total genes did not contain A. thaliana annotations, and roughly 46% of the expressed genes contained GO terms. I performed a PCA on expression counts normalized by respective library size (see Fig. 4.1 A), and I found that 1) among the apomicts, expression patterns of stress-treatment and control groups were overlapping (on PC1), and 2) the sexual B. stricta was differentiated from the apomicts on PC2. I found 7,192 genes with considerable variation in presence of gene expression. Here, different species were clustered into distinct groups

(see Fig. 4.1 B). It can also be seen in Fig. 4.1 that, overall, the stress treatment did not seem to have a considerable effect on gene expression. Only for B. stricta and B. exilis x retrofracta, the control and treatment groups were not overlapping, yet clustered very close to each other (see Fig.

4.1 A and B). When comparing the PCAs over the normalized gene expression values (Fig. 4.1

A) with the covariance matrix of presence and absence of gene expression (Fig. 4.1 B), I found similar clustering. However, samples of B. cf. gunnisoniana were spread out more considerably on PC2 in the latter PCA. Due to the small number of library replicates, however, I am not able to determine whether the above given spread among species represents significant variation within the four species considered here.

Bayesian model comparison of gene expression

Across all models, I found 1,062 genes to be differentially expressed and 21,013 genes in the non-differentially expressed group (nde). Given the cut-off for the posterior probability of being differentially expressed (0.95), I can thus explain the expression patterns of about 85% of the data with the chosen biological groups. Of the six models that assume DE between different sets of samples, I found the following models to contain differentially-expressed genes (with the number of genes in parenth.): apo (523 genes), wsd (245), plo (116), pda (40), rev (125), and revs (11). Of the remaining two groups, apomixis and species-specific DE (asd) did only contain two genes as differentially expressed and, perhaps more interestingly, the maximum of posterior probabilities in 75 the stress-related expression (str) model did not exceed 0.26 (see Fig. 4.2). Therefore, none of the

26,072 genes in my analysis are expressed in a pattern conforming to a uniform stress response across the four species. Several of the chosen biological scenarios (i.e. wsd, asd, and pda) were specifically chosen to exclude genes that correspond to meaningful biological scenarios to be considered in future works, yet do not directly relate to the given questions in this study. In the following paragraphs,

I will first discuss the str model across species, and then highlight and discuss the model groups which directly address the given questions of this study.

Response to drought stress and species-specific differentiation in Boechera

As mentioned above, I did not find a uniform stress response across all four species. Instead, these results suggest that each species responds individually to drought stress. This can be seen from the clear differentiation of all species in the PCA over both normalized total gene expression and presence-absence of expression (see Fig. 4.1), as well as from the fact that the stress model (str) in the Bayesian model comparison of gene expression patterns did not result in a single gene that was deemed differentially-expressed (see Fig. 4.2). Two points have to be stressed here, however. The stress and control libraries of B. exilis x retrofracta did show a slight differentiation in the PCA over presence and absence of gene expression as well as in the PCA of normalized expression data (see

Fig. 4.1). Additionally, the PCA over the normalized expression data showed that all three apomicts were clustered relatively closely to each other, with the stress and control libraries of the diploid apomictic hybrid B. exilis x retrofracta) again being less differentiated.

I was able to capture a considerable amount of these species-specific differences in the Bayesian model comparison of gene expression, specifically in the ’within-species differential expression model group (wsd) (see Fig. 4.2). This model group might harbor essential metabolic functions and housekeeping genes that are differentiated between the four presented species. Kiefer et al. (2009) suggested that B. stricta would represent an early separated lineage from the remainder of Boechera species (Kiefer et al., 2009b), a conclusion which is supported in my results, based on the separation of this species from all other species in question (see Fig. 4.1). The apomictic accessions of the hybrid B. exilis x retrofracta do seem to be distinct to all other species analysed here in terms of 76 presence-absence of genes expressed (see Fig. 4.1 B) , although its expression patterns seem to be very similar to the apomicts B. lignifera and B. cf. gunnisoniana (see Fig. 4.1 A). Both B. cf.

gunnisoniana and B. lignifera were found to be of hybrid origin (Windham et al., 2016; Li et al.,

2016), as they both harbor extensive heterozygosity (Roy, 1995) and were found to share several

internal transcribed spacer types (Kiefer et al., 2009b). Based on the presence and absence of genes

expressed in this study, we see a distinction between the two species (see Fig. 4.1 B), although there

appears to be high variation within the expression patterns between species-specific libraries (see

Fig. 4.1 A).

The fact that I did not find a uniform response to drought stress does not mean that there was

no stress-response at all. Indeed, I found known stress-response genes across all four model groups

that do contain differentially-expressed genes. Some of these genes and gene families (such as genes

in the TIR-NBS-LRR class family as well as F-box associated domains) are represented at high

rates and with relatively high numbers of homoeologs as well as paralogs. These results suggest

gene duplication, and potentially neo- or subfunctionalization (Lynch and Force, 2000a) of said

gene duplicates, although it is not clear to us whether this happens in the presence or absence of

selective pressures (Lynch and Force, 2000a; Conant and Wolfe, 2008; Flagel and Wendel, 2009). I

suggest that, despite high rates of gene flow among Boechera species, as reported in the past (Böcher,

1951; Roy, 1995; Koch, 2003; Dobeš et al., 2004; Schranz et al., 2005; Kiefer et al., 2009b), there

is evidence for the existence of genome-wide interspecific incompatibilites through the process of

WGD (Lynch and Force, 2000b; Bikard et al., 2009; Bomblies and Weigel, 2010). Whether these

incompatibilities are due to paleopolyploidy or neopolyploidy (Ramsey and Schemske, 1998, 2002),

however, remains equivocal.

Differential expression between sexual and apomictic species (apo model)

Between sexual and apomictic species, I found 523 genes to be differentially expressed. Of

these genes in the apo model group, 50 did not have A. thaliana annotations. 368 genes were

considered upregulated in the apomicts (see Fig. 4.2). Among those, Argonaute 9 (AGO9) was

upregulated in apomicts with a fold change of 153. AGO9 was found to be involved in silencing 77 of transposable elements by mediating DNA methylation in the female gametophyte (Mallory and

Vaucheret, 2010; Schmidt et al., 2015). Additionally, beta glucosidase 18 (AtBG1) was upregulated in apomicts by 336-fold. AtBG1 is involved in the absciscic acid (ABA) biosynthesis, and therefore involved in plant development and adaptation to stressful conditions (Lee et al., 2006). Further,

S-locus lectin protein kinase family protein (Bostr.26527s0097, At1g61420) was upregulated in apomicts with a fold change of 12.7. This gene is expressed in the stigma and contains S-locus receptor kinase (SRK1), known to be involved in Brassica-style self-incompatibility (SI) response

(Schopfer et al., 1999; Nasrallah et al., 2002). Note, however, that I found another A. thaliana homolog of the S-locus lectin protein kinase family protein within the differentially expressed genes in the apo model (Bostr.25463s0122, At4g11900), which was downregulated in apomictic libraries with a fold change of 10.4. Within the 368 genes that were considered upregulated in apomicts,

67 were not found in the sexual B. stricta. Among those were AGAMOUS-like 48 (AGL48), a MADS-box transcription factor expressed in the embryo and, putatively, in the endosperm of

A. thaliana (Bemer et al., 2010), Domains Rearranged Methylase 1 (DRM1), involved in DNA methylation (Bostr.2902s0274, At5g15380), and the SKP1-interacting partner 6 (SKIP6), an F- box protein involved in protein ubiquitination (Andrade et al., 2001; Risseeuw et al., 2003; Wang et al., 2008b). Note that I found 15 different F-box associated genes (Andrade et al., 2001) to be differentially expressed in the apo model group, with 12 A. thaliana paralogs (At5g38390,

At3g03030, At3g57590, At4g10400, At1g32660, At1g32375, At1g23770, At5g53635, At3g03360,

At2g26860, At5g18780, and At3g58920) represented. Of those, 3 were found to be overexpressed in the sexual B. stricta, with the remainder upregulated in the apomicts.

Conversely, of the remaining 190 genes that were considered downregulated in the apomicts,

35 genes were not found in any of the apomicts. Among the downregulated genes, I found Pre- foldin chaperone subunit family protein (PFDN) (Bostr.19427s0025) with a fold change of 22.4.

This protein family is involved in embryo sac egg cell differentiation and mitotic recombination

(Wang et al., 2008b). Further, I found the Glucose-methanol-choline oxidoreductase family protein

(GMCo) (Bostr.15774s0342, At5g51950) downregulated in apomicts with a fold change of 18.5, the histone superfamily protein (H3.1) (Bostr.0568s0253, At5g10400), with a fold change of 15.2. I 78 further found two AGAMOUS-like MADS-box genes that were downregulated in apomicts. AGL19

(Bostr.7867s0083, At4g22950), which has been found to be involved in the Polycomb group protein- related epigenetic regulation of plant development (Schönrock et al., 2006; Liu et al., 2014b), was downregulated in apomicts with a fold change of 29, and AGL22 (Bostr.19427s0024, At2g22540), a transcription factor that is known to act as a floral repressor (Yoo et al., 2006), was downregulated as well, with a fold change of 10.

As mentioned earlier, when I performed a PCA over the presence and absence of gene expression in pistils (see Fig. 4.1 B), I could see a clear distinction between all 4 species, regardless of reproductive mode. However, principal component 1 shows a strong pattern of differentiation between the sexual B. stricta and the three apomicts analysed here. This distinction was much less pronounced in the PCA over normalized expression values (see Fig. PCAs A), with PC2 still showing a distinction between the sexual and the apomicts, but the wide spread among libraries on

PC1 suggests high rates of variation between and within all four species, as discussed above. Finally, for this model group, I have to mention several findings connected to earlier results regarding gene expression in Boechera species. Schmidt et al. (2014) did not find expression of DYAD/SWITCH1 in the apomictic Initial cell (AIC). I found this gene to be expressed in pistils of all four species, although its posterior probability was found to be highest in the model group representing no differential expression (nde) with a value of 0.97 (see SI table). Furthermore, contrary to Schmidt et al. (2014), I did not find the RNA helicase gene MNEME (MEM) to be expressed in the pistils of all four species analysed here. Similarly, I did not find gene expression for UPGRADE-2 (Mau et al., 2013) nor APOLLO (Corral et al., 2013) in any of the four Boechera species.

Differential expression between diploid and triploid species (plo model)

Between the three diploid Boechera species in my RNAseq experiment (B. stricta, B. lig- nifera and B. exilis x retrofracta) and the triploid B. cf. gunnisoniana, I found 124 genes with a posterior probability greater than 0.95. Among these, 13 genes had "unknown function" (as ob- tained from TAIR annotations) whereas ten do not contain A. thaliana annotations. Within the 124 differentially-expressed genes that conform to a strict expression pattern of polyploidy, eight genes 79 were downregulated in the triploid B. cf. gunnisoniana and TRAF-like family protein (Chung et al.,

2002) (but note that I found two more TRAF-like superfamily proteins in plo, with A. thaliana par- alogs At2g25320 and At2g05420, not shown here) Two of the eight genes found to be downregulated in triploids are not known (Bostr.6769s0003 and Bostr.29268s0005)

116 of the 124 differentially-expressed genes in this model were found to be upregulated in the triploid B. cf. gunnisoniana. I will further characterize several of the known genes therein (see also Fig. 4.2). The ABC transporter family protein GCN2 (fold change 8.4), which is associated with flower development (Ma, 2005) and has been found to negatively regulate seed germination

(Liu et al., 2015) (ATP binding and ATPase activity). COG1 (found twice in the plo differentially- expressed genes), upregulated in the triploid with a fold change of 15.7, acts in disease resistance

(TIR-NBS-LRR class) family), particularly in defense response to pathogenic fungi (McHale et al.,

2006; Weaver et al., 2006). MAP KINASE 11 (MPK11), which has been found to be activated by

ABA (Eschen-Lippold et al., 2012) showed a fold change of 6.7. Lastly, I want to mention DNA topoisomerase 1 alpha (TOP1ALPHA), which was overexpressed in the triploid with a fold change of

14.1. TOP1ALPHA is known to be involved in floral meristem development and regulation of gene silencing and has recently been shown to regulate floral meristem determinancy in the AGAMOUS

(AG) pathway through histone modification and transcriptional repression (Dinh et al., 2014; Liu et al., 2014b).

Differential expression indicative of a potential reversal from apomeiosis to meiosis (rev and revs models)

I further examined the apomict B. lignifera for transcriptomic evidence of reversals from apomeiosis to meiosis. When screening seeds in the DIC microscope, I found that B. lignifera drought-treated pistils showed an increase in tetrad formation, compared to the control group

(Fig. C.1). Additionally, in the PCAs, B. lignifera is positioned between the two remaining apomicts on PC2, yet remains on the same axis compared to B. stricta. I therefore hypothesized that B. lignifera might be experiencing a reversal from apomeiosis to meiosis, when faced with severe drought-stress. To test this hypothesis, I employed two model groups (rev and revs), where B. lig- 80 nifera is essentially grouped with the sexual B. stricta contrasted with the two remaining apomictic species. The two models differ in that rev does not include a difference between stress and control treatments, whereas revs does in fact include a stress-effect (see Table 4.2). Both models contained differentially-expressed genes, with 125 and 11 genes, respectively. In the rev model, 18 genes did not contain TAIR annotations, 11 genes carried a domain of unknown function and 62 genes did not carry GO terms. Of the 125 differentially-expressed genes in rev, 84 were upregulated, with the remaining 41 downregulated in B. stricta and B. lignifera. Among the upregulated genes, I found several receptor-like proteins, namely RLP6, RLP7 and RLP9 (At1g58190, At1g47890, &

At1g45616), and the connected leucine-rich repeat protein kinase family (At3g47090), which are involved in protein binding (Kobe and Kajava, 2001; Diévart and Clark, 2003; Wang et al., 2008a;

Ascencio-Ibáñez et al., 2008). Conversely, among the downregulated genes in the rev group, I found several genes connected to cell wall development, such as Anther20 (ATA20, At3g15400), which encodes cell wall-related proteins that are related to postmeiotic reproductive development in A. thaliana (Zik and Irish, 2003; Xu et al., 2010), and other glycine-rich domains, such as GRP17 and GRP19 (At5g07530.2 & At5g07550) (Rubinelli et al., 1998; Zik and Irish, 2003). Conversely, of the 11 differentially-expressed genes in the revs model, 6 did not have GO terms, one gene encodes a protein of unknown function, and all genes carried TAIR annotations. Among those 11 differentially-expressed genes, I found two transcripts of the LOB-domain containing protein 20

(LBD20, AT3G03760), which is involved in the development of lateral organ boundaries and is thought to be involved in a diverse set of functions (Shuai et al., 2002; Matsumura et al., 2009).

Further, the Maternally expressed gene (MEG) family, as well as the lipid transfer protein Anther 7

(ATA7), were found in this set of genes (Rubinelli et al., 1998).

Conclusions

In this study, I sought to characterize patterns of gene expression between the four Boechera species, including the diploid sexual B. stricta, the diploid apomicts B. lignifera and B. exilis x retrofracta and the triploid apomict B. cf. gunnisoniana. By using a diverse set of biologically plausible scenarios in the DE analysis, I was able to highlight a broad set of differentially-expressed 81 genes. Besides general patterns of disease resistance stress-associated responses, I showed that all four species do not respond uniformly to drought-stress. I also showed that a relatively high number of genes correspond with apomictic reproduction among the four species, including some genes that are assumed to be related to the functioning and control of apomixis (e.g. Hörandl et al., 2008;

Schranz et al., 2005; Corral et al., 2013; Mau et al., 2013; Schmidt et al., 2014, 2015). Further, I showed that the polyploid apomict B. cf. gunnisoniana responds in a manner partially conforming to gene dosage effects (Birchler et al., 2005). Additionally, I found genes involved in processes such as development and cell death, disease resistance, floral meristem development, regulation of gene silencing, and histone modification (Tenhaken et al., 2005; Ma, 2005; Weaver et al., 2006; Dinh et al.,

2014; Liu et al., 2014a). Finally, I was able to highlight that B. lignifera did respond in a pattern that might be associated with a reversal from apomeiosis to meiosis. These results, combined with the additional models incorporated in my analyses that were not of primary relevance to the questions addressed here, lay the foundations for a plethora of further studies on stress-responses between species with differing reproductive modes, the control and functioning of apomixis in Boechera, and the role of polyploidy and gene duplication in plants. Answering these questions will not only permit us to proceed on the path to fix beneficial traits in agriculturally interesting plants, but also let us further understand the forces that drive the evolution of natural populations. 82 W W W W 52” 31” 19” 30” 0 0 0 0 48 31 13 53 ◦ ◦ ◦ ◦ 106 109 109 110 , , , , N N N N 39” 06” 46” 21” 0 0 0 0 31 33 00 33 ◦ ◦ ◦ ◦ GPS 38 41 40 40 species and hybrids used in this study. Accession and voucher specimen Figures & tables Boechera Collection location (Rocky Mtn Cordillera, USA) CO, Gunnison Co., hillside E ofWY, Sweetwater Highway Co., 50 bluffs S ofUT, Green Utah River Co., Right Fork, HobbleUT, Creek Duchesne Canyon Co., N Fork of Duchesne River (W side) Accession CO11005 WY05001 UT11004 UT10007 hybrid Taxon B. cf. gunnisoniana B. lignifera B. exilis x retrofracta B. stricta Table 4.1. Accession, voucher andnumbers location are information those for of JGC and the Intermountain Herbarium, Utah State University, respectively. 83 1212 2 3 2 1 1 7 2 3 4 3 1 8 4 4 B. exilis x retrofracta ctrl trtm 1222 2 3 2 1 2 5 2 3 4 5 1 6 4 6 B. cf. gunnisoniana ctrl trtm 12 2 1 2 1 1 1 1 1 2 3 1 3 4 3 4 4 B. lignifera ctrl trtm 11 2 1 1 1 1 1 1 1 2 1 1 1 2 1 2 2 B. stricta ctrl trtm str rev plo asd nde apo pda wsd revs Table 4.2. Model group classificationand for drought Bayesian treatment model (trtm) comparison for eachthese, of species. differential although expression. Note I that used Columns every the refershow sample library categorical to here replicates consists values libraries, of for throughout with groups 3 the control replicates ofvalues analyses. (ctrl) each. of expression. Rows expression For Given refer for simplicity two the to cells of subsequent the this with calculation respective table, the of model I same posterior of collapsed value, probabilities. gene it expression. is assumed Numbers that in these cells two groups experienced similar 84

Figure 4.1. Biplots of PCAs performed on (A) expression values normalized by respective library size (n= 26,072), and (B) presence and absence of gene expression over all genes that showed variation across the presence of gene expression (n= 7,192). Legend labels correspond to cf. gunni: B. cf. gunnisoniana; ligni: B. lignifera; exi x retro: B. exilis x retrofracta; and stricta: B. stricta 85

Figure 4.2. Distribution of posterior probabilities across all nine models of expression. The 26,072 expressed genes and their corresponding posterior probabilities are shown for each expression model, with dotted lines corresponding to a posterior probability of 0.95. 86

LITERATURE CITED

Adams, K. L., R. Percifield, and J. F. Wendel, 2004. Organ-specific silencing of duplicated genes in a newly synthesized cotton allotetraploid. Genetics 168:2217–26.

Adams, K. L. and J. F. Wendel, 2005. Allele-specific, bidirectional silencing of an alcohol dehydro- genase gene in different organs of interspecific diploid cotton hybrids. Genetics 171:2139–2142.

Adl, S. M., A. G. Simpson, M. A. Farmer, R. A. Andersen, O. R. Anderson, J. R. Barta, S. S. Bowser, G. Brugerolle, R. A. Fensome, S. Fredericq, et al., 2005. The new higher level classification of eukaryotes with emphasis on the taxonomy of protists. J Eukaryot Microbiol 52:399–451.

Adl, S. M., A. G. B. Simpson, C. E. Lane, J. Lukeš, D. Bass, S. S. Bowser, M. Brown, F. Burki, M. Dunthorn, V. Hampl, A. Heiss, and E. Al., 2012. The revised classification of eukaryotes. J Eukaryot Microbiol 59:1–45.

Al-Shehbaz, I. A., M. A. Beilstein, and E. A. Kellogg, 2006. Systematics and phylogeny of the Brassicaceae (Cruciferae): An overview. Plant Systematics and Evolution 259:89–120.

Alexander, P. J., M. D. Windham, J. B. Beck, I. A. Al-Shehbaz, and C. D. Bailey, 2013. Molec- ular phylogenetics and taxonomy of the genus Boechera and related genera (Brassicaceae: Boechereae). Syst Bot 38:192–209.

———, 2015. Weaving a tangled web: Divergent and reticulate speciation in Boechera fendleri sensu lato (Brassicaceae: Boechereae). Syst Bot 40:572–596.

Aliyu, O. M., M. E. Schranz, and T. F. Sharbel, 2010. Quantitative variation for apomictic repro- duction in the genus Boechera (Brassicaceae). Am J Bot 97:1719–1731.

Andrade, M. A., M. Gonzalez-Guzman, R. Serrano, and P. L. Rodriguez, 2001. A combination of the F-box motif and Kelch repeats defines a large Arabidopsis family of F-box proteins. Plant Mol Biol 46:603–614.

Mateo de Arias, M., 2015. Effect of plant stress on facultative apomixis in Boechera (Brassicaceae). Ph.D. dissertation, Utah State University.

Ascencio-Ibáñez, J. T., R. Sozzani, T.-J. Lee, T.-M. Chu, R. D. Wolfinger, R. Cella, and L. Hanley- Bowdoin, 2008. Global analysis of Arabidopsis gene expression uncovers a complex array of changes impacting pathogen response and cell cycle during geminivirus infection. Plant Physiol 148:436–454.

Asker, S. and L. Jerling, 1992. Apomixis in Plants. CRC Press, Boca Raton and London.

Beck, J. B., P. J. Alexander, L. Allphin, I. A. Al-Shehbaz, C. Rushworth, C. D. Bailey, and M. D. Windham, 2011. Does hybridization drive the transition to asexuality in diploid Boechera? Evolution 66:985–95.

Beilstein, M. A., N. S. Nagalingum, M. D. Clements, S. R. Manchester, and S. Mathews, 2010. Dated molecular phylogenies indicate a Miocene origin for Arabidopsis thaliana. Proc Natl Acad Sci USA 107:18724–18728. 87

Bemer, M., K. Heijmans, C. Airoldi, B. Davies, and G. C. Angenent, 2010. An atlas of type I MADS box gene expression during female gametophyte and seed development in Arabidopsis. Plant Physiol 154:287–300.

Bierzychudek, P., 1985. Patterns in plant parthenogenesis. Experientia 41:1255–1264.

Bikard, D., D. Patel, C. Le Metté, V. Giorgi, C. Camilleri, M. J. Bennett, and O. Loudet, 2009. of duplicate genes leads to genetic incompatibilities within A. thaliana. Science 323:623–626.

Birchler, J. A., N. C. Riddle, D. L. Auger, and R. A. Veitia, 2005. Dosage balance in gene regulation: Biological implications. Trends Genet 21:219–226.

Böcher, T. W., 1951. Cytological and embryological studies in the amphi-apomictic Arabis holboellii complex. Kongel.Danske Vidensk.-Selskab.Biol.Skr. 6:1–59.

Bomblies, K. and D. Weigel, 2010. Arabidopsis and relatives as models for the study of genetic and genomic incompatibilities. Phil Trans R Soc Lond B Biol Sci 365:1815–1823.

Bowers, J. E., B. A. Chapman, J. Rong, and A. H. Paterson, 2003. Unravelling angiosperm genome evolution by phylogenetic analysis of chromosomal duplication events. Nature 422:433–438.

Carman, J., 1997. Asynchronous expression of duplicate genes in angiosperms may cause apomixis, bispory, tetraspory, and polyembryony. Biol J Linn Soc 61:51–94.

Cavalier-Smith, T., 2010. Origin of the , mitosis and sex: roles of intracellular coevolu- tion. Biol Direct 5:78.

Chung, J. Y., Y. C. Park, H. Ye, and H. Wu, 2002. All TRAFs are not created equal: common and distinct molecular mechanisms of TRAF-mediated signal transduction. J Cell Sci 115:679–688.

Comai, L., A. Madlung, C. Josefsson, and A. Tyagi, 2003. Do the different parental ’heteromes’ cause genomic shock in newly formed allopolyploids? Phil Trans R Soc Lond B Biol Sci 358:1149–1155.

Conant, G. C. and K. H. Wolfe, 2008. Turning a hobby into a job: how duplicated genes find new functions. Nature Rev Genet 9:938–950.

Corral, J. M., H. Vogel, O. M. Aliyu, G. Hensel, T. Thiel, J. Kumlehn, and T. F. Sharbel, 2013. A conserved apomixis-specific polymorphism is correlated with exclusive Exonuclease expression in premeiotic ovules of apomictic Boechera species. Plant Physiol 163:1660–1672.

Cui, L., P. K. Wall, J. H. Leebens-Mack, B. G. Lindsay, D. E. Soltis, J. J. Doyle, P. S. Soltis, J. E. Carlson, K. Arumuganathan, A. Barakat, et al., 2006. Widespread genome duplications throughout the history of flowering plants. Genome Res 16:738–749.

Diévart, A. and S. E. Clark, 2003. Using mutant alleles to determine the structure and function of leucine-rich repeat receptor-like kinases. Curr Opin Plant Biol 6:507–516.

van Dijk, P. J. and K. Vijverberg, 2005. The significance of apomixis in the evolution of the angiosperms: a reappraisal. Regnum Vegetabile 143:101. 88

Dinh, T. T., L. Gao, X. Liu, D. Li, S. Li, Y. Zhao, M. O’Leary, B. Le, R. J. Schmitz, P. Manavella, S. Li, D. Weigel, O. Pontes, J. R. Ecker, and X. Chen, 2014. DNA topoisomerase 1α promotes transcriptional silencing of transposable elements through DNA methylation and histone lysine 9 dimethylation in Arabidopsis. PLoS Genet 10.

Dobeš, C., T. Mitchell-Olds, and M. A. Koch, 2004. Intraspecific diversification in North American Boechera stricta (= Arabis drummondii), Boechera x divaricarpa, and Boechera holboellii (Bras- sicaceae) inferred from nuclear and chloroplast molecular markers - An integrative approach. Am J Bot 91:2087–2101.

Doyle, J. J., L. E. Flagel, A. H. Paterson, R. A. Rapp, D. E. Soltis, P. S. Soltis, and J. F. Wendel, 2008. Evolutionary genetics of genome merger and doubling in plants. Annu Rev Genet 42:443–461.

Eschen-Lippold, L., G. Bethke, M. Palm-Forster, P. Pecher, N. Bauer, J. Glazebrook, D. Scheel, and J. Lee, 2012. MPK11 - a fourth elicitor-responsive mitogen-activated protein kinase in Arabidopsis thaliana. Plant Signal Behav 7:1203–1205.

Evans, L. T. and R. B. Knox, 1969. Environmental control of reproduction in Themeda australis. Aust J Bot 17:375–389.

Flagel, L. E. and J. F. Wendel, 2009. Gene duplication and evolutionary novelty in plants. New Phytol 183:557–564.

Grant, V., 1981. Plant Speciation. 2nd ed. Columbia University Press, New York, USA.

Grimanelli, D., O. Leblanc, E. Perotti, and U. Grossniklaus, 2001. Developmental genetics of gametophytic apomixis. Trends Genet 17:597–604.

Hardcastle, T. J., 2014. Generalized empirical Bayesian methods for discovery of differential data in high-throughput biology. biorxiv Pp. 1–21.

Hardcastle, T. J. and K. A. Kelly, 2010. baySeq: Empirical Bayesian methods for identifying differential expression in sequence count data. BMC Bioinformatics 11:422.

Haugen, R., L. Steffes, J. Wolf, P. Brown, S. Matzner, and D. H. Siemens, 2008. Evolution of drought tolerance and defense: Dependence of tradeoffs on mechanism, environment and defense switching. Oikos 117:231–244.

Hegarty, M. J. and S. J. Hiscock, 2008. Genomic Clues to the Evolutionary Success of Polyploid Plants. Curr Biol 18:435–444.

Hörandl, E., 2006. The complex causality of geographical parthenogenesis. New Phytol 171:525–38.

———, 2010. The evolution of self-fertility in apomictic plants. Sex Plant Reprod 23:73–86.

Hörandl, E., A.-C. Cosendai, and E. M. Temsch, 2008. Understanding the geographic distributions of apomictic plants: a case for a pluralistic approach. Plant Ecol Divers 1:1–21.

Kaern, M., T. C. Elston, W. J. Blake, and J. J. Collins, 2005. Stochasticity in gene expression: from theories to . Nature Rev Genet 6:451–464. 89

Kearney, M., 2005. Hybridization, glaciation and geographical parthenogenesis. Trends Ecol Evol 20:495–502.

Kiefer, C., C. Dobeš, and M. A. Koch, 2009a. Boechera or not? Phylogeny and of eastern North American Boechera species (Brassicaceae). Taxon 58:1109–1121.

Kiefer, C., C. Dobes, T. F. Sharbel, and M. a. Koch, 2009b. Phylogeographic structure of the chloro- plast DNA in North American Boechera – a genus and continental-wide perspective. Mol Phylogenet Evol 52:303–11.

Knight, C. A., H. Vogel, J. Kroymann, A. Shumate, H. Witsenboer, and T. Mitchell-Olds, 2006. Ex- pression profiling and local adaptation of Boechera holboellii populations for water use efficiency across a naturally occurring water stress gradient. Mol Ecol 15:1229–1237.

Kobe, B. and A. V. Kajava, 2001. The leucine-rich repeat as a protein recognition motif. Curr Opin Struct Biol 11:725–732.

Koch, M., I. A. Al-Shehbaz, and K. Mummenhoff, 2003. Molecular systematics, evolution, and population biology in the mustard family (Brassicaceae). Ann. Missouri Bot. Gard. 90:151–171.

Koch, M. A., 2003. Multiple Hybrid Formation in Natural Populations: Concerted Evolution of the Internal Transcribed Spacer of Nuclear Ribosomal DNA (ITS) in North American Arabis divaricarpa (Brassicaceae). Mol Biol Evol 20:338–350.

Koltunow, A. M., 1993. Apomixis: Embryo Sacs and Embryos Formed without Meiosis or Fertil- ization in Ovules. Plant Cell 5:1425–1437.

Koltunow, A. M. and U. Grossniklaus, 2003. Apomixis: a developmental perspective. Annu Rev Plant Biol 54:547–574.

Langmead, B., C. Trapnell, M. Pop, and S. L. Salzberg, 2009. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol 10:R25.

Lee, K. H., H. L. Piao, H. Y. Kim, S. M. Choi, F. Jiang, W. Hartung, I. Hwang, J. M. Kwak, I. J. Lee, and I. Hwang, 2006. Activation of glucosidase via stress-induced polymerization rapidly increases active pools of abscisic acid. Cell 126:1109–1120.

Leitch, A. R. and I. J. Leitch, 2008. Genomic Plasticity and the Diversity of Polyploid Plants. Science 320:481–483.

Levin, D. A., 1983. The University of Chicago. Am Nat 12:1–25.

Li, F.-W., C. A. Rushworth, J. B. Beck, and M. D. Windham, 2016. Boechera Microsatellite Website: an online portal for species identification and determining hybrid parentage. Database: in press .

Li, H., B. Handsaker, A. Wysoker, T. Fennell, J. Ruan, N. Homer, G. Marth, G. Abecasis, and R. Durbin, 2009. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25:2078– 9. 90

Liu, S., E. D. Lorenzen, M. Fumagalli, B. Li, K. Harris, Z. Xiong, L. Zhou, T. S. Korneliussen, M. Somel, C. Babbitt, G. Wray, J. Li, W. He, Z. Wang, W. Fu, X. Xiang, C. C. Morgan, A. Doherty, M. J. O’Connell, J. O. McInerney, E. W. Born, L. Dalén, R. Dietz, L. Orlando, C. Sonne, G. Zhang, R. Nielsen, E. Willerslev, and J. Wang, 2014a. Population genomics reveal recent speciation and rapid evolutionary adaptation in polar bears. Cell 157:785–794.

Liu, X., L. Gao, T. T. Dinh, T. Shi, D. Li, R. Wang, L. Guo, L. Xiao, and X. Chen, 2014b. DNA topoisomerase I affects polycomb group protein-mediated epigenetic regulation and plant development by altering nucleosome distribution in Arabidopsis. Plant Cell 26:2803–2817.

Liu, X., A. Merchant, K. S. Rockett, M. McCormack, and K. M. Pajerowska-Mukhtar, 2015. Charac- terization of Arabidopsis thaliana GCN2 kinase roles in seed germination and plant development. Plant Signal Behav 10:e992264.

Lovell, J. T., O. M. Aliyu, M. Mau, M. E. Schranz, M. Koch, C. Kiefer, B.-H. Song, T. Mitchell-Olds, and T. F. Sharbel, 2013. On the origin and evolution of apomixis in Boechera. Plant Reprod 26:309–15.

Lynch, M. and A. Force, 2000a. The probability of duplicate gene preservation by subfunctional- ization. Genetics 154:459–473.

Lynch, M. and A. G. Force, 2000b. The Origin of Interspecific Genomic Incompatibility via Gene Duplication. Am Nat 156:590–605.

Ma, H., 2005. Molecular genetic analyses of microsporogenesis and microgametogenesis in flowering plants. Annual Reviews of Plant Biology 56:393–434.

Mallory, A. and H. Vaucheret, 2010. Form, function, and regulation of ARGONAUTE proteins. Plant Cell 22:3879–3889.

Matsumura, Y., H. Iwakawa, Y. Machida, and C. Machida, 2009. Characterization of genes in the asymmetric leaves2/lateral organ boundaries (as2/lob) family in Arabidopsis thaliana, and functional and molecular comparisons between as2 and other family members. Plant Journal 58:525–537.

Matzk, F., A. Meister, and I. Schubert, 2000. An efficient screen for reproductive pathways using mature seeds of monocots and dicots. Plant Journal 21:97–108.

Mau, M., J. M. Corral, H. Vogel, M. Melzer, J. Fuchs, M. Kuhlmann, N. de Storme, D. Geelen, and T. F. Sharbel, 2013. The conserved chimeric transcript UPGRADE2 is associated with unreduced pollen formation and is exclusively found in apomictic Boechera species. Plant Physiol 163:1640–59.

McHale, L., X. Tan, P. Koehl, and R. W. Michelmore, 2006. Plant nbs-lrr proteins: adaptable guards. Genome Biol 7:1.

McKay, J. K., J. G. Bishop, J. Z. Lin, J. H. Richards, a. Sala, and T. Mitchell-Olds, 2001. Local adaptation across a climatic gradient despite small effective population size in the rare sapphire rockcress. Proc R Soc Lond [Biol] 268:1715–1721. 91

Molins, M. P., J. M. Corral, O. M. Aliyu, M. a. Koch, A. Betzin, J. L. Maron, and T. F. Sharbel, 2014. Biogeographic variation in genetic variability, apomixis expression and ploidy of St. John’s wort (Hypericum perforatum) across its native and introduced range. Ann Bot 113:417–27.

Müntzing, A., 1931. Note on the cytology of some apomictic species. Hereditas 15:166– 178.

Narayan, K. N., 1962. Apomixis in some species of Pennisetum and Panicum antidotale. Pp. 155–161, in Plant Embryology. A Symposium. CSIR, New Delhi.

Nasrallah, M. E., P. Liu, and J. B. Nasrallah, 2002. Generation of self-incompatible Arabidopsis thaliana by transfer of two S locus genes from A. lyrata. Science 297:247–9.

Nogler, G. A., 1984. Gametophytic apomixis. chap. 10, Pp. 475–518, in B. M. Johri, ed. Embryology of Angiosperms. Springer.

R Development Core Team, 2015. An Introduction to R. R Foundation for Statistical Computing, Vienna, Austria. URL http://www.r-project.org.

Ramsey, J. and D. W. Schemske, 1998. Pathways, mechanisms, and rates of polyploid formation in flowering plants. Annu Rev Ecol Syst 29:467–501.

———, 2002. Neopolyploidy in flowering plants. Annu Rev Ecol Syst 33:589–639.

Risseeuw, E. P., T. E. Daskalchuk, T. W. Banks, E. Liu, J. Cotelesage, H. Hellmann, M. Estelle, D. E. Somers, and W. L. Crosby, 2003. Protein interaction analysis of SCF ubiquitin E3 ligase subunits from Arabidopsis. Plant Journal 34:753–767.

Rollins, R. C., 1983. Interspecific hybridization and taxon uniformity in Arabis (Cruciferae). Am J Bot 70:625–634.

Roy, B. A., 1995. The breeding systems of six species of Arabis (Brassicaceae). Am J Bot 82:869– 877.

Rubinelli, P., Y. Hu, and H. Ma, 1998. Identification, sequence analysis and expression studies of novel anther-specific genes of Arabidopsis thaliana. Plant Mol Biol 37:607–619.

Schmidt, A., M. W. Schmid, and U. Grossniklaus, 2015. Plant germline formation: common concepts and developmental flexibility in sexual and asexual reproduction. Development 142:229–241.

Schmidt, A., M. W.Schmid, U. C. Klostermeier, W.Qi, D. Guthörl, C. Sailer, M. Waller, P.Rosenstiel, and U. Grossniklaus, 2014. Apomictic and sexual germline development differ with respect to cell cycle, transcriptional, hormonal and epigenetic regulation. PLoS Genet 10:e1004476.

Schönrock, N., R. Bouveret, O. Leroy, L. Borghi, C. Köhler, W. Gruissem, and L. Hennig, 2006. Polycomb-group proteins repress the floral activator AGL19 in the FLC-independent pathway. Genes and Development 20:1667–1678.

Schopfer, C. R., M. E. Nasrallah, and J. B. Nasrallah, 1999. The male determinant of self- incompatibility in Brassica. Science 286:1697–700. 92

Schranz, M. E., C. Dobeš, M. A. Koch, and T. Mitchell-Olds, 2005. Sexual reproduction, hy- bridization, apomixis, and polyploidization in the genus Boechera (Brassicaceae). Am J Bot 92:1797–1810.

Schranz, M. E. and T. Mitchell-Olds, 2006. Independent ancient polyploidy events in the sister families Brassicaceae and Cleomaceae. Plant Cell 18:1152–1165.

Sharbel, T. F., M.-L. Voigt, J. M. Corral, G. Galla, J. Kumlehn, C. Klukas, F. Schreiber, H. Vogel, and B. Rotter, 2010. Apomictic and sexual ovules of Boechera display heterochronic global gene expression patterns. Plant Cell 22:655–71.

Sharbel, T. F., M.-L. Voigt, J. M. Corral, T. Thiel, A. Varshney, J. Kumlehn, H. Vogel, and B. Rotter, 2009. Molecular signatures of apomictic and sexual ovules in the Boechera holboellii complex. Plant Journal 58:870–82.

Shuai, B., C. G. Reynaga-Peña, and P. S. Springer, 2002. The lateral organ boundaries gene defines a novel, plant-specific gene family. Plant Physiol 129:747–761.

Siena, L. A., M. E. Sartor, F. Espinoza, C. L. Quarin, and J. P. A. Ortiz, 2008. Genetic and embryological evidences of apomixis at the diploid level in Paspalum rufum support recurrent auto-polyploidization in the species. Sex Plant Reprod 21:205–215.

Soltis, D. E., C. J. Visger, and P. S. Soltis, 2014. The polyploidy revolution then...and now: Stebbins revisited. Am J Bot 101:1057–1078.

Song, Q. and Z. J. Chen, 2015. Epigenetic and developmental regulation in plant polyploids. Curr Opin Plant Biol 24:101–109.

Stebbins, G. L., 1950. Variation and evolution in plants. Columbia University Press, New York City.

Su, Z., X. Ma, H. Guo, N. L. Sukiran, B. Guo, S. M. Assmann, and H. Ma, 2013. Flower development under drought stress: morphological and transcriptomic analyses reveal acute responses and long- term acclimation in Arabidopsis. Plant Cell 25:3785–807.

Tenhaken, R., T. Doerks, and P. Bork, 2005. DCD - a novel plant specific domain in proteins involved in development and . BMC Bioinformatics 6:169.

Trapnell, C., L. Pachter, and S. L. Salzberg, 2009. TopHat: discovering splice junctions with RNA-Seq. Bioinformatics 25:1105–1111.

Vandel, A., 1928. La parténogenèse géographique. Contribution à l’étude biologique et cytologique de la parténogenèse naturelle. Bulletin Biologique de la France et de la Belgique 62:164–281.

Wagner, W. H., 1970. Biosystematics and Evolutionary Noise. Taxon 19:146–151.

Wang, G., U. Ellendorff, B. Kemp, J. W. Mansfield, A. Forsyth, K. Mitchell, K. Bastas, C.-M. Liu, A. Woods-Tör, C. Zipfel, et al., 2008a. A genome-wide functional investigation into the roles of receptor-like proteins in arabidopsis. Plant Physiol 147:503–517.

Wang, Y., W.-Z. Zhang, L.-F. Song, J.-J. Zou, Z. Su, and W.-H. Wu, 2008b. Transcriptome analyses show changes in gene expression to accompany pollen germination and tube growth in Arabidopsis. Plant Physiol 148:1201–1211. 93

Weaver, M. L., M. R. Swiderski, Y. Li, and J. D. G. Jones, 2006. The Arabidopsis thaliana TIR-NB- LRR R-protein, RPP1A; protein localization and constitutive activation of defence by truncated alleles in tobacco and Arabidopsis. Plant Journal 47:829–840.

Weimarck, G., 1967. Apomixis and sexuality in Hierochloë australis and in Swedish H. odorata on different polyploid levels. Bot. Notiser 120:209–235.

Windham, M. D. and I. A. Al-Shehbaz, 2006. New and noteworthy species of Boechera (Brassi- caceae) I: Sexual diploids. Harv Pap Bot 11:61–88.

———, 2007a. New and Noteworthy Species of Boechera (Brassicaceae) II: Apomictic Hybrids. Harv Pap Bot 11:257–274.

———, 2007b. New and noteworthy species of Boechera (Brassicaceae) III: Additional sexual diploids and apomictic hybrids. Harv Pap Bot 12:235–257.

Windham, M. D., J. B. Beck, F.-W. Li, L. Allphin, J. G. Carman, D. A. Sherwood, C. A. Rushworth, E. Sigel, P. J. Alexander, C. D. Bailey, et al., 2016. Searching for diamonds in the apomictic rough: a case study involving Boechera lignifera (Brassicaceae). Syst Botany 40:1031–1044.

Wood, T. E., N. Takebayashi, M. S. Barker, I. Mayrose, P. B. Greenspoon, and L. H. Rieseberg, 2009. The frequency of polyploid speciation in vascular plants. Proc Natl Acad Sci USA 106:13875– 13879.

Xu, J., C. Yang, Z. Yuan, D. Zhang, M. Y. Gondwe, Z. Ding, W. Liang, D. Zhang, and Z. A. Wilson, 2010. The aborted microspores regulatory network is required for postmeiotic male reproductive development in Arabidopsis thaliana. The Plant Cell 22:91–107.

Yoo, S. K., J. S. Lee, and J. H. Ahn, 2006. Overexpression of AGAMOUS-LIKE 28 (AGL28) pro- motes flowering by upregulating expression of floral promoters within the autonomous pathway. Biochem Biophys Res Commun 348:929–936.

Zik, M. and V. F. Irish, 2003. Global identification of target genes regulated by apetala3 and pistillata floral homeotic gene action. The Plant Cell 15:207–222. 94

CHAPTER 5

CONCLUSIONS

Introduction

In this dissertation, I sought to investigate different aspects of the evolution of several species in the plant genus Boechera (Brassicaceae) from multiple distinct angles. These angles essentially represent different ways to quantify genetic variation, from genetic variation between species over genetic variation within species, to the functional derivate of part of this variation through gene expression. I presented ways to determine the role of hybridization on reproductive isolation between a set of closely related species in the genus, as well as discern levels of genetic diversity and genetic variation in natural populations of a Boechera species that is endemic to northern Utah,

B. lasiocarpa, and finally, to determine the signatures of gene expression under drought stress between Boechera species with different reproductive modes and ploidy levels. This dissertation represents the application of several research interests of mine to a genus that has been considered very complex due to known extensive hybridization between congeners, the existence of multiple reproductive modes with varying ploidy levels, and the general lack of clarity of morphological traits for species determination (see chapter 1). My research interests comprise various aspects of , namely the application of novel sequencing technology, combined with computational and statistical approaches, to understand processes such as speciation, hybridization and admixture, the origin, maintenance and loss of genetic variation under different demographic scenarios, the role and evolution of asexual versus sexual reproduction and varying mating systems therein, and the effects of genomic architecture on population genetic dynamics within natural populations. None of the research questions mentioned here are particularly new to the fields of ecology and evolutionary biology. In fact, all of these aforementioned topics have a long history.

However, due to rapid current advances in technology, methodology, and theory in the broader fields of genetics and evolutionary biology, we are now revisiting said questions on a truly unprecedented scale. The amount of data that can now be produced using High-Throughput-Sequencing platforms presents us with exceptional opportunities to re-assess former questions in evolutionary biology, but it also introduces new challenges when attempting to make sense of these data. These challenges 95 are predominantly of a statistical as well as computational nature. In the following paragraphs, I will attempt to synthesize the various aspects of my dissertation and present opportunities for future work in the genus Boechera and beyond.

Contribution

The first part of my dissertation is essentially a way to quantify genetic variation between species. In chapter 2, I focused on a subgroup within the Boechera genus, the B. puberula clade

(which was assumed to be monophyletic) and presented the most extensive evolutionary history of the clade so far. Based on numbers of molecular markers that exceeded prior marker numbers used in the system by roughly 1000-fold, I presented evidence for monophyly in the B. puberula group and additional evidence for a split of the B. puberula taxon into two distinct subspecies, B. puberula puberula and B. puberula arida. I further presented the updated geographical distribution across the western United States and displayed patterns of admixture and reproductive isolation between the members of the puberula clade and two additional species, B. retrofracta and B. stricta, that are distributed widely across North America and which are known to hybridize extensively within the genus.

In chapter 3, I transitioned from genetic variation between species to the assessment of said variation within one specific species, and moreover, within-population genetic variation and genetic diversity. In this chapter, I presented admixture analyses of 105 diploid individuals of B. lasiocarpa, a montane endemic to Utah (which is part of the B. puberula clade, see chapter 2), after prior analyses to exclude all hybrids. B. lasiocarpa was formerly described as occurring only in small populations in the Wasatch Range of Utah, and it is listed by the Utah Native Plant Society as locally endemic, but its’ conservation status was not known (http://www.utahrareplants.org/). I described levels of genetic diversity and was able to present evidence that the rare endemic was more genetically diverse than the widespread B. stricta. I then assessed differences between common and rare single nucleotide variants (SNVs), and based on these different types of variants, I could show that populations carried considerable variation in genetic diversity, a finding that seems to be consistent with differences in population size encountered. Additionally, I presented evidence for varying degrees of population 96 genetic differentiation and population substructure between different localities and discussed the processes that could lead to the patterns encountered among and within populations.

For genetic variation to be truly meaningful on an ecological and evolutionary scale, however, we have to aim our attention at functional genetic variation. One way to do so is through the assessment of transcriptomic data, where we assess gene expression patterns under different experimental conditions. As might be evident from the preceding chapters, a central theme in the genus Boechera is the variation in reproductive modes and polyploidy as a result of extensive hybridization events.

In chapter 4, I described the gene expression signatures across four Boechera species (one of which is a hybrid between B. retrofracta and B. exilis, where the latter is a member of the B. puberula clade; see chapter 2) in an experiment to discern differential expression (DE) of species with sexual versus asexual (i.e. apomictic) reproductive modes and different ploidy levels under drought stress.

Bayesian analysis of differential expression based on normalized expression data was used to estimate the per-gene probability for DE under eight biological scenarios. I specifically asked whether these species exhibited a uniform response to drought-stress, if expressed genes could be found that were associated with apomictic reproduction, whether triploid Boechera apomicts displayed differences in gene expression when compared to their diploid congenerics, and finally, if the apomict B. lignifera displayed signs of a reversal from apomeiosis to meiosis (which was suggested based on

DIC microscopy results). Across all four species, I did not find a uniform response to drought-stress, and between sexual and apomictic species, I found a wide of differentially-expressed genes associated with apomictic versus sexual reproduction. Further, the triploid B. cf. gunnisoniana, when compared to the remainder of diploid species, displayed differential expression associated with ploidy, where part of the differentially-expressed genes were likely to be associated with gene dosage effects. Finally, B. lignifera also displayed differentially-expressed genes under a reversal-model

(from apomeiosis to meiosis). For each of the relevant biological scenarios, I discussed a subset of differentially-expressed genes and highlighted several candidate genes that had been suggested in prior analyses, genes whose exact function has yet to be determined, and genes with known function that seem to be associated with the respective biological scenario, yet have not been suggested in prior work. 97

Future work

Besides many other possible venues for future research connected to the work presented in

this dissertation, I will try to restrict myself to a subset thereof that I want to pursue in the not

too distant future. I would like to conduct a large-scale genomic study across the genus, where

within-species variation is taken into account by evenly sampling multiple populations across the

respective species-wide distribution ranges. By doing so, I want to determine whether reproductive

incompatibilities in Boechera accumulate at varying rates within the genus, and further assess the role of differential admixture rates therein (see chapter 2). Further, I would be interested in the evolutionary and population genetic consequences of the reproductive modes present in Boechera.

Since individuals in Boechera do not appear to be restricted to only one reproductive mode (see chapter 1), I would like to test whether certain environmental conditions, such as pollinator presence and habitat characteristics are associated with the expression of sexual reproduction or apomixis and whether natural selection is favoring certain reproductive modes under different scenarios.

Additionally, I would like to clarify whether such individuals are able to introgress into purely sexual lineages, or whether these individuals are bound to remain isolated, which might create strong lineage differentiation and even lead to speciation.

In the montane endemic B. lasiocarpa (see also chapter 3), I would like to further investigate

whether specific populations are affected by genetic drift or whether selection is acting in these

populations to remove deleterious mutations and drive local adaptation. Further, I want to assess the

role of self-incompatibility in driving population genetic structure in this species. Results based on

estimates of genetic diversity suggest potential variation in the mating system for different popula-

tions, which could point to a role of self-incompatibility (SI), specifically the sporophytic SI system

in Brassicaceae (see also chapter 4).

Members of the genus Boechera represent feasible models for the study of speciation in sym-

patry, allopatry, and parapatry, adaptation to specific soils (e.g. calciferous and serpentenoid), the

roles of reproductive modes and the mating system in the origin, maintenance and loss of genetic

diversity, as well as chromosomal rearrangements after polyploidization events and whole-genome 98 duplications. After all, we could learn a great deal about natural population dynamics and the nature of biodiversity, if we moved beyond the all too convenient realm of model systems. 99

APPENDICES 100

Appendix A

SUPPLEMENTARY FIGURES AND TABLES FOR CHAPTER 2

Figure A.1. Admixture proportions based on 14,815 common variants. Each bar represents Bayesian point estimates of admixture proportions for each respective individual, and thus the proportion of inheritance of each genome to the respective species. Results of 2 through 16 presumed source species are shown here, with k = 8 being the best model based on DIC. Given the higher number of putative source taxa, I chose a color palette that differs from the colors used in the figures which were included in the paper. 101 , B. puberula arida (sexual) (sexual) (sexual) (sexual) B. stricta B. stricta B. stricta B. stricta B. stricta B. stricta B. stricta B. stricta B. stricta B. stricta B. retrofracta x stricta B. retrofracta x subpinnatifida B. retrofracta x stricta B. retrofracta x stricta B. retrofracta B. retrofracta B. retrofracta retrofracta B. retrofracta B. retrofracta B. retrofracta retrofracta B. retrofracta B. retrofracta retrofracta B. retrofracta retrofracta B. retrofracta B. retrofracta x stricta B. retrofracta x subpinnatifida Continued on next page . The 23 samples not denoted with "MS" came from air-dried herbarium specimens B. puberula puberula locID id11 iID1 12 2 locality MS5563 3 MS5574 Grizzly 4 Peak, UT MS5585 Grizzly 5 Peak, UT CR11811 Grizzly 6 Peak, UT JB186 La6 Plata, 7 CO JB3777 -111.97 8 Weber, UT JB12558 -111.97 9 longitude Nye, 41.41 NV MS5549 -111.97 Custer, 10 ID latitude 41.41 CR10918 Grizzly 11 2 Peak, JB1258 UT ploidy 41.41 Teton,8 WY -108.02 12 2 MS465 nominal Madison, taxon MO10 13 2 MS328 Steep 37.4411 Canyon, UT -111.59 14 MS469 Deadfall 1512 Lake, -111.97 CA MS471 2 Steep 41.41 -117.35 16 Canyon, -114.65 MS169 UT 41.41 Steep 17 Canyon, MS280 UT -111.6 Little -111.96 -110.52 2 38.95 Volcano, 43.86 CA JB867 -122.52 2 Hat Creek, CA 45.56 43.85 2 -111.6 41.97 2 Park, 41.33 WY -111.6 -120.89 2 2 2 41.97 2 39.86 41.97 -121.41 2 2 2 40.7 -110.57 2 44.41 2 1010 1813 1911 MS163 2014 MS165 Little Volcano, CA 2115 JB176 Little Volcano, CA 229 MS282 Mineral, 23 MO8 JB659 Hat Creek, -120.89 CA9 CR1043 24 -120.89 Deschutes, OR Humboldt, 25 CA 39.86 MS313 26 39.86 MS473 Deadfall Lake, 2 CA MS327 -115.7 Steep Canyon, 2 -121.41 UT Deadfall Lake, -121.56 CA -123.65 40.7 47.45 -122.51 43.67 40.48 -111.6 2 2 41.33 -122.52 2 2 41.97 2 41.33 2 2 Table A.1. Locality information,taxa locality which id were both (locID) determined samplepuberula from numbers = microsatellite (id) data. for Here, Fig. I 2.3 use and only the Fig. species 2.2, epithet, internal omitting id the genus (iID) name. as arida well = as ploidy and nominal 102 B. exilis x retrofracta B. exilis x retrofracta B. exilis x retrofracta B. exilis x retrofracta B. exilis x retrofracta B. exilis x retrofracta B. exilis x retrofracta B. exilis x retrofracta B. exilis B. exilis B. exilis B. exilis B. puberula puberula B. puberula B. puberula B. puberula puberula B. puberula B. puberula puberula B. puberula B. puberula B. retrofracta B. puberula B. puberula B. puberula arida B. puberula arida B. puberula arida B. serpenticola B. serpenticola B. serpenticola Continued from previous page Continued on next page Table A.1 – locID id1617 iID 2717 2817 MS413 locality 2916 MS9 Bear Lake Summit, 3017 UT MS7 3116 MS18 Wells, NV -111.47 3216 MS425 Wells, NV Wells, 3318 NV MS11 41.93 Bear Lake Summit, 3419 UT MS426 Wells, 35 2 20 NV MS422 Bear longitude -111.47 Lake Summit, 3621 UT FW241 Bear latitude Lake Summit, 3722 UT -114.57 CR1164 41.93 Elko, -111.47 NV ploidy 3823 -114.57 FW73 Summit, UT -111.47 nominal -114.57 41.08 taxon 39 2 23 FW76 41.93 41.08 Nye, 40 NV24 JB1275 41.93 41.08 2 -114.57 Millard, 41 2 25 UT MS79 Baker, 2 OR 42 2 26 2 MS80 41.08 Water 43 Canyon,25 JB382 NV -111.4078 -115.08 Water 44 Canyon,25 2 MS93 NV 40.7753 Box 45 Elder,11 UT 2 JB381 40.68 Lye 46 Creek,23 -116.71 MS102 -117.54 -112.27 NV Humboldt, 47 2 NV23 -116.71 MS94 -117.11 Lye Creek, 40.64 NV 38.97 38.95 4827 MS283 40.64 44.7 Lye 49 Creek,28 MS69 NV 2 -113.94 Hat 2 2 Creek, CA 5029 MS72 -117.54 2 Water 51 2 Canyon,30 JB1611 41.77 NV -117.55 Water 52 -117.54 Canyon,30 41.69 JB1610 NV Box Elder, UT 2 41.67 5330 FW237 -117.54 Mono, CA 41.69 2 54 -116.71 -121.41 MS286 Lander, 2 NV 55 41.69 -116.71 MS305 2 Bully Choop 40.64 40.7 Mtn, CA MS302 Bully Choop 2 40.64 Mtn, CA -113.69 2 Bully Choop Mtn, 2 -122.94 CA 2 41.53 -122.94 -119.13 40.65 -117.37 -122.94 2 40.65 38.36 2 39.24 40.65 2 2 2 2 103 (2:1) (holotype) B. serpenticola B. serpenticola B. serpenticola B. subpinnatifida B. subpinnatifida B. subpinnatifida B. subpinnatifida B. lasiocarpa B. lasiocarpa B. lasiocarpa B. lasiocarpa B. lasiocarpa B. lasiocarpa B. lasiocarpa B. lasiocarpa B. lasiocarpa B. lasiocarpa B. lasiocarpa B. lasiocarpa B. retrofracta B. lasiocarpa B. lasiocarpa B. lasiocarpa B. lasiocarpa B. exilis x puberula xB. retrofracta exilis x puberula xB. retrofracta exilis x puberula xB. retrofracta p. arida x puberulaB. x retrofracta subpinnatifida x sparsiflora Continued from previous page Continued on next page Table A.1 – locID id3030 iID 5630 5731 MS294 locality 5831 MS298 Bully Choop Mtn, 5931 CA MS297 Bully Choop Mtn, 6031 CA MS367 Bully Choop Mtn, -122.94 6132 CA MS392 Rogue River, OR -122.94 62 MS402 Rogue 40.65 River, OR -122.94 63 MS403 Rogue 40.65 River, OR FW516 2 Rogue longitude 40.65 River, OR 2 -123.53 Rich, UT latitude 2 -123.53 ploidy 42.55 -123.53 nominal taxon 42.55 -123.53 2 42.55 2 42.55 2 -111.46 2 41.92 2 3333 6434 6534 CR1308 6634 CR1309 Cache, UT 6734 MS458 Cache, UT 6834 MS444 Logan Canyon Sinks, 69 UT34 MS446 Logan Canyon Sinks, -111.48 70 UT35 MS455 Logan Canyon Sinks, -111.48 71 UT35 MS453 Logan 41.93 Canyon Sinks, -111.48 -111.71 72 UT35 MS447 Logan 41.93 Canyon Sinks, -111.48 -111.66 73 UT11 MS485 2 Logan 41.93 41.8 Canyon Sinks, -111.48 74 UT36 MS496 2 Steam 41.93 41.91 Mill Peak, -111.48 7537 UT MS488 2 Steam 41.93 Mill 2 Peak, 7638 UT MS275 2 2 Steam 41.93 Mill Peak, 7739 UT JB419 2 -111.61 Hat Creek, CA 7840 MS513 2 -111.61 Box 79 Elder, 41.9540 UT JB1587 -111.61 James Peak, UT NA 41.9540 FW1630 Salt Lake, 2 UT MS39 NA 41.9541 Tooele, UT 2 MS40 NA41 -121.41 Angel 2 Lake, MS41 NV NA -111.98 Angel Lake, -111.78 MS215 40.7 NV NA Angel Lake, MS221 NV Frenchman 41.39 -111.72 Lake, 41.38 CA 2 Frenchman Lake, CA -112.62 -115.07 2 40.63 2 -120.19 -115.07 40.48 41.02 -120.18 -115.07 2 39.88 41.02 2 3 39.87 41.02 3 3 3 3 104 (2:1) (2:1) (2:1) (2:1) (2:1) (2:1) (2:1) (2:1) B. retrofracta x sparsiflora B. exilis x retrofracta x sparsiflora B. exilis x retrofracta x sparsiflora B. exilis x retrofracta x sparsiflora B. exilis x retrofracta x sparsiflora B. retrofracta x lasiocarpa B. retrofracta x lasiocarpa B. retrofracta x lasiocarpa B. retrofracta x lasiocarpa B. p. arida x subpinnatifida B. p. arida x subpinnatifida B. p. arida x subpinnatifida B. exilis x puberula xB. retrofracta exilis x puberula xB. retrofracta exilis x puberula xB. retrofracta lasiocarpa x retrofracta x stricta B. lasiocarpa x retrofracta x stricta B. lasiocarpa x lemmonii xB. stricta lasiocarpa x lemmonii xB. stricta lasiocarpa x lemmonii xB. stricta lasiocarpa x lemmonii xB. stricta lasiocarpa x lemmonii xB. stricta lasiocarpa x lemmonii x stricta Continued from previous page Table A.1 – locID id4142 iID NA42 MS228 NA42 MS125 locality NA42 Frenchman Lake, CA MS128 NA43 Indian Creek, NV MS134 NA43 Indian Creek, NV -120.18 MS135 NA43 Indian Creek, NV MS501 NA43 Indian 39.87 Creek, NV MS511 -117.55 NA44 James Peak, UT MS512 -117.55 NA 3 James Peak, UT 41.65 longitude MS525 -117.55 NA James Peak, UT 41.65 latitude MS145 -117.55 James 3 Peak, UT 41.65 ploidy Peavine 3 Peak, -111.78 NV 41.65 nominal taxon 3 -111.78 41.38 3 -111.78 41.38 -111.78 -119.93 3 41.38 3 41.38 39.59 3 3 3 4444 NA45 MS150 NA45 MS162 NA45 Peavine Peak, NV MS47 NA46 Peavine Peak, NV MS49 NA46 Shoshone Mtns, MS54 NV NA47 Shoshone -119.93 Mtns, MS482 NV NA47 Shoshone -119.93 Mtns, MS499 NV NA47 Steam Mill 39.59 -116.86 Peak, UT MS536 NA47 Steam Mill 39.59 -116.86 Peak, UT MS538 NA47 3 40.42 Willard Peak, -116.86 UT MS539 -111.61 NA47 3 40.42 Willard Peak, UT 3 MS543 -111.61 NA 40.42 Willard Peak, UT 41.95 3 MS550 NA Willard Peak, UT 41.95 3 MS552 -111.97 Willard 3 Peak, UT -111.97 Willard 3 Peak, UT 41.39 -111.97 41.39 -111.97 3 41.39 -111.97 3 41.39 -111.97 3 41.39 3 41.39 3 3 105

Appendix B

SUPPLEMENTARY MATERIALS FOR CHAPTER 3

Supplementary methods

Variant calling and ploidy determination In order to determine individual ploidy levels for the collected samples (273 individuals), I called variants using GATK version 3.5 (McKenna et al., 2010) with ploidy set to diploid for all individuals, heterozygosity for prior likelihood calculation per locus of 0.001, and I ignored sequences with mapping quality <20. The minimum phred-scaled confidence threshold for variants to be called was set to 50 and the resulting variants were then filtered to contain only variants with at least 128 sequences, at least 4 sequences with the alternative allele, and I only kept the genetic variants at nucleotide sites where I had data for at least 80% of the sampled individuals and only one alternative allele. Additionally, only variants with a minimum phred-scaled mapping quality of 30 and a minor allele frequency ≥ 0.05 were retained. After calling variants as diploid, I used the R package gbs2ploidy (Gompert & Mock (in prep.) to determine ploidy levels for each individual based on allelic proportions at each locus. I used custom python scripts to extract the read counts of reference and alternative alleles at all heterozygous loci for each individual. In gbs2ploidy, I first estimated the allele depth proportions from the number of reads for each heterozygous locus and individual using a Bayesian model in JAGS, where I set a binomial likelihood on y, with a categorical prior on the allele depth and a Dirichlet prior on alpha. For estimation, I sampled for 4000 iterations with a burn-in of 500 and thinning interval of two. Proportions for allele depth were chosen to be 0.25, 0.33, 0.5, 0.66 and 0.75, as I expected predominantly diploid and triploid individuals. Further, I ran a Principal Component Analysis (PCA) on the resulting allele depth proportions, the proportion of variants that are heterozygous as well as the mean coverage for heterozygous variants for each individual. I then performed k-means clustering on the first three principal components of the prior PCA with three classes. I chose three classes to discriminate diploid apomicts from diploid sexual individuals. The resulting cluster assignments were finally used as grouping factors in a linear discriminant analysis (LDA), with the first three principal components as discriminators, and uniform priors on the three classes. Individuals were determined as diploid if they had total combined posterior ploidy proportions of ≥ 0.9 in the two diploid classes. I used prior microsatellite data (Li et al., 2016) to compare ploidy levels obtained from gbs2ploidy. For the resulting diploid samples (n = 188), I called variants using GATK with the same settings and filtering steps as mentioned above. Additionally, I excluded all variants with minor allele frequency (MAF) ≤ 0.05.

Determination of nominal taxa In order to obtain individuals with predominantly B. lasiocarpa contributing to their genomes, I used entropy (Gompert et al., 2014) on all diploid individuals. entropy is very similar to the correlated-allele frequency admixture model in the structure software (Falush et al., 2003). Unlike in structure, however, I am incorporating sequence coverage, sequencing error and alignment error by directly using the genotype likelihoods calculated with GATK. For Bayesian estimation of admixture proportions, I used 16000 total iterations, with a burn-in of 4000 and a thinning interval of six for runs with putative source populations of five through 17, with six chains for each set 106 of runs. To reduce computational time for each respective entropy run, I calculated initial mean genotypes from the genotype likelihoods by using the expectation-maximization algorithm described in Li (2011) across all individuals. A mean genotype is the mean of the posterior distribution and as such is a continuous point estimate of the number of alleles at a respective locus, ranging from zero to two (with 0: homozygous for reference allele; 1: heterozygous, and 2: homozygous for the alternative allele). Mean genotypes were used to calculate starting values of admixture proportions with the discriminant analysis of principal components (dapc) (Jombart et al., 2010) function in the R package adegenet (Jombart, 2008; Jombart and Ahmed, 2011) for each respective number of clusters. Convergence and mixing of chains was assessed by calculating the Gelman-Rubin potential scale reduction factor (Gelman and Rubin, 1992) for each run and across chains in coda (Plummer et al., 2006). Corresponding barplots of admixture proportions for all k clusters were produced in R (R Development Core Team, 2015). Finally, for the set of runs with the highest support based on DIC, I obtained all individuals (n = 114) with admixture proportions ≥ 0.9 from the cluster determined to represent B. lasiocarpa.

Supplementary Results

Variant calling and ploidy determination For total number of reads, numbers of individuals and mean number of reads per individual for each of the subsets of individuals used in this study, see Table B.2. To determine individual ploidy, I obtained 60,734 high-quality variants across 273 individuals. For these individuals and across all variants, I estimated individual ploidy proportions. Using a cutoff for ploidy proportions of ≥ 0.9, I determined 188 of the 273 individuals to be diploid.

Determination of nominal taxa For the remaining 188 diploid individuals, I obtained 26,371 variants, with mean coverage per SNV per individual of 11.89 (sd = 12.65) (see Table B.2). Across entropy runs for numbers of putative source taxa ranging from k = 5 to 17, k = 15 received the highest DIC support, with k = 12 having the second-lowest DIC score (with 59.768105 and 59.771105 for k = 14 and k = 12, respectively). According to the admixture proportions with k = 15, I determined 114 individuals to belong to B. lasiocarpa, with admixture proportions ≥ 0.9. Of these 114 individuals, I further used 105 individuals in the following analyses. 107

Supplementary Figures and Tables

Figure B.1. Boechera lasiocarpa in the Wasatch Mountains of Utah, specifically the locality of Logan Canyon Sinks (lcs). 108 individual included B. lasiocarpa Continued on next page Admix-id1 id234 MS655-145 MS657-146 Locality MS662-147 Ben Lomond Trail MS653-148 Ben Lomond Trail MS661-149 Ben Lomond Trail MS652-1410 Ben Lomond Trail MS663-1411 Ben Lomond Trail MS667-1412 Ben blo Lomond Trail MS666-1413 Ben blo MS654-14 Lomond Trail14 Ben blo Locality MS658-14 Lomond (abbrev.) Trail15 Ben blo MS668-14 Lomond State Trail Ben16 Lomond Trail blo MS614-14 County Ben17 Lomond Trail blo MS453-13 Ben18 Lomond Utah Trail blo MS612-14 Latitude Logan19 Canyon Utah Sinks Weber blo MS736-14 Logan20 Canyon Utah Sinks Longitude Weber blo MS613-14 Logan blo21 Canyon Utah Sinks Weber 41.36168 MS732-14 Logan blo22 Canyon Utah Sinks Weber 41.36145 MS444-13 -111.929732 Logan blo23 Canyon lcs Utah Sinks Weber 41.361496 MS601-14 -111.929775 Logan24 Canyon lcs Utah Sinks Weber 41.361634 MS725-14 -111.929742 Logan25 Canyon lcs Utah Sinks Weber 41.361675 MS446-13 -111.929688 Logan26 Canyon lcs Utah Sinks Weber 41.361589 MS447-13 -111.930101 Utah Logan27 Canyon lcs Sinks Weber 41.361544 MS455-13 -111.929755 Utah Logan Weber28 Canyon lcs Sinks 41.361441 MS606-14 -111.929799 Utah Logan Weber29 Canyon lcs Utah Sinks 41.361514 MS617-14 -111.929654 Logan Weber 41.36162730 Canyon lcs Utah Sinks Cache MS730-14 -111.929679 Logan 41.36172631 -111.929669 Canyon lcs Utah Sinks Cache MS734-14 Logan 41.36145732 -111.929864 Canyon lcs Utah Sinks Cache MS568-14_rep 41.936547 Logan33 -111.929632 Canyon lcs Utah Sinks Cache MS586-14 41.934917 Logan -111.487489 Canyon Sinks Logan34 Canyon lcs Utah Sinks Cache MS735-14 41.936528 -111.483288 35 lcs Utah Cache MS595-14 41.935485 -111.487543 Logan Canyon lcs Utah Sinks Cache MS729-14 41.936406 -111.484836 Logan Canyon lcs Utah Sinks Cache MS615-14 41.935597 -111.487536 lcs Logan Canyon lcs Utah Sinks Cache MS618-14 41.934917 -111.485051 Logan Canyon Utah Sinks Cache 41.935374 -111.483288 Logan Canyon lcs Utah Sinks Cache 41.935871 -111.484871 Logan Canyon lcs Utah Sinks Cache 41.934917 -111.48542 lcs Utah Cache 41.934917 -111.483288 lcs Utah Cache 41.934917 -111.483288 Utah lcs Utah Cache 41.935983 -111.483288 Cache lcs Cache 41.93585 -111.485222 Utah 41.935692 -111.485239 41.937361 Utah Cache 41.935555 -111.485156 -111.486468 Utah Cache -111.484967 Utah Cache 41.934441 Utah Cache 41.935499 -111.485821 Utah Cache 41.934082 -111.484913 Cache 41.935789 -111.48364 41.936539 -111.485253 41.935647 -111.487159 -111.485489 Table B.1. Locality information, sample ids, ids for Figures 3.2 and 3.3 as well as geographical information for each 109 Continued from previous page Continued on next page Table B.1 – Admix-id36 id373839 MS722-1440 MS721-1441 MS610-14 Locality Logan42 Canyon Sinks MS566-14 Logan43 Canyon Sinks MS570-14 Logan44 Canyon Sinks MS580-14 Logan45 Canyon Sinks MS582-14 Logan46 Canyon lcs Sinks MS584-14 Logan47 Canyon lcs Sinks MS576-14 Logan48 Canyon lcs Sinks MS575-14 Logan49 Canyon lcs Sinks Locality MS591-14 (abbrev.) Logan50 Canyon lcs Sinks MS563-14 State Logan51 Canyon lcs Sinks MS597-14 County Logan52 Canyon lcs Utah Sinks MS583-14 Logan53 Canyon lcs Utah Sinks Cache MS562-14 Latitude Logan54 Canyon lcs Utah Sinks Cache MS589-14 Logan55 Canyon lcs Utah Sinks Longitude Cache MS565-14 41.935191 Logan56 Canyon lcs Utah Sinks Cache MS733-14 41.934624 -111.484713 Logan57 Canyon lcs Utah Sinks Cache MS611-14 41.937179 -111.484254 Logan58 Canyon lcs Utah Sinks Cache MS723-14 41.937139 -111.486987 Logan59 Canyon lcs Utah Sinks Cache MS458-13 41.937341 -111.486221 Logan60 Canyon lcs Utah Sinks Cache MS585-14 41.936253 -111.486538 Logan61 Canyon lcs Utah Sinks Cache MS603-14 41.935558 -111.487724 Logan62 Canyon lcs Utah Sinks Cache MS605-14 41.93529 -111.4868 Logan63 Canyon lcs Utah Sinks Cache MS727-14 41.937602 Logan -111.486437 64 Canyon lcs Utah Sinks Cache MS587-14 41.937575 -111.487555 Logan65 Canyon lcs Utah Sinks Cache MS593-14 41.932167 -111.487403 Logan66 Canyon lcs Utah Sinks Cache MS726-14 41.936496 -111.483157 Logan67 Canyon lcs Utah Sinks Cache MS609-14 41.934761 -111.486165 Logan68 Canyon lcs Utah Sinks Cache MS731-14 41.935529 -111.484168 Logan69 Canyon lcs Utah Sinks Cache MS590-14 41.936239 -111.486739 Logan70 Canyon lcs Utah Sinks Cache MS706-14 41.933207 -111.485434 Logan71 Canyon lcs Utah Sinks Cache MS719-14 41.937001 -111.484943 Logan Canyon lcs Utah Sinks Cache MS708-14 41.935575 -111.48612 Powder Mountain lcs Utah Cache MS699-14 41.937372 -111.485004 Powder Mountain lcs Utah Cache MS707-14 41.935409 -111.487444 Powder Mountain lcs Utah Cache 41.934917 -111.485137 Powder Mountain lcs Utah Cache 41.934881 -111.483288 Powder Mountain Utah Cache 41.935515 -111.486203 pow Utah Cache 41.935667 -111.484932 pow Utah Cache 41.935827 -111.485224 pow Utah Cache 41.934062 -111.485329 pow Utah Cache 41.933352 -111.486018 pow Utah Cache 41.935883 -111.483671 Cache 41.937085 Utah -111.485323 41.935656 Utah -111.486817 Cache 41.932083 Utah -111.485016 Cache Utah -111.484463 Cache 41.382328 Utah Cache 41.382592 -111.782256 Cache 41.382342 -111.782361 41.382391 -111.782267 41.38233 -111.782192 -111.782254 110 Continued from previous page Table B.1 – Admix-id72 id737475 MS701-1476 MS713-1477 MS702-14 Locality Powder78 Mountain MS714-14 Powder79 Mountain MS709-14 Powder80 Mountain MS711-14 Powder81 Mountain MS718-14 Powder82 Mountain MS720-14 Powder pow83 Mountain MS717-14 Powder pow84 Mountain MS715-14 Powder pow85 Mountain Locality MS704-14 (abbrev.) Powder pow86 Mountain MS712-14 State Powder pow87 Mountain MS705-14 County Powder pow88 Mountain MS689-14 Utah Powder pow89 Mountain MS691-14 Latitude Utah Cache Powder pow90 Mountain MS688-14 Utah Cache Provo pow91 Peak Longitude MS649-14 Utah Cache 41.38235 Provo pow92 Peak MS648-14 Utah Cache 41.382259 Provo pow93 Peak -111.782204 MS644-14 Utah Cache 41.382383 -111.782218 Red pow94 Butte Emigration MS646-14 Utah Divide Cache 41.382259 -111.782186 Red pow95 Butte Emigration MS636-14 Utah Divide Cache red 41.382287 -111.782189 Red96 Butte Emigration MS638-14 Utah Divide Cache red 41.382249 -111.782247 Red97 Butte Emigration MS641-14 Utah Divide Cache red 41.382607 -111.782245 Red98 pro Butte Emigration MS639-14 Utah Divide Cache red 41.382591 -111.782391 Red99 pro Butte Emigration MS640-14 Utah Divide Cache red 41.382546 -111.78236 Red100 pro Butte Emigration MS637-14 Utah Divide Cache red 41.382471 -111.782347 Red101 Butte Emigration MS620-14 Utah Divide Cache red 41.382385 -111.782375 Red102 Butte Emigration MS632-14 Utah Divide Salt red 41.382255 -111.782207 Lake Red103 MS633-14 Butte Emigration Utah Divide Salt red 41.382347 -111.782253 Lake 40.798366 Utah Skyline104 MS629-14 North Trail Utah Salt red -111.782213 Lake 40.798366 Utah Skyline105 MS630-14 -111.752721 Utah North Trail Utah Skyline Salt North Lake 40.798338 Utah Trail MS631-14 -111.752721 Utah Utah Skyline Salt North Lake 40.798366 Trail MS622-14 -111.752758 Utah Utah Skyline Salt 40.242845 North Lake 40.798221 Trail MS623-14 -111.752721 Utah Skyline Salt 40.242873 skn North -111.5771 Lake 40.798262 Trail -111.753204 Utah Skyline Salt 40.242851 skn North -111.577123 Lake 40.798299 Trail -111.752998 skn Utah Skyline Salt North -111.577133 Lake 40.798279 Trail -111.752841 skn Salt Lake 40.798297 -111.752985 skn 40.798251 -111.752904 skn -111.752994 skn Utah skn Utah Weber Utah Weber Utah Weber 41.330109 Utah Weber 41.332335 -111.914203 Utah Weber 41.332335 -111.919153 Utah Weber 41.331967 -111.919153 Utah Weber 41.332335 -111.918373 Weber 41.332335 -111.919153 41.330109 -111.919153 41.330109 -111.914203 -111.914203 111 B. lasiocarpa individuals for subsequent runs, and B. lasiocarpa B. lasiocarpa cmn rare g2p plus 27360,734 26,371 18812.23 13,09911.82 11.89 2,051 105 12.65 12.59 18.26 105 12.73 21.64 N individuals N loci mean coverage sd coverage Table B.2. Number of individuals,runs, variants, with and sequencing g2p coverage = for ploidy respectivesets, determination variant with calling run, common runs. (cmn) plus and = Column rare extended names variants, reference represent respectively. set the to variant calling isolate 112 2 2 2 2 2 2 2 2 2 2 2 2 2 2 (sexual) 2 (sexual) 2 individual included (sexual) 2 (apomict) 2 (apomict) 2 (sexual) 2 Boechera pendulocarpa pendulocarpa pendulocarpa ursalaca pendulocarpa ursalaca pendulocarpa ursalaca pendulocarpa ursalaca microphylla imnahensis microphylla imnahaensis microphylla microphylla microphylla yellowstonensis microphylla yellowstonensis microphylla microphylla lemmonii lemmonii lemmonii lemmonii lemmonii exilis exilis exilis exilis Continued on next page 6789 FW60710 MS568-14_rep11 JB517 Logan12 Canyon Sinks FW89613 FW1631 –14 MS537-1315 MS535-13 –16 Utah FW1159 – – Willard17 Peak MS513-13 Willard18 Peak MS544-1319 FW502 Cache – James20 Peak JB515 Willard21 Peak MS532-13 41.93736122 MS534-1323 – -111.486468 MS540-13 Utah Wyoming Willard24 – Peak – MS541-13 Utah Willard25 Peak Sublette JB1007 Willard26 Washington Peak JB1590 Idaho Utah Box Willard27 Utah Asotin Elder Peak 42.6107 JB319 Utah Box28 Elder 41.393647 FW587 –29 -109.2508 41.393647 46.0711 -111.971158 JB422 Idaho Wyoming –30 Davis -111.971158 MS715-14 Cache – Utah Box31 Elder -116.9549 – CR1164 – Utah Park32 – 41.393647 45.2772 FW241 Utah 41.0013 41.382322 Powder33 Mountain Wyoming -111.971158 – FW73 Utah Box -111.78227334 Elder -116.6716 FW76 – 44.4556 Utah -111.8749 – Box35 Park – Elder 41.393647 MS694-14 Box – Elder 41.393647 -111.971158 MS693-14 -109.5282 – Box Elder 41.393647 -111.971158 MS692-14 – – 44.9438 Provo Utah Cache Peak 41.393647 -111.971158 MS749-14 – – Provo Peak Wyoming -111.971158 – -110.6393 Provo Peak Utah – 41.7715 Teton Big Cottonwood Canyon Cache Utah Wyoming -111.7453 – 43.7091 Duchesne Utah 41.382471 Utah Lincoln – -111.782375 40.8194 Utah Summit Utah -110.9045 42.5822 – Utah Nevada -110.2853 40.8623 Beaver Utah – Salt Lake Nevada 2 -110.6248 Summit Utah Utah Elko -110.4755 40.597453 38.3684 Utah -111.588532 – Nye 40.7753 Utah -112.3879 – 40.242907 40.6779 Millard – -111.4078 40.242908 -111.577205 – 38.9728 40.242953 -111.577232 – -115.0767 38.9461 -111.577113 – -117.54 – -112.2744 – – – – – Admix-id1 id234 JB1585 MS670-14 Locality CR1312 CR1346_rep Centerville Canyon – FW604 – – – Utah State Morgan County 40.936973 Wyoming -111.798713 Latitude Idaho – Wyoming Fremont Longitude Lincoln 42.457 Wyoming Bear Lake Msat-taxon 42.188 42.2314 Sublette -108.8719 43.2495 -111.5607 -110.812 -109.9763 ploidy – Table B.3. Locality information, sample ids, ids for Figures 3.2 and 3.3 as well as geographical information for each 113 2 2 2 2 2 2 2 2 2 2 2 (sexual)(sexual)(sexual)(apomict) 2 2 2 2 stricta stricta stricta stricta stricta stricta retrofracta retrofracta retrofracta retrofracta retrofracta retrofracta retrofracta retrofracta puberula arida puberula puberula puberula arida puberula arida puberula puberula Continued from previous page Continued on next page Table B.3 – 41424344 JB125545 JB18646 MS668-1447 MS695-14 –48 MS745-14 Ben49 Lomond – Trail MS744-14 Provo50 Peak MS681-14 Big51 Cottonwood Canyon MS728-14 Big52 Cottonwood Canyon MS724-14 Centerville53 Canyon MS570-14 Utah Logan Utah54 Canyon Sinks MS488-13 Utah Logan55 Canyon Sinks MS563-14 Logan56 Canyon Sinks MS649-14 Steam57 Utah Mill Salt Peak Weber MS591-14 Utah Lake Logan58 Canyon Utah Idaho Salt Sinks MS655-14 Lake 40.597502 Red59 Butte Utah Emigration CR1043 Divide 41.361457 Utah 40.597518 -111.588665 Logan60 Canyon Utah Sinks JB659 Utah Utah -111.929632 Morgan -111.588708 Ben61 – Lomond Custer Trail JB867 Cache –62 – Utah JB94 Cache Utah 40.93773 –63 Weber 40.24289 FW237 Cache 43.863164 41.935801 Salt Utah -111.800138 Lake – JB1275 -111.57717465 41.93545 -111.485277 – 41.4122 – JB1610 Cache -114.6531 40.798366 – Utah66 Cache 41.937341 – JB1611 -111.485222 -111.752721 –67 – -111.486538 -111.5924 JB381 Cache – – –68 41.936496 – 41.947116 MS719-14 –69 -111.486165 Weber MS669-14 -111.607605 –70 41.932167 – MS674-14 – Powder71 -111.483157 Mountain – MS699-14 41.36168 Centerville Canyon – MS673-14_rep Centerville California -111.929732 Canyon MS683-14 Centerville – Canyon Powder Mountain – MS572-14 – Oregon Humboldt – Colorado 40.476 Centerville Utah Canyon Utah Logan Wyoming Canyon Deschutes Nevada Sinks Utah Moffat – Utah -123.6535 Oregon 43.6699 – Park – Utah California Cache Lander 44.414 Morgan -121.5645 Utah – – Utah Baker Morgan – Mono Utah Morgan Nevada 41.3847 40.93692 41.382592 39.2384 -110.5702 Cache 40.93736 -111.782361 – -111.798694 2 44.6965 40.937321 38.3574 Box -111.8656 Elder -117.3704 Morgan – Humboldt -111.799381 – -111.799383 Cache 41.382391 – 41.5311 -117.1118 41.669 -119.13 – – 40.937759 -111.782192 -111.800619 -113.6868 – – 41.937548 -117.5486 – -111.486766 – – – – – – – – Admix-id36 id373839 JB125840 CR1181 JB377 Locality MS676-14 – CR1091 – Centerville Canyon – – State Utah County Morgan Montana Colorado 40.937366 Latitude Madison Nevada -111.799451 La Plata Longitude 45.5603 – Wyoming 37.4383 Nye Msat-taxon Teton -111.9594 -108.0211 38.9452 43.8511 -117.3519 -110.5175 ploidy – 114 2 2 (2:1) 3 (holotype) 2 lasiocarpa x retrofracta lasiocarpa lasiocarpa lasiocarpa Continued from previous page Continued on next page Table B.3 – 99100101102 MS713-14103 MS629-14104 MS653-14 Powder105 MS732-14 Mountain Skyline North Trail106 MS587-14 Ben Lomond Trail107 JB419 Logan Canyon Sinks MS453-13 Logan Canyon Sinks MS574-14 Utah MS644-14 Utah Logan Canyon Sinks – Logan Utah Canyon Utah Sinks Red Butte Utah Emigration Divide Cache Weber Utah Utah Weber Cache 41.382259 Utah 41.331967 Cache -111.782218 -111.918373 41.361634 41.935597 Salt Lake – – Cache -111.929688 41.934062 -111.485051 Cache 40.798338 – -111.486018 – -111.752758 41.934917 – Utah – 41.937637 -111.483288 -111.487049 – – Box Elder 41.39 -111.9803 – – – – – – – 2 – 2 Admix-id72 id737475 MS691-1476 MS580-1477 JB1589 Locality Provo78 Peak MS632-14 Logan79 Canyon Sinks FW51680 MS623-14 – Skyline81 North Trail MS705-1482 MS656-14 Skyline83 – North Utah Trail MS630-14 Powder84 Mountain MS458-13 Ben85 Utah Lomond Trail State MS654-14 Utah Skyline86 North Trail MS72-13 Cache Logan87 Canyon Sinks MS652-14 Utah Ben88 Lomond Trail MS662-14 Utah County Utah89 41.936253 Weber Water MS455-13 Canyon Ben Utah90 -111.487724 Lomond Trail MS444-13 Utah Latitude Ben91 Weber Lomond – Utah Utah 41.332335 40.242873 Trail MS485-13 Logan92 Canyon Sinks Cache -111.919153 MS666-14 -111.577123 Longitude Logan Utah93 Canyon Utah Sinks 41.330109 Weber MS620-14 – – Steam94 Weber Mill Peak -111.914203 JB1587 Msat-taxon Cache Rich 41.382347 Ben Utah95 Lomond Trail MS663-14 – Nevada 41.361472 -111.782213 Skyline Utah96 North Utah 41.332335 Trail Weber MS447-13 Rich -111.929763 –97 41.934917 Utah -111.919153 MS446-13 41.5061 – – Ben98 Lander -111.483288 Lomond Weber Trail MS618-14 – 41.361627 Logan Canyon – Sinks Weber MS733-14 -111.2402 Cache 41.9228 Utah -111.929669 Logan Utah 40.635481 Canyon Sinks MS736-14 Cache 41.361589 Utah – Logan Canyon -116.706023 Sinks CR1309 -111.4649 41.361496 -111.929755 Logan 41.934917 Canyon Sinks – -111.929742 – Logan Utah Cache 41.934917 -111.483288 Canyon Utah Sinks Weber – – ploidy Weber -111.483288 – Utah – – Utah 41.947116 41.361514 – Utah – 41.330109 Weber -111.607605 Cache -111.929679 Utah -111.914203 Cache – – – Utah – Cache 41.361544 41.934917 Cache – -111.929799 41.934917 -111.483288 – Cache – 41.935647 -111.483288 – – Salt Lake 2 41.935575 -111.485489 – 41.935485 -111.485004 – 40.6334 – -111.484836 – 2 Utah -111.7234 – – – 2 2 Cache 2 – 41.908 – -111.6557 – 2 2 – – – 115 2 2 lasiocarpa lasiocarpa Continued from previous page Continued on next page Table B.3 – Admix-id108 id109110111 MS549-13112 MS609-14113 CR1308 Locality Willard Peak114 MS725-14 Logan Canyon Sinks115 MS507-13116 MS614-14 – Logan Canyon Sinks117 MS593-14 James Peak118 MS641-14 Logan Canyon Utah Sinks119 MS686-14 Logan Canyon Sinks120 MS582-14 Utah Red Butte Utah Emigration121 MS702-14 Divide State Provo Peak122 MS564-14 Cache Utah Logan Canyon Utah Sinks123 MS651-14 Box Powder Mountain Elder Utah124 MS657-14 Cache Logan 41.937085 Utah County Canyon 41.393647 Sinks125 MS581-14 Ben Salt -111.486817 Lomond Lake -111.971158 Trail126 MS704-14 Cache Ben 41.935871 Utah Lomond – Latitude Utah Trail127 MS575-14 – Cache 40.798299 Logan -111.48542 Canyon Sinks128 MS576-14 Cache -111.752841 Powder Utah 41.936547 Longitude Utah Mountain Utah129 MS622-14 – – Logan 41.933352 -111.487489 Canyon Sinks130 Cache MS688-14 Cache 41.382322 Msat-taxon Logan Utah -111.483671 Canyon – Sinks131 MS711-14 -111.782273 Skyline Utah North – Utah Trail Cache132 MS734-14 Utah Cache 41.7969 Provo 41.935558 – Peak133 MS562-14 Powder Utah -111.4868 Mountain Utah Weber134 MS566-14 -111.714 41.382383 Logan 41.93679 Canyon Utah Sinks Weber 40.242821135 MS595-14 Cache -111.782186 Logan – Canyon Sinks136 MS633-14 -111.577128 -111.485991 41.361573 Utah – Logan Canyon Sinks Cache137 MS661-14 – Cache – 41.36145 -111.929716 Logan 41.936096 Canyon Sinks138 MS701-14 Cache – Skyline Utah -111.487552 -111.929775 North Utah Trail139 MS731-14 41.382385 – Ben 41.937575 Weber Utah – Lomond – Utah – ploidy Trail140 FW1630 -111.782207 Powder 41.937602 -111.487403 Mountain Utah141 MS565-14 – Logan -111.487555 Canyon – Utah – Sinks 41.330109 Cache142 MS569-14 – Cache –143 -111.914203 MS583-14 Utah Cache – Utah – Logan Canyon Sinks MS584-14 – Cache 41.382249 – Logan Utah 41.935555 Canyon Sinks MS585-14 Cache 2 -111.782245 Logan Utah 41.936239 -111.484967 Canyon Utah Sinks 40.242851 MS586-14 – Logan 41.937139 Weber -111.485434 Canyon – Sinks -111.577133 Logan 41.934082 -111.486221 Canyon – Utah Sinks Weber – – Logan -111.48364 Canyon – Utah Sinks 41.332335 Cache Cache – Utah – – -111.919153 – 41.361675 Utah – – Cache -111.930101 41.38235 41.935656 Utah Cache – – – -111.485016 Utah -111.782204 Cache Utah 41.937001 – Cache – – 41.937352 -111.48612 Cache – 41.935529 -111.486443 Cache – 41.93529 -111.486739 – – Tooele 41.934881 – -111.486437 41.934441 -111.486203 40.4763 – – – -111.485821 – – – – -112.6197 – – – – – – – – – – – – 116 Continued from previous page Continued on next page Table B.3 – Admix-id144 id145146147 MS589-14148 MS590-14149 MS592-14 Locality Logan Canyon Sinks150 MS597-14 Logan Canyon Sinks151 MS601-14 Logan Canyon Sinks152 MS603-14 Logan Canyon Sinks153 MS605-14 Logan Canyon Utah Sinks154 MS606-14 Logan Canyon Utah Sinks155 MS610-14 Logan Canyon Utah Sinks156 MS611-14 Logan Canyon Utah Sinks157 MS612-14 Cache State Logan Canyon Utah Sinks158 MS613-14 Cache Logan Canyon Utah Sinks159 MS615-14 Cache Logan 41.933207 Canyon Utah Sinks160 MS617-14 Cache Logan 41.932083 -111.484943 County Canyon Utah Sinks161 MS631-14 Cache Logan 41.932703 -111.484463 Canyon – Utah Sinks162 MS636-14 Cache Logan 41.934761 -111.483308 Canyon – Latitude Utah Sinks163 MS637-14 Cache Skyline 41.935374 -111.484168 North – Utah Trail164 MS638-14 Cache Red 41.935515 Longitude -111.484871 Butte – Utah Emigration165 MS639-14 Divide Cache Red 41.935667 -111.484932 Butte – Utah Emigration166 MS640-14 Divide Cache Utah Msat-taxon Red 41.935983 -111.485224 Butte – Utah Emigration167 MS646-14 Divide Cache Utah Red 41.937179 -111.485222 Butte – Emigration168 MS648-14 Divide Cache Utah Utah Red 41.937372 -111.486987 Butte – Emigration169 MS658-14 Divide Cache Utah Red 41.936528 Salt -111.487444 Butte – Lake Emigration170 MS667-14 Divide Cache Utah Red 41.936406 Salt -111.487543 Butte – Lake Emigration171 MS689-14 Divide 40.798221 Utah Ben 41.936539 Salt Weber -111.487536 Lomond – Lake Trail172 MS696-14 40.798251 Utah -111.753204 Ben 41.93585 Salt -111.487159 Lomond – Lake Trail173 MS700-14 40.798262 -111.752994 – Provo Salt Peak – Lake 41.332335174 MS706-14 -111.485239 40.798279 -111.752998 – – Provo Salt Peak Lake175 -111.919153 MS707-14 – 40.798297 -111.752985 – – Powder Salt Mountain Lake ploidy 176 MS708-14 – 40.798366 -111.752904 – – Powder Utah Mountain177 MS709-14 40.798366 -111.752721 – – Powder Utah Mountain178 MS712-14 -111.752721 – – Powder Mountain179 MS714-14 – – Powder Mountain Weber MS716-14 – Powder Utah Utah Mountain Weber MS717-14 – Powder Utah Utah Mountain MS718-14 41.361726 – Powder Utah Mountain 41.361441 -111.929864 – Powder Utah Mountain Cache Utah -111.929654 – – Powder Utah Mountain Cache Utah – – Utah Cache – 41.382391 – Utah Cache 40.242845 – 41.382328 -111.782192 Utah – Cache 40.242853 -111.5771 – 41.38233 -111.782256 – Utah – Cache -111.57717 – 41.382342 – Utah -111.782254 Cache – – 41.382287 -111.782267 – Cache – – 41.382255 -111.782247 – Cache – 41.382259 -111.782253 – Cache 41.382487 -111.782189 – 41.382546 -111.78235 – 41.382607 -111.782347 – -111.782391 – – – – – – – – – – – – – – – – 117 Continued from previous page Table B.3 – Admix-id180 id181182183 MS720-14184 MS721-14185 MS722-14 Locality Powder Mountain186 MS723-14 Logan Canyon Sinks187 MS726-14 Logan Canyon Sinks188 MS727-14 Logan Canyon Sinks MS729-14 Logan Canyon Sinks MS730-14 Logan Utah Canyon Utah Sinks MS735-14 Logan Canyon Utah Sinks Logan Canyon Utah Sinks State Logan Canyon Utah Sinks Cache Cache Utah Cache Utah Cache 41.382591 41.934624 County Utah Cache -111.78236 41.935191 -111.484254 Utah Cache 41.935409 -111.484713 – Latitude – Cache 41.935883 -111.485137 – Cache 41.935827 Longitude -111.485323 – Cache 41.935789 -111.485329 – Msat-taxon 41.935692 -111.485253 – 41.935499 -111.485156 – -111.484913 – – – – ploidy – – – – – – – 118

1 12 7 10 8 9 10 2 6 7 5 1 11 14

5 9 4 2 4 13

3 11 6

8 3 0.05 0.1

Figure B.2. Neighbor-joining tree of pairwise FST matrices across all six localities, for A) common variants and B) rare variants, with scale bar in the lower left corner of the respective subplot. 119

Appendix C

SUPPLEMENTARY FIGURES FOR CHAPTER 4

Figure C.1. Barplots of DIC microscopy results for all four species 120

Figure C.2. Venn diagram of expressed genes by species 121

CURRICULUM VITAE

MARTIN PETER SCHILLING

Address Department of Biology 5305 Old Main Hill Logan, UT 84322-5305 Email: [email protected]

Education Since Aug 2012 Graduate Student. PhD program in Ecology. Department of Biology, Utah State University, Logan, UT.

07/21/–07/25/14 Summer Institute in Statistical Genetics - Biostatistics, University of Washington - MCMC in Genetics & .

06/2013–08/2013 Software Carpentry, Instructor training.

10/2008–09/2011 B.Sc. in Forestry and Environmental Sciences, University of Freiburg. Faculty of Forest and Environmental Sciences. Major Forestry and Environmental Sciences Minor International Forestry. B.Sc. thesis "Effects of genotype and environment on the survival and growth of subtropical trees".

09/2006–06/2008 A-levels qualification in Freiburg, Germany.

09/2001–02/2005 Training as Central Heating Technician. Specialization: Renewable Energy systems, above all solar-powered and geother- mal installations.

Working Experience 01/2016–5/2016 TA, USU Dept. of Biology, Biol 5250 - Evolutionary Biology.

08/2015–12/2015 TA, USU Dept. of Biology, Biol 1610 - Biology I lab.

01/2015–05/2014 TA, USU Dept. of Biology, Biol 5250 - Evolutionary Biology.

08/05/–08/12/14 Co-lecturer at the MSU Next-Gen Sequence Summer Course 2014 - on Variant Calling for NGS data.

01/2015–05/2014 TA, USU Dept. of Biology, Biol 5250 - Evolutionary Biology.

08/2014–12/2014 TA, USU Dept. of Biology, Biol 3065 - Genetics Lab. 122

01/2014–05/2014 TA, USU Dept. of Biology, Biol 5250 - Evolutionary Biology.

01/2014–05/2014 TA, USU Dept. of Biology, Biol 2220 - General Ecology.

03/23/2013–03/24/2013 TA, Software Carpentry Bootcamp Utah State University

08/2013–12/2013 Teaching Assistant, USU Dept. of Biology, Biol 1610 - Biology I lab.

08/2012–04/2013 Research Assistant, USU Dept. of Biology, Genotyping-by-Sequencing for nat- ural populations of tremuloides.

03/2012–05/2012 Research Assistant, Swiss Federal Institute of Technology (ETH) Zürich. Bioin- formatics and microsatellite genotyping, Analysis of Dalbergia spec. of Mada- gascar.

10/2011–12/2011 Research assistant, Department of Forest Ecology. Forest Research Institute of Baden-Württemberg. Microsatellite genotyping (multiplex-PCR) of avium and Quercus spec. populations of Southern Germany.

09/2011–11/2011 Research assistant, Institute for Landscape Management, University of Freiburg. Literature research.

05/2011 Sino-German Summer School "Design and data analysis of biodiversity-ecosystem functioning experiments", (BEF-China) Gutianshan National Park, Zhejiang, China.

05/2011 Sino-German Conference on the results of the 1st stage of the Biodiversity and Ecosystem Functioning Experiment (BEF-China), Beijing, China.

04/2010–12/2010 Research assistant, Special Botany and Functional Diversity, University of Leipzig. Data management and synthesis of digitally acquired biometric data in the BEF- China study sites.

08/2010–09/2010 BEF-China (Biodiversity and Ecosystem Functioning), Internship, Integrated Data Management and Synthesis in Ecological Sciences.

09/2005–08/2006 Voluntary community service in the aid and development project Corazones para un nuevo mundo in Urubamba, Perú with WISE e.V.

Awards and Honors 2014–2015 USU Ecology Center Graduate Research Award

04/2015 James A. and Patty MacMahon Endowed Ecology Graduate Student Research Award

Publications Wolf P.G., Rowe, C.A., Der, J.P., Schilling, M.P., Visger, C.J., Thomson, J.A. (2015) Origins and diversity of a cosmopolitan fern genus on an island archipelago. AoB Plants 7: plv118 123

Schilling, M.P., Wolf, P.G., Duffy, A.M., Rai, H.S., Rowe, C.A., Richardson, B.A., Mock, K.E. (2014) Genotyping-by-sequencing for Populus population ge- nomics: an assessment of genome sampling patterns and filtering approaches. PLoS ONE.

Presentations at Conferences Schilling M.P., Gompert Z., Wolf P.G., Li F-W., Windham M.D. (2016) Gene flow and mating system variation in the face of facultative asexuality. Presented at the Annual Evolution Meeting.

Schilling M.P., Carman, J (2016) Differential Gene Expression in Diploid Sexual, Diploid Apomictic and Triploid Apomictic Boechera species. Presented at the Plant and Animal Genome Conference.

Wolf, PG, CA Rowe, MP Schilling, JP Der, CJ Visger, JA Thomson. (2014) Origins of Pteridium on Galapagos Islands. Presented at the Annual meeting of Botanical Society of America.

Wolf P.G., Rowe, C.A., Der, J.P., Schilling, M.P., Visger, C.J., Thomson, J.A. (2015) Origins of the Fern Pteridium on Galapagos Islands. Presented at the AAAS Pacific Galapagos Symposium.

Language Knowledge

German native English fluent Spanish fluent French fair 124

Reference These persons are familiar with my professional qualifications and my character:

Paul G. Wolf Professor of Genetics Department of Biology Phone: (435) 797-0052 5305 Old Main Hill Email: [email protected] Logan, UT 84322-5305

Zachariah Gompert Assistant Professor Department of Biology Phone: (435) 797-9463 5305 Old Main Hill Email: [email protected] Logan, UT 84322-5305

Prof. Dr. Christian Wirth Institute for Biology I Special Botany and Phone: +49-341-973-8591 Functional Biodiversity Fax: +49-341-973-8549 Johannisallee 21-23 Email: [email protected] 04103 Leipzig, Germany

Prof. Dr. Jürgen Bauhus Head of Silvicultural Institute Phone: +49-761-203-3677 Tennenbacherstr.4 Fax: +49-761-203-3781 79085 Freiburg, Germany Email: [email protected]